Monday, May 09, 2005

Google...Directory Service or Server?

Another interesting scientific paper (don't worry - the last for a while) is the Google Labs write-up on GFS, the Google File System.

As I read this paper, I was struck by the architectural similarities to Directory Server, as well as to work we'd done on a system called the "distributor". While many directory servers have reasonably good vertical scaling capacity, there is a certain point where the management functions of a system become too fragile, sequential operations take too long, and the statistically low write rates that directories tend to be designed around (an increasing liability) still add up to large absolute write volumes once the user population is big enough. Deployment of a single physical system, while technically viable, starts failing to meet design constraints in ways which simply aren't an issue at low scale. Ultimately, decomposing the data into smaller logical "chunks" not only allowed better aggregate write throughput, but also allowed time-dependent operations (backup, recovery, etc.) to run in parallel.
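To make the chunking idea concrete, here is a minimal sketch in Python. It is not how the distributor or GFS actually works; the names (`chunk_for`, `partition`, `backup_chunk`, `backup_all`), the hash-based placement, and the chunk count are all assumptions of mine for illustration. It only shows the general shape: entries are assigned to chunks by a stable hash of their DN, and a per-chunk operation such as backup can then run on every chunk at once instead of walking one big store sequentially.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

NUM_CHUNKS = 4  # hypothetical chunk count, chosen only for illustration


def chunk_for(dn: str, num_chunks: int = NUM_CHUNKS) -> int:
    """Map a DN to a chunk index with a stable hash (assumed placement rule)."""
    # Lowercase first so the same entry always lands in the same chunk,
    # regardless of how the DN was capitalized by the client.
    digest = hashlib.sha256(dn.lower().encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_chunks


def partition(entries: dict, num_chunks: int = NUM_CHUNKS) -> list:
    """Split one logical directory into num_chunks smaller dictionaries."""
    chunks = [dict() for _ in range(num_chunks)]
    for dn, attrs in entries.items():
        chunks[chunk_for(dn, num_chunks)][dn] = attrs
    return chunks


def backup_chunk(chunk: dict) -> list:
    """Stand-in for a real backup: return a sorted snapshot of one chunk."""
    return sorted(chunk.items())


def backup_all(chunks: list) -> list:
    """Back up every chunk concurrently rather than one after another."""
    with ThreadPoolExecutor(max_workers=len(chunks)) as pool:
        return list(pool.map(backup_chunk, chunks))


if __name__ == "__main__":
    entries = {
        f"uid=user{i},ou=people,dc=example,dc=com": {"cn": f"User {i}"}
        for i in range(100)
    }
    chunks = partition(entries)
    snapshots = backup_all(chunks)
    print([len(s) for s in snapshots])
```

The same split helps writes too: each chunk takes only its share of the total write volume, so the aggregate can grow by adding chunks rather than by buying a bigger box.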

I had a lot of fun working on this with some of my favorite people there - Steve Shoaff and Neil Wilson. I hope that Sun lets this loose. In the meantime, read about GFS.
