The basic idea of Boxwood is to create a distributed data store over a small cluster. It is similar in motivation to Bigtable and GFS, but lower-level. From the paper:
"The overall goal of the Boxwood project is to experiment with data abstractions as the underlying basis for storage infrastructure ... [that includes] redundancy and backup schemes to tolerate failures, expansion mechanisms for load and capacity balancing, and consistency maintenance in the presence of failures."
"The principal client-visible abstractions that Boxwood provides are a B-tree abstraction and a simple chunk store abstraction provided by the Chunk Manager."
It is worth noting right away that Boxwood is a research project, not a deployed system. The Boxwood prototype runs on a small cluster of eight machines. GFS and Bigtable run on tens of thousands of machines and provide the backend for many of Google's products.
It is also worth noting that they have different standards for failure tolerance. For one of several examples, the Boxwood paper says that "failures are assumed to be fail-stop". Contrast that with the experience of the folks at Google working on Bigtable:
"One lesson we learned is that large distributed systems are vulnerable to many types of failures, not just the standard network partitions and fail-stop failures assumed in many distributed protocols. For example, we have seen problems due to all of the following causes: memory and network corruption, large clock skew, hung machines, extended and asymmetric network partitions, bugs in other systems that we are using (Chubby for example), overflow of GFS quotas, and planned and unplanned hardware maintenance."
In any case, the Boxwood paper is an interesting read. This is work at Microsoft that may follow a similar path to GFS and Bigtable.
See also my previous post, "Yahoo building a Google FS clone?", which discusses Yahoo's involvement in Hadoop.
See also the Eclipse project at Microsoft Research.
Update: Mary Jo Foley mentions another Microsoft Research project called Dryad and quotes Bill Gates as saying, "[Google] did MapReduce; we have this thing called Dryad that's better." Unfortunately, there appears to be very little public information on Dryad; I can find no publications on the work.
Update: A year later, Microsoft Researcher Michael Isard gives a Google Tech Talk on Dryad with plenty of details.