What trends in disk drive technology mean for data processing.
I've bumped into consistent hashing a couple of times lately. But what is it, and why should you care? This post takes a look.
How to run data processing applications on a rented grid.
Amazon's new Elastic Compute Cloud should be a perfect fit for running Hadoop jobs.
Implementing a distributed java.util.Map using Amazon S3.
With the launch of Amazon S3 (Simple Storage Service) we are seeing a continuation of the trend for the big web companies to monetize their computing infrastructure by opening it up to developers.
The MapReduce and distributed filesystem parts of Nutch (inspired by projects from Google) have been split into a new project, called Hadoop.
MapReduce is an amazing distributed system for massive data processing from Google Labs. There's now a Java implementation.