
Tom White

Tom White is a committer on the Apache Hadoop project and a member of the Lucene Project Management Committee. He works as an independent consultant specializing in Hadoop and distributed computing. He has been writing Java full time since 1996 and writing about Java since 2003 for O'Reilly and IBM's developerWorks. Outside programming, Tom enjoys making his daughters laugh and watching 1930s Hollywood films.


tomwhite's blog

"Disks have become tapes"

Posted by tomwhite on March 18, 2008 at 6:07 AM PDT

MapReduce is a programming model for processing vast amounts of data. One reason it works so well is that it exploits a sweet spot in modern disk drive technology trends. In essence, MapReduce works by repeatedly sorting and merging data that is streamed to and from disk at the disk's transfer rate.
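To make the "sorting and merging" idea concrete, here is a minimal sketch of a k-way merge of already-sorted runs. It is an illustration only, not Hadoop's actual implementation: the class name `MergeRuns` and the method `merge` are hypothetical, and in-memory lists stand in for sorted runs that a real system would stream sequentially from disk.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration: merges sorted "runs" by reading each one
// strictly front to back, the access pattern that suits streaming from disk.
public class MergeRuns {

    // Each heap entry is {value, runIndex, positionInRun}.
    static List<Integer> merge(List<List<Integer>> runs) {
        PriorityQueue<int[]> heap =
                new PriorityQueue<>((a, b) -> Integer.compare(a[0], b[0]));

        // Seed the heap with the first element of every non-empty run.
        for (int r = 0; r < runs.size(); r++) {
            if (!runs.get(r).isEmpty()) {
                heap.add(new int[] {runs.get(r).get(0), r, 0});
            }
        }

        List<Integer> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            int[] head = heap.poll();          // smallest remaining value
            out.add(head[0]);
            List<Integer> run = runs.get(head[1]);
            int next = head[2] + 1;            // advance within the same run
            if (next < run.size()) {
                heap.add(new int[] {run.get(next), head[1], next});
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<Integer>> runs = Arrays.asList(
                Arrays.asList(1, 4, 9),
                Arrays.asList(2, 3, 10),
                Arrays.asList(5));
        System.out.println(merge(runs)); // prints [1, 2, 3, 4, 5, 9, 10]
    }
}
```

Each run is consumed sequentially and never revisited, so no seeks within a run are needed; that is the property the post's title alludes to.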

Consistent Hashing

Posted by tomwhite on November 27, 2007 at 9:56 AM PST

I've bumped into consistent hashing a couple of times lately.

Hadoop + EC2 + S3

Posted by tomwhite on July 20, 2007 at 1:10 AM PDT

In the past I've raved about the MapReduce parallel programming model, Apache Hadoop (the framework for running MapReduce applications), and Amazon's compute and storage web services (EC2 and S3).

Wanted: A Public Amazon EC2 AMI for Java EE

Posted by tomwhite on June 27, 2007 at 1:00 PM PDT

I noticed that Paul Dowman has created a Ruby on Rails AMI for use on Amazon EC2 (Amazon's rented CPU service). It allows you to fire up a fully configured RoR environment to which you can deploy your application.

jMock 2 and my Java Unit Testing Toolkit

Posted by tomwhite on April 11, 2007 at 6:16 AM PDT

The long-awaited final version of jMock 2 was released today. There are some big changes since version one. For example, you can now write

Cat cat = mock(Cat.class);

and then set expectations on the returned cat object itself:

Testing for errant network connections

Posted by tomwhite on February 8, 2007 at 1:37 AM PST

We kept breaking our XML catalog resolution in the course of developing an application. We would refactor the parser code, or we would upgrade a schema and forget to upgrade the catalog.


Posted by tomwhite on December 22, 2006 at 12:27 PM PST

In Literate Programming with jMock I enthused about jMock's idea of constraints and flexible assertions. Now the jMock team has released version 1.0 of Hamcrest, the constraints part of jMock.

Lift Off

Posted by tomwhite on October 30, 2006 at 3:50 AM PST

In a previous blog entry I mentioned a literate functional testing framework that we had developed at our company, Kizoom.

Are your beans thread-safe?

Posted by tomwhite on September 21, 2006 at 1:55 PM PDT

[Update: changed wording per comments to fix error.]

Affordable Web-Scale Computing Redux

Posted by tomwhite on August 24, 2006 at 2:01 PM PDT

In March I wrote of affordable web-scale computing: