Posted by sdo
on September 30, 2005 at 12:05 PM PDT
When looking at benchmark results, make sure to look at what's important.
For the last few years, I've worked in the Java Performance Group at Sun
Microsystems. So I thought it might be good to begin my first blog entry by
talking about what's important in looking at performance.
I'm prompted to look into this topic because of a recent blog by Eric,
who discusses the performance of SPECjAppServer 2004. The thing about
application server benchmarks -- and especially the ones from SPEC -- is
that they are trivial to scale horizontally. It's not enough just to look and
see which application server has the highest score, because any vendor
at any time can put together a larger configuration of machines and get the
new world's record. [There's a slight complication here, in that the
benchmark is a system benchmark, not an application server benchmark, and
hence is subject to pressure from a back-end database. That's an
important topic for another day, but it doesn't affect today's point.]
Interestingly, Eric understands this at some level, because he concludes
by calculating how many transactions each application server can get on its
respective CPUs. When you know that the total score doesn't mean anything, it's natural to attempt to normalize the numbers from disparate
software and hardware combinations.
And of course, this particular normalization allows him to show Sun's score
in a bad light and conclude "this is a perfect example of the potential for free software, such as Sun's app server or JBoss, to drive costs up by needing significantly more hardware."
Right instinct; wrong analysis.
It's the right instinct because performance isn't who has the highest
benchmark number. Performance is who performs the work "best" -- where
it's up to you to define best. Maybe someone actually would define best as "most
transactions per CPU" (even if, as in this case, the CPUs aren't equivalent,
which makes the calculation all the more irrelevant). But perhaps you'd
like to define best as "most cost-effective overall." I'd argue that
definition has more merit.
Let's look at the acquisition costs of these results. By my calculations, BEA's application server costs around $120K to produce 1664 JOPS on machines that cost around $8K each. Sun's application server is free but requires more machines at roughly $4K each. That translates to roughly $100 per transaction for BEA and about $53 for Sun.
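The normalization behind those numbers can be sketched as a tiny calculation: total acquisition cost divided by benchmark throughput. The machine counts and Sun's JOPS score below are hypothetical placeholders for illustration, not figures from the actual SPEC submissions; only BEA's $120K software, $8K machines, 1664 JOPS, and Sun's $4K machines come from the discussion above.

```python
def cost_per_jops(software_cost, machine_cost, machine_count, jops):
    """Acquisition cost (software plus hardware) per unit of
    benchmark throughput (JOPS)."""
    total_cost = software_cost + machine_cost * machine_count
    return total_cost / jops

# BEA: $120K software, $8K machines (machine count is a placeholder).
bea = cost_per_jops(120_000, 8_000, 6, 1664)

# Sun: free app server, cheaper machines, but more of them
# (machine count and JOPS score are placeholders).
sun = cost_per_jops(0, 4_000, 10, 800)

print(f"BEA: ${bea:.2f}/JOPS, Sun: ${sun:.2f}/JOPS")
```

The same function lets you redo the comparison with whatever cost components you decide matter: set support or database costs to zero to isolate the appserver tier, or fold them back in.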
Of course, you may argue that I'm being overly simplistic; there are additional
costs, such as support and the database. Some of those might be important to your decision making and some may not. In particular, the backend database is going to support a certain number of transactions regardless of the front end, so leaving it out is a way to concentrate on the relative merits of the appserver tier alone (plus, database hardware and software pricing makes price comparisons between disparate systems much less interesting).
SPEC makes it theoretically possible (though tedious) to figure this out:
all submissions include a full Bill of Materials from which you can figure
out the total cost of the submission (assuming vendors or their resellers have the relevant prices on their websites), or just the software, or just what's
needed to run the appservers without the database, with or without
support costs, or whatever parts you want to include or isolate.
And that, of course, is the point: it's up to you to determine what's
important when you make a software/hardware decision. Just don't be
swayed by incomplete arguments that free software is going to cost you
more in the end.