Best x86 processor for float/double performance

7 replies [Last post]
u8slawo
Offline
Joined: 2005-10-12
Points: 0

Hi all,
we are doing a massive statistical data analysis based on double precision floating point numbers.
Now we are considering running this analysis on either a Pentium 4 or an AMD64 machine.
I have seen evidence that the Pentium 4 is much faster than the AMD processor when computing floats, say for 3D applications.
But is this also true for Java doubles?
I'm very thankful for all comments on this.
Slawo.

jwenting
Offline
Joined: 2003-12-02
Points: 0

>
> Well, for me twice as fast != a bit faster:
>
> twice as fast would be an illusion since there are so
> many key-factors playing together that in reality you
> should not expect more than a 25-40% win.
>

All depends on what you're doing.
If your floating point work is twice as fast but it's only 1 per mille of your total calculation time, it's still negligible: you'd save a total of 0.5 milliseconds for every second of work performed.
If your floating point calculations make up 10% of your total time, you're now saving 0.05 seconds per second of work performed, which might be worth something.
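
The arithmetic above can be reproduced with a small sketch. The class and method names here (`SpeedupEstimate`, `savedPerSecond`, `overallSpeedup`) are made up for illustration, and the 90% case is an extra assumption added only for comparison:

```java
public class SpeedupEstimate {

    /**
     * Seconds saved per second of original run time when a fraction
     * (fpFraction) of the work becomes fpSpeedup times faster.
     */
    public static double savedPerSecond(double fpFraction, double fpSpeedup) {
        return fpFraction - fpFraction / fpSpeedup;
    }

    /** Overall speedup of the whole program for the same inputs. */
    public static double overallSpeedup(double fpFraction, double fpSpeedup) {
        return 1.0 / ((1.0 - fpFraction) + fpFraction / fpSpeedup);
    }

    public static void main(String[] args) {
        // 1 per mille and 10% are the cases from the post; 90% is an
        // assumed FP-heavy workload for comparison.
        double[] fractions = {0.001, 0.10, 0.90};
        for (double f : fractions) {
            System.out.printf("FP share %.3f: saves %.4f s per s, overall %.3fx%n",
                    f, savedPerSecond(f, 2.0), overallSpeedup(f, 2.0));
        }
    }
}
```

So before comparing processors it is probably worth profiling first to see what share of the run time the floating point work really takes.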

linuxhippy
Offline
Joined: 2004-01-07
Points: 0

> 64 bit Java is very fast. Sun engineers seem to have
> done great job utilizing all the extra registers
> AMD64 made available. Magically, performance is
> exactly twice as fast as compared to 32 bit JVM.
> Benchmark that I've witnessed this with is referred
> from Sun web site:
> http://math.nist.gov/cgi-bin/ScimarkSummary?complete

Well, for me twice as fast != a bit faster:
560.93 Sun 1.5.0 Linux 2.6.5-7.111-default Sun. 1.5.0; ; AMD64 FX-53 2.4GHz (makub@ics.muni.cz) 2004-12-16
27. 560.44 Sun 1.5.0-rc WinXP 5.1 Sun. 1.5.0-rc; IBM; P4 3.0 GHz HT

twice as fast would be an illusion since there are so many key-factors playing together that in reality you should not expect more than a 25-40% win.

lg Clemens

u8slawo
Offline
Joined: 2005-10-12
Points: 0

Thank you very much for your comments.
I found the comment on 64 bit performance particularly interesting.
Is there a switch in the AMD64 JVM to switch between modes, like say -classic and -fast64?

Slawo

denka
Offline
Joined: 2003-07-06
Points: 0

64 bit Java is very fast. Sun engineers seem to have done a great job utilizing all the extra registers AMD64 made available. Magically, performance is exactly twice as fast as with the 32 bit JVM. The benchmark I've witnessed this with is linked from the Sun web site: http://math.nist.gov/cgi-bin/ScimarkSummary?complete

gbarton
Offline
Joined: 2003-07-08
Points: 0

Also, for Java get a processor with a large cache. For some reason (not sure why) most Java VMs incur lots of cache misses. Increasing the cache size can help reduce them.

alanstange
Offline
Joined: 2003-06-12
Points: 0

There is no best x86 processor for FP performance; there is only the one that works best on your code.

Of course, after stating the obvious, the answer to your question is: an Opteron.

We have racks of top-speed dual Xeons and dual Opterons, and the Opterons are fastest in every test I run. For anything that requires a lot of memory, the Opteron will also outperform the Xeon, as it has a much better memory subsystem.
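
Since "works best on your code" is the point, one way to compare candidate machines is to time a small kernel that resembles the real analysis on each of them. The following is only a rough sketch: the class `DoubleBench`, the sum-of-squared-deviations kernel, the array size and the run counts are all assumptions standing in for your own workload.

```java
public class DoubleBench {

    /** Stand-in kernel: sum of squared deviations from the mean. */
    public static double sumSquaredDeviations(double[] data) {
        double sum = 0.0;
        for (double v : data) {
            sum += v;
        }
        double mean = sum / data.length;
        double acc = 0.0;
        for (double v : data) {
            double d = v - mean;
            acc += d * d;
        }
        return acc;
    }

    /** Deterministic test data so every machine runs the same input. */
    public static double[] makeData(int n) {
        double[] data = new double[n];
        for (int i = 0; i < n; i++) {
            data[i] = Math.sin(i * 0.001);
        }
        return data;
    }

    public static void main(String[] args) {
        double[] data = makeData(1000000);
        double sink = 0.0; // keeps the JIT from discarding the work

        // Warm-up runs so the JIT has compiled the kernel before timing.
        for (int i = 0; i < 20; i++) {
            sink += sumSquaredDeviations(data);
        }

        int runs = 50;
        long start = System.nanoTime();
        for (int i = 0; i < runs; i++) {
            sink += sumSquaredDeviations(data);
        }
        long elapsedNanos = System.nanoTime() - start;

        System.out.println("avg ms per run: " + (elapsedNanos / 1e6 / runs));
        System.out.println("checksum: " + sink);
    }
}
```

Run the same class file with the same JVM version and flags on each machine. Results from a loop like this are only indicative; a slice of the real analysis is a better test.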

linuxhippy
Offline
Joined: 2004-01-07
Points: 0

To be honest, when writing real number-crunching code I would recommend writing the low-level code that actually does the crunching in C and only the software around it in Java. This is how we wrote our applications, and we were able to nearly double performance.

The reason is that modern C compilers are able to vectorize code, which means they can e.g. pack 4 integers into an SSE register and perform the same instruction in one cycle on all 4 data elements, which the server JVM is so far not able to do. Another problem with our code was random array access, which could not be optimized away by HotSpot.
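
To illustrate the two loop shapes mentioned here, a small Java sketch (not our actual code; the class and method names are invented for this post). The first loop walks its arrays in order with independent iterations, which is the shape a vectorizing compiler can map onto SSE registers. The second reads through an index array, which is the random access pattern that is hard to optimize.

```java
public class LoopShapes {

    /**
     * Contiguous, element-wise loop: each iteration is independent and
     * touches neighbouring elements, the shape vectorizers look for.
     */
    public static void scaleAndAdd(double[] out, double[] a, double[] b, double factor) {
        for (int i = 0; i < out.length; i++) {
            out[i] = a[i] * factor + b[i];
        }
    }

    /**
     * Indexed (gather) loop: the element read depends on index[i], so
     * memory access is effectively random and each read is bounds-checked.
     */
    public static double gatherSum(double[] data, int[] index) {
        double sum = 0.0;
        for (int i = 0; i < index.length; i++) {
            sum += data[index[i]];
        }
        return sum;
    }

    public static void main(String[] args) {
        double[] a = {1.0, 2.0};
        double[] b = {3.0, 4.0};
        double[] out = new double[2];
        scaleAndAdd(out, a, b, 2.0);
        System.out.println(out[0] + " " + out[1]);

        double[] data = {10.0, 20.0, 30.0};
        int[] index = {2, 0, 2};
        System.out.println(gatherSum(data, index));
    }
}
```

Whether a particular JVM or C compiler actually vectorizes the first loop depends on its version and flags, so measure rather than assume.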

However, this really depends, but in general I would recommend an Athlon64/Opteron; they are really fast.
However, when writing mostly SSE(2)-only code the P4/Xeon doesn't do a bad job either, but when you need high-performance code running on a P4 you need to optimize it specifically for that processor.
For Java doubles/floats the Athlon64 should be faster.

Note that the expensive server CPUs are not worth the extra money when writing really optimized code, since their price mostly comes from their large L2 caches, which are of little interest to number-crunching code (unless you need random access to large data blocks).

lg Clemens

Message was edited by: linuxhippy