Found it! SpeedStep causes increasing efficiency with higher load

nigelss
Joined: 2007-07-29

Hi everyone,

I was about to post this (incomplete) question this morning... luckily, after days and days of tests, I found what I was doing wrong: SpeedStep was still enabled on my notebook. Make sure you use the "Always On" power profile on Windows XP for any experimental testing!

http://www.bay-wolf.com/speedstep.htm

The HotSpot JIT (the dynamic optimising compiler) was probably also having an impact: if you are benchmarking, test with -XX:CompileThreshold=1 and do a high-event-rate warmup to remove JIT effects. I also found that the -server compiler gave a slight performance improvement over -client with forced compilation.
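For reference, the kind of launch line I mean is something like this (the main class name is just a placeholder for your own benchmark entry point):

    java -server -XX:CompileThreshold=1 MyBenchmarkMain

-XX:CompileThreshold=1 makes HotSpot compile a method as soon as it has been invoked once, rather than waiting for the default invocation count.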

Cheers,

Nigel

Subject: Efficiency (and performance) decreases as incoming rate drops?!

Hi everyone,

I have noticed some strange behaviour and am trying to identify the root cause. I have a Java app that processes incoming events and uses Hibernate 3.2.1 for POJO storage against an external MS SQL Server database. Pretty much everything is held in the second-level cache, except for a single data update per event. There are typically 4 threads: 2 threads wait on a UDP socket and then read data from the cache to identify the relevant "event source" (although in these tests the events come from an external event generator, using an exponential distribution with a fixed lambda per source but an increasing number of event sources). Two "processing jobs" are then added to a central queue for another two threads, which do the actual work.
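Roughly, the pipeline looks like this (a heavily simplified sketch of the structure, not the real code; class and method names are made up):

import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.util.Arrays;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Simplified sketch: two UDP reader threads enqueue jobs, two worker threads
// dequeue them and perform the single data update per event.
public class PipelineSketch {
    private final BlockingQueue<byte[]> jobs = new LinkedBlockingQueue<byte[]>();

    public void start(final DatagramSocket socket) {
        for (int i = 0; i < 2; i++) {
            new Thread(new Runnable() {
                public void run() {
                    byte[] buf = new byte[1024];
                    DatagramPacket packet = new DatagramPacket(buf, buf.length);
                    while (true) {
                        try {
                            socket.receive(packet);   // block until an event arrives
                            // identify the relevant event source from the cache here
                            jobs.put(Arrays.copyOf(packet.getData(), packet.getLength()));
                        } catch (Exception e) {
                            return;
                        }
                    }
                }
            }, "udp-reader-" + i).start();
        }
        for (int i = 0; i < 2; i++) {
            new Thread(new Runnable() {
                public void run() {
                    while (true) {
                        try {
                            process(jobs.take());      // one processing job per event
                        } catch (InterruptedException e) {
                            return;
                        }
                    }
                }
            }, "worker-" + i).start();
        }
    }

    void process(byte[] event) {
        // the Hibernate update + commit happens here in the real application
    }
}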

What I have found is that as the event rate increases (by adding "event sources"), both the CPU time and the total processing time per event decrease. I'm a little baffled as to why this is occurring.

Has anyone noticed this behaviour before? Is there a way of isolating this further?

My observations so far:

** JIT / dynamic optimising compiler
Using high-rate warmups, forced-compile settings, and running the sweep in reverse (decreasing rather than increasing event rate) shows this has only a minor impact. Forced compilation gives flat performance at high rates, but efficiency still increases at low event rates (< 100 events per second).
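By a "high-rate warmup" I mean something along these lines, run before the measured phase (a minimal sketch; EventHandler stands in for the real processing path):

public class WarmUp {
    // Stand-in for the real event-processing entry point.
    interface EventHandler {
        void handle(byte[] event);
    }

    // Push a burst of synthetic events through the hot path before measuring,
    // so the JIT has already compiled everything by the time the timed run starts.
    static void warmUp(EventHandler handler, int iterations) {
        byte[] synthetic = new byte[64];   // dummy event payload
        for (int i = 0; i < iterations; i++) {
            handler.handle(synthetic);
        }
        System.gc();                       // start the timed run from a clean heap
    }
}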

** Garbage collection
Collection happens more often at higher event rates, but it is more efficient. Throwing more heap at it (-Xms and -Xmx = 600M) and trying -Xincgc and -Xnoclassgc doesn't seem to make much difference. GC time is usually < 5%.
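For reference, the heap settings I tried look like this (the main class name is again a placeholder), with -verbose:gc as one way to sanity-check the GC share:

    java -Xms600m -Xmx600m -verbose:gc MyBenchmarkMain
    java -Xms600m -Xmx600m -Xincgc -Xnoclassgc MyBenchmarkMain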

** Locking
Perhaps inefficient locking (unlikely), or something to do with the lack of wait()/notify() when the queue fills - however, my micro-benchmarks rule this out. Using the demo version of YourKit Java Profiler beta 7 for monitor contention checks, it is evident that locking times are increasing. Maybe this is allowing some threads to run for longer stretches and hence more efficiently. There is some suggestion that roughly 1 ms of CPU time per event is added at low event rates.
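For clarity, the kind of wait()/notify() hand-off on a full queue that I am referring to is something like this (a minimal sketch of the pattern, not my production code):

import java.util.LinkedList;

public class BoundedQueue<T> {
    private final LinkedList<T> items = new LinkedList<T>();
    private final int capacity;

    public BoundedQueue(int capacity) {
        this.capacity = capacity;
    }

    public synchronized void put(T item) throws InterruptedException {
        while (items.size() >= capacity) {
            wait();                  // reader threads block here when the queue fills
        }
        items.addLast(item);
        notifyAll();                 // wake a worker blocked in take()
    }

    public synchronized T take() throws InterruptedException {
        while (items.isEmpty()) {
            wait();                  // worker threads block here when there is no work
        }
        T item = items.removeFirst();
        notifyAll();                 // wake a reader blocked in put()
        return item;
    }
}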

** Database access and commits
DB access is significant during commits (around 30% of "processing time") but contributes only a minor amount to the event-rate effect - possibly Hibernate's "dirty checks" during flushing.
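The per-event database work is essentially this pattern (a simplified sketch; the entity and field names are placeholders, not my real mapping). The flush that happens inside commit() is where Hibernate does its dirty checking:

import org.hibernate.Session;
import org.hibernate.SessionFactory;
import org.hibernate.Transaction;

public class EventWriter {
    private final SessionFactory sessionFactory;

    public EventWriter(SessionFactory sessionFactory) {
        this.sessionFactory = sessionFactory;
    }

    // The single data update per event: load the entity, mutate it, commit.
    public void recordEvent(Long sourceId, long eventTime) {
        Session session = sessionFactory.openSession();
        Transaction tx = session.beginTransaction();
        try {
            EventSource source = (EventSource) session.get(EventSource.class, sourceId);
            source.setLastEventTime(eventTime);   // the one non-cached write per event
            tx.commit();                          // flush -> dirty check -> UPDATE -> commit
        } finally {
            session.close();
        }
    }

    // Placeholder entity; the real class is mapped via Hibernate configuration.
    public static class EventSource {
        private long lastEventTime;
        public long getLastEventTime() { return lastEventTime; }
        public void setLastEventTime(long lastEventTime) { this.lastEventTime = lastEventTime; }
    }
}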

** New thought... maybe Intel SpeedStep is causing a non-constant CPU clock speed?

I'm using 1-20 multiples of 48 event sources with an exponential distribution (lambda = 0.2), so the aggregate event rate ranges linearly from 8 to 194 events per second. CPU usage is monitored each second via JNI calls to the Win32 API, to determine the Java process's kernel+user time, idle time and other "overheads" (unaccounted Java, other processes).
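Each source draws its inter-arrival gaps from an exponential distribution, i.e. something like this (a sketch of the sampling, with lambda in events per second per source):

import java.util.Random;

public class ExponentialSource {
    private final Random random = new Random();
    private final double lambda;   // mean events per second for this source

    public ExponentialSource(double lambda) {
        this.lambda = lambda;
    }

    // Inverse-CDF sampling: t = -ln(1 - u) / lambda gives exponentially
    // distributed inter-arrival times; returned here in milliseconds.
    public long nextGapMillis() {
        double u = random.nextDouble();
        double seconds = -Math.log(1.0 - u) / lambda;
        return Math.round(seconds * 1000.0);
    }
}

With lambda = 0.2 each source averages one event every 5 seconds, so the aggregate rate scales linearly with the number of sources.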

The system is a Dell Latitude D620: Windows XP SP2, dual-core 2 GHz, 2 GB RAM.

java -version gives:

java version "1.6.0"
Java(TM) SE Runtime Environment (build 1.6.0-b105)
Java HotSpot(TM) Client VM (build 1.6.0-b105, mixed mode)

... and java -version -server:

java version "1.6.0"
Java(TM) SE Runtime Environment (build 1.6.0-b105)
Java HotSpot(TM) Server VM (build 1.6.0-b105, mixed mode)

I need this sorted fairly quickly, because I have to model the behaviour accurately in a Discrete Event Simulator to finish my PhD (I've already taken 3.5 months off work). Any help, advice or suggestions would be greatly appreciated.

Best regards,

Nigel
