Skip to main content

GC long pauses with CMS

2 replies [Last post]
boris_petrovic
Offline
Joined: 2006-11-28
Points: 0

Hello,

We are testing Telco application based on Java OSGI and in long runs experiencing pauses over 2-10 seconds. Since it is Telco area, reducing long pauses is very important.

Hardware is FSC Blade BX600 dual core Intel Xeon 3.0 Ghz ,Hyperthreading (4 logical CPUs - 2 year old hardware)
SLES 9 SP2 32 bit
2GB of RAM
JRE : Sun 1.5.0_12

JVM settings:
-Xmx512m
-Xms512m
-XX:MaxPermSize=256m
-XX:+UseConcMarkSweepGC
-XX:+CMSIncrementalMode
-XX:+CMSIncrementalPacing
-XX:NewSize=64m
-XX:+PrintGCApplicationConcurrentTime
-XX:+PrintGCApplicationStoppedTime
-XX:+PrintGCDetails
-XX:+PrintGCTimeStamps

We were forced to increase permanent space due to out of memory in the permanent space. In previous tests in trace logs I have identified often young collection (parallel new) with young space size of 16m, not keeping up, so we increased to 64m. However this is not our main problem. Main problem is further long full collections and often "concurrent mode failures" occasionally followed by class unloading.

Below are traces from the last session. I found lot of tweaking advices, but currently I am lost in a variety of options to tune.
Another observation is that our CMS concurrent phases last too long (CMS-concurrent-mark: 1.030/69.455 secs).

Here is couple of typical traces:

Application time: 47.1989000 seconds
1368.943: [Full GC 1368.943: [CMS1369.393: [CMS-concurrent-mark: 1.030/69.455 secs]
(concurrent mode failure)[Unloading class sun.reflect.GeneratedSerializationConstructorAccessor120]
[Unloading class sun.reflect.GeneratedMethodAccessor631]
......
[Unloading class sun.reflect.GeneratedMethodAccessor497]
: 118174K->98861K(458752K), 2.5512070 secs] 147043K->98861K(524224K), [CMS Perm : 116709K->115734K(194532K)] icms_dc=10 , 2.5518450 secs]
Total time for which application threads were stopped: 2.5544050 seconds

493.187: [Full GC 1493.187: [CMS1494.003: [CMS-concurrent-mark: 0.816/60.808 secs]
(concurrent mode failure)[Unloading class sun.reflect.GeneratedMethodAccessor945]
[Unloading class sun.reflect.GeneratedMethodAccessor950]
[Unloading class sun.reflect.GeneratedMethodAccessor941]
[Unloading class sun.reflect.GeneratedMethodAccessor940]
[Unloading class sun.reflect.GeneratedMethodAccessor948]
Mon Dec 17 17:26:35 CET 2007
: 92028K->92583K(458752K), 2.4635680 secs] 107507K->92583K(524224K), [CMS Perm : 115782K->115772K(194532K)] icms_dc=10 , 2.4641410 secs]
Total time for which application threads were stopped: 2.4665520 seconds

13424.288: [Full GC 13424.288: [CMS13424.288: [CMS-concurrent-abortable-preclean: 2.156/98.536 secs]
(concurrent mode failure)[Unloading class sun.reflect.GeneratedSerializationConstructorAccessor330]
[Unloading class sun.reflect.GeneratedSerializationConstructorAccessor342]
...//around 180 classes unloaded
[Unloading class sun.reflect.GeneratedMethodAccessor2045]
: 386880K->369517K(458752K), 4.9890290 secs] 391575K->369517K(524224K), [CMS Perm : 116642K->116317K(194916K)] icms_dc=100 , 4.9895980 secs]
Total time for which application threads were stopped: 4.9920160 seconds

Reply viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.
linuxhippy
Offline
Joined: 2004-01-07
Points: 0

I am not an CMS expert and I have not had any major problems with it, but you could :

1.) Make sure your box does not do swapping - this could be the reason why concurrent-mark takes that long.

2.) Lower the trigger before a concurrent a concurrent gc happens:
-XX:CMSInitiatingOccupancyFraction=, is by default somewhere at 68%, you could try to lower it to e.g. 50 - but you will have more overhead cpu overhead.

3.) Why are you setting:
-XX:+CMSIncrementalMode
-XX:+CMSIncrementalPacing
This adds additional presure and overhead - are you really experiencing a slowdown of your application during CMS?

4.) It seems theres a lot of reflection going on in your application, and a lot of Method-Accessor-classes which have to be fanished. Maybe this could fill up your permament generation and leads to CMS failure. However this goes byond my knowledge.

5.) Are you using RMI, or does your code do System.gc() somewhere? Under 5.0 this triggers a full stop-the-world collection, for 6.0 there's an option which allows to specify the behaviour.

lg Clemens

PS: Oh, the hardware is really ugly ;)

boris_petrovic
Offline
Joined: 2006-11-28
Points: 0

Thank for your hints I was a bit absent around Christmas.
I excluded swapping and lowering -XX:CMSInitiatingOccupancyFraction, actually with value of 45% I got terrible pauses of 40-50sec. There is also no explicit gc neither RMI.

Reason why I am using incremental mode is recommendation from Sun (in
http://java.sun.com/docs/hotspot/gc5.0/gc_tuning_5.html) is using it on boxes with 2-4 hardware threads, other switch -XX:+CMSIncrementalPacing is set of starters recommended for tweaking.

Actually, our code is using heavily serialization and that is why Method-Accessor-classes are unloaded so far I googled about it. I will try to disable class unloading because those are just referenced classes and benefit in a permanent space is not significant.