We have a carrier-grade telecom platform that must respond within 250 ms for 95% of requests.
We load the system with 70% of the maximum traffic it can handle. It ran for 7 days without any problem, with an average ParNew time of 200 ms. Then the ParNew time climbed smoothly to 500 ms, after which we saw heap usage increase, a CMS concurrent mode failure, and a Full GC.
There were no other activities on this host, only our system.
CPU load was about 50%.
Are there any monitoring tools that would show where the GC spends so much time, and why?
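One way to at least see where the collection time goes, short of a dedicated tool, is to read the JVM's own GC counters through the standard java.lang.management API. The following is only a minimal sketch under some assumptions: the class name GcPoller, the helper avgPauseMs, the sample count and the one-second interval are made up for illustration, and the collector names it prints should be checked against what the actual JVM reports. As written it samples the JVM it runs in, so the loop would have to run in a thread inside the application, or the same beans would have to be read remotely over JMX. It shows how long collections take, not why.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.HashMap;
import java.util.Map;

public class GcPoller {

    private static final int SAMPLES = 3;            // illustrative value
    private static final long INTERVAL_MS = 1000L;   // illustrative value

    // Average time in ms per collection between two samples; 0 if none ran.
    static double avgPauseMs(long prevCount, long prevTimeMs, long count, long timeMs) {
        long collections = count - prevCount;
        if (collections <= 0) {
            return 0.0;
        }
        return (double) (timeMs - prevTimeMs) / collections;
    }

    public static void main(String[] args) throws InterruptedException {
        // Last seen {collection count, accumulated time in ms} per collector name.
        Map<String, long[]> last = new HashMap<String, long[]>();
        for (int i = 0; i < SAMPLES; i++) {
            for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
                long count = gc.getCollectionCount();
                long timeMs = gc.getCollectionTime();
                long[] prev = last.get(gc.getName());
                if (prev != null) {
                    System.out.printf("%s: %d collections, avg %.1f ms%n",
                            gc.getName(), count - prev[0],
                            avgPauseMs(prev[0], prev[1], count, timeMs));
                }
                last.put(gc.getName(), new long[] {count, timeMs});
            }
            Thread.sleep(INTERVAL_MS);
        }
    }
}
```

Logging these per-interval averages next to the request latency would show whether the young-generation pauses drift upward gradually over the 7 days or jump shortly before the concurrent mode failure.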
JVM options:
-d64 -server -Djava.net.preferIPv4Stack=true -Xms10G -Xmx10G -XX:MaxNewSize=192m -XX:NewSize=192m -XX:MaxPermSize=128m -XX:PermSize=128m -XX:SurvivorRatio=4 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:MaxTenuringThreshold=1 -XX:CMSInitiatingOccupancyFraction=60 -XX:+UseCMSInitiatingOccupancyOnly -XX:+CMSParallelRemarkEnabled -XX:+CMSClassUnloadingEnabled -XX:+DisableExplicitGC -verbose:gc -Xloggc:../log/gc.log -XX:+PrintGCTimeStamps -XX:+PrintGCDetails -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationConcurrentTime -XX:+PrintGCApplicationStoppedTime
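To make the generation sizes implied by these options concrete, the arithmetic can be sketched as below. It assumes the usual HotSpot meaning of -XX:SurvivorRatio (eden size divided by the size of one survivor space, with two survivor spaces in the young generation) and ignores any alignment or rounding the JVM applies. The class and method names are illustrative only.

```java
public class GenSizes {

    // young = eden + 2 * survivor and eden = ratio * survivor,
    // so one survivor space is young / (ratio + 2).
    static long survivorMb(long youngMb, int survivorRatio) {
        return youngMb / (survivorRatio + 2);
    }

    // Eden is what is left of the young generation after both survivor spaces.
    static long edenMb(long youngMb, int survivorRatio) {
        return youngMb - 2 * survivorMb(youngMb, survivorRatio);
    }

    // The old generation is the heap minus the young generation.
    static long oldMb(long heapMb, long youngMb) {
        return heapMb - youngMb;
    }

    // Old-generation occupancy (rounded down) at which a CMS cycle is started.
    static long cmsTriggerMb(long oldMb, int occupancyPercent) {
        return oldMb * occupancyPercent / 100;
    }

    public static void main(String[] args) {
        long heapMb = 10 * 1024; // -Xms10G -Xmx10G
        long youngMb = 192;      // -XX:NewSize=192m -XX:MaxNewSize=192m
        int ratio = 4;           // -XX:SurvivorRatio=4
        int fraction = 60;       // -XX:CMSInitiatingOccupancyFraction=60

        long old = oldMb(heapMb, youngMb);
        System.out.println("eden         = " + edenMb(youngMb, ratio) + " MB");
        System.out.println("survivor x2  = " + survivorMb(youngMb, ratio) + " MB each");
        System.out.println("old gen      = " + old + " MB");
        System.out.println("CMS starts at ~" + cmsTriggerMb(old, fraction) + " MB old-gen occupancy");
    }
}
```

Under those assumptions this gives a 128 MB eden, two 32 MB survivor spaces, an old generation of 10048 MB, and a CMS cycle starting at roughly 6028 MB of old-generation occupancy.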
java version "1.6.0_18"
Java(TM) SE Runtime Environment (build 1.6.0_18-b07)
Java HotSpot(TM) Server VM (build 16.0-b13, mixed mode)
Solaris 10/Sun Fire(TM) T1000