Skip to main content

too much 'system time' on java/linux webapp

1 reply [Last post]
eoster
Offline
Joined: 2005-01-02
Points: 0

Hi,

We have a simple struts web application on a linux server that is using a
disproportionate amount of 'sy' CPU (per vmstat). A load test with 200 virtual
users run (10 second think time) produces (per vmstat) about 65% 'sy' cpu
utilization and about 35% 'us'. Was expecting just 5-15% sy time.

from vmstat:
us: Time spent running non-kernel code. (user time, including nice time)
sy: Time spent running kernel code. (system time)

To see why this was happening, I ran the 'thread' portion of this little
benchmark test on the linux server:

http://www.cs.ucsd.edu/~wgg/JavaProf/javaprof.html
http://www.cs.ucsd.edu/~wgg/JavaProf/Code/benchmark.tar.gz

Note that I modifed the original parameters, as shown in this stdout from the test
Here is the command to run it:

$ java Benchmark thread
30 threads, switched 10000 times each: 4.123

I didn't know whether this "4.123" seconds was good or bad until I ran the
exact same test on my WinXP 2.8Ghz desktop.

c:\>java Benchmark thread
30 threads, switched 10000 times each: 0.391

The test on the linux box is about 10x slower!
I then re-ran the test on linux and captured vmstat data.
Perhaps this is incidental, but there was also a disproportionate amount of sy cpu util.
See the 99's below? ...and heavy context switching (cs), too.

procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
r b swpd free buff cache si so bi bo in cs us sy id wa st

0 0 60 156832 174160 628932 0 0 0 0 1006 9 0 0 100 0 0
31 0 60 152652 174176 628944 0 0 0 116 1035 24541 2 47 51 0 0
31 0 60 152652 174176 628964 0 0 0 0 1016 78660 1 99 0 0 0
31 0 60 152652 174176 628964 0 0 0 0 1028 77000 1 99 0 0 0
31 0 60 152652 174176 628964 0 0 0 0 1008 76816 1 99 0 0 0
0 0 60 156740 174176 628936 0 0 0 0 1041 53486 2 76 22 0 0
0 0 60 156740 174188 628924 0 0 0 27 1017 23 0 1 98 1 0

I found a separate little thread benchmark test where the winxp and the linux perform (basically) the same. (see Parallel.java at end of this email). The difference is that in the first test where linux is handily out-performed, Thread.yield() is invoked many times.

Thanks all. I'd appreciate any input you've got on this.

--Erik Ostermueller

##### Environment info ##############################################
this machine is a RH linux 'guest' box on an AIX host.
tomcat 5.5.23

$ java -version
java version "1.5.0_16"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_16-b02)
Java HotSpot(TM) Client VM (build 1.5.0_16-b02, mixed mode, sharing)

$ cat /etc/redhat-release
Red Hat Enterprise Linux Server release 5.3 (Tikanga)

$ dmesg | head -1
Linux version 2.6.18-128.el5 (mockbuild@hs20-bc1-5.build.redhat.com) (gcc version 4.1.2 20080704 (Red Hat 4.1.2-44)) #1 SMP Wed Dec 17 11:42:39 EST 2008

$ egrep '^[^#]*title' /boot/grub/menu.lst | grep -v 'memtest'
title Red Hat Enterprise Linux Server (2.6.18-128.el5)
title Red Hat Enterprise Linux Server (2.6.18-92.1.22.el5)
title Red Hat Enterprise Linux Server (2.6.18-92.1.10.el5)

$ cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 15
model : 4
model name : Intel(R) Xeon(TM) CPU 2.80GHz
stepping : 8
cpu MHz : 2799.205
cache size : 2048 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 5
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss constant_tsc up pni ds_cpl
bogomips : 5656.39

START of infor for the winxp box:
C:\src\jsource\benchmarks\benchmark>set proc
PROCESSOR_ARCHITECTURE=x86
PROCESSOR_IDENTIFIER=x86 Family 15 Model 4 Stepping 9, GenuineIntel
PROCESSOR_LEVEL=15
PROCESSOR_REVISION=0409
END of info for the winxp box:

/*
* Parallel.java
*
*/
/*
* Parallel.java taken from http://www.amitysolutions.com.au/documents/NPTL_Java_threads.pdf
* My linux and winxp performed about the same on this test.
* Created on 2 July 2003, 11:51
*/
package com.amity.tester;
import java.lang.Thread;
import java.util.Date;
import java.util.HashMap;
/**
*
* @author jbarnett
* @version 1.0
*/
public class Parallel
{
public Parallel() {
System.out.println("parallel ctor");
}
/**
* Used to test the throughput response.
*
* @returns long which should be millisecond time
*/
class ThroughputTest extends Thread {
private int id = 0;
private int length = 0;
public ThroughputTest(int id, int length) {
this.id = id;
this.length = length;
}
public void run()
{
HashMap map = new HashMap();
String key = "Key";
Date start = new Date();
for (int i = length; i -- > 0;)
map.put(key, new Object());

Date end = new Date();
long counter = end.getTime()-start.getTime();
System.out.println("Load " + id + ": , run length = " + length + ", time = " + counter + " ms");
}
}

public void generateLoad(int threads, int length) {
for (int i = 0; i < threads; i++)
new ThroughputTest(i, length).start();
}

/**
* @param args the command line arguments
*/
public static void main(String[] args)
{
System.out.println("Hello.");
Parallel test = new Parallel();
if (args.length < 2) {
System.err.println("Expected parallel thread count and test run length.");
System.exit(1);
}

try {
int i = Integer.parseInt(args[0]);
int j = Integer.parseInt(args[1]);
test.generateLoad(i, j);
} catch(Exception e) {
System.err.println("Arguments expected to be integers.");
System.exit(2);
}
}
}

Reply viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.
eoster
Offline
Joined: 2005-01-02
Points: 0

....replying to self with a correction.

The linux box is a virtual guest on VMware ESX 3.x. I had originally said aix. Hardware is HP DL380G4/5.

Thanks,
--Erik