
server twice slower than client?

19 replies [Last post]
oneguyks
Offline
Joined: 2008-04-28
Points: 0

I mistakenly posted this to the Java SE sub-forum, so I am posting it here instead. Delete the other post if you wish.

The Meteor Puzzle and 3 Java puzzle solvers are described in "Optimize your Java application's performance" on this IBM page:

http://www-128.ibm.com/developerworks/java/library/j-javaopt/

Here is one solution (fastest so far?)

http://shootout.alioth.debian.org/gp4sandbox/benchmark.php?test=meteor&l...

However, the -server VM is twice as slow as the client VM in this case. Is there a reason for that?

oneguyks
Offline
Joined: 2008-04-28
Points: 0

Doubling the size of the file again:

$ time cat reg2.txt | java -Xmx1024m regexdna >\NULL

[b]real 0m23.975s[/b]
user 0m0.061s
sys 0m0.109s

$ time cat reg2.txt | java -server -Xmx1024m regexdna >\NULL

[b]real 0m45.953s[/b]
user 0m0.046s
sys 0m0.062s

If -server can't catch up even after 46 seconds, then it's just slower (twice as slow) in this case. There is no point in making excuses and looking for "tricks" that might or might not help it.

linuxhippy
Offline
Joined: 2004-01-07
Points: 0

> There is no point in finding excuses and trying to find "tricks" that might or might not help it.
Abusing software for things it was not designed for is simply stupid.
Using it the right way does not mean tricking, but rather making the test more like the real world.

I ran your first test, and it was faster with -server than with client. Now you come with another one.
You measure elapsed time although I recommended warming up HotSpot and measuring time in Java code. I just wonder what your intention is.

After all, have a lot of fun with your numbers...

lg Clemens

oneguyks
Offline
Joined: 2008-04-28
Points: 0

>I ran your first test - and it was faster with -server than with client. Now you come
>with another one.

Not true. My first post was about meteor-contest, where client is faster than server, but it runs for only a second. In the second one (fannkuch) I said nothing about a client/server difference; that was a completely different topic about inner loops and array bounds checking. You mentioned that client is faster in fannkuch. It wasn't for me, nor did I make any such claim.

In the third case, server is clearly slower than client. It's not just slower, it runs at HALF the speed of client.

Message was edited by: oneguyks

oneguyks
Offline
Joined: 2008-04-28
Points: 0

>Yes, JET is often really good for integer-heavy code which
>does not benefit from hotspot's profiling.

Another area where Jet performs better than HotSpot is recursive methods: it's around 1.5 times faster for recursion.

linuxhippy
Offline
Joined: 2004-01-07
Points: 0

[quote]
real 0m24.580s
user 0m0.030s
sys 0m0.094s
It's still twice slower with 24 secs
[/quote]

Your benchmark is still flawed. I recommend reading some articles about how to do proper Java benchmarking.
1.) Never use time, unless you want to benchmark JVM startup.
2.) Use a warm-up phase that you exclude from your measurements. 24s is still way too short for a benchmark.

So let the benchmark run a minute or two, then do profiling for 5min and then look at your results.
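
A minimal sketch of what "measure time in Java code" can look like, assuming the benchmark body can be wrapped in a method. `WarmupTimer`, `timeRuns` and `workload` are hypothetical names and the workload is only a stand-in, not the regexdna or fannkuch code. The same work is repeated inside one JVM and each run is timed with `System.nanoTime()`, so the early (interpreted or freshly compiled) runs can be told apart from the later ones:

```java
import java.util.function.IntSupplier;

public class WarmupTimer {
    // Written on every run so the JIT cannot discard the workload as dead code.
    static volatile int sink;

    /** Runs the workload `runs` times in this JVM; returns each run's elapsed milliseconds. */
    static long[] timeRuns(IntSupplier workload, int runs) {
        long[] millis = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            sink = workload.getAsInt();
            millis[i] = (System.nanoTime() - start) / 1_000_000L;
        }
        return millis;
    }

    /** Hypothetical stand-in workload: array-heavy integer arithmetic. */
    static int workload() {
        int[] a = new int[1 << 16];
        int acc = 0;
        for (int rep = 0; rep < 500; rep++) {
            for (int i = 0; i < a.length; i++) {
                a[i] += i ^ rep;
                acc += a[i];
            }
        }
        return acc;
    }

    public static void main(String[] args) {
        // Treat the first runs as warm-up and compare only the later ones.
        long[] t = timeRuns(WarmupTimer::workload, 8);
        for (int i = 0; i < t.length; i++) {
            System.out.println("run " + i + " took: " + t[i] + " ms");
        }
    }
}
```

Running this once with `java WarmupTimer` and once with `java -server WarmupTimer` gives per-run numbers for each VM, which is a fairer comparison than a single `time` measurement that includes startup and compilation.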

lg Clemens

oneguyks
Offline
Joined: 2008-04-28
Points: 0

Startup time is irrelevant here. Both client and server have pretty bad startup time. It's not like I am comparing the JVM to C++. Server is twice as slow as client in this case, and I doubt this will change even if you run it for an hour.

oneguyks
Offline
Joined: 2008-04-28
Points: 0

>I really hate the "language shootout", as its nothing more than stupid, but still
>people seem to get influenced by it.

Though the site is not credible, some things can still be learned from it.

Have a look at this:

http://shootout.alioth.debian.org/gp4/benchmark.php?test=fannkuch&lang=j...

Enter n = 11 at the command line.

This one has many arrays and inner loops. What's strange is that Jet and GCJ are both twice as fast as HotSpot in my tests. How come? My guess is that it's the inner loops and array bounds checking. Can this be classified as a bug in HotSpot (given it's twice as slow as other JVMs and compilers)?

Message was edited by: oneguyks

linuxhippy
Offline
Joined: 2004-01-07
Points: 0

Yes, right - it seems HotSpot server doesn't do that well on this one.

/home/ce/Programme/jdk1.7.0b25/bin/java -server fannkuch
Pfannkuchen(11) = 51 took:10912
Pfannkuchen(11) = 51 took:10668
Pfannkuchen(11) = 51 took:7342
Pfannkuchen(11) = 51 took:7321

[ce@localhost ~]$ /home/ce/Programme/IBMJava2-142/jre/bin/java fannkuch
Pfannkuchen(11) = 51 took:5273
Pfannkuchen(11) = 51 took:5213
Pfannkuchen(11) = 51 took:5222
Pfannkuchen(11) = 51 took:5209

[ce@localhost ~]$ /home/ce/Programme/jrockit-R26.4.0-jdk1.5.0_06/bin/java fannkuch
Pfannkuchen(11) = 51 took:9934
Pfannkuchen(11) = 51 took:5131
Pfannkuchen(11) = 51 took:5043

I don't know. Sure, you can report it, but what for?

lg Clemens

oneguyks
Offline
Joined: 2008-04-28
Points: 0

If you try Jet (http://www.excelsior-usa.com/jet.html ) with this benchmark, it will be much faster than client too.

I guess it must be related to arrays in inner loops. HotSpot (both -client and -server) seems to have a problem with this benchmark.

linuxhippy
Offline
Joined: 2004-01-07
Points: 0

> If you try Jet (http://www.excelsior-usa.com/jet.html
> ) with this benchmark
Yes, JET is often really good for integer-heavy code which does not benefit from hotspot's profiling.

>it will be much faster than
> client too.
Note that if I execute the benchmark in a loop (without restarting the JVM), server is a bit faster than client for me.

> I guess it must be related to arrays in inner loops.
> HotSpot (both -client and server) seem to have a
> problem with this benchmark.
I guess this is caused by the strange (ugly?) loops used, with strange exit conditions.

lg Clemens

oneguyks
Offline
Joined: 2008-04-28
Points: 0

Strangely, I am finding more and more cases where client is almost twice as fast. For the regex benchmark, I tried using this package: http://jregex.sourceforge.net/

I am wondering, in this case, whether it has something to do with default GC behavior ...

First, look at client:

$ time java -verbose:gc -Xmx128m regexdna2
[GC 693K->456K(5056K), 0.0019620 secs]
[GC 2378K->1736K(5056K), 0.0006466 secs]
[Full GC 1736K->1416K(5056K), 0.0146277 secs]
[GC 3979K->3976K(5056K), 0.0005007 secs]
[Full GC 3976K->2696K(5056K), 0.0162736 secs]
[GC 7822K->7816K(10580K), 0.0003291 secs]
[Full GC 7816K->5256K(10580K), 0.0190434 secs]
[GC 15508K->15496K(20824K), 0.0004453 secs]
[Full GC 15496K->10376K(20824K), 0.0217736 secs]
[GC 20940K->20713K(31500K), 0.0011386 secs]
[Full GC 20713K->20713K(31500K), 0.0140789 secs]
[GC 30841K->30813K(37212K), 0.0004944 secs]
[Full GC 30813K->30813K(37212K), 0.0147285 secs]
[GC 48023K->47982K(55260K), 0.0006394 secs]
[Full GC 47982K->30813K(55260K), 0.0151136 secs]
[GC 40790K->40750K(55260K), 0.0004879 secs]
[Full GC 40750K->20547K(55260K), 0.0254714 secs]
agga|ttt 77650
[GC 37480K->37442K(55260K), 0.0004350 secs]
[Full GC 37442K->10309K(55260K), 0.0225812 secs]
[cgt]gggtaaa|tttaccc[acg] 150
a[act]ggtaaa|tttacc[agt]t 400
[GC 44133K->44094K(55260K), 0.0004913 secs]
[Full GC 44094K->10309K(55260K), 0.0146342 secs]
ag[act]gtaaa|tttac[agt]ct 400
agg[act]taaa|ttta[agt]cct 350
[GC 44131K->44094K(53436K), 0.0004450 secs]
[Full GC 44094K->10309K(53436K), 0.0150857 secs]
aggg[acg]aaa|ttt[cgt]ccct 150
[GC 27236K->27203K(46872K), 0.0003822 secs]
[Full GC 27203K->10280K(46872K), 0.0188164 secs]
agggt[cgt]aa|tt[acg]accct 200
[GC 27200K->27175K(36896K), 0.0004127 secs]
[Full GC 27175K->10280K(36896K), 0.0140028 secs]
agggta[cgt]a|t[acg]taccct 150
[GC 27200K->27175K(36896K), 0.0003505 secs]
[Full GC 27175K->10280K(36896K), 0.0146204 secs]
agggtaa[cgt]|[acg]ttaccct 250
[GC 27200K->27172K(36896K), 0.0004026 secs]
[Full GC 27172K->10278K(36896K), 0.0135637 secs]
[GC 20242K->20214K(36896K), 0.0003480 secs]
[Full GC 20214K->20214K(36896K), 0.0135797 secs]
[GC 37134K->37106K(53788K), 0.0004962 secs]
[Full GC 37106K->37106K(53788K), 0.0143887 secs]
[GC 57027K->56978K(66520K), 0.0005408 secs]
[Full GC 56978K->30150K(66520K), 0.0282548 secs]
[GC 50518K->50470K(66520K), 0.0005620 secs]
[Full GC 50470K->20661K(66520K), 0.0294084 secs]
[GC 58299K->58250K(66520K), 0.0005634 secs]
[Full GC 58250K->30820K(66520K), 0.0298184 secs]
[GC 51654K->51606K(66520K), 0.0004780 secs]
[Full GC 51606K->21128K(66520K), 0.0293133 secs]
[GC 59630K->59581K(66520K), 0.0005053 secs]
[Full GC 59581K->31520K(66520K), 0.0288703 secs]
[GC 52817K->52768K(66520K), 0.0005343 secs]
[Full GC 52768K->21589K(66520K), 0.0301817 secs]
[GC 60946K->60897K(66520K), 0.0005704 secs]
[Full GC 60897K->32213K(66520K), 0.0287576 secs]
[GC 53971K->53923K(66520K), 0.0005994 secs]
[Full GC 53923K->22051K(66520K), 0.0309468 secs]
[GC 40553K->40504K(66520K), 0.0004549 secs]
[Full GC 40504K->40504K(66520K), 0.0138565 secs]
[GC 62267K->62213K(72628K), 0.0005409 secs]
[Full GC 62213K->32905K(72628K), 0.0302025 secs]
[GC 55113K->55059K(72628K), 0.0005716 secs]
[Full GC 55059K->22495K(72628K), 0.0304660 secs]
[GC 63532K->63478K(72628K), 0.0005776 secs]
[Full GC 63478K->33571K(72628K), 0.0307626 secs]
[GC 56301K->56247K(72628K), 0.0004857 secs]
[Full GC 56247K->23018K(72628K), 0.0315106 secs]
[GC 65022K->64967K(72628K), 0.0005530 secs]
[Full GC 64967K->34355K(72628K), 0.0294395 secs]
[GC 57768K->57715K(72628K), 0.0005243 secs]
[Full GC 57715K->23701K(72628K), 0.0315908 secs]
[GC 66969K->66914K(72628K), 0.0006321 secs]
[Full GC 66914K->35380K(72628K), 0.0318008 secs]
[GC 59510K->59457K(72628K), 0.0005190 secs]
[Full GC 59457K->24418K(72628K), 0.0319360 secs]
[GC 44937K->44882K(72628K), 0.0005123 secs]
[Full GC 44882K->44882K(72628K), 0.0149970 secs]
[GC 69019K->68958K(80504K), 0.0005908 secs]
[Full GC 68958K->36456K(80504K), 0.0320348 secs]
[GC 61285K->61225K(80504K), 0.0005162 secs]
[Full GC 61225K->25111K(80504K), 0.0335020 secs]
[GC 70993K->70932K(80504K), 0.0006611 secs]
[Full GC 70932K->37495K(80504K), 0.0328627 secs]
[GC 63085K->63026K(80504K), 0.0005654 secs]
[Full GC 63026K->25872K(80504K), 0.0328683 secs]
[GC 73163K->73103K(80504K), 0.0005802 secs]
[Full GC 73103K->38637K(80504K), 0.0341534 secs]

5170900
5087300
6771300

real 0m4.929s
user 0m0.000s
sys 0m0.015s

Notice the time... 4.9 secs

Now server ....

$ time java -server -verbose:gc -Xmx128m regexdna2
[GC 31085K->28415K(33280K), 0.0019650 secs]
[Full GC 28415K->20714K(33664K), 0.0239854 secs]
agga|ttt 77650
[cgt]gggtaaa|tttaccc[acg] 150
a[act]ggtaaa|tttacc[agt]t 400
[GC 108649K->108640K(120128K), 0.0012729 secs]
[GC 108640K->108624K(120128K), 0.0025710 secs]
[Full GC 108624K->10281K(27584K), 0.0240469 secs]
ag[act]gtaaa|tttac[agt]ct 400
agg[act]taaa|ttta[agt]cct 350
aggg[acg]aaa|ttt[cgt]ccct 150
agggt[cgt]aa|tt[acg]accct 200
agggta[cgt]a|t[acg]taccct 150
agggtaa[cgt]|[acg]ttaccct 250
[GC 111666K->111646K(122496K), 0.0016516 secs]
[Full GC 111646K->10281K(31168K), 0.0169781 secs]
[GC 114953K->114921K(122496K), 0.0011348 secs]
[Full GC 114921K->30820K(53440K), 0.0392442 secs]
[GC 111370K->111322K(125760K), 0.0013002 secs]
[Full GC 111322K->21590K(49088K), 0.0311288 secs]
[GC 101167K->101075K(126656K), 0.0010298 secs]
[GC 101075K->101075K(130688K), 0.0022281 secs]
[Full GC 101075K->40504K(74432K), 0.0417102 secs]
[Full GC 125511K->33571K(68736K), 0.0367921 secs]
[Full GC 121662K->23701K(60288K), 0.0304179 secs]
[GC 111525K->111487K(130368K), 0.0187701 secs]
[Full GC 111487K->44883K(87936K), 0.0472232 secs]
[GC 114826K->114812K(130624K), 0.0205054 secs]
[Full GC 114812K->46164K(95552K), 0.0435001 secs]
[Full GC 118193K->47573K(99136K), 0.0430711 secs]

5170900
5087300
6771300

real 0m8.921s
user 0m0.000s
sys 0m0.031s

Note it's two times slower.
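
On the GC question: the pauses printed in the -server log above add up to well under one second of the 8.9 s run, so pause time alone does not seem to account for the gap. One way to check this from inside the program, rather than by adding up `-verbose:gc` lines, is to read the collectors' accumulated time through the standard `java.lang.management` API. The sketch below is only an illustration; `GcShare` and `churn` are hypothetical names, and `churn` is an allocation-heavy stand-in, not the regexdna2 code:

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcShare {
    /** Sums the accumulated collection time (ms) reported by all collectors. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    /** Hypothetical allocation-heavy stand-in for the regex benchmark. */
    static int churn(int rounds) {
        int len = 0;
        for (int i = 0; i < rounds; i++) {
            StringBuilder sb = new StringBuilder();
            for (int j = 0; j < 1000; j++) {
                sb.append("acgt");
            }
            len += sb.toString().replace('a', 't').length();
        }
        return len;
    }

    public static void main(String[] args) {
        long gcBefore = totalGcMillis();
        long start = System.nanoTime();
        int len = churn(200_000);
        long wallMillis = (System.nanoTime() - start) / 1_000_000L;
        long gcMillis = totalGcMillis() - gcBefore;
        System.out.println("chars processed: " + len);
        System.out.println("wall: " + wallMillis + " ms, of which GC: " + gcMillis + " ms");
    }
}
```

If the GC share turns out to be small under both -client and -server, the difference has to come from somewhere else, such as compilation or the generated code.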

briand
Offline
Joined: 2005-07-11
Points: 0

Again, 8s is most likely not enough time for the server compiler
to adequately warm up. Run your microbenchmark with -XX:+PrintCompilation
and observe how the compilations differ. Consider modifying your
benchmark to introduce a warm-up section, calling the logic to be
measured at least 3 times before entering the measurement phase.
The logic to be measured should take a parameter for how long
it should run. For the warm-up phase, use a smaller number, like
60s or so. For the measurement phase, run for 10mins or more.
The result should be based on the number of operations completed
during the measurement interval. In your case, probably pattern
matches/sec. Again, watch the output of -XX:+PrintCompilation to
see how -client and -server differ.
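
The structure described above could be sketched roughly as follows. This is an assumption-laden illustration, not the benchmark from the thread: it uses `java.util.regex` instead of the jregex package, a synthetic input string instead of the shootout file, and hypothetical names (`ThroughputBench`, `opsPerSecond`, `countMatches`). The measured logic takes a duration parameter, is called three times with a shorter duration as warm-up, and the result is reported as operations per second:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ThroughputBench {
    // Accumulates results so the JIT cannot eliminate the matching work.
    static volatile long sink;

    /** One "operation": count all matches of the pattern in the input. */
    static int countMatches(Pattern p, CharSequence input) {
        Matcher m = p.matcher(input);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    /** Repeats the operation for about durationMillis and returns operations per second. */
    static double opsPerSecond(Pattern p, CharSequence input, long durationMillis) {
        long start = System.nanoTime();
        long budget = durationMillis * 1_000_000L;
        long ops = 0;
        long elapsed;
        do {
            sink += countMatches(p, input);
            ops++;
            elapsed = System.nanoTime() - start;
        } while (elapsed < budget);
        return ops / (elapsed / 1e9);
    }

    public static void main(String[] args) {
        // Durations in ms; pass e.g. 60000 600000 for the 60s / 10min suggested above.
        long warmupMillis = args.length > 0 ? Long.parseLong(args[0]) : 2_000;
        long measureMillis = args.length > 1 ? Long.parseLong(args[1]) : 5_000;

        // Synthetic DNA-like input (an assumption; not the shootout input file).
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 10_000; i++) {
            sb.append("acgtagggtaaacgttttaccctga");
        }
        Pattern p = Pattern.compile("agggtaaa|tttaccct");

        for (int i = 0; i < 3; i++) {
            opsPerSecond(p, sb, warmupMillis); // warm-up, result discarded
        }
        System.out.println("ops/sec: " + opsPerSecond(p, sb, measureMillis));
    }
}
```

Because the result is a rate over a fixed interval, making the input file larger is not needed to get a longer run; only the duration arguments change.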

Brian

oneguyks
Offline
Joined: 2008-04-28
Points: 0

>Again, 8s is most likely not enough time for the server compiler
>to adequately warm up. Run your microbenchmark with -XX:+PrintCompilation

Ok, here is the benchmark:

http://pastebin.com/f7077d88c

the regex package I am using is this:
http://jregex.sourceforge.net/index.html

The input file is this
http://shootout.alioth.debian.org/download/regexdna-input.txt

made several times larger. The file I used is 5 MB, made by concatenating the above file with itself. I can simply make the file larger (let's say triple it to 15 MB). That should cover some of the warm-up phase, shouldn't it?

First the client:

$ time cat reg2.txt | java -XX:+PrintCompilation -Xmx512m regexdna2 >log.txt

1 java.lang.String::hashCode (60 bytes)
2 java.lang.String::charAt (33 bytes)
3 sun.nio.cs.SingleByteDecoder::decode (33 bytes)
4 ! sun.nio.cs.SingleByteDecoder::decodeArrayLoop (308 bytes)
5 java.nio.Buffer::position (43 bytes)
6 java.nio.charset.CoderResult::isUnderflow (13 bytes)
7 java.nio.Buffer::remaining (10 bytes)
--- n java.lang.System::arraycopy (static)
8 java.nio.charset.CoderResult::isOverflow (14 bytes)
9 java.nio.HeapByteBuffer::ix (7 bytes)
10 java.io.BufferedInputStream::getBufIfOpen (21 bytes)
11 java.lang.Object:: (1 bytes)
12 java.nio.Buffer::limit (62 bytes)
13 ! java.nio.charset.CharsetDecoder::decode (287 bytes)
14 sun.nio.cs.SingleByteDecoder::decodeLoop (28 bytes)
15 java.nio.ByteBuffer::hasArray (20 bytes)
16 java.nio.CharBuffer::hasArray (20 bytes)
17 sun.nio.cs.StreamDecoder::implRead (243 bytes)
18 java.io.BufferedInputStream::getInIfOpen (21 bytes)
19 java.nio.ByteBuffer::arrayOffset (35 bytes)
20 java.nio.ByteBuffer::array (35 bytes)
21 java.nio.Buffer::flip (20 bytes)
--- n java.io.FileInputStream::readBytes
22 java.io.FileInputStream::read (8 bytes)
23 ! sun.nio.cs.StreamDecoder::readBytes (281 bytes)
24 java.nio.HeapByteBuffer::compact (48 bytes)
25 s java.io.BufferedInputStream::read (113 bytes)
26 java.io.BufferedInputStream::read1 (108 bytes)
27 java.nio.Buffer:: (68 bytes)
28 java.io.InputStreamReader::read (11 bytes)
--- n java.io.FileInputStream::available
29 ! sun.nio.cs.StreamDecoder::read (196 bytes)
30 sun.nio.cs.StreamDecoder::ensureOpen (18 bytes)
31 ! sun.nio.cs.StreamDecoder::inReady (36 bytes)
32 java.lang.AbstractStringBuilder::append (46 bytes)
33 s java.io.BufferedInputStream::available (18 bytes)
34 java.lang.StringBuilder::append (10 bytes)
35 java.lang.String::indexOf (151 bytes)
36 java.lang.String::indexOf (166 bytes)
37 java.lang.String::replace (142 bytes)
38 java.lang.String::lastIndexOf (156 bytes)
39 java.lang.Character::getType (5 bytes)
40 java.lang.Character::getType (158 bytes)
41 jregex.Block::set (66 bytes)
42 java.lang.Character::getPlane (5 bytes)
43 java.lang.CharacterData00::getType (10 bytes)
44 java.lang.CharacterData00::getProperties (32 bytes)
1% jregex.Bitset:: @ 17 (88 bytes)
45 jregex.Block::count (29 bytes)
46 jregex.Bitset::add (52 bytes)
47 jregex.Block::add (109 bytes)
48 java.lang.String::equals (88 bytes)
49 java.lang.String::startsWith (78 bytes)
50 jregex.Matcher::search (4841 bytes)
51 jregex.SearchEntry::reset (69 bytes)
52 jregex.Matcher::bounds (221 bytes)
53 jregex.PerlSubstitution::appendSubstitution (24 bytes)
54 java.lang.AbstractStringBuilder::append (60 bytes)
55 s java.lang.StringBuffer::append (8 bytes)
56 jregex.Matcher::flush (100 bytes)
57 jregex.Matcher::init (23 bytes)
58 jregex.Matcher::find (17 bytes)
59 jregex.SearchEntry::popState (92 bytes)
60 s java.lang.StringBuffer::append (10 bytes)
61 jregex.Matcher::end (10 bytes)
62 jregex.Matcher::start (10 bytes)
63 jregex.Matcher::getGroup (38 bytes)
64 jregex.Replacer$1::append (12 bytes)
65 jregex.PerlSubstitution$PlainElement::append (35 bytes)
66 jregex.Replacer$1::append (10 bytes)
67 jregex.Matcher::setTarget (110 bytes)
2% jregex.Replacer::replace @ 8 (74 bytes)
68 jregex.Matcher::repeat (360 bytes)
69 jregex.Matcher::findBack (332 bytes)
agggtaaa|tttaccct 0
[cgt]gggtaaa|tttaccc[acg] 450
70 jregex.Matcher::skip (57 bytes)
a[act]ggtaaa|tttacc[agt]t 1200
ag[act]gtaaa|tttac[agt]ct 1200
agg[act]taaa|ttta[agt]cct 1050
aggg[acg]aaa|ttt[cgt]ccct 450
agggt[cgt]aa|tt[acg]accct 600
agggta[cgt]a|t[acg]taccct 450
agggtaa[cgt]|[acg]ttaccct 750
71 jregex.Replacer::replace (74 bytes)
3% jregex.Matcher::find @ 48 (306 bytes)
72 jregex.Matcher::find (306 bytes)
73 java.lang.String::getChars (66 bytes)

15512700
15261900
20313900

[b]real 0m12.972s[/b]
user 0m0.045s
sys 0m0.015s

Note the time is 0m12.972s

Now server ...

$ time cat reg2.txt | java -server -XX:+PrintCompilation -Xmx512m regexdna2 >log.txt

1 java.lang.String::charAt (33 bytes)
2 sun.nio.cs.SingleByteDecoder::decode (33 bytes)
3 ! sun.nio.cs.SingleByteDecoder::decodeArrayLoop (308 bytes)
1% ! sun.nio.cs.SingleByteDecoder::decodeArrayLoop @ 129 (308 bytes)
4 java.nio.Buffer::position (43 bytes)
5 java.nio.ByteBuffer::arrayOffset (35 bytes)
6 java.nio.CharBuffer::arrayOffset (35 bytes)
7 java.lang.Character::getType (5 bytes)
8 java.lang.Character::getType (158 bytes)
9 jregex.Block::set (66 bytes)
10 java.lang.Character::getPlane (5 bytes)
11 java.lang.CharacterData00::getType (10 bytes)
12 java.lang.CharacterData00::getProperties (32 bytes)
2% jregex.Bitset:: @ 17 (88 bytes)
13 jregex.Block::count (29 bytes)
14 jregex.Bitset::add (52 bytes)
15 jregex.Matcher::search (4841 bytes)
16 jregex.SearchEntry::reset (69 bytes)
17 jregex.Matcher::bounds (221 bytes)
18 jregex.PerlSubstitution::appendSubstitution (24 bytes)
--- n java.lang.System::arraycopy (static)
19 java.lang.AbstractStringBuilder::append (46 bytes)
20 java.lang.AbstractStringBuilder::append (60 bytes)
21 s java.lang.StringBuffer::append (8 bytes)
22 jregex.Matcher::flush (100 bytes)
23 jregex.Matcher::init (23 bytes)
24 jregex.Matcher::find (17 bytes)
25 jregex.SearchEntry::popState (92 bytes)
26 s java.lang.StringBuffer::append (10 bytes)
27 jregex.Matcher::end (10 bytes)
28 jregex.Matcher::start (10 bytes)
29 jregex.Matcher::getGroup (38 bytes)
30 jregex.Replacer$1::append (12 bytes)
31 jregex.Replacer$1::append (10 bytes)
32 jregex.Matcher::setTarget (110 bytes)
3% jregex.Replacer::replace @ 8 (74 bytes)
33 jregex.Matcher::repeat (360 bytes)
34 jregex.Matcher::findBack (332 bytes)
15 made not entrant (2) jregex.Matcher::search (4841 bytes)
4% jregex.Matcher::search @ 248 (4841 bytes)
agggtaaa|tttaccct 0
35 jregex.Matcher::search (4841 bytes)
[cgt]gggtaaa|tttaccc[acg] 450
a[act]ggtaaa|tttacc[agt]t 1200
ag[act]gtaaa|tttac[agt]ct 1200
agg[act]taaa|ttta[agt]cct 1050
aggg[acg]aaa|ttt[cgt]ccct 450
agggt[cgt]aa|tt[acg]accct 600
agggta[cgt]a|t[acg]taccct 450
agggtaa[cgt]|[acg]ttaccct 750
36 jregex.Replacer::replace (74 bytes)
5% jregex.Matcher::find @ 48 (306 bytes)
29 made not entrant (2) jregex.Matcher::getGroup (38 bytes)
32 made not entrant (2) jregex.Matcher::setTarget (110 bytes)
37 jregex.Matcher::find (306 bytes)
38 java.lang.String::getChars (66 bytes)
39 jregex.Matcher::getGroup (38 bytes)
36 made not entrant (2) jregex.Replacer::replace (74 bytes)
17 made not entrant (2) jregex.Matcher::bounds (221 bytes)
15 made zombie (2) jregex.Matcher::search (4841 bytes)
32 made zombie (2) jregex.Matcher::setTarget (110 bytes)
29 made zombie (2) jregex.Matcher::getGroup (38 bytes)
40 jregex.Matcher::setTarget (110 bytes)
41 jregex.Replacer::replace (74 bytes)
36 made zombie (2) jregex.Replacer::replace (74 bytes)
17 made zombie (2) jregex.Matcher::bounds (221 bytes)

15512700
15261900
20313900

real [b]0m24.580s[/b]
user 0m0.030s
sys 0m0.094s

It's still twice slower with 24 secs

Message was edited by: oneguyks


oneguyks
Offline
Joined: 2008-04-28
Points: 0

Also, how does the JIT compiler deal with inner loops? Can it remove array bounds checking in such situations? For example, for code with loops like this:
[code]
for(;;)
{
    if( something < 30 )
    {
        for( i=0;i }
    for(;r!=1;--r)
    {
        count[r-1] = r;
    }
    if(whatever) break;
}
[/code]

How would this get compiled? Would JIT remove bound checking here?
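
To make the question testable, the inner loop can be isolated and compared against an equivalent loop written in a more conventional shape. The sketch below is an assumption, not code from the fannkuch benchmark: `LoopShapes`, `fillDescending` and `fillAscending` are hypothetical names. HotSpot is known to eliminate range checks in simple counted loops; whether it does so for the `r != 1` exit condition is exactly what is unclear here, so the two methods perform the same stores and can be timed against each other:

```java
import java.util.Arrays;

public class LoopShapes {
    /** Loop shape from the post: counts r down with an r != 1 exit test.
     *  Requires 1 <= r <= count.length. */
    static void fillDescending(int[] count, int r) {
        for (; r != 1; --r) {
            count[r - 1] = r;
        }
    }

    /** The same stores written as an ascending counted loop with a < bound. */
    static void fillAscending(int[] count, int r) {
        for (int k = 1; k < r; k++) {
            count[k] = k + 1;
        }
    }

    public static void main(String[] args) {
        int n = 11;
        int[] a = new int[n];
        int[] b = new int[n];
        fillDescending(a, n);
        fillAscending(b, n);
        System.out.println("same result: " + Arrays.equals(a, b));

        // Crude timing only; repeat inside one JVM so both loops get compiled.
        for (int round = 0; round < 5; round++) {
            long t0 = System.nanoTime();
            for (int i = 0; i < 20_000_000; i++) fillDescending(a, n);
            long t1 = System.nanoTime();
            for (int i = 0; i < 20_000_000; i++) fillAscending(b, n);
            long t2 = System.nanoTime();
            System.out.println("descending " + (t1 - t0) / 1_000_000 + " ms, ascending "
                    + (t2 - t1) / 1_000_000 + " ms");
        }
    }
}
```

A difference between the two timings would only be a hint; looking at the generated code, as suggested in the reply below, is the way to confirm whether the checks were actually removed.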

Message was edited by: oneguyks

linuxhippy
Offline
Joined: 2004-01-07
Points: 0

For loops the JIT has a technique called "on-stack-replacement".

> How would this get compiled? Would JIT remove bound
> checking here?
I don't know for this example. Fastdebug builds can output the generated assembly code if asked; I guess this could be interesting for you :)

lg Clemens

linuxhippy
Offline
Joined: 2004-01-07
Points: 0

How long does the benchmark run? I ran it and it completed in a few seconds.

The server compiler, however, is optimized for long-running applications: it does very expensive optimizations and lets the program run interpreted for longer to generate a profile of the running app.

lg Clemens

oneguyks
Offline
Joined: 2008-04-28
Points: 0

Even for applications that run only a few seconds, -server is faster than client in most cases. In this case, it's apparently slower.

linuxhippy
Offline
Joined: 2004-01-07
Points: 0

Well, after all, benchmarks that run for a few seconds are totally flawed.
If the benchmark focuses on runtime startup performance it's OK, but from what I see it's focused on comparing "language performance" (whatever that means), and performance is way more than startup time.
I really hate the "language shootout", as it's nothing more than stupid, but still people seem to get influenced by it.

Server is slower than client because:
- It needs longer for startup and initialization.
- It runs longer in interpreter mode (10,000 method invocations, compared to 1,500 with client).
- Its code generation is quite expensive, so it may run way more than 10,000 invocations interpreted (compilation is triggered at 10,000, but the code is installed only when it's done), and it takes CPU away from the running code itself.
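
The invocation threshold mentioned above corresponds to the HotSpot flag `-XX:CompileThreshold`. A small sketch for checking the value the running VM actually uses, assuming a HotSpot-based JVM and Java 7 or later (for `ManagementFactory.getPlatformMXBean`); `ShowCompileThreshold` and `vmOption` are hypothetical names:

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

public class ShowCompileThreshold {
    /** Returns the current value of a HotSpot VM option as a string.
     *  Throws IllegalArgumentException if the VM has no option of that name. */
    static String vmOption(String name) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        return bean.getVMOption(name).getValue();
    }

    public static void main(String[] args) {
        System.out.println("CompileThreshold = " + vmOption("CompileThreshold"));
    }
}
```

Comparing the output of `java -client ShowCompileThreshold` and `java -server ShowCompileThreshold` shows the difference directly on VMs that still ship both compilers. Later HotSpot versions with tiered compilation use additional thresholds, so the single number is only part of the picture there.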

But don't worry, these results are really completely meaningless.

lg Clemens

oneguyks
Offline
Joined: 2008-04-28
Points: 0

"I really hate the "language shootout", as its nothing more than stupid, but still people seem to get influenced by it."

The example originally came from the IBM site, as linked in my first post.

"Runs longer in interpreter-mode (10,000 method invocations, compared to 1,500 with client)"

Does that 10,000-invocation threshold apply to method invocations only, or to loops also? How would -server compile the code if the method is just main with many loops?
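
One way to see this for yourself is a method that is called exactly once but spends all its time in a loop, so its invocation count can never reach the threshold. HotSpot also counts loop back-edges, and a hot loop can be compiled and entered while the method is still running (the on-stack replacement mentioned earlier in the thread). In `-XX:+PrintCompilation` output these compilations are the lines marked with `%` and an `@` bytecode index, like `jregex.Matcher::find @ 48` in the logs above. The class below is a hypothetical minimal example, not code from the thread:

```java
public class OsrDemo {
    /** Called only once from main; all the work happens inside the loop. */
    static long run(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += i % 7;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Run with: java -server -XX:+PrintCompilation OsrDemo
        // and look for a line containing "%" and "OsrDemo::run @".
        System.out.println(run(500_000_000));
    }
}
```

If such a `%` line appears for `OsrDemo::run`, the loop was compiled even though the method was invoked a single time.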