Posted by peterkessler
on November 17, 2004 at 5:16 PM PST
Rummage around in the HotSpot virtual machine code and you'll often find two of everything. Here's why.
You might have noticed that in addition to the
Tiger source snapshots, we have just posted a
Mustang source snapshot under the
Java Research license.
So now we have two code lines being actively worked on:
we continue to find bugs in Tiger and fix them in update releases,
and we have on-going development in Mustang.
That's bound to be confusing.
I'm a HotSpot virtual machine engineer so I'm used to having
two (or more) code lines in progress. But if you are going
to rummage around in the HotSpot virtual machine sources
(hotspot/src), I'd like to explain why we seem to have
two of a lot of things. I'll gradually work my way down through
the layers of this particular onion.
- We have to ship releases.
We have Tiger and Mustang (and all the other releases)
because we want to get stuff out into the hands of our users.
But we're never really "done", so development is continuous
while releases are periodic. What you see by looking
at the current Tiger and Mustang source snapshots is that early
in a release (Mustang) there isn't really that much difference
from the previous release (Tiger). But new development
stopped on Tiger months ago. At any given time, we have (at
least) two releases in progress, one in active development,
the other(s) for bug fix updates.
- Backward compatibility.
Once you get into the sources for the virtual machine,
you'll find that we often have two implementations
of things. One reason for this is that we think backward
compatibility is really important. While we are working
on some new thing, we have to keep the old thing working,
and the easiest way to do that is to keep the old thing
around. While browsing through the sources, you'll find
a lot of code guarded by command-line switches you probably
didn't know about. (Look at all those command line switches in
hotspot/src/share/runtime/globals.hpp!) Those are there
so we can do A-B comparisons for functionality, conformance,
performance, footprint, etc. Only when a new implementation
shows itself to be compatible and substantially better than
the old one do we throw the switch to use the new one. And
we usually leave the switch around for a release or two
in case someone wants to revert to the old behavior.
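To make the switch-guarded pattern concrete, here is a minimal sketch of how an old and a new implementation can sit side by side behind one flag. Everything in it is hypothetical: the flag name UseNewSpaceCounter and both functions are invented for illustration and are not taken from globals.hpp or any other HotSpot source file.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical switch, standing in for a flag of the kind declared in
// globals.hpp. The name and the default value are invented for this sketch.
bool UseNewSpaceCounter = false;

// Old implementation: kept in place so behavior can be compared and reverted.
std::size_t old_count_spaces(const std::string& s) {
  std::size_t n = 0;
  for (std::size_t i = 0; i < s.size(); i++) {
    if (s[i] == ' ') n++;
  }
  return n;
}

// New implementation: has to give the same answers before it becomes the default.
std::size_t new_count_spaces(const std::string& s) {
  return static_cast<std::size_t>(std::count(s.begin(), s.end(), ' '));
}

// Callers use a single entry point; the flag decides which code runs.
std::size_t count_spaces(const std::string& s) {
  return UseNewSpaceCounter ? new_count_spaces(s) : old_count_spaces(s);
}
```

Running the same workload once with the flag off and once with it on is the A-B comparison described above. Changing the default is a one-line edit, and so is reverting it.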
- Dessert topping or floor wax.
One of the problems with being a successful Java virtual
machine is that people want to use you for everything, even
things you didn't exactly anticipate. While the Java platform
might have burst on the scene as a way of executing content
for small applets in web browsers (and people still use it for
that), people now also use it for running gigantic high-throughput
applications on big multiprocessors. Of course, we want to
make everyone happy, but that often means having alternate
implementations inside the virtual machine. You can see this
in the choice of the client versus server runtime compiler:
the client runtime compiler gives good startup and modest
performance, while the server runtime compiler is not as fast
to start up, but the code it generates runs significantly faster.
You can't use them both at the same time (yet), but if you are
looking around the code base, you'll find both runtime compilers.
- One size does not fit all.
In that same style of offering different qualities of service,
we offer something like 3 different garbage collection algorithms.
The concurrent mark sweep algorithm provides lower pause times at
some cost in performance, while the parallel collector offers
better performance with occasional longer pauses. We're not going
to make that choice for our users. If you go looking for "the
garbage collector", you'll be disappointed (or maybe pleasantly
surprised) to find at least three of them in there.
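The "alternate implementations, chosen by the user" idea can be sketched as several classes behind one interface, with the choice made once at startup. This is only an illustration: the class names, the CollectorChoice enum, and make_collector are all invented, and the third (serial) entry is an assumption about which collector rounds out the "something like 3". None of this mirrors the actual HotSpot class layout.

```cpp
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical interface; the names do not match HotSpot's real classes.
class Collector {
 public:
  virtual ~Collector() {}
  virtual std::string name() const = 0;
  // True if the collector trades some throughput for shorter pauses.
  virtual bool prefers_short_pauses() const = 0;
};

class SerialCollector : public Collector {
 public:
  std::string name() const override { return "serial"; }
  bool prefers_short_pauses() const override { return false; }
};

class ParallelCollector : public Collector {
 public:
  std::string name() const override { return "parallel"; }
  bool prefers_short_pauses() const override { return false; }
};

class ConcurrentMarkSweepCollector : public Collector {
 public:
  std::string name() const override { return "concurrent mark sweep"; }
  bool prefers_short_pauses() const override { return true; }
};

// The user's choice, as it might look after command-line parsing.
enum class CollectorChoice { Serial, Parallel, ConcurrentMarkSweep };

// Build exactly one collector for the lifetime of the process.
std::unique_ptr<Collector> make_collector(CollectorChoice choice) {
  switch (choice) {
    case CollectorChoice::Serial:
      return std::unique_ptr<Collector>(new SerialCollector());
    case CollectorChoice::Parallel:
      return std::unique_ptr<Collector>(new ParallelCollector());
    case CollectorChoice::ConcurrentMarkSweep:
      return std::unique_ptr<Collector>(new ConcurrentMarkSweepCollector());
  }
  throw std::invalid_argument("unknown collector choice");
}
```

The rest of the program only ever sees the Collector interface, which is why all of the implementations can live in the same source tree without getting in each other's way.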
- We learn, but slowly.
The HotSpot Java virtual machine is a work in progress.
Ideas that we had a while ago might have been appropriate for
then, but things change. So the source base changes too. We
learn things about the interactions of the different parts of
the virtual machine (basically: compiler(s), garbage collector(s),
and runtime system) and we try to clean up the code. In some
parts of the virtual machine, that means changing the interfaces,
but our desire not to be disruptive means we often leave the
old interface and implementation in place for a release or two.
For example, the older garbage collectors use a "generation
framework" that is extremely flexible, but has some overhead.
The newest collector uses a less flexible interface that is
more efficient. We won't gratuitously convert the older
collectors (we'd risk breaking things, for no benefit to you,
our customers), so you'll find both programming styles in there
if you look.
- It depends on your point of view.
Sometimes you'll be prowling through the source and come
across things that look to be two of the same thing. For example,
hotspot/src/share/oops/oopsHierarchy.hpp shows what appear to be
similar hierarchies for oops and klasses.
But those are not alternate implementations, or us evolving the
interface, or anything like that. They are two faces of
the virtual machine's view of the data structures used to
represent Java objects (and a few VM internal data structures).
Simplifying somewhat: an oop is the Java reference to an object,
whereas the klass is the way we manipulate that object from the
C++ code inside the virtual machine. That's an example of where
you have to be able to hold both ideas in your head at the same
time, instead of looking at only one of them.
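A heavily simplified sketch of the oop/klass split might look like the following. The type names (KlassSketch, ObjectSketch, OopSketch) and the helper functions are invented for illustration. The real declarations behind oopsHierarchy.hpp are considerably more involved, so treat this as a picture of the idea rather than of the code.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for the "klass" face: what the VM's C++ code
// knows about a class of objects.
struct KlassSketch {
  std::string name;           // e.g. the Java class name
  std::size_t instance_size;  // bytes needed for one instance
};

// Hypothetical stand-in for an object in the heap: it carries a pointer
// to the description of its class.
struct ObjectSketch {
  const KlassSketch* klass;
};

// The "oop" face: nothing more than a reference to an object.
typedef ObjectSketch* OopSketch;

// VM code answers questions about an object by going through its klass.
std::size_t size_of(OopSketch obj) {
  return obj->klass->instance_size;
}

bool is_instance_of(OopSketch obj, const KlassSketch& k) {
  return obj->klass == &k;
}
```

The two type families run in parallel because each kind of object reference has a matching description on the VM side, which is why the hierarchies look like duplicates at first glance.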
The HotSpot virtual machine is a collection of engineering
tradeoffs and compromises. As such you will often find more
than one way of doing things when you look through the sources.
I hope I've clarified some of the reasons for that. If not,
ask questions and I'll try to come up with answers.