Posted by mlam
on November 25, 2006 at 1:19 PM PST
Why is the phoneME Advanced VM (CVM) written in C instead of C++? And a few other thoughts about performance and portability ...
I've been talking a lot about esoteric knowledge of the phoneME Advanced VM (CVM), and thought that it was about time to feed you some really technical data. So, I spent most of yesterday rendering a Map of CVM to show you the lay of the land, but it is taking a lot longer than I thought. As a result, no blog entry yesterday. :-( Hopefully, I will get it done today, and be able to do a write-up for Monday. Look for it. It'll be like CVM in a nutshell.
By the way, I'm using InkScape to do my rendering of the CVM map (a colleague pointed me to it). I don't know if it's the best, but it certainly does the job. So, I thought I'd give it a mention here in case others are looking for a tool like this too. I'm using InkScape because I wanted to render the CVM map in SVG, so that you'll get to scale it to match whatever resolution you need without sacrificing detail. But alas, I'm finding that my browsers aren't quite able to display the SVG format yet (or maybe I'm not exporting to the right format). If anyone has hints on what SVG format is supported by popular web browsers, please let me know. Otherwise, I will go with a bitmap for ease of viewing and a PDF for finer inspection.
Incidentally, I also want to thank the 2 people who have left comments for me so far. It's nice to know that I'm not just talking to a wall.
So, on to today's topic(s) ...
Why is CVM written in C?
The choice was either C or C++. As I've pointed out in a previous article, CVM's architecture matches closely to an object class hierarchy. Using C++ could have been an option. The reason we chose C is that the availability of good C compilers for the embedded space far exceeds the availability of good C++ compilers. I remind you that portability is one of the prime objectives of CVM, and is one reason why it is viable in the embedded space. Typically when new hardware is introduced, it will at least come with a C compiler. A C++ compiler may or may not come later. We wanted the Java platform to be available on every device. Hence, it was an obvious choice to go with C.
On a second note, we've found that some C++ compilers also generate very inefficient code in terms of footprint (2 to 3 times more footprint). This certainly is not good for any embedded software. Now, before you jump to conclusions, I don't think that this inefficiency necessarily had to do with the C++ language itself. Personally, I'm a fan of C++ as well, and I know how it can let you write really elegant and efficient code (assuming the compiler cooperates), as well as really bad bloated code. My guess at the time was that people in general didn't care enough about C++ to invest in its toolchain (in comparison with C) ... not to say that there aren't very good C++ tool chains out there. As a result, C++ is given a bad name ... which I think is unfortunate.
Mind you, the CVM decision was made some 7 years ago. The inefficient C++ code generation was observed about 3 to 4 years ago. Perhaps, these issues of availability and efficiency have been fixed since.
Some more of my thoughts on portability and performance below ...
a note on Portability
I've mentioned previously that CVM's code is organized with the highest code density in the shared folder (as opposed to the platform specific ones). To give you an idea of specifics, back in 2003, we measured the lines of code for the dynamic adaptive compiler (the JIT) in shared vs platform specific. In this case, the shared portion included the RISC layer in the portlibs folder. The ratio was 80-90% of the code being shared code, with the remainder being CPU or OS specific code. The majority of this non-shared code consists of assembler routines for the specific CPU architecture. If we factor in the rest of the VM, then the ratio for shared code will go even higher. We didn't actually measure this, but if I had to take a wild guess, I'd say upwards of 95%.
A very small subset of the assembler routines is necessary as glue between the compiled code and the rest of the VM. The remaining majority are all optimizations for the specific CPU architecture. The good news is that, except for the small amount of glue code, the CPU specific optimizations are all optional. That means that they need not be implemented in order for the JIT to be operational with correct behavior. The shared portion of the code alone (with the minimum assembler glue) will yield a significant share of the performance gain (50-70% of the possible gains). The additional optimizations can be added as needed, or as time allows based on the VM developer's schedule.
This highlights another of CVM's architectural philosophies. That is, a VM developer must be able to bring up a port of CVM with the absolute minimum of effort, and yet get as much performance as is reasonably possible. Then, as time permits, tuning and optimizations can be added to get the additional performance gains. This is why we do the majority of innovations and enhancements in shared code and write it in C.
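To make the "optional optimization" idea concrete, here is a minimal sketch of one common way to structure it in C. This is not CVM's actual code. The names PORT_HAS_FAST_COPY_WORDS, port_copy_words, and copy_and_sum are made up for illustration. The shared code provides a portable C fallback, and a port that has the time can define the macro and supply its own tuned (e.g. assembler) routine under the same name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hook (not a CVM name): a port may define
 * PORT_HAS_FAST_COPY_WORDS and supply its own port_copy_words(),
 * for example as a hand-tuned assembler routine. */
#ifdef PORT_HAS_FAST_COPY_WORDS
void port_copy_words(uint32_t *dst, const uint32_t *src, size_t count);
#else
/* Shared, portable fallback: correct on every platform,
 * though possibly not the fastest. */
static void port_copy_words(uint32_t *dst, const uint32_t *src, size_t count)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < count; i++) {
        dst[i] = src[i];
    }
}
#endif

/* Shared code calls the same name either way, so it does not care
 * whether the port supplied an optimized version. */
static uint32_t copy_and_sum(const uint32_t *src, size_t count)
{
    uint32_t tmp[16];
    uint32_t sum = 0;

    assert(count <= 16);
    port_copy_words(tmp, src, count);
    for (size_t i = 0; i < count; i++) {
        sum += tmp[i];
    }
    return sum;
}
```

With this shape, a new port works as soon as the shared code compiles, and the optimized routine can be dropped in later without touching any caller.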
To bring up just an interpreter build of CVM (i.e. with the JIT disabled) takes even less effort. For that case, there is only one assembler routine that needs to be written. That routine is commonly known as invokeNative, and it is responsible for being the glue between the interpreter and native code. Its full name is actually CVMjniInvokeNative. See here for an example. If you are doing a Linux port (or a port to another OS which we already support), then you are good to go after this. If not, you will have to implement the HPI, but you can use code from existing ports as reference.
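For those wondering why this kind of glue usually ends up in assembler: plain C cannot portably build a call whose argument count and types are only known at run time. The rough sketch below is not CVMjniInvokeNative and does not use its signature. The names generic_fn, sketch_invoke_native, native_add, and native_negate are all hypothetical. It shows how far C gets you if you restrict yourself to a fixed menu of signatures, which is exactly the restriction a real invokeNative cannot afford.

```c
#include <assert.h>
#include <stddef.h>

/* Generic function pointer used only to store a native entry point.
 * It is always cast back to the function's real type before the call. */
typedef void (*generic_fn)(void);

/* Hypothetical stand-in for an invokeNative-style routine. It handles
 * only word-sized arguments and at most two of them. A real one must
 * cope with any signature, which is why it is written in assembler. */
static long sketch_invoke_native(generic_fn fn, const long *args, size_t argc)
{
    assert(fn != NULL);
    switch (argc) {
    case 0:
        return ((long (*)(void))fn)();
    case 1:
        return ((long (*)(long))fn)(args[0]);
    case 2:
        return ((long (*)(long, long))fn)(args[0], args[1]);
    default:
        assert(!"argument count not supported by this sketch");
        return 0;
    }
}

/* Two example "native methods" for the sketch to call. */
static long native_add(long a, long b) { return a + b; }
static long native_negate(long a) { return -a; }
```

Calling sketch_invoke_native((generic_fn)native_add, args, 2) with args holding 40 and 2 returns 42. Every extra signature needs another case, so the approach does not scale to arbitrary native methods.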
In summary, CVM achieves ease of portability not only by reducing the amount of porting work necessary (i.e. a high ratio of shared code to platform specific code), but also by allowing the porting effort to be done in phases. Apart from making it easier for developers to meet their deployment schedules, this also increases testability of the system. As the system is brought up in phases, it can be tested in phases. It's not all or nothing.
a note on Performance
My formal educational background is in hardware. So, I used to get confused whenever I heard about how certain software innovations would improve the performance of the system. How can software possibly make hardware run any faster than it already does? The answer is that it doesn't.
The fastest any software can get is as dictated / limited by the hardware that it runs on. However, hardware just by itself usually isn't very useful. It takes software to direct the hardware in order to do useful work. This "directing" translates into a management cost that we usually don't like to pay for though it is necessary. Hence, the whole idea of software performance improvements is not so much about getting more performance out of the hardware. But rather, it is about reclaiming performance that was previously lost to the management cost.
Some of this "management cost" is incurred to make the code more portable and maintainable. This topic of code maintainability probably does not need a lot of explaining for any developer who has ever had to maintain a piece of software for years. However, I thought I'd bring it up anyway just in case there are those who are less seasoned who will be contributing to our open-source code base ... and I certainly hope that there will be. Everyone is welcome.
Like portability, maintainability of CVM code is also very important. Since phoneME is now open-source and, soon, everyone will be able to contribute bug fixes and enhancements, I think it is important to remind everyone that the code needs to stay in a maintainable form. After all, how open-source is a piece of code if no one can understand it? It might as well have been encrypted with the secret key thrown away. I say thrown away because unmaintainable code also tends to be unmaintainable even to its creators once some time has passed.
The point is that we shouldn't sacrifice maintainability in the name of performance. It is a fallacy to think that we can't have both high performance and maintainable code at the same time. Very rarely will there be an exception where we might have to sacrifice some maintainability. Even then, we should try to contain the damage so that it is localized and not widespread across the code base.
This is one reason that CVM's interpreter loop was written in C instead of assembler. It is certainly possible to write the interpreter loop in assembler, but the cost of doing so would be significant in terms of portability and maintainability. In practice, we found that the performance of CVM's C interpreter loop is not far behind an optimized assembler loop, but the portability and maintainability are significantly better.
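For readers who have not seen one, here is a minimal sketch of what a C interpreter loop looks like. This is a toy, not CVM's interpreter. The opcodes (OP_PUSH, OP_ADD, OP_MUL, OP_HALT), the run function, and STACK_MAX are invented for illustration and have nothing to do with real Java bytecodes. The shape is the point: fetch an opcode, switch on it, do the work, repeat. All of it is plain C that any C compiler can build.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Made-up opcodes for a tiny stack machine. */
enum { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };

#define STACK_MAX 32

/* Runs the program until OP_HALT and returns the top of the stack. */
static int32_t run(const uint8_t *code)
{
    int32_t stack[STACK_MAX];
    size_t sp = 0;  /* index of the next free stack slot */
    size_t pc = 0;  /* index of the next byte to fetch */

    for (;;) {
        switch (code[pc++]) {
        case OP_PUSH:  /* operand: one unsigned byte following the opcode */
            assert(sp < STACK_MAX);
            stack[sp++] = code[pc++];
            break;
        case OP_ADD:   /* pop two values, push their sum */
            assert(sp >= 2);
            sp--;
            stack[sp - 1] += stack[sp];
            break;
        case OP_MUL:   /* pop two values, push their product */
            assert(sp >= 2);
            sp--;
            stack[sp - 1] *= stack[sp];
            break;
        case OP_HALT:
            assert(sp >= 1);
            return stack[sp - 1];
        default:
            assert(!"unknown opcode");
            return 0;
        }
    }
}
```

Running the program { OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_PUSH, 4, OP_MUL, OP_HALT } computes (2 + 3) * 4 and returns 20. An assembler version of the same loop would have to be rewritten for every CPU, while this one only needs a recompile.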
end of rant
OK, thanks for bearing with me through that. Next, a look at the world of CVM, its subsystems, and data structures, all on a single map ... a single picture ... the BIG PICTURE (literally).
Have a nice weekend, everyone. :-)