Posted by mlam
on April 22, 2007 at 3:46 AM PDT
This is a follow-up to my previous article "Why Choose Java?". This article tries to answer some of the questions that embedded device developers are likely to ask when choosing a software platform to develop on.
Previously, I talked about why an embedded systems developer would choose to develop on the Java platform. If you have read that article and are intrigued by the benefits that the Java platform offers, then the next step is probably to ask some deeper, more probing questions like ...
Do I really need the Java platform?
Sure, the Java platform offers many benefits. But is it needed for my specific device?
Well, if you want the benefits of a runtime interpreted scripting language (e.g. isolation, upgradeability), then, as I have explained previously, your best bet is the Java platform.
You may not need the Java platform if your device has the following characteristics:
static functionality: the software functionality never needs to be upgraded in the field, not even for bug fixes. Or, it is cheaper to replace the device than to replace the software (although this isn't very eco-responsible). Or you can live with the cost of providing the service and infrastructure to completely re-flash the software in deployed devices. Under these conditions, you do not need the Java platform's dynamic class loading/unloading feature.
simple software: the software application is extremely simple. When the number of lines of code is low enough, the complexity of the software may be manageable, and not be overwhelming. Hence, the likelihood of your programmers being able to understand the entire system is higher, and the number of details for them to remember is lower. Under this condition, you can live without the Java platform's isolation property and language features (e.g. protection from stray pointers, automatic garbage collection, structured locking, etc.), and still be able to get a reasonable amount of developer productivity.
I will also talk more about "simple software" from the perspective of performance and footprint below.
small and restricted developer group: if the group of developers is small, then they are easier to manage. The likelihood that their code will accidentally step on each other's code is lower. If there is only one developer group for the software and you will never need other groups or third parties to develop software for your device, then there is no risk of their code trampling on your code. Under these circumstances, you may not need the Java platform's isolation property and security features.
software can always be trusted: If there will never be any software deployed on your device that is from an untrusted source (i.e. can perform attacks on or crash your device), then you may not need the Java platform's isolation property and security features. Or alternatively, if you don't care if they attack and crash your device, then you may not need the Java platform.
Generally, if your situation doesn't fit into one of the above profiles, then it is likely that you will benefit from developing on the Java platform.
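To make the "dynamic class loading" point above a little more concrete, here is a minimal sketch of selecting an implementation class by name at run time. The DeviceFeature interface and its two implementations are hypothetical names invented for this illustration, and the sketch uses only standard reflection calls. A real device would more likely load the upgraded class from newly delivered class files through its own class loader.

```java
// Entry point: resolves an implementation by class name at run time.
class FeatureLoader {
    // Look the class up by name rather than linking to it statically,
    // so a different implementation can be selected without rebuilding this code.
    static DeviceFeature load(String className) {
        try {
            Class<?> cls = Class.forName(className);
            return (DeviceFeature) cls.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("cannot load " + className, e);
        }
    }

    public static void main(String[] args) {
        // Default to the original implementation; pass another class name to switch.
        String name = args.length > 0 ? args[0] : BasicFeature.class.getName();
        System.out.println(load(name).describe());
    }
}

// Hypothetical contract between the system and an upgradeable feature.
interface DeviceFeature {
    String describe();
}

// Implementation that could ship with the original software image.
class BasicFeature implements DeviceFeature {
    public String describe() { return "basic feature v1"; }
}

// Implementation that could be delivered later as an upgrade.
class UpgradedFeature implements DeviceFeature {
    public String describe() { return "upgraded feature v2"; }
}
```

The caller only depends on the interface and a class name, which is what lets the implementation be swapped in the field. If your device never needs that kind of swap, this flexibility buys you nothing.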
How does Java fragmentation affect me?
Fragmentation is the term that most people use to describe what happened in the mobile phone space with JavaME CLDC implementations. The effect of fragmentation is that even though the Java platform aims to be Write Once Run Anywhere (WORA), in the mobile phone space, this isn't quite true. As a result, application developers end up having to test their applications on many different devices (instead of just on one reference platform), and sometimes (or maybe oftentimes) they also have to write customizations in the application for each of the variations that they discover in those devices.
Fragmentation is bad for the application developers (also known as Independent Software Vendors, or ISVs) because the extra coding and testing add development costs. Fragmentation is also bad for the phone carriers because it reduces the amount of available content (i.e. applications and services) for their phones, and it increases deployment costs because of the version tracking that is needed to deal with the variations between devices on their network. Bottom line, fragmentation is bad for everyone's business.
So, does this affect you as an embedded device developer?
Well, first of all, let's understand why there is fragmentation. The reason for this fragmentation (i.e. differences in behavior) is that the mobile phone industry wanted to leave room in the specifications for differentiation of products. Unfortunately, the amount of room available led to the type of variations that introduce incompatibilities between devices. Hence, the incompatibilities in the Java implementations are there because the industry "wanted" it that way. I put "wanted" in quotes because, in retrospect, most people would agree that this was a bad decision. That's why there are efforts (e.g. JSR 248, Mobile Service Architecture, commonly referred to as MSA) to create a more unified Java implementation.
But let's look at the embedded device developer's perspective. Fragmentation does not affect you because you get to make sure that the hardware is adequately compatible between your devices. You would have to do this anyway if you were coding with a native language like C/C++ and wanted to re-use code between your devices.
Portability in C/C++?
If the device hardware is adequately compatible, then why can't you just implement your software in C/C++ while keeping portability in mind? Wouldn't you get just as much portability from your code because you designed it well?
The answer is yes. It is possible to implement a good porting abstraction in C/C++ so that it will minimize your effort to port to a new device. Sun's CVM (aka phoneME Advanced VM) itself, which I work on, is for the most part C code that implements a well thought out porting abstraction. That's why it is easy to port to different devices.
Why go with the Java platform then? As I have explained previously, portability of software isn't the only benefit of the Java platform. You also get lots of other Java platform perks like:
isolation of critical code: don't need to worry about stray pointers interfering with your critical code anymore, or crashing your system.
improved productivity: because you don't have to worry about stray pointers interfering with your critical code, or any code for that matter. There are no more stray pointer problems.
language and API feature set: fine-grain security, class loading and unloading, etc.
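As a small illustration of the "no stray pointers" point, here is a sketch of what happens when buggy Java code writes past the end of a buffer. The class and field names are made up for this example. In C/C++ the equivalent off-by-one write could silently overwrite whatever lives next to the buffer. In Java, the runtime stops the write and reports it.

```java
class StrayWriteDemo {
    // Stands in for "critical" state that buggy code must not corrupt.
    static final int[] criticalTable = {1, 2, 3, 4};

    // Buggy routine: the loop bound is off by one, so the last
    // iteration tries to write one element past the end of the buffer.
    static void buggyFill(int[] buffer) {
        for (int i = 0; i <= buffer.length; i++) {
            buffer[i] = -1;
        }
    }

    // Returns true if the runtime stopped the out-of-bounds write.
    static boolean strayWriteWasStopped() {
        try {
            buggyFill(new int[4]);
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("stray write stopped: " + strayWriteWasStopped());
        System.out.println("critical data intact: " + (criticalTable[0] == 1));
    }
}
```

The bug is still a bug, but it shows up at the faulty line instead of as corrupted data somewhere else in the system.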
Development, Testing, and Deployment Costs
Let's say we disregard the above benefits of the Java platform, and just compare development, testing, and deployment costs. How would the C/C++ development route compare?
In terms of development, as a device developer, you won't be able to get away from C/C++ (or even assembly) completely. Ultimately, you need to do some low level work that is not conveniently done in the Java language. At the very least, you need to port the Java VM and library native code.
But consider this: if you develop your software (application and system code) entirely in C/C++, you may be subject to debugging problems like memory corruption, stack overflows, and other such bugs which are extremely difficult to isolate. If you develop most of your software in Java and implement only a small critical set in C/C++, then debugging time decreases because the percentage of code that can contribute to the problem has been reduced significantly.
An old proven method for managing the complexity of a problem is the "divide and conquer" method. For debugging, this works by isolating sub-systems from each other so that you can eliminate each of them as the potential source of the bug. With the majority of your code written in Java, your debugging session will start out with a lot of your code already proven to not be the cause of the memory corruption or stack overflows. And this is simply because that code is implemented in the Java language. Hence, using the Java platform can help reduce your debugging effort (and improve your productivity).
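Here is a minimal sketch of why Java code can be ruled out as a hidden source of stack overflow problems. The class and method names are invented for this example. A runaway recursion in Java code does not quietly overwrite neighboring memory. The VM reports it as a StackOverflowError with a stack trace pointing at the offending method.

```java
class StackOverflowDemo {
    // Buggy routine with no base case: the recursion never ends on its own.
    static int runaway(int depth) {
        return 1 + runaway(depth + 1);
    }

    // Returns true if the VM reported the overflow as a StackOverflowError.
    static boolean overflowIsReported() {
        try {
            runaway(0);
            return false;
        } catch (StackOverflowError e) {
            // The error carries a stack trace that names runaway().
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("overflow reported by the VM: " + overflowIsReported());
    }
}
```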
Since you would have to port the Java implementation to your new device, wouldn't you incur the same amount of testing cost as when you implemented your software entirely in C/C++?
Well, if your software is implemented in the Java language, you can retest all your software if you want to, but you don't have to. All you have to retest is the newly ported Java implementation. As I have mentioned previously, as a licensee of the Java platform, you will have access to the TCK suite which checks for compliance of the implementation. There are also more elaborate testing options that money can buy if you want them. By ensuring that the Java platform is behaving in a conformant manner, you can have a high degree of confidence that your applications will perform on your new device just as they did on your previous ones.
To be realistic, I would not recommend that anyone only test their port of the Java implementation. This is because at least some part of your system is likely to be implemented in C/C++ (like native sub-systems that may not be exercised by the Java platform, e.g. a native watchdog thread). These need to be tested. One way of doing that is by testing the new device with your applications, which cause interactions with these system components. Another reason is that there may be performance differences between the devices. If your Java applications were written in ways that made assumptions about the timing characteristics of the device, then these applications may behave differently on different devices. Even without these factors, it is sound practice to at least run tests on a representative cross section of your software in addition to your port of the Java implementation.
Note that by programming to the Java platform, you have the option to ensure that the platform APIs your software depends on are behaving in a consistent manner. If you program to C/C++ and the OS directly, there is no guarantee that the APIs are behaving consistently on your new device (in comparison to your previous device). If there are bugs/differences in the C/C++ compiler or the OS, the only way you will find out is when the differences manifest in your applications. Depending on how comprehensive and elaborate your regression test suite is, you may or may not detect these problems until late in the development / deployment stages when the problem becomes expensive to fix. With a port of the Java platform, the tests of the Java platform are fairly comprehensive. Hence, differences / bugs in the C/C++ compiler and OS can be detected earlier via the Java platform tests.
Therefore, with the Java platform, your testing time is at worst equivalent to the case where you develop your software entirely in C/C++. Under more ideal circumstances, you can achieve the same amount of confidence about quality with less testing time. In terms of testing options, with the Java platform, you are provided the TCK which helps you to ensure compatible behavior in your new device, and you have the option to purchase additional testing services if needed. With a C/C++ only development, you're on your own in terms of ensuring compatible behavior between the devices.
Wait a minute! Didn't I say that the Java implementation needs to be tested? Isn't that one less thing you would need to test if you went with a native only system? Well, how did you test the integrity of the OS and your C/C++ compiler runtime libraries before with pure native systems? The answer is you didn't. You license an OS and its toolchain, and you expect the OS vendor to have tested their software for compliance / compatibility of some sort. Well, the same applies to software for the Java platform. You license it from a Java technology vendor (like Sun), whom you expect to have tested the implementation for compliance and compatibility. The difference is that there is a specification which Java implementations must conform to. This specification keeps the Java implementation compatible. The same is not necessarily true of OSes. They (the OSes) can vary from version to version in incompatible ways.
How do you know that the OS port to your device is fully compatible? You don't because the OS vendor usually doesn't test the specific port to your device. That is unless you are using standard hardware that does not have any value added customizations of your own, or if you are a mega-corporation that can make the OS vendor do a test run on your specific device. In contrast, how do you know that the Java implementation port to your device is fully compatible? You know because the TCK is actually run on your device, and not just on the vendor's reference device. This should give you a lot more confidence about the quality of your implementation.
Again, in summary, the testing requirement with the Java platform is at worst the same as with native code, and at best, takes less time than with native code, and yet yields higher confidence of compatibility.
Above, when discussing the problems of fragmentation, I mentioned that fragmentation leads to increased deployment cost because the deployment system will have to keep track of the different versions of the software needed for each device variation. As a device developer, if you create the new version of your device on a different CPU architecture and/or OS, then consider the following:
If you develop your software entirely in C/C++, then you are guaranteed to incur the same added deployment cost as is the case with fragmentation. The executables are incompatible due to the devices' different CPU architectures and/or OS.
If you develop the majority of your software in the Java language, then you may not have to incur the added deployment cost. If your devices have conformant and compatible Java implementations, then there will not be a fragmentation problem, and the cost is averted.
Hence, using the Java platform allows you to reduce deployment cost whereas software implemented entirely in native code will not give you this option.
Since the Java platform is an interpreted technology, wouldn't it be slower than software implemented in a native language like C/C++? The answer is it depends on how complex the software in your device needs to be. If your software is adequately complex, then the Java platform can give you performance that is equivalent to that of C/C++. I have discussed this to some extent in a previous article here.
In general, the Java platform can give you this kind of performance because of the combination of two reasons:
If the software is adequately complex or may be subject to modular mutation over time, you will benefit from organizing the code in an object-oriented manner. Hence, even if you implement your code in C/C++, the code will incur the same sort of overhead to achieve the OO paradigm. This is the cost of getting a more manageable and mutable code base.
Just-In-Time (JIT) Compiler Technology
JIT technology can optimize the code using information from runtime profiling. This sometimes allows the JIT to produce code that out-performs even C/C++ compilers. For example, JITs can allow you to inline calls from untrusted application code all the way into trusted system code, without sacrificing the security aspects that ensure the application must have authorization to access that system code. The JIT can also do this inlining even for future applications and upgraded system modules.
For a C/C++ compiler, at best, you must stop the inlining at the crossing boundary from application code to system code (to make sure that the application code cannot bypass the needed security checks by the system). And if you solve that security problem, you must disallow updating of the system code (without having to update the application as well) if you allow parts of the system code to be inlined into the application. Otherwise, crashes or undefined failures can occur due to mismatched behavior.
I cannot say that Java code will always out-perform native C/C++ code, but it is also not true that native C/C++ will always outperform Java code.
But bear in mind that using the Java platform does impose some overhead. All those extra features that you benefit from do not come for free. In the case of performance, there are VM initialization, JIT compilation, and garbage collection costs. However, if your software is not trivial, then this cost may be amortized and not significant. On the other hand, if your application is trivial, then the added cost of the Java platform may not be worth it.
Does the Java platform use more memory than software implemented in native code like C/C++? The answer is again dependent on the complexity of your software. The Java platform does incur some footprint for the VM and library code. To give you an idea of how much memory is needed, here are the footprints of some Java platforms (based on Sun's implementations):
JavaCard (interpreter only): 10s of Ks ROM, 100s of bytes RAM
CLDC: can work in 100K RAM (interpreter only), more with JIT. Typical deployments fit in 500K to 16M RAM.
CDC: static footprint: VM = ~250K, JIT = <300K, VM + JIT + Foundation Profile class libraries = 2 to 2.5M. Typical deployments fit in 4M to 32M RAM.
Of course, the needed amount of memory depends on the size of the application and how it uses memory. Note that the Java platform doesn't have to take up a lot of memory. But of course, the smaller versions will have far fewer features.
But wait a minute, so far, I have only given you half the story. The other part is that Java code is deployed in the form of classfiles which contain bytecodes and meta-data. From a sampling of CVM's ROMized data (based on the CDC Foundation Profile libraries), I found that bytecodes take up about 30% of the total ROMized data (i.e. RAM footprint of the loaded classes). From previous observations on the footprint of JIT compiled code compared to bytecodes, we found that the footprint increases in size by a factor of about 7x to 8x on the ARM instruction set. So, let's do a quick estimate:
Loaded Java classes to bytecode ratio ~= 1 / 30% ~= 3.3x (based on ROMized data)
Native code to bytecode ratio ~= 7x (assuming it is equivalent to JIT compiled code size)
This means that Java code footprint is in general about half that of native code (3.3x / 7x ~= 0.47). Bear in mind that this is a very rough estimate, but it is a reasonable one.
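The quick estimate above can be written out as a few lines of arithmetic. This is only a sketch of the calculation, using the 30% bytecode share and the 7x to 8x expansion factor quoted above. The class and method names are made up for this example.

```java
class FootprintRatio {
    // Size of the loaded classes per byte of bytecode, given the
    // fraction of the loaded class data that is bytecode.
    static double loadedClassFactor(double bytecodeFraction) {
        return 1.0 / bytecodeFraction;
    }

    // Footprint of loaded Java classes relative to the same logic compiled
    // to native code, assuming native code is nativeExpansion times the bytecode size.
    static double javaToNativeRatio(double bytecodeFraction, double nativeExpansion) {
        return loadedClassFactor(bytecodeFraction) / nativeExpansion;
    }

    public static void main(String[] args) {
        System.out.println("ratio at 7x: " + javaToNativeRatio(0.30, 7.0));
        System.out.println("ratio at 8x: " + javaToNativeRatio(0.30, 8.0));
    }
}
```

With a 7x expansion the ratio comes out a little under one half, and with 8x it is closer to 0.42, so "about half" holds across the observed range.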
Now, note that with a Java VM's JIT, only the portion of the code (application and system code) which is hot gets compiled. "Hot" here refers to code that is in the critical path for performance, and hence should be compiled. Other code in the system is only executed occasionally, using the interpreter. Also, the JIT will keep the compiled code footprint bounded within a limit. On CVM, the default is around 512K. We have seen some fairly elaborate applications that run optimally in only 100K to ~300K of JIT compiled code footprint.
This means that, for software written in the Java language, only the portion which is hot incurs the footprint hit of being compiled to native instructions. In contrast, for software written entirely in C/C++, the entire software needs to be compiled to native and thereby incurs the footprint hit all over (even for code that is not hot).
Therefore, it is not certain whether your software written in the Java language will use more memory than C/C++. It depends on the complexity (and therefore, size) of your software in comparison with the platform code. If the software is large, then the overhead of the Java platform will be a minor cost in comparison with the footprint savings from the compactness of Java classes. In this case, you will save on footprint by choosing to develop on the Java platform. If the software is small, then the overhead of the Java platform will be a major cost, and you will save on footprint by developing in C/C++.
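To get a feel for where the crossover might lie, here is a rough break-even sketch. It is a simplified model and not a measurement. It assumes a fixed platform cost (2560K, the upper CDC static figure quoted above), a full 512K JIT code cache, loaded classes at 1/30% of the bytecode size, and native code at 7x the bytecode size. It ignores heap usage and the native runtime libraries a C/C++ system would also need, and all names are made up for this example.

```java
class FootprintBreakEven {
    // Modeled Java footprint: fixed platform cost + JIT code cache + loaded classes.
    static double javaFootprintKb(double bytecodeKb, double platformKb,
                                  double jitCacheKb, double classFactor) {
        return platformKb + jitCacheKb + classFactor * bytecodeKb;
    }

    // Modeled native footprint: everything is compiled to native instructions.
    static double nativeFootprintKb(double bytecodeKb, double nativeFactor) {
        return nativeFactor * bytecodeKb;
    }

    // Bytecode size at which both models give the same footprint.
    static double breakEvenBytecodeKb(double platformKb, double jitCacheKb,
                                      double classFactor, double nativeFactor) {
        if (nativeFactor <= classFactor) {
            throw new IllegalArgumentException("native factor must exceed class factor");
        }
        return (platformKb + jitCacheKb) / (nativeFactor - classFactor);
    }

    public static void main(String[] args) {
        double classFactor = 1.0 / 0.30; // loaded classes per byte of bytecode
        double breakEven = breakEvenBytecodeKb(2560.0, 512.0, classFactor, 7.0);
        System.out.println("break-even bytecode size (K): " + breakEven);
        System.out.println("equivalent native code size (K): " + nativeFootprintKb(breakEven, 7.0));
    }
}
```

Under these assumptions the crossover is somewhere around 840K of bytecode, which corresponds to several megabytes of native code. Below that size the native version is smaller in this model, and above it the Java version is. Your own numbers will shift the crossover, so treat this as a way to structure the estimate rather than as a result.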
What Else to Consider?
This article is getting rather long already. So, I'll stop here for now. Next time, I will talk about real-time development and low level IO programming in the Java platform, and maybe a few other things.
Till then, have a nice day. :-)