It is an accepted fact that Java does not support return-type-based method overloading. This means that a class cannot have two methods
that differ only by return type -- you can't have int
doXyz(int x) and double doXyz(int x) in the
same class. And indeed, the Java compiler duly rejects any such
attempt. But recently I discovered a way to do this, which I wish
to share with all. Along the way, we will also explore some
rudiments of Java bytecode programming.
The tools that we need for this exploration are rather simple: the JDK, a Java assembler, a text editor, and a bytecode engineering library. We will use the Jasmin [12] Java assembler and the ASM [13] bytecode manipulation framework.
Let us review the prominent features of method invocation in
the JVM. Whenever a method is invoked, a new frame is created
on the execution stack. Each frame has a local variable array and
an operand stack. When the frame is created, the operand stack is
empty and the local variable array is populated with the target
object this (in case of instance methods) and the
method's arguments. All the processing occurs on the operand stack.
The maximum number of local variables and stack slots that will be
used during the method invocation at any given moment must be known
at compile time.
To invoke a method on an object, the object reference (in the
case of instance methods) and then the method arguments in the
proper sequence must be loaded on the operand stack. The
method should then be invoked using an appropriate invoke instruction.
The JVM described here has four invoke instructions: invokevirtual,
invokestatic, invokeinterface, and
invokespecial. (Java 7 later added a fifth, invokedynamic, which is
not needed for this discussion.) The different instructions correspond
to different method types. The invokevirtual instruction is used to
invoke instance methods; invokestatic for static
methods; invokeinterface for interface methods; and
invokespecial for constructors, private methods of the
present class, and instance methods of the superclass.
To return a value, be it a primitive value or a reference, the
value must be loaded on the operand stack and the appropriate
return instruction should then be executed. The instruction return returns
void; areturn returns a reference;
dreturn, freturn, and lreturn
return a double, float, and
long respectively; and finally, ireturn is
used to return an int, a short,
a char, a byte, or a boolean.
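The invocation and return rules above can be tied back to ordinary Java source. The following is only a sketch: the class and method names are made up for illustration, and the comments name the instruction that javac typically emits for each line. Running javap -c on the compiled class is a way to check this against your own compiler.

```java
final class InvokeAndReturnKinds {

    interface Greeter {
        String greet();
    }

    static class English implements Greeter {
        English() {
            super();                      // invokespecial java/lang/Object/<init>()V
        }

        @Override
        public String greet() {
            return "hello";               // areturn: a reference is returned
        }

        @Override
        public String toString() {
            return super.toString();      // invokespecial: superclass instance method
        }
    }

    static int twice(int x) {
        return 2 * x;                     // ireturn
    }

    static boolean isEven(int x) {
        return x % 2 == 0;                // ireturn as well: a boolean is returned as an int
    }

    static double half(int x) {
        return x / 2.0;                   // dreturn
    }

    static String run() {
        English e = new English();        // invokespecial (constructor call)
        Greeter g = e;
        String viaClass = e.greet();      // invokevirtual: static type is a class
        String viaInterface = g.greet();  // invokeinterface: static type is an interface
        int n = twice(21);                // invokestatic
        return String.join("/", viaClass, viaInterface, String.valueOf(n));
    }

    public static void main(String[] args) {
        System.out.println(run());        // prints hello/hello/42
    }
}
```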
Let us now start with programming. We will create a Hello World program in Java and its equivalent Java assembly code in Jasmin. Then, by comparing the two, we can pick up the rudiments of Java bytecode programming. Here goes the code:
public class HelloWorld {
    public static void main(String[] args) {
        System.out.println("Hello World");
    }
}
Let us write the Java assembly code equivalent to the above Java program in Jasmin:
; Filename : HelloWorld.j
; The semicolon comments out this line.
.class public HelloWorld2
.super java/lang/Object
.method public <init>()V
    .limit stack 1
    aload_0
    invokespecial java/lang/Object/<init>()V
    return
.end method
.method public static main([Ljava/lang/String;)V
    .limit stack 2
    getstatic java/lang/System/out Ljava/io/PrintStream;
    ldc "Hello Bytecode World"
    ; invokevirtual and its argument must be on the same line.
    invokevirtual java/io/PrintStream/println(Ljava/lang/String;)V
    return
.end method
To assemble the HelloWorld.j file, run java -jar
jasmin.jar HelloWorld.j to get a HelloWorld2.class
file (the name comes from the .class directive, not from the source
file name). This class file can be run as usual.
Now let us go through the assembly code and compare it with the
Java code where necessary. In bytecode, we do not have the luxury
of import statements. We must specify fully qualified
class names, and in a slightly different notation: the forward slash (/) is
the package delimiter instead of the usual period (.). Hence, instead of
Object, use java/lang/Object. This
representation is called the internal form of the class
name. The .class, .super, and .end method
directives are pretty self-explanatory. Let us take a
look at the .method directive. The public
and static keywords are the access modifiers of the method.
The last token of the .method directive is the method
name and the method descriptor concatenated into one token. A
constructor is always named <init>, and in
bytecode we must always specify the constructor explicitly,
because no default constructor is generated for us.
The method descriptors deserve a special elaboration here, since our attempt to do return-type-based overloading hinges on method descriptors. Method descriptors are composed of type descriptors of parameters and return value. The type descriptors of various Java data types are listed in Table 1.1.
| Table 1.1: Type Descriptors | |
|---|---|
| Type | Type Descriptor |
| byte | B |
| char | C |
| double | D |
| float | F |
| int | I |
| long | J |
| short | S |
| boolean | Z |
| void | V |
| One array dimension | [ |
| An instance of class | L<classname>; |
The descriptors of primitive types and of void are pretty
simple. Descriptors of reference types, however, combine the rules
for classes and arrays, so Table 1.2 lists descriptors for some sample
reference types.
| Table 1.2: Reference Type Descriptors | |
|---|---|
| Type | Type Descriptor |
| String | Ljava/lang/String; |
| byte[] | [B |
| Object[] | [Ljava/lang/Object; |
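The rules behind Tables 1.1 and 1.2 are mechanical enough to write down as a small function. The sketch below is illustrative only; the class name TypeDescriptors and the method descriptorOf are made up here and are not part of any library.

```java
import java.util.HashMap;
import java.util.Map;

final class TypeDescriptors {

    // One-letter descriptors for the primitive types and void (Table 1.1).
    private static final Map<Class<?>, String> PRIMITIVES = new HashMap<>();
    static {
        PRIMITIVES.put(byte.class, "B");
        PRIMITIVES.put(char.class, "C");
        PRIMITIVES.put(double.class, "D");
        PRIMITIVES.put(float.class, "F");
        PRIMITIVES.put(int.class, "I");
        PRIMITIVES.put(long.class, "J");
        PRIMITIVES.put(short.class, "S");
        PRIMITIVES.put(boolean.class, "Z");
        PRIMITIVES.put(void.class, "V");
    }

    /** Builds the type descriptor of a Java type by applying the table rules. */
    static String descriptorOf(Class<?> type) {
        if (type.isArray()) {
            // One '[' per array dimension, then the element type.
            return "[" + descriptorOf(type.getComponentType());
        }
        String primitive = PRIMITIVES.get(type);
        if (primitive != null) {
            return primitive;
        }
        // Class instance: L, the name in internal form, then a semicolon.
        return "L" + type.getName().replace('.', '/') + ";";
    }

    public static void main(String[] args) {
        System.out.println(descriptorOf(String.class));    // Ljava/lang/String;
        System.out.println(descriptorOf(byte[].class));    // [B
        System.out.println(descriptorOf(Object[].class));  // [Ljava/lang/Object;
    }
}
```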
To form a method descriptor, the type descriptors of the parameters are concatenated without any spaces inside a pair of parentheses, followed by the type descriptor of the return type. In a class file, the combination of method name and method descriptor must be unique for every method. Table 1.3 lists some sample method descriptors.
| Table 1.3: Method Descriptors | |
|---|---|
| Method | Method Descriptor |
| void method() | ()V |
| byte[][] method() | ()[[B |
| String method(double x) | (D)Ljava/lang/String; |
| void method(int a, byte b, String[] s) | (IB[Ljava/lang/String;)V |
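The rows of Table 1.3 can be reproduced from Java itself. On Java 7 and later, java.lang.invoke.MethodType can render a descriptor string from a return type and parameter types. The wrapper class MethodDescriptors and its descriptor helper below are illustrative names, not part of the JDK.

```java
import java.lang.invoke.MethodType;

final class MethodDescriptors {

    /** Renders the JVM method descriptor for the given return and parameter types. */
    static String descriptor(Class<?> returnType, Class<?>... parameterTypes) {
        return MethodType.methodType(returnType, parameterTypes).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        System.out.println(descriptor(void.class));                  // ()V
        System.out.println(descriptor(byte[][].class));              // ()[[B
        System.out.println(descriptor(String.class, double.class));  // (D)Ljava/lang/String;
        System.out.println(
                descriptor(void.class, int.class, byte.class, String[].class));
        // (IB[Ljava/lang/String;)V
    }
}
```

Note that the method name is not part of the descriptor; the class file stores the name separately.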
Coming back to the assembly code, inside the
<init> method, the .limit stack 1
directive declares that the maximum number of stack slots used in
this method at any given time is 1. aload_0 loads the
value at index 0 in local variable array onto the operand stack.
This value is nothing but the reference to the target object
this. On this reference, the
invokespecial instruction invokes the no-argument
constructor of the Object class, the superclass of the
present class. And finally, the return instruction
returns from the constructor.
Now let us see what's new inside the main method.
The getstatic instruction is used to load a static
field of a class onto the operand stack. For this, the field name
in internal form and the type descriptor of the field must be
specified. Here, it loads the out static variable of
the java.lang.System class, the type descriptor of
out being Ljava/io/PrintStream;. The
ldc instruction is used to load constants onto the
operand stack. Here, it loads the reference of the "Hello
Bytecode World" string object. And finally, the
invokevirtual instruction invokes the
println method on the out object. Note
that while invoking a method, a full method descriptor must be
specified. This sequence of three instructions stands for the
System.out.println("Hello Bytecode World"); statement
in Java.
As noted in the last section, a full method descriptor must be specified when invoking a method, and within a class file each method must have a unique combination of name and descriptor. So why can't we overload a method based on return type? In Java source, we call a method by its name and arguments, not by its return type or method descriptor. The return type plays no part in deciding which overloaded method is called; in fact, there is no syntactic need to do anything with the return value at all. If return-type-based overloading were allowed, the compiler would have no way to tell which method we mean to call.
Bytecode has no such limitation. Because the method descriptor includes the return type, it can distinguish two methods even when their names and parameters are the same. To achieve our objective, we must bypass the Java compiler and use an assembler instead. Let us see how to do this.
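Before turning to the assembler, it may help to see that a class file really can hold two methods that differ only in return type. One case that needs no extra tools is a covariant override: javac then adds a synthetic bridge method with the superclass's return type next to the real one. The sketch below lists both through reflection; Base, Derived, and BridgeDemo are made-up names.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class BridgeDemo {

    static class Base {
        Object get() {
            return "base";
        }
    }

    static class Derived extends Base {
        @Override
        String get() {              // covariant return type: String instead of Object
            return "derived";
        }
    }

    /** Lists the return type of every method named get that Derived itself declares. */
    static List<String> declaredGetters() {
        List<String> found = new ArrayList<>();
        for (Method m : Derived.class.getDeclaredMethods()) {
            if (m.getName().equals("get")) {
                found.add(m.getReturnType().getName() + (m.isBridge() ? " (bridge)" : ""));
            }
        }
        Collections.sort(found);    // reflection returns methods in no particular order
        return found;
    }

    public static void main(String[] args) {
        // prints [java.lang.Object (bridge), java.lang.String]
        System.out.println(declaredGetters());
    }
}
```

Both entries have the name get and an empty parameter list; only the return type in the descriptor tells them apart.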
Following is the assembly code for a class named
Overloaded, containing two instance methods: void returnDifferent() and String
returnDifferent(). The void returnDifferent()
prints Returning Void and returns nothing, whereas the
String returnDifferent() does nothing and returns a
String -- a hardcoded value of Returning
String.
;Overloaded.j
.class public Overloaded
.super java/lang/Object
.method public <init>()V
    .limit stack 1
    aload_0
    invokespecial java/lang/Object/<init>()V
    return
.end method
.method public returnDifferent()V
    .limit stack 2
    getstatic java/lang/System/out Ljava/io/PrintStream;
    ldc "Returning Void"
    invokevirtual java/io/PrintStream/println(Ljava/lang/String;)V
    return
.end method
.method public returnDifferent()Ljava/lang/String;
    .limit stack 1
    ldc "Returning String"
    areturn ; returns a reference
.end method
We have a class file that supposedly contains two methods overloaded on the basis of return type. But how do we verify it? How do we call those methods?
The Java class file format contains a methods table. Each value
in this table is a structure containing a complete description of a
method in the class or interface. In the case of
Overloaded.class, there will be two methods named
returnDifferent. So, if we were to use a statement like
returnDifferent(); in Java code, the Java compiler
would look up the table and encode a call to the first method
having the required name and parameters. We would end up with a call
to one specific method, always. My experience is that it is always
the first method in the assembly code that gets called. Are we
stuck with methods that we cannot use? Fortunately, reflection
comes to our rescue here. The following code invokes these methods
using reflection.
import java.lang.reflect.Method;
public class CallOverloadedMethods {
    public static void main(String[] args) throws Exception {
        Overloaded oc = new Overloaded();
        Class<?> c = Overloaded.class;
        // getDeclaredMethods() returns both returnDifferent methods,
        // in no guaranteed order.
        for (Method m : c.getDeclaredMethods()) {
            if (!m.getName().equals("returnDifferent")) {
                continue;
            }
            // Tell the two overloads apart by return type alone.
            if (m.getReturnType() == void.class) {
                m.invoke(oc);
            } else if (m.getReturnType() == String.class) {
                System.out.println(m.invoke(oc));
            }
        }
    }
}
This code iterates over all the declared methods of the
Overloaded class and looks for methods named
returnDifferent. It assumes that all the
returnDifferent methods have empty parameter lists. It
only checks each of the returnDifferent method's
return type and then uses the method in an appropriate way.
Compile this class and run. Voila. It runs perfectly, giving the expected output. We have implemented return-type-based method overloading in Java.
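For readers without Jasmin at hand, roughly the same experiment can be reproduced with the JDK alone. The sketch below is not the Overloaded class from this article. It writes the bytes of a tiny class by hand, loads it, and calls its methods through reflection. The class name Generated, its static methods int value() and long value(), and the helper names are all assumptions made for this illustration. Class file version 52 (Java 8) is chosen arbitrarily.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

final class HandMadeOverload {

    /** Exposes the protected defineClass so raw bytes can be turned into a class. */
    private static final class Loader extends ClassLoader {
        Class<?> define(String name, byte[] bytes) {
            return defineClass(name, bytes, 0, bytes.length);
        }
    }

    private static void utf8(DataOutputStream out, String text) throws IOException {
        out.writeByte(1);                 // CONSTANT_Utf8 tag
        out.writeUTF(text);               // u2 length followed by the bytes
    }

    private static void classRef(DataOutputStream out, int nameIndex) throws IOException {
        out.writeByte(7);                 // CONSTANT_Class tag
        out.writeShort(nameIndex);
    }

    /** Writes a public static method named "value" (constant #5) with the given body. */
    private static void method(DataOutputStream out, int descriptorIndex,
                               int maxStack, int... code) throws IOException {
        out.writeShort(0x0009);           // ACC_PUBLIC | ACC_STATIC
        out.writeShort(5);                // name_index
        out.writeShort(descriptorIndex);
        out.writeShort(1);                // one attribute: Code
        out.writeShort(8);                // attribute name "Code" (constant #8)
        out.writeInt(12 + code.length);   // attribute_length
        out.writeShort(maxStack);
        out.writeShort(0);                // max_locals
        out.writeInt(code.length);
        for (int b : code) {
            out.writeByte(b);
        }
        out.writeShort(0);                // exception_table_length
        out.writeShort(0);                // attributes of the Code attribute
    }

    static byte[] classBytes() {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(0xCAFEBABE);     // magic
            out.writeShort(0);            // minor_version
            out.writeShort(52);           // major_version
            out.writeShort(9);            // constant_pool_count = entries + 1
            utf8(out, "Generated");          // #1
            classRef(out, 1);                // #2
            utf8(out, "java/lang/Object");   // #3
            classRef(out, 3);                // #4
            utf8(out, "value");              // #5
            utf8(out, "()I");                // #6
            utf8(out, "()J");                // #7
            utf8(out, "Code");               // #8
            out.writeShort(0x0021);       // ACC_PUBLIC | ACC_SUPER
            out.writeShort(2);            // this_class
            out.writeShort(4);            // super_class
            out.writeShort(0);            // interfaces_count
            out.writeShort(0);            // fields_count
            out.writeShort(2);            // methods_count
            method(out, 6, 1, 0x04, 0xAC);   // int value():  iconst_1, ireturn
            method(out, 7, 2, 0x0A, 0xAD);   // long value(): lconst_1, lreturn
            out.writeShort(0);            // class attributes_count
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** Invokes every declared method and maps its return type name to the result. */
    static Map<String, Object> invokeAll() {
        Class<?> generated = new Loader().define("Generated", classBytes());
        Map<String, Object> results = new TreeMap<>();
        try {
            for (Method m : generated.getDeclaredMethods()) {
                results.put(m.getReturnType().getName(), m.invoke(null));
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(invokeAll());
    }
}
```

The static methods keep the sketch short by avoiding a constructor. For anything larger than this, an assembler or a bytecode library is far less error-prone than counting bytes by hand.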
Although we have been able to pull this off, you may be wondering if it is of any practical use. After all, we cannot call those overloaded methods without resorting to reflection. So, what is the value?
It turns out that there is a useful application. Suppose that a class is required to implement two interfaces that have methods with identical names and argument lists, differing only in return type. Using normal Java code, we cannot have a class that implements both the interfaces. But using the technique described above, we can have such a class. Moreover, we do not need reflection to use those methods. This is because when we call a method on an interface reference, the relevant method's descriptor, as specified in the interface class file, is automatically used to make the call. Let us see a concrete example. Consider two interfaces as follows:
interface Interface1 {
    void doSomething();
}
interface Interface2 {
    String doSomething();
}
To implement these two interfaces, we can write assembly code as follows:
;ImplementBoth.j
.class public ImplementBoth
.super java/lang/Object
.implements Interface1
.implements Interface2
.method public <init>()V
    .limit stack 1
    aload_0
    invokespecial java/lang/Object/<init>()V
    return
.end method
.method public doSomething()Ljava/lang/String;
    .limit stack 1
    ldc "Hello from STRING"
    areturn
.end method
.method public doSomething()V
    .limit stack 2
    getstatic java/lang/System/out Ljava/io/PrintStream;
    ldc "Hello from VOID"
    invokevirtual java/io/PrintStream/println(Ljava/lang/String;)V
    return
.end method
Now we can access both of these methods using normal Java code, as demonstrated below:
public class UsingImplementBoth {
    public static void main(String[] args) {
        ImplementBoth ib = new ImplementBoth();
        ((Interface1)ib).doSomething();
        System.out.println(((Interface2)ib).doSomething());
    }
}
Now this technique has proved to be useful. But we still need to write Java assembly to implement it, which is quite troublesome, especially with complex logic. Can we find some way to use this technique and still code in Java rather than assembly? Certainly. Bytecode engineering tools allow us to add or remove a method or field, or to change the class attributes of a compiled Java class. So we can have our class implement one of the interfaces in ordinary Java and put the code for the other interface's method in a differently named method. Our Java code would then look like this:
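public class BetterTechnique implements Interface1 {
    // Interface methods are implicitly public, so the implementation must be too.
    public void doSomething() {
        // -- Complex code --
        return;
    }
    // Delegate for the String doSomething() method that is added after compilation.
    String delegatedDoSomething() {
        // -- Complex code --
        String someStringValue = "Hello from STRING"; // placeholder result
        return someStringValue;
    }
}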
Now we compile this code to get the BetterTechnique.class
file. Then, using bytecode engineering tools, we mark this class as
implementing the Interface2 interface and add the
method String doSomething(). This method must be written at the
bytecode level, but it only needs to call
delegatedDoSomething() and return the
result, so it is not a big deal. See the Resources [11] section for the sample code, which
includes all the necessary files for this example in the
EnhancedTechnique folder. For step-by-step instructions
about compilation and modification, please go through the
ReadMe.txt file.
To simplify the use of this technique, it is possible to develop an annotation for a class to indicate which method is to be overloaded, which is a delegate method, and which additional interface should be implemented by the class. Then a tool can inspect the class file reflectively, and if it encounters the said annotation, it can automatically transform the class file accordingly. An interested reader may explore these possibilities.
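One possible shape for such an annotation is sketched below. Everything in it is hypothetical: the DelegateFor annotation, its method and fromInterface elements, and the plan helper are invented for this illustration. The sketch only reports what a rewriting tool would have to do; it does not modify any class file.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

final class OverloadAnnotationSketch {

    /** Hypothetical marker: the annotated method is the delegate for an overload to add. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface DelegateFor {
        String method();            // name of the overloaded method to generate
        Class<?> fromInterface();   // additional interface the class should implement
    }

    interface Interface1 {
        void doSomething();
    }

    interface Interface2 {
        String doSomething();
    }

    static class BetterTechnique implements Interface1 {
        @Override
        public void doSomething() {
        }

        @DelegateFor(method = "doSomething", fromInterface = Interface2.class)
        String delegatedDoSomething() {
            return "Hello from STRING";
        }
    }

    /** Describes the changes a class-file rewriting tool would need to apply. */
    static List<String> plan(Class<?> type) {
        List<String> steps = new ArrayList<>();
        for (Method m : type.getDeclaredMethods()) {
            DelegateFor d = m.getAnnotation(DelegateFor.class);
            if (d != null) {
                steps.add("implement " + d.fromInterface().getSimpleName()
                        + "; add " + m.getReturnType().getSimpleName() + " " + d.method()
                        + "() calling " + m.getName() + "()");
            }
        }
        return steps;
    }

    public static void main(String[] args) {
        // prints [implement Interface2; add String doSomething() calling delegatedDoSomething()]
        System.out.println(plan(BetterTechnique.class));
    }
}
```

A real tool would feed each reported step to a bytecode library instead of printing it.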
We have demonstrated that it is possible to overload Java methods based solely on their return types, at least at the class file level. Whether this little-known capability is a deliberate choice or an accident is something the more knowledgeable people here may be able to clarify. I guess the JVM may make some internal use of it, or else why would it be allowed? Anyway, let us look forward to interesting and enlightening discussions.
Links:
[1] http://www.java.net/author/vinit-joglekar
[2] http://www.java.net/article/2008/07/29/return-type-based-method-overloading-java
[3] http://www.java.net/article/2008/07/29/return-type-based-method-overloading-java#required-tools
[4] http://www.java.net/article/2008/07/29/return-type-based-method-overloading-java#basics-method-invokation
[5] http://www.java.net/article/2008/07/29/return-type-based-method-overloading-java#basics-bytecode-prog
[6] http://www.java.net/article/2008/07/29/return-type-based-method-overloading-java#implementing
[7] http://www.java.net/article/2008/07/29/return-type-based-method-overloading-java#how-to-invoke
[8] http://www.java.net/article/2008/07/29/return-type-based-method-overloading-java#is-this-useful
[9] http://www.java.net/article/2008/07/29/return-type-based-method-overloading-java#enhancements
[10] http://www.java.net/article/2008/07/29/return-type-based-method-overloading-java#conclusion
[11] http://www.java.net/article/2008/07/29/return-type-based-method-overloading-java#resources
[12] http://jasmin.sourceforge.net/
[13] http://asm.objectweb.org/
[14] http://www.java.net/today/2008/07/31/sources.zip
[15] http://java.sun.com/docs/books/jvms/second_edition/html/VMSpecTOC.doc.html
[16] http://www.cybergrain.com/tech/jvmref/
[17] http://download.forge.objectweb.org/asm/asm-guide.pdf