Unify class customization and construction

14 replies [Last post]
cowwoc
Offline
Joined: 2003-08-24

Hi,

I'd like to propose an alternate syntax for Generics. This is motivated by my belief that although Generics is an important and useful paradigm, its syntax (derived from C++) is really poor and poses a serious readability problem.

Here is my short proposal. Please try to improve upon it if you find problems and/or let me know what you think about it.

My understanding is that:

- templates and generics are meant to customize classes whereas constructors are meant to customize objects.

My feeling is that:

- templates and generics' syntax is a really poor approach to customize classes and a cleaner approach should be invented.

PROPOSAL:

- Class customization should occur within the normal class constructor.
- Implication: many class customizations may exist (support for polymorphism)
- When creating a new object the new syntax is:

ArrayList myList = new ArrayList(int.class);

Notice there are no special template arguments. That is because it is up to the constructor to decide what values to assign to the parameterization arguments.

Example code:

[code]
class ArrayList
{
  private ClassParameter T;
  private T[] value;

  public ArrayList(int size, Class type)
  {
    T = type;

    this.value = new T[size];
  }

  public ArrayList(int[] initialValues)
  {
    value = initialValues;
    T = int.class;
  }

  T[] toArray();
};
[/code]
Notice how in the second constructor (taking initialValues as an argument) we automatically know that the ArrayList contains ints.

The compiler would be responsible for scanning the code, finding all ClassParameters, and ensuring that every constructor sets their values; otherwise it would report a compile error. I know this is possible because javac already does this for final fields that are not assigned a value within the constructors.
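
The idea of a constructor-assigned type parameter can be roughly approximated in Java as it stands today by passing a Class token. The following is a minimal sketch, not the proposed syntax: the class name TypedArray, its members, and the use of java.lang.reflect.Array are assumptions made for illustration. Because elementType is final, javac rejects any constructor that fails to assign it, which is the definite-assignment check mentioned above.

```java
import java.lang.reflect.Array;

// Hypothetical approximation of the proposal using a runtime Class token.
final class TypedArray {
    // Plays the role of the proposed "ClassParameter T". Being final,
    // javac reports an error if any constructor leaves it unassigned.
    private final Class<?> elementType;
    // An array whose component type is elementType.
    private final Object values;

    // The caller chooses the element type explicitly.
    TypedArray(int size, Class<?> type) {
        this.elementType = type;
        this.values = Array.newInstance(type, size);
    }

    // The element type is implied by the argument: it must be int.
    TypedArray(int[] initialValues) {
        this.elementType = int.class;
        this.values = initialValues.clone();
    }

    Class<?> elementType() { return elementType; }

    int length() { return Array.getLength(values); }

    // Primitive elements come back boxed (an int is returned as an Integer).
    Object get(int index) { return Array.get(values, index); }
}
```

Note that this version only knows the element type at runtime, so the compiler cannot check element operations the way the proposal intends.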

Benefits:

- More readable (as long as not abused)
- Doesn't require any bytecode changes (type erasure is still possible)
- More flexible class construction (polymorphic constructors for assigning parameter types, support for default values or complex expressions for deciding their value, etc)

Disadvantages:

- Syntax is unfamiliar to C++ developers.

I look forward to your feedback.

Thank you,
Gili

tackline
Offline
Joined: 2003-06-19

I have found it useful to have a Map keyed on a primitive. Often a wire protocol, database or similar specifies integer identifiers. In order to associate model ("reference") objects with these identifiers, a Map fits the bill.

More broadly, anywhere you now have int[], a List<int> would be a more flexible, true collection.

cowwoc
Offline
Joined: 2003-08-24

> What would be your equivalent to:
> [code]
> ArrayList<int> intList = getIntsFromSomewhere();
> [/code]
> How do you specify a variable to only accept a
> specialized generic type?

First I should warn you I'm answering this question from a developer's point of view. I have no idea how hard/easy this would be to implement from a compiler's point of view...

To me, there is absolutely no need to limit reference assignment based on the object type. Ultimately, all we care about is that the compiler gives us compile-time errors if we try to apply operations to an object that cannot support them. For example, if you have an ArrayList<int> (using the current Generics syntax) and you try invoking methods on its elements, then obviously we would want an error saying that "int" is not an object and does not support any methods. Likewise, if we have an ArrayList<File> and we try invoking charAt() on its elements, the compiler should complain that "File.charAt() does not exist" or something similar.

So here is what I am proposing:

Reference assignments are always legal. Under the hood the compiler keeps track of the types of the objects (this is available at compile-time but not runtime) and then for any operation you try invoking on the elements it'll verify whether such an operation is permissible. If not, it'll throw an error. This means that I can do this legally:

[code]
ArrayList values = new ArrayList();
if (someCondition)
values = getArrayListOfTypeFloat();
System.out.println(values.get(0).toString());
[/code]

Why? Because both Integer and Float support toString() hence the operation is legal.

In such cases where there is insufficient information for the compiler to figure out the ClassParameter value, it should collect the union of all possible types (i.e. it could be Integer, Float, String, etc) based upon all logic branches that could execute. Then, for every operation invoked on the elements of the ArrayList, it would verify whether this operation exists for all possible types. If not, it'll throw an error.

This is bound to cause some annoyance when the compiler isn't smart enough to figure out that, at a certain point in the code, the ArrayList can only contain Floats (it might only be able to conclude that it contains Floats or Integers). Still, it's a small step in the right direction, and if anything we're giving users more flexibility, not less, so they have nothing to complain about. The workaround is that users should either use easy-to-decipher logic branches (so the compiler can pick up the type) or pretend their function returns the most-common-superclass of all possible return types.
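
For comparison, generics as shipped in Java express the "most-common-superclass" workaround with a bounded wildcard. The sketch below uses hypothetical class and method names: a variable typed List<? extends Number> may refer to a list of Integer or a list of Float, and only operations defined on Number (or Object) are permitted on its elements.

```java
import java.util.Arrays;
import java.util.List;

final class CommonSupertypeDemo {
    // Hypothetical sources of differently parameterized lists.
    static List<Integer> integers() { return Arrays.asList(1, 2, 3); }
    static List<Float> floats() { return Arrays.asList(1.5f, 2.5f); }

    // The declared type is the common upper bound of both branches.
    static List<? extends Number> pick(boolean wantFloats) {
        List<? extends Number> values = integers();
        if (wantFloats) {
            values = floats();
        }
        return values;
    }

    // toString() and doubleValue() exist on every Number, so both calls
    // compile; a call that only Integer supports would be rejected.
    static String describeFirst(boolean wantFloats) {
        Number first = pick(wantFloats).get(0);
        return first.toString() + " -> " + first.doubleValue();
    }
}
```

Here the programmer writes the upper bound by hand, whereas the proposal above has the compiler compute the set of possible types from the branches.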

Anyway... I'm just throwing these ideas out there. I fully expect there might be some problems with what I said, but I encourage you to feel them out and perhaps improve on them to fix those problems.

My personal belief is that the ultimate answer to this question and others is Design by Contract. As soon as Sun implements this paradigm a lot of these questions will go away.

Gili

terkans
Offline
Joined: 2004-09-17

Your proposal has some merit, but it doesn't go far enough. What you get when you carry your logic through is a system of Type Inference. This system does away with (almost) all type declarations for variables, parameters, and return types. The compiler can deduce all these types from its knowledge of the basic types of the language. If you want more information, look at http://c2.com/cgi/wiki?TypeInference .

This is a very powerful system which can eliminate all type errors at compile time (without requiring casts either). However, it has not often been applied to stateful object-oriented languages. It is commonly used in functional languages such as ML, Haskell and others.

While I would be very interested in seeing such a system implemented, it is a very radical departure from what Java currently is, and I would not expect it to be added to the language. Certainly not soon (inside 5 years) and probably not ever.

> [...] The workaround
> for the above problem is that users should either use
> easy-to-decipher logic branches (so the compiler will
> be able to pick up the type) or pretend their
> function is returning the most-common-superclass of
> all possible return types.

I strongly think that any proposal that asks users to limit their logic to make the compiler's job easier would (and should) be shot down in this day and age. It would have been a defensible and reasonable position several decades ago, when compilers (due to hardware limitations) took hours to build a program. This limitation had already gone away when I started using computers about 20 years ago.

> My personal belief is that the ultimate answer to
> this question and others is Design by Contract. As
> soon as Sun implements this paradigm a lot of these
> questions will go away.

No, this is not related to Design By Contract (DBC). DBC covers much the same territory as unit testing does. It documents (and enforces) what goes into methods, what comes out, and the correct state of objects.

Like missing (or empty) unit tests, even requiring DBC clauses can guarantee nothing. Setting the DBC clauses to {precondition=true; postcondition=true; invariant=true} will allow anything to happen in a method, and ensure nothing.
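
To make the point concrete, here is a sketch with contract clauses written as plain runtime checks. Java has no built-in DBC, so the require and ensure helpers and the class name are hypothetical. The first method carries meaningful clauses; the second satisfies a "clauses are mandatory" rule while guaranteeing nothing.

```java
final class ContractDemo {
    // Hypothetical helpers standing in for DBC clauses.
    static void require(boolean precondition) {
        if (!precondition) {
            throw new IllegalArgumentException("precondition violated");
        }
    }

    static void ensure(boolean postcondition) {
        if (!postcondition) {
            throw new IllegalStateException("postcondition violated");
        }
    }

    // Meaningful contract: the input is non-negative and the result r
    // satisfies r*r <= n < (r+1)*(r+1).
    static int integerSqrt(int n) {
        require(n >= 0);
        int r = (int) Math.sqrt(n);
        ensure((long) r * r <= n && (long) (r + 1) * (r + 1) > n);
        return r;
    }

    // Vacuous contract: every clause is "true", so a plainly wrong
    // result passes unchallenged.
    static int vacuousSqrt(int n) {
        require(true);
        int r = -1;
        ensure(true);
        return r;
    }
}
```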

> Gili

Maarten

forax
Offline
Joined: 2004-10-07

> > > - Doesn't require any bytecode changes (type erasure
> > > is still possible)
> >
> > Unfortunately, this is not completely true. Erasure
> > removes the type information completely, you suggest
> > storing it in a private member field.
>
> The compiler can erase those member fields once it
> is done with them. They simply serve as
> placeholders for allowing users to parameterize the
> class. When/if Java retains parameters at runtime
> (i.e. no need for erasure anymore) they can be kept
> around.

Why not let the developer choose whether he wants
parameter types at runtime or not?

@AtRuntime
class LinkedList<E> {
  public LinkedList() {
  }
  static class Entry {

  }
}

is erased by the compiler to:
class LinkedList {
  public LinkedList(Class E) {
    this.E=E;
  }

  private final Class E;
  static class Entry {
  }
}

With this, you can choose whether you want:
- backward binary compatibility but without
type information
- backward compatibility with type information

Rémi Forax

cowwoc
Offline
Joined: 2003-08-24

I'm sorry but I don't understand your example in the previous post. I understand that you want to be able to specify whether type-erasure occurs or not ... fine, I agree. Although in my opinion you should specify this as a javac command-line argument and it should affect all files being compiled, as opposed to specifying it in the source-code itself.

The reality is that right now Sun isn't supporting class parameters at runtime, so I am trying to show that my proposal will work whether or not type-erasure is applied.

Gili

forax
Offline
Joined: 2004-10-07

> I'm sorry but I don't understand your example in the
> previous post. I understand that you want to be able
> to specify whether type-erasure occurs or not ...
> fine, I agree. Although in my opinion you should
> specify this as a javac command-line argument and it
> should affect all files being compiled, as opposed to
> specifying it in the source-code itself.

As a programmer, I want to choose, for a specific
parametric type, whether I can obtain its type arguments
at runtime or not.
In any case, type erasure occurs, but if I want
the parameter types at runtime, the compiler adds
a new field that holds the runtime type.

@AtRuntime
class Pair<A,B> {
  public Pair(A first,B second) {
    this.first=first;
    this.second=second;
  }
  private final A first;
  private final B second;

  public static void main(String[] args) {
    Pair<String,Integer> pair=new Pair<String,Integer>("a",3);
    Pair<Integer,Integer> p2=(Pair<Integer,Integer>)pair;
  }
}

The second assignment is a SAFE cast because pair
contains two fields (A$type and B$type) that
contain String.class and Integer.class respectively.

The Java code corresponding to the generated byte-code is:

@AtRuntime
class Pair { // A and B are erased

  // marks constructor as synthesized
  public Pair(Class A$type,Class B$type,
              Object first,Object second) {

    this.A$type=A$type; // synthesized by the compiler
    this.B$type=B$type; // synthesized
    this.first=first;
    this.second=second;
  }

  // perhaps add a Pair(Object,Object) for compatibility
  public Pair(Object first,Object second) {
    this(Object.class,Object.class,first,second);
  }

  private final Object first;  // erased from A
  private final Object second; // erased from B
  private final Class A$type; // synthesized
  private final Class B$type; // synthesized

  // marks method as synthesized
  public void $safeCast(Class t1,Class t2) {

    if (t1!=A$type|| t2!=B$type)
      throw new ClassCastException();
  }

  public static void main(String[] args) {
    Pair pair=new Pair(String.class,Integer.class,"a",3);

    pair.$safeCast(Integer.class,Integer.class);
    Pair p2=pair;
  }
}
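
The same idea can be written by hand in current Java. This is only a sketch of what such a compiler might emit: the names TokenPair, aType, bType and checkedCast are made up here, and the tokens are passed explicitly by the caller instead of being synthesized.

```java
// Hand-written stand-in for a class whose type arguments survive at runtime.
final class TokenPair<A, B> {
    private final Class<A> aType; // plays the role of A$type
    private final Class<B> bType; // plays the role of B$type
    private final A first;
    private final B second;

    TokenPair(Class<A> aType, Class<B> bType, A first, B second) {
        this.aType = aType;
        this.bType = bType;
        this.first = first;
        this.second = second;
    }

    A first() { return first; }

    B second() { return second; }

    // Runtime-checked cast: succeeds only if the requested type arguments
    // equal the stored tokens, otherwise throws ClassCastException.
    @SuppressWarnings("unchecked")
    <X, Y> TokenPair<X, Y> checkedCast(Class<X> x, Class<Y> y) {
        if (!x.equals(aType) || !y.equals(bType)) {
            throw new ClassCastException(
                "TokenPair<" + aType.getName() + "," + bType.getName()
                + "> is not TokenPair<" + x.getName() + "," + y.getName() + ">");
        }
        return (TokenPair<X, Y>) this;
    }
}
```

With this sketch, casting a TokenPair holding String.class and Integer.class to the (Integer, Integer) parameterization fails at the cast itself rather than later, when an element is read.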

>
...
>
> Gili

Rémi Forax

terkans
Offline
Joined: 2004-09-17

What would be your equivalent to:
[code]
ArrayList<int> intList = getIntsFromSomewhere();
[/code]
How do you specify a variable to only accept a specialized generic type?

Regards,

Maarten

tsinger
Offline
Joined: 2003-06-10

Just curious, for what do you need Collections of primitive types? What do you store in them?

I ask, because we never needed them in our 6 years of Java development, instead we put real-objects (not primitive wrappers!) in collections.

Tom

terkans
Offline
Joined: 2004-09-17

> Just curious, for what do you need Collections of
> primitive types? What do you store in them?
>
> Tom

I usually don't. I just used the same type as in the first message.

Maarten

peterahe
Offline
Joined: 2004-11-22

> - Doesn't require any bytecode changes (type erasure
> is still possible)

Unfortunately, this is not completely true. Erasure
removes the type information completely, whereas you
suggest storing it in a private member field.

Furthermore, how do you specify that you need a particular
kind of List?

cowwoc
Offline
Joined: 2003-08-24

> > - Doesn't require any bytecode changes (type erasure
> > is still possible)
>
> Unfortunately, this is not completely true. Erasure
> removes the type information completely, you suggest
> storing it in a private member field.

The compiler can erase those member fields once it is done with them. They simply serve as placeholders for allowing users to parameterize the class. When/if Java retains parameters at runtime (i.e. no need for erasure anymore) they can be kept around.

> Furthermore, how do you specify that you need a
> particular
> kind of List?

Either the type of the list is obvious from the constructor arguments or you'd add new constructors that allow the user to specify the List type. In my example, I defined ArrayList(Class type) which allowed the user to define the kind of ArrayList he was interested in.

Gili

forax
Offline
Joined: 2004-10-07

...
> > Furthermore, how do you specify that you need a
> > particular
> > kind of List?
>
> Either the type of the list is obvious from the
> constructor arguments or you'd add new
> constructors that allow the user to specify the List
> type. In my example, I defined ArrayList(Class type)
> which allowed the user to define the kind of
> ArrayList he was interested in.

So you have the information at runtime, but the
compiler can't verify typing at compile time?

>
> Gili

Rémi Forax

cowwoc
Offline
Joined: 2003-08-24

> so you have the information at runtime but the
> compiler can't verify typing at compile time ??

No, what I meant is this:

if the original code reads:

[code]
class ArrayList
{
ClassParameter T;
T[] contents;

ArrayList(int[] values)
{
T = int.class;
}
}
[/code]

then if the compiler decides to apply type-erasure, it'll translate it into this at compile-time:

[code]
class ArrayList
{
int[] contents;

ArrayList(int[] values)
{
}
}
[/code]

or

[code]
class ArrayList
{
Object[] contents; /* compiler will add casts to Integer[] automatically */

ArrayList(int[] values)
{
}
}
[/code]

so as you can see type-erasure is possible and in the future if we support class parameters at runtime, we can just leave them in.
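
For comparison, this is how erasure behaves in generics as implemented in Java 5 and later: the type argument exists only at compile time, the compiler inserts the casts, and every parameterization shares one runtime class. The class and method names below are illustrative choices.

```java
import java.util.ArrayList;
import java.util.List;

final class ErasureDemo {
    // Both lists are plain ArrayList objects once the program runs.
    static boolean sameRuntimeClass() {
        List<String> names = new ArrayList<String>();
        List<Integer> numbers = new ArrayList<Integer>();
        return names.getClass() == numbers.getClass();
    }

    // The backing storage holds Objects; the compiler inserts a
    // (String) cast on the value returned by get(0).
    static int firstLength() {
        List<String> names = new ArrayList<String>();
        names.add("erasure");
        String first = names.get(0);
        return first.length();
    }
}
```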

Gili

peterahe
Offline
Joined: 2004-11-22

This looks very similar to virtual types in BETA:

http://www.daimi.au.dk/~beta/

Also see this treatment of how to relate virtual types
to parameterized types: Kresten Krab Thorup and
Mads Torgersen "Unifying Genericity - Combining the
Benefits of Virtual Types and Parameterized Classes".

http://portal.acm.org/citation.cfm?id=679846

This led to this idea: Atsushi Igarashi and Mirko Viroli
"On Variance-Based Subtyping for Parametric Types".

http://portal.acm.org/citation.cfm?id=680032

This was then refined into wildcards.

Message was edited by: peterahe