Skip to main content

Make Collection and arrays more compatible

5 replies [Last post]
monika_krug
Offline
Joined: 2004-10-14

and more easily to use together.

Add the following methods to Collection:
addAll(<? extends E>[] array)
containsAll(<? extends E>[] array)
removeAll(<? extends E>[] array)
retainAll(<? extends E>[] array)
(Or maybe better to a new interface, so that existing code won't be broken, and have the library classes that implement Collection implement the new interface).

And if this is possible (I never understood why it is not working this way and why one has to supply an array as an argument):
change Object[] toArray() to E[] toArray() and make it work like T[] toArray(T[] a)

Monika.

Reply viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.
opinali
Offline
Joined: 2003-06-17

addAll(Arrays.asList(array)) forces creation a new collection to hold the original collections's elements temporarily. This is insane, and if the array is very big, it's a big "spike" in heap usage without any need. My vote goes to the new overloads accepting arrays directly. Lacking such methods, I often have to use utility methods like "addAll (Collection, Object[])" but the performance is also not ideal because elements must be added one by one.

With the new methods, the argument's elements could be copied to the internal data structures with the most efficient strategy possible -- when the internal data uses a simple array, like ArrayList, that would mean a System.arraycopy(). Right now, addAll() cannot even hope to approach that performance, whatever the types of the argument and receiver collections.

mthornton
Offline
Joined: 2003-06-10

Evidently you haven't checked to see what Arrays.asList actually does. The amount of additional memory is NOT dependent on the size of the array. The new List returned is of fixed size and uses the array you supplied for its content --- it does not copy that array at all, just keeps a reference to it.

opinali
Offline
Joined: 2003-06-17

Thanks for pointing this! I think I never cared to read the docs for such a simple API :) and just assumed that the new list wouldn't share the argument array for its private storage, because this pattern is very rare in the Java APIs as it would certainly create many problems.

I noticed that asList() uses a special, fixed-size implementation of List to make this possible w/o screwing with the standard ArrayList, and more efficient too. And it's not even a new API! I'll be certaily reviewing some of my code to exploit asList() where adequate.

On the other hand, we still have a performance problem, because addAll(c) (in array-backed collections like ArrayList) has an optimization that uses c.toArray() so it can fetch the objects to an array and then copy'em to its internal array the fastest way possible (System.arraycopy()), and in addAll(index,c), also to move the elements after the insertion points only once. But the toArray() method will create a new array, even for the special list created by asList(T[]), where the cloning is completely meaningless because its internal array field is already exposed due to the way the list is created. I see that the problem is that Collection.toArray() documents that this method must always return a new array, but IMHO in this special case, honoring this contract is just "specification fanatism" because it doesn't make the list much safer. One can argue that the current asList() creates a backdoor only for the code that creates the array, not for code that eventually receives the list, but this backdoor means nothing anyway, as anything you can do with the internal array, you can do with the list's get/set(). Notice that in this special list, there is no growing of that backing array, and no unused elements (size() is always equal to the array's length), the only reasons why even java.util.ArrayList couldn't also expose its internal array with toArray().

In short, my suggestion is: just change Arrays.ArrayList.toArray() so it returns a straight reference to its internal array. It's the only missing piece to make Arrays.asList() a perfect, near-zero-overhead wrapper for primitive arrays. This optimization will be only a "weak violation" of Collection.toArray()'s contract, because it won't produce any of the adverse side effects that this contract means to evict. Only the JCK would probably be enhanced to consider this special case, it probably contains tests that invoke toArray() repeatedly on the same collections and checks if different arrays are returned each time. Another idea would be enhancing toArray()'s spec to not require a new array to be returned in the special cases (like Arrays.ArrayList) where this doesn't break anything.

mthornton
Offline
Joined: 2003-06-10

You can't have E[] toArray() because of type erasure. For the rest you could use addAll(Arrays.asList(array))

monika_krug
Offline
Joined: 2004-10-14

If Type[] array = (Type[]) collection.toArray(); worked, that would be enough.

Thanks for the other info.

Monika.