In Java 8, why were Arrays not given the forEach method of Iterable?

Kedar Mhaswade · Feb 20, 2016

I must be missing something here.

In Java 5, the "for-each loop" statement (also called the enhanced for loop) was introduced. It appears that it was introduced mainly to iterate through Collections. Any collection (or container) class that implements the Iterable interface is eligible for iteration using the "for-each loop". Perhaps for historic reasons, Java arrays did not implement the Iterable interface. But since arrays were/are ubiquitous, javac would accept the use of the for-each loop on arrays (generating bytecode equivalent to a traditional for loop).

In Java 8, the forEach method was added to the Iterable interface as a default method. This made it possible to pass lambda expressions to collections while iterating (e.g. list.forEach(System.out::println)). But again, arrays don't enjoy this treatment. (I understand that there are workarounds.)

Are there technical reasons why javac couldn't be enhanced to accept arrays in forEach, just like it accepts them in the enhanced for loop? It appears that code generation would be possible without requiring that arrays implement Iterable. Am I being naive?

This is especially important for a newcomer to the language who rather naturally uses arrays because of their syntactical ease. It's hardly natural to switch to Lists and use Arrays.asList(1, 2, 3).

Answer

Stuart Marks · Feb 24, 2016

There are a bunch of special cases in the Java language and in the JVM for arrays. Arrays have an API, but it's barely visible. It is as if arrays are declared to have:

  • implements Cloneable, Serializable
  • public final int length
  • public T[] clone() where T is the array's component type

However, these declarations aren't visible in any source code anywhere. See JLS 4.10.3 and JLS 10.7 for explanations. Cloneable and Serializable are visible via reflection, and are returned by a call to

Object[].class.getInterfaces()

Perhaps surprisingly, the length field and the clone() method aren't visible reflectively. The length field isn't a field at all; using it turns into a special arraylength bytecode. A call to clone() results in an actual virtual method call, but if the receiver is an array type, this is handled specially by the JVM.
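The reflective behavior described above is easy to check. The following is a minimal sketch; the class name ArrayReflectionDemo and its helper methods are made up for illustration, and only standard java.lang.Class calls are used.

```java
import java.lang.reflect.Method;

public class ArrayReflectionDemo {
    // Joins the names of the interfaces that reflection reports for a class.
    static String interfaceNames(Class<?> type) {
        StringBuilder sb = new StringBuilder();
        for (Class<?> c : type.getInterfaces()) {
            if (sb.length() > 0) {
                sb.append(", ");
            }
            sb.append(c.getName());
        }
        return sb.toString();
    }

    // Returns true if reflection reports a public method with the given name.
    static boolean hasPublicMethod(Class<?> type, String name) {
        for (Method m : type.getMethods()) {
            if (m.getName().equals(name)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // The two interfaces are visible reflectively.
        System.out.println(interfaceNames(Object[].class)); // java.lang.Cloneable, java.io.Serializable

        // The length "field" is not reported as a field.
        System.out.println(Object[].class.getFields().length); // 0

        // The public clone() method is not reported either.
        System.out.println(hasPublicMethod(Object[].class, "clone")); // false

        // Both still work at the language level.
        int[] original = {1, 2, 3};
        int[] copy = original.clone();
        System.out.println(copy.length); // 3
    }
}
```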

Notably, though, array classes do not implement the Iterable interface.

When the enhanced-for loop ("for-each") was added in Java SE 5, it supported two different cases for the right-hand-side expression: an Iterable or an array type (JLS 14.14.2). The reason is that Iterable instances and arrays are handled completely differently by the enhanced-for statement. That section of the JLS gives the full treatment, but put more simply, the situation is as follows.

For an Iterable<T> iterable, the code

for (T t : iterable) {
    <loop body>
}

is syntactic sugar for

for (Iterator<T> iterator = iterable.iterator(); iterator.hasNext(); ) {
    T t = iterator.next();
    <loop body>
}

For an array T[], the code

for (T t : array) {
    <loop body>
}

is syntactic sugar for

int len = array.length;
for (int i = 0; i < len; i++) {
    T t = array[i];
    <loop body>
}
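To make the two expansions concrete, here is a small runnable sketch that writes each one out by hand and compares it with the real enhanced-for loop. The class and method names are hypothetical, and the expansions follow the simplified form shown above rather than the exact wording of the JLS.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class EnhancedForExpansion {
    // Hand-written expansion of: for (Integer t : iterable) { sum += t; }
    static int sumIterable(Iterable<Integer> iterable) {
        int sum = 0;
        for (Iterator<Integer> iterator = iterable.iterator(); iterator.hasNext(); ) {
            Integer t = iterator.next();
            sum += t;
        }
        return sum;
    }

    // Hand-written expansion of: for (int t : array) { sum += t; }
    static int sumArray(int[] array) {
        int sum = 0;
        int len = array.length;
        for (int i = 0; i < len; i++) {
            int t = array[i];
            sum += t;
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Integer> list = Arrays.asList(1, 2, 3);
        int[] array = {1, 2, 3};

        // The same sums computed with the enhanced-for statement itself.
        int listSum = 0;
        for (Integer t : list) {
            listSum += t;
        }
        int arraySum = 0;
        for (int t : array) {
            arraySum += t;
        }

        System.out.println(sumIterable(list) == listSum); // true
        System.out.println(sumArray(array) == arraySum);  // true
    }
}
```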

Now, why was it done this way? It would certainly be possible for arrays to implement Iterable, since they implement other interfaces already. It would also be possible for the compiler to synthesize an Iterator implementation that's backed by an array. (There is precedent for this. The compiler already synthesizes the static values() and valueOf() methods that are automatically added to every enum class, as described in JLS 8.9.3.)
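For a sense of what an array-backed Iterable could look like, here is a hand-written sketch. ArrayIterable is a hypothetical class, not something in the JDK or something the compiler generates, and because it is generic it only covers arrays of reference types.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

public class ArrayIterable<T> implements Iterable<T> {
    private final T[] array;

    public ArrayIterable(T[] array) {
        this.array = array;
    }

    @Override
    public Iterator<T> iterator() {
        // A fresh Iterator object is allocated for every traversal.
        return new Iterator<T>() {
            private int index = 0;

            @Override
            public boolean hasNext() {
                return index < array.length;
            }

            @Override
            public T next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return array[index++];
            }
        };
    }

    // Concatenates the elements using the default forEach inherited from Iterable.
    static String join(String[] words) {
        StringBuilder sb = new StringBuilder();
        new ArrayIterable<>(words).forEach(sb::append);
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] words = {"a", "b", "c"};
        new ArrayIterable<>(words).forEach(System.out::println);
        System.out.println(join(words)); // abc
    }
}
```

With a wrapper like this, forEach comes for free from Iterable, at the price of the extra object and method calls.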

But arrays are a very low-level construct, and accessing an array element by an int index is expected to be an extremely inexpensive operation. It's quite idiomatic to run a loop index from 0 to an array's length, incrementing by one each time. The enhanced-for loop on an array does exactly that. If the enhanced-for loop over an array were implemented using the Iterable protocol, I think most people would be unpleasantly surprised to discover that looping over an array involved an initial method call and memory allocation (creating the Iterator), followed by two method calls per loop iteration.
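The method calls made by the Iterable protocol can be counted with a small instrumented wrapper. This is only a sketch: IteratorCallCounter and its counters are hypothetical, and it counts calls at the source level, so it says nothing about what the JIT compiler may later inline or eliminate.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class IteratorCallCounter {
    static int iteratorCalls;
    static int hasNextCalls;
    static int nextCalls;

    // Wraps an Iterable so that every protocol call increments a counter.
    static <T> Iterable<T> counting(Iterable<T> source) {
        return () -> {
            iteratorCalls++;
            Iterator<T> it = source.iterator();
            return new Iterator<T>() {
                @Override
                public boolean hasNext() {
                    hasNextCalls++;
                    return it.hasNext();
                }

                @Override
                public T next() {
                    nextCalls++;
                    return it.next();
                }
            };
        };
    }

    // Runs an enhanced-for loop over the items and reports
    // { iterator() calls, hasNext() calls, next() calls }.
    static int[] countCalls(List<String> items) {
        iteratorCalls = 0;
        hasNextCalls = 0;
        nextCalls = 0;
        for (String s : counting(items)) {
            // Loop body intentionally empty; only the protocol calls matter here.
        }
        return new int[] {iteratorCalls, hasNextCalls, nextCalls};
    }

    public static void main(String[] args) {
        // Three elements: one iterator() call, four hasNext() calls
        // (the last one returns false), and three next() calls.
        System.out.println(Arrays.toString(countCalls(Arrays.asList("a", "b", "c")))); // [1, 4, 3]
    }
}
```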

So when default methods were added to Iterable in Java 8, this didn't affect arrays at all.

As others have noted, if you have an array of int, long, double, or of reference type, it's possible to turn this into a stream using one of the Arrays.stream() calls. This provides access to map(), filter(), forEach(), etc.
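A short sketch of the Arrays.stream() route follows; the class and method names are made up for illustration. Note that Arrays.stream() has overloads only for int[], long[], double[], and arrays of reference types, so other primitive arrays such as char[] or byte[] need a different approach.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

public class ArrayStreamDemo {
    // Sums the even values of an int[] via an IntStream.
    static int sumOfEvens(int[] values) {
        return Arrays.stream(values)
                .filter(v -> v % 2 == 0)
                .sum();
    }

    // Upper-cases each element of a String[] and joins the results with commas.
    static String upperJoined(String[] words) {
        return Arrays.stream(words)
                .map(String::toUpperCase)
                .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        int[] numbers = {1, 2, 3, 4};
        String[] names = {"ada", "bob"};

        // forEach on a stream built from an array.
        Arrays.stream(numbers).forEach(System.out::println);
        Arrays.stream(names).forEach(System.out::println);

        System.out.println(sumOfEvens(numbers)); // 6
        System.out.println(upperJoined(names));  // ADA,BOB
    }
}
```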

It would be nice, though, if the special cases in the Java language and JVM for arrays were replaced by real constructs (along with fixing a bunch of other array-related problems, such as poor handling of 2+ dimensional arrays, the 2^31 length limitation, and so forth). This is the subject of the "Arrays 2.0" investigation being led by John Rose. See John's talk at JVMLS 2012 (video, slides). The ideas relevant to this discussion include introduction of an actual interface for arrays, to allow libraries to interpose element access, to support additional operations such as slicing and copying, and so forth.

Note that all of this is investigation and future work. There is nothing from these array enhancements that is committed in the Java roadmap for any release, as of this writing (2016-02-23).