I am running some micro benchmarks on Java list iteration code. I have used -XX:+PrintCompilation, and -verbose:gc flags to ensure that nothing is happening in the background when the timing is being run. However, I see something in the output which I cannot understand.
Here's the code, I am running the benchmark on:
import java.util.ArrayList;
import java.util.List;
public class PerformantIteration {
private static int theSum = 0;
public static void main(String[] args) {
System.out.println("Starting microbenchmark on iterating over collections with a call to size() in each iteration");
List<Integer> nums = new ArrayList<Integer>();
for(int i=0; i<50000; i++) {
nums.add(i);
}
System.out.println("Warming up ...");
//warmup... make sure all JIT comliling is done before the actual benchmarking starts
for(int i=0; i<10; i++) {
iterateWithConstantSize(nums);
iterateWithDynamicSize(nums);
}
//actual
System.out.println("Starting the actual test");
long constantSizeBenchmark = iterateWithConstantSize(nums);
long dynamicSizeBenchmark = iterateWithDynamicSize(nums);
System.out.println("Test completed... printing results");
System.out.println("constantSizeBenchmark : " + constantSizeBenchmark);
System.out.println("dynamicSizeBenchmark : " + dynamicSizeBenchmark);
System.out.println("dynamicSizeBenchmark/constantSizeBenchmark : " + ((double)dynamicSizeBenchmark/(double)constantSizeBenchmark));
}
private static long iterateWithDynamicSize(List<Integer> nums) {
int sum=0;
long start = System.nanoTime();
for(int i=0; i<nums.size(); i++) {
// appear to do something useful
sum += nums.get(i);
}
long end = System.nanoTime();
setSum(sum);
return end-start;
}
private static long iterateWithConstantSize(List<Integer> nums) {
int count = nums.size();
int sum=0;
long start = System.nanoTime();
for(int i=0; i<count; i++) {
// appear to do something useful
sum += nums.get(i);
}
long end = System.nanoTime();
setSum(sum);
return end-start;
}
// invocations to this method simply exist to fool the VM into thinking that we are doing something useful in the loop
private static void setSum(int sum) {
theSum = sum;
}
}
Here's the output.
152 1 java.lang.String::charAt (33 bytes)
160 2 java.lang.String::indexOf (151 bytes)
165 3Starting microbenchmark on iterating over collections with a call to size() in each iteration java.lang.String::hashCode (60 bytes)
171 4 sun.nio.cs.UTF_8$Encoder::encodeArrayLoop (490 bytes)
183 5
java.lang.String::lastIndexOf (156 bytes)
197 6 java.io.UnixFileSystem::normalize (75 bytes)
200 7 java.lang.Object::<init> (1 bytes)
205 8 java.lang.Number::<init> (5 bytes)
206 9 java.lang.Integer::<init> (10 bytes)
211 10 java.util.ArrayList::add (29 bytes)
211 11 java.util.ArrayList::ensureCapacity (58 bytes)
217 12 java.lang.Integer::valueOf (35 bytes)
221 1% performance.api.PerformantIteration::main @ 21 (173 bytes)
Warming up ...
252 13 java.util.ArrayList::get (11 bytes)
252 14 java.util.ArrayList::rangeCheck (22 bytes)
253 15 java.util.ArrayList::elementData (7 bytes)
260 2% performance.api.PerformantIteration::iterateWithConstantSize @ 19 (59 bytes)
268 3% performance.api.PerformantIteration::iterateWithDynamicSize @ 12 (57 bytes)
272 16 performance.api.PerformantIteration::iterateWithConstantSize (59 bytes)
278 17 performance.api.PerformantIteration::iterateWithDynamicSize (57 bytes)
Starting the actual test
Test completed... printing results
constantSizeBenchmark : 301688
dynamicSizeBenchmark : 782602
dynamicSizeBenchmark/constantSizeBenchmark : 2.5940773249184588
I don't understand these four lines from the output.
260 2% performance.api.PerformantIteration::iterateWithConstantSize @ 19 (59 bytes)
268 3% performance.api.PerformantIteration::iterateWithDynamicSize @ 12 (57 bytes)
272 16 performance.api.PerformantIteration::iterateWithConstantSize (59 bytes)
278 17 performance.api.PerformantIteration::iterateWithDynamicSize (57 bytes)
I am going to attempt answering my own question with the help of this link posted by Thomas Jungblut.
260 2% performance.api.PerformantIteration::iterateWithConstantSize @ 19 (59 bytes)
268 3% performance.api.PerformantIteration::iterateWithDynamicSize @ 12 (57 bytes)
272 16 performance.api.PerformantIteration::iterateWithConstantSize (59 bytes)
278 17 performance.api.PerformantIteration::iterateWithDynamicSize (57 bytes)
First column
The first column '260' is the timestamp.
Second column
The second column is the compilation_id and method_attributes. When a HotSpot compilation is triggered, every compilation unit gets a compilation id. The number in the second column is the compilation id. JIT compilation, and OSR compilation have two different sequences for the compilation id. So 1% and 1 are different compilation units. The % in the first two rows, refer to the fact that this is an OSR compilation. An OSR compilation was triggered because the code was looping over a large loop, and the VM determined that this code is hot. So an OSR compilation was triggered, which would enable the VM to do an On Stack Replacement and move over to the optimized code, once it is ready.
Third column
The third column performance.api.PerformantIteration::iterateWithConstantSize
is the method name.
Fourth column
The fourth column is again different when OSR compilation happens and when it does not. Let's look at the common parts first. The end of the fourth column (59 bytes), refers to the size of the compilation unit in bytecode (not the size of the compiled code). The @ 19 part in OSR compilation refers to the osr_bci. I am going to quote from the link mentioned above -
A "place" in a Java method is defined by its bytecode index (BCI), and the place that triggered an OSR compilation is called the "osr_bci". An OSR-compiled nmethod can only be entered from its osr_bci; there can be multiple OSR-compiled versions of the same method at the same time, as long as their osr_bci differ.
Finally, why was the method compiled twice ?
The first one is an OSR compilation, which presumably happened while the loop was running due to the warmup code (in the example), and the second compilation is a JIT compilation, presumably to further optimize the compiled code ?