When to use Array, Buffer or direct Buffer

TwoThe · Sep 20, 2013 · Viewed 11.9k times

Question

While writing a Matrix class for use with OpenGL libraries, I came across the question of whether to use Java arrays or a Buffer strategy to store the data (JOGL offers direct-buffer copy for Matrix operations). To analyze this, I wrote a small performance test program that compares the relative speeds of loop and bulk operations on Arrays vs Buffers vs direct Buffers.

I'd like to share my results with you here (as I find them rather interesting). Please feel free to comment and/or point out any mistakes.
The code can be viewed at pastebin.com/is7UaiMV.

Notes

  • Loop-read array is implemented as A[i] = B[i], as otherwise the JIT optimizer would remove that code completely. An actual read of the form var = A[i] seems to perform pretty much the same.

  • In the sample result for an array size of 10,000, it is very likely that the JIT optimizer has replaced the looped array access with a System.arraycopy-like implementation.

  • There is no bulk-get buffer->buffer test, because Java implements A.get(B) as B.put(A); the results would therefore be the same as the bulk-put results.
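
The array operations named in these notes can be sketched as follows. This is only a minimal illustration, not the benchmark from the pastebin link; the class name ArrayOps and its methods are hypothetical.

```java
import java.util.Arrays;

// Hypothetical helper class; not the original benchmark code.
class ArrayOps {
    // "Loop-write": assign every element in a plain loop.
    static void loopWrite(float[] a, float value) {
        for (int i = 0; i < a.length; i++) {
            a[i] = value;
        }
    }

    // "Loop-read" as described in the notes: copying element by element
    // keeps the read observable, so the loop is not trivially dead code.
    static void loopRead(float[] dst, float[] src) {
        for (int i = 0; i < src.length; i++) {
            dst[i] = src[i];
        }
    }

    public static void main(String[] args) {
        float[] a = new float[16];
        float[] b = new float[16];

        loopWrite(a, 1.5f);                       // loop write
        Arrays.fill(b, 2.5f);                     // bulk fill
        loopRead(b, a);                           // loop copy, b now equals a
        System.arraycopy(a, 0, b, 0, a.length);   // bulk copy

        System.out.println(Arrays.equals(a, b));  // prints "true"
    }
}
```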

Conclusion

In almost all situations it is strongly recommended to use plain Java arrays. Not only is element access (put/get) massively faster, the JIT is also able to perform much better optimizations on the final code.

Buffers should only be used if both of the following apply:

  • You need to process large amounts of data.
  • That data is mostly or always bulk-processed.
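
To make the difference between looped and bulk access concrete, here is a rough sketch using a heap FloatBuffer. The class BufferTransfer and its method names are made up for this illustration and are not taken from the test program.

```java
import java.nio.FloatBuffer;

// Hypothetical helper class for illustration only.
class BufferTransfer {
    // Loop-put: one relative put call per element.
    static void loopPut(FloatBuffer dst, float[] src) {
        dst.clear();
        for (int i = 0; i < src.length; i++) {
            dst.put(src[i]);
        }
    }

    // Bulk-put: a single call transfers the whole array.
    static void bulkPut(FloatBuffer dst, float[] src) {
        dst.clear();
        dst.put(src);
    }

    // Bulk-get: a single call copies the buffer content into a new array.
    static float[] bulkGet(FloatBuffer src, int length) {
        float[] out = new float[length];
        src.rewind();
        src.get(out);
        return out;
    }
}
```

Both put variants leave the buffer with the same content; only the number of calls differs, which is what the loop and bulk rows in the results compare.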

Note that an array-backed buffer has a Java array backing its content. It is recommended to operate on this backing array directly instead of looping over put/get.
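
A minimal sketch of working on the backing array, assuming the buffer was created with FloatBuffer.wrap or FloatBuffer.allocate. The class and method names are hypothetical; hasArray(), array() and arrayOffset() are the standard java.nio calls.

```java
import java.nio.FloatBuffer;

// Hypothetical example class; only the java.nio calls are real API.
class BackingArrayDemo {
    // Scales the elements between position and limit in place by working on
    // the backing array. Returns false if no accessible array exists
    // (for example for direct or read-only buffers).
    static boolean scaleInPlace(FloatBuffer buf, float factor) {
        if (!buf.hasArray()) {
            return false;
        }
        float[] backing = buf.array();
        int base = buf.arrayOffset(); // array index of buffer index 0
        for (int i = buf.position(); i < buf.limit(); i++) {
            backing[base + i] *= factor;
        }
        return true;
    }

    public static void main(String[] args) {
        FloatBuffer buf = FloatBuffer.wrap(new float[] {1f, 2f, 3f});
        scaleInPlace(buf, 2f);
        System.out.println(buf.get(1)); // prints "4.0"
    }
}
```

Checking hasArray() first matters because array() throws for buffers without an accessible backing array.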

Direct buffers should only be used if you are concerned about memory usage and never access the underlying data from Java. They are slightly slower than non-direct buffers and much slower if the underlying data is accessed, but they use less memory. In addition, there is extra overhead for converting non-byte data (such as float arrays) into bytes when using a direct buffer.
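
For reference, this is roughly how a direct float buffer is created and filled in bulk. It is a sketch with hypothetical names (DirectBufferDemo, allocateDirectFloats), not the code used for the measurements.

```java
import java.nio.ByteBuffer;
import java.nio.FloatBuffer;

// Hypothetical example class for illustration.
class DirectBufferDemo {
    // ByteBuffer.allocateDirect takes a size in bytes,
    // hence the factor Float.BYTES (4 bytes per float).
    static FloatBuffer allocateDirectFloats(int count) {
        return ByteBuffer.allocateDirect(count * Float.BYTES).asFloatBuffer();
    }

    public static void main(String[] args) {
        float[] matrix = new float[16];
        matrix[0] = matrix[5] = matrix[10] = matrix[15] = 1f; // identity matrix

        FloatBuffer direct = allocateDirectFloats(matrix.length);
        direct.put(matrix);   // bulk-put array->bufferD
        direct.flip();        // switch from writing to reading

        float[] copy = new float[matrix.length];
        direct.get(copy);     // bulk-get bufferD->array

        System.out.println(direct.isDirect()); // prints "true"
        System.out.println(direct.hasArray()); // prints "false"
    }
}
```

Since hasArray() is false here, there is no backing array to fall back on; all access from Java goes through put/get.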


Sample results

Note: Percentages are relative to the slowest test in each run (100,00%). They are only for ease of reading and have no further meaning.
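
In each run the slowest test is listed as 100,00%, so the percentage column appears to be each timing divided by the slowest one. A small sketch of that arithmetic, using two timings from the size-16 run (the class and method names are hypothetical):

```java
import java.util.Locale;

// Hypothetical helper to reproduce the percentage column.
class RelativePercent {
    // Percentage of a timing relative to the slowest timing of the run.
    static double percentOf(double ms, double slowestMs) {
        return 100.0 * ms / slowestMs;
    }

    public static void main(String[] args) {
        // Loop-write array (87.29 ms) vs. the slowest test of that run (757.95 ms).
        double pct = percentOf(87.29, 757.95);
        System.out.println(String.format(Locale.ROOT, "%.2f%%", pct)); // prints "11.52%"
    }
}
```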

Using arrays of size 16 with 10,000,000 iterations...

-- Array tests: -----------------------------------------

Loop-write array:           87.29 ms  11,52%
Arrays.fill:                64.51 ms   8,51%
Loop-read array:            42.11 ms   5,56%
System.arraycopy:           47.25 ms   6,23%

-- Buffer tests: ----------------------------------------

Loop-put buffer:           603.71 ms  79,65%
Index-put buffer:          536.05 ms  70,72%
Bulk-put array->buffer:    105.43 ms  13,91%
Bulk-put buffer->buffer:    99.09 ms  13,07%

Bulk-put bufferD->buffer:   80.38 ms  10,60%
Loop-get buffer:           505.77 ms  66,73%
Index-get buffer:          562.84 ms  74,26%
Bulk-get buffer->array:    137.86 ms  18,19%

-- Direct buffer tests: ---------------------------------

Loop-put bufferD:          570.69 ms  75,29%
Index-put bufferD:         562.76 ms  74,25%
Bulk-put array->bufferD:   712.16 ms  93,96%
Bulk-put buffer->bufferD:   83.53 ms  11,02%

Bulk-put bufferD->bufferD: 118.00 ms  15,57%
Loop-get bufferD:          528.62 ms  69,74%
Index-get bufferD:         560.36 ms  73,93%
Bulk-get bufferD->array:   757.95 ms 100,00%

Using arrays of size 1,000 with 100,000 iterations...

-- Array tests: -----------------------------------------

Loop-write array:           22.10 ms   6,21%
Arrays.fill:                10.37 ms   2,91%
Loop-read array:            81.12 ms  22,79%
System.arraycopy:           10.59 ms   2,97%

-- Buffer tests: ----------------------------------------

Loop-put buffer:           355.98 ms 100,00%
Index-put buffer:          353.80 ms  99,39%
Bulk-put array->buffer:     16.33 ms   4,59%
Bulk-put buffer->buffer:     5.40 ms   1,52%

Bulk-put bufferD->buffer:    4.95 ms   1,39%
Loop-get buffer:           299.95 ms  84,26%
Index-get buffer:          343.05 ms  96,37%
Bulk-get buffer->array:     15.94 ms   4,48%

-- Direct buffer tests: ---------------------------------

Loop-put bufferD:          355.11 ms  99,75%
Index-put bufferD:         348.63 ms  97,93%
Bulk-put array->bufferD:   190.86 ms  53,61%
Bulk-put buffer->bufferD:    5.60 ms   1,57%

Bulk-put bufferD->bufferD:   7.73 ms   2,17%
Loop-get bufferD:          344.10 ms  96,66%
Index-get bufferD:         333.03 ms  93,55%
Bulk-get bufferD->array:   190.12 ms  53,41%

Using arrays of size 10,000 with 100,000 iterations...

-- Array tests: -----------------------------------------

Loop-write array:          156.02 ms   4,37%
Arrays.fill:               109.06 ms   3,06%
Loop-read array:           300.45 ms   8,42%
System.arraycopy:          147.36 ms   4,13%

-- Buffer tests: ----------------------------------------

Loop-put buffer:          3385.94 ms  94,89%
Index-put buffer:         3568.43 ms 100,00%
Bulk-put array->buffer:    159.40 ms   4,47%
Bulk-put buffer->buffer:     5.31 ms   0,15%

Bulk-put bufferD->buffer:    6.61 ms   0,19%
Loop-get buffer:          2907.21 ms  81,47%
Index-get buffer:         3413.56 ms  95,66%
Bulk-get buffer->array:    177.31 ms   4,97%

-- Direct buffer tests: ---------------------------------

Loop-put bufferD:         3319.25 ms  93,02%
Index-put bufferD:        3538.16 ms  99,15%
Bulk-put array->bufferD:  1849.45 ms  51,83%
Bulk-put buffer->bufferD:    5.60 ms   0,16%

Bulk-put bufferD->bufferD:   7.63 ms   0,21%
Loop-get bufferD:         3227.26 ms  90,44%
Index-get bufferD:        3413.94 ms  95,67%
Bulk-get bufferD->array:  1848.24 ms  51,79%

Answer

Holger · Sep 20, 2013

Direct buffers are not meant to accelerate access from Java code. (If that were possible, there would be something wrong with the JVM’s own array implementation.)

These byte buffers are for interfacing with other components: you can write a byte buffer to a ByteChannel, and you can use direct buffers in conjunction with native code, such as the OpenGL libraries you mentioned. They are intended to accelerate these operations. Using a graphics card’s chip for rendering can speed up the overall operation enough to more than compensate for the possibly slower access to the buffer from Java code.
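
As an illustration of the channel use case, the following sketch writes floats to a file through a direct byte buffer. FileChannel is one ByteChannel implementation; the class name ChannelDemo, the method writeFloats and the temporary file are assumptions made for this example.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical example class for illustration.
class ChannelDemo {
    // Writes the floats to an existing file through a direct byte buffer
    // and returns the number of bytes written.
    static int writeFloats(Path file, float[] values) throws IOException {
        ByteBuffer bytes = ByteBuffer.allocateDirect(values.length * Float.BYTES)
                                     .order(ByteOrder.nativeOrder());
        // The float view shares the content but has its own position,
        // so `bytes` stays at position 0 with limit == capacity.
        bytes.asFloatBuffer().put(values);

        int written = 0;
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.WRITE)) {
            while (bytes.hasRemaining()) {
                written += ch.write(bytes);
            }
        }
        return written;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("floats", ".bin");
        try {
            System.out.println(writeFloats(tmp, new float[] {1f, 2f, 3f, 4f})); // prints "16"
        } finally {
            Files.delete(tmp);
        }
    }
}
```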

By the way, if you measure the access speed to a byte buffer, especially the direct byte buffers, it’s worth changing the byte order to the native byte order before acquiring a FloatBuffer view:

FloatBuffer bufferD = ByteBuffer.allocateDirect(SIZE * 4)
                                .order(ByteOrder.nativeOrder())
                                .asFloatBuffer();
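
The reason this matters: a newly allocated ByteBuffer is always big-endian, regardless of the platform, and a FloatBuffer view keeps the order the byte buffer had when the view was created. A short sketch to check this (the class and method names are hypothetical):

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

// Hypothetical example class for illustration.
class ByteOrderDemo {
    // View created without touching the byte order (default: big-endian).
    static FloatBuffer defaultOrderView(int size) {
        return ByteBuffer.allocateDirect(size * 4).asFloatBuffer();
    }

    // View created after switching to the platform's native byte order.
    static FloatBuffer nativeOrderView(int size) {
        return ByteBuffer.allocateDirect(size * 4)
                         .order(ByteOrder.nativeOrder())
                         .asFloatBuffer();
    }

    public static void main(String[] args) {
        System.out.println(defaultOrderView(16).order()); // prints "BIG_ENDIAN"
        System.out.println(nativeOrderView(16).order());  // platform dependent
    }
}
```

The order has to be set before calling asFloatBuffer(); changing it on the ByteBuffer afterwards does not affect an existing view.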