Reading/writing GC-less memory
Overview
How you access data can make a significant difference to speed. Whether you unroll loops by hand or let the JIT do it for you can also make a difference to performance. I have included C++ and Java tests doing the same thing for comparison.
Tests
The following tests compare different approaches to storing 16 GB of data. In each case I compared:
- allocation, write, read and total GC times
- byte[] (the smallest primitive) and long[] (the largest primitive)
- plain arrays, direct ByteBuffer and Unsafe
- JIT-optimised loops and loops hand-unrolled four times
| Store | Memory | Element | Unrolled | Allocate | Write | Read | GC time |
|---|---|---|---|---|---|---|---|
| C++ char[] | native | 8-bit char | no | 31 μs | 12.0 s | 8.7 s | N/A |
| C++ char[] | native | 8-bit char | yes | 5 μs | 8.8 s | 6.6 s | N/A |
| C++ long long[] | native | 64-bit int | no | 11 μs | 4.6 s | 1.4 s | N/A |
| C++ long long[] | native | 64-bit int | yes | 12 μs | 4.2 s | 1.2 s | N/A |
| byte[] | heap | byte | no | 4.9 s | 20.7/7.8 s | 7.4 s | 51 ms |
| byte[] | heap | byte | yes | 4.9 s | 7.1 s | 8.5 s | 44 ms |
| long[] | heap | long | no | 4.7 s | 1.6 s | 1.5 s | 37 ms |
| long[] | heap | long | yes | 4.7 s | 1.5 s | 1.4 s | 45 ms |
| ByteBuffer | direct | byte | no | 4.8 s | 18.1/10.0 s | 14.0 s | 6.1 ms |
| ByteBuffer | direct | byte | yes | 4.8 s | 12.2/10.0 s | 16.7 s | 6.1 ms |
| ByteBuffer | direct | long | no | 4.7 s | 6.0/3.9 s | 2.4 s | 6.1 ms |
| ByteBuffer | direct | long | yes | 4.6 s | 4.7/2.3 s | 7.9 s | 6.1 ms |
| Unsafe | direct | byte | no | 10 μs | 18.2 s | 13.8 s | 6.0 ms |
| Unsafe | direct | byte | yes | 10 μs | 8.7 s | 8.3 s | 6.0 ms |
| Unsafe | direct | long | no | 10 μs | 5.2 s | 1.9 s | 6.0 ms |
| Unsafe | direct | long | yes | 10 μs | 4.2 s | 1.3 s | 6.0 ms |
In each case, this is the time taken to perform 8-bit byte or 64-bit long operations across 16 GB of data in the structure indicated. In C++ and with Unsafe, a single array/block of memory was used. For the Java array and ByteBuffer tests, multiple objects were used to make up the same total amount of space.
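The full benchmark sources are linked under "The code" below. As a rough illustration of the kind of write/read passes being timed for the GC-less stores, here is a minimal sketch using sun.misc.Unsafe and a direct ByteBuffer. The class name, sizes and loop bodies here are illustrative only, not the actual test code.

```java
import java.lang.reflect.Field;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import sun.misc.Unsafe;

public class GclessSketch {
    // Illustrative size only; the article's tests cover 16 GB in total.
    static final long SIZE = 1L << 30; // 1 GB in this sketch

    public static void main(String[] args) throws Exception {
        // Unsafe: one large block outside the heap, invisible to the GC.
        Field f = Unsafe.class.getDeclaredField("theUnsafe");
        f.setAccessible(true);
        Unsafe unsafe = (Unsafe) f.get(null);

        long address = unsafe.allocateMemory(SIZE);
        for (long i = 0; i < SIZE; i += 8)
            unsafe.putLong(address + i, i);          // write pass
        long sum = 0;
        for (long i = 0; i < SIZE; i += 8)
            sum += unsafe.getLong(address + i);      // read pass
        unsafe.freeMemory(address);

        // Direct ByteBuffer: also off heap, but each buffer is limited to
        // under 2 GB, so a 16 GB test needs several buffers.
        ByteBuffer bb = ByteBuffer.allocateDirect(1 << 30).order(ByteOrder.nativeOrder());
        for (int i = 0; i < bb.capacity(); i += 8)
            bb.putLong(i, i);                        // write pass
        long sum2 = 0;
        for (int i = 0; i < bb.capacity(); i += 8)
            sum2 += bb.getLong(i);                   // read pass

        System.out.println(sum + " " + sum2);
    }
}
```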
C++ test configuration
All tests were performed with gcc 4.5.2 on Ubuntu 11.04, compiled with -O2.
Java test configuration
All tests were performed with Java 6 update 26 and Java 7 update 0, on a fast PC with 24 GB of memory. Where two timings are shown they are the Java 6/Java 7 values; where there is one value, both were the same. All tests were run with the options -mx23g -XX:MaxDirectMemorySize=20g -verbosegc
Curiosity
For me the most curious result was the performance of long[], which was very fast in Java, faster even than using C++ or Unsafe directly.
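To illustrate what "hand unrolled four times" means in the table above, here is a minimal sketch of a plain write loop over a long[] and the same loop unrolled by hand. The names and sizes are illustrative; the actual test code is linked below.

```java
// Illustrative only: a plain write loop over a long[] that the JIT can
// optimise itself, and the same loop hand-unrolled four times.
public class UnrollSketch {
    static void write(long[] data) {
        for (int i = 0; i < data.length; i++)
            data[i] = i;
    }

    static void writeUnrolled4(long[] data) {
        // assumes data.length is a multiple of 4, for brevity
        for (int i = 0; i < data.length; i += 4) {
            data[i] = i;
            data[i + 1] = i + 1;
            data[i + 2] = i + 2;
            data[i + 3] = i + 3;
        }
    }

    public static void main(String[] args) {
        long[] data = new long[1 << 20]; // small size for illustration
        write(data);
        writeUnrolled4(data);
    }
}
```

As the table shows, hand unrolling made little difference for long[]; the plain loop is already well optimised by the JIT.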
The code
C++ tests - memorytest/main.cpp
Java tests - MemoryTest.java
Hi Peter,
Thanks for all the good analysis and numbers you provide in your blog... Can you also write a post on the performance of "copyOnWrite" collections in Java?
@Subhash, A good suggestion.
Not sure why the Java heap byte[] reading time would be similar to C++.
How about a microbenchmark accessing a 1024-byte array 1 million times; would it have a similar latency to accessing a large array of 1 GB of bytes?
Under the hood, baload can be JITed into several calls, per the HotSpot C++ source code.
Not sure how byte[] -> baload will be JITed, especially how many times arrayOopDesc::base_offset_in_bytes will be called, and whether the JIT can compile *HeapWordSize into << 3 for cases where HeapWordSize = 8.
```cpp
void TemplateTable::baload() {
  transition(itos, itos);
  __ pop_ptr(rdx);
  // eax: index
  // rdx: array
  index_check(rdx, rax); // kills rbx
  __ load_signed_byte(rax,
                      Address(rdx, rax,
                              Address::times_1,
                              arrayOopDesc::base_offset_in_bytes(T_BYTE)));
}
```