Surprising results of autoboxing

Overview

Auto-boxing has a number of surprising consequences, some more widely known than others. Most of them are due to the caching of some auto-boxed objects.

== and equals may or may not match

If you run the program AutoboxEqualsMain with default options, you get the following.
All Boolean are cached.
All Byte are cached.
The first auto-boxed char is 0 and the last is 127
The first auto-boxed short is -128 and the last is 127
The first auto-boxed int is -128 and the last is 127
The first auto-boxed long is -128 and the last is 127
No Float are cached.
No Double are cached.
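For those who know that == compares references for Integer, or any other Object type, this is the effect of caching: == matches equals only when both values box to the same cached object.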
Run the program again with -XX:+AggressiveOpts on Java 7 and you get almost the same result.
All Boolean are cached.
All Byte are cached.
The first auto-boxed char is 0 and the last is 127
The first auto-boxed short is -128 and the last is 127
The first auto-boxed int is -128 and the last is 20000
The first auto-boxed long is -128 and the last is 127
No Float are cached.
No Double are cached.
Note: This option has increased the maximum for the Integer cache.
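
The source of AutoboxEqualsMain is not reproduced here, so the following is only a minimal sketch of how such a boundary could be probed. The class name AutoboxCacheProbe and its methods are hypothetical, and the sketch assumes that a value counts as cached when boxing it twice yields the same object.

```java
// Hypothetical sketch, not the author's AutoboxEqualsMain.
class AutoboxCacheProbe {

    // True when boxing the same int twice yields the very same object,
    // which is what makes == behave like equals() for that value.
    static boolean isCached(int value) {
        Integer first = value;   // auto-boxing calls Integer.valueOf(value)
        Integer second = value;
        return first == second;  // reference comparison, on purpose
    }

    // Walks upwards from 0 and returns the last value that is still cached,
    // giving up at the limit so the loop always terminates.
    static int lastCached(int limit) {
        int last = 0;
        for (int i = 0; i <= limit && isCached(i); i++) {
            last = i;
        }
        return last;
    }

    public static void main(String[] args) {
        System.out.println("Last cached int (searched up to 1,000,000): "
                + lastCached(1_000_000));

        Integer a = 1000, b = 1000;
        System.out.println("a == b      : " + (a == b));    // depends on the cache size
        System.out.println("a.equals(b) : " + a.equals(b)); // always true
    }
}
```

The value printed for the last cached int depends on the JVM and its options, which is exactly why code that relies on == for boxed values is fragile.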

The performance improvements

An int[] and an ArrayList<Integer> can take almost the same amount of memory when the values are cached. If you run this program, which builds both from the numbers 1 to 16,000, with -XX:-UseTLAB and -XX:+AggressiveOpts, you can get this result.
The int[16000] took 64592 bytes and new ArrayList() with 16000 values took 65048 bytes
This result is surprising for two reasons. Firstly, it is run in a 64-bit JVM. The ArrayList is backed by an array of references; however, the JVM can use 32-bit references for up to 32 GB of heap memory. Note: You can use direct memory and memory mapped files to use much more than 32 GB, but as long as your heap is smaller than 32 GB it can use 32-bit references.
Secondly, the objects themselves don't appear to take any space. This is because all the values are cached before the program starts, so you don't see the space they occupy. Note: because this is such a short test, the sizes you get vary with each run. I ran this test a few times and took the lowest result. The highest results were not much higher (within 20%).
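
The measuring program itself is not shown here. As a rough sketch of how such a comparison might be made, the following uses the difference in used heap before and after each allocation. The class BoxedMemorySketch and its methods are hypothetical, and the approach is an assumption rather than the original test: the figures it prints are approximate and can be zero or even negative if TLABs are enabled or a GC runs in between.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the measurement; the original program is not shown here.
class BoxedMemorySketch {

    // Heap currently in use, as reported by the runtime. With TLABs enabled this
    // only changes in coarse steps, which is why -XX:-UseTLAB helps here.
    static long usedMemory() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    static int[] buildArray(int count) {
        int[] values = new int[count];
        for (int i = 0; i < count; i++) {
            values[i] = i + 1;
        }
        return values;
    }

    static List<Integer> buildList(int count) {
        List<Integer> values = new ArrayList<Integer>(count); // presized: no regrowth
        for (int i = 1; i <= count; i++) {
            values.add(i); // auto-boxed; cached values allocate nothing new
        }
        return values;
    }

    public static void main(String[] args) {
        int count = 16000;

        long before = usedMemory();
        int[] array = buildArray(count);
        long arrayBytes = usedMemory() - before;

        before = usedMemory();
        List<Integer> list = buildList(count);
        long listBytes = usedMemory() - before;

        System.out.println("The int[" + array.length + "] took " + arrayBytes
                + " bytes and new ArrayList() with " + list.size()
                + " values took " + listBytes + " bytes");
    }
}
```

Running it with -XX:-UseTLAB, once with the default Integer cache and once with an enlarged one, should show the list shrinking towards the size of the array as fewer Integer objects need to be allocated.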

Auto-boxed objects don't always get garbage collected

If you use auto-boxed objects as keys in a WeakHashMap, they might never be garbage collected.
Map<Long, String> keyValueMap = new WeakHashMap<Long, String>(10000);
for (long i = 1; i <= 8192; i *= 2)
    keyValueMap.put(i, "key-" + i);
System.out.println("Before GC: " + keyValueMap);
System.gc();
System.out.println("After GC: " + keyValueMap);
Nothing else appears to hold the keys of the map, so you would expect them all to be cleaned up on a GC. What might be surprising is that not all of them are: the keys that survive are the ones still held by the auto-boxing cache.
Before GC: {8192=key-8192, 4096=key-4096, 2048=key-2048, 1024=key-1024, 512=key-512, 256=key-256, 
    128=key-128, 64=key-64, 32=key-32, 16=key-16, 8=key-8, 4=key-4, 2=key-2, 1=key-1}
After GC: {64=key-64, 32=key-32, 16=key-16, 8=key-8, 4=key-4, 2=key-2, 1=key-1}
Note: I increased the initial capacity of the WeakHashMap so that the keys would appear in a predictable order.
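
Because the result of System.gc() is not guaranteed, it can help to check directly which keys are held by the cache. The following is a minimal sketch with a hypothetical class CachedLongKeys; it assumes a key is cached when boxing the same long twice yields the same object.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: lists the keys from the loop above that Long.valueOf
// serves from its cache, and which therefore stay strongly reachable.
class CachedLongKeys {

    static List<Long> cachedPowersOfTwo(long max) {
        List<Long> cached = new ArrayList<Long>();
        for (long i = 1; i <= max; i *= 2) {
            Long first = i;   // auto-boxing calls Long.valueOf(i)
            Long second = i;
            if (first == second) { // same object twice: it came from the cache
                cached.add(first);
            }
        }
        return cached;
    }

    public static void main(String[] args) {
        System.out.println("Cached keys: " + cachedPowersOfTwo(8192));
    }
}
```

On a JVM where the Long cache covers -128 to 127, this should list the keys 1 to 64, the same keys that remain in the "After GC" output above.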

Don't use auto-boxed objects as locks

This is up there with not using String literals as locks.
synchronized("lock one") { // DON'T do this.
But what can make using auto-boxed values worse is that your program can work fine when tested, provided the values locked on are small. When your program happens to use a larger value, the locking suddenly fails.
Object o = 
Integer lock = o.hashCode() & 255;
synchronized(lock)  { // DON'T do this either.
This could look like a clever way to have locks for all objects which are equal. However, it's very brittle and obscure, and you could find your program fails unexpectedly in ways which are difficult to reproduce, e.g. depending on whether you have used -XX:+AggressiveOpts. If you have more than one library or section of code which does this, you could get bizarre deadlocks between unrelated code.
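
If the goal really is one lock per group of equal objects, a common alternative is lock striping over lock objects that the class creates and owns itself. The following is only a sketch of that technique, not something taken from the original program; the class StripedLocks, its constructor argument and lockFor are hypothetical names.

```java
// Hypothetical sketch: lock striping with lock objects the class owns itself,
// instead of borrowing shared Integer instances from the auto-boxing cache.
class StripedLocks {

    private final Object[] locks;

    StripedLocks(int stripes) {
        // stripes is assumed to be a power of two so that masking works
        if (stripes <= 0 || (stripes & (stripes - 1)) != 0) {
            throw new IllegalArgumentException("stripes must be a power of two");
        }
        locks = new Object[stripes];
        for (int i = 0; i < stripes; i++) {
            locks[i] = new Object(); // private to this instance, never shared
        }
    }

    // Equal objects have equal hash codes, so they always map to the same lock.
    Object lockFor(Object o) {
        return locks[o.hashCode() & (locks.length - 1)];
    }

    public static void main(String[] args) {
        StripedLocks striped = new StripedLocks(256);
        String key = "account-42";
        synchronized (striped.lockFor(key)) {
            System.out.println("Holding the lock for " + key);
        }
    }
}
```

Because each StripedLocks instance has its own lock objects, two unrelated libraries using this pattern cannot block each other, and the behaviour does not change with the size of the auto-boxing cache.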

The code






Comments

  1. I don't think that's correct: "Java uses comparison of values for Float, and == is true if the values are the same but the objects are different."

    It just looks like there is no cache (see Float.valueOf(float f)).

    Float F1 = 0.0f;
    Float F2 = 0.0f;
    float f1 = 0.0f;
    float f2 = 0.0f;
    System.out.println(f1 == f2);
    System.out.println(f1 == F2);
    System.out.println(F1 == f2);
    System.out.println(F1 == F2);

    I get: true, true, true, false.

    1. @Thomas Mueller no, == never compares the values of two different objects.
      Case 1:
      Float f1 = 1.1f;
      Float f2 = 1.1f;
      System.out.println(f1 == f2);
      Output: true
      Case 2:
      Float f1 = new Float(1.1f);
      Float f2 = new Float(1.1f);
      System.out.println(f1 == f2);
      Output: false

      You got true in case 1 because the JVM points f1 and f2 at the same Float pool or something similar; two memory locations are created for the two objects only if you explicitly use the new operator. The JVM does this in the first case to avoid the overhead of creating too many objects on the heap when we don't need them, as wrapper objects cannot be modified. So use the new operator with wrapper classes only if you really need to.

  2. @Thomas Mueller, Hmmmm, Thank you for the correction. I tested this but the test was wrong.

  3. Moreover, you can raise the upper bound of the Integer cache using the java.lang.Integer.IntegerCache.high system property (in Java 6, not in Java 5).

  4. This is why for any object, even the wrappers, you'll want to use .equals()

  5. WeakHashMap explanation:

    WeakHashMap does the job correctly (even with long). This is not a WeakHashMap issue, nor an auto-boxing issue.
    This is an effect of the JVM's "Long" cache (boxed long values between -128 and 127 are cached). So if you use a long with a value between -128 and 127 as your key, the JVM always has a reference to the boxed value (in its cache). If you use a new "Long" instance instead, the WeakHashMap will be fully garbage collected because there are no more references to those "Long" objects.

    With the following snippet (which uses "long"), the WeakHashMap will be fully garbage collected because the index starts from 128 (higher than 127). Also try playing with the start value (e.g. 127, 126) to see how it influences the remaining entries!

    Map<Long, String> keyValueMap = new WeakHashMap<Long, String>(10000);

    // Start above 127 so that no key comes from the Long cache.
    for (long index = 128; index <= 10000; index++) {
        keyValueMap.put(index, "key-" + index);
    }
    System.out.println("Before GC: size=" + keyValueMap.size() + " / " + keyValueMap);
    for (int g = 0; g < 100; g++) {
        System.gc();
        System.out.println("After GC: size=" + keyValueMap.size() + " / " + keyValueMap);
    }

    Note: The size() method does not always reflect the real entry count in the WeakHashMap!


    One last thing: using "new Long(34)" is not the same as using "Long.valueOf(34)". With "Long.valueOf" the JVM tries to use a cached value; read the Javadoc.

    Hope it helps

