Posts

Showing posts from 2012

Can synchronization be optimised away?

Overview There is a common misconception that, because the JIT is smart and synchronization can be eliminated for an object which is local to a method, there is no performance impact. A test comparing StringBuffer and StringBuilder These two classes do basically the same thing, except one is synchronized (StringBuffer) and the other is not. It is also a class which is often used in one method to build a String. The following test attempts to determine how much difference using one or the other can make.

    static String dontOptimiseAway = null;
    static String[] words = new String[100000];

    public static void main(String... args) {
        for (int i = 0; i < words.length; i++)
            words[i] = Integer.toString(i);
        for (int i = 0; i < 10; i++) {
            dontOptimiseAway = testStringBuffer();
            dontOptimiseAway = testStringBuilder();
        }
    }

    private static String testStringBuffer() {
        long start = System.nanoTime();
        StringBuffer sb = new StringBuffer();
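The excerpt above is cut off, so here is a minimal, self-contained sketch along the same lines. The field and word-array setup follow the excerpt; the timing helpers are my own reconstruction, not the author's full benchmark.

```java
// Sketch (not the article's full code): time building a String from many
// words with StringBuffer (synchronized) vs StringBuilder (unsynchronized).
public class BufferVsBuilder {
    static String dontOptimiseAway; // prevents the JIT discarding the result

    static long timeStringBuffer(String[] words) {
        long start = System.nanoTime();
        StringBuffer sb = new StringBuffer();
        for (String w : words) sb.append(w).append(' ');
        dontOptimiseAway = sb.toString();
        return System.nanoTime() - start;
    }

    static long timeStringBuilder(String[] words) {
        long start = System.nanoTime();
        StringBuilder sb = new StringBuilder();
        for (String w : words) sb.append(w).append(' ');
        dontOptimiseAway = sb.toString();
        return System.nanoTime() - start;
    }

    public static void main(String... args) {
        String[] words = new String[100000];
        for (int i = 0; i < words.length; i++)
            words[i] = Integer.toString(i);
        for (int i = 0; i < 10; i++) // repeat so the JIT warms up
            System.out.printf("StringBuffer %,d ns, StringBuilder %,d ns%n",
                    timeStringBuffer(words), timeStringBuilder(words));
    }
}
```

Absolute numbers will vary by JVM and hardware; the point of the article is that the synchronized version is not free even when the object never escapes the method.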

Local variables inside a loop and performance

Overview Sometimes a question comes up about how much work allocating a new local variable takes. My feeling has always been that the code becomes optimised to the point where this cost is static, i.e. done once, not each time the code is run. Recently Ishwor Gurung suggested considering moving some local variables outside a loop. I suspected it wouldn't make a difference, but I had never tested whether this was the case. The test This is the test I ran:

    public static void main(String... args) {
        for (int i = 0; i < 10; i++) {
            testInsideLoop();
            testOutsideLoop();
        }
    }

    private static void testInsideLoop() {
        long start = System.nanoTime();
        int[] counters = new int[144];
        int runs = 200 * 1000;
        for (int i = 0; i < runs; i++) {
            int x = i % 12;
            int y = i / 12 % 12;
            int times = x * y;
            counters[times]++;
        }
        long time = System.nanoTime() - start;
        System.out.printf("Inside: Average loop

Object resurrection

Overview After an object which overrides finalize() is collected, it is added to a finalization queue to be cleaned up after calling the finalize() method of each object. But what happens if you resurrect the object? When is finalize called? The finalize method is called by a single-threaded system task which calls this method for each object which has been collected. Note: the nodes in the finalization queue are objects which notionally also have finalize() methods. Objects cannot be cleaned up until the GC after they have been finalized. Most objects (including the nodes in the finalization queue) don't override finalize(), and the GC is smart enough to detect this and not add them to the queue. These objects can be cleaned up immediately. If you override the method, even with an empty one, it makes a difference. What about resurrected objects? In the finalize() method, you can resurrect the object by making something point to it, e.g. a static collection. This ob
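A minimal sketch of resurrection in finalize(). The class and field names are my own illustration; note that finalization is deprecated in recent JDKs and the GC call is only a hint, so the resurrection is not guaranteed to be observed on any particular run.

```java
// Sketch of object resurrection: in finalize(), store `this` in a static
// field so the otherwise-unreachable object becomes reachable again.
public class Resurrect {
    static Resurrect saved; // resurrection target (illustrative name)

    @Override
    protected void finalize() {
        saved = this; // the object is reachable again after this line
    }

    public static void main(String... args) throws InterruptedException {
        new Resurrect();     // immediately unreachable
        System.gc();         // request a collection (a hint only)
        Thread.sleep(100);   // give the finalizer thread a chance to run
        System.out.println("resurrected: " + (saved != null));
    }
}
```

As the post notes, a resurrected object has already been finalized once, so finalize() will not be called for it again.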

Over one million page views

At some point recently I got more than one million page views, including mirror sites. Thank you all for your interest and encouragement.

A use for overflow

There is a simple but expensive problem in cryptography called Modular Exponentiation, which can be written as: calculate y = ( p ^ n ) mod ( 2 ^ 32 ), where p is a prime number and n is a large integer. Using BigInteger You can do this the obvious way using BigInteger:

    BigInteger answer = BigInteger.valueOf(prime)
                                  .pow(number)
                                  .mod(BigInteger.valueOf(2).pow(32));

This works, but is very expensive for large numbers. It can take minutes or hours. Fortunately BigInteger has a method which is useful in this situation. Instead of generating a very large number only to take the modulus back into a smaller one, it has a combined method, BigInteger modPow(BigInteger power, BigInteger mod). This method is much faster and can take milliseconds. Using Overflow If you look at the magic number 2^32, it appears an odd choice at first. The trick is to realise this is the same as & 0xFFFFFFFFL or just
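A sketch of both approaches described so far: the combined modPow() call, and a plain-long version exploiting the 2^32 modulus, where each multiplication keeps only the low 32 bits with & 0xFFFFFFFFL. The values of p and n here are arbitrary examples.

```java
import java.math.BigInteger;

// Two ways to compute p^n mod 2^32.
public class ModPowDemo {
    // Overflow trick: long multiplication is already mod 2^64, which
    // preserves the low 32 bits exactly, so masking after each step
    // keeps the running result mod 2^32.
    static long modPow2_32(long p, int n) {
        long result = 1;
        for (int i = 0; i < n; i++)
            result = (result * p) & 0xFFFFFFFFL;
        return result;
    }

    public static void main(String... args) {
        BigInteger p = BigInteger.valueOf(31);         // a small prime
        BigInteger n = BigInteger.valueOf(1000);       // the exponent
        BigInteger mod = BigInteger.ONE.shiftLeft(32); // 2^32

        System.out.println(p.modPow(n, mod));          // combined, fast method
        System.out.println(modPow2_32(31, 1000));      // same result via overflow
    }
}
```

A real implementation would use exponentiation by squaring rather than a linear loop, but the masking idea is the same.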

Performance of inlined virtual method invocations in Java

Overview One of the benefits of dynamic compilation is the ability to support extensive method inlining on virtual method code. While inlining the code improves performance, the code still has to check the type (in case it has changed since it was optimised) or select between multiple possible implementations. This leads to the question: how does having multiple implementations of a method, called via an interface, impact performance? Benchmarking This benchmark tests a trivial method invoked on the elements of a list. It compares different numbers of classes in the list to see how it performs, but also varies the amount the loop is unrolled so that the number of possible classes is reduced. e.g. for a list of 12 different classes in a repeating pattern, without unrolling, the method invocation could be to any of the 12 classes, but when the loop is unrolled to a degree of two, each call has only 6 possible classes (12/2). When unrolled to a degree of three, there are 4 possibl

When using direct memory can be faster

Overview Using direct memory is no guarantee of improving performance. Given it adds complexity, it should be avoided unless you have a compelling reason to use it. This excellent article by Sergio Oliveira Jr shows it's not simply a matter of using direct memory to improve performance: Which one is faster: Java heap or native memory? Where direct memory and memory mapped files can help is when you have large amounts of data and/or you have to perform some IO with that data. Time series data Time series data tends to have both a large number of entries and involve IO to load and store the data. This makes it a good candidate for memory mapped files and direct memory. I have provided an example here; main and tests, where the same operations are performed on regular objects and on memory mapped files. Note: I am not suggesting that access to the objects is slow, but rather that the overhead of using objects is the issue, e.g. loading, creating, the size of the objec
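A minimal sketch of the memory-mapped-file idea for time-series data: fixed-size binary entries written straight into a mapped buffer, with no object per entry. The entry layout (8-byte timestamp plus 8-byte value) and all names are my own illustration, not the article's example code.

```java
import java.io.File;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

// Writes fixed-size time-series entries into a memory-mapped file.
public class MappedSeries {
    static final int ENTRY = 16; // 8-byte timestamp + 8-byte value per entry

    // Writes `count` entries and returns the last value read back.
    static double writeAndReadBack(File file, int count) throws Exception {
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            MappedByteBuffer bb = raf.getChannel()
                    .map(FileChannel.MapMode.READ_WRITE, 0, (long) count * ENTRY);
            for (int i = 0; i < count; i++) {
                bb.putLong(i * ENTRY, 1_000_000_000L + i); // timestamp
                bb.putDouble(i * ENTRY + 8, i * 0.5);      // value
            }
            return bb.getDouble((count - 1) * ENTRY + 8);
        }
    }

    public static void main(String... args) throws Exception {
        File file = File.createTempFile("series", ".dat");
        file.deleteOnExit();
        System.out.println("last value: " + writeAndReadBack(file, 1000));
    }
}
```

Because the data lives in the mapped file rather than on the heap, loading and storing it involves no per-entry object creation or deserialization.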

Wasting time by saving memory

Overview Memory and disk space are getting cheaper all the time, but the cost of an hour's development is increasing. Often I see people trying to save memory or disk space which literally isn't worth worrying about. I have a tendency to do this myself, because I can, not because it is a good use of my time. ;) Costs for comparison Cheap memory - You can buy 16 GB of memory for £28. Expensive memory - You can buy 1 GB in a phone for £320 (the entire cost of the phone). Cheap disk space - You can buy 3 TB of disk for £120. Expensive disk space - You can buy 1 TB of the fastest RAID-1 PCI SSD for £2000. The living wage in London is £8.55. You might say that at your company the hardware is 10x more expensive, but it is also likely your time is costing the company about the same multiple more. In any case, this article attempts to demonstrate that there is a tipping point where it no longer makes sense to spend time saving memory, or even thinking about it. time spent cheap memor

Why is it not allowed in Java to overload Foo(Object…) with Foo(Object[])?

This is from Why is it not allowed in Java to overload Foo(Object…) with Foo(Object[])? Question I was wondering why it is not allowed in Java to overload Foo(Object[] args) with Foo(Object... args), though they are used in a different way? Foo(Object[] args) {} is used like:

    Foo(new Object[]{new Object(), new Object()});

while the other form:

    Foo(Object... args) {}

is used like:

    Foo(new Object(), new Object());

Is there any reason behind this? Answer JLS 15.12.2.5 Choosing the Most Specific Method talks about this, but it's quite complex, e.g. choosing between Foo(Number... ints) and Foo(Integer... ints). In the interests of backward compatibility, these are effectively the same thing:

    public Foo(Object... args) {} // syntactic sugar for Foo(Object[] args) {}
    // calls the varargs method.
    Foo(new Object[]{new Object(), new Object()});

e.g. you can define main() as

    public static void main(String ..
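A compilable sketch of the equivalence: a single varargs method accepts both call styles, which is exactly why the pair cannot coexist as overloads. The class and method names are my own illustration.

```java
// A varargs parameter compiles to an array parameter, so both call styles
// resolve to the same method - and declaring both forms would be a
// duplicate-method compile error.
public class VarargsDemo {
    static int count(Object... args) { // compiles to count(Object[])
        return args.length;
    }

    public static void main(String... args) {
        System.out.println(count(new Object(), new Object()));  // varargs style
        System.out.println(count(new Object[]{                  // array style
                new Object(), new Object()}));
    }
}
```

Adding `static int count(Object[] args)` to this class would fail to compile, because after erasure of the syntactic sugar it has an identical signature.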

Practical uses for WeakReferences

This is based on answers to Is there a practical use for weak references? Question Since weak references can be claimed by the garbage collector at any time, is there any practical reason for using them? Answers If you want to keep a reference to something only as long as it is used elsewhere, e.g. a Listener, you can use a weak reference. WeakHashMap can be used as a short-lived cache of keys to derived data. It can also be used to keep information about objects used elsewhere when you don't know when those objects are discarded. BTW Soft References are like Weak References, but they will not always be cleaned up immediately. The GC will always discard Weak References when it can and retain Soft References when it can. There is another kind of reference called a Phantom Reference. This is used in the GC clean-up process and refers to an object which isn't accessible to "normal" code because it's in the process of being cleaned up.
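A minimal sketch of the WeakHashMap-as-cache use mentioned above. The key and value here are arbitrary illustrations; the point is that the entry lives only as long as the key is strongly referenced elsewhere.

```java
import java.util.Map;
import java.util.WeakHashMap;

// WeakHashMap as a short-lived cache: once the key is no longer strongly
// referenced anywhere else, the GC may drop the whole entry.
public class WeakCacheDemo {
    public static void main(String... args) throws InterruptedException {
        Map<Object, String> cache = new WeakHashMap<>();
        Object key = new Object();
        cache.put(key, "derived data for key");
        System.out.println("before: " + cache.size()); // 1 while key is held

        key = null;        // drop the only strong reference
        System.gc();       // a hint; collection is not guaranteed
        Thread.sleep(100);
        System.out.println("after: " + cache.size());  // usually 0 by now
    }
}
```

With a regular HashMap the entry would be retained forever, a classic source of memory leaks in listener registries and caches.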

Why Double.NaN==Double.NaN is false

This is taken from the top two answers to Why does Double.NaN==Double.NaN return false? Question I was just studying OCPJP questions and I found this strange code:

    public static void main(String a[]) {
        System.out.println(Double.NaN == Double.NaN);
        System.out.println(Double.NaN != Double.NaN);
    }

When I ran the code, I got:

    false
    true

How is the output false when we're comparing two things that look the same as each other? What does NaN mean? Answer NaN is by definition not equal to any number, including NaN. This is part of the IEEE 754 standard and implemented by the CPU/FPU. It is not something the JVM has to add any logic to support. http://en.wikipedia.org/wiki/NaN A comparison with a NaN always returns an unordered result, even when comparing with itself. ... The equality and inequality predicates are non-signaling, so x = x returning false can be used to test if x is a quiet NaN. Java treats all NaN as quiet NaN. Java
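The quoted "x = x returning false" trick can be shown directly; this is the same self-comparison test that Double.isNaN() uses internally. The helper name here is my own.

```java
// x != x is true only when x is NaN, so it can serve as a NaN test.
public class NanDemo {
    static boolean isNaN(double x) {
        return x != x; // true only for NaN, per IEEE 754
    }

    public static void main(String... args) {
        System.out.println(isNaN(Double.NaN)); // true
        System.out.println(isNaN(0.0 / 0.0));  // true: 0/0 produces NaN
        System.out.println(isNaN(1.0));        // false
    }
}
```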

Java Intrinsics and Performance

The original question was How to count the number of 1's a number will have in binary? I included a performance comparison of using Integer.bitCount(), which can be turned into an intrinsic, i.e. a single machine code instruction POPCNT, and the Java code which does the same thing. Question How do I count the number of 1's a number will have in binary? So let's say I have the number 45, which is equal to 101101 in binary and has 4 1's in it. What's the most efficient way to write an algorithm to do this? Answer Instead of writing an algorithm to do this, it's best to use the built-in function Integer.bitCount(). What makes this especially efficient is that the JVM can treat it as an intrinsic, i.e. recognise and replace the whole thing with a single machine code instruction on a platform which supports it, e.g. Intel/AMD. To demonstrate how effective this optimisation is:

    public static void main(String... args) {
        perfTestIntrinsic();
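A small sketch comparing the intrinsic Integer.bitCount() with a plain-Java loop doing the same work. The loop here is my own illustration of "the Java code which does the same thing", not the article's benchmark code.

```java
public class BitCountDemo {
    // Naive equivalent: test each of the 32 bits in turn. Correct but many
    // instructions, where the intrinsic compiles to a single POPCNT.
    static int bitCountLoop(int n) {
        int count = 0;
        for (int i = 0; i < 32; i++)
            count += (n >>> i) & 1;
        return count;
    }

    public static void main(String... args) {
        System.out.println(Integer.bitCount(45)); // 45 = 0b101101 -> 4
        System.out.println(bitCountLoop(45));     // same result, more work
    }
}
```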

Java += and implicit casting

This is from two popular answers to the question Java += operator Question Until today I thought that, for example:

    i += j;

is just a shortcut for:

    i = i + j;

But what if we try this:

    int i = 5;
    long j = 8;

Then i = i + j; will not compile, but i += j; will compile fine. Does it mean that in fact i += j; is a shortcut for something like this: i = (type of i)(i + j)? I've tried googling for it but couldn't find anything relevant. Answers As always with these questions, the JLS holds the answer. In this case §15.26.2 Compound Assignment Operators. An extract: A compound assignment expression of the form E1 op= E2 is equivalent to E1 = (T)((E1) op (E2)), where T is the type of E1, except that E1 is evaluated only once. And an example: For example, the following code is correct:

    short x = 3;
    x += 4.6;

and results in x having the value 7 because it is equivalent to:

    short x = 3;
    x = (short)(x + 4.6);

In other words
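Both examples from the quoted JLS extract can be run together in one sketch:

```java
// The hidden cast in compound assignment: E1 op= E2 means E1 = (T)((E1) op (E2)).
public class CompoundAssignDemo {
    public static void main(String... args) {
        short x = 3;
        x += 4.6;              // compiles: equivalent to x = (short)(x + 4.6)
        System.out.println(x); // prints 7

        int i = 5;
        long j = 8;
        // i = i + j;          // does not compile: int + long is long
        i += j;                // compiles: equivalent to i = (int)(i + j)
        System.out.println(i); // prints 13
    }
}
```

The implicit narrowing is convenient but can silently discard information, which is why the plain `i = i + j;` form is rejected by the compiler.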

Moving the decimal place in a double

This is taken from a popular answer to the question Moving decimal places over in a double Question So I have a double set to equal 1234; I want to move a decimal place over to make it 12.34. So to do this I multiply 1234 by .1 two times, kinda like this:

    double x = 1234;
    for (int i = 1; i <= 2; i++) {
        x = x * .1;
    }
    System.out.println(x);

This will print the result, "12.340000000000002". Is there a way, without simply formatting it to two decimal places, to have the double store 12.34 correctly? Answer If you use double or float, you should use rounding or expect to see some rounding errors. If you can't do this, use BigDecimal. The problem you have is that 0.1 is not an exact representation, and by performing the calculation twice, you are compounding that error. However, 100 can be represented accurately, so try:

    double x = 1234;
    x /= 100;
    System.out.println(x);

which prints:

    12.34

This works because Dou
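Both versions from the answer, side by side in one runnable sketch:

```java
// Dividing by 100 once (exactly representable) vs multiplying by 0.1 twice
// (not exactly representable, so the error compounds).
public class DecimalShiftDemo {
    public static void main(String... args) {
        double x = 1234;
        x /= 100;              // 100 is exact in binary floating point
        System.out.println(x); // prints 12.34

        double y = 1234;
        y = y * .1 * .1;       // 0.1 is inexact; two multiplies compound the error
        System.out.println(y); // prints 12.340000000000002
    }
}
```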

Can try/finally prevent a StackOverflowError?

This post is taken from a popular answer to the question try-finally block prevents StackOverflowError Question Take a look at the following two methods:

    public static void foo() {
        try {
            foo();
        } finally {
            foo();
        }
    }

    public static void bar() {
        bar();
    }

Running bar() clearly results in a StackOverflowError, but running foo() does not (the program just seems to run indefinitely). Why is that? Answer It doesn't run forever. Each stack overflow causes the code to move to the finally block. The problem is that it will take a really, really long time. The order of time is O(2^N) where N is the maximum stack depth. Imagine the maximum depth is 5:

    foo() calls
    foo() calls
    foo() calls
    foo() calls
    foo() which fails to call
    foo() finally calls
    foo() which fails to call
    foo() finally
    foo() calls
    foo() which f
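The O(2^N) growth can be made concrete with a bounded version of foo() that counts its own invocations. This sketch is my own; the depth check stands in for the StackOverflowError that ends each descent in the real program.

```java
// A bounded foo(): each call recurses twice (try body and finally block),
// so the total call count for depth N is 2^(N+1) - 1.
public class TryFinallyDemo {
    static long calls;

    static void foo(int depth) {
        calls++;
        if (depth == 0) return; // stand-in for the StackOverflowError
        try {
            foo(depth - 1);
        } finally {
            foo(depth - 1);     // runs even after the "overflow"
        }
    }

    public static void main(String... args) {
        for (int depth : new int[]{1, 5, 20}) {
            calls = 0;
            foo(depth);
            System.out.println("depth " + depth + ": " + calls + " calls");
        }
    }
}
```

With a real stack depth of around 10,000, 2^10000 calls is why the program merely *seems* to run forever.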

Hidden code

Overview Some time ago I came across the issue of invisible characters in Strings. These can really cause confusion, because they are invisible.

    String a = "Hello\u200e";
    String b = "Hello\u200f";
    System.out.println('\'' + a + "' and '" + b + "' are length "
                     + a.length() + " and " + b.length()
                     + ", equals() is " + a.equals(b));

prints

    'Hello‎' and 'Hello‏' are length 6 and 6, equals() is false

Invisible identifiers Imagine my reaction to discovering you can use invisible characters in identifiers in Java :P. These characters cannot appear at the start of identifiers in Java but can appear anywhere else.

    System.out.println("String _\u200e = \"Hello \";");
    System.out.println("String _\u200f = \"World\";");
    System.out.println("String _\u200e\u200f = \" !!\"

String intern puzzle

A String-based puzzle for you. The following program:

    String te = "te", st = "st";
    // "test".length();
    String username = te + st;
    username.intern();
    System.out.println("String object the same is: "
        + (username == "test"));

prints under Java 7 update 7:

    String object the same is: true

but uncomment the "test".length(); line, or run with Java 6, and it prints:

    String object the same is: false

Can you work out why?

Map interface and equals()

The Map interface cannot have duplicate keys and implies that equals() is used to determine this:

    boolean containsKey(Object key)
    Returns true if this map contains a mapping for the specified key. More formally, returns true if and only if this map contains a mapping for a key k such that (key==null ? k==null : key.equals(k)). (There can be at most one such mapping.)

The Maps which support this are HashMap, LinkedHashMap, ConcurrentHashMap, Hashtable and WeakHashMap. The Maps which don't use equals() are IdentityHashMap, TreeMap, ConcurrentSkipListMap and EnumMap. All have different reasons as to why they violate the contract, but it strikes me as not a great rule when it is violated as often as it is followed.
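The difference is easy to observe with two keys that are equals() but not the same object; this sketch is my own illustration of the contrast.

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

// HashMap uses equals(), so two equal-but-distinct keys collapse into one
// entry; IdentityHashMap uses ==, so they remain two entries.
public class MapEqualsDemo {
    public static void main(String... args) {
        String k1 = new String("key");
        String k2 = new String("key"); // k1.equals(k2) but k1 != k2

        Map<String, Integer> hash = new HashMap<>();
        hash.put(k1, 1);
        hash.put(k2, 2);
        System.out.println("HashMap size: " + hash.size());             // 1

        Map<String, Integer> identity = new IdentityHashMap<>();
        identity.put(k1, 1);
        identity.put(k2, 2);
        System.out.println("IdentityHashMap size: " + identity.size()); // 2
    }
}
```

TreeMap and ConcurrentSkipListMap violate the contract differently again, using compareTo()/a Comparator rather than equals() to decide key equality.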

Java memes which refuse to die

Also titled: My pet hates in Java coding. There are a number of Java memes which annoy me, partly because they were always a bad idea, but mostly because people still keep picking them up years after there are better alternatives. Using StringBuffer instead of StringBuilder The Javadoc for StringBuffer from 2004 states: As of release JDK 5, this class has been supplemented with an equivalent class designed for use by a single thread, StringBuilder. The StringBuilder class should generally be used in preference to this one, as it supports all of the same operations but it is faster, as it performs no synchronization. Not only is StringBuilder a better choice, the occasions where you could have used a synchronized StringBuffer are so rare, it's unlikely it was ever a good idea. Say you had the code:

    // run in two threads
    sb.append(key).append("=").append(value).append(", ");

Each append is thread safe, but the lock could be released at any point, meaning you c

Uses for special characters in Java code

Overview Ever wondered how you can write code like this in Java?

    if( ⁀ ‿ ⁀ == ⁀ ⁔ ⁀ || ¢ + ¢ == ₡)

Background Underscores have long been used in C-like languages such as Java to distinguish fields and method names. It is common to see a leading underscore like _field or an underscore in a constant like UPPER_CASE. In Java the $ is also used in class names and accessor method names. The SCJP notes state: Identifiers must start with a letter, a currency character ($), or a connecting character such as the underscore (_). Identifiers cannot start with a number! This leads to the question: what other connecting characters are there? What are connecting characters? A connecting character joins two words together. This page lists ten connecting characters:

    U+005F LOW LINE _
    U+203F UNDERTIE ‿
    U+2040 CHARACTER TIE ⁀
    U+2054 INVERTED UNDERTIE ⁔
    U+FE33 PRESENTATION FORM FOR VER
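You can check which characters the language accepts in identifiers programmatically, which avoids putting the invisible-looking characters into source text at all. This sketch is my own; it probes the connecting characters listed above via java.lang.Character.

```java
// Probe Java's identifier rules for the connecting (Pc) characters above.
public class ConnectingCharsDemo {
    public static void main(String... args) {
        char[] connectors = {'\u005F', '\u203F', '\u2040', '\u2054', '\uFE33'};
        for (char c : connectors)
            System.out.printf("U+%04X start=%b part=%b%n", (int) c,
                    Character.isJavaIdentifierStart(c),   // valid first char?
                    Character.isJavaIdentifierPart(c));   // valid later char?
    }
}
```

Per the Character Javadoc, connecting punctuation is valid both at the start of and within an identifier, which is what makes expressions like `⁀ ‿ ⁀ == ⁀ ⁔ ⁀` compile.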