Recycling objects to improve performance

Overview

In a previous article I stated that the reason the deserialisation of objects was faster was that it used recycled objects. This is potentially surprising for two reasons: 1) the common belief that creating objects is so fast these days that it doesn't matter, or is just as fast as recycling them yourself; 2) none of the serialisation libraries use recycling by default.

This article explores deserialisation with and without recycling objects. It shows that not only is it slower to create objects, but doing so slows down the rest of your program by pushing data out of your CPU caches.

While this article talks about deserialisation, the same applies to parsing text or reading binary files, as the actions being performed are the same.

The test

In this test, I deserialise 1000 Price objects, but also time how long it takes to copy a block of data. The copy represents work which the application might have to perform after deserialising.
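The shape of one test run can be sketched as follows. This is a sketch only; the buffer size, class and method names are assumptions, and the original benchmark code is linked below under "The code".

```java
// Sketch of one timed run: deserialisation, then a block copy representing
// the "other work" an application performs afterwards. Names and sizes are
// assumptions, not the original benchmark code.
public class RunTimer {
    static final byte[] SRC = new byte[64 * 1024]; // block size is an assumption
    static final byte[] DST = new byte[64 * 1024];

    // Returns { deserialiseNanos, copyNanos } for a single run.
    static long[] timeOneRun(Runnable deserialise1000Prices) {
        long t0 = System.nanoTime();
        deserialise1000Prices.run();          // deserialise the 1000 Price objects
        long t1 = System.nanoTime();
        System.arraycopy(SRC, 0, DST, 0, SRC.length); // the follow-on work
        long t2 = System.nanoTime();
        return new long[] { t1 - t0, t2 - t1 };
    }
}
```

Timing the copy separately is what exposes the cache effect: the copy itself does no allocation, so any slowdown it shows is collateral damage from the deserialisation phase.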

The test is timed one million times and those results are sorted. The X-axis shows the percentile timing, e.g. the 90% value is the time below which 90% of runs fall (or, equivalently, 10% of values are higher).
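The percentile values can be derived from the raw timings like this (a minimal sketch; the plotted data comes from the original benchmark, not this code):

```java
import java.util.Arrays;

public class Percentiles {
    // Given raw timings in nanoseconds, return the value at the given
    // percentile, e.g. percentile(timings, 90) is the time 90% of runs
    // were at or below.
    static long percentile(long[] timings, int pct) {
        long[] sorted = timings.clone();
        Arrays.sort(sorted);
        int index = Math.min(sorted.length - 1, sorted.length * pct / 100);
        return sorted[index];
    }
}
```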

As you can see, the deserialisation takes longer if it has to create objects as it goes, and sometimes it takes much, much longer. This is perhaps not so surprising, as creating objects means doing more work and possibly being delayed by a GC. However, the increase in the time to copy a block of data is surprising. It demonstrates that not only is the deserialisation slower, but any work which needs the data cache is also slower as a result (which is just about anything you might do in a real application).
Performance tests rarely show you the impact on the rest of your application.

In more detail

Examining the higher percentiles (longest times), you can see that performance is consistently bad if the deserialisation has to wait for the GC.
The time taken by the copy also increases significantly in the worst case.

The code

Recycling example code
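The linked code has the full example; a minimal sketch of the recycling idea follows. The field names and wire layout here are assumptions, not the original code.

```java
import java.nio.ByteBuffer;

// Sketch: deserialising into recycled objects vs. creating new ones.
public class RecycleBench {
    static final int PRICES = 1000;

    // Recycling: overwrite the fields of pre-allocated objects in place,
    // so the loop performs no allocation at all.
    static void deserialiseRecycled(ByteBuffer in, Price[] recycled) {
        in.rewind();
        for (Price p : recycled)
            p.readFrom(in);
    }

    // Creating: allocate a fresh object for every record read.
    static Price[] deserialiseCreating(ByteBuffer in) {
        in.rewind();
        Price[] prices = new Price[PRICES];
        for (int i = 0; i < PRICES; i++) {
            prices[i] = new Price();   // allocation on every record
            prices[i].readFrom(in);
        }
        return prices;
    }

    static class Price {
        long instrumentId;
        double bid, ask;

        void writeTo(ByteBuffer out) {
            out.putLong(instrumentId).putDouble(bid).putDouble(ask);
        }

        void readFrom(ByteBuffer in) {
            instrumentId = in.getLong();
            bid = in.getDouble();
            ask = in.getDouble();
        }
    }
}
```

The recycled variant touches the same small set of objects on every run, which is what keeps them (and the rest of the working set) in the CPU caches.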

Comments

  1. Hi, could you please explain what the x-axis in these graphs is?

    Cheers!

  2. @Maksim Sipos, I have added this comment to the post

    "The test is timed one million times and those results are sorted. The X-axis shows the percentile timing, e.g. the 90% value is the time below which 90% of runs fall (or, equivalently, 10% of values are higher)."

  3. This is an interesting benchmark.

    The thing I always try to keep in mind after reading an article like this is how it applies to a real-life scenario in which your application is doing other things.

    You quote a common Java developer belief: "creating objects is so fast these days, it doesn't matter or is just as fast as recycling yourself". I guess this is probably true in most use cases.

  4. @Andre, when I code I make sure all the data structures taken from input (e.g. sockets) are recyclable, with the convention that any mutable data to be retained has to be copied.

    Creating immutable objects is often simpler and fast enough for most use cases. But if you are creating too much garbage or need to go faster, there is something you can do about it.

  5. Good article, but a question: even when you call 'readResolve', the object has already been created by the deserialization process, right? i.e., readResolve being an instance method, it has to be called on an object, which means the process does create an object; so object reuse will only help reduce the cost of the data copy, or am I missing something?

  6. readResolve only reduces the size of the resulting object. It does mean you create temporary objects but these are relatively cheap.

    In this benchmark I use completely custom serialization and don't use ObjectInput/OutputStream so I don't have this restriction.

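The convention described in comment 4 above (recycle objects read from input; copy any mutable data you need to retain) can be sketched as follows. The names here are illustrative, not from the original code.

```java
// Sketch of the copy-on-retain convention: the instance passed to the
// callback is recycled by the reader, so anything kept past the callback
// must be a copy.
public class RetainByCopy {
    static class Price {
        long instrumentId;
        double bid, ask;

        // Deep copy for callers that need the data after the recycled
        // instance has been overwritten by the next read.
        Price copy() {
            Price c = new Price();
            c.instrumentId = instrumentId;
            c.bid = bid;
            c.ask = ask;
            return c;
        }
    }

    static Price lastRetained;

    static void onPrice(Price recycled) {
        // Do NOT store 'recycled' directly: its fields will change
        // on the next message. Copy before retaining.
        lastRetained = recycled.copy();
    }
}
```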
