SharedHashMap vs Redis

May 26, 2014

Overview

This is a comparison between OpenHFT's SharedHashMap and Redis, a popular key-value store.

Any vendor will tell you how great their product is, so I will start by outlining why you wouldn't use SharedHashMap, before I tell you why it is a "must have" for performant applications.

Why would you use Redis?

Redis is a more mature database, relatively widely used, and it offers:

Support for multiple languages.
Access over TCP to remote clients.
A command line management tool.
Better performance than many other key-value stores.

Why would you use OpenHFT's SharedHashMap?

You need to maximise performance in Java. It outperforms Redis, and many other popular key-value stores, by more than an order of magnitude in Java.

Why does SharedHashMap outperform Redis?

It is designed for performance from the start, by being as lightweight as possible.

It acts as an embedded data store, even across multiple processes. You don't pay the price of TCP messaging via the kernel.
It was designed to be used in Java in a pauseless, garbage-free manner.
It is written in Java, for Java.
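
To make the "embedded, even across multiple processes" point a little more concrete, here is a minimal sketch of the general idea using only java.nio memory-mapped files. It is not SharedHashMap's API or implementation: the class name MappedSlots and its fixed layout of long slots are hypothetical, chosen only to show that two mappings of the same file can exchange data without a socket. It assumes a mainstream OS where shared mappings of one file are coherent, and it does no locking.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration only: NOT SharedHashMap's API or implementation.
final class MappedSlots implements AutoCloseable {
    private static final int SLOT_BYTES = Long.BYTES;
    private final FileChannel channel;
    private final MappedByteBuffer buffer;

    MappedSlots(Path file, int slots) throws IOException {
        channel = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.READ,
                StandardOpenOption.WRITE);
        // A READ_WRITE mapping is backed by the file itself, so other
        // mappings of the same file share the same data.
        buffer = channel.map(FileChannel.MapMode.READ_WRITE, 0, (long) slots * SLOT_BYTES);
    }

    // Writes go straight to the mapped memory; no message is sent over TCP.
    void put(int slot, long value) {
        buffer.putLong(slot * SLOT_BYTES, value);
    }

    long get(int slot) {
        return buffer.getLong(slot * SLOT_BYTES);
    }

    @Override
    public void close() throws IOException {
        channel.close();
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("mapped-slots", ".dat");
        file.toFile().deleteOnExit();
        // Two independent mappings stand in for two separate processes.
        try (MappedSlots writer = new MappedSlots(file, 16);
             MappedSlots reader = new MappedSlots(file, 16)) {
            writer.put(3, 42L);
            System.out.println(reader.get(3));
        }
    }
}
```

A real shared map also needs hashing, variable-sized entries and cross-process locking, none of which this sketch attempts.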

But C is faster than Java?

C is only faster when you compare like for like, and even then, not always. However, if you compare an embedded data store written in Java to one which must pass over TCP and translate between languages, the embedded data store is much faster.

How much difference does it make?

Benchmarks can be bent to suit any argument. Vendor benchmarks tend to give you the most optimistic numbers because vendors know what their product does best. The simplest benchmarks for key-value stores are much the same; one of them is to start with an empty data store and set lots of small key-values, which at least gives you an idea of the best you can hope for. Your use case is likely to be slower, with more complex requirements.

Setting millions of key-values on a 16-core server with 128 GB of memory:

	Redis	SharedHashMap
Single threaded	~10K updates/sec	~3M updates/sec
Multi-threaded	~100K updates/sec	~30M updates/sec

The numbers are approximate, but they were measured on the same machine for the same operations in Java (using Jedis to connect to Redis). YMMV.
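
For readers who want to try the same kind of test, the "start empty and set lots of small key-values" benchmark can be sketched roughly as below. This is not the harness used for the numbers above: the class name PutBenchmark is hypothetical, and a ConcurrentHashMap stands in for the store under test. To compare real stores you would pass in your own Map implementation or adapter. Note that the timing also includes building the key and value strings.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical harness: a stand-in sketch, not the benchmark behind the table.
final class PutBenchmark {

    // Fills the given (initially empty) map with `count` small key-values
    // and returns the observed updates per second.
    static double updatesPerSecond(Map<String, String> map, int count) {
        long start = System.nanoTime();
        for (int i = 0; i < count; i++) {
            map.put("key-" + i, "value-" + i);
        }
        // Guard against a zero elapsed time on very small runs.
        long elapsedNanos = Math.max(1, System.nanoTime() - start);
        return count * 1e9 / elapsedNanos;
    }

    public static void main(String[] args) {
        int count = 1_000_000;
        // One untimed pass so the JIT has compiled the loop before measuring.
        updatesPerSecond(new ConcurrentHashMap<>(), count);
        double rate = updatesPerSecond(new ConcurrentHashMap<>(), count);
        System.out.printf("%.0f updates/sec%n", rate);
    }
}
```

A single timed pass like this is only indicative; repeated runs give a better picture.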

Conclusion

Accurate and impartial benchmarks are a myth, but they do accurately give you a vendor's view of a product. OpenHFT's view of SharedHashMap is that it is designed for Java and is performant in a way that many popular key-value stores cannot match. If you need to maximise the efficiency of your Java system, you should be considering OpenHFT's data management products.

Comments

Rüdiger Möller, 27 May 2014 at 00:18
Just a question: were the test case "puts" async or synchronous (if Redis supports async put at all)? A standard in-process put is afaik ~10 million/second single threaded on an i7, so the 'price' of shared is div 3.
For the asynchronous case a remote put should not be that much slower as long as bandwidth is not an issue. In fact async put can be in the millions also (ofc with higher latency).
Anonymous, 27 May 2014 at 13:14
"But C is faster than Java". Can you pls. give some hint on 'like for like' comparison? By embedded data store, are you referring to intrinsic methods in java?

Btw: there are numerous blogs comparing performance between C & Java, however, for like minded readers http://vanillajava.blogspot.com/2011/08/java-can-be-significantly-faster-than-c.html is a good read.
Rüdiger Möller, 27 May 2014 at 13:46
Ah, thanks for the reply :-). SharedHashMap looks like a very valuable tool, especially as servers grow in size & mem, but GC requires Java processes to stay <4 GB. Could be very interesting to interconnect several smaller Java processes on a big box.
BTW Google has no errors, it's just eventually consistent :-)
Unknown, 29 May 2014 at 18:26
My 2c: 1. Redis can be used by multiple remote clients whereas, if I understand correctly, SHM are local to the server and 2. if you run Redis locally (i.e. on the same server as the app) you can use local sockets instead of the relatively expensive TCP stack.
Unknown, 31 May 2014 at 03:26
Please provide more info on how to use SHM. Is it a good use case to load all master data at application start-up?
CGen, 3 June 2014 at 06:02
Hello! It's not a fair comparison. Redis cannot run in the same process.

Although 100K looks strange. Look at the OrientDB http://www.orientechnologies.com/orientdb/
It's not just key-value.
It is ACID compliant. It can store up to 150,000 records (on disk) per second on a notebook.

"OrientDB is an Open Source NoSQL DBMS with the features of both Document and Graph DBMSs. It's written in Java and it's amazingly fast: it can store up to 150,000 records per second on common hardware."

Unknown, 26 August 2014 at 00:36
Peter
How does this compare with Lightning Memory Mapped Database (http://symas.com/mdb/, http://parleys.com/play/515727e0e4b0c779d7881428/chapter0/about)?

Planning to try this out. My key values are JSON objects, mostly a few kB, with a few being hundreds of kB. However, I don't need the kind of performance that you have published.

Vanilla Java