SharedHashMap vs Redis
Overview
This is a comparison between OpenHFT's SharedHashMap and a popular key-value store, Redis.
Any vendor will tell you how great their product is, so I will start by outlining why you wouldn't use SharedHashMap, before I tell you why it is a "must have" for performant applications.
Why you would use Redis?
Redis is a more mature database, relatively widely used, and it includes:
- Support for multiple languages.
- Access over TCP to remote clients.
- A command line management tool.
- It outperforms many other key-value stores.
Why you would use OpenHFT's SharedHashMap?
You need to maximise performance in Java. It outperforms Redis, and many other popular key-value stores, by more than an order of magnitude in Java.
Why does SharedHashMap outperform Redis?
It is designed for performance from the start, by being as lightweight as possible.
- It acts as an embedded data store, even across multiple processes. You don't pay the price of TCP messaging via the kernel.
- It was designed to be used in Java in a pauseless, garbage-free manner.
- It is written in Java, for Java.
But C is faster than Java?
C is only faster when you compare like for like, and even then, not always. However, if you compare an embedded data store written in Java to one which must pass over TCP and translate between languages, the embedded data store is much faster.
How much difference does it make?
Benchmarks can be bent to suit any argument. Vendor benchmarks tend to give you the most optimistic numbers because vendors know what their product does best. The simplest benchmarks for key-value stores are much the same, and one of them is to start with an empty data store and set lots of small key-values. That at least gives you an idea of the best you can hope for. Your use case is likely to be slower, with more complex requirements.
Setting millions of key-values on a 16 core server with 128 GB of memory.
| |Redis|SharedHashMap|
|Single threaded|~10K updates/sec|~3M updates/sec|
|Multi-threaded|~100K updates/sec|~30M updates/sec|
The numbers are approximate, but they were measured on the same machine for the same operations in Java (using Jedis to connect to Redis). YMMWV.
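The shape of such a test (start with an empty store, set lots of small key-values, once with one thread and once with several) can be sketched as follows. This is only an illustration and not the benchmark used for the table: it uses a plain `ConcurrentHashMap` as a stand-in store, and the class and method names (`PutBenchmarkSketch`, `fill`, `run`) are made up for this example. Timings from a sketch like this say nothing about SharedHashMap or Redis.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: ConcurrentHashMap stands in for the store under test.
public class PutBenchmarkSketch {

    /** Puts `count` small entries, starting at key index `from`. */
    public static void fill(Map<String, String> map, int from, int count) {
        for (int i = from; i < from + count; i++) {
            map.put("key-" + i, "value-" + i);
        }
    }

    /** Splits the puts evenly across `threads` workers and returns updates per second. */
    public static double run(Map<String, String> map, int threads, int total) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        int perThread = total / threads;
        long start = System.nanoTime();
        for (int t = 0; t < threads; t++) {
            final int from = t * perThread; // each worker writes its own key range
            pool.execute(() -> fill(map, from, perThread));
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("benchmark interrupted", e);
        }
        long elapsedNanos = System.nanoTime() - start;
        return perThread * threads * 1e9 / elapsedNanos;
    }

    public static void main(String[] args) {
        int total = 1_000_000;
        for (int threads : new int[] {1, 4}) {
            Map<String, String> map = new ConcurrentHashMap<>(); // start empty each time
            double rate = run(map, threads, total);
            System.out.printf("%d thread(s): %,.0f updates/sec, %d entries%n",
                    threads, rate, map.size());
        }
    }
}
```

A real comparison would also need warm-up runs, a fixed entry size and the same access pattern for every store, which is why the figures above are described as approximate.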
Accurate and impartial benchmarks are a myth, but they do accurately serve you by giving a vendor's view of a product. OpenHFT's view of SharedHashMap is that it is designed for Java and is performant in a way that many popular key-value stores cannot match. If you need to maximise the efficiency of your Java system, you should be considering OpenHFT's data management products.
Just a question: were the test case's puts async or synchronous (if Redis supports async put at all)? A standard in-process put is, afaik, ~10 million/second single threaded on an i7, so the 'price' of sharing is a factor of 3.
For the asynchronous case a remote put should not be that much slower as long as bandwidth is not an issue. In fact async puts can be in the millions as well (ofc with higher latency).
"But C is faster than Java". Can you please give some hint on a 'like for like' comparison? By embedded data store, are you referring to intrinsic methods in Java?
Btw: there are numerous blogs comparing performance between C & Java, however, for like minded readers http://vanillajava.blogspot.com/2011/08/java-can-be-significantly-faster-than-c.html is a good read.
Ah, thanks for the reply :-). SharedHashMap looks like a very valuable tool, especially as servers grow in size & memory but GC requires Java processes to stay <4 GB. It could be very interesting to interconnect several smaller Java processes on a big box.
BTW google has no errors, its just eventually consistent :-)
ConcurrentHashMap, compared with SharedHashMap, is about 3x faster, assuming you have plenty of memory (as it uses up to 5x as much), plenty of CPU (as it produces a lot more heap usage and garbage) and you don't need sharing between processes or persistence.
SHM's updates are synchronous to memory and if the process dies, they are not lost. It is asynchronous to disk, so if the power is lost, you can lose changes.
Redis' C benchmark is about 3x faster than Java, but I can't explain why that is about 1% of the performance.
My 2c: 1. Redis can be used by multiple remote clients whereas, if I understand correctly, SHM is local to the server, and 2. if you run Redis locally (i.e. on the same server as the app) you can use local sockets instead of the relatively expensive TCP stack.
Correct. As noted, Redis supports remote clients, which SHM doesn't, yet. The tests above were performed over loopback for Redis. I haven't tried it on a network yet.
Please provide more info on how to use SHM. Is it a good use case to load all master data at application startup?
SHM is persisted, so you shouldn't have to reload it each time, but you can do that if you need to.
Hello! It's not a fair comparison. Redis cannot run in the same process.
Although 100K looks strange. Look at OrientDB http://www.orientechnologies.com/orientdb/
It's not just key-value.
It is ACID compliant. It can store up to 150,000 records (on disk) per second on a notebook.
"OrientDB is an Open Source NoSQL DBMS with the features of both Document and Graph DBMSs. It's written in Java and it's amazingly fast: it can store up to 150,000 records per second on common hardware."
The 100K/second was using Redis' own test. It is possible OrientDB is 50% faster, or used a machine which is 50% faster. I went to some length to point out it is not a direct comparison, but then again, if you don't need the TCP layer, you shouldn't have to pay the price. Is it fair that Redis doesn't support embedded processing in Java when it can make such a huge difference?
OrientDB sounds interesting, I will take a look. Is it GC free?
>Is it GC free?
I cannot say for sure. It was not designed as an in-memory database, but it can create a database in memory only. In this case off-heap memory is used.
An interesting comparison with Mongo http://www.orientechnologies.com/orientdb-vs-mongodb/.
Firstly, it is interesting to me as an object database.
The products look interesting, esp in features. The pricing is pretty reasonable as well.
How does this compare with Lightning Memory Mapped Database (http://symas.com/mdb/, http://parleys.com/play/515727e0e4b0c779d7881428/chapter0/about)?
Planning to try this out. My key values are JSON objects, from a few kB up to hundreds of kB for a few of them. However, I don't need the kind of performance that you have published.
As LMDB notes, it is not in the same league. Many DB products provide features we don't have such as transactions. This means we are *much* faster but if you don't need the speed, perhaps transactions are worth having. ;) For example, in the test provided, they get 10K writes per second for 10 M records. Chronicle Map gets 400 K writes per second for 4 billion records. We can get closer to 30 million updates per second if the records easily fit into memory. If the application dies, no data is lost, but if the whole machine dies, data can be lost and that is a key difference.
Thank you very much for the article and for all the replies. I have some follow-up questions.
- First, I would like to ask how many threads were involved in the benchmarks for the multi-threaded results shown in the article.
- Secondly I would like to know how many threads were used for the following reply you have kindly given: "Chronicle Map gets 400 K writes per second for 4 billion records. We can get closer to 30 million updates per second if the records easily fit into memory."
On the machine with 16 cores, 32 logical threads, I use 32 threads. The benchmark was not intended to be a definitive comparison as the size of entry, the access pattern and the choice of hardware all make a difference. Both products have had performance improvements since the article was written. The only conclusion you can draw is that for at least one test the performance can be very different. In fact Redis doesn't claim millions of operations per second under any use case.
The same number of threads was used for the larger test. We get about 36 million updates per second for the same test & hardware now.
The performance of the out of memory benchmark depends on the speed of the disk sub-system and we have achieved 900K updates per second with faster disk storage. We are looking to work with hardware vendors who can give us transaction rates much closer to in memory speed even if the data set is 10x main memory size or more.
I really appreciated the speed of the answer, and the clarifications are really useful.
I'll try to be short with some more clarifying questions.
- Why is there such a huge difference between the number of writes (400K writes/sec) and the number of updates (millions/sec)? Doesn't an update require a read and a write?
I got similar numbers for ChronicleMap (millions/sec for read requests and hundreds of thousands/sec for write requests). However, shouldn't updates be slowed down by the write part?
...or maybe I understood wrongly what it is meant by update.
Chronicle Map writes data to memory and lets the OS write the data to disk asynchronously. If you need data to be committed we do that via replication rather than to disk. When you are accessing data which fits in memory you do this at in-memory speeds. When you randomly access data which doesn't fit in memory, you are dependent on the speed at which the data can be retrieved from disk.
An update does a read and write and it is the read which actually is slow when you update out of memory records which are written asynchronously.
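One standard way to get "synchronous to memory, asynchronous to disk" behaviour in Java is a memory-mapped file from `java.nio`. The sketch below is an assumption-laden illustration of that mechanism only; the thread does not show how SharedHashMap or Chronicle Map are implemented, and the class and method names (`MappedWriteSketch`, `map`, `writeAndReadBack`) are invented for this example. A write through one mapping is read back through a second mapping of the same file, and `force()` is the explicit request to flush to disk.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch of a memory-mapped write; not the library's own code.
public class MappedWriteSketch {

    /** Maps the first `size` bytes of `file` read-write. The mapping outlives the channel. */
    public static MappedByteBuffer map(Path file, int size) throws IOException {
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            return channel.map(FileChannel.MapMode.READ_WRITE, 0, size);
        }
    }

    /** Writes `value` through one mapping and reads it back through another. */
    public static long writeAndReadBack(long value) {
        try {
            Path file = Files.createTempFile("mapped-sketch", ".dat");
            file.toFile().deleteOnExit();
            MappedByteBuffer writer = map(file, 4096);
            MappedByteBuffer reader = map(file, 4096); // second view of the same file

            writer.putLong(0, value);        // a plain memory write, no system call
            long seen = reader.getLong(0);   // read via the other mapping

            writer.force();                  // optional: ask the OS to flush to disk now
            return seen;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("read back: " + writeAndReadBack(42L));
    }
}
```

On mainstream operating systems both mappings share the same pages, so the value should be visible as soon as it is written, although the Java specification leaves this system dependent. Without the `force()` call the OS decides when the pages reach the disk, which matches the trade-off described above: a process crash loses nothing, a power failure can.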
All right. Things are getting clearer now. Thanks.
So, regarding the 400K writes/sec: what assumptions do you have for them? Are they also committed asynchronously, and on what kind of systems are they written?
Is there an official set of results you have obtained for updates/writes/reads on different subsystems (RAM only, disk involved, other storage systems)?
The assumption for the out-of-memory test was random access of a data set twice the size of main memory, i.e. 256 GB on a 128 GB machine. The writes were 100-byte entries. All the tests were for asynchronous commits.
The choice of disk subsystem wasn't very important. I compared RAM only (tmpfs) vs an OCZ RevoDrive v2 vs a stripe set of Samsung 840 EVO drives and achieved a similar result of 30 - 36 million writes per second for data sets which fit in memory, depending on the number of keys (rather than the choice of disk subsystem). Fewer keys, e.g. 50 million, was slightly faster at 36 M/s; 500 million keys gave 30 M/s.