Chronicle Map and the Yahoo! Cloud Serving Benchmark

Overview

The Yahoo! Cloud Serving Benchmark (YCSB) is a reasonably widely used tool for benchmarking key-value stores with a significant number of keys, e.g. 100 million, and a modest number of clients, i.e. served from one machine.

In this article I look at how a test of 100 million 1 KB key/values performed using Chronicle Map on a single machine with 128 GB of memory, dual Intel E5-2650 v2 CPUs @ 2.60 GHz, and six Samsung 840 EVO SSDs.

The 1 KB value consists of ten fields of 100 byte Strings. For a more efficient solution, primitive numbers would be a better choice. While the SSDs helped, the peak transfer rate was 700 MB/s, which could be supported by two SATA SSD drives.
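As a rough check on these figures, the arithmetic can be sketched in a few lines of Java. The class and method names are hypothetical, and the sketch assumes one byte per character and ignores keys and any per-entry overhead, so it only gives a lower bound on the data set size.

```java
public class DataSetSizing {
    // Figures taken from the text above.
    public static final long ENTRIES = 100_000_000L;
    public static final int FIELDS_PER_VALUE = 10;
    public static final int BYTES_PER_FIELD = 100; // assumes one byte per character

    /** Size of one value in bytes, ignoring the key and any overhead. */
    public static long valueBytes() {
        return (long) FIELDS_PER_VALUE * BYTES_PER_FIELD;
    }

    /** Raw size of all values in bytes. */
    public static long totalValueBytes() {
        return ENTRIES * valueBytes();
    }

    /** Transfer rate each drive must sustain if the load is spread evenly. */
    public static double perDriveMBps(double peakMBps, int drives) {
        return peakMBps / drives;
    }

    public static void main(String[] args) {
        System.out.printf("value size : %d bytes%n", valueBytes());
        System.out.printf("raw values : %.0f GB%n", totalValueBytes() / 1e9);
        System.out.printf("per drive  : %.0f MB/s across 2 drives%n", perDriveMBps(700, 2));
    }
}
```

Under these assumptions the raw values alone come to about 100 GB, which is most of the machine's 128 GB of memory.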

These benchmarks were performed using the latest version at the time of the report, Chronicle Map 2.0.6a-SNAPSHOT.

Micro-second world

Something which confounds me when reading benchmarks about key-value stores is that they start with the premise that performance is really important. IMHO, about 90% of the time, performance is not the most important feature, provided you have sufficient performance.
These benchmark reports then go on to report times in milli-seconds, not micro-seconds, and throughputs in the tens of thousands instead of the hundreds of thousands or millions. If performance really was that important, they would have built their products around performance, instead of the useful features they do support, like multi-key transactionality, quorum updates and other features Chronicle Map doesn't support, for performance reasons.

So how would a key-value store built for performance look with YCSB?

Throughput measures

The "50/50" test performs 50% random reads and 50% random writes; the "95/5" test performs 95% reads and 5% writes. Writes are expected to be more expensive, so a higher percentage of reads should result in higher throughput.
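To illustrate what such a mix means, here is a minimal sketch that drives a plain ConcurrentHashMap with a chosen read fraction. The map is only a stand-in for the store, not Chronicle Map, and the class name, key range and uniform key selection are assumptions for illustration; YCSB's own workload generator is not reproduced here.

```java
import java.util.Random;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class WorkloadMix {
    /**
     * Performs ops operations on the map, choosing a read with probability
     * readFraction and an update otherwise. Returns the number of reads.
     */
    public static long run(ConcurrentMap<Long, String> map, long keys,
                           double readFraction, long ops, long seed) {
        Random random = new Random(seed);
        long reads = 0;
        for (long i = 0; i < ops; i++) {
            long key = (long) (random.nextDouble() * keys); // uniform key choice (assumption)
            if (random.nextDouble() < readFraction) {
                map.get(key);
                reads++;
            } else {
                map.put(key, "value-" + i);
            }
        }
        return reads;
    }

    public static void main(String[] args) {
        ConcurrentMap<Long, String> map = new ConcurrentHashMap<>();
        long ops = 1_000_000;
        System.out.println("50/50 reads: " + run(map, 100_000, 0.50, ops, 1));
        System.out.println("95/5  reads: " + run(map, 100_000, 0.95, ops, 2));
    }
}
```

This sketch only shows how the mix of operations is selected; it says nothing about the throughput of either map implementation.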

Threads   50/50 read/update   95/5 read/update
1         122 K/s             262 K/s
2         235 K/s             496 K/s
4         339 K/s             910 K/s
8         565 K/s             1.010 M/s
15        973 K/s             1.445 M/s
30        816 K/s             1.787 M/s
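One way to read these figures is as speedup over the single-threaded result. The sketch below hard-codes the numbers from the table above (in thousands of operations per second) and derives speedup and per-thread efficiency; the class and method names are hypothetical.

```java
public class ScalingFromTable {
    // Values copied from the throughput table, in K ops/s.
    public static final int[] THREADS = {1, 2, 4, 8, 15, 30};
    public static final double[] MIX_50_50_KOPS = {122, 235, 339, 565, 973, 816};
    public static final double[] MIX_95_5_KOPS = {262, 496, 910, 1010, 1445, 1787};

    /** Throughput at row i relative to the single-threaded throughput. */
    public static double speedup(double[] kops, int i) {
        return kops[i] / kops[0];
    }

    /** Speedup divided by thread count; 1.0 would be perfect linear scaling. */
    public static double efficiency(double[] kops, int i) {
        return speedup(kops, i) / THREADS[i];
    }

    public static void main(String[] args) {
        System.out.printf("%7s %14s %14s%n", "threads", "50/50 speedup", "95/5 speedup");
        for (int i = 0; i < THREADS.length; i++) {
            System.out.printf("%7d %14.2f %14.2f%n",
                    THREADS[i], speedup(MIX_50_50_KOPS, i), speedup(MIX_95_5_KOPS, i));
        }
    }
}
```

The table already shows the 50/50 throughput falling between 15 and 30 threads while the 95/5 throughput keeps rising; the efficiency figure just makes the diminishing return per thread explicit.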

Latencies

The following latencies are in micro-seconds, not milli-seconds.

Threads: 8   50/50 read   95/5 read   50/50 update   95/5 update
average      5.7 µs       4.9 µs      13 µs          12.9 µs
95th         15 µs        13 µs       27 µs          25 µs
99th         25 µs        30 µs       44 µs          47 µs
worst        52 ms        52 ms       52 ms          52 ms

Note: the benchmark is not designed to be GC free and creates some garbage. The amount of garbage is not particularly high, and the benchmark itself uses only about 1/4 of the CPU according to Java Flight Recorder; however, it does impact the worst latencies.
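To see why one long pause can dominate the worst latency while barely moving the percentiles, consider this minimal sketch. It uses the nearest-rank method on a sorted array, which is an assumption for illustration and not necessarily how YCSB computes its percentiles; the sample data is made up, with a single 52 ms outlier among 1,000 samples.

```java
import java.util.Arrays;

public class LatencyPercentiles {
    /** Nearest-rank percentile of an ascending-sorted array, percent in 1..100. */
    public static long percentile(long[] sorted, int percent) {
        if (sorted.length == 0 || percent < 1 || percent > 100) {
            throw new IllegalArgumentException("need samples and a percent in 1..100");
        }
        // rank = ceil(percent / 100 * n), computed with integer arithmetic
        long rank = ((long) percent * sorted.length + 99) / 100;
        return sorted[(int) rank - 1];
    }

    /** Made-up sample: 999 latencies of 1..999 µs plus one 52 ms pause. */
    public static long[] sampleWithOnePause() {
        long[] micros = new long[1000];
        for (int i = 0; i < 999; i++) {
            micros[i] = i + 1;
        }
        micros[999] = 52_000;
        Arrays.sort(micros);
        return micros;
    }

    public static void main(String[] args) {
        long[] micros = sampleWithOnePause();
        System.out.println("95th : " + percentile(micros, 95) + " us");
        System.out.println("99th : " + percentile(micros, 99) + " us");
        System.out.println("worst: " + percentile(micros, 100) + " us");
    }
}
```

In this made-up sample the single pause sets the worst case but leaves the 95th and 99th percentiles untouched, which is the pattern in the latency table above.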

Conclusion

Make sure the key-value store has the features you need, but if performance is critical, look for a solution designed for performance, as this can be 100x faster than full-featured products.

Other high performance examples

Aerospike benchmark - Single server benchmark with over 1 M TPS, sub-milli-second latencies. Uses smaller 100 byte records.
NuoDB benchmark - Supports transactions across a quorum. 24 nodes for 1 M TPS.
Oracle NoSQL benchmark - A couple of years old, uses a lot of threads, otherwise a good result.
VoltDB benchmark - Not tested to 1 M TPS, but promising. Latencies around 1-2 ms, report has 99th percentile latencies which others don't include.

Room for improvement

MongoDB driver benchmark - Reports latencies in 1000s of micro-seconds instead of milli-seconds.
Cassandra, HBase, Redis - Shows you can get 1 million TPS if you use enough servers, 288 nodes for 1 M TPS.
Report including Elasticsearch - Report includes runtime in a "resource Austere Environment"
Hyperdex - Covers throughput only.
WhiteDB - Reports latencies in micro-seconds for 170 K records, and modest throughputs.
Benchmark including Aerospike - Reports

Footnote

Using smaller values helps, and we suggest trying to make values closer to 100 bytes. This is the result of the 95/5 workload B, using 10 x 10 byte fields and 50 M entries, as the Aerospike benchmark does (30 clients).

[OVERALL], RunTime(ms), 60,669
[OVERALL], Throughput(ops/sec), 3,296,576
[READ], Operations, 190002671
[READ], AverageLatency(us), 4.81
[READ], MinLatency(us), 0
[READ], MaxLatency(us), 74864
[READ], 95thPercentileLatency(ms), 0.009
[READ], 99thPercentileLatency(ms), 0.014
[READ], Return=0, 115841209
[READ], Return=1, 74161462
[UPDATE], Operations, 9997309
[UPDATE], AverageLatency(us), 12.23
[UPDATE], MinLatency(us), 1
[UPDATE], MaxLatency(us), 75015
[UPDATE], 95thPercentileLatency(ms), 0.017
[UPDATE], 99thPercentileLatency(ms), 0.028
[UPDATE], Return=0, 9997309
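The summary figures in this output can be cross-checked against each other. The sketch below hard-codes the operation counts and run time reported above and recomputes the overall throughput and the read fraction; the class and method names are hypothetical.

```java
public class YcsbSummaryCheck {
    // Values copied from the YCSB output above.
    public static final long RUN_TIME_MS = 60_669;
    public static final long READS = 190_002_671L;
    public static final long READS_RETURN_0 = 115_841_209L;
    public static final long READS_RETURN_1 = 74_161_462L;
    public static final long UPDATES = 9_997_309L;

    /** Operations per second for a given operation count and run time. */
    public static double throughput(long operations, long runTimeMs) {
        return operations * 1000.0 / runTimeMs;
    }

    /** Fraction of all operations that were reads. */
    public static double readFraction(long reads, long updates) {
        return (double) reads / (reads + updates);
    }

    public static void main(String[] args) {
        long total = READS + UPDATES;
        System.out.printf("total ops    : %,d%n", total);
        System.out.printf("throughput   : %,.0f ops/sec%n", throughput(total, RUN_TIME_MS));
        System.out.printf("read fraction: %.4f%n", readFraction(READS, UPDATES));
        System.out.println("read return codes add up: "
                + (READS_RETURN_0 + READS_RETURN_1 == READS));
    }
}
```

The recomputed throughput agrees with the reported 3,296,576 ops/sec to within one operation per second, and the read fraction comes out at very nearly 95%, as expected for a 95/5 workload.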





