High performance libraries in Java

There is an increasing number of libraries which are described as high performance and have benchmarks to back that claim up. Here is a selection that I am aware of.

Disruptor library

http://code.google.com/p/disruptor/

LMAX aims to be the fastest trading platform in the world. Clearly, in order to achieve this we needed to do something special to achieve very low-latency and high-throughput with our Java platform. Performance testing showed that using queues to pass data between stages of the system was introducing latency, so we focused on optimising this area.
The Disruptor is the result of our research and testing. We found that cache misses at the CPU-level, and locks requiring kernel arbitration are both extremely costly, so we created a framework which has "mechanical sympathy" for the hardware it's running on, and that's lock-free.
The 6 million TPS benchmark was measured on a 3 GHz dual-socket quad-core Nehalem-based Dell server with 32 GB RAM.
http://martinfowler.com/articles/lmax.html
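
To make the idea of passing data between stages without locks a little more concrete, here is a minimal sketch of a single-producer, single-consumer ring buffer built only on the JDK. It is not the Disruptor API and makes no claim about how the Disruptor is implemented; the class name SpscRingBuffer and its methods are hypothetical, chosen for illustration. It assumes exactly one producer thread and one consumer thread.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongConsumer;

/** Illustrative single-producer, single-consumer ring buffer (not the Disruptor API). */
final class SpscRingBuffer {
    private final long[] slots;   // pre-allocated storage, so no allocation per message
    private final int mask;       // capacity - 1, valid because capacity is a power of two
    private final AtomicLong head = new AtomicLong(); // next sequence to read, advanced by the consumer
    private final AtomicLong tail = new AtomicLong(); // next sequence to write, advanced by the producer

    SpscRingBuffer(int capacity) {
        if (capacity <= 0 || Integer.bitCount(capacity) != 1)
            throw new IllegalArgumentException("capacity must be a power of two");
        slots = new long[capacity];
        mask = capacity - 1;
    }

    /** Producer side: returns false instead of blocking when the buffer is full. */
    boolean offer(long value) {
        long t = tail.get();
        if (t - head.get() == slots.length) return false; // full
        slots[(int) (t & mask)] = value;
        tail.set(t + 1); // volatile write publishes the slot to the consumer
        return true;
    }

    /** Consumer side: passes one value to the handler, or returns false if empty. */
    boolean poll(LongConsumer handler) {
        long h = head.get();
        if (h == tail.get()) return false; // empty
        handler.accept(slots[(int) (h & mask)]);
        head.set(h + 1); // volatile write frees the slot for the producer
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        SpscRingBuffer buffer = new SpscRingBuffer(1024);
        int messages = 1_000_000;
        Thread producer = new Thread(() -> {
            for (long i = 1; i <= messages; i++)
                while (!buffer.offer(i)) Thread.onSpinWait(); // busy-spin rather than lock
        });
        producer.start();
        long[] sum = {0};
        int received = 0;
        while (received < messages) {
            if (buffer.poll(v -> sum[0] += v)) received++;
            else Thread.onSpinWait();
        }
        producer.join();
        System.out.println("received " + received + " messages, sum = " + sum[0]);
    }
}
```

Neither side ever takes a lock or allocates per message; each thread only writes its own counter and reads the other's. Busy-spinning trades CPU for latency, which is only sensible when a core can be dedicated to each thread.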

Java Chronicle

https://github.com/peter-lawrey/Java-Chronicle
This library is an ultra-low-latency, high-throughput, persisted messaging and event-driven in-memory database. Typical latency is as low as 16 nanoseconds, and it supports throughputs of 5-20 million messages or record updates per second.
It uses almost no heap and has trivial GC impact. The data can be much larger than your physical memory (it is limited only by the size of your disk) and can be shared between processes with less than 1/10th of the latency of sockets over loopback.

It can change the way you design your system because it allows you to have independent processes which do not need to be running at the same time, as no messages are lost. This is useful for restarting services and for testing your services from canned data. In effect, it behaves like sub-microsecond durable messaging.
You can attach any number of readers, including tools to see the exact state of the data externally. For example, you can use od -t cx1 {file} to see the current state.
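
The general technique behind data that lives off-heap, survives restarts and can be inspected from outside the process is a memory-mapped file. The sketch below uses only the JDK's FileChannel.map; it is not the Java Chronicle API, and the class MappedLongLog and its file layout (an 8-byte entry count followed by 8-byte entries) are assumptions made for this example. It also ignores the ordering guarantees a real cross-process reader would need.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

/** Illustrative fixed-capacity log of long values in a memory-mapped file (not the Chronicle API). */
final class MappedLongLog {
    private static final int HEADER = 8;   // bytes 0..7 hold the number of entries written
    private final MappedByteBuffer buffer; // off-heap view of the file

    MappedLongLog(Path file, int maxEntries) {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.CREATE,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // A new file is extended with zeros, so the count starts at 0.
            // The mapping remains valid after the channel is closed.
            buffer = ch.map(FileChannel.MapMode.READ_WRITE, 0, HEADER + 8L * maxEntries);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Appends a value; throws IndexOutOfBoundsException once the mapping is full. */
    void append(long value) {
        int count = (int) buffer.getLong(0);
        buffer.putLong(HEADER + 8 * count, value);
        buffer.putLong(0, count + 1L); // update the count after the entry itself
    }

    long count() {
        return buffer.getLong(0);
    }

    long get(int index) {
        if (index < 0 || index >= count())
            throw new IndexOutOfBoundsException("index " + index + ", count " + count());
        return buffer.getLong(HEADER + 8 * index);
    }

    public static void main(String[] args) {
        Path file = Paths.get(System.getProperty("java.io.tmpdir"), "mapped-long-log-example.dat");
        MappedLongLog log = new MappedLongLog(file, 1024);
        log.append(System.nanoTime());
        // Running this again appends to the same file, so the count keeps growing.
        System.out.println("entries so far: " + log.count() + " in " + file);
    }
}
```

Because the entries live in the file rather than on the heap, a second process (or a tool such as od) can open the same file and see what has been written, and a restarted process picks up where the previous one stopped.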

Colt Matrix library

http://acs.lbl.gov/software/colt/
Scientific and technical computing, as, for example, carried out at CERN, is characterized by demanding problem sizes and a need for high performance at reasonably small memory footprint. There is a perception by many that the Java language is unsuited for such work. However, recent trends in its evolution suggest that it may soon be a major player in performance sensitive scientific and technical computing. For example, IBM Watson's Ninja project showed that Java can indeed perform BLAS matrix computations up to 90% as fast as optimized Fortran. The Java Grande Forum Numerics Working Group provides a focal point for information on numerical computing in Java. With the performance gap steadily closing, Java has recently found increased adoption in the field. The reasons include ease of use, cross-platform nature, built-in support for multi-threading, network friendly APIs and a healthy pool of available developers. Still, these efforts are to a significant degree hindered by the lack of foundation toolkits broadly available and conveniently accessible in C and Fortran.

The latest stable Colt release breaks the 1.9 Gflop/s barrier on JDK ibm-1.4.1, RedHat 9.0, 2x IntelXeon@2.8 GHz.

Javolution

http://javolution.org/
Javolution real-time goals are simple: To make your application faster and more time predictable! That being accomplished through:
  • High performance and time-deterministic (real-time) util / lang / text / io / xml base classes.
  • Context programming in order to achieve true separation of concerns (logging, performance, etc).
  • A testing framework addressing not only unit tests but also performance and regression tests as well.
  • Straightforward and low-level parallel computing capabilities with ConcurrentContext.
  • Struct and Union base classes for direct interfacing with native applications (e.g. C/C++).
  • World's fastest and first hard real-time XML marshalling/unmarshalling facility.
  • Simple yet flexible configuration management of your application.

Trove collections for primitives

http://trove.starlight-systems.com/

The Trove library provides high speed regular and primitive collections for Java.

The GNU Trove library has two objectives:

  • Provide "free" (as in "free speech" and "free beer"), fast, lightweight implementations of the java.util Collections API. These implementations are designed to be pluggable replacements for their JDK equivalents.
  • Provide primitive collections with similar APIs to the above. This gap in the JDK is often addressed by using the "wrapper" classes (java.lang.Integer, java.lang.Float, etc.) with Object-based collections. For most applications, however, collections which store primitives directly will require less space and yield significant performance gains.
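
To illustrate why storing primitives directly saves space, here is a minimal sketch of a growable list backed by an int[] instead of Integer objects. It is not Trove's API; the class name IntList and its methods are hypothetical and exist only for this example.

```java
import java.util.Arrays;

/** Illustrative growable list of int values stored without boxing (not the Trove API). */
final class IntList {
    private int[] data = new int[16]; // 4 bytes per element, no per-element object header
    private int size;

    void add(int value) {
        if (size == data.length) data = Arrays.copyOf(data, size * 2); // double when full
        data[size++] = value;
    }

    int get(int index) {
        if (index < 0 || index >= size)
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        return data[index];
    }

    int size() {
        return size;
    }

    /** Sums the elements by scanning one contiguous array, with no unboxing. */
    long sum() {
        long total = 0;
        for (int i = 0; i < size; i++) total += data[i];
        return total;
    }

    public static void main(String[] args) {
        IntList list = new IntList();
        for (int i = 0; i < 1_000_000; i++) list.add(i);
        System.out.println("size = " + list.size() + ", sum = " + list.sum());
    }
}
```

An ArrayList<Integer> holding the same values needs a reference per element plus an Integer object for most of them, so the primitive version typically uses several times less memory and is friendlier to the CPU cache.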

MG4J: Managing Gigabytes for Java™

http://mg4j.dsi.unimi.it/

MG4J (Managing Gigabytes for Java) is a free full-text search engine for large document collections written in Java. MG4J is a highly customisable, high-performance, full-fledged search engine providing state-of-the-art features (such as BM25/BM25F scoring) and new research algorithms.

Other links

Overview of 8 performance libraries

http://www.dzone.com/links/r/8_best_open_source_high_performance_java_collecti.html

Sometimes the collection classes in the JDK may not be sufficient. We may require a high performance hashtable, big arrays, etc. Check out the list of open source high performance collection libraries.

Serialization benchmark

http://code.google.com/p/thrift-protobuf-compare/wiki/Benchmarking

This is a comparison of some serialization libraries.

It's still hard to beat hand-coded serialization.

http://vanillajava.blogspot.com/2011/10/serialization-using-bytebuffer-and.html
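
As a rough idea of what hand-coded serialization looks like, here is a minimal sketch that writes and reads a record field by field with a ByteBuffer. The Trade class and TradeCodec are hypothetical names invented for this example, not code from the linked post, and the format assumes symbols whose UTF-8 encoding is shorter than 32 KB.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical record used only for this example. */
final class Trade {
    final long id;
    final String symbol;
    final double price;
    final int quantity;

    Trade(long id, String symbol, double price, int quantity) {
        this.id = id;
        this.symbol = symbol;
        this.price = price;
        this.quantity = quantity;
    }
}

/** Hand-coded reader and writer: fixed field order, no reflection, no class metadata. */
final class TradeCodec {
    static void write(ByteBuffer out, Trade t) {
        out.putLong(t.id);
        byte[] symbol = t.symbol.getBytes(StandardCharsets.UTF_8);
        out.putShort((short) symbol.length); // length prefix, assumes fewer than 32768 bytes
        out.put(symbol);
        out.putDouble(t.price);
        out.putInt(t.quantity);
    }

    static Trade read(ByteBuffer in) {
        long id = in.getLong();
        byte[] symbol = new byte[in.getShort()];
        in.get(symbol);
        double price = in.getDouble();
        int quantity = in.getInt();
        return new Trade(id, new String(symbol, StandardCharsets.UTF_8), price, quantity);
    }

    public static void main(String[] args) {
        ByteBuffer buffer = ByteBuffer.allocateDirect(64); // reusable off-heap buffer
        write(buffer, new Trade(1L, "AAPL", 101.5, 200));
        System.out.println("bytes written: " + buffer.position());
        buffer.flip(); // switch the buffer from writing to reading
        Trade copy = read(buffer);
        System.out.println(copy.id + " " + copy.symbol + " " + copy.price + " " + copy.quantity);
    }
}
```

The price of this approach is that the reader and writer must be kept in step by hand whenever a field is added or reordered, which is the work a general-purpose serialization library does for you.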

Comments

  1. Good compilation, and thanks for sharing this list, Peter; it is indeed useful. How about EhCache? I heard it's a good caching solution and can be used in high performance applications.

    Javin
  2. Have you looked at ojAlgo instead of Colt? I have seen benchmarks suggesting that it is dramatically faster.

