Showing posts from February, 2012

Benchmarking a slow machine

Overview
When performing benchmarks I usually reach for the fastest machine I can. The theory is that if speed matters, you will use a fast machine. Recently I tried benchmarking the slowest machine I have access to: a Pentium IV dual-core laptop with 4 GB of memory and a regular HDD. The assumption is that if this performs better than you need, it might not matter how fast your system is. In the article, I compare this laptop to a fast machine. The point is that if the slow machine is more than enough, you don't need to worry about hardware.

Latency test
The average round-trip latencies were 10 times slower and there were over 1000x more delayed messages. Even so, an average delay of around 4 microseconds is faster than many applications need. The average RTT latency was 3,442 ns. The 50 / 99 / 99.9 / 99.99 percentile latencies were 1,790 / 1,790 / 103,550 / 2,147,483,647 ns. There were 39,891 delays over 100 μs.

Throughput test
The biggest difference I saw in the
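Percentile latencies like those quoted above can be derived from a sorted array of samples. This is a minimal illustrative sketch, not the benchmark harness that produced the numbers above:

```java
import java.util.Arrays;

public class Percentiles {
    // Returns the sample at the given percentile (e.g. 99.99) by sorting
    // a copy of the samples, so the caller's array is left untouched.
    // Uses simple nearest-rank indexing; a real benchmark might interpolate.
    public static long percentile(long[] samples, double pct) {
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        int idx = (int) Math.min(sorted.length - 1, (long) (sorted.length * pct / 100.0));
        return sorted[idx];
    }

    public static void main(String[] args) {
        // Hypothetical RTT samples in nanoseconds, for illustration only.
        long[] rtts = {1_790, 1_790, 2_100, 103_550, 2_147_483_647L};
        for (double p : new double[]{50, 99, 99.9, 99.99})
            System.out.println(p + "%: " + percentile(rtts, p) + " ns");
    }
}
```

With small sample counts, the high percentiles collapse to the maximum, which is why latency tests need millions of samples to make a 99.99 percentile figure meaningful.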

Attending the LJC QCon London Meeting.

The meeting is at 6.30 on Thursday, March the 8th; it's free and happens to be the only night I am not already doing something for the next couple of weeks. So I'll be there. ;) QCon London - LJC User Group Meeting

Magazine layout appears to increase page views

It appears that using the magazine layout for this blog has really encouraged people to look at more of my articles. In the past I had a few popular articles, but this didn't translate into people reading my other articles; i.e. most of my hits for the month came from between one and three articles. With the new format, people appear to be reading more of my articles: I got over 110K hits in Feb 2012, but the top ten articles account for only 9% of these. It has also translated into a significant increase in revenue from ads. It almost pays for my internet connection. :D

File local access

Overview
I have used nested classes which access the private members of top-level classes for some time, and discovered that a top-level class can access the private members of nested classes. Recently, I discovered that nested classes can access the private members of other nested classes, i.e. where the two classes are not nested in each other, but share a top-level class.

Example
Perhaps private should be called "file local", c.f. package local. ;)

public interface MyApp {
    class Runner {
        public static void main(String... args) {
            // access a private member of another class
            // in the same file, but not nested.
            SomeEnum.VALUE1.value = "Hello World";
            System.out.println(SomeEnum.VALUE1);
        }
    }

    enum SomeEnum {
        VALUE1("value1"), VALUE2("value2"), VALUE3("value3");

        private String value;

        SomeEnum(final String value) {
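The excerpt cuts off mid-constructor, so here is a complete, runnable version of the same example; the constructor body and the toString() override are filled in as reasonable assumptions about what the original post contained:

```java
public interface MyApp {
    class Runner {
        public static void main(String... args) {
            // access a private member of another class
            // in the same file, but not nested.
            SomeEnum.VALUE1.value = "Hello World";
            System.out.println(SomeEnum.VALUE1);
        }
    }

    enum SomeEnum {
        VALUE1("value1"), VALUE2("value2"), VALUE3("value3");

        private String value;

        SomeEnum(final String value) {
            this.value = value;
        }

        @Override
        public String toString() {
            return value; // so the mutation above is visible when printed
        }
    }
}
```

Running `java 'MyApp$Runner'` prints "Hello World", demonstrating that Runner can both read and write SomeEnum's private field even though neither class is nested in the other.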

How much difference can thread affinity make

Overview
In the past, when I have performed performance tests using thread affinity, at best it didn't appear to make much difference. I recently developed a library to use thread affinity in a more controlled manner, and another library which makes use of threads working together in a tightly coupled manner. The latter is important because it appears to me that affinity makes the most difference when you have tightly coupled threads.

Results
The results indicate that System.nanoTime() impacts performance at this level on CentOS 6.2, as it did on CentOS 5.7. Using the JNI wrapper for RDTSC improved timings. From other tests I have done, on Ubuntu 11 it didn't appear to make as much difference. Using thread affinity without isolating CPUs improved latencies. Using thread affinity on isolated CPUs didn't improve latencies much, but did improve throughput. Using hyper-threading on a high performance system can be a practical option for key threads where a drop of 20% throughput
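The per-call cost of System.nanoTime(), which the results above suggest varies between kernels, can be estimated with a simple loop. This is a naive illustrative micro-benchmark, not the test harness used for the results above; a serious measurement would need warm-up runs and an isolated core:

```java
public class NanoTimeCost {
    // Rough estimate of the average cost of one System.nanoTime() call.
    public static long averageCallTimeNanos(int calls) {
        long blackhole = 0;
        long start = System.nanoTime();
        for (int i = 0; i < calls; i++)
            blackhole += System.nanoTime(); // accumulate so the JIT can't drop the loop
        long end = System.nanoTime();
        if (blackhole == 42)
            System.out.println(blackhole); // keep blackhole observably live
        return (end - start) / calls;
    }

    public static void main(String[] args) {
        System.out.println("~" + averageCallTimeNanos(10_000_000) + " ns per call");
    }
}
```

On Linux the answer depends heavily on the clocksource (tsc vs hpet vs acpi_pm), which is one plausible reason the same JVM behaves differently on CentOS and Ubuntu.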

High performance libraries in Java

There is an increasing number of libraries which are described as high performance and have benchmarks to back up that claim. Here is a selection that I am aware of.

Disruptor library
LMAX aims to be the fastest trading platform in the world. Clearly, in order to achieve this we needed to do something special to achieve very low latency and high throughput with our Java platform. Performance testing showed that using queues to pass data between stages of the system was introducing latency, so we focused on optimising this area. The Disruptor is the result of our research and testing. We found that cache misses at the CPU level, and locks requiring kernel arbitration, are both extremely costly, so we created a framework which has "mechanical sympathy" for the hardware it's running on, and that's lock-free. The 6 million TPS benchmark was measured on a 3 GHz dual-socket quad-core Nehalem-based Dell server with 32 GB RAM. http:
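The core idea described above, replacing locked queues with a pre-allocated ring buffer coordinated by sequence counters, can be sketched in plain Java. This is a heavily simplified single-producer/single-consumer illustration of the technique, not the Disruptor's actual API:

```java
import java.util.concurrent.atomic.AtomicLong;

// Minimal single-producer, single-consumer ring buffer in the spirit of
// the Disruptor: pre-allocated slots, no locks, only sequence counters.
public class SpscRing {
    private final long[] slots;
    private final int mask;
    private final AtomicLong head = new AtomicLong(); // next slot to write
    private final AtomicLong tail = new AtomicLong(); // next slot to read

    public SpscRing(int sizePowerOfTwo) {
        slots = new long[sizePowerOfTwo];
        mask = sizePowerOfTwo - 1;
    }

    public boolean offer(long value) {
        long h = head.get();
        if (h - tail.get() == slots.length)
            return false; // full
        slots[(int) (h & mask)] = value;
        head.lazySet(h + 1); // publish the slot with a cheap ordered write
        return true;
    }

    public Long poll() {
        long t = tail.get();
        if (t == head.get())
            return null; // empty
        long v = slots[(int) (t & mask)];
        tail.lazySet(t + 1); // free the slot for the producer
        return v;
    }

    public static void main(String[] args) {
        SpscRing ring = new SpscRing(8);
        for (long i = 0; i < 5; i++)
            ring.offer(i);
        for (Long v; (v = ring.poll()) != null; )
            System.out.println(v);
    }
}
```

The pre-allocated array keeps producer and consumer off the allocator and, with a suitable slot layout, off each other's cache lines; the real Disruptor adds batching, multiple consumers and cache-line padding on top of this idea.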

Using Java 7 to target much older JVMs

Overview
Java 5.0 and 6 had poor support for compiling classes to target older versions of Java. They always supported the previous version, but often no more. Even if you could compile for a previous version, you had to be careful not to use functionality which didn't exist in that version.

Java 7
Java 7 addresses both these issues. Firstly, it supports sources back to 1.2 and targets back to Java 1.1. Secondly, it warns unless you set the bootclasspath, so you can include the version of the libraries you will be using for that target.

public class Main {
    public static void main(String[] args) {
        System.out.println("Hello World!");
    }
}

$ javac -target 1.7 -source 1.7
$ javac -target 1.6 -source 1.6
warning: [options] bootstrap class path not set in conjunction with -source 1.6
1 warning
$ javac -Xbootclasspath:/usr/java/jdk1.6.0_29/jre/lib/rt.jar -target 1.6 -source 1.6
$ javac -Xbootclasspath:/usr/java/jdk1.5.0_22

Ultra low latency Event Store

Overview
There are two basic libraries for managing data in Java: JDBC (for connecting to databases) and JMS (for messaging). For some use cases you ideally want both, and you want them to be very fast.

History
This is a redevelopment of a previous project, HugeCollections. That project is still on hold because it's too complex, IMHO, for what it does. This library is lower level and much simpler to understand. It may become the basis for the higher-level HugeCollections library.

The Java Chronicle Library
This library attempts to provide ultra-low-latency, high-throughput, persisted messaging and an event-driven, in-memory database with random access to previous messages. The typical latency is as low as 80 nanoseconds (between processes), supporting throughputs of 5-20 million messages per second.

Technical Features
It uses almost no heap, with trivial GC impact regardless of size. It can be much larger than your physical memory size (only limited by the size of your disk
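The usual technique behind persisted messaging with these properties is a memory-mapped file used as an append-only journal, so writes go to the page cache off-heap and old entries stay randomly accessible. A minimal stdlib sketch of that technique, not the Chronicle API itself:

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

// Append-only journal of fixed-size long entries over a memory-mapped file.
// Entries live outside the Java heap (no per-message allocation, no GC
// pressure) and remain randomly readable by index after being written.
public class MappedJournal implements AutoCloseable {
    private final RandomAccessFile file;
    private final MappedByteBuffer map;
    private int count;

    public MappedJournal(File f, int maxEntries) throws IOException {
        file = new RandomAccessFile(f, "rw");
        map = file.getChannel().map(FileChannel.MapMode.READ_WRITE, 0, 8L * maxEntries);
    }

    // Appends a value and returns its index for later random access.
    public int append(long value) {
        map.putLong(8 * count, value);
        return count++;
    }

    public long read(int index) {
        return map.getLong(8 * index);
    }

    @Override
    public void close() throws IOException {
        file.close();
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("journal", ".dat");
        try (MappedJournal j = new MappedJournal(f, 1_000)) {
            for (long i = 0; i < 10; i++)
                j.append(i * i);
            System.out.println(j.read(3)); // prints 9
        }
        f.delete();
    }
}
```

A real implementation would add variable-length entries, a persisted index, and memory barriers for safe access from multiple processes, but the sketch shows why the heap footprint stays trivial regardless of journal size.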