Showing posts from 2014

On heap vs off heap memory usage

Overview I was recently asked about the benefits and wisdom of using off heap memory in Java.  The answers may be of interest to others facing the same choices. Off heap memory is nothing special.  Thread stacks, application code and NIO buffers are all off heap.  In fact, in C and C++ you only have unmanaged memory, as these languages do not have a managed heap by default.  The use of managed memory or "heap" in Java is a special feature of the language. Note: Java is not the only language to do this. new Object() vs Object pool vs Off Heap memory. new Object() Before Java 5.0, using object pools was very popular, as creating objects was still very expensive.  However, from Java 5.0, object allocation and garbage cleanup were made much cheaper, and developers found they got a performance speed up and a simplification of their code by removing object pools and just creating new objects whenever needed.  Before Java 5.0, almost any object pool, even an object pool which used objec
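As a minimal sketch of the point that off heap memory is nothing special in Java, a direct NIO buffer already lives outside the managed heap (the sizes and values here are illustrative, not from the post):

```java
import java.nio.ByteBuffer;

public class OffHeapExample {
    public static void main(String[] args) {
        // A direct ByteBuffer is allocated outside the managed heap;
        // the GC tracks only the small wrapper object, not the 1 MB of data.
        ByteBuffer offHeap = ByteBuffer.allocateDirect(1024 * 1024);
        offHeap.putLong(0, 42L);          // write at an absolute offset
        long value = offHeap.getLong(0);  // read it back
        System.out.println(value);        // prints 42

        // By contrast, a heap buffer is backed by a byte[] the GC must manage.
        ByteBuffer onHeap = ByteBuffer.allocate(1024 * 1024);
        System.out.println(offHeap.isDirect() + " " + onHeap.isDirect()); // true false
    }
}
```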

Chronicle Map and Yahoo Cloud Service Benchmark

Overview Yahoo Cloud Service Benchmark is a reasonably widely used benchmarking tool for testing key-value stores with a significant number of keys, e.g. 100 million, and a modest number of clients, i.e. served from one machine. In this article I look at how a test of 100 million * 1 KB key/values performed using Chronicle Map on a single machine with 128 GB memory, dual Intel E5-2650 v2 @ 2.60GHz, and six Samsung 840 EVO SSDs. The 1 KB value consists of ten fields of 100-byte Strings.  For a more optimal solution, primitive numbers would be a better choice. While the SSDs helped, the peak transfer rate was 700 MB/s, which could be supported by two SATA SSD drives. These benchmarks were performed using the latest version at the time of the report, Chronicle Map 2.0.6a-SNAPSHOT. Micro-second world. Something which confounds me when reading benchmarks about key-value stores is that they start with the premise that performance is really important.  IMHO, about 90% of the time, pe

Essentialism and Technology

Overview Essentialism isn't just flawed, it is flawed at many levels, some of which should be familiar to technologists.  I have read a number of arguments against essentialism; however, for me they miss a number of points which relate to technology, an area I claim to know more about.  In technology you can measurably demonstrate the performance difference between two solutions, and yet it is difficult to convince people to let go of their biases. What is Essentialism? (from Google) "a belief that things have a set of characteristics that make them what they are, and that the task of science and philosophy is their discovery and expression; the doctrine that essence is prior to existence. the view that all children should be taught on traditional lines the ideas and methods regarded as essential to the prevalent culture. the view that categories of people, such as women and men, or heterosexuals and homosexuals, or members of ethnic groups, have intrinsically different

A Java conversion puzzler, not suitable for work (or interviews)

A really hard interview question would be something like this int i = Integer.MAX_VALUE; i += 0.0f; int j = i; System.out.println(j == Integer.MAX_VALUE); // true Why does this print true? At first glance, the answer seems obvious, until you realise that if you change int i to long i things get weird long i = Integer.MAX_VALUE; i += 0.0f; int j = (int) i; System.out.println(j == Integer.MAX_VALUE); // false System.out.println(j == Integer.MIN_VALUE); // true What is going on, you might wonder? When did Java become JavaScript? Let me start by explaining why long gives such a strange result. An important detail about += is that it does an implicit cast.  You might think that a += b; is the same as a = a + b; and basically it is, except for a subtle difference which most of the time doesn't matter: a = (typeOf(a)) (a + b); Another subtle feature of addition is that the result is the "wider" of the two types.  This means that i += 0.0f;
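The two cases from the excerpt can be run side by side. The key is that (float) Integer.MAX_VALUE rounds up to 2^31; a float-to-int cast saturates at Integer.MAX_VALUE, while a float-to-long cast lets 2^31 through, and the later long-to-int narrowing wraps to Integer.MIN_VALUE:

```java
public class ImplicitCastPuzzle {
    public static void main(String[] args) {
        int i = Integer.MAX_VALUE;
        i += 0.0f;               // really i = (int) (i + 0.0f)
        // (float) Integer.MAX_VALUE rounds up to 2^31, but narrowing
        // that float back to int saturates at Integer.MAX_VALUE.
        System.out.println(i == Integer.MAX_VALUE); // true

        long k = Integer.MAX_VALUE;
        k += 0.0f;               // really k = (long) (k + 0.0f)
        // Narrowing the float to long does not saturate here: 2^31 fits.
        System.out.println(k);   // 2147483648
        // Narrowing 2^31 from long to int wraps to Integer.MIN_VALUE.
        System.out.println((int) k == Integer.MIN_VALUE); // true
    }
}
```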

Kafka Benchmark on Chronicle Queue

Overview I was recently asked to compare the performance of Kafka with Chronicle Queue.  No two products are exactly alike, and performing a fair comparison is not easy.  We can try to run similar tests and see what results we get. This test is based on Apache Kafka Performance Results What was the test used? One area Kafka tests is multi-threaded performance.  In tests we have done, it is neither better nor worse to use more threads (up to the number of CPUs you have).  We didn't benchmark this here. All tests use one producer. Another difference is that we flush to disk periodically by time rather than by count.  Being able to say you are never behind by more than X milliseconds is often more useful than, say, 600 messages, as you don't know how long those messages could have been waiting there.  For our tests, we look at flush periods of between 1 ms and 10 ms.  In Kafka's tests, flushes appear to be every 3 ms approximately. The message size used was 200

Men in Tech

Background Between my partner and me, we have six daughters, and as they have grown I have become more interested in their long term future, the role of women in society, the way technology will change our lives and, in particular, the role of women in technology. On the last topic, all the articles I have read have been written by women.  In this post, I hope to outline my experiences in this regard. Women face many challenges, most typical of male-dominated professions as well as some specific to the technology field.  I believe that there will be a time when IT, like teaching and accounting, will have more women than men. I can understand it is frustrating as a woman in technology to be treated as a novelty.  Perhaps my experience can illustrate why that might be. One thing I have learnt over the years is that while I am happy to talk technology all day, there is one time to stop: in social situations when women are present.  This doesn't come from men, but women overwhe

lambdas and side effects.

Overview Java 8 has added features such as lambdas and type inference. This makes the language less verbose and cleaner; however, it comes with more side effects, as you don't have to be as explicit in what you are doing. The return type of a lambda matters Java 8 infers the type of a closure.  One way it does this is to look at the return type (or whether anything is returned).  This can have a surprising side effect.  Consider this code. ExecutorService es = Executors.newSingleThreadExecutor(); es.submit(() -> {     try (Scanner scanner = new Scanner(new FileReader("file.txt"))) {         String line = scanner.nextLine();         process(line);     }     return null; }); This code compiles fine. However, the line return null; appears redundant and you might be tempted to remove it.  However, if you remove the line, you get an error. Error:(12, 39) java: unreported exception; must be caught or declared to
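The effect described above can be shown without the file I/O: a lambda that returns a value is inferred as a Callable, whose call() declares throws Exception, while the same body as a Runnable may not throw checked exceptions. The readLine helper below is hypothetical, standing in for the Scanner code in the post:

```java
import java.util.concurrent.Callable;

public class LambdaInference {
    // Hypothetical helper that throws a checked exception, standing in
    // for the Scanner/FileReader code in the post.
    static String readLine() throws java.io.IOException {
        return "a line";
    }

    public static void main(String[] args) throws Exception {
        // Returning a value makes this a Callable<String>; Callable.call()
        // declares "throws Exception", so the checked IOException compiles.
        Callable<String> asCallable = () -> readLine();
        System.out.println(asCallable.call()); // prints "a line"

        // The same body as a Runnable would not compile, because
        // Runnable.run() declares no checked exceptions:
        // Runnable asRunnable = () -> readLine(); // compile error
    }
}
```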

An Inconvenient Latency

Overview Vendors typically publish numbers they are happy with, and avoid telling you about a product's weaknesses.  However, behind the numbers is a dirty secret, if you know where to look. Why don't we use GPUs for everything? Finding problems which naturally scale to thousands of data points/tasks is easy for some problems, and very hard for others. GPUs are designed for computing large vector operations.  However, what is "large" and why does it matter? Say you have a GPU with 1024 cores. This means it can process a vector of length 1024 all at once, and a vector of 2048 in double the time.  But what happens if we only have a vector of 100, or 10, or 1? The inconvenient answer is that it takes the same amount of time, because you can't make use of all of your cores at once.  You get only 10%, 1% or just 0.1% of the peak performance.  If you want to get the best efficiency, you want a problem which has many thousands of values which can be processed concurrently.  If you don
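The arithmetic above can be put in a back-of-the-envelope model: every "wave" of work takes the same time whether it fills all the cores or not, so efficiency is the fraction of lanes actually used. This is a simplification for illustration, not a model of any particular GPU:

```java
public class GpuEfficiency {
    // Number of "waves" needed to process n elements on `cores` lanes;
    // a partly filled wave costs the same as a full one.
    static int waves(int n, int cores) {
        return (n + cores - 1) / cores; // ceiling division
    }

    static double efficiency(int n, int cores) {
        return (double) n / (waves(n, cores) * cores);
    }

    public static void main(String[] args) {
        int cores = 1024;
        for (int n : new int[]{2048, 1024, 100, 10, 1}) {
            System.out.printf("n=%4d  waves=%d  efficiency=%.1f%%%n",
                    n, waves(n, cores), 100 * efficiency(n, cores));
        }
        // n=2048 takes two waves at 100% efficiency; n=100, 10 and 1 each
        // take one wave but use only ~10%, ~1% and ~0.1% of the cores.
    }
}
```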

Try optimising the memory consumption first

Overview You would think that if you wanted your application to go faster you would start with CPU profiling.  However, when looking for quick wins, it's the memory profiler I target first. Allocating memory is cheap Allocating memory has never been cheaper.  Memory is cheaper; you can get machines with thousands of GBs of memory, and you can buy 16 GB for less than $200. The memory allocation operation is cheaper than in the past, and it's multi-threaded, so it scales reasonably well. However, memory allocation is not free.  Your CPU cache is a precious resource, especially if you are trying to use multiple threads.  While you can buy 16 GB of main memory easily, you might only have 2 MB of cache per logical CPU.  If you want these CPUs to run independently, you want to spend as much time as possible within the 256 KB L2 cache. Cache level Size access time in clock cycles concurrency 1 32 KB data 32 KB instruction 1 cores independent 2 256 KB 3 c
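A small sketch of the kind of quick win a memory profiler points at: reusing one mutable buffer instead of allocating a fresh one per call. The names and format here are illustrative, not from the post; the final toString() still allocates, but the intermediate StringBuilder garbage is gone:

```java
public class ReuseExample {
    // Allocating a fresh StringBuilder per call is cheap individually,
    // but the resulting garbage churns the precious CPU cache.
    static String formatFresh(int id, String name) {
        return new StringBuilder().append(id).append(':').append(name).toString();
    }

    // Reusing one buffer per thread keeps the hot data in cache and
    // removes the per-call allocation.
    private static final ThreadLocal<StringBuilder> BUFFER =
            ThreadLocal.withInitial(StringBuilder::new);

    static String formatReused(int id, String name) {
        StringBuilder sb = BUFFER.get();
        sb.setLength(0); // reset instead of reallocating
        return sb.append(id).append(':').append(name).toString();
    }

    public static void main(String[] args) {
        System.out.println(formatFresh(1, "a"));  // prints 1:a
        System.out.println(formatReused(1, "a")); // prints 1:a
    }
}
```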

Team training in Expert Core Java

Overview We have new course material for the second half of this year.  Core Java Training We provide tailored team training for advanced and expert Java Developers at a low cost per head.  Select from the topics below.  We can provide training on site for your organization, worldwide. For more details, see the Core Java Training web page. Expert Java Development (2-3 days) Working with primitives to save memory and reduce garbage. How to use double and long safely instead of BigDecimal. How to use collections like Map, Set, ConcurrentMap, NavigableMap, List, Queue, BlockingQueue and Deque effectively. How to use thread pools and fork join. Asynchronous processing and exception handling. How to use Lambdas in Java 8 for lazy evaluation and parallel coding. How to use Plain IO and NIO, files, TCP and UDP. volatile, read/write memory barriers and when you need them. default methods in Java 8. Using enum for singletons and utility classes. Java 8 JSR-310

Compounding double error

Overview In a previous article, I outlined why BigDecimal is not the answer most of the time.  While it is possible to construct situations where double produces an error, it is just as easy to construct situations where BigDecimal gets an error. BigDecimal appears easier to get right, but it is also easy to get wrong. The anecdotal evidence is that junior developers don't have as much trouble getting BigDecimal right as they do getting double with rounding right.  However, I am sceptical of this, because with BigDecimal it is much easier for an error to go unnoticed. Let's take this example where double produces an incorrect answer. double d = 1.00; d /= 49; d *= 49 * 2; System.out.println("d=" + d); BigDecimal bd = BigDecimal.ONE; bd = bd.divide(BigDecimal.valueOf(49), 2, BigDecimal.ROUND_HALF_UP); bd = bd.multiply(BigDecimal.valueOf(49 * 2)); System.out.println("bd=" + bd); prints d=1.9999999999999998 bd=1.96 In this case, double loo
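A runnable version of the example above, with the extra step the comparison turns on: the double result carries a tiny representation error that a single final round removes, while BigDecimal rounded the intermediate result to two places (0.02), so the error compounds into 1.96, which merely looks exact. (RoundingMode.HALF_UP is used in place of the deprecated BigDecimal.ROUND_HALF_UP.)

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class CompoundingError {
    public static void main(String[] args) {
        double d = 1.00;
        d /= 49;
        d *= 49 * 2;
        // The representation error is tiny and vanishes with one final round.
        double rounded = Math.round(d * 100) / 100.0;
        System.out.println("d=" + d + " rounded=" + rounded); // rounded=2.0

        // BigDecimal rounded the intermediate result 1/49 to 0.02, so the
        // error compounds: 0.02 * 98 = 1.96 instead of 2.00.
        BigDecimal bd = BigDecimal.ONE
                .divide(BigDecimal.valueOf(49), 2, RoundingMode.HALF_UP)
                .multiply(BigDecimal.valueOf(49 * 2));
        System.out.println("bd=" + bd); // bd=1.96
    }
}
```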

If BigDecimal is the answer, it must have been a strange question.

Overview Many developers have determined that BigDecimal is the only way to deal with money.  Often they cite that by replacing double with BigDecimal, they fixed a bug or ten.  What I find unconvincing about this is that perhaps they could have fixed the bug in the handling of double and avoided the extra overhead of using BigDecimal. By comparison, when asked to improve the performance of a financial application, I know at some point we will be removing BigDecimal if it is there. (It is usually not the biggest source of delays, but as we fix the system it moves up to be the worst offender.) BigDecimal is not an improvement BigDecimal has many problems, so take your pick, but an ugly syntax is perhaps the worst sin. BigDecimal syntax is unnatural. BigDecimal uses more memory. BigDecimal creates garbage. BigDecimal is much slower for most operations (there are exceptions). The following JMH benchmark demonstrates two problems with BigDecimal: clarity and performance. The
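To illustrate the syntax point (though not the truncated JMH benchmark itself), here is the same calculation written both ways; the method names and values are illustrative, assuming a typical round-half-up to two decimal places:

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class MidPrice {
    // Average two prices and round to 2 decimal places, with double:
    // the operators read naturally.
    static double midDouble(double bid, double ask) {
        return Math.round((bid + ask) / 2 * 100) / 100.0;
    }

    // The same calculation with BigDecimal: every operator becomes a
    // method call, and the rounding must be threaded through explicitly.
    static BigDecimal midBigDecimal(BigDecimal bid, BigDecimal ask) {
        return bid.add(ask)
                  .divide(BigDecimal.valueOf(2), 2, RoundingMode.HALF_UP);
    }

    public static void main(String[] args) {
        System.out.println(midDouble(1.23, 1.27)); // prints 1.25
        System.out.println(midBigDecimal(new BigDecimal("1.23"),
                                         new BigDecimal("1.27"))); // prints 1.25
    }
}
```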

Adding @atomic operations to Java

Overview How might atomic operations work in Java, and is there a current alternative in OpenJDK/HotSpot that they could translate to? Feedback In my previous article on Making operations on volatile fields atomic, it was pointed out a few times that "fixing" previous behaviour is unlikely to go ahead, regardless of good intentions. An alternative to this is to add an @atomic annotation.  This has the advantage of only applying to new code and not risking breaking old code. Note: The use of a lower case name is intentional, as it *doesn't* follow current coding conventions. Atomic operations Any expression using a field annotated with @atomic would be made atomic as a whole.  Variables which are non-volatile and non-atomic could be read at the start, or set after the completion of the expression.  The expression itself may require locking on some platforms, CAS operations or TSX, depending on the CPU technology. If fields are only read, or only one is written to, this wo
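As a sketch of the current alternative such an annotation might translate to, here is the classic compare-and-swap retry loop with AtomicLong; something like "@atomic long counter; counter += n;" could plausibly compile down to this (the class and method names are hypothetical):

```java
import java.util.concurrent.atomic.AtomicLong;

public class AtomicSketch {
    final AtomicLong counter = new AtomicLong();

    // One possible translation of "counter += n" on an @atomic field:
    // read, compute the whole expression, then CAS and retry if raced.
    void add(long n) {
        long prev, next;
        do {
            prev = counter.get();   // read the current value
            next = prev + n;        // evaluate the expression
        } while (!counter.compareAndSet(prev, next)); // retry on contention
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicSketch s = new AtomicSketch();
        Thread[] ts = new Thread[4];
        for (int i = 0; i < ts.length; i++) {
            ts[i] = new Thread(() -> { for (int j = 0; j < 1000; j++) s.add(1); });
            ts[i].start();
        }
        for (Thread t : ts) t.join();
        // No increments are lost despite four racing threads.
        System.out.println(s.counter.get()); // prints 4000
    }
}
```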