Low latency services

March 01, 2013

Overview

Low latency services are designed to be as simple as possible. All the same, it is good to have a picture of the interaction between a low latency processing engine and the rest of the world.

Why are we doing all this?

Using the following model, you can create a processing engine which is deterministic in both behaviour and performance, is reproducible for testing, replication and restart of the application, and keeps a record of all actions for issue analysis, including micro-second timings.

High level

From a high level, a processing engine needs inputs from gateway processes, threads or components which normalise incoming data or requests. These requests are consumed by the processing engine and a log is produced. From this log, the gateway processes can respond to requests or send outbound messages to key systems.

Also reading the processing engine's log is a database persister, needed to support reporting and ad hoc queries. The log is also used to send updates via a gateway to GUI clients.
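
The flow described above can be sketched in a few lines of Java. This is only an illustration: the class and method names (PipelineSketch, normalise, process, LogReader) are made up for this post and are not part of any library, and an in-memory list stands in for the real log.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical names throughout; an in-memory List stands in for the engine's log.
final class PipelineSketch {
    // Gateway: normalise raw input such as "  buy   100 " into "BUY 100".
    static String normalise(String raw) {
        return raw.trim().toUpperCase(Locale.ROOT).replaceAll("\\s+", " ");
    }

    // Engine: consume one normalised request and append the outcome to the log.
    static void process(String request, List<String> log) {
        log.add("ACK " + request);
    }

    // Any reader of the log (responding gateway, database persister, GUI gateway)
    // keeps its own read position, so readers never interfere with each other.
    static final class LogReader {
        private int position = 0;

        // Return the entries appended since this reader last polled.
        List<String> poll(List<String> log) {
            List<String> fresh = new ArrayList<>(log.subList(position, log.size()));
            position = log.size();
            return fresh;
        }
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        LogReader responder = new LogReader();
        LogReader persister = new LogReader();

        process(normalise("  buy   100 "), log);

        // Both readers see the same entry independently.
        System.out.println("responder: " + responder.poll(log));
        System.out.println("persister: " + persister.poll(log));
    }
}
```

The point of the sketch is that the engine only ever appends to the log, and everything downstream is just another reader with its own position.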

Replication and restart

All the state of the system and everything it does is recorded in the logs. This means the engine can be restarted to the same state by re-reading the logs. If the processing engine can read the logs fast enough, this restart can take just seconds. For fail over, you need a copy of these logs.
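
As a minimal sketch of restart by replay, assume the engine's state is a balance per account and each log entry is a string of the form "account delta". Both the state model and the ReplaySketch name are assumptions made for this example; the only property that matters is that applying an event is deterministic.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical engine; the state model and event format are illustrative only.
final class ReplaySketch {
    // All engine state lives here: a running balance per account.
    final Map<String, Long> balances = new HashMap<>();

    // Deterministic: the same events in the same order always give the same state.
    // Each event is "<account> <delta>", e.g. "A 10".
    void apply(String event) {
        String[] parts = event.split(" ");
        balances.merge(parts[0], Long.parseLong(parts[1]), Long::sum);
    }

    // Restart (or bring up a replica) by re-reading the log from the start.
    static ReplaySketch replay(List<String> log) {
        ReplaySketch engine = new ReplaySketch();
        for (String event : log) {
            engine.apply(event);
        }
        return engine;
    }

    public static void main(String[] args) {
        List<String> log = Arrays.asList("A 10", "B 5", "A -3");
        ReplaySketch original = replay(log);
        ReplaySketch restarted = replay(log);
        // Compare the state of the two engines built from the same log.
        System.out.println(original.balances.equals(restarted.balances));
    }
}
```

A replica for fail over is the same thing run on a copy of the log.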

Performance

The biggest challenge is in feeding the processing engine enough requests/events to keep it busy. The gateway processes have a typical delay of 20 to 200 micro-seconds and a throughput of 5,000 to 50,000 messages per second. Compare this to a typical processing engine, which takes 1 to 10 micro-seconds per event and processes 100,000 to 1,000,000 events per second, and the gateways are pretty slow. You can scale the number of processing engines, but it is likely you won't need to, as you won't have enough events being passed through the gateways.
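
To put numbers on this, here is a back-of-the-envelope calculation using only the figures quoted above. It assumes a single-threaded engine and that gateway throughput adds up linearly, which is a simplification; the GatewayMaths name is made up for this example.

```java
// Back-of-the-envelope arithmetic; assumes gateway throughput adds up linearly.
final class GatewayMaths {
    // Events per second a single-threaded engine can handle at a given cost per event.
    static long eventsPerSecond(double microsPerEvent) {
        return Math.round(1_000_000 / microsPerEvent);
    }

    // Number of gateways needed to keep one engine fully busy (rounded up).
    static long gatewaysToSaturate(long engineRate, long gatewayRate) {
        return (engineRate + gatewayRate - 1) / gatewayRate;
    }

    public static void main(String[] args) {
        long slowEngine = eventsPerSecond(10); // 10 micro-seconds per event
        long fastEngine = eventsPerSecond(1);  // 1 micro-second per event

        System.out.println("slow engine, fast gateways: "
                + gatewaysToSaturate(slowEngine, 50_000));
        System.out.println("fast engine, slow gateways: "
                + gatewaysToSaturate(fastEngine, 5_000));
    }
}
```

With these figures, one engine needs somewhere between 2 and 200 gateways feeding it before it becomes the bottleneck.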

The main advantages of such a fast processing engine are stable performance, even at the 99.99th percentile, and fast restarting (as the restart isn't slowed down by the gateways).

Java Chronicle

Java Chronicle is designed with these assumptions in mind. It supports messaging between processes with a latency of around 100 nano-seconds. It persists every message synchronously, meaning a message cannot be read until it has been written to the OS, so even if all processes die, the data will still be saved to disk.
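
One way to get this "written to the OS before it can be read" property in plain Java is a memory-mapped file. The sketch below uses only the standard java.nio API and is not the Java Chronicle API; the MappedLogSketch name, the fixed 4 KB size and the length-prefixed record format are all assumptions made for this example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Illustrative only: a tiny length-prefixed log in a memory-mapped file.
final class MappedLogSketch {
    static final int SIZE = 4096; // fixed size, chosen arbitrarily for the example

    // Map the file into memory. The mapping stays valid after the channel is closed.
    static MappedByteBuffer map(Path file) {
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            return channel.map(FileChannel.MapMode.READ_WRITE, 0, SIZE);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writing through the mapping puts the bytes in the OS page cache, so they
    // outlive the writing process. Surviving an OS crash or power loss would
    // additionally need buffer.force().
    static void append(MappedByteBuffer buffer, String message) {
        byte[] bytes = message.getBytes(StandardCharsets.UTF_8);
        buffer.putInt(bytes.length);
        buffer.put(bytes);
    }

    // Read the next length-prefixed message from this mapping's current position.
    static String readNext(MappedByteBuffer buffer) {
        byte[] bytes = new byte[buffer.getInt()];
        buffer.get(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("mapped-log", ".dat");
        file.toFile().deleteOnExit();

        MappedByteBuffer writer = map(file);
        append(writer, "order 1");

        // A second mapping of the same file stands in for another process.
        MappedByteBuffer reader = map(file);
        System.out.println(readNext(reader));
    }
}
```

A real implementation also has to tell readers when a record is complete; the sketch sidesteps that by reading only after the write has finished.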

Java Chronicle is lock free, has an ultra low GC footprint, and supports busy waiting and thread affinity so you don't give up the CPU or context switch (except for non-maskable interrupts).
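
To illustrate what busy waiting means here, the following sketch has a consumer spin on a sequence number instead of blocking on a lock. It is a generic example, not Java Chronicle code: the BusySpinSketch name and the use of an AtomicLong as the published sequence are assumptions. Thread.onSpinWait() needs Java 9 or later, and thread affinity is not shown as it needs native support outside the standard library.

```java
import java.util.concurrent.atomic.AtomicLong;

// Generic busy-wait sketch; not taken from Java Chronicle.
final class BusySpinSketch {
    // Spin until the published sequence reaches at least 'wanted'.
    // The thread never blocks or yields, so it stays on its CPU.
    static long awaitAtLeast(AtomicLong published, long wanted) {
        long seen;
        while ((seen = published.get()) < wanted) {
            Thread.onSpinWait(); // hint to the CPU that this is a spin loop (Java 9+)
        }
        return seen;
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicLong published = new AtomicLong();

        // Producer: publish sequence numbers 1..3 without taking any lock.
        Thread producer = new Thread(() -> {
            for (long i = 1; i <= 3; i++) {
                published.set(i);
            }
        });
        producer.start();

        long seen = awaitAtLeast(published, 3);
        producer.join();
        System.out.println("consumer saw sequence " + seen);
    }
}
```

The trade-off is that a spinning thread uses a whole core even when idle, which is why this style only makes sense when latency matters more than CPU utilisation.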

In my next blog entry I will discuss a demo application written this way and how to measure its performance.

Vanilla Java