Being as fast as possible is a bad idea

Overview



When you design a system which must perform, a common assumption is that you need it to be as fast as possible. However, this needs to be qualified. Not having a clear idea of how much performance you need can mean you spend more money, waste time or distort your design more than necessary.

The first question is whether it is latency, throughput or both that you require. Often only one really matters. A low latency usually gives you a high throughput. If only throughput is required, using parallelism is often a cost-effective solution.

Should you only design a system to be just what you need?

It is a brave move to design a system to do exactly what you need and no more. This is because systems often don't behave as well as they should on paper.

The other reason is that systems tend to vary in their performance due to their complexity. End users tend to remember the worst performance they ever got, making the occasional slow response more significant than you might give it credit for.

My rule of thumb is this: if your design can handle ten times the realistic performance required, you are likely to be safe. Don't aim higher unless you are sure it won't cost you anything.

Without clarity on the performance required, it is easy to be out by more than a factor of ten, which is usually a disaster (if too low) or costly (if too high). (See below.)

You shouldn't feel that whatever you produce is set in stone. If a solution is particularly successful, you will find you can get more resources to improve the solution. If you spend too many resources building a solution which will never be needed, you can't get that time or those resources back.

How fast is fast enough?

When people say something must be as fast as possible, in my experience this can mean just about anything. For different people, it can mean that 100 ms, 10 ms, 1 ms, 100 μs, 10 μs or 1 μs is acceptable. Additionally, you need to determine what is to be measured and how to measure it, but you have to start with a ballpark figure.

Just this week I have had several conversations with people who said they need a top candidate with strong high frequency trading experience (sounds great). So I ask them what they are trying to achieve, and they have to admit that what they are attempting is nothing like what I have been doing for the last couple of years; it is out by more than a factor of ten (sounds disappointing).

For many systems, a low latency gives a high throughput. The relationship is usually inverse, i.e. a system with a latency of 1 ms can handle at least 1 K/s, and one with a latency of 10 μs can handle at least 100 K/s. This matters because, if the latency is low enough, the throughput can be fine as well. In that case you only have to worry about latency.
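The inverse relationship can be sketched in a few lines of Java. This is only an illustration, assuming a single thread which handles one request at a time; the class and method names are made up for this example. The rate it reports is the floor the figures above refer to, and a system which overlaps requests can do better.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not from any library: converts a per-request latency
// into the request rate one thread can sustain when it handles requests serially.
final class LatencyThroughput {

    /** Requests per second for one thread handling one request at a time. */
    static long maxThroughputPerSecond(long latencyNanos) {
        if (latencyNanos <= 0) {
            throw new IllegalArgumentException("latency must be positive");
        }
        return TimeUnit.SECONDS.toNanos(1) / latencyNanos;
    }

    public static void main(String[] args) {
        // The latencies mentioned in this post, in microseconds: 100 ms down to 1 μs.
        long[] latenciesMicros = {100_000, 10_000, 1_000, 100, 10, 1};
        for (long micros : latenciesMicros) {
            long perSecond = maxThroughputPerSecond(TimeUnit.MICROSECONDS.toNanos(micros));
            System.out.printf("latency %,d us -> about %,d/s per thread%n", micros, perSecond);
        }
    }
}
```

If the rate printed for your target latency is already above the throughput you need, latency is the only number you have to design for.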

Note: 1 μs is one microsecond, or a millionth of a second.

What throughput do you need?

Often having a low latency system alone is not enough and perhaps not required. Many libraries, frameworks and systems were not designed with low latency in mind and re-engineering them is expensive.

Systems with high throughput are often highly parallel, performing many unrelated tasks at once. Most systems have a throughput of 100/s, 1K/s, 10K/s, 100K/s, 1M/s or 10M/s.

When you can get it really wrong


When an application takes far too long

A large project I worked on was a GIS application in Australia. It had every telecom wire, every property boundary, every fence, every street, etc. in Australia. Australia has a relatively sparse population but a large area (a lot of it not containing any recorded features).

For this program we had to migrate data from the old system to the new system. The time taken to migrate the data was proportional to the area described (did I mention Australia is big?), and unfortunately this hadn't been recognised as a problem until the project had spent $70m. A migration which had to take 24 hours, I estimated would take 100 years. That is out by a factor of about 36,500, or more than four orders of magnitude. Even after I explained the problem to the vendor, they didn't see the need to redesign the software. A year later, they had made it four times faster. (Funding for the project was stopped at that point, as 25 years was still too long.)
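To make the size of that gap concrete, here is the arithmetic as a small Java sketch. The class and method names are made up for this example, and it assumes a 365-day year; the only inputs are the figures from the story above.

```java
// Hypothetical helper: compares an estimated run time with the time allowed.
final class MigrationEstimate {

    /** How many times longer the estimate is than the time allowed. */
    static double slowdownFactor(double estimatedHours, double allowedHours) {
        return estimatedHours / allowedHours;
    }

    /** The same gap expressed in orders of magnitude (powers of ten). */
    static double ordersOfMagnitude(double factor) {
        return Math.log10(factor);
    }

    public static void main(String[] args) {
        double allowedHours = 24;             // the migration window
        double hoursPerYear = 365 * 24;       // assumption: ignoring leap years
        double original = 100 * hoursPerYear; // my estimate of 100 years
        double afterSpeedUp = original / 4;   // the vendor's four-fold improvement

        for (double estimate : new double[] {original, afterSpeedUp}) {
            double factor = slowdownFactor(estimate, allowedHours);
            System.out.printf("%.0f years is %,.0f times too slow (%.1f orders of magnitude)%n",
                    estimate / hoursPerYear, factor, ordersOfMagnitude(factor));
        }
    }
}
```

Even after the four-fold improvement the migration was still thousands of times too slow, which is why a tuning exercise could never have been enough.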

When a project is far too fast

One project I did design had a "requirement" to support 8,000 concurrent users. At the time I was highly sceptical; however, I was keen to build such a system as a technical challenge, and it didn't bother me then that it might never be needed. However, getting to the end of the project, I realised we could have delivered the solution perhaps 6 months earlier if we had worked to a more realistic number, which was closer to 800 users. As it turned out, because it was a successful project, we got more hardware than I had assumed we would have, and the systems never used more than 2% of CPU.

The unrealistic requirement delayed the project, and we could have been in the market making money earlier.

Human factors

You have to be realistic when you have a human input or display. A human cannot respond faster than about 1/20 of a second, or about 50 ms. There is no point updating a screen too often. A delay of less than 10 ms is unlikely to make much difference to a user.

People tend to remember the worst service or performance they got. When measuring a system, you want to pay attention to the high-percentile and worst-case times and try to minimise these, as well as looking at the averages.
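As a rough illustration of why the average alone can mislead, the sketch below reports the mean alongside the median, the 99th percentile and the worst time. It uses the simple nearest-rank definition of a percentile; the class name and the sample timings are made up for this example, and a real measurement would need far more samples.

```java
import java.util.Arrays;

// Hypothetical helper: nearest-rank percentiles over a sorted array of timings.
final class LatencyPercentiles {

    /** Smallest sample such that at least pct% of the samples are at or below it. */
    static long percentile(long[] sortedSamples, double pct) {
        if (sortedSamples.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        int rank = (int) Math.ceil(pct / 100.0 * sortedSamples.length);
        return sortedSamples[Math.max(rank, 1) - 1];
    }

    public static void main(String[] args) {
        // Made-up timings in microseconds: nine fast responses and one slow one.
        long[] micros = {12, 10, 900, 13, 11, 20, 13, 15, 12, 14};
        Arrays.sort(micros);

        double mean = Arrays.stream(micros).average().orElse(Double.NaN);
        System.out.printf("mean=%.1f us, 50%%=%d us, 99%%=%d us, worst=%d us%n",
                mean, percentile(micros, 50), percentile(micros, 99),
                micros[micros.length - 1]);
    }
}
```

With these made-up numbers the typical response takes 13 μs, yet the single 900 μs outlier drags the mean up to 102 μs, and it is that outlier the user is likely to remember.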
