Why we shouldn't use more threads than we need to
Overview
There is a common argument that because we have lots of cores, and will have even more in the future, we have to use them; we just need to find the best ways to do so. But just because we can doesn't mean we should.
What is our goal?
Good reasons to use multiple threads are:
- the performance of using one thread is not enough.
- you have profiled your application to ensure there is no low hanging fruit.
- multiple threads improve the throughput, latency or consistency.
At this point, you should add a thread only when you know it gets you closer to your goal.
A bad reason to use multiple threads
Just because we can use more threads doesn't mean we should. Multiple threads:
- add complexity to the code.
- are not the only way to speed up an application. Your L1 cache is 10-20x faster than your L3 cache, and if you can spend more time in your L1 cache by optimising your memory usage and access patterns, you can gain more performance than by using every CPU in your socket.
- can introduce subtle, rarely seen bugs which just wouldn't be there in single-threaded code.
- add synchronization and encourage more use of immutable objects instead of recycling mutable ones.
- tend to lead to much worse jitter and worst-case performance, even if the typical performance is better.
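The point about memory access can be illustrated with a minimal single-threaded sketch. It sums the same two-dimensional array twice: once row by row, so consecutive reads sit next to each other in memory, and once column by column, so each read jumps to a different row array. The class name `CacheWalk`, its method names, and the array size are made up for illustration, and the code assumes a non-empty rectangular array. The timings depend heavily on the hardware, the JVM and JIT warm-up, so treat any numbers it prints as indicative only.

```java
class CacheWalk {
    // Row-major walk: consecutive reads are adjacent in memory,
    // so most of them are served from the nearest cache.
    static long sumRowMajor(int[][] grid) {
        long sum = 0;
        for (int r = 0; r < grid.length; r++) {
            for (int c = 0; c < grid[r].length; c++) {
                sum += grid[r][c];
            }
        }
        return sum;
    }

    // Column-major walk: every read touches a different row array,
    // so the same work causes far more cache misses.
    // Assumes a non-empty, rectangular grid.
    static long sumColumnMajor(int[][] grid) {
        long sum = 0;
        int cols = grid[0].length;
        for (int c = 0; c < cols; c++) {
            for (int r = 0; r < grid.length; r++) {
                sum += grid[r][c];
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        int n = 4096; // illustrative size, large enough to exceed typical caches
        int[][] grid = new int[n][n];
        for (int r = 0; r < n; r++) {
            for (int c = 0; c < n; c++) {
                grid[r][c] = r ^ c;
            }
        }
        // Repeat a few times so the JIT has a chance to compile both loops.
        for (int run = 0; run < 5; run++) {
            long t0 = System.nanoTime();
            long a = sumRowMajor(grid);
            long t1 = System.nanoTime();
            long b = sumColumnMajor(grid);
            long t2 = System.nanoTime();
            System.out.printf("row-major %d ms, column-major %d ms, same sum: %b%n",
                    (t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000, a == b);
        }
    }
}
```

Both methods do exactly the same arithmetic in one thread; only the order of memory access differs. That is the kind of gain worth looking for before adding threads.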
A simple example of using threads for the sake of it is calculating Fibonacci numbers. These are very easy to describe recursively, and the recursive form makes it easy to create lots of threads, so calculating Fibonacci numbers is often used as an example of how to use lots of threads. What such examples often don't mention is that the number of threads you create is proportional to the answer, i.e. it grows exponentially. This means that while iterating in one loop in one thread takes about 4 ms to compute fib(69), the multi-threaded version would create hundreds of trillions of threads (fib(69) is about 1.2 x 10^14) and would not finish in any practical time, if it didn't crash first.
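To put numbers on the Fibonacci example, here is a minimal sketch. `fibIterative` is the single-loop version. `callsForNaiveFib` does not start any threads; it only counts how many calls the naive recursion fib(n) = fib(n-1) + fib(n-2) would make, which is how many threads or tasks you would create if every call ran in its own. The class and method names are made up for illustration, and the recurrence assumes fib(0) and fib(1) each count as a single call.

```java
class FibDemo {
    // Single loop, single thread. The result fits in a long for n <= 92.
    static long fibIterative(int n) {
        long a = 0, b = 1;
        for (int i = 0; i < n; i++) {
            long next = a + b;
            a = b;
            b = next;
        }
        return a;
    }

    // Number of calls the naive recursion makes for fib(n):
    // calls(n) = 1 + calls(n - 1) + calls(n - 2), with calls(0) = calls(1) = 1.
    // With one thread per call, this is the number of threads created.
    static long callsForNaiveFib(int n) {
        long prev = 1, curr = 1; // calls(0), calls(1)
        for (int i = 2; i <= n; i++) {
            long next = 1 + curr + prev;
            prev = curr;
            curr = next;
        }
        return curr;
    }

    public static void main(String[] args) {
        int n = 69;
        System.out.println("fib(" + n + ") = " + fibIterative(n));
        System.out.println("calls for naive fib(" + n + ") = " + callsForNaiveFib(n));
    }
}
```

For n = 69 the answer is 117669030460994 and the naive recursion makes 380784981418269 calls, so the call count is a small multiple of the answer itself. The single loop needs 69 additions.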
But if I have CPUs idle I am wasting them
If you want to use every CPU, just write a busy-waiting thread for every CPU and you are done: every CPU is at 100%.
Say you want to travel from A to B. Sometimes you can take one street, and sometimes taking four streets is faster. But there are 20 streets near A and B, so you should go up and down all twenty streets, because otherwise there is no point in them being there, right!?