Computing units don't have to be confusing

Overview

Programmers often use units in a non-standard way, and you can usually work out what they meant from context. Many developers never realise there are inconsistencies in technical material, because the difference is small compared with the measurement error and the difficulty of reproducing reported benchmarks.

There are times when the difference is large enough to matter, and the writer has confused themselves through a lack of understanding, which makes your job of working out what they did harder.

An array of measures


  • b = bit.
  • B = byte or 8 bits or 8b
  • g = gram
  • kb = kilobit or 1000 bits. kilo being the scientific prefix for 1000 like kilometer.
  • kB = kilobyte or 1000 bytes, sometimes used for storage
  • kg = kilogram
  • Kb = Kibit = kibibit or 1024 bits.
  • KB = KiB = kibibyte or 1024 bytes, sometimes used for memory
  • Mb = megabit = 1000^2 bits or 125 kB. Mb/s is used for network bandwidth.
  • MB = megabyte = 1000^2 bytes, MB/s is used for disk bandwidth.
  • MiB = mebibyte = 1024^2 bytes, often referred to as MB or Megabyte for memory.
  • mb = millibits = 0.001 bits - never used correctly.
  • mB = millibytes = 0.008 bits - could be used for compression, but isn't
  • Gb = gigabit = 1000^3 bits or 125 MB, Gb/s is used for network bandwidth.
  • GB = gigabyte = 1000^3 bytes, used for disk space, GB/s is used for memory bandwidth
  • GiB = gibibyte = 1024^3 bytes, often referred to as GB or Gigabyte for memory.
  • gb = gram-bit (also means to feel itchy)
  • gB = gram-byte (someone's Xbox id)
  • TB = Terabyte = 1000^4 bytes, used for disk space.
  • TiB = Tebibyte = 1024^4 bytes, used for memory but usually written as TB.
  • PB = Petabyte = 1000^5 bytes, used for large storage solutions.

Why does it matter?

Confusion often arises when network bandwidth, which is usually in bits/second, is mixed up with disk or memory bandwidth, which is usually in bytes/second. To add to the confusion, disk and memory sizes use different prefixes.

Q: You might see a HDD which can write at 40 MB/s and a network which is 100 Mb/s.  Which is faster? 

A: The HDD is much faster, as it can write 320 Mb/s while the network can only transfer up to 12.5 MB/s (and most likely only gets 11 MB/s with overhead).

The difference between 'b' and 'B' is a factor of eight.
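
The arithmetic in the answer above can be checked with a minimal Python sketch. The function and variable names are made up for illustration, and the 40 MB/s and 100 Mb/s figures are just the ones from the question.

```python
BITS_PER_BYTE = 8  # 1 B = 8 b


def megabytes_to_megabits(mbytes_per_s: float) -> float:
    """Convert MB/s (bytes) to Mb/s (bits)."""
    return mbytes_per_s * BITS_PER_BYTE


def megabits_to_megabytes(mbits_per_s: float) -> float:
    """Convert Mb/s (bits) to MB/s (bytes)."""
    return mbits_per_s / BITS_PER_BYTE


hdd_mbytes_per_s = 40.0      # HDD write speed from the question, in MB/s
network_mbits_per_s = 100.0  # network speed from the question, in Mb/s

# Express both in the same unit before comparing them.
hdd_in_mbits = megabytes_to_megabits(hdd_mbytes_per_s)
network_in_mbytes = megabits_to_megabytes(network_mbits_per_s)

print(f"HDD:     {hdd_mbytes_per_s} MB/s = {hdd_in_mbits} Mb/s")
print(f"Network: {network_in_mbytes} MB/s = {network_mbits_per_s} Mb/s")
```

Once both figures are in the same unit, the comparison is obvious. The hard part is noticing that they were not in the same unit to begin with.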

Also confusing is

... memory and disk sizes.  Memory is measured in gibibytes (1024^3 bytes), as memory sizes have to be a power of 2, but most people call it GB or gigabytes.  Disk space is measured in gigabytes (1000^3 bytes), as disk space doesn't have to be a power of two, but mostly because it makes the drives appear about 7% larger.  This is not such a serious problem, and the difference is small enough that most people ignore it.

I remember when 4.0 GB drives were replaced by 4.3 GB drives, and I was amazed to discover they were exactly the same model and had just been re-labelled.
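
As a rough check of how far the binary and decimal prefixes drift apart, here is a small Python sketch. The helper name `binary_excess_percent` is made up for this example. At the giga level the gap works out to roughly 7.4%, and it keeps growing with each larger prefix.

```python
# Prefix pairs in increasing size; the list position gives the exponent.
PREFIXES = ["kilo/kibi", "mega/mebi", "giga/gibi", "tera/tebi"]


def binary_excess_percent(power: int) -> float:
    """How much larger 1024^power is than 1000^power, as a percentage."""
    return (1024 ** power / 1000 ** power - 1) * 100


for power, name in enumerate(PREFIXES, start=1):
    print(f"{name}: {binary_excess_percent(power):.1f}% larger")
```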

Memory bandwidth measures are inconsistent, and I suspect most of the time it is GB/s which is reported (as it is simpler for a human to understand but, more importantly, higher). This means if you can read 1 GiB of memory in one second, the bandwidth is reported as 1.07 GB/s.
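
To make the two ways of reporting concrete, here is a sketch in Python that turns one measurement into both figures. The `bandwidth` helper is hypothetical, and the one-second read of 1 GiB is the example from the paragraph above, not a real benchmark.

```python
GB = 1000 ** 3   # gigabyte, decimal
GIB = 1024 ** 3  # gibibyte, binary


def bandwidth(bytes_moved: int, seconds: float) -> tuple[float, float]:
    """Return the same transfer rate as (GB/s, GiB/s)."""
    bytes_per_second = bytes_moved / seconds
    return bytes_per_second / GB, bytes_per_second / GIB


# Reading exactly 1 GiB of memory in one second.
gb_per_s, gib_per_s = bandwidth(GIB, 1.0)
print(f"{gb_per_s:.3f} GB/s or {gib_per_s:.3f} GiB/s")
```

Reporting both numbers side by side, or at least stating which unit was used, removes the ambiguity for whoever reads the benchmark later.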

Conclusion

Be aware that marketing material and even other developers will mix their units. You want to have at least a clear idea in your own mind of what is really going on. If you can educate them along the way, that would be an improvement.

