Java low level: Converting between integers and text (part 1)
Overview
As Java has a number of ways to convert an integer to a String, it's not something you might have considered writing yourself.
However, there may be situations where you want to do this. One of them is when performance is critical. Java libraries tend to create more objects than are required. Usually this doesn't matter but there are times when you need the system to go faster and reading and writing numbers is a big hit for you.
Performance difference
The following is based on tests that repeatedly write and read 128K integers in binary and text formats, taking the average time to write and read one integer.
Source to run all tests
Source for the examples. Scroll down for the Unsafe examples.
The i7 has a faster clock speed than the Xeon, but a smaller cache. This narrows the gap between the fastest and slowest times. Comparing a direct ByteBuffer with a heap ByteBuffer, there was little difference on the Xeon; however, the direct ByteBuffer was consistently faster on the i7.
In this test, the time to read/write integers as text varies by as much as 14x between approaches. If your integer format needs to be customised, you may need to use DecimalFormat or write your own.
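As a rough illustration of the DecimalFormat option, here is a minimal sketch that adds grouping separators. The class and method names (CustomIntegerFormat, format, parse) and the "#,##0" pattern are my own choices for the example, not something from the tests above, and the symbols are pinned to Locale.US so the output doesn't depend on the machine's default locale.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.text.ParseException;
import java.util.Locale;

// Hypothetical helper for illustration only.
final class CustomIntegerFormat {
    // Fixed symbols so grouping uses ',' regardless of the default locale.
    private static final DecimalFormatSymbols SYMBOLS =
            DecimalFormatSymbols.getInstance(Locale.US);

    static String format(long value) {
        // DecimalFormat is not thread-safe, so this sketch creates one per call.
        return new DecimalFormat("#,##0", SYMBOLS).format(value);
    }

    static long parse(String text) {
        try {
            return new DecimalFormat("#,##0", SYMBOLS).parse(text).longValue();
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a number: " + text, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(format(1234567L));
        System.out.println(parse("1,234,567"));
    }
}
```

Creating a formatter on every call is exactly the kind of object churn that hurts when performance matters, which is why writing your own can be worth it.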
Note: the faster options for writing/reading text outperformed the binary DataInput/OutputStream. The stream writes a long as 8 bytes in big-endian order, which is more work than converting the number to text in these cases.
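To make the big-endian layout concrete, the sketch below writes the same value both ways and returns the resulting bytes. The names BinaryVersusText, asBinary and asText are made up for this example, and Long.toString is used here only to show what the text bytes look like; it is not one of the fast approaches.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper for illustration only.
final class BinaryVersusText {
    static byte[] asBinary(long value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            // Always 8 bytes, most significant byte first.
            out.writeLong(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static byte[] asText(long value) {
        // One byte per character; small numbers take fewer than 8 bytes.
        return Long.toString(value).getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(asBinary(258L)));
        System.out.println(Arrays.toString(asText(258L)));
    }
}
```

For 258, the binary form is the eight bytes 0, 0, 0, 0, 0, 0, 1, 2, while the text form is just the three character codes for '2', '5' and '8'.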
Writing an integer to text
For this example, all integers are treated as long type. Using an int type instead gives a performance advantage, but it is relatively small.
Integer representation
All signed integers (byte, short, int and long) are represented in two's-complement. Encoding/decoding this format doesn't require any bitwise operations in the way that floating point numbers can.
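A small sketch of the difference, with class and method names invented for the example: a decimal digit of a long falls out of plain arithmetic, whereas taking a double apart means working on its raw bits. The first method just shows the two's-complement identity that negation equals inverting the bits and adding one.

```java
// Hypothetical helper for illustration only.
final class TwosComplementDemo {
    // In two's-complement, -x is the same as inverting every bit and adding one.
    static long negateWithBits(long x) {
        return ~x + 1;
    }

    // Last decimal digit of a non-negative value, using arithmetic only.
    static int lastDigit(long nonNegative) {
        return (int) (nonNegative % 10);
    }

    // By contrast, reading the exponent of a double needs shifts and masks.
    static int unbiasedExponent(double d) {
        long bits = Double.doubleToRawLongBits(d);
        return (int) ((bits >>> 52) & 0x7FF) - 1023;
    }

    public static void main(String[] args) {
        System.out.println(negateWithBits(5L));
        System.out.println(lastDigit(1234L));
        System.out.println(unbiasedExponent(8.0));
    }
}
```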
Extract the sign
Firstly, extract the sign. This is simple to do.
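A minimal sketch of this step, assuming the text is being written to a ByteBuffer. The SignWriter class and writeSign method are names made up for the example.

```java
import java.nio.ByteBuffer;

// Hypothetical helper for illustration only.
final class SignWriter {
    // Writes '-' for a negative value and returns the magnitude still to be written.
    // Long.MIN_VALUE is not handled here: negating it yields Long.MIN_VALUE again.
    static long writeSign(ByteBuffer out, long value) {
        if (value < 0) {
            out.put((byte) '-');
            return -value;
        }
        return value;
    }

    public static void main(String[] args) {
        ByteBuffer buffer = ByteBuffer.allocate(4);
        long magnitude = writeSign(buffer, -42L);
        System.out.println(magnitude + " with " + buffer.position() + " byte written");
    }
}
```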
There is one edge case here, which is Long.MIN_VALUE. Due to the two's-complement representation, this value is the negative of itself, so negating it leaves it negative. One way to handle this value is to encode it specially, e.g. have a constant which contains what it should be encoded as. Another approach is to treat Long.MIN_VALUE as an unset value or NaN. Most spreadsheet applications treat an empty field as an unset cell. (This is my preference.)
Another special value is zero. Other numbers do not have a leading zero, but zero needs to have at least one digit.
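Putting the two special cases together, here is one way the writer could look. This is only a sketch using the "encode it specially" option for Long.MIN_VALUE; the names LongTextWriter, writeLong and toText are assumptions for the example, and it allocates a scratch array for clarity, which a tuned version would want to avoid.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only.
final class LongTextWriter {
    // Long.MIN_VALUE cannot be negated, so its text is kept as a constant.
    private static final byte[] MIN_VALUE_TEXT =
            "-9223372036854775808".getBytes(StandardCharsets.ISO_8859_1);
    // A long has at most 19 decimal digits.
    private static final int MAX_DIGITS = 19;

    static void writeLong(ByteBuffer out, long value) {
        if (value == Long.MIN_VALUE) {
            out.put(MIN_VALUE_TEXT);
            return;
        }
        if (value < 0) {
            out.put((byte) '-');
            value = -value;
        }
        if (value == 0) {
            // Zero still needs one digit.
            out.put((byte) '0');
            return;
        }
        // Digits come out least significant first, so fill the array from the end.
        byte[] digits = new byte[MAX_DIGITS];
        int pos = MAX_DIGITS;
        while (value > 0) {
            digits[--pos] = (byte) ('0' + value % 10);
            value /= 10;
        }
        out.put(digits, pos, MAX_DIGITS - pos);
    }

    // Convenience wrapper so the result is easy to inspect.
    static String toText(long value) {
        ByteBuffer buffer = ByteBuffer.allocate(20);
        writeLong(buffer, value);
        buffer.flip();
        return StandardCharsets.ISO_8859_1.decode(buffer).toString();
    }

    public static void main(String[] args) {
        System.out.println(toText(0L));
        System.out.println(toText(-42L));
        System.out.println(toText(Long.MIN_VALUE));
    }
}
```

If you prefer to treat Long.MIN_VALUE as an unset value, the first branch could write nothing (or an empty field) instead of the constant.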