Performance Tip: Rethinking Collection.toArray(new Type[0])
Introduction
Have you ever considered the performance implications of converting collections to arrays in Java? It's a common task, and the method you choose can affect your application's efficiency. In this article, I will explore different approaches to toArray(), benchmark their performance, and determine which method is optimal for various scenarios.
The Challenge
Converting a Collection to an array seems straightforward, but the standard practice of using collection.toArray(new Type[0]) might not be the most efficient. Understanding the nuances of this method can help you write more performant code.
Exploring the Approaches
Let's delve into four primary methods and a combination for converting collections to arrays:
1. Using toArray() Without Arguments
Object[] array = { "Hello", "world" };
String[] strings = (String[]) array; // Throws ClassCastException at runtime
While this approach avoids allocating an extra argument array and can be fast, it returns an Object[], so it lacks type safety: casting the result to a typed array fails at runtime, as the cast above shows.
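To tie the cast failure back to an actual collection, here is a minimal sketch. The class and method names (NoArgToArrayDemo, castToStringArrayFails) are hypothetical and chosen only for illustration; the example assumes an ArrayList, whose no-argument toArray() returns a genuine Object[].

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

class NoArgToArrayDemo {
    // Hypothetical helper: reports whether casting the result of toArray() to String[] fails.
    static boolean castToStringArrayFails(Collection<String> words) {
        Object[] objects = words.toArray(); // runtime type is Object[], not String[]
        try {
            String[] strings = (String[]) objects; // compiles, but is checked at runtime
            return strings == null; // only reached if the cast succeeds
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> words = new ArrayList<>(Arrays.asList("Hello", "world"));
        System.out.println("cast fails: " + castToStringArrayFails(words));

        // Casting individual elements is fine, because each element really is a String.
        String first = (String) words.toArray()[0];
        System.out.println(first);
    }
}
```

The elements can still be used one at a time; it is only the array as a whole that cannot be treated as a String[].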
2. Passing a Zero-Length Array: toArray(new Type[0])
A common practice involves passing a new zero-length array to the toArray() method.
String[] notifTypesArray = notifTypes.toArray(new String[0]);
This code creates a new zero-length array every time, incurring unnecessary allocation and reflection costs, especially in performance-critical applications.
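The sketch below shows what happens to the zero-length argument, under the assumption of a plain ArrayList. The class name and the sample values are made up for illustration. For a non-empty collection the argument only supplies the element type and a second array of the right size is allocated; for an empty collection the argument itself is returned.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

class ZeroLengthToArrayDemo {
    // Hypothetical helper wrapping the idiom from the text.
    static String[] toStrings(Collection<String> values) {
        return values.toArray(new String[0]);
    }

    public static void main(String[] args) {
        List<String> notifTypes = new ArrayList<>(Arrays.asList("email", "sms", "push"));

        String[] empty = new String[0];
        String[] result = notifTypes.toArray(empty);
        System.out.println(result.length);   // 3
        System.out.println(result == empty); // false: a second array was allocated

        // With an empty collection the zero-length argument fits, so it is returned as-is.
        String[] none = new ArrayList<String>().toArray(empty);
        System.out.println(none == empty);   // true
    }
}
```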
3. Pre-Sizing the Array: toArray(new Type[collection.size()])
return v.toArray(new String[v.size()]); // no cast needed when v is a Collection<String>
This method eliminates the need for toArray() to internally create a new array, enhancing performance for collections with known sizes.
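As a small illustration, assuming a TreeSet as the source collection and hypothetical names throughout, the following sketch shows that a correctly sized argument is filled in place and handed back, so only one array is allocated.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Set;
import java.util.TreeSet;

class PreSizedToArrayDemo {
    // Hypothetical helper wrapping the pre-sized idiom from the text.
    static String[] toStrings(Collection<String> v) {
        return v.toArray(new String[v.size()]);
    }

    public static void main(String[] args) {
        Set<String> v = new TreeSet<>(Arrays.asList("b", "a", "c"));

        String[] target = new String[v.size()];
        String[] result = v.toArray(target);
        System.out.println(result == target);        // true: the argument itself was filled
        System.out.println(Arrays.toString(result)); // [a, b, c]
    }
}
```

One thing to keep in mind: size() and toArray() are two separate calls, so this idiom assumes the collection is not modified by another thread in between.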
4. Using a Constant Empty Array
private static final String[] NO_STRINGS = {};
// later
return s.toArray(NO_STRINGS);
This approach minimises array creation when the collection is empty but may introduce reflection overhead when elements are present.
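A minimal sketch of the shared-constant approach follows, with a hypothetical wrapper class around the NO_STRINGS constant from the snippet above. Sharing the constant is safe because a zero-length array has no elements that a caller could overwrite.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;

class ConstantEmptyArrayDemo {
    static final String[] NO_STRINGS = {};

    // Hypothetical helper: always passes the shared constant as the type token.
    static String[] toStrings(Collection<String> s) {
        return s.toArray(NO_STRINGS);
    }

    public static void main(String[] args) {
        // Empty collection: the shared constant itself comes back, nothing is allocated.
        System.out.println(toStrings(new ArrayList<>()) == NO_STRINGS); // true

        // Non-empty collection: toArray() allocates a correctly sized String[] internally.
        String[] two = toStrings(Arrays.asList("x", "y"));
        System.out.println(two.length); // 2
    }
}
```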
5. Attempt to Get the Best of Both Worlds
private static final String[] NO_STRINGS = {};
// later
return s.isEmpty() ? NO_STRINGS : s.toArray(new String[s.size()]);
This way, the shared empty array is reused whenever there are no results, and an array of the correct size and type is allocated when the collection holds one or more elements.
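Wrapped in a runnable form, the combined approach might look like the sketch below. The class and method names are hypothetical, and a HashSet is used only as an example source collection.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

class BestOfBothDemo {
    static final String[] NO_STRINGS = {};

    // Hypothetical helper: shared constant for empty input, pre-sized array otherwise.
    static String[] toStrings(Collection<String> s) {
        return s.isEmpty() ? NO_STRINGS : s.toArray(new String[s.size()]);
    }

    public static void main(String[] args) {
        Set<String> results = new HashSet<>();
        System.out.println(toStrings(results) == NO_STRINGS); // true

        results.addAll(Arrays.asList("alpha", "beta"));
        String[] array = toStrings(results);
        System.out.println(array.length); // 2
    }
}
```

The extra isEmpty() branch is the price of this variant, which is why it is worth measuring rather than assuming it wins.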
The Benchmark
To evaluate these methods, I conducted a benchmark using JMH (Java Microbenchmark Harness), available here.
Collections Tested
- ArrayList: Sizes of 0, 3, 7, and 16 elements.
- HashSet and TreeSet: Created from the same elements as the ArrayLists.
Benchmark Configuration
- Warmup: 2 iterations, 1 second each.
- Measurement: 3 iterations, 10 seconds each.
- Threads: Configurable via -Dthreads, defaulting to 8.
- Forks: 7 separate JVM instances for accurate results.
Results and Analysis
The benchmark results on an 8-core Ryzen 5950X were illuminating:
- Throughput: Between 210 million and 450 million operations per second.
- Margin of Error: Approximately 15 million ops/sec for HashSet and ArrayList, and about 40 million ops/sec for TreeSet.
Practical Recommendations
Based on the results:
- Avoid toArray(new Type[0]): It introduces unnecessary overhead without significant benefits.
- Leverage Constant Empty Arrays When Appropriate: If collections are frequently empty, reusing a constant can save resources.
- Or Use Pre-Sized Arrays: toArray(new Type[collection.size()]) is efficient and straightforward.
Conclusion
Avoid using Collection.toArray(new Type[0]) if you can. It’s probably not worth changing your code for, but if you use another approach, go with whatever you consider simplest. For me, that means using the NO_STRINGS constant.
What details about the benchmark would you like to see covered in the comments or a follow-up post?
Have you faced performance issues with the toArray() methods? How did you tackle them? Share your experiences and join the discussion!
About the author
As the CEO of Chronicle Software, Peter Lawrey leads the development of cutting-edge, low-latency solutions trusted by 8 out of the top 11 global investment banks. With decades of experience in the financial technology sector, he specialises in delivering ultra-efficient enabling technology which empowers businesses to handle massive volumes of data with unparalleled speed and reliability. Peter's deep technical expertise and passion for sharing knowledge have established him as a thought leader and mentor in the Java and FinTech communities. Follow Peter on BlueSky or Mastodon
You recommend to avoid using Collection.toArray(new Type[0]) . But Aleksey Shipilёv in this article: https://shipilev.net/blog/2016/arrays-wisdom-ancients/ says: "..toArray(new T[0]) seems faster, safer, and contractually cleaner, and therefore should be the default choice now. "
What do you think?