Posts

Showing posts from 2024

Storing 1 TB in Virtual Memory on a 64 GB Machine with Chronicle Queue

As Java developers, we often face the challenge of handling very large datasets within the constraints of the Java Virtual Machine (JVM). When the heap size grows significantly, often beyond 32 GB, garbage collection (GC) pause times can escalate, leading to performance degradation. This article explores how Chronicle Queue enables the storage and efficient access of a 1 TB dataset on a machine with only 64 GB of RAM.

The Challenge of Large Heap Sizes

Using standard JVMs like Oracle HotSpot or OpenJDK, increasing the heap size to accommodate large datasets can result in longer GC pauses. These pauses occur because the garbage collector requires more time to manage the larger heap, which can negatively impact application responsiveness. One solution is to use a concurrent garbage collector, such as the one provided by Azul Zing, designed to handle larger heap sizes while reducing GC pause times. However, this approach may only scale well when the dataset is within the available main ...
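The excerpt is cut off before it describes the mechanism, so the following is only a rough sketch of the general idea of keeping data off the heap in a memory-mapped file. It uses the JDK's own `FileChannel.map` rather than Chronicle Queue's API, and the class name, temp file, and 1 MiB region size are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class MappedFileSketch {
    /** Writes a long into a memory-mapped region of a scratch file and reads it back. */
    static long writeAndReadBack(long value) {
        try {
            Path file = Files.createTempFile("mapped", ".dat"); // hypothetical scratch file
            file.toFile().deleteOnExit();
            try (FileChannel channel = FileChannel.open(file,
                    StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                // The mapped region lives outside the Java heap; the OS pages it in and out on demand.
                MappedByteBuffer region = channel.map(FileChannel.MapMode.READ_WRITE, 0, 1 << 20);
                region.putLong(0, value);
                return region.getLong(0);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Read back: " + writeAndReadBack(42L));
    }
}
```

Because the garbage collector does not scan the mapped region, the amount of data stored this way does not add to the heap the collector has to manage.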

Unveiling Floating-Point Modulus Surprises in Java

When working with double in Java, floating-point representation errors can accumulate, leading to unexpected behaviour, especially when using the modulus operator. In this article, we'll explore how these errors manifest and why they can cause loops to terminate earlier than anticipated.

The Unexpected Loop Termination

Consider the following loop:

    Set<Double> set = new HashSet<>();
    for (int i = 0; set.size() < 1000; i++) {
        double d = i / 10.0;
        double mod = d % 0.1;
        if (set.add(mod)) {
            System.out.printf("i: %,d / 10.0 = %s, with %% 0.1 = %s%n", i, new BigDecimal(d), new BigDecimal(mod));
        }
    }

At first glance, this loop should run indefinitely. After all, the modulus of d % 0.1 for multiples of 0.1 should always be zero, right? Surprisingly, this loop completes after 2,243 iterations, having collected 1,000 unique modulus values. How is this possible? The full code is available on GitHub.

Understanding Flo...
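A smaller experiment may help show where the non-zero remainders come from. This is a minimal sketch (the class and method names are made up for illustration): it prints the exact binary value stored for the literal 0.1 and the remainder of a single division.

```java
import java.math.BigDecimal;

final class ModulusDemo {
    /** Remainder of 1.0 divided by the double nearest to 0.1. */
    static double remainderOfOne() {
        return 1.0 % 0.1;
    }

    public static void main(String[] args) {
        // new BigDecimal(double) exposes the exact stored value, not the rounded decimal text.
        System.out.println("0.1 is stored as " + new BigDecimal(0.1));
        System.out.println("1.0 % 0.1 = " + remainderOfOne());
    }
}
```

The stored value of 0.1 is slightly larger than one tenth, so it fits into 1.0 only nine whole times and the remainder comes out just under 0.1 rather than zero.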

Thread Safety Issues with Vector and Hashtable

Considered legacy since Java 1.2 (1998), Vector and Hashtable remain prevalent in many Java applications. A common belief is that their synchronised methods render them inherently thread-safe. However, this assumption can lead to subtle concurrency problems that are often overlooked. Have you encountered unexpected exceptions when iterating over a Vector in a multi-threaded environment? Let’s delve into why this happens and how to address it.

The Misconception of Thread Safety

Both Vector and Hashtable synchronise individual method calls, leading many developers to assume that these classes are safe to use concurrently without additional synchronisation. While each method is thread-safe, combining multiple method calls can introduce race conditions if not properly managed.

Are Iterators Thread-Safe?

The Iterator is not thread-safe for most collections, including Vector. Iterators are designed to fail fast by throwing a ConcurrentModificationException when t...
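The fail-fast behaviour can be reproduced without any threads, which makes it easier to see. The sketch below (class and method names are illustrative) modifies a Vector while iterating over it, and then shows one common remedy: holding the Vector's own monitor for the whole traversal. Whether the original article recommends this particular remedy is not visible in the excerpt.

```java
import java.util.ConcurrentModificationException;
import java.util.Vector;

final class VectorIterationDemo {
    /** Returns true if changing the vector mid-iteration trips the fail-fast check. */
    static boolean failsFast() {
        Vector<Integer> vector = new Vector<>();
        vector.add(1);
        vector.add(2);
        try {
            for (Integer value : vector) {
                vector.add(value + 10); // structural change while an iterator is live
            }
            return false;
        } catch (ConcurrentModificationException expected) {
            return true;
        }
    }

    /** Holds the vector's lock for the whole loop, so other threads cannot modify it meanwhile. */
    static int sumUnderLock(Vector<Integer> vector) {
        synchronized (vector) {
            int sum = 0;
            for (int value : vector) {
                sum += value;
            }
            return sum;
        }
    }

    public static void main(String[] args) {
        System.out.println("Fail-fast triggered: " + failsFast());
        Vector<Integer> vector = new Vector<>();
        vector.add(3);
        vector.add(4);
        System.out.println("Sum under lock: " + sumUnderLock(vector));
    }
}
```

Locking on the Vector itself works because its synchronised methods use the same monitor; a separate lock object would not block callers of `add` or `remove`.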

Exceptional Exception, StackTrace extends Throwable

Exploring Surprising Properties of Extending Throwable in Java

In Java, most developers are familiar with extending Exception or Error to create custom exceptions. However, directly extending Throwable can lead to surprising and potentially useful behaviours. In this article, we'll delve into the nuances of extending Throwable and explore practical applications that can enhance debugging and monitoring in Java applications. The example code is available here.

Extending Throwable

At first glance, extending Throwable might seem unusual. Unlike Exception, which is checked, or Error, which is unchecked, Throwable itself can be extended to create a new checked throwable that is neither an exception nor an error.

    public class MyThrowable extends Throwable { }

    public static void main(String... args) throws MyThrowable {
        throw new MyThrowable(); // Must be declared or caught
    }

In this example, MyThrowable is a checked throwable, and the compiler enforces that it mu...

StringBuffer is Dead, Long Live StringBuffer

When Java 5.0 was released on 30th September 2004, it introduced StringBuilder as a replacement for StringBuffer in cases where thread safety isn't required. The idea was simple: if you're manipulating strings within a single thread, StringBuilder offers a faster, unsynchronized alternative to StringBuffer. This is an updated article from 2011.

From the Javadoc for StringBuilder: "This class provides an API compatible with StringBuffer, but with no guarantee of synchronization. This class is designed for use as a drop-in replacement for StringBuffer in places where the string buffer was being used by a single thread (as is generally the case). Where possible, it is recommended that this class be used in preference to StringBuffer as it will be faster under most implementations."

Is StringBuffer Really Dead?

You might think that StringBuffer has become redundant, given that most single-threaded scenarios can use StringBuilder, and thread safety often require...

What can make Java code go faster, and then slower?

It's well-known that the JVM optimises code during execution, resulting in faster performance over time. However, less commonly understood is how operations performed before a code section can negatively impact its execution speed. In this post, I'll use practical examples to explore how warming up and cooling down code affects performance. The code is available here for you to run.

Warming Up Code

When code is executed repeatedly, the JVM optimises performance. Consider the following code snippet:

    int[] display = {0, 1, 10, 100, 1_000, 10_000, 20_000, 100_001};
    for (int i = 0; i <= display[display.length - 1]; i++) {
        long start = System.nanoTime();
        doTask();
        long time = System.nanoTime() - start;
        if (Arrays.binarySearch(display, i) >= 0)
            System.out.printf("%,d: Took %,d us to serialise/deserialise GregorianCalendar%n", i, time / 1_000);
    }

This code measures the time taken to execute doTask() over multiple iterations, printing...

Overly Long Class Names in Java or Geeky Poem?

In Java development, clear and concise naming conventions are essential for code readability and maintainability. However, sometimes, we stumble upon class names that stretch the limits of practicality. One such example is InternalFrameTitlePaneMaximizeButtonWindowNotFocusedState. But did you know that in Java 6, this class name was even longer?

Within the Java 6 JRE, there's a class with an astonishingly lengthy name:

    com.sun.java.swing.plaf.nimbus.InternalFrameInternalFrameTitlePaneInternalFrameTitlePaneMaximizeButtonWindowNotFocusedState

This mouthful appears to be the product of a code generator that needed to be reviewed, leading to redundant and cumbersome naming. Or is it a geeky poem buried in the code?

    InternalFrame InternalFrame Title Pane,
    Internal Frame Title Pane.
    Maximize Button Window,
    Not Focused State.

The moral of the story is: always check the readability/sanity of generated code. In this Hacker News Discussion another class was also consid...

Unexpected Full GCs Triggered by RMI in Latency-Sensitive Applications

We observed an unexpected increase in Full Garbage Collections (Full GCs) while optimising a latency-sensitive application with minimal object creation. Despite reducing the frequency of minor GCs to enhance performance, the system began to exhibit hourly periodic pauses due to Full GCs, which was counterintuitive.

Investigating the Source of Full GCs

Upon closer examination, we discovered that the Java Remote Method Invocation (RMI) system was initiating Full GCs every hour. Specifically, the RMI Distributed Garbage Collector (DGC) checks if a GC has occurred in the last hour and, if not, forces a Full GC. This behaviour occurs even if the application does not actively use RMI, leading to unnecessary performance overhead.

Understanding RMI's Impact on Garbage Collection

The RMI DGC periodically collects garbage to clean up unused remote objects. By default, it is configured to trigger a Full GC if none has occurred within a specified interval (defaulting to one hour). This me...

How SLOW can you read/write files in Java?

A common question on Stack Overflow is: Why is reading/writing from a file in Java so slow? What is the fastest way? The discussion often revolves around comparing NIO versus IO. However, the bottleneck is usually not the read/write operations themselves, and the specific approach often has little significance in the bigger picture. To demonstrate, I’ll show one of the simplest (and perhaps slowest) ways to read/write text, using PrintWriter and Files.lines(Path). The code is available here. While it’s slower than writing binary using NIO or IO, it’s fast enough for most typical use cases.

Example Output

The program on a Ryzen 5950X running Linux outputs:

    Run 1, Write speed: 0.900 GB/sec, read speed 0.832 GB/sec
    Run 2, Write speed: 0.918 GB/sec, read speed 1.208 GB/sec
    Run 3, Write speed: 0.933 GB/sec, read speed 1.197 GB/sec

If you find that 900 MB/s is more than fast enough for your application, the specific method of reading/wri...
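For readers who have not used the two APIs named above together, here is a minimal sketch of a text round trip with PrintWriter and Files.lines(Path). It is not the author's benchmark (that code is linked from the article); the class name, temp file, and line contents are assumptions, and no timing is attempted.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

final class TextFileRoundTrip {
    /** Writes lineCount lines with PrintWriter, then counts them back with Files.lines. */
    static long writeThenCount(int lineCount) {
        try {
            Path file = Files.createTempFile("round-trip", ".txt"); // hypothetical scratch file
            try {
                try (PrintWriter out = new PrintWriter(Files.newBufferedWriter(file))) {
                    for (int i = 0; i < lineCount; i++) {
                        out.println("line " + i);
                    }
                }
                // Files.lines is lazy and keeps the file open, so the stream must be closed.
                try (Stream<String> lines = Files.lines(file)) {
                    return lines.count();
                }
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Lines read back: " + writeThenCount(1_000));
    }
}
```

To turn this into a rough throughput measurement, wrap each phase in System.nanoTime() calls and repeat the run several times, since the first run includes JVM warm-up.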

Advanced Applications of Dynamic Code in Java

Dynamic code compilation and execution in Java offer powerful capabilities that can enhance application flexibility and performance. Back in 2008, I developed a library called Essence JCF, which has since evolved into the Java Runtime Compiler. Initially, its purpose was to load configuration files written in Java instead of traditional XML or properties files. A key advantage of this library is its ability to load classes into the current class loader, allowing immediate use of interfaces or classes without the need for reflection or additional class loaders.

Why Use Dynamic Code Compilation?

While dynamic code compilation didn't initially solve a pressing problem, over time, several practical use cases have emerged where it proves particularly beneficial:

1. Objects in Direct Memory

By generating code dynamically, you can build data stores from interfaces that are either row-based or column-based, stored in the heap or direct memory. This approach reduces the number ...

Two Overlooked Uses of Enums in Java

Enums in Java are commonly used to represent a fixed set of constants. However, they offer more versatility than is often realized. In this article, we'll explore two practical yet often overlooked uses of enums: creating utility classes and implementing singletons.

1. Using Enums as Utility Classes

Utility classes contain static methods and are not meant to be instantiated. A typical approach is to define a class with a private constructor to prevent instantiation. Enums provide a more straightforward way to achieve this by leveraging their inherent characteristics. Here's how you can define a utility class using an enum:

    public enum MyUtils {
        ;

        public static String process(String text) {
            // Your utility method implementation
            return text.trim().toLowerCase();
        }
    }

By declaring an enum with no instances (note the semicolon after the enum name), you prevent instantiation naturally. This approach simplifies the code and clearly indicates tha...
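The second use mentioned in the introduction, singletons, is cut off in this excerpt, so here is a brief sketch of the usual shape of an enum singleton. The enum name and its counter are invented for illustration and are not taken from the article.

```java
import java.util.concurrent.atomic.AtomicLong;

enum IdGenerator {
    INSTANCE; // the only instance, created once when the enum class is initialised

    private final AtomicLong next = new AtomicLong();

    /** Returns a new identifier; safe to call from multiple threads. */
    long nextId() {
        return next.incrementAndGet();
    }

    public static void main(String[] args) {
        System.out.println(IdGenerator.INSTANCE.nextId());
        System.out.println(IdGenerator.INSTANCE.nextId());
    }
}
```

Compared with a hand-written singleton, the JVM guarantees a single instance here, including across serialisation, and no extra instance can be created because an enum's constructor cannot be called from outside the enum.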

How to avoid using a Triple Cast

OMG: Using a Triple Cast

We've all faced situations where a seemingly simple task spirals into unexpected complexity. In 2010, I encountered such a scenario and ended up writing a triple cast! 😅

The challenge: I needed a method that would return the default value for a given type. Here's the first approach I wrote at the time:

    public static <T> T defaultValue(Class<T> clazz) {
        if (clazz == byte.class)
            return (T) (Byte) (byte) 0;
        // Other primitive types handled...
        return null;
    }

Yes, this is casting madness! Let’s break it down:

- (byte) 0: Initializes the default value for the byte primitive type.
- (Byte): Wraps the primitive into its wrapper type, Byte.
- (T): Casts it to the generic type T.

While functional, this approach is overly verbose, difficult to read, and frankly, not very elegant. So, I decided to refactor it into something cleaner and more effici...
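The excerpt ends before the author's refactoring is shown, so the following is not that solution. It is one possible alternative, sketched under the assumption that a reflective lookup is acceptable: a freshly created one-element array already holds the default value for its component type, so reading element 0 yields the default with a single unchecked cast. The class name is hypothetical.

```java
import java.lang.reflect.Array;

final class Defaults {
    /** Returns the default value for a type: 0, false, or '\0' for primitives, null otherwise. */
    @SuppressWarnings("unchecked")
    static <T> T defaultValue(Class<T> clazz) {
        if (clazz == void.class) {
            return null; // arrays of void cannot be created
        }
        // New array elements start at the type's default; Array.get boxes primitive elements.
        return (T) Array.get(Array.newInstance(clazz, 1), 0);
    }

    public static void main(String[] args) {
        System.out.println("byte:    " + defaultValue(byte.class));
        System.out.println("boolean: " + defaultValue(boolean.class));
        System.out.println("String:  " + defaultValue(String.class));
    }
}
```

This trades the per-type branches for a small allocation on each call; if the method is called often, caching the results in a map keyed by class would avoid that cost.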

Calculating an Average Without Overflow: Rounding Methods

Calculating the midpoint between two integers may seem trivial, but the naive approach can lead to overflow errors. Code sample MidpointCalculator available here.

The classic midpoint formula:

    int m = (h + l) / 2;

is prone to overflow if h and l are large, causing the result to be incorrect. This bug appears in many algorithms, including binary search implementations.

Understanding the Problem of Overflow

In Java, the int type has a fixed range from -2,147,483,648 to 2,147,483,647. If h and l are large, their sum might exceed this range, leading to overflow. When overflow occurs, Java wraps the result around to the negative range without warning, causing incorrect results.

Safer Approaches to Calculate a Midpoint

Using a Safer Formula

A well-known alternative to avoid overflow is:

    int m = l + (h - l) / 2;

Here, we compute the difference (h - l) before dividing by 2, ensuring ...
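To make the overflow concrete, the sketch below runs both formulas from the text on two values near Integer.MAX_VALUE. It assumes 0 <= l <= h, as in a binary search over array indices. The third method, an unsigned shift, is a further commonly used variant added here for comparison and may or may not be among the methods the article goes on to cover.

```java
final class MidpointDemo {
    /** The classic formula: h + l can exceed the int range and wrap around. */
    static int naiveMidpoint(int l, int h) {
        return (h + l) / 2;
    }

    /** The safer formula from the text: h - l cannot overflow when 0 <= l <= h. */
    static int safeMidpoint(int l, int h) {
        return l + (h - l) / 2;
    }

    /** Treats the wrapped sum as unsigned; also assumes both inputs are non-negative. */
    static int shiftMidpoint(int l, int h) {
        return (l + h) >>> 1;
    }

    public static void main(String[] args) {
        int l = Integer.MAX_VALUE - 2;
        int h = Integer.MAX_VALUE;
        System.out.println("naive: " + naiveMidpoint(l, h));
        System.out.println("safe:  " + safeMidpoint(l, h));
        System.out.println("shift: " + shiftMidpoint(l, h));
    }
}
```

With these inputs the sum wraps to -4, so the naive formula returns -2, while the other two return 2,147,483,646.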

Why double Still Outperforms BigDecimal: A Decade-Long Performance Comparison

Overview

Many developers consider BigDecimal the go-to solution for handling money in Java. They often claim that by replacing double with BigDecimal, they have fixed one or more bugs in their applications. However, I find this reasoning unconvincing. It's possible that the issue lies not with double, but rather with the way it was being handled. Additionally, BigDecimal introduces significant overhead that may not justify its use. When asked to improve the performance of a financial application, I know that if BigDecimal is involved, it will eventually need to be removed. While it may not be the largest performance bottleneck initially, as we optimize the system, BigDecimal often becomes one of the main culprits.

BigDecimal is not an improvement

BigDecimal comes with several drawbacks. Here's a quick list of some of its key issues:

- It has an unnatural syntax.
- It uses more memory.
- It creates more garbage (i.e., it causes more frequent garbage colle...
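As an illustration of what "the way it was being handled" can mean in practice, here is a small sketch that rounds a double result to two decimal places and compares it with the BigDecimal equivalent. The helper name and the choice of two decimal places are assumptions for this example, not taken from the article, and the helper assumes values small enough that scaling by 100 stays well within range.

```java
import java.math.BigDecimal;

final class MoneyRounding {
    /** Rounds to two decimal places using plain double arithmetic. */
    static double roundToCents(double value) {
        return Math.round(value * 100.0) / 100.0;
    }

    public static void main(String[] args) {
        double sum = 0.1 + 0.2;
        System.out.println("raw double sum:     " + sum);
        System.out.println("rounded double sum: " + roundToCents(sum));
        // The syntax cost of BigDecimal is visible even in a one-line addition.
        System.out.println("BigDecimal sum:     " + new BigDecimal("0.1").add(new BigDecimal("0.2")));
    }
}
```

The raw sum carries a representation error in its last digit; rounding the result to the precision the application actually needs removes it.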

Uncomparable Puzzles in Java

Here are a few puzzles for you to solve in Java. The source is available here.

Try running the following code to reproduce the output below. See if you can work out why these results occur:

    long a = (1L << 54) + 1;
    double b = a;
    System.out.println("b == a is " + (b == a));
    System.out.println("(long) b < a is " + ((long) b < a));

    double c = 1e19;
    long d = 0;
    d += c;
    System.out.println("\nd < c is " + (d < c));
    System.out.println("d < (long) c is " + (d < (long) c));

    Double e = 0.0;
    Double f = 0.0;
    System.out.println("\ne <= f is " + (e <= f));
    System.out.println("e >= f is " + (e >= f));
    System.out.println("e == f is " + (e == f));

    BigDecimal x = new BigDecimal("0.0");
    BigDecimal y = BigDecimal.ZERO;
    System.out.println("\nx == y is " + (x == y));
    System.out.println("x.doubleValue() == y.doubleValue() is " + (x.doubleValue() == y.doub...

Ten Java Myths and Misconceptions

Advanced Java Questions

These questions delve into Java's more intricate behaviors and are often too advanced for typical interviews, as they might be discouraging for candidates. However, they are excellent for deepening your understanding of Java's core workings in your own time.

Myth 1: System.exit(0) Prevents finally Block Execution

Consider the following code:

    // many frameworks have a SecurityManager
    System.setSecurityManager(new SecurityManager() {
        @Override
        public void checkExit(int status) {
            throw new ThreadDeath();
        }
    });

    try {
        System.exit(0);
    } finally {
        System.out.println("In the finally block");
    }

This code will output:

    In the finally block

Explanation: The System.exit(0) call triggers the checkExit method in the custom SecurityManager. Because checkExit throws a ThreadDeath instead of letting the JVM terminate, the finally block is allowed to execute, explaining the "In the finally block" output. Since ThreadDea...