How *slow* can you read/write files in Java?

A common question on Stack Overflow is: why is reading/writing a file in Java so slow, and what is the fastest way?

The discussion often compares NIO with IO. However, the reading/writing itself is usually not the problem, and the comparison has little importance.

To demonstrate this, I am going to show one of the simplest/slowest ways to read/write text. It is slower than writing binary with NIO or IO, but IMHO it is fast enough for most use cases.

The program below prints the following output.
Wrote 101 MB/s.
Read 109 MB/s.

If 100 MB/s is fast enough, it really shouldn't matter which way you read/write data to disk.

Note: For very large files, this result depends on the speed and configuration of your disks. In that case, you need to look at your hardware; don't blame your software.
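
// needs: import java.io.*; import java.util.Arrays;

// generate a string full of A's
char[] chars = new char[40];
Arrays.fill(chars, 'A');
String text = new String(chars);

final File file = new File("/tmp/a.txt");

// write the text line by line and time it
long start = System.nanoTime();
PrintWriter pw = new PrintWriter(new FileWriter(file));
// NOTE: the loop bound was truncated after "i<2" in this copy; 2 million lines is an assumption
for (int i = 0; i < 2 * 1000 * 1000; i++)
    pw.println(text);
pw.close();
long time = System.nanoTime() - start;
// bytes * 1000 / nanoseconds = MB/s
System.out.println("Wrote " + file.length() * 1000L / time + " MB/s.");

// read the file back line by line and time it
long start2 = System.nanoTime();
BufferedReader br = new BufferedReader(new FileReader(file));
while (br.readLine() != null) ;
br.close();
long time2 = System.nanoTime() - start2;
System.out.println("Read " + file.length() * 1000L / time2 + " MB/s.");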
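
For comparison, here is a rough sketch of the same measurement using the NIO `Files` helpers instead of `FileWriter`/`FileReader`. It is not part of the original benchmark: the class name `NioTextBenchmark`, the helper methods, the temp-file location and the 2 million line count are all assumptions made for illustration. If the point above holds, the numbers it prints should be in the same ballpark on the same machine.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

public class NioTextBenchmark {
    /** Writes 'lines' copies of 'text', one per line, and returns the elapsed nanoseconds. */
    public static long writeLines(Path path, String text, int lines) throws IOException {
        long start = System.nanoTime();
        try (BufferedWriter out = Files.newBufferedWriter(path, StandardCharsets.US_ASCII)) {
            for (int i = 0; i < lines; i++) {
                out.write(text);
                out.write('\n');
            }
        }
        return System.nanoTime() - start;
    }

    /** Reads the file line by line and returns the number of lines seen. */
    public static int countLines(Path path) throws IOException {
        int count = 0;
        try (BufferedReader in = Files.newBufferedReader(path, StandardCharsets.US_ASCII)) {
            while (in.readLine() != null) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        char[] chars = new char[40];
        Arrays.fill(chars, 'A');
        String text = new String(chars);
        int lines = 2_000_000; // assumed line count, not taken from the original post

        Path path = Files.createTempFile("nio-bench", ".txt");
        try {
            long writeTime = writeLines(path, text, lines);
            long size = Files.size(path);
            // bytes * 1000 / nanoseconds = MB/s
            System.out.println("Wrote " + size * 1000L / writeTime + " MB/s.");

            long start = System.nanoTime();
            int read = countLines(path);
            long readTime = System.nanoTime() - start;
            System.out.println("Read " + size * 1000L / readTime + " MB/s (" + read + " lines).");
        } finally {
            Files.deleteIfExists(path);
        }
    }
}
```

As with the original, the result for files that fit in the OS cache says more about the JVM and the OS than about the disk.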

Comments

  1. Can you post the hardware specs for this benchmark? Drives, any RAID, etc...

  2. AMD 4800+ processor, 6 GB, Ubuntu 10.10, a 1 TB SATA drive.

  3. I'm only getting about 40 MB/s, but my computer is pretty decent, so I think I must have some configuration or setup problem. Can you possibly point me in the right direction?

    For more context, I'm using NetBeans to develop a Java simulation. On a MUCH lower-specced 6-year-old computer, the exact same project runs quite smoothly, but on an almost new computer, the simulation runs incredibly slowly. The problem is one line of file output: if I comment it out, the simulation runs smoothly again. This one line does execute 100 times for every step in the simulation, though.

  4. You need to determine if the problem is the rate you can perform IO or the rate you can process that data.

    I suggest you try the code above to see how fast your computer reads/writes without doing anything else.

    If this is slow, you have an IO issue and you need to look at your hardware. If this is fast but your application is slow, you have a problem with what you do with the data. In this case I would suggest you use a profiler to look at how you are using CPU and memory in your application.

  5. Wow, thanks so much for the reply. I know this is not supposed to be a help forum, but I think my problem is simple, yet I can't figure it out. I've tried updating all the Java stuff too; I just feel like there's some Windows 7 quad-core configuration that I don't have.

    On all the computers I've tried but 1 (which is the fastest with quad core 64bit OS and the one I want to use), the application does not lag. I'm not really processing information, and that doesn't seem to be the source of the slow down. I'm not doing anything with file data, just writing to a text file. I'm basically trying to log information continuously and the program writes this for every step:

    FileOutputStream fout = new FileOutputStream("filename.txt", true);
    PrintStream out = new PrintStream(fout);
    double currentStep = getSteps()+1;
    out.println(currentStep);
    out.close();

    Putting your code exactly where the above code runs, I get write speeds ranging from 60 to 90 MB/s, which is more than what the other computers are getting. But if I just comment out the one "out.println(currentStep)" line, then there is no lag and the application runs smoothly.

  6. I will admit you found a much slower way to write to a file than occurred to me. ;)

    Opening, flushing and closing files is expensive, as is not using any buffering.

    Can you try opening the file once and closing it when you shutdown the application or don't need it any more?

  7. Haha, yes, I'm quite a pro at inefficient coding... but just to point out, if I leave in every line but the "out.println(currentStep)" line, the simulation runs smoothly. Basically, it doesn't seem like it's the opening and closing itself but specifically the write-to-file. However, I don't know whether the write somehow makes the opening and closing more expensive.

    Ideally, I would've liked to open the file once, and close it once when the simulation is finished, as you state; however the collective code is very fragmented, written and controlled by different people, so the specific simulation I am running does not actually know when it's done (because it is never really finished). That probably doesn't make much sense, but the thing is, my exact same code/simulation seems to work on every other computer but the one that is twice as powerful... so I was just hoping for some possible insight in diagnostics. Given enough time, I will go look into changing the other code to open and write once and get back to you. For the time being, I've settled on using the older and slower computers.

    Thanks so much for your help thus far

  8. The OS can do various tricks to reduce the overhead of opening/closing files, so it doesn't surprise me that it may be efficient on one machine but not another.

  9. Just to update you: I took out the close command, and now it runs significantly more smoothly, although still not as smoothly as on the slower computers. I put in a flush command and it still runs okay. The only other thing I know the close function does is unlock the file and free up system resources. I guess there's something funny going on from writing the string and then closing. If I write but don't close, it's fine, and if I leave close in but don't write anything, it's also fine (more fine, in fact). Is that not odd, since closing should free up resources? Or is the act of closing just too much...

    Good thing Java's memory management is a lot more friendly than C or Obj-C. I guess I'll just have 100 agents opening and writing to the same file without closing until I figure out a way to get a single agent to do it.

  10. To make your code shorter, PrintWriter is already buffered if you instantiate it with a File argument.

  11. @mauhiz, Thank you. Just the sort of comment I would also make. It is also inefficient to have pointless buffering.

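
The advice in the thread above (open the file once and let a buffer absorb the small writes, rather than reopening the file in append mode for every value) can be sketched roughly as follows. This is not code from the original post or the commenter's project: the class name `StepLogComparison`, both method names, the temp files and the step count are hypothetical, and the timings it prints will vary by machine and OS.

```java
import java.io.BufferedWriter;
import java.io.FileOutputStream;
import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintStream;
import java.io.PrintWriter;
import java.nio.file.Files;
import java.nio.file.Path;

public class StepLogComparison {
    /** Reopens the file in append mode for every value, like the snippet in the thread. */
    public static long logReopening(Path path, int steps) throws IOException {
        long start = System.nanoTime();
        for (int i = 0; i < steps; i++) {
            try (PrintStream out = new PrintStream(new FileOutputStream(path.toFile(), true))) {
                out.println((double) (i + 1));
            }
        }
        return System.nanoTime() - start;
    }

    /** Opens the file once, writes through a buffer, and closes it at the end. */
    public static long logOnce(Path path, int steps) throws IOException {
        long start = System.nanoTime();
        try (PrintWriter out = new PrintWriter(
                new BufferedWriter(new FileWriter(path.toFile(), true)))) {
            for (int i = 0; i < steps; i++) {
                out.println((double) (i + 1));
            }
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) throws IOException {
        int steps = 10_000; // assumed number of log records, for illustration only
        Path reopened = Files.createTempFile("steps-reopen", ".txt");
        Path once = Files.createTempFile("steps-once", ".txt");
        try {
            System.out.println("Reopen per write: " + logReopening(reopened, steps) / 1_000_000 + " ms");
            System.out.println("Open once:        " + logOnce(once, steps) / 1_000_000 + " ms");
        } finally {
            Files.deleteIfExists(reopened);
            Files.deleteIfExists(once);
        }
    }
}
```

Both methods produce the same file contents; only the number of open/close calls differs. When the application never knows that it has finished, one option is to keep the writer open and call `flush()` periodically or from a shutdown hook.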
