The Java version of patenting the Wheel

Overview

In 2001, an Australian man successfully patented a round device to aid movement. See "Wheel patented in Australia".

More recently, a company patented using native memory as a way of storing data and objects. The patent is titled "OFF-HEAP DIRECT-MEMORY DATA STORES, METHODS OF CREATING AND/OR MANAGING OFF-HEAP DIRECT-MEMORY DATA STORES, AND/OR SYSTEMS INCLUDING OFF-HEAP DIRECT-MEMORY DATA STORE".

Here, "off-heap" is the Java term for direct or native memory.

"computer systems that leverage an off-heap direct-memory data store that is massively scalable and highly efficient"

It's such a good idea that you might wonder: has this ever been done before?

Well, I deployed a project called Yellow Pages Online in Australia in 2001, where the bulk of the data was in native memory. Why? Because the JVMs at that time could not pack and search the data as efficiently as code I could write in C++. The project used EJBs, but they were implemented in native methods. (Sounds horrible, and it was ;)

Where did I get the idea? Well, C++ had been using native memory to store unmanaged data for years, and this came from the C world, which gave you a high-level language that mimicked what the CPU does. That's right: CPUs have supported using native memory to store data in an unmanaged way for a very long time.


Perhaps it's a new idea for Java to use off heap memory to store data in an efficient manner?  

The JVM has always done this. The largest such region is Perm Gen, which stores code and data off heap in a memory-efficient manner.


Perhaps using ByteBuffers for storing long lived data is a new idea?  

IMHO, the only obvious use case for memory mapped files with ByteBuffers is storing long-lived data in an efficient and scalable way. Java added this support almost 10 years before the patent was dated, and applications had been using memory mapped files for decades before Java made them available.
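To make that concrete, here is a minimal sketch of keeping a long-lived value in a memory mapped file through a ByteBuffer. The class name `MappedCounter` and the one-long file layout are made up for illustration; only the standard `java.nio` calls are real.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical example class: a counter whose value lives in a file, not on the heap.
class MappedCounter {
    // Maps the first 8 bytes of the file, adds one to the long stored there,
    // and returns the new value. A new (empty) file is extended and reads as zero.
    static long increment(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.READ,
                StandardOpenOption.WRITE)) {
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_WRITE, 0, Long.BYTES);
            long next = buf.getLong(0) + 1;
            buf.putLong(0, next);
            buf.force(); // ask the OS to write the changed page back to the file
            return next;
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("counter", ".dat");
        file.toFile().deleteOnExit();
        System.out.println(increment(file)); // first call on an empty file
        System.out.println(increment(file)); // the value survived the first mapping
    }
}
```

Each call opens and maps the file afresh, so the second call only sees the first call's update because the data is held by the file and the OS page cache rather than by any Java object.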

But did anyone implement an open source solution like this before?

I did; it was last released in Aug 2011: https://code.google.com/p/vanilla-java/downloads/list

Is it completely old hat?

Section [0024] talks about "thief" and "victim" blocks of data, which is not something I have come across before. This could be new, or it could be that I don't know everything ;)

Grabbing all the free direct memory on startup [0027] is not a very good idea, and I would be surprised if other solutions thought it was. The problem is that once all the direct memory has been grabbed, no other library can use direct memory, including NIO Sockets, a built-in and widely used library. The patent goes on to search for the largest amount of memory it can allocate, which is pretty dumb given the JVM has a field for this which is available via reflection. Note: the field is also changeable via reflection, allowing you to ignore the limit or adjust it to suit your use case.
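The reason grabbing everything is antisocial is that direct memory is one pool shared by the whole JVM. As a rough sketch, the snippet below watches that pool through the public `BufferPoolMXBean` API. Note this is a swapped-in technique: it reports usage of the pool, it is not the reflection-accessible limit field mentioned above, and the class name `DirectPoolSketch` is made up for illustration.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

// Hypothetical example class: observes the JVM-wide pool of direct buffers.
class DirectPoolSketch {
    // Returns the bytes currently used by the "direct" buffer pool,
    // or -1 if this JVM does not expose a pool with that name.
    static long directMemoryUsed() {
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                return pool.getMemoryUsed();
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        long before = directMemoryUsed();
        ByteBuffer buf = ByteBuffer.allocateDirect(1 << 20); // 1 MB taken from the shared pool
        long after = directMemoryUsed();
        System.out.println("direct pool grew by " + (after - before)
                + " bytes for a buffer of capacity " + buf.capacity());
    }
}
```

Any library that calls `ByteBuffer.allocateDirect` draws from this same pool, which is why a data store that reserves all of it on startup leaves nothing for anyone else.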

Other notes

"There therefore oftentimes is a practical (and oftentimes recommended) 6 GB limit to Java heaps,"

Never heard of this suggestion. I have heard Azul suggest limits for heap sizes, but not this number. The "default" limit is close to 30 GB, i.e. the server JVM will give you a heap of 1/4 of your main memory, up to the limit for compressed oops, which is about 30 GB. E.g. if you have 256 GB of main memory, you will get a heap of about 30 GB.
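The arithmetic behind that default can be sketched in a few lines. This only models the rule of thumb as stated here: the 30 GB cap is the approximate figure from this post rather than an exact JVM constant, and `DefaultHeapSketch` and `estimatedDefaultHeap` are made-up names.

```java
// Hypothetical example class: models "1/4 of main memory, capped at about 30 GB".
class DefaultHeapSketch {
    static final long GB = 1L << 30;

    // Assumption: the approximate compressed-oops cap quoted in the post.
    static final long COMPRESSED_OOPS_CAP = 30 * GB;

    static long estimatedDefaultHeap(long physicalBytes) {
        return Math.min(physicalBytes / 4, COMPRESSED_OOPS_CAP);
    }

    public static void main(String[] args) {
        System.out.println(estimatedDefaultHeap(256 * GB) / GB + " GB"); // prints "30 GB"
        System.out.println(estimatedDefaultHeap(16 * GB) / GB + " GB");  // prints "4 GB"
        // What the JVM running this code actually picked, for comparison.
        System.out.println("actual max heap: "
                + Runtime.getRuntime().maxMemory() / (1L << 20) + " MB");
    }
}
```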

"The inventors of the instant application have realized that elements stored in a cache have a simple lifecycle that is serializable"

Who are the inventors of the instant application!? I googled "serializable life cycle" and Google has no idea what that means, and nor do I.

"Most people incorrectly think that collecting dead objects takes time, but it is the number of live objects that actually has the greatest effect on garbage collection performance"

Collecting dead objects only takes time if you don't use an instant application. ;)  "Greatest effect" implies that the rest has a lesser effect which contradicts the first part.

"Java ByteBuffers may be used for persisting data in off-heap storage in a less transient manner than otherwise would be expected"

My view is that memory mapped files are a less transient form of direct memory, i.e. their solution is more transient, not less.

"Eviction decisions in the off-heap store's cache implementation may be performed using a clock eviction scheme."

If you use memory mapped files, the OS will do this for you. Most OSes use a clock eviction scheme. The benefits of letting the OS do this are that it is aware of everything running on the machine, it is already written for you, it happens asynchronously, and finally the OS uses off-heap data and code to do it. ;)

IMHO, it's not too sane to rewrite all this in Java yourself, as I don't believe you will get it working as efficiently as letting the OS do the job for you. In that sense, it could be original, if dumb, but re-inventing the wheel is unfortunately nothing new in the software industry.

What I found surprising about the patent is how much of it reads like a whitepaper. It appears that someone wrote a whitepaper, including a business case for such a solution and a list of the problems it solves for you, and said: let's re-format this and turn it into a patent.

Summary

Assembly, C and C++ use unmanaged memory to store data and objects in a scalable and efficient way, and have done so for a very long time. Java added managed objects (again, not a new idea).

IMHO, it is natural to realise that managed data is not the best solution for all volumes of data, and that sometimes you have to manage the data yourself, e.g. by storing the bulk of your data in a database, or by using native memory like C++ and assembly do.
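For illustration only, this is roughly what managing the data yourself in native memory looks like from Java: sorted keys packed into a direct ByteBuffer and searched in place. It is a minimal sketch, not the code from any project mentioned here, and `OffHeapLongs` is a made-up name.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical example class: sorted long keys packed off heap, 8 bytes each.
class OffHeapLongs {
    private final ByteBuffer buf;
    private final int count;

    // Copies already-sorted keys into direct (off-heap) memory.
    OffHeapLongs(long[] sortedKeys) {
        count = sortedKeys.length;
        buf = ByteBuffer.allocateDirect(count * Long.BYTES).order(ByteOrder.nativeOrder());
        for (int i = 0; i < count; i++) {
            buf.putLong(i * Long.BYTES, sortedKeys[i]);
        }
    }

    // Binary search over the packed keys; returns the index, or -1 if absent.
    int indexOf(long key) {
        int lo = 0, hi = count - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            long v = buf.getLong(mid * Long.BYTES);
            if (v < key) lo = mid + 1;
            else if (v > key) hi = mid - 1;
            else return mid;
        }
        return -1;
    }

    public static void main(String[] args) {
        OffHeapLongs keys = new OffHeapLongs(new long[] {3, 8, 15, 42});
        System.out.println(keys.indexOf(15)); // prints 2
        System.out.println(keys.indexOf(7));  // prints -1
    }
}
```

However many keys are stored, the garbage collector only ever sees the one small `ByteBuffer` object.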

Reference

Patent Stackexchange discussion on this Patent.

Comments

  1. Isn't that something that real-time Java supports as well?

  2. @Sebastian, good point. From Wikipedia: "The RTSJ addressed the critical issues by mandating a minimum specification for the threading model (and allowing other models to be plugged into the VM) and by providing for areas of memory that are not subject to garbage collection, along with threads that are not preemptable by the garbage collector. These areas are instead managed using region-based memory management."

