Posts

Showing posts from September, 2011

HugeCollections moved to github

HugeCollections has been moved to GitHub to support easier forking and collaboration: https://github.com/HugeCollections/Collections It's a pain to set up, but once that is done it makes working on the code and merging changes much easier.

New Contributors to HugeCollections

What is the HugeCollections library?

The objectives of HugeCollections for the first release are fairly ambitious.

- Scale to massive collections, i.e. sizes much larger than 2 billion, without a significant heap footprint or GC impact, i.e. using heap-less memory.
- Be faster and more efficient than using plain JavaBeans with an ArrayList or Map in Java, or a vector or unordered_map in C++.
- Support durability (transparently saved to and loaded from disk).
- Support thread safety (with no overhead if not required) and use multiple threads implicitly, i.e. large operations are automatically distributed.
- Be faster and more efficient than using a database. Transactions will NOT be in this release.
- Support in one application what might otherwise have to be distributed.
- Generate code dynamically as required (no need to pre-generate code in the build).

A prototype has been built which shows these objectives are possible. However, turning this library into a usable release will take some help. Note

Forum for ideas

Someone recently asked me if I had a forum for suggesting articles to add. Does anyone know of a forum that works well with Blogger? In the meantime, please comment with your ideas for articles.

The Exchanger and GC-less Java

Overview

The Exchanger class is very efficient at passing work between threads and recycling the objects used. AFAIK, it is also one of the least used concurrency classes. As @Marksim Sipos points out, if you don't need GC-less logging, using an ArrayBlockingQueue is much simpler.

Exchanger class

The Exchanger class is useful for passing data back and forth between two threads, e.g. a producer and a consumer. It has the property of naturally recycling the data structures used to pass the work, and it supports GC-less sharing of work in an efficient manner. Here is an example, passing logs to a background logger. Work (a log entry) is batched into LogEntries and passed to a background thread, which later passes it back to the original thread so it can add more work. Provided the background thread always finishes before the next batch is full, it is almost transparent. Increasing the size of the batch reduces how often the batch is full but increases the number of unprocessed entries waiting at an
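The batching pattern described in this excerpt can be sketched roughly as follows. This is not the code from the original post (which is truncated here): the class name `ExchangerLogSketch`, the batch size of 4, and the use of an empty batch as a shutdown signal are all assumptions made for illustration. The sketch also uses Strings as log entries for brevity, so only the two batch lists are recycled; a truly GC-less version would have to reuse mutable entry objects as well.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Exchanger;

class ExchangerLogSketch {
    static final int BATCH_SIZE = 4; // hypothetical batch size

    /** Logs the given number of entries via a background thread and returns what it received. */
    static List<String> run(int entries) {
        Exchanger<List<String>> exchanger = new Exchanger<>();
        List<String> written = new ArrayList<>(); // stands in for the log file

        Thread logger = new Thread(() -> {
            List<String> received = new ArrayList<>(BATCH_SIZE); // starts empty
            try {
                while (true) {
                    // Hand back an empty batch, receive the next one to process.
                    received = exchanger.exchange(received);
                    if (received.isEmpty()) break; // assumed convention: empty batch means stop
                    written.addAll(received);
                    received.clear(); // recycle the list instead of allocating a new one
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "background-logger");
        logger.start();

        try {
            List<String> filling = new ArrayList<>(BATCH_SIZE);
            for (int i = 0; i < entries; i++) {
                filling.add("entry " + i);
                if (filling.size() == BATCH_SIZE) {
                    // Swap the full batch for the empty one the logger has finished with.
                    filling = exchanger.exchange(filling);
                }
            }
            if (!filling.isEmpty()) filling = exchanger.exchange(filling); // flush a partial batch
            exchanger.exchange(filling); // filling is empty here, so this signals shutdown
            logger.join(); // join() makes the logger's writes visible to this thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return written;
    }

    public static void main(String[] args) {
        System.out.println(run(10));
    }
}
```

Only two lists ever exist in this sketch, and they simply swap owners at each `exchange()` call, which is the recycling property the excerpt refers to.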

Is making Boost more like Java a good idea?

Overview

I was reading a StackOverflow question, "What is Boost missing?", and I was wondering how many of these features are available in Java already and whether making Boost more like Java is a good idea.

Top suggestions

| Suggestion | What is in Java |
| --- | --- |
| SQL support | JDBC |
| JSON | Requires an additional library like XStream. Possibly should be in core Java |
| Audio | Java has basic support. I don't know if it is much good. |
| Logging | Logger |
| Call stack, a standard API | Throwable.getStackTrace() and Thread.getStackTrace() |
| Support for archives | Supports ZIP and JAR |
| Redeveloped collections | The same could be said for Java Collections |
| Standard XML parsing for UTF-8 text | SAX (event driven) and DOM (Document Object Model) parsers for XML |
| Platform independent GUI | Swing |
| Concurrency with lock free collections and atomic operations | Concurrency library |
| Arbitrary precision floating point and decimal | BigDecimal and BigInteger |
| Python, Ruby and Lua support | Not built in, but there is Jython, JRuby and Lu

Memory alignment in C, C++ and Java

Overview

You might assume that reducing the size of a struct or class saves the same amount of memory. However, due to memory alignment in memory allocators, it can make no difference, or perhaps more difference than you might expect. This is because the amount of memory reserved is usually a multiple of the memory alignment. For Java and C this can be 8 or 16 bytes.

Size of memory reserved

These tests were performed in a 64-bit C program (gcc 4.5.2) and a 64-bit JVM (Oracle Java 7). In Java, direct memory is largely a wrapper for malloc and free.

| Bytes requested | C malloc() reserved | Java ByteBuffer.allocateDirect() |
| --- | --- | --- |
| 0 to 24 | 32 bytes | 32 bytes + a ByteBuffer |
| 25 to 40 | 48 bytes | 48 bytes + a ByteBuffer |
| 41 to 56 | 64 bytes | 64 bytes + a ByteBuffer |
| 57 to 72 | 80 bytes | 80 bytes + a ByteBuffer |

Constructing objects is a similar story.

| Number of fields | C class of int (heap/stack) | C class of void * (heap/stack) | Java class with int | Java class with Object references |
| --- | --- | --- | --- | --- |
| 1 | 32/16 bytes | 32/16 bytes | 16
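The malloc() figures quoted in this excerpt are consistent with a simple rounding rule, sketched below. The constants are assumptions chosen to reproduce those figures (an 8-byte bookkeeping header, 16-byte alignment and a 32-byte minimum, which is how a typical 64-bit glibc-style allocator behaves); they are not taken from the original post, and other allocators may differ. The class name `MallocSizeSketch` is hypothetical.

```java
class MallocSizeSketch {
    // Assumed allocator parameters, chosen to match the malloc() sizes quoted above.
    static final long HEADER = 8;     // per-allocation bookkeeping
    static final long ALIGN = 16;     // reserved sizes are multiples of this
    static final long MIN_CHUNK = 32; // smallest chunk handed out

    /** Estimates the bytes reserved for a request of the given size under the assumed model. */
    static long reserved(long requested) {
        long withHeader = requested + HEADER;
        long rounded = (withHeader + ALIGN - 1) / ALIGN * ALIGN; // round up to a multiple of ALIGN
        return Math.max(MIN_CHUNK, rounded);
    }

    public static void main(String[] args) {
        long[] sizes = {0, 24, 25, 40, 41, 56, 57, 72};
        for (long size : sizes) {
            System.out.println(size + " bytes requested -> " + reserved(size) + " bytes reserved");
        }
    }
}
```

Under this model, shrinking a 40-byte structure to 25 bytes saves nothing, while shrinking it from 25 to 24 bytes saves 16 bytes, which is the effect the overview describes.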

Why thread priority rarely matters

Overview

It is tempting to use the Thread.setPriority() option in Java. However, for many applications this is more a comment for the developer than something which will make a measurable difference, especially on multi-core systems.

Why it usually doesn't matter

If you have plenty of free CPU, every thread which can run will run. The OS has no reason not to run a low-priority thread or process when it has free resources. If your system is close to 100% CPU on every core, the OS has to choose how much time each thread or process gets on the CPU. It is likely to favour higher-priority threads over lower-priority ones (although many OSes ignore the hint), and other factors are likely to matter as well. This priority only extends to raw CPU. Threads compete equally for CPU cache, heap space, CPU-to-memory bandwidth, file cache, disk IO, network IO and everything else. If there is contention for any of these resources, all threads are treated equally. To set a high priority on Wi
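For reference, a minimal sketch of what setting a priority looks like in Java. The class and method names (`PrioritySketch`, `lowPriorityWorker`) and the thread name are hypothetical. The call only records a hint; as the excerpt explains, whether the OS scheduler acts on it depends on the platform and on how busy the CPUs are.

```java
class PrioritySketch {
    /** Creates (but does not start) a daemon thread marked as low priority. */
    static Thread lowPriorityWorker(Runnable task) {
        Thread t = new Thread(task, "background-worker");
        t.setPriority(Thread.MIN_PRIORITY); // a hint to the scheduler, not a guarantee
        t.setDaemon(true);                  // do not keep the JVM alive for this thread
        return t;
    }

    public static void main(String[] args) throws InterruptedException {
        Thread t = lowPriorityWorker(() ->
                System.out.println("running with priority " + Thread.currentThread().getPriority()));
        t.start();
        t.join();
    }
}
```

On an idle machine this thread runs as promptly as any other, which is why the setting often works more as documentation of intent than as a tuning knob.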

Order of elements in a hash collection

Overview

While it is generally understood that keys or elements in a HashMap or HashSet occur in a pseudo-random order, what is not obvious is that two collections with the same elements can be in different orders. This is because the capacity of the collection also determines the order in which the elements appear. This can be important if you have only an Iterator to a Set. It is easy to assume the order will be the same, and most of the time it will work (you can write unit tests with this assumption which will pass). However, if the Set has a different capacity (something you don't normally know or worry about), the order will change.

Find what order elements can appear in

The following test adds the same 11 elements to a HashSet, each time with a different collection, and prints out all the combinations it finds.

@Test public void testSetOrder() { Set<String> order = new HashSet<String>(); Collection<String> elements = Arrays.asList( &quo
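Independently of the truncated test above, the effect of capacity on iteration order can be sketched with a much smaller case. This is an illustrative assumption, not the original test: it uses three Integer elements (0, 16 and 1) rather than the 11 Strings mentioned in the post, because Integer hash codes make the bucket placement easy to follow, and the class name `HashOrderSketch` is hypothetical. The behaviour shown relies on the OpenJDK HashMap implementation, where the bucket index is derived from the hash code and the table size.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class HashOrderSketch {
    /** Adds the same three elements to a HashSet of the given capacity and returns the iteration order. */
    static List<Integer> iterationOrder(int initialCapacity) {
        Set<Integer> set = new HashSet<>(initialCapacity);
        // With 16 buckets, 0 and 16 share bucket 0; with 32 buckets, 16 gets a bucket of its own.
        set.addAll(Arrays.asList(0, 16, 1));
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        List<Integer> small = iterationOrder(16);
        List<Integer> large = iterationOrder(32);
        System.out.println("capacity 16: " + small);
        System.out.println("capacity 32: " + large);
        System.out.println("same elements: " + new HashSet<>(small).equals(new HashSet<>(large)));
        System.out.println("same order:    " + small.equals(large));
    }
}
```

The two sets compare as equal, yet code that iterates over them and assumes a particular order would behave differently depending on which one it was handed.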

The importance of innovation

Overview

In a recent article, "Java can be significantly faster than C", I wrote about how the way you do things can make a big difference. The aim was to show how you can use the same algorithm (the what), implemented with a different approach (the how), and make a significant improvement in performance.

What I have learnt

I now understand that the purpose of the web site is to compare languages and remove the developer from the equation by making "the how" as similar to what has been written before as possible. I see this is a legitimate approach to making a fair comparison of languages.

The Importance of Innovation

To me it shows you what can happen when you deliberately limit innovation, even if you have good reasons to do so. If you prescribe one way of implementing a requirement, it can make a big difference to the solution you can achieve. In the case of this benchmark, following the requirements as closely as I could, but admittedly disregarding how these requirements had been ach