MapDB 1.1 and sun.misc.Unsafe
High-performance Java projects often use sun.misc.Unsafe to manipulate memory directly.
It is not officially part of the JDK, so it does not work on Dalvik and some other JVMs.
An error can also crash the whole JVM, which makes debugging and user support hard.
MapDB uses ByteBuffers instead, a thin abstraction over Unsafe.
They are safer and easier to use, at a performance penalty of about 10%.
This choice helped MapDB to be developed faster and made it more robust.
However, a 10% performance bonus is nothing to sneeze at, so Unsafe storage is supported in the form of an optional extension.
MapDB is built around ByteBuffers and cannot take full advantage of Unsafe yet. MapDB 1.1 will add the changes needed to make it more usable.
So what is wrong with the current ByteBuffers?
Unnecessary boundary checks
On each read or put, a ByteBuffer (BB) checks that the offset is within its limits. Unlike the checks on a byte[], these checks cannot be optimized away by the JIT. Boundary checking adds about 10% overhead.
It is especially bad with the many small calls generated by the MapDB deserializer. A workaround is to read all data into a single byte[] and deserialize small chunks from there, but that means additional copying and allocations, which MapDB tries to avoid.
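The workaround above can be illustrated with a minimal sketch. The class and method names are made up for this example and are not part of MapDB. The first method reads every value through the buffer, so each call is bounds-checked by the ByteBuffer. The second copies the record into a byte[] with one bulk call and decodes from the array, at the cost of the extra allocation and copy mentioned above.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration, not MapDB code.
final class BulkReadSketch {

    /** Reads `count` ints one call at a time; each getInt() goes through the buffer's own checks. */
    static long sumViaBuffer(ByteBuffer buf, int offset, int count) {
        long sum = 0;
        for (int i = 0; i < count; i++) {
            sum += buf.getInt(offset + i * 4); // absolute read, big-endian by default
        }
        return sum;
    }

    /** Copies the record into a byte[] with one bulk call, then decodes from the array. */
    static long sumViaArray(ByteBuffer buf, int offset, int count) {
        byte[] data = new byte[count * 4];   // the extra allocation the post complains about
        ByteBuffer view = buf.duplicate();   // leave the caller's position untouched
        view.position(offset);
        view.get(data);                      // single bulk copy
        long sum = 0;
        for (int i = 0; i < data.length; i += 4) {
            int v = ((data[i] & 0xFF) << 24)
                  | ((data[i + 1] & 0xFF) << 16)
                  | ((data[i + 2] & 0xFF) << 8)
                  |  (data[i + 3] & 0xFF);
            sum += v;
        }
        return sum;
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(64);
        for (int i = 0; i < 10; i++) {
            buf.putInt(8 + i * 4, i * 1000);
        }
        System.out.println(sumViaBuffer(buf, 8, 10));
        System.out.println(sumViaArray(buf, 8, 10));
    }
}
```

Both methods return the same sum; which one is faster depends on the JVM and the record size, so it is worth measuring rather than assuming.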
Defensive copy to transfer data
A BB stores its offsets internally, so to copy data from one BB to another, one has to call bb.duplicate() and update the offsets in the new instance. Under heavy load this triggers GC and ruins CPU caches. MapDB is already quite optimized, so these duplicates could account for almost 50% of the garbage it produces.
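A minimal sketch of the copy pattern described above, using only the standard java.nio API. The class and method names are hypothetical. Note the two short-lived duplicate objects created per copy, which is the garbage the paragraph refers to.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration, not MapDB code.
final class DuplicateCopySketch {

    /**
     * Copies `length` bytes from src[srcOffset] to dst[dstOffset]
     * without changing the position or limit of either original buffer.
     */
    static void copy(ByteBuffer src, int srcOffset, ByteBuffer dst, int dstOffset, int length) {
        ByteBuffer from = src.duplicate(); // new object sharing the same content
        from.limit(srcOffset + length);
        from.position(srcOffset);

        ByteBuffer to = dst.duplicate();   // second throwaway object
        to.position(dstOffset);
        to.put(from);                      // bulk transfer of from.remaining() bytes
    }

    public static void main(String[] args) {
        ByteBuffer src = ByteBuffer.allocate(16);
        for (int i = 0; i < 16; i++) {
            src.put(i, (byte) i);
        }
        ByteBuffer dst = ByteBuffer.allocate(16);
        copy(src, 4, dst, 2, 5);
        System.out.println(dst.get(2) + " " + dst.get(6));
    }
}
```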
General inflexibility of ByteBuffers
ByteBuffers are not really that well designed in Java terms. There is a 32-bit addressing limit. The class is hard to extend, and most implementations (such as java.nio.DirectByteBuffer) are final or package-private.
In short, I don't like BB anymore. There are dozens of abstractions that try to fix their problems, including Volume from MapDB. From now on, MapDB will build around ByteBuffers rather than on top of them.
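One common way such abstractions get past the 32-bit addressing limit is to split a long offset into a chunk index and a position inside that chunk. The sketch below shows the idea with heap buffers. It is an illustration under stated assumptions, not MapDB's actual Volume code: the class name is invented, and values are assumed never to straddle a chunk boundary.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration of long addressing over several int-addressed buffers.
final class ChunkedStoreSketch {
    private final int chunkShift;      // log2 of the chunk size, must be below 31
    private final int chunkMask;       // chunk size minus one
    private final ByteBuffer[] chunks;

    ChunkedStoreSketch(int chunkShift, int chunkCount) {
        this.chunkShift = chunkShift;
        this.chunkMask = (1 << chunkShift) - 1;
        this.chunks = new ByteBuffer[chunkCount];
        for (int i = 0; i < chunkCount; i++) {
            chunks[i] = ByteBuffer.allocate(1 << chunkShift);
        }
    }

    /** Assumes the 8 bytes at `offset` lie within a single chunk. */
    void putLong(long offset, long value) {
        chunks[(int) (offset >>> chunkShift)].putLong((int) (offset & chunkMask), value);
    }

    /** Assumes the 8 bytes at `offset` lie within a single chunk. */
    long getLong(long offset) {
        return chunks[(int) (offset >>> chunkShift)].getLong((int) (offset & chunkMask));
    }

    public static void main(String[] args) {
        ChunkedStoreSketch store = new ChunkedStoreSketch(4, 4); // four 16-byte chunks
        store.putLong(40L, 123456789L);                          // chunk 2, position 8
        System.out.println(store.getLong(40L));
    }
}
```

With a realistic chunk size (for example a shift of 20 to 30) the same arithmetic addresses far more than 2 GB, since the offset is a long and only the position inside a chunk has to fit in an int.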
MapDB 1.1 will add some changes:
DBMaker.newMemoryDB() will not use HeapByteBuffer but a raw byte[].
It means one less abstraction layer, no boundary checks and an instant 10% performance boost. This change is already done in MapDB snapshots.
Direct transfers between Volumes. MapDB tries to move data directly from one location to another without using a third buffer. So far we relied on ByteBuffers to do direct copying, but that does not support raw byte[], Unsafe and other storages. There is a new commit which makes this independent of ByteBuffers.
Off-heap memory based on Unsafe. So far sun.misc.Unsafe is not really supported.
We cannot add official support, since it is not part of the JDK and does not work on Android.
But the [mapdb-unsafe](https://github.com/jankotek/mapdb-unsafe) extension will be treated as a first-class citizen. Its releases will be synchronized with MapDB releases, and it will get bug fixes, support and a mention in the documentation.
Memory-mapped files via Unsafe. Memory-mapped files use ByteBuffers as well.
The boundary-checking overhead is not that prominent here, since disk speeds are lower.
But the extension should support mmap-unsafe files anyway.
Partial async file IO. Breaking away from ByteBuffer offers the option of asynchronous file reads (a better term is probably lazy reads). In practical terms, the DataInput will be loaded only once the Serializer actually starts reading data. This could improve performance by a few percent.
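To make the first two changes more concrete, here is a rough sketch of a store backed by a raw byte[] that can transfer data to another store without an intermediate buffer. Everything in it (class name, method names, big-endian layout) is an assumption made for illustration and should not be read as the Volume API in MapDB 1.1.

```java
// Hypothetical illustration, not MapDB's Volume implementation.
final class ByteArrayStoreSketch {
    private final byte[] data;

    ByteArrayStoreSketch(int size) {
        this.data = new byte[size];
    }

    /** Writes a long in big-endian order directly into the array. */
    void putLong(int offset, long value) {
        for (int i = 7; i >= 0; i--) {
            data[offset + i] = (byte) value; // lowest byte goes to the highest index
            value >>>= 8;
        }
    }

    /** Reads a big-endian long directly from the array. */
    long getLong(int offset) {
        long v = 0;
        for (int i = 0; i < 8; i++) {
            v = (v << 8) | (data[offset + i] & 0xFF);
        }
        return v;
    }

    /** Moves bytes straight into another store; no duplicate() and no third buffer. */
    void transferTo(int offset, ByteArrayStoreSketch target, int targetOffset, int length) {
        System.arraycopy(data, offset, target.data, targetOffset, length);
    }

    public static void main(String[] args) {
        ByteArrayStoreSketch a = new ByteArrayStoreSketch(32);
        ByteArrayStoreSketch b = new ByteArrayStoreSketch(32);
        a.putLong(8, 0x0102030405060708L);
        a.transferTo(8, b, 0, 8);
        System.out.println(Long.toHexString(b.getLong(0)));
    }
}
```

Array accesses are still range-checked by the JVM, but those checks are the kind the JIT can often eliminate, which is the advantage over ByteBuffer that the post describes.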