Motivation:
The main use case with Buf.compact is in conjunction with ensureWritable.
It turns out we can get a simpler API, and faster methods, by combining those two operations, because it allows us to relax some guarantees and skip some steps in certain cases, which wouldn't be as neat or clean if they were two separate steps.
Modification:
Add a new Buf.ensureWritable method, which takes an allowCompaction argument.
In MemSegBuf, we can just delegate to compact() when applicable.
In CompositeBuf, we can sometimes get away with just reorganising the bufs array.
Result:
We can now do ensureWritable without allocating in some cases, and this can in particular make the operation faster for CompositeBuf.
Motivation:
Compaction makes more space available at the end of a buffer, by discarding bytes at the beginning that have already been processed.
Modification:
Add a copying compact method to Buf.
Result:
It is now possible to discard read bytes by calling `compact()`.
Motivation:
There are many use cases where other objects will have fields that are buffers.
Since buffers are reference counted, their life cycle needs to be managed carefully.
Modification:
Add the abstract BufHolder, and the concrete sub-class BufRef, as neat building blocks for building other classes that contain field references to buffers.
The behaviours of closed/sent buffers have also been specified in tests, and tightened up in the code.
Result:
It is now easier to create classes/objects that wrap buffers.
Motivation:
There are use cases that involve accumulating data into a buffer, then carving out prefix slices and sending them off on their own journey for further processing.
Modification:
Add a Buf.bifurcate API, that split a buffer, and its ownership, in two.
Internally, the API will inject and maintain an atomically reference counted Drop instance, so that the original memory segment is not released until all bifurcated parts are closed.
This works particularly well for composite buffers, where only the buffer (if any) wherein the bifurcation point lands, will actually have its memory split. A composite buffer can otherwise just crack its buffer array in two.
Result:
We now have a safe way of breaking the single ownership of some memory into multiple parts, that can be sent and owned independently.
Motivation:
Cursors are better than iterators in that they only need to check boundary conditions once per iteration, when processed in a loop.
This should make them easier for the compiler to optimise.
Modification:
Change the ByteIterator to a ByteCursor. The API is almost the same, but with a few subtle differences in semantics.
The primary difference is that the cursor movement and boundary condition checking and position movement happen at the same time, and do not need to occur when the values are fetched out of the cursor.
An iterator, on the other hand, needs to throw an exception if "next" is called too many times.
Result:
Simpler code, and hopefully faster code as well.
Motivation:
This will likely be a somewhat common operation, as buffers move between eventloop and worker threads, so it's important to have an understanding of how it performs.
Modification:
Add a benchmark that specifically targets the send() operation on buffers.
Result:
We got benchmark numbers that clearly show the cost of confinement transfer
Motivation:
Composite buffers are uniquely positioned to be able to extend their underlying storage relatively cheaply.
This fact is relied upon in a couple of buffer use cases within Netty, that we wish to support.
Modification:
Add a static `extend` method to Allocator, so that the CompositeBuf class can remain internal.
The `extend` method inserts the extension buffer at the end of the composite buffer as if it had been included from the start.
This involves checking offsets and byte order invariants.
We also require that the composite buffer be in an owned state.
Result:
It's now possible to extend a composite buffer with a specific buffer, after the composite buffer has been created.
Motivation:
Capture the performance characteristics of this primitive for various buffer implementations.
Modification:
Add a benchmark that iterate 4KiB buffers forwards, and backwards, on various buffer implementations.
Result:
Another aspect of the implementation covered by benchmarks.
Turns out the composite iterators a somewhat slow.
Motivation:
Pooled buffers are a very important use case, and they change the cost dynamics around shared memory segments, so it's worth looking into in detail.
Modification:
Add another explicit close of pooled direct buffers to MemorySegmentClosedByCleanerBenchmark
Result:
Explicitly closing of pooled buffers is even out-performing cleaner close on the "heavy" workload, so this is currently the fastest way to run that workload:
Benchmark (workload) Mode Cnt Score Error Units
MemorySegmentClosedByCleanerBenchmark.cleanerClose heavy avgt 150 14,194 ± 0,558 us/op
MemorySegmentClosedByCleanerBenchmark.explicitClose heavy avgt 150 40,496 ± 0,414 us/op
MemorySegmentClosedByCleanerBenchmark.explicitPooledClose heavy avgt 150 12,723 ± 0,134 us/op
Motivation:
Buffers should always behave the same, regardless of their underlying implementation and how they are allocated.
Modification:
The SizeClassedMemoryPool did not properly reset the internal buffer state prior to reusing them.
The offsets, byte order, and contents are now cleared before a buffer is reused.
Result:
There is no way to observe externally whether a buffer was reused or not.
Motivation:
Because of the current dependency on snapshot versions of the Panama Foreign version of OpenJDK 16, this project is fairly involved to build.
Modification:
To make it easier for newcomers to build the binaries for this project, a docker-based build is added.
The docker image is constructed such that it contains a fresh snapshot build of the right fork of Java.
A make file has also been added, which encapsulates the common commands one would use for working with the docker build.
Result:
It is now easy for newcomers to make builds, and run tests, of this project, as long as they have a working docker installation.