In nioBuffer(int,int) in CompositeByteBuf , we create a sub-array of nioBuffers for the components that are in range, then concatenate all the components in range into a single bigger buffer.
However, if the call to nioBuffers() returned only one sub-buffer, then we are copying it to a newly-allocated buffer "merged" for no reason.
Motivation:
Profiler for Spark shows a lot of time spent in put() method inside nioBuffer(), while usually no copy of data is required.
Modification:
This change skips this last step and just returns a duplicate of the single buffer returned by the call to nioBuffers(), which will in most implementation not copy the data
Result:
No copy when the source is only 1 buffer
Motivation:
We need to add special handling for WrappedCompositeByteBuf as these also extend AbstractByteBuf, otherwise we will not correctly adjust / read the writerIndex during processing.
Modifications:
- Add instanceof checks for WrappedCompositeByteBuf as well.
- Add testcases
Result:
Fixes https://github.com/netty/netty/issues/8152.
Motivation:
5b1fe611a6 introduced the usage of a finalizer as last resort for PoolThreadCache. As we may call free() from the FastThreadLocal.onRemoval(...) and finalize() we need to guard against multiple calls as otherwise we will corrupt internal state (that is used for metrics).
Modifications:
Use AtomicBoolean to guard against multiple calls of PoolThreadCache.free().
Result:
No more corruption of internal state caused by calling PoolThreadCache.free() multuple times.
Motivation:
Recent PR https://github.com/netty/netty/pull/8040 introduced
Unpooled.wrappedUnmodifiableBuffer(ByteBuf...) which has the same
behaviour but wraps the provided array directly. This is preferred for
most uses (including varargs-based use) and if there are any unusual
cases of an explicit array which is re-used before the ByteBuf is
finished with, it can just be copied first.
Modifications:
Added @Deprecated annotation and javadoc to
Unpooled.unmodifiableBuffer(ByteBuf...).
Result:
Unpooled.unmodifiableBuffer(ByteBuf...) will be deprecated.
Motivation:
ObjectCleaner does start a Thread to handle the cleaning of resources which leaks into the users application. We should not use it in netty itself to make things more predictable.
Modifications:
- Remove usage of ObjectCleaner and use finalize as a replacement when possible.
- Clarify javadocs for FastThreadLocal.onRemoval(...) to ensure its clear that remove() is not guaranteed to be called when the Thread completees and so this method is not enough to guarantee cleanup for this case.
Result:
Fixes https://github.com/netty/netty/issues/8017.
Motivation:
Unpooled.unmodifiableBuffer() is currently used to efficiently write
arrays of ByteBufs via FixedCompositeByteBuf, but involves an allocation
and content-copy of the provided ByteBuf array which in many (most?)
cases shouldn't be necessary.
Modifications:
Modify the internal FixedCompositeByteBuf class to support wrapping the
provided ByteBuf array directly. Control this behaviour with a
constructor flag and expose the "unsafe" version via a new
Unpooled.wrappedUnmodifiableBuffer(ByteBuf...) method.
Result:
Less garbage on IO paths. I would guess pretty much all existing usage
of unmodifiableBuffer() could use the copy-free version but assume it's
not safe to change its default behaviour.
Motivation:
Eliminate avoidable backing array reallocations when constructing
composite ByteBufs from existing buffer arrays/Iterables. This also
applies to the Unpooled.wrappedBuffer(...) methods.
Modifications:
Ensure the initial components ComponentList is sized at least as large
as the provided buffer array/Iterable in the CompositeByteBuffer
constructors.
In single-arg Unpooled.wrappedBuffer(...) methods, set maxNumComponents
to the count of provided buffers, rather than a fixed default of 16. It
seems likely that most usage of these involves wrapping a list without
subsequent modification, particularly since they return a ByteBuf rather
than CompositeByteBuf. If a different/larger max is required there are
already the wrappedBuffer(int, ...) variants.
In fact the current behaviour could be considered inconsistent - if you
call Unpooled.wrappedBuffer(int, ByteBuf) with a single buffer, you
might expect to subsequently be able to add buffers to it (since you
specified a max related to consolidation), but it will in fact return
just a slice of the provided ByteBuf.
Result:
Fewer and smaller allocations in some cases when using CompositeByteBufs
or Unpooled.wrappedBuffer(...).
Motivation:
Currently there is not a clear way to provide a byte array to a netty
ByteBuf and be informed when it is released. This is a would be a
valuable addition for projects that integrate with netty but also pool
their own byte arrays.
Modification:
Modified the UnpooledHeapByteBuf class so that the freeArray method is
protected visibility instead of default. This will allow a user to
subclass the UnpooledHeapByteBuf, provide a byte array, and override
freeArray to return the byte array to a pool when it is called.
Additionally this makes this implementation equivalent to
UnpooledDirectByteBuf (freeDirect is protected).
Additionally allocateArray is also made protect to provide another override
option for subclasses.
Result:
Users can override UnpooledHeapByteBuf#freeArray and
UnpooledHeapByteBuf#allocateArray.
Motivation:
When I read the source code, I found that the comment of PoolChunk is out of date, it may confuses readers with the description about memoryMap.
Modifications:
update the last passage of the comment of the PoolChunk class.
Result:
No change to any source code , just update comment.
Motivation:
When a buffer is over-released, the current error message of `IllegalReferenceCountException` is `refCnt: XXX, increment: XXX`, which is confusing. The correct message should be `refCnt: XXX, decrement: XXX`.
Modifications:
Pass `-decrement` to create `IllegalReferenceCountException`.
Result:
The error message will be `refCnt: XXX, decrement: XXX` when a buffer is over-released.
Motivation:
It should be possible to write a ReadOnlyByteBufferBuf to a channel without errors. However, ReadOnlyByteBufferBuf does not override isWritable and ensureWritable, which can cause some handlers to mistakenly assume they can write to the ReadOnlyByteBufferBuf, resulting in ReadOnlyBufferException.
Modification:
Added isWritable and ensureWritable method overrides on ReadOnlyByteBufferBuf to indicate that it is never writable. Added tests for these methods.
Result:
Can successfully write ReadOnlyByteBufferBuf to a channel with an SslHandler (or any other handler which may attempt to write to the ByteBuf it receives).
Motivation:
The `AbstractByteBuf#equals` method doesn't take into account the
class of buffer instance. So the two buffers with different classes
must have the same `hashCode` values if `equals` method returns `true`.
But `EmptyByteBuf#hashCode` is not consistent with `#hashCode`
of the empty `AbstractByteBuf`, that is violates the contract and
can lead to errors.
Modifications:
Return `1` in `EmptyByteBuf#hashCode`.
Result:
Consistent behavior of `EmptyByteBuf#hashCode` and `AbstractByteBuf#hashCode`.
Motivation:
The `ByteBuf#slice` and `ByteBuf#duplicate` methods should check
an accessibility to prevent creation slice or duplicate
of released buffer. At now this works not in the all scenarios.
Modifications:
Add missed checks.
Result:
More correct and consistent behavior of `ByteBuf` methods.
Motivation:
The `#ensureAccessible` method in `UnpooledHeapByteBuf#capacity` used
to prevent NPE if buffer is released and `array` is `null`. In all
other implementations of `ByteBuf` the accessible is not checked by
`capacity` method. We can assign an empty array to `array`
in the `deallocate` and don't worry about NPE in the `#capacity`.
This will help reduce the number of repeated calls of the
`#ensureAccessible` in many operations with `UnpooledHeapByteBuf`.
Modifications:
1. Remove `#ensureAccessible` call from `UnpooledHeapByteBuf#capacity`.
Use the `EmptyArrays#EMPTY_BYTES` instead of `null` in `#deallocate`.
2. Fix access checks in `AbstractUnsafeSwappedByteBuf` and
`AbstractByteBuf#slice` that relied on `#ensureAccessible`
in `UnpooledHeapByteBuf#capacity`. This was found by unit tests.
Result:
Less double calls of `#ensureAccessible` for `UnpooledHeapByteBuf`.
Motivation:
Currently copying a direct ByteBuf copies it fully into the heap before writing it to an output stream.
The can result in huge memory usage on the heap.
Modification:
copy the bytebuf contents via an 8k buffer into the output stream
Result:
Fixes#7804
Motivation:
We should allow to access the memoryAddress / array of the FixedCompositeByteBuf when it only wraps a single ByteBuf. We do the same for CompositeByteBuf.
Modifications:
- Check how many buffers FixedCompositeByteBuf wraps and depending on it delegate the access to the memoryAddress / array
- Add unit tests.
Result:
Fixes [#7752].
Motivation:
If someone invoke writeByte(), markWriterIndex(), readByte() in order first, and then invoke resetWriterIndex() should be throw a IndexOutOfBoundsException to obey the rule that the buffer declared "0 <= readerIndex <= writerIndex <= capacity".
Modification:
Changed the code writerIndex = markedWriterIndex; into writerIndex(markedWriterIndex); to make the check affect
Result:
Throw IndexOutOfBoundsException if any invalid happened in resetWriterIndex.
Motivation:
Read-only heap ByteBuffer doesn't expose array: the existent method to perform copies to direct ByteBuf involves the creation of a (maybe pooled) additional heap ByteBuf instance and copy
Modifications:
To avoid stressing the allocator with additional (and stealth) heap ByteBuf allocations is provided a method to perform copies using the (pooled) internal NIO buffer
Result:
Copies from read-only heap ByteBuffer to direct ByteBuf won't create any intermediate ByteBuf
Motivation:
To avoid eager allocation of the destination and to perform length prefixed encoding of UTF-8 string with forward only access pattern
Modifications:
The original writeUtf8 is modified by allowing customization of the reserved bytes on the destination buffer and is introduced an exact UTF-8 length estimator.
Result:
Is now possible to perform length first encoding with UTF-8 well-formed char sequences following a forward only write access pattern on the destination buffer.
Motivation:
ByteBufUtil by default will cache DirectByteBuffer objects, and the
associated direct memory (up to 64k). In combination with the Recycler which may
cache up to 32k elements per thread may lead to a large amount of direct
memory being retained per EventLoop thread. As traffic spikes come this
may be perceived as a memory leak because the memory in the Recycler
will never be reclaimed.
Modifications:
- By default we shouldn't cache DirectByteBuffer objects.
Result:
Less direct memory consumption due to caching DirectByteBuffer objects.
Motivation:
There is some cleanup that can be done.
Modifications:
- Use intializer list expression where possible
- Remove unused imports.
Result:
Cleaner code.
Motivation:
We need the memoryAddress of a direct buffer when using our native transports. For this reason ReadOnlyUnsafeDirectByteBuf.memoryAddress() should not throw.
Modifications:
- Correctly override ReadOnlyUnsafeDirectByteBuf.memoryAddress() and hasMemoryAddress()
- Add test case
Result:
Fixes [#7672].
Motivation:
We saw some timeouts on the CI when the leak detection is enabled.
Modifications:
- Use smaller number of operations in test
- Increase timeout
Result:
CI not times out.
Motivation:
ByteBufUtil.isText(...) may produce unexpected results if called concurrently on the same ByteBuffer.
Modifications:
- Don't use internalNioBuffer where it is not safe.
- Add unit test.
Result:
ByteBufUtil.isText is thread-safe.
Motivation:
Usages of HttpResponseStatus may result in more object allocation then necessary due to not looking for cached objects and the AsciiString parsing method not being used due to CharSequence method being used instead.
Modifications:
- HttpResponseDecoder should attempt to get the HttpResponseStatus from cache instead of allocating a new object
- HttpResponseStatus#parseLine(CharSequence) should check if the type is AsciiString and redirect to the AsciiString parsing method which may not require an additional toString call
- HttpResponseStatus#parseLine(AsciiString) can be optimized and doesn't require and may not require object allocation
Result:
Less allocations when dealing with HttpResponseStatus.
Motivation:
Depending on the implementation of ByteBuf nioBuffer(...) and nioBuffers(...) may either share the content or return a ByteBuffer that contains a copy of the content.
Modifications:
Fix javadocs.
Result:
Correct docs.
Motivation:
Calling ByteBuf.toString(Charset) on the same buffer from multiple threads at the same time produces unexpected results, such as various exceptions and/or corrupted output. This is because ByteBufUtil.decodeString(...) is taking the source ByteBuffer for CharsetDecoder.decode() from ByteBuf.internalNioBuffer(int, int), which is not thread-safe.
Modification:
Call ByteBuf.nioBuffer() instead of ByteBuf.internalNioBuffer() to get the source buffer to pass to CharsetDecoder.decode().
Result:
Fixes the possible race condition.
Motivation:
We did not correctly take the position into account when wrapping a ByteBuffer via ReadOnlyUnsafeDirectByteBuf as we obtained the memory address from the original ByteBuffer and not the slice we take.
Modifications:
- Correctly use the slice to obtain memory address.
- Add test case.
Result:
Fixes [#7565].
Motivation:
There is no guarantee that FastThreadLocal.onRemoval(...) is called if the FastThreadLocal is used by "non" FastThreacLocalThreads. This can lead to all sort of problems, like for example memory leaks as direct memory is not correctly cleaned up etc.
Beside this we use ThreadDeathWatcher to check if we need to release buffers back to the pool when thread local caches are collected. In the past ThreadDeathWatcher was used which will need to "wakeup" every second to check if the registered Threads are still alive. If we can ensure FastThreadLocal.onRemoval(...) is called we do not need this anymore.
Modifications:
- Introduce ObjectCleaner and use it to ensure FastThreadLocal.onRemoval(...) is always called when a Thread is collected.
- Deprecate ThreadDeathWatcher
- Add unit tests.
Result:
Consistent way of cleanup FastThreadLocals when a Thread is collected.