Motivation:
We should allow to access the memoryAddress of the wrapped ByteBuf when using ReadOnlyByteBuf for peformance reasons. If a user act on a memoryAddress its his responsible anyway to do nothing "stupid".
Modifications:
Delegate to wrapped ByteBuf.
Result:
Less performance overhead for various operations and also when writing to a native transport (which needs the memoryAddress).
Motivations:
1. There are duplicated implementations of decoding hex strings. #6797
2. ByteBufUtil.HexUtil.decodeHexDump does not handle substring start
index properly and does not decode hex byte rigorously.
Modifications:
1. Function decodeHexByte is moved from QueryStringDecoder into ByteBufUtil.
2. ByteBufUtil.HexUtil.decodeHexDump is changed to use decodeHexByte.
3. Tests are Updated accordingly.
Result:
Fixed#6797 and made hex decoding functions more robust.
Motivation:
ByteBufUtil provides a hexDump method. For debugging purposes it is often useful to decode that hex dump to get the original content, but no such method exists.
Modifications:
- Add ByteBufUtil#decodeHexDump
Result:
ByteBufUtil#decodeHexDump is available to make debugging easier.
Motivation:
The javadocs for ByteBuf#ensureWritable(int, boolean) indicate that it should not throw, and instead the return code should indicate the result of the operation. Due to a bug in AbstractByteBuf it is possible for a resize to be attempted on a buffer that may exceed maxCapacity() and therefore throw.
Modifications:
- If there is not enough space in the buffer, and force is false, then a resize should not be attempted
Result:
AbstractByteBuf#ensureWritable(int, boolean) enforces the javadoc constraints and does not throw.
Motivation:
In cases when an application is running in a container or is otherwise
constrained to the number of processors that it is using, the JVM
invocation Runtime#availableProcessors will not return the constrained
value but rather the number of processors available to the virtual
machine. Netty uses this number in sizing various resources.
Additionally, some applications will constrain the number of threads
that they are using independenly of the number of processors available
on the system. Thus, applications should have a way to globally
configure the number of processors.
Modifications:
Rather than invoking Runtime#availableProcessors, Netty should rely on a
method that enables configuration when the JVM is started or by the
application. This commit exposes a new class NettyRuntime for enabling
such configuraiton. This value can only be set once. Its default value
is Runtime#availableProcessors so that there is no visible change to
existing applications, but enables configuring either a system property
or configuring during application startup (e.g., based on settings used
to configure the application).
Additionally, we introduce the usage of forbidden-apis to prevent future
uses of Runtime#availableProcessors from creeping. Future work should
enable the bundled signatures and clean up uses of deprecated and
other forbidden methods.
Result:
Netty can be configured to not use the underlying number of processors,
but rather the constrained number of processors.
Motivation:
Unsafe.invokeCleaner(...) checks if the passed in ByteBuffer is a slice or duplicate and if so throws an IllegalArgumentException on Java9. We need to ensure we never try to free a ByteBuffer that was provided by the user directly as we not know if its a slice / duplicate or not.
Modifications:
Never try to free a ByteBuffer that was passed into UnpooledUnsafeDirectByteBuf constructor by an user (via Unpooled.wrappedBuffer(....)).
Result:
Build passes again on Java9
Motivation:
Java9 added a new method to Unsafe which allows to allocate a byte[] without memset it. This can have a massive impact in allocation times when the byte[] is big. This change allows to enable this when using Java9 with the io.netty.tryAllocateUninitializedArray property when running Java9+. Please note that you will need to open up the jdk.internal.misc package via '--add-opens java.base/jdk.internal.misc=ALL-UNNAMED' as well.
Modifications:
Allow to allocate byte[] without memset on Java9+
Result:
Better performance when allocate big heap buffers and using java9.
Motivation:
UnreleasableByteBuf operations are designed to not modify the reference count of the underlying buffer. The Retained[Duplicate|Slice] operations violate this assumption and can cause the underlying buffer's reference count to be increased, but never allow for it to be decreased. This may lead to memory leaks.
Modifications:
- UnreleasableByteBuf's Retained[Duplicate|Slice] should leave the reference count of the parent buffer unchanged after the operation completes.
Result:
No more memory leaks due to usage of the Retained[Duplicate|Slice] on an UnreleasableByteBuf object.
Motiviation:
UnsafeByteBufUtil has some bugs related to using an incorrect index, and also omitting the array paramter when dealing with byte[] objects. There is also some simplification possible with respect to type casting, and minor formatting consistentcy issues.
Modifications:
- Ensure indexing is correct when dealing with native memory
- Fix the native access and endianness for the medium/unsigned medium methods
- Ensure array is used when dealing with heap memory
- Remove unecessary casts when using long
- Fix formating and alignment
Result:
UnsafeByteBufUtil is more correct and won't access direct memory when heap arrays are used.
Motivation:
The contract of `ByteBuf.writeBytes(ByteBuf src)` is such that it will
throw an `IndexOutOfBoundsException if `src.readableBytes()` is greater than
`this.writableBytes()`. The EmptyByteBuf class will throw the exception,
even if the source buffer has zero readable bytes, in violation of the
contract.
Modifications:
Use the helper method `checkLength(..)` to check the length and throw
the exception, if appropriate.
Result:
Conformance with the stated behavior of ByteBuf.
Motivation:
PR [#6460] added a way to access the used memory of an allocator. The used naming was not very good and how things were exposed are not consistent.
Modifications:
- Add a new ByteBufAllocatorMetric and ByteBufAllocatorMetricProvider interface
- Let the ByteBufAllocator implementations implement ByteBufAllocatorMetricProvider
- Move exposed stats / metric from PooledByteBufAllocator to PooledByteBufAllocatorMetric and mark old methods as `@Deprecated`.
Result:
More consistent way to expose metric / stats for ByteBufAllocator
Motivation:
There are numerous usages of internalNioBuffer which hard code 0 for the index when the intention was to use the readerIndex().
Modifications:
- Remove hard coded 0 for the index and use readerIndex()
Result:
We are less susceptible to using the wrong index, and don't make assumptions about the ByteBufAllocator.
Motivation:
Often its useful for the user to be able to get some stats about the memory allocated via an allocator.
Modifications:
- Allow to obtain the used heap and direct memory for an allocator
- Add test case
Result:
Fixes [#6341]
Motivation:
As we may access the metrics exposed of PooledByteBufAllocator from another thread then the allocations happen we need to ensure we synchronize on the PoolArena to ensure correct visibility.
Modifications:
Synchronize on the PoolArena to ensure correct visibility.
Result:
Fix multi-thread issues on the metrics
Motivation:
Commit 8dda984afe introduced a regression which lead to the situation that the allocator is not set when PooledByteBuf.initUnpooled(...) is called. Thus it was possible that PooledByteBuf.alloc() returns null or the wrong allocator if multiple PooledByteBufAllocator are used in an application.
Modifications:
- Correctly set the allocator
- Add test-case
Result:
Fixes [#6436].
Motivation:
When sun.misc.Unsafe is present we want to use *Unsafe*ByteBuf implementations. We missed to do so in PooledByteBufAllocator when the heapArena is null.
Modifications:
- Correctly use UnpooledUnsafeHeapByteBuf
- Add unit tests
Result:
Use most optimal ByteBuf implementation.
Motivation:
We can eliminate unnessary wrapping when call ByteBuf.asReadOnly() in some cases to reduce indirection.
Modifications:
- Check if asReadOnly() needs to create a new instance or not
- Add test cases
Result:
Less object creation / wrapping.
Motivation:
We should only try to calculate the direct memory offset when sun.misc.Unsafe is present as otherwise it will fail with an NPE as PlatformDependent.directBufferAddress(...) will throw it.
This problem was introduced by 66b9be3a46.
Modifications:
Use offset of 0 if no sun.misc.Unsafe is present.
Result:
PooledByteBufAllocator also works again when no sun.misc.Unsafe is present.
Motivation:
64-byte alignment is recommended by the Intel performance guide (https://software.intel.com/en-us/articles/practical-intel-avx-optimization-on-2nd-generation-intel-core-processors) for data-structures over 64 bytes.
Requiring padding to a multiple of 64 bytes allows for using SIMD instructions consistently in loops without additional conditional checks. This should allow for simpler and more efficient code.
Modification:
At the moment cache alignment must be setup manually. But probably it might be taken from the system. The original code was introduced by @normanmaurer https://github.com/netty/netty/pull/4726/files
Result:
Buffer alignment works better than miss-align cache.
Motivation:
We not had tests for ByteBufAllocator implementations in general.
Modifications:
Added ByteBufAllocatorTest, AbstractByteBufAllocatorTest and UnpooledByteBufAllocatorTest
Result:
More tests for allocator implementations.
Motivation:
PooledByteBuf.capacity(...) miss to enforce maxCapacity() and so its possible to increase the capacity of the buffer even if it will be bigger then maxCapacity().
Modifications:
- Correctly enforce maxCapacity()
- Add unit tests for capacity(...) calls.
Result:
Correctly enforce maxCapacity().
Motivation:
When An HTTP server is listening in plaintext mode, it doesn't have
a chance to negotiate "h2" in the tls handshake. HTTP 1 clients
that are not expecting an HTTP2 server will accidentally a request
that isn't an upgrade, which the HTTP/2 decoder will not
understand. The decoder treats the bytes as hex and adds them to
the error message.
These error messages are hard to understand by humans, and result
in extra, manual work to decode.
Modification:
If the first bytes of the request are not the preface, the decoder
will now see if they are an HTTP/1 request first. If so, the error
message will include the method and path of the original request in
the error message.
In case the path is long, the decoder will check up to the first
1024 bytes to see if it matches. This could be a DoS vector if
tons of bad requests or other garbage come in. A future optimization
would be to treat the first few bytes as an AsciiString and not do
any Charset decoding. ByteBuf.toCharSequence alludes to such an
optimization.
The code has been left simple for the time being.
Result:
Faster identification of errant HTTP requests.
Motivation:
Disable ThreadLocal Cache, then allocate Pooled ByteBuf and release all these buffers, PoolArena's tiny/small/normal allocation count is incorrect.
Modifications:
- Calculate PoolArena's tiny/small/normal allocation one time
- Add testAllocationCounter TestCase
Result:
Fixes#6282 .
Motivation:
In PooledByteBuf we missed to null out the chunk and tmpNioBuf fields before recycle it to the Recycler. This could lead to keep objects longer alive then necessary which may hold a lot of memory.
Modifications:
Null out tmpNioBuf and chunk before recycle.
Result:
Possible to earlier GC objects.
Motivation:
ByteBufUtil.compare uses long arithmetic but doesn't check for underflow on when converting from long to int to satisfy the Comparable interface. This will result in incorrect comparisons and violate the Comparable interface contract.
Modifications:
- ByteBufUtil.compare should protect against int underflow
Result:
Fixes https://github.com/netty/netty/issues/6169
Motivation:
In later Java8 versions our Atomic*FieldUpdater are slower then the JDK implementations so we should not use ours anymore. Even worse the JDK implementations provide for example an optimized version of addAndGet(...) using intrinsics which makes it a lot faster for this use-case.
Modifications:
- Remove methods that return our own Atomic*FieldUpdaters.
- Use the JDK implementations everywhere.
Result:
Faster code.
Motivation:
If caches are disabled it does not make sense to schedule a task that will free up memory consumed by the caches.
Modifications:
Do not schedule if caches are disabled.
Result:
Less overhead.
Motivation:
We need to ensure the tracked object can not be GC'ed before ResourceLeak.close() is called as otherwise we may get false-positives reported by the ResourceLeakDetector. This can happen as the JIT / GC may be able to figure out that we do not need the tracked object anymore and so already enqueue it for collection before we actually get a chance to close the enclosing ResourceLeak.
Modifications:
- Add ResourceLeakTracker and deprecate the old ResourceLeak
- Fix some javadocs to correctly release buffers.
- Add a unit test for ResourceLeakDetector that shows that ResourceLeakTracker has not the problems.
Result:
No more false-positives reported by ResourceLeakDetector when ResourceLeakDetector.track(...) is used.
Motivation:
If a user allocates a lot from outside the EventLoop we may end up creating a lot of caches in the PooledByteBufAllocator. This may be wasteful and so it may be useful for an other to configure that caches should only be used from within EventLoops.
Modifications:
Add new constructor which allows to configure the caching behaviour.
Result:
More flexible configuration of PooledByteBufAllocator possible
Motivation:
SwappedByteBuf.unwrap() not returned the wrapped buffer but the buffer that was wrapped by the original buffer. This is not correct.
Modifications:
Correctly return wrapped buffer and fix test.
Result:
SwappedByteBuf.unwrap() works as expected.
Motivation:
IntelliJ issues several warnings.
Modifications:
* `ClientCookieDecoder` and `ServerCookieDecoder`:
* `nameEnd`, `valueBegin` and `valueEnd` don't need to be initialized
* `keyValLoop` loop doesn't been to be labelled, as it's the most inner one (same thing for labelled breaks)
* Remove `if (i != headerLen)` as condition is always true
* `ClientCookieEncoder` javadoc still mention old logic
* `DefaultCookie`, `ServerCookieEncoder` and `DefaultHttpHeaders` use ternary ops that can be turned into simple boolean ones
* `DefaultHeaders` uses a for(int) loop over an array. It can be turned into a foreach one as javac doesn't allocate an iterator to iterate over arrays
* `DefaultHttp2Headers` and `AbstractByteBuf` `equal` can be turned into a single boolean statement
Result:
Cleaner code
Motivation:
4bba7526e2 introduced changes which made pooled and unpooled derived buffers inconsistent in a few ways:
- Pooled derived buffers always generated a duplicate buffer when duplicate() was called and always generated a sliced buffer when slice() was called. Unpooled derived buffers some times generated a sliced buffer when duplicate() was called.
- The indexes that were set for duplicate buffers generated from slices were not always consistent.
There were also some various bugs in the derived pooled buffer implementation.
Modifications:
- Make pooled/unpooled consistently generate duplicate buffers when duplicate() is called and sliced buffers when slice() is called.
- Fix bugs in the derived pooled buffer
Result:
More consistent behavior from the derived pooled/unpooled buffers.
Motivation:
Currently the ByteBuf created as a result of retained[Slice|Duplicate] maintains its own reference count, and when this reference count is depleated it will release the ByteBuf returned from unwrap(). The unwrap() buffer is designed to be the 'root parent' and will skip all intermediate layers of buffers. If the intermediate layers of buffers contain a retained[Slice|Duplicate] then these reference counts will be ignored during deallocation. This may lead to deallocating the 'root parent' before all derived pooled buffers are actually released. This same issue holds if a retained[Slice|Duplicate] is in the heirachy and a 'regular' slice() or duplicate() buffer is created.
Modifications:
- AbstractPooledDerivedByteBuf must maintain a reference to the direct parent (the buffer which retained[Slice|Duplicate] was called on) and release on this buffer instead of the 'root parent' returned by unwrap()
- slice() and duplicate() buffers created from AbstractPooledDerivedByteBuf must also delegate reference count operations to their immediate parent (or first ancestor which maintains an independent reference count).
Result:
Fixes https://github.com/netty/netty/issues/5999
Motivation:
Netty provides a adaptor from ByteBuf to Java's InputStream interface. The JDK Stream interfaces have an explicit lifetime because they implement the Closable interface. This lifetime may be differnt than the ByteBuf which is wrapped, and controlled by the interface which accepts the JDK Stream. However Netty's ByteBufInputStream currently does not take reference count ownership of the underlying ByteBuf. There may be no way for existing classes which only accept the InputStream interface to communicate when they are done with the stream, other than calling close(). This means that when the stream is closed it may be appropriate to release the underlying ByteBuf, as the ownership of the underlying ByteBuf resource may be transferred to the Java Stream.
Motivation:
- ByteBufInputStream.close() supports taking reference count ownership of the underyling ByteBuf
Result:
ByteBufInputStream can assume reference count ownership so the underlying ByteBuf can be cleaned up when the stream is closed.
Motivation:
In some ByteBuf implementations we not correctly implement getBytes(index, ByteBuffer).
Modifications:
Correct code to do what is defined in the javadocs and adding test.
Result:
Implementation works as described.
Motivation:
We need to ensure we release all direct memory once the DirectPoolArena is collected. Otherwise we may never reclaim the memory and so leak memory.
Modifications:
Ensure we destroy all PoolChunk memory when DirectPoolArena is collected.
Result:
Free up unreleased memory when DirectPoolArena is collected.
Motivation:
We can share the code in retain() and retain(...) and also in release() and release(...).
Modifications:
Share code.
Result:
Less duplicated code.
Motivation:
We introduced a regression in 1abdbe6f67 which let the iteration start from the wrong index.
Modifications:
Fix start index and add tests.
Result:
Fix regression.
Motivation:
Result of ByteBufUtil.compare(ByteBuf a, ByteBuf b) is dependent on ByteOrder of supplied ByteBufs which should not be the case (as stated in the javadocs).
Modifications:
Ensure we get a consistent behavior when calling ByteBufUtil.compare(ByteBuf a, ByteBuf b) and not depend on ByteOrder.
Result:
ByteBufUtil.compare(ByteBuf a, ByteBuf b) and so AbstractByteBuf.compare(...) works correctly as stated in the javadocs.
Motivation:
Sometimes it is useful to be able to wrap an existing memory address (a.k.a pointer) and create a ByteBuf from it. This way its easier to interopt with other libraries.
Modifications:
Add a new Unpooled.wrappedBuffer(....) method that takes a memory address.
Result:
Be able to wrap an existing memory address into a ByteBuf.
Motivation:
The default limit for the maximum amount of bytes that a method will be inlined is 35 bytes. AbstractByteBuf#forEach and AbstractByteBuf#forEachDesc comprise of method calls which are more than this maximum default threshold and may prevent or delay inlining for occuring. The byte code for these methods can be reduced to allow for easier inlining. Here are the current byte code sizes:
AbstractByteBuf::forEachByte (24 bytes)
AbstractByteBuf::forEachByte(int,int,..) (14 bytes)
AbstractByteBuf::forEachByteAsc0 (71 bytes)
AbstractByteBuf::forEachByteDesc (24 bytes)
AbstractByteBuf::forEachByteDesc(int,int,.) (24 bytes)
AbstractByteBuf::forEachByteDesc0 (69 bytes)
Modifications:
- Reduce the code for each method in the AbstractByteBuf#forEach and AbstractByteBuf#forEachDesc call stack
Result:
AbstractByteBuf::forEachByte (25 bytes)
AbstractByteBuf::forEachByte(int,int,..) (25 bytes)
AbstractByteBuf::forEachByteAsc0 (29 bytes)
AbstractByteBuf::forEachByteDesc (25 bytes)
AbstractByteBuf::forEachByteDesc(int,int,..) (27 bytes)
AbstractByteBuf::forEachByteDesc0 (29 bytes)
Motivation:
We not need to do an extra conditional check in retain(...) as we can just check for overflow after we did the increment.
Modifications:
- Remove extra conditional check
- Add test code.
Result:
One conditional check less.
Motivation:
AbstractReferenceCountedByteBuf as independent conditional statements to check the bounds of the retain IllegalReferenceCountException condition. One of the exceptions also uses the incorrect increment. The same fix was done for AbstractReferenceCounted as 01523e7835.
Modifications:
- Combined independent conditional checks into 1 where possible
- Correct IllegalReferenceCountException with incorrect increment
- Remove the subtract to check for overflow and re-use the addition and check for overflow to remove 1 arithmetic operation (see http://docs.oracle.com/javase/specs/jls/se7/html/jls-15.html#jls-15.18.2)
Result:
AbstractReferenceCountedByteBuf has less independent branch statements and more correct IllegalReferenceCountException. Compilation size of AbstractReferenceCountedByteBuf.retain() is reduced.
Motivation:
Some usages of findNextPositivePowerOfTwo assume that bounds checking is taken care of by this method. However bounds checking is not taken care of by findNextPositivePowerOfTwo and instead assert statements are used to imply the caller has checked the bounds. This can lead to unexpected non power of 2 return values if the caller is not careful and thus invalidate any logic which depends upon a power of 2.
Modifications:
- Add a safeFindNextPositivePowerOfTwo method which will do runtime bounds checks and always return a power of 2
Result:
Fixes https://github.com/netty/netty/issues/5601
Motivation:
When Unpooled.wrappedBuffer(...) is called with an array of ByteBuf with length >= 2 and the first ByteBuf is not readable it will result in double releasing of these empty buffers when release() is called on the returned buffer.
Modifications:
- Ensure we only wrap readable buffers.
- Add unit test
Result:
No double release of buffers.
Motivation:
retainSlice() currently does not unwrap the ByteBuf when creating the ByteBuf wrapper. This effectivley forms a linked list of ByteBuf when it is only necessary to maintain a reference to the unwrapped ByteBuf.
Modifications:
- retainSlice() and retainDuplicate() variants should only maintain a reference to the unwrapped ByteBuf
- create new unit tests which generally verify the retainSlice() behavior
- Remove unecessary generic arguments from AbstractPooledDerivedByteBuf
- Remove unecessary int length member variable from the unpooled sliced ByteBuf implementation
- Rename the unpooled sliced/derived ByteBuf to include Unpooled in their name to be more consistent with the Pooled variants
Result:
Fixes https://github.com/netty/netty/issues/5582
Motivation:
SwappedByteBuf.retainedSlice(...) did not return a retained buffer.
Modifications:
Correctly delegate to retainedSlice(..) calls.
Result:
Correctly return retained slice.
Motivation:
Because of a bug we missed to include the first PoolSubpage when collection metrics.
Modifications:
- Correctly include all subpages
- Add unit test
Result:
Correctly include all subpages
Allow users of Netty to plug in their own leak detector for the purpose
of instrumentation.
Motivation:
We are rolling out a large Netty deployment and want to be able to
track the amount of leaks we're seeing in production via custom
instrumentation. In order to achieve this today, I had to plug in a
custom `ByteBufAllocator` into the bootstrap and have it initialize a
custom `ResourceLeakDetector`. Due to these classes mostly being marked
`final` or having private or static methods, a lot of the code had to
be copy-pasted and it's quite ugly.
Modifications:
* I've added a static loader method for the `ResourceLeakDetector` in
`AbstractByteBuf` that tries to instantiate the class passed in via the
`-Dio.netty.customResourceLeakDetector`, otherwise falling back to the
default one.
* I've modified `ResourceLeakDetector` to be non-final and to have the
reporting broken out in to methods that can be overridden.
Result:
You can instrument leaks in your application by just adding something
like the following:
```java
public class InstrumentedResourceLeakDetector<T> extends
ResourceLeakDetector<T> {
@Monitor("InstanceLeakCounter")
private final AtomicInteger instancesLeakCounter;
@Monitor("LeakCounter")
private final AtomicInteger leakCounter;
public InstrumentedResourceLeakDetector(Class<T> resource) {
super(resource);
this.instancesLeakCounter = new AtomicInteger();
this.leakCounter = new AtomicInteger();
}
@Override
protected void reportTracedLeak(String records) {
super.reportTracedLeak(records);
leakCounter.incrementAndGet();
}
@Override
protected void reportUntracedLeak() {
super.reportUntracedLeak();
leakCounter.incrementAndGet();
}
@Override
protected void reportInstancesLeak() {
super.reportInstancesLeak();
instancesLeakCounter.incrementAndGet();
}
}
```
Motivation:
See #82.
Modifications:
- Added `isText` to validate if the given ByteBuf is compliant with the specified charset.
- Optimized for UTF-8 and ASCII. For other cases, `CharsetDecoder.decoder` is used.
Result:
Users can validate ByteBuf with given charset.
Motivation:
We need to first store a reference to the wrapped buffer before recycle the AbstractPooledDerivedByteBuf instance. This is needed as otherwise it is possible that the same AbstractPooledDerivedByteBuf is again obtained and init(...) is called before we actually have a chance to call release(). This leads to call release() on the wrong buffer.
Modifications:
Store a reference to the wrapped buffer before call recycle and call release on the previous stored reference.
Result:
Always release the correct wrapped buffer when deallocate the AbstractPooledDerivedByteBuf.
Motivation:
If the user uses unsafe direct buffers with no cleaner we can use Unsafe.reallocateMemory(...) as optimization when we need to expand the buffer.
Modifications:
Use Unsafe.relocateMemory(...) in UnpooledUnsafeNoCleanerDirectByteBuf.
Result:
Less expensive expanding of buffers.
Motivation:
Using the Cleaner to release the native memory has a few drawbacks:
- Cleaner.clean() uses static synchronized internally which means it can be a performance bottleneck
- It put more load on the GC
Modifications:
Add new buffer implementations that can be enabled with a system flag as optimizations. In this case no Cleaner is used at all and the user must ensure everything is always released.
Result:
Less performance impact by direct buffers when need to be allocated and released.
Motivation:
Unsafe offers a method to set memory to a specific value. This can be used to implement an optimized version of setZero(...) and writeZero(...)
Modifications:
Add implementation for all Unsafe*ByteBuf implementations.
Result:
Faster setZero(...) and writeZero(...)
Motivation:
At the moment the user is responsible to increase the writer index of the composite buffer when a new component is added. We should add some methods that handle this for the user as this is the most popular usage of the composite buffer.
Modifications:
Add new methods that autoamtically increase the writerIndex when buffers are added.
Result:
Easier usage of CompositeByteBuf.
Motivation:
We missed to override a few methods and so some actions on the ByteBuf failed.
Modifications:
- Override all methods
- Add unit tests to ensure all is fixed.
Result:
All *LeakAware*ByteBuf have correct implementations
Motivation:
Recycler.recycle(...) should not be used anymore and be replaced by Handle.recycle().
Modifications:
Mark it as deprecated and update usage.
Result:
Correctly document deprecated api.
Motivation:
DefaultByteBufHolder.equals(...) and hashCode() should be implemented so it works correctly with instances that share the same content.
Modifications:
Add implementations and a unit-test.
Result:
Have correctly working equals(...) and hashCode() method
Related: #4333#4421#5128
Motivation:
slice(), duplicate() and readSlice() currently create a non-recyclable
derived buffer instance. Under heavy load, an application that creates a
lot of derived buffers can put the garbage collector under pressure.
Modifications:
- Add the following methods which creates a non-recyclable derived buffer
- retainedSlice()
- retainedDuplicate()
- readRetainedSlice()
- Add the new recyclable derived buffer implementations, which has its
own reference count value
- Add ByteBufHolder.retainedDuplicate()
- Add ByteBufHolder.replace(ByteBuf) so that..
- a user can replace the content of the holder in a consistent way
- copy/duplicate/retainedDuplicate() can delegate the holder
construction to replace(ByteBuf)
- Use retainedDuplicate() and retainedSlice() wherever possible
- Miscellaneous:
- Rename DuplicateByteBufTest to DuplicatedByteBufTest (missing 'D')
- Make ReplayingDecoderByteBuf.reject() return an exception instead of
throwing it so that its callers don't need to add dummy return
statement
Result:
Derived buffers are now recycled when created via retainedSlice() and
retainedDuplicate() and derived from a pooled buffer
Motivation:
We called deallocationsHuge.decrement() but it needs to be increment()
Modifications:
Replace decrement() with increment()
Result:
Correct metrics.
Motivation:
Often users either need to read or write CharSequences to a ByteBuf. We should add methods for this to ByteBuf as we can do some optimizations for this depending on the implementation.
Modifications:
Add setCharSequence, writeCharSequence, getCharSequence and readCharSequence
Result:
Easier reading / writing of CharSequence with ByteBuf.
Motivation:
Reduce nag warnings when compiling, make it easier for IDEs to display what's deprecated.
Modifications:
Added @Deprecated in a few places
Result:
No more warnings.
Motivation:
We lately added ByteBuf.isReadOnly() which allows to detect if a buffer is read-only or not. We should add ByteBuf.asReadOnly() to allow easily access a read-only version of a buffer.
Modifications:
- Add ByteBuf.asReadOnly()
- Deprecate Unpooled.unmodifiableBuffer(Bytebuf)
Result:
More consistent api.
Motivation:
When FixedCompositeByteBuf was constructed with new ByteBuf[0] and IndexOutOfboundsException was thrown.
Modifications:
Fix constructor
Result:
No more exception
Motivation:
We should not cache the SwappedByteBuf in AbstractByteBuf to reduce the memory footprint.
Modifications:
Not cache the SwappedByteBuf.
Result:
Less memory footprint.
Motivation:
Some ByteBuf implementations do not override all necessary methods,
which can lead to potentially sub-optimal behavior.
Also, SlicedByteBuf does not perform the range check correctly due to
missing overrides.
Modifications:
- Add missing overrides
- Use unwrap() instead of direct member access in derived buffers for
consistency
- Merge unwrap0() into unwrap() using covariant return type
- Deprecate AbstractDerivedByteBuf and its subtypes, because they were
not meant to be public
Result:
Correctness
Motivation:
ByteBuf.readBytes(...) uses Unpooled.buffer(...) internally which will use a heap ByteBuf and also not able to make use of the allocator which may be pooled. We should better make use of the allocator.
Modifications:
Use the allocator for thenew buffer.
Result:
Take allocator into account when copy bytes.
Motiviation:
Sometimes it is useful to dump the status of the PooledByteBufAllocator and log it. Doing this is currently a bit cumbersome as the user needs to basically iterate through all the metrics and compose the String. we would better provide an easy way to do this.
Modification:
Add dumpStats() method.
Result:
Easier to get a view into the status of the allocator.
Motivation:
PoolChunkList.allocate(...) should return false without the need to walk all the contained PoolChunks when the requested capacity is larger then the capacity that can be allocated out of the PoolChunks in respect to the minUsage() and maxUsage() of the PoolChunkList.
Modifications:
Precompute the maximal capacity that can be allocated out of the PoolChunks that are contained in the PoolChunkList and use this to fast return from the allocate(...) method if an allocation capacity larger then that is requested.
Result:
Faster detection of allocations that can not be handled by the PoolChunkList and so faster allocations in general via the PoolArena.
Motivation:
To better understand how much memory is used by Netty for ByteBufs it is useful to understand how many bytes are currently active (allocated) per PoolArena.
Modifications:
- Add PoolArenaMetric.numActiveBytes()
Result:
The user is able to get better insight into the PooledByteBufAllocator.
Motivation:
To make it easier to understand PoolChunk and PoolArena we should cleanup duplicated code.
Modifications:
- Move reused code into methods
- Use Math.max(...)
Result:
Cleaner code and easier to understand.
Motivation:
When doing a normal allocation in PoolArena we also tried to allocate out of the PoolChunkList that only contains completely full PoolChunks. This makes no sense as these are full anyway so an allocation will never work here and just gives a perf hit as we need to walk the whole list of PoolChunks in the list.
Modifications:
Not try to allocate from PoolChunkList that only contains full PoolChunks
Result:
Faster allocation times when a new PoolChunk must be created.
Motivation:
We should better use Math utilities as these are intrinsics. This is a cleanup for ea3ffb8536.
Modifications:
Use Math utilities.
Result:
Cleaner code and use of intrinsics.
Motivation:
When a PoolChunk needs to get moved to the previous PoolChunkList because of the minUsage / maxUsage constraints we always just moved it one level which is incorrect and so could lead to have PoolChunks in the wrong PoolChunkList (in respect to their minUsage / maxUsage settings). This then could have the effect that PoolChunks are not released / freed in a timely fashion and so.
Modifications:
- Correctly move PoolChunks between PoolChunkLists, which includes moving it multiple "levels".
- Add unit test
Result:
Correctlty move the PoolChunk to PoolChunkList when it is freed, even if its multiple layers.
Motivation:
The PoolChunkList.minUsage() and maxUsage() needs to take special action to translate Integer.MIN_VALUE / MAX_VALUE as these are used internal for tail and head of the linked-list structure.
Modifications:
- Correct the minUsage() and maxUsage() methods.
- Add unit test.
Result:
Correct metrics
Motivation:
Sometimes it is useful to allow to disable the leak detection of buffers if the UnpooledByteBufAllocator is used. This is for example true if the app wants to leak buffers into user code but not want to put the burden on the user to always release the buffer.
Modifications:
Add another constructor to UnpooledByteBufAllocator that allows to completely disable leak-detection for all buffers that are allocator out of the UnpooledByteBufAllocator.
Result:
It's possible to disable leak-detection when the UnpooledByteBufAllocator is used.
Motivation:
We should only increment the metric for the huge / normal allocation after it is done. Also we should only decrement once deallocate.
Modifications:
- Move increment after the allocation.
- Fix deallocation metric and move it after deallocation
Result:
More correct metrics.
Motivation:
PoolThreadCache includes the wrong value when throwing a IllegalArgumentException because of freeSweepAllocationThreshold.
Modifications:
Use the correct value.
Result:
Correct exception message.
Motivation:
The method setBytes creates temporary heap buffer when source buffer is read-only.
But this temporary buffer is not used correctly and may lead to data corruption.
This problem occurs when target buffer is pooled and temporary buffer
arrayOffset() is not zero.
Modifications:
Use correct arrayOffset when calling PlatformDependent.copyMemory.
Unit test was added to test this case.
Result:
Setting buffer content works correctly when target is pooled buffer and source
is read-only ByteBuffer.
Motivation:
We also need to add synchronization when access fields to ensure we see the latest updates.
Modifications:
Add synchronization when read fields that are written concurrently.
Result:
Ensure correct visibility of updated.
Motivation:
See #1811
Modifications:
Add LineEncoder and LineSeparator
Result:
The user can use LineEncoder to write a String with a line separator automatically
Motivation:
We had some double spacing in the methods which should be removed to keep things consistent.
Modifications:
Remove redundant spaces.
Result:
Cleaner / consistent coding style.
Motivation:
Circular assignment of arenas to thread caches can lead to less than optimal
mappings in cases where threads are (frequently) shutdown and started.
Example Scenario:
There are a total of 2 arenas. The first two threads performing an allocation
would lead to the following mapping:
Thread 0 -> Arena 0
Thread 1 -> Arena 1
Now, assume Thread 1 is shut down and another Thread 2 is started. The current
circular assignment algorithm would lead to the following mapping:
Thread 0 -> Arena 0
Thread 2 -> Arena 0
Ideally, we want Thread 2 to use Arena 1 though.
Presumably, this is not much of an issue for most Netty applications that do all
the allocations inside the eventloop, as eventloop threads are seldomly shut down
and restarted. However, applications that only use the netty-buffer package
or implement their own threading model outside the eventloop might suffer from
increased contention. For example, gRPC Java when using the blocking stub
performs some allocations outside the eventloop and within its own thread pool
that is dynamically sized depending on system load.
Modifications:
Implement a linear scan algorithm that assigns a new thread cache to the arena
that currently backs the fewest thread caches.
Result:
Closer to ideal mappings between thread caches and arenas. In order to always
get an ideal mapping, we would have to re-balance the mapping whenever a thread
dies. However, that's difficult because of deallocation.
Motivation:
The statistic counters PoolArena.(allocationsTiny|allocationsSmall) are
not protected by a per arena lock, but by a per size class lock. Thus,
two concurrent allocations of different size (class) could lead to a
race and ultimately to wrong statistics.
Modifications:
Use a thread-safe LongCounter instead of a plain long data type.
Result:
Fewer data races.
Motivation:
See #3321
Modifications:
1. Add CharsetUtil.encoder/decoder() methods
2. Deprecate CharsetUtil.getEncoder/getDecoder() methods
Result:
Users can use new CharsetUtil.encoder/decoder() to specify error actions
Motivation:
Utility methods in ByteBufUtil to writeUtf8 and writeAscii expect a buffer to already be allocated. If the user does not have a buffer allocated they have to know details of the encoding in order to know the size of the buffer to allocate.
Modifications:
- Add writeUtf8 and writeAscii which take a ByteBufAllocator and allocate a ByteBuf of the correct size for the user
Result:
ByteBufUtil methods which are easier to use if the user doesn't already have a ByteBuf.
Motivation:
[#4842] introduced 4 new methods but missed to implement advanced leak detection for these.
Modifications:
Correctly implement advanced leak detection for these methods.
Result:
Advanced leak detection works for all methods as expected.