Motivation:
64-byte alignment is recommended by the Intel performance guide (https://software.intel.com/en-us/articles/practical-intel-avx-optimization-on-2nd-generation-intel-core-processors) for data-structures over 64 bytes.
Requiring padding to a multiple of 64 bytes allows for using SIMD instructions consistently in loops without additional conditional checks. This should allow for simpler and more efficient code.
Modification:
At the moment cache alignment must be setup manually. But probably it might be taken from the system. The original code was introduced by @normanmaurer https://github.com/netty/netty/pull/4726/files
Result:
Buffer alignment works better than miss-align cache.
Motivation:
We not had tests for ByteBufAllocator implementations in general.
Modifications:
Added ByteBufAllocatorTest, AbstractByteBufAllocatorTest and UnpooledByteBufAllocatorTest
Result:
More tests for allocator implementations.
Motivation:
PooledByteBuf.capacity(...) miss to enforce maxCapacity() and so its possible to increase the capacity of the buffer even if it will be bigger then maxCapacity().
Modifications:
- Correctly enforce maxCapacity()
- Add unit tests for capacity(...) calls.
Result:
Correctly enforce maxCapacity().
Motivation:
When An HTTP server is listening in plaintext mode, it doesn't have
a chance to negotiate "h2" in the tls handshake. HTTP 1 clients
that are not expecting an HTTP2 server will accidentally a request
that isn't an upgrade, which the HTTP/2 decoder will not
understand. The decoder treats the bytes as hex and adds them to
the error message.
These error messages are hard to understand by humans, and result
in extra, manual work to decode.
Modification:
If the first bytes of the request are not the preface, the decoder
will now see if they are an HTTP/1 request first. If so, the error
message will include the method and path of the original request in
the error message.
In case the path is long, the decoder will check up to the first
1024 bytes to see if it matches. This could be a DoS vector if
tons of bad requests or other garbage come in. A future optimization
would be to treat the first few bytes as an AsciiString and not do
any Charset decoding. ByteBuf.toCharSequence alludes to such an
optimization.
The code has been left simple for the time being.
Result:
Faster identification of errant HTTP requests.
Motivation:
Disable ThreadLocal Cache, then allocate Pooled ByteBuf and release all these buffers, PoolArena's tiny/small/normal allocation count is incorrect.
Modifications:
- Calculate PoolArena's tiny/small/normal allocation one time
- Add testAllocationCounter TestCase
Result:
Fixes#6282 .
Motivation:
In PooledByteBuf we missed to null out the chunk and tmpNioBuf fields before recycle it to the Recycler. This could lead to keep objects longer alive then necessary which may hold a lot of memory.
Modifications:
Null out tmpNioBuf and chunk before recycle.
Result:
Possible to earlier GC objects.
Motivation:
ByteBufUtil.compare uses long arithmetic but doesn't check for underflow on when converting from long to int to satisfy the Comparable interface. This will result in incorrect comparisons and violate the Comparable interface contract.
Modifications:
- ByteBufUtil.compare should protect against int underflow
Result:
Fixes https://github.com/netty/netty/issues/6169
Motivation:
In later Java8 versions our Atomic*FieldUpdater are slower then the JDK implementations so we should not use ours anymore. Even worse the JDK implementations provide for example an optimized version of addAndGet(...) using intrinsics which makes it a lot faster for this use-case.
Modifications:
- Remove methods that return our own Atomic*FieldUpdaters.
- Use the JDK implementations everywhere.
Result:
Faster code.
Motivation:
We should assert that the leak aware buffers correctly close the ResourceLeakTracker in the unit tests.
Modifications:
- Keep track of NoopResourceLeakTrackers and check if these were closed once the test completes
- Fix bugs in tests so the buffers are all released.
Result:
Better tests for leak aware buffers
Motivation:
If caches are disabled it does not make sense to schedule a task that will free up memory consumed by the caches.
Modifications:
Do not schedule if caches are disabled.
Result:
Less overhead.
Motivation:
We need to ensure the tracked object can not be GC'ed before ResourceLeak.close() is called as otherwise we may get false-positives reported by the ResourceLeakDetector. This can happen as the JIT / GC may be able to figure out that we do not need the tracked object anymore and so already enqueue it for collection before we actually get a chance to close the enclosing ResourceLeak.
Modifications:
- Add ResourceLeakTracker and deprecate the old ResourceLeak
- Fix some javadocs to correctly release buffers.
- Add a unit test for ResourceLeakDetector that shows that ResourceLeakTracker has not the problems.
Result:
No more false-positives reported by ResourceLeakDetector when ResourceLeakDetector.track(...) is used.
Motivation:
PooledByteBufAllocatorTest uses an ArrayQueue but access it from multiple threads (not concurrently but still from different threads). This may leak to memory visibility issues.
Modifications:
- Use a concurrent queue
- Some cleanup
Result:
Non racy test code.
Motivation:
If a user allocates a lot from outside the EventLoop we may end up creating a lot of caches in the PooledByteBufAllocator. This may be wasteful and so it may be useful for an other to configure that caches should only be used from within EventLoops.
Modifications:
Add new constructor which allows to configure the caching behaviour.
Result:
More flexible configuration of PooledByteBufAllocator possible
Motivation:
We support using Netty without sun.misc.Unsafe, so we should also support building it without it. This way we can also run all tests without sun.misc.Unsafe and so see if it works as expected.
Modifications:
Correctly skip tests that depend on sun.misc.Unsafe if its not present or -Dio.netty.noUnsafe=true is used.
Result:
Be able to build netty without sun.misc.Unsafe
Motivation:
SwappedByteBuf.unwrap() not returned the wrapped buffer but the buffer that was wrapped by the original buffer. This is not correct.
Modifications:
Correctly return wrapped buffer and fix test.
Result:
SwappedByteBuf.unwrap() works as expected.
Motivation:
We had a few tests PooledByteBufAllocatorTests which used parkNanos(...) to give a resource enough time to get destroyed. This is race and may not be good enough.
Modifications:
Ensure the ThreadCache is really destroyed.
Result:
No more racy tests that depend on ThreadCaches.
Motivation:
IntelliJ issues several warnings.
Modifications:
* `ClientCookieDecoder` and `ServerCookieDecoder`:
* `nameEnd`, `valueBegin` and `valueEnd` don't need to be initialized
* `keyValLoop` loop doesn't been to be labelled, as it's the most inner one (same thing for labelled breaks)
* Remove `if (i != headerLen)` as condition is always true
* `ClientCookieEncoder` javadoc still mention old logic
* `DefaultCookie`, `ServerCookieEncoder` and `DefaultHttpHeaders` use ternary ops that can be turned into simple boolean ones
* `DefaultHeaders` uses a for(int) loop over an array. It can be turned into a foreach one as javac doesn't allocate an iterator to iterate over arrays
* `DefaultHttp2Headers` and `AbstractByteBuf` `equal` can be turned into a single boolean statement
Result:
Cleaner code
Motivation:
4bba7526e2 introduced changes which made pooled and unpooled derived buffers inconsistent in a few ways:
- Pooled derived buffers always generated a duplicate buffer when duplicate() was called and always generated a sliced buffer when slice() was called. Unpooled derived buffers some times generated a sliced buffer when duplicate() was called.
- The indexes that were set for duplicate buffers generated from slices were not always consistent.
There were also some various bugs in the derived pooled buffer implementation.
Modifications:
- Make pooled/unpooled consistently generate duplicate buffers when duplicate() is called and sliced buffers when slice() is called.
- Fix bugs in the derived pooled buffer
Result:
More consistent behavior from the derived pooled/unpooled buffers.
Motiviation:
We used ReferenceCountUtil.releaseLater(...) in our tests which simplifies a bit the releasing of ReferenceCounted objects. The problem with this is that while it simplifies stuff it increase memory usage a lot as memory may not be freed up in a timely manner.
Modifications:
- Deprecate releaseLater(...)
- Remove usage of releaseLater(...) in tests.
Result:
Less memory needed to build netty while running the tests.
Motivation:
Currently the ByteBuf created as a result of retained[Slice|Duplicate] maintains its own reference count, and when this reference count is depleated it will release the ByteBuf returned from unwrap(). The unwrap() buffer is designed to be the 'root parent' and will skip all intermediate layers of buffers. If the intermediate layers of buffers contain a retained[Slice|Duplicate] then these reference counts will be ignored during deallocation. This may lead to deallocating the 'root parent' before all derived pooled buffers are actually released. This same issue holds if a retained[Slice|Duplicate] is in the heirachy and a 'regular' slice() or duplicate() buffer is created.
Modifications:
- AbstractPooledDerivedByteBuf must maintain a reference to the direct parent (the buffer which retained[Slice|Duplicate] was called on) and release on this buffer instead of the 'root parent' returned by unwrap()
- slice() and duplicate() buffers created from AbstractPooledDerivedByteBuf must also delegate reference count operations to their immediate parent (or first ancestor which maintains an independent reference count).
Result:
Fixes https://github.com/netty/netty/issues/5999
Motivation:
Netty provides a adaptor from ByteBuf to Java's InputStream interface. The JDK Stream interfaces have an explicit lifetime because they implement the Closable interface. This lifetime may be differnt than the ByteBuf which is wrapped, and controlled by the interface which accepts the JDK Stream. However Netty's ByteBufInputStream currently does not take reference count ownership of the underlying ByteBuf. There may be no way for existing classes which only accept the InputStream interface to communicate when they are done with the stream, other than calling close(). This means that when the stream is closed it may be appropriate to release the underlying ByteBuf, as the ownership of the underlying ByteBuf resource may be transferred to the Java Stream.
Motivation:
- ByteBufInputStream.close() supports taking reference count ownership of the underyling ByteBuf
Result:
ByteBufInputStream can assume reference count ownership so the underlying ByteBuf can be cleaned up when the stream is closed.
Motivation:
In some ByteBuf implementations we not correctly implement getBytes(index, ByteBuffer).
Modifications:
Correct code to do what is defined in the javadocs and adding test.
Result:
Implementation works as described.
Motivation:
the build doesnt seem to enforce this, so they piled up
Modifications:
removed unused import lines
Result:
less unused imports
Signed-off-by: radai-rosenblatt <radai.rosenblatt@gmail.com>
Motivation:
We need to ensure we release all direct memory once the DirectPoolArena is collected. Otherwise we may never reclaim the memory and so leak memory.
Modifications:
Ensure we destroy all PoolChunk memory when DirectPoolArena is collected.
Result:
Free up unreleased memory when DirectPoolArena is collected.
Motivation:
We can share the code in retain() and retain(...) and also in release() and release(...).
Modifications:
Share code.
Result:
Less duplicated code.
Motivation:
We introduced a regression in 1abdbe6f67 which let the iteration start from the wrong index.
Modifications:
Fix start index and add tests.
Result:
Fix regression.
Motivation:
Result of ByteBufUtil.compare(ByteBuf a, ByteBuf b) is dependent on ByteOrder of supplied ByteBufs which should not be the case (as stated in the javadocs).
Modifications:
Ensure we get a consistent behavior when calling ByteBufUtil.compare(ByteBuf a, ByteBuf b) and not depend on ByteOrder.
Result:
ByteBufUtil.compare(ByteBuf a, ByteBuf b) and so AbstractByteBuf.compare(...) works correctly as stated in the javadocs.
Motivation:
Sometimes it is useful to be able to wrap an existing memory address (a.k.a pointer) and create a ByteBuf from it. This way its easier to interopt with other libraries.
Modifications:
Add a new Unpooled.wrappedBuffer(....) method that takes a memory address.
Result:
Be able to wrap an existing memory address into a ByteBuf.
Motivation:
The default limit for the maximum amount of bytes that a method will be inlined is 35 bytes. AbstractByteBuf#forEach and AbstractByteBuf#forEachDesc comprise of method calls which are more than this maximum default threshold and may prevent or delay inlining for occuring. The byte code for these methods can be reduced to allow for easier inlining. Here are the current byte code sizes:
AbstractByteBuf::forEachByte (24 bytes)
AbstractByteBuf::forEachByte(int,int,..) (14 bytes)
AbstractByteBuf::forEachByteAsc0 (71 bytes)
AbstractByteBuf::forEachByteDesc (24 bytes)
AbstractByteBuf::forEachByteDesc(int,int,.) (24 bytes)
AbstractByteBuf::forEachByteDesc0 (69 bytes)
Modifications:
- Reduce the code for each method in the AbstractByteBuf#forEach and AbstractByteBuf#forEachDesc call stack
Result:
AbstractByteBuf::forEachByte (25 bytes)
AbstractByteBuf::forEachByte(int,int,..) (25 bytes)
AbstractByteBuf::forEachByteAsc0 (29 bytes)
AbstractByteBuf::forEachByteDesc (25 bytes)
AbstractByteBuf::forEachByteDesc(int,int,..) (27 bytes)
AbstractByteBuf::forEachByteDesc0 (29 bytes)
Motivation:
We used incorrect assumeTrue(...) checks.
Modifications:
Fix check.
Result:
Be able to run tests also if java.nio.DirectByteBuffer.<init>(long, int) could not be accessed.
Motivation:
We not need to do an extra conditional check in retain(...) as we can just check for overflow after we did the increment.
Modifications:
- Remove extra conditional check
- Add test code.
Result:
One conditional check less.
Motivation:
AbstractReferenceCountedByteBuf as independent conditional statements to check the bounds of the retain IllegalReferenceCountException condition. One of the exceptions also uses the incorrect increment. The same fix was done for AbstractReferenceCounted as 01523e7835.
Modifications:
- Combined independent conditional checks into 1 where possible
- Correct IllegalReferenceCountException with incorrect increment
- Remove the subtract to check for overflow and re-use the addition and check for overflow to remove 1 arithmetic operation (see http://docs.oracle.com/javase/specs/jls/se7/html/jls-15.html#jls-15.18.2)
Result:
AbstractReferenceCountedByteBuf has less independent branch statements and more correct IllegalReferenceCountException. Compilation size of AbstractReferenceCountedByteBuf.retain() is reduced.
Motivation:
Some usages of findNextPositivePowerOfTwo assume that bounds checking is taken care of by this method. However bounds checking is not taken care of by findNextPositivePowerOfTwo and instead assert statements are used to imply the caller has checked the bounds. This can lead to unexpected non power of 2 return values if the caller is not careful and thus invalidate any logic which depends upon a power of 2.
Modifications:
- Add a safeFindNextPositivePowerOfTwo method which will do runtime bounds checks and always return a power of 2
Result:
Fixes https://github.com/netty/netty/issues/5601
Motivation:
When Unpooled.wrappedBuffer(...) is called with an array of ByteBuf with length >= 2 and the first ByteBuf is not readable it will result in double releasing of these empty buffers when release() is called on the returned buffer.
Modifications:
- Ensure we only wrap readable buffers.
- Add unit test
Result:
No double release of buffers.
Motivation:
retainSlice() currently does not unwrap the ByteBuf when creating the ByteBuf wrapper. This effectivley forms a linked list of ByteBuf when it is only necessary to maintain a reference to the unwrapped ByteBuf.
Modifications:
- retainSlice() and retainDuplicate() variants should only maintain a reference to the unwrapped ByteBuf
- create new unit tests which generally verify the retainSlice() behavior
- Remove unecessary generic arguments from AbstractPooledDerivedByteBuf
- Remove unecessary int length member variable from the unpooled sliced ByteBuf implementation
- Rename the unpooled sliced/derived ByteBuf to include Unpooled in their name to be more consistent with the Pooled variants
Result:
Fixes https://github.com/netty/netty/issues/5582
Motivation:
At the moment the Recyler is very sensitive to allocation bursts which means that if there is a need for X objects for only one time these will most likely end up in the Recycler and sit there forever as the normal workload only need a subset of this number.
Modifications:
Add a ratio which sets how many objects should be pooled for each new allocation. This allows to slowly increase the number of objects in the Recycler while not be to sensitive for bursts.
Result:
Less unused objects in the Recycler if allocation rate sometimes bursts.