Commit Graph

892 Commits

Author SHA1 Message Date
root
d066f163d7 [maven-release-plugin] prepare release netty-4.1.43.Final 2019-10-24 12:56:30 +00:00
root
92941cdcac [maven-release-plugin] prepare for next development iteration 2019-09-25 06:15:31 +00:00
root
bd907c3b3a [maven-release-plugin] prepare release netty-4.1.42.Final 2019-09-25 06:14:31 +00:00
root
01d805bb76 [maven-release-plugin] prepare for next development iteration 2019-09-12 16:09:55 +00:00
root
7cf69022d4 [maven-release-plugin] prepare release netty-4.1.41.Final 2019-09-12 16:09:00 +00:00
root
aef47bec7f [maven-release-plugin] prepare for next development iteration 2019-09-12 05:38:11 +00:00
root
267e5da481 [maven-release-plugin] prepare release netty-4.1.40.Final 2019-09-12 05:37:30 +00:00
Nick Hill
1039f69e53 Fix for incorrect values from CompositeByteBuf#component(int) (#9525)
Motivation

This is a "simpler" alternative to #9416 which fixes the same
CompositeByteBuf bugs described there, originally reported by @jingene
in #9398.

Modifications
- Add fields to Component class for the original buffer along with its
adjustment, which may be different to the already-stored unwrapped
buffer. Use it in appropriate places to ensure correctness and
equivalent behaviour to that prior to the earlier optimizations
- Add comments explaining purpose of each of the Component fields
- Unwrap more kinds of buffers in newComponent method to extend scope of
the existing indirection-reduction optimization
- De-duplicate common buffer consolidation logic
- Unit test for the original bug provided by @jingene

Result
- Correct behaviour / fixed bugs
- Some code deduplication / simplification
- Unwrapping optimization applied to more types of buffers

The downside is increased mem footprint from the two new fields, and
additional allocations in some specific cases, though those should be
rare.


Co-authored-by: jingene <jingene0206@gmail.com>
2019-09-02 13:52:17 +02:00
Codrut Stancu
b7e9829a49 Update GraalVM Native Image configuration. (#9515)
Motivation:

The Netty classes are initialized at build time by default for GraalVM Native Image compilation. This is configured via the `--initialize-at-build-time=io.netty` option. While this reduces start-up time it can lead to some problems:

 - The class initializer of `io.netty.buffer.PooledByteBufAllocator` looks at the maximum memory size to compute the size of internal buffers. If the class initializer runs during image generation, then the buffers are sized according to the very large heap size that the image generator uses, and Netty allocates several arrays that are 16 MByte. The fix is to initialize the following 3 classes at run time: `io.netty.buffer.PooledByteBufAllocator,io.netty.buffer.ByteBufAllocator,io.netty.buffer.ByteBufUtil`. This fix was dependent on a GraalVM Native Image fix that was included in 19.2.0.

 - The class initializer of `io.netty.handler.ssl.util.ThreadLocalInsecureRandom` needs to be initialized at runtime to ensure that the generated values are trully random and not fixed for each generated image.

 - The class initializers of `io.netty.buffer.AbstractReferenceCountedByteBuf` and `io.netty.util.AbstractReferenceCounted` compute field offsets. While the field offset recomputation is necessary for correct execution as a native image these initializers also have logic that depends on the presence/absence of `sun.misc.Unsafe`, e.g., via the `-Dio.netty.noUnsafe=true` flag. The fix is to push these initializers to runtime so that the field offset lookups (and the logic depending on them) run at run time. This way no manual substitutions are necessary either.
 
Modifications:

Add `META-INF/native-image` configuration files that correctly trigger the inialization of the above classes at run time via `--initialize-at-run-time=...` flags.
 
Result:

Fixes the initialisation issues described above for Netty executables built with GraalVM.
2019-08-30 09:21:11 +02:00
Norman Maurer
da2aba5742
Reduce GC produced by AbstractByteBuf.indexOf(..) implementation (#9502)
Motivation:

AbstractByteBuf.indexOf(...) currently delegates to ByteBufUtils.indexOf(...) which will create a new ByteBufProcessor on each call. This is done to reduce overhead of bounds-checks. Unfortunally while this reduces bounds checks it produces a lot of GC. We can just implement our own version in AbstractByteBuf which makes use of _getByte(...) and so does no bound checks as well but also not need to create any garbage.

Modifications:

Write optimized implementation of indexOf(...) for AbstractByteBuf

Result:

Fixes https://github.com/netty/netty/issues/9499.
2019-08-24 13:46:28 +02:00
Norman Maurer
d8e59ca638
Avoid creating FileInputStream and FileOutputStream for obtaining Fil… (#8110)
Motivation:

If all we need is the FileChannel we should better use RandomAccessFile as FileInputStream and FileOutputStream use a finalizer.

Modifications:

Replace FileInputStream and FileOutputStream with RandomAccessFile when possible.

Result:

Fixes https://github.com/netty/netty/issues/8078.
2019-08-17 09:43:01 +02:00
Nick Hill
4db38b4e0c Don't zero non-readable buffer regions when capacity is decreased (#9427)
Motivation

#1802 fixed ByteBuf implementations to ensure that the whole buffer
region is preserved when capacity is increased, not just the readable
part. The behaviour is still different however when the capacity is
_decreased_ - data outside the currently-readable region is zeroed.

Modifications

Update ByteBuf capacity(int) implementations to also copy the whole
buffer region when the new capacity is less than the current capacity.

Result

Consistent behaviour of ByteBuf#capacity(int) regardless of whether the
new capacity is greater than or less than the current capacity.
2019-08-16 08:18:09 +02:00
root
d45a4ce01b [maven-release-plugin] prepare for next development iteration 2019-08-13 17:16:42 +00:00
root
88c2a4cab5 [maven-release-plugin] prepare release netty-4.1.39.Final 2019-08-13 17:15:20 +00:00
Nick Hill
695aa0959d Use alloc().heapBuffer(...) to allocate new heap buffer.
Motivation

Underlying array allocations in UnpooledHeapByteBuf are intended be done
via the protected allocateArray(int) method, so that they can be tracked
and/or overridden by subclasses, for example
UnpooledByteBufAllocator$InstrumentedUnpooledHeapByteBuf or #8015. But
it looks like an explicit allocation was missed in the copy(int,int)
method.

Modification

Just use alloc().heapBuffer(...) for the allocation

Result

No possibility of "missing" array allocations when ByteBuf#copy is used.
2019-08-13 10:52:11 +02:00
Nick Hill
fedcc40196 Fix ByteBufUtil#writeUtf8 subsequence split surrogate edge-case bug (#9437)
Motivation:

#9224 introduced overrides of ByteBufUtil#writeUtf8(...) and related
methods to operate on a sub-CharSequence directly to save having to
allocate substrings, but it missed an edge case where the subsequence
does not extend to the end of the CharSequence and the last char in the
sequence is a high surrogate.

Due to the catch-IndexOutOfBoundsException optimization that avoids an
additional bounds check, it would be possible to read past the specified
end char index and successfully decode a surrogate pair which would
otherwise result in a '?' byte being written.

Modifications:

- Check for end-of-subsequence before reading next char after a high
surrogate is encountered in the
writeUtf8(AbstractByteBuf,int,CharSequence,int,int) and
utf8BytesNonAscii methods
- Add unit test for this edge case

Result:

Bug is fixed.

This removes the bounds-check-avoidance optimization but it does not
appear to have a measurable impact on benchmark results, including when
the char sequence contains many surrogate pairs (which should be rare in
any case).
2019-08-10 20:54:04 +02:00
root
718b7626e6 [maven-release-plugin] prepare for next development iteration 2019-07-24 09:05:57 +00:00
root
465c900c04 [maven-release-plugin] prepare release netty-4.1.38.Final 2019-07-24 09:05:23 +00:00
Norman Maurer
131be58f48
Correctly take length of ByteBufInputStream into account for readLine… (#9310)
* Correctly take length of ByteBufInputStream into account for readLine() / readByte()

Motivation:

ByteBufInputStream did not correctly take the length into account when validate bounds for readLine() / readByte() which could lead to read more then allowed.

Modifications:

- Correctly take length into account
- Add unit tests
- Fix existing unit test

Result:

Correctly take length of ByteBufInputStream into account.
Related to https://github.com/netty/netty/pull/9306.
2019-07-01 20:55:23 +02:00
xiaoheng1
f8c1f350db Fix public int read() throws IOException method exceeds the limit of length (#9306)
Motivation:

buffer.isReadable() should not be used to limit the amount of data that can be read as the amount may be less then was is readable.

Modification:

- Use  available() which takes the length into account
- Add unit test

Result:

Fixes https://github.com/netty/netty/issues/9305
2019-07-01 15:57:34 +02:00
秦世成
4596f9e139 Fix the issue of incorrectly calculating the number of dump lines when using PrettyDump in ByteBufUtil (#9304)
Motivation:

Fix the issue of incorrectly calculating the number of dump rows when using prettyHexDumpmethod in ByteBufUtil. The way to find the remainder is either length % 16 or length & 15

Modification:

Fixed the way to calculate the remainder

Result:

Fixed #9301
2019-07-01 08:35:18 +02:00
root
5b58b8e6b5 [maven-release-plugin] prepare for next development iteration 2019-06-28 05:57:21 +00:00
root
35e0843376 [maven-release-plugin] prepare release netty-4.1.37.Final 2019-06-28 05:56:28 +00:00
jimin
856f1185e1 All override methods must be added @override (#9285)
Motivation:

Some methods that either override others or are implemented as part of implementation an interface did miss the `@Override` annotation

Modifications:

Add missing `@Override`s

Result:

Code cleanup
2019-06-27 13:51:26 +02:00
jimin
9621a5b981 remove unused imports (#9287)
Motivation:

Some imports are not used

Modification:

remove unused imports

Result:

Code cleanup
2019-06-26 21:08:31 +02:00
Norman Maurer
265c745d9a
EmptyByteBuf.getCharSequence(0,...) must return empty String (#9272)
Motivation:

At the moment EmptyByteBuf.getCharSequence(0,...) will return null while it must return a "".

Modifications:

- Let EmptyByteBuf.getCharSequence(0,...) return ""
- Add unit test

Result:

Fixes https://github.com/netty/netty/issues/9271.
2019-06-24 21:09:19 +02:00
Nick Hill
2af769f6dc Subsequence versions of ByteBufUtil#writeUtf8(...) methods (#9224)
Motivation

It would be useful to be able to write UTF-8 encoded subsequence of
CharSequence characters to a ByteBuf without needing to create a
temporary object via CharSequence#subSequence().

Modification

Add overloads of ByteBufUtil writeUtf8, reserveAndWriteUtf8 and
utf8Bytes methods which take explicit subsequence bounds.

Result

More efficient writing of substrings to byte buffers possible
2019-06-21 14:05:35 +02:00
Nick Hill
6381d0766a De-duplicate PooledByteBuf implementations (#9120)
Motivation

There's quite a lot of duplicate/equivalent logic across the various
concrete ByteBuf implementations. We could take this even further but
for now I've focused on the PooledByteBuf sub-hierarchy.

Modifications

- Move common logic/methods into existing PooledByteBuf abstract
superclass
- Shorten PooledByteBuf.capacity(int) method implementation

Result

Less code to maintain
2019-06-19 20:50:27 +02:00
Nick Hill
272f68f48c De-duplicate UnpooledDirectByteBuf/UnpooledUnsafeDirectByteBuf (#9085)
Motivation

While digging around looking at something else I noticed that these
share a lot of logic and it would be nice to reduce that duplication.

Modifications

Have UnpooledUnsafeDirectByteBuf extend UnpooledDirectByteBuf and make
adjustments to ensure existing behaviour remains unchanged.

The most significant addition needed to UnpooledUnsafeDirectByteBuf was
re-overriding the getPrimitive/setPrimitive methods to revert back to
the AbstractByteBuf versions which include bounds checks
(UnpooledDirectByteBuf excludes these as an optimization, relying on
those done by underlying ByteBuffer).

Result

~200 fewer lines, less duplicate logic.
2019-06-03 13:04:10 +02:00
Idel Pivnitskiy
ec69da9afb Make UnpooledUnsafeHeapByteBuf class public (#9184)
Motivation:

1. Users will be able to use an optimized version of
`UnpooledHeapByteBuf` and override behavior of methods if required.
2. Consistency with `UnpooledDirectByteBuf`, `UnpooledHeapByteBuf`, and
`UnpooledUnsafeDirectByteBuf`.

Modifications:

- Add `public` access modifier to `UnpooledUnsafeHeapByteBuf` class and
ctor;

Result:

Public access for optimized version of `UnpooledHeapByteBuf`.
2019-05-31 07:04:03 +02:00
Nick Hill
385dadcfbc Fix redundant or missing checks and other inconsistencies in ByteBuf impls (#9119)
Motivation

There are a few minor inconsistencies / redundant operations in the
ByteBuf implementations which would be good to fix.

Modifications

- Unnecessary ByteBuffer.duplicate() performed in
CompositeByteBuf.nioBuffer(int,int)
- Add missing checkIndex(...) check to
ReadOnlyByteBufferBuf.nioBuffer(int,int)
- Remove duplicate bounds check in
ReadOnlyByteBufferBuf.getBytes(int,byte[],int,int)
- Omit redundant bounds check in
UnpooledHeapByteBuf.getBytes(int,ByteBuffer)

Result

More consistency and slightly less overhead
2019-05-27 15:32:08 +02:00
Nick Hill
128403b492 Introduce ByteBuf.maxFastWritableBytes() method (#9086)
Motivation

ByteBuf capacity is automatically increased as needed up to maxCapacity
when writing beyond the buffer's current capacity. However there's no
way to tell in general whether such an increase will result in a
relatively costly internal buffer re-allocation.

For unpooled buffers it always does, in pooled cases it depends on the
size of the associated chunk of allocated memory, which I don't think is
currently exposed in any way.

It would sometimes be useful to know where this limit is when making
external decisions about whether to reuse or preemptively reallocate.

It would also be advantageous to take this limit into account when
auto-increasing the capacity during writes, to defer such reallocation
until really necessary.

Modifications

Introduce new AbstractByteBuf.maxFastWritableBytes() method which will
return a value >= writableBytes() and <= maxWritableBytes().

Make use of the new method in the sizing decision made by the
AbstractByteBuf.ensureWritable(...) methods.

Result

Less reallocation/copying.
2019-05-22 20:11:24 +02:00
Nick Hill
507e0a05b5 Fix possible unsafe sharing of internal NIO buffer in CompositeByteBuf (#9169)
Motivation

A small thread-safety bug was introduced during the internal
optimizations of ComponentByteBuf made a while back in #8437. When there
is a single component which was added as a slice,
internalNioBuffer(int,int) will currently return the unwrapped slice's
un-duplicated internal NIO buffer. This is not safe since it could be
modified concurrently with other usage of that parent buffer.

Modifications

Delegate internalNioBuffer to nioBuffer in this case, which returns a
duplicate. This matches what's done in derived buffers in general
(for the same reason). Add unit test.

Result

Fixed possible thread-safety bug
2019-05-22 11:07:06 +02:00
Tim Brooks
2dc686ded1 Prefer direct io buffers if direct buffers pooled (#9167)
Motivation

Direct buffers are normally preferred when interfacing with raw
sockets. Currently netty will only return direct io buffers (for reading
from a channel) when a platform has unsafe. However, this is
inconsistent with the write-side (filterOutboundMessage) where a direct
byte buffer will be returned if pooling is enabled. This means that
environments without unsafe (and no manual netty configurations) end up
with many pooled heap byte buffers for reading, many pooled direct byte
buffers for writing, and jdk pooled byte buffers (for reading).

Modifications

This commit modifies the AbstractByteBufAllocator to return a direct
byte buffer for io handling when the platform has unsafe or direct byte
buffers are pooled.

Result:

Use direct buffers when direct buffers are pooled for IO.
2019-05-22 07:32:41 +02:00
Nick Hill
60de092e36 Fix incorrect behavior of ReadOnlyByteBufferBuf.getBytes(int,ByteBuffer) (#9125)
* Fix incorrect behavior of ReadOnlyByteBufferBuf.getBytes(int,ByteBuffer)

Motivation

It currently will succeed when the destination is larger than the source
range, but the ByteBuf javadoc states this should be a failure, as is
the case with all the other implementations.

Modifications

- Fix logic to fail the bounds check in this case
- Remove explicit null check which isn't done in any equivalent method
- Add unit test

Result

More correct/consistent behaviour
2019-05-13 07:00:06 +02:00
root
ba06eafa1c [maven-release-plugin] prepare for next development iteration 2019-04-30 16:42:29 +00:00
root
49a451101c [maven-release-plugin] prepare release netty-4.1.36.Final 2019-04-30 16:41:28 +00:00
Norman Maurer
e01c4bce08
Fix regression in CompositeByteBuf.discard*ReadBytes() (#9068)
Motivation:

1f93bd3 introduced a regression that could lead to not have the lastAccessed field correctly null'ed out when the endOffset of the internal Component == CompositeByteBuf.readerIndex()

Modifications:

- Correctly null out the lastAccessed field in any case
- Add unit tests

Result:

Fixes regression in CompositeByteBuf.discard*ReadBytes()
2019-04-17 18:03:08 +02:00
root
baab215f66 [maven-release-plugin] prepare for next development iteration 2019-04-17 07:26:24 +00:00
root
dfe657e2d4 [maven-release-plugin] prepare release netty-4.1.35.Final 2019-04-17 07:25:40 +00:00
Nick Hill
b26a61acd1 Centralize internal reference counting logic (#8614)
Motivation

AbstractReferenceCounted and AbstractReferenceCountedByteBuf contain
duplicate logic for managing the volatile refcount in an optimized and
consistent manner, which increased in complexity in #8583. It's possible
to extract this into a common helper class now that all access is via an
AtomicIntegerFieldUpdater.

Modifications

- Move duplicate logic into a shared ReferenceCountUpdater class
- Incorporate some additional simplification for the most common single
increment/decrement cases (fewer checks/operations)

Result

Less code duplication, better encapsulation of the "non-trivial"
internal volatile refcount manipulation
2019-04-09 16:22:32 +02:00
Nick Hill
9f2221ebd4 CompositeByteBuf optimizations and new addFlattenedComponents method (#8939)
Motivation:

The CompositeByteBuf discardReadBytes / discardReadComponents methods are currently quite inefficient, including when there are no read components to discard. We would like to call the latter more frequently in ByteToMessageDecoder#COMPOSITE_CUMULATOR.

In the same context it would be beneficial to perform a "shallow copy" of a composite buffer (for example when it has a refcount > 1) to avoid having to allocate and copy the contained bytes just to obtain an "independent" cumulation.

Modifications:

- Optimize discardReadBytes() and discardReadComponents() implementations (start at first comp rather than performing a binary search for the readerIndex).
- New addFlattenedComponents(boolean,ByteBuf) method which performs a shallow copy if the provided buffer is also composite and avoids adding any empty buffers, plus unit test.
- Other minor optimizations to avoid unnecessary checks.

Results:

discardReadXX methods are faster, composite buffers can be easily appended without deepening the buffer "tree" or retaining unused components.
2019-04-08 20:48:08 +02:00
Norman Maurer
a2b85a306d
Fix NPE that was encounter by debugger (will never happen in real code). (#8992)
Motivation:

We synchronize on the chunk.arena when produce the String returned by PoolSubpage.toString() which may raise a NPE when chunk == null. Chunk == null for the head of the linked-list and so a NPE may raised by a debugger. This NPE can never happen in real code tho as we never access toString() of the head.

Modifications:

Add null checks and so fix the possible NPE

Result:

No NPE when using a debugger and inspect the PooledByteBufAllocator.
2019-04-01 19:44:28 +02:00
Norman Maurer
922e463524
Don't try to put back MemoryRegionCache.Entry objects into the Recycler when recycled because of a finalizer. (#8955)
Motivation:

In MemoryRegionCache.Entry we use the Recycler to reduce GC pressure and churn. The problem is that these will also be recycled when the PoolThreadCache is collected and finalize() is called. This then can have the effect that we try to load class but the WebApp is already stoped.

This will produce an stacktrace like this on Tomcat:

```
19-Mar-2019 15:53:21.351 INFO [Finalizer] org.apache.catalina.loader.WebappClassLoaderBase.checkStateForResourceLoading Illegal access: this web application instance has been stopped already. Could not load [java.util.WeakHashMap]. The following stack trace is thrown for debugging purposes as well as to attempt to terminate the thread which caused the illegal access.
 java.lang.IllegalStateException: Illegal access: this web application instance has been stopped already. Could not load [java.util.WeakHashMap]. The following stack trace is thrown for debugging purposes as well as to attempt to terminate the thread which caused the illegal access.
	at org.apache.catalina.loader.WebappClassLoaderBase.checkStateForResourceLoading(WebappClassLoaderBase.java:1383)
	at org.apache.catalina.loader.WebappClassLoaderBase.checkStateForClassLoading(WebappClassLoaderBase.java:1371)
	at org.apache.catalina.loader.WebappClassLoaderBase.loadClass(WebappClassLoaderBase.java:1224)
	at org.apache.catalina.loader.WebappClassLoaderBase.loadClass(WebappClassLoaderBase.java:1186)
	at io.netty.util.Recycler$3.initialValue(Recycler.java:233)
	at io.netty.util.Recycler$3.initialValue(Recycler.java:230)
	at io.netty.util.concurrent.FastThreadLocal.initialize(FastThreadLocal.java:188)
	at io.netty.util.concurrent.FastThreadLocal.get(FastThreadLocal.java:142)
	at io.netty.util.Recycler$Stack.pushLater(Recycler.java:624)
	at io.netty.util.Recycler$Stack.push(Recycler.java:597)
	at io.netty.util.Recycler$DefaultHandle.recycle(Recycler.java:225)
	at io.netty.buffer.PoolThreadCache$MemoryRegionCache$Entry.recycle(PoolThreadCache.java:478)
	at io.netty.buffer.PoolThreadCache$MemoryRegionCache.freeEntry(PoolThreadCache.java:459)
	at io.netty.buffer.PoolThreadCache$MemoryRegionCache.free(PoolThreadCache.java:430)
	at io.netty.buffer.PoolThreadCache$MemoryRegionCache.free(PoolThreadCache.java:422)
	at io.netty.buffer.PoolThreadCache.free(PoolThreadCache.java:279)
	at io.netty.buffer.PoolThreadCache.free(PoolThreadCache.java:270)
	at io.netty.buffer.PoolThreadCache.free(PoolThreadCache.java:241)
	at io.netty.buffer.PoolThreadCache.finalize(PoolThreadCache.java:230)
	at java.lang.System$2.invokeFinalize(System.java:1270)
	at java.lang.ref.Finalizer.runFinalizer(Finalizer.java:102)
	at java.lang.ref.Finalizer.access$100(Finalizer.java:34)
	at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:217)
```

Beside this we also need to ensure we not try to lazy load SizeClass when the finalizer is used as it may not be present anymore if the ClassLoader is already destroyed.

This would produce an error like:

```
20-Mar-2019 11:26:35.254 INFO [Finalizer] org.apache.catalina.loader.WebappClassLoaderBase.checkStateForResourceLoading Illegal access: this web application instance has been stopped already. Could not load [io.netty.buffer.PoolArena$1]. The following stack trace is thrown for debugging purposes as well as to attempt to terminate the thread which caused the illegal access.
 java.lang.IllegalStateException: Illegal access: this web application instance has been stopped already. Could not load [io.netty.buffer.PoolArena$1]. The following stack trace is thrown for debugging purposes as well as to attempt to terminate the thread which caused the illegal access.
	at org.apache.catalina.loader.WebappClassLoaderBase.checkStateForResourceLoading(WebappClassLoaderBase.java:1383)
	at org.apache.catalina.loader.WebappClassLoaderBase.checkStateForClassLoading(WebappClassLoaderBase.java:1371)
	at org.apache.catalina.loader.WebappClassLoaderBase.loadClass(WebappClassLoaderBase.java:1224)
	at org.apache.catalina.loader.WebappClassLoaderBase.loadClass(WebappClassLoaderBase.java:1186)
	at io.netty.buffer.PoolArena.freeChunk(PoolArena.java:287)
	at io.netty.buffer.PoolThreadCache$MemoryRegionCache.freeEntry(PoolThreadCache.java:464)
	at io.netty.buffer.PoolThreadCache$MemoryRegionCache.free(PoolThreadCache.java:429)
	at io.netty.buffer.PoolThreadCache$MemoryRegionCache.free(PoolThreadCache.java:421)
	at io.netty.buffer.PoolThreadCache.free(PoolThreadCache.java:278)
	at io.netty.buffer.PoolThreadCache.free(PoolThreadCache.java:269)
	at io.netty.buffer.PoolThreadCache.free(PoolThreadCache.java:240)
	at io.netty.buffer.PoolThreadCache.finalize(PoolThreadCache.java:229)
	at java.lang.System$2.invokeFinalize(System.java:1270)
	at java.lang.ref.Finalizer.runFinalizer(Finalizer.java:102)
	at java.lang.ref.Finalizer.access$100(Finalizer.java:34)
	at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:217)
```

Modifications:

- Only try to put the Entry back into the Recycler if the PoolThredCache is not destroyed because of the finalizer.
- Only try to access SizeClass if not triggered by finalizer.

Result:

No IllegalStateException anymoe when a webapp is reloaded in Tomcat that uses netty and uses the PooledByteBufAllocator.
2019-03-22 12:16:21 +01:00
Nick Hill
b36f75044f Fix possible ByteBuf leak when CompositeByteBuf is resized (#8946)
Motivation:

The special case fixed in #8497 also requires that we keep a derived slice when trimming components in place, as done by the capacity(int) and discardReadBytes() methods.

Modifications:

Ensure that we keep a ref to trimmed components' original retained slice in capacity(int) and discardReadBytes() methods, so that it is released properly when the they are later freed. Add unit test which fails prior to the fix.

Result:

Edge case leak is eliminated.
2019-03-22 11:18:10 +01:00
Norman Maurer
c83904a12a
Allow to automatically trim the PoolThreadCache in a timely interval (#8941)
Motivation:

PooledByteBufAllocator uses a PoolThreadCache per Thread that allocates / deallocates to minimize the performance overhead. This PoolThreadCache is trimmed after X allocations to free up buffers that are not allocated for a long time. This works out quite well when the app continues to allocate but fails if the app stops to allocate frequently (for whatever reason) and so a lot of memory is wasted and not given back to the arena / freed.

Modifications:

- Add a ThreadExecutorMap that offers multiple methods that wrap Runnable / ThreadFactory / Executor and allow to call ThreadExecutorMap.currentEventExecutor() to get the current executing EventExecutor for the calling Thread.
- Use these methods in the constructors of our EventExecutor implementations (which also covers the EventLoop implementations)
- Add io.netty.allocator.cacheTrimIntervalMillis system property which can be used to specify a fixed rate / interval on which we should try to trim the PoolThreadCache for a EventExecutor that allocates.
- Add PooledByteBufAllocator.trimCurrentThreadCache() to allow the user to trim the cache of the calling thread manually.
- Add testcases
- Introduce FastThreadLocal.getIfExists()

Result:

Allow to better / more frequently trim PoolThreadCache and so give back memory to the area / system.
2019-03-22 11:08:37 +01:00
root
92b19cfedd [maven-release-plugin] prepare for next development iteration 2019-03-08 08:55:45 +00:00
root
ff7a9fa091 [maven-release-plugin] prepare release netty-4.1.34.Final 2019-03-08 08:51:34 +00:00
Norman Maurer
90ea3ec9f6
Adjust tests to be able to build / test when using IBM J9 / OpenJ9 (#8900)
Motivation:

We should run a CI job using J9 to ensure netty also works when using different JVMs.

Modifications:

- Adjust PooledByteBufAllocatorTest to be able to complete faster when using a JVM which takes longer when joining Threads (this seems to be the case with J9).
- Skip UDT tests on J9 as UDT is not supported there.

Result:

Be able to run CI against J9.
2019-03-01 06:47:56 +01:00
Nick Hill
0811409ca3 Further reduce ensureAccessible() overhead (#8895)
Motivation:

This PR fixes some non-negligible overhead discovered in the ByteBuf
accessibility (non-zero refcount) checking. The cause turned out to be
mostly twofold:
- Unnecessary operations used to calculate the refcount from the "raw"
encoded int field value
- Call stack depths exceeding the default limit for inlining, in some
places (CompositeByteBuf in particular)

It's a follow-on from #8882 which uses the maxCapacity field for a
simpler non-negative check. The performance gap between these two
variants appears to be _mostly_ closed, but there's one exception which
may warrant further analysis.

Modifications:

- Replace ABB.internalRefCount() with ByteBuf.isAccessible(), the
default still checks for non-zero refCnt()
- Just test for parity of raw refCnt instead of converting to "real",
with fast-path for specific small values
- Make sure isAccessible() is delegated by derived/wrapper ByteBufs
- Use existing freed flag in CompositeByteBuf for faster isAccessible()
- Manually inline some calls in methods like CompositeByteBuf.setLong()
and AbstractReferenceCountedByteBuf.isAccessible() to reduce stack
depths (to ensure default inlining limit isn't hit)
- Add ByteBufAccessBenchmark which is an extension of
UnsafeByteBufBenchmark (maybe latter could now be removed)

Results:

Before:

Benchmark   (bufferType)  (checkAccessible)  (checkBounds)   Mode  Cnt
Score          Error  Units
readBatch         UNSAFE               true           true  thrpt   30
84524972.863 ±   518338.811  ops/s
readBatch   UNSAFE_SLICE               true           true  thrpt   30
38608795.037 ±   298176.974  ops/s
readBatch           HEAP               true           true  thrpt   30
80003697.649 ±   974674.119  ops/s
readBatch      COMPOSITE               true           true  thrpt   30
18495554.788 ±   108075.023  ops/s
setGetLong        UNSAFE               true           true  thrpt   30
247069881.578 ± 10839162.593  ops/s
setGetLong  UNSAFE_SLICE               true           true  thrpt   30
196355905.206 ±  1802420.990  ops/s
setGetLong          HEAP               true           true  thrpt   30
245686644.713 ± 11769311.527  ops/s
setGetLong     COMPOSITE               true           true  thrpt   30
83170940.687 ±   657524.123  ops/s
setLong           UNSAFE               true           true  thrpt   30
278940253.918 ±  1807265.259  ops/s
setLong     UNSAFE_SLICE               true           true  thrpt   30
202556738.764 ± 11887973.563  ops/s
setLong             HEAP               true           true  thrpt   30
280045958.053 ±  2719583.400  ops/s
setLong        COMPOSITE               true           true  thrpt   30
121299806.002 ±  2155084.707  ops/s


After:

Benchmark   (bufferType)  (checkAccessible)  (checkBounds)   Mode  Cnt
Score          Error  Units
readBatch         UNSAFE               true           true  thrpt   30
101641801.035 ±  3950050.059  ops/s
readBatch   UNSAFE_SLICE               true           true  thrpt   30
84395902.846 ±  4339579.057  ops/s
readBatch           HEAP               true           true  thrpt   30
100179060.207 ±  3222487.287  ops/s
readBatch      COMPOSITE               true           true  thrpt   30
42288494.472 ±   294919.633  ops/s
setGetLong        UNSAFE               true           true  thrpt   30
304530755.027 ±  6574163.899  ops/s
setGetLong  UNSAFE_SLICE               true           true  thrpt   30
212028547.645 ± 14277828.768  ops/s
setGetLong          HEAP               true           true  thrpt   30
309335422.609 ±  2272150.415  ops/s
setGetLong     COMPOSITE               true           true  thrpt   30
160383609.236 ±   966484.033  ops/s
setLong           UNSAFE               true           true  thrpt   30
298055969.747 ±  7437449.627  ops/s
setLong     UNSAFE_SLICE               true           true  thrpt   30
223784178.650 ±  9869750.095  ops/s
setLong             HEAP               true           true  thrpt   30
302543263.328 ±  8140104.706  ops/s
setLong        COMPOSITE               true           true  thrpt   30
157083673.285 ±  3528779.522  ops/s

There's also a similar knock-on improvement to other benchmarks (e.g.
HPACK encoding/decoding) as shown in #8882.

For sanity I did a final comparison of the "fast path" tweak using one
of the HPACK benchmarks:

(rawCnt & 1) == 0:

Benchmark                     (limitToAscii)  (sensitive)  (size)   Mode
Cnt      Score     Error  Units
HpackDecoderBenchmark.decode            true         true  MEDIUM  thrpt
30  50914.479 ± 940.114  ops/s


rawCnt == 2 || rawCnt == 4 || rawCnt == 6 || rawCnt == 8 ||  (rawCnt &
1) == 0:

Benchmark                     (limitToAscii)  (sensitive)  (size)   Mode
Cnt      Score      Error  Units
HpackDecoderBenchmark.decode            true         true  MEDIUM  thrpt
30  60036.425 ± 1478.196  ops/s
2019-02-28 20:40:41 +01:00