Motivation:
Phantom references are for cleaning up resources that were
forgotten, which means they keep their referent alive. This
means garbage is kept around until the refqueue is drained, rather
than when the reference is unreachable.
Modification:
Use Weak References instead of Phantoms
Result:
More punctual leak detection.
Motivation:
Objects of java.util.TreeMap or java.util.TreeSet will become
non-Serializable if instantiated with Comparators, which are not also
Serializable. This can result in unexpected and difficult-to-diagnose
bugs.
Modifications:
Implements Serializable for all classes, which implements Comparator.
Result:
Proper Comparators which will not force collections to
non-Serializable mode.
Motivation:
Even if it's a super micro-optimization (most JVM could optimize such
cases in runtime), in theory (and according to some perf tests) it
may help a bit. It also makes a code more clear and allows you to
access such methods in the test scope directly, without instance of
the class.
Modifications:
Add 'static' modifier for all methods, where it possible. Mostly in
test scope.
Result:
Cleaner code with proper 'static' modifiers.
Motivation:
Without a 'serialVersionUID' field, any change to a class will make
previously serialized versions unreadable.
Modifications:
Add missed 'serialVersionUID' field for all Serializable
classes.
Result:
Proper deserialization of previously serialized objects.
Configuring this is tough because there is split between highly shared (and accessed) objects and lightly accessed objects.
Modification:
There are a number of changes here. In relative order of importance:
API / Functionality changes:
* Max records and max sample records are gone. Only "target" records, the number of records tries to retain is exposed.
* Records are sampled based on the number of already stored records. The likelihood of recording a new sample is `2^(-n)`, where `n` is the number of currently stored elements.
* Records are stored in a concurrent stack structure rather than a list. This avoids a head and tail. Since the stack is only read once, there is no need to maintain head and tail pointers
* The properties of this imply that the very first and very last access are always recorded. When deciding to sample, the top element is replaced rather than pushed.
* Samples that happen between the first and last accesses now have a chance of being recorded. Previously only the final few were kept.
* Sampling is no longer deterministic. Previously, a deterministic access pattern meant that you could conceivably always miss some access points.
* Sampling has a linear ramp for low values and and exponentially backs off roughly equal to 2^n. This means that for 1,000,000 accesses, about 20 will actually be kept. I have an elegant proof for this which is too large to fit in this commit message.
Code changes:
* All locks are gone. Because sampling rarely needs to do a write, there is almost 0 contention. The dropped records counter is slightly contentious, but this could be removed or changed to a LongAdder. This was not done because of memory concerns.
* Stack trace exclusion is done outside of RLD. Classes can opt to remove some of their methods.
* Stack trace exclusion is faster, since it uses String.equals, often getting a pointer compare due to interning. Previously it used contains()
* Leak printing is outputted fairly differently. I tried to preserve as much of the original formatting as possible, but some things didn't make sense to keep.
Result:
More useful leak reporting.
Faster:
```
Before:
Benchmark (recordTimes) Mode Cnt Score Error Units
ResourceLeakDetectorRecordBenchmark.record 8 thrpt 20 136293.404 ± 7669.454 ops/s
ResourceLeakDetectorRecordBenchmark.record 16 thrpt 20 72805.720 ± 3710.864 ops/s
ResourceLeakDetectorRecordBenchmark.recordWithHint 8 thrpt 20 139131.215 ± 4882.751 ops/s
ResourceLeakDetectorRecordBenchmark.recordWithHint 16 thrpt 20 74146.313 ± 4999.246 ops/s
After:
Benchmark (recordTimes) Mode Cnt Score Error Units
ResourceLeakDetectorRecordBenchmark.record 8 thrpt 20 155281.969 ± 5301.399 ops/s
ResourceLeakDetectorRecordBenchmark.record 16 thrpt 20 77866.239 ± 3821.054 ops/s
ResourceLeakDetectorRecordBenchmark.recordWithHint 8 thrpt 20 153360.036 ± 8611.353 ops/s
ResourceLeakDetectorRecordBenchmark.recordWithHint 16 thrpt 20 78670.804 ± 2399.149 ops/s
```
Motivation:
We need to upgrade our dependencies to versions which use ASM 6.0.0+ to support compiling on java9.
Modifications:
Update animal-sniffer-maven-plugin and maven-shade-plugin.
Result:
Fixes https://github.com/netty/netty/issues/6100
Motivation:
There are 2 motivations, the first depends on the second:
Loading Netty Epoll statically stopped working in 4.1.16, due to
`Native` always loading the arch specific shared object. In a
static binary, there is no arch specific SO.
Second, there are a ton of exceptions that can happen when loading
a native library. When loading native code, Netty tries a bunch of
different paths but a failure in any given may not be fatal.
Additionally: turning on debug logging is not always feasible so
exceptions get silently swallowed.
Modifications:
* Change Epoll and Kqueue to try the static load second
* Modify NativeLibraryLoader to record all the locations where
exceptions occur.
* Attempt to use `addSuppressed` from Java 7 if available.
Alternatives Considered:
An alternative would be to record log messages at each failure. If
all load attempts fail, the log messages are printed as warning,
else as debug. The problem with this is there is no `LogRecord` to
create like in java.util.logging. Buffering the args to
logger.log() at the end of the method loses the call site, and
changes the order of events to be confusing.
Another alternative is to teach NativeLibraryLoader about loading
the SO first, and then the static version. This would consolidate
the code fore Epoll, Kqueue, and TCNative. I think this is the
long term better option, but this PR is changing a lot already.
Someone else can take a crack at it later
Results:
Epoll Still Loads and easier debugging.
Motivation:
Highly retained and released objects have contention on their ref
count. Currently, the ref count is updated using compareAndSet
with care to make sure the count doesn't overflow, double free, or
revive the object.
Profiling has shown that a non trivial (~1%) of CPU time on gRPC
latency benchmarks is from the ref count updating.
Modification:
Rather than pessimistically assuming the ref count will be invalid,
optimistically update it assuming it will be. If the update was
wrong, then use the slow path to revert the change and throw an
execption. Most of the time, the ref counts are correct.
This changes from using compareAndSet to getAndAdd, which emits a
different CPU instruction on x86 (CMPXCHG to XADD). Because the
CPU knows it will modifiy the memory, it can avoid contention.
On a highly contended machine, this can be about 2x faster.
There is a downside to the new approach. The ref counters can
temporarily enter invalid states if over retained or over released.
The code does handle these overflow and underflow scenarios, but it
is possible that another concurrent access may push the failure to
a different location. For example:
Time 1 Thread 1: obj.retain(INT_MAX - 1)
Time 2 Thread 1: obj.retain(2)
Time 2 Thread 2: obj.retain(1)
Previously Thread 2 would always succeed and Thread 1 would always
fail on the second access. Now, thread 2 could fail while thread 1
is rolling back its change.
====
There are a few reasons why I think this is okay:
1. Buggy code is going to have bugs. An exception _is_ going to be
thrown. This just causes the other threads to notice the state
is messed up and stop early.
2. If high retention counts are a use case, then ref count should
be a long rather than an int.
3. The critical section is greatly reduced compared to the previous
version, so the likelihood of this happening is lower
4. On error, the code always rollsback the change atomically, so
there is no possibility of corruption.
Result:
Faster refcounting
```
BEFORE:
Benchmark (delay) Mode Cnt Score Error Units
AbstractReferenceCountedByteBufBenchmark.retainRelease_contended 1 sample 2901361 804.579 ± 1.835 ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_contended 10 sample 3038729 785.376 ± 16.471 ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_contended 100 sample 2899401 817.392 ± 6.668 ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_contended 1000 sample 3650566 2077.700 ± 0.600 ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_contended 10000 sample 3005467 19949.334 ± 4.243 ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_uncontended 1 sample 456091 48.610 ± 1.162 ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_uncontended 10 sample 732051 62.599 ± 0.815 ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_uncontended 100 sample 778925 228.629 ± 1.205 ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_uncontended 1000 sample 633682 2002.987 ± 2.856 ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_uncontended 10000 sample 506442 19735.345 ± 12.312 ns/op
AFTER:
Benchmark (delay) Mode Cnt Score Error Units
AbstractReferenceCountedByteBufBenchmark.retainRelease_contended 1 sample 3761980 383.436 ± 1.315 ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_contended 10 sample 3667304 474.429 ± 1.101 ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_contended 100 sample 3039374 479.267 ± 0.435 ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_contended 1000 sample 3709210 2044.603 ± 0.989 ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_contended 10000 sample 3011591 19904.227 ± 18.025 ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_uncontended 1 sample 494975 52.269 ± 8.345 ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_uncontended 10 sample 771094 62.290 ± 0.795 ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_uncontended 100 sample 763230 235.044 ± 1.552 ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_uncontended 1000 sample 634037 2006.578 ± 3.574 ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_uncontended 10000 sample 506284 19742.605 ± 13.729 ns/op
```
Motivation:
3c8c7fc7e9 introduced some changes to the ResourceLeakDetector that introduced a regression and so would always log that paranoid leak detection should be enabled even it was already.
Modifications:
Correctly not clear the recorded stacktraces when we process the reference queue so we can log these.
Result:
ResourceLeakDetector works again as expected.
Motivation:
NativeLibraryLoader uses ClassLoader#getResource method that can return nulls when the resource cannot be found. The returned url variable should be checked for nullity and fail in a more usable manner than a NullPointerException
Modifications:
Fail with a FileNotFoundException
Result:
Fixes [#7222].
Motiviation:
The ResourceLeakDetector helps to detect and troubleshoot resource leaks and is often used even in production enviroments with a low level. Because of this its import that we try to keep the overhead as low as overhead. Most of the times no leak is detected (as all is correctly handled) so we should keep the overhead for this case as low as possible.
Modifications:
- Only call getStackTrace() if a leak is reported as it is a very expensive native call. Also handle the filtering and creating of the String in a lazy fashion
- Remove the need to mantain a Queue to store the last access records
- Add benchmark
Result:
Huge decrease of performance overhead.
Before the patch:
Benchmark (recordTimes) Mode Cnt Score Error Units
ResourceLeakDetectorRecordBenchmark.record 8 thrpt 20 4358.367 ± 116.419 ops/s
ResourceLeakDetectorRecordBenchmark.record 16 thrpt 20 2306.027 ± 55.044 ops/s
ResourceLeakDetectorRecordBenchmark.recordWithHint 8 thrpt 20 4220.979 ± 114.046 ops/s
ResourceLeakDetectorRecordBenchmark.recordWithHint 16 thrpt 20 2250.734 ± 55.352 ops/s
With this patch:
Benchmark (recordTimes) Mode Cnt Score Error Units
ResourceLeakDetectorRecordBenchmark.record 8 thrpt 20 71398.957 ± 2695.925 ops/s
ResourceLeakDetectorRecordBenchmark.record 16 thrpt 20 38643.963 ± 1446.694 ops/s
ResourceLeakDetectorRecordBenchmark.recordWithHint 8 thrpt 20 71677.882 ± 2923.622 ops/s
ResourceLeakDetectorRecordBenchmark.recordWithHint 16 thrpt 20 38660.176 ± 1467.732 ops/s
Motivation:
RLD allocates an ArrayDeque in anticipation of recording access
points. If the leak detection level is less than ADVANCED though,
the dequeue is never used. Since SIMPLE is the default level,
there is a minor perf win to not preemptively allocate it.
This showed up in garbage profiling when creation a high number of
buffers.
Modifications:
Only allocate the dequeue if it will be used.
Result:
Less garbage created.
Motivation:
NativeLibraryLoader may only log a debug statement if the library is successfully loaded from java.library.path, but will log failure statements the if load for java.library.path fails which can mislead users to believe the load actually failed when it may have succeeded.
Modifications:
- Always load a debug statement when a library was successfully loaded
Result:
NativeLibraryLoader log statements more clear.
Motivation:
When Log4j2Logger is used with PatternLayout (%F:%L)%c.%M, the log message incorrect shows:
(Log4J2Logger.java:73)io.netty.util.internal.PlatformDependent0.debug ....
Modification:
Extend AbstractLogger
Result:
Fixes [#7186].
Motivation:
We should only try to load the native artifacts if the architecture we are currently running on is the same as the one the native libraries were compiled for.
Modifications:
Include architecture in native lib name and append the current arch when trying to load these. This will fail then if its not the same as the arch of the compiled arch.
Result:
Fixes [#7150].
Motivation:
When the user want to have the direct memory explicitly managed by the GC (just as java.nio does) it is useful to be able to construct an UnpooledByteBufAllocator that allows this without the chances to see any memory leak.
Modifications:
Allow to explicitly disable the usage of reflection to construct direct ByteBufs and so be sure these will be collected by GC.
Result:
More flexible way to use the UnpooledByteBufAllocator.
Motivation:
The documentation for field updates says:
> Note that the guarantees of the {@code compareAndSet}
> method in this class are weaker than in other atomic classes.
> Because this class cannot ensure that all uses of the field
> are appropriate for purposes of atomic access, it can
> guarantee atomicity only with respect to other invocations of
> {@code compareAndSet} and {@code set} on the same updater.
This implies that volatiles shouldn't use normal assignment; the
updater should set them.
Modifications:
Use setter for field updaters that make use of compareAndSet.
Result:
Concurrency compliant code
Motivation:
PD and PD0 Both try to find and use Unsafe. If unavailable, they
try to log why and continue on. However, it is not always east to
enable this logging. Chaining exceptions together is much easier
to reach, and the original exception is relevant when Unsafe is
needed.
Modifications:
* Make PD log why PD0 could not be loaded with a trace level log
* Make PD0 remember why Unsafe wasn't available
* Expose unavailability cause through PD for higher level use.
* Make Epoll and KQueue include the reason when failing
Result:
Easier debugging in hard to reconfigure environments
Motivation:
Continuing to make netty happy when compiling through errorprone.
Modification:
Mostly comments, some minor switch statement changes.
Result:
No more compiler errors!
Motivation:
On restricted systems (e.g. grsecurity), it might not be possible to write a .so on disk and load it afterwards. On those system Netty should check java.library.path for libraries to load.
Modifications:
Changed NativeLibraryLoader.java to first try to load libs from java.library.path before exporting the .so to disk.
Result:
Libraries load fine on restricted systems.
Motivation:
When compiling this code and running it through errorprone[1], this message appears:
```
StringUtil.java:493: error: [FallThrough] Switch case may fall through; add a `// fall through` comment if it was deliberate
case LINE_FEED:
^
(see http://errorprone.info/bugpattern/FallThrough)
```
By adding that comment, it silences the error and also makes clear the intention of that statement.
[1]http://errorprone.info/index
Modification:
Add simple comment.
Result:
Errorprone is happier with the code.
Motivation:
The `AsciiString#toString` method calculate string value and cache it into field. If an `AsciiString` created from the `String` value, we can avoid rebuilding strings if we cache them immediately when creating `AsciiString`. It would be useful for constants strings, which already stored in the JVMs string table, or in cases where an unavoidable `#toString `method call is assumed.
Modifications:
- Add new static method `AsciiString#cache(String)` which save string value into cache field.
- Apply a "benign" data race in the `#hashCode` and `#toString` methods.
Result:
Less memory usage in some `AsciiString` use cases.
Motivation:
At the moment we try to load the library using multiple names which includes names using - but also _ . We should just use _ all the time.
Modifications:
Replace - with _
Result:
Fixes [#7069]
Motivation:
In some cases of using an `InternalThreadLocalMap#stringBuilder`, the `StringBuilder`s size can often exceed the exist limit (1024 bytes). This can lead to permanent memory reallocation.
Modifications:
Add custom properties for the initial capacity and maximum size (after which the `StringBuilder`s capacity will be reduced to the initial capacity).
Result:
An `InternalThreadLocalMap#stringBuilder`s initial and max size is configurable. Fixes [#7092].
Motivation:
NativeLibraryLoader has some code-duplication that can be removed.
Modifications:
Remove duplicated code and just use provided methods of PlatformDependent.
Result:
Less code duplication, fixes [#3756].
Motivation:
Now that the NativeLibraryLoader implicitly detects the shaded package prefix we no longer need the io.netty.packagePrefix system property.
Modifications:
- Remove io.netty.packagePrefix processing from NativeLibraryLoader
Result:
Code is cleaner.
Motivation:
Shading requires renaming binary components (.so, .dll; for tcnative,
epoll, etc). But the rename then requires setting the
io.netty.packagePrefix system property on the command line or runtime,
which is either a burden or not feasible.
If you don't rename the binary components everything appears to
work, until a dependency on a second version of the binary component is
added. At that point, only one version of the binary will be loaded...
which is what shading is supposed to prevent. So for valid shading, the
binaries must be renamed.
Modifications:
Automatically detect the package prefix by comparing the actual class
name to the non-shaded expected class name. The expected class name must
be obfuscated to prevent shading utilities from changing it.
Result:
When shading and using binary components, runtime configuration is no
longer necessary.
Pre-existing shading users that were not renaming the binary components
will break, because the packagePrefix previously defaulted to "". Since
these pre-existing users had broken configurations that only _appeared_
to work, this breakage is considered a Good Thing. Users may workaround
this breakage temporarily by setting -Dio.netty.packagePrefix= to
restore packagePrefix to "".
Fixes#6963
Motivation:
ResourceLeakDetector records at most MAX_RECORDS+1 records
Modifications:
Make room before add to lastRecords
Result:
ResourceLeakDetector will record at most MAX_RECORDS records
Motivation:
JCTools 2.0.2 provides an unbounded MPSC linked queue. Before we shaded JCTools we had our own unbounded MPSC linked queue and used it in various places but gave this up because there was no public equivalent available in JCTools at the time.
Modifications:
- Use JCTool's MPSC linked queue when no upper bound is specified
Result:
Fixes https://github.com/netty/netty/issues/5951
Motivation:
PR #6811 introduced a public utility methods to decode hex dump and its parts, but they are not visible from netty-common.
Modifications:
1. Move the `decodeHexByte`, `decodeHexDump` and `decodeHexNibble` methods into `StringUtils`.
2. Apply these methods where applicable.
3. Remove similar methods from other locations (e.g. `HpackHex` test class).
Result:
Less code duplication.
Motivation:
A `StringUtil#escapeCsv` creates new `StringBuilder` on each value even if the same string is returned in the end.
Modifications:
Create new `StringBuilder` only if it really needed. Otherwise, return the original string (or just trimmed substring).
Result:
Less GC load. Up to 4x faster work for not changed strings.
Motivation:
AppendableCharSequence depends upon IndexOutOfBoundsException to trigger a resize operation under the assumption that the resize operation will be rare if the initial size guess is good. However if the initial size guess is not good then the performance will be more unpredictable and likely suffer.
Modifications:
- Check the position in AppendableCharSequence#append to determine if a resize is necessary
Result:
More predictable performance in AppendableCharSequence#append.
Motivation:
We need to use FQCN to prevent classloader issues for classes that are > Java6. This is a cleanup of ed5fcbb773.
Modifications:
Just remove the imports and use FQCN.
Result:
No classloader issues with java6
Motivation:
Docker's `--tmpfs` flag mounts the temp volume with `noexec` by default,
resulting in an UnsatisfiedLinkError. While this is good security
practice, it is a surprising failure from a seemingly innocuous flag.
Modifications:
Add a best-effort attempt in `NativeLibraryLoader` to detect when temp
files beng loaded cannot be executed even when execution permissions
are set, often because the `noexec` flag is set on the volume.
Requires numerous additional exclusions to the Animal Sniffer config
for Java7 POSIX permissions manipulation.
Result:
Fixes [#6678].
Motivation:
For our native libraries in netty we support shading, to have this work on runtime the user needs to set a system property. This code should shared.
Modifications:
Move logic to NativeLbiraryLoader and so share for all native libs.
Result:
Less code duplication and also will work for netty-tcnative out of the box once it support shading
Motivation:
`FormattingTuple.getArgArray()` is never used.
In the `MessageFormatter` it is possible to make
some improvements, e.g. replace `StringBuffer`
with `StringBuilder`, avoid redundant allocations, etc.
Modifications:
- Remove `argArray` field from the `FormattingTuple`.
- In `MessageFormatter`:
- replace `StringBuffer` with `StringBuilder`,
- replace `HashMap` with `HashSet` and make it lazy initialized.
- avoid redundant allocations (`substring()`, etc.)
- use appropriate StringBuilder's methods for the some `Number` values.
- Porting unit tests from `slf4j`.
Result:
Less GC load on logging with internal `MessageFormatter`.
Motivation:
IPv4/6 validation methods use allocations, which can be avoided.
IPv4 parse method use StringTokenizer.
Modifications:
Rewriting IPv4/6 validation methods to avoid allocations.
Rewriting IPv4 parse method without use StringTokenizer.
Result:
IPv4/6 validation and IPv4 parsing faster up to 2-10x.
Motivation:
We should only try to load jdk.internal.misc.Unsafe if we run on Java9+ to eliminate noise in the log.
Modifications:
- Move javaVersion() and related methods to PlatformDependent0 to be able to use these in the static initializer without creating a cycle.
- Only try to load jdk.internal.misc.Unsafe when running in Java9+
Result:
Less noise in the log when running pre java9.
Motivation:
A previous change allocated a new thread local string builder if it
was getting too large. This is a good change, these string builders
can accidentally get too large and then never shrunk and that is sort
of a memory leak. However, the change allocates an entirely new string
builder which is more allocations than necessary. Instead, we can trim
the string builder if its too large, this only allocates an extra
backing array instead of a whole new object.
Modifications:
If the string builder is above a threshold, we trim the string builder
and then ensure its capacity is reasonable to we do not allocate too
much as we start using the string builder.
Result:
The thread local string builder do not serve as a memory yet we do not
allocate too many new objects.
Motivation:
`NetUtil`'s methods `isValidIpV6Address` and `getIPv6ByName` incorrectly validate some IPv6 addresses.
Modifications:
- `getIPv6ByName`: add checks for single colon at the start or end.
- `isValidIpV6Address`: fix checks for the count of colons and use `endOffset` instead of `ipAddress.length()` for the cases with the brackets or '%'.
Result:
More correct implementation of `NetUtil#isValidIpV6Address` and `NetUtil#getIPv6ByName`.
Motivation:
AtomicIntegerFieldUpdater#get is unnecessary, I think use simple volatile read is cleaner
Modifications:
Replace code STATE_UPDATER.get(this) to state in SingleThreadEventExecutor
Result:
Cleaner code
Motivation:
InternalThreadLocalMap#stringBuilder: ensure memory overhead
Modification:
If the capacity of StringBuilder is greater than 65536 then release it on the next time you get StringBuilder and re-create a StringBuilder.
Result:
Possible less memory usage.
Motivation:
If unsafe is unavailable, we can not use the cleaner anyway. If we try
to set it up, we get an annoying log message about unsafe being
unavailable (when debug logging is enabled). We know this will fail, so
we should not even bother and avoid the log message.
Modifications:
This commit adds a guard against setting up the cleaner if it is not
going to be available because unsafe is unavailable.
Result:
We do not try to set up the cleaner if unsafe is unavailable, and we do
not get an annoying log message.
Motivation:
Users should not see a scary log message when Netty is initialized if
Netty configuration explicitly disables unsafe. The log message that
produces this warning was previously guarded but the guard was
lost.
Modifications:
This commit brings back the guard against the scary log message if
unsafe is explicitly disabled.
Result:
No log message is produced when unsafe is unavailable because Netty was
told to not look for it.
Motivation:
We currently don't have a native transport which supports kqueue https://www.freebsd.org/cgi/man.cgi?query=kqueue&sektion=2. This can be useful for BSD systems such as MacOS to take advantage of native features, and provide feature parity with the Linux native transport.
Modifications:
- Make a new transport-native-unix-common module with all the java classes and JNI code for generic unix items. This module will build a static library for each unix platform, and included in the dynamic libraries used for JNI (e.g. transport-native-epoll, and eventually kqueue).
- Make a new transport-native-unix-common-tests module where the tests for the transport-native-unix-common module will live. This is so each unix platform can inherit from these test and ensure they pass.
- Add a new transport-native-kqueue module which uses JNI to directly interact with kqueue
Result:
JNI support for kqueue.
Fixes https://github.com/netty/netty/issues/2448
Fixes https://github.com/netty/netty/issues/4231
Motivation:
Fixes#6681.
Modification:
For the sake of better timer observability, expose the number of pending timeouts through the new HashedWheelTimer.pendingTimeouts method .
Result:
It's now ridiculously easy to observe Netty timer's very basic and yet important metric, the number of pending tasks/timeouts.
Motivation:
NetUtil#isValidIpV6Address and NetUtil#getIPv6ByName allowed an invalid form of mapped IPv4 addresses which lead to accepting invalid IPv6 addresses as valid.
Modifications:
- NetUtil#isValidIpV6Address and NetUtil#getIPv6ByName should only allow 7 colons for an IPv4 address if they are the first 2 characters.
Result:
More correct implementation of NetUtil#isValidIpV6Address and NetUtil#getIPv6ByName
Motivation:
NetUtil#getByName and NetUtil#isValidIpV6Address do not strictly enforce the format of IPv4 addresses that are allowed to be embedded in IPv6 addresses as specified in https://tools.ietf.org/html/rfc4291#section-2.5.5. This may lead to invalid addresses being parsed, or invalid addresses being considered valid. Compression of a single IPv6 word was also not handled correctly if there are 7 : characters.
Modifications:
- NetUtil#isValidIpV6Address should enforce the IPv4-Compatible and IPv4-Mapped are the only valid formats for including IPv4 addresses as specified in https://tools.ietf.org/html/rfc4291#section-2.5.5
- NetUtil#getByName should more stritcly parse IPv6 addresses which contain IPv4 addresses as specified in https://tools.ietf.org/html/rfc4291#section-2.5.5
- NetUtil should allow compression even if the number of : characters is 7.
- NetUtil#createByteArrayFromIpAddressString should use the same IP string to byte[] translation which is used in NetUtil#getByName
Result:
NetUtil#getByName and NetUtil#isValidIpV6Address respect the IPv6 RFC which defines the valid formats for embedding IPv4 addresses.
Motivation:
In cases when an application is running in a container or is otherwise
constrained to the number of processors that it is using, the JVM
invocation Runtime#availableProcessors will not return the constrained
value but rather the number of processors available to the virtual
machine. Netty uses this number in sizing various resources.
Additionally, some applications will constrain the number of threads
that they are using independenly of the number of processors available
on the system. Thus, applications should have a way to globally
configure the number of processors.
Modifications:
Rather than invoking Runtime#availableProcessors, Netty should rely on a
method that enables configuration when the JVM is started or by the
application. This commit exposes a new class NettyRuntime for enabling
such configuraiton. This value can only be set once. Its default value
is Runtime#availableProcessors so that there is no visible change to
existing applications, but enables configuring either a system property
or configuring during application startup (e.g., based on settings used
to configure the application).
Additionally, we introduce the usage of forbidden-apis to prevent future
uses of Runtime#availableProcessors from creeping. Future work should
enable the bundled signatures and clean up uses of deprecated and
other forbidden methods.
Result:
Netty can be configured to not use the underlying number of processors,
but rather the constrained number of processors.
Motivation:
Java9 added a new method to Unsafe which allows to allocate a byte[] without memset it. This can have a massive impact in allocation times when the byte[] is big. This change allows to enable this when using Java9 with the io.netty.tryAllocateUninitializedArray property when running Java9+. Please note that you will need to open up the jdk.internal.misc package via '--add-opens java.base/jdk.internal.misc=ALL-UNNAMED' as well.
Modifications:
Allow to allocate byte[] without memset on Java9+
Result:
Better performance when allocate big heap buffers and using java9.
Motivation:
As the javadoc of ScheduledExecutorService state:
Zero and negative delays (but not periods) are also allowed in schedule methods,and are treated as requests for immediate execution.
Modifications:
- Correctly handle delay <= 0.
- Add unit tests.
Result:
Fixes [#6627].
Motivation:
When debugging netty memory leaks, it's sometimes helpful to
print the object's reference count.
Modifications:
Add `refCnt` methods to set of already exitsting helpers for ref coutned
objects.
Result:
Users will have utility to print object's ref count without much of a
boilerplate.
Motivation:
Java9 adds a new method to Unsafe which allows to free direct ByteBuffer via the cleaner without the need to use an commandline arguments.
Modifications:
- Add Cleaner interface
- Add CleanerJava9 which will be used when using Java9+ and take care of release direct ByteBuffer
- Let Cleaner0 implement Cleaner
Result:
Be able to free direct ByteBuffer on Java9+ again without any commandline arguments.
Motivation:
When UNSAFE.allocateMemory is returning an address whose high bit is set we currently throw an IllegalArgumentException. This is not correct as it may return a negative number on at least sparc.
Modifications:
- Allow to pass in negative memoryAddress
- Add unit tests
Result:
Correctly validate the memoryAddress and so also work on sparc as expected. Fixes [#6574].
Motivation:
The updated HTTP/1.x RFC allows for header values to be CSV and separated by OWS [1]. CombinedHttpHeaders should remove this OWS on insertion.
[1] https://tools.ietf.org/html/rfc7230#section-7
Modification:
CombinedHttpHeaders doesn't account for the OWS and returns it back to the user as part of the value.
Result:
Fixes#6452
Motivation:
We should use SystemPropertyUtil to access system properties and so always handle SecurityExceptions.
Modifications:
Use SystemPropertyUtil everywhere.
Result:
Better and consist handling of SecurityException.
Motivation:
We used some deprecated Mockito methods.
Modifications:
- Replace deprecated method usage
- Some cleanup
Result:
No more usage of deprecated Mockito methods. Fixes [#6482].
Motivation:
We forked a new process to detect if the program is run by root. We should better just use user.name system property
Modifications:
- Change PlatformDependent.isRoot0() to read the user.name system property to detect if root runs the program and rename it to maybeSuperUser0().
- Rename PlatformDependent.isRoot() to maybeSuperUser() and let it init directly in the static block
Result:
Less heavy way to detect if the program is run by root.
Motivation:
When UnorderedThreadPoolEventExecutor.execute / submit etc is called it will consume up to 100 % CPU even after the task was executed.
Modifications:
Add a special wrapper which we will be used in execute(...) to wrap the submitted Runnable. This is needed as ScheduledThreadPoolExecutor.execute(...) will delegate to submit(...) which will then use decorateTask(...). The problem with this is that decorateTask(...) needs to ensure we only do our own decoration if we not call from execute(...) as otherwise we may end up creating an endless loop because DefaultPromise will call EventExecutor.execute(...) when notify the listeners of the promise.
Result:
Fixes [#6507].
Motivation:
PlatformDependent0 makes assumptions that the array index scale for byte[] is always 1. If this is not the case the results from methods which make this assumption will be undefined.
Modifications:
- PlatformDependent0 should check if unsafe.arrayIndexScale(byte[].class) is not 1, and if so not use unsafe
Result:
Assumptions made by optimizations in PlatformDependent0 which use byte[] are explicitly enforced.
Motivation:
We only need to add the port to the HOST header value if its not a standard port.
Modifications:
- Only add port if needed.
- Fix parsing of ipv6 address which is enclosed by [].
Result:
Fixes [#6426].
Motivation:
Calling a static method is faster then dynamic
Modifications:
Add 'static' keyword for methods where it missed
Result:
A bit faster method calls
Motivation:
We shipped a javassist based implementation for typematching and logged a confusing debug message about missing javassist. We never were able to prove it really gives any perf improvements so we should just remove it.
Modifications:
- Remove javassist dependency and impl
- Fix possible classloader deadlock as reported by intellij
Result:
Less code to maintain and less confusing log message.
Motivation:
We should log why we can not use ByteBuffer.cleaner and so maybe allow the user to fix it.
Modifications:
- Use Unsafe to access the field
- Log the exception when we can not use ByteBuffer.cleaner
Result:
Easier to debug why using cleaner is not possible.
Motivation:
We have our own ThreadLocalRandom implementation to support older JDKs . That said we should prefer the JDK provided when running on JDK >= 7
Modification:
Using ThreadLocalRandom implementation of the JDK when possible.
Result:
Make use of JDK implementations when possible.
Motivation:
Java9 does not allow changing access level via reflection by default. This lead to the situation that netty disabled Unsafe completely as ByteBuffer.address could not be read.
Modification:
Use Unsafe to read the address field as this works on all Java versions.
Result:
Again be able to use Unsafe optimisations when using Netty with Java9
Motivation:
NetworkInterface.getNetworkInterfaces() may return null if no network interfaces are found. We should guard against it.
Modifications:
Check for null return value.
Result:
Fixes [#6384]
Motiviation:
Simplify implementation of compareTo/equals/hashCode for ChannelIds.
Modifications:
We simplfy the hashCode implementation for DefaultChannelId by not
making it random, but making it based on the underlying data. We fix the
compareTo implementation for DefaultChannelId by using lexicographic
comparison of the underlying data array. We fix the compareTo
implementation for CustomChannelId to avoid the possibility of overflow.
Result:
Cleaner code that is easier to maintain.
Motivation:
Java8 is out now for some time and JDK7 is no longer supported officially. We should remove all our backports and just use what the JDK provides us. This also will allow us to use intrinsics that are offered by the JDK implementations.
Modifications:
Remove all backports of jdk8 classes.
Result:
Use what the JDK offers us. This also fixes [#5458]
Motivation:
Initialization of PlatformDependent0 fails on Java 9 in static initializer when calling setAccessible(true).
Modifications:
Add RefelectionUtil which can be used to safely try if setAccessible(true) can be used or not and if not fail back to non reflection.
Result:
Fixed [#6345]
Motivation:
EpollRecvByteAllocatorHandle intends to override the meaning of "maybe more data to read" which is a concept also used in all existing implementations of RecvByteBufAllocator$Handle but the interface doesn't support overriding. Because the interfaces lack the ability to propagate this computation EpollRecvByteAllocatorHandle attempts to implement a heuristic on top of the delegate which may lead to reading when we shouldn't or not reading data.
Modifications:
- Create a new interface ExtendedRecvByteBufAllocator and ExtendedHandle which allows the "maybe more data to read" between interfaces
- Deprecate RecvByteBufAllocator and change all existing implementations to extend ExtendedRecvByteBufAllocator
- transport-native-epoll should require ExtendedRecvByteBufAllocator so the "maybe more data to read" can be propagated to the ExtendedHandle
Result:
Fixes https://github.com/netty/netty/issues/6303.
Motivation:
The JDK uses gethostbyname for blocking hostname resoltuion. gethostbyname can be configured on Unix systems according to [1][2]. This may impact the name server that is used to resolve particular domains or just override the default fall-back resolver. DnsNameResolver currently ignores these configuration files which means the default resolution behavior is different than the JDK. This may lead to unexpected resolution failures which succeed when using the JDK's resolver.
Modifications:
- Add an interface which can override what DnsServerAddressStream to use for a given hostname
- Provide a Unix specific implementation of this interface and implement [1][2]. Some elements may be ignored sortlist, timeout, etc...
Result:
DnsNameResolver behaves more like the JDK resolver by default.
[1] https://linux.die.net/man/5/resolver
[2] https://developer.apple.com/legacy/library/documentation/Darwin/Reference/ManPages/man5/resolver.5.html
Motivation:
Update of Groovy is needed to compile on recent java9 releases.
Modification:
Update to Groovy 2.4.8
Result:
This change allows Netty to be successfully compiled on more recent Java 9 previews.
Motivation:
ResourceLeakDetector supports a parameter called maxActive. This parameter is used in attempt to limit the amount of objects which are being tracked for leaks at any given time, and generates an error log message if this limit is exceeded. This assumes that there is a relationship between leak sample rate and object lifetime for objects which are already being tracked. This relationship may appear to work in cases were there are a single leak record per object and those leak records live for the lifetime of the application but in general this relationship doesn't exist. The original motivation was to provide a limit for cases such as HashedWheelTimer to limit the number of instances which exist at any given time. This limit is not enforced in all circumstances in HashedWheelTimer (e.g. if the thread is a daemon) and can be implemented outside ResourceLeakDetector.
Modifications:
- Deprecate all methods which interact with maxActive in ResourceLeakDetectorFactory and ResourceLeakDetector
- Remove all logic related to maxActive in ResourceLeakDetector
- HashedWheelTimer implements its own logic to impose a limit and warn users if too many instances exists at any given time.
Result:
Fixes https://github.com/netty/netty/issues/6225.
Motivation:
We used various mocking frameworks. We should only use one...
Modifications:
Make usage of mocking framework consistent by only using Mockito.
Result:
Less dependencies and more consistent mocking usage.
Motivation:
When comparing MAC addresses searching for the best MAC address, if
locally-administered address (e.g., from a Docker container) is compared
against an empty MAC address, the empty MAC address will be marked as
preferred. In cases this is the only available MAC address, this leaves
Netty using a random machine ID instead of using a perfectly valid
machine ID from the locally-adminstered address.
Modifications:
This commit modifies the MAC address logic so that the empty MAC address
is not preferred over a locally-administered address. This commit also
simplifies the comparison logic here.
Result:
Empty MAC addresses will not be preferred over locally-administered
addresses thus permitting the default machine ID to be the
locally-adminstered MAC address if it is the only available MAC address.
Motivation:
codec-http2 couples the dependency tree state with the remainder of the stream state (Http2Stream). This makes implementing constraints where stream state and dependency tree state diverge in the RFC challenging. For example the RFC recommends retaining dependency tree state after a stream transitions to closed [1]. Dependency tree state can be exchanged on streams in IDLE. In practice clients may use stream IDs for the purpose of establishing QoS classes and therefore retaining this dependency tree state can be important to client perceived performance. It is difficult to limit the total amount of state we retain when stream state and dependency tree state is combined.
Modifications:
- Remove dependency tree, priority, and weight related items from public facing Http2Connection and Http2Stream APIs. This information is optional to track and depends on the flow controller implementation.
- Move all dependency tree, priority, and weight related code from DefaultHttp2Connection to WeightedFairQueueByteDistributor. This is currently the only place which cares about priority. We can pull out the dependency tree related code in the future if it is generally useful to expose for other implementations.
- DefaultHttp2Connection should explicitly limit the number of reserved streams now that IDLE streams are no longer created.
Result:
More compliant with the HTTP/2 RFC.
Fixes https://github.com/netty/netty/issues/6206.
[1] https://tools.ietf.org/html/rfc7540#section-5.3.4
Motivation:
A testing goof in 7c630fe introduced a binary incompatibility when the old Promise-specific `add` and `addAll` methods in PromiseCombiner were generalized to accept `Futures`.
Modification:
- Restore (but mark as `@Deprecated`) old PromiseCombiner methods.
- Fixed a couple minor documentation typos because sure why not.
Result:
`PromiseCombiner` is binary-compatible with previous versions of Netty.
Motivation:
When an empty hostname is used in DnsNameResolver.resolve*(...) it will never notify the future / promise. The root cause is that we not correctly guard against errors of IDN.toASCII(...) which will throw an IllegalArgumentException when it can not parse its input. That said we should also handle an empty hostname the same way as the JDK does and just use "localhost" when this happens.
Modifications:
- If the try to resolve an empty hostname we use localhost
- Correctly guard against errors raised by IDN.toASCII(...) so we will always noify the future / promise
- Add unit test.
Result:
DnsNameResolver.resolve*(...) will always notify the future.
Motivation:
Currently Netty does not wrap socket connect, bind, or accept
operations in doPrivileged blocks. Nor does it wrap cases where a dns
lookup might happen.
This prevents an application utilizing the SecurityManager from
isolating SocketPermissions to Netty.
Modifications:
I have introduced a class (SocketUtils) that wraps operations
requiring SocketPermissions in doPrivileged blocks.
Result:
A user of Netty can grant SocketPermissions explicitly to the Netty
jar, without granting it to the rest of their application.
Motivation:
Replacing System.err during Slf4JLoggerFactory construction is problematic as another class may optain the System.err reference before we set it back to the original value.
Modifications:
Remove code that temporary replaced System.err.
Result:
Fixes [#6212].
Motivation:
Pattern matching not necessary for number parsing.
Modification:
Removed pattern matching for number parsing and removed unnecessary toLowerCase() operation.
Result:
No static variable with pattern, removed unnecessary matching operation and toLowerCase() operation.
Motivation:
PlatformDependent* contains some methods that are not used and some other things that can be cleaned-up.
Modifications:
- Remove unused methods
- cleanup
Result:
Code cleanup.
Motivation:
The HttpProxyHandler is expected to be capable of issuing a valid CONNECT request for a tunneled connection to an IPv6 host.
Modifications:
- Correctly format the IPV6 address.
- Add unit tests
Result:
HttpProxyHandler works with IPV6 as well. Fixes [#6152].
Motivation:
When DefaultHttp2Connection removes a stream it iterates over all children and adds them as children to the parent of the stream being removed. This process may remove elements from the child map while iterating without using the iterator's remove() method. This is generally unsafe and may result in an undefined iteration.
Modifications:
- We should use the Iterator's remove() method while iterating over the child map
Result:
Fixes https://github.com/netty/netty/issues/6163
Motivation:
[#6153] reports an endless loop that existed in the Recycler, while this was fixed adding a few asserts to ensure this remains fixed is a good thing. Beside this we also should ensure this can not escape the constructor to avoid unsafe publication.
Modifications:
- Add asserts
- Fix unsafe publication
Result:
More correct code.
Motivation:
`scavengeSome()` has a corner case: when setting `cursor` to `head`, `this.prev` may point to the tail of the `WeakOrderQueue` linked list. Then it's possible that the following while loop will link the tail to the head, and cause endless loop.
I made a reproducer in 36522e7b72 . The unit test will just run forever. Unfortunately, I cannot change it to a unit test because it needs to add some codes to `scavengeSome` to control the execution flow.
Modification:
Set `prev` to null when setting `cursor` to `head` in `scavengeSome`
Result:
Fixes#6153.
Motivation:
InternalThreadLocalMap.arrayList returns a new ArrayList every time it's called that defeats the purpose of having a reusable ArrayList.
Modification:
Modified InternalThreadLocalMap.arrayList to create an ArrayList only if arrayList field is NULL.
Result:
InternalThreadLocalMap.arrayList now creates a reusable ArrayList only if arrayList field is NULL.
Motivation:
We used a MPSC queue in ThreadDeathWatcher and checked if it empty via isEmpty() from multiple threads if very unlucky. Depending on the implementation this is not safe and may even produce things like live-locks.
Modifications:
Change to use a MPMC queue.
Result:
No more risk to run into issues when multiple threads call watch(...) / unwatch(...) concurrently.
Motivation:
DefaultChannelId provides a regular expression which validates if a user provided MAC address is valid. This regular expression may allow invalid MAC addresses and also not allow valid MAC addresses.
Modifications:
- Introduce a MacAddressUtil#parseMac method which can parse and validate the MAC address at the same time. The regular expression check before hand is additional overhead if we have to parse the MAC address.
Result:
Fixes https://github.com/netty/netty/issues/6132.
Motivation:
`PromiseCombiner` is really handy, but it's not obvious how to use it from its existing documentation/method signatures.
Modification:
- Added javadoc comments to explain the theory of operation of `PromiseCombiner`.
- Generalized `PromiseCombiner` to work with `Futures` so it's clearer that the things for which it's listening won't be modified.
Result:
`PromiseCombiner` is easier to understand.
Motivation:
When profiling it is sometimes needed to still have the native library file avaible. We should allow to disable the explicit deletion and just delete it when the JVM stops.
This is related to #6110
Modifications:
Add io.netty.native.deleteLibAfterLoading system property which allows to disable the explicit delete after laoding
Result:
Possible to profile native libraries better.
Motivation:
In later Java8 versions our Atomic*FieldUpdater are slower then the JDK implementations so we should not use ours anymore. Even worse the JDK implementations provide for example an optimized version of addAndGet(...) using intrinsics which makes it a lot faster for this use-case.
Modifications:
- Remove methods that return our own Atomic*FieldUpdaters.
- Use the JDK implementations everywhere.
Result:
Faster code.
Motivation:
c2f4daa739 added a unit test but used a too small test timeout.
Modifications:
Increase timeout.
Result:
Test should have enough time to complete on the CI.
Motivation:
InternalLoggerFactory either sets a default logger factory
implementation based on the logging implementations on the classpath, or
applications can set a logger factory explicitly. If applications wait
too long to set the logger factory, Netty will have already set a logger
factory leading to some objects using one logging implementation and
other objets using another logging implementation. This can happen too
if the application tries to set the logger factory twice, which is
likely a bug in the application. Yet, the Javadocs for
InternalLoggerFactory warn against this saying that
InternalLoggerFactory#setLoggerFactory "should be called as early as
possible and shouldn't be called more than once". Instead, Netty should
guard against this.
Modications:
We replace the logger factory field with an atomic reference on which we
can do CAS operations to safely guard against it being set twice. We
also add an internal holder class that captures the static interface of
InternalLoggerFactory that can aid in testing.
Result:
The logging factory can not be set twice, and applications that want to
set the logging factory must do it before any Netty classes are
initialized (or the default logger factory will be set).
Motivation:
We need to ensure the tracked object can not be GC'ed before ResourceLeak.close() is called as otherwise we may get false-positives reported by the ResourceLeakDetector. This can happen as the JIT / GC may be able to figure out that we do not need the tracked object anymore and so already enqueue it for collection before we actually get a chance to close the enclosing ResourceLeak.
Modifications:
- Add ResourceLeakTracker and deprecate the old ResourceLeak
- Fix some javadocs to correctly release buffers.
- Add a unit test for ResourceLeakDetector that shows that ResourceLeakTracker has not the problems.
Result:
No more false-positives reported by ResourceLeakDetector when ResourceLeakDetector.track(...) is used.
Motivation:
Java9 will be released soon so we should ensure we can compile netty with Java9 and run all our tests. This will help to make sure Netty will be usable with Java9.
Modification:
- Add some workarounds to be able to compile with Java9, note that the full profile is not supported with Java9 atm.
- Remove some usage of internal APIs to be able to compile on java9
- Not support Alpn / Npn and so not run the tests when using Java9 for now. We will do a follow up PR to add support.
Result:
Its possible to build netty and run its testsuite with Java9.
Motivation:
42fba015ce changed the implemention of ResourceLeakDetector to improve performance. While this was done a branch was missed that can be removed. Beside this using a Boolean as value for the ConcurrentMap is sub-optimal as when calling remove(key, value) an uncessary instanceof check and cast is needed on each removal.
Modifications:
- Remove branch which is not needed anymore
- Replace usage of Boolean as value type of the ConcurrentMap and use our own special type which only compute hash-code one time and use a == operation for equals(...) to reduce overhead present when using Boolean.
Result:
Faster and cleaner ResourceLeakDetector.
Motivation:
Netty has a flag (io.netty.noUnsafe) for specifying to Netty to not be
unsafe. Yet, when initializing PlatformDependent0, Netty still tries to
be unsafe. For application that specify to Netty to not be unsafe and
run under a security manager, this can lead to an obnoxious (debug
level) stack trace. Since Netty was told not to be unsafe, Netty should
not try to be unsafe.
Modifications:
The initialization logic in PlatformDependent0 should take into account
that Netty was told not to be unsafe. This means that we need to
initialize PlatformDependent#IS_EXPLICIT_NO_UNSAFE as soon as possible,
before the static initializer for PlatformDependent0 has a chance to
run. Thus the following modifications are made:
- initialize PlatformDependent#IS_EXPLICIT_NO_UNSAFE before any other
code in PlatformDependent causes PlatformDependent0 to initialize
- expose the value of PlatformDependent#IS_EXPLICIT_NO_UNSAFE for
reading in PlatformDependent0
- take the value of PlatformDependent#IS_EXPLICIT_NO_UNSAFE into
account in PlatformDependent0
Result:
Netty does not try to be unsafe when told not to be unsafe.
Motivation:
For applications that set their own logger factory, they want that
logger factory to be the one logger factory. Yet, Netty eagerly
initializes this and then triggers initialization of other classes
before the application has had a chance to set its preferred logger
factory.
Modifications:
With this commit there are two key changes:
- Netty does not attempt to eagerly initialize the default logger
factory, only doing so if the application layer above Netty has not
already set a logger factory
- do not eagerly initialize unrelated classes from the logger factory;
while the motivation behind this was to initialize ThreadLocalRandom
as soon as possible in case it has to block reading from /dev/random,
this can be worked around for applications where it is problematic by
setting securerandom.source=file:/dev/urandom in their Java system
security policy (no, it is not less secure; do not even get me
started on myths about /dev/random)
Result:
Netty uses the logger factory that the application prefers, and does not
initialize unrelated classes.
Motivation:
PlatformDependent#getSystemClassLoader may throw a wide variety of exceptions based upon the environment. We should handle all exceptions and continue initializing the slow path if an exception occurs.
Modifications:
- Catch Throwable in cases where PlatformDependent#getSystemClassLoader is used
Result:
Fixes https://github.com/netty/netty/issues/6038
Motiviation:
We used ReferenceCountUtil.releaseLater(...) in our tests which simplifies a bit the releasing of ReferenceCounted objects. The problem with this is that while it simplifies stuff it increase memory usage a lot as memory may not be freed up in a timely manner.
Modifications:
- Deprecate releaseLater(...)
- Remove usage of releaseLater(...) in tests.
Result:
Less memory needed to build netty while running the tests.
Motivation:
00fc239995 introduced a change to HashedWheelTimerTest which attempted to wait for an explicit event notification until more timer events can be added. However HashedWheelTimer will execute the timer Runnable before removing it from the queue and decrementing the total count. This make it difficult for users to know when it is safe to add another timer task as the limit is approached.
Modifications:
- HashedWheelTimer should remove the timer Runnable before executing the task.
Result:
Users can more reliably add new timers when the limit is reached and HashedWheelTimerTest will no longer fail spuriously due to this race condition.
Motivation:
If a stream is not able to send any data (flow control window for the stream is exhausted) but has descendants who can send data then WeightedFairQueueByteDistributor may incorrectly modify the pseudo time and also double add the associated state to the parent's priority queue. The pseudo time should only be modified if a node is moved in the priority tree, and not if there happens to be no active streams in its descendent tree and a descendent is moved (e.g. removed from the tree because it wrote all data and the last data frame was EOS). Also the state objects for WeightedFairQueueByteDistributor should only appear once in any queue. If this condition is violated the pseudo time accounting would be biased at and assumptions in WeightedFairQueueByteDistributor would be invalidated.
Modifications:
- WeightedFairQueueByteDistributor#isActiveCountChangeForTree should not allow re-adding to the priority queue if we are currently processing a node in the distribution algorithm. The distribution algorithm will re-evaluate if the node should be re-added on the tail end of the recursion.
Result:
Fixes https://github.com/netty/netty/issues/5980
Motivation:
HashWheelTimerTest has busy/wait and sleep statements which are not necessary. We also depend upon a com.google.common.base.Supplier which isn't necessary.
Modifications:
- Remove buys wait loops and timeouts where possible
Result:
HashWheelTimerTest more explicit in verifying conditions and less reliant on wait times.
Motivation:
If the rate at which new timeouts are created is very high and the created timeouts are not cancelled, then the JVM can crash because of out of heap space. There should be a guard in the implementation to prevent this.
Modifications:
The constructor of HashedWheelTimer now takes an optional max pending timeouts parameter beyond which it will reject new timeouts by throwing RejectedExecutionException.
Result:
After this change, if the max pending timeouts parameter is passed as constructor argument to HashedWheelTimer, then it keeps a track of pending timeouts that aren't yet expired or cancelled. When a new timeout is being created, it checks for current pending timeouts and if it's equal to or greater than provided max pending timeouts, then it throws RejectedExecutionException.
Motivation:
PlatformDependent0 should not be referenced directly when sun.misc.Unsafe is unavailable.
Modifications:
Guard byteArrayBaseOffset with hasUnsafe check.
Result:
PlatformDependent can be initialized when sun.misc.Unsafe is unavailable.
Motivation:
ResourceLeakDetector shows two main problems, racy access and heavy lock contention.
Modifications:
This PR fixes this by doing two things:
1. Replace the sampling counter with a ThreadLocalRandom. This has two benefits.
First, it makes the sampling ration no longer have to be a power of two. Second,
it de-noises the continuous races that fight over this single value. Instead,
this change uses slightly more CPU to decide if it should sample by using TLR.
2. DefaultResourceLeaks need to be kept alive in order to catch leaks. The means
by which this happens is by a singular, doubly-linked list. This creates a
large amount of contention when allocating quickly. This is noticeable when
running on a multi core machine.
Instead, this uses a concurrent hash map to keep track of active resources
which has much better contention characteristics.
Results:
Better concurrent hygiene. Running the gRPC QPS benchmark showed RLD taking about
3 CPU seconds for every 1 wall second when runnign with 12 threads.
There are some minor perks to this as well. DefaultResourceLeak accounting is
moved to a central place which probably has better caching behavior.
Motivation:
PlatformDependent has a hash code algorithm which utilizes UNSAFE for performance reasons. This hash code algorithm must also be consistent with CharSequence objects that represent a collection of ASCII characters. In order to make the UNSAFE versions and CharSequence versions the endianness should be taken into account. However the big endian code was not correct in a few places.
Modifications:
- Correct bugs in PlatformDependent class related to big endian ASCII hash code computation
Result:
Fixes https://github.com/netty/netty/issues/5925
Motivation:
the build doesnt seem to enforce this, so they piled up
Modifications:
removed unused import lines
Result:
less unused imports
Signed-off-by: radai-rosenblatt <radai.rosenblatt@gmail.com>
Motivation:
ResourceLeakDetector reports leak for first call to open(obj) as its leakCheckCnt starts with value 0 and increment subsequently. with value of leakCheckCnt =0, it always returns ResourceLeak. Our application calls ResourceLeakDetector.open(obj) to validate Leak and it fails at very first call even though there is no leak in application.
Modifications:
ResourceLeakDetector.leakCheckCnt value will not be 0 while deriving leak and it will not return incorrect value of ResourceLeak.
Result:
Fix false leak report on first call on ResourceLeakDetector.
Motivation:
NetUtil.bytesToIpAddress does not correctly translate IPv4 address to String. Also IPv6 addresses may not follow minimization conventions when converting to a String (see rfc 5952).
Modifications:
- NetUtil.bytesToIpAddress should correctly handle negative byte values for IPv4
- NetUtil.bytesToIpAddress should leverage existing to string conversion code in NetUtil
Result:
Fixes https://github.com/netty/netty/issues/5821
Motivation:
At the moment we log very confusing messages when trying to load a native library which kind of suggest that the whole loading process failed even if just one mechanism failed and the library could be loaded at the end.
Modifications:
Make the mesage less confusing and also log a successful load of the native library.
Result:
Less confusing logs.
Motivation:
Since netty shaded JCTools the OSGi manifest no longer is correct. It claims to
have an optional import "org.jctools.queues;resolution:=optional,org.jctools.qu
eues.atomic;resolution:=optional,org.jctools.util;resolution:=optional"
However since it is shaded, this is no longer true.
This was noticed when making JCTools a real bundle and netty resolved it as
optional import.
Modifications:
Modify the generated manifest by no longer analyzing org.jctools for imports.
A manual setting of sun.misc as optional was required.
Result:
Netty OSGi bundle will no longer interfere with a JCTools bundle.
Motivation:
To make it easier to debug why notification of a promise failed we should log extra info and make it consistent.
Modifications:
- Create a new PromiseNotificationUtil that has static methods that can be used to try notify a promise and log.
- Reuse this in AbstractChannelHandlerContext, ChannelOutboundBuffer and PromiseNotifier
Result:
Easier to debug why a promise could not be notified.
Motivation:
We can share the code in retain() and retain(...) and also in release() and release(...).
Modifications:
Share code.
Result:
Less duplicated code.
Motivation:
Windows refuses to load a .DLL file when it's opened by other process.
Recent modification in NativeLibraryLoader causes NativeLibraryLoader to
attempt to load a .DLL before closing its OutputStream. As a result,
loading a .DLL file in Windows always fails.
Modifications:
Close the OutputStream explicitly before loading a shared library.
Result:
Native library loading in Windows works again.
Motivation:
SystemPropertyUtil requires InternalLoggerFactory requires ThreadLocalRandom requires SystemPropertyUtil. This can lead to a dead-lock.
Modifications:
Ensure ThreadLocalRandom does not require SystemPropertyUtil during initialization.
Result:
No more deadlock possible.
Motivation:
Sometimes it is useful to be able to wrap an existing memory address (a.k.a pointer) and create a ByteBuf from it. This way its easier to interopt with other libraries.
Modifications:
Add a new Unpooled.wrappedBuffer(....) method that takes a memory address.
Result:
Be able to wrap an existing memory address into a ByteBuf.
Motivation:
As the issue #5539 say, the OpenSsl.class will throw `java.lang.UnsatisfiedLinkError: org.apache.tomcat.jni.Library.version(I)I` when it is invoked. This path try to resolve the problem by modifying the native library loading logic of OpenSsl.class.
Modifications:
The OpenSsl.class loads the tcnative lib by `NativeLibraryLoader.loadFirstAvailable()`. The native library will be loaded in the bundle `netty-common`'s ClassLoader, which is diff with the native class's ClassLoader. That is the root cause of throws `UnsatisfiedLinkError` when the native method is invoked.
So, it should load the native library by the its bundle classloader firstly, then the embedded resources if failed.
Result:
First of all, the error threw by native method problem will be resolved.
Secondly, the native library should work as normal in non-OSGi env. But, this is hard. The loading logic of `Library.class` in `netty-tcnative` bundle is simple: try to load the library in PATH env. If not found, it falls back to the originally logic `NativeLibraryLoader.loadFirstAvailable()`.
Signed-off-by: XU JINCHUAN <xsir@msn.com>
Motivation:
When try to call Cleaner.run() via reflection on Java9 you may see an IllegalAccessException.
Modifications:
Just cast the Cleaner to Runnable to prevent IllegalAccessException to be raised.
Result:
Free direct buffers also work on Java9+ as expected.
Motivation:
Our current strategy in NativeLibraryLoader is to mark the temporary .so file to be deleted on JVM exit. This has the drawback to not delete the file in the case the JVM dies or is killed.
Modification:
Just directly try to delete the file one we loaded the native library and if this fails mark the file to be removed once the JVM exits.
Result:
Less likely to have temporary files still on the system in case of JVM kills.
Motivation:
According to the Oracle documentation:
> java.net.preferIPv4Stack (default: false)
>
> If IPv6 is available on the operating system, the underlying native
> socket will be an IPv6 socket. This allows Java applications to connect
> to, and accept connections from, both IPv4 and IPv6 hosts.
>
> If an application has a preference to only use IPv4 sockets, then this
> property can be set to true. The implication is that the application
> will not be able to communicate with IPv6 hosts.
which means, if DnsNameResolver returns an IPv6 address, a user (or
Netty) will not be able to connect to it.
Modifications:
- Move the code that retrieves java.net.prefer* properties from
DnsNameResolver to NetUtil
- Add NetUtil.isIpV6AddressesPreferred()
- Revise the API documentation of NetUtil.isIpV*Preferred()
- Set the default resolveAddressTypes to IPv4 only when
NetUtil.isIpv4StackPreferred() returns true
Result:
- Fixes#5657
Motiviation:
Preparing platform dependent code for using unsafe requires executing
privileged code. The privileged code for initializing unsafe is executed
in a manner that would require all code leading up to the initialization
to have the requisite permissions. Yet, in a restrictive environment
(e.g., under a security policy that only grants the requisite
permissions the Netty common jar but not to application code triggering
the Netty initialization), then initializing unsafe will not succeed
even if the security policy would otherwise permit it.
Modifications:
This commit marks the necessary blocks as privileged. This enables
access to the necessary resources for initialization unsafe. The idea is
that we are saying the Netty code is trusted, and as long as the Netty
code has been granted the necessary permissions, then we will allow the
caller access to these resources even though the caller itself might not
have the requisite permissions.
Result:
Unsafe can be initialized in a restrictive security environment.
Motivation:
Its not clear that the capacity is per thread.
Modifications:
Rename system property to make it more clear that the recycler capacity is per thread.
Result:
Less confusing.
Motivation:
We not need to do an extra conditional check in retain(...) as we can just check for overflow after we did the increment.
Modifications:
- Remove extra conditional check
- Add test code.
Result:
One conditional check less.
Motivation:
If the user uses 0 as quiet period we should shutdown without any delay if possible.
Modifications:
Ensure we not introduce extra delay when a shutdown quit period of 0 is used.
Result:
EventLoop shutdown as fast as expected.
Current constant pool holds all data within HashMap and all access to this HashMap is done via synchronized blocks. Thus CuncurrentHashMap will be here more efficient as it designed for higher throughput and will use less locks. Also valueOf method was not very efficient as it performed get operation 2 times.
Modifications :
HashMap -> PlatformDependent.newConcurrentHashMap().
ValueOf is more efficient now, threadsafe and uses less locks. Downside is that final T tempConstant = newConstant(nextId(), name); could be called more than 1 time during high contention.
Result :
Less contention, cleaner code.
Motivation:
We offer DefaultEventExecutorGroup as an EventExecutorGroup which return OrderedEventExecutor and so provide strict ordering of event execution. One limitations of this implementation is that each contained DefaultEventExecutor will always be tied to a single thread, which can lead to a very unbalanced execution as one thread may be super busy while others are idling.
Modifications:
- Add NonStickyEventExecutorGroup which can be used to wrap another EventExecutorGroup (like UnorderedThreadPoolEventExecutor) and expose ordering while not be sticky with the thread that is used for a given EventExecutor. This basically means that Threads may change between execution of tasks for an EventExecutor but ordering is still guaranteed.
Result:
Better utalization of threads in some use-cases.
Motivation:
To better restrict resource usage we should limit the number of WeakOrderQueue instances per Thread. Once this limit is reached object that are recycled from a different Thread then the allocation Thread are dropped on the floor.
Modifications:
Add new system property io.netty.recycler.maxDelayedQueuesPerThread and constructor that allows to limit the max number of WeakOrderQueue instances per Thread for Recycler instance. The default is 2 * cores (the same as the default number of EventLoop instances per EventLoopGroup).
Result:
Better way to restrict resource / memory usage per Recycler instance.
Motivation:
Commons logger is dead and not updated for more than 2 years. #5615.
Modifications:
Added @Deprecated annotation to CommonsLoggerFactory and CommonsLogger.
Result:
Commons logger now deprecated.
Motivation:
When Netty components are initialized, Netty attempts to determine if it
has access to unsafe. If Netty is not able to access unsafe (because of
security permissions, or because the JVM was started with an explicit
flag to tell Netty to not look for unsafe), Netty logs an info-level
message that looks like a warning:
Your platform does not provide complete low-level API for accessing
direct buffers reliably. Unless explicitly requested, heap buffer will
always be preferred to avoid potential system unstability.
This log message can appear in applications that depend on Netty for
networking, and this log message can be scary to end-users of such
platforms. This log message should not be emitted if the application was
started with an explicit flag telling Netty to not look for unsafe.
Modifications:
This commit refactors the unsafe detection logic to expose whether or
not the JVM was started with a flag telling Netty to not look for
unsafe. With this exposed information, the log message on unsafe being
unavailable can be modified to not be emitted when Netty is explicitly
told to not look for unsafe.
Result:
No log message is produced when unsafe is unavailable because Netty was
told to not look for it.
Motivation:
AbstractConstant.compareTo seems complex and hard to understand. Also it allocates unnecessary 1 byte in direct buffer and holds unnecessary pointer to this byte butter.
Modifications:
uniquifier (id) variable now initialized during Constant creation and thus no need in volatile and no need in uniquifier() method as it could be easily replaced with AtomicLong.
Result:
Every Constant instance now consumes less bytes for pointer, don't consume anything in direct buffer.
Motivation:
Old code doesn't needed anymore due to logger factory initialization.
Modifications :
Removed static section and useless static variables;
Logging concatenations replaced with placeholders.
Result:
Cleaner, simpler code doing the same
Motivation:
Some usages of findNextPositivePowerOfTwo assume that bounds checking is taken care of by this method. However bounds checking is not taken care of by findNextPositivePowerOfTwo and instead assert statements are used to imply the caller has checked the bounds. This can lead to unexpected non power of 2 return values if the caller is not careful and thus invalidate any logic which depends upon a power of 2.
Modifications:
- Add a safeFindNextPositivePowerOfTwo method which will do runtime bounds checks and always return a power of 2
Result:
Fixes https://github.com/netty/netty/issues/5601
Motivation:
NewLine initializing is complex, with unnecessary allocations and non-standard.
Static section is overloaded with StringBuilders for simple "s" + "s" concatenation pattern that compiler optimizes perfectly.
Modifications:
NewLine initializing replaced with standard System.getProperty("line.separator").
Removed StringBuilders in static section.
Result:
Less complex code.
Motivation:
When we try to close the Channel due a timeout we need to ensure we not log if the notification of the promise fails as it may be completed in the meantime.
Modifications:
Add another constructor to ChannelPromiseNotifier and PromiseNotifier which allows to log on notification failure.
Result:
No more miss-leading logs.
Motivation:
At the moment the Recyler is very sensitive to allocation bursts which means that if there is a need for X objects for only one time these will most likely end up in the Recycler and sit there forever as the normal workload only need a subset of this number.
Modifications:
Add a ratio which sets how many objects should be pooled for each new allocation. This allows to slowly increase the number of objects in the Recycler while not be to sensitive for bursts.
Result:
Less unused objects in the Recycler if allocation rate sometimes bursts.
Motivation:
Using Attribute.remove() and Attribute.getAndRemove() in a multi-threaded enviroment has its drawbacks. Make sure we document these.
Modifications:
Add javadocs and mark Attribute.remove() and Attribute.getAndRemove() as @Deprecated.
Result:
Hopefully less suprising behaviour.
Motivation:
We used a very high number for DEFAULT_INITIAL_MAX_CAPACITY (over 200k) which is not very relastic and my lead to very surprising memory usage if allocations happen in bursts.
Modifications:
Use a more sane default value of 32k
Result:
Less possible memory usage by default
Motivation:
AbstractReferenceCounted as independent conditional statements to check the bounds of the retain IllegalReferenceCountException condition. One of the exceptions also uses the incorrect increment.
Modifications:
- Combined independent conditional checks into 1 where possible
- Correct IllegalReferenceCountException with incorrect increment
- Remove the subtract to check for overflow and re-use the addition and check for overflow to remove 1 arithmetic operation (see http://docs.oracle.com/javase/specs/jls/se7/html/jls-15.html#jls-15.18.2)
Result:
AbstractReferenceCounted has less independent branch statements and more correct IllegalReferenceCountException. Compilation size of AbstractReferenceCounted.retain() is reduced from 58 bytes to 47 bytes.
Motivation:
We saw some sporadic test failures for GlobalEventExecutorTest.testAutomaticStartStop test. This is caused parallel execution of tests in combination with assert checks that will be affected.
Modifications:
Remove fragile assert checks.
Result:
No more sporadic test failures
Motivation:
We use a shared capacity per Stack for all its WeakOrderQueue elements. These relations are stored in a WeakHashMap to allow dropping these if memory pressure arise. The problem is that we not "reclaim" previous reserved space when this happens. This can lead to a Stack which has not shared capacity left which then will lead to an AssertError when we try to allocate a new WeakOderQueue.
Modifications:
- Ensure we never throw an AssertError if we not have enough space left for a new WeakOrderQueue
- Ensure we reclaim space when WeakOrderQueue is collected.
Result:
No more AssertError possible when new WeakOrderQueue is created and also correctly reclaim space that was reserved from the shared capacity.
Motivation:
Due an implementation flaw in DefaultAttributeMap it was possible that an attribute and its stored value/key could not be collected until the DefaultAttributeMap could be collected. This can lead to unexpected memory usage and strong reachability of objects that should be collected.
Modifications:
Use an special empty DefaultAttribute as head of the each bucket which will not store any key / value. With this change everything can be collected as expected as we not use any DefaultAttribute created by the user as head of a bucket.
Result:
DefaultAttributeMap does not store user data and thus the lifetime of this user data is not tied to the lifetime of the DefaultAttributeMap.
Motivation:
Commit afafadd3d7 introduced a change which stored the Stack in the WeakOrderQueue as field. This unfortunally had the effect that it was not removed from the WeakHashMap anymore as the Stack also is used as key.
Modifications:
Do not store a reference to the Stack in WeakOrderQueue.
Result:
WeakOrderQueue can be collected correctly again.
Motivation:
When resolving localhost on Windows where the hosts file does not contain a localhost entry by default, the resulting InetAddress object returned by the resolver does not have the hostname set so that getHostName returns the ip address 127.0.0.1. This behaviour is inconsistent with Windows where the hosts file does contain a localhost entry and with Linux in any case. It breaks at least some unit tests.
Modifications:
Create the LOCALHOST4 and LOCALHOST6 objects with hostname localhost in addition to the address.
Add unit test domain localhost to DnsNameResolverTest to check the resolution of localhost with ipv4 at least.
Result:
The resolver returns a InetAddress object for localhost with the hostname localhost in all cases.
Motivation:
The Java version is used for platform dependent logic. Yet, the logic
for acquiring the Java version requires special permissions (the runtime
permission "getClassLoader") that some downstream projects will never
grant. As such, these projects are doomed to have Netty act is their
Java major version is six. While there are ways to maintain the same
logic without requiring these special permissions, the logic is
needlessly complicated because it relies on loading classes that exist
in version n but not version n - 1. This complexity can be removed. As a
bonanza, the dangerous permission is no longer required.
Modifications:
Rather than attempting to load classes that exist in version n but not
in version n - 1, we can just parse the Java specification version. This
only requires a begign property (property permission
"java.specification.version") and is simple.
Result:
Acquisition of the Java version is safe and simple.
Motivation:
The clean method in java.base/jdk.internal.ref.Cleaner is not accessible
to methods outside java.base. This prevents Cleaner0.freeDirectBuffer
from actually calling the clean method on JDK9.
The issue could have been caught earlier if Cleaner0 is initialized when
PlatformDependent0 is initialized and logging statements in the static
initializer in Cleaner0 would be close to the logging statements in the
static initializer in PlatformDependent0.
Modifications:
Try casting the cleaner obtained from a ByteBuffer to Runnable and use
Runnable.run if possible. All Cleaners in JDK9 implements Runnable. Fall
back to the clean method if the cleaner does not implement Runnable.
The fallback preserves the behavior on JDK8 and earlier.
Try to free the direct ByteBuffer allocated during static initialization
of PlatformDependent0. This cause Cleaner0 to be initialized when
PlatformDependent0 is initialized, and logging statements from the
static initializers will be close together.
Result:
Cleaner0.freeDirectBuffer works as intended on JDK9 and logging shows
that Cleaner0.freeDirectBuffer works as intended.
Motivation:
A recent change to DefaultThreadFactory modified it so that it is sticky
with respect to thread groups. In particular, this change made it so
that DefaultThreadFactory would hold on to the thread group that created
it, and then use that thread group to create threads.
This can have problematic semantics since it can lead to violations of a
tenet of thread groups that a thread can only modify threads in its own
thread group and descendant thread groups. With a sticky thread group, a
thread triggering the creation of a new thread via
DefaultThreadFactory#newThread will be modifying a thread from the
sticky thread group which will not necessarily be its own nor a
descendant thread group. When a security manager is in place that
enforces this requirement, these modifications are now impossible. This
is especially problematic in the context of Netty because certain global
singletons like GlobalEventExecutor will create a
DefaultThreadFactory. If all DefaultThreadFactory instances are sticky
about their thread groups, it means that submitting tasks to the
GlobalEventExecutor singleton can cause a thread to be created from the
DefaultThreadFactory sticky thread group, exactly the problem with
DefaultThreadFactory being sticky about thread groups. A similar problem
arises from the ThreadDeathWatcher.
Modifications:
This commit modifies DefaultThreadFactory so that a null thread group
can be set with the behavior that all threads created by such an
instance will inherit the default thread group (the thread group
provided by the security manager if there is one, otherwise the thread
group of the creating thread). The construction of the instances of
DefaultThreadFactory used by the GlobalEventExecutor singleton and
ThreadDeathWatcher are modified to use this behavior. Additionally, we
also modify the chained constructor invocations of the
DefaultThreadFactory that do not have a parameter to specify a thread
group to use the thread group from the security manager is available,
otherwise the creating thread's thread group. We also add unit tests
ensuring that all of this behavior is maintained.
Result:
It will be possible to have DefaultThreadFactory instances that are not
sticky about the thread group that led to their creation. Instead,
threads created by such a DefaultThreadFactory will inherit the default
thread group which will either be the thread group from the security
manager or the current thread's thread group.
Motivation:
Currently, the recycler max capacity it's only enforced on the
thread-local stack which is used when the recycling happens on the
same thread that requested the object.
When the recycling happens in a different thread, then the objects
will be queued into a linked list (where each node holds N objects,
default=16). These objects are then transfered into the stack when
new objects are requested and the stack is empty.
The problem is that the queue doesn't have a max capacity and that
can lead to bad scenarios. Eg:
- Allocate 1M object from recycler
- Recycle all of them from different thread
- Recycler WeakOrderQueue will contain 1M objects
- Reference graph will be very long to traverse and GC timeseems to be negatively impacted
- Size of the queue will never shrink after this
Modifications:
Add some shared counter which is used to manage capacity limits when recycle from different thread then the allocation thread. We modify the counter whenever we allocate a new Link to reduce the overhead of increment / decrement it.
Result:
More predictable number of objects mantained in the recycler pool.
Motivation:
This change is part of the change done in PR #5395 to provide an `AUTO_FLUSH` capability.
Splitting this change will enable to try other ways of implementing `AUTO_FLUSH`.
Modifications:
Two methods:
```java
void executeAfterEventLoopIteration(Runnable task);
boolean removeAfterEventLoopIterationTask(Runnable task);
```
are added to `SingleThreadEventLoop` class for adding/removing a task to be executed at the end of current/next iteration of this `eventloop`.
In order to support the above, a few methods are added to `SingleThreadEventExecutor`
```java
protected void afterRunningAllTasks() { }
```
This is invoked after all tasks are run for this executor OR if the passed timeout value for `runAllTasks(long timeoutNanos)` is expired.
Added a queue of `tailTasks` to `SingleThreadEventLoop` to hold all tasks to be executed at the end of every iteration.
Result:
`SingleThreadEventLoop` now has the ability to execute tasks at the end of an eventloop iteration.
Motivation:
Today when awaiting uninterruptibly on a default promise, a race
condition can lead to a missed signal. Quite simply, the check for
whether the condition holds is not made inside a lock before
waiting. This means that the waiting thread can enter the wait after the
promise has completed and will thus not be notified, thus missing the
signal. This leads to the waiting thread to enter a timed wait that will
only trip with the timeout elapses leading to unnecessarily long waits
(imagine a connection timeout, and the waiting thread missed the signal
that the connection is ready).
Modification:
This commit fixes this missed signal by checking the condition inside a
lock. We also add a test that reliably fails without the non-racy
condition check.
Result:
Timed uninterruptible waits on default promise will not race against the
condition and possibly wait longer than necessary.
Motivation:
ExecutorService.invoke*(...) methods may block by API definition. This can lead to deadlocks if called from inside the EventLoop in SingleThreadEventExecutor as it only has one Thread that does all the work.
Modifications:
Throw a RejectedExectionException if someone tries to call SingleThreadEventExecutor.invoke*(...) while in the EventLoop.
Result:
No more deadlock possible.
Motivation:
The current DnsNameResolver does not support search domains resolution. Search domains resolution is supported out of the box by the java.net resolver, making the DnsNameResolver not able to be a drop in replacement for io.netty.resolver.DefaultNameResolver.
Modifications:
The DnsNameResolverContext resolution has been modified to resolve a list of search path first when it is configured so. The resolve method now uses the following algorithm:
if (hostname is absolute (start with dot) || no search domains) {
searchAsIs
} else {
if (numDots(name) >= ndots) {
searchAsIs
}
if (searchAsIs wasn't performed or failed) {
searchWithSearchDomainsSequenciallyUntilOneSucceeds
}
}
The DnsNameResolverBuilder provides configuration for the search domains and the ndots value. The default search domains value is configured with the OS search domains using the same native configuration the java.net resolver uses.
Result:
The DnsNameResolver performs search domains resolution when they are present.
Motivation:
DefaultPromise has a listeners member variable which is volatile to allow for an optimization which makes notification of listeners less expensive when there are no listeners to notify. However this change makes all other operations involving the listeners member variable more costly. This optimization which requires listeners to be volatile can be removed to avoid volatile writes/reads for every access on the listeners member variable.
Modifications:
- DefaultPromise listeners is made non-volatile and the null check optimization is removed
Result:
DefaultPromise.listeners is no longer volatile.
Motivation:
In commit f984870ccc I made a change which operated under invalide assumption that tasks executed by an EventExecutor will always be processed in a serial fashion. This is true for SingleThreadEventExecutor sub-classes but not part of the EventExecutor interface contract.
Because of this change implementations of EventExecutor which not strictly execute tasks in a serial fashion may miss events before handlerAdded(...) is called. This is strictly speaking not correct as there is not guarantee in this case that handlerAdded(...) will be called as first task (as there is no ordering guarentee).
Cassandra itself ships such an EventExecutor implementation which has no strict ordering to spread load across multiple threads.
Modifications:
- Add new OrderedEventExecutor interface and let SingleThreadEventExecutor / EventLoop implement / extend it.
- Only expose "restriction" of skipping events until handlerAdded(...) is called for OrderedEventExecutor implementations
- Add ThreadPoolEventExecutor implementation which executes tasks in an unordered fashion. This is used in added unit test but can also be used for protocols which not expose an strict ordering.
- Add unit test.
Result:
Resurrect the possibility to implement an EventExecutor which does not enforce serial execution of events and be able to use it with the DefaultChannelPipeline.
Motivation:
DefaultPromise has a listeners member variable which is volatile to allow for an optimization which makes notification of listeners less expensive when there are no listeners to notify. However this change makes all other operations involving the listeners member variable more costly. This optimization which requires listeners to be volatile can be removed to avoid volatile writes/reads for every access on the listeners member variable.
Modifications:
- DefaultPromise listeners is made non-volatile and the null check optimization is removed
Result:
DefaultPromise.listeners is no longer volatile.
Motivation:
The logging statements in i.n.u.c.DefaultPromise do not emit the
caught Throwable when a Throwable is thrown while a listener is being
notified of completed or progressed operations.
Modifications:
This issue arises because the logging message has a single placeholder
but is passing two additional arguments, the second one being the
caught Throwable that is thus quietly not logged. We address this by
modifying the logging statements to ensure the caught Throwable is
logged. In this case, the preferred approach is to use the logger
override that accepts a message and a Throwable parameter since logger
implementations might have special handling for this case.
Result:
Log messages from i.n.u.c.DefaultPromise when a Throwable is thrown
while notifying a listener of completed or progressed operations will
contain the caught Throwable.
Motivation:
A race detector found that DefaultPromise.listeners is improperly synchronized [1].
Worst case a listener will not be executed when the promise is completed.
Modifications:
Make DefaultPromise.listeners a volatile.
Result:
Hopefully, DefaultPromise is more correct under concurrent execution.
[1] https://github.com/grpc/grpc-java/issues/2015
Motivation:
AsciiString.hashCode(o) , if "o" is a subString, the hash code is not always same, when netty’s version is 4.1.1.Final and jdk’s version is 1.6.
Modifications:
Use a test to assert hash codes are equal between a new string and any sub string (a part of a char array),If their values are equal.
Result:
Create a test method to AsciiStringCharacterTest.
Motivation:
Currently in the single threaded and global event executors when the scheduled task queue is drained, there is a call to hasScheduledTasks(). If there are scheduled tasks then the the code polls the queue for tasks. The poll method duplicates the exact logic of hasScheduledTasks(). This involves two calls to nanoTime when one seems sufficient.
Modifications:
Directly poll the queue for tasks and break if the task returned is null.
Result:
Should be no noticeable impact on functionality. Two calls to nanoTime have been coarsened into a single call.
Motivation:
Project Jigsaw in JDK9 has moved the direct byte buffer cleaner from
sun.misc.Cleaner to java.lang.ref.Cleaner$Cleanable. This cause the
current platform tests to throw a ClassNotFoundException, disabling the
use of direct byte buffer cleaners.
Modifications:
I use reflection to find the clean method in either sun.misc.Cleaner or
java.lang.ref.Cleaner$Cleanable.
Result:
Netty uses direct byte buffers on JDK9 as it already do on earlier JDKs.
Motivation:
In JDK9 heap byte buffers have an address field, so we have to remove
the current check as it is invalid in JDK9.
Modifications:
Removed the address field check for heap byte buffers.
Result:
Netty continues to find sun.misc.Unsafe in JDK9 as in previous JDKs.
Motivation:
Netty's platform dependent parts should know about JDK9.
Modifications:
JDK9 introduce Runtime$Version Runtime.version() which has an int major()
method that always return the major Java version. I call that method to
get the Java major version.
Result:
Netty will recognize all future JDK versions.
Motivation:
HPACK Encoder has a data structure which is similar to a previous version of DefaultHeaders. Some of the same improvements can be made.
Motivation:
- Enforce the restriction that the Encoder's headerFields length must be a power of two so we can use masking instead of modulo
- Use AsciiString.hashCode which already has optimizations instead of having yet another hash code algorithm in Encoder
Result:
Fixes https://github.com/netty/netty/issues/5357
Motivation:
PlatformDependent attempts to use reflection to get the underlying char[] (or byte[]) from String objects. This is fragile as if the String implementation does not utilize the full array, and instead uses a subset of the array, this optimization is invalid. OpenJDK6 and some earlier versions of OpenJDK7 String have the capability to use a subsection of the underlying char[].
Modifications:
- PlatformDependent should not attempt to use the underlying array from String (or other data types) via reflection
Result:
PlatformDependent hash code generation for CharSequence does not depend upon specific JDK implementation details.
Motivation:
We recently added the ResourceLeakDetectorFactory but missed to updated HashedWheelTimer to use it.
Modifications:
- Add new abstract method to ResourceLeakDetectorFactory that allows to provide also samplingInterval and maxActive args.
- Deprecate most constructors in ResourceLeakDetector and add doc explaining that people should use ResourceLeakDetectorFactory
Result:
Custom ResourceLeakDetectorFactory will also be used in HashedWheelTimer if configured.
Motivation:
Sometimes a shared HashedWheelTimer can not easily be stopped in a good place. If the worker thread is daemon this is not a big deal and we should allow to not log a leak.
Modifications:
Add another constructor which allows to disable resource leak detection if worker thread is used.
Result:
Not log resource leak when HashedWheelTimer is not stopped and the worker thread is a deamon thread.
Motivation:
Some Netty use cases may want to configure the max allowed stack depth for promise listener notification.
Modifications:
- Add a system property so that this value can be configured.
Result:
DefaultPromise's max stack depth is configurable.
Motivation:
PR #5355 modified interfaces to reduce GC related to the HPACK code. However this came with an anticipated performance regression related to HpackUtil.equals due to AsciiString's increase cost of charAt(..). We should mitigate this performance regression.
Modifications:
- Introduce an equals method in PlatformDependent which doesn't leak timing information and use this in HpcakUtil.equals
Result:
Fixes https://github.com/netty/netty/issues/5436
Motivation:
It is good to have used dependencies and plugins up-to-date to fix any undiscovered bug fixed by the authors.
Modification:
Scanned dependencies and plugins and carefully updated one by one.
Result:
Dependencies and plugins are up-to-date.
Motiviation:
Sometimes it is useful to allow to specify a custom strategy to handle rejected tasks. For example if someone tries to add tasks from outside the eventloop it may make sense to try to backoff and retries and so give the executor time to recover.
Modification:
Add RejectedEventExecutor interface and implementations and allow to inject it.
Result:
More flexible handling of executor overload.
Motivation:
To restrict the memory usage of a system it is sometimes needed to adjust the number of max pending tasks in the tasks queue.
Modifications:
- Add new constructors to modify the number of allowed pending tasks.
- Add system properties to configure the default values.
Result:
More flexible configuration.
Motivation:
We should merge ThrowableUtils into ThrowableUtil as this name is more consistent with the naming of utility classes in netty.
Modifications:
Merge classes.
Result:
More consistent naming
Motivation:
We use pre-instantiated exceptions in various places for performance reasons. These exceptions don't include a stacktrace which makes it hard to know where the exception was thrown. This is especially true as we use the same exception type (for example ChannelClosedException) in different places. Setting some StackTraceElements will provide more context as to where these exceptions original and make debugging easier.
Modifications:
Set a generated StackTraceElement on these pre-instantiated exceptions which at least contains the origin class and method name. The filename and linenumber are specified as unkown (as stated in the javadocs of StackTraceElement).
Result:
Easier to find the origin of a pre-instantiated exception.
Allow users of Netty to plug in their own leak detector for the purpose
of instrumentation.
Motivation:
We are rolling out a large Netty deployment and want to be able to
track the amount of leaks we're seeing in production via custom
instrumentation. In order to achieve this today, I had to plug in a
custom `ByteBufAllocator` into the bootstrap and have it initialize a
custom `ResourceLeakDetector`. Due to these classes mostly being marked
`final` or having private or static methods, a lot of the code had to
be copy-pasted and it's quite ugly.
Modifications:
* I've added a static loader method for the `ResourceLeakDetector` in
`AbstractByteBuf` that tries to instantiate the class passed in via the
`-Dio.netty.customResourceLeakDetector`, otherwise falling back to the
default one.
* I've modified `ResourceLeakDetector` to be non-final and to have the
reporting broken out in to methods that can be overridden.
Result:
You can instrument leaks in your application by just adding something
like the following:
```java
public class InstrumentedResourceLeakDetector<T> extends
ResourceLeakDetector<T> {
@Monitor("InstanceLeakCounter")
private final AtomicInteger instancesLeakCounter;
@Monitor("LeakCounter")
private final AtomicInteger leakCounter;
public InstrumentedResourceLeakDetector(Class<T> resource) {
super(resource);
this.instancesLeakCounter = new AtomicInteger();
this.leakCounter = new AtomicInteger();
}
@Override
protected void reportTracedLeak(String records) {
super.reportTracedLeak(records);
leakCounter.incrementAndGet();
}
@Override
protected void reportUntracedLeak() {
super.reportUntracedLeak();
leakCounter.incrementAndGet();
}
@Override
protected void reportInstancesLeak() {
super.reportInstancesLeak();
instancesLeakCounter.incrementAndGet();
}
}
```
Motivation:
DefaultThreadFactory allows to override the newThread(...) method and so should have access to all fields that are set via the constructor.
Modifications:
Change threadGroup from private to protected visibility.
Result:
Easier to extend DefaultThreadFactory.
Motivation:
In case of exception in invokeExceptionCaught() only original exception passed to invokeExceptionCaught() will be logged on any log level.
+ AbstractChannelHandlerContext and CombinedChannelDuplexHandler log different exceptions.
Modifications:
Fix inconsistent logging code and add ability to see both stacktraces on DEBUG level.
Result:
Both handlers log now both original exception and thrown from invokeExceptionCaught. To see full stacktrace of exception thrown from invokeExceptionCaught DEBUG log level must be enabled.
Motivation:
JCTools supports both non-unsafe, unsafe versions of queues and JDK6 which allows us to shade the library in netty-common allowing it to stay "zero dependency".
Modifications:
- Remove copy paste JCTools code and shade the library (dependencies that are shaded should be removed from the <dependencies> section of the generated POM).
- Remove usage of OneTimeTask and remove it all together.
Result:
Less code to maintain and easier to update JCTools and less GC pressure as the queue implementation nt creates so much garbage
Motivation:
The javaDocs for Future.removeListener do not clarify that only the first occurrence of the listener is guaranteed to be removed.
Modifications:
- Clarify the javaDocs for Future.removeListener[s] so it is known that the only the first occurrence of the listener will be removed.
Result:
Fixes https://github.com/netty/netty/issues/5351
Motivation:
Sometimes it may be benefitially for an user to specify a custom algorithm when choose the next EventExecutor/EventLoop.
Modifications:
Allow to specify a custom EventExecutorChooseFactory that allows to customize algorithm.
Result:
More flexible api.
Motivation:
We tried to always use SecureRandom to generate the initialSeed for our ThreadLocalRandom, this can sometimes give warnings under normal usage. We should better not use SecureRandom as default (just as the implementation in jsr166y does) and only try if the user specified -Djava.util.secureRandomSeed=true .
Modifications:
Only try to use SecureRandom when -Djava.util.secureRandomSeed=true is used.
Result:
Less likely to see entropy warnings.
Motivation:
If the user uses unsafe direct buffers with no cleaner we can use Unsafe.reallocateMemory(...) as optimization when we need to expand the buffer.
Modifications:
Use Unsafe.relocateMemory(...) in UnpooledUnsafeNoCleanerDirectByteBuf.
Result:
Less expensive expanding of buffers.
Motivation:
Using the Cleaner to release the native memory has a few drawbacks:
- Cleaner.clean() uses static synchronized internally which means it can be a performance bottleneck
- It put more load on the GC
Modifications:
Add new buffer implementations that can be enabled with a system flag as optimizations. In this case no Cleaner is used at all and the user must ensure everything is always released.
Result:
Less performance impact by direct buffers when need to be allocated and released.
Modifications:
DefaultPromise provides a ThreadLocal queue to protect against StackOverflowError because of executors which may immediately execute runnables instead of queue them (i.e. ImmediateEventExecutor). However this may be better addressed by fixing these executors to protect against StackOverflowError instead of just fixing for a single use case. Also the most commonly used executors already provide the desired behavior and don't need the additional overhead of a ThreadLocal queue in DefaultPromise.
Modifications:
- Remove ThreadLocal queue from DefaultPromise
- Change ImmediateEventExecutor so it maintains a queue of runnables if reentrant condition occurs
Result:
DefaultPromise StackOverflowError code is simpler, and ImmediateEventExecutor protects against StackOverflowError.
Motivation:
Under very unlikely (however possible) circumstances, Recycler may leak
references. This happens _only_ when the object was already recycled
at least once (which means it's got written to the stack) and then
taken out again, and never returned.
The "never returned" part may be the fault of the user (forgotten
`finally` clause) or the situation when Recycler drops the possibly
youngest item itself.
Modifications:
Nullify the item taken from the stack.
Result:
Reference is cleaned up. If the object is lost, it will be a subject for
GC. The rest of Stack / Recycler functionality remains unaffected.
Motivation:
Sometimes people may want to trade GC with memory overhead. For this it can be useful to allow to change the capacity of the array that is hold in the Link that is used by the Recycler internally.
Modifications:
Introduce a new system property , io.netty.recycler.linkCapacity which allows to change the capcity.
Result:
More flexible configuration of netty.
Motivation:
Currenlty, netty-transport-native-epoll-*-linux-x86_64.jar is not packed as OSGi bundle
and thus not working in OSGi environment.
Modifications:
In netty-transport-native-epoll's pom.xml added configuration
to attach manifest to the jar with a native library.
In netty-common's pom.xml added configuration instruction (DynamicImport-Package)
to maven bnd plugin to make sure the native code is loaded from
netty-transport-native-epoll bundle.
Result:
The netty-transport-native-epoll-*-linux-x86_64.jar is a bundle (MANIFEST.MF attached)
and the inluced native library can be successfuly loaded in OSGi environment.
Fixing #5119
Motivation:
Unsafe offers a method to set memory to a specific value. This can be used to implement an optimized version of setZero(...) and writeZero(...)
Modifications:
Add implementation for all Unsafe*ByteBuf implementations.
Result:
Faster setZero(...) and writeZero(...)
Motivation:
SingleThreadEventExecutor.pendingTasks() will call taskQueue.size() to get the number of pending tasks in the queue. This is not safe when using MpscLinkedQueue as size() is only allowed to be called by a single consumer.
Modifications:
Ensure size() is only called from the EventLoop.
Result:
No more livelock possible when call pendingTasks, no matter from which thread it is done.
Motivation:
f2ed3e6ce8 removed the previous mechanism for StackOverflowError because it didn't work in all cases (i.e. ImmediateExecutor). However if a chain of listeners which complete other promises is formed there is still a possibility of a StackOverflowError.
Modifications:
- Use a ThreadLocal to save any DefaultPromises which could not be notified due to the stack being too large. After the first DefaultPromise on the stack completes notification this ThreadLocal should be used to notify any DefaultPromises which have not yet been notified.
Result:
DefaultPromise has StackOverflowError protection that works with all EventExecutor types.
Motivation:
If the executor changes while listeners are added and notification of listeners is being done then listeners can be notified out of order and concurrently. We should ensure that only one executor is used at any given time to notify listeners and ensure the listeners are notified in FIFO order.
Modifications:
- Move the notifyingListeners member variable from DefaultPromise into the synchronized block to prevent concurrent notification of listeners and preserve FIFO notification order
Result:
If the executor is changed for a DefaultPromise the listener notification order should be FIFO.
Motivation:
Recycler.recycle(...) should not be used anymore and be replaced by Handle.recycle().
Modifications:
Mark it as deprecated and update usage.
Result:
Correctly document deprecated api.
Motivation:
DomainMappingBuilder should have been named as DomainNameMappingBuilder
because it builds a DomainNameMapping.
Modifications:
- Add DomainNameMappingBuilder that does the same job with
DomainMappingBuilder
- Deprecate DomainMappingBuilder and delegate its logic to
DomainNameMappingBuilder
- Remove the references to the deprecated methods and classes related
with domain name mapping
- Miscellaneous:
- Fix Javadoc of DomainNameMapping.asMap()
- Pre-create the unmodifiable map in DomainNameMapping
Result:
- Consistent naming
- Less use of deprecated API
Motivation:
DomainNameMapping.entries() returns Set<Map.Entry<String, V>>, which
doesn't sound very natural.
Modifications:
Replace entries() with asMap() which returns a Map<String, V> instead.
Result:
- Better looking API
- User can do a lookup because it's a Map
Motivation:
As reported in #4211, when using Netty in Tomcat (or other container based deployment), ForkJoinPool leaks an instance of `Submitter` so that the class loader of `Submitter` won't be GCed. However, since `Submitter` is just a wrapper of `int`, we can replace it with `int[1]`.
Modifications:
Replace `Submitter` with `int[1]`.
Result:
No class loader leak in ForkJoinPool when using in a container.
Motivation:
See #4200.
Modifications:
Add DomainNameMapping.entries to allow retrieving the domain match lists.
Result:
People can use DomainNameMapping.entries to retrive the match list in DomainNameMapping.
Motivation:
EventExecutor.children uses generics in such a way that an entire colleciton must be cast to a specific type of object. This interface is not very flexible and is impossible to implement if the EventExecutor type must be wrapped. The current usage of this method also does not have any clear need within Netty. The Iterator interface allows for EventExecutor to be wrapped and forces the caller to make assumptions about types instead of building the assumptions into the interface.
Motivation:
- Remove EventExecutor.children and undeprecate the iterator() interface
Result:
EventExecutor interface has one less method and is easier to wrap.
Motivation:
KObjectHashMap.remove(int index) attempts to move back items which may have been displaced because their spot in the hash based array was taken by another item. If this happens the nextIndex reference in PrimitiveIterator will not be updated. At this time the PrimitiveEntry will reference the incorrect index and may result in a NPE.
Modifications:
- If KObjectHashMap.remove(int index) moves entries back then PrimitiveIterator should adjust its nextIndex
Result:
PrimitiveIterator.remove() updates its internal state to reference the new nextIndex and will not NPE.
Fixes https://github.com/netty/netty/issues/5198
Motivation:
KObjectHashMap.probeNext(..) usually involves 2 conditional statements and 2 aritmatic operations. This can be improved to have 0 conditional statements.
Modifications:
- Use bit masking to avoid conditional statements
Result:
Improved performance for KObjecthashMap.probeNext(..)
Motivation:
The DefaultHttp2Conneciton.close method accounts for active streams being iterated and attempts to avoid reentrant modifications of the underlying stream map by using iterators to remove from the stream map. However there are a few issues:
- While iterating over the stream map we don't prevent iterations over the active stream collection
- Removing a single stream may actually remove > 1 streams due to closed non-leaf streams being preserved in the priority tree which may result in NPE
Preserving closed non-leaf streams in the priority tree is no longer necessary with our current allocation algorithms, and so this feature (and related complexity) can be removed.
Modifications:
- DefaultHttp2Connection.close should prevent others from iterating over the active streams and reentrant modification scenarios which may result from this
- DefaultHttp2Connection should not keep closed stream in the priority tree
- Remove all associated code in DefaultHttp2RemoteFlowController which accounts for this case including the ReducedState object
- This includes fixing writability changes which depended on ReducedState
- Update unit tests
Result:
Fixes https://github.com/netty/netty/issues/5198
Motivation:
The LateListener logic is prone to infinite loops and relies on being processed in the EventExecutor's thread for synchronization, but this EventExecutor may not be constant. An infinite loop can occur if the EventExecutor's execute method does not introduce a context switch in LateListener.run. The EventExecutor can be changed by classes which inherit from DefaultPromise. For example the DefaultChannelPromise will return w/e EventLoop the channel is registered to, but this EventLoop can change (re-registration).
Modifications:
- Remove the LateListener concept and instead use a single Object to maintain the listeners while still preserving notification order
- Make the result member variable an atomic variable so it can be outside the synchronized(this) blocks
- Cleanup/simplify existing state management code
Result:
Fixes https://github.com/netty/netty/issues/5185
Motivation:
Currently the default log level when running tests is debug. When
running the build on the CI server it might be nice to avoid this debug
level and allow for the level to be configured.
Modifications:
Added a logback-test.xml configuration that has been added to the
common module. This allows for the logLevel to be configured.
The default level will still be debug.
Result:
The log level can now be configured from the command line:
$ mvn test -DlogLevel=error
Motivation:
Some codecs should be considered unstable as these are relative new. For this purpose we should introduce an annotation which these codecs should us to be marked as unstable in terms of API.
Modifications:
- Add UnstableApi annotation and use it on codecs that are not stable
- Move http2.hpack to http2.internal.hpack as it is internal.
Result:
Better document unstable APIs.
Motivation:
It's better to make all InternalLoggerFactory implementations be singletons according to the discussions in #5047
Modifications:
Make all InternalLoggerFactory implementations be singletons and hide the construtors.
Result:
All InternalLoggerFactory implementations be singletons.
Motivation:
Fixes#5084. We (gRPC) encountered a bug that was triggered by
grpc/grpc-java@d927180. After that commit, event loop threads are
created per task by NioEventLoopGroup, and inherits the thread group of
the caller, which in our case is an application-provided request-scope
thread. Things go south when the application tries to manipulate (e.g.,
interrupt and join) all threads of the request-scope thread group, which
unexpectedly include the event loop threads.
Modifications:
DefaultThreadFactory will save the current thread group in constructor,
and apply it to all new threads.
Result:
Threads created by DefaultThreadFactory will be in the same thread group
as the thread where the factory is created.
Related: #3449
Motivation:
When a user shut down an EventExecutor/Loop prematurely, a Promise will
fail to execute its listeners. When it happens, DefaultPromise will log
a message at ERROR level, but there's no way to get notified about it
programmatically.
Modifications:
Do not catch and log the RejectedExecutionException unconditionally,
but only catch and log for non-late listener notifications, so that a
user gets notified on submission failure at least when the listener is
late.
Result:
Remedies #3449 to some extent, although we will need fundamental fix for
that, such as #3566
Motivation:
See #3095
Modifications:
Add Log4J2LoggerFactory and Log4J2Logger which is an InternalLogger implementation based on log4j2.
Result:
The user can use log4j2 directly without a special slf4j binding.
Motivation:
Under high throughput/low latency workloads, selector wakeups are
degrading performance when the incoming operations are triggered
from outside of the event loop. This is a common scenario for
"client" applications where the originating input is coming from
application threads rather from the socket attached inside the
event loops.
As a result, it can be desirable to defer the blocking select
so that incoming tasks (write/flush) do not need to wakeup
the selector.
Modifications:
This changeset adds the notion of a generic SelectStrategy which,
based on its contract, allows the implementation to optionally
defer the blocking select based on some custom criteria.
The default implementation resembles the original behaviour, that
is if tasks are in the queue `selectNow()` and move on, and if no
tasks need to be processed go into the blocking select and wait
for wakeup.
The strategy can be customized per `NioEventLoopGroup` in the
constructor.
Result:
High performance client applications are now given the chance to
customize for how long the actual selector blocking should be
deferred by employing a custom select strategy.
Motivation:
The current slow path of FastThreadLocal is much slower than JDK ThreadLocal. See #4418
Modifications:
- Add FastThreadLocalSlowPathBenchmark for the flow path of FastThreadLocal
- Add final to speed up the slow path of FastThreadLocal
Result:
The slow path of FastThreadLocal is improved.
Motivation:
It should be possible to disable the Recycler with -Dio.netty.recycler.maxCapacity=0, but because of a typo this is not the case.
Modifications:
Replace <= with < to make it posible to disable the Recycler.
Result:
Correct behaviour when using -Dio.netty.recycler.maxCapacity=0
Motivation:
Fix a typo in the log message of the static initializer of Recycler.
Modifications:
Fix typo.
Result:
Correctly log system property io.netty.recycler.maxCapacity.
Motivation:
We want to allow the use of an uber jar that contains shared dynamic libraries for all platforms (including fedora).
Modifications:
Modified OpenSsl to try and load the fedora library if the OS is Linux and the platform specified library fails before using the default lib.
Result:
True uber support.
Motivation:
PromiseAggregator's API allows for the aggregate promise to complete before the user is done adding promises. In order to support this use case the API structure would need to change in a breaking manner.
Modifications:
- Deprecate PromiseAggregator and subclasses
- Introduce PromiseCombiner which corrects these issues
Result:
PromiseCombiner corrects the deficiencies in PromiseAggregator.
Motivation:
A custom SecurityManager may prevent calling File.exists() and so throw a SecurityException in the static init block of NetUtil.
Modifications:
Correctly catch the exception and so allow to static init NetUtil.
Result:
Allow static init method of NetUtil to work even with custom SecurityManager.
Motivation:
See #3321
Modifications:
1. Add CharsetUtil.encoder/decoder() methods
2. Deprecate CharsetUtil.getEncoder/getDecoder() methods
Result:
Users can use new CharsetUtil.encoder/decoder() to specify error actions
Motivation:
See #3746.
Modifications:
Fork SpscLinkedQueue and SpscLinkedAtomicQueue from JCTools based on 7846450e28
Result:
Add SpscLinkedQueue and SpscLinkedAtomicQueue and apply it in LocalChannel.
Motivation:
If the Future that the PromiseNotifier is listening to is cancelled, it does not propagate the cancel to all the promises it is expected to notify.
Modifications:
- If the future is cancelled then all the promises should be cancelled
- Add a UnaryPromiseNotifier if a collection of promises is not necessary
Result:
PromiseNotifier propagates cancel events to all promises
Motivation:
See #4855
Modifications:
Unfortunately, unescapeCsv cannot be used here because the input could be a CSV line like `"a,b",c`. Hence this patch adds unescapeCsvFields to parse a CSV line and split it into multiple fields and unescaped them. The unit tests should define the behavior of unescapeCsvFields.
Then this patch just uses unescapeCsvFields to implement `CombinedHttpHeaders.getAll`.
Result:
`CombinedHttpHeaders.getAll` will return the unescaped values of a header.
Motivation:
ResourceLeakDetector enforces a limit as to how large the queue is allowed to grow for stack traces in order to keep memory from growing too large. However it is not always clear when records are dropped, or how many have been dropped. This can make interpreting leak reports more difficult if you assume all information is present when it may not be. Also we should increase the limit (currently 4) when running tests on the CI servers.
Modifications:
- Increase leak detector record limit on CI servers from 4 to 32.
- Track how many records have been discarded and disclose this in the leak report.
Result:
Leak reports clarify how many records were dropped, and how to increase the limit.
Motivation:
The implementation of obtaining the best possible mac address is very good. There are many sub-par implementations proposed on stackoverflow.
While not strictly a netty concern, it would be nice to offer this util also to netty users.
Modifications:
extract DefaultChannelId#defaultMachineId code obtaining the "best" mac into a new helper called MacAddress, keep the random bytes fallback in DefaultChannelID.
Result:
New helper available.
Motivation:
See #3411. A reusable ArrayList in InternalThreadLocalMap can avoid allocations in the following pattern:
```
List<...> list = new ArrayList<...>();
add something to list but never use InternalThreadLocalMap
return list.toArray(new ...[list.size()]);
```
Modifications:
Add a reusable ArrayList to InternalThreadLocalMap and update codes to use it.
Result:
Reuse a thread local ArrayList to avoid allocations.
Motivation:
PlatformDependent0 has an optimization which grabs the char[] from a String. Since this code was introduced http://openjdk.java.net/jeps/254 has been gaining momentum in JDK 9. This JEP changes the internal storage from char[] to byte[], and thus the existing char[] only based optimizations will not work.
Modifications:
- The ASCII encoding char[] String optimizations should also work for byte[].
Result:
ASCII encoding char[] String optimizations don't break if the underlying storage in String is byte[].
Motivation:
In AsciiString.trim, last should be `arrayOffset() + length() - 1`. See #4741.
Modifications:
Fix the last value.
Result:
AsciiString.trim works correctly.
Motivation:
PlatformDependent allows some exceptions to escape during static initialization. If an exception escapes it will be translated into a java.lang.ExceptionInInitializerError and render the application unable to run.
Modifications:
- Make sure to catch Throwable during static initialization.
Result:
PlatformDependent static initialization doesn't result in java.lang.ExceptionInInitializerError.
Motivation:
Sometimes a user want to do async mappings in the SniHandler as it is not possible to populate a Mapping up front.
Modifications:
Add AsyncMapping interface and make SniHandler work with it.
Result:
It is possible to do async mappings for SNI
Motivation:
HttpConversionUtil.toHttp2Headers currently has a throws Exception as part of the signature. This comes from the signature of ByteProcessor.process, but is not necessary because the ByteProcessor used does not throw.
Modifications:
- Remove throws Exception from the signature of HttpConversionUtil.toHttp2Headers.
Result:
HttpConversionUtil.toHttp2Headers interface does not propagate a throws Exception when it is used.
Motivation:
Caching is currently nested in DnsResolver.
It should also be possible to extend DnsResolver to ba able to pass a different cache on each resolution attemp.
Modifications:
* Introduce DnsCache, NoopDnsCache and DefaultDnsCache. The latter contains all the current caching logic that was extracted.
* Introduce protected versions of doResolve and doResolveAll that can be used as extension points to build resolvers that bypass the main cache and use a different one on each resolution.
Result:
Isolated caching logic. Better extensibility.
Motivation:
* newAtomicIntegerFieldUpdater and newAtomicLongFieldUpdater take a
class<?> so they're too lax
* newAtomicReferenceFieldUpdater takes a Class<U> so it's too strict
and can only be passe a rawtype parameter when dealing w/ generic
classes
Modifications:
Take a Class<? super T> parameter instead.
Result:
Better type safety and generics support.
Motivation:
Warnings in IDE, unclean code, negligible performance impact.
Modification:
Deletion of unused imports
Result:
No more warnings in IDE, cleaner code, negligible performance improvement.
Motivation:
Javadoc reports errors about invalid docs.
Modifications:
Fix some errors reported by javadoc.
Result:
A lot of javadoc errors are fixed by this patch.
Motivation:
There are some wrong links and tags in javadoc.
Modifications:
Fix the wrong links and tags in javadoc.
Result:
These links will work correctly in javadoc.
Motivation:
DomainNameMapping.add() makes DomainNameMapping look like it's safe to call add() anytime, and this is never true. It's probably better deprecate add() and introduce DomainNameMappingBuilder.
Modifications:
Made an immutable implementation of DomainNameMapping;
Added Builder for immutable DomainNameMapping;
Replaced regex pattern with String::startsWith check;
Replaced HashMap with two arrays in ImmutableDomainNameMapping;
Deprecated mutable API;
Estimation for StringBuilder initial size in ImmutableDomainNameMapping#toString()
Added StringUtil#commonSuffixOfLength
Replaced unnecessary substrings creation in DomainNameMapping#matches with regionMatches
Result:
Clients will be able to create immutable instances of DomainNameMapping with builder API.
Motivation:
UTF-16 can not represent the full range of Unicode characters, and thus has the concept of Surrogate Pair (http://unicode.org/glossary/#surrogate_pair) where 2 16-bit code units can be used to represent the missing characters. ByteBufUtil.writeUtf8 is currently does not support this and is thus incomplete.
Modifications:
- Add support for surrogate pairs in ByteBufUtil.writeUtf8
Result:
ByteBufUtil.writeUtf8 now supports surrogate pairs and is correctly converting to UTF-8.
Motivation:
PriorityStreamByteDistributor uses a homegrown algorithm which distributes bytes to nodes in the priority tree. PriorityStreamByteDistributor has no concept of goodput which may result in poor utilization of network resources. PriorityStreamByteDistributor also has performance issues related to the tree traversal approach and number of nodes that must be visited. There also exists some more proven algorithms from the resource scheduling domain which PriorityStreamByteDistributor does not employ.
Modifications:
- Introduce a new ByteDistributor which uses elements from weighted fair queue schedulers
Result:
StreamByteDistributor which is sensitive to priority and uses a more familiar distribution concept.
Fixes https://github.com/netty/netty/issues/4462
Motivation:
On contrary to `DefaultNameResolver`, `DnsNameResolver` doesn't currently honor hosts file.
Modifications:
* Introduce `HostsFileParser` that parses `/etc/hosts` or `C:\Windows\system32\drivers\etc\hosts` depending on the platform
* Introduce `HostsFileEntriesResolver` that uses the former to resolve host names
* Make `DnsNameResolver` check his `HostsFileEntriesResolver` prior to trying to resolve names against the DNS server
* Introduce `DnsNameResolverBuilder` so we now have a builder for `DnsNameResolver`s
* Additionally introduce a `CompositeNameResolver` that takes several `NameResolver`s and tries to resolve names by delegating sequentially
* Change `DnsNameResolver.asAddressResolver` to return a composite and honor hosts file
Result:
Hosts file support when using `DnsNameResolver`.
Consistent behavior with JDK implementation.
Motivation:
The KObjectHashMapTest is in a directory called "io.netty.util.collection" rather than "io/netty/util/collection". This causes the generated tests to be created in the wrong directory as well.
Modifications:
Moved the file.
Result:
Fixes#4546
Motivation:
Related to issue #4564.
AsciiString.contentEqualsIgnoreCase fails when comparing two AsciiStrings of the same length
Modifications:
Compare the values of the first AsciiString to the second AsciiString
Result:
AsciiString.contentEqualsIgnoreCase works as expected
Motivation:
DefaultPromiseTest has dead code which was left over from a code restructure. Shared code between 2 tests was moved into a common method, but some code which was not cleaned up in each of these methods after the code was moved.
Modifications:
- Delete dead code in DefaultPromiseTest
Result:
Less dead code
Motivation:
AbstractFuture currently wraps CancellationException in a ExecutionException. However the interface of Future says that this exception should be directly thrown.
Modifications:
- Throw CancellationException from AbstractFuture.get
Result:
Interface contract for CancellationException is honored in AbstractFuture.
Motivation:
There is a notification ordering issue in DefaultPromise when the lateListener collection is in use. The ordering issue can be observed in situations where a late listener is added to a Future returned from a write operation. It is possible that this future will run after a read operation scheduled on the I/O thread, even if the late listener is added on the I/O thread. This can lead to unexpected ordering where a listener for a write operation which must complete in order for the read operation to happen is notified after the read operation is done.
Modifications:
- If the lateListener collection becomes empty, it should be treated as though it was null when checking if lateListeners can be notified immediatley (instead of executing a task on the executor)
Result:
Ordering is more natural and will not be perceived as being out of order relative to other tasks on the same executor.
Motivation:
HttpHeaders already has specific methods for such popular and simple headers like "Host", but if I need to convert POST raw body to string I need to parse complex ContentType header in my code.
Modifications:
Add getCharset and getCharsetAsString methods to parse charset from Content-Length header.
Result:
Easy to use utility method.
Motivation:
The AsciiString.hashCode() method can be optimized. This method is frequently used while to build the DefaultHeaders data structure.
Modification:
- Add a PlatformDependent hashCode algorithm which utilizes UNSAFE if available
Result:
AsciiString hashCode is faster.
Motivation:
If netty used as part of application, should be a way to prefix service thread name to easy distinguish such threads (for example, used in IntelliJ Platform)
Modifications:
Introduce system property io.netty.serviceThreadPrefix
Result:
ThreadDeathWatcher thread has a readable name "Netty threadDeathWatcher-2-1" if io.netty.serviceThreadPrefix set to "Netty"
Motivation:
The HTTP/2 RFC (https://tools.ietf.org/html/rfc7540#section-8.1.2) indicates that header names consist of ASCII characters. We currently use ByteString to represent HTTP/2 header names. The HTTP/2 RFC (https://tools.ietf.org/html/rfc7540#section-10.3) also eludes to header values inheriting the same validity characteristics as HTTP/1.x. Using AsciiString for the value type of HTTP/2 headers would allow for re-use of predefined HTTP/1.x values, and make comparisons more intuitive. The Headers<T> interface could also be expanded to allow for easier use of header types which do not have the same Key and Value type.
Motivation:
- Change Headers<T> to Headers<K, V>
- Change Http2Headers<ByteString> to Http2Headers<CharSequence, CharSequence>
- Remove ByteString. Having AsciiString extend ByteString complicates equality comparisons when the hash code algorithm is no longer shared.
Result:
Http2Header types are more representative of the HTTP/2 RFC, and relationship between HTTP/2 header name/values more directly relates to HTTP/1.x header names/values.
Motivation:
DefaultPromise.toString() returns 'DefaultPromise(incomplete)' when it's
actually complete with non-null result.
Modifications:
Handle the case where the promise is done and its result is non-null in
toString()
Result:
The String returned by DefaultPromise.toString() is not confusing
anymore.
Motivation:
Modulo operations are slow, we can use bitwise operation to detect if resource leak detection must be done while sampling.
Modifications:
- Ensure the interval is a power of two
- Use bitwise operation for sampling
- Add benchmark.
Result:
Faster sampling.
Motivation:
When the ImmediateEventExecutor is in use it is possible to get a StackOverFlowException if when a promise completes a new listener is added to that promise.
Modifications:
- Protect against the case where LateListeners.run() smashes the stack.
Result:
Fixes https://github.com/netty/netty/issues/4395
Motivation:
sun.misc.Unsafe allows us to handle heap ByteBuf in a more efficient matter. We should use special ByteBuf implementation when sun.misc.Unsafe can be used to increase performance.
Modifications:
- Add PooledUnsafeHeapByteBuf and UnpooledUnsafeHeapByteBuf that are used when sun.misc.Unsafe is ready to use.
- Add UnsafeHeapSwappedByteBuf
Result:
Better performance when using heap buffers and sun.misc.Unsafe is ready to use.
Motivation:
We had a bug in our implemention which double "reversed" bytes on systems which not support unaligned access.
Modifications:
- Correctly only reverse bytes if needed.
- Share code between unsafe implementations.
Result:
No more data-corruption on sytems without unaligned access.
Motivation:
We started the thread before store it in a field which could lead to an assert error when the thread is executed before we actually store it.
Modifications:
Store thread before start it.
Result:
No more assert error possible.
Motivation:
At the moment we only forward decoded messages that were added the out List once the full decode loop was completed. This has the affect that resources may not be released as fast as possible and as an application may incounter higher latency if the user triggeres a writeAndFlush(...) as a result of the decoded messages.
Modifications:
- forward decoded messages after each decode call
Result:
Forwarding decoded messages through the pipeline in a more eager fashion.
Motivation:
When dealing with case insensitive headers it can be useful to have a case insensitive contains method for CharSequence.
Modifications:
- Add containsCaseInsensative to AsciiString
Result:
More expressive utility method for case insensitive CharSequence.
Motivation:
Http2CodecUtils has some static variables which are defined as Strings instead of CharSequence. One of these defines is used as a header name and should be AsciiString.
Modifications:
- Change the String defines in Http2CodecUtils to CharSequence
Result:
Types are more consistently using CharSequence and adding the upgrade header will require less work.
Motivation:
Leak detector, when it detects a leak, will print the last 5 stack
traces that touched the ByteBuf. In some cases that might not be enough
to identify the root cause of the leak.
Also, sometimes users might not be interested in tracing all the
operations on the buffer, but just the ones that are affecting the
reference count.
Modifications:
Added command line properties to override default values:
* Allow to configure max number of stack traces to collect
* Allow to only record retain/release operation on buffers
Result:
Users can increase the number of stack traces to debug buffer leaks
with lot of retain/release operations.
Motivation:
for debugging and metrics reasons its sometimes useful to be able to get details of the the Thread that powers a SingleThreadEventExecutor.
Modifications:
- Expose ThreadProperties
- Add unit test.
Result:
It's now possible to get details of the Thread that powers a SingleThreadEventExecutor.
Motivation:
Sometimes it is useful to disable recycling completely if memory constraints are very tight.
Modifications:
Allow to use -Dio.netty.recycler.maxCapacity=0 to disable recycling completely.
Result:
It's possible to disable recycling now.
Motivation:
The javadocs are incorrect and so give false impressions of use-pattern.
Modifications:
- Fix javadocs of which operations are allowed from multiple threads concurrently.
- Let isEmpty() work concurrently.
Result:
Correctly document usage-patterns.
Motivation:
The StringUtil class creates a Formatter object, but does not close it. There are also a 2 utility methods which would be generally useful.
Modifications:
- Close the Formatter
- Add length and isNullOrEmpty
Result:
No more resource leaks. Additional utility methods.
Motivation:
A degradation in performance has been observed from the 4.0 branch as documented in https://github.com/netty/netty/issues/3962.
Modifications:
- Simplify Headers class hierarchy.
- Restore the DefaultHeaders to be based upon DefaultHttpHeaders from 4.0.
- Make various other modifications that are causing hot spots.
Result:
Performance is now on par with 4.0.
Motivation:
We noticed that the headers implementation in Netty for HTTP/2 uses quite a lot of memory
and that also at least the performance of randomly accessing a header is quite poor. The main
concern however was memory usage, as profiling has shown that a DefaultHttp2Headers
not only use a lot of memory it also wastes a lot due to the underlying hashmaps having
to be resized potentially several times as new headers are being inserted.
This is tracked as issue #3600.
Modifications:
We redesigned the DefaultHeaders to simply take a Map object in its constructor and
reimplemented the class using only the Map primitives. That way the implementation
is very concise and hopefully easy to understand and it allows each concrete headers
implementation to provide its own map or to even use a different headers implementation
for processing requests and writing responses i.e. incoming headers need to provide
fast random access while outgoing headers need fast insertion and fast iteration. The
new implementation can support this with hardly any code changes. It also comes
with the advantage that if the Netty project decides to add a third party collections library
as a dependency, one can simply plug in one of those very fast and memory efficient map
implementations and get faster and smaller headers for free.
For now, we are using the JDK's TreeMap for HTTP and HTTP/2 default headers.
Result:
- Significantly fewer lines of code in the implementation. While the total commit is still
roughly 400 lines less, the actual implementation is a lot less. I just added some more
tests and microbenchmarks.
- Overall performance is up. The current implementation should be significantly faster
for insertion and retrieval. However, it is slower when it comes to iteration. There is simply
no way a TreeMap can have the same iteration performance as a linked list (as used in the
current headers implementation). That's totally fine though, because when looking at the
benchmark results @ejona86 pointed out that the performance of the headers is completely
dominated by insertion, that is insertion is so significantly faster in the new implementation
that it does make up for several times the iteration speed. You can't iterate what you haven't
inserted. I am demonstrating that in this spreadsheet [1]. (Actually, iteration performance is
only down for HTTP, it's significantly improved for HTTP/2).
- Memory is down. The implementation with TreeMap uses on avg ~30% less memory. It also does not
produce any garbage while being resized. In load tests for GRPC we have seen a memory reduction
of up to 1.2KB per RPC. I summarized the memory improvements in this spreadsheet [1]. The data
was generated by [2] using JOL.
- While it was my original intend to only improve the memory usage for HTTP/2, it should be similarly
improved for HTTP, SPDY and STOMP as they all share a common implementation.
[1] https://docs.google.com/spreadsheets/d/1ck3RQklyzEcCLlyJoqDXPCWRGVUuS-ArZf0etSXLVDQ/edit#gid=0
[2] https://gist.github.com/buchgr/4458a8bdb51dd58c82b4
Motivation:
The HttpObjectDecoder is on the hot code path for the http codec. There are a few hot methods which can be modified to improve performance.
Modifications:
- Modify AppendableCharSequence to provide unsafe methods which don't need to re-check bounds for every call.
- Update HttpObjectDecoder methods to take advantage of new AppendableCharSequence methods.
Result:
Peformance boost for decoding http objects.
Motivation:
We should support XXXCollections methods for all primitive map types.
Modifications:
Removed PrimitiveCollections and added a template for XXXCollections.
Result:
Fixes#4001
Motivation:
It would be useful to support the Java `Map` interface in our primitive maps.
Modifications:
Renamed current methods to "pXXX", where p is short for "primitive". Made the template for all primitive maps extend the appropriate Map interface.
Result:
Fixes#3970
Motivation:
Prior we used a purge task that would remove previous canceled scheduled tasks from the internal queue. This could introduce some delay and so use a lot of memory even if the task itself is already canceled.
Modifications:
Schedule removal of task from queue via EventLoop if cancel operation is not done in the EventLoop Thread or just remove directly if the Thread that cancels the scheduled task is in the EventLoop.
Result:
Faster possibility to GC a canceled ScheduledFutureTask.
Motivation:
PoolThreadCache did only cache allocations if the allocation and deallocation Thread were the same. This is not optimal as often people write from differen thread then the actual EventLoop thread.
Modification:
- Add MpscArrayQueue which was forked from jctools and lightly modified.
- Use MpscArrayQueue for caches and always add buffer back to the cache that belongs to the allocation thread.
Result:
ThreadPoolCache is now also usable and so gives performance improvements when allocation and deallocation thread are different.
Performance when using same thread for allocation and deallocation is noticable worse then before.
Motivation:
The PooledByteBufAllocator is more or less a black-box atm. We need to expose some metrics to allow the user to get a better idea how to tune it.
Modifications:
- Expose different metrics via PooledByteBufAllocator
- Add *Metrics interfaces
Result:
It is now easy to gather metrics and detail about the PooledByteBufAllocator and so get a better understanding about resource-usage etc.
Motivation:
There is an error in the ByteString test logic which is resulting in test failures.
Modifications:
- Fix the loop iteration to use the loop iteration variable instead of a fixed index.
Result:
Tests are less buggy.
Motivation:
In the SslHandler we schedule a timeout at which we close the Channel if a timeout was detected during close_notify. Because this can race with notify the flushFuture we can see an IllegalStateException when the Channel is closed.
Modifications:
- Use a trySuccess() and tryFailure(...) to guard against race.
Result:
No more race.
Motivation:
All read operations should be safe to execute from multiple threads which was not the case and so could produce a livelock.
Modifications:
Modify methods so these are safe to be called from multiple threads.
Result:
No more livelock.
Motivation:
There are various known issues in netty-codec-dns:
- Message types are not interfaces, which can make it difficult for a
user to implement his/her own message implementation.
- Some class names and field names do not match with the terms in the
RFC.
- The support for decoding a DNS record was limited. A user had to
encode and decode by him/herself.
- The separation of DnsHeader from DnsMessage was unnecessary, although
it is fine conceptually.
- Buffer leak caused by DnsMessage was difficult to analyze, because the
leak detector tracks down the underlying ByteBuf rather than the
DnsMessage itself.
- DnsMessage assumes DNS-over-UDP.
- To send an EDNS message, a user have to create a new DNS record class
instance unnecessarily.
Modifications:
- Make all message types interfaces and add default implementations
- Rename some classes, properties, and constants to match the RFCs
- DnsResource -> DnsRecord
- DnsType -> DnsRecordType
- and many more
- Remove DnsClass and use an integer to support EDNS better
- Add DnsRecordEncoder/DnsRecordDecoder and their default
implementations
- DnsRecord does not require RDATA to be ByteBuf anymore.
- Add DnsRawRecord as the catch-all record type
- Merge DnsHeader into DnsMessage
- Make ResourceLeakDetector track AbstractDnsMessage
- Remove DnsMessage.sender/recipient properties
- Wrap DnsMessage with AddressedEnvelope
- Add DatagramDnsQuest and DatagramDnsResponse for ease of use
- Rename DnsQueryEncoder to DatagramDnsQueryEncoder
- Rename DnsResponseDecoder to DatagramDnsResponseDecoder
- Miscellaneous changes
- Add StringUtil.TAB
Result:
- Cleaner APi
- Can support DNS-over-TCP more easily in the future
- Reduced memory footprint in the default DnsQuery/Response
implementations
- Better leak tracking for DnsMessages
- Possibility to introduce new DnsRecord types in the future and provide
full record encoder/decoder implementation.
- No unnecessary instantiation for an EDNS pseudo resource record
Motivation:
Many projects need some kind a Channel/Connection pool implementation. While the protocols are different many things can be shared, so we should provide a generic API and implementation.
Modifications:
Add ChannelPool / ChannelPoolMap API and implementations.
Result:
Reusable / Generic pool implementation that users can use.
Motivation:
'length2 == 0' is not reachable because length1 and length2 are same at this point.
Motification:
Removed 'length2 == 0'.
Result:
Cleaner code.
Motivation:
Currently, valueOf() and newInstance() use almost same code to create new constant.
For maintainability, it's better to share duplicate code among them.
Motification:
Added new private functions.
- checkNotNullAndNotEmpty() is for checking whether the name of a constant is null and empty.
- newConstant0() is for creating a new constant.
Result:
- Compact source code
- Improvement of maintainability
Motivation:
When a SecurityManager is in place that preven reading the somaxconn file trying to bootstrap a channel later will result in a ClassNotFoundError.
Modifications:
- Reading the file in a privileged block.
Result:
No more ClassNotFoundError when a SecurityManager is in place.
Motivation:
The ByteString class currently assumes the underlying array will be a complete representation of data. This is limiting as it does not allow a subsection of another array to be used. The forces copy operations to take place to compensate for the lack of API support.
Modifications:
- add arrayOffset method to ByteString
- modify all ByteString and AsciiString methods that loop over or index into the underlying array to use this offset
- update all code that uses ByteString.array to ensure it accounts for the offset
- add unit tests to test the implementation respects the offset
Result:
ByteString and AsciiString can represent a sub region of a byte[].
Motivation:
static Package getPackage(Class<?> c) uses synchronized block internally.
Thanks to @jingene for the hint and initial report of the issue.
Modifications:
-Use simple lastIndexOf(...) and substring for a faster implementation
Result:
No more lock condition.
Motivation:
Each different *ChannelOption did extend ChannelOption in 4.0, which we changed in 4.1. This is a breaking change in terms of the API so we need to ensure we keep the old hierarchy.
Modifications:
- Let all *ChannelOption extend ChannelOption
- Add back constructor and mark it as @deprecated
Result:
No API breakage between 4.0 and 4.1
Motivation:
The current implementation does byte by byte comparison, which we have seen
can be a performance bottleneck when the AsciiString is used as the key in
a Map.
Modifications:
Use sun.misc.Unsafe (on supporting platforms) to compare up to eight bytes at a time
and get closer to the performance of String.equals(Object).
Result:
Significant improvement (2x - 6x) in performance over the current implementation.
Benchmark (size) Mode Samples Score Score error Units
i.n.m.i.PlatformDependentBenchmark.arraysBytesEqual 10 thrpt 10 118843477.518 2347259.347 ops/s
i.n.m.i.PlatformDependentBenchmark.arraysBytesEqual 50 thrpt 10 43910319.773 198376.996 ops/s
i.n.m.i.PlatformDependentBenchmark.arraysBytesEqual 100 thrpt 10 26339969.001 159599.252 ops/s
i.n.m.i.PlatformDependentBenchmark.arraysBytesEqual 1000 thrpt 10 2873119.030 20779.056 ops/s
i.n.m.i.PlatformDependentBenchmark.arraysBytesEqual 10000 thrpt 10 306370.450 1933.303 ops/s
i.n.m.i.PlatformDependentBenchmark.arraysBytesEqual 100000 thrpt 10 25750.415 108.391 ops/s
i.n.m.i.PlatformDependentBenchmark.unsafeBytesEqual 10 thrpt 10 248077563.510 635320.093 ops/s
i.n.m.i.PlatformDependentBenchmark.unsafeBytesEqual 50 thrpt 10 128198943.138 614827.548 ops/s
i.n.m.i.PlatformDependentBenchmark.unsafeBytesEqual 100 thrpt 10 86195621.349 1063959.307 ops/s
i.n.m.i.PlatformDependentBenchmark.unsafeBytesEqual 1000 thrpt 10 16920264.598 61615.365 ops/s
i.n.m.i.PlatformDependentBenchmark.unsafeBytesEqual 10000 thrpt 10 1687454.747 6367.602 ops/s
i.n.m.i.PlatformDependentBenchmark.unsafeBytesEqual 100000 thrpt 10 153717.851 586.916 ops/s
Motivation:
ByteString#hashCode() trashes its own hash code if it's being accessed concurrently
Modifications:
Pull the ByteString#hash into a local variable and calculate it locally.
Result:
ByteString#hashCode() is no longer returning a junk value.
Motivation:
The IntObjectHashMap benchmarks show the Agrona collections to be faster on put, lookup, and remove. One major difference is that we're using 2 modulus operations each time we increment the position index while iterating. Agrona uses a mask instead.
Modifications:
Modified the KObjectHashMap to use masking rather than modulus when wrapping the position index. This requires that the capacity be a power of 2.
Result:
Improved performance of IntObjectHashMap.
Motivation:
The usage and code within AsciiString has exceeded the original design scope for this class. Its usage as a binary string is confusing and on the verge of violating interface assumptions in some spots.
Modifications:
- ByteString will be created as a base class to AsciiString. All of the generic byte handling processing will live in ByteString and all the special character encoding will live in AsciiString.
Results:
The AsciiString interface will be clarified. Users of AsciiString can now be clear of the limitations the class imposes while users of the ByteString class don't have to live with those limitations.
Motivation:
Attribute.getAndRemove() will return the value but also remove the AttributeKey itself from the AttributeMap. This may not
what you want as you may want to keep an instance of it and just set it later again. Document the contract so the user know what to expect.
Modifications:
- Make it clear when to use AttributeKey.getAndRemove() / AttributeKey.remove() and when AttributeKey.getAndSet(null) / AttributeKey.set(null).
Result:
Less suprising behaviour.
Motivation:
To support HTTP2 we need APLN support. This was not provided before when using OpenSslEngine, so SSLEngine (JDK one) was the only bet.
Beside this CipherSuiteFilter was not supported
Modifications:
- Upgrade netty-tcnative and make use of new features to support ALPN and NPN in server and client mode.
- Guard against segfaults after the ssl pointer is freed
- support correctly different failure behaviours
- add support for CipherSuiteFilter
Result:
Be able to use OpenSslEngine for ALPN / NPN for server and client.
Motivation:
Currently we have IntObjectMap/HashMap, but it will be useful to support other primitive-based maps.
Modifications:
Moved the code int the current maps to template files and run Groovy code from common/pom.xml to apply the templates.
Result:
Autogeneration of int and char-based hash maps.
Motivation:
We are allocating a hash map for every HTTP2 Stream to store it's children.
Most streams are leafs in the priority tree and don't have children.
Modification:
- Only allocate children when we actually use them.
- Make EmptyIntObjectMap not throw a UnsupportedOperationException on remove, but return null instead (as is stated in it's javadoc).
Result:
Fewer unnecessary allocations.
Motivation:
PrimitiveCollections is not in the 4.1 branch. It is needed by HTTP/2.
Modifications:
Backport this class.
Result:
PrimitiveCollections is in 4.1.
Motivation:
The Http2Settings class currently disallows setting non-standard settings, which violates the spec.
Modifications:
Updated Http2Settings to permit arbitrary settings. Also adjusting the default initial capacity to allow setting all of the standard settings without reallocation.
Result:
Fixes#3560
Motivation:
On a system where ipv4 and ipv6 are supported a user may want to use -Djava.net.preferIPv4Stack=true to restrict it to use ipv4 only.
This is currently ignored with the epoll transport.
Modifications:
Respect java.net.preferIPv4Stack system property.
Result:
-Djava.net.preferIPv4Stack=true will have the effect the user is looking for.
Motivation:
When remove0() is called multiple times for an DefaultAttribute it can cause corruption of the internal linked-list structure.
Modifications:
- Ensure remove0() can not cause corruption by null out prev and next references.
Result:
No more corruption possible
Motivation:
At the moment when EmbeddedChannel is used and a ChannelHandler tries to schedule and task it will throw an UnsupportedOperationException. This makes it impossible to test these handlers or even reuse them with EmbeddedChannel.
Modifications:
- Factor out reusable scheduling code into AbstractSchedulingEventExecutor
- Let EmbeddedEventLoop and SingleThreadEventExecutor extend AbstractSchedulingEventExecutor
- add EmbbededChannel.runScheduledPendingTasks() which allows to run all scheduled tasks that are ready
Result:
Embeddedchannel is now usable even with ChannelHandler that try to schedule tasks.
Motivation:
At the moment if you want to return a HTTP header containing multiple
values you have to set/add that header once with the values wanted. If
you used set/add with an array/iterable multiple HTTP header fields will
be returned in the response.
Note, that this is indeed a suggestion and additional work and tests
should be added. This is mainly to bring up a discussion.
Modifications:
Added a flag to specify that when multiple values exist for a single
HTTP header then add them as a comma separated string.
In addition added a method to StringUtil to help escape comma separated
value charsequences.
Result:
Allows for responses to be smaller.
Motivation:
We should allow to get a ChannelOption/AttributeKey from a String. This will make it a lot easier to make use of configuration files in applications.
Modifications:
- Add exists(...), newInstance(...) method to ChannelOption and AttributeKey and alter valueOf(...) to return an existing instance for a String or create one.
- Add unit tests.
Result:
Much more flexible usage of ChannelOption and AttributeKey.
While implementing netty-handler-proxy, I realized various issues in our
current socksx package. Here's the list of the modifications and their
background:
- Split message types into interfaces and default implementations
- so that a user can implement an alternative message implementations
- Use classes instead of enums when a user might want to define a new
constant
- so that a user can extend SOCKS5 protocol, such as:
- defining a new error code
- defining a new address type
- Rename the message classes
- to avoid abbreviated class names. e.g:
- Cmd -> Command
- Init -> Initial
- so that the class names align better with the protocol
specifications. e.g:
- AuthRequest -> PasswordAuthRequest
- AuthScheme -> AuthMethod
- Rename the property names of the messages
- so that the property names align better when the field names in the
protocol specifications
- Improve the decoder implementations
- Give a user more control over when a decoder has to be removed
- Use DecoderResult and DecoderResultProvider to handle decode failure
gracefully. i.e. no more Unknown* message classes
- Add SocksPortUnifinicationServerHandler since it's useful to the users
who write a SOCKS server
- Cleaned up and moved from the socksproxy example
Motivation:
isRoot() is an expensive operation. We should avoid calling it if
possible.
Modifications:
Move the isRoot() checks to the end of the 'if' block, so that isRoot()
is evaluated only when really necessary.
Result:
isRoot() is evaluated only when SO_BROADCAST is set and the bind address
is anylocal address.
Motivation:
When a user sees an error message, sometimes he or she does not know
what exactly he or she has to do to fix the problem.
Modifications:
Log the URL of the wiki pages that might help the user troubleshoot.
Result:
We are more friendly.