Motivation
As pointed out by @91he in
https://github.com/netty/netty/pull/8595#issuecomment-459181794, there
is a remaining bug in LocationAwareSlf4JLogger following the updates
done in #8595. The logging methods which take a varargs message
parameter array should format using MessageFormatter.arrayFormat rather
than MessageFormatter.format.
Modifications
Change varargs param methods in LocationAwareSlf4JLogger to use
MessageFormatter.arrayFormat and extend unit test to cover these cases.
Results
Correct log output when logging messages with > 2 parameters when using
LocationAwareSlf4JLogger.
Motivation:
If there are no listeners attached to the promise when full-filling it we do not need to schedule a task to notify.
Modifications:
- Don't schedule a task if there is nothing to notify.
- Add unit tests.
Result:
Fixes https://github.com/netty/netty/issues/8795.
Motivation:
To conform to the CharSequence interface we need to return an empty CharSequence when start == end index and a subSequence is requested.
Modifications:
- Correctly handle the case where start == end
- Add unit test
Result:
Fix https://github.com/netty/netty/issues/8796.
Motivation:
We can use lambdas now as we use Java8.
Modification:
use lambda function for all package, #8751 only migrate transport package.
Result:
Code cleanup.
Motivation:
As netty 4.x supported Java 6 we had various if statements to check for java versions < 8. We can remove these now.
Modification:
Remove unnecessary if statements that check for java versions < 8.
Result:
Cleanup code.
* Decouble EventLoop details from the IO handling for each transport to allow easy re-use of code and customization
Motiviation:
As today extending EventLoop implementations to add custom logic / metrics / instrumentations is only possible in a very limited way if at all. This is due the fact that most implementations are final or even package-private. That said even if these would be public there are the ability to do something useful with these is very limited as the IO processing and task processing are very tightly coupled. All of the mentioned things are a big pain point in netty 4.x and need improvement.
Modifications:
This changeset decoubled the IO processing logic from the task processing logic for the main transport (NIO, Epoll, KQueue) by introducing the concept of an IoHandler. The IoHandler itself is responsible to wait for IO readiness and process these IO events. The execution of the IoHandler itself is done by the SingleThreadEventLoop as part of its EventLoop processing. This allows to use the same EventLoopGroup (MultiThreadEventLoupGroup) for all the mentioned transports by just specify a different IoHandlerFactory during construction.
Beside this core API change this changeset also allows to easily extend SingleThreadEventExecutor / SingleThreadEventLoop to add custom logic to it which then can be reused by all the transports. The ideas are very similar to what is provided by ScheduledThreadPoolExecutor (that is part of the JDK). This allows for example things like:
* Adding instrumentation / metrics:
* how many Channels are registered on an SingleThreadEventLoop
* how many Channels were handled during the IO processing in an EventLoop run
* how many task were handled during the last EventLoop / EventExecutor run
* how many outstanding tasks we have
...
...
* Implementing custom strategies for choosing the next EventExecutor / EventLoop to use based on these metrics.
* Use different Promise / Future / ScheduledFuture implementations
* decorate Runnable / Callables when submitted to the EventExecutor / EventLoop
As a lot of functionalities are folded into the MultiThreadEventLoopGroup and SingleThreadEventLoopGroup this changeset also removes:
* AbstractEventLoop
* AbstractEventLoopGroup
* EventExecutorChooser
* EventExecutorChooserFactory
* DefaultEventLoopGroup
* DefaultEventExecutor
* DefaultEventExecutorGroup
Result:
Fixes https://github.com/netty/netty/issues/8514 .
Motivation:
Custom Netty ThreadLocalRandom and ThreadLocalRandomProvider classes are no longer needed and can be removed.
Modification:
Remove own ThreadLocalRandom
Result:
Less code to maintain
Motivation:
No need to initialize charsets from the string. We can take already allocated charset from StandardCharsets class.
Modification:
Replace Charset.forName("US-ASCII") with StandardCharsets.US_ASCII.
Removed Charset[] values() method and internal static variable as it was used only in tests.
Result:
Reuse what the JDK provides
Motivation:
We can use the diamond operator these days.
Modification:
Use diamond operator whenever possible.
Result:
More modern code and less boiler-plate.
Motivation:
Incrementing two variables in sync is not necessary when only one will do.
Modifications:
- Remove `j` from `for` loop and replace with `i`.
- Add more unit testing scenarios to cover changed code.
Results:
Unnecessary variable removed.
Motivation
#8563 highlighted race conditions introduced by the prior optimistic
update optimization in 83a19d5650. These
were known at the time but considered acceptable given the perf
benefit in high contention scenarios.
This PR proposes a modified approach which provides roughly half the
gains but stronger concurrency semantics. Race conditions still exist
but their scope is narrowed to much less likely cases (releases
coinciding with retain overflow), and even in those
cases certain guarantees are still assured. Once release() returns true,
all subsequent release/retains are guaranteed to throw, and in
particular deallocate will be called at most once.
Modifications
- Use even numbers internally (including -ve) for live refcounts
- "Final" release changes to odd number (equivalent to refcount 0)
- Retain still uses faster getAndAdd, release uses CAS loop
- First CAS attempt uses non-volatile read
- Thread.yield() after a failed CAS provides a net gain
Result
More (though not completely) robust concurrency semantics for ref
counting; increased latency under high contention, but still roughly
twice as fast as the original logic. Bench results to follow
Motivation:
We did miss to use MessageFormatter inside LocationAwareSlf4jLogger and so {} was not correctly replaced in log messages when using slf4j.
This regression was introduced by afe0767e9c.
Modifications:
- Make use of MessageFormatter
- Add unit test.
Result:
Fixes https://github.com/netty/netty/issues/8483.
* Log the correct line-number when using SLF4j with netty if possible.
Motivation:
At the moment we do not log the correct line number in many cases as it will log the line number of the logger wrapper itself. Slf4j does have an extra interface that can be used to filter out these nad make it more usable with logging wrappers.
Modifications:
Detect if the returned logger implements LocationAwareLogger and if so make use of its extra methods to be able to log the correct origin of the log request.
Result:
Better logging when using slf4j.
Motivation:
Log4J2Logger had some code-duplication with AbstractInternalLogger
Modifications:
Reuse AbstractInternaLogger.EXCEPTION_MESSAGE in Log4J2Logger and so remove some code-duplication
Result:
Less duplicated code.
Motivation:
There was a race condition between the task submitter and task executor threads such that the last Runnable submitted may not get executed.
Modifications:
The bug was fixed by checking the task queue and state in the task executor thread after it saw the task queue was empty.
Result:
Fixes#8230
Motivation:
Recycler may produce a NPE when the same object is recycled multiple times from different threads.
Modifications:
- Check if the id has changed or if the Stack became null and if so throw an IllegalStateException
- Add unit test
Result:
Fixes https://github.com/netty/netty/issues/8220.
Motivation:
We do not correctly check for previous calles of setUncancellable() in getNow() which may result in ClassCastException as we incorrectly return the internally UNCANCELLABLE object and not null if setUncancellable() we as called before.
Modifications:
Correctly check for UNCANCELLABLE and add unit test.
Result:
Fixes https://github.com/netty/netty/issues/8135.
Motivation:
We incorrectly calculated the length that was used for our for loop in AsciiString.indexOf(...). This lead to a possible ArrayIndexOutOfBoundsException.
Modifications:
- Not include the start in the length calculation
- Add unit test.
Result:
Fixes https://github.com/netty/netty/issues/8112.
Motivation:
ObjectCleaner does start a Thread to handle the cleaning of resources which leaks into the users application. We should not use it in netty itself to make things more predictable.
Modifications:
- Remove usage of ObjectCleaner and use finalize as a replacement when possible.
- Clarify javadocs for FastThreadLocal.onRemoval(...) to ensure its clear that remove() is not guaranteed to be called when the Thread completees and so this method is not enough to guarantee cleanup for this case.
Result:
Fixes https://github.com/netty/netty/issues/8017.
Motivation:
We should allow to schedule tasks with a delay up to Long.MAX_VALUE as we did pre 4.1.25.Final.
Modifications:
Just ensure we not overflow and put the correct max limits in place when schedule a timer. At worse we will get a wakeup to early and then schedule a new timeout.
Result:
Fixes https://github.com/netty/netty/issues/7970.
Motivation:
We did not guard against the case of calling malloc(0) when creating a ByteBuffer without a Cleaner. The problem is that malloc(0) can have different behaviour, it either return a null-pointer or a valid pointer that you can pass to free.
The real problem arise if Unsafe.allocateMemory(0) returns 0 and we use it as the memoryAddress of the ByteBuffer. The problem here is that native libraries test for 0 and handle it as a null-ptr. This is for example true in SSL.bioSetByteBuffer(...) which would throw a NPE when 0 is used as memoryAddress and so produced errors during SSL usage.
Modifications:
- Always allocate 1 byte as minimum (even if we ask for an empty buffer).
- Add unit test.
Result:
No more errors possible because of malloc(0).
Motivation:
When a buffer is over-released, the current error message of `IllegalReferenceCountException` is `refCnt: XXX, increment: XXX`, which is confusing. The correct message should be `refCnt: XXX, decrement: XXX`.
Modifications:
Pass `-decrement` to create `IllegalReferenceCountException`.
Result:
The error message will be `refCnt: XXX, decrement: XXX` when a buffer is over-released.
Motivation:
A FastThreadLocal instance currently occupies 2 slots of InternalThreadLocalMap of every thread. Actually for a FastThreadLocalThread, it does not need to store cleaner flags of FastThreadLocal instances. Besides, using BitSet to store cleaner flags is also better for space usage.
Modification:
Use BitSet to optimize space usage of FastThreadLocal.
Result:
Avoid unnecessary slot occupancy. Cleaner flags are now stored into a BitSet.
Motivation:
The bounds checking for AsciiString#indexOf and AsciiString#lastIndexOf is not correct and may lead to ArrayIndexOutOfBoundsException.
Modifications:
- Correct the bounds checking for AsciiString#indexOf and AsciiString#lastIndexOf
Result
Fixes https://github.com/netty/netty/issues/7863
* NetUtil valid IP methods to accept CharSequence
Motivation:
NetUtil has methods to determine if a String is a valid IP address. These methods don't rely upon String specific methods and can use CharSequence instead.
Modifications:
- Use CharSequence instead of String for the IP validator methods.
- Avoid object allocation in AsciiString#indexOf(char,int) and reduce
byte code
Result:
No more copy operation required if a CharSequence exists.
Motivation:
We dont protect from overflow and so the timer may fire too early if a large timeout is used.
Modifications:
Add overflow guard and a test.
Result:
Fixes https://github.com/netty/netty/issues/7760.
Motivation:
DefaultPromise's internal state depends upon specific Signal objects. These Signal objects can be used externally which causes the DefaultPromise object API to not function correct and state to become corrupted.
Modifications:
- DefaultPromise shouldn't depend upon Signal for its internal state
Result:
Fixes https://github.com/netty/netty/issues/7707
Motivation:
Currently if user call set/remove/set/remove many times, it will create multiple cleaner task for same index. It may cause OOM due to long live thread will have more and more task in LIVE_SET.
Modification:
Add flag to avoid recreating tasks.
Result:
Only create 1 clean task. But use more space of indexedVariables.
Motivation:
Reflective setAccessible(true) will produce scary warnings on the console when using java9+, while netty still works. That said users may feel uncomfortable with these warnings, we should not try to do it by default when using java9+.
Modifications:
Add io.netty.tryReflectionSetAccessible system property which controls if setAccessible(...) will be used. By default it will bet set to false when using java9+.
Result:
Fixes [#7254].
Motivation:
The ObjectCleanerThread must be a daemon thread as otherwise we may block the JVM from exit. By using a daemon thread we basically give the same garantees as the JVM when it comes to cleanup of resources (as the GC threads are also daemon threads and the CleanerImpl uses a deamon thread as well in Java9+).
Modifications:
Change ObjectCleanThread to be a daemon thread.
Result:
JVM shutdown will always be able to complete. Fixed [#7617].
Motivation:
In environments with a security manager, the reflective access to get the reference to
Throwable#addSuppressed can cause issues that result in Netty failing to load. The main
motivation in this pull request is to remove the use of reflection to prevent issues in
these environments.
Modifications:
ThrowableUtil no longer uses Class#getDeclaredMembers to get the Method that references
Throwable#addSuppressed and instead guards the call to Throwable#addSuppressed with a
Java version check.
Additionally, a annotation was added that suppresses the animal sniffer java16 signature
check on the given method. The benefit of the annotation is that it limits the exclusion
of Throwable to just the ThrowableUtil class and has string text indicating the reason
for suppressing the java16 signature check.
Result:
Netty no longer requires the use of Class#getDeclaredMethod for ThrowableUtil and will
work in environments restricted by a security manager without needing to grant reflection
permissions.
Fixes#7614
Motivation:
ObjectCleaner inovkes a Runnable which may execute user code (FastThreadLocal#onRemoval) and therefore exceptions maybe thrown. If an exception is thrown the cleanup thread will exit prematurely and we may never finish cleaning up which will result in leaks.
Modifications:
- ObjectCleaner should suppress exceptions and continue cleaning
Result:
ObjectCleaner will reliably clean despite exceptions being thrown.
Motivation:
e329ca1 introduced the user of ObjectCleaner in FastThreadLocal but we missed the case to register our cleaner task if FastThreadLocal.set was called only.
Modifications:
- Use ObjectCleaner also when FastThreadLocal.set is used.
- Add test case.
Result:
ObjectCleaner is always used.
Motivation:
There is no guarantee that FastThreadLocal.onRemoval(...) is called if the FastThreadLocal is used by "non" FastThreacLocalThreads. This can lead to all sort of problems, like for example memory leaks as direct memory is not correctly cleaned up etc.
Beside this we use ThreadDeathWatcher to check if we need to release buffers back to the pool when thread local caches are collected. In the past ThreadDeathWatcher was used which will need to "wakeup" every second to check if the registered Threads are still alive. If we can ensure FastThreadLocal.onRemoval(...) is called we do not need this anymore.
Modifications:
- Introduce ObjectCleaner and use it to ensure FastThreadLocal.onRemoval(...) is always called when a Thread is collected.
- Deprecate ThreadDeathWatcher
- Add unit tests.
Result:
Consistent way of cleanup FastThreadLocals when a Thread is collected.
Motivation:
When doStartThread throws an exception, e.g. due to the actual executor being depleted of threads and throwing in its rejected execution handler, the STEE ends up in started state anyway. If we try to execute another task in this executor, it will be queued but the thread won't be started anymore and the task will linger forever.
Modifications:
- Ensure we not update the internal state if the startThread() method throws.
- Add testcase
Result:
Fixes [#7483]
Motivation:
In our Recycler implementation we store a reference to the current Thread in the Stack that is stored in a FastThreadLocal. The Stack itself is referenced in the DefaultHandle itself. A problem can arise if a user stores a Reference to an Object that holds a reference to the DefaultHandle somewhere and either not remove the reference at all or remove it very late. In this case the Thread itself can not be collected as its still referenced in the Stack that is referenced by the DefaultHandle.
Modifications:
- Use a WeakReference to store the reference to the Thread in the Stack
- Add a test case
Result:
Ensure a Thread can be collected in a timely manner in all cases even if it used the Recycler.
Motivation:
When system property is empty, the default value should be used.
Modification:
- Correctly use the default value in all cases
- Add unit tests
Result:
Correct behaviour
Motivation:
`AbstractScheduledEventExecutor` uses a standard `java.util.PriorityQueue` to keep track of task deadlines. `ScheduledFuture.cancel` removes tasks from this `PriorityQueue`. Unfortunately, `PriorityQueue.remove` has `O(n)` performance since it must search for the item in the entire queue before removing it. This is fast when the future is at the front of the queue (e.g., already triggered) but not when it's randomly located in the queue.
Many servers will use `ScheduledFuture.cancel` on all requests, e.g., to manage a request timeout. As these cancellations will be happen in arbitrary order, when there are many scheduled futures, `PriorityQueue.remove` is a bottleneck and greatly hurts performance with many concurrent requests (>10K).
Modification:
Use netty's `DefaultPriorityQueue` for scheduling futures instead of the JDK. `DefaultPriorityQueue` is almost identical to the JDK version except it is able to remove futures without searching for them in the queue. This means `DefaultPriorityQueue.remove` has `O(log n)` performance.
Result:
Before - cancelling futures has varying performance, capped at `O(n)`
After - cancelling futures has stable performance, capped at `O(log n)`
Benchmark results
After - cancelling in order and in reverse order have similar performance within `O(log n)` bounds
```
Benchmark (num) Mode Cnt Score Error Units
ScheduledFutureTaskBenchmark.cancelInOrder 100 thrpt 20 137779.616 ± 7709.751 ops/s
ScheduledFutureTaskBenchmark.cancelInOrder 1000 thrpt 20 11049.448 ± 385.832 ops/s
ScheduledFutureTaskBenchmark.cancelInOrder 10000 thrpt 20 943.294 ± 12.391 ops/s
ScheduledFutureTaskBenchmark.cancelInOrder 100000 thrpt 20 64.210 ± 1.824 ops/s
ScheduledFutureTaskBenchmark.cancelInReverseOrder 100 thrpt 20 167531.096 ± 9187.865 ops/s
ScheduledFutureTaskBenchmark.cancelInReverseOrder 1000 thrpt 20 33019.786 ± 4737.770 ops/s
ScheduledFutureTaskBenchmark.cancelInReverseOrder 10000 thrpt 20 2976.955 ± 248.555 ops/s
ScheduledFutureTaskBenchmark.cancelInReverseOrder 100000 thrpt 20 362.654 ± 45.716 ops/s
```
Before - cancelling in order and in reverse order have significantly different performance at higher queue size, orders of magnitude worse than the new implementation.
```
Benchmark (num) Mode Cnt Score Error Units
ScheduledFutureTaskBenchmark.cancelInOrder 100 thrpt 20 139968.586 ± 12951.333 ops/s
ScheduledFutureTaskBenchmark.cancelInOrder 1000 thrpt 20 12274.420 ± 337.800 ops/s
ScheduledFutureTaskBenchmark.cancelInOrder 10000 thrpt 20 958.168 ± 15.350 ops/s
ScheduledFutureTaskBenchmark.cancelInOrder 100000 thrpt 20 53.381 ± 13.981 ops/s
ScheduledFutureTaskBenchmark.cancelInReverseOrder 100 thrpt 20 123918.829 ± 3642.517 ops/s
ScheduledFutureTaskBenchmark.cancelInReverseOrder 1000 thrpt 20 5099.810 ± 206.992 ops/s
ScheduledFutureTaskBenchmark.cancelInReverseOrder 10000 thrpt 20 72.335 ± 0.443 ops/s
ScheduledFutureTaskBenchmark.cancelInReverseOrder 100000 thrpt 20 0.743 ± 0.003 ops/s
```
Motivation:
Fix NullPointerExceptions that occur when running netty-tcnative inside the bootstrap class loader.
Modifications:
- Replace loader.getResource(...) with ClassLoader.getSystemResource(...) when loader is null.
- Replace loader.loadClass(...) with Class.forName(..., false, loader) which works when loader is both null and non-null.
Result:
Support running native libs in bootstrap class loader
Motivation:
Objects of java.util.TreeMap or java.util.TreeSet will become
non-Serializable if instantiated with Comparators, which are not also
Serializable. This can result in unexpected and difficult-to-diagnose
bugs.
Modifications:
Implements Serializable for all classes, which implements Comparator.
Result:
Proper Comparators which will not force collections to
non-Serializable mode.
Motivation:
Even if it's a super micro-optimization (most JVM could optimize such
cases in runtime), in theory (and according to some perf tests) it
may help a bit. It also makes a code more clear and allows you to
access such methods in the test scope directly, without instance of
the class.
Modifications:
Add 'static' modifier for all methods, where it possible. Mostly in
test scope.
Result:
Cleaner code with proper 'static' modifiers.
Motivation:
NativeLibraryLoader uses ClassLoader#getResource method that can return nulls when the resource cannot be found. The returned url variable should be checked for nullity and fail in a more usable manner than a NullPointerException
Modifications:
Fail with a FileNotFoundException
Result:
Fixes [#7222].
Motivation:
When Log4j2Logger is used with PatternLayout (%F:%L)%c.%M, the log message incorrect shows:
(Log4J2Logger.java:73)io.netty.util.internal.PlatformDependent0.debug ....
Modification:
Extend AbstractLogger
Result:
Fixes [#7186].
Motivation:
PR #6811 introduced a public utility methods to decode hex dump and its parts, but they are not visible from netty-common.
Modifications:
1. Move the `decodeHexByte`, `decodeHexDump` and `decodeHexNibble` methods into `StringUtils`.
2. Apply these methods where applicable.
3. Remove similar methods from other locations (e.g. `HpackHex` test class).
Result:
Less code duplication.
Motivation:
A `StringUtil#escapeCsv` creates new `StringBuilder` on each value even if the same string is returned in the end.
Modifications:
Create new `StringBuilder` only if it really needed. Otherwise, return the original string (or just trimmed substring).
Result:
Less GC load. Up to 4x faster work for not changed strings.
Motivation:
`FormattingTuple.getArgArray()` is never used.
In the `MessageFormatter` it is possible to make
some improvements, e.g. replace `StringBuffer`
with `StringBuilder`, avoid redundant allocations, etc.
Modifications:
- Remove `argArray` field from the `FormattingTuple`.
- In `MessageFormatter`:
- replace `StringBuffer` with `StringBuilder`,
- replace `HashMap` with `HashSet` and make it lazy initialized.
- avoid redundant allocations (`substring()`, etc.)
- use appropriate StringBuilder's methods for the some `Number` values.
- Porting unit tests from `slf4j`.
Result:
Less GC load on logging with internal `MessageFormatter`.