Motivation:
Currently if user call set/remove/set/remove many times, it will create multiple cleaner task for same index. It may cause OOM due to long live thread will have more and more task in LIVE_SET.
Modification:
Add flag to avoid recreating tasks.
Result:
Only create 1 clean task. But use more space of indexedVariables.
Motivation:
Reflective setAccessible(true) will produce scary warnings on the console when using java9+, while netty still works. That said users may feel uncomfortable with these warnings, we should not try to do it by default when using java9+.
Modifications:
Add io.netty.tryReflectionSetAccessible system property which controls if setAccessible(...) will be used. By default it will bet set to false when using java9+.
Result:
Fixes [#7254].
Motivation:
The ObjectCleanerThread must be a daemon thread as otherwise we may block the JVM from exit. By using a daemon thread we basically give the same garantees as the JVM when it comes to cleanup of resources (as the GC threads are also daemon threads and the CleanerImpl uses a deamon thread as well in Java9+).
Modifications:
Change ObjectCleanThread to be a daemon thread.
Result:
JVM shutdown will always be able to complete. Fixed [#7617].
Motivation:
In environments with a security manager, the reflective access to get the reference to
Throwable#addSuppressed can cause issues that result in Netty failing to load. The main
motivation in this pull request is to remove the use of reflection to prevent issues in
these environments.
Modifications:
ThrowableUtil no longer uses Class#getDeclaredMembers to get the Method that references
Throwable#addSuppressed and instead guards the call to Throwable#addSuppressed with a
Java version check.
Additionally, a annotation was added that suppresses the animal sniffer java16 signature
check on the given method. The benefit of the annotation is that it limits the exclusion
of Throwable to just the ThrowableUtil class and has string text indicating the reason
for suppressing the java16 signature check.
Result:
Netty no longer requires the use of Class#getDeclaredMethod for ThrowableUtil and will
work in environments restricted by a security manager without needing to grant reflection
permissions.
Fixes#7614
Motivation:
ObjectCleaner inovkes a Runnable which may execute user code (FastThreadLocal#onRemoval) and therefore exceptions maybe thrown. If an exception is thrown the cleanup thread will exit prematurely and we may never finish cleaning up which will result in leaks.
Modifications:
- ObjectCleaner should suppress exceptions and continue cleaning
Result:
ObjectCleaner will reliably clean despite exceptions being thrown.
Motivation:
e329ca1 introduced the user of ObjectCleaner in FastThreadLocal but we missed the case to register our cleaner task if FastThreadLocal.set was called only.
Modifications:
- Use ObjectCleaner also when FastThreadLocal.set is used.
- Add test case.
Result:
ObjectCleaner is always used.
Motivation:
There is no guarantee that FastThreadLocal.onRemoval(...) is called if the FastThreadLocal is used by "non" FastThreacLocalThreads. This can lead to all sort of problems, like for example memory leaks as direct memory is not correctly cleaned up etc.
Beside this we use ThreadDeathWatcher to check if we need to release buffers back to the pool when thread local caches are collected. In the past ThreadDeathWatcher was used which will need to "wakeup" every second to check if the registered Threads are still alive. If we can ensure FastThreadLocal.onRemoval(...) is called we do not need this anymore.
Modifications:
- Introduce ObjectCleaner and use it to ensure FastThreadLocal.onRemoval(...) is always called when a Thread is collected.
- Deprecate ThreadDeathWatcher
- Add unit tests.
Result:
Consistent way of cleanup FastThreadLocals when a Thread is collected.
Motivation:
When doStartThread throws an exception, e.g. due to the actual executor being depleted of threads and throwing in its rejected execution handler, the STEE ends up in started state anyway. If we try to execute another task in this executor, it will be queued but the thread won't be started anymore and the task will linger forever.
Modifications:
- Ensure we not update the internal state if the startThread() method throws.
- Add testcase
Result:
Fixes [#7483]
Motivation:
In our Recycler implementation we store a reference to the current Thread in the Stack that is stored in a FastThreadLocal. The Stack itself is referenced in the DefaultHandle itself. A problem can arise if a user stores a Reference to an Object that holds a reference to the DefaultHandle somewhere and either not remove the reference at all or remove it very late. In this case the Thread itself can not be collected as its still referenced in the Stack that is referenced by the DefaultHandle.
Modifications:
- Use a WeakReference to store the reference to the Thread in the Stack
- Add a test case
Result:
Ensure a Thread can be collected in a timely manner in all cases even if it used the Recycler.
Motivation:
When system property is empty, the default value should be used.
Modification:
- Correctly use the default value in all cases
- Add unit tests
Result:
Correct behaviour
Motivation:
`AbstractScheduledEventExecutor` uses a standard `java.util.PriorityQueue` to keep track of task deadlines. `ScheduledFuture.cancel` removes tasks from this `PriorityQueue`. Unfortunately, `PriorityQueue.remove` has `O(n)` performance since it must search for the item in the entire queue before removing it. This is fast when the future is at the front of the queue (e.g., already triggered) but not when it's randomly located in the queue.
Many servers will use `ScheduledFuture.cancel` on all requests, e.g., to manage a request timeout. As these cancellations will be happen in arbitrary order, when there are many scheduled futures, `PriorityQueue.remove` is a bottleneck and greatly hurts performance with many concurrent requests (>10K).
Modification:
Use netty's `DefaultPriorityQueue` for scheduling futures instead of the JDK. `DefaultPriorityQueue` is almost identical to the JDK version except it is able to remove futures without searching for them in the queue. This means `DefaultPriorityQueue.remove` has `O(log n)` performance.
Result:
Before - cancelling futures has varying performance, capped at `O(n)`
After - cancelling futures has stable performance, capped at `O(log n)`
Benchmark results
After - cancelling in order and in reverse order have similar performance within `O(log n)` bounds
```
Benchmark (num) Mode Cnt Score Error Units
ScheduledFutureTaskBenchmark.cancelInOrder 100 thrpt 20 137779.616 ± 7709.751 ops/s
ScheduledFutureTaskBenchmark.cancelInOrder 1000 thrpt 20 11049.448 ± 385.832 ops/s
ScheduledFutureTaskBenchmark.cancelInOrder 10000 thrpt 20 943.294 ± 12.391 ops/s
ScheduledFutureTaskBenchmark.cancelInOrder 100000 thrpt 20 64.210 ± 1.824 ops/s
ScheduledFutureTaskBenchmark.cancelInReverseOrder 100 thrpt 20 167531.096 ± 9187.865 ops/s
ScheduledFutureTaskBenchmark.cancelInReverseOrder 1000 thrpt 20 33019.786 ± 4737.770 ops/s
ScheduledFutureTaskBenchmark.cancelInReverseOrder 10000 thrpt 20 2976.955 ± 248.555 ops/s
ScheduledFutureTaskBenchmark.cancelInReverseOrder 100000 thrpt 20 362.654 ± 45.716 ops/s
```
Before - cancelling in order and in reverse order have significantly different performance at higher queue size, orders of magnitude worse than the new implementation.
```
Benchmark (num) Mode Cnt Score Error Units
ScheduledFutureTaskBenchmark.cancelInOrder 100 thrpt 20 139968.586 ± 12951.333 ops/s
ScheduledFutureTaskBenchmark.cancelInOrder 1000 thrpt 20 12274.420 ± 337.800 ops/s
ScheduledFutureTaskBenchmark.cancelInOrder 10000 thrpt 20 958.168 ± 15.350 ops/s
ScheduledFutureTaskBenchmark.cancelInOrder 100000 thrpt 20 53.381 ± 13.981 ops/s
ScheduledFutureTaskBenchmark.cancelInReverseOrder 100 thrpt 20 123918.829 ± 3642.517 ops/s
ScheduledFutureTaskBenchmark.cancelInReverseOrder 1000 thrpt 20 5099.810 ± 206.992 ops/s
ScheduledFutureTaskBenchmark.cancelInReverseOrder 10000 thrpt 20 72.335 ± 0.443 ops/s
ScheduledFutureTaskBenchmark.cancelInReverseOrder 100000 thrpt 20 0.743 ± 0.003 ops/s
```
Motivation:
Fix NullPointerExceptions that occur when running netty-tcnative inside the bootstrap class loader.
Modifications:
- Replace loader.getResource(...) with ClassLoader.getSystemResource(...) when loader is null.
- Replace loader.loadClass(...) with Class.forName(..., false, loader) which works when loader is both null and non-null.
Result:
Support running native libs in bootstrap class loader
Motivation:
Objects of java.util.TreeMap or java.util.TreeSet will become
non-Serializable if instantiated with Comparators, which are not also
Serializable. This can result in unexpected and difficult-to-diagnose
bugs.
Modifications:
Implements Serializable for all classes, which implements Comparator.
Result:
Proper Comparators which will not force collections to
non-Serializable mode.
Motivation:
Even if it's a super micro-optimization (most JVM could optimize such
cases in runtime), in theory (and according to some perf tests) it
may help a bit. It also makes a code more clear and allows you to
access such methods in the test scope directly, without instance of
the class.
Modifications:
Add 'static' modifier for all methods, where it possible. Mostly in
test scope.
Result:
Cleaner code with proper 'static' modifiers.
Motivation:
NativeLibraryLoader uses ClassLoader#getResource method that can return nulls when the resource cannot be found. The returned url variable should be checked for nullity and fail in a more usable manner than a NullPointerException
Modifications:
Fail with a FileNotFoundException
Result:
Fixes [#7222].
Motivation:
When Log4j2Logger is used with PatternLayout (%F:%L)%c.%M, the log message incorrect shows:
(Log4J2Logger.java:73)io.netty.util.internal.PlatformDependent0.debug ....
Modification:
Extend AbstractLogger
Result:
Fixes [#7186].
Motivation:
PR #6811 introduced a public utility methods to decode hex dump and its parts, but they are not visible from netty-common.
Modifications:
1. Move the `decodeHexByte`, `decodeHexDump` and `decodeHexNibble` methods into `StringUtils`.
2. Apply these methods where applicable.
3. Remove similar methods from other locations (e.g. `HpackHex` test class).
Result:
Less code duplication.
Motivation:
A `StringUtil#escapeCsv` creates new `StringBuilder` on each value even if the same string is returned in the end.
Modifications:
Create new `StringBuilder` only if it really needed. Otherwise, return the original string (or just trimmed substring).
Result:
Less GC load. Up to 4x faster work for not changed strings.
Motivation:
`FormattingTuple.getArgArray()` is never used.
In the `MessageFormatter` it is possible to make
some improvements, e.g. replace `StringBuffer`
with `StringBuilder`, avoid redundant allocations, etc.
Modifications:
- Remove `argArray` field from the `FormattingTuple`.
- In `MessageFormatter`:
- replace `StringBuffer` with `StringBuilder`,
- replace `HashMap` with `HashSet` and make it lazy initialized.
- avoid redundant allocations (`substring()`, etc.)
- use appropriate StringBuilder's methods for the some `Number` values.
- Porting unit tests from `slf4j`.
Result:
Less GC load on logging with internal `MessageFormatter`.
Motivation:
IPv4/6 validation methods use allocations, which can be avoided.
IPv4 parse method use StringTokenizer.
Modifications:
Rewriting IPv4/6 validation methods to avoid allocations.
Rewriting IPv4 parse method without use StringTokenizer.
Result:
IPv4/6 validation and IPv4 parsing faster up to 2-10x.
Motivation:
We should only try to load jdk.internal.misc.Unsafe if we run on Java9+ to eliminate noise in the log.
Modifications:
- Move javaVersion() and related methods to PlatformDependent0 to be able to use these in the static initializer without creating a cycle.
- Only try to load jdk.internal.misc.Unsafe when running in Java9+
Result:
Less noise in the log when running pre java9.
Motivation:
`NetUtil`'s methods `isValidIpV6Address` and `getIPv6ByName` incorrectly validate some IPv6 addresses.
Modifications:
- `getIPv6ByName`: add checks for single colon at the start or end.
- `isValidIpV6Address`: fix checks for the count of colons and use `endOffset` instead of `ipAddress.length()` for the cases with the brackets or '%'.
Result:
More correct implementation of `NetUtil#isValidIpV6Address` and `NetUtil#getIPv6ByName`.
Motivation:
Fixes#6681.
Modification:
For the sake of better timer observability, expose the number of pending timeouts through the new HashedWheelTimer.pendingTimeouts method .
Result:
It's now ridiculously easy to observe Netty timer's very basic and yet important metric, the number of pending tasks/timeouts.
Motivation:
NetUtil#isValidIpV6Address and NetUtil#getIPv6ByName allowed an invalid form of mapped IPv4 addresses which lead to accepting invalid IPv6 addresses as valid.
Modifications:
- NetUtil#isValidIpV6Address and NetUtil#getIPv6ByName should only allow 7 colons for an IPv4 address if they are the first 2 characters.
Result:
More correct implementation of NetUtil#isValidIpV6Address and NetUtil#getIPv6ByName
Motivation:
NetUtil#getByName and NetUtil#isValidIpV6Address do not strictly enforce the format of IPv4 addresses that are allowed to be embedded in IPv6 addresses as specified in https://tools.ietf.org/html/rfc4291#section-2.5.5. This may lead to invalid addresses being parsed, or invalid addresses being considered valid. Compression of a single IPv6 word was also not handled correctly if there are 7 : characters.
Modifications:
- NetUtil#isValidIpV6Address should enforce the IPv4-Compatible and IPv4-Mapped are the only valid formats for including IPv4 addresses as specified in https://tools.ietf.org/html/rfc4291#section-2.5.5
- NetUtil#getByName should more stritcly parse IPv6 addresses which contain IPv4 addresses as specified in https://tools.ietf.org/html/rfc4291#section-2.5.5
- NetUtil should allow compression even if the number of : characters is 7.
- NetUtil#createByteArrayFromIpAddressString should use the same IP string to byte[] translation which is used in NetUtil#getByName
Result:
NetUtil#getByName and NetUtil#isValidIpV6Address respect the IPv6 RFC which defines the valid formats for embedding IPv4 addresses.
Motivation:
In cases when an application is running in a container or is otherwise
constrained to the number of processors that it is using, the JVM
invocation Runtime#availableProcessors will not return the constrained
value but rather the number of processors available to the virtual
machine. Netty uses this number in sizing various resources.
Additionally, some applications will constrain the number of threads
that they are using independenly of the number of processors available
on the system. Thus, applications should have a way to globally
configure the number of processors.
Modifications:
Rather than invoking Runtime#availableProcessors, Netty should rely on a
method that enables configuration when the JVM is started or by the
application. This commit exposes a new class NettyRuntime for enabling
such configuraiton. This value can only be set once. Its default value
is Runtime#availableProcessors so that there is no visible change to
existing applications, but enables configuring either a system property
or configuring during application startup (e.g., based on settings used
to configure the application).
Additionally, we introduce the usage of forbidden-apis to prevent future
uses of Runtime#availableProcessors from creeping. Future work should
enable the bundled signatures and clean up uses of deprecated and
other forbidden methods.
Result:
Netty can be configured to not use the underlying number of processors,
but rather the constrained number of processors.
Motivation:
As the javadoc of ScheduledExecutorService state:
Zero and negative delays (but not periods) are also allowed in schedule methods,and are treated as requests for immediate execution.
Modifications:
- Correctly handle delay <= 0.
- Add unit tests.
Result:
Fixes [#6627].
Motivation:
When UNSAFE.allocateMemory is returning an address whose high bit is set we currently throw an IllegalArgumentException. This is not correct as it may return a negative number on at least sparc.
Modifications:
- Allow to pass in negative memoryAddress
- Add unit tests
Result:
Correctly validate the memoryAddress and so also work on sparc as expected. Fixes [#6574].
Motivation:
The updated HTTP/1.x RFC allows for header values to be CSV and separated by OWS [1]. CombinedHttpHeaders should remove this OWS on insertion.
[1] https://tools.ietf.org/html/rfc7230#section-7
Modification:
CombinedHttpHeaders doesn't account for the OWS and returns it back to the user as part of the value.
Result:
Fixes#6452
Motivation:
We used some deprecated Mockito methods.
Modifications:
- Replace deprecated method usage
- Some cleanup
Result:
No more usage of deprecated Mockito methods. Fixes [#6482].
Motivation:
When UnorderedThreadPoolEventExecutor.execute / submit etc is called it will consume up to 100 % CPU even after the task was executed.
Modifications:
Add a special wrapper which we will be used in execute(...) to wrap the submitted Runnable. This is needed as ScheduledThreadPoolExecutor.execute(...) will delegate to submit(...) which will then use decorateTask(...). The problem with this is that decorateTask(...) needs to ensure we only do our own decoration if we not call from execute(...) as otherwise we may end up creating an endless loop because DefaultPromise will call EventExecutor.execute(...) when notify the listeners of the promise.
Result:
Fixes [#6507].
Motivation:
We used various mocking frameworks. We should only use one...
Modifications:
Make usage of mocking framework consistent by only using Mockito.
Result:
Less dependencies and more consistent mocking usage.
Motivation:
When comparing MAC addresses searching for the best MAC address, if
locally-administered address (e.g., from a Docker container) is compared
against an empty MAC address, the empty MAC address will be marked as
preferred. In cases this is the only available MAC address, this leaves
Netty using a random machine ID instead of using a perfectly valid
machine ID from the locally-adminstered address.
Modifications:
This commit modifies the MAC address logic so that the empty MAC address
is not preferred over a locally-administered address. This commit also
simplifies the comparison logic here.
Result:
Empty MAC addresses will not be preferred over locally-administered
addresses thus permitting the default machine ID to be the
locally-adminstered MAC address if it is the only available MAC address.
Motivation:
codec-http2 couples the dependency tree state with the remainder of the stream state (Http2Stream). This makes implementing constraints where stream state and dependency tree state diverge in the RFC challenging. For example the RFC recommends retaining dependency tree state after a stream transitions to closed [1]. Dependency tree state can be exchanged on streams in IDLE. In practice clients may use stream IDs for the purpose of establishing QoS classes and therefore retaining this dependency tree state can be important to client perceived performance. It is difficult to limit the total amount of state we retain when stream state and dependency tree state is combined.
Modifications:
- Remove dependency tree, priority, and weight related items from public facing Http2Connection and Http2Stream APIs. This information is optional to track and depends on the flow controller implementation.
- Move all dependency tree, priority, and weight related code from DefaultHttp2Connection to WeightedFairQueueByteDistributor. This is currently the only place which cares about priority. We can pull out the dependency tree related code in the future if it is generally useful to expose for other implementations.
- DefaultHttp2Connection should explicitly limit the number of reserved streams now that IDLE streams are no longer created.
Result:
More compliant with the HTTP/2 RFC.
Fixes https://github.com/netty/netty/issues/6206.
[1] https://tools.ietf.org/html/rfc7540#section-5.3.4
Motivation:
The HttpProxyHandler is expected to be capable of issuing a valid CONNECT request for a tunneled connection to an IPv6 host.
Modifications:
- Correctly format the IPV6 address.
- Add unit tests
Result:
HttpProxyHandler works with IPV6 as well. Fixes [#6152].
Motivation:
DefaultChannelId provides a regular expression which validates if a user provided MAC address is valid. This regular expression may allow invalid MAC addresses and also not allow valid MAC addresses.
Modifications:
- Introduce a MacAddressUtil#parseMac method which can parse and validate the MAC address at the same time. The regular expression check before hand is additional overhead if we have to parse the MAC address.
Result:
Fixes https://github.com/netty/netty/issues/6132.
Motivation:
c2f4daa739 added a unit test but used a too small test timeout.
Modifications:
Increase timeout.
Result:
Test should have enough time to complete on the CI.
Motivation:
InternalLoggerFactory either sets a default logger factory
implementation based on the logging implementations on the classpath, or
applications can set a logger factory explicitly. If applications wait
too long to set the logger factory, Netty will have already set a logger
factory leading to some objects using one logging implementation and
other objets using another logging implementation. This can happen too
if the application tries to set the logger factory twice, which is
likely a bug in the application. Yet, the Javadocs for
InternalLoggerFactory warn against this saying that
InternalLoggerFactory#setLoggerFactory "should be called as early as
possible and shouldn't be called more than once". Instead, Netty should
guard against this.
Modications:
We replace the logger factory field with an atomic reference on which we
can do CAS operations to safely guard against it being set twice. We
also add an internal holder class that captures the static interface of
InternalLoggerFactory that can aid in testing.
Result:
The logging factory can not be set twice, and applications that want to
set the logging factory must do it before any Netty classes are
initialized (or the default logger factory will be set).
Motivation:
We need to ensure the tracked object can not be GC'ed before ResourceLeak.close() is called as otherwise we may get false-positives reported by the ResourceLeakDetector. This can happen as the JIT / GC may be able to figure out that we do not need the tracked object anymore and so already enqueue it for collection before we actually get a chance to close the enclosing ResourceLeak.
Modifications:
- Add ResourceLeakTracker and deprecate the old ResourceLeak
- Fix some javadocs to correctly release buffers.
- Add a unit test for ResourceLeakDetector that shows that ResourceLeakTracker has not the problems.
Result:
No more false-positives reported by ResourceLeakDetector when ResourceLeakDetector.track(...) is used.
Motivation:
HashWheelTimerTest has busy/wait and sleep statements which are not necessary. We also depend upon a com.google.common.base.Supplier which isn't necessary.
Modifications:
- Remove buys wait loops and timeouts where possible
Result:
HashWheelTimerTest more explicit in verifying conditions and less reliant on wait times.
Motivation:
If the rate at which new timeouts are created is very high and the created timeouts are not cancelled, then the JVM can crash because of out of heap space. There should be a guard in the implementation to prevent this.
Modifications:
The constructor of HashedWheelTimer now takes an optional max pending timeouts parameter beyond which it will reject new timeouts by throwing RejectedExecutionException.
Result:
After this change, if the max pending timeouts parameter is passed as constructor argument to HashedWheelTimer, then it keeps a track of pending timeouts that aren't yet expired or cancelled. When a new timeout is being created, it checks for current pending timeouts and if it's equal to or greater than provided max pending timeouts, then it throws RejectedExecutionException.
Motivation:
the build doesnt seem to enforce this, so they piled up
Modifications:
removed unused import lines
Result:
less unused imports
Signed-off-by: radai-rosenblatt <radai.rosenblatt@gmail.com>
Motivation:
NetUtil.bytesToIpAddress does not correctly translate IPv4 address to String. Also IPv6 addresses may not follow minimization conventions when converting to a String (see rfc 5952).
Modifications:
- NetUtil.bytesToIpAddress should correctly handle negative byte values for IPv4
- NetUtil.bytesToIpAddress should leverage existing to string conversion code in NetUtil
Result:
Fixes https://github.com/netty/netty/issues/5821
Motivation:
We not need to do an extra conditional check in retain(...) as we can just check for overflow after we did the increment.
Modifications:
- Remove extra conditional check
- Add test code.
Result:
One conditional check less.
Motivation:
We offer DefaultEventExecutorGroup as an EventExecutorGroup which return OrderedEventExecutor and so provide strict ordering of event execution. One limitations of this implementation is that each contained DefaultEventExecutor will always be tied to a single thread, which can lead to a very unbalanced execution as one thread may be super busy while others are idling.
Modifications:
- Add NonStickyEventExecutorGroup which can be used to wrap another EventExecutorGroup (like UnorderedThreadPoolEventExecutor) and expose ordering while not be sticky with the thread that is used for a given EventExecutor. This basically means that Threads may change between execution of tasks for an EventExecutor but ordering is still guaranteed.
Result:
Better utalization of threads in some use-cases.
Motivation:
To better restrict resource usage we should limit the number of WeakOrderQueue instances per Thread. Once this limit is reached object that are recycled from a different Thread then the allocation Thread are dropped on the floor.
Modifications:
Add new system property io.netty.recycler.maxDelayedQueuesPerThread and constructor that allows to limit the max number of WeakOrderQueue instances per Thread for Recycler instance. The default is 2 * cores (the same as the default number of EventLoop instances per EventLoopGroup).
Result:
Better way to restrict resource / memory usage per Recycler instance.
Motivation:
At the moment the Recyler is very sensitive to allocation bursts which means that if there is a need for X objects for only one time these will most likely end up in the Recycler and sit there forever as the normal workload only need a subset of this number.
Modifications:
Add a ratio which sets how many objects should be pooled for each new allocation. This allows to slowly increase the number of objects in the Recycler while not be to sensitive for bursts.
Result:
Less unused objects in the Recycler if allocation rate sometimes bursts.
Motivation:
We saw some sporadic test failures for GlobalEventExecutorTest.testAutomaticStartStop test. This is caused parallel execution of tests in combination with assert checks that will be affected.
Modifications:
Remove fragile assert checks.
Result:
No more sporadic test failures
Motivation:
The Java version is used for platform dependent logic. Yet, the logic
for acquiring the Java version requires special permissions (the runtime
permission "getClassLoader") that some downstream projects will never
grant. As such, these projects are doomed to have Netty act is their
Java major version is six. While there are ways to maintain the same
logic without requiring these special permissions, the logic is
needlessly complicated because it relies on loading classes that exist
in version n but not version n - 1. This complexity can be removed. As a
bonanza, the dangerous permission is no longer required.
Modifications:
Rather than attempting to load classes that exist in version n but not
in version n - 1, we can just parse the Java specification version. This
only requires a begign property (property permission
"java.specification.version") and is simple.
Result:
Acquisition of the Java version is safe and simple.
Motivation:
A recent change to DefaultThreadFactory modified it so that it is sticky
with respect to thread groups. In particular, this change made it so
that DefaultThreadFactory would hold on to the thread group that created
it, and then use that thread group to create threads.
This can have problematic semantics since it can lead to violations of a
tenet of thread groups that a thread can only modify threads in its own
thread group and descendant thread groups. With a sticky thread group, a
thread triggering the creation of a new thread via
DefaultThreadFactory#newThread will be modifying a thread from the
sticky thread group which will not necessarily be its own nor a
descendant thread group. When a security manager is in place that
enforces this requirement, these modifications are now impossible. This
is especially problematic in the context of Netty because certain global
singletons like GlobalEventExecutor will create a
DefaultThreadFactory. If all DefaultThreadFactory instances are sticky
about their thread groups, it means that submitting tasks to the
GlobalEventExecutor singleton can cause a thread to be created from the
DefaultThreadFactory sticky thread group, exactly the problem with
DefaultThreadFactory being sticky about thread groups. A similar problem
arises from the ThreadDeathWatcher.
Modifications:
This commit modifies DefaultThreadFactory so that a null thread group
can be set with the behavior that all threads created by such an
instance will inherit the default thread group (the thread group
provided by the security manager if there is one, otherwise the thread
group of the creating thread). The construction of the instances of
DefaultThreadFactory used by the GlobalEventExecutor singleton and
ThreadDeathWatcher are modified to use this behavior. Additionally, we
also modify the chained constructor invocations of the
DefaultThreadFactory that do not have a parameter to specify a thread
group to use the thread group from the security manager is available,
otherwise the creating thread's thread group. We also add unit tests
ensuring that all of this behavior is maintained.
Result:
It will be possible to have DefaultThreadFactory instances that are not
sticky about the thread group that led to their creation. Instead,
threads created by such a DefaultThreadFactory will inherit the default
thread group which will either be the thread group from the security
manager or the current thread's thread group.
Motivation:
Currently, the recycler max capacity it's only enforced on the
thread-local stack which is used when the recycling happens on the
same thread that requested the object.
When the recycling happens in a different thread, then the objects
will be queued into a linked list (where each node holds N objects,
default=16). These objects are then transfered into the stack when
new objects are requested and the stack is empty.
The problem is that the queue doesn't have a max capacity and that
can lead to bad scenarios. Eg:
- Allocate 1M object from recycler
- Recycle all of them from different thread
- Recycler WeakOrderQueue will contain 1M objects
- Reference graph will be very long to traverse and GC timeseems to be negatively impacted
- Size of the queue will never shrink after this
Modifications:
Add some shared counter which is used to manage capacity limits when recycle from different thread then the allocation thread. We modify the counter whenever we allocate a new Link to reduce the overhead of increment / decrement it.
Result:
More predictable number of objects mantained in the recycler pool.
Motivation:
Today when awaiting uninterruptibly on a default promise, a race
condition can lead to a missed signal. Quite simply, the check for
whether the condition holds is not made inside a lock before
waiting. This means that the waiting thread can enter the wait after the
promise has completed and will thus not be notified, thus missing the
signal. This leads to the waiting thread to enter a timed wait that will
only trip with the timeout elapses leading to unnecessarily long waits
(imagine a connection timeout, and the waiting thread missed the signal
that the connection is ready).
Modification:
This commit fixes this missed signal by checking the condition inside a
lock. We also add a test that reliably fails without the non-racy
condition check.
Result:
Timed uninterruptible waits on default promise will not race against the
condition and possibly wait longer than necessary.
Motivation:
ExecutorService.invoke*(...) methods may block by API definition. This can lead to deadlocks if called from inside the EventLoop in SingleThreadEventExecutor as it only has one Thread that does all the work.
Modifications:
Throw a RejectedExectionException if someone tries to call SingleThreadEventExecutor.invoke*(...) while in the EventLoop.
Result:
No more deadlock possible.
Motivation:
The current DnsNameResolver does not support search domains resolution. Search domains resolution is supported out of the box by the java.net resolver, making the DnsNameResolver not able to be a drop in replacement for io.netty.resolver.DefaultNameResolver.
Modifications:
The DnsNameResolverContext resolution has been modified to resolve a list of search path first when it is configured so. The resolve method now uses the following algorithm:
if (hostname is absolute (start with dot) || no search domains) {
searchAsIs
} else {
if (numDots(name) >= ndots) {
searchAsIs
}
if (searchAsIs wasn't performed or failed) {
searchWithSearchDomainsSequenciallyUntilOneSucceeds
}
}
The DnsNameResolverBuilder provides configuration for the search domains and the ndots value. The default search domains value is configured with the OS search domains using the same native configuration the java.net resolver uses.
Result:
The DnsNameResolver performs search domains resolution when they are present.
Motivation:
AsciiString.hashCode(o) , if "o" is a subString, the hash code is not always same, when netty’s version is 4.1.1.Final and jdk’s version is 1.6.
Modifications:
Use a test to assert hash codes are equal between a new string and any sub string (a part of a char array),If their values are equal.
Result:
Create a test method to AsciiStringCharacterTest.
Motivation:
PlatformDependent attempts to use reflection to get the underlying char[] (or byte[]) from String objects. This is fragile as if the String implementation does not utilize the full array, and instead uses a subset of the array, this optimization is invalid. OpenJDK6 and some earlier versions of OpenJDK7 String have the capability to use a subsection of the underlying char[].
Modifications:
- PlatformDependent should not attempt to use the underlying array from String (or other data types) via reflection
Result:
PlatformDependent hash code generation for CharSequence does not depend upon specific JDK implementation details.
Motivation:
PR #5355 modified interfaces to reduce GC related to the HPACK code. However this came with an anticipated performance regression related to HpackUtil.equals due to AsciiString's increase cost of charAt(..). We should mitigate this performance regression.
Modifications:
- Introduce an equals method in PlatformDependent which doesn't leak timing information and use this in HpcakUtil.equals
Result:
Fixes https://github.com/netty/netty/issues/5436
Modifications:
DefaultPromise provides a ThreadLocal queue to protect against StackOverflowError because of executors which may immediately execute runnables instead of queue them (i.e. ImmediateEventExecutor). However this may be better addressed by fixing these executors to protect against StackOverflowError instead of just fixing for a single use case. Also the most commonly used executors already provide the desired behavior and don't need the additional overhead of a ThreadLocal queue in DefaultPromise.
Modifications:
- Remove ThreadLocal queue from DefaultPromise
- Change ImmediateEventExecutor so it maintains a queue of runnables if reentrant condition occurs
Result:
DefaultPromise StackOverflowError code is simpler, and ImmediateEventExecutor protects against StackOverflowError.
Motivation:
f2ed3e6ce8 removed the previous mechanism for StackOverflowError because it didn't work in all cases (i.e. ImmediateExecutor). However if a chain of listeners which complete other promises is formed there is still a possibility of a StackOverflowError.
Modifications:
- Use a ThreadLocal to save any DefaultPromises which could not be notified due to the stack being too large. After the first DefaultPromise on the stack completes notification this ThreadLocal should be used to notify any DefaultPromises which have not yet been notified.
Result:
DefaultPromise has StackOverflowError protection that works with all EventExecutor types.
Motivation:
Recycler.recycle(...) should not be used anymore and be replaced by Handle.recycle().
Modifications:
Mark it as deprecated and update usage.
Result:
Correctly document deprecated api.
Motivation:
DomainMappingBuilder should have been named as DomainNameMappingBuilder
because it builds a DomainNameMapping.
Modifications:
- Add DomainNameMappingBuilder that does the same job with
DomainMappingBuilder
- Deprecate DomainMappingBuilder and delegate its logic to
DomainNameMappingBuilder
- Remove the references to the deprecated methods and classes related
with domain name mapping
- Miscellaneous:
- Fix Javadoc of DomainNameMapping.asMap()
- Pre-create the unmodifiable map in DomainNameMapping
Result:
- Consistent naming
- Less use of deprecated API
Motivation:
DomainNameMapping.entries() returns Set<Map.Entry<String, V>>, which
doesn't sound very natural.
Modifications:
Replace entries() with asMap() which returns a Map<String, V> instead.
Result:
- Better looking API
- User can do a lookup because it's a Map
Motivation:
See #4200.
Modifications:
Add DomainNameMapping.entries to allow retrieving the domain match lists.
Result:
People can use DomainNameMapping.entries to retrive the match list in DomainNameMapping.
Motivation:
The LateListener logic is prone to infinite loops and relies on being processed in the EventExecutor's thread for synchronization, but this EventExecutor may not be constant. An infinite loop can occur if the EventExecutor's execute method does not introduce a context switch in LateListener.run. The EventExecutor can be changed by classes which inherit from DefaultPromise. For example the DefaultChannelPromise will return w/e EventLoop the channel is registered to, but this EventLoop can change (re-registration).
Modifications:
- Remove the LateListener concept and instead use a single Object to maintain the listeners while still preserving notification order
- Make the result member variable an atomic variable so it can be outside the synchronized(this) blocks
- Cleanup/simplify existing state management code
Result:
Fixes https://github.com/netty/netty/issues/5185
Motivation:
It's better to make all InternalLoggerFactory implementations be singletons according to the discussions in #5047
Modifications:
Make all InternalLoggerFactory implementations be singletons and hide the construtors.
Result:
All InternalLoggerFactory implementations be singletons.
Motivation:
See #3095
Modifications:
Add Log4J2LoggerFactory and Log4J2Logger which is an InternalLogger implementation based on log4j2.
Result:
The user can use log4j2 directly without a special slf4j binding.
Motivation:
PromiseAggregator's API allows for the aggregate promise to complete before the user is done adding promises. In order to support this use case the API structure would need to change in a breaking manner.
Modifications:
- Deprecate PromiseAggregator and subclasses
- Introduce PromiseCombiner which corrects these issues
Result:
PromiseCombiner corrects the deficiencies in PromiseAggregator.
Motivation:
If the Future that the PromiseNotifier is listening to is cancelled, it does not propagate the cancel to all the promises it is expected to notify.
Modifications:
- If the future is cancelled then all the promises should be cancelled
- Add a UnaryPromiseNotifier if a collection of promises is not necessary
Result:
PromiseNotifier propagates cancel events to all promises
Motivation:
See #4855
Modifications:
Unfortunately, unescapeCsv cannot be used here because the input could be a CSV line like `"a,b",c`. Hence this patch adds unescapeCsvFields to parse a CSV line and split it into multiple fields and unescaped them. The unit tests should define the behavior of unescapeCsvFields.
Then this patch just uses unescapeCsvFields to implement `CombinedHttpHeaders.getAll`.
Result:
`CombinedHttpHeaders.getAll` will return the unescaped values of a header.
Motivation:
In AsciiString.trim, last should be `arrayOffset() + length() - 1`. See #4741.
Modifications:
Fix the last value.
Result:
AsciiString.trim works correctly.
Motivation:
DomainNameMapping.add() makes DomainNameMapping look like it's safe to call add() anytime, and this is never true. It's probably better deprecate add() and introduce DomainNameMappingBuilder.
Modifications:
Made an immutable implementation of DomainNameMapping;
Added Builder for immutable DomainNameMapping;
Replaced regex pattern with String::startsWith check;
Replaced HashMap with two arrays in ImmutableDomainNameMapping;
Deprecated mutable API;
Estimation for StringBuilder initial size in ImmutableDomainNameMapping#toString()
Added StringUtil#commonSuffixOfLength
Replaced unnecessary substrings creation in DomainNameMapping#matches with regionMatches
Result:
Clients will be able to create immutable instances of DomainNameMapping with builder API.
Motivation:
PriorityStreamByteDistributor uses a homegrown algorithm which distributes bytes to nodes in the priority tree. PriorityStreamByteDistributor has no concept of goodput which may result in poor utilization of network resources. PriorityStreamByteDistributor also has performance issues related to the tree traversal approach and number of nodes that must be visited. There also exists some more proven algorithms from the resource scheduling domain which PriorityStreamByteDistributor does not employ.
Modifications:
- Introduce a new ByteDistributor which uses elements from weighted fair queue schedulers
Result:
StreamByteDistributor which is sensitive to priority and uses a more familiar distribution concept.
Fixes https://github.com/netty/netty/issues/4462
Motivation:
Related to issue #4564.
AsciiString.contentEqualsIgnoreCase fails when comparing two AsciiStrings of the same length
Modifications:
Compare the values of the first AsciiString to the second AsciiString
Result:
AsciiString.contentEqualsIgnoreCase works as expected
Motivation:
DefaultPromiseTest has dead code which was left over from a code restructure. Shared code between 2 tests was moved into a common method, but some code which was not cleaned up in each of these methods after the code was moved.
Modifications:
- Delete dead code in DefaultPromiseTest
Result:
Less dead code
Motivation:
AbstractFuture currently wraps CancellationException in a ExecutionException. However the interface of Future says that this exception should be directly thrown.
Modifications:
- Throw CancellationException from AbstractFuture.get
Result:
Interface contract for CancellationException is honored in AbstractFuture.
Motivation:
There is a notification ordering issue in DefaultPromise when the lateListener collection is in use. The ordering issue can be observed in situations where a late listener is added to a Future returned from a write operation. It is possible that this future will run after a read operation scheduled on the I/O thread, even if the late listener is added on the I/O thread. This can lead to unexpected ordering where a listener for a write operation which must complete in order for the read operation to happen is notified after the read operation is done.
Modifications:
- If the lateListener collection becomes empty, it should be treated as though it was null when checking if lateListeners can be notified immediatley (instead of executing a task on the executor)
Result:
Ordering is more natural and will not be perceived as being out of order relative to other tasks on the same executor.
Motivation:
HttpHeaders already has specific methods for such popular and simple headers like "Host", but if I need to convert POST raw body to string I need to parse complex ContentType header in my code.
Modifications:
Add getCharset and getCharsetAsString methods to parse charset from Content-Length header.
Result:
Easy to use utility method.
Motivation:
The AsciiString.hashCode() method can be optimized. This method is frequently used while to build the DefaultHeaders data structure.
Modification:
- Add a PlatformDependent hashCode algorithm which utilizes UNSAFE if available
Result:
AsciiString hashCode is faster.
Motivation:
The HTTP/2 RFC (https://tools.ietf.org/html/rfc7540#section-8.1.2) indicates that header names consist of ASCII characters. We currently use ByteString to represent HTTP/2 header names. The HTTP/2 RFC (https://tools.ietf.org/html/rfc7540#section-10.3) also eludes to header values inheriting the same validity characteristics as HTTP/1.x. Using AsciiString for the value type of HTTP/2 headers would allow for re-use of predefined HTTP/1.x values, and make comparisons more intuitive. The Headers<T> interface could also be expanded to allow for easier use of header types which do not have the same Key and Value type.
Motivation:
- Change Headers<T> to Headers<K, V>
- Change Http2Headers<ByteString> to Http2Headers<CharSequence, CharSequence>
- Remove ByteString. Having AsciiString extend ByteString complicates equality comparisons when the hash code algorithm is no longer shared.
Result:
Http2Header types are more representative of the HTTP/2 RFC, and relationship between HTTP/2 header name/values more directly relates to HTTP/1.x header names/values.
Motivation:
When the ImmediateEventExecutor is in use it is possible to get a StackOverFlowException if when a promise completes a new listener is added to that promise.
Modifications:
- Protect against the case where LateListeners.run() smashes the stack.
Result:
Fixes https://github.com/netty/netty/issues/4395
Motivation:
When dealing with case insensitive headers it can be useful to have a case insensitive contains method for CharSequence.
Modifications:
- Add containsCaseInsensative to AsciiString
Result:
More expressive utility method for case insensitive CharSequence.
Motivation:
for debugging and metrics reasons its sometimes useful to be able to get details of the the Thread that powers a SingleThreadEventExecutor.
Modifications:
- Expose ThreadProperties
- Add unit test.
Result:
It's now possible to get details of the Thread that powers a SingleThreadEventExecutor.
Motivation:
Sometimes it is useful to disable recycling completely if memory constraints are very tight.
Modifications:
Allow to use -Dio.netty.recycler.maxCapacity=0 to disable recycling completely.
Result:
It's possible to disable recycling now.
Motivation:
A degradation in performance has been observed from the 4.0 branch as documented in https://github.com/netty/netty/issues/3962.
Modifications:
- Simplify Headers class hierarchy.
- Restore the DefaultHeaders to be based upon DefaultHttpHeaders from 4.0.
- Make various other modifications that are causing hot spots.
Result:
Performance is now on par with 4.0.
Motivation:
There is an error in the ByteString test logic which is resulting in test failures.
Modifications:
- Fix the loop iteration to use the loop iteration variable instead of a fixed index.
Result:
Tests are less buggy.
Motivation:
In the SslHandler we schedule a timeout at which we close the Channel if a timeout was detected during close_notify. Because this can race with notify the flushFuture we can see an IllegalStateException when the Channel is closed.
Modifications:
- Use a trySuccess() and tryFailure(...) to guard against race.
Result:
No more race.
Motivation:
The ByteString class currently assumes the underlying array will be a complete representation of data. This is limiting as it does not allow a subsection of another array to be used. The forces copy operations to take place to compensate for the lack of API support.
Modifications:
- add arrayOffset method to ByteString
- modify all ByteString and AsciiString methods that loop over or index into the underlying array to use this offset
- update all code that uses ByteString.array to ensure it accounts for the offset
- add unit tests to test the implementation respects the offset
Result:
ByteString and AsciiString can represent a sub region of a byte[].
Motivation:
static Package getPackage(Class<?> c) uses synchronized block internally.
Thanks to @jingene for the hint and initial report of the issue.
Modifications:
-Use simple lastIndexOf(...) and substring for a faster implementation
Result:
No more lock condition.
Motivation:
The current implementation does byte by byte comparison, which we have seen
can be a performance bottleneck when the AsciiString is used as the key in
a Map.
Modifications:
Use sun.misc.Unsafe (on supporting platforms) to compare up to eight bytes at a time
and get closer to the performance of String.equals(Object).
Result:
Significant improvement (2x - 6x) in performance over the current implementation.
Benchmark (size) Mode Samples Score Score error Units
i.n.m.i.PlatformDependentBenchmark.arraysBytesEqual 10 thrpt 10 118843477.518 2347259.347 ops/s
i.n.m.i.PlatformDependentBenchmark.arraysBytesEqual 50 thrpt 10 43910319.773 198376.996 ops/s
i.n.m.i.PlatformDependentBenchmark.arraysBytesEqual 100 thrpt 10 26339969.001 159599.252 ops/s
i.n.m.i.PlatformDependentBenchmark.arraysBytesEqual 1000 thrpt 10 2873119.030 20779.056 ops/s
i.n.m.i.PlatformDependentBenchmark.arraysBytesEqual 10000 thrpt 10 306370.450 1933.303 ops/s
i.n.m.i.PlatformDependentBenchmark.arraysBytesEqual 100000 thrpt 10 25750.415 108.391 ops/s
i.n.m.i.PlatformDependentBenchmark.unsafeBytesEqual 10 thrpt 10 248077563.510 635320.093 ops/s
i.n.m.i.PlatformDependentBenchmark.unsafeBytesEqual 50 thrpt 10 128198943.138 614827.548 ops/s
i.n.m.i.PlatformDependentBenchmark.unsafeBytesEqual 100 thrpt 10 86195621.349 1063959.307 ops/s
i.n.m.i.PlatformDependentBenchmark.unsafeBytesEqual 1000 thrpt 10 16920264.598 61615.365 ops/s
i.n.m.i.PlatformDependentBenchmark.unsafeBytesEqual 10000 thrpt 10 1687454.747 6367.602 ops/s
i.n.m.i.PlatformDependentBenchmark.unsafeBytesEqual 100000 thrpt 10 153717.851 586.916 ops/s