Motivation:
We produced a lot of noise during loading native libraries as we always included the stacktrace if we could not load by one mechanism. We should better just not include the stacktrace in the debugging logging if one mechanism fails. We will log all the stacks anyway when all of the mechanisms fail.
Modifications:
Make logging less aggressive
Result:
Less confusing behaviour for the end-user
Motivation:
Internally UnixResolverDnsServerAddressStreamProvider#parse calls FileInputStream.read(...)
when parsing the etcResolverFiles.
This will cause the error below when BlockHound is enabled
reactor.blockhound.BlockingOperationError: Blocking call! java.io.FileInputStream#readBytes
at java.io.FileInputStream.readBytes(FileInputStream.java)
at java.io.FileInputStream.read(FileInputStream.java:255)
Modifications:
- Add whitelist entry to BlockHound configuration
- Add test
Result:
Fixes#10925
Motivation:
A race detector discovered a data race in GlobalEventExecutor present in
netty 4.1.51.Final:
```
Write of size 4 at 0x0000cea08774 by thread T103:
#0 io.netty.util.internal.DefaultPriorityQueue.poll()Lio/netty/util/internal/PriorityQueueNode; DefaultPriorityQueue.java:113
#1 io.netty.util.internal.DefaultPriorityQueue.poll()Ljava/lang/Object; DefaultPriorityQueue.java:31
#2 java.util.AbstractQueue.remove()Ljava/lang/Object; AbstractQueue.java:113
#3 io.netty.util.concurrent.AbstractScheduledEventExecutor.pollScheduledTask(J)Ljava/lang/Runnable; AbstractScheduledEventExecutor.java:133
#4 io.netty.util.concurrent.GlobalEventExecutor.fetchFromScheduledTaskQueue()V GlobalEventExecutor.java:119
#5 io.netty.util.concurrent.GlobalEventExecutor.takeTask()Ljava/lang/Runnable; GlobalEventExecutor.java:106
#6 io.netty.util.concurrent.GlobalEventExecutor$TaskRunner.run()V GlobalEventExecutor.java:240
#7 io.netty.util.internal.ThreadExecutorMap$2.run()V ThreadExecutorMap.java:74
#8 io.netty.util.concurrent.FastThreadLocalRunnable.run()V FastThreadLocalRunnable.java:30
#9 java.lang.Thread.run()V Thread.java:835
#10 (Generated Stub) <null>
Previous read of size 4 at 0x0000cea08774 by thread T110:
#0 io.netty.util.internal.DefaultPriorityQueue.size()I DefaultPriorityQueue.java:46
#1 io.netty.util.concurrent.GlobalEventExecutor$TaskRunner.run()V GlobalEventExecutor.java:263
#2 io.netty.util.internal.ThreadExecutorMap$2.run()V ThreadExecutorMap.java:74
#3 io.netty.util.concurrent.FastThreadLocalRunnable.run()V FastThreadLocalRunnable.java:30
#4 java.lang.Thread.run()V Thread.java:835
#5 (Generated Stub) <null>
```
The race is legit, but benign. To trigger it requires a TaskRunner to
begin exiting and set 'started' to false, more work to be scheduled
which starts a new TaskRunner, that work then needs to schedule
additional work which modifies 'scheduledTaskQueue', and then the
original TaskRunner checks 'scheduledTaskQueue'. But there is no danger
to this race as it can only produce a false negative in the condition
which causes the code to CAS 'started' which is thread-safe.
Modifications:
Delete problematic references to scheduledTaskQueue. The only way
scheduledTaskQueue could be modified since the last check is if another
TaskRunner is running, in which case the current TaskRunner doesn't
care.
Result:
Data-race free code, and a bit less code to boot.
Motivation:
I did not see any tangible advantage to the padding.
The only other field that was guarded was a rarely changed object reference to a BitSet.
Without the padding, there is also no longer any use of the inheritance hierarchy.
The padding was also using `long`, which would not necessarily prevent the JVM from fitting the aforementioned object reference in an alignment gap.
Modification:
Move all the fields into the InternalThreadLocalMap and deprecate the stuff we'd like to remove.
Result:
Simpler code.
This resolves the discussion in https://github.com/netty/netty/issues/9284
Motivation:
We shouldnt catch Throwable in InternalLoggerFactory as this may hide OOME etc.
Modifications:
Only catch LinkageError and Exception
Result:
Fixes https://github.com/netty/netty/issues/10857
Motivation:
In #10630, field substitutions were introduced for NetUtil.LOCALHOST4, NetUtil.LOCALHOST6 and NetUtil.LOCALHOST fields. They were required to allow a native image be built with most of Netty (including NetUtil) initialized at build time.
The substitutions created in #10630 only define getters, so the 3 fields can only be read in a native image.
But when NetUtil is initialized at run-time (this is what happens in #10797), its static initialization block is executed, and this block writes to all 3 fields. As the substitutions do not provide any setters, field stores are not valid, and such builds fail.
Modifications:
- Add netty-testsuite-native-image-client-runtime-init Maven module that builds a native image deferring NetUtil class initialization till run-time; this module is used to demonstrate the problem and verify the problem is gone with the fix
- Add no-op setters to substitutions for NetUtil.LOCALHOST4, NetUtil.LOCALHOST6 and NetUtil.LOCALHOST
Result:
A native image initializing NetUtil at run-time builds successfully.
Fixes#10797
Motivation:
Internally SSLEngineImpl.wrap(...) may call FileInputStream.read(...).
This will cause the error below when BlockHound is enabled
reactor.blockhound.BlockingOperationError: Blocking call! java.io.FileInputStream#readBytes
at java.io.FileInputStream.readBytes(FileInputStream.java)
at java.io.FileInputStream.read(FileInputStream.java:255)
Modifications:
- Add whitelist entry to BlockHound configuration
- Add test
Result:
Fixes#10837
Motivation:
If Log4J2's `Filter` creates `Recycler.Stack` somehow, `Recycler.Stack()` will see uninitialized `Recycler.INITIAL_CAPACITY`. This has been raised originally in https://github.com/micrometer-metrics/micrometer/issues/2369.
Modification:
This PR changes to initialize `Recycler.INITIAL_CAPACITY` before invoking `InternalLogger.debug()` to avoid it.
Result:
Fixes the problem described in the "Motivation" section.
Motivation:
GlobalEventExecutor.addTask was rightfully allowed to block by commit
09d38c8. However the same should have been done for
SingleThreadEventExecutor.addTask.
BlockHound is currently intercepting that call, and as a consequence,
it prevents SingleThreadEventExecutor from working properly, if addTask is
called from a thread that cannot block.
The interception is due to LinkedBlockingQueue.offer implementation,
which uses a ReentrantLock internally.
Modifications:
* Added one BlockHound exception to
io.netty.util.internal.Hidden.NettyBlockHoundIntegration for
SingleThreadEventExecutor.addTask.
* Also added unit tests for both SingleThreadEventExecutor.addTask
and GlobalEventExecutor.addTask.
Result:
SingleThreadEventExecutor.addTask can now be invoked from any thread
when BlockHound is activated.
Motivation:
When a HashedWheelTimer instance is started or stopped, its working
thread is started or stopped. These operations block the calling
thread:
- start() calls java.util.concurrent.CountDownLatch.await() to wait
for the worker thread to finish initializing;
- stop() calls java.lang.Thread.join(long) to wait for the worker
thread to exit.
BlockHound detects these calls and as a consequence, prevents
HashedWheelTimer from working properly, if it is started or stopped
in a thread that is not allowed to block.
Modifications:
Added two more BlockHound exceptions to
io.netty.util.internal.Hidden.NettyBlockHoundIntegration: one
for HashedWheelTimer.start() and one for HashedWheelTimer.stop().
Result:
HashedWheelTimer can now be started and stopped properly when
BlockHound is activated.
Motivation:
03aafb9cff did ensure we fail while loading a natibe library which is not compatible. While this is great it is still sometimes hard for people to understand what NoSuchMethodError means in this context.
Modifications:
If possible rethrow the NoSuchMethodError and provide some more hints about multiple versions of the shared library
Result:
Easier to understand for people why loading fails
Motivation:
ddebc10 added a few adjustments that are needed by io_uring that we will add as an incubator repository. Unfortunally we missed the changed in PlatformDependent*.
Modifications:
Add missing methods
Result:
Be able to compile io_uring code against core netty
Motivation:
Since GraalVM version 19.3.0, instances of java.net.InetAddress (and its subclasses Inet4Address and Inet6Address) are not allowed in native image heap (that is, they cannot be stored in static fields of classes initialized at build time or be reachable through static fields of such classes). When building a native image, it makes sense to initialize at build time as many classes as possible.
But some fields of some classes in Netty (for example, NetUtil.LOCALHOST4) contain InetAddress instances. If a program is using code path that makes it possible to reach such fields at build time initialization, it becomes impossible to build a native image initializing core Netty classes initialized at runtime. An example of such a program is a client that uses netty-dns.
Modifications:
- Add netty-testsuite-native-image-client Maven module to test that such an example program can be built after the corresponding fixes
- Add native-image.properties to resolver-dns module to move initialization of some classes to runtime (some of them are parsing configuration during initialization, so it makes no sense to initialize them at build time; for others, it's needed to avoid InetAddress reachability at build time)
- Add substitutions for NetUtil.LOCALHOST4, NetUtil.LOCALHOST6 and NetUtil.LOCALHOST to overcome the InetAddress-related prohibition
- Extract some initialization code from NetUtil to NetUtilInitializations to allow it to be used by the substitutions
Result:
A client program using netty-dns with --initialize-at-build-time=io.netty builds successfully
Motivation:
HTTP is a plaintext protocol which means that someone may be able
to eavesdrop the data. To prevent this, HTTPS should be used whenever
possible. However, maintaining using https:// in all URLs may be
difficult. The nohttp tool can help here. The tool scans all the files
in a repository and reports where http:// is used.
Modifications:
- Added nohttp (via checkstyle) into the build process.
- Suppressed findings for the websites
that don't support HTTPS or that are not reachable
Result:
- Prevent using HTTP in the future.
- Encourage users to use HTTPS when they follow the links they found in
the code.
Motivation:
junit deprecated Assert.assertThat(...)
Modifications:
Use MatcherAssert.assertThat(...) as replacement for deprecated method
Result:
Less deprecation warnings
Motivation:
LGTM reports multiple issues. They need to be triaged,
and real ones should be fixed.
Modifications:
- Fixed multiple issues reported by LGTM, such as redundant conditions,
resource leaks, typos, possible integer overflows.
- Suppressed false-positives.
- Added a few testcases.
Result:
Fixed several possible issues, get rid of false alarms in the LGTM report.
Motivation:
We had two unit tests that sometimes failed due timeouts. After insepecting these I noticed these can be improved to run faster while still do the right validation
Modifications:
- Only submit one task for execution per execute
- Cleanup
Result:
No test failures due timeout
Motivation:
We have a few classes in which we store and reuse static instances of various exceptions. When doing so it is important to also override fillInStacktrace() and so prevent the leak of the ClassLoader in the internal backtrace field.
Modifications:
- Add overrides of fillInStracktrace when needed
- Move ThrowableUtil usage in the static methods
Result:
Fixes https://github.com/netty/netty/pull/10686
Motivation:
All scheduled executors should behave in accordance to their API.
The bug here is that scheduled tasks were not run more than once because we executed the runnables directly, instead of through the provided runnable future.
Modification:
We now run tasks through the provided future, so that when each run completes, the internal state of the task is reset and the ScheduledThreadPoolExecutor is informed of the completion.
This allows the executor to prepare the next run.
Result:
The UnorderedThreadPoolEventExecutor is now able to run scheduled tasks more than once.
Which is what one would expect from the API.
Motivation:
We check lots of numbers if it lies in a range. So it's better to add a method in `ObjectUtil` to check if a number lies inside a range.
Modification:
Added Range check method.
Result:
A faster and better way to check if a number lies inside a range.
Motivation:
DefaultAttributeMap::attr has a blocking behaviour on lookup of an existing attribute:
it can be made non-blocking.
Modification:
Replace the existing fixed bucket table using a locked intrusive linked list
with an hand-rolled copy-on-write ordered single array
Result:
Non blocking behaviour for the lookup happy path
Motivation:
writeUtf8 can suffer from inlining issues and/or megamorphic call-sites on the hot path due to ByteBuf hierarchy
Modifications:
Duplicate and specialize the code paths to reduce the need of polymorphic calls
Result:
Performance are more stable in user code
Motivation:
We're converting `Inet4Address` to `Integer` quite frequently so it's a good idea to keep that code in `NetUtils`.
Modification:
Added ipv4AddressToInt(Inet4Address) in NetUtils
Result:
Easy conversion of `Inet4Address` to `Integer`.
Motivation:
It makes no sense to remove the task when the underlying executor fails as we may be able to pick it up later. Beside this the used Queue doesnt support remove(...) and so will throw.
Modifications:
Remove the queue.remove(...) call
Result:
Fixes https://github.com/netty/netty/issues/10501.
Motivation:
The executor chooser should pluck executors in round-robin, but at the 32-bit
overflow boundary, the round-robin sequence was disrupted when the number of
executors are not a power of 2.
Modification:
Changed the index counter from a 32-bit to a 64-bit long. The overflow bug is
still technically there, but it now takes so long to reach that it will never
happen in practice. For example, 2^63 nanoseconds is almost 300 years.
Result:
The round-robin behaviour for all EventExecutorChoosers is now preserved in
practice.
This fixes#10423.
Motivation:
Recent Intellij versions are starting to anticipate
future versions of Java that include a
`java.lang.Record` class, and the Intellij compiler
gets confused by the `Record` class in our
`ResorceLeakDetector`.
Modification:
Rename our `Record` class to `TracerRecord`.
This matches what the class is doing, while avoiding
any future name clashes.
Result:
Intellij can now compile the project again, even when
configured to use a future (snapshot or early access)
version of Java.
Motivation:
SSLEngineImpl.unwrap(...) may call FileInputStream.read(...) internally when TLS1.3 is used. This will cause an BlockingOperationError when BlockHound is enabled.
For more details see https://mail.openjdk.java.net/pipermail/security-dev/2020-August/022271.html
Modifications:
- Add whitelist entry to BlockHound config for now
- Update NettyBlockHoundIntegrationTest to include testing for this workaround
Result:
No BlockingOperationError when TLS1.3 is used with JDK SSL implementation and BlockHound is enabled
Motivation:
GraalVM's native images built with native-image tool do not support Unsafe.staticFieldOffset() method (at least, currently). If an application using Netty (and causing initialization of io.netty.util.internal.PlatformDependent0 class) is built into a native image and run, this results in the following error thrown during initialization:
Exception in thread "main" com.oracle.svm.core.jdk.UnsupportedFeatureError: Unsupported method of Unsafe
at com.oracle.svm.core.util.VMError.unsupportedFeature(VMError.java:86)
at jdk.internal.misc.Unsafe.staticFieldOffset(Unsafe.java:230)
at sun.misc.Unsafe.staticFieldOffset(Unsafe.java:662)
at io.netty.util.internal.PlatformDependent0$5.run(PlatformDependent0.java:294)
at java.security.AccessController.doPrivileged(AccessController.java:83)
at io.netty.util.internal.PlatformDependent0.<clinit>(PlatformDependent0.java:279)
This seems to be the reason of the behavior described in #10051.
Modification:
The idea of this commit is to only invoke Unsafe.staticFieldOffset() is we are not in a native image; if we are, behave like if we could not find the field at all.
GraalDetector is borrowed from Spring framework.
Result:
Fixes#10051
Motivation:
We should provide details about why an IOOBE was thrown
Modification:
Add IndexOutOfBoundsException error information in io.netty.util.internal.AppendableCharSequence and io.netty.handler.codec.CodecOutputList class
Result:
Easier to debug
Motivation:
When a switch statement is used we should always define a `default:` so we don't introduce bugs due fall-through.
Modification:
Add missing `default:`s
Result:
Less error-prone code
Motivation:
When BlockHound is installed,
ReferenceCountedOpenSslClientContext$ExtendedTrustManagerVerifyCallback.verify
is reported as blocking call.
Modifications:
Add allowBlockingCallsInside configuration for
ReferenceCountedOpenSslClientContext$ExtendedTrustManagerVerifyCallback.verify
Result:
Fixes#10384
Motivation:
newResourceLeakDetector(...) did not correctly pass the samplingInterval parameter and so it was ignored.
Modification:
ResourceLeakDetectorFactory.newResourceLeakDetector(Class, int) use the second parameter as the sampling interval of the newly created ResourceLeakDetector.
Result:
Fixes#10378
Motivation:
Unsafe users are getting MpscChunkedArrayQueue while no Unsafe ones MpscGrowableAtomicArrayQueue
Modifications:
MpscChunkedAtomicArrayQueue should be used for no Unsafe users (matching MpscChunkedArrayQueue behaviour)
Result:
no Unsafe users uses MpscChunkedAtomicArrayQueue while allocating bounded Mpsc Queues
Motivation:
Fix very tiny comment error in Recycler
Modifications:
Fix very tiny comment error in Recycler
Result:
Correctly comment about drop in WeakOrderQueue#add
Motivation:
GlobalEventExecutor#addTask may be called during SingleThreadEventExecutor shutdown.
May result in a blocking call, because GlobalEventExecutor#taskQueue is a BlockingQueue.
Modifications:
Add allowBlockingCallsInside configuration for GlobalEventExecutor#addTask.
Result:
Fixes#10257.
When BlockHound is installed, GlobalEventExecutor#addTask is not reported as a blocking call.
Motivation
- Recycler stack and delayed queue drop ratio can only be configured with
the same value. The overall drop ratio is ratio^2.
- #10251 shows that enable drop in `WeakOrderQueue` may introduce
performance degradation. Though the final reason is not clear now,
it would be better to add option to configure delayed queue drop ratio separately.
Modification
- "io.netty.recycler.delayedQueue.ratio" as the drop ratio of delayed queue
- default "delayedQueue.ratio" is same as "ratio"
Results
Able to configure recycler delayed queue drop ratio separately
Motivation
1. It's inable to collect all object because RATIO is always >=1 after
`safeFindNextPositivePowerOfTwo`
2. Enable drop object in `WeakOrderQueue`(commit:
71860e5b94) enlarge the drop ratio. We
can subtly control the overall drop ratio by using `io.netty.recycler.ratio` directly,
Modification
- Remove `safeFindNextPositivePowerOfTwo` before set the ratio
Results
Able to disable drop when recycle object
Motivation:
We have found out that ByteBufUtil.indexOf can be inefficient for substring search on
ByteBuf, both in terms of algorithm complexity (worst case O(needle.readableBytes *
haystack.readableBytes)), and in constant factor (esp. on Composite buffers).
With implementation of more performant search algorithms we have seen improvements on
the order of magnitude.
Modifications:
This change introduces three search algorithms:
1. Knuth Morris Pratt - classical textbook algorithm, a good default choice.
2. Bit mask based algorithm - stable performance on any input, but limited to maximum
search substring (the needle) length of 64 bytes.
3. Aho–Corasick - worse performance and higher memory consumption than [1] and [2], but
it supports multiple substring (the needles) search simultaneously, by inspecting every
byte of the haystack only once.
Each algorithm processes every byte of underlying buffer only once, they are implemented
as ByteProcessor.
Result:
Efficient search algorithms with linear time complexity available in Netty (I will share
benchmark results in a comment on a PR).
Motivation:
How we did wildcard matching was not correct according to RFC6125. Beside this our implementation was quite CPU heavy.
Modifications:
- Add new DomainWildcardMappingBuilder which correctly does wildcard matching. See https://tools.ietf.org/search/rfc6125#section-6.4
- Add unit tests
- Deprecate old implementations
Result:
Correctly implement wildcard matching and improve performance
Motivation:
Magic numbers seem hard to read or understand.
Modification:
Replace several magic numbers with named constants.
Result:
Improve readability and make it easier to maintain.
Motivation:
ThrowableUtil.stackTraceToString is an expensive method call. So I think a log level check before this logging statement is quite needed especially in a environment with the warning log disabled.
Modification:
Add log level check simply before logging.
Result:
Improve performance in a environment with the warning log disabled.
Motivation:
In general, we will close the debug log in a product environment. However, logging without external level check may still affect performance as varargs will need to allocate an array.
Modification:
Add log level check simply before logging.
Result:
Improve performance slightly in a product environment.
Motivation:
Some JVMs (like OpenJDK 8 on Windows) use a low resolution timer in System.nanoTime() and may return the same value more than once. This triggered an assertion failure when deadlineNanos was equal to nanoTime and AbstractScheduledEventExecutor#pollScheduledTask called #setConsumed.
Modifications:
With this change the assertion checks exactly the same condition as AbstractScheduledEventExecutor#pollScheduledTask, and will no longer fail under these circumstances.
Result:
Fixes#10070.