Motivation
- Recycler stack and delayed queue drop ratio can only be configured with
the same value. The overall drop ratio is ratio^2.
- #10251 shows that enable drop in `WeakOrderQueue` may introduce
performance degradation. Though the final reason is not clear now,
it would be better to add option to configure delayed queue drop ratio separately.
Modification
- "io.netty.recycler.delayedQueue.ratio" as the drop ratio of delayed queue
- default "delayedQueue.ratio" is same as "ratio"
Results
Able to configure recycler delayed queue drop ratio separately
Motivation
1. It's inable to collect all object because RATIO is always >=1 after
`safeFindNextPositivePowerOfTwo`
2. Enable drop object in `WeakOrderQueue`(commit:
71860e5b94) enlarge the drop ratio. We
can subtly control the overall drop ratio by using `io.netty.recycler.ratio` directly,
Modification
- Remove `safeFindNextPositivePowerOfTwo` before set the ratio
Results
Able to disable drop when recycle object
Motivation:
We have found out that ByteBufUtil.indexOf can be inefficient for substring search on
ByteBuf, both in terms of algorithm complexity (worst case O(needle.readableBytes *
haystack.readableBytes)), and in constant factor (esp. on Composite buffers).
With implementation of more performant search algorithms we have seen improvements on
the order of magnitude.
Modifications:
This change introduces three search algorithms:
1. Knuth Morris Pratt - classical textbook algorithm, a good default choice.
2. Bit mask based algorithm - stable performance on any input, but limited to maximum
search substring (the needle) length of 64 bytes.
3. Aho–Corasick - worse performance and higher memory consumption than [1] and [2], but
it supports multiple substring (the needles) search simultaneously, by inspecting every
byte of the haystack only once.
Each algorithm processes every byte of underlying buffer only once, they are implemented
as ByteProcessor.
Result:
Efficient search algorithms with linear time complexity available in Netty (I will share
benchmark results in a comment on a PR).
Motivation:
How we did wildcard matching was not correct according to RFC6125. Beside this our implementation was quite CPU heavy.
Modifications:
- Add new DomainWildcardMappingBuilder which correctly does wildcard matching. See https://tools.ietf.org/search/rfc6125#section-6.4
- Add unit tests
- Deprecate old implementations
Result:
Correctly implement wildcard matching and improve performance
Motivation:
As we have java8 as a minimum target we can use MethodHandles. We should do so when we expect to have a method called multiple times.
Modifications:
- Replace usage of reflection with MethodHandles where it makes sense
- Remove some code which was there to support java < 8
Result:
Faster code
Motivation:
Magic numbers seem hard to read or understand.
Modification:
Replace several magic numbers with named constants.
Result:
Improve readability and make it easier to maintain.
Motivation:
ThrowableUtil.stackTraceToString is an expensive method call. So I think a log level check before this logging statement is quite needed especially in a environment with the warning log disabled.
Modification:
Add log level check simply before logging.
Result:
Improve performance in a environment with the warning log disabled.
Motivation:
In general, we will close the debug log in a product environment. However, logging without external level check may still affect performance as varargs will need to allocate an array.
Modification:
Add log level check simply before logging.
Result:
Improve performance slightly in a product environment.
Motivation:
ImmediateExecutor needs more test cases.
Modification:
Add several test cases for ImmediateExecutor.
Result:
Improve test coverage slightly.
Motivation:
Our parsing of the initial line / http headers did treat some characters as separators which should better trigger an exception during parsing.
Modifications:
- Tighten up parsing of the inital line by follow recommentation of RFC7230
- Restrict separators to OWS for http headers
- Add unit test
Result:
Stricter parsing of HTTP1
Motivation:
NetworkInterface.getByInetAddress() may return null on Android. This is incorrect by the API but still happens. To help our users we should provide a workaround
Modifications:
Just return an empty Enumeration when null is returned.
Result:
Fixes https://github.com/netty/netty/issues/10045
Motivation:
BufferedReader.readLine() may return null and cause a NPE.
Modification:
Simply add a null check.
Result:
If BufferedReader.readLine() returns null, the sysctlGetInt will just return null rather than cause NPE.
Motivation:
Modifications:
- Wrap the code and execute with an AccessController
- Ignore SecurityException (by just logging it)
- Add some more debug logging
Result:
Fixes https://github.com/netty/netty/issues/10017
Motivation:
If there was always a task in the taskQueue of GlobalEvenExecutor, scheduled tasks in the
scheduledTaskQueue will never be executed.
Related to #1614
Modifications:
fix bug in GlobalEventExecutor#takeTask
Result:
fix bug
Motivation:
JDK is the default SSL provider and internally uses blocking IO operations.
Modifications:
Add allowBlockingCallsInside configuration for SslHandler runAllDelegate function.
Result:
When BlockHound is installed, SSL works out of the box with the default SSL provider.
Co-authored-by: violetagg <milesg78@gmail.com>
Motivation:
Deploying a Micronaut application as GraalVM native image to AWS Lambda with custom runtime fails when using Micronaut Http Client.
This PR initializes at runtime some classes needed to fix the issue. There is more information in our original issue in Micronaut https://github.com/micronaut-projects/micronaut-core/issues/2335#issuecomment-570151944
At this moment I've added those classes into Micronaut (b383d3ab14) as a workaround but this should be included in Netty so it's available for everyone.
Modification:
Mark 3 classes to be initialized at runtime for GraalVM.
Result:
Mark 3 classes to be initialized at runtime for GraalVM.
Motivation:
We can extend `ResourceLeakDetector` through `ResourceLeakDetectorFactory`, and then report the leaked information by covering `reportTracedLeak` and `reportUntracedLeak`. However, the behavior of `reportTracedLeak` and `reportUntracedLeak` is controlled by `logger.isErrorEnabled()`, which is not reasonable. In the case of extending `ResourceLeakDetector`, we sometimes need `needReport` to always return true instead of relying on `logger.isErrorEnabled ()`.
Modification:
introduce `needReport` method and let it be `protected`
Result:
We can control the report leak behavior.
Motivation:
decodeHexNibble can be a lot faster using a lookup table
Modifications:
decodeHexNibble is made faster by using a lookup table
Result:
decodeHexNibble is faster
Motivation:
The resolv.conf file may contain inline comments which should be ignored
Modifications:
- Detect if we have a comment after the ipaddress and if so skip it
- Add unit test
Result:
Fixes https://github.com/netty/netty/issues/9889
Motivation:
In #9603 the executor hung on shutdown because of an abandoned task
on another executor the first was waiting for.
Modifications:
This commit modifies the executor shutdown sequence to include
switching to SHUTDOWN state and then running all remaining tasks.
This ensures that no more tasks are scheduled after SHUTDOWN and
the last pass of running remaining tasks will take it all.
Any tasks scheduled after SHUTDOWN will be rejected.
This change preserves the functionality of graceful shutdown with
quiet period and only adds one more pass of task execution after
the default shutdown process has finished and the executor is
ready for termination.
Result:
After this change tasks that succeed to be added to the executor will
be always executed. Tasks which come late will be rejected instead of
abandoned.
Motivation:
We should use Objects.requireNonNull(...) as we require java8
Modifications:
Replace ObjectUtil.checkNonNull(...) with Objects.requireNonNull(...)
Result:
Code cleanup
Motivation:
We should include the shaded sources for JCTools in our sources jar to make it easier to debug.
Modifications:
- Adjust plugin configuration to execute plugins in correct order
- Update source plugin
- Add configuration for shade plugin to generate source jar content
Result:
Fixes https://github.com/netty/netty/issues/6640.
Motivation:
Replace Map with Set. `reportedLeaks` has better semantics as a Set, and if it is a Map, it seems that the value of this Map has no meaning to us.
Modifications:
Use Set.
Result:
Cleaner code
Motivation
The recycling ratio is currently implemented by comparing with a masked
count. The mask operation is not free and also not necessary.
Modification
Change the count(s) to just iterate over the corresponding interval,
which requires only a comparison and no mask.
Also make "first time recycle" behaviour consistent and revert change to
RecyclerTest made in #9727.
Result
Less recycling overhead
Motivation:
We can move some methods etc to make encapsulation better in Recycler
Modifications:
Move / rename methods to make usage more clear
Result:
Code cleanup
Motivation:
At the moment we only enfore ratioMask for the Stack which means that we only guard against recycle burts when recycled from the same Thread. We should also enforce the ratioMask in the WeakOrderQueue so we also guard against the bursts when recycle from other threads.
Modifications:
- Keep counter in WeakOrderQueue to enforce ratioMask as well
- Adjust unit test
Result:
Better guard against recycle bursts which could pollute the heap unnecessary.
Motivation
Currently the visibility of the various Recycler inner classes and their
fields isn't optimal. Some private members are accessed by other classes
resulting in synthetic methods, and other non-private classes/members
are only accessed privately and so can be made private.
Modifications
- Increase/reduce visibility of various fields/methods/classes within
Recycler
- Have WeakOrderQueue extend WeakReference<Thread> to eliminate the
owner field
- Change local DefaultHandle var to DefaultHandle<?> to avoid raw type
compiler warning
Result
Tidier code, fewer implicit methods on hot paths (reducing inlining
depths)
Motivation:
We currently use a finalizer to ensure we correctly return the reserved back to the Stack but this is not really needed as we can ensure we return it when needed before dropping the WeakOrderQueue
Modifications:
Use explicit method call to ensure we return the reserved space back before dropping the object
Result:
Less finalizer usage and so less work for the GC
Motivation:
We null out the element in the array after we decrement the current size of the Stack but not directly write back the updated size to the stored field. This is problematic as we do some validation before we write it back and so may never do so if the validation fails. This then later can lead to have null objects returned where not expected
Modifications:
Update size directly after null out object
Result:
No more unexpected null value possible
##Motivation
The InternalLoggerFactory attempts to instantiate different logger
implementations to discover what is available on the class path,
accepting the first implementation that does not throw an exception.
Currently, the default ordering will attempt to instantiate a Log4j1
logger before Log4j2. For environments where both Log4j1 and Log4j2 are
available, this will result in using the older version. It seems that it
would be more intuitive to prefer the newer version, when possible.
##Modifications
Change the default ordering to attempt to use the Log4J2LoggerFactory
before the Log4JLoggerFactory.
##Result
For environments where both Log4j1 and Log4j2 are available on the class
path (but Slf4J is not available), Netty will now use Log4j2 instead of
Log4j1.
Motivation:
If maxDelayedQueues == 0 we should never put any WeakHashMap into the FastThreadLocal for a Thread.
Modifications:
Check if maxDelayedQueues == 0 and if so return directly. This will ensure we never call FastThreadLocal.initialValue() in this case
Result:
Less overhead / memory usage when maxDelayedQueues == 0
Motivation:
At the moment we directly extend the Recycler base class in our code which makes it hard to experiment with different Object pool implementation. It would be nice to be able to switch from one to another by using a system property in the future. This would also allow to more easily test things like https://github.com/netty/netty/pull/8052.
Modifications:
- Introduce ObjectPool class with static method that we now use internally to obtain an ObjectPool implementation.
- Wrap the Recycler into an ObjectPool and return it for now
Result:
Preparation for different ObjectPool implementations
Motivation:
We do not correct guard against the gact that when applying our workaround for windows we may end up with a 0 sleep period. In this case we should just sleep for 1 ms.
Modifications:
Guard agains the case when our calculation will produce 0 as sleep time on windows
Result:
Fixes https://github.com/netty/netty/issues/9710.
Motivation:
Netty is an asynchronous framework.
If somebody uses a blocking call inside Netty's event loops,
it may lead to a severe performance degradation.
BlockHound is a tool that helps detecting such calls.
Modifications:
This change adds a BlockHound's SPI integration that marks
threads created by Netty (`FastThreadLocalThread`s) as non-blocking.
It also marks some of Netty's internal methods as whitelisted
as they are required to run the event loops.
Result:
When BlockHound is installed, any blocking call inside event loops
is intercepted and reported (by default an error will be thrown).
Motivation
The current event loop shutdown logic is quite fragile and in the
epoll/NIO cases relies on the default 1 second wait/select timeout that
applies when there are no scheduled tasks. Without this default timeout
the shutdown would hang indefinitely.
The timeout only takes effect in this case because queued scheduled
tasks are first cancelled in
SingleThreadEventExecutor#confirmShutdown(), but I _think_ even this
isn't robust, since the main task queue is subsequently serviced which
could result in some new scheduled task being queued with much later
deadline.
It also means shutdowns are unnecessarily delayed by up to 1 second.
Modifications
- Add/extend unit tests to expose the issue
- Adjust SingleThreadEventExecutor shutdown and confirmShutdown methods
to explicitly add no-op tasks to the taskQueue so that the subsequent
event loop iteration doesn't enter blocking wait (as looks like was
originally intended)
Results
Faster and more robust shutdown of event loops, allows removal of the default wait timeout.
This is a port of https://github.com/netty/netty/pull/9616
Motivation:
Recycler$Stack.pop will occurs `ArrayIndexOutOfBoundsException` in some race cases, we should double check `size` even after `scavenge` called.
Modifications:
Double check `size` after `scavenge`
Result:
avoid ArrayIndexOutOfBoundsException in `pop`
Motiviation:
A lot of reentrancy bugs and cycles can happen because the DefaultPromise will notify the FutureListener directly when completely in the calling Thread if the Thread is the EventExecutor Thread. To reduce the risk of this we should always notify the listeners via the EventExecutor which basically means that we will put a task into the taskqueue of the EventExecutor and pick it up for execution after the setSuccess / setFailure methods complete the promise.
Modifications:
- Always notify via the EventExecutor
- Adjust test to ensure we correctly account for this
- Adjust tests that use the EmbeddedChannel to ensure we execute the scheduled work.
Result:
Reentrancy bugs related to the FutureListeners cant happen anymore.
Motivation:
peek() is implemented in a similar way to poll() for the mpsc queue, thus it is more like a consumer call.
It is possible that we could have multiple thread call peek() and possibly one thread calls poll() at at the same time.
This lead to multiple consumer scenario, which violates the multiple producer single consumer condition and could lead to spin in an infinite loop in peek()
Modification:
Use isEmpty() instead of peek() to check if task queue is empty
Result:
Dont violate the mpsc semantics.
Motivation:
SystemPropertyUtil already uses the AccessController internally so not need to wrap its usage with AccessController as well.
Modifications:
Remove explicit AccessController usage when SystemPropertyUtil is used.
Result:
Code cleanup
Motivation:
At the current moment HttpContentEncoder handle only first value of multiple accept-encoding headers.
Modification:
Join multiple accept-encoding headers to one separated by comma.
Result:
Fixes#9553
Motivation
Currently every call to get() on a promise results in two reads of the
volatile result field when one would suffice. Maybe this is optimized
away but it seems sensible not to rely on that.
Modification
Reimplement get() and get(...) in DefaultPromise to reduce volatile access.
Result
Fewer volatile reads.
Motivation
problem with Throwable#addSuppressed() raised in #9151. This introduced
a performance issue when promises are cancelled at a high frequency due
to the construction cost of CancellationException at the time that
DefaultPromise#cancel() is called.
Modifications
- Reinstate the prior static CANCELLATION_CAUSE_HOLDER but use it just
as a sentinel to indicate cancellation, constructing a new
CancellationException only if/when one needs to be explicitly
returned/thrown
- Subclass CancellationException, overriding fillInStackTrace() to
minimize the construction cost in these cases
Result
Promises are much cheaper to cancel. Fixes#9522.
Motiviation:
EmbeddedChannel currently is quite differently in terms of semantics to other Channel implementations. We should better change it to be more closely aligned and so have the testing code be more robust.
Modifications:
- Change EmbeddedEventLoop.inEventLoop() to only return true if we currenlty run pending / scheduled tasks
- Change EmbeddedEventLoop.execute(...) to automatically process pending tasks if not already doing so
- Adjust a few tests for the new semantics (which is closer to other Channel implementations)
Result:
EmbeddedChannel works more like other Channel implementations
Motivation:
There are some extra log level checks (logger.isWarnEnabled()).
Modification:
Remove log level checks (logger.isWarnEnabled()) from io.netty.channel.epoll.AbstractEpollStreamChannel, io.netty.channel.DefaultFileRegion, io.netty.channel.nio.AbstractNioChannel, io.netty.util.HashedWheelTimer, io.netty.handler.stream.ChunkedWriteHandler and io.netty.channel.udt.nio.NioUdtMessageConnectorChannel
Result:
Fixes#9456
Motivation:
The Netty classes are initialized at build time by default for GraalVM Native Image compilation. This is configured via the `--initialize-at-build-time=io.netty` option. While this reduces start-up time it can lead to some problems:
- The class initializer of `io.netty.buffer.PooledByteBufAllocator` looks at the maximum memory size to compute the size of internal buffers. If the class initializer runs during image generation, then the buffers are sized according to the very large heap size that the image generator uses, and Netty allocates several arrays that are 16 MByte. The fix is to initialize the following 3 classes at run time: `io.netty.buffer.PooledByteBufAllocator,io.netty.buffer.ByteBufAllocator,io.netty.buffer.ByteBufUtil`. This fix was dependent on a GraalVM Native Image fix that was included in 19.2.0.
- The class initializer of `io.netty.handler.ssl.util.ThreadLocalInsecureRandom` needs to be initialized at runtime to ensure that the generated values are trully random and not fixed for each generated image.
- The class initializers of `io.netty.buffer.AbstractReferenceCountedByteBuf` and `io.netty.util.AbstractReferenceCounted` compute field offsets. While the field offset recomputation is necessary for correct execution as a native image these initializers also have logic that depends on the presence/absence of `sun.misc.Unsafe`, e.g., via the `-Dio.netty.noUnsafe=true` flag. The fix is to push these initializers to runtime so that the field offset lookups (and the logic depending on them) run at run time. This way no manual substitutions are necessary either.
Modifications:
Add `META-INF/native-image` configuration files that correctly trigger the inialization of the above classes at run time via `--initialize-at-run-time=...` flags.
Result:
Fixes the initialisation issues described above for Netty executables built with GraalVM.