Motivation:
We have found out that ByteBufUtil.indexOf can be inefficient for substring search on
ByteBuf, both in terms of algorithm complexity (worst case O(needle.readableBytes *
haystack.readableBytes)), and in constant factor (esp. on Composite buffers).
With implementation of more performant search algorithms we have seen improvements on
the order of magnitude.
Modifications:
This change introduces three search algorithms:
1. Knuth Morris Pratt - classical textbook algorithm, a good default choice.
2. Bit mask based algorithm - stable performance on any input, but limited to maximum
search substring (the needle) length of 64 bytes.
3. Aho–Corasick - worse performance and higher memory consumption than [1] and [2], but
it supports multiple substring (the needles) search simultaneously, by inspecting every
byte of the haystack only once.
Each algorithm processes every byte of underlying buffer only once, they are implemented
as ByteProcessor.
Result:
Efficient search algorithms with linear time complexity available in Netty (I will share
benchmark results in a comment on a PR).
Motivation:
How we did wildcard matching was not correct according to RFC6125. Beside this our implementation was quite CPU heavy.
Modifications:
- Add new DomainWildcardMappingBuilder which correctly does wildcard matching. See https://tools.ietf.org/search/rfc6125#section-6.4
- Add unit tests
- Deprecate old implementations
Result:
Correctly implement wildcard matching and improve performance
Motivation:
Magic numbers seem hard to read or understand.
Modification:
Replace several magic numbers with named constants.
Result:
Improve readability and make it easier to maintain.
Motivation:
ThrowableUtil.stackTraceToString is an expensive method call. So I think a log level check before this logging statement is quite needed especially in a environment with the warning log disabled.
Modification:
Add log level check simply before logging.
Result:
Improve performance in a environment with the warning log disabled.
Motivation:
In general, we will close the debug log in a product environment. However, logging without external level check may still affect performance as varargs will need to allocate an array.
Modification:
Add log level check simply before logging.
Result:
Improve performance slightly in a product environment.
Motivation:
Some JVMs (like OpenJDK 8 on Windows) use a low resolution timer in System.nanoTime() and may return the same value more than once. This triggered an assertion failure when deadlineNanos was equal to nanoTime and AbstractScheduledEventExecutor#pollScheduledTask called #setConsumed.
Modifications:
With this change the assertion checks exactly the same condition as AbstractScheduledEventExecutor#pollScheduledTask, and will no longer fail under these circumstances.
Result:
Fixes#10070.
Motivation:
PromiseTask.isCancelled performs an unsynchronized read and so may trigger race-detectors. As this was just done for a small optimization which should not really make a lot of difference we should just remove it.
Modifications:
Remove unsynchronized read optimization
Result:
Fixes https://github.com/netty/netty/issues/10026.
Motivation:
ImmediateExecutor needs more test cases.
Modification:
Add several test cases for ImmediateExecutor.
Result:
Improve test coverage slightly.
Motivation:
Our parsing of the initial line / http headers did treat some characters as separators which should better trigger an exception during parsing.
Modifications:
- Tighten up parsing of the inital line by follow recommentation of RFC7230
- Restrict separators to OWS for http headers
- Add unit test
Result:
Stricter parsing of HTTP1
Motivation:
NetworkInterface.getByInetAddress() may return null on Android. This is incorrect by the API but still happens. To help our users we should provide a workaround
Modifications:
Just return an empty Enumeration when null is returned.
Result:
Fixes https://github.com/netty/netty/issues/10045
Motivation:
BufferedReader.readLine() may return null and cause a NPE.
Modification:
Simply add a null check.
Result:
If BufferedReader.readLine() returns null, the sysctlGetInt will just return null rather than cause NPE.
Motivation:
Modifications:
- Wrap the code and execute with an AccessController
- Ignore SecurityException (by just logging it)
- Add some more debug logging
Result:
Fixes https://github.com/netty/netty/issues/10017
Motivation:
GlobalEventExecutor/SingleThreadEventExecutor#taskQueue is BlockingQueue.
Modifications:
Add allowBlockingCallsInside configuration for GlobalEventExecutor/SingleThreadEventExecutor#takeTask.
Result:
Fixes#9984
When BlockHound is installed, GlobalEventExecutor/SingleThreadEventExecutor#takeTask is not reported as a blocking call.
Motivation:
If there was always a task in the taskQueue of GlobalEvenExecutor, scheduled tasks in the
scheduledTaskQueue will never be executed.
Related to #1614
Modifications:
fix bug in GlobalEventExecutor#takeTask
Result:
fix bug
Motivation:
JDK is the default SSL provider and internally uses blocking IO operations.
Modifications:
Add allowBlockingCallsInside configuration for SslHandler runAllDelegate function.
Result:
When BlockHound is installed, SSL works out of the box with the default SSL provider.
Co-authored-by: violetagg <milesg78@gmail.com>
Motivation:
We need to initialize ThreadLocalRandom at runtime as it uses System.nanoTime() in a static block to init the seed.
Modifications:
Add io.netty.util.internal.ThreadLocalRandom to properties file
Result:
Better support for GraalVM
Motivation:
Deploying a Micronaut application as GraalVM native image to AWS Lambda with custom runtime fails when using Micronaut Http Client.
This PR initializes at runtime some classes needed to fix the issue. There is more information in our original issue in Micronaut https://github.com/micronaut-projects/micronaut-core/issues/2335#issuecomment-570151944
At this moment I've added those classes into Micronaut (b383d3ab14) as a workaround but this should be included in Netty so it's available for everyone.
Modification:
Mark 3 classes to be initialized at runtime for GraalVM.
Result:
Mark 3 classes to be initialized at runtime for GraalVM.
Motivation:
We can extend `ResourceLeakDetector` through `ResourceLeakDetectorFactory`, and then report the leaked information by covering `reportTracedLeak` and `reportUntracedLeak`. However, the behavior of `reportTracedLeak` and `reportUntracedLeak` is controlled by `logger.isErrorEnabled()`, which is not reasonable. In the case of extending `ResourceLeakDetector`, we sometimes need `needReport` to always return true instead of relying on `logger.isErrorEnabled ()`.
Modification:
introduce `needReport` method and let it be `protected`
Result:
We can control the report leak behavior.
Motivation:
decodeHexNibble can be a lot faster using a lookup table
Modifications:
decodeHexNibble is made faster by using a lookup table
Result:
decodeHexNibble is faster
Motivation:
The resolv.conf file may contain inline comments which should be ignored
Modifications:
- Detect if we have a comment after the ipaddress and if so skip it
- Add unit test
Result:
Fixes https://github.com/netty/netty/issues/9889
Motivation:
In #9603 the executor hung on shutdown because of an abandoned task
on another executor the first was waiting for.
Modifications:
This commit modifies the executor shutdown sequence to include
switching to SHUTDOWN state and then running all remaining tasks.
This ensures that no more tasks are scheduled after SHUTDOWN and
the last pass of running remaining tasks will take it all.
Any tasks scheduled after SHUTDOWN will be rejected.
This change preserves the functionality of graceful shutdown with
quiet period and only adds one more pass of task execution after
the default shutdown process has finished and the executor is
ready for termination.
Result:
After this change tasks that succeed to be added to the executor will
be always executed. Tasks which come late will be rejected instead of
abandoned.
Motivation:
We should include the shaded sources for JCTools in our sources jar to make it easier to debug.
Modifications:
- Adjust plugin configuration to execute plugins in correct order
- Update source plugin
- Add configuration for shade plugin to generate source jar content
Result:
Fixes https://github.com/netty/netty/issues/6640.
Motivation:
Replace Map with Set. `reportedLeaks` has better semantics as a Set, and if it is a Map, it seems that the value of this Map has no meaning to us.
Modifications:
Use Set.
Result:
Cleaner code
Motivation
The event loop implementations had become somewhat tangled over time and
work was done recently to streamline EpollEventLoop. NioEventLoop would
benefit from the same treatment and it is more straighforward now that
we can follow the same structure as was done for epoll.
Modifications
Untangle NioEventLoop logic and mirror what's now done in EpollEventLoop
w.r.t. the volatile selector wake-up guard and scheduled task deadline
handling.
Some common refinements to EpollEventLoop have also been included - to
use constants for the "special" deadline/wakeup volatile values and to
avoid some unnecessary calls to System.nanoTime() on task-only
iterations.
Result
Hopefully cleaner, more efficient and less fragile NIO transport
implementation.
Motivation:
We can move some methods etc to make encapsulation better in Recycler
Modifications:
Move / rename methods to make usage more clear
Result:
Code cleanup
Motivation
Currently when future tasks are scheduled via EventExecutors from a
different thread, at least two allocations are performed - the
ScheduledFutureTask wrapping the to-be-run task, and a Runnable wrapping
the action to add to the scheduled task priority queue. The latter can
be avoided by incorporating this logic into the former.
Modification
- When scheduling or cancelling a future task from outside the event
loop, enqueue the task itself rather than wrapping in a Runnable
- Have ScheduledFutureTask#run first verify the task's deadline has
passed and if not add or remove it from the scheduledTaskQueue depending
on its cancellation state
- Add new outside-event-loop benchmarks to ScheduleFutureTaskBenchmark
Result
Fewer allocations when scheduling/cancelling future tasks
Motivation
The recycling ratio is currently implemented by comparing with a masked
count. The mask operation is not free and also not necessary.
Modification
Change the count(s) to just iterate over the corresponding interval,
which requires only a comparison and no mask.
Also make "first time recycle" behaviour consistent and revert change to
RecyclerTest made in #9727.
Result
Less recycling overhead
Motivation:
At the moment we only enfore ratioMask for the Stack which means that we only guard against recycle burts when recycled from the same Thread. We should also enforce the ratioMask in the WeakOrderQueue so we also guard against the bursts when recycle from other threads.
Modifications:
- Keep counter in WeakOrderQueue to enforce ratioMask as well
- Adjust unit test
Result:
Better guard against recycle bursts which could pollute the heap unnecessary.
Motivation
Currently the visibility of the various Recycler inner classes and their
fields isn't optimal. Some private members are accessed by other classes
resulting in synthetic methods, and other non-private classes/members
are only accessed privately and so can be made private.
Modifications
- Increase/reduce visibility of various fields/methods/classes within
Recycler
- Have WeakOrderQueue extend WeakReference<Thread> to eliminate the
owner field
- Change local DefaultHandle var to DefaultHandle<?> to avoid raw type
compiler warning
Result
Tidier code, fewer implicit methods on hot paths (reducing inlining
depths)
Motivation
This is already done internally for various reasons but it would make
sense i.m.o. as a top level concept: submitting a task to be run on the
event loop which doesn't need to run immediately but must still be
executed in FIFO order relative all other submitted tasks (be those
"lazy" or otherwise).
It's nice to separate this abstract "relaxed" semantic from concrete
implementations - the simplest is to just delegate to existing execute,
but for the main EL impls translates to whether a wakeup is required
after enqueuing.
Having a "global" abstraction also allows for simplification of our
internal use - for example encapsulating more of the common scheduled
future logic within AbstractScheduledEventExecutor.
Modifications
- Introduce public LazyRunnable interface and
AbstractEventExecutor#lazyExecute method (would be nice for this to be
added to EventExecutor interface in netty 5)
- Tweak existing SingleThreadEventExecutor mechanics to support these
- Replace internal use of NonWakeupRunnable (such as for pre-flush
channel writes)
- Uplift scheduling-related hooks into AbstractScheduledEventExecutor,
eliminating intermediate executeScheduledRunnable method
Result
Simpler code, cleaner and more useful/flexible abstractions - cleaner in
that they fully communicate the intent in a more general way, without
implying/exposing/restricting implementation details
Motivation:
We currently use a finalizer to ensure we correctly return the reserved back to the Stack but this is not really needed as we can ensure we return it when needed before dropping the WeakOrderQueue
Modifications:
Use explicit method call to ensure we return the reserved space back before dropping the object
Result:
Less finalizer usage and so less work for the GC
Motivation:
We null out the element in the array after we decrement the current size of the Stack but not directly write back the updated size to the stored field. This is problematic as we do some validation before we write it back and so may never do so if the validation fails. This then later can lead to have null objects returned where not expected
Modifications:
Update size directly after null out object
Result:
No more unexpected null value possible
##Motivation
The InternalLoggerFactory attempts to instantiate different logger
implementations to discover what is available on the class path,
accepting the first implementation that does not throw an exception.
Currently, the default ordering will attempt to instantiate a Log4j1
logger before Log4j2. For environments where both Log4j1 and Log4j2 are
available, this will result in using the older version. It seems that it
would be more intuitive to prefer the newer version, when possible.
##Modifications
Change the default ordering to attempt to use the Log4J2LoggerFactory
before the Log4JLoggerFactory.
##Result
For environments where both Log4j1 and Log4j2 are available on the class
path (but Slf4J is not available), Netty will now use Log4j2 instead of
Log4j1.
Motivation:
If maxDelayedQueues == 0 we should never put any WeakHashMap into the FastThreadLocal for a Thread.
Modifications:
Check if maxDelayedQueues == 0 and if so return directly. This will ensure we never call FastThreadLocal.initialValue() in this case
Result:
Less overhead / memory usage when maxDelayedQueues == 0
Motivation:
At the moment we directly extend the Recycler base class in our code which makes it hard to experiment with different Object pool implementation. It would be nice to be able to switch from one to another by using a system property in the future. This would also allow to more easily test things like https://github.com/netty/netty/pull/8052.
Modifications:
- Introduce ObjectPool class with static method that we now use internally to obtain an ObjectPool implementation.
- Wrap the Recycler into an ObjectPool and return it for now
Result:
Preparation for different ObjectPool implementations