Motivation:
Over the last few months we developed an io_uring-based transport which shows very promising performance numbers. To give it more time to bake we will develop it outside of netty in an "incubator" module, which will make it clear to users what to expect and also allow us to separate its release cycle. While its implementation is very self-contained, a few small adjustments need to be made in netty itself to allow us to reuse code.
Modifications:
- AbstractChannel: Add method which can be used when a write fails and remove final from one method
- IovArray: Allow to create an IovArray from a ByteBuf instance
- FileDescriptor: Allow to reuse mark close logic via sub-class
Result:
Be able to reuse netty core classes in io_uring incubator repository
Motivation:
We should preferably always release the message before we notify the promise. This has a few advantages:
- Release memory as soon as possible
- Listeners observe the "more correct" reference count
Modifications:
Release the message before failing the promises
Result:
Faster releasing of resources. This came up in https://github.com/netty/netty/issues/10723
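The ordering can be illustrated with a minimal sketch. The `Message` and `Promise` classes below are hypothetical stand-ins, not Netty types; the point is only that a listener which inspects the reference count during notification already sees the released state.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for a reference-counted message and a promise; not Netty types.
final class ReleaseBeforeNotify {
    static final class Message {
        private int refCnt = 1;
        int refCnt() { return refCnt; }
        void release() { refCnt--; }
    }

    static final class Promise {
        private final List<Runnable> listeners = new ArrayList<>();
        void addListener(Runnable listener) { listeners.add(listener); }
        void setFailure() {
            for (Runnable listener : listeners) {
                listener.run();
            }
        }
    }

    // Release first, so listeners run after the memory has been given back.
    static void failWrite(Message msg, Promise promise) {
        msg.release();
        promise.setFailure();
    }

    // Returns the reference count a listener sees at notification time (-1 if never notified).
    static int observedRefCnt() {
        Message msg = new Message();
        Promise promise = new Promise();
        int[] seen = {-1};
        promise.addListener(() -> seen[0] = msg.refCnt());
        failWrite(msg, promise);
        return seen[0];
    }
}
```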
Motivation:
Since GraalVM version 19.3.0, instances of java.net.InetAddress (and its subclasses Inet4Address and Inet6Address) are not allowed in native image heap (that is, they cannot be stored in static fields of classes initialized at build time or be reachable through static fields of such classes). When building a native image, it makes sense to initialize at build time as many classes as possible.
But some fields of some classes in Netty (for example, NetUtil.LOCALHOST4) contain InetAddress instances. If a program uses a code path that makes it possible to reach such fields during build-time initialization, it becomes impossible to build a native image with core Netty classes initialized at build time. An example of such a program is a client that uses netty-dns.
Modifications:
- Add netty-testsuite-native-image-client Maven module to test that such an example program can be built after the corresponding fixes
- Add native-image.properties to resolver-dns module to move initialization of some classes to runtime (some of them are parsing configuration during initialization, so it makes no sense to initialize them at build time; for others, it's needed to avoid InetAddress reachability at build time)
- Add substitutions for NetUtil.LOCALHOST4, NetUtil.LOCALHOST6 and NetUtil.LOCALHOST to overcome the InetAddress-related prohibition
- Extract some initialization code from NetUtil to NetUtilInitializations to allow it to be used by the substitutions
Result:
A client program using netty-dns with --initialize-at-build-time=io.netty builds successfully
Motivation:
Thread.stop() works by producing a ThreadDeath error in the target thread. EventLoops swallow all Throwables, which makes them effectively unkillable. This is effectively a memory leak for our application. Besides this, we should also just rethrow all `Error`s as there is almost no way to recover.
Modification:
Edit the EventLoops that swallow Throwables to instead rethrow Error.
Result:
`EventLoop` can crash if `Error` is thrown
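A minimal sketch of the catch ordering, assuming a simplified task loop (the `RethrowErrorLoop` class is hypothetical and is not the EventLoop code itself): `Error` is caught first and rethrown, everything else is logged and the loop continues.

```java
import java.util.Queue;

final class RethrowErrorLoop {
    // Runs tasks in order; returns how many tasks finished without throwing an Error.
    static int runAll(Queue<Runnable> tasks) {
        int handled = 0;
        Runnable task;
        while ((task = tasks.poll()) != null) {
            try {
                task.run();
            } catch (Error e) {
                // ThreadDeath, OutOfMemoryError, ... are not recoverable here: propagate.
                throw e;
            } catch (Throwable t) {
                // Other failures are reported and the loop keeps running.
                System.err.println("Task failed: " + t);
            }
            handled++;
        }
        return handled;
    }
}
```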
Motivation:
HTTP is a plaintext protocol which means that someone may be able
to eavesdrop on the data. To prevent this, HTTPS should be used whenever
possible. However, consistently using https:// in all URLs may be
difficult. The nohttp tool can help here. The tool scans all the files
in a repository and reports where http:// is used.
Modifications:
- Added nohttp (via checkstyle) into the build process.
- Suppressed findings for the websites
that don't support HTTPS or that are not reachable
Result:
- Prevent using HTTP in the future.
- Encourage users to use HTTPS when they follow the links they found in
the code.
Motivation:
I was collecting stats for failed promises with a FixedChannelPool and I was bucketing the stats using cause.getClass().getSimpleName(). After #9152 was released, the introduction of the anonymous classes makes getSimpleName() return "", causing confusion.
Modification:
Use named classes in the ChannelPool implementations. I made them private, but I can change that if you think otherwise.
Result:
The SimpleChannelPool fails the promises with a ChannelPoolFullException. The FixedChannelPool fails the promises with an AcquireTimeoutException. Also AcquireTimeoutException is more specific than just a plain TimeoutException, which is also useful for troubleshooting. If you want different class names, please advise.
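The difference can be seen with plain JDK classes. In this sketch the nested `AcquireTimeoutException` only mirrors the name used above; its message and the surrounding `NamedExceptionDemo` class are assumptions for illustration.

```java
import java.util.concurrent.TimeoutException;

final class NamedExceptionDemo {
    // Named subclass: getSimpleName() gives a usable bucket key.
    static final class AcquireTimeoutException extends TimeoutException {
        AcquireTimeoutException() {
            super("acquire timed out");
        }
    }

    static String bucket(Throwable cause) {
        return cause.getClass().getSimpleName();
    }

    // Anonymous subclass: getSimpleName() is the empty string.
    static String anonymousBucket() {
        return bucket(new TimeoutException("acquire timed out") { });
    }

    static String namedBucket() {
        return bucket(new AcquireTimeoutException());
    }
}
```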
Motivation:
DuplexChannel allows for half-closure, so we should have a special config interface for it as well.
Modifications:
Add DuplexChannelConfig, which allows configuring half-closure.
Result:
More consistent types
Motivation:
junit deprecated Assert.assertThat(...)
Modifications:
Use MatcherAssert.assertThat(...) as replacement for deprecated method
Result:
Fewer deprecation warnings
Motivation:
LGTM reports multiple issues. They need to be triaged,
and real ones should be fixed.
Modifications:
- Fixed multiple issues reported by LGTM, such as redundant conditions,
resource leaks, typos, possible integer overflows.
- Suppressed false-positives.
- Added a few testcases.
Result:
Fixed several possible issues and got rid of false alarms in the LGTM report.
Motivation:
We have a few classes in which we store and reuse static instances of various exceptions. When doing so it is important to also override fillInStackTrace() and so prevent the leak of the ClassLoader in the internal backtrace field.
Modifications:
- Add overrides of fillInStackTrace() when needed
- Move ThrowableUtil usage in the static methods
Result:
Fixes https://github.com/netty/netty/pull/10686
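A minimal sketch of such an override, using a hypothetical `SharedClosedException` (not one of the Netty classes touched here). Because no stack is captured, the shared instance holds no backtrace and is also cheaper to create.

```java
import java.io.IOException;

final class StacklessExceptionDemo {
    // Hypothetical exception type meant to be shared as a static instance.
    static final class SharedClosedException extends IOException {
        SharedClosedException() {
            super("closed");
        }

        // Skip stack capture so the shared instance keeps no backtrace of the creating call stack.
        @Override
        public Throwable fillInStackTrace() {
            return this;
        }
    }

    static final SharedClosedException INSTANCE = new SharedClosedException();
}
```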
Motivation:
We should use an initial buffer size which is >= 1500 (a common MTU setting) to reduce the need for memory copies when a new connection is established. This is especially interesting when SSL / TLS comes into the mix.
This was ported from swiftnio:
https://github.com/apple/swift-nio/pull/1641
Modifications:
Increase the initial size from 1024 to 2048.
Result:
Possibly fewer memory copies on new connections
Motivation:
Creating exceptions is expensive so we should only do so if really needed.
Modifications:
Only create the ConnectTimeoutException if we really need it.
Result:
Less overhead
Motivation:
In some benchmarks closing the Channel contributes a lot of overhead due to the call of fillInStackTrace(). We should reduce this overhead.
Modifications:
- Create a StacklessClosedChannelException and use it to reduce overhead.
- Only call ChannelOutboundBuffer.failFlushed(...) when there was a flushed message at all.
Result:
Less performance overhead when closing the Channel
Motivation:
I was working on the transport part in Netty (ofc, solving a major issue) and I found this typo so thought to fix it.
Modification:
Fixed Typo
Result:
No more confusion between `us` and `use`.
Motivation:
At some point a ChannelHandlerContext did have its own AttributeMap, which is no longer true since 4.1.x was released. Unfortunately we failed to update the javadocs, so they don't reflect reality.
Modifications:
Update javadocs
Result:
Fixes https://github.com/netty/netty/issues/10477
Motivation:
When using Netty's HTTP support we ran into an OOM. This problem was addressed in 4.1.51.Final by [#10424](https://github.com/netty/netty/issues/10424), but even once an OOM has occurred the service still receives new connection events, so the OOM occurs again and eventually connections are not released.
The allocation that fails is `byteBuf = allocHandle.allocate(allocator);`.
Modification:
If allocating the buffer fails while trying to receive new data, check whether the cause is an OutOfMemoryError and, if so, close on read so that the connection is released.
```java
if (close || cause instanceof OutOfMemoryError || cause instanceof IOException) {
    closeOnRead(pipeline);
}
```
Result:
Fixes [#10434](https://github.com/netty/netty/issues/10434).
Motivation:
When a switch statement is used we should always define a `default:` so we don't introduce bugs due to fall-through.
Modification:
Add missing `default:`s
Result:
Less error-prone code
Motivation:
It seems some users are surprised by the behaviour of DefaultEventExecutor when used within the ChannelPipeline. We should clarify the semantics and also mention UnorderedThreadPoolEventExecutor, which may be more in line with their expectations.
Modifications:
Add javadocs section about UnorderedThreadPoolEventExecutor and expand details for DefaultEventExecutor
Result:
Clarify semantics
Motivation:
AbstractCoalescingBufferQueue had a bug which could lead to an empty queue that still reported bytes left. This was due to the fact that we decremented the pending bytes before draining the queue one-by-one. The problem here is that while the queue is drained we may notify the promise, which may add buffers to the queue again, and we never decremented the pending bytes for these while draining them.
Modifications:
- Decrement the pending bytes every time we drain a buffer from the queue
- Add unit tests
Result:
Fixes https://github.com/netty/netty/issues/10286
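The accounting fix can be sketched with a simplified queue. `PendingBytesQueue` and its `Entry` type are hypothetical and only model sizes and callbacks, not ByteBufs or promises; the drain loop decrements per entry, so entries added re-entrantly by a callback are accounted for too.

```java
import java.util.ArrayDeque;

final class PendingBytesQueue {
    // Hypothetical entry: a payload size plus a callback run when the entry is drained.
    static final class Entry {
        final int bytes;
        final Runnable onDrained;

        Entry(int bytes, Runnable onDrained) {
            this.bytes = bytes;
            this.onDrained = onDrained;
        }
    }

    private final ArrayDeque<Entry> queue = new ArrayDeque<>();
    private int pendingBytes;

    void add(int bytes, Runnable onDrained) {
        queue.add(new Entry(bytes, onDrained));
        pendingBytes += bytes;
    }

    // Decrement per drained entry; the callback may call add(...) again during the drain.
    void drainAll() {
        Entry entry;
        while ((entry = queue.poll()) != null) {
            pendingBytes -= entry.bytes;
            entry.onDrained.run();
        }
    }

    int pendingBytes() { return pendingBytes; }

    boolean isEmpty() { return queue.isEmpty(); }
}
```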
Motivation:
In AbstractChannelHandlerContext we had some code where we tried to guard against endless loops caused by exceptions thrown by exceptionCaught(...) that would trigger exceptionCaught again. This code was problematic for two reasons:
- It is quite expensive as we need to compare all the stacks
- We may end up not notifying another handler's exceptionCaught(...) if in our execution stack we triggered actions that will cause an exceptionCaught somewhere else in the pipeline
Modifications:
- Just remove the detection code as we already handle everything correctly when we invoke exceptionCaught(...)
- Add unit tests
Result:
Ensure we always notify correctly and also fixes performance issue reported as https://github.com/netty/netty/issues/10165
Motivation:
The maxCount parameter needs a unit test.
Modifications:
1. Change the conditional statement so that an ineffective maxCount is avoided (this merely makes the code more robust).
2. Add a unit test for maxCount.
Result:
The maxCount parameter is covered by a test.
Motivation:
We need to ensure we always close the Channel when we see an exception during bootstrapping it.
Modification:
Ensure we correctly close the Channel if we see an exception while retrieving the Resolver from the group during bootstrapping.
Result:
Fixes #10109
Co-authored-by: Norman Maurer <norman_maurer@apple.com>
Motivation:
Netty's DefaultThreadFactory, which creates FastThreadLocalThread instances, is widely used in NioEventLoopGroup, EpollEventLoopGroup, etc., but not in OioEventLoopGroup. Although oio is quite stale, I still think this change may be useful.
Modification:
Replace oio's default thread factory with netty's DefaultThreadFactory just like NioEventLoopGroup, EpollEventLoopGroup, etc.
Result:
Faster access to FastThreadLocal in oio.
Motivation:
In general, debug logging is disabled in a production environment. However, logging without an external level check may still affect performance, as varargs need to allocate an array.
Modification:
Add a log level check before logging.
Result:
Slightly improved performance in a production environment.
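A minimal sketch of the guard, written against java.util.logging rather than Netty's InternalLogger so that it stays self-contained. The `GuardedLogging` class and its `logRead` method are hypothetical.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

final class GuardedLogging {
    // Returns true if the message was actually handed to the logger.
    static boolean logRead(Logger logger, String channel, int bytes, long nanos) {
        if (!logger.isLoggable(Level.FINE)) {
            // Level disabled: skip building the argument array (and boxing) entirely.
            return false;
        }
        logger.log(Level.FINE, "{0} read {1} bytes in {2} ns", new Object[] {channel, bytes, nanos});
        return true;
    }
}
```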
Motivation:
When the HttpContentCompressor is put on an alternate EventExecutor, the order of events should be
preserved so that the compressed content is correctly created.
Modifications:
- Check that the executor in the ChannelHandlerContext is the same executor as the current executor when evaluating if the handler should be skipped.
- Add unit test
Result:
Fixes https://github.com/netty/netty/issues/10067
Co-authored-by: Norman Maurer <norman_maurer@apple.com>
Motivation:
We should not include the number of ServerChannels that are part of the DefaultChannelGroup when specifying the initial size of the LinkedHashMap.
Modifications:
Only use the number of non-ServerChannels
Result:
Reduce memory-footprint
Motivation:
The body of the fillInStackTrace method of the IllegalStateException class in SimpleChannelPool.java is so simple (it just returns this) that there is no need to mark it as synchronized.
Modification:
Remove the synchronized modifier from the fillInStackTrace method.
Result:
Slightly better performance when creating an IllegalStateException instance.
Motivation:
The body of the fillInStackTrace method of the TimeoutException class in FixedChannelPool.java is so simple (it just returns this) that there is no need to mark it as synchronized.
Modification:
Remove the synchronized modifier from the fillInStackTrace method.
Result:
Slightly better performance when throwing a TimeoutException.
Motivation:
The current pipeline handler replace tests replace a handler with one of the same name.
We need tests that verify a handler can be renamed and that old handlers are removed from the pipeline.
Modification:
Add the missing coverage for renaming of handlers
Result:
Adds missing tests
Co-authored-by: phani254 <phani254@yahoo.com>
Motivation:
NioEventLoopTest.testChannelsRegistered sometimes fails due to a race which is related to how SelectionKey and Selector are implemented in the JDK. The current implementation removes SelectionKeys from the Set lazily, which means they may sometimes still be included when we use size() to get the number of SelectionKeys.
Modifications:
Just retry reading the number of registered Channels if we still see 2
Result:
Fixes https://github.com/netty/netty/issues/9895
Motivation:
https://github.com/netty/netty/pull/9458 changed how we handle ChannelOptions internally to use a ConcurrentHashMap. This unfortunately had the side effect that the ordering may be affected and is not stable anymore. The problem here is that sometimes we do validation based on two different ChannelOptions (for example we validate high and low watermarks against each other). Thus even if the user specified the options in the correct order we may fail to configure them.
Modifications:
- Use a LinkedHashMap again to preserve order
- Add unit test
Result:
Apply ChannelOptions in correct and expected order
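Why ordering matters can be shown with a small sketch. The `Config` class below is a hypothetical stand-in with a cross-field check in the spirit of the watermark validation; it is not Netty's ChannelConfig. With a LinkedHashMap the options are applied in insertion order, so a valid user-supplied order stays valid.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class OrderedOptions {
    // Hypothetical config where each setter validates against the other field.
    static final class Config {
        private int low = 32 * 1024;
        private int high = 64 * 1024;

        int low() { return low; }

        int high() { return high; }

        void set(String name, int value) {
            switch (name) {
                case "high":
                    if (value < low) {
                        throw new IllegalArgumentException("high < low");
                    }
                    high = value;
                    break;
                case "low":
                    if (value > high) {
                        throw new IllegalArgumentException("low > high");
                    }
                    low = value;
                    break;
                default:
                    throw new IllegalArgumentException("unknown option: " + name);
            }
        }
    }

    // LinkedHashMap iterates in insertion order, so options are applied as the user listed them.
    static Config apply(LinkedHashMap<String, Integer> options) {
        Config config = new Config();
        for (Map.Entry<String, Integer> e : options.entrySet()) {
            config.set(e.getKey(), e.getValue());
        }
        return config;
    }
}
```

Raising the high mark before the low mark succeeds; the reverse order fails the check against the default high mark.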
Motivation
This test is failing intermittently and upon inspection has a race
condition.
Modification
Fix race condition by waiting for async release calls to complete prior
to closing the pool.
Result
Hopefully fixed flakey test
Motivation:
The Channel Pool tests commonly use the same fixed local address String. This
has been observed to result in test failures if a single test fails and cleanup
is not done properly, or if tests are run in parallel. Also each test should
close any channel pool objects or maps to make sure resources are reclaimed.
Modifications:
- Use a random string for the address on each test to reduce the chance of
collision.
- Close all ChannelPool and ChannelPoolMap objects at the end of each test
Result:
Less likely to observe spurious failures due to LocalAddress collisions and more
complete test cleanup.
Motivation:
In #9830 the get/remove/close methods implementation changed to avoid
deadlocks on event loops. The change involved modifying the methods to
close the managed ChannelPools asynchronously and return immediately.
While this behavior might be fine for get/remove, it is changing what
a user expects from a close() method and after returning from close()
there might be still resources open.
Modifications:
This change is a follow-up for #9830 to preserve the synchronous
behavior of the AbstractChannelPoolMap#close() method.
Result:
AbstractChannelPoolMap#close() returns once the managed pools have
been closed.
Motivation:
97361fa2c8 replaced synchronized with ConcurrentHashMap in the *Bootstrap classes but missed doing the same for the Http2 variant.
Modifications:
- Use ConcurrentHashMap
- Simplify code in *Bootstrap classes
Result:
Less contention
Motivation:
In certain scenarios multiple concurrent AbstractChannelPoolMap
operations might be called from event loops that handle also
ChannelPool close operations. If the map uses synchronous close
it could end up blocking the event loop and if multiple threads
are waiting for each other a deadlock might occur.
Modifications:
Previously #9226 introduced a closeAsync operation for
FixedChannelPool, which is now extended to SimpleChannelPool class.
The AbstractChannelPoolMap now uses the closeAsync operations when
closing redundant or removed SimpleChannelPool instances.
Result:
The AbstractChannelPoolMap get/remove operations will not wait
until the pools are closed as they will happen asynchronously and
avoid situations that could cause the event loop being blocked in
deadlocks.
Motivation
#9800 was just merged which consolidates the flush/no-flush WriteTasks
in AbstractChannelHandlerContext, but after looking at the changes again
I noticed a tiny simplification that would be good to make imo.
Modification
Remove use of conditional operator in decrementPendingOutboundBytes()
Result
Simpler code, one less branch
Motivation
The event loop implementations had become somewhat tangled over time and
work was done recently to streamline EpollEventLoop. NioEventLoop would
benefit from the same treatment and it is more straightforward now that
we can follow the same structure as was done for epoll.
Modifications
Untangle NioEventLoop logic and mirror what's now done in EpollEventLoop
w.r.t. the volatile selector wake-up guard and scheduled task deadline
handling.
Some common refinements to EpollEventLoop have also been included - to
use constants for the "special" deadline/wakeup volatile values and to
avoid some unnecessary calls to System.nanoTime() on task-only
iterations.
Result
Hopefully cleaner, more efficient and less fragile NIO transport
implementation.
Motivation
AbstractChannelHandlerContext uses recyclable tasks when performing
writes from outside of the event loop. There are currently two distinct
classes WriteTask and WriteAndFlushTask used for executing writes versus
writeAndFlushes, and these are recycled in separate pools. However it is
straightforward to just have a single class / recycler pool, with a
flush flag.
Modifications
- Unify WriteTasks into a single class using the sign bit of the
existing size field to indicate whether a flush should be performed
- Use the new executor lazyExecute() method to lazily execute the
non-flush write tasks explicitly
- Change AbstractChannelHandlerContext#invokeWrite and
AbstractChannelHandlerContext#invokeWriteAndFlush from private to
package-private to avoid synthetic methods
- Correct the default object size estimate for WriteTask
Results
- Possibly improved reuse of recycled write tasks
- Fewer virtual method calls and shorter path lengths
- Less code
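The sign-bit encoding mentioned in the modifications can be sketched as follows. `WriteTaskSize` is a hypothetical helper that only shows the bit manipulation; it assumes sizes are never negative and is not the code in AbstractChannelHandlerContext.

```java
final class WriteTaskSize {
    // Pack a non-negative size and a flush flag into one int: the sign bit carries the flag.
    static int pack(int size, boolean flush) {
        if (size < 0) {
            throw new IllegalArgumentException("size must be >= 0");
        }
        return flush ? size | Integer.MIN_VALUE : size;
    }

    // The flag is set exactly when the packed value is negative.
    static boolean flush(int packed) {
        return packed < 0;
    }

    // Mask off the sign bit to recover the size.
    static int size(int packed) {
        return packed & Integer.MAX_VALUE;
    }
}
```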
Motivation:
In most cases, we want to use a MultithreadEventLoopGroup such as NioEventLoopGroup without setting the number of threads, only the thread name. To do so we need code like the following:
NioEventLoopGroup boss = new NioEventLoopGroup(0, new DefaultThreadFactory("boss"));
The 0 looks confusing, since we only want to set the thread name. It would be better to add a new constructor for this case.
Modifications:
Add a new constructor to all event loop groups, for example: public NioEventLoopGroup(ThreadFactory threadFactory)
Result:
Users can set only the thread factory, without passing 0 as the number of threads:
NioEventLoopGroup boss = new NioEventLoopGroup(new DefaultThreadFactory("boss"));
Motivation:
https://github.com/netty/netty/pull/9548 introduced a change that creates a new AttributeKey
for each SimpleChannelPool instance created. AttributeKeys are cached statically in a ConstantPool
by the AttributeKey.newInstance method. Because of this, creating a SimpleChannelPool instance will
allocate memory that will never be released, even after the SimpleChannelPool is closed.
Modifications:
This change goes back to a single AttributeKey per SimpleChannelPool, just using a more specific
name to reduce the chance of conflicts with user code.
Result:
No memory is leaked after a SimpleChannelPool instance is created and destroyed.
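The leak pattern can be sketched with a hypothetical static pool. `KeyPoolDemo` below is a stand-in for a statically cached constant pool and does not use AttributeKey or ConstantPool; it only shows that per-instance names grow the pool without bound while a shared key does not.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

final class KeyPoolDemo {
    // Hypothetical stand-in for a static constant pool: entries are never removed.
    private static final ConcurrentMap<String, Object> POOL = new ConcurrentHashMap<>();
    private static final AtomicInteger NEXT_ID = new AtomicInteger();

    // One shared key, created once, regardless of how many pools exist.
    private static final Object SHARED_KEY = valueOf("KeyPoolDemo.sharedKey");

    static Object valueOf(String name) {
        return POOL.computeIfAbsent(name, n -> new Object());
    }

    static int poolSize() {
        return POOL.size();
    }

    // Leaky variant: every instance registers a new, uniquely named key.
    static final class PerInstanceKeyPool {
        final Object key = valueOf("KeyPoolDemo.pool#" + NEXT_ID.incrementAndGet());
    }

    // Fixed variant: all instances reuse the same key.
    static final class SharedKeyPool {
        final Object key = SHARED_KEY;
    }
}
```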
Motivation
This is already done internally for various reasons but it would make
sense i.m.o. as a top level concept: submitting a task to be run on the
event loop which doesn't need to run immediately but must still be
executed in FIFO order relative to all other submitted tasks (be those
"lazy" or otherwise).
It's nice to separate this abstract "relaxed" semantic from concrete
implementations - the simplest is to just delegate to existing execute,
but for the main EL impls translates to whether a wakeup is required
after enqueuing.
Having a "global" abstraction also allows for simplification of our
internal use - for example encapsulating more of the common scheduled
future logic within AbstractScheduledEventExecutor.
Modifications
- Introduce public LazyRunnable interface and
AbstractEventExecutor#lazyExecute method (would be nice for this to be
added to EventExecutor interface in netty 5)
- Tweak existing SingleThreadEventExecutor mechanics to support these
- Replace internal use of NonWakeupRunnable (such as for pre-flush
channel writes)
- Uplift scheduling-related hooks into AbstractScheduledEventExecutor,
eliminating intermediate executeScheduledRunnable method
Result
Simpler code, cleaner and more useful/flexible abstractions - cleaner in
that they fully communicate the intent in a more general way, without
implying/exposing/restricting implementation details
Motivation:
At the moment we directly extend the Recycler base class in our code, which makes it hard to experiment with different object pool implementations. It would be nice to be able to switch from one to another by using a system property in the future. This would also make it easier to test things like https://github.com/netty/netty/pull/8052.
Modifications:
- Introduce ObjectPool class with static method that we now use internally to obtain an ObjectPool implementation.
- Wrap the Recycler into an ObjectPool and return it for now
Result:
Preparation for different ObjectPool implementations
Motivation:
We can use the `@SuppressJava6Requirement` annotation to be more precise about when we use Java6+ APIs. This helps us to ensure we always protect these places.
Modifications:
Make use of `@SuppressJava6Requirement` explicit
Result:
Fixes https://github.com/netty/netty/issues/2509.
Motivation
NioEventLoopTest#testChannelsRegistered() fails intermittently due to
use of SingleThreadEventLoop#channelsRegistered() which is not
threadsafe and unreliable when called from outside the event loop.
Modifications
Add static registeredChannels method to NioEventLoopTest and
AbstractSingleThreadEventLoopTest to call from the tests via event loop
instead of directly.
Result
Hopefully fewer test failures
Motivation
The current event loop shutdown logic is quite fragile and in the
epoll/NIO cases relies on the default 1 second wait/select timeout that
applies when there are no scheduled tasks. Without this default timeout
the shutdown would hang indefinitely.
The timeout only takes effect in this case because queued scheduled
tasks are first cancelled in
SingleThreadEventExecutor#confirmShutdown(), but I _think_ even this
isn't robust, since the main task queue is subsequently serviced which
could result in some new scheduled task being queued with much later
deadline.
It also means shutdowns are unnecessarily delayed by up to 1 second.
Modifications
- Add/extend unit tests to expose the issue
- Adjust SingleThreadEventExecutor shutdown and confirmShutdown methods
to explicitly add no-op tasks to the taskQueue so that the subsequent
event loop iteration doesn't enter blocking wait (as looks like was
originally intended)
Results
Faster and more robust shutdown of event loops, allows removal of the
default wait timeout
Motivation:
On JDK > 9 Netty uses Unsafe to write two internal JDK fields: sun.nio.ch.SelectorImpl.selectedKeys and sun.nio.ch.SelectorImpl.publicSelectedKeys. This is done in transport/src/main/java/io/netty/channel/nio/NioEventLoop.java:225, in the openSelector() method. The GraalVM analysis cannot do the Unsafe registration automatically because the object field offset computation is hidden behind two layers of calls.
Modifications:
This PR updates the Netty GraalVM configuration by registering those fields for unsafe access.
Result:
Improved support for Netty on GraalVM with JDK > 9.
Motivation:
Due to a bug we did not always correctly calculate the next buffer size in AdaptiveRecvByteBufAllocator.
Modification:
Fix calculation and add unit test
Result:
Correct calculation is always used.
Motivation:
We should correctly reset the cached local and remote address when a Channel.disconnect() is called and the channel has a notion of disconnect vs close (for example DatagramChannel implementations).
Modifications:
- Correctly reset cached local and remote address
- Update testcase to cover it and so ensure all transports work in a consistent way
Result:
Correctly handle disconnect()
Motivation:
calculateMaxBytesPerGatheringWrite() contains duplicated calculation: getSendBufferSize() << 1
Modifications:
Remove the duplicated calculation
Result:
The method is clearer
Motivation:
SimpleChannelPool's POOL_KEY attribute name, channelPool, easily conflicts with user code, which throws an exception: 'channelPool' is already in use. As this is a generic framework, it would be great to give the attribute a unique name, perhaps using a UUID, since the name is not needed later.
Modifications:
This change makes sure that the POOL_KEY used inside SimpleChannelPool is unique by appending the object hashcode to the name.
Result:
No unwanted channel attribute name conflict with user code.
Motivation:
Following up on discussion with @normanmaurer with suggestion to improve code clarity.
Modification:
Method is synchronized, no need for assert or verbose sync blocks around calls.
Result:
Reduce verbosity and more idiomatic use of keyword. Also rename the method to better describe what it's for.
Motivation:
We need to update the doubly-linked list nodes while holding a lock via synchronized in all cases, as otherwise we may end up with a corrupted pipeline. We missed this when calling remove0(...) due to handlerAdded(...) throwing an exception.
Modifications:
- Correctly hold the lock while updating the node
- Add assert
- Add unit test
Result:
Fixes https://github.com/netty/netty/issues/9528
Motivation:
There are some extra log level checks (logger.isWarnEnabled()).
Modification:
Remove log level checks (logger.isWarnEnabled()) from io.netty.channel.epoll.AbstractEpollStreamChannel, io.netty.channel.DefaultFileRegion, io.netty.channel.nio.AbstractNioChannel, io.netty.util.HashedWheelTimer, io.netty.handler.stream.ChunkedWriteHandler and io.netty.channel.udt.nio.NioUdtMessageConnectorChannel
Result:
Fixes #9456
Motivation
The epoll transport was updated in #7834 to decouple setting of the
timerFd from the event loop, so that scheduling delayed tasks does not
require waking up epoll_wait. To achieve this, new overridable hooks
were added in the AbstractScheduledEventExecutor and
SingleThreadEventExecutor superclasses.
However, the minimumDelayScheduledTaskRemoved hook has no current
purpose and I can't envisage a _practical_ need for it. Removing
it would reduce complexity and avoid supporting this specific
API indefinitely. We can add something similar later if needed
but the opposite is not true.
There also isn't a _nice_ way to use the abstractions for
wakeup-avoidance optimizations in other EventLoops that don't have a
decoupled timer.
This PR replaces executeScheduledRunnable and
wakesUpForScheduledRunnable
with two new methods before/afterFutureTaskScheduled that have slightly
different semantics:
- They only apply to additions; given the current internals there's no
practical use for removals
- They allow per-submission wakeup decisions via a boolean return val,
which makes them easier to exploit from other existing EL impls (e.g.
NIO/KQueue)
- They are subjectively "cleaner", taking just the deadline parameter
and not exposing Runnables
- For current EL/queue impls, only the "after" hook is really needed,
but specialized blocking queue impls can conditionally wake on task
submission (I have one lined up)
Also included are further optimization/simplification/fixes to the
timerFd manipulation logic.
Modifications
- Remove AbstractScheduledEventExecutor#minimumDelayScheduledTaskRemoved()
and supporting methods
- Uplift NonWakeupRunnable and corresponding default wakesUpForTask()
impl from SingleThreadEventLoop to SingleThreadEventExecutor
- Change executeScheduledRunnable() to be package-private, and have a
final impl in SingleThreadEventExecutor which triggers new overridable
hooks before/afterFutureTaskScheduled()
- Remove unnecessary use of bookend tasks while draining the task queue
- Use new hooks to add simpler wake-up avoidance optimization to
NioEventLoop (primarily to demonstrate utility/simplicity)
- Reinstate removed EpollTest class
In EpollEventLoop:
- Refactor to use only the new afterFutureTaskScheduled() hook for
updating timerFd
- Fix setTimerFd race condition using a monitor
- Set nextDeadlineNanos to a negative value while the EL is awake and
use this to block timer changes from outside the EL. Restore the
known-set value prior to sleeping, updating timerFd first if necessary
- Don't read from timerFd when processing expiry event
Result
- Cleaner API for integrating with different EL/queue timing impls
- Fixed race condition to avoid missing scheduled wakeups
- Eliminate unnecessary timerFd updates while EL is awake, and
unnecessary expired timerFd reads
- Avoid unnecessary scheduled-task wakeups when using NIO transport
I did not yet further explore the suggestion of using
TFD_TIMER_ABSTIME for the timerFd.
Motivation:
In AbstractBootstrap, options and attrs are LinkedHashMaps that are synchronized on for every read, copy/clone and write operation.
When a lot of connections are triggered concurrently on the same bootstrap instance, the synchronized blocks lead to contention, Netty IO threads get blocked, and performance may be severely degraded.
Modifications:
Use ConcurrentHashMap
Result:
Less contention. Fixes https://github.com/netty/netty/issues/9426
Motivation:
EPOLL supports decoupling the timed wakeup mechanism from the selector call. The EPOLL transport takes advantage of this in order to offer more fine-grained timer resolution. However we are currently calling timerfd_settime on each call to epoll_wait and this is expensive. We don't have to re-arm the timer on every call to epoll_wait and instead only have to arm the timer when a task is scheduled with an earlier expiration than any other existing scheduled task.
Modifications:
- Before scheduled tasks are added to the task queue, we determine if the new
duration is the soonest to expire, and if so update with timerfd_settime. We
also drain all the tasks at the end of the event loop to make sure we service
any expired tasks and get an accurate next time delay.
- EpollEventLoop maintains a volatile variable which represents the next deadline to expire. This variable is modified inside the event loop thread (before calling epoll_wait) and outside the event loop thread (immediately to ensure proper wakeup time).
- Execute the task queue before the schedule task priority queue. This means we
may delay the processing of scheduled tasks but it ensures we transfer all
pending tasks from the task queue to the scheduled priority queue to run the
soonest to expire scheduled task first.
- Deprecate IORatio on EpollEventLoop, and drain the executor and scheduled queue on each event loop wakeup. Coupling the amount of time we are allowed to drain the executor queue to a proportion of time we process inbound IO may lead to unbounded queue sizes and unpredictable latency.
Result:
Fixes https://github.com/netty/netty/issues/7829
- In most cases this results in fewer calls to timerfd_settime
- Fewer event loop wakeups just to check for scheduled tasks executed outside the event loop
- More predictable executor queue and scheduled task queue draining
- More accurate and responsive scheduled task execution
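The "only re-arm when the new deadline is the earliest" rule can be sketched in isolation. `DeadlineTracker` is hypothetical: it uses an AtomicLong with compare-and-set and a counter in place of the real timerfd_settime call, and is not the EpollEventLoop implementation.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

final class DeadlineTracker {
    private static final long NONE = Long.MAX_VALUE;

    private final AtomicLong nextDeadlineNanos = new AtomicLong(NONE);
    // Counts simulated timer re-arms; stands in for the expensive syscall.
    private final AtomicInteger armCount = new AtomicInteger();

    // Called when a task is scheduled; re-arms only if the new deadline is the earliest.
    boolean schedule(long deadlineNanos) {
        for (;;) {
            long current = nextDeadlineNanos.get();
            if (deadlineNanos >= current) {
                return false; // an earlier (or equal) wakeup is already armed
            }
            if (nextDeadlineNanos.compareAndSet(current, deadlineNanos)) {
                armCount.incrementAndGet();
                return true;
            }
        }
    }

    int armCount() {
        return armCount.get();
    }

    long nextDeadlineNanos() {
        return nextDeadlineNanos.get();
    }
}
```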
Motivation:
It looks like `EmbeddedChannelPipeline` should also override `onUnhandledInboundMessage(ChannelHandlerContext ctx, Object msg)` so that it does not print "Discarded message pipeline", because in the case of `EmbeddedChannelPipeline` no discarding actually happens.
This fixes the following warning in the latest netty version with websocket and `WebSocketServerCompressionHandler`:
```
13:36:36.231 DEBUG- Decoding WebSocket Frame opCode=2
13:36:36.231 DEBUG- Decoding WebSocket Frame length=5
13:36:36.231 DEBUG- Discarded message pipeline : [JdkZlibDecoder#0, DefaultChannelPipeline$TailContext#0]. Channel : [id: 0xembedded, L:embedded - R:embedded].
```
Modification:
Override correct method
Result:
Follow up fix after https://github.com/netty/netty/pull/9286
Motivation:
On servers with many pipelines or dynamic pipelines, it is easy for an end user to make a mistake during pipeline configuration. The current message:
`Discarded inbound message PooledUnsafeDirectByteBuf(ridx: 0, widx: 2, cap: 2) that reached at the tail of the pipeline. Please check your pipeline configuration.`
is not always meaningful and doesn't allow finding the wrong pipeline quickly.
Modification:
Added an additional log placeholder that identifies the pipeline handlers and channel info. This allows end users to quickly find the problem pipeline.
Result:
Meaningful warning when the message reaches the end of the pipeline. Fixes #7285
Motivation:
We sometimes observed test failures in the CI which happened if sc.close() had not completed before the next test ran. If this happened the bind(...) would fail as the LocalAddress was still in use.
Modifications:
Await the close before returning
Result:
Fixes a race in the testsuite which caused FixedChannelPoolTest.testAcquireNewConnection to fail if FixedChannelPoolTest.testCloseAsync() ran before it.