Motivation:
PlatformDependent#getSystemClassLoader may throw a wide variety of exceptions based upon the environment. We should handle all exceptions and continue initializing the slow path if an exception occurs.
Modifications:
- Catch Throwable in cases where PlatformDependent#getSystemClassLoader is used
Result:
Fixes https://github.com/netty/netty/issues/6038
Motivation:
When using java.nio.DatagramChannel we should not close the channel when a SocketException was thrown as we can still use the channel.
Modifications:
Not close the Channel when SocketException is thrown
Result:
More robust and correct handling of exceptions when using NioDatagramChannel.
Motivation:
If an exception is thrown while processing the ready channels in the EventLoop we should still run all tasks as this may allow to recover. For example a OutOfMemoryError may be thrown and runAllTasks() will free up memory again. Beside this we should also ensure we always allow to shutdown even if an exception was thrown.
Modifications:
- Call runAllTasks() in a finally block
- Ensure shutdown is always handles.
Result:
More robust EventLoop implementations for NIO and Epoll.
Motivation:
We should better first process OP_WRITE before OP_READ as this may allow us to free memory in a faster fashion for previous queued writes.
Modifications:
Process OP_WRITE before OP_READ
Result:
Free memory faster for queued writes.
Motivation:
The JDK implementation of SocketChannel has an internal state that is tracked for its operations. Because of this we need to ensure we call finishConnect() before try to call read(...) / write(...) as otherwise it may produce a NotYetConnectedException.
Modifications:
First process OP_CONNECT flag.
Result:
No more possibility of NotYetConnectedException because OP_CONNECT is handled not early enough when processing interestedOps for a Channel.
Motivation:
When attempting to set the selectedKeys fields on the selector
implementation, JDK 9 can throw an inaccessible object exception.
Modications:
Catch and log this exception as an possible course of action if the
sun.nio.ch package is not exported from java.base.
Result:
The selector replacement will fail gracefully as an expected course of
action if the sun.nio.ch package is not exported from java.base.
Motivation:
The NIO transport used an IllegalStateException if a user tried to issue another connect(...) while the connect was still in process. For this case the JDK specified a ConnectPendingException which we should use. The same issues exists in the EPOLL transport. Beside this the EPOLL transport also does not throw the right exceptions for ENETUNREACH and EISCONN errno codes.
Modifications:
- Replace IllegalStateException with ConnectPendingException in NIO and EPOLL transport
- throw correct exceptions for ENETUNREACH and EISCONN in EPOLL transport
- Add test case
Result:
More correct error handling for connect attempts when using NIO and EPOLL transport
Motivation:
We need to ensure we also call fireChannelActive() if the Channel is directly closed in a ChannelFutureListener that is belongs to the promise for the connect. Otherwise we will see missing active events.
Modifications:
Ensure we always call fireChannelActive() if the Channel was active.
Result:
No missing events.
Motivation:
Instrumenting the NIO selector implementation requires special
permissions. Yet, the code for performing this instrumentation is
executed in a manner that would require all code leading up to the
initialization to have the requisite permissions. In a restrictive
environment (e.g., under a security policy that only grants the
requisite permissions the Netty transport jar but not to application
code triggering the Netty initialization), then instrumeting the
selector will not succeed even if the security policy would otherwise
permit it.
Modifications:
This commit marks the necessary blocks as privileged. This enables
access to the necessary resources for instrumenting the selector. The
idea is that we are saying the Netty code is trusted, and as long as the
Netty code has been granted the necessary permissions, then we will
allow the caller access to these resources even though the caller itself
might not have the requisite permissions.
Result:
The selector can be instrumented in a restrictive security environment.
Motivation:
Writing to a system property requires permissions. Yet the code for
setting sun.nio.ch.bugLevel is not marked as privileged. In a
restrictive environment (e.g., under a security policy that only grants
the requisite permissions the Netty transport jar but not to application
code triggering the Netty initialization), writing to this system
property will not succeed even if the security policy would otherwise
permit it.
Modifications:
This commt marks the necessary code block as privileged. This enables
writing to this system property. The idea is that we are saying the
Netty code is trusted, and as long as the Netty code has been granted
the necessary permissions, then we will allow the caller access to these
resources even though the caller itself might not have the requisite
permissions.
Result:
The system property sun.nio.ch.bugLevel can be written to in a
restrictive security environment.
Motivation:
In 4.0 AbstractNioByteChannel has a default of 16 max messages per read. However in 4.1 that constraint was applied at the NioSocketChannel which is not equivalent. In 4.1 AbstractEpollStreamChannel also did not have the default of 16 max messages per read applied.
Modifications:
- Make Nio consistent with 4.0
- Make Epoll consistent with Nio
Result:
Nio and Epoll both have consistent ChannelMetadata and are consistent with 4.0.
Motiviation:
Sometimes it is useful to allow to specify a custom strategy to handle rejected tasks. For example if someone tries to add tasks from outside the eventloop it may make sense to try to backoff and retries and so give the executor time to recover.
Modification:
Add RejectedEventExecutor interface and implementations and allow to inject it.
Result:
More flexible handling of executor overload.
Motivation:
To restrict the memory usage of a system it is sometimes needed to adjust the number of max pending tasks in the tasks queue.
Modifications:
- Add new constructors to modify the number of allowed pending tasks.
- Add system properties to configure the default values.
Result:
More flexible configuration.
Motivation:
If a user writes an own nio based transport which uses a special SelectorProvider it is useful to be able to get the SelectorProvider that is used by a NioEventLoop. This way this can be used when implement AbstractChannel.isCompatible(...) and check that the SelectorProvider is the correct one.
Modifications:
Expose the SelectorProvider.
Result:
Be able to get the SelectorProvider used by a NioEventLoop.
Motivation:
We use pre-instantiated exceptions in various places for performance reasons. These exceptions don't include a stacktrace which makes it hard to know where the exception was thrown. This is especially true as we use the same exception type (for example ChannelClosedException) in different places. Setting some StackTraceElements will provide more context as to where these exceptions original and make debugging easier.
Modifications:
Set a generated StackTraceElement on these pre-instantiated exceptions which at least contains the origin class and method name. The filename and linenumber are specified as unkown (as stated in the javadocs of StackTraceElement).
Result:
Easier to find the origin of a pre-instantiated exception.
Motivation:
To better debug why a Selector need to be rebuild it is useful to also log the instance of the Selector.
Modifications:
Add logger instance to the log message.
Result:
More useful log message.
Motivation:
JCTools supports both non-unsafe, unsafe versions of queues and JDK6 which allows us to shade the library in netty-common allowing it to stay "zero dependency".
Modifications:
- Remove copy paste JCTools code and shade the library (dependencies that are shaded should be removed from the <dependencies> section of the generated POM).
- Remove usage of OneTimeTask and remove it all together.
Result:
Less code to maintain and easier to update JCTools and less GC pressure as the queue implementation nt creates so much garbage
Motivation:
Sometimes it may be benefitially for an user to specify a custom algorithm when choose the next EventExecutor/EventLoop.
Modifications:
Allow to specify a custom EventExecutorChooseFactory that allows to customize algorithm.
Result:
More flexible api.
Motivation:
SingleThreadEventExecutor.pendingTasks() will call taskQueue.size() to get the number of pending tasks in the queue. This is not safe when using MpscLinkedQueue as size() is only allowed to be called by a single consumer.
Modifications:
Ensure size() is only called from the EventLoop.
Result:
No more livelock possible when call pendingTasks, no matter from which thread it is done.
Motivation:
The DuplexChannel is currently incomplete and only supports shutting down the output side of a channel. This interface should also support shutting down the input side of the channel.
Modifications:
- Add shutdownInput and shutdown methods to the DuplexChannel interface
- Remove state in NIO and OIO for tracking input being shutdown independent of the underlying transport's socket type. Tracking the state independently may lead to inconsistent state.
Result:
DuplexChannel supports shutting down the input side of the channel
Fixes https://github.com/netty/netty/issues/5175
Motivation:
If a task was submitted when wakenUp value was true, the task didn't get a chance to call Selector#wakeup.
So we need to check task queue again before executing select operation. If we don't, the task might be pended until select operation was timed out.
It might be pended until idle timeout if IdleStateHandler existed in pipeline.
Modifications:
Execute Selector#select in a non-blocking manner if there's a task submitted when wakenUp value was true.
Result:
Every tasks in NioEventLoop will not be pended.
Motivation:
EventExecutor.children uses generics in such a way that an entire colleciton must be cast to a specific type of object. This interface is not very flexible and is impossible to implement if the EventExecutor type must be wrapped. The current usage of this method also does not have any clear need within Netty. The Iterator interface allows for EventExecutor to be wrapped and forces the caller to make assumptions about types instead of building the assumptions into the interface.
Motivation:
- Remove EventExecutor.children and undeprecate the iterator() interface
Result:
EventExecutor interface has one less method and is easier to wrap.
Motivation:
If a channel is deregistered from an NioEventLoop the associated SelectionKey is cancelled. If the NioEventLoop has yet to process a pending event as a result of that SelectionKey then the NioEventLoop will see the SelecitonKey is invalid and close the channel. The NioEventLoop should not close a channel if it is not registered with that NioEventLoop.
Modifications:
- NioEventLoop.processSelectedKeys should check that the channel is still registered to itself before closing the channel
Result:
NioEventLoop doesn't close a channel that is no longer registered to it when the SelectionKey is invalid
Fixes https://github.com/netty/netty/issues/5125
Motivation:
Commit 0bc93dd introduced a potential assertion failure, if the deprecated method would be used.
Modifications:
Fix the potential assertion error.
Result:
Regression removed
Motivation:
Prior to 5b48fc284e setting readPending to false when autoReadClear was called was enough because when/if the EventLoop woke up with a read event we would first check if readPending was true and if it is not, we would return early. Now that readPending will only be set in the EventLoop it is not necessary to check readPending at the start of the read methods, and we should instead remove the OP_READ from the SelectionKey when we also set readPending to false.
Modifications:
- autoReadCleared should call AbstractNioUnsafe.removeReadOp
Result:
NIO is now consistent with EPOLL and removes the READ operation on the selector when autoRead is cleared.
Motivation:
OIO/NIO use a volatile variable to track if a read is pending. EPOLL does not use a volatile an executes a Runnable on the event loop thread to set readPending to false. These mechansims should be consistent, and not using a volatile variable is preferable because the variable is written to frequently in the event loop thread.
OIO also does not set readPending to false before each fireChannelRead operation and may result in reading more data than the user desires.
Modifications:
- OIO/NIO should not use a volatile variable for readPending
- OIO should set readPending to false before each fireChannelRead
Result:
OIO/NIO/EPOLL are more consistent w.r.t. readPending and volatile variable operations are reduced
Fixes https://github.com/netty/netty/issues/5069
Motivation:
441aa4c575 introduced a bug in transport-native-epoll where readPending is set to false before a read is attempted, but this should happen before fireChannelRead is called. The NIO transport also only sets the readPending variable to false on the first read in the event loop. This means that if the user only calls read() on the first channelRead(..) the select loop will still listen for read events even if the user does not call read() on subsequent channelRead() or channelReadComplete() in the same event loop run. If the user only needs 2 channelRead() calls then by default they will may get 14 more channelRead() calls in the current event loop, and then 16 more when the event loop is woken up for a read event. This will also read data off the TCP stack and allow the peer to queue more data in the local RECV buffers.
Modifications:
- readPending should be set to false before each call to channelRead()
- make NIO readPending set to false consistent with EPOLL
Result:
NIO and EPOLL transport set readPending to false at correct times which don't read more data than intended by the user.
Fixes https://github.com/netty/netty/issues/5082
Motivation:
There is a spelling error in FileRegion.transfered() as it should be transferred().
Modifications:
Deprecate old method and add a new one.
Result:
Fix typo and can remove the old method later.
Motivation:
NIO now supports a pluggable select strategy, but EPOLL currently doesn't support this. We should strive for feature parity for EPOLL.
Modifications:
- Add SelectStrategy to EPOLL transport.
Result:
EPOLL transport supports SelectStategy.
Motivation:
Under high throughput/low latency workloads, selector wakeups are
degrading performance when the incoming operations are triggered
from outside of the event loop. This is a common scenario for
"client" applications where the originating input is coming from
application threads rather from the socket attached inside the
event loops.
As a result, it can be desirable to defer the blocking select
so that incoming tasks (write/flush) do not need to wakeup
the selector.
Modifications:
This changeset adds the notion of a generic SelectStrategy which,
based on its contract, allows the implementation to optionally
defer the blocking select based on some custom criteria.
The default implementation resembles the original behaviour, that
is if tasks are in the queue `selectNow()` and move on, and if no
tasks need to be processed go into the blocking select and wait
for wakeup.
The strategy can be customized per `NioEventLoopGroup` in the
constructor.
Result:
High performance client applications are now given the chance to
customize for how long the actual selector blocking should be
deferred by employing a custom select strategy.
Motivation:
cf171ff525 introduced a change in behavior when dealing with closing channel in the read loop. This changed behavior may use stale state to determine if a channel should be shutdown and may be incorrect.
Modifications:
- Revert the usage of potentially stale state
Result:
Closing a channel in the read loop is based upon current state instead of potentially stale state.
Motivation:
The prefix fix of #2363 did not correctly handle the case when selectAgain is true and so missed to null out entries.
Modifications:
Move the i++ from end of loop to beginning of loop
Result:
Entries in the array will be null out so allow to have these GC'ed once the Channel close
Motivation:
writeBytes(...) missed to set EPOLLOUT flag when not all bytes were written. This could lead to have the EpollEventLoop not try to flush the remaining bytes once the socket becomes writable again.
Modifications:
- Move setting EPOLLOUT flag logic to one point so we are sure we always do it.
- Move OP_WRITE flag logic to one point as well.
Result:
Correctly try to write pending data if socket becomes writable again.
Motiviation:
The current read loops don't fascilitate reading a maximum amount of bytes. This capability is useful to have more fine grain control over how much data is injested.
Modifications:
- Add a setMaxBytesPerRead(int) and getMaxBytesPerRead() to ChannelConfig
- Add a setMaxBytesPerIndividualRead(int) and getMaxBytesPerIndividualRead to ChannelConfig
- Add methods to RecvByteBufAllocator so that a pluggable scheme can be used to control the behavior of the read loop.
- Modify read loop for all transport types to respect the new RecvByteBufAllocator API
Result:
The ability to control how many bytes are read for each read operation/loop, and a more extensible read loop.
Motivation:
Because of a bug we missed to fail the connect future when doClose() is called. This can lead to a future which is never notified and so may lead to deadlocks in user-programs.
Modifications:
Correctly fail the connect future when doClose() is called and the connection was not established yet.
Result:
Connect future is always notified.
Motivation:
If SO_LINGER is used shutdownOutput() and close() syscalls will block until either all data was send or until the timeout exceed. This is a problem when we try to execute them on the EventLoop as this means the EventLoop may be blocked and so can not process any other I/O.
Modifications:
- Add AbstractUnsafe.closeExecutor() which returns null by default and use this Executor for close if not null.
- Override the closeExecutor() in NioSocketChannel and EpollSocketChannel and return GlobalEventExecutor.INSTANCE if getSoLinger() > 0
- use closeExecutor() in shutdownInput(...) in NioSocketChannel and EpollSocketChannel
Result:
No more blocking of the EventLoop if SO_LINGER is used and shutdownOutput() or close() is called.
Motivation:
As the ByteBuf is not set to null after release it we may try to release it again in handleReadException()
Modifications:
- set ByteBuf to null to avoid another byteBuf.release() to be called in handleReadException()
Result:
No IllegalReferenceCountException anymore
Related: #2964
Motivation:
Writing a zero-length FileRegion to an NIO channel will lead to an
infinite loop.
Modification:
- Do not write a zero-length FileRegion by protecting with proper 'if'.
- Update the testsuite
Result:
Another bug fixed
Motivation:
When a datagram packet is sent to a destination where nobody actually listens to,
the server O/S will respond with an ICMP Port Unreachable packet.
The ICMP Port Unreachable packet is translated into PortUnreachableException by JDK.
PortUnreachableException is not a harmful exception that prevents a user from sending a datagram.
Therefore, we should not close a datagram channel when PortUnreachableException is caught.
Modifications:
- Do not close a channel when the caught exception is PortUnreachableException.
Result:
A datagram channel is not closed unexpectedly anymore.
Motivation:
JDK's exception messages triggered by a connection attempt failure do
not contain the related remote address in its message. We currently
append the remote address to ConnectException's message, but I found
that we need to cover more exception types such as SocketException.
Modifications:
- Add AbstractUnsafe.annotateConnectException() to de-duplicate the
code that appends the remote address
Result:
- Less duplication
- A transport implementor can annotate connection attempt failure
message more easily
Motivation:
At the moment it's only possible for a user to set the RecvByteBufAllocator for a Channel but not access the Handle once it is assigned. This makes it hard to write more flexible implementations.
Modifications:
Add a new method to the Channel.Unsafe to allow access the the used Handle for the Channel. The RecvByteBufAllocator.Handle is created lazily.
Result:
It's possible to write more flexible implementatons that allow to adjust stuff on the fly for a Handle that is used by a Channel
Motivation:
We did various changes related to the ChannelOutboundBuffer in 4.0 branch. This commit port all of them over and so make sure our branches are synced in terms of these changes.
Related to [#2734], [#2709], [#2729], [#2710] and [#2693] .
Modification:
Port all changes that was done on the ChannelOutboundBuffer.
This includes the port of the following commits:
- 73dfd7c01b
- 997d8c32d2
- e282e504f1
- 5e5d1a58fd
- 8ee3575e72
- d6f0d12a86
- 16e50765d1
- 3f3e66c31a
Result:
- Less memory usage by ChannelOutboundBuffer
- Same code as in 4.0 branch
- Make it possible to use ChannelOutboundBuffer with Channel implementation that not extends AbstractChannel
Motivation:
As a DatagramChannel supports to write to multiple remote peers we must not close the Channel once a IOException accours as this error may be only valid for one remote peer.
Modification:
Continue writing on IOException.
Result:
DatagramChannel can be used even after an IOException accours during writing.
Motivation:
Because of a missing return statement we may produce a NPE when try to fullfill the connect ChannelPromise when it was fullfilled before.
Modification:
Add missing return statement.
Result:
No more NPE.
Motivation:
We use the nanoTime of the scheduledTasks to calculate the milli-seconds to wait for a select operation to select something. Once these elapsed we check if there was something selected or some task is ready for processing. Unfortunally we not take into account scheduled tasks here so the selection loop will continue if only scheduled tasks are ready for processing. This will delay the execution of these tasks.
Modification:
- Check if a scheduled task is ready after selecting
- also make a tiny change in NioEventLoop to not trigger a rebuild if nothing was selected because the timeout was reached a few times in a row.
Result:
Execute scheduled tasks on time.
Motivation:
When a select rebuild was triggered the reference to the SelectionKey is not updated in AbstractNioChannel. This will cause a CancelledKeyException later.
Modification:
Correctly update SelectionKey reference after rebuild
Result:
Fix exception