Motivation:
`DefaultFileRegion.transferTo` will keep returning 0 when we request more data than the actual file size. This may result in a busy spin while processing the FileRegion during writes.
Modifications:
- If we wrote 0 bytes, check whether the underlying file size is smaller than the requested count and, if so, throw an IOException (see the sketch below)
- Add DefaultFileRegionTest
- Add a test to the testsuite
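A minimal sketch of the guard (structure and names are illustrative, not the actual DefaultFileRegion code):
```
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;

final class FileRegionSketch {
    // If transferTo(...) wrote nothing, check whether the request reaches
    // past EOF and fail fast instead of letting the write loop busy-spin.
    static long transferTo(FileChannel file, long position, long count,
                           WritableByteChannel target) throws IOException {
        long written = file.transferTo(position, count, target);
        if (written == 0) {
            long size = file.size();
            if (position + count > size) {
                throw new IOException("underlying file size " + size
                        + " smaller than requested count " + (position + count));
            }
        }
        return written;
    }
}
```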
Result:
Fixes https://github.com/netty/netty/issues/8868.
Motivation:
The SSLEngine provides a way to signal to the caller that it may need to execute a blocking / long-running task, which can then be offloaded to an Executor to ensure the I/O thread is not blocked. How we currently handle this in SslHandler is not optimal: while we offload to the Executor, we still block the I/O thread.
Modifications:
- Correctly support offloading the task to the Executor while suspending processing of SSL in the I/O Thread
- Add new methods to SslContext to specify the Executor when creating a SslHandler
- Remove @Deprecated annotations from the SslHandler constructors that take an Executor
- Adjust tests to also run with the Executor to ensure all works as expected.
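A usage sketch of wiring in the Executor (treat the exact newHandler overload as an assumption based on the description above):
```
import java.util.concurrent.Executor;
import java.util.concurrent.Executors;
import io.netty.channel.socket.SocketChannel;
import io.netty.handler.ssl.SslContext;
import io.netty.handler.ssl.SslContextBuilder;

final class SslOffloadExample {
    static void configure(SocketChannel ch) throws Exception {
        SslContext sslCtx = SslContextBuilder.forClient().build();
        // Long-running SSLEngine tasks run on this Executor instead of
        // blocking the I/O thread (share one executor in real code).
        Executor sslTaskExecutor = Executors.newCachedThreadPool();
        ch.pipeline().addLast(sslCtx.newHandler(ch.alloc(), sslTaskExecutor));
    }
}
```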
Result:
Be able to offload long-running tasks to an Executor when using SslHandler. Partly fixes https://github.com/netty/netty/issues/7862 and https://github.com/netty/netty/issues/7020.
Motivation:
We had missed extending a few tests from the testsuite, so these were not also run with our native KQueue and Epoll transports.
Modifications:
Extend the tests so they also run against our native transports.
Result:
More tests.
Motivation:
We can just use Objects.requireNonNull(...) as a replacement for ObjectUtil.checkNotNull(...)
Modifications:
- Use Objects.requireNonNull(...)
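A before/after sketch of the pattern:
```
import java.util.Objects;

final class RequireNonNullExample {
    private final Object config;

    RequireNonNullExample(Object config) {
        // Before: this.config = ObjectUtil.checkNotNull(config, "config");
        this.config = Objects.requireNonNull(config, "config");
    }
}
```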
Result:
Less code to maintain.
Motivation:
When using a Linux distribution that supports sendmmsg(...), we allocated enough direct memory per EpollEventLoop to be able to write IOV_MAX iovecs for every message that can be written per sendmmsg(...) call.
The number of messages that can be written per sendmmsg(...) call is limited by UIO_MAX_IOV.
In practice this resulted in an allocation of 16MB of direct memory per EpollEventLoop instance (IOV_MAX iovecs x UIO_MAX_IOV messages x 16 bytes per iovec, with both constants being 1024 on Linux) that stayed allocated until the EpollEventLoop was shut down, which happens as part of the shutdown of the enclosing EpollEventLoopGroup.
This resulted in quite heavy direct memory usage in practice, even though the chances of ever needing all of that memory are very slim.
Modification:
Adjust NativeDatagramPacketArray to share one IovArray instance across all NativeDatagramPacket instances it holds. This limits the max number of iovecs we can write across all messages to IOV_MAX per sendmmsg(...) call.
In practice this will still be enough to allow us to write multiple messages with one syscall while keeping the memory overhead to a minimum.
Result:
Smaller direct memory footprint per EpollEventLoop when using EpollDatagramChannel on distributions that support sendmmsg(...).
Fixes https://github.com/netty/netty/issues/8814
Motivation:
In the native code, EpollDatagramChannel / KQueueDatagramChannel creates a DatagramSocketAddress object for each received UDP datagram, even when in connected mode, as it uses the recvfrom(...) / recvmsg(...) methods. Creating these is quite heavy in terms of allocations, as internally char[], String, Inet4Address, InetAddressHolder, InetSocketAddressHolder, InetAddress[] and byte[] objects are generated when constructing the object. When in connected mode we can just use regular read(...) calls, which do not need to allocate any of these.
Modifications:
- When in connected mode, use read(...) and NOT recvfrom(...) / recvmsg(...) to reduce allocations when possible (see the NIO analogy below)
- Adjust tests to ensure read works as expected when in connected mode.
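The same trade-off can be illustrated with plain NIO types (the actual change is in the native transports; this only shows the principle):
```
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.DatagramChannel;

final class ConnectedReadExample {
    public static void main(String[] args) throws Exception {
        DatagramChannel ch = DatagramChannel.open();
        ch.configureBlocking(false);
        ch.connect(new InetSocketAddress("127.0.0.1", 4242));
        ByteBuffer buf = ByteBuffer.allocate(2048);
        // Connected mode: read(...) allocates no sender-address object.
        int bytesRead = ch.read(buf);
        // Unconnected mode would use ch.receive(buf), which allocates a
        // SocketAddress (and its internals) for every received datagram.
        System.out.println("read " + bytesRead + " bytes");
    }
}
```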
Result:
Less allocations and GC when using native datagram channels in connected mode. Fixes https://github.com/netty/netty/issues/8770.
Motivation:
We have utility methods to check for > 0 and >= 0 arguments. We should use them.
Modification:
Use checkPositive / checkPositiveOrZero instead of hand-written if statements.
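For example (maxFrameLength is just an illustrative parameter):
```
import io.netty.util.internal.ObjectUtil;

final class CheckPositiveExample {
    private final int maxFrameLength;

    CheckPositiveExample(int maxFrameLength) {
        // Before:
        // if (maxFrameLength <= 0) {
        //     throw new IllegalArgumentException(
        //             "maxFrameLength must be > 0: " + maxFrameLength);
        // }
        // this.maxFrameLength = maxFrameLength;
        this.maxFrameLength = ObjectUtil.checkPositive(maxFrameLength, "maxFrameLength");
    }
}
```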
Result:
Re-use utility method.
Motivation:
We can use lambdas now as we use Java 8.
Modification:
Use lambdas in all packages; #8751 only migrated the transport package.
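A typical before/after:
```
import io.netty.channel.Channel;
import io.netty.util.concurrent.EventExecutor;

final class LambdaExample {
    static void closeLater(EventExecutor executor, Channel channel) {
        // Before:
        // executor.execute(new Runnable() {
        //     @Override
        //     public void run() {
        //         channel.close();
        //     }
        // });
        // After:
        executor.execute(() -> channel.close());
    }
}
```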
Result:
Code cleanup.
Motivation:
We need to update to a new checkstyle plugin to allow the usage of lambdas.
Modifications:
- Update to new plugin version.
- Fix checkstyle problems.
Result:
Be able to use checkstyle plugin which supports new Java syntax.
* Decouple EventLoop details from the IO handling for each transport to allow easy re-use of code and customization
Motivation:
As of today, extending EventLoop implementations to add custom logic / metrics / instrumentation is only possible in a very limited way, if at all. This is due to the fact that most implementations are final or even package-private. Even if these were public, the ability to do something useful with them would still be very limited, as the IO processing and task processing are very tightly coupled. All of the mentioned things are a big pain point in netty 4.x and need improvement.
Modifications:
This changeset decouples the IO processing logic from the task processing logic for the main transports (NIO, Epoll, KQueue) by introducing the concept of an IoHandler. The IoHandler itself is responsible for waiting for IO readiness and processing the resulting IO events. The execution of the IoHandler itself is done by the SingleThreadEventLoop as part of its EventLoop processing. This allows using the same EventLoopGroup (MultithreadEventLoopGroup) for all the mentioned transports by just specifying a different IoHandlerFactory during construction.
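A usage sketch of the resulting API (class and factory names follow the description above; exact signatures should be treated as assumptions):
```
import io.netty.channel.EventLoopGroup;
import io.netty.channel.MultithreadEventLoopGroup;
import io.netty.channel.epoll.EpollHandler;
import io.netty.channel.nio.NioHandler;

final class IoHandlerExample {
    static void create() {
        // The same EventLoopGroup implementation drives different transports;
        // only the IoHandlerFactory differs:
        EventLoopGroup nioGroup = new MultithreadEventLoopGroup(NioHandler.newFactory());
        EventLoopGroup epollGroup = new MultithreadEventLoopGroup(EpollHandler.newFactory());
    }
}
```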
Beside this core API change, this changeset also makes it easy to extend SingleThreadEventExecutor / SingleThreadEventLoop to add custom logic, which can then be reused by all the transports. The ideas are very similar to what is provided by ScheduledThreadPoolExecutor (which is part of the JDK). This allows, for example, things like:
* Adding instrumentation / metrics:
* how many Channels are registered on a SingleThreadEventLoop
* how many Channels were handled during the IO processing in an EventLoop run
* how many tasks were handled during the last EventLoop / EventExecutor run
* how many outstanding tasks we have
...
* Implementing custom strategies for choosing the next EventExecutor / EventLoop to use based on these metrics.
* Use different Promise / Future / ScheduledFuture implementations
* decorate Runnable / Callables when submitted to the EventExecutor / EventLoop
As a lot of functionality is folded into MultithreadEventLoopGroup and SingleThreadEventLoop, this changeset also removes:
* AbstractEventLoop
* AbstractEventLoopGroup
* EventExecutorChooser
* EventExecutorChooserFactory
* DefaultEventLoopGroup
* DefaultEventExecutor
* DefaultEventExecutorGroup
Result:
Fixes https://github.com/netty/netty/issues/8514.
Motivation:
We can use the diamond operator these days.
Modification:
Use diamond operator whenever possible.
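For example:
```
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class DiamondExample {
    // Before:
    // Map<String, List<String>> handlers = new HashMap<String, List<String>>();
    // After:
    Map<String, List<String>> handlers = new HashMap<>();
}
```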
Result:
More modern code and less boiler-plate.
Motivation:
When using load balancers, or when HA support is needed, there are cases where a UDP channel needs to bind to an IP address that is not locally available on any network interface.
Modification:
Modified EpollDatagramChannelConfig to allow the IP_FREEBIND option (usage sketch below).
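A usage sketch in the familiar Bootstrap style (EpollChannelOption.IP_FREEBIND per this change):
```
import io.netty.bootstrap.Bootstrap;
import io.netty.channel.epoll.EpollChannelOption;
import io.netty.channel.epoll.EpollDatagramChannel;
import io.netty.channel.epoll.EpollEventLoopGroup;

final class FreeBindExample {
    static Bootstrap configure() {
        return new Bootstrap()
                .group(new EpollEventLoopGroup())
                .channel(EpollDatagramChannel.class)
                // Allow bind() to an address not (yet) present on any local interface:
                .option(EpollChannelOption.IP_FREEBIND, true);
    }
}
```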
Result:
Fixes #8727.
Motivation:
Because of how we implemented registration / deregistration with an EventLoop, it was not possible to wrap an EventLoop implementation and use it with a Channel.
Modification:
- Introduce EventLoop.Unsafe which is responsible for the actual registration.
- Move validation of EventLoop / Channel combo to the EventLoop
- Add unit test that verifies that wrapping works
Result:
Be able to wrap an EventLoop and so add some extra functionality.
Motivation:
At the moment it's possible to have a Channel in Netty that is not registered / assigned to an EventLoop until register(...) is called. This is suboptimal, as if the Channel is not registered it is also not possible to do anything useful with a ChannelFuture that belongs to the Channel. We should make the EventLoop a constructor argument of the Channel and have the register / deregister methods only have the effect of adding the Channel to KQueue/Epoll/... It is also currently possible to deregister a Channel from one EventLoop and register it with another EventLoop. This operation defeats the threading-model assumptions that are widespread in Netty, and requires careful user-level coordination to pull off without any concurrency issues. It is not a commonly used feature in practice, may be better handled by other means (e.g. client-side load balancing), and therefore we propose removing this feature.
Modifications:
- Change all Channel implementations to require an EventLoop for construction (plus an EventLoopGroup for all ServerChannel implementations)
- Remove all register(...) methods from EventLoopGroup
- Add ChannelOutboundInvoker.register(...) which now basically means we want to register on the EventLoop for IO.
- Change ChannelUnsafe.register(...) to not take an EventLoop as parameter (as the EventLoop is supplied on construction).
- Change ChannelFactory to take an EventLoop to create new Channels and introduce ServerChannelFactory which takes an EventLoop and one EventLoopGroup to create new ServerChannel instances.
- Add ServerChannel.childEventLoopGroup()
- Ensure all operations on the accepted Channel are done in the EventLoop of the Channel in ServerBootstrap
- Change unit tests for new behaviour
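A sketch of what construction looks like afterwards (the constructor and register() shapes follow the description above; treat them as assumptions):
```
import io.netty.channel.Channel;
import io.netty.channel.EventLoopGroup;
import io.netty.channel.socket.nio.NioSocketChannel;

final class FixedEventLoopExample {
    static Channel create(EventLoopGroup group) {
        // The EventLoop is a constructor argument now and never changes:
        Channel ch = new NioSocketChannel(group.next());
        ch.register(); // ChannelOutboundInvoker.register(): register for IO
        return ch;
    }
}
```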
Result:
A Channel always has an EventLoop assigned, which will never change during its life-time. This ensures we are always able to call any operation on the Channel once it is constructed (until the EventLoop is shut down). This also simplifies the logic in DefaultChannelPipeline a lot, as we can always call handlerAdded / handlerRemoved directly without the need to wait for register() to happen.
Also note that it is still possible to deregister a Channel and register it again. It's just not possible anymore to move from one EventLoop to another (which was not really safe anyway).
Fixes https://github.com/netty/netty/issues/8513.
* Handling AUTO_READ should not be the responsibility of DefaultChannelPipeline but the Channel itself.
Motivation:
At the moment we automatically call read() in the DefaultChannelPipeline when fireChannelReadComplete() / fireChannelActive() is called and the Channel is using auto-read. This is nice in terms of sharing code, but it is not the responsibility of the ChannelPipeline implementation; it is the responsibility of the Channel implementation.
Modifications:
Move handling of auto-read from DefaultChannelPipeline to the Channel implementations.
Result:
Clearer responsibility and no dependence on implementation details of the ChannelPipeline.
Motivation:
Most of the maven modules do not explicitly declare their
dependencies and rely on transitivity, which is not always correct.
Modifications:
For all maven modules, add all of their dependencies to pom.xml
Result:
All of the (essentially non-transitive) dependencies of the modules are explicitly declared in pom.xml
Motivation:
https://github.com/netty/netty/issues/8444 reports that there is some issue with negative values passed to timerfd_settime. This test verifies that everything is working as expected.
Modifications:
Add testcase.
Result:
Test to verify expected behaviour.
Motivation:
epoll_wait should work in 4.1.30 like it did in 4.1.29.
Modifications:
Revert Integer.MAX_VALUE back to MAX_SCHEDULED_TIMERFD_NS (999,999,999).
Add unit test.
Result:
epoll_wait will no longer throw EINVAL.
Motivation:
Add an option (through a SelectStrategy return code) to have the Netty event loop thread busy-wait on the epoll.
The reason for this change is to avoid the context switch cost that comes when the event loop thread is blocked on the epoll_wait() call.
On average, the context switch has a penalty of ~13usec.
This benefits both:
- the latency when reading from a socket, and
- scheduling tasks to be executed on the event loop thread.
The tradeoff, when enabling this feature, is that the event loop thread will be using 100% cpu, even when inactive.
Modification:
- Added a SelectStrategy option to return BUSY_WAIT (see the sketch below)
- The epoll loop will do an epoll_wait() with no timeout
- Use the pause instruction to hint to the processor that we're in a busy loop
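A sketch of opting in (the EpollEventLoopGroup constructor overload taking a SelectStrategyFactory is assumed here):
```
import io.netty.channel.EventLoopGroup;
import io.netty.channel.SelectStrategy;
import io.netty.channel.SelectStrategyFactory;
import io.netty.channel.epoll.EpollEventLoopGroup;

final class BusyWaitExample {
    static EventLoopGroup create() {
        // Busy-wait when there is no work instead of blocking in epoll_wait(),
        // trading 100% CPU while idle for lower wakeup latency:
        SelectStrategyFactory busyWait = () -> (selectSupplier, hasTasks) ->
                hasTasks ? selectSupplier.get() : SelectStrategy.BUSY_WAIT;
        return new EpollEventLoopGroup(1, busyWait);
    }
}
```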
Result:
When enabled, this minimizes the impact of context switches in the critical path.
Motivation
The EpollChannelConfig (same for KQueue) and its subclasses repeatedly declare their own channel field, which leads to a 3x repetition for each config instance. Given the fields are protected or package-private, this exposes the code to "field hiding" bugs.
Modifications
Use the existing protected channel field from the DefaultChannelConfig class and simply cast it when needed, as illustrated below.
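A simplified illustration of the hiding problem and the fix (stand-in types, not the real classes):
```
class Channel { }

class EpollChannel extends Channel {
    void setEpollSpecificOption() { /* epoll-only behaviour */ }
}

class DefaultChannelConfig {
    protected final Channel channel; // declared exactly once, here
    DefaultChannelConfig(Channel channel) { this.channel = channel; }
}

class EpollChannelConfig extends DefaultChannelConfig {
    EpollChannelConfig(EpollChannel channel) { super(channel); }

    void setEpollOption() {
        // Cast the inherited field instead of redeclaring (and hiding) it:
        ((EpollChannel) channel).setEpollSpecificOption();
    }
}
```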
Result
Fixes #8331
Motivation:
The Epoll transport checks to see if there are any scheduled tasks
before entering epoll_wait, and resets the timerfd just before.
This causes an extra syscall to timerfd_settime before doing any
actual work. When scheduled tasks aren't added frequently, or
tasks are added with later deadlines, this is unnecessary.
Modification:
Check the *deadline* of the peeked task in EpollEventLoop, rather
than the *delay*. If it hasn't changed since last time, don't
re-arm the timer.
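The shape of the optimization (the field and native wrapper are hypothetical stand-ins):
```
final class TimerArmSketch {
    private long prevDeadlineNanos = Long.MIN_VALUE;

    void armTimerIfNeeded(long nextDeadlineNanos) {
        if (nextDeadlineNanos != prevDeadlineNanos) { // nearest deadline moved
            prevDeadlineNanos = nextDeadlineNanos;
            timerFdSetTime(nextDeadlineNanos);        // one timerfd_settime syscall
        }
        // else: the timer is already armed for this deadline, skip the syscall
    }

    private void timerFdSetTime(long deadlineNanos) {
        // stand-in for the native timerfd_settime(...) wrapper
    }
}
```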
Result:
About 2us faster on gRPC RTT 50pct latency benchmarks.
Before (2 runs for 5 minutes, 1 minute of warmup):
```
50.0%ile Latency (in nanos): 64267
90.0%ile Latency (in nanos): 72851
95.0%ile Latency (in nanos): 78903
99.0%ile Latency (in nanos): 92327
99.9%ile Latency (in nanos): 119691
100.0%ile Latency (in nanos): 13347327
QPS: 14933
50.0%ile Latency (in nanos): 63907
90.0%ile Latency (in nanos): 73055
95.0%ile Latency (in nanos): 79443
99.0%ile Latency (in nanos): 93739
99.9%ile Latency (in nanos): 123583
100.0%ile Latency (in nanos): 14028287
QPS: 14936
```
After:
```
50.0%ile Latency (in nanos): 62123
90.0%ile Latency (in nanos): 70795
95.0%ile Latency (in nanos): 76895
99.0%ile Latency (in nanos): 90887
99.9%ile Latency (in nanos): 117819
100.0%ile Latency (in nanos): 14126591
QPS: 15387
50.0%ile Latency (in nanos): 61021
90.0%ile Latency (in nanos): 70311
95.0%ile Latency (in nanos): 76687
99.0%ile Latency (in nanos): 90887
99.9%ile Latency (in nanos): 119527
100.0%ile Latency (in nanos): 6351615
QPS: 15571
```
Motivation:
When using Epoll based transport, allow applications to configure SO_BUSY_POLL socket option:
SO_BUSY_POLL (since Linux 3.11)
Sets the approximate time in microseconds to busy poll on a
blocking receive when there is no data. Increasing this value
requires CAP_NET_ADMIN. The default for this option is
controlled by the /proc/sys/net/core/busy_read file.
The value in the /proc/sys/net/core/busy_poll file determines
how long select(2) and poll(2) will busy poll when they
operate on sockets with SO_BUSY_POLL set and no events to
report are found.
In both cases, busy polling will only be done when the socket
last received data from a network device that supports this
option.
While busy polling may improve latency of some applications,
care must be taken when using it since this will increase both
CPU utilization and power usage.
Modification:
Added SO_BUSY_POLL socket option
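A usage sketch (EpollChannelOption.SO_BUSY_POLL per this change):
```
import io.netty.bootstrap.ServerBootstrap;
import io.netty.channel.epoll.EpollChannelOption;

final class BusyPollExample {
    static void configure(ServerBootstrap b) {
        // Busy poll for up to ~50 microseconds on a blocking receive with no
        // data (raising this above the system default needs CAP_NET_ADMIN):
        b.childOption(EpollChannelOption.SO_BUSY_POLL, 50);
    }
}
```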
Result:
Able to configure SO_BUSY_POLL from Netty
Motivation:
We should ensure we call *UnLoad when we detect an error while calling *OnLoad and previous *OnLoad calls were successful.
Modifications:
Correctly call *UnLoad when needed.
Result:
More correct code and no leaks when an error happens during loading the native lib.
* Allow to use native transports when sun.misc.Unsafe is not present on the system
Motivation:
We should be able to use the native transports (epoll / kqueue) even when sun.misc.Unsafe is not present on the system. This is especially important as Java 11 will be released soon and does not allow access to it by default.
Modifications:
- Correctly disable usage of sun.misc.Unsafe when -PnoUnsafe is used while running the build
- Correctly increment metric when UnpooledDirectByteBuf is allocated. This was uncovered once -PnoUnsafe usage was fixed.
- Implement fallbacks in all our native transport code for when sun.misc.Unsafe is not present.
Result:
Fixes https://github.com/netty/netty/issues/8229.
Motivation:
We should support to load multiple shaded versions of the same netty artifact as netty is often used in multiple dependencies.
This is related to https://github.com/netty/netty/issues/7272.
Modifications:
- Use -fvisibility=hidden when compiling and use JNIEXPORT for things we really want to have exported
- Ensure fields are declared as static so these are not exported
- Adjust testsuite-shading to use install_name_tool on MacOS to change the id of the lib. Otherwise the wrong lib may be used.
Result:
Be able to use multiple shaded versions of the same netty artifact.
Motivation:
Avoid unnecessary native memory allocation if UDP / TCP isn't being
used.
Modifications:
Create the reused NativeDatagramPacketArray and IovArray upon first use
instead of EpollEventLoop construction.
Also correct related comment in NativeDatagramPacketArray.
Result:
Reduced native memory usage when using epoll in many cases.
Motivation:
We can store the NativeDatagramPacketArray directly in the EpollEventLoop. This removes the need of using FastThreadLocal.
Modifications:
- Store NativeDatagramPacketArray directly in the EpollEventLoop (just as we do with IovArray as well).
Result:
Less FastThreadLocal usage and more consistent code.
Motivation:
Epoll and Kqueue channels have internal state which forces
a single read operation after channel construction. This
violates the Channel#read() interface which indicates that
data shouldn't be delivered until this method is called.
The behavior is also inconsistent with the NIO transport.
Modifications:
- Epoll and Kqueue shouldn't unconditionally read upon
initialization, and instead should rely upon Channel#read()
or auto_read.
Result:
Epoll and Kqueue are more consistent with NIO.
Motivation:
We should allow scheduling tasks with a delay up to Long.MAX_VALUE, as we did before 4.1.25.Final.
Modifications:
Just ensure we do not overflow and put the correct max limits in place when scheduling a timer. At worst we will get a wakeup too early and then schedule a new timeout (see the sketch below).
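A sketch of the guard (not the actual scheduling code):
```
final class DeadlineSketch {
    static long deadlineNanos(long nowNanos, long delayNanos) {
        long deadline = nowNanos + delayNanos;
        // Guard against overflow for delays close to Long.MAX_VALUE; at worst
        // we wake up too early and simply schedule a new timeout.
        return deadline < 0 ? Long.MAX_VALUE : deadline;
    }
}
```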
Result:
Fixes https://github.com/netty/netty/issues/7970.
* Read until all data is consumed when EOF is detected even if readPending is false and auto-read is disabled.
Motivation:
We should always notify the user of EOF, even if the user did not request any data, as otherwise we may never be notified when the remote peer closes the connection. This should be ok as the amount of extra data we may read, and so fire through the pipeline, is limited by SO_RCVBUF.
Modifications:
- Always drain the socket when EOF is detected.
- Add testcase
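A self-contained sketch of the drain loop using NIO types (the actual change lives in the native transports' read loops):
```
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ReadableByteChannel;
import java.util.function.Consumer;

final class DrainOnEofSketch {
    static void drainOnEof(ReadableByteChannel ch, Consumer<ByteBuffer> fireRead)
            throws IOException {
        int read;
        do {
            ByteBuffer buf = ByteBuffer.allocate(4096); // bounded overall by SO_RCVBUF
            read = ch.read(buf);
            if (read > 0) {
                buf.flip();
                fireRead.accept(buf); // deliver even though no read() was requested
            }
        } while (read > 0);
        // read == -1 means EOF was observed and everything pending was consumed
    }
}
```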
Result:
No risk of the user not being notified of EOF.