Motivation:
394a1b3485 introduced the possibility to use recvmmsg(...) but did not correctly handle ipv6 mapped ip4 addresses to make it consistent with other transports.
Modifications:
- Correctly handle ipv6 mapped ipv4 addresses by only copy over the relevant bytes
- Small improvement on how to detect ipv6 mapped ipv4 addresses by using memcmp and not byte by byte compare
- Adjust test to cover this bug
Result:
Correctly handle ipv6 mapped ipv4 addresses
Motivation:
When using datagram sockets which need to handle a lot of packets it makes sense to use recvmmsg to be able to read multiple datagram packets with one syscall.
Modifications:
- Add support for recvmmsg on linux
- Add new EpollChannelOption.MAX_DATAGRAM_PACKET_SIZE
- Add tests
Result:
Fixes https://github.com/netty/netty/issues/8446.
Motivation:
If all we need is the FileChannel we should better use RandomAccessFile as FileInputStream and FileOutputStream use a finalizer.
Modifications:
Replace FileInputStream and FileOutputStream with RandomAccessFile when possible.
Result:
Fixes https://github.com/netty/netty/issues/8078.
Motivation:
We did not have support for enable / disable loopback mode in our native epoll transport and also missed the implemention to access the configured interface.
Modifications:
Add implementation and adjust test to cover it
Result:
More complete multicast support with native epoll transport
Motivation:
Provide epoll/native multicast to support high load multicast users (we are using it for a high load telecomm app at my day job).
Modification:
Added support for source specific and any source multicast for epoll transport. Some caveats: no support for disabling loop back mode, retrieval of interface and block operation, all of which tend to be less frequently used.
Result:
Provides epoll transport multicast for common use cases.
Co-authored-by: Norman Maurer <norman_maurer@apple.com>
Motivation:
The current KQueueEventLoop implementation does not process concurrent domain socket channel registration/unregistration in the order they actual
happen since unregistration are delated by an event loop task scheduling. When a domain socket is closed, it's file descriptor might be reused
quickly and therefore trigger a new channel registration using the same descriptor.
Consequently the KQueueEventLoop#add(AbstractKQueueChannel) method will overwrite the current inactive channels having the same descriptor
and the delayed KQueueEventLoop#remove(AbstractKQueueChannel) will remove the active channel that replaced the inactive one.
As active channels are registered, events for this file descriptor won't be processed anymore and the channels will never be closed.
The same problem can also happen in EpollEventLoop. Beside this we also may never remove the AbstractEpollChannel from the internal map
when it is unregistered which will prevent it from be GC'ed
Modifications:
- Change logic of native KQueue and Epoll implementations to ensure we correctly handle the case of FD reuse
- Only try to update kevent / epoll if the Channel is still open (as otherwise it will be handled by kqueue / epoll itself)
- Correctly remove AbstractEpollChannel from internal map in all cases
- Make implementation of closeAll() consistent for Epoll and KQueueEventLoop
Result:
KQueue and Epoll native transports correctly handle FD reuse
Co-authored-by: Norman Maurer <norman_maurer@apple.com>
Motivation:
86dd388637 reverted the usage of IPv6 Multicast test. This commit makes the whole multicast testing a lot more robust by selecting the correct interface in any case and also reverts the `@Ignore`
Modifications:
- More robust multicast testing by selecting the right NetworkInterface
- Remove the `@Ignore` again for the IPv6 test
Result:
More robust multicast testing
Motivation:
The multicast ipv6 test fails on some systems. As I just added it let me ignore it for now while investigating.
Modifications:
Add @ignore
Result:
Stable testsuite while investigate
Motivation:
We currently only cover ipv4 multicast in the testsuite but we should also have tests for ipv6.
Modifications:
- Add test for ipv6
- Ensure we only try to run multicast test for ipv4 / ipv6 if the loopback interface supports it.
Result:
Better test coverage
Motivation:
Systems depending on Netty may benefit (telemetry, alternative even loop scheduling algorithms) from knowing the number of channels assigned to each EventLoop.
Modification:
Expose the number of channels registered in the EventLoop via SingleThreadEventLoop.registeredChannels.
Result:
Fixes#8276.
Motivation:
Since DomainSocketChannel is a DuplexChannel, which be able to shutdown input or output individually on demands, but ALLOW_HALF_CLOSURE channel option has not been supported yet.
I thought this could be a missing feature of Unix domain socket, so here the PR for it.
Modifications:
1. Added allHalfClosure property both in EpollDomainSocketChannelConfig and KQueueDomainSocketChannelConfig,
2. Enabled isAllowHalfClosure method of native channel to support domain channel config,
3. Created EpollDomainSocketShutdownOutputByPeerTest and KQueueDomainSocketShutdownOutputByPeerTest to verify the change.
Result:
ALLOW_HALF_CLOSURE channel option can be set with DomainSocketChannel, and no more warning of Unknown channel option 'ALLOW_HALF_CLOSURE'.
Motivation:
Netty is very widely used which can lead to a lot of pain when we break API / ABI. We should make use japicmp-maven-plugin during the build to verify we do not introduce breakage by mistake.
Modifications:
- Add japicmp-maven-plugin to the build process
- Fix a method signature change in HttpProxyHandler that was flagged as a possible problem.
Result:
Ensure no API/ABI breakage accour between releases.
Motivation:
We should run a CI job using J9 to ensure netty also works when using different JVMs.
Modifications:
- Adjust PooledByteBufAllocatorTest to be able to complete faster when using a JVM which takes longer when joining Threads (this seems to be the case with J9).
- Skip UDT tests on J9 as UDT is not supported there.
Result:
Be able to run CI against J9.
Motivation:
`DefaultFileRegion.transferTo` will return 0 all the time when we request more data then the actual file size. This may result in a busy spin while processing the fileregion during writes.
Modifications:
- If we wrote 0 bytes check if the underlying file size is smaller then the requested count and if so throw an IOException
- Add DefaultFileRegionTest
- Add a test to the testsuite
Result:
Fixes https://github.com/netty/netty/issues/8868.
Motivation:
The SSLEngine does provide a way to signal to the caller that it may need to execute a blocking / long-running task which then can be offloaded to an Executor to ensure the I/O thread is not blocked. Currently how we handle this in SslHandler is not really optimal as while we offload to the Executor we still block the I/O Thread.
Modifications:
- Correctly support offloading the task to the Executor while suspending processing of SSL in the I/O Thread
- Add new methods to SslContext to specify the Executor when creating a SslHandler
- Remove @deprecated annotations from SslHandler constructor that takes an Executor
- Adjust tests to also run with the Executor to ensure all works as expected.
Result:
Be able to offload long running tasks to an Executor when using SslHandler. Partly fixes https://github.com/netty/netty/issues/7862 and https://github.com/netty/netty/issues/7020.
Motivation:
In the native code EpollDatagramChannel / KQueueDatagramChannel creates a DatagramSocketAddress object for each received UDP datagram even when in connected mode as it uses the recvfrom(...) / recvmsg(...) method. Creating these is quite heavy in terms of allocations as internally, char[], String, Inet4Address, InetAddressHolder, InetSocketAddressHolder, InetAddress[], byte[] objects are getting generated when constructing the object. When in connected mode we can just use regular read(...) calls which do not need to allocate all of these.
Modifications:
- When in connected mode use read(...) and NOT recvfrom(..) / readmsg(...) to reduce allocations when possible.
- Adjust tests to ensure read works as expected when in connected mode.
Result:
Less allocations and GC when using native datagram channels in connected mode. Fixes https://github.com/netty/netty/issues/8770.
Motivation:
In the test we assume some semantics on how RST is done that are not true for Windows so we should skip it.
Modifications:
Skip test when on windows.
Result:
Be able to run testsuite on windows. Fixes https://github.com/netty/netty/issues/8571.
Motivation:
Most of the maven modules do not explicitly declare their
dependencies and rely on transitivity, which is not always correct.
Modifications:
For all maven modules, add all of their dependencies to pom.xml
Result:
All of the (essentially non-transitive) depepdencies of the modules are explicitly declared in pom.xml
Motivation:
TLSv1.3 support is included in java11 and is also supported by OpenSSL 1.1.1, so we should support when possible.
Modifications:
- Add support for TLSv1.3 using either the JDK implementation or the native implementation provided by netty-tcnative when compiled against openssl 1.1.1
- Adjust unit tests for semantics provided by TLSv1.3
- Correctly handle custom Provider implementations that not support TLSv1.3
Result:
Be able to use TLSv1.3 with netty.
Motivation:
In some of our tests we not correctly init the SSLEngine before trying to perform a handshake which can cause an IllegalStateException. While this not happened in previous java releases it does now on Java11 (which is "ok" as its even mentioned in the api docs). Beside this how we selected the ciphersuite to test renegotation was not 100 % safe.
Modifications:
- Correctly init SSLEngine before using it
- Correctly select ciphersuite before testing for renegotation.
Result:
More correct tests and also pass on next java11 EA release.
Motivation:
This reverts commit 4b728cd5bc as it was fixes in Java 11 ea+17.
Modification:
Revert previous added workaround as this is fixed in Java 11 now.
Result:
No more workaround for test included.
Motivation:
Epoll and Kqueue channels have internal state which forces
a single read operation after channel construction. This
violates the Channel#read() interface which indicates that
data shouldn't be delivered until this method is called.
The behavior is also inconsistent with the NIO transport.
Modifications:
- Epoll and Kqueue shouldn't unconditionally read upon
initialization, and instead should rely upon Channel#read()
or auto_read.
Result:
Epoll and Kqueue are more consistent with NIO.
Motivation:
Java11 disallow draining any remaining bytes from the socket if a write causes a connection reset. This should be completely safe to do. At the moment if a write is causing a connection-reset you basically loose all the pending bytes that are sitting on the socket and are waiting to be read.
This happens because SocketOutputStream.write(…) may call AbstractPlainSocketImpl.setConnectionReset(…). Once this method is called any read(…) call will just throw a SocketException without even attempt to read any remaining data.
This is related:
- https://bugs.openjdk.java.net/browse/JDK-8199329
- http://hg.openjdk.java.net/jdk/jdk/rev/92cca24c8807
- http://mail.openjdk.java.net/pipermail/net-dev/2018-May/011511.html
Modifications:
Tolarate if remaining bytes could not be read when using OIO.
Result:
Be able to build Netty and run testsuite while using Java11
Motivation:
We added a workaround for Java 11 as it not produced a connect-reset when SO_LINGER with 0 was set and NIO was used. This was fixed in the latest ea release of Java 11:
- http://hg.openjdk.java.net/jdk/jdk/rev/ea54197f4fe4
- https://bugs.openjdk.java.net/browse/JDK-8203059
Modifications:
Revert workaround.
Result:
Test that Java 11 behave the same way as earlier Java versions again.
Motivation:
Java 11 will be out soon, so we should be able to build (and run tests) netty.
Modifications:
- Add dependency that is needed till Java 11
- Adjust tests so these also pass on Java 11 (SocketChannelImpl.close() behavious a bit differently now).
Result:
Build also works (and tests pass) on Java 11.
Motivation:
Java 10 is out so we should be able to build netty with it (and run the tests).
Modifications:
- Update Mockito and JBoss Marshalling to support Java 10
- Fix unit test to not depend on specific cipher which is not present in Java 10 anymore
Result:
Netty builds (and runs all tests) when using Java 10
Motivation:
DatagramPacket.recipient() doesn't return the actual destination IP, but the IP the app is bound to.
Modification:
- IP_RECVORIGDSTADDR option is enabled for UDP sockets, which allows retrieval of ancillary information containing the original recipient.
- _recvFrom(...) function from transport-native-unix-common/src/main/c/netty_unix_socket.c is modified such that if IP_RECVORIGDSTADDR is set, recvmsg is used instead of recvfrom; enabling the retrieval of the original recipient.
- DatagramSocketAddress also contains a 'local' address, representing the recipient.
- EpollDatagramChannel is updated to return the retrieved recipient address instead of the address the channel is bound to.
Result:
Fixes#4950.
Motivation:
AbstractNioByteChannel will detect that the remote end of the socket has
been closed and propagate a user event through the pipeline. However if
the user has auto read on, or calls read again, we may propagate the
same user events again. If the underlying transport continuously
notifies us that there is read activity this will happen in a spin loop
which consumes unnecessary CPU.
Modifications:
- AbstractNioByteChannel's unsafe read() should check if the input side
of the socket has been shutdown before processing the event. This is
consistent with EPOLL and KQUEUE transports.
- add unit test with @normanmaurer's help, and make transports consistent with respect to user events
Result:
No more read spin loop in NIO when the channel is half closed.
Motivation:
We called the wrong super method in the test and also had a few unused imports.
Modifications:
Fix super method call and cleanup.
Result:
More correct test and cleanup.
Motivation:
b215794de3 recently introduced a change in behavior where writeSpinCount provided a limit for how many write operations were attempted per flush operation. However when the write quantum was meet the selector write flag was not cleared, and the channel unsafe flush0 method has an optimization which prematurely exits if the write flag is set. This may lead to no write progress being made under the following scenario:
- flush is called, but the socket can't accept all data, we set the write flag
- the selector wakes us up because the socket is writable, we write data and use the writeSpinCount quantum
- we then schedule a flush() on the EventLoop to execute later, however it the flush0 optimization prematurely exits because the write flag is still set
In this scenario the socket is still writable so the EventLoop may never notify us that the socket is writable, and therefore we may never attempt to flush data to the OS.
Modifications:
- When the writeSpinCount quantum is exceeded we should clear the selector write flag
Result:
Fixes https://github.com/netty/netty/issues/7729
Motivation:
We recently removed support for renegotiation, but there are still some hooks to attempt to allow remote initiated renegotiation to succeed. The remote initated renegotiation can be even more problematic from a security stand point and should also be removed.
Modifications:
- Remove state related to remote iniated renegotiation from OpenSslEngine
Result:
More renegotiation code removed from the OpenSslEngine code path.
Motivation:
IovArray implements MessageProcessor, and the processMessage method will continue to be called during iteration until it returns true. A recent commit b215794de3 changed the return value to only return true if any component of a CompositeByteBuf was added as a result of the method call. However this results in the iteration continuing, and potentially subsequent smaller buffers maybe added, which will result in out of order writes and generally corrupts data.
Modifications:
- IovArray#add should return false so that the MessageProcessor#processMessage will stop iterating.
Result:
Native transports which use IovArray will not corrupt data during gathering writes of CompositeByteBuf objects.
Motivation:
At the moment we use netty.io as BAD_HOST with an port that we know is timing out. This may change in the future so we should better use 198.51.100.254 which is specified as "for documentation only".
Modifications:
Replace netty.io with 198.51.100.254 in tests that depend on BAD_HOST.
Result:
More future proof code.
Motivation:
af2f343648 introduced a test-case which was flacky due of multiple problems:
- we called writeAndFlush(...) in channelRead(...) and assumed it will only be called once. This is true most of the times but it may be called multile times if the data is fragemented.
- we didnt guard against the possibility that channelRead(...) is called with an empty buffer
Modifications:
- Call writeAndFlush(...) in channelActive(...) so we are sure its only called once and close the channel once we wrote the data
- only compare the data after we received a close so we are sure there isnt anything extra received
- check for exception and if we catched one fail the test.
Result:
No flacky test anymore and easier to debug issues that accour because of a catched exception.
Motivation:
FileDescriptor#writev calls JNI code, and that JNI code dereferences a NULL pointer which crashes the application. This occurs when writing a single CompositeByteBuf object with more than one component.
Modifications:
- Initialize the iovec iterator properly to avoid the core dump
- Fix the array length calculation if we aren't able to fit all the ByteBuffer objects in the iovec array
Result:
No more core dump.
Motivation:
When SO_LINGER is used we run doClose() on the GlobalEventExecutor by default so we need to ensure we schedule all code that needs to be run on the EventLoop on the EventLoop in doClose. Beside this there are also threading issues when calling shutdownOutput(...)
Modifications:
- Schedule removal from EventLoop to the EventLoop
- Correctly handle shutdownOutput and shutdown in respect with threading-model
- Add unit tests
Result:
Fixes [#7159].
Motivation:
We recently saw an assertion failure when running DatagramUnicastTest.testSimpleSendWithConnect.
Modifications:
- Adding more debug infos
- Ensure we always correctly release the buffers.
Result:
More informations when tests fail.
Motivation:
If AutoClose is false and there is a IoException then AbstractChannel will not close the channel but instead just fail flushed element in the ChannelOutboundBuffer. AbstractChannel also notifies of writability changes, which may lead to an infinite loop if the peer has closed its read side of the socket because we will keep accepting more data but continuously fail because the peer isn't accepting writes.
Modifications:
- If the transport throws on a write we should acknowledge that the output side of the channel has been shutdown and cleanup. If the channel can't accept more data because it is full, and still healthy it is not expected to throw. However if the channel is not healthy it will throw and is not expected to accept any more writes. In this case we should shutdown the output for Channels that support this feature and otherwise just close.
- Connection-less protocols like UDP can remain the same because the channel may disconnected temporarily.
- Make sure AbstractUnsafe#shutdownOutput is called because the shutdown on the socket may throw an exception.
Result:
More correct handling of write failure when AutoClose is false.
Motivation:
SocketStringEchoTest has been observed to fail on CI servers, but the stack traces still indicate work was being done.
Modifications:
- Increase the test timeout
Result:
Tests have more time to complete, and hopefully less false positive test failures.
Motivation:
EpollDomainSocketGatheringWriteTest. testGatheringWriteBig takes on average about 20-25 seconds on the CI servers, but there is a 30 second timeout being applied which leads to what maybe false positive test failures.
Modifications:
- Increase the test timeout to 120 seconds globally and 60 seconds to wait for all writes per test
Result:
Higher timeout for potentially less false positive test failures.
Motivation:
SocketGatherWriteTest has been observed to fail and it has numerous issues which when resolved may help reduce the test failures.
Modifications:
- A volatile counter and a spin/sleep loop is used to trigger test termination. Incrementing a volatile is generally bad practice and can be avoided in this situation. This mechanism can be replaced by a promise. This mechanism should also trigger upon exception or channel inactive.
- The TestHandler maintains an internal buffer, but it is not released. We now only create a buffer on the server side and release it after comparing the expected results.
- The composite buffer creation logic can be simplified, also the existing composite buffer doesn't take into account the buffer's reader index when building buf2.
Result:
Cleaner test.
Motivation:
SocketStringEchoTest has been observed to fail and it has numerous issues which when resolved may help reduce the test failures.
Modifications:
- A volatile counter and a spin/sleep loop is used to trigger test termination. Incrementing a volatile is generally bad practice and can be avoided in this situation. This mechanism can be replaced by a promise. This mechanism should also trigger upon exception or channel inactive.
- Asserts are done in the Netty threads. Although these should result in a exceptionCaught the test may not observe these failures because it is spinning waiting for the count to reach the desired value.
Result:
Cleaner test.
Motivation:
We need to support SO_TIMEOUT for the OioDatagramChannel but we miss this atm as we not have special handling for it in the DatagramChannelConfig impl that we use. Because of this the following log lines showed up when running the testsuite:
20:31:26.299 [main] WARN io.netty.bootstrap.Bootstrap - Unknown channel option 'SO_TIMEOUT' for channel '[id: 0x7cb9183c]'
Modifications:
- Add OioDatagramChannelConfig and impl
- Correctly set SO_TIMEOUT in testsuite
Result:
Support SO_TIMEOUT for OioDatagramChannel and so faster execution of datagram related tests in the testsuite
Motivation:
Implementations of DuplexChannel delegate the shutdownOutput to the underlying transport, but do not take any action on the ChannelOutboundBuffer. In the event of a write failure due to the underlying transport failing and application may attempt to shutdown the output and allow the read side the transport to finish and detect the close. However this may result in an issue where writes are failed, this generates a writability change, we continue to write more data, and this may lead to another writability change, and this loop may continue. Shutting down the output should fail all pending writes and not allow any future writes to avoid this scenario.
Modifications:
- Implementations of DuplexChannel should null out the ChannelOutboundBuffer and fail all pending writes
Result:
More controlled sequencing for shutting down the output side of a channel.
Motivation:
We did not correctly handle connect() and disconnect() in EpollDatagramChannel / KQueueDatagramChannel and so the behavior was different compared to NioDatagramChannel.
Modifications:
- Correct implement connect and disconnect methods
- Share connect and related code
- Add tests
Result:
EpollDatagramChannel / KQueueDatagramChannel also supports correctly connect() and disconnect() methods.
Motivation:
We should not try to detect a free port in tests put just use 0 when bind so there is no race in which the system my bind something to the port we choosen before.
Modifications:
- Remove the usage of TestUtils.getFreePort() in the testsuite
- Remove hack to workaround bind errors which will not happen anymore now
Result:
Less flacky tests.
Motivation:
When run the current testsuite on docker (mac) it will fail a few tests with:
io.netty.channel.AbstractChannel$AnnotatedConnectException: connect(..) failed: Cannot assign requested address: /0:0:0:0:0:0:0:0%0:46607
Caused by: java.net.ConnectException: connect(..) failed: Cannot assign requested address
Modifications:
Specify host explicit as done in other tests to only use ipv6 when really supported.
Result:
Build pass on docker as well