Motivation:
Java11 disallow draining any remaining bytes from the socket if a write causes a connection reset. This should be completely safe to do. At the moment if a write is causing a connection-reset you basically loose all the pending bytes that are sitting on the socket and are waiting to be read.
This happens because SocketOutputStream.write(…) may call AbstractPlainSocketImpl.setConnectionReset(…). Once this method is called any read(…) call will just throw a SocketException without even attempt to read any remaining data.
This is related:
- https://bugs.openjdk.java.net/browse/JDK-8199329
- http://hg.openjdk.java.net/jdk/jdk/rev/92cca24c8807
- http://mail.openjdk.java.net/pipermail/net-dev/2018-May/011511.html
Modifications:
Tolarate if remaining bytes could not be read when using OIO.
Result:
Be able to build Netty and run testsuite while using Java11
Motivation:
We added a workaround for Java 11 as it not produced a connect-reset when SO_LINGER with 0 was set and NIO was used. This was fixed in the latest ea release of Java 11:
- http://hg.openjdk.java.net/jdk/jdk/rev/ea54197f4fe4
- https://bugs.openjdk.java.net/browse/JDK-8203059
Modifications:
Revert workaround.
Result:
Test that Java 11 behave the same way as earlier Java versions again.
Motivation:
Java 11 will be out soon, so we should be able to build (and run tests) netty.
Modifications:
- Add dependency that is needed till Java 11
- Adjust tests so these also pass on Java 11 (SocketChannelImpl.close() behavious a bit differently now).
Result:
Build also works (and tests pass) on Java 11.
Motivation:
Java 10 is out so we should be able to build netty with it (and run the tests).
Modifications:
- Update Mockito and JBoss Marshalling to support Java 10
- Fix unit test to not depend on specific cipher which is not present in Java 10 anymore
Result:
Netty builds (and runs all tests) when using Java 10
Motivation:
DatagramPacket.recipient() doesn't return the actual destination IP, but the IP the app is bound to.
Modification:
- IP_RECVORIGDSTADDR option is enabled for UDP sockets, which allows retrieval of ancillary information containing the original recipient.
- _recvFrom(...) function from transport-native-unix-common/src/main/c/netty_unix_socket.c is modified such that if IP_RECVORIGDSTADDR is set, recvmsg is used instead of recvfrom; enabling the retrieval of the original recipient.
- DatagramSocketAddress also contains a 'local' address, representing the recipient.
- EpollDatagramChannel is updated to return the retrieved recipient address instead of the address the channel is bound to.
Result:
Fixes#4950.
Motivation:
AbstractNioByteChannel will detect that the remote end of the socket has
been closed and propagate a user event through the pipeline. However if
the user has auto read on, or calls read again, we may propagate the
same user events again. If the underlying transport continuously
notifies us that there is read activity this will happen in a spin loop
which consumes unnecessary CPU.
Modifications:
- AbstractNioByteChannel's unsafe read() should check if the input side
of the socket has been shutdown before processing the event. This is
consistent with EPOLL and KQUEUE transports.
- add unit test with @normanmaurer's help, and make transports consistent with respect to user events
Result:
No more read spin loop in NIO when the channel is half closed.
Motivation:
We called the wrong super method in the test and also had a few unused imports.
Modifications:
Fix super method call and cleanup.
Result:
More correct test and cleanup.
Motivation:
b215794de3 recently introduced a change in behavior where writeSpinCount provided a limit for how many write operations were attempted per flush operation. However when the write quantum was meet the selector write flag was not cleared, and the channel unsafe flush0 method has an optimization which prematurely exits if the write flag is set. This may lead to no write progress being made under the following scenario:
- flush is called, but the socket can't accept all data, we set the write flag
- the selector wakes us up because the socket is writable, we write data and use the writeSpinCount quantum
- we then schedule a flush() on the EventLoop to execute later, however it the flush0 optimization prematurely exits because the write flag is still set
In this scenario the socket is still writable so the EventLoop may never notify us that the socket is writable, and therefore we may never attempt to flush data to the OS.
Modifications:
- When the writeSpinCount quantum is exceeded we should clear the selector write flag
Result:
Fixes https://github.com/netty/netty/issues/7729
Motivation:
We recently removed support for renegotiation, but there are still some hooks to attempt to allow remote initiated renegotiation to succeed. The remote initated renegotiation can be even more problematic from a security stand point and should also be removed.
Modifications:
- Remove state related to remote iniated renegotiation from OpenSslEngine
Result:
More renegotiation code removed from the OpenSslEngine code path.
Motivation:
IovArray implements MessageProcessor, and the processMessage method will continue to be called during iteration until it returns true. A recent commit b215794de3 changed the return value to only return true if any component of a CompositeByteBuf was added as a result of the method call. However this results in the iteration continuing, and potentially subsequent smaller buffers maybe added, which will result in out of order writes and generally corrupts data.
Modifications:
- IovArray#add should return false so that the MessageProcessor#processMessage will stop iterating.
Result:
Native transports which use IovArray will not corrupt data during gathering writes of CompositeByteBuf objects.
Motivation:
At the moment we use netty.io as BAD_HOST with an port that we know is timing out. This may change in the future so we should better use 198.51.100.254 which is specified as "for documentation only".
Modifications:
Replace netty.io with 198.51.100.254 in tests that depend on BAD_HOST.
Result:
More future proof code.
Motivation:
af2f343648 introduced a test-case which was flacky due of multiple problems:
- we called writeAndFlush(...) in channelRead(...) and assumed it will only be called once. This is true most of the times but it may be called multile times if the data is fragemented.
- we didnt guard against the possibility that channelRead(...) is called with an empty buffer
Modifications:
- Call writeAndFlush(...) in channelActive(...) so we are sure its only called once and close the channel once we wrote the data
- only compare the data after we received a close so we are sure there isnt anything extra received
- check for exception and if we catched one fail the test.
Result:
No flacky test anymore and easier to debug issues that accour because of a catched exception.
Motivation:
FileDescriptor#writev calls JNI code, and that JNI code dereferences a NULL pointer which crashes the application. This occurs when writing a single CompositeByteBuf object with more than one component.
Modifications:
- Initialize the iovec iterator properly to avoid the core dump
- Fix the array length calculation if we aren't able to fit all the ByteBuffer objects in the iovec array
Result:
No more core dump.
Motivation:
When SO_LINGER is used we run doClose() on the GlobalEventExecutor by default so we need to ensure we schedule all code that needs to be run on the EventLoop on the EventLoop in doClose. Beside this there are also threading issues when calling shutdownOutput(...)
Modifications:
- Schedule removal from EventLoop to the EventLoop
- Correctly handle shutdownOutput and shutdown in respect with threading-model
- Add unit tests
Result:
Fixes [#7159].
Motivation:
We recently saw an assertion failure when running DatagramUnicastTest.testSimpleSendWithConnect.
Modifications:
- Adding more debug infos
- Ensure we always correctly release the buffers.
Result:
More informations when tests fail.
Motivation:
If AutoClose is false and there is a IoException then AbstractChannel will not close the channel but instead just fail flushed element in the ChannelOutboundBuffer. AbstractChannel also notifies of writability changes, which may lead to an infinite loop if the peer has closed its read side of the socket because we will keep accepting more data but continuously fail because the peer isn't accepting writes.
Modifications:
- If the transport throws on a write we should acknowledge that the output side of the channel has been shutdown and cleanup. If the channel can't accept more data because it is full, and still healthy it is not expected to throw. However if the channel is not healthy it will throw and is not expected to accept any more writes. In this case we should shutdown the output for Channels that support this feature and otherwise just close.
- Connection-less protocols like UDP can remain the same because the channel may disconnected temporarily.
- Make sure AbstractUnsafe#shutdownOutput is called because the shutdown on the socket may throw an exception.
Result:
More correct handling of write failure when AutoClose is false.
Motivation:
SocketStringEchoTest has been observed to fail on CI servers, but the stack traces still indicate work was being done.
Modifications:
- Increase the test timeout
Result:
Tests have more time to complete, and hopefully less false positive test failures.
Motivation:
EpollDomainSocketGatheringWriteTest. testGatheringWriteBig takes on average about 20-25 seconds on the CI servers, but there is a 30 second timeout being applied which leads to what maybe false positive test failures.
Modifications:
- Increase the test timeout to 120 seconds globally and 60 seconds to wait for all writes per test
Result:
Higher timeout for potentially less false positive test failures.
Motivation:
SocketGatherWriteTest has been observed to fail and it has numerous issues which when resolved may help reduce the test failures.
Modifications:
- A volatile counter and a spin/sleep loop is used to trigger test termination. Incrementing a volatile is generally bad practice and can be avoided in this situation. This mechanism can be replaced by a promise. This mechanism should also trigger upon exception or channel inactive.
- The TestHandler maintains an internal buffer, but it is not released. We now only create a buffer on the server side and release it after comparing the expected results.
- The composite buffer creation logic can be simplified, also the existing composite buffer doesn't take into account the buffer's reader index when building buf2.
Result:
Cleaner test.
Motivation:
SocketStringEchoTest has been observed to fail and it has numerous issues which when resolved may help reduce the test failures.
Modifications:
- A volatile counter and a spin/sleep loop is used to trigger test termination. Incrementing a volatile is generally bad practice and can be avoided in this situation. This mechanism can be replaced by a promise. This mechanism should also trigger upon exception or channel inactive.
- Asserts are done in the Netty threads. Although these should result in a exceptionCaught the test may not observe these failures because it is spinning waiting for the count to reach the desired value.
Result:
Cleaner test.
Motivation:
We need to support SO_TIMEOUT for the OioDatagramChannel but we miss this atm as we not have special handling for it in the DatagramChannelConfig impl that we use. Because of this the following log lines showed up when running the testsuite:
20:31:26.299 [main] WARN io.netty.bootstrap.Bootstrap - Unknown channel option 'SO_TIMEOUT' for channel '[id: 0x7cb9183c]'
Modifications:
- Add OioDatagramChannelConfig and impl
- Correctly set SO_TIMEOUT in testsuite
Result:
Support SO_TIMEOUT for OioDatagramChannel and so faster execution of datagram related tests in the testsuite
Motivation:
Implementations of DuplexChannel delegate the shutdownOutput to the underlying transport, but do not take any action on the ChannelOutboundBuffer. In the event of a write failure due to the underlying transport failing and application may attempt to shutdown the output and allow the read side the transport to finish and detect the close. However this may result in an issue where writes are failed, this generates a writability change, we continue to write more data, and this may lead to another writability change, and this loop may continue. Shutting down the output should fail all pending writes and not allow any future writes to avoid this scenario.
Modifications:
- Implementations of DuplexChannel should null out the ChannelOutboundBuffer and fail all pending writes
Result:
More controlled sequencing for shutting down the output side of a channel.
Motivation:
We did not correctly handle connect() and disconnect() in EpollDatagramChannel / KQueueDatagramChannel and so the behavior was different compared to NioDatagramChannel.
Modifications:
- Correct implement connect and disconnect methods
- Share connect and related code
- Add tests
Result:
EpollDatagramChannel / KQueueDatagramChannel also supports correctly connect() and disconnect() methods.
Motivation:
We should not try to detect a free port in tests put just use 0 when bind so there is no race in which the system my bind something to the port we choosen before.
Modifications:
- Remove the usage of TestUtils.getFreePort() in the testsuite
- Remove hack to workaround bind errors which will not happen anymore now
Result:
Less flacky tests.
Motivation:
When run the current testsuite on docker (mac) it will fail a few tests with:
io.netty.channel.AbstractChannel$AnnotatedConnectException: connect(..) failed: Cannot assign requested address: /0:0:0:0:0:0:0:0%0:46607
Caused by: java.net.ConnectException: connect(..) failed: Cannot assign requested address
Modifications:
Specify host explicit as done in other tests to only use ipv6 when really supported.
Result:
Build pass on docker as well