Motivation:
bfbef036a8 made EPOLL respect autoRead while in ET mode. However it is possible that we may miss data pending on the RECV queue if autoRead is off. This is because maybeMoreDataToRead is updated after fireChannelRead and if a user calls read() from here maybeMoreDataToRead will be false because it is updated after the fireChannelRead call. The way maybeMoreDataToRead was updated also causes a single channel to continuously read on the event loop and not relinquish and give other channels to try reading.
Modifications:
- Ensure maybeMoreDataToRead is always set after all user events, and is evaluated with readPending to execute a epollInReady on the EventLoop
- Combine the checkResetEpollIn and maybeMoreDataToRead logic to invoke a epollInReady later into the epollInFinally method due to similar responsibilities
- Update unit tests to reflect the user calling read() on the event loop from channelRead()
Result:
EPOLL ET with autoRead set to false will not leave data on the RECV queue.
Motivation:
Setting the WRITE_BUFFER_LOW_WATER_MARK before WRITE_BUFFER_HIGH_WATER_MARK results in an internal Exception (appears only in the logs) if the value is larger than the default high water mark value. The WRITE_BUFFER_HIGH_WATER_MARK call appears to have no effect in this context.
Setting the values in the reverse order works.
Modifications:
- deprecated ChannelOption.WRITE_BUFFER_HIGH_WATER_MARK and
ChannelOption.WRITE_BUFFER_LOW_WATER_MARK.
- add one new option called ChannelOption.WRITE_BUFFER_WATER_MARK.
Result:
The high/low water mark values limits caused by default values are removed.
Setting the WRITE_BUFFER_LOW_WATER_MARK before WRITE_BUFFER_HIGH_WATER_MARK results in an internal Exception (appears only in the logs) if the value is larger than the default high water mark value. The WRITE_BUFFER_HIGH_WATER_MARK call appears to have no effect in this context.
Setting the values in the reverse order works.
Motivation:
NIO now supports a pluggable select strategy, but EPOLL currently doesn't support this. We should strive for feature parity for EPOLL.
Modifications:
- Add SelectStrategy to EPOLL transport.
Result:
EPOLL transport supports SelectStategy.
Motivation:
We need to break out of the read loop for two reasons:
- If the input was shutdown in between (which may be the case when the user did it in the
fireChannelRead(...) method we should not try to read again to not produce any
miss-leading exceptions.
- If the user closes the channel we need to ensure we not try to read from it again as
the filedescriptor may be re-used already by the OS if the system is handling a lot of
concurrent connections and so needs a lot of filedescriptors. If not do this we risk
reading data from a filedescriptor that belongs to another socket then the socket that
was "wrapped" by this Channel implementation.
Modification:
Break the reading loop if the input was shutdown from within the channelRead(...) method.
Result:
No more meaningless exceptions and no risk to read data from wrong socket after the original was closed.
Motivation:
8dbf5d02e5 modified the shutdown code for Socket but did not correctly calculate the change in shutdown state and only applying this change. This is significant because if sockets are being opening and closed quickly and the underlying FD happens to be reused we need to take care that we don't unintentionally change the state of the new FD by acting on an object which represents the old incarnation of that FD.
Modifications:
- Calculate the shutdown change, and only apply what has changed, or exit if no change.
Result:
Socket.shutdown can not inadvertently affect the state of another logical FD.
Motivation:
cf171ff525 introduced a change in behavior when dealing with closing channel in the read loop. This changed behavior may use stale state to determine if a channel should be shutdown and may be incorrect.
Modifications:
- Revert the usage of potentially stale state
Result:
Closing a channel in the read loop is based upon current state instead of potentially stale state.
Motivation:
The code of transport-native-epoll missed some things in terms of static keywords, @deprecated annotations and other minor things.
Modifications:
- Add missing @deprecated annotation
- Not using FQCN in javadocs
- Add static keyword where possible
- Use final fields when possible
- Remove throws IOException from method where it is not needed.
Result:
Cleaner code.
Motivation:
In commit acbca192bd we changed to have our native operations which either gall getsockopt or setsockopt throw IOExceptions (to be more specific we throw a ClosedChannelException in some cases). Unfortunally I missed to also do the same for getSoError() and missed to add throws IOException to the native methods.
Modifications:
- Correctly throw IOException from getSoError()
- Add throws IOException to native methods where it was missed.
Result:
Correct declaration of getSoError() and other native methods.
Motivation:
If SO_LINGER is set to 0 the EPOLL transport will send a FIN followed by a RST. This is not consistent with the behavior of the NIO transport. This variation in behavior can cause protocol violations in streaming protocols (e.g. HTTP) where a FIN may be interpreted as a valid end to a data stream, but RST may be treated as the data is corrupted and should be discarded.
https://github.com/netty/netty/issues/4170 Claims the behavior of NIO always issues a shutdown when close occurs. I could not find any evidence of this in Netty's NIO transport nor in the JDK's SocketChannel.close() implementation.
Modifications:
- AbstractEpollChannel should be consistent with the NIO transport and not force a shutdown on every close
- FileDescriptor to keep state in a consistent manner with the JDK and not allow a shutdown after a close
- Unit tests for NIO and EPOLL to ensure consistent behavior
Result:
EPOLL is capable of sending just a RST to terminate a connection.
Motivation:
To be consistent with the JDK we should ensure our native methods throw a ClosedChannelException if the Channel was previously closed. This will then be wrapped in a ChannelException as usual. For all other errors we continue to just throw a ChannelException directly.
Modifications:
Ensure getsockopt and setsockopt will throw a ClosedChannelException if the channel was closed before, on other errors we throw a ChannelException as before diretly.
Result:
Consistent with the NIO Channel implementations.
Motivation:
We should always first notify the promise before trigger an event through the pipeline to be consistent.
Modifications:
Ensure we notify the promise before fire event.
Result:
Consistent behavior
Motivation:
EpollServerSocketConfig.isFreebind() throws an exception when called.
Modifications:
Use the correct getsockopt arguments.
Result:
No more exception when call EpollServerSocketConfig.isFreebind()
Motivation:
TCP_MD5 is only supported by SocketChannels so remove it from EpollServerChannelConfig which is generic.
Modifications:
Remove invalid code.
Result:
Remove invalid / dead code.
Motivation:
EPOLL does not support autoread when in ET mode.
Modifications:
- EpollRecvByteAllocatorHandle should not unconditionally force reading just because ET is enabled
- AbstractEpollChannel and all derived classes which implement epollInReady must support a variable which indicates
there may be more data to read. The variable will be used when read is called to simulate a EPOLL wakeup and call epollInReady if necessary. This will ensure that if we don't read until EAGAIN that we will try to read again and not rely on EPOLL to notify us.
Result:
EPOLL ET supports auto read.
Motivation:
When using the native transport have support for TCP_DEFER_ACCEPT or / and TCP_QUICKACK can be useful.
Modifications:
- Add support for TCP_DEFER_ACCEPT and TCP_QUICKACK
- Ad unit tests
Result:
TCP_DEFER_ACCEPT and TCP_QUICKACK are supported now.
Motivation:
For on tests we expected a ConnectTimeoutException but used the default timeout of 10 seconds. This slows down testing.
Modifications:
Use connect timeout of 1 second in unit test.
Result:
Faster execution of unit test.
Motivation:
JNI_OnUnload(...) does not return anything (has void in its signature) so we should not try to return something.
Modifications:
Remove return.
Result:
Fix incorrect but harmless code.
Motivation:
netty_epoll_native.c uses dladdr in attempt to get the name of the library that the code is running in. However the address passed to this funciton (JNI_OnLoad) may not be unique in the context of the application which loaded it. For example if another JNI library is loaded this address may first resolve to the other JNI library and cause the path name parsing to fail, which will cause the library to fail.
Modifications:
- Pass an addresses which is local to the current library to dladdr
Result:
EPOLL JNI library can be loaded in an environment where multiple JNI libraries are loaded.
Fixes https://github.com/netty/netty/issues/4840
Motivation:
Currently our epoll native transport requires sun.misc.Unsafe and so we need to take this into account for Epoll.isAvailable().
Modifications:
Take into account if sun.misc.Unsafe is present.
Result:
Only return true for Epoll.isAvailable() if sun.misc.Unsafe is present.
Motivation:
If Netty's class files are renamed and the type references are updated (shaded) the native libraries will not function. The native epoll module uses implicit JNI bindings which requires the fully qualified java type names to match the method signatures of the native methods. This means EPOLL cannot be used with a shaded Netty.
Modifications:
- Make the JNI method registration dynamic
- support a system property io.netty.packagePrefix which must be prepended to the name of the native library (to ensure the correct library is loaded) and all class names (to allow classes to be correctly referenced)
- remove system property io.netty.native.epoll.nettyPackagePrefix which was recently added and the code to support it was incomplete
Result:
transport-native-epoll can be used when Netty has been shaded.
Fixes https://github.com/netty/netty/issues/4800
Motivation:
As we now can easily build static linked versions of tcnative it makes sense to run our netty build against all of them.
This helps to ensure our code works with libressl, openssl and boringssl.
Modifications:
Allow to specify -Dtcnative.artifactId= and -Dtcnative.version=
Result:
Easy to run netty build against different tcnative flavors.
Motivation:
When a wildcard address is used to bind a socket and ipv4 and ipv6 are usable we should accept both (just like JDK IO/NIO does).
Modifications:
Detect wildcard address and if so use in6addr_any
Result:
Correctly accept ipv4 and ipv6
Motivation:
transport-native-epoll finds java classes from JNI using fully qualified class names. If a shaded version of Netty is used then these lookups will fail.
Modifications:
- Allow a prefix to be appended to Netty class names in JNI code.
Result:
JNI code can be used with shaded version of Netty.
Motivation:
We should also be able to compile the native transport on 32bit systems.
Modifications:
Add cast to intptr_t for pointers
Result:
It's possible now to also compile on 32bit.
Motivation:
transport-native-epoll has its pom.xml encoding attribute set to ISO-8859-15. Because
of this gradle, and other dependency management systems, can't correctly resolve this
library from wherever it happens to be published.
Modifications:
netty/transport-native-epoll/pom.xml had its xml encoding changed to UTF-9
Result:
Gradle, and other dependency management systems, will now be able to correctly resolve this module.
Motivation:
Linux uses different socket options to set the traffic class (DSCP) on IPv6
Modifications:
Also set IPV6_TCLASS for IPv6 sockets
Result:
TrafficClass will work on IPv4 and IPv6 correctly
Motivation:
If an user will close a Socket / FileDescriptor multiple times we should handle the extra close operations as NOOP.
Modifications:
Only do the actual closing one time
Result:
No exception if close is called multiple times.
Motivation:
We missed to define the actual c function for isKeepAlive(...) and so throw UnsatisfieldLinkError.
Modifications:
- Add function
- Add unit test for Socket class
Result:
Correctly work isKeepAlive(...) when using native transport
Motivation:
We need to remove all registered events for a Channel from the EventLoop before doing the actual close to ensure we not produce a cpu spin when the actual close operation is delayed or executed outside of the EventLoop.
Modifications:
Deregister for events for NIO and EPOLL socket implementations when SO_LINGER is used.
Result:
No more cpu spin.
Motivation:
We should retain the original hostname when connect to a remote peer so the user can still query the origin hostname if getHostString() is used.
Modifications:
Compute a InetSocketAddress from the original remote address and the one returned by the Os.
Result:
Same behavior when using epoll transport and nio transport.
Motivation:
Fix a race-condition when closing NioSocketChannel or EpollSocketChannel while try to detect if a close executor should be used and the underlying socket was already closed. This could lead to an exception that then leave the channel / in an invalid state and so could lead to side-effects like heavy CPU usage.
Modifications:
Catch possible socket exception while try to get the SO_LINGER options from the underlying socket.
Result:
No more race-condition when closing the channel is possible with bad side-effects.
Motivation:
AbstractEpollStreamChannel has a queue which collects splice events. Splice is assumed not to be the most common use case of this class and thus the splice queue could be initialized in a lazy fashion to save memory. This becomes more significant when the number of connections grows.
Modifications:
- AbstractEpollStreamChannel.spliceQueue will be initialized in a lazy fashion
Result:
Less memory consumption for most use cases
Motivation:
We should use OneTimeTask where possible to reduce object creation.
Modifications:
Replace Runnable with OneTimeTask
Result:
Less object creation
Motivation:
If we have a lot of writes going on we currently need to lookup the IovArray for each Channel that does writes. This can have quite some perf overhead. We should not need to do this and just store a reference of the IovArray on the EpollEventLoop itself.
Modifications:
- Remove IoArrayThreadLocal
- Store the IoArray in the EventLoop itself
Result:
Less FastThreadLocal lookups
Motivation:
If ChannelOption.ALLOW_HALF_CLOSURE is true and the shutdown input operation fails we should not propagate this exception, and instead consider this socket's read as half closed.
Modifications:
- AbstractEpollChannel.shutdownInput should not propagate exceptions when attempting to shutdown the input, but instead should just close the socket
Result:
Users expecting a ChannelInputShutdownEvent will get this event even if the socket is already shutdown, and the shutdown operation fails.