Motivation:
441aa4c575 conditionally set the readFlag based upon if maybeMoreDataToRead is set. It is possible that the read flag will not be set, and nothing will be read by executeEpollInReadyRunnable and no actual data will be read even though the user requested it.
Modifications:
- Always set the readFlag in doBeginRead
- Make it so only a single epollInReadyRunnable can execute for a channel at a time
Result:
Less chance of missing read events in EPOLL transport.
Motivation:
OIO/NIO use a volatile variable to track if a read is pending. EPOLL does not use a volatile an executes a Runnable on the event loop thread to set readPending to false. These mechansims should be consistent, and not using a volatile variable is preferable because the variable is written to frequently in the event loop thread.
OIO also does not set readPending to false before each fireChannelRead operation and may result in reading more data than the user desires.
Modifications:
- OIO/NIO should not use a volatile variable for readPending
- OIO should set readPending to false before each fireChannelRead
Result:
OIO/NIO/EPOLL are more consistent w.r.t. readPending and volatile variable operations are reduced
Fixes https://github.com/netty/netty/issues/5069
Motivation:
441aa4c575 introduced a bug in transport-native-epoll where readPending is set to false before a read is attempted, but this should happen before fireChannelRead is called. The NIO transport also only sets the readPending variable to false on the first read in the event loop. This means that if the user only calls read() on the first channelRead(..) the select loop will still listen for read events even if the user does not call read() on subsequent channelRead() or channelReadComplete() in the same event loop run. If the user only needs 2 channelRead() calls then by default they will may get 14 more channelRead() calls in the current event loop, and then 16 more when the event loop is woken up for a read event. This will also read data off the TCP stack and allow the peer to queue more data in the local RECV buffers.
Modifications:
- readPending should be set to false before each call to channelRead()
- make NIO readPending set to false consistent with EPOLL
Result:
NIO and EPOLL transport set readPending to false at correct times which don't read more data than intended by the user.
Fixes https://github.com/netty/netty/issues/5082
Motivation:
There is a spelling error in FileRegion.transfered() as it should be transferred().
Modifications:
Deprecate old method and add a new one.
Result:
Fix typo and can remove the old method later.
Motivation:
bfbef036a8 made EPOLL respect autoRead while in ET mode. However it is possible that we may miss data pending on the RECV queue if autoRead is off. This is because maybeMoreDataToRead is updated after fireChannelRead and if a user calls read() from here maybeMoreDataToRead will be false because it is updated after the fireChannelRead call. The way maybeMoreDataToRead was updated also causes a single channel to continuously read on the event loop and not relinquish and give other channels to try reading.
Modifications:
- Ensure maybeMoreDataToRead is always set after all user events, and is evaluated with readPending to execute a epollInReady on the EventLoop
- Combine the checkResetEpollIn and maybeMoreDataToRead logic to invoke a epollInReady later into the epollInFinally method due to similar responsibilities
- Update unit tests to reflect the user calling read() on the event loop from channelRead()
Result:
EPOLL ET with autoRead set to false will not leave data on the RECV queue.
Motivation:
Setting the WRITE_BUFFER_LOW_WATER_MARK before WRITE_BUFFER_HIGH_WATER_MARK results in an internal Exception (appears only in the logs) if the value is larger than the default high water mark value. The WRITE_BUFFER_HIGH_WATER_MARK call appears to have no effect in this context.
Setting the values in the reverse order works.
Modifications:
- deprecated ChannelOption.WRITE_BUFFER_HIGH_WATER_MARK and
ChannelOption.WRITE_BUFFER_LOW_WATER_MARK.
- add one new option called ChannelOption.WRITE_BUFFER_WATER_MARK.
Result:
The high/low water mark values limits caused by default values are removed.
Setting the WRITE_BUFFER_LOW_WATER_MARK before WRITE_BUFFER_HIGH_WATER_MARK results in an internal Exception (appears only in the logs) if the value is larger than the default high water mark value. The WRITE_BUFFER_HIGH_WATER_MARK call appears to have no effect in this context.
Setting the values in the reverse order works.
Motivation:
NIO now supports a pluggable select strategy, but EPOLL currently doesn't support this. We should strive for feature parity for EPOLL.
Modifications:
- Add SelectStrategy to EPOLL transport.
Result:
EPOLL transport supports SelectStategy.
Motivation:
We need to break out of the read loop for two reasons:
- If the input was shutdown in between (which may be the case when the user did it in the
fireChannelRead(...) method we should not try to read again to not produce any
miss-leading exceptions.
- If the user closes the channel we need to ensure we not try to read from it again as
the filedescriptor may be re-used already by the OS if the system is handling a lot of
concurrent connections and so needs a lot of filedescriptors. If not do this we risk
reading data from a filedescriptor that belongs to another socket then the socket that
was "wrapped" by this Channel implementation.
Modification:
Break the reading loop if the input was shutdown from within the channelRead(...) method.
Result:
No more meaningless exceptions and no risk to read data from wrong socket after the original was closed.
Motivation:
8dbf5d02e5 modified the shutdown code for Socket but did not correctly calculate the change in shutdown state and only applying this change. This is significant because if sockets are being opening and closed quickly and the underlying FD happens to be reused we need to take care that we don't unintentionally change the state of the new FD by acting on an object which represents the old incarnation of that FD.
Modifications:
- Calculate the shutdown change, and only apply what has changed, or exit if no change.
Result:
Socket.shutdown can not inadvertently affect the state of another logical FD.
Motivation:
cf171ff525 introduced a change in behavior when dealing with closing channel in the read loop. This changed behavior may use stale state to determine if a channel should be shutdown and may be incorrect.
Modifications:
- Revert the usage of potentially stale state
Result:
Closing a channel in the read loop is based upon current state instead of potentially stale state.
Motivation:
The code of transport-native-epoll missed some things in terms of static keywords, @deprecated annotations and other minor things.
Modifications:
- Add missing @deprecated annotation
- Not using FQCN in javadocs
- Add static keyword where possible
- Use final fields when possible
- Remove throws IOException from method where it is not needed.
Result:
Cleaner code.
Motivation:
In commit acbca192bd we changed to have our native operations which either gall getsockopt or setsockopt throw IOExceptions (to be more specific we throw a ClosedChannelException in some cases). Unfortunally I missed to also do the same for getSoError() and missed to add throws IOException to the native methods.
Modifications:
- Correctly throw IOException from getSoError()
- Add throws IOException to native methods where it was missed.
Result:
Correct declaration of getSoError() and other native methods.
Motivation:
If SO_LINGER is set to 0 the EPOLL transport will send a FIN followed by a RST. This is not consistent with the behavior of the NIO transport. This variation in behavior can cause protocol violations in streaming protocols (e.g. HTTP) where a FIN may be interpreted as a valid end to a data stream, but RST may be treated as the data is corrupted and should be discarded.
https://github.com/netty/netty/issues/4170 Claims the behavior of NIO always issues a shutdown when close occurs. I could not find any evidence of this in Netty's NIO transport nor in the JDK's SocketChannel.close() implementation.
Modifications:
- AbstractEpollChannel should be consistent with the NIO transport and not force a shutdown on every close
- FileDescriptor to keep state in a consistent manner with the JDK and not allow a shutdown after a close
- Unit tests for NIO and EPOLL to ensure consistent behavior
Result:
EPOLL is capable of sending just a RST to terminate a connection.
Motivation:
To be consistent with the JDK we should ensure our native methods throw a ClosedChannelException if the Channel was previously closed. This will then be wrapped in a ChannelException as usual. For all other errors we continue to just throw a ChannelException directly.
Modifications:
Ensure getsockopt and setsockopt will throw a ClosedChannelException if the channel was closed before, on other errors we throw a ChannelException as before diretly.
Result:
Consistent with the NIO Channel implementations.
Motivation:
We should always first notify the promise before trigger an event through the pipeline to be consistent.
Modifications:
Ensure we notify the promise before fire event.
Result:
Consistent behavior
Motivation:
EpollServerSocketConfig.isFreebind() throws an exception when called.
Modifications:
Use the correct getsockopt arguments.
Result:
No more exception when call EpollServerSocketConfig.isFreebind()
Motivation:
TCP_MD5 is only supported by SocketChannels so remove it from EpollServerChannelConfig which is generic.
Modifications:
Remove invalid code.
Result:
Remove invalid / dead code.
Motivation:
EPOLL does not support autoread when in ET mode.
Modifications:
- EpollRecvByteAllocatorHandle should not unconditionally force reading just because ET is enabled
- AbstractEpollChannel and all derived classes which implement epollInReady must support a variable which indicates
there may be more data to read. The variable will be used when read is called to simulate a EPOLL wakeup and call epollInReady if necessary. This will ensure that if we don't read until EAGAIN that we will try to read again and not rely on EPOLL to notify us.
Result:
EPOLL ET supports auto read.
Motivation:
When using the native transport have support for TCP_DEFER_ACCEPT or / and TCP_QUICKACK can be useful.
Modifications:
- Add support for TCP_DEFER_ACCEPT and TCP_QUICKACK
- Ad unit tests
Result:
TCP_DEFER_ACCEPT and TCP_QUICKACK are supported now.
Motivation:
For on tests we expected a ConnectTimeoutException but used the default timeout of 10 seconds. This slows down testing.
Modifications:
Use connect timeout of 1 second in unit test.
Result:
Faster execution of unit test.
Motivation:
JNI_OnUnload(...) does not return anything (has void in its signature) so we should not try to return something.
Modifications:
Remove return.
Result:
Fix incorrect but harmless code.
Motivation:
netty_epoll_native.c uses dladdr in attempt to get the name of the library that the code is running in. However the address passed to this funciton (JNI_OnLoad) may not be unique in the context of the application which loaded it. For example if another JNI library is loaded this address may first resolve to the other JNI library and cause the path name parsing to fail, which will cause the library to fail.
Modifications:
- Pass an addresses which is local to the current library to dladdr
Result:
EPOLL JNI library can be loaded in an environment where multiple JNI libraries are loaded.
Fixes https://github.com/netty/netty/issues/4840
Motivation:
Currently our epoll native transport requires sun.misc.Unsafe and so we need to take this into account for Epoll.isAvailable().
Modifications:
Take into account if sun.misc.Unsafe is present.
Result:
Only return true for Epoll.isAvailable() if sun.misc.Unsafe is present.
Motivation:
If Netty's class files are renamed and the type references are updated (shaded) the native libraries will not function. The native epoll module uses implicit JNI bindings which requires the fully qualified java type names to match the method signatures of the native methods. This means EPOLL cannot be used with a shaded Netty.
Modifications:
- Make the JNI method registration dynamic
- support a system property io.netty.packagePrefix which must be prepended to the name of the native library (to ensure the correct library is loaded) and all class names (to allow classes to be correctly referenced)
- remove system property io.netty.native.epoll.nettyPackagePrefix which was recently added and the code to support it was incomplete
Result:
transport-native-epoll can be used when Netty has been shaded.
Fixes https://github.com/netty/netty/issues/4800
Motivation:
As we now can easily build static linked versions of tcnative it makes sense to run our netty build against all of them.
This helps to ensure our code works with libressl, openssl and boringssl.
Modifications:
Allow to specify -Dtcnative.artifactId= and -Dtcnative.version=
Result:
Easy to run netty build against different tcnative flavors.
Motivation:
When a wildcard address is used to bind a socket and ipv4 and ipv6 are usable we should accept both (just like JDK IO/NIO does).
Modifications:
Detect wildcard address and if so use in6addr_any
Result:
Correctly accept ipv4 and ipv6
Motivation:
transport-native-epoll finds java classes from JNI using fully qualified class names. If a shaded version of Netty is used then these lookups will fail.
Modifications:
- Allow a prefix to be appended to Netty class names in JNI code.
Result:
JNI code can be used with shaded version of Netty.
Motivation:
We should also be able to compile the native transport on 32bit systems.
Modifications:
Add cast to intptr_t for pointers
Result:
It's possible now to also compile on 32bit.
Motivation:
transport-native-epoll has its pom.xml encoding attribute set to ISO-8859-15. Because
of this gradle, and other dependency management systems, can't correctly resolve this
library from wherever it happens to be published.
Modifications:
netty/transport-native-epoll/pom.xml had its xml encoding changed to UTF-9
Result:
Gradle, and other dependency management systems, will now be able to correctly resolve this module.
Motivation:
Linux uses different socket options to set the traffic class (DSCP) on IPv6
Modifications:
Also set IPV6_TCLASS for IPv6 sockets
Result:
TrafficClass will work on IPv4 and IPv6 correctly
Motivation:
If an user will close a Socket / FileDescriptor multiple times we should handle the extra close operations as NOOP.
Modifications:
Only do the actual closing one time
Result:
No exception if close is called multiple times.
Motivation:
We missed to define the actual c function for isKeepAlive(...) and so throw UnsatisfieldLinkError.
Modifications:
- Add function
- Add unit test for Socket class
Result:
Correctly work isKeepAlive(...) when using native transport
Motivation:
We need to remove all registered events for a Channel from the EventLoop before doing the actual close to ensure we not produce a cpu spin when the actual close operation is delayed or executed outside of the EventLoop.
Modifications:
Deregister for events for NIO and EPOLL socket implementations when SO_LINGER is used.
Result:
No more cpu spin.
Motivation:
We should retain the original hostname when connect to a remote peer so the user can still query the origin hostname if getHostString() is used.
Modifications:
Compute a InetSocketAddress from the original remote address and the one returned by the Os.
Result:
Same behavior when using epoll transport and nio transport.