Motivation:
If a user calls EpollSocketChannelConfig.getOptions() and TCP_FASTOPEN_CONNECT is not supported we throw an exception.
Modifications:
- Just return 0 if ENOPROTOOPT is set.
- Add testcase
Result:
getOptions() works as epxected.
Motivation:
Linux kernel 4.11 introduced a new socket option,
TCP_FASTOPEN_CONNECT, that greatly simplifies making TCP Fast Open
connections on client side. Usually simply setting the flag before
connect() call is enough, no more changes are required.
Details can be found in kernel commit:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=19f6d3f3
Modifications:
TCP_FASTOPEN_CONNECT socket option was added to EpollChannelOption
class.
Result:
Netty clients can easily make TCP Fast Open connections. Simply
calling option(EpollChannelOption.TCP_FASTOPEN_CONNECT, true) in
client bootstrap is enough (given recent enough kernel).
Motivation:
As noticed in https://stackoverflow.com/questions/45700277/
compilation can fail if the definition of a method doesn't
match the declaration. It's easy enough to add this in, and make
it easy to compile.
Modifications:
Add JNIEXPORT to the entry points.
* On Windows this adds: `__declspec(dllexport)`
* On Mac this adds: `__attribute__((visibility("default")))`
* On Linux (GCC 4.2+) this adds: ` __attribute__((visibility("default")))`
* On other it doesn't add anything.
Result:
Easier compilation
Motivation:
KQueueEventLoop and EpollEventLoop implement different approaches to applying a timeout of their respective poll calls. Epoll attempts to ensure the desired timeout is satisfied at the java layer and at the JNI layer, but it should be sufficient to account for spurious wakups at the JNI layer. Epoll timeout granularity is also limited to milliseconds which may be too large for some latency sensitive applications.
Modifications:
- Make EpollEventLoop wait method look like KQueueEventLoop
- Epoll should support a finer timeout granularity via timerfd_create. We can hide most of these details behind the epollWait0 JNI call to avoid crossing additional JNI boundaries.
Result:
More consistent timeout approach between KQueue and Epoll.
Motivation:
Due to an oversight (by myself), linking two JNI modules with
duplicate symbols fails in linking. This only seems to happen
some of the time (the behavior seems to be different between GCC
and Clang toolchains). For instance, including both netty tcnative
and netty epoll fails to link because of duplicate JNI_OnLoad
symobols.
Modification:
Do not define the JNI_OnLoad and JNI_OnUnload symbols when
compiling for static linkage, as indicated by the NETTY_BUILD_STATIC
preprocessor define. They are never directly called when
statically linked.
Result:
Able to statically compile epoll and tcnative code into a single
binary.
Motivation:
At the moment we try to load the library using multiple names which includes names using - but also _ . We should just use _ all the time.
Modifications:
Replace - with _
Result:
Fixes [#7069]
Motivation:
We used an int[] to store all values that are returned in the struct for TCP_INFO which is not good enough as it uses usigned int values.
Modifications:
- Change int[] to long[] and correctly cast values.
Result:
No more truncated values.
Motivation:
Enable static linking for Java 8. These commits are the same as those introduced to netty tcnative. The goal is to allow lots of JNI libraries to be statically linked together without having conflict `JNI_OnLoad` methods.
Modification:
* add JNI_OnLoad suffixes to enable static linking
* Add static names to the list of libraries that try to be loaded
* Enable compiling with JNI 1.8
* Sort includes
Result:
Enable statically linked JNI code.
Motivation:
Google requires stricter compilation by adding -Werror and enabling many other warnings.
Modification:
* fix warning caused by -Wmissing-braces
* Use the address of `sendmmsg` rather than the function itself when
checking for presence. This resovles the warning caused by
`-Wpointer-bool-conversion`.
More detail:
When compiling on Linux, `sendmmsg` is always present, so the
function is always nonnull. When compiling elsewhere, the
function is defined as `__attribute__((weak))` which means it
may be absent at link time. This is controlled by
`IO_NETTY_SENDMMSG_NOT_FOUND`, which is off by default.
The reason for the error is due to the risk of accidentally not
calling the function. By adding `&` before the function, there
is no ambiguity. (the result of the fn call cannot have its
address taken.)
* use != to check for sendmmsg
Result:
Easier compilation.
Motivation:
This allows netty to operate in 'transparent proxy' mode, intercepting connections
to other addresses by means of Linux firewalling rules, as per
https://www.kernel.org/doc/Documentation/networking/tproxy.txt
The original destination address can be obtained by referencing
ch.localAddress().
Modification:
Add methods similar to those for ipFreeBind, to set the IP_TRANSPARENT option.
Result:
Allows setting and getting of the IP_TRANSPARENT option, which allows retrieval of the ultimate socket address originally requested.
Motivation:
We currently don't have a native transport which supports kqueue https://www.freebsd.org/cgi/man.cgi?query=kqueue&sektion=2. This can be useful for BSD systems such as MacOS to take advantage of native features, and provide feature parity with the Linux native transport.
Modifications:
- Make a new transport-native-unix-common module with all the java classes and JNI code for generic unix items. This module will build a static library for each unix platform, and included in the dynamic libraries used for JNI (e.g. transport-native-epoll, and eventually kqueue).
- Make a new transport-native-unix-common-tests module where the tests for the transport-native-unix-common module will live. This is so each unix platform can inherit from these test and ensure they pass.
- Add a new transport-native-kqueue module which uses JNI to directly interact with kqueue
Result:
JNI support for kqueue.
Fixes https://github.com/netty/netty/issues/2448
Fixes https://github.com/netty/netty/issues/4231
Motivation:
I had a need to know the user credentials of a connected unix domain socket.
Modifications:
Added a class to encapsulate user credentials (UID, GID, and the PID).
Augemented the Socket class to provide the JNI native interface to return this new class
Augemented the c code to call getSockOpts passing <a href=http://man7.org/linux/man-pages/man7/socket.7.html>SO_PEERCRED</a>
Then surfaced the ability to get user credentials in the EpollDomainSocketChannel
Result:
The EpollDomainSocketChannel now has a the following function signature:
public PeerCredentials peerCredentials() throws IOException allowing a caller to get the UID, GID, and PID of the linux process
connected to the unix domain socket.
Motivation:
The NIO transport used an IllegalStateException if a user tried to issue another connect(...) while the connect was still in process. For this case the JDK specified a ConnectPendingException which we should use. The same issues exists in the EPOLL transport. Beside this the EPOLL transport also does not throw the right exceptions for ENETUNREACH and EISCONN errno codes.
Modifications:
- Replace IllegalStateException with ConnectPendingException in NIO and EPOLL transport
- throw correct exceptions for ENETUNREACH and EISCONN in EPOLL transport
- Add test case
Result:
More correct error handling for connect attempts when using NIO and EPOLL transport
Motivation:
ECONNREFUSED can be a common type of exception when attempting to finish the connection process. Generating a new exception each time can be costly and quickly bloat memory usage.
Modifications:
- Expose ECONNREFUSED from JNI and cache this exception in Socket.finishConnect
Result:
ECONNREFUSED during finish connect doesn't create a new exception each time.
Motivation:
Unused methods create warnings on some C compilers. It may not be feasible to selectively turn them off.
Modifications:
Remove createInetSocketAddress as it is unused.
Result:
Less noisy compilation
Motivation:
epoll_wait accepts a timeout argument which will specify the maximum amount of time the epoll_wait will wait for an event to occur. If the epoll_wait method returns for any reason that is not fatal (e.g. EINTR) the original timeout value is re-used. This does not honor the timeout interface contract and can lead to unbounded time in epoll_wait.
Modifications:
- The time taken by epoll_wait should be decremented before calling epoll_wait again, and if the remaining time is exhausted we should return 0 according to the epoll_wait interface docs http://man7.org/linux/man-pages/man2/epoll_wait.2.html
- link librt which is needed for some platforms to use clock_gettime
Result:
epoll_wait will wait for at most timeout ms according to the epoll_wait interface contract.
Motivation:
We used transfered in native code which is not correct spelling. It should be transferred.
Modifications:
Fix typo.
Result:
Less typos in source code.
Motivation:
When epoll datagram channel invokes sendmmsg0, _all_ of the messages go
on the wire with the address of the _last_ packet in the list.
Modifications:
An array of addresses equal to the length of the messages is allocated
on the stack to hold the address for each msg_hdr.msg_name.
Result:
Each message goes on the wire with the correct address.
Motivation:
To be consistent with the JDK we should ensure our native methods throw a ClosedChannelException if the Channel was previously closed. This will then be wrapped in a ChannelException as usual. For all other errors we continue to just throw a ChannelException directly.
Modifications:
Ensure getsockopt and setsockopt will throw a ClosedChannelException if the channel was closed before, on other errors we throw a ChannelException as before diretly.
Result:
Consistent with the NIO Channel implementations.
Motivation:
EpollServerSocketConfig.isFreebind() throws an exception when called.
Modifications:
Use the correct getsockopt arguments.
Result:
No more exception when call EpollServerSocketConfig.isFreebind()
Motivation:
When using the native transport have support for TCP_DEFER_ACCEPT or / and TCP_QUICKACK can be useful.
Modifications:
- Add support for TCP_DEFER_ACCEPT and TCP_QUICKACK
- Ad unit tests
Result:
TCP_DEFER_ACCEPT and TCP_QUICKACK are supported now.
Motivation:
JNI_OnUnload(...) does not return anything (has void in its signature) so we should not try to return something.
Modifications:
Remove return.
Result:
Fix incorrect but harmless code.
Motivation:
netty_epoll_native.c uses dladdr in attempt to get the name of the library that the code is running in. However the address passed to this funciton (JNI_OnLoad) may not be unique in the context of the application which loaded it. For example if another JNI library is loaded this address may first resolve to the other JNI library and cause the path name parsing to fail, which will cause the library to fail.
Modifications:
- Pass an addresses which is local to the current library to dladdr
Result:
EPOLL JNI library can be loaded in an environment where multiple JNI libraries are loaded.
Fixes https://github.com/netty/netty/issues/4840
Motivation:
If Netty's class files are renamed and the type references are updated (shaded) the native libraries will not function. The native epoll module uses implicit JNI bindings which requires the fully qualified java type names to match the method signatures of the native methods. This means EPOLL cannot be used with a shaded Netty.
Modifications:
- Make the JNI method registration dynamic
- support a system property io.netty.packagePrefix which must be prepended to the name of the native library (to ensure the correct library is loaded) and all class names (to allow classes to be correctly referenced)
- remove system property io.netty.native.epoll.nettyPackagePrefix which was recently added and the code to support it was incomplete
Result:
transport-native-epoll can be used when Netty has been shaded.
Fixes https://github.com/netty/netty/issues/4800
Motivation:
When a wildcard address is used to bind a socket and ipv4 and ipv6 are usable we should accept both (just like JDK IO/NIO does).
Modifications:
Detect wildcard address and if so use in6addr_any
Result:
Correctly accept ipv4 and ipv6
Motivation:
transport-native-epoll finds java classes from JNI using fully qualified class names. If a shaded version of Netty is used then these lookups will fail.
Modifications:
- Allow a prefix to be appended to Netty class names in JNI code.
Result:
JNI code can be used with shaded version of Netty.
Motivation:
We should also be able to compile the native transport on 32bit systems.
Modifications:
Add cast to intptr_t for pointers
Result:
It's possible now to also compile on 32bit.
Motivation:
Linux uses different socket options to set the traffic class (DSCP) on IPv6
Modifications:
Also set IPV6_TCLASS for IPv6 sockets
Result:
TrafficClass will work on IPv4 and IPv6 correctly
Motivation:
We missed to define the actual c function for isKeepAlive(...) and so throw UnsatisfieldLinkError.
Modifications:
- Add function
- Add unit test for Socket class
Result:
Correctly work isKeepAlive(...) when using native transport
Motivation:
transport-native-epoll is designed to be specific to Linux. However there is native code that can be extracted out and made to work on more Unix like distributions. There are a few steps to be completely decoupled but the first step is to extract out code that can run in a more general Unix environment from the Linux specific code base.
Modifications:
- Move all non-Linux specific stuff from Native.java into the io.netty.channel.unix package.
- io.netty.channel.unix.FileDescriptor will inherit all the native methods that are specific to file descriptors.
- io_netty_channel_epoll_Native.[c|h] will only have code that is specific to Linux.
Result:
Code is decoupled and design is streamlined in FileDescriptor.
Motivation:
Java_io_netty_channel_epoll_Native_getSoError incorrectly returns the value from the get socket option function.
Modifications:
- return the value from the result of the get socket option call
Result:
Java_io_netty_channel_epoll_Native_getSoError returns the correct value.
Motivation:
If a RDHUP and IN event occurred at the same time it is possible we may not read all pending data on the channel. We should ensure we read data before processing the RDHUP event.
Modifications:
- Process the RDHUP event before the IN event.
Result:
Data will not be dropped.
Fixes https://github.com/netty/netty/issues/4317
Motivation:
We should fail the build on warnings in the JNI/c code.
Modifications:
- Add GCC flag to fail build on warnings.
- Fix warnings (which also fixed a bug when using splice with offsets).
Result:
Better code quality.
Motivation:
TCP Fast Open allows data to be carried in the SYN and SYN-ACK packets and consumed by the receiving end during the initial connection handshake, and saves up to one full round-trip time (RTT) compared to the standard TCP, which requires a three-way handshake (3WHS) to complete before data can be exchanged. This commit enables support for TFO on server sockets.
Modifications:
Added new Integer Option TCP_FASTOPEN in EpollChannelOption.
Added getters/setters in EpollServerChannelConfig for TCP_FASTOPEN.
Added way to check if TCP_FASTOPEN is supported on server in Native.
Added setting on socket opt TCP_FASTOPEN if value is set on channel options in doBind in EpollServerSocketChannel.
Enhanced EpollSocketTestPermutation to contain a permutation for server socket containing fast open.
Result:
Users of native-epoll can set TCP_FASTOPEN on server sockets and thus leverage fast connect features of RFC7413 if client is capable of it.
Conflicts:
transport-native-epoll/src/main/java/io/netty/channel/epoll/EpollChannelOption.java
Motivation:
There are protocols (BGP, SXP), which are typically deployed with TCP
MD5 authentication to protect sessions from being hijacked/torn down by
third parties. This facility is not available on most operating systems,
but is typically present on Linux.
Modifications:
- add a new EpollChannelOption, which is write-only
- teach Epoll(Server)SocketChannel to track which addresses have keys
associated
- teach Native how to set the MD5 signature keys for a socket
Result:
Users of the native-epoll transport can set MD5 signature keys and thus
leverage RFC-2385 protection on TCP connections.
Motivation:
See #4174.
Modifications:
Modify transport-native-epoll to allow setting TCP_USER_TIMEOUT.
Result:
Hanging connections that are written into will get timeouted.
Conflicts:
transport-native-epoll/src/main/java/io/netty/channel/epoll/EpollChannelOption.java
Motivation:
In NIO and OIO we throw a ChannelException if a ChannelConfig operation fails. We should do the same with epoll to be consistent.
Modifications:
Use ChannelException
Result:
Consistent behaviour across different transport implementations.
Motivation:
The method implementions for setSoLinger(...) and setTrafficClass(...) were swapped by mistake.
Modifications:
Use the correct implementation for setSoLinger(...) and setTrafficClass(...)
Result:
Correct behaviour when setSoLinger(...) and setTrafficClass(...) are used with the epoll transport.
Motivation:
Because of java custom UTF encoding, it was previously impossible to use
nul-bytes in domain socket names, which is required for abstract domain
sockets.
Modifications:
- Pass the encoded string byte array to the native code
- Modify native code accordingly to work with nul-bytes in the the
array.
- Move the string encoding to UTF-8 in java code.
Result:
Unix domain socket addresses will work properly if they contain nul-
bytes. Address encoding for these addresses changes from UTF-8-like to
real UTF-8.
Motivation:
IP_FREEBIND allows to bind to addresses without the address up yet or even the interface configured yet.
Modifications:
Add support for IP_FREEBIND.
Result:
It's now possible to use IP_FREEBIND when using the native epoll transport.
Motivation:
Some glibc/kernel versions will trigger an EPOLLERR event to notify
about failed connect and not an EPOLLOUT. Also EPOLLERR may be triggered
when a connection is broke.
Modification:
React on EPOLLERR like if an EPOLLOUT / EPOLLIN was received, this will work in
all cases as we handle errors in EPOLLOUT / EPOLLIN anyway.
Result:
Correctly detect errors.
Motiviation:
TCP_NOTSENT_LOWAT is only supported in linux kernel 3.12 or newer. The addition of this socket option prevents older kernels from building.
Modifications:
- Conditionally define TCP_NOTSENT_LOWAT if it is not defined
Result:
Kernels older than 3.12 can still compile the EPOLL module.
Motiviation:
Linux provides the TCP_NOTSENT_LOWAT socket option. This can be used to control how much unsent data is queued in the tcp kernel buffers. This can be important when application level protocols (SPDY, HTTP/2) have their own priority mechanism and don't want data queued in the kernel.
Modifications:
- The epoll module will have an additional socket option TCP_NOTSENT_LOWAT
- There will be JNI methods to control the underlying linux socket option mechanism
Result:
Linux EPOLL module exposes the TCP_NOTSENT_LOWAT socket option.
Motivation:
the JNI function ThrowNew won't release any allocated memory.
The method exceptionMessage is allocating a new string concatenating 2 constant strings
What is creating a small leak in case of these exceptions are happening.
Modifications:
Added new methods that will use exceptionMessage and free resources accordingly.
I am also removing the inline definition on these methods as they could be reused by
other added modules (e.g. libaio which should be coming soon)
Result:
No more leaks in case of failures.
Motivation:
When trying to write more then Integer.MAX_VALUE / SSIZE_MAX via writev(...) the OS may return EINVAL depending on the kernel or the actual OS (bsd / osx always return EINVAL). This will trigger an IOException.
Modifications:
Never try to write more then Integer.MAX_VALUE / SSIZE_MAX when using writev.
Result:
No more IOException when write more data then Integer.MAX_VALUE / SSIZE_MAX via writev.
Motivation:
When using epoll_ctl we should respect the return value and do the right thing depending on it.
Modifications:
Adjust java and native code to respect epoll_ctl return values.
Result:
Correct and cleaner code.
Motivation:
Linux supports splice(...) to transfer data from one filedescriptor to another without
pass data through the user-space. This allows to write high-performant proxy code or to stream
stuff from the socket directly the the filesystem.
Modification:
Add AbstractEpollStreamChannel.spliceTo(...) method to support splice(...) system call
Result:
Splice is now supported when using the native linux transport.
Conflicts:
transport-native-epoll/src/main/java/io/netty/channel/epoll/AbstractEpollStreamChannel.java