netty5

Author	SHA1	Message	Date
Norman Maurer	0a8e1aaf19	Flush task should not flush messages that were written since last flush attempt. Motivation: The flush task is currently using flush(), which has the effect of making the flush traverse the whole ChannelPipeline and also flush messages that were written since we gave up flushing. This is not really correct, as we should only continue to flush messages that were flushed at the point in time when the flush task was submitted for execution, unless the user explicitly calls flush() themselves. Modification: Call *Unsafe.flush0() via the flush task which will only continue flushing messages that were marked as flushed before. Result: More correct behaviour when the flush task is used.	2018-03-02 10:09:40 +09:00
Scott Mitchell	d2d3e6ef0c	KQueue write filter initial state (#7738) Motivation: KQueue implementations currently have inconsistent behavior with Epoll implementations with respect to asynchronous sockets and connecting. In the Epoll transport we attempt to connect, and if the connect call does not synchronously fail/succeed we set the EPOLLOUT flag, which will be triggered by the kernel if the connection attempt succeeds or an error occurs. The connect API provides no way to asynchronously communicate an error, so the Epoll implementation fires an EPOLLOUT event and puts the connect status in getsockopt(SO_ERROR). KQueue provides the same APIs but different behavior. If the EVFILT_WRITE is not enabled and the EVFILT_READ is enabled before connect is called, and there is an error, the kernel may fire the EVFILT_READ filter and provide the Connection Refused error via read(). This is even true if we set the EVFILT_WRITE filter after calling connect because connect didn't synchronously complete. After the error has been delivered via read(), a call to getsockopt(SO_ERROR) will return 0, indicating there is no error. This means we cannot rely upon the KQueue based kernel to deliver connection errors via the EVFILT_WRITE filter in the same way that the Linux kernel does with the EPOLLOUT flag. `ce241bd` introduced a change which depends upon the behavior of the EVFILT_WRITE being set and may prematurely stop writing to the OS as a result, because we assume the OS will notify us when the socket is writable. However the current workaround for the above described behavior is to initialize the EVFILT_WRITE to true for connection oriented protocols. This leads to prematurely exiting from the flush(), which may lead to deadlock. Modifications: - KQueue should check when an error is obtained from read() if the connectPromise has not yet been completed, and if not complete it with a ConnectException Result: No more deadlock in KQueue due to asynchronous connect workaround.	2018-02-20 11:01:49 -08:00
Scott Mitchell	ce241bd11e	Epoll flush/writabilityChange deadlock Motivation: `b215794de3` recently introduced a change in behavior where writeSpinCount provided a limit for how many write operations were attempted per flush operation. However when the write quantum was met the selector write flag was not cleared, and the channel unsafe flush0 method has an optimization which prematurely exits if the write flag is set. This may lead to no write progress being made under the following scenario: - flush is called, but the socket can't accept all data, we set the write flag - the selector wakes us up because the socket is writable, we write data and use the writeSpinCount quantum - we then schedule a flush() on the EventLoop to execute later, however the flush0 optimization prematurely exits because the write flag is still set In this scenario the socket is still writable so the EventLoop may never notify us that the socket is writable, and therefore we may never attempt to flush data to the OS. Modifications: - When the writeSpinCount quantum is exceeded we should clear the selector write flag Result: Fixes https://github.com/netty/netty/issues/7729	2018-02-20 11:40:58 +01:00
Scott Mitchell	33ddb83dc1	IovArray#add return value resulted in more ByteBufs being added during iteration Motivation: IovArray implements MessageProcessor, and the processMessage method will continue to be called during iteration until it returns true. A recent commit `b215794de3` changed the return value to only return true if any component of a CompositeByteBuf was added as a result of the method call. However this results in the iteration continuing, and potentially subsequent smaller buffers may be added, which will result in out of order writes and generally corrupts data. Modifications: - IovArray#add should return false so that the MessageProcessor#processMessage will stop iterating. Result: Native transports which use IovArray will not corrupt data during gathering writes of CompositeByteBuf objects.	2018-01-04 08:04:32 -08:00
Scott Mitchell	af2f343648	FileDescriptor writev core dump Motivation: FileDescriptor#writev calls JNI code, and that JNI code dereferences a NULL pointer which crashes the application. This occurs when writing a single CompositeByteBuf object with more than one component. Modifications: - Initialize the iovec iterator properly to avoid the core dump - Fix the array length calculation if we aren't able to fit all the ByteBuffer objects in the iovec array Result: No more core dump.	2017-12-14 16:47:31 -08:00
Scott Mitchell	b215794de3	Enforce writeSpinCount to limit resource consumption per socket (#7478 ) Motivation: The writeSpinCount currently loops over the same buffer, gathering write, file write, or other write operation multiple times but will continue writing until there is nothing left or the OS doesn't accept any data for that specific write. However if the OS keeps accepting writes there is no way to limit how much time we spend on a specific socket. This can lead to unfair consumption of resources dedicated to a single socket. We currently don't limit the amount of bytes we attempt to write per gathering write. If there are many more bytes pending relative to the SO_SNDBUF size we will end up building iov arrays with more elements than can be written, which results in extra iteration, conditionals, and book keeping. Modifications: - writeSpinCount should limit the number of system calls we make to write data, instead of applying to individual write operations - IovArray should support a maximum number of bytes - IovArray should support composite buffers of greater than size 1024 - We should auto-scale the amount of data that we attempt to write per gathering write operation relative to SO_SNDBUF and how much data is successfully written - The non-unsafe path should also support a maximum number of bytes, and respect the IOV_MAX limit Result: Write resource consumption can be bounded and gathering writes have a limit relative to the amount of data which can actually be accepted by the socket.	2017-12-07 16:00:52 -08:00
Norman Maurer	3f101caa4c	Do not call Java methods from within JNI init code to prevent class loading deadlocks. Motivation: We used NetUtil.isIpV4StackPreferred() when loading JNI code, which tries to load NetworkInterface in its static initializer. Unfortunately a lock on the NetworkInterface class init may already be held somewhere else, which may cause a loader deadlock. Modifications: Add a new Socket.initialize() method that will be called when initializing the library and pass everything needed to the JNI level so we do not need to call back to Java. Result: Fixes [#7458].	2017-12-06 14:34:15 +01:00
Norman Maurer	251bb1a739	Use release(...) instead of safeRelease(...) to release non-readable holders to ensure we do not mask errors. Motivation: AbstractChannel attempts to "filter" messages which are written [1]. A goal of this process is to copy from heap to direct if necessary. However implementations of this method [2][3] may translate a buffer with 0 readable bytes to EMPTY_BUFFER. This may mask a user error where an empty buffer is written but already released. Modifications: Replace safeRelease(...) with release(...) to ensure we propagate reference count issues. Result: Fixes [#7383]	2017-12-04 20:38:35 +01:00
Norman Maurer	e7f02b1dc0	Set readPending to false when EOF is detected while issuing a read Motivation: We need to set readPending to false when we detect an EOF while issuing a read, as otherwise we may not unregister from the Selector / Epoll / KQueue and so keep on receiving wakeups. The important bit is that we may even get a wakeup for a read event but will still only be able to read 0 bytes from the socket, so we need to be very careful when we clear the readPending. This can happen because we generally use edge-triggered mode for our native transports, and because of the nature of edge-triggered mode we may schedule a read event just to find out there is nothing left to read at the moment (because we completely drained the socket on the previous read). Modifications: Set readPending to false when EOF is detected. Result: Fixes [#7255].	2017-11-06 15:44:36 -08:00
Norman Maurer	bcad9dbf97	Revert "Set readPending to false when ever a read is done" This reverts commit `413c7c2cd8` as it introduced a regression when edge-triggered mode is used, which is true for our native transports by default. With `413c7c2cd8` included it was possible that we set readPending to false by mistake even if we were still interested in reading more.	2017-11-06 09:21:42 -08:00
Scott Mitchell	413c7c2cd8	Set readPending to false when ever a read is done Motivation: readPending is currently only set to false if data is delivered to the application, however this may result in duplicate events being received from the selector in the event that the socket was closed. Modifications: - We should set readPending to false before each read attempt for all transports besides NIO. - Based upon the Javadocs it is possible that NIO may have spurious wakeups [1]. In this case we should be more cautious and only set readPending to false if data was actually read. [1] https://docs.oracle.com/javase/7/docs/api/java/nio/channels/SelectionKey.html That a selection key's ready set indicates that its channel is ready for some operation category is a hint, but not a guarantee, that an operation in such a category may be performed by a thread without causing the thread to block. Result: Notification from the selector (or simulated events from kqueue/epoll ET) in the event of socket closure. Fixes https://github.com/netty/netty/issues/7255	2017-10-25 08:25:54 -07:00
Idel Pivnitskiy	50a067a8f7	Make methods 'static' where possible Motivation: Even if it's a super micro-optimization (most JVMs could optimize such cases at runtime), in theory (and according to some perf tests) it may help a bit. It also makes the code clearer and allows you to access such methods in the test scope directly, without an instance of the class. Modifications: Add the 'static' modifier to all methods where possible. Mostly in test scope. Result: Cleaner code with proper 'static' modifiers.	2017-10-21 14:59:26 +02:00
Norman Maurer	9bcf31977c	Fail the connectPromise with the correct exception if the connection is refused when using the native kqueue transport. Motivation: Due to a bug we sometimes fail the connectPromise with a ClosedChannelException when using the kqueue transport and the remote peer refuses the connection. We need to ensure we fail it with the correct exception. Modifications: Call finishConnect() before calling close() to ensure we preserve the correct exception. Result: KQueueSocketConnectionAttemptTest.testConnectionRefused will always pass on macOS.	2017-10-07 21:33:26 +02:00
Carl Mastrangelo	d3ca087f6b	Propagate all exceptions when loading native code Motivation: There are 2 motivations, the first depends on the second: Loading Netty Epoll statically stopped working in 4.1.16, due to `Native` always loading the arch specific shared object. In a static binary, there is no arch specific SO. Second, there are a ton of exceptions that can happen when loading a native library. When loading native code, Netty tries a bunch of different paths but a failure in any given one may not be fatal. Additionally: turning on debug logging is not always feasible so exceptions get silently swallowed. Modifications: * Change Epoll and Kqueue to try the static load second * Modify NativeLibraryLoader to record all the locations where exceptions occur. * Attempt to use `addSuppressed` from Java 7 if available. Alternatives Considered: An alternative would be to record log messages at each failure. If all load attempts fail, the log messages are printed as warning, else as debug. The problem with this is there is no `LogRecord` to create like in java.util.logging. Buffering the args to logger.log() at the end of the method loses the call site, and changes the order of events to be confusing. Another alternative is to teach NativeLibraryLoader about loading the SO first, and then the static version. This would consolidate the code for Epoll, Kqueue, and TCNative. I think this is the better long term option, but this PR is changing a lot already. Someone else can take a crack at it later. Results: Epoll still loads, and debugging is easier.	2017-10-04 08:45:27 +02:00
Norman Maurer	aa8bdb5d6b	Fix assertion error when closing / shutdown native channel and SO_LINGER is set. Motivation: When SO_LINGER is used we run doClose() on the GlobalEventExecutor by default so we need to ensure we schedule all code that needs to be run on the EventLoop on the EventLoop in doClose. Beside this there are also threading issues when calling shutdownOutput(...) Modifications: - Schedule removal from EventLoop to the EventLoop - Correctly handle shutdownOutput and shutdown in respect with threading-model - Add unit tests Result: Fixes [#7159].	2017-09-18 14:46:37 -07:00
Norman Maurer	0fffc844d6	Only load native transport if the running architecture matches the compiled library architecture. Motivation: We should only try to load the native artifacts if the architecture we are currently running on is the same as the one the native libraries were compiled for. Modifications: Include the architecture in the native lib name and append the current arch when trying to load it. The load will then fail if the current arch is not the same as the arch the library was compiled for. Result: Fixes [#7150].	2017-09-04 13:34:55 +02:00
Carl Mastrangelo	c891c9c13f	Include more detail why Unsafe is not available Motivation: PD and PD0 both try to find and use Unsafe. If unavailable, they try to log why and continue on. However, it is not always easy to enable this logging. Chaining exceptions together is much easier to reach, and the original exception is relevant when Unsafe is needed. Modifications: * Make PD log why PD0 could not be loaded with a trace level log * Make PD0 remember why Unsafe wasn't available * Expose unavailability cause through PD for higher level use. * Make Epoll and KQueue include the reason when failing Result: Easier debugging in hard to reconfigure environments	2017-08-29 22:02:06 +02:00
Scott Mitchell	89ecb4b4a4	AutoClose behavior may infinite loop Motivation: If AutoClose is false and there is an IOException then AbstractChannel will not close the channel but instead just fail flushed elements in the ChannelOutboundBuffer. AbstractChannel also notifies of writability changes, which may lead to an infinite loop if the peer has closed its read side of the socket because we will keep accepting more data but continuously fail because the peer isn't accepting writes. Modifications: - If the transport throws on a write we should acknowledge that the output side of the channel has been shutdown and cleanup. If the channel can't accept more data because it is full, and still healthy it is not expected to throw. However if the channel is not healthy it will throw and is not expected to accept any more writes. In this case we should shutdown the output for Channels that support this feature and otherwise just close. - Connection-less protocols like UDP can remain the same because the channel may be disconnected temporarily. - Make sure AbstractUnsafe#shutdownOutput is called because the shutdown on the socket may throw an exception. Result: More correct handling of write failure when AutoClose is false.	2017-08-25 21:01:41 -07:00
Carl Mastrangelo	7f1051b6ca	Include JNIEXPORT on exported symbols Motivation: As noticed in https://stackoverflow.com/questions/45700277/ compilation can fail if the definition of a method doesn't match the declaration. It's easy enough to add this in, and make it easy to compile. Modifications: Add JNIEXPORT to the entry points. * On Windows this adds: `__declspec(dllexport)` * On Mac this adds: `__attribute__((visibility("default")))` * On Linux (GCC 4.2+) this adds: ` __attribute__((visibility("default")))` * On other it doesn't add anything. Result: Easier compilation	2017-08-18 17:34:48 -07:00
Scott Mitchell	fe2dd973e9	Unify KQueue and Epoll wait timeout approach Motivation: KQueueEventLoop and EpollEventLoop implement different approaches to applying a timeout to their respective poll calls. Epoll attempts to ensure the desired timeout is satisfied at the java layer and at the JNI layer, but it should be sufficient to account for spurious wakeups at the JNI layer. Epoll timeout granularity is also limited to milliseconds which may be too large for some latency sensitive applications. Modifications: - Make EpollEventLoop wait method look like KQueueEventLoop - Epoll should support a finer timeout granularity via timerfd_create. We can hide most of these details behind the epollWait0 JNI call to avoid crossing additional JNI boundaries. Result: More consistent timeout approach between KQueue and Epoll.	2017-08-18 13:09:02 -07:00
Scott Mitchell	1d7c3fb7ee	KQueue detect peer close without EVFILT_READ Motivation: The EPOLL transport uses EPOLLRDHUP to detect when the peer closes the write side of the socket. Currently KQueue is not able to mimic this behavior and the only way to detect if the peer has closed is to read. It may not always be appropriate to read for backpressure and other reasons at the application level. Modifications: - Support EVFILT_SOCK filter which provides notification when the peer closes the socket Result: KQueue transport has more consistent behavior with Epoll transport for detecting peer closure.	2017-08-18 11:00:18 -07:00
Carl Mastrangelo	e4af881bdb	Do not define JNI_OnLoad when not dynamic Motivation: Due to an oversight (by myself), linking two JNI modules with duplicate symbols fails in linking. This only seems to happen some of the time (the behavior seems to be different between GCC and Clang toolchains). For instance, including both netty tcnative and netty epoll fails to link because of duplicate JNI_OnLoad symbols. Modification: Do not define the JNI_OnLoad and JNI_OnUnload symbols when compiling for static linkage, as indicated by the NETTY_BUILD_STATIC preprocessor define. They are never directly called when statically linked. Result: Able to statically compile epoll and tcnative code into a single binary.	2017-08-18 09:20:58 +02:00
Norman Maurer	19dcb15062	Use underscore in native library names for consistency. Motivation: At the moment we try to load the library using multiple names which includes names using - but also _ . We should just use _ all the time. Modifications: Replace - with _ Result: Fixes [#7069]	2017-08-15 06:02:00 +02:00
Scott Mitchell	237a4da1b7	Shutting down the outbound side of the channel should not accept future writes Motivation: Implementations of DuplexChannel delegate the shutdownOutput to the underlying transport, but do not take any action on the ChannelOutboundBuffer. In the event of a write failure due to the underlying transport failing and application may attempt to shutdown the output and allow the read side the transport to finish and detect the close. However this may result in an issue where writes are failed, this generates a writability change, we continue to write more data, and this may lead to another writability change, and this loop may continue. Shutting down the output should fail all pending writes and not allow any future writes to avoid this scenario. Modifications: - Implementations of DuplexChannel should null out the ChannelOutboundBuffer and fail all pending writes Result: More controlled sequencing for shutting down the output side of a channel.	2017-08-04 10:59:57 -07:00
Norman Maurer	4bb89dcc54	Correctly handle connect/disconnect in EpollDatagramChannel / KQueueDatagramChannel Motivation: We did not correctly handle connect() and disconnect() in EpollDatagramChannel / KQueueDatagramChannel and so the behavior was different compared to NioDatagramChannel. Modifications: - Correctly implement connect and disconnect methods - Share connect and related code - Add tests Result: EpollDatagramChannel / KQueueDatagramChannel now also correctly support the connect() and disconnect() methods.	2017-08-04 09:22:53 +02:00
Carl Mastrangelo	d8f4547f5c	Unify {Epoll,KQueue}EventLoopGroup initialization. Motivation: `Epoll.ensureAvailability()` is called multiple times, once in static initialization and in a couple of the constructors. This is redundant and confusing to read. Modifications: Move `Epoll.ensureAvailability()` call into an instance initializer and remove all other references. This ensures that every EELG checks availability, while still delaying the check until construction. This pattern is used when there are multiple ctors, as in this class. Result: Easier to read code.	2017-07-24 20:14:54 +02:00
Norman Maurer	3cdff36821	Update tests to not use TestUtils.getFreePort() and so ensure we do not try to use a port that is used by the system in the meantime. Motivation: We should not try to detect a free port in tests but just use 0 when binding, so there is no race in which the system may bind something to the port we chose before. Modifications: - Remove the usage of TestUtils.getFreePort() in the testsuite - Remove hack to workaround bind errors which will not happen anymore now Result: Less flaky tests.	2017-07-20 08:25:37 +02:00
Scott Mitchell	7cfe416182	Use unbounded queues from JCTools 2.0.2 Motivation: JCTools 2.0.2 provides an unbounded MPSC linked queue. Before we shaded JCTools we had our own unbounded MPSC linked queue and used it in various places but gave this up because there was no public equivalent available in JCTools at the time. Modifications: - Use JCTool's MPSC linked queue when no upper bound is specified Result: Fixes https://github.com/netty/netty/issues/5951	2017-07-10 12:32:15 -07:00
Norman Maurer	c318fc7cea	Remove unneeded intermediate collection while reading DatagramPackets in native transports Motivation: We used an intermediate collection to store the read DatagramPackets and only fired these through the pipeline once we were done with the reading loop. This is not needed and can also increase memory usage. Modifications: Remove intermediate collection Result: Less overhead and possibly less memory usage during read loop.	2017-07-05 18:20:05 +02:00
Scott Mitchell	1df8f2ccd1	KQueue crash due to close/cleanup sequencing Motivation: The kqueue documentation states that 'Calling close() on a file descriptor will remove any kevents that reference the descriptor.' [1], but doesn't mention if this cleanup will be done synchronously. Under some circumstances it has been observed that cleanup was not done immediately and when KQueueEventLoop attempted to access the channel associated with the event the JVM would crash, a ClassCastException, or generally undefined behavior would occur because of invalid pointer references. [1] https://www.freebsd.org/cgi/man.cgi?query=kqueue&sektion=2 Modifications: - AbstractKqueueChannel#doClose should not rely upon this assumption and instead should call doDeregister() to ensure cleanup is done synchronously. - Deleting a kevent should also set the jniSelfPtr stored in the udata of that kevent to NULL, to ensure we will not dereference it later. Result: No more kqueue crash due to close/cleanup sequencing.	2017-06-28 17:34:19 -04:00
Carl Mastrangelo	83de77fbe5	Make Native loading work better with Java 8 Motivation: Enable static linking for Java 8. These commits are the same as those introduced to netty tcnative. The goal is to allow lots of JNI libraries to be statically linked together without having conflict `JNI_OnLoad` methods. Modification: * add JNI_OnLoad suffixes to enable static linking * Add static names to the list of libraries that try to be loaded * Enable compiling with JNI 1.8 * Sort includes Result: Enable statically linked JNI code.	2017-06-23 19:42:13 +02:00
Scott Mitchell	1df722f65b	kqueue version of `7baef4fbe8`	2017-06-23 08:23:40 -07:00
louxiu	3c4dfed08a	Fix handling of ByteBuf with multiple nioBuffers in EpollDatagramChannel and KQueueDatagramChannel Motivation: 1. ByteBufs with multiple nioBuffers need special handling even when their type is not CompositeByteBuf (e.g. DuplicatedByteBuf wrapping a CompositeByteBuf) 2. EpollDatagramUnicastTest and KQueueDatagramUnicastTest passed because CompositeByteBuf is converted to DuplicatedByteBuf before being written to the channel 3. an uninitialized struct msghdr will raise an error Modifications: 1. isBufferCopyNeededForWrite (like isSingleDirectBuffer in NioDatagramChannel) checks whether a new direct buffer is needed 2. special handling of ByteBuf with multiple nioBuffers in EpollDatagramChannel, AbstractEpollStreamChannel, KQueueDatagramChannel, AbstractKQueueStreamChannel and IovArray 3. initialize struct msghdr Result: ByteBufs with multiple nioBuffers are handled correctly in EpollDatagramChannel and KQueueDatagramChannel	2017-05-26 07:56:34 +02:00
Norman Maurer	2f8fe2af01	Only try to deregister from EventLoop when the native Channel was registered before. Motivation: We can only call eventLoop() if we are already registered on an EventLoop. As we just did this without checking we spammed the log with an error that was harmless. Modifications: Check if registered on eventLoop before trying to deregister on close. Result: Fixes [#6770]	2017-05-24 13:19:18 +02:00
Norman Maurer	61b1165136	Add support to wrap an existing filedescriptor when using native kqueue transport Motivation: The native epoll transport allows wrapping an existing filedescriptor, and we should support the same in the native kqueue transport. Modifications: Add constructors that allow wrapping an existing filedescriptor. Result: Featureset of native transports more on par.	2017-05-19 19:34:47 +02:00
Norman Maurer	201d9b6536	Share code that is needed to support shaded native libraries. Motivation: For our native libraries in netty we support shading; to have this work at runtime the user needs to set a system property. This code should be shared. Modifications: Move logic to NativeLibraryLoader and so share it for all native libs. Result: Less code duplication, and it will also work for netty-tcnative out of the box once it supports shading	2017-05-19 19:33:21 +02:00
Scott Mitchell	4c6d946fba	KQueueSocket#setTrafficClass exceptions Motivation: MacOS will throw an error when attempting to set the IP_TOS socket option if IPv6 is available, and also when getting the value for IP_TOS. Modifications: - Socket#setTrafficClass and Socket#getTrafficClass should try to use IPv6 first, and check if the error code indicates the protocol is not supported before trying IPv4 Result: Fixes https://github.com/netty/netty/issues/6741.	2017-05-18 11:26:27 -07:00
Scott Mitchell	3cc4052963	New native transport for kqueue Motivation: We currently don't have a native transport which supports kqueue https://www.freebsd.org/cgi/man.cgi?query=kqueue&sektion=2. This can be useful for BSD systems such as MacOS to take advantage of native features, and provide feature parity with the Linux native transport. Modifications: - Make a new transport-native-unix-common module with all the java classes and JNI code for generic unix items. This module will build a static library for each unix platform, which is included in the dynamic libraries used for JNI (e.g. transport-native-epoll, and eventually kqueue). - Make a new transport-native-unix-common-tests module where the tests for the transport-native-unix-common module will live. This is so each unix platform can inherit from these tests and ensure they pass. - Add a new transport-native-kqueue module which uses JNI to directly interact with kqueue Result: JNI support for kqueue. Fixes https://github.com/netty/netty/issues/2448 Fixes https://github.com/netty/netty/issues/4231	2017-05-03 09:53:22 -07:00


88 Commits