Commit Graph

25 Commits

Author SHA1 Message Date
Nick Hill
170e4deee6 Fix event loop shutdown timing fragility (#9616)
Motivation

The current event loop shutdown logic is quite fragile and in the
epoll/NIO cases relies on the default 1 second wait/select timeout that
applies when there are no scheduled tasks. Without this default timeout
the shutdown would hang indefinitely.

The timeout only takes effect in this case because queued scheduled
tasks are first cancelled in
SingleThreadEventExecutor#confirmShutdown(), but I _think_ even this
isn't robust, since the main task queue is subsequently serviced which
could result in some new scheduled task being queued with much later
deadline.

It also means shutdowns are unnecessarily delayed by up to 1 second.

Modifications

- Add/extend unit tests to expose the issue
- Adjust SingleThreadEventExecutor shutdown and confirmShutdown methods
to explicitly add no-op tasks to the taskQueue so that the subsequent
event loop iteration doesn't enter blocking wait (as looks like was
originally intended)

Results

Faster and more robust shutdown of event loops, allows removal of the
default wait timeout
2019-10-07 11:06:01 +04:00
Norman Maurer
7a547aab65
Correctly handle IPV6-mapped-IPV4 addresses in native code when receiving datagrams (#9560)
Motivation:

291f80733a introduced a change to use a byte[] to construct the InetAddress when receiving datagram messages to reduce the overhead. Unfortunally it introduced a regression when handling IPv6-mapped-IPv4 addresses and so produced an IndexOutOfBoundsException when trying to fill the byte[] in native code.

Modifications:

- Correctly use the offset on the pointer of the address.
- Add testcase
- Make tests more robust and include more details when the test fails

Result:

No more IndexOutOfBoundsException
2019-09-11 20:30:28 +02:00
jimin
9621a5b981 remove unused imports (#9287)
Motivation:

Some imports are not used

Modification:

remove unused imports

Result:

Code cleanup
2019-06-26 21:08:31 +02:00
EliyahuStern
6f602cbd14 Resolve the pid field in PeerCredentials of KQueueDomainSocketChannels. (#9219)
Motivation:

This resolves a TODO from the initial transport-native-kqueue implementation, supplying the user with the pid of the local peer client/server process.

Modification:

Inside netty_kqueue_bsdsocket_getPeerCredentials, Call getsockopt with LOCAL_PEERPID and pass it to PeerCredentials constructor.
Add a test case in KQueueSocketTest.

Result:

PeerCredentials now have pid field set. Fixes https://github.com/netty/netty/issues/9213
2019-06-04 05:15:42 -07:00
Julien Viet
e348bd9217 KQueueEventLoop | EpollEventLoop may incorrectly update registration when FD is reused.
Motivation:

The current KQueueEventLoop implementation does not process concurrent domain socket channel registration/unregistration in the order they actual
happen since unregistration are delated by an event loop task scheduling. When a domain socket is closed, it's file descriptor might be reused
quickly and therefore trigger a new channel registration using the same descriptor.

Consequently the KQueueEventLoop#add(AbstractKQueueChannel) method will overwrite the current inactive channels having the same descriptor
and the delayed KQueueEventLoop#remove(AbstractKQueueChannel) will remove the active channel that replaced the inactive one.

As active channels are registered, events for this file descriptor won't be processed anymore and the channels will never be closed.

The same problem can also happen in EpollEventLoop. Beside this we also may never remove the AbstractEpollChannel from the internal map
when it is unregistered which will prevent it from be GC'ed

Modifications:

- Change logic of native KQueue and Epoll implementations to ensure we correctly handle the case of FD reuse
- Only try to update kevent / epoll if the Channel is still open (as otherwise it will be handled by kqueue / epoll itself)
- Correctly remove AbstractEpollChannel from internal map in all cases
- Make implementation of closeAll() consistent for Epoll and KQueueEventLoop

Result:

KQueue and Epoll native transports correctly handle FD reuse

Co-authored-by: Norman Maurer <norman_maurer@apple.com>
2019-05-22 09:23:09 +02:00
Norman Maurer
71c184076c Revert "KQueueEventLoop won't unregister active channels reusing a file descriptor (#9114)"
This reverts commit 909a3d942e.
2019-05-07 16:44:41 +02:00
Julien Viet
909a3d942e KQueueEventLoop won't unregister active channels reusing a file descriptor (#9114)
Motivation:

The current KQueueEventLoop implementation does not process concurrent domain socket channel registration/unregistration in the order they actual
happen since unregistration are delated by an event loop task scheduling. When a domain socket is closed, it's file descriptor might be reused
quickly and therefore trigger a new channel registration using the same descriptor.

Consequently the KQueueEventLoop#add(AbstractKQueueChannel) method will overwrite the current inactive channels having the same descriptor
and the delayed KQueueEventLoop#remove(AbstractKQueueChannel) will remove the active channel that replaced the inactive one.

As active channels are registered, events for this file descriptor won't be processed anymore and the channels will never be closed.

Modifications:

Change the logic of KQueueEventLoop#remove(AbstractKQueueChannel) channels so it will check channels equality prior removal.

Result:

KQueueEventLoop won't remove anymore active channels reusing a file descriptor.
2019-05-07 10:19:42 +02:00
Norman Maurer
778ff2057e
Add IPv6 multicast test to testsuite (#9037)
Motivation:

We currently only cover ipv4 multicast in the testsuite but we should also have tests for ipv6.

Modifications:

- Add test for ipv6
- Ensure we only try to run multicast test for ipv4 / ipv6 if the loopback interface supports it.

Result:

Better test coverage
2019-04-12 12:29:08 +02:00
Vladimir Kostyukov
0a0da67f43 Introduce SingleThreadEventLoop.registeredChannels (#8428)
Motivation:

Systems depending on Netty may benefit (telemetry, alternative even loop scheduling algorithms) from knowing the number of channels assigned to each EventLoop.

Modification:

Expose the number of channels registered in the EventLoop via SingleThreadEventLoop.registeredChannels.

Result:

Fixes #8276.
2019-03-28 11:33:12 +00:00
Lunfu Zhong
e7b3195570 Support ALLOW_HALF_CLOSURE channel option on Unix domain socket. (#8932)
Motivation:

Since DomainSocketChannel is a DuplexChannel,  which be able to shutdown input or output individually on demands, but ALLOW_HALF_CLOSURE channel option has not been supported yet.

I thought this could be a missing feature of Unix domain socket, so here the PR for it.

Modifications:

1. Added allHalfClosure property both in  EpollDomainSocketChannelConfig and KQueueDomainSocketChannelConfig,
2. Enabled isAllowHalfClosure method of native channel to support domain channel config,
3. Created EpollDomainSocketShutdownOutputByPeerTest and KQueueDomainSocketShutdownOutputByPeerTest to verify the change.

Result:

ALLOW_HALF_CLOSURE channel option can be set with DomainSocketChannel, and no more warning of Unknown channel option 'ALLOW_HALF_CLOSURE'.
2019-03-19 11:24:07 +01:00
Norman Maurer
fa6a8cb09c
Support using an Executor to offload blocking / long-running tasks wh… (#8847)
Motivation:

The SSLEngine does provide a way to signal to the caller that it may need to execute a blocking / long-running task which then can be offloaded to an Executor to ensure the I/O thread is not blocked. Currently how we handle this in SslHandler is not really optimal as while we offload to the Executor we still block the I/O Thread.

Modifications:

- Correctly support offloading the task to the Executor while suspending processing of SSL in the I/O Thread
- Add new methods to SslContext to specify the Executor when creating a SslHandler
- Remove @deprecated annotations from SslHandler constructor that takes an Executor
- Adjust tests to also run with the Executor to ensure all works as expected.

Result:

Be able to offload long running tasks to an Executor when using SslHandler. Partly fixes https://github.com/netty/netty/issues/7862 and https://github.com/netty/netty/issues/7020.
2019-02-11 09:47:44 +01:00
Norman Maurer
c6a90d90a6
Add more tests to KQueue and Epoll testsuites. (#8851)
Motivation:

We missed to extend a few tests from the testsuite and so also run these with our native KQueue and Epoll transport.

Modifications:

Extend tests and so run these for our native transports as well.

Result:

More tests.
2019-02-08 20:08:34 +01:00
Scott Mitchell
12f6500a4f Epoll and Kqueue shouldn't read by default (#8024)
Motivation:
Epoll and Kqueue channels have internal state which forces
a single read operation after channel construction. This
violates the Channel#read() interface which indicates that
data shouldn't be delivered until this method is called.
The behavior is also inconsistent with the NIO transport.

Modifications:
- Epoll and Kqueue shouldn't unconditionally read upon
initialization, and instead should rely upon Channel#read()
or auto_read.

Result:
Epoll and Kqueue are more consistent with NIO.
2018-06-15 10:28:50 +02:00
Norman Maurer
d133bf06a4
Allow to schedule tasks up to Long.MAX_VALUE (#7972)
Motivation:

We should allow to schedule tasks with a delay up to Long.MAX_VALUE as we did pre 4.1.25.Final.

Modifications:

Just ensure we not overflow and put the correct max limits in place when schedule a timer. At worse we will get a wakeup to early and then schedule a new timeout.

Result:

Fixes https://github.com/netty/netty/issues/7970.
2018-05-30 11:11:42 +02:00
Norman Maurer
b47fb81799
EventLoop.schedule with big delay fails (#7402)
Motivation:

Using a very huge delay when calling schedule(...) may cause an Selector error when calling select(...) later on. We should gaurd against such a big value.

Modifications:

- Add guard against a very huge value.
- Added tests.

Result:

Fixes [#7365]
2018-04-24 11:15:20 +02:00
Scott Mitchell
ce241bd11e Epoll flush/writabilityChange deadlock
Motivation:
b215794de3 recently introduced a change in behavior where writeSpinCount provided a limit for how many write operations were attempted per flush operation. However when the write quantum was meet the selector write flag was not cleared, and the channel unsafe flush0 method has an optimization which prematurely exits if the write flag is set. This may lead to no write progress being made under the following scenario:
- flush is called, but the socket can't accept all data, we set the write flag
- the selector wakes us up because the socket is writable, we write data and use the writeSpinCount quantum
- we then schedule a flush() on the EventLoop to execute later, however it the flush0 optimization prematurely exits because the write flag is still set

In this scenario the socket is still writable so the EventLoop may never notify us that the socket is writable, and therefore we may never attempt to flush data to the OS.

Modifications:
- When the writeSpinCount quantum is exceeded we should clear the selector write flag

Result:
Fixes https://github.com/netty/netty/issues/7729
2018-02-20 11:40:58 +01:00
Scott Mitchell
33ddb83dc1
IovArray#add return value resulted in more ByteBufs being added during iteration
Motivation:
IovArray implements MessageProcessor, and the processMessage method will continue to be called during iteration until it returns true. A recent commit b215794de3 changed the return value to only return true if any component of a CompositeByteBuf was added as a result of the method call. However this results in the iteration continuing, and potentially subsequent smaller buffers maybe added, which will result in out of order writes and generally corrupts data.

Modifications:
- IovArray#add should return false so that the MessageProcessor#processMessage will stop iterating.

Result:
Native transports which use IovArray will not corrupt data during gathering writes of CompositeByteBuf objects.
2018-01-04 08:04:32 -08:00
Scott Mitchell
af2f343648
FileDescriptor writev core dump
Motivation:
FileDescriptor#writev calls JNI code, and that JNI code dereferences a NULL pointer which crashes the application. This occurs when writing a single CompositeByteBuf object with more than one component.

Modifications:
- Initialize the iovec iterator properly to avoid the core dump
- Fix the array length calculation if we aren't able to fit all the ByteBuffer objects in the iovec array

Result:
No more core dump.
2017-12-14 16:47:31 -08:00
Norman Maurer
aa8bdb5d6b Fix assertion error when closing / shutdown native channel and SO_LINGER is set.
Motivation:

When SO_LINGER is used we run doClose() on the GlobalEventExecutor by default so we need to ensure we schedule all code that needs to be run on the EventLoop on the EventLoop in doClose. Beside this there are also threading issues when calling shutdownOutput(...)

Modifications:

- Schedule removal from EventLoop to the EventLoop
- Correctly handle shutdownOutput and shutdown in respect with threading-model
- Add unit tests

Result:

Fixes [#7159].
2017-09-18 14:46:37 -07:00
Scott Mitchell
1d7c3fb7ee KQueue detect peer close without EVFILT_READ
Motivation:
The EPOLL transport uses EPOLLRDHUP to detect when the peer closes the write side of the socket. Currently KQueue is not able to mimic this behavior and the only way to detect if the peer has closed is to read. It may not always be appropriate to read for backpressure and other reasons at the application level.

Modifications:
- Support EVFILT_SOCK filter which provides notification when the peer closes the socket

Result:
KQueue transport has more consistent behavior with Epoll transport for detecting peer closure.
2017-08-18 11:00:18 -07:00
Norman Maurer
4bb89dcc54 Correctly handle connect/disconnect in EpollDatagramChannel / KQueueDatagramChannel
Motivation:

We did not correctly handle connect() and disconnect() in EpollDatagramChannel / KQueueDatagramChannel and so the behavior was different compared to NioDatagramChannel.

Modifications:

- Correct implement connect and disconnect methods
- Share connect and related code
- Add tests

Result:

EpollDatagramChannel / KQueueDatagramChannel also supports correctly connect() and disconnect() methods.
2017-08-04 09:22:53 +02:00
Norman Maurer
3cdff36821 Update tests to not use TestUtils.getFreePort() and so ensure we not try to use a port that is used by the system in the meantime.
Motivation:

We should not try to detect a free port in tests put just use 0 when bind so there is no race in which the system my bind something to the port we choosen before.

Modifications:

- Remove the usage of TestUtils.getFreePort() in the testsuite
- Remove hack to workaround bind errors which will not happen anymore now

Result:

Less flacky tests.
2017-07-20 08:25:37 +02:00
Norman Maurer
2f8fe2af01 Only try to deregister from EventLoop when the native Channel was registered before.
Motivation:

We only can call eventLoop() if we are registered on an EventLoop yet. As we just did this without checking we spammed the log with an error that was harmless.

Modifications:

Check if registered on eventLoop before try to deregister on close.

Result:

Fixes [#6770]
2017-05-24 13:19:18 +02:00
Scott Mitchell
4c6d946fba KQueueSocket#setTrafficClass exceptions
Motivation:
MacOS will throw an error when attempting to set the IP_TOS socket option if IPv6 is available, and also when getting the value for IP_TOS.

Modifications:
- Socket#setTrafficClass and Socket#getTrafficClass should try to use IPv6 first, and check if the error code indicates the protocol is not supported before trying IPv4

Result:
Fixes https://github.com/netty/netty/issues/6741.
2017-05-18 11:26:27 -07:00
Scott Mitchell
3cc4052963 New native transport for kqueue
Motivation:
We currently don't have a native transport which supports kqueue https://www.freebsd.org/cgi/man.cgi?query=kqueue&sektion=2. This can be useful for BSD systems such as MacOS to take advantage of native features, and provide feature parity with the Linux native transport.

Modifications:
- Make a new transport-native-unix-common module with all the java classes and JNI code for generic unix items. This module will build a static library for each unix platform, and included in the dynamic libraries used for JNI (e.g. transport-native-epoll, and eventually kqueue).
- Make a new transport-native-unix-common-tests module where the tests for the transport-native-unix-common module will live. This is so each unix platform can inherit from these test and ensure they pass.
- Add a new transport-native-kqueue module which uses JNI to directly interact with kqueue

Result:
JNI support for kqueue.
Fixes https://github.com/netty/netty/issues/2448
Fixes https://github.com/netty/netty/issues/4231
2017-05-03 09:53:22 -07:00