Commit Graph

116 Commits

Author SHA1 Message Date
Norman Maurer 522c0bb706 benchmark 2020-09-30 10:24:27 +02:00
Norman Maurer 70b7621963
Implement batching of reading and writing when using datagram with io_uring. (#10606)
Motivation:

io_uring does not support recvmmsg / sendmmsg directly, so we need to
"emulate" them by submitting multiple IORING_OP_RECVMSG /
IORING_OP_SENDMSG calls.

Modifications:

- Allow to issue multiple write / read calls at once no matter what
  concrete AbstractIOUringChannel subclass it is
- Add support for batching recvmsg / sendmsg when using
IOUringDatagramChannel

Result:

Better performance
2020-09-29 16:58:46 +02:00
Norman Maurer d266af2778
Fix bitmasking / bitshifting code to encode / decode user data (#10617)
Motivation:

Our bitmasking / shifting did not correctly handle negative values.

Modifications:

- Change methods to allow passing user data
- fix bitmasking / bitshifting code
- Add unit test

Result:

Be able to pass in negative values as well
2020-09-29 13:09:33 +02:00
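The sign-extension problem described in d266af2778 can be illustrated with a small sketch. The field layout (32-bit fd, 8-bit op, 16-bit data) and the names UserDataSketch, encode and decodeFd / decodeOp / decodeData are assumptions made for illustration and are not taken from the commit. The point is only that a negative int or byte sign-extends when widened to long, so each field has to be masked before it is OR-ed into the 64-bit user data.

```java
final class UserDataSketch {
    private UserDataSketch() { }

    // Pack three fields into one 64-bit value (hypothetical layout):
    // bits 0-31 fd, bits 32-39 op, bits 48-63 data.
    static long encode(int fd, byte op, short data) {
        // Without the masks a negative fd or op would sign-extend and
        // overwrite the bits that belong to the other fields.
        return ((long) data << 48) | ((op & 0xFFL) << 32) | (fd & 0xFFFFFFFFL);
    }

    static int decodeFd(long udata) {
        return (int) udata;            // low 32 bits
    }

    static byte decodeOp(long udata) {
        return (byte) (udata >>> 32);  // unsigned shift, then truncate to 8 bits
    }

    static short decodeData(long udata) {
        return (short) (udata >>> 48); // top 16 bits
    }
}
```

With this layout, negative values for any of the three fields survive a round trip through encode and the matching decode method.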
Norman Maurer ca8c4538c1
Ensure we can compile io_uring transport on systems that have linux kernel < 5.1 (#10619)
Motivation:

While we need to have a very recent kernel to run the io_uring transport itself we should allow to compile it with earlier versions to help with our build story.

Modifications:

- Ensure we can compile on "older systems"
- Just enable the profile when we build on linux

Result:

Less complicated to build io_uring transport
2020-09-29 13:09:04 +02:00
Norman Maurer 0c823e7527
Add support for probing and so be able to detect if we can support (#10618)
io_uring on the running system.

Motivation:

We should make use of the provided support of probing in io_uring. This
can help us to test if we can use the io_uring based transport on the
running system or not. Besides this, it also allows us to compile on Linux
systems which don't support all of the required io_uring ops.

Modifications:

- Add native call for probing
- make use of probing when trying to init the native library and fail if
probing fails

Result:

Better detection if the system supports all needed io_uring features or
not
2020-09-28 17:42:15 +02:00
Nick Hill f6c84541be
Close IOUringEventLoop wakeup/eventfd close shutdown race (#10615)
Motivation

There is a race condition when shutting down the event loop where the
eventFd write performed in the wakeup() method may actually hit a
different fd if it's closed and reassigned in the meantime.

This was already encountered and addressed in the epoll case.

Modifications

Similar to what's done for epoll, in IOUringEventLoop:
- Reinstate pendingWakeup flag which tracks when there is a wakeup
pending (CAS of nextWakeupNanos performed by other thread in the
wakeup() method)
- Add logic to the cleanup() method to wait for corresponding READ CQE
before closing the eventFd
- Remove unused fields from IOUringCompletionQueue (cleanup)

Result

No event loop shutdown race
2020-09-28 08:47:23 +02:00
Norman Maurer db73538737
Include header files to allow building on older machines (#10609)
Motivation:

liburing ships its own io_uring headers so it's possible to build on older
machines as well. We should do the same.

Modifications:

- Include header files and so not depend on kernel version
- Fix license files and header for attribution

Result:

Be able to build easier
2020-09-25 08:36:39 +02:00
Norman Maurer 09a0b78a81
Add IOUringDatagramChannel and so also support UDP (#10588)
Motivation:

We can also support UDP / Datagram based on io_uring, so we should do it
for maximal performance

Modifications:

- Add IOUringDatagramChannel
- Add tests based on our transport testsuite for it

Result:

UDP / Datagram is supported via io_uring as well now
2020-09-23 11:21:06 +02:00
Josef Grieb 0421c9c751
Some cleanup to make it more readable (#10586)
Motivation:

some cleanup to make it more readable

Modifications:

- fix typos
- remove cqe kRingMask
- remove unused pendingWakeup

Result:

cleaner code
2020-09-23 09:55:51 +02:00
Norman Maurer 11ef990f05
Cleanup the IOUringCompletionQueue and add some javadocs (#10594)
Motivation:

IOUringCompletionQueue used 2 spaces but we use 4 spaces in netty.
Besides this, there were no javadocs.

Modifications:

- Use 4 spaces
- Add javadocs
- remove public from method signature

Result:

Code cleanup
2020-09-22 13:26:38 +02:00
Norman Maurer ce41aa1e66
Decouple rwflags from user data (#10597)
Motivation:

We always encoded the rwflags into user data which only makes sense for
POLL* atm. We should decouple this and so allow to store other things
into the user data for other ops.

Modifications:

Allow explicitly defining what to store in the user data and so be more
flexible.

Result:

More flexible usage
2020-09-22 11:01:52 +02:00
Norman Maurer 8ca81a2563
Only create ConnectTimeoutException if really needed (#10596)
Motivation:

Creating exceptions is expensive so we should only do so if really needed. This is basically a port of #10595 for io_uring.

Modifications:

Only create the ConnectTimeoutException if we really need it.

Result:

Less overhead
2020-09-22 08:58:14 +02:00
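The idea in 8ca81a2563 (allocate the exception only when it will actually be used) can be sketched as follows. This is a minimal illustration under assumptions: it uses java.util.concurrent.CompletableFuture and java.net.ConnectException as stand-ins for the connect promise and ConnectTimeoutException, and the names LazyTimeoutSketch, onConnectTimeout and created are hypothetical.

```java
import java.net.ConnectException;
import java.util.concurrent.CompletableFuture;

final class LazyTimeoutSketch {
    // Counts how many exceptions were allocated (for demonstration only).
    static int created;

    private LazyTimeoutSketch() { }

    // Called when the connect timeout fires. Returns true if the promise
    // was failed by this call.
    static boolean onConnectTimeout(CompletableFuture<Void> connectPromise, String remote) {
        if (connectPromise.isDone()) {
            // The connect already finished: skip the allocation and the
            // stack trace capture entirely.
            return false;
        }
        created++;
        return connectPromise.completeExceptionally(
                new ConnectException("connection timed out: " + remote));
    }
}
```

The check is cheap, while constructing a Throwable captures a stack trace, so skipping it on the common "already connected" path avoids that cost.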
Norman Maurer 0751becf03
Make reading and writing of sockaddr_in / sockaddr_in6 more robust (#10591)
Motivation:

While the current code works just fine we should better lookup the
offsets of the various struct members on init and use these. This way
we are sure the code is portable and correct.

Modifications:

Lookup various offsets on init and then use the offsets when reading and
writing to / from the structs

Result:

More robust and portable code
2020-09-19 20:40:49 +02:00
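Writing a sockaddr_in through explicit member offsets, as described in 0751becf03, might look roughly like the sketch below. The offsets are hard-coded here to the usual Linux layout purely for illustration; in the commit they are looked up on init. The class and method names (SockaddrInSketch, write, readPort) are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class SockaddrInSketch {
    // Assumed values for illustration; the real code obtains them natively.
    static final int SIZEOF_SOCKADDR_IN = 16;
    static final int OFFSET_SIN_FAMILY = 0;
    static final int OFFSET_SIN_PORT = 2;
    static final int OFFSET_SIN_ADDR = 4;
    static final short AF_INET = 2; // value on Linux

    private SockaddrInSketch() { }

    static void write(ByteBuffer mem, byte[] ipv4, int port) {
        // sin_family is stored in host byte order.
        mem.order(ByteOrder.nativeOrder());
        mem.putShort(OFFSET_SIN_FAMILY, AF_INET);
        // sin_port is stored in network byte order (big-endian), so the
        // two bytes are written explicitly.
        mem.put(OFFSET_SIN_PORT, (byte) (port >>> 8));
        mem.put(OFFSET_SIN_PORT + 1, (byte) port);
        // sin_addr: the four address bytes in network order.
        for (int i = 0; i < 4; i++) {
            mem.put(OFFSET_SIN_ADDR + i, ipv4[i]);
        }
    }

    static int readPort(ByteBuffer mem) {
        return ((mem.get(OFFSET_SIN_PORT) & 0xFF) << 8)
                | (mem.get(OFFSET_SIN_PORT + 1) & 0xFF);
    }
}
```

Because every access goes through a named offset, a platform with a different struct layout would only need different offset values, not different read / write code.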
Norman Maurer b5d2f53aa0
There is currently no support for sendfile when using io_uring, remove (#10589)
it

Motivation:

sendfile is not supported with io_uring atm. We should remove it.

Modifications:

Remove sendfile

Result:

Less code to maintain
2020-09-19 11:18:28 +02:00
Norman Maurer a1b36d43c5 Directly write / read sockaddr_in and sockaddr_in6 from direct memory (#10585)
Motivation:

We want to keep the amount of JNI as small as possible to reduce the
performance overhead now that we eliminated the overhead of the need of
it for syscalls.

Modifications:

Write / read sockaddr_in / sockaddr_in6 via PlatformDependent and so
eliminate the need for JNI

Result:

Less JNI and so less overhead for crossing the border.
2020-09-18 16:30:37 +02:00
Norman Maurer 7bc4521a99 Cleanup code (#10581)
Motivation:

Just some cleanup needed in general

Modifications:

- Make methods package-private when the class is package-private
- Use spaces and not tabs everywhere
- Fix eventfd_write usage as the implementation was only needed like
this for EPOLL when used with edge-triggered
- Correctly handle EINTR

Result:

Cleaner code
2020-09-18 16:30:18 +02:00
Norman Maurer d2219f089e
Allow to configure if IOSQE_ASYNC is used per EventLoopGroup (#10576)
Motivation:

There may be situations where the user doesn't want to use IOSQE_ASYNC, so we
should allow configuring this

Modifications:

Make it configurable if IOSQE_ASYNC should be used

Result:

More flexible configuration
2020-09-15 16:47:20 +02:00
Norman Maurer 0ef8fa47e5
Simplify JNI init code (#10571)
Motivation:

Using classes which are not provided by the JDK itself in JNI is
problematic when shading may be used by customers of the library. Also
it often makes the maintenance of the code more complicated.

Modifications:

- Only use classes which are provided by the JDK in the JNI code
- Cleanup

Result:

Easier to maintain code
2020-09-15 10:41:06 +02:00
Norman Maurer d991105276
Obtain remoteAddress as part of accept (#10570)
Motivation:

io_uring supports the same way of obtaining the remoteAddress as
accept4(...) does. We should use it

Modifications:

Obtain the remoteAddress of the accepted socket as part of the accept
operation

Result:

Ensure we always see the correct remoteAddress when accepting sockets
2020-09-15 08:48:11 +02:00
Josef Grieb 8cce4d273c
Access the field directly from JNI (#10568)
Motivation:

Calling methods from JNI is more expensive than accessing fields, and it is cleaner not to use the getter methods

Modifications:

-delete getter methods
-access these fields directly

Result:

it's more efficient
2020-09-13 20:03:08 +02:00
Nick Hill 907a71c930
Further reduce io_uring syscalls (#10542)
Motivation

IOUringEventLoop can be streamlined to further reduce io_uring_enter
calls

Modification

- Don't prepare to block-wait until all available work is exhausted
- Combine submission with GETEVENTS

Result

Hopefully faster
2020-09-13 15:46:39 +02:00
Norman Maurer f674d15865
Drain all of the input when a POLLRDHUP was received to ensure we do not (#10566)
see any stales

Motivation:

When a POLLRDHUP was received we need to continue draining the input
until EOF is detected as otherwise we may see stales when the user never
tries to read again.

Modifications:

- Correctly handle reading when POLLRDHUP was seen
- Remove @Ignore from testcases related to POLLRDHUP handling

Result:

Correctly drain input when POLLRDHUP was received in all cases
2020-09-11 13:09:36 +02:00
Norman Maurer 8e86bf60a8
Cleanup IOUringEventLoopGroup construction and allow to specify the ring (#10563)
Motivation:

We should allow to specify the ringsize when constructing the
IOUringEventLoopGroup and also be consistent with the rest of the
EventLoopGroup implementations

Modifications:

- Cleanup constructors
- Make ringSize configurable

Result:

Cleaner code and more flexible in terms of configuration
2020-09-11 08:34:57 +02:00
Josef Grieb d990b99a6b
Added error handling for io_uring creation failure (#10561)
Motivation:

we should throw a jvm runtime exception for io_uring creation failure to avoid a NullPointerException

Modifications:

-error handling for creation ring fd and mmap io_uring ring buffer
-some cleanups

Result:

better error handling
2020-09-10 16:25:17 +02:00
Nick Hill 90674b4fce
Simplify SQE handling (#10544)
Motivation

SQE handling can be simplified in terms of code and operations
performed

Modifications

- Zero SQE array up front - no need to set never-used fields each time
- Fill SQ array up front with corresponding indices - no need to set
each time since they are 1-1 with SQE array entries
- Keep local head and tail vars and don't track separate sqe array
head/tail
- Allocate memory for timespec directly (no need for ByteBuffer)
- Avoid some unnecessary casts / type conversions (no need to convert
uints to longs)

Result

Fewer operations, less code
2020-09-10 13:25:34 +02:00
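The bookkeeping described in 90674b4fce (an index array filled once up front, plus local head and tail counters) can be sketched with a plain Java array standing in for the mmapped ring. This is an illustration under assumptions, not the commit's code; SubmissionRingSketch, nextSlot, consume and pending are hypothetical names.

```java
final class SubmissionRingSketch {
    private final int[] sqArray; // filled once with 0..n-1, 1-1 with SQE slots
    private final int mask;
    private int head;            // advanced as entries are consumed
    private int tail;            // advanced as entries are produced

    SubmissionRingSketch(int entries) {
        if (Integer.bitCount(entries) != 1) {
            throw new IllegalArgumentException("entries must be a power of two");
        }
        sqArray = new int[entries];
        for (int i = 0; i < entries; i++) {
            sqArray[i] = i;      // set up front, never touched again
        }
        mask = entries - 1;
    }

    // Returns the SQE slot to fill next, or -1 if the ring is full.
    int nextSlot() {
        if (tail - head == sqArray.length) {
            return -1;
        }
        return sqArray[tail++ & mask];
    }

    // Simulates the consumer side taking up to n entries off the ring.
    void consume(int n) {
        head += Math.min(n, tail - head);
    }

    int pending() {
        return tail - head;
    }
}
```

Since the index array never changes after construction, producing an entry only costs a mask, an increment and the write of the SQE itself.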
Nick Hill 2316c2ce45
Exploit blocking FAST_POLL for eventfd reads (#10543)
Motivation

If we make eventfd blocking then read can take the place of poll+read

Modification

Make eventfd blocking, use READ instead of POLLIN, allocating a static
64bit buffer to read into

Result

Fewer kernel roundtrips for event loop wakeups
2020-09-10 07:37:39 +02:00
Norman Maurer d933a9dd56
Move IovArray handling code in an extra class to better separate it and (#10559)
easier to test.

Motivation:

We should move the IovArray related code to an extra class so it's easier
to test

Modifications:

- Move into extra class
- Add dedicated test

Result:

Cleanup
2020-09-09 20:30:48 +02:00
Norman Maurer dd63d1c8d0
Allow to specify a callback that is executed once submit was called and (#10555)
use it for clearing the IovArrays

Motivation:

IOUringSubmissionQueue may call submit() internally when there is no
space left in the buffer. Once this is done we can reuse for example
IovArrays etc. Because of this it's useful to be able to specify a
callback that is executed after submission

Modifications:

- Allow to specify a Runnable that is called once submission was
complete
- Use this callback to clear the IovArrays

Result:

IovArrays are automatically cleared on each submit call.
2020-09-09 17:18:47 +02:00
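The submit callback described in dd63d1c8d0 can be sketched as a queue that runs a Runnable after every submission, including the implicit one triggered when the queue is full. The class below is a simplified stand-in written for illustration: SubmitCallbackSketch, add, submit and submitCalls are hypothetical names and no real io_uring call is made.

```java
final class SubmitCallbackSketch {
    private final int capacity;
    private final Runnable afterSubmit;
    private int queued;
    int submitCalls; // exposed for demonstration

    SubmitCallbackSketch(int capacity, Runnable afterSubmit) {
        this.capacity = capacity;
        this.afterSubmit = afterSubmit;
    }

    // Queue one operation; submits first if there is no space left.
    void add() {
        if (queued == capacity) {
            submit();
        }
        queued++;
    }

    // Hand everything queued so far over, then notify the callback so
    // that per-submission resources (for example iovec arrays) can be reused.
    void submit() {
        if (queued == 0) {
            return;
        }
        queued = 0;
        submitCalls++;
        afterSubmit.run();
    }
}
```

The caller does not need to know whether a submission happened explicitly or as a side effect of add(); the callback runs in both cases.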
Norman Maurer 044ec159b9
Only schedule another read if the FD is still open (#10551)
Motivation:

We should only keep on reading if the fd is still open, otherwise we
will produce a confusing exception

Modifications:

Check if the fd is still open before scheduling the read.

Result:

Don't produce a confusing exception when the fd was closed during a read
loop.
2020-09-09 11:43:46 +02:00
Norman Maurer 8ef5dbc24b
Only execute the close once the already added write operations completes (#10538)
Motivation:

We need to be careful that we only execute the close(...) once the write
operation completes as otherwise we may close the underlying socket too
fast and also the writes

Modifications:

Keep track of if we need to delay the close or not and if so execute it
once the write completes

Result:

No more test failures
2020-09-09 11:42:37 +02:00
Norman Maurer 5ee1f2c7ec
Handle when io_uring_enter(...) fails with EINTR (#10540)
Motivation:

It is possible that io_uring_enter(...) fails with EINTR. In this case
we should just retry the operation

Modifications:

Retry when EINTR was detected

Result:

More correct use of io_uring_enter(...)
2020-09-09 10:05:14 +02:00
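The retry described in 5ee1f2c7ec follows the usual pattern for interrupted system calls. The sketch below models the native call as an IntSupplier that returns a negative errno on failure; that convention and the names EintrRetrySketch and retryOnEintr are assumptions for illustration. EINTR is 4 on Linux.

```java
import java.util.function.IntSupplier;

final class EintrRetrySketch {
    static final int EINTR = 4; // errno value on Linux

    private EintrRetrySketch() { }

    // Repeats the call for as long as it reports that it was interrupted
    // by a signal; any other result (success or a different error) is
    // returned to the caller unchanged.
    static int retryOnEintr(IntSupplier call) {
        int res;
        do {
            res = call.getAsInt();
        } while (res == -EINTR);
        return res;
    }
}
```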
Norman Maurer 5bd6611c0e
Explicit need to specify -Piouring-native to compile the native bits … (#10546)
Motivation:

At the moment our CI cannot build and run the native bits for the iouring transport, so we should just not compile them for now. The Java classes themselves should still be compiled, though.

Modifications:

Add explicit profile to compile native bits of iouring

Result:

CI passes with iouring transport
2020-09-09 09:50:36 +02:00
Norman Maurer 7a34f1e6c5
Fix AssertionError caused by incorrect for loop (#10554)
Motivation:

Due to an incorrect for loop we could end up with an AssertionError (this is
sometimes triggered during testsuite run)

Modifications:

Fix for loop that calls IovArray.clear()

Result:

No more AssertionError
2020-09-09 09:18:29 +02:00
Norman Maurer f6474e66de
Use multiple IovArray for writev when using io_uring based transport (#10549)
Motivation:

How we did manage the memory of writev was quite wasteful and could
produce a lot of memory overhead. We can just keep it simple by using
an array of IovArrays. Once these are full we can just submit and clear them, as at this
point the kernel has taken a copy and it's safe to reuse them

Modifications:

Use an array of IovArrays and submit once it is full.

Result:

Less memory overhead and less code duplication
2020-09-08 21:23:38 +02:00
Norman Maurer 47bfcd2e80
Remove workaround for previous io_uring bug related to IOSQE_ASYNC and (#10547)
IORING_OP_WRITEV

Motivation:

The bug related to IOSQE_ASYNC and IORING_OP_WRITEV was fixed so no need
to have the workaround

Modifications:

Remove workaround

Result:

Use IOSQE_ASYNC all the time
2020-09-08 10:50:41 +02:00
Norman Maurer 9da59c3894
Fix reentrancy bug in io_uring transport implementation related to (#10541)
writes

Motivation:

We need to carefully manage state in terms of writing to guard against
reentrancy problems that could lead to corrupt state in the
ChannelOutboundBuffer

Modifications:

Only reset the flag once removeBytes(...) was called

Result:

No more reentrancy bug related to writes.
2020-09-08 08:43:46 +02:00
Norman Maurer 9b296c8034
Update README to reflect kernel requirements for iouring transport (#10539)
Motivation:

kernel 5.9-rc4 was released that ships all fixes we need

Modifications:

Update readme

Result:

Make it clear what kernel is needed
2020-09-07 12:05:49 +02:00
Norman Maurer ddb503f76d Fix checkstyle errors 2020-09-07 10:29:41 +02:00
Josef Grieb 8c465e2f1b
Merge pull request #35 from normanmaurer/jni_constants
Lookup constants via JNI
2020-09-05 11:10:08 +02:00
Norman Maurer ccd5a6e411 Add workaround for current kernel bug related to WRITEV and IOSQE_ASYNC
Motivation:

There is currently a bug in the kernel that lets WRITEV sometimes fail
when IOSQE_ASYNC is enabled

Modifications:

Don't use IOSQE_ASYNC for WRITEV for now

Result:

Tests pass
2020-09-05 10:22:02 +02:00
Norman Maurer dfca811648 Lookup constants via JNI
Motivation:

We should use JNI to look up constants so we are sure we do not mess
things up

Modifications:

Use JNI calls to lookup constants once

Result:

Less error prone code
2020-09-05 09:40:02 +02:00
Norman Maurer 1c42a37f67 Use IOSQE_ASYNC flag when submitting
Motivation:

At least in the throughput benchmarks it has shown that IOSQE_ASYNC
gives a lot of performance improvements. Let's enable it by default for
now and maybe make it configurable in the future

Modifications:

Use IOSQE_ASYNC

Result:

Better performance
2020-09-04 20:22:28 +02:00
Norman Maurer 6545d80d23 Submit IO in batches to reduce overhead
Motivation:

We should submit multiple IO ops at once to reduce the syscall overhead.

Modifications:

- Submit multiple IO ops in batches
- Adjust default ringsize

Result:

Much better performance
2020-09-04 17:09:46 +02:00
Josef Grieb 9e13c5cfd9
Merge pull request #30 from normanmaurer/handle_complete_cleanup
Call handle.readComplete() before fireChannelReadComplete() and also c…
2020-09-04 06:43:20 +02:00
Norman Maurer 0631824dcd Call handle.readComplete() before fireChannelReadComplete() and also cleanup some code 2020-09-03 18:40:44 +02:00
Norman Maurer 3b35976559 Fix bug related to reset the RecvByteBufAllocator.Handle on each read
Motivation:

We should only reset the RecvByteBufAllocator.Handle when a new "read
loop" starts. Otherwise the handle will not be able to correctly limit
reads.

Modifications:

- Move reset(...) call into pollIn(...)
- Remove all @Ignore

Result:

The whole testsuite passes
2020-09-03 16:14:24 +02:00
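Why the reset in 3b35976559 has to happen once per read loop rather than once per read can be shown with a toy model. The Handle class below is a heavily simplified stand-in for RecvByteBufAllocator.Handle that only counts messages; ReadLoopSketch, readLoop and the resetPerRead switch are hypothetical and exist only to contrast the two behaviours.

```java
final class ReadLoopSketch {
    // Toy handle: limits how many messages one read loop may consume.
    static final class Handle {
        private final int maxMessagesPerRead;
        private int messagesRead;

        Handle(int maxMessagesPerRead) {
            this.maxMessagesPerRead = maxMessagesPerRead;
        }

        void reset() {
            messagesRead = 0;
        }

        void incMessagesRead() {
            messagesRead++;
        }

        boolean continueReading() {
            return messagesRead < maxMessagesPerRead;
        }
    }

    private ReadLoopSketch() { }

    // Returns how many reads one loop performed. With resetPerRead set,
    // the counter is cleared before every read, so the limit never applies.
    static int readLoop(Handle handle, int available, boolean resetPerRead) {
        handle.reset(); // correct place: once, when the read loop starts
        int reads = 0;
        while (available > 0 && handle.continueReading()) {
            if (resetPerRead) {
                handle.reset(); // models the bug
            }
            handle.incMessagesRead();
            available--;
            reads++;
        }
        return reads;
    }
}
```

With a limit of 4 and 10 messages available, the loop stops after 4 reads when the handle is reset once, and drains all 10 when it is reset before each read.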
Norman Maurer 1440b4fa0c Add more missing tests 2020-09-03 14:51:54 +02:00
Josef Grieb a55313ed75
Merge pull request #27 from normanmaurer/close_group
Call IOUringEventLoopGroup.shutdownGracefully() after test is done.
2020-09-03 10:15:12 +02:00
Norman Maurer e1c6f111f5 Call IOUringEventLoopGroup.shutdownGracefully() after test is done. 2020-09-03 09:47:36 +02:00
Norman Maurer f57fcd6c4a Do support SO_BACKLOG in io_uring
Motivation:

Due to a bug SO_BACKLOG was not supported via ChannelOption when using io_uring. Be

Modification:

- Add SO_BACKLOG to the supported ChannelOptions.
- Merge IOUringServerChannelConfig into IOUringServerSocketChannelConfig

Result:

SO_BACKLOG is supported
2020-09-03 09:40:55 +02:00