Motivation:
`ThreadLocalRandom` doesn't cause contention. Also `nextInt()` generates only 4 random bytes while `Math.random()` generates 8 bytes.
Modification:
Replaced `(int) Math.random()` with `PlatformDependent.threadLocalRandom().nextInt()`
Result:
No possible contention when random numbers for WebSockets.
Motivation:
We are increasingly running in environments where Unsafe, setAccessible, etc. are not available.
When debug logging is enabled, we log a complete stack trace every time one of these initialisations fail.
Seeing these stack traces can cause people unnecessary concern.
For instance, people might have alerts that are triggered by a stack trace showing up in logs, regardless of its log level.
Modification:
We continue to print debug log messages on the result of our initialisations, but now we only include the full stack trace is _trace_ logging (or FINEST, or equivalent in whatever logging framework is configured) is enabled.
Result:
We now only log these initialisation stack traces when the lowest possible log level is enabled.
Fixes#7817
Motivation:
SslHandler invokes channel.read() during the handshake process. For some
channel implementations (e.g. LocalChannel) this may result in re-entry
conditions into unwrap. Unwrap currently defers updating the input
buffer indexes until the unwrap method returns to avoid intermediate
updates if not necessary, but this may result in unwrapping the same
contents multiple times which leads to handshake failures [1][2].
[1] ssl3_get_record:decryption failed or bad record mac
[2] ssl3_read_bytes:sslv3 alert bad record mac
Modifications:
- SslHandler#unwrap updates buffer indexes on each iteration so that if
reentry scenario happens the correct indexes will be visible.
Result:
Fixes https://github.com/netty/netty/issues/11146
Motivation:
When Memory based Factory is used, if the first chunk starts with Line Break, the HttpData
is not filled with the current available buffer if the delimiter is not found yet, while it may
add some.
Fix JavaDoc to note potential wrong usage of content() or getByteBuf() if HttpDatais has
a huge content with the risk of Out Of Memory Exception.
Fix JavaDoc to explain how to release properly the Factory, whatever it is in Memory,
Disk or Mixed mode.
Fix issue #11143
Modifications:
First, when the delimiter is not found, instead of searching Line Break from readerIndex(), we should search
from readerIndex() + readableBytes() - delimiter size, since this is the only part where usefull
Line Break could be searched for, except if readableBytes is less than delimiter size (then we search from
readerIndex).
Second, when a Memory HttpData is created, it should be assigned an empty buffer to be
consistent with the other implementations (Disk or Mixed mode).
We cannot change the default behavior of the content() or getByteBuf() of the Memory based HttpData
since the ByteBuf is supposed to be null when released, but not empty.
When a new ByteBuf is added, one more check verifies if the current ByteBuf is empty, and if so, it
is released and replaced by the new one, without creating a new CompositeByteBuf.
Result:
In the tests testBIgFileUploadDelimiterInMiddleChunkDecoderMemoryFactory and related for other modes,
the buffers are starting with a CRLF.
When we offer only the prefix part of the multipart (no data at all), the current Partial HttpData has
an empty buffer.
The first time we offer the data starting with CRLF to the decoder, it now
has a correct current Partial HttpData with a buffer not empty.
The Benchmark was re-run against this new version.
Old Benchmark Mode Cnt Score Error Units
HttpPostMultipartRequestDecoderBenchmark.multipartRequestDecoderBigAdvancedLevel thrpt 6 4,037 ± 0,358 ops/ms
HttpPostMultipartRequestDecoderBenchmark.multipartRequestDecoderBigDisabledLevel thrpt 6 4,226 ± 0,471 ops/ms
HttpPostMultipartRequestDecoderBenchmark.multipartRequestDecoderBigParanoidLevel thrpt 6 0,875 ± 0,029 ops/ms
HttpPostMultipartRequestDecoderBenchmark.multipartRequestDecoderBigSimpleLevel thrpt 6 4,346 ± 0,275 ops/ms
HttpPostMultipartRequestDecoderBenchmark.multipartRequestDecoderHighAdvancedLevel thrpt 6 2,044 ± 0,020 ops/ms
HttpPostMultipartRequestDecoderBenchmark.multipartRequestDecoderHighDisabledLevel thrpt 6 2,278 ± 0,159 ops/ms
HttpPostMultipartRequestDecoderBenchmark.multipartRequestDecoderHighParanoidLevel thrpt 6 0,174 ± 0,004 ops/ms
HttpPostMultipartRequestDecoderBenchmark.multipartRequestDecoderHighSimpleLevel thrpt 6 2,370 ± 0,065 ops/ms
New Benchmark Mode Cnt Score Error Units
HttpPostMultipartRequestDecoderBenchmark.multipartRequestDecoderBigAdvancedLevel thrpt 6 5,604 ± 0,415 ops/ms
HttpPostMultipartRequestDecoderBenchmark.multipartRequestDecoderBigDisabledLevel thrpt 6 6,058 ± 0,111 ops/ms
HttpPostMultipartRequestDecoderBenchmark.multipartRequestDecoderBigParanoidLevel thrpt 6 0,914 ± 0,031 ops/ms
HttpPostMultipartRequestDecoderBenchmark.multipartRequestDecoderBigSimpleLevel thrpt 6 6,053 ± 0,051 ops/ms
HttpPostMultipartRequestDecoderBenchmark.multipartRequestDecoderHighAdvancedLevel thrpt 6 2,636 ± 0,141 ops/ms
HttpPostMultipartRequestDecoderBenchmark.multipartRequestDecoderHighDisabledLevel thrpt 6 3,033 ± 0,181 ops/ms
HttpPostMultipartRequestDecoderBenchmark.multipartRequestDecoderHighParanoidLevel thrpt 6 0,178 ± 0,006 ops/ms
HttpPostMultipartRequestDecoderBenchmark.multipartRequestDecoderHighSimpleLevel thrpt 6 2,859 ± 0,189 ops/ms
So +20 to +40% improvement due to not searching for CRLF/LF into the full buffer when no delimiter is found,
but only from the end and delimiter size + 2 (CRLF).
Motivation:
SslHandler has many independent boolean member variables. They can be
collapsed into a single variable to save memory.
Modifications:
- SslHandler boolean state consolidated into a single short variable.
Result:
Savings of 8 bytes per SslHandler (which is per connection) observed on
OpenJDK.
Motivation:
`SslHandler#unwrap` may produce `SslHandshakeCompletionEvent` if it
receives `close_notify` alert. This alert indicates that the engine is
closed and no more data are expected in the pipeline. However, it fires
the event before the last data chunk. As the result, further handlers
may loose data if they handle `SslHandshakeCompletionEvent`.
This issue was not visible before #11133 because we did not write
`close_notify` alert reliably.
Modifications:
- Add tests to reproduce described behavior;
- Move `notifyClosePromise` after fire of the last `decodeOut`;
Result:
`SslHandshakeCompletionEvent` correctly indicates that the engine is
closed and no more data are expected on the pipeline.
Motivation:
We should avoid blocking in the event loop as much as possible.
The InputStream.read() is a blocking method, and we don't need to call it if available() returns a positive number.
Modification:
Bypass calling InputStream.read() if available() returns a positive number.
Result:
Fewer blocking calls in the event loop, in general, when ChunkedStream is used.
Motivation:
SslHandler's wrap method notifies the handshakeFuture and sends a
SslHandshakeCompletionEvent user event down the pipeline before writing
the plaintext that has just been wrapped. It is possible the application
may write as a result of these events and re-enter into wrap to write
more data. This will result in out of sequence data and result in alerts
such as SSLV3_ALERT_BAD_RECORD_MAC.
Modifications:
- SslHandler wrap should write any pending data before notifying
promises, generating user events, or anything else that may create a
re-entry scenario.
Result:
SslHandler will wrap/write data in the same order.
Motivation:
This is a regression caused by #11086
Modifications:
AbstractKQueueChannel#writeFilter should be invoked with `!in.isEmpty()`
- false - all messages are written
- true - there are still messages to be written
Result:
AbstractKQueueChannel#writeFilter is invoked with the correct boolean depending on the ChannelOutboundBuffer state
Motivation:
We should skip the deployment of jars that are not meant to be consumed by the user as there is no public API.
Modifications:
Let's skip deployment for modules that are not useful for users
Result:
Build cleanup
Motivation:
We also need to ensure that all the header validation is done when a single header with the endStream flag is received
Modifications:
- Adjust code to always enforce the validation
- Add more unit tests
Result:
Always correctly validate
Motivation:
As we can supported SegmentedDatagramPacket in multiple native
transports (like in epoll and io_uring) we should just move it to
unix-common so we can share code.
Modification:
- Move SegmentedDatagrampPacket to transport-native-unixu
- Mark the SegmentedDatagramPacket in epoll as deprecated
- Update code to use updated package.
Result:
Possibility of code re-use
Motivation:
There are some redundant checks and so these can be removed
Modifications:
- First check frameOpcode != OPCODE_PING is removed because the code executed int the branch where frameOpcode <= 7, while OPCODE_PING is 9.
- Second check frameOpcode != OPCODE_PING is removed because its checked before.
Result:
Code cleanup
Motivation:
UDP_GRO can improve performance when reading UDP datagrams. This patch adds support for it.
See https://lwn.net/Articles/768995/
Modifications:
- Add recvmsg(...)
- Add support for UDP_GRO in recvmsg(...) and recvmmsg(...)
- Remove usage of recvfrom(...) and just always use recvmsg(...) or recvmmsg(...) to simplify things
- Refactor some code for sharing
- Add EpollChannelOption.UDP_GRO and the getter / setter in EpollDatagramConfig
Result:
UDP_GRO is supported when the underlying system supports it.
Motivation:
We had a bug in out DefaulThreadFactory as it always retrieved the ThreadGroup to used during creation time when now explicit ThreadGroup was given. This is problematic as the Thread may die and so the ThreadGroup is destroyed even tho the DefaultThreadFactory is still used.
This could produce exceptions like:
java.lang.IllegalThreadStateException
at java.lang.ThreadGroup.addUnstarted(ThreadGroup.java:867)
at java.lang.Thread.init(Thread.java:405)
at java.lang.Thread.init(Thread.java:349)
at java.lang.Thread.<init>(Thread.java:599)
at io.netty.util.concurrent.FastThreadLocalThread.<init>(FastThreadLocalThread.java:60)
at io.netty.util.concurrent.DefaultThreadFactory.newThread(DefaultThreadFactory.java:122)
at io.netty.util.concurrent.DefaultThreadFactory.newThread(DefaultThreadFactory.java:106)
at io.netty.util.concurrent.ThreadPerTaskExecutor.execute(ThreadPerTaskExecutor.java:32)
at io.netty.util.internal.ThreadExecutorMap$1.execute(ThreadExecutorMap.java:57)
at io.netty.util.concurrent.SingleThreadEventExecutor.doStartThread(SingleThreadEventExecutor.java:978)
at io.netty.util.concurrent.SingleThreadEventExecutor.startThread(SingleThreadEventExecutor.java:947)
at io.netty.util.concurrent.SingleThreadEventExecutor.execute(SingleThreadEventExecutor.java:830)
at io.netty.util.concurrent.SingleThreadEventExecutor.execute(SingleThreadEventExecutor.java:818)
at io.netty.channel.AbstractChannel$AbstractUnsafe.register(AbstractChannel.java:471)
at io.netty.channel.SingleThreadEventLoop.register(SingleThreadEventLoop.java:87)
at io.netty.channel.SingleThreadEventLoop.register(SingleThreadEventLoop.java:81)
at io.netty.channel.MultithreadEventLoopGroup.register(MultithreadEventLoopGroup.java:86)
at io.netty.bootstrap.AbstractBootstrap.initAndRegister(AbstractBootstrap.java:323)
at io.netty.bootstrap.AbstractBootstrap.doBind(AbstractBootstrap.java:272)
at io.netty.bootstrap.AbstractBootstrap.bind(AbstractBootstrap.java:239)
at io.netty.incubator.codec.quic.QuicTestUtils.newServer(QuicTestUtils.java:138)
at io.netty.incubator.codec.quic.QuicTestUtils.newServer(QuicTestUtils.java:143)
at io.netty.incubator.codec.quic.QuicTestUtils.newServer(QuicTestUtils.java:147)
at io.netty.incubator.codec.quic.QuicStreamFrameTest.testCloseHalfClosure(QuicStreamFrameTest.java:48)
at io.netty.incubator.codec.quic.QuicStreamFrameTest.testCloseHalfClosureUnidirectional(QuicStreamFrameTest.java:35)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:288)
at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:282)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.lang.Thread.run(Thread.java:748)
Modifications:
- If the user dont specify a ThreadGroup we will just pass null to the constructor of FastThreadLocalThread and so let it retrieve on creation time
- Adjust tests
Result:
Don't risk to see IllegalThreadStateExceptions.
Motivation:
The offsets were accidentally typed as int, where they should have been typed as long.
Modification:
Change type of offset arguments to PlatformDependent.put*(Object,int,?) from int to long.
Result:
It is now possible to use these methods to store to memory at absolute memory addresses.
netty-jni-util 0.0.2.Final is incompatible with static linking. Before
the netty-jni-util dependency was introduced netty-tcnative supported
static linking via NETTY_BUILD_STATIC. netty-jni-util 0.0.3.Final adds
static linking compatibility.
Modifications:
Bump netty-jni-util to version 0.0.3.Final and update to its new API
which requires the caller to manage packagePrefix.
Result:
Using latest version of netty-jni-util and restored static linking
compatibility.
Motivation:
These are necessary for creating a buffer implementation that uses Unsafe, and works generically for both on-heap and off-heap memory.
Modification:
PlatformDependent previously forced clients to decide if they are working on on-heap memory, or off-heap memory, by giving accessors distinct APIs for each.
What is added here, are generic accessors that work the same in either case.
Result:
We can now make an Unsafe-based buffer implementation that is agnostic to whether the memory is on- or off-heap.
Motivation:
LSE (https://mysqlonarm.github.io/ARM-LSE-and-MySQL/) can have a huge performance difference. Let's ensure we use a compiler that can support it.
Modifications:
Update to gc10 when cross-compiling as it supports LSE and enables it by default
Result:
More optimized builds for aarch64
Motivation:
CONNECT requests have no path defined as stated in the HTTP/2 spec, at the moment we will throw an exception if we try to convert such a request to HTTP/1.1
Modifications:
- Don't throw an exception if we try to convert a HTTP/2 CONNECT request that has no path
- Add unit test
Result:
Related to https://github.com/netty/netty-incubator-codec-http3/pull/112.
Motivation:
We need to ensure we are still be able to correctly map errors to streams in all cases. The problem was that we sometimes called closeStreamRemote(...) in a finally block and so closed the underyling stream before the actual exception was propagated. This was only true in some cases and not in all. Generally speaking we should only call closeStreamRemote(...) if there was no error as in a case of error we should generate a RST frame.
Modifications:
- Only call closeStreamRemote(...) if no exeption was thrown and so let the Http2ConnectionHandler handle the exception correctly
- Add unit tests
Result:
Correctly handle errors even when endStream is set to true
Motivation:
Alignment handling was broken, and basically turned into a fixed offset into each allocation address regardless of its initial value, instead of ensuring that the allocated address is either aligned or bumped to the nearest alignment offset.
The brokenness of the alignment handling extended so far, that overlapping ByteBuf instances could even be created, as was seen in #11101.
Modification:
Instead of fixing the per-allocation pointer bump, we now ensure that 1) the minimum page size is a whole multiple of the alignment, and 2) the reference memory for each chunk is bumped to the nearest aligned address, and finally 3) ensured that the reservations are whole multiples of the alignment, thus ensuring that the next allocation automatically occurs from an aligned address.
Incidentally, (3) above comes for free because the reservations are in whole pages, and in (1) we ensured that pages are sized in whole multiples of the alignment.
In order to ensure that the memory for a chunk is aligned, we introduce some new PlatformDependent infrastructure.
The PlatformDependent.alignDirectBuffer will produce a slice of the given buffer, and the slice will have an address that is aligned.
This method is plainly available on ByteBuffer in Java 9 onwards, but for pre-9 we have to use Unsafe, which means it can fail and might not be available on all platforms.
Attempts to create a PooledByteBufAllocator that uses alignment, when this is not supported, will throw an exception.
Luckily, I think use of aligned allocations are rare.
Result:
Aligned pooled byte bufs now work correctly, and never have any overlap.
Fixes#11101
Motivation
A GOAWAY frame (or any other HTTP/2 frame) should not be sent before the
connection preface. Clients that immediately close the channel may
currently attempt to send a GOAWAY frame before the connection preface,
resulting in servers receiving a seemingly-corrupt connection preface.
Modifications
* Ensure that the preface has been sent before attempting to
automatically send a GOAWAY frame as part of channel shutdown logic
* Add unit test that only passes with new behavior
Result
Fixes https://github.com/netty/netty/issues/11026
Co-authored-by: Bennett Lynch <Bennett-Lynch@users.noreply.github.com>
Motivation:
If compression is enabled and the HttpResponse also
implements HttpContent (but not LastHttpContent) then
the buffer will be freed to eagerly.
Modification:
I retain the buffer the same way that is done for the LastHttpContent case.
Note that there is another suspicious looking call a few lines above (if beginEncode returns null). I am not sure if this should also be retained.
Result:
Fixes#11092
Motivation:
netty-build is now called netty-build-common
Modifications:
Rename netty-build to netty-build-common
Result:
Be able to compile branch again
Motivation:
Components in a composite buffer can "go missing" if the composite is a slice of another composite and the parent has changed its layout.
Modification:
Where we would previously have thrown a NullPointerException, we now have a null-check for the component, and we instead throw an IllegalStateException with a more descriptive message.
Result:
It's now a bit easier to understand what is going on in these situations.
Fixes#10908
Motivation:
Newer versions of dawidd6/action-download-artifact changed the default
workflow_conclusion from "completed" to "completed, success" which can
result in download failures if the associated job fails, which is
expected when tests fail.
Modifications:
- Explicitly set the workflow_conclusion to "completed"
Result:
Test failures which result in build failures will still download test
data and generate reports after updating
dawidd6/action-download-artifact.
... number of bytes when using DatagramChannels
Motivation:
In our FixedRecvByteBufAllocator we dont continue to read if the number of bytes is less then what was configured. This is correct when using it for TCP but not when using it for UDP. When using UDP the number of bytes is the maximum of what we want to support but we often end up processing smaller datagrams in general. Because of this we should use contineReading(UncheckedBooleanSupplier) to determite if we should continue reading
Modifications:
- use contineReading(UncheckedBooleanSupplier) for DatagramChannels
Result:
Read more then once in the general case for DatagramChannels with the default config
Motivation:
Allow to configure the maximum number of messages to write per eventloop run. This can be useful to ensure we read data in a timely manner and not let writes dominate the CPU time. This is especially useful in protocols like QUIC where you need to read "fast enough" as otherwise you may not read the ACKs fast enough.
Modifications:
- Add new ChannelOption / config that allows to limit the number of messages to write per eventloop run.
- Respect this setting for DatagramChannels
Result:
Reduce the risk of having WRITES block the processing of other events in a timely manner
Co-authored-by: terrarier2111 <58695553+terrarier2111@users.noreply.github.com>
Motivation:
We need to ensure we also build the modules we depend on as otherwise we may not be able to resolve the dependencies correctly
Modifications:
- Add -am when calling maven
Result:
Deployment works all the time
Motivation:
SslHandler owns the responsibility to flush non-application data
(e.g. handshake, renegotiation, etc.) to the socket. However when
TCP Fast Open is supported but the client_hello cannot be written
in the SYN the client_hello may not always be flushed. SslHandler
may not wrap/flush previously written/flushed data in the event
it was not able to be wrapped due to NEED_UNWRAP state being
encountered in wrap (e.g. peer initiated renegotiation).
Modifications:
- SslHandler to flush in channelActive() if TFO is enabled and
the client_hello cannot be written in the SYN.
- SslHandler to wrap application data after non-application data
wrap and handshake status is FINISHED.
- SocketSslEchoTest only flushes when writes are done, and waits
for the handshake to complete before writing.
Result:
SslHandler flushes handshake data for TFO, and previously flushed
application data after peer initiated renegotiation finishes.
Motivation:
The EpollSocketConnectTest was not correctly configuring TCP Fast Open on the server socket.
It's an option, not a child option.
Modification:
EpollSocketConnectTest now correctly enables TCP Fast Open on the server side, when available, for the test that needs it.
Result:
Test covers what it was intended to.
Motivation:
There are several overloads of streamError(), with one receiving the
Throwable to be made the cause of the new exception. However, the wrong
overload was being called and instead the IllegalArgumentException was
being passed as a message format argument which was summarily thrown
away as the message format didn't reference it.
Modifications:
Move IllegalArgumentException to proper argument position.
Result:
A useful exception, with the underlying cause available.
Motivation:
As stated by https://tools.ietf.org/html/rfc7540#section-8.1.2.6 we should report a stream error if the content-length does not match the sum of all data frames.
Modifications:
- Verify that the sum of data frames match if a content-length header was send.
- Handle multiple content-length headers and also handle negative values
- Add io.netty.http2.validateContentLength system property which allows to disable the more strict validation
- Add unit tests
Result:
Correctly handle the case when the content-length header was included but not match what is send and also when content-length header is invalid
Motivation:
c22c6b845d introduced support for
UDP_SEGMENT but did restrict it to continous buffers. This is not needed
as it is also fine to use CompositeByteBuf
Modifications:
- Allow to use CompositeByteBuf as well
- Add unit test
Result:
More flexible usage of segmented datagrams possible
Motivation
The HttpObjectDecoder accepts input parameters for maxInitialLineLength
and maxHeaderSize. These are important variables since both message
components must be buffered in memory. As such, many decoders (like
Netty and others) introduce constraints. Due to their importance, many
users may wish to add instrumentation on the values of successful
decoder results, or otherwise be able to access these values to enforce
their own supplemental constraints.
While users can perhaps estimate the sizes today, they will not be
exact, due to the decoder being responsible for consuming optional
whitespace and the like.
Modifications
* Add HttpMessageDecoderResult class. This class extends DecoderResult
and is intended for HttpMessage objects successfully decoded by the
HttpObjectDecoder. It exposes attributes for the decoded
initialLineLength and headerSize.
* Modify HttpObjectDecoder to produce HttpMessageDecoderResults upon
successfully decoding the last HTTP header.
* Add corresponding tests to HttpRequestDecoderTest &
HttpResponseDecoderTest.
Co-authored-by: Bennett Lynch <Bennett-Lynch@users.noreply.github.com>
Motivation:
Add test to validate we are not affected by the regression introduced by fd8c1874b4
Modifications:
- Add unit test
Result:
Verify code works as expected
Motivation:
At the moment we use http to download rpms, let's switch to https
Modifications:
Use https for rpms
Result:
Hopefully more stable docker image builds
Motivation:
HttpObjectDecoder may throw an IllegalArgumentException if it encounters
a character that Character.isWhitespace() returns true for, but is not
one of the two valid OWS (optional whitespace) values. Such values may
not have friendly or readable toString() values. We should include the
hex value so that the illegal character can always be determined.
Modifications:
Add hex value as well
Result:
Easier to debug
Co-authored-by: Bennett Lynch <Bennett-Lynch@users.noreply.github.com>
Motivation:
These days we always include the OS in the library name. This means we also can simplify things
Modifications:
Adjust build configuration to address for libray name change
Result:
Simplify build