Motivation:
In line base decoders, lines are split by delimiter, but the delimiter may be \r\n or \r, so in decoding, if findEndOfLine finds delimiter of a line, the length of the delimiter may be 1 or 2, instead of DELIMITER_LENGTH, where the value is fixed to 2.
The second problem is that if the data to be decoded is too long, the decoder will discard too long data, and needs to record the length of the discarded bytes. In the original implementation, the discarded bytes are not accumulated, but are assigned to the currently discarded bytes.
Modification:
Modifications:
Dynamic calculation of the length of delimiter.
In discarding mode, add up the number of characters discarded each time.
Result:
Correctly handle all delimiters and also correctly handle too long frames.
Motivation:
The toString() method should use Arrays.toString() to produce a meaningful String representation for arrays.
Modification:
Use Arrays.toString()
Result:
More useful toString() implementation
Motivation:
We should not propage Http2WindowUpdateFrames to the child channels at all as these are not really use-ful and should not be flow-controlled via `read()` anyway. In the other hand Http2ResetFrame is very useful but should be propagated via an user event so the user is aware of it directly even if the user stops reading.
Modifications:
- Dont propagate Http2WindowUpdateFrames when using Http2MultiplexHandler
- Use user event for Http2ResetFrame when using Http2MultiplexHandler
- Adjust javadoc of Http2MultiplexHandler
- Add unit tests
Result:
Fixes https://github.com/netty/netty/pull/8889 and https://github.com/netty/netty/pull/7635
Motivation:
Http2MultiplexCodec and Http2MultiplexHandler had a very strong coupling with Http2FrameCodec which we can reduce easily. The end-goal should be to have no coupling at all.
Modifications:
- Reduce coupling by move some common logic to Http2CodecUtil
- Move logic to check if a stream may have existed before to Http2FrameCodec
- Use ArrayDeque as replacement for custom double-linked-list which makes the code a lot more readable
- Use WindowUpdateFrame to signal consume bytes (just as users do when they use Http2FrameCodec directly)
Result:
Less coupling and cleaner code.
Motivation:
Some methods that either override others or are implemented as part of implementation an interface did miss the `@Override` annotation
Modifications:
Add missing `@Override`s
Result:
Code cleanup
Motivation:
asList should only be used if there are multiple elements.
Modification:
Call to asList with only one argument could be replaced with singletonList
Result:
Cleaner code and a bit of memory savings
Motivation:
SpotJMHBugs reports that accumulating a value as a way of eliding dead code
elimination may be inadvisable, as discussed in
`JMHSample_34_SafeLooping::measureWrong_2`. Change the test so that it consumes
the response with `Blackhole::consume` instead.
Modifications:
- Replace addition of results with explicit `blackhole.consume()` call
Result:
Tests work as before, but with different benchmark numbers.
Motivation:
Some JMH benchmarks need additional explanations to motivate
specific code choices.
Modifications:
Introduced comment to explai why calling BlackHole::consume
in a loop is not always the right choice for some benchmark.
Result:
The relevant method shows a comment that warn about changing
the code to introduce BlackHole::consume in the loop.
Motivation:
Currently GraalVM substrate returns null for reflective calls if the reflection access is not declared up front.
A change introduced in Netty 4.1.35 results in needing to register every Netty handler for reflection. This complicates matters as it is difficult to know all the possible handlers that need to be registered.
Modification:
This change adds a simple
null check such that Netty does not break on GraalVM substrate without the reflection information registration.
Result:
Fixes#9278
Motivation:
At the moment EmptyByteBuf.getCharSequence(0,...) will return null while it must return a "".
Modifications:
- Let EmptyByteBuf.getCharSequence(0,...) return ""
- Add unit test
Result:
Fixes https://github.com/netty/netty/issues/9271.
Motivation:
The HttpPostRequestEncoder overwrites the original filename of file uploads sharing the same name encoded in mixed mode when it rewrites the multipart body header of the previous file. The original filename should be preserved instead.
Modifications:
Change the HttpPostRequestEncoder to reuse the correct filename when the encoder switches to mixed mode. The original test is incorrect and has been modified too, in addition it tests with an extra file upload since the current test was not testing the continuation of a mixed mode.
Result:
The HttpPostRequestEncoder will preserve the original filename of the first fileupload when switching to mixed mode
Motivation:
HAProxyMessage should be released as it contains a list of TLV which hold a ByteBuf, otherwise, it may cause memory leaks.
Modification:
- Let HAProxyMessage extend AbstractReferenceCounted
- Adjust tests.
Result:
Fixes#9201
Motivation:
In the past we had the following class hierarchy:
Http2ConnectionHandler --- Http2FrameCodec -- Http2MultiplexCodec
This hierarchy makes it impossible to plug in any code that would like to act on Http2Frame and Http2StreamFrame which can be quite useful for various situations (like metrics, logging etc). Beside this it also made the implementtion very hacky. To allow easier maintainance and also allow more flexible costumizations we should split Http2MultiplexCodec and Http2FrameCode.
Modifications:
- Introduce Http2MultiplexHandler (which is a replacement for Http2MultiplexCodec when used together with Http2FrameCodec)
- Mark Http2MultiplexCodecBuilder and Http2MultiplexCodec as deprecated. People should use Http2FrameCodecBuilder / Http2FrameCodec together with Http2MultiplexHandlder in the future
- Adjust / Add tests
- Adjust examples
Result:
More flexible usage possible and less hacky / coupled implementation for http2 multiplexing
Motivation:
I need to control WebSockets inbound flow manually, when autoRead=false
Modification:
Add missed ctx.read() call into WebSocketProtocolHandler, where read request has been swallowed.
Result:
Fixes#9257
Motivation:
c9aaa93d83 added the ability to specify an EventLoopTaskQueueFactory but did place it under MultithreadEventLoopGroup while not really belongs there.
Modifications:
Make EventLoopTaskQueueFactory a top-level interface
Result:
More logical code layout.
Motivation:
FlowControlHandler does use a recyclable ArrayDeque internally but only recycles it when the channel is closed. We should better recycle it once it is empty.
Modifications:
Recycle the deque as fast as possible
Result:
Less RecyclableArrayDeque instances.
Motivation:
Resolve the issue highlighted by SpotJMHBugs that the creation of the RecyclableArrayList may be elided by the JIT since the result isn't consumed or returned.
Modifications:
Return the result of `list.recycle()` so that the list isn't elided.
Result:
The JMH benchmark shows a change in performance indicating that the prior results of this may be unsound.
Motivation
It would be useful to be able to write UTF-8 encoded subsequence of
CharSequence characters to a ByteBuf without needing to create a
temporary object via CharSequence#subSequence().
Modification
Add overloads of ByteBufUtil writeUtf8, reserveAndWriteUtf8 and
utf8Bytes methods which take explicit subsequence bounds.
Result
More efficient writing of substrings to byte buffers possible
Motivation:
testTruncatedWithTcpFallback was flacky as we may end up closing the socket before we could read all data. We should only close the socket after we succesfully read all data.
Modifications:
Move socket.close() to finally block
Result:
Fix flaky test and so make the CI more stable again.
Motivation:
Sometimes it is desirable to be able to use a different Queue implementation for the EventLoop of a Channel. This is currently not possible without resort to reflection.
Modifications:
- Add a new constructor to Nio|Epoll|KQueueEventLoopGroup which allows to specify a factory which is used to create the task queue. This was the user can override the default implementation.
- Add test
Result:
Be able to change Queue that is used for the EventLoop.
Motivation
There's quite a lot of duplicate/equivalent logic across the various
concrete ByteBuf implementations. We could take this even further but
for now I've focused on the PooledByteBuf sub-hierarchy.
Modifications
- Move common logic/methods into existing PooledByteBuf abstract
superclass
- Shorten PooledByteBuf.capacity(int) method implementation
Result
Less code to maintain
Motivation:
For HTTP/2 messages with multiple cookies HttpConversionUtil.addHttp2ToHttpHeaders spends a good portion of time creating throwaway StringBuilders.
Modification:
Handle cookies lazily by using a ThreadLocal StringBuilder and then converting it to the H1 header at the end.
Result:
Less allocations.
Motivation:
f945a071db decoupled the writability state from the flow controller but could lead to the situation of a lot of writability updates events were propagated to the child channels. This change ensure we only take into account if the parent channel becomes writable again before we try to set the child channels to writable.
Modifications:
Only listen for channel writability changes for if the parent channel becomes writable again.
Result:
Less writability updates.
Motivation:
Incorrect WebSockets closure affects our production system.
Enforced 'close socket on any protocol violation' prevents our custom termination sequence from execution.
Huge number of parameters is a nightmare both in usage and in support (decoders configuration).
Modification:
Fix violations handling - send proper response codes.
Fix for messages leak.
Introduce decoder's option to disable default behavior (send close frame) on protocol violations.
Encapsulate WebSocket response codes - WebSocketCloseStatus.
Encapsulate decoder's configuration into a separate class - WebSocketDecoderConfig.
Result:
Fixes#8295.
Motivation:
We should decouple the writability state of the http2 child channels from the flow-controller and just tie it to its own pending bytes counter that is decremented by the parent Channel once the bytes were written.
Modifications:
- Decouple writability state of child channels from flow-contoller
- Update tests
Result:
Less coupling and more correct behavior. Fixes https://github.com/netty/netty/issues/8148.
Motivation:
Traffic shaping needs more accurate execution than scheduled one. So the
use of FixedRate instead.
Moreover the current implementation tends to create as many threads as
channels use a ChannelTrafficShapingHandlern, which is unnecessary.
Modifications:
Change the executor.schedule to executor.scheduleAtFixedRate in the
start and remove the reschedule call from run monitor thread since it
will be restarted by the Fixed rate executor.
Also fix a minor bug where restart was only doing start() without stop()
before.
Result:
Threads are more stable in number of cached and precision of traffic
shaping is enhanced.
Motivation:
Lz4FrameEncoder and Lz4FrameDecoder in their default configuration use
an extremely inefficient way to checksum direct byte buffers. In
particular, for every byte checksummed, a single-element byte array is
being allocated and a JNI cal is made, which in some internal testing
makes a 25x difference in total throughput and allocates *a lot* of
garbage.
Modifications:
Lz4XXHash32, an implementation of ByteBufChecksum specifically for use
by Lz4FrameEncoder and Lz4FrameDecoder, is introduced. It utilises
xxHash32 block API which provides a hash() method that accepts a
ByteBuffer as an argument. Lz4FrameEncoder and Lz4FrameDecoder are
modified to use this implementation by default.
Result:
Lz4FrameEncoder and Lz4FrameDecoder perform well again when operating
on direct byte buffers with default checksum configuration; a public
implementation is provided for those who need to override the seed.
Motivation:
ReflectiveByteBufChecksum#update(buf, off, len) ignores provided offset
and length arguments when operating on direct buffers, leading to wrong
byte sequences being checksummed and ultimately incorrect checksum
values (unless checksumming the entire buffer).
Modifications:
Use the provided offset and length arguments to get the correct nio
buffer to checksum; add test coverage exercising the four meaningfully
different offset and length combinations.
Result:
Offset and length are respected and a correct checksum gets calculated;
simple unit test should prevent regressions in the future.
Motivation:
SslHandler must generate control data as part of the TLS protocol, for example
to do handshakes. SslHandler doesn't capture the status of the future
corresponding to the writes when writing this control (aka non-application
data). If there is another handler before the SslHandler that wants to fail
these writes the SslHandler will not detect the failure and we must wait until
the handshake timeout to detect a failure.
Modifications:
- SslHandler should detect if non application writes fail, tear down the
channel, and clean up any pending state.
Result:
SslHandler detects non application write failures and cleans up immediately.
Motivation:
Because of a simple bug in ByteBufChecksum#updateByteBuffer(Checksum),
ReflectiveByteBufChecksum is never used for CRC32 and Adler32, resulting
in direct ByteBuffers being checksummed byte by byte, which is
undesriable.
Modification:
Fix ByteBufChecksum#updateByteBuffer(Checksum) method to pass the
correct argument to Method#invoke(Checksum, ByteBuffer).
Result:
ReflectiveByteBufChecksum will now be used for Adler32 and CRC32 on
Java8+ and direct ByteBuffers will no longer be checksummed on slow
byte-by-byte basis.
Motivation:
In the current implementation, the synchronous close() method for FixedChannelPool returns
after scheduling the channels to close via a single threaded executor asynchronously. Closing a channel
requires event loop group, however, there might be a scenario when the application has closed
the event loop group after the sync close() completes. In this scenario an exception is thrown
(event loop rejected the execution) when the single threaded executor tries to close the channel.
Modifications:
Complete the close function only after all the channels have been close and introduce
closeAsync() method for cases when the current/existing behaviour is desired.
Result:
Close function would completely when the channels have been closed
Motivation:
Sometimes it is beneficial to be able to set a parent Channel in EmbeddedChannel if the handler that should be tested depend on the parent.
Modifications:
- Add another constructor which allows to specify a parent
- Add unit tests
Result:
Fixes https://github.com/netty/netty/issues/9228.
Motivation:
When connecting through an HTTP proxy over clear HTTP, user agents must send requests with an absolute url. This hold true for WebSocket Upgrade request.
WebSocketClientHandshaker and subclasses currently always send requests with a relative url, which causes proxies to crash as request is malformed.
Modification:
Introduce a new parameter `absoluteUpgradeUrl` and expose it in constructors and WebSocketClientHandshakerFactory.
Result:
It's now possible to configure WebSocketClientHandshaker so it works properly with HTTP proxies over clear HTTP.
delete Other "Content-" MIME Header Fields exception
Motivation:
RFC7578 4.8. Other "Content-" Header Fields
The multipart/form-data media type does not support any MIME header
fields in parts other than Content-Type, Content-Disposition, and (in
limited circumstances) Content-Transfer-Encoding. Other header
fields MUST NOT be included and MUST be ignored.
Modification:
Ignore other Content types.
Result:
Other "Content-" Header Fields should be ignored no exception
Motivation:
We did not have support for enable / disable loopback mode in our native epoll transport and also missed the implemention to access the configured interface.
Modifications:
Add implementation and adjust test to cover it
Result:
More complete multicast support with native epoll transport
Motivation:
When Netty is run through ProGuard, seemingly unused methods are removed. This breaks reflection, making the Handler skipping throw a reflective error.
Modification:
If a method is seemingly absent, just disable the optimization.
Result:
Dealing with ProGuard sucks infinitesimally less.
Motivation:
b4e3c12b8e introduced code to avoid coupling
close() to graceful close. It also added some code which attempted to infer when
a graceful close was being done in writing of a GOAWAY to preserve the
"connection is closed when all streams are closed behavior" for the child
channel API. However the implementation was too overzealous and may preemptively
close the connection if there are not currently any open streams (and close if
there are any frames which create streams in flight).
Modifications:
- Decouple writing a GOAWAY from trying to infer if a graceful close is being
done and closing the connection. Even if we could enhance this logic (e.g.
wait to close until the second GOAWAY with no error) it is possible the user
doesn't want the connection to be closed yet. We can add a means for the codec
to orchestrate the graceful close in the future (e.g. write some special "close
the connection when all streams are closed") but for now we can just let the
application handle this.
Result:
Fixes https://github.com/netty/netty/issues/9207
Motivation:
The wakeup logic in EpollEventLoop is overly complex
Modification:
* Simplify the race to wakeup the loop
* Dont let the event loop wake up itself (it's already awake!)
* Make event loop check if there are any more tasks after preparing to
sleep. There is small window where the non-eventloop writers can issue
eventfd writes here, but that is okay.
Result:
Cleaner wakeup logic.
Benchmarks:
```
BEFORE
Benchmark Mode Cnt Score Error Units
EpollSocketChannelBenchmark.executeMulti thrpt 20 408381.411 ± 2857.498 ops/s
EpollSocketChannelBenchmark.executeSingle thrpt 20 157022.360 ± 1240.573 ops/s
EpollSocketChannelBenchmark.pingPong thrpt 20 60571.704 ± 331.125 ops/s
Benchmark Mode Cnt Score Error Units
EpollSocketChannelBenchmark.executeMulti thrpt 20 440546.953 ± 1652.823 ops/s
EpollSocketChannelBenchmark.executeSingle thrpt 20 168114.751 ± 1176.609 ops/s
EpollSocketChannelBenchmark.pingPong thrpt 20 61231.878 ± 520.108 ops/s
```
Motivation:
This resolves a TODO from the initial transport-native-kqueue implementation, supplying the user with the pid of the local peer client/server process.
Modification:
Inside netty_kqueue_bsdsocket_getPeerCredentials, Call getsockopt with LOCAL_PEERPID and pass it to PeerCredentials constructor.
Add a test case in KQueueSocketTest.
Result:
PeerCredentials now have pid field set. Fixes https://github.com/netty/netty/issues/9213
Motivation:
RoundRobinDnsAddressResolverGroup ultimately opens UDP
ports for DNS resolution. Callers likely expect that
RoundRobinDnsAddressResolverGroup#close() will close those
ports, but that is not currently true (see #9212).
Modifications:
Overrode RoundRobinInetAddressResolver#close() to close
the delegate name resolver, which in turn closes any UDP
ports used for name resolution.
Result:
RoundRobinDnsAddressResolverGroup#close() closes UDP ports
as expected. This fixes#9212.
Motivation
While digging around looking at something else I noticed that these
share a lot of logic and it would be nice to reduce that duplication.
Modifications
Have UnpooledUnsafeDirectByteBuf extend UnpooledDirectByteBuf and make
adjustments to ensure existing behaviour remains unchanged.
The most significant addition needed to UnpooledUnsafeDirectByteBuf was
re-overriding the getPrimitive/setPrimitive methods to revert back to
the AbstractByteBuf versions which include bounds checks
(UnpooledDirectByteBuf excludes these as an optimization, relying on
those done by underlying ByteBuffer).
Result
~200 fewer lines, less duplicate logic.
Motivation:
It is valid to use null as sender so we should support it when DatagramPacketEncoder checks if it supports the message.
Modifications:
- Add null check
- Add unit test
Result:
Fixes https://github.com/netty/netty/issues/9199.
Motivation:
At the moment ByteToMessageDecoder always calls fireChannelReadComplete() when the handler is removed from the pipeline and the cumulation buffer is not null. We should only call it when we also call fireChannelRead(...), which only happens if the cumulation buffer is not null and readable.
Modifications:
Only call fireChannelReadComplete() if fireChannelRead(...) is called before during removal of the handler.
Result:
More correct semantics
Motivation:
1. Users will be able to use an optimized version of
`UnpooledHeapByteBuf` and override behavior of methods if required.
2. Consistency with `UnpooledDirectByteBuf`, `UnpooledHeapByteBuf`, and
`UnpooledUnsafeDirectByteBuf`.
Modifications:
- Add `public` access modifier to `UnpooledUnsafeHeapByteBuf` class and
ctor;
Result:
Public access for optimized version of `UnpooledHeapByteBuf`.
Motivation:
We do not need to issue a read on timerfd and eventfd when the EventLoop wakes up if we register these as Edge-Triggered. This removes the overhead of 2 syscalls and so helps to reduce latency.
Modifications:
- Ensure we register the timerfd and eventfd with EPOLLET flag
- If eventfd_write fails with EAGAIN, call eventfd_read and try eventfd_write again as we only use it as wake-up mechanism.
Result:
Less syscalls and so reducing overhead.
Co-authored-by: Carl Mastrangelo <carl@carlmastrangelo.com>