* Add OpenSslX509KeyManagerFactory which makes it even easier for people to get the maximum performance when using OpenSSL / LibreSSL / BoringSSL with netty.
Motivation:
To make it even easier for people to get the maximum performance when using native SSL we should provide our own KeyManagerFactory implementation that people can just use to configure their key material.
Modifications:
- Add OpenSslX509KeyManagerFactory which users can use for maximum performance with native SSL
- Refactor some internal code to re-use logic and not duplicate it.
Result:
Easier to get the max performance out of native SSL implementation.
Motivation:
5b1fe611a6 introduced the usage of a finalizer as last resort for PoolThreadCache. As we may call free() from the FastThreadLocal.onRemoval(...) and finalize() we need to guard against multiple calls as otherwise we will corrupt internal state (that is used for metrics).
Modifications:
Use AtomicBoolean to guard against multiple calls of PoolThreadCache.free().
Result:
No more corruption of internal state caused by calling PoolThreadCache.free() multuple times.
Motivation:
Users should not see a scary log message when Netty is initialized if
Netty configuration explicitly disables unsafe. The log message that
produces this warning was previously guarded but by recent refactoring
a bug was introduced inside the guard helper method.
Modifications:
This commit brings back the guard against the scary log message if
unsafe is explicitly disabled.
Result:
No log message is produced when unsafe is unavailable because Netty was
told to not look for it.
Relates https://github.com/netty/netty/pull/5624, https://github.com/netty/netty/pull/6696
Motivation:
We need to release the inbound data to ensure there are no leaks.
Modifications:
Extend SimpleChannelInboundHandler which will release inbound data by default.
Result:
No more leaks.
Motivation:
Recent PR https://github.com/netty/netty/pull/8040 introduced
Unpooled.wrappedUnmodifiableBuffer(ByteBuf...) which has the same
behaviour but wraps the provided array directly. This is preferred for
most uses (including varargs-based use) and if there are any unusual
cases of an explicit array which is re-used before the ByteBuf is
finished with, it can just be copied first.
Modifications:
Added @Deprecated annotation and javadoc to
Unpooled.unmodifiableBuffer(ByteBuf...).
Result:
Unpooled.unmodifiableBuffer(ByteBuf...) will be deprecated.
Motivation:
When a Http2MultiplexCodec stream channel fails to write the first
HEADERS it will forcibly close, and that will trigger sending a
RST_STREAM, which is commonly a connection level protocol error. This is
because it has what looks like a valid stream id, but didn't check with
the connection as to whether the stream may have actually existed.
Modifications:
Instead of checking if the stream was just a valid looking id ( > 0) we
check with the connection as to whether it may have existed at all.
Result:
We no longer send a RST_STREAM frame from Http2MultiplexCodec for idle
streams.
Motivation:
netty_unix_socket attempts to use nettyClassName in an error message, but previously freed the memory. We should wait to free the memory until after we use it.
Modifications:
- Free nettyClassName after using it in snprintf
Result:
More useful error message.
Motivation:
The Http2Connection state is updated by the DefaultHttp2ConnectionDecoder after the frame listener is notified of the goaway frame. If the listener sends a frame synchronously this means the connection state will not know about the goaway it just received and we may send frames that are not allowed on the connection. This may also mean a stream object is created but it may never get taken out of the stream map unless some other event occurs (e.g. timeout).
Modifications:
- The Http2Connection state should be updated before the listener is notified of the goaway
- The Http2Connection state modification and validation should be self contained when processing a goaway instead of partially in the decoder.
Result:
No more creating streams and sending frames after a goaway has been sent or received.
Motivation:
b818852cdb broke support for shading the native libraries in netty as it missed to respect the package prefix that is used when shading.
Modifications:
Correctly respect package prefix for constructor argument and include the used classname when logging that we could not find the constructor.
Result:
Be able to shade native libraries of netty again.
Motivation:
Currently, the vast majority of userEventTriggered() implementations
require the user to supply the boilerplate behavior of performing an
instanceof check, handling if appropriate, and calling
fireUserEventTriggered() otherwise.
We can simplify this very common use case by creating a class that only
matches user events of a given type, similar to the existing
SimpleChannelInboundHandler class.
Modifications:
Create a new SimpleUserEventChannelHandler class
Create accompanying SimpleUserEventChannelHandlerTest class
Result:
Users will be able to handle most events in a less verbose manner.
Motivation:
If the local endpoint receives a GO_AWAY frame and then tries to write a stream with a streamId higher than the last know stream ID we will throw a connection error. This results in the local peer sending a GO_AWAY frame to the remote peer, but this is not necessary as the error can be isolated to the local endpoint and communicated via the ChannelFuture return value.
Modifications:
- Instead of throwing a connection error, throw a stream error that simulates the peer receiving the stream and replying with a RST
Result:
Connections are not closed abruptly when trying to create a stream on the local endpoint after a GO_AWAY frame is received.
Motivation:
As the used OpenSSL version may not support hostname validation we should only really call SSL.setHostNameValidation(...) if we detect that its needed.
Modifications:
Only call SSL.setHostNameValidation if it was disabled before and now it needs to be enabled or if it was enabled before and it should be disabled now.
Result:
Less risk of an exception when using an OpenSSL version that does not support hostname validation.
Motivation:
A new version of tcnative was released that allows to use features depending on the runtime version of openssl, which makes it possible to use KeyManagerFactory and hostname verification on newer versions of centos/fedora/rhel and debian/ubuntu without the need to compile again.
Modifications:
Update to 2.0.12.Final
Result:
Use latest version of netty-tcnative to support more features.
Motivation:
ObjectCleaner does start a Thread to handle the cleaning of resources which leaks into the users application. We should not use it in netty itself to make things more predictable.
Modifications:
- Remove usage of ObjectCleaner and use finalize as a replacement when possible.
- Clarify javadocs for FastThreadLocal.onRemoval(...) to ensure its clear that remove() is not guaranteed to be called when the Thread completees and so this method is not enough to guarantee cleanup for this case.
Result:
Fixes https://github.com/netty/netty/issues/8017.
Motivation:
OpenSSL allows to use a custom engine for its cryptographic operations. We should allow the user to make use of it if needed.
See also: https://www.openssl.org/docs/man1.0.2/crypto/engine.html.
Modifications:
Add new system property which can be used to specify the engine to use (null is the default and will use the build in default impl).
Result:
More flexible way of using OpenSSL.
Motivation:
We use FixedChannelPool in production, and we believe we have a leak that doesn't return sockets to the pool (but they should be closed), thus blocking us from creating new connections when we need them. I haven't confirmed this yet, but right now I have to resort to reflection to access this field which makes me sad.
Modification:
Expose the acquiredChannelCount field through a getter method.
Result:
Allows introspection of the pool size in FixedChannelPool.
Motivation:
If a write fails for a Http2MultiplexChannel stream channel, the channel
may be forcibly closed, but only after the promise has been failed. That
means continuations attached to the promise may see the channel in an
inconsistent state of still being open and active.
Modifications:
Move the satisfaction of the promise to after the channel cleanup logic
runs.
Result:
Listeners attached to the future that resulted in a Failed write will
see the stream channel in the correct state.
Motivation:
We can store the NativeDatagramPacketArray directly in the EpollEventLoop. This removes the need of using FastThreadLocal.
Modifications:
- Store NativeDatagramPacketArray directly in the EpollEventLoop (just as we do with IovArray as well).
Result:
Less FastThreadLocal usage and more consistent code.
Motivation:
We did not handle the case when the query was cancelled which could lead to an exhausted id space. Beside this we did not not cancel the timeout on failed promises.
Modifications:
- Do the removal of the id from the manager in a FutureListener so its handled in all cases.
- Cancel the timeout whenever the original promise was full-filled.
Result:
Fixes https://github.com/netty/netty/issues/8013.
Motivation:
Unpooled.unmodifiableBuffer() is currently used to efficiently write
arrays of ByteBufs via FixedCompositeByteBuf, but involves an allocation
and content-copy of the provided ByteBuf array which in many (most?)
cases shouldn't be necessary.
Modifications:
Modify the internal FixedCompositeByteBuf class to support wrapping the
provided ByteBuf array directly. Control this behaviour with a
constructor flag and expose the "unsafe" version via a new
Unpooled.wrappedUnmodifiableBuffer(ByteBuf...) method.
Result:
Less garbage on IO paths. I would guess pretty much all existing usage
of unmodifiableBuffer() could use the copy-free version but assume it's
not safe to change its default behaviour.
Motivation:
I'm not sure if trivial changes like this are interesting :-) But I
noticed that the PlatformDependent.maxDirectMemory0() method is called
twice unnecessarily during static initialization (on the default path at
least).
Modifications:
Use constant MAX_DIRECT_MEMORY already set to the same value instead of
calling maxDirectMemory0() again.
Result:
A surely imperceivable reduction in operations performed at startup.
Motivation:
The HTTP/2 spec dictates that invalid pseudo-headers should cause the
request/response to be treated as malformed (8.1.2.1), and the recourse
for that is to treat the situation as a stream error of type
PROTOCOL_ERROR (8.1.2.6). However, we're treating them as a connection
error with the connection being immediately torn down and the HPACK
state potentially being corrupted.
Modifications:
The HpackDecoder now throws a StreamException for validation failures
and throwing is deffered until the end of of the decode phase to ensure
that the HPACK state isn't corrupted by returning early.
Result:
Behavior more closely aligned with the HTTP/2 spec.
Fixes#8043.
Motivation:
Whenever we fail the query we should also remove the id from the DnsQueryContextManager.
Modifications:
Remove the id from the DnsQueryContextManager if we fail the query because the channel failed to become active.
Result:
More correct code.
Motivation:
ProxyHandlerTest package uses deprecated methods SslContext.newServerContext and
SslContext.newClientContext.
Modifications:
SslContextBuilder is used to build server and client SslContext.
Result:
Less deprecated method in the code.
Motivation:
Implementation of WebSocketUtil/randomNumber is incorrect and might violate
the API returning values > maximum specified.
Modifications:
* WebSocketUtil/randomNumber is reimplemented, the idea of the solution described
in the comment in the code
* Implementation of WebSocketUtil/randomBytes changed to nextBytes method
* PlatformDependet.threadLocalRandom is used instead of Math.random to improve efficiency
* Added test cases to check random numbers generator
* To ensure corretness, we now assert that min < max when generating random number
Result:
WebSocketUtil/randomNumber always produces correct result.
Covers https://github.com/netty/netty/issues/8023
Motivation:
`ProxyHandlerTest` relies on random values to run tests: first to
shuffle collection of test items and lately to set configuration
flag for `AUTO_READ`. While the purpose of randomization is clear,
it's still impossible to reproduce the same sequence of test cases
when something went wrong. For `AUTO_READ` it's even impossible
to tell what flag was set when the particular test failed.
Modifications:
* Test runner now log seed values that was used for shuffling,
so you can take one and put in your tests to "freeze" them
while debugging (pretty common approach with randomized tests)
* `SuccessItemTest` is split into 2 different use cases:
for AUTO_READ flag set to "on" and "off"
Result:
You can reproduce specific tests results now.
Motiviation:
During profiling it showed that a lot of time during the handshake is spent by parsing the key / chain over and over again. We should cache these parsed structures if possible to reduce the overhead during handshake.
Modification:
- Use new APIs provided by https://github.com/netty/netty-tcnative/pull/360.
- Introduce OpensslStaticX509KeyManagerFactory which allows to wrap another KeyManagerFactory and caches the key material provided by it.
Result:
In benchmarks handshake times have improved by 30 %.
Motivation:
The usage of Invocation level for JMH fixture methods (setup/teardown) inccurs in a significant overhead
in the benchmark time (see org.openjdk.jmh.annotations.Level documentation).
In the case of CodecInputListBenchmark, benchmarks are far too small (less than 50ns) and the Invocation
level setup offsets the measurement considerably.
On such cases, the recommended fix patch is to include the setup/teardown code in the benchmark method.
Modifications:
Include the setup/teardown code in the relevant benchmark methods.
Remove the setup/teardown methods from the benchmark class.
Result:
We run the entire benchmark 10 times with default parameters we observed:
- ArrayList benchmark affected directly by JMH overhead is now from 15-80% faster.
- CodecList benchmark is now 50% faster than original (even with the setup code being measured).
- Recyclable ArrayList is ~30% slower.
- All benchmarks have significant different means (ANOVA) and medians (Moore)
Mode: Throughput (Higher the better)
Method Full params Factor Modified (Median) Original (Median)
recyclableArrayList (elements = 1) 0.615520967 21719082.75 35285691.2
recyclableArrayList (elements = 4) 0.699553431 17149442.76 24514843.31
arrayList (elements = 4) 1.152666631 27120407.18 23528404.88
codecOutList (elements = 1) 1.527275908 67251089.04 44033359.47
codecOutList (elements = 4) 1.596917095 59174088.78 37055204.03
arrayList (elements = 1) 1.878616889 62188238.24 33103204.06
Environment:
Tests run on a Computational server with CPU: E5-1660-3.3GHZ Â (6 cores + HT), 64 GB RAM.
Motivation:
The usage of Invocation level for JMH fixture methods (setup/teardown) inccurs in a significant impact in
in the benchmark time (see org.openjdk.jmh.annotations.Level documentation).
When the benchmark and the setup/teardown is too small (less than a milisecond) the Invocation level might saturate the system with
timestamp requests and iteration synchronizations which introduce artificial latency, throughput, and scalability bottlenecks.
In the HeadersBenchmark, all benchmarks take less than 100ns and the Invocation level setup offsets the measurement considerably.
As fixture methods is defined for the entire class, this overhead also impacts every single benchmark in this class, not only
the ones that use the emptyHttpHeaders object (cleaned in the setup).
The recommended fix patch here is to include the setup/teardown code in the benchmark where the object is used.
Modifications:
Include the setup/teardown code in the relevant benchmark methods.
Remove the setup/teardown method of Invocation level from the benchmark class.
Result:
We run all benchmarks from HeadersBenchmark 10 times with default parameter, we observe:
- Benchmarks that were not directly affected by the fix patch, improved execution time.
For instance, http2Remove with (exampleHeader = THREE) had its median reported as 2x faster than the original version.
- Benchmarks that had the setup code inserted (eg. http2AddAllFastest) did not suffer a significant punch in the execution time,
as the benchmarks are not dominated by the clear().
Environment:
Tests run on a Computational server with CPU: E5-1660-3.3GHZ Â (6 cores + HT), 64 GB RAM.
Motivation:
We deviate from the AbstractChannel implementation on deregistration by
failing the provided promise if the channel is already deregistered. In
contrast, AbstractChannel will always set the promise to successfully
done.
Modification:
Change the
Http2MultiplexCodec.DefaultHttp2StreamChannel.Http2ChannelUnsafe to
always set the promise provided to deregister as done as is the
case in AbstractChannel.
Motivation:
If a wrapped cookie value with an invalid charcater is passed to the strict
encoder, an exception is thrown on validation but the error message contains
a character at the wrong position.
Modifications:
Print `unwrappedValue.charAt(pos)` instead of `value.charAt(pos)`.
Result:
The exception indicates the correct invalid character in the unwrapped cookie.
Motivation:
Eliminate avoidable backing array reallocations when constructing
composite ByteBufs from existing buffer arrays/Iterables. This also
applies to the Unpooled.wrappedBuffer(...) methods.
Modifications:
Ensure the initial components ComponentList is sized at least as large
as the provided buffer array/Iterable in the CompositeByteBuffer
constructors.
In single-arg Unpooled.wrappedBuffer(...) methods, set maxNumComponents
to the count of provided buffers, rather than a fixed default of 16. It
seems likely that most usage of these involves wrapping a list without
subsequent modification, particularly since they return a ByteBuf rather
than CompositeByteBuf. If a different/larger max is required there are
already the wrappedBuffer(int, ...) variants.
In fact the current behaviour could be considered inconsistent - if you
call Unpooled.wrappedBuffer(int, ByteBuf) with a single buffer, you
might expect to subsequently be able to add buffers to it (since you
specified a max related to consolidation), but it will in fact return
just a slice of the provided ByteBuf.
Result:
Fewer and smaller allocations in some cases when using CompositeByteBufs
or Unpooled.wrappedBuffer(...).
Motivation:
Some of the cipher protocol combos that were used are no longer included in more recent OpenSSL releases.
Modifications:
Remove some combos that were used for testing.
Result:
Tests also pass in more recent OpenSSL versions (1.1.0+).
Motivation:
This reverts commit 4b728cd5bc as it was fixes in Java 11 ea+17.
Modification:
Revert previous added workaround as this is fixed in Java 11 now.
Result:
No more workaround for test included.
Motivation:
netty-tcnative 2.0.9 did not contain all native code for boringssl due a release mistake.
Modifications:
Update to 2.0.10
Result:
Use latest netty-tcnative release.
Motivation:
netty-tcnative 2.0.9.Final was released which fixes a memory leak that can happen if client auth is used via client side.
Modifications:
Update to latest netty-tcnative.
Result:
No more memory leak.
Motivation:
Epoll and Kqueue channels have internal state which forces
a single read operation after channel construction. This
violates the Channel#read() interface which indicates that
data shouldn't be delivered until this method is called.
The behavior is also inconsistent with the NIO transport.
Modifications:
- Epoll and Kqueue shouldn't unconditionally read upon
initialization, and instead should rely upon Channel#read()
or auto_read.
Result:
Epoll and Kqueue are more consistent with NIO.
Motivation:
There is an inconsistency between the order of events in the
StreamChannel implementation in Http2MultiplexCodec and other Channel
implementations that extend AbstractChannel where channelInactive and
channelUnregistered events are not performed 'later'. This can cause an
unexected order of events for ChannelHandler implementations that call
Channel.close() in response to some event.
Modification:
The Http2MultiplexCodec.DefaultHttp2StreamChannel.Http2ChannelUnsafe was
modified to bounce the deregistration and channelInactive events through
the parent channels EventLoop.
Result:
Stream events are now in the proper order.
Fixes#8018.
Motivation
There is a cost to concatenating strings and calling methods that will be wasted if the Logger's level is not enabled.
Modifications
Check if Log level is enabled before producing log statement. These are just a few cases found by RegEx'ing in the code.
Result
Tiny bit more efficient code.
Motivation:
Currently there is not a clear way to provide a byte array to a netty
ByteBuf and be informed when it is released. This is a would be a
valuable addition for projects that integrate with netty but also pool
their own byte arrays.
Modification:
Modified the UnpooledHeapByteBuf class so that the freeArray method is
protected visibility instead of default. This will allow a user to
subclass the UnpooledHeapByteBuf, provide a byte array, and override
freeArray to return the byte array to a pool when it is called.
Additionally this makes this implementation equivalent to
UnpooledDirectByteBuf (freeDirect is protected).
Additionally allocateArray is also made protect to provide another override
option for subclasses.
Result:
Users can override UnpooledHeapByteBuf#freeArray and
UnpooledHeapByteBuf#allocateArray.
Motivation:
Http2MultiplexCodec doesn't currently have an API for using the response
of a h2c upgrade request.
Modifications:
Add a new API to the Http2MultiplexCodecBuilder which allows for setting
an upgrade handler and wire it into the Http2MultiplexCodec
implementation.
Result:
When using the Http2MultiplexCodec with h2c upgrades the upgrade handler
will get added to the Http2StreamChannel which represents the
half-closed (local) response of stream 1. It is then up to the user to
manage the transition from the IO channel pipeline configuration
necessary for making the h2c upgrade request to a form where it can read
the response from the new stream channel.
Fixes#7947.
Motivation:
The `ByteBuffer emptyPingBuf()` method of Http2CodecUtils is has been dead
code since DefaultHttp2PingFrame switched from using a ByteBuf to represent
the 8 octets to a long.
Modifications:
Remove the method and the unused static ByteBuf.
Result:
Less dead code.
Fixes#8002
Motivation:
When using conscrypt some NPEs were logged, these were fixed in the latest release.
Modifications:
Update to conscrypt 1.1.3.
Result:
Fixes https://github.com/netty/netty/issues/7988.