Motivation:
In #10630, field substitutions were introduced for NetUtil.LOCALHOST4, NetUtil.LOCALHOST6 and NetUtil.LOCALHOST fields. They were required to allow a native image be built with most of Netty (including NetUtil) initialized at build time.
The substitutions created in #10630 only define getters, so the 3 fields can only be read in a native image.
But when NetUtil is initialized at run-time (this is what happens in #10797), its static initialization block is executed, and this block writes to all 3 fields. As the substitutions do not provide any setters, field stores are not valid, and such builds fail.
Modifications:
- Add netty-testsuite-native-image-client-runtime-init Maven module that builds a native image deferring NetUtil class initialization till run-time; this module is used to demonstrate the problem and verify the problem is gone with the fix
- Add no-op setters to substitutions for NetUtil.LOCALHOST4, NetUtil.LOCALHOST6 and NetUtil.LOCALHOST
Result:
A native image initializing NetUtil at run-time builds successfully.
Fixes#10797
Motivation:
According to specification 1006 status code must not be set as a status code in a
Close control frame by the endpoint. However 1006 status code can be
used in applications to indicate that the connection was closed abnormally.
Modifications:
- Enforce status code validation in CloseWebSocketFrame
- Add WebSocketCloseStatus construction with disabled validation
- Add test
Result:
Fixes#10838
Motivation:
We should use aarch_64 in our classifier / jni libname on aarch64 as os.detected.arch uses the name. Being non consistent (especially across our different projects) already gave us a lot of trouble in the past.
Let's fix this once for all.
Modifications:
Use aarch_64
Result:
More consistent classifier usage on aarch64
Motivation:
We should add `state` in the exception message of `HttpObjectEncoder` because it makes debugging a little easier.
Modification:
Added `state` in the exception message.
Result:
Better exception message for smooth debugging.
Motivation:
Internally SSLEngineImpl.wrap(...) may call FileInputStream.read(...).
This will cause the error below when BlockHound is enabled
reactor.blockhound.BlockingOperationError: Blocking call! java.io.FileInputStream#readBytes
at java.io.FileInputStream.readBytes(FileInputStream.java)
at java.io.FileInputStream.read(FileInputStream.java:255)
Modifications:
- Add whitelist entry to BlockHound configuration
- Add test
Result:
Fixes#10837
Motivation:
If Log4J2's `Filter` creates `Recycler.Stack` somehow, `Recycler.Stack()` will see uninitialized `Recycler.INITIAL_CAPACITY`. This has been raised originally in https://github.com/micrometer-metrics/micrometer/issues/2369.
Modification:
This PR changes to initialize `Recycler.INITIAL_CAPACITY` before invoking `InternalLogger.debug()` to avoid it.
Result:
Fixes the problem described in the "Motivation" section.
Motivation:
We rely on this functionality in PoolChunk, and a bug was caught by a non-deterministic test failure
Modification:
Went back to the Algorithms book, and reimplemented remove() the way it was meant to.
Result:
No test failures after 200.000 runs, so we have some confidence the code is correct now.
Motivation:
Http2ConnectionHandler tries to addListener to the future without checking if it's void. If it is void, this will fail and generate an exception.
Modifications:
Unvoid the promise in writeData()
Result:
Fixes#10816, Writing with a voidPromise no longer generates exceptions.
Motivation:
The uncached access to PoolChunk can be made faster, and avoid allocating boxed Longs, if we have a primitive hash map and priority queue implementation for it.
Modification:
Add bespoke primitive implementations of a hash map and a priority queue for PoolChunk.
Remove all the long-boxing caused by the previous implementation.
The hashmap is a linear probing map with a fairly short probe that keeps the search within a couple of cache lines.
The priority queue is the same binary heap algorithm that's described in Algorithms by Sedgewick and Wayne.
The implementation avoids the Long boxing by relying on a long[] array.
This makes the internal-remove method faster, which is an important operation in PoolChunk.
Result:
Roughly 13% performance uplift in buffer allocations that miss cache.
Motivation:
The DnsNameResolver internally follows CNAME indirects for all records types, and supports caching for CNAME resolution and A* records. For DNS record types that are not cached (e.g. SRV records) the caching of CNAME records may result in failures at incorrect times. For example if a CNAME record has a larger TTL than the entries it resolves this may result in failures which don't occur if the CNAME cache is disabled.
Modifications:
- Don't cache CNAME and also dont use the cache for CNAME when using DnsRecordResolveContext
- Add unit test
Result:
More correct resolving and also not possible to have failures due CNAME still be in the cache while the queried record experied
Motivation:
https://github.com/netty/netty/pull/10267 introduced a change that reduced the fragmentation. Unfortunally it also introduced a regression when it comes to caching of normal allocations. This can have a negative performance impact depending on the allocation sizes.
Modifications:
- Fix algorithm to calculate the array size for normal allocation caches
- Correctly calculate indeox for normal caches
- Add unit test
Result:
Fixes https://github.com/netty/netty/issues/10805
Motivation:
We can make use of internalNioBuffer(...) if we cant access the memoryAddress. This at least will reduce the object creations.
Modifications:
Use internalNioBuffer(...) and so reduce the GC
Result:
Less object creation if we can't access the memory address.
Motivation:
We need to carefully check for null before we pass the cumulation buffer into decodeLast as callDecode(...) may have removed the codec already and so set cumulation to null.
Modifications:
- Check for null and if we see null use Unpooled.EMPTY_BUFFEr
- Only call decodeLast(...) if callDecode(...) didnt remove the handler yet.
Result:
Fixes https://github.com/netty/netty/issues/10802
Motivation:
GlobalEventExecutor.addTask was rightfully allowed to block by commit
09d38c8. However the same should have been done for
SingleThreadEventExecutor.addTask.
BlockHound is currently intercepting that call, and as a consequence,
it prevents SingleThreadEventExecutor from working properly, if addTask is
called from a thread that cannot block.
The interception is due to LinkedBlockingQueue.offer implementation,
which uses a ReentrantLock internally.
Modifications:
* Added one BlockHound exception to
io.netty.util.internal.Hidden.NettyBlockHoundIntegration for
SingleThreadEventExecutor.addTask.
* Also added unit tests for both SingleThreadEventExecutor.addTask
and GlobalEventExecutor.addTask.
Result:
SingleThreadEventExecutor.addTask can now be invoked from any thread
when BlockHound is activated.
Motivation:
In some enviroments sun.misc.Unsafe is not present. We should support these as well.
Modifications:
Fallback to JNI if we can't directly access the memoryAddress of the buffer.
Result:
Fixes https://github.com/netty/netty/issues/10813
Motivation:
When a HashedWheelTimer instance is started or stopped, its working
thread is started or stopped. These operations block the calling
thread:
- start() calls java.util.concurrent.CountDownLatch.await() to wait
for the worker thread to finish initializing;
- stop() calls java.lang.Thread.join(long) to wait for the worker
thread to exit.
BlockHound detects these calls and as a consequence, prevents
HashedWheelTimer from working properly, if it is started or stopped
in a thread that is not allowed to block.
Modifications:
Added two more BlockHound exceptions to
io.netty.util.internal.Hidden.NettyBlockHoundIntegration: one
for HashedWheelTimer.start() and one for HashedWheelTimer.stop().
Result:
HashedWheelTimer can now be started and stopped properly when
BlockHound is activated.
Motivation:
ABORT and COMMIT commands were missing from the enum but they are part of the STOMP spec.
Modifications:
Modified the enum to add the missing commands.
Result:
ABORT and COMMIT commands can now be parsed properly and acted on.
Motivation:
According rfc (https://tools.ietf.org/html/rfc6455#section-11.3.4), `Sec-WebSocket-Protocol` header field MUST NOT appear
more than once in an HTTP response.
At the moment we can pass `Sec-WebSocket-Protocol` via custom headers and it will be added to response.
Modification:
Change method add() to set() for avoid duplication. If we pass sub protocols in handshaker constructor it means that they are preferred over custom ones.
Result:
Less error prone behavior.
Motivation:
People may use the object serialisation example as a vehicle to test out sending their own objects across the wire.
If those objects are not actually serialisable for some reason, then we need to let the exception propagate so that this becomes obvious to people.
Modification:
Add a listener to the future that sends the first serialisable message, so that we ensure that any exceptions that shows up during serialisation becomes visible.
Without this, the state of the future that sent the first message was never checked or inspected anywhere.
Result:
Serialisation bugs in code derived from the Object Echo example are much easier to diagnose.
This fixes#10777
Fix issue #10508 where PARANOID mode slow down about 1000 times compared to ADVANCED.
Also fix a rare issue when internal buffer was growing over a limit, it was partially discarded
using `discardReadBytes()` which causes bad changes within previously discovered HttpData.
Reasons were:
Too many `readByte()` method calls while other ways exist (such as keep in memory the last scan position when trying to find a delimiter or using `bytesBefore(firstByte)` instead of looping externally).
Changes done:
- major change on way buffer are parsed: instead of read byte per byte until found delimiter, try to find the delimiter using `bytesBefore()` and keep the last unfound position to skeep already parsed parts (algorithms are the same but implementation of scan are different)
- Change the condition to discard read bytes when refCnt is at most 1.
Observations using Async-Profiler:
==================================
1) Without optimizations, most of the time (more than 95%) is through `readByte()` method within `loadDataMultipartStandard` method.
2) With using `bytesBefore(byte)` instead of `readByte()` to find various delimiter, the `loadDataMultipartStandard` method is going down to 19 to 33% depending on the test used. the `readByte()` method or equivalent `getByte(pos)` method are going down to 15% (from 95%).
Times are confirming those profiling:
- With optimizations, in SIMPLE mode about 82% better, in ADVANCED mode about 79% better and in PARANOID mode about 99% better (most of the duplicate read accesses are removed or make internally through `bytesBefore(byte)` method)
A benchmark is added to show the behavior of the various cases (one big item, such as File upload, and many items) and various level of detection (Disabled, Simple, Advanced, Paranoid). This benchmark is intend to alert if new implementations make too many differences (such as the previous version where about PARANOID gives about 1000 times slower than other levels, while it is now about at most 10 times).
Extract of Benchmark run:
=========================
Run complete. Total time: 00:13:27
Benchmark Mode Cnt Score Error Units
HttpPostMultipartRequestDecoderBenchmark.multipartRequestDecoderBigAdvancedLevel thrpt 6 2,248 ± 0,198 ops/ms
HttpPostMultipartRequestDecoderBenchmark.multipartRequestDecoderBigDisabledLevel thrpt 6 2,067 ± 1,219 ops/ms
HttpPostMultipartRequestDecoderBenchmark.multipartRequestDecoderBigParanoidLevel thrpt 6 1,109 ± 0,038 ops/ms
HttpPostMultipartRequestDecoderBenchmark.multipartRequestDecoderBigSimpleLevel thrpt 6 2,326 ± 0,314 ops/ms
HttpPostMultipartRequestDecoderBenchmark.multipartRequestDecoderHighAdvancedLevel thrpt 6 1,444 ± 0,226 ops/ms
HttpPostMultipartRequestDecoderBenchmark.multipartRequestDecoderHighDisabledLevel thrpt 6 1,462 ± 0,642 ops/ms
HttpPostMultipartRequestDecoderBenchmark.multipartRequestDecoderHighParanoidLevel thrpt 6 0,159 ± 0,003 ops/ms
HttpPostMultipartRequestDecoderBenchmark.multipartRequestDecoderHighSimpleLevel thrpt 6 1,522 ± 0,049 ops/ms
Motivation:
`DelegatingDecompressorFrameListener#initDecompressor` has multiple dots `.` in comments. However, it should not have that.
Modification:
Removed multiple dots.
Result:
Clean comment
Motivation:
Passing a null value of byte[] to the `Unsafe.copyMemory(xxx)` would cause the JVM crash
Modification:
Add null checking before calling `PlatformDependent.copyMemory(src, xxx)`
Result:
Fixes#10791 .
Motivation:
https in xmlns URIs does not work and will let the maven release plugin fail:
```
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 1.779 s
[INFO] Finished at: 2020-11-10T07:45:21Z
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-release-plugin:2.5.3:prepare (default-cli) on project netty-parent: Execution default-cli of goal org.apache.maven.plugins:maven-release-plugin:2.5.3:prepare failed: The namespace xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" could not be added as a namespace to "project": The namespace prefix "xsi" collides with an additional namespace declared by the element -> [Help 1]
[ERROR]
```
See also https://issues.apache.org/jira/browse/HBASE-24014.
Modifications:
Use http for xmlns
Result:
Be able to use maven release plugin
Motivation:
`HttpConversionUtil#toFullHttpResponse` and `HttpConversionUtil#toFullHttpRequest` has `ByteBufAllocator` which is used for building `FullHttpMessage` and then data can be appended with `FullHttpMessage#content`. However, there can be cases when we already have `ByteBuf` ready with data. So we need a parameter to add `ByteBuf` directly into `FullHttpMessage` while creating it.
Modification:
Added `ByteBuf` parameter,
Result:
More functionality for handling `FullHttpMessage` content.
Motivation:
Sometimes it would be helpful to easily detect if an operation failed due the SSLEngine already be closed.
Modifications:
Add special exception that is used when the engine was closed
Result:
Easier to detect a failure caused by a closed exception
Motivation:
Printing download progress in the build log makes it harder to see what's wrong when the build fails.
Modification:
Change the maven command to not show transfer progress, also enable batch mode so Maven don't print in colors that we can't see anyway.
Result:
More concise code analysis build logs.
Motivation:
We recently released a new netty-build version and changed the artifact name
Modifications:
Update version and artifact name
Result:
Use latest version
Motivation:
We should have a method to add `HttpScheme` if `HttpRequest` does not contain `x-http2-scheme` then we should use add it if `HttpToHttp2ConnectionHandler` is build using specified `HttpScheme`.
Modification:
Added `HttpScheme` in `HttpToHttp2ConnectionHandlerBuilder`.
Result:
Automatically add `HttpScheme` if missing in `HttpRequest`.
Motivation:
When parsing HEADERS, connection errors can occur (e.g., too large of
headers, such that we don't want to HPACK decode them). These trigger a
GOAWAY with a last-stream-id telling the client which streams haven't
been processed.
Unfortunately that last-stream-id didn't include the stream for the
HEADERS that triggered the error. Since clients are free to silently
retry streams not included in last-stream-id, the client is free to
retransmit the request on a new connection, which will fail the
connection with the wrong last-stream-id, and the client is still free
to retransmit the request.
Modifications:
Have fatal connection errors (those that hard-cut the connection)
include all streams in last-stream-id, which guarantees the HEADERS'
stream is included and thus should not be silently retried by the HTTP/2
client.
This modification is heavy-handed, as it will cause racing streams to
also fail, but alternatives that provide precise last-stream-id tracking
are much more invasive. Hard-cutting the connection is already
heavy-handed and so is rare.
Result:
Fixes#10670
Motivation:
We received a [bug report](https://bugs.chromium.org/p/chromium/issues/detail?id=1143320) from the Chrome team at Google, their canary builds are failing [HTTP/2 GREASE](https://tools.ietf.org/html/draft-bishop-httpbis-grease-00) testing to netflix.com.
The reason it's failing is that Netty can't handle unknown frames without an active stream created. Let me know if you'd like more info, such as stack traces or repro steps.
Modification:
The change is minor and simply ignores unknown frames on the connection stream, similarly to `onWindowUpdateRead`.
Result:
I figured I would just submit a PR rather than filing an issue, but let me know if you want me to do that for tracking purposes.
Motivation:
JUnit 5 is the new hotness. It's more expressive, extensible, and composable in many ways, and it's better able to run tests in parallel. But most importantly, it's able to directly run JUnit 4 tests.
This means we can update and start using JUnit 5 without touching any of our existing tests.
I'm also introducing a dependency on assertj-core, which is like hamcrest, but arguably has a nicer and more discoverable API.
Modification:
Add the JUnit 5 and assertj-core dependencies, without converting any tests at time time.
Result:
All our tests are now executed through the JUnit 5 Vintage Engine.
Also, the JUnit 5 test APIs are available, and any JUnit 5 tests that are added from now on will also be executed.
Motivation:
Allowing null handlers allows for more convenient idioms in
conditionally adding handlers, e.g.,
ch.pipeline().addLast(
new FooHandler(),
condition ? new BarHandler() : null,
new BazHandler()
);
Modifications:
* Change addFirst(..) and addLast(..) to skip null handlers, rather than
break or short-circuit.
* Add new unit tests.
Result:
* Makes addFirst(..) and addLast(..) behavior more consistent
* Resolves https://github.com/netty/netty/issues/10728
Motivation:
PoolChunk maintains multiple PriorityQueue<Long> collections. The usage
of PoolChunk#removeAvailRun unboxes the Long values to long, and then
this method uses queue.remove(..) which will auto box the value back to
Long. This creates unnecessary allocations via Long.valueOf(long).
Modifications:
- Adjust method signature and usage of PoolChunk#removeAvailRun to avoid
boxing
Result:
Less allocations as a result of PoolChunk#removeAvailRun.