Motivation:
When create NormalMemoryRegionCache for PoolThreadCache, we overbooked
cache array size. This means unnecessary overhead for thread local cache
as we will create multi cache enties for each element in cache array.
Modifications:
change:
int arraySize = Math.max(1, max / area.pageSize);
to:
int arraySize = Math.max(1, log2(max / area.pageSize) + 1);
Result:
Now arraySize won't introduce unnecessary overhead.
Changes to be committed:
modified: buffer/src/main/java/io/netty/buffer/PoolThreadCache.java
Motivation:
CompositeByteBuf has an iterator() method but fails to implement Iterable
Modifications:
Let CompositeByteBuf implement Iterable<ByteBuf>
Result:
Easier usage
Motivation:
The ReplayingDecoderBuffer does not match the naming scheme we use for ByteBuf types.
Modifications:
Rename to ReplayingDecoderByteBuf to match naming scheme
Result:
Consistent naming
Motivation:
We missed to dereference the chunk and tmpNioBuf when calling deallocate(). This means the GC can not collect these as we still hold a reference while have the PooledByteBuf in the recycler stack.
Modifications:
Dereference chunk and tmpNioBuf.
Result:
GC can collect things.
Motivation:
We missed to flush the channel when using HttpChunkedInput (this is done when using SSL). This will result in a stale.
Modifications:
Replace ctx.write(...) with ctx.writeAndFlush(...)
Result:
Correctly working example.
Motiviation:
At the moment we use FIFO for the PoolThreadCache which is sub-optimal as this may reduce the changes to have the cached memory actual still in the cpu-cache.
Modification:
- Change to use LIFO as this increase the chance to be able to serve buffers from the cpu-cache
Results:
Faster allocation out of the ThreadLocal cache.
Before the commit:
[xxx wrk]$ ./wrk -H 'Connection: keep-alive' -d 120 -c 256 -t 16 -s scripts/pipeline-many.lua http://xxx:8080/plaintext
Running 2m test @ http://xxx:8080/plaintext
16 threads and 256 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 14.69ms 10.06ms 131.43ms 80.10%
Req/Sec 283.89k 40.37k 433.69k 66.81%
533859742 requests in 2.00m, 72.09GB read
Requests/sec: 4449510.51
Transfer/sec: 615.29MB
After the commit:
[xxx wrk]$ ./wrk -H 'Connection: keep-alive' -d 120 -c 256 -t 16 -s scripts/pipeline-many.lua http://xxx:8080/plaintext
Running 2m test @ http://xxx:8080/plaintext
16 threads and 256 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 16.38ms 26.32ms 734.06ms 97.38%
Req/Sec 283.86k 39.31k 361.69k 83.38%
540836511 requests in 2.00m, 73.04GB read
Requests/sec: 4508150.18
Transfer/sec: 623.40MB
Motivation:
Now that we have a CharObjectHashMap, we should change Http2Settings to use it.
Modifications:
Changed Http2Settings to extend CharObjectHashMap rather than IntObjectHashMap.
Result:
Http2Settings uses less memory to store keys.
Motivation:
To support HTTP2 we need APLN support. This was not provided before when using OpenSslEngine, so SSLEngine (JDK one) was the only bet.
Beside this CipherSuiteFilter was not supported
Modifications:
- Upgrade netty-tcnative and make use of new features to support ALPN and NPN in server and client mode.
- Guard against segfaults after the ssl pointer is freed
- support correctly different failure behaviours
- add support for CipherSuiteFilter
Result:
Be able to use OpenSslEngine for ALPN / NPN for server and client.
Motivation:
We've removed access to the activeStreams collection, we should do the same for the children of a stream to provide a consistent interface.
Modifications:
Moved Http2StreamVisitor to a top-level interface. Removed unnecessary child operations from the Http2Stream interface so that we no longer require a map structure.
Result:
Cleaner and more consistent interface for iterating over child streams.
Motivation:
Currently we have IntObjectMap/HashMap, but it will be useful to support other primitive-based maps.
Modifications:
Moved the code int the current maps to template files and run Groovy code from common/pom.xml to apply the templates.
Result:
Autogeneration of int and char-based hash maps.
Motivation:
Too many duplicated code of tests for different compression codecs.
Modifications:
- Added abstract classes AbstractCompressionTest, AbstractDecoderTest and AbstractEncoderTest which contains common variables and tests for any compression codec.
- Removed common tests which are implemented in AbstractDecoderTest and AbstractEncoderTest from current tests for compression codecs.
- Implemented abstract methods of AbstractDecoderTest and AbstractEncoderTest in current tests for compression codecs.
- Added additional checks for current tests.
- Renamed abstract class IntegrationTest to AbstractIntegrationTest.
- Used Theories to run tests with head and direct buffers.
- Removed code duplicates.
Result:
Removed duplicated code of tests for compression codecs and simplified an addition of tests for new compression codecs.
Motivation:
The Http2Connection interface exposes an activeStreams() method which allows direct iteration over the underlying collection. There are a few places that make copies of this collection to avoid modification while iterating, and a few places that do not make copies. The copy operation can be expensive on hot code paths and also we are not consistently iterating over the activeStreams collection.
Modifications:
- The Http2Connection interface should reduce the exposure of the underlying collection and just expose what is necessary for the interface to function. This is just a means to iterate over the collection.
- The DefaultHttp2Connection should use this new interface and protect it's internal state while iteration is occurring.
Result:
Reduction in surface area of the Http2Connection interface. Consistent iteration of the set of active streams. Concurrent modification exceptions are handled in 1 encapsulated spot.
Motivation:
1) The current implementation doesn't allow for HEADERS, DATA, PING, PRIORITY and SETTINGS
frames to be sent after GOAWAY.
2) When receiving or sending a GOAWAY frame, all streams with ids greater than the lastStreamId
of the GOAWAY frame should be closed. That's not happening.
Modifications:
1) Allow sending of HEADERS and DATA frames after GOAWAY for streams with ids < lastStreamId.
2) Always allow sending PING, PRIORITY AND SETTINGS frames.
3) Allow sending multiple GOAWAY frames with decreasing lastStreamIds.
4) After receiving or sending a GOAWAY frame, close all streams with ids > lastStreamId.
Result:
The GOAWAY handling is more correct.
Motivation:
There are methods to manipulate the prioritzable count for streams which have the '0' postfix which are designed to be used during recursion. However these methods are calling out to an external method without the '0' during the recursive process. This is doing uneccessary conditional checks during recursion.
Modifications:
Change the decrementPrioritizableForTree to decrementPrioritizableForTree0 while in recursive method.
Change the incrementPrioritizableForTree to incrementPrioritizableForTree0 while in recursive method.
Result:
Less overhead during recursive calls.
Motiviation:
The interface provided by Http2LifecycleManager is not clear as to how the writeXXX methods should behave. The implementation of this interface from the Http2ConnectionHandler's perspecitve is unclear what writeXXX means in this context.
Modifications:
- Method names in Http2LifecycleManager and Http2ConnectionHandler should be renamed and comments should clarify the interfaces.
Results:
Http2LifecycleManager is more clear and Http2ConnectionHandler's implementation makes sense w.r.t to return values.
In TrafficCounter, a recent change makes the contract of the API (the
constructor) wrong and lead to issue with GlobalChannelTrafficCounter
where executor must be null.
Motivation:
TrafficCounter executor argument in constructor might be null, as
explained in the API, for some particular cases where no executor are
needed (relevant tasks being taken by the caller as in
GlobalChannelTrafficCounter).
A null pointer exception is raised while it should not since it is
legal.
Modifications:
Remove the 2 null checking for this particular attribute.
Note that when null, the attribute is not reached nor used (a null
checking condition later on is applied).
Result:
No more null exception raized while it should not.
This shall be made also to 4.0, 4.1 (present) and master. 3.10 is not
concerned.
Motivation:
The HTTP/2 headers code should be using binary string (currently AsciiString) objects instead of String objects. The DefaultHttp2HeadersEncoder was still using String for sensitiveHeaders.
Modifications:
- Remove the usage of String from DefaultHttp2HeadersEncoder.
- Introduce an interface to determine if a header name/value is sensitive or not to 1. prevent necessarily creating/copying sets. 2. Allow the name/value to be considered when checking if sensitive.
Result:
No more String in DefaultHttp2HeadersEncoder and less required set creation/operations.
Motivation:
The spec requires that a RST_STREAM received on an IDLE stream results in a connection error. This is not happening.
Modifications:
Check for this condition when a RST_STREAM is received in DefaultHttp2ConnectionDecoder.
Result:
More spec compliant. Fixes https://github.com/netty/netty/issues/3573.
Motivation:
The DefaultHttp2ConnectionDecoder has the setPriority call after the Http2FrameListener is notified of the change. The setPriority call has additional verification logic and may even create the dependency stream and so it must be before the Http2FrameListener is notified.
Modifications:
The DefaultHttp2ConnectionDecoder should treat the setPriority call in the same for the HEADERS and PRIORITY frame (call it before notifying the listener).
Result:
Http2FrameListener should see correct state when a HEADERS frame has a stream dependency that has not yet been created yet. Fixes https://github.com/netty/netty/issues/3572.
Motivation:
We are allocating a hash map for every HTTP2 Stream to store it's children.
Most streams are leafs in the priority tree and don't have children.
Modification:
- Only allocate children when we actually use them.
- Make EmptyIntObjectMap not throw a UnsupportedOperationException on remove, but return null instead (as is stated in it's javadoc).
Result:
Fewer unnecessary allocations.
Motivation:
In a simple load test that creates and closes several 10k streams per second
I have seen Iterator objects using roughly 1.6% of the total committed heap.
Modifications:
Use an ArrayList instead of a LinkedHashSet to store the connection listeners.
That way we can iterate over the list without creating an iterator every time.
Result:
Zero Iterator allocations due to notifying connection listeners.
Motivation:
The Http2Settings class currently disallows setting non-standard settings, which violates the spec.
Modifications:
Updated Http2Settings to permit arbitrary settings. Also adjusting the default initial capacity to allow setting all of the standard settings without reallocation.
Result:
Fixes#3560
Related: #3567
Motivation:
SslHandler.channelReadComplete() forgets to call
super.channelReadComplete(), which discards read bytes from the
cumulative buffer. As a result, the cumulative buffer can expand its
capacity unboundedly.
Modifications:
Call super.channelReadComplete() instead of calling
ctx.fireChannelReadComplete()
Result:
Fixes#3567
Motivation:
The HTTP/2 specification allows for closed (and streams in any state) to exist in the priority tree. The current code removes streams from the priority tree as soon as they are closed (subject to the removal policy). This may lead to undesired distribution of resources from the peer's perspective.
Modifications:
- We should only remove streams from the priority tree when they have no descendant streams in a viable state.
- We should track when tree edges change or nodes are removed if inviable nodes can then be removed.
Result:
Priority tree doesn't remove closed streams until descendant are all closed, or there are no descendants.
Motivation:
We're currently using Math.ceil which isn't necessary. We should exchange for a lighter weight operation.
Modifications:
Changing the logic to just ensure that we allocate at least one byte to the child rather than always performing a ceil.
Result:
Slight performance improvement in the priority algorithm.
Motivation:
The Connection.Listener GOAWAY event handler currently provides no additional information, requiring applications to hack in other ways to get at the error code and debug message.
Modifications:
Modified the Connection.Listener interface to pass on the error code and message that triggered the GOAWAY.
Result:
Application can now use Connection.Listener for all GOAWAY processing.
Motivation:
LoggingHandlerTest sometimes failure due to unexpected log messages
logged due to the automatic reclaimation of thread-local objects.
Expectation failure on verify:
Appender.doAppend([DEBUG] Freed 3 thread-local buffer(s) from thread: nioEventLoopGroup-23-0): expected: 1, actual: 0
Appender.doAppend([DEBUG] Freed 9 thread-local buffer(s) from thread: nioEventLoopGroup-23-1): expected: 1, actual: 0
Appender.doAppend([DEBUG] Freed 2 thread-local buffer(s) from thread: nioEventLoopGroup-23-2): expected: 1, actual: 0
Appender.doAppend([DEBUG] Freed 4 thread-local buffer(s) from thread: nioEventLoopGroup-26-0): expected: 1, actual: 0
Appender.doAppend(matchesLog(expected: ".+CLOSE$", got: "[id: 0xembedded, embedded => embedded] CLOSE")): expected: 1, actual: 0
Modifications:
Add the mock appender to the related logger only
Result:
No more intermittent test failures
Related: #3368
Motivation:
ChunkedWriteHandler checks if the return value of
ChunkedInput.isEndOfInput() after calling ChunkedInput.close().
This makes ChunkedStream.isEndOfInput() trigger an IOException, which is
originally triggered by PushBackInputStream.read().
By contract, ChunkedInput.isEndOfInput() should not raise an IOException
even when the underlying stream is closed.
Modifications:
Add a boolean flag that keeps track of whether the underlying stream has
been closed or not, so that ChunkedStream.isEndOfInput() does not
propagate the IOException from PushBackInputStream.
Result:
Fixes#3368
Motivation:
It currently takes a builder for the encoder and decoder, which makes it difficult to decorate them.
Modifications:
Removed the builders from the interfaces entirely. Left the builder for the decoder impl but removed it from the encoder since it's constructor only takes 2 parameters. Also added decorator base classes for the encoder and decoder and made the CompressorHttp2ConnectionEncoder extend the decorator.
Result:
Fixes#3530
Motivation:
The DefaultHttp2RemoteFlowController's priority algorithm doesn't really need to sort the children by weight since it already fairly distributes data based on weight.
Modifications:
Removing the sorting in the priority algorithm and updating one test to allow a small bit of variability in the results.
Result:
Slight improvement on the performance of the priority algorithm.
Motivation:
For some use cases X509ExtendedTrustManager is needed as it allows to also access the SslEngine during validation.
Modifications:
Add support for X509ExtendedTrustManager on java >= 7
Result:
It's now possible to use X509ExtendedTrustManager with OpenSslEngine
Motivation:
The encoder/decoder currently do not handle streams which have previously existed but no longer exist because they were closed. The specification requires supporting this.
Modifications:
- encoder/decoder should tolerate the frame or the dependent frame not existing in the streams map due to the fact that it may have previously existed.
Result:
encoder/decoder are more compliant with the specification.
Motivation:
The DefaultHttp2ConnectionDecoder class is calling verifyPrefaceReceived() for almost every frame event at all times.
The Http2ConnectionHandler class is calling readClientPrefaceString() on every decode event.
Modifications:
- DefaultHttp2ConnectionDecoder should not have to continuously call verifyPrefaceReceived() because it transitions boolean state 1 time for each connection.
- Http2ConnectionHandler should not have to continuously call readClientPrefaceString() because it transitions boolean state 1 time for each connection.
Result:
- Less conditional checks for the mainstream usage of the connection.
Motivation:
The current DefaultHttp2RemoteFlowController's writePendingBytes currently operates in 2 passes. The first allocates bytes and optionally writes some frames. The second pass just loops across all active streams and writes all remaining bytes.
If streams can be removed/added as a side effect of writing (EOS or error) then we need to take more care when the write actually occurs. Moving all of the writes to the second loop (across active streams) is simpler since we can just make a copy of the list and not worry about any restructuring of the priority tree that may result.
Modifications:
Modified DefaultHttp2RemoteFlowController.writePendingBytes to only allocate bytes on the first pass and then write any allocated bytes on the second pass.
Result:
Side effects resulting from writing should no longer impact the flow control algorithm.
Motivation:
A microbenchmark will be useful to get a baseline for performance.
Modifications:
- Introduce a new microbenchmark which tests the Http2DefaultFrameWriter.
- Allow benchmarks to run without thread context switching between JMH and Netty.
Result:
Microbenchmark exists to test performance.
Motivation:
The Http2ConnectionHandler writeRstStream method allows RST_STREAM frames to be sent when we do not know about the stream and after a RST_STREAM frame has already been sent. This may lead to sending frames when we should not according to the HTTP/2 spec. There is also the potential to notify the closeListener multiple times if the closeStream method is called multiple times.
Modifications:
- Prevent RST_STREAM from being sent if we don't know about the stream, or if we already sent the RST_STREAM.
- Prevent the closeListener from being notified multiple times.
Result:
More robust writeRstStream logic in boundary conditions.
Motivation:
There was a new draft for HTTP/2. We should support the new draft.
Modifications:
- Review the HTTP/2 draft 17 specification, and update code to reflect changes.
Result:
Support for HTTP/2 draft 17.
Motivation:
There are new versions of the ALPN and NPN dependencies. There was also some backport misses in the pom file related to ALPN/NPN.
Modifications:
- Add new versions for ALPN/NPN dependencies.
- Backport missed pieces from pom.xml.
Result:
Updated version of ALPN/NPN versions.
Motivation:
- In FlowState.write(...) we are currently swalloing an exception.
- In my previous commit I introduced a compiler warning by not making
a local variabe final.
Modifications:
- Have FlowState.cancel() take a Throwable.
- Make the variable final.
Result:
No more swallowed exceptions and warnings.
Motivation:
- The encoder and decoder should be closed right after the handler releases its resources.
- The clientPrefaceString is allocated in the constructor but releases in handlerRemoved.
If the handler is never added to the pipeline, the clientPrefaceString will never be
released.
Modifications:
- Call encoder.close() and decoder.close() on channelInactive.
- Release the clientPrefaceString on handlerRemoved.
Result:
- The encoder and decoder get closed right after the handler's resources are freed.
- It's easier to verify that the clientPrefaceString will also get released.
Motivation:
Current AbstractMemcacheObjectEncoder does unnecessary message type checking if the message is MemcacheMessage type.
Modifications:
Returns after encoding MemcacheMessage message.
Result:
Small performance improvement for this encoder.
(Ported @luciferous's changes against 3.10)
Motivation:
The current implementation of the encoder writes each character of the
String as a single byte to the buffer, however not all characters are
mappable to a single byte.
Modifications:
If a character is outside the ASCII range, it's converted to '?'.
Result:
A safer encoder for String to ASCII, which substitutes unmappable
characters with'?'.
Motivation:
The Http2FrameLogger is currently using the internal logging classes. We should change this so that it's using the public classes and then converts internally.
Modifications:
Modified Http2FrameLogger and the examples to use the public LogLevel class.
Result:
Fixes#2512