Related: #4333#4421#5128
Motivation:
slice(), duplicate() and readSlice() currently create a non-recyclable
derived buffer instance. Under heavy load, an application that creates a
lot of derived buffers can put the garbage collector under pressure.
Modifications:
- Add the following methods which creates a non-recyclable derived buffer
- retainedSlice()
- retainedDuplicate()
- readRetainedSlice()
- Add the new recyclable derived buffer implementations, which has its
own reference count value
- Add ByteBufHolder.retainedDuplicate()
- Add ByteBufHolder.replace(ByteBuf) so that..
- a user can replace the content of the holder in a consistent way
- copy/duplicate/retainedDuplicate() can delegate the holder
construction to replace(ByteBuf)
- Use retainedDuplicate() and retainedSlice() wherever possible
- Miscellaneous:
- Rename DuplicateByteBufTest to DuplicatedByteBufTest (missing 'D')
- Make ReplayingDecoderByteBuf.reject() return an exception instead of
throwing it so that its callers don't need to add dummy return
statement
Result:
Derived buffers are now recycled when created via retainedSlice() and
retainedDuplicate() and derived from a pooled buffer
Motivation:
We missed to reset the decoder when asked for it in HttpObjectDecoder and so sometimes could produce more then one LastHttpContent in a sequence during channelInactive.
This did show up as AssertionError:
22:22:35.499 [nioEventLoopGroup-3-1] WARN i.n.channel.DefaultChannelPipeline - An exceptionCaught() event was fired, and it reached at the tail of the pipeline. It usually means the last handler in the pipeline did not handle the exception.
java.lang.AssertionError: null
at io.netty.handler.codec.http.HttpObjectAggregator.decode(HttpObjectAggregator.java:205) ~[classes/:na]
at io.netty.handler.codec.http.HttpObjectAggregator.decode(HttpObjectAggregator.java:57) ~[classes/:na]
at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:89) ~[classes/:na]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:292) [classes/:na]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:278) [classes/:na]
at io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireChannelRead(CombinedChannelDuplexHandler.java:428) [classes/:na]
at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:277) [classes/:na]
at io.netty.handler.codec.ByteToMessageDecoder.channelInputClosed(ByteToMessageDecoder.java:343) [classes/:na]
at io.netty.handler.codec.ByteToMessageDecoder.channelInactive(ByteToMessageDecoder.java:309) [classes/:na]
at io.netty.handler.codec.http.HttpClientCodec$Decoder.channelInactive(HttpClientCodec.java:228) [classes/:na]
at io.netty.channel.CombinedChannelDuplexHandler.channelInactive(CombinedChannelDuplexHandler.java:213) [classes/:na]
...
Modifications:
Correctly reset decoder.
Result:
Correctly only produce one LastHttpContent per sequence.
Motivation:
If the input buffer is empty we should not have decodeLast(...) call decode(...) as the user may not expect this.
Modifications:
- Not call decode(...) in decodeLast(...) if the input buffer is empty.
- Add testcases.
Result:
decodeLast(...) will not call decode(...) if input buffer is empty.
Motivation:
b714297a44 introduced ChannelInputShutdownEvent support for HttpObjectDecoder. However this should have been added to the super class ByteToMessageDecoder, and ByteToMessageDecoder should not propegate a channelInactive event through the pipeline in this case.
Modifications:
- Move the ChannelInputShutdownEvent handling from HttpObjectDecoder to ByteToMessageDecoder
- ByteToMessageDecoder doesn't call ctx.fireChannelInactive() on ChannelInputShutdownEvent
Result:
Half closed events are treated more generically, and don't get translated into a channelInactive pipeline event.
Motivation:
The initial buffer size used to decode HTTP objects is currently fixed at 128. This may be too small for some use cases and create a high amount of overhead associated with resizing/copying. The user should be able to configure the initial size as they please.
Modifications:
- Make HttpObjectDecoder's AppendableCharSequence initial size configurable
Result:
Users can more finely tune initial buffer size for increased performance or to save memory.
Fixes https://github.com/netty/netty/issues/4807
Motivation:
Request bodies can easily be larger than Integer.MAX_VALUE in practice.
There's no reason, or intention, for Netty to impose this artificial constraint.
Worse, it currently does not fail if the body is larger than this value;
it just silently only reads the first Integer.MAX_VALUE bytes and discards the rest.
This restriction doesn't effect chunked transfers, with no Content-Length header.
Modifications:
Force the use of `long HttpUtil.getContentLength(HttpMessage, long)` instead of
`long HttpUtil.getContentLength(HttpMessage, long)`.
Result:
Netty will support HTTP request bodies of up to Long.MAX_VALUE length.
Motivation:
There currently exists http.HttpUtil, http2.HttpUtil, and http.HttpHeaderUtil. Having 2 HttpUtil methods can be confusing and the utilty methods in the http package could be consolidated.
Modifications:
- Rename http2.HttpUtil to http2.HttpConversionUtil
- Move http.HttpHeaderUtil methods into http.HttpUtil
Result:
Consolidated utilities whose names don't overlap.
Fixes https://github.com/netty/netty/issues/4120
Motivation:
Hixie 76 needs special handling compared to other connection upgrade responses. Our detection code of non websocket responses did actually always use the special handling that only should be used for Hixie 76 responses.
Modifications:
Correctly detect connection upgrade responses which are not for websockets.
Result:
Be able to upgrade connections for other protocols then websockets.
Motivation:
The HttpObjectAggregator always responds with a 100-continue response. It should check the Content-Length header to see if the content length is OK, and if not responds with a 417.
Modifications:
- HttpObjectAggregator checks the Content-Length header in the case of a 100-continue.
Result:
HttpObjectAggregator responds with 417 if content is known to be too big.
Motivation:
A degradation in performance has been observed from the 4.0 branch as documented in https://github.com/netty/netty/issues/3962.
Modifications:
- Simplify Headers class hierarchy.
- Restore the DefaultHeaders to be based upon DefaultHttpHeaders from 4.0.
- Make various other modifications that are causing hot spots.
Result:
Performance is now on par with 4.0.
Motivation:
In the event an HTTP message does not include either a content-length or a transfer-encoding header [RFC 7230](https://tools.ietf.org/html/rfc7230#section-3.3.3) states the behavior must be treated differently for requests and responses. If the channel is half closed then the HttpObjectDecoder is not invoking decodeLast and thus not checking if messages should be sent up the pipeline.
Modifications:
- Add comments to clarify regular decode default case.
- Handle the ChannelInputShutdownEvent in the HttpObjectDecoder and evaluate if messages need to be generated.
Result:
Messages are generated on half closed, and comments clarify existing logic.
Motivation:
The HttpObjectDecoder is on the hot code path for the http codec. There are a few hot methods which can be modified to improve performance.
Modifications:
- Modify AppendableCharSequence to provide unsafe methods which don't need to re-check bounds for every call.
- Update HttpObjectDecoder methods to take advantage of new AppendableCharSequence methods.
Result:
Peformance boost for decoding http objects.
Motivation:
The usage and code within AsciiString has exceeded the original design scope for this class. Its usage as a binary string is confusing and on the verge of violating interface assumptions in some spots.
Modifications:
- ByteString will be created as a base class to AsciiString. All of the generic byte handling processing will live in ByteString and all the special character encoding will live in AsciiString.
Results:
The AsciiString interface will be clarified. Users of AsciiString can now be clear of the limitations the class imposes while users of the ByteString class don't have to live with those limitations.
Related: #3445
Motivation:
HttpObjectDecoder.HeaderParser does not reset its counter (the size
field) when it failed to find the end of line. If a header is split
into multiple fragments, the counter is increased as many times as the
number of fragments, resulting an unexpected TooLongFrameException.
Modifications:
- Add test cases that reproduces the problem
- Reset the HeaderParser.size field when no EOL is found.
Result:
One less bug
Motivation:
Found performance issues via FindBugs and PMD.
Modifications:
- Removed unnecessary boxing/unboxing operations in DefaultTextHeaders.convertToInt(CharSequence) and DefaultTextHeaders.convertToLong(CharSequence). A boxed primitive is created from a string, just to extract the unboxed primitive value.
- Added a static modifier for DefaultHttp2Connection.ParentChangedEvent class. This class is an inner class, but does not use its embedded reference to the object which created it. This reference makes the instances of the class larger, and may keep the reference to the creator object alive longer than necessary.
- Added a static compiled Pattern to avoid compile it each time it is used when we need to replace some part of authority.
- Improved using of StringBuilders.
Result:
Performance improvements.
Motivation:
HttpObjectDecoder extended ReplayDecoder which is slightly slower then ByteToMessageDecoder.
Modifications:
- Changed super class of HttpObjectDecoder from ReplayDecoder to ByteToMessageDecoder.
- Rewrote decode() method of HttpObjectDecoder to use proper state machine.
- Changed private methods HeaderParser.parse(ByteBuf), readHeaders(ByteBuf) and readTrailingHeaders(ByteBuf), skipControlCharacters(ByteBuf) to consider available bytes.
- Set HeaderParser and LineParser as static inner classes.
- Replaced not safe actualReadableBytes() with buffer.readableBytes().
Result:
Improved performance of HttpObjectDecoder by approximately 177%.
Motivation:
The commit 50e06442c3 changed the type of
the constants in HttpHeaders.Names and HttpHeaders.Values, making 4.1
backward-incompatible with 4.0.
It also introduces newer utility classes such as HttpHeaderUtil, which
deprecates most static methods in HttpHeaders. To ease the migration
between 4.1 and 5.0, we should deprecate all static methods that are
non-existent in 5.0, and provide proper counterpart.
Modification:
- Revert the changes in HttpHeaders.Names and Values
- Deprecate all static methods in HttpHeaders in favor of:
- HttpHeaderUtil
- the member methods of HttpHeaders
- AsciiString
- Add integer and date access methods to HttpHeaders for easier future
migration to 5.0
- Add HttpHeaderNames and HttpHeaderValues which provide standard HTTP
constants in AsciiString
- Deprecate HttpHeaders.Names and Values
- Make HttpHeaderValues.WEBSOCKET lowercased because it's actually
lowercased in all WebSocket versions but the oldest one
- Add RtspHeaderNames and RtspHeaderValues which provide standard RTSP
constants in AsciiString
- Deprecate RtspHeaders.*
- Do not use AsciiString.equalsIgnoreCase(CharSeq, CharSeq) if one of
the parameters are AsciiString
- Avoid using AsciiString.toString() repetitively
- Change the parameter type of some methods from String to
CharSequence
Result:
Backward compatibility is recovered. New classes and methods will make
the migration to 5.0 easier, once (Http|Rtsp)Header(Names|Values) are
ported to master.
Motivation:
The header class hierarchy and algorithm was improved on the master branch for versions 5.x. These improvments should be backported to the 4.1 baseline.
Modifications:
- cherry-pick the following commits from the master branch: 2374e17, 36b4157, 222d258
Result:
Header improvements in master branch are available in 4.1 branch.
Motivation:
At the moment the whole HTTP header must be parsed at once which can lead to multiple parsing of the same bytes. We can do better here and allow to parse it in multiple steps.
Modifications:
- Not parse headers multiple times
- Simplify the code
- Eliminate uncessary String[] creations
- Use readSlice(...).retain() when possible.
Result:
Performance improvements as shown in the included benchmark below.
Before change:
[nmaurer@xxx]~% ./wrk-benchmark
Running 2m test @ http://xxx:8080/plaintext
16 threads and 256 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 21.55ms 15.10ms 245.02ms 90.26%
Req/Sec 196.33k 30.17k 297.29k 76.03%
373954750 requests in 2.00m, 50.15GB read
Requests/sec: 3116466.08
Transfer/sec: 427.98MB
After change:
[nmaurer@xxx]~% ./wrk-benchmark
Running 2m test @ http://xxx:8080/plaintext
16 threads and 256 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 20.91ms 36.79ms 1.26s 98.24%
Req/Sec 206.67k 21.69k 243.62k 94.96%
393071191 requests in 2.00m, 52.71GB read
Requests/sec: 3275971.50
Transfer/sec: 449.89MB
Motivation:
Persuit for the consistency in method naming
Modifications:
- Remove the 'get' prefix from all HTTP/SPDY message classes
- Fix some inspector warnings
Result:
Consistency
Motivation:
When an HttpResponseDecoder decodes an invalid chunk, a LastHttpContent instance is produced and the decoder enters the 'BAD_MESSAGE' state, which is not supposed to produce a message any further. However, because HttpObjectDecoder.invalidChunk() did not clear this.message out to null, decodeLast() will produce another LastHttpContent message on a certain situation.
Modification:
Do not forget to null out HttpObjectDecoder.message in invalidChunk(), and add a test case for it.
Result:
No more consecutive LastHttpContent messages produced by HttpObjectDecoder.
Motivation:
Currently, it is impossible to give a user the full control over what to do in response to the request with 'Expect: 100-continue' header. Currently, a user have to do one of the following:
- Accept the request and respond with 100 Continue, or
- Send the reject response and close the connection.
.. which means it is impossible to send the reject response and keep the connection alive so that the client sends additional requests.
Modification:
Added a public method called 'reset()' to HttpObjectDecoder so that a user can reset the state of the decoder easily. Once called, the decoder will assume the next input will be the beginning of a new request.
HttpObjectAggregator now calls `reset()`right after calling 'handleOversizedMessage()' so that the decoder can continue to decode the subsequent request even after the request with 'Expect: 100-continue' header is rejected.
Added relevant unit tests / Minor clean-up
Result:
This commit completes the fix of #2211
- Since Netty 4, HTTP decoder does not generate a full message at all. Therefore, there's no need to keep separate states for the content smaller than maxChunkSize.
- maxChunkSize must be greater than 0. Setting it to 0 should not disable chunked encoding. We have a dedicated flag for that.
- Uncommented the tests that were commented out for an unknown reason, with some fixes.
- Added more tests for HTTP decoder.
- Removed the Ignore annotation on some tests.
* Minimize allocation of StringBuilder and also minimize char array copy
* Try to detect HttpVersion without calling toUpperCase() for performance reasons
I must admit MesageList was pain in the ass. Instead of forcing a
handler always loop over the list of messages, this commit splits
messageReceived(ctx, list) into two event handlers:
- messageReceived(ctx, msg)
- mmessageReceivedLast(ctx)
When Netty reads one or more messages, messageReceived(ctx, msg) event
is triggered for each message. Once the current read operation is
finished, messageReceivedLast() is triggered to tell the handler that
the last messageReceived() was the last message in the current batch.
Similarly, for outbound, write(ctx, list) has been split into two:
- write(ctx, msg)
- flush(ctx, promise)
Instead of writing a list of message with a promise, a user is now
supposed to call write(msg) multiple times and then call flush() to
actually flush the buffered messages.
Please note that write() doesn't have a promise with it. You must call
flush() to get notified on completion. (or you can use writeAndFlush())
Other changes:
- Because MessageList is completely hidden, codec framework uses
List<Object> instead of MessageList as an output parameter.
The API changes made so far turned out to increase the memory footprint
and consumption while our intention was actually decreasing them.
Memory consumption issue:
When there are many connections which does not exchange data frequently,
the old Netty 4 API spent a lot more memory than 3 because it always
allocates per-handler buffer for each connection unless otherwise
explicitly stated by a user. In a usual real world load, a client
doesn't always send requests without pausing, so the idea of having a
buffer whose life cycle if bound to the life cycle of a connection
didn't work as expected.
Memory footprint issue:
The old Netty 4 API decreased overall memory footprint by a great deal
in many cases. It was mainly because the old Netty 4 API did not
allocate a new buffer and event object for each read. Instead, it
created a new buffer for each handler in a pipeline. This works pretty
well as long as the number of handlers in a pipeline is only a few.
However, for a highly modular application with many handlers which
handles connections which lasts for relatively short period, it actually
makes the memory footprint issue much worse.
Changes:
All in all, this is about retaining all the good changes we made in 4 so
far such as better thread model and going back to the way how we dealt
with message events in 3.
To fix the memory consumption/footprint issue mentioned above, we made a
hard decision to break the backward compatibility again with the
following changes:
- Remove MessageBuf
- Merge Buf into ByteBuf
- Merge ChannelInboundByte/MessageHandler and ChannelStateHandler into ChannelInboundHandler
- Similar changes were made to the adapter classes
- Merge ChannelOutboundByte/MessageHandler and ChannelOperationHandler into ChannelOutboundHandler
- Similar changes were made to the adapter classes
- Introduce MessageList which is similar to `MessageEvent` in Netty 3
- Replace inboundBufferUpdated(ctx) with messageReceived(ctx, MessageList)
- Replace flush(ctx, promise) with write(ctx, MessageList, promise)
- Remove ByteToByteEncoder/Decoder/Codec
- Replaced by MessageToByteEncoder<ByteBuf>, ByteToMessageDecoder<ByteBuf>, and ByteMessageCodec<ByteBuf>
- Merge EmbeddedByteChannel and EmbeddedMessageChannel into EmbeddedChannel
- Add SimpleChannelInboundHandler which is sometimes more useful than
ChannelInboundHandlerAdapter
- Bring back Channel.isWritable() from Netty 3
- Add ChannelInboundHandler.channelWritabilityChanges() event
- Add RecvByteBufAllocator configuration property
- Similar to ReceiveBufferSizePredictor in Netty 3
- Some existing configuration properties such as
DatagramChannelConfig.receivePacketSize is gone now.
- Remove suspend/resumeIntermediaryDeallocation() in ByteBuf
This change would have been impossible without @normanmaurer's help. He
fixed, ported, and improved many parts of the changes.
- Fixes#1229
- Primarily written by @normanmaurer and revised by @trustin
This commit removes the notion of unfolding from the codec framework
completely. Unfolding was introduced in Netty 3.x to work around the
shortcoming of the codec framework where encode() and decode() did not
allow generating multiple messages.
Such a shortcoming can be fixed by changing the signature of encode()
and decode() instead of introducing an obscure workaround like
unfolding. Therefore, we changed the signature of them in 4.0.
The change is simple, but backward-incompatible. encode() and decode()
do not return anything. Instead, the codec framework will pass a
MessageBuf<Object> so encode() and decode() can add the generated
messages into the MessageBuf.
* This is done because we noticed that the previous change limit the usage more then it gave us any benefit. Now it is possible
again to rewrite the url on the fly or reuse the objects when writing a proxy and so limit the GC pressure.
* Fixes also #979