Motivation:
We noticed that the headers implementation in Netty for HTTP/2 uses quite a lot of memory
and that also at least the performance of randomly accessing a header is quite poor. The main
concern however was memory usage, as profiling has shown that a DefaultHttp2Headers
not only use a lot of memory it also wastes a lot due to the underlying hashmaps having
to be resized potentially several times as new headers are being inserted.
This is tracked as issue #3600.
Modifications:
We redesigned the DefaultHeaders to simply take a Map object in its constructor and
reimplemented the class using only the Map primitives. That way the implementation
is very concise and hopefully easy to understand and it allows each concrete headers
implementation to provide its own map or to even use a different headers implementation
for processing requests and writing responses i.e. incoming headers need to provide
fast random access while outgoing headers need fast insertion and fast iteration. The
new implementation can support this with hardly any code changes. It also comes
with the advantage that if the Netty project decides to add a third party collections library
as a dependency, one can simply plug in one of those very fast and memory efficient map
implementations and get faster and smaller headers for free.
For now, we are using the JDK's TreeMap for HTTP and HTTP/2 default headers.
Result:
- Significantly fewer lines of code in the implementation. While the total commit is still
roughly 400 lines less, the actual implementation is a lot less. I just added some more
tests and microbenchmarks.
- Overall performance is up. The current implementation should be significantly faster
for insertion and retrieval. However, it is slower when it comes to iteration. There is simply
no way a TreeMap can have the same iteration performance as a linked list (as used in the
current headers implementation). That's totally fine though, because when looking at the
benchmark results @ejona86 pointed out that the performance of the headers is completely
dominated by insertion, that is insertion is so significantly faster in the new implementation
that it does make up for several times the iteration speed. You can't iterate what you haven't
inserted. I am demonstrating that in this spreadsheet [1]. (Actually, iteration performance is
only down for HTTP, it's significantly improved for HTTP/2).
- Memory is down. The implementation with TreeMap uses on avg ~30% less memory. It also does not
produce any garbage while being resized. In load tests for GRPC we have seen a memory reduction
of up to 1.2KB per RPC. I summarized the memory improvements in this spreadsheet [1]. The data
was generated by [2] using JOL.
- While it was my original intend to only improve the memory usage for HTTP/2, it should be similarly
improved for HTTP, SPDY and STOMP as they all share a common implementation.
[1] https://docs.google.com/spreadsheets/d/1ck3RQklyzEcCLlyJoqDXPCWRGVUuS-ArZf0etSXLVDQ/edit#gid=0
[2] https://gist.github.com/buchgr/4458a8bdb51dd58c82b4
Motivation:
When looking through the logs for entries pertaining to a specific stream, it's difficult because header entries use the syntax "streamId:<id>" but all other entries use "streamId=<id>". We should make all of the entries consistent.
Modifications:
Changed header entries to use "streamId=<id>" to match the other entries.
Result:
Easier HTTP/2 log navigation.
Motivation:
We should support XXXCollections methods for all primitive map types.
Modifications:
Removed PrimitiveCollections and added a template for XXXCollections.
Result:
Fixes#4001
Motivation:
It would be useful to support the Java `Map` interface in our primitive maps.
Modifications:
Renamed current methods to "pXXX", where p is short for "primitive". Made the template for all primitive maps extend the appropriate Map interface.
Result:
Fixes#3970
Motivation:
HttpToHttp2ConnectionHandler only converts FullHttpMessage to HTTP/2 Frames. This does not support other use cases such as adding a HttpContentCompressor to the pipeline, which writes HttpMessage and HttpContent.
Additionally HttpToHttp2ConnectionHandler ignores converting and sending HTTP trailing headers, which is a bug as the HTTP/2 spec states that they should be sent.
Modifications:
Update HttpToHttp2ConnectionHandler to support converting HttpMessage and HttpContent to HTTP/2 Frames.
Additionally, include an extra call to writeHeaders if the message includes trailing headers
Result:
One can now write HttpMessage and HttpContent (http chunking) down the pipeline and they will be converted to HTTP/2 Frames. If any trailing headers exist, they will be converted and sent as well.
Motivation:
It is currently assumed that all usages of the HTTP/2 codec will be from the same event loop context. If the methods are used outside of the assumed thread context then unexpected behavior is observed. This assumption should be more clearly communicated and enforced in key areas.
Modifications:
- The flow controller interfaces have assert statements and updated javadocs indicating the assumptions.
Result:
Interfaces more clearly indicate thread context limitations.
Motivation:
The CompressorHttp2ConnectionEncoder is attempting to attach a property to streams before the exist.
Modifications:
- Allow the super class to create the streams before attempting to attach a property to the stream.
Result:
CompressorHttp2ConnectionEncoder is able to set the property and access the compressor.
Motivation:
See #3783
Modifications:
- The DefaultHttp2RemoteFlowController should use Channel.isWritable() before attempting to do any write operations.
- The Flow controller methods should no longer take ChannelHandlerContext. The concept of flow control is tied to a connection and we do not support 1 flow controller keeping track of multiple ChannelHandlerContext.
Result:
Writes are delayed until isWritable() is true. Flow controller interface methods are more clear as to ChannelHandlerContext restrictions.
Motivation:
The MAX_HEADER_LIST_SIZE of SETTINGS is represented by
unsigned 32-bit value and this value isn't limited in RFC7540.
But in current implementation, its stored to int variable so
over 2^31-1 value is recognized as minus and handled as
PROTOCOL_ERROR.
Modifications:
If a value of MAX_HEADER_LIST_SIZE is larger than 2^31-1, its
handled as 2^31-1
Result:
Over 2^31-1 MAX_HEADER_LIST_SIZE is became acceptable
Motivation:
Slicing a mutable CompositeByteBuf is not the appropriate mechanism to use to track and release buffers that have been written to a channel.
In particular buffers passed over an Embedded or LocalChannel are retained after the ChannelPromise is completed and listening to the
promise to consolidate a CompositeBuffer breaks slices taken from the composite as the offset indices have changed.
In addition CoalescingBufferQueue handles taking arbitrarily sized slices of a sequence of buffers more efficiently.
Modifications:
Convert FlowControlledData to use a CoalescingBufferQueue to handle merging data writes.
Result:
HTTP2 works over LocalChannel and code is considerably simpler.
Motivation:
Sometimes people use a data frame with length 0 to end a stream(such as jetty http2-server). So it is possible that data.readableBytes and padding are all 0 for a data frame, and cause an IllegalArgumentException when calling flowController.consumeBytes.
Modifications:
Return false when numBytes == 0 instead of throwing IllegalArgumentException.
Result:
Fix IllegalArgumentException.
Motivation:
The problem is described in https://github.com/grpc/grpc-java/issues/605. Basically, when using `StreamBufferingEncoder` there is a chance of creating zombie streams that never get closed.
Modifications:
Change `Http2ConnectionHandler`'s `channelInactive` handling logic to shutdown the encoder/decoder before shutting down the active streams.
Result:
Fixes https://github.com/grpc/grpc-java/issues/605
Motivation:
Http2ConnectionHandler missed to forward channelReadComplete(...) events.
Modifications:
Ensure we notify the next handler in the pipeline via ctx.fireChannelReadComplete().
Result:
Correctly forwarding of event.
Motivation:
The Http2FrameListener requires that the Http2FrameReader accumulate ByteBuf objects. There is currently no way to limit the amount of bytes that is accumulated.
Motiviation:
- DefaultHttp2FrameReader supports maxHeaderSize which will fail fast as soon as the maximum size is exceeded.
- DefaultHttp2HeadersDecoder will respect the return value of endHeaderBlock() and fail if the max size is exceeded.
Result:
Frames which carry header data now respect a maximum number of bytes that can be accumulated.
Motivation:
If server sends SETTINGS with ENABLE_PUSH, its handled as
PROTOCOL_ERROR in spite of the value. But the value specified to
0 may be allowed in RFC7540.
Modifications:
Check whether ENABLE_PUSH sent from a server is 0 or not.
Result:
When server specifies ENABLE_PUSH to 0 explicitly, client doesn't
handle it as PROTOCOL_ERROR.
Motivation:
Coalescing many small writes into a larger DATA frame reduces framing overheads on the wire and reduces the number of calls to Http2FrameListeners on the remote side.
Delaying the write of WINDOW_UPDATE until flush allows for more consumed bytes to be returned as the aggregate of consumed bytes is returned and not the amount consumed when the threshold was crossed.
Modifications:
- Remote flow controller no longer immediately writes bytes when a flow-controlled payload is enqueued. Sequential data payloads are now merged into a single CompositeByteBuf which are written when 'writePendingBytes' is called.
- Listener added to remote flow-controller which observes written bytes per stream.
- Local flow-controller no longer immediately writes WINDOW_UPDATE when the ratio threshold is crossed. Now an explicit call to 'writeWindowUpdates' triggers the WINDOW_UPDATE for all streams who's ratio is exceeded at that time. This results in
fewer window updates being sent and more bytes being returned.
- Http2ConnectionHandler.flush triggers 'writeWindowUpdates' on the local flow-controller followed by 'writePendingBytes' on the remote flow-controller so WINDOW_UPDATES preceed DATA frames on the wire.
Result:
- Better throughput for writing many small DATA chunks followed by a flush, saving 9-bytes per coalesced frame.
- Fewer WINDOW_UPDATES being written and more flow-control bytes returned to remote side more quickly, thereby improving throughput.
Motivation:
Bootstrap of the HTTP/2 can take a lot of paths and a lot of things can go wrong in the initial handshakes leading up to establishment of HTTP/2 between client and server. There have been many times where handshakes have failed silently, leading to very cryptic errors that are hard to debug.
Modifications:
Changed the HTTP/2 handler and decoder to ensure that the very first data on the wire (WRT HTTP/2) is SETTINGS/preface+SETTINGS. When this is not the case, a connection error is thrown with the bytes that were found instead.
Result:
Fixes#3880
Related: #3814
Motivation:
To implement the support for an upgrade from cleartext HTTP/1.1
connection to cleartext HTTP/2 (h2c) connection, a user usually uses
HttpServerUpgradeHandler.
It does its job, but it requires a user to instantiate the UpgradeCodecs
for all supported protocols upfront. It means redundancy for the
connections that are not upgraded.
Modifications:
- Change the constructor of HttpServerUpgradeHandler
- Accept UpgraceCodecFactory instead of UpgradeCodecs
- The default constructor of HttpServerUpgradeHandler sets the
maxContentLength to 0 now, which shouldn't be a problem because a
usual upgrade request is a GET.
- Update the examples accordingly
Result:
A user can instantiate Http2ServerUpgradeCodec and its related objects
(Http2Connection, Http2FrameReader/Writer, Http2FrameListener, etc) only
when necessary.
Motivation:
Our HTTP/2 implementation sometimes uses hard-coded handler names when
adding/removing a handler to/from a pipeline. It's not really a good
idea because it can easily result in name clashes. Unless there is a
good reason, we need to use the reference to the handlers
Modifications:
- Allow null as a handler name for Http2Client/ServerUpgradeCodec
- Use null as the default upgrade handler name
- Do not use handler name strings in some test cases and examples
Result:
Fixes#3815
Motiviation:
The connection handler stream close operation is unconditionally adding a listener object to a future. We may not have to add a listener at all because the future has already been completed.
Modifications:
- If the future is done, directly invoke the logic without creating/adding a new listener.
Result:
No need to create/add listener if the future is already done in close logic.
Motivation:
gRPC's BufferingHttp2ConnectionEncoder is a generic utility that simplifies client-side applications that want to allow stream creation without worrying about violating the SETTINGS_MAX_CONCURRENT_STREAMS limit. Since it's not gRPC-specific it makes sense to move it into Netty proper.
Modifications:
Adding the BufferingHttp2ConnectionEncoder and it's unit test.
Result:
Netty now supports buffering stream creation.
Motivation:
SpdyOrHttpChooser and Http2OrHttpChooser duplicate fair amount code with each other.
Modification:
- Replace SpdyOrHttpChooser and Http2OrHttpChooser with ApplicationProtocolNegotiationHandler
- Add ApplicationProtocolNames to define the known application-level protocol names
Result:
- Less code duplication
- A user can perform dynamic pipeline configuration that follows ALPN/NPN for any protocols.
Related: #3641 and #3813
Motivation:
When setting up an HTTP/1 or HTTP/2 (or SPDY) pipeline, a user usually
ends up with adding arbitrary set of handlers.
Http2OrHttpChooser and SpdyOrHttpChooser have two abstract methods
(create*Handler()) that expect a user to return a single handler, and
also have add*Handlers() methods that add the handler returned by
create*Handler() to the pipeline as well as the pre-defined set of
handlers.
The problem is, some users (read: I) don't need all of them or the
user wants to add more than one handler. For example, take a look at
io.netty.example.http2.tiles.Http2OrHttpHandler, which works around
this issue by overriding addHttp2Handlers() and making
createHttp2RequestHandler() a no-op.
Modifications:
- Replace add*Handlers() and create*Handler() with configure*()
- Rename getProtocol() to selectProtocol() to make what it does clear
- Provide the default implementation of selectProtocol()
- Remove SelectedProtocol.UNKNOWN and use null instead, because
'UNKNOWN' is not a protocol
- Proper exception handling in the *OrHttpChooser so that the
exception is logged and the connection is closed when failed to
select a protocol
- Make SpdyClient example always use SSL. It was always using SSL
anyway.
- Implement SslHandshakeCompletionEvent.toString() for debuggability
- Remove an orphaned class: JettyNpnSslSession
- Add SslHandler.applicationProtocol() to get the name of the
application protocol
- SSLSession.getProtocol() now returns transport-layer protocol name
only, so that it conforms to its contract.
Result:
- *OrHttpChooser have better API.
- *OrHttpChooser handle protocol selection failure properly.
- SSLSession.getProtocol() now conforms to its contract.
- SpdyClient example works with SpdyServer example out of the box
Motivation:
There is currently no good way to configure the initial SETTINGS frame. The individual settings can be configured on the various components, but doing this bypasses the proper setting update logic in the encoder.
Modifications:
Updated Http2ConnectionHandler to optionally take initial settings in the constructor. If not provided, it will default to current behavior.
Result:
Easy manual configuration of initial settings.
Motivation:
The Http2OutboundFrameLogger logs all PING frames as not acks.
Modifications:
Changed the logger to correctly log PING acks.
Result:
PING acks are logged correctly.
Motiviation:
The Http2ConnectionHandler is incrementing the reference count in the goAway method for the debugData buffer after it has already been sent and maybe consumed. This may result in an IllegalRefCountException to be thrown. The unit tests also encounter buffer leaks because they have not been updated to invoke the listener which releases the buffer in the goAway method.
Modifications:
- The retain() call should be before the frameWriter().writeGoAway(...) call
- The unit tests which call goAway must also invoke the operationComplete(..) method for the listener.
Result:
No IllegalRefCountException. Less buffer leaks in tests.
Motivation:
If headers are sent on a stream that does not yet exist and the END_STREAM flag is set we will send a RST_STREAM frame. We should send the HEADERS frame and no RST_STREAM.
Modifications:
DefaultHttp2RemoteFlowController should allow frames to be sent if stream is created in the 'half closed (local)' state.
Result:
We can send HEADERS frame with the END_STREAM flag sent without sending a RST_STREAM frame.
Motivation:
There were a few outstanding comments that were left unaddressed after committing the changes for #3749.
Modifications:
Changes to Http2ConnectionHandler.goAway():
- Retaining the debugData buffer, rather than always converting it to a string immediately.
- Changing log level for sending a GOAWAY with error to debug.
Result:
Remaining comments from #3749 are addressed.
Motivation:
Currently the graceful shutdown of the HTTP/2 connection waits until there are no active streams. There may be use cases that buffer stream creation (due to limits imposed by MAX_CONCURRENT_STREAMS), in which case they may still want those streams to complete before closing.
Modifications:
Added a isGracefulShutdownComplete method to Http2ConnectionHandler, which can be overridden by a subclass.
Result:
Graceful shutdown logic can be overridden.
Motivation:
If the client closes, a GOWAY is sent with a lastKnownStream of zero (since the remote side never created a stream). If there is still an exchange in progress, inbound frames for streams created by the client will be ignored because our ignore logic doesn't check to see if the stream was created by the remote endpoint. Frames for streams created by the local endpoint should continue to come through after sending GOAWAY.
Modifications:
Changed the decoder's streamCreatedAfterGoAwaySent logic to properly ensure that the stream was created remotely.
Result:
We now propertly process frames received after sending GOAWAY.
Motivation:
The isDone method is currently broken in the aggregator because the doneAllocatingPromises accidentally calls the overridden version of setSuccess, rather than calling the base class version. This causes the base class's version to never be called since allowNotificationEvent will evaluate to false. This means that setSuccess0 will never be set, resulting in isDone always returning false.
Modifications:
Changed setSuccess() to call the base class when appropriate, regardless of the result of allowNotificationEvent.
Result:
isDone now behaves properly for the promise aggregator.
Motivation:
Allow users of HTTP2 to control when flushes occur so they can optimize network writes.
Modifications:
Removed explicit calls to flush in encoder, decoder & flow-controller
Connection handler now calls flush on read-complete to enable batching writes in response to reads
Result:
Much less flushing occurs for normal HTTP2 request and response patterns.
Motiviation:
There are a few spots in the HTTP/2 codec where warnings were generated and can be avoided.
Modifications:
Clean up the cause of the warnings.
Result:
Less warnings.
Motivation:
The Http2ConnectionHandler incorrectly doesn't propagate channelActive and channelInactive events and thus breaks the pipeline
for other ChannelHandler.
Modification:
- Add calls to super.channelActive() and super.channelInactive().
- Remove unused methods.
Result:
- Http2ConnectionHandler can be used with other ChannelHandlers.
Motivation:
The ByteString class currently assumes the underlying array will be a complete representation of data. This is limiting as it does not allow a subsection of another array to be used. The forces copy operations to take place to compensate for the lack of API support.
Modifications:
- add arrayOffset method to ByteString
- modify all ByteString and AsciiString methods that loop over or index into the underlying array to use this offset
- update all code that uses ByteString.array to ensure it accounts for the offset
- add unit tests to test the implementation respects the offset
Result:
ByteString and AsciiString can represent a sub region of a byte[].
Motivation:
Streams currently maintain a hash map of user-defined properties, which has been shown to add significant memory overhead as well as being a performance bottleneck for lookup of frequently used properties.
Modifications:
Modifying the connection/stream to use an array as the storage of user-defined properties, indexed by the class that identifies the index into the array where the property is stored.
Result:
Stream processing performance should be improved.
Motivation:
Currently we allocate the full amount of state for each stream as soon as the stream is created, and keep that state until the stream is GC. The full set of state is only needed when the stream can support flow controlled frames. There is an opportunity to reduce the required amount of memory, and make memory eligible for GC sooner by only allocating what is necessary for flow control stream state.
Modifications:
Introduce objects which require 'less' state for local/remote flow control stream state.
Use these new objects when streams have been created but will not transition out of idle AND when streams are no longer eligible for flow controlled frame transfer but still must persist in the priority tree.
Result:
Memory allocations are reduced to what is actually needed, and memory is made eligible for GC potentially sooner.
Motivation:
The recent PR that discarded the Http2StreamRemovalPolicy causes connection errors when receiving a frame for a stream that no longer exists. We should ignore these frames if we think there's a chance that the stream has existed previously
Modifications:
Modified the Http2Connection interface to provide a `streamMayHaveExisted` method. Also removed the requireStream() method to identify all of the places in the code that need to be updated.
Modified the encoder and decoder to properly handle cases where a stream may have existed but no longer does.
Result:
Fixes#3643
Motivation:
The current local flow controller does not guarantee that unconsumed bytes for a closed stream will be restored to the connection window. This may lead to degradation of the connection window over time.
Modifications:
Modified DefaultHttp2LocalFlowController to guarantee that any unconsumed bytes are returned to the connection window as soon as the stream is closed. We also immediately consume any bytes when receiving DATA for a closed stream.
Result:
Fixes#3668
Motivation:
Flow control is a required part of the HTTP/2 specification but it is currently structured more like an optional item. It must be accessed through the property map which is time consuming and does not represent its required nature. This access pattern does not give any insight into flow control outside of the codec (or flow controller implementation).
Modifications:
1. Create a read only public interface for LocalFlowState and RemoteFlowState.
2. Add a LocalFlowState localFlowState(); and RemoteFlowState remoteFlowState(); to Http2Stream.
Result:
Flow control is not part of the Http2Stream interface. This clarifies its responsibility and logical relationship to other interfaces. The flow controller no longer must be acquired though a map lookup.
Motivation:
If an exclusive dependency change stream B should be an exclusive dependency of stream A is requested and stream B is already a child of stream A...then we will add B to B's own children map and create a circular link in the priority tree. This leads to an infinite recursive loop and a stack overflow exception.
Modifications:
-when removeAllChildren is called it should not remove the exclusive dependency.
-unit test to ensure this case is covered.
Result:
No more circular link in the priority tree.
Motivation:
While forward porting https://github.com/netty/netty/pull/3579 there were a few areas that had not been previously back ported.
Modifications:
Backport the missed areas to ensure consistency.
Result:
More consistent 4.1 and master branches.
Motivation:
The usage and code within AsciiString has exceeded the original design scope for this class. Its usage as a binary string is confusing and on the verge of violating interface assumptions in some spots.
Modifications:
- ByteString will be created as a base class to AsciiString. All of the generic byte handling processing will live in ByteString and all the special character encoding will live in AsciiString.
Results:
The AsciiString interface will be clarified. Users of AsciiString can now be clear of the limitations the class imposes while users of the ByteString class don't have to live with those limitations.
Motivation:
Due to a recent flurry of cleanup and fixes, we no longer need the stream removal policy to protect against recently removed streams. We should get rid of it.
Modifications:
Removed Http2StreamRemovalPolicy and everywhere it's used.
Result:
Fixes#3448
Motivation:
The DefaultHttp2Connection is not checking for RuntimeExceptions when invoking Http2Connection.Listener methods. This is a problem for a few reasons: 1. The state of DefaultHttp2Connection will be corrupted if a listener throws a RuntimeException. 2. If the first listener throws then no other listeners will be notified, which may further corrupt state that is updated as a result of listeners being notified.
Modifications:
- Document that RuntimeExceptions are not supported for Http2Connection.Listener methods, and will be logged as an error.
- Update DefaultHttp2Connection to handle and exception for each listener that is notified, and be sure that 1 listener throwing an exception does not prevent others from being notified.
Result:
More robust DefaultHttp2Connection.