H2C upgrades should be ineligible for flow control
Motivation:
When the h2c upgrade request is too big, the Http2FrameCodec complains
it's too big for flow control reasons, even though it's ineligible for
flow control.
Modifications:
Specially mark upgrade streams and make Http2FrameCodec know not to try
to flow control on those streams.
Result:
Servers won't barf when they receive an upgrade request with a fat
payload.
[Fixes#7280]
Motivation:
The writeSpinCount currently loops over the same buffer, gathering
write, file write, or other write operation multiple times but will
continue writing until there is nothing left or the OS doesn't accept
any data for that specific write. However if the OS keeps accepting
writes there is no way to limit how much time we spend on a specific
socket. This can lead to unfair consumption of resources dedicated to a
single socket.
We currently don't limit the amount of bytes we attempt to write per
gathering write. If there are many more bytes pending relative to the
SO_SNDBUF size we will end up building iov arrays with more elements
than can be written, which results in extra iteration, conditionals,
and book keeping.
Modifications:
- writeSpinCount should limit the number of system calls we make to
write data, instead of applying to individual write operations
- IovArray should support a maximum number of bytes
- IovArray should support composite buffers of greater than size 1024
- We should auto-scale the amount of data that we attempt to write per
gathering write operation relative to SO_SNDBUF and how much data is
successfully written
- The non-unsafe path should also support a maximum number of bytes,
and respect the IOV_MAX limit
Result:
Write resource consumption can be bounded and gathering writes have
a limit relative to the amount of data which can actually be accepted
by the socket.
Motivation:
We used NetUtil.isIpV4StackPreferred() when loading JNI code which tries to load NetworkInterface in its static initializer. Unfortunally a lock on the NetworkInterface class init may be already hold somewhere else which may cause a loader deadlock.
Modifications:
Add a new Socket.initialize() method that will be called when init the library and pass everything needed to the JNI level so we not need to call back to java.
Result:
Fixes [#7458].
Motivation:
At the moment its a bit "hacky" to retrieve the hostname that was used during SNI as you need to hold a reference to SniHandler and then call hostname() once the selection is done. It would be better to fire an event to let the user know we did the selection.
Modifications:
Add a SniCompletionEvent that can be used to get the hostname that was used to do the selection and was included in the SNI extension.
Result:
Easier usage of SNI.
Motivation:
We only want to log for the particular case when debug logging is enabled so we not need to try to match the message if this is not the case.
Modifications:
Guard with logger.isDebugEnabled()
Result:
Less overhead when debug logging is not enabled.
Motivation:
AbstractChannel attempts to "filter" messages which are written [1]. A goal of this process is to copy from heap to direct if necessary. However implementations of this method [2][3] may translate a buffer with 0 readable bytes to EMPTY_BUFFER. This may mask a user error where an empty buffer is written but already released.
Modifications:
Replace safeRelease(...) with release(...) to ensure we propagate reference count issues.
Result:
Fixes [#7383]
Motivation:
If large amounts of data is being transferred it is difficult to correlate the amount we attempt to read vs the maximum amount that the OS will actually buffer and deliver to the application. For exmaple some OSes may dynicamlly update the SO_RCVBUF size or otherwise dynamically adjust how much data is delieved to the application. In these circumstances it can reduce latency to just call read() on the socket another time to see if there is really any data remaining instead of giving up the maxMessagesPerRead quantum and going back to the selector to read later.
Motifications:
- Add DefaultMaxMessagesRecvByteBufAllocator#respectMaybeMoreData which provides a way to ignore the maybeMoreData function which may not account for the current data pending, and if it does this maybe racy.
Result:
Option to always use the full maxMessagesPerRead quantum before going back to the selector.
Motivation:
SslHandler will do aggregation of writes by default in an attempt to improve goodput and reduce the number of discrete buffers which must be accumulated. However if aggregation is not possible then a CompositeByteBuf is used to accumulate multiple buffers. Using a CompositeByteBuf doesn't provide any of the benefits of better goodput and in the case of small + large writes (e.g. http/2 frame header + data) this can reduce the amount of data that can be passed to writev by about half. This has the impact of increasing latency as well as reducing goodput.
Modifications:
- SslHandler should prefer copying instead of using a CompositeByteBuf
Result:
Better goodput (and potentially improved latency) at the cost of copy operations.
Motivation:
AdaptiveRecvByteBufAllocator currently adjusts the ByteBuf allocation size guess when readComplete is called. However the default configuration for number of reads before readComplete is called is 16. This means that there will be 16 reads done before any adjustment is done. If there is a large amount of data pending AdaptiveRecvByteBufAllocator will be slow to adjust the allocation size guess. In addition to being slow the result of only updating the guess in readComplete means that we must go back to the selector and wait to be woken up again when data is ready to read. Going back to the selector is an expensive operations and can add significant latency if there is large amount of data pending to read.
Modifications:
- AdaptiveRecvByteBufAllocator should check on each read if a step up is necessary. The step down process is left unchanged and can be more gradual at the cost of potentially over allocating.
Result:
AdaptiveRecvByteBufAllocator increases the guess size during the read loop to reduce latency when large amounts of data is being read.
Motivation:
Http2ConnectionHandler uses ctx.fireUserEvent to propagate the Http2ConnectionPrefaceAndSettingsFrameWrittenEvent through the pipeline. This will propagate the event to the next inbound handler in the pipeline. If the user extends Http2ConnectionHandler the Http2ConnectionPrefaceAndSettingsFrameWrittenEvent may be missed and initialization dependent upon this event will not be run.
Modifications:
- Http2ConnectionHandler should use userEventTriggered instead of ctx.fireUserEvent
Result:
Classes that extend Http2ConnectionHandler will see the Http2ConnectionPrefaceAndSettingsFrameWrittenEvent user event.
Motivation:
When the ECS source prefix length is not a mutiple of 8, the last byte the address inside the
ECS OPT record is not padded properly.
Modifications:
DefaultDnsRecordEncoder.padWithZeros(...) was modified to add padding from the least
significant bits.
Result:
ECS encoding bug fixed.
Motivation:
The default enabled cipher suites of the OpenSsl engine are not set to
SslUtils#DEFAULT_CIPHER_SUITES. Instead all available cipher suites are
enabled. This should happen only as a fallback.
Modifications:
Moved the line in the static initializer in OpenSsl which adds the
SslUtils#DEFAULT_CIPHER_SUITES to the default enabled cipher suites up
before the fallback.
Result:
The default enabled cipher suites of the OpenSsl engine are set to the
available ones of the SslUtils#DEFAULT_CIPHER_SUITES.
The default enabled cipher suites of the OpenSsl engine are only set to
all available cipher suites if no one of the
SslUtils#DEFAULT_CIPHER_SUITES is supported.
Automatic-Module-Name entry provides a stable JDK9 module name, when Netty is used in a modular JDK9 applications. More info: http://blog.joda.org/2017/05/java-se-9-jpms-automatic-modules.html
When Netty migrates to JDK9 in the future, the entry can be replaced by actual module-info descriptor.
Modification:
The POM-s are configured to put the correct module names to the manifest.
Result:
Fixes#7218.
Motivation:
We dont need to use the ThreadDeathWatcher if we use a FastThreadLocalThread for which we wrap the Runnable and ensure we call FastThreadLocal.removeAll() once the Runnable completes.
Modifications:
- Dont use a ThreadDeathWatcher if we are sure we will call FastThreadLocal.removeAll()
- Add unit test.
Result:
Less overhead / running theads if you only allocate / deallocate from FastThreadLocalThreads.
Motivation:
HttpObjectDecoder will throw a TooLongFrameException when either the max size for the initial line or the header size was exceeed. We have no tests for this.
Modifications:
Add test cases.
Result:
More tests.
Motivation:
Exception handling is nicer when a more specific Exception is thrown
Modification:
Add a static reference for ENOENT, and throw FNFE if it is returned
Result:
More precise exception handling
Motiviation:
The OSGi Test suite runs without access to sun.misc.Unsafe, and so is a good place to put a test to avoid regressing #6548.
Modification:
Added a test-case that failed before https://github.com/netty/netty/pull/7432.
Result:
Test for fix included.
Motivation:
OSGI and other enviroments may not allow to even load Unsafe which will lead to an NoClassDefFoundError when trying to access it. We should guard against this.
Modifications:
Catch NoClassDefFoundError when trying to load Unsafe.
Result:
Be able to use netty with a strict OSGI config.
Motivation:
When system property is empty, the default value should be used.
Modification:
- Correctly use the default value in all cases
- Add unit tests
Result:
Correct behaviour
Motivation:
At the moment there is not way for the user to know if resolving a domain was failed because the domain was unkown or because of an IO error / timeout. If it was caused by an timeout / IO error the user may want to retry the query. Also if the query was failed because of an IO error / timeout we should not cache it.
Modifications:
- Add DnsNameResolverTimeoutException and include it in the UnkownHostException if the domain could not be resolved because of an timeout. This will allow the user to retry the query when inspecting the cause.
- Do not cache IO errors / timeouts
- Add unit test
Result:
Easier for users to implement retries for DNS querys and not cache IO errors / timeouts.
Motivation:
At the moment there is not way for the user to know if resolving a domain was failed because the domain was unkown or because of an IO error / timeout. If it was caused by an timeout / IO error the user may want to retry the query. Also if the query was failed because of an IO error / timeout we should not cache it.
Modifications:
- Add DnsNameResolverTimeoutException and include it in the UnkownHostException if the domain could not be resolved because of an timeout. This will allow the user to retry the query when inspecting the cause.
- Do not cache IO errors / timeouts
- Add unit test
Result:
Easier for users to implement retries for DNS querys and not cache IO errors / timeouts.
Motivation:
DefaultHttpHeader.names() exposes HTTP header names as a Set<String>. Converting the resulting set to an array using toArray(String[]) throws an exception: java.lang.ArrayStoreException: io.netty.util.AsciiString.
Modifications:
- Remove our custom implementation of toArray(...) (and others) by just extending AbstractCollection.
- Add unit test
Result:
Fixes [#7428].
Motivation:
HttpConversionUtil#toHttp2Headers has special code to filter the TE header name. However this filtering code may result in adding the <TE, TRAILERS> tuple in scenarios that are not appropriate. For example if a value containing trailers is seen it will be added, but the value could not actually be equal to trailers. Also CSV values are not supported.
Modifications:
- Account for CSV header values
- Account for the value containing 'trailers' but not actually being equal to 'trailers'
Result:
More robust parsing of the TE header.
Motivation:
For debugging/logging purpose, it would be convenient to have
HttpHeaders#toString implemented.
DefaultHeaders does implement toString be the implementation is suboptimal and allocates a Set for the names and Lists for values.
Modification:
* Introduce HeadersUtil#toString that provides a convenient optimized helper to implement toString for various headers implementations
* Have DefaultHeaders#toString and HttpHeaders#toString delegate their toString implementation to HeadersUtil
Result:
Convenient HttpHeaders#toString. Optimized DefaultHeaders#toString.
Motivation:
To better isolate OS system calls we should not call getsockopt directly but use our netty_unix_socket_getOption0 function. See is a followup of f115bf5.
Modifications:
Export netty_unix_socket_getOption0 by declaring it in the header file and use it
Result:
Better isolation of system calls.
Motivation:
Its possible that cleanup() will throw if invalid data is passed into the wrapped EmbeddedChannel. We need to ensure we still call channelInactive(...) in this case.
Modifications:
- Correctly forward Exceptions caused by cleanup()
- Ensure all content is released when cleanup() throws
- Add unit tests
Result:
Correctly handle the case when cleanup() throws.
Motivation:
https://github.com/netty/netty/issues/7418 reported an issue with writing a LastHttpContent with trailers set.
Modifications:
Add unit test to ensure this issue is fixed in latest netty release.
Result:
Ensure code is correct.
Modifications:
HttpConversionUtil#toLowercaseMap requires an intermediate List to be allocated. This can be avoided with the recently added value iterator methods.
Modifications:
- Use HttpHeaders#valueCharSequenceIterator instead of getAll
Result:
Less intermediate object allocation and copying.
Motivation:
Minor cleanup from 844d804 just to reduce the conditional statements and indentation level.
Modifications:
- combine the else + if into an else if statement
Result:
Code cleaned up.
Motivation:
HttpMethod#valueOf shows up on profiler results in the top set of
results. Since it is a relatively simple operation it can be improved in
isolation.
Modifications:
- Introduce a special case map which assigns each HttpMethod to a unique
index in an array and provides constant time lookup from a hash code
algorithm. When the bucket is matched we can then directly do equality
comparison instead of potentially following a linked structure when
HashMap has hash collisions.
Result:
~10% improvement in benchmark results for HttpMethod#valueOf
Benchmark Mode Cnt Score Error Units
HttpMethodMapBenchmark.newMapKnownMethods thrpt 16 31.831 ± 0.928 ops/us
HttpMethodMapBenchmark.newMapMixMethods thrpt 16 25.568 ± 0.400 ops/us
HttpMethodMapBenchmark.newMapUnknownMethods thrpt 16 51.413 ± 1.824 ops/us
HttpMethodMapBenchmark.oldMapKnownMethods thrpt 16 29.226 ± 0.330 ops/us
HttpMethodMapBenchmark.oldMapMixMethods thrpt 16 21.073 ± 0.247 ops/us
HttpMethodMapBenchmark.oldMapUnknownMethods thrpt 16 49.081 ± 0.577 ops/us
Motivation:
In order to determine if a header contains a value we currently rely
upon getAll(..) and regular expressions. This operation is commonly used
during the encode and decode stage to determine the transfer encoding
(e.g. HttpUtil#isTransferEncodingChunked). This operation requires an
intermediate collection and possibly regular expressions for the
CombinedHttpHeaders use case which can be expensive.
Modifications:
- Add a valuesIterator to HttpHeaders and specializations of this method
for DefaultHttpHeaders, ReadOnlyHttpHeaders, and CombinedHttpHeaders.
Result:
Less intermediate collections and allocation overhead when determining
if HttpHeaders contains a name/value pair.
Motivation:
DefaultHttp2FrameWriter#writeData allocates a DataFrameHeader for each write operation. DataFrameHeader maintains internal state and allocates multiple slices of a buffer which is a maximum of 30 bytes. This 30 byte buffer may not always be necessary and the additional slice operations can utilize retainedSlice to take advantage of pooled objects. We can also save computation and object allocations if there is no padding which is a common case in practice.
Modifications:
- Remove DataFrameHeader
- Add a fast path for padding == 0
Result:
Less object allocation in DefaultHttp2FrameWriter
Motivation:
DN resolution does not fall back to the "original name" lookup after search list is checked. This results in a failure to resolve any name (outside of search list) that has number of dots less than resolv.conf's ndots value (which, for example, is often the case in the context of Kubernetes where kubelet passes on resolv.conf containing "options ndots:5").
It also does not go through the search list in a situation described in resolv.conf man:
"The default for n[dots] is 1, meaning that if there are any dots in a name, the name will be tried first as an absolute name before any search list elements are appended to it."
Modifications:
DnsNameResolverContext::resolve was updated to match Go's https://github.com/golang/go/blob/release-branch.go1.9/src/net/dnsclient_unix.go#L338 logic.
Result:
DnsNameResolverContext::resolve will now try to resolve "original name" if search list yields no results when number of dots in the original name is less than resolv.conf's ndots value. It will also go through the search list in case "origin name" resolution fails and number of dots is equal or larger than resolv.conf's ndots value.
Motivation:
AbstractByteBuf#readSlice relied upon the bounds checking of the slice operation in order to detect index out of bounds conditions. However the slice bounds checking operation allows for the slice to go beyond the writer index, and this is out of bounds for a read operation.
Modifications:
- AbstractByteBuf#readSlice and AbstractByteBuf#readRetainedSlice should ensure the desired amount of bytes are readable before taking a slice
Result:
No reading of undefined data in AbstractByteBuf#readSlice and AbstractByteBuf#readRetainedSlice.
Motivation:
If the HttpUtil class is initialized before HttpHeaders or
EmptyHttpHeaders, EmptyHttpHeaders.INSTANCE will be null. This
can lead to NPEs in code that relies on this field being
non-null. One example is the
LastHttpContent.EMPTY_LAST_CONTENT.trailingHeaders method.
Modifications:
- Move HttpUtil.EMPTY_HEADERS to a private static final inner class
of EmptyHttpHeaders called InstanceInitializer.
- Add tests, that when run in isolation, validate the fix for the issue.
Result:
Any initialization order of HttpUtil, EmptyHttpHeaders or
HttpHeaders will result in EmptyHttpHeaders.INSTANCE being initialized
correctly.
Motivation:
Netty could handle "connection" or "te" headers more gently when
converting from http/1.1 to http/2 headers. Http/2 headers don't
support single-hop headers, so when we convert from http/1.1 to http/2,
we should drop all single-hop headers. This includes headers like
"transfer-encoding" and "connection", but also the headers that
"connection" points to, since "connection" can be used to designate
other headers as single-hop headers. For the "te" header, we can more
permissively convert it by just dropping non-conforming headers (ie
non-"trailers" headers) which is what we do for all other headers when
we convert.
Modifications:
Add a new blacklist to the http/1.1 to http/2 conversion, which is
constructed from the values of the "connection" header, and stop
throwing an exception when a "te" header is passed with a non-"trailers"
value. Instead, drop all values except for "trailers". Add unit tests
for "connection" and "te" headers when converting from http/1.1 to http/2.
Result:
This will improve the h2c upgrade request, and also conversions from
http/1.1 to http/2. This will simplify implementing spec-compliant
http/2 servers that want to share code between their http/1.1 and http/2
implementations.
[Fixes#7355]
Motivation:
We should not fire a SslHandshakeEvent if the channel is closed but the handshake was not started.
Modifications:
- Add a variable to SslHandler which tracks if an handshake was started yet or not and depending on this fire the event.
- Add a unit test
Result:
Fixes [#7262].
Motivation:
If a user calls EpollSocketChannelConfig.getOptions() and TCP_FASTOPEN_CONNECT is not supported we throw an exception.
Modifications:
- Just return 0 if ENOPROTOOPT is set.
- Add testcase
Result:
getOptions() works as epxected.
Motivation:
The behavior for SelectorFailureBehavior and SelectedListenerFailureBehavior enum values are not clear. Additional comments would clarify the expected behavior.
Modifications:
- Add comments for each value in SelectedListenerFailureBehavior and SelectorFailureBehavior which clarify the expected behavior
Result:
The behavior of SelectedListenerFailureBehavior and SelectorFailureBehavior are more clearly communicated.
Motivation:
According to RFC 7231 the server may choose to:
```
indicate a zero-length payload for the response by including a
Transfer-Encoding header field with a value of chunked and a message
body consisting of a single chunk of zero-length
```
https://tools.ietf.org/html/rfc7231#page-53
In such cases the exception below appears during decoding phase:
```
java.lang.IllegalArgumentException: invalid version format: 0
at io.netty.handler.codec.http.HttpVersion.<init>(HttpVersion.java:121)
at io.netty.handler.codec.http.HttpVersion.valueOf(HttpVersion.java:76)
at io.netty.handler.codec.http.HttpResponseDecoder.createMessage(HttpResponseDecoder.java:118)
at io.netty.handler.codec.http.HttpObjectDecoder.decode(HttpObjectDecoder.java:219)
```
Modifications:
HttpObjectDecoder.isContentAlwaysEmpty specifies content NOT empty
when 205 Reset Content response
Result:
There is no `IllegalArgumentException: invalid version format: 0`
when handling 205 Reset Content response with transfer-encoding
Motivation:
`FixedChannelPool` allows users to configure `acquireTimeoutMillis`
and expects given value to be greater or equal to zero when timeout
action is supplied. However, validation error message said that
value is expected to be greater or equal to one. Code performs
check against zero.
Modifications:
Changed error message to say that value greater or equal to
zero is expected. Added test to check that zero is an acceptable
value.
Result:
Exception with right error message is thrown.
Motivation:
We should not try to use UnixResolverDnsServerAddressStreamProvider when on Windows as it will log some error that will produce noise and may confuse users.
Modifications:
Just use DefaultDnsServerAddressStreamProvider if windows is used.
Result:
Less noise in the logs. This was reported in vert.x: https://github.com/eclipse/vert.x/issues/2204
Motivation:
`AbstractScheduledEventExecutor` uses a standard `java.util.PriorityQueue` to keep track of task deadlines. `ScheduledFuture.cancel` removes tasks from this `PriorityQueue`. Unfortunately, `PriorityQueue.remove` has `O(n)` performance since it must search for the item in the entire queue before removing it. This is fast when the future is at the front of the queue (e.g., already triggered) but not when it's randomly located in the queue.
Many servers will use `ScheduledFuture.cancel` on all requests, e.g., to manage a request timeout. As these cancellations will be happen in arbitrary order, when there are many scheduled futures, `PriorityQueue.remove` is a bottleneck and greatly hurts performance with many concurrent requests (>10K).
Modification:
Use netty's `DefaultPriorityQueue` for scheduling futures instead of the JDK. `DefaultPriorityQueue` is almost identical to the JDK version except it is able to remove futures without searching for them in the queue. This means `DefaultPriorityQueue.remove` has `O(log n)` performance.
Result:
Before - cancelling futures has varying performance, capped at `O(n)`
After - cancelling futures has stable performance, capped at `O(log n)`
Benchmark results
After - cancelling in order and in reverse order have similar performance within `O(log n)` bounds
```
Benchmark (num) Mode Cnt Score Error Units
ScheduledFutureTaskBenchmark.cancelInOrder 100 thrpt 20 137779.616 ± 7709.751 ops/s
ScheduledFutureTaskBenchmark.cancelInOrder 1000 thrpt 20 11049.448 ± 385.832 ops/s
ScheduledFutureTaskBenchmark.cancelInOrder 10000 thrpt 20 943.294 ± 12.391 ops/s
ScheduledFutureTaskBenchmark.cancelInOrder 100000 thrpt 20 64.210 ± 1.824 ops/s
ScheduledFutureTaskBenchmark.cancelInReverseOrder 100 thrpt 20 167531.096 ± 9187.865 ops/s
ScheduledFutureTaskBenchmark.cancelInReverseOrder 1000 thrpt 20 33019.786 ± 4737.770 ops/s
ScheduledFutureTaskBenchmark.cancelInReverseOrder 10000 thrpt 20 2976.955 ± 248.555 ops/s
ScheduledFutureTaskBenchmark.cancelInReverseOrder 100000 thrpt 20 362.654 ± 45.716 ops/s
```
Before - cancelling in order and in reverse order have significantly different performance at higher queue size, orders of magnitude worse than the new implementation.
```
Benchmark (num) Mode Cnt Score Error Units
ScheduledFutureTaskBenchmark.cancelInOrder 100 thrpt 20 139968.586 ± 12951.333 ops/s
ScheduledFutureTaskBenchmark.cancelInOrder 1000 thrpt 20 12274.420 ± 337.800 ops/s
ScheduledFutureTaskBenchmark.cancelInOrder 10000 thrpt 20 958.168 ± 15.350 ops/s
ScheduledFutureTaskBenchmark.cancelInOrder 100000 thrpt 20 53.381 ± 13.981 ops/s
ScheduledFutureTaskBenchmark.cancelInReverseOrder 100 thrpt 20 123918.829 ± 3642.517 ops/s
ScheduledFutureTaskBenchmark.cancelInReverseOrder 1000 thrpt 20 5099.810 ± 206.992 ops/s
ScheduledFutureTaskBenchmark.cancelInReverseOrder 10000 thrpt 20 72.335 ± 0.443 ops/s
ScheduledFutureTaskBenchmark.cancelInReverseOrder 100000 thrpt 20 0.743 ± 0.003 ops/s
```
Motivation:
At the moment use loops to run SslHandler tests with different SslProviders which is error-prone and also make it difficult to understand with which provider these failed.
Modifications:
- Move unit tests that should run with multiple SslProviders to extra class.
- Use junit Parameterized to run with different SslProvider combinations
Result:
Easier to understand which SslProvider produced test failures
Motivation:
When calling CompositeBytebuf.copy() and copy(...) we currently use Unpooled to allocate the buffer. This is not really correct and may produce more GC then needed. We should use the allocator that was used when creating the CompositeByteBuf to allocate the new buffer which may be for example the PooledByteBufAllocator.
Modifications:
- Use alloc() to allocate the new buffer.
- Add tests
- Fix tests that depend on the copy to be backed by an byte-array without checking hasArray() first.
Result:
Fixes [#7393].