Motivation:
I am receiving a multipart/form_data upload from a Mailgun webhook. This webhook used to send parts like this:
--74e78d11b0214bdcbc2f86491eeb4902
Content-Disposition: form-data; name="attachment-2"; filename="attached_�айл.txt"
Content-Type: text/plain
Content-Length: 32
This is the content of the file
--74e78d11b0214bdcbc2f86491eeb4902--
but now it posts parts like this:
--74e78d11b0214bdcbc2f86491eeb4902
Content-Disposition: form-data; name="attachment-2"; filename*=utf-8''attached_%D1%84%D0%B0%D0%B9%D0%BB.txt
This is the content of the file
--74e78d11b0214bdcbc2f86491eeb4902--
This new format uses field parameter encoding described in RFC 5987. More about this encoding can be found here.
Netty does not parse this format. The result is the filename is not decoded and the part is not parsed into a FileUpload.
Modification:
Added failing test in HttpPostRequestDecoderTest.java and updated HttpPostMultipartRequestDecoder.java
Refactored to please Netkins
Result:
Fixes:
HttpPostMultipartRequestDecoder identifies the RFC 5987 format and parses it.
Previous functionality is retained.
Motivation:
Before this commit, it is impossible to access the path component of the
URI before it has been decoded. This makes it impossible to distinguish
between the following URIs:
/user/title?key=value
/user%2Ftitle?key=value
The user could already access the raw uri value, but they had to calculate
pathEndIdx themselves, even though it might already be cached inside
QueryStringDecoder.
Result:
The user can easily and efficiently access the undecoded path and query.
Motivation:
An `origin`/`sec-websocket-origin` header value in websocket client is filling incorrect in some cases:
- Hostname is not converting to lower-case as prescribed by RFC 6354 (see [1]).
- Selecting a `http` scheme when source URI has `wss`/`https` scheme and non-standard port.
Modifications:
- Convert uri-host to lower-case.
- Use a `https` scheme if source URI scheme is `wss`/`https`, or if source scheme is null and port == 443.
Result:
Correct filling an `origin` header for WS client.
[1] https://tools.ietf.org/html/rfc6454#section-4
Motivation:
Even if it's a super micro-optimization (most JVM could optimize such
cases in runtime), in theory (and according to some perf tests) it
may help a bit. It also makes a code more clear and allows you to
access such methods in the test scope directly, without instance of
the class.
Modifications:
Add 'static' modifier for all methods, where it possible. Mostly in
test scope.
Result:
Cleaner code with proper 'static' modifiers.
Motivation:
During code read of the Netty codebase I noticed that the Netty
HttpServerUpgradeHandler unconditionally sets a Content-Length: 0
header on 101 Switching Protocols responses. This explicitly
contravenes RFC 7230 Section 3.3.2 (Content-Length), which notes
that:
A server MUST NOT send a Content-Length header field in any
response with a status code of 1xx (Informational) or 204
(No Content).
While it is unlikely that any client will ever be confused by
this behaviour, there is no reason to contravene this part of the
specification.
Modifications:
Removed the line of code setting the header field and changed the
only test that expected it to be there.
Result:
When performing the server portion of HTTP upgrade, the 101
Switching Protocols response will no longer contain a
Content-Length: 0 header field.
Motivation:
#7269 removed an unnecessary instanciation for verifying WebSocket
handshake status code.
But it uses a hardcoded status code value for 101 instead of using the
intended `HttpResponseStatus#SWITCHING_PROTOCOLS` constant.
Modidication:
Compare actual `HttpResponseStatus` against predefined constant. Note
that `HttpResponseStatus#equals` is implemented in respect with the RFC
(only honor code, not text) so it’s intended to be used this way.
Result:
Cleaner code, use intended constant instead of hard coded value.
Motivation:
https://github.com/netty/netty/issues/7253
Modifications:
Adding `Content-Length: 0` to `CorsHandler.forbidden()` and `CorsHandler.handlePreflight()`
Result:
Contexts that are terminated by the CorsHandler will always include a Content-Length header
Motivation:
- In the `HttpResponseStatus#equals` checks only status code. No need to create new instance of `HttpResponseStatus` for comparison with response status.
- The RFC says: `the HTTP version and reason phrase aren't important` [1].
Modifications:
Use comparison by status code without creating new `HttpResponseStatus`.
Result:
Less allocations, more clear code.
[1] https://tools.ietf.org/html/draft-ietf-hybi-thewebsocketprotocol-00
Motivation:
We need to ensure we not write any body when a response with status code of 1xx, 204 or 304 is used as stated in rfc:
https://tools.ietf.org/html/rfc7230#section-3.3.3
Modifications:
- Correctly handle status codes
- Add unit tests
Result:
Correctly handle responses with 1xx, 204, 304 status codes.
Motivation:
HttpObjectEncoder allocates a new buffer when encoding the initial line and headers, and also allocates a buffer when encoding the trailers. The allocation always uses the default size of 256. This may lead to consistent under allocation and require a few resize/copy operations which can cause GC/memory pressure.
Modifications:
- Introduce a weighted average which tracks the historical size of encoded data and uses this as an estimate for future buffer allocations
Result:
Better approximation of buffer sizes.
Motivation:
ServerCookieEncoder’s javadoc contains some invalid copy-pasting from
ClientCookieEncoder.
Modifications:
* As per RFC6265, multiple cookies are sent as separate Set-Cookie
response headers.
* Fix code sample
Result:
Proper javadoc
This reverts commit d63bb4811e as this not covered correctly all cases and so could lead to missing fireChannelReadComplete() calls. We will re-evalute d63bb4811e and resbumit a pr once we are sure all is handled correctly
Motivation:
The `AsciiString#toString` method calculate string value and cache it into field. If an `AsciiString` created from the `String` value, we can avoid rebuilding strings if we cache them immediately when creating `AsciiString`. It would be useful for constants strings, which already stored in the JVMs string table, or in cases where an unavoidable `#toString `method call is assumed.
Modifications:
- Add new static method `AsciiString#cache(String)` which save string value into cache field.
- Apply a "benign" data race in the `#hashCode` and `#toString` methods.
Result:
Less memory usage in some `AsciiString` use cases.
Motivation:
HttpObjectAggregator differs from HttpServerExpectContinueHandler's handling
of expect headers by not stripping the 'expect' header when a response
is generated.
Modifications:
HttpObjectAggregator now removes the 'expect' header in cases where it generates
a response.
Result:
Consistent and correct behavior between HttpObjectAggregator and HttpServerExpectContinueHandler.
Motivation:
Issue #6695 states that there is an issue when writing empty content via HttpResponseEncoder.
Modifications:
Add two test-cases.
Result:
Verified that all works as expected.
Motivation:
Its wasteful and also confusing that channelReadComplete() is called even if there was no message forwarded to the next handler.
Modifications:
- Only call ctx.fireChannelReadComplete() if at least one message was decoded
- Add unit test
Result:
Less confusing behavior. Fixes [#4312].
Motivation:
08748344d8 introduced two new tests which did not take into account that the multipart delimiter can be between 2 and 16 bytes long.
Modifications:
Take the multipart delimiter length into account.
Result:
Fixes [#7001]
Motivation:
codec-http2 currently does not strictly enforce the HTTP/1.x semantics with respect to the number of headers defined in RFC 7540 Section 8.1 [1]. We currently don't validate the number of headers nor do we validate that the trailing headers should indicate EOS.
[1] https://tools.ietf.org/html/rfc7540#section-8.1
Modifications:
- DefaultHttp2ConnectionDecoder should only allow decoding of a single headers and a single trailers
- DefaultHttp2ConnectionEncoder should only allow encoding of a single headers and optionally a single trailers
Result:
Constraints of RFC 7540 restricting the number of headers/trailers is enforced.
Motivation:
HttpPostRequestEncoder maintains an internal buffer that holds the
current encoded data. There are use cases when this internal buffer
becomes null, the next chunk processing implementation should take
this into consideration.
Modifications:
- When preparing the last chunk if currentBuffer is null, mark
isLastChunkSent as true and send LastHttpContent.EMPTY_LAST_CONTENT
- When calculating the remaining size take into consideration that the
currentBuffer might be null
- Tests are based on those provided in the issue by @nebhale and @bfiorini
Result:
Fixes#5478
Motivation:
- A `HttpPostMultipartRequestDecoder` contains two pairs of the same methods: `readFileUploadByteMultipartStandard`+`readFileUploadByteMultipart` and `loadFieldMultipartStandard`+`loadFieldMultipart`.
- These methods use `NotEnoughDataDecoderException` to detecting not last data chunk (exception handling is very expensive).
- These methods can be greatly simplified.
- Methods `loadFieldMultipart` and `loadFieldMultipartStandard` has an unnecessary catching for the `IndexOutOfBoundsException`.
Modifications:
- Remove duplicate methods.
- Replace handling `NotEnoughDataDecoderException` by the return of a boolean result.
- Simplify code.
Result:
The code is cleaner and easier to support. Less exception handling logic.
Motivation:
1. Some encoders used a `ByteBuf#writeBytes` to write short constant byte array (2-3 bytes). This can be replaced with more faster `ByteBuf#writeShort` or `ByteBuf#writeMedium` which do not access the memory.
2. Two chained calls of the `ByteBuf#setByte` with constants can be replaced with one `ByteBuf#setShort` to reduce index checks.
3. The signature of method `HttpHeadersEncoder#encoderHeader` has an unnecessary `throws`.
Modifications:
1. Use `ByteBuf#writeShort` or `ByteBuf#writeMedium` instead of `ByteBuf#writeBytes` for the constants.
2. Use `ByteBuf#setShort` instead of chained call of the `ByteBuf#setByte` with constants.
3. Remove an unnecessary `throws` from `HttpHeadersEncoder#encoderHeader`.
Result:
A bit faster writes constants into buffers.
Motivation:
Right now HttpRequestEncoder does insertion of slash for url like http://localhost?pararm=1 before the question mark. It is done not effectively.
Modification:
Code:
new StringBuilder(len + 1)
.append(uri, 0, index)
.append(SLASH)
.append(uri, index, len)
.toString();
Replaced with:
new StringBuilder(uri)
.insert(index, SLASH)
.toString();
Result:
Faster HttpRequestEncoder. Additional small test. Attached benchmark in PR.
Benchmark Mode Cnt Score Error Units
HttpRequestEncoderInsertBenchmark.newEncoder thrpt 40 3704843.303 ± 98950.919 ops/s
HttpRequestEncoderInsertBenchmark.oldEncoder thrpt 40 3284236.960 ± 134433.217 ops/s
Motivation:
1. `ByteBuf` contains methods to writing `CharSequence` which optimized for UTF-8 and ASCII encodings. We can also apply optimization for ISO-8859-1.
2. In many places appropriate methods are not used.
Modifications:
1. Apply optimization for ISO-8859-1 encoding in the `ByteBuf#setCharSequence` realizations.
2. Apply appropriate methods for writing `CharSequences` into buffers.
Result:
Reduce overhead from string-to-bytes conversion.
Motivation:
These headers can be used to prevent clickjacking.
Modifications:
Add static fields for content-security-policy and x-frame-options
Result:
Expose general useful names
Motivation:
A life cycle of QueryStringEncoder is simple: create, append params, convert to String. Current realization collect params in the list, and calculate an URI string in `toString` method. We can simplify this: don't store params to the list, and immediately append parameters to the `StringBuilder`.
Modifications:
- Remove list for params and remove a tuple class `Param`.
- Use one common `StringBuilder` and append parameters into it.
- Resolve `TODO` in the `encodeParam` method.
Result:
Less allocations (no `ArrayList`, no `Param` tuples). Second `toString` call is faster.
Motivation:
PR #6811 introduced a public utility methods to decode hex dump and its parts, but they are not visible from netty-common.
Modifications:
1. Move the `decodeHexByte`, `decodeHexDump` and `decodeHexNibble` methods into `StringUtils`.
2. Apply these methods where applicable.
3. Remove similar methods from other locations (e.g. `HpackHex` test class).
Result:
Less code duplication.
Motivation:
If the content-length does not parse as a number, leniency causes this
to instead be parsed as the default value. This leads to bodies being
silently ignored on requests which can be incredibly dangerous. Instead,
if the content-length header is invalid, an exception should be thrown
for upstream handling.
Modifications:
This commit removes the leniency in parsing the content-length header by
allowing a number format exception, if thrown, to escape from the method
rather than falling back to the default value.
Result:
In invalid content-length header will not be silently ignored.
Motivation:
For multi-line headers HttpObjectDecoder uses StringBuilder.append(a).append(b) pattern that could be easily replaced with regular a + b. Also oparations with a and b moved out from concat operation to make it friendly for StringOptimizeConcat optimization and thus - faster.
Modification:
StringBuilder.append(a).append(b) reaplced with a + b. Operations with a and b moved out from concat oparation.
Result:
Code simpler to read and faster.
Motivation:
The class `HttpPostRequestEncoder` has minor issues:
- The `encodeNextChunkMultipart()` method contains two identical blocks of code with a difference only in the cast interfaces: `Attribute` vs `HttpData`. Because the `Attribute` is extended by `HttpData`, the block with the `Attribute` can be safely deleted.
- The `getNewMultipartDelimiter()` method contains a redundant `toLowerCase()`.
- The `addBodyFileUploads()` method throws `NPE` instead of `IllegalArgumentException`.
Modifications:
- Remove duplicated code block from `encodeNextChunkMultipart()`.
- Remove redundant `toLowerCase()` from `getNewMultipartDelimiter()`.
- Replace `NPE` with `IllegalArgumentException` in `addBodyFileUploads()`.
- Use `ObjectUtil#checkNotNull` where possible.
Result:
More correct and clean code.
Motivations:
1. There are duplicated implementations of decoding hex strings. #6797
2. ByteBufUtil.HexUtil.decodeHexDump does not handle substring start
index properly and does not decode hex byte rigorously.
Modifications:
1. Function decodeHexByte is moved from QueryStringDecoder into ByteBufUtil.
2. ByteBufUtil.HexUtil.decodeHexDump is changed to use decodeHexByte.
3. Tests are Updated accordingly.
Result:
Fixed#6797 and made hex decoding functions more robust.
Motivation:
A `SeekAheadNoBackArrayException` used as check for `ByteBuf#hasArray`. The catch of exceptions carries a large overhead on stack trace filling, and this should be avoided.
Modifications:
- Remove the class `SeekAheadNoBackArrayException` and replace its usage with `if` statements.
- Use methods from `ObjectUtils` for better readability.
- Make private methods static where it make sense.
- Remove unused private methods.
Result:
Less of exception handling logic, better performance.
Motivation:
If a full HttpResponse with a Content-Length header is encoded by the HttpContentEncoder subtypes the Content-Length header is removed and the message is set to Transfer-Encoder: chunked. This is an unnecessary loss of information about the message content.
Modifications:
- If a full HttpResponse has a Content-Length header, the header is adjusted after encoding.
Result:
Complete messages continue to have the Content-Length header after encoding.
Motivation:
QueryStringDecoder has several problems:
- doesn't decode correctly path part with `+` (plus) sign in it,
- doesn't cut a `fragment` (after `#`) from query string (see RFC 3986),
- doesn't work correctly with encoding,
- treat `%%` as a percent character escaping (it's don't described in RFC).
Modifications:
- leave `+` chars in a `path` part of uri string,
- ignore `fragment` part (after `#`),
- correctly work with encoding.
- don't treat `%%` as escaping for the `%`.
Result:
Fixed issues from #6745.
Motivation:
Allow subclasses of HttpObjectEncoder other than HttpServerCodec to override the isContentAlwaysEmpty method
Modification:
Change the method visibility from package private to protected
Result:
Fixes#6761
Motivation:
Fix the regression recently introduced that causes already encoded responses to be encoded again as gzip
Modification:
instead of just looking for IDENTITY, anything set for Content-Encoding should be respected and left as-is
added unit tests to capture this use case
Result:
Fixes#6784
Motivation
RFC 1945 (see section 3.1) says that request lines may not have a version in which case the request is assumed to be HTTP/0.9. We don't necessarily want to support that but the existing Exception should indicate the possibility of the request being HTTP/0.9 and give the user a chance to track it down.
Modifications
Indicate in the Exception's message that the request is possibly HTTP/0.9.
Result
Fixes#6739
Motivation:
The status 308 is defined by RFC7538.
This RFC has currently the state Proposed Standard since 2 years, but the status code is already handle by all browsers (Chrome, Firefox, Edge, Safari, …).
To let developer handles easily this status code, it is added into this list.
Modifications:
Added this status code in the list of all status codes and changed the valudOf() method
Result:
Status code 308 included