Motivation:
We used Recycler for CodecOutputList, but Recycler is not optimized for the case where an object is always accessed from the same Thread.
Modifications:
- Use FastThreadLocal for CodecOutputList
- Add benchmark
Result:
Less overhead in our codecs.
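A minimal sketch of the per-thread reuse idea, assuming java.lang.ThreadLocal as a stand-in for Netty's FastThreadLocal. The class name `OutputListCache` and its `acquire()` method are hypothetical and are not the actual CodecOutputList code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not Netty's CodecOutputList.
final class OutputListCache {
    // One reusable list per thread. java.lang.ThreadLocal stands in for FastThreadLocal.
    private static final ThreadLocal<List<Object>> CACHE =
            ThreadLocal.withInitial(() -> new ArrayList<>(16));

    private OutputListCache() { }

    // Returns the calling thread's list, emptied and ready for reuse.
    static List<Object> acquire() {
        List<Object> list = CACHE.get();
        list.clear();
        return list;
    }

    public static void main(String[] args) {
        List<Object> first = acquire();
        first.add("decoded-message");
        List<Object> second = acquire();
        // Both calls on the same thread hand back the same instance.
        System.out.println(first == second);
    }
}
```

The sketch ignores nested (re-entrant) use, where one decoder runs while another still holds the list; a real implementation has to handle that case.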
Motivation:
According to RFC 1952, concatenation of valid gzip streams is also a valid gzip stream. JdkZlibDecoder only processed the first and discarded the rest.
Modifications:
- Introduced a constructor argument decompressConcatenated. If it is true, JdkZlibDecoder continues to process the stream after the first gzip member.
Result:
- If 'decompressConcatenated = true', concatenated streams are processed in
compliance with RFC 1952.
- If 'decompressConcatenated = false' (default), the existing behavior remains.
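The RFC 1952 property can be illustrated with the JDK's own java.util.zip classes rather than JdkZlibDecoder. This is a sketch only: the class and helper names are hypothetical, and it relies on GZIPInputStream reading past the first member, which is how current JDKs behave.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class; it does not use any Netty API.
final class ConcatenatedGzip {
    private ConcatenatedGzip() { }

    // Compresses the text into a single, complete gzip member.
    static byte[] gzip(String text) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(bytes)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Joins two byte arrays back to back.
    static byte[] concat(byte[] a, byte[] b) {
        ByteArrayOutputStream joined = new ByteArrayOutputStream();
        joined.write(a, 0, a.length);
        joined.write(b, 0, b.length);
        return joined.toByteArray();
    }

    // Inflates every member found in the input and returns the combined text.
    static String gunzipAll(byte[] data) {
        ByteArrayOutputStream result = new ByteArrayOutputStream();
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(data))) {
            byte[] chunk = new byte[256];
            int n;
            while ((n = in.read(chunk)) != -1) {
                result.write(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(result.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] stream = concat(gzip("hello "), gzip("world"));
        System.out.println(gunzipAll(stream));
    }
}
```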
Motivation:
Use the current links to the new locations of the Protobuf repo and documentation to
avoid problems if the redirects stop working.
Modification:
Update the links in comments and in all/pom.xml.
Result:
Correct links to Protobuf resources
Motivation:
Without a 'serialVersionUID' field, any change to a class will make
previously serialized versions unreadable.
Modifications:
Add the missing 'serialVersionUID' field to all Serializable
classes.
Result:
Proper deserialization of previously serialized objects.
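For illustration, a minimal Serializable class with an explicit 'serialVersionUID'. The `Settings` class and the `roundTrip` helper are hypothetical and are not taken from the Netty code base.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

final class SerialVersionExample {
    private SerialVersionExample() { }

    // Hypothetical class. The explicit field pins the stream version,
    // so compatible changes to the class do not alter it.
    static final class Settings implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;

        Settings(String name) {
            this.name = name;
        }
    }

    // Serializes the object to bytes and reads it back.
    static Settings roundTrip(Settings original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                         new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Settings) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(new Settings("codec")).name);
    }
}
```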
Motivation: Today when Netty encounters a general error while decoding
it treats this as a decoder exception. However, for fatal causes this
should not be treated as such, instead the fatal error should be carried
up the stack without the callee having to unwind causes. This was
probably done for byte to byte message decoder but is now done for all
decoders.
Modifications: Instead of translating any error to a decoder exception,
we let those unwind out the stack (note that finally blocks still
execute) except in places where an event needs to fire where we fire
with the error instead of wrapping in a decoder exception.
Result: Fatal errors will not be treated as innocent decoder exceptions.
Motivation: Today when Netty encounters a general error while decoding
it treats this as a decoder exception. However, for fatal causes this
should not be treated as such, instead the fatal error should be carried
up the stack without the callee having to unwind causes.
Modifications: Instead of translating any error to a decoder exception,
we let those unwind out the stack (note that finally blocks still
execute).
Result: Fatal errors will not be treated as innocent decoder exceptions.
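A sketch of the resulting control flow, assuming a hypothetical `DecoderException` and `Decoder` interface as stand-ins for the Netty types: ordinary exceptions are wrapped, while an Error is not caught and unwinds unchanged.

```java
final class DecodeErrorHandling {
    private DecodeErrorHandling() { }

    // Hypothetical stand-in for Netty's DecoderException.
    static final class DecoderException extends RuntimeException {
        private static final long serialVersionUID = 1L;

        DecoderException(Throwable cause) {
            super(cause);
        }
    }

    // Hypothetical decoder callback.
    interface Decoder {
        void decode() throws Exception;
    }

    static void callDecode(Decoder decoder) {
        try {
            decoder.decode();
        } catch (DecoderException e) {
            throw e;                         // already the right type
        } catch (Exception e) {
            throw new DecoderException(e);   // ordinary failures are wrapped
        } finally {
            // Cleanup placed here still runs while an Error unwinds the stack.
        }
        // There is no catch (Throwable): an Error such as OutOfMemoryError propagates as-is.
    }
}
```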
Motivation:
A large frame is composed of many packets. Every time a packet
arrives, findEndOfLine is called from the start of the buffer, which
makes the complexity of reading a frame O(n^2). This can be
avoided by using an offset to mark the last scan position: when a new
packet arrives, search for the delimiter starting from the mark. The
complexity is then O(n).
Modification:
Add an offset to mark the last scan position.
Result:
Better performance when reading large frames.
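The offset idea can be sketched independently of ByteBuf. The `LineScanner` class below is hypothetical and uses a StringBuilder as the accumulation buffer; it only shows how remembering the last scan position avoids rescanning bytes that were already checked.

```java
// Hypothetical illustration of incremental delimiter scanning.
final class LineScanner {
    private final StringBuilder buffer = new StringBuilder();
    // Index at which the next scan resumes; everything before it is known to hold no '\n'.
    private int offset;

    // Appends newly arrived data. Returns one complete line (without '\n'),
    // or null if no delimiter has arrived yet.
    String feed(String chunk) {
        buffer.append(chunk);
        for (int i = offset; i < buffer.length(); i++) {
            if (buffer.charAt(i) == '\n') {
                String line = buffer.substring(0, i);
                buffer.delete(0, i + 1);
                offset = 0;          // the remaining data has not been scanned yet
                return line;
            }
        }
        offset = buffer.length();    // remember how far we got
        return null;
    }

    public static void main(String[] args) {
        LineScanner scanner = new LineScanner();
        scanner.feed("ab");
        scanner.feed("cd");
        System.out.println(scanner.feed("e\nf"));
    }
}
```

For brevity the sketch returns at most one line per call; data after the delimiter stays buffered until the next call.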
Motivation:
The decode method is too large to be inlined with default compiler settings, hence the uncommon paths need to be packed and moved away from the common one.
Modifications:
The uncommon paths of the decode call (e.g. failures that throw exceptions) are packed and moved into private methods in order to reduce the size of the common path
and allow it to be inlined.
Result:
The decode method is being inlined if the stack depth allows it.
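The general pattern, shown on a hypothetical length-prefix reader rather than on the actual decode method: the hot method keeps only the common path, and the exception construction lives in a separate private method. Whether the JIT actually inlines the hot method depends on its settings and the call site.

```java
// Hypothetical illustration of moving the uncommon path out of a hot method.
final class HotPathSplit {
    private HotPathSplit() { }

    // Common path: kept small so its bytecode size stays low.
    static int readLength(byte[] frame) {
        if (frame.length < 4) {
            throw tooShort(frame.length);
        }
        return (frame[0] & 0xFF) << 24
                | (frame[1] & 0xFF) << 16
                | (frame[2] & 0xFF) << 8
                | (frame[3] & 0xFF);
    }

    // Uncommon path: the message building happens outside the hot method.
    private static IllegalArgumentException tooShort(int actual) {
        return new IllegalArgumentException(
                "frame too short: " + actual + " bytes, need at least 4");
    }
}
```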
Motivation:
Continuing to make netty happy when compiling through errorprone.
Modification:
Mostly comments, some minor switch statement changes.
Result:
No more compiler errors!
This reverts commit d63bb4811e as it did not correctly cover all cases and so could lead to missing fireChannelReadComplete() calls. We will re-evaluate d63bb4811e and resubmit a PR once we are sure everything is handled correctly.
Motivation:
It's wasteful and also confusing that channelReadComplete() is called even if there was no message forwarded to the next handler.
Modifications:
- Only call ctx.fireChannelReadComplete() if at least one message was decoded
- Add unit test
Result:
Less confusing behavior. Fixes [#4312].
Motivation:
1. The hash function in the Snappy encoding is probably wrong: it uses '+' instead of '*'. See the reference implementation [1].
2. Size of the hash table is calculated, but not applied.
Modifications:
1. Fix hash function: replace addition by multiplication.
2. Allocate hash table with calculated size.
3. Use an `Integer.numberOfLeadingZeros` trick to calculate log2.
4. Release buffers in tests.
Result:
1. Better compression. In the test `encodeAndDecodeLongTextUsesCopy`, the compressed size is now 175 instead of 180 before this change.
2. No redundant allocations for hash table.
3. Slightly faster calculation of the shift (fewer expensive math operations).
[1] 513df5fb5a/snappy.cc (L67)
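A sketch of the two calculations, not the actual Snappy class: the helper names are hypothetical, the table size is assumed to be a power of two, and the multiplier 0x1e35a7bd is assumed to match the constant in the reference implementation.

```java
// Hypothetical illustration of the multiplicative hash and the log2 trick.
final class SnappyHashSketch {
    private SnappyHashSketch() { }

    // floor(log2(n)) for n > 0, without floating-point math.
    static int log2Floor(int n) {
        return 31 - Integer.numberOfLeadingZeros(n);
    }

    // Shift that leaves exactly log2(tableSize) high bits of the product.
    // Assumes tableSize is a power of two.
    static int shiftFor(int tableSize) {
        return 32 - log2Floor(tableSize);
    }

    // Multiplicative hash: multiply (not add), then keep the top bits.
    static int hash(int fourBytes, int shift) {
        return (fourBytes * 0x1e35a7bd) >>> shift;
    }

    public static void main(String[] args) {
        int tableSize = 256;
        int shift = shiftFor(tableSize);
        // The result is always a valid index into a table of tableSize entries.
        System.out.println(hash(0x12345678, shift) < tableSize);
    }
}
```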
Motivation:
JdkZlibDecoder will allocate a new buffer when the previous buffer is filled with inflated data, but JZlibDecoder will attempt to use the same buffer by resizing. This leads to inconsistent results between these two decoders, which are intended to be functionally equivalent.
Modifications:
- JdkZlibDecoder should attempt to resize and reuse the existing buffer instead of creating multiple buffers
Result:
Fixes https://github.com/netty/netty/issues/6804
Motivation:
ByteToMessageDecoder#handlerRemoved will immediately release the cumulation buffer, but it is possible that a child class may still be using this buffer, and therefore use a released buffer.
Modifications:
- ByteToMessageDecoder#handlerRemoved and ByteToMessageDecoder#decode should coordinate to avoid the case where a child class is using the cumulation buffer but ByteToMessageDecoder releases that buffer.
Result:
Child classes of ByteToMessageDecoder are less likely to reference a released buffer.
Motivation:
We did not correctly guard against overflow, so calling Base64.encode(...) with a big buffer could lead to an overflow when calculating the size of the out buffer.
Modifications:
Correctly guard against overflow.
Result:
Fixes [#6620].
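A minimal sketch of such a guard, assuming plain Base64 without line breaks. The `encodedSize` helper is hypothetical and is not the actual Base64 code: it does the arithmetic in a long and rejects results that do not fit in an int.

```java
// Hypothetical illustration of an overflow-safe size calculation.
final class Base64SizeGuard {
    private Base64SizeGuard() { }

    // Encoded size for len input bytes: 4 output bytes per (padded) group of 3.
    static int encodedSize(int len) {
        if (len < 0) {
            throw new IllegalArgumentException("len must be >= 0: " + len);
        }
        // Computing in long avoids the int overflow for very large inputs.
        long size = ((long) len + 2) / 3 * 4;
        if (size > Integer.MAX_VALUE) {
            throw new IllegalArgumentException("encoded size does not fit in an int: " + size);
        }
        return (int) size;
    }
}
```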
Motivation:
If a read-only ByteBuf is passed to the ByteToMessageDecoder.channelRead(...) method, we need to make a copy of it once we try to merge buffers for cumulation. This usually is not the case, but it can happen, for example, if the local transport is used. This was the cause of the leak report we sometimes saw during the codec-http2 tests, as we are using the local transport and write a read-only buffer. This buffer is then passed to the peer channel and fired through the pipeline, and so ends up as the cumulation buffer in the ByteToMessageDecoder. Once the next fragment was received, we tried to merge the two and failed with a ReadOnlyBufferException, which then produced a leak.
Modifications:
Ensure we copy the buffer if it is read-only.
Result:
No more exceptions, and therefore no more leaks, when a read-only buffer is passed to ByteToMessageDecoder.channelRead(...).
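The copy-if-read-only rule can be sketched with java.nio.ByteBuffer as a stand-in for ByteBuf. The `merge` helper below is hypothetical and is not the actual cumulator; it only shows the check that avoids writing into a read-only buffer.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration using java.nio.ByteBuffer instead of Netty's ByteBuf.
final class ReadOnlyCumulation {
    private ReadOnlyCumulation() { }

    // Returns a buffer whose readable bytes are those of cumulation followed by those of in.
    static ByteBuffer merge(ByteBuffer cumulation, ByteBuffer in) {
        int free = cumulation.capacity() - cumulation.limit();
        if (cumulation.isReadOnly() || free < in.remaining()) {
            // A read-only (or too small) cumulation must be copied before appending.
            ByteBuffer copy = ByteBuffer.allocate(cumulation.remaining() + in.remaining());
            copy.put(cumulation.duplicate());
            copy.put(in);
            copy.flip();
            return copy;
        }
        // Writable with enough room: append in place after the readable bytes.
        int readPosition = cumulation.position();
        int writePosition = cumulation.limit();
        cumulation.limit(cumulation.capacity());
        cumulation.position(writePosition);
        cumulation.put(in);
        cumulation.limit(cumulation.position());
        cumulation.position(readPosition);
        return cumulation;
    }
}
```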
Motivation:
This pull request does not solve any problem, but we found that several links in the code refer to project websites under the domain http://code.google.com, which have either moved to GitHub or are no longer maintained.
Modification:
Update the project links from code.google.com to the relevant project in github.com
Motivation:
Base64#decode4to3 generally calculates an int value where the contents of the decodabet straddle bytes, and then uses a byte shifting or a full byte swapping operation to get the resulting contents. We can directly calculate the contents and avoid any intermediate int values and full byte swap operations. This will reduce the number of operations required during the decode operation.
Modifications:
- remove the intermediate int in the Base64#decode4to3 method.
- manually do the byte shifting since we are already doing bit/byte manipulations here anyways.
Result:
Base64#decode4to3 requires fewer operations to compute the end result.
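A sketch of computing the three output bytes directly from four 6-bit values, with no intermediate int holding all 24 bits. The method signature is hypothetical and differs from the real Base64#decode4to3, which works on a ByteBuf and a decodabet.

```java
// Hypothetical illustration; the inputs are already-decoded 6-bit values (0..63).
final class Decode4to3Sketch {
    private Decode4to3Sketch() { }

    static void decode4to3(int s0, int s1, int s2, int s3, byte[] dest, int offset) {
        // Byte 0: all 6 bits of s0 plus the top 2 bits of s1.
        dest[offset] = (byte) (s0 << 2 | s1 >>> 4);
        // Byte 1: the low 4 bits of s1 plus the top 4 bits of s2.
        dest[offset + 1] = (byte) ((s1 & 0x0F) << 4 | s2 >>> 2);
        // Byte 2: the low 2 bits of s2 plus all 6 bits of s3.
        dest[offset + 2] = (byte) ((s2 & 0x03) << 6 | s3);
    }
}
```

With the standard alphabet, "TWFu" maps to the values 19, 22, 5 and 46, which decode to the bytes of "Man".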
Motivation:
The decode and encode methods use getByte(...) and setByte(...) in loops, which can be very expensive because of bounds and reference-count checking. They also slow down considerably when paranoid leak detection is enabled, as each access is tracked.
Modifications:
- Pack bytes into int / short and so reduce operations on the ByteBuf
- Use ByteBufProcessor to reduce getByte calls.
Result:
Better performance in general. Also when you run the build with -Pleak the handler module will build in 1/4 of the time it took before.
Motivation:
We have our own ThreadLocalRandom implementation to support older JDKs. That said, we should prefer the JDK-provided implementation when running on JDK >= 7.
Modification:
Use the ThreadLocalRandom implementation of the JDK when possible.
Result:
Make use of JDK implementations when possible.
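For reference, using the JDK implementation looks as follows. The `RandomExample` class and `nextIndex` helper are hypothetical, and the sketch does not show the fallback logic for older JDKs.

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical illustration of the JDK-provided ThreadLocalRandom (JDK 7+).
final class RandomExample {
    private RandomExample() { }

    // Returns a pseudo-random index in [0, bound) from the calling thread's generator.
    static int nextIndex(int bound) {
        return ThreadLocalRandom.current().nextInt(bound);
    }
}
```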