Motivation:
Project Jigsaw in JDK9 has moved the direct byte buffer cleaner from
sun.misc.Cleaner to java.lang.ref.Cleaner$Cleanable. This cause the
current platform tests to throw a ClassNotFoundException, disabling the
use of direct byte buffer cleaners.
Modifications:
I use reflection to find the clean method in either sun.misc.Cleaner or
java.lang.ref.Cleaner$Cleanable.
Result:
Netty uses direct byte buffers on JDK9 as it already do on earlier JDKs.
Motivation:
In JDK9 heap byte buffers have an address field, so we have to remove
the current check as it is invalid in JDK9.
Modifications:
Removed the address field check for heap byte buffers.
Result:
Netty continues to find sun.misc.Unsafe in JDK9 as in previous JDKs.
Motivation:
Netty's platform dependent parts should know about JDK9.
Modifications:
JDK9 introduce Runtime$Version Runtime.version() which has an int major()
method that always return the major Java version. I call that method to
get the Java major version.
Result:
Netty will recognize all future JDK versions.
Motivation:
PlatformDependent attempts to use reflection to get the underlying char[] (or byte[]) from String objects. This is fragile as if the String implementation does not utilize the full array, and instead uses a subset of the array, this optimization is invalid. OpenJDK6 and some earlier versions of OpenJDK7 String have the capability to use a subsection of the underlying char[].
Modifications:
- PlatformDependent should not attempt to use the underlying array from String (or other data types) via reflection
Result:
PlatformDependent hash code generation for CharSequence does not depend upon specific JDK implementation details.
Motivation:
PR #5355 modified interfaces to reduce GC related to the HPACK code. However this came with an anticipated performance regression related to HpackUtil.equals due to AsciiString's increase cost of charAt(..). We should mitigate this performance regression.
Modifications:
- Introduce an equals method in PlatformDependent which doesn't leak timing information and use this in HpcakUtil.equals
Result:
Fixes https://github.com/netty/netty/issues/5436
Motivation:
It is good to have used dependencies and plugins up-to-date to fix any undiscovered bug fixed by the authors.
Modification:
Scanned dependencies and plugins and carefully updated one by one.
Result:
Dependencies and plugins are up-to-date.
Motivation:
To restrict the memory usage of a system it is sometimes needed to adjust the number of max pending tasks in the tasks queue.
Modifications:
- Add new constructors to modify the number of allowed pending tasks.
- Add system properties to configure the default values.
Result:
More flexible configuration.
Motivation:
We should merge ThrowableUtils into ThrowableUtil as this name is more consistent with the naming of utility classes in netty.
Modifications:
Merge classes.
Result:
More consistent naming
Motivation:
We use pre-instantiated exceptions in various places for performance reasons. These exceptions don't include a stacktrace which makes it hard to know where the exception was thrown. This is especially true as we use the same exception type (for example ChannelClosedException) in different places. Setting some StackTraceElements will provide more context as to where these exceptions original and make debugging easier.
Modifications:
Set a generated StackTraceElement on these pre-instantiated exceptions which at least contains the origin class and method name. The filename and linenumber are specified as unkown (as stated in the javadocs of StackTraceElement).
Result:
Easier to find the origin of a pre-instantiated exception.
Motivation:
In case of exception in invokeExceptionCaught() only original exception passed to invokeExceptionCaught() will be logged on any log level.
+ AbstractChannelHandlerContext and CombinedChannelDuplexHandler log different exceptions.
Modifications:
Fix inconsistent logging code and add ability to see both stacktraces on DEBUG level.
Result:
Both handlers log now both original exception and thrown from invokeExceptionCaught. To see full stacktrace of exception thrown from invokeExceptionCaught DEBUG log level must be enabled.
Motivation:
JCTools supports both non-unsafe, unsafe versions of queues and JDK6 which allows us to shade the library in netty-common allowing it to stay "zero dependency".
Modifications:
- Remove copy paste JCTools code and shade the library (dependencies that are shaded should be removed from the <dependencies> section of the generated POM).
- Remove usage of OneTimeTask and remove it all together.
Result:
Less code to maintain and easier to update JCTools and less GC pressure as the queue implementation nt creates so much garbage
Motivation:
We tried to always use SecureRandom to generate the initialSeed for our ThreadLocalRandom, this can sometimes give warnings under normal usage. We should better not use SecureRandom as default (just as the implementation in jsr166y does) and only try if the user specified -Djava.util.secureRandomSeed=true .
Modifications:
Only try to use SecureRandom when -Djava.util.secureRandomSeed=true is used.
Result:
Less likely to see entropy warnings.
Motivation:
If the user uses unsafe direct buffers with no cleaner we can use Unsafe.reallocateMemory(...) as optimization when we need to expand the buffer.
Modifications:
Use Unsafe.relocateMemory(...) in UnpooledUnsafeNoCleanerDirectByteBuf.
Result:
Less expensive expanding of buffers.
Motivation:
Using the Cleaner to release the native memory has a few drawbacks:
- Cleaner.clean() uses static synchronized internally which means it can be a performance bottleneck
- It put more load on the GC
Modifications:
Add new buffer implementations that can be enabled with a system flag as optimizations. In this case no Cleaner is used at all and the user must ensure everything is always released.
Result:
Less performance impact by direct buffers when need to be allocated and released.
Motivation:
Unsafe offers a method to set memory to a specific value. This can be used to implement an optimized version of setZero(...) and writeZero(...)
Modifications:
Add implementation for all Unsafe*ByteBuf implementations.
Result:
Faster setZero(...) and writeZero(...)
Motivation:
Recycler.recycle(...) should not be used anymore and be replaced by Handle.recycle().
Modifications:
Mark it as deprecated and update usage.
Result:
Correctly document deprecated api.
Motivation:
As reported in #4211, when using Netty in Tomcat (or other container based deployment), ForkJoinPool leaks an instance of `Submitter` so that the class loader of `Submitter` won't be GCed. However, since `Submitter` is just a wrapper of `int`, we can replace it with `int[1]`.
Modifications:
Replace `Submitter` with `int[1]`.
Result:
No class loader leak in ForkJoinPool when using in a container.
Motivation:
Some codecs should be considered unstable as these are relative new. For this purpose we should introduce an annotation which these codecs should us to be marked as unstable in terms of API.
Modifications:
- Add UnstableApi annotation and use it on codecs that are not stable
- Move http2.hpack to http2.internal.hpack as it is internal.
Result:
Better document unstable APIs.
Motivation:
It's better to make all InternalLoggerFactory implementations be singletons according to the discussions in #5047
Modifications:
Make all InternalLoggerFactory implementations be singletons and hide the construtors.
Result:
All InternalLoggerFactory implementations be singletons.
Motivation:
See #3095
Modifications:
Add Log4J2LoggerFactory and Log4J2Logger which is an InternalLogger implementation based on log4j2.
Result:
The user can use log4j2 directly without a special slf4j binding.
Motivation:
The current slow path of FastThreadLocal is much slower than JDK ThreadLocal. See #4418
Modifications:
- Add FastThreadLocalSlowPathBenchmark for the flow path of FastThreadLocal
- Add final to speed up the slow path of FastThreadLocal
Result:
The slow path of FastThreadLocal is improved.
Motivation:
We want to allow the use of an uber jar that contains shared dynamic libraries for all platforms (including fedora).
Modifications:
Modified OpenSsl to try and load the fedora library if the OS is Linux and the platform specified library fails before using the default lib.
Result:
True uber support.
Motivation:
See #3746.
Modifications:
Fork SpscLinkedQueue and SpscLinkedAtomicQueue from JCTools based on 7846450e28
Result:
Add SpscLinkedQueue and SpscLinkedAtomicQueue and apply it in LocalChannel.
Motivation:
See #4855
Modifications:
Unfortunately, unescapeCsv cannot be used here because the input could be a CSV line like `"a,b",c`. Hence this patch adds unescapeCsvFields to parse a CSV line and split it into multiple fields and unescaped them. The unit tests should define the behavior of unescapeCsvFields.
Then this patch just uses unescapeCsvFields to implement `CombinedHttpHeaders.getAll`.
Result:
`CombinedHttpHeaders.getAll` will return the unescaped values of a header.
Motivation:
The implementation of obtaining the best possible mac address is very good. There are many sub-par implementations proposed on stackoverflow.
While not strictly a netty concern, it would be nice to offer this util also to netty users.
Modifications:
extract DefaultChannelId#defaultMachineId code obtaining the "best" mac into a new helper called MacAddress, keep the random bytes fallback in DefaultChannelID.
Result:
New helper available.
Motivation:
See #3411. A reusable ArrayList in InternalThreadLocalMap can avoid allocations in the following pattern:
```
List<...> list = new ArrayList<...>();
add something to list but never use InternalThreadLocalMap
return list.toArray(new ...[list.size()]);
```
Modifications:
Add a reusable ArrayList to InternalThreadLocalMap and update codes to use it.
Result:
Reuse a thread local ArrayList to avoid allocations.
Motivation:
PlatformDependent0 has an optimization which grabs the char[] from a String. Since this code was introduced http://openjdk.java.net/jeps/254 has been gaining momentum in JDK 9. This JEP changes the internal storage from char[] to byte[], and thus the existing char[] only based optimizations will not work.
Modifications:
- The ASCII encoding char[] String optimizations should also work for byte[].
Result:
ASCII encoding char[] String optimizations don't break if the underlying storage in String is byte[].
Motivation:
PlatformDependent allows some exceptions to escape during static initialization. If an exception escapes it will be translated into a java.lang.ExceptionInInitializerError and render the application unable to run.
Modifications:
- Make sure to catch Throwable during static initialization.
Result:
PlatformDependent static initialization doesn't result in java.lang.ExceptionInInitializerError.
Motivation:
Caching is currently nested in DnsResolver.
It should also be possible to extend DnsResolver to ba able to pass a different cache on each resolution attemp.
Modifications:
* Introduce DnsCache, NoopDnsCache and DefaultDnsCache. The latter contains all the current caching logic that was extracted.
* Introduce protected versions of doResolve and doResolveAll that can be used as extension points to build resolvers that bypass the main cache and use a different one on each resolution.
Result:
Isolated caching logic. Better extensibility.
Motivation:
* newAtomicIntegerFieldUpdater and newAtomicLongFieldUpdater take a
class<?> so they're too lax
* newAtomicReferenceFieldUpdater takes a Class<U> so it's too strict
and can only be passe a rawtype parameter when dealing w/ generic
classes
Modifications:
Take a Class<? super T> parameter instead.
Result:
Better type safety and generics support.
Motivation:
Javadoc reports errors about invalid docs.
Modifications:
Fix some errors reported by javadoc.
Result:
A lot of javadoc errors are fixed by this patch.
Motivation:
DomainNameMapping.add() makes DomainNameMapping look like it's safe to call add() anytime, and this is never true. It's probably better deprecate add() and introduce DomainNameMappingBuilder.
Modifications:
Made an immutable implementation of DomainNameMapping;
Added Builder for immutable DomainNameMapping;
Replaced regex pattern with String::startsWith check;
Replaced HashMap with two arrays in ImmutableDomainNameMapping;
Deprecated mutable API;
Estimation for StringBuilder initial size in ImmutableDomainNameMapping#toString()
Added StringUtil#commonSuffixOfLength
Replaced unnecessary substrings creation in DomainNameMapping#matches with regionMatches
Result:
Clients will be able to create immutable instances of DomainNameMapping with builder API.
Motivation:
UTF-16 can not represent the full range of Unicode characters, and thus has the concept of Surrogate Pair (http://unicode.org/glossary/#surrogate_pair) where 2 16-bit code units can be used to represent the missing characters. ByteBufUtil.writeUtf8 is currently does not support this and is thus incomplete.
Modifications:
- Add support for surrogate pairs in ByteBufUtil.writeUtf8
Result:
ByteBufUtil.writeUtf8 now supports surrogate pairs and is correctly converting to UTF-8.
Motivation:
PriorityStreamByteDistributor uses a homegrown algorithm which distributes bytes to nodes in the priority tree. PriorityStreamByteDistributor has no concept of goodput which may result in poor utilization of network resources. PriorityStreamByteDistributor also has performance issues related to the tree traversal approach and number of nodes that must be visited. There also exists some more proven algorithms from the resource scheduling domain which PriorityStreamByteDistributor does not employ.
Modifications:
- Introduce a new ByteDistributor which uses elements from weighted fair queue schedulers
Result:
StreamByteDistributor which is sensitive to priority and uses a more familiar distribution concept.
Fixes https://github.com/netty/netty/issues/4462
Motivation:
On contrary to `DefaultNameResolver`, `DnsNameResolver` doesn't currently honor hosts file.
Modifications:
* Introduce `HostsFileParser` that parses `/etc/hosts` or `C:\Windows\system32\drivers\etc\hosts` depending on the platform
* Introduce `HostsFileEntriesResolver` that uses the former to resolve host names
* Make `DnsNameResolver` check his `HostsFileEntriesResolver` prior to trying to resolve names against the DNS server
* Introduce `DnsNameResolverBuilder` so we now have a builder for `DnsNameResolver`s
* Additionally introduce a `CompositeNameResolver` that takes several `NameResolver`s and tries to resolve names by delegating sequentially
* Change `DnsNameResolver.asAddressResolver` to return a composite and honor hosts file
Result:
Hosts file support when using `DnsNameResolver`.
Consistent behavior with JDK implementation.
Motivation:
The AsciiString.hashCode() method can be optimized. This method is frequently used while to build the DefaultHeaders data structure.
Modification:
- Add a PlatformDependent hashCode algorithm which utilizes UNSAFE if available
Result:
AsciiString hashCode is faster.
Motivation:
The HTTP/2 RFC (https://tools.ietf.org/html/rfc7540#section-8.1.2) indicates that header names consist of ASCII characters. We currently use ByteString to represent HTTP/2 header names. The HTTP/2 RFC (https://tools.ietf.org/html/rfc7540#section-10.3) also eludes to header values inheriting the same validity characteristics as HTTP/1.x. Using AsciiString for the value type of HTTP/2 headers would allow for re-use of predefined HTTP/1.x values, and make comparisons more intuitive. The Headers<T> interface could also be expanded to allow for easier use of header types which do not have the same Key and Value type.
Motivation:
- Change Headers<T> to Headers<K, V>
- Change Http2Headers<ByteString> to Http2Headers<CharSequence, CharSequence>
- Remove ByteString. Having AsciiString extend ByteString complicates equality comparisons when the hash code algorithm is no longer shared.
Result:
Http2Header types are more representative of the HTTP/2 RFC, and relationship between HTTP/2 header name/values more directly relates to HTTP/1.x header names/values.
Motivation:
Modulo operations are slow, we can use bitwise operation to detect if resource leak detection must be done while sampling.
Modifications:
- Ensure the interval is a power of two
- Use bitwise operation for sampling
- Add benchmark.
Result:
Faster sampling.
Motivation:
sun.misc.Unsafe allows us to handle heap ByteBuf in a more efficient matter. We should use special ByteBuf implementation when sun.misc.Unsafe can be used to increase performance.
Modifications:
- Add PooledUnsafeHeapByteBuf and UnpooledUnsafeHeapByteBuf that are used when sun.misc.Unsafe is ready to use.
- Add UnsafeHeapSwappedByteBuf
Result:
Better performance when using heap buffers and sun.misc.Unsafe is ready to use.
Motivation:
We had a bug in our implemention which double "reversed" bytes on systems which not support unaligned access.
Modifications:
- Correctly only reverse bytes if needed.
- Share code between unsafe implementations.
Result:
No more data-corruption on sytems without unaligned access.
Motivation:
At the moment we only forward decoded messages that were added the out List once the full decode loop was completed. This has the affect that resources may not be released as fast as possible and as an application may incounter higher latency if the user triggeres a writeAndFlush(...) as a result of the decoded messages.
Modifications:
- forward decoded messages after each decode call
Result:
Forwarding decoded messages through the pipeline in a more eager fashion.
Motivation:
The javadocs are incorrect and so give false impressions of use-pattern.
Modifications:
- Fix javadocs of which operations are allowed from multiple threads concurrently.
- Let isEmpty() work concurrently.
Result:
Correctly document usage-patterns.
Motivation:
The StringUtil class creates a Formatter object, but does not close it. There are also a 2 utility methods which would be generally useful.
Modifications:
- Close the Formatter
- Add length and isNullOrEmpty
Result:
No more resource leaks. Additional utility methods.
Motivation:
A degradation in performance has been observed from the 4.0 branch as documented in https://github.com/netty/netty/issues/3962.
Modifications:
- Simplify Headers class hierarchy.
- Restore the DefaultHeaders to be based upon DefaultHttpHeaders from 4.0.
- Make various other modifications that are causing hot spots.
Result:
Performance is now on par with 4.0.
Motivation:
The HttpObjectDecoder is on the hot code path for the http codec. There are a few hot methods which can be modified to improve performance.
Modifications:
- Modify AppendableCharSequence to provide unsafe methods which don't need to re-check bounds for every call.
- Update HttpObjectDecoder methods to take advantage of new AppendableCharSequence methods.
Result:
Peformance boost for decoding http objects.
Motivation:
PoolThreadCache did only cache allocations if the allocation and deallocation Thread were the same. This is not optimal as often people write from differen thread then the actual EventLoop thread.
Modification:
- Add MpscArrayQueue which was forked from jctools and lightly modified.
- Use MpscArrayQueue for caches and always add buffer back to the cache that belongs to the allocation thread.
Result:
ThreadPoolCache is now also usable and so gives performance improvements when allocation and deallocation thread are different.
Performance when using same thread for allocation and deallocation is noticable worse then before.