Motivation:
We used a very high number for DEFAULT_INITIAL_MAX_CAPACITY (over 200k) which is not very relastic and my lead to very surprising memory usage if allocations happen in bursts.
Modifications:
Use a more sane default value of 32k
Result:
Less possible memory usage by default
Motivation:
We used Promise.setFailure(...) when fail a Promise in SimpleChannelPool. As this happens in multiple levels this can result in stackoverflow as setFailure(...) may throw an IllegalStateException which then again is propergated.
Modifications:
Use tryFailure(...)
Result:
No more possibility to cause a stack overflow when failing the promise.
Motivation:
FlushConsolidationHandler#flushIfNeeded has a conditional which is fixed based upon code path. This conditional can be removed and instead just manually set in each fixed code path.
Modifications:
- Remove boolean parameter on FlushConsolidationHandler#flushIfNeeded and set readInprogess to false manually when necessary
Result:
Less conditionals in FlushConsolidationHandler
Motivation:
AbstractReferenceCounted as independent conditional statements to check the bounds of the retain IllegalReferenceCountException condition. One of the exceptions also uses the incorrect increment.
Modifications:
- Combined independent conditional checks into 1 where possible
- Correct IllegalReferenceCountException with incorrect increment
- Remove the subtract to check for overflow and re-use the addition and check for overflow to remove 1 arithmetic operation (see http://docs.oracle.com/javase/specs/jls/se7/html/jls-15.html#jls-15.18.2)
Result:
AbstractReferenceCounted has less independent branch statements and more correct IllegalReferenceCountException. Compilation size of AbstractReferenceCounted.retain() is reduced from 58 bytes to 47 bytes.
Motivation:
We saw some sporadic test failures for GlobalEventExecutorTest.testAutomaticStartStop test. This is caused parallel execution of tests in combination with assert checks that will be affected.
Modifications:
Remove fragile assert checks.
Result:
No more sporadic test failures
Motivation:
We use a shared capacity per Stack for all its WeakOrderQueue elements. These relations are stored in a WeakHashMap to allow dropping these if memory pressure arise. The problem is that we not "reclaim" previous reserved space when this happens. This can lead to a Stack which has not shared capacity left which then will lead to an AssertError when we try to allocate a new WeakOderQueue.
Modifications:
- Ensure we never throw an AssertError if we not have enough space left for a new WeakOrderQueue
- Ensure we reclaim space when WeakOrderQueue is collected.
Result:
No more AssertError possible when new WeakOrderQueue is created and also correctly reclaim space that was reserved from the shared capacity.
Motivation:
Due an implementation flaw in DefaultAttributeMap it was possible that an attribute and its stored value/key could not be collected until the DefaultAttributeMap could be collected. This can lead to unexpected memory usage and strong reachability of objects that should be collected.
Modifications:
Use an special empty DefaultAttribute as head of the each bucket which will not store any key / value. With this change everything can be collected as expected as we not use any DefaultAttribute created by the user as head of a bucket.
Result:
DefaultAttributeMap does not store user data and thus the lifetime of this user data is not tied to the lifetime of the DefaultAttributeMap.
Motivation:
Commit afafadd3d7 introduced a change which stored the Stack in the WeakOrderQueue as field. This unfortunally had the effect that it was not removed from the WeakHashMap anymore as the Stack also is used as key.
Modifications:
Do not store a reference to the Stack in WeakOrderQueue.
Result:
WeakOrderQueue can be collected correctly again.
Motivation:
PR #5493 added support for KeyManagerFactories when using the OpenSsl context. This commit corrects a bug causing a NullPointerException that occurs when using a KeyManagerFactory without a certificate chain and private key.
Modifications:
Removes assertNotNull() assertions which were causing a certificate chain and private key to be required even when using a KeyManagerFactory. Also removed a redundant call to buildKeyManagerFactory() which was also causing a exception when a KeyManagerFactory is provided but a certificate chain and private key is not.
Result:
A KeyManagerFactory can now be used in the OpenSslServerContext without an independent certificate chain and private key.
Motivation:
The SimpleChannelPool#notifyConnect() method will leak Channels if the user cancelled the Promise in between.
Modifications:
Release the channel if the Promise was complete before.
Result:
No more channel leaks.
Motivation:
The Java version is used for platform dependent logic. Yet, the logic
for acquiring the Java version requires special permissions (the runtime
permission "getClassLoader") that some downstream projects will never
grant. As such, these projects are doomed to have Netty act is their
Java major version is six. While there are ways to maintain the same
logic without requiring these special permissions, the logic is
needlessly complicated because it relies on loading classes that exist
in version n but not version n - 1. This complexity can be removed. As a
bonanza, the dangerous permission is no longer required.
Modifications:
Rather than attempting to load classes that exist in version n but not
in version n - 1, we can just parse the Java specification version. This
only requires a begign property (property permission
"java.specification.version") and is simple.
Result:
Acquisition of the Java version is safe and simple.
Motivation:
DefaultChannelConfig has a "hack" which will automatically set maxMessagesPerRead to 16 when constructed when either a ServerChannel or a AbstractNioByteChannel is used. As this instanceof check will fail with epoll we should set it there as well.
Modifications:
Set maxMessagesPerRead to 16 by default in EpollDomainSocketChannelConfig and EpollSocketChannelConfig.
Result:
More consistent behavior between epoll and nio when LT mode is used.
Motivation:
The clean method in java.base/jdk.internal.ref.Cleaner is not accessible
to methods outside java.base. This prevents Cleaner0.freeDirectBuffer
from actually calling the clean method on JDK9.
The issue could have been caught earlier if Cleaner0 is initialized when
PlatformDependent0 is initialized and logging statements in the static
initializer in Cleaner0 would be close to the logging statements in the
static initializer in PlatformDependent0.
Modifications:
Try casting the cleaner obtained from a ByteBuffer to Runnable and use
Runnable.run if possible. All Cleaners in JDK9 implements Runnable. Fall
back to the clean method if the cleaner does not implement Runnable.
The fallback preserves the behavior on JDK8 and earlier.
Try to free the direct ByteBuffer allocated during static initialization
of PlatformDependent0. This cause Cleaner0 to be initialized when
PlatformDependent0 is initialized, and logging statements from the
static initializers will be close together.
Result:
Cleaner0.freeDirectBuffer works as intended on JDK9 and logging shows
that Cleaner0.freeDirectBuffer works as intended.
Motivation:
A recent change to DefaultThreadFactory modified it so that it is sticky
with respect to thread groups. In particular, this change made it so
that DefaultThreadFactory would hold on to the thread group that created
it, and then use that thread group to create threads.
This can have problematic semantics since it can lead to violations of a
tenet of thread groups that a thread can only modify threads in its own
thread group and descendant thread groups. With a sticky thread group, a
thread triggering the creation of a new thread via
DefaultThreadFactory#newThread will be modifying a thread from the
sticky thread group which will not necessarily be its own nor a
descendant thread group. When a security manager is in place that
enforces this requirement, these modifications are now impossible. This
is especially problematic in the context of Netty because certain global
singletons like GlobalEventExecutor will create a
DefaultThreadFactory. If all DefaultThreadFactory instances are sticky
about their thread groups, it means that submitting tasks to the
GlobalEventExecutor singleton can cause a thread to be created from the
DefaultThreadFactory sticky thread group, exactly the problem with
DefaultThreadFactory being sticky about thread groups. A similar problem
arises from the ThreadDeathWatcher.
Modifications:
This commit modifies DefaultThreadFactory so that a null thread group
can be set with the behavior that all threads created by such an
instance will inherit the default thread group (the thread group
provided by the security manager if there is one, otherwise the thread
group of the creating thread). The construction of the instances of
DefaultThreadFactory used by the GlobalEventExecutor singleton and
ThreadDeathWatcher are modified to use this behavior. Additionally, we
also modify the chained constructor invocations of the
DefaultThreadFactory that do not have a parameter to specify a thread
group to use the thread group from the security manager is available,
otherwise the creating thread's thread group. We also add unit tests
ensuring that all of this behavior is maintained.
Result:
It will be possible to have DefaultThreadFactory instances that are not
sticky about the thread group that led to their creation. Instead,
threads created by such a DefaultThreadFactory will inherit the default
thread group which will either be the thread group from the security
manager or the current thread's thread group.
Motivation:
The build generates a OSGi bundle with missing Bundle-NativeCode manifest entry.
Modifications:
Add missing manifest entry.
Result:
Be able to use transport-native-epoll in osgi container.
Motivation:
DiskFileUpload and MemoryFileUpload.equals(...) are broken.
Modifications:
Fix implementation and add unit test.
Result:
Equals method are correct now.
Motivation:
Currently, the recycler max capacity it's only enforced on the
thread-local stack which is used when the recycling happens on the
same thread that requested the object.
When the recycling happens in a different thread, then the objects
will be queued into a linked list (where each node holds N objects,
default=16). These objects are then transfered into the stack when
new objects are requested and the stack is empty.
The problem is that the queue doesn't have a max capacity and that
can lead to bad scenarios. Eg:
- Allocate 1M object from recycler
- Recycle all of them from different thread
- Recycler WeakOrderQueue will contain 1M objects
- Reference graph will be very long to traverse and GC timeseems to be negatively impacted
- Size of the queue will never shrink after this
Modifications:
Add some shared counter which is used to manage capacity limits when recycle from different thread then the allocation thread. We modify the counter whenever we allocate a new Link to reduce the overhead of increment / decrement it.
Result:
More predictable number of objects mantained in the recycler pool.
Motivation:
Because of a bug we missed to include the first PoolSubpage when collection metrics.
Modifications:
- Correctly include all subpages
- Add unit test
Result:
Correctly include all subpages
Motivation:
Today when awaiting uninterruptibly on a default promise, a race
condition can lead to a missed signal. Quite simply, the check for
whether the condition holds is not made inside a lock before
waiting. This means that the waiting thread can enter the wait after the
promise has completed and will thus not be notified, thus missing the
signal. This leads to the waiting thread to enter a timed wait that will
only trip with the timeout elapses leading to unnecessarily long waits
(imagine a connection timeout, and the waiting thread missed the signal
that the connection is ready).
Modification:
This commit fixes this missed signal by checking the condition inside a
lock. We also add a test that reliably fails without the non-racy
condition check.
Result:
Timed uninterruptible waits on default promise will not race against the
condition and possibly wait longer than necessary.
Motivation:
For some use-cases it would be useful to know the number of bytes queued in the PendingWriteQueue without the need to dequeue them.
Modifications:
Add PendingWriteQueue.bytes().
Result:
Be able to get the number of bytes queued.
Motivation:
ExecutorService.invoke*(...) methods may block by API definition. This can lead to deadlocks if called from inside the EventLoop in SingleThreadEventExecutor as it only has one Thread that does all the work.
Modifications:
Throw a RejectedExectionException if someone tries to call SingleThreadEventExecutor.invoke*(...) while in the EventLoop.
Result:
No more deadlock possible.
Motivation:
Commit 4c048d069d moved the logic of calling handlerAdded(...) to the channelRegistered(...) callback of the head of the DefaultChannelPipeline. Unfortunatlly this may execute the callbacks to late as a user may add handlers to the pipeline in the ChannelFutureListener attached to the registration future. This can lead to incorrect ordering.
Modifications:
Ensure we always invoke ChannelHandler.handlerAdded(...) for all handlers before the registration promise is notified.
Result:
Not possible of incorrect ordering or missed events.
Motivation:
We pinned the EventExecutor for a Channel in DefaultChannelPipeline. Which means if the user added multiple handlers with the same EventExecutorGroup to the ChannelPipeline it will use the same EventExecutor for all of these handlers. This may be unexpected and even not what the user wants. If the user want to use the same one for all of them it can be done by obtain an EventExecutor and pass the same instance to the add methods. Because of this we should allow to not pin.
Modifications:
Allow to disable pinning of EventExecutor for Channel based on EventExecutorGroup via ChannelOption.
Result:
Less confusing and more flexible usage of EventExecutorGroup when adding ChannelHandlers to the ChannelPipeline.
Motivation:
DefaultPromise has a listeners member variable which is volatile to allow for an optimization which makes notification of listeners less expensive when there are no listeners to notify. However this change makes all other operations involving the listeners member variable more costly. This optimization which requires listeners to be volatile can be removed to avoid volatile writes/reads for every access on the listeners member variable.
Modifications:
- DefaultPromise listeners is made non-volatile and the null check optimization is removed
Result:
DefaultPromise.listeners is no longer volatile.
Motivation:
DefaultChannelPipelineTest creates a channel and completes promises, but does not register this channel to an event loop. This is generally an illegal state for DefaultPromise, and the 4.1 branch correctly registers the channel in all of these cases.
Modifications:
- DefaultChannelPipelineTest should register the channel to an EventLoopGroup before completing a promise generated by that channel
Result:
DefaultChannelPipelineTest is more correct and consistent with 4.1.
Motivation
When I override ChannelHandler methods I usually (always) refire events myself via
ChannelHandlerContext instead of relieing on calling the super method (say
`super.write(ctx, ...)`). This works great and the IDE actually auto completes/generates
the right code for it except `#fireUserEventTriggered()` and `#userEventTriggered()`
which have a mismatching argument names and I have to manually "intervene".
Modification
Rename `ChannelHandlerContext#fireUserEventTriggered()` argument from `event` to `evt`
to match its handler counterpart.
Result
The IDE's auto generated code will reference the correct variable.
Motivation:
In commit f984870ccc I made a change which operated under invalide assumption that tasks executed by an EventExecutor will always be processed in a serial fashion. This is true for SingleThreadEventExecutor sub-classes but not part of the EventExecutor interface contract.
Because of this change implementations of EventExecutor which not strictly execute tasks in a serial fashion may miss events before handlerAdded(...) is called. This is strictly speaking not correct as there is not guarantee in this case that handlerAdded(...) will be called as first task (as there is no ordering guarentee).
Cassandra itself ships such an EventExecutor implementation which has no strict ordering to spread load across multiple threads.
Modifications:
- Add new OrderedEventExecutor interface and let SingleThreadEventExecutor / EventLoop implement / extend it.
- Only expose "restriction" of skipping events until handlerAdded(...) is called for OrderedEventExecutor implementations
- Add ThreadPoolEventExecutor implementation which executes tasks in an unordered fashion. This is used in added unit test but can also be used for protocols which not expose an strict ordering.
- Add unit test.
Result:
Resurrect the possibility to implement an EventExecutor which does not enforce serial execution of events and be able to use it with the DefaultChannelPipeline.
Motivation:
Calling flush() and writeAndFlush(...) are expensive operations in the sense as both will produce a write(...) or writev(...) system call if there are any pending writes in the ChannelOutboundBuffer. Often we can consolidate multiple flush operations into one if currently a read loop is active for a Channel, as we can just flush when channelReadComplete is triggered. Consolidating flushes can give a huge performance win depending on how often is flush is called. The only "downside" may be a bit higher latency in the case of where only one flush is triggered by the user.
Modifications:
Add a FlushConsolidationHandler which will consolidate flushes and so improve the throughput.
Result:
Better performance (throughput). This is especially true for protocols that use some sort of PIPELINING.
Motivation:
We should make it clear that each acquired Channel needs to be released in all cases.
Modifications:
More clear javadocs.
Result:
Harder for users to leak Channel.
Motivation:
The field can be read from arbitrary threads via Channel.(isWritable()|bytesBeforeWritable()|bytesBeforeUnwritable()), WriteAndFlushTask.newInstance(), PendingWriteQueue, etc.
Modifications:
Make AbstractChannel.outboundBuffer volatile.
Result:
More correct in a concurrent use case.
Motivation:
ECONNREFUSED can be a common type of exception when attempting to finish the connection process. Generating a new exception each time can be costly and quickly bloat memory usage.
Modifications:
- Expose ECONNREFUSED from JNI and cache this exception in Socket.finishConnect
Result:
ECONNREFUSED during finish connect doesn't create a new exception each time.
Motivation:
We used future in many method of ChannelDuplexHandler as argument name of ChannelPromise. We should make it more consistent and correct.
Modifications:
Replace future with promise.
Result:
More correct and consistent naming.
Motivation:
The logging statements in i.n.u.c.DefaultPromise do not emit the
caught Throwable when a Throwable is thrown while a listener is being
notified of completed or progressed operations.
Modifications:
This issue arises because the logging message has a single placeholder
but is passing two additional arguments, the second one being the
caught Throwable that is thus quietly not logged. We address this by
modifying the logging statements to ensure the caught Throwable is
logged. In this case, the preferred approach is to use the logger
override that accepts a message and a Throwable parameter since logger
implementations might have special handling for this case.
Result:
Log messages from i.n.u.c.DefaultPromise when a Throwable is thrown
while notifying a listener of completed or progressed operations will
contain the caught Throwable.
Motivation:
ReadTimeoutHandler and IdleStateHandler have duplicated code, we should share whatever possible.
Modifications:
Let ReadTimeoutHandler extend IdleStateHandler.
Result:
Remove code duplication.
Motivation:
We don't have an argument named {@code value} but have {@code set} and
{@code expected} in HttpHeaders and HttpUtil respectively.
Modifications:
I replaced {@code value} to {@code set} and {@code expected} in HttpHeaders
and HttpUtil respectively.
Result:
Now javadoc says;
If {@code set} is {@code true}, the {@code "Expect: 100-continue"} header is
set and all other previous {@code "Expect"} headers are removed. Otherwise,
all {@code "Expect"} headers are removed completely. in HttpHeaders
If {@code expected} is {@code true}, the {@code "Expect: 100-continue"} header
is set and all other previous {@code "Expect"} headers are removed. Otherwise,
all {@code "Expect"} headers are removed completely. in HttpUtil
Motivation:
A race detector found that DefaultPromise.listeners is improperly synchronized [1].
Worst case a listener will not be executed when the promise is completed.
Modifications:
Make DefaultPromise.listeners a volatile.
Result:
Hopefully, DefaultPromise is more correct under concurrent execution.
[1] https://github.com/grpc/grpc-java/issues/2015
Motivation:
Unit test for the OpenSslEngine "OpenSslEngine writePlaintextData WANT_READ with no data in BIO buffer" issue.
Modifications:
- Update SslEngine test to include renegotiation
Result:
More test coverage in OpenSslEngine.
Motivation:
Currently in the single threaded and global event executors when the scheduled task queue is drained, there is a call to hasScheduledTasks(). If there are scheduled tasks then the the code polls the queue for tasks. The poll method duplicates the exact logic of hasScheduledTasks(). This involves two calls to nanoTime when one seems sufficient.
Modifications:
Directly poll the queue for tasks and break if the task returned is null.
Result:
Should be no noticeable impact on functionality. Two calls to nanoTime have been coarsened into a single call.
Motivation:
Project Jigsaw in JDK9 has moved the direct byte buffer cleaner from
sun.misc.Cleaner to java.lang.ref.Cleaner$Cleanable. This cause the
current platform tests to throw a ClassNotFoundException, disabling the
use of direct byte buffer cleaners.
Modifications:
I use reflection to find the clean method in either sun.misc.Cleaner or
java.lang.ref.Cleaner$Cleanable.
Result:
Netty uses direct byte buffers on JDK9 as it already do on earlier JDKs.
Motivation:
In JDK9 heap byte buffers have an address field, so we have to remove
the current check as it is invalid in JDK9.
Modifications:
Removed the address field check for heap byte buffers.
Result:
Netty continues to find sun.misc.Unsafe in JDK9 as in previous JDKs.
Motivation:
Netty's platform dependent parts should know about JDK9.
Modifications:
JDK9 introduce Runtime$Version Runtime.version() which has an int major()
method that always return the major Java version. I call that method to
get the Java major version.
Result:
Netty will recognize all future JDK versions.
Motivation:
The gRPC interop tests fail due to a NPE in OpenSslEngine.
Caused by: java.lang.NullPointerException
at io.netty.handler.ssl.OpenSslEngine.setSSLParameters(OpenSslEngine.java:1473)
Modifications:
Add a null check
Result:
No more NPE exceptions :-)
Motivation:
We recently added the ResourceLeakDetectorFactory but missed to updated HashedWheelTimer to use it.
Modifications:
- Add new abstract method to ResourceLeakDetectorFactory that allows to provide also samplingInterval and maxActive args.
- Deprecate most constructors in ResourceLeakDetector and add doc explaining that people should use ResourceLeakDetectorFactory
Result:
Custom ResourceLeakDetectorFactory will also be used in HashedWheelTimer if configured.