Motivation:
If you start to have 1M+ concurrent connections memory footprint can be come a big issue. We should try to reduce it as much as possible in the core of netty.
Modifications:
- Remove HashMap that was used to store name to ctx mapping. This was only used for validation and access a handler by name. As a pipeline is not expected to be very long (like 100+ handlers) we can just walk the linked list structure to find the ctx with a given name.
Result:
Less memory footprint of the DefaultChannelPipeline.
Motivation:
The previous DefaultChannelPipeline#destroy() implementation, introduced in #3156, is suboptimal as it can cause the for loop to continuously spin if the executor used by a given handler is unable to "recognize" the event loop.
It could be objected that it's the custom executor responsibility to properly implement the inEventLoop() method, but some implementetaions might not be able to do that for performance reasons, and even so, it's always better to be safe against API misuse, in particular when it is not possible to fail fast and the alternative is rather some sutle behaviour.
Modifications:
The patch simply avoids the recursive spin by explicitly passing the "in event loop" condition as a boolean parameter, preserving the same guarantees offered by #3156. A unit test has also been added.
Result:
All channel events are correctly called and no high CPU usage is seen anymore.
Motivation:
Changing the chache of generated names to use a cache per thread. This will remove the bottleneck when many eventloops are used and names need to generate.
Modifications:
Use a FastThreadLocal to store the cached names.
Result:
Less locking between threads.
Motivation:
We should only call ReferenceCountUtil.touch(...) if needed as otherwise we pay the overhead of instanceof and cast
everytime.
Modifications:
Add boolean flag which indicates if touch(...) should be called.
Result:
Less overhead when leak detection is not enabled.
Motiviation:
If a user writes from outside the EventLoop we increase the pending bytes of the outbound buffer before submitting the write request. This is done so the user can stop writing asap once the channel turns unwritable. Unfortunally this doesn't take the overhead of adding the task into the account and so it is very easy for an user to full up the task queue. Beside this we use a value of 0 for an unown message by default which is not ideal.
Modifications:
- port the message calculation we used in netty 3.x into AbstractChannelHandlerContext and so better calculate the overhead of a message that is submitted from outside the EventLoop
- change the default estimated size for an unknown message to 8.
Result:
Better behaviour when submiting writes from outside the EventLoop.
Motivation:
Once a FixedChannelPool was closed we must not allow to acquire or release Channels to prevent assert errors.
Modifications:
Fail release and acquire calls when FixedChannelPool is closed.
Result:
No more assert errors.1
Motivation:
As MaxMessageHandle is stateful we can not share the same HandleImpl instance as otherwise we will see race conditions.
Modifications:
Create a new HandleImpl instance on each newHandle() call.
Result:
No more races.
Motivation:
EmbeddedChannelId#hashCode() and equals() do not behave correctly if an
instance is serialized and then deserialized. Additionally,
EmbeddedChannel does not allow use of any other type of ChannelId, and
EmbeddedChannelId is (mostly) a singleton instance. This creates a
problem for unit tests that construct multiple EmbeddedChannels and
expect each channel to have a unique ID.
Modifications:
EmbeddedChannelId is modified so equals() will return true and
hashCode() will return the same value for any instance of the class.
EmbeddedChannel is modified to allow a ChannelId to be specified when
the channel is constructed. Tests added for both changes.
Result:
EmbeddedChannelId now behaves correctly when deserialized, and
EmbeddedChannels can now have unique IDs.
Motivation:
PendingWriteQueueTest needs some cleanup.
Modifications:
- Cleanup code to remove deprecation warnings
- use static imports
Result:
No more warnings
Motivation:
We missed to run all pending tasks when EmbeddedChannel.close(...) or disconnect(...) was called. Because of this channelInactive(...) / channelUnregistered(...) of the handlers were never called.
Modifications:
Correctly run all pending tasks and cancel all not ready scheduled tasks when close or disconnect was called.
Result:
Correctly run tasks on close / disconnect and have channelInactive(...) / channelUnregistered(...) called.
Motiviation:
We need to ensure the actual close to the transport takes place before the promsie of the write is notified that triggered it. This is needed as otherwise Channel.isActive(), isOpen() and isWritable() may return true even if the Channel should be closed already.
Modifications:
- Ensure the close takes place first
Result:
ChannelFutureListener will see the correct state of the Channel.
Motivation:
writeBytes(...) missed to set EPOLLOUT flag when not all bytes were written. This could lead to have the EpollEventLoop not try to flush the remaining bytes once the socket becomes writable again.
Modifications:
- Move setting EPOLLOUT flag logic to one point so we are sure we always do it.
- Move OP_WRITE flag logic to one point as well.
Result:
Correctly try to write pending data if socket becomes writable again.
Motivation:
RecvByteBufAllocator.DelegatingHandle does not provide an accessor to get the delegate handle. This may be useful for classes that extend DelegatingHandle.
Modifications:
- add delegate() method to DelegatingHandle
Result:
Classes which inherit from DelegatingHandle can now access the delegate Handle.
Motivation:
If LocalChannel doWrite executes while the peer's state changes from CONNECTED to CLOSED it is possible that some promise's won't be completed and buffers will be leaked.
Modifications:
- Check the peer's state in doWrite to avoid a race condition
Result:
All write operations should release, and the associated promise should be completed.
Motivation:
https://github.com/netty/netty/pull/4143 addressed a few ordering issues but an ordering issue still remained if the Promise for a write completes, and a listener of that promise does a write on a peer channel. The ordering was subject to how potentially 2 different executors would run a task, but it should be coordinated such that the first write is read first.
Modifications:
- Keep track of the finishPeerRead task run on the executor if necessary and ensure it completes before current channel read occurs
Result:
Ordering of events for echo type situations is preserved.
Motivation:
When a LocalChannel write operation occurs, the promise associated with the write operation is marked successful when it is added to the peer's queue, but before the peer has actually received the data. If the promise callback closes the channel then a race condition exists where the close event may occur before the data is delivered. We should preserve ordering of events.
Modifications:
- LocalChannel should track when a write is in progress, and if a close operation happens make sure the peer gets all pending read operations.
Result:
LocalChannel preserves order of operations.
Fixes https://github.com/netty/netty/issues/4118
Motivation:
Doc of ChannelGroup says, that it can be used to manage server and child channels at once.
However, in DefaultChannelGroup, there is a race condition. When a server channel accepts a child, it schedules its
registration on an event loop, which takes some time. If the ChannelGroup, which is supposed
to close server and child channels at once, is closed after the child channel has been scheduled
for registration and before this registration actually happens, this child channel is not closed
and remains connected. This could lead to connection leaks.
Modifications:
To fix this, the DefaultChannelGroup is changed to has a closed flag.
This flag is set to true, just before the close() method is actually closing channels.
The add() method checks after adding a new channel, if this flag has been set to true.
If yes, the new channel is closed. If not, we have the guarantee, that this channel will be
closed by the ChannelGroup, because setting the closed flag to true happens-before closing any channels.
This behaviour can be activated by two new constructors. The old constructors are still there and behave like before.
Therefore, no existing code should be affected directly.
Result:
If activating this feature, the DefaultChannelGroup can be used, for managing server and child channels at once.
But this activating this feature means also, that a ChannelGroup cannot be reused after calling close().
Related issues:
- #3971
- #3973
- #3976
- #4035
Motivation:
1. Previously, DnsNameResolver.query() retried the request query by its
own. It prevents a user from deciding when to retry or stop. It is also
impossible to get the response object whose code is not NOERROR.
2. NameResolver does not have an operation that resolves a host name
into multiple addresses, like InetAddress.getAllByName()
Modifications:
- Changes related with DnsNameResolver.query()
- Make query() not retry
- Move the retry logic to DnsNameResolver.resolve() instead.
- Make query() fail the promise only when I/O error occurred or it
failed to get a response
- Add DnsNameResolverException and use it when query() fails so that
the resolver can give more information about the failure
- query() does not cache anymore.
- Changes related with NameResolver.resolveAll()
- Add NameResolver.resolveAll()
- Add SimpleNameResolver.doResolveAll()
- Changes related with DnsNameResolver.resolve() and resolveAll()
- Make DnsNameResolveContext abstract so that DnsNameResolver can
decide to get single or multiple addresses from it
- Re-implement cache so that the cache works for resolve() and
resolveAll()
- Add 'traceEnabled' property to enable/disable trace information
- Miscellaneous changes
- Use ObjectUtil.checkNotNull() wherever possible
- Add InternetProtocolFamily.addressType() to remove repetitive
switch-case blocks in DnsNameResolver(Context)
- Do not raise an exception when decoding a truncated DNS response
Result:
- Full control over query()
- A user can now retrieve all addresses via (Dns)NameResolver.resolveAll()
- DNS cache works only for resolve() and resolveAll() now.
Motivation:
When releasing unhealthy channel back to a pool we don't have to offer it since on acquire it will be discarded anyways.
Also checking healthiness at release is a good idea so we don't end up having tons of unhealthy channels in the pool(unless they became unhealthy after being offered)
Modifications:
private SimpleChannelPool.offerIfHealthy() method added that is called from SimpleChannelPool.doReleaseChannel(). SimpleChannelPool.offerIfHealthy() offers channel back to pool only if channel is healthy.
Otherwise it throws setFailure exception to the promise.
Result:
The pool is now much cleaner and not spammed with unhealthy channels.
Added ability to choose if channel health has to be validated on release by passing boolean flag.
Motivation:
Depending on performance preferences and individual use cases sometimes we would like to be able force health check of a channel at release time and do not offer it back to the pool. Other times we would want to just release channel and offer it back to the pool and check health only when we try to acquire that channel from the pool. See more details here: https://github.com/netty/netty/issues/4077#issuecomment-130461684
Modifications:
Future<Void> release(Channel channel, Promise<Void> promise, boolean offerHealthyOnly);
The offerHealthyOnly boolean flag allows developers to choose whether to do channel validation before offering it back to pool or not.
Appropriate modifications made to hierarchy of implementations of ChannelPool. offerHealthyOnly=true will force channel health to be checked before offering back to pool. offerHealthyOnly=false will ignore channel health check and will just try just offer it back to the pool
offerHealthyOnly=true by default.
Result:
Channel health check before offer back to pool is controlled by a flag now.
Code changed to satisfy checkstyle requirements.
Motivation:
Code needs to satisfy checkstyle requirements.
Modifications:
SimpleChannelPool.java:279 line split to be less then 120 characters.
SimpleChannelPool.java:280:31 space added after '{'
SimpleChannelPool.java:282:17 space added after '{'
SimpleChannelPoolTest.java:198 - extra white space line removed.
Result:
Code satisfies checkstyle requirements.
offerHealthyOnly is passed as a constructor parameter now.
Motivation:
Instead of passing offerHealthyOnly as a method parameter it is better to pass it in as SimpleChannelPool or FixedChannelPool constructor.
Modifications:
Redundant release method that takes offerHealthyOnly removed from ChannelPool.
offerHealthyOnly parameter added to constructor for FixedChannelPool and SimpleChannelPool.
Result:
SimpleChannelPool and FixedChannelPool are now take offerHealthyOnly as a constructor parameter. Default behavior is: offerHealthyOnly=true.
Code changed to satisfy checkstyle requirements.
Motivation:
Code needs to satisfy checkstyle requirements.
Modifications:
SimpleChannelPool.java:84: line made to be no longer then 120 characters.
SimpleChannelPool.java:237: extra white space line removed.
Result:
Code satisfies checkstyle requirements.
Tests do not need to be too copled to the code. Exception message should not be validated
Motivation:
We don't need our tests to be too coupled to the code. Exception type validation in tests is just good enough.
Modifications:
Exception validation message removed from SimpleChannelPoolTest.testUnhealthyChannelIsNotOffered() test.
Result:
The SimpleChannelPoolTest test is less coupled to the code now.
Stack trace set to empty for UNHEALTHY_NON_OFFERED_TO_POOL.
Motivation:
We don't need stack trace for UNHEALTHY_NON_OFFERED_TO_POOL.
Modifications:
Added UNHEALTHY_NON_OFFERED_TO_POOL.setStackTrace(EmptyArrays.EMPTY_STACK_TRACE) to static init block.
Result:
UNHEALTHY_NON_OFFERED_TO_POOL's stack trace set to empty.
Minor code re-factorings.
Motivation:
For better code readability we need to apply several minor code re-factorings.
Modifications:
javadocs true -> {@code true}
offerHealthyOnly variable name changed to releaseHeathCheck
<p/> -> <p> in javadocs
offerHealthyOnly removed from doReleaseChannel as it not needed there.
Result:
Code quality is improved.
Code changed to satisfy checkstyle requirements.
Motivation:
Code needs to satisfy checkstyle requirements.
Modifications:
SimpleChannelPool.java:87: line made to be no longer then 120 characters.
Result:
Code satisfies checkstyle requirements.
Pull request needs to contain only necessary changes
Motivation:
The pull request should not contain unnecessary changes that are not needed as part of required functionality of pull request.
Modifications:
private void doReleaseChannel(final Channel channel, final Promise<Void> promise) - > private void doReleaseChannel(Channel channel, Promise<Void> promise)
Result:
Pull request contains less unnecessary modifications.
Motivation:
The acquire channel function resulted in calling itself several times in case when channel polled from the pool queue was unhealthy, which resulted FixedChannelPool to be called several times which in it's turn caused FixedChannelPool.acquire() to be called and resulted into acquireChannelCount to be unnecessary increased.
Example use case:
1) Create FixedChannelPool instance with one channel in the pool: new FixedChannelPool(cb, handler, 1)
2) Acquire channel A from the pool
3) close the channel A
4) Return it back to the pool
5) Acquire channel from the same pool again
Expected result:
new channel created and acquired, channel A that has been closed discarded and removed from the pool from being unhealthy
Actual result:
Channel A had been removed from the pool, how ever the new channel had never be acquired, instead the request to acquire had been added to the pending queue in FixedChannelPool and the acquireChannelCount is increased by one. The reason is that at the time when SimpleChannelPool figured out that the channel was unhealthy called FixedChannelPool.acquire to try to acquire new channel, how ever the request was added to the pendingTakQueue because by the time when FixedChannelPool.acquire was called, the acquireChannelCount was already "1" so new channel ould not be created cause of maxChannelsLimit=1.
Modifications:
The suggested approach modifies the SimpleChannelPool in a way so that when channel detected to be unhealthy it calls private method SimpleChannelPool.acquireHealthyFromPoolOrNew() which guarantees that SimpleChannelPool actually either finds a healthy channel in the pool and returns it or causes the promise.cause() in case when new channel was failed to be created.
Result:
The ```acquiredChannelCount``` is now calculated correctly as a result of SimpleChannelPool.acquire() of not being recursive on overridable acquire method.
Motiviation:
The current read loops don't fascilitate reading a maximum amount of bytes. This capability is useful to have more fine grain control over how much data is injested.
Modifications:
- Add a setMaxBytesPerRead(int) and getMaxBytesPerRead() to ChannelConfig
- Add a setMaxBytesPerIndividualRead(int) and getMaxBytesPerIndividualRead to ChannelConfig
- Add methods to RecvByteBufAllocator so that a pluggable scheme can be used to control the behavior of the read loop.
- Modify read loop for all transport types to respect the new RecvByteBufAllocator API
Result:
The ability to control how many bytes are read for each read operation/loop, and a more extensible read loop.
Motivation:
We don't decrease acquired channel count in FixedChannelPool when timeout occurs by AcquireTimeoutAction.NEW and eventually fails.
Modifications:
Set AcquireTask.acquired=true to call decrementAndRunTaskQueue when timeout action fails.
Result:
Acquired channel count decreases correctly.
Motivation:
We missed to correctly count acquired channels in FixedChannelPool which could produce an assert error.
Modifications:
Only try to decrement acquired count if the channel was really acuired.
Result:
No more assert error possible.
Motivation:
If the Channel is already closed when the PendingWriteQueue is created it will generate a NPE when add or remove is called later.
Modifications:
Add null checks to guard against NPE.
Result:
No more NPE possible.
Motivation:
Test was leaving composite buffers taken from the queue unreleased.
Modifications:
Make the test release buffers.
Result:
Nagging about leaked buffers should stop.
Motivation:
Simplifies writing code that needs to merge or slice a sequence of buffer & promise pairs into chunks of arbitrary sizes.
For example in HTTP2 we merge or split buffers across fixed-size DATA frame boundaries.
Modifications:
Add new utility class CoalescingBufferQueue
Result:
Following this change HTTP2 code will switch to use it instead of CompositeByteBuffer for DATA frame coalescing.
Motivation:
While cherry-picked 11f9e9084b I changed the EmbeddedChannel implementation to not allow no ChannelHandlers when constructing it.
This was done by mistake.
Modifications:
Revert change and add unit test.
Result:
Restore old behavior.
Motivation:
When using an EmbeddedChannel often it either does inbound or outbound processing which means we only often need one queue.
Modifications:
Lazy init the inbound and outbound message queues.
Result:
Less memory usage.
Motivation:
At the moment we directly closed the Channel when an exception accoured durring initChannel(...) without giving the user any way to do extra or special handling.
Modifications:
Handle the exception in exceptionCaught(...) of the ChannelInitializer which will by default log and close the Channel. This way the user can override this.
Result:
More felixible handling of exceptions.
Motivation:
Currently in EmbeddedChannel we add the ChannelHandlers before the Channel is registered which leads to have the handlerAdded(...) callback
be called from outside the EventLoop and also prevent the user to obtain a reference to the EventLoop in the callback itself.
Modifications:
Delay adding ChannelHandlers until EmbeddedChannel is registered.
Result:
Correctly call handlerAdded(...) after EmbeddedChannel is registered.
Motivation:
If you set a ChannelHandler via ServerBootstrap.handler(...) it is added to the ChannelPipeline before the Channel is registered. This will lead to and IllegalStateException if a user tries to access the EventLoop in the ChannelHandler.handlerAdded(...) method.
Modifications:
Delay the adding of the ChannelHandler until the Channel was registered.
Result:
No more IllegalStateException.
Motivation:
Only one of the three FixedChannelPool constructors checks for the constructor
arguments. Therfore it was possible to create a pool with zero maxConnections.
This change chains all constructors together, so that the last one
in the chain always checks the validity of the arguments, regardless of the
constructor used.
Result:
It is no longer possible to create a FixedChannelPool instance with invalid
maxConnections or maxPendingAcquires parameters.
Motivation:
FixedChannelPool should enforce a number of maximal used channels, but due a bug we fail to correctly enforce this.
Modifications:
Change check to correctly only acquire channel if we not hit the limit yet.
Result:
Correct limiting.
Motivation:
To avoid buffering too much it would be useful to get an estimate of how many bytes can be written to a Channel before it becomes unwritable.
Modifications:
- Update the Channel interface to support 2 new methods. 1 to give how many bytes before unwritable. 1 to give how many bytes before writable.
- Update the AbstractChannel implementation to delegate to the ChannelOutboundBuffer.
Result:
The Channel interface supports 2 new methods which provide more visibility into writability.
Motivation:
It's useful to be able to be notified once all Channels that are part of the ChannelGroup are notified. This can for example be useful if you want to do a graceful shutdown.
Modifications:
- Add ChannelGroup.newCloseFuture(...) which will be notified once all Channels are notified that are part of the ChannelGroup at the time of calling.
Result:
Easier to be notified once all Channels within a ChannelGroup are closed.
Motiviation:
There are currently no accessors which provide visbility into how many bytes must be written in order for a writability change to occur. This feature would be useful for codecs which intent to control how many bytes are queued at any given time.
Modifications:
- add bytesBeforeUnWritable() which will give the number of bytes before the buffer (and associated channel) transitions to not writable
- add bytesBeforeWritable() which will give the number of bytes that must be drained from the queue until the channel becomes writable.
Result:
More visibility into writability for the ChannelOutboundBuffer.
Motivation:
the ByteBuffer[] that we keep in the ThreadLocal are never nulled out which can lead to have ByteBuffer instances sit there forever.
This is even a bigger problem if nioBuffer() of ByteBuffer returns a new ByteBuffer that can not be destroyed by ByteBuffer.release().
Modifications:
Null out ByteBuffer array after processing.
Result:
No more dangling references after done.