Motiviation:
We need to ensure the actual close to the transport takes place before the promsie of the write is notified that triggered it. This is needed as otherwise Channel.isActive(), isOpen() and isWritable() may return true even if the Channel should be closed already.
Modifications:
- Ensure the close takes place first
Result:
ChannelFutureListener will see the correct state of the Channel.
Motivation:
writeBytes(...) missed to set EPOLLOUT flag when not all bytes were written. This could lead to have the EpollEventLoop not try to flush the remaining bytes once the socket becomes writable again.
Modifications:
- Move setting EPOLLOUT flag logic to one point so we are sure we always do it.
- Move OP_WRITE flag logic to one point as well.
Result:
Correctly try to write pending data if socket becomes writable again.
Motivation:
RecvByteBufAllocator.DelegatingHandle does not provide an accessor to get the delegate handle. This may be useful for classes that extend DelegatingHandle.
Modifications:
- add delegate() method to DelegatingHandle
Result:
Classes which inherit from DelegatingHandle can now access the delegate Handle.
Motivation:
If LocalChannel doWrite executes while the peer's state changes from CONNECTED to CLOSED it is possible that some promise's won't be completed and buffers will be leaked.
Modifications:
- Check the peer's state in doWrite to avoid a race condition
Result:
All write operations should release, and the associated promise should be completed.
Motivation:
https://github.com/netty/netty/pull/4143 addressed a few ordering issues but an ordering issue still remained if the Promise for a write completes, and a listener of that promise does a write on a peer channel. The ordering was subject to how potentially 2 different executors would run a task, but it should be coordinated such that the first write is read first.
Modifications:
- Keep track of the finishPeerRead task run on the executor if necessary and ensure it completes before current channel read occurs
Result:
Ordering of events for echo type situations is preserved.
Motivation:
When a LocalChannel write operation occurs, the promise associated with the write operation is marked successful when it is added to the peer's queue, but before the peer has actually received the data. If the promise callback closes the channel then a race condition exists where the close event may occur before the data is delivered. We should preserve ordering of events.
Modifications:
- LocalChannel should track when a write is in progress, and if a close operation happens make sure the peer gets all pending read operations.
Result:
LocalChannel preserves order of operations.
Fixes https://github.com/netty/netty/issues/4118
Motivation:
Doc of ChannelGroup says, that it can be used to manage server and child channels at once.
However, in DefaultChannelGroup, there is a race condition. When a server channel accepts a child, it schedules its
registration on an event loop, which takes some time. If the ChannelGroup, which is supposed
to close server and child channels at once, is closed after the child channel has been scheduled
for registration and before this registration actually happens, this child channel is not closed
and remains connected. This could lead to connection leaks.
Modifications:
To fix this, the DefaultChannelGroup is changed to has a closed flag.
This flag is set to true, just before the close() method is actually closing channels.
The add() method checks after adding a new channel, if this flag has been set to true.
If yes, the new channel is closed. If not, we have the guarantee, that this channel will be
closed by the ChannelGroup, because setting the closed flag to true happens-before closing any channels.
This behaviour can be activated by two new constructors. The old constructors are still there and behave like before.
Therefore, no existing code should be affected directly.
Result:
If activating this feature, the DefaultChannelGroup can be used, for managing server and child channels at once.
But this activating this feature means also, that a ChannelGroup cannot be reused after calling close().
Related issues:
- #3971
- #3973
- #3976
- #4035
Motivation:
1. Previously, DnsNameResolver.query() retried the request query by its
own. It prevents a user from deciding when to retry or stop. It is also
impossible to get the response object whose code is not NOERROR.
2. NameResolver does not have an operation that resolves a host name
into multiple addresses, like InetAddress.getAllByName()
Modifications:
- Changes related with DnsNameResolver.query()
- Make query() not retry
- Move the retry logic to DnsNameResolver.resolve() instead.
- Make query() fail the promise only when I/O error occurred or it
failed to get a response
- Add DnsNameResolverException and use it when query() fails so that
the resolver can give more information about the failure
- query() does not cache anymore.
- Changes related with NameResolver.resolveAll()
- Add NameResolver.resolveAll()
- Add SimpleNameResolver.doResolveAll()
- Changes related with DnsNameResolver.resolve() and resolveAll()
- Make DnsNameResolveContext abstract so that DnsNameResolver can
decide to get single or multiple addresses from it
- Re-implement cache so that the cache works for resolve() and
resolveAll()
- Add 'traceEnabled' property to enable/disable trace information
- Miscellaneous changes
- Use ObjectUtil.checkNotNull() wherever possible
- Add InternetProtocolFamily.addressType() to remove repetitive
switch-case blocks in DnsNameResolver(Context)
- Do not raise an exception when decoding a truncated DNS response
Result:
- Full control over query()
- A user can now retrieve all addresses via (Dns)NameResolver.resolveAll()
- DNS cache works only for resolve() and resolveAll() now.
Motivation:
When releasing unhealthy channel back to a pool we don't have to offer it since on acquire it will be discarded anyways.
Also checking healthiness at release is a good idea so we don't end up having tons of unhealthy channels in the pool(unless they became unhealthy after being offered)
Modifications:
private SimpleChannelPool.offerIfHealthy() method added that is called from SimpleChannelPool.doReleaseChannel(). SimpleChannelPool.offerIfHealthy() offers channel back to pool only if channel is healthy.
Otherwise it throws setFailure exception to the promise.
Result:
The pool is now much cleaner and not spammed with unhealthy channels.
Added ability to choose if channel health has to be validated on release by passing boolean flag.
Motivation:
Depending on performance preferences and individual use cases sometimes we would like to be able force health check of a channel at release time and do not offer it back to the pool. Other times we would want to just release channel and offer it back to the pool and check health only when we try to acquire that channel from the pool. See more details here: https://github.com/netty/netty/issues/4077#issuecomment-130461684
Modifications:
Future<Void> release(Channel channel, Promise<Void> promise, boolean offerHealthyOnly);
The offerHealthyOnly boolean flag allows developers to choose whether to do channel validation before offering it back to pool or not.
Appropriate modifications made to hierarchy of implementations of ChannelPool. offerHealthyOnly=true will force channel health to be checked before offering back to pool. offerHealthyOnly=false will ignore channel health check and will just try just offer it back to the pool
offerHealthyOnly=true by default.
Result:
Channel health check before offer back to pool is controlled by a flag now.
Code changed to satisfy checkstyle requirements.
Motivation:
Code needs to satisfy checkstyle requirements.
Modifications:
SimpleChannelPool.java:279 line split to be less then 120 characters.
SimpleChannelPool.java:280:31 space added after '{'
SimpleChannelPool.java:282:17 space added after '{'
SimpleChannelPoolTest.java:198 - extra white space line removed.
Result:
Code satisfies checkstyle requirements.
offerHealthyOnly is passed as a constructor parameter now.
Motivation:
Instead of passing offerHealthyOnly as a method parameter it is better to pass it in as SimpleChannelPool or FixedChannelPool constructor.
Modifications:
Redundant release method that takes offerHealthyOnly removed from ChannelPool.
offerHealthyOnly parameter added to constructor for FixedChannelPool and SimpleChannelPool.
Result:
SimpleChannelPool and FixedChannelPool are now take offerHealthyOnly as a constructor parameter. Default behavior is: offerHealthyOnly=true.
Code changed to satisfy checkstyle requirements.
Motivation:
Code needs to satisfy checkstyle requirements.
Modifications:
SimpleChannelPool.java:84: line made to be no longer then 120 characters.
SimpleChannelPool.java:237: extra white space line removed.
Result:
Code satisfies checkstyle requirements.
Tests do not need to be too copled to the code. Exception message should not be validated
Motivation:
We don't need our tests to be too coupled to the code. Exception type validation in tests is just good enough.
Modifications:
Exception validation message removed from SimpleChannelPoolTest.testUnhealthyChannelIsNotOffered() test.
Result:
The SimpleChannelPoolTest test is less coupled to the code now.
Stack trace set to empty for UNHEALTHY_NON_OFFERED_TO_POOL.
Motivation:
We don't need stack trace for UNHEALTHY_NON_OFFERED_TO_POOL.
Modifications:
Added UNHEALTHY_NON_OFFERED_TO_POOL.setStackTrace(EmptyArrays.EMPTY_STACK_TRACE) to static init block.
Result:
UNHEALTHY_NON_OFFERED_TO_POOL's stack trace set to empty.
Minor code re-factorings.
Motivation:
For better code readability we need to apply several minor code re-factorings.
Modifications:
javadocs true -> {@code true}
offerHealthyOnly variable name changed to releaseHeathCheck
<p/> -> <p> in javadocs
offerHealthyOnly removed from doReleaseChannel as it not needed there.
Result:
Code quality is improved.
Code changed to satisfy checkstyle requirements.
Motivation:
Code needs to satisfy checkstyle requirements.
Modifications:
SimpleChannelPool.java:87: line made to be no longer then 120 characters.
Result:
Code satisfies checkstyle requirements.
Pull request needs to contain only necessary changes
Motivation:
The pull request should not contain unnecessary changes that are not needed as part of required functionality of pull request.
Modifications:
private void doReleaseChannel(final Channel channel, final Promise<Void> promise) - > private void doReleaseChannel(Channel channel, Promise<Void> promise)
Result:
Pull request contains less unnecessary modifications.
Motivation:
The acquire channel function resulted in calling itself several times in case when channel polled from the pool queue was unhealthy, which resulted FixedChannelPool to be called several times which in it's turn caused FixedChannelPool.acquire() to be called and resulted into acquireChannelCount to be unnecessary increased.
Example use case:
1) Create FixedChannelPool instance with one channel in the pool: new FixedChannelPool(cb, handler, 1)
2) Acquire channel A from the pool
3) close the channel A
4) Return it back to the pool
5) Acquire channel from the same pool again
Expected result:
new channel created and acquired, channel A that has been closed discarded and removed from the pool from being unhealthy
Actual result:
Channel A had been removed from the pool, how ever the new channel had never be acquired, instead the request to acquire had been added to the pending queue in FixedChannelPool and the acquireChannelCount is increased by one. The reason is that at the time when SimpleChannelPool figured out that the channel was unhealthy called FixedChannelPool.acquire to try to acquire new channel, how ever the request was added to the pendingTakQueue because by the time when FixedChannelPool.acquire was called, the acquireChannelCount was already "1" so new channel ould not be created cause of maxChannelsLimit=1.
Modifications:
The suggested approach modifies the SimpleChannelPool in a way so that when channel detected to be unhealthy it calls private method SimpleChannelPool.acquireHealthyFromPoolOrNew() which guarantees that SimpleChannelPool actually either finds a healthy channel in the pool and returns it or causes the promise.cause() in case when new channel was failed to be created.
Result:
The ```acquiredChannelCount``` is now calculated correctly as a result of SimpleChannelPool.acquire() of not being recursive on overridable acquire method.
Motiviation:
The current read loops don't fascilitate reading a maximum amount of bytes. This capability is useful to have more fine grain control over how much data is injested.
Modifications:
- Add a setMaxBytesPerRead(int) and getMaxBytesPerRead() to ChannelConfig
- Add a setMaxBytesPerIndividualRead(int) and getMaxBytesPerIndividualRead to ChannelConfig
- Add methods to RecvByteBufAllocator so that a pluggable scheme can be used to control the behavior of the read loop.
- Modify read loop for all transport types to respect the new RecvByteBufAllocator API
Result:
The ability to control how many bytes are read for each read operation/loop, and a more extensible read loop.
Motivation:
We don't decrease acquired channel count in FixedChannelPool when timeout occurs by AcquireTimeoutAction.NEW and eventually fails.
Modifications:
Set AcquireTask.acquired=true to call decrementAndRunTaskQueue when timeout action fails.
Result:
Acquired channel count decreases correctly.
Motivation:
We missed to correctly count acquired channels in FixedChannelPool which could produce an assert error.
Modifications:
Only try to decrement acquired count if the channel was really acuired.
Result:
No more assert error possible.
Motivation:
If the Channel is already closed when the PendingWriteQueue is created it will generate a NPE when add or remove is called later.
Modifications:
Add null checks to guard against NPE.
Result:
No more NPE possible.
Motivation:
Test was leaving composite buffers taken from the queue unreleased.
Modifications:
Make the test release buffers.
Result:
Nagging about leaked buffers should stop.
Motivation:
Simplifies writing code that needs to merge or slice a sequence of buffer & promise pairs into chunks of arbitrary sizes.
For example in HTTP2 we merge or split buffers across fixed-size DATA frame boundaries.
Modifications:
Add new utility class CoalescingBufferQueue
Result:
Following this change HTTP2 code will switch to use it instead of CompositeByteBuffer for DATA frame coalescing.
Motivation:
While cherry-picked 11f9e9084b I changed the EmbeddedChannel implementation to not allow no ChannelHandlers when constructing it.
This was done by mistake.
Modifications:
Revert change and add unit test.
Result:
Restore old behavior.
Motivation:
When using an EmbeddedChannel often it either does inbound or outbound processing which means we only often need one queue.
Modifications:
Lazy init the inbound and outbound message queues.
Result:
Less memory usage.
Motivation:
At the moment we directly closed the Channel when an exception accoured durring initChannel(...) without giving the user any way to do extra or special handling.
Modifications:
Handle the exception in exceptionCaught(...) of the ChannelInitializer which will by default log and close the Channel. This way the user can override this.
Result:
More felixible handling of exceptions.
Motivation:
Currently in EmbeddedChannel we add the ChannelHandlers before the Channel is registered which leads to have the handlerAdded(...) callback
be called from outside the EventLoop and also prevent the user to obtain a reference to the EventLoop in the callback itself.
Modifications:
Delay adding ChannelHandlers until EmbeddedChannel is registered.
Result:
Correctly call handlerAdded(...) after EmbeddedChannel is registered.
Motivation:
If you set a ChannelHandler via ServerBootstrap.handler(...) it is added to the ChannelPipeline before the Channel is registered. This will lead to and IllegalStateException if a user tries to access the EventLoop in the ChannelHandler.handlerAdded(...) method.
Modifications:
Delay the adding of the ChannelHandler until the Channel was registered.
Result:
No more IllegalStateException.
Motivation:
Only one of the three FixedChannelPool constructors checks for the constructor
arguments. Therfore it was possible to create a pool with zero maxConnections.
This change chains all constructors together, so that the last one
in the chain always checks the validity of the arguments, regardless of the
constructor used.
Result:
It is no longer possible to create a FixedChannelPool instance with invalid
maxConnections or maxPendingAcquires parameters.
Motivation:
FixedChannelPool should enforce a number of maximal used channels, but due a bug we fail to correctly enforce this.
Modifications:
Change check to correctly only acquire channel if we not hit the limit yet.
Result:
Correct limiting.
Motivation:
To avoid buffering too much it would be useful to get an estimate of how many bytes can be written to a Channel before it becomes unwritable.
Modifications:
- Update the Channel interface to support 2 new methods. 1 to give how many bytes before unwritable. 1 to give how many bytes before writable.
- Update the AbstractChannel implementation to delegate to the ChannelOutboundBuffer.
Result:
The Channel interface supports 2 new methods which provide more visibility into writability.
Motivation:
It's useful to be able to be notified once all Channels that are part of the ChannelGroup are notified. This can for example be useful if you want to do a graceful shutdown.
Modifications:
- Add ChannelGroup.newCloseFuture(...) which will be notified once all Channels are notified that are part of the ChannelGroup at the time of calling.
Result:
Easier to be notified once all Channels within a ChannelGroup are closed.
Motiviation:
There are currently no accessors which provide visbility into how many bytes must be written in order for a writability change to occur. This feature would be useful for codecs which intent to control how many bytes are queued at any given time.
Modifications:
- add bytesBeforeUnWritable() which will give the number of bytes before the buffer (and associated channel) transitions to not writable
- add bytesBeforeWritable() which will give the number of bytes that must be drained from the queue until the channel becomes writable.
Result:
More visibility into writability for the ChannelOutboundBuffer.
Motivation:
the ByteBuffer[] that we keep in the ThreadLocal are never nulled out which can lead to have ByteBuffer instances sit there forever.
This is even a bigger problem if nioBuffer() of ByteBuffer returns a new ByteBuffer that can not be destroyed by ByteBuffer.release().
Modifications:
Null out ByteBuffer array after processing.
Result:
No more dangling references after done.
Motivation:
SingleThreadEventLoopTest.testScheduleTaskAtFixedRate() fails often due to:
- too little tolerance
- incorrect assertion (it compares only with the previous timestamp)
Modifications:
- Increase the timestamp difference tolerance from 10ms to 20ms
- Improve the timestamp assertion so that the comparison is performed against the first recorded timestamp
- Misc: Fix broken Javadoc tag
Result:
More build stability
Motivation:
When trying to write more then Integer.MAX_VALUE / SSIZE_MAX via writev(...) the OS may return EINVAL depending on the kernel or the actual OS (bsd / osx always return EINVAL). This will trigger an IOException.
Modifications:
Never try to write more then Integer.MAX_VALUE / SSIZE_MAX when using writev.
Result:
No more IOException when write more data then Integer.MAX_VALUE / SSIZE_MAX via writev.
Motivation:
In the SslHandler we schedule a timeout at which we close the Channel if a timeout was detected during close_notify. Because this can race with notify the flushFuture we can see an IllegalStateException when the Channel is closed.
Modifications:
- Use a trySuccess() and tryFailure(...) to guard against race.
Result:
No more race.
Motivation:
We should not trigger channelWritabilityChanged during failing message when we are about to close the Channel as otherwise the use may try again writing even if the Channel is about to get closed.
Modifications:
Add new boolean param to ChannelOutboundBuffer.failFlushed(...) which allows to specify if we should notify or not.
Result:
channelWritabilityChanged is not triggered anymore if we cloe the Channel because of an IOException during write.
Motivation:
Previously, we deferred the closing of the Channel when we were flushing. This is problematic as this means that if the user adds a ChannelFutureListener, that will close the Channel, the closing will not happen until we are done with flushing. This can lead to more data is sent than expected.
Modifications:
- Do not defer closing when in flush
Result:
Correctly respect order of events and closing the Channel ASAP
Motivation:
The semantic of LocalChannel.doWrite(...) were a bit off as it notified the ChannelFuture before the data was actual moved to the peer buffer.
Modifications:
- Use our MPSC queue as inbound buffer
- Directly copy to data to the inbound buffer of the peer and either success or fail the promise after each copy.
Result:
Correct semantic and less memory copies.
Motiviation:
If user events or excpetions reach the tail end of the pipeline they are not released. This could result in buffer leaks.
Motivation:
- Use the ReferenceCountUtil.release to release objects for the userEventTriggered and exceptionCaught methods on DefaultChannelPipeline
Result:
2 less areas where buffer leaks can occur.
Motivation:
There are various known issues in netty-codec-dns:
- Message types are not interfaces, which can make it difficult for a
user to implement his/her own message implementation.
- Some class names and field names do not match with the terms in the
RFC.
- The support for decoding a DNS record was limited. A user had to
encode and decode by him/herself.
- The separation of DnsHeader from DnsMessage was unnecessary, although
it is fine conceptually.
- Buffer leak caused by DnsMessage was difficult to analyze, because the
leak detector tracks down the underlying ByteBuf rather than the
DnsMessage itself.
- DnsMessage assumes DNS-over-UDP.
- To send an EDNS message, a user have to create a new DNS record class
instance unnecessarily.
Modifications:
- Make all message types interfaces and add default implementations
- Rename some classes, properties, and constants to match the RFCs
- DnsResource -> DnsRecord
- DnsType -> DnsRecordType
- and many more
- Remove DnsClass and use an integer to support EDNS better
- Add DnsRecordEncoder/DnsRecordDecoder and their default
implementations
- DnsRecord does not require RDATA to be ByteBuf anymore.
- Add DnsRawRecord as the catch-all record type
- Merge DnsHeader into DnsMessage
- Make ResourceLeakDetector track AbstractDnsMessage
- Remove DnsMessage.sender/recipient properties
- Wrap DnsMessage with AddressedEnvelope
- Add DatagramDnsQuest and DatagramDnsResponse for ease of use
- Rename DnsQueryEncoder to DatagramDnsQueryEncoder
- Rename DnsResponseDecoder to DatagramDnsResponseDecoder
- Miscellaneous changes
- Add StringUtil.TAB
Result:
- Cleaner APi
- Can support DNS-over-TCP more easily in the future
- Reduced memory footprint in the default DnsQuery/Response
implementations
- Better leak tracking for DnsMessages
- Possibility to introduce new DnsRecord types in the future and provide
full record encoder/decoder implementation.
- No unnecessary instantiation for an EDNS pseudo resource record
Motivation:
Many projects need some kind a Channel/Connection pool implementation. While the protocols are different many things can be shared, so we should provide a generic API and implementation.
Modifications:
Add ChannelPool / ChannelPoolMap API and implementations.
Result:
Reusable / Generic pool implementation that users can use.
Motivation:
Because of a bug we missed to fail the connect future when doClose() is called. This can lead to a future which is never notified and so may lead to deadlocks in user-programs.
Modifications:
Correctly fail the connect future when doClose() is called and the connection was not established yet.
Result:
Connect future is always notified.
Motivation:
Each different *ChannelOption did extend ChannelOption in 4.0, which we changed in 4.1. This is a breaking change in terms of the API so we need to ensure we keep the old hierarchy.
Modifications:
- Let all *ChannelOption extend ChannelOption
- Add back constructor and mark it as @deprecated
Result:
No API breakage between 4.0 and 4.1
Related: #3464
Motivation:
When a connection attempt is failed,
NioSocketChannelUnsafe.closeExecutor() triggers a SocketException,
suppressing the channelUnregistered() event.
Modification:
Do not attempt to get SO_LINGER value when a socket is not open yet.
Result:
One less bug
Motivation:
At the moment when EmbeddedChannel is used and a ChannelHandler tries to schedule and task it will throw an UnsupportedOperationException. This makes it impossible to test these handlers or even reuse them with EmbeddedChannel.
Modifications:
- Factor out reusable scheduling code into AbstractSchedulingEventExecutor
- Let EmbeddedEventLoop and SingleThreadEventExecutor extend AbstractSchedulingEventExecutor
- add EmbbededChannel.runScheduledPendingTasks() which allows to run all scheduled tasks that are ready
Result:
Embeddedchannel is now usable even with ChannelHandler that try to schedule tasks.
Motivation:
We should allow to get a ChannelOption/AttributeKey from a String. This will make it a lot easier to make use of configuration files in applications.
Modifications:
- Add exists(...), newInstance(...) method to ChannelOption and AttributeKey and alter valueOf(...) to return an existing instance for a String or create one.
- Add unit tests.
Result:
Much more flexible usage of ChannelOption and AttributeKey.
Motivation:
As we plan to have other native transports soon (like a kqueue transport) we should move unix classes/interfaces out of the epoll package so we
introduce other implementations without breaking stuff before the next stable release.
Modifications:
Create a new io.netty.channel.unix package and move stuff over there.
Result:
Possible to introduce other native impls beside epoll.
Motivation:
If SO_LINGER is used shutdownOutput() and close() syscalls will block until either all data was send or until the timeout exceed. This is a problem when we try to execute them on the EventLoop as this means the EventLoop may be blocked and so can not process any other I/O.
Modifications:
- Add AbstractUnsafe.closeExecutor() which returns null by default and use this Executor for close if not null.
- Override the closeExecutor() in NioSocketChannel and EpollSocketChannel and return GlobalEventExecutor.INSTANCE if getSoLinger() > 0
- use closeExecutor() in shutdownInput(...) in NioSocketChannel and EpollSocketChannel
Result:
No more blocking of the EventLoop if SO_LINGER is used and shutdownOutput() or close() is called.
Motivation:
isRoot() is an expensive operation. We should avoid calling it if
possible.
Modifications:
Move the isRoot() checks to the end of the 'if' block, so that isRoot()
is evaluated only when really necessary.
Result:
isRoot() is evaluated only when SO_BROADCAST is set and the bind address
is anylocal address.
Related:
- 27a25e29f7
Motivation:
The commit mentioned above introduced a regression where
channelReadComplete() event is swallowed by a handler which was added
dynamically.
Modifications:
Do not suppress channelReadComplete() if the current handler's
channelRead() method was not invoked at all, so that a just-added
handler does not suppress channelReadComplete().
Result:
Regression is gone, and channelReadComplete() is invoked when necessary.
Motivation:
Even if a handler called ctx.fireChannelReadComplete(), the next handler
should not get its channelReadComplete() invoked if fireChannelRead()
was not invoked before.
Modifications:
- Ensure channelReadComplete() is invoked only when the handler of the
current context actually produced a message, because otherwise there's
no point of triggering channelReadComplete().
i.e. channelReadComplete() must follow channelRead().
- Fix a bug where ctx.read() was not called if the handler of the
current context did not produce any message, making the connection
stall. Read the new comment for more information.
Result:
- channelReadComplete() is invoked only when it makes sense.
- No stale connection
Motivation:
Because of a re-entrance bug in PendingWriteQueue it was possible to get the queue corrupted and also trigger an IllegalStateException caused by multiple recycling of the internal PendingWrite objects.
Modifications:
- Correctly guard against re-entrance
Result:
No more IllegalStateException possible
Motiviation:
When using domain sockets on linux it is supported to recv and send file descriptors. This can be used to pass around for example sockets.
Modifications:
- Add support for recv and send file descriptors when using EpollDomainSocketChannel.
- Allow to obtain the file descriptor for an Epoll*Channel so it can be send via domain sockets.
Result:
recv and send of file descriptors is supported now.
Motivation:
As the ByteBuf is not set to null after release it we may try to release it again in handleReadException()
Modifications:
- set ByteBuf to null to avoid another byteBuf.release() to be called in handleReadException()
Result:
No IllegalReferenceCountException anymore
Motivation:
Fix a minor documentation bug in
ChannelHandlerContext#fireChannelReadComplete.
Modifications:
ChannelHandlerContext#fireChannelReadComplete no longer references an
incorrect method in its javadoc.
Results:
Documentation is correct.
Motivation:
We only provided a constructor in DefaultFileRegion that takes a FileChannel which means the File itself needs to get opened on construction. This has the problem that if you want to write a lot of Files very fast you may end up with may open FD's even if they are not needed yet. This can lead to hit the open FD limit of the OS.
Modifications:
Add a new constructor to DefaultFileRegion which allows to construct it from a File. The FileChannel will only be obtained when transferTo(...) is called or the DefaultFileRegion is explicit open'ed via open() (this is needed for the native epoll transport)
Result:
Less resource usage when writing a lot of DefaultFileRegion.
Related: #3212
Motivation:
When SslHandler and ChunkedWriteHandler exists in a pipeline together,
it is possible that ChunkedWriteHandler.channelWritabilityChanged()
invokes SslHandler.flush() and vice versa. Because they can feed each
other (i.e. ChunkedWriteHandler.channelWritabilityChanged() ->
SslHandler.flush() -> ChunkedWriteHandler.channelWritabilityChanged() ->
..), they can fall into an inconsistent state due to reentrance (e.g.
bad MAC record at the remote peer due to incorrect ordering.)
Modifications:
- Trigger channelWritabilityChanged() using EventLoop.execute() when
there's a chance where channelWritabilityChanged() can cause a
reentrance issue
- Fix test failures caused by the modification
Result:
Fix the handler reentrance issues related with a
channelWritabilityChanged() event
Related: #3212
Motivation:
PendingWriteQueue.recycle() updates its data structure after triggering
a channelWritabilityChanged() event. It causes a rare corruption such as
double free when channelWritabilityChanged() method accesses the
PendingWriteQueue.
Modifications:
Update the state of PendingWriteQueue before triggering an event.
Result:
Fix a rare double-free problem
Related: #3190
Motivation:
When an outbound handler method raises an exception, its promise is
marked as failed. If the promise is done already, the exception is
logged.
When the promise is void, exceptionCaught() must be triggered to notify
a user. However, ChannelHandlerInvokerUtil simply swallows it.
Modifications:
Do not swallow an exception when the promise is void.
Result:
A user who uses a void promise for an outbound operation will be
notified on failure.
Related: #3189
Motivation:
OIO transport implementations block for at most 1 second to wait for
additional messages (or accepted connections).
However, because AbstractOioMessageChannel defers the channelRead()
events for the messages read so far until there's nothing to read up to
maxMessagesPerRead, any read operation will be followed by a 1-second
delay.
Modifications:
Fire channelRead() events as soon as doRead() returns so that there is
no 1 second delay between the actual read and the channelRead() event.
Result:
No more weird 1-second delay
Related: #3156
Motivation:
Let's say we have a channel with the following pipeline configuration:
HEAD --> [E1] H1 --> [E2] H2 --> TAIL
when the channel is deregistered, the channelUnregistered() methods of
H1 and H2 will be invoked from the executor thread of E1 and E2
respectively. To ensure that the channelUnregistered() methods are
invoked from the correct thread, new one-time tasks will be created
accordingly and be scheduled via Executor.execute(Runnable).
As soon as the one-time tasks are scheduled,
DefaultChannelPipeline.fireChannelUnregistered() will start to remove
all handlers from the pipeline via teardownAll(). This process is
performed in reversed order of event propagation. i.e. H2 is removed
first, and then H1 is removed.
If the channelUnregistered() event has been passed to H2 before H2 is
removed, a user does not see any problem.
If H2 has been removed before channelUnregistered() event is passed to
H2, a user will often see the following confusing warning message:
An exceptionCaught() event was fired, and it reached at the tail of
the pipeline. It usually means the last handler in the pipeline did
not handle the exception.
Modifications:
To ensure that the handlers are removed *after* all events are
propagated, traverse the pipeline in ascending order before performing
the actual removal.
Result:
A user does not get the confusing warning message anymore.
Motivation:
AbstractUnsafe considers two possibilities during channel registration. First,
the channel may be an outgoing connection, in which case it will be registered
before becoming active. Second, the channel may be an incoming connection in,
which case the channel will already be active when it is registered. To handle
the second case, AbstractUnsafe checks if the channel is active after
registration and calls ChannelPipeline.fireChannelActive() if so. However, if
an active channel is deregistered and then re-registered this logic causes a
second fireChannelActive() to be invoked. This is unexpected; it is reasonable
for handlers to assume that this method will only be invoked once per channel.
Modifications:
This change introduces a flag into AbstractUnsafe to recognize if this is the
first or a subsequent registration. ChannelPipeline.fireChannelActive() is only
possible for the first registration.
Result:
ChannelPipeline.fireChannelActive() is only called once.
Motivation:
Found performance issues via FindBugs and PMD.
Modifications:
- Removed unnecessary boxing/unboxing operations in DefaultTextHeaders.convertToInt(CharSequence) and DefaultTextHeaders.convertToLong(CharSequence). A boxed primitive is created from a string, just to extract the unboxed primitive value.
- Added a static modifier for DefaultHttp2Connection.ParentChangedEvent class. This class is an inner class, but does not use its embedded reference to the object which created it. This reference makes the instances of the class larger, and may keep the reference to the creator object alive longer than necessary.
- Added a static compiled Pattern to avoid compile it each time it is used when we need to replace some part of authority.
- Improved using of StringBuilders.
Result:
Performance improvements.
Motivation:
ChannelPromiseAggregator and ChannelPromiseNotifiers only allow
consumers to work with Channels as the result type. Generic versions
of these classes allow consumers to aggregate or broadcast the results
of an asynchronous execution with other result types.
Modifications:
Add PromiseAggregator and PromiseNotifier. Add unit tests for both.
Remove code in ChannelPromiseAggregator and ChannelPromiseNotifier and
modify them to extend the new base classes.
Result:
Consumers can now aggregate or broadcast the results of an asynchronous
execution with results types other than Channel.
Related: #2945
Motivation:
Some special handlers such as TrafficShapingHandler need to override the
writability of a Channel to throttle the outbound traffic.
Modifications:
Add a new indexed property called 'user-defined writability flag' to
ChannelOutboundBuffer so that a handler can override the writability of
a Channel easily.
Result:
A handler can override the writability of a Channel using an unsafe API.
For example:
Channel ch = ...;
ch.unsafe().outboundBuffer().setUserDefinedWritability(1, false);
Related: #2034
Motivation:
Some users want to mock Bootstrap (or ServerBootstrap), and thus they
should not be final but be fully overridable and extensible.
Modifications:
Remove finals wherever possible
Result:
@daschl is happy.
Related: #2964
Motivation:
Writing a zero-length FileRegion to an NIO channel will lead to an
infinite loop.
Modification:
- Do not write a zero-length FileRegion by protecting with proper 'if'.
- Update the testsuite
Result:
Another bug fixed
Motivation:
So far, we relied on the domain name resolution mechanism provided by
JDK. It served its purpose very well, but had the following
shortcomings:
- Domain name resolution is performed in a blocking manner.
This becomes a problem when a user has to connect to thousands of
different hosts. e.g. web crawlers
- It is impossible to employ an alternative cache/retry policy.
e.g. lower/upper bound in TTL, round-robin
- It is impossible to employ an alternative name resolution mechanism.
e.g. Zookeeper-based name resolver
Modification:
- Add the resolver API in the new module: netty-resolver
- Implement the DNS-based resolver: netty-resolver-dns
.. which uses netty-codec-dns
- Make ChannelFactory reusable because it's now used by
io.netty.bootstrap, io.netty.resolver.dns, and potentially by other
modules in the future
- Move ChannelFactory from io.netty.bootstrap to io.netty.channel
- Deprecate the old ChannelFactory
- Add ReflectiveChannelFactory
Result:
It is trivial to resolve a large number of domain names asynchronously.
Motivation:
When a datagram packet is sent to a destination where nobody actually listens to,
the server O/S will respond with an ICMP Port Unreachable packet.
The ICMP Port Unreachable packet is translated into PortUnreachableException by JDK.
PortUnreachableException is not a harmful exception that prevents a user from sending a datagram.
Therefore, we should not close a datagram channel when PortUnreachableException is caught.
Modifications:
- Do not close a channel when the caught exception is PortUnreachableException.
Result:
A datagram channel is not closed unexpectedly anymore.
Motivation:
JDK's exception messages triggered by a connection attempt failure do
not contain the related remote address in its message. We currently
append the remote address to ConnectException's message, but I found
that we need to cover more exception types such as SocketException.
Modifications:
- Add AbstractUnsafe.annotateConnectException() to de-duplicate the
code that appends the remote address
Result:
- Less duplication
- A transport implementor can annotate connection attempt failure
message more easily
Motivation:
There's no way to generate the name of a handler being newly added
automatically and reliably.
For example, let's say you have a routine that adds a set of handlers to
a pipeline using addBefore() or addAfter(). Because addBefore() and
addAfter() always require non-conflicting non-null handler name, making
the multiple invocation of the routine on the same pipeline is
non-trivial.
Modifications:
- If a user specifies null as the name of the new handler,
DefaultChannelPipeline generates one.
- Update the documentation of ChannelPipeline to match the new behavior
Result:
A user doesn't need to worry about name conflicts anymore.
Motiviation:
Before this change, autoRead was a volatile boolean accessed directly. Any thread that invoked the DefaultChannelConfig#setAutoRead(boolean) method would read the current value of autoRead, and then set a new value. If the old value did not match the new value, some action would be immediately taken as part of the same method call.
As volatile only provides happens-before consistency, there was no guarantee that the calling thread was actually the thread mutating the state of the autoRead variable (such that it should be the one to invoke the follow-up actions). For example, with 3 threads:
* Thread 1: get = false
* Thread 1: set = true
* Thread 1: invokes read()
* Thread 2: get = true
* Thread 3: get = true
* Thread 2: set = false
* Thread 2: invokes autoReadCleared()
* Event Loop receives notification from the Selector that data is available, but as autoRead has been cleared, cancels the operation and removes read interest
* Thread 3: set = true
This results in a livelock - autoRead is set true, but no reads will happen even if data is available (as readyOps). The only way around this livelock currently is to set autoRead to false, and then back to true.
Modifications:
Write access to the autoRead variable is now made using the getAndSet() method of an AtomicIntegerFieldUpdater, AUTOREAD_UPDATER. This also changed the type of the underlying autoRead variable to be an integer, as no AtomicBooleanFieldUpdater class exists. Boolean logic is retained by assuming that 1 is true and 0 is false.
Result:
There is no longer a race condition between retrieving the old value of the autoRead variable and setting a new value.
Motivation:
We used the wrong EventExecutor to notify for bind failures if a late registration was done.
Modifications:
Use the correct EventExecutor to notify and only use the GlobelEventExecutor if the registration fails itself.
Result:
The correct Thread will do the notification.
When a ChannelOutboundBuffer contains ByteBufs followed by a FileRegion,
removeBytes() will fail with a ClassCastException. It should break the
loop instead.
f31c630c8c was causing
SocketGatheringWriteTest to fail because it does not take the case where
an empty buffer exists in a gathering write.
When there is an empty buffer in a gathering write, the number of
buffers returned by ChannelOutboundBuffer.nioBuffer() and the actual
number of write attemps can differ.
To remove the write requests correctly, a byte transport must use
ChannelOutboundBuffer.removeBytes()
Motivation:
Because of an incorrect logic in teh EmbeddedChannel constructor it is not possible to use EmbeddedChannel with a ChannelInitializer as constructor argument. This is because it adds the internal LastInboundHandler to its ChannelPipeline before it register itself to the EventLoop.
Modifications:
First register self to EventLoop before add LastInboundHandler to the ChannelPipeline.
Result:
It's now possible to use EmbeddedChannel with ChannelInitializer.
Motivation:
Due a regression NioSocketChannel.doWrite(...) will throw a ClassCastException if you do something like:
channel.write(bytebuf);
channel.write(fileregion);
channel.flush();
Modifications:
Correctly handle writing of different message types by using the correct message count while loop over them.
Result:
No more ClassCastException
Motivation:
The previous fix did disable the caching of ByteBuffers completely which can cause performance regressions. This fix makes sure we use nioBuffers() for all writes in NioSocketChannel and so prevent data-corruptions. This is still kind of a workaround which will be replaced by a more fundamental fix later.
Modifications:
- Revert 4059c9f354
- Use nioBuffers() for all writes to prevent data-corruption
Result:
No more data-corruption but still retain the original speed.
Motivation:
At the moment we expand the ByteBuffer[] when we have more then 1024 ByteBuffer to write and replace the stored instance in its FastThreadLocal. This is not needed and may even harm performance on linux as IOV_MAX is 1024 and so this may cause the JVM to do an array copy.
Modifications:
Just exit the nioBuffers() method if we can not fit more ByteBuffer in the array. This way we will pick them up on the next call.
Result:
Remove uncessary array copy and simplify the code.
Motivation:
We cache the ByteBuffers in ChannelOutboundBuffer.nioBuffers() for the Entries in the ChannelOutboundBuffer to reduce some overhead. The problem is this can lead to data-corruption if an incomplete write happens and next time we try to do a non-gathering write.
To fix this we should remove the caching which does not help a lot anyway and just make the code buggy.
Modifications:
Remove the caching of ByteBuffers.
Result:
No more data-corruption.
Motivation:
At the moment it's only possible for a user to set the RecvByteBufAllocator for a Channel but not access the Handle once it is assigned. This makes it hard to write more flexible implementations.
Modifications:
Add a new method to the Channel.Unsafe to allow access the the used Handle for the Channel. The RecvByteBufAllocator.Handle is created lazily.
Result:
It's possible to write more flexible implementatons that allow to adjust stuff on the fly for a Handle that is used by a Channel
Motivation:
Sometimes ChannelHandler need to queue writes to some point and then process these. We currently have no datastructure for this so the user will use an Queue or something like this. The problem is with this Channel.isWritable() will not work as expected and so the user risk to write to fast. That's exactly what happened in our SslHandler. For this purpose we need to add a special datastructure which will also take care of update the Channel and so be sure that Channel.isWritable() works as expected.
Modifications:
- Add PendingWriteQueue which can be used for this purpose
- Make use of PendingWriteQueue in SslHandler
Result:
It is now possible to queue writes in a ChannelHandler and still have Channel.isWritable() working as expected. This also fixes#2752.
Motivation:
We did various changes related to the ChannelOutboundBuffer in 4.0 branch. This commit port all of them over and so make sure our branches are synced in terms of these changes.
Related to [#2734], [#2709], [#2729], [#2710] and [#2693] .
Modification:
Port all changes that was done on the ChannelOutboundBuffer.
This includes the port of the following commits:
- 73dfd7c01b
- 997d8c32d2
- e282e504f1
- 5e5d1a58fd
- 8ee3575e72
- d6f0d12a86
- 16e50765d1
- 3f3e66c31a
Result:
- Less memory usage by ChannelOutboundBuffer
- Same code as in 4.0 branch
- Make it possible to use ChannelOutboundBuffer with Channel implementation that not extends AbstractChannel
Motivation:
The PID_MAX_LIMIT on 64bit linux systems is 4194304 and on osx it is 99998. At the moment we use 65535 as an upper-limit which is too small.
Modifications:
Use 4194304 as max possible value
Result:
No more false-positives when try to detect current pid.
Motivation:
We have some inconsistency when handling writes. Sometimes we call ChannelOutboundBuffer.progress(...) also for complete writes and sometimes not. We should call it always.
Modifications:
Correctly call ChannelOuboundBuffer.progress(...) for complete and incomplete writes.
Result:
Consistent behavior
Motivation:
While benchmarking the native transport with gathering writes I noticed that it is quite slow. This is due the fact that we need to do a lot of array copies to get the buffers into the iov array.
Modification:
Introduce a new class calles IovArray which allows to fill buffers directly in a iov array that can be passed over to JNI without any array copies. This gives a nice optimization in terms of speed when doing gathering writes.
Result:
Big performance improvement when doing gathering writes. See the included benchmark...
Before:
[nmaurer@xxx]~% wrk/wrk -H 'Host: localhost' -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8' -H 'Connection: keep-alive' -d 120 -c 256 -t 16 --pipeline 256 http://xxx:8080/plaintext
Running 2m test @ http://xxx:8080/plaintext
16 threads and 256 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 23.44ms 16.37ms 259.57ms 91.77%
Req/Sec 181.99k 31.69k 304.60k 78.12%
346544071 requests in 2.00m, 46.48GB read
Requests/sec: 2887885.09
Transfer/sec: 396.59MB
With this change:
[nmaurer@xxx]~% wrk/wrk -H 'Host: localhost' -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8' -H 'Connection: keep-alive' -d 120 -c 256 -t 16 --pipeline 256 http://xxx:8080/plaintext
Running 2m test @ http://xxx:8080/plaintext
16 threads and 256 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 21.93ms 16.33ms 305.73ms 92.34%
Req/Sec 194.56k 33.75k 309.33k 77.04%
369617503 requests in 2.00m, 49.57GB read
Requests/sec: 3080169.65
Transfer/sec: 423.00MB
Motivation:
ChannelOutboundBuffer is basically a circular array queue of its entry
objects. Once an entry is created in the array, it is never nulled out
to reduce the allocation cost.
However, because it is a circular queue, the array almost always ends up
with as many entry instances as the size of the array, regardless of the
number of pending writes.
At worst case, a channel might have only 1 pending writes at maximum
while creating 32 entry objects, where 32 is the initial capacity of the
array.
Modifications:
- Reduce the initial capacity of the circular array queue to 4.
- Make the initial capacity of the circular array queue configurable
Result:
We spend 4 times less memory for entry objects under certain
circumstances.
Motivation:
At the moment NioSocketChannelOutboundBuffer.nioBuffers() / EpollSocketChannelOutboundBuffer.memoryAddresses() returns null if something is contained in the ChannelOutboundBuffer which is not a ByteBuf. This is a problem for two reasons:
1 - In the javadocs we state that it will never return null
2 - We may do a not optimal write as there may be things that could be written via gathering writes
Modifications:
Change NioSocketChannelOutboundBuffer.nioBuffers() / EpollSocketChannelOutboundBuffer.memoryAddresses() to never return null but have it contain all ByteBuffer that were found before the non ByteBuf. This way we can do a gathering write and also conform to the javadocs.
Result:
Better speed and also correct implementation in terms of the api.
Motivation:
Now Netty has a few problems with null values.
Modifications:
- Check HAProxyProxiedProtocol in HAProxyMessage constructor and throw NPE if it is null.
If HAProxyProxiedProtocol is null we will set AddressFamily as null. So we will get NPE inside checkAddress(String, AddressFamily) and it won't be easy to understand why addrFamily is null.
- Check File in DiskFileUpload.toString().
If File is null we will get NPE when calling toString() method.
- Check Result<String> in MqttDecoder.decodeConnectionPayload(...).
If !mqttConnectVariableHeader.isWillFlag() || !mqttConnectVariableHeader.hasUserName() || !mqttConnectVariableHeader.hasPassword() we will get NPE when we will try to create new instance of MqttConnectPayload.
- Check Unsafe before calling unsafe.getClass() in PlatformDependent0 static block.
- Removed unnecessary null check in WebSocket08FrameEncoder.encode(...).
Because msg.content() can not return null.
- Removed unnecessary null check in DefaultStompFrame(StompCommand) constructor.
Because we have this check in the super class.
- Removed unnecessary null checks in ConcurrentHashMapV8.removeTreeNode(TreeNode<K,V>).
- Removed unnecessary null check in OioDatagramChannel.doReadMessages(List<Object>).
Because tmpPacket.getSocketAddress() always returns new SocketAddress instance.
- Removed unnecessary null check in OioServerSocketChannel.doReadMessages(List<Object>).
Because socket.accept() always returns new Socket instance.
- Pass Unpooled.buffer(0) instead of null inside CloseWebSocketFrame(boolean, int) constructor.
If we will pass null we will get NPE in super class constructor.
- Added throw new IllegalStateException in GlobalEventExecutor.awaitInactivity(long, TimeUnit) if it will be called before GlobalEventExecutor.execute(Runnable).
Because now we will get NPE. IllegalStateException will be better in this case.
- Fixed null check in OpenSslServerContext.setTicketKeys(byte[]).
Now we throw new NPE if byte[] is not null.
Result:
Added new null checks when it is necessary, removed unnecessary null checks and fixed some NPE problems.
Motivation:
Fix some typos in Netty.
Modifications:
- Fix potentially dangerous use of non-short-circuit logic in Recycler.transfer(Stack<?>).
- Removed double 'the the' in javadoc of EmbeddedChannel.
- Write to log an exception message if we can not get SOMAXCONN in the NetUtil's static block.
Motivation:
As a DatagramChannel supports to write to multiple remote peers we must not close the Channel once a IOException accours as this error may be only valid for one remote peer.
Modification:
Continue writing on IOException.
Result:
DatagramChannel can be used even after an IOException accours during writing.
Motivation:
Because of a missing return statement we may produce a NPE when try to fullfill the connect ChannelPromise when it was fullfilled before.
Modification:
Add missing return statement.
Result:
No more NPE.
Motivation:
When system is in short of entrophy, the initialization of
ThreadLocalRandom can take at most 3 seconds. The initialization occurs
when ThreadLocalRandom.current() is invoked first time, which might be
much later than the moment when the application has started. If we
start the initialization of ThreadLocalRandom as early as possible, we
can reduce the perceived time taken for the retrieval.
Modification:
Begin the initialization of ThreadLocalRandom in InternalLoggerFactory,
potentially one of the firstly initialized class in a Netty application.
Make DefaultChannelId retrieve the current process ID before retrieving
the current machine ID, because retrieval of a machine ID is more likely
to use ThreadLocalRandom.current().
Use a dummy channel ID for EmbeddedChannel, which prevents many unit
tests from creating a ThreadLocalRandom instance.
Result:
We gain extra 100ms at minimum for initialSeedUniquifier generation. If
an application has its own initialization that takes long enough time
and generates good amount of entrophy, it is very likely that we will
gain a lot more.
Motivation:
When a bind fails AbstractBootstrap will use the GlobalEventExecutor to notify the ChannelPromise. We should use the EventLoop of the Channel if possible.
Modification:
Use EventLoop of the Channel if possible to use the correct Thread to notify and so guaranteer the right order of events.
Result:
Use the correct EventLoop for notification
Motivation:
There is no way for a ChannelHandler to check if the passed in ChannelPromise for a write(...) call is a VoidChannelPromise. This is a problem as some handlers need to add listeners to the ChannelPromise which is not possible in the case of a VoidChannelPromise.
Modification:
- Introduce ChannelFuture.isVoid() which will return true if it is not possible to add listeners or wait on the result.
- Add ChannelPromise.unvoid() which allows to create a ChannelFuture out of a void ChannelFuture which supports all the operations.
Result:
It's now easy to write ChannelHandler implementations which also works when a void ChannelPromise is used.
Motivation:
We use the nanoTime of the scheduledTasks to calculate the milli-seconds to wait for a select operation to select something. Once these elapsed we check if there was something selected or some task is ready for processing. Unfortunally we not take into account scheduled tasks here so the selection loop will continue if only scheduled tasks are ready for processing. This will delay the execution of these tasks.
Modification:
- Check if a scheduled task is ready after selecting
- also make a tiny change in NioEventLoop to not trigger a rebuild if nothing was selected because the timeout was reached a few times in a row.
Result:
Execute scheduled tasks on time.
Motivation:
When a select rebuild was triggered the reference to the SelectionKey is not updated in AbstractNioChannel. This will cause a CancelledKeyException later.
Modification:
Correctly update SelectionKey reference after rebuild
Result:
Fix exception
Motivation:
Recycler is used in many places to reduce GC-pressure but is still not as fast as possible because of the internal datastructures used.
Modification:
- Rewrite Recycler to use a WeakOrderQueue which makes minimal guaranteer about order and visibility for max performance.
- Recycling of the same object multiple times without acquire it will fail.
- Introduce a RecyclableMpscLinkedQueueNode which can be used for MpscLinkedQueueNodes that use Recycler
These changes are based on @belliottsmith 's work that was part of #2504.
Result:
Huge increase in performance.
4.0 branch without this commit:
Benchmark (size) Mode Samples Score Score error Units
i.n.m.i.RecyclableArrayListBenchmark.recycleSameThread 00000 thrpt 20 116026994.130 2763381.305 ops/s
i.n.m.i.RecyclableArrayListBenchmark.recycleSameThread 00256 thrpt 20 110823170.627 3007221.464 ops/s
i.n.m.i.RecyclableArrayListBenchmark.recycleSameThread 01024 thrpt 20 118290272.413 7143962.304 ops/s
i.n.m.i.RecyclableArrayListBenchmark.recycleSameThread 04096 thrpt 20 120560396.523 6483323.228 ops/s
i.n.m.i.RecyclableArrayListBenchmark.recycleSameThread 16384 thrpt 20 114726607.428 2960013.108 ops/s
i.n.m.i.RecyclableArrayListBenchmark.recycleSameThread 65536 thrpt 20 119385917.899 3172913.684 ops/s
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 297.617 sec - in io.netty.microbench.internal.RecyclableArrayListBenchmark
4.0 branch with this commit:
Benchmark (size) Mode Samples Score Score error Units
i.n.m.i.RecyclableArrayListBenchmark.recycleSameThread 00000 thrpt 20 204158855.315 5031432.145 ops/s
i.n.m.i.RecyclableArrayListBenchmark.recycleSameThread 00256 thrpt 20 205179685.861 1934137.841 ops/s
i.n.m.i.RecyclableArrayListBenchmark.recycleSameThread 01024 thrpt 20 209906801.437 8007811.254 ops/s
i.n.m.i.RecyclableArrayListBenchmark.recycleSameThread 04096 thrpt 20 214288320.053 6413126.689 ops/s
i.n.m.i.RecyclableArrayListBenchmark.recycleSameThread 16384 thrpt 20 215940902.649 7837706.133 ops/s
i.n.m.i.RecyclableArrayListBenchmark.recycleSameThread 65536 thrpt 20 211141994.206 5017868.542 ops/s
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 297.648 sec - in io.netty.microbench.internal.RecyclableArrayListBenchmark
Motivation:
LocalServerChannel.doClose() calls LocalChannelRegistry.unregister(localAddress); without check if localAddress is null and so produce a NPE when pass null the used ConcurrentHashMapV8
Modification:
Check for localAddress != null before try to remove it from Map. Also added a unit test which showed the stacktrace of the error.
Result:
No more NPE during doClose().
Motivation:
At the moment AbstractBoostrap.bind(...) will always use the GlobalEventExecutor to notify the returned ChannelFuture if the registration is not done yet. This should only be done if the registration fails later. If it completes successful we should just notify with the EventLoop of the Channel.
Modification:
Use EventLoop of the Channel if possible to use the correct Thread to notify and so guaranteer the right order of events.
Result:
Use the correct EventLoop for notification
Motivation:
When Netty runs in a managed environment such as web application server,
Netty needs to provide an explicit way to remove the thread-local
variables it created to prevent class loader leaks.
FastThreadLocal uses different execution paths for storing a
thread-local variable depending on the type of the current thread.
It increases the complexity of thread-local removal.
Modifications:
- Moved FastThreadLocal and FastThreadLocalThread out of the internal
package so that a user can use it.
- FastThreadLocal now keeps track of all thread local variables it has
initialized, and calling FastThreadLocal.removeAll() will remove all
thread-local variables of the caller thread.
- Added FastThreadLocal.size() for diagnostics and tests
- Introduce InternalThreadLocalMap which is a mixture of hard-wired
thread local variable fields and extensible indexed variables
- FastThreadLocal now uses InternalThreadLocalMap to implement a
thread-local variable.
- Added ThreadDeathWatcher.unwatch() so that PooledByteBufAllocator
tells it to stop watching when its thread-local cache has been freed
by FastThreadLocal.removeAll().
- Added FastThreadLocalTest to ensure that removeAll() works
- Added microbenchmark for FastThreadLocal and JDK ThreadLocal
- Upgraded to JMH 0.9
Result:
- A user can remove all thread-local variables Netty created, as long as
he or she did not exit from the current thread. (Note that there's no
way to remove a thread-local variable from outside of the thread.)
- FastThreadLocal exposes more useful operations such as isSet() because
we always implement a thread local variable via InternalThreadLocalMap
instead of falling back to JDK ThreadLocal.
- FastThreadLocalBenchmark shows that this change improves the
performance of FastThreadLocal even more.
Motivation:
The code in ChannelOutboundBuffer can be simplified by using AtomicLongFieldUpdater.addAndGet(...)
Modification:
Replace our manual looping with AtomicLongFieldUpdater.addAndGet(...)
Result:
Cleaner code
Motivation:
If ChannelOutboundBuffer.addFlush() is called multiple times and flushed != unflushed it will still loop through all entries that are not flushed yet even if it is not needed anymore as these were marked uncancellable before.
Modifications:
Check if new messages were added since addFlush() was called and only if this was the case loop through all entries and try to mark the uncancellable.
Result:
Less overhead when ChannelOuboundBuffer.addFlush() is called multiple times without new messages been added.
Motivation:
Provide a faster ThreadLocal implementation
Modification:
Add a "FastThreadLocal" which uses an EnumMap and a predefined fixed set of possible thread locals (all of the static instances created by netty) that is around 10-20% faster than standard ThreadLocal in my benchmarks (and can be seen having an effect in the direct PooledByteBufAllocator benchmark that uses the DEFAULT ByteBufAllocator which uses this FastThreadLocal, as opposed to normal instantiations that do not, and in the new RecyclableArrayList benchmark);
Result:
Improved performance
Motivation:
Each of DefaultChannelPipeline instance creates an head and tail that wraps a handler. These are used to chain together other DefaultChannelHandlerContext that are created once a new ChannelHandler is added. There are a few things here that can be improved in terms of memory usage and initialization time.
Modification:
- Only generate the name for the tail and head one time as it will never change anyway
- Rename DefaultChannelHandlerContext to AbstractChannelHandlerContext and make it abstract
- Create a new DefaultChannelHandlerContext that is used when a ChannelHandler is added to the DefaultChannelPipeline
- Rename TailHandler to TailContext and HeadHandler to HeadContext and let them extend AbstractChannelHandlerContext. This way we can save 2 object creations per DefaultChannelPipeline
Result:
- Less memory usage because we have 2 less objects per DefaultChannelPipeline
- Faster creation of DefaultChannelPipeline as we not need to generate the name for the head and tail
Motivation:
At the moment ChannelFlushPromiseNotifier.add(....) takes an int value for pendingDataSize, which may be too small as a user may need to use a long. This can for example be useful when a user writes a FileRegion etc. Beside this the notify* method names are kind of missleading as these should not contain *Future* because it is about ChannelPromises.
Modification:
Add a new add(...) method that takes a long for pendingDataSize and @deprecated the old method. Beside this also @deprecated all *Future* methods and add methods that have *Promise* in the method name to better reflect usage.
Result:
ChannelFlushPromiseNotifier can be used with bigger data.
Motivation:
The default StringBuilder size is too small (data.length + 4) while it will be 2*data.length (byte to Hex) + 5 "-" char (since 5 peaces appended).
Modification:
Changing initial size to the correct one
Result:
Allocation of the correct final size from the beginning for this StringBuilder.
Motivation:
On some ill-configured systems, InetAddress.getLocalHost() fails. NioSocketChannelTest calls java.net.Socket.connect() and it internally invoked InetAddress.getLocalHost(), which causes the test failures in NioSocketChannelTes on such an ill-configured system.
Modifications:
Use NetUtil.LOCALHOST explicitly.
Result:
NioSocketChannelTest should not fail anymore.
Motivation:
The old DefaultAttributeMap impl did more synchronization then needed and also did not expose a efficient way to check if an attribute exists with a specific key.
Modifications:
* Rewrite DefaultAttributeMap to not use IdentityHashMap and synchronization on the map directly. The new impl uses a combination of AtomicReferenceArray and synchronization per chain (linked-list). Also access the first Attribute per bucket can be done without any synchronization at all and just uses atomic operations. This should fit for most use-cases pretty weel.
* Add hasAttr(...) implementation
Result:
It's now possible to check for the existence of a attribute without create one. Synchronization is per linked-list and the first entry can even be added via atomic operation.
Motivation:
At the moment we sometimes use only RecvByteBufAllocator.guess() to guess the next size and the use the ByteBufAllocator.* directly to allocate the buffer. We should always use RecvByteBufAllocator.allocate(...) all the time as this makes the behavior easier to adjust.
Modifications:
Change the read() implementations to make use of RecvByteBufAllocator.
Result:
Behavior is more consistent.