Motivation:
Several issues were shown by various ticket (#2900#2956).
Also use the improvement on writability user management from #3036.
And finally add a mixte handler, both for Global and Channels, with
the advantages of being uniquely created and using less memory and
less shaping.
Issue #2900
When a huge amount of data are written, the current behavior of the
TrafficShaping handler is to limit the delay to 15s, whatever the delay
the previous write has. This is wrong, and when a huge amount of writes
are done in a short time, the traffic is not correctly shapened.
Moreover, there is a high risk of OOM if one is not using in his/her own
handler for instance ChannelFuture.addListener() to handle the write
bufferisation in the TrafficShapingHandler.
This fix use the "user-defined writability flags" from #3036 to
allow the TrafficShapingHandlers to "user-defined" managed writability
directly, as for reading, thus using the default isWritable() and
channelWritabilityChanged().
This allows for instance HttpChunkedInput to be fully compatible.
The "bandwidth" compute on write is only on "acquired" write orders, not
on "real" write orders, which is wrong from statistic point of view.
Issue #2956
When using GlobalTrafficShaping, every write (and read) are
synchronized, thus leading to a drop of performance.
ChannelTrafficShaping is not touched by this issue since synchronized is
then correct (handler is per channel, so the synchronized).
Modifications:
The current write delay computation takes into account the previous
write delay and time to check is the 15s delay (maxTime) is really
exceeded or not (using last scheduled write time). The algorithm is
simplified and in the same time more accurate.
This proposal uses the #3036 improvement on user-defined writability
flags.
When the real write occurs, the statistics are update accordingly on a
new attribute (getRealWriteThroughput()).
To limit the synchronisations, all synchronized on
GlobalTrafficShapingHandler on submitWrite were removed. They are
replaced with a lock per channel (since synchronization is still needed
to prevent unordered write per channel), as in the sendAllValid method
for the very same reason.
Also all synchronized on TrafficCounter on read/writeTimeToWait() are
removed as they are unnecessary since already locked before by the
caller.
Still the creation and remove operations on lock per channel (PerChannel
object) are synchronized to prevent concurrency issue on this critical
part, but then limited.
Additionnal changes:
1) Use System.nanoTime() instead of System.currentTimeMillis() and
minimize calls
2) Remove / 10 ° 10 since no more sleep usage
3) Use nanoTime instead of currentTime such that time spend is computed,
not real time clock. Therefore the "now" relative time (nanoTime based)
is passed on all sub methods.
4) Take care of removal of the handler to force write all pending writes
and release read too
8) Review Javadoc to explicit:
- recommandations to take into account isWritable
- recommandations to provide reasonable message size according to
traffic shaping limit
- explicit "best effort" traffic shaping behavior when changing
configuration dynamically
Add a MixteGlobalChannelTrafficShapingHandler which allows to use only one
handler for mixing Global and Channel TSH. I enables to save more memory and
tries to optimize the traffic among various channels.
Result:
The traffic shaping is more stable, even with a huge number of writes in
short time by taking into consideration last scheduled write time.
The current implementation of TrafficShapingHandler using user-defined
writability flags and default isWritable() and
fireChannelWritabilityChanged works as expected.
The statistics are more valuable (asked write vs real write).
The Global TrafficShapingHandler should now have less "global"
synchronization, hoping to the minimum, but still per Channel as needed.
The GlobalChannel TrafficShapingHandler allows to have only one handler for all channels while still offering per channel in addition to global traffic shaping.
And finally maintain backward compatibility.
Motivation:
We only support openssl for server side at the moment but it would be also useful for client side.
Modification:
* Upgrade to new netty-tcnative snapshot to support client side openssl support
* Add OpenSslClientContext which can be used to create SslEngine for client side usage
* Factor out common logic between OpenSslClientContext and OpenSslServerContent into new abstract base class called OpenSslContext
* Correctly detect handshake failures as soon as possible
* Guard against segfault caused by multiple calls to destroyPools(). This can happen if OpenSslContext throws an exception in the constructor and the finalize() method is called later during GC
Result:
openssl can be used for client and servers now.
Motivation:
TrafficShapingHandlerTest uses Logback API directly, which is
discouraged. Also, it overrides the global default log level, which
silences the DEBUG messages from other tests.
Modifications:
Remove the direct use of Logback API
Result:
The tests executed after TrafficShapingHandlerTest logs their DEBUG
messages correctly.
Motivation:
We need more information to understand why SocketSslEchoTest fails
sporadically in the CI machine.
Modifications:
- Refactor SocketSslEchoTest so that it is easier to retrieve the
information about renegotiation and the current progress
Result:
We will get more information when the test fails.
Motivation:
Tests sometimes time out because it took too long to compress the
generated heap dump.
Modifications:
- Move the compression logic to a new method 'compressHeapDumps()'
- Call TestUtils.compressHeapDumps() at the end of the tests, so that
the tests do not fail because of timeout
Result:
JUnit reports the real cause of the test failure instead of timeout
exception.
Motivation:
So far, we generated and deployed test JARs to Maven repositories. The
deployed JAR had the classifier 'test-jar'. The test JAR is consumed by
transport-native-epoll as a test dependency.
The problem is, when netty-transport-native-epoll pulls the test JAR as
a dependency, that Maven resolves its transitive dependencies at
'compile' and 'runtime' scope only, which is incorrect.
I was bitten by this problem recently while trying to add a new
dependency to netty-testsuite. Because I added a new dependency at the
'test' scope, the new dependency was not pulled transitively by
transport-native-epoll and caused an unexpected build failure.
- d6160208c3
- bf77bb4c3a
Modifications:
- Move all classes in netty-testsuite from src/test to src/main
- Update the 'compile' scope dependencies of netty-testsuite
- Override the test directory configuration properties of the surefire
plugin
- Do not generate the test JAR anymore
- Update the dependency of netty-transport-native-epoll
Result:
It is less error-prone to add a new dependency to netty-testsuite.
Motivation:
It takes too long to download the heap dump from the CI server.
Modifications:
Compress the heap dump as much as possible.
Result:
When heap dump is generated by certain test failure, the generated heap
dump file is about 3 times smaller than before, although the compression
time will increase the build time when the test fails.
Motivation:
So far, our TLS renegotiation test did not test changing cipher suite
during renegotiation explicitly.
Modifications:
- Switch the cipher suite during renegotiation
Result:
We are now sure the cipher suite change works.
Related:
e9685ea45a
Motivation:
SslHandler.unwrap() does not evaluate the handshake status of
SSLEngine.unwrap() when the status of SSLEngine.unwrap() is CLOSED.
It is not correct because the status does not reflect the state of the
handshake currently in progress, accoding to the API documentation of
SSLEngineResult.Status.
Also, sslCloseFuture can be notified earlier than handshake notification
because we call sslCloseFuture.trySuccess() before evaluating handshake
status.
Modifications:
- Notify sslCloseFuture after the unwrap loop is finished
- Add more assertions to SocketSslEchoTest
Result:
Potentially fix the regression caused by:
- e9685ea45a
Motivation:
We have a few sporadic test failures which are only easily reproduceable
in our CI machine. To get more information about the failure, we need
heap and full thread dump at the moment of failure.
Modifications:
- Add TestUtils.dump() method to dump heap and threads
- Modify SocketGatheringWriteTest and SocketSslEchoTest to call
TestUtils.dump() on failure
Result:
We get more information about the test failure.
Related: #3125
Motivation:
We did not expose a way to initiate TLS renegotiation and to get
notified when the renegotiation is done.
Modifications:
- Add SslHandler.renegotiate() so that a user can initiate TLS
renegotiation and get the future that's notified on completion
- Make SslHandler.handshakeFuture() return the future for the most
recent handshake so that a user can get the future of the last
renegotiation
- Add the test for renegotiation to SocketSslEchoTest
Result:
Both client-initiated and server-initiated renegotiations are now
supported properly.
Motivation:
The commit 50e06442c3 changed the type of
the constants in HttpHeaders.Names and HttpHeaders.Values, making 4.1
backward-incompatible with 4.0.
It also introduces newer utility classes such as HttpHeaderUtil, which
deprecates most static methods in HttpHeaders. To ease the migration
between 4.1 and 5.0, we should deprecate all static methods that are
non-existent in 5.0, and provide proper counterpart.
Modification:
- Revert the changes in HttpHeaders.Names and Values
- Deprecate all static methods in HttpHeaders in favor of:
- HttpHeaderUtil
- the member methods of HttpHeaders
- AsciiString
- Add integer and date access methods to HttpHeaders for easier future
migration to 5.0
- Add HttpHeaderNames and HttpHeaderValues which provide standard HTTP
constants in AsciiString
- Deprecate HttpHeaders.Names and Values
- Make HttpHeaderValues.WEBSOCKET lowercased because it's actually
lowercased in all WebSocket versions but the oldest one
- Add RtspHeaderNames and RtspHeaderValues which provide standard RTSP
constants in AsciiString
- Deprecate RtspHeaders.*
- Do not use AsciiString.equalsIgnoreCase(CharSeq, CharSeq) if one of
the parameters are AsciiString
- Avoid using AsciiString.toString() repetitively
- Change the parameter type of some methods from String to
CharSequence
Result:
Backward compatibility is recovered. New classes and methods will make
the migration to 5.0 easier, once (Http|Rtsp)Header(Names|Values) are
ported to master.
Related: #2964
Motivation:
Writing a zero-length FileRegion to an NIO channel will lead to an
infinite loop.
Modification:
- Do not write a zero-length FileRegion by protecting with proper 'if'.
- Update the testsuite
Result:
Another bug fixed
Motivation:
We see occational failures in the datagram tests saying 'address already
in use' when we attempt to bind on a port returned by
TestUtils.getFreePort().
It turns out that TestUtils.getFreePort() only checks if TCP port is
available.
Modifications:
Also check if UDP port is available, so that the datagram tests do not
fail because of the 'address already in use' error during a bind
attempt.
Result:
Less chance of datagram test failures
Motivation:
So far, we relied on the domain name resolution mechanism provided by
JDK. It served its purpose very well, but had the following
shortcomings:
- Domain name resolution is performed in a blocking manner.
This becomes a problem when a user has to connect to thousands of
different hosts. e.g. web crawlers
- It is impossible to employ an alternative cache/retry policy.
e.g. lower/upper bound in TTL, round-robin
- It is impossible to employ an alternative name resolution mechanism.
e.g. Zookeeper-based name resolver
Modification:
- Add the resolver API in the new module: netty-resolver
- Implement the DNS-based resolver: netty-resolver-dns
.. which uses netty-codec-dns
- Make ChannelFactory reusable because it's now used by
io.netty.bootstrap, io.netty.resolver.dns, and potentially by other
modules in the future
- Move ChannelFactory from io.netty.bootstrap to io.netty.channel
- Deprecate the old ChannelFactory
- Add ReflectiveChannelFactory
Result:
It is trivial to resolve a large number of domain names asynchronously.
Motivation:
Due incorrect usage of CompositeByteBuf a buffer leak was introduced.
Modifications:
Correctly handle tests with CompositeByteBuf.
Result:
No more buffer leaks
Motivation:
On linux with glibc >= 2.14 it is possible to send multiple DatagramPackets with one syscall. This can be a huge performance win and so we should support it in our native transport.
Modification:
- Add support for sendmmsg by reuse IovArray
- Factor out ThreadLocal support of IovArray to IovArrayThreadLocal for better separation as we use IovArray also without ThreadLocal in NativeDatagramPacketArray now
- Introduce NativeDatagramPacketArray which is used for sendmmsg(...)
- Implement sendmmsg(...) via jni
- Expand DatagramUnicastTest to test also sendmmsg(...)
Result:
Netty now automatically use sendmmsg(...) if it is supported and we have more then 1 DatagramPacket in the ChannelOutboundBuffer and flush() is called.
Motivation:
On linux it is possible to use the sendMsg(...) system call to write multiple buffers with one system call when using datagram/udp.
Modifications:
- Implement the needed changes and make use of sendMsg(...) if possible for max performance
- Add tests that test sending datagram packets with all kind of different ByteBuf implementations.
Result:
Performance improvement when using CompoisteByteBuf and EpollDatagramChannel.
Motivation:
The test procedure is unstable when testing quick time (factor less or equal to 1). Changing to default 10ms in this case will force time to be correct and time to be checked only when factor is >= 2.
Modifications:
When factor is <= 1, minimalWaitBetween is 10ms
Result:
Hoping this version is finally stable.
Motivation:
It seems that in certain conditions, the write back from the server is so quick that the handler has no time to compute traffic shaping. So 10ms of wait before acknowledging is added in server side.
Modifications:
Add 10ms waiting before server ackonwledge the client.
Result:
The timing is now suppsed to be stable.
Motivation:
The test procedure is unstable due to not enough precise timestamping
during the check.
Modifications:
Reducing the test cases and cibling "stable" test ("timestamp-able")
bring more stability to the tests.
Result:
Tests for TrafficShapingHandler seem more stable (whatever using JVM 6,
7 or 8).
Motivation:
Due a regression NioSocketChannel.doWrite(...) will throw a ClassCastException if you do something like:
channel.write(bytebuf);
channel.write(fileregion);
channel.flush();
Modifications:
Correctly handle writing of different message types by using the correct message count while loop over them.
Result:
No more ClassCastException
Related issue: #2764
Motivation:
EpollSocketChannel.writeFileRegion() does not handle the case where the
position of a FileRegion is non-zero properly.
Modifications:
- Improve SocketFileRegionTest so that it tests the cases where the file
transfer begins from the middle of the file
- Add another jlong parameter named 'base_off' so that we can take the
position of a FileRegion into account
Result:
Improved test passes. Corruption is gone.
Motivation:
Currently Traffic Shaping is using 1 timer only and could lead to
"partial" wrong bandwidth computation when "short" time occurs between
adding used bytes and when the TrafficCounter updates itself and finally
when the traffic is computed.
Indeed, the TrafficCounter is updated every x delay and it is at the
same time saved into "lastXxxxBytes" and set to 0. Therefore, when one
request the counter, it first updates the TrafficCounter with the added
used bytes. If this value is set just before the TrafficCounter is
updated, then the bandwidth computation will use the TrafficCounter with
a "0" value (this value being reset once the delay occurs). Therefore,
the traffic shaping computation is wrong in rare cases.
Secondly the traffic shapping should avoid if possible the "Timeout"
effect by not stopping reading or writing more than a maxTime, this
maxTime being less than the TimeOut limit.
Thirdly the traffic shapping in read had an issue since the readOp
was not set but should, turning in no read blocking from socket
point of view.
Modifications:
The TrafficCounter has 2 new methods that compute the time to wait
according to read or write) using in priority the currentXxxxBytes (as
before), but could used (if current is at 0) the lastXxxxxBytes, and
therefore having more chance to take into account the real traffic.
Moreover the Handler could change the default "max time to wait", which
is by default set to half of "standard" Time Out (30s:2 = 15s).
Finally we add the setAutoRead(boolean) accordingly to the situation,
as proposed in #2696 (this pull request is in error for unknown reason).
Result:
The Traffic Shaping is better take into account (no 0 value when it
shouldn't) and it tries to not block traffic more than Time Out event.
Moreover the read is really stopped from socket point of view.
This version is similar to #2388 and #2450.
This version is for V4.1, and includes the #2696 pull request
to ease the merge process.
It is compatible with master too.
Including also #2748
The test minimizes time check by reducing to 66ms steps (55s).
Motivation:
epoll transport fails on gathering write of more then 1024 buffers. As linux supports max. 1024 iov entries when calling writev(...) the epoll transport throws an exception.
Thanks again to @blucas to provide me with a reproducer and so helped me to understand what the issue is.
Modifications:
Make sure we break down the writes if to many buffers are uses for gathering writes.
Result:
Gathering writes work with any number of buffers
Motivation:
Persuit for the consistency in method naming
Modifications:
- Remove the 'get' prefix from all HTTP/SPDY message classes
- Fix some inspector warnings
Result:
Consistency
Motivation:
We have different message aggregator implementations for different
protocols, but they are very similar with each other. They all stems
from HttpObjectAggregator. If we provide an abstract class that provide
generic message aggregation functionality, we will remove their code
duplication.
Modifications:
- Add MessageAggregator which provides generic message aggregation
- Reimplement all existing aggregators using MessageAggregator
- Add DecoderResultProvider interface and extend it wherever possible so
that MessageAggregator respects the state of the decoded message
Result:
Less code duplication
Motivation:
Some users already use an SSLEngine implementation in finagle-native. It
wraps OpenSSL to get higher SSL performance. However, to take advantage
of it, finagle-native must be compiled manually, and it means we cannot
pull it in as a dependency and thus we cannot test our SslHandler
against the OpenSSL-based SSLEngine. For an instance, we had #2216.
Because the construction procedures of JDK SSLEngine and OpenSslEngine
are very different from each other, we also need to provide a universal
way to enable SSL in a Netty application.
Modifications:
- Pull netty-tcnative in as an optional dependency.
http://netty.io/wiki/forked-tomcat-native.html
- Backport NativeLibraryLoader from 4.0
- Move OpenSSL-based SSLEngine implementation into our code base.
- Copied from finagle-native; originally written by @jpinner et al.
- Overall cleanup by @trustin.
- Run all SslHandler tests with both default SSLEngine and OpenSslEngine
- Add a unified API for creating an SSL context
- SslContext allows you to create a new SSLEngine or a new SslHandler
with your PKCS#8 key and X.509 certificate chain.
- Add JdkSslContext and its subclasses
- Add OpenSslServerContext
- Add ApplicationProtocolSelector to ensure the future support for NPN
(NextProtoNego) and ALPN (Application Layer Protocol Negotiation) on
the client-side.
- Add SimpleTrustManagerFactory to help a user write a
TrustManagerFactory easily, which should be useful for those who need
to write an alternative verification mechanism. For example, we can
use it to implement an unsafe TrustManagerFactory that accepts
self-signed certificates for testing purposes.
- Add InsecureTrustManagerFactory and FingerprintTrustManager for quick
and dirty testing
- Add SelfSignedCertificate class which generates a self-signed X.509
certificate very easily.
- Update all our examples to use SslContext.newClient/ServerContext()
- SslHandler now logs the chosen cipher suite when handshake is
finished.
Result:
- Cleaner unified API for configuring an SSL client and an SSL server
regardless of its internal implementation.
- When native libraries are available, OpenSSL-based SSLEngine
implementation is selected automatically to take advantage of its
performance benefit.
- Examples take advantage of this modification and thus are cleaner.
Motivation:
When writing data from a server before the ssl handshake completes may not be written at all to the remote peer
if nothing else is written after the handshake was done.
Modification:
Correctly try to write pending data after the handshake was complete
Result:
Correctly write out all pending data
Motivation:
4 and 5 were diverged long time ago and we recently reverted some of the
early commits in master. We must make sure 4.1 and master are not very
different now.
Modification:
Fix found differences
Result:
4.1 and master got closer.
Motivation:
At the moment ChanneConfig.setAutoRead(false) only is guaranteer to not have an extra channelRead(...) triggered when used from within the channelRead(...) or channelReadComplete(...) method. This is not the correct behaviour as it should also work from other methods that are triggered from within the EventLoop. For example a valid use case is to have it called from within a ChannelFutureListener, which currently not work as expected.
Beside this there is another bug which is kind of related. Currently Channel.read() will not work as expected for OIO as we will stop try to read even if nothing could be read there after one read operation on the socket (when the SO_TIMEOUT kicks in).
Modifications:
Implement the logic the right way for the NIO/OIO/SCTP and native transport, specific to the transport implementation. Also correctly handle Channel.read() for OIO transport by trigger a new read if SO_TIMEOUT was catched.
Result:
It is now also possible to use ChannelConfig.setAutoRead(false) from other methods that are called from within the EventLoop and have direct effect.
Conflicts:
transport-sctp/src/main/java/io/netty/channel/sctp/nio/NioSctpChannel.java
transport/src/main/java/io/netty/channel/socket/nio/NioDatagramChannel.java
transport/src/main/java/io/netty/channel/socket/nio/NioSocketChannel.java
Motivation:
Currently, the SPDY frame encoding and decoding code is based upon
the ChannelHandler abstraction. This requires maintaining multiple
versions for 3.x and 4.x (and possibly 5.x moving forward).
Modifications:
The SPDY frame encoding and decoding code is separated from the
ChannelHandler and SpdyFrame abstractions. Also test coverage is
improved.
Result:
SpdyFrameCodec now implements the ChannelHandler abstraction and is
responsible for creating and handling SpdyFrame objects.
Motivation:
Testing the OIO transport takes longer time than other transports because it has to wait for SO_TIMEOUT if there is nothing to read. In production, it's not a good idea to decrease this value (1000ms) because it will result in so many SocketTimeoutExceptions internally, but doing so in the testsuite should be fine.
Modifications:
Reduce the default SO_TIMEOUT of OIO channels to 10 ms.
Result:
Our testsuite finishes sooner.
Motivation:
The epoll testsuite tests the epoll transport only against itself (i.e. epoll x epoll only). We should test the epoll transport also against the well-tested NIO transport, too.
Modifications:
- Make SocketTestPermutation extensible and reusable so that the epoll testsuite can take advantage of it.
- Rename EpollTestUtils to EpollSocketTestPermutation and make it extend SocketTestPermutation.
- Overall clean-up of SocketTestPermutation
- Use Arrays.asList() for simplicity
- Add combo() method to remove code duplication
Result:
The epoll transport is now also tested against the NIO transport. SocketTestPermutation got cleaner.
Motivation:
We are seeing EpollSocketSslEchoTest does not finish itself while its I/O thread is busy. Jenkins should have terminated them when the global build timeout reaches, but Jenkins seems to fail to do so. What's more interesting is that Jenkins will start another job before the EpollSocketSslEchoTest is terminated, and Linux starts to oom-kill them, impacting the uptime of the CI service.
Modifications:
- Set timeout for all test cases in SocketSslEchoTest so that all SSL tests terminate themselves when they take too long.
- Fix a bug where the epoll testsuite uses non-daemon threads which can potentially prevent JVM from quitting.
- (Cleanup) Separate boss group and worker group just like we do for NIO/OIO transport testsuite.
Result:
Potentially more stable CI machine.
This ChannelOption allows to tell the DatagramChannel implementation to be active as soon as they are registrated to their EventLoop. This can be used to make it possible to write to a not bound DatagramChannel.
The ChannelOption is marked as @deprecated as I'm looking for a better solution in master which breaks default behaviour with 4.0 branch.
- write() now accepts a ChannelPromise and returns ChannelFuture as most
users expected. It makes the user's life much easier because it is
now much easier to get notified when a specific message has been
written.
- flush() does not create a ChannelPromise nor returns ChannelFuture.
It is now similar to what read() looks like.
- Remove channelReadSuspended because it's actually same with messageReceivedLast
- Rename messageReceived to channelRead
- Rename messageReceivedLast to channelReadComplete
We renamed messageReceivedLast to channelReadComplete because it
reflects what it really is for. Also, we renamed messageReceived to
channelRead for consistency in method names.
I must admit MesageList was pain in the ass. Instead of forcing a
handler always loop over the list of messages, this commit splits
messageReceived(ctx, list) into two event handlers:
- messageReceived(ctx, msg)
- mmessageReceivedLast(ctx)
When Netty reads one or more messages, messageReceived(ctx, msg) event
is triggered for each message. Once the current read operation is
finished, messageReceivedLast() is triggered to tell the handler that
the last messageReceived() was the last message in the current batch.
Similarly, for outbound, write(ctx, list) has been split into two:
- write(ctx, msg)
- flush(ctx, promise)
Instead of writing a list of message with a promise, a user is now
supposed to call write(msg) multiple times and then call flush() to
actually flush the buffered messages.
Please note that write() doesn't have a promise with it. You must call
flush() to get notified on completion. (or you can use writeAndFlush())
Other changes:
- Because MessageList is completely hidden, codec framework uses
List<Object> instead of MessageList as an output parameter.
- SimpleChannelInboundHandler now has a constructor parameter to let a
user decide to enable automatic message release. (the default is to
enable), which makes ChannelInboundConsumingHandler of less value.
The AIO transport was added in the past as we hoped it would have better latency as the NIO transport. But in reality this was never the case.
So there is no reason to use the AIO transport at all. It just put more burden on us as we need to also support it and fix bugs.
Because of this we dedicided to remove it for now. It will stay in the master_with_aio_transport branch so we can pick it up later again if it is ever needed.
The API changes made so far turned out to increase the memory footprint
and consumption while our intention was actually decreasing them.
Memory consumption issue:
When there are many connections which does not exchange data frequently,
the old Netty 4 API spent a lot more memory than 3 because it always
allocates per-handler buffer for each connection unless otherwise
explicitly stated by a user. In a usual real world load, a client
doesn't always send requests without pausing, so the idea of having a
buffer whose life cycle if bound to the life cycle of a connection
didn't work as expected.
Memory footprint issue:
The old Netty 4 API decreased overall memory footprint by a great deal
in many cases. It was mainly because the old Netty 4 API did not
allocate a new buffer and event object for each read. Instead, it
created a new buffer for each handler in a pipeline. This works pretty
well as long as the number of handlers in a pipeline is only a few.
However, for a highly modular application with many handlers which
handles connections which lasts for relatively short period, it actually
makes the memory footprint issue much worse.
Changes:
All in all, this is about retaining all the good changes we made in 4 so
far such as better thread model and going back to the way how we dealt
with message events in 3.
To fix the memory consumption/footprint issue mentioned above, we made a
hard decision to break the backward compatibility again with the
following changes:
- Remove MessageBuf
- Merge Buf into ByteBuf
- Merge ChannelInboundByte/MessageHandler and ChannelStateHandler into ChannelInboundHandler
- Similar changes were made to the adapter classes
- Merge ChannelOutboundByte/MessageHandler and ChannelOperationHandler into ChannelOutboundHandler
- Similar changes were made to the adapter classes
- Introduce MessageList which is similar to `MessageEvent` in Netty 3
- Replace inboundBufferUpdated(ctx) with messageReceived(ctx, MessageList)
- Replace flush(ctx, promise) with write(ctx, MessageList, promise)
- Remove ByteToByteEncoder/Decoder/Codec
- Replaced by MessageToByteEncoder<ByteBuf>, ByteToMessageDecoder<ByteBuf>, and ByteMessageCodec<ByteBuf>
- Merge EmbeddedByteChannel and EmbeddedMessageChannel into EmbeddedChannel
- Add SimpleChannelInboundHandler which is sometimes more useful than
ChannelInboundHandlerAdapter
- Bring back Channel.isWritable() from Netty 3
- Add ChannelInboundHandler.channelWritabilityChanges() event
- Add RecvByteBufAllocator configuration property
- Similar to ReceiveBufferSizePredictor in Netty 3
- Some existing configuration properties such as
DatagramChannelConfig.receivePacketSize is gone now.
- Remove suspend/resumeIntermediaryDeallocation() in ByteBuf
This change would have been impossible without @normanmaurer's help. He
fixed, ported, and improved many parts of the changes.
- Fixes#1282 (not perfectly, but to the extent it's possible with the current API)
- Add AddressedEnvelope and DefaultAddressedEnvelope
- Make DatagramPacket extend DefaultAddressedEnvelope<ByteBuf, InetSocketAddress>
- Rename ByteBufHolder.data() to content() so that a message can implement both AddressedEnvelope and ByteBufHolder (DatagramPacket does) without introducing two getter methods for the content
- Datagram channel implementations now understand ByteBuf and ByteBufHolder as a message with unspecified remote address.
shutdownGracefully() provides two optional parameters that give more
control over when an executor has to be shut down.
- Related issue: #1307
- Add shutdownGracefully(..) and isShuttingDown()
- Deprecate shutdown() / shutdownNow()
- Replace lastAccessTime with lastExecutionTime and update it after task
execution for accurate quiet period check
- runAllTasks() and runShutdownTasks() update it automatically.
- Add updateLastExecutionTime() so that subclasses can update it
- Add a constructor parameter that tells not to add an unncessary wakeup
task in execute() if addTask() wakes up the executor thread
automatically. Previously, execute() always called wakeup() after
addTask(), which often caused an extra dummy task in the task queue.
- Use shutdownGracefully() wherever possible / Deprecation javadoc
- Reduce the running time of SingleThreadEventLoopTest from 40s to 15s
using custom graceful shutdown parameters
- Other changes made along with this commit:
- takeTask() does not throw InterruptedException anymore.
- Returns null on interruption or wakeup
- Make sure runShutdownTasks() return true even if an exception was
raised while running the shutdown tasks
- Remove unnecessary isShutdown() checks
- Consistent use of SingleThreadEventExecutor.nanoTime()
Replace isWakeupOverridden with a constructor parameter
- Fixes#1308
freeInboundBuffer() and freeOutboundBuffer() were introduced in the early days of the new API when we did not have reference counting mechanism in the buffer. A user did not want Netty to free the handler buffers had to override these methods.
However, now that we have reference counting mechanism built into the buffer, a user who wants to retain the buffers beyond handler's life cycle can simply return the buffer whose reference count is greater than 1 in newInbound/OutboundBuffer().
This change also introduce a few other changes which was needed:
* ChannelHandler.beforeAdd(...) and ChannelHandler.beforeRemove(...) were removed
* ChannelHandler.afterAdd(...) -> handlerAdded(...)
* ChannelHandler.afterRemoved(...) -> handlerRemoved(...)
* SslHandler.handshake() -> SslHandler.hanshakeFuture() as the handshake is triggered automatically after
the Channel becomes active
- Add ChannelHandlerUtil and move the core logic of ChannelInbound/OutboundMessageHandler to ChannelHandlerUtil
- Add ChannelHandlerUtil.SingleInbound/OutboundMessageHandler and make ChannelInbound/OutboundMessageHandlerAdapter implement them. This is a backward incompatible change because it forces all handler methods to be public (was protected previously)
- Fixes: #1119
This pull request introduces a new operation called read() that replaces the existing inbound traffic control method. EventLoop now performs socket reads only when the read() operation has been issued. Once the requested read() operation is actually performed, EventLoop triggers an inboundBufferSuspended event that tells the handlers that the requested read() operation has been performed and the inbound traffic has been suspended again. A handler can decide to continue reading or not.
Unlike other outbound operations, read() does not use ChannelFuture at all to avoid GC cost. If there's a good reason to create a new future per read at the GC cost, I'll change this.
This pull request consequently removes the readable property in ChannelHandlerContext, which means how the traffic control works changed significantly.
This pull request also adds a new configuration property ChannelOption.AUTO_READ whose default value is true. If true, Netty will call ctx.read() for you. If you need a close control over when read() is called, you can set it to false.
Another interesting fact is that non-terminal handlers do not really need to call read() at all. Only the last inbound handler will have to call it, and that's just enough. Actually, you don't even need to call it at the last handler in most cases because of the ChannelOption.AUTO_READ mentioned above.
There's no serious backward compatibility issue. If the compiler complains your handler does not implement the read() method, add the following:
public void read(ChannelHandlerContext ctx) throws Exception {
ctx.read();
}
Note that this pull request certainly makes bounded inbound buffer support very easy, but itself does not add the bounded inbound buffer support.
- Fixes#831
This commit ensures the following events are never triggered as a direct
invocation if they are triggered via ChannelPipeline.fire*():
- channelInactive
- channelUnregistered
- exceptionCaught
This commit also fixes the following issues surfaced by this fix:
- Embedded channel implementations run scheduled tasks too early
- SpdySessionHandlerTest tries to generate inbound data even after the
channel is closed.
- AioSocketChannel enters into an infinite loop on I/O error.
This pull request introduces the new default ByteBufAllocator implementation based on jemalloc, with a some differences:
* Minimum possible buffer capacity is 16 (jemalloc: 2)
* Uses binary heap with random branching (jemalloc: red-black tree)
* No thread-local cache yet (jemalloc has thread-local cache)
* Default page size is 8 KiB (jemalloc: 4 KiB)
* Default chunk size is 16 MiB (jemalloc: 2 MiB)
* Cannot allocate a buffer bigger than the chunk size (jemalloc: possible) because we don't have control over memory layout in Java. A user can work around this issue by creating a composite buffer, but it's not always a feasible option. Although 16 MiB is a pretty big default, a user's handler might need to deal with the bounded buffers when the user wants to deal with a large message.
Also, to ensure the new allocator performs good enough, I wrote a microbenchmark for it and made it a dedicated Maven module. It uses Google's Caliper framework to run and publish the test result (example)
Miscellaneous changes:
* Made some ByteBuf implementations public so that those who implements a new allocator can make use of them.
* Added ByteBufAllocator.compositeBuffer() and its variants.
* ByteBufAllocator.ioBuffer() creates a buffer with 0 capacity.
- Add ChannelOption.ALLOW_HALF_CLOSURE
- If true, ChannelInputShutdownEvent is fired via userEventTriggered()
when the remote peer shuts down its output, and the connection is
not closed until a user calls close() explicitly.
- If false, the connection is closed immediately as it did before.
- Add SocketChannel.isInputShutdown()
- Add & improve test cases related with half-closed sockets
- Reimplemented the test
- Fixed various bugs related with read/accept suspension found while testing
- defaultInterestOps of NioServerSocketChannel should be OP_ACCEPT
- There's no need do deregister and re-register to suspend/resume accept()
- Occational infinite loop with 100% CPU consumption in OioEventLoop, caused by OioSocketChannel
- Even if read/accept is suspended, what's read or accepted should be notified to a user
- Add EventExecutorGroup and EventLoopGroup
- EventExecutor and EventLoop extends EventExecutorGroup and
EventLoopGroup
- They form their own group so that .next() returns itself.
- Rename Bootstrap.eventLoop() to group()
- Rename parameter names such as executor to group
- Rename *EventLoop/Executor to *EventLoop/ExecutorGroup
- Rename *ChildEventLoop/Executor to *EventLoop/Executor
- Used reflection hack to dispatch the tasks submitted by JDK
efficiently. Without hack, there's higher chance of additional
context switches.
- Server side performance improved to the expected level.
- Client side performance issue still under investigation
- Add MessageBuf which replaces java.util.Queue
- Add ChannelBuf which is common type of ByteBuf and ChannelBuf
- ChannelBuffers was renamed to ByteBufs
- Add MessageBufs
- All these changes are going to replace ChannelBufferHolder.
- ChannelBuffer gives a perception that it's a buffer of a
channel, but channel's buffer is now a byte buffer or a message
buffer. Therefore letting it be as is is going to be confusing.
- Also prohibited a user from overriding
ChannelInbound(Byte|Message)HandlerAdapter. If a user wants to do
that, he or she should extend ChannelInboundHandlerAdapter instead.
- In computing, 'stream' means both byte stream and message stream,
which is confusing.
- Also, we were already mixing stream and byte in some places and
it's better use the terms consistently.
(e.g. inboundByteBuffer & inbound stream)
- SslHandler always begins handshake unless startTls is true
- Removed issueHandshake property
- If a user wants to start handshake later, he/she has to add
SslHandler later.
- Removed enableRenegotiation property
- JDK upgrade fixes the security vulnerability - no need to complicate
our code
- Some property name changes
- getSSLEngineInboundCloseFuture() -> sslCloseFuture()
- Updated securechat example
- Added timeout for handshake and close_notify for better security
- However, it's currently hard-coded. Will make it a property later.
- Added EventExecutor.inEventLoop(Thread) and replaced executor identity
comparison in DefaultChannelPipeline with it - more elegant IMO
- Removed the test classes that needs rewrite or is of no use
- Extracted some handler methods from ChannelInboundHandler into
ChannelStateHandler
- Extracted some handler methods from ChannelOutboundHandler into
ChannelOperationHandler
- Moved exceptionCaught and userEventTriggered are now in
ChannelHandler
- Channel(Inbound|Outbound)HandlerContext is merged into
ChannelHandlerContext
- ChannelHandlerContext adds direct access methods for inboud and
outbound buffers
- The use of ChannelBufferHolder is minimal now.
- Before: inbound().byteBuffer()
- After: inboundByteBuffer()
- Simpler and better performance
- Bypass buffer types were removed because it just does not work at all
with the thread model.
- All handlers that uses a bypass buffer are broken. Will fix soon.
- CombinedHandlerAdapter does not make sense anymore either because
there are four handler interfaces to consider and often the two
handlers will implement the same handler interface such as
ChannelStateHandler. Thinking of better ways to provide this feature