Motivation:
Currently when there are bytes left in the cumulation buffer we do a byte copy to produce the input buffer for the decode method. This can put quite some overhead on the impl.
Modification:
- Use a CompositeByteBuf to eliminate the byte copy.
- Allow to specify if a CompositeBytebug should be used or not as some handlers can only act on one ByteBuffer in an efficient way (like SslHandler :( ).
Result:
Performance improvement as shown in the following benchmark.
Without this patch:
[xxx@xxx ~]$ ./wrk-benchmark
Running 5m test @ http://xxx:8080/plaintext
16 threads and 256 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 20.19ms 38.34ms 1.02s 98.70%
Req/Sec 241.10k 26.50k 303.45k 93.46%
1153994119 requests in 5.00m, 155.84GB read
Requests/sec: 3846702.44
Transfer/sec: 531.93MB
With the patch:
[xxx@xxx ~]$ ./wrk-benchmark
Running 5m test @ http://xxx:8080/plaintext
16 threads and 256 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 17.34ms 27.14ms 877.62ms 98.26%
Req/Sec 252.55k 23.77k 329.50k 87.71%
1209772221 requests in 5.00m, 163.37GB read
Requests/sec: 4032584.22
Transfer/sec: 557.64MB
Motivation:
When a user sees an error message, sometimes he or she does not know
what exactly he or she has to do to fix the problem.
Modifications:
Log the URL of the wiki pages that might help the user troubleshoot.
Result:
We are more friendly.
Motivation:
When a user deliberatively omitted netty-tcnative from classpath, he or
she will see an ugly stack trace of ClassNotFoundException.
Modifications:
Log more briefly when netty-tcnative is not in classpath.
Result:
Better-looking log at DEBUG level
Motivation:
- There's no point of pre-population.
- Waste of memory and time because they are going to be cached lazily
- Some pre-populated cipher suites are ancient and will be unused
Modification:
- Remove cache pre-population
Result:
Sanity restored
Motivation:
Calling JNI methods is pretty expensive, so we should only do if needed.
Modifications:
Lazy call methods if needed.
Result:
Better performance.
Motivation:
SSL_set_cipher_list() in OpenSSL does not fail as long as at least one
cipher suite is available. It is different from the semantics of
SSLEngine.setEnabledCipherSuites(), which raises an exception when the
list contains an unavailable cipher suite.
Modifications:
- Add OpenSsl.isCipherSuiteAvailable(String) which checks the
availability of a cipher suite
- Raise an IllegalArgumentException when the specified cipher suite is
not available
Result:
Fixed compatibility
Motivation:
To make OpenSslEngine a full drop-in replacement, we need to implement
getSupportedCipherSuites() and get/setEnabledCipherSuites().
Modifications:
- Retrieve the list of the available cipher suites when initializing
OpenSsl.
- Improve CipherSuiteConverter to understand SRP
- Add more test data to CipherSuiteConverterTest
- Add bulk-conversion method to CipherSuiteConverter
Result:
OpenSslEngine should now be a drop-in replacement for JDK SSLEngineImpl
for most cases.
Related: #3285
Motivation:
When a user attempts to switch from JdkSslContext to OpenSslContext, he
or she will see the initialization failure if he or she specified custom
cipher suites.
Modifications:
- Provide a utility class that converts between Java cipher suite string
and OpenSSL cipher suite string
- Attempt to convert the cipher suite so that a user can use the cipher
suite string format of Java regardless of the chosen SslContext impl
Result:
- It is possible to convert all known cipher suite strings.
- It is possible to switch from JdkSslContext and OpenSslContext and
vice versa without any configuration changes
Motivation:
Several issues were shown by various ticket (#2900#2956).
Also use the improvement on writability user management from #3036.
And finally add a mixte handler, both for Global and Channels, with
the advantages of being uniquely created and using less memory and
less shaping.
Issue #2900
When a huge amount of data are written, the current behavior of the
TrafficShaping handler is to limit the delay to 15s, whatever the delay
the previous write has. This is wrong, and when a huge amount of writes
are done in a short time, the traffic is not correctly shapened.
Moreover, there is a high risk of OOM if one is not using in his/her own
handler for instance ChannelFuture.addListener() to handle the write
bufferisation in the TrafficShapingHandler.
This fix use the "user-defined writability flags" from #3036 to
allow the TrafficShapingHandlers to "user-defined" managed writability
directly, as for reading, thus using the default isWritable() and
channelWritabilityChanged().
This allows for instance HttpChunkedInput to be fully compatible.
The "bandwidth" compute on write is only on "acquired" write orders, not
on "real" write orders, which is wrong from statistic point of view.
Issue #2956
When using GlobalTrafficShaping, every write (and read) are
synchronized, thus leading to a drop of performance.
ChannelTrafficShaping is not touched by this issue since synchronized is
then correct (handler is per channel, so the synchronized).
Modifications:
The current write delay computation takes into account the previous
write delay and time to check is the 15s delay (maxTime) is really
exceeded or not (using last scheduled write time). The algorithm is
simplified and in the same time more accurate.
This proposal uses the #3036 improvement on user-defined writability
flags.
When the real write occurs, the statistics are update accordingly on a
new attribute (getRealWriteThroughput()).
To limit the synchronisations, all synchronized on
GlobalTrafficShapingHandler on submitWrite were removed. They are
replaced with a lock per channel (since synchronization is still needed
to prevent unordered write per channel), as in the sendAllValid method
for the very same reason.
Also all synchronized on TrafficCounter on read/writeTimeToWait() are
removed as they are unnecessary since already locked before by the
caller.
Still the creation and remove operations on lock per channel (PerChannel
object) are synchronized to prevent concurrency issue on this critical
part, but then limited.
Additionnal changes:
1) Use System.nanoTime() instead of System.currentTimeMillis() and
minimize calls
2) Remove / 10 ° 10 since no more sleep usage
3) Use nanoTime instead of currentTime such that time spend is computed,
not real time clock. Therefore the "now" relative time (nanoTime based)
is passed on all sub methods.
4) Take care of removal of the handler to force write all pending writes
and release read too
8) Review Javadoc to explicit:
- recommandations to take into account isWritable
- recommandations to provide reasonable message size according to
traffic shaping limit
- explicit "best effort" traffic shaping behavior when changing
configuration dynamically
Add a MixteGlobalChannelTrafficShapingHandler which allows to use only one
handler for mixing Global and Channel TSH. I enables to save more memory and
tries to optimize the traffic among various channels.
Result:
The traffic shaping is more stable, even with a huge number of writes in
short time by taking into consideration last scheduled write time.
The current implementation of TrafficShapingHandler using user-defined
writability flags and default isWritable() and
fireChannelWritabilityChanged works as expected.
The statistics are more valuable (asked write vs real write).
The Global TrafficShapingHandler should now have less "global"
synchronization, hoping to the minimum, but still per Channel as needed.
The GlobalChannel TrafficShapingHandler allows to have only one handler for all channels while still offering per channel in addition to global traffic shaping.
And finally maintain backward compatibility.
Motivation:
Openssl supports the SSL_CTX_set_session_id_context function to limit for which context a session can be used. We should support this.
Modifications:
Add OpenSslServerSessionContext that exposes a setSessionIdContext(...) method now.
Result:
It's now possible to use SSL_CTX_set_session_id_context.
Motivation:
It is sometimes useful to enable / disable the session cache.
Modifications:
* Add OpenSslSessionContext.setSessionCacheEnabled(...) and isSessionCacheEnabled()
Result:
It is now possible to enable / disable cache on the fly
Motivation:
To be compatible with SSLEngine we need to support enable / disable procols on the OpenSslEngine
Modifications:
Implement OpenSslEngine.getSupportedProtocols() , getEnabledProtocols() and setEnabledProtocols(...)
Result:
Better compability with SSLEngine
Motivation:
The current implementation not returns the real session as byte[] representation.
Modifications:
Create a proper Openssl.SSLSession.get() implementation which returns the real session as byte[].
Result:
More correct implementation
Motivation:
At the moment it is not possible to make use of the session cache when OpenSsl is used. This should be possible when server mode is used.
Modifications:
- Add OpenSslSessionContext (implements SSLSessionContext) which exposes all the methods to modify the session cache.
- Add various extra methods to OpenSslSessionContext for extra functionality
- Return OpenSslSessionContext when OpenSslEngine.getSession().getContext() is called.
- Add sessionContext() to SslContext
- Move OpenSsl specific session operations to OpenSslSessionContext and mark the old methods @deprecated
Result:
It's now possible to use session cache with OpenSsl
Motivation:
ProxyHandlerTest fails with NoClassDefFoundError raised by
SslContext.newClientContext().
Modifications:
Fix a missing 'return' statement that makes the switch-case block fall
through unncecessarily
Result:
- ProxyHandlerTest does not fail anymore.
- SslContext.newClientContext() does not raise NoClassDefFoundError
anymore.
Motivation:
At the moment we use SSL.getLastError() in unwrap(...) to check for error. This is very inefficient as it creates a new String for each check and we also use a String.startsWith(...) to detect if there was an error we need to handle.
Modifications:
Use SSL.getLastErrorNumber() to detect if we need to handle an error, as this only returns a long and so no String creation happens. Also the detection is much cheaper as we can now only compare longs. Once an error is detected the lately SSL.getErrorString(long) is used to conver the error number to a String and include it in log and exception message.
Result:
Performance improvements in OpenSslEngine.unwrap(...) due less object allocation and also faster comparations.
Motivation:
As we now support OpenSslEngine for client side, we should use it when avaible.
Modifications:
Use SslProvider.OPENSSL when openssl can be found
Result:
OpenSslEngine is used whenever possible
Motivation:
When using client auth it is sometimes needed to use a custom TrustManagerFactory.
Modifications:
Allow to pass in TrustManagerFactory
Result:
It's now possible to use custom TrustManagerFactories for JdkSslServerContext and OpenSslServerContext
Motivation:
To make OpenSsl*Context a drop in replacement for JdkSsl*Context we need to use TrustManager.
Modifications:
Correctly hook in the TrustManager
Result:
Better compatibility
Motivation:
At the moment there is no way to enable client authentication when using OpenSslEngine. This limits the uses of OpenSslEngine.
Modifications:
Add support for different authentication modes.
Result:
OpenSslEngine can now also be used when client authenticiation is needed.
Motivation:
The current SSLSession implementation used by OpenSslEngine does not support various operations and so may not be a good replacement by the SSLEngine provided by the JDK implementation.
Modifications:
- Add SSLSession.getCreationTime()
- Add SSLSession.getLastAccessedTime()
- Add SSLSession.putValue(...), getValue(...), removeValue(...), getValueNames()
- Add correct SSLSession.getProtocol()
- Ensure OpenSSLEngine.getSession() is thread-safe
- Use optimized AtomicIntegerFieldUpdater when possible
Result:
More complete OpenSslEngine SSLSession implementation
Motivation:
We only support openssl for server side at the moment but it would be also useful for client side.
Modification:
* Upgrade to new netty-tcnative snapshot to support client side openssl support
* Add OpenSslClientContext which can be used to create SslEngine for client side usage
* Factor out common logic between OpenSslClientContext and OpenSslServerContent into new abstract base class called OpenSslContext
* Correctly detect handshake failures as soon as possible
* Guard against segfault caused by multiple calls to destroyPools(). This can happen if OpenSslContext throws an exception in the constructor and the finalize() method is called later during GC
Result:
openssl can be used for client and servers now.
Motivation:
SslHandler.wrap(...) does a poor job when handling CompositeByteBuf as it always call ByteBuf.nioBuffer() which will do a memory copy when a CompositeByteBuf is used that is backed by multiple ByteBuf.
Modifications:
- Use SslEngine.wrap(ByteBuffer[]...) to allow wrap CompositeByteBuf in an efficient manner
- Reduce object allocation in unwrapNonAppData(...)
Result:
Performance improvement when a CompositeByteBuf is written and the SslHandler is in the ChannelPipeline.
Motivation:
When a remote peer did open a connection and only do the handshake without sending any data and then directly close the connection we did not call shutdown() in the OpenSslEngine. This leads to a native memory leak. Beside this it also was not fireed when a OpenSslEngine was created but never used.
Modifications:
- Make sure shutdown() is called in all cases when closeInbound() is called
- Call shutdown() also in the finalize() method to ensure we release native memory when the OpenSslEngine is GC'ed
Result:
No more memory leak when using OpenSslEngine
Related:
e9685ea45aebcb4f9dad0f3a1fc328a06b4932dd
Motivation:
SslHandler.unwrap() does not evaluate the handshake status of
SSLEngine.unwrap() when the status of SSLEngine.unwrap() is CLOSED.
It is not correct because the status does not reflect the state of the
handshake currently in progress, accoding to the API documentation of
SSLEngineResult.Status.
Also, sslCloseFuture can be notified earlier than handshake notification
because we call sslCloseFuture.trySuccess() before evaluating handshake
status.
Modifications:
- Notify sslCloseFuture after the unwrap loop is finished
- Add more assertions to SocketSslEchoTest
Result:
Potentially fix the regression caused by:
- e9685ea45aebcb4f9dad0f3a1fc328a06b4932dd
Related: #2958
Motivation:
SslHandler currently does not issue a read() request when it is
handshaking. It makes a connection with autoRead off stall, because a
user's read() request can be used to read the handshake response which
is invisible to the user.
Modifications:
- SslHandler now issues a read() request when:
- the current handshake is in progress and channelReadComplete() is
invoked
- the current handshake is complete and a user issued a read() request
during handshake
- Rename flushedBeforeHandshakeDone to flushedBeforeHandshake for
consistency with the new variable 'readDuringHandshake'
Result:
SslHandler should work regardless whether autoRead is on or off.
Related: #3125
Motivation:
We did not expose a way to initiate TLS renegotiation and to get
notified when the renegotiation is done.
Modifications:
- Add SslHandler.renegotiate() so that a user can initiate TLS
renegotiation and get the future that's notified on completion
- Make SslHandler.handshakeFuture() return the future for the most
recent handshake so that a user can get the future of the last
renegotiation
- Add the test for renegotiation to SocketSslEchoTest
Result:
Both client-initiated and server-initiated renegotiations are now
supported properly.
Related: #3219
Motivation:
ChunkedWriteHandler.flush() does not call ctx.flush() when channel is
not writable. This can be a problem when other handler / non-Netty
thread writes messages simultaneously, because
ChunkedWriteHandler.flush() might have no chance to observe
channel.isWritable() returns true and thus the channel is never flushed.
Modifications:
- Ensure that ChunkedWriteHandler.flush() calls ctx.flush() at least
once.
Result:
A stall connection issue, that occurs when certain combination of
handlers exist in a pipeline, has been fixed. (e.g. SslHandler and
ChunkedWriteHandler)
- Parameterize DomainNameMapping to make it useful for other use cases
than just mapping to SslContext
- Move DomainNameMapping to io.netty.util
- Clean-up the API documentation
- Make SniHandler.hostname and sslContext volatile because they can be
accessed by non-I/O threads
Motivation:
We use 3 (!) libraries to build mock objects - easymock, mockito, jmock.
Mockito and jMock pulls in the different versions of Hamcrest, and it
conflicts with the version pulled by jUnit.
Modifications:
- Replace mockito-all with mockito-core to avoid pulling in outdated
jUnit and Hamcrest
- Exclude junit-dep when pulling in jmock-junit4, because it pulls an
outdated Hamcrest version
- Pull in the hamcrest-library version used by jUnit explicitly
Result:
No more dependency hell that results in NoSuchMethodError during the
tests
Motivation:
When we need to host multiple server name with a single IP, it requires
the server to support Server Name Indication extension to serve clients
with proper certificate. So the SniHandler will host multiple
SslContext(s) and append SslHandler for requested hostname.
Modification:
* Added SniHandler to host multiple certifications in a single server
* Test case
Result:
User could use SniHandler to host multiple certifcates at a time.
It's server-side only.
Motivation:
JdkSslContext used SSL_RSA_WITH_DES_CBC_SHA in its cipher suite list.
OpenSslServerContext used DES-CBC3-SHA in the same place in its cipher suite
list, which is equivalent to SSL_RSA_WITH_3DES_EDE_CBC_SHA.
This means the lists were out of sync. Furthermore, using
SSL_RSA_WITH_DES_CBC_SHA is not desirable as it uses DES, a weak cipher. Triple
DES should be used instead.
Modifications:
Replace SSL_RSA_WITH_DES_CBC_SHA with SSL_RSA_WITH_3DES_EDE_CBC_SHA in
JdkSslContext.
Result:
The JdkSslContext and OpenSslServerContext cipher suite lists are now in sync.
Triple DES is used instead of DES, which is stronger.
Motivation:
RC4 is not a recommended cipher suite anymore, as the recent research
reveals, such as:
- http://www.isg.rhul.ac.uk/tls/
Modifications:
- Remove most RC4 cipher suites from the default cipher suites
- For backward compatibility, leave RC4-SHA, while de-prioritizing it
Result:
Potentially safer default
Motivation:
Found performance issues via FindBugs and PMD.
Modifications:
- Removed unnecessary boxing/unboxing operations in DefaultTextHeaders.convertToInt(CharSequence) and DefaultTextHeaders.convertToLong(CharSequence). A boxed primitive is created from a string, just to extract the unboxed primitive value.
- Added a static modifier for DefaultHttp2Connection.ParentChangedEvent class. This class is an inner class, but does not use its embedded reference to the object which created it. This reference makes the instances of the class larger, and may keep the reference to the creator object alive longer than necessary.
- Added a static compiled Pattern to avoid compile it each time it is used when we need to replace some part of authority.
- Improved using of StringBuilders.
Result:
Performance improvements.
Motivation:
When ALPN/NPN is disabled, a user has to instantiate a new
ApplicationProtocolConfig with meaningless parameters.
Modifications:
- Add ApplicationProtocolConfig.DISABLED, the singleton instance
- Reject the constructor calls with Protocol.NONE, which doesn't make
much sense because a user should use DISABLED instead.
Result:
More user-friendly API when ALPN/NPN is not needed by a user.
Motivation:
Previous backport removed the old methods and constructors. They should
not be removed in 4.x but just deprecated in favor of the new methods
and constructors.
Modifications:
Add back the removed methods and constructors in SslContext and its
subtypes for backward compatibility.
Result:
Backward compatibility issues fixed.
Motivation:
Improvements were made on the main line to support ALPN and mutual
authentication for TLS. These should be backported.
Modifications:
- Backport commits from the master branch
- f8af84d5993456426a63ad0146479147b1a4a5e5
- e74c8edba3fcbfd2e895ed6aac440efeb3aa637f
Result:
Support for ALPN and mutual authentication.
Motivation:
The SslHandler currently forces the use of a direct buffer for the input to the SSLEngine.wrap(..) operation. This allocation may not always be desired and should be conditionally done.
Modifications:
- Use the pre-existing wantsDirectBuffer variable as the condition to do the conversion.
Result:
- An allocation of a direct byte buffer and a copy of data is now not required for every SslHandler wrap operation.
Motivation:
The SslHandler wrap method requires that a direct buffer be passed to the SSLEngine.wrap() call. If the ByteBuf parameter does not have an underlying direct buffer then one is allocated in this method, but it is not released.
Modifications:
- Release the direct ByteBuffer only accessible in the scope of SslHandler.wrap
Result:
Memory leak in SslHandler.wrap is fixed.
Motivation:
Currently the last read/write throughput is calculated by first division,this will be 0 if the last read/write bytes < interval,change the order will get the correct result
Modifications:
Change the operator order from first do division to multiplication
Result:
Get the correct result instead of 0 when bytes are smaller than interval
Motivation:
handlerAdded and handlerRemoved were overriden but super was never
called, while it should.
Also add one missing information in the toString method.
Modifications:
Add the super corresponding call, and add checkInterval to the
toString() method
Result;
super method calls are correctly passed to the super implementation
part.
Motivation:
When constructing a FingerprintTrustManagerFactory from an Iterable of Strings, the fingerprints were correctly parsed but never added to the result array. The constructed FingerprintTrustManagerFactory consequently fails to validate any certificate.
Modifications:
I added a line to add each converted SHA-1 certificate fingerprint to the result array which then gets passed on to the next constructor.
Result:
Certificate fingerprints passed to the constructor are now correctly added to the array of valid fingerprints. The resulting FingerprintTrustManagerFactory object correctly validates certificates against the list of specified fingerprints.
Motivation:
In GitHub issue #2767 a bug was reported that the IPv4
default route leads to the ipfilter package denying
instead of accepting all addresses.
While the issue was reported for Netty 3.9, this bug
also applies to Netty 4 and higher.
Modifications:
When computing the subnet address from the CIDR prefix,
correctly handle the case where the prefix is set to zero.
Result:
Ipfilter accepts all addresses when passed the
IPv4 default route.