Motivation:
PooledByteBufAllocator's thread local cache and
ReferenceCountUtil.releaseLater() are in need of a way to run an
arbitrary logic when a certain thread is terminated.
Modifications:
- Add ThreadDeathWatcher, which spawns a low-priority daemon thread
that watches a list of threads periodically (every second) and
invokes the specified tasks when the associated threads are not alive
anymore
- Start-stop logic based on CAS operation proposed by @tea-dragon
- Add debug-level log messages to see if ThreadDeathWatcher works
Result:
- Fixes#2519 because we don't use GlobalEventExecutor anymore
- Cleaner code
Motivation:
At the moment MessageToMessageEncoder uses ctx.write(msg) when have more then one message was produced. This may produce more GC pressure then necessary as when the original ChannelPromise is a VoidChannelPromise we can safely also use one when write messages.
Modifications:
Use VoidChannelPromise when the original ChannelPromise was of this type
Result:
Less object creation and GC pressure
Motivation:
The current DefaultAttributeMap cause an infinite-loop when the user removes an attribute and create the same attribute again. This regression was introduced by c3bd7a8ff1.
Modification:
Correctly break out loop
Result:
No infinite-loop anymore.
Motivation:
If we make allocateRun/SubpageSimple() always try the left node first and make allocateRun/Subpage() always tries the right node first, it is more likely that allocateRun/Subpage() will find a node with ST_UNUSED sooner.
Modifications:
- Make allocateRunSimple() and allocateSubpageSimple() always try the left node first.
- Make allocateRun() and allocateSubpage() always try the right node first.
- Remove randome
Result:
We get the same performance without using random numbers.
Motivation:
We still have a room for improvement in PoolChunk.allocateRun() and
Subpage.allocate().
Modifications:
- Unroll the recursion in PoolChunk.allocateRun()
- Subpage.allocate() makes use of the 'nextAvail' value set by previous
free().
Result:
- PoolChunk.allocateRun() optimization yields 10%+ improvements in
allocation throughput for non-subpage allocations.
- Subpage.allocate() optimization makes the subpage allocations for
tiny buffers as fast as non-tiny buffers even when the pageSize is
huge (e.g. 1048576) because it doesn't need to perform a linear search
in most cases.
Motivation:
Allocating a single buffer and releasing it repetitively for a benchmark will not involve the realistic execution path of the allocators.
Modifications:
Keep the last 8192 allocations and release them randomly.
Result:
We are now getting the result close to what we got with caliper.
Motivation:
On some ill-configured systems, InetAddress.getLocalHost() fails. NioSocketChannelTest calls java.net.Socket.connect() and it internally invoked InetAddress.getLocalHost(), which causes the test failures in NioSocketChannelTes on such an ill-configured system.
Modifications:
Use NetUtil.LOCALHOST explicitly.
Result:
NioSocketChannelTest should not fail anymore.
Motivation:
maven-antrun-plugin does not redirect stdin, and thus it's impossible to
run interactive examples such as securechat-client and telnet-client.
org.codehaus.mojo:exec-maven-plugin redirects stdin, but it buffers
stdout and stderr, and thus an application output is not flushed timely.
Modifications:
Deploy a forked version of exec-maven-plugin which flushes output
buffers in a timely manner.
Result:
Interactive examples work. Launches faster than maven-antrun-plugin.
Motivation:
The examples have not been updated since long time ago, showing various
issues fixed in this commit.
Modifications:
- Overall simplification to reduce LoC
- Use system properties to get options instead of parsing args.
- Minimize option validation
- Just use System.out/err instead of Logger
- Do not pass config as parameters - just access it directly
- Move the main logic to main(String[]) instead of creating a new
instance meaninglessly
- Update netty-build-21 to make checkstyle not complain
- Remove 'throws Exception' clause if possible
- Line wrap at 120 (previously at 80)
- Add an option to enable SSL for most examples
- Use ChannelFuture.sync() instead of await()
- Use System.out for the actual result. Use System.err otherwise.
- Delete examples that are not very useful:
- applet
- websocket/html5
- websocketx/sslserver
- localecho/multithreaded
- Add run-example.sh which simplifies launching an example from command
line
- Rewrite FileServer example
Result:
Shorter and simpler examples. A user can focus more on what it actually
does than miscellaneous stuff. A user can launch an example very
easily.
Motivation:
When (listeners == null && lateListeners == null) and (stackDepth >= MAX_LISTENER_STACK_DEPTH), the listener is not notified at all. The discard client does not work.
Modification:
Make sure to submit the notification task.
Result:
The discard client works again and all listeners are notified.
Motivation:
- dependencyVersionsDir property is not resolved during the build
process. The build doesn't fail because of this, but it creates an
ugly directory.
- All-in-one JAR contains libnetty-tcnative.so, which is not part of the
all-in-one JAR.
Modifications:
- Fix an incorrect property name
(dependencyVersionDir -> dependencyVersionsDir)
- Exclude libnetty-tcnative.so
- Remove unnecessary includes in source expanding configuration
Result:
- Cleaner pom.xml
- We do not ship libnetty-tcnative.so in all-in-one JAR anymore, which
is correct, because strictly speaking the native library belongs to
org.apache.tomcat.jni package.
Motivation:
exec-maven-plugin does not flush stdout and stderr, making the console
output from the examples invisible to users
Modification:
Use maven-antrun-plugin instead
Result:
A user sees the output from the examples immediately.
Motivation:
According to TLS ALPN draft-05, a client sends the list of the supported
protocols and a server responds with the selected protocol, which is
different from NPN. Therefore, ApplicationProtocolSelector won't work
with ALPN
Modifications:
- Use Iterable<String> to list the supported protocols on the client
side, rather than using ApplicationProtocolSelector
- Remove ApplicationProtocolSelector
Result:
Future compatibility with TLS ALPN
Motivation:
- OpenSslEngine and JDK SSLEngine (+ Jetty NPN) have different APIs to
support NextProtoNego extension.
- It is impossible to configure NPN with SslContext when the provider
type is JDK.
Modification:
- Implement NextProtoNego extension by overriding the behavior of
SSLSession.getProtocol() for both OpenSSLEngine and JDK SSLEngine.
- SSLEngine.getProtocol() returns a string delimited by a colon (':')
where the first component is the transport protosol (e.g. TLSv1.2)
and the second component is the name of the application protocol
- Remove the direct reference of Jetty NPN classes from the examples
- Add SslContext.newApplicationProtocolSelector
Result:
- A user can now use both JDK SSLEngine and OpenSslEngine for NPN-based
protocols such as HTTP2 and SPDY
Motivation:
Mac OS X ships Bash 3, and it does not have an associative array
(declare -A).
Modifications:
Do not use an associative array.
Result:
Can run examples on Mac OS X using run-example.sh
Motivation:
- There's no way to pass an argument to an example.
- Assigning a Maven profile for each example is an overkill.
It makes the pom.xml crowded.
Modifications:
- Remove example profiles from example/pom.xml
- Keep the list of examples in run-example.sh
- run-example.sh passes all options to exec-maven-plugin.
For example, we can now do this:
./run-example.sh -Dssl -Dport=443 http-server
Result:
- It's much easier to add a new example and provide an easy way to
launch it.
- We can still pass an arbitrary argument to the example being launched.
(I'll update all examples to make them get their options from system
properties rather than from args[].
Motivation:
Build fails with JDK 8 because npn-boot does not work with JDK 8
Modifications:
Do not specify bootclasspath when on JDK 8
Result:
Build is green again.
Motivation:
Due to a known problem[1] of maven-compiler-plugin, our build always
compiles everything from scratch, which is waste of time.
Modifications:
Exclude package-info.java from the source list.
Result:
Much shorter build time.
[1]: https://jira.codehaus.org/browse/MCOMPILER-205
Motivation:
- example/pom.xml has quite a bit of duplication.
- We expect that we depend on npn-boot in more than one module in the
near future. (e.g. handler, codec-http, and codec-http2)
Modification:
- Deduplicate the profiles in example/pom.xml
- Move the build configuration related with npn-boot to the parent pom.
- Add run-example.sh that helps a user launch an example easily
Result:
- Cleaner build files
- Easier to add a new example
- Easier to launch an example
- Easier to run the tests that relies on npn-boot in the future
Motivation:
During a large memory copy, safepoint polling is diabled, hindering
accurate profiling.
Modifications:
Only copy up to 1 MiB per Unsafe.copyMemory()
Result:
Potentially more reliable performance
Motivation:
For an unknown reason, JVM of JDK8 crashes intermittently when
SslHandler feeds a direct buffer to SSLEngine.unwrap() *and* the current
cipher suite has GCM (Galois/Counter Mode) enabled.
Modifications:
Convert the inbound network buffer to a heap buffer when the current
cipher suite is using GCM.
Result:
JVM does not crash anymore.
Motivation:
JDK's SSLEngine.wrap() requires the output buffer to be always as large as MAX_ENCRYPTED_PACKET_LENGTH even if the input buffer contains small number of bytes. Our OpenSslEngine implementation does not have such wasteful behaviot.
Modifications:
If the current SSLEngine is OpenSslEngine, allocate as much as only needed.
Result:
Less peak memory usage.
Motivation:
It's useful to have netty-tcnative dependency in netty-example because
we can play with OpenSslEngine from our IDE.
Modifications:
Add netty-tcnative to example/pom.xml
Motivation:
Previous fix for the OpenSslEngine compatibility issue (#2216 and
18b0e95659) was to feed SSL records one by
one to OpenSslEngine.unwrap(). It is not optimal because it will result
in more JNI calls.
Modifications:
- Do not feed SSL records one by one.
- Feed as many records as possible up to MAX_ENCRYPTED_PACKET_LENGTH
- Deduplicate MAX_ENCRYPTED_PACKET_LENGTH definitions
Result:
- No allocation of intemediary arrays
- Reduced number of calls to SSLEngine and thus its underlying JNI calls
- A tad bit increase in throughput, probably reverting the tiny drop
caused by 18b0e95659
Motivation:
Some users already use an SSLEngine implementation in finagle-native. It
wraps OpenSSL to get higher SSL performance. However, to take advantage
of it, finagle-native must be compiled manually, and it means we cannot
pull it in as a dependency and thus we cannot test our SslHandler
against the OpenSSL-based SSLEngine. For an instance, we had #2216.
Because the construction procedures of JDK SSLEngine and OpenSslEngine
are very different from each other, we also need to provide a universal
way to enable SSL in a Netty application.
Modifications:
- Pull netty-tcnative in as an optional dependency.
http://netty.io/wiki/forked-tomcat-native.html
- Backport NativeLibraryLoader from 4.0
- Move OpenSSL-based SSLEngine implementation into our code base.
- Copied from finagle-native; originally written by @jpinner et al.
- Overall cleanup by @trustin.
- Run all SslHandler tests with both default SSLEngine and OpenSslEngine
- Add a unified API for creating an SSL context
- SslContext allows you to create a new SSLEngine or a new SslHandler
with your PKCS#8 key and X.509 certificate chain.
- Add JdkSslContext and its subclasses
- Add OpenSslServerContext
- Add ApplicationProtocolSelector to ensure the future support for NPN
(NextProtoNego) and ALPN (Application Layer Protocol Negotiation) on
the client-side.
- Add SimpleTrustManagerFactory to help a user write a
TrustManagerFactory easily, which should be useful for those who need
to write an alternative verification mechanism. For example, we can
use it to implement an unsafe TrustManagerFactory that accepts
self-signed certificates for testing purposes.
- Add InsecureTrustManagerFactory and FingerprintTrustManager for quick
and dirty testing
- Add SelfSignedCertificate class which generates a self-signed X.509
certificate very easily.
- Update all our examples to use SslContext.newClient/ServerContext()
- SslHandler now logs the chosen cipher suite when handshake is
finished.
Result:
- Cleaner unified API for configuring an SSL client and an SSL server
regardless of its internal implementation.
- When native libraries are available, OpenSSL-based SSLEngine
implementation is selected automatically to take advantage of its
performance benefit.
- Examples take advantage of this modification and thus are cleaner.
Motivation:
The old DefaultAttributeMap impl did more synchronization then needed.
Modifications:
* Rewrite DefaultAttributeMap to not use IdentityHashMap and synchronization on the map directly. The new impl uses a combination of AtomicReferenceArray and synchronization per chain (linked-list). Also access the first Attribute per bucket can be done without any synchronization at all and just uses atomic operations. This should fit for most use-cases pretty weel.
Result:
Synchronization is per linked-list and the first entry can even be added via atomic operation.
Motivation:
It should be frictionless to import our project into Eclipse
Modifications:
Exclude the plugins with missing life cycle mapping. They are not useful
for use with IDE anyway.
Result:
Fixes#2488
Netty is imported into Eclipse without a problem.
Motivation:
At the moment there are two issues with HashedWheelTimer:
* the memory footprint of it is pretty heavy (250kb fon an empty instance)
* the way how added Timeouts are handled is inefficient in terms of how locks etc are used and so a lot of context-switching / condition can happen.
Modification:
Rewrite HashedWheelTimer to use an optimized bucket implementation to store the submitted Timeouts and a MPSC queue to handover the timeouts. So volatile writes are reduced to a minimum and also the memory foot-print of the buckets itself is reduced a lot as the bucket uses a double-linked-list. Beside this we use Atomic*FieldUpdater where-ever possible to improve the memory foot-print and performance.
Result:
Lower memory-footprint and better performance
Motivation:
Some JDK versions of Mac OS X generates a JNI dynamic library with '.jnilib' extension rather than with '.dynlib' extension. However, System.mapLibraryName() always returns 'lib<name>.dynlib'. As a result, NativeLibraryLoader fails to load the native library whose extension is .jnilib.
Modification:
Try to find both '.jnilib' and '.dynlib' resources on OS X.
Result:
Dynamic libraries are loaded correctly in Mac OS X, and thus we can continue the OpenSslEngine work.
Motivation:
At the moment we call ByteBuf.readBytes(...) in these handlers but with optimizations done as part of 25e0d9d we can just use readSlice(...).retain() and eliminate the memory copy.
Modifications:
Replace ByteBuf.readBytes(...) usage with readSlice(...).retain().
Result:
Less memory copies.
Motivation:
At the moment we sometimes use only RecvByteBufAllocator.guess() to guess the next size and the use the ByteBufAllocator.* directly to allocate the buffer. We should always use RecvByteBufAllocator.allocate(...) all the time as this makes the behavior easier to adjust.
Modifications:
Change the read() implementations to make use of RecvByteBufAllocator.
Result:
Behavior is more consistent.