Motivation:
During a large memory copy, safepoint polling is diabled, hindering
accurate profiling.
Modifications:
Only copy up to 1 MiB per Unsafe.copyMemory()
Result:
Potentially more reliable performance
Motivation:
For an unknown reason, JVM of JDK8 crashes intermittently when
SslHandler feeds a direct buffer to SSLEngine.unwrap() *and* the current
cipher suite has GCM (Galois/Counter Mode) enabled.
Modifications:
Convert the inbound network buffer to a heap buffer when the current
cipher suite is using GCM.
Result:
JVM does not crash anymore.
Motivation:
JDK's SSLEngine.wrap() requires the output buffer to be always as large as MAX_ENCRYPTED_PACKET_LENGTH even if the input buffer contains small number of bytes. Our OpenSslEngine implementation does not have such wasteful behaviot.
Modifications:
If the current SSLEngine is OpenSslEngine, allocate as much as only needed.
Result:
Less peak memory usage.
Motivation:
It's useful to have netty-tcnative dependency in netty-example because
we can play with OpenSslEngine from our IDE.
Modifications:
Add netty-tcnative to example/pom.xml
Motivation:
Previous fix for the OpenSslEngine compatibility issue (#2216 and
18b0e95659) was to feed SSL records one by
one to OpenSslEngine.unwrap(). It is not optimal because it will result
in more JNI calls.
Modifications:
- Do not feed SSL records one by one.
- Feed as many records as possible up to MAX_ENCRYPTED_PACKET_LENGTH
- Deduplicate MAX_ENCRYPTED_PACKET_LENGTH definitions
Result:
- No allocation of intemediary arrays
- Reduced number of calls to SSLEngine and thus its underlying JNI calls
- A tad bit increase in throughput, probably reverting the tiny drop
caused by 18b0e95659
Motivation:
Some users already use an SSLEngine implementation in finagle-native. It
wraps OpenSSL to get higher SSL performance. However, to take advantage
of it, finagle-native must be compiled manually, and it means we cannot
pull it in as a dependency and thus we cannot test our SslHandler
against the OpenSSL-based SSLEngine. For an instance, we had #2216.
Because the construction procedures of JDK SSLEngine and OpenSslEngine
are very different from each other, we also need to provide a universal
way to enable SSL in a Netty application.
Modifications:
- Pull netty-tcnative in as an optional dependency.
http://netty.io/wiki/forked-tomcat-native.html
- Backport NativeLibraryLoader from 4.0
- Move OpenSSL-based SSLEngine implementation into our code base.
- Copied from finagle-native; originally written by @jpinner et al.
- Overall cleanup by @trustin.
- Run all SslHandler tests with both default SSLEngine and OpenSslEngine
- Add a unified API for creating an SSL context
- SslContext allows you to create a new SSLEngine or a new SslHandler
with your PKCS#8 key and X.509 certificate chain.
- Add JdkSslContext and its subclasses
- Add OpenSslServerContext
- Add ApplicationProtocolSelector to ensure the future support for NPN
(NextProtoNego) and ALPN (Application Layer Protocol Negotiation) on
the client-side.
- Add SimpleTrustManagerFactory to help a user write a
TrustManagerFactory easily, which should be useful for those who need
to write an alternative verification mechanism. For example, we can
use it to implement an unsafe TrustManagerFactory that accepts
self-signed certificates for testing purposes.
- Add InsecureTrustManagerFactory and FingerprintTrustManager for quick
and dirty testing
- Add SelfSignedCertificate class which generates a self-signed X.509
certificate very easily.
- Update all our examples to use SslContext.newClient/ServerContext()
- SslHandler now logs the chosen cipher suite when handshake is
finished.
Result:
- Cleaner unified API for configuring an SSL client and an SSL server
regardless of its internal implementation.
- When native libraries are available, OpenSSL-based SSLEngine
implementation is selected automatically to take advantage of its
performance benefit.
- Examples take advantage of this modification and thus are cleaner.
Motivation:
The old DefaultAttributeMap impl did more synchronization then needed and also did not expose a efficient way to check if an attribute exists with a specific key.
Modifications:
* Rewrite DefaultAttributeMap to not use IdentityHashMap and synchronization on the map directly. The new impl uses a combination of AtomicReferenceArray and synchronization per chain (linked-list). Also access the first Attribute per bucket can be done without any synchronization at all and just uses atomic operations. This should fit for most use-cases pretty weel.
* Add hasAttr(...) implementation
Result:
It's now possible to check for the existence of a attribute without create one. Synchronization is per linked-list and the first entry can even be added via atomic operation.
Motivation:
It should be frictionless to import our project into Eclipse
Modifications:
Exclude the plugins with missing life cycle mapping. They are not useful
for use with IDE anyway.
Result:
Fixes#2488
Netty is imported into Eclipse without a problem.
Motivation:
At the moment we have some methods that return a ChannelFuture but still throw a Http2Exception too. This is confusing in terms of semantic. A method which returns a ChannelFuture should not throw an Http2Exception but just fail the ChannelFuture.
Modifications:
* Make sure we fail the returned ChannelFuture in cases of Http2Exception and remove the throws Http2Exception from the method signature.
* Also some cleanup
Result:
Make the API usage more clear.
Motivation:
At the moment there are two issues with HashedWheelTimer:
* the memory footprint of it is pretty heavy (250kb fon an empty instance)
* the way how added Timeouts are handled is inefficient in terms of how locks etc are used and so a lot of context-switching / condition can happen.
Modification:
Rewrite HashedWheelTimer to use an optimized bucket implementation to store the submitted Timeouts and a MPSC queue to handover the timeouts. So volatile writes are reduced to a minimum and also the memory foot-print of the buckets itself is reduced a lot as the bucket uses a double-linked-list. Beside this we use Atomic*FieldUpdater where-ever possible to improve the memory foot-print and performance.
Result:
Lower memory-footprint and better performance
Motivation:
Some JDK versions of Mac OS X generates a JNI dynamic library with '.jnilib' extension rather than with '.dynlib' extension. However, System.mapLibraryName() always returns 'lib<name>.dynlib'. As a result, NativeLibraryLoader fails to load the native library whose extension is .jnilib.
Modification:
Try to find both '.jnilib' and '.dynlib' resources on OS X.
Result:
Dynamic libraries are loaded correctly in Mac OS X, and thus we can continue the OpenSslEngine work.
Motivation:
The HTTP/2 connection preface logic is currently handled in two places.
Reading/writing the client preface string is handled by
Http2PrefaceHandler while the reading/writing of the initial settings
frame is handled by AbstractHttp2ConnectionHandler. Given that their
isn't much code in Http2PrefaceHandler, it makes sense to just merge it
into the preface handling logic of AbstractHttp2ConnectionHandler. This
will also make configuring the pipeline simpler for HTTP/2.
Modifications:
Removed Http2PrefaceHandler and added it's logic to
AbstractHttp2ConnectionHandler. Updated other classes depending on
Http2PrefaceHandler.
Result:
All of the HTTP/2 connection preface processing logic is now in one
place.
Motivation:
At the moment we call ByteBuf.readBytes(...) in these handlers but with optimizations done as part of 25e0d9d we can just use readSlice(...).retain() and eliminate the memory copy.
Modifications:
Replace ByteBuf.readBytes(...) usage with readSlice(...).retain().
Result:
Less memory copies.
Motivation:
At the moment we sometimes use only RecvByteBufAllocator.guess() to guess the next size and the use the ByteBufAllocator.* directly to allocate the buffer. We should always use RecvByteBufAllocator.allocate(...) all the time as this makes the behavior easier to adjust.
Modifications:
Change the read() implementations to make use of RecvByteBufAllocator.
Result:
Behavior is more consistent.
Motivation:
Previously local settings is applied on its transmission. But this
makes settings value out-of-sync with remote endpoint. The
application of local settings must be done after the remote endpoint
sends back its SETTINGS ACK. This synchronization is vital to the
protocol. Suppose that server sends SETTINGS_HEADER_TABLE_SIZE 0 on
connection establishment and resizes its header table size down to 0.
Meanwhile, client sends HEADERS which has header block encoded in
default 4096 header table size, because client has not seen the
SETTINGS from server. Server tries to decode received HEADERS but due
to the difference of header table size, decoding process may fail.
Modifications:
We don't apply settings on transmission. Instead, we keep track of
outstanding settings in FIFO queue. When we get ACK, we pop oldest
outstanding settings and apply its values.
Result:
The outstanding settings values are applied when its ACK is received,
which is compliant behavior to the HTTP/2 specification.
Motivation:
A few items were identified where the http2 codec is out of compliance
with the spec.
Modifications:
- Fixed handling of priority weight on the wire. Now adding 1 after
reading from the wire and subtracing 1 before writing.
- Fixed handling of next stream ID. Client streamIds were starting at 3,
but they need to start at 1 to allow the upgrade from HTTP/1.1. Also
making next stream ID logic more flexible. Allowing the next created
stream to be any number in the sequence following the previously created
stream.
- Disallowing SETTINGS frames with ENABLE_PUSH specified for server
endpoints. This means that attempts to write this frame from a server,
or read it from a client will fail.
Result:
The http2 implementation will be more inline with the spec.
Motivation:
The Http2FrameObserver isn't provided the ChannelHandlerContext when
it's called back. This will force observers to find an alternative means
of obtaining the context if they need to do things like copying buffers.
Modifications:
Changed the Http2FrameReader and Http2FrameObserver to include the
context. Updated all other uses of these interfaces.
Result:
Frame observers will now have the channel context.
Motivation:
The outbound flow controller currently has to walk the entire tree each
time to calculate the total available data for each subtree. For better
performance we should maintain a running total for each subtree as we
queue/write frames.
Modifications:
I've modified the DefaultHttp2OutboundFlowController to manage the state
of "priorityData" at each node in the priority tree, which is
essentially the total writable data for all streams in that subtree.
These totals do not take into account the connection window, as that is
applied when splitting the data across the streams at each level in the
tree.
The flow controller now sorts the children of a node by the product of
their data (for the subtree) and its weight. This is used since in
certain cases the algorithm might prefer nodes that appear later in the
list. Sorting helps keep nodes with similar characteristics (e.g. a lot
of data and a high priority) with similar output.
To help clean things up, I'm storing a FlowState for the root node as
well, which maintains the runnning total of the currently writable data
for all stream.
Another item of cleanup is that I created a GarbageCollector innerclass
within the outbound flow controller. This keeps all of the garbage
collection code in one place, away from the flow control code.
Also added some more unit tests for the flow controller.
Result:
The outbound flow controller is a bit cleaner and perhaps a bit more
fair when distributing outbound data across streams.
Motivation:
When doing a gathering write we need to update the indices after the write partial completes. In the current code-base we use the wrong value when compare the expected written bytes and the actual written bytes.
Modifications:
Use the correct value when compare.
Result:
Indices are updated correctly and so no corruption can happen when resume writing after data was only partial written before.
Motivation:
Our ci showed some buffer leaks in the http2 codec. This commit fixes all of these.
Modifications:
* Correctly release buffers in DefaultHttp2FrameWriter
* Fix buffer leaks in tests
Result:
Tests pass without leaks
Motivation:
Draft 12 has just arrived and has quite a few changes. Need to update in
order to keep current with the spec.
Modifications:
This is a rewrite of the original (draft 10) code. There are only 2
handlers now: preface and connection. The connection handler is now
callback based rather than frame based (there are no frame classes
anymore). AbstractHttp2ConnectionHandler is the base class for any
HTTP/2 handlers. All of the stream priority logic now resides in the
outbound flow controller, and its interface exposes methods for
adding/updating priority for streams.
Upgraded to hpack 0.7.0, which is used by draft12. Also removed
draft10 code and moved draft12 code to the ../http2 package
(no draft subpackage).
Result:
Addition of a HTTP/2 draft 12 support.
Motivation:
CORS request are currently processed, and potentially failed, after the
target ChannelHandler(s) have been invoked. This might not be desired, for
example a HTTP PUT or POST might have been performed.
Modifications:
Added a shortCurcuit option to CorsConfig which when set will
cause a validation of the HTTP request's 'Origin' header and verify that
it is valid according to the configuration. If found invalid an 403
"Forbidden" response will be returned and not further processing will
take place.
This is indeed no help for non browser request, like using curl, which
can set the 'Origin' header.
Result:
Users can now configure if the 'Origin' header should be validated
upfront and have the request rejected before any further processing
takes place.
Motivation:
Because of not correctly release a buffer before null out the reference a memory leak shows up.
Modifications:
Correct call buffer.release() before null out reference.
Result:
No more leak
Motivation:
Currently there's no way to configure maxFramePayloadSize from
WebSocketServerProtocolHandler, which is the most used entry point of
WebSocket server.
Modifications:
Added another constructor for maxFramePayloadSize.
Result:
We can configure max frame size for websocket packet in
WebSocketServerProtocolHandler. It will also keep backward compatibility
with default max size: 65536. (65536 is hard-coded max size in previous
version of Netty)
Motivation:
DefaultChannelPipeline.firstContext() should return null when the ipeline is empty. This is not the case atm.
Modification:
Fix incorrect check in DefaultChannelPipeline.firstContext() and add unit tests.
Result:
Correctly return null when DefaultChannelPipeline.firstContext() is called on empty pipeline.
Motivation:
oss.sonatype.org refuses to promote an artifact if it doesn't have the
default JAR (the JAR without classifier.)
Modifications:
- Generate both the default JAR and the native JAR to make
oss.sonatype.org happy
- Rename the profile 'release' to 'restricted-release' which reflects
what it really does better
- Remove the redundant <quickbuild>true</quickbuild> in all/pom.xml
We specify the profile 'full' that triggers that property already
in maven-release-plugin configuration.
Result:
oss.sonatype.org is happy. Simpler pom.xml
Motivation:
Netty must be released from RHEL 6.5 x86_64 or compatible so that:
1) we ship x86_64 version of epoll transport officially, and
2) we ensure the ABI compatibility with older GLIBC versions.
The shared library built on a distribution with newer GLIBC will not
run on older distributions.
Modifications:
- When 'release' profile is active, perform an additional check using
maven-enforcer-plugin so that 'mvn release:*' fails when running on
non-RHEL6.5. This rule is active only when releasing, so a user
should not be affected.
- Simplify maven-release-plugin configuration by removing redundant
profiles such as 'linux'. 'linux' is automatically activated when
releasing because we now enforce the release occurs on linux-x86_64.
- Remove the no-osgi profile, which is unused
- Remove the reference to 'sonatype-oss-release' profile in all/pom.xml,
because we always specify 'release' profile when releasing
- Rename the profile 'linux-native' to 'linux' for brevity
- Upgrade oss-parent and maven-enforcer-plugin
Result:
No one can make a mistake to release Netty on an environment that can
produce incompatible or missing native library.
Motivation:
So far, we used a very simple platform string such as linux64 and
linux32. However, this is far from perfection because it does not
include anything about the CPU architecture.
Also, the current build tries to put multiple versions of .so files into
a single JAR. This doesn't work very well when we have to ship for many
different platforms. Think about shipping .so/.dynlib files for both
Linux and Mac OS X.
Modification:
- Use os-maven-plugin as an extension to determine the current OS and
CPU architecture reliable at build time
- Use Maven classifier instead of trying to put all shared libraries
into a single JAR
- NativeLibraryLoader does not guess the OS and bit mode anymore and it
always looks for the same location regardless of platform, because the
Maven classifier does the job instead.
Result:
Better scalable native library deployment and retrieval
Motivation:
Before we aggregated the full text in the WebSocket08FrameDecoder just to fill in the ContinuationWebSocketFrame.aggregatedText(). The problem was that there was no upper-limit and so it would be possible to see an OOME if the remote peer sends a TextWebSocketFrame + a never ending stream of ContinuationWebSocketFrames. Furthermore the aggregation does not really belong in the WebSocket08FrameDecoder, as we provide an extra ChannelHandler for this anyway (WebSocketFrameAggregator).
Modification:
Remove the ContinuationWebSocketFrame.aggregatedText() method and corresponding constructor. Also refactored WebSocket08FrameDecoder a bit to me more efficient which is now possible as we not need to aggregate here.
Result:
No more risk of OOME because of frames.
Motivation:
When writing data from a server before the ssl handshake completes may not be written at all to the remote peer
if nothing else is written after the handshake was done.
Modification:
Correctly try to write pending data after the handshake was complete
Result:
Correctly write out all pending data