Commit Graph

327 Commits

Author SHA1 Message Date
Kiril Menshikov
66b9be3a46 Allow to allign allocated Buffers
Motivation:

64-byte alignment is recommended by the Intel performance guide (https://software.intel.com/en-us/articles/practical-intel-avx-optimization-on-2nd-generation-intel-core-processors) for data-structures over 64 bytes.
Requiring padding to a multiple of 64 bytes allows for using SIMD instructions consistently in loops without additional conditional checks. This should allow for simpler and more efficient code.

Modification:

At the moment cache alignment must be setup manually. But probably it might be taken from the system. The original code was introduced by @normanmaurer https://github.com/netty/netty/pull/4726/files

Result:

Buffer alignment works better than miss-align cache.
2017-02-06 07:58:29 +01:00
Scott Mitchell
3482651e0c HTTP/2 Non Active Stream RFC Corrections
Motivation:
codec-http2 couples the dependency tree state with the remainder of the stream state (Http2Stream). This makes implementing constraints where stream state and dependency tree state diverge in the RFC challenging. For example the RFC recommends retaining dependency tree state after a stream transitions to closed [1]. Dependency tree state can be exchanged on streams in IDLE. In practice clients may use stream IDs for the purpose of establishing QoS classes and therefore retaining this dependency tree state can be important to client perceived performance. It is difficult to limit the total amount of state we retain when stream state and dependency tree state is combined.

Modifications:
- Remove dependency tree, priority, and weight related items from public facing Http2Connection and Http2Stream APIs. This information is optional to track and depends on the flow controller implementation.
- Move all dependency tree, priority, and weight related code from DefaultHttp2Connection to WeightedFairQueueByteDistributor. This is currently the only place which cares about priority. We can pull out the dependency tree related code in the future if it is generally useful to expose for other implementations.
- DefaultHttp2Connection should explicitly limit the number of reserved streams now that IDLE streams are no longer created.

Result:
More compliant with the HTTP/2 RFC.
Fixes https://github.com/netty/netty/issues/6206.

[1] https://tools.ietf.org/html/rfc7540#section-5.3.4
2017-02-01 10:34:27 -08:00
Norman Maurer
735d6dd636 [maven-release-plugin] prepare for next development iteration 2017-01-30 15:14:02 +01:00
Norman Maurer
76e22e63f3 [maven-release-plugin] prepare release netty-4.1.8.Final 2017-01-30 15:12:36 +01:00
Scott Mitchell
e13da218e9 HTTP/2 revert Http2FrameWriter throws API change
Motivation:
2fd42cfc6b fixed a bug related to encoding headers but it also introduced a throws statement onto the Http2FrameWriter methods which write headers. This throws statement makes the API more verbose and is not necessary because we can communicate the failure in the ChannelFuture that is returned by these methods.

Modifications:
- Remove throws from all Http2FrameWriter methods.

Result:
Http2FrameWriter APIs do not propagate checked exceptions.
2017-01-26 23:26:17 -08:00
Tim Brooks
3344cd21ac Wrap operations requiring SocketPermission with doPrivileged blocks
Motivation:

Currently Netty does not wrap socket connect, bind, or accept
operations in doPrivileged blocks. Nor does it wrap cases where a dns
lookup might happen.

This prevents an application utilizing the SecurityManager from
isolating SocketPermissions to Netty.

Modifications:

I have introduced a class (SocketUtils) that wraps operations
requiring SocketPermissions in doPrivileged blocks.

Result:

A user of Netty can grant SocketPermissions explicitly to the Netty
jar, without granting it to the rest of their application.
2017-01-19 21:12:52 +01:00
Scott Mitchell
2fd42cfc6b HTTP/2 Max Header List Size Bug
Motivation:
If the HPACK Decoder detects that SETTINGS_MAX_HEADER_LIST_SIZE has been violated it aborts immediately and sends a RST_STREAM frame for what ever stream caused the issue. Because HPACK is stateful this means that the HPACK state may become out of sync between peers, and the issue won't be detected until the next headers frame. We should make a best effort to keep processing to keep the HPACK state in sync with our peer, or completely close the connection.
If the HPACK Encoder is configured to verify SETTINGS_MAX_HEADER_LIST_SIZE it checks the limit and encodes at the same time. This may result in modifying the HPACK local state but not sending the headers to the peer if SETTINGS_MAX_HEADER_LIST_SIZE is violated. This will also lead to an inconsistency in HPACK state that will be flagged at some later time.

Modifications:
- HPACK Decoder now has 2 levels of limits related to SETTINGS_MAX_HEADER_LIST_SIZE. The first will attempt to keep processing data and send a RST_STREAM after all data is processed. The second will send a GO_AWAY and close the entire connection.
- When the HPACK Encoder enforces SETTINGS_MAX_HEADER_LIST_SIZE it should not modify the HPACK state until the size has been checked.
- https://tools.ietf.org/html/rfc7540#section-6.5.2 states that the initial value of SETTINGS_MAX_HEADER_LIST_SIZE is "unlimited". We currently use 8k as a limit. We should honor the specifications default value so we don't unintentionally close a connection before the remote peer is aware of the local settings.
- Remove unnecessary object allocation in DefaultHttp2HeadersDecoder and DefaultHttp2HeadersEncoder.

Result:
Fixes https://github.com/netty/netty/issues/6209.
2017-01-19 10:42:43 -08:00
Scott Mitchell
b701da8d1c HTTP/2 HPACK Integer Encoding Bugs
Motivation:
- Decoder#decodeULE128 has a bounds bug and cannot decode Integer.MAX_VALUE
- Decoder#decodeULE128 doesn't support values greater than can be represented with Java's int data type. This is a problem because there are cases that require at least unsigned 32 bits (max header table size).
- Decoder#decodeULE128 treats overflowing the data type and invalid input the same. This can be misleading when inspecting the error that is thrown.
- Encoder#encodeInteger doesn't support values greater than can be represented with Java's int data type. This is a problem because there are cases that require at least unsigned 32 bits (max header table size).

Modifications:
- Correct the above issues and add unit tests.

Result:
Fixes https://github.com/netty/netty/issues/6210.
2017-01-18 18:36:47 -08:00
Norman Maurer
7f01da8d0f [maven-release-plugin] prepare for next development iteration 2017-01-12 11:36:51 +01:00
Norman Maurer
7a21eb1178 [maven-release-plugin] prepare release netty-4.1.7.Final 2017-01-12 11:35:58 +01:00
Scott Mitchell
06e7627b5f Read Only Http2Headers
Motivation:
A read only implementation of Http2Headers can allow for a more efficient usage of memory and more performant combined construction and iteration during serialization.

Modifications:
- Add a new ReadOnlyHttp2Headers class

Result:
ReadOnlyHttp2Headers exists and can be used for performance reasons when appropriate.

```
Benchmark                                            (headerCount)  Mode  Cnt    Score   Error  Units
ReadOnlyHttp2HeadersBenchmark.defaultClientHeaders               1  avgt   20   96.156 ± 1.902  ns/op
ReadOnlyHttp2HeadersBenchmark.defaultClientHeaders               5  avgt   20  157.925 ± 3.847  ns/op
ReadOnlyHttp2HeadersBenchmark.defaultClientHeaders              10  avgt   20  236.257 ± 2.663  ns/op
ReadOnlyHttp2HeadersBenchmark.defaultClientHeaders              20  avgt   20  392.861 ± 3.932  ns/op
ReadOnlyHttp2HeadersBenchmark.defaultServerHeaders               1  avgt   20   48.759 ± 0.466  ns/op
ReadOnlyHttp2HeadersBenchmark.defaultServerHeaders               5  avgt   20  113.122 ± 0.948  ns/op
ReadOnlyHttp2HeadersBenchmark.defaultServerHeaders              10  avgt   20  192.698 ± 1.936  ns/op
ReadOnlyHttp2HeadersBenchmark.defaultServerHeaders              20  avgt   20  348.974 ± 3.111  ns/op
ReadOnlyHttp2HeadersBenchmark.defaultTrailers                    1  avgt   20   35.694 ± 0.271  ns/op
ReadOnlyHttp2HeadersBenchmark.defaultTrailers                    5  avgt   20   98.993 ± 2.933  ns/op
ReadOnlyHttp2HeadersBenchmark.defaultTrailers                   10  avgt   20  171.035 ± 5.068  ns/op
ReadOnlyHttp2HeadersBenchmark.defaultTrailers                   20  avgt   20  330.621 ± 3.381  ns/op
ReadOnlyHttp2HeadersBenchmark.readOnlyClientHeaders              1  avgt   20   40.573 ± 0.474  ns/op
ReadOnlyHttp2HeadersBenchmark.readOnlyClientHeaders              5  avgt   20   56.516 ± 0.660  ns/op
ReadOnlyHttp2HeadersBenchmark.readOnlyClientHeaders             10  avgt   20   76.890 ± 0.776  ns/op
ReadOnlyHttp2HeadersBenchmark.readOnlyClientHeaders             20  avgt   20  117.531 ± 1.393  ns/op
ReadOnlyHttp2HeadersBenchmark.readOnlyServerHeaders              1  avgt   20   29.206 ± 0.264  ns/op
ReadOnlyHttp2HeadersBenchmark.readOnlyServerHeaders              5  avgt   20   44.587 ± 0.312  ns/op
ReadOnlyHttp2HeadersBenchmark.readOnlyServerHeaders             10  avgt   20   64.458 ± 1.169  ns/op
ReadOnlyHttp2HeadersBenchmark.readOnlyServerHeaders             20  avgt   20  107.179 ± 0.881  ns/op
ReadOnlyHttp2HeadersBenchmark.readOnlyTrailers                   1  avgt   20   21.563 ± 0.202  ns/op
ReadOnlyHttp2HeadersBenchmark.readOnlyTrailers                   5  avgt   20   41.019 ± 0.440  ns/op
ReadOnlyHttp2HeadersBenchmark.readOnlyTrailers                  10  avgt   20   64.053 ± 0.785  ns/op
ReadOnlyHttp2HeadersBenchmark.readOnlyTrailers                  20  avgt   20  113.737 ± 4.433  ns/op
```
2016-12-18 09:32:24 -08:00
Stephane Landelle
f755e58463 Clean up following #6016
Motivation:

* DefaultHeaders from netty-codec has some duplicated logic for header date parsing
* Several classes keep on using deprecated HttpHeaderDateFormat

Modifications:

* Move HttpHeaderDateFormatter to netty-codec and rename it into HeaderDateFormatter
* Make DefaultHeaders use HeaderDateFormatter
* Replace HttpHeaderDateFormat usage with HeaderDateFormatter

Result:

Faster and more consistent code
2016-11-21 12:35:40 -08:00
Stephane Landelle
edc4842309 Fix cookie date parsing, close #6016
Motivation:
* RFC6265 defines its own parser which is different from RFC1123 (it accepts RFC1123 format but also other ones). Basically, it's very lax on delimiters, ignores day of week and timezone. Currently, ClientCookieDecoder uses HttpHeaderDateFormat underneath, and can't parse valid cookies such as Github ones whose expires attribute looks like "Sun, 27 Nov 2016 19:37:15 -0000"
* ServerSideCookieEncoder currently uses HttpHeaderDateFormat underneath for formatting expires field, and it's slow.

Modifications:
* Introduce HttpHeaderDateFormatter that correctly implement RFC6265
* Use HttpHeaderDateFormatter in ClientCookieDecoder and ServerCookieEncoder
* Deprecate HttpHeaderDateFormat

Result:
* Proper RFC6265 dates support
* Faster ServerCookieEncoder and ClientCookieDecoder
* Faster tool for handling headers such as "Expires" and "Date"
2016-11-18 11:22:21 +00:00
buchgr
c9918de37b http2: Make MAX_HEADER_LIST_SIZE exceeded a stream error when encoding.
Motivation:

The SETTINGS_MAX_HEADER_LIST_SIZE limit, as enforced by the HPACK Encoder, should be a stream error and not apply to the whole connection.

Modifications:

Made the necessary changes for the exception to be of type StreamException.

Result:

A HEADERS frame exceeding the limit, only affects a specific stream.
2016-10-17 09:24:06 -07:00
Norman Maurer
5f533b7358 [maven-release-plugin] prepare for next development iteration 2016-10-14 13:20:41 +02:00
Norman Maurer
35fb0babe2 [maven-release-plugin] prepare release netty-4.1.6.Final 2016-10-14 12:47:19 +02:00
Scott Mitchell
540c26bb56 HTTP/2 Ensure default settings are correctly enforced and interfaces clarified
Motivation:
The responsibility for retaining the settings values and enforcing the settings constraints is spread out in different areas of the code and may be initialized with different values than the default specified in the RFC. This should not be allowed by default and interfaces which are responsible for maintaining/enforcing settings state should clearly indicate the restrictions that they should only be set by the codec upon receipt of a SETTINGS ACK frame.

Modifications:
- Encoder, Decoder, and the Headers Encoder/Decoder no longer expose public constructors that allow the default settings to be changed.
- Http2HeadersDecoder#maxHeaderSize() exists to provide some bound when headers/continuation frames are being aggregated. However this is roughly the same as SETTINGS_MAX_HEADER_LIST_SIZE (besides the 32 byte octet for each header field) and can be used instead of attempting to keep the two independent values in sync.
- Encoding headers now enforces SETTINGS_MAX_HEADER_LIST_SIZE at the octect level. Previously the header encoder compared the number of header key/value pairs against SETTINGS_MAX_HEADER_LIST_SIZE instead of the number of octets (plus 32 bytes overhead).
- DefaultHttp2ConnectionDecoder#onData calls shouldIgnoreHeadersOrDataFrame but may swallow exceptions from this method. This means a STREAM_RST frame may not be sent when it should for an unknown stream and thus violate the RFC. The exception is no longer swallowed.

Result:
Default settings state is enforced and interfaces related to settings state are clarified.
2016-10-07 13:00:45 -07:00
radai-rosenblatt
15ac6c4a1f Clean-up unused imports
Motivation:

the build doesnt seem to enforce this, so they piled up

Modifications:

removed unused import lines

Result:

less unused imports

Signed-off-by: radai-rosenblatt <radai.rosenblatt@gmail.com>
2016-09-30 09:08:50 +02:00
Masaru Nomura
20da5383ce Upgrade JMH to 1.14.1
Motivation:
It'd be usually good to use the latest library version.

Modification:
Bumped JMH to the latest version as of today.

Result:
Now we use JMH version 1.14.1 for our benchmark.
2016-09-29 21:29:26 +02:00
buchgr
67d3a78123 Reduce bytecode size of PlatformDependent0.equals.
Motivation:

PP0.equals has a bytecode size of 476. This is above the default inlining threshold of OpenJDK.

Modifications:

Slightly change the method to reduce the bytecode size by > 50% to 212 bytes.

Result:

The bytecode size is dramatically reduced, making the method a candidate for inlining.
The relevant code in our application (gRPC) that relies heavily on equals comparisons,
runs some ~10% faster. The Netty JMH benchmark shows no performance regression.

Current 4.1:

PlatformDependentBenchmark.unsafeBytesEqual      10  avgt   20     7.836 ±   0.113  ns/op
PlatformDependentBenchmark.unsafeBytesEqual      50  avgt   20    16.889 ±   4.284  ns/op
PlatformDependentBenchmark.unsafeBytesEqual     100  avgt   20    15.601 ±   0.296  ns/op
PlatformDependentBenchmark.unsafeBytesEqual    1000  avgt   20    95.885 ±   1.992  ns/op
PlatformDependentBenchmark.unsafeBytesEqual   10000  avgt   20   824.429 ±  12.792  ns/op
PlatformDependentBenchmark.unsafeBytesEqual  100000  avgt   20  8907.035 ± 177.844  ns/op

With this change:

PlatformDependentBenchmark.unsafeBytesEqual      10  avgt   20      5.616 ±   0.102  ns/op
PlatformDependentBenchmark.unsafeBytesEqual      50  avgt   20     17.896 ±   0.373  ns/op
PlatformDependentBenchmark.unsafeBytesEqual     100  avgt   20     14.952 ±   0.210  ns/op
PlatformDependentBenchmark.unsafeBytesEqual    1000  avgt   20     94.799 ±   1.604  ns/op
PlatformDependentBenchmark.unsafeBytesEqual   10000  avgt   20    834.996 ±  17.484  ns/op
PlatformDependentBenchmark.unsafeBytesEqual  100000  avgt   20   8757.421 ± 187.555  ns/op
2016-09-09 07:57:41 +02:00
Norman Maurer
54b1a100f4 [maven-release-plugin] prepare for next development iteration 2016-08-26 10:06:32 +02:00
Norman Maurer
1208b90f57 [maven-release-plugin] prepare release netty-4.1.5.Final 2016-08-26 04:59:35 +02:00
Scott Mitchell
208893aac9 HTTP/2 Hpack Encoder Cleanup
Motivation:
The HTTP/2 HPACK Encoder class has some code which is only used for test purposes. This code can be removed to reduce complexity and member variable count.

Modifications:
- Remove test code and update unit tests
- Other minor cleanup

Result:
Test code is removed from operational code.
2016-08-25 09:08:46 -07:00
Scott Mitchell
21e8d84b79 HTTP/2 Simplify Headers Decode Bounds Checking
Motivation:
The HPACK decoder keeps state so that the decode method can be called multiple times with successive header fragments. This decoder also requires that a method is called to signify the decoding is complete. At this point status is returned to indicate if the max header size has been violated. Netty always accumulates the header fragments into a single buffer before attempting to HPACK decode process and so keeping state and delaying notification that bounds have been exceeded is not necessary.

Modifications:
- HPACK Decoder#decode(..) now must be called with a complete header block
- HPACK will terminate immediately if the maximum header length, or maximum number of headers is exceeded
- Reduce member variables in the HPACK Decoder class because they can now live in the decode(..) method

Result:
HPACK bounds checking is done earlier and less class state is needed.
2016-08-12 17:12:53 -07:00
Scott Mitchell
82b617dfe9 retainSlice() unwrap ByteBuf
Motivation:
retainSlice() currently does not unwrap the ByteBuf when creating the ByteBuf wrapper. This effectivley forms a linked list of ByteBuf when it is only necessary to maintain a reference to the unwrapped ByteBuf.

Modifications:
- retainSlice() and retainDuplicate() variants should only maintain a reference to the unwrapped ByteBuf
- create new unit tests which generally verify the retainSlice() behavior
- Remove unecessary generic arguments from AbstractPooledDerivedByteBuf
- Remove unecessary int length member variable from the unpooled sliced ByteBuf implementation
- Rename the unpooled sliced/derived ByteBuf to include Unpooled in their name to be more consistent with the Pooled variants

Result:
Fixes https://github.com/netty/netty/issues/5582
2016-07-29 11:16:44 -07:00
Norman Maurer
cb7cf4491c [maven-release-plugin] prepare for next development iteration 2016-07-27 13:29:56 +02:00
Norman Maurer
9466b32d05 [maven-release-plugin] prepare release netty-4.1.4.Final 2016-07-27 13:16:59 +02:00
Norman Maurer
047f6aed28 [maven-release-plugin] prepare for next development iteration 2016-07-15 09:09:13 +02:00
Norman Maurer
b2adea87a0 [maven-release-plugin] prepare release netty-4.1.3.Final 2016-07-15 09:08:53 +02:00
Norman Maurer
4676a2271c [maven-release-plugin] prepare for next development iteration 2016-07-01 10:33:32 +02:00
Norman Maurer
ad270c02b9 [maven-release-plugin] prepare release netty-4.1.2.Final 2016-07-01 09:07:40 +02:00
Scott Mitchell
6af56ffe76 HPACK Encoder headerFields improvements
Motivation:
HPACK Encoder has a data structure which is similar to a previous version of DefaultHeaders. Some of the same improvements can be made.

Motivation:
- Enforce the restriction that the Encoder's headerFields length must be a power of two so we can use masking instead of modulo
- Use AsciiString.hashCode which already has optimizations instead of having yet another hash code algorithm in Encoder

Result:
Fixes https://github.com/netty/netty/issues/5357
2016-06-30 09:00:12 -07:00
Scott Mitchell
a7f7d9c8e0 Remove unsafe char[] access in PlatformDependent
Motivation:
PlatformDependent attempts to use reflection to get the underlying char[] (or byte[]) from String objects. This is fragile as if the String implementation does not utilize the full array, and instead uses a subset of the array, this optimization is invalid. OpenJDK6 and some earlier versions of OpenJDK7 String have the capability to use a subsection of the underlying char[].

Modifications:
- PlatformDependent should not attempt to use the underlying array from String (or other data types) via reflection

Result:
PlatformDependent hash code generation for CharSequence does not depend upon specific JDK implementation details.
2016-06-30 08:58:28 -07:00
Scott Mitchell
70651cc58d HpackUtil.equals performance improvement
Motivation:
PR #5355 modified interfaces to reduce GC related to the HPACK code. However this came with an anticipated performance regression related to HpackUtil.equals due to AsciiString's increase cost of charAt(..). We should mitigate this performance regression.

Modifications:
- Introduce an equals method in PlatformDependent which doesn't leak timing information and use this in HpcakUtil.equals

Result:
Fixes https://github.com/netty/netty/issues/5436
2016-06-27 14:37:39 -07:00
Guido Medina
f0a5ee068f Update dependencies and plugins to latest possible versions.
Motivation:
It is good to have used dependencies and plugins up-to-date to fix any undiscovered bug fixed by the authors.

Modification:
Scanned dependencies and plugins and carefully updated one by one.

Result:
Dependencies and plugins are up-to-date.
2016-06-27 13:35:35 +02:00
Norman Maurer
b4d4c0034d Optimize HPACK usage to align more with Netty types and remove heavy object creations. Related to [#3597]
Motivations:

The HPACK code was not really optimized and written with Netty types in mind. Because of this a lot of garbage was created due heavy object creation.

This was first reported in [#3597] and https://github.com/grpc/grpc-java/issues/1872 .

Modifications:

- Directly use ByteBuf as input and output
- Make use of ByteProcessor where possible
- Use AsciiString as this is the only thing we need for our http2 usage

Result:

Less garbage and better usage of Netty apis.
2016-06-22 14:26:05 +02:00
Norman Maurer
4dec7f11b7 [maven-release-plugin] prepare for next development iteration 2016-06-07 18:52:34 +02:00
Norman Maurer
cf670fab75 [maven-release-plugin] prepare release netty-4.1.1.Final 2016-06-07 18:52:22 +02:00
Norman Maurer
6ca49d1336 [maven-release-plugin] prepare for next development iteration 2016-05-25 19:16:44 +02:00
Norman Maurer
446b38db52 [maven-release-plugin] prepare release netty-4.1.0.Final 2016-05-25 19:14:15 +02:00
Norman Maurer
e59d0e9efb Introduce CodecOutputList to reduce overhead of encoder/decoder
Motivation:

99dfc9ea79 introduced some code that will more frequently try to forward messages out of the list of decoded messages to reduce latency and memory footprint. Unfortunally this has the side-effect that RecycleableArrayList.clear() will be called more often and so introduce some overhead as ArrayList will null out the array on each call.

Modifications:

- Introduce a CodecOutputList which allows to not null out the array until we recycle it and also allows to access internal array with extra range checks.
- Add benchmark that add elements to different List implementations and clear them

Result:

Less overhead when decode / encode messages.

Benchmark                                     (elements)   Mode  Cnt         Score        Error  Units
CodecOutputListBenchmark.arrayList                     1  thrpt   20  24853764.609 ± 161582.376  ops/s
CodecOutputListBenchmark.arrayList                     4  thrpt   20  17310636.508 ± 930517.403  ops/s
CodecOutputListBenchmark.codecOutList                  1  thrpt   20  26670751.661 ± 587812.655  ops/s
CodecOutputListBenchmark.codecOutList                  4  thrpt   20  25166421.089 ± 166945.599  ops/s
CodecOutputListBenchmark.recyclableArrayList           1  thrpt   20  24565992.626 ± 210017.290  ops/s
CodecOutputListBenchmark.recyclableArrayList           4  thrpt   20  18477881.775 ± 157003.777  ops/s

Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 246.748 sec - in io.netty.handler.codec.CodecOutputListBenchmark
2016-05-20 09:12:07 +02:00
Norman Maurer
68cd670eb9 Remove ChannelHandlerInvoker
Motivation:

We tried to provide the ability for the user to change the semantics of the threading-model by delegate the invoking of the ChannelHandler to the ChannelHandlerInvoker. Unfortunually this not really worked out quite well and resulted in just more complexity and splitting of code that belongs together. We should remove the ChannelHandlerInvoker again and just do the same as in 4.0

Modifications:

Remove ChannelHandlerInvoker again and replace its usage in Http2MultiplexCodec

Result:

Easier code and less bad abstractions.
2016-05-17 11:14:00 +02:00
Scott Mitchell
c5faa142fb KObjectHashMap probeNext improvement
Motivation:
KObjectHashMap.probeNext(..) usually involves 2 conditional statements and 2 aritmatic operations. This can be improved to have 0 conditional statements.

Modifications:
- Use bit masking to avoid conditional statements

Result:
Improved performance for KObjecthashMap.probeNext(..)
2016-05-13 10:04:26 -07:00
Scott Mitchell
d580245afc DefaultHttp2Connection.close Reentrant Modification
Motivation:
The DefaultHttp2Conneciton.close method accounts for active streams being iterated and attempts to avoid reentrant modifications of the underlying stream map by using iterators to remove from the stream map. However there are a few issues:

- While iterating over the stream map we don't prevent iterations over the active stream collection
- Removing a single stream may actually remove > 1 streams due to closed non-leaf streams being preserved in the priority tree which may result in NPE

Preserving closed non-leaf streams in the priority tree is no longer necessary with our current allocation algorithms, and so this feature (and related complexity) can be removed.

Modifications:
- DefaultHttp2Connection.close should prevent others from iterating over the active streams and reentrant modification scenarios which may result from this
- DefaultHttp2Connection should not keep closed stream in the priority tree
  - Remove all associated code in DefaultHttp2RemoteFlowController which accounts for this case including the ReducedState object
  - This includes fixing writability changes which depended on ReducedState
- Update unit tests

Result:
Fixes https://github.com/netty/netty/issues/5198
2016-05-09 14:16:30 -07:00
Norman Maurer
9229ed98e2 [#5088] Add annotation which marks packages/interfaces/classes as unstable
Motivation:

Some codecs should be considered unstable as these are relative new. For this purpose we should introduce an annotation which these codecs should us to be marked as unstable in terms of API.

Modifications:

- Add UnstableApi annotation and use it on codecs that are not stable
- Move http2.hpack to http2.internal.hpack as it is internal.

Result:

Better document unstable APIs.
2016-05-09 15:16:35 +02:00
Norman Maurer
03638869b9 Update dependencies
Motivation:

Before release 4.1.0.Final we should update all our dependencies.

Modifications:

Update dependencies.

Result:

Up-to-date dependencies used.
2016-04-14 11:05:53 +02:00
Norman Maurer
572bdfb494 [maven-release-plugin] prepare for next development iteration 2016-04-10 08:37:18 +02:00
Norman Maurer
c6121a6f49 [maven-release-plugin] prepare release netty-4.1.0.CR7 2016-04-10 08:36:56 +02:00
Norman Maurer
6e919f70f8 [maven-release-plugin] rollback the release of netty-4.1.0.CR7 2016-04-09 22:13:44 +02:00
Norman Maurer
4cdd51509a [maven-release-plugin] prepare release netty-4.1.0.CR7 2016-04-09 22:05:34 +02:00
Trustin Lee
3b941c2a7c [maven-release-plugin] prepare for next development iteration 2016-04-02 01:25:05 -04:00
Trustin Lee
7368ccc539 [maven-release-plugin] prepare release netty-4.1.0.CR6 2016-04-02 01:24:55 -04:00
Norman Maurer
cee38ed2b6 [maven-release-plugin] prepare for next development iteration 2016-03-29 16:45:13 +02:00
Norman Maurer
9cd9e7daeb [maven-release-plugin] prepare release netty-4.1.0.CR5 2016-03-29 16:44:33 +02:00
Xiaoyan Lin
3ad55eb839 Speed up the slow path of FastThreadLocal
Motivation:

The current slow path of FastThreadLocal is much slower than JDK ThreadLocal. See #4418

Modifications:

- Add FastThreadLocalSlowPathBenchmark for the flow path of FastThreadLocal
- Add final to speed up the slow path of FastThreadLocal

Result:

The slow path of FastThreadLocal is improved.
2016-03-23 11:36:16 +01:00
Norman Maurer
28d03adbfe [maven-release-plugin] prepare for next development iteration 2016-03-21 11:51:50 +01:00
Norman Maurer
4653dc1d05 [maven-release-plugin] prepare release netty-4.1.0.CR4 2016-03-21 11:51:12 +01:00
Xiaoyan Lin
4a5e484c5a Add xml-maven-plugin to check indentation and fix violations
Motivation:

See https://github.com/netty/netty-build/issues/5

Modifications:

Add xml-maven-plugin to check indentation and fix violations

Result:
pom.xml will be checked in the PR build
2016-02-29 09:46:32 +01:00
Norman Maurer
ca443e42e0 [maven-release-plugin] prepare for next development iteration 2016-02-19 23:00:11 +01:00
Norman Maurer
f39eb9a6b2 [maven-release-plugin] prepare release netty-4.1.0.CR3 2016-02-19 22:59:52 +01:00
Norman Maurer
75a2ddd61c [maven-release-plugin] prepare for next development iteration 2016-02-04 16:51:44 +01:00
Norman Maurer
7eb3a60dba [maven-release-plugin] prepare release netty-4.1.0.CR2 2016-02-04 16:37:06 +01:00
Scott Mitchell
f990f9983d HTTP/2 Don't Flow Control Iniital Headers
Motivation:
Currently the initial headers for every stream is queued in the flow controller. Since the initial header frame may create streams the peer must receive these frames in the order in which they were created, or else this will be a protocol error and the connection will be closed. Tolerating the initial headers being queued would increase the complexity of the WeightedFairQueueByteDistributor and there is benefit of doing so is not clear.

Modifications:
- The initial headers will no longer be queued in the flow controllers

Result:
Fixes https://github.com/netty/netty/issues/4758
2016-02-01 13:37:43 -08:00
Norman Maurer
1c417e5f82 [maven-release-plugin] prepare for next development iteration 2016-01-21 15:35:55 +01:00
Norman Maurer
c681a40a78 [maven-release-plugin] prepare release netty-4.1.0.CR1 2016-01-21 15:28:21 +01:00
Eric Anderson
6dbb610f5b Add ChannelHandlerContext.invoker()
Motivation:

Being able to access the invoker() is useful when adding additional
handlers that should be running in the same thread. Since an application
may be using a threading model unsupported by the default invoker, they
can specify their own. Because of that, in a handler that auto-adds
other handlers:

// This is a good pattern
ctx.pipeline().addBefore(ctx.invoker(), ctx.name(), null, newHandler);
// This will generally work, but prevents using custom invoker.
ctx.pipeline().addBefore(ctx.executor(), ctx.name(), null, newHandler);

That's why I believe in commit 110745b0, for the now-defunct 5.0 branch,
when ChannelHandlerAppender was added the invoker() method was also
necessary.

There is a side-benefit to exposing the invoker: in certain advanced
use-cases using the invoker for a particular handler is useful. Using
the invoker you are able to invoke a _particular_ handler, from possibly
a different thread yet still using standard exception processing.

ChannelHandlerContext does part of that, but is unwieldy when trying to
invoke a particular handler because it invokes the prev or next handler,
not the one the context is for. A workaround is to use the next or prev
context (respectively), but this breaks when the pipeline changes.

This came up during writing the Http2MultiplexCodec which uses a
separate child channel for each http/2 stream and wants to send messages
from the child channel directly to the Http2MultiplexCodec handler that
created it.

Modifications:

Add the invoker() method to ChannelHandlerContext. It was already being
implemented by AbstractChannelHandlerContext. The two other
implementations of ChannelHandlerContext needed minor tweaks.

Result:

Access to the invoker used for a particular handler, for either reusing
for other handlers or for advanced use-cases. Fixes #4738
2016-01-22 14:04:35 +01:00
Xiaoyan Lin
475d901131 Fix errors reported by javadoc
Motivation:

Javadoc reports errors about invalid docs.

Modifications:

Fix some errors reported by javadoc.

Result:

A lot of javadoc errors are fixed by this patch.
2015-12-27 08:36:45 +01:00
zhangduo
f22ad97cf3 Remove PriorityStreamByteDistributor from http2 microbench
Motivation:
PriorityStreamByteDistributor has been removed but NoPriorityByteDistributionBenchmark in microbench still need it and causes compile error

Modifications:
Remove PriorityStreamByteDistributor from NoPriorityByteDistributionBenchmark

Result:
The compile error has been fixed
2015-12-22 09:20:32 +01:00
nmittler
ee56a4a5c6 Fixing broken HTTP/2 benchmark
Motivation:

The `NoPriorityByteDistibbutionBenchmark` was broken with a recent commit.

Modifications:

Fixed the benchmark to use the new HTTP2 handler builder.

Result:

It builds.
2015-12-18 09:52:12 -08:00
Scott Mitchell
904e70a4d4 HTTP/2 Weighted Fair Queue Byte Distributor
Motivation:
PriorityStreamByteDistributor uses a homegrown algorithm which distributes bytes to nodes in the priority tree. PriorityStreamByteDistributor has no concept of goodput which may result in poor utilization of network resources. PriorityStreamByteDistributor also has performance issues related to the tree traversal approach and number of nodes that must be visited. There also exists some more proven algorithms from the resource scheduling domain which PriorityStreamByteDistributor does not employ.

Modifications:
- Introduce a new ByteDistributor which uses elements from weighted fair queue schedulers

Result:
StreamByteDistributor which is sensitive to priority and uses a more familiar distribution concept.
Fixes https://github.com/netty/netty/issues/4462
2015-12-17 11:17:02 -08:00
Trustin Lee
2202e8f967 Revamp the Http2ConnectionHandler builder API
Related: #4572

Motivation:

- A user might want to extend Http2ConnectionHandler and define his/her
  own static inner Builder class that extends
  Http2ConnectionHandler.BuilderBase. This introduces potential
  confusion because there's already Http2ConnectionHandler.Builder. Your
  IDE will warn about this name duplication as well.
- BuilderBase exposes all setters with public modifier. A user's Builder
  might not want to expose them to enforce it to certain configuration.
  There's no way to hide them because it's public already and they are
  final.
- BuilderBase.build(Http2ConnectionDecoder, Http2ConnectionEncoder)
  ignores most properties exposed by BuilderBase, such as
  validateHeaders, frameLogger and encoderEnforceMaxConcurrentStreams.
  If any build() method ignores the properties exposed by the builder,
  there's something wrong.
- A user's Builder that extends BuilderBase might want to require more
  parameters in build(). There's no way to do that cleanly because
  build() is public and final already.

Modifications:

- Make BuilderBase and Builder top-level so that there's no duplicate
  name issue anymore.
  - Add AbstractHttp2ConnectionHandlerBuilder
  - Add Http2ConnectionHandlerBuilder
  - Add HttpToHttp2ConnectionHandlerBuilder
- Make all builder methods in AbstractHttp2ConnectionHandlerBuilder
  protected so that a subclass can choose which methods to expose
- Provide only a single build() method
  - Add connection() and codec() so that a user can still specify
    Http2Connection or Http2Connection(En|De)coder explicitly
  - Implement proper state validation mechanism so that it is prevented
    to invoke conflicting setters

Result:

Less confusing yet flexible builder API
2015-12-17 14:08:13 +09:00
Scott Mitchell
641505a5d2 DefaultChannelConfig maxMessagesPerRead default not always set
Motivation:
ChannelMetadata has a field minMaxMessagesPerRead which can be confusing. There are also some cases where static instances are used and the default value for channel type is not being applied.

Modifications:
- use a default value which is set unconditionally to simplify
- make sure static instances of MaxMessagesRecvByteBufAllocator are not used if the intention is that the default maxMessagesPerRead should be derived from the channel type.

Result:
Less confusing interfaces in ChannelMetadata and ChannelConfig. Default maxMessagesPerRead is correctly applied.
2015-11-25 15:14:07 -08:00
Scott Mitchell
49cd00da1c Backport of benchmark broke build
Motivation:
2a2059d976 was backported from master, and included an overriden method which does not exist in 4.1.

Modifications:
- Remove the invoker method from NoPriorityByteDistributionBenchmark

Result:
No more build error
2015-11-20 15:03:50 -08:00
nmittler
2a2059d976 Adding UniformStreamByteDistributor
Motivation:

The current priority algorithm can yield poor per-stream goodput when either the number of streams is high or the connection window is small. When all priorities are the same (i.e. priority is disabled), we should be able to do better.

Modifications:

Added a new UniformStreamByteDistributor that ignores priority entirely and manages a queue of streams.  Each stream is allocated a minimum of 1KiB on each iteration.

Result:

Improved goodput when priority is not used.
2015-11-19 16:49:12 -08:00
nmittler
8accc52b03 Forking Twitter's hpack
Motivation:

The twitter hpack project does not have the support that it used to have.  See discussion here: https://github.com/netty/netty/issues/4403.

Modifications:

Created a new module in Netty and copied the latest from twitter hpack master.

Result:

Netty no longer depends on twitter hpack.
2015-11-14 10:13:32 -08:00
Norman Maurer
2ecce8fa56 [maven-release-plugin] prepare for next development iteration 2015-11-10 22:59:33 +01:00
Norman Maurer
6a93f331d3 [maven-release-plugin] prepare release netty-4.1.0.Beta8 2015-11-10 22:50:57 +01:00
Scott Mitchell
b4b791353d AsciiString optimized hashCode
Motivation:
The AsciiString.hashCode() method can be optimized. This method is frequently used while to build the DefaultHeaders data structure.

Modification:
- Add a PlatformDependent hashCode algorithm which utilizes UNSAFE if available

Result:
AsciiString hashCode is faster.
2015-11-10 10:28:31 -08:00
Louis Ryan
6e108cb96a Improve the performance of copying header sets when hashing and name validation are equivalent.
Motivation:
Headers and groups of headers are frequently copied and the current mechanism is slower than it needs to be.

Modifications:
Skip name validation and hash computation when they are not necessary.
Fix emergent bug in CombinedHttpHeaders identified with better testing
Fix memory leak in DefaultHttp2Headers when clearing
Added benchmarks

Result:
Faster header copying and some collateral bug fixes
2015-11-07 08:53:10 -08:00
nmittler
6504d52b94 Add HTTP/2 local flow control option for auto refill
Motivation:

For many HTTP/2 applications (such as gRPC) it is necessary to autorefill the connection window in order to prevent application-level deadlocking.

Consider an application with 2 streams, A and B.  A receives a stream of messages and the application pops off one message at a time and makes a request on stream B. However, if receiving of data on A has caused the connection window to collapse, B will not be able to receive any data and the application will deadlock.  The only way (currently) to get around this is 1) use multiple connections, or 2) manually refill the connection window.  Both are undesirable and could needlessly complicate the application code.

Modifications:

Add a configuration option to DefaultHttp2LocalFlowController, allowing it to autorefill the connection window.

Result:

Applications can configure HTTP/2 to avoid inter-stream deadlocking.
2015-11-05 15:47:10 -08:00
Norman Maurer
1b2e43e70c Correctly construct Executor in microbenchmarks.
Motivation:

We should allow our custom Executor to shutdown quickly.

Modifications:

Call super constructor which correct arguments.

Result:

Custom Executor can be shutdown quickly.
2015-11-03 09:46:05 +01:00
Scott Mitchell
19658e9cd8 HTTP/2 Headers Type Updates
Motivation:
The HTTP/2 RFC (https://tools.ietf.org/html/rfc7540#section-8.1.2) indicates that header names consist of ASCII characters. We currently use ByteString to represent HTTP/2 header names. The HTTP/2 RFC (https://tools.ietf.org/html/rfc7540#section-10.3) also eludes to header values inheriting the same validity characteristics as HTTP/1.x. Using AsciiString for the value type of HTTP/2 headers would allow for re-use of predefined HTTP/1.x values, and make comparisons more intuitive. The Headers<T> interface could also be expanded to allow for easier use of header types which do not have the same Key and Value type.

Motivation:
- Change Headers<T> to Headers<K, V>
- Change Http2Headers<ByteString> to Http2Headers<CharSequence, CharSequence>
- Remove ByteString. Having AsciiString extend ByteString complicates equality comparisons when the hash code algorithm is no longer shared.

Result:
Http2Header types are more representative of the HTTP/2 RFC, and relationship between HTTP/2 header name/values more directly relates to HTTP/1.x header names/values.
2015-10-30 15:29:44 -07:00
buchgr
c9364616c8 Fix performance regression in FastThreadLocal microbenchmark. Fixes #4402
Motivation:

As reported in #4402, the FastThreadLocalBenchmark shows that the JDK ThreadLocal
is actually faster than Netty's custom thread local implementation.

I was looking forward to doing some deep digging, but got disappointed :(.

Modifications:

The microbenchmark was not using FastThreadLocalThreads and would thus always hit the slow path.
I updated the JMH command line flags, so that FastThreadLocalThreads would be used.

Result:

FastThreadLocalBenchmark shows FastThreadLocal to be faster than JDK's ThreadLocal implementation,
by about 56% in this particular benchmark. Run on OSX El Capitan with OpenJDK 1.8u60.

Benchmark                                    Mode  Cnt      Score      Error  Units
FastThreadLocalBenchmark.fastThreadLocal    thrpt   20  55452.027 ±  725.713  ops/s
FastThreadLocalBenchmark.jdkThreadLocalGet  thrpt   20  35481.888 ± 1471.647  ops/s
2015-10-29 21:40:13 +01:00
Norman Maurer
2e36ac4594 Add benchmark for HeapByteBuf implementations.
Motivation:

To prove one implementation is faster as the other we should have a benchmark.

Modifications:

Add benchmark which benchmarks the unsafe and non-unsafe implementation of HeapByteBuf.

Result:

Able to compare speed of implementations easily.
2015-10-29 19:38:52 +01:00
Norman Maurer
a47685b243 Use bitwise operation when sampling for resource leak detection.
Motivation:

Modulo operations are slow, we can use bitwise operation to detect if resource leak detection must be done while sampling.

Modifications:

- Ensure the interval is a power of two
- Use bitwise operation for sampling
- Add benchmark.

Result:

Faster sampling.
2015-10-29 19:18:44 +01:00
Stephane Landelle
c6474f9218 Exclude native transport related test and dependency when not running under Linux, close #4409
Motivation:

The build fails on OSX, due to it trying to pull in an epoll specific OSX dependency. See #4409.

Modifications:

* move netty-transport-native-epoll to linux profile
* exclude Http2FrameWriterBenchmark from compiler
* include Http2FrameWriterBenchmark back only in linux profile (please check)

Result:

Build succeeds on OSX.
2015-10-29 16:23:25 +01:00
Norman Maurer
4c287d4e27 Added SlicedAbstractByteBuf that can provide fast-path for _get* and _set* methods
Motivation:

SlicedByteBuf can be used for any ByteBuf implementations and so can not do any optimizations that could be done
when AbstractByteBuf is sliced.

Modifications:

- Add SlicedAbstractByteBuf that can eliminate range and reference count checks for _get* and _set* methods.

Result:

Faster SlicedByteBuf implementations for AbstractByteBuf sub-classes.
2015-10-16 09:12:20 +02:00
Norman Maurer
2aef4a504f Minimize object allocation when calling AbstractByteBuf.toString(..., Charset)
Motivation:

Calling AbstractByteBuf.toString(..., Charset) is used quite frequently by users but produce a lot of GC.

Modification:

- Use a FastThreadLocal to store the CharBuffer that are needed for decoding.
- Use internalNioBuffer(...) when possible

Result:

Less object creation / Less GC
2015-10-15 17:51:57 +02:00
Norman Maurer
9697afc106 Allow to disable reference count checks on every access of the ByteBuf
Motiviation:

Checking reference count on every access on a ByteBuf can have some big performance overhead depending on how the access pattern is. If the user is sure that there are no reference count errors on his side it should be possible to disable the check and so gain the max performance.

Modification:

- Add io.netty.buffer.bytebuf.checkAccessible system property which allows to disable the checks. Enabled by default.
- Add microbenchmark

Result:

Increased performance for operations on the ByteBuf.
2015-10-15 10:21:16 +02:00
Norman Maurer
2ff2806ada [maven-release-plugin] prepare for next development iteration 2015-10-02 09:03:29 +02:00
Norman Maurer
5a43de10f7 [maven-release-plugin] prepare release netty-4.1.0.Beta7 2015-10-02 09:02:58 +02:00
Scott Mitchell
d4680c55d8 AsciiString contains utility methods
Motivation:
When dealing with case insensitive headers it can be useful to have a case insensitive contains method for CharSequence.

Modifications:
- Add containsCaseInsensative to AsciiString

Result:
More expressive utility method for case insensitive CharSequence.
2015-10-02 12:50:11 -07:00
Scott Mitchell
284e3702d8 Http2ConnectionHandler Builder instead of constructors
Motivation:
Using the builder pattern for Http2ConnectionHandler (and subclasses) would be advantageous for the following reasons:
1. Provides the consistent construction afforded by the builder pattern for 'optional' arguments. Users can specify these options 1 time in the builder and then re-use the builder after this.
2. Enforces that the Http2ConnectionHandler's internals (decoder Http2FrameListener) are initialized after construction.

Modifications:
- Add an extensible builder which can be used to build Http2ConnectionHandler objects
- Update classes which inherit from Http2ConnectionHandler

Result:
It is easier to specify options and construct Http2ConnectionHandler objects.
2015-10-01 13:51:03 -07:00
Scott Mitchell
1485a87e25 Http2ConnectionHandler and Http2FrameListener cyclic dependency
Motivation:
It is often the case that implementations of Http2FrameListener will want to send responses when data is read. The Http2FrameListener needs access to the Http2ConnectionHandler (or the encoder contained within) to be able to send responses. However the Http2ConnectionHandler requires a Http2FrameListener instance to be passed in during construction time. This creates a cyclic dependency which can make it difficult to cleanly accomplish this relationship.

Modifications:
- Add Http2ConnectionDecoder.frameListener(..) method to set the frame listener. This will allow the listener to be set after construction.

Result:
Classes which inherit from Http2ConnectionHandler can more cleanly set the Http2FrameListener.
2015-09-30 15:41:15 -07:00
Scott Mitchell
0e9545e94d Http2RemoteFlowController stream writibility listener
Motivation:
For implementations that want to manage flow control down to the stream level it is useful to be notified when stream writability changes.

Modifications:
- Add writabilityChanged to Http2RemoteFlowController.Listener
- Add isWritable to Http2RemoteFlowController

Result:
The Http2RemoteFlowController provides notification when writability of a stream changes.
2015-09-28 13:47:24 -07:00
nmittler
3ee44a3dbb Update Netty to latest netty-tcnative
Motivation:

The latest netty-tcnative fixes a bug in determining the version of the runtime openssl lib.  It also publishes an artificact with the classifier linux-<arch>-fedora for fedora-based systems.

Modifications:

Modified the build files to use the "-fedora" classifier when appropriate for tcnative. Care is taken, however, to not change the classifier for the native epoll transport.

Result:

Netty is updated the the new shiny netty-tcnative.
2015-09-18 12:07:21 -07:00
Norman Maurer
34de2667c7 [maven-release-plugin] prepare for next development iteration 2015-09-02 11:45:20 +02:00
Norman Maurer
2eb444ec1d [maven-release-plugin] prepare release netty-4.1.0.Beta6 2015-09-02 11:36:11 +02:00
Scott Mitchell
ba6ce5449e Headers Performance Boost and Interface Simplification
Motivation:
A degradation in performance has been observed from the 4.0 branch as documented in https://github.com/netty/netty/issues/3962.

Modifications:
- Simplify Headers class hierarchy.
- Restore the DefaultHeaders to be based upon DefaultHttpHeaders from 4.0.
- Make various other modifications that are causing hot spots.

Result:
Performance is now on par with 4.0.
2015-08-17 08:50:11 -07:00
Jakob Buchgraber
6fd0a0c55f Faster and more memory efficient headers for HTTP, HTTP/2, STOMP and SPYD. Fixes #3600
Motivation:

We noticed that the headers implementation in Netty for HTTP/2 uses quite a lot of memory
and that also at least the performance of randomly accessing a header is quite poor. The main
concern however was memory usage, as profiling has shown that a DefaultHttp2Headers
not only use a lot of memory it also wastes a lot due to the underlying hashmaps having
to be resized potentially several times as new headers are being inserted.

This is tracked as issue #3600.

Modifications:
We redesigned the DefaultHeaders to simply take a Map object in its constructor and
reimplemented the class using only the Map primitives. That way the implementation
is very concise and hopefully easy to understand and it allows each concrete headers
implementation to provide its own map or to even use a different headers implementation
for processing requests and writing responses i.e. incoming headers need to provide
fast random access while outgoing headers need fast insertion and fast iteration. The
new implementation can support this with hardly any code changes. It also comes
with the advantage that if the Netty project decides to add a third party collections library
as a dependency, one can simply plug in one of those very fast and memory efficient map
implementations and get faster and smaller headers for free.

For now, we are using the JDK's TreeMap for HTTP and HTTP/2 default headers.

Result:

- Significantly fewer lines of code in the implementation. While the total commit is still
  roughly 400 lines less, the actual implementation is a lot less. I just added some more
  tests and microbenchmarks.

- Overall performance is up. The current implementation should be significantly faster
  for insertion and retrieval. However, it is slower when it comes to iteration. There is simply
  no way a TreeMap can have the same iteration performance as a linked list (as used in the
  current headers implementation). That's totally fine though, because when looking at the
  benchmark results @ejona86 pointed out that the performance of the headers is completely
  dominated by insertion, that is insertion is so significantly faster in the new implementation
  that it does make up for several times the iteration speed. You can't iterate what you haven't
  inserted. I am demonstrating that in this spreadsheet [1]. (Actually, iteration performance is
  only down for HTTP, it's significantly improved for HTTP/2).

- Memory is down. The implementation with TreeMap uses on avg ~30% less memory. It also does not
  produce any garbage while being resized. In load tests for GRPC we have seen a memory reduction
  of up to 1.2KB per RPC. I summarized the memory improvements in this spreadsheet [1]. The data
  was generated by [2] using JOL.

- While it was my original intend to only improve the memory usage for HTTP/2, it should be similarly
  improved for HTTP, SPDY and STOMP as they all share a common implementation.

[1] https://docs.google.com/spreadsheets/d/1ck3RQklyzEcCLlyJoqDXPCWRGVUuS-ArZf0etSXLVDQ/edit#gid=0
[2] https://gist.github.com/buchgr/4458a8bdb51dd58c82b4
2015-08-04 17:12:24 -07:00
Scott Mitchell
a7713069a1 HttpObjectDecoder performance improvements
Motivation:
The HttpObjectDecoder is on the hot code path for the http codec. There are a few hot methods which can be modified to improve performance.

Modifications:
- Modify AppendableCharSequence to provide unsafe methods which don't need to re-check bounds for every call.
- Update HttpObjectDecoder methods to take advantage of new AppendableCharSequence methods.

Result:
Peformance boost for decoding http objects.
2015-07-29 23:26:26 -07:00
Scott Mitchell
9747ffe5fc HTTP/2 Flow Controller should use Channel.isWritable()
Motivation:
See #3783

Modifications:
- The DefaultHttp2RemoteFlowController should use Channel.isWritable() before attempting to do any write operations.
- The Flow controller methods should no longer take ChannelHandlerContext. The concept of flow control is tied to a connection and we do not support 1 flow controller keeping track of multiple ChannelHandlerContext.

Result:
Writes are delayed until isWritable() is true. Flow controller interface methods are more clear as to ChannelHandlerContext restrictions.
2015-07-16 14:38:48 -07:00
Louis Ryan
05ce33f5ca Make the flow-controllers write fewer, fatter frames to improve throughput.
Motivation:

Coalescing many small writes into a larger DATA frame reduces framing overheads on the wire and reduces the number of calls to Http2FrameListeners on the remote side.
Delaying the write of WINDOW_UPDATE until flush allows for more consumed bytes to be returned as the aggregate of consumed bytes is returned and not the amount consumed when the threshold was crossed.

Modifications:
- Remote flow controller no longer immediately writes bytes when a flow-controlled payload is enqueued. Sequential data payloads are now merged into a single CompositeByteBuf which are written when 'writePendingBytes' is called.
- Listener added to remote flow-controller which observes written bytes per stream.
- Local flow-controller no longer immediately writes WINDOW_UPDATE when the ratio threshold is crossed. Now an explicit call to 'writeWindowUpdates' triggers the WINDOW_UPDATE for all streams who's ratio is exceeded at that time. This results in
  fewer window updates being sent and more bytes being returned.
- Http2ConnectionHandler.flush triggers 'writeWindowUpdates' on the local flow-controller followed by 'writePendingBytes' on the remote flow-controller so WINDOW_UPDATES preceed DATA frames on the wire.

Result:
- Better throughput for writing many small DATA chunks followed by a flush, saving 9-bytes per coalesced frame.
- Fewer WINDOW_UPDATES being written and more flow-control bytes returned to remote side more quickly, thereby improving throughput.
2015-06-19 15:20:31 -07:00
Norman Maurer
f23b7b4efd [maven-release-plugin] prepare for next development iteration 2015-05-07 14:21:08 -04:00
Norman Maurer
871ce43b1f [maven-release-plugin] prepare release netty-4.1.0.Beta5 2015-05-07 14:20:38 -04:00
Louis Ryan
a3cea186ce Have Http2LocalFlowController.consumeBytes indicate whether a WINDOW_UPDATE was written 2015-05-04 13:22:18 -07:00
Scott Mitchell
f812180c2d ByteString arrayOffset method
Motivation:
The ByteString class currently assumes the underlying array will be a complete representation of data. This is limiting as it does not allow a subsection of another array to be used. The forces copy operations to take place to compensate for the lack of API support.

Modifications:
- add arrayOffset method to ByteString
- modify all ByteString and AsciiString methods that loop over or index into the underlying array to use this offset
- update all code that uses ByteString.array to ensure it accounts for the offset
- add unit tests to test the implementation respects the offset

Result:
ByteString and AsciiString can represent a sub region of a byte[].
2015-04-24 18:54:01 -07:00
nmittler
70a2608325 Optimizing user-defined stream properties.
Motivation:

Streams currently maintain a hash map of user-defined properties, which has been shown to add significant memory overhead as well as being a performance bottleneck for lookup of frequently used properties.

Modifications:

Modifying the connection/stream to use an array as the storage of user-defined properties, indexed by the class that identifies the index into the array where the property is stored.

Result:

Stream processing performance should be improved.
2015-04-23 12:41:14 -07:00
Scott Mitchell
b426fb1618 Compile error introduced in ee9233d
Motivation:
Commit ee9233d introduced a compile error in microbench.

Modifications:
Fix compile error.

Result:
Code now builds.
2015-04-22 16:23:39 -07:00
Scott Mitchell
541137cc93 HTTP/2 Flow Controller interface updates
Motivation:
Flow control is a required part of the HTTP/2 specification but it is currently structured more like an optional item. It must be accessed through the property map which is time consuming and does not represent its required nature. This access pattern does not give any insight into flow control outside of the codec (or flow controller implementation).

Modifications:
1. Create a read only public interface for LocalFlowState and RemoteFlowState.
2. Add a LocalFlowState localFlowState(); and RemoteFlowState remoteFlowState(); to Http2Stream.

Result:
Flow control is not part of the Http2Stream interface. This clarifies its responsibility and logical relationship to other interfaces. The flow controller no longer must be acquired though a map lookup.
2015-04-20 20:02:02 -07:00
Scott Mitchell
2b8104c852 HTTP/2 Priority Tree Benchmark
Motivation:
There is no benchmark to measure the priority tree implementation performance.

Modifications:
Introduce a new benchmark which will populate the priority tree, and then shuffle parent/child links around.

Result:
A simple benchmark to get a baseline for the HTTP/2 codec's priority tree implementation.
2015-04-17 10:14:13 -07:00
Louis Ryan
f3fb77f4bc Have microbenchmarks produce a deployable artifact. Fix some minor miscellaneous issues.
Motivation:
Allows for running benchmarks from built jars which is useful in development environments that only take released artifacts.

Modifications:
Move benchmarks into 'main' from 'test'
Add @State annotations to benchmarks that are missing them
Fix timing issue grabbing context during channel initialization

Result:
Users can run benchmarks more easily.
2015-04-17 10:04:26 -07:00
Jakob Buchgraber
c2de195f87 Improve performance of AsciiString.equals(Object).
Motivation:

The current implementation does byte by byte comparison, which we have seen
can be a performance bottleneck when the AsciiString is used as the key in
a Map.

Modifications:

Use sun.misc.Unsafe (on supporting platforms) to compare up to eight bytes at a time
and get closer to the performance of String.equals(Object).

Result:

Significant improvement (2x - 6x) in performance over the current implementation.

Benchmark                                             (size)   Mode   Samples        Score  Score error    Units
i.n.m.i.PlatformDependentBenchmark.arraysBytesEqual       10  thrpt        10 118843477.518 2347259.347    ops/s
i.n.m.i.PlatformDependentBenchmark.arraysBytesEqual       50  thrpt        10 43910319.773   198376.996    ops/s
i.n.m.i.PlatformDependentBenchmark.arraysBytesEqual      100  thrpt        10 26339969.001   159599.252    ops/s
i.n.m.i.PlatformDependentBenchmark.arraysBytesEqual     1000  thrpt        10  2873119.030    20779.056    ops/s
i.n.m.i.PlatformDependentBenchmark.arraysBytesEqual    10000  thrpt        10   306370.450     1933.303    ops/s
i.n.m.i.PlatformDependentBenchmark.arraysBytesEqual   100000  thrpt        10    25750.415      108.391    ops/s
i.n.m.i.PlatformDependentBenchmark.unsafeBytesEqual       10  thrpt        10 248077563.510  635320.093    ops/s
i.n.m.i.PlatformDependentBenchmark.unsafeBytesEqual       50  thrpt        10 128198943.138  614827.548    ops/s
i.n.m.i.PlatformDependentBenchmark.unsafeBytesEqual      100  thrpt        10 86195621.349  1063959.307    ops/s
i.n.m.i.PlatformDependentBenchmark.unsafeBytesEqual     1000  thrpt        10 16920264.598    61615.365    ops/s
i.n.m.i.PlatformDependentBenchmark.unsafeBytesEqual    10000  thrpt        10  1687454.747     6367.602    ops/s
i.n.m.i.PlatformDependentBenchmark.unsafeBytesEqual   100000  thrpt        10   153717.851      586.916    ops/s
2015-04-16 17:29:54 -07:00
Scott Mitchell
9a7a85dbe5 ByteString introduced as AsciiString super class
Motivation:
The usage and code within AsciiString has exceeded the original design scope for this class. Its usage as a binary string is confusing and on the verge of violating interface assumptions in some spots.

Modifications:
- ByteString will be created as a base class to AsciiString. All of the generic byte handling processing will live in ByteString and all the special character encoding will live in AsciiString.

Results:
The AsciiString interface will be clarified. Users of AsciiString can now be clear of the limitations the class imposes while users of the ByteString class don't have to live with those limitations.
2015-04-14 16:35:17 -07:00
nmittler
ab158a6ea4 Adding basic benchmarks for IntObjectHashMap
Motivation:

It needs to be fast :)

Modifications:

Added a simple benchmark to the microbench module.

Result:

Yay, benchmarks!
2015-04-13 12:24:19 -07:00
Scott Mitchell
cc7ee002dd HTTP/2 Frame Writer Microbenchmark Fix
Motivation:
The Http2FrameWriterBenchmark JMH harness class name was not updated for the JVM arguments. The number of forks is 0 which means the JHM will share a JVM with the benchmarks.  Sharing the JVM may lead to less reliable benchmarking results and as doesn't allow for the command line arguments to be applied for each benchmark.

Modifications:
- Update the JMH version from 0.9 to 1.7.1.  Benchmarks wouldn't run on old version.
- Increase the number of forks from 0 to 1.
- Remove allocation of environment from static and cleanup AfterClass to using the Setup and Teardown methods. The forked JVM would not shut down correctly otherwise (and wait for 30+ seconds before timeing out).

Result:
Benchmarks that run as intended.
2015-04-13 10:59:39 -07:00
nmittler
6fbca14f8a Cleaning up the initialization of Http2ConnectionHandler
Motivation:

It currently takes a builder for the encoder and decoder, which makes it difficult to decorate them.

Modifications:

Removed the builders from the interfaces entirely. Left the builder for the decoder impl but removed it from the encoder since it's constructor only takes 2 parameters. Also added decorator base classes for the encoder and decoder and made the CompressorHttp2ConnectionEncoder extend the decorator.

Result:

Fixes #3530
2015-03-30 11:23:02 -07:00
Scott Mitchell
2bf592c50f Backport of HTTP/2 Microbenchmark fail.
Motivation:
The backport of a6c729bdf8 failed.

Modifications:
- Make sure the interfaces are correctly implemented when backporting.

Result:
Microbenchmark compiles and runs on 4.1 branch.
2015-03-28 18:41:09 -07:00
scottmitch
2dda917f27 Http2DefaultFrameWriter microbenchmark
Motivation:
A microbenchmark will be useful to get a baseline for performance.

Modifications:
- Introduce a new microbenchmark which tests the Http2DefaultFrameWriter.
- Allow benchmarks to run without thread context switching between JMH and Netty.

Result:
Microbenchmark exists to test performance.
2015-03-27 13:10:57 -07:00
Norman Maurer
fce0989844 [maven-release-plugin] prepare for next development iteration 2015-03-03 02:06:47 -05:00
Norman Maurer
ca3b1bc4b7 [maven-release-plugin] prepare release netty-4.1.0.Beta4 2015-03-03 02:05:52 -05:00
Michael Nitschinger
1d344f488c Fix ByteBufUtilBenchmark on utf8 encodings.
Motivation
----------
The performance tests for utf8 also used the getBytes on ASCII,
which is incorrect and also provides different performance numbers.

Modifications
-------------
Use CharsetUtil.UTF_8 instead of US_ASCII for the getBytes calls.

Result
------
Accurate and semantically correct benchmarking results on utf8
comparisons.
2014-12-31 20:26:42 +09:00
Norman Maurer
fe796fc8ab Provide helper methods in ByteBufUtil to write UTF-8/ASCII CharSequences. Related to [#909]
Motivation:

We expose no methods in ByteBuf to directly write a CharSequence into it. This leads to have the user either convert the CharSequence first to a byte array or use CharsetEncoder. Both cases have some overheads and we can do a lot better for well known Charsets like UTF-8 and ASCII.

Modifications:

Add ByteBufUtil.writeAscii(...) and ByteBufUtil.writeUtf8(...) which can do the task in an optimized way. This is especially true if the passed in ByteBuf extends AbstractByteBuf which is true for all of our implementations which not wrap another ByteBuf.

Result:

Writing an ASCII and UTF-8 CharSequence into a AbstractByteBuf is a lot faster then what the user could do by himself as we can make use of some package private methods and so eliminate reference and range checks. When the Charseq is not ASCII or UTF-8 we can still do a very good job and are on par in most of the cases with what the user would do.

The following benchmark shows the improvements:

Result: 2456866.966 ?(99.9%) 59066.370 ops/s [Average]
  Statistics: (min, avg, max) = (2297025.189, 2456866.966, 2586003.225), stdev = 78851.914
  Confidence interval (99.9%): [2397800.596, 2515933.336]

Benchmark                                                        Mode   Samples        Score  Score error    Units
i.n.m.b.ByteBufUtilBenchmark.writeAscii                         thrpt        50  9398165.238   131503.098    ops/s
i.n.m.b.ByteBufUtilBenchmark.writeAsciiString                   thrpt        50  9695177.968   176684.821    ops/s
i.n.m.b.ByteBufUtilBenchmark.writeAsciiStringViaArray           thrpt        50  4788597.415    83181.549    ops/s
i.n.m.b.ByteBufUtilBenchmark.writeAsciiStringViaArrayWrapped    thrpt        50  4722297.435    98984.491    ops/s
i.n.m.b.ByteBufUtilBenchmark.writeAsciiStringWrapped            thrpt        50  4028689.762    66192.505    ops/s
i.n.m.b.ByteBufUtilBenchmark.writeAsciiViaArray                 thrpt        50  3234841.565    91308.009    ops/s
i.n.m.b.ByteBufUtilBenchmark.writeAsciiViaArrayWrapped          thrpt        50  3311387.474    39018.933    ops/s
i.n.m.b.ByteBufUtilBenchmark.writeAsciiWrapped                  thrpt        50  3379764.250    66735.415    ops/s
i.n.m.b.ByteBufUtilBenchmark.writeUtf8                          thrpt        50  5671116.821   101760.081    ops/s
i.n.m.b.ByteBufUtilBenchmark.writeUtf8String                    thrpt        50  5682733.440   111874.084    ops/s
i.n.m.b.ByteBufUtilBenchmark.writeUtf8StringViaArray            thrpt        50  3564548.995    55709.512    ops/s
i.n.m.b.ByteBufUtilBenchmark.writeUtf8StringViaArrayWrapped     thrpt        50  3621053.671    47632.820    ops/s
i.n.m.b.ByteBufUtilBenchmark.writeUtf8StringWrapped             thrpt        50  2634029.071    52304.876    ops/s
i.n.m.b.ByteBufUtilBenchmark.writeUtf8ViaArray                  thrpt        50  3397049.332    57784.119    ops/s
i.n.m.b.ByteBufUtilBenchmark.writeUtf8ViaArrayWrapped           thrpt        50  3318685.262    35869.562    ops/s
i.n.m.b.ByteBufUtilBenchmark.writeUtf8Wrapped                   thrpt        50  2473791.249    46423.114    ops/s
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1,387.417 sec - in io.netty.microbench.buffer.ByteBufUtilBenchmark

Results :

Tests run: 1, Failures: 0, Errors: 0, Skipped: 0

Results :

Tests run: 1, Failures: 0, Errors: 0, Skipped: 0

The *ViaArray* benchmarks are basically doing a toString().getBytes(Charset) which the others are using ByteBufUtil.write*(...).
2014-12-26 15:58:18 +09:00
Idel Pivnitskiy
cff98fff51 Benchmark for HttpRequestDecoder 2014-11-12 14:29:15 +01:00
Scott Mitchell
7e65c09373 IPv6 address to string rfc5952
Motivation:
The java implementations for Inet6Address.getHostName() do not follow the RFC 5952 (http://tools.ietf.org/html/rfc5952#section-4) for recommended string representation. This introduces inconsistencies when integrating with other technologies that do follow the RFC.

Modifications:
-NetUtil.java to have another public static method to convert InetAddress to string. Inet4Address will use the java InetAddress.getHostAddress() implementation and there will be new code to implement the RFC 5952 IPV6 string conversion.
-New unit tests to test the new method

Result:
Netty provides a RFC 5952 compliant string conversion method for IPV6 addresses
2014-10-30 00:05:57 -04:00
Trustin Lee
b5f61d0de5 [maven-release-plugin] prepare for next development iteration 2014-08-16 03:27:42 +09:00
Trustin Lee
76ac3b21a5 [maven-release-plugin] prepare release netty-4.1.0.Beta3 2014-08-16 03:27:37 +09:00
Trustin Lee
b3c1904cc9 [maven-release-plugin] prepare for next development iteration 2014-08-15 09:31:03 +09:00
Trustin Lee
e013b2400f [maven-release-plugin] prepare release netty-4.1.0.Beta2 2014-08-15 09:30:59 +09:00
Trustin Lee
e167b02d52 [maven-release-plugin] prepare for next development iteration 2014-07-04 17:26:02 +09:00
Trustin Lee
ba50cb829b [maven-release-plugin] prepare release netty-4.1.0.Beta1 2014-07-04 17:25:54 +09:00
Trustin Lee
787663a644 [maven-release-plugin] rollback the release of netty-4.1.0.Beta1 2014-07-04 17:11:14 +09:00
Trustin Lee
83eae705e1 [maven-release-plugin] prepare release netty-4.1.0.Beta1 2014-07-04 17:02:17 +09:00
Trustin Lee
f67ac5e46d Fix the inconsistencies between performance tests in ByteBufAllocatorBenchmark
Motivation:

default*() tests are performing a test in a different way, and they must be same with other tests.

Modification:

Make sure default*() tests are same with the others

Result:

Easier to compare default and non-default allocators
2014-06-21 13:28:02 +09:00
Trustin Lee
085a61a310 Refactor FastThreadLocal to simplify TLV management
Motivation:

When Netty runs in a managed environment such as web application server,
Netty needs to provide an explicit way to remove the thread-local
variables it created to prevent class loader leaks.

FastThreadLocal uses different execution paths for storing a
thread-local variable depending on the type of the current thread.
It increases the complexity of thread-local removal.

Modifications:

- Moved FastThreadLocal and FastThreadLocalThread out of the internal
  package so that a user can use it.
- FastThreadLocal now keeps track of all thread local variables it has
  initialized, and calling FastThreadLocal.removeAll() will remove all
  thread-local variables of the caller thread.
- Added FastThreadLocal.size() for diagnostics and tests
- Introduce InternalThreadLocalMap which is a mixture of hard-wired
  thread local variable fields and extensible indexed variables
- FastThreadLocal now uses InternalThreadLocalMap to implement a
  thread-local variable.
- Added ThreadDeathWatcher.unwatch() so that PooledByteBufAllocator
  tells it to stop watching when its thread-local cache has been freed
  by FastThreadLocal.removeAll().
- Added FastThreadLocalTest to ensure that removeAll() works
- Added microbenchmark for FastThreadLocal and JDK ThreadLocal
- Upgraded to JMH 0.9

Result:

- A user can remove all thread-local variables Netty created, as long as
  he or she did not exit from the current thread. (Note that there's no
  way to remove a thread-local variable from outside of the thread.)
- FastThreadLocal exposes more useful operations such as isSet() because
  we always implement a thread local variable via InternalThreadLocalMap
  instead of falling back to JDK ThreadLocal.
- FastThreadLocalBenchmark shows that this change improves the
  performance of FastThreadLocal even more.
2014-06-19 21:13:55 +09:00
belliottsmith
2a2a21ec59 Introduce FastThreadLocal which uses an EnumMap and a predefined fixed set of possible thread locals
Motivation:
Provide a faster ThreadLocal implementation

Modification:
Add a "FastThreadLocal" which uses an EnumMap and a predefined fixed set of possible thread locals (all of the static instances created by netty) that is around 10-20% faster than standard ThreadLocal in my benchmarks (and can be seen having an effect in the direct PooledByteBufAllocator benchmark that uses the DEFAULT ByteBufAllocator which uses this FastThreadLocal, as opposed to normal instantiations that do not, and in the new RecyclableArrayList benchmark);

Result:
Improved performance
2014-06-13 10:56:18 +02:00
Norman Maurer
61dbc353ca [#2436] Unsafe*ByteBuf implementation should only invert bytes if ByteOrder differ from native ByteOrder
Motivation:
Our Unsafe*ByteBuf implementation always invert bytes when the native ByteOrder is LITTLE_ENDIAN (this is true on intel), even when the user calls order(ByteOrder.LITTLE_ENDIAN). This is not optimal for performance reasons as the user should be able to set the ByteOrder to LITTLE_ENDIAN and so write bytes without the extra inverting.

Modification:
- Introduce a new special SwappedByteBuf (called UnsafeDirectSwappedByteBuf) that is used by all the Unsafe*ByteBuf implementation and allows to write without inverting the bytes.
- Add benchmark
- Upgrade jmh to 0.8

Result:
The user is be able to get the max performance even on servers that have ByteOrder.LITTLE_ENDIAN as their native ByteOrder.
2014-06-05 10:59:22 +02:00
Trustin Lee
0cc264b76b More realistic ByteBuf allocation benchmark
Motivation:

Allocating a single buffer and releasing it repetitively for a benchmark will not involve the realistic execution path of the allocators.

Modifications:

Keep the last 8192 allocations and release them randomly.

Result:

We are now getting the result close to what we got with caliper.
2014-05-29 19:51:05 +09:00
Michael Nitschinger
7d62594cc6 Upgrade JMH to 0.4.1 and make use of @Params. 2014-02-23 16:39:39 +01:00
Michael Nitschinger
33197c7696 Update JMH to 0.3.2 2014-02-14 13:16:13 -08:00
Trustin Lee
bb145c0057 Fix wiki link 2014-02-14 12:04:12 -08:00
Trustin Lee
ac70dc4546 Update the version to 4.1.0.Alpha1-SNAPSHOT 2014-02-13 18:32:26 -08:00
Norman Maurer
d67184b488 [maven-release-plugin] prepare for next development iteration 2014-01-21 08:18:32 +01:00
Norman Maurer
287515210d [maven-release-plugin] prepare release netty-4.0.15.Final 2014-01-21 08:18:26 +01:00
Michael Nitschinger
ac332dfe02 Using SystemPropertyUtil for prperty parsing. 2014-01-15 18:53:28 +01:00
Michael Nitschinger
99f9c6dbc3 Make JMH options modifiable through the subclassed benchmark. 2014-01-15 18:53:22 +01:00
Trustin Lee
462817cd96 Add a link to the wiki page about the microbench module 2014-01-15 16:00:53 +09:00
Michael Nitschinger
03b0099b63 microbench: move from Caliper to JMH 2014-01-14 14:56:20 +09:00
Trustin Lee
e83d2e0b4e [maven-release-plugin] prepare for next development iteration 2013-12-22 21:57:48 +09:00
Trustin Lee
cdb700c7a4 [maven-release-plugin] prepare release netty-4.0.14.Final 2013-12-22 21:57:40 +09:00