342 Commits

Author SHA1 Message Date
Norman Maurer
b4d4c0034d Optimize HPACK usage to align more with Netty types and remove heavy object creations. Related to [#3597]
Motivations:

The HPACK code was not really optimized and written with Netty types in mind. Because of this a lot of garbage was created due heavy object creation.

This was first reported in [#3597] and https://github.com/grpc/grpc-java/issues/1872 .

Modifications:

- Directly use ByteBuf as input and output
- Make use of ByteProcessor where possible
- Use AsciiString as this is the only thing we need for our http2 usage

Result:

Less garbage and better usage of Netty apis.
2016-06-22 14:26:05 +02:00
Norman Maurer
4dec7f11b7 [maven-release-plugin] prepare for next development iteration 2016-06-07 18:52:34 +02:00
Norman Maurer
cf670fab75 [maven-release-plugin] prepare release netty-4.1.1.Final 2016-06-07 18:52:22 +02:00
Norman Maurer
6ca49d1336 [maven-release-plugin] prepare for next development iteration 2016-05-25 19:16:44 +02:00
Norman Maurer
446b38db52 [maven-release-plugin] prepare release netty-4.1.0.Final 2016-05-25 19:14:15 +02:00
Norman Maurer
e59d0e9efb Introduce CodecOutputList to reduce overhead of encoder/decoder
Motivation:

99dfc9ea799348430a1c25776ce30a95bc10a1ff introduced some code that will more frequently try to forward messages out of the list of decoded messages to reduce latency and memory footprint. Unfortunally this has the side-effect that RecycleableArrayList.clear() will be called more often and so introduce some overhead as ArrayList will null out the array on each call.

Modifications:

- Introduce a CodecOutputList which allows to not null out the array until we recycle it and also allows to access internal array with extra range checks.
- Add benchmark that add elements to different List implementations and clear them

Result:

Less overhead when decode / encode messages.

Benchmark                                     (elements)   Mode  Cnt         Score        Error  Units
CodecOutputListBenchmark.arrayList                     1  thrpt   20  24853764.609 ± 161582.376  ops/s
CodecOutputListBenchmark.arrayList                     4  thrpt   20  17310636.508 ± 930517.403  ops/s
CodecOutputListBenchmark.codecOutList                  1  thrpt   20  26670751.661 ± 587812.655  ops/s
CodecOutputListBenchmark.codecOutList                  4  thrpt   20  25166421.089 ± 166945.599  ops/s
CodecOutputListBenchmark.recyclableArrayList           1  thrpt   20  24565992.626 ± 210017.290  ops/s
CodecOutputListBenchmark.recyclableArrayList           4  thrpt   20  18477881.775 ± 157003.777  ops/s

Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 246.748 sec - in io.netty.handler.codec.CodecOutputListBenchmark
2016-05-20 09:12:07 +02:00
Norman Maurer
68cd670eb9 Remove ChannelHandlerInvoker
Motivation:

We tried to provide the ability for the user to change the semantics of the threading-model by delegate the invoking of the ChannelHandler to the ChannelHandlerInvoker. Unfortunually this not really worked out quite well and resulted in just more complexity and splitting of code that belongs together. We should remove the ChannelHandlerInvoker again and just do the same as in 4.0

Modifications:

Remove ChannelHandlerInvoker again and replace its usage in Http2MultiplexCodec

Result:

Easier code and less bad abstractions.
2016-05-17 11:14:00 +02:00
Scott Mitchell
c5faa142fb KObjectHashMap probeNext improvement
Motivation:
KObjectHashMap.probeNext(..) usually involves 2 conditional statements and 2 aritmatic operations. This can be improved to have 0 conditional statements.

Modifications:
- Use bit masking to avoid conditional statements

Result:
Improved performance for KObjecthashMap.probeNext(..)
2016-05-13 10:04:26 -07:00
Scott Mitchell
d580245afc DefaultHttp2Connection.close Reentrant Modification
Motivation:
The DefaultHttp2Conneciton.close method accounts for active streams being iterated and attempts to avoid reentrant modifications of the underlying stream map by using iterators to remove from the stream map. However there are a few issues:

- While iterating over the stream map we don't prevent iterations over the active stream collection
- Removing a single stream may actually remove > 1 streams due to closed non-leaf streams being preserved in the priority tree which may result in NPE

Preserving closed non-leaf streams in the priority tree is no longer necessary with our current allocation algorithms, and so this feature (and related complexity) can be removed.

Modifications:
- DefaultHttp2Connection.close should prevent others from iterating over the active streams and reentrant modification scenarios which may result from this
- DefaultHttp2Connection should not keep closed stream in the priority tree
  - Remove all associated code in DefaultHttp2RemoteFlowController which accounts for this case including the ReducedState object
  - This includes fixing writability changes which depended on ReducedState
- Update unit tests

Result:
Fixes https://github.com/netty/netty/issues/5198
2016-05-09 14:16:30 -07:00
Norman Maurer
9229ed98e2 [#5088] Add annotation which marks packages/interfaces/classes as unstable
Motivation:

Some codecs should be considered unstable as these are relative new. For this purpose we should introduce an annotation which these codecs should us to be marked as unstable in terms of API.

Modifications:

- Add UnstableApi annotation and use it on codecs that are not stable
- Move http2.hpack to http2.internal.hpack as it is internal.

Result:

Better document unstable APIs.
2016-05-09 15:16:35 +02:00
Norman Maurer
03638869b9 Update dependencies
Motivation:

Before release 4.1.0.Final we should update all our dependencies.

Modifications:

Update dependencies.

Result:

Up-to-date dependencies used.
2016-04-14 11:05:53 +02:00
Norman Maurer
572bdfb494 [maven-release-plugin] prepare for next development iteration 2016-04-10 08:37:18 +02:00
Norman Maurer
c6121a6f49 [maven-release-plugin] prepare release netty-4.1.0.CR7 2016-04-10 08:36:56 +02:00
Norman Maurer
6e919f70f8 [maven-release-plugin] rollback the release of netty-4.1.0.CR7 2016-04-09 22:13:44 +02:00
Norman Maurer
4cdd51509a [maven-release-plugin] prepare release netty-4.1.0.CR7 2016-04-09 22:05:34 +02:00
Trustin Lee
3b941c2a7c [maven-release-plugin] prepare for next development iteration 2016-04-02 01:25:05 -04:00
Trustin Lee
7368ccc539 [maven-release-plugin] prepare release netty-4.1.0.CR6 2016-04-02 01:24:55 -04:00
Norman Maurer
cee38ed2b6 [maven-release-plugin] prepare for next development iteration 2016-03-29 16:45:13 +02:00
Norman Maurer
9cd9e7daeb [maven-release-plugin] prepare release netty-4.1.0.CR5 2016-03-29 16:44:33 +02:00
Xiaoyan Lin
3ad55eb839 Speed up the slow path of FastThreadLocal
Motivation:

The current slow path of FastThreadLocal is much slower than JDK ThreadLocal. See #4418

Modifications:

- Add FastThreadLocalSlowPathBenchmark for the flow path of FastThreadLocal
- Add final to speed up the slow path of FastThreadLocal

Result:

The slow path of FastThreadLocal is improved.
2016-03-23 11:36:16 +01:00
Norman Maurer
28d03adbfe [maven-release-plugin] prepare for next development iteration 2016-03-21 11:51:50 +01:00
Norman Maurer
4653dc1d05 [maven-release-plugin] prepare release netty-4.1.0.CR4 2016-03-21 11:51:12 +01:00
Xiaoyan Lin
4a5e484c5a Add xml-maven-plugin to check indentation and fix violations
Motivation:

See https://github.com/netty/netty-build/issues/5

Modifications:

Add xml-maven-plugin to check indentation and fix violations

Result:
pom.xml will be checked in the PR build
2016-02-29 09:46:32 +01:00
Norman Maurer
ca443e42e0 [maven-release-plugin] prepare for next development iteration 2016-02-19 23:00:11 +01:00
Norman Maurer
f39eb9a6b2 [maven-release-plugin] prepare release netty-4.1.0.CR3 2016-02-19 22:59:52 +01:00
Norman Maurer
75a2ddd61c [maven-release-plugin] prepare for next development iteration 2016-02-04 16:51:44 +01:00
Norman Maurer
7eb3a60dba [maven-release-plugin] prepare release netty-4.1.0.CR2 2016-02-04 16:37:06 +01:00
Scott Mitchell
f990f9983d HTTP/2 Don't Flow Control Iniital Headers
Motivation:
Currently the initial headers for every stream is queued in the flow controller. Since the initial header frame may create streams the peer must receive these frames in the order in which they were created, or else this will be a protocol error and the connection will be closed. Tolerating the initial headers being queued would increase the complexity of the WeightedFairQueueByteDistributor and there is benefit of doing so is not clear.

Modifications:
- The initial headers will no longer be queued in the flow controllers

Result:
Fixes https://github.com/netty/netty/issues/4758
2016-02-01 13:37:43 -08:00
Norman Maurer
1c417e5f82 [maven-release-plugin] prepare for next development iteration 2016-01-21 15:35:55 +01:00
Norman Maurer
c681a40a78 [maven-release-plugin] prepare release netty-4.1.0.CR1 2016-01-21 15:28:21 +01:00
Eric Anderson
6dbb610f5b Add ChannelHandlerContext.invoker()
Motivation:

Being able to access the invoker() is useful when adding additional
handlers that should be running in the same thread. Since an application
may be using a threading model unsupported by the default invoker, they
can specify their own. Because of that, in a handler that auto-adds
other handlers:

// This is a good pattern
ctx.pipeline().addBefore(ctx.invoker(), ctx.name(), null, newHandler);
// This will generally work, but prevents using custom invoker.
ctx.pipeline().addBefore(ctx.executor(), ctx.name(), null, newHandler);

That's why I believe in commit 110745b0, for the now-defunct 5.0 branch,
when ChannelHandlerAppender was added the invoker() method was also
necessary.

There is a side-benefit to exposing the invoker: in certain advanced
use-cases using the invoker for a particular handler is useful. Using
the invoker you are able to invoke a _particular_ handler, from possibly
a different thread yet still using standard exception processing.

ChannelHandlerContext does part of that, but is unwieldy when trying to
invoke a particular handler because it invokes the prev or next handler,
not the one the context is for. A workaround is to use the next or prev
context (respectively), but this breaks when the pipeline changes.

This came up during writing the Http2MultiplexCodec which uses a
separate child channel for each http/2 stream and wants to send messages
from the child channel directly to the Http2MultiplexCodec handler that
created it.

Modifications:

Add the invoker() method to ChannelHandlerContext. It was already being
implemented by AbstractChannelHandlerContext. The two other
implementations of ChannelHandlerContext needed minor tweaks.

Result:

Access to the invoker used for a particular handler, for either reusing
for other handlers or for advanced use-cases. Fixes #4738
2016-01-22 14:04:35 +01:00
Xiaoyan Lin
475d901131 Fix errors reported by javadoc
Motivation:

Javadoc reports errors about invalid docs.

Modifications:

Fix some errors reported by javadoc.

Result:

A lot of javadoc errors are fixed by this patch.
2015-12-27 08:36:45 +01:00
zhangduo
f22ad97cf3 Remove PriorityStreamByteDistributor from http2 microbench
Motivation:
PriorityStreamByteDistributor has been removed but NoPriorityByteDistributionBenchmark in microbench still need it and causes compile error

Modifications:
Remove PriorityStreamByteDistributor from NoPriorityByteDistributionBenchmark

Result:
The compile error has been fixed
2015-12-22 09:20:32 +01:00
nmittler
ee56a4a5c6 Fixing broken HTTP/2 benchmark
Motivation:

The `NoPriorityByteDistibbutionBenchmark` was broken with a recent commit.

Modifications:

Fixed the benchmark to use the new HTTP2 handler builder.

Result:

It builds.
2015-12-18 09:52:12 -08:00
Scott Mitchell
904e70a4d4 HTTP/2 Weighted Fair Queue Byte Distributor
Motivation:
PriorityStreamByteDistributor uses a homegrown algorithm which distributes bytes to nodes in the priority tree. PriorityStreamByteDistributor has no concept of goodput which may result in poor utilization of network resources. PriorityStreamByteDistributor also has performance issues related to the tree traversal approach and number of nodes that must be visited. There also exists some more proven algorithms from the resource scheduling domain which PriorityStreamByteDistributor does not employ.

Modifications:
- Introduce a new ByteDistributor which uses elements from weighted fair queue schedulers

Result:
StreamByteDistributor which is sensitive to priority and uses a more familiar distribution concept.
Fixes https://github.com/netty/netty/issues/4462
2015-12-17 11:17:02 -08:00
Trustin Lee
2202e8f967 Revamp the Http2ConnectionHandler builder API
Related: #4572

Motivation:

- A user might want to extend Http2ConnectionHandler and define his/her
  own static inner Builder class that extends
  Http2ConnectionHandler.BuilderBase. This introduces potential
  confusion because there's already Http2ConnectionHandler.Builder. Your
  IDE will warn about this name duplication as well.
- BuilderBase exposes all setters with public modifier. A user's Builder
  might not want to expose them to enforce it to certain configuration.
  There's no way to hide them because it's public already and they are
  final.
- BuilderBase.build(Http2ConnectionDecoder, Http2ConnectionEncoder)
  ignores most properties exposed by BuilderBase, such as
  validateHeaders, frameLogger and encoderEnforceMaxConcurrentStreams.
  If any build() method ignores the properties exposed by the builder,
  there's something wrong.
- A user's Builder that extends BuilderBase might want to require more
  parameters in build(). There's no way to do that cleanly because
  build() is public and final already.

Modifications:

- Make BuilderBase and Builder top-level so that there's no duplicate
  name issue anymore.
  - Add AbstractHttp2ConnectionHandlerBuilder
  - Add Http2ConnectionHandlerBuilder
  - Add HttpToHttp2ConnectionHandlerBuilder
- Make all builder methods in AbstractHttp2ConnectionHandlerBuilder
  protected so that a subclass can choose which methods to expose
- Provide only a single build() method
  - Add connection() and codec() so that a user can still specify
    Http2Connection or Http2Connection(En|De)coder explicitly
  - Implement proper state validation mechanism so that it is prevented
    to invoke conflicting setters

Result:

Less confusing yet flexible builder API
2015-12-17 14:08:13 +09:00
Scott Mitchell
641505a5d2 DefaultChannelConfig maxMessagesPerRead default not always set
Motivation:
ChannelMetadata has a field minMaxMessagesPerRead which can be confusing. There are also some cases where static instances are used and the default value for channel type is not being applied.

Modifications:
- use a default value which is set unconditionally to simplify
- make sure static instances of MaxMessagesRecvByteBufAllocator are not used if the intention is that the default maxMessagesPerRead should be derived from the channel type.

Result:
Less confusing interfaces in ChannelMetadata and ChannelConfig. Default maxMessagesPerRead is correctly applied.
2015-11-25 15:14:07 -08:00
Scott Mitchell
49cd00da1c Backport of benchmark broke build
Motivation:
2a2059d97683206a23488e8c96256f2f6fb1bed0 was backported from master, and included an overriden method which does not exist in 4.1.

Modifications:
- Remove the invoker method from NoPriorityByteDistributionBenchmark

Result:
No more build error
2015-11-20 15:03:50 -08:00
nmittler
2a2059d976 Adding UniformStreamByteDistributor
Motivation:

The current priority algorithm can yield poor per-stream goodput when either the number of streams is high or the connection window is small. When all priorities are the same (i.e. priority is disabled), we should be able to do better.

Modifications:

Added a new UniformStreamByteDistributor that ignores priority entirely and manages a queue of streams.  Each stream is allocated a minimum of 1KiB on each iteration.

Result:

Improved goodput when priority is not used.
2015-11-19 16:49:12 -08:00
nmittler
8accc52b03 Forking Twitter's hpack
Motivation:

The twitter hpack project does not have the support that it used to have.  See discussion here: https://github.com/netty/netty/issues/4403.

Modifications:

Created a new module in Netty and copied the latest from twitter hpack master.

Result:

Netty no longer depends on twitter hpack.
2015-11-14 10:13:32 -08:00
Norman Maurer
2ecce8fa56 [maven-release-plugin] prepare for next development iteration 2015-11-10 22:59:33 +01:00
Norman Maurer
6a93f331d3 [maven-release-plugin] prepare release netty-4.1.0.Beta8 2015-11-10 22:50:57 +01:00
Scott Mitchell
b4b791353d AsciiString optimized hashCode
Motivation:
The AsciiString.hashCode() method can be optimized. This method is frequently used while to build the DefaultHeaders data structure.

Modification:
- Add a PlatformDependent hashCode algorithm which utilizes UNSAFE if available

Result:
AsciiString hashCode is faster.
2015-11-10 10:28:31 -08:00
Louis Ryan
6e108cb96a Improve the performance of copying header sets when hashing and name validation are equivalent.
Motivation:
Headers and groups of headers are frequently copied and the current mechanism is slower than it needs to be.

Modifications:
Skip name validation and hash computation when they are not necessary.
Fix emergent bug in CombinedHttpHeaders identified with better testing
Fix memory leak in DefaultHttp2Headers when clearing
Added benchmarks

Result:
Faster header copying and some collateral bug fixes
2015-11-07 08:53:10 -08:00
nmittler
6504d52b94 Add HTTP/2 local flow control option for auto refill
Motivation:

For many HTTP/2 applications (such as gRPC) it is necessary to autorefill the connection window in order to prevent application-level deadlocking.

Consider an application with 2 streams, A and B.  A receives a stream of messages and the application pops off one message at a time and makes a request on stream B. However, if receiving of data on A has caused the connection window to collapse, B will not be able to receive any data and the application will deadlock.  The only way (currently) to get around this is 1) use multiple connections, or 2) manually refill the connection window.  Both are undesirable and could needlessly complicate the application code.

Modifications:

Add a configuration option to DefaultHttp2LocalFlowController, allowing it to autorefill the connection window.

Result:

Applications can configure HTTP/2 to avoid inter-stream deadlocking.
2015-11-05 15:47:10 -08:00
Norman Maurer
1b2e43e70c Correctly construct Executor in microbenchmarks.
Motivation:

We should allow our custom Executor to shutdown quickly.

Modifications:

Call super constructor which correct arguments.

Result:

Custom Executor can be shutdown quickly.
2015-11-03 09:46:05 +01:00
Scott Mitchell
19658e9cd8 HTTP/2 Headers Type Updates
Motivation:
The HTTP/2 RFC (https://tools.ietf.org/html/rfc7540#section-8.1.2) indicates that header names consist of ASCII characters. We currently use ByteString to represent HTTP/2 header names. The HTTP/2 RFC (https://tools.ietf.org/html/rfc7540#section-10.3) also eludes to header values inheriting the same validity characteristics as HTTP/1.x. Using AsciiString for the value type of HTTP/2 headers would allow for re-use of predefined HTTP/1.x values, and make comparisons more intuitive. The Headers<T> interface could also be expanded to allow for easier use of header types which do not have the same Key and Value type.

Motivation:
- Change Headers<T> to Headers<K, V>
- Change Http2Headers<ByteString> to Http2Headers<CharSequence, CharSequence>
- Remove ByteString. Having AsciiString extend ByteString complicates equality comparisons when the hash code algorithm is no longer shared.

Result:
Http2Header types are more representative of the HTTP/2 RFC, and relationship between HTTP/2 header name/values more directly relates to HTTP/1.x header names/values.
2015-10-30 15:29:44 -07:00
buchgr
c9364616c8 Fix performance regression in FastThreadLocal microbenchmark. Fixes #4402
Motivation:

As reported in #4402, the FastThreadLocalBenchmark shows that the JDK ThreadLocal
is actually faster than Netty's custom thread local implementation.

I was looking forward to doing some deep digging, but got disappointed :(.

Modifications:

The microbenchmark was not using FastThreadLocalThreads and would thus always hit the slow path.
I updated the JMH command line flags, so that FastThreadLocalThreads would be used.

Result:

FastThreadLocalBenchmark shows FastThreadLocal to be faster than JDK's ThreadLocal implementation,
by about 56% in this particular benchmark. Run on OSX El Capitan with OpenJDK 1.8u60.

Benchmark                                    Mode  Cnt      Score      Error  Units
FastThreadLocalBenchmark.fastThreadLocal    thrpt   20  55452.027 ±  725.713  ops/s
FastThreadLocalBenchmark.jdkThreadLocalGet  thrpt   20  35481.888 ± 1471.647  ops/s
2015-10-29 21:40:13 +01:00
Norman Maurer
2e36ac4594 Add benchmark for HeapByteBuf implementations.
Motivation:

To prove one implementation is faster as the other we should have a benchmark.

Modifications:

Add benchmark which benchmarks the unsafe and non-unsafe implementation of HeapByteBuf.

Result:

Able to compare speed of implementations easily.
2015-10-29 19:38:52 +01:00
Norman Maurer
a47685b243 Use bitwise operation when sampling for resource leak detection.
Motivation:

Modulo operations are slow, we can use bitwise operation to detect if resource leak detection must be done while sampling.

Modifications:

- Ensure the interval is a power of two
- Use bitwise operation for sampling
- Add benchmark.

Result:

Faster sampling.
2015-10-29 19:18:44 +01:00