156 Commits

Author SHA1 Message Date
Norman Maurer
0ea4597542 Introduce CodecOutputList to reduce overhead of encoder/decoder
Motivation:

99dfc9ea799348430a1c25776ce30a95bc10a1ff introduced some code that will more frequently try to forward messages out of the list of decoded messages to reduce latency and memory footprint. Unfortunally this has the side-effect that RecycleableArrayList.clear() will be called more often and so introduce some overhead as ArrayList will null out the array on each call.

Modifications:

- Introduce a CodecOutputList which allows to not null out the array until we recycle it and also allows to access internal array with extra range checks.
- Add benchmark that add elements to different List implementations and clear them

Result:

Less overhead when decode / encode messages.

Benchmark                                     (elements)   Mode  Cnt         Score        Error  Units
CodecOutputListBenchmark.arrayList                     1  thrpt   20  24853764.609 ± 161582.376  ops/s
CodecOutputListBenchmark.arrayList                     4  thrpt   20  17310636.508 ± 930517.403  ops/s
CodecOutputListBenchmark.codecOutList                  1  thrpt   20  26670751.661 ± 587812.655  ops/s
CodecOutputListBenchmark.codecOutList                  4  thrpt   20  25166421.089 ± 166945.599  ops/s
CodecOutputListBenchmark.recyclableArrayList           1  thrpt   20  24565992.626 ± 210017.290  ops/s
CodecOutputListBenchmark.recyclableArrayList           4  thrpt   20  18477881.775 ± 157003.777  ops/s

Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 246.748 sec - in io.netty.handler.codec.CodecOutputListBenchmark
2016-05-20 09:58:12 +02:00
Norman Maurer
4b6b167839 [maven-release-plugin] prepare for next development iteration 2016-04-04 16:53:40 +02:00
Norman Maurer
e8fa848f43 [maven-release-plugin] prepare release netty-4.0.36.Final 2016-04-04 16:52:53 +02:00
Norman Maurer
47baeb5ded [maven-release-plugin] rollback the release of netty-4.0.36.Final 2016-03-29 22:58:32 +02:00
Norman Maurer
29a4f3e363 [maven-release-plugin] prepare release netty-4.0.36.Final 2016-03-29 21:30:50 +02:00
Xiaoyan Lin
f5b4937543 Speed up the slow path of FastThreadLocal
Motivation:

The current slow path of FastThreadLocal is much slower than JDK ThreadLocal. See #4418

Modifications:

- Add FastThreadLocalSlowPathBenchmark for the flow path of FastThreadLocal
- Add final to speed up the slow path of FastThreadLocal

Result:

The slow path of FastThreadLocal is improved.
2016-03-23 11:44:30 +01:00
Norman Maurer
64dc03a25d [maven-release-plugin] prepare for next development iteration 2016-03-21 10:34:26 +01:00
Norman Maurer
e444e8d7a6 [maven-release-plugin] prepare release netty-4.0.35.Final 2016-03-21 10:32:08 +01:00
Xiaoyan Lin
57cebc7f5b Add xml-maven-plugin to check indentation and fix violations
Motivation:

See https://github.com/netty/netty-build/issues/5

Modifications:

Add xml-maven-plugin to check indentation and fix violations

Result:
pom.xml will be checked in the PR build
2016-02-29 10:04:53 +01:00
Norman Maurer
b647513b6b [maven-release-plugin] prepare for next development iteration 2016-01-29 09:57:10 +01:00
Norman Maurer
cf1777b619 [maven-release-plugin] prepare release netty-4.0.34.Final 2016-01-29 09:23:46 +01:00
Norman Maurer
450939842e Fix version 2015-11-24 21:24:22 +01:00
Norman Maurer
69b5aefd09 [maven-release-plugin] prepare release netty-4.0.33.Final 2015-11-03 14:18:17 +01:00
Norman Maurer
efeec7e390 Correctly construct Executor in microbenchmarks.
Motivation:

We should allow our custom Executor to shutdown quickly.

Modifications:

Call super constructor which correct arguments.

Result:

Custom Executor can be shutdown quickly.
2015-11-03 09:42:42 +01:00
buchgr
e12613a018 Fix performance regression in FastThreadLocal microbenchmark. Fixes #4402
Motivation:

As reported in #4402, the FastThreadLocalBenchmark shows that the JDK ThreadLocal
is actually faster than Netty's custom thread local implementation.

I was looking forward to doing some deep digging, but got disappointed :(.

Modifications:

The microbenchmark was not using FastThreadLocalThreads and would thus always hit the slow path.
I updated the JMH command line flags, so that FastThreadLocalThreads would be used.

Result:

FastThreadLocalBenchmark shows FastThreadLocal to be faster than JDK's ThreadLocal implementation,
by about 56% in this particular benchmark. Run on OSX El Capitan with OpenJDK 1.8u60.

Benchmark                                    Mode  Cnt      Score      Error  Units
FastThreadLocalBenchmark.fastThreadLocal    thrpt   20  55452.027 ±  725.713  ops/s
FastThreadLocalBenchmark.jdkThreadLocalGet  thrpt   20  35481.888 ± 1471.647  ops/s
2015-10-29 21:46:04 +01:00
Norman Maurer
0c8fe18d3c Add benchmark for HeapByteBuf implementations.
Motivation:

To prove one implementation is faster as the other we should have a benchmark.

Modifications:

Add benchmark which benchmarks the unsafe and non-unsafe implementation of HeapByteBuf.

Result:

Able to compare speed of implementations easily.
2015-10-29 19:38:33 +01:00
Norman Maurer
577931e8bc Use bitwise operation when sampling for resource leak detection.
Motivation:

Modulo operations are slow, we can use bitwise operation to detect if resource leak detection must be done while sampling.

Modifications:

- Ensure the interval is a power of two
- Use bitwise operation for sampling
- Add benchmark.

Result:

Faster sampling.
2015-10-29 19:18:06 +01:00
Norman Maurer
291674262c Added SlicedAbstractByteBuf that can provide fast-path for _get* and _set* methods
Motivation:

SlicedByteBuf can be used for any ByteBuf implementations and so can not do any optimizations that could be done
when AbstractByteBuf is sliced.

Modifications:

- Add SlicedAbstractByteBuf that can eliminate range and reference count checks for _get* and _set* methods.

Result:

Faster SlicedByteBuf implementations for AbstractByteBuf sub-classes.
2015-10-16 08:59:58 +02:00
Norman Maurer
054af70fed Minimize object allocation when calling AbstractByteBuf.toString(..., Charset)
Motivation:

Calling AbstractByteBuf.toString(..., Charset) is used quite frequently by users but produce a lot of GC.

Modification:

- Use a FastThreadLocal to store the CharBuffer that are needed for decoding.
- Use internalNioBuffer(...) when possible

Result:

Less object creation / Less GC
2015-10-15 17:49:21 +02:00
Norman Maurer
1103379e02 Allow to disable reference count checks on every access of the ByteBuf
Motiviation:

Checking reference count on every access on a ByteBuf can have some big performance overhead depending on how the access pattern is. If the user is sure that there are no reference count errors on his side it should be possible to disable the check and so gain the max performance.

Modification:

- Add io.netty.buffer.bytebuf.checkAccessible system property which allows to disable the checks. Enabled by default.
- Add microbenchmark

Result:

Increased performance for operations on the ByteBuf.
2015-10-15 10:19:49 +02:00
Norman Maurer
696a287736 [maven-release-plugin] prepare for next development iteration 2015-09-30 09:31:26 +02:00
Norman Maurer
fb2d562306 [maven-release-plugin] prepare release netty-4.0.32.Final 2015-09-30 09:28:40 +02:00
Norman Maurer
bd928eaa38 [maven-release-plugin] prepare for next development iteration 2015-09-02 08:58:54 +02:00
Norman Maurer
26bbcc38c2 [maven-release-plugin] prepare release netty-4.0.31.Final 2015-09-02 08:57:57 +02:00
Scott Mitchell
3056b80602 Microbench backport issue
Motivation:
The microbench code in 4.0 lives in src/test while in 4.1 and master it lives in src/main. A backport of a patch did not account for this.

Modifications:
- Move the benchmark to the src/test directory
- Update new benchmark package info

Result:
4.0 branch can now build again.
2015-07-30 10:33:10 -07:00
Scott Mitchell
1fcc72aa90 HttpObjectDecoder performance improvements
Motivation:
The HttpObjectDecoder is on the hot code path for the http codec. There are a few hot methods which can be modified to improve performance.

Modifications:
- Modify AppendableCharSequence to provide unsafe methods which don't need to re-check bounds for every call.
- Update HttpObjectDecoder methods to take advantage of new AppendableCharSequence methods.

Result:
Peformance boost for decoding http objects.
2015-07-29 23:26:49 -07:00
Norman Maurer
148692705c [maven-release-plugin] prepare for next development iteration 2015-07-24 10:11:44 +02:00
Norman Maurer
11cc2d5197 [maven-release-plugin] prepare release netty-4.0.30.Final 2015-07-24 09:54:20 +02:00
Norman Maurer
1da998bc7c [maven-release-plugin] prepare for next development iteration 2015-06-23 11:08:27 +02:00
Norman Maurer
4c482c1215 [maven-release-plugin] prepare release netty-4.0.29.Final 2015-06-23 11:07:56 +02:00
Norman Maurer
d1c46ca987 [maven-release-plugin] prepare for next development iteration 2015-05-07 11:33:47 -04:00
Norman Maurer
005d4a42fc [maven-release-plugin] prepare release netty-4.0.28.Final 2015-05-07 11:33:09 -04:00
Norman Maurer
0f4d3a981e Revert "[maven-release-plugin] prepare for next development iteration"
This reverts commit 3c10ffab5ee6b607d45f73123acb59c5c6602b6f.
2015-05-07 11:02:03 -04:00
Norman Maurer
3c10ffab5e [maven-release-plugin] prepare for next development iteration 2015-05-07 09:09:23 -04:00
Norman Maurer
f2fedbcdef [maven-release-plugin] prepare for next development iteration 2015-03-31 22:06:30 -04:00
Norman Maurer
054e7c5d17 [maven-release-plugin] prepare release netty-4.0.27.Final 2015-03-31 22:05:43 -04:00
Norman Maurer
37264bb72b [maven-release-plugin] prepare for next development iteration 2015-03-02 01:31:30 -05:00
Norman Maurer
0dbc96cffd [maven-release-plugin] prepare release netty-4.0.26.Final 2015-03-02 01:30:58 -05:00
Norman Maurer
e99d89c04d [maven-release-plugin] rollback the release of netty-4.0.26.Final 2015-02-28 21:28:06 +01:00
Norman Maurer
b86e2e6ac0 [maven-release-plugin] prepare release netty-4.0.26.Final 2015-02-28 13:55:01 -05:00
Trustin Lee
0e61aeb849 [maven-release-plugin] prepare for next development iteration 2014-12-31 20:58:44 +09:00
Trustin Lee
087db82e78 [maven-release-plugin] prepare release netty-4.0.25.Final 2014-12-31 20:58:33 +09:00
Michael Nitschinger
9fc95803da Fix ByteBufUtilBenchmark on utf8 encodings.
Motivation
----------
The performance tests for utf8 also used the getBytes on ASCII,
which is incorrect and also provides different performance numbers.

Modifications
-------------
Use CharsetUtil.UTF_8 instead of US_ASCII for the getBytes calls.

Result
------
Accurate and semantically correct benchmarking results on utf8
comparisons.
2014-12-31 20:26:21 +09:00
Norman Maurer
61a5e60513 Provide helper methods in ByteBufUtil to write UTF-8/ASCII CharSequences. Related to [#909]
Motivation:

We expose no methods in ByteBuf to directly write a CharSequence into it. This leads to have the user either convert the CharSequence first to a byte array or use CharsetEncoder. Both cases have some overheads and we can do a lot better for well known Charsets like UTF-8 and ASCII.

Modifications:

Add ByteBufUtil.writeAscii(...) and ByteBufUtil.writeUtf8(...) which can do the task in an optimized way. This is especially true if the passed in ByteBuf extends AbstractByteBuf which is true for all of our implementations which not wrap another ByteBuf.

Result:

Writing an ASCII and UTF-8 CharSequence into a AbstractByteBuf is a lot faster then what the user could do by himself as we can make use of some package private methods and so eliminate reference and range checks. When the Charseq is not ASCII or UTF-8 we can still do a very good job and are on par in most of the cases with what the user would do.

The following benchmark shows the improvements:

Result: 2456866.966 ?(99.9%) 59066.370 ops/s [Average]
  Statistics: (min, avg, max) = (2297025.189, 2456866.966, 2586003.225), stdev = 78851.914
  Confidence interval (99.9%): [2397800.596, 2515933.336]

Benchmark                                                        Mode   Samples        Score  Score error    Units
i.n.m.b.ByteBufUtilBenchmark.writeAscii                         thrpt        50  9398165.238   131503.098    ops/s
i.n.m.b.ByteBufUtilBenchmark.writeAsciiString                   thrpt        50  9695177.968   176684.821    ops/s
i.n.m.b.ByteBufUtilBenchmark.writeAsciiStringViaArray           thrpt        50  4788597.415    83181.549    ops/s
i.n.m.b.ByteBufUtilBenchmark.writeAsciiStringViaArrayWrapped    thrpt        50  4722297.435    98984.491    ops/s
i.n.m.b.ByteBufUtilBenchmark.writeAsciiStringWrapped            thrpt        50  4028689.762    66192.505    ops/s
i.n.m.b.ByteBufUtilBenchmark.writeAsciiViaArray                 thrpt        50  3234841.565    91308.009    ops/s
i.n.m.b.ByteBufUtilBenchmark.writeAsciiViaArrayWrapped          thrpt        50  3311387.474    39018.933    ops/s
i.n.m.b.ByteBufUtilBenchmark.writeAsciiWrapped                  thrpt        50  3379764.250    66735.415    ops/s
i.n.m.b.ByteBufUtilBenchmark.writeUtf8                          thrpt        50  5671116.821   101760.081    ops/s
i.n.m.b.ByteBufUtilBenchmark.writeUtf8String                    thrpt        50  5682733.440   111874.084    ops/s
i.n.m.b.ByteBufUtilBenchmark.writeUtf8StringViaArray            thrpt        50  3564548.995    55709.512    ops/s
i.n.m.b.ByteBufUtilBenchmark.writeUtf8StringViaArrayWrapped     thrpt        50  3621053.671    47632.820    ops/s
i.n.m.b.ByteBufUtilBenchmark.writeUtf8StringWrapped             thrpt        50  2634029.071    52304.876    ops/s
i.n.m.b.ByteBufUtilBenchmark.writeUtf8ViaArray                  thrpt        50  3397049.332    57784.119    ops/s
i.n.m.b.ByteBufUtilBenchmark.writeUtf8ViaArrayWrapped           thrpt        50  3318685.262    35869.562    ops/s
i.n.m.b.ByteBufUtilBenchmark.writeUtf8Wrapped                   thrpt        50  2473791.249    46423.114    ops/s
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1,387.417 sec - in io.netty.microbench.buffer.ByteBufUtilBenchmark

Results :

Tests run: 1, Failures: 0, Errors: 0, Skipped: 0

Results :

Tests run: 1, Failures: 0, Errors: 0, Skipped: 0

The *ViaArray* benchmarks are basically doing a toString().getBytes(Charset) which the others are using ByteBufUtil.write*(...).
2014-12-26 15:57:59 +09:00
Idel Pivnitskiy
9b3f536921 Benchmark for HttpRequestDecoder 2014-11-12 14:37:11 +01:00
Norman Maurer
1914b77c71 [maven-release-plugin] prepare for next development iteration 2014-10-29 11:48:40 +01:00
Norman Maurer
c170e7df3f [maven-release-plugin] prepare release netty-4.0.24.Final 2014-10-29 11:47:19 +01:00
Trustin Lee
7710e7da44 [maven-release-plugin] prepare for next development iteration 2014-08-16 03:02:02 +09:00
Trustin Lee
208198c0cb [maven-release-plugin] prepare release netty-4.0.23.Final 2014-08-16 03:01:57 +09:00
Trustin Lee
a2d508711d [maven-release-plugin] prepare for next development iteration 2014-08-14 09:41:33 +09:00