182 Commits

Author SHA1 Message Date
Norman Maurer
57fd316a82 Allow to obtain informations of used direct and heap memory for ByteBufAllocator implementations
Motivation:

Often its useful for the user to be able to get some stats about the memory allocated via an allocator.

Modifications:

- Allow to obtain the used heap and direct memory for an allocator
- Add test case

Result:

Fixes [#6341]
2017-03-01 18:56:16 +01:00
Norman Maurer
b372a2f19f Add benchmarks for UnpooledUnsafeNoCleanerDirectByteBuf vs UnpooledUnsafeDirectByteBuf
Motivation:

Issue [#6349] brought up the idea to not use UnpooledUnsafeNoCleanerDirectByteBuf by default. To decide what to do a benchmark is needed.

Modifications:

Add benchmarks for UnpooledUnsafeNoCleanerDirectByteBuf vs UnpooledUnsafeDirectB
yteBuf

Result:

Better idea about impact of using UnpooledUnsafeNoCleanerDirectByteBuf.
2017-02-27 20:04:24 +01:00
Norman Maurer
95b4814e3a Add benchmarks for SSLEngine implementations
Motivation:

As we provide our own SSLEngine implementation we should have benchmarks to compare it against JDK impl.

Modifications:

Add benchmarks for wrap / unwrap and handshake performance.

Result:

Benchmarks FTW.
2017-02-24 08:07:38 +01:00
Norman Maurer
00dde3224c Move all the microbenchmark code into src/main/java
Motivation:

All our benchmarks are in src/main/java in 4.1.x . We should make it consistent.

Modifications:

Move everything to src/main/java

Result:

Consistent code base.
2017-02-24 08:00:32 +01:00
Norman Maurer
ab3ee48fc5 Move benchmark to the correct folder after bad cherry-pick of 2f0b07975e2c99b65377efb7d7d0d6c4df62eb40 2017-02-15 19:55:37 +01:00
Norman Maurer
cc848f6960 Update to latest jmh version
Motivation:

We use an outdated jmh version.

Modifications:

Update to jmh 1.17.4.

Result:

Using latest jmh version.
2017-02-14 08:41:30 +01:00
Kiril Menshikov
2f0b07975e Allow to allign allocated Buffers
Motivation:

64-byte alignment is recommended by the Intel performance guide (https://software.intel.com/en-us/articles/practical-intel-avx-optimization-on-2nd-generation-intel-core-processors) for data-structures over 64 bytes.
Requiring padding to a multiple of 64 bytes allows for using SIMD instructions consistently in loops without additional conditional checks. This should allow for simpler and more efficient code.

Modification:

At the moment cache alignment must be setup manually. But probably it might be taken from the system. The original code was introduced by @normanmaurer https://github.com/netty/netty/pull/4726/files

Result:

Buffer alignment works better than miss-align cache.
2017-02-06 09:51:06 +01:00
Norman Maurer
0fbad09535 [maven-release-plugin] prepare for next development iteration 2017-01-30 17:42:39 +01:00
Norman Maurer
452812a62d [maven-release-plugin] prepare release netty-4.0.44.Final 2017-01-30 17:42:07 +01:00
Tim Brooks
095be39826 Wrap operations requiring SocketPermission with doPrivileged blocks
Motivation:

Currently Netty does not wrap socket connect, bind, or accept
operations in doPrivileged blocks. Nor does it wrap cases where a dns
lookup might happen.

This prevents an application utilizing the SecurityManager from
isolating SocketPermissions to Netty.

Modifications:

I have introduced a class (SocketUtils) that wraps operations
requiring SocketPermissions in doPrivileged blocks.

Result:

A user of Netty can grant SocketPermissions explicitly to the Netty
jar, without granting it to the rest of their application.
2017-01-19 21:23:28 +01:00
Norman Maurer
a2b8646b5f [maven-release-plugin] prepare for next development iteration 2017-01-12 13:25:14 +01:00
Norman Maurer
91a0bdc17a [maven-release-plugin] prepare release netty-4.0.43.Final 2017-01-12 13:05:33 +01:00
Norman Maurer
50a11d964d [maven-release-plugin] prepare for next development iteration 2016-10-14 14:32:28 +02:00
Norman Maurer
73306e017d [maven-release-plugin] prepare release netty-4.0.42.Final 2016-10-14 14:31:27 +02:00
Masaru Nomura
368701f528 Upgrade JMH to 1.14.1
Motivation:
It'd be usually good to use the latest library version.

Modification:
Bumped JMH to the latest version as of today.

Result:
Now we use JMH version 1.14.1 for our benchmark.
2016-09-29 21:31:27 +02:00
Norman Maurer
3b86867992 [maven-release-plugin] prepare for next development iteration 2016-08-26 08:36:54 +02:00
Norman Maurer
8bdfc9ce39 [maven-release-plugin] prepare release netty-4.0.41.Final 2016-08-26 06:51:15 +02:00
Norman Maurer
e015dfaea2 [maven-release-plugin] prepare for next development iteration 2016-07-27 10:47:03 +02:00
Norman Maurer
837d9947ec [maven-release-plugin] prepare release netty-4.0.40.Final 2016-07-27 10:30:08 +02:00
Norman Maurer
45f9d29fc1 [maven-release-plugin] prepare for next development iteration 2016-07-15 07:10:09 +02:00
Norman Maurer
38bdf86ba1 [maven-release-plugin] prepare release netty-4.0.39.Final 2016-07-15 07:08:29 +02:00
Norman Maurer
4329e97455 [maven-release-plugin] prepare for next development iteration 2016-07-01 07:59:55 +02:00
Norman Maurer
8642f16f35 [maven-release-plugin] prepare release netty-4.0.38.Final 2016-07-01 07:59:37 +02:00
Norman Maurer
2919145072 [maven-release-plugin] prepare for next development iteration 2016-06-07 20:00:14 +02:00
Norman Maurer
4169779352 [maven-release-plugin] prepare release netty-4.0.37.Final 2016-06-07 19:57:15 +02:00
Norman Maurer
de751ed179 Fix compile error introduced by 0ea4597542c28af4d9d30c0e2d5bd876646cbd58 2016-05-21 20:00:45 +02:00
Norman Maurer
0ea4597542 Introduce CodecOutputList to reduce overhead of encoder/decoder
Motivation:

99dfc9ea799348430a1c25776ce30a95bc10a1ff introduced some code that will more frequently try to forward messages out of the list of decoded messages to reduce latency and memory footprint. Unfortunally this has the side-effect that RecycleableArrayList.clear() will be called more often and so introduce some overhead as ArrayList will null out the array on each call.

Modifications:

- Introduce a CodecOutputList which allows to not null out the array until we recycle it and also allows to access internal array with extra range checks.
- Add benchmark that add elements to different List implementations and clear them

Result:

Less overhead when decode / encode messages.

Benchmark                                     (elements)   Mode  Cnt         Score        Error  Units
CodecOutputListBenchmark.arrayList                     1  thrpt   20  24853764.609 ± 161582.376  ops/s
CodecOutputListBenchmark.arrayList                     4  thrpt   20  17310636.508 ± 930517.403  ops/s
CodecOutputListBenchmark.codecOutList                  1  thrpt   20  26670751.661 ± 587812.655  ops/s
CodecOutputListBenchmark.codecOutList                  4  thrpt   20  25166421.089 ± 166945.599  ops/s
CodecOutputListBenchmark.recyclableArrayList           1  thrpt   20  24565992.626 ± 210017.290  ops/s
CodecOutputListBenchmark.recyclableArrayList           4  thrpt   20  18477881.775 ± 157003.777  ops/s

Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 246.748 sec - in io.netty.handler.codec.CodecOutputListBenchmark
2016-05-20 09:58:12 +02:00
Norman Maurer
4b6b167839 [maven-release-plugin] prepare for next development iteration 2016-04-04 16:53:40 +02:00
Norman Maurer
e8fa848f43 [maven-release-plugin] prepare release netty-4.0.36.Final 2016-04-04 16:52:53 +02:00
Norman Maurer
47baeb5ded [maven-release-plugin] rollback the release of netty-4.0.36.Final 2016-03-29 22:58:32 +02:00
Norman Maurer
29a4f3e363 [maven-release-plugin] prepare release netty-4.0.36.Final 2016-03-29 21:30:50 +02:00
Xiaoyan Lin
f5b4937543 Speed up the slow path of FastThreadLocal
Motivation:

The current slow path of FastThreadLocal is much slower than JDK ThreadLocal. See #4418

Modifications:

- Add FastThreadLocalSlowPathBenchmark for the flow path of FastThreadLocal
- Add final to speed up the slow path of FastThreadLocal

Result:

The slow path of FastThreadLocal is improved.
2016-03-23 11:44:30 +01:00
Norman Maurer
64dc03a25d [maven-release-plugin] prepare for next development iteration 2016-03-21 10:34:26 +01:00
Norman Maurer
e444e8d7a6 [maven-release-plugin] prepare release netty-4.0.35.Final 2016-03-21 10:32:08 +01:00
Xiaoyan Lin
57cebc7f5b Add xml-maven-plugin to check indentation and fix violations
Motivation:

See https://github.com/netty/netty-build/issues/5

Modifications:

Add xml-maven-plugin to check indentation and fix violations

Result:
pom.xml will be checked in the PR build
2016-02-29 10:04:53 +01:00
Norman Maurer
b647513b6b [maven-release-plugin] prepare for next development iteration 2016-01-29 09:57:10 +01:00
Norman Maurer
cf1777b619 [maven-release-plugin] prepare release netty-4.0.34.Final 2016-01-29 09:23:46 +01:00
Norman Maurer
450939842e Fix version 2015-11-24 21:24:22 +01:00
Norman Maurer
69b5aefd09 [maven-release-plugin] prepare release netty-4.0.33.Final 2015-11-03 14:18:17 +01:00
Norman Maurer
efeec7e390 Correctly construct Executor in microbenchmarks.
Motivation:

We should allow our custom Executor to shutdown quickly.

Modifications:

Call super constructor which correct arguments.

Result:

Custom Executor can be shutdown quickly.
2015-11-03 09:42:42 +01:00
buchgr
e12613a018 Fix performance regression in FastThreadLocal microbenchmark. Fixes #4402
Motivation:

As reported in #4402, the FastThreadLocalBenchmark shows that the JDK ThreadLocal
is actually faster than Netty's custom thread local implementation.

I was looking forward to doing some deep digging, but got disappointed :(.

Modifications:

The microbenchmark was not using FastThreadLocalThreads and would thus always hit the slow path.
I updated the JMH command line flags, so that FastThreadLocalThreads would be used.

Result:

FastThreadLocalBenchmark shows FastThreadLocal to be faster than JDK's ThreadLocal implementation,
by about 56% in this particular benchmark. Run on OSX El Capitan with OpenJDK 1.8u60.

Benchmark                                    Mode  Cnt      Score      Error  Units
FastThreadLocalBenchmark.fastThreadLocal    thrpt   20  55452.027 ±  725.713  ops/s
FastThreadLocalBenchmark.jdkThreadLocalGet  thrpt   20  35481.888 ± 1471.647  ops/s
2015-10-29 21:46:04 +01:00
Norman Maurer
0c8fe18d3c Add benchmark for HeapByteBuf implementations.
Motivation:

To prove one implementation is faster as the other we should have a benchmark.

Modifications:

Add benchmark which benchmarks the unsafe and non-unsafe implementation of HeapByteBuf.

Result:

Able to compare speed of implementations easily.
2015-10-29 19:38:33 +01:00
Norman Maurer
577931e8bc Use bitwise operation when sampling for resource leak detection.
Motivation:

Modulo operations are slow, we can use bitwise operation to detect if resource leak detection must be done while sampling.

Modifications:

- Ensure the interval is a power of two
- Use bitwise operation for sampling
- Add benchmark.

Result:

Faster sampling.
2015-10-29 19:18:06 +01:00
Norman Maurer
291674262c Added SlicedAbstractByteBuf that can provide fast-path for _get* and _set* methods
Motivation:

SlicedByteBuf can be used for any ByteBuf implementations and so can not do any optimizations that could be done
when AbstractByteBuf is sliced.

Modifications:

- Add SlicedAbstractByteBuf that can eliminate range and reference count checks for _get* and _set* methods.

Result:

Faster SlicedByteBuf implementations for AbstractByteBuf sub-classes.
2015-10-16 08:59:58 +02:00
Norman Maurer
054af70fed Minimize object allocation when calling AbstractByteBuf.toString(..., Charset)
Motivation:

Calling AbstractByteBuf.toString(..., Charset) is used quite frequently by users but produce a lot of GC.

Modification:

- Use a FastThreadLocal to store the CharBuffer that are needed for decoding.
- Use internalNioBuffer(...) when possible

Result:

Less object creation / Less GC
2015-10-15 17:49:21 +02:00
Norman Maurer
1103379e02 Allow to disable reference count checks on every access of the ByteBuf
Motiviation:

Checking reference count on every access on a ByteBuf can have some big performance overhead depending on how the access pattern is. If the user is sure that there are no reference count errors on his side it should be possible to disable the check and so gain the max performance.

Modification:

- Add io.netty.buffer.bytebuf.checkAccessible system property which allows to disable the checks. Enabled by default.
- Add microbenchmark

Result:

Increased performance for operations on the ByteBuf.
2015-10-15 10:19:49 +02:00
Norman Maurer
696a287736 [maven-release-plugin] prepare for next development iteration 2015-09-30 09:31:26 +02:00
Norman Maurer
fb2d562306 [maven-release-plugin] prepare release netty-4.0.32.Final 2015-09-30 09:28:40 +02:00
Norman Maurer
bd928eaa38 [maven-release-plugin] prepare for next development iteration 2015-09-02 08:58:54 +02:00
Norman Maurer
26bbcc38c2 [maven-release-plugin] prepare release netty-4.0.31.Final 2015-09-02 08:57:57 +02:00