netty5

Author	SHA1	Message	Date
Norman Maurer	ea58dc7ac7	[maven-release-plugin] prepare for next development iteration	2018-01-21 12:53:51 +00:00
Norman Maurer	96c7132dee	[maven-release-plugin] prepare release netty-4.1.20.Final	2018-01-21 12:53:34 +00:00
Francesco Nigro	1cf2687244	Fixed JMH ByteBuf benchmark to avoid dead code elimination Motivation: The JMH doc suggests using Blackholes to avoid dead code elimination, so it is better to follow this best practice. Modifications: Each benchmark method now returns the ByteBuf/ByteBuffer to prevent the JVM from performing any dead code elimination. Result: The results are more reliable and comparable to those of other ByteBuf benchmarks (e.g. HeapByteBufBenchmark)	2017-12-19 14:09:18 +01:00
Scott Mitchell	55ef09f191	Add HttpObjectEncoderBenchmark Motivation: Benchmark to measure HttpObjectEncoder performance. Modifications: - Create new benchmark HttpObjectEncoderBenchmark Result: JMH Microbenchmark for HttpObjectEncoder.	2017-12-16 13:47:34 +01:00
Scott Mitchell	5f0342ebe0	Add RedisEncoderBenchmark Motivation: Add a benchmark to measure RedisEncoder's performance Modifications: - Add RedisEncoderBenchmark Result: JMH benchmark exists to measure RedisEncoder's performance.	2017-12-16 13:42:50 +01:00
Norman Maurer	264a5daa41	[maven-release-plugin] prepare for next development iteration	2017-12-15 13:10:54 +00:00
Norman Maurer	0786c4c8d9	[maven-release-plugin] prepare release netty-4.1.19.Final	2017-12-15 13:09:30 +00:00
Norman Maurer	b2bc6407ab	[maven-release-plugin] prepare for next development iteration	2017-12-08 09:26:15 +00:00
Norman Maurer	96732f47d8	[maven-release-plugin] prepare release netty-4.1.18.Final	2017-12-08 09:25:56 +00:00
Scott Mitchell	93b144b7b4	HttpMethod#valueOf improvement Motivation: HttpMethod#valueOf shows up on profiler results in the top set of results. Since it is a relatively simple operation it can be improved in isolation. Modifications: - Introduce a special case map which assigns each HttpMethod to a unique index in an array and provides constant time lookup from a hash code algorithm. When the bucket is matched we can then directly do equality comparison instead of potentially following a linked structure when HashMap has hash collisions. Result: ~10% improvement in benchmark results for HttpMethod#valueOf Benchmark Mode Cnt Score Error Units HttpMethodMapBenchmark.newMapKnownMethods thrpt 16 31.831 ± 0.928 ops/us HttpMethodMapBenchmark.newMapMixMethods thrpt 16 25.568 ± 0.400 ops/us HttpMethodMapBenchmark.newMapUnknownMethods thrpt 16 51.413 ± 1.824 ops/us HttpMethodMapBenchmark.oldMapKnownMethods thrpt 16 29.226 ± 0.330 ops/us HttpMethodMapBenchmark.oldMapMixMethods thrpt 16 21.073 ± 0.247 ops/us HttpMethodMapBenchmark.oldMapUnknownMethods thrpt 16 49.081 ± 0.577 ops/us	2017-11-20 11:07:50 -08:00
Scott Mitchell	e6126215e0	DefaultHttp2FrameWriter reduce object allocation Motivation: DefaultHttp2FrameWriter#writeData allocates a DataFrameHeader for each write operation. DataFrameHeader maintains internal state and allocates multiple slices of a buffer which is a maximum of 30 bytes. This 30 byte buffer may not always be necessary and the additional slice operations can utilize retainedSlice to take advantage of pooled objects. We can also save computation and object allocations if there is no padding which is a common case in practice. Modifications: - Remove DataFrameHeader - Add a fast path for padding == 0 Result: Less object allocation in DefaultHttp2FrameWriter	2017-11-20 08:10:59 -08:00
Anuraag Agrawal	1f1a60ae7d	Use Netty's DefaultPriorityQueue instead of JDK's PriorityQueue for scheduled tasks Motivation: `AbstractScheduledEventExecutor` uses a standard `java.util.PriorityQueue` to keep track of task deadlines. `ScheduledFuture.cancel` removes tasks from this `PriorityQueue`. Unfortunately, `PriorityQueue.remove` has `O(n)` performance since it must search for the item in the entire queue before removing it. This is fast when the future is at the front of the queue (e.g., already triggered) but not when it's randomly located in the queue. Many servers will use `ScheduledFuture.cancel` on all requests, e.g., to manage a request timeout. As these cancellations will happen in arbitrary order, when there are many scheduled futures, `PriorityQueue.remove` is a bottleneck and greatly hurts performance with many concurrent requests (>10K). Modification: Use netty's `DefaultPriorityQueue` for scheduling futures instead of the JDK. `DefaultPriorityQueue` is almost identical to the JDK version except it is able to remove futures without searching for them in the queue. This means `DefaultPriorityQueue.remove` has `O(log n)` performance. 
Result: Before - cancelling futures has varying performance, capped at `O(n)` After - cancelling futures has stable performance, capped at `O(log n)` Benchmark results After - cancelling in order and in reverse order have similar performance within `O(log n)` bounds ``` Benchmark (num) Mode Cnt Score Error Units ScheduledFutureTaskBenchmark.cancelInOrder 100 thrpt 20 137779.616 ± 7709.751 ops/s ScheduledFutureTaskBenchmark.cancelInOrder 1000 thrpt 20 11049.448 ± 385.832 ops/s ScheduledFutureTaskBenchmark.cancelInOrder 10000 thrpt 20 943.294 ± 12.391 ops/s ScheduledFutureTaskBenchmark.cancelInOrder 100000 thrpt 20 64.210 ± 1.824 ops/s ScheduledFutureTaskBenchmark.cancelInReverseOrder 100 thrpt 20 167531.096 ± 9187.865 ops/s ScheduledFutureTaskBenchmark.cancelInReverseOrder 1000 thrpt 20 33019.786 ± 4737.770 ops/s ScheduledFutureTaskBenchmark.cancelInReverseOrder 10000 thrpt 20 2976.955 ± 248.555 ops/s ScheduledFutureTaskBenchmark.cancelInReverseOrder 100000 thrpt 20 362.654 ± 45.716 ops/s ``` Before - cancelling in order and in reverse order have significantly different performance at higher queue size, orders of magnitude worse than the new implementation. ``` Benchmark (num) Mode Cnt Score Error Units ScheduledFutureTaskBenchmark.cancelInOrder 100 thrpt 20 139968.586 ± 12951.333 ops/s ScheduledFutureTaskBenchmark.cancelInOrder 1000 thrpt 20 12274.420 ± 337.800 ops/s ScheduledFutureTaskBenchmark.cancelInOrder 10000 thrpt 20 958.168 ± 15.350 ops/s ScheduledFutureTaskBenchmark.cancelInOrder 100000 thrpt 20 53.381 ± 13.981 ops/s ScheduledFutureTaskBenchmark.cancelInReverseOrder 100 thrpt 20 123918.829 ± 3642.517 ops/s ScheduledFutureTaskBenchmark.cancelInReverseOrder 1000 thrpt 20 5099.810 ± 206.992 ops/s ScheduledFutureTaskBenchmark.cancelInReverseOrder 10000 thrpt 20 72.335 ± 0.443 ops/s ScheduledFutureTaskBenchmark.cancelInReverseOrder 100000 thrpt 20 0.743 ± 0.003 ops/s ```	2017-11-10 23:09:32 -08:00
Norman Maurer	188ea59c9d	[maven-release-plugin] prepare for next development iteration	2017-11-08 22:36:53 +00:00
Norman Maurer	812354cf1f	[maven-release-plugin] prepare release netty-4.1.17.Final	2017-11-08 22:36:33 +00:00
Carl Mastrangelo	83a19d5650	Optimistically update ref counts Motivation: Highly retained and released objects have contention on their ref count. Currently, the ref count is updated using compareAndSet with care to make sure the count doesn't overflow, double free, or revive the object. Profiling has shown that a non-trivial share (~1%) of CPU time on gRPC latency benchmarks is from the ref count updating. Modification: Rather than pessimistically assuming the ref count will be invalid, optimistically update it assuming it will be valid. If the update was wrong, then use the slow path to revert the change and throw an exception. Most of the time, the ref counts are correct. This changes from using compareAndSet to getAndAdd, which emits a different CPU instruction on x86 (CMPXCHG to XADD). Because the CPU knows it will modify the memory, it can avoid contention. On a highly contended machine, this can be about 2x faster. There is a downside to the new approach. The ref counters can temporarily enter invalid states if over retained or over released. The code does handle these overflow and underflow scenarios, but it is possible that another concurrent access may push the failure to a different location. For example: Time 1 Thread 1: obj.retain(INT_MAX - 1) Time 2 Thread 1: obj.retain(2) Time 2 Thread 2: obj.retain(1) Previously Thread 2 would always succeed and Thread 1 would always fail on the second access. Now, thread 2 could fail while thread 1 is rolling back its change. ==== There are a few reasons why I think this is okay: 1. Buggy code is going to have bugs. An exception _is_ going to be thrown. This just causes the other threads to notice the state is messed up and stop early. 2. If high retention counts are a use case, then ref count should be a long rather than an int. 3. The critical section is greatly reduced compared to the previous version, so the likelihood of this happening is lower 4. 
On error, the code always rolls back the change atomically, so there is no possibility of corruption. Result: Faster refcounting ``` BEFORE: Benchmark (delay) Mode Cnt Score Error Units AbstractReferenceCountedByteBufBenchmark.retainRelease_contended 1 sample 2901361 804.579 ± 1.835 ns/op AbstractReferenceCountedByteBufBenchmark.retainRelease_contended 10 sample 3038729 785.376 ± 16.471 ns/op AbstractReferenceCountedByteBufBenchmark.retainRelease_contended 100 sample 2899401 817.392 ± 6.668 ns/op AbstractReferenceCountedByteBufBenchmark.retainRelease_contended 1000 sample 3650566 2077.700 ± 0.600 ns/op AbstractReferenceCountedByteBufBenchmark.retainRelease_contended 10000 sample 3005467 19949.334 ± 4.243 ns/op AbstractReferenceCountedByteBufBenchmark.retainRelease_uncontended 1 sample 456091 48.610 ± 1.162 ns/op AbstractReferenceCountedByteBufBenchmark.retainRelease_uncontended 10 sample 732051 62.599 ± 0.815 ns/op AbstractReferenceCountedByteBufBenchmark.retainRelease_uncontended 100 sample 778925 228.629 ± 1.205 ns/op AbstractReferenceCountedByteBufBenchmark.retainRelease_uncontended 1000 sample 633682 2002.987 ± 2.856 ns/op AbstractReferenceCountedByteBufBenchmark.retainRelease_uncontended 10000 sample 506442 19735.345 ± 12.312 ns/op AFTER: Benchmark (delay) Mode Cnt Score Error Units AbstractReferenceCountedByteBufBenchmark.retainRelease_contended 1 sample 3761980 383.436 ± 1.315 ns/op AbstractReferenceCountedByteBufBenchmark.retainRelease_contended 10 sample 3667304 474.429 ± 1.101 ns/op AbstractReferenceCountedByteBufBenchmark.retainRelease_contended 100 sample 3039374 479.267 ± 0.435 ns/op AbstractReferenceCountedByteBufBenchmark.retainRelease_contended 1000 sample 3709210 2044.603 ± 0.989 ns/op AbstractReferenceCountedByteBufBenchmark.retainRelease_contended 10000 sample 3011591 19904.227 ± 18.025 ns/op AbstractReferenceCountedByteBufBenchmark.retainRelease_uncontended 1 sample 494975 52.269 ± 8.345 ns/op 
AbstractReferenceCountedByteBufBenchmark.retainRelease_uncontended 10 sample 771094 62.290 ± 0.795 ns/op AbstractReferenceCountedByteBufBenchmark.retainRelease_uncontended 100 sample 763230 235.044 ± 1.552 ns/op AbstractReferenceCountedByteBufBenchmark.retainRelease_uncontended 1000 sample 634037 2006.578 ± 3.574 ns/op AbstractReferenceCountedByteBufBenchmark.retainRelease_uncontended 10000 sample 506284 19742.605 ± 13.729 ns/op ```	2017-10-04 08:42:33 +02:00
Norman Maurer	625a7426cd	[maven-release-plugin] prepare for next development iteration	2017-09-25 06:12:32 +02:00
Norman Maurer	f57d8f00e1	[maven-release-plugin] prepare release netty-4.1.16.Final	2017-09-25 06:12:16 +02:00
Norman Maurer	3c8c7fc7e9	Reduce performance overhead of ResourceLeakDetector Motivation: The ResourceLeakDetector helps to detect and troubleshoot resource leaks and is often used even in production environments with a low level. Because of this it's important that we try to keep the overhead as low as possible. Most of the time no leak is detected (as all is correctly handled) so we should keep the overhead for this case as low as possible. Modifications: - Only call getStackTrace() if a leak is reported as it is a very expensive native call. Also handle the filtering and creating of the String in a lazy fashion - Remove the need to maintain a Queue to store the last access records - Add benchmark Result: Huge decrease of performance overhead. Before the patch: Benchmark (recordTimes) Mode Cnt Score Error Units ResourceLeakDetectorRecordBenchmark.record 8 thrpt 20 4358.367 ± 116.419 ops/s ResourceLeakDetectorRecordBenchmark.record 16 thrpt 20 2306.027 ± 55.044 ops/s ResourceLeakDetectorRecordBenchmark.recordWithHint 8 thrpt 20 4220.979 ± 114.046 ops/s ResourceLeakDetectorRecordBenchmark.recordWithHint 16 thrpt 20 2250.734 ± 55.352 ops/s With this patch: Benchmark (recordTimes) Mode Cnt Score Error Units ResourceLeakDetectorRecordBenchmark.record 8 thrpt 20 71398.957 ± 2695.925 ops/s ResourceLeakDetectorRecordBenchmark.record 16 thrpt 20 38643.963 ± 1446.694 ops/s ResourceLeakDetectorRecordBenchmark.recordWithHint 8 thrpt 20 71677.882 ± 2923.622 ops/s ResourceLeakDetectorRecordBenchmark.recordWithHint 16 thrpt 20 38660.176 ± 1467.732 ops/s	2017-09-18 16:36:19 -07:00
Norman Maurer	b967805f32	[maven-release-plugin] prepare for next development iteration	2017-08-24 15:38:22 +02:00
Norman Maurer	da8e010a42	[maven-release-plugin] prepare release netty-4.1.15.Final	2017-08-24 15:37:59 +02:00
Norman Maurer	52f384b37f	[maven-release-plugin] prepare for next development iteration	2017-08-02 12:55:10 +00:00
Norman Maurer	8cc1071881	[maven-release-plugin] prepare release netty-4.1.14.Final	2017-08-02 12:54:51 +00:00
Nikolay Fedorovskikh	df568c739e	Use ByteBuf#writeShort/writeMedium instead of writeBytes Motivation: 1. Some encoders used `ByteBuf#writeBytes` to write a short constant byte array (2-3 bytes). This can be replaced with the faster `ByteBuf#writeShort` or `ByteBuf#writeMedium`, which do not access the memory. 2. Two chained calls of `ByteBuf#setByte` with constants can be replaced with one `ByteBuf#setShort` to reduce index checks. 3. The signature of method `HttpHeadersEncoder#encoderHeader` has an unnecessary `throws`. Modifications: 1. Use `ByteBuf#writeShort` or `ByteBuf#writeMedium` instead of `ByteBuf#writeBytes` for the constants. 2. Use `ByteBuf#setShort` instead of chained calls of `ByteBuf#setByte` with constants. 3. Remove an unnecessary `throws` from `HttpHeadersEncoder#encoderHeader`. Result: Slightly faster writes of constants into buffers.	2017-07-10 14:37:41 +02:00
Norman Maurer	2a376eeb1b	[maven-release-plugin] prepare for next development iteration	2017-07-06 13:24:06 +02:00
Norman Maurer	c7f8168324	[maven-release-plugin] prepare release netty-4.1.13.Final	2017-07-06 13:23:51 +02:00
Dmitriy Dumanskiy	dd69a813d4	Performance improvement for HttpRequestEncoder. Optimized insertion of a char into the string. Motivation: Right now HttpRequestEncoder inserts a slash before the question mark for URLs like http://localhost?pararm=1. This is done inefficiently. Modification: Code: new StringBuilder(len + 1) .append(uri, 0, index) .append(SLASH) .append(uri, index, len) .toString(); Replaced with: new StringBuilder(uri) .insert(index, SLASH) .toString(); Result: Faster HttpRequestEncoder. Additional small test. Attached benchmark in PR. Benchmark Mode Cnt Score Error Units HttpRequestEncoderInsertBenchmark.newEncoder thrpt 40 3704843.303 ± 98950.919 ops/s HttpRequestEncoderInsertBenchmark.oldEncoder thrpt 40 3284236.960 ± 134433.217 ops/s	2017-06-27 10:53:43 +02:00
Nikolay Fedorovskikh	aa38b6a769	Prevent unnecessary allocations in the `StringUtil#escapeCsv` Motivation: `StringUtil#escapeCsv` creates a new `StringBuilder` for each value even if the same string is returned in the end. Modifications: Create a new `StringBuilder` only if it is really needed. Otherwise, return the original string (or just a trimmed substring). Result: Less GC load. Up to 4x faster for unchanged strings.	2017-06-13 14:57:38 -07:00
Dmitriy Dumanskiy	acc07fac32	disabling leak detection micro benchmark Motivation: When I run Netty micro benchmarks I get many warnings like: WARNING: -Dio.netty.noResourceLeakDetection is deprecated. Use '-Dio.netty.leakDetection.level=simple' instead. Modification: -Dio.netty.noResourceLeakDetection replaced with -Dio.netty.leakDetection.level=disabled. Result: No warnings.	2017-06-09 18:03:54 +02:00
Norman Maurer	fd67a2354d	[maven-release-plugin] prepare for next development iteration	2017-06-08 21:06:24 +02:00
Norman Maurer	3acd5c68ea	[maven-release-plugin] prepare release netty-4.1.12.Final	2017-06-08 21:06:01 +02:00
Nikolay Fedorovskikh	e4531918a3	Optimizations in NetUtil Motivation: IPv4/6 validation methods use allocations, which can be avoided. The IPv4 parse method uses StringTokenizer. Modifications: Rewrite IPv4/6 validation methods to avoid allocations. Rewrite the IPv4 parse method without using StringTokenizer. Result: IPv4/6 validation and IPv4 parsing are up to 2-10x faster.	2017-05-18 16:42:22 -07:00
Norman Maurer	0db2901f4d	[maven-release-plugin] prepare for next development iteration	2017-05-11 16:00:55 +02:00
Norman Maurer	f7a19d330c	[maven-release-plugin] prepare release netty-4.1.11.Final	2017-05-11 16:00:16 +02:00
Scott Mitchell	3cc4052963	New native transport for kqueue Motivation: We currently don't have a native transport which supports kqueue https://www.freebsd.org/cgi/man.cgi?query=kqueue&sektion=2. This can be useful for BSD systems such as MacOS to take advantage of native features, and provide feature parity with the Linux native transport. Modifications: - Make a new transport-native-unix-common module with all the java classes and JNI code for generic unix items. This module will build a static library for each unix platform, which is included in the dynamic libraries used for JNI (e.g. transport-native-epoll, and eventually kqueue). - Make a new transport-native-unix-common-tests module where the tests for the transport-native-unix-common module will live. This is so each unix platform can inherit from these tests and ensure they pass. - Add a new transport-native-kqueue module which uses JNI to directly interact with kqueue Result: JNI support for kqueue. Fixes https://github.com/netty/netty/issues/2448 Fixes https://github.com/netty/netty/issues/4231	2017-05-03 09:53:22 -07:00
Norman Maurer	6915ec3bb9	[maven-release-plugin] prepare for next development iteration	2017-04-29 14:10:00 +02:00
Norman Maurer	f30f242fee	[maven-release-plugin] prepare release netty-4.1.10.Final	2017-04-29 14:09:32 +02:00
Nikolay Fedorovskikh	0692bf1b6a	fix the typos	2017-04-20 04:56:09 +02:00
Norman Maurer	e482d933f7	Add 'io.netty.tryAllocateUninitializedArray' system property which allows allocating byte[] without memset in Java9+ Motivation: Java9 added a new method to Unsafe which allows allocating a byte[] without a memset. This can have a massive impact on allocation times when the byte[] is big. This change allows enabling this via the io.netty.tryAllocateUninitializedArray property when running Java9+. Please note that you will need to open up the jdk.internal.misc package via '--add-opens java.base/jdk.internal.misc=ALL-UNNAMED' as well. Modifications: Allow allocating byte[] without memset on Java9+ Result: Better performance when allocating big heap buffers on Java9.	2017-04-19 11:45:39 +02:00
Ade Setyawan Sajim	016629fe3b	Replace system.out.println with InternalLoggerFactory Motivation: There are two files that still use `system.out.println` to log their status Modification: Replace `system.out.println` with a `debug` function inside an instance of `InternalLoggerFactory` Result: Introduce an instance of `InternalLoggerFactory` in class `AbstractMicrobenchmark.java` and `AbstractSharedExecutorMicrobenchmark.java`	2017-03-28 14:51:59 +02:00
Norman Maurer	2b8c8e0805	[maven-release-plugin] prepare for next development iteration	2017-03-10 07:46:17 +01:00
Norman Maurer	1db58ea980	[maven-release-plugin] prepare release netty-4.1.9.Final	2017-03-10 07:45:28 +01:00
Scott Mitchell	743d2d374c	SslHandler benchmark and SslEngine multiple packets benchmark Motivation: We currently don't have a benchmark which includes SslHandler. The SslEngine benchmarks also always include a single TLS packet when encoding/decoding. In practice when reading data from the network there may be multiple TLS packets present and we should expand the benchmarks to understand this use case. Modifications: - SslEngine benchmarks should include wrapping/unwrapping of multiple TLS packets - Introduce SslHandler benchmarks which can also account for wrapping/unwrapping of multiple TLS packets Result: SslHandler and SslEngine benchmarks are more comprehensive.	2017-03-06 08:42:39 -08:00
Scott Mitchell	f9001b9fc0	HTTP/2 move internal HPACK classes to the http2 package Motivation: The internal.hpack classes are no longer exposed in our public APIs and can be made package private in the http2 package. Modifications: - Make the hpack classes package private in the http2 package Result: Fewer APIs exposed as public.	2017-03-02 07:42:41 -08:00
Norman Maurer	461f9a1212	Allow to obtain information about used direct and heap memory for ByteBufAllocator implementations Motivation: Often it's useful for the user to be able to get some stats about the memory allocated via an allocator. Modifications: - Allow to obtain the used heap and direct memory for an allocator - Add test case Result: Fixes [#6341]	2017-03-01 18:53:43 +01:00
Norman Maurer	90a61046c7	Add benchmarks for UnpooledUnsafeNoCleanerDirectByteBuf vs UnpooledUnsafeDirectByteBuf Motivation: Issue [#6349] brought up the idea to not use UnpooledUnsafeNoCleanerDirectByteBuf by default. To decide what to do a benchmark is needed. Modifications: Add benchmarks for UnpooledUnsafeNoCleanerDirectByteBuf vs UnpooledUnsafeDirectByteBuf Result: Better idea about impact of using UnpooledUnsafeNoCleanerDirectByteBuf.	2017-02-27 20:04:09 +01:00
Norman Maurer	d73477c7bd	Add benchmarks for SSLEngine implementations Motivation: As we provide our own SSLEngine implementation we should have benchmarks to compare it against JDK impl. Modifications: Add benchmarks for wrap / unwrap and handshake performance. Result: Benchmarks FTW.	2017-02-24 08:02:10 +01:00
Norman Maurer	a80d3411ee	Move all the microbenchmark code into one directory. Motivation: Almost all our benchmarks are in src/main/java but a few are in src/test/java. We should make it consistent. Modifications: Move everything to src/main/java Result: Consistent code base.	2017-02-23 19:59:09 +01:00
Nikolay Fedorovskikh	0623c6c533	Fix javadoc issues Motivation: Invalid javadoc in project Modifications: Fix it Result: More correct javadoc	2017-02-22 07:31:07 +01:00
Nikolay Fedorovskikh	634a8afa53	Fix some warnings at generics usage Motivation: Existing warnings from java compiler Modifications: Add/fix type parameters Result: Fewer warnings	2017-02-22 07:29:59 +01:00
Norman Maurer	fd2e142e74	Update to latest jmh version Motivation: We use an outdated jmh version. Modifications: Update to jmh 1.17.4. Result: Using latest jmh version.	2017-02-14 08:40:12 +01:00
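
Note on commit 83a19d5650 (Optimistically update ref counts): the switch from compareAndSet to getAndAdd described in that message can be sketched roughly as below. This is a minimal illustration and not Netty's actual code: the class name `OptimisticRefCountSketch`, both method names, and the use of `AtomicInteger` are assumptions made for the example, and the increment is assumed to be positive.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch, not Netty's implementation.
final class OptimisticRefCountSketch {
    private final AtomicInteger refCnt = new AtomicInteger(1);

    int refCnt() {
        return refCnt.get();
    }

    // Pessimistic variant: validate first, then publish with a CAS loop.
    void retainWithCas(int increment) {
        for (;;) {
            int current = refCnt.get();
            int next = current + increment;
            // current <= 0 means the object was already freed; next < current means overflow.
            if (current <= 0 || next < current) {
                throw new IllegalStateException("refCnt: " + current + ", increment: " + increment);
            }
            if (refCnt.compareAndSet(current, next)) {
                return;
            }
        }
    }

    // Optimistic variant: add unconditionally, roll back only if the result was invalid.
    void retainOptimistically(int increment) {
        int old = refCnt.getAndAdd(increment);
        if (old <= 0 || old + increment < old) {
            // Slow path: undo the addition before reporting the error.
            refCnt.getAndAdd(-increment);
            throw new IllegalStateException("refCnt: " + old + ", increment: " + increment);
        }
    }

    public static void main(String[] args) {
        OptimisticRefCountSketch counter = new OptimisticRefCountSketch();
        counter.retainOptimistically(2);
        System.out.println(counter.refCnt()); // prints 3
        try {
            counter.retainOptimistically(Integer.MAX_VALUE); // overflows, so it is rolled back
        } catch (IllegalStateException e) {
            System.out.println("rolled back to " + counter.refCnt()); // prints "rolled back to 3"
        }
    }
}
```

Between the unconditional add and the rollback, other threads can observe the temporarily invalid count, which is the downside the commit message discusses.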

277 Commits
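
Note on commit dd69a813d4 (HttpRequestEncoder slash insertion): the two StringBuilder forms quoted in that message produce the same string. The sketch below shows this in isolation, under the assumption that `SLASH` is the char `'/'` and `index` is the position of the question mark. The class and method names are hypothetical, and the example makes no claim about relative speed.

```java
// Hypothetical sketch comparing the two forms quoted in the commit message.
final class SlashInsertSketch {
    private static final char SLASH = '/'; // assumed value of the SLASH constant

    // Old form: copy the prefix, append the slash, copy the suffix.
    static String withAppends(String uri, int index) {
        int len = uri.length();
        return new StringBuilder(len + 1)
                .append(uri, 0, index)
                .append(SLASH)
                .append(uri, index, len)
                .toString();
    }

    // New form: copy the whole string once, then insert the slash in place.
    static String withInsert(String uri, int index) {
        return new StringBuilder(uri)
                .insert(index, SLASH)
                .toString();
    }

    public static void main(String[] args) {
        String uri = "http://localhost?param=1";
        int index = uri.indexOf('?');
        System.out.println(withAppends(uri, index)); // prints http://localhost/?param=1
        System.out.println(withInsert(uri, index));  // prints http://localhost/?param=1
    }
}
```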