netty5

Author	SHA1	Message	Date
Francesco Nigro	d2c03c9a29	Improve MqttMessageType::valueOf cost (#10400 ) Motivation: MqttMessageType::valueOf has O(N) cost Modifications: MqttMessageType::valueOf uses a const lookup table Result: MqttMessageType::valueOf has O(1) cost	2020-08-31 10:32:33 +02:00
root	bfbeb2dec6	[maven-release-plugin] prepare for next development iteration	2020-07-09 12:27:06 +00:00
root	646934ef0a	[maven-release-plugin] prepare release netty-4.1.51.Final	2020-07-09 12:26:30 +00:00
root	caf51b7284	[maven-release-plugin] prepare for next development iteration	2020-05-13 06:00:23 +00:00
root	8c5b72aaf0	[maven-release-plugin] prepare release netty-4.1.50.Final	2020-05-13 05:59:55 +00:00
root	9c5008b109	[maven-release-plugin] prepare for next development iteration	2020-04-22 09:57:54 +00:00
root	d0ec961cce	[maven-release-plugin] prepare release netty-4.1.49.Final	2020-04-22 09:57:26 +00:00
Linas Medžiūnas	fb5e2cd3aa	Efficient ByteBuf search algorithms (#9914 ) (#9955 ) Motivation: We have found out that ByteBufUtil.indexOf can be inefficient for substring search on ByteBuf, both in terms of algorithm complexity (worst case O(needle.readableBytes * haystack.readableBytes)), and in constant factor (esp. on Composite buffers). With implementation of more performant search algorithms we have seen improvements of an order of magnitude. Modifications: This change introduces three search algorithms: 1. Knuth Morris Pratt - classical textbook algorithm, a good default choice. 2. Bit mask based algorithm - stable performance on any input, but limited to maximum search substring (the needle) length of 64 bytes. 3. Aho–Corasick - worse performance and higher memory consumption than [1] and [2], but it supports multiple substring (the needles) search simultaneously, by inspecting every byte of the haystack only once. Each algorithm processes every byte of the underlying buffer only once; they are implemented as ByteProcessor. Result: Efficient search algorithms with linear time complexity available in Netty (I will share benchmark results in a comment on a PR).	2020-04-15 10:21:24 +02:00
Dmitry Konstantinov	ea31b59037	Replace usage() with freeBytes() in thresholds within hot paths of PoolChunkList (#10141 ) Motivation: PoolChunk.usage() method has non-trivial computations. It is currently used in hot path methods invoked when allocation and de-allocation happen. The idea is to replace the comparison of usage() output against percent thresholds with a plain comparison of Chunk.freeBytes against absolute thresholds. In this way the majority of computations from the threshold conditions are moved to init logic. Modifications: Replace PoolChunk.usage() conditions in PoolChunkList with equivalent conditions for PoolChunk.freeBytes() Result: Improve performance of allocation and de-allocation of ByteBuf from normal size cache pool	2020-03-31 22:11:16 +02:00
root	14e4afeba2	[maven-release-plugin] prepare for next development iteration	2020-03-17 09:20:54 +00:00
root	c10c697e5b	[maven-release-plugin] prepare release netty-4.1.48.Final	2020-03-17 09:18:28 +00:00
root	c623a50d19	[maven-release-plugin] prepare for next development iteration	2020-03-09 12:13:56 +00:00
root	a401b2ac92	[maven-release-plugin] prepare release netty-4.1.47.Final	2020-03-09 12:13:26 +00:00
root	e0d73bca4d	[maven-release-plugin] prepare for next development iteration	2020-02-28 06:37:33 +00:00
root	ebe7af5102	[maven-release-plugin] prepare release netty-4.1.46.Final	2020-02-28 06:36:45 +00:00
root	9b1ea10a12	[maven-release-plugin] prepare for next development iteration	2020-01-13 09:13:53 +00:00
root	136db8680a	[maven-release-plugin] prepare release netty-4.1.45.Final	2020-01-13 09:13:30 +00:00
Francesco Nigro	bc026ef8ba	Faster decodeHexNibble (#9896 ) Motivation: decodeHexNibble can be a lot faster using a lookup table Modifications: decodeHexNibble is made faster by using a lookup table Result: decodeHexNibble is faster	2019-12-23 21:15:56 +01:00
Anuraag Agrawal	687308b4de	Separate out query string encoding for non-encoded strings. (#9887 ) Motivation: Currently, characters are appended to the encoded string char-by-char even when no encoding is needed. We can instead separate out codepath that appends the entire string in one go for better `StringBuilder` allocation performance. Modification: Only go into char-by-char loop when finding a character that requires encoding. Result: The results aren't so clear with noise on my hot laptop - the biggest impact is on long strings, both to reduce resizes of the buffer and also to reduce complexity of the loop. I don't think there's a significant downside though for the cases that hit the slow path. After ``` Benchmark Mode Cnt Score Error Units QueryStringEncoderBenchmark.longAscii thrpt 6 1.406 ± 0.069 ops/us QueryStringEncoderBenchmark.longAsciiFirst thrpt 6 0.046 ± 0.001 ops/us QueryStringEncoderBenchmark.longUtf8 thrpt 6 0.046 ± 0.001 ops/us QueryStringEncoderBenchmark.shortAscii thrpt 6 15.781 ± 0.949 ops/us QueryStringEncoderBenchmark.shortAsciiFirst thrpt 6 3.171 ± 0.232 ops/us QueryStringEncoderBenchmark.shortUtf8 thrpt 6 3.900 ± 0.667 ops/us ``` Before ``` Benchmark Mode Cnt Score Error Units QueryStringEncoderBenchmark.longAscii thrpt 6 0.444 ± 0.072 ops/us QueryStringEncoderBenchmark.longAsciiFirst thrpt 6 0.043 ± 0.002 ops/us QueryStringEncoderBenchmark.longUtf8 thrpt 6 0.047 ± 0.001 ops/us QueryStringEncoderBenchmark.shortAscii thrpt 6 16.503 ± 1.015 ops/us QueryStringEncoderBenchmark.shortAsciiFirst thrpt 6 3.316 ± 0.154 ops/us QueryStringEncoderBenchmark.shortUtf8 thrpt 6 3.776 ± 0.956 ops/us ```	2019-12-20 08:51:18 +01:00
Anuraag Agrawal	95b8db0633	Use array to buffer decoded query instead of ByteBuffer. (#9886 ) Motivation: In Java, it is almost always at least slower to use `ByteBuffer` than `byte[]` without pooling or I/O. `QueryStringDecoder` can use `byte[]` with arguably simpler code. Modification: Replace `ByteBuffer` / `CharsetDecoder` with `byte[]` and `new String` Result: After ``` Benchmark Mode Cnt Score Error Units QueryStringDecoderBenchmark.noDecoding thrpt 6 5.612 ± 2.639 ops/us QueryStringDecoderBenchmark.onlyDecoding thrpt 6 1.393 ± 0.067 ops/us QueryStringDecoderBenchmark.mixedDecoding thrpt 6 1.223 ± 0.048 ops/us ``` Before ``` Benchmark Mode Cnt Score Error Units QueryStringDecoderBenchmark.noDecoding thrpt 6 6.123 ± 0.250 ops/us QueryStringDecoderBenchmark.onlyDecoding thrpt 6 0.922 ± 0.159 ops/us QueryStringDecoderBenchmark.mixedDecoding thrpt 6 1.032 ± 0.178 ops/us ``` I notice #6781 switched from an array to `ByteBuffer` but I can't find any motivation for that in the PR. Unit tests pass fine with an array and we get a reasonable speed bump.	2019-12-18 21:11:28 +01:00
root	79d4e74019	[maven-release-plugin] prepare for next development iteration	2019-12-18 08:32:54 +00:00
root	5ddf45a2d5	[maven-release-plugin] prepare release netty-4.1.44.Final	2019-12-18 08:31:43 +00:00
时无两丶	0cde4d9cb4	Uniform null pointer check. (#9840 ) Motivation: Uniform null pointer check. Modifications: Use ObjectUtil.checkNotNull(...) Result: Less code, same result.	2019-12-09 09:47:35 +01:00
Nick Hill	43252a6135	Update to latest JMH version (#9787 ) Motivation JMH 1.22 was released recently, we might as well use the latest when running benchmarks. Summary of changes: https://mail.openjdk.java.net/pipermail/jmh-dev/2019-November/002879.html Modifications Update jmh dependencies in microbench module from version 1.21 to 1.22. Result Benchmarks run using latest JMH	2019-11-19 11:28:18 +01:00
Nick Hill	feb804dca8	Avoid extra Runnable allocs when scheduling tasks outside event loop (#9744 ) Motivation Currently when future tasks are scheduled via EventExecutors from a different thread, at least two allocations are performed - the ScheduledFutureTask wrapping the to-be-run task, and a Runnable wrapping the action to add to the scheduled task priority queue. The latter can be avoided by incorporating this logic into the former. Modification - When scheduling or cancelling a future task from outside the event loop, enqueue the task itself rather than wrapping in a Runnable - Have ScheduledFutureTask#run first verify the task's deadline has passed and if not add or remove it from the scheduledTaskQueue depending on its cancellation state - Add new outside-event-loop benchmarks to ScheduleFutureTaskBenchmark Result Fewer allocations when scheduling/cancelling future tasks	2019-11-04 11:57:53 +01:00
root	844b82b986	[maven-release-plugin] prepare for next development iteration	2019-10-24 12:57:00 +00:00
root	d066f163d7	[maven-release-plugin] prepare release netty-4.1.43.Final	2019-10-24 12:56:30 +00:00
康智冬	bd8cea644a	Fix typos in javadocs (#9527 ) Motivation: We should have correct docs without typos Modification: Fix typos and spelling Result: More correct docs	2019-10-09 17:12:52 +04:00
root	92941cdcac	[maven-release-plugin] prepare for next development iteration	2019-09-25 06:15:31 +00:00
root	bd907c3b3a	[maven-release-plugin] prepare release netty-4.1.42.Final	2019-09-25 06:14:31 +00:00
Nick Hill	2791f0fefa	Avoid use of global AtomicLong for ScheduledFutureTask ids (#9599 ) Motivation Currently a static AtomicLong is used to allocate a unique id whenever a task is scheduled to any event loop. This could be a source of contention if delayed tasks are scheduled at a high frequency and can be easily avoided by having a non-volatile id counter per queue. Modifications - Replace static AtomicLong ScheduledFutureTask#nextTaskId with a long field in AbstractScheduledExecutorService - Set ScheduledFutureTask#id based on this when adding the task to the queue (in event loop) instead of at construction time - Add simple benchmark Result Less contention / cache-miss possibility when scheduling future tasks Before: Benchmark (num) Mode Cnt Score Error Units scheduleLots 100000 thrpt 20 346.008 ± 21.931 ops/s After: Benchmark (num) Mode Cnt Score Error Units scheduleLots 100000 thrpt 20 654.824 ± 22.064 ops/s	2019-09-25 07:34:25 +02:00
root	01d805bb76	[maven-release-plugin] prepare for next development iteration	2019-09-12 16:09:55 +00:00
root	7cf69022d4	[maven-release-plugin] prepare release netty-4.1.41.Final	2019-09-12 16:09:00 +00:00
root	aef47bec7f	[maven-release-plugin] prepare for next development iteration	2019-09-12 05:38:11 +00:00
root	267e5da481	[maven-release-plugin] prepare release netty-4.1.40.Final	2019-09-12 05:37:30 +00:00
root	d45a4ce01b	[maven-release-plugin] prepare for next development iteration	2019-08-13 17:16:42 +00:00
root	88c2a4cab5	[maven-release-plugin] prepare release netty-4.1.39.Final	2019-08-13 17:15:20 +00:00
root	718b7626e6	[maven-release-plugin] prepare for next development iteration	2019-07-24 09:05:57 +00:00
root	465c900c04	[maven-release-plugin] prepare release netty-4.1.38.Final	2019-07-24 09:05:23 +00:00
jingene	c0f9364870	Change the netty.io homepage scheme (http -> https) (#9344 ) Motivation: The Netty homepage (netty.io) serves both "http" and "https". It's recommended to use https rather than http. Modification: I changed "http://netty.io" to "https://netty.io" Result: No effects.	2019-07-09 21:09:42 +02:00
Norman Maurer	6da809dc11	Increase maxHeaderListSize for HpackDecoderBenchmark to be able to be… (#9321 ) Motivation: The previously used maxHeaderListSize was too low which resulted in exceptions during the benchmark run: ``` io.netty.handler.codec.http2.Http2Exception: Header size exceeded max allowed size (8192) at io.netty.handler.codec.http2.Http2Exception.connectionError(Http2Exception.java:103) at io.netty.handler.codec.http2.Http2Exception.headerListSizeError(Http2Exception.java:188) at io.netty.handler.codec.http2.Http2CodecUtil.headerListSizeExceeded(Http2CodecUtil.java:231) at io.netty.handler.codec.http2.HpackDecoder$Http2HeadersSink.finish(HpackDecoder.java:545) at io.netty.handler.codec.http2.HpackDecoder.decode(HpackDecoder.java:132) at io.netty.handler.codec.http2.HpackDecoderBenchmark.decode(HpackDecoderBenchmark.java:85) at io.netty.handler.codec.http2.generated.HpackDecoderBenchmark_decode_jmhTest.decode_thrpt_jmhStub(HpackDecoderBenchmark_decode_jmhTest.java:120) at io.netty.handler.codec.http2.generated.HpackDecoderBenchmark_decode_jmhTest.decode_Throughput(HpackDecoderBenchmark_decode_jmhTest.java:83) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.openjdk.jmh.runner.BenchmarkHandler$BenchmarkTask.call(BenchmarkHandler.java:453) at org.openjdk.jmh.runner.BenchmarkHandler$BenchmarkTask.call(BenchmarkHandler.java:437) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) at java.lang.Thread.run(Thread.java:748) ``` Also we should ensure we only use ascii for header names. Modifications: Just use Integer.MAX_VALUE as limit Result: Be able to run benchmark without exceptions	2019-07-04 11:24:13 +02:00
Carl Mastrangelo	ff0045e3e1	Use Table lookup for HPACK decoder (#9307 ) Motivation: Table based decoding is fast. Modification: Use table based decoding in HPACK decoder, inspired by https://github.com/python-hyper/hpack/blob/master/hpack/huffman_table.py This modifies the table to be based on integers, rather than 3-tuples of bytes. This is for two reasons: 1. It's faster 2. Using bytes makes the static initializer too big, and doesn't compile. Result: Faster Huffman decoding. This only seems to help the ascii case, the other decoding is about the same. Benchmarks: ``` Before: Benchmark (limitToAscii) (sensitive) (size) Mode Cnt Score Error Units HpackDecoderBenchmark.decode true true SMALL thrpt 20 426293.636 ± 1444.843 ops/s HpackDecoderBenchmark.decode true true MEDIUM thrpt 20 57843.738 ± 725.704 ops/s HpackDecoderBenchmark.decode true true LARGE thrpt 20 3002.412 ± 16.998 ops/s HpackDecoderBenchmark.decode true false SMALL thrpt 20 412339.400 ± 1128.394 ops/s HpackDecoderBenchmark.decode true false MEDIUM thrpt 20 58226.870 ± 199.591 ops/s HpackDecoderBenchmark.decode true false LARGE thrpt 20 3044.256 ± 10.675 ops/s HpackDecoderBenchmark.decode false true SMALL thrpt 20 2082615.030 ± 5929.726 ops/s HpackDecoderBenchmark.decode false true MEDIUM thrpt 10 571640.454 ± 26499.229 ops/s HpackDecoderBenchmark.decode false true LARGE thrpt 20 92714.555 ± 2292.222 ops/s HpackDecoderBenchmark.decode false false SMALL thrpt 20 1745872.421 ± 6788.840 ops/s HpackDecoderBenchmark.decode false false MEDIUM thrpt 20 490420.323 ± 2455.431 ops/s HpackDecoderBenchmark.decode false false LARGE thrpt 20 84536.200 ± 398.714 ops/s After(bytes): Benchmark (limitToAscii) (sensitive) (size) Mode Cnt Score Error Units HpackDecoderBenchmark.decode true true SMALL thrpt 20 472649.148 ± 7122.461 ops/s HpackDecoderBenchmark.decode true true MEDIUM thrpt 20 66739.638 ± 341.607 ops/s HpackDecoderBenchmark.decode true true LARGE thrpt 20 3139.773 ± 24.491 ops/s HpackDecoderBenchmark.decode true false SMALL thrpt 20 466933.833 ± 4514.971 ops/s HpackDecoderBenchmark.decode true false MEDIUM thrpt 20 66111.778 ± 568.326 ops/s HpackDecoderBenchmark.decode true false LARGE thrpt 20 3143.619 ± 3.332 ops/s HpackDecoderBenchmark.decode false true SMALL thrpt 20 2109995.177 ± 6203.143 ops/s HpackDecoderBenchmark.decode false true MEDIUM thrpt 20 586026.055 ± 1578.550 ops/s HpackDecoderBenchmark.decode false false SMALL thrpt 20 1775723.270 ± 4932.057 ops/s HpackDecoderBenchmark.decode false false MEDIUM thrpt 20 493316.467 ± 1453.037 ops/s HpackDecoderBenchmark.decode false false LARGE thrpt 10 85726.219 ± 402.573 ops/s After(ints): Benchmark (limitToAscii) (sensitive) (size) Mode Cnt Score Error Units HpackDecoderBenchmark.decode true true SMALL thrpt 20 615549.006 ± 5282.283 ops/s HpackDecoderBenchmark.decode true true MEDIUM thrpt 20 86714.630 ± 654.489 ops/s HpackDecoderBenchmark.decode true true LARGE thrpt 20 3984.439 ± 61.612 ops/s HpackDecoderBenchmark.decode true false SMALL thrpt 20 602489.337 ± 5397.024 ops/s HpackDecoderBenchmark.decode true false MEDIUM thrpt 20 88399.109 ± 241.115 ops/s HpackDecoderBenchmark.decode true false LARGE thrpt 20 3875.729 ± 103.057 ops/s HpackDecoderBenchmark.decode false true SMALL thrpt 20 2092165.454 ± 11918.859 ops/s HpackDecoderBenchmark.decode false true MEDIUM thrpt 20 583465.437 ± 5452.115 ops/s HpackDecoderBenchmark.decode false true LARGE thrpt 20 93290.061 ± 665.904 ops/s HpackDecoderBenchmark.decode false false SMALL thrpt 20 1758402.495 ± 14677.438 ops/s HpackDecoderBenchmark.decode false false MEDIUM thrpt 10 491598.099 ± 5029.698 ops/s HpackDecoderBenchmark.decode false false LARGE thrpt 20 85834.290 ± 554.915 ops/s ```	2019-07-02 20:09:44 +02:00
root	5b58b8e6b5	[maven-release-plugin] prepare for next development iteration	2019-06-28 05:57:21 +00:00
root	35e0843376	[maven-release-plugin] prepare release netty-4.1.37.Final	2019-06-28 05:56:28 +00:00
jimin	856f1185e1	All override methods must be annotated with @Override (#9285 ) Motivation: Some methods that either override others or implement an interface method were missing the `@Override` annotation Modifications: Add missing `@Override`s Result: Code cleanup	2019-06-27 13:51:26 +02:00
Alex Blewitt	52169cba95	Replace accumulation with blackhole.consume (#9275 ) Motivation: SpotJMHBugs reports that accumulating a value as a way of eliding dead code elimination may be inadvisable, as discussed in `JMHSample_34_SafeLooping::measureWrong_2`. Change the test so that it consumes the response with `Blackhole::consume` instead. Modifications: - Replace addition of results with explicit `blackhole.consume()` call Result: Tests work as before, but with different benchmark numbers.	2019-06-25 21:47:07 +02:00
Francesco Nigro	672fa0c779	Documented non-usage of Blackhole::consume on ByteBufAccessBenchmark (#9279 ) Motivation: Some JMH benchmarks need additional explanations to motivate specific code choices. Modifications: Introduced a comment to explain why calling Blackhole::consume in a loop is not always the right choice for some benchmarks. Result: The relevant method shows a comment that warns about changing the code to introduce Blackhole::consume in the loop.	2019-06-25 14:52:21 +02:00
Alex Blewitt	430eeee2f6	Return the result of the list.recycle() call (#9264 ) Motivation: Resolve the issue highlighted by SpotJMHBugs that the creation of the RecyclableArrayList may be elided by the JIT since the result isn't consumed or returned. Modifications: Return the result of `list.recycle()` so that the list isn't elided. Result: The JMH benchmark shows a change in performance indicating that the prior results of this may be unsound.	2019-06-22 07:22:15 +02:00
Carl Mastrangelo	9abeaf16fd	Properly debounce wakeups (#9191 ) Motivation: The wakeup logic in EpollEventLoop is overly complex Modification: * Simplify the race to wakeup the loop * Don't let the event loop wake up itself (it's already awake!) * Make event loop check if there are any more tasks after preparing to sleep. There is a small window where the non-eventloop writers can issue eventfd writes here, but that is okay. Result: Cleaner wakeup logic. Benchmarks: ``` BEFORE Benchmark Mode Cnt Score Error Units EpollSocketChannelBenchmark.executeMulti thrpt 20 408381.411 ± 2857.498 ops/s EpollSocketChannelBenchmark.executeSingle thrpt 20 157022.360 ± 1240.573 ops/s EpollSocketChannelBenchmark.pingPong thrpt 20 60571.704 ± 331.125 ops/s AFTER Benchmark Mode Cnt Score Error Units EpollSocketChannelBenchmark.executeMulti thrpt 20 440546.953 ± 1652.823 ops/s EpollSocketChannelBenchmark.executeSingle thrpt 20 168114.751 ± 1176.609 ops/s EpollSocketChannelBenchmark.pingPong thrpt 20 61231.878 ± 520.108 ops/s ```	2019-06-04 05:17:23 -07:00
Nick Hill	2ca526fac6	Ensure "full" ownership of msgs passed to EmbeddedChannel.writeInbound() (#9058 ) Motivation Pipeline handlers are free to "take control" of input buffers if they have singular refcount - in particular to mutate their raw data if non-readonly via discarding of read bytes, etc. However there are various places (primarily unit tests) where a wrapped byte-array buffer is passed in and the wrapped array is assumed not to change (used after the wrapped buffer is passed to EmbeddedChannel.writeInbound()). This invalid assumption could result in unexpected errors, such as those exposed by #8931. Modifications Anywhere that the data passed to writeInbound() might be used again, ensure that either: - A copy is used rather than wrapping a shared byte array, or - The buffer is otherwise protected from modification by making it read-only For the tests, copying is preferred since it still allows the "mutating" optimizations to be exercised. Results Avoid possible errors when pipeline assumes it has full control of input buffer.	2019-05-22 12:08:49 +02:00
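
Several commits in this list (MqttMessageType::valueOf, decodeHexNibble, the HPACK decoder) replace a linear scan or branching with a constant lookup table. The following is a minimal sketch of that general technique only, not the code from any of those commits: the enum MessageKind, its constants and codes, and the method fromCode are hypothetical names chosen for illustration.

```java
/** Hypothetical enum for illustration; not Netty's MqttMessageType. */
enum MessageKind {
    CONNECT(1),
    PUBLISH(3),
    SUBSCRIBE(8),
    DISCONNECT(14);

    // Table indexed by numeric code; slots with no matching constant stay null.
    private static final MessageKind[] BY_CODE;

    static {
        // Size the table from the largest code, then fill it once at class init.
        int max = 0;
        for (MessageKind kind : values()) {
            max = Math.max(max, kind.code);
        }
        BY_CODE = new MessageKind[max + 1];
        for (MessageKind kind : values()) {
            BY_CODE[kind.code] = kind;
        }
    }

    private final int code;

    MessageKind(int code) {
        this.code = code;
    }

    int code() {
        return code;
    }

    /** Single bounds check plus array read, instead of scanning values(). */
    static MessageKind fromCode(int code) {
        if (code < 0 || code >= BY_CODE.length || BY_CODE[code] == null) {
            throw new IllegalArgumentException("unknown message kind: " + code);
        }
        return BY_CODE[code];
    }
}
```

The table costs one array slot per possible code, so this trade-off is mainly attractive when the codes are small and reasonably dense, as assumed here.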


382 Commits