netty5

Author	SHA1	Message	Date
Norman Maurer	310f31b392	Update to new checkstyle plugin (#8777 ) Motivation: We need to update to a new checkstyle plugin to allow the usage of lambdas. Modifications: - Update to new plugin version. - Fix checkstyle problems. Result: Be able to use checkstyle plugin which supports new Java syntax.	2019-01-24 16:24:19 +01:00
Norman Maurer	3d6e6136a9	Decouple EventLoop details from the IO handling for each transport to… (#8680 ) * Decouple EventLoop details from the IO handling for each transport to allow easy re-use of code and customization Motivation: As of today, extending EventLoop implementations to add custom logic / metrics / instrumentation is only possible in a very limited way, if at all. This is due to the fact that most implementations are final or even package-private. That said, even if these were public, the ability to do something useful with them would be very limited, as the IO processing and task processing are very tightly coupled. All of the mentioned things are a big pain point in netty 4.x and need improvement. Modifications: This changeset decouples the IO processing logic from the task processing logic for the main transports (NIO, Epoll, KQueue) by introducing the concept of an IoHandler. The IoHandler itself is responsible for waiting for IO readiness and processing these IO events. The execution of the IoHandler itself is done by the SingleThreadEventLoop as part of its EventLoop processing. This allows using the same EventLoopGroup (MultiThreadEventLoopGroup) for all the mentioned transports by just specifying a different IoHandlerFactory during construction. Besides this core API change, this changeset also makes it easy to extend SingleThreadEventExecutor / SingleThreadEventLoop to add custom logic to it, which can then be reused by all the transports. The ideas are very similar to what is provided by ScheduledThreadPoolExecutor (that is part of the JDK). This allows for example things like: * Adding instrumentation / metrics: * how many Channels are registered on a SingleThreadEventLoop * how many Channels were handled during the IO processing in an EventLoop run * how many tasks were handled during the last EventLoop / EventExecutor run * how many outstanding tasks we have ... ... 
* Implementing custom strategies for choosing the next EventExecutor / EventLoop to use based on these metrics. * Use different Promise / Future / ScheduledFuture implementations * decorate Runnable / Callables when submitted to the EventExecutor / EventLoop As a lot of functionalities are folded into the MultiThreadEventLoopGroup and SingleThreadEventLoopGroup this changeset also removes: * AbstractEventLoop * AbstractEventLoopGroup * EventExecutorChooser * EventExecutorChooserFactory * DefaultEventLoopGroup * DefaultEventExecutor * DefaultEventExecutorGroup Result: Fixes https://github.com/netty/netty/issues/8514 .	2019-01-23 08:32:05 +01:00
Dmitriy Dumanskiy	7b92ff2500	Java 8 migration. Remove ThreadLocalProvider and inline java.util.concurrent.ThreadLocalRandom.current() where necessary. (#8762 ) Motivation: Custom Netty ThreadLocalRandom and ThreadLocalRandomProvider classes are no longer needed and can be removed. Modification: Remove own ThreadLocalRandom Result: Less code to maintain	2019-01-22 20:14:28 +01:00
田欧	9d62deeb6f	Java 8 migration: Use diamond operator (#8749 ) Motivation: We can use the diamond operator these days. Modification: Use diamond operator whenever possible. Result: More modern code and less boiler-plate.	2019-01-22 16:07:26 +01:00
Norman Maurer	8fdf373557	Skip execution of ChannelHandler method if annotated with @Skip and just use the next handler in the pipeline. (#8723 ) Motivation: Invoking ChannelHandlers is not free and can result in some overhead when the ChannelPipeline becomes very long. This is especially true if most handlers will just forward the call to the next handler in the pipeline. When the user extends ChannelHandlerAdapter we can easily detect if we can just skip the handler and invoke the next handler in the pipeline directly. This reduces the overhead of dispatch but also reduces the call-stack depth in many cases. Modifications: Detect if we can skip the handler when walking the pipeline. Result: Reduce overhead for long pipelines. Benchmark (extraHandlers) Mode Cnt Score Error Units DefaultChannelPipelineBenchmark.propagateEventOld 4 thrpt 10 267313.031 ± 9131.140 ops/s DefaultChannelPipelineBenchmark.propagateEvent 4 thrpt 10 824825.673 ± 12727.594 ops/s	2019-01-22 08:58:58 +01:00
Norman Maurer	1fe931b6e2	Make it possible to use a wrapped EventLoop with a Channel (#8677 ) Motivation: Because of how we implemented the registration / deregistration of an EventLoop it was not possible to wrap an EventLoop implementation and use it with a Channel. Modification: - Introduce EventLoop.Unsafe which is responsible for the actual registration. - Move validation of EventLoop / Channel combo to the EventLoop - Add unit test that verifies that wrapping works Result: Be able to wrap an EventLoop and so add some extra functionality.	2019-01-17 09:17:51 +01:00
Norman Maurer	c10ccc5dec	Tighten contract between Channel and EventLoop by requiring the EventLoop on Channel construction. (#8587 ) Motivation: At the moment it’s possible to have a Channel in Netty that is not registered / assigned to an EventLoop until register(...) is called. This is suboptimal as if the Channel is not registered it is also not possible to do anything useful with a ChannelFuture that belongs to the Channel. We should think about whether we should have the EventLoop as a constructor argument of a Channel and have the register / deregister methods only have the effect of adding a Channel to KQueue/Epoll/... It is also currently possible to deregister a Channel from one EventLoop and register it with another EventLoop. This operation defeats the threading model assumptions that are widespread in Netty, and requires careful user level coordination to pull off without any concurrency issues. It is not a commonly used feature in practice, may be better handled by other means (e.g. client side load balancing), and therefore we propose removing this feature. Modifications: - Change all Channel implementations to require an EventLoop for construction ( + an EventLoopGroup for all ServerChannel implementations) - Remove all register(...) methods from EventLoopGroup - Add ChannelOutboundInvoker.register(...) which now basically means we want to register on the EventLoop for IO. - Change ChannelUnsafe.register(...) to not take an EventLoop as parameter (as the EventLoop is supplied on construction). - Change ChannelFactory to take an EventLoop to create new Channels and introduce ServerChannelFactory which takes an EventLoop and one EventLoopGroup to create new ServerChannel instances. - Add ServerChannel.childEventLoopGroup() - Ensure all operations on the accepted Channel are done in the EventLoop of the Channel in ServerBootstrap - Change unit tests for new behaviour Result: A Channel always has an EventLoop assigned which will never change during its life-time. 
This ensures we are always able to call any operation on the Channel once constructed (until the EventLoop is shut down). This also simplifies the logic in DefaultChannelPipeline a lot as we can always call handlerAdded / handlerRemoved directly without the need to wait for register() to happen. Also note that it's still possible to deregister a Channel and register it again. It's just not possible anymore to move from one EventLoop to another (which was not really safe anyway). Fixes https://github.com/netty/netty/issues/8513.	2019-01-14 20:11:13 +01:00
Norman Maurer	d9a6cf341c	Remove support for marking reader and writerIndex in ByteBuf to reduce overhead and complexity. (#8636 ) Motivation: ByteBuf supports “marker indexes”. The intended use case for these is that if a speculative operation (e.g. decode) is in progress, the user can “mark” an index and refer to it later if the operation isn’t successful (e.g. not enough data). However this is rarely used in practice, requires extra memory to maintain, and introduces complexity in the state management for derived/pooled buffer initialization, resizing, and other operations which may modify reader/writer indexes. Modifications: Remove support for marking and adjust testcases / code. Result: Fixes https://github.com/netty/netty/issues/8535.	2018-12-11 14:00:49 +01:00
Francesco Nigro	4c2b11633a	Adding an execute burst cost benchmark for Netty executors (#8594 ) Motivation: Netty executors don't yet have any means to be compared with each other or with the j.u.c. executors Modifications: A new benchmark measuring execute burst cost is being added Result: It's now possible to compare some of Netty executors with each other and with the j.u.c. executors	2018-12-04 15:46:48 +01:00
Norman Maurer	2c78dde749	Update version number to start working on Netty 5	2018-11-20 15:49:57 +01:00
Nick Hill	10539f4dc7	Streamline CompositeByteBuf internals (#8437 ) Motivation: CompositeByteBuf is a powerful and versatile abstraction, allowing for manipulation of large data without copying bytes. There is still a non-negligible cost to reading/writing however relative to "singular" ByteBufs, and this can be mostly eliminated with some rework of the internals. My use case is message modification/transformation while zero-copy proxying. For example replacing a string within a large message with one of a different length Modifications: - No longer slice added buffers and unwrap added slices - Components store target buf offset relative to position in composite buf - Less allocations, object footprint, pointer indirection, offset arithmetic - Use Component[] rather than ArrayList<Component> - Avoid pointer indirection and duplicate bounds check, more efficient backing array growth - Facilitates optimization when doing bulk-inserts - inserting n ByteBufs behind m is now O(m + n) instead of O(mn) - Avoid unnecessary casting and method call indirection via superclass - Eliminate some duplicate range/ref checks via non-checking versions of toComponentIndex and findComponent - Add simple fast-path for toComponentIndex(0); add racy cache of last-accessed Component to findComponent(int) - Override forEachByte0(...) and forEachByteDesc0(...) methods - Make use of RecyclableArrayList in nioBuffers(int, int) (in line with FasterCompositeByteBuf impl) - Modify addComponents0(boolean,int,Iterable) to use the Iterable directly rather than copy to an array first (and possibly to an ArrayList before that) - Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform repeated array insertions and avoid second loop for offset updates - Simplify other logic in various places, in particular the general pattern used where a sub-range is iterated over - Add benchmarks to demonstrate some improvements While refactoring I also came across a couple of clear bugs. 
They are fixed in these changes but I will open another PR with unit tests and fixes to the current version. Result: Much faster creation, manipulation, and access; many fewer allocations and smaller footprint. Benchmark results to follow.	2018-11-03 10:37:07 +01:00
root	3e7ddb36c7	[maven-release-plugin] prepare for next development iteration	2018-10-29 15:38:51 +00:00
root	9e50739601	[maven-release-plugin] prepare release netty-4.1.31.Final	2018-10-29 15:37:47 +00:00
Nick Hill	583d838f7c	Optimize AbstractByteBuf.getCharSequence() in US_ASCII case (#8392 ) * Optimize AbstractByteBuf.getCharSequence() in US_ASCII case Motivation: Inspired by https://github.com/netty/netty/pull/8388, I noticed this simple optimization to avoid char[] allocation (also suggested in a TODO here). Modifications: Return an AsciiString from AbstractByteBuf.getCharSequence() if requested charset is US_ASCII or ISO_8859_1 (latter thanks to @Scottmitch's suggestion). Also tweak unit tests not to require Strings and include a new benchmark to demonstrate the speedup. Result: Speed-up of AbstractByteBuf.getCharSequence() in ascii and iso 8859/1 cases	2018-10-26 15:32:38 -07:00
Norman Maurer	87ec2f882a	Reduce overhead by ByteBufUtil.decodeString(...) which is used by `AbstractByteBuf.toString(...)` and `AbstractByteBuf.getCharSequence(...)` (#8388 ) Motivation: Our current implementation that is used for toString(Charset) operations on AbstractByteBuf implementations is quite slow as it does a lot of unnecessary memory copies. We should just use new String(...) as it has a lot of optimizations to handle these cases. Modifications: Rewrite ByteBufUtil.decodeString(...) to use new String(...) Result: Less overhead for toString(Charset) operations. Benchmark (charsetName) (direct) (size) Mode Cnt Score Error Units ByteBufUtilDecodeStringBenchmark.decodeString US-ASCII false 8 thrpt 20 22401645.093 ± 4671452.479 ops/s ByteBufUtilDecodeStringBenchmark.decodeString US-ASCII false 64 thrpt 20 23678483.384 ± 3749164.446 ops/s ByteBufUtilDecodeStringBenchmark.decodeString US-ASCII true 8 thrpt 20 15731142.651 ± 3782931.591 ops/s ByteBufUtilDecodeStringBenchmark.decodeString US-ASCII true 64 thrpt 20 16244232.229 ± 1886259.658 ops/s ByteBufUtilDecodeStringBenchmark.decodeString UTF-8 false 8 thrpt 20 25983680.959 ± 5045782.289 ops/s ByteBufUtilDecodeStringBenchmark.decodeString UTF-8 false 64 thrpt 20 26235589.339 ± 2867004.950 ops/s ByteBufUtilDecodeStringBenchmark.decodeString UTF-8 true 8 thrpt 20 18499027.808 ± 4784684.268 ops/s ByteBufUtilDecodeStringBenchmark.decodeString UTF-8 true 64 thrpt 20 16825286.141 ± 1008712.342 ops/s ByteBufUtilDecodeStringBenchmark.decodeString UTF-16 false 8 thrpt 20 5789879.092 ± 1201786.359 ops/s ByteBufUtilDecodeStringBenchmark.decodeString UTF-16 false 64 thrpt 20 2173243.225 ± 417809.341 ops/s ByteBufUtilDecodeStringBenchmark.decodeString UTF-16 true 8 thrpt 20 5035583.011 ± 1001978.854 ops/s ByteBufUtilDecodeStringBenchmark.decodeString UTF-16 true 64 thrpt 20 2162345.301 ± 402410.408 ops/s ByteBufUtilDecodeStringBenchmark.decodeString ISO-8859-1 false 8 thrpt 20 30039052.376 ± 
6539111.622 ops/s ByteBufUtilDecodeStringBenchmark.decodeString ISO-8859-1 false 64 thrpt 20 31414163.515 ± 2096710.526 ops/s ByteBufUtilDecodeStringBenchmark.decodeString ISO-8859-1 true 8 thrpt 20 19538587.855 ± 4639115.572 ops/s ByteBufUtilDecodeStringBenchmark.decodeString ISO-8859-1 true 64 thrpt 20 19467839.722 ± 1672687.213 ops/s ByteBufUtilDecodeStringBenchmark.decodeStringOld US-ASCII false 8 thrpt 20 10787326.745 ± 1034197.864 ops/s ByteBufUtilDecodeStringBenchmark.decodeStringOld US-ASCII false 64 thrpt 20 7129801.930 ± 1363019.209 ops/s ByteBufUtilDecodeStringBenchmark.decodeStringOld US-ASCII true 8 thrpt 20 9002529.605 ± 2017642.445 ops/s ByteBufUtilDecodeStringBenchmark.decodeStringOld US-ASCII true 64 thrpt 20 3860192.352 ± 826218.738 ops/s ByteBufUtilDecodeStringBenchmark.decodeStringOld UTF-8 false 8 thrpt 20 10532838.027 ± 2151743.968 ops/s ByteBufUtilDecodeStringBenchmark.decodeStringOld UTF-8 false 64 thrpt 20 7185554.597 ± 1387685.785 ops/s ByteBufUtilDecodeStringBenchmark.decodeStringOld UTF-8 true 8 thrpt 20 7352253.316 ± 1333823.850 ops/s ByteBufUtilDecodeStringBenchmark.decodeStringOld UTF-8 true 64 thrpt 20 2825578.707 ± 349701.156 ops/s ByteBufUtilDecodeStringBenchmark.decodeStringOld UTF-16 false 8 thrpt 20 7277446.665 ± 1447034.346 ops/s ByteBufUtilDecodeStringBenchmark.decodeStringOld UTF-16 false 64 thrpt 20 2445929.579 ± 562816.641 ops/s ByteBufUtilDecodeStringBenchmark.decodeStringOld UTF-16 true 8 thrpt 20 6201174.401 ± 1236137.786 ops/s ByteBufUtilDecodeStringBenchmark.decodeStringOld UTF-16 true 64 thrpt 20 2310674.973 ± 525587.959 ops/s ByteBufUtilDecodeStringBenchmark.decodeStringOld ISO-8859-1 false 8 thrpt 20 11142625.392 ± 1680556.468 ops/s ByteBufUtilDecodeStringBenchmark.decodeStringOld ISO-8859-1 false 64 thrpt 20 8127116.405 ± 1128513.860 ops/s ByteBufUtilDecodeStringBenchmark.decodeStringOld ISO-8859-1 true 8 thrpt 20 9405751.952 ± 
2193324.806 ops/s ByteBufUtilDecodeStringBenchmark.decodeStringOld ISO-8859-1 true 64 thrpt 20 3943282.076 ± 737798.070 ops/s Benchmark result is saved to /home/norman/mainframer/netty/microbench/target/reports/performance/ByteBufUtilDecodeStringBenchmark.json Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1,030.173 sec - in io.netty.buffer.ByteBufUtilDecodeStringBenchmark [1030.460s][info ][gc,heap,exit ] Heap [1030.460s][info ][gc,heap,exit ] garbage-first heap total 516096K, used 257918K [0x0000000609a00000, 0x0000000800000000) [1030.460s][info ][gc,heap,exit ] region size 2048K, 127 young (260096K), 2 survivors (4096K) [1030.460s][info ][gc,heap,exit ] Metaspace used 17123K, capacity 17438K, committed 17792K, reserved 1064960K [1030.460s][info ][gc,heap,exit ] class space used 1709K, capacity 1827K, committed 1920K, reserved 1048576K	2018-10-19 14:00:13 +02:00
root	2d7cb47edd	[maven-release-plugin] prepare for next development iteration	2018-09-27 19:00:45 +00:00
root	3a9ac829d5	[maven-release-plugin] prepare release netty-4.1.30.Final	2018-09-27 18:56:12 +00:00
Norman Maurer	e542a2cf26	Use a non-volatile read for ensureAccessible() whenever possible to reduce overhead and allow better inlining. (#8266 ) Motivation: At the moment whenever ensureAccessible() is called in our ByteBuf implementations (which is basically on each operation) we will do a volatile read. That per se is not such a bad thing, but the problem here is that it will also reduce the optimizations that the compiler / JIT can do. For example, as these reads are volatile, it cannot eliminate multiple loads when inlining the methods of ByteBuf, which happens quite frequently because most of them are quite small and very hot. That is especially true for all the methods that act on primitives. It gets even worse as people often call a lot of these after each other in the same method or even use method chaining here. The idea of the change is basically to just use a non-volatile read for the ensureAccessible() check, as it's a best-effort implementation to detect acting on already released buffers anyway: even with a volatile read it could happen that the user will release it in another thread before we actually access the buffer after the reference check. Modifications: - Try to do a non-volatile read using sun.misc.Unsafe if we can use it. - Add a benchmark Result: Big performance win when multiple ByteBuf methods are called from a method. With the change: UnsafeByteBufBenchmark.setGetLongUnsafeByteBuf thrpt 20 281395842,128 ± 5050792,296 ops/s Before the change: UnsafeByteBufBenchmark.setGetLongUnsafeByteBuf thrpt 20 217419832,801 ± 5080579,030 ops/s	2018-09-07 07:47:02 +02:00
Norman Maurer	052c2fbefe	Update to jmh 1.2.1 (#8270 ) Motivation: We should use the latest jmh version which also supports -prof dtraceasm on MacOS. Modifications: Update to latest jmh version. Result: Better benchmark / profiling support on MacOS.	2018-09-06 22:31:52 +02:00
Norman Maurer	02d559e6a4	Remove flags when running benchmarks. (#8262 ) Motivation: Some of the flags we used are not supported anymore on more recent JDK versions. We should just remove all of them and only keep what we really need. This may also reflect better what people use in production. Modifications: Remove some flags when running the benchmarks. Result: Benchmarks also run with JDK11.	2018-09-05 19:05:02 +02:00
Norman Maurer	8635d88d4d	Allow to generate a jmh uber jar to run benchmarks easily from cmdline with different arguments. (#8264 ) Motivation: It is sometimes useful to be able to run benchmarks easily from the command line and pass different arguments / options here. We should support this. Modifications: Add the benchmark-jar profile which allows to generate such an "uber-jar" that can be used directly to run benchmarks as documented at http://openjdk.java.net/projects/code-tools/jmh/. Result: More flexible way to run benchmarks.	2018-09-05 18:28:35 +02:00
Carl Mastrangelo	379a56ca49	Add an Epoll benchmark Motivation: Optimizing the Epoll channel needs an objective measure of how fast it is. Modification: Add a simple, closed loop, ping-pong benchmark. Result: Benchmark can be used to measure #7816 Initial numbers: ``` Result "io.netty.microbench.channel.epoll.EpollSocketChannelBenchmark.pingPong": 22614.403 ±(99.9%) 797.263 ops/s [Average] (min, avg, max) = (21093.160, 22614.403, 24977.387), stdev = 918.130 CI (99.9%): [21817.140, 23411.666] (assumes normal distribution) Benchmark Mode Cnt Score Error Units EpollSocketChannelBenchmark.pingPong thrpt 20 22614.403 ± 797.263 ops/s ```	2018-09-04 10:15:15 +02:00
Francesco Nigro	c78be33443	Added configurable ByteBuf bounds checking (#7521 ) Motivation: The JVM isn't always able to hoist out/reduce bounds checking (due to ref counting operations etc.), hence making it configurable could improve performance for the most CPU-intensive use cases. Modifications: Each AbstractByteBuf bounds check is now guarded by a new static final configuration property similar to checkAccessible, i.e. io.netty.buffer.bytebuf.checkBounds. Result: Any user can disable ByteBuf bounds checking in order to get extra performance.	2018-09-03 20:33:47 +02:00
root	a580dc7585	[maven-release-plugin] prepare for next development iteration	2018-08-24 06:36:33 +00:00
root	3fc789e83f	[maven-release-plugin] prepare release netty-4.1.29.Final	2018-08-24 06:36:06 +00:00
root	fcb19cb589	[maven-release-plugin] prepare for next development iteration	2018-07-27 04:59:28 +00:00
root	ff785fbe39	[maven-release-plugin] prepare release netty-4.1.28.Final	2018-07-27 04:59:06 +00:00
root	b4dbdc2036	[maven-release-plugin] prepare for next development iteration	2018-07-11 15:37:40 +00:00
root	1c16519ac8	[maven-release-plugin] prepare release netty-4.1.27.Final	2018-07-11 15:37:21 +00:00
root	7bb9e7eafe	[maven-release-plugin] prepare for next development iteration	2018-07-10 05:21:24 +00:00
root	8ca5421bd2	[maven-release-plugin] prepare release netty-4.1.26.Final	2018-07-10 05:18:13 +00:00
Norman Maurer	83710cb2e1	Replace toArray(new T[size]) with toArray(new T[0]) to eliminate zero-out and allow the VM to optimize. (#8075 ) Motivation: Using toArray(new T[0]) is usually the faster approach these days. We should use it. See also https://shipilev.net/blog/2016/arrays-wisdom-ancients/#_conclusion. Modifications: Replace toArray(new T[size]) with toArray(new T[0]). Result: Faster code.	2018-06-29 07:56:04 +02:00
unknown	4a8d3a274c	Including the setup code in the benchmark method to avoid JMH Invocation level hiccups. Motivation: The usage of Invocation level for JMH fixture methods (setup/teardown) incurs a significant overhead in the benchmark time (see org.openjdk.jmh.annotations.Level documentation). In the case of CodecInputListBenchmark, benchmarks are far too small (less than 50ns) and the Invocation level setup offsets the measurement considerably. In such cases, the recommended fix is to include the setup/teardown code in the benchmark method. Modifications: Include the setup/teardown code in the relevant benchmark methods. Remove the setup/teardown methods from the benchmark class. Result: We ran the entire benchmark 10 times with default parameters and observed: - ArrayList benchmark affected directly by JMH overhead is now 15-80% faster. - CodecList benchmark is now 50% faster than original (even with the setup code being measured). - Recyclable ArrayList is ~30% slower. - All benchmarks have significantly different means (ANOVA) and medians (Moore) Mode: Throughput (Higher the better) Method Full params Factor Modified (Median) Original (Median) recyclableArrayList (elements = 1) 0.615520967 21719082.75 35285691.2 recyclableArrayList (elements = 4) 0.699553431 17149442.76 24514843.31 arrayList (elements = 4) 1.152666631 27120407.18 23528404.88 codecOutList (elements = 1) 1.527275908 67251089.04 44033359.47 codecOutList (elements = 4) 1.596917095 59174088.78 37055204.03 arrayList (elements = 1) 1.878616889 62188238.24 33103204.06 Environment: Tests run on a Computational server with CPU: E5-1660-3.3GHZ (6 cores + HT), 64 GB RAM.	2018-06-21 12:22:13 +02:00
unknown	cb420a9ffc	Including the setup code in the benchmark method to avoid JMH Invocation level hiccups. Motivation: The usage of Invocation level for JMH fixture methods (setup/teardown) incurs a significant impact on the benchmark time (see org.openjdk.jmh.annotations.Level documentation). When the benchmark and the setup/teardown are too small (less than a millisecond) the Invocation level might saturate the system with timestamp requests and iteration synchronizations which introduce artificial latency, throughput, and scalability bottlenecks. In the HeadersBenchmark, all benchmarks take less than 100ns and the Invocation level setup offsets the measurement considerably. As fixture methods are defined for the entire class, this overhead also impacts every single benchmark in this class, not only the ones that use the emptyHttpHeaders object (cleaned in the setup). The recommended fix here is to include the setup/teardown code in the benchmark where the object is used. Modifications: Include the setup/teardown code in the relevant benchmark methods. Remove the setup/teardown method of Invocation level from the benchmark class. Result: We ran all benchmarks from HeadersBenchmark 10 times with default parameters and observed: - Benchmarks that were not directly affected by the fix improved in execution time. For instance, http2Remove with (exampleHeader = THREE) had its median reported as 2x faster than the original version. - Benchmarks that had the setup code inserted (e.g. http2AddAllFastest) did not suffer a significant penalty in execution time, as the benchmarks are not dominated by the clear(). Environment: Tests run on a Computational server with CPU: E5-1660-3.3GHZ (6 cores + HT), 64 GB RAM.	2018-06-21 12:21:19 +02:00
Norman Maurer	64bb279f47	[maven-release-plugin] prepare for next development iteration	2018-05-14 11:11:45 +00:00
Norman Maurer	c67a3b0507	[maven-release-plugin] prepare release netty-4.1.25.Final	2018-05-14 11:11:24 +00:00
Norman Maurer	b75f44db9a	[maven-release-plugin] prepare for next development iteration	2018-04-19 11:56:07 +00:00
Norman Maurer	04fac00c8c	[maven-release-plugin] prepare release netty-4.1.24.Final	2018-04-19 11:55:47 +00:00
root	0a61f055f5	[maven-release-plugin] prepare for next development iteration	2018-04-04 10:44:46 +00:00
root	8c549bad38	[maven-release-plugin] prepare release netty-4.1.23.Final	2018-04-04 10:44:15 +00:00
Scott Mitchell	9d51a40df0	Update NetUtilBenchmark (#7826 ) Motivation: NetUtilBenchmark is using out of date data, throws an exception in the benchmark, and allocates a Set on each run. Modifications: - Update the benchmark and reduce each run's overhead Result: NetUtilBenchmark is updated.	2018-03-31 08:27:08 +02:00
Francesco Nigro	ed46c4ed00	Copies from read-only heap ByteBuffer to direct ByteBuf can avoid stealth ByteBuf allocation and additional copies Motivation: Read-only heap ByteBuffer doesn't expose array: the existent method to perform copies to direct ByteBuf involves the creation of a (maybe pooled) additional heap ByteBuf instance and copy Modifications: To avoid stressing the allocator with additional (and stealth) heap ByteBuf allocations is provided a method to perform copies using the (pooled) internal NIO buffer Result: Copies from read-only heap ByteBuffer to direct ByteBuf won't create any intermediate ByteBuf	2018-02-27 09:54:21 +09:00
Norman Maurer	69582c0b6c	[maven-release-plugin] prepare for next development iteration	2018-02-21 12:52:33 +00:00
Norman Maurer	786f35c6c9	[maven-release-plugin] prepare release netty-4.1.22.Final	2018-02-21 12:52:19 +00:00
Norman Maurer	e71fa1e7b6	[maven-release-plugin] prepare for next development iteration	2018-02-05 12:02:35 +00:00
Norman Maurer	41ebb5fcca	[maven-release-plugin] prepare release netty-4.1.21.Final	2018-02-05 12:02:19 +00:00
Julien Hoarau	3e6b54bb59	Fix failing h2spec tests 8.1.2.1 related to pseudo-headers validation Motivation: According to the spec: All pseudo-header fields MUST appear in the header block before regular header fields. Any request or response that contains a pseudo-header field that appears in a header block after a regular header field MUST be treated as malformed (Section 8.1.2.6). Pseudo-header fields are only valid in the context in which they are defined. Pseudo-header fields defined for requests MUST NOT appear in responses; pseudo-header fields defined for responses MUST NOT appear in requests. Pseudo-header fields MUST NOT appear in trailers. Endpoints MUST treat a request or response that contains undefined or invalid pseudo-header fields as malformed (Section 8.1.2.6). Clients MUST NOT accept a malformed response. Note that these requirements are intended to protect against several types of common attacks against HTTP; they are deliberately strict because being permissive can expose implementations to these vulnerabilities. Modifications: - Introduce validation in HPackDecoder Result: - Requests with unknown pseudo-header fields are rejected - Requests containing response-specific pseudo-headers are rejected - Requests where pseudo-headers appear after regular headers are rejected - h2spec 8.1.2.1 passes	2018-01-29 19:42:56 -08:00
Norman Maurer	4c1e0f596a	Use FastThreadLocal for CodecOutputList Motivation: We used Recycler for the CodecOutputList which is not optimized for the use-case of access only from the same Thread all the time. Modifications: - Use FastThreadLocal for CodecOutputList - Add benchmark Result: Less overhead in our codecs.	2018-01-23 11:34:28 +01:00
Norman Maurer	ea58dc7ac7	[maven-release-plugin] prepare for next development iteration	2018-01-21 12:53:51 +00:00
Norman Maurer	96c7132dee	[maven-release-plugin] prepare release netty-4.1.20.Final	2018-01-21 12:53:34 +00:00
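
Illustration for commit 83710cb2e1 (toArray(new T[size]) replaced with toArray(new T[0])): a minimal sketch of the two call styles, using only the JDK. The class and method names (ToArrayExample, presized, zeroLength) are hypothetical and are not taken from the Netty code base; the sketch only shows that both styles return the same contents, and makes no performance claim of its own.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class, for illustration only (not part of Netty).
final class ToArrayExample {

    // The older style: allocate an array of exactly list.size() elements up front.
    static String[] presized(List<String> list) {
        return list.toArray(new String[list.size()]);
    }

    // The style the commit switches to: pass a zero-length array and let
    // the collection allocate a correctly sized array of the same type.
    static String[] zeroLength(List<String> list) {
        return list.toArray(new String[0]);
    }

    public static void main(String[] args) {
        List<String> transports = new ArrayList<>(Arrays.asList("nio", "epoll", "kqueue"));
        // Both calls produce arrays with identical contents.
        System.out.println(Arrays.equals(presized(transports), zeroLength(transports)));
    }
}
```

Either form is correct; the commit message points to the linked shipilev.net article for why the zero-length form tends to be the faster one on current VMs.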

325 Commits
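
Illustration for commit 3e6b54bb59 (pseudo-header validation): a minimal, JDK-only sketch of the three request-side rules quoted in that commit message (pseudo-headers come first, response pseudo-headers are rejected in requests, unknown pseudo-headers are rejected). This is not the validation code added to HPackDecoder; the class PseudoHeaderCheck and its method are hypothetical, and the set of request pseudo-headers is the one defined by the HTTP/2 specification (:method, :scheme, :authority, :path).

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical validator, for illustration only (not Netty's HPackDecoder logic).
final class PseudoHeaderCheck {

    // Pseudo-headers that are valid in an HTTP/2 request.
    private static final Set<String> REQUEST_PSEUDO_HEADERS =
            new HashSet<>(Arrays.asList(":method", ":scheme", ":authority", ":path"));

    // Returns true if the header names, in wire order, satisfy the request rules.
    static boolean isValidRequestHeaderOrder(List<String> names) {
        boolean seenRegular = false;
        for (String name : names) {
            if (name.startsWith(":")) {
                if (seenRegular) {
                    return false; // pseudo-header after a regular header
                }
                if (!REQUEST_PSEUDO_HEADERS.contains(name)) {
                    return false; // unknown or response-only (e.g. ":status")
                }
            } else {
                seenRegular = true;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isValidRequestHeaderOrder(
                Arrays.asList(":method", ":path", "accept")));
        System.out.println(isValidRequestHeaderOrder(
                Arrays.asList("accept", ":path")));
    }
}
```

A real decoder would also have to cover responses and trailers, which the quoted spec text treats separately; the sketch leaves those out to keep the ordering rule visible.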