netty5

Author	SHA1	Message	Date
Norman Maurer	5a43de10f7	[maven-release-plugin] prepare release netty-4.1.0.Beta7	2015-10-02 09:02:58 +02:00
Scott Mitchell	d4680c55d8	AsciiString contains utility methods Motivation: When dealing with case insensitive headers it can be useful to have a case insensitive contains method for CharSequence. Modifications: - Add containsCaseInsensative to AsciiString Result: More expressive utility method for case insensitive CharSequence.	2015-10-02 12:50:11 -07:00
Scott Mitchell	284e3702d8	Http2ConnectionHandler Builder instead of constructors Motivation: Using the builder pattern for Http2ConnectionHandler (and subclasses) would be advantageous for the following reasons: 1. Provides the consistent construction afforded by the builder pattern for 'optional' arguments. Users can specify these options 1 time in the builder and then re-use the builder after this. 2. Enforces that the Http2ConnectionHandler's internals (decoder Http2FrameListener) are initialized after construction. Modifications: - Add an extensible builder which can be used to build Http2ConnectionHandler objects - Update classes which inherit from Http2ConnectionHandler Result: It is easier to specify options and construct Http2ConnectionHandler objects.	2015-10-01 13:51:03 -07:00
Scott Mitchell	1485a87e25	Http2ConnectionHandler and Http2FrameListener cyclic dependency Motivation: It is often the case that implementations of Http2FrameListener will want to send responses when data is read. The Http2FrameListener needs access to the Http2ConnectionHandler (or the encoder contained within) to be able to send responses. However the Http2ConnectionHandler requires a Http2FrameListener instance to be passed in during construction time. This creates a cyclic dependency which can make it difficult to cleanly accomplish this relationship. Modifications: - Add Http2ConnectionDecoder.frameListener(..) method to set the frame listener. This will allow the listener to be set after construction. Result: Classes which inherit from Http2ConnectionHandler can more cleanly set the Http2FrameListener.	2015-09-30 15:41:15 -07:00
Scott Mitchell	0e9545e94d	Http2RemoteFlowController stream writability listener Motivation: For implementations that want to manage flow control down to the stream level it is useful to be notified when stream writability changes. Modifications: - Add writabilityChanged to Http2RemoteFlowController.Listener - Add isWritable to Http2RemoteFlowController Result: The Http2RemoteFlowController provides notification when writability of a stream changes.	2015-09-28 13:47:24 -07:00
nmittler	3ee44a3dbb	Update Netty to latest netty-tcnative Motivation: The latest netty-tcnative fixes a bug in determining the version of the runtime openssl lib. It also publishes an artifact with the classifier linux-<arch>-fedora for fedora-based systems. Modifications: Modified the build files to use the "-fedora" classifier when appropriate for tcnative. Care is taken, however, to not change the classifier for the native epoll transport. Result: Netty is updated to the new shiny netty-tcnative.	2015-09-18 12:07:21 -07:00
Norman Maurer	34de2667c7	[maven-release-plugin] prepare for next development iteration	2015-09-02 11:45:20 +02:00
Norman Maurer	2eb444ec1d	[maven-release-plugin] prepare release netty-4.1.0.Beta6	2015-09-02 11:36:11 +02:00
Scott Mitchell	ba6ce5449e	Headers Performance Boost and Interface Simplification Motivation: A degradation in performance has been observed from the 4.0 branch as documented in https://github.com/netty/netty/issues/3962. Modifications: - Simplify Headers class hierarchy. - Restore the DefaultHeaders to be based upon DefaultHttpHeaders from 4.0. - Make various other modifications that are causing hot spots. Result: Performance is now on par with 4.0.	2015-08-17 08:50:11 -07:00
Jakob Buchgraber	6fd0a0c55f	Faster and more memory efficient headers for HTTP, HTTP/2, STOMP and SPDY. Fixes #3600 Motivation: We noticed that the headers implementation in Netty for HTTP/2 uses quite a lot of memory and that also at least the performance of randomly accessing a header is quite poor. The main concern however was memory usage, as profiling has shown that a DefaultHttp2Headers not only uses a lot of memory, it also wastes a lot due to the underlying hashmaps having to be resized potentially several times as new headers are being inserted. This is tracked as issue #3600. Modifications: We redesigned the DefaultHeaders to simply take a Map object in its constructor and reimplemented the class using only the Map primitives. That way the implementation is very concise and hopefully easy to understand and it allows each concrete headers implementation to provide its own map or to even use a different headers implementation for processing requests and writing responses i.e. incoming headers need to provide fast random access while outgoing headers need fast insertion and fast iteration. The new implementation can support this with hardly any code changes. It also comes with the advantage that if the Netty project decides to add a third party collections library as a dependency, one can simply plug in one of those very fast and memory efficient map implementations and get faster and smaller headers for free. For now, we are using the JDK's TreeMap for HTTP and HTTP/2 default headers. Result: - Significantly fewer lines of code in the implementation. While the total commit is still roughly 400 lines less, the actual implementation is a lot less. I just added some more tests and microbenchmarks. - Overall performance is up. The current implementation should be significantly faster for insertion and retrieval. However, it is slower when it comes to iteration. 
There is simply no way a TreeMap can have the same iteration performance as a linked list (as used in the current headers implementation). That's totally fine though, because when looking at the benchmark results @ejona86 pointed out that the performance of the headers is completely dominated by insertion, that is insertion is so significantly faster in the new implementation that it does make up for several times the iteration speed. You can't iterate what you haven't inserted. I am demonstrating that in this spreadsheet [1]. (Actually, iteration performance is only down for HTTP, it's significantly improved for HTTP/2). - Memory is down. The implementation with TreeMap uses on avg ~30% less memory. It also does not produce any garbage while being resized. In load tests for GRPC we have seen a memory reduction of up to 1.2KB per RPC. I summarized the memory improvements in this spreadsheet [1]. The data was generated by [2] using JOL. - While it was my original intent to only improve the memory usage for HTTP/2, it should be similarly improved for HTTP, SPDY and STOMP as they all share a common implementation. [1] https://docs.google.com/spreadsheets/d/1ck3RQklyzEcCLlyJoqDXPCWRGVUuS-ArZf0etSXLVDQ/edit#gid=0 [2] https://gist.github.com/buchgr/4458a8bdb51dd58c82b4	2015-08-04 17:12:24 -07:00
Scott Mitchell	a7713069a1	HttpObjectDecoder performance improvements Motivation: The HttpObjectDecoder is on the hot code path for the http codec. There are a few hot methods which can be modified to improve performance. Modifications: - Modify AppendableCharSequence to provide unsafe methods which don't need to re-check bounds for every call. - Update HttpObjectDecoder methods to take advantage of new AppendableCharSequence methods. Result: Performance boost for decoding http objects.	2015-07-29 23:26:26 -07:00
Scott Mitchell	9747ffe5fc	HTTP/2 Flow Controller should use Channel.isWritable() Motivation: See #3783 Modifications: - The DefaultHttp2RemoteFlowController should use Channel.isWritable() before attempting to do any write operations. - The Flow controller methods should no longer take ChannelHandlerContext. The concept of flow control is tied to a connection and we do not support 1 flow controller keeping track of multiple ChannelHandlerContext. Result: Writes are delayed until isWritable() is true. Flow controller interface methods are more clear as to ChannelHandlerContext restrictions.	2015-07-16 14:38:48 -07:00
Louis Ryan	05ce33f5ca	Make the flow-controllers write fewer, fatter frames to improve throughput. Motivation: Coalescing many small writes into a larger DATA frame reduces framing overheads on the wire and reduces the number of calls to Http2FrameListeners on the remote side. Delaying the write of WINDOW_UPDATE until flush allows for more consumed bytes to be returned as the aggregate of consumed bytes is returned and not the amount consumed when the threshold was crossed. Modifications: - Remote flow controller no longer immediately writes bytes when a flow-controlled payload is enqueued. Sequential data payloads are now merged into a single CompositeByteBuf which are written when 'writePendingBytes' is called. - Listener added to remote flow-controller which observes written bytes per stream. - Local flow-controller no longer immediately writes WINDOW_UPDATE when the ratio threshold is crossed. Now an explicit call to 'writeWindowUpdates' triggers the WINDOW_UPDATE for all streams whose ratio is exceeded at that time. This results in fewer window updates being sent and more bytes being returned. - Http2ConnectionHandler.flush triggers 'writeWindowUpdates' on the local flow-controller followed by 'writePendingBytes' on the remote flow-controller so WINDOW_UPDATES precede DATA frames on the wire. Result: - Better throughput for writing many small DATA chunks followed by a flush, saving 9-bytes per coalesced frame. - Fewer WINDOW_UPDATES being written and more flow-control bytes returned to remote side more quickly, thereby improving throughput.	2015-06-19 15:20:31 -07:00
Norman Maurer	f23b7b4efd	[maven-release-plugin] prepare for next development iteration	2015-05-07 14:21:08 -04:00
Norman Maurer	871ce43b1f	[maven-release-plugin] prepare release netty-4.1.0.Beta5	2015-05-07 14:20:38 -04:00
Louis Ryan	a3cea186ce	Have Http2LocalFlowController.consumeBytes indicate whether a WINDOW_UPDATE was written	2015-05-04 13:22:18 -07:00
Scott Mitchell	f812180c2d	ByteString arrayOffset method Motivation: The ByteString class currently assumes the underlying array will be a complete representation of data. This is limiting as it does not allow a subsection of another array to be used. This forces copy operations to take place to compensate for the lack of API support. Modifications: - add arrayOffset method to ByteString - modify all ByteString and AsciiString methods that loop over or index into the underlying array to use this offset - update all code that uses ByteString.array to ensure it accounts for the offset - add unit tests to test the implementation respects the offset Result: ByteString and AsciiString can represent a sub region of a byte[].	2015-04-24 18:54:01 -07:00
nmittler	70a2608325	Optimizing user-defined stream properties. Motivation: Streams currently maintain a hash map of user-defined properties, which has been shown to add significant memory overhead as well as being a performance bottleneck for lookup of frequently used properties. Modifications: Modifying the connection/stream to use an array as the storage of user-defined properties, indexed by the class that identifies the index into the array where the property is stored. Result: Stream processing performance should be improved.	2015-04-23 12:41:14 -07:00
Scott Mitchell	b426fb1618	Compile error introduced in `ee9233d` Motivation: Commit `ee9233d` introduced a compile error in microbench. Modifications: Fix compile error. Result: Code now builds.	2015-04-22 16:23:39 -07:00
Scott Mitchell	541137cc93	HTTP/2 Flow Controller interface updates Motivation: Flow control is a required part of the HTTP/2 specification but it is currently structured more like an optional item. It must be accessed through the property map which is time consuming and does not represent its required nature. This access pattern does not give any insight into flow control outside of the codec (or flow controller implementation). Modifications: 1. Create a read only public interface for LocalFlowState and RemoteFlowState. 2. Add a LocalFlowState localFlowState(); and RemoteFlowState remoteFlowState(); to Http2Stream. Result: Flow control is now part of the Http2Stream interface. This clarifies its responsibility and logical relationship to other interfaces. The flow controller no longer must be acquired through a map lookup.	2015-04-20 20:02:02 -07:00
Scott Mitchell	2b8104c852	HTTP/2 Priority Tree Benchmark Motivation: There is no benchmark to measure the priority tree implementation performance. Modifications: Introduce a new benchmark which will populate the priority tree, and then shuffle parent/child links around. Result: A simple benchmark to get a baseline for the HTTP/2 codec's priority tree implementation.	2015-04-17 10:14:13 -07:00
Louis Ryan	f3fb77f4bc	Have microbenchmarks produce a deployable artifact. Fix some minor miscellaneous issues. Motivation: Allows for running benchmarks from built jars which is useful in development environments that only take released artifacts. Modifications: Move benchmarks into 'main' from 'test' Add @State annotations to benchmarks that are missing them Fix timing issue grabbing context during channel initialization Result: Users can run benchmarks more easily.	2015-04-17 10:04:26 -07:00
Jakob Buchgraber	c2de195f87	Improve performance of AsciiString.equals(Object). Motivation: The current implementation does byte by byte comparison, which we have seen can be a performance bottleneck when the AsciiString is used as the key in a Map. Modifications: Use sun.misc.Unsafe (on supporting platforms) to compare up to eight bytes at a time and get closer to the performance of String.equals(Object). Result: Significant improvement (2x - 6x) in performance over the current implementation. Benchmark (size) Mode Samples Score Score error Units i.n.m.i.PlatformDependentBenchmark.arraysBytesEqual 10 thrpt 10 118843477.518 2347259.347 ops/s i.n.m.i.PlatformDependentBenchmark.arraysBytesEqual 50 thrpt 10 43910319.773 198376.996 ops/s i.n.m.i.PlatformDependentBenchmark.arraysBytesEqual 100 thrpt 10 26339969.001 159599.252 ops/s i.n.m.i.PlatformDependentBenchmark.arraysBytesEqual 1000 thrpt 10 2873119.030 20779.056 ops/s i.n.m.i.PlatformDependentBenchmark.arraysBytesEqual 10000 thrpt 10 306370.450 1933.303 ops/s i.n.m.i.PlatformDependentBenchmark.arraysBytesEqual 100000 thrpt 10 25750.415 108.391 ops/s i.n.m.i.PlatformDependentBenchmark.unsafeBytesEqual 10 thrpt 10 248077563.510 635320.093 ops/s i.n.m.i.PlatformDependentBenchmark.unsafeBytesEqual 50 thrpt 10 128198943.138 614827.548 ops/s i.n.m.i.PlatformDependentBenchmark.unsafeBytesEqual 100 thrpt 10 86195621.349 1063959.307 ops/s i.n.m.i.PlatformDependentBenchmark.unsafeBytesEqual 1000 thrpt 10 16920264.598 61615.365 ops/s i.n.m.i.PlatformDependentBenchmark.unsafeBytesEqual 10000 thrpt 10 1687454.747 6367.602 ops/s i.n.m.i.PlatformDependentBenchmark.unsafeBytesEqual 100000 thrpt 10 153717.851 586.916 ops/s	2015-04-16 17:29:54 -07:00
Scott Mitchell	9a7a85dbe5	ByteString introduced as AsciiString super class Motivation: The usage and code within AsciiString has exceeded the original design scope for this class. Its usage as a binary string is confusing and on the verge of violating interface assumptions in some spots. Modifications: - ByteString will be created as a base class to AsciiString. All of the generic byte handling processing will live in ByteString and all the special character encoding will live in AsciiString. Results: The AsciiString interface will be clarified. Users of AsciiString can now be clear of the limitations the class imposes while users of the ByteString class don't have to live with those limitations.	2015-04-14 16:35:17 -07:00
nmittler	ab158a6ea4	Adding basic benchmarks for IntObjectHashMap Motivation: It needs to be fast :) Modifications: Added a simple benchmark to the microbench module. Result: Yay, benchmarks!	2015-04-13 12:24:19 -07:00
Scott Mitchell	cc7ee002dd	HTTP/2 Frame Writer Microbenchmark Fix Motivation: The Http2FrameWriterBenchmark JMH harness class name was not updated for the JVM arguments. The number of forks is 0 which means JMH will share a JVM with the benchmarks. Sharing the JVM may lead to less reliable benchmarking results and doesn't allow for the command line arguments to be applied for each benchmark. Modifications: - Update the JMH version from 0.9 to 1.7.1. Benchmarks wouldn't run on old version. - Increase the number of forks from 0 to 1. - Move allocation of the environment out of the static initializer and cleanup out of AfterClass, using the Setup and Teardown methods instead. The forked JVM would not shut down correctly otherwise (and wait for 30+ seconds before timing out). Result: Benchmarks that run as intended.	2015-04-13 10:59:39 -07:00
nmittler	6fbca14f8a	Cleaning up the initialization of Http2ConnectionHandler Motivation: It currently takes a builder for the encoder and decoder, which makes it difficult to decorate them. Modifications: Removed the builders from the interfaces entirely. Left the builder for the decoder impl but removed it from the encoder since its constructor only takes 2 parameters. Also added decorator base classes for the encoder and decoder and made the CompressorHttp2ConnectionEncoder extend the decorator. Result: Fixes #3530	2015-03-30 11:23:02 -07:00
Scott Mitchell	2bf592c50f	Backport of HTTP/2 Microbenchmark fail. Motivation: The backport of `a6c729bdf8` failed. Modifications: - Make sure the interfaces are correctly implemented when backporting. Result: Microbenchmark compiles and runs on 4.1 branch.	2015-03-28 18:41:09 -07:00
scottmitch	2dda917f27	Http2DefaultFrameWriter microbenchmark Motivation: A microbenchmark will be useful to get a baseline for performance. Modifications: - Introduce a new microbenchmark which tests the Http2DefaultFrameWriter. - Allow benchmarks to run without thread context switching between JMH and Netty. Result: Microbenchmark exists to test performance.	2015-03-27 13:10:57 -07:00
Norman Maurer	fce0989844	[maven-release-plugin] prepare for next development iteration	2015-03-03 02:06:47 -05:00
Norman Maurer	ca3b1bc4b7	[maven-release-plugin] prepare release netty-4.1.0.Beta4	2015-03-03 02:05:52 -05:00
Michael Nitschinger	1d344f488c	Fix ByteBufUtilBenchmark on utf8 encodings. Motivation ---------- The performance tests for utf8 also used the getBytes on ASCII, which is incorrect and also provides different performance numbers. Modifications ------------- Use CharsetUtil.UTF_8 instead of US_ASCII for the getBytes calls. Result ------ Accurate and semantically correct benchmarking results on utf8 comparisons.	2014-12-31 20:26:42 +09:00
Norman Maurer	fe796fc8ab	Provide helper methods in ByteBufUtil to write UTF-8/ASCII CharSequences. Related to [#909 ] Motivation: We expose no methods in ByteBuf to directly write a CharSequence into it. This forces the user to either convert the CharSequence to a byte array first or use CharsetEncoder. Both cases have some overhead and we can do a lot better for well known Charsets like UTF-8 and ASCII. Modifications: Add ByteBufUtil.writeAscii(...) and ByteBufUtil.writeUtf8(...) which can do the task in an optimized way. This is especially true if the passed in ByteBuf extends AbstractByteBuf which is true for all of our implementations which do not wrap another ByteBuf. Result: Writing an ASCII and UTF-8 CharSequence into an AbstractByteBuf is a lot faster than what the user could do by himself as we can make use of some package private methods and so eliminate reference and range checks. When the Charseq is not ASCII or UTF-8 we can still do a very good job and are on par in most of the cases with what the user would do. 
The following benchmark shows the improvements: Result: 2456866.966 ±(99.9%) 59066.370 ops/s [Average] Statistics: (min, avg, max) = (2297025.189, 2456866.966, 2586003.225), stdev = 78851.914 Confidence interval (99.9%): [2397800.596, 2515933.336] Benchmark Mode Samples Score Score error Units i.n.m.b.ByteBufUtilBenchmark.writeAscii thrpt 50 9398165.238 131503.098 ops/s i.n.m.b.ByteBufUtilBenchmark.writeAsciiString thrpt 50 9695177.968 176684.821 ops/s i.n.m.b.ByteBufUtilBenchmark.writeAsciiStringViaArray thrpt 50 4788597.415 83181.549 ops/s i.n.m.b.ByteBufUtilBenchmark.writeAsciiStringViaArrayWrapped thrpt 50 4722297.435 98984.491 ops/s i.n.m.b.ByteBufUtilBenchmark.writeAsciiStringWrapped thrpt 50 4028689.762 66192.505 ops/s i.n.m.b.ByteBufUtilBenchmark.writeAsciiViaArray thrpt 50 3234841.565 91308.009 ops/s i.n.m.b.ByteBufUtilBenchmark.writeAsciiViaArrayWrapped thrpt 50 3311387.474 39018.933 ops/s i.n.m.b.ByteBufUtilBenchmark.writeAsciiWrapped thrpt 50 3379764.250 66735.415 ops/s i.n.m.b.ByteBufUtilBenchmark.writeUtf8 thrpt 50 5671116.821 101760.081 ops/s i.n.m.b.ByteBufUtilBenchmark.writeUtf8String thrpt 50 5682733.440 111874.084 ops/s i.n.m.b.ByteBufUtilBenchmark.writeUtf8StringViaArray thrpt 50 3564548.995 55709.512 ops/s i.n.m.b.ByteBufUtilBenchmark.writeUtf8StringViaArrayWrapped thrpt 50 3621053.671 47632.820 ops/s i.n.m.b.ByteBufUtilBenchmark.writeUtf8StringWrapped thrpt 50 2634029.071 52304.876 ops/s i.n.m.b.ByteBufUtilBenchmark.writeUtf8ViaArray thrpt 50 3397049.332 57784.119 ops/s i.n.m.b.ByteBufUtilBenchmark.writeUtf8ViaArrayWrapped thrpt 50 3318685.262 35869.562 ops/s i.n.m.b.ByteBufUtilBenchmark.writeUtf8Wrapped thrpt 50 2473791.249 46423.114 ops/s Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1,387.417 sec - in io.netty.microbench.buffer.ByteBufUtilBenchmark Results : Tests run: 1, Failures: 0, Errors: 0, Skipped: 0 Results : Tests run: 1, Failures: 0, Errors: 0, Skipped: 0 The ViaArray benchmarks are basically doing a 
toString().getBytes(Charset) while the others are using ByteBufUtil.write*(...).	2014-12-26 15:58:18 +09:00
Idel Pivnitskiy	cff98fff51	Benchmark for HttpRequestDecoder	2014-11-12 14:29:15 +01:00
Scott Mitchell	7e65c09373	IPv6 address to string rfc5952 Motivation: The java implementations for Inet6Address.getHostName() do not follow the RFC 5952 (http://tools.ietf.org/html/rfc5952#section-4) for recommended string representation. This introduces inconsistencies when integrating with other technologies that do follow the RFC. Modifications: -NetUtil.java to have another public static method to convert InetAddress to string. Inet4Address will use the java InetAddress.getHostAddress() implementation and there will be new code to implement the RFC 5952 IPV6 string conversion. -New unit tests to test the new method Result: Netty provides a RFC 5952 compliant string conversion method for IPV6 addresses	2014-10-30 00:05:57 -04:00
Trustin Lee	b5f61d0de5	[maven-release-plugin] prepare for next development iteration	2014-08-16 03:27:42 +09:00
Trustin Lee	76ac3b21a5	[maven-release-plugin] prepare release netty-4.1.0.Beta3	2014-08-16 03:27:37 +09:00
Trustin Lee	b3c1904cc9	[maven-release-plugin] prepare for next development iteration	2014-08-15 09:31:03 +09:00
Trustin Lee	e013b2400f	[maven-release-plugin] prepare release netty-4.1.0.Beta2	2014-08-15 09:30:59 +09:00
Trustin Lee	e167b02d52	[maven-release-plugin] prepare for next development iteration	2014-07-04 17:26:02 +09:00
Trustin Lee	ba50cb829b	[maven-release-plugin] prepare release netty-4.1.0.Beta1	2014-07-04 17:25:54 +09:00
Trustin Lee	787663a644	[maven-release-plugin] rollback the release of netty-4.1.0.Beta1	2014-07-04 17:11:14 +09:00
Trustin Lee	83eae705e1	[maven-release-plugin] prepare release netty-4.1.0.Beta1	2014-07-04 17:02:17 +09:00
Trustin Lee	f67ac5e46d	Fix the inconsistencies between performance tests in ByteBufAllocatorBenchmark Motivation: default() tests are performing a test in a different way, and they must be the same as the other tests. Modification: Make sure default() tests are the same as the others Result: Easier to compare default and non-default allocators	2014-06-21 13:28:02 +09:00
Trustin Lee	085a61a310	Refactor FastThreadLocal to simplify TLV management Motivation: When Netty runs in a managed environment such as web application server, Netty needs to provide an explicit way to remove the thread-local variables it created to prevent class loader leaks. FastThreadLocal uses different execution paths for storing a thread-local variable depending on the type of the current thread. It increases the complexity of thread-local removal. Modifications: - Moved FastThreadLocal and FastThreadLocalThread out of the internal package so that a user can use it. - FastThreadLocal now keeps track of all thread local variables it has initialized, and calling FastThreadLocal.removeAll() will remove all thread-local variables of the caller thread. - Added FastThreadLocal.size() for diagnostics and tests - Introduce InternalThreadLocalMap which is a mixture of hard-wired thread local variable fields and extensible indexed variables - FastThreadLocal now uses InternalThreadLocalMap to implement a thread-local variable. - Added ThreadDeathWatcher.unwatch() so that PooledByteBufAllocator tells it to stop watching when its thread-local cache has been freed by FastThreadLocal.removeAll(). - Added FastThreadLocalTest to ensure that removeAll() works - Added microbenchmark for FastThreadLocal and JDK ThreadLocal - Upgraded to JMH 0.9 Result: - A user can remove all thread-local variables Netty created, as long as he or she did not exit from the current thread. (Note that there's no way to remove a thread-local variable from outside of the thread.) - FastThreadLocal exposes more useful operations such as isSet() because we always implement a thread local variable via InternalThreadLocalMap instead of falling back to JDK ThreadLocal. - FastThreadLocalBenchmark shows that this change improves the performance of FastThreadLocal even more.	2014-06-19 21:13:55 +09:00
belliottsmith	2a2a21ec59	Introduce FastThreadLocal which uses an EnumMap and a predefined fixed set of possible thread locals Motivation: Provide a faster ThreadLocal implementation Modification: Add a "FastThreadLocal" which uses an EnumMap and a predefined fixed set of possible thread locals (all of the static instances created by netty) that is around 10-20% faster than standard ThreadLocal in my benchmarks (and can be seen having an effect in the direct PooledByteBufAllocator benchmark that uses the DEFAULT ByteBufAllocator which uses this FastThreadLocal, as opposed to normal instantiations that do not, and in the new RecyclableArrayList benchmark); Result: Improved performance	2014-06-13 10:56:18 +02:00
Norman Maurer	61dbc353ca	[#2436 ] UnsafeByteBuf implementation should only invert bytes if ByteOrder differs from native ByteOrder Motivation: Our UnsafeByteBuf implementation always inverts bytes when the native ByteOrder is LITTLE_ENDIAN (this is true on intel), even when the user calls order(ByteOrder.LITTLE_ENDIAN). This is not optimal for performance reasons as the user should be able to set the ByteOrder to LITTLE_ENDIAN and so write bytes without the extra inverting. Modification: - Introduce a new special SwappedByteBuf (called UnsafeDirectSwappedByteBuf) that is used by all the Unsafe*ByteBuf implementations and allows writing without inverting the bytes. - Add benchmark - Upgrade jmh to 0.8 Result: The user is able to get the max performance even on servers that have ByteOrder.LITTLE_ENDIAN as their native ByteOrder.	2014-06-05 10:59:22 +02:00
Trustin Lee	0cc264b76b	More realistic ByteBuf allocation benchmark Motivation: Allocating a single buffer and releasing it repetitively for a benchmark will not involve the realistic execution path of the allocators. Modifications: Keep the last 8192 allocations and release them randomly. Result: We are now getting the result close to what we got with caliper.	2014-05-29 19:51:05 +09:00
Michael Nitschinger	7d62594cc6	Upgrade JMH to 0.4.1 and make use of @Params.	2014-02-23 16:39:39 +01:00
Michael Nitschinger	33197c7696	Update JMH to 0.3.2	2014-02-14 13:16:13 -08:00

137 Commits