netty5

Author	SHA1	Message	Date
Norman Maurer	522b3e1b92	[#2604 ] Do not try to use sun.misc.Cleaner when on Android Motivation: When a user tries to use Netty on Android it currently fails with "Could not find class 'sun.misc.Cleaner'". Modification: Encapsulate sun.misc.Cleaner usage in an extra class to work around this issue. Result: Netty can be used on Android again	2014-06-27 08:18:21 +02:00
Zhihui Jiao	d062500ca0	Fix inconsistent code in the doc	2014-06-27 06:48:47 +02:00
Norman Maurer	8869f6d5c1	[#2598 ] Add Epoll.isAvailable() which allows checking if epoll can be used. Motivation: At the moment there is no simple way for a user to check if the native epoll transport can be used on the running platform. Thus the user can only try to instantiate it, catch any exception, and fall back to the NIO transport. Modification: Add Epoll.isAvailable() which allows checking if epoll can be used. Result: User can easily check if epoll transport can be used or not	2014-06-26 12:27:40 +02:00
Trustin Lee	a5097b937e	Fix incorrect bytesBefore/indexOf() in ReplayingDecoderBuffer Motivation: bytesBefore(length, ...), bytesBefore(index, length, ...), and indexOf(fromIndex, toIndex,...) in ReplayingDecoderBuffer are buggy. They trigger 'REPLAY' even when they don't need to. Modification: Implement the buggy methods properly so that REPLAYs are not triggered unnecessarily. Result: Correct behavior	2014-06-26 18:57:20 +09:00
Norman Maurer	d1c8bcb40f	[#2605 ] Use SO_REUSEADDR on EpollServerSocketChannel to match defaults of java.nio.ServerSocketChannel impl Motivation: When using openjdk and oracle jdk's nio (while using the nio transport) the ServerSocketChannel uses SO_REUSEADDR by default. Our native transport should do the same to make it easier to switch between the different implementations and get the expected result. Modification: Change EpollServerSocketChannelConfig to set SO_REUSEADDR on the created socket. Result: SO_REUSEADDR is used by default on servers.	2014-06-26 11:53:23 +02:00
Norman Maurer	658fdffbad	Reduce the memory copies in JdkZlibEncoder Motivation: At the moment we use a lot of unnecessary memory copies in JdkZlibEncoder. This is caused by either allocating a too-small ByteBuf and expanding it later or using a temporary byte array. Besides this, the memory footprint of JdkZlibEncoder is pretty high because of the byte[] used for compressing. Modification: - Override allocateBuffer(...) and calculate the estimated size in there, which reduces expanding of the ByteBuf later - Do not use a byte[] in the instance itself but allocate a heap ByteBuf and write directly into the byte array Result: Less memory copies and smaller memory footprint	2014-06-26 11:12:06 +02:00
Trustin Lee	d8d0bbfc26	Optimize PoolChunk - Using short[] for memoryMap did not improve performance. Reverting back to the original dual-byte[] structure in favor of simplicity. - Optimize allocateRun() which yields small performance improvement - Use local variable when member fields are accessed more than once	2014-06-26 17:06:29 +09:00
Trustin Lee	c41538050c	Fix inspector warnings	2014-06-26 17:06:29 +09:00
Pavan Kumar	d2e36a49c7	Improve the allocation algorithm in PoolChunk Motivation: Depth-first search is not always efficient for buddy allocation. Modification: Employ a new faster search algorithm with different memoryMap layout. Result: With thread-local cache disabled, we see a lot of performance improvement, especially when the size of the allocation is as small as the page size, which had the largest search space previously.	2014-06-26 17:06:29 +09:00
Norman Maurer	c3f24444ef	Use IntObjectMap to replace Map in EpollEventLoop. Motivation: We need to map from ints to AbstractEpollChannel in EpollEventLoop but there is no need to box to Integer. Modification: Replace Map with IntObjectMap. Result: No more auto-boxing needed.	2014-06-25 20:23:16 +02:00
Norman Maurer	b4b61c1f41	[#2599 ] Do not use sun.nio.ch.DirectBuffer as it does not exist on Android Motivation: During some refactoring we changed PlatformDependent0 to use sun.nio.ch.DirectBuffer to release direct buffers. This broke support for Android as the class does not exist there and so an exception is thrown. Modification: Use the field offset again to get access to Cleaner to release direct buffers. Result: Netty can be used on Android again	2014-06-25 15:01:57 +02:00
Alexey Parfenov	063ca10d87	Fix integer overflow in HttpObjectEncoder when handling chunked encoding and FileRegion > Integer.MAX_VALUE Motivation: Due to an integer overflow bug, writes of FileRegions to the http server pipeline (e.g. one from the HttpStaticFileServer example) with length greater than Integer.MAX_VALUE are ignored in 1/2 of cases (i.e. no data gets sent to the client) Modification: Correctly handle chunk sizes > Integer.MAX_VALUE Result: Be able to use FileRegion > Integer.MAX_VALUE when using chunked encoding.	2014-06-24 12:14:46 +02:00
Trustin Lee	4a13f66e13	Remove 'get' prefix from all HTTP/SPDY messages Motivation: Pursuit of consistency in method naming Modifications: - Remove the 'get' prefix from all HTTP/SPDY message classes - Fix some inspector warnings Result: Consistency Fixes #2594	2014-06-24 18:33:30 +09:00
onlychoice	30e22f5da3	Fix a typo in comment	2014-06-24 11:02:12 +02:00
Trustin Lee	7457801020	Deprecate SocksMessage.encodeAsByteBuf() It was an internal use only method which became public by a mistake during the review process.	2014-06-24 16:41:47 +09:00
Trustin Lee	71dce0193f	Rename fromByte() to valueOf() Motivation: Pursuit of consistency in method naming Modifications: Rename fromByte(byte) to valueOf(byte) Result: Consistency	2014-06-24 16:36:10 +09:00
Norman Maurer	790c63e8d2	Improve performance of Recycler Motivation: Recycler is used in many places to reduce GC-pressure but is still not as fast as possible because of the internal data structures used. Modification: - Rewrite Recycler to use a WeakOrderQueue which makes minimal guarantees about order and visibility for max performance. - Recycling of the same object multiple times without acquiring it will fail. - Introduce a RecyclableMpscLinkedQueueNode which can be used for MpscLinkedQueueNodes that use Recycler These changes are based on @belliottsmith 's work that was part of #2504. Result: Huge increase in performance. 4.0 branch without this commit: Benchmark (size) Mode Samples Score Score error Units i.n.m.i.RecyclableArrayListBenchmark.recycleSameThread 00000 thrpt 20 116026994.130 2763381.305 ops/s i.n.m.i.RecyclableArrayListBenchmark.recycleSameThread 00256 thrpt 20 110823170.627 3007221.464 ops/s i.n.m.i.RecyclableArrayListBenchmark.recycleSameThread 01024 thrpt 20 118290272.413 7143962.304 ops/s i.n.m.i.RecyclableArrayListBenchmark.recycleSameThread 04096 thrpt 20 120560396.523 6483323.228 ops/s i.n.m.i.RecyclableArrayListBenchmark.recycleSameThread 16384 thrpt 20 114726607.428 2960013.108 ops/s i.n.m.i.RecyclableArrayListBenchmark.recycleSameThread 65536 thrpt 20 119385917.899 3172913.684 ops/s Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 297.617 sec - in io.netty.microbench.internal.RecyclableArrayListBenchmark 4.0 branch with this commit: Benchmark (size) Mode Samples Score Score error Units i.n.m.i.RecyclableArrayListBenchmark.recycleSameThread 00000 thrpt 20 204158855.315 5031432.145 ops/s i.n.m.i.RecyclableArrayListBenchmark.recycleSameThread 00256 thrpt 20 205179685.861 1934137.841 ops/s i.n.m.i.RecyclableArrayListBenchmark.recycleSameThread 01024 thrpt 20 209906801.437 8007811.254 ops/s i.n.m.i.RecyclableArrayListBenchmark.recycleSameThread 04096 thrpt 20 214288320.053 6413126.689 ops/s i.n.m.i.RecyclableArrayListBenchmark.recycleSameThread 16384 thrpt 20 215940902.649 7837706.133 ops/s i.n.m.i.RecyclableArrayListBenchmark.recycleSameThread 65536 thrpt 20 211141994.206 5017868.542 ops/s Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 297.648 sec - in io.netty.microbench.internal.RecyclableArrayListBenchmark	2014-06-24 08:09:19 +02:00
Norman Maurer	22f16e52bf	MessageToByteEncoder always starts with a ByteBuf that uses initialCapacity == 0 Motivation: MessageToByteEncoder always starts with a ByteBuf that uses initialCapacity == 0 when preferDirect is used. This is really wasteful in terms of performance as every first write into the buffer will cause an expansion of the buffer itself. Modifications: - Change ByteBufAllocator.ioBuffer() to use the same default initialCapacity as heapBuffer() and directBuffer() - Add a new allocateBuffer method to MessageToByteEncoder that allows the user to do some smarter allocation based on the message that will be encoded. Result: Less expanding of the buffer and more flexibility when allocating the buffer for encoding.	2014-06-24 13:55:02 +09:00
Norman Maurer	f0e2d0b77c	Make use of HttpChunkedInput as this will also work when compression is used	2014-06-23 09:38:31 +02:00
Trustin Lee	7162d96ed5	Revert "Improve the allocation algorithm in PoolChunk" This reverts commit `36305d7dce`, which seems to cause an assertion failure on our CI machine.	2014-06-21 19:19:49 +09:00
Trustin Lee	a1b87411fb	Make sure OpenSslEngine is tested against transport-native-epoll	2014-06-21 18:29:00 +09:00
Trustin Lee	3775a6124e	Remove padding utility classes - It's not used anywhere	2014-06-21 17:59:20 +09:00
Trustin Lee	6834c2da51	Add missing last padding / Comment	2014-06-21 17:57:29 +09:00
Trustin Lee	b1a5ced729	Checkstyle / Overall clean-up / Fix serialization	2014-06-21 17:57:29 +09:00
nitsanw	a5d585a8ef	Fix false sharing between head and tail reference in MpscLinkedQueue Motivation: The tail node reference writes (by producer threads) are very likely to invalidate the cache line holding the headRef which is read by the consumer threads in order to access the padded reference to the head node. This is because the resulting layout for the object is: - header - Object AtomicReference.value -> Tail node - Object MpscLinkedQueue.headRef -> PaddedRef -> Head node This is 'passive' false sharing where one thread reads and the other writes. The current implementation suffers from further passive false sharing potential from any and all neighbours to the queue object as no pre/post padding is provided for the class fields. Modifications: Fix the memory layout by adding pre-post padding for the head node and putting the tail node reference in the same object. Result: Fixed false sharing	2014-06-21 17:57:29 +09:00
nmittler	fd895e53f4	Adding int-to-object map implementation. Motivation: Maps with integer keys are used in several places (HTTP/2 code, for example). To reduce the memory footprint of these structures, we need a specialized map class that uses ints as keys. Modifications: Added IntObjectHashMap, which uses open addressing and double hashing for collision resolution. Result: A new int-based map class that can be shared across Netty.	2014-06-21 08:36:06 +02:00
Trustin Lee	cb994dd926	Fix the inconsistencies between performance tests in ByteBufAllocatorBenchmark Motivation: default() tests are performing a test in a different way, and they must be same with other tests. Modification: Make sure default() tests are same with the others Result: Easier to compare default and non-default allocators	2014-06-21 13:28:11 +09:00
Pavan Kumar	004ffbad90	Improve the allocation algorithm in PoolChunk Motivation: Depth-first search is not always efficient for buddy allocation. Modification: Employ a new faster search algorithm with different memoryMap layout. Result: With thread-local cache disabled, we see a lot of performance improvement, especially when the size of the allocation is as small as the page size, which had the largest search space previously: -- master head -- Benchmark (size) Mode Score Error Units pooledDirectAllocAndFree 8192 thrpt 215.392 1.565 ops/ms pooledDirectAllocAndFree 16384 thrpt 594.625 2.154 ops/ms pooledDirectAllocAndFree 65536 thrpt 1221.520 18.965 ops/ms pooledHeapAllocAndFree 8192 thrpt 217.175 1.653 ops/ms pooledHeapAllocAndFree 16384 thrpt 587.250 14.827 ops/ms pooledHeapAllocAndFree 65536 thrpt 1217.023 44.963 ops/ms -- changes -- Benchmark (size) Mode Score Error Units pooledDirectAllocAndFree 8192 thrpt 3656.744 94.093 ops/ms pooledDirectAllocAndFree 16384 thrpt 4087.152 22.921 ops/ms pooledDirectAllocAndFree 65536 thrpt 4058.814 29.276 ops/ms pooledHeapAllocAndFree 8192 thrpt 3640.355 44.418 ops/ms pooledHeapAllocAndFree 16384 thrpt 4030.206 24.365 ops/ms pooledHeapAllocAndFree 65536 thrpt 4103.991 70.991 ops/ms	2014-06-21 13:20:56 +09:00
Norman Maurer	58f4b4b7d9	[#2589 ] LocalServerChannel.doClose() throws NPE when localAddress == null Motivation: LocalServerChannel.doClose() calls LocalChannelRegistry.unregister(localAddress); without checking if localAddress is null and so produces an NPE when null is passed to the underlying ConcurrentHashMapV8 Modification: Check for localAddress != null before trying to remove it from the Map. Also added a unit test which showed the stacktrace of the error. Result: No more NPE during doClose().	2014-06-20 20:07:00 +02:00
Norman Maurer	19a1b603d0	Remove System.out.println(...) debug messages	2014-06-20 19:42:08 +02:00
Norman Maurer	d2b8560a76	[#2580 ] [#2587 ] Fix buffer corruption regression when ByteBuf.order(LITTLE_ENDIAN) is used Motivation: To improve the speed of ByteBuf with order LITTLE_ENDIAN and where the native order is also LITTLE_ENDIAN (intel) we introduced a new special SwappedByteBuf in commit `4ad3984c8b`. Unfortunately the commit has a flaw: it does not correctly handle the case when a ByteBuf expands. This was caused because the memoryAddress was cached and never changed again even if the underlying buffer expanded. This can lead to corrupt data or even to SEGFAULT the JVM if you are lucky enough. Modification: Always look up the actual memoryAddress of the wrapped ByteBuf. Result: No more data-corruption for ByteBuf with order LITTLE_ENDIAN and no JVM crashes.	2014-06-20 18:25:54 +02:00
Norman Maurer	1278467fec	[#2586 ] Use correct EventLoop to notify delayed successful registration Motivation: At the moment AbstractBootstrap.bind(...) will always use the GlobalEventExecutor to notify the returned ChannelFuture if the registration is not done yet. This should only be done if the registration fails later. If it completes successfully we should just notify with the EventLoop of the Channel. Modification: Use the EventLoop of the Channel if possible to use the correct Thread to notify and so guarantee the right order of events. Result: Use the correct EventLoop for notification	2014-06-20 16:51:28 +02:00
Trustin Lee	fb538ea532	Refactor FastThreadLocal to simplify TLV management Motivation: When Netty runs in a managed environment such as web application server, Netty needs to provide an explicit way to remove the thread-local variables it created to prevent class loader leaks. FastThreadLocal uses different execution paths for storing a thread-local variable depending on the type of the current thread. It increases the complexity of thread-local removal. Modifications: - Moved FastThreadLocal and FastThreadLocalThread out of the internal package so that a user can use it. - FastThreadLocal now keeps track of all thread local variables it has initialized, and calling FastThreadLocal.removeAll() will remove all thread-local variables of the caller thread. - Added FastThreadLocal.size() for diagnostics and tests - Introduce InternalThreadLocalMap which is a mixture of hard-wired thread local variable fields and extensible indexed variables - FastThreadLocal now uses InternalThreadLocalMap to implement a thread-local variable. - Added ThreadDeathWatcher.unwatch() so that PooledByteBufAllocator tells it to stop watching when its thread-local cache has been freed by FastThreadLocal.removeAll(). - Added FastThreadLocalTest to ensure that removeAll() works - Added microbenchmark for FastThreadLocal and JDK ThreadLocal - Upgraded to JMH 0.9 Result: - A user can remove all thread-local variables Netty created, as long as he or she did not exit from the current thread. (Note that there's no way to remove a thread-local variable from outside of the thread.) - FastThreadLocal exposes more useful operations such as isSet() because we always implement a thread local variable via InternalThreadLocalMap instead of falling back to JDK ThreadLocal. - FastThreadLocalBenchmark shows that this change improves the performance of FastThreadLocal even more.	2014-06-19 21:08:16 +09:00
Norman Maurer	7279e48bef	Small improvement in SimpleChannelInboundHandlerAdapter javadoc	2014-06-18 14:49:11 +02:00
Norman Maurer	917132e28d	Make use of AtomicLongFieldUpdater.addAndGet(...) for cleaner code Motivation: The code in ChannelOutboundBuffer can be simplified by using AtomicLongFieldUpdater.addAndGet(...) Modification: Replace our manual looping with AtomicLongFieldUpdater.addAndGet(...) Result: Cleaner code	2014-06-17 19:50:14 +02:00
Norman Maurer	b627824b18	[#2577 ] ChannelOutboundBuffer.addFlush() unnecessary loop through all entries on multiple calls Motivation: If ChannelOutboundBuffer.addFlush() is called multiple times and flushed != unflushed it will still loop through all entries that are not flushed yet even if it is not needed anymore as these were marked uncancellable before. Modifications: Check if new messages were added since addFlush() was called and only if this was the case loop through all entries and try to mark them uncancellable. Result: Less overhead when ChannelOutboundBuffer.addFlush() is called multiple times without new messages being added.	2014-06-17 09:29:16 +02:00
Trustin Lee	b8a7881588	Fix incorrect method signature of awaitInactivity() - Related: #2084	2014-06-17 16:01:25 +09:00
Norman Maurer	6fdf1138ca	[#2573 ] UnpooledUnsafeDirectByteBuf.setBytes(int,ByteBuf,int,int) fails to use fast-path when src has array Motivation: UnpooledUnsafeDirectByteBuf.setBytes(int,ByteBuf,int,int) fails to use fast-path when src uses an array as backing storage. This is because the if else uses the wrong ByteBuf for its check. Modifications: - Use correct ByteBuf when checking for array as backing storage - Also eliminate unnecessary check in UnpooledDirectByteBuf which always fails anyway Result: Faster setBytes(...) when src ByteBuf is backed by an array. No more IndexOutOfBoundsException or data-corruption.	2014-06-16 11:11:54 +02:00
Norman Maurer	a4bd566cef	[#2572 ] Correctly calculate length of output buffer before inflate to fix IndexOutOfBoundsException Motivation: JdkZlibDecoder fails to decode because the length of the output buffer is not calculated correctly. This can cause an IndexOutOfBoundsException or data-corruption when the PooledByteBufAllocator is used. Modifications: Correctly calculate the length Result: No more IndexOutOfBoundsException or data-corruption.	2014-06-16 10:22:02 +02:00
Phil.Baxter	101b9ded33	export sun security packages as optional	2014-06-15 20:59:55 +02:00
Trustin Lee	e88262861a	Use FastThreadLocal in more places	2014-06-14 17:46:36 +09:00
Norman Maurer	b737d631f1	[maven-release-plugin] prepare for next development iteration	2014-06-12 16:20:52 +02:00
Norman Maurer	1709113a1f	[maven-release-plugin] prepare release netty-4.0.20.Final	2014-06-12 16:14:48 +02:00
Norman Maurer	76043bc8c8	Make use of an array to store FastThreadLocals and so allow it to also be used in a PooledByteBufAllocator that is instantiated by users. Motivation: Allow making use of our new FastThreadLocal wherever possible Modification: Make use of an array to store FastThreadLocals and so allow it to also be used in a PooledByteBufAllocator that is instantiated by users. The maximal size of the array is configurable via a system property to allow tuning it if needed. As default we use 64 entries which should be good enough. Result: More flexible usage of FastThreadLocal	2014-06-12 15:43:20 +02:00
belliottsmith	1ac2ff8d7b	Introduce FastThreadLocal which uses an EnumMap and a predefined fixed set of possible thread locals Motivation: Provide a faster ThreadLocal implementation Modification: Add a "FastThreadLocal" which uses an EnumMap and a predefined fixed set of possible thread locals (all of the static instances created by netty) that is around 10-20% faster than standard ThreadLocal in my benchmarks (and can be seen having an effect in the direct PooledByteBufAllocator benchmark that uses the DEFAULT ByteBufAllocator which uses this FastThreadLocal, as opposed to normal instantiations that do not, and in the new RecyclableArrayList benchmark); Result: Improved performance	2014-06-12 15:43:20 +02:00
Norman Maurer	cf1d9823a0	Make sure cancelled Timeouts are able to be GC'ed fast. Motivation: At the moment the HashedWheelTimer will only remove the cancelled Timeouts once the HashedWheelBucket is processed again. Until then the instance cannot be GC'ed as there are still strong references to it even if the user does not reference it himself/herself. This can waste a lot of memory even if the Timeout was cancelled before. Modification: Add a new queue which holds CancelTasks that will be processed on each tick to remove cancelled Timeouts. Because all of this is done only by the WorkerThread there is no need for synchronization and only one extra object creation is needed when cancel() is executed. For addTimeout(...) no new overhead is introduced. Result: Less memory usage for cancelled Timeouts.	2014-06-10 12:47:13 +02:00
Norman Maurer	9b468bc275	Optimize DefaultChannelPipeline in terms of memory usage and initialization time Motivation: Each DefaultChannelPipeline instance creates a head and tail that wrap a handler. These are used to chain together other DefaultChannelHandlerContext instances that are created once a new ChannelHandler is added. There are a few things here that can be improved in terms of memory usage and initialization time. Modification: - Only generate the name for the tail and head one time as it will never change anyway - Rename DefaultChannelHandlerContext to AbstractChannelHandlerContext and make it abstract - Create a new DefaultChannelHandlerContext that is used when a ChannelHandler is added to the DefaultChannelPipeline - Rename TailHandler to TailContext and HeadHandler to HeadContext and let them extend AbstractChannelHandlerContext. This way we can save 2 object creations per DefaultChannelPipeline Result: - Less memory usage because we have 2 fewer objects per DefaultChannelPipeline - Faster creation of DefaultChannelPipeline as we do not need to generate the name for the head and tail	2014-06-10 12:45:37 +02:00
Frederic Bregier	6b69ccb585	[#2542 ] HTTP post request decoder does not support quoted boundaries Motivation: According to RFC2616 section 19, the boundary string could be quoted, but currently the PostRequestDecoder does not support it while it should. Modifications: Once the boundary is found, one check is made to verify if the boundary is "quoted", and if so, it is "unquoted". Note: in subsequent uses of this boundary (as a delimiter), quotes no longer seem to be allowed according to the same RFC, which is why only the boundary definition is corrected. Result: The boundary can now be either quoted or unquoted. A JUnit test case checks it.	2014-06-08 21:57:43 +02:00
Norman Maurer	a0a8f1032b	[#2544 ] Correctly parse Multipart-mixed POST HTTP request in case of entity ends with odd number of 0x0D. Port of @fredericBregier 's work. Motivation: When an attribute ends with an odd number of CR (0x0D), the decoder adds an extra CR to the decoded attribute, which it should not. Modifications: Each time a CR is detected, the next byte was tested to be LF or not. If not, in a number of places, the CR byte was lost while it should not be. When a CR is detected, if the next byte is not LF, the CR byte should be saved as the position points to the next byte (not LF). When a CR is detected, if there are no other bytes available yet, the position is reset to the position of the CR (since an LF could follow). A new JUnit test case is added, using DECODER and a variable number of CR in the final attribute (testMultipartCodecWithCRasEndOfAttribute). Result: The attribute is now correctly decoded with the right number of CR ending bytes.	2014-06-08 11:50:58 +02:00
Norman Maurer	4ad3984c8b	[#2436 ] UnsafeByteBuf implementation should only invert bytes if ByteOrder differs from native ByteOrder Motivation: Our UnsafeByteBuf implementation always inverts bytes when the native ByteOrder is LITTLE_ENDIAN (this is true on intel), even when the user calls order(ByteOrder.LITTLE_ENDIAN). This is not optimal for performance reasons as the user should be able to set the ByteOrder to LITTLE_ENDIAN and so write bytes without the extra inverting. Modification: - Introduce a new special SwappedByteBuf (called UnsafeDirectSwappedByteBuf) that is used by all the Unsafe*ByteBuf implementations and allows writing without inverting the bytes. - Add benchmark - Upgrade jmh to 0.8 Result: The user is able to get the max performance even on servers that have ByteOrder.LITTLE_ENDIAN as their native ByteOrder.	2014-06-05 11:09:58 +02:00
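
Commit 917132e28d in the list above describes replacing a manual loop in ChannelOutboundBuffer with AtomicLongFieldUpdater.addAndGet(...). The sketch below shows that general pattern side by side using only the JDK. The class `PendingBytesCounter` and its field and method names are hypothetical illustrations, not Netty's actual code, and the "before" variant is only an assumption about what such a manual loop typically looks like.

```java
import java.util.concurrent.atomic.AtomicLongFieldUpdater;

// Hypothetical counter for illustration only; this is NOT Netty's ChannelOutboundBuffer.
final class PendingBytesCounter {
    private static final AtomicLongFieldUpdater<PendingBytesCounter> UPDATER =
            AtomicLongFieldUpdater.newUpdater(PendingBytesCounter.class, "pendingBytes");

    // The field must be a volatile, non-static long for the updater to accept it.
    private volatile long pendingBytes;

    // "Before" style (assumed): retry a compare-and-set until it succeeds.
    long addWithLoop(long delta) {
        for (;;) {
            long oldValue = pendingBytes;
            long newValue = oldValue + delta;
            if (UPDATER.compareAndSet(this, oldValue, newValue)) {
                return newValue;
            }
        }
    }

    // "After" style: let the JDK perform the atomic add and return the new value.
    long addWithUpdater(long delta) {
        return UPDATER.addAndGet(this, delta);
    }

    long get() {
        return pendingBytes;
    }

    public static void main(String[] args) {
        PendingBytesCounter counter = new PendingBytesCounter();
        System.out.println(counter.addWithLoop(10));    // new value after adding 10
        System.out.println(counter.addWithUpdater(-4)); // new value after subtracting 4
    }
}
```

Both methods return the updated value and are safe under concurrent callers; the second simply delegates the retry logic to the JDK, which is the "cleaner code" the commit message refers to.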

6002 Commits