netty5

Author	SHA1	Message	Date
Tim Brooks	2dc686ded1	Prefer direct io buffers if direct buffers pooled (#9167 ) Motivation Direct buffers are normally preferred when interfacing with raw sockets. Currently netty will only return direct io buffers (for reading from a channel) when a platform has unsafe. However, this is inconsistent with the write-side (filterOutboundMessage) where a direct byte buffer will be returned if pooling is enabled. This means that environments without unsafe (and no manual netty configurations) end up with many pooled heap byte buffers for reading, many pooled direct byte buffers for writing, and jdk pooled byte buffers (for reading). Modifications This commit modifies the AbstractByteBufAllocator to return a direct byte buffer for io handling when the platform has unsafe or direct byte buffers are pooled. Result: Use direct buffers when direct buffers are pooled for IO.	2019-05-22 07:32:41 +02:00
Norman Maurer	c83904a12a	Allow to automatically trim the PoolThreadCache in a timely interval (#8941 ) Motivation: PooledByteBufAllocator uses a PoolThreadCache per Thread that allocates / deallocates to minimize the performance overhead. This PoolThreadCache is trimmed after X allocations to free up buffers that are not allocated for a long time. This works out quite well when the app continues to allocate but fails if the app stops to allocate frequently (for whatever reason) and so a lot of memory is wasted and not given back to the arena / freed. Modifications: - Add a ThreadExecutorMap that offers multiple methods that wrap Runnable / ThreadFactory / Executor and allow to call ThreadExecutorMap.currentEventExecutor() to get the current executing EventExecutor for the calling Thread. - Use these methods in the constructors of our EventExecutor implementations (which also covers the EventLoop implementations) - Add io.netty.allocator.cacheTrimIntervalMillis system property which can be used to specify a fixed rate / interval on which we should try to trim the PoolThreadCache for a EventExecutor that allocates. - Add PooledByteBufAllocator.trimCurrentThreadCache() to allow the user to trim the cache of the calling thread manually. - Add testcases - Introduce FastThreadLocal.getIfExists() Result: Allow to better / more frequently trim PoolThreadCache and so give back memory to the area / system.	2019-03-22 11:08:37 +01:00
Norman Maurer	90ea3ec9f6	Adjust tests to be able to build / test when using IBM J9 / OpenJ9 (#8900 ) Motivation: We should run a CI job using J9 to ensure netty also works when using different JVMs. Modifications: - Adjust PooledByteBufAllocatorTest to be able to complete faster when using a JVM which takes longer when joining Threads (this seems to be the case with J9). - Skip UDT tests on J9 as UDT is not supported there. Result: Be able to run CI against J9.	2019-03-01 06:47:56 +01:00
Norman Maurer	cd3254df88	Update to new checkstyle plugin (#8777 ) (#8780 ) Motivation: We need to update to a new checkstyle plugin to allow the usage of lambdas. Modifications: - Update to new plugin version. - Fix checkstyle problems. Result: Be able to use checkstyle plugin which supports new Java syntax.	2019-01-25 11:58:42 +01:00
Norman Maurer	e329ca1cf3	Introduce ObjectCleaner and use it in FastThreadLocal to ensure FastThreadLocal.onRemoval(...) is called Motivation: There is no guarantee that FastThreadLocal.onRemoval(...) is called if the FastThreadLocal is used by "non" FastThreacLocalThreads. This can lead to all sort of problems, like for example memory leaks as direct memory is not correctly cleaned up etc. Beside this we use ThreadDeathWatcher to check if we need to release buffers back to the pool when thread local caches are collected. In the past ThreadDeathWatcher was used which will need to "wakeup" every second to check if the registered Threads are still alive. If we can ensure FastThreadLocal.onRemoval(...) is called we do not need this anymore. Modifications: - Introduce ObjectCleaner and use it to ensure FastThreadLocal.onRemoval(...) is always called when a Thread is collected. - Deprecate ThreadDeathWatcher - Add unit tests. Result: Consistent way of cleanup FastThreadLocals when a Thread is collected.	2017-12-21 07:34:44 +01:00
Norman Maurer	09a05b680d	Dont use ThreadDeathWatcher to cleanup PoolThreadCache if FastThreadLocalThread with wrapped Runnable is used Motivation: We dont need to use the ThreadDeathWatcher if we use a FastThreadLocalThread for which we wrap the Runnable and ensure we call FastThreadLocal.removeAll() once the Runnable completes. Modifications: - Dont use a ThreadDeathWatcher if we are sure we will call FastThreadLocal.removeAll() - Add unit test. Result: Less overhead / running theads if you only allocate / deallocate from FastThreadLocalThreads.	2017-11-28 13:43:28 +01:00
Carl Mastrangelo	d2cb51bc2e	Enable PooledByteBufAllocator to work, event without a cache Motivation: `useCacheForAllThreads` may be false which disables memory caching on non netty threads. Setting this argument or the system property makes it impossible to use `PooledByteBufAllocator`. Modifications: Delayed the check of `freeSweepAllocationThreshold` in `PoolThreadCache` to after it knows there will be any caches in use. Additionally, check if the caches will have any data in them (rather than allocating a 0-length array). A test case is also added that fails without this change. Results: Fixes #7194	2017-09-08 10:20:45 +02:00
Norman Maurer	3ad3356892	Expose ByteBufAllocator metric in a more general way Motivation: PR [#6460] added a way to access the used memory of an allocator. The used naming was not very good and how things were exposed are not consistent. Modifications: - Add a new ByteBufAllocatorMetric and ByteBufAllocatorMetricProvider interface - Let the ByteBufAllocator implementations implement ByteBufAllocatorMetricProvider - Move exposed stats / metric from PooledByteBufAllocator to PooledByteBufAllocatorMetric and mark old methods as `@Deprecated`. Result: More consistent way to expose metric / stats for ByteBufAllocator	2017-03-08 20:07:58 +01:00
Norman Maurer	461f9a1212	Allow to obtain informations of used direct and heap memory for ByteBufAllocator implementations Motivation: Often its useful for the user to be able to get some stats about the memory allocated via an allocator. Modifications: - Allow to obtain the used heap and direct memory for an allocator - Add test case Result: Fixes [#6341]	2017-03-01 18:53:43 +01:00
Norman Maurer	deb90923a2	Ensure PooledByteBuf.initUnpooled(...) correctly set the allocator Motivation: Commit `8dda984afe` introduced a regression which lead to the situation that the allocator is not set when PooledByteBuf.initUnpooled(...) is called. Thus it was possible that PooledByteBuf.alloc() returns null or the wrong allocator if multiple PooledByteBufAllocator are used in an application. Modifications: - Correctly set the allocator - Add test-case Result: Fixes [#6436].	2017-02-23 19:53:17 +01:00
Norman Maurer	8a3a3245df	Ensure Unsafe buffer implementations are used when sun.misc.Unsafe is present Motivation: When sun.misc.Unsafe is present we want to use UnsafeByteBuf implementations. We missed to do so in PooledByteBufAllocator when the heapArena is null. Modifications: - Correctly use UnpooledUnsafeHeapByteBuf - Add unit tests Result: Use most optimal ByteBuf implementation.	2017-02-16 07:48:33 +01:00
Norman Maurer	34ea09e552	Add missing assumeTrue(...) that were not added in `54339c08ac`	2017-02-14 08:17:33 +01:00
Norman Maurer	54339c08ac	Only try to calculate direct memory offset when sun.misc.Unsafe is present Motivation: We should only try to calculate the direct memory offset when sun.misc.Unsafe is present as otherwise it will fail with an NPE as PlatformDependent.directBufferAddress(...) will throw it. This problem was introduced by `66b9be3a46`. Modifications: Use offset of 0 if no sun.misc.Unsafe is present. Result: PooledByteBufAllocator also works again when no sun.misc.Unsafe is present.	2017-02-14 07:49:24 +01:00
Kiril Menshikov	66b9be3a46	Allow to allign allocated Buffers Motivation: 64-byte alignment is recommended by the Intel performance guide (https://software.intel.com/en-us/articles/practical-intel-avx-optimization-on-2nd-generation-intel-core-processors) for data-structures over 64 bytes. Requiring padding to a multiple of 64 bytes allows for using SIMD instructions consistently in loops without additional conditional checks. This should allow for simpler and more efficient code. Modification: At the moment cache alignment must be setup manually. But probably it might be taken from the system. The original code was introduced by @normanmaurer https://github.com/netty/netty/pull/4726/files Result: Buffer alignment works better than miss-align cache.	2017-02-06 07:58:29 +01:00
Norman Maurer	756b78b7df	Add common tests for ByteBufAllocator / AbstractByteBufAllocator implementations. Motivation: We not had tests for ByteBufAllocator implementations in general. Modifications: Added ByteBufAllocatorTest, AbstractByteBufAllocatorTest and UnpooledByteBufAllocatorTest Result: More tests for allocator implementations.	2017-02-06 07:51:10 +01:00
Norman Maurer	243b2b9f19	PooledByteBufAllocatorTest may has memory visiblity issues as it uses non concurrent queue Motivation: PooledByteBufAllocatorTest uses an ArrayQueue but access it from multiple threads (not concurrently but still from different threads). This may leak to memory visibility issues. Modifications: - Use a concurrent queue - Some cleanup Result: Non racy test code.	2016-12-02 07:42:19 +01:00
Norman Maurer	f70757da2a	[#6015 ] Fix racy PooledByteBufAllocatorTests Motivation: We had a few tests PooledByteBufAllocatorTests which used parkNanos(...) to give a resource enough time to get destroyed. This is race and may not be good enough. Modifications: Ensure the ThreadCache is really destroyed. Result: No more racy tests that depend on ThreadCaches.	2016-12-01 10:16:12 +01:00
Norman Maurer	771cfaec22	[#5520 ] Correctly include all PoolSubpage metrics Motivation: Because of a bug we missed to include the first PoolSubpage when collection metrics. Modifications: - Correctly include all subpages - Add unit test Result: Correctly include all subpages	2016-07-13 21:33:06 +02:00
Norman Maurer	d41f076419	Add more tests for PoolArenaMetric Motivation: We should add some more tests for PoolarenaMetric Modifications: Add more tests Result: Better test coverage for metrics	2016-05-20 21:56:36 +02:00
Norman Maurer	b83ffcc447	Add timeout to PooledByteBufAllocatorTest Motivation: Some tests in PooledByteBufAllocatorTest are blocking on a CountDownLatch. We should use a timeout on these tests so these will not block forever on a failure. Modifications: Add timeout param to @Test annotation Result: Have sane timeouts on tests.	2016-05-20 21:52:53 +02:00
Norman Maurer	30bb3094c1	[#5227 ] Fix race-condition in PooledByteBufAllocatorTest Motivation: PooledByteBufAllocatorTest.testNumThreadCachesWithNoDirrectArenas() had a race as it just used LockSupport.parkNanos(). We should better use a CountdownLatch and so be sure we really have init everything. Modifications: Replace LockSupport.parkNanos(...) with CountdownLatch usage Result: No more race in test.	2016-05-13 08:51:44 +02:00
Norman Maurer	88b093c481	PoolChunkList not correctly move PoolChunks when these are moved. Motivation: When a PoolChunk needs to get moved to the previous PoolChunkList because of the minUsage / maxUsage constraints we always just moved it one level which is incorrect and so could lead to have PoolChunks in the wrong PoolChunkList (in respect to their minUsage / maxUsage settings). This then could have the effect that PoolChunks are not released / freed in a timely fashion and so. Modifications: - Correctly move PoolChunks between PoolChunkLists, which includes moving it multiple "levels". - Add unit test Result: Correctlty move the PoolChunk to PoolChunkList when it is freed, even if its multiple layers.	2016-04-06 11:52:23 +02:00
Norman Maurer	ea3ffb8536	Fix PoolChunkList.minUsage() and maxUsage() for head and tail Motivation: The PoolChunkList.minUsage() and maxUsage() needs to take special action to translate Integer.MIN_VALUE / MAX_VALUE as these are used internal for tail and head of the linked-list structure. Modifications: - Correct the minUsage() and maxUsage() methods. - Add unit test. Result: Correct metrics	2016-04-06 10:03:28 +02:00
buchgr	83c349ffa9	Fix wrong use of assertTrue in unit test. Motivation: My previous commit `b88a980482` introduced a flawed unit test, that executes an assertion in a different thread than the test thread. If this assertion fails, the test doesn't fail. Modifications: Replace the assertion by a proper workaround. Result: More correct unit test	2016-03-15 16:02:33 +01:00
buchgr	b88a980482	Change arena to thread cache mapping algorithm to be closer to ideal. Motivation: Circular assignment of arenas to thread caches can lead to less than optimal mappings in cases where threads are (frequently) shutdown and started. Example Scenario: There are a total of 2 arenas. The first two threads performing an allocation would lead to the following mapping: Thread 0 -> Arena 0 Thread 1 -> Arena 1 Now, assume Thread 1 is shut down and another Thread 2 is started. The current circular assignment algorithm would lead to the following mapping: Thread 0 -> Arena 0 Thread 2 -> Arena 0 Ideally, we want Thread 2 to use Arena 1 though. Presumably, this is not much of an issue for most Netty applications that do all the allocations inside the eventloop, as eventloop threads are seldomly shut down and restarted. However, applications that only use the netty-buffer package or implement their own threading model outside the eventloop might suffer from increased contention. For example, gRPC Java when using the blocking stub performs some allocations outside the eventloop and within its own thread pool that is dynamically sized depending on system load. Modifications: Implement a linear scan algorithm that assigns a new thread cache to the arena that currently backs the fewest thread caches. Result: Closer to ideal mappings between thread caches and arenas. In order to always get an ideal mapping, we would have to re-balance the mapping whenever a thread dies. However, that's difficult because of deallocation.	2016-03-15 14:16:34 +01:00
Norman Maurer	d93f906255	[#4198 ] Fix race-condition when allocate from multiple-thread. Motivation: Fix a race condition that was introduced by `f18990a8a5` that could lead to a NPE when allocate from the PooledByteBufAllocator concurrently by many threads. Modifications: Correctly synchronize on the PoolSubPage head. Result: No more race.	2015-10-27 07:39:42 +01:00

26 Commits