303 Commits

Author SHA1 Message Date
Trustin Lee
350ac9787e Do not use a pseudo random for tree traversal
Motivation:

If we make allocateRun/SubpageSimple() always try the left node first and make allocateRun/Subpage() always tries the right node first,  it is more likely that allocateRun/Subpage() will find a node with ST_UNUSED sooner.

Modifications:

- Make allocateRunSimple() and allocateSubpageSimple() always try the left node first.
- Make allocateRun() and allocateSubpage() always try the right node first.
- Remove randome

Result:

We get the same performance without using random numbers.
2014-05-30 11:24:22 +09:00
Trustin Lee
3e7dbe072e Optimize PooledByteBufAllocator
Motivation:

We still have a room for improvement in PoolChunk.allocateRun() and
Subpage.allocate().

Modifications:

- Unroll the recursion in PoolChunk.allocateRun()
- Subpage.allocate() makes use of the 'nextAvail' value set by previous
  free().

Result:

- PoolChunk.allocateRun() optimization yields 10%+ improvements in
  allocation throughput for non-subpage allocations.
- Subpage.allocate() optimization makes the subpage allocations for
  tiny buffers as fast as non-tiny buffers even when the pageSize is
  huge (e.g. 1048576) because it doesn't need to perform a linear search
  in most cases.
2014-05-30 10:51:27 +09:00
Jake Luciani
795507aa7b Fix capacity check bug affecting offheap buffers 2014-05-13 07:24:56 +02:00
ian
fde13d96f9 Fix error that causes (up to) double memory usage
Motivation:

PoolArena's 'normalizeCapacity' function was micro-optimized some
time ago to remove a while loop. However, there was a change of
behavior in the function as a result. Capacities passed into it
that are already powers of 2 (and >= 512) are doubled in size. So
if I ask for a buffer with a capacity of 1024, I will get back one
that actually uses 2048 bytes (stored in maxLength).

Aligning to powers of two for book keeping ease is reasonable,
and if someone tries to expand a buffer, you might as well use some
of the previously wasted space. However, since this distinction
between 'easily expanded' and 'costly to expand' space is not
supported at all by the APIs, I cannot imagine this change to
doubling is desirable or intentional.

This is especially costly when using composite buffers. They
frequently allocate components with a capacity that is a power of
2, and they never attempt to expand components themselves. The end
result is that heavy use of pool-backed composite buffers wastes
almost half of the memory pool (the smaller / initial components are
<512 and so are not affected by the off-by-one bug).

Modifications:

Although I find it difficult to believe that such an optimization
is really helpful, I left it in and fixed the off-by-one issue by
decrementing the value at the start.

I also added a simple test to both attempt to verify that the
decrement fixes the issue without introducing any other change, and
to make it easy for a reviewer to test the existing behavior. PoolArena
does not seem to have much testing or testability support though so
the test is kind of a hack and will break for unrelated changes. I
suggest either removing it or factoring out the single non-static
portion of normalizeCapacity so that the fragile dummy PoolArena is
not required.

Result:

Pooled allocators will allocate less resources to the highly
inefficient and undocumented buffer section between length and
maxLength.

Composite buffers of non-trivial size that are backed by pooled
allocators will use about half as much memory.
2014-04-15 07:02:49 +02:00
Norman Maurer
9c934116f2 [#2370] Periodically check for not alive Threads and free up their ThreadPoolCache
Motivation:
At the moment we create new ThreadPoolCache whenever a Thread tries either allocate or release something on the PooledByteBufAllocator. When something is released we put it then in its ThreadPoolCache. The problem is we never check if a Thread is not alive anymore and so we may end up with memory that is never freed again if a user create many short living Threads that use the PooledByteBufAllocator.

Modifications:
Periodically check if the Thread is still alive that has a ThreadPoolCache assinged and if not free it.

Result:
Memory is freed up correctly even for short living Threads.
2014-04-09 11:44:51 +02:00
Norman Maurer
13fd69e871 Implement Thread caches for pooled buffers to minimize conditions. This fixes [#2264] and [#808].
Motivation:
Remove the synchronization bottleneck in PoolArena and so speed up things

Modifications:

This implementation uses kind of the same technics as outlined in the jemalloc paper and jemalloc
blogpost https://www.facebook.com/notes/facebook-engineering/scalable-memory-allocation-using-jemalloc/480222803919.

At the moment we only cache for "known" Threads (that powers EventExecutors) and not for others to keep the overhead
minimal when need to free up unused buffers in the cache and free up cached buffers once the Thread completes. Here
we use multi-level caches for tiny, small and normal allocations. Huge allocations are not cached at all to keep the
memory usage at a sane level. All the different cache configurations can be adjusted via system properties or the constructor
directly where it makes sense.

Result:
Less conditions as most allocations can be served by the cache itself
2014-03-20 09:18:04 -07:00
Jakob Buchgraber
17ba35b6d0 Bit tricks to check for and calculate power of two.
Motivation:
I was studying the code and thought this was simpler and easier to
understand.

Modifications:
Replaced the for loop and if conditions, with a simple implementation.

Result:
Code is easier to understand.
2014-03-18 15:59:58 +09:00
Bourne, Geoff
1c074eabe5 Fix limit computation of NIO ByteBuffers obtained via ReadOnlyByteBufferBuf.nioBuffer
Motivation:

When starting with a read-only NIO buffer, wrapping it in a ByteBuf,
and then later retrieving a re-wrapped NIO buffer the limit was getting
too short.

Modifications:

Changed ReadOnlyByteBufferBuf.nioBuffer(int,int) to compute the
limit in the same manner as the internalNioBuffer method.

Result:

Round-trip conversion from NIO to ByteBuf to NIO will work reliably.
2014-03-14 08:07:19 +01:00
Trustin Lee
0fc66a411f The default buffer must be unpooled for backward compatibility
Mistakenly set to pooled while merging the changes from 4.1 and master.
2014-02-21 14:43:07 -08:00
Trustin Lee
b18c8fe688 Determine the default allocator from system property
- Add ByteBufAllocator.DEFAULT
- The default allocator is 'unpooled'
2014-02-14 13:05:57 -08:00
Norman Maurer
f23d68b42f [#2187] Always do a volatile read on the refCnt 2014-02-07 09:23:16 +01:00
Norman Maurer
9bee78f91c Provide an optimized AtomicIntegerFieldUpdater, AtomicLongFieldUpdater and AtomicReferenceFieldUpdater 2014-02-06 20:08:45 +01:00
Norman Maurer
b3d8c81557 Fix all leaks reported during tests
- One notable leak is from WebSocketFrameAggregator
- All other leaks are from tests
2013-12-07 00:44:56 +09:00
Trustin Lee
2102cb062b Fix false-positive leaks
- All derived buffers and swapped buffers of a leak-aware buffer must be wrapped again with the leak-aware buffer
2013-12-06 21:32:56 +09:00
Trustin Lee
e506581eb1 Add ReferenceCountUtil.releaseLater() to make writing tests easy with ReferenceCounteds 2013-12-06 15:13:00 +09:00
Trustin Lee
128c4b96b5 Checkstyle 2013-12-06 13:54:36 +09:00
Trustin Lee
5d39b1fc3d Also record retain() and release() 2013-12-06 13:45:24 +09:00
Trustin Lee
e88172495a Ensure backward compatibility
.. by resurrecting the removed methods and system properties.
2013-12-05 01:02:38 +09:00
Trustin Lee
65b522a2a7 Better buffer leak reporting
- Remove the reference to ResourceLeak from the buffer implementations
  and use wrappers instead:
  - SimpleLeakAwareByteBuf and AdvancedLeakAwareByteBuf
  - It is now allocator's responsibility to create a leak-aware buffer.
  - Added AbstractByteBufAllocator.toLeakAwareBuffer() for easier
    implementation
- Add WrappedByteBuf to reduce duplication between *LeakAwareByteBuf and
  UnreleasableByteBuf
- Raise the level of leak reports to ERROR - because it will break the
  app eventually
- Replace enabled/disabled property with the leak detection level
  - Only print stack trace when level is ADVANCED or above to avoid user
    confusion
- Add the 'leak' build profile, which enables highly detailed leak
  reporting during the build
- Remove ResourceLeakException which is unsed anymore
2013-12-05 00:51:39 +09:00
Norman Maurer
053c512f6d Fix checkstyle 2013-12-02 08:23:57 +01:00
Norman Maurer
14600167d6 [#2021] No need to synchronize for unpooled chunks 2013-12-02 08:02:48 +01:00
Norman Maurer
7231be592a Also allow to override how direct ByteBuffers are freed 2013-11-12 12:40:41 +01:00
Norman Maurer
e83fb821d5 Allow to override how wrapped direct ByteBuffer are allocated to make it easier to extend 2013-11-12 12:13:38 +01:00
Norman Maurer
b00f8c6390 [#1976] Fix IndexOutOfBoundsException when calling CompositeByteBuf.discardReadComponents() 2013-11-09 20:13:24 +01:00
Alex Petrov
e4f391f626 Improve docstrings for and of 2013-11-08 12:15:41 +01:00
Trustin Lee
ba3bc0c020 Simpler toString() for ByteBufAllocators 2013-11-08 17:54:34 +09:00
Norman Maurer
77b4ec7e1b [#1800] [#1802] Correctly expand capacity of ByteBuf while preserve content 2013-11-04 15:18:21 +01:00
Trustin Lee
54db9ec725 Use StringUtil.simpleClassName(..) instead of Class.getSimpleName() where necessary
- Class.getSimpleName() doesn't render anonymous classes very well
- + some minor cleanup
2013-11-04 19:46:15 +09:00
Norman Maurer
4ce49a6195 [#1943] Unpooled.copiedBuffer(ByteBuf pooled) should always return unpooled ByteBuf 2013-10-22 20:20:38 +02:00
Norman Maurer
68b616728a [#1925] Only expose sub-region of ByteBuf on nioBuffer(...) 2013-10-16 10:34:33 +02:00
Norman Maurer
d946659520 [#1906] Use a ByteBuf allocator from the ByteBufAllocator when encode Strings 2013-10-09 21:18:08 +02:00
Norman Maurer
1c73be21fc Remove redundant index check 2013-10-08 07:21:01 +02:00
Bill Gallagher
8f612660b2 disable debugging output during test 2013-10-04 06:45:26 +02:00
Norman Maurer
ee192f0321 [#1880] Use ByteBufAllocator when read bytes into new chunks 2013-10-01 10:10:43 +02:00
Norman Maurer
6d09e57be7 [#1875] Correctly check the readerIndex when try to read a byte from AbstractByteBuf 2013-09-30 14:47:49 +02:00
Norman Maurer
b4fa8af079 Cache underlying ByteBuffers and count in ChannelOutboundBuffer.Entry to reduce object creation and so GC pressure
Beside this it also helps to reduce CPU usage as nioBufferCount() is quite expensive when used on CompositeByteBuf which are
nested and contains a lot of components
2013-09-26 20:37:39 +02:00
Norman Maurer
2b9a07cac9 CompositeByteBuf.isDirect() should return true if its only backed by direct buffers 2013-09-26 20:37:31 +02:00
Norman Maurer
a74149e984 [#1865] Only use internalNioBuffer when one of the read* or write* methods are used. This is neccessary to prevent races as those can happen when a slice or duplicate is shared between different Channels
that are not assigned to the same EventLoop. In general get* operations should always be safe to be used from different Threads.

This aslo include unit tests that show the issue
2013-09-25 17:27:26 +02:00
Norman Maurer
910ed31a1b [#1851] EmptyByteBuf.isWritable(..) and isReadable(...) should not throw IndexOutOfBoundsException 2013-09-21 20:40:22 +02:00
Norman Maurer
23baef8fb4 [#1853] Optimize gathering writes for CompositeByteBuf that are only backed by one ByteBuffer 2013-09-19 07:29:58 +02:00
Norman Maurer
c0bbde48b7 [#1852] Fix bug in UnpooledDirectByteBuf.nioBuffer(...) implementation 2013-09-18 20:47:57 +02:00
Greg Soltis
f1d4f813ed Fix nioBuffer implementation for CompositeByteBuf 2013-09-16 06:41:08 +02:00
Norman Maurer
451e91d142 [#1821] Fix IndexOutOfBoundsException which was thrown if the last component was removed but other components was left 2013-09-09 20:29:30 +02:00
Norman Maurer
25c226a835 Make sure only direct ByteBuffer are passed to the underlying jdk Channel.
This is needed because of otherwise the JDK itself will do an extra ByteBuffer copy with it's own pool implementation. Even worth it will be done
multiple times if the ByteBuffer is always only partial written. With this change the copy is done inside of netty using it's own allocator and
only be done one time in all cases.
2013-09-02 20:17:53 +02:00
Norman Maurer
5416f2315e [#1797] No use internalNioBuffer() in derived buffers as it is not meant for concurrent access 2013-09-02 14:15:19 +02:00
Norman Maurer
0007fb81ef Add tests to try to track down some buffer issues 2013-09-02 13:50:47 +02:00
Norman Maurer
795182843d Remove legancy code which we not need anymore as we use gathering writes anyway everywhere 2013-09-01 11:00:58 +02:00
Norman Maurer
5ddd7cee90 [#1797] Throw IllegalArgumentException if AbstractByteBuf.skipBytes(...) is used with a negative value 2013-08-29 11:14:36 +02:00
Trustin Lee
20894bc99e Fix a bug in internalNioBuffer() implementations of derived buffers
- A user can create multiple duplicates of a buffer and access their internal NIO buffers. (e.g. write multiple duplicates to multiple channels assigned to different event loop.)  Because the derived buffers' internalNioBuffer() simply delegates the call to the original buffer, all derived buffers and the original buffer's internalNioBuffer() will return the same buffer, which will lead to a race condition.
- Fixes #1739
2013-08-20 14:28:50 +09:00
bgallagher
9f88552f12 remove some dead code 2013-08-10 20:46:48 +02:00