Commit Graph

79 Commits

Author SHA1 Message Date
Nick Hill
9f2221ebd4 CompositeByteBuf optimizations and new addFlattenedComponents method (#8939)
Motivation:

The CompositeByteBuf discardReadBytes / discardReadComponents methods are currently quite inefficient, including when there are no read components to discard. We would like to call the latter more frequently in ByteToMessageDecoder#COMPOSITE_CUMULATOR.

In the same context it would be beneficial to perform a "shallow copy" of a composite buffer (for example when it has a refcount > 1) to avoid having to allocate and copy the contained bytes just to obtain an "independent" cumulation.

Modifications:

- Optimize discardReadBytes() and discardReadComponents() implementations (start at first comp rather than performing a binary search for the readerIndex).
- New addFlattenedComponents(boolean,ByteBuf) method which performs a shallow copy if the provided buffer is also composite and avoids adding any empty buffers, plus unit test.
- Other minor optimizations to avoid unnecessary checks.

Results:

discardReadXX methods are faster, composite buffers can be easily appended without deepening the buffer "tree" or retaining unused components.
2019-04-08 20:48:08 +02:00
Nick Hill
b36f75044f Fix possible ByteBuf leak when CompositeByteBuf is resized (#8946)
Motivation:

The special case fixed in #8497 also requires that we keep a derived slice when trimming components in place, as done by the capacity(int) and discardReadBytes() methods.

Modifications:

Ensure that we keep a ref to trimmed components' original retained slice in capacity(int) and discardReadBytes() methods, so that it is released properly when the they are later freed. Add unit test which fails prior to the fix.

Result:

Edge case leak is eliminated.
2019-03-22 11:18:10 +01:00
Nick Hill
0811409ca3 Further reduce ensureAccessible() overhead (#8895)
Motivation:

This PR fixes some non-negligible overhead discovered in the ByteBuf
accessibility (non-zero refcount) checking. The cause turned out to be
mostly twofold:
- Unnecessary operations used to calculate the refcount from the "raw"
encoded int field value
- Call stack depths exceeding the default limit for inlining, in some
places (CompositeByteBuf in particular)

It's a follow-on from #8882 which uses the maxCapacity field for a
simpler non-negative check. The performance gap between these two
variants appears to be _mostly_ closed, but there's one exception which
may warrant further analysis.

Modifications:

- Replace ABB.internalRefCount() with ByteBuf.isAccessible(), the
default still checks for non-zero refCnt()
- Just test for parity of raw refCnt instead of converting to "real",
with fast-path for specific small values
- Make sure isAccessible() is delegated by derived/wrapper ByteBufs
- Use existing freed flag in CompositeByteBuf for faster isAccessible()
- Manually inline some calls in methods like CompositeByteBuf.setLong()
and AbstractReferenceCountedByteBuf.isAccessible() to reduce stack
depths (to ensure default inlining limit isn't hit)
- Add ByteBufAccessBenchmark which is an extension of
UnsafeByteBufBenchmark (maybe latter could now be removed)

Results:

Before:

Benchmark   (bufferType)  (checkAccessible)  (checkBounds)   Mode  Cnt
Score          Error  Units
readBatch         UNSAFE               true           true  thrpt   30
84524972.863 ±   518338.811  ops/s
readBatch   UNSAFE_SLICE               true           true  thrpt   30
38608795.037 ±   298176.974  ops/s
readBatch           HEAP               true           true  thrpt   30
80003697.649 ±   974674.119  ops/s
readBatch      COMPOSITE               true           true  thrpt   30
18495554.788 ±   108075.023  ops/s
setGetLong        UNSAFE               true           true  thrpt   30
247069881.578 ± 10839162.593  ops/s
setGetLong  UNSAFE_SLICE               true           true  thrpt   30
196355905.206 ±  1802420.990  ops/s
setGetLong          HEAP               true           true  thrpt   30
245686644.713 ± 11769311.527  ops/s
setGetLong     COMPOSITE               true           true  thrpt   30
83170940.687 ±   657524.123  ops/s
setLong           UNSAFE               true           true  thrpt   30
278940253.918 ±  1807265.259  ops/s
setLong     UNSAFE_SLICE               true           true  thrpt   30
202556738.764 ± 11887973.563  ops/s
setLong             HEAP               true           true  thrpt   30
280045958.053 ±  2719583.400  ops/s
setLong        COMPOSITE               true           true  thrpt   30
121299806.002 ±  2155084.707  ops/s


After:

Benchmark   (bufferType)  (checkAccessible)  (checkBounds)   Mode  Cnt
Score          Error  Units
readBatch         UNSAFE               true           true  thrpt   30
101641801.035 ±  3950050.059  ops/s
readBatch   UNSAFE_SLICE               true           true  thrpt   30
84395902.846 ±  4339579.057  ops/s
readBatch           HEAP               true           true  thrpt   30
100179060.207 ±  3222487.287  ops/s
readBatch      COMPOSITE               true           true  thrpt   30
42288494.472 ±   294919.633  ops/s
setGetLong        UNSAFE               true           true  thrpt   30
304530755.027 ±  6574163.899  ops/s
setGetLong  UNSAFE_SLICE               true           true  thrpt   30
212028547.645 ± 14277828.768  ops/s
setGetLong          HEAP               true           true  thrpt   30
309335422.609 ±  2272150.415  ops/s
setGetLong     COMPOSITE               true           true  thrpt   30
160383609.236 ±   966484.033  ops/s
setLong           UNSAFE               true           true  thrpt   30
298055969.747 ±  7437449.627  ops/s
setLong     UNSAFE_SLICE               true           true  thrpt   30
223784178.650 ±  9869750.095  ops/s
setLong             HEAP               true           true  thrpt   30
302543263.328 ±  8140104.706  ops/s
setLong        COMPOSITE               true           true  thrpt   30
157083673.285 ±  3528779.522  ops/s

There's also a similar knock-on improvement to other benchmarks (e.g.
HPACK encoding/decoding) as shown in #8882.

For sanity I did a final comparison of the "fast path" tweak using one
of the HPACK benchmarks:

(rawCnt & 1) == 0:

Benchmark                     (limitToAscii)  (sensitive)  (size)   Mode
Cnt      Score     Error  Units
HpackDecoderBenchmark.decode            true         true  MEDIUM  thrpt
30  50914.479 ± 940.114  ops/s


rawCnt == 2 || rawCnt == 4 || rawCnt == 6 || rawCnt == 8 ||  (rawCnt &
1) == 0:

Benchmark                     (limitToAscii)  (sensitive)  (size)   Mode
Cnt      Score      Error  Units
HpackDecoderBenchmark.decode            true         true  MEDIUM  thrpt
30  60036.425 ± 1478.196  ops/s
2019-02-28 20:40:41 +01:00
Nick Hill
98aa5fbd66 CompositeByteBuf tidy-up (#8784)
Motivation

There's some miscellaneous cleanup/simplification of CompositeByteBuf
which would help make the code a bit clearer.

Modifications

- Simplify web of constructors and addComponents methods, reducing
duplication of logic
- Rename `Component.freeIfNecessary()` method to just `free()`, which is
less confusing (see #8641)
- Make loop in addComponents0(...) method more verbose/readable (see
https://github.com/netty/netty/pull/8437#discussion_r232124414)
- Simplify addition/subtraction in setBytes(...) methods

Result

Smaller/clearer code
2019-02-01 10:31:53 +01:00
Nick Hill
1d5b7be3a7 Fix three bugs in CompositeByteBuf (#8773)
Motivation

In #8758, @doom369 reported an infinite loop bug in CompositeByteBuf
which was introduced in #8437.

This is the same small fix for that, along with fixes for two other bugs
found while re-inspecting the changes and adding unit tests.

Modification

- Replace recursive call to toComponentIndex with toComponentIndex0 as
intended
- Add missed "lastAccessed" racy cache invalidation in capacity(int)
method
- Fix incorrect determination of initial offset in non-zero cIndex case
of updateComponentOffsets method
- New unit tests for previously uncovered methods

Results

Fewer bugs.
2019-01-24 12:47:04 +01:00
kashike
6fdd7fcddb Fix minor spelling issues in javadocs (#8701)
Motivation:

Javadocs contained some spelling errors, we should fix these.

Modification:

Fix spelling

Result:

Javadoc cleanup.
2019-01-14 07:24:34 +01:00
Nick Travers
d0d30f1fbe Loosen bounds check on CompositeByteBuf's maxNumComponents (#8621)
Motivation:

In versions of Netty prior to 4.1.31.Final, a CompositeByteBuf could be
created with any size (including potentially nonsensical negative
values). This behavior changed in e7737b993, which introduced a bounds
check to only allow for a component size greater than one. This broke
some existing use cases that attempted to create a byte buf with a
single component.

Modifications:

Lower the bounds check on numComponents to include the single component
case, but still throw an exception for anything less than one.

Add unit tests for the case of numComponents being less than, equal to,
and greater than this lower bound.

Result:

Return to the behavior of 4.1.30.Final, allowing one component, but
still include an explicit check against a lower bound.

Note that while creating a CompositeByteBuf with a single component is
in some ways a contradiction of the term "composite", this patch caters
for existing uses while excluding the clearly nonsensical case of asking
for a CompositeByteBuf with zero or fewer components.

Fixes #8613.
2018-12-05 08:42:23 +01:00
Nick Hill
804e1fa9cc Fix ref-counting when CompositeByteBuf is used with retainedSlice() (#8497)
Motivation:

ByteBuf.retainedSlice() and similar methods produce sliced buffers with
an independent refcount to the buffer that they wrap.


One of the optimizations in 10539f4dc7 was
to use the ref to the unwrapped buffer object for added slices, but this
did not take into account the above special case when later releasing.

Thanks to @rkapsi for discovering this via #8495.

Modifications:

Since a reference to the slice is still kept in the Component class,
just changed Component.freeIfNecessary() to release the slice in
preference to the unwrapped buf.

Also added a unit test which reproduces the bug.

Result:

Fixes #8495
2018-11-13 20:56:09 +01:00
Nick Hill
10539f4dc7 Streamline CompositeByteBuf internals (#8437)
Motivation:

CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.

My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length

Modifications:

- No longer slice added buffers and unwrap added slices
   - Components store target buf offset relative to position in
composite buf
   - Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
   - Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
   - Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements

While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.

Result:

Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
Nick Hill
44cca1a26f Avoid allocations when wrapping byte[] and ByteBuffer arrays as ByteBuf (#8420)
Motivation:

Unpooled.wrap(byte[]...) and Unpooled.wrap(ByteBuffer...) currently
allocate/copy an intermediate ByteBuf ArrayList and array, which can be
avoided.

Modifications:

- Define new internal ByteWrapper interface and add a CompositeByteBuf
constructor which takes a ByteWrapper with an array of the type that it
wraps, and modify the appropriate Unpooled.wrap(...) methods to take
advantage of it
- Tidy up other constructors in CompositeByteBuf to remove duplication
and misleading len arg (which is really an end offset into provided
array)

Result:

Less allocation/copying when wrapping byte[] and ByteBuffer arrays,
tidier code.
2018-10-30 19:35:39 +01:00
Nick Hill
48c45cf4ac Fix leak and corruption bugs in CompositeByteBuf (#8438)
Motivation:

I came across two bugs:
- Components removed due to capacity reduction aren't released
- Offsets aren't set correctly on empty components that are added
between existing components

Modifications:

Add unit tests which expose these bugs, fix them.

Result:

Bugs are fixed
2018-10-28 10:28:18 +01:00
Norman Maurer
69545aedc4
CompositeByteBuf.decompose(...) does not correctly slice content. (#8403)
Motivation:

CompositeByteBuf.decompose(...) did not correctly slice the content and so produced an incorrect representation of the data.

Modifications:

- Rewrote implementation to fix bug and also improved it to reduce GC
- Add unit tests.

Result:

Fixes https://github.com/netty/netty/issues/8400.
2018-10-19 08:05:22 +02:00
vincent-grosbois
0bea8ecf5d CompositeByteBuf nioBuffer doesn't always alloc (#8176)
In nioBuffer(int,int) in CompositeByteBuf , we create a sub-array of nioBuffers for the components that are in range, then concatenate all the components in range into a single bigger buffer.
However, if the call to nioBuffers() returned only one sub-buffer, then we are copying it to a newly-allocated buffer "merged" for no reason.

Motivation:

Profiler for Spark shows a lot of time spent in put() method inside nioBuffer(), while usually no copy of data is required.

Modification:
This change skips this last step and just returns a duplicate of the single buffer returned by the call to nioBuffers(), which will in most implementation not copy the data

Result:
No copy when the source is only 1 buffer
2018-08-07 11:31:24 +02:00
Norman Maurer
83710cb2e1
Replace toArray(new T[size]) with toArray(new T[0]) to eliminate zero-out and allow the VM to optimize. (#8075)
Motivation:

Using toArray(new T[0]) is usually the faster aproach these days. We should use it.

See also https://shipilev.net/blog/2016/arrays-wisdom-ancients/#_conclusion.

Modifications:

Replace toArray(new T[size]) with toArray(new T[0]).

Result:

Faster code.
2018-06-29 07:56:04 +02:00
nickhill
9b95b8ee62 Reduce array allocations during CompositeByteBuf construction
Motivation:

Eliminate avoidable backing array reallocations when constructing
composite ByteBufs from existing buffer arrays/Iterables. This also
applies to the Unpooled.wrappedBuffer(...) methods.

Modifications:

Ensure the initial components ComponentList is sized at least as large
as the provided buffer array/Iterable in the CompositeByteBuffer
constructors.

In single-arg Unpooled.wrappedBuffer(...) methods, set maxNumComponents
to the count of provided buffers, rather than a fixed default of 16. It
seems likely that most usage of these involves wrapping a list without
subsequent modification, particularly since they return a ByteBuf rather
than CompositeByteBuf. If a different/larger max is required there are
already the wrappedBuffer(int, ...) variants.

In fact the current behaviour could be considered inconsistent - if you
call Unpooled.wrappedBuffer(int, ByteBuf) with a single buffer, you
might expect to subsequently be able to add buffers to it (since you
specified a max related to consolidation), but it will in fact return
just a slice of the provided ByteBuf.

Result:

Fewer and smaller allocations in some cases when using CompositeByteBufs
or Unpooled.wrappedBuffer(...).
2018-06-20 16:09:23 +02:00
Norman Maurer
1988cd041d Reduce Object allocations in CompositeByteBuf.
Motivation:

We used subList in CompositeByteBuf to remove ranges of elements from the internal storage. Beside this we also used an foreach loop in a few cases which will crate an Iterator.

Modifications:

- Use our own sub-class of ArrayList which exposes removeRange(...). This allows to remove a range of elements without an extra allocation.
- Use an old style for loop to iterate over the elements to reduce object allocations.

Result:

Less allocations.
2017-12-12 09:08:58 +01:00
Norman Maurer
cc069722a2 CompositeBytebuf.copy() and copy(...) should respect the allocator
Motivation:

When calling CompositeBytebuf.copy() and copy(...) we currently use Unpooled to allocate the buffer. This is not really correct and may produce more GC then needed. We should use the allocator that was used when creating the CompositeByteBuf to allocate the new buffer which may be for example the PooledByteBufAllocator.

Modifications:

- Use alloc() to allocate the new buffer.
- Add tests
- Fix tests that depend on the copy to be backed by an byte-array without checking hasArray() first.

Result:

Fixes [#7393].
2017-11-10 07:17:16 -08:00
Norman Maurer
756b78b7df Add common tests for ByteBufAllocator / AbstractByteBufAllocator implementations.
Motivation:

We not had tests for ByteBufAllocator implementations in general.

Modifications:

Added ByteBufAllocatorTest, AbstractByteBufAllocatorTest and UnpooledByteBufAllocatorTest

Result:

More tests for allocator implementations.
2017-02-06 07:51:10 +01:00
Norman Maurer
66b1731041 PooledByteBuf.capacity(...) not enforces maxCapacity()
Motivation:

PooledByteBuf.capacity(...) miss to enforce maxCapacity() and so its possible to increase the capacity of the buffer even if it will be bigger then maxCapacity().

Modifications:

- Correctly enforce maxCapacity()
- Add unit tests for capacity(...) calls.

Result:

Correctly enforce maxCapacity().
2017-02-01 18:45:54 +01:00
Norman Maurer
e85d437398 [#5597] Not try to double release empty buffer in Unpooled.wrappedBuffer(...)
Motivation:

When Unpooled.wrappedBuffer(...) is called with an array of ByteBuf with length >= 2 and the first ByteBuf is not readable it will result in double releasing of these empty buffers when release() is called on the returned buffer.

Modifications:

- Ensure we only wrap readable buffers.
- Add unit test

Result:

No double release of buffers.
2016-07-30 21:16:44 +02:00
Norman Maurer
7b25402e80 Add CompositeByteBuf.addComponent(boolean ...) method to simplify usage
Motivation:

At the moment the user is responsible to increase the writer index of the composite buffer when a new component is added. We should add some methods that handle this for the user as this is the most popular usage of the composite buffer.

Modifications:

Add new methods that autoamtically increase the writerIndex when buffers are added.

Result:

Easier usage of CompositeByteBuf.
2016-05-21 19:52:16 +02:00
Xiaoyan Lin
ccb0870600 Add methods with position independent FileChannel calls to ByteBuf
Motivation

See ##3229

Modifications:

Add methods with position independent FileChannel calls to ByteBuf and its subclasses.

Results:

The user can use these new methods to read/write ByteBuff without updating FileChannel's position.
2016-02-14 20:37:37 -08:00
Scott Mitchell
6312c2f00b CompositeByteBuf.addComponent always assume reference count ownership
Motivation:
The current interface for CompositeByteBuf.addComponent is not clear under what conditions ownership is transferred when addComponent is called. There should be a well defined behavior so that users can ensure that no leaks occur.

Modifications:
- CompositeByteBuf.addComponent should always assume reference count ownership

Result:
Users that call CompositeByteBuf.addComponent do not have to independently check if the buffer's ownership has been transferred and if not independently release the buffer.
Fixes https://github.com/netty/netty/issues/4760
2016-02-02 11:38:11 -08:00
Norman Maurer
cdb70d31ee [#4017] Implement proper resource leak detection for CompositeByteBuf
Motivation:

CompositeByteBuf only implemented simple resource leak detection and how it was implemented was completly different to the way it was for ByteBuf. The other problem was that slice(), duplicate() and others would not return a resource leak enabled buffer.

Modifications:

- Proper implementation for all level of resource leak detection for CompositeByteBuf

Result:

Proper resource leak detection for CompositeByteBuf.
2016-01-18 10:14:39 +01:00
Alex Petrov
0f9492c9af Add first-class Little Endian support to ByteBuf and descendants
As discussed in	#3209, this PR adds Little Endian accessors
to ByteBuf and descendants.

Corresponding accessors were added to UnsafeByteBufUtil,
HeapByteBufferUtil to avoid calling `reverseBytes`.

Deprecate `order()`, `order(buf)` and `SwappedByteBuf`.
2015-11-26 20:30:24 +01:00
Norman Maurer
8b1f247a1a [#3623] CompositeByteBuf.iterator() should return optimized Iterable
Motivation:

CompositeByteBuf.iterator() currently creates a new ArrayList and fill it with the ByteBufs, which is more expensive then it needs to be.

Modifications:

- Use special Iterator implementation

Result:

Less overhead when calling iterator()
2015-04-20 10:45:37 +02:00
Norman Maurer
18627749a9 Let CompositeByteBuf implement Iterable
Motivation:

CompositeByteBuf has an iterator() method but fails to implement Iterable

Modifications:

Let CompositeByteBuf implement Iterable<ByteBuf>

Result:

Easier usage
2015-04-12 13:38:27 +02:00
Norman Maurer
41fd857a7c Ensure CompositeByteBuf.addComponent* handles buffer in consistent way and not causes leaks
Motivation:

At the moment we have two problems:
 - CompositeByteBuf.addComponent(...) will not add the supplied buffer to the CompositeByteBuf if its empty, which means it will not be released on CompositeByteBuf.release() call. This is a problem as a user will expect everything added will be released (the user not know we not added it).
 - CompositeByteBuf.addComponents(...) will either add no buffers if none is readable and so has the same problem as addComponent(...) or directly release the ByteBuf if at least one ByteBuf is readable. Again this gives inconsistent handling and may lead to memory leaks.

Modifications:

 - Always add the buffer to the CompositeByteBuf and so release it on release call.

Result:

Consistent handling and no buffer leaks.
2015-02-12 16:09:41 +01:00
Trustin Lee
155c0e2f36 Implement internal memory access methods of CompositeByteBuf correctly
Motivation:

When a CompositeByteBuf is empty (i.e. has no component), its internal
memory access operations do not always behave as expected.

Modifications:

Check if the nunmber of components is zero. If so, return an empty
array or an empty NIO buffer, etc.

Result:

More robustness
2014-12-30 15:56:53 +09:00
Norman Maurer
66294892a0 CompositeByteBuf.nioBuffers(...) must not return an empty ByteBuffer array
Motivation:

CompositeByteBuf.nioBuffers(...) returns an empty ByteBuffer array if the specified length is 0. This is not consistent with other ByteBuf implementations which return an ByteBuffer array of size 1 with an empty ByteBuffer included.

Modifications:

Make CompositeByteBuf.nioBuffers(...) consistent with other ByteBuf implementations.

Result:

Consistent and correct behaviour of nioBufffers(...)
2014-12-22 11:18:32 +01:00
Idel Pivnitskiy
ad1389be9d Small performance improvements
Modifications:

- Added a static modifier for CompositeByteBuf.Component.
This class is an inner class, but does not use its embedded reference to the object which created it. This reference makes the instances of the class larger, and may keep the reference to the creator object alive longer than necessary.
- Removed unnecessary boxing/unboxing operations in HttpResponseDecoder, RtspResponseDecoder, PerMessageDeflateClientExtensionHandshaker and PerMessageDeflateServerExtensionHandshaker
A boxed primitive is created from a String, just to extract the unboxed primitive value.
- Removed unnecessary 3 times calculations in DiskAttribute.addContent(...).
- Removed unnecessary checks if file exists before call mkdirs() in NativeLibraryLoader and PlatformDependent.
Because the method mkdirs() has this check inside.
- Removed unnecessary `instanceof AsciiString` check in StompSubframeAggregator.contentLength(StompHeadersSubframe) and StompSubframeDecoder.getContentLength(StompHeaders, long).
Because StompHeaders.get(CharSequence) always returns java.lang.String.
2014-07-20 09:26:04 +02:00
Brendt Lucas
ac8ac59148 [#2642] CompositeByteBuf.deallocate memory/GC improvement
Motivation:

CompositeByteBuf.deallocate generates unnecessary GC pressure when using the 'foreach' loop, as a 'foreach' loop creates an iterator when looping.

Modification:

Convert 'foreach' loop into regular 'for' loop.

Result:

Less GC pressure (and possibly more throughput) as the 'for' loop does not create an iterator
2014-07-08 21:08:14 +02:00
Trustin Lee
d0912f2709 Fix most inspector warnings
Motivation:

It's good to minimize potentially broken windows.

Modifications:

Fix most inspector warnings from our profile
Update IntObjectHashMap

Result:

Cleaner code
2014-07-02 19:55:07 +09:00
Norman Maurer
9594a81b95 [#2622] Correctly check reference count before try to work on the underlying memory
Motivation:

Because of how we use reference counting we need to check for the reference count before each operation that touches the underlying memory. This is especially true as we use sun.misc.Cleaner.clean() to release the memory ASAP when possible. Because of this the user may cause a SEGFAULT if an operation is called that tries to access the backing memory after it was released.

Modification:

Correctly check the reference count on all methods that access the underlying memory or expose it via a ByteBuffer.

Result:

Safer usage of ByteBuf
2014-06-30 07:14:25 +02:00
Trustin Lee
51349352e2 Fix a bug that CompositeByteBuf.touch() does nothing 2014-02-13 18:24:36 -08:00
Trustin Lee
8837afddf8 Enable a user specify an arbitrary information with ReferenceCounted.touch()
- Related: #2163
- Add ResourceLeakHint to allow a user to provide a meaningful information about the leak when touching it
- DefaultChannelHandlerContext now implements ResourceLeakHint to tell where the message is going.
- Cleaner resource leak report by excluding noisy stack trace elements
2014-02-13 18:16:25 -08:00
Norman Maurer
b00f8c6390 [#1976] Fix IndexOutOfBoundsException when calling CompositeByteBuf.discardReadComponents() 2013-11-09 20:13:24 +01:00
Norman Maurer
1c73be21fc Remove redundant index check 2013-10-08 07:21:01 +02:00
Norman Maurer
2b9a07cac9 CompositeByteBuf.isDirect() should return true if its only backed by direct buffers 2013-09-26 20:37:31 +02:00
Norman Maurer
23baef8fb4 [#1853] Optimize gathering writes for CompositeByteBuf that are only backed by one ByteBuffer 2013-09-19 07:29:58 +02:00
Greg Soltis
f1d4f813ed Fix nioBuffer implementation for CompositeByteBuf 2013-09-16 06:41:08 +02:00
Norman Maurer
451e91d142 [#1821] Fix IndexOutOfBoundsException which was thrown if the last component was removed but other components was left 2013-09-09 20:29:30 +02:00
Norman Maurer
5416f2315e [#1797] No use internalNioBuffer() in derived buffers as it is not meant for concurrent access 2013-09-02 14:15:19 +02:00
Norman Maurer
795182843d Remove legancy code which we not need anymore as we use gathering writes anyway everywhere 2013-09-01 11:00:58 +02:00
Norman Maurer
8a673db92b [#1644] Fixed IndexOutOfBoundException when calling copy() on a empty CompositeByteBuf 2013-07-24 07:35:51 +02:00
Trustin Lee
764741c5ce Change the contract of ResourceLeakDetector.open() so that unsampled resources are recycled
- This also fixes the problem introduced while trying to implement #1612 (Allow to disable resource leak detection).
2013-07-23 14:06:58 +09:00
Trustin Lee
65c2a6ed46 Make ByteBuf an abstract class rather than an interface
- 5% improvement in throughput (HelloWorldServer example)
- Made CompositeByteBuf a concrete class (renamed from DefaultCompositeByteBuf) because there's no multiple inheritance in Java

Fixes #1536
2013-07-08 14:59:52 +09:00
Trustin Lee
0b9235f072 Simplify ByteBufProcessor and MessageListProcessor and Add internal component accessors to CompositeByteBuf
- Fixes #1528

It's not really easy to provide a general-purpose abstraction for fast-yet-safe iteration. Instead of making forEachByte() less optimal, let's make it do what it does really well, and allow a user to implement potentially unsafe-yet-fast loop using unsafe operations.
2013-07-05 14:00:46 +09:00
Trustin Lee
14158070bf Revamp the core API to reduce memory footprint and consumption
The API changes made so far turned out to increase the memory footprint
and consumption while our intention was actually decreasing them.

Memory consumption issue:

When there are many connections which does not exchange data frequently,
the old Netty 4 API spent a lot more memory than 3 because it always
allocates per-handler buffer for each connection unless otherwise
explicitly stated by a user.  In a usual real world load, a client
doesn't always send requests without pausing, so the idea of having a
buffer whose life cycle if bound to the life cycle of a connection
didn't work as expected.

Memory footprint issue:

The old Netty 4 API decreased overall memory footprint by a great deal
in many cases.  It was mainly because the old Netty 4 API did not
allocate a new buffer and event object for each read.  Instead, it
created a new buffer for each handler in a pipeline.  This works pretty
well as long as the number of handlers in a pipeline is only a few.
However, for a highly modular application with many handlers which
handles connections which lasts for relatively short period, it actually
makes the memory footprint issue much worse.

Changes:

All in all, this is about retaining all the good changes we made in 4 so
far such as better thread model and going back to the way how we dealt
with message events in 3.

To fix the memory consumption/footprint issue mentioned above, we made a
hard decision to break the backward compatibility again with the
following changes:

- Remove MessageBuf
- Merge Buf into ByteBuf
- Merge ChannelInboundByte/MessageHandler and ChannelStateHandler into ChannelInboundHandler
  - Similar changes were made to the adapter classes
- Merge ChannelOutboundByte/MessageHandler and ChannelOperationHandler into ChannelOutboundHandler
  - Similar changes were made to the adapter classes
- Introduce MessageList which is similar to `MessageEvent` in Netty 3
- Replace inboundBufferUpdated(ctx) with messageReceived(ctx, MessageList)
- Replace flush(ctx, promise) with write(ctx, MessageList, promise)
- Remove ByteToByteEncoder/Decoder/Codec
  - Replaced by MessageToByteEncoder<ByteBuf>, ByteToMessageDecoder<ByteBuf>, and ByteMessageCodec<ByteBuf>
- Merge EmbeddedByteChannel and EmbeddedMessageChannel into EmbeddedChannel
- Add SimpleChannelInboundHandler which is sometimes more useful than
  ChannelInboundHandlerAdapter
- Bring back Channel.isWritable() from Netty 3
- Add ChannelInboundHandler.channelWritabilityChanges() event
- Add RecvByteBufAllocator configuration property
  - Similar to ReceiveBufferSizePredictor in Netty 3
  - Some existing configuration properties such as
    DatagramChannelConfig.receivePacketSize is gone now.
- Remove suspend/resumeIntermediaryDeallocation() in ByteBuf

This change would have been impossible without @normanmaurer's help. He
fixed, ported, and improved many parts of the changes.
2013-06-10 16:10:39 +09:00
Norman Maurer
4eb0172251 [#1196] Make it clear that addComponent(..) of CompositeByteBuf does NOT increase the writerIndex 2013-03-25 10:58:50 +01:00