Commit Graph

9221 Commits

Author SHA1 Message Date
Norman Maurer
6a3723c619 Respect ctx.read() calls while processing reads for the child channels when using the Http2MultiplexCodec. (#8617)
Motivation:

We did not correct respect ctx.read() calls while processing a read for a child Channel. This could lead to read stales when auto read is disabled and no other read was requested.

Modifications:

- Keep track of extra read() calls while processing reads
- Add unit tests that verify that read() is respected when triggered either in channelRead(...) or channelReadComplete(...)

Result:

Fixes https://github.com/netty/netty/issues/8209.
2018-12-05 15:29:38 +01:00
Nick Travers
0a8c93a6fa Loosen bounds check on CompositeByteBuf's maxNumComponents (#8621)
Motivation:

In versions of Netty prior to 4.1.31.Final, a CompositeByteBuf could be
created with any size (including potentially nonsensical negative
values). This behavior changed in e7737b993, which introduced a bounds
check to only allow for a component size greater than one. This broke
some existing use cases that attempted to create a byte buf with a
single component.

Modifications:

Lower the bounds check on numComponents to include the single component
case, but still throw an exception for anything less than one.

Add unit tests for the case of numComponents being less than, equal to,
and greater than this lower bound.

Result:

Return to the behavior of 4.1.30.Final, allowing one component, but
still include an explicit check against a lower bound.

Note that while creating a CompositeByteBuf with a single component is
in some ways a contradiction of the term "composite", this patch caters
for existing uses while excluding the clearly nonsensical case of asking
for a CompositeByteBuf with zero or fewer components.

Fixes #8613.
2018-12-05 08:43:53 +01:00
Francesco Nigro
4c2b11633a Adding an execute burst cost benchmark for Netty executors (#8594)
Motivation:

Netty executors doesn't have yet any means to compare with each others
nor to compare with the j.u.c. executors

Modifications:

A new benchmark measuring execute burst cost is being added

Result:

It's now possible to compare some of Netty executors with each others
and with the j.u.c. executors
2018-12-04 15:46:48 +01:00
Norman Maurer
85c1590a90 Provide a way to cache the internal nioBuffer of the PooledByteBuffer… (#8603)
Motivation:

Often a temporary ByteBuffer is used which can be cached to reduce the GC pressure.

Modifications:

Cache the ByteBuffer in the PoolThreadCache as well.

Result:

Less GC.
2018-12-04 15:39:55 +01:00
Norman Maurer
361ace7671 Update to OpenJDK 12 ea22 (#8618)
Motivation:

We should use the latest OpenJDK 12 release when running tests against Java12.

Modifications:

- Update to OpenJDK 12 ea22.
- Update pax exam version
- skip OSGI testsuite on Java12 as it does not work ea22 yet.

Result:

Use latest OpenJDK 12 version when running on the CI.
2018-12-04 11:59:23 +01:00
Julien Hoarau
849bdc8a84 Set-Cookie headers should not be combined (#8611)
Motivation:

According to the HTTP spec set-cookie headers should not be combined
because they are not using the list syntax.

Modifications:

Do not combine set-cookie headers.

Result:

Set-Cookie headers won't be combined anymore
2018-12-01 10:47:37 +01:00
Nick Hill
0ecd0c6ff3 Reduce http2 buffer slicing (#8598)
Motivation

DefaultHttp2FrameReader currently does a fair amount of "intermediate"
slicing which can be avoided.

Modifications

Avoid slicing the input buffer in DefaultHttp2FrameReader until
necessary. In one instance this also means retainedSlice can be used
instead (which may also avoid allocating).

Results

Less allocations when using http2.
2018-11-29 19:46:19 +01:00
Nick Hill
fc0a4e15ab Harden ref-counting concurrency semantics (#8583)
Motivation

#8563 highlighted race conditions introduced by the prior optimistic
update optimization in 83a19d5650. These
were known at the time but considered acceptable given the perf
benefit in high contention scenarios.

This PR proposes a modified approach which provides roughly half the
gains but stronger concurrency semantics. Race conditions still exist
but their scope is narrowed to much less likely cases (releases
coinciding with retain overflow), and even in those
cases certain guarantees are still assured. Once release() returns true,
all subsequent release/retains are guaranteed to throw, and in
particular deallocate will be called at most once.

Modifications

- Use even numbers internally (including -ve) for live refcounts
- "Final" release changes to odd number (equivalent to refcount 0)
- Retain still uses faster getAndAdd, release uses CAS loop
- First CAS attempt uses non-volatile read
- Thread.yield() after a failed CAS provides a net gain

Result

More (though not completely) robust concurrency semantics for ref
counting; increased latency under high contention, but still roughly
twice as fast as the original logic. Bench results to follow
2018-11-29 08:32:50 +01:00
Norman Maurer
42e0ce6f7a Move less common code-path to extra method to allow inlining of writeUtf8. (#8600)
Motivation:

ByteBuf is used everywhere so we should try hard to be able to make things inlinable. During benchmarks it showed that writeCharSequence(...) fails to inline writeUtf8 because it is too big even if its hots.

Modifications:

Move less common code-path to extra method to allow inlining.

Result:

Be able to inline writeUtf8 in most cases.
2018-11-27 21:04:13 +01:00
Norman Maurer
45e9fd4802 Revert "Provide a way to cache the internal nioBuffer of the PooledByteBuffer to reduce GC. (#8593)"
This reverts commit 945f123f8b as it seems to produce some failures in some cases. This needs more research.
2018-11-27 20:03:33 +01:00
Norman Maurer
945f123f8b Provide a way to cache the internal nioBuffer of the PooledByteBuffer to reduce GC. (#8593)
Motivation:

Often a temporary ByteBuffer is used which can be cached to reduce the GC pressure.

Modifications:

Add a Deque per PoolChunk which will be used for caching.

Result:

Less GC.
2018-11-27 13:55:24 +01:00
Rolandz
836c39b82f Fix offset calculation in PooledByteBufAllocator when used
Motivation:

When we create new chunk with memory aligned, the offset of direct memory should be
'alignment - address & (alignment - 1)', not just 'address & (alignment - 1)'.

Modification:

Change offset calculating formula to offset = alignment - address & (alignment - 1) in PoolArena.DirectArena#offsetCacheLine and add a unit test to assert that.

Result:

Correctly calculate offset.
2018-11-27 11:47:40 +01:00
Norman Maurer
cb1090653f LocationAwareSlf4jLogger does not correctly format log message. (#8595)
Motivation:

We did miss to use MessageFormatter inside LocationAwareSlf4jLogger and so {} was not correctly replaced in log messages when using slf4j.
This regression was introduced by afe0767e9c.

Modifications:

- Make use of MessageFormatter
- Add unit test.

Result:

Fixes https://github.com/netty/netty/issues/8483.
2018-11-27 11:45:11 +01:00
Norman Maurer
2e0922d773 Use addAndGet(...) as a replacement for compareAndSet(...) when tracking the direct memory usage. (#8596)
Motivation:

We can change from using compareAndSet to addAndGet, which emits a different CPU instruction on x86 (CMPXCHG to XADD) when count direct memory usage. This instruction is cheaper in general and so produce less overhead on the "happy path". If we detect too much memory usage we just rollback the change before throwing the Error.

Modifications:

Replace compareAndSet(...) with addAndGet(...)

Result:

Less overhead when tracking direct memory.
2018-11-27 08:38:01 +01:00
Norman Maurer
48c7ca8edd Factor out less common code-path into own method to allow inlining. (#8590)
Motivation:

During benchmarks two methods showed up as "hot method too big". We can easily make these smaller by factor out some less common code-path to an extra method and so allow inlining.

Modifications:

Factor out less common code path to an extra method.

Result:

Hot methods can be inlined.
2018-11-25 21:46:33 +01:00
Norman Maurer
5b239150f1 HeadContext is inbound and outbound (#8592)
Motivation:

Our HeadContext in DefaultChannelPipeline does handle inbound and outbound but we only marked it as outbound. While this does not have any effect in the current code-base it can lead to problems when we change our internals (this is also how I found the bug).

Modifications:

Construct HeadContext so it is also marked as handling inbound.

Result:

More correct code.
2018-11-24 10:48:13 +01:00
Norman Maurer
e114d6be46
Remove OIO transport (and transports that depend on it). (#8580)
Motivation:

This transport is unique because it uses Java's blocking IO (java.io / java.net) under the hood. However it is not clear if this transport is actually useful so it should be removed.

Modifications:

- Remove OIO transport and RXTX transport which depend on it.
- Remove Oio*Sctp* implementations
- Remove PerThreadEventLoop* which was only used by OIO transport.

Result:

Fixes https://github.com/netty/netty/issues/8510.
2018-11-21 15:23:18 +01:00
Norman Maurer
6650b325ad Mark OIO based transports as deprecated as preparation for removal in Netty 5. (#8579)
Motivation:

We plan to remove the OIO based transports in Netty 5 so we should mark these as deprecated already.

Modifications:

Mark all OIO based transports as deprecated.

Result:

Give the user a heads-up for removal.
2018-11-21 15:16:13 +01:00
Norman Maurer
99aa51b74a Combine flushes in DnsNameResolver to allow usage of sendmmsg to reduce syscall costs (#8470)
Motivation:

Some of transports support gathering writes when using datagrams. For example this is the case for EpollDatagramChannel. We should minimize the calls to flush() to allow making efficient usage of sendmmsg in this case.

Modifications:

- minimize flush() operations when we query for multiple address types.
- reduce GC by always directly schedule doResolveAll0(...) on the EventLoop.

Result:

Be able to use sendmmsg internally in the DnsNameResolver.
2018-11-21 06:42:50 +01:00
Norman Maurer
63b0f78e56 Remove transitive dependency on slf4j in example (#8582)
Motivation:

We currently depend on slf4j in an transitive way in one of our classes in the examples. We should not do this.

Modifications:

Remove logging in example.

Result:

Remove not needed dependency.
2018-11-21 06:39:53 +01:00
Norman Maurer
e4fae1c98e
Remove UDT transport for Netty 5. (#8578)
Motivation:

The UDT transport was marked as @Deprecated a long time ago as the underlying native library is not really maintained anymore. We should remove it as part of Netty 5.

Modifications:

Remove UDT transport

Result:

Dont try to maintain a transport which uses an unmaintained native lib internally.
2018-11-20 19:26:20 +01:00
Norman Maurer
4891f1183d Fix javadoc to correctly explain how ChannelDuplexHandler.deregister(...) works. (#8577)
Motivation:

We had an error in the javadoc which was most likely caused by copy and paste.

Modifications:

Fix javadoc.

Result:

Correct javadoc.
2018-11-20 16:46:18 +01:00
Norman Maurer
2c78dde749 Update version number to start working on Netty 5 2018-11-20 15:49:57 +01:00
Norman Maurer
38524ec3e2
Fix test that assumed detection of peer supported algs is not supported in BoringSSL. (#8573)
Motivation:

0d2e38d5d6 added supported for detection of peer supported algorithms but we missed to fix the testcase.

Modifications:

Fix test-case.

Result:

No more failing tests with BoringSSL.
2018-11-19 12:13:05 +01:00
Norman Maurer
278b49b2a7
Recover from Selector IOException (#8569)
Motivation:

When the Selector throws an IOException during our EventLoop processing we should rebuild it and transfer the registered Channels. At the moment we will continue trying to use it which will never work.

Modifications:

- Rebuild Selector when an IOException is thrown during any select*(...) methods.
- Add unit test.

Result:

Fixes https://github.com/netty/netty/issues/8566.
2018-11-19 07:41:43 +01:00
Jake Luciani
63dc1f5aaa Allow adjusting of lead detection sampling interval. (#8568)
Motivation:

We should allow adjustment of the leak detecting sampling interval when in SAMPLE mode.

Modifications:

Added new int property io.netty.leakDetection.samplingInterval

Result:

Be able to consume changes made by the user.
2018-11-16 17:22:03 +01:00
Andremoniy
abc8a08c96 Rethrow Error during retrieving remoteAddress / localAddress
Motivation:

Besides an error caused by closing socket in Windows a bunch of other errors may happen at this place which won't be somehow logged. For instance any VirtualMachineError as OutOfMemoryError will be simply ignored. The library should at least log the problem.

Modification:

Added logging of the throwable object.

Result:

Fixes #8499.
2018-11-16 17:20:28 +01:00
Norman Maurer
cb0d23923f
Refresh DNS configuration each 5 minutes. (#8468)
Motivation:

We should refresh the DNS configuration each 5 minutes to be able to detect changes done by the user. This is inline with what OpenJDK is doing

Modifications:

Refresh config every 5 minutes.

Result:

Be able to consume changes made by the user.
2018-11-16 10:37:29 +01:00
JStroom
ce02d5a184 Update SslHandler.java (#8564)
Swallow SSL Exception "closing inbound before receiving peer's close_notify" when running on Java 11 (#8463)

Motivation:

When closing a inbound SSL connection before the remote peer has send a close notify, the Java JDK is trigger happy to throw an exception. This exception can be ignored since the connection is about to be closed.
The exception wasn't printed in Java 8, based on filtering on the exception message. In Java 11 the exception message has been changed.

Modifications:
Update the if statement to also filter/swallow the message on Java 11.

Result:
On Java 11 the exception isn't printed with log levels set to debug. The old behaviour is maintained.
2018-11-16 10:32:34 +01:00
Norman Maurer
8d4d76d216
ReferenceCountedOpenSslEngine SSLSession.getLocalCertificates() / getLocalPrincipial() did not work when KeyManagerFactory was used. (#8560)
Motivation:

The SSLSession.getLocalCertificates() / getLocalPrincipial() methods did not correctly return the local configured certificate / principal if a KeyManagerFactory was used when configure the SslContext.

Modifications:

- Correctly update the local certificates / principial when the key material is selected.
- Add test case that verifies the SSLSession after the handshake to ensure we correctly return all values.

Result:

SSLSession returns correct values also when KeyManagerFactory is used with the OpenSSL provider.
2018-11-16 07:38:32 +01:00
Norman Maurer
20d4fda55e
Return the correct pointer from ReferenceCountedOpenSslContext.context() and sslCtxPointer() (#8562)
Motivation:

We did not return the pointer to SSL_CTX put to the internal datastructure of tcnative.

Modifications:

Return the correct pointer.

Result:

Methods work as documented in the javadocs.
2018-11-16 07:37:57 +01:00
Norman Maurer
7667361924
Update to netty-tcnative 2.0.20.Final (#8561)
Motivation:

Update to netty-tcnative 2.0.20.Final which fixed a bug related to retrieving the remote signature algorithms when using BoringSSL.

Modifications:

Update netty-tcnative

Result:

Be able to correctly detect the remote signature algorithms when using BoringSSL.
2018-11-15 18:05:15 +01:00
Norman Maurer
845a65b31c
Nio|Epoll|KqueueEventLoop task execution might throw UnsupportedOperationException on shutdown. (#8476)
Motivation:

There is a racy UnsupportedOperationException instead because the task removal is delegated to MpscChunkedArrayQueue that does not support removal. This happens with SingleThreadEventExecutor that overrides the newTaskQueue to return an MPSC queue instead of the LinkedBlockingQueue returned by the base class such as NioEventLoop, EpollEventLoop and KQueueEventLoop.

Modifications:

- Catch the UnsupportedOperationException
- Add unit test.

Result:

Fix #8475
2018-11-15 07:19:28 +01:00
Norman Maurer
0d2e38d5d6
Correctly convert supported signature algorithms when using BoringSSL (#8481)
* Correctly convert supported signature algorithms when using BoringSSL

Motivation:

BoringSSL uses different naming schemes for the signature algorithms so we need to adjust the regex to also handle these.

Modifications:

- Adjust SignatureAlgorithmConverter to handle BoringSSL naming scheme
- Ensure we do not include duplicates
- Add unit tests.

Result:

Correctly convert boringssl signature algorithm names.
2018-11-14 19:23:11 +01:00
Norman Maurer
d1654484c1
Correctly convert between openssl / boringssl and java cipher names when using TLSv1.3 (#8485)
Motivation:

We did not correctly convert between openssl / boringssl and java ciphers when using TLV1.3 which had different effects when either using openssl or boringssl.
 - When using openssl and TLSv1.3 we always returned SSL_NULL_WITH_NULL_NULL as cipher name
 - When using boringssl with TLSv1.3 we always returned an incorrect constructed cipher name which does not match what is defined by Java.

Modifications:

 - Add correct mappings in CipherSuiteConverter for both openssl and boringssl
 - Add unit tests for CipherSuiteConvert
 - Add unit in SSLEngine which checks that we do not return SSL_NULL_WITH_NULL_NULL ever and that server and client returns the same cipher name.

Result:

Fixes https://github.com/netty/netty/issues/8477.
2018-11-14 08:49:13 +01:00
Tim Brooks
11ec7d892e Cleanup SslHandler handshake/renegotiation (#8555)
Motivation:

The code for initiating a TLS handshake or renegotiation process is
currently difficult to reason about.

Modifications:

This commit introduces to clear paths for starting a handshake. The
first path is a normal handshake. The handshake is started and a timeout
is scheduled.

The second path is renegotiation. If the first handshake is incomplete,
the renegotiation promise is added as a listener to the handshake
promise. Otherwise, the renegotiation promise replaces the original
promsie. At that point the handshake is started again and a timeout is
scheduled.

Result:

Cleaner and easier to understand code.
2018-11-14 08:19:06 +01:00
Bryce Anderson
044515f369 Defer HTTP/2 stream transition state on initial write until headers are written (#8471)
Motivation:
When the DefaultHttp2ConnectionEncoder writes the initial headers for a new
locally created stream we create the stream in the half-closed state if the
end-stream flag is set which signals to the life cycle manager that the headers
have been sent. However, if we synchronously fail to write the headers the
life cycle manager then sends a RST_STREAM on our behalf which is a connection
level PROTOCOL_ERROR because the peer sees the stream in an IDLE state.

Modification:
Don't open the stream in the half-closed state if the end-stream flag is
set and let the life cycle manager take care of it.

Result:
Cleaner state management in the DefaultHttp2ConnectionEncoder.

Fixes #8434.
2018-11-14 08:17:43 +01:00
Nick Hill
804e1fa9cc Fix ref-counting when CompositeByteBuf is used with retainedSlice() (#8497)
Motivation:

ByteBuf.retainedSlice() and similar methods produce sliced buffers with
an independent refcount to the buffer that they wrap.


One of the optimizations in 10539f4dc7 was
to use the ref to the unwrapped buffer object for added slices, but this
did not take into account the above special case when later releasing.

Thanks to @rkapsi for discovering this via #8495.

Modifications:

Since a reference to the slice is still kept in the Component class,
just changed Component.freeIfNecessary() to release the slice in
preference to the unwrapped buf.

Also added a unit test which reproduces the bug.

Result:

Fixes #8495
2018-11-13 20:56:09 +01:00
Norman Maurer
4c73d24ea8
Handshake timeout may never be scheduled if handshake starts via a flush or starttls is used. (#8494)
Motivation:

We did not correctly schedule the handshake timeout if the handshake was either started by a flush(...) or if starttls was used.

Modifications:

- Correctly setup timeout in all cases
- Add unit tests.

Result:

Fixes https://github.com/netty/netty/issues/8493.
2018-11-13 19:22:38 +01:00
Norman Maurer
88e4817cef
Include correct dependencies for testsuite-shading on windows. (#8491)
Motivation:

We missed to include a profile for windows which means that we did not have the correct dependencies setup.

Modifications:

- Add missing profile
- Add assumeFalse(...) to ensure we do only test the native transpot shading on non windows platforms.
- Explicit specify dependency on netty-common

Result:

Fixes https://github.com/netty/netty/issues/8489.
2018-11-12 20:59:44 +01:00
Norman Maurer
c0dfb568a2
SSLHandler may throw AssertionError if writes occur before channelAct… (#8486)
Motivation:

If you attempt to write to a channel with an SslHandler prior to channelActive being called you can hit an assertion. In particular - if you write to a channel it forces some handshaking (through flush calls) to occur.

The AssertionError only happens on Java11+.

Modifications:

- Replace assert by an "early return" in case of the handshake be done already.
- Add unit test that verifies we do not hit the AssertionError anymore and that the future is correctly failed.

Result:

Fixes https://github.com/netty/netty/issues/8479.
2018-11-11 07:23:08 +01:00
Norman Maurer
35471a1a6f
Enable netty-tcnative shading test again (#8492)
Motivation:

We disabled the test at some point but it should work now without any problems.

Modifications:

Remove @Ignore from test.

Result:

Verify shading of netty-tcnative on CI.
2018-11-11 07:18:21 +01:00
Nick Hill
0f8ce1b284 Fix incorrect sizing of temp byte arrays in (Unsafe)ByteBufUtil (#8484)
Motivation:

Two similar bugs were introduced by myself in separate recent PRs #8393
and #8464, while optimizing the assignment/handling of temporary arrays
in ByteBufUtil and UnsafeByteBufUtil.

The temp arrays allocated for buffering data written to an OutputStream
are incorrectly sized to the full length of the data to copy rather than
being capped at WRITE_CHUNK_SIZE.

Unfortunately one of these is in the 4.1.31.Final release, I'm really
sorry and will be more careful in future.

This kind of thing is tricky to cover in unit tests.

Modifications:

Revert the temp array allocations back to their original sizes.

Avoid making duplicate calls to ByteBuf.capacity() in a couple of places
in ByteBufUtil (unrelated thing I noticed, can remove it from this PR if
desired!)

Result:

Temporary byte arrays will be reverted to their originally intended
sizes.
2018-11-09 18:24:38 +01:00
Bryce Anderson
a140e6dcad Make Http2StreamFrameToHttpObjectCodec truly @Sharable (#8482)
Motivation:
The `Http2StreamFrameToHttpObjectCodec` is marked `@Sharable` but mutates
an internal `HttpScheme` field every time it is added to a pipeline.

Modifications:
Instead of storing the `HttpScheme` in the handler we store it as an
attribute on the parent channel.

Result:
Fixes #8480.
2018-11-09 18:23:53 +01:00
Norman Maurer
e766469e87
Update to openjdk 12ea19 (#8487)
Motivation:

We should test against latest EA releases.

Modifications:

Update to openkdk 12ea19

Result:

Use latest openjdk 12 EA build on the CI.
2018-11-09 16:50:58 +01:00
Norman Maurer
8a24df88a4
Ensure we correctly call wrapEngine(...) during tests. (#8473)
Motivation:

We should call wrapEngine(...) in our SSLEngineTest to correctly detect all errors in case of the OpenSSLEngine.

Modifications:

Add missing wrapEngine(...) calls.

Result:

More correct tests
2018-11-08 15:22:33 +01:00
Norman Maurer
fd57d971d1
Override and so delegate all methods in OpenSslX509Certificate (#8472)
Motivation:

We did not override all methods in OpenSslX509Certificate and delegate to the internal 509Certificate.

Modifications:

Add missing overrides.

Result:

More correct implementation
2018-11-07 12:16:04 +01:00
时无两丶
28f9136824 Replace ConcurrentHashMap at allLeaks with a thread-safe set (#8467)
Motivation:
allLeaks is to store the DefaultResourceLeak. When we actually use it, the key is DefaultResourceLeak, and the value is actually a meaningless value.
We only care about the keys of allLeaks and don't care about the values. So Set is more in line with this scenario.
Using Set as a container is more consistent with the definition of a container than Map.

Modification:

Replace allLeaks with set. Create a thread-safe set using 'Collections.newSetFromMap(new ConcurrentHashMap<DefaultResourceLeak<?>, Boolean>()).'
2018-11-06 11:21:56 +01:00
Nick Hill
5954110b9a Use ByteBufUtil.BYTE_ARRAYS ThreadLocal temporary arrays in more places (#8464)
Motivation:

#8388 introduced a reusable ThreadLocal<byte[]> for use in
decodeString(...). It can be used in more places in the buffer package
to avoid temporary allocations of small arrays.

Modifications:

Encapsulate use of the ThreadLocal in a static package-private
ByteBufUtil.threadLocalTempArray(int) method, and make use of it from a
handful of new places including ByteBufUtil.readBytes(...).

Result:

Fewer short-lived small byte array allocations.
2018-11-05 21:11:28 +01:00
Nick Hill
10539f4dc7 Streamline CompositeByteBuf internals (#8437)
Motivation:

CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.

My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length

Modifications:

- No longer slice added buffers and unwrap added slices
   - Components store target buf offset relative to position in
composite buf
   - Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
   - Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
   - Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements

While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.

Result:

Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00