Commit Graph

739 Commits

Author SHA1 Message Date
Carl Mastrangelo
83a19d5650 Optimistically update ref counts
Motivation:
Highly retained and released objects have contention on their ref
count.  Currently, the ref count is updated using compareAndSet
with care to make sure the count doesn't overflow, double free, or
revive the object.

Profiling has shown that a non trivial (~1%) of CPU time on gRPC
latency benchmarks is from the ref count updating.

Modification:
Rather than pessimistically assuming the ref count will be invalid,
optimistically update it assuming it will be.  If the update was
wrong, then use the slow path to revert the change and throw an
execption.  Most of the time, the ref counts are correct.

This changes from using compareAndSet to getAndAdd, which emits a
different CPU instruction on x86 (CMPXCHG to XADD).  Because the
CPU knows it will modifiy the memory, it can avoid contention.

On a highly contended machine, this can be about 2x faster.

There is a downside to the new approach.  The ref counters can
temporarily enter invalid states if over retained or over released.
The code does handle these overflow and underflow scenarios, but it
is possible that another concurrent access may push the failure to
a different location.  For example:

Time 1 Thread 1: obj.retain(INT_MAX - 1)
Time 2 Thread 1: obj.retain(2)
Time 2 Thread 2: obj.retain(1)

Previously Thread 2 would always succeed and Thread 1 would always
fail on the second access.  Now, thread 2 could fail while thread 1
is rolling back its change.

====

There are a few reasons why I think this is okay:

1. Buggy code is going to have bugs.  An exception _is_ going to be
   thrown.  This just causes the other threads to notice the state
   is messed up and stop early.
2. If high retention counts are a use case, then ref count should
   be a long rather than an int.
3. The critical section is greatly reduced compared to the previous
   version, so the likelihood of this happening is lower
4. On error, the code always rollsback the change atomically, so
   there is no possibility of corruption.

Result:
Faster refcounting

```
BEFORE:

Benchmark                                                                                             (delay)    Mode      Cnt         Score    Error  Units
AbstractReferenceCountedByteBufBenchmark.retainRelease_contended                                            1  sample  2901361       804.579 ±  1.835  ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_contended                                           10  sample  3038729       785.376 ± 16.471  ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_contended                                          100  sample  2899401       817.392 ±  6.668  ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_contended                                         1000  sample  3650566      2077.700 ±  0.600  ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_contended                                        10000  sample  3005467     19949.334 ±  4.243  ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_uncontended                                          1  sample   456091        48.610 ±  1.162  ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_uncontended                                         10  sample   732051        62.599 ±  0.815  ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_uncontended                                        100  sample   778925       228.629 ±  1.205  ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_uncontended                                       1000  sample   633682      2002.987 ±  2.856  ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_uncontended                                      10000  sample   506442     19735.345 ± 12.312  ns/op

AFTER:
Benchmark                                                                                             (delay)    Mode      Cnt         Score    Error  Units
AbstractReferenceCountedByteBufBenchmark.retainRelease_contended                                            1  sample  3761980       383.436 ±  1.315  ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_contended                                           10  sample  3667304       474.429 ±  1.101  ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_contended                                          100  sample  3039374       479.267 ±  0.435  ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_contended                                         1000  sample  3709210      2044.603 ±  0.989  ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_contended                                        10000  sample  3011591     19904.227 ± 18.025  ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_uncontended                                          1  sample   494975        52.269 ±  8.345  ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_uncontended                                         10  sample   771094        62.290 ±  0.795  ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_uncontended                                        100  sample   763230       235.044 ±  1.552  ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_uncontended                                       1000  sample   634037      2006.578 ±  3.574  ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_uncontended                                      10000  sample   506284     19742.605 ± 13.729  ns/op

```
2017-10-04 08:42:33 +02:00
回眸,境界
06da0ceb64 Fix typo in comment. 2017-10-02 08:16:22 +02:00
Norman Maurer
625a7426cd [maven-release-plugin] prepare for next development iteration 2017-09-25 06:12:32 +02:00
Norman Maurer
f57d8f00e1 [maven-release-plugin] prepare release netty-4.1.16.Final 2017-09-25 06:12:16 +02:00
Carl Mastrangelo
003c8cc7ab Expose all defaults on PooledByteBufAllocator
Motivation:
Most, but not all defaults are statically exposed on
PooledByteBufAllocator.  This makes it cumbersome to make a custom
allocator where most of the defaults remain the same.

Modification:
Expose useCacheForAllThreads, and Direct preferred.  The latter is
needed because it is under the internal package, and public code
should probably not depend on it.

Result:
More customizeable allocators
2017-09-20 21:48:53 -07:00
Norman Maurer
9d56439aa1 Make UnpooledDirectByteBuf, UnpooledHeapByteBuf and UnpooledUnsafeDirectByteBuf constructors public.
Motivation:

The constrcutors a protected atm but the classes are public. We should make the constructors public as well to make it easier to write your own ByteBufAllocator.

Modifications:

Change constructors to be public and add some javadocs.

Result:

Easier to create own ByteBufAllocator.
2017-09-18 21:42:46 -07:00
Carl Mastrangelo
d2cb51bc2e Enable PooledByteBufAllocator to work, event without a cache
Motivation:
`useCacheForAllThreads` may be false which disables memory caching
on non netty threads.  Setting this argument or the system property
makes it impossible to use `PooledByteBufAllocator`.

Modifications:

Delayed the check of `freeSweepAllocationThreshold` in
`PoolThreadCache` to after it knows there will be any caches in
use.  Additionally, check if the caches will have any data in them
(rather than allocating a 0-length array).

A test case is also added that fails without this change.

Results:

Fixes #7194
2017-09-08 10:20:45 +02:00
Norman Maurer
bca35b0449 Allow to construct UnpooledByteBufAllocator that explictly always use sun.misc.Cleaner
Motivation:

When the user want to have the direct memory explicitly managed by the GC (just as java.nio does) it is useful to be able to construct an UnpooledByteBufAllocator that allows this without the chances to see any memory leak.

Modifications:

Allow to explicitly disable the usage of reflection to construct direct ByteBufs and so be sure these will be collected by GC.

Result:

More flexible way to use the UnpooledByteBufAllocator.
2017-08-31 12:57:09 +02:00
Carl Mastrangelo
7528e5a11e Use threadsafe setter on Atomic Updaters
Motivation:
The documentation for field updates says:

> Note that the guarantees of the {@code compareAndSet}
> method in this class are weaker than in other atomic classes.
> Because this class cannot ensure that all uses of the field
> are appropriate for purposes of atomic access, it can
> guarantee atomicity only with respect to other invocations of
> {@code compareAndSet} and {@code set} on the same updater.

This implies that volatiles shouldn't use normal assignment; the
updater should set them.

Modifications:
Use setter for field updaters that make use of compareAndSet.

Result:
Concurrency compliant code
2017-08-31 10:14:40 +02:00
Norman Maurer
b967805f32 [maven-release-plugin] prepare for next development iteration 2017-08-24 15:38:22 +02:00
Norman Maurer
da8e010a42 [maven-release-plugin] prepare release netty-4.1.15.Final 2017-08-24 15:37:59 +02:00
Norman Maurer
e487db7836 Use the ByteBufAllocator when copy a ReadOnlyByteBufferBuf and so also be able to release it without the GC when the Cleaner is present.
Motivation:

In ReadOnlyByteBufferBuf.copy(...) we just allocated a ByteBuffer directly and wrapped it. This way it was not possible for us to free the direct memory that was used by the copy without the GC.

Modifications:

- Ensure we use the allocator when create the copy and so be able to release direct memory in a timely manner
- Add unit test
- Depending on if the to be copied buffer is direct or heap based we also allocate the same type on copy.

Result:

Fixes [#7103].
2017-08-16 07:33:10 +02:00
Nikolay Fedorovskikh
0774c91456 Support the little endian floats and doubles by ByteBuf
Motivation:
`ByteBuf` does not have the little endian variant of float/double access methods.

Modifications:
Add support for little endian floats and doubles into `ByteBuf`.

Result:
`ByteBuf` has get/read/set/writeFloatLE() and get/read/set/writeDoubleLE() methods. Fixes [#6576].
2017-08-15 06:24:28 +02:00
louyl
8be9a63c1c FIX endless loop in ByteBufUtil#writeAscii
Motivation:

Missing return in ByteBufUtil#writeAscii causes endless loop

Modifications:

Add return after write finished

Result:

ByteBufUtil#writeAscii is ok
2017-08-12 12:50:00 +02:00
Norman Maurer
52f384b37f [maven-release-plugin] prepare for next development iteration 2017-08-02 12:55:10 +00:00
Norman Maurer
8cc1071881 [maven-release-plugin] prepare release netty-4.1.14.Final 2017-08-02 12:54:51 +00:00
Scott Mitchell
452fd36240 ByteBufs which are not resizable should not throw in ensureWritable(int,boolean)
Motivation:
ByteBuf#ensureWritable(int,boolean) returns an int indicating the status of the resize operation. For buffers that are unmodifiable or cannot be resized this method shouldn't throw but just return 1.
ByteBuf#ensureWriteable(int) should throw unmodifiable buffers.

Modifications:
- ReadOnlyByteBuf should be updated as described above.
- Add a unit test to SslHandler which verifies the read only buffer can be tolerated in the aggregation algorithm.

Result:
Fixes https://github.com/netty/netty/issues/7002.
2017-07-22 08:44:48 -07:00
Norman Maurer
06f64948d5 Add tests to ensure an IllegalReferenceCountException is thrown if set/writeCharSequence is called on a released buffer
Motivation:

We need to ensure we not allow calling set/writeCharsequence on an released ByteBuf.

Modifications:

Add test-cases

Result:

Proves fix of [#6951].
2017-07-21 07:39:32 +02:00
Norman Maurer
4af47f0ced AbstractByteBuf.setCharSequence(...) must not expand buffer
Motivation:

AbstractByteBuf.setCharSequence(...) must not expand the buffer if not enough writable space is present in the buffer to be consistent with all the other set operations.

Modifications:

- Ensure we only exand the buffer on writeCharSequence(...) but not on setCharSequence(...)
- Add unit tests.

Result:

Consistent and correct behavior.
2017-07-19 19:44:53 +02:00
Norman Maurer
d125adec38 AbstractByteBuf.ensureWritable(...) should check if buffer was released
Motivation:

AbstractByteBuf.ensureWritable(...) should check if buffer was released and if so throw an IllegalReferenceCountException

Modifications:

Ensure we throw in all cases.

Result:

More consistent and correct behaviour
2017-07-19 07:34:08 +02:00
louxiu
0ad99310f5 Record release when enable detailed leak detection
Motivation:
It would be easier to find where is missing release call in several retain release calls on a ByteBuf

Modifications:
Remove final modifier on SimpleLeakAwareByteBuf and SimpleLeakAwareByteBuf release function and override it to record release in AdvancedLeakAwareByteBuf and AdvancedLeakAwareCompositeByteBuf

Result:
Release will be recorded when enable detailed leak detection
2017-07-18 09:28:56 +02:00
Scott Mitchell
86e653e04f SslHandler aggregation of plaintext data on write
Motivation:
Each call to SSL_write may introduce about ~100 bytes of overhead. The OpenSslEngine (based upon OpenSSL) is not able to do gathering writes so this means each wrap operation will incur the ~100 byte overhead. This commit attempts to increase goodput by aggregating the plaintext in chunks of <a href="https://tools.ietf.org/html/rfc5246#section-6.2">2^14</a>. If many small chunks are written this can increase goodput, decrease the amount of calls to SSL_write, and decrease overall encryption operations.

Modifications:
- Introduce SslHandlerCoalescingBufferQueue in SslHandler which will aggregate up to 2^14 chunks of plaintext by default
- Introduce SslHandler#setWrapDataSize to control how much data should be aggregated for each write. Aggregation can be disabled by setting this value to <= 0.

Result:
Better goodput when using SslHandler and the OpenSslEngine.
2017-07-10 12:22:08 -07:00
Nikolay Fedorovskikh
df568c739e Use ByteBuf#writeShort/writeMedium instead of writeBytes
Motivation:

1. Some encoders used a `ByteBuf#writeBytes` to write short constant byte array (2-3 bytes). This can be replaced with more faster `ByteBuf#writeShort` or `ByteBuf#writeMedium` which do not access the memory.
2. Two chained calls of the `ByteBuf#setByte` with constants can be replaced with one `ByteBuf#setShort` to reduce index checks.
3. The signature of method `HttpHeadersEncoder#encoderHeader` has an unnecessary `throws`.

Modifications:

1. Use `ByteBuf#writeShort` or `ByteBuf#writeMedium` instead of `ByteBuf#writeBytes` for the constants.
2. Use `ByteBuf#setShort` instead of chained call of the `ByteBuf#setByte` with constants.
3. Remove an unnecessary `throws` from `HttpHeadersEncoder#encoderHeader`.

Result:

A bit faster writes constants into buffers.
2017-07-10 14:37:41 +02:00
Norman Maurer
83db2b07b4 Also use realloc when shrink the buffer.
Motivation:

We should also use realloc when shrink the buffer to eliminate extra allocations / memory copies when possible.

Modifications:

Use realloc for expanding and shrinking when possible.

Result:

Less memory copies and allocations
2017-07-06 20:03:15 +02:00
Norman Maurer
2a376eeb1b [maven-release-plugin] prepare for next development iteration 2017-07-06 13:24:06 +02:00
Norman Maurer
c7f8168324 [maven-release-plugin] prepare release netty-4.1.13.Final 2017-07-06 13:23:51 +02:00
Nikolay Fedorovskikh
f35047765f Avoid a double check ByteBuf#ensureWritable in ByteBufUtil
Motivation:

Methods `ByteBufUtil#writeUtf8` and `ByteBufUtil#writeAscii` contains a check `ByteBuf#ensureWritable` before the calling `ByteBuf#writeBytes`. But the `ByteBuf#writeBytes` also do a such check inside.

Modifications:

Make checks more targeted.

Result:

Less redundant method calls.
2017-06-28 19:00:01 +02:00
Nikolay Fedorovskikh
ba3616da3e Apply appropriate methods for writing CharSequence into ByteBuf
Motivation:

1. `ByteBuf` contains methods to writing `CharSequence` which optimized for UTF-8 and ASCII encodings. We can also apply optimization for ISO-8859-1.
2. In many places appropriate methods are not used.

Modifications:

1. Apply optimization for ISO-8859-1 encoding in the `ByteBuf#setCharSequence` realizations.
2. Apply appropriate methods for writing `CharSequences` into buffers.

Result:

Reduce overhead from string-to-bytes conversion.
2017-06-27 07:58:39 +02:00
Nikolay Fedorovskikh
01eb428b39 Move methods for decode hex dump into StringUtil
Motivation:

PR #6811 introduced a public utility methods to decode hex dump and its parts, but they are not visible from netty-common.

Modifications:

1. Move the `decodeHexByte`, `decodeHexDump` and `decodeHexNibble` methods into `StringUtils`.
2. Apply these methods where applicable.
3. Remove similar methods from other locations (e.g. `HpackHex` test class).

Result:

Less code duplication.
2017-06-23 18:52:42 +02:00
Norman Maurer
fd67a2354d [maven-release-plugin] prepare for next development iteration 2017-06-08 21:06:24 +02:00
Norman Maurer
3acd5c68ea [maven-release-plugin] prepare release netty-4.1.12.Final 2017-06-08 21:06:01 +02:00
Norman Maurer
7922757575 Allow to access memoryAddress of wrapped ByteBuf for ReadOnlyByteBuf
Motivation:

We should allow to access the memoryAddress of the wrapped ByteBuf when using ReadOnlyByteBuf for peformance reasons. If a user act on a memoryAddress its his responsible anyway to do nothing "stupid".

Modifications:

Delegate to wrapped ByteBuf.

Result:

Less performance overhead for various operations and also when writing to a native transport (which needs the memoryAddress).
2017-06-07 18:45:26 +02:00
Renjie Sun
629b83e0a5 Move QueryStringDecoder.decodeHexByte into ByteBufUtil
Motivations:
1. There are duplicated implementations of decoding hex strings. #6797
2. ByteBufUtil.HexUtil.decodeHexDump does not handle substring start
index properly and does not decode hex byte rigorously.

Modifications:
1. Function decodeHexByte is moved from QueryStringDecoder into ByteBufUtil.
2. ByteBufUtil.HexUtil.decodeHexDump is changed to use decodeHexByte.
3. Tests are Updated accordingly.

Result:
Fixed #6797 and made hex decoding functions more robust.
2017-06-07 09:27:36 -07:00
Scott Mitchell
b71abcedd1 ByteBufUtil#decodeHexDump
Motivation:
ByteBufUtil provides a hexDump method. For debugging purposes it is often useful to decode that hex dump to get the original content, but no such method exists.

Modifications:
- Add ByteBufUtil#decodeHexDump

Result:
ByteBufUtil#decodeHexDump is available to make debugging easier.
2017-05-30 15:20:54 -07:00
Norman Maurer
0db2901f4d [maven-release-plugin] prepare for next development iteration 2017-05-11 16:00:55 +02:00
Norman Maurer
f7a19d330c [maven-release-plugin] prepare release netty-4.1.11.Final 2017-05-11 16:00:16 +02:00
Scott Mitchell
63f5cdb0d5 ByteBuf#ensureWritable(int, boolean) should not throw
Motivation:
The javadocs for ByteBuf#ensureWritable(int, boolean) indicate that it should not throw, and instead the return code should indicate the result of the operation. Due to a bug in AbstractByteBuf it is possible for a resize to be attempted on a buffer that may exceed maxCapacity() and therefore throw.

Modifications:
- If there is not enough space in the buffer, and force is false, then a resize should not be attempted

Result:
AbstractByteBuf#ensureWritable(int, boolean) enforces the javadoc constraints and does not throw.
2017-05-09 00:12:25 -07:00
Norman Maurer
6915ec3bb9 [maven-release-plugin] prepare for next development iteration 2017-04-29 14:10:00 +02:00
Norman Maurer
f30f242fee [maven-release-plugin] prepare release netty-4.1.10.Final 2017-04-29 14:09:32 +02:00
Norman Maurer
b662afeece Correctly release all buffers in UnpooledTest
Motivation:

We not correctly released all buffers in the UnpooledTest and so showed "bad" way of handling buffers to people that inspect our code to understand when a buffer needs to be released.

Modifications:

Explicit release all buffers.

Result:

Cleaner and more correct code.
2017-04-27 19:29:45 +02:00
Jason Tedor
98beb777f8 Enable configuring available processors
Motivation:

In cases when an application is running in a container or is otherwise
constrained to the number of processors that it is using, the JVM
invocation Runtime#availableProcessors will not return the constrained
value but rather the number of processors available to the virtual
machine. Netty uses this number in sizing various resources.
Additionally, some applications will constrain the number of threads
that they are using independenly of the number of processors available
on the system. Thus, applications should have a way to globally
configure the number of processors.

Modifications:

Rather than invoking Runtime#availableProcessors, Netty should rely on a
method that enables configuration when the JVM is started or by the
application. This commit exposes a new class NettyRuntime for enabling
such configuraiton. This value can only be set once. Its default value
is Runtime#availableProcessors so that there is no visible change to
existing applications, but enables configuring either a system property
or configuring during application startup (e.g., based on settings used
to configure the application).

Additionally, we introduce the usage of forbidden-apis to prevent future
uses of Runtime#availableProcessors from creeping. Future work should
enable the bundled signatures and clean up uses of deprecated and
other forbidden methods.

Result:

Netty can be configured to not use the underlying number of processors,
but rather the constrained number of processors.
2017-04-23 10:31:17 +02:00
Norman Maurer
bf0beb772c Fix IllegalArgumentException when release a wrapped ByteBuffer on Java9
Motivation:

Unsafe.invokeCleaner(...) checks if the passed in ByteBuffer is a slice or duplicate and if so throws an IllegalArgumentException on Java9. We need to ensure we never try to free a ByteBuffer that was provided by the user directly as we not know if its a slice / duplicate or not.

Modifications:

Never try to free a ByteBuffer that was passed into UnpooledUnsafeDirectByteBuf constructor by an user (via Unpooled.wrappedBuffer(....)).

Result:

Build passes again on Java9
2017-04-20 19:19:11 +02:00
Nikolay Fedorovskikh
0692bf1b6a fix the typos 2017-04-20 04:56:09 +02:00
Norman Maurer
e482d933f7 Add 'io.netty.tryAllocateUninitializedArray' system property which allows to allocate byte[] without memset in Java9+
Motivation:

Java9 added a new method to Unsafe which allows to allocate a byte[] without memset it. This can have a massive impact in allocation times when the byte[] is big. This change allows to enable this when using Java9 with the io.netty.tryAllocateUninitializedArray property when running Java9+. Please note that you will need to open up the jdk.internal.misc package via '--add-opens java.base/jdk.internal.misc=ALL-UNNAMED' as well.

Modifications:

Allow to allocate byte[] without memset on Java9+

Result:

Better performance when allocate big heap buffers and using java9.
2017-04-19 11:45:39 +02:00
Scott Mitchell
21562d8808 Retained[Duplicate|Slice] operations should not increase the reference count for UnreleasableByteBuf
Motivation:
UnreleasableByteBuf operations are designed to not modify the reference count of the underlying buffer. The Retained[Duplicate|Slice] operations violate this assumption and can cause the underlying buffer's reference count to be increased, but never allow for it to be decreased. This may lead to memory leaks.

Modifications:
- UnreleasableByteBuf's Retained[Duplicate|Slice] should leave the reference count of the parent buffer unchanged after the operation completes.

Result:
No more memory leaks due to usage of the Retained[Duplicate|Slice] on an UnreleasableByteBuf object.
2017-03-31 17:45:29 -07:00
Scott Mitchell
ef21d5f4ca UnsafeByteBufUtil errors and simplification
Motiviation:
UnsafeByteBufUtil has some bugs related to using an incorrect index, and also omitting the array paramter when dealing with byte[] objects. There is also some simplification possible with respect to type casting, and minor formatting consistentcy issues.

Modifications:
- Ensure indexing is correct when dealing with native memory
- Fix the native access and endianness for the medium/unsigned medium methods
- Ensure array is used when dealing with heap memory
- Remove unecessary casts when using long
- Fix formating and alignment

Result:
UnsafeByteBufUtil is more correct and won't access direct memory when heap arrays are used.
2017-03-30 11:52:03 -07:00
Norman Maurer
6036b3f6ea Fix buffer leak in EmptyByteBufTest introduced by aa2f16f314 2017-03-27 05:20:02 +02:00
Bryce Anderson
aa2f16f314 EmptyByteBuf allows writing ByteBufs with 0 readable bytes
Motivation:

The contract of `ByteBuf.writeBytes(ByteBuf src)` is such that it will
throw an `IndexOutOfBoundsException if `src.readableBytes()` is greater than
`this.writableBytes()`. The EmptyByteBuf class will throw the exception,
even if the source buffer has zero readable bytes, in violation of the
contract.

Modifications:

Use the helper method `checkLength(..)` to check the length and throw
the exception, if appropriate.

Result:

Conformance with the stated behavior of ByteBuf.
2017-03-21 22:00:54 -07:00
Norman Maurer
2b8c8e0805 [maven-release-plugin] prepare for next development iteration 2017-03-10 07:46:17 +01:00
Norman Maurer
1db58ea980 [maven-release-plugin] prepare release netty-4.1.9.Final 2017-03-10 07:45:28 +01:00