Commit Graph

892 Commits

Author SHA1 Message Date
kezhenxu94
ae1340fd72 extract duplicate code into method (#8720)
Motivation:

Clean up code to increase readability.

Modification:

Extract duplicate code blocks into method.

Result:

Less code duplication
2019-01-16 10:56:36 +01:00
Derek Lewis
f08e80ffd6 Remove unnecessary loop variable from AsciiString. (#8711)
Motivation:

Incrementing two variables in sync is not necessary when only one will do.

Modifications:

- Remove `j` from `for` loop and replace with `i`.
- Add more unit testing scenarios to cover changed code.

Results:

Unnecessary variable removed.
2019-01-15 08:33:48 +01:00
kashike
c0aa1ea5c7 Fix minor spelling issues in javadocs (#8701)
Motivation:

Javadocs contained some spelling errors, we should fix these.

Modification:

Fix spelling

Result:

Javadoc cleanup.
2019-01-14 07:25:13 +01:00
Norman Maurer
71430d8bbd Call FastThreadLocal.removeAll() before notify termination future of … (#8666)
Motivation:

We should try removing all FastThreadLocals for the Thread before we notify the termination. future. The user may block on the future and once it unblocks the JVM may terminate and start unloading classes.

Modifications:

Remove all FastThreadLocals for the Thread before notify termination future.

Result:

Fixes https://github.com/netty/netty/issues/6596.
2018-12-21 11:11:05 +01:00
Norman Maurer
7f22743d92
Revert "Ability to run a task at the end of an eventloop iteration." (#8637)
Motivation:

executeAfterEventLoopIteration is an Unstable API and isnt used in Netty. We should remove it to reduce complexity.

Changes:

This reverts commit 77770374fb.

Result:

Simplify implementation / cleanup.
2018-12-12 10:37:07 +01:00
Feri73
563793688f Adding support for whitespace in resource path in tests (#8606)
Motivation:

In windows if the project is in a path that contains whitespace,
resources cannot be accessed and tests fail.

Modifications:

Adds ResourcesUtil.java in netty-common. Tests use ResourcesUtil.java to access a resource.

Result:

Being able to build netty in a path containing whitespace
2018-12-12 10:29:19 +01:00
Nick Hill
fc0a4e15ab Harden ref-counting concurrency semantics (#8583)
Motivation

#8563 highlighted race conditions introduced by the prior optimistic
update optimization in 83a19d5650. These
were known at the time but considered acceptable given the perf
benefit in high contention scenarios.

This PR proposes a modified approach which provides roughly half the
gains but stronger concurrency semantics. Race conditions still exist
but their scope is narrowed to much less likely cases (releases
coinciding with retain overflow), and even in those
cases certain guarantees are still assured. Once release() returns true,
all subsequent release/retains are guaranteed to throw, and in
particular deallocate will be called at most once.

Modifications

- Use even numbers internally (including -ve) for live refcounts
- "Final" release changes to odd number (equivalent to refcount 0)
- Retain still uses faster getAndAdd, release uses CAS loop
- First CAS attempt uses non-volatile read
- Thread.yield() after a failed CAS provides a net gain

Result

More (though not completely) robust concurrency semantics for ref
counting; increased latency under high contention, but still roughly
twice as fast as the original logic. Bench results to follow
2018-11-29 08:32:50 +01:00
Norman Maurer
cb1090653f LocationAwareSlf4jLogger does not correctly format log message. (#8595)
Motivation:

We did miss to use MessageFormatter inside LocationAwareSlf4jLogger and so {} was not correctly replaced in log messages when using slf4j.
This regression was introduced by afe0767e9c.

Modifications:

- Make use of MessageFormatter
- Add unit test.

Result:

Fixes https://github.com/netty/netty/issues/8483.
2018-11-27 11:45:11 +01:00
Norman Maurer
2e0922d773 Use addAndGet(...) as a replacement for compareAndSet(...) when tracking the direct memory usage. (#8596)
Motivation:

We can change from using compareAndSet to addAndGet, which emits a different CPU instruction on x86 (CMPXCHG to XADD) when count direct memory usage. This instruction is cheaper in general and so produce less overhead on the "happy path". If we detect too much memory usage we just rollback the change before throwing the Error.

Modifications:

Replace compareAndSet(...) with addAndGet(...)

Result:

Less overhead when tracking direct memory.
2018-11-27 08:38:01 +01:00
Jake Luciani
63dc1f5aaa Allow adjusting of lead detection sampling interval. (#8568)
Motivation:

We should allow adjustment of the leak detecting sampling interval when in SAMPLE mode.

Modifications:

Added new int property io.netty.leakDetection.samplingInterval

Result:

Be able to consume changes made by the user.
2018-11-16 17:22:03 +01:00
Norman Maurer
845a65b31c
Nio|Epoll|KqueueEventLoop task execution might throw UnsupportedOperationException on shutdown. (#8476)
Motivation:

There is a racy UnsupportedOperationException instead because the task removal is delegated to MpscChunkedArrayQueue that does not support removal. This happens with SingleThreadEventExecutor that overrides the newTaskQueue to return an MPSC queue instead of the LinkedBlockingQueue returned by the base class such as NioEventLoop, EpollEventLoop and KQueueEventLoop.

Modifications:

- Catch the UnsupportedOperationException
- Add unit test.

Result:

Fix #8475
2018-11-15 07:19:28 +01:00
时无两丶
28f9136824 Replace ConcurrentHashMap at allLeaks with a thread-safe set (#8467)
Motivation:
allLeaks is to store the DefaultResourceLeak. When we actually use it, the key is DefaultResourceLeak, and the value is actually a meaningless value.
We only care about the keys of allLeaks and don't care about the values. So Set is more in line with this scenario.
Using Set as a container is more consistent with the definition of a container than Map.

Modification:

Replace allLeaks with set. Create a thread-safe set using 'Collections.newSetFromMap(new ConcurrentHashMap<DefaultResourceLeak<?>, Boolean>()).'
2018-11-06 11:21:56 +01:00
Norman Maurer
bde2865ef8
Make it clear that HashedWheelTimer only support millis. (#8322)
Motivation:

HWT does not support anything smaller then 1ms so we should make it clear that this is the case.

Modifications:

Log a warning if < 1ms is used.

Result:

Less suprising behaviour.
2018-11-02 08:10:18 +01:00
Norman Maurer
d533befa96
PlatformDependent.maxDirectMemory() must respect io.netty.maxDirectMemory (#8452)
Motivation:

In netty we use our own max direct memory limit that can be adjusted by io.netty.maxDirectMemory but we do not take this in acount when maxDirectMemory() is used. That will lead to non optimal configuration of PooledByteBufAllocator in some cases.

This came up on stackoverflow:
https://stackoverflow.com/questions/53097133/why-is-default-num-direct-arena-derived-from-platformdependent-maxdirectmemory

Modifications:

Correctly respect io.netty.maxDirectMemory and so configure PooledByteBufAllocator correctly by default.

Result:

Correct value for max direct memory.
2018-11-02 07:19:43 +01:00
Nick Hill
d7fa7be67f Exploit PlatformDependent.allocateUninitializedArray() in more places (#8393)
Motivation:

There are currently many more places where this could be used which were
possibly not considered when the method was added.

If https://github.com/netty/netty/pull/8388 is included in its current
form, a number of these places could additionally make use of the same
BYTE_ARRAYS threadlocal.

There's also a couple of adjacent places where an optimistically-pooled
heap buffer is used for temp byte storage which could use the
threadlocal too in preference to allocating a temp heap bytebuf wrapper.
For example
https://github.com/netty/netty/blob/4.1/buffer/src/main/java/io/netty/buffer/ByteBufUtil.java#L1417.

Modifications:

Replace new byte[] with PlatformDependent.allocateUninitializedArray()
where appropriate; make use of ByteBufUtil.getBytes() in some places
which currently perform the equivalent logic, including avoiding copy of
backing array if possible (although would be rare).

Result:

Further potential speed-up with java9+ and appropriate compile flags.
Many of these places could be on latency-sensitive code paths.
2018-10-27 10:43:28 -05:00
almson
1cc692dd7d Fix incorrect reachability assumption in ResourceLeakDetector (#8410)
Motivation:

trackedObject != null gives no guarantee that trackedObject remains reachable. This may cause problems related to premature finalization: false leak detector warnings.
 
Modifications:

Add private method reachabilityFence0 that works on JDK 8 and can be factored out into PlatformDependent. Later, it can be swapped for the real Reference.reachabilityFence.
 
Result:

No false leak detector warnings in future versions of JDK.
2018-10-24 22:15:13 +02:00
almson
fc35e20e2c Include correct duped value in DefaultResourceLeak.toString() (#8413)
Motivation:

DefaultResourceLeak.toString() did include the wrong value for duplicated records.

Modifications:

Include the correct value.

Result:

Correct toString() implementation.
2018-10-22 15:01:38 +02:00
Dmitriy Dumanskiy
b59336142f deprecate own ConcurrentSet for removal (#8340)
Motivation:

Java since version 6 has the wrapper for the ConcurrentHashMap that could be created via Collections.newSetFromMap(map). So no need to create own ConcurrentSet class. Also, since netty plans to switch to Java 8 soon there is another method for that - ConcurrentHashMap.newKeySet().
For now, marking this class @deprecated would be enough, just to warn users who use netty's ConcurrentSet. After switching to Java 8 ConcurrentSet should be removed and replaced with ConcurrentHashMap.newKeySet().

Modification:

ConcurrentSet deprecated.
2018-10-15 19:36:05 +02:00
Dmitriy Dumanskiy
0e4186c552 deprecate IntegerHolder for removal (#8339)
Motivation:

Seems like IntegerHolder counterHashCode field is the very old legacy field that is no longer used. Should be marked as deprecated and removed in the future versions.

Modification:

IntegerHolder class, InternalThreadLocalMap.counterHashCode() and InternalThreadLocalMap.setCounterHashCode(IntegerHolder counterHashCode) are now deprecated.
2018-10-11 14:59:47 +08:00
Norman Maurer
a208f6dc7c
Do the same extended checks as the JDK when a X509TrustManager is used with the OpenSSL provider. (#8307)
Motivation:

When a X509TrustManager is used while configure the SslContext the JDK automatically does some extra checks during validation of provided certs by the remote peer. We should do the same when our native implementation is used.

Modification:

- Automatically wrap a X509TrustManager and so do the same validations as the JDK does.
- Add unit tests.

Result:

More consistent behaviour. Fixes https://github.com/netty/netty/issues/6664.
2018-09-28 09:19:58 +02:00
Carl Mastrangelo
1dff107de1 Don't re-arm timerfd each epoll_wait (#7816)
Motivation:
The Epoll transport checks to see if there are any scheduled tasks
before entering epoll_wait, and resets the timerfd just before.
This causes an extra syscall to timerfd_settime before doing any
actual work.   When scheduled tasks aren't added frequently, or
tasks are added with later deadlines, this is unnecessary.

Modification:
Check the *deadline* of the peeked task in EpollEventLoop, rather
than the *delay*.  If it hasn't changed since last time, don't
re-arm the timer

Result:
About 2us faster on gRPC RTT 50pct latency benchmarks.

Before (2 runs for 5 minutes, 1 minute of warmup):

```
50.0%ile Latency (in nanos):		64267
90.0%ile Latency (in nanos):		72851
95.0%ile Latency (in nanos):		78903
99.0%ile Latency (in nanos):		92327
99.9%ile Latency (in nanos):		119691
100.0%ile Latency (in nanos):		13347327
QPS:                           14933

50.0%ile Latency (in nanos):		63907
90.0%ile Latency (in nanos):		73055
95.0%ile Latency (in nanos):		79443
99.0%ile Latency (in nanos):		93739
99.9%ile Latency (in nanos):		123583
100.0%ile Latency (in nanos):		14028287
QPS:                           14936
```

After:
```
50.0%ile Latency (in nanos):		62123
90.0%ile Latency (in nanos):		70795
95.0%ile Latency (in nanos):		76895
99.0%ile Latency (in nanos):		90887
99.9%ile Latency (in nanos):		117819
100.0%ile Latency (in nanos):		14126591
QPS:                           15387

50.0%ile Latency (in nanos):		61021
90.0%ile Latency (in nanos):		70311
95.0%ile Latency (in nanos):		76687
99.0%ile Latency (in nanos):		90887
99.9%ile Latency (in nanos):		119527
100.0%ile Latency (in nanos):		6351615
QPS:                           15571
```
2018-09-11 13:38:38 +02:00
Norman Maurer
afe0767e9c
Log the correct line-number when using SLF4j with netty if possible. (#8258)
* Log the correct line-number when using SLF4j with netty if possible.

Motivation:

At the moment we do not log the correct line number in many cases as it will log the line number of the logger wrapper itself. Slf4j does have an extra interface that can be used to filter out these nad make it more usable with logging wrappers.

Modifications:

Detect if the returned logger implements LocationAwareLogger and if so make use of its extra methods to be able to log the correct origin of the log request.

Result:

Better logging when using slf4j.
2018-09-07 07:34:22 +02:00
Norman Maurer
3c2dbdb5db
NioEventLoop should also use our special SelectionKeySet on Java9 and later. (#8260)
Motivation:

In Java8 and earlier we used reflection to replace the used key set if not otherwise told. This does not work on Java9 and later without special flags as its not possible to call setAccessible(true) on the Field anymore.

Modifications:

- Use Unsafe to instrument the Selector with out special set when sun.misc.Unsafe is present and we are using Java9+.

Result:

NIO transport produce less GC on Java9 and later as well.
2018-09-05 07:23:03 +02:00
Norman Maurer
ade60c11e1
PlatformDependent0 should be able to better detect if unaligned access is supported on java9 and later. (#8255)
Motivation:

In Java8 and earlier we used reflection to detect if unaligned access is supported. This fails in Java9 and later as we would need to change the accessible level of the method.
Lucky enough we can use Unsafe directly to read the content of the static field here.

Modifications:

Add special handling for detecting if unaligned access is supported on Java9 and later which does not fail due jigsaw.

Result:

Better and more correct detection on Java9 and later.
2018-09-05 07:22:16 +02:00
Norman Maurer
3eec66a974
Do not fail on runtime when an older version of Log4J2 is on the classpath. (#8240)
Motivation:

At the moment we will just assume the correct version of log4j2 is used when we find it on the classpath. This may lead to an AbstractMethodError at runtime. We should not use log4j2 if the version is not correct.

Modifications:

Check on class loading if we can use Log4J2 or not.

Result:

Fixes #8217.
2018-09-03 18:07:53 +02:00
Norman Maurer
9d8846cfce
Cleanup Log4J2Logger (#8245)
Motivation:

Log4J2Logger had some code-duplication with AbstractInternalLogger

Modifications:

Reuse AbstractInternaLogger.EXCEPTION_MESSAGE in Log4J2Logger and so remove some code-duplication

Result:

Less duplicated code.
2018-08-31 17:08:38 +02:00
Norman Maurer
38eee409c8
We should be able to use the ByteBuffer cleaner on java8 (and earlier… (#8234)
* We should be able to use the ByteBuffer cleaner on java8 (and earlier versions) even if sun.misc.Unsafe is not present.

Motivation:

At the moment we have a hard dependency on sun.misc.Unsafe to use the Cleaner on Java8 and earlier. This is not really needed as we can still use pure reflection if sun.misc.Unsafe is not present.

Modifications:

Refactor Cleaner6 to fallback to pure reflection if sun.misc.Unsafe is not present on system.

Result:

More timely releasing of direct memory on Java8 and earlier when sun.misc.Unsafe is not present.
2018-08-30 07:43:10 +02:00
Norman Maurer
4a5b61fc13
Fix log message about using non-direct buffers by default (#8235)
Motivation:

f77891cc17 changed slightly how we detect if we should prefer direct buffers or not but did miss to also take this into account when logging.

Modifications:

Fix branch for log message to reflect changes in f77891cc17.

Result:

Correct logging.
2018-08-30 06:57:12 +02:00
Terence Yim
79706357c7 Fix race condition in the NonStickyEventExecutorGroup (#8232)
Motivation:

There was a race condition between the task submitter and task executor threads such that the last Runnable submitted may not get executed. 

Modifications:

The bug was fixed by checking the task queue and state in the task executor thread after it saw the task queue was empty.

Result:

Fixes #8230
2018-08-29 19:42:01 +02:00
Norman Maurer
f77891cc17
We should prefer direct buffers if we can access the cleaner even if sun.misc.Unsafe is not present. (#8233)
Motivation:

We should prefer direct buffers whenever we can use the cleaner even if sun.misc.Unsafe is not present.

Modifications:

Correctly prefer direct buffers in all cases.

Result:

More correct code.
2018-08-29 08:21:07 +02:00
Norman Maurer
8679c5ef43
CleanerJava9 should be able to do its job even with a SecurityManager installed. (#8204)
Motivation:

CleanerJava9 currently fails whever a SecurityManager is installed. We should make use of AccessController.doPrivileged(...) so the user can give it the correct rights.

Modifications:

Use doPrivileged(...) when needed.

Result:

Fixes https://github.com/netty/netty/issues/8201.
2018-08-28 16:32:29 +02:00
zhaojigang
338ef96931 Recycler will produce npe error when multiple recycled at different thread
Motivation:

Recycler may produce a NPE when the same object is recycled multiple times from different threads.

Modifications:

- Check if the id has changed or if the Stack became null and if so throw an IllegalStateException
- Add unit test

Result:

Fixes https://github.com/netty/netty/issues/8220.
2018-08-27 08:58:40 +02:00
Norman Maurer
2bb9f64e16
Try to monkey-patch library id when shading is used and we are on Mac… (#8210)
* Try to monkey-patch library id when shading is used and we are on MacOS / OSX.

Motivation:

ea4c315b45 did ensure we support using multiple versions of the same shaded native library but the user still needed to run install_name_tool -id on MacOS to ensure the ID is unique.
This is kind of error prone and also means that the shading itself would need to be done on MacOS / OSX.

This is related to https://github.com/netty/netty/issues/7272.

Modifications:

- Monkey patch the shaded native lib on MacOS to ensure the id is unique while unpacking it to the tempory location.

Result:

Easier way of using shaded native libs in netty.
2018-08-23 11:07:09 +02:00
Norman Maurer
182ffdaf6d
Only use manual safepoint polling in PlatformDependent0.copyMemory(...) when using java <= 8 (#8124)
Motivation:

Java9 and later does the safepoint polling by itself so there is not need for us to do it.

Modifications:

Check for java version before doing manual safepoint polling.

Result:

Less custom code and less overhead when using java9 and later. Fixes https://github.com/netty/netty/issues/8122.
2018-08-18 21:09:18 +02:00
Ziyan Mo
785473788f (Nio|Epoll)EventLoop.pendingTasks does not need to dispatch to the EventLoop (#8197)
Motivation:

EventLoop.pendingTasks should be (reasonably) cheap to invoke so it can be used within observability. 

Modifications:

Remove code that dispatch access to the internal taskqueue to the EventLoop when invoked as this is not needed anymore with the current MPSC queues we are using. 

See https://github.com/netty/netty/issues/8196#issuecomment-413653286.

Result:

Fixes https://github.com/netty/netty/issues/8196
2018-08-18 07:28:31 +02:00
Norman Maurer
0dc71cee3a
DefaultPromise.getNow() does not correctly handle DefaultPromise.setUncancellable() (#8154)
Motivation:

We do not correctly check for previous calles of setUncancellable() in getNow() which may result in ClassCastException as we incorrectly return the internally UNCANCELLABLE object and not null if setUncancellable() we as called before.

Modifications:

Correctly check for UNCANCELLABLE and add unit test.

Result:

Fixes https://github.com/netty/netty/issues/8135.
2018-07-27 01:55:21 +08:00
Norman Maurer
8186c9aaea
Fix length calculation in AsciiString.indexOf(...) and so eliminate ArrayIndexOutOfBoundsException. (#8116)
Motivation:

We incorrectly calculated the length that was used for our for loop in AsciiString.indexOf(...). This lead to a possible ArrayIndexOutOfBoundsException.

Modifications:

- Not include the start in the length calculation
- Add unit test.

Result:

Fixes https://github.com/netty/netty/issues/8112.
2018-07-11 10:21:17 +01:00
Norman Maurer
93d2807ff0
Auto-detect Log4J2 for logging if on the class-path (#8109)
Motivation:

https://github.com/netty/netty/pull/5047 added Log4J2 support but missed to add code to try to auto-detect it.

Modifications:

Try to use Log4JLoggerFactory by default.

Result:

Fixes https://github.com/netty/netty/issues/8107.
2018-07-11 10:19:37 +01:00
Sebastian Utz
0920738932 Do not log explicit no unsafe, fixes helper method. (#8111)
Motivation:

Users should not see a scary log message when Netty is initialized if
Netty configuration explicitly disables unsafe. The log message that
produces this warning was previously guarded but by recent refactoring
a bug was introduced inside the guard helper method.

Modifications:

This commit brings back the guard against the scary log message if
unsafe is explicitly disabled.

Result:

No log message is produced when unsafe is unavailable because Netty was
told to not look for it.

Relates https://github.com/netty/netty/pull/5624, https://github.com/netty/netty/pull/6696
2018-07-09 15:57:35 -04:00
Norman Maurer
83710cb2e1
Replace toArray(new T[size]) with toArray(new T[0]) to eliminate zero-out and allow the VM to optimize. (#8075)
Motivation:

Using toArray(new T[0]) is usually the faster aproach these days. We should use it.

See also https://shipilev.net/blog/2016/arrays-wisdom-ancients/#_conclusion.

Modifications:

Replace toArray(new T[size]) with toArray(new T[0]).

Result:

Faster code.
2018-06-29 07:56:04 +02:00
Norman Maurer
5b1fe611a6
Remove usage of ObjectCleaner (#8064)
Motivation:

ObjectCleaner does start a Thread to handle the cleaning of resources which leaks into the users application. We should not use it in netty itself to make things more predictable.

Modifications:

- Remove usage of ObjectCleaner and use finalize as a replacement when possible.
- Clarify javadocs for FastThreadLocal.onRemoval(...) to ensure its clear that remove() is not guaranteed to be called when the Thread completees and so this method is not enough to guarantee cleanup for this case.

Result:

Fixes https://github.com/netty/netty/issues/8017.
2018-06-28 08:15:27 +02:00
nickhill
06f3574e46 Don't calculate max direct memory twice in PlatformDependent
Motivation:

I'm not sure if trivial changes like this are interesting :-) But I
noticed that the PlatformDependent.maxDirectMemory0() method is called
twice unnecessarily during static initialization (on the default path at
least).

Modifications:

Use constant MAX_DIRECT_MEMORY already set to the same value instead of
calling maxDirectMemory0() again.

Result:

A surely imperceivable reduction in operations performed at startup.
2018-06-26 13:58:54 +02:00
Roger
3e3e5155b9 Check if Log level is enabled before creating log statements (#8022)
Motivation

There is a cost to concatenating strings and calling methods that will be wasted if the Logger's level is not enabled.

Modifications

Check if Log level is enabled before producing log statement. These are just a few cases found by RegEx'ing in the code.

Result

Tiny bit more efficient code.
2018-06-13 23:21:53 -07:00
Norman Maurer
d133bf06a4
Allow to schedule tasks up to Long.MAX_VALUE (#7972)
Motivation:

We should allow to schedule tasks with a delay up to Long.MAX_VALUE as we did pre 4.1.25.Final.

Modifications:

Just ensure we not overflow and put the correct max limits in place when schedule a timer. At worse we will get a wakeup to early and then schedule a new timeout.

Result:

Fixes https://github.com/netty/netty/issues/7970.
2018-05-30 11:11:42 +02:00
Norman Maurer
8a85761500
Don't use VM.maxDirectMemory() on IBM J9 / Eclipse OpenJ9 to retrieve direct memory limit (#7966)
Motivation:

On J9 / OpenJ9 netty initializes this value with 64M, even the direct accessible memory is actually unbounded.

Modifications:

Skip usage of VM.maxDirectMemory() on J9 / OpenJ9

Result:

More correct direct memory limit detection. Fixes #7654.
2018-05-25 14:35:32 +02:00
Norman Maurer
c3d29f7b9e
Guard against calling malloc(0) when create ByteBuffer. (#7948)
Motivation:

We did not guard against the case of calling malloc(0) when creating a ByteBuffer without a Cleaner. The problem is that malloc(0) can have different behaviour, it either return a null-pointer or a valid pointer that you can pass to free.

The real problem arise if Unsafe.allocateMemory(0) returns 0 and we use it as the memoryAddress of the ByteBuffer. The problem here is that native libraries test for 0 and handle it as a null-ptr. This is for example true in SSL.bioSetByteBuffer(...) which would throw a NPE when 0 is used as memoryAddress and so produced errors during SSL usage.

Modifications:

- Always allocate 1 byte as minimum (even if we ask for an empty buffer).
- Add unit test.

Result:

No more errors possible because of malloc(0).
2018-05-17 06:55:48 +02:00
Xiaoyan Lin
a0ed6ec06c Fix the error message in ReferenceCounted.release (#7921)
Motivation:

When a buffer is over-released, the current error message of `IllegalReferenceCountException` is `refCnt: XXX, increment: XXX`, which is confusing. The correct message should be `refCnt: XXX, decrement: XXX`.

Modifications:

Pass `-decrement` to create `IllegalReferenceCountException`.

Result:

The error message will be `refCnt: XXX, decrement: XXX` when a buffer is over-released.
2018-05-08 20:09:16 +02:00
Norman Maurer
eaf1771336
Don't use VM.maxDirectMemory() on z/OS to retrieve direct memory limit. (#7886)
Motivation:

On z/OS netty initializes this value with 64M, even the direct accessible memory is actually unbounded.

Modifications:

Skip usage of VM.maxDirectMemory() on z/OS.

Result:

More correct direct memory limit detection. Fixes https://github.com/netty/netty/issues/7654.
2018-04-25 07:33:07 +02:00
Norman Maurer
b47fb81799
EventLoop.schedule with big delay fails (#7402)
Motivation:

Using a very huge delay when calling schedule(...) may cause an Selector error when calling select(...) later on. We should gaurd against such a big value.

Modifications:

- Add guard against a very huge value.
- Added tests.

Result:

Fixes [#7365]
2018-04-24 11:15:20 +02:00
Robin Wang
010dbe3c73 Optimize space usage of FastThreadLocal (#7861)
Motivation:
A FastThreadLocal instance currently occupies 2 slots of InternalThreadLocalMap of every thread. Actually for a FastThreadLocalThread, it does not need to store cleaner flags of FastThreadLocal instances. Besides, using BitSet to store cleaner flags is also better for space usage.

Modification:
Use BitSet to optimize space usage of FastThreadLocal.

Result:
Avoid unnecessary slot occupancy. Cleaner flags are now stored into a BitSet.
2018-04-24 09:19:01 +02:00
Norman Maurer
3ec29455af
Fix eternal loop in Recycler.WeakOrderQueue.Head#run() that blocks ObjectCleaner thread (#7878)
Motivation:

When trying to cleanup WeakOrderQueue by the ObjectCleaner we end up in an endless loop which will cause the ObjectCleaner to be not able to cleanup any other resources anymore.This bug was introduced by 6eb9674bf5.

Modifications:

Correctly update link while cleanup

Result:

Fixes https://github.com/netty/netty/issues/7877
2018-04-19 09:40:38 +02:00
Scott Mitchell
3f244a94fe
AsciiString#indexOf out of bounds (#7866)
Motivation:
The bounds checking for AsciiString#indexOf and AsciiString#lastIndexOf is not correct and may lead to ArrayIndexOutOfBoundsException.

Modifications:
- Correct the bounds checking for AsciiString#indexOf and AsciiString#lastIndexOf

Result
Fixes https://github.com/netty/netty/issues/7863
2018-04-12 12:06:12 -07:00
Gustavo Fernandes
76c5f6cd03 Enable per origin Cors configuration (#7800)
Motivation:

Finer granularity when configuring CorsHandler, enabling different policies for different origins.

Modifications:

The CorsHandler has an extra constructor that accepts a List<CorsConfig> that are evaluated sequentially when processing a Cors request

Result:

The changes don't break backwards compatibility. The extra ctor can be used to provide more than one CorsConfig object.
2018-04-11 10:06:13 +02:00
Dave Moten
0f4001d598 add task before starting thread in SingleThreadEventExecutor.execute (#7841)
Motivation:

Minor performance optimisation that prevents thread from blocking due to task not having been added to queue. Discussed #7815.

Modification:

add task to the queue before starting the thread.

Result:

No additional tests.
2018-04-05 07:57:21 +02:00
Scott Mitchell
602ee5444d NetUtil valid IP methods to accept CharSequence (#7827)
* NetUtil valid IP methods to accept CharSequence

Motivation:
NetUtil has methods to determine if a String is a valid IP address. These methods don't rely upon String specific methods and can use CharSequence instead.

Modifications:
- Use CharSequence instead of String for the IP validator methods.
- Avoid object allocation in AsciiString#indexOf(char,int) and reduce
byte code

Result:
No more copy operation required if a CharSequence exists.
2018-04-01 08:39:43 +02:00
Stephane Landelle
d60cd0231d HttpProxyHandler generates invalid CONNECT url and Host header when address is resolved
Motivation:

HttpProxyHandler uses `NetUtil#toSocketAddressString` to compute
CONNECT url and Host header.

The url is correct when the address is unresolved, as
`NetUtil#toSocketAddressString` will then use
`getHoststring`/`getHostname`. If the address is already resolved, the
url will be based on the IP instead of the hostname.

There’s an additional minor issue with the Host header: default port
443 should be omitted.

Modifications:

* Introduce NetUtil#getHostname
* Introduce HttpUtil#formatHostnameForHttp to format an
InetSocketAddress to
HTTP format
* Change url computation to favor hostname instead of IP
* Introduce HttpProxyHandler ignoreDefaultPortsInConnectHostHeader
parameter to ignore 80 and 443 ports in Host header

Result:

HttpProxyHandler performs properly when connecting to a resolved address
2018-03-27 09:43:11 +02:00
Norman Maurer
de082bf4c7 Correctly record creation stacktrace in ResourceLeakDetector.
Motivation:

We missed to correctly record the stacktrace of the creation of an ResourceLeak record. This could either have the effect to log the wrote stacktrace for creation or not log a stacktrace at all if the object was dropped on the floor after it was created.

Modifications:

Correctly create a Record on creation of the object.

Result:

Fixes https://github.com/netty/netty/issues/7781.
2018-03-16 08:24:18 +01:00
Norman Maurer
6eb9674bf5 Replace finalizer() usage in Recycler.WeakOrderQueue with ObjectCleaner usage.
Motivation:

We recently introduced ObjectCleaner which can be used to ensure some cleanup action is done once an object becomes weakable reachable. We should use this in Recycler.WeakOrderQueue to reduce the overhead of using a finalizer() (which will cause the GC to process it two times).

Modifications:

Replace finalizer() usage with ObjectCleaner

Result:

Fixes [#7343]
2018-03-09 18:45:02 -08:00
Norman Maurer
48df2f66b8 HashedWheelTimer.newTimeout(...) may overflow
Motivation:

We dont protect from overflow and so the timer may fire too early if a large timeout is used.

Modifications:

Add overflow guard and a test.

Result:

Fixes https://github.com/netty/netty/issues/7760.
2018-03-03 15:00:47 -08:00
Carl Mastrangelo
15560530d4 Propagate full Unsafe unavailability reason in PlatformDependent
Motivation:
It is not clear why Unsafe is unavailable when it is explicitly
disabled, or when Netty thinks it is running on Android.

Modification:
Change the "has" fields and methods to be causes.  A null cause
means Unsafe is present.  This catches all possible reason why
Unsafe might not be available.

Result:
Easier to debug Netty start up when logging cannot be turned on.
2018-02-16 10:43:06 -08:00
Shohei Kamimori
73f23c5faa Fix typos in docs.
Motivation:

There are same typos in the docs.

Modifications:

Fix typos. Docs only changing.

Result:

More correct docs.
2018-02-14 08:44:07 +01:00
Scott Mitchell
4f982be91e
DefaultPromise internal state dependent on Signal
Motivation:
DefaultPromise's internal state depends upon specific Signal objects. These Signal objects can be used externally which causes the DefaultPromise object API to not function correct and state to become corrupted.

Modifications:
- DefaultPromise shouldn't depend upon Signal for its internal state

Result:
Fixes https://github.com/netty/netty/issues/7707
2018-02-12 14:24:46 -08:00
Scott Mitchell
6cd5e8b0ca Reduce the default number of objects retained by the Recycler per thread
Motivation:
The Recycler currently retains 32k objects per thread by default. The Recycler is used in more than just one place and may result in large amounts of memory bloat if spikes of traffic are observed.

Modifications:
- Reduce the Recyclers default capacity from 32k to 4k.

Result:
- Lower default capacity of the Recycler and less memory retained.
2018-02-09 19:56:01 +01:00
Johno Crawford
01e46ed03a System property util might return null
Motivation:

isAndroid0 should be robust.

Modifications:

yoda equals for string comparison.

Result:

No NPE.
2018-02-09 19:25:30 +01:00
Carl Mastrangelo
4ed961f4fe To detect Android, check the VM property rather than the classpath
Motivation:
Some java binaries include android classes on their classpath, even
if they aren't actually android.  When this is true, `Unsafe` no
longer works, disabling the Epoll functionality.  A sample case is
for binaries that use the j2objc library.

Modifications:
Check the `java.vm.name` instead of the classpath.   Numerous
Google-internal Android libraries / binaries check this property
rather than the class path.

It is believed this is safe and works with bother ART and Dalvik
VMs, safe for Robolectric, and j2objc.

Results:
Unusually built java server binaries can still use Netty Epoll.
2018-02-09 15:43:25 +01:00
koji lin
1e70d3092d Avoid register multiple cleaner task for same thread's FastThreadLocal index
Motivation:

Currently if user call set/remove/set/remove many times, it will create multiple cleaner task for same index. It may cause OOM due to long live thread will have more and more task in LIVE_SET.

Modification:

Add flag to avoid recreating tasks.

Result:
Only create 1 clean task. But use more space of indexedVariables.
2018-02-05 09:05:51 +01:00
Norman Maurer
e72c197aa3 Reflective setAccessible(true) will produce scary warnings on the console when using java9+, dont do it
Motivation:

Reflective setAccessible(true) will produce scary warnings on the console when using java9+, while netty still works. That said users may feel uncomfortable with these warnings, we should not try to do it by default when using java9+.

Modifications:

Add io.netty.tryReflectionSetAccessible  system property which controls if setAccessible(...) will be used. By default it will bet set to false when using java9+.

Result:

Fixes [#7254].
2018-01-30 12:18:34 +01:00
Jason
9dd5c928f3 Add java-doc for implemented methods of io.netty.util.concurrent.Future#cancel(boolean mayInterruptIfRunning)
Motivation:

The methods implement io.netty.util.concurrent.Future#cancel(boolean mayInterruptIfRunning) which actually ignored the param mayInterruptIfRunning.We need to add comments for the `mayInterruptIfRunning` param.

Modifications:

Add comments for the `mayInterruptIfRunning` param.

Result:

People who call the `cancel` method will be more clear about the effect of `mayInterruptIfRunning` param.
2018-01-29 11:19:52 +01:00
Norman Maurer
1879433ae6 ObjectCleanerThread must be a deamon thread to ensure the JVM can always terminate.
Motivation:

The ObjectCleanerThread must be a daemon thread as otherwise we may block the JVM from exit. By using a daemon thread we basically give the same garantees as the JVM when it comes to cleanup of resources (as the GC threads are also daemon threads and the CleanerImpl uses a deamon thread as well in Java9+).

Modifications:

Change ObjectCleanThread to be a daemon thread.

Result:

JVM shutdown will always be able to complete. Fixed [#7617].
2018-01-26 08:25:42 +01:00
jaymode
f0c76cacc3 Replace reflective access of Throwable#addSuppressed with version guarded access
Motivation:

In environments with a security manager, the reflective access to get the reference to
Throwable#addSuppressed can cause issues that result in Netty failing to load. The main
motivation in this pull request is to remove the use of reflection to prevent issues in
these environments.

Modifications:

ThrowableUtil no longer uses Class#getDeclaredMembers to get the Method that references
Throwable#addSuppressed and instead guards the call to Throwable#addSuppressed with a
Java version check.

Additionally, a annotation was added that suppresses the animal sniffer java16 signature
check on the given method. The benefit of the annotation is that it limits the exclusion
of Throwable to just the ThrowableUtil class and has string text indicating the reason
for suppressing the java16 signature check.

Result:

Netty no longer requires the use of Class#getDeclaredMethod for ThrowableUtil and will
work in environments restricted by a security manager without needing to grant reflection
permissions.

Fixes #7614
2018-01-25 19:56:17 +01:00
jaymode
c0e84070b0 Set thread context classloader in a doPrivileged block
Motivation:

In a few classes, Netty starts a thread and then sets the context classloader of these threads
to prevent classloader leaks. The Thread#setContextClassLoader method is a privileged method in
that it requires permissions to be executed when there is a security manager in place. Unless
these calls are wrapped in a doPrivileged block, they will fail in an environment with a security
manager and restrictive policy in place.

Modifications:

Wrap the calls to Thread#setContextClassLoader in a AccessController#doPrivileged block.

Result:

After this change, the threads can set the context classloader without any errors in an
environment with a security manager and restrictive policy in place.
2018-01-25 10:55:34 +01:00
Scott Mitchell
4921f62c8a
HttpResponseStatus object allocation reduction
Motivation:
Usages of HttpResponseStatus may result in more object allocation then necessary due to not looking for cached objects and the AsciiString parsing method not being used due to CharSequence method being used instead.

Modifications:
- HttpResponseDecoder should attempt to get the HttpResponseStatus from cache instead of allocating a new object
- HttpResponseStatus#parseLine(CharSequence) should check if the type is AsciiString and redirect to the AsciiString parsing method which may not require an additional toString call
- HttpResponseStatus#parseLine(AsciiString) can be optimized and doesn't require and may not require object allocation

Result:
Less allocations when dealing with HttpResponseStatus.
2018-01-24 22:01:52 -08:00
Scott Mitchell
031bad60dc ObjectCleaner should continue cleaning despite exceptions
Motivation:
ObjectCleaner inovkes a Runnable which may execute user code (FastThreadLocal#onRemoval) and therefore exceptions maybe thrown. If an exception is thrown the cleanup thread will exit prematurely and we may never finish cleaning up which will result in leaks.

Modifications:
- ObjectCleaner should suppress exceptions and continue cleaning

Result:
ObjectCleaner will reliably clean despite exceptions being thrown.
2018-01-19 20:09:20 +01:00
Scott Mitchell
f72f162e16 ObjectCleaner may indefinitely block on ReferenceQueue#poll
Motivation:
ObjectCleaner polls a ReferenceQueue which will block indefinitely. However it is possible there is a race condition between the live set of objects being empty due to the WeakReference being cleaned/cleared and polling the queue. If this situation occurs the cleanup thread may never unblock if no more objects are added to the live set, and may result in an application's failure to gracefully close.

Modifications:
- ReferenceQueue.remove should use a timeout to compensate for the race condition, and avoid dead lock

Result:
No more dead lock in ObjectCleaner when polling the ReferenceQueue.
2018-01-19 18:51:56 +01:00
Scott Mitchell
ea73e47a8b
FastThreadLocal#set remove duplicate isIndexedVariableSet call
Motivation:
FastThreadLocal#set calls isIndexedVariableSet to determine if we need to register with the cleaner, but the set(InternalThreadLocalMap, V) method will also internally do this check so we can share code and only do the check a single time.

Modifications:
- extract code from set(InternalThreadLocalMap, V) so it can be called externally to determine if a new item was created

Result:
Less code duplication in FastThreadLocal#set.
2017-12-22 09:41:57 -08:00
Norman Maurer
e004b4a354 Ensure ObjectCleaner will also be used when FastThreadLocal.set is used.
Motivation:

e329ca1 introduced the user of ObjectCleaner in FastThreadLocal but we missed the case to register our cleaner task if FastThreadLocal.set was called only.

Modifications:

- Use ObjectCleaner also when FastThreadLocal.set is used.
- Add test case.

Result:

ObjectCleaner is always used.
2017-12-22 07:11:22 +01:00
Nikolay Fedorovskikh
c9668ce40f The constants calculation in compile-time
Motivation:
Allow pre-computing calculation of the constants for compiler where it could be.
Similar fix in OpenJDK: [1].

Modifications:
- Use parentheses.
- Simplify static initialization of `BYTE2HEX_*` arrays in `StringUtil`.

Result:
Less bytecode, possible faster calculations at runtime.

[1] https://bugs.openjdk.java.net/browse/JDK-4477961
2017-12-21 07:41:38 +01:00
Norman Maurer
e329ca1cf3 Introduce ObjectCleaner and use it in FastThreadLocal to ensure FastThreadLocal.onRemoval(...) is called
Motivation:

There is no guarantee that FastThreadLocal.onRemoval(...) is called if the FastThreadLocal is used by "non" FastThreacLocalThreads. This can lead to all sort of problems, like for example memory leaks as direct memory is not correctly cleaned up etc.

Beside this we use ThreadDeathWatcher to check if we need to release buffers back to the pool when thread local caches are collected. In the past ThreadDeathWatcher was used which will need to "wakeup" every second to check if the registered Threads are still alive. If we can ensure FastThreadLocal.onRemoval(...) is called we do not need this anymore.

Modifications:

- Introduce ObjectCleaner and use it to ensure FastThreadLocal.onRemoval(...) is always called when a Thread is collected.
- Deprecate ThreadDeathWatcher
- Add unit tests.

Result:

Consistent way of cleanup FastThreadLocals when a Thread is collected.
2017-12-21 07:34:44 +01:00
Norman Maurer
640a22df9e Remove WeakOrderedQueue from WeakHashMap when FastThreadLocal value was removed if possible.
Motivation:

We should remove the WeakOrderedQueue from the WeakHashMap directly if possible and only depend on the semantics of the WeakHashMap if there is no other way for us to cleanup it.

Modifications:

Override onRemoval(...) to remove the WeakOrderedQueue if possible.

Result:

Less overhead and quicker collection of WeakOrderedQueue for some cases.
2017-12-15 21:21:18 +01:00
Norman Maurer
5ad35a157c SingleThreadEventExecutor ignores startThread failures
Motivation:

When doStartThread throws an exception, e.g. due to the actual executor being depleted of threads and throwing in its rejected execution handler, the STEE ends up in started state anyway. If we try to execute another task in this executor, it will be queued but the thread won't be started anymore and the task will linger forever.

Modifications:

- Ensure we not update the internal state if the startThread() method throws.
- Add testcase

Result:

Fixes [#7483]
2017-12-14 21:38:37 +00:00
Norman Maurer
0276b6e0f6 Ensure Thread can be collected in a timely manner if Recycler.Stack holds a reference to it.
Motivation:

In our Recycler implementation we store a reference to the current Thread in the Stack that is stored in a FastThreadLocal. The Stack itself is referenced in the DefaultHandle itself. A problem can arise if a user stores a Reference to an Object that holds a reference to the DefaultHandle somewhere and either not remove the reference at all or remove it very late. In this case the Thread itself can not be collected as its still referenced in the Stack that is referenced by the DefaultHandle.

Modifications:

- Use a WeakReference to store the reference to the Thread in the Stack
- Add a test case

Result:

Ensure a Thread can be collected in a timely manner in all cases even if it used the Recycler.
2017-12-14 06:44:47 +01:00
Norman Maurer
63bae0956a Ensure ThreadDeathWatcher and GlobalEventExecutor will not cause classloader leaks.
Motivation:

ThreadDeathWatcher and GlobalEventExecutor may create and start a new thread from various other threads and so inherit the classloader. We need to ensure we not inherit to allow recycling the classloader.

Modifications:

Use Thread.setContextClassLoader(null) to ensure we not hold a strong reference to the classloader and so not leak it.

Result:

Fixes [#7290].
2017-12-12 09:06:54 +01:00
Norman Maurer
f2b1d95164 Fix javadocs for ObjectUtil methods.
Motivation:

The javadocs for a few methds in ObjectUtil are not correct.

Modifications:

Add "not" where it was missing.

Result:

Fixes [#7455].
2017-12-06 20:51:30 +01:00
Norman Maurer
09a05b680d Dont use ThreadDeathWatcher to cleanup PoolThreadCache if FastThreadLocalThread with wrapped Runnable is used
Motivation:

We dont need to use the ThreadDeathWatcher if we use a FastThreadLocalThread for which we wrap the Runnable and ensure we call FastThreadLocal.removeAll() once the Runnable completes.

Modifications:

- Dont use a ThreadDeathWatcher if we are sure we will call FastThreadLocal.removeAll()
- Add unit test.

Result:

Less overhead / running theads if you only allocate / deallocate from FastThreadLocalThreads.
2017-11-28 13:43:28 +01:00
Norman Maurer
65cacc9b15 Guard against NoClassDefFoundError when trying to load Unsafe.
Motivation:

OSGI and other enviroments may not allow to even load Unsafe which will lead to an NoClassDefFoundError when trying to access it. We should guard against this.

Modifications:

Catch NoClassDefFoundError when trying to load Unsafe.

Result:

Be able to use netty with a strict OSGI config.
2017-11-24 20:06:30 +01:00
Soner Kaya
f9cadc0a8c When System property is empty use def value.
Motivation:

When system property is empty, the default value should be used.

Modification:

- Correctly use the default value in all cases
- Add unit tests

Result:

Correct behaviour
2017-11-23 19:45:37 +01:00
Scott Mitchell
0a47c590fe HttpHeaders valuesIterator and contains improvements
Motivation:
In order to determine if a header contains a value we currently rely
upon getAll(..) and regular expressions. This operation is commonly used
during the encode and decode stage to determine the transfer encoding
(e.g. HttpUtil#isTransferEncodingChunked). This operation requires an
intermediate collection and possibly regular expressions for the
CombinedHttpHeaders use case which can be expensive.

Modifications:
- Add a valuesIterator to HttpHeaders and specializations of this method
for DefaultHttpHeaders, ReadOnlyHttpHeaders, and CombinedHttpHeaders.

Result:
Less intermediate collections and allocation overhead when determining
if HttpHeaders contains a name/value pair.
2017-11-20 08:34:06 -08:00
Moses Nakamura
d976dc108d codec-http2: Improve h1 to h2 header conversion
Motivation:

Netty could handle "connection" or "te" headers more gently when
converting from http/1.1 to http/2 headers.  Http/2 headers don't
support single-hop headers, so when we convert from http/1.1 to http/2,
we should drop all single-hop headers.  This includes headers like
"transfer-encoding" and "connection", but also the headers that
"connection" points to, since "connection" can be used to designate
other headers as single-hop headers.  For the "te" header, we can more
permissively convert it by just dropping non-conforming headers (ie
non-"trailers" headers) which is what we do for all other headers when
we convert.

Modifications:

Add a new blacklist to the http/1.1 to http/2 conversion, which is
constructed from the values of the "connection" header, and stop
throwing an exception when a "te" header is passed with a non-"trailers"
value.  Instead, drop all values except for "trailers".  Add unit tests
for "connection" and "te" headers when converting from http/1.1 to http/2.

Result:

This will improve the h2c upgrade request, and also conversions from
http/1.1 to http/2.  This will simplify implementing spec-compliant
http/2 servers that want to share code between their http/1.1 and http/2
implementations.

[Fixes #7355]
2017-11-17 09:09:52 +01:00
Anuraag Agrawal
1f1a60ae7d Use Netty's DefaultPriorityQueue instead of JDK's PriorityQueue for scheduled tasks
Motivation:

`AbstractScheduledEventExecutor` uses a standard `java.util.PriorityQueue` to keep track of task deadlines. `ScheduledFuture.cancel` removes tasks from this `PriorityQueue`. Unfortunately, `PriorityQueue.remove` has `O(n)` performance since it must search for the item in the entire queue before removing it. This is fast when the future is at the front of the queue (e.g., already triggered) but not when it's randomly located in the queue.

Many servers will use `ScheduledFuture.cancel` on all requests, e.g., to manage a request timeout. As these cancellations will be happen in arbitrary order, when there are many scheduled futures, `PriorityQueue.remove` is a bottleneck and greatly hurts performance with many concurrent requests (>10K).

Modification:

Use netty's `DefaultPriorityQueue` for scheduling futures instead of the JDK. `DefaultPriorityQueue` is almost identical to the JDK version except it is able to remove futures without searching for them in the queue. This means `DefaultPriorityQueue.remove` has `O(log n)` performance.

Result:

Before - cancelling futures has varying performance, capped at `O(n)`
After - cancelling futures has stable performance, capped at `O(log n)`

Benchmark results

After - cancelling in order and in reverse order have similar performance within `O(log n)` bounds
```
Benchmark                                           (num)   Mode  Cnt       Score      Error  Units
ScheduledFutureTaskBenchmark.cancelInOrder            100  thrpt   20  137779.616 ± 7709.751  ops/s
ScheduledFutureTaskBenchmark.cancelInOrder           1000  thrpt   20   11049.448 ±  385.832  ops/s
ScheduledFutureTaskBenchmark.cancelInOrder          10000  thrpt   20     943.294 ±   12.391  ops/s
ScheduledFutureTaskBenchmark.cancelInOrder         100000  thrpt   20      64.210 ±    1.824  ops/s
ScheduledFutureTaskBenchmark.cancelInReverseOrder     100  thrpt   20  167531.096 ± 9187.865  ops/s
ScheduledFutureTaskBenchmark.cancelInReverseOrder    1000  thrpt   20   33019.786 ± 4737.770  ops/s
ScheduledFutureTaskBenchmark.cancelInReverseOrder   10000  thrpt   20    2976.955 ±  248.555  ops/s
ScheduledFutureTaskBenchmark.cancelInReverseOrder  100000  thrpt   20     362.654 ±   45.716  ops/s
```

Before - cancelling in order and in reverse order have significantly different performance at higher queue size, orders of magnitude worse than the new implementation.
```
Benchmark                                           (num)   Mode  Cnt       Score       Error  Units
ScheduledFutureTaskBenchmark.cancelInOrder            100  thrpt   20  139968.586 ± 12951.333  ops/s
ScheduledFutureTaskBenchmark.cancelInOrder           1000  thrpt   20   12274.420 ±   337.800  ops/s
ScheduledFutureTaskBenchmark.cancelInOrder          10000  thrpt   20     958.168 ±    15.350  ops/s
ScheduledFutureTaskBenchmark.cancelInOrder         100000  thrpt   20      53.381 ±    13.981  ops/s
ScheduledFutureTaskBenchmark.cancelInReverseOrder     100  thrpt   20  123918.829 ±  3642.517  ops/s
ScheduledFutureTaskBenchmark.cancelInReverseOrder    1000  thrpt   20    5099.810 ±   206.992  ops/s
ScheduledFutureTaskBenchmark.cancelInReverseOrder   10000  thrpt   20      72.335 ±     0.443  ops/s
ScheduledFutureTaskBenchmark.cancelInReverseOrder  100000  thrpt   20       0.743 ±     0.003  ops/s
```
2017-11-10 23:09:32 -08:00
Carl Mastrangelo
ef5ebb40c9 Keep all leak records up to the target amount
Motivation:
When looking for a leak, its nice to be able to request at least a
number of leaks.

Modification:

* Made all leak records up to the target amoutn recorded, and only
  then enable backing off.
* Enable recording more than 32 elements.  Previously the shift
  amount made this impossible.

Result:
Ability to record all accesses.
2017-11-07 19:09:14 -08:00
Trask Stalnaker
58e74e9fee Support running Netty in bootstrap class loader
Motivation:

Fix NullPointerExceptions that occur when running netty-tcnative inside the bootstrap class loader.

Modifications:

- Replace loader.getResource(...) with ClassLoader.getSystemResource(...) when loader is null.
- Replace loader.loadClass(...) with Class.forName(..., false, loader) which works when loader is both null and non-null.

Result:

Support running native libs in bootstrap class loader
2017-10-29 13:13:19 +01:00
Carl Mastrangelo
e62e6df4ac Use WeakReferences for Resource Leaks
Motivation:
Phantom references are for cleaning up resources that were
forgotten, which means they keep their referent alive.   This
means garbage is kept around until the refqueue is drained, rather
than when the reference is unreachable.

Modification:
Use Weak References instead of Phantoms

Result:
More punctual leak detection.
2017-10-24 19:21:33 +02:00
Norman Maurer
740c68faed Add supresswarnings to cleanup 16b1dbdf92.
Motivation:

We should add @SupressWarnings

Modifications:

Add annotations.

Result:

Less warnings
2017-10-22 18:16:46 +02:00
Idel Pivnitskiy
4793daa589 Make Comparators Serializable
Motivation:

Objects of java.util.TreeMap or java.util.TreeSet will become
non-Serializable if instantiated with Comparators, which are not also
 Serializable. This can result in unexpected and difficult-to-diagnose
 bugs.

Modifications:

Implements Serializable for all classes, which implements Comparator.

Result:

Proper Comparators which will not force collections to
non-Serializable mode.
2017-10-22 03:40:28 +02:00
Idel Pivnitskiy
50a067a8f7 Make methods 'static' where it possible
Motivation:

Even if it's a super micro-optimization (most JVM could optimize such
 cases in runtime), in theory (and according to some perf tests) it
 may help a bit. It also makes a code more clear and allows you to
 access such methods in the test scope directly, without instance of
 the class.

Modifications:

Add 'static' modifier for all methods, where it possible. Mostly in
test scope.

Result:

Cleaner code with proper 'static' modifiers.
2017-10-21 14:59:26 +02:00
Idel Pivnitskiy
558097449c Add missed 'serialVersionUID' field for Serializable classes
Motivation:

Without a 'serialVersionUID' field, any change to a class will make
previously serialized versions unreadable.

Modifications:

Add missed 'serialVersionUID' field for all Serializable
classes.

Result:

Proper deserialization of previously serialized objects.
2017-10-21 14:41:18 +02:00
Carl Mastrangelo
16b1dbdf92 Motivation: Resource Leak Detector (RLD) tries to helpfully indicate where an object was last accessed and report the accesses in the case the object was not cleaned up. It handles lightly used objects well, but drops all but the last few accesses.
Configuring this is tough because there is split between highly shared (and accessed) objects and lightly accessed objects.

Modification:
There are a number of changes here.  In relative order of importance:

API / Functionality changes:
* Max records and max sample records are gone.  Only "target" records, the number of records tries to retain is exposed.
* Records are sampled based on the number of already stored records.  The likelihood of recording a new sample is `2^(-n)`, where `n` is the number of currently stored elements.
* Records are stored in a concurrent stack structure rather than a list.  This avoids a head and tail.  Since the stack is only read once, there is no need to maintain head and tail pointers
* The properties of this imply that the very first and very last access are always recorded.  When deciding to sample, the top element is replaced rather than pushed.
* Samples that happen between the first and last accesses now have a chance of being recorded.  Previously only the final few were kept.
* Sampling is no longer deterministic.  Previously, a deterministic access pattern meant that you could conceivably always miss some access points.
* Sampling has a linear ramp for low values and and exponentially backs off roughly equal to 2^n.  This means that for 1,000,000 accesses, about 20 will actually be kept.  I have an elegant proof for this which is too large to fit in this commit message.

Code changes:
* All locks are gone.  Because sampling rarely needs to do a write, there is almost 0 contention.  The dropped records counter is slightly contentious, but this could be removed or changed to a LongAdder.  This was not done because of memory concerns.
* Stack trace exclusion is done outside of RLD.  Classes can opt to remove some of their methods.
* Stack trace exclusion is faster, since it uses String.equals, often getting a pointer compare due to interning.  Previously it used contains()
* Leak printing is outputted fairly differently.  I tried to preserve as much of the original formatting as possible, but some things didn't make sense to keep.

Result:
More useful leak reporting.

Faster:
```
Before:
Benchmark                                           (recordTimes)   Mode  Cnt       Score      Error  Units
ResourceLeakDetectorRecordBenchmark.record                      8  thrpt   20  136293.404 ± 7669.454  ops/s
ResourceLeakDetectorRecordBenchmark.record                     16  thrpt   20   72805.720 ± 3710.864  ops/s
ResourceLeakDetectorRecordBenchmark.recordWithHint              8  thrpt   20  139131.215 ± 4882.751  ops/s
ResourceLeakDetectorRecordBenchmark.recordWithHint             16  thrpt   20   74146.313 ± 4999.246  ops/s

After:
Benchmark                                           (recordTimes)   Mode  Cnt       Score      Error  Units
ResourceLeakDetectorRecordBenchmark.record                      8  thrpt   20  155281.969 ± 5301.399  ops/s
ResourceLeakDetectorRecordBenchmark.record                     16  thrpt   20   77866.239 ± 3821.054  ops/s
ResourceLeakDetectorRecordBenchmark.recordWithHint              8  thrpt   20  153360.036 ± 8611.353  ops/s
ResourceLeakDetectorRecordBenchmark.recordWithHint             16  thrpt   20   78670.804 ± 2399.149  ops/s
```
2017-10-19 12:21:21 -07:00
Carl Mastrangelo
d3ca087f6b Propagate all exceptions when loading native code
Motivation:
There are 2 motivations, the first depends on the second:

Loading Netty Epoll statically stopped working in 4.1.16, due to
`Native` always loading the arch specific shared object.  In a
static binary, there is no arch specific SO.

Second, there are a ton of exceptions that can happen when loading
a native library.  When loading native code, Netty tries a bunch of
different paths but a failure in any given may not be fatal.

Additionally: turning on debug logging is not always feasible so
exceptions get silently swallowed.

Modifications:

* Change Epoll and Kqueue to try the static load second
* Modify NativeLibraryLoader to record all the locations where
  exceptions occur.
* Attempt to use `addSuppressed` from Java 7 if available.

Alternatives Considered:

An alternative would be to record log messages at each failure.  If
all load attempts fail, the log messages are printed as warning,
else as debug. The problem with this is there is no `LogRecord` to
create like in java.util.logging.  Buffering the args to
logger.log() at the end of the method loses the call site, and
changes the order of events to be confusing.

Another alternative is to teach NativeLibraryLoader about loading
the SO first, and then the static version.  This would consolidate
the code fore Epoll, Kqueue, and TCNative.   I think this is the
long term better option, but this PR is changing a lot already.
Someone else can take a crack at it later

Results:
Epoll Still Loads and easier debugging.
2017-10-04 08:45:27 +02:00
Carl Mastrangelo
83a19d5650 Optimistically update ref counts
Motivation:
Highly retained and released objects have contention on their ref
count.  Currently, the ref count is updated using compareAndSet
with care to make sure the count doesn't overflow, double free, or
revive the object.

Profiling has shown that a non trivial (~1%) of CPU time on gRPC
latency benchmarks is from the ref count updating.

Modification:
Rather than pessimistically assuming the ref count will be invalid,
optimistically update it assuming it will be.  If the update was
wrong, then use the slow path to revert the change and throw an
execption.  Most of the time, the ref counts are correct.

This changes from using compareAndSet to getAndAdd, which emits a
different CPU instruction on x86 (CMPXCHG to XADD).  Because the
CPU knows it will modifiy the memory, it can avoid contention.

On a highly contended machine, this can be about 2x faster.

There is a downside to the new approach.  The ref counters can
temporarily enter invalid states if over retained or over released.
The code does handle these overflow and underflow scenarios, but it
is possible that another concurrent access may push the failure to
a different location.  For example:

Time 1 Thread 1: obj.retain(INT_MAX - 1)
Time 2 Thread 1: obj.retain(2)
Time 2 Thread 2: obj.retain(1)

Previously Thread 2 would always succeed and Thread 1 would always
fail on the second access.  Now, thread 2 could fail while thread 1
is rolling back its change.

====

There are a few reasons why I think this is okay:

1. Buggy code is going to have bugs.  An exception _is_ going to be
   thrown.  This just causes the other threads to notice the state
   is messed up and stop early.
2. If high retention counts are a use case, then ref count should
   be a long rather than an int.
3. The critical section is greatly reduced compared to the previous
   version, so the likelihood of this happening is lower
4. On error, the code always rollsback the change atomically, so
   there is no possibility of corruption.

Result:
Faster refcounting

```
BEFORE:

Benchmark                                                                                             (delay)    Mode      Cnt         Score    Error  Units
AbstractReferenceCountedByteBufBenchmark.retainRelease_contended                                            1  sample  2901361       804.579 ±  1.835  ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_contended                                           10  sample  3038729       785.376 ± 16.471  ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_contended                                          100  sample  2899401       817.392 ±  6.668  ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_contended                                         1000  sample  3650566      2077.700 ±  0.600  ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_contended                                        10000  sample  3005467     19949.334 ±  4.243  ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_uncontended                                          1  sample   456091        48.610 ±  1.162  ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_uncontended                                         10  sample   732051        62.599 ±  0.815  ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_uncontended                                        100  sample   778925       228.629 ±  1.205  ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_uncontended                                       1000  sample   633682      2002.987 ±  2.856  ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_uncontended                                      10000  sample   506442     19735.345 ± 12.312  ns/op

AFTER:
Benchmark                                                                                             (delay)    Mode      Cnt         Score    Error  Units
AbstractReferenceCountedByteBufBenchmark.retainRelease_contended                                            1  sample  3761980       383.436 ±  1.315  ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_contended                                           10  sample  3667304       474.429 ±  1.101  ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_contended                                          100  sample  3039374       479.267 ±  0.435  ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_contended                                         1000  sample  3709210      2044.603 ±  0.989  ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_contended                                        10000  sample  3011591     19904.227 ± 18.025  ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_uncontended                                          1  sample   494975        52.269 ±  8.345  ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_uncontended                                         10  sample   771094        62.290 ±  0.795  ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_uncontended                                        100  sample   763230       235.044 ±  1.552  ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_uncontended                                       1000  sample   634037      2006.578 ±  3.574  ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_uncontended                                      10000  sample   506284     19742.605 ± 13.729  ns/op

```
2017-10-04 08:42:33 +02:00
Norman Maurer
c3298a3836 Fix regression in reporting leaks introduced by 3c8c7fc7e9.
Motivation:

3c8c7fc7e9 introduced some changes to the ResourceLeakDetector that introduced a regression and so would always log that paranoid leak detection should be enabled even it was already.

Modifications:

Correctly not clear the recorded stacktraces when we process the reference queue so we can log these.

Result:

ResourceLeakDetector works again as expected.
2017-09-21 12:17:43 -07:00