949 Commits

Author SHA1 Message Date
Norman Maurer
6c061abc49
Hide Recycler implementation to allow experimenting with different implementations of an Object pool (#9715)
Motivation:

At the moment we directly extend the Recycler base class in our code, which makes it hard to experiment with different Object pool implementations. It would be nice to be able to switch from one to another by using a system property in the future. This would also make it easier to test things like https://github.com/netty/netty/pull/8052.

Modifications:

- Introduce ObjectPool class with static method that we now use internally to obtain an ObjectPool implementation.
- Wrap the Recycler into an ObjectPool and return it for now

Result:

Preparation for different ObjectPool implementations
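
A minimal sketch of the indirection described above. All names here (`SketchPool`, `DequeBackedPool`) are hypothetical and are not Netty's actual `ObjectPool` API; the deque-backed pool only stands in for the wrapped Recycler.

```java
import java.util.ArrayDeque;
import java.util.function.Supplier;

// Hypothetical sketch: callers depend only on this abstraction, so the
// backing implementation can be swapped without touching call sites.
abstract class SketchPool<T> {
    abstract T get();

    abstract void recycle(T obj);

    // Static factory: the single place that decides which implementation is used.
    static <T> SketchPool<T> newPool(Supplier<T> creator) {
        return new DequeBackedPool<T>(creator);
    }

    // Stand-in implementation; not thread-safe, for illustration only.
    private static final class DequeBackedPool<T> extends SketchPool<T> {
        private final ArrayDeque<T> free = new ArrayDeque<T>();
        private final Supplier<T> creator;

        DequeBackedPool(Supplier<T> creator) {
            this.creator = creator;
        }

        @Override
        T get() {
            T obj = free.pollFirst();
            return obj != null ? obj : creator.get();
        }

        @Override
        void recycle(T obj) {
            free.addFirst(obj);
        }
    }
}
```

Because only `newPool(...)` names a concrete class, a different pool could later be selected there (for example based on a system property) without changing callers.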
2019-10-26 00:34:30 -07:00
Norman Maurer
efce8e5363
Guard against busy spinning in HashedWheelTimer when using windows and a tickDuration of 1 (#9714)
Motivation:

We do not correctly guard against the fact that, when applying our workaround for Windows, we may end up with a sleep period of 0. In this case we should just sleep for 1 ms.

Modifications:

Guard against the case where our calculation produces 0 as the sleep time on Windows

Result:

Fixes https://github.com/netty/netty/issues/9710.
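
The guard can be sketched roughly as follows. The class and method names are hypothetical, and the rounding down to a multiple of 10 ms is an assumption about what the Windows workaround looks like; only the "never sleep for 0 ms" rule comes from the message above.

```java
// Hypothetical sketch of the sleep-time calculation; not the actual HashedWheelTimer code.
final class SleepGuard {
    private SleepGuard() { }

    static long sleepMillis(long remainingNanos, boolean windows) {
        // Round the remaining time up to whole milliseconds.
        long sleepTimeMs = (remainingNanos + 999_999) / 1_000_000;
        if (windows) {
            // Assumed workaround: round down to a multiple of 10 ms.
            sleepTimeMs = sleepTimeMs / 10 * 10;
            if (sleepTimeMs == 0) {
                // Rounding produced 0; sleep 1 ms instead of busy spinning.
                sleepTimeMs = 1;
            }
        }
        return sleepTimeMs;
    }
}
```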
2019-10-25 11:14:06 -07:00
Sergei Egorov
f4b536edcb Add BlockHound integration that detects blocking calls in event loops (#9687)
Motivation:

Netty is an asynchronous framework.
If somebody uses a blocking call inside Netty's event loops,
it may lead to a severe performance degradation.
BlockHound is a tool that helps detect such calls.

Modifications:

This change adds a BlockHound's SPI integration that marks
threads created by Netty (`FastThreadLocalThread`s) as non-blocking.
It also marks some of Netty's internal methods as whitelisted
as they are required to run the event loops.

Result:

When BlockHound is installed, any blocking call inside event loops
is intercepted and reported (by default an error will be thrown).
2019-10-25 06:14:14 -07:00
Nick Hill
19b4adf79c Avoid wrapping scheduled Runnables in Callable adapter (#9666)
Motivation

Currently when future tasks are scheduled via schedule(Runnable, ...)
methods, the supplied Runnable is wrapped in a newly allocated Callable
adapter prior to being wrapped in a ScheduledFutureTask.

This can be avoided which saves an object allocation per scheduled task.

Modifications

Change the Callable task field of ScheduledFutureTask to be of type
Object so that it can hold/run Runnables directly in addition to
Callables.

An "adapter" is still used in the case a Runnable is scheduled with an
explicit constant non-null completion value, assumed to be rare.

Result

Less garbage
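
A rough illustration of holding either task type in a single `Object` field. `SketchFutureTask` and its factory methods are hypothetical names, not the actual `ScheduledFutureTask` code, and checked exceptions are wrapped only to keep the sketch short.

```java
import java.util.concurrent.Callable;

// Hypothetical sketch: one Object field holds a Runnable or a Callable,
// so scheduling a Runnable needs no extra adapter allocation.
final class SketchFutureTask<V> {
    private final Object task; // either a Runnable or a Callable<V>

    private SketchFutureTask(Object task) {
        this.task = task;
    }

    static <V> SketchFutureTask<V> ofRunnable(Runnable runnable) {
        return new SketchFutureTask<V>(runnable);
    }

    static <V> SketchFutureTask<V> ofCallable(Callable<V> callable) {
        return new SketchFutureTask<V>(callable);
    }

    @SuppressWarnings("unchecked")
    V runTask() {
        try {
            if (task instanceof Runnable) {
                // Runnables are run directly and complete with null.
                ((Runnable) task).run();
                return null;
            }
            return ((Callable<V>) task).call();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```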
2019-10-17 07:01:52 -07:00
Norman Maurer
6c05d16967
Use @SuppressJava6Requirement for animal sniffer plugin to ensure we always guard correctly (#9655)
Motivation:

We can use the `@SuppressJava6Requirement` annotation to be more precise about when we use Java6+ APIs. This helps us to ensure we always protect these places.

Modifications:

Make use of `@SuppressJava6Requirement` explicit

Result:

Fixes https://github.com/netty/netty/issues/2509.
2019-10-14 15:54:49 +02:00
Nick Hill
170e4deee6 Fix event loop shutdown timing fragility (#9616)
Motivation

The current event loop shutdown logic is quite fragile and in the
epoll/NIO cases relies on the default 1 second wait/select timeout that
applies when there are no scheduled tasks. Without this default timeout
the shutdown would hang indefinitely.

The timeout only takes effect in this case because queued scheduled
tasks are first cancelled in
SingleThreadEventExecutor#confirmShutdown(), but I _think_ even this
isn't robust, since the main task queue is subsequently serviced which
could result in some new scheduled task being queued with much later
deadline.

It also means shutdowns are unnecessarily delayed by up to 1 second.

Modifications

- Add/extend unit tests to expose the issue
- Adjust SingleThreadEventExecutor shutdown and confirmShutdown methods
to explicitly add no-op tasks to the taskQueue so that the subsequent
event loop iteration doesn't enter blocking wait (as looks like was
originally intended)

Results

Faster and more robust shutdown of event loops, allows removal of the
default wait timeout
2019-10-07 11:06:01 +04:00
Nick Hill
85a663fa52 Null out completed tasks to help with garbage collection (#9613)
Motivation

When ScheduledFutureTasks complete, there's no need to retain a ref to
the wrapped task. Clearing it could help in particular with the case
where many scheduled tasks have been cancelled but their queue removal
delayed (since it is done lazily).

Modifications

This comprises just the PromiseTask changes from #9580. Upon completion,
replace the task reference with a static sentinel depending on the type
of completion (so that it will be reflected by toString).

Result

More expedient collection of cancelled task objects
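
A small sketch of the sentinel idea, using hypothetical names (`SketchPromiseTask`, `Sentinel`); it is not the actual PromiseTask code and ignores thread-safety.

```java
// Hypothetical sketch: after completion the wrapped task is replaced by a
// shared sentinel so the original Runnable can be garbage collected.
final class SketchPromiseTask implements Runnable {
    private static final class Sentinel implements Runnable {
        private final String name;

        Sentinel(String name) {
            this.name = name;
        }

        @Override
        public void run() { }

        @Override
        public String toString() {
            return name;
        }
    }

    private static final Runnable COMPLETED = new Sentinel("COMPLETED");
    private static final Runnable CANCELLED = new Sentinel("CANCELLED");
    private static final Runnable FAILED = new Sentinel("FAILED");

    private Runnable task;

    SketchPromiseTask(Runnable task) {
        this.task = task;
    }

    private boolean isDone() {
        return task == COMPLETED || task == CANCELLED || task == FAILED;
    }

    @Override
    public void run() {
        if (isDone()) {
            return;
        }
        try {
            task.run();
            task = COMPLETED; // drop the reference to the user task
        } catch (RuntimeException e) {
            task = FAILED;
        }
    }

    void cancel() {
        if (!isDone()) {
            task = CANCELLED;
        }
    }

    @Override
    public String toString() {
        // The sentinel name shows how the task completed.
        return "SketchPromiseTask(task: " + task + ')';
    }
}
```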
2019-09-27 09:54:42 +02:00
时无两丶
3ef00eaa06 Double check size to avoid ArrayIndexOutOfBoundsException (#9609)
Motivation:

In some race conditions `Recycler$Stack.pop` can throw an `ArrayIndexOutOfBoundsException`, so we should double-check `size` even after `scavenge` has been called.

Modifications:

Double check `size` after `scavenge`

Result:

avoid ArrayIndexOutOfBoundsException in `pop`
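
The double check can be pictured with the following simplified sketch. `SketchStack` is a hypothetical stand-in, and the `BooleanSupplier` merely simulates a `scavenge` call that reports success while the stack is still empty.

```java
import java.util.Arrays;
import java.util.function.BooleanSupplier;

// Hypothetical stand-in for a recycler stack; not Netty's Recycler$Stack.
final class SketchStack<T> {
    private Object[] elements = new Object[4];
    private int size;
    // Simulated scavenge(): may return true without this stack gaining elements.
    private final BooleanSupplier scavenge;

    SketchStack(BooleanSupplier scavenge) {
        this.scavenge = scavenge;
    }

    void push(T item) {
        if (size == elements.length) {
            elements = Arrays.copyOf(elements, size * 2);
        }
        elements[size++] = item;
    }

    @SuppressWarnings("unchecked")
    T pop() {
        int size = this.size;
        if (size == 0) {
            if (!scavenge.getAsBoolean()) {
                return null;
            }
            size = this.size;
            if (size <= 0) {
                // Double check: without it, elements[-1] would be read below.
                return null;
            }
        }
        size--;
        T item = (T) elements[size];
        elements[size] = null;
        this.size = size;
        return item;
    }
}
```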
2019-09-26 21:53:35 +02:00
Nick Hill
2791f0fefa Avoid use of global AtomicLong for ScheduledFutureTask ids (#9599)
Motivation

Currently a static AtomicLong is used to allocate a unique id whenever a
task is scheduled to any event loop. This could be a source of
contention if delayed tasks are scheduled at a high frequency and can be
easily avoided by having a non-volatile id counter per queue.

Modifications

- Replace static AtomicLong ScheduledFutureTask#nextTaskId with a long
field in AbstractScheduledExecutorService
- Set ScheduledFutureTask#id based on this when adding the task to the
queue (in event loop) instead of at construction time
- Add simple benchmark

Result

Less contention / cache-miss possibility when scheduling future tasks

Before:

Benchmark      (num)   Mode  Cnt    Score    Error  Units
scheduleLots  100000  thrpt   20  346.008 ± 21.931  ops/s

After:

Benchmark      (num)   Mode  Cnt    Score    Error  Units
scheduleLots  100000  thrpt   20  654.824 ± 22.064  ops/s
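
A simplified sketch of the per-queue counter, with hypothetical names (`SketchScheduledQueue`, `Task`). It assumes the queue is only touched by its owning thread, which is why a plain `long` is enough.

```java
import java.util.PriorityQueue;

// Hypothetical sketch: each queue owns a plain counter instead of sharing a static AtomicLong.
final class SketchScheduledQueue {
    static final class Task implements Comparable<Task> {
        final long deadlineNanos;
        long id; // assigned when the task is queued, not at construction time

        Task(long deadlineNanos) {
            this.deadlineNanos = deadlineNanos;
        }

        @Override
        public int compareTo(Task other) {
            int c = Long.compare(deadlineNanos, other.deadlineNanos);
            // The id breaks ties so equal deadlines run in submission order.
            return c != 0 ? c : Long.compare(id, other.id);
        }
    }

    private final PriorityQueue<Task> queue = new PriorityQueue<Task>();
    private long nextTaskId; // non-volatile: only the owning thread increments it

    void schedule(Task task) {
        task.id = ++nextTaskId;
        queue.add(task);
    }

    Task poll() {
        return queue.poll();
    }
}
```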
2019-09-25 07:34:25 +02:00
wyzhang
338e1a991c Fix a bug introduced by 79706357c73ded02615d0445db7503b646ff9547 which can cause thread to spin in an infinite loop. (#9579)
Motivation:
peek() is implemented in a similar way to poll() for the mpsc queue, thus it is more like a consumer call.
It is possible that multiple threads call peek() and possibly one thread calls poll() at the same time.
This leads to a multiple-consumer scenario, which violates the multiple-producer single-consumer condition and could lead to spinning in an infinite loop in peek().

Modification:
Use isEmpty() instead of peek() to check if task queue is empty

Result:
Don't violate the mpsc semantics.
2019-09-19 11:59:51 +02:00
Norman Maurer
2fe2a15593
No need to explicitly use the AccessController when SystemPropertyUtil is used (#9577)
Motivation:

SystemPropertyUtil already uses the AccessController internally, so there is no need to wrap its usage with AccessController as well.

Modifications:

Remove explicit AccessController usage when SystemPropertyUtil is used.

Result:

Code cleanup
2019-09-19 08:41:27 +02:00
Andrey Mizurov
bcb0d02248 Fix HttpContentEncoder does not handle multiple Accept-Encoding (#9557)
Motivation:
At the moment HttpContentEncoder handles only the first value when there are multiple accept-encoding headers.

Modification:

Join multiple accept-encoding headers to one separated by comma.

Result:

Fixes #9553
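
The joining step amounts to something like the sketch below. `AcceptEncodingJoin` is a hypothetical helper built on the JDK's `String.join`, not the code used in HttpContentEncoder.

```java
import java.util.List;

// Hypothetical sketch: collapse several Accept-Encoding header values into one.
final class AcceptEncodingJoin {
    private AcceptEncodingJoin() { }

    static String join(List<String> headerValues) {
        if (headerValues.isEmpty()) {
            return null; // no Accept-Encoding header present
        }
        // A comma-separated list is equivalent to repeating the header.
        return String.join(",", headerValues);
    }
}
```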
2019-09-11 08:46:06 +02:00
Nick Hill
629eae2082 Avoid redundant volatile read in DefaultPromise#get() (#9547)
Motivation

Currently every call to get() on a promise results in two reads of the
volatile result field when one would suffice. Maybe this is optimized
away but it seems sensible not to rely on that.

Modification

Reimplement get() and get(...) in DefaultPromise to reduce volatile access.

Result

Fewer volatile reads.
2019-09-09 09:54:38 +02:00
Nick Hill
768a825035 Avoid CancellationException construction in DefaultPromise (#9534)
Motivation

#9152 reverted some static exception reuse optimizations due to the
problem with Throwable#addSuppressed() raised in #9151. This introduced
a performance issue when promises are cancelled at a high frequency due
to the construction cost of CancellationException at the time that
DefaultPromise#cancel() is called.

Modifications

- Reinstate the prior static CANCELLATION_CAUSE_HOLDER but use it just
as a sentinel to indicate cancellation, constructing a new
CancellationException only if/when one needs to be explicitly
returned/thrown
- Subclass CancellationException, overriding fillInStackTrace() to
minimize the construction cost in these cases

Result

Promises are much cheaper to cancel. Fixes #9522.
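
The two modifications can be sketched as follows. `SketchPromise` and `LeanCancellationException` are hypothetical names and the promise is heavily simplified; the point is only that cancelling stores a shared sentinel and the exception is built lazily, without a stack trace.

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch; not the actual DefaultPromise code.
final class SketchPromise {
    // Shared sentinel: marks cancellation without allocating an exception.
    private static final Object CANCELLED = new Object();

    private static final class LeanCancellationException extends CancellationException {
        private static final long serialVersionUID = 1L;

        @Override
        public Throwable fillInStackTrace() {
            // Skip the stack walk, which dominates the construction cost.
            return this;
        }
    }

    private final AtomicReference<Object> result = new AtomicReference<Object>();

    boolean cancel() {
        return result.compareAndSet(null, CANCELLED);
    }

    boolean isCancelled() {
        return result.get() == CANCELLED;
    }

    Throwable cause() {
        // The exception is only constructed if a caller actually asks for it.
        return isCancelled() ? new LeanCancellationException() : null;
    }
}
```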
2019-09-05 11:07:24 +02:00
Xiaoqin Fu
21b7e29ea7 Remove extra checks to fix #9456 (#9523)
Motivation:

There are some extra log level checks (logger.isWarnEnabled()).

Modification:

Remove log level checks (logger.isWarnEnabled()) from io.netty.channel.epoll.AbstractEpollStreamChannel, io.netty.channel.DefaultFileRegion, io.netty.channel.nio.AbstractNioChannel, io.netty.util.HashedWheelTimer, io.netty.handler.stream.ChunkedWriteHandler and io.netty.channel.udt.nio.NioUdtMessageConnectorChannel

Result:

Fixes #9456
2019-08-30 10:37:30 +02:00
Codrut Stancu
b7e9829a49 Update GraalVM Native Image configuration. (#9515)
Motivation:

The Netty classes are initialized at build time by default for GraalVM Native Image compilation. This is configured via the `--initialize-at-build-time=io.netty` option. While this reduces start-up time it can lead to some problems:

 - The class initializer of `io.netty.buffer.PooledByteBufAllocator` looks at the maximum memory size to compute the size of internal buffers. If the class initializer runs during image generation, then the buffers are sized according to the very large heap size that the image generator uses, and Netty allocates several arrays that are 16 MByte. The fix is to initialize the following 3 classes at run time: `io.netty.buffer.PooledByteBufAllocator,io.netty.buffer.ByteBufAllocator,io.netty.buffer.ByteBufUtil`. This fix was dependent on a GraalVM Native Image fix that was included in 19.2.0.

 - The class initializer of `io.netty.handler.ssl.util.ThreadLocalInsecureRandom` needs to be initialized at runtime to ensure that the generated values are truly random and not fixed for each generated image.

 - The class initializers of `io.netty.buffer.AbstractReferenceCountedByteBuf` and `io.netty.util.AbstractReferenceCounted` compute field offsets. While the field offset recomputation is necessary for correct execution as a native image these initializers also have logic that depends on the presence/absence of `sun.misc.Unsafe`, e.g., via the `-Dio.netty.noUnsafe=true` flag. The fix is to push these initializers to runtime so that the field offset lookups (and the logic depending on them) run at run time. This way no manual substitutions are necessary either.
 
Modifications:

Add `META-INF/native-image` configuration files that correctly trigger the initialization of the above classes at run time via `--initialize-at-run-time=...` flags.
 
Result:

Fixes the initialisation issues described above for Netty executables built with GraalVM.
2019-08-30 09:21:11 +02:00
szh
1a22c126be Fix log format in HashedWheelTimer (#9507)
Motivation:

log message did not correctly use `{}`

Modification:

replace `%d` by `{}`

Result:

The log is correct.
2019-08-26 08:54:45 +02:00
Nick Hill
a22d4ba859 Simplify EventLoop abstractions for timed scheduled tasks (#9470)
Motivation

The epoll transport was updated in #7834 to decouple setting of the
timerFd from the event loop, so that scheduling delayed tasks does not
require waking up epoll_wait. To achieve this, new overridable hooks
were added in the AbstractScheduledEventExecutor and
SingleThreadEventExecutor superclasses.

However, the minimumDelayScheduledTaskRemoved hook has no current
purpose and I can't envisage a _practical_ need for it. Removing
it would reduce complexity and avoid supporting this specific
API indefinitely. We can add something similar later if needed
but the opposite is not true.

There also isn't a _nice_ way to use the abstractions for
wakeup-avoidance optimizations in other EventLoops that don't have a
decoupled timer.

This PR replaces executeScheduledRunnable and
wakesUpForScheduledRunnable
with two new methods before/afterFutureTaskScheduled that have slightly
different semantics:
 - They only apply to additions; given the current internals there's no
practical use for removals
 - They allow per-submission wakeup decisions via a boolean return val,
which makes them easier to exploit from other existing EL impls (e.g.
NIO/KQueue)
 - They are subjectively "cleaner", taking just the deadline parameter
and not exposing Runnables
 - For current EL/queue impls, only the "after" hook is really needed,
but specialized blocking queue impls can conditionally wake on task
submission (I have one lined up)

Also included are further optimization/simplification/fixes to the
timerFd manipulation logic.

Modifications

- Remove AbstractScheduledEventExecutor#minimumDelayScheduledTaskRemoved()
and supporting methods
- Uplift NonWakeupRunnable and corresponding default wakesUpForTask()
impl from SingleThreadEventLoop to SingleThreadEventExecutor
- Change executeScheduledRunnable() to be package-private, and have a
final impl in SingleThreadEventExecutor which triggers new overridable
hooks before/afterFutureTaskScheduled()
- Remove unnecessary use of bookend tasks while draining the task queue
- Use new hooks to add simpler wake-up avoidance optimization to
NioEventLoop (primarily to demonstrate utility/simplicity)
- Reinstate removed EpollTest class

In EpollEventLoop:
 - Refactor to use only the new afterFutureTaskScheduled() hook for
updating timerFd
 - Fix setTimerFd race condition using a monitor
 - Set nextDeadlineNanos to a negative value while the EL is awake and
use this to block timer changes from outside the EL. Restore the
known-set value prior to sleeping, updating timerFd first if necessary
 - Don't read from timerFd when processing expiry event

Result

- Cleaner API for integrating with different EL/queue timing impls
- Fixed race condition to avoid missing scheduled wakeups
- Eliminate unnecessary timerFd updates while EL is awake, and
unnecessary expired timerFd reads
- Avoid unnecessary scheduled-task wakeups when using NIO transport

I did not yet further explore the suggestion of using
TFD_TIMER_ABSTIME for the timerFd.
2019-08-21 12:34:22 +02:00
Antony T Curtis
8a082532f2 AsciiString contentEqualsIgnoreCase fails when arrayOffset is non-zero (#9477)
Motivation:

AsciiString.contentEqualsIgnoreCase may return true for non-matching strings of equal length when offset is non zero.

Modifications:

- Correctly take offset into account
- Add unit test

Result: 

Fixes #9475
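
A minimal sketch of an offset-aware, case-insensitive comparison. `OffsetCompare` is a hypothetical helper operating on raw byte arrays; it is not the AsciiString implementation.

```java
// Hypothetical sketch: both offsets are applied on every access.
final class OffsetCompare {
    private OffsetCompare() { }

    private static byte toLower(byte b) {
        return (b >= 'A' && b <= 'Z') ? (byte) (b + 32) : b;
    }

    static boolean equalsIgnoreCase(byte[] a, int aOffset, byte[] b, int bOffset, int length) {
        for (int i = 0; i < length; i++) {
            // Forgetting aOffset or bOffset here would compare the wrong bytes.
            if (toLower(a[aOffset + i]) != toLower(b[bOffset + i])) {
                return false;
            }
        }
        return true;
    }
}
```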
2019-08-17 09:56:39 +02:00
Scott Mitchell
1fa7a5e697 EPOLL - decouple schedule tasks from epoll_wait life cycle (#7834)
Motivation:
EPOLL supports decoupling the timed wakeup mechanism from the selector call. The EPOLL transport takes advantage of this in order to offer more fine-grained timer resolution. However, we are currently calling timerfd_settime on each call to epoll_wait and this is expensive. We don't have to re-arm the timer on every call to epoll_wait and instead only have to arm the timer when a task is scheduled with an earlier expiration than any other existing scheduled task.

Modifications:
- Before scheduled tasks are added to the task queue, we determine if the new
  duration is the soonest to expire, and if so update with timerfd_settime. We
also drain all the tasks at the end of the event loop to make sure we service
any expired tasks and get an accurate next time delay.
- EpollEventLoop maintains a volatile variable which represents the next deadline to expire. This variable is modified inside the event loop thread (before calling epoll_wait) and outside the event loop thread (immediately to ensure proper wakeup time).
- Execute the task queue before the schedule task priority queue. This means we
  may delay the processing of scheduled tasks but it ensures we transfer all
pending tasks from the task queue to the scheduled priority queue to run the
soonest to expire scheduled task first.
- Deprecate IORatio on EpollEventLoop, and drain the executor and scheduled queue on each event loop wakeup. Coupling the amount of time we are allowed to drain the executor queue to a proportion of time we process inbound IO may lead to unbounded queue sizes and unpredictable latency.

Result:
Fixes https://github.com/netty/netty/issues/7829
- In most cases this results in less calls to timerfd_settime
- Less event loop wakeups just to check for scheduled tasks executed outside the event loop
- More predictable executor queue and scheduled task queue draining
- More accurate and responsive scheduled task execution
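
The "only re-arm for an earlier deadline" rule can be illustrated with the single-threaded sketch below. `TimerArmSketch` is a hypothetical class, and the counter merely stands in for the native timerfd_settime call.

```java
// Hypothetical sketch: re-arm the timer only when a new task expires sooner.
final class TimerArmSketch {
    private long nextDeadlineNanos = Long.MAX_VALUE;
    private int armCount; // stand-in for the number of timerfd_settime calls

    void onTaskScheduled(long deadlineNanos) {
        if (deadlineNanos < nextDeadlineNanos) {
            nextDeadlineNanos = deadlineNanos;
            armCount++; // the real transport would arm the timer here
        }
        // Later deadlines need no action: the timer already fires before them.
    }

    int armCount() {
        return armCount;
    }

    long nextDeadlineNanos() {
        return nextDeadlineNanos;
    }
}
```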
2019-08-14 10:11:04 +02:00
Nico Kruber
8d9cea2ce0 Try to load native linux libraries with matching classifier first (#9411)
Motivation:

Users' runtime systems may have incompatible dynamic libraries to the ones our
tcnative wrappers link to. Unfortunately, we cannot determine and catch these
scenarios (in which the JVM crashes) but we can make a more educated guess on
what library to load and try to find one that works better before crashing.

Modifications:

1) Build dynamically linked openSSL builds for more OSs (netty-tcnative)
2) Load native linux libraries with matching classifier (first)

Result:

More developers / users can use the dynamically-linked native libraries.
2019-08-12 08:37:27 +02:00
Per Lundberg
aa032b8aea Future.java: Fix typos in Javadoc (#9391)
Motivation:

Docs should have no typos

Modifications:

Fix a few typos

Result:

More correct docs.
2019-07-24 07:23:29 +02:00
YuanHu
94f3930850 Recycler availableSharedCapacity will be slowly exhausted due missing reclaimSpace(...) call (#9394)
Motivation:

We missed a call to reclaimSpace(...) in one case, which can lead to the Recycler not correctly reclaiming space and therefore creating new objects when not needed.

Modifications:

Correctly call reclaimSpace(...)

Result:

Recycler correctly reclaims space in all situations.
2019-07-21 21:06:31 +02:00
Dmitriy Dumanskiy
a82d62ae67 prefer instanceOf instead of getClass() (#9366)
Motivation:

`instanceof` doesn't perform a null check the way `getClass()` does, so `instanceof` may be faster. However, this is not true in all cases, as C2 can eliminate these null checks for `getClass()`.

Modification:

Replaced `string.getClass() == AsciiString.class` with `string instanceof AsciiString`.

Proof:

```
@BenchmarkMode(Mode.Throughput)
@Fork(value = 1)
@State(Scope.Thread)
@Warmup(iterations = 5, time = 1, batchSize = 1000)
@Measurement(iterations = 10, time = 1, batchSize = 1000)
public class GetClassInstanceOf {

    Object key;

    @Setup
    public void setup() {
        key = "123";
    }

    @Benchmark
    public boolean getClassEquals() {
        return key.getClass() == String.class;
    }

    @Benchmark
    public boolean instanceOf() {
        return key instanceof String;
    }

}
```

```
Benchmark                           Mode  Cnt       Score      Error  Units
GetClassInstanceOf.getClassEquals  thrpt   10  401863.130 ± 3092.568  ops/s
GetClassInstanceOf.instanceOf      thrpt   10  421386.176 ± 4317.328  ops/s
```
2019-07-16 21:20:12 +02:00
jingene
c0f9364870 Change the netty.io homepage scheme(http -> https) (#9344)
Motivation:

Netty homepage(netty.io) serves both "http" and "https".
It's recommended to use https rather than http.
Modification:

I changed from "http://netty.io" to "https://netty.io"
Result:

No effects.
2019-07-09 21:09:42 +02:00
jimin
a0656d2a31 Remove unnecessary code (#9303)
Motivation:

There is some unnecessary code (like toString() calls) which can be cleaned up.

Modifications:

- Remove not needed toString() calls
- Simplify subString(...) calls
- Remove some explicit casts when not needed.

Result:

Cleaner code
2019-07-04 08:51:47 +02:00
jimin
ee8206cb26 optimize some code (#9289)
Motivation:

There is some manual copying of elements of Collections which can be replaced by Collections.addAll(...) and also some unnecessary semicolons.

Modifications:

- Simplify branches
- Use Collections.addAll
- Code cleanup

Result:

Code cleanup
2019-06-28 13:48:23 +02:00
jimin
856f1185e1 All override methods must be added @override (#9285)
Motivation:

Some methods that either override others or implement an interface method were missing the `@Override` annotation

Modifications:

Add missing `@Override`s

Result:

Code cleanup
2019-06-27 13:51:26 +02:00
jimin
9621a5b981 remove unused imports (#9287)
Motivation:

Some imports are not used

Modification:

remove unused imports

Result:

Code cleanup
2019-06-26 21:08:31 +02:00
jimin
6bd8f0502d Call to ‘asList’ with only one argument could be replaced with ‘singletonList’ (#9288)
Motivation:

asList should only be used if there are multiple elements.

Modification:

Call to asList with only one argument could be replaced with singletonList

Result:

Cleaner code and a bit of memory savings
2019-06-26 21:06:48 +02:00
Norman Maurer
c9aaa93d83
Allow to specify a EventLoopTaskQueueFactory for various EventLoopGroup implementations (#9247)
Motivation:

Sometimes it is desirable to be able to use a different Queue implementation for the EventLoop of a Channel. This is currently not possible without resorting to reflection.

Modifications:

- Add a new constructor to Nio|Epoll|KQueueEventLoopGroup which allows specifying a factory that is used to create the task queue. This way the user can override the default implementation.
- Add test

Result:

Be able to change Queue that is used for the EventLoop.
2019-06-21 09:05:19 +02:00
Nick Hill
e1a881fa2b Simplify SingleThreadEventExecutor.awaitTermination() implementation (#9081)
Motivation

A Semaphore is currently dedicated to this purpose but a simple
CountDownLatch will do.

Modification

Remove private threadLock Semaphore from SingleThreadEventExecutor and just use a CountDownLatch.

Also eliminate use of PlatformDependent.throwException() in startThread
method, and combine some nested if clauses.

Result

Cleaner EventLoop termination notification.
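
A minimal sketch of latch-based termination signalling. `SketchExecutor` is a hypothetical class; unlike a real `ExecutorService`, its `awaitTermination` swallows the interrupt and returns false, purely to keep the example short.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: a one-shot latch signals that the worker thread has exited.
final class SketchExecutor {
    private final CountDownLatch terminated = new CountDownLatch(1);

    void start(final Runnable work) {
        Thread thread = new Thread(new Runnable() {
            @Override
            public void run() {
                try {
                    work.run();
                } finally {
                    // Released exactly once, whether or not the work failed.
                    terminated.countDown();
                }
            }
        });
        thread.start();
    }

    boolean awaitTermination(long timeout, TimeUnit unit) {
        try {
            return terminated.await(timeout, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        }
    }
}
```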
2019-05-27 16:05:40 +02:00
Norman Maurer
f17bfd0f64
Only use static Exception instances when we can ensure addSuppressed … (#9152)
Motivation:

An OOME can occur because suppressedExceptions keeps growing when other libraries call Throwable#addSuppressed. As we have no control over what other libraries do, we need to ensure this cannot lead to an OOME.

Modifications:

Only use static instances of the Exceptions if we can either disable addSuppressed or we run on Java 6.

Result:

Not possible to OOME because of addSuppressed. Fixes https://github.com/netty/netty/issues/9151.
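
On Java 7 and later, suppression can be switched off through the protected four-argument `Throwable` constructor. The sketch below uses a hypothetical `SharedException` class to show that such an instance stays safe to share; it is not one of Netty's exception types.

```java
// Hypothetical sketch: a static exception that ignores addSuppressed(...).
final class SharedException extends Exception {
    private static final long serialVersionUID = 1L;

    static final SharedException INSTANCE = new SharedException("shared instance");

    private SharedException(String message) {
        // enableSuppression = false, writableStackTrace = false
        super(message, null, false, false);
    }
}
```

With suppression disabled, `addSuppressed(...)` has no effect, so the shared instance cannot accumulate references over time.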
2019-05-17 22:23:02 +02:00
Nick Hill
cb85e03d72 AsciiString.lastIndexOf(...) is implemented incorrectly (#9103)
Motivation

@xiaoheng1 reported incorrect behaviour of AsciiString.lastIndexOf in
#9099. Upon closer inspection it appears that it was never implemented
correctly and searches between the provided index and the end of the
string similar to indexOf(...), rather than between the provided index
and the beginning of the string as the javadoc states (and in line with
java.lang.String).

Modifications

Fix AsciiString.lastIndexOf implementation and corresponding unit tests
to behave the same as the equivalent String methods.

Result

Fixes #9099
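
The intended semantics, searching backwards from the given index as `java.lang.String` does, can be sketched on plain byte arrays. `LastIndexOfSketch` is a hypothetical helper and not the AsciiString code.

```java
// Hypothetical sketch mirroring String.lastIndexOf(String, int) on byte arrays.
final class LastIndexOfSketch {
    private LastIndexOfSketch() { }

    static int lastIndexOf(byte[] value, byte[] needle, int start) {
        // The match may begin no later than start and must still fit in value.
        int candidate = Math.min(start, value.length - needle.length);
        for (int i = candidate; i >= 0; i--) { // walk towards the beginning
            if (matchesAt(value, needle, i)) {
                return i;
            }
        }
        return -1;
    }

    private static boolean matchesAt(byte[] value, byte[] needle, int index) {
        for (int j = 0; j < needle.length; j++) {
            if (value[index + j] != needle[j]) {
                return false;
            }
        }
        return true;
    }
}
```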
2019-05-13 07:03:32 +02:00
Anuraag Agrawal
526f2da912 Add equality check to contentEquals instance methods. (#9130)
Motivation:

An instance is always equal to itself. It makes sense to skip processing for this case, which isn't uncommon since `AsciiString` is often memoized within an application when used as HTTP header names.

Modification:

`contentEquals` methods first check for instance equality before doing processing.

Result:

`contentEquals` will be faster when comparing an instance with itself.

I couldn't find any unit tests for these methods, only the static version. Let me know if I should add something to `AsciiStringCharacterTest`.

Came up here:
https://github.com/line/armeria/pull/1731#discussion_r280396280
2019-05-08 07:30:34 +02:00
Paulo Lopes
f1495e1945 Add SVM metadata and minimal substitutions to build graalvm native image applications. (#8963)
Motivation:

GraalVM native images are a new way to deliver Java applications. Netty is one of the most popular libraries; however, there are a few limitations that make it impossible to use with native images out of the box. Adding a little metadata in specific modules will allow the compilation to succeed and produce working binaries.

Modification:

Added properties files in `META-INF` and substitutions classes (under `internal.svm`) will solve the compilation issues. The substitutions classes are not visible and do not have a public constructor so they are not visible to end users.

Result:

Fixes #8959 

This fix is very conservative as it applies the minimum config required to build:

* pure netty servers
* vert.x applications
* grpc applications

The build is having trouble due to checkstyle which does not seem to be able to find the copyright notice on property files.
2019-04-29 08:39:42 +02:00
Norman Maurer
b5a2774502
Fix flaky GlobalEventExecutorTest.* (#9074)
Motivation:

In GlobalEventExecutorTest we used Thread.sleep(...) which can produce flaky results (as seen on the CI). We should use another alternative during tests.

Modifications:

Replace Thread.sleep(...) with join()

Result:

No more flaky GlobalEventExecutor tests.
2019-04-29 08:33:03 +02:00
Norman Maurer
34aa2c841c
Don't use sun.misc.Unsafe when IKVM.NET is used (#9042)
Motivation:

IKVM.NET seems to ship a buggy sun.misc.Unsafe class; for this reason we should disable our sun.misc.Unsafe usage when we detect that IKVM.NET is used.

Modifications:

Check if IKVM.NET is used and if so do not use sun.misc.Unsafe by default.

Result:

Fixes https://github.com/netty/netty/issues/9035 and https://github.com/netty/netty/issues/8916.
2019-04-12 22:41:53 +02:00
Nick Hill
b26a61acd1 Centralize internal reference counting logic (#8614)
Motivation

AbstractReferenceCounted and AbstractReferenceCountedByteBuf contain
duplicate logic for managing the volatile refcount in an optimized and
consistent manner, which increased in complexity in #8583. It's possible
to extract this into a common helper class now that all access is via an
AtomicIntegerFieldUpdater.

Modifications

- Move duplicate logic into a shared ReferenceCountUpdater class
- Incorporate some additional simplification for the most common single
increment/decrement cases (fewer checks/operations)

Result

Less code duplication, better encapsulation of the "non-trivial"
internal volatile refcount manipulation
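
A simplified sketch of sharing the counting logic through a helper built around an `AtomicIntegerFieldUpdater`. `SketchRefCntUpdater` and `SketchCounted` are hypothetical names, and the plain increment/decrement scheme here is an assumption that leaves out the optimizations of the actual ReferenceCountUpdater.

```java
import java.util.concurrent.atomic.AtomicIntegerFieldUpdater;

// Hypothetical shared helper: all refcount manipulation lives in one place.
abstract class SketchRefCntUpdater<T> {
    protected abstract AtomicIntegerFieldUpdater<T> updater();

    final T retain(T instance) {
        for (;;) {
            int cnt = updater().get(instance);
            if (cnt <= 0) {
                throw new IllegalStateException("already released");
            }
            if (cnt == Integer.MAX_VALUE) {
                throw new IllegalStateException("refCnt overflow");
            }
            if (updater().compareAndSet(instance, cnt, cnt + 1)) {
                return instance;
            }
        }
    }

    // Returns true when the count dropped to zero.
    final boolean release(T instance) {
        for (;;) {
            int cnt = updater().get(instance);
            if (cnt <= 0) {
                throw new IllegalStateException("already released");
            }
            if (updater().compareAndSet(instance, cnt, cnt - 1)) {
                return cnt == 1;
            }
        }
    }
}

// Hypothetical user of the helper; only the field and the updater are declared here.
final class SketchCounted {
    private static final AtomicIntegerFieldUpdater<SketchCounted> REFCNT =
            AtomicIntegerFieldUpdater.newUpdater(SketchCounted.class, "refCnt");

    private static final SketchRefCntUpdater<SketchCounted> UPDATER =
            new SketchRefCntUpdater<SketchCounted>() {
                @Override
                protected AtomicIntegerFieldUpdater<SketchCounted> updater() {
                    return REFCNT;
                }
            };

    private volatile int refCnt = 1;

    SketchCounted retain() {
        return UPDATER.retain(this);
    }

    boolean release() {
        return UPDATER.release(this);
    }

    int refCnt() {
        return refCnt;
    }
}
```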
2019-04-09 16:22:32 +02:00
Norman Maurer
c83904a12a
Allow to automatically trim the PoolThreadCache in a timely interval (#8941)
Motivation:

PooledByteBufAllocator uses a PoolThreadCache per Thread that allocates / deallocates to minimize the performance overhead. This PoolThreadCache is trimmed after X allocations to free up buffers that have not been allocated for a long time. This works out quite well when the app continues to allocate, but fails if the app stops allocating frequently (for whatever reason), so a lot of memory is wasted and not given back to the arena / freed.

Modifications:

- Add a ThreadExecutorMap that offers multiple methods that wrap Runnable / ThreadFactory / Executor and allow to call ThreadExecutorMap.currentEventExecutor() to get the current executing EventExecutor for the calling Thread.
- Use these methods in the constructors of our EventExecutor implementations (which also covers the EventLoop implementations)
- Add io.netty.allocator.cacheTrimIntervalMillis system property which can be used to specify a fixed rate / interval on which we should try to trim the PoolThreadCache for a EventExecutor that allocates.
- Add PooledByteBufAllocator.trimCurrentThreadCache() to allow the user to trim the cache of the calling thread manually.
- Add testcases
- Introduce FastThreadLocal.getIfExists()

Result:

Allow trimming the PoolThreadCache better / more frequently and so give memory back to the arena / system.
2019-03-22 11:08:37 +01:00
Norman Maurer
9b1a59df38
Remove old internal code that is not used anymore after removing usage of ObjectCleaner (#8956)
Motivation:

We don't use ObjectCleaner in our FastThreadLocal anymore, so we also don't need to take special care to store it there anymore.

Modifications:

Remove code that is not needed anymore.

Result:

Code cleanup.
2019-03-20 08:33:06 +01:00
Norman Maurer
c7248d84b5
Let GlobalEventExecutor implement OrderedEventExecutor (#8952)
Motivation:

GlobalEventExecutor does already provide all guarantees of OrderedEventExecutor so it should implement it.

Modifications:

Let GlobalEventExecutor implement OrderedEventExecutor.

Result:

Make it more clear how execution order is handled in GlobalEventExecutor.
2019-03-19 11:39:20 +01:00
Enrico Olivelli
eb1d12c757 Expose the global direct memory counter. (#8945)
Motivation:
This counter is very useful in order to monitor Netty without having every ByteBufAllocator in the JVM

Modification:
Expose the value of DIRECT_MEMORY_COUNTER as we are already doing for DIRECT_MEMORY_LIMIT.
We are returning -1 in case that DIRECT_MEMORY_COUNTER is not available.

Result:

Be able to get the amount of direct memory used.
2019-03-19 08:34:35 +01:00
Norman Maurer
eab849176b
Fix typo in NativeLibraryLoader debug log message (#8947)
Motivation:

We had a typo in the NativeLibraryLoader debug log message which could mislead the user.

Modifications:

Fix typo to correctly state java.library.path

Result:

Correct and less confusing log message
2019-03-16 14:27:48 +01:00
Nick Hill
b2eaab092b Optimize Hpack and AsciiString hashcode and equals (#8902)
Motivation:

While looking at hpack header-processing hotspots I noticed some low
level too-big-to-inline methods which can be shrunk.

Modifications:

Reduce bytecode size and/or runtime operations used for the following
methods:

PlatformDependent0.equals(byte[], ...)
PlatformDependent0.equalsConstantTime(byte[], ...)
PlatformDependent0.hashCodeAscii(byte[],int,int)
PlatformDependent.hashCodeAscii(CharSequence)

Result:

Existing benchmarks show decent improvement

Before

Benchmark                     (size)   Mode  Cnt         Score         Error  Units
HpackUtilBenchmark.newEquals   SMALL  thrpt    5  17200229.374 ± 1701239.198  ops/s
HpackUtilBenchmark.newEquals  MEDIUM  thrpt    5   3386061.629 ±   72264.685  ops/s
HpackUtilBenchmark.newEquals   LARGE  thrpt    5    507579.209 ±   65883.951  ops/s

After

Benchmark                     (size)   Mode  Cnt         Score         Error  Units
HpackUtilBenchmark.newEquals   SMALL  thrpt    5  29221527.058 ± 4805825.836  ops/s
HpackUtilBenchmark.newEquals  MEDIUM  thrpt    5   6556251.645 ±  466115.199  ops/s
HpackUtilBenchmark.newEquals   LARGE  thrpt    5    879828.889 ±  148136.641  ops/s

Before

Benchmark                          (size)  Mode  Cnt     Score     Error  Units
PlatformDepBench.unsafeBytesEqual       4  avgt   10     4.263 ±   0.110  ns/op
PlatformDepBench.unsafeBytesEqual      10  avgt   10     5.206 ±   0.133  ns/op
PlatformDepBench.unsafeBytesEqual      50  avgt   10     8.160 ±   0.320  ns/op
PlatformDepBench.unsafeBytesEqual     100  avgt   10    13.810 ±   0.751  ns/op
PlatformDepBench.unsafeBytesEqual    1000  avgt   10    89.077 ±   7.275  ns/op
PlatformDepBench.unsafeBytesEqual   10000  avgt   10   773.940 ±  24.579  ns/op
PlatformDepBench.unsafeBytesEqual  100000  avgt   10  7546.807 ± 110.395  ns/op

After

Benchmark                          (size)  Mode  Cnt     Score     Error  Units
PlatformDepBench.unsafeBytesEqual       4  avgt   10     3.337 ±   0.087  ns/op
PlatformDepBench.unsafeBytesEqual      10  avgt   10     4.286 ±   0.194  ns/op
PlatformDepBench.unsafeBytesEqual      50  avgt   10     7.817 ±   0.123  ns/op
PlatformDepBench.unsafeBytesEqual     100  avgt   10    11.260 ±   0.412  ns/op
PlatformDepBench.unsafeBytesEqual    1000  avgt   10    84.255 ±   2.596  ns/op
PlatformDepBench.unsafeBytesEqual   10000  avgt   10   591.892 ±   5.136  ns/op
PlatformDepBench.unsafeBytesEqual  100000  avgt   10  6978.859 ± 285.043  ns/op
2019-03-08 06:55:11 +01:00
Norman Maurer
452abd9b51
Correctly monkey-patch id also when os / arch is used within library name. (#8913)
Motivation:

2bb9f64e16dbb5cbbf691e284a97d745378a7b8a introduced a change which made it possible to use different shaded versions of netty-tcnative on the classpath. This only partly worked as we did not correctly handle the case when os / arch is part of the library name (which is the case when netty-tcnative-boringssl-static is used with the uber jar).

Modifications:

- If patching the ID failed we retry again with the os / arch stripped
- Add unit tests to verify that patching ID now works with and without os / arch as suffix.

Result:

Using multiple shaded versions of netty-tcnative-boringssl-static on MacOS works.
2019-03-05 09:10:26 +01:00
Norman Maurer
90ea3ec9f6
Adjust tests to be able to build / test when using IBM J9 / OpenJ9 (#8900)
Motivation:

We should run a CI job using J9 to ensure netty also works when using different JVMs.

Modifications:

- Adjust PooledByteBufAllocatorTest to be able to complete faster when using a JVM which takes longer when joining Threads (this seems to be the case with J9).
- Skip UDT tests on J9 as UDT is not supported there.

Result:

Be able to run CI against J9.
2019-03-01 06:47:56 +01:00
Norman Maurer
625c4e8286
Tighten up contract of PromiseCombiner and so make it more safe to use (#8886)
Motivation:

PromiseCombiner is not thread-safe and even assumes all added Futures are using the same EventExecutor. This is kind of fragile as we do not enforce this. We need to enforce this contract to ensure it's safe to use and easy to spot concurrency problems.

Modifications:

- Add new constructor to PromiseCombiner that takes an EventExecutor and deprecate the old non-arg constructor.
- Check if methods are called from within the EventExecutor thread and if not fail
- Correctly dispatch on the right EventExecutor if the Future uses a different EventExecutor to eliminate concurrency issues.

Result:

More safe use of PromiseCombiner + enforce correct usage / contract.
2019-02-28 20:32:04 +01:00
Norman Maurer
215b61e8e2
Add test for Iterator.remove() on KObjectHashMap.values().iterator() (#8891)
Motivation:

https://github.com/netty/netty/pull/8866 added support for calling Iterator.remove() but did not add a testcase.

Modifications:

Add testcase to ensure removal works.

Result:

Better test-coverage.
2019-02-27 12:06:13 +01:00
Michael André Pearce
e4d4775a10 Support removal using values iterator. (#8866)
Motivation:

As the ActiveMQ project uses Netty, we want to make use of this class. Unfortunately the iterator on values() does not seem to support the remove method, even though the delegated iterator does. Currently we have to clone and modify this class locally although only a one-line change is needed; it would be ideal if Netty could allow remove, removing the need to maintain a clone.

Modifications:

* remove throws UnsupportedOperationException, and instead call remove method on delegated iterator

Result:

Be able to call Iterator.remove() for the values.
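
The delegation can be sketched as below. `ValuesIterator` is a hypothetical wrapper over a standard `Map` entry iterator and is not the generated KObjectHashMap code.

```java
import java.util.Iterator;
import java.util.Map;

// Hypothetical sketch: a values iterator that forwards remove() to the entry iterator.
final class ValuesIterator<K, V> implements Iterator<V> {
    private final Iterator<Map.Entry<K, V>> delegate;

    ValuesIterator(Iterator<Map.Entry<K, V>> delegate) {
        this.delegate = delegate;
    }

    @Override
    public boolean hasNext() {
        return delegate.hasNext();
    }

    @Override
    public V next() {
        return delegate.next().getValue();
    }

    @Override
    public void remove() {
        // Delegate instead of throwing UnsupportedOperationException.
        delegate.remove();
    }
}
```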
2019-02-26 21:02:56 +01:00