Commit Graph

756 Commits

Author SHA1 Message Date
Scott Mitchell
0a47c590fe HttpHeaders valuesIterator and contains improvements
Motivation:
In order to determine if a header contains a value we currently rely
upon getAll(..) and regular expressions. This operation is commonly used
during the encode and decode stage to determine the transfer encoding
(e.g. HttpUtil#isTransferEncodingChunked). This operation requires an
intermediate collection and possibly regular expressions for the
CombinedHttpHeaders use case which can be expensive.

Modifications:
- Add a valuesIterator to HttpHeaders and specializations of this method
for DefaultHttpHeaders, ReadOnlyHttpHeaders, and CombinedHttpHeaders.

Result:
Less intermediate collections and allocation overhead when determining
if HttpHeaders contains a name/value pair.
2017-11-20 08:34:06 -08:00
Moses Nakamura
d976dc108d codec-http2: Improve h1 to h2 header conversion
Motivation:

Netty could handle "connection" or "te" headers more gently when
converting from http/1.1 to http/2 headers.  Http/2 headers don't
support single-hop headers, so when we convert from http/1.1 to http/2,
we should drop all single-hop headers.  This includes headers like
"transfer-encoding" and "connection", but also the headers that
"connection" points to, since "connection" can be used to designate
other headers as single-hop headers.  For the "te" header, we can more
permissively convert it by just dropping non-conforming headers (ie
non-"trailers" headers) which is what we do for all other headers when
we convert.

Modifications:

Add a new blacklist to the http/1.1 to http/2 conversion, which is
constructed from the values of the "connection" header, and stop
throwing an exception when a "te" header is passed with a non-"trailers"
value.  Instead, drop all values except for "trailers".  Add unit tests
for "connection" and "te" headers when converting from http/1.1 to http/2.

Result:

This will improve the h2c upgrade request, and also conversions from
http/1.1 to http/2.  This will simplify implementing spec-compliant
http/2 servers that want to share code between their http/1.1 and http/2
implementations.

[Fixes #7355]
2017-11-17 09:09:52 +01:00
Anuraag Agrawal
1f1a60ae7d Use Netty's DefaultPriorityQueue instead of JDK's PriorityQueue for scheduled tasks
Motivation:

`AbstractScheduledEventExecutor` uses a standard `java.util.PriorityQueue` to keep track of task deadlines. `ScheduledFuture.cancel` removes tasks from this `PriorityQueue`. Unfortunately, `PriorityQueue.remove` has `O(n)` performance since it must search for the item in the entire queue before removing it. This is fast when the future is at the front of the queue (e.g., already triggered) but not when it's randomly located in the queue.

Many servers will use `ScheduledFuture.cancel` on all requests, e.g., to manage a request timeout. As these cancellations will be happen in arbitrary order, when there are many scheduled futures, `PriorityQueue.remove` is a bottleneck and greatly hurts performance with many concurrent requests (>10K).

Modification:

Use netty's `DefaultPriorityQueue` for scheduling futures instead of the JDK. `DefaultPriorityQueue` is almost identical to the JDK version except it is able to remove futures without searching for them in the queue. This means `DefaultPriorityQueue.remove` has `O(log n)` performance.

Result:

Before - cancelling futures has varying performance, capped at `O(n)`
After - cancelling futures has stable performance, capped at `O(log n)`

Benchmark results

After - cancelling in order and in reverse order have similar performance within `O(log n)` bounds
```
Benchmark                                           (num)   Mode  Cnt       Score      Error  Units
ScheduledFutureTaskBenchmark.cancelInOrder            100  thrpt   20  137779.616 ± 7709.751  ops/s
ScheduledFutureTaskBenchmark.cancelInOrder           1000  thrpt   20   11049.448 ±  385.832  ops/s
ScheduledFutureTaskBenchmark.cancelInOrder          10000  thrpt   20     943.294 ±   12.391  ops/s
ScheduledFutureTaskBenchmark.cancelInOrder         100000  thrpt   20      64.210 ±    1.824  ops/s
ScheduledFutureTaskBenchmark.cancelInReverseOrder     100  thrpt   20  167531.096 ± 9187.865  ops/s
ScheduledFutureTaskBenchmark.cancelInReverseOrder    1000  thrpt   20   33019.786 ± 4737.770  ops/s
ScheduledFutureTaskBenchmark.cancelInReverseOrder   10000  thrpt   20    2976.955 ±  248.555  ops/s
ScheduledFutureTaskBenchmark.cancelInReverseOrder  100000  thrpt   20     362.654 ±   45.716  ops/s
```

Before - cancelling in order and in reverse order have significantly different performance at higher queue size, orders of magnitude worse than the new implementation.
```
Benchmark                                           (num)   Mode  Cnt       Score       Error  Units
ScheduledFutureTaskBenchmark.cancelInOrder            100  thrpt   20  139968.586 ± 12951.333  ops/s
ScheduledFutureTaskBenchmark.cancelInOrder           1000  thrpt   20   12274.420 ±   337.800  ops/s
ScheduledFutureTaskBenchmark.cancelInOrder          10000  thrpt   20     958.168 ±    15.350  ops/s
ScheduledFutureTaskBenchmark.cancelInOrder         100000  thrpt   20      53.381 ±    13.981  ops/s
ScheduledFutureTaskBenchmark.cancelInReverseOrder     100  thrpt   20  123918.829 ±  3642.517  ops/s
ScheduledFutureTaskBenchmark.cancelInReverseOrder    1000  thrpt   20    5099.810 ±   206.992  ops/s
ScheduledFutureTaskBenchmark.cancelInReverseOrder   10000  thrpt   20      72.335 ±     0.443  ops/s
ScheduledFutureTaskBenchmark.cancelInReverseOrder  100000  thrpt   20       0.743 ±     0.003  ops/s
```
2017-11-10 23:09:32 -08:00
Carl Mastrangelo
ef5ebb40c9 Keep all leak records up to the target amount
Motivation:
When looking for a leak, its nice to be able to request at least a
number of leaks.

Modification:

* Made all leak records up to the target amoutn recorded, and only
  then enable backing off.
* Enable recording more than 32 elements.  Previously the shift
  amount made this impossible.

Result:
Ability to record all accesses.
2017-11-07 19:09:14 -08:00
Trask Stalnaker
58e74e9fee Support running Netty in bootstrap class loader
Motivation:

Fix NullPointerExceptions that occur when running netty-tcnative inside the bootstrap class loader.

Modifications:

- Replace loader.getResource(...) with ClassLoader.getSystemResource(...) when loader is null.
- Replace loader.loadClass(...) with Class.forName(..., false, loader) which works when loader is both null and non-null.

Result:

Support running native libs in bootstrap class loader
2017-10-29 13:13:19 +01:00
Carl Mastrangelo
e62e6df4ac Use WeakReferences for Resource Leaks
Motivation:
Phantom references are for cleaning up resources that were
forgotten, which means they keep their referent alive.   This
means garbage is kept around until the refqueue is drained, rather
than when the reference is unreachable.

Modification:
Use Weak References instead of Phantoms

Result:
More punctual leak detection.
2017-10-24 19:21:33 +02:00
Norman Maurer
740c68faed Add supresswarnings to cleanup 16b1dbdf92.
Motivation:

We should add @SupressWarnings

Modifications:

Add annotations.

Result:

Less warnings
2017-10-22 18:16:46 +02:00
Idel Pivnitskiy
4793daa589 Make Comparators Serializable
Motivation:

Objects of java.util.TreeMap or java.util.TreeSet will become
non-Serializable if instantiated with Comparators, which are not also
 Serializable. This can result in unexpected and difficult-to-diagnose
 bugs.

Modifications:

Implements Serializable for all classes, which implements Comparator.

Result:

Proper Comparators which will not force collections to
non-Serializable mode.
2017-10-22 03:40:28 +02:00
Idel Pivnitskiy
50a067a8f7 Make methods 'static' where it possible
Motivation:

Even if it's a super micro-optimization (most JVM could optimize such
 cases in runtime), in theory (and according to some perf tests) it
 may help a bit. It also makes a code more clear and allows you to
 access such methods in the test scope directly, without instance of
 the class.

Modifications:

Add 'static' modifier for all methods, where it possible. Mostly in
test scope.

Result:

Cleaner code with proper 'static' modifiers.
2017-10-21 14:59:26 +02:00
Idel Pivnitskiy
558097449c Add missed 'serialVersionUID' field for Serializable classes
Motivation:

Without a 'serialVersionUID' field, any change to a class will make
previously serialized versions unreadable.

Modifications:

Add missed 'serialVersionUID' field for all Serializable
classes.

Result:

Proper deserialization of previously serialized objects.
2017-10-21 14:41:18 +02:00
Carl Mastrangelo
16b1dbdf92 Motivation: Resource Leak Detector (RLD) tries to helpfully indicate where an object was last accessed and report the accesses in the case the object was not cleaned up. It handles lightly used objects well, but drops all but the last few accesses.
Configuring this is tough because there is split between highly shared (and accessed) objects and lightly accessed objects.

Modification:
There are a number of changes here.  In relative order of importance:

API / Functionality changes:
* Max records and max sample records are gone.  Only "target" records, the number of records tries to retain is exposed.
* Records are sampled based on the number of already stored records.  The likelihood of recording a new sample is `2^(-n)`, where `n` is the number of currently stored elements.
* Records are stored in a concurrent stack structure rather than a list.  This avoids a head and tail.  Since the stack is only read once, there is no need to maintain head and tail pointers
* The properties of this imply that the very first and very last access are always recorded.  When deciding to sample, the top element is replaced rather than pushed.
* Samples that happen between the first and last accesses now have a chance of being recorded.  Previously only the final few were kept.
* Sampling is no longer deterministic.  Previously, a deterministic access pattern meant that you could conceivably always miss some access points.
* Sampling has a linear ramp for low values and and exponentially backs off roughly equal to 2^n.  This means that for 1,000,000 accesses, about 20 will actually be kept.  I have an elegant proof for this which is too large to fit in this commit message.

Code changes:
* All locks are gone.  Because sampling rarely needs to do a write, there is almost 0 contention.  The dropped records counter is slightly contentious, but this could be removed or changed to a LongAdder.  This was not done because of memory concerns.
* Stack trace exclusion is done outside of RLD.  Classes can opt to remove some of their methods.
* Stack trace exclusion is faster, since it uses String.equals, often getting a pointer compare due to interning.  Previously it used contains()
* Leak printing is outputted fairly differently.  I tried to preserve as much of the original formatting as possible, but some things didn't make sense to keep.

Result:
More useful leak reporting.

Faster:
```
Before:
Benchmark                                           (recordTimes)   Mode  Cnt       Score      Error  Units
ResourceLeakDetectorRecordBenchmark.record                      8  thrpt   20  136293.404 ± 7669.454  ops/s
ResourceLeakDetectorRecordBenchmark.record                     16  thrpt   20   72805.720 ± 3710.864  ops/s
ResourceLeakDetectorRecordBenchmark.recordWithHint              8  thrpt   20  139131.215 ± 4882.751  ops/s
ResourceLeakDetectorRecordBenchmark.recordWithHint             16  thrpt   20   74146.313 ± 4999.246  ops/s

After:
Benchmark                                           (recordTimes)   Mode  Cnt       Score      Error  Units
ResourceLeakDetectorRecordBenchmark.record                      8  thrpt   20  155281.969 ± 5301.399  ops/s
ResourceLeakDetectorRecordBenchmark.record                     16  thrpt   20   77866.239 ± 3821.054  ops/s
ResourceLeakDetectorRecordBenchmark.recordWithHint              8  thrpt   20  153360.036 ± 8611.353  ops/s
ResourceLeakDetectorRecordBenchmark.recordWithHint             16  thrpt   20   78670.804 ± 2399.149  ops/s
```
2017-10-19 12:21:21 -07:00
Carl Mastrangelo
d3ca087f6b Propagate all exceptions when loading native code
Motivation:
There are 2 motivations, the first depends on the second:

Loading Netty Epoll statically stopped working in 4.1.16, due to
`Native` always loading the arch specific shared object.  In a
static binary, there is no arch specific SO.

Second, there are a ton of exceptions that can happen when loading
a native library.  When loading native code, Netty tries a bunch of
different paths but a failure in any given may not be fatal.

Additionally: turning on debug logging is not always feasible so
exceptions get silently swallowed.

Modifications:

* Change Epoll and Kqueue to try the static load second
* Modify NativeLibraryLoader to record all the locations where
  exceptions occur.
* Attempt to use `addSuppressed` from Java 7 if available.

Alternatives Considered:

An alternative would be to record log messages at each failure.  If
all load attempts fail, the log messages are printed as warning,
else as debug. The problem with this is there is no `LogRecord` to
create like in java.util.logging.  Buffering the args to
logger.log() at the end of the method loses the call site, and
changes the order of events to be confusing.

Another alternative is to teach NativeLibraryLoader about loading
the SO first, and then the static version.  This would consolidate
the code fore Epoll, Kqueue, and TCNative.   I think this is the
long term better option, but this PR is changing a lot already.
Someone else can take a crack at it later

Results:
Epoll Still Loads and easier debugging.
2017-10-04 08:45:27 +02:00
Carl Mastrangelo
83a19d5650 Optimistically update ref counts
Motivation:
Highly retained and released objects have contention on their ref
count.  Currently, the ref count is updated using compareAndSet
with care to make sure the count doesn't overflow, double free, or
revive the object.

Profiling has shown that a non trivial (~1%) of CPU time on gRPC
latency benchmarks is from the ref count updating.

Modification:
Rather than pessimistically assuming the ref count will be invalid,
optimistically update it assuming it will be.  If the update was
wrong, then use the slow path to revert the change and throw an
execption.  Most of the time, the ref counts are correct.

This changes from using compareAndSet to getAndAdd, which emits a
different CPU instruction on x86 (CMPXCHG to XADD).  Because the
CPU knows it will modifiy the memory, it can avoid contention.

On a highly contended machine, this can be about 2x faster.

There is a downside to the new approach.  The ref counters can
temporarily enter invalid states if over retained or over released.
The code does handle these overflow and underflow scenarios, but it
is possible that another concurrent access may push the failure to
a different location.  For example:

Time 1 Thread 1: obj.retain(INT_MAX - 1)
Time 2 Thread 1: obj.retain(2)
Time 2 Thread 2: obj.retain(1)

Previously Thread 2 would always succeed and Thread 1 would always
fail on the second access.  Now, thread 2 could fail while thread 1
is rolling back its change.

====

There are a few reasons why I think this is okay:

1. Buggy code is going to have bugs.  An exception _is_ going to be
   thrown.  This just causes the other threads to notice the state
   is messed up and stop early.
2. If high retention counts are a use case, then ref count should
   be a long rather than an int.
3. The critical section is greatly reduced compared to the previous
   version, so the likelihood of this happening is lower
4. On error, the code always rollsback the change atomically, so
   there is no possibility of corruption.

Result:
Faster refcounting

```
BEFORE:

Benchmark                                                                                             (delay)    Mode      Cnt         Score    Error  Units
AbstractReferenceCountedByteBufBenchmark.retainRelease_contended                                            1  sample  2901361       804.579 ±  1.835  ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_contended                                           10  sample  3038729       785.376 ± 16.471  ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_contended                                          100  sample  2899401       817.392 ±  6.668  ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_contended                                         1000  sample  3650566      2077.700 ±  0.600  ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_contended                                        10000  sample  3005467     19949.334 ±  4.243  ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_uncontended                                          1  sample   456091        48.610 ±  1.162  ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_uncontended                                         10  sample   732051        62.599 ±  0.815  ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_uncontended                                        100  sample   778925       228.629 ±  1.205  ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_uncontended                                       1000  sample   633682      2002.987 ±  2.856  ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_uncontended                                      10000  sample   506442     19735.345 ± 12.312  ns/op

AFTER:
Benchmark                                                                                             (delay)    Mode      Cnt         Score    Error  Units
AbstractReferenceCountedByteBufBenchmark.retainRelease_contended                                            1  sample  3761980       383.436 ±  1.315  ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_contended                                           10  sample  3667304       474.429 ±  1.101  ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_contended                                          100  sample  3039374       479.267 ±  0.435  ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_contended                                         1000  sample  3709210      2044.603 ±  0.989  ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_contended                                        10000  sample  3011591     19904.227 ± 18.025  ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_uncontended                                          1  sample   494975        52.269 ±  8.345  ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_uncontended                                         10  sample   771094        62.290 ±  0.795  ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_uncontended                                        100  sample   763230       235.044 ±  1.552  ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_uncontended                                       1000  sample   634037      2006.578 ±  3.574  ns/op
AbstractReferenceCountedByteBufBenchmark.retainRelease_uncontended                                      10000  sample   506284     19742.605 ± 13.729  ns/op

```
2017-10-04 08:42:33 +02:00
Norman Maurer
c3298a3836 Fix regression in reporting leaks introduced by 3c8c7fc7e9.
Motivation:

3c8c7fc7e9 introduced some changes to the ResourceLeakDetector that introduced a regression and so would always log that paranoid leak detection should be enabled even it was already.

Modifications:

Correctly not clear the recorded stacktraces when we process the reference queue so we can log these.

Result:

ResourceLeakDetector works again as expected.
2017-09-21 12:17:43 -07:00
Norman Maurer
4d5f0e7ad5 NativeLibraryLoader should check the result of ClassLoader#getResource method
Motivation:

NativeLibraryLoader uses ClassLoader#getResource method that can return nulls when the resource cannot be found. The returned url variable should be checked for nullity and fail in a more usable manner than a NullPointerException

Modifications:

Fail with a FileNotFoundException

Result:

Fixes [#7222].
2017-09-19 17:45:06 -07:00
Norman Maurer
3c8c7fc7e9 Reduce performance overhead of ResourceLeakDetector
Motiviation:

The ResourceLeakDetector helps to detect and troubleshoot resource leaks and is often used even in production enviroments with a low level. Because of this its import that we try to keep the overhead as low as overhead. Most of the times no leak is detected (as all is correctly handled) so we should keep the overhead for this case as low as possible.

Modifications:

- Only call getStackTrace() if a leak is reported as it is a very expensive native call. Also handle the filtering and creating of the String in a lazy fashion
- Remove the need to mantain a Queue to store the last access records
- Add benchmark

Result:

Huge decrease of performance overhead.

Before the patch:

Benchmark                                           (recordTimes)   Mode  Cnt     Score     Error  Units
ResourceLeakDetectorRecordBenchmark.record                      8  thrpt   20  4358.367 ± 116.419  ops/s
ResourceLeakDetectorRecordBenchmark.record                     16  thrpt   20  2306.027 ±  55.044  ops/s
ResourceLeakDetectorRecordBenchmark.recordWithHint              8  thrpt   20  4220.979 ± 114.046  ops/s
ResourceLeakDetectorRecordBenchmark.recordWithHint             16  thrpt   20  2250.734 ±  55.352  ops/s

With this patch:

Benchmark                                           (recordTimes)   Mode  Cnt      Score      Error  Units
ResourceLeakDetectorRecordBenchmark.record                      8  thrpt   20  71398.957 ± 2695.925  ops/s
ResourceLeakDetectorRecordBenchmark.record                     16  thrpt   20  38643.963 ± 1446.694  ops/s
ResourceLeakDetectorRecordBenchmark.recordWithHint              8  thrpt   20  71677.882 ± 2923.622  ops/s
ResourceLeakDetectorRecordBenchmark.recordWithHint             16  thrpt   20  38660.176 ± 1467.732  ops/s
2017-09-18 16:36:19 -07:00
Carl Mastrangelo
b32cd26a96 Remove allocation from ResourceLeakDetector
Motivation:
RLD allocates an ArrayDeque in anticipation of recording access
points.  If the leak detection level is less than ADVANCED though,
the dequeue is never used.  Since SIMPLE is the default level,
there is a minor perf win to not preemptively allocate it.

This showed up in garbage profiling when creation a high number of
buffers.

Modifications:
Only allocate the dequeue if it will be used.

Result:
Less garbage created.
2017-09-15 20:22:31 -07:00
Scott Mitchell
b1332bf12e NativeLibraryLoader logging clarify
Motivation:
NativeLibraryLoader may only log a debug statement if the library is successfully loaded from java.library.path, but will log failure statements the if load for java.library.path fails which can mislead users to believe the load actually failed when it may have succeeded.

Modifications:
- Always load a debug statement when a library was successfully loaded

Result:
NativeLibraryLoader log statements more clear.
2017-09-15 09:17:44 -07:00
杨浩
14189140a0 log in PatternLayout (%F:%L)%c.%M
Motivation:

When Log4j2Logger is used with PatternLayout (%F:%L)%c.%M, the log message incorrect shows:

(Log4J2Logger.java:73)io.netty.util.internal.PlatformDependent0.debug ....

Modification:

Extend AbstractLogger

Result:

Fixes [#7186].
2017-09-14 14:51:20 -07:00
Norman Maurer
0fffc844d6 Only load native transport if running architecture match the compiled library architecture.
Motivation:

We should only try to load the native artifacts if the architecture we are currently running on is the same as the one the native libraries were compiled for.

Modifications:

Include architecture in native lib name and append the current arch when trying to load these. This will fail then if its not the same as the arch of the compiled arch.

Result:

Fixes [#7150].
2017-09-04 13:34:55 +02:00
Norman Maurer
bca35b0449 Allow to construct UnpooledByteBufAllocator that explictly always use sun.misc.Cleaner
Motivation:

When the user want to have the direct memory explicitly managed by the GC (just as java.nio does) it is useful to be able to construct an UnpooledByteBufAllocator that allows this without the chances to see any memory leak.

Modifications:

Allow to explicitly disable the usage of reflection to construct direct ByteBufs and so be sure these will be collected by GC.

Result:

More flexible way to use the UnpooledByteBufAllocator.
2017-08-31 12:57:09 +02:00
Carl Mastrangelo
7528e5a11e Use threadsafe setter on Atomic Updaters
Motivation:
The documentation for field updates says:

> Note that the guarantees of the {@code compareAndSet}
> method in this class are weaker than in other atomic classes.
> Because this class cannot ensure that all uses of the field
> are appropriate for purposes of atomic access, it can
> guarantee atomicity only with respect to other invocations of
> {@code compareAndSet} and {@code set} on the same updater.

This implies that volatiles shouldn't use normal assignment; the
updater should set them.

Modifications:
Use setter for field updaters that make use of compareAndSet.

Result:
Concurrency compliant code
2017-08-31 10:14:40 +02:00
Carl Mastrangelo
c891c9c13f Include more detail why Unsafe is not available
Motivation:
PD and PD0 Both try to find and use Unsafe.  If unavailable, they
try to log why and continue on.  However, it is not always east to
enable this logging.  Chaining exceptions together is much easier
to reach, and the original exception is relevant when Unsafe is
needed.

Modifications:
* Make PD log why PD0 could not be loaded with a trace level log
* Make PD0 remember why Unsafe wasn't available
* Expose unavailability cause through PD for higher level use.
* Make Epoll and KQueue include the reason when failing

Result:
Easier debugging in hard to reconfigure environments
2017-08-29 22:02:06 +02:00
Derek Perez
b18a201d02 various errorprone fixes.
Motivation:

Continuing to make netty happy when compiling through errorprone.

Modification:

Mostly comments, some minor switch statement changes.

Result:

No more compiler errors!
2017-08-23 12:49:58 +02:00
Aron Wieck
da86b85a28 Make NativeLibraryLoader check java.library.path first
Motivation:

On restricted systems (e.g. grsecurity), it might not be possible to write a .so on disk and load it afterwards. On those system Netty should check java.library.path for libraries to load.

Modifications:

Changed NativeLibraryLoader.java to first try to load libs from java.library.path before exporting the .so to disk.

Result:

Libraries load fine on restricted systems.
2017-08-16 14:27:50 -07:00
Derek Perez
3469004432 Adding explicit comment about case statement
Motivation:

When compiling this code and running it through errorprone[1], this message appears:
```
StringUtil.java:493: error: [FallThrough] Switch case may fall through; add a `// fall through` comment if it was deliberate
                    case LINE_FEED:
                    ^
    (see http://errorprone.info/bugpattern/FallThrough)
```
By adding that comment, it silences the error and also makes clear the intention of that statement.

[1]http://errorprone.info/index

Modification:

Add simple comment.

Result:

Errorprone is happier with the code.
2017-08-16 07:38:11 +02:00
Nikolay Fedorovskikh
4875a2aad4 Immediate caching the strings wrapped to AsciiString
Motivation:
The `AsciiString#toString` method calculate string value and cache it into field. If an `AsciiString` created from the `String` value, we can avoid rebuilding strings if we cache them immediately when creating `AsciiString`. It would be useful for constants strings, which already stored in the JVMs string table, or in cases where an unavoidable `#toString `method call is assumed.

Modifications:
- Add new static method `AsciiString#cache(String)` which save string value into cache field.
- Apply a "benign" data race in the `#hashCode` and `#toString` methods.

Result:
Less memory usage in some `AsciiString` use cases.
2017-08-15 06:22:14 +02:00
Norman Maurer
19dcb15062 Use underscore in native library names for consistency.
Motivation:

At the moment we try to load the library using multiple names which includes names using - but also _ . We should just use _ all the time.

Modifications:

Replace - with _

Result:

Fixes [#7069]
2017-08-15 06:02:00 +02:00
Nikolay Fedorovskikh
e249b453a0 Make configurable the initial and max size of InternalThreadLocalMap#stringBuilder
Motivation:
In some cases of using an `InternalThreadLocalMap#stringBuilder`, the `StringBuilder`s size can often exceed the exist limit (1024 bytes). This can lead to permanent memory reallocation.

Modifications:
Add custom properties for the initial capacity and maximum size (after which the `StringBuilder`s capacity will be reduced to the initial capacity).

Result:
An `InternalThreadLocalMap#stringBuilder`s initial and max size is configurable. Fixes [#7092].
2017-08-14 13:27:58 -07:00
Norman Maurer
b30c4f899f Remove debug cruft from e218759c0c 2017-08-08 17:01:31 +02:00
Norman Maurer
e218759c0c Fix regression in detecting macOS/osx platform introduced by bdb0a39c8a 2017-08-08 10:39:22 +02:00
Norman Maurer
bdb0a39c8a Remove code-duplication from NativeLibraryLoader
Motivation:

NativeLibraryLoader has some code-duplication that can be removed.

Modifications:

Remove duplicated code and just use provided methods of PlatformDependent.

Result:

Less code duplication, fixes [#3756].
2017-08-08 09:06:41 +02:00
Scott Mitchell
a75ac747f0 Remove io.netty.packagePrefix system property
Motivation:
Now that the NativeLibraryLoader implicitly detects the shaded package prefix we no longer need the io.netty.packagePrefix system property.

Modifications:
- Remove io.netty.packagePrefix processing from NativeLibraryLoader

Result:
Code is cleaner.
2017-08-04 10:53:29 +02:00
Eric Anderson
e5a31a4282 Automatically detect shaded packagePrefix
Motivation:

Shading requires renaming binary components (.so, .dll; for tcnative,
epoll, etc). But the rename then requires setting the
io.netty.packagePrefix system property on the command line or runtime,
which is either a burden or not feasible.

If you don't rename the binary components everything appears to
work, until a dependency on a second version of the binary component is
added. At that point, only one version of the binary will be loaded...
which is what shading is supposed to prevent. So for valid shading, the
binaries must be renamed.

Modifications:

Automatically detect the package prefix by comparing the actual class
name to the non-shaded expected class name. The expected class name must
be obfuscated to prevent shading utilities from changing it.

Result:

When shading and using binary components, runtime configuration is no
longer necessary.

Pre-existing shading users that were not renaming the binary components
will break, because the packagePrefix previously defaulted to "". Since
these pre-existing users had broken configurations that only _appeared_
to work, this breakage is considered a Good Thing. Users may workaround
this breakage temporarily by setting -Dio.netty.packagePrefix= to
restore packagePrefix to "".

Fixes #6963
2017-07-19 18:38:02 -07:00
louxiu
96e06aa74d Calculate correct lastRecords size
Motivation:
ResourceLeakDetector records at most MAX_RECORDS+1 records

Modifications:
Make room before add to lastRecords

Result:
ResourceLeakDetector will record at most MAX_RECORDS records
2017-07-18 19:14:31 -07:00
kashike
c43e09da5a Use the correct murmur3 C1 value
introduced in a7f7d9c8e0
2017-07-18 19:03:33 -07:00
Scott Mitchell
7cfe416182 Use unbounded queues from JCTools 2.0.2
Motivation:
JCTools 2.0.2 provides an unbounded MPSC linked queue. Before we shaded JCTools we had our own unbounded MPSC linked queue and used it in various places but gave this up because there was no public equivalent available in JCTools at the time.

Modifications:
- Use JCTool's MPSC linked queue when no upper bound is specified

Result:
Fixes https://github.com/netty/netty/issues/5951
2017-07-10 12:32:15 -07:00
Nikolay Fedorovskikh
01eb428b39 Move methods for decode hex dump into StringUtil
Motivation:

PR #6811 introduced a public utility methods to decode hex dump and its parts, but they are not visible from netty-common.

Modifications:

1. Move the `decodeHexByte`, `decodeHexDump` and `decodeHexNibble` methods into `StringUtils`.
2. Apply these methods where applicable.
3. Remove similar methods from other locations (e.g. `HpackHex` test class).

Result:

Less code duplication.
2017-06-23 18:52:42 +02:00
Nikolay Fedorovskikh
aa38b6a769 Prevent unnecessary allocations in the StringUtil#escapeCsv
Motivation:

A `StringUtil#escapeCsv` creates new `StringBuilder` on each value even if the same string is returned in the end.

Modifications:

Create new `StringBuilder` only if it really needed. Otherwise, return the original string (or just trimmed substring).

Result:

Less GC load. Up to 4x faster work for not changed strings.
2017-06-13 14:57:38 -07:00
Scott Mitchell
1cc4607f07 AppendableCharSequence not to depend upon IndexOutOfBoundsException for resize
Motivation:
AppendableCharSequence depends upon IndexOutOfBoundsException to trigger a resize operation under the assumption that the resize operation will be rare if the initial size guess is good. However if the initial size guess is not good then the performance will be more unpredictable and likely suffer.

Modifications:
- Check the position in AppendableCharSequence#append to determine if a resize is necessary

Result:
More predictable performance in AppendableCharSequence#append.
2017-06-12 12:42:20 -07:00
Norman Maurer
f208b147a6 Use FQCN to prevent classloader issues on java6
Motivation:

We need to use FQCN to prevent classloader issues for classes that are > Java6. This is a cleanup of ed5fcbb773.

Modifications:

Just remove the imports and use FQCN.

Result:

No classloader issues with java6
2017-06-08 12:04:17 +02:00
Michael K. Werle
ed5fcbb773 Add explicit message when noexec prevents library loading.
Motivation:

Docker's `--tmpfs` flag mounts the temp volume with `noexec` by default,
resulting in an UnsatisfiedLinkError.  While this is good security
practice, it is a surprising failure from a seemingly innocuous flag.

Modifications:

Add a best-effort attempt in `NativeLibraryLoader` to detect when temp
files beng loaded cannot be executed even when execution permissions
are set, often because the `noexec` flag is set on the volume.

Requires numerous additional exclusions to the Animal Sniffer config
for Java7 POSIX permissions manipulation.

Result:

Fixes [#6678].
2017-06-07 09:20:05 -07:00
Norman Maurer
201d9b6536 Share code that is needed to support shaded native libraries.
Motivation:

For our native libraries in netty we support shading, to have this work on runtime the user needs to set a system property. This code should shared.

Modifications:

Move logic to NativeLbiraryLoader and so share for all native libs.

Result:

Less code duplication and also will work for netty-tcnative out of the box once it support shading
2017-05-19 19:33:21 +02:00
Nikolay Fedorovskikh
d768c5e628 MessageFormatter improvements
Motivation:

`FormattingTuple.getArgArray()` is never used.
In the `MessageFormatter` it is possible to make
some improvements, e.g. replace `StringBuffer`
with `StringBuilder`, avoid redundant allocations, etc.

Modifications:

- Remove `argArray` field from the `FormattingTuple`.
- In `MessageFormatter`:
  - replace `StringBuffer` with `StringBuilder`,
  - replace `HashMap` with `HashSet` and make it lazy initialized.
  - avoid redundant allocations (`substring()`, etc.)
  - use appropriate StringBuilder's methods for the some `Number` values.
- Porting unit tests from `slf4j`.

Result:

Less GC load on logging with internal `MessageFormatter`.
2017-05-19 09:47:11 -07:00
Nikolay Fedorovskikh
e4531918a3 Optimizations in NetUtil
Motivation:

IPv4/6 validation methods use allocations, which can be avoided.
IPv4 parse method use StringTokenizer.

Modifications:

Rewriting IPv4/6 validation methods to avoid allocations.
Rewriting IPv4 parse method without use StringTokenizer.

Result:

IPv4/6 validation and IPv4 parsing faster up to 2-10x.
2017-05-18 16:42:22 -07:00
Norman Maurer
0ee49e6d66 Eliminate noisy logging when using sun.misc.Unsafe and running on pre Java9
Motivation:

We should only try to load jdk.internal.misc.Unsafe if we run on Java9+ to eliminate noise in the log.

Modifications:

- Move javaVersion() and related methods to PlatformDependent0 to be able to use these in the static initializer without creating a cycle.
- Only try to load jdk.internal.misc.Unsafe when running in Java9+

Result:

Less noise in the log when running pre java9.
2017-05-16 08:29:06 +02:00
Jason Tedor
d88cd23bfc Trim thread local string builder if large
Motivation:

A previous change allocated a new thread local string builder if it
was getting too large. This is a good change, these string builders
can accidentally get too large and then never shrunk and that is sort
of a memory leak. However, the change allocates an entirely new string
builder which is more allocations than necessary. Instead, we can trim
the string builder if its too large, this only allocates an extra
backing array instead of a whole new object.

Modifications:

If the string builder is above a threshold, we trim the string builder
and then ensure its capacity is reasonable to we do not allocate too
much as we start using the string builder.

Result:

The thread local string builder do not serve as a memory yet we do not
allocate too many new objects.
2017-05-12 08:20:04 +02:00
Nikolay Fedorovskikh
5643cc6a10 IPv6 validation fixes
Motivation:

`NetUtil`'s methods `isValidIpV6Address` and `getIPv6ByName` incorrectly validate some IPv6 addresses.

Modifications:

- `getIPv6ByName`: add checks for single colon at the start or end.
- `isValidIpV6Address`: fix checks for the count of colons and use `endOffset` instead of `ipAddress.length()` for the cases with the brackets or '%'.

Result:

More correct implementation of `NetUtil#isValidIpV6Address` and `NetUtil#getIPv6ByName`.
2017-05-11 08:10:25 -07:00
jiachun.fjc
cd80b6c2d8 Use simple volatile read for SingleThreadEventExecutor#state instead of UNSAFE(AtomicIntegerFieldUpdater#get), CAS operation still to use AtomicIntegerFieldUpdater
Motivation:

AtomicIntegerFieldUpdater#get is unnecessary, I think use simple volatile read is cleaner

Modifications:

Replace code STATE_UPDATER.get(this) to state in SingleThreadEventExecutor

Result:

Cleaner code
2017-05-08 19:36:19 +02:00
jiachun.fjc
963cd22a05 InternalThreadLocalMap#stringBuilder: ensure memory overhead
Motivation:

InternalThreadLocalMap#stringBuilder: ensure memory overhead

Modification:

If the capacity of StringBuilder is greater than 65536 then release it on the next time you get StringBuilder and re-create a StringBuilder.

Result:

Possible less memory usage.
2017-05-05 09:28:51 -07:00