210 Commits

Author SHA1 Message Date
Stephane Landelle
8d4db050f3 Have hosts file support for DnsNameResolver, close #4074
Motivation:

On contrary to `DefaultNameResolver`, `DnsNameResolver` doesn't currently honor hosts file.

Modifications:

* Introduce `HostsFileParser` that parses `/etc/hosts` or `C:\Windows\system32\drivers\etc\hosts` depending on the platform
* Introduce `HostsFileEntriesResolver` that uses the former to resolve host names
* Make `DnsNameResolver` check his `HostsFileEntriesResolver` prior to trying to resolve names against the DNS server
* Introduce `DnsNameResolverBuilder` so we now have a builder for `DnsNameResolver`s
* Additionally introduce a `CompositeNameResolver` that takes several `NameResolver`s and tries to resolve names by delegating sequentially
* Change `DnsNameResolver.asAddressResolver` to return a composite and honor hosts file

Result:

Hosts file support when using `DnsNameResolver`.
Consistent behavior with JDK implementation.
2015-12-17 15:15:42 +01:00
Scott Mitchell
b4b791353d AsciiString optimized hashCode
Motivation:
The AsciiString.hashCode() method can be optimized. This method is frequently used while to build the DefaultHeaders data structure.

Modification:
- Add a PlatformDependent hashCode algorithm which utilizes UNSAFE if available

Result:
AsciiString hashCode is faster.
2015-11-10 10:28:31 -08:00
Scott Mitchell
19658e9cd8 HTTP/2 Headers Type Updates
Motivation:
The HTTP/2 RFC (https://tools.ietf.org/html/rfc7540#section-8.1.2) indicates that header names consist of ASCII characters. We currently use ByteString to represent HTTP/2 header names. The HTTP/2 RFC (https://tools.ietf.org/html/rfc7540#section-10.3) also eludes to header values inheriting the same validity characteristics as HTTP/1.x. Using AsciiString for the value type of HTTP/2 headers would allow for re-use of predefined HTTP/1.x values, and make comparisons more intuitive. The Headers<T> interface could also be expanded to allow for easier use of header types which do not have the same Key and Value type.

Motivation:
- Change Headers<T> to Headers<K, V>
- Change Http2Headers<ByteString> to Http2Headers<CharSequence, CharSequence>
- Remove ByteString. Having AsciiString extend ByteString complicates equality comparisons when the hash code algorithm is no longer shared.

Result:
Http2Header types are more representative of the HTTP/2 RFC, and relationship between HTTP/2 header name/values more directly relates to HTTP/1.x header names/values.
2015-10-30 15:29:44 -07:00
Norman Maurer
a47685b243 Use bitwise operation when sampling for resource leak detection.
Motivation:

Modulo operations are slow, we can use bitwise operation to detect if resource leak detection must be done while sampling.

Modifications:

- Ensure the interval is a power of two
- Use bitwise operation for sampling
- Add benchmark.

Result:

Faster sampling.
2015-10-29 19:18:44 +01:00
Norman Maurer
7d4c077492 Add *UnsafeHeapByteBuf for improve performance on systems with sun.misc.Unsafe
Motivation:

sun.misc.Unsafe allows us to handle heap ByteBuf in a more efficient matter. We should use special ByteBuf implementation when sun.misc.Unsafe can be used to increase performance.

Modifications:

- Add PooledUnsafeHeapByteBuf and UnpooledUnsafeHeapByteBuf that are used when sun.misc.Unsafe is ready to use.
- Add UnsafeHeapSwappedByteBuf

Result:

Better performance when using heap buffers and sun.misc.Unsafe is ready to use.
2015-10-21 09:04:13 +02:00
Norman Maurer
f30a51b905 Correctly handle byte shifting if system does not support unaligned access.
Motivation:

We had a bug in our implemention which double "reversed" bytes on systems which not support unaligned access.

Modifications:

- Correctly only reverse bytes if needed.
- Share code between unsafe implementations.

Result:

No more data-corruption on sytems without unaligned access.
2015-10-20 17:32:13 +02:00
Norman Maurer
11e8163aa9 [#4284] Forward decoded messages more frequently
Motivation:

At the moment we only forward decoded messages that were added the out List once the full decode loop was completed. This has the affect that resources may not be released as fast as possible and as an application may incounter higher latency if the user triggeres a writeAndFlush(...) as a result of the decoded messages.

Modifications:

- forward decoded messages after each decode call

Result:

Forwarding decoded messages through the pipeline in a more eager fashion.
2015-10-07 14:15:53 +02:00
Norman Maurer
eb1c97b3b9 [#4110] Correct javadocs of MpscLinkedQueue
Motivation:

The javadocs are incorrect and so give false impressions of use-pattern.

Modifications:

- Fix javadocs of which operations are allowed from multiple threads concurrently.
- Let isEmpty() work concurrently.

Result:

Correctly document usage-patterns.
2015-08-27 09:09:28 +02:00
Scott Mitchell
cbc38e938a UNSAFE.throwException null arg crashes JVM
Motivation:
It has been observed that passing a null argument to Unsafe.throwException can crash the JVM.

Modifications:
- PlatformUnsafe0.throwException should honor http://docs.oracle.com/javase/specs/jls/se8/html/jls-14.html#jls-14.18 and throw a NPE

Result:
No risk of JVM crashing for null argument.
Fixes https://github.com/netty/netty/issues/4131
2015-08-26 23:50:51 -07:00
Scott Mitchell
9bc322a6a8 StringUtil not closing Formatter
Motivation:
The StringUtil class creates a Formatter object, but does not close it. There are also a 2 utility methods which would be generally useful.

Modifications:
- Close the Formatter
- Add length and isNullOrEmpty

Result:
No more resource leaks. Additional utility methods.
2015-08-20 09:44:31 -07:00
Scott Mitchell
ba6ce5449e Headers Performance Boost and Interface Simplification
Motivation:
A degradation in performance has been observed from the 4.0 branch as documented in https://github.com/netty/netty/issues/3962.

Modifications:
- Simplify Headers class hierarchy.
- Restore the DefaultHeaders to be based upon DefaultHttpHeaders from 4.0.
- Make various other modifications that are causing hot spots.

Result:
Performance is now on par with 4.0.
2015-08-17 08:50:11 -07:00
Ning Sun
9236a8d156 (fix) typo 2015-07-30 12:49:25 +02:00
Scott Mitchell
a7713069a1 HttpObjectDecoder performance improvements
Motivation:
The HttpObjectDecoder is on the hot code path for the http codec. There are a few hot methods which can be modified to improve performance.

Modifications:
- Modify AppendableCharSequence to provide unsafe methods which don't need to re-check bounds for every call.
- Update HttpObjectDecoder methods to take advantage of new AppendableCharSequence methods.

Result:
Peformance boost for decoding http objects.
2015-07-29 23:26:26 -07:00
Daniel Darabos
623d9d7202 Fix typo in warning message. 2015-07-29 18:38:33 +02:00
Norman Maurer
81fee66c78 Let PoolThreadCache work even if allocation and deallocation Thread are different
Motivation:

PoolThreadCache did only cache allocations if the allocation and deallocation Thread were the same. This is not optimal as often people write from differen thread then the actual EventLoop thread.

Modification:

- Add MpscArrayQueue which was forked from jctools and lightly modified.
- Use MpscArrayQueue for caches and always add buffer back to the cache that belongs to the allocation thread.

Result:

ThreadPoolCache is now also usable and so gives performance improvements when allocation and deallocation thread are different.

Performance when using same thread for allocation and deallocation is noticable worse then before.
2015-05-27 14:38:11 +02:00
Norman Maurer
08d234cdf0 [#3805] Fix incorrect javadoc in PlatformDependent 2015-05-25 21:42:31 +02:00
Norman Maurer
271af7c624 Expose metrics for PooledByteBufAllocator
Motivation:

The PooledByteBufAllocator is more or less a black-box atm. We need to expose some metrics to allow the user to get a better idea how to tune it.

Modifications:

- Expose different metrics via PooledByteBufAllocator
- Add *Metrics interfaces

Result:

It is now easy to gather metrics and detail about the PooledByteBufAllocator and so get a better understanding about resource-usage etc.
2015-05-20 21:06:17 +02:00
yz_liu
488d905598 fix a typo in RecyclableArrayList 2015-05-06 09:07:48 +02:00
Norman Maurer
cf66edb3a1 [#3675] Fix livelock issue in MpscLinkedQueue
Motivation:

All read operations should be safe to execute from multiple threads which was not the case and so could produce a livelock.

Modifications:

Modify methods so these are safe to be called from multiple threads.

Result:

No more livelock.
2015-05-06 06:21:14 +02:00
Trustin Lee
63a02fc04e Revamp DNS codec
Motivation:

There are various known issues in netty-codec-dns:

- Message types are not interfaces, which can make it difficult for a
  user to implement his/her own message implementation.
- Some class names and field names do not match with the terms in the
  RFC.
- The support for decoding a DNS record was limited. A user had to
  encode and decode by him/herself.
- The separation of DnsHeader from DnsMessage was unnecessary, although
  it is fine conceptually.
- Buffer leak caused by DnsMessage was difficult to analyze, because the
  leak detector tracks down the underlying ByteBuf rather than the
  DnsMessage itself.
- DnsMessage assumes DNS-over-UDP.
- To send an EDNS message, a user have to create a new DNS record class
  instance unnecessarily.

Modifications:

- Make all message types interfaces and add default implementations
- Rename some classes, properties, and constants to match the RFCs
  - DnsResource -> DnsRecord
  - DnsType -> DnsRecordType
  - and many more
- Remove DnsClass and use an integer to support EDNS better
- Add DnsRecordEncoder/DnsRecordDecoder and their default
  implementations
  - DnsRecord does not require RDATA to be ByteBuf anymore.
  - Add DnsRawRecord as the catch-all record type
- Merge DnsHeader into DnsMessage
- Make ResourceLeakDetector track AbstractDnsMessage
- Remove DnsMessage.sender/recipient properties
  - Wrap DnsMessage with AddressedEnvelope
  - Add DatagramDnsQuest and DatagramDnsResponse for ease of use
  - Rename DnsQueryEncoder to DatagramDnsQueryEncoder
  - Rename DnsResponseDecoder to DatagramDnsResponseDecoder
- Miscellaneous changes
  - Add StringUtil.TAB

Result:

- Cleaner APi
- Can support DNS-over-TCP more easily in the future
- Reduced memory footprint in the default DnsQuery/Response
  implementations
- Better leak tracking for DnsMessages
- Possibility to introduce new DnsRecord types in the future and provide
  full record encoder/decoder implementation.
- No unnecessary instantiation for an EDNS pseudo resource record
2015-05-01 11:33:16 +09:00
JongYoon Lim
a5c8e145ee Remove the condition which is always true when reached
Motivation:
Condition 'isNextCharDoubleQuote' is always 'true' when reached.

Motification:
- Removed Condition 'isNextCharDoubleQuote'.
- Additionally fixed typo in javadoc

Result:
Cleaner code.
2015-04-30 16:58:39 -07:00
Norman Maurer
56c98839c3 [#3218] Add ChannelPool / ChannelPoolMap abstraction and implementations
Motivation:

Many projects need some kind a Channel/Connection pool implementation. While the protocols are different many things can be shared, so we should provide a generic API and implementation.

Modifications:

Add ChannelPool / ChannelPoolMap API and implementations.

Result:

Reusable / Generic pool implementation that users can use.
2015-04-30 12:13:19 +02:00
Scott Mitchell
f812180c2d ByteString arrayOffset method
Motivation:
The ByteString class currently assumes the underlying array will be a complete representation of data. This is limiting as it does not allow a subsection of another array to be used. The forces copy operations to take place to compensate for the lack of API support.

Modifications:
- add arrayOffset method to ByteString
- modify all ByteString and AsciiString methods that loop over or index into the underlying array to use this offset
- update all code that uses ByteString.array to ensure it accounts for the offset
- add unit tests to test the implementation respects the offset

Result:
ByteString and AsciiString can represent a sub region of a byte[].
2015-04-24 18:54:01 -07:00
Norman Maurer
a7d1dc362a [#3652] Improve performance of StringUtil.simpleClassName()
Motivation:

static Package getPackage(Class<?> c) uses synchronized block internally.
Thanks to @jingene for the hint and initial report of the issue.

Modifications:

-Use simple lastIndexOf(...) and substring for a faster implementation

Result:

No more lock condition.
2015-04-22 09:14:40 +02:00
Jakob Buchgraber
c2de195f87 Improve performance of AsciiString.equals(Object).
Motivation:

The current implementation does byte by byte comparison, which we have seen
can be a performance bottleneck when the AsciiString is used as the key in
a Map.

Modifications:

Use sun.misc.Unsafe (on supporting platforms) to compare up to eight bytes at a time
and get closer to the performance of String.equals(Object).

Result:

Significant improvement (2x - 6x) in performance over the current implementation.

Benchmark                                             (size)   Mode   Samples        Score  Score error    Units
i.n.m.i.PlatformDependentBenchmark.arraysBytesEqual       10  thrpt        10 118843477.518 2347259.347    ops/s
i.n.m.i.PlatformDependentBenchmark.arraysBytesEqual       50  thrpt        10 43910319.773   198376.996    ops/s
i.n.m.i.PlatformDependentBenchmark.arraysBytesEqual      100  thrpt        10 26339969.001   159599.252    ops/s
i.n.m.i.PlatformDependentBenchmark.arraysBytesEqual     1000  thrpt        10  2873119.030    20779.056    ops/s
i.n.m.i.PlatformDependentBenchmark.arraysBytesEqual    10000  thrpt        10   306370.450     1933.303    ops/s
i.n.m.i.PlatformDependentBenchmark.arraysBytesEqual   100000  thrpt        10    25750.415      108.391    ops/s
i.n.m.i.PlatformDependentBenchmark.unsafeBytesEqual       10  thrpt        10 248077563.510  635320.093    ops/s
i.n.m.i.PlatformDependentBenchmark.unsafeBytesEqual       50  thrpt        10 128198943.138  614827.548    ops/s
i.n.m.i.PlatformDependentBenchmark.unsafeBytesEqual      100  thrpt        10 86195621.349  1063959.307    ops/s
i.n.m.i.PlatformDependentBenchmark.unsafeBytesEqual     1000  thrpt        10 16920264.598    61615.365    ops/s
i.n.m.i.PlatformDependentBenchmark.unsafeBytesEqual    10000  thrpt        10  1687454.747     6367.602    ops/s
i.n.m.i.PlatformDependentBenchmark.unsafeBytesEqual   100000  thrpt        10   153717.851      586.916    ops/s
2015-04-16 17:29:54 -07:00
nmittler
7aac50a79a Optimizing KObjectHashMap hashIndex()
Motivation:

The IntObjectHashMap benchmarks show the Agrona collections to be faster on put, lookup, and remove. One major difference is that we're using 2 modulus operations each time we increment the position index while iterating.  Agrona uses a mask instead.

Modifications:

Modified the KObjectHashMap to use masking rather than modulus when wrapping the position index. This requires that the capacity be a power of 2.

Result:

Improved performance of IntObjectHashMap.
2015-04-16 10:27:17 -07:00
Scott Mitchell
9a7a85dbe5 ByteString introduced as AsciiString super class
Motivation:
The usage and code within AsciiString has exceeded the original design scope for this class. Its usage as a binary string is confusing and on the verge of violating interface assumptions in some spots.

Modifications:
- ByteString will be created as a base class to AsciiString. All of the generic byte handling processing will live in ByteString and all the special character encoding will live in AsciiString.

Results:
The AsciiString interface will be clarified. Users of AsciiString can now be clear of the limitations the class imposes while users of the ByteString class don't have to live with those limitations.
2015-04-14 16:35:17 -07:00
Norman Maurer
aebbb862ac Add support for ALPN when using openssl + NPN client mode and support for CipherSuiteFilter
Motivation:

To support HTTP2 we need APLN support. This was not provided before when using OpenSslEngine, so SSLEngine (JDK one) was the only bet.
Beside this CipherSuiteFilter was not supported

Modifications:

- Upgrade netty-tcnative and make use of new features to support ALPN and NPN in server and client mode.
- Guard against segfaults after the ssl pointer is freed
- support correctly different failure behaviours
- add support for CipherSuiteFilter

Result:

Be able to use OpenSslEngine for ALPN / NPN for server and client.
2015-04-10 18:52:34 +02:00
Daniel Bevenius
c53b8d5a85 Suggestion for supporting single header fields.
Motivation:
At the moment if you want to return a HTTP header containing multiple
values you have to set/add that header once with the values wanted. If
you used set/add with an array/iterable multiple HTTP header fields will
be returned in the response.

Note, that this is indeed a suggestion and additional work and tests
should be added. This is mainly to bring up a discussion.

Modifications:
Added a flag to specify that when multiple values exist for a single
HTTP header then add them as a comma separated string.
In addition added a method to StringUtil to help escape comma separated
value charsequences.

Result:
Allows for responses to be smaller.
2015-02-18 10:54:15 +01:00
Trustin Lee
a1efd1871b Reorder PlatformDependent.isRoot() check
Motivation:

isRoot() is an expensive operation. We should avoid calling it if
possible.

Modifications:

Move the isRoot() checks to the end of the 'if' block, so that isRoot()
is evaluated only when really necessary.

Result:

isRoot() is evaluated only when SO_BROADCAST is set and the bind address
is anylocal address.
2015-02-08 12:00:16 +09:00
Greg Gibeling
a79466769f Lazily check for root, avoids unnecessary errors & resources
Motivation:

io.netty.util.internal.PlatformDependent.isRoot() depends on the IS_ROOT field which is filled in during class initialization. This spawns processes and consumes resources, which are not generally necessary to the complete functioning of that class.

Modifications:

This switches the class to use lazy initialization this field inside of the isRoot() method using double-checked locking (http://en.wikipedia.org/wiki/Double-checked_locking).

Result:

The first call to isRoot() will be slightly slower, at a tradeoff that class loading is faster, uses fewer resources and platform errors are avoided unless necessary.
2014-12-05 09:17:06 +01:00
Scott Mitchell
04f77b76f8 Backport ALPN and Mutual Auth SSL
Motivation:

Improvements were made on the main line to support ALPN and mutual
authentication for TLS. These should be backported.

Modifications:

- Backport commits from the master branch
  - f8af84d5993456426a63ad0146479147b1a4a5e5
  - e74c8edba3fcbfd2e895ed6aac440efeb3aa637f

Result:

Support for ALPN and mutual authentication.
2014-10-31 12:52:26 +09:00
Frederic Bregier
eb415fded6 V4.1 Fix "=" character in HttpPostRequestDecoder
Motivation
Issue #3004 shows that "=" character was not supported as it should in
the HttpPostRequestDecoder in form-data boundary.

Modifications:
Add 2 methods in StringUtil
- split with maxPart argument: String split with max parts only (to prevent multiple '='
to be source of extra split while not needed)
- substringAfter: String part after delimiter (since first part is not
needed)
Use those methods in HttpPostRequestDecoder.
Change and the HttpPostRequestDecoderTest to check using a boundary
beginning with "=".

Results:
The fix implies more stability and fix the issue.
2014-10-21 16:06:37 +09:00
Norman Maurer
09100e5043 Avoid redundant reads of head in peakNode
Motivation:

There is not need todo redunant reads of head in peakNode as we can just spin on next() until it becomes visible.

Modifications:

Remove redundant reads of head in peakNode. This is based on @nitsanw's patch for akka.
See https://github.com/akka/akka/pull/15596

Result:

Less volatile access.
2014-08-21 09:01:22 +02:00
Trustin Lee
3c4321ce43 Use our own URL shortener wherever possible 2014-07-31 17:06:19 -07:00
Norman Maurer
88bd6e7a93 Optimize native transport for gathering writes
Motivation:

While benchmarking the native transport with gathering writes I noticed that it is quite slow. This is due the fact that we need to do a lot of array copies to get the buffers into the iov array.

Modification:

Introduce a new class calles IovArray which allows to fill buffers directly in a iov array that can be passed over to JNI without any array copies. This gives a nice optimization in terms of speed when doing gathering writes.

Result:

Big performance improvement when doing gathering writes. See the included benchmark...

Before:
[nmaurer@xxx]~% wrk/wrk -H 'Host: localhost' -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8' -H 'Connection: keep-alive' -d 120 -c 256 -t 16 --pipeline 256  http://xxx:8080/plaintext
Running 2m test @ http://xxx:8080/plaintext
  16 threads and 256 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    23.44ms   16.37ms 259.57ms   91.77%
    Req/Sec   181.99k    31.69k  304.60k    78.12%
  346544071 requests in 2.00m, 46.48GB read
Requests/sec: 2887885.09
Transfer/sec:    396.59MB

With this change:
[nmaurer@xxx]~% wrk/wrk -H 'Host: localhost' -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8' -H 'Connection: keep-alive' -d 120 -c 256 -t 16 --pipeline 256  http://xxx:8080/plaintext
Running 2m test @ http://xxx:8080/plaintext
  16 threads and 256 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    21.93ms   16.33ms 305.73ms   92.34%
    Req/Sec   194.56k    33.75k  309.33k    77.04%
  369617503 requests in 2.00m, 49.57GB read
Requests/sec: 3080169.65
Transfer/sec:    423.00MB
2014-07-25 09:55:02 +02:00
Idel Pivnitskiy
b83df4c6b3 Fix NPE problems
Motivation:

Now Netty has a few problems with null values.

Modifications:

- Check HAProxyProxiedProtocol in HAProxyMessage constructor and throw NPE if it is null.
If HAProxyProxiedProtocol is null we will set AddressFamily as null. So we will get NPE inside checkAddress(String, AddressFamily) and it won't be easy to understand why addrFamily is null.
- Check File in DiskFileUpload.toString().
If File is null we will get NPE when calling toString() method.
- Check Result<String> in MqttDecoder.decodeConnectionPayload(...).
If !mqttConnectVariableHeader.isWillFlag() || !mqttConnectVariableHeader.hasUserName() || !mqttConnectVariableHeader.hasPassword() we will get NPE when we will try to create new instance of MqttConnectPayload.
- Check Unsafe before calling unsafe.getClass() in PlatformDependent0 static block.
- Removed unnecessary null check in WebSocket08FrameEncoder.encode(...).
Because msg.content() can not return null.
- Removed unnecessary null check in DefaultStompFrame(StompCommand) constructor.
Because we have this check in the super class.
- Removed unnecessary null checks in ConcurrentHashMapV8.removeTreeNode(TreeNode<K,V>).
- Removed unnecessary null check in OioDatagramChannel.doReadMessages(List<Object>).
Because tmpPacket.getSocketAddress() always returns new SocketAddress instance.
- Removed unnecessary null check in OioServerSocketChannel.doReadMessages(List<Object>).
Because socket.accept() always returns new Socket instance.
- Pass Unpooled.buffer(0) instead of null inside CloseWebSocketFrame(boolean, int) constructor.
If we will pass null we will get NPE in super class constructor.
- Added throw new IllegalStateException in GlobalEventExecutor.awaitInactivity(long, TimeUnit) if it will be called before GlobalEventExecutor.execute(Runnable).
Because now we will get NPE. IllegalStateException will be better in this case.
- Fixed null check in OpenSslServerContext.setTicketKeys(byte[]).
Now we throw new NPE if byte[] is not null.

Result:

Added new null checks when it is necessary, removed unnecessary null checks and fixed some NPE problems.
2014-07-20 12:55:22 +02:00
Idel Pivnitskiy
ad1389be9d Small performance improvements
Modifications:

- Added a static modifier for CompositeByteBuf.Component.
This class is an inner class, but does not use its embedded reference to the object which created it. This reference makes the instances of the class larger, and may keep the reference to the creator object alive longer than necessary.
- Removed unnecessary boxing/unboxing operations in HttpResponseDecoder, RtspResponseDecoder, PerMessageDeflateClientExtensionHandshaker and PerMessageDeflateServerExtensionHandshaker
A boxed primitive is created from a String, just to extract the unboxed primitive value.
- Removed unnecessary 3 times calculations in DiskAttribute.addContent(...).
- Removed unnecessary checks if file exists before call mkdirs() in NativeLibraryLoader and PlatformDependent.
Because the method mkdirs() has this check inside.
- Removed unnecessary `instanceof AsciiString` check in StompSubframeAggregator.contentLength(StompHeadersSubframe) and StompSubframeDecoder.getContentLength(StompHeaders, long).
Because StompHeaders.get(CharSequence) always returns java.lang.String.
2014-07-20 09:26:04 +02:00
Trustin Lee
5b87cdc8bd Reduce the perceived time taken to retrieve initialSeedUniquifier
Motivation:

When system is in short of entrophy, the initialization of
ThreadLocalRandom can take at most 3 seconds.  The initialization occurs
when ThreadLocalRandom.current() is invoked first time, which might be
much later than the moment when the application has started.  If we
start the initialization of ThreadLocalRandom as early as possible, we
can reduce the perceived time taken for the retrieval.

Modification:

Begin the initialization of ThreadLocalRandom in InternalLoggerFactory,
potentially one of the firstly initialized class in a Netty application.

Make DefaultChannelId retrieve the current process ID before retrieving
the current machine ID, because retrieval of a machine ID is more likely
to use ThreadLocalRandom.current().

Use a dummy channel ID for EmbeddedChannel, which prevents many unit
tests from creating a ThreadLocalRandom instance.

Result:

We gain extra 100ms at minimum for initialSeedUniquifier generation.  If
an application has its own initialization that takes long enough time
and generates good amount of entrophy, it is very likely that we will
gain a lot more.
2014-07-04 16:04:48 +09:00
Trustin Lee
11fdec3c4a Log the time taken for generating the initialSeedUniquifier
- Sometimes useful to know it how long it takes from the log, to make
  sure it's not something else that is blocking.
2014-07-04 13:26:58 +09:00
Trustin Lee
3c6bc0b4cb Fix unclean backport in InternalLoggerFactory
.. which leaked in from d0912f27091e4548466df81f545c017a25c9d256
2014-07-02 20:27:06 +09:00
Trustin Lee
d0912f2709 Fix most inspector warnings
Motivation:

It's good to minimize potentially broken windows.

Modifications:

Fix most inspector warnings from our profile
Update IntObjectHashMap

Result:

Cleaner code
2014-07-02 19:55:07 +09:00
Norman Maurer
90c65b7157 [#2604] Not try to use sun.misc.Cleaner when on android
Motivation:

When a user tries to use netty on android it currently fails with "Could not find class 'sun.misc.Cleaner'"

Modification:

Encapsulate sun.misc.Cleaner usage in extra class to workaround this isssue.

Result:
Netty can be used on android again
2014-06-27 08:25:42 +02:00
Norman Maurer
8a75ba35ef [#2599] Not use sun.nio.ch.DirectBuffer as it not exists on android
Motivation:

During some refactoring we changed PlatformDependend0 to use sun.nio.ch.DirectBuffer for release direct buffers. This broke support for android as the class does not exist there and so an exception is thrown.

Modification:

Use again the fieldoffset to get access to Cleaner for release direct buffers.

Result:
Netty can be used on android again
2014-06-25 15:07:02 +02:00
Norman Maurer
030bcaae81 Improve performance of Recycler
Motivation:

Recycler is used in many places to reduce GC-pressure but is still not as fast as possible because of the internal datastructures used.

Modification:

 - Rewrite Recycler to use a WeakOrderQueue which makes minimal guaranteer about order and visibility for max performance.
 - Recycling of the same object multiple times without acquire it will fail.
 - Introduce a RecyclableMpscLinkedQueueNode which can be used for MpscLinkedQueueNodes that use Recycler

These changes are based on @belliottsmith 's work that was part of #2504.

Result:

Huge increase in performance.

4.0 branch without this commit:

Benchmark                                                (size)   Mode   Samples        Score  Score error    Units
i.n.m.i.RecyclableArrayListBenchmark.recycleSameThread    00000  thrpt        20 116026994.130  2763381.305    ops/s
i.n.m.i.RecyclableArrayListBenchmark.recycleSameThread    00256  thrpt        20 110823170.627  3007221.464    ops/s
i.n.m.i.RecyclableArrayListBenchmark.recycleSameThread    01024  thrpt        20 118290272.413  7143962.304    ops/s
i.n.m.i.RecyclableArrayListBenchmark.recycleSameThread    04096  thrpt        20 120560396.523  6483323.228    ops/s
i.n.m.i.RecyclableArrayListBenchmark.recycleSameThread    16384  thrpt        20 114726607.428  2960013.108    ops/s
i.n.m.i.RecyclableArrayListBenchmark.recycleSameThread    65536  thrpt        20 119385917.899  3172913.684    ops/s
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 297.617 sec - in io.netty.microbench.internal.RecyclableArrayListBenchmark

4.0 branch with this commit:

Benchmark                                                (size)   Mode   Samples        Score  Score error    Units
i.n.m.i.RecyclableArrayListBenchmark.recycleSameThread    00000  thrpt        20 204158855.315  5031432.145    ops/s
i.n.m.i.RecyclableArrayListBenchmark.recycleSameThread    00256  thrpt        20 205179685.861  1934137.841    ops/s
i.n.m.i.RecyclableArrayListBenchmark.recycleSameThread    01024  thrpt        20 209906801.437  8007811.254    ops/s
i.n.m.i.RecyclableArrayListBenchmark.recycleSameThread    04096  thrpt        20 214288320.053  6413126.689    ops/s
i.n.m.i.RecyclableArrayListBenchmark.recycleSameThread    16384  thrpt        20 215940902.649  7837706.133    ops/s
i.n.m.i.RecyclableArrayListBenchmark.recycleSameThread    65536  thrpt        20 211141994.206  5017868.542    ops/s
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 297.648 sec - in io.netty.microbench.internal.RecyclableArrayListBenchmark
2014-06-24 10:47:38 +02:00
Trustin Lee
cdaeb54fb9 Remove padding utility classes
- It's not used anywhere
2014-06-21 17:59:49 +09:00
Trustin Lee
f44720850c Add missing last padding / Comment 2014-06-21 17:57:06 +09:00
Trustin Lee
a368f9d12a Checkstyle / Overall clean-up / Fix serialization 2014-06-21 17:57:06 +09:00
nitsanw
32aab3b0b3 Fix false sharing between head and tail reference in MpscLinkedQueue
Motivation:

The tail node reference writes (by producer threads) are very likely to
invalidate the cache line holding the headRef which is read by the
consumer threads in order to access the padded reference to the head
node. This is because the resulting layout for the object is:

- header
- Object AtomicReference.value -> Tail node
- Object MpscLinkedQueue.headRef -> PaddedRef -> Head node

This is 'passive' false sharing where one thread reads and the other
writes.  The current implementation suffers from further passive false
sharing potential from any and all neighbours to the queue object as no
pre/post padding is provided for the class fields.

Modifications:

Fix the memory layout by adding pre-post padding for the head node and
putting the tail node reference in the same object.

Result:

Fixed false sharing
2014-06-21 17:57:06 +09:00
Trustin Lee
085a61a310 Refactor FastThreadLocal to simplify TLV management
Motivation:

When Netty runs in a managed environment such as web application server,
Netty needs to provide an explicit way to remove the thread-local
variables it created to prevent class loader leaks.

FastThreadLocal uses different execution paths for storing a
thread-local variable depending on the type of the current thread.
It increases the complexity of thread-local removal.

Modifications:

- Moved FastThreadLocal and FastThreadLocalThread out of the internal
  package so that a user can use it.
- FastThreadLocal now keeps track of all thread local variables it has
  initialized, and calling FastThreadLocal.removeAll() will remove all
  thread-local variables of the caller thread.
- Added FastThreadLocal.size() for diagnostics and tests
- Introduce InternalThreadLocalMap which is a mixture of hard-wired
  thread local variable fields and extensible indexed variables
- FastThreadLocal now uses InternalThreadLocalMap to implement a
  thread-local variable.
- Added ThreadDeathWatcher.unwatch() so that PooledByteBufAllocator
  tells it to stop watching when its thread-local cache has been freed
  by FastThreadLocal.removeAll().
- Added FastThreadLocalTest to ensure that removeAll() works
- Added microbenchmark for FastThreadLocal and JDK ThreadLocal
- Upgraded to JMH 0.9

Result:

- A user can remove all thread-local variables Netty created, as long as
  he or she did not exit from the current thread. (Note that there's no
  way to remove a thread-local variable from outside of the thread.)
- FastThreadLocal exposes more useful operations such as isSet() because
  we always implement a thread local variable via InternalThreadLocalMap
  instead of falling back to JDK ThreadLocal.
- FastThreadLocalBenchmark shows that this change improves the
  performance of FastThreadLocal even more.
2014-06-19 21:13:55 +09:00