Commit Graph

520 Commits

Author SHA1 Message Date
Jakob Buchgraber
f8bee2e94c Make Nio/EpollEventLoop run on a ForkJoinPool
Related issue: #2250

Motivation:

Prior to this commit, Netty's non blocking EventLoops
were each assigned a fixed thread by which all of the
EventLoop's I/O and handler logic would be performed.

While this is a fine approach for most users of Netty,
some advanced users require more flexibility in
scheduling the EventLoops.

Modifications:

Remove all direct usages of threads in MultithreadEventExecutorGroup,
SingleThreadEventExecutor et al., and introduce an Executor
abstraction instead.

The way to think about this change is, that each
iteration of an eventloop is now a task that gets scheduled
in a ForkJoinPool.

While the ForkJoinPool is the default, one also has the
ability to plug in his/her own Executor (aka thread pool)
into a EventLoop(Group).

Result:

Netty hands off thread management to a ForkJoinPool by default.
Users can also provide their own thread pool implementation and
get some control over scheduling Netty's EventLoops
2014-08-11 15:28:46 -07:00
Norman Maurer
7022dcd4a1 [#2744] Fix flakey HashedWheelTimerTest.testExecutionOnTime()
Motivation:

The calculation of the max wait time for HashedWheelTimerTest.testExecutionOnTime() was wrong and so the test sometimes failed.

Modifications:

Fix the max wait time.

Result:

No more test-failures
2014-08-06 07:01:24 +02:00
Trustin Lee
e59ec5384d Fix a bug where ChannelFuture.setFailure(null) doesn't fail
Motivation:

We forgot to do a null check on the cause parameter of
ChannelFuture.setFailure(cause)

Modifications:

Add a null check

Result:

Fixed issue: #2728
2014-08-05 11:23:52 -07:00
Norman Maurer
4a3ef90381 Port ChannelOutboundBuffer and related changes from 4.0
Motivation:

We did various changes related to the ChannelOutboundBuffer in 4.0 branch. This commit port all of them over and so make sure our branches are synced in terms of these changes.

Related to [#2734], [#2709], [#2729], [#2710] and [#2693] .

Modification:
Port all changes that was done on the ChannelOutboundBuffer.

This includes the port of the following commits:
 - 73dfd7c01b
 - 997d8c32d2
 - e282e504f1
 - 5e5d1a58fd
 - 8ee3575e72
 - d6f0d12a86
 - 16e50765d1
 - 3f3e66c31a

Result:
 - Less memory usage by ChannelOutboundBuffer
 - Same code as in 4.0 branch
 - Make it possible to use ChannelOutboundBuffer with Channel implementation that not extends AbstractChannel
2014-08-05 15:00:56 +02:00
Trustin Lee
2b12640960 More brief somaxconn logging
- Consistent log message format
- Avoid unnecessary autoboxing when debug level is off
- Remove the duplication of somaxconn path
2014-08-04 10:27:22 -07:00
Norman Maurer
bc92ac6ba3 Remove duplicated code 2014-07-31 18:27:26 -07:00
Norman Maurer
045f9ff0ed [#2720] Check if /proc/sys/net/core/somaxconn exists before try to parse it
Motivation:

As /proc/sys/net/core/somaxconn does not exists on non-linux platforms you see a noisy stacktrace when debug level is enabled while the static method of NetUtil is executed.

Modifications:

Check if the file exists before try to parse it.

Result:

Less noisy logging on non-linux platforms.
2014-07-31 18:09:08 -07:00
Trustin Lee
c5ae3e3dcb Use our own URL shortener wherever possible 2014-07-31 17:06:30 -07:00
Trustin Lee
1c9cb90655 Fix a ConstantPoolTest failure 2014-07-29 15:45:50 -07:00
Trustin Lee
e64644b21c Fix a bug where AbstractConstant.compareTo() returns 0 for different constants
Related issue: #2354

Motivation:

AbstractConstant.compareTo() can return 0 even if the specified constant
object is not the same instance with 'this'.

Modifications:

- Compare the identityHashCode of constant first. If that fails,
  allocate a small direct buffer and use its memory address as a unique
  value.  If the platform does not provide a way to get the memory
  address of a direct buffer, use a thread-local random value.
- Signal cannot extend AbstractConstant. Use delegation.

Result:

It is practically impossible for AbstractConstant.compareTo() to return
0 for different constant objects.
2014-07-29 14:56:28 -07:00
Norman Maurer
ca0571cc9d Optimize native transport for gathering writes
Motivation:

While benchmarking the native transport with gathering writes I noticed that it is quite slow. This is due the fact that we need to do a lot of array copies to get the buffers into the iov array.

Modification:

Introduce a new class calles IovArray which allows to fill buffers directly in a iov array that can be passed over to JNI without any array copies. This gives a nice optimization in terms of speed when doing gathering writes.

Result:

Big performance improvement when doing gathering writes. See the included benchmark...

Before:
[nmaurer@xxx]~% wrk/wrk -H 'Host: localhost' -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8' -H 'Connection: keep-alive' -d 120 -c 256 -t 16 --pipeline 256  http://xxx:8080/plaintext
Running 2m test @ http://xxx:8080/plaintext
  16 threads and 256 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    23.44ms   16.37ms 259.57ms   91.77%
    Req/Sec   181.99k    31.69k  304.60k    78.12%
  346544071 requests in 2.00m, 46.48GB read
Requests/sec: 2887885.09
Transfer/sec:    396.59MB

With this change:
[nmaurer@xxx]~% wrk/wrk -H 'Host: localhost' -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8' -H 'Connection: keep-alive' -d 120 -c 256 -t 16 --pipeline 256  http://xxx:8080/plaintext
Running 2m test @ http://xxx:8080/plaintext
  16 threads and 256 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    21.93ms   16.33ms 305.73ms   92.34%
    Req/Sec   194.56k    33.75k  309.33k    77.04%
  369617503 requests in 2.00m, 49.57GB read
Requests/sec: 3080169.65
Transfer/sec:    423.00MB
2014-07-25 09:54:41 +02:00
Norman Maurer
44ea769f53 [#2662] Fix race in cancellation of TimerTasks which could let to NPE
Motivation:

Due some race-condition while handling canellation of TimerTasks it was possibleto corrupt the linked-list structure that is represent by HashedWheelBucket and so produce a NPE.

Modification:

Fix the problem by adding another MpscLinkedQueue which holds the cancellation tasks and process them on each tick. This allows to use no synchronization / locking at all while introduce a latency of max 1 tick before the TimerTask can be GC'ed.

Result:

No more NPE
2014-07-25 06:29:28 +02:00
Osvaldo Doederlein
8fc58a4a91 Fixes and improvements to IntObjectHashMap. Related to [#2659]
- Rewrite with linear probing, no state array, compaction at cleanup
- Optimize keys() and values() to not use reflection
- Optimize hashCode() and equals() for efficient iteration
- Fixed equals() to not return true for equals(null)
- Optimize iterator to not allocate new Entry at each next()
- Added toString()
- Added some new unit tests
2014-07-21 16:15:18 +02:00
Idel Pivnitskiy
6860019f41 Fix NPE problems
Motivation:

Now Netty has a few problems with null values.

Modifications:

- Check HAProxyProxiedProtocol in HAProxyMessage constructor and throw NPE if it is null.
If HAProxyProxiedProtocol is null we will set AddressFamily as null. So we will get NPE inside checkAddress(String, AddressFamily) and it won't be easy to understand why addrFamily is null.
- Check File in DiskFileUpload.toString().
If File is null we will get NPE when calling toString() method.
- Check Result<String> in MqttDecoder.decodeConnectionPayload(...).
If !mqttConnectVariableHeader.isWillFlag() || !mqttConnectVariableHeader.hasUserName() || !mqttConnectVariableHeader.hasPassword() we will get NPE when we will try to create new instance of MqttConnectPayload.
- Check Unsafe before calling unsafe.getClass() in PlatformDependent0 static block.
- Removed unnecessary null check in WebSocket08FrameEncoder.encode(...).
Because msg.content() can not return null.
- Removed unnecessary null check in DefaultStompFrame(StompCommand) constructor.
Because we have this check in the super class.
- Removed unnecessary null checks in ConcurrentHashMapV8.removeTreeNode(TreeNode<K,V>).
- Removed unnecessary null check in OioDatagramChannel.doReadMessages(List<Object>).
Because tmpPacket.getSocketAddress() always returns new SocketAddress instance.
- Removed unnecessary null check in OioServerSocketChannel.doReadMessages(List<Object>).
Because socket.accept() always returns new Socket instance.
- Pass Unpooled.buffer(0) instead of null inside CloseWebSocketFrame(boolean, int) constructor.
If we will pass null we will get NPE in super class constructor.
- Added throw new IllegalStateException in GlobalEventExecutor.awaitInactivity(long, TimeUnit) if it will be called before GlobalEventExecutor.execute(Runnable).
Because now we will get NPE. IllegalStateException will be better in this case.
- Fixed null check in OpenSslServerContext.setTicketKeys(byte[]).
Now we throw new NPE if byte[] is not null.

Result:

Added new null checks when it is necessary, removed unnecessary null checks and fixed some NPE problems.
2014-07-20 12:55:08 +02:00
Idel Pivnitskiy
a5095c5616 Small fixes and improvements
Motivation:

Fix some typos in Netty.

Modifications:

- Fix potentially dangerous use of non-short-circuit logic in Recycler.transfer(Stack<?>).
- Removed double 'the the' in javadoc of EmbeddedChannel.
- Write to log an exception message if we can not get SOMAXCONN in the NetUtil's static block.
2014-07-20 09:37:36 +02:00
Idel Pivnitskiy
dd26d47adc Small performance improvements
Modifications:

- Added a static modifier for CompositeByteBuf.Component.
This class is an inner class, but does not use its embedded reference to the object which created it. This reference makes the instances of the class larger, and may keep the reference to the creator object alive longer than necessary.
- Removed unnecessary boxing/unboxing operations in HttpResponseDecoder, RtspResponseDecoder, PerMessageDeflateClientExtensionHandshaker and PerMessageDeflateServerExtensionHandshaker
A boxed primitive is created from a String, just to extract the unboxed primitive value.
- Removed unnecessary 3 times calculations in DiskAttribute.addContent(...).
- Removed unnecessary checks if file exists before call mkdirs() in NativeLibraryLoader and PlatformDependent.
Because the method mkdirs() has this check inside.
- Removed unnecessary `instanceof AsciiString` check in StompSubframeAggregator.contentLength(StompHeadersSubframe) and StompSubframeDecoder.getContentLength(StompHeaders, long).
Because StompHeaders.get(CharSequence) always returns java.lang.String.
2014-07-20 09:24:56 +02:00
Norman Maurer
0eb64b5a5d Fix over-sensible testcase 2014-07-13 17:19:47 +02:00
Norman Maurer
1cdbbe0f9b [#2651] Fix possible infinite-loop when cancel tasks
Motivations:
In our new version of HWT we used some kind of lazy cancelation of timeouts by put them back in the queue and let them pick up on the next tick. This  multiple problems:
 - we may corrupt the MpscLinkedQueue if the task is used as tombstone
 - this sometimes lead to an uncessary delay especially when someone did executed some "heavy" logic in the TimeTask

Modifications:
Use a Lock per HashedWheelBucket for save and fast removal.

Modifications:
Cancellation of tasks can be done fast and so stuff can be GC'ed and no more infinite-loop possible
2014-07-11 15:41:44 +02:00
nmittler
67159e7119 Upgrade to HTTP/2 draft 13
Motivation:

Need to upgrade HTTP/2 implementation to latest draft.

Modifications:

Various changes to support draft 13.

Result:

Support for HTTP/2 draft 13.
2014-07-08 21:10:46 +02:00
Trustin Lee
3921f7c88a Reduce the perceived time taken to retrieve initialSeedUniquifier
Motivation:

When system is in short of entrophy, the initialization of
ThreadLocalRandom can take at most 3 seconds.  The initialization occurs
when ThreadLocalRandom.current() is invoked first time, which might be
much later than the moment when the application has started.  If we
start the initialization of ThreadLocalRandom as early as possible, we
can reduce the perceived time taken for the retrieval.

Modification:

Begin the initialization of ThreadLocalRandom in InternalLoggerFactory,
potentially one of the firstly initialized class in a Netty application.

Make DefaultChannelId retrieve the current process ID before retrieving
the current machine ID, because retrieval of a machine ID is more likely
to use ThreadLocalRandom.current().

Use a dummy channel ID for EmbeddedChannel, which prevents many unit
tests from creating a ThreadLocalRandom instance.

Result:

We gain extra 100ms at minimum for initialSeedUniquifier generation.  If
an application has its own initialization that takes long enough time
and generates good amount of entrophy, it is very likely that we will
gain a lot more.
2014-07-04 16:04:37 +09:00
Trustin Lee
b4c2ae5e59 Log the time taken for generating the initialSeedUniquifier
- Sometimes useful to know it how long it takes from the log, to make
  sure it's not something else that is blocking.
2014-07-04 13:25:26 +09:00
Trustin Lee
ba81831e0a Fix the cruft produced from the refactoring
.. in 330404da07
2014-07-02 20:30:45 +09:00
Trustin Lee
330404da07 Fix most inspector warnings
Motivation:

It's good to minimize potentially broken windows.

Modifications:

Fix most inspector warnings from our profile

Result:

Cleaner code
2014-07-02 19:04:11 +09:00
Norman Maurer
887838a06b Correctly return from selector loop one a scheduled task is ready for processing
Motivation:

We use the nanoTime of the scheduledTasks to calculate the milli-seconds to wait for a select operation to select something. Once these elapsed we check if there was something selected or some task is ready for processing. Unfortunally we not take into account scheduled tasks here so the selection loop will continue if only scheduled tasks are ready for processing. This will delay the execution of these tasks.

Modification:

- Check if a scheduled task is ready after selecting
- also make a tiny change in NioEventLoop to not trigger a rebuild if nothing was selected because the timeout was reached a few times in a row.

Result:

Execute scheduled tasks on time.
2014-07-02 09:10:49 +02:00
Norman Maurer
20adbb325e [#2604] Not try to use sun.misc.Cleaner when on android
Motivation:

When a user tries to use netty on android it currently fails with "Could not find class 'sun.misc.Cleaner'"

Modification:

Encapsulate sun.misc.Cleaner usage in extra class to workaround this isssue.

Result:
Netty can be used on android again
2014-06-27 08:25:56 +02:00
Norman Maurer
d4d27248b4 [#2599] Not use sun.nio.ch.DirectBuffer as it not exists on android
Motivation:

During some refactoring we changed PlatformDependend0 to use sun.nio.ch.DirectBuffer for release direct buffers. This broke support for android as the class does not exist there and so an exception is thrown.

Modification:

Use again the fieldoffset to get access to Cleaner for release direct buffers.

Result:
Netty can be used on android again
2014-06-25 15:07:16 +02:00
Norman Maurer
580965f775 Improve performance of Recycler
Motivation:

Recycler is used in many places to reduce GC-pressure but is still not as fast as possible because of the internal datastructures used.

Modification:

 - Rewrite Recycler to use a WeakOrderQueue which makes minimal guaranteer about order and visibility for max performance.
 - Recycling of the same object multiple times without acquire it will fail.
 - Introduce a RecyclableMpscLinkedQueueNode which can be used for MpscLinkedQueueNodes that use Recycler

These changes are based on @belliottsmith 's work that was part of #2504.

Result:

Huge increase in performance.

4.0 branch without this commit:

Benchmark                                                (size)   Mode   Samples        Score  Score error    Units
i.n.m.i.RecyclableArrayListBenchmark.recycleSameThread    00000  thrpt        20 116026994.130  2763381.305    ops/s
i.n.m.i.RecyclableArrayListBenchmark.recycleSameThread    00256  thrpt        20 110823170.627  3007221.464    ops/s
i.n.m.i.RecyclableArrayListBenchmark.recycleSameThread    01024  thrpt        20 118290272.413  7143962.304    ops/s
i.n.m.i.RecyclableArrayListBenchmark.recycleSameThread    04096  thrpt        20 120560396.523  6483323.228    ops/s
i.n.m.i.RecyclableArrayListBenchmark.recycleSameThread    16384  thrpt        20 114726607.428  2960013.108    ops/s
i.n.m.i.RecyclableArrayListBenchmark.recycleSameThread    65536  thrpt        20 119385917.899  3172913.684    ops/s
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 297.617 sec - in io.netty.microbench.internal.RecyclableArrayListBenchmark

4.0 branch with this commit:

Benchmark                                                (size)   Mode   Samples        Score  Score error    Units
i.n.m.i.RecyclableArrayListBenchmark.recycleSameThread    00000  thrpt        20 204158855.315  5031432.145    ops/s
i.n.m.i.RecyclableArrayListBenchmark.recycleSameThread    00256  thrpt        20 205179685.861  1934137.841    ops/s
i.n.m.i.RecyclableArrayListBenchmark.recycleSameThread    01024  thrpt        20 209906801.437  8007811.254    ops/s
i.n.m.i.RecyclableArrayListBenchmark.recycleSameThread    04096  thrpt        20 214288320.053  6413126.689    ops/s
i.n.m.i.RecyclableArrayListBenchmark.recycleSameThread    16384  thrpt        20 215940902.649  7837706.133    ops/s
i.n.m.i.RecyclableArrayListBenchmark.recycleSameThread    65536  thrpt        20 211141994.206  5017868.542    ops/s
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 297.648 sec - in io.netty.microbench.internal.RecyclableArrayListBenchmark
2014-06-24 10:57:35 +02:00
Trustin Lee
a874062434 Remove padding utility classes
- It's not used anywhere
2014-06-21 18:00:01 +09:00
Trustin Lee
5409b560e3 Add missing last padding / Comment 2014-06-21 17:55:38 +09:00
Trustin Lee
a2400d9eb2 Checkstyle / Overall clean-up / Fix serialization 2014-06-21 17:54:22 +09:00
nitsanw
0925bc3e79 Fix false sharing between head and tail reference in MpscLinkedQueue
Motivation:

The tail node reference writes (by producer threads) are very likely to
invalidate the cache line holding the headRef which is read by the
consumer threads in order to access the padded reference to the head
node. This is because the resulting layout for the object is:

- header
- Object AtomicReference.value -> Tail node
- Object MpscLinkedQueue.headRef -> PaddedRef -> Head node

This is 'passive' false sharing where one thread reads and the other
writes.  The current implementation suffers from further passive false
sharing potential from any and all neighbours to the queue object as no
pre/post padding is provided for the class fields.

Modifications:

Fix the memory layout by adding pre-post padding for the head node and
putting the tail node reference in the same object.

Result:

Fixed false sharing
2014-06-21 17:54:22 +09:00
Trustin Lee
365b3f5e11 Rename io.netty.util.collection.Collections to PrimitiveCollections
Motivation:

An IDE's auto-completion often confuses between java.util.Collections
and io.netty.util.collection.Collections, and it's annoying to me. :-)

Modifications:

Use a different class name.

Result:

When I type Collections, it's always 'the' Collections.
2014-06-21 16:57:06 +09:00
nmittler
217a666b1f Grab-bag of minor changes to HTTP/2
Motivation:

Various small fixes/improvements to the interface to the HTTP/2 classes,
as well as some minor performance improvements.

Modifications:

- Added fix for IntObjectHashMap to ensure that capacity is always odd.
Even capacity can cause probing to fail.
- Cleaned the access to GOAWAY information in Http2Connection interface.
Endpoints now manage their own state for GOAWAY. Also added a goingAway
event handler.
- Added Endpoint methods for checking MAX_CONCURRENT_STREAMS or if the
number of streams for the endpoint have been exhausted. See
Endpoint.nextStreamId()/acceptingNewStreams().
- Changed DefaultHttp2Connection to use IntObjectHashMap. This should be
a slight memory improvement.
- Fixed check for MAX_CONCURRENT_STREAMS to correctly use the number of
active streams for the endpoint (not total active). See
DefaultHttp2Connection.checkNewStreamAllowed.
- Exposing a few methods to subclasses of AbstractHttp2ConnectionHandler
(e.g. exception handling).
- Cleaning up GOAWAY and RST_STREAM handling in
AbstractHttp2ConnectionHandler.

Result:

HTTP/2 code should provide more information to subclasses and will have
a reduced memory footprint.
2014-06-19 20:00:26 +02:00
Trustin Lee
760bbc7ea6 Refactor FastThreadLocal to simplify TLV management
Motivation:

When Netty runs in a managed environment such as web application server,
Netty needs to provide an explicit way to remove the thread-local
variables it created to prevent class loader leaks.

FastThreadLocal uses different execution paths for storing a
thread-local variable depending on the type of the current thread.
It increases the complexity of thread-local removal.

Modifications:

- Moved FastThreadLocal and FastThreadLocalThread out of the internal
  package so that a user can use it.
- FastThreadLocal now keeps track of all thread local variables it has
  initialized, and calling FastThreadLocal.removeAll() will remove all
  thread-local variables of the caller thread.
- Added FastThreadLocal.size() for diagnostics and tests
- Introduce InternalThreadLocalMap which is a mixture of hard-wired
  thread local variable fields and extensible indexed variables
- FastThreadLocal now uses InternalThreadLocalMap to implement a
  thread-local variable.
- Added ThreadDeathWatcher.unwatch() so that PooledByteBufAllocator
  tells it to stop watching when its thread-local cache has been freed
  by FastThreadLocal.removeAll().
- Added FastThreadLocalTest to ensure that removeAll() works
- Added microbenchmark for FastThreadLocal and JDK ThreadLocal
- Upgraded to JMH 0.9

Result:

- A user can remove all thread-local variables Netty created, as long as
  he or she did not exit from the current thread. (Note that there's no
  way to remove a thread-local variable from outside of the thread.)
- FastThreadLocal exposes more useful operations such as isSet() because
  we always implement a thread local variable via InternalThreadLocalMap
  instead of falling back to JDK ThreadLocal.
- FastThreadLocalBenchmark shows that this change improves the
  performance of FastThreadLocal even more.
2014-06-19 21:17:46 +09:00
Trustin Lee
9a169f5865 Fix incorrect method signature of awaitInactivity()
- Related: #2084
2014-06-17 15:59:11 +09:00
nmittler
376f4b2516 Adding int-to-object map implementation.
Motivation:

Maps with integer keys are used in several places (HTTP/2 code, for
example). To reduce the memory footprint of these structures, we need a
specialized map class that uses ints as keys.

Modifications:

Added IntObjectHashMap, which is uses open addressing and double hashing
for collision resolution.

Result:

A new int-based map class that can be shared across Netty.
2014-06-16 10:32:16 +02:00
Trustin Lee
fb5464a544 Use FastThreadLocal in more places 2014-06-14 17:45:43 +09:00
Trustin Lee
de2872f7f7 Introduce TextHeaders and AsciiString
Motivation:

We have quite a bit of code duplication between HTTP/1, HTTP/2, SPDY,
and STOMP codec, because they all have a notion of 'headers', which is a
multimap of string names and values.

Modifications:

- Add TextHeaders and its default implementation
- Add AsciiString to replace HttpHeaderEntity
  - Borrowed some portion from Apache Harmony's java.lang.String.
- Reimplement HttpHeaders, SpdyHeaders, and StompHeaders using
  TextHeaders
- Add AsciiHeadersEncoder to reuse the encoding a TextHeaders
  - Used a dedicated encoder for HTTP headers for better performance
    though
- Remove shortcut methods in SpdyHeaders
- Remove shortcut methods in SpdyHttpHeaders
- Replace SpdyHeaders.getStatus() with HttpResponseStatus.parseLine()

Result:

- Removed quite a bit of code duplication in the header implementations.
- Slightly better performance thanks to improved header validation and
  hash code calculation
2014-06-14 17:14:30 +09:00
belliottsmith
7d37af5dfb Introduce FastThreadLocal which uses an EnumMap and a predefined fixed set of possible thread locals
Motivation:
Provide a faster ThreadLocal implementation

Modification:
Add a "FastThreadLocal" which uses an EnumMap and a predefined fixed set of possible thread locals (all of the static instances created by netty) that is around 10-20% faster than standard ThreadLocal in my benchmarks (and can be seen having an effect in the direct PooledByteBufAllocator benchmark that uses the DEFAULT ByteBufAllocator which uses this FastThreadLocal, as opposed to normal instantiations that do not, and in the new RecyclableArrayList benchmark);

Result:
Improved performance
2014-06-13 11:02:16 +02:00
Norman Maurer
c4d585420f Make sure cancelled Timeouts are able to be GC'ed fast.
Motivation:
At the moment the HashedWheelTimer will only remove the cancelled Timeouts once the HashedWheelBucket is processed again. Until this the instance will not be able to be GC'ed as there are still strong referenced to it even if the user not reference it by himself/herself. This can cause to waste a lot of memory even if the Timeout was cancelled before.

Modification:
Add a new queue which holds CancelTasks that will be processed on each tick to remove cancelled Timeouts. Because all of this is done only by the WorkerThread there is no need for synchronization and only one extra object creation is needed when cancel() is executed. For addTimeout(...) no new overhead is introduced.

Result:
Less memory usage for cancelled Timeouts.
2014-06-10 12:49:56 +02:00
Norman Maurer
805ba157e4 DNS codec for Netty which is based on the work of [#1622].
Motivation:
As part of GSOC 2013 we had @mbakkar working on a DNS codec but did not integrate it yet as it needs some cleanup. This commit is based on @mbakkar's work and provide the codec for DNS.

Modifications:
Add DNS codec

Result:
Reusable DNS codec will be included in netty.

This PR also includes a AsynchronousDnsResolver which allows to resolve DNS entries in a non blocking way by make use
of the dns codec and netty transport itself.
2014-06-10 09:47:25 +02:00
Trustin Lee
d0640a6686 Clean up MpscLinkedQueue, fix its leak, and make it work without Unsafe
Motivation:

MpscLinkedQueue has various issues:
- It does not work without sun.misc.Unsafe.
- Some field names are confusing.
  - Node.tail does not refer to the tail node really.
  - The tail node is the starting point of iteration. I think the tail
    node should be the head node and vice versa to reduce confusion.
- Some important methods are not implemented (e.g. iterator())
- Not serializable
- Potential false cache sharing problem due to lack of padding
- MpscLinkedQueue extends AtomicReference and thus exposes various
  operations that mutates the internal state of the queue directly.

Modifications:

- Use AtomicReferenceFieldUpdater wherever possible so that we do not
  use Unsafe directly. (e.g. use lazySet() instead of putOrderedObject)
- Extend AbstractQueue to implement most operations
- Implement serialization and iterator()
- Rename tail to head and head to tail to reduce confusion.
- Rename Node.tail to Node.next.
- Fix a leak where the references in the removed head are not cleared
  properly.
- Add Node.clearMaybe() method so that the value of the new head node
  is cleared if possible.
- Add some comments for my own educational purposes
- Add padding to the head node
  - Add FullyPaddedReference and RightPaddedReference for future reuse
- Make MpscLinkedQueue package-local so that a user cannot access the
  dangerous yet public operations exposed by the superclass.
  - MpscLinkedQueue.Node becomes MpscLinkedQueueNode, a top level class

Result:

- It's more like a drop-in replacement of ConcurrentLinkedQueue for the
  MPSC case.
- Works without sun.misc.Unsafe
- Code potentially easier to understand
- Fixed leak (related: #2372)
2014-06-04 03:23:44 +09:00
Trustin Lee
fc45635af3 Fix checkstyle 2014-06-02 19:27:26 +09:00
Trustin Lee
c827c6da37 Add awaitInactivity() to GlobalEventExecutor and ThreadDeathWatcher
Motivation:

When running Netty on a container environment, the container will often
complain about the lingering threads such as the worker threads of
ThreadDeathWatcher and GlobalEventExecutor.  We should provide an
operation that allows a use to wait until such threads are terminated.

Modifications:

- Add awaitInactivity()
- (misc) Fix typo in GlobalEventExecutorTest
- (misc) Port ThreadDeathWatch's CAS-based thread life cycle management
  to GlobalEventExecutor

Result:

- Fixes #2084
- Less overhead on task submission of GlobalEventExecutor
2014-06-02 19:23:50 +09:00
Trustin Lee
9f4543fb39 Fix checkstyle 2014-06-02 18:26:54 +09:00
Trustin Lee
642f4bb3b1 Introduce ThreadDeathWatcher
Motivation:

PooledByteBufAllocator's thread local cache and
ReferenceCountUtil.releaseLater() are in need of a way to run an
arbitrary logic when a certain thread is terminated.

Modifications:

- Add ThreadDeathWatcher, which spawns a low-priority daemon thread
  that watches a list of threads periodically (every second) and
  invokes the specified tasks when the associated threads are not alive
  anymore
  - Start-stop logic based on CAS operation proposed by @tea-dragon
- Add debug-level log messages to see if ThreadDeathWatcher works

Result:

- Fixes #2519 because we don't use GlobalEventExecutor anymore
- Cleaner code
2014-06-02 18:14:23 +09:00
Norman Maurer
1347e7f312 [#2523] Fix infinite-loop when remove attribute and create the same attribute again
Motivation:
The current DefaultAttributeMap cause an infinite-loop when the user removes an attribute and create the same attribute again. This regression was introduced by c3bd7a8ff1.

Modification:
Correctly break out loop

Result:
No infinite-loop anymore.
2014-06-01 13:06:50 +02:00
Trustin Lee
47f961b6b9 Fix a bug in DefaultPromise.notifyLateListener() where the listener is not notified
Motivation:

When (listeners == null && lateListeners == null) and (stackDepth >= MAX_LISTENER_STACK_DEPTH), the listener is not notified at all. The discard client does not work.

Modification:

Make sure to submit the notification task.

Result:

The discard client works again and all listeners are notified.
2014-05-23 09:47:22 +09:00
Trustin Lee
087e95e899 Checkstyle 2014-05-20 17:38:52 +09:00
Trustin Lee
5c6c8da0c9 Limit the number of bytes to copy per Unsafe.copyMemory()
Motivation:

During a large memory copy, safepoint polling is diabled, hindering
accurate profiling.

Modifications:

Only copy up to 1 MiB per Unsafe.copyMemory()

Result:

Potentially more reliable performance
2014-05-20 17:16:15 +09:00
Trustin Lee
a72230061d Add an OpenSslEngine and the universal API for enabling SSL
Motivation:

Some users already use an SSLEngine implementation in finagle-native. It
wraps OpenSSL to get higher SSL performance.  However, to take advantage
of it, finagle-native must be compiled manually, and it means we cannot
pull it in as a dependency and thus we cannot test our SslHandler
against the OpenSSL-based SSLEngine.  For an instance, we had #2216.

Because the construction procedures of JDK SSLEngine and OpenSslEngine
are very different from each other, we also need to provide a universal
way to enable SSL in a Netty application.

Modifications:

- Pull netty-tcnative in as an optional dependency.
  http://netty.io/wiki/forked-tomcat-native.html
- Backport NativeLibraryLoader from 4.0
- Move OpenSSL-based SSLEngine implementation into our code base.
  - Copied from finagle-native; originally written by @jpinner et al.
  - Overall cleanup by @trustin.
- Run all SslHandler tests with both default SSLEngine and OpenSslEngine
- Add a unified API for creating an SSL context
  - SslContext allows you to create a new SSLEngine or a new SslHandler
    with your PKCS#8 key and X.509 certificate chain.
  - Add JdkSslContext and its subclasses
  - Add OpenSslServerContext
- Add ApplicationProtocolSelector to ensure the future support for NPN
  (NextProtoNego) and ALPN (Application Layer Protocol Negotiation) on
  the client-side.
- Add SimpleTrustManagerFactory to help a user write a
  TrustManagerFactory easily, which should be useful for those who need
  to write an alternative verification mechanism. For example, we can
  use it to implement an unsafe TrustManagerFactory that accepts
  self-signed certificates for testing purposes.
- Add InsecureTrustManagerFactory and FingerprintTrustManager for quick
  and dirty testing
- Add SelfSignedCertificate class which generates a self-signed X.509
  certificate very easily.
- Update all our examples to use SslContext.newClient/ServerContext()
- SslHandler now logs the chosen cipher suite when handshake is
  finished.

Result:

- Cleaner unified API for configuring an SSL client and an SSL server
  regardless of its internal implementation.
- When native libraries are available, OpenSSL-based SSLEngine
  implementation is selected automatically to take advantage of its
  performance benefit.
- Examples take advantage of this modification and thus are cleaner.
2014-05-18 02:33:26 +09:00
Norman Maurer
c3bd7a8ff1 Better implementation of AttributeMap and also add hasAttr(...). See [#2439]
Motivation:
The old DefaultAttributeMap impl did more synchronization then needed and also did not expose a efficient way to check if an attribute exists with a specific key.

Modifications:
* Rewrite DefaultAttributeMap to not use IdentityHashMap and synchronization on the map directly. The new impl uses a combination of AtomicReferenceArray and synchronization per chain (linked-list). Also access the first Attribute per bucket can be done without any synchronization at all and just uses atomic operations. This should fit for most use-cases pretty weel.
* Add hasAttr(...) implementation

Result:
It's now possible to check for the existence of a attribute without create one. Synchronization is per linked-list and the first entry can even be added via atomic operation.
2014-05-15 06:47:58 +02:00
Norman Maurer
1f68479e3c Minimize memory footprint of HashedWheelTimer and context-switching
Motivation:
At the moment there are two issues with HashedWheelTimer:
* the memory footprint of it is pretty heavy (250kb fon an empty instance)
* the way how added Timeouts are handled is inefficient in terms of how locks etc are used and so a lot of context-switching / condition can happen.

Modification:
Rewrite HashedWheelTimer to use an optimized bucket implementation to store the submitted Timeouts and a MPSC queue to handover the timeouts.  So volatile writes are reduced to a minimum and also the memory foot-print of the buckets itself is reduced a lot as the bucket uses a double-linked-list. Beside this we use Atomic*FieldUpdater where-ever possible to improve the memory foot-print and performance.

Result:
Lower memory-footprint and better performance
2014-05-11 15:15:46 +02:00
Trustin Lee
10ab0db8f5 More robust native library discovery in Mac OS X
Motivation:

Some JDK versions of Mac OS X generates a JNI dynamic library with '.jnilib' extension rather than with '.dynlib' extension.  However, System.mapLibraryName() always returns 'lib<name>.dynlib'. As a result, NativeLibraryLoader fails to load the native library whose extension is .jnilib.

Modification:

Try to find both '.jnilib' and '.dynlib' resources on OS X.

Result:

Dynamic libraries are loaded correctly in Mac OS X, and thus we can continue the OpenSslEngine work.
2014-05-11 18:53:15 +09:00
Trustin Lee
68b4c9f2b5 Simplify native library resolution using os-maven-plugin
Motivation:

So far, we used a very simple platform string such as linux64 and
linux32.  However, this is far from perfection because it does not
include anything about the CPU architecture.

Also, the current build tries to put multiple versions of .so files into
a single JAR.  This doesn't work very well when we have to ship for many
different platforms.  Think about shipping .so/.dynlib files for both
Linux and Mac OS X.

Modification:

- Use os-maven-plugin as an extension to determine the current OS and
  CPU architecture reliable at build time
- Use Maven classifier instead of trying to put all shared libraries
  into a single JAR
- NativeLibraryLoader does not guess the OS and bit mode anymore and it
  always looks for the same location regardless of platform, because the
  Maven classifier does the job instead.

Result:

Better scalable native library deployment and retrieval
2014-05-02 04:21:29 +09:00
Trustin Lee
958a3b70ce Code clean-up
Motivation:

It is less confusing not to spread Thread.interrupt() calls.

Modification:

- Comments
- Move generatorThread.interrupt() to where currentThread.interrupt() is
  triggered

Result:

Code that is easier to read
2014-04-25 17:42:00 +09:00
Trustin Lee
2d9735817c Synchronized between 4.1 and master (part 3)
Motivation:

4 and 5 were diverged long time ago and we recently reverted some of the
early commits in master.  We must make sure 4.1 and master are not very
different now.

Modification:

Fix found differences

Result:

4.1 and master got closer.
2014-04-25 16:17:16 +09:00
Trustin Lee
d2614cfc01 Synchronized between 4.1 and master
Motivation:

4 and 5 were diverged long time ago and we recently reverted some of the
early commits in master.  We must make sure 4.1 and master are not very
different now.

Modification:

Fix found differences

Result:

4.1 and master got closer.
2014-04-25 00:36:01 +09:00
Trustin Lee
acc110e8bd Stop ThreadLocalRandom's initial seed generation immediately on interruption
Motivation:

ThreadLocalRandomTest reveals that ThreadLocalRandom's initial seed generation loop becomes tight if the thread is interrupted.
We currently interrupt ourselves inside the wait loop, which will raise an InterruptedException again in the next iteration, resulting in infinite (up to 3 seconds) exception construction and thread interruptions.

Modification:

- When the initial seed generator thread is interrupted, break out of the wait loop immediately.
- Log properly when the initial seed generation failed due to interruption.
- When failed to generate the initial seed, interrupt the generator thread just in case the SecureRandom implementation handles it properly.
- Make the initial seed generator thread daemon and handle potential exceptions raised due to the interruption.

Result:

No more tight loop on interruption.  More robust generator thread termination. Fixes #2412
2014-04-20 17:55:38 +09:00
Norman Maurer
31a36e09ad [#2353] Use a privileged block to get ClassLoader and System property if needed
Motivation:
When using System.getProperty(...) and various methods to get a ClassLoader it will fail when a SecurityManager is in place.

Modifications:
Use a priveled block if needed. This work is based in the PR #2353 done by @anilsaldhana .

Result:
Code works also when SecurityManager is present
2014-04-08 14:13:49 +02:00
Daniel Bevenius
96656564dc Fixing spelling/typo in javadocs for EventExecutor.
Motivation:
Minor spelling/typo correction for EventExecutor's newSucceededFuture and
newFailedFuture javadocs.

Modifications:
Just correcting the spelling/typo.

Result:
Improve the javadocs.
2014-03-23 16:29:51 +01:00
Trustin Lee
fe279d3f02 Use SecureRandom.generateSeed() to generate ThreadLocalRandom's initialSeedUniquifier
Motivation:

Previously, we used SecureRandom.nextLong() to generate the initialSeedUniquifier.  This required more entrophy than necessary because it has to 1) generate the seed of SecureRandom first and then 2) generate a random long integer.  Instead, we can use generateSeed() to skip the step (2)

Modifications:

Use generateSeed() instead of nextLong()

Result:

ThreadLocalRandom requires less amount of entrphy to start up
2014-03-21 13:43:49 +09:00
Trustin Lee
3621a0169c Reduce the time taken by NetUtil and DefaultChannelId class initialization
Motivation:

As reported in #2331, some query operations in NetworkInterface takes much longer time than we expected.  For example, specifying -Djava.net.preferIPv4Stack=true option in Window increases the execution time by more than 4 times.  Some Windows systems have more than 20 network interfaces, and this problem gets bigger as the number of unused (virtual) NICs increases.

Modification:

Use NetworkInterface.getInetAddresses() wherever possible.
Before iterating over all NICs reported by NetworkInterface, filter the NICs without proper InetAddresses.  This reduces the number of candidates quite a lot.
NetUtil does not query hardware address of NIC in the first place but uses InetAddress.isLoopbackAddress().
Do not call unnecessary query operations on NetworkInterface.  Just get hardware address and compare.

Result:

Significantly reduced class initialization time, which should fix #2331.
2014-03-21 13:01:55 +09:00
Trustin Lee
ab72dd7303 Fix and simplify freeing a direct buffer / Fix Android support
Motivation:

6e8ba291cf introduced a regression in Android because Android does not have sun.nio.ch.DirectBuffer (see #2330.)  I also found PlatformDependent0.freeDirectBuffer() and freeDirectBufferUnsafe() are pretty much same after the commit and the unsafe version should be removed.

Modifications:

- Do not use the pooled allocator in Android because it's too resource hungry for Androids.
- Merge PlatformDependent0.freeDirectBuffer() and freeDirectBufferUnsafe() into one method.
- Make the Unsafe unavailable when sun.nio.ch.DirectBuffer is unavailable.  We could keep the Unsafe available and handle the sun.nio.ch.DirectBuffer case separately, but I don't want to complicate our code just because of that.  All supported JDK versions have sun.nio.ch.DirectBuffer if the Unsafe is available.

Result:

Simpler code. Fixes Android support (#2330)
2014-03-20 11:12:15 +09:00
Norman Maurer
4c64bc0414 [#2307] Remove synchronized bottleneck in SingleThreadEventExecutor.execute(...)
Motivation:
Remove the synchronization bottleneck in startThread() which is called by each execute(..) call from outside the EventLoop.

Modifications:
Replace the synchronized block with the use of AtomicInteger and compareAndSet loops.

Result:
Less conditions during SingleThreadEventExecutor.execute(...)
2014-03-13 10:28:43 +01:00
Norman Maurer
fa062e06f0 Fix checkstyle errors introduced by f0d1bbd63e 2014-03-12 13:44:09 +01:00
Trustin Lee
f0d1bbd63e Add capacity limit to Recycler / Optimize when assertion is off
Motivation:

- As reported recently [1], Recycler's thread-local object pool has unbounded capacity which is a potential problem.
- It accesses a hash table on each push and pop for debugging purposes.  We don't really need it besides debugging Netty itself.

Modifications:

- Introduced the maxCapacity constructor parameter to Recycler.  The default default maxCapacity is retrieved from the system property whose default is 256K, which should be plenty for most cases.
- Recycler.Stack.map is now created and accessed only when assertion is enabled for Recycler.

Result:

- Recycler does not grow infinitely anymore.
- If assertion is disabled, Recycler should be much faster.

[1] https://github.com/netty/netty/issues/1841
2014-03-12 18:17:35 +09:00
Norman Maurer
90047b0b5f Use bitwise operations to choose next EventExecutor if number of EventExecutors is power of two 2014-03-10 20:52:36 +01:00
Jatinder
6b4cd6f7d1 [#2252] Fix bug where AppendableCharSequence private constructor does not set correct position 2014-03-03 20:03:14 +01:00
Andrew Gaul
aa5306adc1 Correct ConcurrentHashMapV8 bitwise arithmetic
Previously ConcurrentHashMapV8 evaulated ((x | 1) == 0), an expression
that always returned false.  This commit brings Netty closer to the
Java 8 implementation.
2014-03-03 06:45:06 +01:00
Norman Maurer
6efac6179e [#1259] Add optimized queue for SCMP pattern and use it in NIO and native transport
This queue also produces less GC then CLQ when make use of OneTimeTask
2014-02-27 13:56:15 +01:00
Norman Maurer
ecc8fb1b89 Fix a regression which could lead to GenericFutureListeners never been notifed. Part of [#2186].
This regression was introduced by commit c97f2d2de00ad74835067cb6f5a62cd4651d1161
2014-02-20 15:12:58 +01:00
Norman Maurer
e0299e1222 Introduce a native transport for linux using epoll ET
This transport use JNI (C) to directly make use of epoll in Edge-Triggered mode for maximal performance on Linux. Beside this it also support using TCP_CORK and produce less GC then the NIO transport using JDK NIO.
It only builds on linux and skip the build if linux is not used. The transport produce a jar which contains all needed .so files for 32bit and 64 bit. The user only need to include the jar as dependency as usually
to make use of it and use the correct classes.

This includes also some cleanup of @trustin
2014-02-15 22:42:07 +01:00
Trustin Lee
bf0ae49c4f Do not warn about Unsafe in Android 2014-02-14 12:07:07 -08:00
Trustin Lee
cf821683f8 Fix a bug where DefaultPromise.setUncancellable() returns a wrong value
- Fixes #2220 - again
- Missing negation
2014-02-10 11:47:54 -08:00
Trustin Lee
ffcf1fe3d0 Fix a bug where DefaultPromise.setUncancellable() returns a wrong value
- Fixes #2220
- Its Javadoc says it returns true when the promise is done (but not cancelled) or the promise is uncancellable, but it returns false when the promise is done.
2014-02-10 11:40:32 -08:00
Trustin Lee
c94e5cb1cb Fix the compilation error in ConcurrentHashMapV8 + JDK8 2014-02-08 08:56:03 -08:00
Trustin Lee
2ba00cd358 Log the rejected listener notification task under a dedicated logger name.
- Fixes #2166
- Some user applications are fine with the failure of notification
2014-02-07 10:23:06 -08:00
Norman Maurer
e7b800eb82 [#2187] Always do a volatile read on the refCnt 2014-02-07 09:37:31 +01:00
Trustin Lee
35ae000c59 Reduce code duplication in DefaultPromise 2014-02-06 22:29:53 -08:00
Trustin Lee
4628d9a672 Fix a race condition in DefaultPromise
.. which occurs when a user adds a listener from different threads after the promise is done and the notifications for the listeners, that were added before the promise is done, is in progress.  For instance:

   Thread-1: p.addListener(listenerA);
   Thread-1: p.setSuccess(null);
   Thread-2: p.addListener(listenerB);
   Thread-2: p.executor.execute(taskNotifyListenerB);
   Thread-1: p.executor.execute(taskNotifyListenerA);

taskNotifyListenerB should not really notify listenerB until taskNotifyListenerA is finished.

To fix this issue:

- Change the semantic of (listeners == null) to determine if the early
  listeners [1] were notified
- If a late listener is added before the early listeners are notified,
  the notification of the late listener is deferred until the early
  listeners are notified (i.e. until listeners == null)
- The late listeners with deferred notifications are stored in a lazily
  instantiated queue to preserve ordering, and then are notified once
  the early listeners are notified.

[1] the listeners that were added before the promise is done
[2] the listeners that were added after the promise is done
2014-02-06 22:05:06 -08:00
Trustin Lee
4a86446053 Fix the potential copyright issue in SocksCommonUtils
- Add StringUtil.toHexString() methods which are based on LoggingHandler's lookup table implementation, and use it wherever possible
2014-02-06 15:01:55 -08:00
Norman Maurer
7a1a30f0ad Provide an optimized AtomicIntegerFieldUpdater, AtomicLongFieldUpdater and AtomicReferenceFieldUpdater 2014-02-06 21:07:31 +01:00
Valentin Kovalenko
8f10e7791b Restore of interrupt status after catch of InterruptedException was added 2014-02-03 06:50:31 +01:00
Norman Maurer
1d8a200849 Not wakeup the EventLoop for writes as they will not cause a flush anyway 2014-02-01 15:00:14 +01:00
Trustin Lee
143951ac5f Remove code duplication 2014-01-29 12:12:46 +09:00
MiddleBen
f3f2d143e3 Simplify the acquisition of Cleaner 2014-01-29 12:00:09 +09:00
Trustin Lee
0f1b1be0aa Enable a user specify an arbitrary information with ReferenceCounted.touch()
- Related: #2163
- Add ResourceLeakHint to allow a user to provide a meaningful information about the leak when touching it
- DefaultChannelHandlerContext now implements ResourceLeakHint to tell where the message is going.
- Cleaner resource leak report by excluding noisy stack trace elements
2014-01-29 11:44:59 +09:00
Trustin Lee
3061b154c3 Fix inspector warnings 2014-01-28 20:17:13 +09:00
Trustin Lee
b887e35ac2 Add ReferenceCounted.touch() / Add missing retain() overrides
- Fixes #2163
- Inspector warnings
2014-01-28 20:06:55 +09:00
Trustin Lee
2be6d1bcc1 Disable Javassist completely on Android
- Related: #2127
- Inspector warnings
2014-01-21 14:25:31 +09:00
Veebs
73027521ce Correct JavaDoc 2014-01-13 17:39:21 +09:00
Trustin Lee
f3a842ecca [maven-release-plugin] prepare for next development iteration 2013-12-22 22:06:15 +09:00
Trustin Lee
888dfba76f [maven-release-plugin] prepare release netty-5.0.0.Alpha1 2013-12-22 22:06:06 +09:00
Trustin Lee
04ec2e1330 Add Recycler.Handle.recycle() so that it's possible to recycle an object without an explicit reference to Recycler 2013-12-19 01:10:52 +09:00
Trustin Lee
ee92a12ed5 Remove unnecessary check in DefaultPromise.await0()
- Fixes #2032
- Fix inspection warnings
2013-12-16 15:15:53 +09:00
Trustin Lee
cc295107b3 Prevent NPE from StringUtil.simpleName(..) 2013-12-16 13:54:23 +09:00
Norman Maurer
ee17139a03 [#2053] Do not allow < 1 on AppendableCharSequence init. 2013-12-11 10:18:49 +01:00
Trustin Lee
d53f7595d3 Fix checkstyle 2013-12-08 14:05:04 +09:00
Trustin Lee
4802c785f6 Add convenient logging methods for logging exceptions quickly
.. Mainly useful for writing tests or ad-hoc debugging
2013-12-08 13:20:52 +09:00