110 Commits

Author SHA1 Message Date
Norman Maurer
20e32f62ec Directly receive remote address when call accept(...)
Motivation:

There is a small race in the native transport where an accept(...) may success but a later try to obtain the remote address from the fd may fail is the fd is already closed.

Modifications:

Let accept(...) directly set the remote address.

Result:

No more race possible.
2015-02-23 14:54:35 +01:00
Norman Maurer
b016b14e24 Correctly handle autoRead == false when epoll LT is used
Motivation:

When epoll LT is used and autoRead == false when entering epollIn() we need to return without reading any data.

Modifications:

Correctly respect autoRead == false if using epoll LT.

Result:

Consistent and correct behaviour.
2015-02-23 07:44:55 +01:00
Norman Maurer
c3f7a71a73 [#3438] Throw pre-instanced IOException on connection reset
Motivation:

In the native transport we should throw a pre-instanced IOException on connection reset while reading.

Modifications:

Correctly throw pre-instanced IOException when ECONNRESET is received

Result:

Less overhead on connection reset
2015-02-21 21:33:09 +01:00
Norman Maurer
98925a1fa4 Move generic unix classes/interfaces out of epoll package
Motivation:

As we plan to have other native transports soon (like a kqueue transport) we should move unix classes/interfaces out of the epoll package so we
introduce other implementations without breaking stuff before the next stable release.

Modifications:

Create a new io.netty.channel.unix package and move stuff over there.

Result:

Possible to introduce other native impls beside epoll.
2015-02-17 18:41:18 +01:00
Norman Maurer
6b941e9bdb Allow to create Epoll*Channel from FileDescriptor
Motivation:

Sometimes it's useful to be able to create a Epoll*Channel from an existing file descriptor. This is especially helpful if you integrade some c/jni code.

Modifications:

- Add extra constructor to Epoll*Channel implementations that take a FileDescriptor as an argument
- Make Rename EpollFileDescriptor to NativeFileDescriptor and make it public
- Also ensure we obtain the correct remote/local address when create a Channel from a FileDescriptor

Result:

It's now possible to create a FileDescriptor and instance a Epoll*Channel via it.
2015-02-09 09:56:33 +01:00
Norman Maurer
ad7deb8dc1 Not execute shutdownOutput(...) and close(...) in the EventLoop if SO_LINGER is used.
Motivation:

If SO_LINGER is used shutdownOutput() and close() syscalls will block until either all data was send or until the timeout exceed. This is a problem when we try to execute them on the EventLoop as this means the EventLoop may be blocked and so can not process any other I/O.

Modifications:

- Add AbstractUnsafe.closeExecutor() which returns null by default and use this Executor for close if not null.
- Override the closeExecutor() in NioSocketChannel and EpollSocketChannel and return GlobalEventExecutor.INSTANCE if getSoLinger() > 0
- use closeExecutor() in shutdownInput(...) in NioSocketChannel and EpollSocketChannel

 Result:

No more blocking of the EventLoop if SO_LINGER is used and shutdownOutput() or close() is called.
2015-02-08 20:48:37 +01:00
Norman Maurer
8f61e4cdd6 Give compiler hint about inline functions
Motivation:

Some of the methods are frequently called and so should be inlined if possible.

Modifications:

Give the compiler a hint that we want to inline these methods.

Result:

Better performance if inlined.
2015-02-08 13:57:47 +01:00
Norman Maurer
75630dd892 Cleanup code. Part of [#3398] 2015-02-08 13:04:34 +01:00
Norman Maurer
340287f377 Add workaround for bug in older linux kernels handling epoll_wait(...)
Motivation:

Older linux kernels have problems handling a large value for epoll_wait(...) and so wait for ever.

Modifications:

Adjust timeout on the fly if a too big value is passed in.

Result:

Correctly works also on older kernels.
2015-02-08 11:37:21 +01:00
Norman Maurer
111781f38f Respect ChannelConfig.getWriteSpinCount() when using epoll transport
Motivation:

The writeSpinCount was ignored in the epoll transport and it just kept on trying writing. This could cause unnessary cpu spinning if a slow remote peer was reading the data very very slow.

Modification:

- Correctly take writeSpinCount into account when writing.

Result:

Less cpu spinning when writing to a slow remote peer.
2015-02-08 11:08:40 +01:00
Norman Maurer
c62e6b676f Correctly set EPOLLRDHUP for all stream channels.
Motivation:

Fix regression introduced by 585ce1593fdccc5a8d868a96c7643e0d63b1e21b, which missed to set EPOLLRDHUP for all stream channels.

Modifications:

Correctly set EPOLLRDHUP for all stream channels in the AbstractEpollStreamChannel constructor.

Result:

No more test failures in EpollDomain*Channel tests.
2015-02-08 08:50:23 +01:00
Norman Maurer
ffc862dd31 Faster event processing when epoll transport is used
Motivation:

Before we used a long[] to store the ready events, this had a few problems and limitations:
 - An extra loop was needed to translate between epoll_event and our long
 - JNI may need to do extra memory copy if the JVM not supports pinning
 - More branches

Modifications:

- Introduce a EpollEventArray which allows to directly write in a struct epoll_event* and pass it to epoll_wait.

Result:

Better speed when using native transport, as shown in the benchmark.

Before:
[xxx@xxx wrk]$ ./wrk -H 'Connection: keep-alive' -d 120 -c 256 -t 16 -s scripts/pipeline-many.lua  http://xxx:8080/plaintext
Running 2m test @ http://xxx:8080/plaintext
 16 threads and 256 connections
 Thread Stats   Avg      Stdev     Max   +/- Stdev
   Latency    14.56ms    8.64ms 117.15ms   80.58%
   Req/Sec   286.17k    38.71k  421.48k    68.17%
 546324329 requests in 2.00m, 73.78GB read
Requests/sec: 4553438.39
Transfer/sec:    629.66MB

After:
[xxx@xxx wrk]$ ./wrk -H 'Connection: keep-alive' -d 120 -c 256 -t 16 -s scripts/pipeline-many.lua  http://xxx:8080/plaintext
Running 2m test @ http://xxx:8080/plaintext
 16 threads and 256 connections
 Thread Stats   Avg      Stdev     Max   +/- Stdev
   Latency    14.12ms    8.69ms 100.40ms   83.08%
   Req/Sec   294.79k    40.23k  472.70k    66.75%
 555997226 requests in 2.00m, 75.08GB read
Requests/sec: 4634343.40
Transfer/sec:    640.85MB
2015-02-08 08:49:46 +01:00
Norman Maurer
d1a60e682a Allow to change epoll mode
Motivation:
Netty uses edge-triggered epoll by default for performance reasons. The downside here is that a messagesPerRead limit can not be enforced correctly, as we need to consume everything from the channel when notified.

Modification:
- Allow to switch epoll modes before channel is registered
- Some refactoring to share more code

Result:
It's now possible to switch epoll mode.
2015-02-04 21:41:58 +01:00
Norman Maurer
cabecee127 Allow to recv and send file descriptors when using EpollDomainSocketChannel.
Motiviation:

When using domain sockets on linux it is supported to recv and send file descriptors. This can be used to pass around for example sockets.

Modifications:
- Add support for recv and send file descriptors when using EpollDomainSocketChannel.
- Allow to obtain the file descriptor for an Epoll*Channel so it can be send via domain sockets.

Result:
recv and send of file descriptors is supported now.
2015-02-04 20:24:09 +01:00
Norman Maurer
19ec0f3997 Add support for Unix Domain Sockets when using native epoll transport
Motivation:

Using Unix Domain Sockets can be very useful when communication should take place on the same host and has less overhead then using loopback. We should support this with the native epoll transport.

Modifications:

- Add support for Unix Domain Sockets.
- Adjust testsuite to be able to reuse tests.

Result:

Unix Domain Sockets are now support when using native epoll transport.
2015-02-04 15:47:26 +01:00
Norman Maurer
e8b1f033fb [#3378] Automatically increase number of possible handled events
Motivation:

At the moment the max number of events that can be handled per epoll wakup was set during construction.

Modifications:

- Automatically increase the max number of events to handle

Result:

Better performance when a lot of events need to be handled without adjusting the code.
2015-01-30 07:05:21 +01:00
Norman Maurer
94c6c83318 [#3377] Faster overflow guard when generate nextId in EpollEventLoop
Motivation:

The current way how the guard against overflow when generating the nextId() is pretty slow once an overflow happened.

Modifications:

Once a possible overflow is detected all ids used by the EpollEventLoop are scrubed and re-assigned to the registered Channels. This way we only need to do extra work each time an overflow is detected.

Result:

More consistent performance even after the first overflow was detected.
2015-01-30 06:19:23 +01:00
Norman Maurer
201d9ed9ba [#3112] Add supprt for TCP_INFO when using EpollSocketChannel
Motivation:

On Linux, you can gather various metrics using getsockopt(..., TCP_INFO,
...).

Modifications:

Add EpollSocketChannel.tcpInfo() which returns EpollTcpInfo that exposes
all metrics exposed via getsockopt(..., TCP_INFO, ...)

Result:

TCP_INFO support implemented
2015-01-27 07:07:18 +01:00
Norman Maurer
ca05dcd65e Fix NPE when remote address can not be obtained
Motivation:

In the native transport we use getpeername to obtain the remote address from the file descriptor. This may fail for various reasons in which case NULL is returned.

Modifications:

- Check for null when try to obtain remote / local address

Result:

No more NPE
2015-01-26 10:43:35 +01:00
Trustin Lee
dc3bec4a25 Fix duplicate channelReadComplete() in EpollDatagramChannel 2014-12-31 19:14:03 +09:00
Trustin Lee
427ff76c8a Fire channelReadComplete() in EpollDatagramChannel
Related: #3274

Motivation:

channelReadComplete() event is not triggered after reading successfully
in EpollDatagramChannel.

Modifications:

- Trigger exceptionCaught() event for read failure only once for less
  noise
- Trigger channelReadComplete() event at the end of the read.

Result:

Fix #3274
2014-12-31 17:35:35 +09:00
Trustin Lee
2c55089327 Clean up the exception messages
- Consistency
- Use the method name of the current scope if possible
2014-12-30 12:52:24 +09:00
Trustin Lee
2a717bd50e Throw exceptions outside the native code
Rebased and cleaned-up based on the work by @normanmaurer

Motivation:

Currently, IOExceptions and ClosedChannelExceptions are thrown from
inside the JNI methods. Instantiation of Java objects inside JNI code is
an expensive operation, needless to say about filling stack trace for
every instantiation of an exception.

Modifications:

Change most JNI methods to return a negative value on failure so that
the exceptions are instantiated outside the native code.

Also, pre-instantiate some commonly-thrown exceptions for better
performance.

Result:

Performance gain
2014-12-30 12:25:29 +09:00
Norman Maurer
bb3aed78cd Remove bottleneck while create InetSocketAddress in native transport
Motivation:

Everytime a new connection is accepted via EpollSocketServerChannel it will create a new EpollSocketChannel that needs to get the remote and local addresses in the constructor. The current implementation uses new InetSocketAddress(String, int) to create these. This is quite slow due the implementation in oracle and openjdk.

Modifications:

Encode all needed informations into a byte array before return from jni layer and then use new InetSocketAddress(InetAddress, int) to create the socket addresses. This allows to create the InetAddress via a byte[] and so reduce the overhead, this is done either by using InetAddress.getByteAddress(byte[]) or by Inet6Address.getByteAddress(String, byte[], int).

Result:

Reduce performance overhead while accept new connections with native transport
2014-12-12 18:23:18 +01:00
Trustin Lee
d838b19b6f Test TLS renegotiation with explicit cipher suite change
Motivation:

So far, our TLS renegotiation test did not test changing cipher suite
during renegotiation explicitly.

Modifications:

- Switch the cipher suite during renegotiation

Result:

We are now sure the cipher suite change works.
2014-12-12 17:43:23 +09:00
Norman Maurer
140a32bcb3 Allow to lazy create a DefaultFileRegion from a File
Motivation:

We only provided a constructor in DefaultFileRegion that takes a FileChannel which means the File itself needs to get opened on construction. This has the problem that if you want to write a lot of Files very fast you may end up with may open FD's even if they are not needed yet. This can lead to hit the open FD limit of the OS.

Modifications:

Add a new constructor to DefaultFileRegion which allows to construct it from a File. The FileChannel will only be obtained when transferTo(...) is called or the DefaultFileRegion is explicit open'ed via open() (this is needed for the native epoll transport)

Result:

Less resource usage when writing a lot of DefaultFileRegion.
2014-12-11 12:06:52 +01:00
Trustin Lee
d1612f67ad Add SslHandler.renegotiate()
Related: #3125

Motivation:

We did not expose a way to initiate TLS renegotiation and to get
notified when the renegotiation is done.

Modifications:

- Add SslHandler.renegotiate() so that a user can initiate TLS
  renegotiation and get the future that's notified on completion
- Make SslHandler.handshakeFuture() return the future for the most
  recent handshake so that a user can get the future of the last
  renegotiation
- Add the test for renegotiation to SocketSslEchoTest

Result:

Both client-initiated and server-initiated renegotiations are now
supported properly.
2014-12-10 18:47:53 +09:00
Trustin Lee
fa248cecb5 Name resolver API and DNS-based name resolver
Motivation:

So far, we relied on the domain name resolution mechanism provided by
JDK.  It served its purpose very well, but had the following
shortcomings:

- Domain name resolution is performed in a blocking manner.
  This becomes a problem when a user has to connect to thousands of
  different hosts. e.g. web crawlers
- It is impossible to employ an alternative cache/retry policy.
  e.g. lower/upper bound in TTL, round-robin
- It is impossible to employ an alternative name resolution mechanism.
  e.g. Zookeeper-based name resolver

Modification:

- Add the resolver API in the new module: netty-resolver
- Implement the DNS-based resolver: netty-resolver-dns
  .. which uses netty-codec-dns
- Make ChannelFactory reusable because it's now used by
  io.netty.bootstrap, io.netty.resolver.dns, and potentially by other
  modules in the future
  - Move ChannelFactory from io.netty.bootstrap to io.netty.channel
  - Deprecate the old ChannelFactory
  - Add ReflectiveChannelFactory

Result:

It is trivial to resolve a large number of domain names asynchronously.
2014-10-16 17:10:36 +09:00
Trustin Lee
d9739126f2 Finish porting netty-proxy module
- Fix problems introduced by de9c81bf6e021ec809d9a52d1f5895682c2cb59d
- Fix inspector warnings
2014-10-14 12:46:35 +09:00
Trustin Lee
e4ab108a10 Add AbstractUnsafe.annotateConnectException()
Motivation:

JDK's exception messages triggered by a connection attempt failure do
not contain the related remote address in its message.  We currently
append the remote address to ConnectException's message, but I found
that we need to cover more exception types such as SocketException.

Modifications:

- Add AbstractUnsafe.annotateConnectException() to de-duplicate the
  code that appends the remote address

Result:

- Less duplication
- A transport implementor can annotate connection attempt failure
  message more easily
2014-10-14 12:38:12 +09:00
Norman Maurer
fc45473e79 [#2926] Fix 1 byte memory leak in native transport
Motivation:

We use malloc(1) in the on JNI_OnLoad method but never free the allocated memory. This means we have a tiny memory leak of 1 byte.

Modifications:

Call free(...) on previous allocated memory.

Result:

Fix memory leak
2014-09-22 15:12:20 +02:00
Scott Mitchell
94deea409e Fix Native EPOLL Build Failure
Motiviation:
If sendmmsg is already defined then the native epoll module failed to build because of conflicting definitions.
The mmsghdr type was also redefined on systems that already supported this structure.

Modifications:
Provide a way so that systems which already define sendmmsg and mmsghdr can build
Provide a way so that systems which don't define sendmmsg and mmsghdr can build

Result:
The native EPOLL module can build in more environments
2014-09-17 20:56:12 +02:00
Jakob Buchgraber
b72a05edb4 Shutdown Executor on MultithreadEventLoopGroup shutdown. Fixes #2837
Motivation:

Currently the Executor created by (Nio|Epoll)EventLoopGroup is not correctly shutdown.
This might lead to resource shortages, due to resources not being freed asap.

Modifications:

If (Nio|Epoll)EventLoopGroup create their internal Executor via a constructor
provided `ExecutorServiceFactory` object or via
MultithreadEventLoopGroup.newDefaultExecutorService(...) the ExecutorService.shutdown()
method will be called after (Nio|Epoll)EventLoopGroup is shutdown.

ExecutorService.shutdown() will not be called if the Executor object was passed
to the (Nio|Epoll)EventLoopGroup (that is, it was instantiated outside of Netty).

Result:

Correctly release resources on (Nio|Epoll)EventLoopGroup shutdown.
2014-09-12 12:17:29 +09:00
Norman Maurer
2ba9f824c2 Directly write CompositeByteBuf if possible without memory copy. Related to [#2719]
Motivation:

In linux it is possible to write more then one buffer withone syscall when sending datagram messages.

Modifications:

Not copy CompositeByteBuf if it only contains direct buffers.

Result:

More performance due less overhead for copy.
2014-09-10 14:33:29 +02:00
Norman Maurer
d5b9f58f1f Add support for sendmmsg(...) and so allow to write multiple DatagramPackets with one syscall. Related to [#2719]
Motivation:

On linux with glibc >= 2.14 it is possible to send multiple DatagramPackets with one syscall. This can be a huge performance win and so we should support it in our native transport.

Modification:

- Add support for sendmmsg by reuse IovArray
- Factor out ThreadLocal support of IovArray to IovArrayThreadLocal for better separation as we use IovArray also without ThreadLocal in NativeDatagramPacketArray now
- Introduce NativeDatagramPacketArray which is used for sendmmsg(...)
- Implement sendmmsg(...) via jni
- Expand DatagramUnicastTest to test also sendmmsg(...)

Result:

Netty now automatically use sendmmsg(...) if it is supported and we have more then 1 DatagramPacket in the ChannelOutboundBuffer and flush() is called.
2014-09-09 10:03:17 +02:00
Norman Maurer
f1f14f524a Allow to write CompositeByteBuf directly via EpollDatagramChannel. Related to [#2719]
Motivation:

On linux it is possible to use the sendMsg(...) system call to write multiple buffers with one system call when using datagram/udp.

Modifications:

- Implement the needed changes and make use of sendMsg(...) if possible for max performance
- Add tests that test sending datagram packets with all kind of different ByteBuf implementations.

Result:

Performance improvement when using CompoisteByteBuf and EpollDatagramChannel.
2014-09-09 09:50:42 +02:00
Norman Maurer
7e48801f71 [#2867] Workaround performance issue with IPv4-mapped-on-IPv6 addresses
Motivation:

InetAddress.getByName(...) uses exceptions for control flow when try to parse IPv4-mapped-on-IPv6 addresses. This is quite expensive.

Modifications:

Detect IPv4-mapped-on-IPv6 addresses in the JNI level and convert to IPv4 addresses before pass to InetAddress.getByName(...) (via InetSocketAddress constructor).

Result:

Eliminate performance problem causes by exception creation when parsing IPv4-mapped-on-IPv6 addresses.
2014-09-09 07:24:17 +02:00
Norman Maurer
e166780f0b [#2823] Writing DefaultFileRegion with EpollSocketChannel may cause hang
Motivation:

In EpollSocketchannel.doWriteFileRegion(...) we need to make sure we write until sendFile(...) returns either 0 or all is written. Otherwise we may not get notified once the Channel is writable again.

This is the case as we use EPOLL_ET.

Modifications:

Always write until either sendFile returns 0 or all is written.

Result:

No more hangs when writing DefaultFileRegion can happen.
2014-08-26 15:09:27 +02:00
Norman Maurer
ee198d7cfc Allow efficient writing of CompositeByteBuf when using native epoll transport.
Motivation:

There were no way to efficient write a CompositeByteBuf as we always did a memory copy to a direct buffer in this case. This is not needed as we can just write a CompositeByteBuf as long as all the components are buffers with a memory address.

Modifications:

- Write CompositeByteBuf which contains only direct buffers without memory copy
- Also handle CompositeByteBuf that have more components then 1024.

Result:

More efficient writing of CompositeByteBuf.
2014-08-21 10:58:42 +02:00
Trustin Lee
2ded0dd6fe Fix data corruption in FileRegion transfer with epoll transport
Related issue: #2764

Motivation:

EpollSocketChannel.writeFileRegion() does not handle the case where the
position of a FileRegion is non-zero properly.

Modifications:

- Improve SocketFileRegionTest so that it tests the cases where the file
  transfer begins from the middle of the file
- Add another jlong parameter named 'base_off' so that we can take the
  position of a FileRegion into account

Result:

Improved test passes. Corruption is gone.
2014-08-13 16:58:35 -07:00
Norman Maurer
c5b6808645 Allow to obtain RecvByteBufAllocator.Handle to allow more flexible implementations
Motivation:

At the moment it's only possible for a user to set the RecvByteBufAllocator for a Channel but not access the Handle once it is assigned. This makes it hard to write more flexible implementations.

Modifications:

Add a new method to the Channel.Unsafe to allow access the the used Handle for the Channel. The RecvByteBufAllocator.Handle is created lazily.

Result:

It's possible to write more flexible implementatons that allow to adjust stuff on the fly for a Handle that is used by a Channel
2014-08-12 06:54:29 +02:00
Jakob Buchgraber
220660e351 Deregistration of a Channel from an EventLoop
Motivation:

After a channel is created it's usually assigned to an
EventLoop. During the lifetime of a Channel the
EventLoop is then responsible for processing all I/O
and compute tasks of the Channel.

For various reasons (e.g. load balancing) a user might
require the ability for a Channel to be assigned to
another EventLoop during its lifetime.

Modifications:

Introduce under the hood changes that ensure that Netty's
thread model is obeyed during and after the deregistration
of a channel.

Ensure that tasks (one time and periodic) are executed by
the right EventLoop at all times.

Result:

A Channel can be deregistered from one and re-registered with
another EventLoop.
2014-08-11 15:28:46 -07:00
Jakob Buchgraber
f8bee2e94c Make Nio/EpollEventLoop run on a ForkJoinPool
Related issue: #2250

Motivation:

Prior to this commit, Netty's non blocking EventLoops
were each assigned a fixed thread by which all of the
EventLoop's I/O and handler logic would be performed.

While this is a fine approach for most users of Netty,
some advanced users require more flexibility in
scheduling the EventLoops.

Modifications:

Remove all direct usages of threads in MultithreadEventExecutorGroup,
SingleThreadEventExecutor et al., and introduce an Executor
abstraction instead.

The way to think about this change is, that each
iteration of an eventloop is now a task that gets scheduled
in a ForkJoinPool.

While the ForkJoinPool is the default, one also has the
ability to plug in his/her own Executor (aka thread pool)
into a EventLoop(Group).

Result:

Netty hands off thread management to a ForkJoinPool by default.
Users can also provide their own thread pool implementation and
get some control over scheduling Netty's EventLoops
2014-08-11 15:28:46 -07:00
Norman Maurer
4a3ef90381 Port ChannelOutboundBuffer and related changes from 4.0
Motivation:

We did various changes related to the ChannelOutboundBuffer in 4.0 branch. This commit port all of them over and so make sure our branches are synced in terms of these changes.

Related to [#2734], [#2709], [#2729], [#2710] and [#2693] .

Modification:
Port all changes that was done on the ChannelOutboundBuffer.

This includes the port of the following commits:
 - 73dfd7c01b49aca006a34cc48197dee3fc360af1
 - 997d8c32d23f2d88903b7b607360907b99101002
 - e282e504f17b0874719ff606c728494e3509b1a0
 - 5e5d1a58fd3159c04ac7d10edfb8ed7a83d3935e
 - 8ee3575e72d6ee000a99c717d96f36695a8667a0
 - d6f0d12a8692c095df43b2a4462cbc97cf5c5a2d
 - 16e50765d1fb99005ad761409c28dcedf477531b
 - 3f3e66c31ae3da70c36cc125ca9bcac8215390e4

Result:
 - Less memory usage by ChannelOutboundBuffer
 - Same code as in 4.0 branch
 - Make it possible to use ChannelOutboundBuffer with Channel implementation that not extends AbstractChannel
2014-08-05 15:00:56 +02:00
Trustin Lee
730bfd4fdb Add more utility methods to check the availability of the epoll transport
Related issue: #2733

Motivation:

Unlike OpenSsl, Epoll lacks a couple useful availability checker
methods:

- ensureAvailability()
- unavailabilityCause()

Modifications:

Add missing methods

Result:

More ways to check the availability and to get the cause of
unavailability programatically.
2014-08-04 15:05:13 -07:00
Norman Maurer
410b5623ff Use correct exception message when throw exception from native code
Motivation:

We sometimes not use the correct exception message when throw it from the native code.

Modifications:

Fixed the message.

Result:

Correct message in exception
2014-07-28 13:34:09 -07:00
Norman Maurer
2131edadf2 [#2692] Allows notify ChannelFutureProgressListener on complete writes
Motivation:

We have some inconsistency when handling writes. Sometimes we call ChannelOutboundBuffer.progress(...) also for complete writes and sometimes not. We should call it always.

Modifications:

Correctly call ChannelOuboundBuffer.progress(...) for complete and incomplete writes.

Result:

Consistent behavior
2014-07-28 04:20:04 -07:00
Norman Maurer
30295138fe Correctly write single ByteBuf with memoryAddress
Motivation:

While optimize gathering writes I introduced a bug when writing single ByteBuf that have a memoryAddress. This regression was introduced by 88bd6e7a9300073707f305409fa6481f1eeb2077.

Modifications:

Correctly use the writerIndex as argument when call Native.writeAddress(...)

Result:

No more corruption while write single buffers.
2014-07-25 17:29:31 +02:00
Norman Maurer
ca0571cc9d Optimize native transport for gathering writes
Motivation:

While benchmarking the native transport with gathering writes I noticed that it is quite slow. This is due the fact that we need to do a lot of array copies to get the buffers into the iov array.

Modification:

Introduce a new class calles IovArray which allows to fill buffers directly in a iov array that can be passed over to JNI without any array copies. This gives a nice optimization in terms of speed when doing gathering writes.

Result:

Big performance improvement when doing gathering writes. See the included benchmark...

Before:
[nmaurer@xxx]~% wrk/wrk -H 'Host: localhost' -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8' -H 'Connection: keep-alive' -d 120 -c 256 -t 16 --pipeline 256  http://xxx:8080/plaintext
Running 2m test @ http://xxx:8080/plaintext
  16 threads and 256 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    23.44ms   16.37ms 259.57ms   91.77%
    Req/Sec   181.99k    31.69k  304.60k    78.12%
  346544071 requests in 2.00m, 46.48GB read
Requests/sec: 2887885.09
Transfer/sec:    396.59MB

With this change:
[nmaurer@xxx]~% wrk/wrk -H 'Host: localhost' -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8' -H 'Connection: keep-alive' -d 120 -c 256 -t 16 --pipeline 256  http://xxx:8080/plaintext
Running 2m test @ http://xxx:8080/plaintext
  16 threads and 256 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    21.93ms   16.33ms 305.73ms   92.34%
    Req/Sec   194.56k    33.75k  309.33k    77.04%
  369617503 requests in 2.00m, 49.57GB read
Requests/sec: 3080169.65
Transfer/sec:    423.00MB
2014-07-25 09:54:41 +02:00
Norman Maurer
860e475a18 [#2685] Epoll transport should use GetPrimitiveArrayCritical / ReleasePrimitiveArrayCritical
Motivation:

At the moment we use Get*ArrayElement all the time in the epoll transport which may be wasteful as the JVM may do a memory copy for this. For code-path that will get executed fast (without blocking) we should better make use of GetPrimitiveArrayCritical and ReleasePrimitiveArrayCritical as this signal the JVM that we not want to do any memory copy if not really needed. It is important to only do this on non-blocking code-path as this may even suspend the GC to disallow the JVM to move the arrays around.

See also http://docs.oracle.com/javase/7/docs/technotes/guides/jni/spec/functions.html#GetPrimitiveArrayCritical

Modification:

Make use of GetPrimitiveArrayCritical / ReleasePrimitiveArrayCritical as replacement for Get*ArrayElement / Release*ArrayElement where possible.

Result:

Better performance due less memory copies.
2014-07-21 07:13:41 +02:00