Commit Graph

234 Commits

Author SHA1 Message Date
Scott Mitchell
1485a87e25 Http2ConnectionHandler and Http2FrameListener cyclic dependency
Motivation:
It is often the case that implementations of Http2FrameListener will want to send responses when data is read. The Http2FrameListener needs access to the Http2ConnectionHandler (or the encoder contained within) to be able to send responses. However the Http2ConnectionHandler requires a Http2FrameListener instance to be passed in during construction time. This creates a cyclic dependency which can make it difficult to cleanly accomplish this relationship.

Modifications:
- Add Http2ConnectionDecoder.frameListener(..) method to set the frame listener. This will allow the listener to be set after construction.

Result:
Classes which inherit from Http2ConnectionHandler can more cleanly set the Http2FrameListener.
2015-09-30 15:41:15 -07:00
Scott Mitchell
0e9545e94d Http2RemoteFlowController stream writibility listener
Motivation:
For implementations that want to manage flow control down to the stream level it is useful to be notified when stream writability changes.

Modifications:
- Add writabilityChanged to Http2RemoteFlowController.Listener
- Add isWritable to Http2RemoteFlowController

Result:
The Http2RemoteFlowController provides notification when writability of a stream changes.
2015-09-28 13:47:24 -07:00
nmittler
3ee44a3dbb Update Netty to latest netty-tcnative
Motivation:

The latest netty-tcnative fixes a bug in determining the version of the runtime openssl lib.  It also publishes an artificact with the classifier linux-<arch>-fedora for fedora-based systems.

Modifications:

Modified the build files to use the "-fedora" classifier when appropriate for tcnative. Care is taken, however, to not change the classifier for the native epoll transport.

Result:

Netty is updated the the new shiny netty-tcnative.
2015-09-18 12:07:21 -07:00
Norman Maurer
34de2667c7 [maven-release-plugin] prepare for next development iteration 2015-09-02 11:45:20 +02:00
Norman Maurer
2eb444ec1d [maven-release-plugin] prepare release netty-4.1.0.Beta6 2015-09-02 11:36:11 +02:00
Scott Mitchell
ba6ce5449e Headers Performance Boost and Interface Simplification
Motivation:
A degradation in performance has been observed from the 4.0 branch as documented in https://github.com/netty/netty/issues/3962.

Modifications:
- Simplify Headers class hierarchy.
- Restore the DefaultHeaders to be based upon DefaultHttpHeaders from 4.0.
- Make various other modifications that are causing hot spots.

Result:
Performance is now on par with 4.0.
2015-08-17 08:50:11 -07:00
Jakob Buchgraber
6fd0a0c55f Faster and more memory efficient headers for HTTP, HTTP/2, STOMP and SPYD. Fixes #3600
Motivation:

We noticed that the headers implementation in Netty for HTTP/2 uses quite a lot of memory
and that also at least the performance of randomly accessing a header is quite poor. The main
concern however was memory usage, as profiling has shown that a DefaultHttp2Headers
not only use a lot of memory it also wastes a lot due to the underlying hashmaps having
to be resized potentially several times as new headers are being inserted.

This is tracked as issue #3600.

Modifications:
We redesigned the DefaultHeaders to simply take a Map object in its constructor and
reimplemented the class using only the Map primitives. That way the implementation
is very concise and hopefully easy to understand and it allows each concrete headers
implementation to provide its own map or to even use a different headers implementation
for processing requests and writing responses i.e. incoming headers need to provide
fast random access while outgoing headers need fast insertion and fast iteration. The
new implementation can support this with hardly any code changes. It also comes
with the advantage that if the Netty project decides to add a third party collections library
as a dependency, one can simply plug in one of those very fast and memory efficient map
implementations and get faster and smaller headers for free.

For now, we are using the JDK's TreeMap for HTTP and HTTP/2 default headers.

Result:

- Significantly fewer lines of code in the implementation. While the total commit is still
  roughly 400 lines less, the actual implementation is a lot less. I just added some more
  tests and microbenchmarks.

- Overall performance is up. The current implementation should be significantly faster
  for insertion and retrieval. However, it is slower when it comes to iteration. There is simply
  no way a TreeMap can have the same iteration performance as a linked list (as used in the
  current headers implementation). That's totally fine though, because when looking at the
  benchmark results @ejona86 pointed out that the performance of the headers is completely
  dominated by insertion, that is insertion is so significantly faster in the new implementation
  that it does make up for several times the iteration speed. You can't iterate what you haven't
  inserted. I am demonstrating that in this spreadsheet [1]. (Actually, iteration performance is
  only down for HTTP, it's significantly improved for HTTP/2).

- Memory is down. The implementation with TreeMap uses on avg ~30% less memory. It also does not
  produce any garbage while being resized. In load tests for GRPC we have seen a memory reduction
  of up to 1.2KB per RPC. I summarized the memory improvements in this spreadsheet [1]. The data
  was generated by [2] using JOL.

- While it was my original intend to only improve the memory usage for HTTP/2, it should be similarly
  improved for HTTP, SPDY and STOMP as they all share a common implementation.

[1] https://docs.google.com/spreadsheets/d/1ck3RQklyzEcCLlyJoqDXPCWRGVUuS-ArZf0etSXLVDQ/edit#gid=0
[2] https://gist.github.com/buchgr/4458a8bdb51dd58c82b4
2015-08-04 17:12:24 -07:00
Scott Mitchell
a7713069a1 HttpObjectDecoder performance improvements
Motivation:
The HttpObjectDecoder is on the hot code path for the http codec. There are a few hot methods which can be modified to improve performance.

Modifications:
- Modify AppendableCharSequence to provide unsafe methods which don't need to re-check bounds for every call.
- Update HttpObjectDecoder methods to take advantage of new AppendableCharSequence methods.

Result:
Peformance boost for decoding http objects.
2015-07-29 23:26:26 -07:00
Scott Mitchell
9747ffe5fc HTTP/2 Flow Controller should use Channel.isWritable()
Motivation:
See #3783

Modifications:
- The DefaultHttp2RemoteFlowController should use Channel.isWritable() before attempting to do any write operations.
- The Flow controller methods should no longer take ChannelHandlerContext. The concept of flow control is tied to a connection and we do not support 1 flow controller keeping track of multiple ChannelHandlerContext.

Result:
Writes are delayed until isWritable() is true. Flow controller interface methods are more clear as to ChannelHandlerContext restrictions.
2015-07-16 14:38:48 -07:00
Louis Ryan
05ce33f5ca Make the flow-controllers write fewer, fatter frames to improve throughput.
Motivation:

Coalescing many small writes into a larger DATA frame reduces framing overheads on the wire and reduces the number of calls to Http2FrameListeners on the remote side.
Delaying the write of WINDOW_UPDATE until flush allows for more consumed bytes to be returned as the aggregate of consumed bytes is returned and not the amount consumed when the threshold was crossed.

Modifications:
- Remote flow controller no longer immediately writes bytes when a flow-controlled payload is enqueued. Sequential data payloads are now merged into a single CompositeByteBuf which are written when 'writePendingBytes' is called.
- Listener added to remote flow-controller which observes written bytes per stream.
- Local flow-controller no longer immediately writes WINDOW_UPDATE when the ratio threshold is crossed. Now an explicit call to 'writeWindowUpdates' triggers the WINDOW_UPDATE for all streams who's ratio is exceeded at that time. This results in
  fewer window updates being sent and more bytes being returned.
- Http2ConnectionHandler.flush triggers 'writeWindowUpdates' on the local flow-controller followed by 'writePendingBytes' on the remote flow-controller so WINDOW_UPDATES preceed DATA frames on the wire.

Result:
- Better throughput for writing many small DATA chunks followed by a flush, saving 9-bytes per coalesced frame.
- Fewer WINDOW_UPDATES being written and more flow-control bytes returned to remote side more quickly, thereby improving throughput.
2015-06-19 15:20:31 -07:00
Norman Maurer
f23b7b4efd [maven-release-plugin] prepare for next development iteration 2015-05-07 14:21:08 -04:00
Norman Maurer
871ce43b1f [maven-release-plugin] prepare release netty-4.1.0.Beta5 2015-05-07 14:20:38 -04:00
Louis Ryan
a3cea186ce Have Http2LocalFlowController.consumeBytes indicate whether a WINDOW_UPDATE was written 2015-05-04 13:22:18 -07:00
Scott Mitchell
f812180c2d ByteString arrayOffset method
Motivation:
The ByteString class currently assumes the underlying array will be a complete representation of data. This is limiting as it does not allow a subsection of another array to be used. The forces copy operations to take place to compensate for the lack of API support.

Modifications:
- add arrayOffset method to ByteString
- modify all ByteString and AsciiString methods that loop over or index into the underlying array to use this offset
- update all code that uses ByteString.array to ensure it accounts for the offset
- add unit tests to test the implementation respects the offset

Result:
ByteString and AsciiString can represent a sub region of a byte[].
2015-04-24 18:54:01 -07:00
nmittler
70a2608325 Optimizing user-defined stream properties.
Motivation:

Streams currently maintain a hash map of user-defined properties, which has been shown to add significant memory overhead as well as being a performance bottleneck for lookup of frequently used properties.

Modifications:

Modifying the connection/stream to use an array as the storage of user-defined properties, indexed by the class that identifies the index into the array where the property is stored.

Result:

Stream processing performance should be improved.
2015-04-23 12:41:14 -07:00
Scott Mitchell
b426fb1618 Compile error introduced in ee9233d
Motivation:
Commit ee9233d introduced a compile error in microbench.

Modifications:
Fix compile error.

Result:
Code now builds.
2015-04-22 16:23:39 -07:00
Scott Mitchell
541137cc93 HTTP/2 Flow Controller interface updates
Motivation:
Flow control is a required part of the HTTP/2 specification but it is currently structured more like an optional item. It must be accessed through the property map which is time consuming and does not represent its required nature. This access pattern does not give any insight into flow control outside of the codec (or flow controller implementation).

Modifications:
1. Create a read only public interface for LocalFlowState and RemoteFlowState.
2. Add a LocalFlowState localFlowState(); and RemoteFlowState remoteFlowState(); to Http2Stream.

Result:
Flow control is not part of the Http2Stream interface. This clarifies its responsibility and logical relationship to other interfaces. The flow controller no longer must be acquired though a map lookup.
2015-04-20 20:02:02 -07:00
Scott Mitchell
2b8104c852 HTTP/2 Priority Tree Benchmark
Motivation:
There is no benchmark to measure the priority tree implementation performance.

Modifications:
Introduce a new benchmark which will populate the priority tree, and then shuffle parent/child links around.

Result:
A simple benchmark to get a baseline for the HTTP/2 codec's priority tree implementation.
2015-04-17 10:14:13 -07:00
Louis Ryan
f3fb77f4bc Have microbenchmarks produce a deployable artifact. Fix some minor miscellaneous issues.
Motivation:
Allows for running benchmarks from built jars which is useful in development environments that only take released artifacts.

Modifications:
Move benchmarks into 'main' from 'test'
Add @State annotations to benchmarks that are missing them
Fix timing issue grabbing context during channel initialization

Result:
Users can run benchmarks more easily.
2015-04-17 10:04:26 -07:00
Jakob Buchgraber
c2de195f87 Improve performance of AsciiString.equals(Object).
Motivation:

The current implementation does byte by byte comparison, which we have seen
can be a performance bottleneck when the AsciiString is used as the key in
a Map.

Modifications:

Use sun.misc.Unsafe (on supporting platforms) to compare up to eight bytes at a time
and get closer to the performance of String.equals(Object).

Result:

Significant improvement (2x - 6x) in performance over the current implementation.

Benchmark                                             (size)   Mode   Samples        Score  Score error    Units
i.n.m.i.PlatformDependentBenchmark.arraysBytesEqual       10  thrpt        10 118843477.518 2347259.347    ops/s
i.n.m.i.PlatformDependentBenchmark.arraysBytesEqual       50  thrpt        10 43910319.773   198376.996    ops/s
i.n.m.i.PlatformDependentBenchmark.arraysBytesEqual      100  thrpt        10 26339969.001   159599.252    ops/s
i.n.m.i.PlatformDependentBenchmark.arraysBytesEqual     1000  thrpt        10  2873119.030    20779.056    ops/s
i.n.m.i.PlatformDependentBenchmark.arraysBytesEqual    10000  thrpt        10   306370.450     1933.303    ops/s
i.n.m.i.PlatformDependentBenchmark.arraysBytesEqual   100000  thrpt        10    25750.415      108.391    ops/s
i.n.m.i.PlatformDependentBenchmark.unsafeBytesEqual       10  thrpt        10 248077563.510  635320.093    ops/s
i.n.m.i.PlatformDependentBenchmark.unsafeBytesEqual       50  thrpt        10 128198943.138  614827.548    ops/s
i.n.m.i.PlatformDependentBenchmark.unsafeBytesEqual      100  thrpt        10 86195621.349  1063959.307    ops/s
i.n.m.i.PlatformDependentBenchmark.unsafeBytesEqual     1000  thrpt        10 16920264.598    61615.365    ops/s
i.n.m.i.PlatformDependentBenchmark.unsafeBytesEqual    10000  thrpt        10  1687454.747     6367.602    ops/s
i.n.m.i.PlatformDependentBenchmark.unsafeBytesEqual   100000  thrpt        10   153717.851      586.916    ops/s
2015-04-16 17:29:54 -07:00
Scott Mitchell
9a7a85dbe5 ByteString introduced as AsciiString super class
Motivation:
The usage and code within AsciiString has exceeded the original design scope for this class. Its usage as a binary string is confusing and on the verge of violating interface assumptions in some spots.

Modifications:
- ByteString will be created as a base class to AsciiString. All of the generic byte handling processing will live in ByteString and all the special character encoding will live in AsciiString.

Results:
The AsciiString interface will be clarified. Users of AsciiString can now be clear of the limitations the class imposes while users of the ByteString class don't have to live with those limitations.
2015-04-14 16:35:17 -07:00
nmittler
ab158a6ea4 Adding basic benchmarks for IntObjectHashMap
Motivation:

It needs to be fast :)

Modifications:

Added a simple benchmark to the microbench module.

Result:

Yay, benchmarks!
2015-04-13 12:24:19 -07:00
Scott Mitchell
cc7ee002dd HTTP/2 Frame Writer Microbenchmark Fix
Motivation:
The Http2FrameWriterBenchmark JMH harness class name was not updated for the JVM arguments. The number of forks is 0 which means the JHM will share a JVM with the benchmarks.  Sharing the JVM may lead to less reliable benchmarking results and as doesn't allow for the command line arguments to be applied for each benchmark.

Modifications:
- Update the JMH version from 0.9 to 1.7.1.  Benchmarks wouldn't run on old version.
- Increase the number of forks from 0 to 1.
- Remove allocation of environment from static and cleanup AfterClass to using the Setup and Teardown methods. The forked JVM would not shut down correctly otherwise (and wait for 30+ seconds before timeing out).

Result:
Benchmarks that run as intended.
2015-04-13 10:59:39 -07:00
nmittler
6fbca14f8a Cleaning up the initialization of Http2ConnectionHandler
Motivation:

It currently takes a builder for the encoder and decoder, which makes it difficult to decorate them.

Modifications:

Removed the builders from the interfaces entirely. Left the builder for the decoder impl but removed it from the encoder since it's constructor only takes 2 parameters. Also added decorator base classes for the encoder and decoder and made the CompressorHttp2ConnectionEncoder extend the decorator.

Result:

Fixes #3530
2015-03-30 11:23:02 -07:00
Scott Mitchell
2bf592c50f Backport of HTTP/2 Microbenchmark fail.
Motivation:
The backport of a6c729bdf8 failed.

Modifications:
- Make sure the interfaces are correctly implemented when backporting.

Result:
Microbenchmark compiles and runs on 4.1 branch.
2015-03-28 18:41:09 -07:00
scottmitch
2dda917f27 Http2DefaultFrameWriter microbenchmark
Motivation:
A microbenchmark will be useful to get a baseline for performance.

Modifications:
- Introduce a new microbenchmark which tests the Http2DefaultFrameWriter.
- Allow benchmarks to run without thread context switching between JMH and Netty.

Result:
Microbenchmark exists to test performance.
2015-03-27 13:10:57 -07:00
Norman Maurer
fce0989844 [maven-release-plugin] prepare for next development iteration 2015-03-03 02:06:47 -05:00
Norman Maurer
ca3b1bc4b7 [maven-release-plugin] prepare release netty-4.1.0.Beta4 2015-03-03 02:05:52 -05:00
Michael Nitschinger
1d344f488c Fix ByteBufUtilBenchmark on utf8 encodings.
Motivation
----------
The performance tests for utf8 also used the getBytes on ASCII,
which is incorrect and also provides different performance numbers.

Modifications
-------------
Use CharsetUtil.UTF_8 instead of US_ASCII for the getBytes calls.

Result
------
Accurate and semantically correct benchmarking results on utf8
comparisons.
2014-12-31 20:26:42 +09:00
Norman Maurer
fe796fc8ab Provide helper methods in ByteBufUtil to write UTF-8/ASCII CharSequences. Related to [#909]
Motivation:

We expose no methods in ByteBuf to directly write a CharSequence into it. This leads to have the user either convert the CharSequence first to a byte array or use CharsetEncoder. Both cases have some overheads and we can do a lot better for well known Charsets like UTF-8 and ASCII.

Modifications:

Add ByteBufUtil.writeAscii(...) and ByteBufUtil.writeUtf8(...) which can do the task in an optimized way. This is especially true if the passed in ByteBuf extends AbstractByteBuf which is true for all of our implementations which not wrap another ByteBuf.

Result:

Writing an ASCII and UTF-8 CharSequence into a AbstractByteBuf is a lot faster then what the user could do by himself as we can make use of some package private methods and so eliminate reference and range checks. When the Charseq is not ASCII or UTF-8 we can still do a very good job and are on par in most of the cases with what the user would do.

The following benchmark shows the improvements:

Result: 2456866.966 ?(99.9%) 59066.370 ops/s [Average]
  Statistics: (min, avg, max) = (2297025.189, 2456866.966, 2586003.225), stdev = 78851.914
  Confidence interval (99.9%): [2397800.596, 2515933.336]

Benchmark                                                        Mode   Samples        Score  Score error    Units
i.n.m.b.ByteBufUtilBenchmark.writeAscii                         thrpt        50  9398165.238   131503.098    ops/s
i.n.m.b.ByteBufUtilBenchmark.writeAsciiString                   thrpt        50  9695177.968   176684.821    ops/s
i.n.m.b.ByteBufUtilBenchmark.writeAsciiStringViaArray           thrpt        50  4788597.415    83181.549    ops/s
i.n.m.b.ByteBufUtilBenchmark.writeAsciiStringViaArrayWrapped    thrpt        50  4722297.435    98984.491    ops/s
i.n.m.b.ByteBufUtilBenchmark.writeAsciiStringWrapped            thrpt        50  4028689.762    66192.505    ops/s
i.n.m.b.ByteBufUtilBenchmark.writeAsciiViaArray                 thrpt        50  3234841.565    91308.009    ops/s
i.n.m.b.ByteBufUtilBenchmark.writeAsciiViaArrayWrapped          thrpt        50  3311387.474    39018.933    ops/s
i.n.m.b.ByteBufUtilBenchmark.writeAsciiWrapped                  thrpt        50  3379764.250    66735.415    ops/s
i.n.m.b.ByteBufUtilBenchmark.writeUtf8                          thrpt        50  5671116.821   101760.081    ops/s
i.n.m.b.ByteBufUtilBenchmark.writeUtf8String                    thrpt        50  5682733.440   111874.084    ops/s
i.n.m.b.ByteBufUtilBenchmark.writeUtf8StringViaArray            thrpt        50  3564548.995    55709.512    ops/s
i.n.m.b.ByteBufUtilBenchmark.writeUtf8StringViaArrayWrapped     thrpt        50  3621053.671    47632.820    ops/s
i.n.m.b.ByteBufUtilBenchmark.writeUtf8StringWrapped             thrpt        50  2634029.071    52304.876    ops/s
i.n.m.b.ByteBufUtilBenchmark.writeUtf8ViaArray                  thrpt        50  3397049.332    57784.119    ops/s
i.n.m.b.ByteBufUtilBenchmark.writeUtf8ViaArrayWrapped           thrpt        50  3318685.262    35869.562    ops/s
i.n.m.b.ByteBufUtilBenchmark.writeUtf8Wrapped                   thrpt        50  2473791.249    46423.114    ops/s
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1,387.417 sec - in io.netty.microbench.buffer.ByteBufUtilBenchmark

Results :

Tests run: 1, Failures: 0, Errors: 0, Skipped: 0

Results :

Tests run: 1, Failures: 0, Errors: 0, Skipped: 0

The *ViaArray* benchmarks are basically doing a toString().getBytes(Charset) which the others are using ByteBufUtil.write*(...).
2014-12-26 15:58:18 +09:00
Idel Pivnitskiy
cff98fff51 Benchmark for HttpRequestDecoder 2014-11-12 14:29:15 +01:00
Scott Mitchell
7e65c09373 IPv6 address to string rfc5952
Motivation:
The java implementations for Inet6Address.getHostName() do not follow the RFC 5952 (http://tools.ietf.org/html/rfc5952#section-4) for recommended string representation. This introduces inconsistencies when integrating with other technologies that do follow the RFC.

Modifications:
-NetUtil.java to have another public static method to convert InetAddress to string. Inet4Address will use the java InetAddress.getHostAddress() implementation and there will be new code to implement the RFC 5952 IPV6 string conversion.
-New unit tests to test the new method

Result:
Netty provides a RFC 5952 compliant string conversion method for IPV6 addresses
2014-10-30 00:05:57 -04:00
Trustin Lee
b5f61d0de5 [maven-release-plugin] prepare for next development iteration 2014-08-16 03:27:42 +09:00
Trustin Lee
76ac3b21a5 [maven-release-plugin] prepare release netty-4.1.0.Beta3 2014-08-16 03:27:37 +09:00
Trustin Lee
b3c1904cc9 [maven-release-plugin] prepare for next development iteration 2014-08-15 09:31:03 +09:00
Trustin Lee
e013b2400f [maven-release-plugin] prepare release netty-4.1.0.Beta2 2014-08-15 09:30:59 +09:00
Trustin Lee
e167b02d52 [maven-release-plugin] prepare for next development iteration 2014-07-04 17:26:02 +09:00
Trustin Lee
ba50cb829b [maven-release-plugin] prepare release netty-4.1.0.Beta1 2014-07-04 17:25:54 +09:00
Trustin Lee
787663a644 [maven-release-plugin] rollback the release of netty-4.1.0.Beta1 2014-07-04 17:11:14 +09:00
Trustin Lee
83eae705e1 [maven-release-plugin] prepare release netty-4.1.0.Beta1 2014-07-04 17:02:17 +09:00
Trustin Lee
f67ac5e46d Fix the inconsistencies between performance tests in ByteBufAllocatorBenchmark
Motivation:

default*() tests are performing a test in a different way, and they must be same with other tests.

Modification:

Make sure default*() tests are same with the others

Result:

Easier to compare default and non-default allocators
2014-06-21 13:28:02 +09:00
Trustin Lee
085a61a310 Refactor FastThreadLocal to simplify TLV management
Motivation:

When Netty runs in a managed environment such as web application server,
Netty needs to provide an explicit way to remove the thread-local
variables it created to prevent class loader leaks.

FastThreadLocal uses different execution paths for storing a
thread-local variable depending on the type of the current thread.
It increases the complexity of thread-local removal.

Modifications:

- Moved FastThreadLocal and FastThreadLocalThread out of the internal
  package so that a user can use it.
- FastThreadLocal now keeps track of all thread local variables it has
  initialized, and calling FastThreadLocal.removeAll() will remove all
  thread-local variables of the caller thread.
- Added FastThreadLocal.size() for diagnostics and tests
- Introduce InternalThreadLocalMap which is a mixture of hard-wired
  thread local variable fields and extensible indexed variables
- FastThreadLocal now uses InternalThreadLocalMap to implement a
  thread-local variable.
- Added ThreadDeathWatcher.unwatch() so that PooledByteBufAllocator
  tells it to stop watching when its thread-local cache has been freed
  by FastThreadLocal.removeAll().
- Added FastThreadLocalTest to ensure that removeAll() works
- Added microbenchmark for FastThreadLocal and JDK ThreadLocal
- Upgraded to JMH 0.9

Result:

- A user can remove all thread-local variables Netty created, as long as
  he or she did not exit from the current thread. (Note that there's no
  way to remove a thread-local variable from outside of the thread.)
- FastThreadLocal exposes more useful operations such as isSet() because
  we always implement a thread local variable via InternalThreadLocalMap
  instead of falling back to JDK ThreadLocal.
- FastThreadLocalBenchmark shows that this change improves the
  performance of FastThreadLocal even more.
2014-06-19 21:13:55 +09:00
belliottsmith
2a2a21ec59 Introduce FastThreadLocal which uses an EnumMap and a predefined fixed set of possible thread locals
Motivation:
Provide a faster ThreadLocal implementation

Modification:
Add a "FastThreadLocal" which uses an EnumMap and a predefined fixed set of possible thread locals (all of the static instances created by netty) that is around 10-20% faster than standard ThreadLocal in my benchmarks (and can be seen having an effect in the direct PooledByteBufAllocator benchmark that uses the DEFAULT ByteBufAllocator which uses this FastThreadLocal, as opposed to normal instantiations that do not, and in the new RecyclableArrayList benchmark);

Result:
Improved performance
2014-06-13 10:56:18 +02:00
Norman Maurer
61dbc353ca [#2436] Unsafe*ByteBuf implementation should only invert bytes if ByteOrder differ from native ByteOrder
Motivation:
Our Unsafe*ByteBuf implementation always invert bytes when the native ByteOrder is LITTLE_ENDIAN (this is true on intel), even when the user calls order(ByteOrder.LITTLE_ENDIAN). This is not optimal for performance reasons as the user should be able to set the ByteOrder to LITTLE_ENDIAN and so write bytes without the extra inverting.

Modification:
- Introduce a new special SwappedByteBuf (called UnsafeDirectSwappedByteBuf) that is used by all the Unsafe*ByteBuf implementation and allows to write without inverting the bytes.
- Add benchmark
- Upgrade jmh to 0.8

Result:
The user is be able to get the max performance even on servers that have ByteOrder.LITTLE_ENDIAN as their native ByteOrder.
2014-06-05 10:59:22 +02:00
Trustin Lee
0cc264b76b More realistic ByteBuf allocation benchmark
Motivation:

Allocating a single buffer and releasing it repetitively for a benchmark will not involve the realistic execution path of the allocators.

Modifications:

Keep the last 8192 allocations and release them randomly.

Result:

We are now getting the result close to what we got with caliper.
2014-05-29 19:51:05 +09:00
Michael Nitschinger
7d62594cc6 Upgrade JMH to 0.4.1 and make use of @Params. 2014-02-23 16:39:39 +01:00
Michael Nitschinger
33197c7696 Update JMH to 0.3.2 2014-02-14 13:16:13 -08:00
Trustin Lee
bb145c0057 Fix wiki link 2014-02-14 12:04:12 -08:00
Trustin Lee
ac70dc4546 Update the version to 4.1.0.Alpha1-SNAPSHOT 2014-02-13 18:32:26 -08:00
Norman Maurer
d67184b488 [maven-release-plugin] prepare for next development iteration 2014-01-21 08:18:32 +01:00
Norman Maurer
287515210d [maven-release-plugin] prepare release netty-4.0.15.Final 2014-01-21 08:18:26 +01:00
Michael Nitschinger
ac332dfe02 Using SystemPropertyUtil for prperty parsing. 2014-01-15 18:53:28 +01:00
Michael Nitschinger
99f9c6dbc3 Make JMH options modifiable through the subclassed benchmark. 2014-01-15 18:53:22 +01:00
Trustin Lee
462817cd96 Add a link to the wiki page about the microbench module 2014-01-15 16:00:53 +09:00
Michael Nitschinger
03b0099b63 microbench: move from Caliper to JMH 2014-01-14 14:56:20 +09:00
Trustin Lee
e83d2e0b4e [maven-release-plugin] prepare for next development iteration 2013-12-22 21:57:48 +09:00
Trustin Lee
cdb700c7a4 [maven-release-plugin] prepare release netty-4.0.14.Final 2013-12-22 21:57:40 +09:00
Trustin Lee
0b7aedb13b [maven-release-plugin] rollback the release of netty-4.0.14.Final 2013-12-22 21:53:24 +09:00
Trustin Lee
4bf6ec7171 [maven-release-plugin] prepare release netty-4.0.14.Final 2013-12-22 21:52:56 +09:00
Trustin Lee
9c1a49c58e [maven-release-plugin] rollback the release of netty-4.0.14.Final 2013-12-22 21:47:35 +09:00
Trustin Lee
008a049bf4 [maven-release-plugin] prepare for next development iteration 2013-12-22 21:43:55 +09:00
Trustin Lee
f6cb9088c6 [maven-release-plugin] prepare release netty-4.0.14.Final 2013-12-22 21:43:45 +09:00
Norman Maurer
17f5865e38 [maven-release-plugin] prepare for next development iteration 2013-11-29 19:31:01 +01:00
Norman Maurer
ead617fdcc [maven-release-plugin] prepare release netty-4.0.14.Beta1 2013-11-29 19:30:55 +01:00
Norman Maurer
6cf2748dbb [maven-release-plugin] prepare for next development iteration 2013-11-28 15:04:51 +01:00
Norman Maurer
5fe7596f49 [maven-release-plugin] prepare release netty-4.0.13.Final 2013-11-28 15:04:46 +01:00
Norman Maurer
db78581bbb [maven-release-plugin] prepare for next development iteration 2013-11-07 18:11:45 +01:00
Norman Maurer
2386777af8 [maven-release-plugin] prepare release netty-4.0.12.Final 2013-11-07 18:11:38 +01:00
Norman Maurer
ceab146b54 [maven-release-plugin] prepare for next development iteration 2013-10-21 07:43:42 +02:00
Norman Maurer
27a89d6032 [maven-release-plugin] prepare release netty-4.0.11.Final 2013-10-21 07:41:49 +02:00
Norman Maurer
d7da19f745 [maven-release-plugin] prepare for next development iteration 2013-10-02 15:48:52 +02:00
Norman Maurer
d35768ae11 [maven-release-plugin] prepare release netty-4.0.10.Final 2013-10-02 15:48:45 +02:00
Norman Maurer
ffab456aca Bump up version to reflect correct one 2013-09-09 11:20:12 +02:00
Norman Maurer
363531caf9 [maven-release-plugin] rollback the release of netty-4.0.9.Final 2013-09-06 09:18:34 +02:00
Norman Maurer
9d53573ee8 [maven-release-plugin] prepare for next development iteration 2013-09-06 09:17:15 +02:00
Norman Maurer
2e39b25cd4 [maven-release-plugin] prepare for next development iteration 2013-08-26 12:01:03 +02:00
Norman Maurer
b67659a866 [maven-release-plugin] prepare release netty-4.0.8.Final 2013-08-26 12:00:54 +02:00
Norman Maurer
1d3560e389 [maven-release-plugin] prepare for next development iteration 2013-08-08 13:53:28 +02:00
Norman Maurer
8e97e6c461 [maven-release-plugin] prepare release netty-4.0.7.Final 2013-08-08 13:53:19 +02:00
Norman Maurer
3f2000fa3a [maven-release-plugin] prepare for next development iteration 2013-08-01 10:59:55 +02:00
Norman Maurer
3f70d5caa4 [maven-release-plugin] prepare release netty-4.0.6.Final 2013-08-01 10:59:46 +02:00
Norman Maurer
e3410680de [maven-release-plugin] prepare for next development iteration 2013-07-31 20:08:14 +02:00
Norman Maurer
0e124583d6 [maven-release-plugin] prepare release netty-4.0.5.Final 2013-07-31 20:08:05 +02:00
Norman Maurer
0bc7d3f5d1 [maven-release-plugin] prepare for next development iteration 2013-07-23 10:04:23 +02:00
Norman Maurer
ca00182797 [maven-release-plugin] prepare release netty-4.0.4.Final 2013-07-23 10:04:14 +02:00
Trustin Lee
b130ee6a6c [maven-release-plugin] prepare for next development iteration 2013-07-18 11:17:42 +09:00
Trustin Lee
10d395e829 [maven-release-plugin] prepare release netty-4.0.3.Final 2013-07-18 11:17:31 +09:00
Norman Maurer
fc7c950b08 [maven-release-plugin] prepare for next development iteration 2013-07-17 15:58:36 +02:00
Norman Maurer
bbbf72359e [maven-release-plugin] prepare release netty-4.0.2.Final 2013-07-17 15:58:28 +02:00
Trustin Lee
57eb531eb8 [maven-release-plugin] prepare for next development iteration 2013-07-16 17:16:10 +09:00
Trustin Lee
76cefcc421 [maven-release-plugin] prepare release netty-4.0.1.Final 2013-07-16 17:15:54 +09:00
Norman Maurer
5297eba280 [maven-release-plugin] prepare for next development iteration 2013-07-15 15:48:15 +02:00
Norman Maurer
c5d8af446a [maven-release-plugin] prepare release netty-4.0.0.Final 2013-07-15 15:48:05 +02:00
Trustin Lee
246a3ecdcb [maven-release-plugin] prepare for next development iteration 2013-07-15 20:58:33 +09:00
Trustin Lee
e8fd209115 [maven-release-plugin] prepare release netty-4.0.0.Final 2013-07-15 20:58:21 +09:00
Norman Maurer
ec5e793a2f [maven-release-plugin] prepare for next development iteration 2013-07-02 11:41:18 +02:00
Norman Maurer
ca73eaef0d [maven-release-plugin] prepare release netty-4.0.0.CR9 2013-07-02 11:41:09 +02:00
Norman Maurer
830c559405 [maven-release-plugin] rollback the release of netty-4.0.0.CR9 2013-07-02 11:34:29 +02:00
Norman Maurer
66a16b133c [maven-release-plugin] prepare release netty-4.0.0.CR9 2013-07-02 10:45:12 +02:00
Trustin Lee
7e3a01cc51 [maven-release-plugin] prepare for next development iteration 2013-07-02 10:26:48 +09:00
Trustin Lee
149db34c19 [maven-release-plugin] prepare release netty-4.0.0.CR8 2013-07-02 10:26:32 +09:00
Trustin Lee
613547b0b9 [maven-release-plugin] prepare for next development iteration 2013-06-28 22:15:33 +09:00
Trustin Lee
a6abd2feb2 [maven-release-plugin] prepare release netty-4.0.0.CR7 2013-06-28 22:15:20 +09:00
Trustin Lee
a6795d7780 [maven-release-plugin] prepare for next development iteration 2013-06-25 11:07:15 +09:00
Trustin Lee
2221446425 [maven-release-plugin] prepare release netty-4.0.0.CR6 2013-06-25 11:07:15 +09:00
Trustin Lee
dba3aa2d4f Add io.netty.noResourceLeak option to microbench 2013-06-25 11:07:14 +09:00
Trustin Lee
a5871dfd86 [maven-release-plugin] prepare for next development iteration 2013-06-14 12:55:15 +09:00
Trustin Lee
f5377cc8d7 [maven-release-plugin] prepare release netty-4.0.0.CR5 2013-06-14 12:55:05 +09:00
Trustin Lee
e5ca6518ba [maven-release-plugin] prepare for next development iteration 2013-06-13 17:02:32 +09:00
Trustin Lee
381063e09c [maven-release-plugin] prepare release netty-4.0.0.CR4 2013-06-13 17:02:19 +09:00
Trustin Lee
14158070bf Revamp the core API to reduce memory footprint and consumption
The API changes made so far turned out to increase the memory footprint
and consumption while our intention was actually decreasing them.

Memory consumption issue:

When there are many connections which does not exchange data frequently,
the old Netty 4 API spent a lot more memory than 3 because it always
allocates per-handler buffer for each connection unless otherwise
explicitly stated by a user.  In a usual real world load, a client
doesn't always send requests without pausing, so the idea of having a
buffer whose life cycle if bound to the life cycle of a connection
didn't work as expected.

Memory footprint issue:

The old Netty 4 API decreased overall memory footprint by a great deal
in many cases.  It was mainly because the old Netty 4 API did not
allocate a new buffer and event object for each read.  Instead, it
created a new buffer for each handler in a pipeline.  This works pretty
well as long as the number of handlers in a pipeline is only a few.
However, for a highly modular application with many handlers which
handles connections which lasts for relatively short period, it actually
makes the memory footprint issue much worse.

Changes:

All in all, this is about retaining all the good changes we made in 4 so
far such as better thread model and going back to the way how we dealt
with message events in 3.

To fix the memory consumption/footprint issue mentioned above, we made a
hard decision to break the backward compatibility again with the
following changes:

- Remove MessageBuf
- Merge Buf into ByteBuf
- Merge ChannelInboundByte/MessageHandler and ChannelStateHandler into ChannelInboundHandler
  - Similar changes were made to the adapter classes
- Merge ChannelOutboundByte/MessageHandler and ChannelOperationHandler into ChannelOutboundHandler
  - Similar changes were made to the adapter classes
- Introduce MessageList which is similar to `MessageEvent` in Netty 3
- Replace inboundBufferUpdated(ctx) with messageReceived(ctx, MessageList)
- Replace flush(ctx, promise) with write(ctx, MessageList, promise)
- Remove ByteToByteEncoder/Decoder/Codec
  - Replaced by MessageToByteEncoder<ByteBuf>, ByteToMessageDecoder<ByteBuf>, and ByteMessageCodec<ByteBuf>
- Merge EmbeddedByteChannel and EmbeddedMessageChannel into EmbeddedChannel
- Add SimpleChannelInboundHandler which is sometimes more useful than
  ChannelInboundHandlerAdapter
- Bring back Channel.isWritable() from Netty 3
- Add ChannelInboundHandler.channelWritabilityChanges() event
- Add RecvByteBufAllocator configuration property
  - Similar to ReceiveBufferSizePredictor in Netty 3
  - Some existing configuration properties such as
    DatagramChannelConfig.receivePacketSize is gone now.
- Remove suspend/resumeIntermediaryDeallocation() in ByteBuf

This change would have been impossible without @normanmaurer's help. He
fixed, ported, and improved many parts of the changes.
2013-06-10 16:10:39 +09:00
Norman Maurer
81e3c1719a [maven-release-plugin] prepare for next development iteration 2013-05-18 09:59:13 +02:00
Norman Maurer
99caefdf39 [maven-release-plugin] prepare release netty-4.0.0.CR3 2013-05-18 09:57:11 +02:00
Norman Maurer
c43950a03f [maven-release-plugin] prepare for next development iteration 2013-05-08 18:19:51 +02:00
Norman Maurer
ae76502040 [maven-release-plugin] prepare release netty-4.0.0.CR2 2013-05-08 18:19:38 +02:00
Prajwal Tuladhar
05850da863 enable checkstyle for test source directory and fix checkstyle errors 2013-03-30 13:18:57 +01:00
Norman Maurer
59012390f6 Fix version numbering 2013-03-25 08:01:11 +01:00
Norman Maurer
7d7b676eeb [maven-release-plugin] prepare for next development iteration 2013-03-22 15:20:35 +01:00
Norman Maurer
60fc7dac4d [maven-release-plugin] prepare release netty-4.0.0.CR1 2013-03-22 15:20:11 +01:00
Trustin Lee
2a87950784 [maven-release-plugin] prepare for next development iteration 2013-03-16 18:41:36 +09:00
Trustin Lee
adfb29330b [maven-release-plugin] prepare release netty-4.0.0.Beta3 2013-03-16 18:40:59 +09:00
Trustin Lee
8d88acb4a7 Change ByteBufAllocator.buffer() to allocate a direct buffer only when the platform can handle a direct buffer reliably
- Rename directbyDefault to preferDirect
 - Add a system property 'io.netty.prederDirect' to allow a user from changing the preference on launch-time
 - Merge UnpooledByteBufAllocator.DEFAULT_BY_* to DEFAULT
2013-03-05 17:55:24 +09:00
Trustin Lee
49aa907bd0 [maven-release-plugin] prepare for next development iteration 2013-02-26 16:55:07 -08:00
Trustin Lee
5026c2f359 [maven-release-plugin] prepare release netty-4.0.0.Beta2 2013-02-26 16:54:53 -08:00
Trustin Lee
d68a04a879 [maven-release-plugin] prepare for next development iteration 2013-02-14 12:56:24 -08:00
Trustin Lee
59e638f8f5 [maven-release-plugin] prepare release netty-4.0.0.Beta1 2013-02-14 12:56:15 -08:00
Trustin Lee
eeae6f993c Microbench doesn't need to be an OSGi bundle 2013-02-14 12:15:58 -08:00
Trustin Lee
54ac1cd420 Fix build issues in microbench / Disable tests by default 2013-02-14 11:51:55 -08:00
Trustin Lee
b9996908b1 Implement reference counting
- Related: #1029
- Replace Freeable with ReferenceCounted
- Add AbstractReferenceCounted
- Add AbstractReferenceCountedByteBuf
- Add AbstractDerivedByteBuf
- Add EmptyByteBuf
2013-02-10 13:10:09 +09:00
Norman Maurer
fd75615d7a [#870] Convert all modules into osgi bundles 2013-02-06 07:57:11 +01:00
Trustin Lee
03e68482bb Remove ChannelBuf/ByteBuf.Unsafe
- Fixes #826
Unsafe.isFreed(), free(), suspend/resumeIntermediaryAllocations() are not that dangerous. internalNioBuffer() and internalNioBuffers() are dangerous but it seems like nobody is using it even inside Netty. Removing those two methods also removes the necessity to keep Unsafe interface at all.
2012-12-17 17:41:21 +09:00
Trustin Lee
e37aeb38d6 Add the original copyright 2012-12-14 00:10:28 +09:00
Trustin Lee
6339feaa8f Apply advanced JVM options to benchmarks / Fix duplicate uploads
- Add common optimization options when launching a new JVM to run a benchmark
- Fix a bug where a benchmark report is uploaded twice
- Simplify pom.xml and move the build instruction messages to DefaultBenchmark
- Print an empty line to prettify the output
2012-12-14 00:00:41 +09:00
Trustin Lee
b47fc77522 Add PooledByteBufAllocator + microbenchmark module
This pull request introduces the new default ByteBufAllocator implementation based on jemalloc, with a some differences:

* Minimum possible buffer capacity is 16 (jemalloc: 2)
* Uses binary heap with random branching (jemalloc: red-black tree)
* No thread-local cache yet (jemalloc has thread-local cache)
* Default page size is 8 KiB (jemalloc: 4 KiB)
* Default chunk size is 16 MiB (jemalloc: 2 MiB)
* Cannot allocate a buffer bigger than the chunk size (jemalloc: possible) because we don't have control over memory layout in Java. A user can work around this issue by creating a composite buffer, but it's not always a feasible option. Although 16 MiB is a pretty big default, a user's handler might need to deal with the bounded buffers when the user wants to deal with a large message.

Also, to ensure the new allocator performs good enough, I wrote a microbenchmark for it and made it a dedicated Maven module. It uses Google's Caliper framework to run and publish the test result (example)

Miscellaneous changes:

* Made some ByteBuf implementations public so that those who implements a new allocator can make use of them.
* Added ByteBufAllocator.compositeBuffer() and its variants.
* ByteBufAllocator.ioBuffer() creates a buffer with 0 capacity.
2012-12-13 22:35:06 +09:00