Nick Hill 35161ad174 Further reduce ensureAccessible() overhead (#8895)
Motivation:

This PR fixes some non-negligible overhead discovered in the ByteBuf
accessibility (non-zero refcount) checking. The cause turned out to be
mostly twofold:
- Unnecessary operations used to calculate the refcount from the "raw"
encoded int field value
- Call stack depths exceeding the default limit for inlining, in some
places (CompositeByteBuf in particular)

It's a follow-on from #8882 which uses the maxCapacity field for a
simpler non-negative check. The performance gap between these two
variants appears to be _mostly_ closed, but there's one exception which
may warrant further analysis.

Modifications:

- Replace AbstractByteBuf.internalRefCount() with ByteBuf.isAccessible();
the default implementation still checks for a non-zero refCnt()
- Just test for parity of raw refCnt instead of converting to "real",
with fast-path for specific small values
- Make sure isAccessible() is delegated by derived/wrapper ByteBufs
- Use existing freed flag in CompositeByteBuf for faster isAccessible()
- Manually inline some calls in methods like CompositeByteBuf.setLong()
and AbstractReferenceCountedByteBuf.isAccessible() to reduce stack
depths (to ensure default inlining limit isn't hit)
- Add ByteBufAccessBenchmark which is an extension of
UnsafeByteBufBenchmark (maybe latter could now be removed)
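
The parity test and its fast path can be sketched as below. This is an illustration only, not the code from the PR: the class name RefCountParitySketch and its helper methods are hypothetical, and the encoding (a live count n stored as the even value 2 * n, any odd raw value meaning "released") is an assumption made for the sketch.

```java
// Hypothetical sketch; names and encoding are assumptions, not Netty's actual classes.
final class RefCountParitySketch {

    // Assumed encoding: a live reference count n is stored as the even value 2 * n.
    static int toRaw(int realCnt) {
        return realCnt << 1;
    }

    // "Slow" path: decode the raw value to the real count first.
    // Any odd raw value is treated as released, i.e. a real count of 0.
    static int realRefCnt(int rawCnt) {
        return (rawCnt & 1) != 0 ? 0 : rawCnt >>> 1;
    }

    static boolean isAccessibleSlow(int rawCnt) {
        return realRefCnt(rawCnt) != 0;
    }

    // "Fast" path: compare against a few small even values first, then
    // fall back to a parity test. No conversion to the real count is needed.
    // This relies on a raw value of 0 never being stored.
    static boolean isAccessibleFast(int rawCnt) {
        return rawCnt == 2 || rawCnt == 4 || rawCnt == 6 || rawCnt == 8
                || (rawCnt & 1) == 0;
    }
}
```

Under these assumptions both variants agree for every raw value that can actually be stored; the fast variant just skips the shift on the hot path.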

Results:

Before:

Benchmark   (bufferType)  (checkAccessible)  (checkBounds)   Mode  Cnt          Score          Error  Units
readBatch         UNSAFE               true           true  thrpt   30   84524972.863 ±   518338.811  ops/s
readBatch   UNSAFE_SLICE               true           true  thrpt   30   38608795.037 ±   298176.974  ops/s
readBatch           HEAP               true           true  thrpt   30   80003697.649 ±   974674.119  ops/s
readBatch      COMPOSITE               true           true  thrpt   30   18495554.788 ±   108075.023  ops/s
setGetLong        UNSAFE               true           true  thrpt   30  247069881.578 ± 10839162.593  ops/s
setGetLong  UNSAFE_SLICE               true           true  thrpt   30  196355905.206 ±  1802420.990  ops/s
setGetLong          HEAP               true           true  thrpt   30  245686644.713 ± 11769311.527  ops/s
setGetLong     COMPOSITE               true           true  thrpt   30   83170940.687 ±   657524.123  ops/s
setLong           UNSAFE               true           true  thrpt   30  278940253.918 ±  1807265.259  ops/s
setLong     UNSAFE_SLICE               true           true  thrpt   30  202556738.764 ± 11887973.563  ops/s
setLong             HEAP               true           true  thrpt   30  280045958.053 ±  2719583.400  ops/s
setLong        COMPOSITE               true           true  thrpt   30  121299806.002 ±  2155084.707  ops/s


After:

Benchmark   (bufferType)  (checkAccessible)  (checkBounds)   Mode  Cnt          Score          Error  Units
readBatch         UNSAFE               true           true  thrpt   30  101641801.035 ±  3950050.059  ops/s
readBatch   UNSAFE_SLICE               true           true  thrpt   30   84395902.846 ±  4339579.057  ops/s
readBatch           HEAP               true           true  thrpt   30  100179060.207 ±  3222487.287  ops/s
readBatch      COMPOSITE               true           true  thrpt   30   42288494.472 ±   294919.633  ops/s
setGetLong        UNSAFE               true           true  thrpt   30  304530755.027 ±  6574163.899  ops/s
setGetLong  UNSAFE_SLICE               true           true  thrpt   30  212028547.645 ± 14277828.768  ops/s
setGetLong          HEAP               true           true  thrpt   30  309335422.609 ±  2272150.415  ops/s
setGetLong     COMPOSITE               true           true  thrpt   30  160383609.236 ±   966484.033  ops/s
setLong           UNSAFE               true           true  thrpt   30  298055969.747 ±  7437449.627  ops/s
setLong     UNSAFE_SLICE               true           true  thrpt   30  223784178.650 ±  9869750.095  ops/s
setLong             HEAP               true           true  thrpt   30  302543263.328 ±  8140104.706  ops/s
setLong        COMPOSITE               true           true  thrpt   30  157083673.285 ±  3528779.522  ops/s

There's also a similar knock-on improvement to other benchmarks (e.g.
HPACK encoding/decoding) as shown in #8882.

For sanity I did a final comparison of the "fast path" tweak using one
of the HPACK benchmarks:

(rawCnt & 1) == 0:

Benchmark                     (limitToAscii)  (sensitive)  (size)   Mode  Cnt      Score     Error  Units
HpackDecoderBenchmark.decode            true         true  MEDIUM  thrpt   30  50914.479 ± 940.114  ops/s


rawCnt == 2 || rawCnt == 4 || rawCnt == 6 || rawCnt == 8 || (rawCnt & 1) == 0:

Benchmark                     (limitToAscii)  (sensitive)  (size)   Mode  Cnt      Score      Error  Units
HpackDecoderBenchmark.decode            true         true  MEDIUM  thrpt   30  60036.425 ± 1478.196  ops/s
2019-02-28 20:41:16 +01:00
.github Use GitHub Issue/PR Template Feature 2016-12-07 11:40:26 -08:00
.mvn support publishing snapshots from docker based ci (#8634) 2018-12-07 06:16:39 +01:00
all Remove OIO transport (and transports that depend on it). (#8580) 2018-11-21 15:23:18 +01:00
bom Remove OIO transport (and transports that depend on it). (#8580) 2018-11-21 15:23:18 +01:00
buffer Further reduce ensureAccessible() overhead (#8895) 2019-02-28 20:41:16 +01:00
codec Tighten up contract of PromiseCombiner and so make it more safe to use (#8886) 2019-02-28 20:39:37 +01:00
codec-dns use checkPositive/checkPositiveOrZero (#8835) 2019-02-04 15:55:07 +01:00
codec-haproxy migrate java8: use requireNonNull (#8840) 2019-02-04 10:32:25 +01:00
codec-http Avoid unnecessary char casts for CookieEncoder (#8827) 2019-02-25 19:50:46 +01:00
codec-http2 Tighten up contract of PromiseCombiner and so make it more safe to use (#8886) 2019-02-28 20:39:37 +01:00
codec-memcache use checkPositive/checkPositiveOrZero (#8835) 2019-02-04 15:55:07 +01:00
codec-mqtt migrate java8: use requireNonNull (#8840) 2019-02-04 10:32:25 +01:00
codec-redis migrate java8: use requireNonNull (#8840) 2019-02-04 10:32:25 +01:00
codec-smtp migrate java8: use requireNonNull (#8840) 2019-02-04 10:32:25 +01:00
codec-socks migrate java8: use requireNonNull (#8840) 2019-02-04 10:32:25 +01:00
codec-stomp use checkPositive/checkPositiveOrZero (#8835) 2019-02-04 15:55:07 +01:00
codec-xml Update to new checkstyle plugin (#8777) 2019-01-24 16:24:19 +01:00
common Tighten up contract of PromiseCombiner and so make it more safe to use (#8886) 2019-02-28 20:39:37 +01:00
dev-tools Update version number to start working on Netty 5 2018-11-20 15:49:57 +01:00
docker Update JDK12 and 13 to latest EA releases. (#8809) 2019-02-28 13:55:01 +01:00
example Drop SPDY support (#8845) 2019-02-07 09:25:31 +01:00
handler Correctly resume wrap / unwrap when SslTask execution completes (#8899) 2019-02-28 20:30:04 +01:00
handler-proxy Also use java.util.Base64 in handler-proxy module (#8850) 2019-02-12 08:04:09 -08:00
license Add the NOTICE of the forked portion of Apache Harmony 2018-01-30 11:22:51 +01:00
microbench Further reduce ensureAccessible() overhead (#8895) 2019-02-28 20:41:16 +01:00
resolver migrate java8: use requireNonNull (#8840) 2019-02-04 10:32:25 +01:00
resolver-dns migrate java8: use requireNonNull (#8840) 2019-02-04 10:32:25 +01:00
tarball Update version number to start working on Netty 5 2018-11-20 15:49:57 +01:00
testsuite DefaultFileRegion.transferTo with invalid count may cause busy-spin (#8885) 2019-02-26 11:21:03 +01:00
testsuite-autobahn Compare HttpMethod by reference (#8815) 2019-01-30 21:17:24 +01:00
testsuite-http2 migrate java8: use requireNonNull (#8840) 2019-02-04 10:32:25 +01:00
testsuite-osgi Java 8 migration: Use diamond operator (#8749) 2019-01-22 16:07:26 +01:00
testsuite-shading Update version number to start working on Netty 5 2018-11-20 15:49:57 +01:00
transport Tighten up contract of PromiseCombiner and so make it more safe to use (#8886) 2019-02-28 20:39:37 +01:00
transport-native-epoll DefaultFileRegion.transferTo with invalid count may cause busy-spin (#8885) 2019-02-26 11:21:03 +01:00
transport-native-kqueue DefaultFileRegion.transferTo with invalid count may cause busy-spin (#8885) 2019-02-26 11:21:03 +01:00
transport-native-unix-common migrate java8: use requireNonNull (#8840) 2019-02-04 10:32:25 +01:00
transport-native-unix-common-tests Correcting Maven Dependencies (#8622) 2018-12-06 09:02:00 +01:00
transport-sctp migrate java8: use requireNonNull (#8840) 2019-02-04 10:32:25 +01:00
.fbprefs Updated Find Bugs configuration 2009-03-04 10:33:09 +00:00
.gitattributes Include mvn wrapper to make setup of development env easier 2018-01-26 08:13:17 +01:00
.gitignore Exclude mainframer related files from git 2018-10-14 13:20:18 +02:00
CONTRIBUTING.md Move the pull request guide to the developer guide 2014-03-12 13:13:58 +09:00
LICENSE.txt Relicensed to Apache License v2 2009-08-28 07:15:49 +00:00
mvnw Include mvn wrapper to make setup of development env easier 2018-01-26 08:13:17 +01:00
mvnw.cmd Include mvn wrapper to make setup of development env easier 2018-01-26 08:13:17 +01:00
NOTICE.txt Add the NOTICE of the forked portion of Apache Harmony 2018-01-30 11:22:51 +01:00
pom.xml Update to new checkstyle plugin (#8777) 2019-01-24 16:24:19 +01:00
README.md Provide an Automatic-Module-Name for the netty-all artifact fixes #7644 2018-01-27 20:31:16 +01:00
run-example.sh Drop SPDY support (#8845) 2019-02-07 09:25:31 +01:00

Netty Project

Netty is an asynchronous event-driven network application framework for rapid development of maintainable high performance protocol servers & clients.

How to build

For detailed information about building and developing Netty, please visit the developer guide. This page gives only very basic information.

You require the following to build Netty:

Note that this is a build-time requirement. JDK 5 (for 3.x) or 6 (for 4.0+) is enough to run your Netty-based application.

Branches to look at

Development of each version takes place in the branch whose name is identical to <majorVersion>.<minorVersion>. For example, development of 3.9 and 4.0 takes place in the branches '3.9' and '4.0' respectively.

Usage with JDK 9

Netty can be used in modular JDK 9 applications as a collection of automatic modules. The module names follow the reverse-DNS style and, for historical reasons, are derived from subproject names rather than root packages. They are listed below:

  • io.netty.all
  • io.netty.buffer
  • io.netty.codec
  • io.netty.codec.dns
  • io.netty.codec.haproxy
  • io.netty.codec.http
  • io.netty.codec.http2
  • io.netty.codec.memcache
  • io.netty.codec.mqtt
  • io.netty.codec.redis
  • io.netty.codec.smtp
  • io.netty.codec.socks
  • io.netty.codec.stomp
  • io.netty.codec.xml
  • io.netty.common
  • io.netty.handler
  • io.netty.handler.proxy
  • io.netty.resolver
  • io.netty.resolver.dns
  • io.netty.transport
  • io.netty.transport.epoll (native omitted - reserved keyword in Java)
  • io.netty.transport.kqueue (native omitted - reserved keyword in Java)
  • io.netty.transport.unix.common (native omitted - reserved keyword in Java)
  • io.netty.transport.rxtx
  • io.netty.transport.sctp
  • io.netty.transport.udt

Automatic modules do not provide any means to declare dependencies, so you need to list each used module separately in your module-info file.
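
As a minimal sketch, a module declaration for an application that uses only the buffer, common, and transport modules might look like the following. The module name com.example.app is hypothetical, and the set of required modules is an assumption that depends on which Netty artifacts the application actually uses.

```java
// module-info.java
// Hypothetical application module; the name com.example.app is an assumption.
module com.example.app {
    // Each Netty automatic module the application uses is required explicitly,
    // because automatic modules cannot declare their own dependencies.
    requires io.netty.buffer;
    requires io.netty.common;
    requires io.netty.transport;
}
```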