Norman Maurer 73dfd7c01b [#2693] Reduce memory usage of ChannelOutboundBuffer
Motivation:

ChannelOutboundBuffer often uses too much memory. This is especially a problem if you want to serve a lot of connections. The cause is that it uses two arrays internally: one is used as a circular buffer and stores the Entries, which are never released (ChannelOutboundBuffer is pooled), and one is used to hold the ByteBuffers that are used for gathering writes.

Modifications:

Rewrite ChannelOutboundBuffer to remove these two arrays by:
  - Make Entry recyclable and use it as a linked node
  - Remove the circular buffer that was used for the Entries, as we now use a linked-list-like structure
  - Remove the array that held the ByteBuffers and replace it with a ByteBuffer array that is held by a FastThreadLocal. We use a fixed capacity of 1024 here, which is fine as we share these anyway.
  - ChannelOutboundBuffer is not recyclable anymore as it is now a "light-weight" object. We recycle the internally used Entries instead.
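
The structure described by the list above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the class name OutboundBufferSketch and all method names are hypothetical, a plain java.lang.ThreadLocal stands in for FastThreadLocal, and a small per-thread ArrayDeque stands in for Netty's Recycler.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;

// Hypothetical sketch only; names and details are assumptions, not Netty's actual code.
final class OutboundBufferSketch {

    // A linked node that is returned to a per-thread pool instead of being discarded.
    static final class Entry {
        // Stand-in for a Recycler: a per-thread stack of spare entries.
        private static final ThreadLocal<ArrayDeque<Entry>> POOL =
                ThreadLocal.withInitial(ArrayDeque::new);

        ByteBuffer msg;
        Entry next;

        static Entry newInstance(ByteBuffer msg) {
            Entry e = POOL.get().pollFirst();
            if (e == null) {
                e = new Entry();
            }
            e.msg = msg;
            return e;
        }

        void recycle() {
            // Clear references so a pooled entry does not keep its message alive.
            msg = null;
            next = null;
            POOL.get().addFirst(this);
        }
    }

    // One ByteBuffer[] per thread, shared by every buffer used on that thread.
    // The fixed capacity of 1024 follows the commit message.
    private static final ThreadLocal<ByteBuffer[]> NIO_BUFFERS =
            ThreadLocal.withInitial(() -> new ByteBuffer[1024]);

    private Entry head;
    private Entry tail;
    private int size;

    // Appends a message to the tail of the linked list.
    void add(ByteBuffer msg) {
        Entry e = Entry.newInstance(msg);
        if (tail == null) {
            head = e;
        } else {
            tail.next = e;
        }
        tail = e;
        size++;
    }

    // Removes the head entry and recycles it; returns false if the list is empty.
    boolean remove() {
        Entry e = head;
        if (e == null) {
            return false;
        }
        head = e.next;
        if (head == null) {
            tail = null;
        }
        size--;
        e.recycle();
        return true;
    }

    int size() {
        return size;
    }

    // Copies pending buffers into the shared array for a gathering write
    // and returns how many slots were filled.
    int fillNioBuffers() {
        ByteBuffer[] bufs = NIO_BUFFERS.get();
        int count = 0;
        for (Entry e = head; e != null && count < bufs.length; e = e.next) {
            bufs[count++] = e.msg;
        }
        return count;
    }

    // Exposes the calling thread's shared array.
    static ByteBuffer[] nioBuffers() {
        return NIO_BUFFERS.get();
    }
}
```

In this sketch the per-buffer state shrinks to two references and a counter, which is the sense in which the buffer itself no longer needs pooling; only the entries and the shared array are reused.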

Result:

Smaller memory footprint and lower resource usage. Performance seems to be slightly better, most likely because we no longer need to expand any arrays.

Benchmark before change:
[nmaurer@xxx]~% wrk/wrk -H 'Host: localhost' -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8' -H 'Connection: keep-alive' -d 120 -c 256 -t 16 --pipeline 256  http://xxx:8080/plaintext
Running 2m test @ http://xxx:8080/plaintext
  16 threads and 256 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    26.88ms   67.47ms   1.26s    97.97%
    Req/Sec   191.81k    28.22k  255.63k    83.86%
  364806639 requests in 2.00m, 48.92GB read
Requests/sec: 3040101.23
Transfer/sec:    417.49MB

Benchmark after change:

[nmaurer@xxx]~% wrk/wrk -H 'Host: localhost' -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8' -H 'Connection: keep-alive' -d 120 -c 256 -t 16 --pipeline 256  http://xxx:8080/plaintext
Running 2m test @ http://xxx:8080/plaintext
  16 threads and 256 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    22.22ms   17.22ms 301.77ms   90.13%
    Req/Sec   194.98k    41.98k  328.38k    70.50%
  371816023 requests in 2.00m, 49.86GB read
Requests/sec: 3098461.44
Transfer/sec:    425.51MB
2014-07-28 15:08:16 -07:00