The API changes made so far turned out to increase the memory footprint
and consumption while our intention was actually decreasing them.
Memory consumption issue:
When there are many connections which does not exchange data frequently,
the old Netty 4 API spent a lot more memory than 3 because it always
allocates per-handler buffer for each connection unless otherwise
explicitly stated by a user. In a usual real world load, a client
doesn't always send requests without pausing, so the idea of having a
buffer whose life cycle if bound to the life cycle of a connection
didn't work as expected.
Memory footprint issue:
The old Netty 4 API decreased overall memory footprint by a great deal
in many cases. It was mainly because the old Netty 4 API did not
allocate a new buffer and event object for each read. Instead, it
created a new buffer for each handler in a pipeline. This works pretty
well as long as the number of handlers in a pipeline is only a few.
However, for a highly modular application with many handlers which
handles connections which lasts for relatively short period, it actually
makes the memory footprint issue much worse.
Changes:
All in all, this is about retaining all the good changes we made in 4 so
far such as better thread model and going back to the way how we dealt
with message events in 3.
To fix the memory consumption/footprint issue mentioned above, we made a
hard decision to break the backward compatibility again with the
following changes:
- Remove MessageBuf
- Merge Buf into ByteBuf
- Merge ChannelInboundByte/MessageHandler and ChannelStateHandler into ChannelInboundHandler
- Similar changes were made to the adapter classes
- Merge ChannelOutboundByte/MessageHandler and ChannelOperationHandler into ChannelOutboundHandler
- Similar changes were made to the adapter classes
- Introduce MessageList which is similar to `MessageEvent` in Netty 3
- Replace inboundBufferUpdated(ctx) with messageReceived(ctx, MessageList)
- Replace flush(ctx, promise) with write(ctx, MessageList, promise)
- Remove ByteToByteEncoder/Decoder/Codec
- Replaced by MessageToByteEncoder<ByteBuf>, ByteToMessageDecoder<ByteBuf>, and ByteMessageCodec<ByteBuf>
- Merge EmbeddedByteChannel and EmbeddedMessageChannel into EmbeddedChannel
- Add SimpleChannelInboundHandler which is sometimes more useful than
ChannelInboundHandlerAdapter
- Bring back Channel.isWritable() from Netty 3
- Add ChannelInboundHandler.channelWritabilityChanges() event
- Add RecvByteBufAllocator configuration property
- Similar to ReceiveBufferSizePredictor in Netty 3
- Some existing configuration properties such as
DatagramChannelConfig.receivePacketSize is gone now.
- Remove suspend/resumeIntermediaryDeallocation() in ByteBuf
This change would have been impossible without @normanmaurer's help. He
fixed, ported, and improved many parts of the changes.
- Fixes#1282 (not perfectly, but to the extent it's possible with the current API)
- Add AddressedEnvelope and DefaultAddressedEnvelope
- Make DatagramPacket extend DefaultAddressedEnvelope<ByteBuf, InetSocketAddress>
- Rename ByteBufHolder.data() to content() so that a message can implement both AddressedEnvelope and ByteBufHolder (DatagramPacket does) without introducing two getter methods for the content
- Datagram channel implementations now understand ByteBuf and ByteBufHolder as a message with unspecified remote address.
shutdownGracefully() provides two optional parameters that give more
control over when an executor has to be shut down.
- Related issue: #1307
- Add shutdownGracefully(..) and isShuttingDown()
- Deprecate shutdown() / shutdownNow()
- Replace lastAccessTime with lastExecutionTime and update it after task
execution for accurate quiet period check
- runAllTasks() and runShutdownTasks() update it automatically.
- Add updateLastExecutionTime() so that subclasses can update it
- Add a constructor parameter that tells not to add an unncessary wakeup
task in execute() if addTask() wakes up the executor thread
automatically. Previously, execute() always called wakeup() after
addTask(), which often caused an extra dummy task in the task queue.
- Use shutdownGracefully() wherever possible / Deprecation javadoc
- Reduce the running time of SingleThreadEventLoopTest from 40s to 15s
using custom graceful shutdown parameters
- Other changes made along with this commit:
- takeTask() does not throw InterruptedException anymore.
- Returns null on interruption or wakeup
- Make sure runShutdownTasks() return true even if an exception was
raised while running the shutdown tasks
- Remove unnecessary isShutdown() checks
- Consistent use of SingleThreadEventExecutor.nanoTime()
Replace isWakeupOverridden with a constructor parameter
- Fixes#1308
freeInboundBuffer() and freeOutboundBuffer() were introduced in the early days of the new API when we did not have reference counting mechanism in the buffer. A user did not want Netty to free the handler buffers had to override these methods.
However, now that we have reference counting mechanism built into the buffer, a user who wants to retain the buffers beyond handler's life cycle can simply return the buffer whose reference count is greater than 1 in newInbound/OutboundBuffer().
This change also introduce a few other changes which was needed:
* ChannelHandler.beforeAdd(...) and ChannelHandler.beforeRemove(...) were removed
* ChannelHandler.afterAdd(...) -> handlerAdded(...)
* ChannelHandler.afterRemoved(...) -> handlerRemoved(...)
* SslHandler.handshake() -> SslHandler.hanshakeFuture() as the handshake is triggered automatically after
the Channel becomes active
- Add ChannelHandlerUtil and move the core logic of ChannelInbound/OutboundMessageHandler to ChannelHandlerUtil
- Add ChannelHandlerUtil.SingleInbound/OutboundMessageHandler and make ChannelInbound/OutboundMessageHandlerAdapter implement them. This is a backward incompatible change because it forces all handler methods to be public (was protected previously)
- Fixes: #1119
This pull request introduces a new operation called read() that replaces the existing inbound traffic control method. EventLoop now performs socket reads only when the read() operation has been issued. Once the requested read() operation is actually performed, EventLoop triggers an inboundBufferSuspended event that tells the handlers that the requested read() operation has been performed and the inbound traffic has been suspended again. A handler can decide to continue reading or not.
Unlike other outbound operations, read() does not use ChannelFuture at all to avoid GC cost. If there's a good reason to create a new future per read at the GC cost, I'll change this.
This pull request consequently removes the readable property in ChannelHandlerContext, which means how the traffic control works changed significantly.
This pull request also adds a new configuration property ChannelOption.AUTO_READ whose default value is true. If true, Netty will call ctx.read() for you. If you need a close control over when read() is called, you can set it to false.
Another interesting fact is that non-terminal handlers do not really need to call read() at all. Only the last inbound handler will have to call it, and that's just enough. Actually, you don't even need to call it at the last handler in most cases because of the ChannelOption.AUTO_READ mentioned above.
There's no serious backward compatibility issue. If the compiler complains your handler does not implement the read() method, add the following:
public void read(ChannelHandlerContext ctx) throws Exception {
ctx.read();
}
Note that this pull request certainly makes bounded inbound buffer support very easy, but itself does not add the bounded inbound buffer support.
- Fixes#831
This commit ensures the following events are never triggered as a direct
invocation if they are triggered via ChannelPipeline.fire*():
- channelInactive
- channelUnregistered
- exceptionCaught
This commit also fixes the following issues surfaced by this fix:
- Embedded channel implementations run scheduled tasks too early
- SpdySessionHandlerTest tries to generate inbound data even after the
channel is closed.
- AioSocketChannel enters into an infinite loop on I/O error.
This pull request introduces the new default ByteBufAllocator implementation based on jemalloc, with a some differences:
* Minimum possible buffer capacity is 16 (jemalloc: 2)
* Uses binary heap with random branching (jemalloc: red-black tree)
* No thread-local cache yet (jemalloc has thread-local cache)
* Default page size is 8 KiB (jemalloc: 4 KiB)
* Default chunk size is 16 MiB (jemalloc: 2 MiB)
* Cannot allocate a buffer bigger than the chunk size (jemalloc: possible) because we don't have control over memory layout in Java. A user can work around this issue by creating a composite buffer, but it's not always a feasible option. Although 16 MiB is a pretty big default, a user's handler might need to deal with the bounded buffers when the user wants to deal with a large message.
Also, to ensure the new allocator performs good enough, I wrote a microbenchmark for it and made it a dedicated Maven module. It uses Google's Caliper framework to run and publish the test result (example)
Miscellaneous changes:
* Made some ByteBuf implementations public so that those who implements a new allocator can make use of them.
* Added ByteBufAllocator.compositeBuffer() and its variants.
* ByteBufAllocator.ioBuffer() creates a buffer with 0 capacity.