Motivation:
We need to ensure the tracked object can not be GC'ed before ResourceLeak.close() is called as otherwise we may get false-positives reported by the ResourceLeakDetector. This can happen as the JIT / GC may be able to figure out that we do not need the tracked object anymore and so already enqueue it for collection before we actually get a chance to close the enclosing ResourceLeak.
Modifications:
- Add ResourceLeakTracker and deprecate the old ResourceLeak
- Fix some javadocs to correctly release buffers.
- Add a unit test for ResourceLeakDetector that shows that ResourceLeakTracker has not the problems.
Result:
No more false-positives reported by ResourceLeakDetector when ResourceLeakDetector.track(...) is used.
Motivation:
42fba015ce changed the implemention of ResourceLeakDetector to improve performance. While this was done a branch was missed that can be removed. Beside this using a Boolean as value for the ConcurrentMap is sub-optimal as when calling remove(key, value) an uncessary instanceof check and cast is needed on each removal.
Modifications:
- Remove branch which is not needed anymore
- Replace usage of Boolean as value type of the ConcurrentMap and use our own special type which only compute hash-code one time and use a == operation for equals(...) to reduce overhead present when using Boolean.
Result:
Faster and cleaner ResourceLeakDetector.
Motivation:
ResourceLeakDetector shows two main problems, racy access and heavy lock contention.
Modifications:
This PR fixes this by doing two things:
1. Replace the sampling counter with a ThreadLocalRandom. This has two benefits.
First, it makes the sampling ration no longer have to be a power of two. Second,
it de-noises the continuous races that fight over this single value. Instead,
this change uses slightly more CPU to decide if it should sample by using TLR.
2. DefaultResourceLeaks need to be kept alive in order to catch leaks. The means
by which this happens is by a singular, doubly-linked list. This creates a
large amount of contention when allocating quickly. This is noticeable when
running on a multi core machine.
Instead, this uses a concurrent hash map to keep track of active resources
which has much better contention characteristics.
Results:
Better concurrent hygiene. Running the gRPC QPS benchmark showed RLD taking about
3 CPU seconds for every 1 wall second when runnign with 12 threads.
There are some minor perks to this as well. DefaultResourceLeak accounting is
moved to a central place which probably has better caching behavior.
Motivation:
ResourceLeakDetector reports leak for first call to open(obj) as its leakCheckCnt starts with value 0 and increment subsequently. with value of leakCheckCnt =0, it always returns ResourceLeak. Our application calls ResourceLeakDetector.open(obj) to validate Leak and it fails at very first call even though there is no leak in application.
Modifications:
ResourceLeakDetector.leakCheckCnt value will not be 0 while deriving leak and it will not return incorrect value of ResourceLeak.
Result:
Fix false leak report on first call on ResourceLeakDetector.
Motivation:
Some usages of findNextPositivePowerOfTwo assume that bounds checking is taken care of by this method. However bounds checking is not taken care of by findNextPositivePowerOfTwo and instead assert statements are used to imply the caller has checked the bounds. This can lead to unexpected non power of 2 return values if the caller is not careful and thus invalidate any logic which depends upon a power of 2.
Modifications:
- Add a safeFindNextPositivePowerOfTwo method which will do runtime bounds checks and always return a power of 2
Result:
Fixes https://github.com/netty/netty/issues/5601
Motivation:
We recently added the ResourceLeakDetectorFactory but missed to updated HashedWheelTimer to use it.
Modifications:
- Add new abstract method to ResourceLeakDetectorFactory that allows to provide also samplingInterval and maxActive args.
- Deprecate most constructors in ResourceLeakDetector and add doc explaining that people should use ResourceLeakDetectorFactory
Result:
Custom ResourceLeakDetectorFactory will also be used in HashedWheelTimer if configured.
Allow users of Netty to plug in their own leak detector for the purpose
of instrumentation.
Motivation:
We are rolling out a large Netty deployment and want to be able to
track the amount of leaks we're seeing in production via custom
instrumentation. In order to achieve this today, I had to plug in a
custom `ByteBufAllocator` into the bootstrap and have it initialize a
custom `ResourceLeakDetector`. Due to these classes mostly being marked
`final` or having private or static methods, a lot of the code had to
be copy-pasted and it's quite ugly.
Modifications:
* I've added a static loader method for the `ResourceLeakDetector` in
`AbstractByteBuf` that tries to instantiate the class passed in via the
`-Dio.netty.customResourceLeakDetector`, otherwise falling back to the
default one.
* I've modified `ResourceLeakDetector` to be non-final and to have the
reporting broken out in to methods that can be overridden.
Result:
You can instrument leaks in your application by just adding something
like the following:
```java
public class InstrumentedResourceLeakDetector<T> extends
ResourceLeakDetector<T> {
@Monitor("InstanceLeakCounter")
private final AtomicInteger instancesLeakCounter;
@Monitor("LeakCounter")
private final AtomicInteger leakCounter;
public InstrumentedResourceLeakDetector(Class<T> resource) {
super(resource);
this.instancesLeakCounter = new AtomicInteger();
this.leakCounter = new AtomicInteger();
}
@Override
protected void reportTracedLeak(String records) {
super.reportTracedLeak(records);
leakCounter.incrementAndGet();
}
@Override
protected void reportUntracedLeak() {
super.reportUntracedLeak();
leakCounter.incrementAndGet();
}
@Override
protected void reportInstancesLeak() {
super.reportInstancesLeak();
instancesLeakCounter.incrementAndGet();
}
}
```
Motivation:
ResourceLeakDetector enforces a limit as to how large the queue is allowed to grow for stack traces in order to keep memory from growing too large. However it is not always clear when records are dropped, or how many have been dropped. This can make interpreting leak reports more difficult if you assume all information is present when it may not be. Also we should increase the limit (currently 4) when running tests on the CI servers.
Modifications:
- Increase leak detector record limit on CI servers from 4 to 32.
- Track how many records have been discarded and disclose this in the leak report.
Result:
Leak reports clarify how many records were dropped, and how to increase the limit.
Motivation:
Modulo operations are slow, we can use bitwise operation to detect if resource leak detection must be done while sampling.
Modifications:
- Ensure the interval is a power of two
- Use bitwise operation for sampling
- Add benchmark.
Result:
Faster sampling.
Motivation:
Leak detector, when it detects a leak, will print the last 5 stack
traces that touched the ByteBuf. In some cases that might not be enough
to identify the root cause of the leak.
Also, sometimes users might not be interested in tracing all the
operations on the buffer, but just the ones that are affecting the
reference count.
Modifications:
Added command line properties to override default values:
* Allow to configure max number of stack traces to collect
* Allow to only record retain/release operation on buffers
Result:
Users can increase the number of stack traces to debug buffer leaks
with lot of retain/release operations.
Motivation:
When a user sees an error message, sometimes he or she does not know
what exactly he or she has to do to fix the problem.
Modifications:
Log the URL of the wiki pages that might help the user troubleshoot.
Result:
We are more friendly.
Motivation:
Found performance issues via FindBugs and PMD.
Modifications:
- Removed unnecessary boxing/unboxing operations in DefaultTextHeaders.convertToInt(CharSequence) and DefaultTextHeaders.convertToLong(CharSequence). A boxed primitive is created from a string, just to extract the unboxed primitive value.
- Added a static modifier for DefaultHttp2Connection.ParentChangedEvent class. This class is an inner class, but does not use its embedded reference to the object which created it. This reference makes the instances of the class larger, and may keep the reference to the creator object alive longer than necessary.
- Added a static compiled Pattern to avoid compile it each time it is used when we need to replace some part of authority.
- Improved using of StringBuilders.
Result:
Performance improvements.
- Related: #2163
- Add ResourceLeakHint to allow a user to provide a meaningful information about the leak when touching it
- DefaultChannelHandlerContext now implements ResourceLeakHint to tell where the message is going.
- Cleaner resource leak report by excluding noisy stack trace elements
- Remove the reference to ResourceLeak from the buffer implementations
and use wrappers instead:
- SimpleLeakAwareByteBuf and AdvancedLeakAwareByteBuf
- It is now allocator's responsibility to create a leak-aware buffer.
- Added AbstractByteBufAllocator.toLeakAwareBuffer() for easier
implementation
- Add WrappedByteBuf to reduce duplication between *LeakAwareByteBuf and
UnreleasableByteBuf
- Raise the level of leak reports to ERROR - because it will break the
app eventually
- Replace enabled/disabled property with the leak detection level
- Only print stack trace when level is ADVANCED or above to avoid user
confusion
- Add the 'leak' build profile, which enables highly detailed leak
reporting during the build
- Remove ResourceLeakException which is unsed anymore
The API changes made so far turned out to increase the memory footprint
and consumption while our intention was actually decreasing them.
Memory consumption issue:
When there are many connections which does not exchange data frequently,
the old Netty 4 API spent a lot more memory than 3 because it always
allocates per-handler buffer for each connection unless otherwise
explicitly stated by a user. In a usual real world load, a client
doesn't always send requests without pausing, so the idea of having a
buffer whose life cycle if bound to the life cycle of a connection
didn't work as expected.
Memory footprint issue:
The old Netty 4 API decreased overall memory footprint by a great deal
in many cases. It was mainly because the old Netty 4 API did not
allocate a new buffer and event object for each read. Instead, it
created a new buffer for each handler in a pipeline. This works pretty
well as long as the number of handlers in a pipeline is only a few.
However, for a highly modular application with many handlers which
handles connections which lasts for relatively short period, it actually
makes the memory footprint issue much worse.
Changes:
All in all, this is about retaining all the good changes we made in 4 so
far such as better thread model and going back to the way how we dealt
with message events in 3.
To fix the memory consumption/footprint issue mentioned above, we made a
hard decision to break the backward compatibility again with the
following changes:
- Remove MessageBuf
- Merge Buf into ByteBuf
- Merge ChannelInboundByte/MessageHandler and ChannelStateHandler into ChannelInboundHandler
- Similar changes were made to the adapter classes
- Merge ChannelOutboundByte/MessageHandler and ChannelOperationHandler into ChannelOutboundHandler
- Similar changes were made to the adapter classes
- Introduce MessageList which is similar to `MessageEvent` in Netty 3
- Replace inboundBufferUpdated(ctx) with messageReceived(ctx, MessageList)
- Replace flush(ctx, promise) with write(ctx, MessageList, promise)
- Remove ByteToByteEncoder/Decoder/Codec
- Replaced by MessageToByteEncoder<ByteBuf>, ByteToMessageDecoder<ByteBuf>, and ByteMessageCodec<ByteBuf>
- Merge EmbeddedByteChannel and EmbeddedMessageChannel into EmbeddedChannel
- Add SimpleChannelInboundHandler which is sometimes more useful than
ChannelInboundHandlerAdapter
- Bring back Channel.isWritable() from Netty 3
- Add ChannelInboundHandler.channelWritabilityChanges() event
- Add RecvByteBufAllocator configuration property
- Similar to ReceiveBufferSizePredictor in Netty 3
- Some existing configuration properties such as
DatagramChannelConfig.receivePacketSize is gone now.
- Remove suspend/resumeIntermediaryDeallocation() in ByteBuf
This change would have been impossible without @normanmaurer's help. He
fixed, ported, and improved many parts of the changes.
Now that we are going to use buffer pooling by default, it is obvious
that a user will forget to call .free() and report memory leak. In this
case, we should have a tool to determine if it is a bug in our allocator
implementation or in the user's code.
This pull request adds a system property flag called
'io.netty.resourceLeakDetection'. If set, when a user forgets to call
.free(), the ResourceLeakDetector will detect it and log a message with
detailed stack trace to tell where the leaked buffer has been allocated.
Because obtaining stack trace is an expensive operation, I used sampling
technique. Allocation is recorded only for every 113th allocation. I
chose 113 because it's a prime number.
In production, a user might not want to enable this option due to
potential performance impact. If a user does not specify the
'-Dio.netty.resourceLeakDetection' option leak detection is disabled.
Even if the leak detection is enabled, the overhead should be less than
5% because only ~1% of allocations are monitored.
I also replaced SharedResourceMisuseDetector with ResourceLeakDetector.