netty5/buffer/src/main/java/io/net5/buffer/PoolArena.java


/*
* Copyright 2012 The Netty Project
*
* The Netty Project licenses this file to you under the Apache License,
* version 2.0 (the "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at:
*
* https://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
* WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
* License for the specific language governing permissions and limitations
* under the License.
*/
package io.net5.buffer;
import io.net5.util.internal.PlatformDependent;
import io.net5.util.internal.StringUtil;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.LongAdder;
import static io.net5.buffer.PoolChunk.isSubpage;
import static java.lang.Math.max;
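/**
 * One arena of the pooled allocator. An arena owns the backing memory of a set of
 * {@link PoolChunk}s and serves three size classes: small requests come from per-size
 * {@link PoolSubpage} pools, normal requests from chunk lists ordered by usage, and huge
 * requests from dedicated unpooled chunks. Per-thread {@link PoolThreadCache}s sit in
 * front of the arena so that most allocations avoid synchronizing on it.
 *
 * <p>A minimal usage sketch, assuming the standard {@code PooledByteBufAllocator} entry
 * points (which live outside this file):
 * <pre>{@code
 * PooledByteBufAllocator alloc = PooledByteBufAllocator.DEFAULT;
 * ByteBuf buf = alloc.directBuffer(256); // served by a DirectArena via the caller's PoolThreadCache
 * try {
 *     // ... read from / write to buf ...
 * } finally {
 *     buf.release(); // memory returns to the thread cache or the arena
 * }
 * }</pre>
 */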
abstract class PoolArena<T> extends SizeClasses implements PoolArenaMetric {
static final boolean HAS_UNSAFE = PlatformDependent.hasUnsafe();
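// Tiny allocations were folded into Small in this allocator; the tiny* metric methods
// further below remain only for interface compatibility and always report zero.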
enum SizeClass {
Small,
Normal
}
final PooledByteBufAllocator parent;
final int numSmallSubpagePools;
final int directMemoryCacheAlignment;
private final PoolSubpage<T>[] smallSubpagePools;
private final PoolChunkList<T> q050;
private final PoolChunkList<T> q025;
private final PoolChunkList<T> q000;
private final PoolChunkList<T> qInit;
private final PoolChunkList<T> q075;
private final PoolChunkList<T> q100;
private final List<PoolChunkListMetric> chunkListMetrics;
// Metrics for allocations and deallocations
private long allocationsNormal;
// We need to use the LongAdder here as this is not guarded by a synchronized block.
private final LongAdder allocationsSmall = new LongAdder();
private final LongAdder allocationsHuge = new LongAdder();
private final LongAdder activeBytesHuge = new LongAdder();
private long deallocationsSmall;
private long deallocationsNormal;
// We need to use the LongAdder here as this is not guarded by a synchronized block.
private final LongAdder deallocationsHuge = new LongAdder();
// Number of thread caches backed by this arena.
final AtomicInteger numThreadCaches = new AtomicInteger();
// TODO: Test if adding padding helps under contention
//private long pad0, pad1, pad2, pad3, pad4, pad5, pad6, pad7;
protected PoolArena(PooledByteBufAllocator parent, int pageSize,
int pageShifts, int chunkSize, int cacheAlignment) {
super(pageSize, pageShifts, chunkSize, cacheAlignment);
this.parent = parent;
directMemoryCacheAlignment = cacheAlignment;
numSmallSubpagePools = nSubpages;
smallSubpagePools = newSubpagePoolArray(numSmallSubpagePools);
for (int i = 0; i < smallSubpagePools.length; i ++) {
smallSubpagePools[i] = newSubpagePoolHead();
}
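// The chunk lists form a chain keyed by usage percentage (minUsage, maxUsage): a chunk
// migrates to the next list as its usage grows and to the previous list as it shrinks.
// q000 has no previous list, so a chunk that becomes empty there is released; qInit's
// previous list is itself, so newly added chunks are never released while they sit there.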
q100 = new PoolChunkList<>(this, null, 100, Integer.MAX_VALUE, chunkSize);
q075 = new PoolChunkList<>(this, q100, 75, 100, chunkSize);
q050 = new PoolChunkList<>(this, q075, 50, 100, chunkSize);
q025 = new PoolChunkList<>(this, q050, 25, 75, chunkSize);
q000 = new PoolChunkList<>(this, q025, 1, 50, chunkSize);
qInit = new PoolChunkList<>(this, q000, Integer.MIN_VALUE, 25, chunkSize);
q100.prevList(q075);
q075.prevList(q050);
q050.prevList(q025);
q025.prevList(q000);
q000.prevList(null);
qInit.prevList(qInit);
List<PoolChunkListMetric> metrics = new ArrayList<>(6);
metrics.add(qInit);
metrics.add(q000);
metrics.add(q025);
metrics.add(q050);
metrics.add(q075);
metrics.add(q100);
chunkListMetrics = Collections.unmodifiableList(metrics);
}
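// Creates the sentinel head of a circular, doubly linked subpage list; a pool is empty
// exactly when head.next == head.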
private PoolSubpage<T> newSubpagePoolHead() {
PoolSubpage<T> head = new PoolSubpage<>();
head.prev = head;
head.next = head;
return head;
}
@SuppressWarnings("unchecked")
private PoolSubpage<T>[] newSubpagePoolArray(int size) {
return new PoolSubpage[size];
}
abstract boolean isDirect();
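// Allocation entry point: create the PooledByteBuf wrapper first, then bind pooled memory to it.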
PooledByteBuf<T> allocate(PoolThreadCache cache, int reqCapacity, int maxCapacity) {
PooledByteBuf<T> buf = newByteBuf(maxCapacity);
allocate(cache, buf, reqCapacity);
return buf;
}
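/*
 * Dispatch by size class: sizeIdx <= smallMaxSizeIdx is a small allocation served from the
 * subpage pools, sizeIdx < nSizes is a normal allocation served from a chunk, and anything
 * larger than the largest size class is a huge allocation backed by its own unpooled chunk.
 * As an illustration (assuming Netty's default size classes): a 100-byte request rounds up
 * to the 112-byte small class, since small sizes advance in 16-byte steps at the low end.
 */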
private void allocate(PoolThreadCache cache, PooledByteBuf<T> buf, final int reqCapacity) {
final int sizeIdx = size2SizeIdx(reqCapacity);
if (sizeIdx <= smallMaxSizeIdx) {
tcacheAllocateSmall(cache, buf, reqCapacity, sizeIdx);
} else if (sizeIdx < nSizes) {
tcacheAllocateNormal(cache, buf, reqCapacity, sizeIdx);
} else {
int normCapacity = directMemoryCacheAlignment > 0
? normalizeSize(reqCapacity) : reqCapacity;
// Huge allocations are never served via the cache, so just call allocateHuge
allocateHuge(buf, normCapacity);
}
}
private void tcacheAllocateSmall(PoolThreadCache cache, PooledByteBuf<T> buf, final int reqCapacity,
final int sizeIdx) {
if (cache.allocateSmall(this, buf, reqCapacity, sizeIdx)) {
// was able to allocate out of the cache so move on
return;
}
/*
* Synchronize on the head. This is needed as {@link PoolChunk#allocateSubpage(int)} and
* {@link PoolChunk#free(long)} may modify the doubly linked list as well.
*/
final PoolSubpage<T> head = smallSubpagePools[sizeIdx];
final boolean needsNormalAllocation;
synchronized (head) {
final PoolSubpage<T> s = head.next;
needsNormalAllocation = s == head;
if (!needsNormalAllocation) {
assert s.doNotDestroy && s.elemSize == sizeIdx2size(sizeIdx);
long handle = s.allocate();
assert handle >= 0;
s.chunk.initBufWithSubpage(buf, null, handle, reqCapacity, cache);
}
}
if (needsNormalAllocation) {
synchronized (this) {
allocateNormal(buf, reqCapacity, sizeIdx, cache);
}
}
incSmallAllocation();
}
private void tcacheAllocateNormal(PoolThreadCache cache, PooledByteBuf<T> buf, final int reqCapacity,
final int sizeIdx) {
if (cache.allocateNormal(this, buf, reqCapacity, sizeIdx)) {
// was able to allocate out of the cache so move on
return;
}
synchronized (this) {
allocateNormal(buf, reqCapacity, sizeIdx, cache);
++allocationsNormal;
}
}
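// The probe order below (q050, q025, q000, qInit, q075) is a deliberate heuristic: it prefers
// moderately used chunks so overall utilization stays high and lightly used chunks get a
// chance to drain and be released, rather than always filling the emptiest chunk first.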
// Method must be called inside synchronized(this) { ... } block
private void allocateNormal(PooledByteBuf<T> buf, int reqCapacity, int sizeIdx, PoolThreadCache threadCache) {
if (q050.allocate(buf, reqCapacity, sizeIdx, threadCache) ||
q025.allocate(buf, reqCapacity, sizeIdx, threadCache) ||
q000.allocate(buf, reqCapacity, sizeIdx, threadCache) ||
qInit.allocate(buf, reqCapacity, sizeIdx, threadCache) ||
q075.allocate(buf, reqCapacity, sizeIdx, threadCache)) {
return;
}
// Add a new chunk.
PoolChunk<T> c = newChunk(pageSize, nPSizes, pageShifts, chunkSize);
boolean success = c.allocate(buf, reqCapacity, sizeIdx, threadCache);
assert success;
qInit.add(c);
}
private void incSmallAllocation() {
allocationsSmall.increment();
}
private void allocateHuge(PooledByteBuf<T> buf, int reqCapacity) {
PoolChunk<T> chunk = newUnpooledChunk(reqCapacity);
activeBytesHuge.add(chunk.chunkSize());
buf.initUnpooled(chunk, reqCapacity);
allocationsHuge.increment();
}
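// Releases the memory behind a buffer. Huge (unpooled) chunks are destroyed immediately;
// pooled handles are first offered to the caller's thread cache and only freed back into
// their chunk when caching is not possible.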
void free(PoolChunk<T> chunk, ByteBuffer nioBuffer, long handle, int normCapacity, PoolThreadCache cache) {
if (chunk.unpooled) {
int size = chunk.chunkSize();
destroyChunk(chunk);
activeBytesHuge.add(-size);
deallocationsHuge.increment();
} else {
SizeClass sizeClass = sizeClass(handle);
if (cache != null && cache.add(this, chunk, nioBuffer, handle, normCapacity, sizeClass)) {
// cached, so do not free it.
return;
}
freeChunk(chunk, handle, normCapacity, sizeClass, nioBuffer, false);
}
}
private static SizeClass sizeClass(long handle) {
return isSubpage(handle) ? SizeClass.Small : SizeClass.Normal;
}
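// Frees a handle back into its chunk and updates the deallocation counters. If the chunk's
// list decides the now-emptier chunk should be released (PoolChunkList.free returns false),
// it is destroyed outside the synchronized block to keep the critical section short.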
void freeChunk(PoolChunk<T> chunk, long handle, int normCapacity, SizeClass sizeClass, ByteBuffer nioBuffer,
boolean finalizer) {
final boolean destroyChunk;
synchronized (this) {
// We only update the deallocation counters if freeChunk is not called because of the PoolThreadCache finalizer,
// as otherwise this may fail due to lazy class-loading in, for example, Tomcat.
if (!finalizer) {
switch (sizeClass) {
case Normal:
++deallocationsNormal;
break;
case Small:
++deallocationsSmall;
break;
default:
throw new Error();
}
}
destroyChunk = !chunk.parent.free(chunk, handle, normCapacity, nioBuffer);
}
if (destroyChunk) {
// destroyChunk does not need to be called while holding the synchronized lock.
destroyChunk(chunk);
}
}
PoolSubpage<T> findSubpagePoolHead(int sizeIdx) {
return smallSubpagePools[sizeIdx];
}
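// Rebinds the buffer to newly allocated memory of the requested capacity, copies
// min(oldCapacity, newCapacity) bytes, trims the indices when shrinking, and optionally
// frees the old memory.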
void reallocate(PooledByteBuf<T> buf, int newCapacity, boolean freeOldMemory) {
assert newCapacity >= 0 && newCapacity <= buf.maxCapacity();
int oldCapacity = buf.length;
if (oldCapacity == newCapacity) {
return;
}
PoolChunk<T> oldChunk = buf.chunk;
ByteBuffer oldNioBuffer = buf.tmpNioBuf;
long oldHandle = buf.handle;
T oldMemory = buf.memory;
int oldOffset = buf.offset;
int oldMaxLength = buf.maxLength;
// This does not touch buf's reader/writer indices
allocate(parent.threadCache(), buf, newCapacity);
int bytesToCopy;
if (newCapacity > oldCapacity) {
bytesToCopy = oldCapacity;
} else {
buf.trimIndicesToCapacity(newCapacity);
bytesToCopy = newCapacity;
}
memoryCopy(oldMemory, oldOffset, buf, bytesToCopy);
if (freeOldMemory) {
free(oldChunk, oldNioBuffer, oldHandle, oldMaxLength, buf.cache);
}
}
@Override
public int numThreadCaches() {
return numThreadCaches.get();
}
@Override
public int numTinySubpages() {
return 0;
}
@Override
public int numSmallSubpages() {
return smallSubpagePools.length;
}
@Override
public int numChunkLists() {
return chunkListMetrics.size();
}
@Override
public List<PoolSubpageMetric> tinySubpages() {
return Collections.emptyList();
}
@Override
public List<PoolSubpageMetric> smallSubpages() {
return subPageMetricList(smallSubpagePools);
}
@Override
public List<PoolChunkListMetric> chunkLists() {
return chunkListMetrics;
}
private static List<PoolSubpageMetric> subPageMetricList(PoolSubpage<?>[] pages) {
List<PoolSubpageMetric> metrics = new ArrayList<>();
for (PoolSubpage<?> head : pages) {
if (head.next == head) {
continue;
}
PoolSubpage<?> s = head.next;
do {
metrics.add(s);
s = s.next;
} while (s != head);
}
return metrics;
}
@Override
public long numAllocations() {
final long allocsNormal;
synchronized (this) {
allocsNormal = allocationsNormal;
}
return allocationsSmall.longValue() + allocsNormal + allocationsHuge.longValue();
}
@Override
public long numTinyAllocations() {
return 0;
}
@Override
public long numSmallAllocations() {
return allocationsSmall.longValue();
}
@Override
public synchronized long numNormalAllocations() {
return allocationsNormal;
}
@Override
public long numDeallocations() {
final long deallocs;
synchronized (this) {
deallocs = deallocationsSmall + deallocationsNormal;
}
return deallocs + deallocationsHuge.longValue();
}
@Override
public long numTinyDeallocations() {
return 0;
}
@Override
public synchronized long numSmallDeallocations() {
return deallocationsSmall;
}
@Override
public synchronized long numNormalDeallocations() {
return deallocationsNormal;
}
@Override
public long numHugeAllocations() {
return allocationsHuge.longValue();
}
@Override
public long numHugeDeallocations() {
return deallocationsHuge.longValue();
}
@Override
public long numActiveAllocations() {
long val = allocationsSmall.longValue() + allocationsHuge.longValue()
- deallocationsHuge.longValue();
synchronized (this) {
val += allocationsNormal - (deallocationsSmall + deallocationsNormal);
}
return max(val, 0);
}
@Override
public long numActiveTinyAllocations() {
return 0;
}
@Override
public long numActiveSmallAllocations() {
return max(numSmallAllocations() - numSmallDeallocations(), 0);
}
@Override
public long numActiveNormalAllocations() {
final long val;
synchronized (this) {
val = allocationsNormal - deallocationsNormal;
}
return max(val, 0);
}
@Override
public long numActiveHugeAllocations() {
return max(numHugeAllocations() - numHugeDeallocations(), 0);
}
@Override
public long numActiveBytes() {
long val = activeBytesHuge.longValue();
synchronized (this) {
for (int i = 0; i < chunkListMetrics.size(); i++) {
for (PoolChunkMetric m: chunkListMetrics.get(i)) {
val += m.chunkSize();
}
}
}
return max(0, val);
}
/**
 * Return the number of bytes that are currently pinned to buffer instances by the arena. The pinned memory is
 * not accessible for use by any other allocation until the buffers using it have all been released.
 */
public long numPinnedBytes() {
long val = activeBytesHuge.longValue(); // Huge chunks are exact-sized for the buffers they were allocated to.
synchronized (this) {
for (int i = 0; i < chunkListMetrics.size(); i++) {
for (PoolChunkMetric m: chunkListMetrics.get(i)) {
val += ((PoolChunk<?>) m).pinnedBytes();
}
}
}
return max(0, val);
}
protected abstract PoolChunk<T> newChunk(int pageSize, int maxPageIdx, int pageShifts, int chunkSize);
protected abstract PoolChunk<T> newUnpooledChunk(int capacity);
protected abstract PooledByteBuf<T> newByteBuf(int maxCapacity);
protected abstract void memoryCopy(T src, int srcOffset, PooledByteBuf<T> dst, int length);
protected abstract void destroyChunk(PoolChunk<T> chunk);
@Override
public synchronized String toString() {
StringBuilder buf = new StringBuilder()
.append("Chunk(s) at 0~25%:")
.append(StringUtil.NEWLINE)
.append(qInit)
.append(StringUtil.NEWLINE)
.append("Chunk(s) at 0~50%:")
.append(StringUtil.NEWLINE)
.append(q000)
.append(StringUtil.NEWLINE)
.append("Chunk(s) at 25~75%:")
.append(StringUtil.NEWLINE)
.append(q025)
.append(StringUtil.NEWLINE)
.append("Chunk(s) at 50~100%:")
.append(StringUtil.NEWLINE)
.append(q050)
.append(StringUtil.NEWLINE)
.append("Chunk(s) at 75~100%:")
.append(StringUtil.NEWLINE)
.append(q075)
.append(StringUtil.NEWLINE)
.append("Chunk(s) at 100%:")
.append(StringUtil.NEWLINE)
.append(q100)
.append(StringUtil.NEWLINE)
.append("small subpages:");
appendPoolSubPages(buf, smallSubpagePools);
buf.append(StringUtil.NEWLINE);
return buf.toString();
}
private static void appendPoolSubPages(StringBuilder buf, PoolSubpage<?>[] subpages) {
for (int i = 0; i < subpages.length; i ++) {
PoolSubpage<?> head = subpages[i];
if (head.next == head) {
continue;
}
buf.append(StringUtil.NEWLINE)
.append(i)
.append(": ");
PoolSubpage<?> s = head.next;
do {
buf.append(s);
s = s.next;
} while (s != head);
}
}
@Override
protected final void finalize() throws Throwable {
try {
super.finalize();
} finally {
destroyPoolSubPages(smallSubpagePools);
destroyPoolChunkLists(qInit, q000, q025, q050, q075, q100);
}
}
private static void destroyPoolSubPages(PoolSubpage<?>[] pages) {
for (PoolSubpage<?> page : pages) {
page.destroy();
}
}
private void destroyPoolChunkLists(PoolChunkList<T>... chunkLists) {
for (PoolChunkList<T> chunkList: chunkLists) {
chunkList.destroy(this);
}
}
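// Heap variant: chunk memory is a plain byte[], so destroying a chunk simply drops the
// reference and relies on the GC.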
static final class HeapArena extends PoolArena<byte[]> {
HeapArena(PooledByteBufAllocator parent, int pageSize, int pageShifts,
int chunkSize, int directMemoryCacheAlignment) {
super(parent, pageSize, pageShifts, chunkSize,
directMemoryCacheAlignment);
}
private static byte[] newByteArray(int size) {
return PlatformDependent.allocateUninitializedArray(size);
}
@Override
boolean isDirect() {
return false;
}
@Override
protected PoolChunk<byte[]> newChunk(int pageSize, int maxPageIdx, int pageShifts, int chunkSize) {
return new PoolChunk<>(
this, null, newByteArray(chunkSize), pageSize, pageShifts, chunkSize, maxPageIdx);
}
@Override
protected PoolChunk<byte[]> newUnpooledChunk(int capacity) {
return new PoolChunk<>(this, null, newByteArray(capacity), capacity);
}
@Override
protected void destroyChunk(PoolChunk<byte[]> chunk) {
// Rely on GC.
}
@Override
protected PooledByteBuf<byte[]> newByteBuf(int maxCapacity) {
return HAS_UNSAFE ? PooledUnsafeHeapByteBuf.newUnsafeInstance(maxCapacity)
: PooledHeapByteBuf.newInstance(maxCapacity);
}
@Override
protected void memoryCopy(byte[] src, int srcOffset, PooledByteBuf<byte[]> dst, int length) {
if (length == 0) {
return;
}
System.arraycopy(src, srcOffset, dst.memory, dst.offset, length);
}
}
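// Direct variant: chunk memory is a direct ByteBuffer, optionally aligned to
// directMemoryCacheAlignment bytes.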
static final class DirectArena extends PoolArena<ByteBuffer> {
DirectArena(PooledByteBufAllocator parent, int pageSize, int pageShifts,
int chunkSize, int directMemoryCacheAlignment) {
super(parent, pageSize, pageShifts, chunkSize,
directMemoryCacheAlignment);
}
@Override
boolean isDirect() {
return true;
}
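// When alignment is enabled, a chunk over-allocates by directMemoryCacheAlignment bytes and
// uses an aligned slice of the base buffer as its memory (PlatformDependent.alignDirectBuffer);
// the unaligned base is kept on the chunk so the full allocation can be freed later.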
@Override
protected PoolChunk<ByteBuffer> newChunk(int pageSize, int maxPageIdx,
int pageShifts, int chunkSize) {
if (directMemoryCacheAlignment == 0) {
ByteBuffer memory = allocateDirect(chunkSize);
return new PoolChunk<>(this, memory, memory, pageSize, pageShifts,
chunkSize, maxPageIdx);
}
final ByteBuffer base = allocateDirect(chunkSize + directMemoryCacheAlignment);
final ByteBuffer memory = PlatformDependent.alignDirectBuffer(base, directMemoryCacheAlignment);
return new PoolChunk<>(this, base, memory, pageSize,
pageShifts, chunkSize, maxPageIdx);
}
@Override
protected PoolChunk<ByteBuffer> newUnpooledChunk(int capacity) {
if (directMemoryCacheAlignment == 0) {
ByteBuffer memory = allocateDirect(capacity);
return new PoolChunk<>(this, memory, memory, capacity);
}
final ByteBuffer base = allocateDirect(capacity + directMemoryCacheAlignment);
final ByteBuffer memory = PlatformDependent.alignDirectBuffer(base, directMemoryCacheAlignment);
return new PoolChunk<>(this, base, memory, capacity);
}
private static ByteBuffer allocateDirect(int capacity) {
return PlatformDependent.useDirectBufferNoCleaner() ?
PlatformDependent.allocateDirectNoCleaner(capacity) : ByteBuffer.allocateDirect(capacity);
}
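// Free the original (possibly unaligned) base buffer, taking the no-cleaner path when direct
// buffers are allocated without a Cleaner attached.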
@Override
protected void destroyChunk(PoolChunk<ByteBuffer> chunk) {
if (PlatformDependent.useDirectBufferNoCleaner()) {
PlatformDependent.freeDirectNoCleaner((ByteBuffer) chunk.base);
} else {
PlatformDependent.freeDirectBuffer((ByteBuffer) chunk.base);
}
}
@Override
protected PooledByteBuf<ByteBuffer> newByteBuf(int maxCapacity) {
if (HAS_UNSAFE) {
return PooledUnsafeDirectByteBuf.newInstance(maxCapacity);
} else {
return PooledDirectByteBuf.newInstance(maxCapacity);
}
}
@Override
protected void memoryCopy(ByteBuffer src, int srcOffset, PooledByteBuf<ByteBuffer> dstBuf, int length) {
if (length == 0) {
return;
}
if (HAS_UNSAFE) {
PlatformDependent.copyMemory(
PlatformDependent.directBufferAddress(src) + srcOffset,
PlatformDependent.directBufferAddress(dstBuf.memory) + dstBuf.offset, length);
} else {
// We must duplicate the NIO buffers because they may be accessed by other Netty buffers.
src = src.duplicate();
ByteBuffer dst = dstBuf.internalNioBuffer();
src.position(srcOffset).limit(srcOffset + length);
dst.position(dstBuf.offset);
dst.put(src);
}
}
}
}