2008-08-08 02:37:18 +02:00
|
|
|
/*
|
2012-06-04 22:31:44 +02:00
|
|
|
* Copyright 2012 The Netty Project
|
2008-08-08 02:37:18 +02:00
|
|
|
*
|
2011-12-09 06:18:34 +01:00
|
|
|
* The Netty Project licenses this file to you under the Apache License,
|
|
|
|
* version 2.0 (the "License"); you may not use this file except in compliance
|
|
|
|
* with the License. You may obtain a copy of the License at:
|
2008-08-08 02:37:18 +02:00
|
|
|
*
|
2012-06-04 22:31:44 +02:00
|
|
|
* http://www.apache.org/licenses/LICENSE-2.0
|
2008-08-08 03:27:24 +02:00
|
|
|
*
|
2009-08-28 09:15:49 +02:00
|
|
|
* Unless required by applicable law or agreed to in writing, software
|
|
|
|
* distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
|
2011-12-09 06:18:34 +01:00
|
|
|
* WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
|
2009-08-28 09:15:49 +02:00
|
|
|
* License for the specific language governing permissions and limitations
|
|
|
|
* under the License.
|
2008-08-08 02:37:18 +02:00
|
|
|
*/
|
2011-12-09 04:38:59 +01:00
|
|
|
package io.netty.buffer;
|
2008-08-08 02:37:18 +02:00
|
|
|
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
import io.netty.util.ByteProcessor;
|
|
|
|
import io.netty.util.IllegalReferenceCountException;
|
|
|
|
import io.netty.util.ReferenceCountUtil;
|
2013-07-08 06:32:33 +02:00
|
|
|
import io.netty.util.internal.EmptyArrays;
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
import io.netty.util.internal.RecyclableArrayList;
|
2013-07-08 06:32:33 +02:00
|
|
|
|
2012-10-18 08:57:23 +02:00
|
|
|
import java.io.IOException;
|
2013-07-08 06:32:33 +02:00
|
|
|
import java.io.InputStream;
|
2012-10-18 08:57:23 +02:00
|
|
|
import java.io.OutputStream;
|
|
|
|
import java.nio.ByteBuffer;
|
2013-07-08 06:32:33 +02:00
|
|
|
import java.nio.ByteOrder;
|
2016-02-05 04:02:40 +01:00
|
|
|
import java.nio.channels.FileChannel;
|
2013-07-08 06:32:33 +02:00
|
|
|
import java.nio.channels.GatheringByteChannel;
|
|
|
|
import java.nio.channels.ScatteringByteChannel;
|
|
|
|
import java.util.ArrayList;
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
import java.util.Arrays;
|
2013-07-08 06:32:33 +02:00
|
|
|
import java.util.Collection;
|
|
|
|
import java.util.Collections;
|
2015-04-10 09:43:06 +02:00
|
|
|
import java.util.ConcurrentModificationException;
|
2013-07-08 06:32:33 +02:00
|
|
|
import java.util.Iterator;
|
2008-08-08 02:37:18 +02:00
|
|
|
import java.util.List;
|
2015-04-10 09:43:06 +02:00
|
|
|
import java.util.NoSuchElementException;
|
2008-08-08 02:37:18 +02:00
|
|
|
|
2016-01-26 22:44:20 +01:00
|
|
|
import static io.netty.util.internal.ObjectUtil.checkNotNull;
|
|
|
|
|
2012-12-21 22:22:40 +01:00
|
|
|
/**
|
2013-07-08 06:32:33 +02:00
|
|
|
* A virtual buffer which shows multiple buffers as a single merged buffer. It is recommended to use
|
|
|
|
* {@link ByteBufAllocator#compositeBuffer()} or {@link Unpooled#wrappedBuffer(ByteBuf...)} instead of calling the
|
|
|
|
* constructor explicitly.
|
2012-12-21 22:22:40 +01:00
|
|
|
*/
|
2015-04-10 09:43:06 +02:00
|
|
|
public class CompositeByteBuf extends AbstractReferenceCountedByteBuf implements Iterable<ByteBuf> {
|
2013-07-08 06:32:33 +02:00
|
|
|
|
2014-12-30 07:52:57 +01:00
|
|
|
private static final ByteBuffer EMPTY_NIO_BUFFER = Unpooled.EMPTY_BUFFER.nioBuffer();
|
2015-04-10 09:43:06 +02:00
|
|
|
private static final Iterator<ByteBuf> EMPTY_ITERATOR = Collections.<ByteBuf>emptyList().iterator();
|
2014-12-30 07:52:57 +01:00
|
|
|
|
2013-07-08 06:32:33 +02:00
|
|
|
private final ByteBufAllocator alloc;
|
|
|
|
private final boolean direct;
|
|
|
|
private final int maxNumComponents;
|
|
|
|
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
private int componentCount;
|
|
|
|
private Component[] components; // resized when needed
|
|
|
|
|
2013-07-08 06:32:33 +02:00
|
|
|
private boolean freed;
|
|
|
|
|
2018-10-30 19:35:39 +01:00
|
|
|
private CompositeByteBuf(ByteBufAllocator alloc, boolean direct, int maxNumComponents, int initSize) {
|
2017-01-26 12:15:10 +01:00
|
|
|
super(AbstractByteBufAllocator.DEFAULT_MAX_CAPACITY);
|
2013-07-08 06:32:33 +02:00
|
|
|
if (alloc == null) {
|
|
|
|
throw new NullPointerException("alloc");
|
|
|
|
}
|
2018-12-05 08:42:23 +01:00
|
|
|
if (maxNumComponents < 1) {
|
2018-10-30 19:35:39 +01:00
|
|
|
throw new IllegalArgumentException(
|
2018-12-05 08:42:23 +01:00
|
|
|
"maxNumComponents: " + maxNumComponents + " (expected: >= 1)");
|
2018-10-30 19:35:39 +01:00
|
|
|
}
|
2013-07-08 06:32:33 +02:00
|
|
|
this.alloc = alloc;
|
|
|
|
this.direct = direct;
|
|
|
|
this.maxNumComponents = maxNumComponents;
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
components = newCompArray(initSize, maxNumComponents);
|
2018-10-30 19:35:39 +01:00
|
|
|
}
|
|
|
|
|
|
|
|
public CompositeByteBuf(ByteBufAllocator alloc, boolean direct, int maxNumComponents) {
|
|
|
|
this(alloc, direct, maxNumComponents, 0);
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
public CompositeByteBuf(ByteBufAllocator alloc, boolean direct, int maxNumComponents, ByteBuf... buffers) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
this(alloc, direct, maxNumComponents, buffers, 0);
|
2016-07-29 07:51:16 +02:00
|
|
|
}
|
|
|
|
|
2018-10-30 19:35:39 +01:00
|
|
|
CompositeByteBuf(ByteBufAllocator alloc, boolean direct, int maxNumComponents,
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
ByteBuf[] buffers, int offset) {
|
|
|
|
this(alloc, direct, maxNumComponents, buffers.length - offset);
|
2013-07-08 06:32:33 +02:00
|
|
|
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
addComponents0(false, 0, buffers, offset);
|
2013-07-08 06:32:33 +02:00
|
|
|
consolidateIfNeeded();
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
setIndex0(0, capacity());
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
public CompositeByteBuf(
|
|
|
|
ByteBufAllocator alloc, boolean direct, int maxNumComponents, Iterable<ByteBuf> buffers) {
|
2018-10-30 19:35:39 +01:00
|
|
|
this(alloc, direct, maxNumComponents,
|
|
|
|
buffers instanceof Collection ? ((Collection<ByteBuf>) buffers).size() : 0);
|
|
|
|
|
2019-02-01 10:31:53 +01:00
|
|
|
addComponents(false, 0, buffers);
|
2018-10-30 19:35:39 +01:00
|
|
|
setIndex(0, capacity());
|
|
|
|
}
|
|
|
|
|
|
|
|
// support passing arrays of other types instead of having to copy to a ByteBuf[] first
|
|
|
|
interface ByteWrapper<T> {
|
|
|
|
ByteBuf wrap(T bytes);
|
|
|
|
boolean isEmpty(T bytes);
|
|
|
|
}
|
|
|
|
|
|
|
|
static final ByteWrapper<byte[]> BYTE_ARRAY_WRAPPER = new ByteWrapper<byte[]>() {
|
|
|
|
@Override
|
|
|
|
public ByteBuf wrap(byte[] bytes) {
|
|
|
|
return Unpooled.wrappedBuffer(bytes);
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
2018-10-30 19:35:39 +01:00
|
|
|
@Override
|
|
|
|
public boolean isEmpty(byte[] bytes) {
|
|
|
|
return bytes.length == 0;
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
2018-10-30 19:35:39 +01:00
|
|
|
};
|
2013-07-08 06:32:33 +02:00
|
|
|
|
2018-10-30 19:35:39 +01:00
|
|
|
static final ByteWrapper<ByteBuffer> BYTE_BUFFER_WRAPPER = new ByteWrapper<ByteBuffer>() {
|
|
|
|
@Override
|
|
|
|
public ByteBuf wrap(ByteBuffer bytes) {
|
|
|
|
return Unpooled.wrappedBuffer(bytes);
|
|
|
|
}
|
|
|
|
@Override
|
|
|
|
public boolean isEmpty(ByteBuffer bytes) {
|
|
|
|
return !bytes.hasRemaining();
|
|
|
|
}
|
|
|
|
};
|
2018-06-20 09:13:09 +02:00
|
|
|
|
2018-10-30 19:35:39 +01:00
|
|
|
<T> CompositeByteBuf(ByteBufAllocator alloc, boolean direct, int maxNumComponents,
|
|
|
|
ByteWrapper<T> wrapper, T[] buffers, int offset) {
|
|
|
|
this(alloc, direct, maxNumComponents, buffers.length - offset);
|
2016-01-12 15:06:30 +01:00
|
|
|
|
2018-10-30 19:35:39 +01:00
|
|
|
addComponents0(false, 0, wrapper, buffers, offset);
|
2013-07-08 06:32:33 +02:00
|
|
|
consolidateIfNeeded();
|
|
|
|
setIndex(0, capacity());
|
2016-01-12 15:06:30 +01:00
|
|
|
}
|
|
|
|
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
private static Component[] newCompArray(int initComponents, int maxNumComponents) {
|
2018-06-20 09:13:09 +02:00
|
|
|
int capacityGuess = Math.min(AbstractByteBufAllocator.DEFAULT_MAX_COMPONENTS, maxNumComponents);
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
return new Component[Math.max(initComponents, capacityGuess)];
|
2016-01-12 15:06:30 +01:00
|
|
|
}
|
|
|
|
|
|
|
|
// Special constructor used by WrappedCompositeByteBuf
|
|
|
|
CompositeByteBuf(ByteBufAllocator alloc) {
|
|
|
|
super(Integer.MAX_VALUE);
|
|
|
|
this.alloc = alloc;
|
|
|
|
direct = false;
|
|
|
|
maxNumComponents = 0;
|
2017-12-12 09:06:18 +01:00
|
|
|
components = null;
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
2008-08-08 02:37:18 +02:00
|
|
|
|
2012-12-21 22:22:40 +01:00
|
|
|
/**
|
|
|
|
* Add the given {@link ByteBuf}.
|
2016-01-26 22:44:20 +01:00
|
|
|
* <p>
|
2013-03-25 10:58:50 +01:00
|
|
|
* Be aware that this method does not increase the {@code writerIndex} of the {@link CompositeByteBuf}.
|
2016-05-17 14:19:33 +02:00
|
|
|
* If you need to have it increased use {@link #addComponent(boolean, ByteBuf)}.
|
2016-01-26 22:44:20 +01:00
|
|
|
* <p>
|
2019-01-14 07:24:34 +01:00
|
|
|
* {@link ByteBuf#release()} ownership of {@code buffer} is transferred to this {@link CompositeByteBuf}.
|
|
|
|
* @param buffer the {@link ByteBuf} to add. {@link ByteBuf#release()} ownership is transferred to this
|
2016-01-26 22:44:20 +01:00
|
|
|
* {@link CompositeByteBuf}.
|
2012-12-21 22:22:40 +01:00
|
|
|
*/
|
2013-07-08 06:32:33 +02:00
|
|
|
public CompositeByteBuf addComponent(ByteBuf buffer) {
|
2016-05-17 14:19:33 +02:00
|
|
|
return addComponent(false, buffer);
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
2012-12-21 22:22:40 +01:00
|
|
|
|
|
|
|
/**
|
2013-07-08 06:32:33 +02:00
|
|
|
* Add the given {@link ByteBuf}s.
|
2016-01-26 22:44:20 +01:00
|
|
|
* <p>
|
2013-03-25 10:58:50 +01:00
|
|
|
* Be aware that this method does not increase the {@code writerIndex} of the {@link CompositeByteBuf}.
|
2016-05-17 14:19:33 +02:00
|
|
|
* If you need to have it increased use {@link #addComponents(boolean, ByteBuf[])}.
|
2016-01-26 22:44:20 +01:00
|
|
|
* <p>
|
2019-01-14 07:24:34 +01:00
|
|
|
* {@link ByteBuf#release()} ownership of all {@link ByteBuf} objects in {@code buffers} is transferred to this
|
2016-01-26 22:44:20 +01:00
|
|
|
* {@link CompositeByteBuf}.
|
|
|
|
* @param buffers the {@link ByteBuf}s to add. {@link ByteBuf#release()} ownership of all {@link ByteBuf#release()}
|
2019-01-14 07:24:34 +01:00
|
|
|
* ownership of all {@link ByteBuf} objects is transferred to this {@link CompositeByteBuf}.
|
2012-12-21 22:22:40 +01:00
|
|
|
*/
|
2013-07-08 06:32:33 +02:00
|
|
|
public CompositeByteBuf addComponents(ByteBuf... buffers) {
|
2016-05-17 14:19:33 +02:00
|
|
|
return addComponents(false, buffers);
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
2012-07-20 06:18:17 +02:00
|
|
|
|
2012-12-21 22:22:40 +01:00
|
|
|
/**
|
|
|
|
* Add the given {@link ByteBuf}s.
|
2016-01-26 22:44:20 +01:00
|
|
|
* <p>
|
2013-03-25 10:58:50 +01:00
|
|
|
* Be aware that this method does not increase the {@code writerIndex} of the {@link CompositeByteBuf}.
|
2016-05-17 14:19:33 +02:00
|
|
|
* If you need to have it increased use {@link #addComponents(boolean, Iterable)}.
|
2016-01-26 22:44:20 +01:00
|
|
|
* <p>
|
2019-01-14 07:24:34 +01:00
|
|
|
* {@link ByteBuf#release()} ownership of all {@link ByteBuf} objects in {@code buffers} is transferred to this
|
2016-01-26 22:44:20 +01:00
|
|
|
* {@link CompositeByteBuf}.
|
|
|
|
* @param buffers the {@link ByteBuf}s to add. {@link ByteBuf#release()} ownership of all {@link ByteBuf#release()}
|
2019-01-14 07:24:34 +01:00
|
|
|
* ownership of all {@link ByteBuf} objects is transferred to this {@link CompositeByteBuf}.
|
2012-12-21 22:22:40 +01:00
|
|
|
*/
|
2013-07-08 06:32:33 +02:00
|
|
|
public CompositeByteBuf addComponents(Iterable<ByteBuf> buffers) {
|
2016-05-17 14:19:33 +02:00
|
|
|
return addComponents(false, buffers);
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
2012-12-21 22:22:40 +01:00
|
|
|
|
|
|
|
/**
|
2013-07-08 06:32:33 +02:00
|
|
|
* Add the given {@link ByteBuf} on the specific index.
|
2016-01-26 22:44:20 +01:00
|
|
|
* <p>
|
2013-03-25 10:58:50 +01:00
|
|
|
* Be aware that this method does not increase the {@code writerIndex} of the {@link CompositeByteBuf}.
|
2016-05-17 14:19:33 +02:00
|
|
|
* If you need to have it increased use {@link #addComponent(boolean, int, ByteBuf)}.
|
2016-01-26 22:44:20 +01:00
|
|
|
* <p>
|
2019-01-14 07:24:34 +01:00
|
|
|
* {@link ByteBuf#release()} ownership of {@code buffer} is transferred to this {@link CompositeByteBuf}.
|
2016-01-26 22:44:20 +01:00
|
|
|
* @param cIndex the index on which the {@link ByteBuf} will be added.
|
2019-01-14 07:24:34 +01:00
|
|
|
* @param buffer the {@link ByteBuf} to add. {@link ByteBuf#release()} ownership is transferred to this
|
2016-01-26 22:44:20 +01:00
|
|
|
* {@link CompositeByteBuf}.
|
2012-12-21 22:22:40 +01:00
|
|
|
*/
|
2013-07-08 06:32:33 +02:00
|
|
|
public CompositeByteBuf addComponent(int cIndex, ByteBuf buffer) {
|
2016-05-17 14:19:33 +02:00
|
|
|
return addComponent(false, cIndex, buffer);
|
|
|
|
}
|
|
|
|
|
|
|
|
/**
|
|
|
|
* Add the given {@link ByteBuf} and increase the {@code writerIndex} if {@code increaseWriterIndex} is
|
|
|
|
* {@code true}.
|
|
|
|
*
|
2019-01-14 07:24:34 +01:00
|
|
|
* {@link ByteBuf#release()} ownership of {@code buffer} is transferred to this {@link CompositeByteBuf}.
|
|
|
|
* @param buffer the {@link ByteBuf} to add. {@link ByteBuf#release()} ownership is transferred to this
|
2016-05-17 14:19:33 +02:00
|
|
|
* {@link CompositeByteBuf}.
|
|
|
|
*/
|
|
|
|
public CompositeByteBuf addComponent(boolean increaseWriterIndex, ByteBuf buffer) {
|
2019-02-01 10:31:53 +01:00
|
|
|
return addComponent(increaseWriterIndex, componentCount, buffer);
|
2016-05-17 14:19:33 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
/**
|
|
|
|
* Add the given {@link ByteBuf}s and increase the {@code writerIndex} if {@code increaseWriterIndex} is
|
|
|
|
* {@code true}.
|
|
|
|
*
|
2019-01-14 07:24:34 +01:00
|
|
|
* {@link ByteBuf#release()} ownership of all {@link ByteBuf} objects in {@code buffers} is transferred to this
|
2016-05-17 14:19:33 +02:00
|
|
|
* {@link CompositeByteBuf}.
|
|
|
|
* @param buffers the {@link ByteBuf}s to add. {@link ByteBuf#release()} ownership of all {@link ByteBuf#release()}
|
2019-01-14 07:24:34 +01:00
|
|
|
* ownership of all {@link ByteBuf} objects is transferred to this {@link CompositeByteBuf}.
|
2016-05-17 14:19:33 +02:00
|
|
|
*/
|
|
|
|
public CompositeByteBuf addComponents(boolean increaseWriterIndex, ByteBuf... buffers) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
checkNotNull(buffers, "buffers");
|
|
|
|
addComponents0(increaseWriterIndex, componentCount, buffers, 0);
|
2016-05-17 14:19:33 +02:00
|
|
|
consolidateIfNeeded();
|
|
|
|
return this;
|
|
|
|
}
|
|
|
|
|
|
|
|
/**
|
|
|
|
* Add the given {@link ByteBuf}s and increase the {@code writerIndex} if {@code increaseWriterIndex} is
|
|
|
|
* {@code true}.
|
|
|
|
*
|
2019-01-14 07:24:34 +01:00
|
|
|
* {@link ByteBuf#release()} ownership of all {@link ByteBuf} objects in {@code buffers} is transferred to this
|
2016-05-17 14:19:33 +02:00
|
|
|
* {@link CompositeByteBuf}.
|
|
|
|
* @param buffers the {@link ByteBuf}s to add. {@link ByteBuf#release()} ownership of all {@link ByteBuf#release()}
|
2019-01-14 07:24:34 +01:00
|
|
|
* ownership of all {@link ByteBuf} objects is transferred to this {@link CompositeByteBuf}.
|
2016-05-17 14:19:33 +02:00
|
|
|
*/
|
|
|
|
public CompositeByteBuf addComponents(boolean increaseWriterIndex, Iterable<ByteBuf> buffers) {
|
2019-02-01 10:31:53 +01:00
|
|
|
return addComponents(increaseWriterIndex, componentCount, buffers);
|
2016-05-17 14:19:33 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
/**
|
|
|
|
* Add the given {@link ByteBuf} on the specific index and increase the {@code writerIndex}
|
|
|
|
* if {@code increaseWriterIndex} is {@code true}.
|
|
|
|
*
|
2019-01-14 07:24:34 +01:00
|
|
|
* {@link ByteBuf#release()} ownership of {@code buffer} is transferred to this {@link CompositeByteBuf}.
|
2016-05-17 14:19:33 +02:00
|
|
|
* @param cIndex the index on which the {@link ByteBuf} will be added.
|
2019-01-14 07:24:34 +01:00
|
|
|
* @param buffer the {@link ByteBuf} to add. {@link ByteBuf#release()} ownership is transferred to this
|
2016-05-17 14:19:33 +02:00
|
|
|
* {@link CompositeByteBuf}.
|
|
|
|
*/
|
|
|
|
public CompositeByteBuf addComponent(boolean increaseWriterIndex, int cIndex, ByteBuf buffer) {
|
2016-01-26 22:44:20 +01:00
|
|
|
checkNotNull(buffer, "buffer");
|
2016-05-17 14:19:33 +02:00
|
|
|
addComponent0(increaseWriterIndex, cIndex, buffer);
|
2013-07-08 06:32:33 +02:00
|
|
|
consolidateIfNeeded();
|
|
|
|
return this;
|
|
|
|
}
|
|
|
|
|
2016-01-26 22:44:20 +01:00
|
|
|
/**
|
|
|
|
* Precondition is that {@code buffer != null}.
|
|
|
|
*/
|
2016-05-17 14:19:33 +02:00
|
|
|
private int addComponent0(boolean increaseWriterIndex, int cIndex, ByteBuf buffer) {
|
2016-01-26 22:44:20 +01:00
|
|
|
assert buffer != null;
|
|
|
|
boolean wasAdded = false;
|
|
|
|
try {
|
|
|
|
checkComponentIndex(cIndex);
|
2013-07-08 06:32:33 +02:00
|
|
|
|
2016-01-26 22:44:20 +01:00
|
|
|
// No need to consolidate - just add a component to the list.
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
Component c = newComponent(buffer, 0);
|
|
|
|
int readableBytes = c.length();
|
|
|
|
|
|
|
|
addComp(cIndex, c);
|
2018-10-28 10:28:18 +01:00
|
|
|
wasAdded = true;
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
if (readableBytes > 0 && cIndex < componentCount - 1) {
|
2018-10-28 10:28:18 +01:00
|
|
|
updateComponentOffsets(cIndex);
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
} else if (cIndex > 0) {
|
|
|
|
c.reposition(components[cIndex - 1].endOffset);
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
2016-05-17 14:19:33 +02:00
|
|
|
if (increaseWriterIndex) {
|
2019-04-08 20:48:08 +02:00
|
|
|
writerIndex += readableBytes;
|
2016-05-17 14:19:33 +02:00
|
|
|
}
|
2016-01-26 22:44:20 +01:00
|
|
|
return cIndex;
|
|
|
|
} finally {
|
|
|
|
if (!wasAdded) {
|
|
|
|
buffer.release();
|
2014-12-11 21:13:03 +01:00
|
|
|
}
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
|
|
|
}
|
2012-12-21 22:22:40 +01:00
|
|
|
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
@SuppressWarnings("deprecation")
|
|
|
|
private Component newComponent(ByteBuf buf, int offset) {
|
2019-02-28 20:40:42 +01:00
|
|
|
if (checkAccessible && !buf.isAccessible()) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
throw new IllegalReferenceCountException(0);
|
|
|
|
}
|
|
|
|
int srcIndex = buf.readerIndex(), len = buf.readableBytes();
|
|
|
|
ByteBuf slice = null;
|
2019-02-01 10:31:53 +01:00
|
|
|
// unwrap if already sliced
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
if (buf instanceof AbstractUnpooledSlicedByteBuf) {
|
|
|
|
srcIndex += ((AbstractUnpooledSlicedByteBuf) buf).idx(0);
|
|
|
|
slice = buf;
|
|
|
|
buf = buf.unwrap();
|
|
|
|
} else if (buf instanceof PooledSlicedByteBuf) {
|
|
|
|
srcIndex += ((PooledSlicedByteBuf) buf).adjustment;
|
|
|
|
slice = buf;
|
|
|
|
buf = buf.unwrap();
|
|
|
|
}
|
|
|
|
return new Component(buf.order(ByteOrder.BIG_ENDIAN), srcIndex, offset, len, slice);
|
|
|
|
}
|
|
|
|
|
2012-12-21 22:22:40 +01:00
|
|
|
/**
|
|
|
|
* Add the given {@link ByteBuf}s on the specific index
|
2016-01-26 22:44:20 +01:00
|
|
|
* <p>
|
2013-03-25 10:58:50 +01:00
|
|
|
* Be aware that this method does not increase the {@code writerIndex} of the {@link CompositeByteBuf}.
|
|
|
|
* If you need to have it increased you need to handle it by your own.
|
2016-01-26 22:44:20 +01:00
|
|
|
* <p>
|
2019-01-14 07:24:34 +01:00
|
|
|
* {@link ByteBuf#release()} ownership of all {@link ByteBuf} objects in {@code buffers} is transferred to this
|
2016-01-26 22:44:20 +01:00
|
|
|
* {@link CompositeByteBuf}.
|
|
|
|
* @param cIndex the index on which the {@link ByteBuf} will be added. {@link ByteBuf#release()} ownership of all
|
2019-01-14 07:24:34 +01:00
|
|
|
* {@link ByteBuf#release()} ownership of all {@link ByteBuf} objects is transferred to this
|
2016-01-26 22:44:20 +01:00
|
|
|
* {@link CompositeByteBuf}.
|
|
|
|
* @param buffers the {@link ByteBuf}s to add. {@link ByteBuf#release()} ownership of all {@link ByteBuf#release()}
|
2019-01-14 07:24:34 +01:00
|
|
|
* ownership of all {@link ByteBuf} objects is transferred to this {@link CompositeByteBuf}.
|
2012-12-21 22:22:40 +01:00
|
|
|
*/
|
2013-07-08 06:32:33 +02:00
|
|
|
public CompositeByteBuf addComponents(int cIndex, ByteBuf... buffers) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
checkNotNull(buffers, "buffers");
|
|
|
|
addComponents0(false, cIndex, buffers, 0);
|
2013-07-08 06:32:33 +02:00
|
|
|
consolidateIfNeeded();
|
|
|
|
return this;
|
|
|
|
}
|
|
|
|
|
2019-02-01 10:31:53 +01:00
|
|
|
private CompositeByteBuf addComponents0(boolean increaseWriterIndex,
|
|
|
|
final int cIndex, ByteBuf[] buffers, int arrOffset) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
final int len = buffers.length, count = len - arrOffset;
|
2019-02-01 10:31:53 +01:00
|
|
|
// only set ci after we've shifted so that finally block logic is always correct
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
int ci = Integer.MAX_VALUE;
|
2016-01-26 22:44:20 +01:00
|
|
|
try {
|
|
|
|
checkComponentIndex(cIndex);
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
shiftComps(cIndex, count); // will increase componentCount
|
|
|
|
int nextOffset = cIndex > 0 ? components[cIndex - 1].endOffset : 0;
|
2019-02-01 10:31:53 +01:00
|
|
|
for (ci = cIndex; arrOffset < len; arrOffset++, ci++) {
|
|
|
|
ByteBuf b = buffers[arrOffset];
|
|
|
|
if (b == null) {
|
|
|
|
break;
|
|
|
|
}
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
Component c = newComponent(b, nextOffset);
|
|
|
|
components[ci] = c;
|
|
|
|
nextOffset = c.endOffset;
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
2019-02-01 10:31:53 +01:00
|
|
|
return this;
|
2016-01-26 22:44:20 +01:00
|
|
|
} finally {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
// ci is now the index following the last successfully added component
|
|
|
|
if (ci < componentCount) {
|
|
|
|
if (ci < cIndex + count) {
|
|
|
|
// we bailed early
|
|
|
|
removeCompRange(ci, cIndex + count);
|
|
|
|
for (; arrOffset < len; ++arrOffset) {
|
|
|
|
ReferenceCountUtil.safeRelease(buffers[arrOffset]);
|
2016-01-26 22:44:20 +01:00
|
|
|
}
|
|
|
|
}
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
updateComponentOffsets(ci); // only need to do this here for components after the added ones
|
|
|
|
}
|
|
|
|
if (increaseWriterIndex && ci > cIndex && ci <= componentCount) {
|
2019-04-08 20:48:08 +02:00
|
|
|
writerIndex += components[ci - 1].endOffset - components[cIndex].offset;
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
2012-12-21 22:22:40 +01:00
|
|
|
|
2018-10-30 19:35:39 +01:00
|
|
|
private <T> int addComponents0(boolean increaseWriterIndex, int cIndex,
|
|
|
|
ByteWrapper<T> wrapper, T[] buffers, int offset) {
|
|
|
|
checkComponentIndex(cIndex);
|
|
|
|
|
|
|
|
// No need for consolidation
|
|
|
|
for (int i = offset, len = buffers.length; i < len; i++) {
|
|
|
|
T b = buffers[i];
|
|
|
|
if (b == null) {
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
if (!wrapper.isEmpty(b)) {
|
|
|
|
cIndex = addComponent0(increaseWriterIndex, cIndex, wrapper.wrap(b)) + 1;
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
int size = componentCount;
|
2018-10-30 19:35:39 +01:00
|
|
|
if (cIndex > size) {
|
|
|
|
cIndex = size;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
return cIndex;
|
|
|
|
}
|
|
|
|
|
2012-12-21 22:22:40 +01:00
|
|
|
/**
|
|
|
|
* Add the given {@link ByteBuf}s on the specific index
|
|
|
|
*
|
2013-03-25 10:58:50 +01:00
|
|
|
* Be aware that this method does not increase the {@code writerIndex} of the {@link CompositeByteBuf}.
|
|
|
|
* If you need to have it increased you need to handle it by your own.
|
2016-01-26 22:44:20 +01:00
|
|
|
* <p>
|
2019-01-14 07:24:34 +01:00
|
|
|
* {@link ByteBuf#release()} ownership of all {@link ByteBuf} objects in {@code buffers} is transferred to this
|
2016-01-26 22:44:20 +01:00
|
|
|
* {@link CompositeByteBuf}.
|
2013-07-08 06:32:33 +02:00
|
|
|
* @param cIndex the index on which the {@link ByteBuf} will be added.
|
2016-01-26 22:44:20 +01:00
|
|
|
* @param buffers the {@link ByteBuf}s to add. {@link ByteBuf#release()} ownership of all
|
2019-01-14 07:24:34 +01:00
|
|
|
* {@link ByteBuf#release()} ownership of all {@link ByteBuf} objects is transferred to this
|
2016-01-26 22:44:20 +01:00
|
|
|
* {@link CompositeByteBuf}.
|
2013-07-08 06:32:33 +02:00
|
|
|
*/
|
|
|
|
public CompositeByteBuf addComponents(int cIndex, Iterable<ByteBuf> buffers) {
|
2019-02-01 10:31:53 +01:00
|
|
|
return addComponents(false, cIndex, buffers);
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
|
|
|
|
2019-04-08 20:48:08 +02:00
|
|
|
/**
|
|
|
|
* Add the given {@link ByteBuf} and increase the {@code writerIndex} if {@code increaseWriterIndex} is
|
|
|
|
* {@code true}. If the provided buffer is a {@link CompositeByteBuf} itself, a "shallow copy" of its
|
|
|
|
* readable components will be performed. Thus the actual number of new components added may vary
|
|
|
|
* and in particular will be zero if the provided buffer is not readable.
|
|
|
|
* <p>
|
|
|
|
* {@link ByteBuf#release()} ownership of {@code buffer} is transferred to this {@link CompositeByteBuf}.
|
|
|
|
* @param buffer the {@link ByteBuf} to add. {@link ByteBuf#release()} ownership is transferred to this
|
|
|
|
* {@link CompositeByteBuf}.
|
|
|
|
*/
|
|
|
|
public CompositeByteBuf addFlattenedComponents(boolean increaseWriterIndex, ByteBuf buffer) {
|
|
|
|
checkNotNull(buffer, "buffer");
|
|
|
|
final int ridx = buffer.readerIndex();
|
|
|
|
final int widx = buffer.writerIndex();
|
|
|
|
if (ridx == widx) {
|
|
|
|
buffer.release();
|
|
|
|
return this;
|
|
|
|
}
|
|
|
|
if (!(buffer instanceof CompositeByteBuf)) {
|
|
|
|
addComponent0(increaseWriterIndex, componentCount, buffer);
|
|
|
|
consolidateIfNeeded();
|
|
|
|
return this;
|
|
|
|
}
|
|
|
|
final CompositeByteBuf from = (CompositeByteBuf) buffer;
|
|
|
|
from.checkIndex(ridx, widx - ridx);
|
|
|
|
final Component[] fromComponents = from.components;
|
|
|
|
final int compCountBefore = componentCount;
|
|
|
|
final int writerIndexBefore = writerIndex;
|
|
|
|
try {
|
|
|
|
for (int cidx = from.toComponentIndex0(ridx), newOffset = capacity();; cidx++) {
|
|
|
|
final Component component = fromComponents[cidx];
|
|
|
|
final int compOffset = component.offset;
|
|
|
|
final int fromIdx = Math.max(ridx, compOffset);
|
|
|
|
final int toIdx = Math.min(widx, component.endOffset);
|
|
|
|
final int len = toIdx - fromIdx;
|
|
|
|
if (len > 0) { // skip empty components
|
|
|
|
// Note that it's safe to just retain the unwrapped buf here, even in the case
|
|
|
|
// of PooledSlicedByteBufs - those slices will still be properly released by the
|
|
|
|
// source Component's free() method.
|
|
|
|
addComp(componentCount, new Component(
|
|
|
|
component.buf.retain(), component.idx(fromIdx), newOffset, len, null));
|
|
|
|
}
|
|
|
|
if (widx == toIdx) {
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
newOffset += len;
|
|
|
|
}
|
|
|
|
if (increaseWriterIndex) {
|
|
|
|
writerIndex = writerIndexBefore + (widx - ridx);
|
|
|
|
}
|
|
|
|
consolidateIfNeeded();
|
|
|
|
buffer.release();
|
|
|
|
buffer = null;
|
|
|
|
return this;
|
|
|
|
} finally {
|
|
|
|
if (buffer != null) {
|
|
|
|
// if we did not succeed, attempt to rollback any components that were added
|
|
|
|
if (increaseWriterIndex) {
|
|
|
|
writerIndex = writerIndexBefore;
|
|
|
|
}
|
|
|
|
for (int cidx = componentCount - 1; cidx >= compCountBefore; cidx--) {
|
|
|
|
components[cidx].free();
|
|
|
|
removeComp(cidx);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
// TODO optimize further, similar to ByteBuf[] version
|
|
|
|
// (difference here is that we don't know *always* know precise size increase in advance,
|
|
|
|
// but we do in the most common case that the Iterable is a Collection)
|
2019-02-01 10:31:53 +01:00
|
|
|
private CompositeByteBuf addComponents(boolean increaseIndex, int cIndex, Iterable<ByteBuf> buffers) {
|
2013-07-08 06:32:33 +02:00
|
|
|
if (buffers instanceof ByteBuf) {
|
|
|
|
// If buffers also implements ByteBuf (e.g. CompositeByteBuf), it has to go to addComponent(ByteBuf).
|
2019-02-01 10:31:53 +01:00
|
|
|
return addComponent(increaseIndex, cIndex, (ByteBuf) buffers);
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
2016-01-26 22:44:20 +01:00
|
|
|
checkNotNull(buffers, "buffers");
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
Iterator<ByteBuf> it = buffers.iterator();
|
|
|
|
try {
|
|
|
|
checkComponentIndex(cIndex);
|
2013-07-08 06:32:33 +02:00
|
|
|
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
// No need for consolidation
|
|
|
|
while (it.hasNext()) {
|
|
|
|
ByteBuf b = it.next();
|
|
|
|
if (b == null) {
|
|
|
|
break;
|
2016-01-26 22:44:20 +01:00
|
|
|
}
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
cIndex = addComponent0(increaseIndex, cIndex, b) + 1;
|
|
|
|
cIndex = Math.min(cIndex, componentCount);
|
|
|
|
}
|
|
|
|
} finally {
|
|
|
|
while (it.hasNext()) {
|
|
|
|
ReferenceCountUtil.safeRelease(it.next());
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
|
|
|
}
|
2019-02-01 10:31:53 +01:00
|
|
|
consolidateIfNeeded();
|
|
|
|
return this;
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
/**
|
|
|
|
* This should only be called as last operation from a method as this may adjust the underlying
|
|
|
|
* array of components and so affect the index etc.
|
2012-12-21 22:22:40 +01:00
|
|
|
*/
|
2013-07-08 06:32:33 +02:00
|
|
|
private void consolidateIfNeeded() {
|
|
|
|
// Consolidate if the number of components will exceed the allowed maximum by the current
|
|
|
|
// operation.
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
int size = componentCount;
|
|
|
|
if (size > maxNumComponents) {
|
|
|
|
final int capacity = components[size - 1].endOffset;
|
2013-07-08 06:32:33 +02:00
|
|
|
|
|
|
|
ByteBuf consolidated = allocBuffer(capacity);
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
lastAccessed = null;
|
2013-07-08 06:32:33 +02:00
|
|
|
|
|
|
|
// We're not using foreach to avoid creating an iterator.
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
for (int i = 0; i < size; i ++) {
|
|
|
|
components[i].transferTo(consolidated);
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
|
|
|
|
components[0] = new Component(consolidated, 0, 0, capacity, consolidated);
|
|
|
|
removeCompRange(1, size);
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
private void checkComponentIndex(int cIndex) {
|
2014-06-30 07:10:12 +02:00
|
|
|
ensureAccessible();
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
if (cIndex < 0 || cIndex > componentCount) {
|
2013-07-08 06:32:33 +02:00
|
|
|
throw new IndexOutOfBoundsException(String.format(
|
|
|
|
"cIndex: %d (expected: >= 0 && <= numComponents(%d))",
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
cIndex, componentCount));
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
private void checkComponentIndex(int cIndex, int numComponents) {
|
2014-06-30 07:10:12 +02:00
|
|
|
ensureAccessible();
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
if (cIndex < 0 || cIndex + numComponents > componentCount) {
|
2013-07-08 06:32:33 +02:00
|
|
|
throw new IndexOutOfBoundsException(String.format(
|
|
|
|
"cIndex: %d, numComponents: %d " +
|
|
|
|
"(expected: cIndex >= 0 && cIndex + numComponents <= totalNumComponents(%d))",
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
cIndex, numComponents, componentCount));
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
private void updateComponentOffsets(int cIndex) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
int size = componentCount;
|
2013-09-09 20:29:30 +02:00
|
|
|
if (size <= cIndex) {
|
2013-07-08 06:32:33 +02:00
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
2019-01-24 12:47:04 +01:00
|
|
|
int nextIndex = cIndex > 0 ? components[cIndex - 1].endOffset : 0;
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
for (; cIndex < size; cIndex++) {
|
|
|
|
Component c = components[cIndex];
|
|
|
|
c.reposition(nextIndex);
|
|
|
|
nextIndex = c.endOffset;
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
|
|
|
}
|
2012-07-20 06:18:17 +02:00
|
|
|
|
2012-12-21 22:22:40 +01:00
|
|
|
/**
|
|
|
|
* Remove the {@link ByteBuf} from the given index.
|
|
|
|
*
|
2013-07-08 06:32:33 +02:00
|
|
|
* @param cIndex the index on from which the {@link ByteBuf} will be remove
|
2012-12-21 22:22:40 +01:00
|
|
|
*/
|
2013-07-08 06:32:33 +02:00
|
|
|
public CompositeByteBuf removeComponent(int cIndex) {
|
|
|
|
checkComponentIndex(cIndex);
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
Component comp = components[cIndex];
|
|
|
|
if (lastAccessed == comp) {
|
|
|
|
lastAccessed = null;
|
|
|
|
}
|
2019-02-01 10:31:53 +01:00
|
|
|
comp.free();
|
2018-11-13 20:56:09 +01:00
|
|
|
removeComp(cIndex);
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
if (comp.length() > 0) {
|
2014-12-11 21:13:03 +01:00
|
|
|
// Only need to call updateComponentOffsets if the length was > 0
|
|
|
|
updateComponentOffsets(cIndex);
|
|
|
|
}
|
2013-07-08 06:32:33 +02:00
|
|
|
return this;
|
|
|
|
}
|
2012-12-21 22:22:40 +01:00
|
|
|
|
|
|
|
/**
|
|
|
|
* Remove the number of {@link ByteBuf}s starting from the given index.
|
|
|
|
*
|
2013-07-08 06:32:33 +02:00
|
|
|
* @param cIndex the index on which the {@link ByteBuf}s will be started to removed
|
|
|
|
* @param numComponents the number of components to remove
|
2012-12-21 22:22:40 +01:00
|
|
|
*/
|
2013-07-08 06:32:33 +02:00
|
|
|
public CompositeByteBuf removeComponents(int cIndex, int numComponents) {
|
|
|
|
checkComponentIndex(cIndex, numComponents);
|
|
|
|
|
2014-12-11 21:13:03 +01:00
|
|
|
if (numComponents == 0) {
|
|
|
|
return this;
|
|
|
|
}
|
2017-12-12 09:06:18 +01:00
|
|
|
int endIndex = cIndex + numComponents;
|
2014-12-11 21:13:03 +01:00
|
|
|
boolean needsUpdate = false;
|
2017-12-12 09:06:18 +01:00
|
|
|
for (int i = cIndex; i < endIndex; ++i) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
Component c = components[i];
|
|
|
|
if (c.length() > 0) {
|
2014-12-11 21:13:03 +01:00
|
|
|
needsUpdate = true;
|
|
|
|
}
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
if (lastAccessed == c) {
|
|
|
|
lastAccessed = null;
|
|
|
|
}
|
2019-02-01 10:31:53 +01:00
|
|
|
c.free();
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
removeCompRange(cIndex, endIndex);
|
2013-07-08 06:32:33 +02:00
|
|
|
|
2014-12-11 21:13:03 +01:00
|
|
|
if (needsUpdate) {
|
|
|
|
// Only need to call updateComponentOffsets if the length was > 0
|
|
|
|
updateComponentOffsets(cIndex);
|
|
|
|
}
|
2013-07-08 06:32:33 +02:00
|
|
|
return this;
|
|
|
|
}
|
|
|
|
|
2015-04-10 09:43:06 +02:00
|
|
|
@Override
|
2013-07-08 06:32:33 +02:00
|
|
|
public Iterator<ByteBuf> iterator() {
|
2014-06-30 07:10:12 +02:00
|
|
|
ensureAccessible();
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
return componentCount == 0 ? EMPTY_ITERATOR : new CompositeByteBufIterator();
|
|
|
|
}
|
|
|
|
|
|
|
|
@Override
|
|
|
|
protected int forEachByteAsc0(int start, int end, ByteProcessor processor) throws Exception {
|
|
|
|
if (end <= start) {
|
|
|
|
return -1;
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
for (int i = toComponentIndex0(start), length = end - start; length > 0; i++) {
|
|
|
|
Component c = components[i];
|
|
|
|
if (c.offset == c.endOffset) {
|
|
|
|
continue; // empty
|
|
|
|
}
|
|
|
|
ByteBuf s = c.buf;
|
|
|
|
int localStart = c.idx(start);
|
|
|
|
int localLength = Math.min(length, c.endOffset - start);
|
|
|
|
// avoid additional checks in AbstractByteBuf case
|
|
|
|
int result = s instanceof AbstractByteBuf
|
|
|
|
? ((AbstractByteBuf) s).forEachByteAsc0(localStart, localStart + localLength, processor)
|
|
|
|
: s.forEachByte(localStart, localLength, processor);
|
|
|
|
if (result != -1) {
|
|
|
|
return result - c.adjustment;
|
|
|
|
}
|
|
|
|
start += localLength;
|
|
|
|
length -= localLength;
|
|
|
|
}
|
|
|
|
return -1;
|
|
|
|
}
|
|
|
|
|
|
|
|
@Override
|
|
|
|
protected int forEachByteDesc0(int rStart, int rEnd, ByteProcessor processor) throws Exception {
|
|
|
|
if (rEnd > rStart) { // rStart *and* rEnd are inclusive
|
|
|
|
return -1;
|
|
|
|
}
|
|
|
|
for (int i = toComponentIndex0(rStart), length = 1 + rStart - rEnd; length > 0; i--) {
|
|
|
|
Component c = components[i];
|
|
|
|
if (c.offset == c.endOffset) {
|
|
|
|
continue; // empty
|
|
|
|
}
|
|
|
|
ByteBuf s = c.buf;
|
|
|
|
int localRStart = c.idx(length + rEnd);
|
|
|
|
int localLength = Math.min(length, localRStart), localIndex = localRStart - localLength;
|
|
|
|
// avoid additional checks in AbstractByteBuf case
|
|
|
|
int result = s instanceof AbstractByteBuf
|
|
|
|
? ((AbstractByteBuf) s).forEachByteDesc0(localRStart - 1, localIndex, processor)
|
|
|
|
: s.forEachByteDesc(localIndex, localLength, processor);
|
|
|
|
|
|
|
|
if (result != -1) {
|
|
|
|
return result - c.adjustment;
|
|
|
|
}
|
|
|
|
length -= localLength;
|
|
|
|
}
|
|
|
|
return -1;
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
/**
|
|
|
|
* Same with {@link #slice(int, int)} except that this method returns a list.
|
|
|
|
*/
|
|
|
|
public List<ByteBuf> decompose(int offset, int length) {
|
|
|
|
checkIndex(offset, length);
|
|
|
|
if (length == 0) {
|
|
|
|
return Collections.emptyList();
|
|
|
|
}
|
|
|
|
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
int componentId = toComponentIndex0(offset);
|
2018-10-19 08:05:22 +02:00
|
|
|
int bytesToSlice = length;
|
2013-07-08 06:32:33 +02:00
|
|
|
// The first component
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
Component firstC = components[componentId];
|
2013-07-08 06:32:33 +02:00
|
|
|
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
ByteBuf slice = firstC.buf.slice(firstC.idx(offset), Math.min(firstC.endOffset - offset, bytesToSlice));
|
2018-10-19 08:05:22 +02:00
|
|
|
bytesToSlice -= slice.readableBytes();
|
2013-07-08 06:32:33 +02:00
|
|
|
|
2018-10-19 08:05:22 +02:00
|
|
|
if (bytesToSlice == 0) {
|
|
|
|
return Collections.singletonList(slice);
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
|
|
|
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
List<ByteBuf> sliceList = new ArrayList<ByteBuf>(componentCount - componentId);
|
2018-10-19 08:05:22 +02:00
|
|
|
sliceList.add(slice);
|
|
|
|
|
|
|
|
// Add all the slices until there is nothing more left and then return the List.
|
|
|
|
do {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
Component component = components[++componentId];
|
|
|
|
slice = component.buf.slice(component.idx(component.offset), Math.min(component.length(), bytesToSlice));
|
2018-10-19 08:05:22 +02:00
|
|
|
bytesToSlice -= slice.readableBytes();
|
|
|
|
sliceList.add(slice);
|
|
|
|
} while (bytesToSlice > 0);
|
|
|
|
|
|
|
|
return sliceList;
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
@Override
|
|
|
|
public boolean isDirect() {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
int size = componentCount;
|
2013-09-26 07:04:53 +02:00
|
|
|
if (size == 0) {
|
|
|
|
return false;
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
2013-09-26 07:04:53 +02:00
|
|
|
for (int i = 0; i < size; i++) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
if (!components[i].buf.isDirect()) {
|
2013-09-26 07:04:53 +02:00
|
|
|
return false;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
return true;
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
@Override
|
|
|
|
public boolean hasArray() {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
switch (componentCount) {
|
2014-12-30 07:52:57 +01:00
|
|
|
case 0:
|
|
|
|
return true;
|
|
|
|
case 1:
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
return components[0].buf.hasArray();
|
2014-12-30 07:52:57 +01:00
|
|
|
default:
|
|
|
|
return false;
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
@Override
|
|
|
|
public byte[] array() {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
switch (componentCount) {
|
2014-12-30 07:52:57 +01:00
|
|
|
case 0:
|
|
|
|
return EmptyArrays.EMPTY_BYTES;
|
|
|
|
case 1:
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
return components[0].buf.array();
|
2014-12-30 07:52:57 +01:00
|
|
|
default:
|
|
|
|
throw new UnsupportedOperationException();
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
@Override
|
|
|
|
public int arrayOffset() {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
switch (componentCount) {
|
2014-12-30 07:52:57 +01:00
|
|
|
case 0:
|
|
|
|
return 0;
|
|
|
|
case 1:
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
Component c = components[0];
|
|
|
|
return c.idx(c.buf.arrayOffset());
|
2014-12-30 07:52:57 +01:00
|
|
|
default:
|
|
|
|
throw new UnsupportedOperationException();
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
@Override
|
|
|
|
public boolean hasMemoryAddress() {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
switch (componentCount) {
|
2014-12-30 07:52:57 +01:00
|
|
|
case 0:
|
|
|
|
return Unpooled.EMPTY_BUFFER.hasMemoryAddress();
|
|
|
|
case 1:
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
return components[0].buf.hasMemoryAddress();
|
2014-12-30 07:52:57 +01:00
|
|
|
default:
|
|
|
|
return false;
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
@Override
|
|
|
|
public long memoryAddress() {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
switch (componentCount) {
|
2014-12-30 07:52:57 +01:00
|
|
|
case 0:
|
|
|
|
return Unpooled.EMPTY_BUFFER.memoryAddress();
|
|
|
|
case 1:
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
Component c = components[0];
|
|
|
|
return c.buf.memoryAddress() + c.adjustment;
|
2014-12-30 07:52:57 +01:00
|
|
|
default:
|
|
|
|
throw new UnsupportedOperationException();
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
@Override
|
|
|
|
public int capacity() {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
int size = componentCount;
|
|
|
|
return size > 0 ? components[size - 1].endOffset : 0;
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
@Override
|
|
|
|
public CompositeByteBuf capacity(int newCapacity) {
|
2017-01-26 11:22:16 +01:00
|
|
|
checkNewCapacity(newCapacity);
|
2013-07-08 06:32:33 +02:00
|
|
|
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
final int size = componentCount, oldCapacity = capacity();
|
2013-07-08 06:32:33 +02:00
|
|
|
if (newCapacity > oldCapacity) {
|
|
|
|
final int paddingLength = newCapacity - oldCapacity;
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
ByteBuf padding = allocBuffer(paddingLength).setIndex(0, paddingLength);
|
|
|
|
addComponent0(false, size, padding);
|
|
|
|
if (componentCount >= maxNumComponents) {
|
2013-07-08 06:32:33 +02:00
|
|
|
// FIXME: No need to create a padding buffer and consolidate.
|
|
|
|
// Just create a big single buffer and put the current content there.
|
|
|
|
consolidateIfNeeded();
|
|
|
|
}
|
|
|
|
} else if (newCapacity < oldCapacity) {
|
2019-01-24 12:47:04 +01:00
|
|
|
lastAccessed = null;
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
int i = size - 1;
|
|
|
|
for (int bytesToTrim = oldCapacity - newCapacity; i >= 0; i--) {
|
|
|
|
Component c = components[i];
|
|
|
|
final int cLength = c.length();
|
|
|
|
if (bytesToTrim < cLength) {
|
|
|
|
// Trim the last component
|
|
|
|
c.endOffset -= bytesToTrim;
|
2019-03-22 11:18:10 +01:00
|
|
|
ByteBuf slice = c.slice;
|
|
|
|
if (slice != null) {
|
|
|
|
// We must replace the cached slice with a derived one to ensure that
|
|
|
|
// it can later be released properly in the case of PooledSlicedByteBuf.
|
|
|
|
c.slice = slice.slice(0, c.length());
|
|
|
|
}
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
break;
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
2019-02-01 10:31:53 +01:00
|
|
|
c.free();
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
bytesToTrim -= cLength;
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
removeCompRange(i + 1, size);
|
2013-07-08 06:32:33 +02:00
|
|
|
|
|
|
|
if (readerIndex() > newCapacity) {
|
2019-04-08 20:48:08 +02:00
|
|
|
setIndex0(newCapacity, newCapacity);
|
|
|
|
} else if (writerIndex > newCapacity) {
|
|
|
|
writerIndex = newCapacity;
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
|
|
|
}
|
|
|
|
return this;
|
|
|
|
}
|
|
|
|
|
|
|
|
@Override
|
|
|
|
public ByteBufAllocator alloc() {
|
|
|
|
return alloc;
|
|
|
|
}
|
|
|
|
|
|
|
|
@Override
|
|
|
|
public ByteOrder order() {
|
|
|
|
return ByteOrder.BIG_ENDIAN;
|
|
|
|
}
|
2009-12-30 04:14:50 +01:00
|
|
|
|
2012-12-21 22:22:40 +01:00
|
|
|
/**
|
|
|
|
* Return the current number of {@link ByteBuf}'s that are composed in this instance
|
|
|
|
*/
|
2013-07-08 06:32:33 +02:00
|
|
|
public int numComponents() {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
return componentCount;
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
2012-12-21 22:22:40 +01:00
|
|
|
|
|
|
|
/**
|
|
|
|
* Return the max number of {@link ByteBuf}'s that are composed in this instance
|
|
|
|
*/
|
2013-07-08 06:32:33 +02:00
|
|
|
public int maxNumComponents() {
|
|
|
|
return maxNumComponents;
|
|
|
|
}
|
|
|
|
|
|
|
|
/**
|
|
|
|
* Return the index for the given offset
|
|
|
|
*/
|
|
|
|
public int toComponentIndex(int offset) {
|
|
|
|
checkIndex(offset);
|
2019-01-24 12:47:04 +01:00
|
|
|
return toComponentIndex0(offset);
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
}
|
2013-07-08 06:32:33 +02:00
|
|
|
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
private int toComponentIndex0(int offset) {
|
|
|
|
int size = componentCount;
|
|
|
|
if (offset == 0) { // fast-path zero offset
|
|
|
|
for (int i = 0; i < size; i++) {
|
|
|
|
if (components[i].endOffset > 0) {
|
|
|
|
return i;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
2019-04-08 20:48:08 +02:00
|
|
|
if (size <= 2) { // fast-path for 1 and 2 component count
|
|
|
|
return size == 1 || offset < components[0].endOffset ? 0 : 1;
|
|
|
|
}
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
for (int low = 0, high = size; low <= high;) {
|
2013-07-08 06:32:33 +02:00
|
|
|
int mid = low + high >>> 1;
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
Component c = components[mid];
|
2013-07-08 06:32:33 +02:00
|
|
|
if (offset >= c.endOffset) {
|
|
|
|
low = mid + 1;
|
|
|
|
} else if (offset < c.offset) {
|
|
|
|
high = mid - 1;
|
|
|
|
} else {
|
|
|
|
return mid;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
throw new Error("should not reach here");
|
|
|
|
}
|
|
|
|
|
|
|
|
public int toByteIndex(int cIndex) {
|
|
|
|
checkComponentIndex(cIndex);
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
return components[cIndex].offset;
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
@Override
|
|
|
|
public byte getByte(int index) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
Component c = findComponent(index);
|
|
|
|
return c.buf.getByte(c.idx(index));
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
@Override
|
|
|
|
protected byte _getByte(int index) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
Component c = findComponent0(index);
|
|
|
|
return c.buf.getByte(c.idx(index));
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
@Override
|
|
|
|
protected short _getShort(int index) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
Component c = findComponent0(index);
|
2013-07-08 06:32:33 +02:00
|
|
|
if (index + 2 <= c.endOffset) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
return c.buf.getShort(c.idx(index));
|
2013-07-08 06:32:33 +02:00
|
|
|
} else if (order() == ByteOrder.BIG_ENDIAN) {
|
|
|
|
return (short) ((_getByte(index) & 0xff) << 8 | _getByte(index + 1) & 0xff);
|
|
|
|
} else {
|
|
|
|
return (short) (_getByte(index) & 0xff | (_getByte(index + 1) & 0xff) << 8);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2015-11-16 22:20:16 +01:00
|
|
|
@Override
|
|
|
|
protected short _getShortLE(int index) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
Component c = findComponent0(index);
|
2015-11-16 22:20:16 +01:00
|
|
|
if (index + 2 <= c.endOffset) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
return c.buf.getShortLE(c.idx(index));
|
2015-11-16 22:20:16 +01:00
|
|
|
} else if (order() == ByteOrder.BIG_ENDIAN) {
|
|
|
|
return (short) (_getByte(index) & 0xff | (_getByte(index + 1) & 0xff) << 8);
|
|
|
|
} else {
|
|
|
|
return (short) ((_getByte(index) & 0xff) << 8 | _getByte(index + 1) & 0xff);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2013-07-08 06:32:33 +02:00
|
|
|
@Override
|
|
|
|
protected int _getUnsignedMedium(int index) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
Component c = findComponent0(index);
|
2013-07-08 06:32:33 +02:00
|
|
|
if (index + 3 <= c.endOffset) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
return c.buf.getUnsignedMedium(c.idx(index));
|
2013-07-08 06:32:33 +02:00
|
|
|
} else if (order() == ByteOrder.BIG_ENDIAN) {
|
|
|
|
return (_getShort(index) & 0xffff) << 8 | _getByte(index + 2) & 0xff;
|
|
|
|
} else {
|
|
|
|
return _getShort(index) & 0xFFFF | (_getByte(index + 2) & 0xFF) << 16;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2015-11-16 22:20:16 +01:00
|
|
|
@Override
|
|
|
|
protected int _getUnsignedMediumLE(int index) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
Component c = findComponent0(index);
|
2015-11-16 22:20:16 +01:00
|
|
|
if (index + 3 <= c.endOffset) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
return c.buf.getUnsignedMediumLE(c.idx(index));
|
2015-11-16 22:20:16 +01:00
|
|
|
} else if (order() == ByteOrder.BIG_ENDIAN) {
|
|
|
|
return _getShortLE(index) & 0xffff | (_getByte(index + 2) & 0xff) << 16;
|
|
|
|
} else {
|
|
|
|
return (_getShortLE(index) & 0xffff) << 8 | _getByte(index + 2) & 0xff;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2013-07-08 06:32:33 +02:00
|
|
|
@Override
|
|
|
|
protected int _getInt(int index) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
Component c = findComponent0(index);
|
2013-07-08 06:32:33 +02:00
|
|
|
if (index + 4 <= c.endOffset) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
return c.buf.getInt(c.idx(index));
|
2013-07-08 06:32:33 +02:00
|
|
|
} else if (order() == ByteOrder.BIG_ENDIAN) {
|
|
|
|
return (_getShort(index) & 0xffff) << 16 | _getShort(index + 2) & 0xffff;
|
|
|
|
} else {
|
|
|
|
return _getShort(index) & 0xFFFF | (_getShort(index + 2) & 0xFFFF) << 16;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2015-11-16 22:20:16 +01:00
|
|
|
@Override
|
|
|
|
protected int _getIntLE(int index) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
Component c = findComponent0(index);
|
2015-11-16 22:20:16 +01:00
|
|
|
if (index + 4 <= c.endOffset) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
return c.buf.getIntLE(c.idx(index));
|
2015-11-16 22:20:16 +01:00
|
|
|
} else if (order() == ByteOrder.BIG_ENDIAN) {
|
|
|
|
return _getShortLE(index) & 0xffff | (_getShortLE(index + 2) & 0xffff) << 16;
|
|
|
|
} else {
|
|
|
|
return (_getShortLE(index) & 0xffff) << 16 | _getShortLE(index + 2) & 0xffff;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2013-07-08 06:32:33 +02:00
|
|
|
@Override
|
|
|
|
protected long _getLong(int index) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
Component c = findComponent0(index);
|
2013-07-08 06:32:33 +02:00
|
|
|
if (index + 8 <= c.endOffset) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
return c.buf.getLong(c.idx(index));
|
2013-07-08 06:32:33 +02:00
|
|
|
} else if (order() == ByteOrder.BIG_ENDIAN) {
|
|
|
|
return (_getInt(index) & 0xffffffffL) << 32 | _getInt(index + 4) & 0xffffffffL;
|
|
|
|
} else {
|
|
|
|
return _getInt(index) & 0xFFFFFFFFL | (_getInt(index + 4) & 0xFFFFFFFFL) << 32;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2015-11-16 22:20:16 +01:00
|
|
|
@Override
|
|
|
|
protected long _getLongLE(int index) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
Component c = findComponent0(index);
|
2015-11-16 22:20:16 +01:00
|
|
|
if (index + 8 <= c.endOffset) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
return c.buf.getLongLE(c.idx(index));
|
2015-11-16 22:20:16 +01:00
|
|
|
} else if (order() == ByteOrder.BIG_ENDIAN) {
|
|
|
|
return _getIntLE(index) & 0xffffffffL | (_getIntLE(index + 4) & 0xffffffffL) << 32;
|
|
|
|
} else {
|
|
|
|
return (_getIntLE(index) & 0xffffffffL) << 32 | _getIntLE(index + 4) & 0xffffffffL;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2013-07-08 06:32:33 +02:00
|
|
|
@Override
|
|
|
|
public CompositeByteBuf getBytes(int index, byte[] dst, int dstIndex, int length) {
|
|
|
|
checkDstIndex(index, length, dstIndex, dst.length);
|
|
|
|
if (length == 0) {
|
|
|
|
return this;
|
|
|
|
}
|
|
|
|
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
int i = toComponentIndex0(index);
|
2013-07-08 06:32:33 +02:00
|
|
|
while (length > 0) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
Component c = components[i];
|
|
|
|
int localLength = Math.min(length, c.endOffset - index);
|
|
|
|
c.buf.getBytes(c.idx(index), dst, dstIndex, localLength);
|
2013-07-08 06:32:33 +02:00
|
|
|
index += localLength;
|
|
|
|
dstIndex += localLength;
|
|
|
|
length -= localLength;
|
|
|
|
i ++;
|
|
|
|
}
|
|
|
|
return this;
|
|
|
|
}
|
|
|
|
|
|
|
|
@Override
|
|
|
|
public CompositeByteBuf getBytes(int index, ByteBuffer dst) {
|
|
|
|
int limit = dst.limit();
|
|
|
|
int length = dst.remaining();
|
|
|
|
|
|
|
|
checkIndex(index, length);
|
|
|
|
if (length == 0) {
|
|
|
|
return this;
|
|
|
|
}
|
|
|
|
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
int i = toComponentIndex0(index);
|
2013-07-08 06:32:33 +02:00
|
|
|
try {
|
|
|
|
while (length > 0) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
Component c = components[i];
|
|
|
|
int localLength = Math.min(length, c.endOffset - index);
|
2013-07-08 06:32:33 +02:00
|
|
|
dst.limit(dst.position() + localLength);
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
c.buf.getBytes(c.idx(index), dst);
|
2013-07-08 06:32:33 +02:00
|
|
|
index += localLength;
|
|
|
|
length -= localLength;
|
|
|
|
i ++;
|
|
|
|
}
|
|
|
|
} finally {
|
|
|
|
dst.limit(limit);
|
|
|
|
}
|
|
|
|
return this;
|
|
|
|
}
|
|
|
|
|
|
|
|
@Override
|
|
|
|
public CompositeByteBuf getBytes(int index, ByteBuf dst, int dstIndex, int length) {
|
|
|
|
checkDstIndex(index, length, dstIndex, dst.capacity());
|
2013-07-24 07:35:51 +02:00
|
|
|
if (length == 0) {
|
|
|
|
return this;
|
|
|
|
}
|
|
|
|
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
int i = toComponentIndex0(index);
|
2013-07-08 06:32:33 +02:00
|
|
|
while (length > 0) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
Component c = components[i];
|
|
|
|
int localLength = Math.min(length, c.endOffset - index);
|
|
|
|
c.buf.getBytes(c.idx(index), dst, dstIndex, localLength);
|
2013-07-08 06:32:33 +02:00
|
|
|
index += localLength;
|
|
|
|
dstIndex += localLength;
|
|
|
|
length -= localLength;
|
|
|
|
i ++;
|
|
|
|
}
|
|
|
|
return this;
|
|
|
|
}
|
|
|
|
|
|
|
|
@Override
|
|
|
|
public int getBytes(int index, GatheringByteChannel out, int length)
|
|
|
|
throws IOException {
|
2013-09-19 07:29:21 +02:00
|
|
|
int count = nioBufferCount();
|
|
|
|
if (count == 1) {
|
|
|
|
return out.write(internalNioBuffer(index, length));
|
2013-07-08 06:32:33 +02:00
|
|
|
} else {
|
2013-09-19 07:29:21 +02:00
|
|
|
long writtenBytes = out.write(nioBuffers(index, length));
|
|
|
|
if (writtenBytes > Integer.MAX_VALUE) {
|
|
|
|
return Integer.MAX_VALUE;
|
|
|
|
} else {
|
|
|
|
return (int) writtenBytes;
|
|
|
|
}
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2016-02-05 04:02:40 +01:00
|
|
|
@Override
|
|
|
|
public int getBytes(int index, FileChannel out, long position, int length)
|
|
|
|
throws IOException {
|
|
|
|
int count = nioBufferCount();
|
|
|
|
if (count == 1) {
|
|
|
|
return out.write(internalNioBuffer(index, length), position);
|
|
|
|
} else {
|
|
|
|
long writtenBytes = 0;
|
|
|
|
for (ByteBuffer buf : nioBuffers(index, length)) {
|
|
|
|
writtenBytes += out.write(buf, position + writtenBytes);
|
|
|
|
}
|
|
|
|
if (writtenBytes > Integer.MAX_VALUE) {
|
|
|
|
return Integer.MAX_VALUE;
|
|
|
|
}
|
|
|
|
return (int) writtenBytes;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2013-07-08 06:32:33 +02:00
|
|
|
@Override
|
|
|
|
public CompositeByteBuf getBytes(int index, OutputStream out, int length) throws IOException {
|
|
|
|
checkIndex(index, length);
|
|
|
|
if (length == 0) {
|
|
|
|
return this;
|
|
|
|
}
|
|
|
|
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
int i = toComponentIndex0(index);
|
2013-07-08 06:32:33 +02:00
|
|
|
while (length > 0) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
Component c = components[i];
|
|
|
|
int localLength = Math.min(length, c.endOffset - index);
|
|
|
|
c.buf.getBytes(c.idx(index), out, localLength);
|
2013-07-08 06:32:33 +02:00
|
|
|
index += localLength;
|
|
|
|
length -= localLength;
|
|
|
|
i ++;
|
|
|
|
}
|
|
|
|
return this;
|
|
|
|
}
|
|
|
|
|
|
|
|
@Override
|
|
|
|
public CompositeByteBuf setByte(int index, int value) {
|
|
|
|
Component c = findComponent(index);
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
c.buf.setByte(c.idx(index), value);
|
2013-07-08 06:32:33 +02:00
|
|
|
return this;
|
|
|
|
}
|
|
|
|
|
|
|
|
@Override
|
|
|
|
protected void _setByte(int index, int value) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
Component c = findComponent0(index);
|
|
|
|
c.buf.setByte(c.idx(index), value);
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
@Override
|
|
|
|
public CompositeByteBuf setShort(int index, int value) {
|
2019-02-28 20:40:42 +01:00
|
|
|
checkIndex(index, 2);
|
|
|
|
_setShort(index, value);
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
return this;
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
@Override
|
|
|
|
protected void _setShort(int index, int value) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
Component c = findComponent0(index);
|
2013-07-08 06:32:33 +02:00
|
|
|
if (index + 2 <= c.endOffset) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
c.buf.setShort(c.idx(index), value);
|
2013-07-08 06:32:33 +02:00
|
|
|
} else if (order() == ByteOrder.BIG_ENDIAN) {
|
|
|
|
_setByte(index, (byte) (value >>> 8));
|
|
|
|
_setByte(index + 1, (byte) value);
|
|
|
|
} else {
|
|
|
|
_setByte(index, (byte) value);
|
|
|
|
_setByte(index + 1, (byte) (value >>> 8));
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2015-11-16 22:20:16 +01:00
|
|
|
@Override
|
|
|
|
protected void _setShortLE(int index, int value) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
Component c = findComponent0(index);
|
2015-11-16 22:20:16 +01:00
|
|
|
if (index + 2 <= c.endOffset) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
c.buf.setShortLE(c.idx(index), value);
|
2015-11-16 22:20:16 +01:00
|
|
|
} else if (order() == ByteOrder.BIG_ENDIAN) {
|
|
|
|
_setByte(index, (byte) value);
|
|
|
|
_setByte(index + 1, (byte) (value >>> 8));
|
|
|
|
} else {
|
|
|
|
_setByte(index, (byte) (value >>> 8));
|
|
|
|
_setByte(index + 1, (byte) value);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2013-07-08 06:32:33 +02:00
|
|
|
@Override
|
|
|
|
public CompositeByteBuf setMedium(int index, int value) {
|
2019-02-28 20:40:42 +01:00
|
|
|
checkIndex(index, 3);
|
|
|
|
_setMedium(index, value);
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
return this;
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
@Override
|
|
|
|
protected void _setMedium(int index, int value) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
Component c = findComponent0(index);
|
2013-07-08 06:32:33 +02:00
|
|
|
if (index + 3 <= c.endOffset) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
c.buf.setMedium(c.idx(index), value);
|
2013-07-08 06:32:33 +02:00
|
|
|
} else if (order() == ByteOrder.BIG_ENDIAN) {
|
|
|
|
_setShort(index, (short) (value >> 8));
|
|
|
|
_setByte(index + 2, (byte) value);
|
|
|
|
} else {
|
|
|
|
_setShort(index, (short) value);
|
|
|
|
_setByte(index + 2, (byte) (value >>> 16));
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2015-11-16 22:20:16 +01:00
|
|
|
@Override
|
|
|
|
protected void _setMediumLE(int index, int value) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
Component c = findComponent0(index);
|
2015-11-16 22:20:16 +01:00
|
|
|
if (index + 3 <= c.endOffset) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
c.buf.setMediumLE(c.idx(index), value);
|
2015-11-16 22:20:16 +01:00
|
|
|
} else if (order() == ByteOrder.BIG_ENDIAN) {
|
|
|
|
_setShortLE(index, (short) value);
|
|
|
|
_setByte(index + 2, (byte) (value >>> 16));
|
|
|
|
} else {
|
|
|
|
_setShortLE(index, (short) (value >> 8));
|
|
|
|
_setByte(index + 2, (byte) value);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2013-07-08 06:32:33 +02:00
|
|
|
@Override
|
|
|
|
public CompositeByteBuf setInt(int index, int value) {
|
2019-02-28 20:40:42 +01:00
|
|
|
checkIndex(index, 4);
|
|
|
|
_setInt(index, value);
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
return this;
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
@Override
|
|
|
|
protected void _setInt(int index, int value) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
Component c = findComponent0(index);
|
2013-07-08 06:32:33 +02:00
|
|
|
if (index + 4 <= c.endOffset) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
c.buf.setInt(c.idx(index), value);
|
2013-07-08 06:32:33 +02:00
|
|
|
} else if (order() == ByteOrder.BIG_ENDIAN) {
|
|
|
|
_setShort(index, (short) (value >>> 16));
|
|
|
|
_setShort(index + 2, (short) value);
|
|
|
|
} else {
|
|
|
|
_setShort(index, (short) value);
|
|
|
|
_setShort(index + 2, (short) (value >>> 16));
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2015-11-16 22:20:16 +01:00
|
|
|
@Override
|
|
|
|
protected void _setIntLE(int index, int value) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
Component c = findComponent0(index);
|
2015-11-16 22:20:16 +01:00
|
|
|
if (index + 4 <= c.endOffset) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
c.buf.setIntLE(c.idx(index), value);
|
2015-11-16 22:20:16 +01:00
|
|
|
} else if (order() == ByteOrder.BIG_ENDIAN) {
|
|
|
|
_setShortLE(index, (short) value);
|
|
|
|
_setShortLE(index + 2, (short) (value >>> 16));
|
|
|
|
} else {
|
|
|
|
_setShortLE(index, (short) (value >>> 16));
|
|
|
|
_setShortLE(index + 2, (short) value);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2013-07-08 06:32:33 +02:00
|
|
|
@Override
|
|
|
|
public CompositeByteBuf setLong(int index, long value) {
|
2019-02-28 20:40:42 +01:00
|
|
|
checkIndex(index, 8);
|
|
|
|
_setLong(index, value);
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
return this;
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
@Override
|
|
|
|
protected void _setLong(int index, long value) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
Component c = findComponent0(index);
|
2013-07-08 06:32:33 +02:00
|
|
|
if (index + 8 <= c.endOffset) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
c.buf.setLong(c.idx(index), value);
|
2013-07-08 06:32:33 +02:00
|
|
|
} else if (order() == ByteOrder.BIG_ENDIAN) {
|
|
|
|
_setInt(index, (int) (value >>> 32));
|
|
|
|
_setInt(index + 4, (int) value);
|
|
|
|
} else {
|
|
|
|
_setInt(index, (int) value);
|
|
|
|
_setInt(index + 4, (int) (value >>> 32));
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2015-11-16 22:20:16 +01:00
|
|
|
@Override
|
|
|
|
protected void _setLongLE(int index, long value) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
Component c = findComponent0(index);
|
2015-11-16 22:20:16 +01:00
|
|
|
if (index + 8 <= c.endOffset) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
c.buf.setLongLE(c.idx(index), value);
|
2015-11-16 22:20:16 +01:00
|
|
|
} else if (order() == ByteOrder.BIG_ENDIAN) {
|
|
|
|
_setIntLE(index, (int) value);
|
|
|
|
_setIntLE(index + 4, (int) (value >>> 32));
|
|
|
|
} else {
|
|
|
|
_setIntLE(index, (int) (value >>> 32));
|
|
|
|
_setIntLE(index + 4, (int) value);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2013-07-08 06:32:33 +02:00
|
|
|
@Override
|
|
|
|
public CompositeByteBuf setBytes(int index, byte[] src, int srcIndex, int length) {
|
|
|
|
checkSrcIndex(index, length, srcIndex, src.length);
|
2013-07-24 07:35:51 +02:00
|
|
|
if (length == 0) {
|
|
|
|
return this;
|
|
|
|
}
|
2013-07-08 06:32:33 +02:00
|
|
|
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
int i = toComponentIndex0(index);
|
2013-07-08 06:32:33 +02:00
|
|
|
while (length > 0) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
Component c = components[i];
|
|
|
|
int localLength = Math.min(length, c.endOffset - index);
|
|
|
|
c.buf.setBytes(c.idx(index), src, srcIndex, localLength);
|
2013-07-08 06:32:33 +02:00
|
|
|
index += localLength;
|
|
|
|
srcIndex += localLength;
|
|
|
|
length -= localLength;
|
|
|
|
i ++;
|
|
|
|
}
|
|
|
|
return this;
|
|
|
|
}
|
|
|
|
|
|
|
|
@Override
|
|
|
|
public CompositeByteBuf setBytes(int index, ByteBuffer src) {
|
|
|
|
int limit = src.limit();
|
|
|
|
int length = src.remaining();
|
|
|
|
|
|
|
|
checkIndex(index, length);
|
2013-07-24 07:35:51 +02:00
|
|
|
if (length == 0) {
|
|
|
|
return this;
|
|
|
|
}
|
|
|
|
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
int i = toComponentIndex0(index);
|
2013-07-08 06:32:33 +02:00
|
|
|
try {
|
|
|
|
while (length > 0) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
Component c = components[i];
|
|
|
|
int localLength = Math.min(length, c.endOffset - index);
|
2013-07-08 06:32:33 +02:00
|
|
|
src.limit(src.position() + localLength);
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
c.buf.setBytes(c.idx(index), src);
|
2013-07-08 06:32:33 +02:00
|
|
|
index += localLength;
|
|
|
|
length -= localLength;
|
|
|
|
i ++;
|
|
|
|
}
|
|
|
|
} finally {
|
|
|
|
src.limit(limit);
|
|
|
|
}
|
|
|
|
return this;
|
|
|
|
}
|
|
|
|
|
|
|
|
@Override
|
|
|
|
public CompositeByteBuf setBytes(int index, ByteBuf src, int srcIndex, int length) {
|
|
|
|
checkSrcIndex(index, length, srcIndex, src.capacity());
|
2013-07-24 07:35:51 +02:00
|
|
|
if (length == 0) {
|
|
|
|
return this;
|
|
|
|
}
|
2013-07-08 06:32:33 +02:00
|
|
|
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
int i = toComponentIndex0(index);
|
2013-07-08 06:32:33 +02:00
|
|
|
while (length > 0) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
Component c = components[i];
|
|
|
|
int localLength = Math.min(length, c.endOffset - index);
|
|
|
|
c.buf.setBytes(c.idx(index), src, srcIndex, localLength);
|
2013-07-08 06:32:33 +02:00
|
|
|
index += localLength;
|
|
|
|
srcIndex += localLength;
|
|
|
|
length -= localLength;
|
|
|
|
i ++;
|
|
|
|
}
|
|
|
|
return this;
|
|
|
|
}
|
|
|
|
|
|
|
|
@Override
|
|
|
|
public int setBytes(int index, InputStream in, int length) throws IOException {
|
|
|
|
checkIndex(index, length);
|
2013-07-24 07:35:51 +02:00
|
|
|
if (length == 0) {
|
|
|
|
return in.read(EmptyArrays.EMPTY_BYTES);
|
|
|
|
}
|
2013-07-08 06:32:33 +02:00
|
|
|
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
int i = toComponentIndex0(index);
|
2013-07-08 06:32:33 +02:00
|
|
|
int readBytes = 0;
|
|
|
|
do {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
Component c = components[i];
|
|
|
|
int localLength = Math.min(length, c.endOffset - index);
|
2016-02-05 04:02:40 +01:00
|
|
|
if (localLength == 0) {
|
|
|
|
// Skip empty buffer
|
|
|
|
i++;
|
|
|
|
continue;
|
|
|
|
}
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
int localReadBytes = c.buf.setBytes(c.idx(index), in, localLength);
|
2013-07-08 06:32:33 +02:00
|
|
|
if (localReadBytes < 0) {
|
|
|
|
if (readBytes == 0) {
|
|
|
|
return -1;
|
|
|
|
} else {
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2019-02-01 10:31:53 +01:00
|
|
|
index += localReadBytes;
|
|
|
|
length -= localReadBytes;
|
|
|
|
readBytes += localReadBytes;
|
2013-07-08 06:32:33 +02:00
|
|
|
if (localReadBytes == localLength) {
|
|
|
|
i ++;
|
|
|
|
}
|
|
|
|
} while (length > 0);
|
|
|
|
|
|
|
|
return readBytes;
|
|
|
|
}
|
|
|
|
|
|
|
|
@Override
|
|
|
|
public int setBytes(int index, ScatteringByteChannel in, int length) throws IOException {
|
|
|
|
checkIndex(index, length);
|
2013-07-24 07:35:51 +02:00
|
|
|
if (length == 0) {
|
2014-12-30 07:52:57 +01:00
|
|
|
return in.read(EMPTY_NIO_BUFFER);
|
2013-07-24 07:35:51 +02:00
|
|
|
}
|
2013-07-08 06:32:33 +02:00
|
|
|
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
int i = toComponentIndex0(index);
|
2013-07-08 06:32:33 +02:00
|
|
|
int readBytes = 0;
|
|
|
|
do {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
Component c = components[i];
|
|
|
|
int localLength = Math.min(length, c.endOffset - index);
|
2016-02-05 04:02:40 +01:00
|
|
|
if (localLength == 0) {
|
|
|
|
// Skip empty buffer
|
|
|
|
i++;
|
|
|
|
continue;
|
|
|
|
}
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
int localReadBytes = c.buf.setBytes(c.idx(index), in, localLength);
|
2013-07-08 06:32:33 +02:00
|
|
|
|
|
|
|
if (localReadBytes == 0) {
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (localReadBytes < 0) {
|
|
|
|
if (readBytes == 0) {
|
|
|
|
return -1;
|
|
|
|
} else {
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2019-02-01 10:31:53 +01:00
|
|
|
index += localReadBytes;
|
|
|
|
length -= localReadBytes;
|
|
|
|
readBytes += localReadBytes;
|
2013-07-08 06:32:33 +02:00
|
|
|
if (localReadBytes == localLength) {
|
2016-02-05 04:02:40 +01:00
|
|
|
i ++;
|
|
|
|
}
|
|
|
|
} while (length > 0);
|
|
|
|
|
|
|
|
return readBytes;
|
|
|
|
}
|
|
|
|
|
|
|
|
@Override
|
|
|
|
public int setBytes(int index, FileChannel in, long position, int length) throws IOException {
|
|
|
|
checkIndex(index, length);
|
|
|
|
if (length == 0) {
|
|
|
|
return in.read(EMPTY_NIO_BUFFER, position);
|
|
|
|
}
|
|
|
|
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
int i = toComponentIndex0(index);
|
2016-02-05 04:02:40 +01:00
|
|
|
int readBytes = 0;
|
|
|
|
do {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
Component c = components[i];
|
|
|
|
int localLength = Math.min(length, c.endOffset - index);
|
2016-02-05 04:02:40 +01:00
|
|
|
if (localLength == 0) {
|
|
|
|
// Skip empty buffer
|
|
|
|
i++;
|
|
|
|
continue;
|
|
|
|
}
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
int localReadBytes = c.buf.setBytes(c.idx(index), in, position + readBytes, localLength);
|
2016-02-05 04:02:40 +01:00
|
|
|
|
|
|
|
if (localReadBytes == 0) {
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (localReadBytes < 0) {
|
|
|
|
if (readBytes == 0) {
|
|
|
|
return -1;
|
|
|
|
} else {
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2019-02-01 10:31:53 +01:00
|
|
|
index += localReadBytes;
|
|
|
|
length -= localReadBytes;
|
|
|
|
readBytes += localReadBytes;
|
2016-02-05 04:02:40 +01:00
|
|
|
if (localReadBytes == localLength) {
|
2013-07-08 06:32:33 +02:00
|
|
|
i ++;
|
|
|
|
}
|
|
|
|
} while (length > 0);
|
|
|
|
|
|
|
|
return readBytes;
|
|
|
|
}
|
|
|
|
|
|
|
|
@Override
|
|
|
|
public ByteBuf copy(int index, int length) {
|
|
|
|
checkIndex(index, length);
|
2017-11-09 17:48:19 +01:00
|
|
|
ByteBuf dst = allocBuffer(length);
|
2013-07-24 07:35:51 +02:00
|
|
|
if (length != 0) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
copyTo(index, length, toComponentIndex0(index), dst);
|
2013-07-24 07:35:51 +02:00
|
|
|
}
|
2013-07-08 06:32:33 +02:00
|
|
|
return dst;
|
|
|
|
}
|
|
|
|
|
|
|
|
private void copyTo(int index, int length, int componentId, ByteBuf dst) {
|
|
|
|
int dstIndex = 0;
|
|
|
|
int i = componentId;
|
|
|
|
|
|
|
|
while (length > 0) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
Component c = components[i];
|
|
|
|
int localLength = Math.min(length, c.endOffset - index);
|
|
|
|
c.buf.getBytes(c.idx(index), dst, dstIndex, localLength);
|
2013-07-08 06:32:33 +02:00
|
|
|
index += localLength;
|
|
|
|
dstIndex += localLength;
|
|
|
|
length -= localLength;
|
|
|
|
i ++;
|
|
|
|
}
|
|
|
|
|
|
|
|
dst.writerIndex(dst.capacity());
|
|
|
|
}
|
2009-12-30 04:14:50 +01:00
|
|
|
|
2012-12-21 22:22:40 +01:00
|
|
|
/**
|
|
|
|
* Return the {@link ByteBuf} on the specified index
|
|
|
|
*
|
2013-07-08 06:32:33 +02:00
|
|
|
* @param cIndex the index for which the {@link ByteBuf} should be returned
|
|
|
|
* @return buf the {@link ByteBuf} on the specified index
|
2012-12-21 22:22:40 +01:00
|
|
|
*/
|
2013-07-08 06:32:33 +02:00
|
|
|
public ByteBuf component(int cIndex) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
checkComponentIndex(cIndex);
|
|
|
|
return components[cIndex].duplicate();
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
2012-12-21 22:22:40 +01:00
|
|
|
|
|
|
|
/**
|
|
|
|
* Return the {@link ByteBuf} on the specified index
|
|
|
|
*
|
2013-07-08 06:32:33 +02:00
|
|
|
* @param offset the offset for which the {@link ByteBuf} should be returned
|
|
|
|
* @return the {@link ByteBuf} on the specified index
|
2012-12-21 22:22:40 +01:00
|
|
|
*/
|
2013-07-08 06:32:33 +02:00
|
|
|
public ByteBuf componentAtOffset(int offset) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
return findComponent(offset).duplicate();
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
2009-12-30 04:14:50 +01:00
|
|
|
|
2013-07-05 07:00:05 +02:00
|
|
|
/**
|
2013-07-08 06:32:33 +02:00
|
|
|
* Return the internal {@link ByteBuf} on the specified index. Note that updating the indexes of the returned
|
2013-07-05 07:00:05 +02:00
|
|
|
* buffer will lead to an undefined behavior of this buffer.
|
|
|
|
*
|
|
|
|
* @param cIndex the index for which the {@link ByteBuf} should be returned
|
|
|
|
*/
|
2013-07-08 06:32:33 +02:00
|
|
|
public ByteBuf internalComponent(int cIndex) {
|
|
|
|
checkComponentIndex(cIndex);
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
return components[cIndex].slice();
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
2013-07-05 07:00:05 +02:00
|
|
|
|
|
|
|
/**
|
2013-07-08 06:32:33 +02:00
|
|
|
* Return the internal {@link ByteBuf} on the specified offset. Note that updating the indexes of the returned
|
2013-07-05 07:00:05 +02:00
|
|
|
* buffer will lead to an undefined behavior of this buffer.
|
|
|
|
*
|
|
|
|
* @param offset the offset for which the {@link ByteBuf} should be returned
|
|
|
|
*/
|
2013-07-08 06:32:33 +02:00
|
|
|
public ByteBuf internalComponentAtOffset(int offset) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
return findComponent(offset).slice();
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
|
|
|
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
// weak cache - check it first when looking for component
|
|
|
|
private Component lastAccessed;
|
|
|
|
|
2013-07-08 06:32:33 +02:00
|
|
|
private Component findComponent(int offset) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
Component la = lastAccessed;
|
|
|
|
if (la != null && offset >= la.offset && offset < la.endOffset) {
|
|
|
|
ensureAccessible();
|
|
|
|
return la;
|
|
|
|
}
|
2013-07-08 06:32:33 +02:00
|
|
|
checkIndex(offset);
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
return findIt(offset);
|
|
|
|
}
|
|
|
|
|
|
|
|
private Component findComponent0(int offset) {
|
|
|
|
Component la = lastAccessed;
|
|
|
|
if (la != null && offset >= la.offset && offset < la.endOffset) {
|
|
|
|
return la;
|
|
|
|
}
|
|
|
|
return findIt(offset);
|
|
|
|
}
|
2013-07-08 06:32:33 +02:00
|
|
|
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
private Component findIt(int offset) {
|
|
|
|
for (int low = 0, high = componentCount; low <= high;) {
|
2013-07-08 06:32:33 +02:00
|
|
|
int mid = low + high >>> 1;
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
Component c = components[mid];
|
2013-07-08 06:32:33 +02:00
|
|
|
if (offset >= c.endOffset) {
|
|
|
|
low = mid + 1;
|
|
|
|
} else if (offset < c.offset) {
|
|
|
|
high = mid - 1;
|
|
|
|
} else {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
lastAccessed = c;
|
2013-07-08 06:32:33 +02:00
|
|
|
return c;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
throw new Error("should not reach here");
|
|
|
|
}
|
2013-07-05 07:00:05 +02:00
|
|
|
|
2013-07-08 06:32:33 +02:00
|
|
|
@Override
|
|
|
|
public int nioBufferCount() {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
int size = componentCount;
|
|
|
|
switch (size) {
|
2014-12-30 07:52:57 +01:00
|
|
|
case 0:
|
|
|
|
return 1;
|
|
|
|
case 1:
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
return components[0].buf.nioBufferCount();
|
2014-12-30 07:52:57 +01:00
|
|
|
default:
|
2013-07-08 06:32:33 +02:00
|
|
|
int count = 0;
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
for (int i = 0; i < size; i++) {
|
|
|
|
count += components[i].buf.nioBufferCount();
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
|
|
|
return count;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
@Override
|
|
|
|
public ByteBuffer internalNioBuffer(int index, int length) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
switch (componentCount) {
|
2014-12-30 07:52:57 +01:00
|
|
|
case 0:
|
|
|
|
return EMPTY_NIO_BUFFER;
|
|
|
|
case 1:
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
Component c = components[0];
|
|
|
|
return c.buf.internalNioBuffer(c.idx(index), length);
|
2014-12-30 07:52:57 +01:00
|
|
|
default:
|
|
|
|
throw new UnsupportedOperationException();
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2013-08-30 09:51:12 +02:00
|
|
|
@Override
|
|
|
|
public ByteBuffer nioBuffer(int index, int length) {
|
2014-12-30 07:52:57 +01:00
|
|
|
checkIndex(index, length);
|
|
|
|
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
switch (componentCount) {
|
2014-12-30 07:52:57 +01:00
|
|
|
case 0:
|
|
|
|
return EMPTY_NIO_BUFFER;
|
|
|
|
case 1:
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
Component c = components[0];
|
|
|
|
ByteBuf buf = c.buf;
|
2013-08-30 09:51:12 +02:00
|
|
|
if (buf.nioBufferCount() == 1) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
return buf.nioBuffer(c.idx(index), length);
|
2013-08-30 09:51:12 +02:00
|
|
|
}
|
|
|
|
}
|
2014-12-30 07:52:57 +01:00
|
|
|
|
2013-08-30 09:51:12 +02:00
|
|
|
ByteBuffer[] buffers = nioBuffers(index, length);
|
|
|
|
|
2018-08-07 11:31:24 +02:00
|
|
|
if (buffers.length == 1) {
|
|
|
|
return buffers[0].duplicate();
|
|
|
|
}
|
|
|
|
|
|
|
|
ByteBuffer merged = ByteBuffer.allocate(length).order(order());
|
2014-12-30 07:52:57 +01:00
|
|
|
for (ByteBuffer buf: buffers) {
|
|
|
|
merged.put(buf);
|
2013-08-30 09:51:12 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
merged.flip();
|
|
|
|
return merged;
|
|
|
|
}
|
|
|
|
|
2013-07-08 06:32:33 +02:00
|
|
|
@Override
|
|
|
|
public ByteBuffer[] nioBuffers(int index, int length) {
|
|
|
|
checkIndex(index, length);
|
|
|
|
if (length == 0) {
|
2014-12-30 07:52:57 +01:00
|
|
|
return new ByteBuffer[] { EMPTY_NIO_BUFFER };
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
|
|
|
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
RecyclableArrayList buffers = RecyclableArrayList.newInstance(componentCount);
|
|
|
|
try {
|
|
|
|
int i = toComponentIndex0(index);
|
|
|
|
while (length > 0) {
|
|
|
|
Component c = components[i];
|
|
|
|
ByteBuf s = c.buf;
|
|
|
|
int localLength = Math.min(length, c.endOffset - index);
|
|
|
|
switch (s.nioBufferCount()) {
|
2013-07-08 06:32:33 +02:00
|
|
|
case 0:
|
|
|
|
throw new UnsupportedOperationException();
|
|
|
|
case 1:
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
buffers.add(s.nioBuffer(c.idx(index), localLength));
|
2013-07-08 06:32:33 +02:00
|
|
|
break;
|
|
|
|
default:
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
Collections.addAll(buffers, s.nioBuffers(c.idx(index), localLength));
|
|
|
|
}
|
|
|
|
|
|
|
|
index += localLength;
|
|
|
|
length -= localLength;
|
|
|
|
i ++;
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
|
|
|
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
return buffers.toArray(new ByteBuffer[0]);
|
|
|
|
} finally {
|
|
|
|
buffers.recycle();
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2012-12-21 22:22:40 +01:00
|
|
|
/**
|
|
|
|
* Consolidate the composed {@link ByteBuf}s
|
|
|
|
*/
|
2013-07-08 06:32:33 +02:00
|
|
|
public CompositeByteBuf consolidate() {
|
2014-06-30 07:10:12 +02:00
|
|
|
ensureAccessible();
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
final int numComponents = componentCount;
|
2013-07-08 06:32:33 +02:00
|
|
|
if (numComponents <= 1) {
|
|
|
|
return this;
|
|
|
|
}
|
|
|
|
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
final int capacity = components[numComponents - 1].endOffset;
|
2013-07-08 06:32:33 +02:00
|
|
|
final ByteBuf consolidated = allocBuffer(capacity);
|
|
|
|
|
|
|
|
for (int i = 0; i < numComponents; i ++) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
components[i].transferTo(consolidated);
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
lastAccessed = null;
|
|
|
|
components[0] = new Component(consolidated, 0, 0, capacity, consolidated);
|
|
|
|
removeCompRange(1, numComponents);
|
2013-07-08 06:32:33 +02:00
|
|
|
return this;
|
|
|
|
}
|
2012-12-21 22:22:40 +01:00
|
|
|
|
|
|
|
/**
|
|
|
|
* Consolidate the composed {@link ByteBuf}s
|
|
|
|
*
|
2013-07-08 06:32:33 +02:00
|
|
|
* @param cIndex the index on which to start to compose
|
|
|
|
* @param numComponents the number of components to compose
|
2012-12-21 22:22:40 +01:00
|
|
|
*/
|
2013-07-08 06:32:33 +02:00
|
|
|
public CompositeByteBuf consolidate(int cIndex, int numComponents) {
|
|
|
|
checkComponentIndex(cIndex, numComponents);
|
|
|
|
if (numComponents <= 1) {
|
|
|
|
return this;
|
|
|
|
}
|
|
|
|
|
|
|
|
final int endCIndex = cIndex + numComponents;
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
final Component last = components[endCIndex - 1];
|
|
|
|
final int capacity = last.endOffset - components[cIndex].offset;
|
2013-07-08 06:32:33 +02:00
|
|
|
final ByteBuf consolidated = allocBuffer(capacity);
|
|
|
|
|
|
|
|
for (int i = cIndex; i < endCIndex; i ++) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
components[i].transferTo(consolidated);
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
lastAccessed = null;
|
|
|
|
removeCompRange(cIndex + 1, endCIndex);
|
|
|
|
components[cIndex] = new Component(consolidated, 0, 0, capacity, consolidated);
|
2013-07-08 06:32:33 +02:00
|
|
|
updateComponentOffsets(cIndex);
|
|
|
|
return this;
|
|
|
|
}
|
2009-12-30 04:14:50 +01:00
|
|
|
|
2012-12-21 22:22:40 +01:00
|
|
|
/**
|
2013-07-08 06:32:33 +02:00
|
|
|
* Discard all {@link ByteBuf}s which are read.
|
2012-06-28 11:00:43 +02:00
|
|
|
*/
|
2013-07-08 06:32:33 +02:00
|
|
|
public CompositeByteBuf discardReadComponents() {
|
2014-06-30 07:10:12 +02:00
|
|
|
ensureAccessible();
|
2013-07-08 06:32:33 +02:00
|
|
|
final int readerIndex = readerIndex();
|
|
|
|
if (readerIndex == 0) {
|
|
|
|
return this;
|
|
|
|
}
|
|
|
|
|
|
|
|
// Discard everything if (readerIndex = writerIndex = capacity).
|
|
|
|
int writerIndex = writerIndex();
|
|
|
|
if (readerIndex == writerIndex && writerIndex == capacity()) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
for (int i = 0, size = componentCount; i < size; i++) {
|
2019-02-01 10:31:53 +01:00
|
|
|
components[i].free();
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
lastAccessed = null;
|
|
|
|
clearComps();
|
2013-07-08 06:32:33 +02:00
|
|
|
setIndex(0, 0);
|
|
|
|
adjustMarkers(readerIndex);
|
|
|
|
return this;
|
|
|
|
}
|
|
|
|
|
|
|
|
// Remove read components.
|
2019-04-08 20:48:08 +02:00
|
|
|
int firstComponentId = 0;
|
|
|
|
Component c = null;
|
|
|
|
for (int size = componentCount; firstComponentId < size; firstComponentId++) {
|
|
|
|
c = components[firstComponentId];
|
|
|
|
if (c.endOffset > readerIndex) {
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
c.free();
|
|
|
|
}
|
|
|
|
if (firstComponentId == 0) {
|
|
|
|
return this; // Nothing to discard
|
|
|
|
}
|
|
|
|
Component la = lastAccessed;
|
|
|
|
if (la != null && la.endOffset < readerIndex) {
|
|
|
|
lastAccessed = null;
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
removeCompRange(0, firstComponentId);
|
2013-07-08 06:32:33 +02:00
|
|
|
|
|
|
|
// Update indexes and markers.
|
2019-04-08 20:48:08 +02:00
|
|
|
int offset = c.offset;
|
2013-07-08 06:32:33 +02:00
|
|
|
updateComponentOffsets(0);
|
2013-11-09 20:13:24 +01:00
|
|
|
setIndex(readerIndex - offset, writerIndex - offset);
|
|
|
|
adjustMarkers(offset);
|
2013-07-08 06:32:33 +02:00
|
|
|
return this;
|
|
|
|
}
|
2012-10-18 08:57:23 +02:00
|
|
|
|
|
|
|
@Override
|
2013-07-08 06:32:33 +02:00
|
|
|
public CompositeByteBuf discardReadBytes() {
|
2014-06-30 07:10:12 +02:00
|
|
|
ensureAccessible();
|
2013-07-08 06:32:33 +02:00
|
|
|
final int readerIndex = readerIndex();
|
|
|
|
if (readerIndex == 0) {
|
|
|
|
return this;
|
|
|
|
}
|
|
|
|
|
|
|
|
// Discard everything if (readerIndex = writerIndex = capacity).
|
|
|
|
int writerIndex = writerIndex();
|
|
|
|
if (readerIndex == writerIndex && writerIndex == capacity()) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
for (int i = 0, size = componentCount; i < size; i++) {
|
2019-02-01 10:31:53 +01:00
|
|
|
components[i].free();
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
lastAccessed = null;
|
|
|
|
clearComps();
|
2013-07-08 06:32:33 +02:00
|
|
|
setIndex(0, 0);
|
|
|
|
adjustMarkers(readerIndex);
|
|
|
|
return this;
|
|
|
|
}
|
|
|
|
|
2019-04-08 20:48:08 +02:00
|
|
|
int firstComponentId = 0;
|
|
|
|
Component c = null;
|
|
|
|
for (int size = componentCount; firstComponentId < size; firstComponentId++) {
|
|
|
|
c = components[firstComponentId];
|
|
|
|
if (c.endOffset > readerIndex) {
|
|
|
|
break;
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
}
|
2019-04-08 20:48:08 +02:00
|
|
|
c.free();
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
|
|
|
|
2019-04-08 20:48:08 +02:00
|
|
|
// Replace the first readable component with a new slice.
|
|
|
|
int trimmedBytes = readerIndex - c.offset;
|
|
|
|
c.offset = 0;
|
|
|
|
c.endOffset -= readerIndex;
|
|
|
|
c.adjustment += readerIndex;
|
|
|
|
ByteBuf slice = c.slice;
|
|
|
|
if (slice != null) {
|
|
|
|
// We must replace the cached slice with a derived one to ensure that
|
|
|
|
// it can later be released properly in the case of PooledSlicedByteBuf.
|
|
|
|
c.slice = slice.slice(trimmedBytes, c.length());
|
|
|
|
}
|
|
|
|
Component la = lastAccessed;
|
|
|
|
if (la != null && la.endOffset < readerIndex) {
|
|
|
|
lastAccessed = null;
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
|
|
|
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
removeCompRange(0, firstComponentId);
|
2017-12-12 09:06:18 +01:00
|
|
|
|
2013-07-08 06:32:33 +02:00
|
|
|
// Update indexes and markers.
|
|
|
|
updateComponentOffsets(0);
|
|
|
|
setIndex(0, writerIndex - readerIndex);
|
|
|
|
adjustMarkers(readerIndex);
|
|
|
|
return this;
|
|
|
|
}
|
|
|
|
|
|
|
|
private ByteBuf allocBuffer(int capacity) {
|
2016-01-26 22:44:20 +01:00
|
|
|
return direct ? alloc().directBuffer(capacity) : alloc().heapBuffer(capacity);
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
2012-10-18 08:57:23 +02:00
|
|
|
|
|
|
|
@Override
|
2013-07-08 06:32:33 +02:00
|
|
|
public String toString() {
|
|
|
|
String result = super.toString();
|
|
|
|
result = result.substring(0, result.length() - 1);
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
return result + ", components=" + componentCount + ')';
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
|
|
|
|
2014-07-19 14:48:54 +02:00
|
|
|
private static final class Component {
|
2013-07-08 06:32:33 +02:00
|
|
|
final ByteBuf buf;
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
int adjustment;
|
2013-07-08 06:32:33 +02:00
|
|
|
int offset;
|
|
|
|
int endOffset;
|
|
|
|
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
private ByteBuf slice; // cached slice, may be null
|
|
|
|
|
|
|
|
Component(ByteBuf buf, int srcOffset, int offset, int len, ByteBuf slice) {
|
2013-07-08 06:32:33 +02:00
|
|
|
this.buf = buf;
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
this.offset = offset;
|
|
|
|
this.endOffset = offset + len;
|
|
|
|
this.adjustment = srcOffset - offset;
|
|
|
|
this.slice = slice;
|
|
|
|
}
|
|
|
|
|
|
|
|
int idx(int index) {
|
|
|
|
return index + adjustment;
|
|
|
|
}
|
|
|
|
|
|
|
|
int length() {
|
|
|
|
return endOffset - offset;
|
|
|
|
}
|
|
|
|
|
|
|
|
void reposition(int newOffset) {
|
|
|
|
int move = newOffset - offset;
|
|
|
|
endOffset += move;
|
|
|
|
adjustment -= move;
|
|
|
|
offset = newOffset;
|
|
|
|
}
|
|
|
|
|
|
|
|
// copy then release
|
|
|
|
void transferTo(ByteBuf dst) {
|
|
|
|
dst.writeBytes(buf, idx(offset), length());
|
2019-02-01 10:31:53 +01:00
|
|
|
free();
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
}
|
|
|
|
|
|
|
|
ByteBuf slice() {
|
|
|
|
return slice != null ? slice : (slice = buf.slice(idx(offset), length()));
|
|
|
|
}
|
|
|
|
|
|
|
|
ByteBuf duplicate() {
|
|
|
|
return buf.duplicate().setIndex(idx(offset), idx(endOffset));
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
|
|
|
|
2019-02-01 10:31:53 +01:00
|
|
|
void free() {
|
2018-11-13 20:56:09 +01:00
|
|
|
// Release the slice if present since it may have a different
|
|
|
|
// refcount to the unwrapped buf if it is a PooledSlicedByteBuf
|
|
|
|
ByteBuf buffer = slice;
|
|
|
|
if (buffer != null) {
|
|
|
|
buffer.release();
|
|
|
|
} else {
|
|
|
|
buf.release();
|
|
|
|
}
|
2019-02-01 10:31:53 +01:00
|
|
|
// null out in either case since it could be racy if set lazily (but not
|
|
|
|
// in the case we care about, where it will have been set in the ctor)
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
slice = null;
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
|
|
|
}
|
2012-10-18 08:57:23 +02:00
|
|
|
|
|
|
|
@Override
|
2013-07-08 06:32:33 +02:00
|
|
|
public CompositeByteBuf readerIndex(int readerIndex) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
super.readerIndex(readerIndex);
|
|
|
|
return this;
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
2012-10-18 08:57:23 +02:00
|
|
|
|
|
|
|
@Override
|
2013-07-08 06:32:33 +02:00
|
|
|
public CompositeByteBuf writerIndex(int writerIndex) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
super.writerIndex(writerIndex);
|
|
|
|
return this;
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
2012-10-18 08:57:23 +02:00
|
|
|
|
|
|
|
@Override
|
2013-07-08 06:32:33 +02:00
|
|
|
public CompositeByteBuf setIndex(int readerIndex, int writerIndex) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
super.setIndex(readerIndex, writerIndex);
|
|
|
|
return this;
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
2012-10-18 08:57:23 +02:00
|
|
|
|
|
|
|
@Override
|
2013-07-08 06:32:33 +02:00
|
|
|
public CompositeByteBuf clear() {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
super.clear();
|
|
|
|
return this;
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
2012-10-18 08:57:23 +02:00
|
|
|
|
|
|
|
@Override
|
2013-07-08 06:32:33 +02:00
|
|
|
public CompositeByteBuf markReaderIndex() {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
super.markReaderIndex();
|
|
|
|
return this;
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
2012-10-18 08:57:23 +02:00
|
|
|
|
|
|
|
@Override
|
2013-07-08 06:32:33 +02:00
|
|
|
public CompositeByteBuf resetReaderIndex() {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
super.resetReaderIndex();
|
|
|
|
return this;
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
2012-10-18 08:57:23 +02:00
|
|
|
|
|
|
|
@Override
|
2013-07-08 06:32:33 +02:00
|
|
|
public CompositeByteBuf markWriterIndex() {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
super.markWriterIndex();
|
|
|
|
return this;
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
2012-10-18 08:57:23 +02:00
|
|
|
|
|
|
|
@Override
|
2013-07-08 06:32:33 +02:00
|
|
|
public CompositeByteBuf resetWriterIndex() {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
super.resetWriterIndex();
|
|
|
|
return this;
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
2012-10-18 08:57:23 +02:00
|
|
|
|
2012-12-17 09:41:21 +01:00
|
|
|
@Override
|
2013-07-08 06:32:33 +02:00
|
|
|
public CompositeByteBuf ensureWritable(int minWritableBytes) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
super.ensureWritable(minWritableBytes);
|
|
|
|
return this;
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
2012-12-17 09:41:21 +01:00
|
|
|
|
2012-10-18 08:57:23 +02:00
|
|
|
@Override
|
2013-07-08 06:32:33 +02:00
|
|
|
public CompositeByteBuf getBytes(int index, ByteBuf dst) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
return getBytes(index, dst, dst.writableBytes());
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
2012-10-18 08:57:23 +02:00
|
|
|
|
|
|
|
@Override
|
2013-07-08 06:32:33 +02:00
|
|
|
public CompositeByteBuf getBytes(int index, ByteBuf dst, int length) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
getBytes(index, dst, dst.writerIndex(), length);
|
|
|
|
dst.writerIndex(dst.writerIndex() + length);
|
|
|
|
return this;
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
2012-10-18 08:57:23 +02:00
|
|
|
|
|
|
|
@Override
|
2013-07-08 06:32:33 +02:00
|
|
|
public CompositeByteBuf getBytes(int index, byte[] dst) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
return getBytes(index, dst, 0, dst.length);
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
2012-10-18 08:57:23 +02:00
|
|
|
|
|
|
|
@Override
|
2013-07-08 06:32:33 +02:00
|
|
|
public CompositeByteBuf setBoolean(int index, boolean value) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
return setByte(index, value? 1 : 0);
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
2012-10-18 08:57:23 +02:00
|
|
|
|
|
|
|
@Override
|
2013-07-08 06:32:33 +02:00
|
|
|
public CompositeByteBuf setChar(int index, int value) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
return setShort(index, value);
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
2012-10-18 08:57:23 +02:00
|
|
|
|
|
|
|
@Override
|
2013-07-08 06:32:33 +02:00
|
|
|
public CompositeByteBuf setFloat(int index, float value) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
return setInt(index, Float.floatToRawIntBits(value));
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
2012-10-18 08:57:23 +02:00
|
|
|
|
|
|
|
@Override
|
2013-07-08 06:32:33 +02:00
|
|
|
public CompositeByteBuf setDouble(int index, double value) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
return setLong(index, Double.doubleToRawLongBits(value));
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
2012-10-18 08:57:23 +02:00
|
|
|
|
|
|
|
@Override
|
2013-07-08 06:32:33 +02:00
|
|
|
public CompositeByteBuf setBytes(int index, ByteBuf src) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
super.setBytes(index, src, src.readableBytes());
|
|
|
|
return this;
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
2012-10-18 08:57:23 +02:00
|
|
|
|
|
|
|
@Override
|
2013-07-08 06:32:33 +02:00
|
|
|
public CompositeByteBuf setBytes(int index, ByteBuf src, int length) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
super.setBytes(index, src, length);
|
|
|
|
return this;
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
2012-10-18 08:57:23 +02:00
|
|
|
|
|
|
|
@Override
|
2013-07-08 06:32:33 +02:00
|
|
|
public CompositeByteBuf setBytes(int index, byte[] src) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
return setBytes(index, src, 0, src.length);
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
2012-10-18 08:57:23 +02:00
|
|
|
|
|
|
|
@Override
|
2013-07-08 06:32:33 +02:00
|
|
|
public CompositeByteBuf setZero(int index, int length) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
super.setZero(index, length);
|
|
|
|
return this;
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
2012-10-18 08:57:23 +02:00
|
|
|
|
|
|
|
@Override
|
2013-07-08 06:32:33 +02:00
|
|
|
public CompositeByteBuf readBytes(ByteBuf dst) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
super.readBytes(dst, dst.writableBytes());
|
|
|
|
return this;
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
2012-10-18 08:57:23 +02:00
|
|
|
|
|
|
|
@Override
|
2013-07-08 06:32:33 +02:00
|
|
|
public CompositeByteBuf readBytes(ByteBuf dst, int length) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
super.readBytes(dst, length);
|
|
|
|
return this;
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
2012-10-18 08:57:23 +02:00
|
|
|
|
|
|
|
@Override
|
2013-07-08 06:32:33 +02:00
|
|
|
public CompositeByteBuf readBytes(ByteBuf dst, int dstIndex, int length) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
super.readBytes(dst, dstIndex, length);
|
|
|
|
return this;
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
2012-10-18 08:57:23 +02:00
|
|
|
|
|
|
|
@Override
|
2013-07-08 06:32:33 +02:00
|
|
|
public CompositeByteBuf readBytes(byte[] dst) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
super.readBytes(dst, 0, dst.length);
|
|
|
|
return this;
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
2012-10-18 08:57:23 +02:00
|
|
|
|
|
|
|
@Override
|
2013-07-08 06:32:33 +02:00
|
|
|
public CompositeByteBuf readBytes(byte[] dst, int dstIndex, int length) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
super.readBytes(dst, dstIndex, length);
|
|
|
|
return this;
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
2012-10-18 08:57:23 +02:00
|
|
|
|
|
|
|
@Override
|
2013-07-08 06:32:33 +02:00
|
|
|
public CompositeByteBuf readBytes(ByteBuffer dst) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
super.readBytes(dst);
|
|
|
|
return this;
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
2012-10-18 08:57:23 +02:00
|
|
|
|
|
|
|
@Override
|
2013-07-08 06:32:33 +02:00
|
|
|
public CompositeByteBuf readBytes(OutputStream out, int length) throws IOException {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
super.readBytes(out, length);
|
|
|
|
return this;
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
2012-10-18 08:57:23 +02:00
|
|
|
|
|
|
|
@Override
|
2013-07-08 06:32:33 +02:00
|
|
|
public CompositeByteBuf skipBytes(int length) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
super.skipBytes(length);
|
|
|
|
return this;
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
2012-10-18 08:57:23 +02:00
|
|
|
|
|
|
|
@Override
|
2013-07-08 06:32:33 +02:00
|
|
|
public CompositeByteBuf writeBoolean(boolean value) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
writeByte(value ? 1 : 0);
|
|
|
|
return this;
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
2012-10-18 08:57:23 +02:00
|
|
|
|
|
|
|
@Override
|
2013-07-08 06:32:33 +02:00
|
|
|
public CompositeByteBuf writeByte(int value) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
ensureWritable0(1);
|
|
|
|
_setByte(writerIndex++, value);
|
|
|
|
return this;
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
2012-10-18 08:57:23 +02:00
|
|
|
|
|
|
|
@Override
|
2013-07-08 06:32:33 +02:00
|
|
|
public CompositeByteBuf writeShort(int value) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
super.writeShort(value);
|
|
|
|
return this;
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
2012-10-18 08:57:23 +02:00
|
|
|
|
|
|
|
@Override
|
2013-07-08 06:32:33 +02:00
|
|
|
public CompositeByteBuf writeMedium(int value) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
super.writeMedium(value);
|
|
|
|
return this;
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
2012-10-18 08:57:23 +02:00
|
|
|
|
|
|
|
@Override
|
2013-07-08 06:32:33 +02:00
|
|
|
public CompositeByteBuf writeInt(int value) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
super.writeInt(value);
|
|
|
|
return this;
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
2012-10-18 08:57:23 +02:00
|
|
|
|
|
|
|
@Override
|
2013-07-08 06:32:33 +02:00
|
|
|
public CompositeByteBuf writeLong(long value) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
super.writeLong(value);
|
|
|
|
return this;
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
2012-10-18 08:57:23 +02:00
|
|
|
|
|
|
|
@Override
|
2013-07-08 06:32:33 +02:00
|
|
|
public CompositeByteBuf writeChar(int value) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
super.writeShort(value);
|
|
|
|
return this;
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
2012-10-18 08:57:23 +02:00
|
|
|
|
|
|
|
@Override
|
2013-07-08 06:32:33 +02:00
|
|
|
public CompositeByteBuf writeFloat(float value) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
super.writeInt(Float.floatToRawIntBits(value));
|
|
|
|
return this;
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
2012-10-18 08:57:23 +02:00
|
|
|
|
|
|
|
@Override
|
2013-07-08 06:32:33 +02:00
|
|
|
public CompositeByteBuf writeDouble(double value) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
super.writeLong(Double.doubleToRawLongBits(value));
|
|
|
|
return this;
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
2012-10-18 08:57:23 +02:00
|
|
|
|
|
|
|
@Override
|
2013-07-08 06:32:33 +02:00
|
|
|
public CompositeByteBuf writeBytes(ByteBuf src) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
super.writeBytes(src, src.readableBytes());
|
|
|
|
return this;
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
2012-10-18 08:57:23 +02:00
|
|
|
|
|
|
|
@Override
|
2013-07-08 06:32:33 +02:00
|
|
|
public CompositeByteBuf writeBytes(ByteBuf src, int length) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
super.writeBytes(src, length);
|
|
|
|
return this;
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
2012-10-18 08:57:23 +02:00
|
|
|
|
|
|
|
@Override
|
2013-07-08 06:32:33 +02:00
|
|
|
public CompositeByteBuf writeBytes(ByteBuf src, int srcIndex, int length) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
super.writeBytes(src, srcIndex, length);
|
|
|
|
return this;
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
2012-10-18 08:57:23 +02:00
|
|
|
|
|
|
|
@Override
|
2013-07-08 06:32:33 +02:00
|
|
|
public CompositeByteBuf writeBytes(byte[] src) {
|
2019-02-28 20:40:42 +01:00
|
|
|
super.writeBytes(src, 0, src.length);
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
return this;
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
2012-10-18 08:57:23 +02:00
|
|
|
|
|
|
|
@Override
|
2013-07-08 06:32:33 +02:00
|
|
|
public CompositeByteBuf writeBytes(byte[] src, int srcIndex, int length) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
super.writeBytes(src, srcIndex, length);
|
|
|
|
return this;
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
2012-10-18 08:57:23 +02:00
|
|
|
|
|
|
|
@Override
|
2013-07-08 06:32:33 +02:00
|
|
|
public CompositeByteBuf writeBytes(ByteBuffer src) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
super.writeBytes(src);
|
|
|
|
return this;
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
2012-10-18 08:57:23 +02:00
|
|
|
|
|
|
|
@Override
|
2013-07-08 06:32:33 +02:00
|
|
|
public CompositeByteBuf writeZero(int length) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
super.writeZero(length);
|
|
|
|
return this;
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
2012-10-18 08:57:23 +02:00
|
|
|
|
|
|
|
@Override
|
2013-07-08 06:32:33 +02:00
|
|
|
public CompositeByteBuf retain(int increment) {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
super.retain(increment);
|
|
|
|
return this;
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
2012-10-18 08:57:23 +02:00
|
|
|
|
|
|
|
@Override
|
2013-07-08 06:32:33 +02:00
|
|
|
public CompositeByteBuf retain() {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
super.retain();
|
|
|
|
return this;
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
2012-10-18 08:57:23 +02:00
|
|
|
|
2014-01-29 03:44:59 +01:00
|
|
|
@Override
|
|
|
|
public CompositeByteBuf touch() {
|
2014-02-07 06:00:05 +01:00
|
|
|
return this;
|
2014-01-29 03:44:59 +01:00
|
|
|
}
|
|
|
|
|
|
|
|
@Override
|
|
|
|
public CompositeByteBuf touch(Object hint) {
|
2014-02-07 06:00:05 +01:00
|
|
|
return this;
|
2014-01-29 03:44:59 +01:00
|
|
|
}
|
|
|
|
|
2012-10-18 08:57:23 +02:00
|
|
|
@Override
|
2013-07-08 06:32:33 +02:00
|
|
|
public ByteBuffer[] nioBuffers() {
|
|
|
|
return nioBuffers(readerIndex(), readableBytes());
|
|
|
|
}
|
2012-10-18 08:57:23 +02:00
|
|
|
|
|
|
|
@Override
|
2013-07-08 06:32:33 +02:00
|
|
|
public CompositeByteBuf discardSomeReadBytes() {
|
|
|
|
return discardReadComponents();
|
|
|
|
}
|
2012-10-18 08:57:23 +02:00
|
|
|
|
|
|
|
@Override
|
2013-07-08 06:32:33 +02:00
|
|
|
protected void deallocate() {
|
|
|
|
if (freed) {
|
|
|
|
return;
|
|
|
|
}
|
2012-10-18 08:57:23 +02:00
|
|
|
|
2013-07-08 06:32:33 +02:00
|
|
|
freed = true;
|
2014-07-08 19:31:37 +02:00
|
|
|
// We're not using foreach to avoid creating an iterator.
|
|
|
|
// see https://github.com/netty/netty/issues/2642
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
for (int i = 0, size = componentCount; i < size; i++) {
|
2019-02-01 10:31:53 +01:00
|
|
|
components[i].free();
|
2013-07-08 06:32:33 +02:00
|
|
|
}
|
|
|
|
}
|
2013-02-13 08:09:33 +01:00
|
|
|
|
2019-02-28 20:40:42 +01:00
|
|
|
@Override
|
|
|
|
boolean isAccessible() {
|
|
|
|
return !freed;
|
|
|
|
}
|
|
|
|
|
2013-02-13 08:09:33 +01:00
|
|
|
@Override
|
2013-07-08 06:32:33 +02:00
|
|
|
public ByteBuf unwrap() {
|
|
|
|
return null;
|
|
|
|
}
|
2015-04-10 09:43:06 +02:00
|
|
|
|
|
|
|
private final class CompositeByteBufIterator implements Iterator<ByteBuf> {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
private final int size = numComponents();
|
2015-04-10 09:43:06 +02:00
|
|
|
private int index;
|
|
|
|
|
|
|
|
@Override
|
|
|
|
public boolean hasNext() {
|
|
|
|
return size > index;
|
|
|
|
}
|
|
|
|
|
|
|
|
@Override
|
|
|
|
public ByteBuf next() {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
if (size != numComponents()) {
|
2015-04-10 09:43:06 +02:00
|
|
|
throw new ConcurrentModificationException();
|
|
|
|
}
|
|
|
|
if (!hasNext()) {
|
|
|
|
throw new NoSuchElementException();
|
|
|
|
}
|
|
|
|
try {
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
return components[index++].slice();
|
2015-04-10 09:43:06 +02:00
|
|
|
} catch (IndexOutOfBoundsException e) {
|
|
|
|
throw new ConcurrentModificationException();
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
@Override
|
|
|
|
public void remove() {
|
|
|
|
throw new UnsupportedOperationException("Read-Only");
|
|
|
|
}
|
|
|
|
}
|
2017-12-12 09:06:18 +01:00
|
|
|
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
// Component array manipulation - range checking omitted
|
|
|
|
|
|
|
|
private void clearComps() {
|
|
|
|
removeCompRange(0, componentCount);
|
|
|
|
}
|
|
|
|
|
|
|
|
private void removeComp(int i) {
|
|
|
|
removeCompRange(i, i + 1);
|
|
|
|
}
|
2017-12-12 09:06:18 +01:00
|
|
|
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
private void removeCompRange(int from, int to) {
|
|
|
|
if (from >= to) {
|
|
|
|
return;
|
|
|
|
}
|
|
|
|
final int size = componentCount;
|
|
|
|
assert from >= 0 && to <= size;
|
|
|
|
if (to < size) {
|
|
|
|
System.arraycopy(components, to, components, from, size - to);
|
2017-12-12 09:06:18 +01:00
|
|
|
}
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
int newSize = size - to + from;
|
|
|
|
for (int i = newSize; i < size; i++) {
|
|
|
|
components[i] = null;
|
|
|
|
}
|
|
|
|
componentCount = newSize;
|
|
|
|
}
|
2017-12-12 09:06:18 +01:00
|
|
|
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
private void addComp(int i, Component c) {
|
|
|
|
shiftComps(i, 1);
|
|
|
|
components[i] = c;
|
|
|
|
}
|
|
|
|
|
|
|
|
private void shiftComps(int i, int count) {
|
|
|
|
final int size = componentCount, newSize = size + count;
|
|
|
|
assert i >= 0 && i <= size && count > 0;
|
|
|
|
if (newSize > components.length) {
|
|
|
|
// grow the array
|
|
|
|
int newArrSize = Math.max(size + (size >> 1), newSize);
|
|
|
|
Component[] newArr;
|
|
|
|
if (i == size) {
|
|
|
|
newArr = Arrays.copyOf(components, newArrSize, Component[].class);
|
|
|
|
} else {
|
|
|
|
newArr = new Component[newArrSize];
|
|
|
|
if (i > 0) {
|
|
|
|
System.arraycopy(components, 0, newArr, 0, i);
|
|
|
|
}
|
|
|
|
if (i < size) {
|
|
|
|
System.arraycopy(components, i, newArr, i + count, size - i);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
components = newArr;
|
|
|
|
} else if (i < size) {
|
|
|
|
System.arraycopy(components, i, components, i + count, size - i);
|
2017-12-12 09:06:18 +01:00
|
|
|
}
|
Streamline CompositeByteBuf internals (#8437)
Motivation:
CompositeByteBuf is a powerful and versatile abstraction, allowing for
manipulation of large data without copying bytes. There is still a
non-negligible cost to reading/writing however relative to "singular"
ByteBufs, and this can be mostly eliminated with some rework of the
internals.
My use case is message modification/transformation while zero-copy
proxying. For example replacing a string within a large message with one
of a different length
Modifications:
- No longer slice added buffers and unwrap added slices
- Components store target buf offset relative to position in
composite buf
- Less allocations, object footprint, pointer indirection, offset
arithmetic
- Use Component[] rather than ArrayList<Component>
- Avoid pointer indirection and duplicate bounds check, more
efficient backing array growth
- Facilitates optimization when doing bulk-inserts - inserting n
ByteBufs behind m is now O(m + n) instead of O(mn)
- Avoid unnecessary casting and method call indirection via superclass
- Eliminate some duplicate range/ref checks via non-checking versions of
toComponentIndex and findComponent
- Add simple fast-path for toComponentIndex(0); add racy cache of
last-accessed Component to findComponent(int)
- Override forEachByte0(...) and forEachByteDesc0(...) methods
- Make use of RecyclableArrayList in nioBuffers(int, int) (in line with
FasterCompositeByteBuf impl)
- Modify addComponents0(boolean,int,Iterable) to use the Iterable
directly rather than copy to an array first (and possibly to an
ArrayList before that)
- Optimize addComponents0(boolean,int,ByteBuf[],int) to not perform
repeated array insertions and avoid second loop for offset updates
- Simplify other logic in various places, in particular the general
pattern used where a sub-range is iterated over
- Add benchmarks to demonstrate some improvements
While refactoring I also came across a couple of clear bugs. They are
fixed in these changes but I will open another PR with unit tests and
fixes to the current version.
Result:
Much faster creation, manipulation, and access; many fewer allocations
and smaller footprint. Benchmark results to follow.
2018-11-03 10:37:07 +01:00
|
|
|
componentCount = newSize;
|
2017-12-12 09:06:18 +01:00
|
|
|
}
|
2008-08-08 02:37:18 +02:00
|
|
|
}
|