Motivation:
DecodeHexBenchmark needs to be less branch-predictor friendly
to mimic the "real" behaviour while decoding
Modifications:
DecodeHexBenchmark uses a larger set of inputs, picking one at
random on each iteration, and the benchmarked method is made non-inlineable (a sketch follows).
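A minimal JMH sketch (not the actual DecodeHexBenchmark) of the two ideas: rotate over several inputs so the branch predictor cannot lock onto a single one, and forbid inlining of the measured method. Class, input and method names are illustrative.
```
import java.util.concurrent.ThreadLocalRandom;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.CompilerControl;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;

@State(Scope.Benchmark)
public class DecodeHexSketchBenchmark {
    // a handful of different inputs; the real benchmark uses a larger set
    private final String[] inputs = { "4f2bce3f", "00000000", "deadbeef", "cafebabe" };

    @Benchmark
    public long decodeRandomInput() {
        // pick a different input per invocation so branches stay unpredictable
        String hex = inputs[ThreadLocalRandom.current().nextInt(inputs.length)];
        return decode(hex);
    }

    @CompilerControl(CompilerControl.Mode.DONT_INLINE)
    private static long decode(CharSequence hex) {
        long v = 0;
        for (int i = 0; i < hex.length(); i++) {
            v = (v << 4) | Character.digit(hex.charAt(i), 16);
        }
        return v;
    }
}
```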
Result:
DecodeHexBenchmark is more trustworthy when showing the performance
difference between different decoding methods
Motivation:
The HPACK static table is organized so that fields with the same
name are adjacent, which means a sequential scan can short-circuit
on the first name mismatch.
Modifications:
* `HpackStaticTable.getIndexInsensitive` returns -1 on name mismatch
rather than continuing to scan.
* `HpackStaticTable` statically defines the maximum position in the array
where name duplication is possible (past that index there is no need to
check for other fields with the same name); the sketch after this list illustrates both ideas.
* Benchmark for different lookup patterns
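A minimal, self-contained sketch of the short-circuiting lookup. The tiny table, the constant and the method body below are illustrative only, not Netty's real static table data or exact code (indices are 0-based here, unlike HPACK's 1-based indexing).
```
final class StaticTableLookupSketch {
    // Tiny illustrative table: entries with the same name are adjacent.
    private static final String[][] TABLE = {
        { ":method", "GET" }, { ":method", "POST" },
        { ":path", "/" }, { ":path", "/index.html" },
        { "accept-encoding", "gzip, deflate" }
    };
    // Past this index no name occurs more than once.
    private static final int MAX_SAME_NAME_INDEX = 3;

    static int getIndexInsensitive(CharSequence name, CharSequence value) {
        for (int i = 0; i < TABLE.length; i++) {
            if (!TABLE[i][0].contentEquals(name)) {
                continue; // not yet at the first entry with this name
            }
            // Same-name entries are adjacent: scan them and stop at the first name mismatch.
            for (int j = i; j < TABLE.length && TABLE[j][0].contentEquals(name); j++) {
                if (TABLE[j][1].contentEquals(value)) {
                    return j;
                }
                if (j >= MAX_SAME_NAME_INDEX) {
                    break; // no further duplicates of this name can exist
                }
            }
            return -1; // name matched but value did not: no point scanning further
        }
        return -1;
    }
}
```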
Result:
Better HPACK static table lookup performance.
Co-authored-by: Norman Maurer <norman_maurer@apple.com>
Motivation:
https://github.com/netty/netty/pull/10267 introduced a change that reduced fragmentation. Unfortunately it also introduced a regression in the caching of normal allocations. This can have a negative performance impact depending on the allocation sizes.
Modifications:
- Fix algorithm to calculate the array size for normal allocation caches
- Correctly calculate the index for normal caches
- Add unit test
Result:
Fixes https://github.com/netty/netty/issues/10805
Fix issue #10508 where PARANOID mode slows decoding down by about 1000x compared to ADVANCED.
Also fix a rare issue where, when the internal buffer grew over a limit, it was partially discarded
using `discardReadBytes()`, which corrupted previously discovered HttpData.
Reasons were:
Too many `readByte()` calls while better alternatives exist (such as remembering the last scan position when trying to find a delimiter, or using `bytesBefore(firstByte)` instead of looping externally).
Changes done:
- major change in the way buffers are parsed: instead of reading byte per byte until the delimiter is found, try to find the delimiter using `bytesBefore()` and keep the last not-found position to skip already parsed parts (the algorithms are the same but the scan implementation is different); a sketch of the idea follows this list
- Change the condition to discard read bytes when refCnt is at most 1.
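A hedged sketch of the scan idea, not the decoder's actual code: jump with `bytesBefore()` to the next candidate first byte of the delimiter instead of calling `readByte()` in a loop, and let the caller remember how far was already scanned. Helper names and the exact bookkeeping are illustrative.
```
import io.netty.buffer.ByteBuf;

final class DelimiterScanSketch {
    /**
     * Returns the absolute index of the delimiter, or -1 if not found.
     * lastScannedOffset lets the caller skip parts that were already scanned.
     */
    static int findDelimiter(ByteBuf buf, byte[] delimiter, int lastScannedOffset) {
        int from = buf.readerIndex() + lastScannedOffset;
        while (from < buf.writerIndex()) {
            // bytesBefore() jumps straight to the next occurrence of the first delimiter byte
            int rel = buf.bytesBefore(from, buf.writerIndex() - from, delimiter[0]);
            if (rel < 0) {
                return -1; // caller remembers how far was scanned for the next attempt
            }
            int pos = from + rel;
            if (buf.writerIndex() - pos >= delimiter.length && matches(buf, pos, delimiter)) {
                return pos;
            }
            from = pos + 1; // first byte matched but not the full delimiter: keep going
        }
        return -1;
    }

    private static boolean matches(ByteBuf buf, int pos, byte[] delimiter) {
        for (int i = 0; i < delimiter.length; i++) {
            if (buf.getByte(pos + i) != delimiter[i]) {
                return false;
            }
        }
        return true;
    }
}
```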
Observations using Async-Profiler:
==================================
1) Without optimizations, most of the time (more than 95%) is spent in the `readByte()` method within the `loadDataMultipartStandard` method.
2) Using `bytesBefore(byte)` instead of `readByte()` to find the various delimiters, the `loadDataMultipartStandard` method drops to 19-33% depending on the test, and the `readByte()` method (or the equivalent `getByte(pos)`) drops to 15% (from 95%).
Timings confirm the profiling:
- With the optimizations, SIMPLE mode is about 82% better, ADVANCED mode about 79% better and PARANOID mode about 99% better (most of the duplicate read accesses are removed or performed internally through the `bytesBefore(byte)` method).
A benchmark is added to show the behaviour of the various cases (one big item, such as a file upload, and many items) at the various levels of detection (Disabled, Simple, Advanced, Paranoid). This benchmark is intended to raise an alert if a new implementation introduces too big a difference (such as the previous version, where PARANOID was about 1000 times slower than the other levels, while it is now at most about 10 times slower).
Extract of Benchmark run:
=========================
Run complete. Total time: 00:13:27
```
Benchmark                                                                           Mode  Cnt  Score   Error  Units
HttpPostMultipartRequestDecoderBenchmark.multipartRequestDecoderBigAdvancedLevel   thrpt    6  2,248 ± 0,198  ops/ms
HttpPostMultipartRequestDecoderBenchmark.multipartRequestDecoderBigDisabledLevel   thrpt    6  2,067 ± 1,219  ops/ms
HttpPostMultipartRequestDecoderBenchmark.multipartRequestDecoderBigParanoidLevel   thrpt    6  1,109 ± 0,038  ops/ms
HttpPostMultipartRequestDecoderBenchmark.multipartRequestDecoderBigSimpleLevel     thrpt    6  2,326 ± 0,314  ops/ms
HttpPostMultipartRequestDecoderBenchmark.multipartRequestDecoderHighAdvancedLevel  thrpt    6  1,444 ± 0,226  ops/ms
HttpPostMultipartRequestDecoderBenchmark.multipartRequestDecoderHighDisabledLevel  thrpt    6  1,462 ± 0,642  ops/ms
HttpPostMultipartRequestDecoderBenchmark.multipartRequestDecoderHighParanoidLevel  thrpt    6  0,159 ± 0,003  ops/ms
HttpPostMultipartRequestDecoderBenchmark.multipartRequestDecoderHighSimpleLevel    thrpt    6  1,522 ± 0,049  ops/ms
```
Motivation:
https in xmlns URIs does not work and makes the maven release plugin fail:
```
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 1.779 s
[INFO] Finished at: 2020-11-10T07:45:21Z
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-release-plugin:2.5.3:prepare (default-cli) on project netty-parent: Execution default-cli of goal org.apache.maven.plugins:maven-release-plugin:2.5.3:prepare failed: The namespace xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" could not be added as a namespace to "project": The namespace prefix "xsi" collides with an additional namespace declared by the element -> [Help 1]
[ERROR]
```
See also https://issues.apache.org/jira/browse/HBASE-24014.
Modifications:
Use http for xmlns
Result:
Be able to use the maven release plugin
Motivation:
JUnit 5 is the new hotness. It's more expressive, extensible, and composable in many ways, and it's better able to run tests in parallel. But most importantly, it's able to directly run JUnit 4 tests.
This means we can update and start using JUnit 5 without touching any of our existing tests.
I'm also introducing a dependency on assertj-core, which is like hamcrest, but arguably has a nicer and more discoverable API.
Modification:
Add the JUnit 5 and assertj-core dependencies, without converting any tests at this time.
Result:
All our tests are now executed through the JUnit 5 Vintage Engine.
Also, the JUnit 5 test APIs are available, and any JUnit 5 tests that are added from now on will also be executed.
Motivation:
HTTP is a plaintext protocol, which means that someone may be able
to eavesdrop on the data. To prevent this, HTTPS should be used whenever
possible. However, consistently using https:// in all URLs may be
difficult. The nohttp tool can help here: it scans all the files
in a repository and reports where http:// is used.
Modifications:
- Added nohttp (via checkstyle) into the build process.
- Suppressed findings for the websites
that don't support HTTPS or that are not reachable
Result:
- Prevent using HTTP in the future.
- Encourage users to use HTTPS when they follow the links they found in
the code.
Motivation:
DefaultAttributeMap::attr has blocking behaviour on lookup of an existing attribute;
it can be made non-blocking.
Modification:
Replace the existing fixed bucket table, which uses a locked intrusive linked list,
with a hand-rolled copy-on-write ordered single array (sketched below)
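A hedged sketch of the copy-on-write idea, simplified to plain int keys; field names, the binary-search key and the update path are illustrative, not DefaultAttributeMap's actual code. Lookups read a volatile array snapshot without locks, while insertions copy the array and CAS it in.
```
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicReferenceFieldUpdater;

final class CowAttributeArraySketch {
    private static final AtomicReferenceFieldUpdater<CowAttributeArraySketch, int[]> UPDATER =
            AtomicReferenceFieldUpdater.newUpdater(CowAttributeArraySketch.class, int[].class, "keys");
    private volatile int[] keys = new int[0];

    boolean contains(int key) {
        // happy path: no locking, just a volatile read plus binary search on the ordered array
        return Arrays.binarySearch(keys, key) >= 0;
    }

    void add(int key) {
        for (;;) {
            int[] current = keys;
            if (Arrays.binarySearch(current, key) >= 0) {
                return; // already present
            }
            int[] next = Arrays.copyOf(current, current.length + 1);
            next[current.length] = key;
            Arrays.sort(next); // keep the array ordered for the lookup path
            if (UPDATER.compareAndSet(this, current, next)) {
                return;
            }
            // lost a race with another writer: retry against the newer array
        }
    }
}
```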
Result:
Non-blocking behaviour for the lookup happy path
Motivation:
writeUtf8 can suffer from inlining issues and/or megamorphic call sites on the hot path due to the ByteBuf hierarchy
Modifications:
Duplicate and specialize the code paths to reduce the need for polymorphic calls (illustrated below)
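A hedged illustration of the general technique, not ByteBufUtil's actual writeUtf8 code (ASCII is used here for brevity): check for the common concrete hierarchy first so the hot loop runs a duplicated, specialized path, while the generic path is kept for other ByteBuf implementations.
```
import io.netty.buffer.AbstractByteBuf;
import io.netty.buffer.ByteBuf;

final class WriteSpecializationSketch {
    static int writeAsciiSpecialized(ByteBuf buf, CharSequence seq) {
        if (buf instanceof AbstractByteBuf) {
            // specialized copy of the loop for the common AbstractByteBuf hierarchy
            AbstractByteBuf a = (AbstractByteBuf) buf;
            a.ensureWritable(seq.length());
            for (int i = 0; i < seq.length(); i++) {
                a.writeByte((byte) seq.charAt(i));
            }
        } else {
            // generic fallback keeps behaviour identical for other ByteBuf types
            for (int i = 0; i < seq.length(); i++) {
                buf.writeByte((byte) seq.charAt(i));
            }
        }
        return seq.length();
    }
}
```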
Result:
Performance is more stable in user code
Reduce garbage on MQTT encoding
Motivation:
MQTT encoding and decoding is doing unnecessary object allocation in a number of places:
- MqttEncoder creates many byte[] instances to encode Strings into UTF-8 bytes
- MqttProperties uses Integer keys instead of int
- Some enums' valueOf methods create unnecessary arrays on the hot paths
- MqttDecoder was using an unnecessary Result<T>
Modification:
- ByteBufUtil::utf8Bytes and ByteBufUtil::reserveAndWriteUtf8 allow performing the same operation GC-free
- MqttProperties uses a primitive-key map
- Implemented GC-free constant-table lookup/switch valueOf
- Use bit tricks to pack 2 ints into a single primitive long to store both the result and numberOfBytesConsumed, and use byte[].length to compute numberOfBytesConsumed on the fly. These changes avoid creating Result<T> (the packing is sketched below).
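A hedged sketch of the bit trick named above; class and method names are illustrative. The decoded value and the number of consumed bytes are packed into one long so no Result<T> object is needed, then unpacked with a shift and a cast.
```
final class PackedResultSketch {
    static long pack(int value, int numberOfBytesConsumed) {
        // high 32 bits: value, low 32 bits: numberOfBytesConsumed
        return ((long) value << 32) | (numberOfBytesConsumed & 0xFFFFFFFFL);
    }

    static int value(long packed) {
        return (int) (packed >>> 32);
    }

    static int numberOfBytesConsumed(long packed) {
        return (int) packed;
    }

    public static void main(String[] args) {
        long packed = pack(1234, 2);
        assert value(packed) == 1234 && numberOfBytesConsumed(packed) == 2;
    }
}
```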
Result:
Significantly less garbage produced in MQTT encoding/decoding
Motivation:
We have found that ByteBufUtil.indexOf can be inefficient for substring search on
ByteBuf, both in terms of algorithm complexity (worst case O(needle.readableBytes *
haystack.readableBytes)) and in constant factor (especially on composite buffers).
With the implementation of more performant search algorithms we have seen improvements of
an order of magnitude.
Modifications:
This change introduces three search algorithms:
1. Knuth-Morris-Pratt - the classical textbook algorithm, a good default choice.
2. Bit-mask based algorithm - stable performance on any input, but limited to a maximum
search substring (needle) length of 64 bytes.
3. Aho–Corasick - worse performance and higher memory consumption than [1] and [2], but
it supports searching for multiple substrings (needles) simultaneously, inspecting every
byte of the haystack only once.
Each algorithm processes every byte of the underlying buffer only once; they are implemented
as ByteProcessor implementations (a self-contained KMP example is sketched below).
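A hedged, self-contained sketch of the ByteProcessor-based approach using Knuth-Morris-Pratt. This is not the API added by the change, only an illustration of how a search can process each haystack byte exactly once; it assumes a non-empty needle.
```
import io.netty.buffer.ByteBuf;
import io.netty.util.ByteProcessor;

final class KmpSearchProcessor implements ByteProcessor {
    private final byte[] needle;
    private final int[] lps;   // longest proper prefix that is also a suffix
    private int matched;

    KmpSearchProcessor(byte[] needle) {
        this.needle = needle;
        this.lps = new int[needle.length];
        for (int i = 1, len = 0; i < needle.length; ) {
            if (needle[i] == needle[len]) {
                lps[i++] = ++len;
            } else if (len != 0) {
                len = lps[len - 1];
            } else {
                lps[i++] = 0;
            }
        }
    }

    @Override
    public boolean process(byte value) {
        while (matched > 0 && value != needle[matched]) {
            matched = lps[matched - 1];
        }
        if (value == needle[matched]) {
            matched++;
        }
        return matched != needle.length; // returning false stops the iteration
    }

    /** Index of the last byte of the first occurrence of the needle, or -1. */
    static int indexOfEnd(ByteBuf haystack, byte[] needle) {
        return haystack.forEachByte(new KmpSearchProcessor(needle));
    }
}
```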
Result:
Efficient search algorithms with linear time complexity available in Netty (I will share
benchmark results in a comment on the PR).
Motivation:
The PoolChunk.usage() method involves non-trivial computations. It is currently used in hot-path methods invoked whenever an allocation or de-allocation happens.
The idea is to replace comparisons of the usage() output against percentage thresholds with plain comparisons of PoolChunk.freeBytes against absolute thresholds. This way the majority of the computation behind the threshold conditions moves to the init logic.
Modifications:
Replace the PoolChunk.usage() conditions in PoolChunkList with equivalent conditions on PoolChunk.freeBytes() (sketched below)
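A hedged sketch of the equivalence, with illustrative names and without the rounding details of the actual PoolChunkList fields: instead of recomputing a usage percentage on every (de)allocation, precompute absolute freeBytes bounds once at construction time.
```
final class ChunkListThresholdSketch {
    private final int freeMinThreshold; // chunk moves to the next list once freeBytes <= this
    private final int freeMaxThreshold; // chunk moves to the previous list once freeBytes > this

    ChunkListThresholdSketch(int chunkSize, int minUsagePercent, int maxUsagePercent) {
        // usage() >= maxUsage  <=>  freeBytes <= chunkSize * (100 - maxUsage) / 100
        this.freeMinThreshold = (int) (chunkSize * (100.0 - maxUsagePercent) / 100.0);
        // usage() < minUsage   <=>  freeBytes >  chunkSize * (100 - minUsage) / 100
        this.freeMaxThreshold = (int) (chunkSize * (100.0 - minUsagePercent) / 100.0);
    }

    boolean shouldMoveToNextList(int freeBytes) {
        return freeBytes <= freeMinThreshold; // previously: chunk.usage() >= maxUsage
    }

    boolean shouldMoveToPreviousList(int freeBytes) {
        return freeBytes > freeMaxThreshold;  // previously: chunk.usage() < minUsage
    }
}
```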
Result:
Improve performance of allocation and de-allocation of ByteBuf from normal size cache pool
Motivation:
decodeHexNibble can be a lot faster using a lookup table
Modifications:
decodeHexNibble is made faster by using a lookup table (sketched below)
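A hedged sketch of the lookup-table idea; the actual table in Netty is built slightly differently. The character indexes directly into the table, and anything that is not a hex digit maps to -1.
```
final class HexNibbleSketch {
    private static final byte[] HEX2B = new byte[128];
    static {
        java.util.Arrays.fill(HEX2B, (byte) -1);
        for (char c = '0'; c <= '9'; c++) { HEX2B[c] = (byte) (c - '0'); }
        for (char c = 'a'; c <= 'f'; c++) { HEX2B[c] = (byte) (c - 'a' + 10); }
        for (char c = 'A'; c <= 'F'; c++) { HEX2B[c] = (byte) (c - 'A' + 10); }
    }

    static int decodeHexNibble(char c) {
        // one bounds check plus one array load replaces the branchy range checks
        return c < HEX2B.length ? HEX2B[c] : -1;
    }
}
```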
Result:
decodeHexNibble is faster
Motivation:
Currently, characters are appended to the encoded string char by char even when no encoding is needed. We can instead separate out a code path that appends the entire string in one go, for better `StringBuilder` allocation performance.
Modification:
Only go into the char-by-char loop when a character that requires encoding is found (sketched below).
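A hedged sketch of the fast path, not QueryStringEncoder's actual code; the `needsEncoding` rules and the slow path below are simplified. Scan for the first character that needs encoding; if none is found, append the whole string with a single bulk call instead of char by char.
```
import java.nio.charset.StandardCharsets;

final class QueryEncodeSketch {
    static void appendEncoded(StringBuilder out, String s) {
        for (int i = 0; i < s.length(); i++) {
            if (needsEncoding(s.charAt(i))) {
                out.append(s, 0, i);        // bulk-append the already-safe prefix
                encodeSlowPath(out, s, i);  // char-by-char only from here on
                return;
            }
        }
        out.append(s); // fast path: nothing to encode, one bulk append
    }

    private static boolean needsEncoding(char c) {
        // simplified "unreserved" check; the real encoder's rules are broader
        return !(c >= 'a' && c <= 'z' || c >= 'A' && c <= 'Z' || c >= '0' && c <= '9'
                || c == '-' || c == '_' || c == '.' || c == '~');
    }

    private static void encodeSlowPath(StringBuilder out, String s, int from) {
        for (int i = from; i < s.length(); i++) {
            char c = s.charAt(i);
            if (!needsEncoding(c)) {
                out.append(c);
                continue;
            }
            // percent-encode the UTF-8 bytes of the character (surrogate pairs ignored here)
            for (byte b : String.valueOf(c).getBytes(StandardCharsets.UTF_8)) {
                out.append('%')
                   .append(Character.forDigit((b >> 4) & 0xF, 16))
                   .append(Character.forDigit(b & 0xF, 16));
            }
        }
    }
}
```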
Result:
The results are somewhat noisy on my hot laptop, but the biggest impact is on long strings, both by reducing resizes of the buffer and by reducing the complexity of the loop. I don't think there's a significant downside for the cases that hit the slow path.
After
```
Benchmark Mode Cnt Score Error Units
QueryStringEncoderBenchmark.longAscii thrpt 6 1.406 ± 0.069 ops/us
QueryStringEncoderBenchmark.longAsciiFirst thrpt 6 0.046 ± 0.001 ops/us
QueryStringEncoderBenchmark.longUtf8 thrpt 6 0.046 ± 0.001 ops/us
QueryStringEncoderBenchmark.shortAscii thrpt 6 15.781 ± 0.949 ops/us
QueryStringEncoderBenchmark.shortAsciiFirst thrpt 6 3.171 ± 0.232 ops/us
QueryStringEncoderBenchmark.shortUtf8 thrpt 6 3.900 ± 0.667 ops/us
```
Before
```
Benchmark Mode Cnt Score Error Units
QueryStringEncoderBenchmark.longAscii thrpt 6 0.444 ± 0.072 ops/us
QueryStringEncoderBenchmark.longAsciiFirst thrpt 6 0.043 ± 0.002 ops/us
QueryStringEncoderBenchmark.longUtf8 thrpt 6 0.047 ± 0.001 ops/us
QueryStringEncoderBenchmark.shortAscii thrpt 6 16.503 ± 1.015 ops/us
QueryStringEncoderBenchmark.shortAsciiFirst thrpt 6 3.316 ± 0.154 ops/us
QueryStringEncoderBenchmark.shortUtf8 thrpt 6 3.776 ± 0.956 ops/us
```