Commit Graph

4 Commits

Author SHA1 Message Date
HouBingjian
a378a4c2ac arm64 crc prefetch optimise (#5773)
Summary:
prefetch data for following block,avoid cache miss when doing crc caculate

I do performance test at kunpeng-920 server(arm-v8, 64core@2.6GHz)
./db_bench --benchmarks=crc32c --block_size=500000000
before optimise : 587313.500 micros/op 1 ops/sec;  811.9 MB/s (500000000 per op)
after optimise  : 289248.500 micros/op 3 ops/sec; 1648.5 MB/s (500000000 per op)
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5773

Differential Revision: D17347339

fbshipit-source-id: bfcd74f0f0eb4b322b959be68019ddcaae1e3341
2019-09-12 16:59:44 -07:00
Yuqi Gu
a3c1832e86 Arm64 CRC32 parallel computation optimization for RocksDB (#5494)
Summary:
Crc32c Parallel computation optimization:
Algorithm comes from Intel whitepaper: [crc-iscsi-polynomial-crc32-instruction-paper](https://www.intel.com/content/dam/www/public/us/en/documents/white-papers/crc-iscsi-polynomial-crc32-instruction-paper.pdf)
 Input data is divided into three equal-sized blocks
Three parallel blocks (crc0, crc1, crc2) for 1024 Bytes
One Block: 42(BLK_LENGTH) * 8(step length: crc32c_u64) bytes

1. crc32c_test:
```
[==========] Running 4 tests from 1 test case.
[----------] Global test environment set-up.
[----------] 4 tests from CRC
[ RUN      ] CRC.StandardResults
[       OK ] CRC.StandardResults (1 ms)
[ RUN      ] CRC.Values
[       OK ] CRC.Values (0 ms)
[ RUN      ] CRC.Extend
[       OK ] CRC.Extend (0 ms)
[ RUN      ] CRC.Mask
[       OK ] CRC.Mask (0 ms)
[----------] 4 tests from CRC (1 ms total)

[----------] Global test environment tear-down
[==========] 4 tests from 1 test case ran. (1 ms total)
[  PASSED  ] 4 tests.
```

2. RocksDB benchmark: db_bench --benchmarks="crc32c"

```
Linear Arm crc32c:
  crc32c: 1.005 micros/op 995133 ops/sec; 3887.2 MB/s (4096 per op)
```

```
Parallel optimization with Armv8 crypto extension:
  crc32c: 0.419 micros/op 2385078 ops/sec; 9316.7 MB/s (4096 per op)
```

It gets ~2.4x speedup compared to linear Arm crc32c instructions.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5494

Differential Revision: D16340806

fbshipit-source-id: 95dae9a5b646fd20a8303671d82f17b2e162e945
2019-07-17 11:22:38 -07:00
Zhongyi Xie
d68f9f4580 simplify include directive involving inttypes (#5402)
Summary:
When using `PRIu64` type of printf specifier, current code base does the following:
```
#ifndef __STDC_FORMAT_MACROS
#define __STDC_FORMAT_MACROS
#endif
#include <inttypes.h>
```
However, this can be simplified to
```
#include <cinttypes>
```
as long as flag `-std=c++11` is used.
This should solve issues like https://github.com/facebook/rocksdb/issues/5159
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5402

Differential Revision: D15701195

Pulled By: miasantreble

fbshipit-source-id: 6dac0a05f52aadb55e9728038599d3d2e4b59d03
2019-06-06 13:56:07 -07:00
Yuqi Gu
03c7ae24c2 RocksDB CRC32c optimization with ARMv8 Intrinsic (#5221)
Summary:
1. Add Arm linear crc32c implemtation for RocksDB.
2. Arm runtime check for crc32
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5221

Differential Revision: D15013685

Pulled By: siying

fbshipit-source-id: 2c2983743d26656d93f212dc7c1a3cf66a1acf12
2019-04-30 10:59:05 -07:00