f54d7f5fea
Summary: **# Summary** RocksDB uses SSE crc32 intrinsics to calculate the crc32 values but it does it in single way fashion (not pipelined on single CPU core). Intel's whitepaper () published an algorithm that uses 3-way pipelining for the crc32 intrinsics, then use pclmulqdq intrinsic to combine the values. Because pclmulqdq has overhead on its own, this algorithm will show perf gains on buffers larger than 216 bytes, which makes RocksDB a perfect user, since most of the buffers RocksDB call crc32c on is over 4KB. Initial db_bench show tremendous CPU gain. This change uses the 3-way SSE algorithm by default. The old SSE algorithm is now behind a compiler tag NO_THREEWAY_CRC32C. If user compiles the code with NO_THREEWAY_CRC32C=1 then the old SSE Crc32c algorithm would be used. If the server does not have SSE4.2 at the run time the slow way (Non SSE) will be used. **# Performance Test Results** We ran the FillRandom and ReadRandom benchmarks in db_bench. ReadRandom is the point of interest here since it calculates the CRC32 for the in-mem buffers. We did 3 runs for each algorithm. Before this change the CRC32 value computation takes about 11.5% of total CPU cost, and with the new 3-way algorithm it reduced to around 4.5%. The overall throughput also improved from 25.53MB/s to 27.63MB/s. 1) ReadRandom in db_bench overall metrics PER RUN Algorithm | run | micros/op | ops/sec |Throughput (MB/s) 3-way | 1 | 4.143 | 241387 | 26.7 3-way | 2 | 3.775 | 264872 | 29.3 3-way | 3 | 4.116 | 242929 | 26.9 FastCrc32c|1 | 4.037 | 247727 | 27.4 FastCrc32c|2 | 4.648 | 215166 | 23.8 FastCrc32c|3 | 4.352 | 229799 | 25.4 AVG Algorithm | Average of micros/op | Average of ops/sec | Average of Throughput (MB/s) 3-way | 4.01 | 249,729 | 27.63 FastCrc32c | 4.35 | 230,897 | 25.53 2) Crc32c computation CPU cost (inclusive samples percentage) PER RUN Implementation | run | TotalSamples | Crc32c percentage 3-way | 1 | 4,572,250,000 | 4.37% 3-way | 2 | 3,779,250,000 | 4.62% 3-way | 3 | 4,129,500,000 | 4.48% FastCrc32c | 1 | 4,663,500,000 | 11.24% FastCrc32c | 2 | 4,047,500,000 | 12.34% FastCrc32c | 3 | 4,366,750,000 | 11.68% **# Test Plan** make -j64 corruption_test && ./corruption_test By default it uses 3-way SSE algorithm NO_THREEWAY_CRC32C=1 make -j64 corruption_test && ./corruption_test make clean && DEBUG_LEVEL=0 make -j64 db_bench make clean && DEBUG_LEVEL=0 NO_THREEWAY_CRC32C=1 make -j64 db_bench Closes https://github.com/facebook/rocksdb/pull/3173 Differential Revision: D6330882 Pulled By: yingsu00 fbshipit-source-id: 8ec3d89719533b63b536a736663ca6f0dd4482e9 |
||
---|---|---|
buckifier | ||
build_tools | ||
cache | ||
cmake | ||
coverage | ||
db | ||
docs | ||
env | ||
examples | ||
hdfs | ||
include/rocksdb | ||
java | ||
memtable | ||
monitoring | ||
options | ||
port | ||
table | ||
third-party | ||
tools | ||
util | ||
utilities | ||
.clang-format | ||
.gitignore | ||
.travis.yml | ||
appveyor.yml | ||
AUTHORS | ||
CMakeLists.txt | ||
CODE_OF_CONDUCT.md | ||
CONTRIBUTING.md | ||
COPYING | ||
DEFAULT_OPTIONS_HISTORY.md | ||
DUMP_FORMAT.md | ||
HISTORY.md | ||
INSTALL.md | ||
issue_template.md | ||
LANGUAGE-BINDINGS.md | ||
LICENSE.Apache | ||
LICENSE.leveldb | ||
Makefile | ||
README.md | ||
ROCKSDB_LITE.md | ||
src.mk | ||
TARGETS | ||
thirdparty.inc | ||
USERS.md | ||
Vagrantfile | ||
WINDOWS_PORT.md |
RocksDB: A Persistent Key-Value Store for Flash and RAM Storage
RocksDB is developed and maintained by Facebook Database Engineering Team. It is built on earlier work on LevelDB by Sanjay Ghemawat (sanjay@google.com) and Jeff Dean (jeff@google.com)
This code is a library that forms the core building block for a fast key value server, especially suited for storing data on flash drives. It has a Log-Structured-Merge-Database (LSM) design with flexible tradeoffs between Write-Amplification-Factor (WAF), Read-Amplification-Factor (RAF) and Space-Amplification-Factor (SAF). It has multi-threaded compactions, making it specially suitable for storing multiple terabytes of data in a single database.
Start with example usage here: https://github.com/facebook/rocksdb/tree/master/examples
See the github wiki for more explanation.
The public interface is in include/
. Callers should not include or
rely on the details of any other header files in this package. Those
internal APIs may be changed without warning.
Design discussions are conducted in https://www.facebook.com/groups/rocksdb.dev/