A library that provides an embeddable, persistent key-value store for fast storage.
yingsu00 f54d7f5fea Port 3 way SSE4.2 crc32c implementation from Folly
Summary:
**# Summary**

RocksDB uses SSE crc32 intrinsics to calculate crc32 values, but it does so in a single-way fashion (not pipelined on a single CPU core). Intel's whitepaper () published an algorithm that uses 3-way pipelining for the crc32 intrinsics and then uses the pclmulqdq intrinsic to combine the values. Because pclmulqdq has overhead of its own, this algorithm shows perf gains only on buffers larger than 216 bytes, which makes RocksDB a perfect user, since most of the buffers RocksDB calls crc32c on are over 4KB. Initial db_bench runs show a large reduction in the CPU cost of crc32c.
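
To illustrate why three independent streams can be combined into one checksum, here is a minimal portable sketch. It is not the Folly or RocksDB implementation: it uses a bit-at-a-time software CRC32C instead of the SSE4.2 crc32 intrinsic, and it replaces the pclmulqdq combine step with a slow loop that advances the register over zero bytes. The function names (`Advance`, `ShiftZeros`, `Crc32cOneWay`, `Crc32cThreeWay`) are hypothetical and exist only for this example.

```cpp
#include <cstddef>
#include <cstdint>

// Reflected Castagnoli polynomial used by CRC32C.
static const uint32_t kCastagnoliPoly = 0x82F63B78u;

// Advance a raw CRC32C register over n bytes of data, one bit at a time.
static uint32_t Advance(uint32_t state, const unsigned char* p, size_t n) {
  for (size_t i = 0; i < n; ++i) {
    state ^= p[i];
    for (int k = 0; k < 8; ++k) {
      state = (state >> 1) ^ (kCastagnoliPoly & (0u - (state & 1u)));
    }
  }
  return state;
}

// Advance the register over n zero bytes. In this sketch it stands in for
// the combine step (the real algorithm uses pclmulqdq for this purpose).
static uint32_t ShiftZeros(uint32_t state, size_t n) {
  for (size_t i = 0; i < n; ++i) {
    for (int k = 0; k < 8; ++k) {
      state = (state >> 1) ^ (kCastagnoliPoly & (0u - (state & 1u)));
    }
  }
  return state;
}

// Reference: a single dependent chain over the whole buffer.
uint32_t Crc32cOneWay(const void* data, size_t n) {
  const unsigned char* p = static_cast<const unsigned char*>(data);
  return ~Advance(0xFFFFFFFFu, p, n);
}

// Split the buffer into three parts A, B, C. The three Advance calls do not
// depend on each other, which is what allows them to be pipelined.
uint32_t Crc32cThreeWay(const void* data, size_t n) {
  const unsigned char* p = static_cast<const unsigned char*>(data);
  const size_t a = n / 3;
  const size_t b = n / 3;
  const size_t c = n - a - b;
  const uint32_t sa = Advance(0xFFFFFFFFu, p, a);  // carries the initial value
  const uint32_t sb = Advance(0u, p + a, b);
  const uint32_t sc = Advance(0u, p + a + b, c);
  // The register update is linear over GF(2), so the partial states can be
  // shifted past the bytes that follow them and XORed together.
  return ~(ShiftZeros(sa, b + c) ^ ShiftZeros(sb, c) ^ sc);
}
```

Both functions return the same value for any buffer. The split only pays off when the three streams run in parallel and the combine step is cheap, which is consistent with the buffer-size threshold mentioned above.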

This change uses the 3-way SSE algorithm by default. The old SSE algorithm is now behind a build flag, NO_THREEWAY_CRC32C. If the user compiles the code with NO_THREEWAY_CRC32C=1, the old SSE crc32c algorithm is used. If the server does not have SSE4.2 at run time, the slow (non-SSE) path is used.
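
The selection rules in the paragraph above can be sketched as a small decision function. This only illustrates the described behavior and is not the code in this commit: `Crc32cPath`, `ChoosePath`, and `CpuHasSse42` are hypothetical names, and the CPU check assumes a GCC or Clang build on x86 (other platforms conservatively report no SSE4.2).

```cpp
// Hypothetical labels for the three code paths named in the text.
enum class Crc32cPath { kSlowNonSse, kOldSse, kThreeWaySse };

// Run-time CPU check. __builtin_cpu_supports is a GCC/Clang builtin for x86;
// on any other compiler or architecture this sketch falls back to "false".
inline bool CpuHasSse42() {
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
  return __builtin_cpu_supports("sse4.2") != 0;
#else
  return false;
#endif
}

// has_sse42:          result of the run-time CPU check.
// three_way_disabled: true when the build was made with NO_THREEWAY_CRC32C=1.
inline Crc32cPath ChoosePath(bool has_sse42, bool three_way_disabled) {
  if (!has_sse42) {
    return Crc32cPath::kSlowNonSse;  // no SSE4.2 at run time: slow path
  }
  return three_way_disabled ? Crc32cPath::kOldSse : Crc32cPath::kThreeWaySse;
}
```

A caller would evaluate `ChoosePath(CpuHasSse42(), three_way_disabled)` once at startup and cache the result, so the check stays off the hot path.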

**# Performance Test Results**
We ran the FillRandom and ReadRandom benchmarks in db_bench. ReadRandom is the point of interest here since it calculates the CRC32 for the in-memory buffers. We did 3 runs for each algorithm.

Before this change, CRC32 computation accounted for about 11.5% of total CPU cost; with the new 3-way algorithm it drops to around 4.5%. Average throughput also improved from 25.53 MB/s to 27.63 MB/s.

1) ReadRandom in db_bench overall metrics

    PER RUN
    Algorithm  | Run | micros/op | ops/sec | Throughput (MB/s)
    3-way      | 1   | 4.143     | 241387  | 26.7
    3-way      | 2   | 3.775     | 264872  | 29.3
    3-way      | 3   | 4.116     | 242929  | 26.9
    FastCrc32c | 1   | 4.037     | 247727  | 27.4
    FastCrc32c | 2   | 4.648     | 215166  | 23.8
    FastCrc32c | 3   | 4.352     | 229799  | 25.4

    AVG
    Algorithm  | Average of micros/op | Average of ops/sec | Average of Throughput (MB/s)
    3-way      | 4.01                 | 249,729            | 27.63
    FastCrc32c | 4.35                 | 230,897            | 25.53

2) Crc32c computation CPU cost (inclusive samples percentage)

    PER RUN
    Implementation | Run | TotalSamples  | Crc32c percentage
    3-way          | 1   | 4,572,250,000 | 4.37%
    3-way          | 2   | 3,779,250,000 | 4.62%
    3-way          | 3   | 4,129,500,000 | 4.48%
    FastCrc32c     | 1   | 4,663,500,000 | 11.24%
    FastCrc32c     | 2   | 4,047,500,000 | 12.34%
    FastCrc32c     | 3   | 4,366,750,000 | 11.68%

**# Test Plan**

    make -j64 corruption_test && ./corruption_test

By default this uses the 3-way SSE algorithm.

    NO_THREEWAY_CRC32C=1 make -j64 corruption_test && ./corruption_test

    make clean && DEBUG_LEVEL=0 make -j64 db_bench
    make clean && DEBUG_LEVEL=0 NO_THREEWAY_CRC32C=1 make -j64 db_bench
Closes https://github.com/facebook/rocksdb/pull/3173

Differential Revision: D6330882

Pulled By: yingsu00

fbshipit-source-id: 8ec3d89719533b63b536a736663ca6f0dd4482e9
2017-12-19 18:26:49 -08:00
buckifier Remove import use from TARGETS 2017-11-30 15:27:34 -08:00
build_tools Port 3 way SSE4.2 crc32c implementation from Folly 2017-12-19 18:26:49 -08:00
cache fix gflags namespace 2017-12-01 10:42:05 -08:00
cmake add missing config checks to CMakeLists.txt 2017-11-30 22:57:00 -08:00
coverage Fix /bin/bash shebangs 2017-08-03 15:56:46 -07:00
db WritePrepared Txn: Return NotSupported on iterator refresh 2017-12-18 22:29:30 -08:00
docs blog post for auto-tuned rate limiter 2017-12-18 17:56:50 -08:00
env Suppress valgrind "unimplemented functionality" error 2017-11-15 14:28:34 -08:00
examples Pinnableslice examples and blog post 2017-08-24 12:26:07 -07:00
hdfs Revert "comment out unused parameters" 2017-07-21 18:26:26 -07:00
include/rocksdb Remove incorrect comment 2017-12-18 17:56:47 -08:00
java Add a histogram stat for memtable flush 2017-12-15 18:57:00 -08:00
memtable fix gflags namespace 2017-12-01 10:42:05 -08:00
monitoring fix ThreadStatus for bottom-pri compaction threads 2017-12-14 14:57:49 -08:00
options Make Universal compaction options dynamic 2017-12-11 13:27:06 -08:00
port Fix a race condition in WindowsThread (port::Thread) 2017-12-07 13:42:53 -08:00
table NUMBER_BLOCK_COMPRESSED, etc, shouldn't be treated as timer counter 2017-12-14 10:27:43 -08:00
third-party Enable MSVC W4 with a few exceptions. Fix warnings and bugs 2017-10-19 10:57:12 -07:00
tools Port 3 way SSE4.2 crc32c implementation from Folly 2017-12-19 18:26:49 -08:00
util Port 3 way SSE4.2 crc32c implementation from Folly 2017-12-19 18:26:49 -08:00
utilities BlobDB: dump blob db options on open 2017-12-19 16:57:12 -08:00
.clang-format A script that automatically reformat affected lines 2014-01-14 12:21:24 -08:00
.gitignore Remove leftover references to phutil_module_cache 2017-08-23 12:12:21 -07:00
.travis.yml CMake cross platform Java support and add JNI to travis 2017-11-28 12:27:53 -08:00
appveyor.yml Add -DPORTABLE=1 to MSVC CI build 2017-08-31 16:42:48 -07:00
AUTHORS Update RocksDB Authors File 2017-10-18 14:42:10 -07:00
CMakeLists.txt Port 3 way SSE4.2 crc32c implementation from Folly 2017-12-19 18:26:49 -08:00
CODE_OF_CONDUCT.md Add Code of Conduct 2017-12-05 18:42:35 -08:00
CONTRIBUTING.md Add Code of Conduct 2017-12-05 18:42:35 -08:00
COPYING Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
DEFAULT_OPTIONS_HISTORY.md options.delayed_write_rate use the rate of rate_limiter by default. 2017-05-24 09:58:24 -07:00
DUMP_FORMAT.md First version of rocksdb_dump and rocksdb_undump. 2015-06-19 16:24:36 -07:00
HISTORY.md Port 3 way SSE4.2 crc32c implementation from Folly 2017-12-19 18:26:49 -08:00
INSTALL.md Default one to rocksdb:x64-windows 2017-09-28 16:12:24 -07:00
issue_template.md Add a template for issues 2017-09-29 11:41:28 -07:00
LANGUAGE-BINDINGS.md Add Elixir to the list of language bindings 2017-11-21 10:13:14 -08:00
LICENSE.Apache Change RocksDB License 2017-07-15 16:11:23 -07:00
LICENSE.leveldb Add back the LevelDB license file 2017-07-16 18:42:18 -07:00
Makefile Port 3 way SSE4.2 crc32c implementation from Folly 2017-12-19 18:26:49 -08:00
README.md Appveyor badge to show master branch 2016-07-26 13:54:08 -07:00
ROCKSDB_LITE.md Optimistic Transactions 2015-05-29 14:36:35 -07:00
src.mk Refactor ReadBlockContents() 2017-12-11 15:27:32 -08:00
TARGETS WritePrepared Txn: make buck tests parallel 2017-12-18 14:42:09 -08:00
thirdparty.inc Enable cacheline_aligned_alloc() to allocate from jemalloc if enabled. 2017-10-27 13:27:12 -07:00
USERS.md Added ProfaneDB 2017-11-19 10:11:44 -08:00
Vagrantfile Update Vagrant file (test internal phabricator workflow) 2016-10-28 15:39:19 -07:00
WINDOWS_PORT.md Commit both PR and internal code review changes 2015-07-07 16:58:20 -07:00

RocksDB: A Persistent Key-Value Store for Flash and RAM Storage


RocksDB is developed and maintained by the Facebook Database Engineering Team. It is built on earlier work on LevelDB by Sanjay Ghemawat (sanjay@google.com) and Jeff Dean (jeff@google.com).

This code is a library that forms the core building block for a fast key-value server, especially suited for storing data on flash drives. It has a Log-Structured-Merge-Database (LSM) design with flexible tradeoffs between Write-Amplification-Factor (WAF), Read-Amplification-Factor (RAF) and Space-Amplification-Factor (SAF). It has multi-threaded compactions, making it especially suitable for storing multiple terabytes of data in a single database.

Start with example usage here: https://github.com/facebook/rocksdb/tree/master/examples

See the GitHub wiki for more explanation.

The public interface is in include/. Callers should not include or rely on the details of any other header files in this package. Those internal APIs may be changed without warning.

Design discussions are conducted in https://www.facebook.com/groups/rocksdb.dev/