A library that provides an embeddable, persistent key-value store for fast storage.
Peter Dillinger 249eff0f30 Stats for redundant insertions into block cache (#6681)
Summary:
Since read threads do not coordinate on loading data into block
cache, two threads between Lookup and Insert can end up loading and
inserting the same data. This is particularly concerning with
cache_index_and_filter_blocks since those are hot and more likely to
be race targets if ejected from (or not pre-populated in) the cache.

Particularly with moves toward disaggregated / network storage, the cost
of redundant retrieval might be high, and we should at least have some
hard statistics from which we can estimate impact.
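
The race described above can be sketched with a toy cache. This is an illustration only, not RocksDB's block cache implementation: `ToyCache`, `LoadBlock`, `SimulateRace`, and the counter names are hypothetical. The two readers are interleaved by hand so the outcome is deterministic, and the sketch assumes a redundant add is also counted as an ordinary add.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>

// Toy stand-in for a block cache; NOT the RocksDB Cache API.
struct ToyCache {
  std::unordered_map<std::string, std::string> blocks;
  uint64_t add = 0;            // every insert (assumed to include redundant ones)
  uint64_t add_redundant = 0;  // inserts that found the key already present

  bool Lookup(const std::string& key) const { return blocks.count(key) != 0; }

  void Insert(const std::string& key, const std::string& value) {
    if (blocks.count(key) != 0) {
      ++add_redundant;  // another reader already loaded this block
    }
    blocks[key] = value;
    ++add;
  }
};

// Hypothetical loader standing in for a read from storage.
std::string LoadBlock(const std::string& key) { return "contents of " + key; }

// Two readers interleaved by hand: both miss, both load, both insert.
ToyCache SimulateRace() {
  ToyCache cache;
  const std::string key = "filter_block_7";  // made-up key
  const bool hit_a = cache.Lookup(key);  // reader A misses
  const bool hit_b = cache.Lookup(key);  // reader B misses before A inserts
  if (!hit_a) cache.Insert(key, LoadBlock(key));  // counted as an add
  if (!hit_b) cache.Insert(key, LoadBlock(key));  // counted as an add and as redundant
  return cache;
}
```

Calling `SimulateRace()` leaves the toy cache with `add == 2` and `add_redundant == 1`: the second load was wasted work, which is the cost the new statistics are meant to expose.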

Example with full filter thrashing "cliff":

    $ ./db_bench --benchmarks=fillrandom --num=15000000 --cache_index_and_filter_blocks -bloom_bits=10
    ...
    $ ./db_bench --db=/tmp/rocksdbtest-172704/dbbench --use_existing_db --benchmarks=readrandom,stats --num=200000 --cache_index_and_filter_blocks --cache_size=$((130 * 1024 * 1024)) --bloom_bits=10 --threads=16 -statistics 2>&1 | egrep '^rocksdb.block.cache.(.*add|.*redundant)' | grep -v compress | sort
    rocksdb.block.cache.add COUNT : 14181
    rocksdb.block.cache.add.failures COUNT : 0
    rocksdb.block.cache.add.redundant COUNT : 476
    rocksdb.block.cache.data.add COUNT : 12749
    rocksdb.block.cache.data.add.redundant COUNT : 18
    rocksdb.block.cache.filter.add COUNT : 1003
    rocksdb.block.cache.filter.add.redundant COUNT : 217
    rocksdb.block.cache.index.add COUNT : 429
    rocksdb.block.cache.index.add.redundant COUNT : 241
    $ ./db_bench --db=/tmp/rocksdbtest-172704/dbbench --use_existing_db --benchmarks=readrandom,stats --num=200000 --cache_index_and_filter_blocks --cache_size=$((120 * 1024 * 1024)) --bloom_bits=10 --threads=16 -statistics 2>&1 | egrep '^rocksdb.block.cache.(.*add|.*redundant)' | grep -v compress | sort
    rocksdb.block.cache.add COUNT : 1182223
    rocksdb.block.cache.add.failures COUNT : 0
    rocksdb.block.cache.add.redundant COUNT : 302728
    rocksdb.block.cache.data.add COUNT : 31425
    rocksdb.block.cache.data.add.redundant COUNT : 12
    rocksdb.block.cache.filter.add COUNT : 795455
    rocksdb.block.cache.filter.add.redundant COUNT : 130238
    rocksdb.block.cache.index.add COUNT : 355343
    rocksdb.block.cache.index.add.redundant COUNT : 172478
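
To compare the two runs, it can help to express redundant adds as a fraction of all adds. The following is a small sketch, assuming redundant adds are included in the corresponding add count. `RedundantFraction` is a hypothetical helper, and the numbers in the comments are copied from the output above.

```cpp
#include <cstdint>

// Hypothetical helper: share of block cache adds that were redundant.
// Assumes redundant adds are also counted in the total `add` ticker.
double RedundantFraction(uint64_t redundant, uint64_t add) {
  return add == 0 ? 0.0
                  : static_cast<double>(redundant) / static_cast<double>(add);
}

// Counts copied from the two runs above (rocksdb.block.cache.add*):
//   130 MiB cache: RedundantFraction(476, 14181)       is about 0.034
//   120 MiB cache: RedundantFraction(302728, 1182223)  is about 0.256
```

Under that assumption, shrinking the cache from 130 MiB to 120 MiB moves the redundant share of all adds from roughly 3.4% to roughly 25.6%, while the total number of adds grows by almost two orders of magnitude.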
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6681

Test Plan: Some manual testing (above); a unit test covering key metrics is included

Reviewed By: ltamasi

Differential Revision: D21134113

Pulled By: pdillinger

fbshipit-source-id: c11497b5f00f4ffdfe919823904e52d0a1a91d87
2020-04-27 13:20:27 -07:00
.circleci Migrate AppVeyor to CircleCI (#6518) 2020-03-13 21:58:51 -07:00
buckifier Update buckifier to unblock future internal release (#6726) 2020-04-26 17:35:37 -07:00
build_tools C++20 compatibility (#6697) 2020-04-20 13:24:25 -07:00
cache Stats for redundant insertions into block cache (#6681) 2020-04-27 13:20:27 -07:00
cmake C++20 compatibility (#6697) 2020-04-20 13:24:25 -07:00
coverage Update a few scripts to be python3 compatible (#6525) 2020-03-24 21:00:27 -07:00
db Stats for redundant insertions into block cache (#6681) 2020-04-27 13:20:27 -07:00
db_stress_tool Disable O_DIRECT in stress test when db directory does not support direct IO (#6727) 2020-04-25 00:01:03 -07:00
docs Log warning for high bits/key in legacy Bloom filter (#6312) 2020-01-17 19:37:35 -08:00
env Fix unused variable of r in release mode (#6750) 2020-04-24 15:14:13 -07:00
examples Add a ConfigOptions for use in comparing objects and converting to/from strings (#6389) 2020-04-21 17:38:17 -07:00
file Reduce memory copies when fetching and uncompressing blocks from SST files (#6689) 2020-04-24 15:32:56 -07:00
hdfs Add IsDirectory() to Env and FS (#6711) 2020-04-17 14:39:18 -07:00
include/rocksdb Stats for redundant insertions into block cache (#6681) 2020-04-27 13:20:27 -07:00
java Add a ConfigOptions for use in comparing objects and converting to/from strings (#6389) 2020-04-21 17:38:17 -07:00
logging Fix info log source file display length (#5824) 2020-04-08 20:18:08 -07:00
memory C++20 compatibility (#6697) 2020-04-20 13:24:25 -07:00
memtable C++20 compatibility (#6697) 2020-04-20 13:24:25 -07:00
monitoring Stats for redundant insertions into block cache (#6681) 2020-04-27 13:20:27 -07:00
options Reduce memory copies when fetching and uncompressing blocks from SST files (#6689) 2020-04-24 15:32:56 -07:00
port C++20 compatibility (#6697) 2020-04-20 13:24:25 -07:00
table Stats for redundant insertions into block cache (#6681) 2020-04-27 13:20:27 -07:00
test_util Disable O_DIRECT in stress test when db directory does not support direct IO (#6727) 2020-04-25 00:01:03 -07:00
third-party C++20 compatibility (#6697) 2020-04-20 13:24:25 -07:00
tools Allow sst_dump to check size of different compression levels and report time (#6634) 2020-04-27 12:36:16 -07:00
trace_replay Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
util Stats for redundant insertions into block cache (#6681) 2020-04-27 13:20:27 -07:00
utilities Add a ConfigOptions for use in comparing objects and converting to/from strings (#6389) 2020-04-21 17:38:17 -07:00
.clang-format A script that automatically reformat affected lines 2014-01-14 12:21:24 -08:00
.gitignore Separate timestamp related test from db_basic_test (#6516) 2020-03-13 11:37:15 -07:00
.lgtm.yml Create lgtm.yml for LGTM.com C/C++ analysis (#4058) 2018-06-26 12:43:04 -07:00
.travis.yml C++20 compatibility (#6697) 2020-04-20 13:24:25 -07:00
.watchmanconfig Added .watchmanconfig file to rocksdb repo (#5593) 2019-07-19 15:00:33 -07:00
appveyor.yml C++20 compatibility (#6697) 2020-04-20 13:24:25 -07:00
AUTHORS Update RocksDB Authors File 2017-10-18 14:42:10 -07:00
CMakeLists.txt Reduce memory copies when fetching and uncompressing blocks from SST files (#6689) 2020-04-24 15:32:56 -07:00
CODE_OF_CONDUCT.md Adopt Contributor Covenant 2019-08-29 23:21:01 -07:00
CONTRIBUTING.md Add Code of Conduct 2017-12-05 18:42:35 -08:00
COPYING Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
DEFAULT_OPTIONS_HISTORY.md options.delayed_write_rate use the rate of rate_limiter by default. 2017-05-24 09:58:24 -07:00
defs.bzl Make testpilot recognize that these tests have coverage instrumentation 2020-03-20 11:23:23 -07:00
DUMP_FORMAT.md First version of rocksdb_dump and rocksdb_undump. 2015-06-19 16:24:36 -07:00
HISTORY.md Allow sst_dump to check size of different compression levels and report time (#6634) 2020-04-27 12:36:16 -07:00
INSTALL.md Update the version of the dependencies used by the RocksJava static build (#4761) 2018-12-18 20:25:43 -08:00
issue_template.md Add Google Group to Issue Template 2020-01-28 14:40:37 -08:00
LANGUAGE-BINDINGS.md LANGUAGE-BINDINGS.md: mention python-rocksdb 2019-03-20 11:10:48 -07:00
LICENSE.Apache Change RocksDB License 2017-07-15 16:11:23 -07:00
LICENSE.leveldb Add back the LevelDB license file 2017-07-16 18:42:18 -07:00
Makefile Understand common build variables passed as make variables (#6740) 2020-04-27 10:48:49 -07:00
README.md Replaced some words (#5877) 2019-10-07 12:28:09 -07:00
ROCKSDB_LITE.md Fix some typos in comments and docs. 2018-03-08 10:27:25 -08:00
src.mk Reduce memory copies when fetching and uncompressing blocks from SST files (#6689) 2020-04-24 15:32:56 -07:00
TARGETS Update buckifier to unblock future internal release (#6726) 2020-04-26 17:35:37 -07:00
thirdparty.inc Fix build jemalloc api (#5470) 2019-06-24 17:40:32 -07:00
USERS.md add user nebula (#6271) 2020-01-08 13:46:43 -08:00
Vagrantfile Adding CentOS 7 Vagrantfile & build script 2018-02-26 15:27:17 -08:00
WINDOWS_PORT.md #5145 , rename port/dirent.h to port/port_dirent.h to avoid compile err when use port dir as header dir output (#5152) 2019-04-04 11:38:19 -07:00

RocksDB: A Persistent Key-Value Store for Flash and RAM Storage


RocksDB is developed and maintained by the Facebook Database Engineering Team. It is built on earlier work on LevelDB by Sanjay Ghemawat (sanjay@google.com) and Jeff Dean (jeff@google.com).

This code is a library that forms the core building block for a fast key-value server, especially suited for storing data on flash drives. It has a Log-Structured-Merge-Database (LSM) design with flexible tradeoffs between Write-Amplification-Factor (WAF), Read-Amplification-Factor (RAF) and Space-Amplification-Factor (SAF). It has multi-threaded compactions, making it especially suitable for storing multiple terabytes of data in a single database.

Start with example usage here: https://github.com/facebook/rocksdb/tree/master/examples

See the GitHub wiki for more explanation.

The public interface is in include/. Callers should not include or rely on the details of any other header files in this package. Those internal APIs may be changed without warning.

Design discussions are conducted in https://www.facebook.com/groups/rocksdb.dev/

License

RocksDB is dual-licensed under both the GPLv2 (found in the COPYING file in the root directory) and Apache 2.0 License (found in the LICENSE.Apache file in the root directory). You may select, at your option, one of the above-listed licenses.