A library that provides an embeddable, persistent key-value store for fast storage.
Peter Dillinger 8aa99fc71e Warn on excessive keys for legacy Bloom filter with 32-bit hash (#6317)
Summary:
With many millions of keys, the old Bloom filter implementation
for the block-based table (format_version <= 4) would have excessive FP
rate due to the limitations of feeding the Bloom filter with a 32-bit hash.
This change computes an estimated inflated FP rate due to this effect
and warns in the log whenever an SST filter is constructed (almost
certainly a "full" not "partitioned" filter) that exceeds 1.5x FP rate
due to this effect. The detailed condition is only checked if 3 million
keys or more have been added to a filter, as this should be a lower
bound for common bits/key settings (< 20).
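
The estimate described above can be sketched as follows. This is a minimal illustration, not the code in filter_policy.cc: it assumes that a 32-bit hash collision with any added key always yields a false positive, and that such collisions are independent of ordinary filter false positives. The names `collision_probability` and `inflated_fp_rate` are hypothetical, and the base rate is taken from the "Predicted FP rate" of the 150K-key run in the test plan as a stand-in for a collision-free baseline.

```python
import math

HASH_SPACE = 2 ** 32  # number of distinct values of a 32-bit hash


def collision_probability(num_keys: int) -> float:
    """Chance that a queried key's 32-bit hash equals the hash of at least
    one of num_keys added keys: 1 - (1 - 2^-32)^num_keys."""
    return -math.expm1(num_keys * math.log1p(-1.0 / HASH_SPACE))


def inflated_fp_rate(base_fp_rate: float, num_keys: int) -> float:
    """Combine the base FP rate with hash collisions, assuming the two
    events are independent (an assumption of this sketch)."""
    c = collision_probability(num_keys)
    return base_fp_rate + (1.0 - base_fp_rate) * c


# Assumed baseline: predicted FP rate at 15 bits/key with 150K keys,
# expressed as a fraction rather than a percentage.
BASE_15_BPK = 0.00422857

if __name__ == "__main__":
    for n in (9_000_000, 10_000_000, 15_000_000):
        rate = inflated_fp_rate(BASE_15_BPK, n)
        print(f"{n:>10} keys: {100 * rate:.3f}% FP, {rate / BASE_15_BPK:.2f}x")
```

Under these assumptions the inflation factor stays just below 1.5x at 9M keys and crosses it at 10M keys, which is consistent with the warning behavior reported in the test plan.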

Recommended remedies include smaller SST file size, using
format_version >= 5 (for new Bloom filter), or using partitioned
filters.

This does not change behavior other than generating warnings for some
constructed filters using the old implementation.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6317

Test Plan:
Example with warning, 15M keys @ 15 bits / key: (working_mem_size_mb is just to stop after building one filter if it's large)

    $ ./filter_bench -quick -impl=0 -working_mem_size_mb=1 -bits_per_key=15 -average_keys_per_filter=15000000 2>&1 | grep 'FP rate'
    [WARN] [/block_based/filter_policy.cc:292] Using legacy SST/BBT Bloom filter with excessive key count (15.0M @ 15bpk), causing estimated 1.8x higher filter FP rate. Consider using new Bloom with format_version>=5, smaller SST file size, or partitioned filters.
    Predicted FP rate %: 0.766702
    Average FP rate %: 0.66846

Example without warning (150K keys):

    $ ./filter_bench -quick -impl=0 -working_mem_size_mb=1 -bits_per_key=15 -average_keys_per_filter=150000 2>&1 | grep 'FP rate'
    Predicted FP rate %: 0.422857
    Average FP rate %: 0.379301
    $

With more samples at 15 bits/key:
  150K keys -> no warning; actual: 0.379% FP rate (baseline)
  1M keys -> no warning; actual: 0.396% FP rate, 1.045x
  9M keys -> no warning; actual: 0.563% FP rate, 1.485x
  10M keys -> warning (1.5x); actual: 0.564% FP rate, 1.488x
  15M keys -> warning (1.8x); actual: 0.668% FP rate, 1.76x
  25M keys -> warning (2.4x); actual: 0.880% FP rate, 2.32x

At 10 bits/key:
  150K keys -> no warning; actual: 1.17% FP rate (baseline)
  1M keys -> no warning; actual: 1.16% FP rate
  10M keys -> no warning; actual: 1.32% FP rate, 1.13x
  25M keys -> no warning; actual: 1.63% FP rate, 1.39x
  35M keys -> warning (1.6x); actual: 1.81% FP rate, 1.55x

At 5 bits/key:
  150K keys -> no warning; actual: 9.32% FP rate (baseline)
  25M keys -> no warning; actual: 9.62% FP rate, 1.03x
  200M keys -> no warning; actual: 12.2% FP rate, 1.31x
  250M keys -> warning (1.5x); actual: 12.8% FP rate, 1.37x
  300M keys -> warning (1.6x); actual: 13.4% FP rate, 1.43x
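
As a rough cross-check of where the warnings begin in the samples above, the same independence model can be inverted to find the key count at which the estimated FP rate reaches 1.5x the base rate. This is a sketch under stated assumptions, not the implementation's logic: `min_keys_for_inflation` is a hypothetical helper, the 15 bits/key base rate is the tool's predicted rate from the 150K-key run, and the 10 and 5 bits/key base rates are the measured baselines listed above, used as stand-ins for whatever base estimate the implementation uses.

```python
import math

HASH_SPACE = 2 ** 32  # number of distinct values of a 32-bit hash


def min_keys_for_inflation(base_fp_rate: float, factor: float = 1.5) -> float:
    """Key count at which base + (1 - base) * c reaches factor * base,
    where c = 1 - exp(-n / 2^32) approximates the chance of a 32-bit
    hash collision with one of n added keys."""
    c = (factor - 1.0) * base_fp_rate / (1.0 - base_fp_rate)
    return -HASH_SPACE * math.log1p(-c)


# Base FP rates as fractions (assumed inputs; see the lead-in above).
BASE_RATES = {15: 0.00422857, 10: 0.0117, 5: 0.0932}

if __name__ == "__main__":
    for bits_per_key, base in BASE_RATES.items():
        n = min_keys_for_inflation(base)
        print(f"{bits_per_key:>2} bits/key: 1.5x reached near {n / 1e6:.1f}M keys")
```

With these inputs the thresholds fall between 9M and 10M keys at 15 bits/key, between 25M and 35M at 10 bits/key, and between 200M and 250M at 5 bits/key, matching the points where the samples switch from "no warning" to "warning".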

The modest inaccuracy at low bits/key arises because the estimate assumes that a collision between 32-bit hash values feeding the filter is independent of an FP in the filter, which is not quite true for implementations using "simple" logic to compute indices from the stock hash result. There's math on this in my dissertation, but I don't think it's worth the effort just for these extreme cases (> 100 million keys and low-ish bits/key).

Differential Revision: D19471715

Pulled By: pdillinger

fbshipit-source-id: f80c96893a09bf1152630ff0b964e5cdd7e35c68
2020-01-20 21:31:47 -08:00
buckifier PosixRandomAccessFile::MultiRead() to use I/O uring if supported (#5881) 2019-12-07 20:55:52 -08:00
build_tools Improve instructions to install formatter (#6162) 2019-12-12 14:04:01 -08:00
cache Remove key length assertion LRUHandle::CalcTotalCharge (#6115) 2019-12-02 15:00:07 -08:00
cmake cmake: do not build tests for Release build and cleanups (#5916) 2019-12-13 12:48:06 -08:00
coverage Fix interpreter lines for files with python2-only syntax. 2019-07-09 10:51:37 -07:00
db Separate enable-WAL and disable-WAL writer to avoid unwanted data in log files (#6290) 2020-01-17 15:54:55 -08:00
db_stress_tool Variable key length in db_stress (#6273) 2020-01-09 21:27:18 -08:00
docs Log warning for high bits/key in legacy Bloom filter (#6312) 2020-01-17 19:37:35 -08:00
env Fix some shadow warning (#6242) 2020-01-08 18:20:13 -08:00
examples Update example of optimistic transaction (#6074) 2020-01-16 14:04:44 -08:00
file Remove earlier partial BlobDB GC implementation (#6278) 2020-01-14 15:08:44 -08:00
hdfs Add copyright headers per FB open-source checkup tool. (#5199) 2019-04-18 10:55:01 -07:00
include/rocksdb Log warning for high bits/key in legacy Bloom filter (#6312) 2020-01-17 19:37:35 -08:00
java Access Maven Central over HTTPS (#6301) 2020-01-15 17:54:53 -08:00
logging Increase max_log_size in FlushJob to 1024 bytes (#6258) 2020-01-06 10:16:52 -08:00
memory Charge block cache for cache internal usage (#5797) 2019-09-16 15:26:21 -07:00
memtable Misc hashing updates / upgrades (#5909) 2019-10-24 17:16:46 -07:00
monitoring Apply formatter to recent 200+ commits. (#5830) 2019-09-20 12:04:26 -07:00
options Introduce a new storage specific Env API (#5761) 2019-12-13 14:48:41 -08:00
port Implement getfreespace for WinEnv (#6265) 2020-01-07 13:56:13 -08:00
table Warn on excessive keys for legacy Bloom filter with 32-bit hash (#6317) 2020-01-20 21:31:47 -08:00
test_util unordered_write incompatible with max_successive_merges (#6284) 2020-01-10 16:53:19 -08:00
third-party Apply formatter to some recent commits (#6138) 2019-12-09 15:49:49 -08:00
tools Fix bug which causes crash_test to always run on sync mode (#6304) 2020-01-17 01:46:48 -08:00
trace_replay Misc hashing updates / upgrades (#5909) 2019-10-24 17:16:46 -07:00
util Warn on excessive keys for legacy Bloom filter with 32-bit hash (#6317) 2020-01-20 21:31:47 -08:00
utilities Remove earlier partial BlobDB GC implementation (#6278) 2020-01-14 15:08:44 -08:00
.clang-format A script that automatically reformat affected lines 2014-01-14 12:21:24 -08:00
.gitignore Make buckifier python3 compatible (#5922) 2019-10-23 13:52:27 -07:00
.lgtm.yml Create lgtm.yml for LGTM.com C/C++ analysis (#4058) 2018-06-26 12:43:04 -07:00
.travis.yml Small tidy and speed up of the travis build (#6181) 2019-12-17 13:56:45 -08:00
.watchmanconfig Added .watchmanconfig file to rocksdb repo (#5593) 2019-07-19 15:00:33 -07:00
appveyor.yml Add Visual Studio 2015 to AppVeyor (#5446) 2019-12-10 20:02:31 -08:00
AUTHORS Update RocksDB Authors File 2017-10-18 14:42:10 -07:00
CMakeLists.txt Introduce a new storage specific Env API (#5761) 2019-12-13 14:48:41 -08:00
CODE_OF_CONDUCT.md Adopt Contributor Covenant 2019-08-29 23:21:01 -07:00
CONTRIBUTING.md Add Code of Conduct 2017-12-05 18:42:35 -08:00
COPYING Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
DEFAULT_OPTIONS_HISTORY.md options.delayed_write_rate use the rate of rate_limiter by default. 2017-05-24 09:58:24 -07:00
defs.bzl Add clarifying/instructive header to TARGETS and defs.bzl 2019-11-05 20:20:33 -08:00
DUMP_FORMAT.md First version of rocksdb_dump and rocksdb_undump. 2015-06-19 16:24:36 -07:00
HISTORY.md Adjust thread pool sizes when setting max_background_jobs dynamically (#6300) 2020-01-16 14:35:10 -08:00
INSTALL.md Update the version of the dependencies used by the RocksJava static build (#4761) 2018-12-18 20:25:43 -08:00
issue_template.md Add a template for issues 2017-09-29 11:41:28 -07:00
LANGUAGE-BINDINGS.md LANGUAGE-BINDINGS.md: mention python-rocksdb 2019-03-20 11:10:48 -07:00
LICENSE.Apache Change RocksDB License 2017-07-15 16:11:23 -07:00
LICENSE.leveldb Add back the LevelDB license file 2017-07-16 18:42:18 -07:00
Makefile Fix a clang analyzer report, and 'analyze' make rule (#6244) 2019-12-24 18:46:40 -08:00
README.md Replaced some words (#5877) 2019-10-07 12:28:09 -07:00
ROCKSDB_LITE.md Fix some typos in comments and docs. 2018-03-08 10:27:25 -08:00
src.mk Introduce a new storage specific Env API (#5761) 2019-12-13 14:48:41 -08:00
TARGETS Introduce a new storage specific Env API (#5761) 2019-12-13 14:48:41 -08:00
thirdparty.inc Fix build jemalloc api (#5470) 2019-06-24 17:40:32 -07:00
USERS.md add user nebula (#6271) 2020-01-08 13:46:43 -08:00
Vagrantfile Adding CentOS 7 Vagrantfile & build script 2018-02-26 15:27:17 -08:00
WINDOWS_PORT.md #5145 , rename port/dirent.h to port/port_dirent.h to avoid compile err when use port dir as header dir output (#5152) 2019-04-04 11:38:19 -07:00

RocksDB: A Persistent Key-Value Store for Flash and RAM Storage


RocksDB is developed and maintained by the Facebook Database Engineering Team. It is built on earlier work on LevelDB by Sanjay Ghemawat (sanjay@google.com) and Jeff Dean (jeff@google.com).

This code is a library that forms the core building block for a fast key-value server, especially suited for storing data on flash drives. It has a Log-Structured-Merge-Database (LSM) design with flexible tradeoffs between Write-Amplification-Factor (WAF), Read-Amplification-Factor (RAF) and Space-Amplification-Factor (SAF). It has multi-threaded compactions, making it especially suitable for storing multiple terabytes of data in a single database.

Start with example usage here: https://github.com/facebook/rocksdb/tree/master/examples

See the GitHub wiki for more explanation.

The public interface is in include/. Callers should not include or rely on the details of any other header files in this package. Those internal APIs may be changed without warning.

Design discussions are conducted in https://www.facebook.com/groups/rocksdb.dev/

License

RocksDB is dual-licensed under both the GPLv2 (found in the COPYING file in the root directory) and Apache 2.0 License (found in the LICENSE.Apache file in the root directory). You may select, at your option, one of the above-listed licenses.