A library that provides an embeddable, persistent key-value store for fast storage.
Peter Dillinger 8b8a2e9f05 Ribbon: major re-work of hashing, seeds, and more (#7635)
Summary:
* Fully optimized StandardHasher, in terms of efficiently generating Start, CoeffRow, and ResultRow from a stock hash value, with sufficient independence between them to have no measurably degraded behavior. (Degraded behavior would be an FP rate higher than explainable by 2^-b and, if using a 32-bit stock hash function, expected stock hash collisions.) Details in code comments.
* Our standard 64-bit and 32-bit hash functions do not exhibit sufficient independence on sequential seeds (for one Ribbon construction attempt to have independent probability from the next). I have worked around this in the Ribbon code by "pre-mixing" "ordinal seeds," sequentially tried and appropriate for storage in persisted metadata, into "raw seeds," ready for application and appropriate for in-memory storage. This way the pre-mixing step (though fast) is only applied on loading or configuring the structure, not on each query or banding add.
* Fix a subtle flaw in which backtracking not clearing ResultRow data could lead to elevated FP rate on keys that were backtracked on and should (for generality) exhibit the same FP rate as novel keys.
* Added a basic test for PhsfQuery and construction algorithms (map or "retrieval structure" rather than set or filter), and made a few trivial related fixes.
* Better random configuration generation in unit tests
* Some other minor cleanup / clarification / etc.
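
The "pre-mixing" of ordinal seeds into raw seeds described above can be illustrated with a minimal sketch. This is not the code from this commit: `PreMixSeed` and `SeedState` are hypothetical names, and the mixer is the well-known SplitMix64 finalizer, swapped in as an assumption because the summary does not spell out the actual mixing function. The sketch only shows the shape of the idea: the small sequential ordinal seed is what would be persisted, while the mixed raw seed is computed once on configuration or load and cached for the hot path.

```cpp
#include <cstdint>

// Hypothetical mixer (SplitMix64 finalizer), NOT the function used by Ribbon.
// Each step is invertible, so distinct ordinal seeds give distinct raw seeds.
inline std::uint64_t PreMixSeed(std::uint64_t ordinal_seed) {
  std::uint64_t x = ordinal_seed + 0x9e3779b97f4a7c15ULL;
  x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9ULL;
  x = (x ^ (x >> 27)) * 0x94d049bb133111ebULL;
  return x ^ (x >> 31);
}

// Hypothetical holder for both forms of the seed.
class SeedState {
 public:
  explicit SeedState(std::uint32_t ordinal_seed) { Reset(ordinal_seed); }

  // Pre-mixing happens here, once per configuration or load,
  // rather than on every query or banding add.
  void Reset(std::uint32_t ordinal_seed) {
    ordinal_seed_ = ordinal_seed;
    raw_seed_ = PreMixSeed(ordinal_seed);
  }

  // Move to the next sequential ordinal seed after a failed attempt.
  void NextAttempt() { Reset(ordinal_seed_ + 1); }

  // Small and sequential: suitable for persisted metadata.
  std::uint32_t ordinal_seed() const { return ordinal_seed_; }

  // Already mixed: suitable for in-memory use on the hot path.
  std::uint64_t raw_seed() const { return raw_seed_; }

 private:
  std::uint32_t ordinal_seed_ = 0;
  std::uint64_t raw_seed_ = 0;
};
```

In this sketch, a construction loop would call `NextAttempt()` after each failed attempt and store only `ordinal_seed()` once an attempt succeeds.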

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7635

Test Plan: unit tests included

Reviewed By: jay-zhuang

Differential Revision: D24738978

Pulled By: pdillinger

fbshipit-source-id: f9d03599d9e2ca3e30e9d3e7d81cd936b56f76f0
2020-11-07 17:22:54 -08:00
.circleci Fix many tests to run with MEM_ENV and ENCRYPTED_ENV; Introduce a MemoryFileSystem class (#7566) 2020-10-27 10:33:09 -07:00
.github/workflows Update clang-format-diff.py (#7609) 2020-11-04 16:09:01 -08:00
buckifier Add a rocksdb lib target with link_whole=True (#7466) 2020-09-30 22:50:32 -07:00
build_tools Update clang-format-diff.py (#7609) 2020-11-04 16:09:01 -08:00
cache Remove unused includes (#7604) 2020-10-28 23:22:27 -07:00
cmake Add find_dependency() in cmake config file. (#6791) 2020-05-12 21:18:29 -07:00
coverage Find the correct gcov (#6904) 2020-06-01 16:33:05 -07:00
db Track WAL in MANIFEST: LogAndApply WAL events to MANIFEST (#7601) 2020-11-06 17:22:36 -08:00
db_stress_tool Track WAL in MANIFEST: LogAndApply WAL events to MANIFEST (#7601) 2020-11-06 17:22:36 -08:00
docs Update github-pages to v207 (#7235) 2020-08-12 09:26:24 -07:00
env Remove unused includes (#7604) 2020-10-28 23:22:27 -07:00
examples Bring the Configurable options together (#5753) 2020-09-14 17:01:01 -07:00
file Remove unused includes (#7604) 2020-10-28 23:22:27 -07:00
hdfs fix build with 'USE_HDFS' on windows (#6950) 2020-06-12 16:21:50 -07:00
include/rocksdb Add API to verify whole sst file checksum (#7578) 2020-11-03 20:34:56 -08:00
java Simplify a test case in Java ReadOnlyTest (#7608) 2020-11-04 16:49:17 -08:00
logging Remove unused includes (#7604) 2020-10-28 23:22:27 -07:00
memory slightly improve jemalloc allocator API header (#7592) 2020-10-28 13:47:12 -07:00
memtable Test for LoadLatestOptions (#7554) 2020-10-14 22:28:55 -07:00
monitoring Remove unused includes (#7604) 2020-10-28 23:22:27 -07:00
options Expand effect of dictionary settings in ColumnFamilyOptions::compression_opts (#7619) 2020-11-02 19:21:11 -08:00
port Remove unused includes (#7604) 2020-10-28 23:22:27 -07:00
table Fix MultiGet unable to query timestamp data issue (#7589) 2020-11-03 09:45:41 -08:00
test_util Remove unused includes (#7604) 2020-10-28 23:22:27 -07:00
third-party Fix MSVC-related build issues (#7439) 2020-10-01 09:23:04 -07:00
tools Add "max_write_buffer_size_to_maintain" to crash test (#7634) 2020-11-03 13:55:18 -08:00
trace_replay Genericize and clean up FastRange (#7436) 2020-09-28 11:35:00 -07:00
util Ribbon: major re-work of hashing, seeds, and more (#7635) 2020-11-07 17:22:54 -08:00
utilities Skip fsync in txn tests (#7641) 2020-11-06 14:25:14 -08:00
.clang-format A script that automatically reformat affected lines 2014-01-14 12:21:24 -08:00
.gitignore Ribbon: InterleavedSolutionStorage (#7598) 2020-11-03 12:46:36 -08:00
.lgtm.yml Create lgtm.yml for LGTM.com C/C++ analysis (#4058) 2018-06-26 12:43:04 -07:00
.travis.yml Update Travis config for broken snapd on ppc (#7381) 2020-09-14 14:23:13 -07:00
.watchmanconfig Added .watchmanconfig file to rocksdb repo (#5593) 2019-07-19 15:00:33 -07:00
appveyor.yml Remove 2019 from appveyor (#7038) 2020-06-29 14:31:41 -07:00
AUTHORS Update RocksDB Authors File 2017-10-18 14:42:10 -07:00
CMakeLists.txt Ribbon: initial (general) algorithms and basic unit test (#7491) 2020-10-25 20:44:49 -07:00
CODE_OF_CONDUCT.md Adopt Contributor Covenant 2019-08-29 23:21:01 -07:00
CONTRIBUTING.md Add Code of Conduct 2017-12-05 18:42:35 -08:00
COPYING Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
DEFAULT_OPTIONS_HISTORY.md options.delayed_write_rate use the rate of rate_limiter by default. 2017-05-24 09:58:24 -07:00
defs.bzl Make testpilot recognize that these tests have coverage instrumentation 2020-03-20 11:23:23 -07:00
DUMP_FORMAT.md First version of rocksdb_dump and rocksdb_undump. 2015-06-19 16:24:36 -07:00
HISTORY.md Compute NeedCompact() after table builder Finish() (#7627) 2020-11-04 10:44:56 -08:00
INSTALL.md Update the version of the dependencies used by the RocksJava static build (#4761) 2018-12-18 20:25:43 -08:00
issue_template.md Add Google Group to Issue Template 2020-01-28 14:40:37 -08:00
LANGUAGE-BINDINGS.md Add RestoreDBFromLatestBackup to C API, add new C# package (#7092) 2020-07-08 11:56:41 -07:00
LICENSE.Apache Change RocksDB License 2017-07-15 16:11:23 -07:00
LICENSE.leveldb Add back the LevelDB license file 2017-07-16 18:42:18 -07:00
Makefile Ribbon: initial (general) algorithms and basic unit test (#7491) 2020-10-25 20:44:49 -07:00
README.md Fix the CI badge for ppc64le Jenkins (#7561) 2020-10-16 09:00:56 -07:00
ROCKSDB_LITE.md Fix some typos in comments and docs. 2018-03-08 10:27:25 -08:00
src.mk Ribbon: initial (general) algorithms and basic unit test (#7491) 2020-10-25 20:44:49 -07:00
TARGETS Ribbon: initial (general) algorithms and basic unit test (#7491) 2020-10-25 20:44:49 -07:00
thirdparty.inc Fix build jemalloc api (#5470) 2019-06-24 17:40:32 -07:00
USERS.md Add YugabyteDB to USERS (#6786) 2020-05-06 10:28:29 -07:00
Vagrantfile Adding CentOS 7 Vagrantfile & build script 2018-02-26 15:27:17 -08:00
WINDOWS_PORT.md #5145 , rename port/dirent.h to port/port_dirent.h to avoid compile err when use port dir as header dir output (#5152) 2019-04-04 11:38:19 -07:00

RocksDB: A Persistent Key-Value Store for Flash and RAM Storage

RocksDB is developed and maintained by the Facebook Database Engineering Team. It is built on earlier work on LevelDB by Sanjay Ghemawat (sanjay@google.com) and Jeff Dean (jeff@google.com).

This code is a library that forms the core building block for a fast key-value server, especially suited for storing data on flash drives. It has a Log-Structured-Merge-Database (LSM) design with flexible tradeoffs between Write-Amplification-Factor (WAF), Read-Amplification-Factor (RAF) and Space-Amplification-Factor (SAF). It has multi-threaded compactions, making it especially suitable for storing multiple terabytes of data in a single database.
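
To make the three amplification factors concrete, here is a small sketch that computes them from byte counters. The README does not define the factors, so the byte-ratio definitions used here are an assumption based on common usage (read amplification in particular is sometimes measured in reads per query instead). `IoCounters` and the helper functions are hypothetical names for illustration and are not part of the RocksDB API.

```cpp
#include <cstdint>

// Hypothetical counters gathered over some measurement window.
struct IoCounters {
  std::uint64_t user_bytes_written;     // bytes the application asked to write
  std::uint64_t storage_bytes_written;  // bytes actually written (flushes + compactions)
  std::uint64_t live_user_bytes;        // logical size of the live data
  std::uint64_t storage_bytes_used;     // physical size on the device
  std::uint64_t user_bytes_read;        // bytes returned to the application
  std::uint64_t storage_bytes_read;     // bytes actually read from the device
};

// Returns numerator / denominator, or 0.0 when the denominator is zero.
inline double Ratio(std::uint64_t numerator, std::uint64_t denominator) {
  return denominator == 0
             ? 0.0
             : static_cast<double>(numerator) / static_cast<double>(denominator);
}

// Write amplification (assumed definition): device writes per user write.
inline double WriteAmplification(const IoCounters& c) {
  return Ratio(c.storage_bytes_written, c.user_bytes_written);
}

// Read amplification (assumed definition): device reads per byte returned.
inline double ReadAmplification(const IoCounters& c) {
  return Ratio(c.storage_bytes_read, c.user_bytes_read);
}

// Space amplification (assumed definition): physical size per logical size.
inline double SpaceAmplification(const IoCounters& c) {
  return Ratio(c.storage_bytes_used, c.live_user_bytes);
}
```

For example, if an application writes 100 bytes and compaction causes 1200 bytes of device writes in total, the write amplification under this definition is 12. Lowering one factor typically raises another, which is the tradeoff the paragraph above refers to.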

Start with example usage here: https://github.com/facebook/rocksdb/tree/master/examples

See the GitHub wiki for more explanation.

The public interface is in include/. Callers should not include or rely on the details of any other header files in this package. Those internal APIs may be changed without warning.

Design discussions are conducted in https://www.facebook.com/groups/rocksdb.dev/ and https://rocksdb.slack.com/

License

RocksDB is dual-licensed under both the GPLv2 (found in the COPYING file in the root directory) and Apache 2.0 License (found in the LICENSE.Apache file in the root directory). You may select, at your option, one of the above-listed licenses.