A library that provides an embeddable, persistent key-value store for fast storage.
Go to file
Andrew Kryczka b860a42158 Recover to exact latest seqno of data committed to MANIFEST (#9305)
Summary:
The LastSequence field in the MANIFEST file is the baseline seqno for a recovered DB. Recovering WAL entries might cause the recovered DB's seqno to advance above this baseline, but the recovered DB will never use a smaller seqno.

Before this PR, we were writing the DB's seqno at the time of LogAndApply() as the LastSequence value. This works in the sense that it is a large enough baseline for the recovered DB that it'll never overwrite any records in existing SST files. At the same time, it's arbitrarily larger than what's needed. This behavior comes from LevelDB, where there was no tracking of largest seqno in an SST file.

Now we know the largest seqno of newly written SST files, so we can write an exact value in LastSequence that actually reflects the largest seqno in any file referred to by the MANIFEST. This is primarily useful for correctness testing with unsynced data loss, where the recovered DB's seqno needs to indicate what records were recovered.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/9305

Test Plan:
- https://github.com/facebook/rocksdb/issues/9338 adds crash-recovery correctness testing coverage for WAL disabled use cases
- https://github.com/facebook/rocksdb/issues/9357 will extend that testing to cover file ingestion
- Added assertion at end of LogAndApply() for `VersionSet::descriptor_last_sequence_` consistency with files
- Manually tested upgrade/downgrade compatibility with a custom crash test that randomly picks between a `db_stress` built with and without this PR (for old code it must run with `-disable_wal=0`)

Reviewed By: riversand963

Differential Revision: D33182770

Pulled By: ajkr

fbshipit-source-id: 0bfafaf685f347cc8cb0e1d62e0186340a738f7d
2022-01-05 16:02:21 -08:00
.circleci Fix some CI output (#9193) 2021-11-20 10:21:58 -08:00
.github/workflows Add (& fix) some simple source code checks (#8821) 2021-09-07 21:19:27 -07:00
buckifier Update TARGETS and related scripts (#9310) 2021-12-17 11:51:51 -08:00
build_tools Enable core dumps in ASAN crash tests (#9330) 2021-12-22 10:14:16 -08:00
cache Fix unity build with SUPPORT_CLOCK_CACHE (#9309) 2021-12-17 14:15:07 -08:00
cmake gcc-11 and cmake related cleanup (#9286) 2021-12-17 17:04:35 -08:00
coverage Remove asan_symbolize.py for internal asan build (#8737) 2021-09-07 15:39:11 -07:00
db Recover to exact latest seqno of data committed to MANIFEST (#9305) 2022-01-05 16:02:21 -08:00
db_stress_tool Make the Env class Customizable (#9293) 2022-01-04 16:45:49 -08:00
docs New blog post for Ribbon filter (#8992) 2021-12-28 21:54:39 -08:00
env Make the Env class Customizable (#9293) 2022-01-04 16:45:49 -08:00
examples Add (& fix) some simple source code checks (#8821) 2021-09-07 21:19:27 -07:00
file Fix a bug causing duplicate trailing entries in WritableFile (buffered IO) (#9236) 2021-12-13 09:00:36 -08:00
fuzz Make EventListener into a Customizable Class (#8473) 2021-07-27 07:47:02 -07:00
hdfs Make the Env class Customizable (#9293) 2022-01-04 16:45:49 -08:00
include/rocksdb Make the Env class Customizable (#9293) 2022-01-04 16:45:49 -08:00
java Recover to exact latest seqno of data committed to MANIFEST (#9305) 2022-01-05 16:02:21 -08:00
logging Use system-wide thread ID in info log lines (#9164) 2021-11-12 19:46:06 -08:00
memory Change GTEST_SKIP to BYPASS for MemoryAllocatorTest (#9340) 2021-12-29 03:41:39 -08:00
memtable Minor improvement to CacheReservationManager/WriteBufferManager/CompressionDictBuilding (#9139) 2021-11-05 16:13:47 -07:00
microbench Skip directory fsync for filesystem btrfs (#8903) 2021-11-03 12:21:27 -07:00
monitoring Add tiered storage related read bytes stats to Statistic (#9123) 2021-11-16 15:17:17 -08:00
options Make the Env class Customizable (#9293) 2022-01-04 16:45:49 -08:00
plugin Add initial CMake support to plugin (#9214) 2021-11-30 17:16:53 -08:00
port Make the Env class Customizable (#9293) 2022-01-04 16:45:49 -08:00
table Remove/Reduce use of Regex in ObjectRegistry/Library (#9264) 2021-12-29 07:56:23 -08:00
test_util Make the Env class Customizable (#9293) 2022-01-04 16:45:49 -08:00
third-party Add support for building on s390x platform (#8962) 2021-10-22 10:13:15 -07:00
tools Make the Env class Customizable (#9293) 2022-01-04 16:45:49 -08:00
trace_replay Added TraceOptions::preserve_write_order (#9334) 2021-12-28 15:04:26 -08:00
util Remove/Reduce use of Regex in ObjectRegistry/Library (#9264) 2021-12-29 07:56:23 -08:00
utilities Make the Env class Customizable (#9293) 2022-01-04 16:45:49 -08:00
.clang-format A script that automatically reformat affected lines 2014-01-14 12:21:24 -08:00
.gitignore gitignore cmake-build-* for CLion integration (#7933) 2021-02-19 13:43:15 -08:00
.lgtm.yml Create lgtm.yml for LGTM.com C/C++ analysis (#4058) 2018-06-26 12:43:04 -07:00
.travis.yml Re-enable 390x+cmake* Travis jobs (#9110) 2021-11-03 20:30:15 -07:00
.watchmanconfig Added .watchmanconfig file to rocksdb repo (#5593) 2019-07-19 15:00:33 -07:00
appveyor.yml Remove 2019 from appveyor (#7038) 2020-06-29 14:31:41 -07:00
AUTHORS Update RocksDB Authors File 2017-10-18 14:42:10 -07:00
CMakeLists.txt gcc-11 and cmake related cleanup (#9286) 2021-12-17 17:04:35 -08:00
CODE_OF_CONDUCT.md Adopt Contributor Covenant 2019-08-29 23:21:01 -07:00
CONTRIBUTING.md Add Code of Conduct 2017-12-05 18:42:35 -08:00
COPYING Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
DEFAULT_OPTIONS_HISTORY.md options.delayed_write_rate use the rate of rate_limiter by default. 2017-05-24 09:58:24 -07:00
defs.bzl Make testpilot recognize that these tests have coverage instrumentation 2020-03-20 11:23:23 -07:00
DUMP_FORMAT.md First version of rocksdb_dump and rocksdb_undump. 2015-06-19 16:24:36 -07:00
HISTORY.md Make the Env class Customizable (#9293) 2022-01-04 16:45:49 -08:00
INSTALL.md Update installation instructions (#8158) 2021-04-06 16:02:04 -07:00
issue_template.md Add Google Group to Issue Template 2020-01-28 14:40:37 -08:00
LANGUAGE-BINDINGS.md Update branch name to "main" in README/LANGUAGE_BINDINGS (#8727) 2021-09-01 15:26:34 -07:00
LICENSE.Apache Change RocksDB License 2017-07-15 16:11:23 -07:00
LICENSE.leveldb Add back the LevelDB license file 2017-07-16 18:42:18 -07:00
Makefile Fixes for building RocksJava builds on s390x (#9321) 2021-12-22 12:57:50 -08:00
PLUGINS.md Add ZenFS to plugin list (#8218) 2021-04-22 11:12:40 -07:00
README.md Update branch name to "main" in README/LANGUAGE_BINDINGS (#8727) 2021-09-01 15:26:34 -07:00
ROCKSDB_LITE.md Fix some typos in comments and docs. 2018-03-08 10:27:25 -08:00
src.mk Make MemoryAllocator into a Customizable class (#8980) 2021-12-17 04:20:47 -08:00
TARGETS Update TARGETS and related scripts (#9310) 2021-12-17 11:51:51 -08:00
thirdparty.inc Fix build jemalloc api (#5470) 2019-06-24 17:40:32 -07:00
USERS.md Update USERS.md (#8923) 2021-10-01 16:10:35 -07:00
Vagrantfile Adding CentOS 7 Vagrantfile & build script 2018-02-26 15:27:17 -08:00
WINDOWS_PORT.md Update branch name in WINDOWS_PORT.md (#8745) 2021-09-01 19:26:39 -07:00

RocksDB: A Persistent Key-Value Store for Flash and RAM Storage

CircleCI Status TravisCI Status Appveyor Build status PPC64le Build Status

RocksDB is developed and maintained by Facebook Database Engineering Team. It is built on earlier work on LevelDB by Sanjay Ghemawat (sanjay@google.com) and Jeff Dean (jeff@google.com)

This code is a library that forms the core building block for a fast key-value server, especially suited for storing data on flash drives. It has a Log-Structured-Merge-Database (LSM) design with flexible tradeoffs between Write-Amplification-Factor (WAF), Read-Amplification-Factor (RAF) and Space-Amplification-Factor (SAF). It has multi-threaded compactions, making it especially suitable for storing multiple terabytes of data in a single database.

Start with example usage here: https://github.com/facebook/rocksdb/tree/main/examples

See the github wiki for more explanation.

The public interface is in include/. Callers should not include or rely on the details of any other header files in this package. Those internal APIs may be changed without warning.

Design discussions are conducted in https://www.facebook.com/groups/rocksdb.dev/ and https://rocksdb.slack.com/

License

RocksDB is dual-licensed under both the GPLv2 (found in the COPYING file in the root directory) and Apache 2.0 License (found in the LICENSE.Apache file in the root directory). You may select, at your option, one of the above-listed licenses.