A library that provides an embeddable, persistent key-value store for fast storage.
Go to file
Yanqin Jin 5237b39d2e Fix assertion error during compaction with write-prepared txn enabled (#9105)
Summary:
Pull Request resolved: https://github.com/facebook/rocksdb/pull/9105

The user contract of SingleDelete is that: a SingleDelete can only be issued to
a key that exists and has NOT been updated. For example, application can insert
one key `key`, and uses a SingleDelete to delete it in the future. The `key`
cannot be updated or removed using Delete.
In reality, especially when write-prepared transaction is being used, things
can get tricky. For example, a prepared transaction already writes `key` to the
memtable after a successful Prepare(). Afterwards, should the transaction
rollback, it will insert a Delete into the memtable to cancel out the prior
Put. Consider the following sequence of operations.

```
// operation sequence 1
Begin txn
Put(key)
Prepare()
Flush()

Rollback txn
Flush()
```

There will be two SSTs resulting from above. One of the contains a PUT, while
the second one contains a Delete. It is also known that releasing a snapshot
can lead to an L0 containing only a SD for a particular key. Consider the
following operations following the above block.

```
// operation sequence 2
db->Put(key)
db->SingleDelete(key)
Flush()
```

The operation sequence 2 can result in an L0 with only the SD.

Should there be a snapshot for conflict checking created before operation
sequence 1, then an attempt to compact the db may hit the assertion failure
below, because ikey_.type is Delete (from a rollback).

```
else if (clear_and_output_next_key_) {
  assert(ikey_.type == kTypeValue || ikey_.type == kTypeBlobIndex);
}
```

To fix the assertion failure, we can skip the SingleDelete if we detect an
earlier Delete in the same snapshot interval.

Reviewed By: ltamasi

Differential Revision: D32056848

fbshipit-source-id: 23620a91e28562d91c45cf7e95f414b54b729748
2021-11-05 15:29:18 -07:00
.circleci Fix EnvLibrados and add to CI (#9088) 2021-10-29 08:19:03 -07:00
.github/workflows Add (& fix) some simple source code checks (#8821) 2021-09-07 21:19:27 -07:00
buckifier Update buckify scripts (#9104) 2021-11-01 10:11:18 -07:00
build_tools Fixing more loosely formatted issue (#9140) 2021-11-05 12:21:38 -07:00
cache Add new API CacheReservationManager::GetDummyEntrySize() (#9072) 2021-11-01 14:46:09 -07:00
cmake Add find_dependency() in cmake config file. (#6791) 2020-05-12 21:18:29 -07:00
coverage Remove asan_symbolize.py for internal asan build (#8737) 2021-09-07 15:39:11 -07:00
db Fix assertion error during compaction with write-prepared txn enabled (#9105) 2021-11-05 15:29:18 -07:00
db_stress_tool Avoid div-by-zero error in db_stress (#9086) 2021-10-31 22:16:03 -07:00
docs Misc doc fixes (#8983) 2021-10-07 11:22:17 -07:00
env Skip directory fsync for filesystem btrfs (#8903) 2021-11-03 12:21:27 -07:00
examples Add (& fix) some simple source code checks (#8821) 2021-09-07 21:19:27 -07:00
file Skip directory fsync for filesystem btrfs (#8903) 2021-11-03 12:21:27 -07:00
fuzz Make EventListener into a Customizable Class (#8473) 2021-07-27 07:47:02 -07:00
hdfs fix build with 'USE_HDFS' on windows (#6950) 2020-06-12 16:21:50 -07:00
include/rocksdb Improve comments on options.writable_file_max_buffer_size (#9131) 2021-11-04 16:38:09 -07:00
java Regression tests for tickets fixed by previous change. (#9019) 2021-11-01 15:06:47 -07:00
logging Make SystemClock into a Customizable Class (#8636) 2021-09-21 09:23:48 -07:00
memory Fix and detect headers with missing dependencies (#8893) 2021-09-10 10:00:26 -07:00
memtable Fix race in WriteBufferManager (#9009) 2021-10-12 00:16:21 -07:00
microbench Skip directory fsync for filesystem btrfs (#8903) 2021-11-03 12:21:27 -07:00
monitoring remove bad extra RecordTick(stats_, WRITE_WITH_WAL) (#9064) 2021-11-01 11:43:14 -07:00
options Fix -Werror=type-limits seen in Travis (#9128) 2021-11-04 15:01:10 -07:00
plugin Makefile support to statically link external plugin code (#7918) 2021-02-10 08:35:34 -08:00
port Make FileSystem a Customizable Class (#8649) 2021-11-02 09:07:11 -07:00
table Deallocate payload of BlockBasedTableBuilder::Rep::FilterBlockBuilder earlier for Full/PartitionedFilter (#9070) 2021-11-04 13:35:38 -07:00
test_util Make FileSystem a Customizable Class (#8649) 2021-11-02 09:07:11 -07:00
third-party Add support for building on s390x platform (#8962) 2021-10-22 10:13:15 -07:00
tools Add options.manual_wal_flush to db_bench (#9132) 2021-11-04 16:03:47 -07:00
trace_replay Cleanup includes in dbformat.h (#8930) 2021-09-29 04:04:40 -07:00
util Sanitize negative request bytes in GenericRateLimiter::Request and clarify API (#9112) 2021-11-04 10:11:53 -07:00
utilities Fix assertion error during compaction with write-prepared txn enabled (#9105) 2021-11-05 15:29:18 -07:00
.clang-format A script that automatically reformat affected lines 2014-01-14 12:21:24 -08:00
.gitignore gitignore cmake-build-* for CLion integration (#7933) 2021-02-19 13:43:15 -08:00
.lgtm.yml Create lgtm.yml for LGTM.com C/C++ analysis (#4058) 2018-06-26 12:43:04 -07:00
.travis.yml Re-enable 390x+cmake* Travis jobs (#9110) 2021-11-03 20:30:15 -07:00
.watchmanconfig Added .watchmanconfig file to rocksdb repo (#5593) 2019-07-19 15:00:33 -07:00
appveyor.yml Remove 2019 from appveyor (#7038) 2020-06-29 14:31:41 -07:00
AUTHORS Update RocksDB Authors File 2017-10-18 14:42:10 -07:00
CMakeLists.txt Add support for building on s390x platform (#8962) 2021-10-22 10:13:15 -07:00
CODE_OF_CONDUCT.md Adopt Contributor Covenant 2019-08-29 23:21:01 -07:00
CONTRIBUTING.md Add Code of Conduct 2017-12-05 18:42:35 -08:00
COPYING Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
DEFAULT_OPTIONS_HISTORY.md options.delayed_write_rate use the rate of rate_limiter by default. 2017-05-24 09:58:24 -07:00
defs.bzl Make testpilot recognize that these tests have coverage instrumentation 2020-03-20 11:23:23 -07:00
DUMP_FORMAT.md First version of rocksdb_dump and rocksdb_undump. 2015-06-19 16:24:36 -07:00
HISTORY.md Fix assertion error during compaction with write-prepared txn enabled (#9105) 2021-11-05 15:29:18 -07:00
INSTALL.md Update installation instructions (#8158) 2021-04-06 16:02:04 -07:00
issue_template.md Add Google Group to Issue Template 2020-01-28 14:40:37 -08:00
LANGUAGE-BINDINGS.md Update branch name to "main" in README/LANGUAGE_BINDINGS (#8727) 2021-09-01 15:26:34 -07:00
LICENSE.Apache Change RocksDB License 2017-07-15 16:11:23 -07:00
LICENSE.leveldb Add back the LevelDB license file 2017-07-16 18:42:18 -07:00
Makefile Skip directory fsync for filesystem btrfs (#8903) 2021-11-03 12:21:27 -07:00
PLUGINS.md Add ZenFS to plugin list (#8218) 2021-04-22 11:12:40 -07:00
README.md Update branch name to "main" in README/LANGUAGE_BINDINGS (#8727) 2021-09-01 15:26:34 -07:00
ROCKSDB_LITE.md Fix some typos in comments and docs. 2018-03-08 10:27:25 -08:00
src.mk Experimental support for SST unique IDs (#8990) 2021-10-18 23:32:01 -07:00
TARGETS internal_repo_rocksdb/repo 2021-10-29 19:34:39 -07:00
thirdparty.inc Fix build jemalloc api (#5470) 2019-06-24 17:40:32 -07:00
USERS.md Update USERS.md (#8923) 2021-10-01 16:10:35 -07:00
Vagrantfile Adding CentOS 7 Vagrantfile & build script 2018-02-26 15:27:17 -08:00
WINDOWS_PORT.md Update branch name in WINDOWS_PORT.md (#8745) 2021-09-01 19:26:39 -07:00

RocksDB: A Persistent Key-Value Store for Flash and RAM Storage

CircleCI Status TravisCI Status Appveyor Build status PPC64le Build Status

RocksDB is developed and maintained by Facebook Database Engineering Team. It is built on earlier work on LevelDB by Sanjay Ghemawat (sanjay@google.com) and Jeff Dean (jeff@google.com)

This code is a library that forms the core building block for a fast key-value server, especially suited for storing data on flash drives. It has a Log-Structured-Merge-Database (LSM) design with flexible tradeoffs between Write-Amplification-Factor (WAF), Read-Amplification-Factor (RAF) and Space-Amplification-Factor (SAF). It has multi-threaded compactions, making it especially suitable for storing multiple terabytes of data in a single database.

Start with example usage here: https://github.com/facebook/rocksdb/tree/main/examples

See the github wiki for more explanation.

The public interface is in include/. Callers should not include or rely on the details of any other header files in this package. Those internal APIs may be changed without warning.

Design discussions are conducted in https://www.facebook.com/groups/rocksdb.dev/ and https://rocksdb.slack.com/

License

RocksDB is dual-licensed under both the GPLv2 (found in the COPYING file in the root directory) and Apache 2.0 License (found in the LICENSE.Apache file in the root directory). You may select, at your option, one of the above-listed licenses.