A library that provides an embeddable, persistent key-value store for fast storage.
Go to file
Herman Lee d9d456de49 Fix locktree accesses to PessimisticTransactions (#9898)
Summary:
The current locktree implementation stores the address of the
PessimisticTransactions object as the TXNID. However, when a transaction
is blocked on a lock, it records the list of waitees with conflicting
locks using the rocksdb assigned TransactionID. This is performed by
calling GetID() on PessimisticTransactions objects of the waitees,
and then recorded in the waiter's list.

However, there is no guarantee the objects are valid when recording the
waitee list during the conflict callbacks because the waitee
could have released the lock and freed the PessimisticTransactions
object.

The waitee/txnid values are only valid PessimisticTransaction objects
while the mutex for the root of the locktree is held.

The simplest fix for this problem is to use the address of the
PessimisticTransaction as the TransactionID so that it is consistent
with its usage in the locktree. The TXNID is only converted back to a
PessimisticTransaction for the report_wait callbacks. Since
these callbacks are now all made within the critical section where the
lock_request queue mutx is held, these conversions will be safe.
Otherwise, only the uint64_t TXNID of the waitee is registerd
with the waiter transaction. The PessimisitcTransaction object of the
waitee is never referenced.

The main downside of this approach is the TransactionID will not change
if the PessimisticTransaction object is reused for new transactions.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/9898

Test Plan:
Add a new test case and run unit tests.
Also verified with MyRocks workloads using range locks that the
crash no longer happens.

Reviewed By: riversand963

Differential Revision: D35950376

Pulled By: hermanlee

fbshipit-source-id: 8c9cae272e23e487fc139b6a8ed5b8f8f24b1570
2022-04-27 09:12:52 -07:00
.circleci Misc CI improvements / additions (#9859) 2022-04-18 20:26:37 -07:00
.github/workflows Use released clang-format instead of the one from dev branch (#9646) 2022-03-01 10:51:38 -08:00
buckifier Remove flaky servicelab metrics DBPut P95/P99 (#9844) 2022-04-15 13:56:22 -07:00
build_tools Add "no compression" job to CircleCI (#9850) 2022-04-18 12:47:16 -07:00
cache Expose CacheEntryRole and map keys for block cache stat collections (#9838) 2022-04-14 09:38:55 -07:00
cmake gcc-11 and cmake related cleanup (#9286) 2021-12-17 17:04:35 -08:00
coverage Fix commit_prereq and other targets (#9797) 2022-04-04 09:58:18 -07:00
db Eliminate unnecessary (slow) block cache Ref()ing in MultiGet (#9899) 2022-04-26 21:59:24 -07:00
db_stress_tool RocksDB: fix bug in crash-recovery correctness testing (#9897) 2022-04-27 06:01:09 -07:00
docs Bump nokogiri from 1.13.3 to 1.13.4 in /docs (#9831) 2022-04-12 09:07:14 -07:00
env Add experimental new FS API AbortIO to cancel read request (#9901) 2022-04-25 14:20:03 -07:00
examples Remove BlockBasedTableOptions.hash_index_allow_collision (#9454) 2022-03-01 13:58:02 -08:00
file Add stats related to async prefetching (#9845) 2022-04-25 21:58:22 -07:00
fuzz Fix compilation errors and add fuzzers to CircleCI (#9420) 2022-02-01 10:32:15 -08:00
include/rocksdb Eliminate unnecessary (slow) block cache Ref()ing in MultiGet (#9899) 2022-04-26 21:59:24 -07:00
java Revert open logic changes in #9634 (#9906) 2022-04-26 14:46:53 -07:00
logging Use system-wide thread ID in info log lines (#9164) 2021-11-12 19:46:06 -08:00
memory Meta-internal folly integration with F14FastMap (#9546) 2022-04-13 07:34:01 -07:00
memtable Account memory of big memory users in BlockBasedTable in global memory limit (#9748) 2022-04-06 10:33:00 -07:00
microbench Add mmap DBGet microbench parameters (#9903) 2022-04-26 16:46:39 -07:00
monitoring Revert open logic changes in #9634 (#9906) 2022-04-26 14:46:53 -07:00
options Misc CI improvements / additions (#9859) 2022-04-18 20:26:37 -07:00
plugin Add initial CMake support to plugin (#9214) 2021-11-30 17:16:53 -08:00
port Meta-internal folly integration with F14FastMap (#9546) 2022-04-13 07:34:01 -07:00
table Eliminate unnecessary (slow) block cache Ref()ing in MultiGet (#9899) 2022-04-26 21:59:24 -07:00
test_util Remove BlockBasedTableOptions.hash_index_allow_collision (#9454) 2022-03-01 13:58:02 -08:00
third-party Meta-internal folly integration with F14FastMap (#9546) 2022-04-13 07:34:01 -07:00
tools Add 95% confidence intervals to db_bench output (#9882) 2022-04-25 14:49:54 -07:00
trace_replay Added TraceOptions::preserve_write_order (#9334) 2021-12-28 15:04:26 -08:00
util Eliminate unnecessary (slow) block cache Ref()ing in MultiGet (#9899) 2022-04-26 21:59:24 -07:00
utilities Fix locktree accesses to PessimisticTransactions (#9898) 2022-04-27 09:12:52 -07:00
.clang-format A script that automatically reformat affected lines 2014-01-14 12:21:24 -08:00
.gitignore Meta-internal folly integration with F14FastMap (#9546) 2022-04-13 07:34:01 -07:00
.lgtm.yml Create lgtm.yml for LGTM.com C/C++ analysis (#4058) 2018-06-26 12:43:04 -07:00
.travis.yml Fix remaining uses of "backupable" (#9792) 2022-04-05 09:52:33 -07:00
.watchmanconfig Added .watchmanconfig file to rocksdb repo (#5593) 2019-07-19 15:00:33 -07:00
AUTHORS Update RocksDB Authors File 2017-10-18 14:42:10 -07:00
CMakeLists.txt Eliminate unnecessary (slow) block cache Ref()ing in MultiGet (#9899) 2022-04-26 21:59:24 -07:00
CODE_OF_CONDUCT.md Adopt Contributor Covenant 2019-08-29 23:21:01 -07:00
CONTRIBUTING.md Add Code of Conduct 2017-12-05 18:42:35 -08:00
COPYING Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
crash_test.mk Add new determinators for multiops transactions stress test (#9708) 2022-03-23 22:29:50 -07:00
DEFAULT_OPTIONS_HISTORY.md Add Options::DisableExtraChecks, clarify force_consistency_checks (#9363) 2022-01-18 17:31:03 -08:00
DUMP_FORMAT.md First version of rocksdb_dump and rocksdb_undump. 2015-06-19 16:24:36 -07:00
HISTORY.md Eliminate unnecessary (slow) block cache Ref()ing in MultiGet (#9899) 2022-04-26 21:59:24 -07:00
INSTALL.md Update supported VS versions in INSTALL.md (#9823) 2022-04-13 13:03:40 -07:00
issue_template.md Add Google Group to Issue Template 2020-01-28 14:40:37 -08:00
LANGUAGE-BINDINGS.md Update branch name to "main" in README/LANGUAGE_BINDINGS (#8727) 2021-09-01 15:26:34 -07:00
LICENSE.Apache Change RocksDB License 2017-07-15 16:11:23 -07:00
LICENSE.leveldb Add back the LevelDB license file 2017-07-16 18:42:18 -07:00
Makefile Misc CI improvements / additions (#9859) 2022-04-18 20:26:37 -07:00
PLUGINS.md Move RADOS support to separate repo (#9206) 2022-01-24 22:50:07 -08:00
python.mk crash_test Makefile refactoring, add to CircleCI (#9702) 2022-03-16 15:58:06 -07:00
README.md README: De-list slack channel, list Google group (#9387) 2022-01-18 08:19:48 -08:00
ROCKSDB_LITE.md Fix remaining uses of "backupable" (#9792) 2022-04-05 09:52:33 -07:00
src.mk Eliminate unnecessary (slow) block cache Ref()ing in MultiGet (#9899) 2022-04-26 21:59:24 -07:00
TARGETS Eliminate unnecessary (slow) block cache Ref()ing in MultiGet (#9899) 2022-04-26 21:59:24 -07:00
thirdparty.inc Fix build jemalloc api (#5470) 2019-06-24 17:40:32 -07:00
USERS.md Add Solana's RocksDB use case in USERS.md (#9558) 2022-02-16 09:23:01 -08:00
Vagrantfile Adding CentOS 7 Vagrantfile & build script 2018-02-26 15:27:17 -08:00
WINDOWS_PORT.md Update branch name in WINDOWS_PORT.md (#8745) 2021-09-01 19:26:39 -07:00

RocksDB: A Persistent Key-Value Store for Flash and RAM Storage

CircleCI Status TravisCI Status Appveyor Build status PPC64le Build Status

RocksDB is developed and maintained by Facebook Database Engineering Team. It is built on earlier work on LevelDB by Sanjay Ghemawat (sanjay@google.com) and Jeff Dean (jeff@google.com)

This code is a library that forms the core building block for a fast key-value server, especially suited for storing data on flash drives. It has a Log-Structured-Merge-Database (LSM) design with flexible tradeoffs between Write-Amplification-Factor (WAF), Read-Amplification-Factor (RAF) and Space-Amplification-Factor (SAF). It has multi-threaded compactions, making it especially suitable for storing multiple terabytes of data in a single database.

Start with example usage here: https://github.com/facebook/rocksdb/tree/main/examples

See the github wiki for more explanation.

The public interface is in include/. Callers should not include or rely on the details of any other header files in this package. Those internal APIs may be changed without warning.

Questions and discussions are welcome on the RocksDB Developers Public Facebook group and email list on Google Groups.

License

RocksDB is dual-licensed under both the GPLv2 (found in the COPYING file in the root directory) and Apache 2.0 License (found in the LICENSE.Apache file in the root directory). You may select, at your option, one of the above-listed licenses.