A library that provides an embeddable, persistent key-value store for fast storage.
Giuseppe Ottaviano 3ebe8658d0 Fix race in WriteBufferManager (#9009)
Summary:
`EndWriteStall()` has a data race: `queue_.empty()` is checked outside of the
mutex, so by the time we enter the critical section another thread may already
have cleared the list, and calling `front()` on it is undefined behavior (and
causes interesting crashes under high concurrency).
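
As a rough illustration of the check-then-act race described above, here is a minimal sketch using only the standard library. `StallQueue`, `Push`, and `TryPop` are hypothetical names chosen for this example and are not taken from the RocksDB sources; the point is only that the emptiness check and the `front()` access sit in the same critical section.

```cpp
#include <cassert>
#include <list>
#include <mutex>

// Hypothetical stand-in for a queue of stalled writers (not the RocksDB class).
class StallQueue {
 public:
  void Push(int writer_id) {
    std::lock_guard<std::mutex> lock(mu_);
    queue_.push_back(writer_id);
  }

  // Racy pattern, shown only as a comment:
  //   if (queue_.empty()) return false;        // checked without the mutex
  //   std::lock_guard<std::mutex> lock(mu_);
  //   *out = queue_.front();                   // UB if the list was emptied
  //
  // Safe pattern: empty() and front() are evaluated under the same lock.
  bool TryPop(int* out) {
    std::lock_guard<std::mutex> lock(mu_);
    if (queue_.empty()) return false;
    *out = queue_.front();
    queue_.pop_front();
    return true;
  }

 private:
  std::mutex mu_;
  std::list<int> queue_;
};
```

With this shape, a caller that loses the race simply gets `false` back instead of dereferencing an empty list.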

This PR fixes the bug, and also rewrites the logic to make it easier to reason
about. It also fixes another subtle bug: if some writers are stalled and
`SetBufferSize(0)` is called, which disables the WBM, the writers are not
unblocked because of an early `enabled()` check in `EndWriteStall()`.

It doesn't significantly change the locking behavior: as before, writers won't
lock unless entering a stall condition, and `FreeMem` almost always locks if
stalling is allowed, but that is inevitable with the current design. Liveness is
guaranteed by the fact that if some writes are blocked, eventually all writes
will be blocked due to `stall_active_`, and eventually all memory is freed.

While at it, do a couple of optimizations:

- In `WBMStallInterface::Signal()` signal the CV only after releasing the
  lock. Signaling under the lock is a common pitfall, as it causes the woken-up
  thread to immediately go back to sleep because the mutex is still locked by
  the awaker.

- Move all allocations and deallocations outside of the lock.
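
The two optimizations above can be sketched with the standard library alone. This is an illustrative sketch under assumptions, not the code from the PR: `Gate` and `Pending` are hypothetical classes. `Gate::Open()` updates the flag under the mutex and notifies only after the lock is released; `Pending` allocates list nodes before taking the lock and hands them back to the caller so they are freed after the lock is released.

```cpp
#include <cassert>
#include <condition_variable>
#include <list>
#include <mutex>

// Hypothetical one-shot gate: waiters block until Open() is called.
class Gate {
 public:
  void Wait() {
    std::unique_lock<std::mutex> lock(mu_);
    cv_.wait(lock, [this] { return open_; });
  }

  void Open() {
    {
      std::lock_guard<std::mutex> lock(mu_);
      open_ = true;
    }  // mutex released here
    cv_.notify_all();  // woken threads can acquire the mutex right away
  }

  bool IsOpen() {
    std::lock_guard<std::mutex> lock(mu_);
    return open_;
  }

 private:
  std::mutex mu_;
  std::condition_variable cv_;
  bool open_ = false;
};

// Hypothetical pending list that keeps allocation and deallocation
// outside of the critical section.
class Pending {
 public:
  void Add(int value) {
    std::list<int> node{value};  // node is allocated before locking
    std::lock_guard<std::mutex> lock(mu_);
    items_.splice(items_.end(), node);  // O(1) relink, no allocation
  }

  std::list<int> TakeAll() {
    std::list<int> taken;
    {
      std::lock_guard<std::mutex> lock(mu_);
      taken.swap(items_);  // O(1) under the lock
    }
    return taken;  // nodes are freed by the caller, after the unlock
  }

 private:
  std::mutex mu_;
  std::list<int> items_;
};
```

In a real program `Wait()` and `Open()` would run on different threads; the flag is still written under the mutex, so a waiter cannot miss the wakeup even though the notification happens after the unlock.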

Pull Request resolved: https://github.com/facebook/rocksdb/pull/9009

Test Plan:
```
USE_CLANG=1 make -j64 all check
```

Reviewed By: akankshamahajan15

Differential Revision: D31550668

Pulled By: ot

fbshipit-source-id: 5125387c3dc7ecaaa2b8bbc736e58c4156698580
2021-10-14 10:44:46 -07:00
.circleci Fix and detect headers with missing dependencies (#8893) 2021-09-10 10:00:26 -07:00
.github/workflows Add (& fix) some simple source code checks (#8821) 2021-09-07 21:19:27 -07:00
buckifier Modify script which generates TARGETS (#8366) 2021-06-04 16:28:59 -07:00
build_tools Adjust contrun name (#8924) 2021-09-16 15:06:30 -07:00
cache Fix and detect headers with missing dependencies (#8893) 2021-09-10 10:00:26 -07:00
cmake Add find_dependency() in cmake config file. (#6791) 2020-05-12 21:18:29 -07:00
coverage Remove asan_symbolize.py for internal asan build (#8737) 2021-09-07 15:39:11 -07:00
db Fix race in WriteBufferManager (#9009) 2021-10-14 10:44:46 -07:00
db_stress_tool Protect existing files in FaultInjectionTest{Env,FS}::ReopenWritableFile() (#8995) 2021-10-11 16:39:36 -07:00
docs Fix minor typo in blog post (#8906) 2021-09-13 10:31:19 -07:00
env Add a gflag for IO uring enable/disable (#8931) 2021-09-18 10:24:56 -07:00
examples Add (& fix) some simple source code checks (#8821) 2021-09-07 21:19:27 -07:00
file RandomAccessFileReader::MultiRead() should not return read bytes not read (#8941) 2021-09-22 15:26:10 -07:00
fuzz Make EventListener into a Customizable Class (#8473) 2021-07-27 07:47:02 -07:00
hdfs fix build with 'USE_HDFS' on windows (#6950) 2020-06-12 16:21:50 -07:00
include/rocksdb Fix race in WriteBufferManager (#9009) 2021-10-14 10:44:46 -07:00
java Add (& fix) some simple source code checks (#8821) 2021-09-07 21:19:27 -07:00
logging Do not attempt to rename non-existent info log (#8622) 2021-08-04 17:25:00 -07:00
memory Fix and detect headers with missing dependencies (#8893) 2021-09-10 10:00:26 -07:00
memtable Fix race in WriteBufferManager (#9009) 2021-10-14 10:44:46 -07:00
microbench Add micro-benchmark support (#8493) 2021-07-08 18:22:45 -07:00
monitoring Added a default Name method to Statistics (#8918) 2021-09-17 07:25:43 -07:00
options Make Statistics a Customizable Class (#8637) 2021-09-10 09:47:39 -07:00
plugin Makefile support to statically link external plugin code (#7918) 2021-02-10 08:35:34 -08:00
port Replace most typedef with using= (#8751) 2021-09-07 11:31:59 -07:00
table Batch blob read IO for MultiGet (#8699) 2021-09-17 19:23:13 -07:00
test_util Make MemTableRepFactory into a Customizable class (#8419) 2021-09-08 07:46:44 -07:00
third-party Fix a compilation error in CircleCI vs2019 CXX20 (#8090) 2021-03-23 10:28:04 -07:00
tools Protect existing files in FaultInjectionTest{Env,FS}::ReopenWritableFile() (#8995) 2021-10-11 16:39:36 -07:00
trace_replay Replace std::shared_ptr<SystemClock> by SystemClock* in TraceExecutionHandler (#8729) 2021-08-31 11:24:27 -07:00
util Return Status::NotSupported() in RateLimiter::GetTotalPendingRequests default impl (#8950) 2021-09-22 21:41:04 -07:00
utilities Protect existing files in FaultInjectionTest{Env,FS}::ReopenWritableFile() (#8995) 2021-10-11 16:39:36 -07:00
.clang-format A script that automatically reformat affected lines 2014-01-14 12:21:24 -08:00
.gitignore gitignore cmake-build-* for CLion integration (#7933) 2021-02-19 13:43:15 -08:00
.lgtm.yml Create lgtm.yml for LGTM.com C/C++ analysis (#4058) 2018-06-26 12:43:04 -07:00
.travis.yml Move arm build from travis to circleci (#8203) 2021-04-19 20:07:02 -07:00
.watchmanconfig Added .watchmanconfig file to rocksdb repo (#5593) 2019-07-19 15:00:33 -07:00
appveyor.yml Remove 2019 from appveyor (#7038) 2020-06-29 14:31:41 -07:00
AUTHORS Update RocksDB Authors File 2017-10-18 14:42:10 -07:00
CMakeLists.txt Improve support for using regexes (#8740) 2021-09-07 13:05:23 -07:00
CODE_OF_CONDUCT.md Adopt Contributor Covenant 2019-08-29 23:21:01 -07:00
CONTRIBUTING.md Add Code of Conduct 2017-12-05 18:42:35 -08:00
COPYING Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
DEFAULT_OPTIONS_HISTORY.md options.delayed_write_rate use the rate of rate_limiter by default. 2017-05-24 09:58:24 -07:00
defs.bzl Make testpilot recognize that these tests have coverage instrumentation 2020-03-20 11:23:23 -07:00
DUMP_FORMAT.md First version of rocksdb_dump and rocksdb_undump. 2015-06-19 16:24:36 -07:00
HISTORY.md update HISTORY.md and version.h for 6.25.2 2021-10-11 16:42:46 -07:00
INSTALL.md Update installation instructions (#8158) 2021-04-06 16:02:04 -07:00
issue_template.md Add Google Group to Issue Template 2020-01-28 14:40:37 -08:00
LANGUAGE-BINDINGS.md Update branch name to "main" in README/LANGUAGE_BINDINGS (#8727) 2021-09-01 15:26:34 -07:00
LICENSE.Apache Change RocksDB License 2017-07-15 16:11:23 -07:00
LICENSE.leveldb Add back the LevelDB license file 2017-07-16 18:42:18 -07:00
Makefile Fix and detect headers with missing dependencies (#8893) 2021-09-10 10:00:26 -07:00
PLUGINS.md Add ZenFS to plugin list (#8218) 2021-04-22 11:12:40 -07:00
README.md Update branch name to "main" in README/LANGUAGE_BINDINGS (#8727) 2021-09-01 15:26:34 -07:00
ROCKSDB_LITE.md Fix some typos in comments and docs. 2018-03-08 10:27:25 -08:00
src.mk Improve support for using regexes (#8740) 2021-09-07 13:05:23 -07:00
TARGETS Improve support for using regexes (#8740) 2021-09-07 13:05:23 -07:00
thirdparty.inc Fix build jemalloc api (#5470) 2019-06-24 17:40:32 -07:00
USERS.md Add Kafka to USERS (#8911) 2021-09-14 10:26:15 -07:00
Vagrantfile Adding CentOS 7 Vagrantfile & build script 2018-02-26 15:27:17 -08:00
WINDOWS_PORT.md Update branch name in WINDOWS_PORT.md (#8745) 2021-09-01 19:26:39 -07:00

RocksDB: A Persistent Key-Value Store for Flash and RAM Storage


RocksDB is developed and maintained by the Facebook Database Engineering Team. It is built on earlier work on LevelDB by Sanjay Ghemawat (sanjay@google.com) and Jeff Dean (jeff@google.com).

This code is a library that forms the core building block for a fast key-value server, especially suited for storing data on flash drives. It has a Log-Structured-Merge-Database (LSM) design with flexible tradeoffs between Write-Amplification-Factor (WAF), Read-Amplification-Factor (RAF) and Space-Amplification-Factor (SAF). It has multi-threaded compactions, making it especially suitable for storing multiple terabytes of data in a single database.
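
To make the LSM idea and its amplification tradeoffs a little more concrete, here is a toy sketch in standard C++. It is not RocksDB's implementation or API: `ToyLsm` and all of its members are hypothetical, there is no persistence, and the "runs" are plain in-memory maps. Writes go to a small memtable that is flushed into immutable sorted runs; reads probe the memtable and then the runs from newest to oldest, so more runs means more work per read; compaction rewrites data to reduce the number of runs.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy LSM sketch: one mutable memtable plus immutable sorted runs (newest last).
class ToyLsm {
 public:
  explicit ToyLsm(std::size_t memtable_limit) : limit_(memtable_limit) {}

  void Put(const std::string& key, const std::string& value) {
    memtable_[key] = value;
    if (memtable_.size() >= limit_) Flush();
  }

  // Looks in the memtable first, then in runs from newest to oldest.
  // The number of runs probed is a crude proxy for read amplification.
  bool Get(const std::string& key, std::string* value) const {
    auto it = memtable_.find(key);
    if (it != memtable_.end()) {
      *value = it->second;
      return true;
    }
    for (auto run = runs_.rbegin(); run != runs_.rend(); ++run) {
      auto hit = run->find(key);
      if (hit != run->end()) {
        *value = hit->second;
        return true;
      }
    }
    return false;
  }

  // Merges all runs into one, keeping the newest value for each key.
  // Rewriting existing data is where write amplification comes from;
  // dropping overwritten values is what reduces space amplification.
  void Compact() {
    std::map<std::string, std::string> merged;
    for (const auto& run : runs_) {
      for (const auto& kv : run) merged[kv.first] = kv.second;  // newer wins
    }
    runs_.clear();
    if (!merged.empty()) runs_.push_back(std::move(merged));
  }

  std::size_t NumRuns() const { return runs_.size(); }

 private:
  void Flush() {
    runs_.push_back(std::move(memtable_));
    memtable_.clear();  // leave the moved-from map in a known empty state
  }

  std::size_t limit_;
  std::map<std::string, std::string> memtable_;
  std::vector<std::map<std::string, std::string>> runs_;
};
```

Compacting more often keeps `NumRuns()` low and reads cheap at the cost of rewriting data more times; compacting less often does the opposite.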

Start with example usage here: https://github.com/facebook/rocksdb/tree/main/examples

See the GitHub wiki for more explanation.

The public interface is in include/. Callers should not include or rely on the details of any other header files in this package. Those internal APIs may be changed without warning.

Design discussions are conducted in https://www.facebook.com/groups/rocksdb.dev/ and https://rocksdb.slack.com/

License

RocksDB is dual-licensed under both the GPLv2 (found in the COPYING file in the root directory) and Apache 2.0 License (found in the LICENSE.Apache file in the root directory). You may select, at your option, one of the above-listed licenses.