18eeb7b90e
Summary: This is a trivial fix for OOMs we've seen a few days ago in logdevice. RocksDB get into the following state: (1) Write throughput is too high for flushes to keep up. Compactions are out of the picture - automatic compactions are disabled, and for manual compactions we don't care that much if they fall behind. We write to many CFs, with only a few L0 sst files in each, so compactions are not needed most of the time. (2) total_log_size_ is consistently greater than GetMaxTotalWalSize(). It doesn't get smaller since flushes are falling ever further behind. (3) Total size of memtables is way above db_write_buffer_size and keeps growing. But the write_buffer_manager_->ShouldFlush() is not checked because (2) prevents it (for no good reason, afaict; this is what this commit fixes). (4) Every call to WriteImpl() hits the MaybeFlushColumnFamilies() path. This keeps flushing the memtables one by one in order of increasing log file number. (5) No write stalling trigger is hit. We rely on max_write_buffer_number Closes https://github.com/facebook/rocksdb/pull/1893 Differential Revision: D4593590 Pulled By: yiwu-arbug fbshipit-source-id: af79c5f |
||
---|---|---|
arcanist_util | ||
build_tools | ||
cmake/modules | ||
coverage | ||
db | ||
docs | ||
examples | ||
hdfs | ||
include/rocksdb | ||
java | ||
memtable | ||
port | ||
table | ||
third-party | ||
tools | ||
util | ||
utilities | ||
.arcconfig | ||
.clang-format | ||
.gitignore | ||
.travis.yml | ||
appveyor.yml | ||
AUTHORS | ||
CMakeLists.txt | ||
CONTRIBUTING.md | ||
DEFAULT_OPTIONS_HISTORY.md | ||
DUMP_FORMAT.md | ||
HISTORY.md | ||
INSTALL.md | ||
LANGUAGE-BINDINGS.md | ||
LICENSE | ||
Makefile | ||
PATENTS | ||
README.md | ||
ROCKSDB_LITE.md | ||
src.mk | ||
thirdparty.inc | ||
USERS.md | ||
Vagrantfile | ||
WINDOWS_PORT.md |
RocksDB: A Persistent Key-Value Store for Flash and RAM Storage
RocksDB is developed and maintained by Facebook Database Engineering Team. It is built on earlier work on LevelDB by Sanjay Ghemawat (sanjay@google.com) and Jeff Dean (jeff@google.com)
This code is a library that forms the core building block for a fast key value server, especially suited for storing data on flash drives. It has a Log-Structured-Merge-Database (LSM) design with flexible tradeoffs between Write-Amplification-Factor (WAF), Read-Amplification-Factor (RAF) and Space-Amplification-Factor (SAF). It has multi-threaded compactions, making it specially suitable for storing multiple terabytes of data in a single database.
Start with example usage here: https://github.com/facebook/rocksdb/tree/master/examples
See the github wiki for more explanation.
The public interface is in include/
. Callers should not include or
rely on the details of any other header files in this package. Those
internal APIs may be changed without warning.
Design discussions are conducted in https://www.facebook.com/groups/rocksdb.dev/