b5c99cc908
Summary: FILE_FLAG_WRITE_THROUGH is for disabling device on-board cache in windows API, which should be disabled if user doesn't need system cache. There was a perf issue related with this, we found during memtable flush, the high percentile latency jumps significantly. During profiling, we found those high latency (P99.9) read requests got queue-jumped by write requests from memtable flush and takes 80ms or even more time to wait, even when SSD overall IO throughput is relatively low. After enabling FILE_FLAG_WRITE_THROUGH, we rerun the test found high percentile latency drops a lot without observable impact on writes. Scenario 1: 40MB/s + 40MB/s R/W compaction throughput Original | FILE_FLAG_WRITE_THROUGH | Percentage reduction --------------------------------------------------------------- P99.9 | 56.897 ms | 35.593 ms | -37.4% P99 | 3.905 ms | 3.896 ms | -2.8% Scenario 2: 14MB/s + 14MB/s R/W compaction throughput, cohosted with 100+ other rocksdb instances have manually triggered memtable flush operations (memtable is tiny), creating a lot of randomized the small file writes operations during test. Original | FILE_FLAG_WRITE_THROUGH | Percentage reduction --------------------------------------------------------------- P99.9 | 86.227 ms | 50.436 ms | -41.5% P99 | 8.415 ms | 3.356 ms | -60.1% Closes https://github.com/facebook/rocksdb/pull/3225 Differential Revision: D6624174 Pulled By: miasantreble fbshipit-source-id: 321b86aee9d74470840c70e5d0d4fa9880660a91 |
||
---|---|---|
buckifier | ||
build_tools | ||
cache | ||
cmake | ||
coverage | ||
db | ||
docs | ||
env | ||
examples | ||
hdfs | ||
include/rocksdb | ||
java | ||
memtable | ||
monitoring | ||
options | ||
port | ||
table | ||
third-party | ||
tools | ||
util | ||
utilities | ||
.clang-format | ||
.gitignore | ||
.travis.yml | ||
appveyor.yml | ||
AUTHORS | ||
CMakeLists.txt | ||
CODE_OF_CONDUCT.md | ||
CONTRIBUTING.md | ||
COPYING | ||
DEFAULT_OPTIONS_HISTORY.md | ||
DUMP_FORMAT.md | ||
HISTORY.md | ||
INSTALL.md | ||
issue_template.md | ||
LANGUAGE-BINDINGS.md | ||
LICENSE.Apache | ||
LICENSE.leveldb | ||
Makefile | ||
README.md | ||
ROCKSDB_LITE.md | ||
src.mk | ||
TARGETS | ||
thirdparty.inc | ||
USERS.md | ||
Vagrantfile | ||
WINDOWS_PORT.md |
RocksDB: A Persistent Key-Value Store for Flash and RAM Storage
RocksDB is developed and maintained by Facebook Database Engineering Team. It is built on earlier work on LevelDB by Sanjay Ghemawat (sanjay@google.com) and Jeff Dean (jeff@google.com)
This code is a library that forms the core building block for a fast key value server, especially suited for storing data on flash drives. It has a Log-Structured-Merge-Database (LSM) design with flexible tradeoffs between Write-Amplification-Factor (WAF), Read-Amplification-Factor (RAF) and Space-Amplification-Factor (SAF). It has multi-threaded compactions, making it specially suitable for storing multiple terabytes of data in a single database.
Start with example usage here: https://github.com/facebook/rocksdb/tree/master/examples
See the github wiki for more explanation.
The public interface is in include/
. Callers should not include or
rely on the details of any other header files in this package. Those
internal APIs may be changed without warning.
Design discussions are conducted in https://www.facebook.com/groups/rocksdb.dev/