rocksdb


Andrew Kryczka 8272a6de57 Optionally wait on bytes_per_sync to smooth I/O (#5183)

Summary:
The existing implementation does not guarantee bytes reach disk every `bytes_per_sync` when writing SST files, or every `wal_bytes_per_sync` when writing WALs. This can cause confusing behavior for users who enable this feature to avoid large syncs during flush and compaction, but then end up hitting them anyway.

My understanding of the existing behavior is we used `sync_file_range` with `SYNC_FILE_RANGE_WRITE` to submit ranges for async writeback, such that we could continue processing the next range of bytes while that I/O is happening. I believe we can preserve that benefit while also limiting how far the processing can get ahead of the I/O, which prevents huge syncs from happening when the file finishes.

Consider this `sync_file_range` usage: `sync_file_range(fd_, 0, static_cast<off_t>(offset + nbytes), SYNC_FILE_RANGE_WAIT_BEFORE | SYNC_FILE_RANGE_WRITE)`. Expanding the range to start at 0 and adding the `SYNC_FILE_RANGE_WAIT_BEFORE` flag causes any pending writeback (like from a previous call to `sync_file_range`) to finish before it proceeds to submit the latest `nbytes` for writeback. The latest `nbytes` are still written back asynchronously, unless processing exceeds I/O speed, in which case the following `sync_file_range` will need to wait on it.

There is a second change in this PR to use `fdatasync` when `sync_file_range` is unavailable (determined statically) or has some known problem with the underlying filesystem (determined dynamically).
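
As a rough illustration of the two call shapes described above (not the code from this PR), the Linux-only sketch below contrasts the plain `SYNC_FILE_RANGE_WRITE` submission with the strict variant, and falls back to `fdatasync` if `sync_file_range` fails. The function name `RangeSyncSketch` and its `strict` parameter are hypothetical, and falling back on any runtime error is a simplification of the static and dynamic checks mentioned above.

```cpp
// Linux-only sketch; g++ and clang++ define _GNU_SOURCE for C++ by default,
// which is what exposes sync_file_range() in <fcntl.h>.
#include <fcntl.h>
#include <unistd.h>

#include <cassert>

// Hypothetical helper, not RocksDB's implementation.
// Returns 0 on success and -1 on failure (errno is set by the failing call).
int RangeSyncSketch(int fd, off_t offset, off_t nbytes, bool strict) {
  assert(offset >= 0 && nbytes >= 0);
  if (!strict) {
    // Non-strict shape: only submit the newest range for async writeback.
    // Nothing bounds how far the writer can get ahead of the device.
    return sync_file_range(fd, offset, nbytes, SYNC_FILE_RANGE_WRITE);
  }
  // Strict shape: start the range at 0 and add WAIT_BEFORE so any writeback
  // still pending from earlier calls finishes before the newest bytes are
  // submitted. The newest bytes are still written back asynchronously.
  int ret = sync_file_range(fd, 0, offset + nbytes,
                            SYNC_FILE_RANGE_WAIT_BEFORE | SYNC_FILE_RANGE_WRITE);
  if (ret != 0) {
    // Assumption for this sketch: treat any failure as "sync_file_range is
    // unusable here" and fall back to a full data sync.
    ret = fdatasync(fd);
  }
  return ret;
}
```

A writer would call this after each `bytes_per_sync` worth of appended data, passing the offset and length of the range it just wrote.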

The above two changes only apply when the user enables a new option, `strict_bytes_per_sync`.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5183

Differential Revision: D14953553

Pulled By: siying

fbshipit-source-id: 445c3862e019fb7b470f9c7f314fc231b62706e9

2019-04-22 11:51:39 -07:00

buckifier

Add copyright headers per FB open-source checkup tool. (#5199)

2019-04-18 10:55:01 -07:00

build_tools

Add copyright headers per FB open-source checkup tool. (#5199)

2019-04-18 10:55:01 -07:00

cache

Consolidate hash function used for non-persistent data in a new function (#5155)

2019-04-08 13:32:06 -07:00

cmake

Make FindZLIB consistent with official definitions (#4823)

2019-01-02 12:49:57 -08:00

coverage

Add copyright headers per FB open-source checkup tool. (#5199)

2019-04-18 10:55:01 -07:00

Add BlockBasedTableOptions::index_shortening (#5174)

2019-04-22 08:20:35 -07:00

docs

Blog post for format_version=4

2019-03-08 16:49:30 -08:00

env

Optionally wait on bytes_per_sync to smooth I/O (#5183)

2019-04-22 11:51:39 -07:00

examples

Support for single-primary, multi-secondary instances (#4899)

2019-03-26 16:45:31 -07:00

hdfs

Add copyright headers per FB open-source checkup tool. (#5199)

2019-04-18 10:55:01 -07:00

include/rocksdb

Optionally wait on bytes_per_sync to smooth I/O (#5183)

2019-04-22 11:51:39 -07:00

java

Add copyright headers per FB open-source checkup tool. (#5199)

2019-04-18 10:55:01 -07:00

memtable

WriteBufferManager's dummy entry size to block cache 1MB -> 256KB (#5175)

2019-04-16 12:03:07 -07:00

monitoring

Still implement StatisticsImpl::measureTime() (#5181)

2019-04-12 11:00:35 -07:00

options

Optionally wait on bytes_per_sync to smooth I/O (#5183)

2019-04-22 11:51:39 -07:00

port

Optionally wait on bytes_per_sync to smooth I/O (#5183)

2019-04-22 11:51:39 -07:00

table

Add BlockBasedTableOptions::index_shortening (#5174)

2019-04-22 08:20:35 -07:00

third-party/gtest-1.7.0/fused-src/gtest

remove bundled but unused fbson library (#5108)

2019-03-26 16:37:52 -07:00

tools

Add copyright headers per FB open-source checkup tool. (#5199)

2019-04-18 10:55:01 -07:00

util

refactor SavePoints (#5192)

2019-04-19 20:33:04 -07:00

utilities

Optionally wait on bytes_per_sync to smooth I/O (#5183)

2019-04-22 11:51:39 -07:00

.clang-format

A script that automatically reformat affected lines

2014-01-14 12:21:24 -08:00

.gitignore

RocksDB Trace Analyzer (#4091)

2018-08-13 11:44:02 -07:00

.lgtm.yml

Create lgtm.yml for LGTM.com C/C++ analysis (#4058)

2018-06-26 12:43:04 -07:00

.travis.yml

Fix printf formatting on MacOS (#4533)

2018-10-19 14:46:09 -07:00

appveyor.yml

Add RocksJava build to AppVeyor

2019-01-03 10:44:44 -08:00

AUTHORS

Update RocksDB Authors File

2017-10-18 14:42:10 -07:00

CMakeLists.txt

Support for single-primary, multi-secondary instances (#4899)

2019-03-26 16:45:31 -07:00

CODE_OF_CONDUCT.md

Add Code of Conduct

2017-12-05 18:42:35 -08:00

CONTRIBUTING.md

Add Code of Conduct

2017-12-05 18:42:35 -08:00

COPYING

Add GPLv2 as an alternative license.

2017-04-27 18:06:12 -07:00

DEFAULT_OPTIONS_HISTORY.md

options.delayed_write_rate use the rate of rate_limiter by default.

2017-05-24 09:58:24 -07:00

defs.bzl

Add copyright headers per FB open-source checkup tool. (#5199)

2019-04-18 10:55:01 -07:00

DUMP_FORMAT.md

First version of rocksdb_dump and rocksdb_undump.

2015-06-19 16:24:36 -07:00

HISTORY.md

Optionally wait on bytes_per_sync to smooth I/O (#5183)

2019-04-22 11:51:39 -07:00

INSTALL.md

Update the version of the dependencies used by the RocksJava static build (#4761)

2018-12-18 20:25:43 -08:00

issue_template.md

Add a template for issues

2017-09-29 11:41:28 -07:00

LANGUAGE-BINDINGS.md

LANGUAGE-BINDINGS.md: mention python-rocksdb

2019-03-20 11:10:48 -07:00

LICENSE.Apache

Change RocksDB License

2017-07-15 16:11:23 -07:00

LICENSE.leveldb

Add back the LevelDB license file

2017-07-16 18:42:18 -07:00

Makefile

Support for single-primary, multi-secondary instances (#4899)

2019-03-26 16:45:31 -07:00

README.md

Add LevelDB repository link in the Readme

2019-04-01 18:19:09 -07:00

ROCKSDB_LITE.md

Fix some typos in comments and docs.

2018-03-08 10:27:25 -08:00

src.mk

Support for single-primary, multi-secondary instances (#4899)

2019-03-26 16:45:31 -07:00

TARGETS

Support for single-primary, multi-secondary instances (#4899)

2019-03-26 16:45:31 -07:00

thirdparty.inc

Add copyright headers per FB open-source checkup tool. (#5199)

2019-04-18 10:55:01 -07:00

USERS.md

Adding IOTA Foundation to USERS.MD (#4436)

2018-10-02 10:03:46 -07:00

Vagrantfile

Adding CentOS 7 Vagrantfile & build script

2018-02-26 15:27:17 -08:00

WINDOWS_PORT.md

#5145, rename port/dirent.h to port/port_dirent.h to avoid compile err when use port dir as header dir output (#5152)

2019-04-04 11:38:19 -07:00

README.md

RocksDB: A Persistent Key-Value Store for Flash and RAM Storage

RocksDB is developed and maintained by the Facebook Database Engineering Team. It is built on earlier work on LevelDB by Sanjay Ghemawat (sanjay@google.com) and Jeff Dean (jeff@google.com).

This code is a library that forms the core building block for a fast key-value server, especially suited for storing data on flash drives. It has a Log-Structured-Merge-Database (LSM) design with flexible tradeoffs between Write-Amplification-Factor (WAF), Read-Amplification-Factor (RAF) and Space-Amplification-Factor (SAF). It has multi-threaded compactions, making it especially suitable for storing multiple terabytes of data in a single database.

Start with example usage here: https://github.com/facebook/rocksdb/tree/master/examples

See the GitHub wiki for more explanation.

The public interface is in include/. Callers should not include or rely on the details of any other header files in this package. Those internal APIs may be changed without warning.

Design discussions are conducted in https://www.facebook.com/groups/rocksdb.dev/

License

RocksDB is dual-licensed under both the GPLv2 (found in the COPYING file in the root directory) and Apache 2.0 License (found in the LICENSE.Apache file in the root directory). You may select, at your option, one of the above-listed licenses.

Languages

C++ 82.1%

Java 10.3%

C 2.5%

Python 1.7%

Perl 1.1%

Other 2.1%