rocksdb/util
Haobo Xu 778e179046 [RocksDB] Sync file to disk incrementally
Summary:
During compaction, we sync the output files after they are fully written out. This causes unnecessary blocking of the compaction thread and burstiness of the write traffic.
This diff simply asks the OS to sync data incrementally, in the background, as it is written. The hope is that, by the final sync, most of the data is already on disk and we block less on the sync call. Thus, each compaction runs faster and we can use fewer compaction threads to saturate IO.
In addition, the write traffic will be smoothed out, hopefully reducing the IO P99 latency too.
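
A minimal sketch of the idea described above, assuming a Linux system where sync_file_range() is available. The class name IncrementalSyncWriter, the bytes_per_sync parameter, and the counters are hypothetical and for illustration only; this is not the code in env_posix.cc. After each append, once enough unsynced bytes have accumulated, the writer asks the kernel to start writeback for the new range without waiting for it to finish, so the final sync has less left to flush.

```cpp
// Linux-only sketch: sync_file_range() is not portable.
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <fcntl.h>
#include <unistd.h>

#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstdint>

// Hypothetical helper, not RocksDB's actual WritableFile implementation.
class IncrementalSyncWriter {
 public:
  // bytes_per_sync is an assumed tuning knob: how many unsynced bytes may
  // accumulate before background writeback is requested.
  IncrementalSyncWriter(int fd, uint64_t bytes_per_sync)
      : fd_(fd), bytes_per_sync_(bytes_per_sync) {
    assert(bytes_per_sync > 0);
  }

  // Appends n bytes, then requests background writeback if enough new data
  // has accumulated since the last request.
  bool Append(const char* data, size_t n) {
    while (n > 0) {
      ssize_t w = ::write(fd_, data, n);
      if (w < 0) {
        if (errno == EINTR) continue;  // interrupted, retry the write
        return false;
      }
      data += w;
      n -= static_cast<size_t>(w);
      written_ += static_cast<uint64_t>(w);
    }
    if (written_ - last_sync_ >= bytes_per_sync_) {
      // SYNC_FILE_RANGE_WRITE starts writeback of the dirty pages in the
      // range and does not wait for completion. The call is treated as a
      // best-effort hint here, so its return value is ignored.
      (void)::sync_file_range(fd_, static_cast<off64_t>(last_sync_),
                              static_cast<off64_t>(written_ - last_sync_),
                              SYNC_FILE_RANGE_WRITE);
      last_sync_ = written_;
      ++range_sync_requests_;
    }
    return true;
  }

  // Final durable sync; ideally most data has already reached the disk.
  bool Sync() { return ::fdatasync(fd_) == 0; }

  uint64_t bytes_written() const { return written_; }
  uint64_t range_sync_requests() const { return range_sync_requests_; }

 private:
  int fd_;
  uint64_t bytes_per_sync_;
  uint64_t written_ = 0;
  uint64_t last_sync_ = 0;
  uint64_t range_sync_requests_ = 0;
};
```

A caller would open the output file, call Append() for each block produced by the compaction, and call Sync() once at the end. The incremental requests are only hints; durability still depends on the final Sync().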

Some quick tests show a 10-20% improvement in per-thread compaction throughput. Combined with posix advice on compaction reads, just 5 threads are enough to almost saturate the udb flash bandwidth for the 800-byte write-only benchmark.
What's more promising is that, with saturated IO, iostat shows average wait time is actually smoother and much smaller.
For the 800-byte write-only test:
Before the change: await oscillates between 3ms and 10ms
After the change: await ranges from 1ms to 3ms

Will also test against a read-modify-write workload, to see if the high P99 read latency can be resolved.

Will introduce a parameter to control the sync interval in a follow-up diff, after cleaning up EnvOptions.

Test Plan: make check; db_bench; db_stress

Reviewers: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D11115
2013-06-12 12:53:59 -07:00
arena_test.cc Fix all warnings generated by -Wall option to the compiler. 2012-11-06 14:07:31 -08:00
arena.cc Codemod NULL to nullptr 2013-02-28 18:04:58 -08:00
arena.h A number of fixes: 2011-10-31 17:22:06 +00:00
auto_roll_logger_test.cc [RocksDB] Fix PosixLogger and AutoRollLogger thread safety 2013-05-21 11:39:44 -07:00
auto_roll_logger.cc [RocksDB] Fix PosixLogger and AutoRollLogger thread safety 2013-05-21 11:39:44 -07:00
auto_roll_logger.h [RocksDB] Fix PosixLogger and AutoRollLogger thread safety 2013-05-21 11:39:44 -07:00
bloom_test.cc Fix all warnings generated by -Wall option to the compiler. 2012-11-06 14:07:31 -08:00
bloom.cc Fix all the lint errors. 2012-11-28 17:18:41 -08:00
build_version.h Stop continually re-creating build_version.c 2013-01-24 17:51:39 -08:00
cache_test.cc [RocksDB] Fix LRUCache Eviction problem 2013-04-04 11:22:50 -07:00
cache.cc [RocksDB] Fix LRUCache Eviction problem 2013-04-04 11:22:50 -07:00
coding_test.cc Codemod NULL to nullptr 2013-02-28 18:04:58 -08:00
coding.cc Codemod NULL to nullptr 2013-02-28 18:04:58 -08:00
coding.h Codemod NULL to nullptr 2013-02-28 18:04:58 -08:00
comparator.cc merge 1.5 2012-08-28 11:43:33 -07:00
crc32c_test.cc Fix all warnings generated by -Wall option to the compiler. 2012-11-06 14:07:31 -08:00
crc32c.cc Codemod NULL to nullptr 2013-02-28 18:04:58 -08:00
crc32c.h A number of fixes: 2011-10-31 17:22:06 +00:00
env_hdfs.cc Ability to configure bufferedio-reads, filesystem-readaheads and mmap-read-write per database. 2013-03-20 23:14:03 -07:00
env_posix.cc [RocksDB] Sync file to disk incrementally 2013-06-12 12:53:59 -07:00
env_test.cc [RocksDB] cleanup EnvOptions 2013-06-12 11:17:19 -07:00
env.cc [RocksDB] cleanup EnvOptions 2013-06-12 11:17:19 -07:00
filelock_test.cc Prevent concurrent multiple opens of leveldb database. 2012-08-20 23:55:04 -07:00
filter_policy.cc Added bloom filter support. 2012-04-17 08:36:46 -07:00
hash.cc A number of fixes: 2011-10-31 17:22:06 +00:00
hash.h reverting disastrous MOE commit, returning to r21 2011-04-19 23:11:15 +00:00
histogram_test.cc Introduce histogram in statistics.h 2013-02-20 10:43:32 -08:00
histogram.cc [Rocksdb] Remove unused double apis to record into histograms 2013-05-16 10:40:30 -07:00
histogram.h [Rocksdb] Remove unused double apis to record into histograms 2013-05-16 10:40:30 -07:00
ldb_cmd_execute_result.h Enhanced ldb to support data access commands 2013-01-28 11:38:26 -08:00
ldb_cmd.cc [RocksDB] cleanup EnvOptions 2013-06-12 11:17:19 -07:00
ldb_cmd.h Ability to set different size fanout multipliers for every level. 2013-05-21 13:50:20 -07:00
ldb_tool.cc Enhance the ldb tool to support ttl databases 2013-05-15 12:10:00 -07:00
logging.cc Fix all warnings generated by -Wall option to the compiler. 2012-11-06 14:07:31 -08:00
logging.h A number of fixes: 2011-10-31 17:22:06 +00:00
murmurhash.cc Implement RowLocks for assoc schema 2012-10-03 23:19:01 -07:00
murmurhash.h Implement RowLocks for assoc schema 2012-10-03 23:19:01 -07:00
mutexlock.h Implement ReadWrite locks for leveldb 2012-10-01 22:37:39 -07:00
options.cc [RocksDB] Introduce Fast Mutex option 2013-06-01 23:11:34 -07:00
posix_logger.h [RocksDB] Fix PosixLogger and AutoRollLogger thread safety 2013-05-21 11:39:44 -07:00
random.h [RocksDB] Include 64bit random number generator 2013-06-04 13:52:27 -07:00
signal_test.cc [RocksDB] fix build 2013-04-20 10:26:51 -07:00
stack_trace.h [RocksDB] Add stacktrace signal handler 2013-04-20 10:26:50 -07:00
stats_logger.h Clean up compiler warnings generated by -Wall option. 2012-08-29 14:24:51 -07:00
status.cc [Rocksdb] Support Merge operation in rocksdb 2013-05-03 16:59:02 -07:00
stop_watch.h [RocksDB] Simplify StopWatch implementation 2013-05-17 10:55:34 -07:00
string_util.cc Ability to set different size fanout multipliers for every level. 2013-05-21 13:50:20 -07:00
string_util.h Ability to set different size fanout multipliers for every level. 2013-05-21 13:50:20 -07:00
testharness.cc Codemod NULL to nullptr 2013-02-28 18:04:58 -08:00
testharness.h A number of fixes: 2011-10-31 17:22:06 +00:00
testutil.cc Fix all warnings generated by -Wall option to the compiler. 2012-11-06 14:07:31 -08:00
testutil.h Ability to configure bufferedio-reads, filesystem-readaheads and mmap-read-write per database. 2013-03-20 23:14:03 -07:00