rocksdb/tools
sdong c66b4429ff Incremental Space Amp Compactions in Universal Style (#8655)
Summary:
This commit introduces incremental compaction in univeral style for space amplification. This follows the first improvement mentioned in https://rocksdb.org/blog/2021/04/12/universal-improvements.html . The implemention simply picks up files about size of max_compaction_bytes to compact and execute if the penalty is not too big. More optimizations can be done in the future, e.g. prioritizing between this compaction and other types. But for now, the feature is supposed to be functional and can often reduce frequency of full compactions, although it can introduce penalty.

In order to add cut files more efficiently so that more files from upper levels can be included, SST file cutting threshold (for current file + overlapping parent level files) is set to 1.5X of target file size. A 2MB target file size will generate files like this: https://gist.github.com/siying/29d2676fba417404f3c95e6c013c7de8 Number of files indeed increases but it is not out of control.

Two set of write benchmarks are run:
1. For ingestion rate limited scenario, we can see full compaction is mostly eliminated: https://gist.github.com/siying/959bc1186066906831cf4c808d6e0a19 . The write amp increased from 7.7 to 9.4, as expected. After applying file cutting, the number is improved to 8.9. In another benchmark, the write amp is even better with the incremental approach: https://gist.github.com/siying/d1c16c286d7c59c4d7bba718ca198163
2. For ingestion rate unlimited scenario, incremental compaction turns out to be too expensive most of the time and is not executed, as expected.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/8655

Test Plan: Add unit tests to the functionality.

Reviewed By: ajkr

Differential Revision: D31787034

fbshipit-source-id: ce813e63b15a61d5a56e97bf8902a1b28e011beb
2021-10-20 10:04:13 -07:00
..
advisor Update branch as "main" in tools/advisor/README.md (#8744) 2021-09-01 20:26:28 -07:00
block_cache_analyzer Cleanup includes in dbformat.h (#8930) 2021-09-29 04:04:40 -07:00
dump Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
rdb Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
analyze_txn_stress_test.sh Add copyright headers per FB open-source checkup tool. (#5199) 2019-04-18 10:55:01 -07:00
auto_sanity_test.sh Add copyright headers per FB open-source checkup tool. (#5199) 2019-04-18 10:55:01 -07:00
backup_db.sh Revamp check_format_compatible.sh (#8012) 2021-03-02 11:42:27 -08:00
benchmark_leveldb.sh Add copyright headers per FB open-source checkup tool. (#5199) 2019-04-18 10:55:01 -07:00
benchmark.sh Add a benchmarking wrapper script for BlobDB (#9015) 2021-10-12 11:36:03 -07:00
blob_dump.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
check_all_python.py Allow missing "unversioned" python, as in CentOS 8 (#6883) 2020-05-29 11:29:23 -07:00
check_format_compatible.sh Add 6.24 to the format compatibility checker (#8716) 2021-08-26 16:35:58 -07:00
CMakeLists.txt Mark dependencies as PRIVATE and fix missing dependencies in tools. (#6790) 2020-05-12 21:07:55 -07:00
db_bench_tool_test.cc Make it possible to force the garbage collection of the oldest blob files (#8994) 2021-10-11 18:03:01 -07:00
db_bench_tool.cc Incremental Space Amp Compactions in Universal Style (#8655) 2021-10-20 10:04:13 -07:00
db_bench.cc Add (& fix) some simple source code checks (#8821) 2021-09-07 21:19:27 -07:00
db_crashtest.py Make it possible to force the garbage collection of the oldest blob files (#8994) 2021-10-11 18:03:01 -07:00
db_repl_stress.cc Migrate away from Travis+Linux+amd64 (#7791) 2020-12-22 00:20:57 -08:00
db_sanity_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
dbench_monitor Fix /bin/bash shebangs 2017-08-03 15:56:46 -07:00
Dockerfile adding docker build script and dockerfile 2015-05-22 16:03:39 -07:00
generate_random_db.sh Add copyright headers per FB open-source checkup tool. (#5199) 2019-04-18 10:55:01 -07:00
ingest_external_sst.sh Add copyright headers per FB open-source checkup tool. (#5199) 2019-04-18 10:55:01 -07:00
io_tracer_parser_test.cc Cleanup includes in dbformat.h (#8930) 2021-09-29 04:04:40 -07:00
io_tracer_parser_tool.cc Add request_id in IODebugContext. (#8045) 2021-04-01 13:14:51 -07:00
io_tracer_parser_tool.h Add IO Tracer Parser (#7333) 2020-09-23 15:50:26 -07:00
io_tracer_parser.cc Add IO Tracer Parser (#7333) 2020-09-23 15:50:26 -07:00
ldb_cmd_impl.h Some fixes and enhancements to ldb repair (#8544) 2021-07-28 16:44:14 -07:00
ldb_cmd_test.cc Fix flaky ldb_cmd_test tests caused by file deletions during validation (#8942) 2021-09-21 11:27:38 -07:00
ldb_cmd.cc List blob files when using command - list_live_files_metadata (#8976) 2021-09-30 15:13:11 -07:00
ldb_test.py Add list live files metadata (#8446) 2021-06-22 19:07:46 -07:00
ldb_tool.cc Add list live files metadata (#8446) 2021-06-22 19:07:46 -07:00
ldb.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
pflag Fix /bin/bash shebangs 2017-08-03 15:56:46 -07:00
reduce_levels_test.cc Remove some unneeded code (#8736) 2021-09-01 14:28:58 -07:00
regression_test.sh Fix COMMIT_ID in regression_test.sh (#9047) 2021-10-18 11:01:06 -07:00
restore_db.sh Revamp check_format_compatible.sh (#8012) 2021-03-02 11:42:27 -08:00
rocksdb_dump_test.sh Add copyright headers per FB open-source checkup tool. (#5199) 2019-04-18 10:55:01 -07:00
run_blob_bench.sh Add a benchmarking wrapper script for BlobDB (#9015) 2021-10-12 11:36:03 -07:00
run_flash_bench.sh Add copyright headers per FB open-source checkup tool. (#5199) 2019-04-18 10:55:01 -07:00
run_leveldb.sh Add copyright headers per FB open-source checkup tool. (#5199) 2019-04-18 10:55:01 -07:00
sample-dump.dmp First version of rocksdb_dump and rocksdb_undump. 2015-06-19 16:24:36 -07:00
simulated_hybrid_file_system.cc Change the File System File Wrappers to std::unique_ptr (#8618) 2021-09-13 08:46:19 -07:00
simulated_hybrid_file_system.h Change the File System File Wrappers to std::unique_ptr (#8618) 2021-09-13 08:46:19 -07:00
sst_dump_test.cc Add CreateFrom methods to Env/FileSystem (#8174) 2021-06-15 03:43:48 -07:00
sst_dump_tool.cc Add CreateFrom methods to Env/FileSystem (#8174) 2021-06-15 03:43:48 -07:00
sst_dump.cc Implement a new subcommand "identify" for sst_dump (#6943) 2020-06-08 13:58:28 -07:00
trace_analyzer_test.cc Simplify TraceAnalyzer (#8697) 2021-08-24 18:18:36 -07:00
trace_analyzer_tool.cc Simplify TraceAnalyzer (#8697) 2021-08-24 18:18:36 -07:00
trace_analyzer_tool.h Simplify TraceAnalyzer (#8697) 2021-08-24 18:18:36 -07:00
trace_analyzer.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
verify_random_db.sh Add copyright headers per FB open-source checkup tool. (#5199) 2019-04-18 10:55:01 -07:00
write_external_sst.sh Revamp check_format_compatible.sh (#8012) 2021-03-02 11:42:27 -08:00
write_stress_runner.py Allow missing "unversioned" python, as in CentOS 8 (#6883) 2020-05-29 11:29:23 -07:00
write_stress.cc Add a SystemClock class to capture the time functions of an Env (#7858) 2021-01-25 22:09:11 -08:00