rocksdb/db
sdong e03f8a0c12 L0 Subcompaction to trim input files (#9802)
Summary:
When sub compaction is decided for L0->L1 compaction, most of the cases, all L0 files will be involved in all sub compactions. However, it is not always the case. When files are generally (but not strictly) inserted in sequential order, there can be a subset of L0 files invovled. Yet RocksDB always open all those L0 files, and build an iterator, read many of the files' first of last block with expensive readahead. We trim some input files to reduce overhead a little bit.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/9802

Test Plan: Add a unit test to cover this case and manually validate the behavior while running the test.

Reviewed By: ajkr

Differential Revision: D35371031

fbshipit-source-id: 701ed7375b5cbe41672e93b38fe8a1503dad08b6
2022-04-06 18:19:19 -07:00
..
blob Add rate limiter priority to ReadOptions (#9424) 2022-02-16 23:18:14 -08:00
compaction L0 Subcompaction to trim input files (#9802) 2022-04-06 18:19:19 -07:00
db_impl Fix GetMergeOperands() heap-use-after-free on flushed memtable (#9805) 2022-04-05 12:26:36 -07:00
arena_wrapped_db_iter.cc fix: Reusing-Iterator reads stale keys after DeleteRange() performed (#9258) 2022-03-15 09:50:21 -07:00
arena_wrapped_db_iter.h Cleanup includes in dbformat.h (#8930) 2021-09-29 04:04:40 -07:00
builder.cc Expand auto recovery to background read errors (#9679) 2022-03-15 14:45:34 -07:00
builder.h Expose blob file information through the EventListener interface (#8675) 2021-09-16 17:23:36 -07:00
c_test.c Fix remaining uses of "backupable" (#9792) 2022-04-05 09:52:33 -07:00
c.cc Fix a major performance bug in 7.0 re: filter compatibility (#9736) 2022-03-23 10:00:54 -07:00
column_family_test.cc More refactoring ahead of footer & meta changes (#9240) 2021-12-10 08:13:26 -08:00
column_family.cc Fix heap use-after-free race with DropColumnFamily (#9730) 2022-03-24 13:05:17 -07:00
column_family.h Fix heap use-after-free race with DropColumnFamily (#9730) 2022-03-24 13:05:17 -07:00
compact_files_test.cc Fix test race conditions with OnFlushCompleted() (#9617) 2022-02-22 12:23:00 -08:00
comparator_db_test.cc More refactoring ahead of footer & meta changes (#9240) 2021-12-10 08:13:26 -08:00
convenience.cc Add rate limiter priority to ReadOptions (#9424) 2022-02-16 23:18:14 -08:00
corruption_test.cc Make the Env class Customizable (#9293) 2022-01-04 16:45:49 -08:00
cuckoo_table_db_test.cc Experimental support for SST unique IDs (#8990) 2021-10-18 23:32:01 -07:00
db_basic_test.cc Fix some MultiGet batching stats (#9583) 2022-02-17 16:31:41 -08:00
db_block_cache_test.cc Enhance new cache key testing & comments (#9329) 2022-02-04 14:15:58 -08:00
db_bloom_filter_test.cc Tests for filter compatibility (#9773) 2022-04-06 15:54:40 -07:00
db_compaction_filter_test.cc Fix a minor issue with initializing the test path (#8555) 2021-07-23 08:38:45 -07:00
db_compaction_test.cc L0 Subcompaction to trim input files (#9802) 2022-04-06 18:19:19 -07:00
db_dynamic_level_test.cc Remove deprecated API AdvancedColumnFamilyOptions::soft_rate_limit/hard_rate_limit (#9452) 2022-01-27 13:01:09 -08:00
db_encryption_test.cc Fix a minor issue with initializing the test path (#8555) 2021-07-23 08:38:45 -07:00
db_filesnapshot.cc Improve comments for some files (#9793) 2022-04-01 16:06:14 -07:00
db_flush_test.cc Encode min_log_number_to_keep and delete_wals_before in one version edit (#9766) 2022-03-31 20:00:52 -07:00
db_info_dumper.cc Allow WAL dir to change with db dir (#8582) 2021-07-30 12:16:44 -07:00
db_info_dumper.h Add a DB Session ID (#6959) 2020-06-15 10:47:02 -07:00
db_inplace_update_test.cc fix a bug, c api, if enable inplace_update_support, and use create sn… (#9471) 2022-03-21 12:04:33 -07:00
db_io_failure_test.cc Enable a few unit tests to use custom Env objects (#9087) 2021-11-08 11:05:59 -08:00
db_iter_stress_test.cc Make ImmutableOptions struct that inherits from ImmutableCFOptions and ImmutableDBOptions (#8262) 2021-05-05 14:00:17 -07:00
db_iter_test.cc Remove iter_start_seqnum and preserve_deletes (#9430) 2022-01-28 13:28:38 -08:00
db_iter.cc Remove iter_start_seqnum and preserve_deletes (#9430) 2022-01-28 13:28:38 -08:00
db_iter.h Cleanup includes in dbformat.h (#8930) 2021-09-29 04:04:40 -07:00
db_iterator_test.cc Fix a minor issue with initializing the test path (#8555) 2021-07-23 08:38:45 -07:00
db_kv_checksum_test.cc Revise APIs related to user-defined timestamp (#8946) 2022-02-01 22:19:01 -08:00
db_log_iter_test.cc Attempt to deflake DBTestXactLogIterator.TransactionLogIteratorCorruptedLog (#8627) 2021-08-10 11:10:07 -07:00
db_logical_block_size_cache_test.cc Attempt to deflake DBLogicalBlockSizeCacheTest.CreateColumnFamilies (#9516) 2022-03-04 11:35:28 -08:00
db_memtable_test.cc Fix a minor issue with initializing the test path (#8555) 2021-07-23 08:38:45 -07:00
db_merge_operand_test.cc Fix GetMergeOperands() heap-use-after-free on flushed memtable (#9805) 2022-04-05 12:26:36 -07:00
db_merge_operator_test.cc Fix a minor issue with initializing the test path (#8555) 2021-07-23 08:38:45 -07:00
db_options_test.cc Remove deprecated option new_table_reader_for_compaction_inputs (#9443) 2022-02-08 19:31:28 -08:00
db_properties_test.cc Tests for filter compatibility (#9773) 2022-04-06 15:54:40 -07:00
db_range_del_test.cc fix: Reusing-Iterator reads stale keys after DeleteRange() performed (#9258) 2022-03-15 09:50:21 -07:00
db_rate_limiter_test.cc Rate-limit automatic WAL flush after each user write (#9607) 2022-03-08 13:19:39 -08:00
db_secondary_test.cc Make the Env class Customizable (#9293) 2022-01-04 16:45:49 -08:00
db_sst_test.cc Account memory of big memory users in BlockBasedTable in global memory limit (#9748) 2022-04-06 10:33:00 -07:00
db_statistics_test.cc Bytes read stat for VerifyChecksum() and VerifyFileChecksums() APIs (#8741) 2021-09-07 13:28:29 -07:00
db_table_properties_test.cc Account memory of big memory users in BlockBasedTable in global memory limit (#9748) 2022-04-06 10:33:00 -07:00
db_tailing_iter_test.cc Fix a minor issue with initializing the test path (#8555) 2021-07-23 08:38:45 -07:00
db_test_util.cc Use a sorted vector instead of a map to store blob file metadata (#9526) 2022-02-09 12:36:43 -08:00
db_test_util.h Update Cache::Release param from force_erase to erase_if_last_ref (#9728) 2022-03-22 10:22:18 -07:00
db_test.cc Fix FileStorageInfo fields from GetLiveFilesMetaData (#9769) 2022-03-29 14:36:35 -07:00
db_test2.cc Add manifest fix-up utility for file temperatures (#9683) 2022-03-18 16:35:51 -07:00
db_universal_compaction_test.cc Adhere to per-DB concurrency limit when bottom-pri compactions exist (#9179) 2021-11-18 17:31:50 -08:00
db_wal_test.cc Fix a race condition in WAL tracking causing DB open failure (#9715) 2022-03-23 19:41:31 -07:00
db_with_timestamp_basic_test.cc Add OpenAndTrimHistory API to support trimming data with specified timestamp (#9410) 2022-03-11 16:13:23 -08:00
db_with_timestamp_compaction_test.cc Use the comparator from the sst file table properties in sst_dump_tool (#9491) 2022-02-08 12:15:35 -08:00
db_write_buffer_manager_test.cc Enable a few unit tests to use custom Env objects (#9087) 2021-11-08 11:05:59 -08:00
db_write_test.cc Enable a few unit tests to use custom Env objects (#9087) 2021-11-08 11:05:59 -08:00
dbformat_test.cc Enable a few unit tests to use custom Env objects (#9087) 2021-11-08 11:05:59 -08:00
dbformat.cc Track per-SST user-defined timestamp information in MANIFEST (#9092) 2021-11-10 10:49:04 -08:00
dbformat.h Add OpenAndTrimHistory API to support trimming data with specified timestamp (#9410) 2022-03-11 16:13:23 -08:00
deletefile_test.cc Enable a few unit tests to use custom Env objects (#9087) 2021-11-08 11:05:59 -08:00
error_handler_fs_test.cc Expand auto recovery to background read errors (#9679) 2022-03-15 14:45:34 -07:00
error_handler.cc Expand auto recovery to background read errors (#9679) 2022-03-15 14:45:34 -07:00
error_handler.h Expand auto recovery to background read errors (#9679) 2022-03-15 14:45:34 -07:00
event_helpers.cc Fix a race condition in WAL tracking causing DB open failure (#9715) 2022-03-23 19:41:31 -07:00
event_helpers.h Add a listener callback for end of auto error recovery (#9244) 2021-12-08 14:30:57 -08:00
experimental.cc Add manifest fix-up utility for file temperatures (#9683) 2022-03-18 16:35:51 -07:00
external_sst_file_basic_test.cc Enable a few unit tests to use custom Env objects (#9087) 2021-11-08 11:05:59 -08:00
external_sst_file_ingestion_job.cc Do not rely on ADL when invoking std::max_element (#9608) 2022-03-02 17:41:02 -08:00
external_sst_file_ingestion_job.h New stable, fixed-length cache keys (#9126) 2021-12-16 17:15:13 -08:00
external_sst_file_test.cc Make the Env class Customizable (#9293) 2022-01-04 16:45:49 -08:00
fault_injection_test.cc Fix a bug causing duplicate trailing entries in WritableFile (buffered IO) (#9236) 2021-12-13 09:00:36 -08:00
file_indexer_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
file_indexer.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
file_indexer.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
filename_test.cc fixing issue #8345 RocksDB does not work when using UNC network paths (#9384) 2022-03-30 15:55:31 -07:00
flush_job_test.cc Use the comparator from the sst file table properties in sst_dump_tool (#9491) 2022-02-08 12:15:35 -08:00
flush_job.cc Fix a bug in PosixClock (#9695) 2022-03-21 16:11:02 -07:00
flush_job.h Expand auto recovery to background read errors (#9679) 2022-03-15 14:45:34 -07:00
flush_scheduler.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
flush_scheduler.h Include C++ standard library headers instead of C compatibility headers (#8068) 2021-03-19 12:09:47 -07:00
forward_iterator_bench.cc Remove using namespace (#9369) 2022-01-12 09:31:12 -08:00
forward_iterator.cc Fast path for detecting unchanged prefix_extractor (#9407) 2022-01-21 11:37:46 -08:00
forward_iterator.h Fast path for detecting unchanged prefix_extractor (#9407) 2022-01-21 11:37:46 -08:00
history_trimming_iterator.h Add OpenAndTrimHistory API to support trimming data with specified timestamp (#9410) 2022-03-11 16:13:23 -08:00
import_column_family_job.cc Add Temperature info in NewSequentialFile() (#9499) 2022-02-18 18:23:07 -08:00
import_column_family_job.h New stable, fixed-length cache keys (#9126) 2021-12-16 17:15:13 -08:00
import_column_family_test.cc Fix a minor issue with initializing the test path (#8555) 2021-07-23 08:38:45 -07:00
internal_stats.cc Add manifest fix-up utility for file temperatures (#9683) 2022-03-18 16:35:51 -07:00
internal_stats.h Support GetMapProperty() with "rocksdb.dbstats" (#9057) 2021-10-20 13:17:00 -07:00
job_context.h Add manifest fix-up utility for file temperatures (#9683) 2022-03-18 16:35:51 -07:00
kv_checksum.h fix compile errors in db/kv_checksum.h (#9173) 2021-11-16 10:20:50 -08:00
listener_test.cc Fix test race conditions with OnFlushCompleted() (#9617) 2022-02-22 12:23:00 -08:00
log_format.h Add record to set WAL compression type if enabled (#9556) 2022-02-17 16:19:31 -08:00
log_reader.cc Integrate WAL compression into log reader/writer. (#9642) 2022-03-09 15:49:53 -08:00
log_reader.h Integrate WAL compression into log reader/writer. (#9642) 2022-03-09 15:49:53 -08:00
log_test.cc Integrate WAL compression into log reader/writer. (#9642) 2022-03-09 15:49:53 -08:00
log_writer.cc Integrate WAL compression into log reader/writer. (#9642) 2022-03-09 15:49:53 -08:00
log_writer.h Integrate WAL compression into log reader/writer. (#9642) 2022-03-09 15:49:53 -08:00
logs_with_prep_tracker.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
logs_with_prep_tracker.h Include C++ standard library headers instead of C compatibility headers (#8068) 2021-03-19 12:09:47 -07:00
lookup_key.h Cleanup includes in dbformat.h (#8930) 2021-09-29 04:04:40 -07:00
malloc_stats.cc Replace most typedef with using= (#8751) 2021-09-07 11:31:59 -07:00
malloc_stats.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
manual_compaction_test.cc Remove using namespace (#9369) 2022-01-12 09:31:12 -08:00
memtable_list_test.cc Expand auto recovery to background read errors (#9679) 2022-03-15 14:45:34 -07:00
memtable_list.cc Encode min_log_number_to_keep and delete_wals_before in one version edit (#9766) 2022-03-31 20:00:52 -07:00
memtable_list.h Expand auto recovery to background read errors (#9679) 2022-03-15 14:45:34 -07:00
memtable.cc Fix major bug with MultiGet, DeleteRange, and memtable Bloom (#9453) 2022-01-27 14:55:04 -08:00
memtable.h Fix major bug with MultiGet, DeleteRange, and memtable Bloom (#9453) 2022-01-27 14:55:04 -08:00
merge_context.h Add Merge Operator support to WriteBatchWithIndex (#8135) 2021-05-10 12:50:25 -07:00
merge_helper_test.cc Support readahead during compaction for blob files (#9187) 2021-11-19 17:53:47 -08:00
merge_helper.cc Support readahead during compaction for blob files (#9187) 2021-11-19 17:53:47 -08:00
merge_helper.h Support readahead during compaction for blob files (#9187) 2021-11-19 17:53:47 -08:00
merge_operator.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
merge_test.cc Make the Env class Customizable (#9293) 2022-01-04 16:45:49 -08:00
obsolete_files_test.cc Add commit marker with timestamp (#9266) 2021-12-10 11:05:35 -08:00
options_file_test.cc No elide constructors (#7798) 2020-12-23 16:55:53 -08:00
output_validator.cc Cleanup includes in dbformat.h (#8930) 2021-09-29 04:04:40 -07:00
output_validator.h Cleanup includes in dbformat.h (#8930) 2021-09-29 04:04:40 -07:00
perf_context_test.cc Use SystemClock* instead of std::shared_ptr<SystemClock> in lower level routines (#8033) 2021-03-15 04:34:11 -07:00
periodic_work_scheduler_test.cc Fix a minor issue with initializing the test path (#8555) 2021-07-23 08:38:45 -07:00
periodic_work_scheduler.cc Fix a timer crash caused by invalid memory management (#9656) 2022-03-12 11:45:56 -08:00
periodic_work_scheduler.h Fix a timer crash caused by invalid memory management (#9656) 2022-03-12 11:45:56 -08:00
pinned_iterators_manager.h Replace most typedef with using= (#8751) 2021-09-07 11:31:59 -07:00
plain_table_db_test.cc Fast path for detecting unchanged prefix_extractor (#9407) 2022-01-21 11:37:46 -08:00
pre_release_callback.h Fix and detect headers with missing dependencies (#8893) 2021-09-10 10:00:26 -07:00
prefix_test.cc Use SystemClock* instead of std::shared_ptr<SystemClock> in lower level routines (#8033) 2021-03-15 04:34:11 -07:00
range_del_aggregator_bench.cc Cleanup multiple implementations of VectorIterator (#8901) 2021-10-06 07:48:31 -07:00
range_del_aggregator_test.cc Cleanup multiple implementations of VectorIterator (#8901) 2021-10-06 07:48:31 -07:00
range_del_aggregator.cc In ParseInternalKey(), include corrupt key info in Status (#7515) 2020-10-28 10:12:58 -07:00
range_del_aggregator.h Fix some typos in comments (#8066) 2021-03-25 21:18:08 -07:00
range_tombstone_fragmenter_test.cc Cleanup multiple implementations of VectorIterator (#8901) 2021-10-06 07:48:31 -07:00
range_tombstone_fragmenter.cc Added memtable garbage statistics (#8411) 2021-06-18 04:57:27 -07:00
range_tombstone_fragmenter.h Added memtable garbage statistics (#8411) 2021-06-18 04:57:27 -07:00
read_callback.h Fix and detect headers with missing dependencies (#8893) 2021-09-10 10:00:26 -07:00
repair_test.cc Some fixes and enhancements to ldb repair (#8544) 2021-07-28 16:44:14 -07:00
repair.cc Fast path for detecting unchanged prefix_extractor (#9407) 2022-01-21 11:37:46 -08:00
snapshot_checker.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
snapshot_impl.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
snapshot_impl.h Fix and detect headers with missing dependencies (#8893) 2021-09-10 10:00:26 -07:00
table_cache.cc fix a bug of the ticker NO_FILE_OPENS (#9677) 2022-03-15 09:55:49 -07:00
table_cache.h Fast path for detecting unchanged prefix_extractor (#9407) 2022-01-21 11:37:46 -08:00
table_properties_collector_test.cc Improve / clean up meta block code & integrity (#9163) 2021-11-18 11:43:44 -08:00
table_properties_collector.cc Apply sample_for_compression to all block-based tables (#8105) 2021-03-25 15:00:45 -07:00
table_properties_collector.h Track each SST's timestamp information as user properties (#9093) 2021-11-19 11:37:06 -08:00
transaction_log_impl.cc Add commit marker with timestamp (#9266) 2021-12-10 11:05:35 -08:00
transaction_log_impl.h Cleanup includes in dbformat.h (#8930) 2021-09-29 04:04:40 -07:00
trim_history_scheduler.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
trim_history_scheduler.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
version_builder_test.cc Use a sorted vector instead of a map to store blob file metadata (#9526) 2022-02-09 12:36:43 -08:00
version_builder.cc Use a sorted vector instead of a map to store blob file metadata (#9526) 2022-02-09 12:36:43 -08:00
version_builder.h Fast path for detecting unchanged prefix_extractor (#9407) 2022-01-21 11:37:46 -08:00
version_edit_handler.cc Fix a race condition in WAL tracking causing DB open failure (#9715) 2022-03-23 19:41:31 -07:00
version_edit_handler.h Fixed manifest_dump issues when printing keys and values containing null characters (#8378) 2021-06-10 12:55:20 -07:00
version_edit_test.cc File temperature information should be preserved when restart the DB (#9242) 2021-12-03 14:43:14 -08:00
version_edit.cc Encode min_log_number_to_keep and delete_wals_before in one version edit (#9766) 2022-03-31 20:00:52 -07:00
version_edit.h Clean up VersionStorageInfo a bit (#9494) 2022-02-04 08:19:20 -08:00
version_set_test.cc Fix a race condition in WAL tracking causing DB open failure (#9715) 2022-03-23 19:41:31 -07:00
version_set.cc L0 Subcompaction to trim input files (#9802) 2022-04-06 18:19:19 -07:00
version_set.h L0 Subcompaction to trim input files (#9802) 2022-04-06 18:19:19 -07:00
version_util.h Add manifest fix-up utility for file temperatures (#9683) 2022-03-18 16:35:51 -07:00
wal_edit_test.cc Always track WAL obsoletion (#7759) 2020-12-09 16:02:12 -08:00
wal_edit.cc Always track WAL obsoletion (#7759) 2020-12-09 16:02:12 -08:00
wal_edit.h Always track WAL obsoletion (#7759) 2020-12-09 16:02:12 -08:00
wal_manager_test.cc Make SystemClock into a Customizable Class (#8636) 2021-09-21 09:23:48 -07:00
wal_manager.cc Allow WAL dir to change with db dir (#8582) 2021-07-30 12:16:44 -07:00
wal_manager.h Allow WAL dir to change with db dir (#8582) 2021-07-30 12:16:44 -07:00
write_batch_base.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
write_batch_internal.h Support WBWI for keys having timestamps (#9603) 2022-02-22 14:23:01 -08:00
write_batch_test.cc Support WBWI for keys having timestamps (#9603) 2022-02-22 14:23:01 -08:00
write_batch.cc Fix a few documentation errors including in public APIs (#9789) 2022-04-01 10:30:17 -07:00
write_callback_test.cc Move slow valgrind tests behind -DROCKSDB_FULL_VALGRIND_RUN (#8475) 2021-07-07 11:14:05 -07:00
write_callback.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
write_controller_test.cc Revamp WriteController (#8064) 2021-03-18 09:47:31 -07:00
write_controller.cc Revamp WriteController (#8064) 2021-03-18 09:47:31 -07:00
write_controller.h Revamp WriteController (#8064) 2021-03-18 09:47:31 -07:00
write_thread.cc Rate-limit automatic WAL flush after each user write (#9607) 2022-03-08 13:19:39 -08:00
write_thread.h Rate-limit automatic WAL flush after each user write (#9607) 2022-03-08 13:19:39 -08:00