rocksdb/db
Andrew Kryczka 9b18cc2363 single-file bottom-level compaction when snapshot released
Summary:
When snapshots are held for a long time, files may reach the bottom level while still containing overwritten/deleted keys. We previously had no mechanism to trigger compaction on such files. This particularly affected DBs that write to different parts of the keyspace over time: such files would never be compacted naturally, because no second-to-last-level files would later move down into their key range. This PR introduces a mechanism for bottommost files to be recompacted once all snapshots that prevent them from dropping their deleted/overwritten keys have been released.

- Changed `CompactionPicker` to compact files in `BottommostFilesMarkedForCompaction()`. These are the last choice when picking. Each file will be compacted alone and output to the same level in which it originated. The goal of this type of compaction is to rewrite the data excluding deleted/overwritten keys.
- Changed `ReleaseSnapshot()` to recompute the bottom files marked for compaction when the oldest existing snapshot changes, and to schedule a compaction if needed. We cache, in `bottommost_files_mark_threshold_`, the value that the oldest existing snapshot must exceed for another file to be marked; this lets us avoid recomputing marked files for most snapshot releases.
- Changed `VersionStorageInfo` to track the list of bottommost files, which `UpdateBottommostFiles()` recomputes every time the version changes. The list of marked bottommost files is first computed in `ComputeBottommostFilesMarkedForCompaction()` when the version changes, but may also be recomputed when `ReleaseSnapshot()` is called.
- Extracted the core logic of `Compaction::IsBottommostLevel()` into `VersionStorageInfo::RangeMightExistAfterSortedRun()`, since the logic to check whether a file is bottommost is now needed outside of compaction.
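
The threshold caching described in the bullets above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the code from this PR: `BottomFile`, `BottomFileTracker`, `has_obsolete_keys`, and `kMaxSeq` are hypothetical names introduced for illustration, and the model does not remove files once they have been compacted.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>
#include <utility>
#include <vector>

using SequenceNumber = std::uint64_t;
constexpr SequenceNumber kMaxSeq = std::numeric_limits<SequenceNumber>::max();

// Hypothetical, simplified stand-in for a bottommost file's metadata.
struct BottomFile {
  int id;
  SequenceNumber largest_seqno;  // sequence number of the newest entry
  bool has_obsolete_keys;        // file holds deleted/overwritten keys
};

// Toy model of the marking logic; not the real VersionStorageInfo.
class BottomFileTracker {
 public:
  explicit BottomFileTracker(std::vector<BottomFile> files)
      : files_(std::move(files)) {}

  // Full recomputation. A file is marked once every entry in it is older
  // than the oldest snapshot. For files that stay unmarked, remember the
  // smallest largest_seqno: the oldest snapshot must exceed that value
  // before another file can become eligible.
  void ComputeMarked(SequenceNumber oldest_snapshot) {
    marked_.clear();
    threshold_ = kMaxSeq;
    for (const BottomFile& f : files_) {
      if (!f.has_obsolete_keys) continue;  // nothing to gain by rewriting
      if (f.largest_seqno < oldest_snapshot) {
        marked_.push_back(f.id);
      } else {
        threshold_ = std::min(threshold_, f.largest_seqno);
      }
    }
    // Every file left unmarked is still pinned by the oldest snapshot.
    assert(threshold_ >= oldest_snapshot);
    ++recompute_count_;
  }

  // Called when the oldest snapshot changes. Recomputes only when the new
  // oldest snapshot exceeds the cached threshold. Returns true if a
  // compaction should be scheduled.
  bool OnOldestSnapshotChanged(SequenceNumber new_oldest) {
    if (new_oldest <= threshold_) return false;  // cheap early exit
    ComputeMarked(new_oldest);
    return !marked_.empty();
  }

  const std::vector<int>& marked() const { return marked_; }
  SequenceNumber threshold() const { return threshold_; }
  int recompute_count() const { return recompute_count_; }

 private:
  std::vector<BottomFile> files_;
  std::vector<int> marked_;
  SequenceNumber threshold_ = kMaxSeq;
  int recompute_count_ = 0;
};
```

In this model, with two eligible files whose largest sequence numbers are 10 and 20 and an initial oldest snapshot of 8, moving the oldest snapshot to 10 skips the recomputation entirely, while moving it to 15 triggers one recomputation, marks the first file, and raises the cached threshold to 20.
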
Closes https://github.com/facebook/rocksdb/pull/3009

Differential Revision: D6062044

Pulled By: ajkr

fbshipit-source-id: 123d201cf140715a7d5928e8b3cb4f9cd9f7ad21
2017-10-25 16:30:37 -07:00
builder.cc Add DB::Properties::kEstimateOldestKeyTime 2017-10-23 15:27:27 -07:00
builder.h Add DB::Properties::kEstimateOldestKeyTime 2017-10-23 15:27:27 -07:00
c_test.c Added save points for transactions C API 2017-09-14 14:18:59 -07:00
c.cc Make bytes_per_sync and wal_bytes_per_sync mutable 2017-09-27 17:49:45 -07:00
column_family_test.cc arena: derive alignment unit from std::max_align_t 2017-10-17 11:13:19 -07:00
column_family.cc Fix unused var warnings in Release mode 2017-10-23 14:27:04 -07:00
column_family.h Fix a typo in a comment 2017-10-18 12:32:28 -07:00
compact_files_test.cc Revert "comment out unused parameters" 2017-07-21 18:26:26 -07:00
compacted_db_impl.cc Inform caller when rocksdb is stalling writes 2017-10-05 18:11:43 -07:00
compacted_db_impl.h Revert "comment out unused parameters" 2017-07-21 18:26:26 -07:00
compaction_iteration_stats.h add counter for deletion dropping optimization 2017-08-19 14:10:08 -07:00
compaction_iterator_test.cc WritePrepared Txn: Compaction/Flush 2017-10-06 10:41:53 -07:00
compaction_iterator.cc Fix unused var warnings in Release mode 2017-10-23 14:27:04 -07:00
compaction_iterator.h WritePrepared Txn: Compaction/Flush 2017-10-06 10:41:53 -07:00
compaction_job_stats_test.cc Revert "comment out unused parameters" 2017-07-21 18:26:26 -07:00
compaction_job_test.cc WritePrepared Txn: Compaction/Flush 2017-10-06 10:41:53 -07:00
compaction_job.cc fix delete range bug 2017-10-17 11:13:19 -07:00
compaction_job.h WritePrepared Txn: Compaction/Flush 2017-10-06 10:41:53 -07:00
compaction_picker_test.cc Make FIFO compaction options dynamically configurable 2017-10-19 15:26:36 -07:00
compaction_picker_universal.cc update scores after picking universal compaction 2017-08-16 18:42:33 -07:00
compaction_picker_universal.h Change RocksDB License 2017-07-15 16:11:23 -07:00
compaction_picker.cc single-file bottom-level compaction when snapshot released 2017-10-25 16:30:37 -07:00
compaction_picker.h fix hanging after CompactFiles with L0 overlap 2017-09-13 15:41:38 -07:00
compaction.cc single-file bottom-level compaction when snapshot released 2017-10-25 16:30:37 -07:00
compaction.h Change RocksDB License 2017-07-15 16:11:23 -07:00
comparator_db_test.cc Revert "comment out unused parameters" 2017-07-21 18:26:26 -07:00
convenience.cc add VerifyChecksum() to db.h 2017-08-09 15:58:13 -07:00
corruption_test.cc fix corruption_test valgrind 2017-08-11 12:29:14 -07:00
cuckoo_table_db_test.cc Enable MSVC W4 with a few exceptions. Fix warnings and bugs 2017-10-19 10:57:12 -07:00
db_basic_test.cc Enable MSVC W4 with a few exceptions. Fix warnings and bugs 2017-10-19 10:57:12 -07:00
db_blob_index_test.cc fix lite build 2017-10-17 08:57:09 -07:00
db_block_cache_test.cc Revert "comment out unused parameters" 2017-07-21 18:26:26 -07:00
db_bloom_filter_test.cc Revert "comment out unused parameters" 2017-07-21 18:26:26 -07:00
db_compaction_filter_test.cc Split CompactionFilterWithValueChange 2017-10-20 15:42:07 -07:00
db_compaction_test.cc single-file bottom-level compaction when snapshot released 2017-10-25 16:30:37 -07:00
db_dynamic_level_test.cc Revert "comment out unused parameters" 2017-07-21 18:26:26 -07:00
db_encryption_test.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
db_filesnapshot.cc Remove some left-over BSD headers 2017-07-18 11:56:57 -07:00
db_flush_test.cc Revert "comment out unused parameters" 2017-07-21 18:26:26 -07:00
db_impl_compaction_flush.cc Fix unused var warnings in Release mode 2017-10-23 14:27:04 -07:00
db_impl_debug.cc Enable two write queues for transactions 2017-10-23 14:27:04 -07:00
db_impl_experimental.cc Inform caller when rocksdb is stalling writes 2017-10-05 18:11:43 -07:00
db_impl_files.cc WritePrepared Txn: Recovery 2017-09-28 16:56:45 -07:00
db_impl_open.cc Enable two write queues for transactions 2017-10-23 14:27:04 -07:00
db_impl_readonly.cc WritePrepared Txn: Iterator 2017-10-09 17:15:28 -07:00
db_impl_readonly.h Revert "comment out unused parameters" 2017-07-21 18:26:26 -07:00
db_impl_write.cc WritePrepared Txn: end-to-end tests 2017-10-06 14:26:45 -07:00
db_impl.cc single-file bottom-level compaction when snapshot released 2017-10-25 16:30:37 -07:00
db_impl.h Enable two write queues for transactions 2017-10-23 14:27:04 -07:00
db_info_dumper.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
db_info_dumper.h Change RocksDB License 2017-07-15 16:11:23 -07:00
db_inplace_update_test.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
db_io_failure_test.cc Fix LITE unit tests 2017-07-26 21:11:47 -07:00
db_iter_test.cc Fix tombstone scans in SeekForPrev outside prefix 2017-10-25 15:12:00 -07:00
db_iter.cc Fix tombstone scans in SeekForPrev outside prefix 2017-10-25 15:12:00 -07:00
db_iter.h WritePrepared Txn: Iterator 2017-10-09 17:15:28 -07:00
db_iterator_test.cc Split CompactionFilterWithValueChange 2017-10-20 15:42:07 -07:00
db_log_iter_test.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
db_memtable_test.cc Revert "comment out unused parameters" 2017-07-21 18:26:26 -07:00
db_merge_operator_test.cc Introduce conditional merge-operator invocation in point lookups 2017-09-28 15:58:49 -07:00
db_options_test.cc Make FIFO compaction options dynamically configurable 2017-10-19 15:26:36 -07:00
db_properties_test.cc Add DB::Properties::kEstimateOldestKeyTime 2017-10-23 15:27:27 -07:00
db_range_del_test.cc Fix wrong smallest key of delete range tombstones 2017-08-29 18:41:35 -07:00
db_sst_test.cc Add DB::Properties::kEstimateOldestKeyTime 2017-10-23 15:27:27 -07:00
db_statistics_test.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
db_table_properties_test.cc fix deletion-triggered compaction in table builder 2017-09-28 18:17:30 -07:00
db_tailing_iter_test.cc Revert "comment out unused parameters" 2017-07-21 18:26:26 -07:00
db_test2.cc fix lite build 2017-10-17 08:57:09 -07:00
db_test_util.cc Fix build on OpenBSD 2017-10-24 13:27:38 -07:00
db_test_util.h Add DB::Properties::kEstimateOldestKeyTime 2017-10-23 15:27:27 -07:00
db_test.cc Exclude DBTest.DynamicFIFOCompactionOptions test under RocksDB Lite 2017-10-20 17:11:39 -07:00
db_universal_compaction_test.cc update scores after picking universal compaction 2017-08-16 18:42:33 -07:00
db_wal_test.cc Add test kPointInTimeRecoveryCFConsistency 2017-09-22 17:26:36 -07:00
db_write_test.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
dbformat_test.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
dbformat.cc Add ValueType::kTypeBlobIndex 2017-10-03 09:11:23 -07:00
dbformat.h Add ValueType::kTypeBlobIndex 2017-10-03 09:11:23 -07:00
deletefile_test.cc Revert "comment out unused parameters" 2017-07-21 18:26:26 -07:00
event_helpers.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
event_helpers.h Change RocksDB License 2017-07-15 16:11:23 -07:00
experimental.cc Replace dynamic_cast<> 2017-07-28 16:27:16 -07:00
external_sst_file_basic_test.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
external_sst_file_ingestion_job.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
external_sst_file_ingestion_job.h Change RocksDB License 2017-07-15 16:11:23 -07:00
external_sst_file_test.cc Revert "comment out unused parameters" 2017-07-21 18:26:26 -07:00
fault_injection_test.cc Revert "comment out unused parameters" 2017-07-21 18:26:26 -07:00
file_indexer_test.cc Revert "comment out unused parameters" 2017-07-21 18:26:26 -07:00
file_indexer.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
file_indexer.h Change RocksDB License 2017-07-15 16:11:23 -07:00
filename_test.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
flush_job_test.cc WritePrepared Txn: Compaction/Flush 2017-10-06 10:41:53 -07:00
flush_job.cc Add DB::Properties::kEstimateOldestKeyTime 2017-10-23 15:27:27 -07:00
flush_job.h WritePrepared Txn: Compaction/Flush 2017-10-06 10:41:53 -07:00
flush_scheduler.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
flush_scheduler.h Change RocksDB License 2017-07-15 16:11:23 -07:00
forward_iterator_bench.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
forward_iterator.cc fix populating range deletions in forward iterator 2017-09-21 17:56:38 -07:00
forward_iterator.h Revert "comment out unused parameters" 2017-07-21 18:26:26 -07:00
internal_stats.cc Add DB::Properties::kEstimateOldestKeyTime 2017-10-23 15:27:27 -07:00
internal_stats.h Add DB::Properties::kEstimateOldestKeyTime 2017-10-23 15:27:27 -07:00
job_context.h Inform caller when rocksdb is stalling writes 2017-10-05 18:11:43 -07:00
listener_test.cc Revert "comment out unused parameters" 2017-07-21 18:26:26 -07:00
log_format.h Change RocksDB License 2017-07-15 16:11:23 -07:00
log_reader.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
log_reader.h Change RocksDB License 2017-07-15 16:11:23 -07:00
log_test.cc Enable MSVC W4 with a few exceptions. Fix warnings and bugs 2017-10-19 10:57:12 -07:00
log_writer.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
log_writer.h Change RocksDB License 2017-07-15 16:11:23 -07:00
malloc_stats.cc Revert "comment out unused parameters" 2017-07-21 18:26:26 -07:00
malloc_stats.h Change RocksDB License 2017-07-15 16:11:23 -07:00
managed_iterator.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
managed_iterator.h Change RocksDB License 2017-07-15 16:11:23 -07:00
manual_compaction_test.cc Revert "comment out unused parameters" 2017-07-21 18:26:26 -07:00
memtable_list_test.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
memtable_list.cc Add DB::Properties::kEstimateOldestKeyTime 2017-10-23 15:27:27 -07:00
memtable_list.h Add DB::Properties::kEstimateOldestKeyTime 2017-10-23 15:27:27 -07:00
memtable.cc Add DB::Properties::kEstimateOldestKeyTime 2017-10-23 15:27:27 -07:00
memtable.h Add DB::Properties::kEstimateOldestKeyTime 2017-10-23 15:27:27 -07:00
merge_context.h Change RocksDB License 2017-07-15 16:11:23 -07:00
merge_helper_test.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
merge_helper.cc Allow merge operator to be called even with a single operand 2017-08-16 23:42:00 -07:00
merge_helper.h Allow merge operator to be called even with a single operand 2017-08-16 23:42:00 -07:00
merge_operator.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
merge_test.cc Revert "comment out unused parameters" 2017-07-21 18:26:26 -07:00
options_file_test.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
perf_context_test.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
pinned_iterators_manager.h Change RocksDB License 2017-07-15 16:11:23 -07:00
plain_table_db_test.cc Revert "comment out unused parameters" 2017-07-21 18:26:26 -07:00
prefix_test.cc Revert "comment out unused parameters" 2017-07-21 18:26:26 -07:00
range_del_aggregator_test.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
range_del_aggregator.cc Two small refactoring for better inlining 2017-09-14 15:41:49 -07:00
range_del_aggregator.h Two small refactoring for better inlining 2017-09-14 15:41:49 -07:00
read_callback.h write-prepared txn: call IsInSnapshot 2017-09-11 09:14:48 -07:00
repair_test.cc fix file numbers after repair 2017-10-10 13:12:37 -07:00
repair.cc WritePrepared Txn: Disable GC during recovery 2017-10-18 09:11:50 -07:00
snapshot_checker.h WritePrepared Txn: Disable GC during recovery 2017-10-18 09:11:50 -07:00
snapshot_impl.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
snapshot_impl.h WriteAtPrepare: Efficient read from snapshot list 2017-08-26 01:00:38 -07:00
table_cache.cc expose a hook to skip tables during iteration 2017-10-17 22:12:00 -07:00
table_cache.h Change RocksDB License 2017-07-15 16:11:23 -07:00
table_properties_collector_test.cc Revert "comment out unused parameters" 2017-07-21 18:26:26 -07:00
table_properties_collector.cc Revert "comment out unused parameters" 2017-07-21 18:26:26 -07:00
table_properties_collector.h Revert "comment out unused parameters" 2017-07-21 18:26:26 -07:00
transaction_log_impl.cc Enable two write queues for transactions 2017-10-23 14:27:04 -07:00
transaction_log_impl.h Change RocksDB License 2017-07-15 16:11:23 -07:00
version_builder_test.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
version_builder.cc VersionBuilder: Erase with iterators for better performance 2017-10-17 10:12:37 -07:00
version_builder.h Allow DB reopen with reduced options.num_levels 2017-08-24 16:10:54 -07:00
version_edit_test.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
version_edit.cc Revert "comment out unused parameters" 2017-07-21 18:26:26 -07:00
version_edit.h Change RocksDB License 2017-07-15 16:11:23 -07:00
version_set_test.cc Revert "comment out unused parameters" 2017-07-21 18:26:26 -07:00
version_set.cc single-file bottom-level compaction when snapshot released 2017-10-25 16:30:37 -07:00
version_set.h single-file bottom-level compaction when snapshot released 2017-10-25 16:30:37 -07:00
wal_manager_test.cc Revert "comment out unused parameters" 2017-07-21 18:26:26 -07:00
wal_manager.cc Replace dynamic_cast<> 2017-07-28 16:27:16 -07:00
wal_manager.h Change RocksDB License 2017-07-15 16:11:23 -07:00
write_batch_base.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
write_batch_internal.h WritePrepared Txn: end-to-end tests 2017-10-06 14:26:45 -07:00
write_batch_test.cc Enable MSVC W4 with a few exceptions. Fix warnings and bugs 2017-10-19 10:57:12 -07:00
write_batch.cc Fix counter for memtable updates 2017-10-10 21:26:11 -07:00
write_callback_test.cc Exclude incompatible options in test 2017-09-12 14:58:46 -07:00
write_callback.h Change RocksDB License 2017-07-15 16:11:23 -07:00
write_controller_test.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
write_controller.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
write_controller.h Change RocksDB License 2017-07-15 16:11:23 -07:00
write_thread.cc WritePrepared Txn: Recovery 2017-09-28 16:56:45 -07:00
write_thread.h WritePrepared Txn: Recovery 2017-09-28 16:56:45 -07:00