rocksdb/db
Baptiste Lemaire e51be2c5a1 Improve MemPurge sampling (#8656)
Summary:
Previously, the `MemPurge` sampling function was assessing whether a random entry from a memtable was garbage or not by simply querying the given memtable (see https://github.com/facebook/rocksdb/issues/8628 for more details).
In this diff, I am updating the sampling function by querying not only the memtable the entry was drawn from, but also all subsequent memtables that have a greater memtable ID.
I also added the size of the value for KV entries in the payload/useful payload estimates (which was also one of the reasons why sampling was not as good as mempurging all the time in terms of L0 SST files reduction).
Once these changes were made, I was able to clean obsolete objects and functions from the `MemtableList` struct, and did a bit of cleanup everywhere.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/8656

Reviewed By: pdillinger

Differential Revision: D30288583

Pulled By: bjlemaire

fbshipit-source-id: 7646a545ec56f4715949daa59ab5eee74540feb3
2021-08-13 14:35:41 -07:00
..
blob Fix a issue with initializing blob header buffer (#8537) 2021-08-02 17:15:06 -07:00
compaction Move old files to warm tier in FIFO compactions (#8310) 2021-08-09 12:51:14 -07:00
db_impl Make TraceRecord and Replayer public (#8611) 2021-08-11 19:32:46 -07:00
arena_wrapped_db_iter.cc Rename ImmutableOptions variables (#8409) 2021-06-16 16:51:38 -07:00
arena_wrapped_db_iter.h Rename ImmutableOptions variables (#8409) 2021-06-16 16:51:38 -07:00
builder.cc Memtable "MemPurge" prototype (#8454) 2021-07-02 05:23:02 -07:00
builder.h Added memtable garbage statistics (#8411) 2021-06-18 04:57:27 -07:00
c_test.c Add ribbon filter to C API (#8486) 2021-07-09 16:22:48 -07:00
c.cc Memtable sampling for mempurge heuristic. (#8628) 2021-08-10 18:09:03 -07:00
column_family_test.cc Add CreateFrom methods to Env/FileSystem (#8174) 2021-06-15 03:43:48 -07:00
column_family.cc Fix a race in ColumnFamilyData::UnrefAndTryDelete (#8605) 2021-08-02 18:12:11 -07:00
column_family.h Fix a race in ColumnFamilyData::UnrefAndTryDelete (#8605) 2021-08-02 18:12:11 -07:00
compact_files_test.cc Compaction should not move data to up level (#8116) 2021-03-29 17:10:42 -07:00
comparator_db_test.cc Remove unused includes (#7604) 2020-10-28 23:22:27 -07:00
convenience.cc Make ImmutableOptions struct that inherits from ImmutableCFOptions and ImmutableDBOptions (#8262) 2021-05-05 14:00:17 -07:00
corruption_test.cc Add CreateFrom methods to Env/FileSystem (#8174) 2021-06-15 03:43:48 -07:00
cuckoo_table_db_test.cc Revert "Turn on memtable bloom filter by default. (#6584)" (#7939) 2021-02-06 22:34:30 -08:00
db_basic_test.cc Fix the sorting of KeyContexts for batched MultiGet (#8633) 2021-08-06 16:27:42 -07:00
db_block_cache_test.cc Dynamically configure BlockBasedTableOptions.prepopulate_block_cache (#8620) 2021-08-05 19:44:51 -07:00
db_bloom_filter_test.cc Fix a minor issue with initializing the test path (#8555) 2021-07-23 08:38:45 -07:00
db_compaction_filter_test.cc Fix a minor issue with initializing the test path (#8555) 2021-07-23 08:38:45 -07:00
db_compaction_test.cc Move old files to warm tier in FIFO compactions (#8310) 2021-08-09 12:51:14 -07:00
db_dynamic_level_test.cc Fix a minor issue with initializing the test path (#8555) 2021-07-23 08:38:45 -07:00
db_encryption_test.cc Fix a minor issue with initializing the test path (#8555) 2021-07-23 08:38:45 -07:00
db_filesnapshot.cc DB::GetSortedWalFiles() to ensure file deletion is disabled (#8591) 2021-07-29 11:51:08 -07:00
db_flush_test.cc Memtable sampling for mempurge heuristic. (#8628) 2021-08-10 18:09:03 -07:00
db_info_dumper.cc Allow WAL dir to change with db dir (#8582) 2021-07-30 12:16:44 -07:00
db_info_dumper.h Add a DB Session ID (#6959) 2020-06-15 10:47:02 -07:00
db_inplace_update_test.cc Fix a minor issue with initializing the test path (#8555) 2021-07-23 08:38:45 -07:00
db_io_failure_test.cc Fix a minor issue with initializing the test path (#8555) 2021-07-23 08:38:45 -07:00
db_iter_stress_test.cc Make ImmutableOptions struct that inherits from ImmutableCFOptions and ImmutableDBOptions (#8262) 2021-05-05 14:00:17 -07:00
db_iter_test.cc Rename ImmutableOptions variables (#8409) 2021-06-16 16:51:38 -07:00
db_iter.cc Rename ImmutableOptions variables (#8409) 2021-06-16 16:51:38 -07:00
db_iter.h Rename ImmutableOptions variables (#8409) 2021-06-16 16:51:38 -07:00
db_iterator_test.cc Fix a minor issue with initializing the test path (#8555) 2021-07-23 08:38:45 -07:00
db_kv_checksum_test.cc Fix a minor issue with initializing the test path (#8555) 2021-07-23 08:38:45 -07:00
db_log_iter_test.cc Attempt to deflake DBTestXactLogIterator.TransactionLogIteratorCorruptedLog (#8627) 2021-08-10 11:10:07 -07:00
db_logical_block_size_cache_test.cc Add further tests to ASSERT_STATUS_CHECKED (2) (#7698) 2020-12-09 21:21:16 -08:00
db_memtable_test.cc Fix a minor issue with initializing the test path (#8555) 2021-07-23 08:38:45 -07:00
db_merge_operand_test.cc Fix a minor issue with initializing the test path (#8555) 2021-07-23 08:38:45 -07:00
db_merge_operator_test.cc Fix a minor issue with initializing the test path (#8555) 2021-07-23 08:38:45 -07:00
db_options_test.cc Fix a minor issue with initializing the test path (#8555) 2021-07-23 08:38:45 -07:00
db_properties_test.cc Fix a minor issue with initializing the test path (#8555) 2021-07-23 08:38:45 -07:00
db_range_del_test.cc Fix missing Handle release in TableCache::GetRangeTombstoneIterator (#8589) 2021-07-27 21:32:11 -07:00
db_secondary_test.cc Fix a minor issue with initializing the test path (#8555) 2021-07-23 08:38:45 -07:00
db_sst_test.cc Fix a minor issue with initializing the test path (#8555) 2021-07-23 08:38:45 -07:00
db_statistics_test.cc Fix a minor issue with initializing the test path (#8555) 2021-07-23 08:38:45 -07:00
db_table_properties_test.cc Fix a minor issue with initializing the test path (#8555) 2021-07-23 08:38:45 -07:00
db_tailing_iter_test.cc Fix a minor issue with initializing the test path (#8555) 2021-07-23 08:38:45 -07:00
db_test2.cc Make TraceRecord and Replayer public (#8611) 2021-08-11 19:32:46 -07:00
db_test_util.cc Make EncryptionProvider and BlockCipher into Customizable objects (#8354) 2021-07-16 07:58:51 -07:00
db_test_util.h Do not attempt to rename non-existent info log (#8622) 2021-08-04 17:25:00 -07:00
db_test.cc Fix a minor issue with initializing the test path (#8555) 2021-07-23 08:38:45 -07:00
db_universal_compaction_test.cc Fix a minor issue with initializing the test path (#8555) 2021-07-23 08:38:45 -07:00
db_wal_test.cc Fix a minor issue with initializing the test path (#8555) 2021-07-23 08:38:45 -07:00
db_with_timestamp_basic_test.cc Move slow valgrind tests behind -DROCKSDB_FULL_VALGRIND_RUN (#8475) 2021-07-07 11:14:05 -07:00
db_with_timestamp_compaction_test.cc Fix a minor issue with initializing the test path (#8555) 2021-07-23 08:38:45 -07:00
db_write_buffer_manager_test.cc Fix a minor issue with initializing the test path (#8555) 2021-07-23 08:38:45 -07:00
db_write_test.cc Fix a minor issue with initializing the test path (#8555) 2021-07-23 08:38:45 -07:00
dbformat_test.cc Remove unused includes (#7604) 2020-10-28 23:22:27 -07:00
dbformat.cc Make CompactRange and GetApproximateSizes work with timestamp (#7684) 2020-12-02 13:00:53 -08:00
dbformat.h Fix some typos in comments (#8066) 2021-03-25 21:18:08 -07:00
deletefile_test.cc Fix a minor issue with initializing the test path (#8555) 2021-07-23 08:38:45 -07:00
error_handler_fs_test.cc Fix a minor issue with initializing the test path (#8555) 2021-07-23 08:38:45 -07:00
error_handler.cc DB::GetSortedWalFiles() to ensure file deletion is disabled (#8591) 2021-07-29 11:51:08 -07:00
error_handler.h Fix some typos in comments (#8066) 2021-03-25 21:18:08 -07:00
event_helpers.cc Make EventListener into a Customizable Class (#8473) 2021-07-27 07:47:02 -07:00
event_helpers.h Pass SST file checksum information through OnTableFileCreated (#7108) 2020-08-25 10:46:11 -07:00
experimental.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
external_sst_file_basic_test.cc Fix a minor issue with initializing the test path (#8555) 2021-07-23 08:38:45 -07:00
external_sst_file_ingestion_job.cc Sync ingested files only if reopen is supported by the FS (#8296) 2021-05-18 19:33:55 -07:00
external_sst_file_ingestion_job.h Use SystemClock* instead of std::shared_ptr<SystemClock> in lower level routines (#8033) 2021-03-15 04:34:11 -07:00
external_sst_file_test.cc Fix a minor issue with initializing the test path (#8555) 2021-07-23 08:38:45 -07:00
fault_injection_test.cc Add more tests for assert status checked (#7524) 2020-12-22 23:45:58 -08:00
file_indexer_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
file_indexer.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
file_indexer.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
filename_test.cc Remove unused includes (#7604) 2020-10-28 23:22:27 -07:00
flush_job_test.cc Fix NotifyOnFlushCompleted() for atomic flush (#8585) 2021-08-03 13:31:10 -07:00
flush_job.cc Improve MemPurge sampling (#8656) 2021-08-13 14:35:41 -07:00
flush_job.h Improve MemPurge sampling (#8656) 2021-08-13 14:35:41 -07:00
flush_scheduler.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
flush_scheduler.h Include C++ standard library headers instead of C compatibility headers (#8068) 2021-03-19 12:09:47 -07:00
forward_iterator_bench.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
forward_iterator.cc Fix some typos in comments (#8066) 2021-03-25 21:18:08 -07:00
forward_iterator.h Properly report IO errors when IndexType::kBinarySearchWithFirstKey is used (#6621) 2020-04-15 17:40:44 -07:00
import_column_family_job.cc Add a SystemClock class to capture the time functions of an Env (#7858) 2021-01-25 22:09:11 -08:00
import_column_family_job.h Use SystemClock* instead of std::shared_ptr<SystemClock> in lower level routines (#8033) 2021-03-15 04:34:11 -07:00
import_column_family_test.cc Fix a minor issue with initializing the test path (#8555) 2021-07-23 08:38:45 -07:00
internal_stats.cc Don't hold DB mutex for block cache entry stat scans (#8538) 2021-07-16 14:13:08 -07:00
internal_stats.h Don't hold DB mutex for block cache entry stat scans (#8538) 2021-07-16 14:13:08 -07:00
job_context.h Rename ImmutableOptions variables (#8409) 2021-06-16 16:51:38 -07:00
kv_checksum.h Integrity protection for live updates to WriteBatch (#7748) 2021-01-29 12:18:58 -08:00
listener_test.cc Fix NotifyOnFlushCompleted() for atomic flush (#8585) 2021-08-03 13:31:10 -07:00
log_format.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
log_reader.cc Fix kPointInTimeRecovery handling of truncated WAL (#7701) 2020-11-30 18:11:38 -08:00
log_reader.h Real fix for race in backup custom checksum checking (#7309) 2020-08-26 10:39:20 -07:00
log_test.cc Make StringEnv, StringSink, StringSource use FS classes (#7786) 2021-01-04 16:01:01 -08:00
log_writer.cc Using existing crc32c checksum in checksum handoff for Manifest and WAL (#8412) 2021-06-25 00:47:17 -07:00
log_writer.h Include C++ standard library headers instead of C compatibility headers (#8068) 2021-03-19 12:09:47 -07:00
logs_with_prep_tracker.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
logs_with_prep_tracker.h Include C++ standard library headers instead of C compatibility headers (#8068) 2021-03-19 12:09:47 -07:00
lookup_key.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
malloc_stats.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
malloc_stats.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
manual_compaction_test.cc Add more tests for assert status checked (#7524) 2020-12-22 23:45:58 -08:00
memtable_list_test.cc Fix NotifyOnFlushCompleted() for atomic flush (#8585) 2021-08-03 13:31:10 -07:00
memtable_list.cc Improve MemPurge sampling (#8656) 2021-08-13 14:35:41 -07:00
memtable_list.h Improve MemPurge sampling (#8656) 2021-08-13 14:35:41 -07:00
memtable.cc Retire superfluous functions introduced in earlier mempurge PRs. (#8558) 2021-07-22 18:29:13 -07:00
memtable.h Memtable sampling for mempurge heuristic. (#8628) 2021-08-10 18:09:03 -07:00
merge_context.h Add Merge Operator support to WriteBatchWithIndex (#8135) 2021-05-10 12:50:25 -07:00
merge_helper_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
merge_helper.cc Add support for Merge with base value during Compaction in IntegratedBlobDB (#8445) 2021-06-24 18:11:30 -07:00
merge_helper.h Add support for Merge with base value during Compaction in IntegratedBlobDB (#8445) 2021-06-24 18:11:30 -07:00
merge_operator.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
merge_test.cc MergeHelper::FilterMerge() calling ElapsedNanosSafe() upon exit even … (#7867) 2021-01-21 13:13:02 -08:00
obsolete_files_test.cc Attempt to deflake ObsoleteFilesTest.DeleteObsoleteOptionsFile (#8624) 2021-08-05 18:36:16 -07:00
options_file_test.cc No elide constructors (#7798) 2020-12-23 16:55:53 -08:00
output_validator.cc Use NPHash64 in more places (#7632) 2020-11-10 23:42:13 -08:00
output_validator.h Add remote compaction public API (#8300) 2021-05-19 21:41:31 -07:00
perf_context_test.cc Use SystemClock* instead of std::shared_ptr<SystemClock> in lower level routines (#8033) 2021-03-15 04:34:11 -07:00
periodic_work_scheduler_test.cc Fix a minor issue with initializing the test path (#8555) 2021-07-23 08:38:45 -07:00
periodic_work_scheduler.cc Use SystemClock* instead of std::shared_ptr<SystemClock> in lower level routines (#8033) 2021-03-15 04:34:11 -07:00
periodic_work_scheduler.h Add a SystemClock class to capture the time functions of an Env (#7858) 2021-01-25 22:09:11 -08:00
pinned_iterators_manager.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
plain_table_db_test.cc Make ImmutableOptions struct that inherits from ImmutableCFOptions and ImmutableDBOptions (#8262) 2021-05-05 14:00:17 -07:00
pre_release_callback.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
prefix_test.cc Use SystemClock* instead of std::shared_ptr<SystemClock> in lower level routines (#8033) 2021-03-15 04:34:11 -07:00
range_del_aggregator_bench.cc Use SystemClock* instead of std::shared_ptr<SystemClock> in lower level routines (#8033) 2021-03-15 04:34:11 -07:00
range_del_aggregator_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
range_del_aggregator.cc In ParseInternalKey(), include corrupt key info in Status (#7515) 2020-10-28 10:12:58 -07:00
range_del_aggregator.h Fix some typos in comments (#8066) 2021-03-25 21:18:08 -07:00
range_tombstone_fragmenter_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
range_tombstone_fragmenter.cc Added memtable garbage statistics (#8411) 2021-06-18 04:57:27 -07:00
range_tombstone_fragmenter.h Added memtable garbage statistics (#8411) 2021-06-18 04:57:27 -07:00
read_callback.h Get() with timestamp should respect snapshot (#7227) 2020-08-14 19:20:58 -07:00
repair_test.cc Some fixes and enhancements to ldb repair (#8544) 2021-07-28 16:44:14 -07:00
repair.cc Allow WAL dir to change with db dir (#8582) 2021-07-30 12:16:44 -07:00
snapshot_checker.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
snapshot_impl.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
snapshot_impl.h Fix some typos in comments (#8066) 2021-03-25 21:18:08 -07:00
table_cache.cc Fix use-after-free on implicit temporary FileOptions (#8571) 2021-07-27 21:49:14 -07:00
table_cache.h Fix use-after-free on implicit temporary FileOptions (#8571) 2021-07-27 21:49:14 -07:00
table_properties_collector_test.cc Make it possible to apply only a subrange of table property collectors (#8298) 2021-05-17 18:28:39 -07:00
table_properties_collector.cc Apply sample_for_compression to all block-based tables (#8105) 2021-03-25 15:00:45 -07:00
table_properties_collector.h Partially revert the "apply subrange of table property collectors" change (#8465) 2021-07-06 10:14:32 -07:00
transaction_log_impl.cc Add more tests for assert status checked (#7524) 2020-12-22 23:45:58 -08:00
transaction_log_impl.h Store FileSystemPtr object that contains FileSystem ptr (#7180) 2020-08-12 17:31:23 -07:00
trim_history_scheduler.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
trim_history_scheduler.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
version_builder_test.cc Print blob file checksums as hex (#8437) 2021-06-22 09:49:44 -07:00
version_builder.cc Handle blob files when options.best_efforts_recovery is true (#8180) 2021-04-19 11:56:14 -07:00
version_builder.h Handle blob files when options.best_efforts_recovery is true (#8180) 2021-04-19 11:56:14 -07:00
version_edit_handler.cc Retire superfluous functions introduced in earlier mempurge PRs. (#8558) 2021-07-22 18:29:13 -07:00
version_edit_handler.h Fixed manifest_dump issues when printing keys and values containing null characters (#8378) 2021-06-10 12:55:20 -07:00
version_edit_test.cc Make it able to ignore WAL related VersionEdits in older versions (#7873) 2021-01-19 19:27:53 -08:00
version_edit.cc Write file temperature information to manifest (#8284) 2021-05-17 15:15:23 -07:00
version_edit.h Write file temperature information to manifest (#8284) 2021-05-17 15:15:23 -07:00
version_set_test.cc Print blob file checksums as hex (#8437) 2021-06-22 09:49:44 -07:00
version_set.cc Move old files to warm tier in FIFO compactions (#8310) 2021-08-09 12:51:14 -07:00
version_set.h Retire superfluous functions introduced in earlier mempurge PRs. (#8558) 2021-07-22 18:29:13 -07:00
wal_edit_test.cc Always track WAL obsoletion (#7759) 2020-12-09 16:02:12 -08:00
wal_edit.cc Always track WAL obsoletion (#7759) 2020-12-09 16:02:12 -08:00
wal_edit.h Always track WAL obsoletion (#7759) 2020-12-09 16:02:12 -08:00
wal_manager_test.cc Use DbSessionId as cache key prefix when secondary cache is enabled (#8360) 2021-06-10 11:02:43 -07:00
wal_manager.cc Allow WAL dir to change with db dir (#8582) 2021-07-30 12:16:44 -07:00
wal_manager.h Allow WAL dir to change with db dir (#8582) 2021-07-30 12:16:44 -07:00
write_batch_base.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
write_batch_internal.h Several simple local code clean-ups (#8565) 2021-07-30 12:07:49 -07:00
write_batch_test.cc Make ImmutableOptions struct that inherits from ImmutableCFOptions and ImmutableDBOptions (#8262) 2021-05-05 14:00:17 -07:00
write_batch.cc Several simple local code clean-ups (#8565) 2021-07-30 12:07:49 -07:00
write_callback_test.cc Move slow valgrind tests behind -DROCKSDB_FULL_VALGRIND_RUN (#8475) 2021-07-07 11:14:05 -07:00
write_callback.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
write_controller_test.cc Revamp WriteController (#8064) 2021-03-18 09:47:31 -07:00
write_controller.cc Revamp WriteController (#8064) 2021-03-18 09:47:31 -07:00
write_controller.h Revamp WriteController (#8064) 2021-03-18 09:47:31 -07:00
write_thread.cc Stall writes in WriteBufferManager when memory_usage exceeds buffer_size (#7898) 2021-04-21 13:54:02 -07:00
write_thread.h typo: fix typo in db/write_thread's state (#8423) 2021-06-18 17:14:51 -07:00