rocksdb/db
Peter Dillinger a8a422e962 Add manifest fix-up utility for file temperatures (#9683)
Summary:
The goal of this change is to allow changes to the "current" (in
FileSystem) file temperatures to feed back into DB metadata, so that
they can inform decisions and stats reporting. In part because of
modular code factoring, it doesn't seem easy to do this automagically,
where opening an SST file and observing current Temperature different
from expected would trigger a change in metadata and DB manifest write
(essentially giving the deep read path access to the write path). It is also
difficult to do this while the DB is open because of the limitations of
LogAndApply.

This change allows updating file temperature metadata on a closed DB
using an experimental utility function UpdateManifestForFilesState()
or `ldb update_manifest --update_temperatures`. This should suffice for
"migration" scenarios where outside tooling has placed or re-arranged DB
files into a (different) tiered configuration without going through
RocksDB itself (currently, only compaction can change temperature
metadata).
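
As a rough illustration of the idea only (not the code in this change), the fix-up can be thought of as comparing the temperature recorded in the manifest for each file with the temperature the file system currently reports, and collecting one metadata edit per mismatch. Every name below (`Temp`, `FileRecord`, `TemperatureEdit`, `PlanTemperatureEdits`) is a hypothetical stand-in invented for this sketch and is not part of the RocksDB API.

```cpp
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical stand-ins for illustration; these are NOT RocksDB types.
enum class Temp { kUnknown, kHot, kWarm, kCold };

// What the manifest currently records about one SST file.
struct FileRecord {
  uint64_t number;
  Temp temperature;
};

// One metadata correction that a fix-up pass would write to the manifest.
struct TemperatureEdit {
  uint64_t number;
  Temp from;
  Temp to;
};

// Compare each file's recorded temperature with the temperature observed in
// the file system and collect an edit for every mismatch. Files with no
// observation are left untouched.
std::vector<TemperatureEdit> PlanTemperatureEdits(
    const std::vector<FileRecord>& manifest,
    const std::map<uint64_t, Temp>& observed) {
  std::vector<TemperatureEdit> edits;
  for (const FileRecord& f : manifest) {
    auto it = observed.find(f.number);
    if (it == observed.end()) {
      continue;  // no current temperature known for this file
    }
    if (it->second != f.temperature) {
      edits.push_back({f.number, f.temperature, it->second});
    }
  }
  return edits;
}
```

In this toy model, an empty result means the metadata already matches the file system and nothing needs to be rewritten.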

Some details:
* Refactored and added unit test for `ldb unsafe_remove_sst_file` because
of shared functionality
* Pulled in autovector.h changes from https://github.com/facebook/rocksdb/issues/9546 to fix SuperVersionContext
move constructor (related to an older draft of this change)

Possible follow-up work:
* Support updating manifest with file checksums, such as when a
new checksum function is adopted and existing DB metadata should be
updated to match.
* For some repair scenarios that are lighter weight than full repair,
we might want UpdateManifestForFilesState() to modify critical file
details like size or checksum using the same algorithm. But let's make
sure these are differentiated from modifications that neither suggest
corruption nor require extreme trust.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/9683

Test Plan: unit tests added

Reviewed By: jay-zhuang

Differential Revision: D34798828

Pulled By: pdillinger

fbshipit-source-id: cfd83e8fb10761d8c9e7f9c020d68c9106a95554
2022-03-18 16:35:51 -07:00
blob Add rate limiter priority to ReadOptions (#9424) 2022-02-16 23:18:14 -08:00
compaction DisableManualCompaction may fail to cancel an unscheduled task (#9659) 2022-03-12 20:07:04 -08:00
db_impl Fix assertion error by doing comparison with mutex (#9717) 2022-03-18 13:11:57 -07:00
arena_wrapped_db_iter.cc fix: Reusing-Iterator reads stale keys after DeleteRange() performed (#9258) 2022-03-15 09:50:21 -07:00
arena_wrapped_db_iter.h Cleanup includes in dbformat.h (#8930) 2021-09-29 04:04:40 -07:00
builder.cc Expand auto recovery to background read errors (#9679) 2022-03-15 14:45:34 -07:00
builder.h Expose blob file information through the EventListener interface (#8675) 2021-09-16 17:23:36 -07:00
c_test.c Hide deprecated, inefficient block-based filter from public API (#9535) 2022-02-12 07:05:57 -08:00
c.cc Remove BlockBasedTableOptions.hash_index_allow_collision (#9454) 2022-03-01 13:58:02 -08:00
column_family_test.cc More refactoring ahead of footer & meta changes (#9240) 2021-12-10 08:13:26 -08:00
column_family.cc DisableManualCompaction may fail to cancel an unscheduled task (#9659) 2022-03-12 20:07:04 -08:00
column_family.h Add OpenAndTrimHistory API to support trimming data with specified timestamp (#9410) 2022-03-11 16:13:23 -08:00
compact_files_test.cc Fix test race conditions with OnFlushCompleted() (#9617) 2022-02-22 12:23:00 -08:00
comparator_db_test.cc More refactoring ahead of footer & meta changes (#9240) 2021-12-10 08:13:26 -08:00
convenience.cc Add rate limiter priority to ReadOptions (#9424) 2022-02-16 23:18:14 -08:00
corruption_test.cc Make the Env class Customizable (#9293) 2022-01-04 16:45:49 -08:00
cuckoo_table_db_test.cc Experimental support for SST unique IDs (#8990) 2021-10-18 23:32:01 -07:00
db_basic_test.cc Fix some MultiGet batching stats (#9583) 2022-02-17 16:31:41 -08:00
db_block_cache_test.cc Enhance new cache key testing & comments (#9329) 2022-02-04 14:15:58 -08:00
db_bloom_filter_test.cc Dynamic toggling of BlockBasedTableOptions::detect_filter_construct_corruption (#9654) 2022-03-04 10:35:08 -08:00
db_compaction_filter_test.cc Fix a minor issue with initializing the test path (#8555) 2021-07-23 08:38:45 -07:00
db_compaction_test.cc Fix a race condition when disable and enable manual compaction (#9694) 2022-03-15 12:31:14 -07:00
db_dynamic_level_test.cc Remove deprecated API AdvancedColumnFamilyOptions::soft_rate_limit/hard_rate_limit (#9452) 2022-01-27 13:01:09 -08:00
db_encryption_test.cc Fix a minor issue with initializing the test path (#8555) 2021-07-23 08:38:45 -07:00
db_filesnapshot.cc Use a sorted vector instead of a map to store blob file metadata (#9526) 2022-02-09 12:36:43 -08:00
db_flush_test.cc Fix mempurge crash reported in #8958 (#9671) 2022-03-10 15:16:55 -08:00
db_info_dumper.cc Allow WAL dir to change with db dir (#8582) 2021-07-30 12:16:44 -07:00
db_info_dumper.h Add a DB Session ID (#6959) 2020-06-15 10:47:02 -07:00
db_inplace_update_test.cc Fix a minor issue with initializing the test path (#8555) 2021-07-23 08:38:45 -07:00
db_io_failure_test.cc Enable a few unit tests to use custom Env objects (#9087) 2021-11-08 11:05:59 -08:00
db_iter_stress_test.cc Make ImmutableOptions struct that inherits from ImmutableCFOptions and ImmutableDBOptions (#8262) 2021-05-05 14:00:17 -07:00
db_iter_test.cc Remove iter_start_seqnum and preserve_deletes (#9430) 2022-01-28 13:28:38 -08:00
db_iter.cc Remove iter_start_seqnum and preserve_deletes (#9430) 2022-01-28 13:28:38 -08:00
db_iter.h Cleanup includes in dbformat.h (#8930) 2021-09-29 04:04:40 -07:00
db_iterator_test.cc Fix a minor issue with initializing the test path (#8555) 2021-07-23 08:38:45 -07:00
db_kv_checksum_test.cc Revise APIs related to user-defined timestamp (#8946) 2022-02-01 22:19:01 -08:00
db_log_iter_test.cc Attempt to deflake DBTestXactLogIterator.TransactionLogIteratorCorruptedLog (#8627) 2021-08-10 11:10:07 -07:00
db_logical_block_size_cache_test.cc Attempt to deflake DBLogicalBlockSizeCacheTest.CreateColumnFamilies (#9516) 2022-03-04 11:35:28 -08:00
db_memtable_test.cc Fix a minor issue with initializing the test path (#8555) 2021-07-23 08:38:45 -07:00
db_merge_operand_test.cc Fix PinSelf() read-after-free in DB::GetMergeOperands() (#9507) 2022-02-15 12:25:18 -08:00
db_merge_operator_test.cc Fix a minor issue with initializing the test path (#8555) 2021-07-23 08:38:45 -07:00
db_options_test.cc Remove deprecated option new_table_reader_for_compaction_inputs (#9443) 2022-02-08 19:31:28 -08:00
db_properties_test.cc compression_per_level should be used for flush and changeable (#9658) 2022-03-07 18:06:19 -08:00
db_range_del_test.cc fix: Reusing-Iterator reads stale keys after DeleteRange() performed (#9258) 2022-03-15 09:50:21 -07:00
db_rate_limiter_test.cc Rate-limit automatic WAL flush after each user write (#9607) 2022-03-08 13:19:39 -08:00
db_secondary_test.cc Make the Env class Customizable (#9293) 2022-01-04 16:45:49 -08:00
db_sst_test.cc Make the Env class Customizable (#9293) 2022-01-04 16:45:49 -08:00
db_statistics_test.cc Bytes read stat for VerifyChecksum() and VerifyFileChecksums() APIs (#8741) 2021-09-07 13:28:29 -07:00
db_table_properties_test.cc Fix backward compatibility with 2.5 through 2.7 (#9189) 2021-11-19 17:31:01 -08:00
db_tailing_iter_test.cc Fix a minor issue with initializing the test path (#8555) 2021-07-23 08:38:45 -07:00
db_test2.cc Add manifest fix-up utility for file temperatures (#9683) 2022-03-18 16:35:51 -07:00
db_test_util.cc Use a sorted vector instead of a map to store blob file metadata (#9526) 2022-02-09 12:36:43 -08:00
db_test_util.h New backup meta schema, with file temperatures (#9660) 2022-03-18 11:06:17 -07:00
db_test.cc Fix spelling in public API (#9490) 2022-02-03 15:15:23 -08:00
db_universal_compaction_test.cc Adhere to per-DB concurrency limit when bottom-pri compactions exist (#9179) 2021-11-18 17:31:50 -08:00
db_wal_test.cc Add record to set WAL compression type if enabled (#9556) 2022-02-17 16:19:31 -08:00
db_with_timestamp_basic_test.cc Add OpenAndTrimHistory API to support trimming data with specified timestamp (#9410) 2022-03-11 16:13:23 -08:00
db_with_timestamp_compaction_test.cc Use the comparator from the sst file table properties in sst_dump_tool (#9491) 2022-02-08 12:15:35 -08:00
db_write_buffer_manager_test.cc Enable a few unit tests to use custom Env objects (#9087) 2021-11-08 11:05:59 -08:00
db_write_test.cc Enable a few unit tests to use custom Env objects (#9087) 2021-11-08 11:05:59 -08:00
dbformat_test.cc Enable a few unit tests to use custom Env objects (#9087) 2021-11-08 11:05:59 -08:00
dbformat.cc Track per-SST user-defined timestamp information in MANIFEST (#9092) 2021-11-10 10:49:04 -08:00
dbformat.h Add OpenAndTrimHistory API to support trimming data with specified timestamp (#9410) 2022-03-11 16:13:23 -08:00
deletefile_test.cc Enable a few unit tests to use custom Env objects (#9087) 2021-11-08 11:05:59 -08:00
error_handler_fs_test.cc Expand auto recovery to background read errors (#9679) 2022-03-15 14:45:34 -07:00
error_handler.cc Expand auto recovery to background read errors (#9679) 2022-03-15 14:45:34 -07:00
error_handler.h Expand auto recovery to background read errors (#9679) 2022-03-15 14:45:34 -07:00
event_helpers.cc Add a listener callback for end of auto error recovery (#9244) 2021-12-08 14:30:57 -08:00
event_helpers.h Add a listener callback for end of auto error recovery (#9244) 2021-12-08 14:30:57 -08:00
experimental.cc Add manifest fix-up utility for file temperatures (#9683) 2022-03-18 16:35:51 -07:00
external_sst_file_basic_test.cc Enable a few unit tests to use custom Env objects (#9087) 2021-11-08 11:05:59 -08:00
external_sst_file_ingestion_job.cc Do not rely on ADL when invoking std::max_element (#9608) 2022-03-02 17:41:02 -08:00
external_sst_file_ingestion_job.h New stable, fixed-length cache keys (#9126) 2021-12-16 17:15:13 -08:00
external_sst_file_test.cc Make the Env class Customizable (#9293) 2022-01-04 16:45:49 -08:00
fault_injection_test.cc Fix a bug causing duplicate trailing entries in WritableFile (buffered IO) (#9236) 2021-12-13 09:00:36 -08:00
file_indexer_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
file_indexer.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
file_indexer.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
filename_test.cc Remove unused includes (#7604) 2020-10-28 23:22:27 -07:00
flush_job_test.cc Use the comparator from the sst file table properties in sst_dump_tool (#9491) 2022-02-08 12:15:35 -08:00
flush_job.cc Expand auto recovery to background read errors (#9679) 2022-03-15 14:45:34 -07:00
flush_job.h Expand auto recovery to background read errors (#9679) 2022-03-15 14:45:34 -07:00
flush_scheduler.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
flush_scheduler.h Include C++ standard library headers instead of C compatibility headers (#8068) 2021-03-19 12:09:47 -07:00
forward_iterator_bench.cc Remove using namespace (#9369) 2022-01-12 09:31:12 -08:00
forward_iterator.cc Fast path for detecting unchanged prefix_extractor (#9407) 2022-01-21 11:37:46 -08:00
forward_iterator.h Fast path for detecting unchanged prefix_extractor (#9407) 2022-01-21 11:37:46 -08:00
history_trimming_iterator.h Add OpenAndTrimHistory API to support trimming data with specified timestamp (#9410) 2022-03-11 16:13:23 -08:00
import_column_family_job.cc Add Temperature info in NewSequentialFile() (#9499) 2022-02-18 18:23:07 -08:00
import_column_family_job.h New stable, fixed-length cache keys (#9126) 2021-12-16 17:15:13 -08:00
import_column_family_test.cc Fix a minor issue with initializing the test path (#8555) 2021-07-23 08:38:45 -07:00
internal_stats.cc Add manifest fix-up utility for file temperatures (#9683) 2022-03-18 16:35:51 -07:00
internal_stats.h Support GetMapProperty() with "rocksdb.dbstats" (#9057) 2021-10-20 13:17:00 -07:00
job_context.h Add manifest fix-up utility for file temperatures (#9683) 2022-03-18 16:35:51 -07:00
kv_checksum.h fix compile errors in db/kv_checksum.h (#9173) 2021-11-16 10:20:50 -08:00
listener_test.cc Fix test race conditions with OnFlushCompleted() (#9617) 2022-02-22 12:23:00 -08:00
log_format.h Add record to set WAL compression type if enabled (#9556) 2022-02-17 16:19:31 -08:00
log_reader.cc Integrate WAL compression into log reader/writer. (#9642) 2022-03-09 15:49:53 -08:00
log_reader.h Integrate WAL compression into log reader/writer. (#9642) 2022-03-09 15:49:53 -08:00
log_test.cc Integrate WAL compression into log reader/writer. (#9642) 2022-03-09 15:49:53 -08:00
log_writer.cc Integrate WAL compression into log reader/writer. (#9642) 2022-03-09 15:49:53 -08:00
log_writer.h Integrate WAL compression into log reader/writer. (#9642) 2022-03-09 15:49:53 -08:00
logs_with_prep_tracker.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
logs_with_prep_tracker.h Include C++ standard library headers instead of C compatibility headers (#8068) 2021-03-19 12:09:47 -07:00
lookup_key.h Cleanup includes in dbformat.h (#8930) 2021-09-29 04:04:40 -07:00
malloc_stats.cc Replace most typedef with using= (#8751) 2021-09-07 11:31:59 -07:00
malloc_stats.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
manual_compaction_test.cc Remove using namespace (#9369) 2022-01-12 09:31:12 -08:00
memtable_list_test.cc Expand auto recovery to background read errors (#9679) 2022-03-15 14:45:34 -07:00
memtable_list.cc Expand auto recovery to background read errors (#9679) 2022-03-15 14:45:34 -07:00
memtable_list.h Expand auto recovery to background read errors (#9679) 2022-03-15 14:45:34 -07:00
memtable.cc Fix major bug with MultiGet, DeleteRange, and memtable Bloom (#9453) 2022-01-27 14:55:04 -08:00
memtable.h Fix major bug with MultiGet, DeleteRange, and memtable Bloom (#9453) 2022-01-27 14:55:04 -08:00
merge_context.h Add Merge Operator support to WriteBatchWithIndex (#8135) 2021-05-10 12:50:25 -07:00
merge_helper_test.cc Support readahead during compaction for blob files (#9187) 2021-11-19 17:53:47 -08:00
merge_helper.cc Support readahead during compaction for blob files (#9187) 2021-11-19 17:53:47 -08:00
merge_helper.h Support readahead during compaction for blob files (#9187) 2021-11-19 17:53:47 -08:00
merge_operator.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
merge_test.cc Make the Env class Customizable (#9293) 2022-01-04 16:45:49 -08:00
obsolete_files_test.cc Add commit marker with timestamp (#9266) 2021-12-10 11:05:35 -08:00
options_file_test.cc No elide constructors (#7798) 2020-12-23 16:55:53 -08:00
output_validator.cc Cleanup includes in dbformat.h (#8930) 2021-09-29 04:04:40 -07:00
output_validator.h Cleanup includes in dbformat.h (#8930) 2021-09-29 04:04:40 -07:00
perf_context_test.cc Use SystemClock* instead of std::shared_ptr<SystemClock> in lower level routines (#8033) 2021-03-15 04:34:11 -07:00
periodic_work_scheduler_test.cc Fix a minor issue with initializing the test path (#8555) 2021-07-23 08:38:45 -07:00
periodic_work_scheduler.cc Fix a timer crash caused by invalid memory management (#9656) 2022-03-12 11:45:56 -08:00
periodic_work_scheduler.h Fix a timer crash caused by invalid memory management (#9656) 2022-03-12 11:45:56 -08:00
pinned_iterators_manager.h Replace most typedef with using= (#8751) 2021-09-07 11:31:59 -07:00
plain_table_db_test.cc Fast path for detecting unchanged prefix_extractor (#9407) 2022-01-21 11:37:46 -08:00
pre_release_callback.h Fix and detect headers with missing dependencies (#8893) 2021-09-10 10:00:26 -07:00
prefix_test.cc Use SystemClock* instead of std::shared_ptr<SystemClock> in lower level routines (#8033) 2021-03-15 04:34:11 -07:00
range_del_aggregator_bench.cc Cleanup multiple implementations of VectorIterator (#8901) 2021-10-06 07:48:31 -07:00
range_del_aggregator_test.cc Cleanup multiple implementations of VectorIterator (#8901) 2021-10-06 07:48:31 -07:00
range_del_aggregator.cc In ParseInternalKey(), include corrupt key info in Status (#7515) 2020-10-28 10:12:58 -07:00
range_del_aggregator.h Fix some typos in comments (#8066) 2021-03-25 21:18:08 -07:00
range_tombstone_fragmenter_test.cc Cleanup multiple implementations of VectorIterator (#8901) 2021-10-06 07:48:31 -07:00
range_tombstone_fragmenter.cc Added memtable garbage statistics (#8411) 2021-06-18 04:57:27 -07:00
range_tombstone_fragmenter.h Added memtable garbage statistics (#8411) 2021-06-18 04:57:27 -07:00
read_callback.h Fix and detect headers with missing dependencies (#8893) 2021-09-10 10:00:26 -07:00
repair_test.cc Some fixes and enhancements to ldb repair (#8544) 2021-07-28 16:44:14 -07:00
repair.cc Fast path for detecting unchanged prefix_extractor (#9407) 2022-01-21 11:37:46 -08:00
snapshot_checker.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
snapshot_impl.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
snapshot_impl.h Fix and detect headers with missing dependencies (#8893) 2021-09-10 10:00:26 -07:00
table_cache.cc fix a bug of the ticker NO_FILE_OPENS (#9677) 2022-03-15 09:55:49 -07:00
table_cache.h Fast path for detecting unchanged prefix_extractor (#9407) 2022-01-21 11:37:46 -08:00
table_properties_collector_test.cc Improve / clean up meta block code & integrity (#9163) 2021-11-18 11:43:44 -08:00
table_properties_collector.cc Apply sample_for_compression to all block-based tables (#8105) 2021-03-25 15:00:45 -07:00
table_properties_collector.h Track each SST's timestamp information as user properties (#9093) 2021-11-19 11:37:06 -08:00
transaction_log_impl.cc Add commit marker with timestamp (#9266) 2021-12-10 11:05:35 -08:00
transaction_log_impl.h Cleanup includes in dbformat.h (#8930) 2021-09-29 04:04:40 -07:00
trim_history_scheduler.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
trim_history_scheduler.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
version_builder_test.cc Use a sorted vector instead of a map to store blob file metadata (#9526) 2022-02-09 12:36:43 -08:00
version_builder.cc Use a sorted vector instead of a map to store blob file metadata (#9526) 2022-02-09 12:36:43 -08:00
version_builder.h Fast path for detecting unchanged prefix_extractor (#9407) 2022-01-21 11:37:46 -08:00
version_edit_handler.cc Clean up VersionStorageInfo a bit (#9494) 2022-02-04 08:19:20 -08:00
version_edit_handler.h Fixed manifest_dump issues when printing keys and values containing null characters (#8378) 2021-06-10 12:55:20 -07:00
version_edit_test.cc File temperature information should be preserved when restart the DB (#9242) 2021-12-03 14:43:14 -08:00
version_edit.cc Print file checksum in hex (#9196) 2021-11-22 09:30:47 -08:00
version_edit.h Clean up VersionStorageInfo a bit (#9494) 2022-02-04 08:19:20 -08:00
version_set_test.cc Rework VersionStorageInfo::ComputeFilesMarkedForForcedBlobGC a bit (#9548) 2022-02-11 08:41:41 -08:00
version_set.cc Fix some MultiGet batching stats (#9583) 2022-02-17 16:31:41 -08:00
version_set.h Fix PinSelf() read-after-free in DB::GetMergeOperands() (#9507) 2022-02-15 12:25:18 -08:00
version_util.h Add manifest fix-up utility for file temperatures (#9683) 2022-03-18 16:35:51 -07:00
wal_edit_test.cc Always track WAL obsoletion (#7759) 2020-12-09 16:02:12 -08:00
wal_edit.cc Always track WAL obsoletion (#7759) 2020-12-09 16:02:12 -08:00
wal_edit.h Always track WAL obsoletion (#7759) 2020-12-09 16:02:12 -08:00
wal_manager_test.cc Make SystemClock into a Customizable Class (#8636) 2021-09-21 09:23:48 -07:00
wal_manager.cc Allow WAL dir to change with db dir (#8582) 2021-07-30 12:16:44 -07:00
wal_manager.h Allow WAL dir to change with db dir (#8582) 2021-07-30 12:16:44 -07:00
write_batch_base.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
write_batch_internal.h Support WBWI for keys having timestamps (#9603) 2022-02-22 14:23:01 -08:00
write_batch_test.cc Support WBWI for keys having timestamps (#9603) 2022-02-22 14:23:01 -08:00
write_batch.cc Improve stress test for transactions (#9568) 2022-03-16 19:00:04 -07:00
write_callback_test.cc Move slow valgrind tests behind -DROCKSDB_FULL_VALGRIND_RUN (#8475) 2021-07-07 11:14:05 -07:00
write_callback.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
write_controller_test.cc Revamp WriteController (#8064) 2021-03-18 09:47:31 -07:00
write_controller.cc Revamp WriteController (#8064) 2021-03-18 09:47:31 -07:00
write_controller.h Revamp WriteController (#8064) 2021-03-18 09:47:31 -07:00
write_thread.cc Rate-limit automatic WAL flush after each user write (#9607) 2022-03-08 13:19:39 -08:00
write_thread.h Rate-limit automatic WAL flush after each user write (#9607) 2022-03-08 13:19:39 -08:00