rocksdb/db
Peter Dillinger 8a6c925ca7 Restore file size in backup table file names (and other cleanup)
Summary: Prior to 6.12, backup files using share_files_with_checksum had
the file size encoded in the file name, after the last '_' and before
the last '.'. We considered this an implementation detail subject to
change, and indeed removed this information from the file name (with an
option to use old behavior) because it was considered
ineffective/inefficient for file name uniqueness. However, some
downstream RocksDB users were relying on this information since the file
size is not explicitly in the backup manifest file.

This primary purpose of this change is "retrofitting" the 6.12 release
(not yet a public release) to simultaneously support the benefits of the
new naming scheme (I/O performance and data correctness at scale) and
preserve the file size information, both as default behaviors. With this
change, we are essentially making the file size information encoded in
the file name an official, though obscure, extension of the backup meta
file format.

We preserve an option (kLegacyCrc32cAndFileSize) to use the original
"legacy" naming scheme, with its caveats, and make it easy to omit the
file size information (no kFlagIncludeFileSize), for more compact file
names. But note that changing the naming scheme used on an existing db
and backup directory can lead to transient space amplification, as some
files will be stored under two names in the shared_checksum directory.
Because some backups were saved using the original 6.12 naming scheme,
we offer two ways of dealing with those files: SST files generated by
older 6.12 versions can either use the default naming scheme in effect
when the SST files were generated (kFlagMatchInterimNaming, default, no
transient space amplification) or can use a new naming scheme (no
kFlagMatchInterimNaming, potential space amplification because some
already stored files getting a new name).

We don't have a natural way to detect which files were generated by
previous 6.12 versions, but this change hacks one in by changing DB
session ids to now use a more concise encoding, reducing file name
length, saving ~dozen bytes from SST files, and making them visually
distinct from DB ids so that they are less likely to be mixed up.

Finally, recognizing that the backup file names have become a de facto
part of the backup meta schema, this change makes them easier to parse
and extend by putting a distinct marker, 's', before DB session ids
embedded in the name. When we extend this to allow custom checksums in
the name, they can get their own marker to ensure safe parsing. For
backward compatibility, file size does not get a marker but is assumed
for _[0-9]+[.]

Test Plan: unit tests included. Sync point callbacks are used to mimic
previous version SST files.
2020-09-17 00:22:30 -07:00
..
blob Sync blob files before closing them (#7160) 2020-07-22 17:25:20 -07:00
compaction Fix wrong level args (#7346) 2020-09-16 09:31:03 -07:00
db_impl Restore file size in backup table file names (and other cleanup) 2020-09-17 00:22:30 -07:00
arena_wrapped_db_iter.cc Fix a bug that causes iterator to return wrong result in a rare data race (#6973) 2020-06-18 10:16:38 -07:00
arena_wrapped_db_iter.h Iterator with timestamp (#6255) 2020-03-06 16:24:27 -08:00
builder.cc Add hash of key/value checks when paranoid_file_checks=true (#7134) 2020-07-22 11:04:40 -07:00
builder.h Store DB identity and DB session ID in SST files (#6983) 2020-06-17 10:57:40 -07:00
c_test.c export stats_persist_period_sec (#7168) 2020-07-28 13:05:34 -07:00
c.cc export stats_persist_period_sec (#7168) 2020-07-28 13:05:34 -07:00
column_family_test.cc column_family_test: fix a data race related to sleeping task (#7150) 2020-07-20 14:19:48 -07:00
column_family.cc Make max_subcompactions dynamically changeable (#7159) 2020-07-22 18:32:52 -07:00
column_family.h Make max_subcompactions dynamically changeable (#7159) 2020-07-22 18:32:52 -07:00
compact_files_test.cc Replace reinterpret_cast with static_cast_with_check (#7067) 2020-07-02 19:25:41 -07:00
compacted_db_impl.cc Replace reinterpret_cast with static_cast_with_check (#7067) 2020-07-02 19:25:41 -07:00
compacted_db_impl.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
comparator_db_test.cc More Makefile Cleanup (#7097) 2020-07-09 14:35:17 -07:00
convenience.cc sst_dump to reduce number of file reads (#6836) 2020-05-12 18:23:33 -07:00
corruption_test.cc Add hash of key/value checks when paranoid_file_checks=true (#7134) 2020-07-22 11:04:40 -07:00
cuckoo_table_db_test.cc Replace reinterpret_cast with static_cast_with_check (#7067) 2020-07-02 19:25:41 -07:00
db_basic_test.cc Restore file size in backup table file names (and other cleanup) 2020-09-17 00:22:30 -07:00
db_block_cache_test.cc More Makefile Cleanup (#7097) 2020-07-09 14:35:17 -07:00
db_bloom_filter_test.cc Revert "Whole DBTest to skip fsync (#7049)" (#7070) 2020-07-02 10:22:43 -07:00
db_compaction_filter_test.cc Revert "Whole DBTest to skip fsync (#7049)" (#7070) 2020-07-02 10:22:43 -07:00
db_compaction_test.cc SST Partitioner interface that allows to split SST files (#6957) 2020-07-24 13:44:49 -07:00
db_dynamic_level_test.cc More Makefile Cleanup (#7097) 2020-07-09 14:35:17 -07:00
db_encryption_test.cc Revert "Whole DBTest to skip fsync (#7049)" (#7070) 2020-07-02 10:22:43 -07:00
db_filesnapshot.cc First step towards handling MANIFEST write error (#6949) 2020-06-24 19:07:08 -07:00
db_flush_test.cc More Makefile Cleanup (#7097) 2020-07-09 14:35:17 -07:00
db_info_dumper.cc Add a DB Session ID (#6959) 2020-06-15 10:47:02 -07:00
db_info_dumper.h Add a DB Session ID (#6959) 2020-06-15 10:47:02 -07:00
db_inplace_update_test.cc Revert "Whole DBTest to skip fsync (#7049)" (#7070) 2020-07-02 10:22:43 -07:00
db_io_failure_test.cc More Makefile Cleanup (#7097) 2020-07-09 14:35:17 -07:00
db_iter_stress_test.cc Test CircleCI with CLANG-10 (#7025) 2020-06-24 16:22:49 -07:00
db_iter_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
db_iter.cc Add timestamp to delete (#6253) 2020-05-28 10:40:03 -07:00
db_iter.h make iterator return versions between timestamp bounds (#6544) 2020-04-10 09:51:58 -07:00
db_iterator_test.cc More Makefile Cleanup (#7097) 2020-07-09 14:35:17 -07:00
db_log_iter_test.cc Revert "Whole DBTest to skip fsync (#7049)" (#7070) 2020-07-02 10:22:43 -07:00
db_logical_block_size_cache_test.cc Get block size only in direct IO mode (#6522) 2020-03-20 15:26:10 -07:00
db_memtable_test.cc Revert "Whole DBTest to skip fsync (#7049)" (#7070) 2020-07-02 10:22:43 -07:00
db_merge_operand_test.cc More Makefile Cleanup (#7097) 2020-07-09 14:35:17 -07:00
db_merge_operator_test.cc More Makefile Cleanup (#7097) 2020-07-09 14:35:17 -07:00
db_options_test.cc More Makefile Cleanup (#7097) 2020-07-09 14:35:17 -07:00
db_properties_test.cc More Makefile Cleanup (#7097) 2020-07-09 14:35:17 -07:00
db_range_del_test.cc More Makefile Cleanup (#7097) 2020-07-09 14:35:17 -07:00
db_sst_test.cc More Makefile Cleanup (#7097) 2020-07-09 14:35:17 -07:00
db_statistics_test.cc More Makefile Cleanup (#7097) 2020-07-09 14:35:17 -07:00
db_table_properties_test.cc More Makefile Cleanup (#7097) 2020-07-09 14:35:17 -07:00
db_tailing_iter_test.cc Revert "Whole DBTest to skip fsync (#7049)" (#7070) 2020-07-02 10:22:43 -07:00
db_test2.cc More Makefile Cleanup (#7097) 2020-07-09 14:35:17 -07:00
db_test_util.cc More Makefile Cleanup (#7097) 2020-07-09 14:35:17 -07:00
db_test_util.h More Makefile Cleanup (#7097) 2020-07-09 14:35:17 -07:00
db_test.cc Clean snapshot dir before taking snapshot (#7156) 2020-07-22 13:54:01 -07:00
db_universal_compaction_test.cc More Makefile Cleanup (#7097) 2020-07-09 14:35:17 -07:00
db_wal_test.cc More Makefile Cleanup (#7097) 2020-07-09 14:35:17 -07:00
db_with_timestamp_basic_test.cc More Makefile Cleanup (#7097) 2020-07-09 14:35:17 -07:00
db_with_timestamp_compaction_test.cc Revert "Whole DBTest to skip fsync (#7049)" (#7070) 2020-07-02 10:22:43 -07:00
db_write_test.cc More Makefile Cleanup (#7097) 2020-07-09 14:35:17 -07:00
dbformat_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
dbformat.cc Separate internal and user key comparators in BlockIter (#6944) 2020-07-07 17:26:16 -07:00
dbformat.h Separate internal and user key comparators in BlockIter (#6944) 2020-07-07 17:26:16 -07:00
deletefile_test.cc Revert "Whole DBTest to skip fsync (#7049)" (#7070) 2020-07-02 10:22:43 -07:00
error_handler_fs_test.cc Remove time out testing cases in error_handler_fs_test (#7141) 2020-07-17 23:27:21 -07:00
error_handler.cc Auto resume the DB from Retryable IO Error (#6765) 2020-07-15 11:03:58 -07:00
error_handler.h Auto resume the DB from Retryable IO Error (#6765) 2020-07-15 11:03:58 -07:00
event_helpers.cc Store DB identity and DB session ID in SST files (#6983) 2020-06-17 10:57:40 -07:00
event_helpers.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
experimental.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
external_sst_file_basic_test.cc More Makefile Cleanup (#7097) 2020-07-09 14:35:17 -07:00
external_sst_file_ingestion_job.cc Ingest SST files with checksum information (#6891) 2020-06-11 14:27:36 -07:00
external_sst_file_ingestion_job.h Ingest SST files with checksum information (#6891) 2020-06-11 14:27:36 -07:00
external_sst_file_test.cc More Makefile Cleanup (#7097) 2020-07-09 14:35:17 -07:00
fault_injection_test.cc More Makefile Cleanup (#7097) 2020-07-09 14:35:17 -07:00
file_indexer_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
file_indexer.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
file_indexer.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
filename_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
flush_job_test.cc More Makefile Cleanup (#7097) 2020-07-09 14:35:17 -07:00
flush_job.cc Store DB identity and DB session ID in SST files (#6983) 2020-06-17 10:57:40 -07:00
flush_job.h Store DB identity and DB session ID in SST files (#6983) 2020-06-17 10:57:40 -07:00
flush_scheduler.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
flush_scheduler.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
forward_iterator_bench.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
forward_iterator.cc make L0 index/filter pinned memory usage predictable (#6911) 2020-06-09 16:51:23 -07:00
forward_iterator.h Properly report IO errors when IndexType::kBinarySearchWithFirstKey is used (#6621) 2020-04-15 17:40:44 -07:00
import_column_family_job.cc Fix potential size_t overflow in import_column_family (#6762) 2020-04-30 08:40:42 -07:00
import_column_family_job.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
import_column_family_test.cc More Makefile Cleanup (#7097) 2020-07-09 14:35:17 -07:00
internal_stats.cc First step towards handling MANIFEST write error (#6949) 2020-06-24 19:07:08 -07:00
internal_stats.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
job_context.h Expose the set of live blob files from Version/VersionSet (#6785) 2020-05-04 15:08:13 -07:00
listener_test.cc Use steady_clock instead of system_clock in FileOperationInfo::TimePoint (#7153) 2020-07-22 08:55:02 -07:00
log_format.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
log_reader.cc Fail point-in-time WAL recovery upon IOError reading WAL (#6963) 2020-06-11 18:42:10 -07:00
log_reader.h Fix tabs and lint-ignores (#6734) 2020-04-20 11:39:31 -07:00
log_test.cc Revert "Update googletest from 1.8.1 to 1.10.0 (#6808)" (#6923) 2020-06-03 15:55:03 -07:00
log_writer.cc Fail recovery when MANIFEST record checksum mismatch (#6996) 2020-06-18 10:09:12 -07:00
log_writer.h Pass IOStatus to write path and set retryable IO Error as hard error in BG jobs (#6487) 2020-03-27 16:04:43 -07:00
logs_with_prep_tracker.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
logs_with_prep_tracker.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
lookup_key.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
malloc_stats.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
malloc_stats.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
manual_compaction_test.cc Skip high levels with no key falling in the range in CompactRange (#6482) 2020-03-04 20:15:25 -08:00
memtable_list_test.cc Pass IOStatus to write path and set retryable IO Error as hard error in BG jobs (#6487) 2020-03-27 16:04:43 -07:00
memtable_list.cc Fix data race to VersionSet::io_status_ (#7034) 2020-06-27 08:57:31 -07:00
memtable_list.h Fix some defects reported by Coverity Scan (#6933) 2020-06-04 15:46:27 -07:00
memtable.cc Implement NextAndGetResult() in memtable and level iterator (#7179) 2020-07-29 09:45:21 -07:00
memtable.h return timestamp from get (#6409) 2020-03-02 16:01:00 -08:00
merge_context.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
merge_helper_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
merge_helper.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
merge_helper.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
merge_operator.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
merge_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
obsolete_files_test.cc Revert "Whole DBTest to skip fsync (#7049)" (#7070) 2020-07-02 10:22:43 -07:00
options_file_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
perf_context_test.cc C++20 compatibility (#6697) 2020-04-20 13:24:25 -07:00
pinned_iterators_manager.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
plain_table_db_test.cc More Makefile Cleanup (#7097) 2020-07-09 14:35:17 -07:00
pre_release_callback.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
prefix_test.cc Replace reinterpret_cast with static_cast_with_check (#7067) 2020-07-02 19:25:41 -07:00
range_del_aggregator_bench.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
range_del_aggregator_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
range_del_aggregator.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
range_del_aggregator.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
range_tombstone_fragmenter_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
range_tombstone_fragmenter.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
range_tombstone_fragmenter.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
read_callback.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
repair_test.cc Revert "Whole DBTest to skip fsync (#7049)" (#7070) 2020-07-02 10:22:43 -07:00
repair.cc Reduce env_->GetChildren() calls in DBImpl::Recover() (#7044) 2020-07-10 13:41:08 -07:00
snapshot_checker.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
snapshot_impl.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
snapshot_impl.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
table_cache.cc Extend Get/MultiGet deadline support to table open (#6982) 2020-06-29 14:53:17 -07:00
table_cache.h Extend Get/MultiGet deadline support to table open (#6982) 2020-06-29 14:53:17 -07:00
table_properties_collector_test.cc Revert "Update googletest from 1.8.1 to 1.10.0 (#6808)" (#6923) 2020-06-03 15:55:03 -07:00
table_properties_collector.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
table_properties_collector.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
transaction_log_impl.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
transaction_log_impl.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
trim_history_scheduler.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
trim_history_scheduler.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
version_builder_test.cc Clean up blob files based on the linked SST set (#7001) 2020-06-30 15:31:21 -07:00
version_builder.cc Clean up blob files based on the linked SST set (#7001) 2020-06-30 15:31:21 -07:00
version_builder.h make L0 index/filter pinned memory usage predictable (#6911) 2020-06-09 16:51:23 -07:00
version_edit_handler.cc Fail recovery when MANIFEST record checksum mismatch (#6996) 2020-06-18 10:09:12 -07:00
version_edit_handler.h Fail recovery when MANIFEST record checksum mismatch (#6996) 2020-06-18 10:09:12 -07:00
version_edit_test.cc Revert "Added the safe-to-ignore tag to version_edit (#6530)" (#6569) 2020-03-23 10:27:47 -07:00
version_edit.cc Remove unnecessary inclusion of version_edit.h in env (#6952) 2020-06-07 21:56:55 -07:00
version_edit.h Remove unnecessary inclusion of version_edit.h in env (#6952) 2020-06-07 21:56:55 -07:00
version_set_test.cc Clean up blob files based on the linked SST set (#7001) 2020-06-30 15:31:21 -07:00
version_set.cc Implement NextAndGetResult() in memtable and level iterator (#7179) 2020-07-29 09:45:21 -07:00
version_set.h Suppress a TSAN warning (#7126) 2020-07-15 13:25:14 -07:00
wal_manager_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
wal_manager.cc Fix FilterBench when RTTI=0 (#6732) 2020-04-29 13:09:23 -07:00
wal_manager.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
write_batch_base.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
write_batch_internal.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
write_batch_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
write_batch.cc Replace reinterpret_cast with static_cast_with_check (#7067) 2020-07-02 19:25:41 -07:00
write_callback_test.cc Divide WriteCallbackTest.WriteWithCallbackTest (#7037) 2020-06-30 12:31:30 -07:00
write_callback.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
write_controller_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
write_controller.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
write_controller.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
write_thread.cc fix some spelling typos (#6464) 2020-02-28 14:14:03 -08:00
write_thread.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00