rocksdb/utilities
Peter Dillinger 8a6c925ca7 Restore file size in backup table file names (and other cleanup)
Summary: Prior to 6.12, backup files using share_files_with_checksum had
the file size encoded in the file name, after the last '_' and before
the last '.'. We considered this an implementation detail subject to
change, and indeed removed this information from the file name (with an
option to use old behavior) because it was considered
ineffective/inefficient for file name uniqueness. However, some
downstream RocksDB users were relying on this information since the file
size is not explicitly in the backup manifest file.
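
For downstream users who need to recover the size, the rule stated above (digits after the last '_' and before the last '.') can be sketched as follows. This is a minimal illustration, not RocksDB code: FileSizeFromBackupName is a hypothetical helper, and the sample file names in the usage note are made up.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper (not part of the RocksDB API): extracts the file size
// encoded after the last '_' and before the last '.' of a backup file name.
// Returns false if the name does not carry a decimal size in that position.
bool FileSizeFromBackupName(const std::string& name, uint64_t* size) {
  const std::size_t dot = name.rfind('.');
  const std::size_t underscore = name.rfind('_');
  if (dot == std::string::npos || underscore == std::string::npos ||
      underscore > dot) {
    return false;
  }
  const std::string digits = name.substr(underscore + 1, dot - underscore - 1);
  // Reject empty fields and anything too long to fit safely in 64 bits.
  if (digits.empty() || digits.size() > 19) {
    return false;
  }
  uint64_t value = 0;
  for (char c : digits) {
    if (c < '0' || c > '9') {
      return false;  // not a pure decimal field, so not a file size
    }
    value = value * 10 + static_cast<uint64_t>(c - '0');
  }
  *size = value;
  return true;
}
```

Called on a made-up name such as "000010_1234567_4096.sst", the helper reports 4096; on a name whose last field is not purely decimal, it returns false and leaves the output untouched.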

The primary purpose of this change is "retrofitting" the 6.12 release
(not yet a public release) to simultaneously support the benefits of the
new naming scheme (I/O performance and data correctness at scale) and
preserve the file size information, both as default behaviors. With this
change, we are essentially making the file size information encoded in
the file name an official, though obscure, extension of the backup meta
file format.

We preserve an option (kLegacyCrc32cAndFileSize) to use the original
"legacy" naming scheme, with its caveats, and make it easy to omit the
file size information (no kFlagIncludeFileSize), for more compact file
names. But note that changing the naming scheme used on an existing db
and backup directory can lead to transient space amplification, as some
files will be stored under two names in the shared_checksum directory.
Because some backups were saved using the original 6.12 naming scheme,
we offer two ways of dealing with those files: SST files generated by
older 6.12 versions can either use the default naming scheme in effect
when the SST files were generated (kFlagMatchInterimNaming, default, no
transient space amplification) or can use a new naming scheme (no
kFlagMatchInterimNaming, potential space amplification because some
already stored files would get a new name).

We don't have a natural way to detect which files were generated by
previous 6.12 versions, but this change hacks one in by changing DB
session ids to now use a more concise encoding, reducing file name
length, saving about a dozen bytes from SST files, and making them visually
distinct from DB ids so that they are less likely to be mixed up.

Finally, recognizing that the backup file names have become a de facto
part of the backup meta schema, this change makes them easier to parse
and extend by putting a distinct marker, 's', before DB session ids
embedded in the name. When we extend this to allow custom checksums in
the name, they can get their own marker to ensure safe parsing. For
backward compatibility, file size does not get a marker; instead, any
component matching _[0-9]+[.] is assumed to be the file size.
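
The marker-based parsing described above can be sketched as follows. This is an illustration under stated assumptions, not the parser in the backup engine: it assumes names are '_'-separated with the file number first, treats a component starting with 's' as a DB session id, and treats an unmarked all-digit component directly before the '.' as the file size. BackupNameInfo and ParseBackupName are hypothetical names.

```cpp
#include <cstdint>
#include <string>

// Hypothetical result type for illustration only.
struct BackupNameInfo {
  std::string session_id;      // empty if no 's'-marked component was found
  bool has_file_size = false;  // true if an unmarked _[0-9]+[.] field exists
  uint64_t file_size = 0;
};

// Hypothetical parser: walks the '_'-separated components that follow the
// leading file number and classifies each one by its marker.
BackupNameInfo ParseBackupName(const std::string& name) {
  BackupNameInfo info;
  const std::size_t dot = name.rfind('.');
  const std::string stem = name.substr(0, dot);  // whole name if no '.'
  std::size_t start = stem.find('_');
  while (start != std::string::npos) {
    const std::size_t end = stem.find('_', start + 1);
    const std::string part =
        (end == std::string::npos) ? stem.substr(start + 1)
                                   : stem.substr(start + 1, end - start - 1);
    const bool is_last = (end == std::string::npos);
    if (!part.empty() && part[0] == 's') {
      info.session_id = part.substr(1);  // drop the 's' marker
    } else if (is_last && dot != std::string::npos && !part.empty() &&
               part.size() <= 19) {
      // Unmarked field directly before the '.': a size only if all digits.
      uint64_t value = 0;
      bool all_digits = true;
      for (char c : part) {
        if (c < '0' || c > '9') {
          all_digits = false;
          break;
        }
        value = value * 10 + static_cast<uint64_t>(c - '0');
      }
      if (all_digits) {
        info.has_file_size = true;
        info.file_size = value;
      }
    }
    // Other components (e.g. a legacy checksum field) are ignored here.
    start = end;
  }
  return info;
}
```

Because each kind of field is recognized by its own marker, a future marked field (such as a custom checksum) would fall through to the "ignored" branch of this sketch rather than being mistaken for a session id or a size.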

Test Plan: unit tests included. Sync point callbacks are used to mimic
previous version SST files.
2020-09-17 00:22:30 -07:00
backupable Restore file size in backup table file names (and other cleanup) 2020-09-17 00:22:30 -07:00
blob_db Expose the start of the expiration range for TTL blob files through LiveFileMetaData (#7365) 2020-09-14 16:11:01 -07:00
cassandra Replace reinterpret_cast with static_cast_with_check (#7067) 2020-07-02 19:25:41 -07:00
checkpoint Work around a backup bug with DB custom checksums 2020-08-20 15:24:11 -07:00
compaction_filters Compaction filter support for BlobDB (#6850) 2020-06-29 17:32:14 -07:00
convenience Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
leveldb_options Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
memory More Makefile Cleanup (#7097) 2020-07-09 14:35:17 -07:00
merge_operators Make StringAppendOperatorTest a parameterized test (#6930) 2020-06-04 14:17:11 -07:00
option_change_migration More Makefile Cleanup (#7097) 2020-07-09 14:35:17 -07:00
options Fixed Factory construct just for calling .Name() (#7080) 2020-07-08 11:54:00 -07:00
persistent_cache More Makefile Cleanup (#7097) 2020-07-09 14:35:17 -07:00
simulator_cache Revert "Whole DBTest to skip fsync (#7049)" (#7070) 2020-07-02 10:22:43 -07:00
table_properties_collectors Trigger compaction in CompactOnDeletionCollector based on deletion ratio (#6806) 2020-05-18 08:42:05 -07:00
trace Pass a timeout to FileSystem for random reads (#6751) 2020-04-30 14:50:39 -07:00
transactions Add CLANG analyze to CircleCI (#7114) 2020-07-13 12:33:16 -07:00
ttl Compaction filter support for BlobDB (#6850) 2020-06-29 17:32:14 -07:00
write_batch_with_index Replace reinterpret_cast with static_cast_with_check (#7067) 2020-07-02 19:25:41 -07:00
debug.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
env_librados_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
env_librados.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
env_librados.md Add EnvLibrados - RocksDB Env of RADOS (#1222) 2016-07-21 11:16:34 -07:00
env_mirror_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
env_mirror.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
env_timed_test.cc Make env*_test work with ASSERT_STATUS_CHECKED (#7176) 2020-07-28 22:59:48 -07:00
env_timed.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
fault_injection_env.cc More Makefile Cleanup (#7097) 2020-07-09 14:35:17 -07:00
fault_injection_env.h More Makefile Cleanup (#7097) 2020-07-09 14:35:17 -07:00
fault_injection_fs.cc More Makefile Cleanup (#7097) 2020-07-09 14:35:17 -07:00
fault_injection_fs.h Make env*_test work with ASSERT_STATUS_CHECKED (#7176) 2020-07-28 22:59:48 -07:00
merge_operators.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
object_registry_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
object_registry.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
util_merge_operators_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00