93719fc953
Summary: Prior to 6.12, backup files using share_files_with_checksum had the file size encoded in the file name, after the last '\_' and before the last '.'. We considered this an implementation detail subject to change, and indeed removed this information from the file name (with an option to use old behavior) because it was considered ineffective/inefficient for file name uniqueness. However, some downstream RocksDB users were relying on this information since the file size is not explicitly in the backup manifest file. This primary purpose of this change is "retrofitting" the 6.12 release (not yet a public release) to simultaneously support the benefits of the new naming scheme (I/O performance and data correctness at scale) and preserve the file size information, both as default behaviors. With this change, we are essentially making the file size information encoded in the file name an official, though obscure, extension of the backup meta file format. We preserve an option (kLegacyCrc32cAndFileSize) to use the original "legacy" naming scheme, with its caveats, and make it easy to omit the file size information (no kFlagIncludeFileSize), for more compact file names. But note that changing the naming scheme used on an existing db and backup directory can lead to transient space amplification, as some files will be stored under two names in the shared_checksum directory. Because some backups were saved using the original 6.12 naming scheme, we offer two ways of dealing with those files: SST files generated by older 6.12 versions can either use the default naming scheme in effect when the SST files were generated (kFlagMatchInterimNaming, default, no transient space amplification) or can use a new naming scheme (no kFlagMatchInterimNaming, potential space amplification because some already stored files getting a new name). We don't have a natural way to detect which files were generated by previous 6.12 versions, but this change hacks one in by changing DB session ids to now use a more concise encoding, reducing file name length, saving ~dozen bytes from SST files, and making them visually distinct from DB ids so that they are less likely to be mixed up. Two final auxiliary notes: Recognizing that the backup file names have become a de facto part of the backup meta schema, this change makes them easier to parse and extend by putting a distinct marker, 's', before DB session ids embedded in the name. When we extend this to allow custom checksums in the name, they can get their own marker to ensure safe parsing. For backward compatibility, file size does not get a marker but is assumed for `_[0-9]+[.]` Another change from initial 6.12 default behavior is never including file custom checksum in the file name. Looking ahead to 6.13, we do not want the default behavior to cause backup space amplification for someone turning on file custom checksum checking in BackupEngine; we want that to be an easy decision. When implemented, including file custom checksums in backup file names will be a non-default option. Actual file name patterns and priorities, as regexes: kLegacyCrc32cAndFileSize OR pre-6.12 SST file -> [0-9]+_[0-9]+_[0-9]+[.]sst kFlagMatchInterimNaming set (default) AND early 6.12 SST file -> [0-9]+_[0-9a-fA-F-]+[.]sst kUseDbSessionId AND NOT kFlagIncludeFileSize -> [0-9]+_s[0-9A-Z]{20}[.]sst kUseDbSessionId AND kFlagIncludeFileSize (default) -> [0-9]+_s[0-9A-Z]{20}_[0-9]+[.]sst We might add opt-in options for more '\_' separated data in the name, but embedded file size, if present, will always be after last '\_' and before '.sst'. This change was originally applied to version 6.12. (See https://github.com/facebook/rocksdb/issues/7390) Pull Request resolved: https://github.com/facebook/rocksdb/pull/7400 Test Plan: unit tests included. Sync point callbacks are used to mimic previous version SST files. Reviewed By: ajkr Differential Revision: D23759587 Pulled By: pdillinger fbshipit-source-id: f62d8af4e0978de0a34f26288cfbe66049b70025 |
||
---|---|---|
.. | ||
db_impl_compaction_flush.cc | ||
db_impl_debug.cc | ||
db_impl_experimental.cc | ||
db_impl_files.cc | ||
db_impl_open.cc | ||
db_impl_readonly.cc | ||
db_impl_readonly.h | ||
db_impl_secondary.cc | ||
db_impl_secondary.h | ||
db_impl_write.cc | ||
db_impl.cc | ||
db_impl.h | ||
db_secondary_test.cc |