rocksdb/utilities
Peter Dillinger 3bfd3ed2f3 Begin forward compatibility for new backup meta schema (#8069)
Summary:
This does not add any new public APIs or published
functionality, but adds the ability to read and use (and in tests,
write) backups with a new meta file schema, based on the old schema
but not forward-compatible (before this change). The new schema enables
some capabilities not in the old:

* Explicit versioning, so that users get clean error messages the next
time we want to break forward compatibility.
* Ignoring unrecognized fields (with warning), so that new non-critical
features can be added without breaking forward compatibility.
* Rejecting future "non-ignorable" fields, so that new features critical
to some use-cases could potentially be added outside of linear schema
versions, with broken forward compatibility.
* Fields at the end of the meta file, such as for checksum of the meta
file's contents (up to that point)
* New optional 'size' field for each file, which is checked when present
* Optionally omitting 'crc32' field, so that we aren't required to have
a crc32c checksum for files to take a backup. (E.g. to support backup
via hard links and to better support file custom checksums.)

Because we do not have a JSON parser and to share code, the new schema
is simply derived from the old schema.

BackupEngine code is updated to allow missing checksums in some places,
and to make that easier, `has_checksum` and `verify_checksum_after_work`
are eliminated. Empty `checksum_hex` indicates checksum is unknown. I'm
not too afraid of regressing on data integrity, because
(a) we have pretty good test coverage of corruption detection in backups, and
(b) we are increasingly relying on the DB itself for data integrity rather than
it being an exclusive feature of backups.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/8069

Test Plan:
new unit tests, added to crash test (some local run with
boosted backup probability)

Reviewed By: ajkr

Differential Revision: D27139824

Pulled By: pdillinger

fbshipit-source-id: 9e0e4decfb42bb84783d64d2d246456d97e8e8c5
2021-03-19 20:15:40 -07:00
..
backupable Begin forward compatibility for new backup meta schema (#8069) 2021-03-19 20:15:40 -07:00
blob_db Use SystemClock* instead of std::shared_ptr<SystemClock> in lower level routines (#8033) 2021-03-15 04:34:11 -07:00
cassandra Add more tests for assert status checked (#7524) 2020-12-22 23:45:58 -08:00
checkpoint Support retrieving checksums for blob files from the MANIFEST when checkpointing (#8003) 2021-03-01 20:07:07 -08:00
compaction_filters Compaction filter support for BlobDB (#6850) 2020-06-29 17:32:14 -07:00
convenience Add a SystemClock class to capture the time functions of an Env (#7858) 2021-01-25 22:09:11 -08:00
leveldb_options Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
memory No elide constructors (#7798) 2020-12-23 16:55:53 -08:00
merge_operators No elide constructors (#7798) 2020-12-23 16:55:53 -08:00
option_change_migration Add more tests for assert status checked (#7524) 2020-12-22 23:45:58 -08:00
options Create a CustomEnv class; Add WinFileSystem; Make LegacyFileSystemWrapper private (#7703) 2021-01-06 10:49:32 -08:00
persistent_cache Use SystemClock* instead of std::shared_ptr<SystemClock> in lower level routines (#8033) 2021-03-15 04:34:11 -07:00
simulator_cache Remove Legacy and Custom FileWrapper classes from header files (#7851) 2021-01-28 22:10:32 -08:00
table_properties_collectors Add more tests for assert status checked (#7524) 2020-12-22 23:45:58 -08:00
trace Remove Legacy and Custom FileWrapper classes from header files (#7851) 2021-01-28 22:10:32 -08:00
transactions Use SystemClock* instead of std::shared_ptr<SystemClock> in lower level routines (#8033) 2021-03-15 04:34:11 -07:00
ttl Use SystemClock* instead of std::shared_ptr<SystemClock> in lower level routines (#8033) 2021-03-15 04:34:11 -07:00
write_batch_with_index Use SystemClock* instead of std::shared_ptr<SystemClock> in lower level routines (#8033) 2021-03-15 04:34:11 -07:00
debug.cc In ParseInternalKey(), include corrupt key info in Status (#7515) 2020-10-28 10:12:58 -07:00
env_librados_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
env_librados.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
env_librados.md Add EnvLibrados - RocksDB Env of RADOS (#1222) 2016-07-21 11:16:34 -07:00
env_mirror_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
env_mirror.cc Add new Append API with DataVerificationInfo to Env WritableFile (#8071) 2021-03-19 11:44:13 -07:00
env_timed_test.cc Make env*_test work with ASSERT_STATUS_CHECKED (#7176) 2020-07-28 22:59:48 -07:00
env_timed.cc Make ChRootEnv, EncryptedEnv, and TimedEnv into FileSystems (#7968) 2021-03-15 19:50:11 -07:00
fault_injection_env.cc No elide constructors (#7798) 2020-12-23 16:55:53 -08:00
fault_injection_env.h Add new Append API with DataVerificationInfo to Env WritableFile (#8071) 2021-03-19 11:44:13 -07:00
fault_injection_fs.cc Handoff checksum Implementation (#7523) 2021-02-10 22:20:32 -08:00
fault_injection_fs.h Handoff checksum Implementation (#7523) 2021-02-10 22:20:32 -08:00
merge_operators.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
object_registry_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
object_registry.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
util_merge_operators_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00