rocksdb/db/compaction
Yanqin Jin e66199d848 First step towards handling MANIFEST write error (#6949)
Summary:
This PR provides preliminary support for handling IO error during MANIFEST write.
File write/sync is not guaranteed to be atomic. If we encounter an IOError while writing/syncing to the MANIFEST file, we cannot be sure about the state of the MANIFEST file. The version edits may or may not have reached the file. During cleanup, if we delete the newly-generated SST files referenced by the pending version edit(s), but the version edit(s) actually are persistent in the MANIFEST, then next recovery attempt will process the version edits(s) and then fail since the SST files have already been deleted.
One approach is to truncate the MANIFEST after write/sync error, so that it is safe to delete the SST files. However, file truncation may not be supported on certain file systems. Therefore, we take the following approach.
If an IOError is detected during MANIFEST write/sync, we disable file deletions for the faulty database. Depending on whether the IOError is retryable (set by underlying file system), either RocksDB or application can call `DB::Resume()`, or simply shutdown and restart. During `Resume()`, RocksDB will try to switch to a new MANIFEST and write all existing in-memory version storage in the new file. If this succeeds, then RocksDB may proceed. If all recovery is completed, then file deletions will be re-enabled.
Note that multiple threads can call `LogAndApply()` at the same time, though only one of them will be going through the process MANIFEST write, possibly batching the version edits of other threads. When the leading MANIFEST writer finishes, all of the MANIFEST writing threads in this batch will have the same IOError. They will all call `ErrorHandler::SetBGError()` in which file deletion will be disabled.

Possible future directions:
- Add an `ErrorContext` structure so that it is easier to pass more info to `ErrorHandler`. Currently, as in this example, a new `BackgroundErrorReason` has to be added.

Test plan (dev server):
make check
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6949

Reviewed By: anand1976

Differential Revision: D22026020

Pulled By: riversand963

fbshipit-source-id: f3c68a2ef45d9b505d0d625c7c5e0c88495b91c8
2020-06-24 19:07:08 -07:00
..
compaction_iteration_stats.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
compaction_iterator_test.cc Revert "Update googletest from 1.8.1 to 1.10.0 (#6808)" (#6923) 2020-06-03 15:55:03 -07:00
compaction_iterator.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
compaction_iterator.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
compaction_job_stats_test.cc Revert "Update googletest from 1.8.1 to 1.10.0 (#6808)" (#6923) 2020-06-03 15:55:03 -07:00
compaction_job_test.cc Make options.bottommost_compression, compression_opts and bottommost_compression_opts dynamically changeable. (#6615) 2020-03-31 12:11:42 -07:00
compaction_job.cc First step towards handling MANIFEST write error (#6949) 2020-06-24 19:07:08 -07:00
compaction_job.h Store DB identity and DB session ID in SST files (#6983) 2020-06-17 10:57:40 -07:00
compaction_picker_fifo.cc Fix clang analyze error (#6622) 2020-04-01 10:01:38 -07:00
compaction_picker_fifo.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
compaction_picker_level.cc Refactor level compaction picker (#6804) 2020-05-05 11:09:29 -07:00
compaction_picker_level.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
compaction_picker_test.cc Make it possible to look up files by number in VersionStorageInfo (#6862) 2020-05-28 10:03:06 -07:00
compaction_picker_universal.cc Fix potential overflow of unsigned type in for loop (#6902) 2020-06-02 15:05:07 -07:00
compaction_picker_universal.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
compaction_picker.cc Fix race due to delete triggered compaction in Universal compaction mode (#6799) 2020-05-07 17:32:17 -07:00
compaction_picker.h Make options.bottommost_compression, compression_opts and bottommost_compression_opts dynamically changeable. (#6615) 2020-03-31 12:11:42 -07:00
compaction.cc Compaction with timestamp: input boundaries (#6645) 2020-04-10 16:05:49 -07:00
compaction.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00