FlushMemTable returns OK but the memtable is not flushed synchronously (#8173)

Summary:
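Fix https://github.com/facebook/rocksdb/issues/8046: FlushMemTable could return OK even though the memtable had not actually been flushed. The fix exposes the RecoveryError: when WaitForFlushMemTables stops waiting because error_handler_.GetRecoveryError() is not OK, it now returns that error to the caller instead of a default OK status.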
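The pattern behind the fix can be sketched outside RocksDB. The following is a minimal model, not the library's code: `Status`, `FakeErrorHandler`, `WaitBeforeFix`, `WaitAfterFix` and `RunSketch` are hypothetical names introduced for illustration, and the decrementing counter stands in for waiting on the condition variable.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for rocksdb::Status; only what this sketch needs.
struct Status {
  bool is_ok;
  std::string msg;
  static Status OK() { return Status{true, ""}; }
  static Status IOError(std::string m) { return Status{false, std::move(m)}; }
  bool ok() const { return is_ok; }
};

// Hypothetical stand-in for the error handler's recovery state.
struct FakeErrorHandler {
  Status recovery_error;
  const Status& GetRecoveryError() const { return recovery_error; }
};

// Old shape: the loop stops on a recovery error, but the status is
// declared after the loop, so the caller still sees OK.
Status WaitBeforeFix(const FakeErrorHandler& eh, int pending_flushes) {
  while (pending_flushes > 0) {
    if (!eh.GetRecoveryError().ok()) {
      break;  // the error is dropped here
    }
    --pending_flushes;  // stands in for waiting on the condition variable
  }
  Status s = Status::OK();
  return s;
}

// New shape: the status is declared before the loop and records the
// recovery error before breaking out.
Status WaitAfterFix(const FakeErrorHandler& eh, int pending_flushes) {
  Status s = Status::OK();
  while (pending_flushes > 0) {
    if (!eh.GetRecoveryError().ok()) {
      s = eh.GetRecoveryError();
      break;
    }
    --pending_flushes;
  }
  return s;
}

// Small self-check contrasting the two shapes.
void RunSketch() {
  FakeErrorHandler failing{Status::IOError("recovery failed")};
  assert(WaitBeforeFix(failing, 1).ok());  // bug: failure reported as OK
  assert(!WaitAfterFix(failing, 1).ok());  // fix: failure reaches the caller
}
```

In this model the only difference between the two functions is where the status variable lives and whether the recovery error is copied into it before the `break`.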

Pull Request resolved: https://github.com/facebook/rocksdb/pull/8173

Reviewed By: ajkr

Differential Revision: D31674552

Pulled By: jay-zhuang

fbshipit-source-id: 9d16b69ba12a196bb429332ec8224754de97773d
zhuchong0329 2022-01-12 13:20:46 -08:00 committed by Facebook GitHub Bot
parent 0376869f05
commit 5f2b661f54
2 changed files with 13 additions and 4 deletions


@@ -9,6 +9,9 @@
 ### Behavior Changes
 * `DB::DestroyColumnFamilyHandle()` will return Status::InvalidArgument() if called with `DB::DefaultColumnFamily()`.
+### Bug Fixes
+* Fix a bug that FlushMemTable may return ok even flush not succeed.
 ## 6.28.0 (2021-12-17)
 ### New Features
 * Introduced 'CommitWithTimestamp' as a new tag. Currently, there is no API for user to trigger a write with this tag to the WAL. This is part of the efforts to support write-commited transactions with user-defined timestamps.


@@ -2264,21 +2264,27 @@ Status DBImpl::WaitForFlushMemTables(
   int num = static_cast<int>(cfds.size());
   // Wait until the compaction completes
   InstrumentedMutexLock l(&mutex_);
+  Status s;
   // If the caller is trying to resume from bg error, then
   // error_handler_.IsDBStopped() is true.
   while (resuming_from_bg_err || !error_handler_.IsDBStopped()) {
     if (shutting_down_.load(std::memory_order_acquire)) {
-      return Status::ShutdownInProgress();
+      s = Status::ShutdownInProgress();
+      return s;
     }
     // If an error has occurred during resumption, then no need to wait.
+    // But flush operation may fail because of this error, so need to
+    // return the status.
     if (!error_handler_.GetRecoveryError().ok()) {
+      s = error_handler_.GetRecoveryError();
       break;
     }
     // If BGWorkStopped, which indicate that there is a BG error and
     // 1) soft error but requires no BG work, 2) no in auto_recovery_
     if (!resuming_from_bg_err && error_handler_.IsBGWorkStopped() &&
         error_handler_.GetBGError().severity() < Status::Severity::kHardError) {
-      return error_handler_.GetBGError();
+      s = error_handler_.GetBGError();
+      return s;
     }
     // Number of column families that have been dropped.
@@ -2296,7 +2302,8 @@ Status DBImpl::WaitForFlushMemTables(
       }
     }
     if (1 == num_dropped && 1 == num) {
-      return Status::ColumnFamilyDropped();
+      s = Status::ColumnFamilyDropped();
+      return s;
     }
     // Column families involved in this flush request have either been dropped
     // or finished flush. Then it's time to finish waiting.
@@ -2305,7 +2312,6 @@ Status DBImpl::WaitForFlushMemTables(
     }
     bg_cv_.Wait();
   }
-  Status s;
   // If not resuming from bg error, and an error has caused the DB to stop,
   // then report the bg error to caller.
   if (!resuming_from_bg_err && error_handler_.IsDBStopped()) {
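Why the returned status matters to callers can be shown with a small model. This is a sketch under stated assumptions, not RocksDB code: `Status`, `FlushThenRelease` and the `flush` callback are hypothetical names, with the callback standing in for whatever flush call the application makes.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for rocksdb::Status; only what this sketch needs.
struct Status {
  bool is_ok;
  std::string msg;
  static Status OK() { return Status{true, ""}; }
  static Status IOError(std::string m) { return Status{false, std::move(m)}; }
  bool ok() const { return is_ok; }
};

// Hypothetical caller: releases its own buffered copies only after the
// flush reports success. If a failed flush were reported as OK, the
// buffered data would be discarded without having been persisted.
bool FlushThenRelease(const std::function<Status()>& flush,
                      std::vector<std::string>* buffered) {
  Status s = flush();
  if (!s.ok()) {
    return false;  // keep the buffered data; the flush did not succeed
  }
  buffered->clear();
  return true;
}
```

A caller written this way is only as safe as the status it receives, which is why a flush that fails during error recovery needs to surface as a non-OK status.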