Fix race condition between log_.erase and log_.back

Summary: log_ contract specifies that it should not be modified unless both mutex_ and log_write_mutex_ are held. log_.erase however does that with only holding mutex_. This causes a race condition with two_write_queues since logs_.back is read with holding only log_write_mutex_ (which is correct according to logs_ contract) but logs_.erase is called concurrently. This is probably the cause of logs_.back returning nullptr in https://github.com/facebook/rocksdb/issues/3852 although I could not reproduce it. Fixes https://github.com/facebook/rocksdb/issues/3852 Closes https://github.com/facebook/rocksdb/pull/3859 Differential Revision: D8026103 Pulled By: maysamyabandeh fbshipit-source-id: ee394e00fe4aa520d884c5ef87981e9d6b5ccb28
2018-05-16 12:51:31 -07:00 · 2018-05-16 12:51:31 -07:00 · 46fde6b653
commit 46fde6b653
parent 42cb4775c1
1 changed files with 2 additions and 0 deletions
--- a/db/db_impl.cc
+++ b/db/db_impl.cc
@ -810,6 +810,8 @@ void DBImpl::MarkLogsSynced(uint64_t up_to, bool synced_dir,
    assert(log.getting_synced);
    if (status.ok() && logs_.size() > 1) {
      logs_to_free_.push_back(log.ReleaseWriter());
+      // To modify logs_ both mutex_ and log_write_mutex_ must be held
+      InstrumentedMutexLock l(&log_write_mutex_);
      it = logs_.erase(it);
    } else {
      log.getting_synced = false;