always invalidate sequential-insertion cache for concurrent skiplist adds
Summary: InlineSkipList::InsertConcurrently should invalidate the
sequential-insertion cache prev_[] for all inserts of multi-level nodes, not
just those that increase the height of the skip list. The invariant for prev_
is that prev_[i] (i > 0) is supposed to be the predecessor of prev_[0] at
level i. Before this diff InsertConcurrently could violate this constraint
when inserting a multi-level node after prev_[i] but before prev_[0].

This diff also reenables kConcurrentSkipList as db_test's
MultiThreaded/MultiThreadedDBTest.MultiThreaded/29.

Test Plan:
1. unit tests
2. temporarily hack kConcurrentSkipList timing so that it is fast but has a
   1.5% failure rate on my dev box (1ms stagger on thread launch, 1s test
   duration, failure rate baseline over 1000 runs)
3. observe 1000 passes post-fix

Reviewers: igor, sdong

Reviewed By: sdong

Subscribers: MarkCallaghan, dhruba

Differential Revision: https://reviews.facebook.net/D53751
parent c12ff20ab2
commit 2c1db5ea51
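To make the failure mode described in the Summary concrete, the following standalone snippet (not RocksDB code; the keys, heights, and variable names are invented) walks through the case where a concurrently inserted node of height > 1 lands after the cached prev_[1] but before prev_[0], which is exactly the situation this fix now handles by invalidating the cache.

// Standalone illustration (not RocksDB code) of the prev_ cache invariant.
// All keys and heights below are invented for the example.
#include <iostream>

int main() {
  // Sequential inserts left the cache saying: prev_[0] is the node with key
  // 50, and prev_[1] -- its cached level-1 predecessor -- is the node with
  // key 10.
  const int prev0_key = 50;
  const int prev1_key = 10;

  // A concurrent insert adds a node of height 2 with key 30.
  const int new_key = 30;
  const int new_height = 2;

  // Because the new node is tall enough to appear at level 1 and its key
  // falls between prev_[1] and prev_[0], it is now the real level-1
  // predecessor of prev_[0]; the cached prev_[1] is stale.
  const bool invariant_broken =
      new_height > 1 && new_key > prev1_key && new_key < prev0_key;
  std::cout << "prev_ invariant broken: " << std::boolalpha
            << invariant_broken << std::endl;

  // The fix avoids the key comparisons entirely: any concurrent insert of a
  // node with height > 1 simply marks the cache invalid (prev_height_ = 0),
  // and the next sequential Insert rebuilds prev_ from scratch.
  return 0;
}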
@@ -525,8 +525,8 @@ class DBTestBase : public testing::Test {
     kOptimizeFiltersForHits = 27,
     kRowCache = 28,
     kRecycleLogFiles = 29,
-    kEnd = 30,
     kConcurrentSkipList = 30,
+    kEnd = 31,
     kLevelSubcompactions = 31,
     kUniversalSubcompactions = 32,
   };
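Why the enum shuffle above re-enables the configuration: the assumption here is that db_test only cycles through option configs whose numeric value is below kEnd, so moving kEnd past kConcurrentSkipList puts that config back into the default rotation. The loop below is a simplified, hypothetical sketch of that gating, not the actual db_test code.

// Simplified sketch of how kEnd gates which option configs are exercised.
// GenerateConfigs and the enum subset here are illustrative only.
#include <iostream>
#include <vector>

enum OptionConfig {
  kOptimizeFiltersForHits = 27,
  kRowCache = 28,
  kRecycleLogFiles = 29,
  kConcurrentSkipList = 30,
  kEnd = 31,
  kLevelSubcompactions = 31,   // configs at or above kEnd are opt-in only
  kUniversalSubcompactions = 32,
};

std::vector<int> GenerateConfigs() {
  std::vector<int> configs;
  for (int oc = kOptimizeFiltersForHits; oc < kEnd; ++oc) {
    configs.push_back(oc);  // now includes kConcurrentSkipList (30)
  }
  return configs;
}

int main() {
  for (int oc : GenerateConfigs()) {
    std::cout << "would run option config " << oc << "\n";
  }
  return 0;
}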
@@ -147,8 +147,9 @@ class InlineSkipList {
   // values are ok.
   std::atomic<int> max_height_;  // Height of the entire list
 
-  // Used for optimizing sequential insert patterns. Tricky. prev_[i] for
-  // i up to max_height_ - 1 (inclusive) is the predecessor of prev_[0].
+  // Used for optimizing sequential insert patterns. Tricky. prev_height_
+  // of zero means prev_ is undefined. Otherwise: prev_[i] for i up
+  // to max_height_ - 1 (inclusive) is the predecessor of prev_[0], and
   // prev_height_ is the height of prev_[0]. prev_[0] can only be equal
   // to head when max_height_ and prev_height_ are both 1.
   Node** prev_;
@@ -510,11 +511,10 @@ InlineSkipList<Comparator>::AllocateNode(size_t key_size, int height) {
 
 template <class Comparator>
 void InlineSkipList<Comparator>::Insert(const char* key) {
-  // InsertConcurrently can't maintain the prev_ invariants when it needs
-  // to increase max_height_. In that case it sets prev_height_ to zero,
-  // letting us know that we should ignore it. A relaxed load suffices
-  // here because write thread synchronization separates Insert calls
-  // from InsertConcurrently calls.
+  // InsertConcurrently often can't maintain the prev_ invariants, so
+  // it just sets prev_height_ to zero, letting us know that we should
+  // ignore it. A relaxed load suffices here because write thread
+  // synchronization separates Insert calls from InsertConcurrently calls.
   auto prev_height = prev_height_.load(std::memory_order_relaxed);
 
   // fast path for sequential insertion
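The rewritten comment above argues that a relaxed load of prev_height_ is safe only because write-thread synchronization already separates Insert calls from InsertConcurrently calls. The following self-contained sketch illustrates that argument, with an ordinary mutex standing in for RocksDB's write-thread machinery (which is outside this diff); the names ConcurrentWriter and SequentialWriter are invented.

// Standalone sketch (not RocksDB code) of the memory-ordering argument:
// relaxed accesses to prev_height are fine because external synchronization
// (modeled here by a mutex) already orders the two kinds of writers.
#include <atomic>
#include <mutex>
#include <thread>

std::atomic<int> prev_height{1};   // 0 means "cache invalid"
std::mutex write_thread_handoff;   // stand-in for write-thread synchronization

void ConcurrentWriter() {
  std::lock_guard<std::mutex> guard(write_thread_handoff);
  // Invalidate the cache. Relaxed is enough: releasing the mutex publishes
  // this store to whoever acquires the mutex next.
  prev_height.store(0, std::memory_order_relaxed);
}

void SequentialWriter() {
  std::lock_guard<std::mutex> guard(write_thread_handoff);
  // Relaxed is enough: acquiring the mutex synchronizes with the previous
  // critical section, so an invalidation cannot be missed.
  if (prev_height.load(std::memory_order_relaxed) == 0) {
    // rebuild the prev_ cache before using it (omitted)
  }
}

int main() {
  std::thread t1(ConcurrentWriter);
  std::thread t2(SequentialWriter);
  t1.join();
  t2.join();
  return 0;
}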
@@ -595,15 +595,24 @@ void InlineSkipList<Comparator>::InsertConcurrently(const char* key) {
   int height = x->UnstashHeight();
   assert(height >= 1 && height <= kMaxHeight_);
 
+  // We don't have a lock-free algorithm for updating prev_, but we do have
+  // the option of invalidating the entire sequential-insertion cache.
+  // prev_'s invariant is that prev_[i] (i > 0) is the predecessor of
+  // prev_[0] at that level. We're only going to violate that if height
+  // > 1 and key lands after prev_[height - 1] but before prev_[0].
+  // Comparisons are pretty expensive, so an easier version is to just
+  // clear the cache if height > 1. We only write to prev_height_ if
+  // nobody else has, to avoid invalidating the root of the skip list in
+  // all of the other CPU caches.
+  if (height > 1 && prev_height_.load(std::memory_order_relaxed) != 0) {
+    prev_height_.store(0, std::memory_order_relaxed);
+  }
+
   int max_height = max_height_.load(std::memory_order_relaxed);
   while (height > max_height) {
     if (max_height_.compare_exchange_strong(max_height, height)) {
       // successfully updated it
       max_height = height;
-
-      // we dont have a lock-free algorithm for fixing up prev_, so just
-      // mark it invalid
-      prev_height_.store(0, std::memory_order_relaxed);
       break;
     }
     // else retry, possibly exiting the loop because somebody else
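The tail of the hunk keeps the existing lock-free loop that raises max_height_ with compare_exchange_strong. For reference, here is a self-contained sketch of that raise-a-shared-maximum pattern; the thread setup and heights are invented, and only the retry logic mirrors the loop above.

// Standalone sketch of raising a shared maximum without locks, the same
// pattern as the max_height_ loop above. Heights and thread count are
// invented for the example.
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

std::atomic<int> max_height{1};

void RaiseTo(int height) {
  int current = max_height.load(std::memory_order_relaxed);
  while (height > current) {
    // If nobody changed max_height since we read it, install our height and
    // stop. On failure `current` is refreshed, so we re-check: somebody else
    // may already have raised it past `height`, which also lets us stop.
    if (max_height.compare_exchange_strong(current, height)) {
      break;
    }
  }
}

int main() {
  std::vector<std::thread> threads;
  for (int h : {3, 7, 2, 12, 5}) {
    threads.emplace_back(RaiseTo, h);
  }
  for (auto& t : threads) t.join();
  assert(max_height.load() == 12);  // the largest requested height wins
  return 0;
}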