rocksdb/db/forward_iterator.h
Peter Dillinger fc9d4071f0 Fast path for detecting unchanged prefix_extractor (#9407)
Summary:
Fixes a major performance regression in 6.26, where
extra CPU is spent in SliceTransform::AsString when reads involve
a prefix_extractor (Get, MultiGet, Seek). Common case performance
is now better than 6.25.

This change creates a "fast path" for verifying that the current prefix
extractor is unchanged and compatible with what was used to
generate a table file. This fast path detects the common case by
pointer comparison on the current prefix_extractor and a "known
good" prefix extractor (if applicable) that is saved at the time the
table reader is opened. The "known good" prefix extractor is saved
as another shared_ptr copy (in an existing field, however) to ensure
the pointer is not recycled.
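
The fast path described above can be illustrated with a minimal sketch. It is not the RocksDB implementation: `PrefixTransform`, `TableReaderState`, `OpenTable`, and `PrefixExtractorMatches` are hypothetical names invented for illustration, and `AsString()` here only stands in for the comparatively expensive `SliceTransform::AsString` call mentioned in the summary.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a prefix extractor (not rocksdb::SliceTransform).
class PrefixTransform {
 public:
  explicit PrefixTransform(std::size_t len) : len_(len) {}
  // Stand-in for the comparatively expensive configuration serialization.
  std::string AsString() const { return "fixed:" + std::to_string(len_); }

 private:
  std::size_t len_;
};

// Hypothetical state captured when a table file is opened.
struct TableReaderState {
  // Extractor configuration recorded in the table file.
  std::string file_extractor_name;
  // "Known good" extractor. Holding a shared_ptr copy keeps the object alive,
  // so its address cannot be reused by a different extractor.
  std::shared_ptr<const PrefixTransform> known_good;
};

// Do the expensive string comparison once, at open time.
inline TableReaderState OpenTable(std::string name_in_file,
                                  std::shared_ptr<const PrefixTransform> cur) {
  TableReaderState t;
  t.file_extractor_name = std::move(name_in_file);
  if (cur && cur->AsString() == t.file_extractor_name) {
    t.known_good = std::move(cur);
  }
  return t;
}

// Per-read check: pointer comparison first, string comparison as fallback.
inline bool PrefixExtractorMatches(const TableReaderState& table,
                                   const PrefixTransform* current,
                                   bool* used_fast_path = nullptr) {
  if (used_fast_path != nullptr) *used_fast_path = false;
  if (current == nullptr) return false;
  if (table.known_good && current == table.known_good.get()) {
    if (used_fast_path != nullptr) *used_fast_path = true;
    return true;  // common case: same instance as at open time
  }
  // Slow path: a different instance that may still be compatible.
  return current->AsString() == table.file_extractor_name;
}
```

In this sketch a different instance with the same configuration still matches, but only through the slow string comparison, which mirrors the "rare, odd" case the summary describes.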

When the prefix_extractor has changed to a different instance with the
same compatible configuration (rare, odd), performance is still a
regression compared to 6.25, but this is likely acceptable because
of the oddity of such a case. The performance of incompatible
prefix_extractor is essentially unchanged.

Also fixed a minor case (ForwardIterator) where a prefix_extractor
could be used via a raw pointer after being freed as a shared_ptr,
if replaced via SetOptions.
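
The lifetime hazard can be sketched in isolation. This is one general way to avoid a dangling raw pointer (holding a `shared_ptr` copy for as long as the object is used), not necessarily how this commit fixes `ForwardIterator`. `Extractor`, `MutableOptions`, and `PinnedIter` are hypothetical names for illustration only.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for a prefix extractor.
struct Extractor {
  int prefix_len;
};

// Hypothetical holder of options that can be replaced at runtime,
// loosely analogous to what a SetOptions call might swap out.
struct MutableOptions {
  std::shared_ptr<const Extractor> prefix_extractor;
};

// Iterator-like object that pins the extractor it was created with.
// If this stored only a raw pointer, replacing the option could free the
// extractor while the iterator still uses it.
class PinnedIter {
 public:
  explicit PinnedIter(const MutableOptions& opts)
      : extractor_(opts.prefix_extractor) {}  // shared ownership, not raw

  int PrefixLen() const { return extractor_->prefix_len; }

 private:
  std::shared_ptr<const Extractor> extractor_;
};
```

With this arrangement, replacing `opts.prefix_extractor` while a `PinnedIter` exists leaves the old extractor alive until the iterator is destroyed.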

Pull Request resolved: https://github.com/facebook/rocksdb/pull/9407

Test Plan:
## Performance
Populate DB with `TEST_TMPDIR=/dev/shm/rocksdb ./db_bench -benchmarks=fillrandom -num=10000000 -disable_wal=1 -write_buffer_size=10000000 -bloom_bits=16 -compaction_style=2 -fifo_compaction_max_table_files_size_mb=10000 -fifo_compaction_allow_compaction=0 -prefix_size=12`

Running head-to-head comparisons simultaneously with `TEST_TMPDIR=/dev/shm/rocksdb ./db_bench -use_existing_db -readonly -benchmarks=seekrandom -num=10000000 -duration=20 -disable_wal=1 -bloom_bits=16 -compaction_style=2 -fifo_compaction_max_table_files_size_mb=10000 -fifo_compaction_allow_compaction=0 -prefix_size=12`

Each result below is ops/sec for the tested build vs. a version 6.25 baseline run at the same time (the baseline is re-run for each comparison because machine load varies)

v6.26: 4833 vs. 6698 (<- major regression!)
v6.27: 4737 vs. 6397 (still)
New: 6704 vs. 6461 (better than baseline in common case)
Disabled fastpath: 4843 vs. 6389 (e.g. if prefix extractor instance changes but is still compatible)
Changed prefix size (no usable filter) in new: 787 vs. 5927
Changed prefix size (no usable filter) in new & baseline: 773 vs. 784

Reviewed By: mrambacher

Differential Revision: D33677812

Pulled By: pdillinger

fbshipit-source-id: 571d9711c461fb97f957378a061b7e7dbc4d6a76
2022-01-21 11:37:46 -08:00


// Copyright (c) 2011-present, Facebook, Inc. All rights reserved.
// This source code is licensed under both the GPLv2 (found in the
// COPYING file in the root directory) and Apache 2.0 License
// (found in the LICENSE.Apache file in the root directory).
#pragma once
#ifndef ROCKSDB_LITE
#include <string>
#include <vector>
#include <queue>
#include "memory/arena.h"
#include "rocksdb/db.h"
#include "rocksdb/iterator.h"
#include "rocksdb/options.h"
#include "table/internal_iterator.h"
namespace ROCKSDB_NAMESPACE {
class DBImpl;
class Env;
struct SuperVersion;
class ColumnFamilyData;
class ForwardLevelIterator;
class VersionStorageInfo;
struct FileMetaData;
class MinIterComparator {
 public:
  explicit MinIterComparator(const Comparator* comparator) :
    comparator_(comparator) {}

  bool operator()(InternalIterator* a, InternalIterator* b) {
    return comparator_->Compare(a->key(), b->key()) > 0;
  }

 private:
  const Comparator* comparator_;
};

using MinIterHeap =
    std::priority_queue<InternalIterator*, std::vector<InternalIterator*>,
                        MinIterComparator>;
/**
 * ForwardIterator is a special type of iterator that only supports Seek()
 * and Next(). It is expected to perform better than TailingIterator by
 * removing the encapsulation and making all information accessible within
 * the iterator. In the current implementation, a snapshot is taken when
 * Seek() is called; subsequent Next() calls do not see values written after
 * that point.
 */
class ForwardIterator : public InternalIterator {
 public:
  ForwardIterator(DBImpl* db, const ReadOptions& read_options,
                  ColumnFamilyData* cfd, SuperVersion* current_sv = nullptr,
                  bool allow_unprepared_value = false);
  virtual ~ForwardIterator();

  void SeekForPrev(const Slice& /*target*/) override {
    status_ = Status::NotSupported("ForwardIterator::SeekForPrev()");
    valid_ = false;
  }
  void SeekToLast() override {
    status_ = Status::NotSupported("ForwardIterator::SeekToLast()");
    valid_ = false;
  }
  void Prev() override {
    status_ = Status::NotSupported("ForwardIterator::Prev");
    valid_ = false;
  }

  virtual bool Valid() const override;
  void SeekToFirst() override;
  virtual void Seek(const Slice& target) override;
  virtual void Next() override;
  virtual Slice key() const override;
  virtual Slice value() const override;
  virtual Status status() const override;
  virtual bool PrepareValue() override;
  virtual Status GetProperty(std::string prop_name, std::string* prop) override;
  virtual void SetPinnedItersMgr(
      PinnedIteratorsManager* pinned_iters_mgr) override;
  virtual bool IsKeyPinned() const override;
  virtual bool IsValuePinned() const override;

  bool TEST_CheckDeletedIters(int* deleted_iters, int* num_iters);
 private:
  void Cleanup(bool release_sv);
  // Unreference and, if needed, clean up the current SuperVersion. This is
  // either done immediately or deferred until this iterator is unpinned by
  // PinnedIteratorsManager.
  void SVCleanup();
  static void SVCleanup(
      DBImpl* db, SuperVersion* sv, bool background_purge_on_iterator_cleanup);
  static void DeferredSVCleanup(void* arg);

  void RebuildIterators(bool refresh_sv);
  void RenewIterators();
  void BuildLevelIterators(const VersionStorageInfo* vstorage,
                           SuperVersion* sv);
  void ResetIncompleteIterators();
  void SeekInternal(const Slice& internal_key, bool seek_to_first);
  void UpdateCurrent();
  bool NeedToSeekImmutable(const Slice& internal_key);
  void DeleteCurrentIter();
  uint32_t FindFileInRange(
      const std::vector<FileMetaData*>& files, const Slice& internal_key,
      uint32_t left, uint32_t right);

  bool IsOverUpperBound(const Slice& internal_key) const;

  // Set PinnedIteratorsManager for all child iterators. This function should
  // be called whenever we update the child iterators or pinned_iters_mgr_.
  void UpdateChildrenPinnedItersMgr();

  // A helper function that will release iter in the proper manner, or pass it
  // to pinned_iters_mgr_ to release it later if pinning is enabled.
  void DeleteIterator(InternalIterator* iter, bool is_arena = false);

  DBImpl* const db_;
  const ReadOptions read_options_;
  ColumnFamilyData* const cfd_;
  const SliceTransform* const prefix_extractor_;
  const Comparator* user_comparator_;
  const bool allow_unprepared_value_;
  MinIterHeap immutable_min_heap_;

  SuperVersion* sv_;
  InternalIterator* mutable_iter_;
  std::vector<InternalIterator*> imm_iters_;
  std::vector<InternalIterator*> l0_iters_;
  std::vector<ForwardLevelIterator*> level_iters_;
  InternalIterator* current_;
  bool valid_;

  // Internal iterator status; set only by one of the unsupported methods.
  Status status_;
  // Status of immutable iterators, maintained here to avoid iterating over
  // all of them in status().
  Status immutable_status_;
  // Indicates that at least one of the immutable iterators pointed to a key
  // larger than iterate_upper_bound and was therefore destroyed. Seek() may
  // need to rebuild such iterators.
  bool has_iter_trimmed_for_upper_bound_;
  // Is current key larger than iterate_upper_bound? If so, makes Valid()
  // return false.
  bool current_over_upper_bound_;

  // Left endpoint of the range of keys that immutable iterators currently
  // cover. When Seek() is called with a key that's within that range, immutable
  // iterators don't need to be moved; see NeedToSeekImmutable(). This key is
  // included in the range after a Seek(), but excluded when advancing the
  // iterator using Next().
  IterKey prev_key_;
  bool is_prev_set_;
  bool is_prev_inclusive_;

  PinnedIteratorsManager* pinned_iters_mgr_;
  Arena arena_;
};
} // namespace ROCKSDB_NAMESPACE
#endif // ROCKSDB_LITE