rocksdb/cache/sharded_cache.cc
Peter Dillinger 311a544c2a Use deleters to label cache entries and collect stats (#8297)
Summary:
This change gathers and publishes statistics about the
kinds of items in block cache. This is especially important for
profiling relative usage of cache by index vs. filter vs. data blocks.
It works by iterating over the cache during periodic stats dump
(InternalStats, stats_dump_period_sec) or on demand when
DB::Get(Map)Property(kBlockCacheEntryStats), except that for
efficiency and sharing among column families, saved data from
the last scan is used when the data is not considered too old.

The new information can be seen in info LOG, for example:

    Block cache LRUCache@0x7fca62229330 capacity: 95.37 MB collections: 8 last_copies: 0 last_secs: 0.00178 secs_since: 0
    Block cache entry stats(count,size,portion): DataBlock(7092,28.24 MB,29.6136%) FilterBlock(215,867.90 KB,0.888728%) FilterMetaBlock(2,5.31 KB,0.00544%) IndexBlock(217,180.11 KB,0.184432%) WriteBuffer(1,256.00 KB,0.262144%) Misc(1,0.00 KB,0%)

And also through DB::GetProperty and GetMapProperty (here using
ldb just for demonstration):

    $ ./ldb --db=/dev/shm/dbbench/ get_property rocksdb.block-cache-entry-stats
    rocksdb.block-cache-entry-stats.bytes.data-block: 0
    rocksdb.block-cache-entry-stats.bytes.deprecated-filter-block: 0
    rocksdb.block-cache-entry-stats.bytes.filter-block: 0
    rocksdb.block-cache-entry-stats.bytes.filter-meta-block: 0
    rocksdb.block-cache-entry-stats.bytes.index-block: 178992
    rocksdb.block-cache-entry-stats.bytes.misc: 0
    rocksdb.block-cache-entry-stats.bytes.other-block: 0
    rocksdb.block-cache-entry-stats.bytes.write-buffer: 0
    rocksdb.block-cache-entry-stats.capacity: 8388608
    rocksdb.block-cache-entry-stats.count.data-block: 0
    rocksdb.block-cache-entry-stats.count.deprecated-filter-block: 0
    rocksdb.block-cache-entry-stats.count.filter-block: 0
    rocksdb.block-cache-entry-stats.count.filter-meta-block: 0
    rocksdb.block-cache-entry-stats.count.index-block: 215
    rocksdb.block-cache-entry-stats.count.misc: 1
    rocksdb.block-cache-entry-stats.count.other-block: 0
    rocksdb.block-cache-entry-stats.count.write-buffer: 0
    rocksdb.block-cache-entry-stats.id: LRUCache@0x7f3636661290
    rocksdb.block-cache-entry-stats.percent.data-block: 0.000000
    rocksdb.block-cache-entry-stats.percent.deprecated-filter-block: 0.000000
    rocksdb.block-cache-entry-stats.percent.filter-block: 0.000000
    rocksdb.block-cache-entry-stats.percent.filter-meta-block: 0.000000
    rocksdb.block-cache-entry-stats.percent.index-block: 2.133751
    rocksdb.block-cache-entry-stats.percent.misc: 0.000000
    rocksdb.block-cache-entry-stats.percent.other-block: 0.000000
    rocksdb.block-cache-entry-stats.percent.write-buffer: 0.000000
    rocksdb.block-cache-entry-stats.secs_for_last_collection: 0.000052
    rocksdb.block-cache-entry-stats.secs_since_last_collection: 0

Solution detail - We need some way to flag what kind of blocks each
entry belongs to, preferably without changing the Cache API.
One of the complications is that Cache is a general interface that could
have other users that don't adhere to whichever convention we decide
on for keys and values. Or we would pay for an extra field in the Handle
that would only be used for this purpose.

This change uses a back-door approach, the deleter, to indicate the
"role" of a Cache entry (in addition to the value type, implicitly).
This has the added benefit of ensuring proper code origin whenever we
recognize a particular role for a cache entry; if the entry came from
some other part of the code, it will use an unrecognized deleter, which
we simply attribute to the "Misc" role.

An internal API makes for simple instantiation and automatic
registration of Cache deleters for a given value type and "role".

Another internal API, CacheEntryStatsCollector, solves the problem of
caching the results of a scan and sharing them, to ensure scans are
neither excessive nor redundant so as not to harm Cache performance.

Because code is added to BlocklikeTraits, it is pulled out of
block_based_table_reader.cc into its own file.

This is a reformulation of https://github.com/facebook/rocksdb/issues/8276, without the type checking option
(could still be added), and with actual stat gathering.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/8297

Test Plan: manual testing with db_bench, and a couple of basic unit tests

Reviewed By: ltamasi

Differential Revision: D28488721

Pulled By: pdillinger

fbshipit-source-id: 472f524a9691b5afb107934be2d41d84f2b129fb
2021-05-19 16:51:13 -07:00

233 lines
7.2 KiB
C++

// Copyright (c) 2011-present, Facebook, Inc. All rights reserved.
// This source code is licensed under both the GPLv2 (found in the
// COPYING file in the root directory) and Apache 2.0 License
// (found in the LICENSE.Apache file in the root directory).
//
// Copyright (c) 2011 The LevelDB Authors. All rights reserved.
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file. See the AUTHORS file for names of contributors.
#include "cache/sharded_cache.h"
#include <algorithm>
#include <cstdint>
#include <memory>
#include "util/hash.h"
#include "util/math.h"
#include "util/mutexlock.h"
namespace ROCKSDB_NAMESPACE {
namespace {
inline uint32_t HashSlice(const Slice& s) {
return Lower32of64(GetSliceNPHash64(s));
}
} // namespace
ShardedCache::ShardedCache(size_t capacity, int num_shard_bits,
bool strict_capacity_limit,
std::shared_ptr<MemoryAllocator> allocator)
: Cache(std::move(allocator)),
shard_mask_((uint32_t{1} << num_shard_bits) - 1),
capacity_(capacity),
strict_capacity_limit_(strict_capacity_limit),
last_id_(1) {}
void ShardedCache::SetCapacity(size_t capacity) {
uint32_t num_shards = GetNumShards();
const size_t per_shard = (capacity + (num_shards - 1)) / num_shards;
MutexLock l(&capacity_mutex_);
for (uint32_t s = 0; s < num_shards; s++) {
GetShard(s)->SetCapacity(per_shard);
}
capacity_ = capacity;
}
void ShardedCache::SetStrictCapacityLimit(bool strict_capacity_limit) {
uint32_t num_shards = GetNumShards();
MutexLock l(&capacity_mutex_);
for (uint32_t s = 0; s < num_shards; s++) {
GetShard(s)->SetStrictCapacityLimit(strict_capacity_limit);
}
strict_capacity_limit_ = strict_capacity_limit;
}
Status ShardedCache::Insert(const Slice& key, void* value, size_t charge,
DeleterFn deleter, Handle** handle,
Priority priority) {
uint32_t hash = HashSlice(key);
return GetShard(Shard(hash))
->Insert(key, hash, value, charge, deleter, handle, priority);
}
Status ShardedCache::Insert(const Slice& key, void* value,
const CacheItemHelper* helper, size_t charge,
Handle** handle, Priority priority) {
uint32_t hash = HashSlice(key);
if (!helper) {
return Status::InvalidArgument();
}
return GetShard(Shard(hash))
->Insert(key, hash, value, helper, charge, handle, priority);
}
Cache::Handle* ShardedCache::Lookup(const Slice& key, Statistics* /*stats*/) {
uint32_t hash = HashSlice(key);
return GetShard(Shard(hash))->Lookup(key, hash);
}
Cache::Handle* ShardedCache::Lookup(const Slice& key,
const CacheItemHelper* helper,
const CreateCallback& create_cb,
Priority priority, bool wait,
Statistics* /*stats*/) {
uint32_t hash = HashSlice(key);
return GetShard(Shard(hash))
->Lookup(key, hash, helper, create_cb, priority, wait);
}
bool ShardedCache::IsReady(Handle* handle) {
uint32_t hash = GetHash(handle);
return GetShard(Shard(hash))->IsReady(handle);
}
void ShardedCache::Wait(Handle* handle) {
uint32_t hash = GetHash(handle);
GetShard(Shard(hash))->Wait(handle);
}
bool ShardedCache::Ref(Handle* handle) {
uint32_t hash = GetHash(handle);
return GetShard(Shard(hash))->Ref(handle);
}
bool ShardedCache::Release(Handle* handle, bool force_erase) {
uint32_t hash = GetHash(handle);
return GetShard(Shard(hash))->Release(handle, force_erase);
}
bool ShardedCache::Release(Handle* handle, bool useful, bool force_erase) {
uint32_t hash = GetHash(handle);
return GetShard(Shard(hash))->Release(handle, useful, force_erase);
}
void ShardedCache::Erase(const Slice& key) {
uint32_t hash = HashSlice(key);
GetShard(Shard(hash))->Erase(key, hash);
}
uint64_t ShardedCache::NewId() {
return last_id_.fetch_add(1, std::memory_order_relaxed);
}
size_t ShardedCache::GetCapacity() const {
MutexLock l(&capacity_mutex_);
return capacity_;
}
bool ShardedCache::HasStrictCapacityLimit() const {
MutexLock l(&capacity_mutex_);
return strict_capacity_limit_;
}
size_t ShardedCache::GetUsage() const {
// We will not lock the cache when getting the usage from shards.
uint32_t num_shards = GetNumShards();
size_t usage = 0;
for (uint32_t s = 0; s < num_shards; s++) {
usage += GetShard(s)->GetUsage();
}
return usage;
}
size_t ShardedCache::GetUsage(Handle* handle) const {
return GetCharge(handle);
}
size_t ShardedCache::GetPinnedUsage() const {
// We will not lock the cache when getting the usage from shards.
uint32_t num_shards = GetNumShards();
size_t usage = 0;
for (uint32_t s = 0; s < num_shards; s++) {
usage += GetShard(s)->GetPinnedUsage();
}
return usage;
}
void ShardedCache::ApplyToAllEntries(
const std::function<void(const Slice& key, void* value, size_t charge,
DeleterFn deleter)>& callback,
const ApplyToAllEntriesOptions& opts) {
uint32_t num_shards = GetNumShards();
// Iterate over part of each shard, rotating between shards, to
// minimize impact on latency of concurrent operations.
std::unique_ptr<uint32_t[]> states(new uint32_t[num_shards]{});
uint32_t aepl_in_32 = static_cast<uint32_t>(
std::min(size_t{UINT32_MAX}, opts.average_entries_per_lock));
aepl_in_32 = std::min(aepl_in_32, uint32_t{1});
bool remaining_work;
do {
remaining_work = false;
for (uint32_t s = 0; s < num_shards; s++) {
if (states[s] != UINT32_MAX) {
GetShard(s)->ApplyToSomeEntries(callback, aepl_in_32, &states[s]);
remaining_work |= states[s] != UINT32_MAX;
}
}
} while (remaining_work);
}
void ShardedCache::EraseUnRefEntries() {
uint32_t num_shards = GetNumShards();
for (uint32_t s = 0; s < num_shards; s++) {
GetShard(s)->EraseUnRefEntries();
}
}
std::string ShardedCache::GetPrintableOptions() const {
std::string ret;
ret.reserve(20000);
const int kBufferSize = 200;
char buffer[kBufferSize];
{
MutexLock l(&capacity_mutex_);
snprintf(buffer, kBufferSize, " capacity : %" ROCKSDB_PRIszt "\n",
capacity_);
ret.append(buffer);
snprintf(buffer, kBufferSize, " num_shard_bits : %d\n",
GetNumShardBits());
ret.append(buffer);
snprintf(buffer, kBufferSize, " strict_capacity_limit : %d\n",
strict_capacity_limit_);
ret.append(buffer);
}
snprintf(buffer, kBufferSize, " memory_allocator : %s\n",
memory_allocator() ? memory_allocator()->Name() : "None");
ret.append(buffer);
ret.append(GetShard(0)->GetPrintableOptions());
return ret;
}
int GetDefaultCacheShardBits(size_t capacity) {
int num_shard_bits = 0;
size_t min_shard_size = 512L * 1024L; // Every shard is at least 512KB.
size_t num_shards = capacity / min_shard_size;
while (num_shards >>= 1) {
if (++num_shard_bits >= 6) {
// No more than 6.
return num_shard_bits;
}
}
return num_shard_bits;
}
int ShardedCache::GetNumShardBits() const { return BitsSetToOne(shard_mask_); }
uint32_t ShardedCache::GetNumShards() const { return shard_mask_ + 1; }
} // namespace ROCKSDB_NAMESPACE