311a544c2a
Summary: This change gathers and publishes statistics about the kinds of items in block cache. This is especially important for profiling relative usage of cache by index vs. filter vs. data blocks. It works by iterating over the cache during periodic stats dump (InternalStats, stats_dump_period_sec) or on demand when DB::Get(Map)Property(kBlockCacheEntryStats), except that for efficiency and sharing among column families, saved data from the last scan is used when the data is not considered too old. The new information can be seen in info LOG, for example: Block cache LRUCache@0x7fca62229330 capacity: 95.37 MB collections: 8 last_copies: 0 last_secs: 0.00178 secs_since: 0 Block cache entry stats(count,size,portion): DataBlock(7092,28.24 MB,29.6136%) FilterBlock(215,867.90 KB,0.888728%) FilterMetaBlock(2,5.31 KB,0.00544%) IndexBlock(217,180.11 KB,0.184432%) WriteBuffer(1,256.00 KB,0.262144%) Misc(1,0.00 KB,0%) And also through DB::GetProperty and GetMapProperty (here using ldb just for demonstration): $ ./ldb --db=/dev/shm/dbbench/ get_property rocksdb.block-cache-entry-stats rocksdb.block-cache-entry-stats.bytes.data-block: 0 rocksdb.block-cache-entry-stats.bytes.deprecated-filter-block: 0 rocksdb.block-cache-entry-stats.bytes.filter-block: 0 rocksdb.block-cache-entry-stats.bytes.filter-meta-block: 0 rocksdb.block-cache-entry-stats.bytes.index-block: 178992 rocksdb.block-cache-entry-stats.bytes.misc: 0 rocksdb.block-cache-entry-stats.bytes.other-block: 0 rocksdb.block-cache-entry-stats.bytes.write-buffer: 0 rocksdb.block-cache-entry-stats.capacity: 8388608 rocksdb.block-cache-entry-stats.count.data-block: 0 rocksdb.block-cache-entry-stats.count.deprecated-filter-block: 0 rocksdb.block-cache-entry-stats.count.filter-block: 0 rocksdb.block-cache-entry-stats.count.filter-meta-block: 0 rocksdb.block-cache-entry-stats.count.index-block: 215 rocksdb.block-cache-entry-stats.count.misc: 1 rocksdb.block-cache-entry-stats.count.other-block: 0 rocksdb.block-cache-entry-stats.count.write-buffer: 0 rocksdb.block-cache-entry-stats.id: LRUCache@0x7f3636661290 rocksdb.block-cache-entry-stats.percent.data-block: 0.000000 rocksdb.block-cache-entry-stats.percent.deprecated-filter-block: 0.000000 rocksdb.block-cache-entry-stats.percent.filter-block: 0.000000 rocksdb.block-cache-entry-stats.percent.filter-meta-block: 0.000000 rocksdb.block-cache-entry-stats.percent.index-block: 2.133751 rocksdb.block-cache-entry-stats.percent.misc: 0.000000 rocksdb.block-cache-entry-stats.percent.other-block: 0.000000 rocksdb.block-cache-entry-stats.percent.write-buffer: 0.000000 rocksdb.block-cache-entry-stats.secs_for_last_collection: 0.000052 rocksdb.block-cache-entry-stats.secs_since_last_collection: 0 Solution detail - We need some way to flag what kind of blocks each entry belongs to, preferably without changing the Cache API. One of the complications is that Cache is a general interface that could have other users that don't adhere to whichever convention we decide on for keys and values. Or we would pay for an extra field in the Handle that would only be used for this purpose. This change uses a back-door approach, the deleter, to indicate the "role" of a Cache entry (in addition to the value type, implicitly). This has the added benefit of ensuring proper code origin whenever we recognize a particular role for a cache entry; if the entry came from some other part of the code, it will use an unrecognized deleter, which we simply attribute to the "Misc" role. An internal API makes for simple instantiation and automatic registration of Cache deleters for a given value type and "role". Another internal API, CacheEntryStatsCollector, solves the problem of caching the results of a scan and sharing them, to ensure scans are neither excessive nor redundant so as not to harm Cache performance. Because code is added to BlocklikeTraits, it is pulled out of block_based_table_reader.cc into its own file. This is a reformulation of https://github.com/facebook/rocksdb/issues/8276, without the type checking option (could still be added), and with actual stat gathering. Pull Request resolved: https://github.com/facebook/rocksdb/pull/8297 Test Plan: manual testing with db_bench, and a couple of basic unit tests Reviewed By: ltamasi Differential Revision: D28488721 Pulled By: pdillinger fbshipit-source-id: 472f524a9691b5afb107934be2d41d84f2b129fb
233 lines
7.2 KiB
C++
233 lines
7.2 KiB
C++
// Copyright (c) 2011-present, Facebook, Inc. All rights reserved.
|
|
// This source code is licensed under both the GPLv2 (found in the
|
|
// COPYING file in the root directory) and Apache 2.0 License
|
|
// (found in the LICENSE.Apache file in the root directory).
|
|
//
|
|
// Copyright (c) 2011 The LevelDB Authors. All rights reserved.
|
|
// Use of this source code is governed by a BSD-style license that can be
|
|
// found in the LICENSE file. See the AUTHORS file for names of contributors.
|
|
|
|
#include "cache/sharded_cache.h"
|
|
|
|
#include <algorithm>
|
|
#include <cstdint>
|
|
#include <memory>
|
|
|
|
#include "util/hash.h"
|
|
#include "util/math.h"
|
|
#include "util/mutexlock.h"
|
|
|
|
namespace ROCKSDB_NAMESPACE {
|
|
|
|
namespace {
|
|
|
|
inline uint32_t HashSlice(const Slice& s) {
|
|
return Lower32of64(GetSliceNPHash64(s));
|
|
}
|
|
|
|
} // namespace
|
|
|
|
ShardedCache::ShardedCache(size_t capacity, int num_shard_bits,
|
|
bool strict_capacity_limit,
|
|
std::shared_ptr<MemoryAllocator> allocator)
|
|
: Cache(std::move(allocator)),
|
|
shard_mask_((uint32_t{1} << num_shard_bits) - 1),
|
|
capacity_(capacity),
|
|
strict_capacity_limit_(strict_capacity_limit),
|
|
last_id_(1) {}
|
|
|
|
void ShardedCache::SetCapacity(size_t capacity) {
|
|
uint32_t num_shards = GetNumShards();
|
|
const size_t per_shard = (capacity + (num_shards - 1)) / num_shards;
|
|
MutexLock l(&capacity_mutex_);
|
|
for (uint32_t s = 0; s < num_shards; s++) {
|
|
GetShard(s)->SetCapacity(per_shard);
|
|
}
|
|
capacity_ = capacity;
|
|
}
|
|
|
|
void ShardedCache::SetStrictCapacityLimit(bool strict_capacity_limit) {
|
|
uint32_t num_shards = GetNumShards();
|
|
MutexLock l(&capacity_mutex_);
|
|
for (uint32_t s = 0; s < num_shards; s++) {
|
|
GetShard(s)->SetStrictCapacityLimit(strict_capacity_limit);
|
|
}
|
|
strict_capacity_limit_ = strict_capacity_limit;
|
|
}
|
|
|
|
Status ShardedCache::Insert(const Slice& key, void* value, size_t charge,
|
|
DeleterFn deleter, Handle** handle,
|
|
Priority priority) {
|
|
uint32_t hash = HashSlice(key);
|
|
return GetShard(Shard(hash))
|
|
->Insert(key, hash, value, charge, deleter, handle, priority);
|
|
}
|
|
|
|
Status ShardedCache::Insert(const Slice& key, void* value,
|
|
const CacheItemHelper* helper, size_t charge,
|
|
Handle** handle, Priority priority) {
|
|
uint32_t hash = HashSlice(key);
|
|
if (!helper) {
|
|
return Status::InvalidArgument();
|
|
}
|
|
return GetShard(Shard(hash))
|
|
->Insert(key, hash, value, helper, charge, handle, priority);
|
|
}
|
|
|
|
Cache::Handle* ShardedCache::Lookup(const Slice& key, Statistics* /*stats*/) {
|
|
uint32_t hash = HashSlice(key);
|
|
return GetShard(Shard(hash))->Lookup(key, hash);
|
|
}
|
|
|
|
Cache::Handle* ShardedCache::Lookup(const Slice& key,
|
|
const CacheItemHelper* helper,
|
|
const CreateCallback& create_cb,
|
|
Priority priority, bool wait,
|
|
Statistics* /*stats*/) {
|
|
uint32_t hash = HashSlice(key);
|
|
return GetShard(Shard(hash))
|
|
->Lookup(key, hash, helper, create_cb, priority, wait);
|
|
}
|
|
|
|
bool ShardedCache::IsReady(Handle* handle) {
|
|
uint32_t hash = GetHash(handle);
|
|
return GetShard(Shard(hash))->IsReady(handle);
|
|
}
|
|
|
|
void ShardedCache::Wait(Handle* handle) {
|
|
uint32_t hash = GetHash(handle);
|
|
GetShard(Shard(hash))->Wait(handle);
|
|
}
|
|
|
|
bool ShardedCache::Ref(Handle* handle) {
|
|
uint32_t hash = GetHash(handle);
|
|
return GetShard(Shard(hash))->Ref(handle);
|
|
}
|
|
|
|
bool ShardedCache::Release(Handle* handle, bool force_erase) {
|
|
uint32_t hash = GetHash(handle);
|
|
return GetShard(Shard(hash))->Release(handle, force_erase);
|
|
}
|
|
|
|
bool ShardedCache::Release(Handle* handle, bool useful, bool force_erase) {
|
|
uint32_t hash = GetHash(handle);
|
|
return GetShard(Shard(hash))->Release(handle, useful, force_erase);
|
|
}
|
|
|
|
void ShardedCache::Erase(const Slice& key) {
|
|
uint32_t hash = HashSlice(key);
|
|
GetShard(Shard(hash))->Erase(key, hash);
|
|
}
|
|
|
|
uint64_t ShardedCache::NewId() {
|
|
return last_id_.fetch_add(1, std::memory_order_relaxed);
|
|
}
|
|
|
|
size_t ShardedCache::GetCapacity() const {
|
|
MutexLock l(&capacity_mutex_);
|
|
return capacity_;
|
|
}
|
|
|
|
bool ShardedCache::HasStrictCapacityLimit() const {
|
|
MutexLock l(&capacity_mutex_);
|
|
return strict_capacity_limit_;
|
|
}
|
|
|
|
size_t ShardedCache::GetUsage() const {
|
|
// We will not lock the cache when getting the usage from shards.
|
|
uint32_t num_shards = GetNumShards();
|
|
size_t usage = 0;
|
|
for (uint32_t s = 0; s < num_shards; s++) {
|
|
usage += GetShard(s)->GetUsage();
|
|
}
|
|
return usage;
|
|
}
|
|
|
|
size_t ShardedCache::GetUsage(Handle* handle) const {
|
|
return GetCharge(handle);
|
|
}
|
|
|
|
size_t ShardedCache::GetPinnedUsage() const {
|
|
// We will not lock the cache when getting the usage from shards.
|
|
uint32_t num_shards = GetNumShards();
|
|
size_t usage = 0;
|
|
for (uint32_t s = 0; s < num_shards; s++) {
|
|
usage += GetShard(s)->GetPinnedUsage();
|
|
}
|
|
return usage;
|
|
}
|
|
|
|
void ShardedCache::ApplyToAllEntries(
|
|
const std::function<void(const Slice& key, void* value, size_t charge,
|
|
DeleterFn deleter)>& callback,
|
|
const ApplyToAllEntriesOptions& opts) {
|
|
uint32_t num_shards = GetNumShards();
|
|
// Iterate over part of each shard, rotating between shards, to
|
|
// minimize impact on latency of concurrent operations.
|
|
std::unique_ptr<uint32_t[]> states(new uint32_t[num_shards]{});
|
|
|
|
uint32_t aepl_in_32 = static_cast<uint32_t>(
|
|
std::min(size_t{UINT32_MAX}, opts.average_entries_per_lock));
|
|
aepl_in_32 = std::min(aepl_in_32, uint32_t{1});
|
|
|
|
bool remaining_work;
|
|
do {
|
|
remaining_work = false;
|
|
for (uint32_t s = 0; s < num_shards; s++) {
|
|
if (states[s] != UINT32_MAX) {
|
|
GetShard(s)->ApplyToSomeEntries(callback, aepl_in_32, &states[s]);
|
|
remaining_work |= states[s] != UINT32_MAX;
|
|
}
|
|
}
|
|
} while (remaining_work);
|
|
}
|
|
|
|
void ShardedCache::EraseUnRefEntries() {
|
|
uint32_t num_shards = GetNumShards();
|
|
for (uint32_t s = 0; s < num_shards; s++) {
|
|
GetShard(s)->EraseUnRefEntries();
|
|
}
|
|
}
|
|
|
|
std::string ShardedCache::GetPrintableOptions() const {
|
|
std::string ret;
|
|
ret.reserve(20000);
|
|
const int kBufferSize = 200;
|
|
char buffer[kBufferSize];
|
|
{
|
|
MutexLock l(&capacity_mutex_);
|
|
snprintf(buffer, kBufferSize, " capacity : %" ROCKSDB_PRIszt "\n",
|
|
capacity_);
|
|
ret.append(buffer);
|
|
snprintf(buffer, kBufferSize, " num_shard_bits : %d\n",
|
|
GetNumShardBits());
|
|
ret.append(buffer);
|
|
snprintf(buffer, kBufferSize, " strict_capacity_limit : %d\n",
|
|
strict_capacity_limit_);
|
|
ret.append(buffer);
|
|
}
|
|
snprintf(buffer, kBufferSize, " memory_allocator : %s\n",
|
|
memory_allocator() ? memory_allocator()->Name() : "None");
|
|
ret.append(buffer);
|
|
ret.append(GetShard(0)->GetPrintableOptions());
|
|
return ret;
|
|
}
|
|
int GetDefaultCacheShardBits(size_t capacity) {
|
|
int num_shard_bits = 0;
|
|
size_t min_shard_size = 512L * 1024L; // Every shard is at least 512KB.
|
|
size_t num_shards = capacity / min_shard_size;
|
|
while (num_shards >>= 1) {
|
|
if (++num_shard_bits >= 6) {
|
|
// No more than 6.
|
|
return num_shard_bits;
|
|
}
|
|
}
|
|
return num_shard_bits;
|
|
}
|
|
|
|
int ShardedCache::GetNumShardBits() const { return BitsSetToOne(shard_mask_); }
|
|
|
|
uint32_t ShardedCache::GetNumShards() const { return shard_mask_ + 1; }
|
|
|
|
} // namespace ROCKSDB_NAMESPACE
|