rocksdb/cache
Peter Dillinger 95168f3514 Pin CacheEntryStatsCollector to fix performance bug (#8385)
Summary:
If the block cache is full with strict_capacity_limit=false,
then our CacheEntryStatsCollector could be evicted immediately on
release, so iterating through column families that share a block cache
could trigger a re-scan for each CF. This change fixes that problem by
pinning the CacheEntryStatsCollector from InternalStats so that it is
not evicted.
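
To illustrate why holding a reference matters, here is a toy model of a
non-strict cache. It is only a sketch: `ToyCache` and its methods are
hypothetical names, not RocksDB's Cache API. It assumes that an entry whose
last reference is released while usage exceeds capacity is dropped at once.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Toy model only (hypothetical, not RocksDB code). Capacity is not strict:
// inserts always succeed, even when they push usage over capacity.
class ToyCache {
 public:
  explicit ToyCache(size_t capacity) : capacity_(capacity) {}

  // Inserts an entry and leaves one reference (a "pin") with the caller.
  void InsertAndPin(const std::string& key, size_t charge) {
    if (entries_.count(key) > 0) return;  // keep the sketch simple
    entries_[key] = Entry{charge, 1};
    usage_ += charge;
  }

  // Drops one reference. If that was the last reference and the cache is
  // over capacity, the entry is evicted immediately.
  void Release(const std::string& key) {
    auto it = entries_.find(key);
    if (it == entries_.end() || it->second.refs == 0) return;
    if (--it->second.refs == 0 && usage_ > capacity_) {
      usage_ -= it->second.charge;
      entries_.erase(it);
    }
  }

  bool Contains(const std::string& key) const {
    return entries_.count(key) > 0;
  }

 private:
  struct Entry {
    size_t charge;
    int refs;
  };
  size_t capacity_;
  size_t usage_ = 0;
  std::map<std::string, Entry> entries_;
};
```

In this model, a collector entry that is released while the cache is over
capacity disappears, so the next lookup has to rebuild it. An owner that
never calls Release (the analogue of pinning from InternalStats) keeps it
resident.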

I had originally thought that this object could participate in LRU like
everything else, but even though a re-load+re-scan only touches memory,
it can be orders of magnitude more expensive than other cache misses.
One service at Facebook has scans that take ~20s over a 100GB block cache
that is mostly 4KB entries. (The upside of this bug and
https://github.com/facebook/rocksdb/issues/8369 is that we had a natural
experiment on the effect on some service metrics even with block cache
scans running continuously in the background, a kind of worst-case
scenario. Metrics like latency were not affected enough to trigger
warnings.)

Other smaller fixes:

20s is already a sizable portion of the 600s stats dump period, or of the
180s default max age to force a re-scan, so this change adds logic to
ensure that (for each block cache) we don't spend more than 0.2% of our
background thread time scanning it. Nevertheless, "foreground" requests
for cache entry stats (calls to
`db->GetMapProperty(DB::Properties::kBlockCacheEntryStats)`)
are permitted to consume more CPU.
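
One way to express such a limit is to require that the gap since the last
scan be at least some multiple of that scan's duration. A factor of 500
corresponds to 1/500 = 0.2%. The sketch below assumes this formulation.
`ScanThrottle`, its fields, and the foreground factor of 10 used in the
usage note are hypothetical names and values for illustration, not taken
from the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper (not the RocksDB implementation): decides whether a
// new cache scan may start, based on how long the previous scan took.
struct ScanThrottle {
  uint64_t last_start_micros = 0;
  uint64_t last_end_micros = 0;

  // Records the start and end time of a completed scan.
  void RecordScan(uint64_t start_micros, uint64_t end_micros) {
    assert(end_micros >= start_micros);
    last_start_micros = start_micros;
    last_end_micros = end_micros;
  }

  // Returns true if a scan may start at now_micros. The required gap since
  // the previous scan is the larger of max_age_micros and
  // min_interval_factor * (duration of the previous scan).
  bool ShouldScan(uint64_t now_micros, uint64_t max_age_micros,
                  uint64_t min_interval_factor) const {
    if (last_end_micros == 0) return true;  // no scan recorded yet
    const uint64_t last_duration = last_end_micros - last_start_micros;
    const uint64_t required_gap =
        std::max(max_age_micros, min_interval_factor * last_duration);
    return now_micros > last_end_micros &&
           now_micros - last_end_micros > required_gap;
  }
};
```

With these assumptions, a 20s scan and a factor of 500 push the next
background scan out by at least 10,000s, regardless of the 180s max age. A
foreground caller could pass a smaller factor (for example 10, giving a 200s
gap) to get fresher stats at higher CPU cost.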

Renamed field to cache_entry_stats_ to match code style.

This change is intended to be patched into the 6.21 release.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/8385

Test Plan:
Unit test expanded to cover the new logic (detects regression), plus
some manual testing with db_bench.

Reviewed By: ajkr

Differential Revision: D29042759

Pulled By: pdillinger

fbshipit-source-id: 236faa902397f50038c618f50fbc8cf3f277308c
2021-06-14 15:06:50 -07:00
..
cache_bench_tool.cc Allow cache_bench/db_bench to use a custom secondary cache (#8312) 2021-05-19 15:26:18 -07:00
cache_bench.cc Allow cache_bench/db_bench to use a custom secondary cache (#8312) 2021-05-19 15:26:18 -07:00
cache_entry_roles.cc Use deleters to label cache entries and collect stats (#8297) 2021-05-19 16:51:13 -07:00
cache_entry_roles.h Use deleters to label cache entries and collect stats (#8297) 2021-05-19 16:51:13 -07:00
cache_entry_stats.h Pin CacheEntryStatsCollector to fix performance bug (#8385) 2021-06-14 15:06:50 -07:00
cache_helpers.h Use deleters to label cache entries and collect stats (#8297) 2021-05-19 16:51:13 -07:00
cache_test.cc Use deleters to label cache entries and collect stats (#8297) 2021-05-19 16:51:13 -07:00
cache.cc Refactor Option obj address from char* to void* (#8295) 2021-05-13 14:29:42 -07:00
clock_cache.cc Use deleters to label cache entries and collect stats (#8297) 2021-05-19 16:51:13 -07:00
clock_cache.h Change RocksDB License 2017-07-15 16:11:23 -07:00
lru_cache_test.cc fix lru caching test and fix reference binding to null pointer (#8326) 2021-05-24 11:04:54 -07:00
lru_cache.cc Use deleters to label cache entries and collect stats (#8297) 2021-05-19 16:51:13 -07:00
lru_cache.h Use deleters to label cache entries and collect stats (#8297) 2021-05-19 16:51:13 -07:00
sharded_cache.cc Use deleters to label cache entries and collect stats (#8297) 2021-05-19 16:51:13 -07:00
sharded_cache.h Use deleters to label cache entries and collect stats (#8297) 2021-05-19 16:51:13 -07:00