Summary:
Amongst other things, PR https://github.com/facebook/rocksdb/issues/5504 refactored the filter block readers so that
only the filter block contents are stored in the block cache (as opposed to the
earlier design where the cache stored the filter block reader itself, leading to
potentially dangling pointers and concurrency bugs). However, this change
introduced a performance hit since with the new code, the metadata fields are
re-parsed upon every access. This patch reunites the block contents with the
filter bits reader to eliminate this overhead; since this is still a self-contained
pure data object, it is safe to store it in the cache. (Note: this is similar to how
the zstd digest is handled.)
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5936
Test Plan:
make asan_check
filter_bench results for the old code:
```
$ ./filter_bench -quick
WARNING: Assertions are enabled; benchmarks unnecessarily slow
Building...
Build avg ns/key: 26.7153
Number of filters: 16669
Total memory (MB): 200.009
Bits/key actual: 10.0647
----------------------------
Inside queries...
Dry run (46b) ns/op: 33.4258
Single filter ns/op: 42.5974
Random filter ns/op: 217.861
----------------------------
Outside queries...
Dry run (25d) ns/op: 32.4217
Single filter ns/op: 50.9855
Random filter ns/op: 219.167
Average FP rate %: 1.13993
----------------------------
Done. (For more info, run with -legend or -help.)
$ ./filter_bench -quick -use_full_block_reader
WARNING: Assertions are enabled; benchmarks unnecessarily slow
Building...
Build avg ns/key: 26.5172
Number of filters: 16669
Total memory (MB): 200.009
Bits/key actual: 10.0647
----------------------------
Inside queries...
Dry run (46b) ns/op: 32.3556
Single filter ns/op: 83.2239
Random filter ns/op: 370.676
----------------------------
Outside queries...
Dry run (25d) ns/op: 32.2265
Single filter ns/op: 93.5651
Random filter ns/op: 408.393
Average FP rate %: 1.13993
----------------------------
Done. (For more info, run with -legend or -help.)
```
With the new code:
```
$ ./filter_bench -quick
WARNING: Assertions are enabled; benchmarks unnecessarily slow
Building...
Build avg ns/key: 25.4285
Number of filters: 16669
Total memory (MB): 200.009
Bits/key actual: 10.0647
----------------------------
Inside queries...
Dry run (46b) ns/op: 31.0594
Single filter ns/op: 43.8974
Random filter ns/op: 226.075
----------------------------
Outside queries...
Dry run (25d) ns/op: 31.0295
Single filter ns/op: 50.3824
Random filter ns/op: 226.805
Average FP rate %: 1.13993
----------------------------
Done. (For more info, run with -legend or -help.)
$ ./filter_bench -quick -use_full_block_reader
WARNING: Assertions are enabled; benchmarks unnecessarily slow
Building...
Build avg ns/key: 26.5308
Number of filters: 16669
Total memory (MB): 200.009
Bits/key actual: 10.0647
----------------------------
Inside queries...
Dry run (46b) ns/op: 33.2968
Single filter ns/op: 58.6163
Random filter ns/op: 291.434
----------------------------
Outside queries...
Dry run (25d) ns/op: 32.1839
Single filter ns/op: 66.9039
Random filter ns/op: 292.828
Average FP rate %: 1.13993
----------------------------
Done. (For more info, run with -legend or -help.)
```
Differential Revision: D17991712
Pulled By: ltamasi
fbshipit-source-id: 7ea205550217bfaaa1d5158ebd658e5832e60f29
Summary:
Fixed some spots where converting size_t or uint_fast32_t to
uint32_t. Wrapped mt19937 in a new Random32 class to avoid future
such traps.
NB: I tried using Random32::Uniform (std::uniform_int_distribution) in
filter_bench instead of fastrange, but that more than doubled the dry
run time! So I added fastrange as Random32::Uniformish. ;)
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5894
Test Plan: USE_CLANG=1 build, and manual re-run filter_bench
Differential Revision: D17825131
Pulled By: pdillinger
fbshipit-source-id: 68feee333b5f8193c084ded760e3d6679b405ecd
Summary:
Example: using the tool before and after PR https://github.com/facebook/rocksdb/issues/5784 shows that
the refactoring, presumed performance-neutral, actually sped up SST
filters by about 3% to 8% (repeatable result):
Before:
- Dry run ns/op: 22.4725
- Single filter ns/op: 51.1078
- Random filter ns/op: 120.133
After:
+ Dry run ns/op: 22.2301
+ Single filter run ns/op: 47.4313
+ Random filter ns/op: 115.9
Only tests filters for the block-based table (full filters and
partitioned filters - same implementation; not block-based filters),
which seems to be the recommended format/implementation.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5825
Differential Revision: D17804987
Pulled By: pdillinger
fbshipit-source-id: 0f18a9c254c57f7866030d03e7fa4ba503bac3c5