57f3032285
Summary: There's no technological impediment to allowing the Bloom filter bits/key to be non-integer (fractional/decimal) values, and it provides finer control over the memory vs. accuracy trade-off. This is especially handy in using the format_version=5 Bloom filter in place of the old one, because bits_per_key=9.55 provides the same accuracy as the old bits_per_key=10. This change not only requires refining the logic for choosing the best num_probes for a given bits/key setting, it revealed a flaw in that logic. As bits/key gets higher, the best num_probes for a cache-local Bloom filter is closer to bpk / 2 than to bpk * 0.69, the best choice for a standard Bloom filter. For example, at 16 bits per key, the best num_probes is 9 (FP rate = 0.0843%) not 11 (FP rate = 0.0884%). This change fixes and refines that logic (for the format_version=5 Bloom filter only, just in case) based on empirical tests to find accuracy inflection points between each num_probes. Although bits_per_key is now specified as a double, the new Bloom filter converts/rounds this to "millibits / key" for predictable/precise internal computations. Just in case of unforeseen compatibility issues, we round to the nearest whole number bits / key for the legacy Bloom filter, so as not to unlock new behaviors for it. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6092 Test Plan: unit tests included Differential Revision: D18711313 Pulled By: pdillinger fbshipit-source-id: 1aa73295f152a995328cb846ef9157ae8a05522a |
||
---|---|---|
.. | ||
adaptive | ||
block_based | ||
cuckoo | ||
plain | ||
block_fetcher.cc | ||
block_fetcher.h | ||
cleanable_test.cc | ||
format.cc | ||
format.h | ||
get_context.cc | ||
get_context.h | ||
internal_iterator.h | ||
iter_heap.h | ||
iterator_wrapper.h | ||
iterator.cc | ||
merger_test.cc | ||
merging_iterator.cc | ||
merging_iterator.h | ||
meta_blocks.cc | ||
meta_blocks.h | ||
mock_table.cc | ||
mock_table.h | ||
multiget_context.h | ||
persistent_cache_helper.cc | ||
persistent_cache_helper.h | ||
persistent_cache_options.h | ||
scoped_arena_iterator.h | ||
sst_file_reader_test.cc | ||
sst_file_reader.cc | ||
sst_file_writer_collectors.h | ||
sst_file_writer.cc | ||
table_builder.h | ||
table_properties_internal.h | ||
table_properties.cc | ||
table_reader_bench.cc | ||
table_reader_caller.h | ||
table_reader.h | ||
table_test.cc | ||
two_level_iterator.cc | ||
two_level_iterator.h |