57f3032285
Summary: There's no technological impediment to allowing the Bloom filter bits/key to be non-integer (fractional/decimal) values, and it provides finer control over the memory vs. accuracy trade-off. This is especially handy in using the format_version=5 Bloom filter in place of the old one, because bits_per_key=9.55 provides the same accuracy as the old bits_per_key=10. This change not only requires refining the logic for choosing the best num_probes for a given bits/key setting, it revealed a flaw in that logic. As bits/key gets higher, the best num_probes for a cache-local Bloom filter is closer to bpk / 2 than to bpk * 0.69, the best choice for a standard Bloom filter. For example, at 16 bits per key, the best num_probes is 9 (FP rate = 0.0843%) not 11 (FP rate = 0.0884%). This change fixes and refines that logic (for the format_version=5 Bloom filter only, just in case) based on empirical tests to find accuracy inflection points between each num_probes. Although bits_per_key is now specified as a double, the new Bloom filter converts/rounds this to "millibits / key" for predictable/precise internal computations. Just in case of unforeseen compatibility issues, we round to the nearest whole number bits / key for the legacy Bloom filter, so as not to unlock new behaviors for it. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6092 Test Plan: unit tests included Differential Revision: D18711313 Pulled By: pdillinger fbshipit-source-id: 1aa73295f152a995328cb846ef9157ae8a05522a |
||
---|---|---|
.. | ||
block_based_filter_block_test.cc | ||
block_based_filter_block.cc | ||
block_based_filter_block.h | ||
block_based_table_builder.cc | ||
block_based_table_builder.h | ||
block_based_table_factory.cc | ||
block_based_table_factory.h | ||
block_based_table_reader.cc | ||
block_based_table_reader.h | ||
block_builder.cc | ||
block_builder.h | ||
block_prefix_index.cc | ||
block_prefix_index.h | ||
block_test.cc | ||
block_type.h | ||
block.cc | ||
block.h | ||
cachable_entry.h | ||
data_block_footer.cc | ||
data_block_footer.h | ||
data_block_hash_index_test.cc | ||
data_block_hash_index.cc | ||
data_block_hash_index.h | ||
filter_block_reader_common.cc | ||
filter_block_reader_common.h | ||
filter_block.h | ||
filter_policy_internal.h | ||
filter_policy.cc | ||
flush_block_policy.cc | ||
flush_block_policy.h | ||
full_filter_block_test.cc | ||
full_filter_block.cc | ||
full_filter_block.h | ||
index_builder.cc | ||
index_builder.h | ||
mock_block_based_table.h | ||
parsed_full_filter_block.cc | ||
parsed_full_filter_block.h | ||
partitioned_filter_block_test.cc | ||
partitioned_filter_block.cc | ||
partitioned_filter_block.h | ||
uncompression_dict_reader.cc | ||
uncompression_dict_reader.h |