rocksdb/util
Peter Dillinger bae6f58696 Basic MultiGet support for partitioned filters (#6757)
Summary:
In MultiGet, access each applicable filter partition only once
per batch, rather than for each applicable key. Also,

* Fix Bloom stats for MultiGet
* Fix/refactor MultiGetContext::Range::KeysLeft, including
* Add efficient BitsSetToOne implementation
* Assert that MultiGetContext::Range does not go beyond shift range
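
A minimal sketch of the ideas in the list above, not the actual RocksDB code: `BatchRange`, `skip_mask`, and `GroupByPartition` are hypothetical names used only for illustration, and the bit count here uses a portable loop rather than whatever intrinsic the real `BitsSetToOne` in math.h may use. It assumes a batch is tracked with a 64-bit mask, so the range must not extend past 64 keys, and it groups the remaining keys by filter partition so that each partition is visited once per batch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Portable population count: each iteration clears the lowest set bit,
// so the loop runs once per set bit.
inline int BitsSetToOne(uint64_t v) {
  int count = 0;
  while (v != 0) {
    v &= v - 1;
    ++count;
  }
  return count;
}

// Hypothetical stand-in for a batch range. Bit i of skip_mask set means
// key i is already resolved and needs no further lookups.
struct BatchRange {
  size_t start = 0;
  size_t end = 0;
  uint64_t skip_mask = 0;

  size_t KeysLeft() const {
    // The range must stay within the 64 bits the mask can represent.
    assert(start <= end && end <= 64);
    if (start == end) return 0;
    const size_t width = end - start;
    const uint64_t in_range =
        (width == 64) ? ~uint64_t{0}
                      : (((uint64_t{1} << width) - 1) << start);
    return width - static_cast<size_t>(BitsSetToOne(skip_mask & in_range));
  }
};

// Groups the indices of unresolved keys by filter partition, so a caller
// can load each partition once and probe all of its keys together.
// partition_of_key[i] is an assumed, precomputed partition index for key i.
inline std::map<size_t, std::vector<size_t>> GroupByPartition(
    const std::vector<size_t>& partition_of_key, const BatchRange& range) {
  std::map<size_t, std::vector<size_t>> groups;
  for (size_t i = range.start; i < range.end; ++i) {
    if (range.skip_mask & (uint64_t{1} << i)) continue;  // already resolved
    groups[partition_of_key[i]].push_back(i);
  }
  return groups;
}
```

With this layout, the number of filter partition accesses per batch is bounded by the number of distinct partitions touched rather than by the number of keys.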

Performance test: Generate db:

    $ ./db_bench --benchmarks=fillrandom --num=15000000 --cache_index_and_filter_blocks -bloom_bits=10 -partition_index_and_filters=true
    ...

Before (middle performing run of three; note some missing Bloom stats):

    $ ./db_bench --use-existing-db --benchmarks=multireadrandom --num=15000000 --cache_index_and_filter_blocks --bloom_bits=10 --threads=16 --cache_size=20000000 -partition_index_and_filters -batch_size=32 -multiread_batched -statistics --duration=20 2>&1 | egrep 'micros/op|block.cache.filter.hit|bloom.filter.(full|use)|number.multiget'
    multireadrandom :      26.403 micros/op 597517 ops/sec; (548427 of 671968 found)
    rocksdb.block.cache.filter.hit COUNT : 83443275
    rocksdb.bloom.filter.useful COUNT : 0
    rocksdb.bloom.filter.full.positive COUNT : 0
    rocksdb.bloom.filter.full.true.positive COUNT : 7931450
    rocksdb.number.multiget.get COUNT : 385984
    rocksdb.number.multiget.keys.read COUNT : 12351488
    rocksdb.number.multiget.bytes.read COUNT : 793145000
    rocksdb.number.multiget.keys.found COUNT : 7931450

After (middle performing run of three):

    $ ./db_bench_new --use-existing-db --benchmarks=multireadrandom --num=15000000 --cache_index_and_filter_blocks --bloom_bits=10 --threads=16 --cache_size=20000000 -partition_index_and_filters -batch_size=32 -multiread_batched -statistics --duration=20 2>&1 | egrep 'micros/op|block.cache.filter.hit|bloom.filter.(full|use)|number.multiget'
    multireadrandom :      21.024 micros/op 752963 ops/sec; (705188 of 863968 found)
    rocksdb.block.cache.filter.hit COUNT : 49856682
    rocksdb.bloom.filter.useful COUNT : 45684579
    rocksdb.bloom.filter.full.positive COUNT : 10395458
    rocksdb.bloom.filter.full.true.positive COUNT : 9908456
    rocksdb.number.multiget.get COUNT : 481984
    rocksdb.number.multiget.keys.read COUNT : 15423488
    rocksdb.number.multiget.bytes.read COUNT : 990845600
    rocksdb.number.multiget.keys.found COUNT : 9908456

So that's about 25% higher throughput, even for random keys.
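
The figure above can be checked directly from the two runs. The sketch below uses hypothetical helper names (`RelativeGain`, `PerKey`) and only the numbers printed by db_bench above: the ops/sec values give a gain of roughly 26%, and dividing `rocksdb.block.cache.filter.hit` by `rocksdb.number.multiget.keys.read` shows filter block accesses per key falling from roughly 6.8 to roughly 3.2.

```cpp
#include <cmath>
#include <cstdint>

// Fractional improvement of `after` over `before` (0.25 means 25% higher).
inline double RelativeGain(double before, double after) {
  return (after - before) / before;
}

// Average number of counted events per key read.
inline double PerKey(uint64_t events, uint64_t keys_read) {
  return static_cast<double>(events) / static_cast<double>(keys_read);
}

// Values copied from the benchmark output above.
constexpr double kOpsBefore = 597517.0;
constexpr double kOpsAfter = 752963.0;
constexpr uint64_t kFilterHitsBefore = 83443275;
constexpr uint64_t kFilterHitsAfter = 49856682;
constexpr uint64_t kKeysReadBefore = 12351488;
constexpr uint64_t kKeysReadAfter = 15423488;
```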
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6757

Test Plan: unit test included

Reviewed By: anand1976

Differential Revision: D21243256

Pulled By: pdillinger

fbshipit-source-id: 5644a1468d9e8c8575be02f4e04bc5d62dbbb57f
2020-04-28 14:49:34 -07:00
aligned_buffer.h Remove memcpy from RandomAccessFileReader::Read in direct IO mode (#6455) 2020-03-06 14:05:12 -08:00
autovector_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
autovector.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
bloom_impl.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
bloom_test.cc fix MSVC build failures (#6517) 2020-03-12 08:42:39 -07:00
build_version.cc.in Add copyright headers per FB open-source checkup tool. (#5199) 2019-04-18 10:55:01 -07:00
build_version.h Change RocksDB License 2017-07-15 16:11:23 -07:00
cast_util.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
channel.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
coding_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
coding.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
coding.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
compaction_job_stats_impl.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
comparator.cc Iterator with timestamp (#6255) 2020-03-06 16:24:27 -08:00
compression_context_cache.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
compression_context_cache.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
compression.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
concurrent_task_limiter_impl.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
concurrent_task_limiter_impl.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
core_local.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
crc32c_arm64.cc Check for sys/auxv.h (#6359) 2020-03-03 18:09:59 -08:00
crc32c_arm64.h Apply formatter to recent 200+ commits. (#5830) 2019-09-20 12:04:26 -07:00
crc32c_ppc_asm.S Remove PATENTS text from a few straggler files (#5326) 2019-05-21 16:22:35 -07:00
crc32c_ppc_constants.h Remove PATENTS text from a few straggler files (#5326) 2019-05-21 16:22:35 -07:00
crc32c_ppc.c C file should not include <cinttypes>, it is a C++ header. (#5499) 2019-06-24 16:12:39 -07:00
crc32c_ppc.h Remove PATENTS text from a few straggler files (#5326) 2019-05-21 16:22:35 -07:00
crc32c_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
crc32c.cc C++20 compatibility (#6697) 2020-04-20 13:24:25 -07:00
crc32c.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
defer_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
defer.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
duplicate_detector.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
dynamic_bloom_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
dynamic_bloom.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
dynamic_bloom.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
file_checksum_helper.cc Add NewFileChecksumGenCrc32cFactory to file checksum (#6688) 2020-04-13 19:13:41 -07:00
file_checksum_helper.h Use FileChecksumGenFactory for SST file checksum (#6600) 2020-03-29 15:58:46 -07:00
file_reader_writer_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
filelock_test.cc Include more information in file lock failure (#6507) 2020-03-11 16:23:08 -07:00
filter_bench.cc Misc filter_bench improvements (#6444) 2020-02-21 13:31:57 -08:00
gflags_compat.h filter_bench - a prelim tool for SST filter benchmarking (#5825) 2019-10-07 20:10:53 -07:00
hash_map.h Change HashMap::Insert()'s value to a const reference (#6567) 2020-03-20 14:59:54 -07:00
hash_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
hash.cc C++20 compatibility (#6697) 2020-04-20 13:24:25 -07:00
hash.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
heap_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
heap.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
kv_map.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
log_write_bench.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
math.h Basic MultiGet support for partitioned filters (#6757) 2020-04-28 14:49:34 -07:00
murmurhash.cc C++20 compatibility (#6697) 2020-04-20 13:24:25 -07:00
murmurhash.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
mutexlock.h Prevents Table Cache to open same files more times (#6707) 2020-04-21 13:16:31 -07:00
ppc-opcode.h Remove PATENTS text from a few straggler files (#5326) 2019-05-21 16:22:35 -07:00
random_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
random.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
random.h C++20 compatibility (#6697) 2020-04-20 13:24:25 -07:00
rate_limiter_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
rate_limiter.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
rate_limiter.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
repeatable_thread_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
repeatable_thread.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
set_comparator.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
slice_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
slice_transform_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
slice.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
status.cc Stats for redundant insertions into block cache (#6681) 2020-04-27 13:20:27 -07:00
stderr_logger.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
stop_watch.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
string_util.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
string_util.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
thread_list_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
thread_local_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
thread_local.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
thread_local.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
thread_operation.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
threadpool_imp.cc Be able to decrease background thread's CPU priority when creating database backup (#6602) 2020-03-28 19:07:25 -07:00
threadpool_imp.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
timer_queue_test.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
timer_queue.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
timer_test.cc Add a simple timer support to schedule work at fixed times/intervals (#6543) 2020-04-07 11:55:27 -07:00
timer.h Add a simple timer support to schedule work at fixed times/intervals (#6543) 2020-04-07 11:55:27 -07:00
user_comparator_wrapper.h Iterator with timestamp (#6255) 2020-03-06 16:24:27 -08:00
vector_iterator.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
work_queue_test.cc Add pipelined & parallel compression optimization (#6262) 2020-04-01 16:40:18 -07:00
work_queue.h Revamp cache_bench to resemble a real workload (#6629) 2020-04-03 10:26:49 -07:00
xxh3p.h Remove XXH3(preview) streaming APIs (#6540) 2020-03-16 17:02:00 -07:00
xxhash.cc C++20 compatibility (#6697) 2020-04-20 13:24:25 -07:00
xxhash.h Misc hashing updates / upgrades (#5909) 2019-10-24 17:16:46 -07:00