Peter Dillinger
c4d8838a2b
New bit manipulation functions and 128-bit value library (#7338)
Summary:
These new functions and 128-bit value bit operations are
expected to be used in a forthcoming Bloom filter alternative.
No functional changes to production code: just new code called only by
unit tests, cosmetic changes to existing headers, and a fix to an existing
function for a yet-unused template instantiation (BitsSetToOne on
something signed and smaller than 32 bits).
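To illustrate the kind of problem the BitsSetToOne fix addresses, here is a minimal portable sketch. It assumes the issue is sign extension when a small signed value is widened before counting. The name PopCountSketch is hypothetical and this is not the RocksDB implementation, which may rely on compiler intrinsics instead.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical helper, not the RocksDB function: counts the 1 bits of any
// integral value, looking only at the bits of T itself.
template <typename T>
int PopCountSketch(T v) {
  static_assert(std::is_integral<T>::value, "non-integral type");
  // Convert to the unsigned type of the same width first. Widening a
  // negative int8_t or int16_t directly to a 32-bit type would sign-extend
  // it and add spurious 1 bits above the value's own width.
  using U = typename std::make_unsigned<T>::type;
  U u = static_cast<U>(v);
  int count = 0;
  while (u != 0) {
    u = static_cast<U>(u & (u - 1));  // clear the lowest set bit
    ++count;
  }
  // The result can never exceed the bit width of T.
  assert(count <= static_cast<int>(8 * sizeof(T)));
  return count;
}
```

With this approach, a value such as static_cast<std::int8_t>(-1) reports 8 set bits rather than 32, because the count is taken over the 8-bit representation only.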
Pull Request resolved: https://github.com/facebook/rocksdb/pull/7338
Test Plan:
Unit tests included. Works with and without
TEST_UINT128_COMPAT=1 to check compatibility with and without
__uint128_t. Also added that parameter to the CircleCI build
build-linux-shared_lib-alt_namespace-status_checked.
Reviewed By: jay-zhuang
Differential Revision: D23494945
Pulled By: pdillinger
fbshipit-source-id: 5c0dc419100d9df5d4d9abb153b2855d5aea39e8
2020-09-03 09:32:59 -07:00
Peter Dillinger
bae6f58696
Basic MultiGet support for partitioned filters (#6757)
Summary:
In MultiGet, access each applicable filter partition only once
per batch, rather than for each applicable key. Also,
* Fix Bloom stats for MultiGet
* Fix/refactor MultiGetContext::Range::KeysLeft, including adding an
efficient BitsSetToOne implementation
* Assert that MultiGetContext::Range does not go beyond shift range
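The idea of touching each filter partition once per batch can be sketched as follows. This is an illustration under stated assumptions, not the code in this change: PartitionBatch and GroupKeysByPartition are hypothetical names, keys are assumed to arrive sorted, and each partition is assumed to be identified by the last key it covers.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical types for illustration only; not RocksDB's API.
// Partition i covers keys that are <= partition_last_keys[i] and greater
// than partition_last_keys[i - 1]. Both inputs must be sorted ascending.
struct PartitionBatch {
  std::size_t partition;                 // index of the filter partition
  std::vector<std::size_t> key_indexes;  // batch keys that fall in it
};

std::vector<PartitionBatch> GroupKeysByPartition(
    const std::vector<std::string>& sorted_keys,
    const std::vector<std::string>& partition_last_keys) {
  assert(std::is_sorted(sorted_keys.begin(), sorted_keys.end()));
  assert(std::is_sorted(partition_last_keys.begin(),
                        partition_last_keys.end()));
  std::vector<PartitionBatch> groups;
  for (std::size_t i = 0; i < sorted_keys.size(); ++i) {
    // Find the first partition whose last key is >= this key.
    auto it = std::lower_bound(partition_last_keys.begin(),
                               partition_last_keys.end(), sorted_keys[i]);
    if (it == partition_last_keys.end()) {
      continue;  // past the last partition: no partition covers this key
    }
    std::size_t p =
        static_cast<std::size_t>(it - partition_last_keys.begin());
    // Because keys are sorted, keys for the same partition are adjacent,
    // so a new group starts only when the partition index changes.
    if (groups.empty() || groups.back().partition != p) {
      groups.push_back(PartitionBatch{p, {}});
    }
    groups.back().key_indexes.push_back(i);
  }
  return groups;
}
```

A caller would then fetch the filter partition once per returned group and probe all of that group's keys against it, instead of looking the partition up again for every key.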
Performance test: Generate db:
$ ./db_bench --benchmarks=fillrandom --num=15000000 --cache_index_and_filter_blocks -bloom_bits=10 -partition_index_and_filters=true
...
Before (middle performing run of three; note some missing Bloom stats):
$ ./db_bench --use-existing-db --benchmarks=multireadrandom --num=15000000 --cache_index_and_filter_blocks --bloom_bits=10 --threads=16 --cache_size=20000000 -partition_index_and_filters -batch_size=32 -multiread_batched -statistics --duration=20 2>&1 | egrep 'micros/op|block.cache.filter.hit|bloom.filter.(full|use)|number.multiget'
multireadrandom : 26.403 micros/op 597517 ops/sec; (548427 of 671968 found)
rocksdb.block.cache.filter.hit COUNT : 83443275
rocksdb.bloom.filter.useful COUNT : 0
rocksdb.bloom.filter.full.positive COUNT : 0
rocksdb.bloom.filter.full.true.positive COUNT : 7931450
rocksdb.number.multiget.get COUNT : 385984
rocksdb.number.multiget.keys.read COUNT : 12351488
rocksdb.number.multiget.bytes.read COUNT : 793145000
rocksdb.number.multiget.keys.found COUNT : 7931450
After (middle performing run of three):
$ ./db_bench_new --use-existing-db --benchmarks=multireadrandom --num=15000000 --cache_index_and_filter_blocks --bloom_bits=10 --threads=16 --cache_size=20000000 -partition_index_and_filters -batch_size=32 -multiread_batched -statistics --duration=20 2>&1 | egrep 'micros/op|block.cache.filter.hit|bloom.filter.(full|use)|number.multiget'
multireadrandom : 21.024 micros/op 752963 ops/sec; (705188 of 863968 found)
rocksdb.block.cache.filter.hit COUNT : 49856682
rocksdb.bloom.filter.useful COUNT : 45684579
rocksdb.bloom.filter.full.positive COUNT : 10395458
rocksdb.bloom.filter.full.true.positive COUNT : 9908456
rocksdb.number.multiget.get COUNT : 481984
rocksdb.number.multiget.keys.read COUNT : 15423488
rocksdb.number.multiget.bytes.read COUNT : 990845600
rocksdb.number.multiget.keys.found COUNT : 9908456
So that's about 26% higher throughput (597517 vs. 752963 ops/sec), even for random keys.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6757
Test Plan: unit test included
Reviewed By: anand1976
Differential Revision: D21243256
Pulled By: pdillinger
fbshipit-source-id: 5644a1468d9e8c8575be02f4e04bc5d62dbbb57f
2020-04-28 14:49:34 -07:00