rocksdb/db
Peter Dillinger 280b9f371a Fix auto_prefix_mode performance with partitioned filters (#10012)
Summary:
Essentially refactored the RangeMayExist implementation in
FullFilterBlockReader to FilterBlockReaderCommon so that it applies to
partitioned filters as well. (The function is not called for the
block-based filter case.) RangeMayExist is essentially a series of checks
around a possible PrefixMayExist, and I'm confident those checks should
be the same for partitioned as for full filters. (I think it's likely
that bugs remain in those checks, but this change is overall a simplifying
one.)

Added auto_prefix_mode support to db_bench

Other small fixes as well

Fixes https://github.com/facebook/rocksdb/issues/10003

Pull Request resolved: https://github.com/facebook/rocksdb/pull/10012

Test Plan:
Expanded unit test that uses statistics to check for filter
optimization, fails without the production code changes here

Performance: populate two DBs with
```
TEST_TMPDIR=/dev/shm/rocksdb_nonpartitioned ./db_bench -benchmarks=fillrandom -num=10000000 -disable_wal=1 -write_buffer_size=30000000 -bloom_bits=16 -compaction_style=2 -fifo_compaction_max_table_files_size_mb=10000 -fifo_compaction_allow_compaction=0 -prefix_size=8
TEST_TMPDIR=/dev/shm/rocksdb_partitioned ./db_bench -benchmarks=fillrandom -num=10000000 -disable_wal=1 -write_buffer_size=30000000 -bloom_bits=16 -compaction_style=2 -fifo_compaction_max_table_files_size_mb=10000 -fifo_compaction_allow_compaction=0 -prefix_size=8 -partition_index_and_filters
```

Observe no measurable change in non-partitioned performance
```
TEST_TMPDIR=/dev/shm/rocksdb_nonpartitioned ./db_bench -benchmarks=seekrandom[-X1000] -num=10000000 -readonly -bloom_bits=16 -compaction_style=2 -fifo_compaction_max_table_files_size_mb=10000 -fifo_compaction_allow_compaction=0 -prefix_size=8 -auto_prefix_mode -cache_index_and_filter_blocks=1 -cache_size=1000000000 -duration 20
```
Before: seekrandom [AVG 15 runs] : 11798 (± 331) ops/sec
After: seekrandom [AVG 15 runs] : 11724 (± 315) ops/sec

Observe big improvement with partitioned (also supported by bloom use statistics)
```
TEST_TMPDIR=/dev/shm/rocksdb_partitioned ./db_bench -benchmarks=seekrandom[-X1000] -num=10000000 -readonly -bloom_bits=16 -compaction_style=2 -fifo_compaction_max_table_files_size_mb=10000 -fifo_compaction_allow_compaction=0 -prefix_size=8 -partition_index_and_filters -auto_prefix_mode -cache_index_and_filter_blocks=1 -cache_size=1000000000 -duration 20
```
Before: seekrandom [AVG 12 runs] : 2942 (± 57) ops/sec
After: seekrandom [AVG 12 runs] : 7489 (± 184) ops/sec

Reviewed By: siying

Differential Revision: D36469796

Pulled By: pdillinger

fbshipit-source-id: bcf1e2a68d347b32adb2b27384f945434e7a266d
2022-05-19 13:09:03 -07:00
..
blob Remove own ToString() (#9955) 2022-05-06 13:03:58 -07:00
compaction Track SST unique id in MANIFEST and verify (#9990) 2022-05-19 11:04:21 -07:00
db_impl Track SST unique id in MANIFEST and verify (#9990) 2022-05-19 11:04:21 -07:00
arena_wrapped_db_iter.cc Remove own ToString() (#9955) 2022-05-06 13:03:58 -07:00
arena_wrapped_db_iter.h Cleanup includes in dbformat.h (#8930) 2021-09-29 04:04:40 -07:00
builder.cc Track SST unique id in MANIFEST and verify (#9990) 2022-05-19 11:04:21 -07:00
builder.h CompactionIterator sees consistent view of which keys are committed (#9830) 2022-04-14 11:11:04 -07:00
c_test.c Port the batched version of MultiGet() to RocksDB's C API (#9952) 2022-05-12 18:17:36 -07:00
c.cc Port the batched version of MultiGet() to RocksDB's C API (#9952) 2022-05-12 18:17:36 -07:00
column_family_test.cc Remove own ToString() (#9955) 2022-05-06 13:03:58 -07:00
column_family.cc Use std::numeric_limits<> (#9954) 2022-05-05 13:08:21 -07:00
column_family.h Meta-internal folly integration with F14FastMap (#9546) 2022-04-13 07:34:01 -07:00
compact_files_test.cc Remove own ToString() (#9955) 2022-05-06 13:03:58 -07:00
comparator_db_test.cc Remove own ToString() (#9955) 2022-05-06 13:03:58 -07:00
convenience.cc Specify largest_seqno in VerifyChecksum (#9919) 2022-05-02 10:22:08 -07:00
corruption_test.cc Update WAL corruption test so that it fails without fix (#9942) 2022-05-11 16:12:55 -07:00
cuckoo_table_db_test.cc Remove own ToString() (#9955) 2022-05-06 13:03:58 -07:00
db_basic_test.cc Remove own ToString() (#9955) 2022-05-06 13:03:58 -07:00
db_block_cache_test.cc Remove own ToString() (#9955) 2022-05-06 13:03:58 -07:00
db_bloom_filter_test.cc Rewrite memory-charging feature's option API (#9926) 2022-05-17 15:01:51 -07:00
db_compaction_filter_test.cc Remove own ToString() (#9955) 2022-05-06 13:03:58 -07:00
db_compaction_test.cc Remove own ToString() (#9955) 2022-05-06 13:03:58 -07:00
db_dynamic_level_test.cc Remove deprecated API AdvancedColumnFamilyOptions::soft_rate_limit/hard_rate_limit (#9452) 2022-01-27 13:01:09 -08:00
db_encryption_test.cc Fix a minor issue with initializing the test path (#8555) 2021-07-23 08:38:45 -07:00
db_filesnapshot.cc Use std::numeric_limits<> (#9954) 2022-05-05 13:08:21 -07:00
db_flush_test.cc Use std::numeric_limits<> (#9954) 2022-05-05 13:08:21 -07:00
db_info_dumper.cc Printing IO Error in DumpDBFileSummary (#9940) 2022-05-04 10:19:53 -07:00
db_info_dumper.h Add a DB Session ID (#6959) 2020-06-15 10:47:02 -07:00
db_inplace_update_test.cc fix a bug, c api, if enable inplace_update_support, and use create sn… (#9471) 2022-03-21 12:04:33 -07:00
db_io_failure_test.cc Enable a few unit tests to use custom Env objects (#9087) 2021-11-08 11:05:59 -08:00
db_iter_stress_test.cc Remove own ToString() (#9955) 2022-05-06 13:03:58 -07:00
db_iter_test.cc Remove own ToString() (#9955) 2022-05-06 13:03:58 -07:00
db_iter.cc Remove dead code (#9825) 2022-04-11 10:26:55 -07:00
db_iter.h Remove dead code (#9825) 2022-04-11 10:26:55 -07:00
db_iterator_test.cc Remove own ToString() (#9955) 2022-05-06 13:03:58 -07:00
db_kv_checksum_test.cc Use std::numeric_limits<> (#9954) 2022-05-05 13:08:21 -07:00
db_log_iter_test.cc Remove own ToString() (#9955) 2022-05-06 13:03:58 -07:00
db_logical_block_size_cache_test.cc Attempt to deflake DBLogicalBlockSizeCacheTest.CreateColumnFamilies (#9516) 2022-03-04 11:35:28 -08:00
db_memtable_test.cc Remove own ToString() (#9955) 2022-05-06 13:03:58 -07:00
db_merge_operand_test.cc Fix GetMergeOperands() heap-use-after-free on flushed memtable (#9805) 2022-04-05 12:26:36 -07:00
db_merge_operator_test.cc Fix a minor issue with initializing the test path (#8555) 2021-07-23 08:38:45 -07:00
db_options_test.cc Track SST unique id in MANIFEST and verify (#9990) 2022-05-19 11:04:21 -07:00
db_properties_test.cc Remove own ToString() (#9955) 2022-05-06 13:03:58 -07:00
db_range_del_test.cc Remove own ToString() (#9955) 2022-05-06 13:03:58 -07:00
db_rate_limiter_test.cc Set Read rate limiter priority dynamically and pass it to FS (#9996) 2022-05-18 19:41:44 -07:00
db_secondary_test.cc Track SST unique id in MANIFEST and verify (#9990) 2022-05-19 11:04:21 -07:00
db_sst_test.cc Rewrite memory-charging feature's option API (#9926) 2022-05-17 15:01:51 -07:00
db_statistics_test.cc Bytes read stat for VerifyChecksum() and VerifyFileChecksums() APIs (#8741) 2021-09-07 13:28:29 -07:00
db_table_properties_test.cc Remove own ToString() (#9955) 2022-05-06 13:03:58 -07:00
db_tailing_iter_test.cc Fix a minor issue with initializing the test path (#8555) 2021-07-23 08:38:45 -07:00
db_test2.cc Fix auto_prefix_mode performance with partitioned filters (#10012) 2022-05-19 13:09:03 -07:00
db_test_util.cc Rewrite memory-charging feature's option API (#9926) 2022-05-17 15:01:51 -07:00
db_test_util.h Rewrite memory-charging feature's option API (#9926) 2022-05-17 15:01:51 -07:00
db_test.cc Revert "Bugfix/fix manual flush blocking bug (#9893)" (#9992) 2022-05-13 12:31:30 -07:00
db_universal_compaction_test.cc Remove own ToString() (#9955) 2022-05-06 13:03:58 -07:00
db_wal_test.cc Remove own ToString() (#9955) 2022-05-06 13:03:58 -07:00
db_with_timestamp_basic_test.cc Remove own ToString() (#9955) 2022-05-06 13:03:58 -07:00
db_with_timestamp_compaction_test.cc Use the comparator from the sst file table properties in sst_dump_tool (#9491) 2022-02-08 12:15:35 -08:00
db_write_buffer_manager_test.cc Enable a few unit tests to use custom Env objects (#9087) 2021-11-08 11:05:59 -08:00
db_write_test.cc Remove own ToString() (#9955) 2022-05-06 13:03:58 -07:00
dbformat_test.cc Enable a few unit tests to use custom Env objects (#9087) 2021-11-08 11:05:59 -08:00
dbformat.cc Track per-SST user-defined timestamp information in MANIFEST (#9092) 2021-11-10 10:49:04 -08:00
dbformat.h Use std::numeric_limits<> (#9954) 2022-05-05 13:08:21 -07:00
deletefile_test.cc Remove own ToString() (#9955) 2022-05-06 13:03:58 -07:00
error_handler_fs_test.cc Remove own ToString() (#9955) 2022-05-06 13:03:58 -07:00
error_handler.cc Expand auto recovery to background read errors (#9679) 2022-03-15 14:45:34 -07:00
error_handler.h Expand auto recovery to background read errors (#9679) 2022-03-15 14:45:34 -07:00
event_helpers.cc Fix a race condition in WAL tracking causing DB open failure (#9715) 2022-03-23 19:41:31 -07:00
event_helpers.h Add a listener callback for end of auto error recovery (#9244) 2021-12-08 14:30:57 -08:00
experimental.cc Track SST unique id in MANIFEST and verify (#9990) 2022-05-19 11:04:21 -07:00
external_sst_file_basic_test.cc Track SST unique id in MANIFEST and verify (#9990) 2022-05-19 11:04:21 -07:00
external_sst_file_ingestion_job.cc Track SST unique id in MANIFEST and verify (#9990) 2022-05-19 11:04:21 -07:00
external_sst_file_ingestion_job.h Track SST unique id in MANIFEST and verify (#9990) 2022-05-19 11:04:21 -07:00
external_sst_file_test.cc Remove own ToString() (#9955) 2022-05-06 13:03:58 -07:00
fault_injection_test.cc Fix a bug causing duplicate trailing entries in WritableFile (buffered IO) (#9236) 2021-12-13 09:00:36 -08:00
file_indexer_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
file_indexer.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
file_indexer.h Use std::numeric_limits<> (#9954) 2022-05-05 13:08:21 -07:00
filename_test.cc fixing issue #8345 RocksDB does not work when using UNC network paths (#9384) 2022-03-30 15:55:31 -07:00
flush_job_test.cc Set Write rate limiter priority dynamically and pass it to FS (#9988) 2022-05-18 00:41:41 -07:00
flush_job.cc Track SST unique id in MANIFEST and verify (#9990) 2022-05-19 11:04:21 -07:00
flush_job.h Set Write rate limiter priority dynamically and pass it to FS (#9988) 2022-05-18 00:41:41 -07:00
flush_scheduler.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
flush_scheduler.h Include C++ standard library headers instead of C compatibility headers (#8068) 2021-03-19 12:09:47 -07:00
forward_iterator_bench.cc Remove using namespace (#9369) 2022-01-12 09:31:12 -08:00
forward_iterator.cc Remove own ToString() (#9955) 2022-05-06 13:03:58 -07:00
forward_iterator.h Fast path for detecting unchanged prefix_extractor (#9407) 2022-01-21 11:37:46 -08:00
history_trimming_iterator.h Add OpenAndTrimHistory API to support trimming data with specified timestamp (#9410) 2022-03-11 16:13:23 -08:00
import_column_family_job.cc Track SST unique id in MANIFEST and verify (#9990) 2022-05-19 11:04:21 -07:00
import_column_family_job.h New stable, fixed-length cache keys (#9126) 2021-12-16 17:15:13 -08:00
import_column_family_test.cc Track SST unique id in MANIFEST and verify (#9990) 2022-05-19 11:04:21 -07:00
internal_stats.cc Remove own ToString() (#9955) 2022-05-06 13:03:58 -07:00
internal_stats.h Expose the amount of garbage in live blob files as a dedicated DB property (#9835) 2022-04-13 13:36:30 -07:00
job_context.h CompactionIterator sees consistent view of which keys are committed (#9830) 2022-04-14 11:11:04 -07:00
kv_checksum.h fix compile errors in db/kv_checksum.h (#9173) 2021-11-16 10:20:50 -08:00
listener_test.cc Remove own ToString() (#9955) 2022-05-06 13:03:58 -07:00
log_format.h Add record to set WAL compression type if enabled (#9556) 2022-02-17 16:19:31 -08:00
log_reader.cc Integrate WAL compression into log reader/writer. (#9642) 2022-03-09 15:49:53 -08:00
log_reader.h Integrate WAL compression into log reader/writer. (#9642) 2022-03-09 15:49:53 -08:00
log_test.cc Integrate WAL compression into log reader/writer. (#9642) 2022-03-09 15:49:53 -08:00
log_writer.cc Integrate WAL compression into log reader/writer. (#9642) 2022-03-09 15:49:53 -08:00
log_writer.h Integrate WAL compression into log reader/writer. (#9642) 2022-03-09 15:49:53 -08:00
logs_with_prep_tracker.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
logs_with_prep_tracker.h Include C++ standard library headers instead of C compatibility headers (#8068) 2021-03-19 12:09:47 -07:00
lookup_key.h Cleanup includes in dbformat.h (#8930) 2021-09-29 04:04:40 -07:00
malloc_stats.cc Replace most typedef with using= (#8751) 2021-09-07 11:31:59 -07:00
malloc_stats.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
manual_compaction_test.cc Remove using namespace (#9369) 2022-01-12 09:31:12 -08:00
memtable_list_test.cc Remove own ToString() (#9955) 2022-05-06 13:03:58 -07:00
memtable_list.cc Encode min_log_number_to_keep and delete_wals_before in one version edit (#9766) 2022-03-31 20:00:52 -07:00
memtable_list.h Fix various spelling errors still found in code (#9653) 2022-05-05 19:45:32 -07:00
memtable.cc Use std::numeric_limits<> (#9954) 2022-05-05 13:08:21 -07:00
memtable.h Meta-internal folly integration with F14FastMap (#9546) 2022-04-13 07:34:01 -07:00
merge_context.h Add Merge Operator support to WriteBatchWithIndex (#8135) 2021-05-10 12:50:25 -07:00
merge_helper_test.cc Support readahead during compaction for blob files (#9187) 2021-11-19 17:53:47 -08:00
merge_helper.cc Support readahead during compaction for blob files (#9187) 2021-11-19 17:53:47 -08:00
merge_helper.h Support readahead during compaction for blob files (#9187) 2021-11-19 17:53:47 -08:00
merge_operator.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
merge_test.cc Make the Env class Customizable (#9293) 2022-01-04 16:45:49 -08:00
obsolete_files_test.cc Remove own ToString() (#9955) 2022-05-06 13:03:58 -07:00
options_file_test.cc No elide constructors (#7798) 2020-12-23 16:55:53 -08:00
output_validator.cc Cleanup includes in dbformat.h (#8930) 2021-09-29 04:04:40 -07:00
output_validator.h Cleanup includes in dbformat.h (#8930) 2021-09-29 04:04:40 -07:00
perf_context_test.cc Remove own ToString() (#9955) 2022-05-06 13:03:58 -07:00
periodic_work_scheduler_test.cc Fix a minor issue with initializing the test path (#8555) 2021-07-23 08:38:45 -07:00
periodic_work_scheduler.cc Fix a timer crash caused by invalid memory management (#9656) 2022-03-12 11:45:56 -08:00
periodic_work_scheduler.h Fix a timer crash caused by invalid memory management (#9656) 2022-03-12 11:45:56 -08:00
pinned_iterators_manager.h Replace most typedef with using= (#8751) 2021-09-07 11:31:59 -07:00
plain_table_db_test.cc Remove own ToString() (#9955) 2022-05-06 13:03:58 -07:00
pre_release_callback.h Fix and detect headers with missing dependencies (#8893) 2021-09-10 10:00:26 -07:00
prefix_test.cc Remove own ToString() (#9955) 2022-05-06 13:03:58 -07:00
range_del_aggregator_bench.cc Cleanup multiple implementations of VectorIterator (#8901) 2021-10-06 07:48:31 -07:00
range_del_aggregator_test.cc Cleanup multiple implementations of VectorIterator (#8901) 2021-10-06 07:48:31 -07:00
range_del_aggregator.cc In ParseInternalKey(), include corrupt key info in Status (#7515) 2020-10-28 10:12:58 -07:00
range_del_aggregator.h Fix some typos in comments (#8066) 2021-03-25 21:18:08 -07:00
range_tombstone_fragmenter_test.cc Cleanup multiple implementations of VectorIterator (#8901) 2021-10-06 07:48:31 -07:00
range_tombstone_fragmenter.cc Added memtable garbage statistics (#8411) 2021-06-18 04:57:27 -07:00
range_tombstone_fragmenter.h Added memtable garbage statistics (#8411) 2021-06-18 04:57:27 -07:00
read_callback.h Fix and detect headers with missing dependencies (#8893) 2021-09-10 10:00:26 -07:00
repair_test.cc Track SST unique id in MANIFEST and verify (#9990) 2022-05-19 11:04:21 -07:00
repair.cc Track SST unique id in MANIFEST and verify (#9990) 2022-05-19 11:04:21 -07:00
snapshot_checker.h Use STATIC_AVOID_DESTRUCTION for static objects with non-trivial destructors (#9958) 2022-05-17 09:39:22 -07:00
snapshot_impl.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
snapshot_impl.h Fix and detect headers with missing dependencies (#8893) 2021-09-10 10:00:26 -07:00
table_cache.cc Improve the precision of row entry charge in row_cache (#9337) 2022-05-09 12:27:38 -07:00
table_cache.h Fast path for detecting unchanged prefix_extractor (#9407) 2022-01-21 11:37:46 -08:00
table_properties_collector_test.cc Improve / clean up meta block code & integrity (#9163) 2021-11-18 11:43:44 -08:00
table_properties_collector.cc Apply sample_for_compression to all block-based tables (#8105) 2021-03-25 15:00:45 -07:00
table_properties_collector.h Track each SST's timestamp information as user properties (#9093) 2021-11-19 11:37:06 -08:00
transaction_log_impl.cc Add checks to GetUpdatesSince (#9459) 2022-04-14 17:12:16 -07:00
transaction_log_impl.h Add checks to GetUpdatesSince (#9459) 2022-04-14 17:12:16 -07:00
trim_history_scheduler.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
trim_history_scheduler.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
version_builder_test.cc Track SST unique id in MANIFEST and verify (#9990) 2022-05-19 11:04:21 -07:00
version_builder.cc Use std::numeric_limits<> (#9954) 2022-05-05 13:08:21 -07:00
version_builder.h Fast path for detecting unchanged prefix_extractor (#9407) 2022-01-21 11:37:46 -08:00
version_edit_handler.cc Fix a race condition in WAL tracking causing DB open failure (#9715) 2022-03-23 19:41:31 -07:00
version_edit_handler.h Fixed manifest_dump issues when printing keys and values containing null characters (#8378) 2021-06-10 12:55:20 -07:00
version_edit_test.cc Track SST unique id in MANIFEST and verify (#9990) 2022-05-19 11:04:21 -07:00
version_edit.cc Track SST unique id in MANIFEST and verify (#9990) 2022-05-19 11:04:21 -07:00
version_edit.h Track SST unique id in MANIFEST and verify (#9990) 2022-05-19 11:04:21 -07:00
version_set_test.cc Track SST unique id in MANIFEST and verify (#9990) 2022-05-19 11:04:21 -07:00
version_set.cc Track SST unique id in MANIFEST and verify (#9990) 2022-05-19 11:04:21 -07:00
version_set.h Track SST unique id in MANIFEST and verify (#9990) 2022-05-19 11:04:21 -07:00
version_util.h Add manifest fix-up utility for file temperatures (#9683) 2022-03-18 16:35:51 -07:00
wal_edit_test.cc Always track WAL obsoletion (#7759) 2020-12-09 16:02:12 -08:00
wal_edit.cc Always track WAL obsoletion (#7759) 2020-12-09 16:02:12 -08:00
wal_edit.h Use std::numeric_limits<> (#9954) 2022-05-05 13:08:21 -07:00
wal_manager_test.cc Remove own ToString() (#9955) 2022-05-06 13:03:58 -07:00
wal_manager.cc Remove own ToString() (#9955) 2022-05-06 13:03:58 -07:00
wal_manager.h Add checks to GetUpdatesSince (#9459) 2022-04-14 17:12:16 -07:00
write_batch_base.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
write_batch_internal.h Support WBWI for keys having timestamps (#9603) 2022-02-22 14:23:01 -08:00
write_batch_test.cc Remove own ToString() (#9955) 2022-05-06 13:03:58 -07:00
write_batch.cc Use std::numeric_limits<> (#9954) 2022-05-05 13:08:21 -07:00
write_callback_test.cc Move slow valgrind tests behind -DROCKSDB_FULL_VALGRIND_RUN (#8475) 2021-07-07 11:14:05 -07:00
write_callback.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
write_controller_test.cc Revamp WriteController (#8064) 2021-03-18 09:47:31 -07:00
write_controller.cc Revamp WriteController (#8064) 2021-03-18 09:47:31 -07:00
write_controller.h Set Write rate limiter priority dynamically and pass it to FS (#9988) 2022-05-18 00:41:41 -07:00
write_thread.cc Rate-limit automatic WAL flush after each user write (#9607) 2022-03-08 13:19:39 -08:00
write_thread.h Rate-limit automatic WAL flush after each user write (#9607) 2022-03-08 13:19:39 -08:00