rocksdb/table
anand76 fefd4b98c5 Introduce a new MultiGet batching implementation (#5011)
Summary:
This PR introduces a new MultiGet() API, with the underlying implementation grouping keys based on SST file and batching lookups in a file. The reason for the new API is twofold - the definition allows callers to allocate storage for status and values on stack instead of std::vector, as well as return values as PinnableSlices in order to avoid copying, and it keeps the original MultiGet() implementation intact while we experiment with batching.

Batching is useful when there is some spatial locality to the keys being queries, as well as larger batch sizes. The main benefits are due to -
1. Fewer function calls, especially to BlockBasedTableReader::MultiGet() and FullFilterBlockReader::KeysMayMatch()
2. Bloom filter cachelines can be prefetched, hiding the cache miss latency

The next step is to optimize the binary searches in the level_storage_info, index blocks and data blocks, since we could reduce the number of key comparisons if the keys are relatively close to each other. The batching optimizations also need to be extended to other formats, such as PlainTable and filter formats. This also needs to be added to db_stress.

Benchmark results from db_bench for various batch size/locality of reference combinations are given below. Locality was simulated by offsetting the keys in a batch by a stride length. Each SST file is about 8.6MB uncompressed and key/value size is 16/100 uncompressed. To focus on the cpu benefit of batching, the runs were single threaded and bound to the same cpu to eliminate interference from other system events. The results show a 10-25% improvement in micros/op from smaller to larger batch sizes (4 - 32).

Batch   Sizes

1        | 2        | 4         | 8      | 16  | 32

Random pattern (Stride length 0)
4.158 | 4.109 | 4.026 | 4.05 | 4.1 | 4.074        - Get
4.438 | 4.302 | 4.165 | 4.122 | 4.096 | 4.075 - MultiGet (no batching)
4.461 | 4.256 | 4.277 | 4.11 | 4.182 | 4.14        - MultiGet (w/ batching)

Good locality (Stride length 16)
4.048 | 3.659 | 3.248 | 2.99 | 2.84 | 2.753
4.429 | 3.728 | 3.406 | 3.053 | 2.911 | 2.781
4.452 | 3.45 | 2.833 | 2.451 | 2.233 | 2.135

Good locality (Stride length 256)
4.066 | 3.786 | 3.581 | 3.447 | 3.415 | 3.232
4.406 | 4.005 | 3.644 | 3.49 | 3.381 | 3.268
4.393 | 3.649 | 3.186 | 2.882 | 2.676 | 2.62

Medium locality (Stride length 4096)
4.012 | 3.922 | 3.768 | 3.61 | 3.582 | 3.555
4.364 | 4.057 | 3.791 | 3.65 | 3.57 | 3.465
4.479 | 3.758 | 3.316 | 3.077 | 2.959 | 2.891

dbbench command used (on a DB with 4 levels, 12 million keys)-
TEST_TMPDIR=/dev/shm numactl -C 10  ./db_bench.tmp -use_existing_db=true -benchmarks="readseq,multireadrandom" -write_buffer_size=4194304 -target_file_size_base=4194304 -max_bytes_for_level_base=16777216 -num=12000000 -reads=12000000 -duration=90 -threads=1 -compression_type=none -cache_size=4194304000 -batch_size=32 -disable_auto_compactions=true -bloom_bits=10 -cache_index_and_filter_blocks=true -pin_l0_filter_and_index_blocks_in_cache=true -multiread_batched=true -multiread_stride=4
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5011

Differential Revision: D14348703

Pulled By: anand1976

fbshipit-source-id: 774406dab3776d979c809522a67bedac6c17f84b
2019-04-11 14:28:26 -07:00
..
adaptive_table_factory.cc Update all unique/shared_ptr instances to be qualified with namespace std (#4638) 2018-11-09 11:19:58 -08:00
adaptive_table_factory.h Remove some "using std::..." from header files. (#5113) 2019-03-27 10:28:21 -07:00
block_based_filter_block_test.cc Apply automatic formatting to some files (#5114) 2019-03-27 16:24:45 -07:00
block_based_filter_block.cc Apply automatic formatting to some files (#5114) 2019-03-27 16:24:45 -07:00
block_based_filter_block.h Apply automatic formatting to some files (#5114) 2019-03-27 16:24:45 -07:00
block_based_table_builder.cc Periodic Compactions (#5166) 2019-04-10 19:31:18 -07:00
block_based_table_builder.h Periodic Compactions (#5166) 2019-04-10 19:31:18 -07:00
block_based_table_factory.cc Periodic Compactions (#5166) 2019-04-10 19:31:18 -07:00
block_based_table_factory.h Remove some "using std::..." from header files. (#5113) 2019-03-27 10:28:21 -07:00
block_based_table_reader.cc Introduce a new MultiGet batching implementation (#5011) 2019-04-11 14:28:26 -07:00
block_based_table_reader.h Introduce a new MultiGet batching implementation (#5011) 2019-04-11 14:28:26 -07:00
block_builder.cc Apply automatic formatting to some files (#5114) 2019-03-27 16:24:45 -07:00
block_builder.h Apply automatic formatting to some files (#5114) 2019-03-27 16:24:45 -07:00
block_fetcher.cc Apply automatic formatting to some files (#5114) 2019-03-27 16:24:45 -07:00
block_fetcher.h Cache dictionary used for decompressing data blocks (#4881) 2019-01-23 18:15:47 -08:00
block_prefix_index.cc Apply automatic formatting to some files (#5114) 2019-03-27 16:24:45 -07:00
block_prefix_index.h Apply automatic formatting to some files (#5114) 2019-03-27 16:24:45 -07:00
block_test.cc Apply automatic formatting to some files (#5114) 2019-03-27 16:24:45 -07:00
block.cc Apply automatic formatting to some files (#5114) 2019-03-27 16:24:45 -07:00
block.h Apply automatic formatting to some files (#5114) 2019-03-27 16:24:45 -07:00
bloom_block.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
bloom_block.h Disallow customized hash function in DynamicBloom (#4915) 2019-01-24 10:34:30 -08:00
cleanable_test.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
cuckoo_table_builder_test.cc Update all unique/shared_ptr instances to be qualified with namespace std (#4638) 2018-11-09 11:19:58 -08:00
cuckoo_table_builder.cc Promote rocksdb.{deleted.keys,merge.operands} to main table properties (#4594) 2018-10-30 15:34:27 -07:00
cuckoo_table_builder.h Change RocksDB License 2017-07-15 16:11:23 -07:00
cuckoo_table_factory.cc Update all unique/shared_ptr instances to be qualified with namespace std (#4638) 2018-11-09 11:19:58 -08:00
cuckoo_table_factory.h Update all unique/shared_ptr instances to be qualified with namespace std (#4638) 2018-11-09 11:19:58 -08:00
cuckoo_table_reader_test.cc Update all unique/shared_ptr instances to be qualified with namespace std (#4638) 2018-11-09 11:19:58 -08:00
cuckoo_table_reader.cc Apply modernize-use-override (2nd iteration) 2019-02-14 14:41:36 -08:00
cuckoo_table_reader.h Index value delta encoding (#3983) 2018-08-09 16:58:40 -07:00
data_block_footer.cc Add db_bench options of data block hash index (#4281) 2018-08-16 18:42:46 -07:00
data_block_footer.h Add db_bench options of data block hash index (#4281) 2018-08-16 18:42:46 -07:00
data_block_hash_index_test.cc Feature for sampling and reporting compressibility (#4842) 2019-03-18 12:15:34 -07:00
data_block_hash_index.cc DataBlockHashIndex: Remove the division from EstimateSize() (#4293) 2018-08-20 23:13:50 -07:00
data_block_hash_index.h DataBlockHashIndex: Remove the division from EstimateSize() (#4293) 2018-08-20 23:13:50 -07:00
filter_block.h Introduce a new MultiGet batching implementation (#5011) 2019-04-11 14:28:26 -07:00
flush_block_policy.cc Apply modernize-use-override (2nd iteration) 2019-02-14 14:41:36 -08:00
format.cc Apply automatic formatting to some files (#5114) 2019-03-27 16:24:45 -07:00
format.h Apply automatic formatting to some files (#5114) 2019-03-27 16:24:45 -07:00
full_filter_bits_builder.h Skip duplicate bloom keys when whole_key and prefix are mixed 2018-04-24 10:58:16 -07:00
full_filter_block_test.cc Introduce a new MultiGet batching implementation (#5011) 2019-04-11 14:28:26 -07:00
full_filter_block.cc Introduce a new MultiGet batching implementation (#5011) 2019-04-11 14:28:26 -07:00
full_filter_block.h Introduce a new MultiGet batching implementation (#5011) 2019-04-11 14:28:26 -07:00
get_context.cc PlainTable should avoid copying Get() results from immortal source. (#4924) 2019-01-25 17:12:19 -08:00
get_context.h Introduce a new MultiGet batching implementation (#5011) 2019-04-11 14:28:26 -07:00
index_builder.cc Add path to WritableFileWriter. (#4039) 2018-08-23 10:12:58 -07:00
index_builder.h Add path to WritableFileWriter. (#4039) 2018-08-23 10:12:58 -07:00
internal_iterator.h Index value delta encoding (#3983) 2018-08-09 16:58:40 -07:00
iter_heap.h Make InternalKeyComparator final and directly use it in merging iterator 2017-09-11 12:04:21 -07:00
iterator_wrapper.h Index value delta encoding (#3983) 2018-08-09 16:58:40 -07:00
iterator.cc Fix arena allocation size in NewEmptyInternalIterator (#4905) 2019-03-29 15:09:35 -07:00
merger_test.cc Apply modernize-use-override (2nd iteration) 2019-02-14 14:41:36 -08:00
merging_iterator.cc Apply modernize-use-override (2nd iteration) 2019-02-14 14:41:36 -08:00
merging_iterator.h Index value delta encoding (#3983) 2018-08-09 16:58:40 -07:00
meta_blocks.cc Periodic Compactions (#5166) 2019-04-10 19:31:18 -07:00
meta_blocks.h Feature for sampling and reporting compressibility (#4842) 2019-03-18 12:15:34 -07:00
mock_table.cc Update all unique/shared_ptr instances to be qualified with namespace std (#4638) 2018-11-09 11:19:58 -08:00
mock_table.h Update all unique/shared_ptr instances to be qualified with namespace std (#4638) 2018-11-09 11:19:58 -08:00
multiget_context.h Introduce a new MultiGet batching implementation (#5011) 2019-04-11 14:28:26 -07:00
partitioned_filter_block_test.cc Apply modernize-use-override (2nd iteration) 2019-02-14 14:41:36 -08:00
partitioned_filter_block.cc Fix bug in partition filters with format_version=4 (#4381) 2018-09-17 17:28:15 -07:00
partitioned_filter_block.h Fix bug in partition filters with format_version=4 (#4381) 2018-09-17 17:28:15 -07:00
persistent_cache_helper.cc Remove two variables from BlockContents class and don't use class Block for compressed block (#4650) 2018-11-13 17:02:55 -08:00
persistent_cache_helper.h Change RocksDB License 2017-07-15 16:11:23 -07:00
persistent_cache_options.h Change RocksDB License 2017-07-15 16:11:23 -07:00
plain_table_builder.cc Revert "Remove PlainTable's feature store_index_in_file (#4914)" (#5034) 2019-03-01 15:45:45 -08:00
plain_table_builder.h Revert "Remove PlainTable's feature store_index_in_file (#4914)" (#5034) 2019-03-01 15:45:45 -08:00
plain_table_factory.cc Revert "Remove PlainTable's feature store_index_in_file (#4914)" (#5034) 2019-03-01 15:45:45 -08:00
plain_table_factory.h Remove some "using std::..." from header files. (#5113) 2019-03-27 10:28:21 -07:00
plain_table_index.cc Fix many bugs in log statement arguments (#5089) 2019-04-04 12:12:11 -07:00
plain_table_index.h Move prefix_extractor to MutableCFOptions 2018-05-21 14:43:11 -07:00
plain_table_key_coding.cc Comment out unused variables 2018-03-05 13:13:41 -08:00
plain_table_key_coding.h Update all unique/shared_ptr instances to be qualified with namespace std (#4638) 2018-11-09 11:19:58 -08:00
plain_table_reader.cc Remove some "using std::..." from header files. (#5113) 2019-03-27 10:28:21 -07:00
plain_table_reader.h Remove some "using std::..." from header files. (#5113) 2019-03-27 10:28:21 -07:00
scoped_arena_iterator.h Change RocksDB License 2017-07-15 16:11:23 -07:00
sst_file_reader_test.cc Fix SstFileReader not able to open ingested file (#5097) 2019-03-26 10:25:18 -07:00
sst_file_reader.cc Fix SstFileReader not able to open ingested file (#5097) 2019-03-26 10:25:18 -07:00
sst_file_writer_collectors.h Fix SstFileReader not able to open ingested file (#5097) 2019-03-26 10:25:18 -07:00
sst_file_writer.cc Feature for sampling and reporting compressibility (#4842) 2019-03-18 12:15:34 -07:00
table_builder.h Periodic Compactions (#5166) 2019-04-10 19:31:18 -07:00
table_properties_internal.h Index value delta encoding (#3983) 2018-08-09 16:58:40 -07:00
table_properties.cc Periodic Compactions (#5166) 2019-04-10 19:31:18 -07:00
table_reader_bench.cc Feature for sampling and reporting compressibility (#4842) 2019-03-18 12:15:34 -07:00
table_reader.h Introduce a new MultiGet batching implementation (#5011) 2019-04-11 14:28:26 -07:00
table_test.cc Evict the uncompression dictionary from the block cache upon table close (#5150) 2019-04-04 16:21:12 -07:00
two_level_iterator.cc Apply modernize-use-override (2nd iteration) 2019-02-14 14:41:36 -08:00
two_level_iterator.h Index value delta encoding (#3983) 2018-08-09 16:58:40 -07:00