rocksdb/table/block_based
Peter Dillinger 14eca6bf04 For ApproximateSizes, pro-rate table metadata size over data blocks (#6784)
Summary:
The implementation of GetApproximateSizes was inconsistent in
its treatment of the size of non-data blocks of SST files, sometimes
including and sometimes now. This was at its worst with large portion
of table file used by filters and querying a small range that crossed
a table boundary: the size estimate would include large filter size.

It's conceivable that someone might want only to know the size in terms
of data blocks, but I believe that's unlikely enough to ignore for now.
Similarly, there's no evidence the internal function AppoximateOffsetOf
is used for anything other than a one-sided ApproximateSize, so I intend
to refactor to remove redundancy in a follow-up commit.

So to fix this, GetApproximateSizes (and implementation details
ApproximateSize and ApproximateOffsetOf) now consistently include in
their returned sizes a portion of table file metadata (incl filters
and indexes) based on the size portion of the data blocks in range. In
other words, if a key range covers data blocks that are X% by size of all
the table's data blocks, returned approximate size is X% of the total
file size. It would technically be more accurate to attribute metadata
based on number of keys, but that's not computationally efficient with
data available and rarely a meaningful difference.

Also includes miscellaneous comment improvements / clarifications.

Also included is a new approximatesizerandom benchmark for db_bench.
No significant performance difference seen with this change, whether ~700 ops/sec with cache_index_and_filter_blocks and small cache or ~150k ops/sec without cache_index_and_filter_blocks.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6784

Test Plan:
Test added to DBTest.ApproximateSizesFilesWithErrorMargin.
Old code running new test...

    [ RUN      ] DBTest.ApproximateSizesFilesWithErrorMargin
    db/db_test.cc:1562: Failure
    Expected: (size) <= (11 * 100), actual: 9478 vs 1100

Other tests updated to reflect consistent accounting of metadata.

Reviewed By: siying

Differential Revision: D21334706

Pulled By: pdillinger

fbshipit-source-id: 6f86870e45213334fedbe9c73b4ebb1d8d611185
2020-06-02 12:30:23 -07:00
..
binary_search_index_reader.cc Divide block_based_table_reader.cc (#6527) 2020-03-12 21:41:50 -07:00
binary_search_index_reader.h Divide block_based_table_reader.cc (#6527) 2020-03-12 21:41:50 -07:00
block_based_filter_block_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
block_based_filter_block.cc Reduce dependency on gtest dependency in release code (#6907) 2020-06-02 12:11:24 -07:00
block_based_filter_block.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
block_based_table_builder.cc Add tests for compression failure in BlockBasedTableBuilder (#6709) 2020-05-12 09:27:35 -07:00
block_based_table_builder.h Disallow BlockBasedTableBuilder to set status from non-OK (#6776) 2020-04-30 15:37:03 -07:00
block_based_table_factory.cc Add Struct Type to OptionsTypeInfo (#6425) 2020-05-21 10:58:39 -07:00
block_based_table_factory.h sst_dump to reduce number of file reads (#6836) 2020-05-12 18:23:33 -07:00
block_based_table_iterator.cc Properly report IO errors when IndexType::kBinarySearchWithFirstKey is used (#6621) 2020-04-15 17:40:44 -07:00
block_based_table_iterator.h Properly report IO errors when IndexType::kBinarySearchWithFirstKey is used (#6621) 2020-04-15 17:40:44 -07:00
block_based_table_reader_impl.h Divide block_based_table_reader.cc (#6527) 2020-03-12 21:41:50 -07:00
block_based_table_reader_test.cc Update googletest from 1.8.1 to 1.10.0 (#6808) 2020-06-01 20:33:42 -07:00
block_based_table_reader.cc For ApproximateSizes, pro-rate table metadata size over data blocks (#6784) 2020-06-02 12:30:23 -07:00
block_based_table_reader.h For ApproximateSizes, pro-rate table metadata size over data blocks (#6784) 2020-06-02 12:30:23 -07:00
block_builder.cc Add pipelined & parallel compression optimization (#6262) 2020-04-01 16:40:18 -07:00
block_builder.h Add pipelined & parallel compression optimization (#6262) 2020-04-01 16:40:18 -07:00
block_prefetcher.cc De-template block based table iterator (#6531) 2020-03-16 12:20:50 -07:00
block_prefetcher.h De-template block based table iterator (#6531) 2020-03-16 12:20:50 -07:00
block_prefix_index.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
block_prefix_index.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
block_test.cc Update googletest from 1.8.1 to 1.10.0 (#6808) 2020-06-01 20:33:42 -07:00
block_type.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
block.cc avoid IterKey::UpdateInternalKey() in BlockIter (#6843) 2020-05-28 10:51:30 -07:00
block.h avoid IterKey::UpdateInternalKey() in BlockIter (#6843) 2020-05-28 10:51:30 -07:00
cachable_entry.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
data_block_footer.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
data_block_footer.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
data_block_hash_index_test.cc Fix range deletion tombstone ingestion with global seqno (#6429) 2020-02-25 15:31:48 -08:00
data_block_hash_index.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
data_block_hash_index.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
filter_block_reader_common.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
filter_block_reader_common.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
filter_block.h Fix iterator reading filter block despite read_tier == kBlockCacheTier (#6562) 2020-03-26 15:21:26 -07:00
filter_policy_internal.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
filter_policy.cc Add Functions to OptionTypeInfo (#6422) 2020-04-28 18:04:26 -07:00
flush_block_policy.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
flush_block_policy.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
full_filter_block_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
full_filter_block.cc Reduce dependency on gtest dependency in release code (#6907) 2020-06-02 12:11:24 -07:00
full_filter_block.h Fix iterator reading filter block despite read_tier == kBlockCacheTier (#6562) 2020-03-26 15:21:26 -07:00
hash_index_reader.cc Divide block_based_table_reader.cc (#6527) 2020-03-12 21:41:50 -07:00
hash_index_reader.h Divide block_based_table_reader.cc (#6527) 2020-03-12 21:41:50 -07:00
index_builder.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
index_builder.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
index_reader_common.cc Divide block_based_table_reader.cc (#6527) 2020-03-12 21:41:50 -07:00
index_reader_common.h Divide block_based_table_reader.cc (#6527) 2020-03-12 21:41:50 -07:00
mock_block_based_table.h For ApproximateSizes, pro-rate table metadata size over data blocks (#6784) 2020-06-02 12:30:23 -07:00
parsed_full_filter_block.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
parsed_full_filter_block.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
partitioned_filter_block_test.cc For ApproximateSizes, pro-rate table metadata size over data blocks (#6784) 2020-06-02 12:30:23 -07:00
partitioned_filter_block.cc Reduce dependency on gtest dependency in release code (#6907) 2020-06-02 12:11:24 -07:00
partitioned_filter_block.h Basic MultiGet support for partitioned filters (#6757) 2020-04-28 14:49:34 -07:00
partitioned_index_iterator.cc Fix regression bug in partitioned index reseek caused by #6531 (#6551) 2020-03-17 12:33:10 -07:00
partitioned_index_iterator.h Fix compiler warning treated as error (#6547) 2020-03-17 09:59:28 -07:00
partitioned_index_reader.cc Reduce dependency on gtest dependency in release code (#6907) 2020-06-02 12:11:24 -07:00
partitioned_index_reader.h Divide block_based_table_reader.cc (#6527) 2020-03-12 21:41:50 -07:00
reader_common.cc Divide block_based_table_reader.cc (#6527) 2020-03-12 21:41:50 -07:00
reader_common.h Divide block_based_table_reader.cc (#6527) 2020-03-12 21:41:50 -07:00
uncompression_dict_reader.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
uncompression_dict_reader.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00