rocksdb/table
Aaron Gao 7e62c5d67a unbiase readamp bitmap
Summary:
Consider BlockReadAmpBitmap with bytes_per_bit = 32. Suppose bytes [a, b) were used, while bytes [a-32, a) and [b, b+32) weren't used; more formally, the union of ranges passed to BlockReadAmpBitmap::Mark() contains [a, b) and doesn't intersect [a-32, a) or [b, b+32). Then bits [floor(a/32), ceil(b/32)) will be set, so the number of useful bytes will be estimated as (ceil(b/32) - floor(a/32)) * 32, which is on average equal to b-a+31.

An extreme example: if we use 1 byte from each block, it'll be counted as 32 bytes from each block.
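
The 31-byte overestimate can be checked with a few lines of arithmetic. The sketch below is an illustrative Python model, not RocksDB code: `old_estimate` and `average_old_estimate` are hypothetical helper names, and the model assumes a single used range [a, b) whose neighbouring ranges are untouched, as in the paragraph above.

```python
BYTES_PER_BIT = 32  # assumed value of read_amp_bytes_per_bit


def old_estimate(a, b, bytes_per_bit=BYTES_PER_BIT):
    """Bytes reported for a used range [a, b) when each bit covers a whole byte range."""
    first_bit = a // bytes_per_bit
    end_bit = -(-b // bytes_per_bit)  # ceil(b / bytes_per_bit) with integer math
    return (end_bit - first_bit) * bytes_per_bit


def average_old_estimate(length, bytes_per_bit=BYTES_PER_BIT):
    """Average the estimate over every alignment of the range start within one bit."""
    total = sum(old_estimate(a, a + length, bytes_per_bit)
                for a in range(bytes_per_bit))
    return total / bytes_per_bit


if __name__ == "__main__":
    for length in (1, 10, 32, 100):
        print(length, average_old_estimate(length))
```

Averaged over all 32 alignments of the start offset, the reported size exceeds the true length by 31 bytes in this model, whatever the length.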

It's easy to remove this bias by slightly changing the semantics of the bitmap. Currently each bit represents a byte range [i*32, (i+1)*32).

This diff makes each bit represent a single byte: i*32 + X, where X is a random number in [0, 31] generated when the bitmap is created. So, e.g., if you read a single byte at random, with probability 31/32 it won't be counted at all, and with probability 1/32 it will be counted as 32 bytes; so, on average it's counted as 1 byte.

*But there is one exception: the last bit is always set the old way, i.e. it still represents a byte range.*
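
The sampled-byte semantics can be modelled the same way. This is a minimal Python sketch under stated assumptions, not the actual BlockReadAmpBitmap implementation: `new_estimate` and `average_new_estimate` are hypothetical names, the offset X is passed in explicitly instead of being drawn at random, and the last-bit exception is not modelled.

```python
BYTES_PER_BIT = 32  # assumed value of read_amp_bytes_per_bit


def new_estimate(a, b, x, bytes_per_bit=BYTES_PER_BIT):
    """Bytes reported for a used range [a, b) when bit i stands for byte i*bytes_per_bit + x."""
    # A bit is set only if the single byte it represents lies inside [a, b).
    bits_set = sum(1 for p in range(a, b) if p % bytes_per_bit == x)
    return bits_set * bytes_per_bit


def average_new_estimate(a, b, bytes_per_bit=BYTES_PER_BIT):
    """Average the estimate over every possible value of the offset x."""
    total = sum(new_estimate(a, b, x, bytes_per_bit)
                for x in range(bytes_per_bit))
    return total / bytes_per_bit


if __name__ == "__main__":
    # A single byte at offset 5 is counted only when x == 5.
    print([new_estimate(5, 6, x) for x in range(BYTES_PER_BIT)])
    print(average_new_estimate(5, 6), average_new_estimate(100, 250))
```

Every byte of [a, b) is the sampled byte for exactly one value of X, so averaging over X gives back the true length b-a in this model; an individual bitmap can still over- or under-count.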

(*) - assuming read_amp_bytes_per_bit = 32.
Closes https://github.com/facebook/rocksdb/pull/2259

Differential Revision: D5035652

Pulled By: lightmark

fbshipit-source-id: bd98b1b9b49fbe61f9e3781d07f624e3cbd92356
2017-05-10 14:06:54 -07:00
adaptive_table_factory.cc solve the problem of table_factory_to_write_=nullptr (#1342) 2016-09-20 10:11:51 -07:00
adaptive_table_factory.h Only cache level 0 indexes and filter when opening table reader 2016-07-20 11:23:31 -07:00
block_based_filter_block_test.cc Move various string utility functions into string_util 2017-04-06 14:54:12 -07:00
block_based_filter_block.cc Move some files under util/ to separate dirs 2017-04-05 19:09:16 -07:00
block_based_filter_block.h Readers for partition filter 2017-03-22 09:24:15 -07:00
block_based_table_builder.cc Add macros to include file name and line number during Logging 2017-03-15 19:39:12 -07:00
block_based_table_builder.h Compaction Support for Range Deletion 2016-10-18 12:04:56 -07:00
block_based_table_factory.cc Remove skip_table_builder_flush and default it to true 2017-03-02 16:54:10 -08:00
block_based_table_factory.h Only cache level 0 indexes and filter when opening table reader 2016-07-20 11:23:31 -07:00
block_based_table_reader.cc do not read next datablock if upperbound is reached 2017-05-10 14:06:33 -07:00
block_based_table_reader.h do not read next datablock if upperbound is reached 2017-05-10 14:06:33 -07:00
block_builder.cc Miscellaneous performance improvements 2016-07-12 14:15:32 -07:00
block_builder.h TableBuilder / TableReader support for range deletion 2016-08-19 15:10:31 -07:00
block_prefix_index.cc Fix clang analyzer errors 2016-07-08 17:50:51 -07:00
block_prefix_index.h Updated all copyright headers to the new format. 2016-02-09 15:12:00 -08:00
block_test.cc unbiase readamp bitmap 2017-05-10 14:06:54 -07:00
block.cc Move some files under util/ to separate dirs 2017-04-05 19:09:16 -07:00
block.h unbiase readamp bitmap 2017-05-10 14:06:54 -07:00
bloom_block.cc Updated all copyright headers to the new format. 2016-02-09 15:12:00 -08:00
bloom_block.h Updated all copyright headers to the new format. 2016-02-09 15:12:00 -08:00
cleanable_test.cc Pinnableslice (2nd attempt) 2017-03-13 11:54:10 -07:00
cuckoo_table_builder_test.cc update IterKey that can get user key and internal key explicitly 2017-04-04 14:24:20 -07:00
cuckoo_table_builder.cc Embed column family name in SST file 2016-04-06 23:10:32 -07:00
cuckoo_table_builder.h Embed column family name in SST file 2016-04-06 23:10:32 -07:00
cuckoo_table_factory.cc Only cache level 0 indexes and filter when opening table reader 2016-07-20 11:23:31 -07:00
cuckoo_table_factory.h Only cache level 0 indexes and filter when opening table reader 2016-07-20 11:23:31 -07:00
cuckoo_table_reader_test.cc update IterKey that can get user key and internal key explicitly 2017-04-04 14:24:20 -07:00
cuckoo_table_reader.cc do not read next datablock if upperbound is reached 2017-05-10 14:06:33 -07:00
cuckoo_table_reader.h do not read next datablock if upperbound is reached 2017-05-10 14:06:33 -07:00
filter_block.h Readers for partition filter 2017-03-22 09:24:15 -07:00
flush_block_policy.cc Configure index partition size 2017-03-28 12:09:12 -07:00
format.cc Move some files under util/ to separate dirs 2017-04-05 19:09:16 -07:00
format.h Move some files under util/ to separate dirs 2017-04-05 19:09:16 -07:00
full_filter_block_test.cc Move various string utility functions into string_util 2017-04-06 14:54:12 -07:00
full_filter_block.cc Move some files under util/ to separate dirs 2017-04-05 19:09:16 -07:00
full_filter_block.h Readers for partition filter 2017-03-22 09:24:15 -07:00
get_context.cc Avoid pinning when row cache is accessed 2017-05-01 21:46:40 -07:00
get_context.h Pinnableslice (2nd attempt) 2017-03-13 11:54:10 -07:00
index_builder.cc Configure index partition size 2017-03-28 12:09:12 -07:00
index_builder.h Configure index partition size 2017-03-28 12:09:12 -07:00
internal_iterator.h fix assertion failure in Prev() 2016-10-13 17:36:48 -07:00
iter_heap.h Updated all copyright headers to the new format. 2016-02-09 15:12:00 -08:00
iterator_wrapper.h Add SeekForPrev() to Iterator 2016-09-27 18:20:57 -07:00
iterator.cc Pinnableslice (2nd attempt) 2017-03-13 11:54:10 -07:00
merger_test.cc Rename merger.h -> merging_iterator.h 2017-02-02 16:54:19 -08:00
merging_iterator.cc Move some files under util/ to separate dirs 2017-04-05 19:09:16 -07:00
merging_iterator.h Rename merger.h -> merging_iterator.h 2017-02-02 16:54:19 -08:00
meta_blocks.cc Add macros to include file name and line number during Logging 2017-03-15 19:39:12 -07:00
meta_blocks.h New Statistics to track Compression/Decompression (#1197) 2016-07-19 09:44:03 -07:00
mock_table.cc do not read next datablock if upperbound is reached 2017-05-10 14:06:33 -07:00
mock_table.h do not read next datablock if upperbound is reached 2017-05-10 14:06:33 -07:00
partitioned_filter_block_test.cc Configure index partition size 2017-03-28 12:09:12 -07:00
partitioned_filter_block.cc Readers for partition filter 2017-03-22 09:24:15 -07:00
partitioned_filter_block.h Readers for partition filter 2017-03-22 09:24:15 -07:00
persistent_cache_helper.cc Refactoring 2017-03-03 18:24:12 -08:00
persistent_cache_helper.h Move some files under util/ to separate dirs 2017-04-05 19:09:16 -07:00
persistent_cache_options.h Move some files under util/ to separate dirs 2017-04-05 19:09:16 -07:00
plain_table_builder.cc Allow plain table to store index on file with bloom filter disabled 2016-11-17 11:09:13 -08:00
plain_table_builder.h Embed column family name in SST file 2016-04-06 23:10:32 -07:00
plain_table_factory.cc store prefix_extractor_name in table 2016-08-26 11:46:32 -07:00
plain_table_factory.h Only cache level 0 indexes and filter when opening table reader 2016-07-20 11:23:31 -07:00
plain_table_index.cc Add macros to include file name and line number during Logging 2017-03-15 19:39:12 -07:00
plain_table_index.h Move some files under util/ to separate dirs 2017-04-05 19:09:16 -07:00
plain_table_key_coding.cc update IterKey that can get user key and internal key explicitly 2017-04-04 14:24:20 -07:00
plain_table_key_coding.h Updated all copyright headers to the new format. 2016-02-09 15:12:00 -08:00
plain_table_reader.cc do not read next datablock if upperbound is reached 2017-05-10 14:06:33 -07:00
plain_table_reader.h do not read next datablock if upperbound is reached 2017-05-10 14:06:33 -07:00
scoped_arena_iterator.h Compaction Support for Range Deletion 2016-10-18 12:04:56 -07:00
sst_file_writer_collectors.h Support SST files with Global sequence numbers [reland] 2016-10-18 16:59:37 -07:00
sst_file_writer.cc Remove bulk loading and auto_roll_logger in rocksdb_lite 2017-02-28 11:09:11 -08:00
table_builder.h Move some files under util/ to separate dirs 2017-04-05 19:09:16 -07:00
table_properties_internal.h TableBuilder / TableReader support for range deletion 2016-08-19 15:10:31 -07:00
table_properties.cc Insert range deletion meta-block into block cache 2016-11-05 09:24:26 -07:00
table_reader_bench.cc Move some files under util/ to separate dirs 2017-04-05 19:09:16 -07:00
table_reader.h do not read next datablock if upperbound is reached 2017-05-10 14:06:33 -07:00
table_test.cc Move some files under util/ to separate dirs 2017-04-05 19:09:16 -07:00
two_level_iterator.cc do not read next datablock if upperbound is reached 2017-05-10 14:06:33 -07:00
two_level_iterator.h do not read next datablock if upperbound is reached 2017-05-10 14:06:33 -07:00