rocksdb/table
Shobhit Dayal b45b1cde3e Feature for sampling and reporting compressibility (#4842)
Summary:
This is a feature to sample data-block compressibility and report it as stats. One in every N blocks (N is tunable) is sampled for compressibility using two algorithms:
1. lz4 or snappy for fast compression
2. zstd or zlib for slow but higher compression.

The stats are reported to the caller as raw-bytes and compressed-bytes. The block continues to be compressed for storage using the specified CompressionType.
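
The sampling scheme described above can be sketched roughly as follows. This is a minimal illustration, not the code from the pull request: `CompressibilitySampler`, `SampleStats`, and `ToyRunLengthCompress` are hypothetical names, the toy run-length encoder stands in for a real codec (lz4, snappy, zstd, zlib) so the sketch has no dependencies, and a deterministic counter is assumed for picking every N-th block, which may differ from how the actual change selects blocks.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <utility>

// Hypothetical stand-in for a real codec: a trivial run-length encoder.
// Each run of up to 255 equal bytes becomes a (length, byte) pair.
inline std::string ToyRunLengthCompress(const std::string& raw) {
  std::string out;
  for (std::size_t i = 0; i < raw.size();) {
    std::size_t run = 1;
    while (i + run < raw.size() && raw[i + run] == raw[i] && run < 255) ++run;
    out.push_back(static_cast<char>(run));
    out.push_back(raw[i]);
    i += run;
  }
  return out;
}

// Counters in the shape the summary describes: raw bytes and compressed bytes,
// kept separately for the fast and the slow algorithm.
struct SampleStats {
  std::uint64_t blocks_sampled = 0;
  std::uint64_t raw_bytes = 0;
  std::uint64_t compressed_bytes_fast = 0;
  std::uint64_t compressed_bytes_slow = 0;
};

class CompressibilitySampler {
 public:
  using Codec = std::function<std::string(const std::string&)>;

  // sample_every_n == 0 disables sampling (the default in the summary).
  CompressibilitySampler(std::uint64_t sample_every_n, Codec fast, Codec slow)
      : n_(sample_every_n), fast_(std::move(fast)), slow_(std::move(slow)) {
    assert(fast_ && slow_);
  }

  // Call once per data block. The block itself is untouched, so the caller
  // can still compress it for storage with whatever CompressionType it uses.
  void OnBlock(const std::string& raw_block) {
    if (n_ == 0) return;            // sampling disabled
    if (++seen_ % n_ != 0) return;  // assumption: sample every N-th block
    ++stats_.blocks_sampled;
    stats_.raw_bytes += raw_block.size();
    stats_.compressed_bytes_fast += fast_(raw_block).size();
    stats_.compressed_bytes_slow += slow_(raw_block).size();
  }

  const SampleStats& stats() const { return stats_; }

 private:
  std::uint64_t n_;
  std::uint64_t seen_ = 0;
  Codec fast_;
  Codec slow_;
  SampleStats stats_;
};
```

Dividing `compressed_bytes_fast` or `compressed_bytes_slow` by `raw_bytes` gives an estimated compression ratio for each algorithm class, while the cost of the extra compression passes is paid only on the sampled blocks.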

The db_bench_tool now has a command line option for specifying the sampling rate. Its default value is 0 (no sampling). To test the overhead for a certain value, users can compare the performance of db_bench_tool while varying the sampling rate. Sampling is unlikely to have a noticeable impact at high values such as 20.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4842

Differential Revision: D13629011

Pulled By: shobhitdayal

fbshipit-source-id: 14ca668bcab6499b2a1734edf848eb62a4f4fafa
2019-03-18 12:15:34 -07:00
..
adaptive_table_factory.cc Update all unique/shared_ptr instances to be qualified with namespace std (#4638) 2018-11-09 11:19:58 -08:00
adaptive_table_factory.h Update all unique/shared_ptr instances to be qualified with namespace std (#4638) 2018-11-09 11:19:58 -08:00
block_based_filter_block_test.cc Apply modernize-use-override (2nd iteration) 2019-02-14 14:41:36 -08:00
block_based_filter_block.cc PrefixMayMatch: remove unnecessary check for prefix_extractor_ (#4067) 2018-06-27 20:42:43 -07:00
block_based_filter_block.h Move prefix_extractor to MutableCFOptions 2018-05-21 14:43:11 -07:00
block_based_table_builder.cc Feature for sampling and reporting compressibility (#4842) 2019-03-18 12:15:34 -07:00
block_based_table_builder.h Feature for sampling and reporting compressibility (#4842) 2019-03-18 12:15:34 -07:00
block_based_table_factory.cc Feature for sampling and reporting compressibility (#4842) 2019-03-18 12:15:34 -07:00
block_based_table_factory.h Revert "Move MemoryAllocator option from Cache to BlockBasedTableOpti… (#4697) 2018-11-21 11:29:57 -08:00
block_based_table_reader.cc Apply modernize-use-override (2nd iteration) 2019-02-14 14:41:36 -08:00
block_based_table_reader.h Cache dictionary used for decompressing data blocks (#4881) 2019-01-23 18:15:47 -08:00
block_builder.cc DataBlockHashIndex: Remove the division from EstimateSize() (#4293) 2018-08-20 23:13:50 -07:00
block_builder.h Add path to WritableFileWriter. (#4039) 2018-08-23 10:12:58 -07:00
block_fetcher.cc Cache dictionary used for decompressing data blocks (#4881) 2019-01-23 18:15:47 -08:00
block_fetcher.h Cache dictionary used for decompressing data blocks (#4881) 2019-01-23 18:15:47 -08:00
block_prefix_index.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
block_prefix_index.h Change RocksDB License 2017-07-15 16:11:23 -07:00
block_test.cc Remove two variables from BlockContents class and don't use class Block for compressed block (#4650) 2018-11-13 17:02:55 -08:00
block.cc Checksum properties block for block-based table (#4956) 2019-02-11 11:50:01 -08:00
block.h Checksum properties block for block-based table (#4956) 2019-02-11 11:50:01 -08:00
bloom_block.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
bloom_block.h Disallow customized hash function in DynamicBloom (#4915) 2019-01-24 10:34:30 -08:00
cleanable_test.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
cuckoo_table_builder_test.cc Update all unique/shared_ptr instances to be qualified with namespace std (#4638) 2018-11-09 11:19:58 -08:00
cuckoo_table_builder.cc Promote rocksdb.{deleted.keys,merge.operands} to main table properties (#4594) 2018-10-30 15:34:27 -07:00
cuckoo_table_builder.h Change RocksDB License 2017-07-15 16:11:23 -07:00
cuckoo_table_factory.cc Update all unique/shared_ptr instances to be qualified with namespace std (#4638) 2018-11-09 11:19:58 -08:00
cuckoo_table_factory.h Update all unique/shared_ptr instances to be qualified with namespace std (#4638) 2018-11-09 11:19:58 -08:00
cuckoo_table_reader_test.cc Update all unique/shared_ptr instances to be qualified with namespace std (#4638) 2018-11-09 11:19:58 -08:00
cuckoo_table_reader.cc Apply modernize-use-override (2nd iteration) 2019-02-14 14:41:36 -08:00
cuckoo_table_reader.h Index value delta encoding (#3983) 2018-08-09 16:58:40 -07:00
data_block_footer.cc Add db_bench options of data block hash index (#4281) 2018-08-16 18:42:46 -07:00
data_block_footer.h Add db_bench options of data block hash index (#4281) 2018-08-16 18:42:46 -07:00
data_block_hash_index_test.cc Feature for sampling and reporting compressibility (#4842) 2019-03-18 12:15:34 -07:00
data_block_hash_index.cc DataBlockHashIndex: Remove the division from EstimateSize() (#4293) 2018-08-20 23:13:50 -07:00
data_block_hash_index.h DataBlockHashIndex: Remove the division from EstimateSize() (#4293) 2018-08-20 23:13:50 -07:00
filter_block.h use user_key and iterate_upper_bound to determine compatibility of bloom filters (#3899) 2018-06-26 15:57:26 -07:00
flush_block_policy.cc Apply modernize-use-override (2nd iteration) 2019-02-14 14:41:36 -08:00
format.cc Make statistics's stats_level change thread-safe (#5030) 2019-03-01 10:42:09 -08:00
format.h Digest ZSTD compression dictionary once when writing SST file (#4849) 2019-01-18 19:12:57 -08:00
full_filter_bits_builder.h Skip duplicate bloom keys when whole_key and prefix are mixed 2018-04-24 10:58:16 -07:00
full_filter_block_test.cc Apply modernize-use-override (2nd iteration) 2019-02-14 14:41:36 -08:00
full_filter_block.cc Charging block cache more accurately (#4073) 2018-06-29 08:57:20 -07:00
full_filter_block.h Charging block cache more accurately (#4073) 2018-06-29 08:57:20 -07:00
get_context.cc PlainTable should avoid copying Get() results from immortal source. (#4924) 2019-01-25 17:12:19 -08:00
get_context.h Cache dictionary used for decompressing data blocks (#4881) 2019-01-23 18:15:47 -08:00
index_builder.cc Add path to WritableFileWriter. (#4039) 2018-08-23 10:12:58 -07:00
index_builder.h Add path to WritableFileWriter. (#4039) 2018-08-23 10:12:58 -07:00
internal_iterator.h Index value delta encoding (#3983) 2018-08-09 16:58:40 -07:00
iter_heap.h Make InternalKeyComparator final and directly use it in merging iterator 2017-09-11 12:04:21 -07:00
iterator_wrapper.h Index value delta encoding (#3983) 2018-08-09 16:58:40 -07:00
iterator.cc Apply modernize-use-override (2nd iteration) 2019-02-14 14:41:36 -08:00
merger_test.cc Apply modernize-use-override (2nd iteration) 2019-02-14 14:41:36 -08:00
merging_iterator.cc Apply modernize-use-override (2nd iteration) 2019-02-14 14:41:36 -08:00
merging_iterator.h Index value delta encoding (#3983) 2018-08-09 16:58:40 -07:00
meta_blocks.cc Feature for sampling and reporting compressibility (#4842) 2019-03-18 12:15:34 -07:00
meta_blocks.h Feature for sampling and reporting compressibility (#4842) 2019-03-18 12:15:34 -07:00
mock_table.cc Update all unique/shared_ptr instances to be qualified with namespace std (#4638) 2018-11-09 11:19:58 -08:00
mock_table.h Update all unique/shared_ptr instances to be qualified with namespace std (#4638) 2018-11-09 11:19:58 -08:00
partitioned_filter_block_test.cc Apply modernize-use-override (2nd iteration) 2019-02-14 14:41:36 -08:00
partitioned_filter_block.cc Fix bug in partition filters with format_version=4 (#4381) 2018-09-17 17:28:15 -07:00
partitioned_filter_block.h Fix bug in partition filters with format_version=4 (#4381) 2018-09-17 17:28:15 -07:00
persistent_cache_helper.cc Remove two variables from BlockContents class and don't use class Block for compressed block (#4650) 2018-11-13 17:02:55 -08:00
persistent_cache_helper.h Change RocksDB License 2017-07-15 16:11:23 -07:00
persistent_cache_options.h Change RocksDB License 2017-07-15 16:11:23 -07:00
plain_table_builder.cc Revert "Remove PlainTable's feature store_index_in_file (#4914)" (#5034) 2019-03-01 15:45:45 -08:00
plain_table_builder.h Revert "Remove PlainTable's feature store_index_in_file (#4914)" (#5034) 2019-03-01 15:45:45 -08:00
plain_table_factory.cc Revert "Remove PlainTable's feature store_index_in_file (#4914)" (#5034) 2019-03-01 15:45:45 -08:00
plain_table_factory.h Revert "Remove PlainTable's feature store_index_in_file (#4914)" (#5034) 2019-03-01 15:45:45 -08:00
plain_table_index.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
plain_table_index.h Move prefix_extractor to MutableCFOptions 2018-05-21 14:43:11 -07:00
plain_table_key_coding.cc Comment out unused variables 2018-03-05 13:13:41 -08:00
plain_table_key_coding.h Update all unique/shared_ptr instances to be qualified with namespace std (#4638) 2018-11-09 11:19:58 -08:00
plain_table_reader.cc Revert "Remove PlainTable's feature store_index_in_file (#4914)" (#5034) 2019-03-01 15:45:45 -08:00
plain_table_reader.h PlainTable should avoid copying Get() results from immortal source. (#4924) 2019-01-25 17:12:19 -08:00
scoped_arena_iterator.h Change RocksDB License 2017-07-15 16:11:23 -07:00
sst_file_reader_test.cc Get CompactionJobInfo from CompactFiles 2018-12-13 14:21:24 -08:00
sst_file_reader.cc Get CompactionJobInfo from CompactFiles 2018-12-13 14:21:24 -08:00
sst_file_writer_collectors.h Feature for sampling and reporting compressibility (#4842) 2019-03-18 12:15:34 -07:00
sst_file_writer.cc Feature for sampling and reporting compressibility (#4842) 2019-03-18 12:15:34 -07:00
table_builder.h Feature for sampling and reporting compressibility (#4842) 2019-03-18 12:15:34 -07:00
table_properties_internal.h Index value delta encoding (#3983) 2018-08-09 16:58:40 -07:00
table_properties.cc Promote rocksdb.{deleted.keys,merge.operands} to main table properties (#4594) 2018-10-30 15:34:27 -07:00
table_reader_bench.cc Feature for sampling and reporting compressibility (#4842) 2019-03-18 12:15:34 -07:00
table_reader.h Clean up FragmentedRangeTombstoneList (#4692) 2018-11-28 15:29:02 -08:00
table_test.cc Feature for sampling and reporting compressibility (#4842) 2019-03-18 12:15:34 -07:00
two_level_iterator.cc Apply modernize-use-override (2nd iteration) 2019-02-14 14:41:36 -08:00
two_level_iterator.h Index value delta encoding (#3983) 2018-08-09 16:58:40 -07:00