rocksdb/db
Shobhit Dayal b45b1cde3e Feature for sampling and reporting compressibility (#4842)
Summary:
This is a feature to sample data-block compressibility and report it as stats. 1 in N (tunable) blocks is sampled for compressibility using two algorithms:
1. lz4 or snappy for fast compression
2. zstd or zlib for slow but higher compression.

The stats are reported to the caller as raw-bytes and compressed-bytes. The block continues to be compressed for storage using the specified CompressionType.
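
The sampling scheme described above can be sketched as follows. This is a minimal illustration, not the code from this change: the names `CompressibilitySampler`, `SampledStats`, `sample_every_n`, and `ToyRunLengthEncode` are hypothetical, a toy run-length encoder stands in for the real fast (lz4/snappy) and slow (zstd/zlib) codecs, and a simple counter is assumed for picking 1 in N blocks.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <utility>

// Hypothetical stats record: raw bytes of the sampled blocks and their
// compressed sizes under the fast and the slow algorithm.
struct SampledStats {
  uint64_t raw_bytes = 0;
  uint64_t fast_compressed_bytes = 0;
  uint64_t slow_compressed_bytes = 0;
};

using Compressor = std::function<std::string(const std::string&)>;

// Toy stand-in for a real codec: encodes each run as (length, byte).
std::string ToyRunLengthEncode(const std::string& in) {
  std::string out;
  for (std::size_t i = 0; i < in.size();) {
    std::size_t run = 1;
    while (i + run < in.size() && in[i + run] == in[i] && run < 255) ++run;
    out.push_back(static_cast<char>(run));
    out.push_back(in[i]);
    i += run;
  }
  return out;
}

// Samples 1 in N blocks (assumption: a deterministic counter) and
// accumulates raw and compressed sizes. N == 0 disables sampling.
class CompressibilitySampler {
 public:
  CompressibilitySampler(uint64_t sample_every_n, Compressor fast,
                         Compressor slow)
      : n_(sample_every_n), fast_(std::move(fast)), slow_(std::move(slow)) {
    assert(fast_ && slow_);
  }

  // Called once per data block. The block itself is still compressed for
  // storage elsewhere with the configured type; this only measures.
  void OnBlock(const std::string& block) {
    if (n_ == 0) return;            // sampling disabled
    if (++seen_ % n_ != 0) return;  // not this block's turn
    stats_.raw_bytes += block.size();
    stats_.fast_compressed_bytes += fast_(block).size();
    stats_.slow_compressed_bytes += slow_(block).size();
  }

  const SampledStats& stats() const { return stats_; }

 private:
  uint64_t n_;
  uint64_t seen_ = 0;
  Compressor fast_;
  Compressor slow_;
  SampledStats stats_;
};
```

A caller would construct one sampler per table being built, call `OnBlock` for each data block, and read `stats()` at the end. Dividing the compressed byte counts by `raw_bytes` gives an estimate of the compression ratio each algorithm would achieve.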

The db_bench_tool now has a command-line option for specifying the sampling rate. Its default value is 0 (no sampling). To test the overhead for a certain value, users can compare the performance of db_bench_tool while varying the sampling rate. It is unlikely to have a noticeable impact for high values like 20 (that is, 1 in 20 blocks sampled).
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4842

Differential Revision: D13629011

Pulled By: shobhitdayal

fbshipit-source-id: 14ca668bcab6499b2a1734edf848eb62a4f4fafa
2019-03-18 12:15:34 -07:00
..
builder.cc Feature for sampling and reporting compressibility (#4842) 2019-03-18 12:15:34 -07:00
builder.h Feature for sampling and reporting compressibility (#4842) 2019-03-18 12:15:34 -07:00
c_test.c Get CompactionJobInfo from CompactFiles 2018-12-13 14:21:24 -08:00
c.cc Apply modernize-use-override (2nd iteration) 2019-02-14 14:41:36 -08:00
column_family_test.cc Apply modernize-use-override (2nd iteration) 2019-02-14 14:41:36 -08:00
column_family.cc Digest ZSTD compression dictionary once when writing SST file (#4849) 2019-01-18 19:12:57 -08:00
column_family.h Lock free MultiGet (#4754) 2019-01-02 11:42:54 -08:00
compact_files_test.cc Apply modernize-use-override (2nd iteration) 2019-02-14 14:41:36 -08:00
compacted_db_impl.cc move dump stats to a separate thread (#4382) 2018-10-08 22:54:43 -07:00
compacted_db_impl.h Enable checkpoint of read-only db (#4681) 2018-12-07 17:06:02 -08:00
compaction_iteration_stats.h add counter for deletion dropping optimization 2017-08-19 14:10:08 -07:00
compaction_iterator_test.cc Apply modernize-use-override (2nd iteration) 2019-02-14 14:41:36 -08:00
compaction_iterator.cc WritePrepared: relax assert in compaction iterator (#4969) 2019-02-11 15:01:46 -08:00
compaction_iterator.h Deprecate CompactionFilter::IgnoreSnapshots() = false (#4954) 2019-02-07 16:57:33 -08:00
compaction_job_stats_test.cc Apply modernize-use-override (2nd iteration) 2019-02-14 14:41:36 -08:00
compaction_job_test.cc Zero seqnum of final key / drop final tombstone when compacting to bottommost level 2019-02-01 09:21:57 -08:00
compaction_job.cc Feature for sampling and reporting compressibility (#4842) 2019-03-18 12:15:34 -07:00
compaction_job.h Remove v1 RangeDelAggregator (#4778) 2018-12-17 17:33:46 -08:00
compaction_picker_fifo.cc Deprecate ttl option from CompactionOptionsFIFO (#4965) 2019-02-15 09:51:41 -08:00
compaction_picker_fifo.h Move FIFOCompactionPicker to a separate file (#4724) 2018-11-29 16:04:52 -08:00
compaction_picker_test.cc Apply modernize-use-override (2nd iteration) 2019-02-14 14:41:36 -08:00
compaction_picker_universal.cc Add two more StatsLevel (#5027) 2019-02-28 10:27:59 -08:00
compaction_picker_universal.h Delete triggered compaction for universal style 2018-05-29 15:44:34 -07:00
compaction_picker.cc Move FIFOCompactionPicker to a separate file (#4724) 2018-11-29 16:04:52 -08:00
compaction_picker.h Move FIFOCompactionPicker to a separate file (#4724) 2018-11-29 16:04:52 -08:00
compaction.cc Dictionary compression for files written by SstFileWriter (#4978) 2019-02-14 11:23:55 -08:00
compaction.h Truncate range tombstones by leveraging InternalKeys (#4432) 2018-10-09 15:19:38 -07:00
comparator_db_test.cc Apply modernize-use-override (2nd iteration) 2019-02-14 14:41:36 -08:00
convenience.cc Update all unique/shared_ptr instances to be qualified with namespace std (#4638) 2018-11-09 11:19:58 -08:00
corruption_test.cc Apply modernize-use-override (2nd iteration) 2019-02-14 14:41:36 -08:00
cuckoo_table_db_test.cc Apply modernize-use-override (2nd iteration) 2019-02-14 14:41:36 -08:00
db_basic_test.cc Apply modernize-use-override (2nd iteration) 2019-02-14 14:41:36 -08:00
db_blob_index_test.cc fix lite build 2017-10-17 08:57:09 -07:00
db_block_cache_test.cc Apply modernize-use-override (2nd iteration) 2019-02-14 14:41:36 -08:00
db_bloom_filter_test.cc add whole key bloom filter support in memtables (#4985) 2019-02-19 12:15:39 -08:00
db_compaction_filter_test.cc Apply modernize-use-override (2nd iteration) 2019-02-14 14:41:36 -08:00
db_compaction_test.cc Apply modernize-use-override (2nd iteration) 2019-02-14 14:41:36 -08:00
db_dynamic_level_test.cc Fix flaky DBDynamicLevelTest.DynamicLevelMaxBytesBase2 (#4668) 2018-11-12 16:42:16 -08:00
db_encryption_test.cc Update all unique/shared_ptr instances to be qualified with namespace std (#4638) 2018-11-09 11:19:58 -08:00
db_filesnapshot.cc Remove redundant member var and set options (#4631) 2018-11-12 12:24:26 -08:00
db_flush_test.cc Improve flushing multiple column families (#4708) 2018-12-13 15:12:40 -08:00
db_impl_compaction_flush.cc Add two more StatsLevel (#5027) 2019-02-28 10:27:59 -08:00
db_impl_debug.cc add GetStatsHistory to retrieve stats snapshots (#4748) 2019-02-20 15:52:54 -08:00
db_impl_experimental.cc Update JobContext. (#3949) 2018-08-03 17:42:34 -07:00
db_impl_files.cc Fix #3840: only SyncClosedLogs for multiple CFs (#4460) 2018-11-13 11:32:16 -08:00
db_impl_open.cc Feature for sampling and reporting compressibility (#4842) 2019-03-18 12:15:34 -07:00
db_impl_readonly.cc Add two more StatsLevel (#5027) 2019-02-28 10:27:59 -08:00
db_impl_readonly.h Get CompactionJobInfo from CompactFiles 2018-12-13 14:21:24 -08:00
db_impl_write.cc Update bg_error when log flush fails in SwitchMemtable() (#5072) 2019-03-15 15:19:25 -07:00
db_impl.cc Add two more StatsLevel (#5027) 2019-02-28 10:27:59 -08:00
db_impl.h add GetStatsHistory to retrieve stats snapshots (#4748) 2019-02-20 15:52:54 -08:00
db_info_dumper.cc avoid copying when iterating using range-based for (#4459) 2018-10-09 17:15:51 -07:00
db_info_dumper.h Change RocksDB License 2017-07-15 16:11:23 -07:00
db_inplace_update_test.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
db_io_failure_test.cc Disable DBIOFailureTest.NoSpaceCompactRange in LITE (#4596) 2018-10-29 14:36:31 -07:00
db_iter_stress_test.cc Move prefix_extractor to MutableCFOptions 2018-05-21 14:43:11 -07:00
db_iter_test.cc Apply modernize-use-override (2nd iteration) 2019-02-14 14:41:36 -08:00
db_iter.cc Apply modernize-use-override (2nd iteration) 2019-02-14 14:41:36 -08:00
db_iter.h Remove v1 RangeDelAggregator (#4778) 2018-12-17 17:33:46 -08:00
db_iterator_test.cc WritePrepared: optimize read path by avoiding virtual (#5018) 2019-02-26 16:56:19 -08:00
db_log_iter_test.cc Apply modernize-use-override (2nd iteration) 2019-02-14 14:41:36 -08:00
db_memtable_test.cc Apply modernize-use-override (2nd iteration) 2019-02-14 14:41:36 -08:00
db_merge_operator_test.cc WritePrepared: optimize read path by avoiding virtual (#5018) 2019-02-26 16:56:19 -08:00
db_options_test.cc add GetStatsHistory to retrieve stats snapshots (#4748) 2019-02-20 15:52:54 -08:00
db_properties_test.cc Deprecate ttl option from CompactionOptionsFIFO (#4965) 2019-02-15 09:51:41 -08:00
db_range_del_test.cc Apply modernize-use-override (2nd iteration) 2019-02-14 14:41:36 -08:00
db_sst_test.cc Apply modernize-use-override (2nd iteration) 2019-02-14 14:41:36 -08:00
db_statistics_test.cc Make statistics's stats_level change thread-safe (#5030) 2019-03-01 10:42:09 -08:00
db_table_properties_test.cc Allow dynamic modification of window size and deletion trigger (#4403) 2018-09-20 15:15:28 -07:00
db_tailing_iter_test.cc Remove managed iterator 2018-07-17 14:43:18 -07:00
db_test_util.cc Remove cuckoo hash memtable (#4953) 2019-02-07 16:15:27 -08:00
db_test_util.h Remove cuckoo hash memtable (#4953) 2019-02-07 16:15:27 -08:00
db_test.cc Make statistics's stats_level change thread-safe (#5030) 2019-03-01 10:42:09 -08:00
db_test2.cc WritePrepared: optimize read path by avoiding virtual (#5018) 2019-02-26 16:56:19 -08:00
db_universal_compaction_test.cc Apply modernize-use-override (2nd iteration) 2019-02-14 14:41:36 -08:00
db_wal_test.cc Preload some files even if options.max_open_files (#3340) 2018-12-28 18:02:28 -08:00
db_write_test.cc Update bg_error when log flush fails in SwitchMemtable() (#5072) 2019-03-15 15:19:25 -07:00
dbformat_test.cc Relax VersionStorageInfo::GetOverlappingInputs check (#4050) 2018-07-13 17:42:38 -07:00
dbformat.cc types: add kEntryBlobIndex for TablePropertiesCollector (#4233) 2018-08-06 18:27:44 -07:00
dbformat.h s/CacheAllocator/MemoryAllocator/g (#4590) 2018-10-26 14:30:30 -07:00
deletefile_test.cc Per-thread unique test db names (#4135) 2018-07-13 17:27:39 -07:00
error_handler_test.cc Fix regression test failures introduced by PR #4164 (#4375) 2018-09-17 13:14:07 -07:00
error_handler.cc Fix typos in comments (#4456) 2018-10-04 20:46:50 -07:00
error_handler.h Fix typos in comments (#4456) 2018-10-04 20:46:50 -07:00
event_helpers.cc Auto recovery from out of space errors (#4164) 2018-09-15 13:43:04 -07:00
event_helpers.h Auto recovery from out of space errors (#4164) 2018-09-15 13:43:04 -07:00
experimental.cc comment unused parameters to turn on -Wunused-parameter flag 2018-04-12 17:59:16 -07:00
external_sst_file_basic_test.cc Apply modernize-use-override (2nd iteration) 2019-02-14 14:41:36 -08:00
external_sst_file_ingestion_job.cc Atomic ingest (#4895) 2019-02-12 19:16:17 -08:00
external_sst_file_ingestion_job.h Atomic ingest (#4895) 2019-02-12 19:16:17 -08:00
external_sst_file_test.cc add whole key bloom filter support in memtables (#4985) 2019-02-19 12:15:39 -08:00
fault_injection_test.cc Apply modernize-use-override (2nd iteration) 2019-02-14 14:41:36 -08:00
file_indexer_test.cc Apply modernize-use-override (2nd iteration) 2019-02-14 14:41:36 -08:00
file_indexer.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
file_indexer.h Change RocksDB License 2017-07-15 16:11:23 -07:00
filename_test.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
flush_job_test.cc Use correct FileMeta for atomic flush result install (#4932) 2019-01-31 14:49:51 -08:00
flush_job.cc Feature for sampling and reporting compressibility (#4842) 2019-03-18 12:15:34 -07:00
flush_job.h Add support to flush multiple CFs atomically (#4262) 2018-10-15 20:01:17 -07:00
flush_scheduler.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
flush_scheduler.h Change RocksDB License 2017-07-15 16:11:23 -07:00
forward_iterator_bench.cc Per-thread unique test db names (#4135) 2018-07-13 17:27:39 -07:00
forward_iterator.cc Apply modernize-use-override (2nd iteration) 2019-02-14 14:41:36 -08:00
forward_iterator.h Comment out unused variables 2018-03-05 13:13:41 -08:00
in_memory_stats_history.cc add GetStatsHistory to retrieve stats snapshots (#4748) 2019-02-20 15:52:54 -08:00
in_memory_stats_history.h add GetStatsHistory to retrieve stats snapshots (#4748) 2019-02-20 15:52:54 -08:00
internal_stats.cc Add a new CPU time counter to compaction report (#4889) 2019-01-29 17:24:00 -08:00
internal_stats.h Add a new CPU time counter to compaction report (#4889) 2019-01-29 17:24:00 -08:00
job_context.h WritePrepared: Fix visible key compacted out by compaction (#4883) 2019-01-15 21:34:38 -08:00
listener_test.cc Apply modernize-use-override (2nd iteration) 2019-02-14 14:41:36 -08:00
log_format.h Fix an inaccurate comment (#4315) 2018-08-24 18:13:20 -07:00
log_reader.cc Update all unique/shared_ptr instances to be qualified with namespace std (#4638) 2018-11-09 11:19:58 -08:00
log_reader.h Update all unique/shared_ptr instances to be qualified with namespace std (#4638) 2018-11-09 11:19:58 -08:00
log_test.cc Apply modernize-use-override (2nd iteration) 2019-02-14 14:41:36 -08:00
log_writer.cc Pass manual_wal_flush also to the first wal file 2018-05-14 10:57:56 -07:00
log_writer.h Update all unique/shared_ptr instances to be qualified with namespace std (#4638) 2018-11-09 11:19:58 -08:00
logs_with_prep_tracker.cc Skip deleted WALs during recovery 2018-05-03 15:43:09 -07:00
logs_with_prep_tracker.h Skip deleted WALs during recovery 2018-05-03 15:43:09 -07:00
malloc_stats.cc Detect if Jemalloc is linked with the binary (#4844) 2019-01-03 16:30:12 -08:00
malloc_stats.h Change RocksDB License 2017-07-15 16:11:23 -07:00
manual_compaction_test.cc Apply modernize-use-override (2nd iteration) 2019-02-14 14:41:36 -08:00
memtable_list_test.cc Apply modernize-use-override (2nd iteration) 2019-02-14 14:41:36 -08:00
memtable_list.cc Avoid using kInAtomicGroup tag for single-cf op (#4981) 2019-02-13 18:33:42 -08:00
memtable_list.h Use correct FileMeta for atomic flush result install (#4932) 2019-01-31 14:49:51 -08:00
memtable.cc add whole key bloom filter support in memtables (#4985) 2019-02-19 12:15:39 -08:00
memtable.h add whole key bloom filter support in memtables (#4985) 2019-02-19 12:15:39 -08:00
merge_context.h Remove v1 RangeDelAggregator (#4778) 2018-12-17 17:33:46 -08:00
merge_helper_test.cc Apply modernize-use-override (2nd iteration) 2019-02-14 14:41:36 -08:00
merge_helper.cc Add two more StatsLevel (#5027) 2019-02-28 10:27:59 -08:00
merge_helper.h Remove v1 RangeDelAggregator (#4778) 2018-12-17 17:33:46 -08:00
merge_operator.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
merge_test.cc Apply modernize-use-override (2nd iteration) 2019-02-14 14:41:36 -08:00
obsolete_files_test.cc Modify verification logic of ObsoleteOptionsFileTest (#4218) 2018-08-03 13:57:40 -07:00
options_file_test.cc Per-thread unique test db names (#4135) 2018-07-13 17:27:39 -07:00
perf_context_test.cc Allow copy for PerfContext objects (#4919) 2019-02-05 14:29:08 -08:00
pinned_iterators_manager.h Change RocksDB License 2017-07-15 16:11:23 -07:00
plain_table_db_test.cc Revert "Remove PlainTable's feature store_index_in_file (#4914)" (#5034) 2019-03-01 15:45:45 -08:00
pre_release_callback.h Call PreReleaseCallback between WAL and memtable write (#5015) 2019-02-28 15:49:11 -08:00
prefix_test.cc Apply modernize-use-override (2nd iteration) 2019-02-14 14:41:36 -08:00
range_del_aggregator_bench.cc Fix Windows broken build error due to non-const override (#4798) 2018-12-19 13:29:51 -08:00
range_del_aggregator_test.cc Remove v1 RangeDelAggregator (#4778) 2018-12-17 17:33:46 -08:00
range_del_aggregator.cc Remove stale TODO (#4800) 2018-12-19 15:45:37 -08:00
range_del_aggregator.h Fix unused member compile error 2018-12-18 14:28:42 -08:00
range_tombstone_fragmenter_test.cc Prepare FragmentedRangeTombstoneIterator for use in compaction (#4740) 2018-12-11 12:10:48 -08:00
range_tombstone_fragmenter.cc Add compaction logic to RangeDelAggregatorV2 (#4758) 2018-12-17 13:20:51 -08:00
range_tombstone_fragmenter.h Add compaction logic to RangeDelAggregatorV2 (#4758) 2018-12-17 13:20:51 -08:00
read_callback.h WritePrepared: optimize read path by avoiding virtual (#5018) 2019-02-26 16:56:19 -08:00
repair_test.cc Acquire lock on DB LOCK file before starting repair. (#4435) 2018-10-12 10:41:54 -07:00
repair.cc Feature for sampling and reporting compressibility (#4842) 2019-03-18 12:15:34 -07:00
snapshot_checker.h WritePrepared: fix issue with snapshot released during compaction (#4858) 2019-01-16 09:55:32 -08:00
snapshot_impl.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
snapshot_impl.h Remove duplicates from SnapshotList::GetAll (#4860) 2019-01-09 16:25:42 -08:00
table_cache.cc Introduce a CPU time counter in perf_context (#4741) 2018-12-20 12:03:44 -08:00
table_cache.h Preload some files even if options.max_open_files (#3340) 2018-12-28 18:02:28 -08:00
table_properties_collector_test.cc Feature for sampling and reporting compressibility (#4842) 2019-03-18 12:15:34 -07:00
table_properties_collector.cc Feature for sampling and reporting compressibility (#4842) 2019-03-18 12:15:34 -07:00
table_properties_collector.h Feature for sampling and reporting compressibility (#4842) 2019-03-18 12:15:34 -07:00
transaction_log_impl.cc Update all unique/shared_ptr instances to be qualified with namespace std (#4638) 2018-11-09 11:19:58 -08:00
transaction_log_impl.h Update all unique/shared_ptr instances to be qualified with namespace std (#4638) 2018-11-09 11:19:58 -08:00
version_builder_test.cc Apply modernize-use-override (3) 2019-02-19 13:39:49 -08:00
version_builder.cc Preload some files even if options.max_open_files (#3340) 2018-12-28 18:02:28 -08:00
version_builder.h Preload some files even if options.max_open_files (#3340) 2018-12-28 18:02:28 -08:00
version_edit_test.cc Add a unit test to Ignorable manfiest record (#4964) 2019-02-11 11:20:24 -08:00
version_edit.cc Add a placeholder in manifest indicating ignorable record (#4960) 2019-02-08 11:33:11 -08:00
version_edit.h Add a unit test to Ignorable manfiest record (#4964) 2019-02-11 11:20:24 -08:00
version_set_test.cc Apply modernize-use-override (3) 2019-02-19 13:39:49 -08:00
version_set.cc Apply modernize-use-override (3) 2019-02-19 13:39:49 -08:00
version_set.h Remove v1 RangeDelAggregator (#4778) 2018-12-17 17:33:46 -08:00
wal_manager_test.cc Update all unique/shared_ptr instances to be qualified with namespace std (#4638) 2018-11-09 11:19:58 -08:00
wal_manager.cc Apply modernize-use-override (3) 2019-02-19 13:39:49 -08:00
wal_manager.h Fix memleak when DB::DeleteFile() 2018-01-11 18:57:33 -08:00
write_batch_base.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
write_batch_internal.h WriteUnPrepared: Add new WAL marker kTypeBeginUnprepareXID (#4069) 2018-06-28 18:58:29 -07:00
write_batch_test.cc Apply modernize-use-override (3) 2019-02-19 13:39:49 -08:00
write_batch.cc Apply modernize-use-override (3) 2019-02-19 13:39:49 -08:00
write_callback_test.cc Apply modernize-use-override (3) 2019-02-19 13:39:49 -08:00
write_callback.h Change RocksDB License 2017-07-15 16:11:23 -07:00
write_controller_test.cc Apply modernize-use-override (3) 2019-02-19 13:39:49 -08:00
write_controller.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
write_controller.h Change RocksDB License 2017-07-15 16:11:23 -07:00
write_thread.cc Add more sync point to fix flaky test GroupCommitTest 2018-11-07 14:07:53 -08:00
write_thread.h Fix skip WAL for whole write_group when leader's callback fail (#4838) 2019-01-03 12:40:42 -08:00