rocksdb/tools
Andrew Kryczka 843d2e3137 Shared dictionary compression using reference block
Summary:
This adds a new metablock containing a shared dictionary that is used
to compress all data blocks in the SST file. The size of the shared dictionary
is configurable in CompressionOptions and defaults to 0. It's currently only
used for zlib/lz4/lz4hc, but the block will be stored in the SST regardless of
the compression type if the user chooses a nonzero dictionary size.

During compaction, the dictionary is computed by randomly sampling the first
output file in each subcompaction. The sampling intervals are pre-computed
by assuming the output file will have the maximum allowable length. If the
file turns out smaller, some of the pre-computed intervals fall beyond
end-of-file; those samples are skipped and the dictionary ends up a bit
smaller. After the dictionary is generated from the first file in a
subcompaction, it is loaded into the compression library before writing each
block in each subsequent file of that subcompaction.
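
The sampling scheme described above can be sketched roughly as follows. This
is an illustration only, not the code from this change: the names SamplePlan,
PlanSamples and BuildDictionary, the fixed sample length, and the slot-based
choice of offsets are all assumptions made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <set>
#include <string>
#include <vector>

// Hypothetical plan: fixed-length samples at sorted start offsets.
struct SamplePlan {
  std::size_t sample_len;
  std::vector<std::size_t> offsets;  // ascending byte offsets into the file
};

// Pre-compute sample offsets as if the output file had max_file_size bytes.
// Assumption for this sketch: the file is split into sample_len-sized slots
// and distinct slots are picked uniformly at random.
SamplePlan PlanSamples(std::size_t max_file_size, std::size_t dict_bytes,
                       std::size_t sample_len, std::uint64_t seed) {
  SamplePlan plan{sample_len, {}};
  if (sample_len == 0 || max_file_size < sample_len) return plan;
  const std::size_t num_slots = max_file_size / sample_len;
  const std::size_t num_samples = std::min(dict_bytes / sample_len, num_slots);
  std::mt19937_64 rng(seed);
  std::uniform_int_distribution<std::size_t> pick(0, num_slots - 1);
  std::set<std::size_t> slots;  // std::set keeps the slots distinct and sorted
  while (slots.size() < num_samples) slots.insert(pick(rng));
  for (std::size_t slot : slots) plan.offsets.push_back(slot * sample_len);
  return plan;
}

// Collect the planned samples from the actual file contents. Samples that
// would extend past end-of-file are skipped, so a short file yields a
// smaller dictionary.
std::string BuildDictionary(const SamplePlan& plan,
                            const std::string& file_bytes) {
  assert(std::is_sorted(plan.offsets.begin(), plan.offsets.end()));
  std::string dict;
  for (std::size_t off : plan.offsets) {
    // Offsets are sorted, so every later sample is past end-of-file as well.
    if (off + plan.sample_len > file_bytes.size()) break;
    dict.append(file_bytes, off, plan.sample_len);
  }
  return dict;
}
```

In this sketch a file that reaches the assumed maximum length fills the whole
dictionary budget, while a shorter file silently produces fewer samples, which
matches the "dictionary will be a bit smaller" behavior described above.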

On the read path, the dictionary is read from the metablock, if it exists, and
loaded into the compression library before reading each block.

Test Plan: new unit test

Reviewers: yhchiang, IslamAbdelRahman, cyan, sdong

Reviewed By: sdong

Subscribers: andrewkr, yoshinorim, kradhakrishnan, dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D52287
2016-04-27 17:36:03 -07:00
..
dump Updated all copyright headers to the new format. 2016-02-09 15:12:00 -08:00
rdb fix typos in comments 2015-12-11 01:54:48 +09:00
auto_sanity_test.sh fix typos in comments 2015-12-11 01:54:48 +09:00
benchmark_leveldb.sh Add scripts to run leveldb benchmark 2015-04-27 19:32:56 -07:00
benchmark.sh Fix column label for L0 write sum 2016-04-18 14:34:45 -07:00
check_format_compatible.sh tools/check_format_compatible.sh to use consistent version when testing backward and forward compatibility 2016-03-21 11:13:26 -07:00
db_bench_tool.cc Print memory allocation counters 2016-04-27 16:23:33 -07:00
db_bench.cc Separate main from bench functionality to allow customizations 2016-02-16 06:17:31 -08:00
db_crashtest.py [db_stress] Make subcompaction random in crash_test 2016-04-18 14:43:33 -07:00
db_repl_stress.cc Updated all copyright headers to the new format. 2016-02-09 15:12:00 -08:00
db_sanity_test.cc Updated all copyright headers to the new format. 2016-02-09 15:12:00 -08:00
db_stress.cc Temporarily disable CompactFiles in db_stress in its default setting 2016-04-27 16:50:51 -07:00
dbench_monitor Added simple monitoring script to monitor overusage of memory in db_bench 2015-02-11 18:40:11 -08:00
Dockerfile adding docker build script and dockerfile 2015-05-22 16:03:39 -07:00
generate_random_db.sh Script to check whether RocksDB can read DB generated by previous releases and vice versa 2015-04-08 16:04:59 -07:00
ldb_cmd_execute_result.h Updated all copyright headers to the new format. 2016-02-09 15:12:00 -08:00
ldb_cmd_test.cc Fix Windows build by replacing strings.h include 2016-04-11 19:21:00 -07:00
ldb_cmd.cc Introduce XPRESS compression on Windows. (#1081) 2016-04-19 22:54:24 -07:00
ldb_cmd.h to/from hex refactor 2016-03-30 14:36:48 -07:00
ldb_test.py ldb to support --column_family option 2016-01-25 14:58:18 -08:00
ldb_tool.cc Expose RepairDB as ldb command 2016-03-12 13:50:20 -08:00
ldb.cc Updated all copyright headers to the new format. 2016-02-09 15:12:00 -08:00
pflag Added simple monitoring script to monitor overusage of memory in db_bench 2015-02-11 18:40:11 -08:00
reduce_levels_test.cc Updated all copyright headers to the new format. 2016-02-09 15:12:00 -08:00
rocksdb_dump_test.sh Update dump_tool and undump_tool to accept Options 2015-10-05 19:49:48 -07:00
run_flash_bench.sh Update benchmarks used to measure subcompaction performance 2016-03-04 12:32:11 -08:00
run_leveldb.sh Add scripts to run leveldb benchmark 2015-04-27 19:32:56 -07:00
sample-dump.dmp First version of rocksdb_dump and rocksdb_undump. 2015-06-19 16:24:36 -07:00
sst_dump_test.cc Shared dictionary compression using reference block 2016-04-27 17:36:03 -07:00
sst_dump_tool_imp.h Adding pin_l0_filter_and_index_blocks_in_cache feature and related fixes. 2016-04-01 10:42:39 -07:00
sst_dump_tool.cc Shared dictionary compression using reference block 2016-04-27 17:36:03 -07:00
sst_dump.cc Updated all copyright headers to the new format. 2016-02-09 15:12:00 -08:00
verify_random_db.sh Script to check whether RocksDB can read DB generated by previous releases and vice versa 2015-04-08 16:04:59 -07:00
write_stress_runner.py Write stress test 2015-10-28 16:15:07 -07:00
write_stress.cc Updated all copyright headers to the new format. 2016-02-09 15:12:00 -08:00