rocksdb/tools
Zhichao Cao ce8e88d2d7 Generate mixed workload with Get, Put, Seek in db_bench (#4788)
Summary:
Based on the specific workload models (key access distribution, value size distribution, and iterator scan length distribution, the QPS variation), the MixGraph benchmark generate the synthetic workload according to these distributions which can reflect the real-world workload characteristics.

After user enable the tracing function, they will get the trace file. By analyzing the trace file with the trace_analyzer tool, user can generate a set of statistic data files including. The *_accessed_key_stats.txt,  *-accessed_value_size_distribution.txt, *-iterator_length_distribution.txt, and *-qps_stats.txt are mainly used to fit the Matlab model fitting. After that, user can get the parameters of the workload distributions (the modeling details are described: [here](https://github.com/facebook/rocksdb/wiki/RocksDB-Trace%2C-Replay%2C-and-Analyzer))

The key access distribution follows the The two-term power model. The probability density function is: `f(x) = ax^{b}+c`. The corresponding parameters are key_dist_a, key_dist_b, and key_dist_c in db_bench

For the value size distribution and iterator scan length distribution, they both follow the Generalized Pareto Distribution. The probability density function is `f(x) = (1/sigma)(1+k*(x-theta)/sigma))^{-1-1/k)`. The parameters are: value_k, value_theta, value_sigma and iter_k, iter_theta, iter_sigma. For more information about the Generalized Pareto Distribution, users can find the [wiki](https://en.wikipedia.org/wiki/Generalized_Pareto_distribution) and [Matalb page](https://www.mathworks.com/help/stats/generalized-pareto-distribution.html)

As for the QPS, it follows the diurnal pattern. So Sine is a good model to fit it. `F(x) = sine_a*sin(sine_b*x + sine_c) + sine_d`. The trace_will tell you the average QPS in the print out resutls, which is sine_d. After user fit the "*-qps_stats.txt" to the Matlab model, user can get the sine_a, sine_b, and sine_c. By using the 4 parameters, user can control the QPS variation including the period, average, changes.

To use the bench mark, user can indicate the following parameters as examples:
```
-benchmarks="mixgraph" -key_dist_a=0.002312 -key_dist_b=0.3467 -value_k=0.9233 -value_sigma=226.4092 -iter_k=2.517 -iter_sigma=14.236 -mix_get_ratio=0.7 -mix_put_ratio=0.25 -mix_seek_ratio=0.05 -sine_mix_rate_interval_milliseconds=500 -sine_a=15000 -sine_b=1 -sine_d=20000
```
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4788

Differential Revision: D13573940

Pulled By: sagar0

fbshipit-source-id: e184c27e07b4f1bc0b436c2be36c5090c1fb0222
2019-01-22 10:44:26 -08:00
..
advisor Rules Advisor: some fixes to support fetching stats from ODS (#4223) 2018-08-02 15:42:42 -07:00
dump fix gflags namespace 2017-12-01 10:42:05 -08:00
rdb Fix /bin/bash shebangs 2017-08-03 15:56:46 -07:00
auto_sanity_test.sh Suppress lint in old files 2018-01-29 12:56:42 -08:00
benchmark_leveldb.sh Suppress lint in old files 2018-01-29 12:56:42 -08:00
benchmark.sh Updated benchmark script (#4134) 2018-12-17 16:34:30 -08:00
blob_dump.cc comment unused parameters to turn on -Wunused-parameter flag 2018-04-12 17:59:16 -07:00
check_format_compatible.sh Include newer RocksDB versions in compat test (#4634) 2018-11-06 14:25:39 -08:00
CMakeLists.txt cmake support for linux and osx (#1358) 2016-09-28 11:53:15 -07:00
db_bench_tool_test.cc Update all unique/shared_ptr instances to be qualified with namespace std (#4638) 2018-11-09 11:19:58 -08:00
db_bench_tool.cc Generate mixed workload with Get, Put, Seek in db_bench (#4788) 2019-01-22 10:44:26 -08:00
db_bench.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
db_crashtest.py Enable DeleteRange in stress/crash tests (#4483) 2018-12-18 13:42:49 -08:00
db_repl_stress.cc Update all unique/shared_ptr instances to be qualified with namespace std (#4638) 2018-11-09 11:19:58 -08:00
db_sanity_test.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
db_stress.cc Free memory after use 2019-01-08 17:19:09 -08:00
dbench_monitor Fix /bin/bash shebangs 2017-08-03 15:56:46 -07:00
Dockerfile adding docker build script and dockerfile 2015-05-22 16:03:39 -07:00
generate_random_db.sh Fix /bin/bash shebangs 2017-08-03 15:56:46 -07:00
ingest_external_sst.sh Add compatibility test of SST ingestion (#4310) 2018-08-24 14:27:43 -07:00
ldb_cmd_impl.h Add SST ingestion to ldb (#4205) 2018-08-09 14:29:11 -07:00
ldb_cmd_test.cc tools: use provided options instead of the default (#4839) 2019-01-03 11:23:49 -08:00
ldb_cmd.cc With ldb --try_load_options and wal_dir doesn't exist, ignore it (#4875) 2019-01-11 16:48:32 -08:00
ldb_test.py Add SST ingestion to ldb (#4205) 2018-08-09 14:29:11 -07:00
ldb_tool.cc Add SST ingestion to ldb (#4205) 2018-08-09 14:29:11 -07:00
ldb.cc comment unused parameters to turn on -Wunused-parameter flag 2018-04-12 17:59:16 -07:00
pflag Fix /bin/bash shebangs 2017-08-03 15:56:46 -07:00
reduce_levels_test.cc Per-thread unique test db names (#4135) 2018-07-13 17:27:39 -07:00
regression_test.sh Suppress lint in old files 2018-01-29 12:56:42 -08:00
report_lite_binary_size.sh Legocastle job to report lite build binary size to scuba 2018-02-15 17:27:24 -08:00
rocksdb_dump_test.sh Suppress lint in old files 2018-01-29 12:56:42 -08:00
run_flash_bench.sh Fix /bin/bash shebangs 2017-08-03 15:56:46 -07:00
run_leveldb.sh Fix /bin/bash shebangs 2017-08-03 15:56:46 -07:00
sample-dump.dmp First version of rocksdb_dump and rocksdb_undump. 2015-06-19 16:24:36 -07:00
sst_dump_test.cc tools: use provided options instead of the default (#4839) 2019-01-03 11:23:49 -08:00
sst_dump_tool_imp.h tools: use provided options instead of the default (#4839) 2019-01-03 11:23:49 -08:00
sst_dump_tool.cc tools: use provided options instead of the default (#4839) 2019-01-03 11:23:49 -08:00
sst_dump.cc comment unused parameters to turn on -Wunused-parameter flag 2018-04-12 17:59:16 -07:00
trace_analyzer_test.cc Add the unit test of Iterator to trace_analyzer_test (#4282) 2018-08-23 17:28:32 -07:00
trace_analyzer_tool.cc Add unique key number changing statistics to Trace_analyzer (#4646) 2018-11-12 08:26:50 -08:00
trace_analyzer_tool.h Add unique key number changing statistics to Trace_analyzer (#4646) 2018-11-12 08:26:50 -08:00
trace_analyzer.cc RocksDB Trace Analyzer (#4091) 2018-08-13 11:44:02 -07:00
verify_random_db.sh tools/check_format_compatible.sh to cover forward option reading too (#3994) 2018-06-15 11:12:29 -07:00
write_external_sst.sh correct mistyped msg. (#4341) 2018-09-13 14:57:38 -07:00
write_stress_runner.py Suppress lint in old files 2018-01-29 12:56:42 -08:00
write_stress.cc Compilation fixes for powerpc build, -Wparentheses-equality error and missing header guards 2018-02-09 14:12:43 -08:00