Update documentation

Summary:
Added more options for compaction settings + thread pools. Please check if thread pool description is correct.

Test Plan: -

Reviewers: dhruba

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D14043
2013-11-12 16:09:57 -08:00
commit c3dda7276c
parent 9df2b217e9
1 changed files with 121 additions and 10 deletions
--- a/doc/index.html
+++ b/doc/index.html
@@ -387,7 +387,8 @@ of point reads of small values may wish to switch to a smaller block
 size if performance measurements indicate an improvement.  There isn't
 much benefit in using blocks smaller than one kilobyte, or larger than
 a few megabytes.  Also note that compression will be more effective
-with larger block sizes.
+with larger block sizes. To change the block size, use
 <code>Options::block_size</code>.
 <p>
 <h2>Write buffer</h2>
 <p>
@@ -434,7 +435,7 @@ filesystem and each file stores a sequence of compressed blocks.  If
 used uncompressed block contents. If <code>options.block_cache_compressed</code>
 is non-NULL, it is used to cache frequently used compressed blocks. Compressed
 cache is an alternative to OS cache, which also caches compressed blocks. If
-compressed cache is used, you should disable OS cache by setting
+compressed cache is used, the OS cache is disabled automatically:
 <code>options.allow_os_buffer</code> is set to false.
 <p>
 <pre>
@@ -588,7 +589,7 @@ Here we give overview of the options that impact behavior of Compactions:
 <ul>
 <p>
 <li><code>Options::compaction_style</code> - RocksDB currently supports two
-compaction algorithms - Compaction style and Level style. This option switches
+compaction algorithms - Universal style and Level style. This option switches
 between the two.  Can be kCompactionStyleUniversal or kCompactionStyleLevel.
 If this is kCompactionStyleUniversal, then you can configure universal style
 parameters with <code>Options::compaction_options_universal</code>.
@@ -608,16 +609,126 @@ key-value during background compaction.
 </ul>
 <p>
 Other options impacting performance of compactions and when they get triggered
-are: <code>access_hint_on_compaction_start</code>,
+are: 
-<code>level0_file_num_compaction_trigger</code>,
+<ul>
-<code>max_mem_compaction_level</code>, <code>target_file_size_base</code>,
+<p>
-<code>target_file_size_multiplier</code>,
+<li> <code>Options::access_hint_on_compaction_start</code> - Specify the file access 
-<code>expanded_compaction_factor</code>, <code>source_compaction_factor</code>,
+pattern once a compaction is started. It will be applied to all input files of a compaction. Default: NORMAL
-<code>max_grandparent_overlap_factor</code>,
+<p>
-<code>disable_seek_compaction</code>, <code>max_background_compactions</code>.
+<li> <code>Options::level0_file_num_compaction_trigger</code> -  Number of files to trigger level-0 compaction. 
 A negative value means that level-0 compaction will not be triggered by number of files at all.
 <p>
 <li> <code>Options::max_mem_compaction_level</code> -  Maximum level to which a new compacted memtable is pushed if it
 does not create overlap.  We try to push to level 2 to avoid the relatively expensive level 0=>1 compactions and to avoid some
 expensive manifest file operations.  We do not push all the way to the largest level since that can generate a lot of wasted disk
 space if the same key space is being repeatedly overwritten.
 <p>
 <li> <code>Options::target_file_size_base</code> and <code>Options::target_file_size_multiplier</code> -
 Target file size for compaction.  target_file_size_base is the per-file size for level 1.
 The target file size for level L is target_file_size_base * (target_file_size_multiplier ^ (L-1)).
 For example, if target_file_size_base is 2MB and target_file_size_multiplier is 10, then each file on level 1 will
 be 2MB, each file on level 2 will be 20MB, and each file on level 3 will be 200MB. The default target_file_size_base is 2MB
 and the default target_file_size_multiplier is 1.
 <p>
 <li> <code>Options::expanded_compaction_factor</code> -  Maximum number of bytes in all compacted files.  We avoid expanding
 the lower level file set of a compaction if it would make the total compaction cover more than
 (expanded_compaction_factor * targetFileSizeLevel()) bytes.
 <p>
 <li> <code>Options::source_compaction_factor</code> -  Maximum number of bytes in all source files to be compacted in a
 single compaction run. We avoid picking too many files in the source level, so that the total source bytes
 for a compaction do not exceed (source_compaction_factor * targetFileSizeLevel()) bytes.
 Default: 1, i.e. pick maxfilesize amount of data as the source of a compaction.
 <p>
 <li> <code>Options::max_grandparent_overlap_factor</code> -  Controls the maximum bytes of overlap with the grandparent level (i.e., level+2) before we
 stop building a single file in a level->level+1 compaction.
 <p>
 <li> <code>Options::disable_seek_compaction</code> -  Disable compaction triggered by seek.
 With bloom filters and fast storage, a miss on one level is very cheap if the file handle is cached in the table cache
 (which is true if max_open_files is large).
 <p>
 <li> <code>Options::max_background_compactions</code> - Maximum number of concurrent background jobs, submitted to
 the default LOW priority thread pool.
 </ul>
 <p>
 You can learn more about all of those options in <code>rocksdb/options.h</code>.
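The size formulas in the list above can be checked with a small standalone sketch. It does not use RocksDB itself: `TargetFileSize` and `CompactionByteLimit` are hypothetical helper names introduced only for illustration, and the arithmetic is an assumption based on the option descriptions above rather than the library's actual code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not part of RocksDB: per-file target size for `level`
// (level >= 1), computed as
// target_file_size_base * target_file_size_multiplier ^ (level - 1).
inline uint64_t TargetFileSize(uint64_t target_file_size_base,
                               uint64_t target_file_size_multiplier,
                               int level) {
  assert(level >= 1);
  uint64_t size = target_file_size_base;
  for (int i = 1; i < level; ++i) {
    size *= target_file_size_multiplier;
  }
  return size;
}

// Hypothetical helper: byte budget of the form (factor * target file size),
// the shape described for expanded_compaction_factor and
// source_compaction_factor.
inline uint64_t CompactionByteLimit(uint64_t factor,
                                    uint64_t target_file_size) {
  return factor * target_file_size;
}
```

With a 2MB base and a multiplier of 10, `TargetFileSize(2097152, 10, 3)` gives 209715200 bytes (200MB), matching the example in the list. With the default multiplier of 1, every level gets the same 2MB target.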
 <h2> Universal style compaction specific settings</h2>
 <p>
 If you're using Universal style compaction, there is an object <code>CompactionOptionsUniversal</code>
 that holds all the different options for that compaction. The exact definition is in
 <code>rocksdb/universal_compaction.h</code> and you can set it in <code>Options::compaction_options_universal</code>.
 Here we give a short overview of the options in <code>CompactionOptionsUniversal</code>:
 <ul>
 <p>
 <li> <code>CompactionOptionsUniversal::size_ratio</code> - Percentage flexibility while comparing file sizes. If the candidate file(s)
   size is 1% smaller than the next file's size, then include the next file in
   this candidate set.  Default: 1
 <p>
 <li> <code>CompactionOptionsUniversal::min_merge_width</code> - The minimum number of files in a single compaction run. Default: 2
 <p>
 <li> <code>CompactionOptionsUniversal::max_merge_width</code> - The maximum number of files in a single compaction run. Default: UINT_MAX
 <p>
 <li> <code>CompactionOptionsUniversal::max_size_amplification_percent</code> - The size amplification is defined as the amount (in percentage) of
 additional storage needed to store a single byte of data in the database.  For example, a size amplification of 2% means that a database that
 contains 100 bytes of user data may occupy up to 102 bytes of physical storage. By this definition, a fully compacted database has
 a size amplification of 0%. RocksDB uses the following heuristic to calculate size amplification: it assumes that all files excluding
 the earliest file contribute to the size amplification.  Default: 200, which means that a 100 byte database could require up to
 300 bytes of storage.
 <p>
 <li> <code>CompactionOptionsUniversal::compression_size_percent</code> - If this option is set to -1 (the default value), all output files
 will follow the specified compression type.  If this option is not negative, we will try to make sure the compressed
 size is just above this value. In normal cases, at least this percentage
 of data will be compressed.
 When we are compacting to a new file, this is the criterion for whether
 it needs to be compressed: assume the list of files sorted
 by generation time is [ A1...An B1...Bm C1...Ct ],
 where A1 is the newest and Ct is the oldest, and we are going to compact
 B1...Bm. We calculate the total size of all the files as total_size, and
 the total size of C1...Ct as total_C. The compaction output file
 will be compressed iff total_C / total_size < this percentage.
 <p>
 <li> <code>CompactionOptionsUniversal::stop_style</code> - The algorithm used to stop picking files into a single compaction run.
 Can be kCompactionStopStyleSimilarSize (pick files of similar size) or kCompactionStopStyleTotalSize (total size of picked files > next file).
 Default: kCompactionStopStyleTotalSize
 </ul>
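The two percentage rules described above (size amplification and the compression criterion) can be sketched in standalone C++. This is only one reading of the descriptions in the list, not RocksDB's implementation: the helper names `EstimatedSizeAmplificationPercent` and `ShouldCompressOutput` are hypothetical, and file sizes are assumed to be listed newest first.

```cpp
#include <cassert>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical helper, not part of RocksDB. `sizes` holds file sizes in bytes,
// newest first, so the last element is the earliest file. Following the
// heuristic described above, every file except the earliest counts as
// amplification, expressed as a percentage of the earliest file's size.
inline uint64_t EstimatedSizeAmplificationPercent(
    const std::vector<uint64_t>& sizes) {
  assert(!sizes.empty() && sizes.back() > 0);
  const uint64_t newer_bytes =
      std::accumulate(sizes.begin(), sizes.end() - 1, uint64_t{0});
  return newer_bytes * 100 / sizes.back();
}

// Hypothetical helper for the compression_size_percent rule. total_c is the
// combined size of the files older than the ones being compacted (C1...Ct);
// total_size is the combined size of all files.
inline bool ShouldCompressOutput(int compression_size_percent,
                                 uint64_t total_c, uint64_t total_size) {
  if (compression_size_percent < 0) {
    return true;  // -1: always follow the configured compression type
  }
  assert(total_size > 0);
  // total_C / total_size < percentage, rewritten without division.
  return total_c * 100 <
         static_cast<uint64_t>(compression_size_percent) * total_size;
}
```

For files of 10, 20 and 100 bytes (newest first), the estimate is 30%, well under the default limit of 200. With `compression_size_percent` set to 70, an output file is compressed when the older files make up less than 70% of the total size.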
 <h1>Thread pools</h1>
 <p>
 A thread pool is associated with the Env environment object. The client has to create a thread pool by setting the number of background
 threads using the method <code>Env::SetBackgroundThreads()</code> defined in <code>rocksdb/env.h</code>.
 We use the thread pool for compactions and memtable flushes.
 Since memtable flushes are in the critical code path (a stalled memtable flush can stall writes, increasing p99), we suggest
 having two thread pools, with priorities HIGH and LOW. Memtable flushes can be set up to be scheduled on the HIGH thread pool.
 There are two options available for configuration of background compactions and flushes:
 <ul>
 <p>
 <li> <code>Options::max_background_compactions</code> - Maximum number of concurrent background jobs,
 submitted to the default LOW priority thread pool.
 <p>
 <li> <code>Options::max_background_flushes</code> - Maximum number of concurrent background memtable flush jobs, submitted to
 the HIGH priority thread pool.  By default, all background jobs (major compaction and memtable flush) go
 to the LOW priority pool. If this option is set to a positive number, memtable flush jobs will be submitted to the HIGH priority pool.
 This is important when the same Env is shared by multiple DB instances.  Without a separate pool, long running major compaction jobs could
 potentially block memtable flush jobs of other DB instances, leading to unnecessary Put stalls.
 </ul>
 <p>
 <pre>
  #include &lt;cassert&gt;
  #include "rocksdb/env.h"
  #include "rocksdb/db.h"
  // Size both pools of the shared Env: LOW for compactions, HIGH for flushes.
  auto env = rocksdb::Env::Default();
  env->SetBackgroundThreads(2, rocksdb::Env::LOW);
  env->SetBackgroundThreads(1, rocksdb::Env::HIGH);
  rocksdb::DB* db;
  rocksdb::Options options;
  options.env = env;
  options.max_background_compactions = 2;
  // A positive value sends memtable flushes to the HIGH priority pool.
  options.max_background_flushes = 1;
  rocksdb::Status status = rocksdb::DB::Open(options, "/tmp/testdb", &amp;db);
  assert(status.ok());
  ...
 </pre>
 <h1>Approximate Sizes</h1>
 <p>
 The <code>GetApproximateSizes</code> method can be used to get the approximate