rocksdb/include/rocksdb
Hui Xiao 240c4126fd Implement superior user & mid IO priority level in GenericRateLimiter (#8595)
Summary:
Context:
An extra IO_USER priority in rate limiter allows users to optionally charge WAL writes / SST reads to rate limiter at this priority level, which then has higher priority than IO_HIGH and IO_LOW. With an extra IO_USER priority, it allows users to better specify the relative urgency/importance among different requests in rate limiter. As a consequence, IO resource management can better prioritize and limit resource based on user's need.

The IO_USER is implemented as superior priority in GenericRateLimiter, in the sense that its request queue will always be iterated first without being constrained to fairness. The reason is that the notion of fairness is only meaningful in helping lower priorities in background IO (i.e, IO_HIGH/MID/LOW) to gain some fair chance to run so that it does not block foreground IO (i.e, the ones that are charged at the level of IO_USER). As we can see, the ultimate goal here is to not blocking foreground IO at IO_USER level, which justifies the superiority of IO_USER.

Similar benefits exist for IO_MID priority.
- Rewrote the logic of deciding the order of iterating request queues of high/low priorities to include the extra user/mid priority w/o affecting the existing behavior (see PR's [comment](https://github.com/facebook/rocksdb/pull/8595/files#r678749331))
- Included the request queue of user-pri/mid-pri in the code path of next-leader-candidate signaling and GenericRateLimiter's destructor
- Included the extra user/mid-pri in bookkeeping data structures: total_bytes_through_ and total_requests_
- Re-written the previous impl of explicitly iterating priorities with a loop from Env::IO_LOW to Env::IO_TOTAL

Pull Request resolved: https://github.com/facebook/rocksdb/pull/8595

Test Plan:
- passed existing rate_limiter_test.cc
- passed added unit tests in rate_limiter_test.cc
- run performance test to verify performance with only high/low requests is not affected by this change
   - Set-up command:
   `TEST_TMPDIR=/dev/shm ./db_bench --benchmarks=fillrandom --duration=5 --compression_type=none --num=100000000 --disable_auto_compactions=true --write_buffer_size=1048576 --writable_file_max_buffer_size=65536 --target_file_size_base=1048576 --max_bytes_for_level_base=4194304 --level0_slowdown_writes_trigger=$(((1 << 31) - 1)) --level0_stop_writes_trigger=$(((1 << 31) - 1))`

    - Test command:
   `TEST_TMPDIR=/dev/shm ./db_bench --benchmarks=overwrite --use_existing_db=true --disable_wal=true --duration=30 --compression_type=none --num=100000000 --write_buffer_size=1048576 --writable_file_max_buffer_size=65536 --target_file_size_base=1048576 --max_bytes_for_level_base=4194304 --level0_slowdown_writes_trigger=$(((1 << 31) - 1)) --level0_stop_writes_trigger=$(((1 << 31) - 1)) --statistics=true --rate_limiter_bytes_per_sec=1048576 --rate_limiter_refill_period_us=1000  --threads=32 |& grep -E '(flush|compact)\.write\.bytes'`

   - Before (on branch upstream/master):
   `rocksdb.compact.write.bytes COUNT : 4014162`
   `rocksdb.flush.write.bytes COUNT : 26715832`
    rocksdb.flush.write.bytes/rocksdb.compact.write.bytes ~= 6.66

   - After (on branch rate_limiter_user_pri):
  `rocksdb.compact.write.bytes COUNT : 3807822`
  `rocksdb.flush.write.bytes COUNT : 26098659`
   rocksdb.flush.write.bytes/rocksdb.compact.write.bytes ~= 6.85

Reviewed By: ajkr

Differential Revision: D30577783

Pulled By: hx235

fbshipit-source-id: 0881f2705ffd13ecd331256bde7e8ec874a353f4
2021-08-31 11:24:27 -07:00
..
utilities Fix some minor issues in the Customizable infrastructure (#8566) 2021-08-19 10:10:47 -07:00
advanced_options.h Move old files to warm tier in FIFO compactions (#8310) 2021-08-09 12:51:14 -07:00
c.h Add Bloom/Ribbon hybrid API support (#8679) 2021-08-20 18:00:16 -07:00
cache_bench_tool.h Allow cache_bench/db_bench to use a custom secondary cache (#8312) 2021-05-19 15:26:18 -07:00
cache.h Parallelize secondary cache lookup in MultiGet (#8405) 2021-06-18 09:35:59 -07:00
cleanable.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
compaction_filter.h Make MergeOperator+CompactionFilter/Factory into Customizable Classes (#8481) 2021-08-06 08:27:25 -07:00
compaction_job_stats.h Update compaction statistics to include the amount of data read from blob files (#8022) 2021-03-04 00:43:48 -08:00
comparator.h Make Comparator into a Customizable Object (#8336) 2021-06-11 06:22:59 -07:00
compression_type.h Move CompressionType to its own header file (#7162) 2020-08-03 15:49:31 -07:00
concurrent_task_limiter.h Fix spelling in comments in include/rocksdb/ (#8120) 2021-03-29 05:05:06 -07:00
configurable.h Make Configurable/Customizable options copyable (#8704) 2021-08-25 17:48:08 -07:00
convenience.h Add ObjectRegistry to ConfigOptions (#8166) 2021-05-11 06:47:22 -07:00
customizable.h Make MergeOperator+CompactionFilter/Factory into Customizable Classes (#8481) 2021-08-06 08:27:25 -07:00
data_structure.h Handoff checksum Implementation (#7523) 2021-02-10 22:20:32 -08:00
db_bench_tool.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
db_dump_tool.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
db_stress_tool.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
db.h Add property LiveSstFilesSizeAtTemperature for tiered storage (#8644) 2021-08-15 14:17:45 -07:00
env_encryption.h Make EncryptionProvider and BlockCipher into Customizable objects (#8354) 2021-07-16 07:58:51 -07:00
env.h Implement superior user & mid IO priority level in GenericRateLimiter (#8595) 2021-08-31 11:24:27 -07:00
experimental.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
file_checksum.h Fix spelling in comments in include/rocksdb/ (#8120) 2021-03-29 05:05:06 -07:00
file_system.h Add CreateFrom methods to Env/FileSystem (#8174) 2021-06-15 03:43:48 -07:00
filter_policy.h Add Bloom/Ribbon hybrid API support (#8679) 2021-08-20 18:00:16 -07:00
flush_block_policy.h Make FlushBlockPolicyFactory into a Customizable class (#8432) 2021-07-12 09:04:59 -07:00
functor_wrapper.h Add StartThread type checking wrapper (#8303) 2021-05-19 16:51:13 -07:00
io_status.h Add a new IOStatus subcode to indicate that writes are fenced off (#7374) 2020-09-14 16:04:47 -07:00
iostats_context.h Disable IOStatsContext/PerfContext if no thread local (#8117) 2021-04-13 07:56:59 -07:00
iterator.h Enable backward iterator for keys with user-defined timestamp (#8035) 2021-03-10 11:15:46 -08:00
ldb_tool.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
listener.h Move old files to warm tier in FIFO compactions (#8310) 2021-08-09 12:51:14 -07:00
memory_allocator.h slightly improve jemalloc allocator API header (#7592) 2020-10-28 13:47:12 -07:00
memtablerep.h Improve MemPurge sampling (#8656) 2021-08-13 14:35:41 -07:00
merge_operator.h Make MergeOperator+CompactionFilter/Factory into Customizable Classes (#8481) 2021-08-06 08:27:25 -07:00
metadata.h Add BlobMetaData retrieval methods (#8273) 2021-06-28 08:13:29 -07:00
options.h Add extra information to RemoteCompaction APIs (#8680) 2021-08-23 16:27:38 -07:00
perf_context.h Add a PerfContext counter for secondary cache hits (#8685) 2021-08-20 15:17:30 -07:00
perf_level.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
persistent_cache.h Fix persistent cache on windows (#6932) 2020-06-13 13:28:31 -07:00
rate_limiter.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
rocksdb_namespace.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
secondary_cache.h Make SecondaryCache Customizable (#8480) 2021-07-06 09:18:08 -07:00
slice_transform.h Fix spelling in comments in include/rocksdb/ (#8120) 2021-03-29 05:05:06 -07:00
slice.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
snapshot.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
sst_dump_tool.h Add --version and --help to ldb and sst_dump (#6951) 2020-06-09 10:04:01 -07:00
sst_file_manager.h Use SST file manager to track blob files as well (#8037) 2021-03-17 20:44:49 -07:00
sst_file_reader.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
sst_file_writer.h Add more LSM info to FilterBuildingContext (#8246) 2021-04-30 13:50:13 -07:00
sst_partitioner.h Fix spelling in comments in include/rocksdb/ (#8120) 2021-03-29 05:05:06 -07:00
statistics.h Add statistics support to integrated BlobDB (#8667) 2021-08-17 17:22:31 -07:00
stats_history.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
status.h Fix comments in Status (#8429) 2021-06-21 13:41:08 -07:00
system_clock.h Add a SystemClock class to capture the time functions of an Env (#7858) 2021-01-25 22:09:11 -08:00
table_properties.h Embed original file number in SST table properties (#8686) 2021-08-20 20:40:48 -07:00
table.h Dynamically configure BlockBasedTableOptions.prepopulate_block_cache (#8620) 2021-08-05 19:44:51 -07:00
thread_status.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
threadpool.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
trace_reader_writer.h Update comments, fix typos. (#8721) 2021-08-27 13:16:32 -07:00
trace_record_result.h Add IteratorTraceExecutionResult for iterator related trace records. (#8687) 2021-08-20 15:35:56 -07:00
trace_record.h Add IteratorTraceExecutionResult for iterator related trace records. (#8687) 2021-08-20 15:35:56 -07:00
transaction_log.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
types.h Add more LSM info to FilterBuildingContext (#8246) 2021-04-30 13:50:13 -07:00
universal_compaction.h Fix spelling in comments in include/rocksdb/ (#8120) 2021-03-29 05:05:06 -07:00
version.h Update version.h and HISTORY.md for the 6.24 release (#8688) 2021-08-20 22:28:16 -07:00
wal_filter.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
write_batch_base.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
write_batch.h Fix spelling in comments in include/rocksdb/ (#8120) 2021-03-29 05:05:06 -07:00
write_buffer_manager.h Refactor WriteBufferManager::CacheRep into CacheReservationManager (#8506) 2021-08-24 12:43:31 -07:00