rocksdb/util
Burton Li a8b9891f95 Concurrent task limiter for compaction thread control (#4332)
Summary:
The PR is targeting to resolve the issue of:
https://github.com/facebook/rocksdb/issues/3972#issue-330771918

We have a rocksdb created with leveled-compaction with multiple column families (CFs), some of CFs are using HDD to store big and less frequently accessed data and others are using SSD.
When there are continuously write traffics going on to all CFs, the compaction thread pool is mostly occupied by those slow HDD compactions, which blocks fully utilize SSD bandwidth.
Since atomic write and transaction is needed across CFs, so splitting it to multiple rocksdb instance is not an option for us.

With the compaction thread control, we got 30%+ HDD write throughput gain, and also a lot smooth SSD write since less write stall happening.

ConcurrentTaskLimiter can be shared with multi-CFs across rocksdb instances, so the feature does not only work for multi-CFs scenarios, but also for multi-rocksdbs scenarios, who need disk IO resource control per tenant.

The usage is straight forward:
e.g.:

//
// Enable compaction thread limiter thru ColumnFamilyOptions
//
std::shared_ptr<ConcurrentTaskLimiter> ctl(NewConcurrentTaskLimiter("foo_limiter", 4));
Options options;
ColumnFamilyOptions cf_opt(options);
cf_opt.compaction_thread_limiter = ctl;
...

//
// Compaction thread limiter can be tuned or disabled on-the-fly
//
ctl->SetMaxOutstandingTask(12); // enlarge to 12 tasks
...
ctl->ResetMaxOutstandingTask(); // disable (bypass) thread limiter
ctl->SetMaxOutstandingTask(-1); // Same as above
...
ctl->SetMaxOutstandingTask(0);  // full throttle (0 task)

//
// Sharing compaction thread limiter among CFs (to resolve multiple storage perf issue)
//
std::shared_ptr<ConcurrentTaskLimiter> ctl_ssd(NewConcurrentTaskLimiter("ssd_limiter", 8));
std::shared_ptr<ConcurrentTaskLimiter> ctl_hdd(NewConcurrentTaskLimiter("hdd_limiter", 4));
Options options;
ColumnFamilyOptions cf_opt_ssd1(options);
ColumnFamilyOptions cf_opt_ssd2(options);
ColumnFamilyOptions cf_opt_hdd1(options);
ColumnFamilyOptions cf_opt_hdd2(options);
ColumnFamilyOptions cf_opt_hdd3(options);

// SSD CFs
cf_opt_ssd1.compaction_thread_limiter = ctl_ssd;
cf_opt_ssd2.compaction_thread_limiter = ctl_ssd;

// HDD CFs
cf_opt_hdd1.compaction_thread_limiter = ctl_hdd;
cf_opt_hdd2.compaction_thread_limiter = ctl_hdd;
cf_opt_hdd3.compaction_thread_limiter = ctl_hdd;

...

//
// The limiter is disabled by default (or set to nullptr explicitly)
//
Options options;
ColumnFamilyOptions cf_opt(options);
cf_opt.compaction_thread_limiter = nullptr;
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4332

Differential Revision: D13226590

Pulled By: siying

fbshipit-source-id: 14307aec55b8bd59c8223d04aa6db3c03d1b0c1d
2018-12-13 13:18:28 -08:00
..
aligned_buffer.h Fix Copying of data between buffers in FilePrefetchBuffer (#4100) 2018-07-11 12:28:13 -07:00
allocator.h Change RocksDB License 2017-07-15 16:11:23 -07:00
arena_test.cc arena: derive alignment unit from std::max_align_t 2017-10-17 11:13:19 -07:00
arena.cc comment unused parameters to turn on -Wunused-parameter flag 2018-04-12 17:59:16 -07:00
arena.h arena: derive alignment unit from std::max_align_t 2017-10-17 11:13:19 -07:00
auto_roll_logger_test.cc Update all unique/shared_ptr instances to be qualified with namespace std (#4638) 2018-11-09 11:19:58 -08:00
auto_roll_logger.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
auto_roll_logger.h Fix the Logger::Close() and DBImpl::Close() design pattern 2018-02-23 13:57:26 -08:00
autovector_test.cc fix memory leak in two_level_iterator 2018-04-15 17:26:26 -07:00
autovector.h optimized the performance of autovector::emplace_back. (#4606) 2018-11-13 14:39:03 -08:00
bloom_test.cc fix gflags namespace 2017-12-01 10:42:05 -08:00
bloom.cc Improve FullFilterBitsReader::HashMayMatch's doc (#4202) 2018-08-06 11:13:18 -07:00
build_version.cc.in Makefile: generate util/build_version.cc from .in file (#1384) 2016-10-25 11:31:39 -07:00
build_version.h Change RocksDB License 2017-07-15 16:11:23 -07:00
cast_util.h Add a missing "once" in .h 2017-07-31 12:12:03 -07:00
channel.h Support pragma once in all header files and cleanup some warnings (#4339) 2018-09-05 18:13:31 -07:00
coding_test.cc Coding.h: Added Fixed16 support (#4142) 2018-07-16 23:43:41 -07:00
coding.cc Enable MSVC W4 with a few exceptions. Fix warnings and bugs 2017-10-19 10:57:12 -07:00
coding.h Add path to WritableFileWriter. (#4039) 2018-08-23 10:12:58 -07:00
compaction_job_stats_impl.cc comment unused parameters to turn on -Wunused-parameter flag 2018-04-12 17:59:16 -07:00
comparator.cc Improve point-lookup performance using a data block hash index (#4174) 2018-08-15 14:30:03 -07:00
compression_context_cache.cc run make format for PR 3838 (#3954) 2018-06-05 12:58:02 -07:00
compression_context_cache.h run make format for PR 3838 (#3954) 2018-06-05 12:58:02 -07:00
compression.h s/CacheAllocator/MemoryAllocator/g (#4590) 2018-10-26 14:30:30 -07:00
concurrent_arena.cc Cap concurrent arena's shard block size to 128KB (#4147) 2018-07-18 10:43:54 -07:00
concurrent_arena.h util: Fix coverity issues 2017-11-03 14:42:08 -07:00
concurrent_task_limiter_impl.cc Concurrent task limiter for compaction thread control (#4332) 2018-12-13 13:18:28 -08:00
concurrent_task_limiter_impl.h Concurrent task limiter for compaction thread control (#4332) 2018-12-13 13:18:28 -08:00
core_local.h Change RocksDB License 2017-07-15 16:11:23 -07:00
crc32c_ppc_asm.S Updated CRC32 Power Optimization Changes 2017-08-31 14:16:30 -07:00
crc32c_ppc_constants.h Support pragma once in all header files and cleanup some warnings (#4339) 2018-09-05 18:13:31 -07:00
crc32c_ppc.c Updated CRC32 Power Optimization Changes 2017-08-31 14:16:30 -07:00
crc32c_ppc.h Support pragma once in all header files and cleanup some warnings (#4339) 2018-09-05 18:13:31 -07:00
crc32c_test.cc Build and tests fixes for Solaris Sparc (#4000) 2018-06-15 12:42:53 -07:00
crc32c.cc Add GCC 8 to Travis (#3433) 2018-07-13 10:58:06 -07:00
crc32c.h Updated CRC32 Power Optimization Changes 2017-08-31 14:16:30 -07:00
delete_scheduler_test.cc Auto recovery from out of space errors (#4164) 2018-09-15 13:43:04 -07:00
delete_scheduler.cc Update all unique/shared_ptr instances to be qualified with namespace std (#4638) 2018-11-09 11:19:58 -08:00
delete_scheduler.h In delete scheduler, before ftruncate file for slow delete, check whether there is other hard links (#4093) 2018-07-09 15:28:12 -07:00
duplicate_detector.h Improve db_stress with transactions 2018-04-18 16:32:35 -07:00
dynamic_bloom_test.cc Two code changes to make "clang analyze" happy (#4292) 2018-08-20 17:43:41 -07:00
dynamic_bloom.cc Use nullptr instead of NULL / 0 more consistently. 2018-03-07 12:42:12 -08:00
dynamic_bloom.h Enable MSVC W4 with a few exceptions. Fix warnings and bugs 2017-10-19 10:57:12 -07:00
event_logger_test.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
event_logger.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
event_logger.h Change RocksDB License 2017-07-15 16:11:23 -07:00
fault_injection_test_env.cc Update all unique/shared_ptr instances to be qualified with namespace std (#4638) 2018-11-09 11:19:58 -08:00
fault_injection_test_env.h Update all unique/shared_ptr instances to be qualified with namespace std (#4638) 2018-11-09 11:19:58 -08:00
file_reader_writer_test.cc Update all unique/shared_ptr instances to be qualified with namespace std (#4638) 2018-11-09 11:19:58 -08:00
file_reader_writer.cc Direct I/O Close() shouldn't rewrite the last block (#4771) 2018-12-11 13:55:02 -08:00
file_reader_writer.h Update all unique/shared_ptr instances to be qualified with namespace std (#4638) 2018-11-09 11:19:58 -08:00
file_util.cc Update all unique/shared_ptr instances to be qualified with namespace std (#4638) 2018-11-09 11:19:58 -08:00
file_util.h Sync CURRENT file during checkpoint (#4322) 2018-08-28 12:43:18 -07:00
filelock_test.cc Per-thread unique test db names (#4135) 2018-07-13 17:27:39 -07:00
filename.cc Store the return value of Fsync for check 2018-09-14 13:29:56 -07:00
filename.h BlobDB: GetLiveFiles and GetLiveFilesMetadata return relative path (#4326) 2018-08-31 12:12:49 -07:00
filter_policy.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
gflags_compat.h fix gflags namespace 2017-12-01 10:42:05 -08:00
hash_map.h Change RocksDB License 2017-07-15 16:11:23 -07:00
hash_test.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
hash.cc Add GCC 8 to Travis (#3433) 2018-07-13 10:58:06 -07:00
hash.h Change RocksDB License 2017-07-15 16:11:23 -07:00
heap_test.cc fix gflags namespace 2017-12-01 10:42:05 -08:00
heap.h Clean up FragmentedRangeTombstoneList (#4692) 2018-11-28 15:29:02 -08:00
jemalloc_nodump_allocator.cc Extend Transaction::GetForUpdate with do_validate (#4680) 2018-12-06 17:49:00 -08:00
jemalloc_nodump_allocator.h JemallocNodumpAllocator: option to limit tcache memory usage (#4736) 2018-11-29 17:33:40 -08:00
kv_map.h Change RocksDB License 2017-07-15 16:11:23 -07:00
log_buffer.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
log_buffer.h Change RocksDB License 2017-07-15 16:11:23 -07:00
log_write_bench.cc Update all unique/shared_ptr instances to be qualified with namespace std (#4638) 2018-11-09 11:19:58 -08:00
logging.h FIX #3820: shorter file name in logs (#4616) 2018-11-01 16:19:01 -07:00
memory_allocator.h s/CacheAllocator/MemoryAllocator/g (#4590) 2018-10-26 14:30:30 -07:00
memory_usage.h Change RocksDB License 2017-07-15 16:11:23 -07:00
mock_time_env.h Fix RepeatableThreadTest::MockEnvTest hang (#4560) 2018-10-21 20:17:18 -07:00
murmurhash.cc Add GCC 8 to Travis (#3433) 2018-07-13 10:58:06 -07:00
murmurhash.h Change RocksDB License 2017-07-15 16:11:23 -07:00
mutexlock.h Change RocksDB License 2017-07-15 16:11:23 -07:00
ppc-opcode.h Support pragma once in all header files and cleanup some warnings (#4339) 2018-09-05 18:13:31 -07:00
random.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
random.h Change RocksDB License 2017-07-15 16:11:23 -07:00
rate_limiter_test.cc comment unused parameters to turn on -Wunused-parameter flag 2018-04-12 17:59:16 -07:00
rate_limiter.cc rate limit auto-tuning 2017-10-04 19:15:01 -07:00
rate_limiter.h rate limit auto-tuning 2017-10-04 19:15:01 -07:00
repeatable_thread_test.cc Utility to run task periodically in a thread (#4423) 2018-09-27 15:28:00 -07:00
repeatable_thread.h Fix RepeatableThreadTest::MockEnvTest hang (#4560) 2018-10-21 20:17:18 -07:00
set_comparator.h WritePrepared Txn: Move DuplicateDetector to util 2018-03-05 23:57:12 -08:00
slice_transform_test.cc Update all unique/shared_ptr instances to be qualified with namespace std (#4638) 2018-11-09 11:19:58 -08:00
slice.cc use user_key and iterate_upper_bound to determine compatibility of bloom filters (#3899) 2018-06-26 15:57:26 -07:00
sst_file_manager_impl.cc Fix regression test failures introduced by PR #4164 (#4375) 2018-09-17 13:14:07 -07:00
sst_file_manager_impl.h Fix regression test failures introduced by PR #4164 (#4375) 2018-09-17 13:14:07 -07:00
status.cc move static msgs out of Status class (#4144) 2018-07-23 15:44:16 -07:00
stderr_logger.h Change RocksDB License 2017-07-15 16:11:23 -07:00
stop_watch.h Exclude time waiting for rate limiter from rocksdb.sst.read.micros (#4102) 2018-07-13 18:44:14 -07:00
string_util.cc Several small "fixes" 2018-02-15 16:57:37 -08:00
string_util.h Change RocksDB License 2017-07-15 16:11:23 -07:00
sync_point_impl.cc Avoid acquiring SyncPoint mutex when it is disabled (#3991) 2018-06-13 13:13:18 -07:00
sync_point_impl.h Avoid acquiring SyncPoint mutex when it is disabled (#3991) 2018-06-13 13:13:18 -07:00
sync_point.cc Add read retry support to log reader (#4394) 2018-10-19 11:53:00 -07:00
sync_point.h Fix singleton destruction order of PosixEnv and SyncPoint (#3951) 2018-06-04 15:58:46 -07:00
testharness.cc Per-thread unique test db names (#4135) 2018-07-13 17:27:39 -07:00
testharness.h Per-thread unique test db names (#4135) 2018-07-13 17:27:39 -07:00
testutil.cc Backup engine support for direct I/O reads (#4640) 2018-11-13 11:17:25 -08:00
testutil.h Backup engine support for direct I/O reads (#4640) 2018-11-13 11:17:25 -08:00
thread_list_test.cc Use nullptr instead of NULL / 0 more consistently. 2018-03-07 12:42:12 -08:00
thread_local_test.cc Comment out unused variables 2018-03-05 13:13:41 -08:00
thread_local.cc Enable building of ARM32 (#4349) 2018-10-09 16:58:25 -07:00
thread_local.h Provide a way to override windows memory allocator with jemalloc for ZSTD 2018-06-04 12:12:48 -07:00
thread_operation.h Add inline comments to flush job (#4464) 2018-10-05 15:41:17 -07:00
threadpool_imp.cc Small issues (#4564) 2018-10-23 10:35:57 -07:00
threadpool_imp.h Support lowering CPU priority of background threads 2018-04-24 08:41:51 -07:00
timer_queue_test.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
timer_queue.h Change RocksDB License 2017-07-15 16:11:23 -07:00
trace_replay.cc Add the max trace file size limitation option to Tracing (#4610) 2018-11-27 14:27:05 -08:00
trace_replay.h Add the max trace file size limitation option to Tracing (#4610) 2018-11-27 14:27:05 -08:00
transaction_test_util.cc C++17 support (#4482) 2018-10-11 10:50:04 -07:00
transaction_test_util.h WritePrepared Txn: stress test 2017-12-06 09:42:28 -08:00
util.h Add GCC 8 to Travis (#3433) 2018-07-13 10:58:06 -07:00
vector_iterator.h Use only "local" range tombstones during Get (#4449) 2018-10-24 12:31:12 -07:00
xxhash.cc xxhash 64 support 2018-11-01 15:44:06 -07:00
xxhash.h Move #include outside of namespace (#4629) 2018-11-06 17:18:28 -08:00