rocksdb/util
Ari Ekmekji 40c64434d4 Parallelize L0-L1 Compaction: Restructure Compaction Job
Summary:
Currently, compactions involving files from Level 0 and Level 1 are
single-threaded because the files in L0, although sorted, are not range
partitioned like the other levels. This means that during L0-L1 compaction each
file from L1 needs to be merged with potentially all the files from L0.

This attempt to parallelize L0-L1 compaction assigns a thread and a
corresponding iterator to each L1 file. Each iterator considers only the key
range found in its L1 file and only the L0 files that contain those keys (and
only the specific portion of those L0 files in which those keys are found). In
this way, the overlap between different iterators working on the same files is
minimized and potentially eliminated.
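
The partitioning described above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: FileRange, SubCompaction, Overlaps and PlanSubCompactions are hypothetical names, keys are compared as plain strings rather than through a RocksDB comparator, and L0 keys that fall in the gaps between L1 files are ignored for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for per-file metadata (not RocksDB's FileMetaData).
struct FileRange {
  std::string smallest;  // smallest key in the file, inclusive
  std::string largest;   // largest key in the file, inclusive
};

// One unit of work: a single L1 file plus the L0 files overlapping its range.
struct SubCompaction {
  std::size_t l1_index;
  std::vector<std::size_t> l0_indices;
};

// Two closed key ranges overlap unless one ends before the other begins.
inline bool Overlaps(const FileRange& a, const FileRange& b) {
  return !(a.largest < b.smallest || b.largest < a.smallest);
}

// Builds one sub-compaction per L1 file. Each sub-compaction lists only the
// L0 files whose key range intersects that L1 file's key range.
std::vector<SubCompaction> PlanSubCompactions(
    const std::vector<FileRange>& l0, const std::vector<FileRange>& l1) {
  std::vector<SubCompaction> plan;
  for (std::size_t i = 0; i < l1.size(); ++i) {
    assert(l1[i].smallest <= l1[i].largest);
    SubCompaction sub{i, {}};
    for (std::size_t j = 0; j < l0.size(); ++j) {
      if (Overlaps(l0[j], l1[i])) {
        sub.l0_indices.push_back(j);
      }
    }
    plan.push_back(std::move(sub));
  }
  return plan;
}
```

In this sketch an L0 file may appear in several sub-compactions, but each one would read only the slice of that file inside its own L1 key range, which is what keeps the iterators from overlapping.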

The first step is to restructure the compaction logic to break L0-L1 compactions
into multiple, smaller, sequential compactions. Eventually each of these smaller
jobs will be run simultaneously. Areas to pay extra attention to are:

  # Correct aggregation of compaction job statistics across multiple threads
  # Proper opening/closing of output files (make sure each thread's is unique)
  # Keys that span multiple L1 files
  # Skewed distributions of keys within L0 files
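
For the first item, one possible approach (an assumption for illustration, not necessarily what this commit does) is to give every sub-job its own statistics record and sum the records once all sub-jobs have finished, so that no counter is shared between threads while they run. SubJobStats and AggregateStats below are hypothetical, simplified names.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, simplified per-sub-job statistics record.
struct SubJobStats {
  std::uint64_t num_input_records = 0;
  std::uint64_t num_output_records = 0;
  std::uint64_t num_output_files = 0;
};

// Sums the per-sub-job records into one total. Intended to be called only
// after every sub-job has completed, so no locking is needed here.
SubJobStats AggregateStats(const std::vector<SubJobStats>& per_job) {
  SubJobStats total;
  for (const SubJobStats& s : per_job) {
    total.num_input_records += s.num_input_records;
    total.num_output_records += s.num_output_records;
    total.num_output_files += s.num_output_files;
  }
  return total;
}
```

Keeping the records separate until the end avoids contention on shared counters, at the cost of one extra pass over the sub-jobs.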

Test Plan: Make and run db_test (newer version has separate compaction tests) and compaction_job_stats_test

Reviewers: igor, noetzli, anthony, sdong, yhchiang

Reviewed By: yhchiang

Subscribers: MarkCallaghan, dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D42699
2015-08-03 11:32:14 -07:00
allocator.h Enforce write buffer memory limit across column families 2014-12-02 12:09:20 -08:00
arena_test.cc rocksdb: switch to gtest 2015-03-17 14:08:00 -07:00
arena.cc Commit both PR and internal code review changes 2015-07-07 16:58:20 -07:00
arena.h Removing unnecessary kInlineSize 2015-03-12 21:13:53 +03:00
auto_roll_logger_test.cc Fix when output level is 0 of universal compaction with trivial move 2015-07-27 14:25:57 -07:00
auto_roll_logger.cc Fix WinEnv::NowMicrosec 2015-07-22 14:36:43 -07:00
auto_roll_logger.h Windows Port from Microsoft 2015-07-01 16:13:56 -07:00
autovector_test.cc Make autovector_test runnable in ROCKSDB_LITE 2015-06-18 15:58:00 -07:00
autovector.h "make format" against last 10 commits 2015-07-13 13:50:18 -07:00
bloom_test.cc rocksdb: switch to gtest 2015-03-17 14:08:00 -07:00
bloom.cc fix typos 2015-04-25 18:14:27 +09:00
build_version.h build: do not relink every single binary just for a timestamp 2015-02-19 13:11:10 -08:00
cache_bench.cc Fix -Wshadow for tools 2014-11-07 15:04:30 -08:00
cache_test.cc Fix memory leaks in PinnedUsageTest 2015-06-19 09:43:08 -07:00
cache.cc Add Cache.GetPinnedUsageUsage() 2015-06-18 13:56:31 -07:00
channel.h Multithreaded backup and restore in BackupEngineImpl 2015-07-02 11:35:51 -07:00
coding_test.cc rocksdb: switch to gtest 2015-03-17 14:08:00 -07:00
coding.cc Removing BitStream* functions 2014-08-19 06:48:21 -07:00
coding.h Turn on -Wshorten-64-to-32 and fix all the errors 2014-11-11 16:47:22 -05:00
compaction_job_stats_impl.cc Count number of corrupt keys during compaction 2015-07-28 16:41:40 -07:00
comparator.cc rocksdb: Add missing override 2015-02-26 11:28:41 -08:00
compression.h Fail DB::Open() when the requested compression is not available 2015-06-18 14:55:05 -07:00
crc32c_test.cc rocksdb: switch to gtest 2015-03-17 14:08:00 -07:00
crc32c.cc Print Fast CRC32 support information in DB LOG 2015-07-10 17:59:36 -07:00
crc32c.h Print Fast CRC32 support information in DB LOG 2015-07-10 17:59:36 -07:00
db_info_dumper.cc extend temp str buffer size 2015-07-16 13:56:17 +08:00
db_info_dumper.h Fix iOS compile with -Wshorten-64-to-32 2014-11-13 14:39:30 -05:00
db_test_util.cc Parallelize L0-L1 Compaction: Restructure Compaction Job 2015-08-03 11:32:14 -07:00
db_test_util.h Parallelize L0-L1 Compaction: Restructure Compaction Job 2015-08-03 11:32:14 -07:00
dynamic_bloom_test.cc rocksdb: switch to gtest 2015-03-17 14:08:00 -07:00
dynamic_bloom.cc Enforce write buffer memory limit across column families 2014-12-02 12:09:20 -08:00
dynamic_bloom.h Commit both PR and internal code review changes 2015-07-07 16:58:20 -07:00
env_hdfs.cc Improved FileExists API 2015-07-20 17:20:40 -07:00
env_posix.cc cleaned up PosixMmapFile a little 2015-07-22 12:27:39 -07:00
env_test.cc Move rate_limiter, write buffering, most perf context instrumentation and most random kill out of Env 2015-07-17 16:58:18 -07:00
env.cc Ensure Windows build w/o port/port.h in public headers 2015-07-16 12:10:16 -07:00
event_logger_test.cc rocksdb: switch to gtest 2015-03-17 14:08:00 -07:00
event_logger.cc Allow EventLogger to directly log from a JSONWriter. 2015-05-21 15:39:30 -07:00
event_logger.h Added JSON manifest dump option to ldb command 2015-07-17 10:07:40 -07:00
file_reader_writer_test.cc RangeSync not to sync last 1MB of the file 2015-07-21 16:22:40 -07:00
file_reader_writer.cc RangeSync not to sync last 1MB of the file 2015-07-21 16:22:40 -07:00
file_reader_writer.h Move rate_limiter, write buffering, most perf context instrumentation and most random kill out of Env 2015-07-17 16:58:18 -07:00
file_util.cc Move rate_limiter, write buffering, most perf context instrumentation and most random kill out of Env 2015-07-17 16:58:18 -07:00
file_util.h Provide openable snapshots 2014-11-14 11:38:26 -08:00
filelock_test.cc rocksdb: switch to gtest 2015-03-17 14:08:00 -07:00
filter_policy.cc Add appropriate LICENSE and Copyright message. 2013-10-16 17:48:41 -07:00
hash_cuckoo_rep.cc "make format" against last 10 commits 2015-07-13 13:50:18 -07:00
hash_cuckoo_rep.h Enforce write buffer memory limit across column families 2014-12-02 12:09:20 -08:00
hash_linklist_rep.cc "make format" against last 10 commits 2015-07-13 13:50:18 -07:00
hash_linklist_rep.h Enforce write buffer memory limit across column families 2014-12-02 12:09:20 -08:00
hash_skiplist_rep.cc rocksdb: Add missing override 2015-02-26 11:28:41 -08:00
hash_skiplist_rep.h Enforce write buffer memory limit across column families 2014-12-02 12:09:20 -08:00
hash.cc Turn on -Wshorten-64-to-32 and fix all the errors 2014-11-11 16:47:22 -05:00
hash.h Introduce GetThreadList API 2014-11-20 10:49:32 -08:00
heap_test.cc Fix compile on Mac 2015-07-16 11:22:21 +02:00
heap.h Replace std::priority_queue in MergingIterator with custom heap, take 2 2015-07-15 03:34:40 -07:00
histogram_test.cc fix typos 2015-04-25 18:14:27 +09:00
histogram.cc fix typos 2015-04-25 18:14:27 +09:00
histogram.h "make format" against last 10 commits 2015-07-13 13:50:18 -07:00
instrumented_mutex.cc Perf Context to report DB mutex waiting time 2015-02-09 17:55:12 -08:00
instrumented_mutex.h Add a counter for collecting the wait time on db mutex. 2015-02-04 21:39:45 -08:00
iostats_context_imp.h Removed two unused macros in iostats_context 2015-06-12 10:45:02 -07:00
iostats_context.cc Ensure Windows build w/o port/port.h in public headers 2015-07-16 12:10:16 -07:00
ldb_cmd_execute_result.h Windows Port from Microsoft 2015-07-01 16:13:56 -07:00
ldb_cmd.cc dump_manifest supports DB with more number of levels 2015-08-03 11:02:09 -07:00
ldb_cmd.h Added JSON manifest dump option to ldb command 2015-07-17 10:07:40 -07:00
ldb_tool.cc Added 'dump_live_files' command to ldb tool. 2014-12-12 17:50:36 -08:00
log_buffer.cc Commit both PR and internal code review changes 2015-07-07 16:58:20 -07:00
log_buffer.h Windows Port from Microsoft 2015-07-01 16:13:56 -07:00
log_write_bench.cc Fix more gflag namespace issues 2014-05-09 08:41:02 -07:00
logging.cc Make the benchmark scripts configurable and add tests 2015-03-30 11:28:25 -07:00
logging.h Make the benchmark scripts configurable and add tests 2015-03-30 11:28:25 -07:00
manual_compaction_test.cc Ensure Windows build w/o port/port.h in public headers 2015-07-16 12:10:16 -07:00
memenv_test.cc Improved FileExists API 2015-07-20 17:20:40 -07:00
memenv.cc Improved FileExists API 2015-07-20 17:20:40 -07:00
mock_env_test.cc Improved FileExists API 2015-07-20 17:20:40 -07:00
mock_env.cc Improved FileExists API 2015-07-20 17:20:40 -07:00
mock_env.h Improved FileExists API 2015-07-20 17:20:40 -07:00
murmurhash.cc Add appropriate LICENSE and Copyright message. 2013-10-16 17:48:41 -07:00
murmurhash.h Turn on -Wshorten-64-to-32 and fix all the errors 2014-11-11 16:47:22 -05:00
mutable_cf_options.cc Don't let flushes preempt compactions 2015-07-17 12:02:52 -07:00
mutable_cf_options.h Parallelize L0-L1 Compaction: Restructure Compaction Job 2015-08-03 11:32:14 -07:00
mutexlock.h Add separate Read/WriteUnlock methods in MutexRW. 2014-06-16 15:41:46 -07:00
options_builder.cc Remove the compability check on log2 OS_ANDROID as it's already blocked by ROCKSDB_LITE 2014-12-04 13:56:14 -08:00
options_helper.cc Don't let flushes preempt compactions 2015-07-17 12:02:52 -07:00
options_helper.h Missing header in build on CentOS 2014-11-18 22:21:02 +01:00
options_test.cc Don't let flushes preempt compactions 2015-07-17 12:02:52 -07:00
options.cc Parallelize L0-L1 Compaction: Restructure Compaction Job 2015-08-03 11:32:14 -07:00
perf_context_imp.h more times in perf_context and iostats_context 2015-06-02 02:07:58 -07:00
perf_context.cc Ensure Windows build w/o port/port.h in public headers 2015-07-16 12:10:16 -07:00
perf_level_imp.h Windows Port from Microsoft 2015-07-01 16:13:56 -07:00
perf_level.cc Windows Port from Microsoft 2015-07-01 16:13:56 -07:00
perf_step_timer.h more times in perf_context and iostats_context 2015-06-02 02:07:58 -07:00
posix_logger.h Commit both PR and internal code review changes 2015-07-07 16:58:20 -07:00
random.h Add appropriate LICENSE and Copyright message. 2013-10-16 17:48:41 -07:00
rate_limiter_test.cc Enable dynamic changing of rate limiter's bytes_per_second 2015-03-18 15:35:55 -07:00
rate_limiter.cc Windows Port from Microsoft 2015-07-01 16:13:56 -07:00
rate_limiter.h Windows Port from Microsoft 2015-07-01 16:13:56 -07:00
scoped_arena_iterator.h Remove path with arena==nullptr from NewInternalIterator 2014-09-04 17:40:41 -07:00
skiplistrep.cc Allow GetApproximateSize() to include mem table size if it is skip list memtable 2015-06-16 18:13:23 -07:00
slice_transform_test.cc rocksdb: switch to gtest 2015-03-17 14:08:00 -07:00
slice.cc "make format" against last 10 commits 2015-07-13 13:50:18 -07:00
sst_dump_test.cc Compression sizes option for sst_dump_tool 2015-07-29 17:42:13 -07:00
sst_dump_tool_imp.h Compression sizes option for sst_dump_tool 2015-07-29 17:42:13 -07:00
sst_dump_tool.cc Fixing fprintf of non string literal 2015-07-30 17:46:47 -07:00
statistics.cc Fix assert in histogramData 2015-01-23 18:10:52 -08:00
statistics.h make statistics forward-able 2014-07-28 12:10:49 -07:00
status.cc Deprecate WriteOptions::timeout_hint_us 2015-07-14 09:35:48 +02:00
stl_wrappers.h Killing Transform Rep 2013-12-03 12:42:15 -08:00
stop_watch.h Change StopWatch interface 2014-07-28 12:22:37 -07:00
string_util.cc Clean up StringSplit 2014-11-21 11:05:28 -05:00
string_util.h Windows Port from Microsoft 2015-07-01 16:13:56 -07:00
sync_point.cc Move rate_limiter, write buffering, most perf context instrumentation and most random kill out of Env 2015-07-17 16:58:18 -07:00
sync_point.h Move rate_limiter, write buffering, most perf context instrumentation and most random kill out of Env 2015-07-17 16:58:18 -07:00
testharness.cc rocksdb: print status error message when (ASSERT|EXPECT)_OK fails 2015-03-19 17:32:43 -07:00
testharness.h rocksdb: print status error message when (ASSERT|EXPECT)_OK fails 2015-03-19 17:32:43 -07:00
testutil.cc Move rate_limiter, write buffering, most perf context instrumentation and most random kill out of Env 2015-07-17 16:58:18 -07:00
testutil.h Move rate_limiter, write buffering, most perf context instrumentation and most random kill out of Env 2015-07-17 16:58:18 -07:00
thread_list_test.cc rocksdb: switch to gtest 2015-03-17 14:08:00 -07:00
thread_local_test.cc Windows Port from Microsoft 2015-07-01 16:13:56 -07:00
thread_local.cc "make format" against last 10 commits 2015-07-13 13:50:18 -07:00
thread_local.h "make format" against last 10 commits 2015-07-13 13:50:18 -07:00
thread_operation.h Deprecate CompactionFilterV2 2015-07-17 18:59:11 +02:00
thread_status_impl.cc Windows Port from Microsoft 2015-07-01 16:13:56 -07:00
thread_status_updater_debug.cc Allow GetThreadList() to indicate a thread is doing Compaction. 2015-01-13 00:04:08 -08:00
thread_status_updater.cc Windows Port from Microsoft 2015-07-01 16:13:56 -07:00
thread_status_updater.h Windows Port from Microsoft 2015-07-01 16:13:56 -07:00
thread_status_util_debug.cc Fix bad performance in debug mode 2015-04-13 15:58:45 -07:00
thread_status_util.cc Only initialize the ThreadStatusData when necessary. 2015-06-17 11:21:18 -07:00
thread_status_util.h Only initialize the ThreadStatusData when necessary. 2015-06-17 11:21:18 -07:00
vectorrep.cc Windows Port from Microsoft 2015-07-01 16:13:56 -07:00
xfunc.cc Merge the latest changes from github/master 2015-07-02 17:23:41 -07:00
xfunc.h Merge the latest changes from github/master 2015-07-02 17:23:41 -07:00
xxhash.cc Prevent xxhash symbols from polluting global namespace 2015-03-12 12:07:10 -07:00
xxhash.h Prevent xxhash symbols from polluting global namespace 2015-03-12 12:07:10 -07:00