rocksdb/db
Igor Canadi 7a3577519f Don't artificially inflate L0 score
Summary:
This turns out to be pretty bad because if we prioritize L0->L1 then L1 can grow artificially large, which makes L0->L1 more and more expensive. For example:
256MB @ L0 + 256MB @ L1 --> 512MB @ L1
256MB @ L0 + 512MB @ L1 --> 768MB @ L1
256MB @ L0 + 768MB @ L1 --> 1GB @ L1

....

256MB @ L0 + 10GB @ L1 --> 10.2GB @ L1

At some point we need to start compacting L1->L2 to speed up L0->L1.

Test Plan:
The performance improvement is massive for heavy write workload. This is the benchmark I ran: https://phabricator.fb.com/P19842671. Before this change, the benchmark took 47 minutes to complete. After, the benchmark finished in 2minutes. You can see full results here: https://phabricator.fb.com/P19842674

Also, we ran this diff on MongoDB on RocksDB on one replicaset. Before the change, our initial sync was so slow that it couldn't keep up with primary writes. After the change, the import finished without any issues

Reviewers: dynamike, MarkCallaghan, rven, yhchiang, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D38637
2015-05-21 11:40:48 -07:00
..
builder.cc Allow GetThreadList to report Flush properties. 2015-05-15 23:22:22 -07:00
builder.h Add more table properties to EventLogger 2015-05-12 15:53:55 -07:00
c_test.c rocksdb: Fixed 'Dead assignment' and 'Dead initialization' scan-build warnings 2015-02-23 14:10:09 -08:00
c.cc Deprecate removeScanCountLimit in NewLRUCache 2015-03-17 15:04:37 -07:00
column_family_test.cc Fix flakiness in column_family_test 2015-05-05 15:59:02 -07:00
column_family.cc CompactRange skips levels 1 to base_level -1 for dynamic level base size 2015-05-18 10:54:11 -07:00
column_family.h CompactRange skips levels 1 to base_level -1 for dynamic level base size 2015-05-18 10:54:11 -07:00
compact_files_test.cc rocksdb: switch to gtest 2015-03-17 14:08:00 -07:00
compaction_job_test.cc Cleanup CompactionJob 2015-05-05 19:01:12 -07:00
compaction_job.cc Add more table properties to EventLogger 2015-05-12 15:53:55 -07:00
compaction_job.h Allow GetThreadList() to report basic compaction operation properties. 2015-05-06 22:51:06 -07:00
compaction_picker_test.cc Reset parent_index and base_index when picking files marked for compaction 2015-05-12 11:16:25 -07:00
compaction_picker.cc CompactRange skips levels 1 to base_level -1 for dynamic level base size 2015-05-18 10:54:11 -07:00
compaction_picker.h Optimize GetRange Function 2015-05-05 09:57:47 -07:00
compaction.cc Universal Compaction with multiple levels won't allocate up to output size 2015-05-13 14:15:46 -07:00
compaction.h Cleanup CompactionJob 2015-05-05 19:01:12 -07:00
comparator_db_test.cc rocksdb: Remove #include "util/string_util.h" from util/testharness.h 2015-03-19 17:29:37 -07:00
corruption_test.cc fix typos 2015-04-25 18:14:27 +09:00
cuckoo_table_db_test.cc rocksdb: switch to gtest 2015-03-17 14:08:00 -07:00
db_bench.cc Don't artificially inflate L0 score 2015-05-21 11:40:48 -07:00
db_filesnapshot.cc Don't delete files when column family is dropped 2015-03-19 17:04:29 -07:00
db_impl_debug.cc Clean up old log files in background threads 2015-03-30 15:04:10 -04:00
db_impl_experimental.cc Don't compact bottommost level in SuggestCompactRange 2015-04-29 13:35:48 -07:00
db_impl_readonly.cc Move GetThreadList() feature under Env. 2014-12-22 12:20:17 -08:00
db_impl_readonly.h Block ReadOnlyDB in ROCKSDB_LITE 2014-11-26 11:37:59 -08:00
db_impl.cc [Public API Change] Make DB::GetDbIdentity() be const function. 2015-05-21 11:01:48 -07:00
db_impl.h [Public API Change] Make DB::GetDbIdentity() be const function. 2015-05-21 11:01:48 -07:00
db_iter_test.cc rocksdb: Remove #include "util/string_util.h" from util/testharness.h 2015-03-19 17:29:37 -07:00
db_iter.cc fix typos 2015-04-25 18:14:27 +09:00
db_iter.h reduce references to cfd->options() in DBImpl 2014-09-08 15:04:34 -07:00
db_test.cc [Public API Change] Make DB::GetDbIdentity() be const function. 2015-05-21 11:01:48 -07:00
dbformat_test.cc rocksdb: switch to gtest 2015-03-17 14:08:00 -07:00
dbformat.cc Turn on -Wshorten-64-to-32 and fix all the errors 2014-11-11 16:47:22 -05:00
dbformat.h Abstract out SetMaxPossibleForUserKey() and SetMinPossibleForUserKey 2015-04-23 18:08:37 -07:00
deletefile_test.cc rocksdb: Remove #include "util/string_util.h" from util/testharness.h 2015-03-19 17:29:37 -07:00
event_logger_helpers.cc Add more table properties to EventLogger 2015-05-12 15:53:55 -07:00
event_logger_helpers.h Add more table properties to EventLogger 2015-05-12 15:53:55 -07:00
experimental.cc Implement DB::PromoteL0 method 2015-04-23 12:10:36 -07:00
fault_injection_test.cc fault_injection_test: add a test case to cover log syncing after a log roll 2015-04-09 16:15:42 -07:00
file_indexer_test.cc Fix possible SIGSEGV in CompactRange (github issue #596) 2015-04-29 10:52:31 -07:00
file_indexer.cc Fix possible SIGSEGV in CompactRange (github issue #596) 2015-04-29 10:52:31 -07:00
file_indexer.h Turn on -Wshorten-64-to-32 and fix all the errors 2014-11-11 16:47:22 -05:00
filename_test.cc rocksdb: switch to gtest 2015-03-17 14:08:00 -07:00
filename.cc Sync manifest file when initializing it 2015-01-22 14:32:03 -08:00
filename.h Sync manifest file when initializing it 2015-01-22 14:32:03 -08:00
flush_job_test.cc rocksdb: Remove #include "util/string_util.h" from util/testharness.h 2015-03-19 17:29:37 -07:00
flush_job.cc Allow GetThreadList to report Flush properties. 2015-05-15 23:22:22 -07:00
flush_job.h Allow GetThreadList to report Flush properties. 2015-05-15 23:22:22 -07:00
flush_scheduler.cc Don't return (or dereference) dangling pointer 2014-10-02 14:33:16 -07:00
flush_scheduler.h Fix data race #1 2015-01-26 11:48:07 -08:00
forward_iterator.cc rocksdb: Add missing override 2015-02-26 11:28:41 -08:00
forward_iterator.h rocksdb: Add missing override 2015-02-26 11:28:41 -08:00
internal_stats.cc Add --wal_bytes_per_sync for db_bench and more IO stats 2015-05-19 16:19:30 -07:00
internal_stats.h Add --wal_bytes_per_sync for db_bench and more IO stats 2015-05-19 16:19:30 -07:00
job_context.h Clean up old log files in background threads 2015-03-30 15:04:10 -04:00
listener_test.cc Fixed a bug in EventListener::OnCompactionCompleted(). 2015-05-12 16:10:23 -07:00
log_format.h Some minor refactoring on the code 2014-01-02 16:32:31 -08:00
log_reader.cc rocksdb: Fixed 'Dead assignment' and 'Dead initialization' scan-build warnings 2015-02-23 14:10:09 -08:00
log_reader.h Log writer record format doc. 2015-04-07 16:25:56 -07:00
log_test.cc rocksdb: switch to gtest 2015-03-17 14:08:00 -07:00
log_writer.cc Fix comparison between signed and usigned integers 2015-05-19 10:59:30 -07:00
log_writer.h Log writer record format doc. 2015-04-07 16:25:56 -07:00
managed_iterator.cc Fix compile error on MacOS. 2015-02-24 16:24:53 -08:00
managed_iterator.h Fixed xfunc related compile errors in ROCKSDB_LITE 2015-04-09 21:05:18 -07:00
memtable_allocator.cc Enforce write buffer memory limit across column families 2014-12-02 12:09:20 -08:00
memtable_allocator.h Enforce write buffer memory limit across column families 2014-12-02 12:09:20 -08:00
memtable_list_test.cc Include bunch of more events into EventLogger 2015-04-27 15:20:02 -07:00
memtable_list.cc fix typos 2015-04-25 18:14:27 +09:00
memtable_list.h Fix valgrind issues in memtable_list_test 2015-04-10 14:16:03 -07:00
memtable.cc Adding stats for the merge and filter operation 2015-03-24 14:42:04 -07:00
memtable.h Add thread-safety documentation to MemTable and related classes 2015-04-08 21:10:35 -07:00
memtablerep_bench.cc build: avoid unused-variable warning 2015-05-02 13:19:10 -07:00
merge_context.h API to fetch from both a WriteBatchWithIndex and the db 2015-05-11 14:51:51 -07:00
merge_helper.cc Helper function to time Merges 2015-04-27 20:23:50 -07:00
merge_helper.h Helper function to time Merges 2015-04-27 20:23:50 -07:00
merge_operator.cc Some small cleaning up to make some compiling environment happy 2014-03-26 18:11:41 -07:00
merge_test.cc rocksdb: Add missing override 2015-02-26 11:28:41 -08:00
perf_context_test.cc Makefile minor cleanup 2015-03-30 16:05:35 -04:00
plain_table_db_test.cc rocksdb: Remove #include "util/string_util.h" from util/testharness.h 2015-03-19 17:29:37 -07:00
prefix_test.cc rocksdb: Remove #include "util/string_util.h" from util/testharness.h 2015-03-19 17:29:37 -07:00
repair.cc options.paranoid_file_checks to read all rows after writing to a file. 2015-04-23 11:34:35 -07:00
skiplist_test.cc rocksdb: switch to gtest 2015-03-17 14:08:00 -07:00
skiplist.h Enforce write buffer memory limit across column families 2014-12-02 12:09:20 -08:00
slice.cc Create an abstract interface for write batches 2015-03-17 19:23:08 -07:00
snapshot.h Cleanup CompactionJob 2015-05-05 19:01:12 -07:00
table_cache.cc TableMock + framework for mock classes 2014-10-28 17:52:32 -07:00
table_cache.h use GetContext to replace callback function pointer 2014-09-29 11:09:09 -07:00
table_properties_collector_test.cc A new call back to TablePropertiesCollector to allow users know the entry is add, delete or merge 2015-04-06 10:27:21 -07:00
table_properties_collector.cc A new call back to TablePropertiesCollector to allow users know the entry is add, delete or merge 2015-04-06 10:27:21 -07:00
table_properties_collector.h A new call back to TablePropertiesCollector to allow users know the entry is add, delete or merge 2015-04-06 10:27:21 -07:00
transaction_log_impl.cc Turn -Wshadow back on 2014-11-06 11:14:28 -08:00
transaction_log_impl.h rocksdb: Add missing override 2015-02-26 11:28:41 -08:00
version_builder_test.cc rocksdb: Remove #include "util/string_util.h" from util/testharness.h 2015-03-19 17:29:37 -07:00
version_builder.cc Fix deleting obsolete files 2015-02-06 08:44:30 -08:00
version_builder.h Move VersionBuilder logic to a separate .cc file 2014-10-31 16:34:38 -07:00
version_edit_test.cc rocksdb: switch to gtest 2015-03-17 14:08:00 -07:00
version_edit.cc Turn on -Wshadow 2014-10-31 11:59:54 -07:00
version_edit.h Add experimental API MarkForCompaction() 2015-04-17 16:44:45 -07:00
version_set_test.cc Fix level size overflow for options_.level_compaction_dynamic_level_bytes=true 2015-04-03 09:04:35 -07:00
version_set.cc Don't artificially inflate L0 score 2015-05-21 11:40:48 -07:00
version_set.h Optimize GetApproximateSizes() to use lesser CPU cycles. 2015-04-30 10:55:03 -07:00
wal_manager_test.cc Fix flakiness of WalManagerTest 2015-04-13 16:15:05 -07:00
wal_manager.cc rocksdb: Add missing override 2015-02-26 11:28:41 -08:00
wal_manager.h Fix -Wnon-virtual-dtor errors 2014-11-10 17:39:38 -05:00
write_batch_base.cc Create an abstract interface for write batches 2015-03-17 19:23:08 -07:00
write_batch_internal.h remove all remaining references to cfd->options() 2014-11-18 10:20:10 -08:00
write_batch_test.cc rocksdb: Remove #include "util/string_util.h" from util/testharness.h 2015-03-19 17:29:37 -07:00
write_batch.cc Adding stats for the merge and filter operation 2015-03-24 14:42:04 -07:00
write_controller_test.cc maint: use ASSERT_TRUE, not ASSERT_EQ(true; same for false 2015-04-17 14:54:17 -07:00
write_controller.cc Push- instead of pull-model for managing Write stalls 2014-09-08 11:20:25 -07:00
write_controller.h Fix #284 2014-09-13 14:14:10 -07:00
write_thread.cc WriteThread 2014-09-12 16:23:58 -07:00
write_thread.h Add a counter for collecting the wait time on db mutex. 2015-02-04 21:39:45 -08:00
writebuffer.h Enforce write buffer memory limit across column families 2014-12-02 12:09:20 -08:00