rocksdb/db
Dhruba Borthakur 47c4191fe8 Reduce write amplification by merging files in L0 back into L0
Summary:
There is a new option called hybrid_mode which, when switched on,
causes HBase-style compactions. Files from L0 are
compacted back into L0. The meat of this compaction algorithm
is in PickCompactionHybrid().

All files reside in L0. That means the key ranges of the files
can overlap. Each file is time-bound, i.e. each file contains a
range of keys that were inserted around the same time. The
start-seqno and the end-seqno refer to the timeframe in which
these keys were inserted. Files that have contiguous seqnos
are compacted together into a larger file. All files are
ordered from the most recent to the oldest.
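
A minimal sketch of the seqno bookkeeping described above, not code from
this commit. The struct TimeBoundFile and both helpers are hypothetical
names, and "contiguous" is assumed to mean that a newer file's start-seqno
directly follows the next older file's end-seqno.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for per-file metadata; not RocksDB's FileMetaData.
struct TimeBoundFile {
  uint64_t start_seqno;  // smallest sequence number in the file
  uint64_t end_seqno;    // largest sequence number in the file
};

// Assumption: two neighbouring files are contiguous when the newer file
// starts exactly one seqno after the older file ends.
bool IsContiguous(const TimeBoundFile& newer, const TimeBoundFile& older) {
  return newer.start_seqno == older.end_seqno + 1;
}

// `files` is ordered from most recent to oldest. If files[first..last]
// form a contiguous seqno run, writes the seqno range of the merged
// output file to *out and returns true; otherwise returns false.
bool MergeSeqnoRange(const std::vector<TimeBoundFile>& files,
                     size_t first, size_t last, TimeBoundFile* out) {
  if (out == nullptr || first > last || last >= files.size()) return false;
  for (size_t i = first; i < last; ++i) {
    if (!IsContiguous(files[i], files[i + 1])) return false;
  }
  out->start_seqno = files[last].start_seqno;  // oldest file in the run
  out->end_seqno = files[first].end_seqno;     // newest file in the run
  return true;
}
```

Under these assumptions the merged file stays time-bound: it covers one
unbroken seqno range and can take the place of its inputs in the
newest-to-oldest ordering.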

The current compaction algorithm looks for candidate files
starting from the most recent file. It continues to
add more files to the same compaction run as long as the
total size of the files chosen so far is smaller than the size
of the next candidate file. This logic needs to be debated
and validated.
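
The selection rule, as worded in this summary, can be sketched as follows.
This is an illustration only and is not taken from PickCompactionHybrid();
the struct SizedFile and the function PickRun are hypothetical names, and
the real code may apply further conditions (for example skipping files that
are already being compacted).

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for per-file metadata; not RocksDB's FileMetaData.
struct SizedFile {
  uint64_t size_bytes;
};

// `files` is ordered from most recent to oldest. Returns the indices of
// the files picked for one compaction run, following the rule as written
// in the summary: keep adding the next file while the total size picked
// so far is smaller than that next file's size.
std::vector<size_t> PickRun(const std::vector<SizedFile>& files) {
  std::vector<size_t> picked;
  if (files.empty()) return picked;

  picked.push_back(0);  // always start from the most recent file
  uint64_t total = files[0].size_bytes;

  for (size_t i = 1; i < files.size(); ++i) {
    if (total >= files[i].size_bytes) break;  // rule no longer holds: stop
    picked.push_back(i);
    total += files[i].size_bytes;
  }
  return picked;
}
```

For files of sizes 10, 20, 40, 50 and 500 (newest first), this sketch picks
the first three: 10 < 20 and 30 < 40, but 70 is not smaller than 50, so the
run stops there. A caller would presumably skip a run that contains only
one file.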

The above logic should reduce write amplification to a
large extent... will publish numbers shortly.

Test Plan: dbstress runs for 6 hours with no data corruption (tested so far).

Differential Revision: https://reviews.facebook.net/D11289
2013-06-30 20:07:04 -07:00
.nfs00000000066c9ebb00000002 Enhance db_bench 2013-03-14 16:00:23 -07:00
builder.cc Reduce write amplification by merging files in L0 back into L0 2013-06-30 20:07:04 -07:00
builder.h [RocksDB] cleanup EnvOptions 2013-06-12 11:17:19 -07:00
c_test.c Fix poor error on num_levels mismatch and few other minor improvements 2013-01-25 15:37:26 -08:00
c.cc Fix poor error on num_levels mismatch and few other minor improvements 2013-01-25 15:37:26 -08:00
corruption_test.cc [RocksDB] Fix CorruptionTest 2013-05-28 12:36:42 -07:00
db_bench.cc Reduce write amplification by merging files in L0 back into L0 2013-06-30 20:07:04 -07:00
db_filesnapshot.cc [Rocksdb] Log on disable/enable file deletions 2013-06-05 10:48:24 -07:00
db_impl_readonly.cc [Rocksdb] Support Merge operation in rocksdb 2013-05-03 16:59:02 -07:00
db_impl_readonly.h [Rocksdb] Support Merge operation in rocksdb 2013-05-03 16:59:02 -07:00
db_impl.cc Reduce write amplification by merging files in L0 back into L0 2013-06-30 20:07:04 -07:00
db_impl.h [rocksdb][refactor] statistic printing code to one place 2013-06-18 20:28:41 -07:00
db_iter.cc Record the number of open db iterators. 2013-05-29 08:47:08 -07:00
db_iter.h [Rocksdb] Support Merge operation in rocksdb 2013-05-03 16:59:02 -07:00
db_statistics.h [RocksDB] Expose DBStatistics 2013-05-23 11:49:38 -07:00
db_stats_logger.cc remove boost 2012-09-16 19:33:43 -07:00
db_test.cc Reduce write amplification by merging files in L0 back into L0 2013-06-30 20:07:04 -07:00
dbformat_test.cc Fix all warnings generated by -Wall option to the compiler. 2012-11-06 14:07:31 -08:00
dbformat.cc Fix refering freed memory in earlier commit. 2013-06-10 15:08:13 -07:00
dbformat.h Reduce write amplification by merging files in L0 back into L0 2013-06-30 20:07:04 -07:00
filename_test.cc Added meta-database support. 2012-12-17 11:26:59 -08:00
filename.cc Allow the logs to be purged by TTL. 2013-02-04 19:42:40 -08:00
filename.h Added meta-database support. 2012-12-17 11:26:59 -08:00
log_file.h GetUpdatesSince API to enable replication. 2012-12-07 11:42:13 -08:00
log_format.h Fixed sign-comparison in rocksdb code-base and fixed Makefile 2013-03-19 14:35:23 -07:00
log_reader.cc Codemod NULL to nullptr 2013-02-28 18:04:58 -08:00
log_reader.h TransactionLogIter should stall at the last record. Currently it errors out 2013-03-21 15:12:35 -07:00
log_test.cc Fix more signed-unsigned comparisons 2013-03-19 17:21:36 -07:00
log_writer.cc Fix a number of object lifetime/ownership issues 2013-01-23 16:54:11 -08:00
log_writer.h Fix a number of object lifetime/ownership issues 2013-01-23 16:54:11 -08:00
memtable.cc Compact multiple memtables before flushing to storage. 2013-06-18 14:28:04 -07:00
memtable.h Compact multiple memtables before flushing to storage. 2013-06-18 14:28:04 -07:00
memtablelist.cc Compact multiple memtables before flushing to storage. 2013-06-18 14:28:04 -07:00
memtablelist.h Compact multiple memtables before flushing to storage. 2013-06-18 14:28:04 -07:00
merge_helper.cc [Rocksdb] Support Merge operation in rocksdb 2013-05-03 16:59:02 -07:00
merge_helper.h [Rocksdb] Support Merge operation in rocksdb 2013-05-03 16:59:02 -07:00
merge_test.cc [Rocksdb] Support Merge operation in rocksdb 2013-05-03 16:59:02 -07:00
repair.cc Reduce write amplification by merging files in L0 back into L0 2013-06-30 20:07:04 -07:00
skiplist_test.cc Codemod NULL to nullptr 2013-02-28 18:04:58 -08:00
skiplist.h Codemod NULL to nullptr 2013-02-28 18:04:58 -08:00
snapshot.h [RocksDB] fix compaction filter trigger condition 2013-05-13 12:33:02 -07:00
table_cache.cc [Rocksdb] measure table open io in a histogram 2013-06-13 17:25:09 -07:00
table_cache.h [RocksDB] cleanup EnvOptions 2013-06-12 11:17:19 -07:00
transaction_log_iterator_impl.cc [RocksDB] cleanup EnvOptions 2013-06-12 11:17:19 -07:00
transaction_log_iterator_impl.h [RocksDB] cleanup EnvOptions 2013-06-12 11:17:19 -07:00
version_edit_test.cc Reduce write amplification by merging files in L0 back into L0 2013-06-30 20:07:04 -07:00
version_edit.cc Reduce write amplification by merging files in L0 back into L0 2013-06-30 20:07:04 -07:00
version_edit.h Reduce write amplification by merging files in L0 back into L0 2013-06-30 20:07:04 -07:00
version_set_reduce_num_levels.cc Fix valgrind errors in rocksdb tests: auto_roll_logger_test, reduce_levels_test 2013-03-12 16:03:16 -07:00
version_set_test.cc Codemod NULL to nullptr 2013-02-28 18:04:58 -08:00
version_set.cc Reduce write amplification by merging files in L0 back into L0 2013-06-30 20:07:04 -07:00
version_set.h Reduce write amplification by merging files in L0 back into L0 2013-06-30 20:07:04 -07:00
write_batch_internal.h GetUpdatesSince API to enable replication. 2012-12-07 11:42:13 -08:00
write_batch_test.cc [RocksDB] Expose count for WriteBatch 2013-06-26 15:13:21 -07:00
write_batch.cc [RocksDB] Expose count for WriteBatch 2013-06-26 15:13:21 -07:00