Take a chance on a random file when choosing compaction

Summary:
When trying to compact the entire database with SuggestCompactRange(), we'll first try the left-most files. This is pretty bad, because:
1) the left part of the LSM tree will be overly compacted, while the right part will not be touched
2) The first compaction will pick up the left-most file. The second compaction will try to pick up the next left-most file, but this will often not be possible, because there's a big chance that the second file's range on level N+1 is already being compacted.

I observe both of those problems when running Mongo+RocksDB and trying to compact the DB to clean up tombstones. I'm unable to clean them up :(

This diff adds a bit of randomness to choosing a file: we first choose a file at random and try to compact that one, and only then fall back to scanning the marked files in order. This should solve both problems described above.
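The selection strategy can be sketched outside of RocksDB as follows. This is a minimal standalone illustration, not the picker itself: `Candidate`, `TryCompact`, and `PickFile` are hypothetical stand-ins (the real code uses `FilesMarkedForCompaction()` and a `continuation` lambda, and seeds `Random64` from the `vstorage` pointer), and `std::mt19937_64` stands in for RocksDB's `Random64`.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <random>
#include <vector>

// Hypothetical stand-in for the picker's per-file continuation: returns
// true if the candidate can be compacted (toy predicate: even numbers).
using Candidate = int;
bool TryCompact(const Candidate& c) { return c % 2 == 0; }

// Sketch of the strategy from this diff: probe one random candidate
// first, then fall back to a left-to-right scan over all candidates.
std::optional<Candidate> PickFile(const std::vector<Candidate>& files,
                                  uint64_t seed) {
  if (files.empty()) return std::nullopt;
  std::mt19937_64 rnd(seed);
  size_t random_index = rnd() % files.size();
  if (TryCompact(files[random_index])) {
    return files[random_index];  // random probe succeeded
  }
  for (const auto& f : files) {
    if (TryCompact(f)) {
      return f;  // fallback: first compactable file in order
    }
  }
  return std::nullopt;  // nothing compactable right now
}
```

Because the random probe runs before the ordered scan, repeated invocations spread the first pick across the whole key range instead of always hammering the left-most file.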

Test Plan: make check

Reviewers: yhchiang, rven, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D38379
This commit is contained in:
Igor Canadi 2015-05-15 14:14:40 -07:00
parent 8c52788f0c
commit 7413306d94


@@ -16,9 +16,12 @@
#include <inttypes.h>
#include <limits>
#include <string>
#include <utility>
#include "db/column_family.h"
#include "db/filename.h"
#include "util/log_buffer.h"
#include "util/random.h"
#include "util/statistics.h"
#include "util/string_util.h"
#include "util/sync_point.h"
@@ -744,8 +747,8 @@ void LevelCompactionPicker::PickFilesMarkedForCompactionExperimental(
     return;
   }
-  for (auto& level_file : vstorage->FilesMarkedForCompaction()) {
-    // If it's being compaction it has nothing to do here.
+  auto continuation = [&](std::pair<int, FileMetaData*> level_file) {
+    // If it's being compacted it has nothing to do here.
     // If this assert() fails that means that some function marked some
     // files as being_compacted, but didn't call ComputeCompactionScore()
     assert(!level_file.second->being_compacted);
@@ -754,7 +757,21 @@ void LevelCompactionPicker::PickFilesMarkedForCompactionExperimental(
     inputs->files = {level_file.second};
     inputs->level = *level;
-    if (ExpandWhileOverlapping(cf_name, vstorage, inputs)) {
+    return ExpandWhileOverlapping(cf_name, vstorage, inputs);
+  };
+  // take a chance on a random file first
+  Random64 rnd(/* seed */ reinterpret_cast<uint64_t>(vstorage));
+  size_t random_file_index = static_cast<size_t>(rnd.Uniform(
+      static_cast<uint64_t>(vstorage->FilesMarkedForCompaction().size())));
+  if (continuation(vstorage->FilesMarkedForCompaction()[random_file_index])) {
+    // found the compaction!
+    return;
+  }
+  for (auto& level_file : vstorage->FilesMarkedForCompaction()) {
+    if (continuation(level_file)) {
+      // found the compaction!
+      return;
+    }