CompactionPicker
Summary:
This is a big one. This diff moves all the code related to picking compactions from VersionSet to a new class, CompactionPicker. Column families' compactions will be completely separate processes, so we need to have multiple CompactionPickers.
To make this easier to review, most of the code change is just copy/paste. There is also a small change not to use VersionSet::current_, but rather to take `Version* version` as a parameter. Most of the other code is exactly the same.
In future diffs, I will also make some improvements to CompactionPickers. I think the most important part will be encapsulating it better. Currently Version, VersionSet, Compaction and CompactionPicker are all friend classes, which makes it harder to change the implementation.
This diff depends on D15171, D15183, D15189 and D15201
Test Plan: `make check`
Reviewers: kailiu, sdong, dhruba, haobo
Reviewed By: kailiu
CC: leveldb
Differential Revision: https://reviews.facebook.net/D15207
2014-01-16 13:03:52 -08:00
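The shape of the refactoring described above can be sketched in a few lines. This is only an illustration under assumptions: `ToyVersion`, `ToyCompaction`, `ToyCompactionPicker` and the two toy policies are hypothetical stand-ins, not the RocksDB classes, and the trigger rules are invented for the example. The point it shows is that the picker receives the version as a parameter rather than reading a shared "current" pointer, so several independent pickers can coexist.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for Version: only the per-level file counts.
struct ToyVersion {
  std::vector<int> files_per_level;
};

// Hypothetical stand-in for Compaction.
struct ToyCompaction {
  int level;
  std::string picked_by;
};

// Abstract picker. The version to inspect is passed in explicitly.
class ToyCompactionPicker {
 public:
  virtual ~ToyCompactionPicker() = default;
  // Returns nullptr when there is nothing to compact.
  virtual std::unique_ptr<ToyCompaction> PickCompaction(
      const ToyVersion& version) = 0;
};

// Invented level-style rule: pick the first level with more than
// `trigger` files.
class ToyLevelPicker : public ToyCompactionPicker {
 public:
  explicit ToyLevelPicker(int trigger) : trigger_(trigger) {}
  std::unique_ptr<ToyCompaction> PickCompaction(
      const ToyVersion& version) override {
    for (std::size_t level = 0; level < version.files_per_level.size();
         ++level) {
      if (version.files_per_level[level] > trigger_) {
        return std::unique_ptr<ToyCompaction>(
            new ToyCompaction{static_cast<int>(level), "level"});
      }
    }
    return nullptr;
  }

 private:
  int trigger_;
};

// Invented universal-style rule: compact once the total file count
// exceeds `trigger`.
class ToyUniversalPicker : public ToyCompactionPicker {
 public:
  explicit ToyUniversalPicker(int trigger) : trigger_(trigger) {}
  std::unique_ptr<ToyCompaction> PickCompaction(
      const ToyVersion& version) override {
    int total = 0;
    for (int n : version.files_per_level) total += n;
    if (total <= trigger_) return nullptr;
    return std::unique_ptr<ToyCompaction>(new ToyCompaction{0, "universal"});
  }

 private:
  int trigger_;
};
```

Because each picker holds only its own policy state, one instance per column family would not need to share anything with the others in this sketch.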
// Copyright (c) 2013, Facebook, Inc. All rights reserved.
// This source code is licensed under the BSD-style license found in the
// LICENSE file in the root directory of this source tree. An additional grant
// of patent rights can be found in the PATENTS file in the same directory.
//
// Copyright (c) 2011 The LevelDB Authors. All rights reserved.
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file. See the AUTHORS file for names of contributors.

#pragma once
#include "db/version_set.h"
#include "db/compaction.h"
#include "rocksdb/status.h"
#include "rocksdb/options.h"
2014-02-04 16:31:18 -08:00
#include "rocksdb/env.h"

#include <vector>
#include <memory>
#include <set>

namespace rocksdb {

Buffer info logs when picking compactions and write them out after releasing the mutex
Summary: While the background thread is picking compactions, it writes multiple info_logs, especially for universal compaction. That can block on log writes while holding the mutex. To remove this risk, write those info logs to a buffer and flush it after releasing the mutex.
Test Plan:
make all check
check the log lines while running some tests that trigger compactions.
Reviewers: haobo, igor, dhruba
Reviewed By: dhruba
CC: i.am.jin.lei, dhruba, yhchiang, leveldb, nkg-
Differential Revision: https://reviews.facebook.net/D16515
2014-03-04 14:32:55 -08:00
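The buffering idea in this annotation can be sketched as follows. This is a minimal illustration under assumptions: `SimpleLogBuffer` and `PickUnderMutex` are hypothetical names, and the buffer here is not the RocksDB `LogBuffer` interface. A plain vector of strings stands in for the info log.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical minimal buffer; a vector stands in for the info log.
class SimpleLogBuffer {
 public:
  explicit SimpleLogBuffer(std::vector<std::string>* sink) : sink_(sink) {}

  // In-memory append only, so it is cheap to call while holding a mutex.
  void Add(const std::string& line) { pending_.push_back(line); }

  // Performs the (potentially slow) write. Call after releasing the mutex.
  void Flush() {
    for (const auto& line : pending_) sink_->push_back(line);
    pending_.clear();
  }

  std::size_t pending() const { return pending_.size(); }

 private:
  std::vector<std::string>* sink_;
  std::vector<std::string> pending_;
};

// Does the "picking" under the mutex while buffering messages, then
// flushes once the lock has been released.
inline int PickUnderMutex(std::mutex* mu, std::vector<std::string>* sink) {
  SimpleLogBuffer log_buffer(sink);
  int picked_level = -1;
  {
    std::lock_guard<std::mutex> lock(*mu);
    log_buffer.Add("considering level 0");
    picked_level = 0;
    log_buffer.Add("picked level 0");
    // Nothing has been written to the sink at this point.
  }
  log_buffer.Flush();  // the mutex is no longer held here
  return picked_level;
}
```

The trade-off in this sketch is that messages appear in the log slightly later than the events they describe, in exchange for keeping slow writes out of the critical section.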
class LogBuffer;
class Compaction;
class Version;

class CompactionPicker {
 public:
2014-03-11 14:52:17 -07:00
  CompactionPicker(const Options* options, const InternalKeyComparator* icmp);
  virtual ~CompactionPicker();

  // Pick level and inputs for a new compaction.
  // Returns nullptr if there is no compaction to be done.
  // Otherwise returns a pointer to a heap-allocated object that
  // describes the compaction. Caller should delete the result.
  virtual Compaction* PickCompaction(Version* version,
                                     LogBuffer* log_buffer) = 0;

  // Return a compaction object for compacting the range [begin,end] in
  // the specified level. Returns nullptr if there is nothing in that
  // level that overlaps the specified range. Caller should delete
  // the result.
  //
  // The returned Compaction might not include the whole requested range.
  // In that case, compaction_end will be set to the next key that needs
  // compacting. In case the compaction will compact the whole range,
  // compaction_end will be set to nullptr.
  // Client is responsible for compaction_end storage -- when called,
  // *compaction_end should point to valid InternalKey!
  Compaction* CompactRange(Version* version, int input_level, int output_level,
                           const InternalKey* begin, const InternalKey* end,
                           InternalKey** compaction_end);

  // Free up the files that participated in a compaction
  void ReleaseCompactionFiles(Compaction* c, Status status);

  // Return the total amount of data that is undergoing
  // compactions per level
  void SizeBeingCompacted(std::vector<uint64_t>& sizes);

  // Returns maximum total overlap bytes with grandparent
  // level (i.e., level+2) before we stop building a single
  // file in level->level+1 compaction.
  uint64_t MaxGrandParentOverlapBytes(int level);

  // Returns maximum total bytes of data on a given level.
  double MaxBytesForLevel(int level);

  // Get the max file size in a given level.
  uint64_t MaxFileSizeForLevel(int level) const;

 protected:
  int NumberLevels() const { return num_levels_; }

  // Stores the minimal range that covers all entries in inputs in
  // *smallest, *largest.
  // REQUIRES: inputs is not empty
  void GetRange(const std::vector<FileMetaData*>& inputs, InternalKey* smallest,
                InternalKey* largest);

  // Stores the minimal range that covers all entries in inputs1 and inputs2
  // in *smallest, *largest.
  // REQUIRES: inputs is not empty
  void GetRange(const std::vector<FileMetaData*>& inputs1,
                const std::vector<FileMetaData*>& inputs2,
                InternalKey* smallest, InternalKey* largest);

2014-01-17 12:02:03 -08:00
  // Add more files to the inputs on "level" to make sure that
  // no newer version of a key is compacted to "level+1" while leaving an older
  // version in a "level". Otherwise, any Get() will search "level" first,
  // and will likely return an old/stale value for the key, since it always
  // searches in increasing order of level to find the value. This could
  // also scramble the order of merge operands. This function should be
  // called any time a new Compaction is created, and its inputs_[0] are
  // populated.
  //
  // Will return false if it is impossible to apply this compaction.
  bool ExpandWhileOverlapping(Compaction* c);

  uint64_t ExpandedCompactionByteSizeLimit(int level);

  // Returns true if any one of the specified files is being compacted
  bool FilesInCompaction(std::vector<FileMetaData*>& files);

  // Returns true if any one of the parent files is being compacted
  bool ParentRangeInCompaction(Version* version, const InternalKey* smallest,
                               const InternalKey* largest, int level,
                               int* index);

  void SetupOtherInputs(Compaction* c);

  // record all the ongoing compactions for all levels
  std::vector<std::set<Compaction*>> compactions_in_progress_;

  // Per-level target file size.
  std::unique_ptr<uint64_t[]> max_file_size_;

  // Per-level max bytes
  std::unique_ptr<uint64_t[]> level_max_bytes_;

  const Options* const options_;

 private:
  int num_levels_;

  const InternalKeyComparator* const icmp_;
};

class UniversalCompactionPicker : public CompactionPicker {
 public:
  UniversalCompactionPicker(const Options* options,
                            const InternalKeyComparator* icmp)
      : CompactionPicker(options, icmp) {}
  virtual Compaction* PickCompaction(Version* version,
                                     LogBuffer* log_buffer) override;

 private:
  // Pick Universal compaction to limit read amplification
  Compaction* PickCompactionUniversalReadAmp(Version* version, double score,
                                             unsigned int ratio,
                                             unsigned int num_files,
                                             LogBuffer* log_buffer);

  // Pick Universal compaction to limit space amplification.
  Compaction* PickCompactionUniversalSizeAmp(Version* version, double score,
                                             LogBuffer* log_buffer);
};

class LevelCompactionPicker : public CompactionPicker {
 public:
  LevelCompactionPicker(const Options* options,
                        const InternalKeyComparator* icmp)
      : CompactionPicker(options, icmp) {}
  virtual Compaction* PickCompaction(Version* version,
                                     LogBuffer* log_buffer) override;

 private:
  // For the specified level, pick a compaction.
  // Returns nullptr if there is no compaction to be done.
  // If level is 0 and there is already a compaction on that level, this
  // function will return nullptr.
  Compaction* PickCompactionBySize(Version* version, int level, double score);
};

} // namespace rocksdb
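
The `GetRange` contract documented in the header (store the minimal range that covers all inputs in `*smallest` and `*largest`) can be illustrated with a simplified sketch. Everything here is an assumption made for the example: `ToyFileMeta` is a hypothetical stand-in for `FileMetaData`, keys are plain `std::string` values, and ordinary string comparison replaces `InternalKeyComparator`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for FileMetaData: just the key bounds of a file.
struct ToyFileMeta {
  std::string smallest;
  std::string largest;
};

// Stores the minimal range covering every file in inputs.
// REQUIRES: inputs is not empty
inline void GetRange(const std::vector<ToyFileMeta>& inputs,
                     std::string* smallest, std::string* largest) {
  assert(!inputs.empty());
  *smallest = inputs[0].smallest;
  *largest = inputs[0].largest;
  for (std::size_t i = 1; i < inputs.size(); ++i) {
    if (inputs[i].smallest < *smallest) *smallest = inputs[i].smallest;
    if (inputs[i].largest > *largest) *largest = inputs[i].largest;
  }
}

// Two-input variant: covers the union of both file lists.
// This sketch requires that the two lists are not both empty.
inline void GetRange(const std::vector<ToyFileMeta>& inputs1,
                     const std::vector<ToyFileMeta>& inputs2,
                     std::string* smallest, std::string* largest) {
  std::vector<ToyFileMeta> all(inputs1);
  all.insert(all.end(), inputs2.begin(), inputs2.end());
  GetRange(all, smallest, largest);
}
```

Concatenating the two lists keeps the second overload trivially consistent with the first, at the cost of one temporary copy.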