Compare commits

14 Commits

Author SHA1 Message Date
Andrew Kryczka
51c8841418 Expand effect of dictionary settings in ColumnFamilyOptions::compression_opts (#7619)
Summary:
In dictionary compression's initial implementation, in order to save CPU
overhead, we only enabled it for the bottommost level under the assumption
that the vast majority of data is stored there. At that time, there was no
such thing as `ColumnFamilyOptions::bottommost_compression_opts`, so we just
hardcoded disabling dictionary compression in flush and in compactions to
non-bottommost levels. Now, we have users who generate all their files
through flush and are considering using dictionary compression.

To support such a use case, this PR expands the scope of `ColumnFamilyOptions::compression_opts` to
additionally include flushed files and files generated by compaction to
a non-bottommost level. Users can still get the old behavior by moving
their dictionary settings to `ColumnFamilyOptions::bottommost_compression_opts`
and explicitly enabling both that and `ColumnFamilyOptions::bottommost_compression`.
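The selection rule described above can be sketched in isolation. This is a simplified model, not RocksDB code: `DictOpts`, `CfOpts`, and `EffectiveMaxDictBytes` are hypothetical stand-ins, and the only behavior assumed is what the summary states (the bottommost override applies when it is enabled; otherwise `compression_opts` now applies to every output file).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for illustration; these are not RocksDB's real types.
struct DictOpts {
  uint32_t max_dict_bytes = 0;  // 0 means dictionary compression is off
  bool enabled = false;         // only consulted for the bottommost override
};

struct CfOpts {
  DictOpts compression_opts;
  DictOpts bottommost_compression_opts;
};

// Returns the dictionary size limit applied to one output file.
// `bottommost` is true only for compaction to the bottommost level;
// flush and compaction to any other level pass false.
uint32_t EffectiveMaxDictBytes(const CfOpts& cf, bool bottommost) {
  if (bottommost && cf.bottommost_compression_opts.enabled) {
    return cf.bottommost_compression_opts.max_dict_bytes;
  }
  // With this change, flush and non-bottommost compaction also land here
  // instead of having the dictionary settings forced to zero.
  return cf.compression_opts.max_dict_bytes;
}
```

Under this model, a user who sets only `bottommost_compression_opts` (with `enabled = true`) gets a dictionary at the bottommost level and none elsewhere, which matches the old behavior the summary says can be preserved.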

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7619

Reviewed By: ltamasi

Differential Revision: D24665610

Pulled By: ajkr

fbshipit-source-id: 656b90bce1033fe21c71e09af931ef5bde3e464c
2020-11-03 10:56:38 -08:00
Andrew Kryczka
236d993648 Redesign block cache pinning API (#7520)
Summary:
The old flag-based APIs (`BlockBasedTableOptions::pin_l0_filter_and_index_blocks_in_cache` and `BlockBasedTableOptions::pin_top_level_index_and_filter`) were insufficient for our needs. For example, it was impossible to pin only unpartitioned meta-blocks, which could prevent block cache contention when turning on dictionary compression or during a migration to partitioned indexes/filters. It was also impossible to pin all meta-blocks in memory while having predictable memory usage via block cache. If we had continued adding flags to address these scenarios, they would have had significant overlap causing confusion. Instead, this PR deprecates the flags and starts a new API with non-overlapping options.
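As a rough illustration of what "non-overlapping options" means here, the sketch below models a per-category pinning decision using the tier names that appear in the diff further down this page (`kFallback`, `kNone`, `kFlushedAndSimilar`, `kAll`). The function `ShouldPin`, the `TableInfo` struct, and the `legacy_flag_says_pin` parameter are hypothetical; the 1.5x `write_buffer_size` threshold is taken from the `PinningTier` comments in the diff, and the real implementation may differ in detail.

```cpp
#include <cassert>
#include <cstdint>

// Tier names mirror the diff; everything else is a hypothetical stand-in.
enum class PinningTier { kFallback, kNone, kFlushedAndSimilar, kAll };

struct TableInfo {
  int level;           // LSM level the table file lives in
  uint64_t file_size;  // size of the table file in bytes
};

// Decides whether one category of metadata block (top-level index,
// partition, or unpartitioned block) is pinned for the given table.
bool ShouldPin(PinningTier tier, bool legacy_flag_says_pin,
               const TableInfo& table, uint64_t write_buffer_size) {
  switch (tier) {
    case PinningTier::kFallback:
      // Defer to whatever the deprecated flags would have done.
      return legacy_flag_says_pin;
    case PinningTier::kNone:
      return false;
    case PinningTier::kFlushedAndSimilar:
      // L0 tables smaller than 1.5 times the write buffer size.
      return table.level == 0 &&
             table.file_size < write_buffer_size + write_buffer_size / 2;
    case PinningTier::kAll:
      return true;
  }
  return false;  // unreachable for valid enum values
}
```

Because each category gets its own tier, a configuration such as "pin only unpartitioned meta-blocks" is expressed by setting one tier to `kAll` and the other two to `kNone`, which the old flags could not express.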

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7520

Test Plan:
- new unit test
- added new options to stress/crash test and ran for a while: `$ python tools/db_crashtest.py blackbox --simple --max_key=1000000 -write_buffer_size=1048576 -target_file_size_base=1048576 -max_bytes_for_level_base=4194304 --interval=10 -value_size_mult=33 -column_families=1 -reopen=0`

Reviewed By: pdillinger

Differential Revision: D24200034

Pulled By: ajkr

fbshipit-source-id: 3fa7cfc71e7960f7a867511dd6ae5834dd73b13e
2020-11-03 10:54:22 -08:00
Andrew Kryczka
8b298e7021 update HISTORY.md and bump version 2020-10-30 11:20:12 -07:00
mrambacher
c87c42fda8 Return NotFound from TableFactory configuration errors during options loading (#7615)
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/7615

Reviewed By: riversand963

Differential Revision: D24637054

Pulled By: ajkr

fbshipit-source-id: 7da20d44289eaa2387af4edf8c3c48057425cc1c
2020-10-30 10:48:38 -07:00
mrambacher
e5686db330 Revert LoadLatestOptions handling of ignore_unknown_options if versions differ (#7612)
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/7612

Reviewed By: zhichao-cao

Differential Revision: D24627054

Pulled By: riversand963

fbshipit-source-id: 451b4da742e3e84c7442bc7cc4959d39089b89d0
2020-10-30 10:46:33 -07:00
akankshamahajan
91f3b72ebc Update History.md and version.h for 6.14.2
2020-10-21 20:29:57 -07:00
Akanksha Mahajan
3ddd79e983 Bug fix to remove function calling in assert statement (#7581)
Summary:
Remove the function call from the assert statement. assert is a no-op in
opt builds, so the function is not called there. This caused a hang when
closing RocksDB with refit level set.
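The failure mode can be reproduced without RocksDB. In the sketch below, `OPT_BUILD_ASSERT` is a stand-in for what the standard `assert` expands to when `NDEBUG` is defined, and `ResumeBackgroundWork` is a hypothetical function with a side effect. The names are illustrative only.

```cpp
#include <cassert>  // the real assert, for comparison

// Stand-in for `assert(expr)` under NDEBUG: the expression is discarded
// without being evaluated.
#define OPT_BUILD_ASSERT(expr) ((void)0)

static int g_resume_calls = 0;

// Hypothetical function whose side effect must happen in every build.
bool ResumeBackgroundWork() {
  ++g_resume_calls;
  return true;
}

// Buggy pattern: the call lives inside the assert and vanishes with it.
int CallsMadeByBuggyPattern() {
  const int before = g_resume_calls;
  OPT_BUILD_ASSERT(ResumeBackgroundWork());
  return g_resume_calls - before;
}

// Fixed pattern: call first, then assert on the stored result.
int CallsMadeByFixedPattern() {
  const int before = g_resume_calls;
  const bool ok = ResumeBackgroundWork();
  OPT_BUILD_ASSERT(ok);
  (void)ok;  // avoid an unused-variable warning when the assert is compiled out
  return g_resume_calls - before;
}
```

The fixed pattern is the same shape as the `Status temp_s = ...; assert(temp_s.ok());` changes in the diff below.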

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7581

Test Plan: make check -j64

Reviewed By: riversand963

Differential Revision: D24466420

Pulled By: akankshamahajan15

fbshipit-source-id: 97db4ec5a95ae693c3290e176a3c12a9b1ad2f6d
2020-10-21 20:25:30 -07:00
mrambacher
59ddf78081 Revert Statuses returned from pre-Configurable options functions (#7563)
Summary:
Further refinement of the earlier PR.  Now the Status is NotFound with a subcode of PathNotFound. Also the existing functions for options parsing/loading are reverted to return InvalidArgument no matter in which way the user-provided arguments are deemed invalid.
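To make the code/subcode combination concrete, here is a miniature status type. It is a sketch only: `MiniStatus` and `LoadLatestOptionsSketch` are hypothetical, and RocksDB's real `Status` class has more codes and carries messages. The `IsPathNotFound()` condition follows the change to `Status` shown in the diff further down this page.

```cpp
#include <cassert>

// Hypothetical miniature of a status type, for illustration only.
enum Code { kOk, kNotFound, kIOError, kInvalidArgument };
enum SubCode { kNone, kPathNotFound };

struct MiniStatus {
  Code code;
  SubCode subcode;

  explicit MiniStatus(Code c = kOk, SubCode sc = kNone)
      : code(c), subcode(sc) {}

  bool IsNotFound() const { return code == kNotFound; }

  // True for either an I/O error or a not-found status, as long as the
  // subcode says the path was missing.
  bool IsPathNotFound() const {
    return (code == kIOError || code == kNotFound) &&
           subcode == kPathNotFound;
  }
};

// Models an options loader that reports a missing options file.
MiniStatus LoadLatestOptionsSketch(bool options_file_exists) {
  if (!options_file_exists) {
    return MiniStatus(kNotFound, kPathNotFound);
  }
  return MiniStatus();
}
```

A caller that only checks `IsNotFound()` and a caller that checks `IsPathNotFound()` both see the missing-file case under this scheme.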

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7563

Reviewed By: zhichao-cao

Differential Revision: D24422491

Pulled By: ajkr

fbshipit-source-id: ba6b237cd0584d3f925c5ba0d349aeb8c250af67
2020-10-20 12:06:25 -07:00
mrambacher
d043387bdd Test for LoadLatestOptions (#7554)
Summary:
Make LoadLatestOptions return PathNotFound if the options file does not exist.  Added tests for the LoadOptions related methods.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7554

Reviewed By: akankshamahajan15

Differential Revision: D24298985

Pulled By: zhichao-cao

fbshipit-source-id: c9ae3cb12fc4a5bbef07743e1c1300f98a2441b3
2020-10-15 11:28:49 -07:00
Andrew Kryczka
e402de87ef update HISTORY.md and version.h for 6.14.1 2020-10-13 10:05:00 -07:00
Andrew Kryczka
668033f181 Fix bug in pinned partitioned indexes with some reads bypassing block cache
Backports part of 75d3b6fdf0.
2020-10-13 10:03:08 -07:00
Andrew Kryczka
2cea469908 Fix bug in pinned partitioned user key indexes
Backports part of 75d3b6fdf0.
2020-10-13 10:02:39 -07:00
Andrew Kryczka
94c019a7b0 Add missing regression test/release note for bug fix
Bug fix was in 7508175558.

The unit test comes from https://github.com/facebook/rocksdb/pull/7521.
2020-10-13 10:01:03 -07:00
Andrew Kryczka
353745d639 Add missing release note 2020-10-13 09:56:08 -07:00
31 changed files with 1160 additions and 116 deletions

@ -1,4 +1,27 @@
# Rocksdb Change Log
## Unreleased
### Public API Change
* Deprecate `BlockBasedTableOptions::pin_l0_filter_and_index_blocks_in_cache` and `BlockBasedTableOptions::pin_top_level_index_and_filter`. These options still take effect until users migrate to the replacement APIs in `BlockBasedTableOptions::metadata_cache_options`. Migration guidance can be found in the API comments on the deprecated options.
### Behavior Changes
* The dictionary compression settings specified in `ColumnFamilyOptions::compression_opts` now additionally affect files generated by flush and compaction to non-bottommost level. Previously those settings at most affected files generated by compaction to bottommost level, depending on whether `ColumnFamilyOptions::bottommost_compression_opts` overrode them. Users who relied on dictionary compression settings in `ColumnFamilyOptions::compression_opts` affecting only the bottommost level can keep the behavior by moving their dictionary settings to `ColumnFamilyOptions::bottommost_compression_opts` and setting its `enabled` flag.
## 6.14.3 (10/30/2020)
### Bug Fixes
* Reverted a behavior change silently introduced in 6.14.2, in which the effects of the `ignore_unknown_options` flag (used in option parsing/loading functions) changed.
* Reverted a behavior change silently introduced in 6.14, in which options parsing/loading functions began returning `NotFound` instead of `InvalidArgument` for option names not available in the present version.
## 6.14.2 (10/21/2020)
### Bug Fixes
* Fixed a bug that caused a hang when closing the DB with refit level set in opt builds. The cause was that `ContinueBackgroundWork()` was called inside an assert statement, which is a no-op in opt builds. The bug was introduced in 6.14.
## 6.14.1 (10/13/2020)
### Bug Fixes
* Since 6.12, memtable lookup should report unrecognized value_type as corruption (#7121).
* Since 6.14, fix false positive flush/compaction `Status::Corruption` failure when `paranoid_file_checks == true` and range tombstones were written to the compaction output files.
* Fixed a bug in the following combination of features: indexes with user keys (`format_version >= 3`), indexes are partitioned (`index_type == kTwoLevelIndexSearch`), and some index partitions are pinned in memory (`BlockBasedTableOptions::pin_l0_filter_and_index_blocks_in_cache`). The bug could cause keys to be truncated when read from the index leading to wrong read results or other unexpected behavior.
* Fixed a bug when indexes are partitioned (`index_type == kTwoLevelIndexSearch`), some index partitions are pinned in memory (`BlockBasedTableOptions::pin_l0_filter_and_index_blocks_in_cache`), and partition reads could be mixed between block cache and directly from the file (e.g., with `enable_index_compression == 1` and `mmap_read == 1`, partitions that were stored uncompressed due to poor compression ratio would be read directly from the file via mmap, while partitions that were stored compressed would be read from block cache). The bug could cause index partitions to be mistakenly considered empty during reads leading to wrong read results.
## 6.14 (10/09/2020)
### Bug fixes
* Fixed a bug after a `CompactRange()` with `CompactRangeOptions::change_level` set fails due to a conflict in the level change step, which caused all subsequent calls to `CompactRange()` with `CompactRangeOptions::change_level` set to incorrectly fail with a `Status::NotSupported("another thread is refitting")` error.

@ -124,11 +124,6 @@ Status BuildTable(
if (iter->Valid() || !range_del_agg->IsEmpty()) {
TableBuilder* builder;
std::unique_ptr<WritableFileWriter> file_writer;
// Currently we only enable dictionary compression during compaction to the
// bottommost level.
CompressionOptions compression_opts_for_flush(compression_opts);
compression_opts_for_flush.max_dict_bytes = 0;
compression_opts_for_flush.zstd_max_train_bytes = 0;
{
std::unique_ptr<FSWritableFile> file;
#ifndef NDEBUG
@ -160,7 +155,7 @@ Status BuildTable(
ioptions, mutable_cf_options, internal_comparator,
int_tbl_prop_collector_factories, column_family_id,
column_family_name, file_writer.get(), compression,
sample_for_compression, compression_opts_for_flush, level,
sample_for_compression, compression_opts, level,
false /* skip_filters */, creation_time, oldest_key_time,
0 /*target_file_size*/, file_creation_time, db_id, db_session_id);
}

@ -248,12 +248,6 @@ Compaction::Compaction(VersionStorageInfo* vstorage,
if (max_subcompactions_ == 0) {
max_subcompactions_ = _mutable_db_options.max_subcompactions;
}
if (!bottommost_level_) {
// Currently we only enable dictionary compression during compaction to the
// bottommost level.
output_compression_opts_.max_dict_bytes = 0;
output_compression_opts_.zstd_max_train_bytes = 0;
}
#ifndef NDEBUG
for (size_t i = 1; i < inputs_.size(); ++i) {

@ -106,7 +106,7 @@ class CorruptionTest : public testing::Test {
ASSERT_OK(::ROCKSDB_NAMESPACE::RepairDB(dbname_, options_));
}
void Build(int n, int flush_every = 0) {
void Build(int n, int start, int flush_every) {
std::string key_space, value_space;
WriteBatch batch;
for (int i = 0; i < n; i++) {
@ -115,13 +115,15 @@ class CorruptionTest : public testing::Test {
ASSERT_OK(dbi->TEST_FlushMemTable());
}
//if ((i % 100) == 0) fprintf(stderr, "@ %d of %d\n", i, n);
Slice key = Key(i, &key_space);
Slice key = Key(i + start, &key_space);
batch.Clear();
ASSERT_OK(batch.Put(key, Value(i, &value_space)));
ASSERT_OK(db_->Write(WriteOptions(), &batch));
}
}
void Build(int n, int flush_every = 0) { Build(n, 0, flush_every); }
void Check(int min_expected, int max_expected) {
uint64_t next_expected = 0;
uint64_t missed = 0;
@ -723,6 +725,102 @@ TEST_F(CorruptionTest, DisableKeyOrderCheck) {
ROCKSDB_NAMESPACE::SyncPoint::GetInstance()->ClearAllCallBacks();
}
TEST_F(CorruptionTest, ParanoidFileChecksWithDeleteRangeFirst) {
Options options;
options.paranoid_file_checks = true;
options.create_if_missing = true;
for (bool do_flush : {true, false}) {
delete db_;
db_ = nullptr;
ASSERT_OK(DestroyDB(dbname_, options));
ASSERT_OK(DB::Open(options, dbname_, &db_));
std::string start, end;
assert(db_ != nullptr);
ASSERT_OK(db_->DeleteRange(WriteOptions(), db_->DefaultColumnFamily(),
Key(3, &start), Key(7, &end)));
auto snap = db_->GetSnapshot();
ASSERT_NE(snap, nullptr);
ASSERT_OK(db_->DeleteRange(WriteOptions(), db_->DefaultColumnFamily(),
Key(8, &start), Key(9, &end)));
ASSERT_OK(db_->DeleteRange(WriteOptions(), db_->DefaultColumnFamily(),
Key(2, &start), Key(5, &end)));
Build(10);
if (do_flush) {
ASSERT_OK(db_->Flush(FlushOptions()));
} else {
DBImpl* dbi = static_cast_with_check<DBImpl>(db_);
ASSERT_OK(dbi->TEST_FlushMemTable());
ASSERT_OK(dbi->TEST_CompactRange(0, nullptr, nullptr, nullptr, true));
}
db_->ReleaseSnapshot(snap);
}
}
TEST_F(CorruptionTest, ParanoidFileChecksWithDeleteRange) {
Options options;
options.paranoid_file_checks = true;
options.create_if_missing = true;
for (bool do_flush : {true, false}) {
delete db_;
db_ = nullptr;
ASSERT_OK(DestroyDB(dbname_, options));
ASSERT_OK(DB::Open(options, dbname_, &db_));
assert(db_ != nullptr);
Build(10, 0, 0);
std::string start, end;
ASSERT_OK(db_->DeleteRange(WriteOptions(), db_->DefaultColumnFamily(),
Key(5, &start), Key(15, &end)));
auto snap = db_->GetSnapshot();
ASSERT_NE(snap, nullptr);
ASSERT_OK(db_->DeleteRange(WriteOptions(), db_->DefaultColumnFamily(),
Key(8, &start), Key(9, &end)));
ASSERT_OK(db_->DeleteRange(WriteOptions(), db_->DefaultColumnFamily(),
Key(12, &start), Key(17, &end)));
ASSERT_OK(db_->DeleteRange(WriteOptions(), db_->DefaultColumnFamily(),
Key(2, &start), Key(4, &end)));
Build(10, 10, 0);
if (do_flush) {
ASSERT_OK(db_->Flush(FlushOptions()));
} else {
DBImpl* dbi = static_cast_with_check<DBImpl>(db_);
ASSERT_OK(dbi->TEST_FlushMemTable());
ASSERT_OK(dbi->TEST_CompactRange(0, nullptr, nullptr, nullptr, true));
}
db_->ReleaseSnapshot(snap);
}
}
TEST_F(CorruptionTest, ParanoidFileChecksWithDeleteRangeLast) {
Options options;
options.paranoid_file_checks = true;
options.create_if_missing = true;
for (bool do_flush : {true, false}) {
delete db_;
db_ = nullptr;
ASSERT_OK(DestroyDB(dbname_, options));
ASSERT_OK(DB::Open(options, dbname_, &db_));
assert(db_ != nullptr);
std::string start, end;
Build(10);
ASSERT_OK(db_->DeleteRange(WriteOptions(), db_->DefaultColumnFamily(),
Key(3, &start), Key(7, &end)));
auto snap = db_->GetSnapshot();
ASSERT_NE(snap, nullptr);
ASSERT_OK(db_->DeleteRange(WriteOptions(), db_->DefaultColumnFamily(),
Key(6, &start), Key(8, &end)));
ASSERT_OK(db_->DeleteRange(WriteOptions(), db_->DefaultColumnFamily(),
Key(2, &start), Key(5, &end)));
if (do_flush) {
ASSERT_OK(db_->Flush(FlushOptions()));
} else {
DBImpl* dbi = static_cast_with_check<DBImpl>(db_);
ASSERT_OK(dbi->TEST_FlushMemTable());
ASSERT_OK(dbi->TEST_CompactRange(0, nullptr, nullptr, nullptr, true));
}
db_->ReleaseSnapshot(snap);
}
}
} // namespace ROCKSDB_NAMESPACE
int main(int argc, char** argv) {

@ -837,8 +837,9 @@ TEST_F(DBBlockCacheTest, CacheCompressionDict) {
Random rnd(301);
for (auto compression_type : compression_types) {
Options options = CurrentOptions();
options.compression = compression_type;
options.compression_opts.max_dict_bytes = 4096;
options.bottommost_compression = compression_type;
options.bottommost_compression_opts.max_dict_bytes = 4096;
options.bottommost_compression_opts.enabled = true;
options.create_if_missing = true;
options.num_levels = 2;
options.statistics = ROCKSDB_NAMESPACE::CreateDBStatistics();
@ -888,6 +889,166 @@ TEST_F(DBBlockCacheTest, CacheCompressionDict) {
#endif // ROCKSDB_LITE
class DBBlockCachePinningTest
: public DBTestBase,
public testing::WithParamInterface<
std::tuple<bool, PinningTier, PinningTier, PinningTier>> {
public:
DBBlockCachePinningTest()
: DBTestBase("/db_block_cache_test", /*env_do_fsync=*/false) {}
void SetUp() override {
partition_index_and_filters_ = std::get<0>(GetParam());
top_level_index_pinning_ = std::get<1>(GetParam());
partition_pinning_ = std::get<2>(GetParam());
unpartitioned_pinning_ = std::get<3>(GetParam());
}
bool partition_index_and_filters_;
PinningTier top_level_index_pinning_;
PinningTier partition_pinning_;
PinningTier unpartitioned_pinning_;
};
TEST_P(DBBlockCachePinningTest, TwoLevelDB) {
// Creates one file in L0 and one file in L1. Both files have enough data that
// their index and filter blocks are partitioned. The L1 file will also have
// a compression dictionary (those are trained only during compaction), which
// must be unpartitioned.
const int kKeySize = 32;
const int kBlockSize = 128;
const int kNumBlocksPerFile = 128;
const int kNumKeysPerFile = kBlockSize * kNumBlocksPerFile / kKeySize;
Options options = CurrentOptions();
// `kNoCompression` makes the unit test more portable. But it relies on the
// current behavior of persisting/accessing dictionary even when there's no
// (de)compression happening, which seems fairly likely to change over time.
options.compression = kNoCompression;
options.compression_opts.max_dict_bytes = 4 << 10;
options.statistics = ROCKSDB_NAMESPACE::CreateDBStatistics();
BlockBasedTableOptions table_options;
table_options.block_cache = NewLRUCache(1 << 20 /* capacity */);
table_options.block_size = kBlockSize;
table_options.metadata_block_size = kBlockSize;
table_options.cache_index_and_filter_blocks = true;
table_options.metadata_cache_options.top_level_index_pinning =
top_level_index_pinning_;
table_options.metadata_cache_options.partition_pinning = partition_pinning_;
table_options.metadata_cache_options.unpartitioned_pinning =
unpartitioned_pinning_;
table_options.filter_policy.reset(
NewBloomFilterPolicy(10 /* bits_per_key */));
if (partition_index_and_filters_) {
table_options.index_type =
BlockBasedTableOptions::IndexType::kTwoLevelIndexSearch;
table_options.partition_filters = true;
}
options.table_factory.reset(NewBlockBasedTableFactory(table_options));
Reopen(options);
Random rnd(301);
for (int i = 0; i < 2; ++i) {
for (int j = 0; j < kNumKeysPerFile; ++j) {
ASSERT_OK(Put(Key(i * kNumKeysPerFile + j), rnd.RandomString(kKeySize)));
}
ASSERT_OK(Flush());
if (i == 0) {
// Prevent trivial move so file will be rewritten with dictionary and
// reopened with L1's pinning settings.
CompactRangeOptions cro;
cro.bottommost_level_compaction = BottommostLevelCompaction::kForce;
ASSERT_OK(db_->CompactRange(cro, nullptr, nullptr));
}
}
// Clear all unpinned blocks so unpinned blocks will show up as cache misses
// when reading a key from a file.
table_options.block_cache->EraseUnRefEntries();
// Get base cache values
uint64_t filter_misses = TestGetTickerCount(options, BLOCK_CACHE_FILTER_MISS);
uint64_t index_misses = TestGetTickerCount(options, BLOCK_CACHE_INDEX_MISS);
uint64_t compression_dict_misses =
TestGetTickerCount(options, BLOCK_CACHE_COMPRESSION_DICT_MISS);
// Read a key from the L0 file
Get(Key(kNumKeysPerFile));
uint64_t expected_filter_misses = filter_misses;
uint64_t expected_index_misses = index_misses;
uint64_t expected_compression_dict_misses = compression_dict_misses;
if (partition_index_and_filters_) {
if (top_level_index_pinning_ == PinningTier::kNone) {
++expected_filter_misses;
++expected_index_misses;
}
if (partition_pinning_ == PinningTier::kNone) {
++expected_filter_misses;
++expected_index_misses;
}
} else {
if (unpartitioned_pinning_ == PinningTier::kNone) {
++expected_filter_misses;
++expected_index_misses;
}
}
if (unpartitioned_pinning_ == PinningTier::kNone) {
++expected_compression_dict_misses;
}
ASSERT_EQ(expected_filter_misses,
TestGetTickerCount(options, BLOCK_CACHE_FILTER_MISS));
ASSERT_EQ(expected_index_misses,
TestGetTickerCount(options, BLOCK_CACHE_INDEX_MISS));
ASSERT_EQ(expected_compression_dict_misses,
TestGetTickerCount(options, BLOCK_CACHE_COMPRESSION_DICT_MISS));
// Clear all unpinned blocks so unpinned blocks will show up as cache misses
// when reading a key from a file.
table_options.block_cache->EraseUnRefEntries();
// Read a key from the L1 file
Get(Key(0));
if (partition_index_and_filters_) {
if (top_level_index_pinning_ == PinningTier::kNone ||
top_level_index_pinning_ == PinningTier::kFlushedAndSimilar) {
++expected_filter_misses;
++expected_index_misses;
}
if (partition_pinning_ == PinningTier::kNone ||
partition_pinning_ == PinningTier::kFlushedAndSimilar) {
++expected_filter_misses;
++expected_index_misses;
}
} else {
if (unpartitioned_pinning_ == PinningTier::kNone ||
unpartitioned_pinning_ == PinningTier::kFlushedAndSimilar) {
++expected_filter_misses;
++expected_index_misses;
}
}
if (unpartitioned_pinning_ == PinningTier::kNone ||
unpartitioned_pinning_ == PinningTier::kFlushedAndSimilar) {
++expected_compression_dict_misses;
}
ASSERT_EQ(expected_filter_misses,
TestGetTickerCount(options, BLOCK_CACHE_FILTER_MISS));
ASSERT_EQ(expected_index_misses,
TestGetTickerCount(options, BLOCK_CACHE_INDEX_MISS));
ASSERT_EQ(expected_compression_dict_misses,
TestGetTickerCount(options, BLOCK_CACHE_COMPRESSION_DICT_MISS));
}
INSTANTIATE_TEST_CASE_P(
DBBlockCachePinningTest, DBBlockCachePinningTest,
::testing::Combine(
::testing::Bool(),
::testing::Values(PinningTier::kNone, PinningTier::kFlushedAndSimilar,
PinningTier::kAll),
::testing::Values(PinningTier::kNone, PinningTier::kFlushedAndSimilar,
PinningTier::kAll),
::testing::Values(PinningTier::kNone, PinningTier::kFlushedAndSimilar,
PinningTier::kAll)));
} // namespace ROCKSDB_NAMESPACE
int main(int argc, char** argv) {

@ -974,6 +974,7 @@ Status DBImpl::SetOptions(
MutableCFOptions new_options;
Status s;
Status persist_options_status;
persist_options_status.PermitUncheckedError(); // Allow uninitialized access
SuperVersionContext sv_context(/* create_superversion */ true);
{
auto db_options = GetDBOptions();
@ -4717,7 +4718,8 @@ Status DBImpl::CreateColumnFamilyWithImport(
temp_s.ToString().c_str());
}
// Always returns Status::OK()
assert(DestroyColumnFamilyHandle(*handle).ok());
temp_s = DestroyColumnFamilyHandle(*handle);
assert(temp_s.ok());
*handle = nullptr;
}
return status;

@ -950,7 +950,8 @@ Status DBImpl::CompactRange(const CompactRangeOptions& options,
s = ReFitLevel(cfd, final_output_level, options.target_level);
TEST_SYNC_POINT("DBImpl::CompactRange:PostRefitLevel");
// ContinueBackgroundWork always returns Status::OK().
assert(ContinueBackgroundWork().ok());
Status temp_s = ContinueBackgroundWork();
assert(temp_s.ok());
}
EnableManualCompaction();
}

@ -12,6 +12,7 @@
#include "db/db_test_util.h"
#include "db/read_callback.h"
#include "options/options_helper.h"
#include "port/port.h"
#include "port/stack_trace.h"
#include "rocksdb/persistent_cache.h"
@ -1389,6 +1390,178 @@ TEST_F(DBTest2, PresetCompressionDictLocality) {
}
}
class PresetCompressionDictTest
: public DBTestBase,
public testing::WithParamInterface<std::tuple<CompressionType, bool>> {
public:
PresetCompressionDictTest()
: DBTestBase("/db_test2", false /* env_do_fsync */),
compression_type_(std::get<0>(GetParam())),
bottommost_(std::get<1>(GetParam())) {}
protected:
const CompressionType compression_type_;
const bool bottommost_;
};
INSTANTIATE_TEST_CASE_P(
DBTest2, PresetCompressionDictTest,
::testing::Combine(::testing::ValuesIn(GetSupportedDictCompressions()),
::testing::Bool()));
TEST_P(PresetCompressionDictTest, Flush) {
// Verifies that dictionary is generated and written during flush only when
// `ColumnFamilyOptions::compression` enables dictionary.
const size_t kValueLen = 256;
const size_t kKeysPerFile = 1 << 10;
const size_t kDictLen = 4 << 10;
Options options = CurrentOptions();
if (bottommost_) {
options.bottommost_compression = compression_type_;
options.bottommost_compression_opts.enabled = true;
options.bottommost_compression_opts.max_dict_bytes = kDictLen;
} else {
options.compression = compression_type_;
options.compression_opts.max_dict_bytes = kDictLen;
}
options.memtable_factory.reset(new SpecialSkipListFactory(kKeysPerFile));
options.statistics = CreateDBStatistics();
BlockBasedTableOptions bbto;
bbto.cache_index_and_filter_blocks = true;
options.table_factory.reset(NewBlockBasedTableFactory(bbto));
Reopen(options);
uint64_t prev_compression_dict_misses =
TestGetTickerCount(options, BLOCK_CACHE_COMPRESSION_DICT_MISS);
Random rnd(301);
for (size_t i = 0; i <= kKeysPerFile; ++i) {
ASSERT_OK(Put(Key(static_cast<int>(i)), rnd.RandomString(kValueLen)));
}
ASSERT_OK(dbfull()->TEST_WaitForFlushMemTable());
// If there's a compression dictionary, it should have been loaded when the
// flush finished, incurring a cache miss.
uint64_t expected_compression_dict_misses;
if (bottommost_) {
expected_compression_dict_misses = prev_compression_dict_misses;
} else {
expected_compression_dict_misses = prev_compression_dict_misses + 1;
}
ASSERT_EQ(expected_compression_dict_misses,
TestGetTickerCount(options, BLOCK_CACHE_COMPRESSION_DICT_MISS));
}
TEST_P(PresetCompressionDictTest, CompactNonBottommost) {
// Verifies that dictionary is generated and written during compaction to
// non-bottommost level only when `ColumnFamilyOptions::compression` enables
// dictionary.
const size_t kValueLen = 256;
const size_t kKeysPerFile = 1 << 10;
const size_t kDictLen = 4 << 10;
Options options = CurrentOptions();
if (bottommost_) {
options.bottommost_compression = compression_type_;
options.bottommost_compression_opts.enabled = true;
options.bottommost_compression_opts.max_dict_bytes = kDictLen;
} else {
options.compression = compression_type_;
options.compression_opts.max_dict_bytes = kDictLen;
}
options.disable_auto_compactions = true;
options.statistics = CreateDBStatistics();
BlockBasedTableOptions bbto;
bbto.cache_index_and_filter_blocks = true;
options.table_factory.reset(NewBlockBasedTableFactory(bbto));
Reopen(options);
Random rnd(301);
for (size_t j = 0; j <= kKeysPerFile; ++j) {
ASSERT_OK(Put(Key(static_cast<int>(j)), rnd.RandomString(kValueLen)));
}
ASSERT_OK(Flush());
MoveFilesToLevel(2);
for (int i = 0; i < 2; ++i) {
for (size_t j = 0; j <= kKeysPerFile; ++j) {
ASSERT_OK(Put(Key(static_cast<int>(j)), rnd.RandomString(kValueLen)));
}
ASSERT_OK(Flush());
}
#ifndef ROCKSDB_LITE
ASSERT_EQ("2,0,1", FilesPerLevel(0));
#endif // ROCKSDB_LITE
uint64_t prev_compression_dict_misses =
TestGetTickerCount(options, BLOCK_CACHE_COMPRESSION_DICT_MISS);
// This L0->L1 compaction merges the two L0 files into L1. The produced L1
// file is not bottommost due to the existing L2 file covering the same key-
// range.
ASSERT_OK(dbfull()->TEST_CompactRange(0, nullptr, nullptr));
#ifndef ROCKSDB_LITE
ASSERT_EQ("0,1,1", FilesPerLevel(0));
#endif // ROCKSDB_LITE
// If there's a compression dictionary, it should have been loaded when the
// compaction finished, incurring a cache miss.
uint64_t expected_compression_dict_misses;
if (bottommost_) {
expected_compression_dict_misses = prev_compression_dict_misses;
} else {
expected_compression_dict_misses = prev_compression_dict_misses + 1;
}
ASSERT_EQ(expected_compression_dict_misses,
TestGetTickerCount(options, BLOCK_CACHE_COMPRESSION_DICT_MISS));
}
TEST_P(PresetCompressionDictTest, CompactBottommost) {
// Verifies that dictionary is generated and written during compaction to
// bottommost level only when either `ColumnFamilyOptions::compression` or
// `ColumnFamilyOptions::bottommost_compression` enables dictionary.
const size_t kValueLen = 256;
const size_t kKeysPerFile = 1 << 10;
const size_t kDictLen = 4 << 10;
Options options = CurrentOptions();
if (bottommost_) {
options.bottommost_compression = compression_type_;
options.bottommost_compression_opts.enabled = true;
options.bottommost_compression_opts.max_dict_bytes = kDictLen;
} else {
options.compression = compression_type_;
options.compression_opts.max_dict_bytes = kDictLen;
}
options.disable_auto_compactions = true;
options.statistics = CreateDBStatistics();
BlockBasedTableOptions bbto;
bbto.cache_index_and_filter_blocks = true;
options.table_factory.reset(NewBlockBasedTableFactory(bbto));
Reopen(options);
Random rnd(301);
for (int i = 0; i < 2; ++i) {
for (size_t j = 0; j <= kKeysPerFile; ++j) {
ASSERT_OK(Put(Key(static_cast<int>(j)), rnd.RandomString(kValueLen)));
}
ASSERT_OK(Flush());
}
#ifndef ROCKSDB_LITE
ASSERT_EQ("2", FilesPerLevel(0));
#endif // ROCKSDB_LITE
uint64_t prev_compression_dict_misses =
TestGetTickerCount(options, BLOCK_CACHE_COMPRESSION_DICT_MISS);
CompactRangeOptions cro;
ASSERT_OK(db_->CompactRange(cro, nullptr, nullptr));
#ifndef ROCKSDB_LITE
ASSERT_EQ("0,1", FilesPerLevel(0));
#endif // ROCKSDB_LITE
// If there's a compression dictionary, it should have been loaded when the
// compaction finished, incurring a cache miss.
ASSERT_EQ(prev_compression_dict_misses + 1,
TestGetTickerCount(options, BLOCK_CACHE_COMPRESSION_DICT_MISS));
}
class CompactionCompressionListener : public EventListener {
public:
explicit CompactionCompressionListener(Options* db_options)

@ -662,10 +662,9 @@ void ForwardIterator::RebuildIterators(bool refresh_sv) {
read_options_, sv_->current->version_set()->LastSequence()));
range_del_agg.AddTombstones(std::move(range_del_iter));
// Always returns Status::OK().
assert(
sv_->imm
->AddRangeTombstoneIterators(read_options_, &arena_, &range_del_agg)
.ok());
Status temp_s = sv_->imm->AddRangeTombstoneIterators(read_options_, &arena_,
&range_del_agg);
assert(temp_s.ok());
}
has_iter_trimmed_for_upper_bound_ = false;
@ -728,10 +727,9 @@ void ForwardIterator::RenewIterators() {
read_options_, sv_->current->version_set()->LastSequence()));
range_del_agg.AddTombstones(std::move(range_del_iter));
// Always returns Status::OK().
assert(
svnew->imm
->AddRangeTombstoneIterators(read_options_, &arena_, &range_del_agg)
.ok());
Status temp_s = svnew->imm->AddRangeTombstoneIterators(
read_options_, &arena_, &range_del_agg);
assert(temp_s.ok());
}
const auto* vstorage = sv_->current->storage_info();

@ -132,6 +132,9 @@ DECLARE_int32(set_options_one_in);
DECLARE_int32(set_in_place_one_in);
DECLARE_int64(cache_size);
DECLARE_bool(cache_index_and_filter_blocks);
DECLARE_int32(top_level_index_pinning);
DECLARE_int32(partition_pinning);
DECLARE_int32(unpartitioned_pinning);
DECLARE_bool(use_clock_cache);
DECLARE_uint64(subcompactions);
DECLARE_uint64(periodic_compaction_seconds);

@ -287,6 +287,24 @@ DEFINE_int64(cache_size, 2LL * KB * KB * KB,
DEFINE_bool(cache_index_and_filter_blocks, false,
"True if indexes/filters should be cached in block cache.");
DEFINE_int32(
top_level_index_pinning,
static_cast<int32_t>(ROCKSDB_NAMESPACE::PinningTier::kFallback),
"Type of pinning for top-level indexes into metadata partitions (see "
"`enum PinningTier` in table.h)");
DEFINE_int32(
partition_pinning,
static_cast<int32_t>(ROCKSDB_NAMESPACE::PinningTier::kFallback),
"Type of pinning for metadata partitions (see `enum PinningTier` in "
"table.h)");
DEFINE_int32(
unpartitioned_pinning,
static_cast<int32_t>(ROCKSDB_NAMESPACE::PinningTier::kFallback),
"Type of pinning for unpartitioned metadata blocks (see `enum PinningTier` "
"in table.h)");
DEFINE_bool(use_clock_cache, false,
"Replace default LRU block cache with clock cache.");

@ -1964,6 +1964,12 @@ void StressTest::Open() {
block_based_options.block_cache = cache_;
block_based_options.cache_index_and_filter_blocks =
FLAGS_cache_index_and_filter_blocks;
block_based_options.metadata_cache_options.top_level_index_pinning =
static_cast<PinningTier>(FLAGS_top_level_index_pinning);
block_based_options.metadata_cache_options.partition_pinning =
static_cast<PinningTier>(FLAGS_partition_pinning);
block_based_options.metadata_cache_options.unpartitioned_pinning =
static_cast<PinningTier>(FLAGS_unpartitioned_pinning);
block_based_options.block_cache_compressed = compressed_cache_;
block_based_options.checksum = checksum_type_e;
block_based_options.block_size = FLAGS_block_size;

@ -162,9 +162,15 @@ class Status {
static Status NotFound(const Slice& msg, const Slice& msg2 = Slice()) {
return Status(kNotFound, msg, msg2);
}
// Fast path for not found without malloc;
static Status NotFound(SubCode msg = kNone) { return Status(kNotFound, msg); }
static Status NotFound(SubCode sc, const Slice& msg,
const Slice& msg2 = Slice()) {
return Status(kNotFound, sc, msg, msg2);
}
static Status Corruption(const Slice& msg, const Slice& msg2 = Slice()) {
return Status(kCorruption, msg, msg2);
}
@ -463,7 +469,8 @@ class Status {
#ifdef ROCKSDB_ASSERT_STATUS_CHECKED
checked_ = true;
#endif // ROCKSDB_ASSERT_STATUS_CHECKED
return (code() == kIOError) && (subcode() == kPathNotFound);
return (code() == kIOError || code() == kNotFound) &&
(subcode() == kPathNotFound);
}
// Returns true iff the status indicates manual compaction paused. This
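The widened `IsPathNotFound()` check above can be illustrated with a reduced model. This is only a sketch: `MiniStatus` and its two enums are hypothetical stand-ins for `rocksdb::Status`, kept to the fields the predicate reads.

```cpp
#include <cassert>

// Hypothetical, reduced stand-ins for the Status code and subcode enums.
enum Code { kOk, kNotFound, kIOError };
enum SubCode { kNone, kPathNotFound };

struct MiniStatus {
  Code code = kOk;
  SubCode subcode = kNone;

  bool IsNotFound() const { return code == kNotFound; }

  // Mirrors the updated predicate: an IOError or a NotFound status
  // qualifies, provided the subcode is kPathNotFound.
  bool IsPathNotFound() const {
    return (code == kIOError || code == kNotFound) &&
           subcode == kPathNotFound;
  }
};
```

With this shape, a status can satisfy both `IsNotFound()` and `IsPathNotFound()`, which is what the new `LatestOptionsNotFound` test asserts.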


@ -51,6 +51,55 @@ enum ChecksumType : char {
kxxHash64 = 0x3,
};
// `PinningTier` is used to specify which tier of block-based tables should
// be affected by a block cache pinning setting (see
// `MetadataCacheOptions` below).
enum class PinningTier {
// For compatibility, this value falls back to the behavior
// indicated by the deprecated options,
// `pin_l0_filter_and_index_blocks_in_cache` and
// `pin_top_level_index_and_filter`.
kFallback,
// This tier contains no block-based tables.
kNone,
// This tier contains block-based tables that may have originated from a
// memtable flush. In particular, it includes tables from L0 that are smaller
// than 1.5 times the current `write_buffer_size`. Note these criteria imply
// it can include intra-L0 compaction outputs and ingested files, as long as
// they are not abnormally large compared to flushed files in L0.
kFlushedAndSimilar,
// This tier contains all block-based tables.
kAll,
};
// `MetadataCacheOptions` contains members indicating the desired caching
// behavior for the different categories of metadata blocks.
struct MetadataCacheOptions {
// The tier of block-based tables whose top-level index into metadata
// partitions will be pinned. Currently indexes and filters may be
// partitioned.
//
// Note `cache_index_and_filter_blocks` must be true for this option to have
// any effect. Otherwise any top-level index into metadata partitions would be
// held in table reader memory, outside the block cache.
PinningTier top_level_index_pinning = PinningTier::kFallback;
// The tier of block-based tables whose metadata partitions will be pinned.
// Currently indexes and filters may be partitioned.
PinningTier partition_pinning = PinningTier::kFallback;
// The tier of block-based tables whose unpartitioned metadata blocks will be
// pinned.
//
// Note `cache_index_and_filter_blocks` must be true for this option to have
// any effect. Otherwise the unpartitioned meta-blocks would be held in table
// reader memory, outside the block cache.
PinningTier unpartitioned_pinning = PinningTier::kFallback;
};
// For advanced user only
struct BlockBasedTableOptions {
static const char* kName() { return "BlockTableOptions"; };
@ -79,12 +128,44 @@ struct BlockBasedTableOptions {
// than data blocks.
bool cache_index_and_filter_blocks_with_high_priority = true;
// DEPRECATED: This option will be removed in a future version. For now, this
// option still takes effect by updating each of the following variables that
// has the default value, `PinningTier::kFallback`:
//
// - `MetadataCacheOptions::partition_pinning`
// - `MetadataCacheOptions::unpartitioned_pinning`
//
// The updated value is chosen as follows:
//
// - `pin_l0_filter_and_index_blocks_in_cache == false` ->
// `PinningTier::kNone`
// - `pin_l0_filter_and_index_blocks_in_cache == true` ->
// `PinningTier::kFlushedAndSimilar`
//
// To migrate away from this flag, explicitly configure
// `MetadataCacheOptions` as described above.
//
// if cache_index_and_filter_blocks is true and the below is true, then
// filter and index blocks are stored in the cache, but a reference is
// held in the "table reader" object so the blocks are pinned and only
// evicted from cache when the table reader is freed.
bool pin_l0_filter_and_index_blocks_in_cache = false;
// DEPRECATED: This option will be removed in a future version. For now, this
// option still takes effect by updating
// `MetadataCacheOptions::top_level_index_pinning` when it has the
// default value, `PinningTier::kFallback`.
//
// The updated value is chosen as follows:
//
// - `pin_top_level_index_and_filter == false` ->
// `PinningTier::kNone`
// - `pin_top_level_index_and_filter == true` ->
// `PinningTier::kAll`
//
// To migrate away from this flag, explicitly configure
// `MetadataCacheOptions` as described above.
//
// If cache_index_and_filter_blocks is true and the below is true, then
// the top-level index of partitioned filter and index blocks are stored in
// the cache, but a reference is held in the "table reader" object so the
@ -92,6 +173,12 @@ struct BlockBasedTableOptions {
// freed. This is not limited to l0 in LSM tree.
bool pin_top_level_index_and_filter = true;
// The desired block cache pinning behavior for the different categories of
// metadata blocks. While pinning can reduce block cache contention, users
// must take care not to pin excessive amounts of data, which risks
// overflowing block cache.
MetadataCacheOptions metadata_cache_options;
// The index type that will be used for this table.
enum IndexType : char {
// A space efficient index block that is optimized for


@ -6,7 +6,7 @@
#define ROCKSDB_MAJOR 6
#define ROCKSDB_MINOR 14
#define ROCKSDB_PATCH 0
#define ROCKSDB_PATCH 3
// Do not use these. We made the mistake of declaring macros starting with
// double underscore. Now we have to live with our choice. We'll deprecate these


@ -91,8 +91,10 @@ void WriteBufferManager::ReserveMemWithCache(size_t mem) {
// Expand size by at least 256KB.
// Add a dummy record to the cache
Cache::Handle* handle = nullptr;
cache_rep_->cache_->Insert(cache_rep_->GetNextCacheKey(), nullptr,
kSizeDummyEntry, nullptr, &handle);
Status s =
cache_rep_->cache_->Insert(cache_rep_->GetNextCacheKey(), nullptr,
kSizeDummyEntry, nullptr, &handle);
s.PermitUncheckedError(); // TODO: What to do on error?
// We keep the handle even if insertion fails and a null handle is
// returned, so that when memory shrinks, we don't release extra
// entries from cache.


@ -296,6 +296,17 @@ std::vector<CompressionType> GetSupportedCompressions() {
return supported_compressions;
}
std::vector<CompressionType> GetSupportedDictCompressions() {
std::vector<CompressionType> dict_compression_types;
for (const auto& comp_to_name : OptionsHelper::compression_type_string_map) {
CompressionType t = comp_to_name.second;
if (t != kDisableCompressionOption && DictCompressionTypeSupported(t)) {
dict_compression_types.push_back(t);
}
}
return dict_compression_types;
}
#ifndef ROCKSDB_LITE
bool ParseSliceTransformHelper(
const std::string& kFixedPrefixName, const std::string& kCappedPrefixName,
@ -724,9 +735,15 @@ Status GetColumnFamilyOptionsFromMap(
*new_options = base_options;
const auto config = CFOptionsAsConfigurable(base_options);
return ConfigureFromMap<ColumnFamilyOptions>(config_options, opts_map,
OptionsHelper::kCFOptionsName,
config.get(), new_options);
Status s = ConfigureFromMap<ColumnFamilyOptions>(
config_options, opts_map, OptionsHelper::kCFOptionsName, config.get(),
new_options);
// Translate any errors (NotFound, NotSupported) to InvalidArgument
if (s.ok() || s.IsInvalidArgument()) {
return s;
} else {
return Status::InvalidArgument(s.getState());
}
}
Status GetColumnFamilyOptionsFromString(
@ -773,9 +790,15 @@ Status GetDBOptionsFromMap(
assert(new_options);
*new_options = base_options;
auto config = DBOptionsAsConfigurable(base_options);
return ConfigureFromMap<DBOptions>(config_options, opts_map,
OptionsHelper::kDBOptionsName,
config.get(), new_options);
Status s = ConfigureFromMap<DBOptions>(config_options, opts_map,
OptionsHelper::kDBOptionsName,
config.get(), new_options);
// Translate any errors (NotFound, NotSupported) to InvalidArgument
if (s.ok() || s.IsInvalidArgument()) {
return s;
} else {
return Status::InvalidArgument(s.getState());
}
}
Status GetDBOptionsFromString(const DBOptions& base_options,
@ -841,7 +864,12 @@ Status GetOptionsFromString(const ConfigOptions& config_options,
*new_options = Options(*new_db_options, base_options);
}
}
return s;
// Translate any errors (NotFound, NotSupported) to InvalidArgument
if (s.ok() || s.IsInvalidArgument()) {
return s;
} else {
return Status::InvalidArgument(s.getState());
}
}
std::unordered_map<std::string, EncodingType>
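The error translation repeated in the hunks above follows one pattern: OK and InvalidArgument pass through, and any other failure is re-wrapped as InvalidArgument with the original message. A minimal sketch of that pattern, assuming a hypothetical `MiniStatus` type and helper name (neither exists in RocksDB):

```cpp
#include <cassert>
#include <string>

// Hypothetical, reduced stand-in for rocksdb::Status.
struct MiniStatus {
  enum Code { kOk, kNotFound, kNotSupported, kInvalidArgument };
  Code code = kOk;
  std::string state;  // human-readable message, empty when OK

  bool ok() const { return code == kOk; }
  bool IsInvalidArgument() const { return code == kInvalidArgument; }
};

// Passes OK and InvalidArgument through unchanged; re-wraps any other
// error as InvalidArgument while keeping the original message.
MiniStatus TranslateToInvalidArgument(const MiniStatus& s) {
  if (s.ok() || s.IsInvalidArgument()) {
    return s;
  }
  MiniStatus out;
  out.code = MiniStatus::kInvalidArgument;
  out.state = s.state;
  return out;
}
```

The diff inlines this logic at each call site rather than sharing a helper; the helper here is only a way to show the behavior in one place.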


@ -25,6 +25,8 @@ struct Options;
std::vector<CompressionType> GetSupportedCompressions();
std::vector<CompressionType> GetSupportedDictCompressions();
// Checks that the combination of DBOptions and ColumnFamilyOptions are valid
Status ValidateOptions(const DBOptions& db_opts,
const ColumnFamilyOptions& cf_opts);


@ -460,7 +460,13 @@ Status RocksDBOptionsParser::EndSection(
opt_section_titles[kOptionSectionTableOptions].size()),
&(cf_opt->table_factory));
if (s.ok()) {
return cf_opt->table_factory->ConfigureFromMap(config_options, opt_map);
s = cf_opt->table_factory->ConfigureFromMap(config_options, opt_map);
// Translate any errors (NotFound, NotSupported) to InvalidArgument
if (s.ok() || s.IsInvalidArgument()) {
return s;
} else {
return Status::InvalidArgument(s.getState());
}
} else {
// Return OK for not supported table factories as TableFactory
// Deserialization is optional.


@ -158,6 +158,9 @@ TEST_F(OptionsSettableTest, BlockBasedTableOptionsAllFieldsSettable) {
*bbto,
"cache_index_and_filter_blocks=1;"
"cache_index_and_filter_blocks_with_high_priority=true;"
"metadata_cache_options={top_level_index_pinning=kFallback;"
"partition_pinning=kAll;"
"unpartitioned_pinning=kFlushedAndSimilar;};"
"pin_l0_filter_and_index_blocks_in_cache=1;"
"pin_top_level_index_and_filter=1;"
"index_type=kHashSearch;"


@ -302,8 +302,11 @@ TEST_F(OptionsTest, GetOptionsFromMapTest) {
ASSERT_EQ(new_db_opt.strict_bytes_per_sync, true);
db_options_map["max_open_files"] = "hello";
ASSERT_NOK(
GetDBOptionsFromMap(exact, base_db_opt, db_options_map, &new_db_opt));
Status s =
GetDBOptionsFromMap(exact, base_db_opt, db_options_map, &new_db_opt);
ASSERT_NOK(s);
ASSERT_TRUE(s.IsInvalidArgument());
ASSERT_OK(
RocksDBOptionsParser::VerifyDBOptions(exact, base_db_opt, new_db_opt));
ASSERT_OK(
@ -311,8 +314,9 @@ TEST_F(OptionsTest, GetOptionsFromMapTest) {
// unknown options should fail parsing without ignore_unknown_options = true
db_options_map["unknown_db_option"] = "1";
ASSERT_NOK(
GetDBOptionsFromMap(exact, base_db_opt, db_options_map, &new_db_opt));
s = GetDBOptionsFromMap(exact, base_db_opt, db_options_map, &new_db_opt);
ASSERT_NOK(s);
ASSERT_TRUE(s.IsInvalidArgument());
ASSERT_OK(
RocksDBOptionsParser::VerifyDBOptions(exact, base_db_opt, new_db_opt));
@ -397,22 +401,29 @@ TEST_F(OptionsTest, GetColumnFamilyOptionsFromStringTest) {
ASSERT_EQ(kMoName, std::string(new_cf_opt.merge_operator->Name()));
// Wrong key/value pair
ASSERT_NOK(GetColumnFamilyOptionsFromString(
Status s = GetColumnFamilyOptionsFromString(
config_options, base_cf_opt,
"write_buffer_size=13;max_write_buffer_number;", &new_cf_opt));
"write_buffer_size=13;max_write_buffer_number;", &new_cf_opt);
ASSERT_NOK(s);
ASSERT_TRUE(s.IsInvalidArgument());
ASSERT_OK(RocksDBOptionsParser::VerifyCFOptions(config_options, base_cf_opt,
new_cf_opt));
// Error Paring value
ASSERT_NOK(GetColumnFamilyOptionsFromString(
// Error Parsing value
s = GetColumnFamilyOptionsFromString(
config_options, base_cf_opt,
"write_buffer_size=13;max_write_buffer_number=;", &new_cf_opt));
"write_buffer_size=13;max_write_buffer_number=;", &new_cf_opt);
ASSERT_NOK(s);
ASSERT_TRUE(s.IsInvalidArgument());
ASSERT_OK(RocksDBOptionsParser::VerifyCFOptions(config_options, base_cf_opt,
new_cf_opt));
// Missing option name
ASSERT_NOK(GetColumnFamilyOptionsFromString(
config_options, base_cf_opt, "write_buffer_size=13; =100;", &new_cf_opt));
s = GetColumnFamilyOptionsFromString(
config_options, base_cf_opt, "write_buffer_size=13; =100;", &new_cf_opt);
ASSERT_NOK(s);
ASSERT_TRUE(s.IsInvalidArgument());
ASSERT_OK(RocksDBOptionsParser::VerifyCFOptions(config_options, base_cf_opt,
new_cf_opt));
@ -783,7 +794,10 @@ TEST_F(OptionsTest, OldInterfaceTest) {
ASSERT_EQ(new_db_opt.paranoid_checks, true);
ASSERT_EQ(new_db_opt.max_open_files, 32);
db_options_map["unknown_option"] = "1";
ASSERT_NOK(GetDBOptionsFromMap(base_db_opt, db_options_map, &new_db_opt));
Status s = GetDBOptionsFromMap(base_db_opt, db_options_map, &new_db_opt);
ASSERT_NOK(s);
ASSERT_TRUE(s.IsInvalidArgument());
ASSERT_OK(
RocksDBOptionsParser::VerifyDBOptions(exact, base_db_opt, new_db_opt));
ASSERT_OK(GetDBOptionsFromMap(base_db_opt, db_options_map, &new_db_opt, true,
@ -795,11 +809,13 @@ TEST_F(OptionsTest, OldInterfaceTest) {
ASSERT_EQ(new_db_opt.create_if_missing, false);
ASSERT_EQ(new_db_opt.error_if_exists, false);
ASSERT_EQ(new_db_opt.max_open_files, 42);
ASSERT_NOK(GetDBOptionsFromString(
s = GetDBOptionsFromString(
base_db_opt,
"create_if_missing=false;error_if_exists=false;max_open_files=42;"
"unknown_option=1;",
&new_db_opt));
&new_db_opt);
ASSERT_NOK(s);
ASSERT_TRUE(s.IsInvalidArgument());
ASSERT_OK(
RocksDBOptionsParser::VerifyDBOptions(exact, base_db_opt, new_db_opt));
}
@ -844,19 +860,23 @@ TEST_F(OptionsTest, GetBlockBasedTableOptionsFromString) {
EXPECT_EQ(bfp.GetWholeBitsPerKey(), 5);
// unknown option
ASSERT_NOK(GetBlockBasedTableOptionsFromString(
Status s = GetBlockBasedTableOptionsFromString(
config_options, table_opt,
"cache_index_and_filter_blocks=1;index_type=kBinarySearch;"
"bad_option=1",
&new_opt));
&new_opt);
ASSERT_NOK(s);
ASSERT_TRUE(s.IsInvalidArgument());
ASSERT_EQ(static_cast<bool>(table_opt.cache_index_and_filter_blocks),
new_opt.cache_index_and_filter_blocks);
ASSERT_EQ(table_opt.index_type, new_opt.index_type);
// unrecognized index type
ASSERT_NOK(GetBlockBasedTableOptionsFromString(
s = GetBlockBasedTableOptionsFromString(
config_options, table_opt,
"cache_index_and_filter_blocks=1;index_type=kBinarySearchXX", &new_opt));
"cache_index_and_filter_blocks=1;index_type=kBinarySearchXX", &new_opt);
ASSERT_NOK(s);
ASSERT_TRUE(s.IsInvalidArgument());
ASSERT_EQ(table_opt.cache_index_and_filter_blocks,
new_opt.cache_index_and_filter_blocks);
ASSERT_EQ(table_opt.index_type, new_opt.index_type);
@ -870,21 +890,23 @@ TEST_F(OptionsTest, GetBlockBasedTableOptionsFromString) {
ASSERT_EQ(table_opt.index_type, new_opt.index_type);
// unrecognized filter policy name
ASSERT_NOK(
GetBlockBasedTableOptionsFromString(config_options, table_opt,
s = GetBlockBasedTableOptionsFromString(config_options, table_opt,
"cache_index_and_filter_blocks=1;"
"filter_policy=bloomfilterxx:4:true",
&new_opt));
&new_opt);
ASSERT_NOK(s);
ASSERT_TRUE(s.IsInvalidArgument());
ASSERT_EQ(table_opt.cache_index_and_filter_blocks,
new_opt.cache_index_and_filter_blocks);
ASSERT_EQ(table_opt.filter_policy, new_opt.filter_policy);
// unrecognized filter policy config
ASSERT_NOK(
GetBlockBasedTableOptionsFromString(config_options, table_opt,
s = GetBlockBasedTableOptionsFromString(config_options, table_opt,
"cache_index_and_filter_blocks=1;"
"filter_policy=bloomfilter:4",
&new_opt));
&new_opt);
ASSERT_NOK(s);
ASSERT_TRUE(s.IsInvalidArgument());
ASSERT_EQ(table_opt.cache_index_and_filter_blocks,
new_opt.cache_index_and_filter_blocks);
ASSERT_EQ(table_opt.filter_policy, new_opt.filter_policy);
@ -1017,18 +1039,22 @@ TEST_F(OptionsTest, GetPlainTableOptionsFromString) {
ASSERT_TRUE(new_opt.store_index_in_file);
// unknown option
ASSERT_NOK(GetPlainTableOptionsFromString(
Status s = GetPlainTableOptionsFromString(
config_options, table_opt,
"user_key_len=66;bloom_bits_per_key=20;hash_table_ratio=0.5;"
"bad_option=1",
&new_opt));
&new_opt);
ASSERT_NOK(s);
ASSERT_TRUE(s.IsInvalidArgument());
// unrecognized EncodingType
ASSERT_NOK(GetPlainTableOptionsFromString(
s = GetPlainTableOptionsFromString(
config_options, table_opt,
"user_key_len=66;bloom_bits_per_key=20;hash_table_ratio=0.5;"
"encoding_type=kPrefixXX",
&new_opt));
&new_opt);
ASSERT_NOK(s);
ASSERT_TRUE(s.IsInvalidArgument());
}
#endif // !ROCKSDB_LITE
@ -1147,23 +1173,29 @@ TEST_F(OptionsTest, GetOptionsFromStringTest) {
base_options.dump_malloc_stats = false;
base_options.write_buffer_size = 1024;
Options bad_options = new_options;
ASSERT_NOK(GetOptionsFromString(config_options, base_options,
Status s = GetOptionsFromString(config_options, base_options,
"create_if_missing=XX;dump_malloc_stats=true",
&bad_options));
&bad_options);
ASSERT_NOK(s);
ASSERT_TRUE(s.IsInvalidArgument());
ASSERT_EQ(bad_options.dump_malloc_stats, false);
bad_options = new_options;
ASSERT_NOK(GetOptionsFromString(config_options, base_options,
"write_buffer_size=XX;dump_malloc_stats=true",
&bad_options));
s = GetOptionsFromString(config_options, base_options,
"write_buffer_size=XX;dump_malloc_stats=true",
&bad_options);
ASSERT_NOK(s);
ASSERT_TRUE(s.IsInvalidArgument());
ASSERT_EQ(bad_options.dump_malloc_stats, false);
// Test a bad value for a TableFactory Option returns a failure
bad_options = new_options;
ASSERT_NOK(GetOptionsFromString(config_options, base_options,
"write_buffer_size=16;dump_malloc_stats=true"
"block_based_table_factory={block_size=XX;};",
&bad_options));
s = GetOptionsFromString(config_options, base_options,
"write_buffer_size=16;dump_malloc_stats=true"
"block_based_table_factory={block_size=XX;};",
&bad_options);
ASSERT_TRUE(s.IsInvalidArgument());
ASSERT_EQ(bad_options.dump_malloc_stats, false);
ASSERT_EQ(bad_options.write_buffer_size, 1024);


@ -160,6 +160,16 @@ size_t TailPrefetchStats::GetSuggestedPrefetchSize() {
}
#ifndef ROCKSDB_LITE
const std::string kOptNameMetadataCacheOpts = "metadata_cache_options";
static std::unordered_map<std::string, PinningTier>
pinning_tier_type_string_map = {
{"kFallback", PinningTier::kFallback},
{"kNone", PinningTier::kNone},
{"kFlushedAndSimilar", PinningTier::kFlushedAndSimilar},
{"kAll", PinningTier::kAll}};
static std::unordered_map<std::string, BlockBasedTableOptions::IndexType>
block_base_table_index_type_string_map = {
{"kBinarySearch", BlockBasedTableOptions::IndexType::kBinarySearch},
@ -187,6 +197,22 @@ static std::unordered_map<std::string,
{"kShortenSeparatorsAndSuccessor",
BlockBasedTableOptions::IndexShorteningMode::
kShortenSeparatorsAndSuccessor}};
static std::unordered_map<std::string, OptionTypeInfo>
metadata_cache_options_type_info = {
{"top_level_index_pinning",
OptionTypeInfo::Enum<PinningTier>(
offsetof(struct MetadataCacheOptions, top_level_index_pinning),
&pinning_tier_type_string_map)},
{"partition_pinning",
OptionTypeInfo::Enum<PinningTier>(
offsetof(struct MetadataCacheOptions, partition_pinning),
&pinning_tier_type_string_map)},
{"unpartitioned_pinning",
OptionTypeInfo::Enum<PinningTier>(
offsetof(struct MetadataCacheOptions, unpartitioned_pinning),
&pinning_tier_type_string_map)}};
#endif // ROCKSDB_LITE
static std::unordered_map<std::string, OptionTypeInfo>
@ -348,6 +374,11 @@ static std::unordered_map<std::string, OptionTypeInfo>
pin_top_level_index_and_filter),
OptionType::kBoolean, OptionVerificationType::kNormal,
OptionTypeFlags::kNone}},
{kOptNameMetadataCacheOpts,
OptionTypeInfo::Struct(
kOptNameMetadataCacheOpts, &metadata_cache_options_type_info,
offsetof(struct BlockBasedTableOptions, metadata_cache_options),
OptionVerificationType::kNormal, OptionTypeFlags::kNone)},
{"block_cache",
{offsetof(struct BlockBasedTableOptions, block_cache),
OptionType::kUnknown, OptionVerificationType::kNormal,
@ -731,8 +762,14 @@ Status GetBlockBasedTableOptionsFromString(
if (!s.ok()) {
return s;
}
return GetBlockBasedTableOptionsFromMap(config_options, table_options,
opts_map, new_table_options);
s = GetBlockBasedTableOptionsFromMap(config_options, table_options, opts_map,
new_table_options);
// Translate any errors (NotFound, NotSupported) to InvalidArgument
if (s.ok() || s.IsInvalidArgument()) {
return s;
} else {
return Status::InvalidArgument(s.getState());
}
}
Status GetBlockBasedTableOptionsFromMap(


@ -974,6 +974,9 @@ Status BlockBasedTable::PrefetchIndexAndFilterBlocks(
}
}
}
// Partition filters cannot be enabled without partition indexes
assert(rep_->filter_type != Rep::FilterType::kPartitionedFilter ||
rep_->index_type == BlockBasedTableOptions::kTwoLevelIndexSearch);
// Find compression dictionary handle
bool found_compression_dict = false;
@ -987,20 +990,53 @@ Status BlockBasedTable::PrefetchIndexAndFilterBlocks(
const bool use_cache = table_options.cache_index_and_filter_blocks;
// pin both index and filters, down to all partitions.
const bool pin_all =
rep_->table_options.pin_l0_filter_and_index_blocks_in_cache &&
const bool maybe_flushed =
level == 0 && file_size <= max_file_size_for_l0_meta_pin;
std::function<bool(PinningTier, PinningTier)> is_pinned =
[maybe_flushed, &is_pinned](PinningTier pinning_tier,
PinningTier fallback_pinning_tier) {
// Fallback to fallback would lead to infinite recursion. Disallow it.
assert(fallback_pinning_tier != PinningTier::kFallback);
switch (pinning_tier) {
case PinningTier::kFallback:
return is_pinned(fallback_pinning_tier,
PinningTier::kNone /* fallback_pinning_tier */);
case PinningTier::kNone:
return false;
case PinningTier::kFlushedAndSimilar:
return maybe_flushed;
case PinningTier::kAll:
return true;
};
// In GCC, this is needed to suppress `control reaches end of non-void
// function [-Werror=return-type]`.
assert(false);
return false;
};
const bool pin_top_level_index = is_pinned(
table_options.metadata_cache_options.top_level_index_pinning,
table_options.pin_top_level_index_and_filter ? PinningTier::kAll
: PinningTier::kNone);
const bool pin_partition =
is_pinned(table_options.metadata_cache_options.partition_pinning,
table_options.pin_l0_filter_and_index_blocks_in_cache
? PinningTier::kFlushedAndSimilar
: PinningTier::kNone);
const bool pin_unpartitioned =
is_pinned(table_options.metadata_cache_options.unpartitioned_pinning,
table_options.pin_l0_filter_and_index_blocks_in_cache
? PinningTier::kFlushedAndSimilar
: PinningTier::kNone);
// prefetch the first level of index
const bool prefetch_index =
prefetch_all ||
(table_options.pin_top_level_index_and_filter &&
index_type == BlockBasedTableOptions::kTwoLevelIndexSearch);
// pin the first level of index
const bool pin_index =
pin_all || (table_options.pin_top_level_index_and_filter &&
index_type == BlockBasedTableOptions::kTwoLevelIndexSearch);
index_type == BlockBasedTableOptions::kTwoLevelIndexSearch
? pin_top_level_index
: pin_unpartitioned;
// prefetch the first level of index
const bool prefetch_index = prefetch_all || pin_index;
std::unique_ptr<IndexReader> index_reader;
s = new_table->CreateIndexReader(ro, prefetch_buffer, meta_iter, use_cache,
@ -1015,24 +1051,20 @@ Status BlockBasedTable::PrefetchIndexAndFilterBlocks(
// The partitions of a partitioned index are always stored in cache. They
// hence follow the configuration for pin and prefetch regardless of
// the value of cache_index_and_filter_blocks
if (prefetch_all) {
s = rep_->index_reader->CacheDependencies(ro, pin_all);
if (prefetch_all || pin_partition) {
s = rep_->index_reader->CacheDependencies(ro, pin_partition);
}
if (!s.ok()) {
return s;
}
// prefetch the first level of filter
const bool prefetch_filter =
prefetch_all ||
(table_options.pin_top_level_index_and_filter &&
rep_->filter_type == Rep::FilterType::kPartitionedFilter);
// Partition fitlers cannot be enabled without partition indexes
assert(!prefetch_filter || prefetch_index);
// pin the first level of filter
const bool pin_filter =
pin_all || (table_options.pin_top_level_index_and_filter &&
rep_->filter_type == Rep::FilterType::kPartitionedFilter);
rep_->filter_type == Rep::FilterType::kPartitionedFilter
? pin_top_level_index
: pin_unpartitioned;
// prefetch the first level of filter
const bool prefetch_filter = prefetch_all || pin_filter;
if (rep_->filter_policy) {
auto filter = new_table->CreateFilterBlockReader(
@ -1040,8 +1072,8 @@ Status BlockBasedTable::PrefetchIndexAndFilterBlocks(
lookup_context);
if (filter) {
// Refer to the comment above about partitioned indexes always being cached
if (prefetch_all) {
filter->CacheDependencies(ro, pin_all);
if (prefetch_all || pin_partition) {
filter->CacheDependencies(ro, pin_partition);
}
rep_->filter = std::move(filter);
@ -1050,9 +1082,9 @@ Status BlockBasedTable::PrefetchIndexAndFilterBlocks(
if (!rep_->compression_dict_handle.IsNull()) {
std::unique_ptr<UncompressionDictReader> uncompression_dict_reader;
s = UncompressionDictReader::Create(this, ro, prefetch_buffer, use_cache,
prefetch_all, pin_all, lookup_context,
&uncompression_dict_reader);
s = UncompressionDictReader::Create(
this, ro, prefetch_buffer, use_cache, prefetch_all || pin_unpartitioned,
pin_unpartitioned, lookup_context, &uncompression_dict_reader);
if (!s.ok()) {
return s;
}
@ -1990,6 +2022,7 @@ BlockBasedTable::PartitionedIndexIteratorState::NewSecondaryIterator(
rep->index_value_is_full);
}
// Create an empty iterator
// TODO(ajkr): this is not the right way to handle an unpinned partition.
return new IndexBlockIter();
}
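The tier resolution in `PrefetchIndexAndFilterBlocks` above can be modeled in isolation. The sketch below is a simplified reading of it, not the RocksDB code: the enum is a local mirror of `PinningTier`, the helper names are hypothetical, and the recursive `std::function` is replaced by a single substitution step, which gives the same result because the fallback tier is never `kFallback`.

```cpp
#include <cassert>

// Local mirror of the enum from the diff; not the RocksDB header itself.
enum class PinningTier { kFallback, kNone, kFlushedAndSimilar, kAll };

// Resolves a configured tier to a pin/no-pin decision for one table.
// `fallback` is the tier implied by a deprecated boolean flag and must not
// itself be kFallback. `maybe_flushed` is true for an L0 file small enough
// to look like a flush output.
bool IsPinned(PinningTier tier, PinningTier fallback, bool maybe_flushed) {
  assert(fallback != PinningTier::kFallback);
  if (tier == PinningTier::kFallback) {
    tier = fallback;
  }
  switch (tier) {
    case PinningTier::kNone:
      return false;
    case PinningTier::kFlushedAndSimilar:
      return maybe_flushed;
    case PinningTier::kAll:
      return true;
    case PinningTier::kFallback:
      break;  // unreachable given the assertion above
  }
  return false;
}

// Tier implied by the deprecated `pin_l0_filter_and_index_blocks_in_cache`.
PinningTier FromPinL0Flag(bool flag) {
  return flag ? PinningTier::kFlushedAndSimilar : PinningTier::kNone;
}

// Tier implied by the deprecated `pin_top_level_index_and_filter`.
PinningTier FromPinTopLevelFlag(bool flag) {
  return flag ? PinningTier::kAll : PinningTier::kNone;
}
```

For example, `IsPinned(PinningTier::kFallback, FromPinL0Flag(true), maybe_flushed)` corresponds to how `pin_partition` and `pin_unpartitioned` are derived when the new options are left at their defaults.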


@ -173,7 +173,7 @@ Status PartitionIndexReader::CacheDependencies(const ReadOptions& ro,
return s;
}
if (block.GetValue() != nullptr) {
if (block.IsCached()) {
if (block.IsCached() || block.GetOwnValue()) {
if (pin) {
partition_map_[handle.offset()] = std::move(block);
}


@ -149,6 +149,11 @@ class IteratorWrapperBase {
return result_.value_prepared;
}
Slice user_key() const {
assert(Valid());
return iter_->user_key();
}
private:
void Update() {
valid_ = iter_->Valid();


@ -142,8 +142,14 @@ Status GetPlainTableOptionsFromString(const ConfigOptions& config_options,
return s;
}
return GetPlainTableOptionsFromMap(config_options, table_options, opts_map,
new_table_options);
s = GetPlainTableOptionsFromMap(config_options, table_options, opts_map,
new_table_options);
// Translate any errors (NotFound, NotSupported) to InvalidArgument
if (s.ok() || s.IsInvalidArgument()) {
return s;
} else {
return Status::InvalidArgument(s.getState());
}
}
Status GetMemTableRepFactoryFromString(


@ -43,6 +43,10 @@ class TwoLevelIndexIterator : public InternalIteratorBase<IndexValue> {
assert(Valid());
return second_level_iter_.key();
}
Slice user_key() const override {
assert(Valid());
return second_level_iter_.user_key();
}
IndexValue value() const override {
assert(Valid());
return second_level_iter_.value();


@ -80,6 +80,7 @@ default_params = {
"open_files": lambda : random.choice([-1, -1, 100, 500000]),
"optimize_filters_for_memory": lambda: random.randint(0, 1),
"partition_filters": lambda: random.randint(0, 1),
"partition_pinning": lambda: random.randint(0, 3),
"pause_background_one_in": 1000000,
"prefixpercent": 5,
"progress_reports": 0,
@ -93,6 +94,8 @@ default_params = {
"subcompactions": lambda: random.randint(1, 4),
"target_file_size_base": 2097152,
"target_file_size_multiplier": 2,
"top_level_index_pinning": lambda: random.randint(0, 3),
"unpartitioned_pinning": lambda: random.randint(0, 3),
"use_direct_reads": lambda: random.randint(0, 1),
"use_direct_io_for_flush_and_compaction": lambda: random.randint(0, 1),
"mock_direct_io": False,


@ -540,6 +540,43 @@ inline bool CompressionTypeSupported(CompressionType compression_type) {
}
}
inline bool DictCompressionTypeSupported(CompressionType compression_type) {
switch (compression_type) {
case kNoCompression:
return false;
case kSnappyCompression:
return false;
case kZlibCompression:
return Zlib_Supported();
case kBZip2Compression:
return false;
case kLZ4Compression:
case kLZ4HCCompression:
#if LZ4_VERSION_NUMBER >= 10400 // r124+
return LZ4_Supported();
#else
return false;
#endif
case kXpressCompression:
return false;
case kZSTDNotFinalCompression:
#if ZSTD_VERSION_NUMBER >= 500 // v0.5.0+
return ZSTDNotFinal_Supported();
#else
return false;
#endif
case kZSTD:
#if ZSTD_VERSION_NUMBER >= 500 // v0.5.0+
return ZSTD_Supported();
#else
return false;
#endif
default:
assert(false);
return false;
}
}
inline std::string CompressionTypeToString(CompressionType compression_type) {
switch (compression_type) {
case kNoCompression:
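`GetSupportedDictCompressions()` combines the predicate above with a walk over the name-to-type map. A reduced sketch of that filtering step follows, assuming a hypothetical five-value enum and assuming zlib and ZSTD are available (the real predicate depends on which libraries and versions were compiled in):

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical, reduced set of compression types for illustration.
enum MiniCompressionType { kNoComp, kSnappy, kZlib, kZstd, kDisableOption };

// Stand-in for DictCompressionTypeSupported(). Assumes zlib and ZSTD are
// present; everything else is treated as lacking dictionary support.
bool MiniDictSupported(MiniCompressionType t) {
  switch (t) {
    case kZlib:
    case kZstd:
      return true;
    default:
      return false;
  }
}

// Collects every type in the map that is a real compression type (not the
// "disable" sentinel) and supports dictionaries.
std::vector<MiniCompressionType> MiniSupportedDictCompressions(
    const std::map<std::string, MiniCompressionType>& name_to_type) {
  std::vector<MiniCompressionType> result;
  for (const auto& entry : name_to_type) {
    MiniCompressionType t = entry.second;
    if (t != kDisableOption && MiniDictSupported(t)) {
      result.push_back(t);
    }
  }
  return result;
}
```

The order of the result follows the map's iteration order, so callers should not rely on it.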


@ -65,7 +65,11 @@ Status GetLatestOptionsFileName(const std::string& dbpath,
uint64_t latest_time_stamp = 0;
std::vector<std::string> file_names;
s = env->GetChildren(dbpath, &file_names);
if (!s.ok()) {
if (s.IsNotFound()) {
return Status::NotFound(Status::kPathNotFound,
"No options files found in the DB directory.",
dbpath);
} else if (!s.ok()) {
return s;
}
for (auto& file_name : file_names) {
@ -79,7 +83,9 @@ Status GetLatestOptionsFileName(const std::string& dbpath,
}
}
if (latest_file_name.size() == 0) {
return Status::NotFound("No options files found in the DB directory.");
return Status::NotFound(Status::kPathNotFound,
"No options files found in the DB directory.",
dbpath);
}
*options_file_name = latest_file_name;
return Status::OK();
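After this change, a missing directory and a directory without options files both yield NotFound with the `kPathNotFound` subcode. The sketch below models that control flow only. The result enum, the function names, and the rule for picking the latest file (largest number after an `OPTIONS-` prefix) are assumptions for illustration; the loop body of the real function is not shown in this diff.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

enum class LookupResult { kOk, kPathNotFound };

// Returns true and sets *number if `name` looks like "OPTIONS-<digits>".
// The naming rule is an assumption made for this sketch.
bool ParseOptionsFileNumber(const std::string& name, uint64_t* number) {
  const std::string prefix = "OPTIONS-";
  if (name.size() <= prefix.size() ||
      name.compare(0, prefix.size(), prefix) != 0) {
    return false;
  }
  uint64_t value = 0;
  for (std::size_t i = prefix.size(); i < name.size(); ++i) {
    if (name[i] < '0' || name[i] > '9') {
      return false;
    }
    value = value * 10 + static_cast<uint64_t>(name[i] - '0');
  }
  *number = value;
  return true;
}

// `dir_exists == false` models GetChildren() returning NotFound. Both that
// case and "no options file present" map to kPathNotFound.
LookupResult FindLatestOptionsFile(bool dir_exists,
                                   const std::vector<std::string>& children,
                                   std::string* latest) {
  if (!dir_exists) {
    return LookupResult::kPathNotFound;
  }
  bool found = false;
  uint64_t best = 0;
  for (const std::string& name : children) {
    uint64_t n = 0;
    if (ParseOptionsFileNumber(name, &n) && (!found || n > best)) {
      best = n;
      found = true;
      *latest = name;
    }
  }
  return found ? LookupResult::kOk : LookupResult::kPathNotFound;
}
```

The three failure cases exercised by `LatestOptionsNotFound` (no directory, empty directory, directory with only an unrelated file) all take the `kPathNotFound` branch in this model.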


@ -11,6 +11,8 @@
#include <cinttypes>
#include <unordered_map>
#include "env/mock_env.h"
#include "file/filename.h"
#include "options/options_parser.h"
#include "rocksdb/convenience.h"
#include "rocksdb/db.h"
@ -31,14 +33,12 @@ namespace ROCKSDB_NAMESPACE {
class OptionsUtilTest : public testing::Test {
public:
OptionsUtilTest() : rnd_(0xFB) {
env_.reset(new test::StringEnv(Env::Default()));
fs_.reset(new LegacyFileSystemWrapper(env_.get()));
env_.reset(NewMemEnv(Env::Default()));
dbname_ = test::PerThreadDBPath("options_util_test");
}
protected:
std::unique_ptr<test::StringEnv> env_;
std::unique_ptr<LegacyFileSystemWrapper> fs_;
std::unique_ptr<Env> env_;
std::string dbname_;
Random rnd_;
};
@ -58,8 +58,8 @@ TEST_F(OptionsUtilTest, SaveAndLoad) {
}
const std::string kFileName = "OPTIONS-123456";
ASSERT_OK(
PersistRocksDBOptions(db_opt, cf_names, cf_opts, kFileName, fs_.get()));
ASSERT_OK(PersistRocksDBOptions(db_opt, cf_names, cf_opts, kFileName,
env_->GetFileSystem().get()));
DBOptions loaded_db_opt;
std::vector<ColumnFamilyDescriptor> loaded_cf_descs;
@ -85,6 +85,7 @@ TEST_F(OptionsUtilTest, SaveAndLoad) {
exact, cf_opts[i], loaded_cf_descs[i].options));
}
DestroyDB(dbname_, Options(db_opt, cf_opts[0]));
for (size_t i = 0; i < kCFCount; ++i) {
if (cf_opts[i].compaction_filter) {
delete cf_opts[i].compaction_filter;
@ -121,8 +122,8 @@ TEST_F(OptionsUtilTest, SaveAndLoadWithCacheCheck) {
cf_names.push_back("cf_plain_table_sample");
// Saving DB in file
const std::string kFileName = "OPTIONS-LOAD_CACHE_123456";
ASSERT_OK(
PersistRocksDBOptions(db_opt, cf_names, cf_opts, kFileName, fs_.get()));
ASSERT_OK(PersistRocksDBOptions(db_opt, cf_names, cf_opts, kFileName,
env_->GetFileSystem().get()));
DBOptions loaded_db_opt;
std::vector<ColumnFamilyDescriptor> loaded_cf_descs;
@ -154,6 +155,7 @@ TEST_F(OptionsUtilTest, SaveAndLoadWithCacheCheck) {
ASSERT_EQ(loaded_bbt_opt->block_cache.get(), cache.get());
}
}
DestroyDB(dbname_, Options(loaded_db_opt, cf_opts[0]));
}
namespace {
@ -359,8 +361,280 @@ TEST_F(OptionsUtilTest, SanityCheck) {
ASSERT_OK(
CheckOptionsCompatibility(config_options, dbname_, db_opt, cf_descs));
}
DestroyDB(dbname_, Options(db_opt, cf_descs[0].options));
}
TEST_F(OptionsUtilTest, LatestOptionsNotFound) {
std::unique_ptr<Env> env(NewMemEnv(Env::Default()));
Status s;
Options options;
ConfigOptions config_opts;
std::vector<ColumnFamilyDescriptor> cf_descs;
options.env = env.get();
options.create_if_missing = true;
config_opts.env = options.env;
config_opts.ignore_unknown_options = false;
std::vector<std::string> children;
std::string options_file_name;
DestroyDB(dbname_, options);
// First, test where the db directory does not exist
ASSERT_NOK(options.env->GetChildren(dbname_, &children));
s = GetLatestOptionsFileName(dbname_, options.env, &options_file_name);
ASSERT_TRUE(s.IsNotFound());
ASSERT_TRUE(s.IsPathNotFound());
s = LoadLatestOptions(dbname_, options.env, &options, &cf_descs);
ASSERT_TRUE(s.IsNotFound());
ASSERT_TRUE(s.IsPathNotFound());
s = LoadLatestOptions(config_opts, dbname_, &options, &cf_descs);
ASSERT_TRUE(s.IsPathNotFound());
s = GetLatestOptionsFileName(dbname_, options.env, &options_file_name);
ASSERT_TRUE(s.IsNotFound());
ASSERT_TRUE(s.IsPathNotFound());
// Second, test where the db directory exists but is empty
ASSERT_OK(options.env->CreateDir(dbname_));
s = GetLatestOptionsFileName(dbname_, options.env, &options_file_name);
ASSERT_TRUE(s.IsNotFound());
ASSERT_TRUE(s.IsPathNotFound());
s = LoadLatestOptions(dbname_, options.env, &options, &cf_descs);
ASSERT_TRUE(s.IsNotFound());
ASSERT_TRUE(s.IsPathNotFound());
// Finally, test where a file exists but is not an "Options" file
std::unique_ptr<WritableFile> file;
ASSERT_OK(
options.env->NewWritableFile(dbname_ + "/temp.txt", &file, EnvOptions()));
ASSERT_OK(file->Close());
s = GetLatestOptionsFileName(dbname_, options.env, &options_file_name);
ASSERT_TRUE(s.IsNotFound());
ASSERT_TRUE(s.IsPathNotFound());
s = LoadLatestOptions(config_opts, dbname_, &options, &cf_descs);
ASSERT_TRUE(s.IsNotFound());
ASSERT_TRUE(s.IsPathNotFound());
ASSERT_OK(options.env->DeleteFile(dbname_ + "/temp.txt"));
ASSERT_OK(options.env->DeleteDir(dbname_));
}
TEST_F(OptionsUtilTest, LoadLatestOptions) {
Options options;
options.OptimizeForSmallDb();
ColumnFamilyDescriptor cf_desc;
ConfigOptions config_opts;
DBOptions db_opts;
std::vector<ColumnFamilyDescriptor> cf_descs;
std::vector<ColumnFamilyHandle*> handles;
DB* db;
options.create_if_missing = true;
DestroyDB(dbname_, options);
cf_descs.emplace_back();
cf_descs.back().name = kDefaultColumnFamilyName;
cf_descs.back().options.table_factory.reset(NewBlockBasedTableFactory());
cf_descs.emplace_back();
cf_descs.back().name = "Plain";
cf_descs.back().options.table_factory.reset(NewPlainTableFactory());
db_opts.create_missing_column_families = true;
db_opts.create_if_missing = true;
// open and persist the options
ASSERT_OK(DB::Open(db_opts, dbname_, cf_descs, &handles, &db));
std::string options_file_name;
std::string new_options_file;
ASSERT_OK(GetLatestOptionsFileName(dbname_, options.env, &options_file_name));
ASSERT_OK(LoadLatestOptions(config_opts, dbname_, &db_opts, &cf_descs));
ASSERT_EQ(cf_descs.size(), 2U);
ASSERT_OK(RocksDBOptionsParser::VerifyDBOptions(config_opts,
db->GetDBOptions(), db_opts));
ASSERT_OK(handles[0]->GetDescriptor(&cf_desc));
ASSERT_OK(RocksDBOptionsParser::VerifyCFOptions(config_opts, cf_desc.options,
cf_descs[0].options));
ASSERT_OK(handles[1]->GetDescriptor(&cf_desc));
ASSERT_OK(RocksDBOptionsParser::VerifyCFOptions(config_opts, cf_desc.options,
cf_descs[1].options));
// Now change some of the DBOptions
ASSERT_OK(db->SetDBOptions(
{{"delayed_write_rate", "1234"}, {"bytes_per_sync", "32768"}}));
ASSERT_OK(GetLatestOptionsFileName(dbname_, options.env, &new_options_file));
ASSERT_NE(options_file_name, new_options_file);
ASSERT_OK(LoadLatestOptions(config_opts, dbname_, &db_opts, &cf_descs));
ASSERT_OK(RocksDBOptionsParser::VerifyDBOptions(config_opts,
db->GetDBOptions(), db_opts));
options_file_name = new_options_file;
// Now change some of the ColumnFamilyOptions
ASSERT_OK(db->SetOptions(handles[1], {{"write_buffer_size", "32768"}}));
ASSERT_OK(GetLatestOptionsFileName(dbname_, options.env, &new_options_file));
ASSERT_NE(options_file_name, new_options_file);
ASSERT_OK(LoadLatestOptions(config_opts, dbname_, &db_opts, &cf_descs));
ASSERT_OK(RocksDBOptionsParser::VerifyDBOptions(config_opts,
db->GetDBOptions(), db_opts));
ASSERT_OK(handles[0]->GetDescriptor(&cf_desc));
ASSERT_OK(RocksDBOptionsParser::VerifyCFOptions(config_opts, cf_desc.options,
cf_descs[0].options));
ASSERT_OK(handles[1]->GetDescriptor(&cf_desc));
ASSERT_OK(RocksDBOptionsParser::VerifyCFOptions(config_opts, cf_desc.options,
cf_descs[1].options));
// close the db
for (auto* handle : handles) {
delete handle;
}
delete db;
DestroyDB(dbname_, options, cf_descs);
}
static void WriteOptionsFile(Env* env, const std::string& path,
const std::string& options_file, int major,
int minor, const std::string& db_opts,
const std::string& cf_opts,
const std::string& bbt_opts = "") {
std::string options_file_header =
"\n"
"[Version]\n"
" rocksdb_version=" +
ToString(major) + "." + ToString(minor) +
".0\n"
" options_file_version=1\n";
std::unique_ptr<WritableFile> wf;
ASSERT_OK(env->NewWritableFile(path + "/" + options_file, &wf, EnvOptions()));
ASSERT_OK(
wf->Append(options_file_header + "[ DBOptions ]\n" + db_opts + "\n"));
ASSERT_OK(wf->Append(
"[CFOptions \"default\"] # column family must be specified\n" +
cf_opts + "\n"));
ASSERT_OK(wf->Append("[TableOptions/BlockBasedTable \"default\"]\n" +
bbt_opts + "\n"));
ASSERT_OK(wf->Close());
std::string latest_options_file;
ASSERT_OK(GetLatestOptionsFileName(path, env, &latest_options_file));
ASSERT_EQ(latest_options_file, options_file);
}
TEST_F(OptionsUtilTest, BadLatestOptions) {
Status s;
ConfigOptions config_opts;
DBOptions db_opts;
std::vector<ColumnFamilyDescriptor> cf_descs;
Options options;
options.env = env_.get();
config_opts.env = env_.get();
config_opts.ignore_unknown_options = false;
config_opts.delimiter = "\n";
ConfigOptions ignore_opts = config_opts;
ignore_opts.ignore_unknown_options = true;
std::string options_file_name;
// Test where the db directory exists but is empty
ASSERT_OK(options.env->CreateDir(dbname_));
ASSERT_NOK(
GetLatestOptionsFileName(dbname_, options.env, &options_file_name));
ASSERT_NOK(LoadLatestOptions(config_opts, dbname_, &db_opts, &cf_descs));
// Write an options file for a previous major release with an unknown DB
// Option
WriteOptionsFile(options.env, dbname_, "OPTIONS-0001", ROCKSDB_MAJOR - 1,
ROCKSDB_MINOR, "unknown_db_opt=true", "");
s = LoadLatestOptions(config_opts, dbname_, &db_opts, &cf_descs);
ASSERT_NOK(s);
ASSERT_TRUE(s.IsInvalidArgument());
// Even though ignore_unknown_options=true, we still return an error...
s = LoadLatestOptions(ignore_opts, dbname_, &db_opts, &cf_descs);
ASSERT_NOK(s);
ASSERT_TRUE(s.IsInvalidArgument());
// Write an options file for a previous minor release with an unknown CF
// Option
WriteOptionsFile(options.env, dbname_, "OPTIONS-0002", ROCKSDB_MAJOR,
ROCKSDB_MINOR - 1, "", "unknown_cf_opt=true");
s = LoadLatestOptions(config_opts, dbname_, &db_opts, &cf_descs);
ASSERT_NOK(s);
ASSERT_TRUE(s.IsInvalidArgument());
// Even though ignore_unknown_options=true, we still return an error...
s = LoadLatestOptions(ignore_opts, dbname_, &db_opts, &cf_descs);
ASSERT_NOK(s);
ASSERT_TRUE(s.IsInvalidArgument());
// Write an options file for a previous minor release with an unknown BBT
// Option
WriteOptionsFile(options.env, dbname_, "OPTIONS-0003", ROCKSDB_MAJOR,
ROCKSDB_MINOR - 1, "", "", "unknown_bbt_opt=true");
s = LoadLatestOptions(config_opts, dbname_, &db_opts, &cf_descs);
ASSERT_NOK(s);
ASSERT_TRUE(s.IsInvalidArgument());
// Even though ignore_unknown_options=true, we still return an error...
s = LoadLatestOptions(ignore_opts, dbname_, &db_opts, &cf_descs);
ASSERT_NOK(s);
ASSERT_TRUE(s.IsInvalidArgument());
// Write an options file for the current release with an unknown DB Option
WriteOptionsFile(options.env, dbname_, "OPTIONS-0004", ROCKSDB_MAJOR,
ROCKSDB_MINOR, "unknown_db_opt=true", "");
s = LoadLatestOptions(config_opts, dbname_, &db_opts, &cf_descs);
ASSERT_NOK(s);
ASSERT_TRUE(s.IsInvalidArgument());
// Even though ignore_unknown_options=true, we still return an error...
s = LoadLatestOptions(ignore_opts, dbname_, &db_opts, &cf_descs);
ASSERT_NOK(s);
ASSERT_TRUE(s.IsInvalidArgument());
// Write an options file for the current release with an unknown CF Option
WriteOptionsFile(options.env, dbname_, "OPTIONS-0005", ROCKSDB_MAJOR,
ROCKSDB_MINOR, "", "unknown_cf_opt=true");
s = LoadLatestOptions(config_opts, dbname_, &db_opts, &cf_descs);
ASSERT_NOK(s);
ASSERT_TRUE(s.IsInvalidArgument());
// Even though ignore_unknown_options=true, we still return an error...
s = LoadLatestOptions(ignore_opts, dbname_, &db_opts, &cf_descs);
ASSERT_NOK(s);
ASSERT_TRUE(s.IsInvalidArgument());
// Write an options file for the current release with an invalid DB Option
WriteOptionsFile(options.env, dbname_, "OPTIONS-0006", ROCKSDB_MAJOR,
ROCKSDB_MINOR, "create_if_missing=hello", "");
s = LoadLatestOptions(config_opts, dbname_, &db_opts, &cf_descs);
ASSERT_NOK(s);
ASSERT_TRUE(s.IsInvalidArgument());
// Even though ignore_unknown_options=true, we still return an error...
s = LoadLatestOptions(ignore_opts, dbname_, &db_opts, &cf_descs);
ASSERT_NOK(s);
ASSERT_TRUE(s.IsInvalidArgument());
// Write an options file for the next release with an invalid DB Option
WriteOptionsFile(options.env, dbname_, "OPTIONS-0007", ROCKSDB_MAJOR,
ROCKSDB_MINOR + 1, "create_if_missing=hello", "");
ASSERT_NOK(LoadLatestOptions(config_opts, dbname_, &db_opts, &cf_descs));
ASSERT_OK(LoadLatestOptions(ignore_opts, dbname_, &db_opts, &cf_descs));
// Write an options file for the next release with an unknown DB Option
WriteOptionsFile(options.env, dbname_, "OPTIONS-0008", ROCKSDB_MAJOR,
ROCKSDB_MINOR + 1, "unknown_db_opt=true", "");
ASSERT_NOK(LoadLatestOptions(config_opts, dbname_, &db_opts, &cf_descs));
// Ignore the errors for future releases when ignore_unknown_options=true
ASSERT_OK(LoadLatestOptions(ignore_opts, dbname_, &db_opts, &cf_descs));
// Write an options file for the next major release with an unknown CF Option
WriteOptionsFile(options.env, dbname_, "OPTIONS-0009", ROCKSDB_MAJOR + 1,
ROCKSDB_MINOR, "", "unknown_cf_opt=true");
ASSERT_NOK(LoadLatestOptions(config_opts, dbname_, &db_opts, &cf_descs));
// Ignore the errors for future releases when ignore_unknown_options=true
ASSERT_OK(LoadLatestOptions(ignore_opts, dbname_, &db_opts, &cf_descs));
}
} // namespace ROCKSDB_NAMESPACE
int main(int argc, char** argv) {