Compare commits

...

25 Commits

Author SHA1 Message Date
Andrew Kryczka
ed4316166f fixup release note (track_and_verify_wals_in_manifest not available in 6.14) 2020-12-01 15:05:35 -08:00
Andrew Kryczka
f56956c72f update HISTORY.md and bump version to 6.14.6 2020-12-01 14:51:09 -08:00
Andrew Kryczka
7e7b413a85 Fix kPointInTimeRecovery handling of truncated WAL (#7701)
Summary:
A WAL may be truncated to an incomplete record due to a crash while writing
the last record, or due to corruption. In the former case, no hole will be
produced since no ACK'd data was lost. In the latter case, a hole could
be produced without this PR since we proceeded to recover the next WAL
as if nothing had happened. This PR changes the record reading code to
always report a corruption for incomplete records in
`kPointInTimeRecovery` mode, and the upper layer will only ignore them
if the next WAL has consecutive seqnum (i.e., we are guaranteed no
hole).

While this solves the hole problem for the case of incomplete
records, the possibility is still there if the WAL is corrupted by
truncation to an exact record boundary.
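
As a rough illustration of the recovery rule described above, here is a hypothetical helper (not the actual RocksDB recovery code): a corruption reported at the tail of a WAL may be ignored only when it provably hides no ACK'd data.

```
#include <cstdint>

// Hypothetical sketch of the upper-layer decision: ignore a reported tail
// corruption of WAL N only when no hole can result.
bool CanIgnoreTailCorruption(uint64_t last_recovered_seqno,
                             bool next_wal_exists,
                             uint64_t next_wal_first_seqno) {
  if (!next_wal_exists) {
    // Recovery simply stops at the truncated tail; nothing later is kept,
    // so no hole can be introduced.
    return true;
  }
  // Consecutive seqnums across the WAL boundary guarantee the truncated
  // tail contained no ACK'd writes.
  return next_wal_first_seqno == last_recovered_seqno + 1;
}
```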

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7701

Test Plan:
Interestingly there already was a test for this case
(`DBWALTestWithParams.kPointInTimeRecovery`); it just had a typo bug in
the verification that prevented it from noticing holes in recovery.

Reviewed By: anand1976

Differential Revision: D25111765

Pulled By: ajkr

fbshipit-source-id: 5e330b13b1ee2b5be096cea9d0ff6075843e57b6
2020-12-01 14:48:36 -08:00
Yanqin Jin
9526da8d4b Fix the logic of setting read_amp_bytes_per_bit from OPTIONS file (#7680)
Summary:
Instead of using `EncodeFixed32`, which always serializes an integer to
little endian, we should use the local machine's endianness when
populating a native data structure during options parsing.
Without this fix, `read_amp_bytes_per_bit` may be populated incorrectly
on big-endian machines.
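
A minimal standalone sketch of the distinction; `EncodeFixed32LE` below is a simplified stand-in for RocksDB's `EncodeFixed32`, which unconditionally writes little-endian bytes:

```
#include <cstdint>
#include <cstdio>
#include <cstring>

// Simplified stand-in for RocksDB's EncodeFixed32: always writes the value
// in little-endian byte order, regardless of the host architecture.
static void EncodeFixed32LE(char* buf, uint32_t value) {
  buf[0] = static_cast<char>(value & 0xff);
  buf[1] = static_cast<char>((value >> 8) & 0xff);
  buf[2] = static_cast<char>((value >> 16) & 0xff);
  buf[3] = static_cast<char>((value >> 24) & 0xff);
}

int main() {
  uint32_t parsed = 1;  // value read from the OPTIONS file

  // Wrong way to populate a native field: the bytes are forced into
  // little-endian layout, so reading them back as a uint32_t yields
  // 0x01000000 instead of 1 on a big-endian machine.
  uint32_t native_field = 0;
  char buf[sizeof(uint32_t)];
  EncodeFixed32LE(buf, parsed);
  std::memcpy(&native_field, buf, sizeof(buf));

  // Correct: plain assignment (or memcpy of the native representation)
  // respects the host's endianness.
  uint32_t correct_field = parsed;

  std::printf("via EncodeFixed32: %u, via assignment: %u\n", native_field,
              correct_field);
  return 0;
}
```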

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7680

Test Plan: make check

Reviewed By: pdillinger

Differential Revision: D24999166

Pulled By: riversand963

fbshipit-source-id: dc603cff6e17f8fa32479ce6df93b93082e6b0c4
2020-11-17 10:41:30 -08:00
Yanqin Jin
35d8e36ef1 Bump version and update HISTORY 2020-11-15 15:45:38 -08:00
Yanqin Jin
71e9a1a2be Hack to load OPTIONS file for read_amp_bytes_per_bit (#7659)
Summary:
A temporary hack to work around a bug in 6.10, 6.11, 6.12, 6.13 and
6.14. The bug writes out 8 bytes to the OPTIONS file starting from the
address of BlockBasedTableOptions.read_amp_bytes_per_bit, which is
actually a uint32. Consequently, the value of read_amp_bytes_per_bit
written in the OPTIONS file is wrong. From 6.15, RocksDB will try to
parse read_amp_bytes_per_bit from the OPTIONS file as a uint32. To be
able to load OPTIONS files generated by affected releases before the
fix, we need to manually parse read_amp_bytes_per_bit with this hack.
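
The workaround (see the block_based_table_factory options hunk later in this comparison) effectively parses the value as a 64-bit integer and keeps only the low 32 bits; a minimal sketch:

```
#include <cstdint>
#include <string>

// Sketch of the hack: OPTIONS files written by 6.10-6.14 may contain an
// 8-byte value for read_amp_bytes_per_bit. Parse it as a uint64_t and keep
// only the low 32 bits, where the real uint32 value lives.
// (std::stoull throws on non-numeric input; error handling omitted here.)
uint32_t ParseReadAmpBytesPerBit(const std::string& value) {
  uint64_t parsed = std::stoull(value);
  return static_cast<uint32_t>(parsed);
}
```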

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7659

Test Plan:
Generate a db with current 6.14.fb (head at b6db05dbb5). Maybe use db_stress.

Checkout this PR, run
```
 ~/rocksdb/ldb --db=. --try_load_options --ignore_unknown_options idump --count_only
```
Expect success, and should not see
```
Failed: Invalid argument: Error parsing read_amp_bytes_per_bit:17179869184
```

Also
make check

Reviewed By: anand1976

Differential Revision: D24954752

Pulled By: riversand963

fbshipit-source-id: c7b802fc3e52acd050a4fc1cd475016122234394
2020-11-15 15:44:09 -08:00
Huisheng Liu
ebaf09dfd5 fix read_amp_bytes_per_bit field size (#7651)
Summary:
The field in BlockBasedTableOptions is 4 bytes:
  // Default: 0 (disabled)
  uint32_t read_amp_bytes_per_bit = 0;

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7651

Reviewed By: ltamasi

Differential Revision: D24844994

Pulled By: riversand963

fbshipit-source-id: e2695e55532256ef8996dd6939cad06987a80293
2020-11-15 15:42:40 -08:00
Yanqin Jin
b6db05dbb5 Permit unchecked error
For the GetCurrentTime() call in GetSnapshot, explicitly permit the returned
status to go unchecked; the error is intentionally ignored.
2020-11-05 12:25:46 -08:00
Yanqin Jin
84728944d3 Bump version and update HISTORY 2020-11-05 12:00:18 -08:00
Yanqin Jin
1f73513576 Remove DB::VerifyFileChecksums 2020-11-05 11:55:29 -08:00
Yanqin Jin
d5673571bf Compute NeedCompact() after table builder Finish() (#7627)
Summary:
In `BuildTable()`, we call `builder->Finish()` before evaluating `builder->NeedCompact()`.
However, in the compaction job we call `builder->NeedCompact()` before `builder->Finish()`. This can be wrong because the table properties collectors may rely on the success of `Finish()` to provide a correct result for `NeedCompact()`.
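
A minimal sketch of the corrected ordering, using a hypothetical stand-in for `TableBuilder` (the real change is in the compaction_job.cc hunk later in this comparison):

```
#include <cassert>

// Hypothetical minimal stand-in for rocksdb::TableBuilder.
struct BuilderLike {
  bool Finish() { finished_ = true; return true; }          // returns success
  bool NeedCompact() const { return finished_ && need_; }   // valid after Finish()
  bool finished_ = false;
  bool need_ = true;
};

// Corrected ordering: only consult NeedCompact() once Finish() has
// succeeded, because property collectors may finalize their state in
// Finish().
bool MarkForCompaction(BuilderLike* builder) {
  bool ok = builder->Finish();
  bool marked_for_compaction = false;
  if (ok) {
    marked_for_compaction = builder->NeedCompact();
  }
  return marked_for_compaction;
}

int main() {
  BuilderLike b;
  assert(MarkForCompaction(&b));
  return 0;
}
```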

Test plan (on devserver):
make check

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7627

Reviewed By: ajkr

Differential Revision: D24728741

Pulled By: riversand963

fbshipit-source-id: 5a0dce244e14eb1106c4f87021e6bebca82b486e
2020-11-05 11:47:31 -08:00
Yanqin Jin
d4b82746b2 Add API to verify whole sst file checksum (#7578)
Summary:
The existing API `VerifyChecksum()` allows applications to verify SST files' block checksums.
Since the whole-file, user-specified checksum is tracked in the MANIFEST, we can expose a new
API to verify SST files' whole-file checksums (a usage sketch follows the declaration below).

```
// Compute table file checksums if applicable and compare with MANIFEST.
// Returns OK if no file has mismatching whole-file checksum.
Status DB::VerifyFileChecksums(const ReadOptions& /*read_options*/);
```
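
A minimal usage sketch against the declaration above (note that a later commit in this comparison removes the `DB`-level method on this 6.14 branch, leaving the implementation on `DBImpl`; the call shape is the same in mainline releases that keep the `DB` API):

```
#include <cassert>
#include "rocksdb/db.h"
#include "rocksdb/file_checksum.h"
#include "rocksdb/options.h"

int main() {
  rocksdb::Options options;
  options.create_if_missing = true;
  // Whole-file checksums are only tracked when a checksum factory is set;
  // without it, VerifyFileChecksums() returns InvalidArgument.
  options.file_checksum_gen_factory =
      rocksdb::GetFileChecksumGenCrc32cFactory();

  rocksdb::DB* db = nullptr;
  rocksdb::Status s = rocksdb::DB::Open(options, "/tmp/checksum_demo", &db);
  assert(s.ok());

  s = db->Put(rocksdb::WriteOptions(), "key", "value");
  assert(s.ok());
  s = db->Flush(rocksdb::FlushOptions());
  assert(s.ok());

  // Recompute each table file's checksum and compare against the MANIFEST.
  s = db->VerifyFileChecksums(rocksdb::ReadOptions());
  assert(s.ok());

  delete db;
  return 0;
}
```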

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7578

Test Plan: make check

Reviewed By: pdillinger

Differential Revision: D24436783

Pulled By: riversand963

fbshipit-source-id: 52b51519b842f2b3c4e3351998a97c86cbec85b3
2020-11-05 11:45:33 -08:00
Zhichao Cao
47d839afc9 Updated GenerateOneFileChecksum to use requested_checksum_func_name (#7586)
Summary:
CreateFileChecksumGenerator may use requested_checksum_func_name in the generator context to decide which generator will be used. GenerateOneFileChecksum had not been updated to pass it, so it always requested a generator with an empty name. Fix it.
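
Roughly, the fix threads the requested name through `FileChecksumGenContext` before asking the factory for a generator (mirroring the file_util.cc hunk later in this comparison); a sketch:

```
#include <memory>
#include <string>
#include "rocksdb/file_checksum.h"

// Sketch of the fixed lookup path: pass the requested checksum function
// name through the generator context so the factory can pick the matching
// generator instead of always being asked with an empty name.
std::unique_ptr<rocksdb::FileChecksumGenerator> MakeGenerator(
    rocksdb::FileChecksumGenFactory* factory, const std::string& file_path,
    const std::string& requested_checksum_func_name) {
  rocksdb::FileChecksumGenContext gen_context;
  gen_context.requested_checksum_func_name = requested_checksum_func_name;
  gen_context.file_name = file_path;
  return factory->CreateFileChecksumGenerator(gen_context);
}
```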

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7586

Test Plan: make check

Reviewed By: riversand963

Differential Revision: D24491989

Pulled By: zhichao-cao

fbshipit-source-id: d9fdfdd431240f0a9a2e781ddbd48a7d6c609aad
2020-11-05 11:40:35 -08:00
Andrew Kryczka
8b298e7021 update HISTORY.md and bump version 2020-10-30 11:20:12 -07:00
mrambacher
c87c42fda8 Return NotFound from TableFactory configuration errors during options loading (#7615)
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/7615

Reviewed By: riversand963

Differential Revision: D24637054

Pulled By: ajkr

fbshipit-source-id: 7da20d44289eaa2387af4edf8c3c48057425cc1c
2020-10-30 10:48:38 -07:00
mrambacher
e5686db330 Revert LoadLatestOptions handling of ignore_unknown_options if versions differ (#7612)
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/7612

Reviewed By: zhichao-cao

Differential Revision: D24627054

Pulled By: riversand963

fbshipit-source-id: 451b4da742e3e84c7442bc7cc4959d39089b89d0
2020-10-30 10:46:33 -07:00
akankshamahajan
91f3b72ebc Update History.md and version.h for 6.14.2
2020-10-21 20:29:57 -07:00
Akanksha Mahajan
3ddd79e983 Bug fix to remove function calling in assert statement (#7581)
Summary:
Remove a function call from inside an assert statement, since assert is a
no-op in opt builds and the function would then never be called. This caused
a hang when closing RocksDB with refit level set.
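
The general pattern of the fix, as a self-contained sketch (a stand-in function, not the actual `ContinueBackgroundWork()`):

```
#include <cassert>

// Stand-in for a call whose side effect is needed even in release builds.
bool ContinueBackgroundWorkStub() { return true; }

void Fixed() {
  // Buggy pattern: in NDEBUG (opt) builds the assert expands to nothing, so
  // the function is never called at all.
  //   assert(ContinueBackgroundWorkStub());

  // Fixed pattern, as in this PR: always make the call, only assert on the
  // result in debug builds.
  bool ok = ContinueBackgroundWorkStub();
  assert(ok);
  (void)ok;  // avoid unused-variable warning in NDEBUG builds
}

int main() {
  Fixed();
  return 0;
}
```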

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7581

Test Plan: make check -j64

Reviewed By: riversand963

Differential Revision: D24466420

Pulled By: akankshamahajan15

fbshipit-source-id: 97db4ec5a95ae693c3290e176a3c12a9b1ad2f6d
2020-10-21 20:25:30 -07:00
mrambacher
59ddf78081 Revert Statuses returned from pre-Configurable options functions (#7563)
Summary:
Further refinement of the earlier PR. Now the Status is NotFound with a subcode of PathNotFound. Also, the existing functions for options parsing/loading are reverted to return InvalidArgument regardless of how the user-provided arguments are deemed invalid.
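
A small sketch of what callers see after this change, based on the status.h hunk later in this comparison:

```
#include <cassert>
#include "rocksdb/status.h"

int main() {
  // A missing options file can now be reported as NotFound with the
  // PathNotFound subcode...
  rocksdb::Status s = rocksdb::Status::NotFound(
      rocksdb::Status::kPathNotFound, "OPTIONS file does not exist");

  // ...and IsPathNotFound() recognizes both IOError- and NotFound-coded
  // statuses that carry that subcode.
  assert(s.IsNotFound());
  assert(s.IsPathNotFound());
  return 0;
}
```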

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7563

Reviewed By: zhichao-cao

Differential Revision: D24422491

Pulled By: ajkr

fbshipit-source-id: ba6b237cd0584d3f925c5ba0d349aeb8c250af67
2020-10-20 12:06:25 -07:00
mrambacher
d043387bdd Test for LoadLatestOptions (#7554)
Summary:
Make LoadLatestOptions return PathNotFound if the options file does not exist. Added tests for the LoadOptions-related methods.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7554

Reviewed By: akankshamahajan15

Differential Revision: D24298985

Pulled By: zhichao-cao

fbshipit-source-id: c9ae3cb12fc4a5bbef07743e1c1300f98a2441b3
2020-10-15 11:28:49 -07:00
Andrew Kryczka
e402de87ef update HISTORY.md and version.h for 6.14.1 2020-10-13 10:05:00 -07:00
Andrew Kryczka
668033f181 Fix bug in pinned partitioned indexes with some reads bypassing block cache
Backports part of 75d3b6fdf0.
2020-10-13 10:03:08 -07:00
Andrew Kryczka
2cea469908 Fix bug in pinned partitioned user key indexes
Backports part of 75d3b6fdf0.
2020-10-13 10:02:39 -07:00
Andrew Kryczka
94c019a7b0 Add missing regression test/release note for bug fix
Bug fix was in 7508175558.

The unit test comes from https://github.com/facebook/rocksdb/pull/7521.
2020-10-13 10:01:03 -07:00
Andrew Kryczka
353745d639 Add missing release note 2020-10-13 09:56:08 -07:00
33 changed files with 963 additions and 126 deletions

View File

@ -1,4 +1,33 @@
# Rocksdb Change Log
## 6.14.6 (12/01/2020)
### Bug Fixes
* Truncated WALs ending in incomplete records can no longer produce gaps in the recovered data when `WALRecoveryMode::kPointInTimeRecovery` is used. Gaps are still possible when WALs are truncated exactly on record boundaries.
## 6.14.5 (11/15/2020)
### Bug Fixes
* Fixed a bug that caused `BlockBasedTableOptions::read_amp_bytes_per_bit` to be encoded in and parsed from the OPTIONS file as a 64-bit integer.
* Fixed the logic of populating the native data structure for `read_amp_bytes_per_bit` during OPTIONS file parsing on big-endian architectures. Without this fix, the original code introduced in PR 7659 could, when running on a big-endian machine, mistakenly store `read_amp_bytes_per_bit` (a uint32) in little-endian format, and future accesses to `read_amp_bytes_per_bit` would return wrong values. Little-endian architectures are not affected.
## 6.14.4 (11/05/2020)
### Bug Fixes
* Fixed a potential bug caused by evaluating `TableBuilder::NeedCompact()` before `TableBuilder::Finish()` in the compaction job. For example, the `NeedCompact()` method of `CompactOnDeletionCollector` returned by the built-in `CompactOnDeletionCollectorFactory` requires `BlockBasedTable::Finish()` to return the correct result. The bug could cause a compaction-generated file not to be marked for future compaction based on deletion ratio.
## 6.14.3 (10/30/2020)
### Bug Fixes
* Reverted a behavior change silently introduced in 6.14.2, in which the effects of the `ignore_unknown_options` flag (used in option parsing/loading functions) changed.
* Reverted a behavior change silently introduced in 6.14, in which options parsing/loading functions began returning `NotFound` instead of `InvalidArgument` for option names not available in the present version.
## 6.14.2 (10/21/2020)
### Bug Fixes
* Fixed a bug that caused a hang when closing the DB with refit level set, in opt builds. It happened because `ContinueBackgroundWork()` was called inside an assert statement, which is a no-op in opt builds. The bug was introduced in 6.14.
## 6.14.1 (10/13/2020)
### Bug Fixes
* Since 6.12, memtable lookup should report unrecognized value_type as corruption (#7121).
* Since 6.14, fixed a false positive flush/compaction `Status::Corruption` failure occurring when `paranoid_file_checks == true` and range tombstones were written to the compaction output files.
* Fixed a bug in the following combination of features: indexes with user keys (`format_version >= 3`), indexes are partitioned (`index_type == kTwoLevelIndexSearch`), and some index partitions are pinned in memory (`BlockBasedTableOptions::pin_l0_filter_and_index_blocks_in_cache`). The bug could cause keys to be truncated when read from the index leading to wrong read results or other unexpected behavior.
* Fixed a bug when indexes are partitioned (`index_type == kTwoLevelIndexSearch`), some index partitions are pinned in memory (`BlockBasedTableOptions::pin_l0_filter_and_index_blocks_in_cache`), and partitions reads could be mixed between block cache and directly from the file (e.g., with `enable_index_compression == 1` and `mmap_read == 1`, partitions that were stored uncompressed due to poor compression ratio would be read directly from the file via mmap, while partitions that were stored compressed would be read from block cache). The bug could cause index partitions to be mistakenly considered empty during reads leading to wrong read results.
## 6.14 (10/09/2020)
### Bug Fixes
* Fixed a bug after a `CompactRange()` with `CompactRangeOptions::change_level` set fails due to a conflict in the level change step, which caused all subsequent calls to `CompactRange()` with `CompactRangeOptions::change_level` set to incorrectly fail with a `Status::NotSupported("another thread is refitting")` error.

View File

@ -1352,7 +1352,6 @@ Status CompactionJob::FinishCompactionOutputFile(
ExtractInternalKeyFooter(meta->smallest.Encode()) !=
PackSequenceAndType(0, kTypeRangeDeletion));
}
meta->marked_for_compaction = sub_compact->builder->NeedCompact();
}
const uint64_t current_entries = sub_compact->builder->NumEntries();
if (s.ok()) {
@ -1367,6 +1366,7 @@ Status CompactionJob::FinishCompactionOutputFile(
const uint64_t current_bytes = sub_compact->builder->FileSize();
if (s.ok()) {
meta->fd.file_size = current_bytes;
meta->marked_for_compaction = sub_compact->builder->NeedCompact();
}
sub_compact->current_output()->finished = true;
sub_compact->total_bytes += current_bytes;

View File

@ -38,7 +38,7 @@
namespace ROCKSDB_NAMESPACE {
static const int kValueSize = 1000;
static constexpr int kValueSize = 1000;
class CorruptionTest : public testing::Test {
public:
@ -69,9 +69,16 @@ class CorruptionTest : public testing::Test {
}
~CorruptionTest() override {
SyncPoint::GetInstance()->DisableProcessing();
SyncPoint::GetInstance()->LoadDependency({});
SyncPoint::GetInstance()->ClearAllCallBacks();
delete db_;
db_ = nullptr;
DestroyDB(dbname_, Options());
if (getenv("KEEP_DB")) {
fprintf(stdout, "db is still at %s\n", dbname_.c_str());
} else {
EXPECT_OK(DestroyDB(dbname_, Options()));
}
}
void CloseDb() {
@ -106,7 +113,7 @@ class CorruptionTest : public testing::Test {
ASSERT_OK(::ROCKSDB_NAMESPACE::RepairDB(dbname_, options_));
}
void Build(int n, int flush_every = 0) {
void Build(int n, int start, int flush_every) {
std::string key_space, value_space;
WriteBatch batch;
for (int i = 0; i < n; i++) {
@ -115,13 +122,15 @@ class CorruptionTest : public testing::Test {
ASSERT_OK(dbi->TEST_FlushMemTable());
}
//if ((i % 100) == 0) fprintf(stderr, "@ %d of %d\n", i, n);
Slice key = Key(i, &key_space);
Slice key = Key(i + start, &key_space);
batch.Clear();
ASSERT_OK(batch.Put(key, Value(i, &value_space)));
ASSERT_OK(db_->Write(WriteOptions(), &batch));
}
}
void Build(int n, int flush_every = 0) { Build(n, 0, flush_every); }
void Check(int min_expected, int max_expected) {
uint64_t next_expected = 0;
uint64_t missed = 0;
@ -723,6 +732,154 @@ TEST_F(CorruptionTest, DisableKeyOrderCheck) {
ROCKSDB_NAMESPACE::SyncPoint::GetInstance()->ClearAllCallBacks();
}
TEST_F(CorruptionTest, ParanoidFileChecksWithDeleteRangeFirst) {
Options options;
options.paranoid_file_checks = true;
options.create_if_missing = true;
for (bool do_flush : {true, false}) {
delete db_;
db_ = nullptr;
ASSERT_OK(DestroyDB(dbname_, options));
ASSERT_OK(DB::Open(options, dbname_, &db_));
std::string start, end;
assert(db_ != nullptr);
ASSERT_OK(db_->DeleteRange(WriteOptions(), db_->DefaultColumnFamily(),
Key(3, &start), Key(7, &end)));
auto snap = db_->GetSnapshot();
ASSERT_NE(snap, nullptr);
ASSERT_OK(db_->DeleteRange(WriteOptions(), db_->DefaultColumnFamily(),
Key(8, &start), Key(9, &end)));
ASSERT_OK(db_->DeleteRange(WriteOptions(), db_->DefaultColumnFamily(),
Key(2, &start), Key(5, &end)));
Build(10);
if (do_flush) {
ASSERT_OK(db_->Flush(FlushOptions()));
} else {
DBImpl* dbi = static_cast_with_check<DBImpl>(db_);
ASSERT_OK(dbi->TEST_FlushMemTable());
ASSERT_OK(dbi->TEST_CompactRange(0, nullptr, nullptr, nullptr, true));
}
db_->ReleaseSnapshot(snap);
}
}
TEST_F(CorruptionTest, ParanoidFileChecksWithDeleteRange) {
Options options;
options.paranoid_file_checks = true;
options.create_if_missing = true;
for (bool do_flush : {true, false}) {
delete db_;
db_ = nullptr;
ASSERT_OK(DestroyDB(dbname_, options));
ASSERT_OK(DB::Open(options, dbname_, &db_));
assert(db_ != nullptr);
Build(10, 0, 0);
std::string start, end;
ASSERT_OK(db_->DeleteRange(WriteOptions(), db_->DefaultColumnFamily(),
Key(5, &start), Key(15, &end)));
auto snap = db_->GetSnapshot();
ASSERT_NE(snap, nullptr);
ASSERT_OK(db_->DeleteRange(WriteOptions(), db_->DefaultColumnFamily(),
Key(8, &start), Key(9, &end)));
ASSERT_OK(db_->DeleteRange(WriteOptions(), db_->DefaultColumnFamily(),
Key(12, &start), Key(17, &end)));
ASSERT_OK(db_->DeleteRange(WriteOptions(), db_->DefaultColumnFamily(),
Key(2, &start), Key(4, &end)));
Build(10, 10, 0);
if (do_flush) {
ASSERT_OK(db_->Flush(FlushOptions()));
} else {
DBImpl* dbi = static_cast_with_check<DBImpl>(db_);
ASSERT_OK(dbi->TEST_FlushMemTable());
ASSERT_OK(dbi->TEST_CompactRange(0, nullptr, nullptr, nullptr, true));
}
db_->ReleaseSnapshot(snap);
}
}
TEST_F(CorruptionTest, ParanoidFileChecksWithDeleteRangeLast) {
Options options;
options.paranoid_file_checks = true;
options.create_if_missing = true;
for (bool do_flush : {true, false}) {
delete db_;
db_ = nullptr;
ASSERT_OK(DestroyDB(dbname_, options));
ASSERT_OK(DB::Open(options, dbname_, &db_));
assert(db_ != nullptr);
std::string start, end;
Build(10);
ASSERT_OK(db_->DeleteRange(WriteOptions(), db_->DefaultColumnFamily(),
Key(3, &start), Key(7, &end)));
auto snap = db_->GetSnapshot();
ASSERT_NE(snap, nullptr);
ASSERT_OK(db_->DeleteRange(WriteOptions(), db_->DefaultColumnFamily(),
Key(6, &start), Key(8, &end)));
ASSERT_OK(db_->DeleteRange(WriteOptions(), db_->DefaultColumnFamily(),
Key(2, &start), Key(5, &end)));
if (do_flush) {
ASSERT_OK(db_->Flush(FlushOptions()));
} else {
DBImpl* dbi = static_cast_with_check<DBImpl>(db_);
ASSERT_OK(dbi->TEST_FlushMemTable());
ASSERT_OK(dbi->TEST_CompactRange(0, nullptr, nullptr, nullptr, true));
}
db_->ReleaseSnapshot(snap);
}
}
TEST_F(CorruptionTest, VerifyWholeTableChecksum) {
CloseDb();
Options options;
options.env = &env_;
ASSERT_OK(DestroyDB(dbname_, options));
options.create_if_missing = true;
options.file_checksum_gen_factory =
ROCKSDB_NAMESPACE::GetFileChecksumGenCrc32cFactory();
Reopen(&options);
Build(10, 5);
auto* dbi = static_cast_with_check<DBImpl>(db_);
ASSERT_OK(dbi->VerifyFileChecksums(ReadOptions()));
CloseDb();
// Corrupt the first byte of each table file; this must be a data block.
Corrupt(kTableFile, 0, 1);
ASSERT_OK(TryReopen(&options));
dbi = static_cast_with_check<DBImpl>(db_);
SyncPoint::GetInstance()->DisableProcessing();
SyncPoint::GetInstance()->ClearAllCallBacks();
int count{0};
SyncPoint::GetInstance()->SetCallBack(
"DBImpl::VerifySstFileChecksum:mismatch", [&](void* arg) {
auto* s = reinterpret_cast<Status*>(arg);
assert(s);
++count;
ASSERT_NOK(*s);
});
SyncPoint::GetInstance()->EnableProcessing();
ASSERT_TRUE(dbi->VerifyFileChecksums(ReadOptions()).IsCorruption());
ASSERT_EQ(1, count);
CloseDb();
ASSERT_OK(DestroyDB(dbname_, options));
Reopen(&options);
Build(10, 5);
dbi = static_cast_with_check<DBImpl>(db_);
ASSERT_OK(dbi->VerifyFileChecksums(ReadOptions()));
CloseDb();
Corrupt(kTableFile, 0, 1);
// Set best_efforts_recovery to true
options.best_efforts_recovery = true;
#ifdef OS_LINUX
ASSERT_TRUE(TryReopen(&options).IsCorruption());
#endif // OS_LINUX
}
} // namespace ROCKSDB_NAMESPACE
int main(int argc, char** argv) {

View File

@ -2201,6 +2201,8 @@ TEST_F(DBBasicTest, MultiGetIOBufferOverrun) {
TEST_F(DBBasicTest, IncrementalRecoveryNoCorrupt) {
Options options = CurrentOptions();
options.file_checksum_gen_factory =
ROCKSDB_NAMESPACE::GetFileChecksumGenCrc32cFactory();
DestroyAndReopen(options);
CreateAndReopenWithCF({"pikachu", "eevee"}, options);
size_t num_cfs = handles_.size();
@ -2239,6 +2241,8 @@ TEST_F(DBBasicTest, IncrementalRecoveryNoCorrupt) {
TEST_F(DBBasicTest, BestEffortsRecoveryWithVersionBuildingFailure) {
Options options = CurrentOptions();
options.file_checksum_gen_factory =
ROCKSDB_NAMESPACE::GetFileChecksumGenCrc32cFactory();
DestroyAndReopen(options);
ASSERT_OK(Put("foo", "value"));
ASSERT_OK(Flush());
@ -2285,6 +2289,8 @@ TEST_F(DBBasicTest, RecoverWithMissingFiles) {
// Disable auto compaction to simplify SST file name tracking.
options.disable_auto_compactions = true;
options.listeners.emplace_back(listener);
options.file_checksum_gen_factory =
ROCKSDB_NAMESPACE::GetFileChecksumGenCrc32cFactory();
CreateAndReopenWithCF({"pikachu", "eevee"}, options);
std::vector<std::string> all_cf_names = {kDefaultColumnFamilyName, "pikachu",
"eevee"};
@ -2345,6 +2351,8 @@ TEST_F(DBBasicTest, RecoverWithMissingFiles) {
TEST_F(DBBasicTest, BestEffortsRecoveryTryMultipleManifests) {
Options options = CurrentOptions();
options.file_checksum_gen_factory =
ROCKSDB_NAMESPACE::GetFileChecksumGenCrc32cFactory();
options.env = env_;
DestroyAndReopen(options);
ASSERT_OK(Put("foo", "value0"));
@ -2371,6 +2379,8 @@ TEST_F(DBBasicTest, BestEffortsRecoveryTryMultipleManifests) {
TEST_F(DBBasicTest, RecoverWithNoCurrentFile) {
Options options = CurrentOptions();
options.file_checksum_gen_factory =
ROCKSDB_NAMESPACE::GetFileChecksumGenCrc32cFactory();
options.env = env_;
DestroyAndReopen(options);
CreateAndReopenWithCF({"pikachu"}, options);
@ -2394,6 +2404,8 @@ TEST_F(DBBasicTest, RecoverWithNoCurrentFile) {
TEST_F(DBBasicTest, RecoverWithNoManifest) {
Options options = CurrentOptions();
options.file_checksum_gen_factory =
ROCKSDB_NAMESPACE::GetFileChecksumGenCrc32cFactory();
options.env = env_;
DestroyAndReopen(options);
ASSERT_OK(Put("foo", "value"));
@ -2423,6 +2435,8 @@ TEST_F(DBBasicTest, RecoverWithNoManifest) {
TEST_F(DBBasicTest, SkipWALIfMissingTableFiles) {
Options options = CurrentOptions();
options.file_checksum_gen_factory =
ROCKSDB_NAMESPACE::GetFileChecksumGenCrc32cFactory();
DestroyAndReopen(options);
TableFileListener* listener = new TableFileListener();
options.listeners.emplace_back(listener);
@ -3303,6 +3317,28 @@ TEST_F(DBBasicTest, ManifestWriteFailure) {
Reopen(options);
}
#ifndef ROCKSDB_LITE
TEST_F(DBBasicTest, VerifyFileChecksums) {
Options options = GetDefaultOptions();
options.create_if_missing = true;
options.env = env_;
DestroyAndReopen(options);
ASSERT_OK(Put("a", "value"));
ASSERT_OK(Flush());
ASSERT_TRUE(dbfull()->VerifyFileChecksums(ReadOptions()).IsInvalidArgument());
options.file_checksum_gen_factory = GetFileChecksumGenCrc32cFactory();
Reopen(options);
ASSERT_OK(dbfull()->VerifyFileChecksums(ReadOptions()));
// Write an L0 with checksum computed.
ASSERT_OK(Put("b", "value"));
ASSERT_OK(Flush());
ASSERT_OK(dbfull()->VerifyFileChecksums(ReadOptions()));
}
#endif // !ROCKSDB_LITE
// A test class for intercepting random reads and injecting artificial
// delays. Used for testing the deadline/timeout feature
class DBBasicTestDeadline

View File

@ -18,6 +18,7 @@
#include <cstdio>
#include <map>
#include <set>
#include <sstream>
#include <stdexcept>
#include <string>
#include <unordered_map>
@ -974,6 +975,7 @@ Status DBImpl::SetOptions(
MutableCFOptions new_options;
Status s;
Status persist_options_status;
persist_options_status.PermitUncheckedError(); // Allow uninitialized access
SuperVersionContext sv_context(/* create_superversion */ true);
{
auto db_options = GetDBOptions();
@ -2930,7 +2932,7 @@ const Snapshot* DBImpl::GetSnapshotForWriteConflictBoundary() {
SnapshotImpl* DBImpl::GetSnapshotImpl(bool is_write_conflict_boundary,
bool lock) {
int64_t unix_time = 0;
env_->GetCurrentTime(&unix_time); // Ignore error
env_->GetCurrentTime(&unix_time).PermitUncheckedError(); // Ignore error
SnapshotImpl* s = new SnapshotImpl;
if (lock) {
@ -4717,14 +4719,36 @@ Status DBImpl::CreateColumnFamilyWithImport(
temp_s.ToString().c_str());
}
// Always returns Status::OK()
assert(DestroyColumnFamilyHandle(*handle).ok());
temp_s = DestroyColumnFamilyHandle(*handle);
assert(temp_s.ok());
*handle = nullptr;
}
return status;
}
Status DBImpl::VerifyFileChecksums(const ReadOptions& read_options) {
return VerifyChecksumInternal(read_options, /*use_file_checksum=*/true);
}
Status DBImpl::VerifyChecksum(const ReadOptions& read_options) {
return VerifyChecksumInternal(read_options, /*use_file_checksum=*/false);
}
Status DBImpl::VerifyChecksumInternal(const ReadOptions& read_options,
bool use_file_checksum) {
Status s;
if (use_file_checksum) {
FileChecksumGenFactory* const file_checksum_gen_factory =
immutable_db_options_.file_checksum_gen_factory.get();
if (!file_checksum_gen_factory) {
s = Status::InvalidArgument(
"Cannot verify file checksum if options.file_checksum_gen_factory is "
"null");
return s;
}
}
std::vector<ColumnFamilyData*> cfd_list;
{
InstrumentedMutexLock l(&mutex_);
@ -4739,11 +4763,12 @@ Status DBImpl::VerifyChecksum(const ReadOptions& read_options) {
for (auto cfd : cfd_list) {
sv_list.push_back(cfd->GetReferencedSuperVersion(this));
}
for (auto& sv : sv_list) {
VersionStorageInfo* vstorage = sv->current->storage_info();
ColumnFamilyData* cfd = sv->current->cfd();
Options opts;
{
if (!use_file_checksum) {
InstrumentedMutexLock l(&mutex_);
opts = Options(BuildDBOptions(immutable_db_options_, mutable_db_options_),
cfd->GetLatestCFOptions());
@ -4751,13 +4776,20 @@ Status DBImpl::VerifyChecksum(const ReadOptions& read_options) {
for (int i = 0; i < vstorage->num_non_empty_levels() && s.ok(); i++) {
for (size_t j = 0; j < vstorage->LevelFilesBrief(i).num_files && s.ok();
j++) {
const auto& fd = vstorage->LevelFilesBrief(i).files[j].fd;
const auto& fd_with_krange = vstorage->LevelFilesBrief(i).files[j];
const auto& fd = fd_with_krange.fd;
std::string fname = TableFileName(cfd->ioptions()->cf_paths,
fd.GetNumber(), fd.GetPathId());
if (use_file_checksum) {
const FileMetaData* fmeta = fd_with_krange.file_metadata;
assert(fmeta);
s = VerifySstFileChecksum(*fmeta, fname, read_options);
} else {
s = ROCKSDB_NAMESPACE::VerifySstFileChecksum(opts, file_options_,
read_options, fname);
}
}
}
if (!s.ok()) {
break;
}
@ -4786,6 +4818,34 @@ Status DBImpl::VerifyChecksum(const ReadOptions& read_options) {
return s;
}
Status DBImpl::VerifySstFileChecksum(const FileMetaData& fmeta,
const std::string& fname,
const ReadOptions& read_options) {
Status s;
if (fmeta.file_checksum == kUnknownFileChecksum) {
return s;
}
std::string file_checksum;
std::string func_name;
s = ROCKSDB_NAMESPACE::GenerateOneFileChecksum(
fs_.get(), fname, immutable_db_options_.file_checksum_gen_factory.get(),
fmeta.file_checksum_func_name, &file_checksum, &func_name,
read_options.readahead_size, immutable_db_options_.allow_mmap_reads,
io_tracer_);
if (s.ok()) {
assert(fmeta.file_checksum_func_name == func_name);
if (file_checksum != fmeta.file_checksum) {
std::ostringstream oss;
oss << fname << " file checksum mismatch, ";
oss << "expecting " << Slice(fmeta.file_checksum).ToString(/*hex=*/true);
oss << ", but actual " << Slice(file_checksum).ToString(/*hex=*/true);
s = Status::Corruption(oss.str());
TEST_SYNC_POINT_CALLBACK("DBImpl::VerifySstFileChecksum:mismatch", &s);
}
}
return s;
}
void DBImpl::NotifyOnExternalFileIngested(
ColumnFamilyData* cfd, const ExternalSstFileIngestionJob& ingestion_job) {
if (immutable_db_options_.listeners.empty()) {

View File

@ -373,6 +373,12 @@ class DBImpl : public DB {
uint64_t start_time, uint64_t end_time,
std::unique_ptr<StatsHistoryIterator>* stats_iterator) override;
// If immutable_db_options_.best_efforts_recovery is true, and
// RocksDbFileChecksumsVerificationEnabledOnRecovery is defined and returns
// true, and immutable_db_options_.file_checksum_gen_factory is not nullptr,
// then call VerifyFileChecksums().
Status MaybeVerifyFileChecksums();
#ifndef ROCKSDB_LITE
using DB::ResetStats;
virtual Status ResetStats() override;
@ -431,8 +437,27 @@ class DBImpl : public DB {
const ExportImportFilesMetaData& metadata,
ColumnFamilyHandle** handle) override;
Status VerifyFileChecksums(const ReadOptions& read_options);
using DB::VerifyChecksum;
virtual Status VerifyChecksum(const ReadOptions& /*read_options*/) override;
// Verify the checksums of files in db. Currently only tables are checked.
//
// read_options: controls file I/O behavior, e.g. read ahead size while
// reading all the live table files.
//
// use_file_checksum: if false, verify the block checksums of all live table
// in db. Otherwise, obtain the file checksums and compare
// with the MANIFEST. Currently, file checksums are
// recomputed by reading all table files.
//
// Returns: OK if there is no file whose file or block checksum mismatches.
Status VerifyChecksumInternal(const ReadOptions& read_options,
bool use_file_checksum);
Status VerifySstFileChecksum(const FileMetaData& fmeta,
const std::string& fpath,
const ReadOptions& read_options);
using DB::StartTrace;
virtual Status StartTrace(

View File

@ -950,7 +950,8 @@ Status DBImpl::CompactRange(const CompactRangeOptions& options,
s = ReFitLevel(cfd, final_output_level, options.target_level);
TEST_SYNC_POINT("DBImpl::CompactRange:PostRefitLevel");
// ContinueBackgroundWork always returns Status::OK().
assert(ContinueBackgroundWork().ok());
Status temp_s = ContinueBackgroundWork();
assert(temp_s.ok());
}
EnableManualCompaction();
}

View File

@ -23,6 +23,15 @@
#include "test_util/sync_point.h"
#include "util/rate_limiter.h"
#if !defined(ROCKSDB_LITE) && defined(OS_LINUX)
// VerifyFileChecksums is a weak symbol.
// If it is defined and returns true, and options.best_efforts_recovery = true,
// and file checksum is enabled, then the checksums of table files will be
// computed and verified with MANIFEST.
extern "C" bool RocksDbFileChecksumsVerificationEnabledOnRecovery()
__attribute__((__weak__));
#endif // !ROCKSDB_LITE && OS_LINUX
namespace ROCKSDB_NAMESPACE {
Options SanitizeOptions(const std::string& dbname, const Options& src) {
auto db_options = SanitizeOptions(dbname, DBOptions(src));
@ -1404,6 +1413,22 @@ Status DBImpl::WriteLevel0TableForRecovery(int job_id, ColumnFamilyData* cfd,
return s;
}
Status DBImpl::MaybeVerifyFileChecksums() {
Status s;
#if !defined(ROCKSDB_LITE) && defined(OS_LINUX)
// TODO: remove the VerifyFileChecksums() call because it's very expensive.
if (immutable_db_options_.best_efforts_recovery &&
RocksDbFileChecksumsVerificationEnabledOnRecovery &&
RocksDbFileChecksumsVerificationEnabledOnRecovery() &&
immutable_db_options_.file_checksum_gen_factory) {
s = VerifyFileChecksums(ReadOptions());
ROCKS_LOG_INFO(immutable_db_options_.info_log,
"Verified file checksums: %s\n", s.ToString().c_str());
}
#endif // !ROCKSDB_LITE && OS_LINUX
return s;
}
Status DB::Open(const Options& options, const std::string& dbname, DB** dbptr) {
DBOptions db_options(options);
ColumnFamilyOptions cf_options(options);
@ -1779,6 +1804,9 @@ Status DBImpl::Open(const DBOptions& db_options, const std::string& dbname,
"Persisting Option File error: %s",
persist_options_status.ToString().c_str());
}
if (s.ok()) {
s = impl->MaybeVerifyFileChecksums();
}
if (s.ok()) {
impl->StartPeriodicWorkScheduler();
} else {

View File

@ -233,6 +233,9 @@ Status DBImplReadOnly::OpenForReadOnlyWithoutCheck(
}
impl->mutex_.Unlock();
sv_context.Clean();
if (s.ok()) {
s = impl->MaybeVerifyFileChecksums();
}
if (s.ok()) {
*dbptr = impl;
for (auto* h : *handles) {

View File

@ -370,6 +370,54 @@ TEST_P(DBTablePropertiesTest, DeletionTriggeredCompactionMarking) {
ASSERT_LT(0, opts.statistics->getTickerCount(COMPACT_READ_BYTES_MARKED));
}
TEST_P(DBTablePropertiesTest, RatioBasedDeletionTriggeredCompactionMarking) {
constexpr int kNumKeys = 1000;
constexpr int kWindowSize = 0;
constexpr int kNumDelsTrigger = 0;
constexpr double kDeletionRatio = 0.1;
std::shared_ptr<TablePropertiesCollectorFactory> compact_on_del =
NewCompactOnDeletionCollectorFactory(kWindowSize, kNumDelsTrigger,
kDeletionRatio);
Options opts = CurrentOptions();
opts.statistics = ROCKSDB_NAMESPACE::CreateDBStatistics();
opts.table_properties_collector_factories.emplace_back(compact_on_del);
Reopen(opts);
// Add an L2 file to prevent tombstones from dropping due to obsolescence
// during flush
Put(Key(0), "val");
Flush();
MoveFilesToLevel(2);
auto* listener = new DeletionTriggeredCompactionTestListener();
opts.listeners.emplace_back(listener);
Reopen(opts);
// Generate one L0 with kNumKeys Put.
for (int i = 0; i < kNumKeys; ++i) {
ASSERT_OK(Put(Key(i), "not important"));
}
ASSERT_OK(Flush());
// Generate another L0 with kNumKeys Delete.
// This file, due to deletion ratio, will trigger compaction: 2@0 files to L1.
// The resulting L1 file has only one tombstone for user key 'Key(0)'.
// Again, due to deletion ratio, a compaction will be triggered: 1@1 + 1@2
// files to L2. However, the resulting file is empty because the tombstone
// and value are both dropped.
for (int i = 0; i < kNumKeys; ++i) {
ASSERT_OK(Delete(Key(i)));
}
ASSERT_OK(Flush());
ASSERT_OK(dbfull()->TEST_WaitForCompact());
for (int i = 0; i < 3; ++i) {
ASSERT_EQ(0, NumTableFilesAtLevel(i));
}
}
INSTANTIATE_TEST_CASE_P(
DBTablePropertiesTest,
DBTablePropertiesTest,

View File

@ -15,6 +15,10 @@
#include "rocksdb/utilities/object_registry.h"
#include "util/random.h"
extern "C" bool RocksDbFileChecksumsVerificationEnabledOnRecovery() {
return true;
}
namespace ROCKSDB_NAMESPACE {
namespace {

View File

@ -49,6 +49,8 @@
#include "util/string_util.h"
#include "utilities/merge_operators.h"
extern "C" bool RocksDbFileChecksumsVerificationEnabledOnRecovery();
namespace ROCKSDB_NAMESPACE {
namespace anon {

View File

@ -1366,14 +1366,20 @@ TEST_P(DBWALTestWithParams, kPointInTimeRecovery) {
size_t recovered_row_count = RecoveryTestHelper::GetData(this);
ASSERT_LT(recovered_row_count, row_count);
// Verify a prefix of keys were recovered. But not in the case of full WAL
// truncation, because we have no way to know there was a corruption when
// truncation happened on record boundaries (preventing recovery holes in
// that case requires using `track_and_verify_wals_in_manifest`).
if (!trunc || corrupt_offset != 0) {
bool expect_data = true;
for (size_t k = 0; k < maxkeys; ++k) {
bool found = Get("key" + ToString(corrupt_offset)) != "NOT_FOUND";
bool found = Get("key" + ToString(k)) != "NOT_FOUND";
if (expect_data && !found) {
expect_data = false;
}
ASSERT_EQ(found, expect_data);
}
}
const size_t min = RecoveryTestHelper::kKeysPerWALFile *
(wal_file_id - RecoveryTestHelper::kWALFileOffset);

View File

@ -192,10 +192,13 @@ Status ExternalSstFileIngestionJob::Prepare(
// Step 1: generate the checksum for ingested sst file.
if (need_generate_file_checksum_) {
for (size_t i = 0; i < files_to_ingest_.size(); i++) {
std::string generated_checksum, generated_checksum_func_name;
std::string generated_checksum;
std::string generated_checksum_func_name;
std::string requested_checksum_func_name;
IOStatus io_s = GenerateOneFileChecksum(
fs_.get(), files_to_ingest_[i].internal_file_path,
db_options_.file_checksum_gen_factory.get(), &generated_checksum,
db_options_.file_checksum_gen_factory.get(),
requested_checksum_func_name, &generated_checksum,
&generated_checksum_func_name,
ingestion_options_.verify_checksums_readahead_size,
db_options_.allow_mmap_reads, io_tracer_);
@ -830,11 +833,13 @@ IOStatus ExternalSstFileIngestionJob::GenerateChecksumForIngestedFile(
// file checksum generated during Prepare(). This step will be skipped.
return IOStatus::OK();
}
std::string file_checksum, file_checksum_func_name;
std::string file_checksum;
std::string file_checksum_func_name;
std::string requested_checksum_func_name;
IOStatus io_s = GenerateOneFileChecksum(
fs_.get(), file_to_ingest->internal_file_path,
db_options_.file_checksum_gen_factory.get(), &file_checksum,
&file_checksum_func_name,
db_options_.file_checksum_gen_factory.get(), requested_checksum_func_name,
&file_checksum, &file_checksum_func_name,
ingestion_options_.verify_checksums_readahead_size,
db_options_.allow_mmap_reads, io_tracer_);
if (!io_s.ok()) {

View File

@ -662,10 +662,9 @@ void ForwardIterator::RebuildIterators(bool refresh_sv) {
read_options_, sv_->current->version_set()->LastSequence()));
range_del_agg.AddTombstones(std::move(range_del_iter));
// Always return Status::OK().
assert(
sv_->imm
->AddRangeTombstoneIterators(read_options_, &arena_, &range_del_agg)
.ok());
Status temp_s = sv_->imm->AddRangeTombstoneIterators(read_options_, &arena_,
&range_del_agg);
assert(temp_s.ok());
}
has_iter_trimmed_for_upper_bound_ = false;
@ -728,10 +727,9 @@ void ForwardIterator::RenewIterators() {
read_options_, sv_->current->version_set()->LastSequence()));
range_del_agg.AddTombstones(std::move(range_del_iter));
// Always return Status::OK().
assert(
svnew->imm
->AddRangeTombstoneIterators(read_options_, &arena_, &range_del_agg)
.ok());
Status temp_s = svnew->imm->AddRangeTombstoneIterators(
read_options_, &arena_, &range_del_agg);
assert(temp_s.ok());
}
const auto* vstorage = sv_->current->storage_info();

View File

@ -119,16 +119,26 @@ bool Reader::ReadRecord(Slice* record, std::string* scratch,
break;
case kBadHeader:
if (wal_recovery_mode == WALRecoveryMode::kAbsoluteConsistency) {
// in clean shutdown we don't expect any error in the log files
if (wal_recovery_mode == WALRecoveryMode::kAbsoluteConsistency ||
wal_recovery_mode == WALRecoveryMode::kPointInTimeRecovery) {
// In clean shutdown we don't expect any error in the log files.
// In point-in-time recovery an incomplete record at the end could
// produce a hole in the recovered data. Report an error here, which
// higher layers can choose to ignore when it's provable there is no
// hole.
ReportCorruption(drop_size, "truncated header");
}
FALLTHROUGH_INTENDED;
case kEof:
if (in_fragmented_record) {
if (wal_recovery_mode == WALRecoveryMode::kAbsoluteConsistency) {
// in clean shutdown we don't expect any error in the log files
if (wal_recovery_mode == WALRecoveryMode::kAbsoluteConsistency ||
wal_recovery_mode == WALRecoveryMode::kPointInTimeRecovery) {
// In clean shutdown we don't expect any error in the log files.
// In point-in-time recovery an incomplete record at the end could
// produce a hole in the recovered data. Report an error here, which
// higher layers can choose to ignore when it's provable there is no
// hole.
ReportCorruption(scratch->size(), "error reading trailing data");
}
// This can be caused by the writer dying immediately after
@ -142,8 +152,13 @@ bool Reader::ReadRecord(Slice* record, std::string* scratch,
if (wal_recovery_mode != WALRecoveryMode::kSkipAnyCorruptedRecords) {
// Treat a record from a previous instance of the log as EOF.
if (in_fragmented_record) {
if (wal_recovery_mode == WALRecoveryMode::kAbsoluteConsistency) {
// in clean shutdown we don't expect any error in the log files
if (wal_recovery_mode == WALRecoveryMode::kAbsoluteConsistency ||
wal_recovery_mode == WALRecoveryMode::kPointInTimeRecovery) {
// In clean shutdown we don't expect any error in the log files.
// In point-in-time recovery an incomplete record at the end could
// produce a hole in the recovered data. Report an error here,
// which higher layers can choose to ignore when it's provable
// there is no hole.
ReportCorruption(scratch->size(), "error reading trailing data");
}
// This can be caused by the writer dying immediately after
@ -164,6 +179,20 @@ bool Reader::ReadRecord(Slice* record, std::string* scratch,
break;
case kBadRecordLen:
if (eof_) {
if (wal_recovery_mode == WALRecoveryMode::kAbsoluteConsistency ||
wal_recovery_mode == WALRecoveryMode::kPointInTimeRecovery) {
// In clean shutdown we don't expect any error in the log files.
// In point-in-time recovery an incomplete record at the end could
// produce a hole in the recovered data. Report an error here, which
// higher layers can choose to ignore when it's provable there is no
// hole.
ReportCorruption(drop_size, "truncated record body");
}
return false;
}
FALLTHROUGH_INTENDED;
case kBadRecordChecksum:
if (recycled_ &&
wal_recovery_mode ==
@ -355,19 +384,15 @@ unsigned int Reader::ReadPhysicalRecord(Slice* result, size_t* drop_size) {
}
}
if (header_size + length > buffer_.size()) {
assert(buffer_.size() >= static_cast<size_t>(header_size));
*drop_size = buffer_.size();
buffer_.clear();
if (!eof_) {
// If the end of the read has been reached without seeing
// `header_size + length` bytes of payload, report a corruption. The
// higher layers can decide how to handle it based on the recovery mode,
// whether this occurred at EOF, whether this is the final WAL, etc.
return kBadRecordLen;
}
// If the end of the file has been reached without reading |length|
// bytes of payload, assume the writer died in the middle of writing the
// record. Don't report a corruption unless requested.
if (*drop_size) {
return kBadHeader;
}
return kEof;
}
if (type == kZeroType && length == 0) {
// Skip zero length record without reporting any drops since

View File

@ -465,7 +465,7 @@ TEST_P(LogTest, BadLengthAtEndIsNotIgnored) {
ShrinkSize(1);
ASSERT_EQ("EOF", Read(WALRecoveryMode::kAbsoluteConsistency));
ASSERT_GT(DroppedBytes(), 0U);
ASSERT_EQ("OK", MatchError("Corruption: truncated header"));
ASSERT_EQ("OK", MatchError("Corruption: truncated record body"));
}
TEST_P(LogTest, ChecksumMismatch) {
@ -573,9 +573,7 @@ TEST_P(LogTest, PartialLastIsNotIgnored) {
ShrinkSize(1);
ASSERT_EQ("EOF", Read(WALRecoveryMode::kAbsoluteConsistency));
ASSERT_GT(DroppedBytes(), 0U);
ASSERT_EQ("OK", MatchError(
"Corruption: truncated headerCorruption: "
"error reading trailing data"));
ASSERT_EQ("OK", MatchError("Corruption: truncated record body"));
}
TEST_P(LogTest, ErrorJoinsRecords) {

View File

@ -123,12 +123,15 @@ bool IsWalDirSameAsDBPath(const ImmutableDBOptions* db_options) {
return same;
}
IOStatus GenerateOneFileChecksum(FileSystem* fs, const std::string& file_path,
// requested_checksum_func_name is the name of the checksum function that
// checksum_factory is asked to use. Checksum factories may use or ignore
// requested_checksum_func_name.
IOStatus GenerateOneFileChecksum(
FileSystem* fs, const std::string& file_path,
FileChecksumGenFactory* checksum_factory,
std::string* file_checksum,
const std::string& requested_checksum_func_name, std::string* file_checksum,
std::string* file_checksum_func_name,
size_t verify_checksums_readahead_size,
bool allow_mmap_reads,
size_t verify_checksums_readahead_size, bool allow_mmap_reads,
std::shared_ptr<IOTracer>& io_tracer) {
if (checksum_factory == nullptr) {
return IOStatus::InvalidArgument("Checksum factory is invalid");
@ -137,8 +140,25 @@ IOStatus GenerateOneFileChecksum(FileSystem* fs, const std::string& file_path,
assert(file_checksum_func_name != nullptr);
FileChecksumGenContext gen_context;
gen_context.requested_checksum_func_name = requested_checksum_func_name;
gen_context.file_name = file_path;
std::unique_ptr<FileChecksumGenerator> checksum_generator =
checksum_factory->CreateFileChecksumGenerator(gen_context);
if (checksum_generator == nullptr) {
std::string msg =
"Cannot get the file checksum generator based on the requested "
"checksum function name: " +
requested_checksum_func_name +
" from checksum factory: " + checksum_factory->Name();
return IOStatus::InvalidArgument(msg);
}
// For backward compatibility, requested_checksum_func_name can be empty.
// If the requested checksum function name is given, we expect it to match
// the name of the created checksum generator.
assert(!checksum_generator || requested_checksum_func_name.empty() ||
requested_checksum_func_name == checksum_generator->Name());
uint64_t size;
IOStatus io_s;
std::unique_ptr<RandomAccessFileReader> reader;

View File

@ -35,7 +35,8 @@ extern bool IsWalDirSameAsDBPath(const ImmutableDBOptions* db_options);
extern IOStatus GenerateOneFileChecksum(
FileSystem* fs, const std::string& file_path,
FileChecksumGenFactory* checksum_factory, std::string* file_checksum,
FileChecksumGenFactory* checksum_factory,
const std::string& requested_checksum_func_name, std::string* file_checksum,
std::string* file_checksum_func_name,
size_t verify_checksums_readahead_size, bool allow_mmap_reads,
std::shared_ptr<IOTracer>& io_tracer);

View File

@ -1443,6 +1443,8 @@ class DB {
const ExportImportFilesMetaData& metadata,
ColumnFamilyHandle** handle) = 0;
// Verify the block checksums of files in db. The block checksums of table
// files are checked.
virtual Status VerifyChecksum(const ReadOptions& read_options) = 0;
virtual Status VerifyChecksum() { return VerifyChecksum(ReadOptions()); }

View File

@ -162,9 +162,15 @@ class Status {
static Status NotFound(const Slice& msg, const Slice& msg2 = Slice()) {
return Status(kNotFound, msg, msg2);
}
// Fast path for not found without malloc;
static Status NotFound(SubCode msg = kNone) { return Status(kNotFound, msg); }
static Status NotFound(SubCode sc, const Slice& msg,
const Slice& msg2 = Slice()) {
return Status(kNotFound, sc, msg, msg2);
}
static Status Corruption(const Slice& msg, const Slice& msg2 = Slice()) {
return Status(kCorruption, msg, msg2);
}
@ -463,7 +469,8 @@ class Status {
#ifdef ROCKSDB_ASSERT_STATUS_CHECKED
checked_ = true;
#endif // ROCKSDB_ASSERT_STATUS_CHECKED
return (code() == kIOError) && (subcode() == kPathNotFound);
return (code() == kIOError || code() == kNotFound) &&
(subcode() == kPathNotFound);
}
// Returns true iff the status indicates manual compaction paused. This

View File

@ -6,7 +6,7 @@
#define ROCKSDB_MAJOR 6
#define ROCKSDB_MINOR 14
#define ROCKSDB_PATCH 0
#define ROCKSDB_PATCH 6
// Do not use these. We made the mistake of declaring macros starting with
// double underscore. Now we have to live with our choice. We'll deprecate these

View File

@ -91,8 +91,10 @@ void WriteBufferManager::ReserveMemWithCache(size_t mem) {
// Expand size by at least 256KB.
// Add a dummy record to the cache
Cache::Handle* handle = nullptr;
Status s =
cache_rep_->cache_->Insert(cache_rep_->GetNextCacheKey(), nullptr,
kSizeDummyEntry, nullptr, &handle);
s.PermitUncheckedError(); // TODO: What to do on error?
// We keep the handle even if insertion fails and a null handle is
// returned, so that when memory shrinks, we don't release extra
// entries from cache.

View File

@ -724,9 +724,15 @@ Status GetColumnFamilyOptionsFromMap(
*new_options = base_options;
const auto config = CFOptionsAsConfigurable(base_options);
return ConfigureFromMap<ColumnFamilyOptions>(config_options, opts_map,
OptionsHelper::kCFOptionsName,
config.get(), new_options);
Status s = ConfigureFromMap<ColumnFamilyOptions>(
config_options, opts_map, OptionsHelper::kCFOptionsName, config.get(),
new_options);
// Translate any errors (NotFound, NotSupported, etc.) to InvalidArgument
if (s.ok() || s.IsInvalidArgument()) {
return s;
} else {
return Status::InvalidArgument(s.getState());
}
}
Status GetColumnFamilyOptionsFromString(
@ -773,9 +779,15 @@ Status GetDBOptionsFromMap(
assert(new_options);
*new_options = base_options;
auto config = DBOptionsAsConfigurable(base_options);
return ConfigureFromMap<DBOptions>(config_options, opts_map,
Status s = ConfigureFromMap<DBOptions>(config_options, opts_map,
OptionsHelper::kDBOptionsName,
config.get(), new_options);
// Translate any errors (NotFound, NotSupported, etc.) to InvalidArgument
if (s.ok() || s.IsInvalidArgument()) {
return s;
} else {
return Status::InvalidArgument(s.getState());
}
}
Status GetDBOptionsFromString(const DBOptions& base_options,
@ -841,7 +853,12 @@ Status GetOptionsFromString(const ConfigOptions& config_options,
*new_options = Options(*new_db_options, base_options);
}
}
// Translate any errors (NotFound, NotSupported, etc.) to InvalidArgument
if (s.ok() || s.IsInvalidArgument()) {
return s;
} else {
return Status::InvalidArgument(s.getState());
}
}
std::unordered_map<std::string, EncodingType>

View File

@ -460,7 +460,13 @@ Status RocksDBOptionsParser::EndSection(
opt_section_titles[kOptionSectionTableOptions].size()),
&(cf_opt->table_factory));
if (s.ok()) {
return cf_opt->table_factory->ConfigureFromMap(config_options, opt_map);
s = cf_opt->table_factory->ConfigureFromMap(config_options, opt_map);
// Translate any errors (NotFound, NotSupported, etc.) to InvalidArgument
if (s.ok() || s.IsInvalidArgument()) {
return s;
} else {
return Status::InvalidArgument(s.getState());
}
} else {
// Return OK for not supported table factories as TableFactory
// Deserialization is optional.

View File

@ -302,8 +302,11 @@ TEST_F(OptionsTest, GetOptionsFromMapTest) {
ASSERT_EQ(new_db_opt.strict_bytes_per_sync, true);
db_options_map["max_open_files"] = "hello";
ASSERT_NOK(
GetDBOptionsFromMap(exact, base_db_opt, db_options_map, &new_db_opt));
Status s =
GetDBOptionsFromMap(exact, base_db_opt, db_options_map, &new_db_opt);
ASSERT_NOK(s);
ASSERT_TRUE(s.IsInvalidArgument());
ASSERT_OK(
RocksDBOptionsParser::VerifyDBOptions(exact, base_db_opt, new_db_opt));
ASSERT_OK(
@ -311,8 +314,9 @@ TEST_F(OptionsTest, GetOptionsFromMapTest) {
// unknown options should fail parsing without ignore_unknown_options = true
db_options_map["unknown_db_option"] = "1";
ASSERT_NOK(
GetDBOptionsFromMap(exact, base_db_opt, db_options_map, &new_db_opt));
s = GetDBOptionsFromMap(exact, base_db_opt, db_options_map, &new_db_opt);
ASSERT_NOK(s);
ASSERT_TRUE(s.IsInvalidArgument());
ASSERT_OK(
RocksDBOptionsParser::VerifyDBOptions(exact, base_db_opt, new_db_opt));
@ -397,22 +401,29 @@ TEST_F(OptionsTest, GetColumnFamilyOptionsFromStringTest) {
ASSERT_EQ(kMoName, std::string(new_cf_opt.merge_operator->Name()));
// Wrong key/value pair
ASSERT_NOK(GetColumnFamilyOptionsFromString(
Status s = GetColumnFamilyOptionsFromString(
config_options, base_cf_opt,
"write_buffer_size=13;max_write_buffer_number;", &new_cf_opt));
"write_buffer_size=13;max_write_buffer_number;", &new_cf_opt);
ASSERT_NOK(s);
ASSERT_TRUE(s.IsInvalidArgument());
ASSERT_OK(RocksDBOptionsParser::VerifyCFOptions(config_options, base_cf_opt,
new_cf_opt));
// Error Paring value
ASSERT_NOK(GetColumnFamilyOptionsFromString(
// Error Parsing value
s = GetColumnFamilyOptionsFromString(
config_options, base_cf_opt,
"write_buffer_size=13;max_write_buffer_number=;", &new_cf_opt));
"write_buffer_size=13;max_write_buffer_number=;", &new_cf_opt);
ASSERT_NOK(s);
ASSERT_TRUE(s.IsInvalidArgument());
ASSERT_OK(RocksDBOptionsParser::VerifyCFOptions(config_options, base_cf_opt,
new_cf_opt));
// Missing option name
ASSERT_NOK(GetColumnFamilyOptionsFromString(
config_options, base_cf_opt, "write_buffer_size=13; =100;", &new_cf_opt));
s = GetColumnFamilyOptionsFromString(
config_options, base_cf_opt, "write_buffer_size=13; =100;", &new_cf_opt);
ASSERT_NOK(s);
ASSERT_TRUE(s.IsInvalidArgument());
ASSERT_OK(RocksDBOptionsParser::VerifyCFOptions(config_options, base_cf_opt,
new_cf_opt));
@ -783,7 +794,10 @@ TEST_F(OptionsTest, OldInterfaceTest) {
ASSERT_EQ(new_db_opt.paranoid_checks, true);
ASSERT_EQ(new_db_opt.max_open_files, 32);
db_options_map["unknown_option"] = "1";
ASSERT_NOK(GetDBOptionsFromMap(base_db_opt, db_options_map, &new_db_opt));
Status s = GetDBOptionsFromMap(base_db_opt, db_options_map, &new_db_opt);
ASSERT_NOK(s);
ASSERT_TRUE(s.IsInvalidArgument());
ASSERT_OK(
RocksDBOptionsParser::VerifyDBOptions(exact, base_db_opt, new_db_opt));
ASSERT_OK(GetDBOptionsFromMap(base_db_opt, db_options_map, &new_db_opt, true,
@ -795,11 +809,13 @@ TEST_F(OptionsTest, OldInterfaceTest) {
ASSERT_EQ(new_db_opt.create_if_missing, false);
ASSERT_EQ(new_db_opt.error_if_exists, false);
ASSERT_EQ(new_db_opt.max_open_files, 42);
ASSERT_NOK(GetDBOptionsFromString(
s = GetDBOptionsFromString(
base_db_opt,
"create_if_missing=false;error_if_exists=false;max_open_files=42;"
"unknown_option=1;",
&new_db_opt));
&new_db_opt);
ASSERT_NOK(s);
ASSERT_TRUE(s.IsInvalidArgument());
ASSERT_OK(
RocksDBOptionsParser::VerifyDBOptions(exact, base_db_opt, new_db_opt));
}
@ -822,7 +838,12 @@ TEST_F(OptionsTest, GetBlockBasedTableOptionsFromString) {
"block_cache=1M;block_cache_compressed=1k;block_size=1024;"
"block_size_deviation=8;block_restart_interval=4;"
"format_version=5;whole_key_filtering=1;"
"filter_policy=bloomfilter:4.567:false;",
"filter_policy=bloomfilter:4.567:false;"
// A bug caused read_amp_bytes_per_bit to be a large integer in OPTIONS
// file generated by 6.10 to 6.14. Though the bug is fixed in these releases,
// we need to handle the case of loading an OPTIONS file generated before the
// fix.
"read_amp_bytes_per_bit=17179869185;",
&new_opt));
ASSERT_TRUE(new_opt.cache_index_and_filter_blocks);
ASSERT_EQ(new_opt.index_type, BlockBasedTableOptions::kHashSearch);
@ -842,21 +863,28 @@ TEST_F(OptionsTest, GetBlockBasedTableOptionsFromString) {
dynamic_cast<const BloomFilterPolicy&>(*new_opt.filter_policy);
EXPECT_EQ(bfp.GetMillibitsPerKey(), 4567);
EXPECT_EQ(bfp.GetWholeBitsPerKey(), 5);
// Verify that only the lower 32bits are stored in
// new_opt.read_amp_bytes_per_bit.
EXPECT_EQ(1U, new_opt.read_amp_bytes_per_bit);
// unknown option
ASSERT_NOK(GetBlockBasedTableOptionsFromString(
Status s = GetBlockBasedTableOptionsFromString(
config_options, table_opt,
"cache_index_and_filter_blocks=1;index_type=kBinarySearch;"
"bad_option=1",
&new_opt));
&new_opt);
ASSERT_NOK(s);
ASSERT_TRUE(s.IsInvalidArgument());
ASSERT_EQ(static_cast<bool>(table_opt.cache_index_and_filter_blocks),
new_opt.cache_index_and_filter_blocks);
ASSERT_EQ(table_opt.index_type, new_opt.index_type);
// unrecognized index type
ASSERT_NOK(GetBlockBasedTableOptionsFromString(
s = GetBlockBasedTableOptionsFromString(
config_options, table_opt,
"cache_index_and_filter_blocks=1;index_type=kBinarySearchXX", &new_opt));
"cache_index_and_filter_blocks=1;index_type=kBinarySearchXX", &new_opt);
ASSERT_NOK(s);
ASSERT_TRUE(s.IsInvalidArgument());
ASSERT_EQ(table_opt.cache_index_and_filter_blocks,
new_opt.cache_index_and_filter_blocks);
ASSERT_EQ(table_opt.index_type, new_opt.index_type);
@ -870,21 +898,23 @@ TEST_F(OptionsTest, GetBlockBasedTableOptionsFromString) {
ASSERT_EQ(table_opt.index_type, new_opt.index_type);
// unrecognized filter policy name
ASSERT_NOK(
GetBlockBasedTableOptionsFromString(config_options, table_opt,
s = GetBlockBasedTableOptionsFromString(config_options, table_opt,
"cache_index_and_filter_blocks=1;"
"filter_policy=bloomfilterxx:4:true",
&new_opt));
&new_opt);
ASSERT_NOK(s);
ASSERT_TRUE(s.IsInvalidArgument());
ASSERT_EQ(table_opt.cache_index_and_filter_blocks,
new_opt.cache_index_and_filter_blocks);
ASSERT_EQ(table_opt.filter_policy, new_opt.filter_policy);
// unrecognized filter policy config
ASSERT_NOK(
GetBlockBasedTableOptionsFromString(config_options, table_opt,
s = GetBlockBasedTableOptionsFromString(config_options, table_opt,
"cache_index_and_filter_blocks=1;"
"filter_policy=bloomfilter:4",
&new_opt));
&new_opt);
ASSERT_NOK(s);
ASSERT_TRUE(s.IsInvalidArgument());
ASSERT_EQ(table_opt.cache_index_and_filter_blocks,
new_opt.cache_index_and_filter_blocks);
ASSERT_EQ(table_opt.filter_policy, new_opt.filter_policy);
@ -1017,18 +1047,22 @@ TEST_F(OptionsTest, GetPlainTableOptionsFromString) {
ASSERT_TRUE(new_opt.store_index_in_file);
// unknown option
ASSERT_NOK(GetPlainTableOptionsFromString(
Status s = GetPlainTableOptionsFromString(
config_options, table_opt,
"user_key_len=66;bloom_bits_per_key=20;hash_table_ratio=0.5;"
"bad_option=1",
&new_opt));
&new_opt);
ASSERT_NOK(s);
ASSERT_TRUE(s.IsInvalidArgument());
// unrecognized EncodingType
ASSERT_NOK(GetPlainTableOptionsFromString(
s = GetPlainTableOptionsFromString(
config_options, table_opt,
"user_key_len=66;bloom_bits_per_key=20;hash_table_ratio=0.5;"
"encoding_type=kPrefixXX",
&new_opt));
&new_opt);
ASSERT_NOK(s);
ASSERT_TRUE(s.IsInvalidArgument());
}
#endif // !ROCKSDB_LITE
@ -1147,23 +1181,29 @@ TEST_F(OptionsTest, GetOptionsFromStringTest) {
base_options.dump_malloc_stats = false;
base_options.write_buffer_size = 1024;
Options bad_options = new_options;
ASSERT_NOK(GetOptionsFromString(config_options, base_options,
Status s = GetOptionsFromString(config_options, base_options,
"create_if_missing=XX;dump_malloc_stats=true",
&bad_options));
&bad_options);
ASSERT_NOK(s);
ASSERT_TRUE(s.IsInvalidArgument());
ASSERT_EQ(bad_options.dump_malloc_stats, false);
bad_options = new_options;
ASSERT_NOK(GetOptionsFromString(config_options, base_options,
s = GetOptionsFromString(config_options, base_options,
"write_buffer_size=XX;dump_malloc_stats=true",
&bad_options));
&bad_options);
ASSERT_NOK(s);
ASSERT_TRUE(s.IsInvalidArgument());
ASSERT_EQ(bad_options.dump_malloc_stats, false);
// Test a bad value for a TableFactory Option returns a failure
bad_options = new_options;
s = GetOptionsFromString(config_options, base_options,
    "write_buffer_size=16;dump_malloc_stats=true"
    "block_based_table_factory={block_size=XX;};",
    &bad_options);
ASSERT_TRUE(s.IsInvalidArgument());
ASSERT_EQ(bad_options.dump_malloc_stats, false);
ASSERT_EQ(bad_options.write_buffer_size, 1024);

View File

@ -333,8 +333,24 @@ static std::unordered_map<std::string, OptionTypeInfo>
OptionTypeFlags::kNone}},
{"read_amp_bytes_per_bit",
{offsetof(struct BlockBasedTableOptions, read_amp_bytes_per_bit),
OptionType::kUInt32T, OptionVerificationType::kNormal,
OptionTypeFlags::kNone,
[](const ConfigOptions& /*opts*/, const std::string& /*name*/,
const std::string& value, char* addr) {
// A workaround for a bug in 6.10 through 6.14. Those releases wrote
// out 8 bytes to the OPTIONS file starting at the address of
// BlockBasedTableOptions.read_amp_bytes_per_bit, which is actually a
// uint32. Consequently, the value of read_amp_bytes_per_bit written
// in the OPTIONS file is wrong. From 6.15 onward, RocksDB parses
// read_amp_bytes_per_bit from the OPTIONS file as a uint32. To be
// able to load OPTIONS files generated by the affected releases
// before the fix, we manually parse read_amp_bytes_per_bit with this
// special hack.
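// Illustrative (hypothetical) example, assuming a little-endian layout:
// a release that intended read_amp_bytes_per_bit=4 could have persisted
// a value such as 17179869188 (0x400000004, garbage in the high 4
// bytes); parsing as uint64 and truncating to uint32 recovers 4.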
uint64_t read_amp_bytes_per_bit = ParseUint64(value);
*(reinterpret_cast<uint32_t*>(addr)) =
static_cast<uint32_t>(read_amp_bytes_per_bit);
return Status::OK();
}}},
{"enable_index_compression",
{offsetof(struct BlockBasedTableOptions, enable_index_compression),
OptionType::kBoolean, OptionVerificationType::kNormal,
@ -731,8 +747,14 @@ Status GetBlockBasedTableOptionsFromString(
if (!s.ok()) {
return s;
}
s = GetBlockBasedTableOptionsFromMap(config_options, table_options, opts_map,
                                     new_table_options);
// Translate any errors (NotFound, NotSupported, etc.) to InvalidArgument
if (s.ok() || s.IsInvalidArgument()) {
return s;
} else {
return Status::InvalidArgument(s.getState());
}
}
Status GetBlockBasedTableOptionsFromMap(

View File

@ -173,7 +173,7 @@ Status PartitionIndexReader::CacheDependencies(const ReadOptions& ro,
return s;
}
if (block.GetValue() != nullptr) {
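// Keep the partition block for pinning when it is either tracked by the
// block cache or owned directly by this entry (own value).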
if (block.IsCached() || block.GetOwnValue()) {
if (pin) {
partition_map_[handle.offset()] = std::move(block);
}

View File

@ -149,6 +149,11 @@ class IteratorWrapperBase {
return result_.value_prepared;
}
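// Forward user_key() to the wrapped iterator; only valid when the
// iterator itself is valid.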
Slice user_key() const {
assert(Valid());
return iter_->user_key();
}
private:
void Update() {
valid_ = iter_->Valid();

View File

@ -142,8 +142,14 @@ Status GetPlainTableOptionsFromString(const ConfigOptions& config_options,
return s;
}
s = GetPlainTableOptionsFromMap(config_options, table_options, opts_map,
                                new_table_options);
// Translate any errors (NotFound, NotSupported, etc.) to InvalidArgument
if (s.ok() || s.IsInvalidArgument()) {
return s;
} else {
return Status::InvalidArgument(s.getState());
}
}
Status GetMemTableRepFactoryFromString(

View File

@ -43,6 +43,10 @@ class TwoLevelIndexIterator : public InternalIteratorBase<IndexValue> {
assert(Valid());
return second_level_iter_.key();
}
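// Delegate user_key() to the second-level iterator, mirroring key().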
Slice user_key() const override {
assert(Valid());
return second_level_iter_.user_key();
}
IndexValue value() const override {
assert(Valid());
return second_level_iter_.value();

View File

@ -65,7 +65,11 @@ Status GetLatestOptionsFileName(const std::string& dbpath,
uint64_t latest_time_stamp = 0;
std::vector<std::string> file_names;
s = env->GetChildren(dbpath, &file_names);
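// A missing DB directory is reported as NotFound with a kPathNotFound
// subcode so callers can distinguish it from other failures.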
if (s.IsNotFound()) {
return Status::NotFound(Status::kPathNotFound,
"No options files found in the DB directory.",
dbpath);
} else if (!s.ok()) {
return s;
}
for (auto& file_name : file_names) {
@ -79,7 +83,9 @@ Status GetLatestOptionsFileName(const std::string& dbpath,
}
}
if (latest_file_name.size() == 0) {
return Status::NotFound("No options files found in the DB directory.");
return Status::NotFound(Status::kPathNotFound,
"No options files found in the DB directory.",
dbpath);
}
*options_file_name = latest_file_name;
return Status::OK();

View File

@ -11,6 +11,8 @@
#include <cinttypes>
#include <unordered_map>
#include "env/mock_env.h"
#include "file/filename.h"
#include "options/options_parser.h"
#include "rocksdb/convenience.h"
#include "rocksdb/db.h"
@ -31,14 +33,12 @@ namespace ROCKSDB_NAMESPACE {
class OptionsUtilTest : public testing::Test {
public:
OptionsUtilTest() : rnd_(0xFB) {
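// Use an in-memory Env; PersistRocksDBOptions below obtains its
// FileSystem via env_->GetFileSystem(), so no real disk I/O is needed.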
env_.reset(NewMemEnv(Env::Default()));
dbname_ = test::PerThreadDBPath("options_util_test");
}
protected:
std::unique_ptr<Env> env_;
std::string dbname_;
Random rnd_;
};
@ -58,8 +58,8 @@ TEST_F(OptionsUtilTest, SaveAndLoad) {
}
const std::string kFileName = "OPTIONS-123456";
ASSERT_OK(PersistRocksDBOptions(db_opt, cf_names, cf_opts, kFileName,
                                env_->GetFileSystem().get()));
DBOptions loaded_db_opt;
std::vector<ColumnFamilyDescriptor> loaded_cf_descs;
@ -85,6 +85,7 @@ TEST_F(OptionsUtilTest, SaveAndLoad) {
exact, cf_opts[i], loaded_cf_descs[i].options));
}
DestroyDB(dbname_, Options(db_opt, cf_opts[0]));
for (size_t i = 0; i < kCFCount; ++i) {
if (cf_opts[i].compaction_filter) {
delete cf_opts[i].compaction_filter;
@ -121,8 +122,8 @@ TEST_F(OptionsUtilTest, SaveAndLoadWithCacheCheck) {
cf_names.push_back("cf_plain_table_sample");
// Saving DB in file
const std::string kFileName = "OPTIONS-LOAD_CACHE_123456";
ASSERT_OK(PersistRocksDBOptions(db_opt, cf_names, cf_opts, kFileName,
                                env_->GetFileSystem().get()));
DBOptions loaded_db_opt;
std::vector<ColumnFamilyDescriptor> loaded_cf_descs;
@ -154,6 +155,7 @@ TEST_F(OptionsUtilTest, SaveAndLoadWithCacheCheck) {
ASSERT_EQ(loaded_bbt_opt->block_cache.get(), cache.get());
}
}
DestroyDB(dbname_, Options(loaded_db_opt, cf_opts[0]));
}
namespace {
@ -359,8 +361,280 @@ TEST_F(OptionsUtilTest, SanityCheck) {
ASSERT_OK(
CheckOptionsCompatibility(config_options, dbname_, db_opt, cf_descs));
}
DestroyDB(dbname_, Options(db_opt, cf_descs[0].options));
}
TEST_F(OptionsUtilTest, LatestOptionsNotFound) {
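// GetLatestOptionsFileName and LoadLatestOptions should report NotFound
// with a PathNotFound subcode when the DB directory is missing, is empty,
// or contains no OPTIONS file.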
std::unique_ptr<Env> env(NewMemEnv(Env::Default()));
Status s;
Options options;
ConfigOptions config_opts;
std::vector<ColumnFamilyDescriptor> cf_descs;
options.env = env.get();
options.create_if_missing = true;
config_opts.env = options.env;
config_opts.ignore_unknown_options = false;
std::vector<std::string> children;
std::string options_file_name;
DestroyDB(dbname_, options);
// First, test where the db directory does not exist
ASSERT_NOK(options.env->GetChildren(dbname_, &children));
s = GetLatestOptionsFileName(dbname_, options.env, &options_file_name);
ASSERT_TRUE(s.IsNotFound());
ASSERT_TRUE(s.IsPathNotFound());
s = LoadLatestOptions(dbname_, options.env, &options, &cf_descs);
ASSERT_TRUE(s.IsNotFound());
ASSERT_TRUE(s.IsPathNotFound());
s = LoadLatestOptions(config_opts, dbname_, &options, &cf_descs);
ASSERT_TRUE(s.IsPathNotFound());
s = GetLatestOptionsFileName(dbname_, options.env, &options_file_name);
ASSERT_TRUE(s.IsNotFound());
ASSERT_TRUE(s.IsPathNotFound());
// Second, test where the db directory exists but is empty
ASSERT_OK(options.env->CreateDir(dbname_));
s = GetLatestOptionsFileName(dbname_, options.env, &options_file_name);
ASSERT_TRUE(s.IsNotFound());
ASSERT_TRUE(s.IsPathNotFound());
s = LoadLatestOptions(dbname_, options.env, &options, &cf_descs);
ASSERT_TRUE(s.IsNotFound());
ASSERT_TRUE(s.IsPathNotFound());
// Finally, test where a file exists but is not an "Options" file
std::unique_ptr<WritableFile> file;
ASSERT_OK(
options.env->NewWritableFile(dbname_ + "/temp.txt", &file, EnvOptions()));
ASSERT_OK(file->Close());
s = GetLatestOptionsFileName(dbname_, options.env, &options_file_name);
ASSERT_TRUE(s.IsNotFound());
ASSERT_TRUE(s.IsPathNotFound());
s = LoadLatestOptions(config_opts, dbname_, &options, &cf_descs);
ASSERT_TRUE(s.IsNotFound());
ASSERT_TRUE(s.IsPathNotFound());
ASSERT_OK(options.env->DeleteFile(dbname_ + "/temp.txt"));
ASSERT_OK(options.env->DeleteDir(dbname_));
}
TEST_F(OptionsUtilTest, LoadLatestOptions) {
Options options;
options.OptimizeForSmallDb();
ColumnFamilyDescriptor cf_desc;
ConfigOptions config_opts;
DBOptions db_opts;
std::vector<ColumnFamilyDescriptor> cf_descs;
std::vector<ColumnFamilyHandle*> handles;
DB* db;
options.create_if_missing = true;
DestroyDB(dbname_, options);
cf_descs.emplace_back();
cf_descs.back().name = kDefaultColumnFamilyName;
cf_descs.back().options.table_factory.reset(NewBlockBasedTableFactory());
cf_descs.emplace_back();
cf_descs.back().name = "Plain";
cf_descs.back().options.table_factory.reset(NewPlainTableFactory());
db_opts.create_missing_column_families = true;
db_opts.create_if_missing = true;
// open and persist the options
ASSERT_OK(DB::Open(db_opts, dbname_, cf_descs, &handles, &db));
std::string options_file_name;
std::string new_options_file;
ASSERT_OK(GetLatestOptionsFileName(dbname_, options.env, &options_file_name));
ASSERT_OK(LoadLatestOptions(config_opts, dbname_, &db_opts, &cf_descs));
ASSERT_EQ(cf_descs.size(), 2U);
ASSERT_OK(RocksDBOptionsParser::VerifyDBOptions(config_opts,
db->GetDBOptions(), db_opts));
ASSERT_OK(handles[0]->GetDescriptor(&cf_desc));
ASSERT_OK(RocksDBOptionsParser::VerifyCFOptions(config_opts, cf_desc.options,
cf_descs[0].options));
ASSERT_OK(handles[1]->GetDescriptor(&cf_desc));
ASSERT_OK(RocksDBOptionsParser::VerifyCFOptions(config_opts, cf_desc.options,
cf_descs[1].options));
// Now change some of the DBOptions
ASSERT_OK(db->SetDBOptions(
{{"delayed_write_rate", "1234"}, {"bytes_per_sync", "32768"}}));
ASSERT_OK(GetLatestOptionsFileName(dbname_, options.env, &new_options_file));
ASSERT_NE(options_file_name, new_options_file);
ASSERT_OK(LoadLatestOptions(config_opts, dbname_, &db_opts, &cf_descs));
ASSERT_OK(RocksDBOptionsParser::VerifyDBOptions(config_opts,
db->GetDBOptions(), db_opts));
options_file_name = new_options_file;
// Now change some of the ColumnFamilyOptions
ASSERT_OK(db->SetOptions(handles[1], {{"write_buffer_size", "32768"}}));
ASSERT_OK(GetLatestOptionsFileName(dbname_, options.env, &new_options_file));
ASSERT_NE(options_file_name, new_options_file);
ASSERT_OK(LoadLatestOptions(config_opts, dbname_, &db_opts, &cf_descs));
ASSERT_OK(RocksDBOptionsParser::VerifyDBOptions(config_opts,
db->GetDBOptions(), db_opts));
ASSERT_OK(handles[0]->GetDescriptor(&cf_desc));
ASSERT_OK(RocksDBOptionsParser::VerifyCFOptions(config_opts, cf_desc.options,
cf_descs[0].options));
ASSERT_OK(handles[1]->GetDescriptor(&cf_desc));
ASSERT_OK(RocksDBOptionsParser::VerifyCFOptions(config_opts, cf_desc.options,
cf_descs[1].options));
// close the db
for (auto* handle : handles) {
delete handle;
}
delete db;
DestroyDB(dbname_, options, cf_descs);
}
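// Helper that writes a minimal OPTIONS file (with the given rocksdb_version
// and DB/CF/BlockBasedTable option strings) into `path`, then verifies it is
// reported as the latest options file.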
static void WriteOptionsFile(Env* env, const std::string& path,
const std::string& options_file, int major,
int minor, const std::string& db_opts,
const std::string& cf_opts,
const std::string& bbt_opts = "") {
std::string options_file_header =
"\n"
"[Version]\n"
" rocksdb_version=" +
ToString(major) + "." + ToString(minor) +
".0\n"
" options_file_version=1\n";
std::unique_ptr<WritableFile> wf;
ASSERT_OK(env->NewWritableFile(path + "/" + options_file, &wf, EnvOptions()));
ASSERT_OK(
wf->Append(options_file_header + "[ DBOptions ]\n" + db_opts + "\n"));
ASSERT_OK(wf->Append(
"[CFOptions \"default\"] # column family must be specified\n" +
cf_opts + "\n"));
ASSERT_OK(wf->Append("[TableOptions/BlockBasedTable \"default\"]\n" +
bbt_opts + "\n"));
ASSERT_OK(wf->Close());
std::string latest_options_file;
ASSERT_OK(GetLatestOptionsFileName(path, env, &latest_options_file));
ASSERT_EQ(latest_options_file, options_file);
}
TEST_F(OptionsUtilTest, BadLatestOptions) {
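// Expectation: unknown or invalid options written by the current or an
// older release fail to load even with ignore_unknown_options=true; only
// options written by a future release may be ignored.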
Status s;
ConfigOptions config_opts;
DBOptions db_opts;
std::vector<ColumnFamilyDescriptor> cf_descs;
Options options;
options.env = env_.get();
config_opts.env = env_.get();
config_opts.ignore_unknown_options = false;
config_opts.delimiter = "\n";
ConfigOptions ignore_opts = config_opts;
ignore_opts.ignore_unknown_options = true;
std::string options_file_name;
// Test where the db directory exists but is empty
ASSERT_OK(options.env->CreateDir(dbname_));
ASSERT_NOK(
GetLatestOptionsFileName(dbname_, options.env, &options_file_name));
ASSERT_NOK(LoadLatestOptions(config_opts, dbname_, &db_opts, &cf_descs));
// Write an options file for a previous major release with an unknown DB
// Option
WriteOptionsFile(options.env, dbname_, "OPTIONS-0001", ROCKSDB_MAJOR - 1,
ROCKSDB_MINOR, "unknown_db_opt=true", "");
s = LoadLatestOptions(config_opts, dbname_, &db_opts, &cf_descs);
ASSERT_NOK(s);
ASSERT_TRUE(s.IsInvalidArgument());
// Even though ignore_unknown_options=true, we still return an error...
s = LoadLatestOptions(ignore_opts, dbname_, &db_opts, &cf_descs);
ASSERT_NOK(s);
ASSERT_TRUE(s.IsInvalidArgument());
// Write an options file for a previous minor release with an unknown CF
// Option
WriteOptionsFile(options.env, dbname_, "OPTIONS-0002", ROCKSDB_MAJOR,
ROCKSDB_MINOR - 1, "", "unknown_cf_opt=true");
s = LoadLatestOptions(config_opts, dbname_, &db_opts, &cf_descs);
ASSERT_NOK(s);
ASSERT_TRUE(s.IsInvalidArgument());
// Even though ignore_unknown_options=true, we still return an error...
s = LoadLatestOptions(ignore_opts, dbname_, &db_opts, &cf_descs);
ASSERT_NOK(s);
ASSERT_TRUE(s.IsInvalidArgument());
// Write an options file for a previous minor release with an unknown BBT
// Option
WriteOptionsFile(options.env, dbname_, "OPTIONS-0003", ROCKSDB_MAJOR,
ROCKSDB_MINOR - 1, "", "", "unknown_bbt_opt=true");
s = LoadLatestOptions(config_opts, dbname_, &db_opts, &cf_descs);
ASSERT_NOK(s);
ASSERT_TRUE(s.IsInvalidArgument());
// Even though ignore_unknown_options=true, we still return an error...
s = LoadLatestOptions(ignore_opts, dbname_, &db_opts, &cf_descs);
ASSERT_NOK(s);
ASSERT_TRUE(s.IsInvalidArgument());
// Write an options file for the current release with an unknown DB Option
WriteOptionsFile(options.env, dbname_, "OPTIONS-0004", ROCKSDB_MAJOR,
ROCKSDB_MINOR, "unknown_db_opt=true", "");
s = LoadLatestOptions(config_opts, dbname_, &db_opts, &cf_descs);
ASSERT_NOK(s);
ASSERT_TRUE(s.IsInvalidArgument());
// Even though ignore_unknown_options=true, we still return an error...
s = LoadLatestOptions(ignore_opts, dbname_, &db_opts, &cf_descs);
ASSERT_NOK(s);
ASSERT_TRUE(s.IsInvalidArgument());
// Write an options file for the current release with an unknown CF Option
WriteOptionsFile(options.env, dbname_, "OPTIONS-0005", ROCKSDB_MAJOR,
ROCKSDB_MINOR, "", "unknown_cf_opt=true");
s = LoadLatestOptions(config_opts, dbname_, &db_opts, &cf_descs);
ASSERT_NOK(s);
ASSERT_TRUE(s.IsInvalidArgument());
// Even though ignore_unknown_options=true, we still return an error...
s = LoadLatestOptions(ignore_opts, dbname_, &db_opts, &cf_descs);
ASSERT_NOK(s);
ASSERT_TRUE(s.IsInvalidArgument());
// Write an options file for the current release with an invalid DB Option
WriteOptionsFile(options.env, dbname_, "OPTIONS-0006", ROCKSDB_MAJOR,
ROCKSDB_MINOR, "create_if_missing=hello", "");
s = LoadLatestOptions(config_opts, dbname_, &db_opts, &cf_descs);
ASSERT_NOK(s);
ASSERT_TRUE(s.IsInvalidArgument());
// Even though ignore_unknown_options=true, we still return an error...
s = LoadLatestOptions(ignore_opts, dbname_, &db_opts, &cf_descs);
ASSERT_NOK(s);
ASSERT_TRUE(s.IsInvalidArgument());
// Write an options file for the next release with an invalid DB Option
WriteOptionsFile(options.env, dbname_, "OPTIONS-0007", ROCKSDB_MAJOR,
ROCKSDB_MINOR + 1, "create_if_missing=hello", "");
ASSERT_NOK(LoadLatestOptions(config_opts, dbname_, &db_opts, &cf_descs));
ASSERT_OK(LoadLatestOptions(ignore_opts, dbname_, &db_opts, &cf_descs));
// Write an options file for the next release with an unknown DB Option
WriteOptionsFile(options.env, dbname_, "OPTIONS-0008", ROCKSDB_MAJOR,
ROCKSDB_MINOR + 1, "unknown_db_opt=true", "");
ASSERT_NOK(LoadLatestOptions(config_opts, dbname_, &db_opts, &cf_descs));
// Ignore the errors for future releases when ignore_unknown_options=true
ASSERT_OK(LoadLatestOptions(ignore_opts, dbname_, &db_opts, &cf_descs));
// Write an options file for the next major release with an unknown CF Option
WriteOptionsFile(options.env, dbname_, "OPTIONS-0009", ROCKSDB_MAJOR + 1,
ROCKSDB_MINOR, "", "unknown_cf_opt=true");
ASSERT_NOK(LoadLatestOptions(config_opts, dbname_, &db_opts, &cf_descs));
// Ignore the errors for future releases when ignore_unknown_options=true
ASSERT_OK(LoadLatestOptions(ignore_opts, dbname_, &db_opts, &cf_descs));
}
} // namespace ROCKSDB_NAMESPACE
int main(int argc, char** argv) {