Compare commits

...

8 Commits

Author SHA1 Message Date
sdong
5997edef44 Add 6.24.2 2021-09-17 14:16:18 -07:00
anand76
1ab38a19df More robust checking of IO uring completion data (#8894)
Summary:
Potential bugs in the IO uring implementation can cause bad data to be returned in the completion queue. Add some checks in the PosixRandomAccessFile::MultiRead completion handling code to catch such errors and fail the entire MultiRead. Also log some diagnostic messages and stack trace.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/8894

Reviewed By: siying, pdillinger

Differential Revision: D30826982

Pulled By: anand1976

fbshipit-source-id: af91815ac760e095d6cc0466cf8bd5c10167fd15
2021-09-16 18:16:06 -07:00
anand76
9b2b87ec15 Update patch version to 6.24.1 2021-09-02 13:28:09 -07:00
anand76
1fe5ac3468 Fix a race in LRUCacheShard::Promote (#8717)
Summary:
In ```LRUCacheShard::Promote```, a reference is released outside the LRU mutex. Fix the race condition.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/8717

Reviewed By: zhichao-cao

Differential Revision: D30649206

Pulled By: anand1976

fbshipit-source-id: 09c0af05b2294a7fe2c02876a61b0bad6e3ada61
2021-08-31 18:02:15 -07:00
mrambacher
3efc6fd641 Make Configurable/Customizable options copyable (#8704)
Summary:
The atomic variable "is_prepared_" was keeping Configurable objects from being copy-constructed.  Removed the atomic to allow copies.

Since the variable is only changed from false to true (and never back), there is no reason it had to be atomic.

Added tests that simple Configurable and Customizable objects can be put on the stack and copied.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/8704

Reviewed By: anand1976

Differential Revision: D30530526

Pulled By: ltamasi

fbshipit-source-id: 4dd4439b3e5ad7fa396573d0b25d9fb709160576
2021-08-25 17:51:50 -07:00
Peter Dillinger
3a3f4801d0 Allow intentionally swallowed errors in BlockBasedFilterBlockReader (#8695)
Summary:
To avoid getting "Didn't get expected error from Get" from
crash test by enabling block-based filter in crash test in https://github.com/facebook/rocksdb/issues/8679.
Basically, this applies the pattern of IGNORE_STATUS_IF_ERROR in
full_filter_block.cc to block_based_filter_block.cc

Pull Request resolved: https://github.com/facebook/rocksdb/pull/8695

Test Plan: watch for resolution of crash test runs

Reviewed By: ltamasi

Differential Revision: D30496748

Pulled By: pdillinger

fbshipit-source-id: f7808fcf14c0e787fe81da03fa8303244590d273
2021-08-23 16:12:50 -07:00
Peter Dillinger
87b20cbba0 Fix typo in 6.24.0 HISTORY.md (#8694)
Summary:
fix typo

Also, clarified change of C API signatures.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/8694

Test Plan: visual

Reviewed By: ltamasi

Differential Revision: D30492882

Pulled By: pdillinger

fbshipit-source-id: ac6dc3dcefa01c91fd87fc7f50279ea5e13fa41d
2021-08-23 13:46:56 -07:00
mrambacher
c2c92f3013 Fix LITE build (#8689)
Summary:
Conditional compilation of static functions not used in LITE mode.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/8689

Reviewed By: ltamasi

Differential Revision: D30476218

Pulled By: mrambacher

fbshipit-source-id: 5f3af90982d34818f47d2cb1d36dd5816d0333a5
2021-08-23 09:49:34 -07:00
11 changed files with 120 additions and 50 deletions

View File

@ -1,4 +1,12 @@
# Rocksdb Change Log
## 6.24.2 (2021-09-16)
### Bug Fixes
* Add checks for validity of the IO uring completion queue entries, and fail the BlockBasedTableReader MultiGet sub-batch if there's an invalid completion
## 6.24.1 (2021-08-31)
### Bug Fixes
* Fix a race in item ref counting in LRUCache when promoting an item from the SecondaryCache.
## 6.24.0 (2021-08-20)
### Bug Fixes
* If the primary's CURRENT file is missing or inaccessible, the secondary instance should not hang repeatedly trying to switch to a new MANIFEST. It should instead return the error code encountered while accessing the file.
@ -25,10 +33,10 @@
* The integrated BlobDB implementation now supports the tickers `BLOB_DB_BLOB_FILE_BYTES_READ`, `BLOB_DB_GC_NUM_KEYS_RELOCATED`, and `BLOB_DB_GC_BYTES_RELOCATED`, as well as the histograms `BLOB_DB_COMPRESSION_MICROS` and `BLOB_DB_DECOMPRESSION_MICROS`.
* Added hybrid configuration of Ribbon filter and Bloom filter where some LSM levels use Ribbon for memory space efficiency and some use Bloom for speed. See NewRibbonFilterPolicy. This also changes the default behavior of NewRibbonFilterPolicy to use Bloom for flushes under Leveled and Universal compaction and Ribbon otherwise. The C API function `rocksdb_filterpolicy_create_ribbon` is unchanged but adds new `rocksdb_filterpolicy_create_ribbon_hybrid`.
## Public API change
### Public API change
* Added APIs to decode and replay trace file via Replayer class. Added `DB::NewDefaultReplayer()` to create a default Replayer instance. Added `TraceReader::Reset()` to restart reading a trace file. Created trace_record.h, trace_record_result.h and utilities/replayer.h files to access the decoded Trace records, replay them, and query the actual operation results.
* Added Configurable::GetOptionsMap to the public API for use in creating new Customizable classes.
* Generalized bits_per_key parameters in C API from int to double for greater configurability.
* Generalized bits_per_key parameters in C API from int to double for greater configurability. Although this is a compatible change for existing C source code, anything depending on C API signatures, such as foreign function interfaces, will need to be updated.
### Performance Improvements
* Try to avoid updating DBOptions if `SetDBOptions()` does not change any option value.

9
cache/lru_cache.cc vendored
View File

@ -360,7 +360,10 @@ Status LRUCacheShard::InsertItem(LRUHandle* e, Cache::Handle** handle,
if (handle == nullptr) {
LRU_Insert(e);
} else {
// If caller already holds a ref, no need to take one here
if (!e->HasRefs()) {
e->Ref();
}
*handle = reinterpret_cast<Cache::Handle*>(e);
}
}
@ -398,11 +401,7 @@ void LRUCacheShard::Promote(LRUHandle* e) {
if (e->value) {
Cache::Handle* handle = reinterpret_cast<Cache::Handle*>(e);
Status s = InsertItem(e, &handle, /*free_handle_on_fail=*/false);
if (s.ok()) {
// InsertItem would have taken a reference on the item, so decrement it
// here as we expect the caller to already hold a reference
e->Unref();
} else {
if (!s.ok()) {
// Item is in memory, but not accounted against the cache capacity.
// When the handle is released, the item should get deleted
assert(!e->InCache());

20
env/io_posix.cc vendored
View File

@ -31,6 +31,7 @@
#endif
#include "monitoring/iostats_context_imp.h"
#include "port/port.h"
#include "port/stack_trace.h"
#include "rocksdb/slice.h"
#include "test_util/sync_point.h"
#include "util/autovector.h"
@ -644,6 +645,7 @@ IOStatus PosixRandomAccessFile::MultiRead(FSReadRequest* reqs,
autovector<WrappedReadRequest, 32> req_wraps;
autovector<WrappedReadRequest*, 4> incomplete_rq_list;
std::unordered_set<WrappedReadRequest*> wrap_cache;
for (size_t i = 0; i < num_reqs; i++) {
req_wraps.emplace_back(&reqs[i]);
@ -676,6 +678,7 @@ IOStatus PosixRandomAccessFile::MultiRead(FSReadRequest* reqs,
sqe, fd_, &rep_to_submit->iov, 1,
rep_to_submit->req->offset + rep_to_submit->finished_len);
io_uring_sqe_set_data(sqe, rep_to_submit);
wrap_cache.emplace(rep_to_submit);
}
incomplete_rq_list.clear();
@ -724,6 +727,22 @@ IOStatus PosixRandomAccessFile::MultiRead(FSReadRequest* reqs,
}
req_wrap = static_cast<WrappedReadRequest*>(io_uring_cqe_get_data(cqe));
// Reset cqe data to catch any stray reuse of it
static_cast<struct io_uring_cqe*>(cqe)->user_data = 0xd5d5d5d5d5d5d5d5;
// Check that we got a valid unique cqe data
auto wrap_check = wrap_cache.find(req_wrap);
if (wrap_check == wrap_cache.end()) {
fprintf(stderr,
"PosixRandomAccessFile::MultiRead: "
"Bad cqe data from IO uring - %p\n",
req_wrap);
port::PrintStack();
ios = IOStatus::IOError("io_uring_cqe_get_data() returned " +
ToString((uint64_t)req_wrap));
continue;
}
wrap_cache.erase(wrap_check);
FSReadRequest* req = req_wrap->req;
if (cqe->res < 0) {
req->result = Slice(req->scratch, 0);
@ -769,6 +788,7 @@ IOStatus PosixRandomAccessFile::MultiRead(FSReadRequest* reqs,
}
io_uring_cqe_seen(iu, cqe);
}
wrap_cache.clear();
}
return ios;
#else

View File

@ -56,7 +56,6 @@ class Configurable {
};
public:
Configurable();
virtual ~Configurable() {}
// Returns the raw pointer of the named options that is used by this
@ -262,10 +261,6 @@ class Configurable {
virtual Status ValidateOptions(const DBOptions& db_opts,
const ColumnFamilyOptions& cf_opts) const;
// Returns true if this object has been initialized via PrepareOptions, false
// otherwise. Once an object has been prepared, only mutable options may be
// changed.
virtual bool IsPrepared() const { return is_prepared_; }
// Splits the input opt_value into the ID field and the remaining options.
// The input opt_value can be in the form of "name" or "name=value
@ -286,10 +281,6 @@ class Configurable {
std::string* id, std::unordered_map<std::string, std::string>* options);
protected:
// True once the object is prepared. Once the object is prepared, only
// mutable options can be configured.
std::atomic<bool> is_prepared_;
// Returns the raw pointer for the associated named option.
// The name is typically the name of an option registered via the
// Classes may override this method to provide further specialization (such as

View File

@ -11,7 +11,7 @@
#define ROCKSDB_MAJOR 6
#define ROCKSDB_MINOR 24
#define ROCKSDB_PATCH 0
#define ROCKSDB_PATCH 2
// Do not use these. We made the mistake of declaring macros starting with
// double underscore. Now we have to live with our choice. We'll deprecate these

View File

@ -17,8 +17,6 @@
namespace ROCKSDB_NAMESPACE {
Configurable::Configurable() : is_prepared_(false) {}
void Configurable::RegisterOptions(
const std::string& name, void* opt_ptr,
const std::unordered_map<std::string, OptionTypeInfo>* type_map) {
@ -68,9 +66,6 @@ Status Configurable::PrepareOptions(const ConfigOptions& opts) {
#else
(void)opts;
#endif // ROCKSDB_LITE
if (status.ok()) {
is_prepared_ = true;
}
return status;
}

View File

@ -331,6 +331,40 @@ TEST_F(ConfigurableTest, PrepareOptionsTest) {
ASSERT_EQ(*up, 0);
}
TEST_F(ConfigurableTest, CopyObjectTest) {
class CopyConfigurable : public Configurable {
public:
CopyConfigurable() : prepared_(0), validated_(0) {}
Status PrepareOptions(const ConfigOptions& options) override {
prepared_++;
return Configurable::PrepareOptions(options);
}
Status ValidateOptions(const DBOptions& db_opts,
const ColumnFamilyOptions& cf_opts) const override {
validated_++;
return Configurable::ValidateOptions(db_opts, cf_opts);
}
int prepared_;
mutable int validated_;
};
CopyConfigurable c1;
ConfigOptions config_options;
Options options;
ASSERT_OK(c1.PrepareOptions(config_options));
ASSERT_OK(c1.ValidateOptions(options, options));
ASSERT_EQ(c1.prepared_, 1);
ASSERT_EQ(c1.validated_, 1);
CopyConfigurable c2 = c1;
ASSERT_OK(c1.PrepareOptions(config_options));
ASSERT_OK(c1.ValidateOptions(options, options));
ASSERT_EQ(c2.prepared_, 1);
ASSERT_EQ(c2.validated_, 1);
ASSERT_EQ(c1.prepared_, 2);
ASSERT_EQ(c1.validated_, 2);
}
TEST_F(ConfigurableTest, MutableOptionsTest) {
static std::unordered_map<std::string, OptionTypeInfo> imm_option_info = {
#ifndef ROCKSDB_LITE

View File

@ -66,6 +66,7 @@ class TestCustomizable : public Customizable {
const char* Name() const override { return name_.c_str(); }
static const char* Type() { return "test.custom"; }
#ifndef ROCKSDB_LITE
static Status CreateFromString(const ConfigOptions& opts,
const std::string& value,
std::unique_ptr<TestCustomizable>* result);
@ -75,6 +76,7 @@ class TestCustomizable : public Customizable {
static Status CreateFromString(const ConfigOptions& opts,
const std::string& value,
TestCustomizable** result);
#endif // ROCKSDB_LITE
bool IsInstanceOf(const std::string& name) const override {
if (name == kClassName()) {
return true;
@ -146,6 +148,7 @@ class BCustomizable : public TestCustomizable {
BOptions opts_;
};
#ifndef ROCKSDB_LITE
static bool LoadSharedB(const std::string& id,
std::shared_ptr<TestCustomizable>* result) {
if (id == "B") {
@ -159,7 +162,6 @@ static bool LoadSharedB(const std::string& id,
}
}
#ifndef ROCKSDB_LITE
static int A_count = 0;
static int RegisterCustomTestObjects(ObjectLibrary& library,
const std::string& /*arg*/) {
@ -241,6 +243,7 @@ static void GetMapFromProperties(
#endif // ROCKSDB_LITE
} // namespace
#ifndef ROCKSDB_LITE
Status TestCustomizable::CreateFromString(
const ConfigOptions& config_options, const std::string& value,
std::shared_ptr<TestCustomizable>* result) {
@ -285,6 +288,7 @@ Status TestCustomizable::CreateFromString(const ConfigOptions& config_options,
},
result);
}
#endif // ROCKSDB_LITE
class CustomizableTest : public testing::Test {
public:
@ -549,7 +553,6 @@ TEST_F(CustomizableTest, PrepareOptionsTest) {
ConfigOptions prepared(config_options_);
prepared.invoke_prepare_options = true;
ASSERT_FALSE(base->IsPrepared());
ASSERT_OK(base->ConfigureFromString(
prepared, "unique=A_1; shared={id=B;string=s}; pointer.id=S"));
SimpleOptions* simple = base->GetOptions<SimpleOptions>();
@ -557,10 +560,6 @@ TEST_F(CustomizableTest, PrepareOptionsTest) {
ASSERT_NE(simple->cu, nullptr);
ASSERT_NE(simple->cs, nullptr);
ASSERT_NE(simple->cp, nullptr);
ASSERT_TRUE(base->IsPrepared());
ASSERT_TRUE(simple->cu->IsPrepared());
ASSERT_TRUE(simple->cs->IsPrepared());
ASSERT_TRUE(simple->cp->IsPrepared());
delete simple->cp;
base.reset(new SimpleConfigurable());
ASSERT_OK(base->ConfigureFromString(
@ -571,16 +570,8 @@ TEST_F(CustomizableTest, PrepareOptionsTest) {
ASSERT_NE(simple->cu, nullptr);
ASSERT_NE(simple->cs, nullptr);
ASSERT_NE(simple->cp, nullptr);
ASSERT_FALSE(base->IsPrepared());
ASSERT_FALSE(simple->cu->IsPrepared());
ASSERT_FALSE(simple->cs->IsPrepared());
ASSERT_FALSE(simple->cp->IsPrepared());
ASSERT_OK(base->PrepareOptions(config_options_));
ASSERT_TRUE(base->IsPrepared());
ASSERT_TRUE(simple->cu->IsPrepared());
ASSERT_TRUE(simple->cs->IsPrepared());
ASSERT_TRUE(simple->cp->IsPrepared());
delete simple->cp;
base.reset(new SimpleConfigurable());
simple = base->GetOptions<SimpleOptions>();
@ -593,21 +584,16 @@ TEST_F(CustomizableTest, PrepareOptionsTest) {
ASSERT_OK(
base->ConfigureFromString(prepared, "unique={id=P; can_prepare=true}"));
ASSERT_NE(simple->cu, nullptr);
ASSERT_TRUE(simple->cu->IsPrepared());
ASSERT_OK(base->ConfigureFromString(config_options_,
"unique={id=P; can_prepare=true}"));
ASSERT_NE(simple->cu, nullptr);
ASSERT_FALSE(simple->cu->IsPrepared());
ASSERT_OK(simple->cu->PrepareOptions(prepared));
ASSERT_TRUE(simple->cu->IsPrepared());
ASSERT_OK(base->ConfigureFromString(config_options_,
"unique={id=P; can_prepare=false}"));
ASSERT_NE(simple->cu, nullptr);
ASSERT_FALSE(simple->cu->IsPrepared());
ASSERT_NOK(simple->cu->PrepareOptions(prepared));
ASSERT_FALSE(simple->cu->IsPrepared());
}
namespace {
@ -684,6 +670,42 @@ TEST_F(CustomizableTest, WrappedInnerTest) {
ASSERT_EQ(wc2->CheckedCast<TestCustomizable>(), ac.get());
}
TEST_F(CustomizableTest, CopyObjectTest) {
class CopyCustomizable : public Customizable {
public:
CopyCustomizable() : prepared_(0), validated_(0) {}
const char* Name() const override { return "CopyCustomizable"; }
Status PrepareOptions(const ConfigOptions& options) override {
prepared_++;
return Customizable::PrepareOptions(options);
}
Status ValidateOptions(const DBOptions& db_opts,
const ColumnFamilyOptions& cf_opts) const override {
validated_++;
return Customizable::ValidateOptions(db_opts, cf_opts);
}
int prepared_;
mutable int validated_;
};
CopyCustomizable c1;
ConfigOptions config_options;
Options options;
ASSERT_OK(c1.PrepareOptions(config_options));
ASSERT_OK(c1.ValidateOptions(options, options));
ASSERT_EQ(c1.prepared_, 1);
ASSERT_EQ(c1.validated_, 1);
CopyCustomizable c2 = c1;
ASSERT_OK(c1.PrepareOptions(config_options));
ASSERT_OK(c1.ValidateOptions(options, options));
ASSERT_EQ(c2.prepared_, 1);
ASSERT_EQ(c2.validated_, 1);
ASSERT_EQ(c1.prepared_, 2);
ASSERT_EQ(c1.validated_, 2);
}
TEST_F(CustomizableTest, TestStringDepth) {
class ShallowCustomizable : public Customizable {
public:
@ -979,7 +1001,6 @@ TEST_F(CustomizableTest, MutableOptionsTest) {
std::string opt_str;
ConfigOptions options = config_options_;
ASSERT_FALSE(mc.IsPrepared());
ASSERT_OK(mc.ConfigureOption(options, "mutable", "{id=B;}"));
options.mutable_options_only = true;
ASSERT_OK(mc.GetOptionString(options, &opt_str));

View File

@ -1482,13 +1482,11 @@ TEST_F(OptionsTest, MutableTableOptions) {
bbtf.reset(NewBlockBasedTableFactory());
auto bbto = bbtf->GetOptions<BlockBasedTableOptions>();
ASSERT_NE(bbto, nullptr);
ASSERT_FALSE(bbtf->IsPrepared());
ASSERT_OK(bbtf->ConfigureOption(config_options, "block_align", "true"));
ASSERT_OK(bbtf->ConfigureOption(config_options, "block_size", "1024"));
ASSERT_EQ(bbto->block_align, true);
ASSERT_EQ(bbto->block_size, 1024);
ASSERT_OK(bbtf->PrepareOptions(config_options));
ASSERT_TRUE(bbtf->IsPrepared());
config_options.mutable_options_only = true;
ASSERT_OK(bbtf->ConfigureOption(config_options, "block_size", "1024"));
ASSERT_EQ(bbto->block_align, true);

View File

@ -257,6 +257,7 @@ bool BlockBasedFilterBlockReader::MayMatch(
const Status s =
GetOrReadFilterBlock(no_io, get_context, lookup_context, &filter_block);
if (!s.ok()) {
IGNORE_STATUS_IF_ERROR(s);
return true;
}
@ -315,6 +316,7 @@ std::string BlockBasedFilterBlockReader::ToString() const {
GetOrReadFilterBlock(false /* no_io */, nullptr /* get_context */,
nullptr /* lookup_context */, &filter_block);
if (!s.ok()) {
IGNORE_STATUS_IF_ERROR(s);
return std::string("Unable to retrieve filter block");
}

View File

@ -1728,14 +1728,16 @@ void BlockBasedTable::RetrieveMultipleBlocks(
{
IOOptions opts;
IOStatus s = file->PrepareIOOptions(options, opts);
if (s.IsTimedOut()) {
if (s.ok()) {
s = file->MultiRead(opts, &read_reqs[0], read_reqs.size(),
&direct_io_buf);
}
if (!s.ok()) {
// Discard all the results in this batch if there is any time out
// or overall MultiRead error
for (FSReadRequest& req : read_reqs) {
req.status = s;
}
} else {
// How to handle this status code?
file->MultiRead(opts, &read_reqs[0], read_reqs.size(), &direct_io_buf)
.PermitUncheckedError();
}
}