Compare commits
4 commits: main...release_fi

Author | SHA1 | Date
---|---|---
 | c464412cf8 |
 | 4411488585 |
 | 8cc712c0eb |
 | 62b71b3094 |
HISTORY.md (14 changed lines)
@@ -1,8 +1,5 @@
 # Rocksdb Change Log
-## 7.1.0 (03/21/2022)
+## 7.1.0 (03/23/2022)
-### Public API changes
-* Add DB::OpenAndTrimHistory API. This API will open DB and trim data to the timestamp specified by trim_ts (The data with timestamp larger than specified trim bound will be removed). This API should only be used at a timestamp-enabled column families recovery. If the column family doesn't have timestamp enabled, this API won't trim any data on that column family. This API is not compatible with avoid_flush_during_recovery option.
-
 ### New Features
 * Allow WriteBatchWithIndex to index a WriteBatch that includes keys with user-defined timestamps. The index itself does not have timestamp.
 * Add support for user-defined timestamps to write-committed transaction without API change. The `TransactionDB` layer APIs do not allow timestamps because we require that all user-defined-timestamps-aware operations go through the `Transaction` APIs.

@@ -13,10 +10,12 @@
 * Experimental support for async_io in ReadOptions which is used by FilePrefetchBuffer to prefetch some of the data asynchronously, if reads are sequential and auto readahead is enabled by rocksdb internally.

 ### Bug Fixes
+* Fixed a major performance bug in which Bloom filters generated by pre-7.0 releases are not read by early 7.0.x releases (and vice-versa) due to changes to FilterPolicy::Name() in #9590. This can severely impact read performance and read I/O on upgrade or downgrade with existing DB, but not data correctness.
 * Fixed a data race on `versions_` between `DBImpl::ResumeImpl()` and threads waiting for recovery to complete (#9496)
 * Fixed a bug caused by race among flush, incoming writes and taking snapshots. Queries to snapshots created with these race condition can return incorrect result, e.g. resurfacing deleted data.
 * Fixed a bug that DB flush uses `options.compression` even `options.compression_per_level` is set.
 * Fixed a bug that DisableManualCompaction may assert when disable an unscheduled manual compaction.
+* Fix a race condition when cancel manual compaction with `DisableManualCompaction`. Also DB close can cancel the manual compaction thread.
 * Fixed a potential timer crash when open close DB concurrently.
 * Fixed a race condition for `alive_log_files_` in non-two-write-queues mode. The race is between the write_thread_ in WriteToWAL() and another thread executing `FindObsoleteFiles()`. The race condition will be caught if `__glibcxx_requires_nonempty` is enabled.
 * Fixed a bug that `Iterator::Refresh()` reads stale keys after DeleteRange() performed.

@@ -25,12 +24,11 @@
 * Fixed a race condition when mmaping a WritableFile on POSIX.

 ### Public API changes
-* Remove BlockBasedTableOptions.hash_index_allow_collision which already takes no effect.
+* Added pure virtual FilterPolicy::CompatibilityName(), which is needed for fixing major performance bug involving FilterPolicy naming in SST metadata without affecting Customizable aspect of FilterPolicy. This change only affects those with their own custom or wrapper FilterPolicy classes.
 * `options.compression_per_level` is dynamically changeable with `SetOptions()`.
 * Added `WriteOptions::rate_limiter_priority`. When set to something other than `Env::IO_TOTAL`, the internal rate limiter (`DBOptions::rate_limiter`) will be charged at the specified priority for writes associated with the API to which the `WriteOptions` was provided. Currently the support covers automatic WAL flushes, which happen during live updates (`Put()`, `Write()`, `Delete()`, etc.) when `WriteOptions::disableWAL == false` and `DBOptions::manual_wal_flush == false`.
+* Add DB::OpenAndTrimHistory API. This API will open DB and trim data to the timestamp specified by trim_ts (The data with timestamp larger than specified trim bound will be removed). This API should only be used at a timestamp-enabled column families recovery. If the column family doesn't have timestamp enabled, this API won't trim any data on that column family. This API is not compatible with avoid_flush_during_recovery option.
-### Bug Fixes
+* Remove BlockBasedTableOptions.hash_index_allow_collision which already takes no effect.
-* Fix a race condition when cancel manual compaction with `DisableManualCompaction`. Also DB close can cancel the manual compaction thread.

 ## 7.0.0 (02/20/2022)
 ### Bug Fixes
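One changelog entry above that benefits from a concrete illustration is `WriteOptions::rate_limiter_priority`. The sketch below shows the intended usage under the stated conditions (`disableWAL == false`, default `manual_wal_flush == false`); the database path and the 16 MB/s rate are placeholder assumptions, while `NewGenericRateLimiter`, `Env::IO_USER`, and the option fields are from the RocksDB public headers.

```cpp
// Minimal sketch (assumed path and rate): charge automatic WAL flushes to the
// internal rate limiter via the new WriteOptions::rate_limiter_priority.
#include <cassert>

#include "rocksdb/db.h"
#include "rocksdb/env.h"
#include "rocksdb/rate_limiter.h"

int main() {
  rocksdb::Options options;
  options.create_if_missing = true;
  // DBOptions::rate_limiter must be set for the priority to have any effect.
  options.rate_limiter.reset(
      rocksdb::NewGenericRateLimiter(16 * 1024 * 1024 /* bytes per second */));

  rocksdb::DB* db = nullptr;
  rocksdb::Status s = rocksdb::DB::Open(options, "/tmp/rate_limited_db", &db);
  assert(s.ok());

  rocksdb::WriteOptions wo;
  // Default WriteOptions keep disableWAL == false, and the default
  // DBOptions::manual_wal_flush == false, so the WAL flush triggered by this
  // Put() is charged at IO_USER priority instead of bypassing the limiter.
  wo.rate_limiter_priority = rocksdb::Env::IO_USER;
  s = db->Put(wo, "key", "value");
  assert(s.ok());

  delete db;
  return 0;
}
```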
db/c.cc (6 changed lines)
@@ -3752,6 +3752,9 @@ rocksdb_filterpolicy_t* rocksdb_filterpolicy_create_bloom_format(
   const FilterPolicy* rep_;
   ~Wrapper() override { delete rep_; }
   const char* Name() const override { return rep_->Name(); }
+  const char* CompatibilityName() const override {
+    return rep_->CompatibilityName();
+  }
   // No need to override GetFilterBitsBuilder if this one is overridden
   ROCKSDB_NAMESPACE::FilterBitsBuilder* GetBuilderWithContext(
       const ROCKSDB_NAMESPACE::FilterBuildingContext& context)

@@ -3789,6 +3792,9 @@ rocksdb_filterpolicy_t* rocksdb_filterpolicy_create_ribbon_format(
   const FilterPolicy* rep_;
   ~Wrapper() override { delete rep_; }
   const char* Name() const override { return rep_->Name(); }
+  const char* CompatibilityName() const override {
+    return rep_->CompatibilityName();
+  }
   ROCKSDB_NAMESPACE::FilterBitsBuilder* GetBuilderWithContext(
       const ROCKSDB_NAMESPACE::FilterBuildingContext& context)
       const override {
@@ -1638,9 +1638,15 @@ class LevelAndStyleCustomFilterPolicy : public FilterPolicy {
         policy_l0_other_(NewBloomFilterPolicy(bpk_l0_other)),
         policy_otherwise_(NewBloomFilterPolicy(bpk_otherwise)) {}

+  const char* Name() const override {
+    return "LevelAndStyleCustomFilterPolicy";
+  }
+
   // OK to use built-in policy name because we are deferring to a
   // built-in builder. We aren't changing the serialized format.
-  const char* Name() const override { return policy_fifo_->Name(); }
+  const char* CompatibilityName() const override {
+    return policy_fifo_->CompatibilityName();
+  }

   FilterBitsBuilder* GetBuilderWithContext(
       const FilterBuildingContext& context) const override {
@@ -66,6 +66,17 @@ void FilePrefetchBuffer::CalculateOffsetAndLen(size_t alignment,
     // chunk_len is greater than 0.
     bufs_[index].buffer_.RefitTail(static_cast<size_t>(chunk_offset_in_buffer),
                                    static_cast<size_t>(chunk_len));
+  } else if (chunk_len > 0) {
+    // For async prefetching, it doesn't call RefitTail with chunk_len > 0.
+    // Allocate new buffer if needed because aligned buffer calculate remaining
+    // buffer as capacity_ - cursize_ which might not be the case in this as we
+    // are not refitting.
+    // TODO akanksha: Update the condition when asynchronous prefetching is
+    // stable.
+    bufs_[index].buffer_.Alignment(alignment);
+    bufs_[index].buffer_.AllocateNewBuffer(
+        static_cast<size_t>(roundup_len), copy_data_to_new_buffer,
+        chunk_offset_in_buffer, static_cast<size_t>(chunk_len));
   }
 }

@@ -236,34 +247,47 @@ Status FilePrefetchBuffer::PrefetchAsync(const IOOptions& opts,
   // Index of second buffer.
   uint32_t second = curr_ ^ 1;

-  // If data is in second buffer, make it curr_. Second buffer can be either
-  // partial filled or full.
-  if (bufs_[second].buffer_.CurrentSize() > 0 &&
-      offset >= bufs_[second].offset_ &&
-      offset <= bufs_[second].offset_ + bufs_[second].buffer_.CurrentSize()) {
-    // Clear the curr_ as buffers have been swapped and curr_ contains the
-    // outdated data.
-    bufs_[curr_].buffer_.Clear();
-    // Switch the buffers.
-    curr_ = curr_ ^ 1;
-    second = curr_ ^ 1;
-  }
-
-  // If second buffer contains outdated data, clear it for async prefetching.
-  // Outdated can be because previous sequential reads were read from the cache
-  // instead of this buffer.
-  if (bufs_[second].buffer_.CurrentSize() > 0 &&
-      offset >= bufs_[second].offset_ + bufs_[second].buffer_.CurrentSize()) {
-    bufs_[second].buffer_.Clear();
-  }
+  // First clear the buffers if it contains outdated data. Outdated data can be
+  // because previous sequential reads were read from the cache instead of these
+  // buffer.
+  {
+    if (bufs_[curr_].buffer_.CurrentSize() > 0 &&
+        offset >= bufs_[curr_].offset_ + bufs_[curr_].buffer_.CurrentSize()) {
+      bufs_[curr_].buffer_.Clear();
+    }
+    if (bufs_[second].buffer_.CurrentSize() > 0 &&
+        offset >= bufs_[second].offset_ + bufs_[second].buffer_.CurrentSize()) {
+      bufs_[second].buffer_.Clear();
+    }
+  }
+
+  // If data is in second buffer, make it curr_. Second buffer can be either
+  // partial filled or full.
+  if (bufs_[second].buffer_.CurrentSize() > 0 &&
+      offset >= bufs_[second].offset_ &&
+      offset < bufs_[second].offset_ + bufs_[second].buffer_.CurrentSize()) {
+    // Clear the curr_ as buffers have been swapped and curr_ contains the
+    // outdated data and switch the buffers.
+    bufs_[curr_].buffer_.Clear();
+    curr_ = curr_ ^ 1;
+    second = curr_ ^ 1;
+  }
+  // After swap check if all the requested bytes are in curr_, it will go for
+  // async prefetching only.
+  if (bufs_[curr_].buffer_.CurrentSize() > 0 &&
+      offset + length <=
+          bufs_[curr_].offset_ + bufs_[curr_].buffer_.CurrentSize()) {
+    offset += length;
+    length = 0;
+    prefetch_size -= length;
+  }
   // Data is overlapping i.e. some of the data is in curr_ buffer and remaining
   // in second buffer.
   if (bufs_[curr_].buffer_.CurrentSize() > 0 &&
       bufs_[second].buffer_.CurrentSize() > 0 &&
       offset >= bufs_[curr_].offset_ &&
       offset < bufs_[curr_].offset_ + bufs_[curr_].buffer_.CurrentSize() &&
-      offset + prefetch_size > bufs_[second].offset_) {
+      offset + length > bufs_[second].offset_) {
     // Allocate new buffer to third buffer;
     bufs_[2].buffer_.Clear();
     bufs_[2].buffer_.Alignment(alignment);
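The hunk above is easier to follow with the buffer bookkeeping isolated. Below is a simplified standalone model of the swap step, assuming a toy `Buf` struct in place of the internal `AlignedBuffer`; it is not the `FilePrefetchBuffer` API, only the offset arithmetic the diff manipulates.

```cpp
// Simplified model (assumed types) of the PrefetchAsync buffer swap: if the
// requested offset falls inside the second buffer, clear the current one and
// make the second buffer current.
#include <cstdint>
#include <iostream>

struct Buf {
  uint64_t offset = 0;  // file offset of the first byte held in the buffer
  uint64_t size = 0;    // number of valid bytes ("CurrentSize")
  void Clear() { size = 0; }
};

int main() {
  Buf bufs[2];
  uint32_t curr = 0;
  bufs[0] = {0, 4096};     // stale data from an earlier read
  bufs[1] = {4096, 4096};  // an async prefetch landed here

  uint64_t offset = 5000;  // next sequential read offset
  uint32_t second = curr ^ 1;

  // Same shape as the diff: the data is in second, so swap the roles.
  if (bufs[second].size > 0 && offset >= bufs[second].offset &&
      offset < bufs[second].offset + bufs[second].size) {
    bufs[curr].Clear();  // curr holds outdated data after the swap
    curr ^= 1;
    second = curr ^ 1;
  }
  std::cout << "curr buffer now starts at offset " << bufs[curr].offset << "\n";
  return 0;
}
```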
@@ -273,12 +297,10 @@ Status FilePrefetchBuffer::PrefetchAsync(const IOOptions& opts,

     // Move data from curr_ buffer to third.
     CopyDataToBuffer(curr_, offset, length);

     if (length == 0) {
       // Requested data has been copied and curr_ still has unconsumed data.
       return s;
     }

     CopyDataToBuffer(second, offset, length);
     // Length == 0: All the requested data has been copied to third buffer. It
     // should go for only async prefetching.

@@ -306,6 +328,7 @@ Status FilePrefetchBuffer::PrefetchAsync(const IOOptions& opts,
   if (length > 0) {
     CalculateOffsetAndLen(alignment, offset, roundup_len1, curr_,
                           false /*refit_tail*/, chunk_len1);
+    assert(roundup_len1 >= chunk_len1);
     read_len1 = static_cast<size_t>(roundup_len1 - chunk_len1);
   }
   {

@@ -316,7 +339,7 @@ Status FilePrefetchBuffer::PrefetchAsync(const IOOptions& opts,
         Roundup(rounddown_start2 + readahead_size, alignment);

     // For length == 0, do the asynchronous prefetching in second instead of
-    // synchronous prefetching of remaining prefetch_size.
+    // synchronous prefetching in curr_.
     if (length == 0) {
       rounddown_start2 =
           bufs_[curr_].offset_ + bufs_[curr_].buffer_.CurrentSize();

@@ -330,8 +353,8 @@ Status FilePrefetchBuffer::PrefetchAsync(const IOOptions& opts,

     // Update the buffer offset.
     bufs_[second].offset_ = rounddown_start2;
+    assert(roundup_len2 >= chunk_len2);
     uint64_t read_len2 = static_cast<size_t>(roundup_len2 - chunk_len2);

     ReadAsync(opts, reader, rate_limiter_priority, read_len2, chunk_len2,
               rounddown_start2, second)
         .PermitUncheckedError();

@@ -344,7 +367,6 @@ Status FilePrefetchBuffer::PrefetchAsync(const IOOptions& opts,
     return s;
   }
 }

 // Copy remaining requested bytes to third_buffer.
 if (copy_to_third_buffer && length > 0) {
   CopyDataToBuffer(curr_, offset, length);
@@ -90,6 +90,19 @@ class FilterPolicy : public Customizable {
   virtual ~FilterPolicy();
   static const char* Type() { return "FilterPolicy"; }

+  // The name used for identifying whether a filter on disk is readable
+  // by this FilterPolicy. If this FilterPolicy is part of a family that
+  // can read each others filters, such as built-in BloomFilterPolcy and
+  // RibbonFilterPolicy, the CompatibilityName is a shared family name,
+  // while kinds of filters in the family can have distinct Customizable
+  // Names. This function is pure virtual so that wrappers around built-in
+  // policies are prompted to defer to CompatibilityName() of the wrapped
+  // policy, which is important for compatibility.
+  //
+  // For custom filter policies that are not part of a read-compatible
+  // family (rare), implementations may return Name().
+  virtual const char* CompatibilityName() const = 0;
+
   // Creates a new FilterPolicy based on the input value string and returns the
   // result The value might be an ID, and ID with properties, or an old-style
   // policy string.
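The new contract is clearest from an implementer's point of view. Below is a hedged sketch of the two cases the comment distinguishes: a standalone custom policy that simply returns `Name()`, and a wrapper that defers to the wrapped policy's `CompatibilityName()` so existing filters stay readable. The class names are illustrative assumptions; the overridden virtuals and `NewBloomFilterPolicy` come from the public headers, and the trivial builder/reader bodies mirror the `MockFilterPolicy` test class shown further down.

```cpp
// Illustrative-only policy names; the FilterPolicy virtuals are from
// include/rocksdb/filter_policy.h as shown in the hunk above.
#include <memory>

#include "rocksdb/filter_policy.h"

using namespace ROCKSDB_NAMESPACE;

// Case 1: standalone custom policy, not read-compatible with anything else,
// so CompatibilityName() may just return Name().
class MyStandaloneFilterPolicy : public FilterPolicy {
 public:
  const char* Name() const override { return "MyStandaloneFilterPolicy"; }
  const char* CompatibilityName() const override { return Name(); }
  FilterBitsBuilder* GetBuilderWithContext(
      const FilterBuildingContext&) const override {
    return nullptr;  // builds no filters, as in the MockFilterPolicy test class
  }
  FilterBitsReader* GetFilterBitsReader(const Slice&) const override {
    return nullptr;
  }
};

// Case 2: wrapper around a built-in policy. It must defer to the wrapped
// policy's CompatibilityName(); returning its own Name() here would make
// existing SST filters unreadable, which is the bug this PR fixes.
class MyWrappingFilterPolicy : public FilterPolicy {
 public:
  MyWrappingFilterPolicy() : inner_(NewBloomFilterPolicy(10)) {}
  const char* Name() const override { return "MyWrappingFilterPolicy"; }
  const char* CompatibilityName() const override {
    return inner_->CompatibilityName();
  }
  FilterBitsBuilder* GetBuilderWithContext(
      const FilterBuildingContext& context) const override {
    return inner_->GetBuilderWithContext(context);
  }
  FilterBitsReader* GetFilterBitsReader(const Slice& contents) const override {
    return inner_->GetFilterBitsReader(contents);
  }

 private:
  std::shared_ptr<const FilterPolicy> inner_;
};
```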
@@ -1487,6 +1487,7 @@ class MockFilterPolicy : public FilterPolicy {
  public:
   static const char* kClassName() { return "MockFilterPolicy"; }
   const char* Name() const override { return kClassName(); }
+  const char* CompatibilityName() const override { return Name(); }
   FilterBitsBuilder* GetBuilderWithContext(
       const FilterBuildingContext&) const override {
     return nullptr;
@@ -1605,7 +1605,7 @@ void BlockBasedTableBuilder::WriteFilterBlock(
             ? BlockBasedTable::kPartitionedFilterBlockPrefix
             : BlockBasedTable::kFullFilterBlockPrefix;
     }
-    key.append(rep_->table_options.filter_policy->Name());
+    key.append(rep_->table_options.filter_policy->CompatibilityName());
     meta_index_builder->Add(key, filter_block_handle);
   }
 }
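This one-line change is the crux of the fix: the filter block's metaindex key is a block prefix with a policy name appended, and early 7.0.x appended `Name()` where older releases had written the shared built-in name. A small sketch of the resulting key shapes, where the prefix string and the 7.0.x class name are assumptions about RocksDB's constants rather than values shown in this diff:

```cpp
// Sketch of the metaindex key for a full filter block. The prefix value is an
// assumption about BlockBasedTable::kFullFilterBlockPrefix; the names come
// from this PR (the shared CompatibilityName() vs. a Customizable Name()).
#include <iostream>
#include <string>

int main() {
  const std::string kFullFilterBlockPrefix = "fullfilter.";

  // Pre-7.0 and post-fix: the shared compatibility name is appended.
  std::string compatible_key =
      kFullFilterBlockPrefix + "rocksdb.BuiltinBloomFilter";

  // Early 7.0.x wrote Name() instead (e.g. an assumed Customizable class name
  // like "bloomfilter"), so readers looking for the old key missed the filter.
  std::string hiccup_key = kFullFilterBlockPrefix + "bloomfilter";

  std::cout << compatible_key << "\n" << hiccup_key << "\n";
  return 0;
}
```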
@@ -12,6 +12,7 @@
 #include <array>
 #include <limits>
 #include <string>
+#include <unordered_set>
 #include <utility>
 #include <vector>

@@ -50,6 +51,7 @@
 #include "table/block_based/block_prefix_index.h"
 #include "table/block_based/block_type.h"
 #include "table/block_based/filter_block.h"
+#include "table/block_based/filter_policy_internal.h"
 #include "table/block_based/full_filter_block.h"
 #include "table/block_based/hash_index_reader.h"
 #include "table/block_based/partitioned_filter_block.h"

@@ -897,29 +899,54 @@ Status BlockBasedTable::PrefetchIndexAndFilterBlocks(
     const BlockBasedTableOptions& table_options, const int level,
     size_t file_size, size_t max_file_size_for_l0_meta_pin,
     BlockCacheLookupContext* lookup_context) {
-  Status s;
-
   // Find filter handle and filter type
   if (rep_->filter_policy) {
-    for (auto filter_type :
-         {Rep::FilterType::kFullFilter, Rep::FilterType::kPartitionedFilter,
-          Rep::FilterType::kBlockFilter}) {
-      std::string prefix;
-      switch (filter_type) {
-        case Rep::FilterType::kFullFilter:
-          prefix = kFullFilterBlockPrefix;
-          break;
-        case Rep::FilterType::kPartitionedFilter:
-          prefix = kPartitionedFilterBlockPrefix;
-          break;
-        case Rep::FilterType::kBlockFilter:
-          prefix = kFilterBlockPrefix;
-          break;
-        default:
-          assert(0);
-      }
-      std::string filter_block_key = prefix;
-      filter_block_key.append(rep_->filter_policy->Name());
+    auto name = rep_->filter_policy->CompatibilityName();
+    bool builtin_compatible =
+        strcmp(name, BuiltinFilterPolicy::kCompatibilityName()) == 0;
+
+    for (const auto& [filter_type, prefix] :
+         {std::make_pair(Rep::FilterType::kFullFilter, kFullFilterBlockPrefix),
+          std::make_pair(Rep::FilterType::kPartitionedFilter,
+                         kPartitionedFilterBlockPrefix),
+          std::make_pair(Rep::FilterType::kBlockFilter, kFilterBlockPrefix)}) {
+      if (builtin_compatible) {
+        // This code is only here to deal with a hiccup in early 7.0.x where
+        // there was an unintentional name change in the SST files metadata.
+        // It should be OK to remove this in the future (late 2022) and just
+        // have the 'else' code.
+        // NOTE: the test:: names below are likely not needed but included
+        // out of caution
+        static const std::unordered_set<std::string> kBuiltinNameAndAliases = {
+            BuiltinFilterPolicy::kCompatibilityName(),
+            test::LegacyBloomFilterPolicy::kClassName(),
+            test::FastLocalBloomFilterPolicy::kClassName(),
+            test::Standard128RibbonFilterPolicy::kClassName(),
+            DeprecatedBlockBasedBloomFilterPolicy::kClassName(),
+            BloomFilterPolicy::kClassName(),
+            RibbonFilterPolicy::kClassName(),
+        };
+
+        // For efficiency, do a prefix seek and see if the first match is
+        // good.
+        meta_iter->Seek(prefix);
+        if (meta_iter->status().ok() && meta_iter->Valid()) {
+          Slice key = meta_iter->key();
+          if (key.starts_with(prefix)) {
+            key.remove_prefix(prefix.size());
+            if (kBuiltinNameAndAliases.find(key.ToString()) !=
+                kBuiltinNameAndAliases.end()) {
+              Slice v = meta_iter->value();
+              Status s = rep_->filter_handle.DecodeFrom(&v);
+              if (s.ok()) {
+                rep_->filter_type = filter_type;
+                break;
+              }
+            }
+          }
+        }
+      } else {
+        std::string filter_block_key = prefix + name;
         if (FindMetaBlock(meta_iter, filter_block_key, &rep_->filter_handle)
                 .ok()) {
           rep_->filter_type = filter_type;

@@ -927,12 +954,13 @@ Status BlockBasedTable::PrefetchIndexAndFilterBlocks(
         }
       }
     }
+  }
   // Partition filters cannot be enabled without partition indexes
   assert(rep_->filter_type != Rep::FilterType::kPartitionedFilter ||
          rep_->index_type == BlockBasedTableOptions::kTwoLevelIndexSearch);

   // Find compression dictionary handle
-  s = FindOptionalMetaBlock(meta_iter, kCompressionDictBlockName,
-                            &rep_->compression_dict_handle);
+  Status s = FindOptionalMetaBlock(meta_iter, kCompressionDictBlockName,
+                                   &rep_->compression_dict_handle);
   if (!s.ok()) {
     return s;
@@ -1325,6 +1325,16 @@ bool BuiltinFilterPolicy::IsInstanceOf(const std::string& name) const {
   }
 }

+static const char* kBuiltinFilterMetadataName = "rocksdb.BuiltinBloomFilter";
+
+const char* BuiltinFilterPolicy::kCompatibilityName() {
+  return kBuiltinFilterMetadataName;
+}
+
+const char* BuiltinFilterPolicy::CompatibilityName() const {
+  return kBuiltinFilterMetadataName;
+}
+
 BloomLikeFilterPolicy::BloomLikeFilterPolicy(double bits_per_key)
     : warned_(false), aggregate_rounding_balance_(0) {
   // Sanitize bits_per_key

@@ -1372,7 +1382,7 @@ bool BloomLikeFilterPolicy::IsInstanceOf(const std::string& name) const {
 }

 const char* ReadOnlyBuiltinFilterPolicy::kClassName() {
-  return "rocksdb.BuiltinBloomFilter";
+  return kBuiltinFilterMetadataName;
 }

 const char* DeprecatedBlockBasedBloomFilterPolicy::kClassName() {
@@ -135,6 +135,9 @@ class BuiltinFilterPolicy : public FilterPolicy {
   FilterBitsReader* GetFilterBitsReader(const Slice& contents) const override;
   static const char* kClassName();
   bool IsInstanceOf(const std::string& id) const override;
+  // All variants of BuiltinFilterPolicy can read each others filters.
+  const char* CompatibilityName() const override;
+  static const char* kCompatibilityName();

  public:  // new
   // An internal function for the implementation of
@@ -84,6 +84,7 @@ class TestFilterBitsReader : public FilterBitsReader {
 class TestHashFilter : public FilterPolicy {
  public:
   const char* Name() const override { return "TestHashFilter"; }
+  const char* CompatibilityName() const override { return Name(); }

   FilterBitsBuilder* GetBuilderWithContext(
       const FilterBuildingContext&) const override {