rocksdb/db/blob/blob_file_reader.h

85 lines
3.1 KiB
C
Raw Normal View History

Introduce a blob file reader class (#7461) Summary: The patch adds a class called `BlobFileReader` that can be used to retrieve blobs using the information available in blob references (e.g. blob file number, offset, and size). This will come in handy when implementing blob support for `Get`, `MultiGet`, and iterators, and also for compaction/garbage collection. When a `BlobFileReader` object is created (using the factory method `Create`), it first checks whether the specified file is potentially valid by comparing the file size against the combined size of the blob file header and footer (files smaller than the threshold are considered malformed). Then, it opens the file, and reads and verifies the header and footer. The verification involves magic number/CRC checks as well as checking for unexpected header/footer fields, e.g. incorrect column family ID or TTL blob files. Blobs can be retrieved using `GetBlob`. `GetBlob` validates the offset and compression type passed by the caller (because of the presence of the header and footer, the specified offset cannot be too close to the start/end of the file; also, the compression type has to match the one in the blob file header), and retrieves and potentially verifies and uncompresses the blob. In particular, when `ReadOptions::verify_checksums` is set, `BlobFileReader` reads the blob record header as well (as opposed to just the blob itself) and verifies the key/value size, the key itself, as well as the CRC of the blob record header and the key/value pair. In addition, the patch exposes the compression type from `BlobIndex` (both using an accessor and via `DebugString`), and adds a blob file read latency histogram to `InternalStats` that can be used with `BlobFileReader`. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7461 Test Plan: `make check` Reviewed By: riversand963 Differential Revision: D23999219 Pulled By: ltamasi fbshipit-source-id: deb6b1160d251258b308d5156e2ec063c3e12e5e
2020-10-08 00:43:23 +02:00
// Copyright (c) 2011-present, Facebook, Inc. All rights reserved.
// This source code is licensed under both the GPLv2 (found in the
// COPYING file in the root directory) and Apache 2.0 License
// (found in the LICENSE.Apache file in the root directory).
#pragma once
#include <cinttypes>
#include <memory>
#include "file/random_access_file_reader.h"
#include "rocksdb/compression_type.h"
#include "rocksdb/rocksdb_namespace.h"
namespace ROCKSDB_NAMESPACE {
class Status;
struct ImmutableCFOptions;
struct FileOptions;
class HistogramImpl;
struct ReadOptions;
class Slice;
class PinnableSlice;
class BlobFileReader {
public:
static Status Create(const ImmutableCFOptions& immutable_cf_options,
const FileOptions& file_options,
uint32_t column_family_id,
HistogramImpl* blob_file_read_hist,
uint64_t blob_file_number,
const std::shared_ptr<IOTracer>& io_tracer,
Introduce a blob file reader class (#7461) Summary: The patch adds a class called `BlobFileReader` that can be used to retrieve blobs using the information available in blob references (e.g. blob file number, offset, and size). This will come in handy when implementing blob support for `Get`, `MultiGet`, and iterators, and also for compaction/garbage collection. When a `BlobFileReader` object is created (using the factory method `Create`), it first checks whether the specified file is potentially valid by comparing the file size against the combined size of the blob file header and footer (files smaller than the threshold are considered malformed). Then, it opens the file, and reads and verifies the header and footer. The verification involves magic number/CRC checks as well as checking for unexpected header/footer fields, e.g. incorrect column family ID or TTL blob files. Blobs can be retrieved using `GetBlob`. `GetBlob` validates the offset and compression type passed by the caller (because of the presence of the header and footer, the specified offset cannot be too close to the start/end of the file; also, the compression type has to match the one in the blob file header), and retrieves and potentially verifies and uncompresses the blob. In particular, when `ReadOptions::verify_checksums` is set, `BlobFileReader` reads the blob record header as well (as opposed to just the blob itself) and verifies the key/value size, the key itself, as well as the CRC of the blob record header and the key/value pair. In addition, the patch exposes the compression type from `BlobIndex` (both using an accessor and via `DebugString`), and adds a blob file read latency histogram to `InternalStats` that can be used with `BlobFileReader`. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7461 Test Plan: `make check` Reviewed By: riversand963 Differential Revision: D23999219 Pulled By: ltamasi fbshipit-source-id: deb6b1160d251258b308d5156e2ec063c3e12e5e
2020-10-08 00:43:23 +02:00
std::unique_ptr<BlobFileReader>* reader);
BlobFileReader(const BlobFileReader&) = delete;
BlobFileReader& operator=(const BlobFileReader&) = delete;
~BlobFileReader();
Status GetBlob(const ReadOptions& read_options, const Slice& user_key,
uint64_t offset, uint64_t value_size,
CompressionType compression_type, PinnableSlice* value) const;
private:
BlobFileReader(std::unique_ptr<RandomAccessFileReader>&& file_reader,
uint64_t file_size, CompressionType compression_type);
static Status OpenFile(const ImmutableCFOptions& immutable_cf_options,
const FileOptions& file_opts,
HistogramImpl* blob_file_read_hist,
uint64_t blob_file_number,
const std::shared_ptr<IOTracer>& io_tracer,
uint64_t* file_size,
Introduce a blob file reader class (#7461) Summary: The patch adds a class called `BlobFileReader` that can be used to retrieve blobs using the information available in blob references (e.g. blob file number, offset, and size). This will come in handy when implementing blob support for `Get`, `MultiGet`, and iterators, and also for compaction/garbage collection. When a `BlobFileReader` object is created (using the factory method `Create`), it first checks whether the specified file is potentially valid by comparing the file size against the combined size of the blob file header and footer (files smaller than the threshold are considered malformed). Then, it opens the file, and reads and verifies the header and footer. The verification involves magic number/CRC checks as well as checking for unexpected header/footer fields, e.g. incorrect column family ID or TTL blob files. Blobs can be retrieved using `GetBlob`. `GetBlob` validates the offset and compression type passed by the caller (because of the presence of the header and footer, the specified offset cannot be too close to the start/end of the file; also, the compression type has to match the one in the blob file header), and retrieves and potentially verifies and uncompresses the blob. In particular, when `ReadOptions::verify_checksums` is set, `BlobFileReader` reads the blob record header as well (as opposed to just the blob itself) and verifies the key/value size, the key itself, as well as the CRC of the blob record header and the key/value pair. In addition, the patch exposes the compression type from `BlobIndex` (both using an accessor and via `DebugString`), and adds a blob file read latency histogram to `InternalStats` that can be used with `BlobFileReader`. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7461 Test Plan: `make check` Reviewed By: riversand963 Differential Revision: D23999219 Pulled By: ltamasi fbshipit-source-id: deb6b1160d251258b308d5156e2ec063c3e12e5e
2020-10-08 00:43:23 +02:00
std::unique_ptr<RandomAccessFileReader>* file_reader);
static Status ReadHeader(const RandomAccessFileReader* file_reader,
uint32_t column_family_id,
CompressionType* compression_type);
static Status ReadFooter(uint64_t file_size,
const RandomAccessFileReader* file_reader);
using Buffer = std::unique_ptr<char[]>;
static Status ReadFromFile(const RandomAccessFileReader* file_reader,
uint64_t read_offset, size_t read_size,
Slice* slice, Buffer* buf,
AlignedBuf* aligned_buf);
static Status VerifyBlob(const Slice& record_slice, const Slice& user_key,
uint64_t value_size);
static Status UncompressBlobIfNeeded(const Slice& value_slice,
CompressionType compression_type,
PinnableSlice* value);
static void SaveValue(const Slice& src, PinnableSlice* dst);
std::unique_ptr<RandomAccessFileReader> file_reader_;
uint64_t file_size_;
CompressionType compression_type_;
};
} // namespace ROCKSDB_NAMESPACE