rocksdb/table/block_based/block_builder.h
Peter Dillinger b234a3f569 Improve data block construction performance (#9040)
Summary:
... by bypassing tracking of last_key in BlockBuilder when
last_key is already known (for BlockBasedTableBuilder::data_block).

I tried extracting a base class of BlockBuilder without the last_key
tracking at all, but that became complicated by NewFlushBlockPolicy() in
the public API referencing BlockBuilder, which would need to be the base
class, and I don't want to replace nearly all the internal references to
BlockBuilder.

Possible follow-up:
* Investigate / consider using AddWithLastKey in more places

This improvement should stack with https://github.com/facebook/rocksdb/issues/9039

Pull Request resolved: https://github.com/facebook/rocksdb/pull/9040

Test Plan:
TEST_TMPDIR=/dev/shm/rocksdb1 ./db_bench -benchmarks=fillseq -memtablerep=vector -allow_concurrent_memtable_write=false -num=50000000
Compiled with DEBUG_LEVEL=0
Test vs. control run simultaneously for better accuracy, units = ops/sec

Run 1: 278929 vs. 267799 (+4.2%)
Run 2: 281836 vs. 267432 (+5.4%)
Run 3: 278279 vs. 270454 (+2.9%)

(This benchmark is chosen to have detectable signal-to-noise, not to
represent expected improvement percent on real workloads.)

Reviewed By: mrambacher

Differential Revision: D31706033

Pulled By: pdillinger

fbshipit-source-id: 8a50fe6fefdd67b6d7665ffa687bbdcf5ad0d5ec
2021-10-19 12:36:21 -07:00


// Copyright (c) 2011-present, Facebook, Inc. All rights reserved.
// This source code is licensed under both the GPLv2 (found in the
// COPYING file in the root directory) and Apache 2.0 License
// (found in the LICENSE.Apache file in the root directory).
//
// Copyright (c) 2011 The LevelDB Authors. All rights reserved.
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file. See the AUTHORS file for names of contributors.
#pragma once
#include <string>
#include <vector>
#include <stdint.h>
#include "rocksdb/slice.h"
#include "rocksdb/table.h"
#include "table/block_based/data_block_hash_index.h"
namespace ROCKSDB_NAMESPACE {
class BlockBuilder {
 public:
  BlockBuilder(const BlockBuilder&) = delete;
  void operator=(const BlockBuilder&) = delete;

  explicit BlockBuilder(int block_restart_interval,
                        bool use_delta_encoding = true,
                        bool use_value_delta_encoding = false,
                        BlockBasedTableOptions::DataBlockIndexType index_type =
                            BlockBasedTableOptions::kDataBlockBinarySearch,
                        double data_block_hash_table_util_ratio = 0.75);

  // Reset the contents as if the BlockBuilder was just constructed.
  void Reset();

  // Swap the contents in BlockBuilder with buffer, then reset the BlockBuilder.
  void SwapAndReset(std::string& buffer);

  // REQUIRES: Finish() has not been called since the last call to Reset().
  // REQUIRES: key is larger than any previously added key
  // DO NOT mix with AddWithLastKey() between Resets. For efficiency, use
  // AddWithLastKey() in contexts where the previously added key is already
  // known and delta encoding might be used.
  void Add(const Slice& key, const Slice& value,
           const Slice* const delta_value = nullptr);

  // A faster version of Add() if the previous key is already known for all
  // Add()s.
  // REQUIRES: Finish() has not been called since the last call to Reset().
  // REQUIRES: key is larger than any previously added key
  // REQUIRES: if AddWithLastKey has been called since last Reset(), last_key
  // is the key from most recent AddWithLastKey. (For convenience, last_key
  // is ignored on first call after creation or Reset().)
  // DO NOT mix with Add() between Resets.
  void AddWithLastKey(const Slice& key, const Slice& value,
                      const Slice& last_key,
                      const Slice* const delta_value = nullptr);

  // Finish building the block and return a slice that refers to the
  // block contents. The returned slice will remain valid for the
  // lifetime of this builder or until Reset() is called.
  Slice Finish();

  // Returns an estimate of the current (uncompressed) size of the block
  // we are building.
  inline size_t CurrentSizeEstimate() const {
    return estimate_ + (data_block_hash_index_builder_.Valid()
                            ? data_block_hash_index_builder_.EstimateSize()
                            : 0);
  }

  // Returns an estimated block size after appending key and value.
  size_t EstimateSizeAfterKV(const Slice& key, const Slice& value) const;

  // Return true iff no entries have been added since the last Reset()
  bool empty() const { return buffer_.empty(); }

 private:
  inline void AddWithLastKeyImpl(const Slice& key, const Slice& value,
                                 const Slice& last_key,
                                 const Slice* const delta_value,
                                 size_t buffer_size);

  const int block_restart_interval_;
  // TODO(myabandeh): put it into a separate IndexBlockBuilder
  const bool use_delta_encoding_;
  // Refer to BlockIter::DecodeCurrentValue for format of delta encoded values
  const bool use_value_delta_encoding_;

  std::string buffer_;              // Destination buffer
  std::vector<uint32_t> restarts_;  // Restart points
  size_t estimate_;
  int counter_;    // Number of entries emitted since restart
  bool finished_;  // Has Finish() been called?
  std::string last_key_;
  DataBlockHashIndexBuilder data_block_hash_index_builder_;
#ifndef NDEBUG
  bool add_with_last_key_called_ = false;
#endif
};
} // namespace ROCKSDB_NAMESPACE
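
The contract between Add() and AddWithLastKey() can be illustrated with a small stand-alone sketch. Everything below is hypothetical: ToyBlockBuilder, its text encoding, and the helper functions BuildWithAdd() and BuildWithLastKey() are invented for illustration and are not RocksDB's implementation or on-disk format. The sketch assumes only what the comments above state: keys are delta-encoded against the previous key, Add() has to remember that key itself, and AddWithLastKey() lets a caller that already holds the previous key pass it in, with last_key ignored on the first call.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for illustration only; NOT RocksDB's BlockBuilder.
class ToyBlockBuilder {
 public:
  // Tracks the previous key internally, costing one key copy per entry.
  void Add(const std::string& key, const std::string& value) {
    AddImpl(key, value, last_key_);
    last_key_ = key;
  }

  // The caller supplies the previous key, so no internal copy is made.
  // Assumption of this sketch: an empty buffer marks the first call, in
  // which case last_key is ignored.
  void AddWithLastKey(const std::string& key, const std::string& value,
                      const std::string& last_key) {
    AddImpl(key, value, buffer_.empty() ? std::string() : last_key);
  }

  const std::string& buffer() const { return buffer_; }

 private:
  // Toy text encoding: "<shared prefix length>|<unshared suffix>=<value>;"
  void AddImpl(const std::string& key, const std::string& value,
               const std::string& last_key) {
    const std::size_t limit = std::min(key.size(), last_key.size());
    std::size_t shared = 0;
    while (shared < limit && key[shared] == last_key[shared]) {
      ++shared;
    }
    buffer_ += std::to_string(shared);
    buffer_ += '|';
    buffer_.append(key, shared, std::string::npos);
    buffer_ += '=';
    buffer_ += value;
    buffer_ += ';';
  }

  std::string buffer_;
  std::string last_key_;  // Only used by Add()
};

// Builds a block while letting the builder track the previous key.
std::string BuildWithAdd() {
  ToyBlockBuilder b;
  b.Add("apple", "1");
  b.Add("apply", "2");
  b.Add("banana", "3");
  return b.buffer();
}

// Builds the same block when the caller already holds the previous key.
std::string BuildWithLastKey() {
  ToyBlockBuilder b;
  const std::string k1 = "apple", k2 = "apply", k3 = "banana";
  b.AddWithLastKey(k1, "1", std::string());  // last_key ignored on first call
  b.AddWithLastKey(k2, "2", k1);
  b.AddWithLastKey(k3, "3", k2);
  return b.buffer();
}
```

Both helpers produce the same bytes. The only difference is who keeps the previous key, which is why the two entry points must not be mixed between Resets: a builder fed through AddWithLastKey() never updates its own copy of the last key.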