rocksdb/table/cuckoo_table_reader.h
Radheshyam Balasundaram 08be7f5266 Implement Prepare method in CuckooTableReader
Summary:
- Implement the Prepare method.
- Rewrite the performance tests in cuckoo_table_reader_test to write a new file only if one doesn't already exist.
- Add performance tests for batch lookup combined with prefetching; a sketch of the driver loop follows below.
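
As a rough illustration of the batched-lookup path (a sketch only, not the committed test code; keys, batch_size, read_options, and HandleResult are hypothetical names standing in for the test's own):

  // Hypothetical callback matching Get()'s result_handler signature; a real
  // caller would copy v out here. Returning false stops the lookup.
  static bool HandleResult(void* arg, const ParsedInternalKey& k,
                           const Slice& v) {
    return false;  // stop after the first match
  }

  // For each batch, Prepare() every key first (issuing prefetches),
  // then issue the Gets.
  for (size_t i = 0; i < keys.size(); i += batch_size) {
    size_t end = std::min(i + batch_size, keys.size());
    for (size_t j = i; j < end; ++j) {
      reader->Prepare(keys[j]);
    }
    for (size_t j = i; j < end; ++j) {
      reader->Get(read_options, keys[j], /*handle_context=*/nullptr,
                  HandleResult);
    }
  }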

Test Plan:
./cuckoo_table_reader_test --enable_perf
Results (we would get better results with an int64 comparator instead of the string comparator; TBD in future diffs). All runs use 100000000 items; batch size 0 is the unbatched baseline.

Hash table ratio 0.50, 2 hash functions:
  batch size   0: 0.208 us/op (4.8 Mqps)
  batch size  10: 0.182 us/op (5.5 Mqps)
  batch size  25: 0.161 us/op (6.2 Mqps)
  batch size  50: 0.161 us/op (6.2 Mqps)
  batch size 100: 0.163 us/op (6.1 Mqps)

Hash table ratio 0.60, 3 hash functions:
  batch size   0: 0.252 us/op (4.0 Mqps)
  batch size  10: 0.192 us/op (5.2 Mqps)
  batch size  25: 0.195 us/op (5.1 Mqps)
  batch size  50: 0.191 us/op (5.2 Mqps)
  batch size 100: 0.194 us/op (5.1 Mqps)

Hash table ratio 0.75, 3 hash functions:
  batch size   0: 0.228 us/op (4.4 Mqps)
  batch size  10: 0.185 us/op (5.4 Mqps)
  batch size  25: 0.186 us/op (5.4 Mqps)
  batch size  50: 0.189 us/op (5.3 Mqps)
  batch size 100: 0.188 us/op (5.3 Mqps)

Hash table ratio 0.90, 3 hash functions:
  batch size   0: 0.325 us/op (3.1 Mqps)
  batch size  10: 0.196 us/op (5.1 Mqps)
  batch size  25: 0.199 us/op (5.0 Mqps)
  batch size  50: 0.196 us/op (5.1 Mqps)
  batch size 100: 0.209 us/op (4.8 Mqps)

Batching with prefetch helps most at the highest ratio: at 0.90 it cuts the unbatched 0.325 us/op to roughly 0.196 us/op.

Reviewers: sdong, yhchiang, igor, ljin

Reviewed By: ljin

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D22167
2014-08-20 18:35:35 -07:00

// Copyright (c) 2014, Facebook, Inc. All rights reserved.
// This source code is licensed under the BSD-style license found in the
// LICENSE file in the root directory of this source tree. An additional grant
// of patent rights can be found in the PATENTS file in the same directory.
//
// Copyright (c) 2011 The LevelDB Authors. All rights reserved.
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file. See the AUTHORS file for names of contributors.
#pragma once
#ifndef ROCKSDB_LITE
#include <string>
#include <memory>
#include <utility>
#include <vector>

#include "db/dbformat.h"
#include "rocksdb/env.h"
#include "table/table_reader.h"

namespace rocksdb {

class Arena;
class TableReader;

class CuckooTableReader: public TableReader {
 public:
  CuckooTableReader(
      const Options& options,
      std::unique_ptr<RandomAccessFile>&& file,
      uint64_t file_size,
      uint64_t (*get_slice_hash)(const Slice&, uint32_t, uint64_t));
  ~CuckooTableReader() {}

  std::shared_ptr<const TableProperties> GetTableProperties() const override {
    return table_props_;
  }

  Status status() const { return status_; }

  Status Get(
      const ReadOptions& readOptions, const Slice& key, void* handle_context,
      bool (*result_handler)(void* arg, const ParsedInternalKey& k,
                             const Slice& v),
      void (*mark_key_may_exist_handler)(void* handle_context) = nullptr)
      override;

  Iterator* NewIterator(const ReadOptions&, Arena* arena = nullptr) override;
  void Prepare(const Slice& target) override;

  // Report an approximation of how much memory has been used.
  size_t ApproximateMemoryUsage() const override;

  // The following methods are not implemented for CuckooTableReader.
  uint64_t ApproximateOffsetOf(const Slice& key) override { return 0; }
  void SetupForCompaction() override {}
  // End of methods not implemented.

 private:
  friend class CuckooTableIterator;
  void LoadAllKeys(std::vector<std::pair<Slice, uint32_t>>* key_to_bucket_id);

  std::unique_ptr<RandomAccessFile> file_;
  Slice file_data_;        // contents of the file, read in at open time
  bool is_last_level_;     // true if the file stores plain user keys
  std::shared_ptr<const TableProperties> table_props_;
  Status status_;
  uint32_t num_hash_fun_;  // number of hash functions used by the table
  std::string unused_key_; // key value that marks an empty bucket
  uint32_t key_length_;    // fixed key size, in bytes
  uint32_t value_length_;  // fixed value size, in bytes
  uint32_t bucket_length_; // key_length_ + value_length_
  uint64_t num_buckets_;   // total number of buckets in the file
  uint64_t (*get_slice_hash_)(const Slice& s, uint32_t index,
                              uint64_t max_num_buckets);
};

}  // namespace rocksdb
#endif // ROCKSDB_LITE
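
The header only declares Prepare(); as a hedged illustration of how it supports batched lookup, here is a minimal sketch (not the committed implementation) that hashes the user key with each hash function and prefetches the candidate bucket, so a following Get() hits warm cache lines:

  void CuckooTableReader::Prepare(const Slice& key) {
    for (uint32_t hash_cnt = 0; hash_cnt < num_hash_fun_; ++hash_cnt) {
      uint64_t bucket_id =
          get_slice_hash_(ExtractUserKey(key), hash_cnt, num_buckets_);
      // __builtin_prefetch is GCC/Clang-specific; a portable build would go
      // through a prefetch wrapper in RocksDB's port layer instead.
      __builtin_prefetch(file_data_.data() + bucket_id * bucket_length_);
    }
  }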