3bfd3d39a3
Summary: Currently, transactions can fail even if there is no actual write conflict. This happens because only the memtables are checked for write conflicts. Users have to tune memtable settings to try to avoid this, but it's hard to figure out exactly how to tune these settings.

With this diff, TransactionDB will use both memtables and SST files to determine if there are any write conflicts. This relies on the fact that BlockBasedTable stores sequence numbers for all writes that happen after any open snapshot. Also, D50295 is needed to prevent SingleDelete from making writes disappear (the TODOs in this test code will be fixed once the other diff is approved and merged).

Note that Optimistic transactions will still rely on tuning memtable settings, as we do not want to read from SST files while on the write thread. Also, memtable settings can still be used to reduce how often TransactionDB needs to read SST files.

Test Plan: unit tests, db bench
Reviewers: rven, yhchiang, kradhakrishnan, IslamAbdelRahman, sdong
Reviewed By: sdong
Subscribers: dhruba, leveldb, yoshinorim
Differential Revision: https://reviews.facebook.net/D50475
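The two-tier check described in the summary can be sketched with a toy model. This is a minimal illustration, not RocksDB's implementation: `ToyDb`, `CheckResult`, `Lookup`, and `CheckKeyToy` are hypothetical names, and the "memtable" and "SST" are plain maps from key to the sequence number of the latest write. It assumes the memtable is consulted first, that a memtable whose history reaches back to `key_seq` is enough to rule out a conflict, and that `cache_only` forbids falling back to the SST tier.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>

using SequenceNumber = uint64_t;  // stand-in for rocksdb::SequenceNumber

// Hypothetical result type; the real API returns a rocksdb::Status.
enum class CheckResult { kOk, kBusy, kUndetermined };

using KeySeqMap = std::unordered_map<std::string, SequenceNumber>;

// Toy database: each tier maps a key to the sequence number of its latest write.
struct ToyDb {
  KeySeqMap memtable;               // recent writes
  KeySeqMap sst;                    // older, flushed writes
  SequenceNumber earliest_mem_seq;  // oldest sequence the memtable history covers
};

// Returns true and fills *seq if the key is present in the given tier.
bool Lookup(const KeySeqMap& tier, const std::string& key, SequenceNumber* seq) {
  auto it = tier.find(key);
  if (it == tier.end()) return false;
  *seq = it->second;
  return true;
}

// Has `key` been written after `key_seq`?
CheckResult CheckKeyToy(const ToyDb& db, const std::string& key,
                        SequenceNumber key_seq, bool cache_only) {
  SequenceNumber seq = 0;
  // 1. The memtable holds the newest writes, so a hit there is decisive.
  if (Lookup(db.memtable, key, &seq)) {
    return seq > key_seq ? CheckResult::kBusy : CheckResult::kOk;
  }
  // 2. No hit, but the memtable history reaches back to key_seq:
  //    no write to this key can have happened since.
  if (db.earliest_mem_seq <= key_seq) return CheckResult::kOk;
  // 3. History is too short. With cache_only we may not read SST data,
  //    so the check cannot be completed.
  if (cache_only) return CheckResult::kUndetermined;
  // 4. Otherwise fall back to the SST tier.
  if (Lookup(db.sst, key, &seq)) {
    return seq > key_seq ? CheckResult::kBusy : CheckResult::kOk;
  }
  return CheckResult::kOk;
}
```

In this model, a larger memtable history (a smaller `earliest_mem_seq`) makes step 2 succeed more often, which is one way to read the remark that memtable settings can still reduce how often SST files need to be read.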
66 lines
2.2 KiB
C++
// Copyright (c) 2015, Facebook, Inc. All rights reserved.
// This source code is licensed under the BSD-style license found in the
// LICENSE file in the root directory of this source tree. An additional grant
// of patent rights can be found in the PATENTS file in the same directory.

#pragma once

#ifndef ROCKSDB_LITE

#include <string>
#include <unordered_map>

#include "rocksdb/db.h"
#include "rocksdb/slice.h"
#include "rocksdb/status.h"
#include "rocksdb/types.h"

namespace rocksdb {

using TransactionKeyMap =
    std::unordered_map<uint32_t,
                       std::unordered_map<std::string, SequenceNumber>>;

class DBImpl;
struct SuperVersion;
class WriteBatchWithIndex;

class TransactionUtil {
 public:
  // Verifies there have been no writes to this key in the db since this
  // sequence number.
  //
  // If cache_only is true, then this function will not attempt to read any
  // SST files. This makes it more likely that this function returns an error
  // because it cannot determine whether there are any conflicts.
  //
  // Returns OK on success, BUSY if there is a conflicting write, or other error
  // status for any unexpected errors.
  static Status CheckKeyForConflicts(DBImpl* db_impl,
                                     ColumnFamilyHandle* column_family,
                                     const std::string& key,
                                     SequenceNumber key_seq, bool cache_only);

  // For each (key, SequenceNumber) pair in the TransactionKeyMap, this function
  // will verify there have been no writes to the key in the db since that
  // sequence number.
  //
  // Returns OK on success, BUSY if there is a conflicting write, or other error
  // status for any unexpected errors.
  //
  // REQUIRED: this function should only be called on the write thread or if the
  // mutex is held.
  static Status CheckKeysForConflicts(DBImpl* db_impl,
                                      const TransactionKeyMap& keys,
                                      bool cache_only);

 private:
  static Status CheckKey(DBImpl* db_impl, SuperVersion* sv,
                         SequenceNumber earliest_seq, SequenceNumber key_seq,
                         const std::string& key, bool cache_only);
};

}  // namespace rocksdb

#endif  // ROCKSDB_LITE
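To make the shape of `TransactionKeyMap` concrete, the sketch below walks the nested map (column family ID, then key, then tracked sequence number) and applies a per-key predicate, stopping at the first conflict. It is only an illustration of the data structure: `ConflictFn` and `AllKeysConflictFree` are hypothetical names, the `SequenceNumber` alias is a stand-in for the RocksDB type, and the real `CheckKeysForConflicts` returns a `Status` and has locking requirements that are not modelled here.

```cpp
#include <cstdint>
#include <functional>
#include <string>
#include <unordered_map>

using SequenceNumber = uint64_t;  // stand-in for rocksdb::SequenceNumber

// Same shape as the alias in the header: column family ID -> (key -> sequence).
using TransactionKeyMap =
    std::unordered_map<uint32_t,
                       std::unordered_map<std::string, SequenceNumber>>;

// Hypothetical per-key predicate: returns true if `key` in column family
// `cf_id` has been written since `seq`.
using ConflictFn = std::function<bool(uint32_t cf_id, const std::string& key,
                                      SequenceNumber seq)>;

// Returns true when no tracked key conflicts; stops at the first conflict.
bool AllKeysConflictFree(const TransactionKeyMap& keys,
                         const ConflictFn& has_conflict) {
  for (const auto& cf_entry : keys) {
    const uint32_t cf_id = cf_entry.first;
    for (const auto& key_entry : cf_entry.second) {
      if (has_conflict(cf_id, key_entry.first, key_entry.second)) {
        return false;
      }
    }
  }
  return true;
}
```

Because `std::unordered_map` iterates in an unspecified order, which conflicting key is found first is not deterministic when several keys conflict; only the overall yes/no answer is.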