rocksdb/test_util/sync_point_impl.h
matthewvon 678ba5e41c SyncPoint::Process thrashes heap ... fix it (#9023)
Summary:
The first parameter of SyncPoint::Process is "const std::string&". The majority, maybe all, of the actual calls to this function pass a "const char *". The implicit conversion before entering the function constructs a std::string object on the heap. That std::string object is then typically not needed, because the first use of the string is as a rocksdb::Slice, which has a much cheaper conversion from char * (a pointer and a length, no allocation).
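A minimal sketch of the difference (the two free functions below are hypothetical stand-ins for the old and new Process signatures, and the sync point name is made up):

<pre>
#include &lt;rocksdb/slice.h&gt;

#include &lt;string&gt;

// Old shape: a "const char *" argument forces construction of a
// std::string temporary at every call site. Sync point names are long
// enough to defeat the small-string optimization, so this hits the heap.
static void ProcessByString(const std::string& /*point*/) {}

// New shape: rocksdb::Slice just records a pointer and a length into
// the existing character data -- no allocation at the call site.
static void ProcessBySlice(const rocksdb::Slice& /*point*/) {}

int main() {
  const char* point = "SomeFile::SomeFunction:SomeSyncPoint";
  ProcessByString(point);  // allocates, then discards, a std::string
  ProcessBySlice(point);   // pointer + strlen only
  return 0;
}
</pre>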

Example:

We have a load-and-iterate test. The test loads 10m keys and iterates over most of them via 10 rocksdb::Iterator objects. We used TCMALLOC to gather information about allocation and space usage during the iteration.

- Before this PR:  test took 32 min 17 sec
- After this PR:  test took 1 min 14 sec

The TCMALLOC top object list before this PR:

<pre>
Total: 5105999 objects
 5003717  98.0%  98.0%  5009471  98.1% rocksdb::DBIter::MergeValuesNewToOld (inline)
   20260   0.4%  98.4%    20260   0.4% std::__cxx11::basic_string::_M_mutate
   15214   0.3%  98.7%    15214   0.3% rocksdb::UncompressBlockContentsForCompressionType (inline)
   13408   0.3%  99.0%    13408   0.3% std::_Rb_tree::_M_emplace_hint_unique [clone .constprop.416] (inline)
   12957   0.3%  99.2%    12957   0.3% std::_Rb_tree::_M_emplace_hint_unique [clone .constprop.405] (inline)
    9327   0.2%  99.4%     9327   0.2% std::_Rb_tree::_M_copy (inline)
    7691   0.2%  99.5%     9919   0.2% JVM_FindSignal
    2859   0.1%  99.6%     2859   0.1% rocksdb::Cleanable::RegisterCleanup
    2844   0.1%  99.7%     2844   0.1% std::map::operator[] (inline)
</pre>

The "MergeValuesNewToOld (inline)" objects are the #define wrappers to SyncPoint::Process.  We discovered this in a 5.18 rocksdb release.  There TCMALLOC was more specific that std::basic_string was being constructed.  I believe that was before SyncPoint::Process was declared inline in subsequent releases.

The TCMALLOC top object list after this PR:

<pre>
Total: 104911 objects
   45090  43.0%  43.0%    45090  43.0% rocksdb::Cleanable::RegisterCleanup
   29995  28.6%  71.6%    29995  28.6% rocksdb::LRUCacheShard::Insert
   15229  14.5%  86.1%    15229  14.5% rocksdb::UncompressBlockContentsForCompressionType (inline)
    4373   4.2%  90.3%     4551   4.3% JVM_FindSignal
    2881   2.7%  93.0%     2881   2.7% rocksdb::::ReadBlockFromFile (inline)
    1162   1.1%  94.1%     1176   1.1% rocksdb::BlockFetcher::ReadBlockContents (inline)
    1036   1.0%  95.1%     1036   1.0% std::__cxx11::basic_string::_M_mutate
     869   0.8%  95.9%      869   0.8% std::vector::_M_realloc_insert (inline)
     806   0.8%  96.7%      806   0.8% SnmpAgent::GetVariables (inline)
</pre>

Pull Request resolved: https://github.com/facebook/rocksdb/pull/9023

Reviewed By: pdillinger

Differential Revision: D31610907

Pulled By: mrambacher

fbshipit-source-id: 574ff51b639dd46ad253a8e664a575f06b7cc85d
2021-10-15 13:30:29 -07:00


// Copyright (c) 2011-present, Facebook, Inc. All rights reserved.
// This source code is licensed under both the GPLv2 (found in the
// COPYING file in the root directory) and Apache 2.0 License
// (found in the LICENSE.Apache file in the root directory).
#include <assert.h>

#include <atomic>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <string>
#include <thread>
#include <unordered_map>
#include <unordered_set>

#include "memory/concurrent_arena.h"
#include "port/port.h"
#include "test_util/sync_point.h"
#include "util/dynamic_bloom.h"
#include "util/random.h"

#pragma once

#ifndef NDEBUG
namespace ROCKSDB_NAMESPACE {

// A hacky allocator for single use.
// Arena depends on SyncPoint and would create a circular dependency.
class SingleAllocator : public Allocator {
 public:
  char* Allocate(size_t) override {
    assert(false);
    return nullptr;
  }
  char* AllocateAligned(size_t bytes, size_t, Logger*) override {
    buf_.resize(bytes);
    return const_cast<char*>(buf_.data());
  }
  size_t BlockSize() const override {
    assert(false);
    return 0;
  }

 private:
  std::string buf_;
};

struct SyncPoint::Data {
  Data() : point_filter_(&alloc_, /*total_bits=*/8192), enabled_(false) {}
  // Enable proper deletion by subclasses
  virtual ~Data() {}
  // successor/predecessor map loaded from LoadDependency
  std::unordered_map<std::string, std::vector<std::string>> successors_;
  std::unordered_map<std::string, std::vector<std::string>> predecessors_;
  std::unordered_map<std::string, std::function<void(void*)>> callbacks_;
  std::unordered_map<std::string, std::vector<std::string>> markers_;
  std::unordered_map<std::string, std::thread::id> marked_thread_id_;

  std::mutex mutex_;
  std::condition_variable cv_;
  // sync points that have been passed through
  std::unordered_set<std::string> cleared_points_;
  // Backing store for point_filter_; see SingleAllocator above.
  SingleAllocator alloc_;
  // A filter checked before taking mutex_, to speed up Process().
  DynamicBloom point_filter_;
  std::atomic<bool> enabled_;
  int num_callbacks_running_ = 0;

  void LoadDependency(const std::vector<SyncPointPair>& dependencies);
  void LoadDependencyAndMarkers(const std::vector<SyncPointPair>& dependencies,
                                const std::vector<SyncPointPair>& markers);
  bool PredecessorsAllCleared(const std::string& point);
  void SetCallBack(const std::string& point,
                   const std::function<void(void*)>& callback) {
    std::lock_guard<std::mutex> lock(mutex_);
    callbacks_[point] = callback;
    point_filter_.Add(point);
  }

  void ClearCallBack(const std::string& point);
  void ClearAllCallBacks();

  void EnableProcessing() { enabled_ = true; }
  void DisableProcessing() { enabled_ = false; }

  void ClearTrace() {
    std::lock_guard<std::mutex> lock(mutex_);
    cleared_points_.clear();
  }

  bool DisabledByMarker(const std::string& point, std::thread::id thread_id) {
    auto marked_point_iter = marked_thread_id_.find(point);
    return marked_point_iter != marked_thread_id_.end() &&
           thread_id != marked_point_iter->second;
  }

  // Takes a Slice (not const std::string&) so that the ubiquitous
  // "const char *" call sites do not heap-allocate a temporary string.
  void Process(const Slice& point, void* cb_arg);
};

}  // namespace ROCKSDB_NAMESPACE
#endif  // NDEBUG
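For reference, a minimal sketch of how tests typically drive this machinery through the public SyncPoint interface (the point names are placeholders; the calls mirror the methods declared above):

<pre>
#include "test_util/sync_point.h"

// Only meaningful in non-NDEBUG builds, where SyncPoint is compiled in.
void ExampleTestSetup() {
  auto* sp = ROCKSDB_NAMESPACE::SyncPoint::GetInstance();

  // Make any thread reaching "Test:PointB" wait until some thread
  // has passed "Test:PointA".
  sp->LoadDependency({{"Test:PointA", "Test:PointB"}});

  // Run a callback whenever instrumented code reaches "Test:PointC".
  sp->SetCallBack("Test:PointC", [](void* arg) {
    // arg is whatever the code under test passed to
    // TEST_SYNC_POINT_CALLBACK; cast and inspect as needed.
    (void)arg;
  });

  sp->EnableProcessing();
  // ... run the code under test ...
  sp->DisableProcessing();
  sp->ClearAllCallBacks();
}
</pre>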