rocksdb/docs/_posts/2017-08-24-pinnableslice.markdown

title: PinnableSlice; less memcpy with point lookups
layout: post
author: maysamyabandeh
category: blog

The classic API for DB::Get takes a std::string as an output argument and copies the value into it. This memcpy overhead can be non-trivial when the value is large. The new API takes a PinnableSlice instead, which avoids the memcpy in most cases.

What is PinnableSlice?

Like Slice, PinnableSlice refers to some in-memory data, so it does not incur the memcpy cost. To ensure that the data is not erased while the user is still processing it, PinnableSlice, as its name suggests, keeps the data pinned in memory. The pinned data is released when the PinnableSlice object is destroyed or when ::Reset is invoked on it explicitly.
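
The pinning idea can be illustrated with a small standalone sketch. It is not RocksDB code: `ToyPinnableSlice`, `Pin`, and `SurvivesEviction` are hypothetical names, and a `std::shared_ptr` is assumed as a stand-in for a block-cache handle. The point is only that the slice keeps the underlying bytes alive without copying them, and gives them up on `Reset` or destruction.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a pinnable slice; not the RocksDB class.
// A shared_ptr plays the role of the block-cache handle that keeps the
// bytes alive while the slice refers to them.
class ToyPinnableSlice {
 public:
  // Refer to the bytes owned by `block` (must be non-null) without copying.
  void Pin(std::shared_ptr<const std::string> block) {
    pinned_ = std::move(block);
    data_ = pinned_->data();
    size_ = pinned_->size();
  }

  // Drop the reference; the bytes may be freed once nobody else holds them.
  void Reset() {
    data_ = "";
    size_ = 0;
    pinned_.reset();
  }

  bool IsPinned() const { return pinned_ != nullptr; }
  const char* data() const { return data_; }
  std::size_t size() const { return size_; }
  std::string ToString() const { return std::string(data_, size_); }

 private:
  std::shared_ptr<const std::string> pinned_;
  const char* data_ = "";
  std::size_t size_ = 0;
};

// Pins a cached block, lets the "cache" drop its own reference, and reports
// whether the slice still reads the original bytes afterwards.
bool SurvivesEviction() {
  auto cache_entry = std::make_shared<const std::string>("large value");
  const char* original = cache_entry->data();

  ToyPinnableSlice slice;
  slice.Pin(cache_entry);
  cache_entry.reset();  // the cache lets go; the slice still pins the bytes

  bool ok = slice.IsPinned() && slice.ToString() == "large value" &&
            slice.data() == original;  // same memory, so no copy was made
  slice.Reset();                       // release the pin explicitly
  return ok && !slice.IsPinned() && slice.size() == 0;
}
```

Calling `SurvivesEviction()` exercises the whole lifecycle: pin, eviction by the owner, read through the slice, and explicit release.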

How good is it?

Here are the improvements in throughput for an in-memory benchmark:

  • 1 KB values: 14%
  • 10 KB values: 34%

Any limitations?

PinnableSlice tries to avoid memcpy as much as possible. The primary gain is when reading large values from the block cache. There are, however, cases in which it still has to copy the data into its internal buffer. The main reason is implementation complexity; if there is enough motivation on the application side, the scope of PinnableSlice could be extended to such cases too. These include:

  • Merged values
  • Reads from memtables

How to use it?

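PinnableSlice pinnable_val;
while (!stopped) {
   // opt is a ReadOptions, cf is a ColumnFamilyHandle*
   auto s = db->Get(opt, cf, key, &pinnable_val);
   if (s.ok()) {
      // ... use pinnable_val.data() and pinnable_val.size()
   }
   pinnable_val.Reset(); // then release the pinned data immediately
}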

You can also have PinnableSlice use your own string as its internal buffer by passing a pointer to that string to the constructor. simple_example.cc demonstrates this, along with more examples.