Nikhil Benesch 7891af8b53 expose a hook to skip tables during iteration
Summary:
As discussed on the mailing list (["Skipping entire SSTs while iterating"](https://groups.google.com/forum/#!topic/rocksdb/ujHCJVLrHlU)), this patch adds a `table_filter` to `ReadOptions` that allows specifying a callback to be executed during iteration before each table in the database is scanned. The callback is passed the table's properties; the table is scanned iff the callback returns true.

This can be used in conjunction with a `TablePropertiesCollector` to dramatically speed up scans by skipping tables that are known to contain irrelevant data for the scan at hand.

We're using this [downstream in CockroachDB](https://github.com/cockroachdb/cockroach/blob/master/pkg/storage/engine/db.cc#L2009-L2022) already. With this feature, under ideal conditions, we can reduce the time of an incremental backup in  from hours to seconds.

FYI, the first commit in this PR fixes a segfault that I unfortunately have not figured out how to reproduce outside of CockroachDB. I'm hoping you accept it on the grounds that it is not correct to return 8-byte aligned memory from a call to `malloc` on some 64-bit platforms; one correct approach is to infer the necessary alignment from `std::max_align_t`, as done here. As noted in the first commit message, the bug is tickled by having a`std::function` in `struct ReadOptions`. That is, the following patch alone is enough to cause RocksDB to segfault when run from CockroachDB on Darwin.

```diff
 --- a/include/rocksdb/options.h
+++ b/include/rocksdb/options.h
@@ -1546,6 +1546,13 @@ struct ReadOptions {
   // Default: false
   bool ignore_range_deletions;

+  // A callback to determine whether relevant keys for this scan exist in a
+  // given table based on the table's properties. The callback is passed the
+  // properties of each table during iteration. If the callback returns false,
+  // the table will not be scanned.
+  // Default: empty (every table will be scanned)
+  std::function<bool(const TableProperties&)> table_filter;
+
   ReadOptions();
   ReadOptions(bool cksum, bool cache);
 };
```

/cc danhhz
Closes https://github.com/facebook/rocksdb/pull/2265

Differential Revision: D5054262

Pulled By: yiwu-arbug

fbshipit-source-id: dd6b28f2bba6cb8466250d8c5c542d3c92785476
2017-10-17 22:12:00 -07:00
..
2017-10-06 10:41:53 -07:00
2017-10-06 10:41:53 -07:00
2017-10-17 11:13:19 -07:00
2017-07-15 16:11:23 -07:00
2017-08-09 15:58:13 -07:00
2017-07-15 16:11:23 -07:00
2017-10-17 08:57:09 -07:00
2017-07-15 16:11:23 -07:00
2017-10-10 13:12:37 -07:00
2017-09-28 16:56:45 -07:00
2017-10-10 13:12:37 -07:00
2017-07-15 16:11:23 -07:00
2017-07-15 16:11:23 -07:00
2017-07-26 21:11:47 -07:00
2017-10-09 17:15:28 -07:00
2017-10-09 17:15:28 -07:00
2017-10-09 17:15:28 -07:00
2017-07-15 16:11:23 -07:00
2017-07-15 16:11:23 -07:00
2017-10-17 08:57:09 -07:00
2017-07-15 16:11:23 -07:00
2017-07-15 16:11:23 -07:00
2017-10-03 09:11:23 -07:00
2017-10-03 09:11:23 -07:00
2017-07-15 16:11:23 -07:00
2017-07-15 16:11:23 -07:00
2017-07-28 16:27:16 -07:00
2017-07-15 16:11:23 -07:00
2017-07-15 16:11:23 -07:00
2017-07-15 16:11:23 -07:00
2017-07-15 16:11:23 -07:00
2017-07-15 16:11:23 -07:00
2017-07-15 16:11:23 -07:00
2017-07-15 16:11:23 -07:00
2017-07-15 16:11:23 -07:00
2017-07-15 16:11:23 -07:00
2017-07-15 16:11:23 -07:00
2017-07-15 16:11:23 -07:00
2017-07-15 16:11:23 -07:00
2017-07-15 16:11:23 -07:00
2017-07-15 16:11:23 -07:00
2017-07-15 16:11:23 -07:00
2017-10-10 21:26:11 -07:00
2017-10-03 09:11:23 -07:00
2017-07-15 16:11:23 -07:00
2017-07-15 16:11:23 -07:00
2017-07-15 16:11:23 -07:00
2017-07-15 16:11:23 -07:00
2017-07-15 16:11:23 -07:00
2017-10-10 13:12:37 -07:00
2017-10-10 13:12:37 -07:00
2017-07-15 16:11:23 -07:00
2017-07-15 16:11:23 -07:00
2017-07-15 16:11:23 -07:00
2017-07-15 16:11:23 -07:00
2017-07-15 16:11:23 -07:00
2017-07-15 16:11:23 -07:00
2017-07-15 16:11:23 -07:00
2017-10-10 13:12:37 -07:00
2017-10-10 13:12:37 -07:00
2017-07-28 16:27:16 -07:00
2017-07-15 16:11:23 -07:00
2017-07-15 16:11:23 -07:00
2017-07-15 16:11:23 -07:00
2017-07-15 16:11:23 -07:00
2017-07-15 16:11:23 -07:00
2017-09-28 16:56:45 -07:00
2017-09-28 16:56:45 -07:00