rocksdb/table/block_based
Hui Xiao 920386f2b7 Detect (new) Bloom/Ribbon Filter construction corruption (#9342)
Summary:
Note: rebase on and merge after https://github.com/facebook/rocksdb/pull/9349, https://github.com/facebook/rocksdb/pull/9345, (optional) https://github.com/facebook/rocksdb/pull/9393
**Context:**
(Quoted from pdillinger) Layers of information during new Bloom/Ribbon Filter construction in building block-based tables includes the following:
a) set of keys to add to filter
b) set of hashes to add to filter (64-bit hash applied to each key)
c) set of Bloom indices to set in filter, with duplicates
d) set of Bloom indices to set in filter, deduplicated
e) final filter and its checksum

This PR aims to detect corruption (e.g, unexpected hardware/software corruption on data structures residing in the memory for a long time) from b) to e) and leave a) as future works for application level.
- b)'s corruption is detected by verifying the xor checksum of the hash entries calculated as the entries accumulate before being added to the filter. (i.e, `XXPH3FilterBitsBuilder::MaybeVerifyHashEntriesChecksum()`)
- c) - e)'s corruption is detected by verifying the hash entries indeed exists in the constructed filter by re-querying these hash entries in the filter (i.e, `FilterBitsBuilder::MaybePostVerify()`) after computing the block checksum (except for PartitionFilter, which is done right after each `FilterBitsBuilder::Finish` for impl simplicity - see code comment for more). For this stage of detection, we assume hash entries are not corrupted after checking on b) since the time interval from b) to c) is relatively short IMO.

Option to enable this feature of detection is `BlockBasedTableOptions::detect_filter_construct_corruption` which is false by default.

**Summary:**
- Implemented new functions `XXPH3FilterBitsBuilder::MaybeVerifyHashEntriesChecksum()` and `FilterBitsBuilder::MaybePostVerify()`
- Ensured hash entries, final filter and banding and their [cache reservation ](https://github.com/facebook/rocksdb/issues/9073) are released properly despite corruption
   - See [Filter.construction.artifacts.release.point.pdf ](https://github.com/facebook/rocksdb/files/7923487/Design.Filter.construction.artifacts.release.point.pdf) for high-level design
   -  Bundled and refactored hash entries's related artifact in XXPH3FilterBitsBuilder into `HashEntriesInfo` for better control on lifetime of these artifact during `SwapEntires`, `ResetEntries`
- Ensured RocksDB block-based table builder calls `FilterBitsBuilder::MaybePostVerify()` after constructing the filter by `FilterBitsBuilder::Finish()`
- When encountering such filter construction corruption, stop writing the filter content to files and mark such a block-based table building non-ok by storing the corruption status in the builder.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/9342

Test Plan:
- Added new unit test `DBFilterConstructionCorruptionTestWithParam.DetectCorruption`
- Included this new feature in `DBFilterConstructionReserveMemoryTestWithParam.ReserveMemory` as this feature heavily touch ReserveMemory's impl
   - For fallback case, I run `./filter_bench -impl=3 -detect_filter_construct_corruption=true -reserve_table_builder_memory=true -strict_capacity_limit=true  -quick -runs 10 | grep 'Build avg'` to make sure nothing break.
- Added to `filter_bench`: increased filter construction time by **30%**, mostly by `MaybePostVerify()`
   -  FastLocalBloom
       - Before change: `./filter_bench -impl=2 -quick -runs 10 | grep 'Build avg'`: **28.86643s**
       - After change:
          -  `./filter_bench -impl=2 -detect_filter_construct_corruption=false -quick -runs 10 | grep 'Build avg'` (expect a tiny increase due to MaybePostVerify is always called regardless): **27.6644s (-4% perf improvement might be due to now we don't drop bloom hash entry in `AddAllEntries` along iteration but in bulk later, same with the bypassing-MaybePostVerify case below)**
          - `./filter_bench -impl=2 -detect_filter_construct_corruption=true -quick -runs 10 | grep 'Build avg'` (expect acceptable increase): **34.41159s (+20%)**
          - `./filter_bench -impl=2 -detect_filter_construct_corruption=true -quick -runs 10 | grep 'Build avg'` (by-passing MaybePostVerify, expect minor increase): **27.13431s (-6%)**
    -  Standard128Ribbon
       - Before change: `./filter_bench -impl=3 -quick -runs 10 | grep 'Build avg'`: **122.5384s**
       - After change:
          - `./filter_bench -impl=3 -detect_filter_construct_corruption=false -quick -runs 10 | grep 'Build avg'` (expect a tiny increase due to MaybePostVerify is always called regardless - verified by removing MaybePostVerify under this case and found only +-1ns difference): **124.3588s (+2%)**
          - `./filter_bench -impl=3 -detect_filter_construct_corruption=true -quick -runs 10 | grep 'Build avg'`(expect acceptable increase): **159.4946s (+30%)**
          - `./filter_bench -impl=3 -detect_filter_construct_corruption=true -quick -runs 10 | grep 'Build avg'`(by-passing MaybePostVerify, expect minor increase) : **125.258s (+2%)**
- Added to `db_stress`: `make crash_test`, `./db_stress --detect_filter_construct_corruption=true`
- Manually smoke-tested: manually corrupted the filter construction in some db level tests with basic PUT and background flush. As expected, the error did get returned to users in subsequent PUT and Flush status.

Reviewed By: pdillinger

Differential Revision: D33746928

Pulled By: hx235

fbshipit-source-id: cb056426be5a7debc1cd16f23bc250f36a08ca57
2022-02-01 17:42:35 -08:00
..
binary_search_index_reader.cc Separate internal and user key comparators in BlockIter (#6944) 2020-07-07 17:26:16 -07:00
binary_search_index_reader.h Extend Get/MultiGet deadline support to table open (#6982) 2020-06-29 14:53:17 -07:00
block_based_filter_block_test.cc Add table properties for number of entries added to filters (#8323) 2021-05-21 17:11:32 -07:00
block_based_filter_block.cc Deallocate payload of BlockBasedTableBuilder::Rep::FilterBlockBuilder earlier for Full/PartitionedFilter (#9070) 2021-11-04 13:35:38 -07:00
block_based_filter_block.h Deallocate payload of BlockBasedTableBuilder::Rep::FilterBlockBuilder earlier for Full/PartitionedFilter (#9070) 2021-11-04 13:35:38 -07:00
block_based_table_builder.cc Detect (new) Bloom/Ribbon Filter construction corruption (#9342) 2022-02-01 17:42:35 -08:00
block_based_table_builder.h Fix segmentation fault in table_options.prepopulate_block_cache when used with partition_filters (#9263) 2021-12-08 12:44:38 -08:00
block_based_table_factory.cc Detect (new) Bloom/Ribbon Filter construction corruption (#9342) 2022-02-01 17:42:35 -08:00
block_based_table_factory.h Cleanup includes in dbformat.h (#8930) 2021-09-29 04:04:40 -07:00
block_based_table_iterator.cc Reuse internal auto readhead_size at each Level (expect L0) for Iterations (#9056) 2021-11-10 16:20:04 -08:00
block_based_table_iterator.h Fix bug in rocksdb internal automatic prefetching (#9234) 2021-11-30 22:53:10 -08:00
block_based_table_reader_impl.h New stable, fixed-length cache keys (#9126) 2021-12-16 17:15:13 -08:00
block_based_table_reader_test.cc Cleanup includes in dbformat.h (#8930) 2021-09-29 04:04:40 -07:00
block_based_table_reader.cc Ignore total_order_seek in DB::Get (#9427) 2022-01-31 19:46:42 -08:00
block_based_table_reader.h Ignore total_order_seek in DB::Get (#9427) 2022-01-31 19:46:42 -08:00
block_builder.cc Improve data block construction performance (#9040) 2021-10-19 12:36:21 -07:00
block_builder.h Improve data block construction performance (#9040) 2021-10-19 12:36:21 -07:00
block_like_traits.h fix lru caching test and fix reference binding to null pointer (#8326) 2021-05-24 08:37:00 -07:00
block_prefetcher.cc Improve / clean up meta block code & integrity (#9163) 2021-11-18 11:43:44 -08:00
block_prefetcher.h Fix bug in rocksdb internal automatic prefetching (#9234) 2021-11-30 22:53:10 -08:00
block_prefix_index.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
block_prefix_index.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
block_test.cc Improve / clean up meta block code & integrity (#9163) 2021-11-18 11:43:44 -08:00
block_type.h Fix and detect headers with missing dependencies (#8893) 2021-09-10 10:00:26 -07:00
block.cc Add NewMetaDataIterator method (#8692) 2021-12-21 11:32:49 -08:00
block.h Add NewMetaDataIterator method (#8692) 2021-12-21 11:32:49 -08:00
cachable_entry.h Parallelize secondary cache lookup in MultiGet (#8405) 2021-06-18 09:35:59 -07:00
data_block_footer.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
data_block_footer.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
data_block_hash_index_test.cc Fast path for detecting unchanged prefix_extractor (#9407) 2022-01-21 11:37:46 -08:00
data_block_hash_index.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
data_block_hash_index.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
filter_block_reader_common.cc Parallelize secondary cache lookup in MultiGet (#8405) 2021-06-18 09:35:59 -07:00
filter_block_reader_common.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
filter_block.h Detect (new) Bloom/Ribbon Filter construction corruption (#9342) 2022-02-01 17:42:35 -08:00
filter_policy_internal.h Detect (new) Bloom/Ribbon Filter construction corruption (#9342) 2022-02-01 17:42:35 -08:00
filter_policy.cc Detect (new) Bloom/Ribbon Filter construction corruption (#9342) 2022-02-01 17:42:35 -08:00
flush_block_policy.cc Restore Regex support for ObjectLibrary::Register, rename new APIs to allow old one to be deprecated in the future (#9362) 2022-01-11 06:33:48 -08:00
flush_block_policy.h Make FlushBlockPolicyFactory into a Customizable class (#8432) 2021-07-12 09:04:59 -07:00
full_filter_block_test.cc Detect (new) Bloom/Ribbon Filter construction corruption (#9342) 2022-02-01 17:42:35 -08:00
full_filter_block.cc Detect (new) Bloom/Ribbon Filter construction corruption (#9342) 2022-02-01 17:42:35 -08:00
full_filter_block.h Detect (new) Bloom/Ribbon Filter construction corruption (#9342) 2022-02-01 17:42:35 -08:00
hash_index_reader.cc Make ImmutableOptions struct that inherits from ImmutableCFOptions and ImmutableDBOptions (#8262) 2021-05-05 14:00:17 -07:00
hash_index_reader.h Extend Get/MultiGet deadline support to table open (#6982) 2020-06-29 14:53:17 -07:00
index_builder.cc Add (& fix) some simple source code checks (#8821) 2021-09-07 21:19:27 -07:00
index_builder.h Make db_basic_test pass assert status checked (#7452) 2020-09-29 09:49:04 -07:00
index_reader_common.cc Parallelize secondary cache lookup in MultiGet (#8405) 2021-06-18 09:35:59 -07:00
index_reader_common.h Divide block_based_table_reader.cc (#6527) 2020-03-12 21:41:50 -07:00
mock_block_based_table.h Make ImmutableOptions struct that inherits from ImmutableCFOptions and ImmutableDBOptions (#8262) 2021-05-05 14:00:17 -07:00
parsed_full_filter_block.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
parsed_full_filter_block.h Use new Insert and Lookup APIs in table reader to support secondary cache (#8315) 2021-05-21 18:29:12 -07:00
partitioned_filter_block_test.cc More refactoring ahead of footer & meta changes (#9240) 2021-12-10 08:13:26 -08:00
partitioned_filter_block.cc Detect (new) Bloom/Ribbon Filter construction corruption (#9342) 2022-02-01 17:42:35 -08:00
partitioned_filter_block.h Detect (new) Bloom/Ribbon Filter construction corruption (#9342) 2022-02-01 17:42:35 -08:00
partitioned_index_iterator.cc Reuse internal auto readhead_size at each Level (expect L0) for Iterations (#9056) 2021-11-10 16:20:04 -08:00
partitioned_index_iterator.h Fix bug in rocksdb internal automatic prefetching (#9234) 2021-11-30 22:53:10 -08:00
partitioned_index_reader.cc Improve / clean up meta block code & integrity (#9163) 2021-11-18 11:43:44 -08:00
partitioned_index_reader.h Clarify caching behavior for index and filter partitions (#9068) 2021-10-27 17:23:04 -07:00
reader_common.cc Improve / clean up meta block code & integrity (#9163) 2021-11-18 11:43:44 -08:00
reader_common.h Bring the Configurable options together (#5753) 2020-09-14 17:01:01 -07:00
uncompression_dict_reader.cc Cleanup includes in dbformat.h (#8930) 2021-09-29 04:04:40 -07:00
uncompression_dict_reader.h Extend Get/MultiGet deadline support to table open (#6982) 2020-06-29 14:53:17 -07:00