rocksdb/db
Peter Dillinger 25a0d0ca30 Fix block checksum for >=4GB, refactor (#6978)
Summary:
Although RocksDB falls over in various other ways with KVs
around 4GB or more, this change fixes how XXH32 and XXH64 were being
called by the block checksum code to support >= 4GB in case that should
ever happen, or the code copied for other uses.

This change is not a schema compatibility issue because the checksum
verification code would checksum the first (block_size + 1) mod 2^32
bytes while the checksum construction code would checksum the first
block_size mod 2^32 plus the compression type byte, meaning the
XXH32/64 checksums for >=4GB block would not match about 255/256 times.

While touching this code, I refactored to consolidate redundant
implementations, improving diagnostics and performance tracking in some
cases. Also used less confusing language in those diagnostics.

Makes https://github.com/facebook/rocksdb/issues/6875 obsolete.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6978

Test Plan:
I was able to write a test for this using an SST file writer
and VerifyChecksum in a reader. The test fails before the fix, though
I'm leaving the test disabled because I don't think it's worth the
expense of running regularly.

Reviewed By: gg814

Differential Revision: D22143260

Pulled By: pdillinger

fbshipit-source-id: 982993d16134e8c50bea2269047f901c1783726e
2020-06-19 16:18:24 -07:00
..
blob Maintain the set of linked SSTs in BlobFileMetaData (#6945) 2020-06-12 09:54:39 -07:00
compaction Store DB identity and DB session ID in SST files (#6983) 2020-06-17 10:57:40 -07:00
db_impl Fix a bug that causes iterator to return wrong result in a rare data race (#6973) 2020-06-18 10:16:38 -07:00
arena_wrapped_db_iter.cc Fix a bug that causes iterator to return wrong result in a rare data race (#6973) 2020-06-18 10:16:38 -07:00
arena_wrapped_db_iter.h Iterator with timestamp (#6255) 2020-03-06 16:24:27 -08:00
builder.cc Store DB identity and DB session ID in SST files (#6983) 2020-06-17 10:57:40 -07:00
builder.h Store DB identity and DB session ID in SST files (#6983) 2020-06-17 10:57:40 -07:00
c_test.c Add (some) getters for options to the C API (#6925) 2020-06-03 17:08:50 -07:00
c.cc Add (some) getters for options to the C API (#6925) 2020-06-03 17:08:50 -07:00
column_family_test.cc Revert "Update googletest from 1.8.1 to 1.10.0 (#6808)" (#6923) 2020-06-03 15:55:03 -07:00
column_family.cc Attempt to recover from db with missing table files (#6334) 2020-03-20 19:30:48 -07:00
column_family.h Attempt to recover from db with missing table files (#6334) 2020-03-20 19:30:48 -07:00
compact_files_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
compacted_db_impl.cc return timestamp from get (#6409) 2020-03-02 16:01:00 -08:00
compacted_db_impl.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
comparator_db_test.cc Revert "Update googletest from 1.8.1 to 1.10.0 (#6808)" (#6923) 2020-06-03 15:55:03 -07:00
convenience.cc sst_dump to reduce number of file reads (#6836) 2020-05-12 18:23:33 -07:00
corruption_test.cc Check iterator status BlockBasedTableReader::VerifyChecksumInBlocks() (#6909) 2020-06-05 11:08:25 -07:00
cuckoo_table_db_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
db_basic_test.cc Fail recovery when MANIFEST record checksum mismatch (#6996) 2020-06-18 10:09:12 -07:00
db_block_cache_test.cc Fix potential overflow of unsigned type in for loop (#6902) 2020-06-02 15:05:07 -07:00
db_bloom_filter_test.cc Revert "Update googletest from 1.8.1 to 1.10.0 (#6808)" (#6923) 2020-06-03 15:55:03 -07:00
db_compaction_filter_test.cc Revert "Update googletest from 1.8.1 to 1.10.0 (#6808)" (#6923) 2020-06-03 15:55:03 -07:00
db_compaction_test.cc make L0 index/filter pinned memory usage predictable (#6911) 2020-06-09 16:51:23 -07:00
db_dynamic_level_test.cc C++20 compatibility (#6697) 2020-04-20 13:24:25 -07:00
db_encryption_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
db_filesnapshot.cc Fix db_stress when GetLiveFiles() flushes dropped CF (#6805) 2020-05-04 17:45:49 -07:00
db_flush_test.cc Revert "Update googletest from 1.8.1 to 1.10.0 (#6808)" (#6923) 2020-06-03 15:55:03 -07:00
db_info_dumper.cc Add a DB Session ID (#6959) 2020-06-15 10:47:02 -07:00
db_info_dumper.h Add a DB Session ID (#6959) 2020-06-15 10:47:02 -07:00
db_inplace_update_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
db_io_failure_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
db_iter_stress_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
db_iter_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
db_iter.cc Add timestamp to delete (#6253) 2020-05-28 10:40:03 -07:00
db_iter.h make iterator return versions between timestamp bounds (#6544) 2020-04-10 09:51:58 -07:00
db_iterator_test.cc Revert "Update googletest from 1.8.1 to 1.10.0 (#6808)" (#6923) 2020-06-03 15:55:03 -07:00
db_log_iter_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
db_logical_block_size_cache_test.cc Get block size only in direct IO mode (#6522) 2020-03-20 15:26:10 -07:00
db_memtable_test.cc Remove racially charged terms "whitelist" and "blacklist" (#7008) 2020-06-19 15:27:32 -07:00
db_merge_operand_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
db_merge_operator_test.cc Revert "Update googletest from 1.8.1 to 1.10.0 (#6808)" (#6923) 2020-06-03 15:55:03 -07:00
db_options_test.cc Remove the support of setting CompressionOptions.parallel_threads from string for now (#6782) 2020-04-30 17:01:17 -07:00
db_properties_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
db_range_del_test.cc Fix potential overflow of unsigned type in for loop (#6902) 2020-06-02 15:05:07 -07:00
db_sst_test.cc Add logs and stats in DeleteScheduler (#6927) 2020-06-05 09:43:04 -07:00
db_statistics_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
db_table_properties_test.cc Store DB identity and DB session ID in SST files (#6983) 2020-06-17 10:57:40 -07:00
db_tailing_iter_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
db_test_util.cc Fix failure to write output in SpecialEnv::GetCurrentTime (#6803) 2020-05-05 13:11:29 -07:00
db_test_util.h Fix failure to write output in SpecialEnv::GetCurrentTime (#6803) 2020-05-05 13:11:29 -07:00
db_test.cc Add a DB Session ID (#6959) 2020-06-15 10:47:02 -07:00
db_test2.cc Fix a bug that causes iterator to return wrong result in a rare data race (#6973) 2020-06-18 10:16:38 -07:00
db_universal_compaction_test.cc Revert "Update googletest from 1.8.1 to 1.10.0 (#6808)" (#6923) 2020-06-03 15:55:03 -07:00
db_wal_test.cc Add OptionTypeInfo::Enum and related methods (#6423) 2020-05-05 15:04:04 -07:00
db_with_timestamp_basic_test.cc Revert "Update googletest from 1.8.1 to 1.10.0 (#6808)" (#6923) 2020-06-03 15:55:03 -07:00
db_with_timestamp_compaction_test.cc Compaction with timestamp: input boundaries (#6645) 2020-04-10 16:05:49 -07:00
db_write_test.cc Revert "Update googletest from 1.8.1 to 1.10.0 (#6808)" (#6923) 2020-06-03 15:55:03 -07:00
dbformat_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
dbformat.cc Add timestamp to delete (#6253) 2020-05-28 10:40:03 -07:00
dbformat.h Add timestamp to delete (#6253) 2020-05-28 10:40:03 -07:00
deletefile_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
error_handler_fs_test.cc Revamp cache_bench to resemble a real workload (#6629) 2020-04-03 10:26:49 -07:00
error_handler.cc Be able to decrease background thread's CPU priority when creating database backup (#6602) 2020-03-28 19:07:25 -07:00
error_handler.h Pass IOStatus to write path and set retryable IO Error as hard error in BG jobs (#6487) 2020-03-27 16:04:43 -07:00
event_helpers.cc Store DB identity and DB session ID in SST files (#6983) 2020-06-17 10:57:40 -07:00
event_helpers.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
experimental.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
external_sst_file_basic_test.cc Ingest SST files with checksum information (#6891) 2020-06-11 14:27:36 -07:00
external_sst_file_ingestion_job.cc Ingest SST files with checksum information (#6891) 2020-06-11 14:27:36 -07:00
external_sst_file_ingestion_job.h Ingest SST files with checksum information (#6891) 2020-06-11 14:27:36 -07:00
external_sst_file_test.cc Fix block checksum for >=4GB, refactor (#6978) 2020-06-19 16:18:24 -07:00
fault_injection_test.cc Revert "Update googletest from 1.8.1 to 1.10.0 (#6808)" (#6923) 2020-06-03 15:55:03 -07:00
file_indexer_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
file_indexer.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
file_indexer.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
filename_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
flush_job_test.cc Store DB identity and DB session ID in SST files (#6983) 2020-06-17 10:57:40 -07:00
flush_job.cc Store DB identity and DB session ID in SST files (#6983) 2020-06-17 10:57:40 -07:00
flush_job.h Store DB identity and DB session ID in SST files (#6983) 2020-06-17 10:57:40 -07:00
flush_scheduler.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
flush_scheduler.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
forward_iterator_bench.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
forward_iterator.cc make L0 index/filter pinned memory usage predictable (#6911) 2020-06-09 16:51:23 -07:00
forward_iterator.h Properly report IO errors when IndexType::kBinarySearchWithFirstKey is used (#6621) 2020-04-15 17:40:44 -07:00
import_column_family_job.cc Fix potential size_t overflow in import_column_family (#6762) 2020-04-30 08:40:42 -07:00
import_column_family_job.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
import_column_family_test.cc Use a per-thread path for the export directory in import_column_family_test (#6962) 2020-06-10 14:04:07 -07:00
internal_stats.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
internal_stats.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
job_context.h Expose the set of live blob files from Version/VersionSet (#6785) 2020-05-04 15:08:13 -07:00
listener_test.cc Move BlobDB related files under db/ to db/blob/ (#6519) 2020-03-12 11:00:56 -07:00
log_format.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
log_reader.cc Fail point-in-time WAL recovery upon IOError reading WAL (#6963) 2020-06-11 18:42:10 -07:00
log_reader.h Fix tabs and lint-ignores (#6734) 2020-04-20 11:39:31 -07:00
log_test.cc Revert "Update googletest from 1.8.1 to 1.10.0 (#6808)" (#6923) 2020-06-03 15:55:03 -07:00
log_writer.cc Fail recovery when MANIFEST record checksum mismatch (#6996) 2020-06-18 10:09:12 -07:00
log_writer.h Pass IOStatus to write path and set retryable IO Error as hard error in BG jobs (#6487) 2020-03-27 16:04:43 -07:00
logs_with_prep_tracker.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
logs_with_prep_tracker.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
lookup_key.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
malloc_stats.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
malloc_stats.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
manual_compaction_test.cc Skip high levels with no key falling in the range in CompactRange (#6482) 2020-03-04 20:15:25 -08:00
memtable_list_test.cc Pass IOStatus to write path and set retryable IO Error as hard error in BG jobs (#6487) 2020-03-27 16:04:43 -07:00
memtable_list.cc Fix some defects reported by Coverity Scan (#6933) 2020-06-04 15:46:27 -07:00
memtable_list.h Fix some defects reported by Coverity Scan (#6933) 2020-06-04 15:46:27 -07:00
memtable.cc Add timestamp to delete (#6253) 2020-05-28 10:40:03 -07:00
memtable.h return timestamp from get (#6409) 2020-03-02 16:01:00 -08:00
merge_context.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
merge_helper_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
merge_helper.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
merge_helper.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
merge_operator.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
merge_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
obsolete_files_test.cc Find/purge obsolete blob files (#6807) 2020-05-07 09:32:51 -07:00
options_file_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
perf_context_test.cc C++20 compatibility (#6697) 2020-04-20 13:24:25 -07:00
pinned_iterators_manager.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
plain_table_db_test.cc Revert "Update googletest from 1.8.1 to 1.10.0 (#6808)" (#6923) 2020-06-03 15:55:03 -07:00
pre_release_callback.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
prefix_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
range_del_aggregator_bench.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
range_del_aggregator_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
range_del_aggregator.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
range_del_aggregator.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
range_tombstone_fragmenter_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
range_tombstone_fragmenter.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
range_tombstone_fragmenter.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
read_callback.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
repair_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
repair.cc Store DB identity and DB session ID in SST files (#6983) 2020-06-17 10:57:40 -07:00
snapshot_checker.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
snapshot_impl.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
snapshot_impl.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
table_cache.cc make L0 index/filter pinned memory usage predictable (#6911) 2020-06-09 16:51:23 -07:00
table_cache.h make L0 index/filter pinned memory usage predictable (#6911) 2020-06-09 16:51:23 -07:00
table_properties_collector_test.cc Revert "Update googletest from 1.8.1 to 1.10.0 (#6808)" (#6923) 2020-06-03 15:55:03 -07:00
table_properties_collector.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
table_properties_collector.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
transaction_log_impl.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
transaction_log_impl.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
trim_history_scheduler.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
trim_history_scheduler.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
version_builder_test.cc Maintain the set of linked SSTs in BlobFileMetaData (#6945) 2020-06-12 09:54:39 -07:00
version_builder.cc Maintain the set of linked SSTs in BlobFileMetaData (#6945) 2020-06-12 09:54:39 -07:00
version_builder.h make L0 index/filter pinned memory usage predictable (#6911) 2020-06-09 16:51:23 -07:00
version_edit_handler.cc Fail recovery when MANIFEST record checksum mismatch (#6996) 2020-06-18 10:09:12 -07:00
version_edit_handler.h Fail recovery when MANIFEST record checksum mismatch (#6996) 2020-06-18 10:09:12 -07:00
version_edit_test.cc Revert "Added the safe-to-ignore tag to version_edit (#6530)" (#6569) 2020-03-23 10:27:47 -07:00
version_edit.cc Remove unnecessary inclusion of version_edit.h in env (#6952) 2020-06-07 21:56:55 -07:00
version_edit.h Remove unnecessary inclusion of version_edit.h in env (#6952) 2020-06-07 21:56:55 -07:00
version_set_test.cc Add convenience method GetFileMetaDataByNumber (#6940) 2020-06-08 16:01:59 -07:00
version_set.cc Fail recovery when MANIFEST record checksum mismatch (#6996) 2020-06-18 10:09:12 -07:00
version_set.h Fail recovery when MANIFEST record checksum mismatch (#6996) 2020-06-18 10:09:12 -07:00
wal_manager_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
wal_manager.cc Fix FilterBench when RTTI=0 (#6732) 2020-04-29 13:09:23 -07:00
wal_manager.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
write_batch_base.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
write_batch_internal.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
write_batch_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
write_batch.cc Add timestamp to delete (#6253) 2020-05-28 10:40:03 -07:00
write_callback_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
write_callback.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
write_controller_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
write_controller.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
write_controller.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
write_thread.cc fix some spelling typos (#6464) 2020-02-28 14:14:03 -08:00
write_thread.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00