rocksdb/util
Zhichao Cao a904c62d28 Using existing crc32c checksum in checksum handoff for Manifest and WAL (#8412)
Summary:
In PR https://github.com/facebook/rocksdb/issues/7523 , checksum handoff is introduced in RocksDB for WAL, Manifest, and SST files. When user enable checksum handoff for a certain type of file, before the data is written to the lower layer storage system, we calculate the checksum (crc32c) of each piece of data and pass the checksum down with the data, such that data verification can be down by the lower layer storage system if it has the capability. However, it cannot cover the whole lifetime of the data in the memory and also it potentially introduces extra checksum calculation overhead.

In this PR, we introduce a new interface in WritableFileWriter::Append, which allows the caller be able to pass the data and the checksum (crc32c) together. In this way, WritableFileWriter can directly use the pass-in checksum (crc32c) to generate the checksum of data being passed down to the storage system. It saves the calculation overhead and achieves higher protection coverage. When a new checksum is added with the data, we use Crc32cCombine https://github.com/facebook/rocksdb/issues/8305 to combine the existing checksum and the new checksum. To avoid the segmenting of data by rate-limiter before it is stored, rate-limiter is called enough times to accumulate enough credits for a certain write. This design only support Manifest and WAL which use log_writer in the current stage.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/8412

Test Plan: make check, add new testing cases.

Reviewed By: anand1976

Differential Revision: D29151545

Pulled By: zhichao-cao

fbshipit-source-id: 75e2278c5126cfd58393c67b1efd18dcc7a30772
2021-06-25 00:47:17 -07:00
..
aligned_buffer.h Fix wrong comments about function TruncateToPageBoundary. (#6975) 2020-10-07 12:34:34 -07:00
autovector_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
autovector.h Change autovector to have a reserved size in LITE mode (#6868) 2020-05-21 14:48:10 -07:00
bloom_impl.h Ribbon: InterleavedSolutionStorage (#7598) 2020-11-03 12:46:36 -08:00
bloom_test.cc Revert Ribbon starting level support from #8198 (#8212) 2021-04-20 19:46:40 -07:00
build_version.cc.in Make builds reproducible (#7866) 2021-01-28 17:42:16 -08:00
cast_util.h Replace reinterpret_cast with static_cast_with_check (#7067) 2020-07-02 19:25:41 -07:00
channel.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
coding_lean.h Refine Ribbon configuration, improve testing, add Homogeneous (#7879) 2021-02-26 08:50:42 -08:00
coding_test.cc Fix potential overflow of unsigned type in for loop (#6902) 2020-06-02 15:05:07 -07:00
coding.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
coding.h Refine Ribbon configuration, improve testing, add Homogeneous (#7879) 2021-02-26 08:50:42 -08:00
compaction_job_stats_impl.cc Update compaction statistics to include the amount of data read from blob files (#8022) 2021-03-04 00:43:48 -08:00
comparator.cc Make Comparator into a Customizable Object (#8336) 2021-06-11 06:22:59 -07:00
compression_context_cache.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
compression_context_cache.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
compression.h Limit buffering for collecting samples for compression dictionary (#7970) 2021-02-19 14:09:54 -08:00
concurrent_task_limiter_impl.cc Fix ConcurrentTaskLimiter token release for shutdown (#8253) 2021-05-04 17:27:24 -07:00
concurrent_task_limiter_impl.h Fix ConcurrentTaskLimiter token release for shutdown (#8253) 2021-05-04 17:27:24 -07:00
core_local.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
crc32c_arm64.cc Mac M1 crc32 intrinsics ARM64 check support proposal (#7893) 2021-03-10 09:05:56 -08:00
crc32c_arm64.h Fix compilation on Apple Silicon (#7714) 2020-12-04 15:22:33 -08:00
crc32c_ppc_asm.S Fix Compilation on ppc64le using Clang 11 (#7713) 2020-12-01 11:21:44 -08:00
crc32c_ppc_constants.h Remove PATENTS text from a few straggler files (#5326) 2019-05-21 16:22:35 -07:00
crc32c_ppc.c Fix Compilation on ppc64le using Clang 11 (#7713) 2020-12-01 11:21:44 -08:00
crc32c_ppc.h Fix Compilation on ppc64le using Clang 11 (#7713) 2020-12-01 11:21:44 -08:00
crc32c_test.cc Implementation of Crc32c combine function (#8305) 2021-06-16 18:30:34 -07:00
crc32c.cc Using existing crc32c checksum in checksum handoff for Manifest and WAL (#8412) 2021-06-25 00:47:17 -07:00
crc32c.h Implementation of Crc32c combine function (#8305) 2021-06-16 18:30:34 -07:00
defer_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
defer.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
duplicate_detector.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
dynamic_bloom_test.cc Use SystemClock* instead of std::shared_ptr<SystemClock> in lower level routines (#8033) 2021-03-15 04:34:11 -07:00
dynamic_bloom.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
dynamic_bloom.h Genericize and clean up FastRange (#7436) 2020-09-28 11:35:00 -07:00
fastrange.h Genericize and clean up FastRange (#7436) 2020-09-28 11:35:00 -07:00
file_checksum_helper.cc Refactor with VersionEditHandler (#6581) 2020-11-11 08:00:14 -08:00
file_checksum_helper.h Real fix for race in backup custom checksum checking (#7309) 2020-08-26 10:39:20 -07:00
file_reader_writer_test.cc Using existing crc32c checksum in checksum handoff for Manifest and WAL (#8412) 2021-06-25 00:47:17 -07:00
filelock_test.cc Fix MSVC-related build issues (#7439) 2020-10-01 09:23:04 -07:00
filter_bench.cc Rename variables in ImmutableCFOptions to avoid conflicts with ImmutableDBOptions (#8227) 2021-04-26 12:43:45 -07:00
gflags_compat.h Fix many tests to run with MEM_ENV and ENCRYPTED_ENV; Introduce a MemoryFileSystem class (#7566) 2020-10-27 10:33:09 -07:00
hash_map.h Change HashMap::Insert()'s value to a const reference (#6567) 2020-03-20 14:59:54 -07:00
hash_test.cc Use NPHash64 in more places (#7632) 2020-11-10 23:42:13 -08:00
hash.cc Integrity protection for live updates to WriteBatch (#7748) 2021-01-29 12:18:58 -08:00
hash.h Integrity protection for live updates to WriteBatch (#7748) 2021-01-29 12:18:58 -08:00
heap_test.cc Revert "Update googletest from 1.8.1 to 1.10.0 (#6808)" (#6923) 2020-06-03 15:55:03 -07:00
heap.h Avoid self-move-assign in pop operation of binary heap. (#7942) 2021-02-19 13:47:25 -08:00
kv_map.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
log_write_bench.cc Add a SystemClock class to capture the time functions of an Env (#7858) 2021-01-25 22:09:11 -08:00
math128.h Refine Ribbon configuration, improve testing, add Homogeneous (#7879) 2021-02-26 08:50:42 -08:00
math.h Fix MSVC-related build issues (#7439) 2020-10-01 09:23:04 -07:00
murmurhash.cc C++20 compatibility (#6697) 2020-04-20 13:24:25 -07:00
murmurhash.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
mutexlock.h Prevents Table Cache to open same files more times (#6707) 2020-04-21 13:16:31 -07:00
ppc-opcode.h Remove PATENTS text from a few straggler files (#5326) 2019-05-21 16:22:35 -07:00
random_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
random.cc Add De/Serialization for CompactionInput/Result (#8247) 2021-05-12 12:36:43 -07:00
random.h Add De/Serialization for CompactionInput/Result (#8247) 2021-05-12 12:36:43 -07:00
rate_limiter_test.cc Add a SystemClock class to capture the time functions of an Env (#7858) 2021-01-25 22:09:11 -08:00
rate_limiter.cc Use SystemClock* instead of std::shared_ptr<SystemClock> in lower level routines (#8033) 2021-03-15 04:34:11 -07:00
rate_limiter.h Use SystemClock* instead of std::shared_ptr<SystemClock> in lower level routines (#8033) 2021-03-15 04:34:11 -07:00
repeatable_thread_test.cc Use SystemClock* instead of std::shared_ptr<SystemClock> in lower level routines (#8033) 2021-03-15 04:34:11 -07:00
repeatable_thread.h Use SystemClock* instead of std::shared_ptr<SystemClock> in lower level routines (#8033) 2021-03-15 04:34:11 -07:00
ribbon_alg.h Refine Ribbon configuration, improve testing, add Homogeneous (#7879) 2021-02-26 08:50:42 -08:00
ribbon_config.cc Refine Ribbon configuration, improve testing, add Homogeneous (#7879) 2021-02-26 08:50:42 -08:00
ribbon_config.h Refine Ribbon configuration, improve testing, add Homogeneous (#7879) 2021-02-26 08:50:42 -08:00
ribbon_impl.h Refine Ribbon configuration, improve testing, add Homogeneous (#7879) 2021-02-26 08:50:42 -08:00
ribbon_test.cc Use SystemClock* instead of std::shared_ptr<SystemClock> in lower level routines (#8033) 2021-03-15 04:34:11 -07:00
set_comparator.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
slice_test.cc Handoff checksum Implementation (#7523) 2021-02-10 22:20:32 -08:00
slice_transform_test.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
slice.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
status.cc Add remote compaction public API (#8300) 2021-05-19 21:41:31 -07:00
stderr_logger.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
stop_watch.h Use SystemClock* instead of std::shared_ptr<SystemClock> in lower level routines (#8033) 2021-03-15 04:34:11 -07:00
string_util.cc Add remote compaction public API (#8300) 2021-05-19 21:41:31 -07:00
string_util.h Add remote compaction public API (#8300) 2021-05-19 21:41:31 -07:00
thread_guard.h Introduce a ThreadGuard class and use it in ExternalSSTFileTest.PickedLevelBug (#8112) 2021-03-25 22:08:58 -07:00
thread_list_test.cc fix thread status synchronization in thread_list_test (#7825) 2021-01-04 10:46:24 -08:00
thread_local_test.cc Add StartThread type checking wrapper (#8303) 2021-05-19 16:51:13 -07:00
thread_local.cc Fix typo in ThreadData comment (#7131) 2020-07-15 09:23:23 -07:00
thread_local.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
thread_operation.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
threadpool_imp.cc Use thread-safe strerror_r() to get error message (#8087) 2021-03-24 23:07:27 -07:00
threadpool_imp.h Make it able to lower cpu priority to specific level in threadpool (#6969) 2020-06-13 13:25:20 -07:00
timer_queue_test.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
timer_queue.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
timer_test.cc Use SystemClock* instead of std::shared_ptr<SystemClock> in lower level routines (#8033) 2021-03-15 04:34:11 -07:00
timer.h Use SystemClock* instead of std::shared_ptr<SystemClock> in lower level routines (#8033) 2021-03-15 04:34:11 -07:00
user_comparator_wrapper.h Enable backward iterator for keys with user-defined timestamp (#8035) 2021-03-10 11:15:46 -08:00
vector_iterator.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
work_queue_test.cc Add pipelined & parallel compression optimization (#6262) 2020-04-01 16:40:18 -07:00
work_queue.h Revamp cache_bench to resemble a real workload (#6629) 2020-04-03 10:26:49 -07:00
xxh3p.h Fix MSVC-related build issues (#7439) 2020-10-01 09:23:04 -07:00
xxhash.cc Remove unused includes (#7604) 2020-10-28 23:22:27 -07:00
xxhash.h Misc hashing updates / upgrades (#5909) 2019-10-24 17:16:46 -07:00