Repairer documentation improvement.
Summary: Adding verbosity to existing comments. Test Plan: None Reviewers: sdong CC: leveldb Task ID: #6718960 Blame Rev:
This commit is contained in:
parent
2f66d7f925
commit
697380f3d7
53
db/repair.cc
53
db/repair.cc
@ -7,24 +7,53 @@
|
||||
// Use of this source code is governed by a BSD-style license that can be
|
||||
// found in the LICENSE file. See the AUTHORS file for names of contributors.
|
||||
//
|
||||
// We recover the contents of the descriptor from the other files we find.
|
||||
// (1) Any log files are first converted to tables
|
||||
// (2) We scan every table to compute
|
||||
// (a) smallest/largest for the table
|
||||
// (b) largest sequence number in the table
|
||||
// (3) We generate descriptor contents:
|
||||
// - log number is set to zero
|
||||
// - next-file-number is set to 1 + largest file number we found
|
||||
// - last-sequence-number is set to largest sequence# found across
|
||||
// all tables (see 2c)
|
||||
// - compaction pointers are cleared
|
||||
// - every table file is added at level 0
|
||||
// Repairer does best effort recovery to recover as much data as possible after
|
||||
// a disaster without compromising consistency. It does not guarantee bringing
|
||||
// the database to a time consistent state.
|
||||
//
|
||||
// Repair process is broken into 4 phases:
|
||||
// (a) Find files
|
||||
// (b) Convert logs to tables
|
||||
// (c) Extract metadata
|
||||
// (d) Write Descriptor
|
||||
//
|
||||
// (a) Find files
|
||||
//
|
||||
// The repairer goes through all the files in the directory, and classifies them
|
||||
// based on their file name. Any file that cannot be identified by name will be
|
||||
// ignored.
|
||||
//
|
||||
// (b) Convert logs to table
|
||||
//
|
||||
// Every log file that is active is replayed. All sections of the file where the
|
||||
// checksum does not match is skipped over. We intentionally give preference to
|
||||
// data consistency.
|
||||
//
|
||||
// (c) Extract metadata
|
||||
//
|
||||
// We scan every table to compute
|
||||
// (1) smallest/largest for the table
|
||||
// (2) largest sequence number in the table
|
||||
//
|
||||
// If we are unable to scan the file, then we ignore the table.
|
||||
//
|
||||
// (d) Write Descriptor
|
||||
//
|
||||
// We generate descriptor contents:
|
||||
// - log number is set to zero
|
||||
// - next-file-number is set to 1 + largest file number we found
|
||||
// - last-sequence-number is set to largest sequence# found across
|
||||
// all tables (see 2c)
|
||||
// - compaction pointers are cleared
|
||||
// - every table file is added at level 0
|
||||
//
|
||||
// Possible optimization 1:
|
||||
// (a) Compute total size and use to pick appropriate max-level M
|
||||
// (b) Sort tables by largest sequence# in the table
|
||||
// (c) For each table: if it overlaps earlier table, place in level-0,
|
||||
// else place in level-M.
|
||||
// (d) We can provide options for time consistent recovery and unsafe recovery
|
||||
// (ignore checksum failure when applicable)
|
||||
// Possible optimization 2:
|
||||
// Store per-table metadata (smallest, largest, largest-seq#, ...)
|
||||
// in the table's meta section to speed up ScanTable.
|
||||
|
Loading…
Reference in New Issue
Block a user