rocksdb/util
Anand Ananthabhotla a27fce408e Auto recovery from out of space errors (#4164)
Summary:
This commit implements automatic recovery from a Status::NoSpace() error
during background operations such as write callback, flush and
compaction. The broad design is as follows -
1. Compaction errors are treated as soft errors and don't put the
database in read-only mode. A compaction is delayed until enough free
disk space is available to accomodate the compaction outputs, which is
estimated based on the input size. This means that users can continue to
write, and we rely on the WriteController to delay or stop writes if the
compaction debt becomes too high due to persistent low disk space
condition
2. Errors during write callback and flush are treated as hard errors,
i.e the database is put in read-only mode and goes back to read-write
only fater certain recovery actions are taken.
3. Both types of recovery rely on the SstFileManagerImpl to poll for
sufficient disk space. We assume that there is a 1-1 mapping between an
SFM and the underlying OS storage container. For cases where multiple
DBs are hosted on a single storage container, the user is expected to
allocate a single SFM instance and use the same one for all the DBs. If
no SFM is specified by the user, DBImpl::Open() will allocate one, but
this will be one per DB and each DB will recover independently. The
recovery implemented by SFM is as follows -
  a) On the first occurance of an out of space error during compaction,
subsequent
  compactions will be delayed until the disk free space check indicates
  enough available space. The required space is computed as the sum of
  input sizes.
  b) The free space check requirement will be removed once the amount of
  free space is greater than the size reserved by in progress
  compactions when the first error occured
  c) If the out of space error is a hard error, a background thread in
  SFM will poll for sufficient headroom before triggering the recovery
  of the database and putting it in write-only mode. The headroom is
  calculated as the sum of the write_buffer_size of all the DB instances
  associated with the SFM
4. EventListener callbacks will be called at the start and completion of
automatic recovery. Users can disable the auto recov ery in the start
callback, and later initiate it manually by calling DB::Resume()

Todo:
1. More extensive testing
2. Add disk full condition to db_stress (follow-on PR)
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4164

Differential Revision: D9846378

Pulled By: anand1976

fbshipit-source-id: 80ea875dbd7f00205e19c82215ff6e37da10da4a
2018-09-15 13:43:04 -07:00
..
aligned_buffer.h Fix Copying of data between buffers in FilePrefetchBuffer (#4100) 2018-07-11 12:28:13 -07:00
allocator.h Change RocksDB License 2017-07-15 16:11:23 -07:00
arena_test.cc arena: derive alignment unit from std::max_align_t 2017-10-17 11:13:19 -07:00
arena.cc comment unused parameters to turn on -Wunused-parameter flag 2018-04-12 17:59:16 -07:00
arena.h arena: derive alignment unit from std::max_align_t 2017-10-17 11:13:19 -07:00
auto_roll_logger_test.cc Per-thread unique test db names (#4135) 2018-07-13 17:27:39 -07:00
auto_roll_logger.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
auto_roll_logger.h Fix the Logger::Close() and DBImpl::Close() design pattern 2018-02-23 13:57:26 -08:00
autovector_test.cc fix memory leak in two_level_iterator 2018-04-15 17:26:26 -07:00
autovector.h Change RocksDB License 2017-07-15 16:11:23 -07:00
bloom_test.cc fix gflags namespace 2017-12-01 10:42:05 -08:00
bloom.cc Improve FullFilterBitsReader::HashMayMatch's doc (#4202) 2018-08-06 11:13:18 -07:00
build_version.cc.in Makefile: generate util/build_version.cc from .in file (#1384) 2016-10-25 11:31:39 -07:00
build_version.h Change RocksDB License 2017-07-15 16:11:23 -07:00
cast_util.h Add a missing "once" in .h 2017-07-31 12:12:03 -07:00
channel.h Support pragma once in all header files and cleanup some warnings (#4339) 2018-09-05 18:13:31 -07:00
coding_test.cc Coding.h: Added Fixed16 support (#4142) 2018-07-16 23:43:41 -07:00
coding.cc Enable MSVC W4 with a few exceptions. Fix warnings and bugs 2017-10-19 10:57:12 -07:00
coding.h Add path to WritableFileWriter. (#4039) 2018-08-23 10:12:58 -07:00
compaction_job_stats_impl.cc comment unused parameters to turn on -Wunused-parameter flag 2018-04-12 17:59:16 -07:00
comparator.cc Improve point-lookup performance using a data block hash index (#4174) 2018-08-15 14:30:03 -07:00
compression_context_cache.cc run make format for PR 3838 (#3954) 2018-06-05 12:58:02 -07:00
compression_context_cache.h run make format for PR 3838 (#3954) 2018-06-05 12:58:02 -07:00
compression.h Revert "Digest ZSTD compression dictionary once per SST file (#4251)" (#4347) 2018-09-06 09:58:34 -07:00
concurrent_arena.cc Cap concurrent arena's shard block size to 128KB (#4147) 2018-07-18 10:43:54 -07:00
concurrent_arena.h util: Fix coverity issues 2017-11-03 14:42:08 -07:00
core_local.h Change RocksDB License 2017-07-15 16:11:23 -07:00
crc32c_ppc_asm.S Updated CRC32 Power Optimization Changes 2017-08-31 14:16:30 -07:00
crc32c_ppc_constants.h Support pragma once in all header files and cleanup some warnings (#4339) 2018-09-05 18:13:31 -07:00
crc32c_ppc.c Updated CRC32 Power Optimization Changes 2017-08-31 14:16:30 -07:00
crc32c_ppc.h Support pragma once in all header files and cleanup some warnings (#4339) 2018-09-05 18:13:31 -07:00
crc32c_test.cc Build and tests fixes for Solaris Sparc (#4000) 2018-06-15 12:42:53 -07:00
crc32c.cc Add GCC 8 to Travis (#3433) 2018-07-13 10:58:06 -07:00
crc32c.h Updated CRC32 Power Optimization Changes 2017-08-31 14:16:30 -07:00
delete_scheduler_test.cc Auto recovery from out of space errors (#4164) 2018-09-15 13:43:04 -07:00
delete_scheduler.cc In delete scheduler, before ftruncate file for slow delete, check whether there is other hard links (#4093) 2018-07-09 15:28:12 -07:00
delete_scheduler.h In delete scheduler, before ftruncate file for slow delete, check whether there is other hard links (#4093) 2018-07-09 15:28:12 -07:00
duplicate_detector.h Improve db_stress with transactions 2018-04-18 16:32:35 -07:00
dynamic_bloom_test.cc Two code changes to make "clang analyze" happy (#4292) 2018-08-20 17:43:41 -07:00
dynamic_bloom.cc Use nullptr instead of NULL / 0 more consistently. 2018-03-07 12:42:12 -08:00
dynamic_bloom.h Enable MSVC W4 with a few exceptions. Fix warnings and bugs 2017-10-19 10:57:12 -07:00
event_logger_test.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
event_logger.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
event_logger.h Change RocksDB License 2017-07-15 16:11:23 -07:00
fault_injection_test_env.cc Allow DB resume after background errors (#3997) 2018-06-28 12:34:40 -07:00
fault_injection_test_env.h Auto recovery from out of space errors (#4164) 2018-09-15 13:43:04 -07:00
file_reader_writer_test.cc Add path to WritableFileWriter. (#4039) 2018-08-23 10:12:58 -07:00
file_reader_writer.cc Support pragma once in all header files and cleanup some warnings (#4339) 2018-09-05 18:13:31 -07:00
file_reader_writer.h Add path to WritableFileWriter. (#4039) 2018-08-23 10:12:58 -07:00
file_util.cc Sync CURRENT file during checkpoint (#4322) 2018-08-28 12:43:18 -07:00
file_util.h Sync CURRENT file during checkpoint (#4322) 2018-08-28 12:43:18 -07:00
filelock_test.cc Per-thread unique test db names (#4135) 2018-07-13 17:27:39 -07:00
filename.cc Store the return value of Fsync for check 2018-09-14 13:29:56 -07:00
filename.h BlobDB: GetLiveFiles and GetLiveFilesMetadata return relative path (#4326) 2018-08-31 12:12:49 -07:00
filter_policy.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
gflags_compat.h fix gflags namespace 2017-12-01 10:42:05 -08:00
hash_map.h Change RocksDB License 2017-07-15 16:11:23 -07:00
hash_test.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
hash.cc Add GCC 8 to Travis (#3433) 2018-07-13 10:58:06 -07:00
hash.h Change RocksDB License 2017-07-15 16:11:23 -07:00
heap_test.cc fix gflags namespace 2017-12-01 10:42:05 -08:00
heap.h Change RocksDB License 2017-07-15 16:11:23 -07:00
kv_map.h Change RocksDB License 2017-07-15 16:11:23 -07:00
log_buffer.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
log_buffer.h Change RocksDB License 2017-07-15 16:11:23 -07:00
log_write_bench.cc Per-thread unique test db names (#4135) 2018-07-13 17:27:39 -07:00
logging.h Change RocksDB License 2017-07-15 16:11:23 -07:00
memory_usage.h Change RocksDB License 2017-07-15 16:11:23 -07:00
murmurhash.cc Add GCC 8 to Travis (#3433) 2018-07-13 10:58:06 -07:00
murmurhash.h Change RocksDB License 2017-07-15 16:11:23 -07:00
mutexlock.h Change RocksDB License 2017-07-15 16:11:23 -07:00
ppc-opcode.h Support pragma once in all header files and cleanup some warnings (#4339) 2018-09-05 18:13:31 -07:00
random.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
random.h Change RocksDB License 2017-07-15 16:11:23 -07:00
rate_limiter_test.cc comment unused parameters to turn on -Wunused-parameter flag 2018-04-12 17:59:16 -07:00
rate_limiter.cc rate limit auto-tuning 2017-10-04 19:15:01 -07:00
rate_limiter.h rate limit auto-tuning 2017-10-04 19:15:01 -07:00
set_comparator.h WritePrepared Txn: Move DuplicateDetector to util 2018-03-05 23:57:12 -08:00
slice_transform_test.cc Per-thread unique test db names (#4135) 2018-07-13 17:27:39 -07:00
slice.cc use user_key and iterate_upper_bound to determine compatibility of bloom filters (#3899) 2018-06-26 15:57:26 -07:00
sst_file_manager_impl.cc Auto recovery from out of space errors (#4164) 2018-09-15 13:43:04 -07:00
sst_file_manager_impl.h Auto recovery from out of space errors (#4164) 2018-09-15 13:43:04 -07:00
status.cc move static msgs out of Status class (#4144) 2018-07-23 15:44:16 -07:00
stderr_logger.h Change RocksDB License 2017-07-15 16:11:23 -07:00
stop_watch.h Exclude time waiting for rate limiter from rocksdb.sst.read.micros (#4102) 2018-07-13 18:44:14 -07:00
string_util.cc Several small "fixes" 2018-02-15 16:57:37 -08:00
string_util.h Change RocksDB License 2017-07-15 16:11:23 -07:00
sync_point_impl.cc Avoid acquiring SyncPoint mutex when it is disabled (#3991) 2018-06-13 13:13:18 -07:00
sync_point_impl.h Avoid acquiring SyncPoint mutex when it is disabled (#3991) 2018-06-13 13:13:18 -07:00
sync_point.cc Refactor sync_point to make implementation either customizable or replaceable 2018-03-23 12:56:52 -07:00
sync_point.h Fix singleton destruction order of PosixEnv and SyncPoint (#3951) 2018-06-04 15:58:46 -07:00
testharness.cc Per-thread unique test db names (#4135) 2018-07-13 17:27:39 -07:00
testharness.h Per-thread unique test db names (#4135) 2018-07-13 17:27:39 -07:00
testutil.cc Add path to WritableFileWriter. (#4039) 2018-08-23 10:12:58 -07:00
testutil.h Support pragma once in all header files and cleanup some warnings (#4339) 2018-09-05 18:13:31 -07:00
thread_list_test.cc Use nullptr instead of NULL / 0 more consistently. 2018-03-07 12:42:12 -08:00
thread_local_test.cc Comment out unused variables 2018-03-05 13:13:41 -08:00
thread_local.cc Reformatting some recent changes (#4161) 2018-07-20 14:43:38 -07:00
thread_local.h Provide a way to override windows memory allocator with jemalloc for ZSTD 2018-06-04 12:12:48 -07:00
thread_operation.h Change RocksDB License 2017-07-15 16:11:23 -07:00
threadpool_imp.cc Suppress clang analyzer error (#4299) 2018-08-21 16:43:05 -07:00
threadpool_imp.h Support lowering CPU priority of background threads 2018-04-24 08:41:51 -07:00
timer_queue_test.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
timer_queue.h Change RocksDB License 2017-07-15 16:11:23 -07:00
trace_replay.cc Add tracing function of Seek() and SeekForPrev() to trace_replay (#4228) 2018-08-10 17:57:40 -07:00
trace_replay.h Add tracing function of Seek() and SeekForPrev() to trace_replay (#4228) 2018-08-10 17:57:40 -07:00
transaction_test_util.cc Support pragma once in all header files and cleanup some warnings (#4339) 2018-09-05 18:13:31 -07:00
transaction_test_util.h WritePrepared Txn: stress test 2017-12-06 09:42:28 -08:00
util.h Add GCC 8 to Travis (#3433) 2018-07-13 10:58:06 -07:00
xxhash.cc fixed typo 2017-06-05 11:27:34 -07:00
xxhash.h Prevent xxhash symbols from polluting global namespace 2015-03-12 12:07:10 -07:00