A library that provides an embeddable, persistent key-value store for fast storage.
Andrew Kryczka b78ed0460b fix ReadaheadRandomAccessFile/iterator prefetch bug
Summary:
`ReadaheadRandomAccessFile` is used by iterators for file reads in several cases: in compaction when `compaction_readahead_size > 0` or `use_direct_io_for_flush_and_compaction == true`, and in a user iterator when `ReadOptions::readahead_size > 0`. `ReadaheadRandomAccessFile` maintains an internal buffer for readahead data. It assumes that if the buffer's length is less than `ReadaheadRandomAccessFile::readahead_size_`, which is fixed in the constructor, then EOF has been reached, so it doesn't try reading further.

Recently, d938226af4 started calling `RandomAccessFile::Prefetch` with various lengths: 8KB, 16KB, etc. When the `RandomAccessFile` is a `ReadaheadRandomAccessFile`, it triggers the above condition and incorrectly determines EOF. If a block is partially in the readahead buffer and EOF is incorrectly decided, the result is a truncated data block.
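
The failure mode described above can be illustrated with a small, self-contained toy model. This is a sketch only and is not RocksDB code: `ToyReadaheadFile`, its members, and the `short_buffer_means_eof` switch are hypothetical names invented for illustration, and the "file" is just an in-memory string. With the switch on, a buffer shorter than the configured readahead size is treated as EOF, which mirrors the assumption described in the summary. With the switch off, the remainder is always read from the file, which is one possible remedy and not necessarily the one this commit applies.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical toy model of a readahead wrapper; not the RocksDB class.
class ToyReadaheadFile {
 public:
  ToyReadaheadFile(std::string file, std::size_t readahead_size,
                   bool short_buffer_means_eof)
      : file_(std::move(file)),
        readahead_size_(readahead_size),
        short_buffer_means_eof_(short_buffer_means_eof) {}

  // Fill the buffer with up to n bytes starting at offset.
  // n may be smaller than readahead_size_, as with 8KB or 16KB prefetches.
  void Prefetch(std::size_t offset, std::size_t n) {
    buffer_offset_ = offset;
    buffer_ = file_.substr(std::min(offset, file_.size()), n);
  }

  // Read n bytes at offset, serving from the buffer where possible.
  std::string Read(std::size_t offset, std::size_t n) {
    std::string result;
    if (offset >= buffer_offset_ && offset < buffer_offset_ + buffer_.size()) {
      result = buffer_.substr(offset - buffer_offset_, n);
    }
    if (result.size() == n) {
      return result;  // Full hit in the buffer.
    }
    // Partial hit. The flawed assumption: a buffer shorter than
    // readahead_size_ can only mean the file ended there.
    if (short_buffer_means_eof_ && !result.empty() &&
        buffer_.size() < readahead_size_) {
      return result;  // Truncated read, although the file has more data.
    }
    // Otherwise fetch the remainder from the file and append it.
    const std::size_t missing = n - result.size();
    Prefetch(offset + result.size(), std::max(readahead_size_, missing));
    result += buffer_.substr(0, missing);
    return result;
  }

 private:
  std::string file_;
  std::size_t readahead_size_;
  bool short_buffer_means_eof_;
  std::size_t buffer_offset_ = 0;
  std::string buffer_;
};
```

In this model, take a 100-byte file and a readahead size of 32. After `Prefetch(0, 8)`, a `Read(4, 10)` returns only 4 bytes when `short_buffer_means_eof` is true, because the 8-byte buffer is mistaken for the end of the file. With the switch off, the same call returns all 10 bytes.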

The problem is reproducible:

```
TEST_TMPDIR=/data/compaction_bench ./db_bench -benchmarks=fillrandom -write_buffer_size=1048576 -target_file_size_base=1048576 -block_size=18384 -use_direct_io_for_flush_and_compaction=true
...
put error: Corruption: truncated block read from /data/compaction_bench/dbbench/000014.sst offset 20245, expected 10143 bytes, got 8427
```
Closes https://github.com/facebook/rocksdb/pull/3454

Differential Revision: D6869405

Pulled By: ajkr

fbshipit-source-id: 87001c299e7600a37c0dcccbd0368e0954c929cf
2018-02-01 09:42:09 -08:00
buckifier Suppress lint in old files 2018-01-29 12:56:42 -08:00
build_tools Suppress lint in old files 2018-01-29 12:56:42 -08:00
cache fix gflags namespace 2017-12-01 10:42:05 -08:00
cmake add missing config checks to CMakeLists.txt 2017-11-30 22:57:00 -08:00
coverage Suppress lint in old files 2018-01-29 12:56:42 -08:00
db WritePrepared Txn: Duplicate Keys, Memtable part 2018-01-31 18:57:07 -08:00
docs fix Gemfile.lock nokogiri dependencies 2018-01-11 20:11:32 -08:00
env Add a Close() method to DB to return status when closing a db 2018-01-16 11:08:57 -08:00
examples Pinnableslice examples and blog post 2017-08-24 12:26:07 -07:00
hdfs Suppress lint in old files 2018-01-29 12:56:42 -08:00
include/rocksdb WritePrepared Txn: Duplicate Keys, Memtable part 2018-01-31 18:57:07 -08:00
java Suppress lint in old files 2018-01-29 12:56:42 -08:00
memtable WritePrepared Txn: Duplicate Keys, Memtable part 2018-01-31 18:57:07 -08:00
monitoring fix ThreadStatus for bottom-pri compaction threads 2017-12-14 14:57:49 -08:00
options DB::DumpSupportInfo should log all supported compression types 2018-01-23 14:44:12 -08:00
port FreeBSD build support for RocksDB and RocksJava 2018-01-11 13:29:55 -08:00
table Update rocksdb.read.block.get.micros when block cache disabled 2018-01-31 14:26:52 -08:00
third-party Enable MSVC W4 with a few exceptions. Fix warnings and bugs 2017-10-19 10:57:12 -07:00
tools db_bench: sanity check CuckooTable with mmap_read option 2018-01-29 14:27:32 -08:00
util fix ReadaheadRandomAccessFile/iterator prefetch bug 2018-02-01 09:42:09 -08:00
utilities Blob DB: miscellaneous changes 2018-01-31 18:13:23 -08:00
.clang-format
.gitignore Remove leftover references to phutil_module_cache 2017-08-23 12:12:21 -07:00
.travis.yml CMake cross platform Java support and add JNI to travis 2017-11-28 12:27:53 -08:00
appveyor.yml Make Windows dep switches compatible with other builds 2018-01-05 14:56:54 -08:00
AUTHORS Update RocksDB Authors File 2017-10-18 14:42:10 -07:00
CMakeLists.txt CMake changes for CRC32 Optimization on PowerPC 2018-01-23 16:57:11 -08:00
CODE_OF_CONDUCT.md Add Code of Conduct 2017-12-05 18:42:35 -08:00
CONTRIBUTING.md Add Code of Conduct 2017-12-05 18:42:35 -08:00
COPYING Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
DEFAULT_OPTIONS_HISTORY.md options.delayed_write_rate use the rate of rate_limiter by default. 2017-05-24 09:58:24 -07:00
DUMP_FORMAT.md First version of rocksdb_dump and rocksdb_undump. 2015-06-19 16:24:36 -07:00
HISTORY.md Delete files in multiple ranges at once 2018-01-30 13:56:39 -08:00
INSTALL.md FreeBSD build support for RocksDB and RocksJava 2018-01-11 13:29:55 -08:00
issue_template.md Add a template for issues 2017-09-29 11:41:28 -07:00
LANGUAGE-BINDINGS.md Add Nim to the list of language bindings 2018-01-29 09:57:46 -08:00
LICENSE.Apache Change RocksDB License 2017-07-15 16:11:23 -07:00
LICENSE.leveldb Add back the LevelDB license file 2017-07-16 18:42:18 -07:00
Makefile add -fno-sanitize-recover option to force exit on errors 2018-01-31 12:13:00 -08:00
README.md Add Jenkins for PPC64le build status badge 2018-01-11 14:57:45 -08:00
ROCKSDB_LITE.md Optimistic Transactions 2015-05-29 14:36:35 -07:00
src.mk Refactor ReadBlockContents() 2017-12-11 15:27:32 -08:00
TARGETS WritePrepared Txn: make buck tests parallel 2017-12-18 14:42:09 -08:00
thirdparty.inc Make Windows dep switches compatible with other builds 2018-01-05 14:56:54 -08:00
USERS.md Added ProfaneDB 2017-11-19 10:11:44 -08:00
Vagrantfile Update Vagrant file (test internal phabricator workflow) 2016-10-28 15:39:19 -07:00
WINDOWS_PORT.md Commit both PR and internal code review changes 2015-07-07 16:58:20 -07:00

RocksDB: A Persistent Key-Value Store for Flash and RAM Storage


RocksDB is developed and maintained by the Facebook Database Engineering Team. It is built on earlier work on LevelDB by Sanjay Ghemawat (sanjay@google.com) and Jeff Dean (jeff@google.com).

This code is a library that forms the core building block for a fast key-value server, especially suited for storing data on flash drives. It has a Log-Structured-Merge-Database (LSM) design with flexible tradeoffs between Write-Amplification-Factor (WAF), Read-Amplification-Factor (RAF) and Space-Amplification-Factor (SAF). It has multi-threaded compactions, making it especially suitable for storing multiple terabytes of data in a single database.

Start with example usage here: https://github.com/facebook/rocksdb/tree/master/examples

See the GitHub wiki for more explanation.

The public interface is in include/. Callers should not include or rely on the details of any other header files in this package. Those internal APIs may be changed without warning.

Design discussions are conducted in https://www.facebook.com/groups/rocksdb.dev/