A library that provides an embeddable, persistent key-value store for fast storage.
Go to file
Igor Canadi 4b66d95344 Write stress test
Summary:
The goal of this diff is to create a simple stress test with focus on catching:
* bugs in compaction/flush processes, especially the ones that cause assertion errors
* bugs in the code that deletes obsolete files

There are two parts of the test:
* write_stress, a binary that writes to the database
* write_stress_runner.py, a script that invokes and kills write_stress

Here are some interesting parts of write_stress:
* Runs with very high concurrency of compactions and flushes (32 threads total) and tries to create a huge amount of small files
* The keys written to the database are not uniformly distributed -- there is a 3-character prefix that mutates occasionally (in prefix mutator thread), in such a way that the first character mutates slower than second, which mutates slower than third character. That way, the compaction stress tests some interesting compaction features like trivial moves and bottommost level calculation
* There is a thread that creates an iterator, holds it for couple of seconds and then iterates over all keys. This is supposed to test RocksDB's abilities to keep the files alive when there are references to them.
* Some writes trigger WAL sync. This is stress testing our WAL sync code.
* At the end of the run, we make sure that we didn't leak any of the sst files

write_stress_runner.py changes the mode in which we run write_stress and also kills and restarts it. There are some interesting characteristics:
* At the beginning we divide the full test runtime into smaller parts -- shorter runtimes (couple of seconds) and longer runtimes (100, 1000) seconds
* The first time we run write_stress, we destroy the old DB. Every next time during the test, we use the same DB.
* We can run in kill mode or clean-restart mode. Kill mode kills the write_stress violently.
* We can run in mode where delete_obsolete_files_with_fullscan is true or false
* We can run with low_open_files mode turned on or off. When it's turned on, we configure table cache to only hold a couple of files -- that way we need to reopen files every time we access them.

Another goal was to create a stress test without a lot of parameters. So tools/write_stress_runner.py should only take one parameter -- runtime_sec and it should figure out everything else on its own.

In a separate diff, I'll add this new test to our nightly legocastle runs.

Test Plan:
The goal of this test was to retroactively catch the following bugs: D33045, D48201, D46899, D42399. I failed to reproduce D48201, but all others have been caught!

When i reverted https://reviews.facebook.net/D33045:

     ./write_stress --runtime_sec=200 --low_open_files_mode=true
     Iterator statuts not OK: IO error: /fast-rocksdb-tmp/rocksdb_test/write_stress/089166.sst: No such file or directory

When i reverted https://reviews.facebook.net/D42399:

    python tools/write_stress_runner.py --runtime_sec=5000
    Running write_stress, will kill after 5 seconds: ./write_stress --runtime_sec=-1
    Running write_stress, will kill after 2 seconds: ./write_stress --runtime_sec=-1 --destroy_db=false --delete_obsolete_files_with_fullscan=true
    Running write_stress, will kill after 7 seconds: ./write_stress --runtime_sec=-1 --destroy_db=false
    Running write_stress, will kill after 5 seconds: ./write_stress --runtime_sec=-1 --destroy_db=false
    Running write_stress, will kill after 8 seconds: ./write_stress --runtime_sec=-1 --destroy_db=false --low_open_files_mode=true
    Write to DB failed: IO error: /fast-rocksdb-tmp/rocksdb_test/write_stress/019250.sst: No such file or directory
    ERROR: write_stress died with exitcode=-6

When i reverted https://reviews.facebook.net/D46899:

    python tools/write_stress_runner.py --runtime_sec=1000
    runtime: 1000
    Going to execute write stress for [3, 3, 100, 3, 2, 100, 1, 788]
    Running write_stress for 3 seconds: ./write_stress --runtime_sec=3 --low_open_files_mode=true
    Running write_stress for 3 seconds: ./write_stress --runtime_sec=3 --destroy_db=false --delete_obsolete_files_with_fullscan=true
    Running write_stress, will kill after 100 seconds: ./write_stress --runtime_sec=-1 --destroy_db=false --delete_obsolete_files_with_fullscan=true
    write_stress: db/db_impl.cc:2070: void rocksdb::DBImpl::MarkLogsSynced(uint64_t, bool, const rocksdb::Status&): Assertion `log.getting_synced' failed.
    ERROR: write_stress died with exitcode=-6

Reviewers: IslamAbdelRahman, yhchiang, rven, kradhakrishnan, sdong, anthony

Reviewed By: anthony

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D49533
2015-10-28 16:15:07 -07:00
arcanist_util Don't spew warnings when flint doesn't exist 2015-10-19 18:47:59 -07:00
build_tools Add LITE tests to Legocastle 2015-10-26 11:50:29 -07:00
coverage Fix coverage script 2014-11-03 14:53:00 -08:00
db Merge branch 'master' into wal_filter 2015-10-26 19:03:34 -07:00
doc Remove seek compaction 2014-06-20 10:23:02 +02:00
examples [RocksDB Options File] Add TableOptions section and support BlockBasedTable 2015-10-11 12:17:42 -07:00
hdfs [Cleanup] Remove RandomRWFile 2015-08-12 10:18:59 -07:00
include/rocksdb Move include/posix/io_posix.h to util/io_posix.h 2015-10-28 12:15:51 -07:00
java options: add recycle_log_file_num option 2015-10-18 21:21:24 -04:00
memtable Moving memtable related files from util to a new directory memtable 2015-10-16 14:10:33 -07:00
port Allow users to disable some kill points in db_stress 2015-10-15 14:33:13 -07:00
table Fix MockTable ID storage 2015-10-28 10:53:14 -07:00
third-party "make format" against last 10 commits 2015-07-13 13:50:18 -07:00
tools Write stress test 2015-10-28 16:15:07 -07:00
util Move include/posix/io_posix.h to util/io_posix.h 2015-10-28 12:15:51 -07:00
utilities No need to #ifdef test only code on windows 2015-10-22 15:15:37 -07:00
.arcconfig Integrate Jenkins with Phabricator 2015-04-07 11:56:29 -07:00
.clang-format A script that automatically reformat affected lines 2014-01-14 12:21:24 -08:00
.gitignore New amalgamation target 2015-10-01 08:29:31 +13:00
.travis.yml Run ROCKSDB_LITE tests in travis 2015-10-16 10:47:37 -07:00
appveyor.yml Do not build test only code and unit tests in Release builds 2015-10-20 13:35:08 -07:00
appveyordailytests.yml Do not build test only code and unit tests in Release builds 2015-10-20 13:35:08 -07:00
AUTHORS Add AUTHORS file. Fix #203 2014-09-29 10:52:18 -07:00
CMakeLists.txt Write stress test 2015-10-28 16:15:07 -07:00
CONTRIBUTING.md facebook accounts are not required for CLA signers 2014-07-08 05:57:54 -04:00
DUMP_FORMAT.md First version of rocksdb_dump and rocksdb_undump. 2015-06-19 16:24:36 -07:00
HISTORY.md Remove DefaultCompactionFilterFactory. 2015-10-14 11:02:30 -07:00
INSTALL.md Update 4 is required for building with MS Visual Studio 13 2015-10-15 11:06:02 -07:00
LICENSE Fix copyright year 2014-03-12 12:06:58 -07:00
Makefile Write stress test 2015-10-28 16:15:07 -07:00
PATENTS Update Patent Grant. 2015-04-13 10:33:43 +01:00
README.md Replaced "built on on earlier work" by "built on earlier work" in README.md 2014-09-17 01:16:17 -07:00
ROCKSDB_LITE.md Optimistic Transactions 2015-05-29 14:36:35 -07:00
src.mk Split posix storage backend into Env and library 2015-10-22 17:31:31 +02:00
thirdparty.inc Improve CI build and fix Windows build breakage 2015-09-30 11:20:23 -07:00
USERS.md Add Cloudera's blog post to USERS.md 2015-09-02 14:04:51 -07:00
Vagrantfile RocksDB on FreeBSD support 2015-02-26 15:19:17 -08:00
WINDOWS_PORT.md Commit both PR and internal code review changes 2015-07-07 16:58:20 -07:00

RocksDB: A Persistent Key-Value Store for Flash and RAM Storage

Build Status

RocksDB is developed and maintained by Facebook Database Engineering Team. It is built on earlier work on LevelDB by Sanjay Ghemawat (sanjay@google.com) and Jeff Dean (jeff@google.com)

This code is a library that forms the core building block for a fast key value server, especially suited for storing data on flash drives. It has a Log-Structured-Merge-Database (LSM) design with flexible tradeoffs between Write-Amplification-Factor (WAF), Read-Amplification-Factor (RAF) and Space-Amplification-Factor (SAF). It has multi-threaded compactions, making it specially suitable for storing multiple terabytes of data in a single database.

Start with example usage here: https://github.com/facebook/rocksdb/tree/master/examples

See the github wiki for more explanation.

The public interface is in include/. Callers should not include or rely on the details of any other header files in this package. Those internal APIs may be changed without warning.

Design discussions are conducted in https://www.facebook.com/groups/rocksdb.dev/