Commit Graph

4280 Commits

Author SHA1 Message Date
Igor Canadi
c97667d9f1 Fix RocksDB lite build for write_stress
Summary: We don't have access to GetLiveFilesMetadata() in RocksDB lite. If compiling write_stress for lite, I skip the check for leaked files, which depends on this function.

Test Plan: OPT=-DROCKSDB_LITE m write_stress

Reviewers: sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D49647
2015-10-28 16:37:39 -07:00
Herman Lee
0d720dfc17 Use the correct variable when fetching table properties.
Summary:
An uninitialized parameter was being passed into the call to fetch the table
properties during the compaction notification callbacks.

Test Plan:
Build it with myrocks and verify unit test passed.
Run unit tests.

Reviewers: rven, yhchiang, igor

Reviewed By: igor

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D49635
2015-10-28 16:28:11 -07:00
Igor Canadi
4b66d95344 Write stress test
Summary:
The goal of this diff is to create a simple stress test with focus on catching:
* bugs in compaction/flush processes, especially the ones that cause assertion errors
* bugs in the code that deletes obsolete files

There are two parts of the test:
* write_stress, a binary that writes to the database
* write_stress_runner.py, a script that invokes and kills write_stress

Here are some interesting parts of write_stress:
* Runs with very high concurrency of compactions and flushes (32 threads total) and tries to create a huge number of small files
* The keys written to the database are not uniformly distributed -- there is a 3-character prefix that mutates occasionally (in the prefix mutator thread), in such a way that the first character mutates more slowly than the second, which in turn mutates more slowly than the third. That way, the test stresses some interesting compaction features like trivial moves and bottommost level calculation
* There is a thread that creates an iterator, holds it for a couple of seconds and then iterates over all keys. This is supposed to test RocksDB's ability to keep the files alive when there are references to them.
* Some writes trigger WAL sync. This is stress testing our WAL sync code.
* At the end of the run, we make sure that we didn't leak any of the sst files
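
The skewed key distribution described above can be sketched roughly as follows. This is a minimal Python illustration, not the code in write_stress: the function names, the mutation probabilities, the alphabet and the suffix length are all assumptions. The message only states that earlier prefix characters mutate more slowly than later ones.

```python
import random
import string

# Hypothetical per-tick mutation probabilities (assumption). The only property
# taken from the commit message is the ordering: first < second < third.
MUTATION_PROBABILITY = (0.001, 0.01, 0.1)


def mutate_prefix(prefix, rng):
    """Return a new 3-character prefix; later characters change more often."""
    chars = list(prefix)
    for i, probability in enumerate(MUTATION_PROBABILITY):
        if rng.random() < probability:
            chars[i] = rng.choice(string.ascii_lowercase)
    return "".join(chars)


def make_key(prefix, rng, suffix_len=8):
    """Build a key from the current prefix plus a random suffix (assumed layout)."""
    suffix = "".join(rng.choice(string.ascii_lowercase) for _ in range(suffix_len))
    return prefix + suffix


if __name__ == "__main__":
    rng = random.Random(42)
    prefix = "aaa"
    for _ in range(5):
        prefix = mutate_prefix(prefix, rng)
        print(make_key(prefix, rng))
```

Because the leading character rarely changes, consecutive writes tend to land in a narrow, slowly drifting key range, which is what makes trivial moves and bottommost-level decisions likely to be exercised.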

write_stress_runner.py changes the mode in which we run write_stress and also kills and restarts it. There are some interesting characteristics:
* At the beginning we divide the full test runtime into smaller parts -- shorter runtimes (a couple of seconds) and longer runtimes (100 or 1000 seconds)
* The first time we run write_stress, we destroy the old DB. On every subsequent run during the test, we reuse the same DB.
* We can run in kill mode or clean-restart mode. Kill mode kills write_stress violently.
* We can run in a mode where delete_obsolete_files_with_fullscan is true or false
* We can run with low_open_files mode turned on or off. When it's turned on, we configure table cache to only hold a couple of files -- that way we need to reopen files every time we access them.
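
A rough sketch of how such a runner could split the runtime and assemble each invocation is shown below. It is not the code in tools/write_stress_runner.py: the function names, the 70/30 split between short and long runs, and the 1-10 second range are assumptions. Only the flag names and the use of --runtime_sec=-1 for kill mode are taken from the output in the test plan.

```python
import random


def split_runtime(total_sec, rng):
    """Split total_sec into short and long chunks that sum to total_sec."""
    chunks = []
    remaining = total_sec
    while remaining > 0:
        if rng.random() < 0.7:                # assumed ratio of short runs
            chunk = rng.randint(1, 10)        # "a couple of seconds"
        else:
            chunk = rng.choice([100, 1000])   # the longer runtimes
        chunk = min(chunk, remaining)         # never exceed what is left
        chunks.append(chunk)
        remaining -= chunk
    return chunks


def build_command(runtime_sec, first_run, kill_mode, fullscan, low_open_files):
    """Assemble a write_stress command line for one chunk of the test."""
    # In kill mode the binary runs indefinitely and the runner kills it.
    cmd = ["./write_stress", "--runtime_sec=%d" % (-1 if kill_mode else runtime_sec)]
    if not first_run:
        cmd.append("--destroy_db=false")      # reuse the DB after the first run
    if fullscan:
        cmd.append("--delete_obsolete_files_with_fullscan=true")
    if low_open_files:
        cmd.append("--low_open_files_mode=true")
    return " ".join(cmd)


if __name__ == "__main__":
    rng = random.Random(0)
    for index, seconds in enumerate(split_runtime(1000, rng)):
        print(build_command(seconds, index == 0, rng.random() < 0.5,
                            rng.random() < 0.5, rng.random() < 0.5))
```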

Another goal was to create a stress test without a lot of parameters. So tools/write_stress_runner.py should only take one parameter -- runtime_sec and it should figure out everything else on its own.

In a separate diff, I'll add this new test to our nightly legocastle runs.

Test Plan:
The goal of this test was to retroactively catch the following bugs: D33045, D48201, D46899, D42399. I failed to reproduce D48201, but all others have been caught!

When I reverted https://reviews.facebook.net/D33045:

    ./write_stress --runtime_sec=200 --low_open_files_mode=true
    Iterator statuts not OK: IO error: /fast-rocksdb-tmp/rocksdb_test/write_stress/089166.sst: No such file or directory

When I reverted https://reviews.facebook.net/D42399:

    python tools/write_stress_runner.py --runtime_sec=5000
    Running write_stress, will kill after 5 seconds: ./write_stress --runtime_sec=-1
    Running write_stress, will kill after 2 seconds: ./write_stress --runtime_sec=-1 --destroy_db=false --delete_obsolete_files_with_fullscan=true
    Running write_stress, will kill after 7 seconds: ./write_stress --runtime_sec=-1 --destroy_db=false
    Running write_stress, will kill after 5 seconds: ./write_stress --runtime_sec=-1 --destroy_db=false
    Running write_stress, will kill after 8 seconds: ./write_stress --runtime_sec=-1 --destroy_db=false --low_open_files_mode=true
    Write to DB failed: IO error: /fast-rocksdb-tmp/rocksdb_test/write_stress/019250.sst: No such file or directory
    ERROR: write_stress died with exitcode=-6

When I reverted https://reviews.facebook.net/D46899:

    python tools/write_stress_runner.py --runtime_sec=1000
    runtime: 1000
    Going to execute write stress for [3, 3, 100, 3, 2, 100, 1, 788]
    Running write_stress for 3 seconds: ./write_stress --runtime_sec=3 --low_open_files_mode=true
    Running write_stress for 3 seconds: ./write_stress --runtime_sec=3 --destroy_db=false --delete_obsolete_files_with_fullscan=true
    Running write_stress, will kill after 100 seconds: ./write_stress --runtime_sec=-1 --destroy_db=false --delete_obsolete_files_with_fullscan=true
    write_stress: db/db_impl.cc:2070: void rocksdb::DBImpl::MarkLogsSynced(uint64_t, bool, const rocksdb::Status&): Assertion `log.getting_synced' failed.
    ERROR: write_stress died with exitcode=-6

Reviewers: IslamAbdelRahman, yhchiang, rven, kradhakrishnan, sdong, anthony

Reviewed By: anthony

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D49533
2015-10-28 16:15:07 -07:00
sdong
47414c6cd6 Move include/posix/io_posix.h to util/io_posix.h
Summary: include/posix/io_posix.h is not a public API. Although include/posix/ is not a public header directory, it is confusing to put non-public headers under include/. Move it to util/ to be clearer.

Test Plan: Run all tests

Reviewers: rven, IslamAbdelRahman, anthony, kradhakrishnan, yhchiang, igor

Reviewed By: igor

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D49611
2015-10-28 12:15:51 -07:00
sdong
2889df84cb Revert "Avoid to reply on ROCKSDB_FALLOCATE_PRESENT in include/posix/io_posix.h"
This reverts commit c37223c083.
2015-10-28 11:55:20 -07:00
Siying Dong
28c8758a34 Merge pull request #795 from yuslepukhin/fix_mocktable_id
Fix MockTable ID storage
2015-10-28 11:37:37 -07:00
Dmitri Smirnov
5c8f2ee786 Fix MockTable ID storage
On Windows two tests fail that use MockTable:
  flush_job_test and compaction_job_test with the following message:
  compaction_job_test_je.exe : Assertion failed: result.size() == 4,
  file c:\dev\rocksdb\rocksdb\table\mock_table.cc, line 110

  Investigation reveals that this failure occurs when a 4-byte
  ID written to the beginning of the physically open file (the main
  contents remain in an in-memory map) cannot be read back.

  The reason for the failure is that the ID is written directly
  to a WritableFile bypassing WritableFileWriter. The side effect of that
  is that pending_sync_ never becomes true so the file is never flushed,
  however, the direct cause of the failure is that the filesize_ member
  of the WritableFileWriter remains zero. At Close() the file is truncated
  to that size and the file becomes empty so the ID can not be read back.
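
The failure mode described above can be reproduced with a toy model. The classes below are hypothetical stand-ins written in Python, not RocksDB's WritableFile or WritableFileWriter; the only behavior modeled is a wrapper that counts the bytes it wrote and truncates the file to that count on close.

```python
class ToyFile:
    """Stand-in for the underlying file: just a growable byte buffer."""

    def __init__(self):
        self.data = bytearray()

    def append(self, payload):
        self.data += payload

    def truncate(self, size):
        del self.data[size:]


class ToyWriter:
    """Stand-in for the wrapper: tracks its own byte count and truncates on close."""

    def __init__(self, file):
        self.file = file
        self.filesize = 0  # only grows when writes go through this wrapper

    def append(self, payload):
        self.file.append(payload)
        self.filesize += len(payload)

    def close(self):
        # Anything written behind the wrapper's back is cut off here.
        self.file.truncate(self.filesize)


if __name__ == "__main__":
    mock_id = b"\x01\x00\x00\x00"

    bypassed = ToyFile()
    writer = ToyWriter(bypassed)
    bypassed.append(mock_id)      # written directly, wrapper never sees it
    writer.close()
    print(len(bypassed.data))     # the ID is gone

    wrapped = ToyFile()
    writer = ToyWriter(wrapped)
    writer.append(mock_id)        # written through the wrapper
    writer.close()
    print(len(wrapped.data))      # the ID survives
```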
2015-10-28 10:53:14 -07:00
Islam AbdelRahman
72d6e758b4 Fix WritableFileWriter::Append() return
Summary: It looks like WritableFileWriter::Append() was returning OK() even when there was an error

Test Plan: make check

Reviewers: sdong, yhchiang, anthony, rven, kradhakrishnan, igor

Reviewed By: igor

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D49569
2015-10-27 21:04:00 -07:00
Siying Dong
d0a18c2840 Merge pull request #786 from aloukissas/unused_param
Fix unused parameter warnings in db.h
2015-10-27 16:51:22 -07:00
sdong
c37223c083 Avoid to reply on ROCKSDB_FALLOCATE_PRESENT in include/posix/io_posix.h
Summary: include/posix/io_posix.h should not depend on ROCKSDB_FALLOCATE_PRESENT. Remove it.

Test Plan: Build it with ROCKSDB_FALLOCATE_PRESENT both defined and not defined.

Reviewers: rven, yhchiang, anthony, kradhakrishnan, IslamAbdelRahman, igor

Reviewed By: igor

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D49563
2015-10-27 16:48:40 -07:00
sdong
d6219e4d9b Mac build break caused by include/posix/io_posix.h not declaring errno
Summary: Mac build breaks as include/posix/io_posix.h doesn't include errno. Move the exact function declaration to io_posix.cc

Test Plan: Run all tests. Will run on Mac.

Reviewers: rven, anthony, yhchiang, IslamAbdelRahman, igor

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D49551
2015-10-27 14:16:19 -07:00
Siying Dong
beb69d4511 Merge pull request #765 from PraveenSinghRao/wal_filter
Adding wal filter to inspect and filter wal records on recovery
2015-10-27 12:13:01 -07:00
sdong
ab0f3b964f crash_test to trigger some less frequent crash point more frequently
Summary: crash_test still has a very low chance of hitting some crash points. Add another mode that makes hitting them more likely.

Test Plan: Run crash_test and see db_stress is called with expected parameters.

Reviewers: kradhakrishnan, igor, anthony, rven, IslamAbdelRahman, yhchiang

Reviewed By: yhchiang

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D49473
2015-10-27 12:06:06 -07:00
Igor Canadi
7beb743cf5 Merge pull request #778 from Vaisman/master
Error while cmake by building from zip-archive
2015-10-27 09:58:07 -07:00
Praveen Rao
4ce117c4d5 Merge branch 'master' into wal_filter
2015-10-26 19:03:34 -07:00
Praveen Rao
32cdec634e Fail recovery if filter provides more records than original and corresponding unit-test, fix naming conventions
2015-10-26 18:11:18 -07:00
sdong
44d4057d78 Avoid some includes in io_posix.h
Summary: IO Posix depends on too many .h files. Move most of them to .cc files.

Test Plan: make all

Reviewers: anthony, rven, IslamAbdelRahman, yhchiang, kradhakrishnan, igor

Reviewed By: igor

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D49479
2015-10-26 17:00:25 -07:00
Alex Loukissas
2adad23a14 Fix unused parameter warnings.
2015-10-26 16:38:37 -07:00
Alex Loukissas
b0980ff748 Fix unused parameter warnings.
2015-10-26 16:32:14 -07:00
Alex Loukissas
bc898c5f80 Fix unused parameter warnings.
2015-10-26 16:00:51 -07:00
Siying Dong
138876a62c Merge pull request #746 from ceph/wip-recycle
Add Options.recycle_log_file_num for Recycling WAL Files
2015-10-26 15:01:28 -07:00
Islam AbdelRahman
581f20fd8b Add LITE tests to Legocastle
Summary: Update rocksdb-lego-determinator to include running make check under ROCKSDB_LITE

Test Plan: will be tested after landing in fbcode

Reviewers: sdong, yhchiang, igor, kradhakrishnan

Reviewed By: kradhakrishnan

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D49065
2015-10-26 11:50:29 -07:00
Vasili Svirski
3d56d868c7 Merge remote-tracking branch 'upstream/master'
2015-10-24 21:05:59 +04:00
sdong
d691111146 include/posix/io_posix.h should have a once declaration
Summary: include/posix/io_posix.h does not prevent multiple inclusion. Need to fix it. It is also breaking the unity build.

Test Plan: Run unity build and see error go away.

Reviewers: rven, igor, IslamAbdelRahman, kradhakrishnan, anthony

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D49281
2015-10-23 07:45:00 -07:00
Igor Canadi
a6962edf81 Merge pull request #783 from yuslepukhin/remove_test_conditional_compilation
No need to #ifdef test only code on windows
2015-10-22 20:14:52 -04:00
Dmitri Smirnov
3c750b59ae No need to #ifdef test only code on windows
2015-10-22 15:15:37 -07:00
Siying Dong
8c11c5dee8 Merge pull request #768 from OpenChannelSSD/to_fb_master2
Split posix storage backend into Env and library
2015-10-22 13:33:51 -07:00
Javier González
6e6dd5f6f9 Split posix storage backend into Env and library
Summary: This patch splits the posix storage backend into Env and
the actual *File implementations. The motivation is to allow other Envs
to use posix as a library. This enables a storage backend different from
posix to split its secondary storage between a normal file system
partition managed by posix, and its own media.

Test Plan: No new functionality is added to posix Env or the library,
thus the current tests should suffice.
2015-10-22 17:31:31 +02:00
Alexey Maykov
980a82ee2f Fix a bug in GetApproximateSizes
Summary: Need to pass through the memtable parameter.

Test Plan: built, tested through myrocks

Reviewers: igor, sdong, rven

Reviewed By: rven

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D49167
2015-10-21 18:34:39 -07:00
Shusen Liu
d0d13ebf67 fix bug in db_crashtest.py
Summary:
In tools/db_crashtest.py, cmd_params['db'] by default is a lambda expression, not the actual db_name.
Fix by resolving the db_name before passing it to gen_cmd.
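
A minimal sketch of this class of bug follows. It is an illustration only: the dictionary contents, the example path, resolve_params, and the signature of gen_cmd here are assumptions, not the code in tools/db_crashtest.py.

```python
# Hypothetical defaults: 'db' is computed lazily, so it is stored as a callable.
default_params = {
    "db": lambda: "/tmp/rocksdb_crashtest_example",
    "ops_per_thread": 100,
}


def resolve_params(params):
    """Call any lazily evaluated defaults so only concrete values remain."""
    return {key: (value() if callable(value) else value)
            for key, value in params.items()}


def gen_cmd(params):
    """Format parameters as db_stress-style arguments (illustrative only)."""
    return ["./db_stress"] + ["--%s=%s" % (key, value)
                              for key, value in sorted(params.items())]


if __name__ == "__main__":
    # Buggy: the lambda is formatted as '<function <lambda> at 0x...>'.
    print(gen_cmd(default_params))
    # Fixed: resolve the callable first, then build the command.
    print(gen_cmd(resolve_params(default_params)))
```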

Test Plan: run `make crash_test`

Reviewers: sdong

Reviewed By: sdong

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D49119
2015-10-20 22:01:11 -07:00
Vasili Svirski
01a41af0ae Merge remote-tracking branch 'upstream/master'
2015-10-21 07:52:10 +04:00
Yueh-Hsuan Chiang
5678c05d86 Use DEBUG_LEVEL=0 in make release and make clean
Summary: Use DEBUG_LEVEL=0 in make release and make clean

Test Plan:
make clean
make release -j32

Reviewers: MarkCallaghan, sdong, anthony, IslamAbdelRahman, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D49125
2015-10-20 17:09:09 -07:00
Siying Dong
ac25fe6b9a Merge pull request #779 from yuslepukhin/optimize_windows_build
Do not build test only code and unit tests in Release builds
2015-10-20 16:32:21 -07:00
Dmitri Smirnov
e154ee0863 Do not build test only code and unit tests in Release builds
Test code errors are currently blocking Windows Release builds.
  We do not want to spend time building in Release what we cannot run.
  We want to eliminate the most frequent source of errors: people
  checking in test-only code which cannot be built in Release.
  This feature will work only if you invoke msbuild against rocksdb.sln.
  Invoking it against the ALL_BUILD target will attempt to build everything.
2015-10-20 13:35:08 -07:00
Vasili Svirski
cd3286faea Error while cmake by building from zip-archive
* check whether git is found
* check whether a .git folder exists in the project (a project zip archive is downloaded without the .git folder)
* get the head commit SHA if git is found and the .git folder exists

Tested:
* configure project by CMake 3.0.0 successfully (with and without git), with project zip archive (without .git folder) and with project cloned from github
* configure project by command: cmake -G "Visual Studio 12 Win64"
* build solution by Visual Studio
* manually validate that file utils/build_version.cc contains valid head revision value
2015-10-20 22:51:19 +04:00
sdong
e3d4e14075 DBCompactionTestWithParam.ManualCompaction to verify block cache is not filled in manual compaction
Summary: Manual compaction should not fill block cache. Add the verification in unit test

Test Plan: Run the test

Reviewers: yhchiang, kradhakrishnan, rven, IslamAbdelRahman, anthony, igor

Reviewed By: igor

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D49089
2015-10-20 10:36:49 -07:00
Shusen Liu
033c6f1add T7916298, bug fix
Summary: dbname => cmd_params['db']

Test Plan: Run `make crash_test`

Reviewers: sdong

Reviewed By: sdong

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D49077
2015-10-19 21:09:35 -07:00
krad
7717ad1afe Adding artifacts to stress_crash CI job
Summary: Adding the ability to upload logs and db content to storage after the
completion of the job

Test Plan: Manual run

Reviewers:

CC: leveldb@

Task ID: #8754201

Blame Rev:
2015-10-19 20:13:14 -07:00
Igor Canadi
0bf656b904 Don't spew warnings when flint doesn't exist
Summary: Before this diff `arc lint` on non-fb machine issued warnings. Now it doesn't.

Test Plan: `arc lint` is quiet.

Reviewers: yhchiang, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D49071
2015-10-19 18:47:59 -07:00
sdong
6d6776f6b8 Log more information for the add file with overlapping range failure
Summary: crash_test sometimes fails, hitting the add file overlapping assert. Add information to the info logs to help us find the bug.

Test Plan: Run all test suites. Do some manual tests to make sure printing is correct.

Reviewers: kradhakrishnan, yhchiang, anthony, IslamAbdelRahman, rven, igor

Reviewed By: igor

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D49017
2015-10-19 17:31:13 -07:00
Praveen Rao
7951b9b079 make field order match initialization order
2015-10-19 17:03:01 -07:00
Siying Dong
90228bb088 Merge pull request #771 from maximecaron/patch-1
Fix build error using Visual Studio 12
2015-10-19 15:26:22 -07:00
Praveen Rao
2938c5c137 merge upstream changes
2015-10-19 15:21:33 -07:00
Islam AbdelRahman
e3b1d23d3e Bump version to 4.2
Summary: Bump the version to 4.2 (the unreleased version), so that when fbcode_unittests run they can differentiate between old and new APIs

Test Plan: make check

Reviewers: sdong, yhchiang, igor

Reviewed By: igor

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D49041
2015-10-19 15:05:59 -07:00
Sage Weil
a7b2bedfb0 log_{reader,write}: recyclable record format
Introduce new tags for records that have a log_number.  This changes the
header size from 7 to 11 for these records, making this a
backward-incompatible change.

If we read a record that belongs to a different log_number (i.e., a
previous instantiation of this log file, before it was most recently
recycled), we return kOldRecord from ReadPhysicalRecord.  ReadRecord
will translate this into a kEof or kBadRecord depending on what the
WAL recovery mode is.

We make several adjustments to the log_test.cc tests to compensate for the
fact that the header size varies between the two modes.
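
The two header sizes and the stale-record check can be illustrated with a short Python sketch. The field layout (4-byte checksum, 2-byte length, 1-byte type, optional 4-byte log number), the numeric type tags, and the use of zlib.crc32 are assumptions chosen to match the 7 and 11 byte sizes mentioned above; they are not taken from log_writer.cc or log_reader.cc.

```python
import struct
import zlib

LEGACY_FULL = 1       # assumed tag for a record without a log number
RECYCLABLE_FULL = 5   # assumed tag for a record carrying a log number


def encode_record(payload, log_number=None):
    """Encode one record; the header is 7 bytes, or 11 with a log number."""
    if log_number is None:
        tail = struct.pack("<B", LEGACY_FULL)
    else:
        tail = struct.pack("<BI", RECYCLABLE_FULL, log_number)
    # zlib.crc32 is only a stand-in for whatever checksum the real log uses.
    crc = zlib.crc32(tail + payload) & 0xFFFFFFFF
    return struct.pack("<IH", crc, len(payload)) + tail + payload


def read_record(buf, current_log_number):
    """Return (status, payload); stale recycled records yield 'kOldRecord'."""
    crc, length = struct.unpack_from("<IH", buf, 0)
    header_size = 7
    if buf[6] == RECYCLABLE_FULL:
        header_size = 11
        (log_number,) = struct.unpack_from("<I", buf, 7)
        if log_number != current_log_number:
            # Left over from an earlier use of this recycled file.
            return "kOldRecord", None
    payload = buf[header_size:header_size + length]
    if zlib.crc32(buf[6:header_size] + payload) & 0xFFFFFFFF != crc:
        return "kBadRecord", None
    return "ok", payload


if __name__ == "__main__":
    record = encode_record(b"hello", log_number=7)
    print(read_record(record, current_log_number=7))
    print(read_record(record, current_log_number=8))
```

In this sketch the caller decides what to do with "kOldRecord", mirroring the description that ReadRecord maps it to kEof or kBadRecord depending on the WAL recovery mode.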

Signed-off-by: Sage Weil <sage@redhat.com>
2015-10-19 17:24:05 -04:00
Igor Canadi
4e07c99a9a Fix iOS build
Summary: We don't yet have a CI build for iOS, so our iOS compile gets broken sometimes. Most of the errors come from the assumption that size_t is 64-bit, while it's actually 32-bit on some (all?) iOS platforms. This diff fixes the compile.

Test Plan:
TARGET_OS=IOS make static_lib

Observe there are no warnings

Reviewers: sdong, anthony, IslamAbdelRahman, kradhakrishnan, yhchiang

Reviewed By: yhchiang

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D49029
2015-10-19 13:40:44 -07:00
Praveen Rao
0c59691dde Handle multiple batches in single log record - allow app to return a new batch + allow app to return corrupted record status
2015-10-19 13:27:40 -07:00
Shusen Liu
32c291e3c9 Merge branch 'master' of github.com:facebook/rocksdb into T7916298
2015-10-19 13:25:39 -07:00
Shusen Liu
4575de5b9e #7916298: merge tools/db_crashtest2.py into tools/db_crashtest.py
Summary:
merge tools/db_crashtest2.py into tools/db_crashtest.py

python tools/db_crashtest.py -h  # show help message; all parameters can be overridden by arguments

Example usages:
python tools/db_crashtest.py blackbox  # run blackbox with default parameters
python tools/db_crashtest.py blackbox --simple
python tools/db_crashtest.py whitebox  # run whitebox with default parameters
python tools/db_crashtest.py whitebox --simple

All default parameters are identical to the previous version.

Test Plan: `make crash_test` and make sure it can run with expected parameters passed to db_stress.

Reviewers: igor, rven, anthony, IslamAbdelRahman, yhchiang, sdong

Reviewed By: sdong

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D48567
2015-10-19 13:24:55 -07:00
Igor Canadi
5c727de6a3 Merge pull request #777 from yuslepukhin/fix_win_build_uint
uint is a not a datatype on windows.
2015-10-19 13:23:41 -07:00