Commit Graph

89 Commits

Author SHA1 Message Date
Peter Dillinger
b475a83f9d Postponing custom checksum support in BackupEngine (#7411)
Summary:
This change reverts BackupEngine to 6.12 state to accommodate a
higher-priority fix that does not easily merge with this custom checksum
support. We intend to reinstate this support soon, by merging a revert
of this change.

For backupable_db_test, I've removed the tests depending on this
feature.

I've also removed relevant HISTORY.md entry.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7411

Test Plan: unit tests

Reviewed By: ajkr

Differential Revision: D23793835

Pulled By: pdillinger

fbshipit-source-id: 7e861436539584799b13d1a8ae559b81b6d08052
2020-09-18 15:27:03 -07:00
Peter Dillinger
93719fc953 Restore file size in backup table file names (and other cleanup) (#7400)
Summary:
Prior to 6.12, backup files using share_files_with_checksum had
the file size encoded in the file name, after the last '\_' and before
the last '.'. We considered this an implementation detail subject to
change, and indeed removed this information from the file name (with an
option to use old behavior) because it was considered
ineffective/inefficient for file name uniqueness. However, some
downstream RocksDB users were relying on this information since the file
size is not explicitly in the backup manifest file.

This primary purpose of this change is "retrofitting" the 6.12 release
(not yet a public release) to simultaneously support the benefits of the
new naming scheme (I/O performance and data correctness at scale) and
preserve the file size information, both as default behaviors. With this
change, we are essentially making the file size information encoded in
the file name an official, though obscure, extension of the backup meta
file format.

We preserve an option (kLegacyCrc32cAndFileSize) to use the original
"legacy" naming scheme, with its caveats, and make it easy to omit the
file size information (no kFlagIncludeFileSize), for more compact file
names. But note that changing the naming scheme used on an existing db
and backup directory can lead to transient space amplification, as some
files will be stored under two names in the shared_checksum directory.
Because some backups were saved using the original 6.12 naming scheme,
we offer two ways of dealing with those files: SST files generated by
older 6.12 versions can either use the default naming scheme in effect
when the SST files were generated (kFlagMatchInterimNaming, default, no
transient space amplification) or can use a new naming scheme (no
kFlagMatchInterimNaming, potential space amplification because some
already stored files getting a new name).

We don't have a natural way to detect which files were generated by
previous 6.12 versions, but this change hacks one in by changing DB
session ids to now use a more concise encoding, reducing file name
length, saving ~dozen bytes from SST files, and making them visually
distinct from DB ids so that they are less likely to be mixed up.

Two final auxiliary notes:
Recognizing that the backup file names have become a de facto part of
the backup meta schema, this change makes them easier to parse and
extend by putting a distinct marker, 's', before DB session ids embedded
in the name. When we extend this to allow custom checksums in the name,
they can get their own marker to ensure safe parsing. For backward
compatibility, file size does not get a marker but is assumed for
`_[0-9]+[.]`

Another change from initial 6.12 default behavior is never including
file custom checksum in the file name. Looking ahead to 6.13, we do not
want the default behavior to cause backup space amplification for
someone turning on file custom checksum checking in BackupEngine; we
want that to be an easy decision. When implemented, including file
custom checksums in backup file names will be a non-default option.

Actual file name patterns and priorities, as regexes:

    kLegacyCrc32cAndFileSize OR pre-6.12 SST file ->
      [0-9]+_[0-9]+_[0-9]+[.]sst
    kFlagMatchInterimNaming set (default) AND early 6.12 SST file ->
      [0-9]+_[0-9a-fA-F-]+[.]sst
    kUseDbSessionId AND NOT kFlagIncludeFileSize ->
      [0-9]+_s[0-9A-Z]{20}[.]sst
    kUseDbSessionId AND kFlagIncludeFileSize (default) ->
      [0-9]+_s[0-9A-Z]{20}_[0-9]+[.]sst

We might add opt-in options for more '\_' separated data in the name,
but embedded file size, if present, will always be after last '\_' and
before '.sst'.

This change was originally applied to version 6.12. (See https://github.com/facebook/rocksdb/issues/7390)

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7400

Test Plan:
unit tests included. Sync point callbacks are used to mimic
previous version SST files.

Reviewed By: ajkr

Differential Revision: D23759587

Pulled By: pdillinger

fbshipit-source-id: f62d8af4e0978de0a34f26288cfbe66049b70025
2020-09-17 10:24:22 -07:00
Peter Dillinger
a0ac71aae1 Disable sst_file_manager in stress testing backup restore (#7384)
Summary:
This is potentially the cause of failures:

    Failure in Destroy restore dir with: IO error: file rmdir: /dev/shm/rocksdb/rocksdb_crashtest_whitebox/.restore13: Directory not empty

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7384

Test Plan: smoke test blackbox_crash_test

Reviewed By: jay-zhuang

Differential Revision: D23685087

Pulled By: pdillinger

fbshipit-source-id: 55f62e9853ce84be1d5ca7d856de867f0f2596ee
2020-09-14 14:21:06 -07:00
Peter Dillinger
c4e2066dbd Fix cf_consistency_stress for backup/restore, harmonize (#7373)
Summary:
We can only check key on restored backup if in a stress test
configuration locking the key. (Fixes mismatch seen in backup/restore
with atomic flush.)

TestCheckpoint used a very ugly solution to the same problem: copy-paste
dozens of lines of code with some changes and removals. I removed the
unnecessary implementation and made the existing one simply adaptive,
like TestBackupRestore.

Also made TestBackupRestore clean up dead backup/restore directories on
success.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7373

Test Plan:
blackbox_crash_test_with_atomic_flush for a while,
blackbox_crash_test for a while, with backup and checkpoint 1 in 5k and
only 1k max_keys to stress this area

Reviewed By: ajkr

Differential Revision: D23629057

Pulled By: pdillinger

fbshipit-source-id: d7fe7e2be75aaf3cf974be9540a7c5c5de8b371b
2020-09-10 22:55:06 -07:00
Peter Dillinger
e3149358a5 More backup/restore stress test fixes (#7361)
Summary:
(a) Missed a case in updating handling of rand_keys
(b) Only opening restored db with DB::Open so don't (yet)
attempt to open restored BlobDB or TransactionDB.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7361

Test Plan: better than being broken

Reviewed By: ajkr

Differential Revision: D23592570

Pulled By: pdillinger

fbshipit-source-id: dd1d999bcc0c852ee77cb6041964ec4abc0fd4fd
2020-09-09 11:19:05 -07:00
Peter Dillinger
4e258d3e63 Fix backup/restore in stress/crash test (#7357)
Summary:
(1) Skip check on specific key if restoring an old backup
(small minority of cases) because it can fail in those cases. (2) Remove
an old assertion about number of column families and number of keys
passed in, which is broken by atomic flush (cf_consistency) test. Like
other code (for better or worse) assume a single key and iterate over
column families. (3) Apply mock_direct_io to NewSequentialFile so that
db_stress backup works on /dev/shm.

Also add more context to output in case of backup/restore db_stress
failure.

Also a minor fix to BackupEngine to report first failure status in
creating new backup, and drop another clue about the potential
source of a "Backup failed" status.

Reverts "Disable backup/restore stress test (https://github.com/facebook/rocksdb/issues/7350)"

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7357

Test Plan:
Using backup_one_in=10000,
"USE_CLANG=1 make crash_test_with_atomic_flush" for 30+ minutes
"USE_CLANG=1 make blackbox_crash_test" for 30+ minutes
And with use_direct_reads with TEST_TMPDIR=/dev/shm/rocksdb

Reviewed By: riversand963

Differential Revision: D23567244

Pulled By: pdillinger

fbshipit-source-id: e77171c2e8394d173917e36898c02dead1c40b77
2020-09-08 10:50:19 -07:00
Peter Dillinger
06ad5dd293 Add file checksum to stress/crash test (#7343)
Summary:
This change has the crash test randomly select from a few file
checksum implementations, or nullptr, for DB file_checksum_gen_factory.
For compatibility across runs on same DB, each non-null factory can
understand all the other functions, but the default changes.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7343

Test Plan:
'make blackbox_crash_test' for a while, including with some
debug output to ensure code is being exercised.

Reviewed By: zhichao-cao

Differential Revision: D23494580

Pulled By: pdillinger

fbshipit-source-id: 73bbc7ca32c1adaf619134c0c830f12894880b8a
2020-09-03 23:50:33 -07:00
Peter Dillinger
499c9448d0 Fix, enable, and enhance backup/restore in db_stress (#7348)
Summary:
Although added to db_stress, testing of backup/restore
was never integrated into the crash test, originally concerned about
performance. I've enabled it now and to address the peformance concern,
testing backup/restore is always skipped once the db exceeds a certain
size threshold, default 100MB. This should provide sufficient
opportunity for testing BackupEngine without bogging down everything
else with heavier and heavier operations.

Also fixed backup/restore in db_stress by making sure PurgeOldBackups
can remove manifest files, which are normally kept around for db_stress.

Added more coverage of backup options, and up to three backups being
saved in one backup directory (in some cases).

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7348

Test Plan:
ran 'make blackbox_crash_test' for a while, with heightened
probabilitly of taking backups (1/10k). Also confirmed with some debug
output that the code is being covered, TestBackupRestore only takes
a few seconds to complete when triggered, and even at 1/10k and ~50MB
database, there's <,~ 1 thread testing backups at any time.

Reviewed By: ajkr

Differential Revision: D23510835

Pulled By: pdillinger

fbshipit-source-id: b6b8735591808141f81f10773ac31634cf03b6c0
2020-09-03 20:13:15 -07:00
Hans Holmberg
2a0d3c7054 Add a file system parameter: --fs_uri to db_stress and db_bench (#6878)
Summary:
This pull request adds the parameter --fs_uri to db_bench and db_stress, creating a composite env combining the default env with a specified registered rocksdb file system.

This makes it easier to develop and test new RocksDB FileSystems.

The pull request also registers the posix file system for testing purposes.

Examples:
```
$./db_bench --fs_uri=posix:// --benchmarks=fillseq

$./db_stress --fs_uri=zenfs://nullb1
```

zenfs is a RocksDB FileSystem I'm developing to add support for zoned block devices, and in that case the zoned block device is specified in the uri (a zoned null block device in the above example).

Pull Request resolved: https://github.com/facebook/rocksdb/pull/6878

Reviewed By: siying

Differential Revision: D23023063

Pulled By: ajkr

fbshipit-source-id: 8b3fe7193ce45e683043b021779b7a4d547af247
2020-08-17 11:55:24 -07:00
Andrew Kryczka
7eebe6d38a Mark files for compaction in stress/crash tests (#7231)
Summary:
The mechanism to mark files for compaction is most commonly used in
delete-triggered compaction. This PR adds an option to exercise the
marking mechanism on random files created by db_stress. This PR also
enables that option in db_crashtest.py on its db_stress runs at random.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7231

Test Plan:
- ran some minified crash tests; verified they succeed and we see `"compaction_reason": "FilesMarkedForCompaction"` regularly in the logs.

```
$ TEST_TMPDIR=/dev/shm python tools/db_crashtest.py blackbox --duration=600 --interval=30 --max_key=10000000 --write_buffer_size=1048576 --target_file_size_base=1048576 --max_bytes_for_level_base=4194304 --value_size_mult=33
$ TEST_TMPDIR=/dev/shm python tools/db_crashtest.py whitebox --duration=600 --interval=30 --max_key=1000000 --write_buffer_size=1048576 --target_file_size_base=1048576 --max_bytes_for_level_base=4194304 --value_size_mult=33 --random_kill_odd=8887
```

Reviewed By: anand1976

Differential Revision: D23025156

Pulled By: ajkr

fbshipit-source-id: a404c467ebc12afa94dae35956ea9b372f592a96
2020-08-10 16:17:56 -07:00
Jay Zhuang
0c5bb10f06 Remove redundant ROCKSDB_LITE check (#7172)
Summary:
It's already inside of a `#ifdef ROCKSDB_LITE` block.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7172

Reviewed By: gg814

Differential Revision: D22736057

Pulled By: jay-zhuang

fbshipit-source-id: 31f4aa05aba98e2e42fa6f890fa72acf3a0f12f2
2020-07-24 14:14:14 -07:00
Jay Zhuang
fc4d5f5065 Add stress test for GetProperty (#7111)
Summary:
Add stress test coverage for `DB::GetProperty()`.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7111

Test Plan:
```
./db_stress -get_property_one_in=1
make crash_test
```

Reviewed By: ajkr

Differential Revision: D22487906

Pulled By: jay-zhuang

fbshipit-source-id: c118d95cc9b4e2fa669a06e6aa531541fa885dc5
2020-07-14 12:12:36 -07:00
mrambacher
c7c7b07f06 More Makefile Cleanup (#7097)
Summary:
Cleans up some of the dependencies on test code in the Makefile while building tools:
- Moves the test::RandomString, DBBaseTest::RandomString into Random
- Moves the test::RandomHumanReadableString into Random
- Moves the DestroyDir method into file_utils
- Moves the SetupSyncPointsToMockDirectIO into sync_point.
- Moves the FaultInjection Env and FS classes under env

These changes allow all of the tools to build without dependencies on test_util, thereby simplifying the build dependencies.  By moving the FaultInjection code, the dependency in db_stress on different libraries for debug vs release was eliminated.

Tested both release and debug builds via Make and CMake for both static and shared libraries.

More work remains to clean up how the tools are built and remove some unnecessary dependencies.  There is also more work that should be done to get the Makefile and CMake to align in their builds -- what is in the libraries and the sizes of the executables are different.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7097

Reviewed By: riversand963

Differential Revision: D22463160

Pulled By: pdillinger

fbshipit-source-id: e19462b53324ab3f0b7c72459dbc73165cc382b2
2020-07-09 14:35:17 -07:00
Yanqin Jin
842bd2742a Running ./ldb without any extra arg print usage (#7107)
Summary:
as title.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7107

Test Plan: make ldb && ./ldb

Reviewed By: pdillinger

Differential Revision: D22451399

Pulled By: riversand963

fbshipit-source-id: 797645e06473bb9cf139c533877e5161281515e8
2020-07-09 10:20:06 -07:00
Peter Dillinger
90fd6b0cc8 cf_consistency_stress (crash_test_with_atomic_flush) checkpoint clean (#7103)
Summary:
Delicious copy-pasta from https://github.com/facebook/rocksdb/issues/7039

Also fixing DestroyDir to allow files to go missing while it is operating. This seems to fix failures I got with test plan reproducer.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7103

Test Plan:
make blackbox_crash_test_with_atomic_flush for a while with
checkpoint_one_in=100

Reviewed By: siying

Differential Revision: D22435315

Pulled By: pdillinger

fbshipit-source-id: 0ec0538402493887aeda43ecc03f32979cb84ced
2020-07-08 13:04:55 -07:00
Jay Zhuang
ca7659e2c4 Fix release build caused by #7067 (#7077)
Summary:
The issue is introduced by https://github.com/facebook/rocksdb/issues/7067

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7077

Test Plan: `make release`

Reviewed By: pdillinger

Differential Revision: D22370835

Pulled By: jay-zhuang

fbshipit-source-id: 44326bae07809c4518371b6a7d1f47124e24a4f3
2020-07-02 20:53:08 -07:00
Jay Zhuang
00de699096 Replace reinterpret_cast with static_cast_with_check (#7067)
Summary:
Replace `reinterpret_cast` with `static_cast_with_check` for `DBImpl` and `ColumnFamilyHandleImpl`.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/7067

Reviewed By: siying

Differential Revision: D22361587

Pulled By: jay-zhuang

fbshipit-source-id: dfe9e8f3af39c3d27cc372c55ab9ad905eb0a5a1
2020-07-02 19:25:41 -07:00
mrambacher
80f71b5863 Use Libraries in the RocksDB Makefile Build (#6660)
Summary:
Change the linking of tests/tools to be against a library rather than a list of objects.  This change substantially reduces the size of the objects produced.

peterd clean repo size: 264M
Before this change, with make all: 40G
After this change, with make all: 28G
With make LIB_MODE=shared all: 7.0G

The list of TESTS was changed from being hard-coded to generated from the test sources variable.  Note that there are some test sources that are not built as tests (though the set of tests is identical to the previous version).

Added OBJ_DIR option to Makefile to allow objects to be placed in an alternative location.  By default, OBJ_DIR is the same as before ("./").

This change is a precursor to being able to build/run the tests/tools linked against static libraries.  Additionally, it should be possible to clean up and merge some of the rules for building tests and the like if so desired.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6660

Reviewed By: riversand963

Differential Revision: D22244463

Pulled By: pdillinger

fbshipit-source-id: db9c6341d81ed62c2270374f4ede02fb9604c754
2020-06-30 19:33:31 -07:00
Cheng Chang
f045ee6422 Increase transaction timeout and enable deadlock detection in stress test (#7056)
Summary:
There are errors like `Transaction put: Operation timed out: Timeout waiting to lock key
terminate called without an active exception`, based on experiment on devserver, increasing timeouts can resolve the issue.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/7056

Test Plan: watch stress test with txn.

Reviewed By: anand1976

Differential Revision: D22317265

Pulled By: cheng-chang

fbshipit-source-id: 2dc3352def5e78d2c39a18d7262a3a65ca98bbba
2020-06-30 14:29:17 -07:00
sdong
2d1d51d385 db_stress: deep clean directory before checkpoint (#7039)
Summary:
We see crash test occassionally fails with "A checkpoint operation failed with: Invalid argument: Directory exists". The suspicious is that the directory fails to be deleted because some trash files. Deep clean the directory after a DestroyDB() call.

Also add more debugging printf in case it fails.
Also, preserve the DB if verification fails.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/7039

Test Plan: Run db_stress with low --checkpoint_one_in value

Reviewed By: riversand963

Differential Revision: D22271694

fbshipit-source-id: 6a9b2abb664fc69a4dc666741df4f6b23703cd6d
2020-06-30 12:01:34 -07:00
Peter Dillinger
5b2bbacb6f Minimize memory internal fragmentation for Bloom filters (#6427)
Summary:
New experimental option BBTO::optimize_filters_for_memory builds
filters that maximize their use of "usable size" from malloc_usable_size,
which is also used to compute block cache charges.

Rather than always "rounding up," we track state in the
BloomFilterPolicy object to mix essentially "rounding down" and
"rounding up" so that the average FP rate of all generated filters is
the same as without the option. (YMMV as heavily accessed filters might
be unluckily lower accuracy.)

Thus, the option near-minimizes what the block cache considers as
"memory used" for a given target Bloom filter false positive rate and
Bloom filter implementation. There are no forward or backward
compatibility issues with this change, though it only works on the
format_version=5 Bloom filter.

With Jemalloc, we see about 10% reduction in memory footprint (and block
cache charge) for Bloom filters, but 1-2% increase in storage footprint,
due to encoding efficiency losses (FP rate is non-linear with bits/key).

Why not weighted random round up/down rather than state tracking? By
only requiring malloc_usable_size, we don't actually know what the next
larger and next smaller usable sizes for the allocator are. We pick a
requested size, accept and use whatever usable size it has, and use the
difference to inform our next choice. This allows us to narrow in on the
right balance without tracking/predicting usable sizes.

Why not weight history of generated filter false positive rates by
number of keys? This could lead to excess skew in small filters after
generating a large filter.

Results from filter_bench with jemalloc (irrelevant details omitted):

    (normal keys/filter, but high variance)
    $ ./filter_bench -quick -impl=2 -average_keys_per_filter=30000 -vary_key_count_ratio=0.9
    Build avg ns/key: 29.6278
    Number of filters: 5516
    Total size (MB): 200.046
    Reported total allocated memory (MB): 220.597
    Reported internal fragmentation: 10.2732%
    Bits/key stored: 10.0097
    Average FP rate %: 0.965228
    $ ./filter_bench -quick -impl=2 -average_keys_per_filter=30000 -vary_key_count_ratio=0.9 -optimize_filters_for_memory
    Build avg ns/key: 30.5104
    Number of filters: 5464
    Total size (MB): 200.015
    Reported total allocated memory (MB): 200.322
    Reported internal fragmentation: 0.153709%
    Bits/key stored: 10.1011
    Average FP rate %: 0.966313

    (very few keys / filter, optimization not as effective due to ~59 byte
     internal fragmentation in blocked Bloom filter representation)
    $ ./filter_bench -quick -impl=2 -average_keys_per_filter=1000 -vary_key_count_ratio=0.9
    Build avg ns/key: 29.5649
    Number of filters: 162950
    Total size (MB): 200.001
    Reported total allocated memory (MB): 224.624
    Reported internal fragmentation: 12.3117%
    Bits/key stored: 10.2951
    Average FP rate %: 0.821534
    $ ./filter_bench -quick -impl=2 -average_keys_per_filter=1000 -vary_key_count_ratio=0.9 -optimize_filters_for_memory
    Build avg ns/key: 31.8057
    Number of filters: 159849
    Total size (MB): 200
    Reported total allocated memory (MB): 208.846
    Reported internal fragmentation: 4.42297%
    Bits/key stored: 10.4948
    Average FP rate %: 0.811006

    (high keys/filter)
    $ ./filter_bench -quick -impl=2 -average_keys_per_filter=1000000 -vary_key_count_ratio=0.9
    Build avg ns/key: 29.7017
    Number of filters: 164
    Total size (MB): 200.352
    Reported total allocated memory (MB): 221.5
    Reported internal fragmentation: 10.5552%
    Bits/key stored: 10.0003
    Average FP rate %: 0.969358
    $ ./filter_bench -quick -impl=2 -average_keys_per_filter=1000000 -vary_key_count_ratio=0.9 -optimize_filters_for_memory
    Build avg ns/key: 30.7131
    Number of filters: 160
    Total size (MB): 200.928
    Reported total allocated memory (MB): 200.938
    Reported internal fragmentation: 0.00448054%
    Bits/key stored: 10.1852
    Average FP rate %: 0.963387

And from db_bench (block cache) with jemalloc:

    $ ./db_bench -db=/dev/shm/dbbench.no_optimize -benchmarks=fillrandom -format_version=5 -value_size=90 -bloom_bits=10 -num=2000000 -threads=8 -compaction_style=2 -fifo_compaction_max_table_files_size_mb=10000 -fifo_compaction_allow_compaction=false
    $ ./db_bench -db=/dev/shm/dbbench -benchmarks=fillrandom -format_version=5 -value_size=90 -bloom_bits=10 -num=2000000 -threads=8 -optimize_filters_for_memory -compaction_style=2 -fifo_compaction_max_table_files_size_mb=10000 -fifo_compaction_allow_compaction=false
    $ (for FILE in /dev/shm/dbbench.no_optimize/*.sst; do ./sst_dump --file=$FILE --show_properties | grep 'filter block' ; done) | awk '{ t += $4; } END { print t; }'
    17063835
    $ (for FILE in /dev/shm/dbbench/*.sst; do ./sst_dump --file=$FILE --show_properties | grep 'filter block' ; done) | awk '{ t += $4; } END { print t; }'
    17430747
    $ #^ 2.1% additional filter storage
    $ ./db_bench -db=/dev/shm/dbbench.no_optimize -use_existing_db -benchmarks=readrandom,stats -statistics -bloom_bits=10 -num=2000000 -compaction_style=2 -fifo_compaction_max_table_files_size_mb=10000 -fifo_compaction_allow_compaction=false -duration=10 -cache_index_and_filter_blocks -cache_size=1000000000
    rocksdb.block.cache.index.add COUNT : 33
    rocksdb.block.cache.index.bytes.insert COUNT : 8440400
    rocksdb.block.cache.filter.add COUNT : 33
    rocksdb.block.cache.filter.bytes.insert COUNT : 21087528
    rocksdb.bloom.filter.useful COUNT : 4963889
    rocksdb.bloom.filter.full.positive COUNT : 1214081
    rocksdb.bloom.filter.full.true.positive COUNT : 1161999
    $ #^ 1.04 % observed FP rate
    $ ./db_bench -db=/dev/shm/dbbench -use_existing_db -benchmarks=readrandom,stats -statistics -bloom_bits=10 -num=2000000 -compaction_style=2 -fifo_compaction_max_table_files_size_mb=10000 -fifo_compaction_allow_compaction=false -optimize_filters_for_memory -duration=10 -cache_index_and_filter_blocks -cache_size=1000000000
    rocksdb.block.cache.index.add COUNT : 33
    rocksdb.block.cache.index.bytes.insert COUNT : 8448592
    rocksdb.block.cache.filter.add COUNT : 33
    rocksdb.block.cache.filter.bytes.insert COUNT : 18220328
    rocksdb.bloom.filter.useful COUNT : 5360933
    rocksdb.bloom.filter.full.positive COUNT : 1321315
    rocksdb.bloom.filter.full.true.positive COUNT : 1262999
    $ #^ 1.08 % observed FP rate, 13.6% less memory usage for filters

(Due to specific key density, this example tends to generate filters that are "worse than average" for internal fragmentation. "Better than average" cases can show little or no improvement.)
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6427

Test Plan: unit test added, 'make check' with gcc, clang and valgrind

Reviewed By: siying

Differential Revision: D22124374

Pulled By: pdillinger

fbshipit-source-id: f3e3aa152f9043ddf4fae25799e76341d0d8714e
2020-06-22 13:32:07 -07:00
Andrew Kryczka
d76eed4839 minor fixes for stress/crash contruns (#7006)
Summary:
Avoid using `cf_consistency` together with `enable_compaction_filter` as
the former heavily uses snapshots while the latter is incompatible with
snapshots.

Also fix a clang-analyze error for a write to a variable that is never
read.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/7006

Reviewed By: zhichao-cao

Differential Revision: D22141679

Pulled By: ajkr

fbshipit-source-id: 1840ae238168818a9ab5973f90fd78c067399447
2020-06-19 16:05:17 -07:00
Peter Dillinger
88b4210701 Remove racially charged terms "whitelist" and "blacklist" (#7008)
Summary:
We don't need them.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/7008

Test Plan: "make check" and ensure "make crash_test" starts

Reviewed By: ajkr

Differential Revision: D22143838

Pulled By: pdillinger

fbshipit-source-id: 72c8e16603abc59f4954e304466bc4dc1f58f94e
2020-06-19 15:27:32 -07:00
Zhichao Cao
a607f3efaa Fix unused variable failure (#7004)
Summary:
pass make check, run db_stress
Pull Request resolved: https://github.com/facebook/rocksdb/pull/7004

Reviewed By: ajkr

Differential Revision: D22132617

Pulled By: zhichao-cao

fbshipit-source-id: d65397967e213206ec5efcb767bbdda8a575662a
2020-06-18 22:06:51 -07:00
Andrew Kryczka
775dc623ad add CompactionFilter to stress/crash tests (#6988)
Summary:
Added a `CompactionFilter` that is aware of the stress test's expected state. It only drops key versions that are already covered according to the expected state. It is incompatible with snapshots (same as all `CompactionFilter`s), so disables all snapshot-related features when used in the crash test.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6988

Test Plan:
running a minified blackbox crash test

```
$ TEST_TMPDIR=/dev/shm python tools/db_crashtest.py blackbox --max_key=1000000 -write_buffer_size=1048576 -max_bytes_for_level_base=4194304 -target_file_size_base=1048576 -value_size_mult=33 --interval=10 --duration=3600
```

Reviewed By: anand1976

Differential Revision: D22072888

Pulled By: ajkr

fbshipit-source-id: 727b9d7a90d5eab18be0ec6cd5a810712ac13320
2020-06-18 09:54:55 -07:00
Yanqin Jin
15d9f28da5 Add stress test for best-efforts recovery (#6819)
Summary:
Add crash test for the case of best-efforts recovery.
After a certain amount of time, we kill the db_stress process, randomly delete some certain table files and restart db_stress. Given the randomness of file deletion, it is difficult to verify against a reference for data correctness. Therefore, we just check that the db can restart successfully.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6819

Test Plan:
```
./db_stress -best_efforts_recovery=true -disable_wal=1 -reopen=0
./db_stress -best_efforts_recovery=true -disable_wal=0 -skip_verifydb=1 -verify_db_one_in=0 -continuous_verification_interval=0
make crash_test_with_best_efforts_recovery
```

Reviewed By: anand1976

Differential Revision: D21436753

Pulled By: riversand963

fbshipit-source-id: 0b3605c922a16c37ed17d5ab6682ca4240e47926
2020-06-12 19:27:46 -07:00
Zhichao Cao
2adb7e3768 Fix potential overflow of unsigned type in for loop (#6902)
Summary:
x.size() -1 or y - 1 can overflow to an extremely large value when x.size() pr y is 0 when they are unsigned type. The end condition of i in the for loop will be extremely large, potentially causes segment fault. Fix them.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6902

Test Plan: pass make asan_check

Reviewed By: ajkr

Differential Revision: D21843767

Pulled By: zhichao-cao

fbshipit-source-id: 5b8b88155ac5a93d86246d832e89905a783bb5a1
2020-06-02 15:05:07 -07:00
Andrew Kryczka
b464a85e33 fix transaction rollback in db_stress TestMultiGet (#6873)
Summary:
There were further uses of `txn` after `RollbackTxn(txn)` leading to
stress test errors. Moved the rollback to the end of the function.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6873

Test Plan:
found a command from the crash test that previously failed immediately under TSAN; verified now it succeeds.
```
./db_stress --acquire_snapshot_one_in=10000 --allow_concurrent_memtable_write=1 --avoid_flush_during_recovery=0 --avoid_unnecessary_blocking_io=1 --block_size=16384 --bloom_bits=222.913637674 --bottommost_compression_type=none --cache_index_and_filter_blocks=1 --cache_size=1048576 --checkpoint_one_in=0 --checksum_type=kCRC32c --clear_column_family_one_in=0 --compact_files_one_in=1000000 --compact_range_one_in=1000000 --compaction_style=1 --compaction_ttl=0 --compression_max_dict_bytes=0 --compression_parallel_threads=1 --compression_type=zstd --compression_zstd_max_train_bytes=0 --continuous_verification_interval=0 --db=/dev/shm/rocksdb/rocksdb_crashtest_whitebox --db_write_buffer_size=1048576 --delpercent=5 --delrangepercent=0 --destroy_db_initially=0 --disable_wal=0 --enable_pipelined_write=0 --flush_one_in=1000000 --format_version=5 --get_current_wal_file_one_in=0 --get_live_files_one_in=1000000 --get_sorted_wal_files_one_in=0 --index_block_restart_interval=12 --index_type=2 --key_len_percent_dist=1,30,69 --level_compaction_dynamic_level_bytes=True --log2_keys_per_lock=22 --long_running_snapshots=0 --max_background_compactions=20 --max_bytes_for_level_base=10485760 --max_key=1000000 --max_key_len=3 --max_manifest_file_size=1073741824 --max_write_batch_group_size_bytes=1048576 --max_write_buffer_number=3 --memtablerep=skip_list --mmap_read=1 --mock_direct_io=False --nooverwritepercent=1 --num_levels=1 --open_files=100 --ops_per_thread=200000 --partition_filters=1 --pause_background_one_in=1000000 --periodic_compaction_seconds=0 --prefixpercent=5 --progress_reports=0 --read_fault_one_in=0 --readpercent=45 --recycle_log_file_num=0 --reopen=20 --snapshot_hold_ops=100000 --subcompactions=4 --sync=0 --sync_fault_injection=False --target_file_size_base=2097152 --target_file_size_multiplier=2 --test_batches_snapshots=0 --txn_write_policy=1 --unordered_write=1 --use_block_based_filter=0 --use_direct_io_for_flush_and_compaction=0 --use_direct_reads=0 --use_full_merge_v1=1 --use_merge=1 --use_multiget=1 --use_txn=1 --verify_checksum=1 --verify_checksum_one_in=1000000 --verify_db_one_in=100000 --write_buffer_size=4194304 --write_dbid_to_manifest=0 --writepercent=35
```

Reviewed By: pdillinger

Differential Revision: D21708338

Pulled By: ajkr

fbshipit-source-id: dcf55cddee0a14f429a75e7a8a505acf8025f2b1
2020-05-24 15:27:24 -07:00
anand76
eb04bb86c6 Fix a bug in crash_test_with_txn (#6860)
Summary:
In NoBatchedOpsStress::TestMultiGet, call txn->Get() when transactions
are in use.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6860

Test Plan: make crash_test_with_txn

Reviewed By: pdillinger

Differential Revision: D21667249

Pulled By: anand1976

fbshipit-source-id: 194bd7b9630a8efc3ae29d85422a61214e9e200e
2020-05-20 14:47:05 -07:00
anand76
39b24432d4 Strengthen MultiGet correctness verification in NoBatchedOpsStress (#6849)
Summary:
Add MultiGet to VerifyDb and check consistency with Get in TestMultiGet.

Test plan -
make crash_test
ASAN crash test
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6849

Reviewed By: pdillinger

Differential Revision: D21635011

Pulled By: anand1976

fbshipit-source-id: deb5a79d08fefd8d8010204f1f20b83adc92310e
2020-05-19 12:47:22 -07:00
Tongliang Liao
07204837ce Mark dependencies as PRIVATE and fix missing dependencies in tools. (#6790)
Summary:
Tools were mistakenly using leaked (PUBLIC) dependencies from main lib.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6790

Reviewed By: riversand963

Differential Revision: D21471551

fbshipit-source-id: ec43b92e231777e0fcf0f865444391af09d6963b
2020-05-12 21:07:55 -07:00
Andrew Kryczka
1f20df2f38 cover single level universal in crash test (#6818)
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/6818

Test Plan:
fast whitebox test and verify there are some single-level universal and
some multi-level universal runs.

```
$ python ./tools/db_crashtest.py whitebox --simple -max_key=1000000 -value_size_mult=33 -write_buffer_size=524288 -target_file_size_base=524288 -max_bytes_for_level_base=2097152 --duration=120 --interval=10 --ops_per_thread=1000 --random_kill_odd=887
```

Reviewed By: riversand963

Differential Revision: D21432138

Pulled By: ajkr

fbshipit-source-id: 2fc5ba9f3dfa49bb11e81da7dd00a17b476e64d7
2020-05-06 18:08:09 -07:00
Ziyue Yang
e619a20e93 Add an option for parallel compression in for db_stress (#6722)
Summary:
This commit adds an `compression_parallel_threads` option in
db_stress. It also fixes the naming of parallel compression
option in db_bench to keep it aligned with others.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6722

Reviewed By: pdillinger

Differential Revision: D21091385

fbshipit-source-id: c9ba8c4e5cc327ff9e6094a6dc6a15fcff70f100
2020-04-30 10:49:07 -07:00
Cheng Chang
0a77617820 Disable O_DIRECT in stress test when db directory does not support direct IO (#6727)
Summary:
In crash test, the db directory might be set to /dev/shm or /tmp, in certain environments such as internal testing infrastructure, neither of these directories support direct IO, so direct IO is never enabled in crash test.

This PR sets up SyncPoints in direct IO related code paths to disable O_DIRECT flag in calls to `open`, so the direct IO code paths will be executed, all direct IO related assertions will be checked, but no real direct IO request will be issued to the file system.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6727

Test Plan:
export CRASH_TEST_EXT_ARGS="--use_direct_reads=1 --mmap_read=0"
make -j24 crash_test

Reviewed By: zhichao-cao

Differential Revision: D21139250

Pulled By: cheng-chang

fbshipit-source-id: db9adfe78d91aa4759835b1af91c5db7b27b62ee
2020-04-25 00:01:03 -07:00
anand76
9e7b7e2c08 Silence false alarms in db_stress fault injection (#6741)
Summary:
False alarms are caused by codepaths that intentionally swallow IO
errors.

Tests:
make crash_test
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6741

Reviewed By: ltamasi

Differential Revision: D21181138

Pulled By: anand1976

fbshipit-source-id: 5ccfbc68eb192033488de6269e59c00f2c65ce00
2020-04-24 13:06:12 -07:00
sdong
73523baeb1 crash_test to cover options.avoid_flush_during_recovery (#6712)
Summary:
Options.avoid_flush_during_recovery is uncovered in crash_test. Add the coverage with a chance of 1/8, as it is a less frequently used options.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6712

Test Plan: Run crash_test and see the option can be used or not used by chance.

Reviewed By: ltamasi

Differential Revision: D21056566

fbshipit-source-id: c3b1521517cfc204786e6ef8c6acd7fffda64793
2020-04-16 12:11:45 -07:00
Yueh-Hsuan Chiang
5801af4646 Add env_fault_injection argument to db_stress (#6687)
Summary:
Add env_fault_injection argument to db_stress.  When enabled,
FaultInjectionTestEnv will be used instead.  Currently this
option does not support running with other env setting.

This will allow
us to later manually produce error when running db_crashtest.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6687

Test Plan:
make db_stress -j32
./db_stress --env_fault_injection
./db_stress --env_fault_injection --hdfs   // expect error message

Reviewed By: ajkr

Differential Revision: D21014683

Pulled By: yhchiang

fbshipit-source-id: 0724aeac37efd57adb72a37defe6dbd3bfa8106a
2020-04-16 11:13:44 -07:00
anand76
610a09ccff Remove a printf from db_stress that's not useful info (#6705)
Summary:
This was causing db_crashtest.py to wrongly assume an error by parsing the output. Hopefully this will stabilize the crash tests.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6705

Test Plan: make blackbox_crash_test

Reviewed By: ltamasi

Differential Revision: D21043335

Pulled By: anand1976

fbshipit-source-id: 5cddd112b124d4e2ebd11724a17d4ef0f50c1cf8
2020-04-15 12:13:35 -07:00
anand76
234e2ed5b6 Fix a couple of bugs in db_stress fault injection (#6700)
Summary:
1. Fix a memory leak in FaultInjectionTestFS in the stack trace related
code
2. Check status of all MultiGet keys before deciding whether an error
was swallowed, instead of assuming an ok status for any key means an
undetected error
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6700

Test Plan: Run db_stress with asan and fault injection

Reviewed By: cheng-chang

Differential Revision: D21021498

Pulled By: anand1976

fbshipit-source-id: 489191efd1ab0fa834923a1e1d57253a7a315465
2020-04-14 11:06:55 -07:00
anand76
5c19a441c4 Fault injection in db_stress (#6538)
Summary:
This PR implements a fault injection mechanism for injecting errors in reads in db_stress. The FaultInjectionTestFS is used for this purpose. A thread local structure is used to track the errors, so that each db_stress thread can independently enable/disable error injection and verify observed errors against expected errors. This is initially enabled only for Get and MultiGet, but can be extended to iterator as well once its proven stable.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6538

Test Plan:
crash_test
make check

Reviewed By: riversand963

Differential Revision: D20714347

Pulled By: anand1976

fbshipit-source-id: d7598321d4a2d72bda0ced57411a337a91d87dc7
2020-04-10 17:21:26 -07:00
Levi Tamasi
217ce20021 Remove GetSortedWalFiles/GetCurrentWalFile from the crash test (#6491)
Summary:
Currently, `db_stress` tests a randomly picked one of `GetLiveFiles`,
`GetSortedWalFiles`, and `GetCurrentWalFile` with a 1/N chance when the
command line parameter `get_live_files_and_wal_files_one_in` is specified.
The problem is that `GetSortedWalFiles` and `GetCurrentWalFile` are unreliable
in the sense that they can return errors if another thread removes a WAL file
while they are executing (which is a perfectly plausible and legitimate scenario).
The patch splits this command line parameter into three (one for each API),
and changes the crash test script so that only `GetLiveFiles` is tested during
our continuous crash test runs.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6491

Test Plan:
```
make check
python tools/db_crashtest.py whitebox
```

Reviewed By: siying

Differential Revision: D20312200

Pulled By: ltamasi

fbshipit-source-id: e7c3481eddfe3bd3d5349476e34abc9eee5b7dc8
2020-03-18 17:14:15 -07:00
Yanqin Jin
58918d4ccc Use correct Env for DestroyDB in stress test (#6539)
Summary:
When using custom Env, trying to call DestroyDB() with default Options will
fail.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6539

Test Plan: ./db_stress

Differential Revision: D20476204

Pulled By: riversand963

fbshipit-source-id: 612c6754660cc9b5bb3e9c2dbb2f6ecd7f648797
2020-03-16 16:57:48 -07:00
Andrew Kryczka
f52db84650 support SstFileManager in db_stress (#6454)
Summary:
Add some flags for configuring an SstFileManager. An
SstFileManager is only created when one or more of these flags are
set.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6454

Test Plan:
- ran it a while:

```
$ python ./tools/db_crashtest.py blackbox --simple -max_key=100000 -write_buffer_size=131072 -target_file_size_base=131072 -max_bytes_for_level_base=524288 -value_size_mult=33 --interval=10 -max_background_compactions=4 -max_background_flushes=2 -sst_file_manager_bytes_per_sec=1048576
```

- verified with strace the SstFileManager is behaving as configured:

```
$ strace -fp `pidof db_stress` -e ftruncate,unlink
...
[pid 3074805]
ftruncate(9</tmp/rocksdb_crashtest_blackbox6OJywh/000070.sst.trash>,
67423) = 0
[pid 3074805]
ftruncate(9</tmp/rocksdb_crashtest_blackbox6OJywh/000070.sst.trash>,
51039) = 0
[pid 3074805]
ftruncate(9</tmp/rocksdb_crashtest_blackbox6OJywh/000070.sst.trash>,
34655) = 0
[pid 3074805]
ftruncate(9</tmp/rocksdb_crashtest_blackbox6OJywh/000070.sst.trash>,
18271) = 0
[pid 3074805]
ftruncate(9</tmp/rocksdb_crashtest_blackbox6OJywh/000070.sst.trash>,
1887) = 0
[pid 3074805]
unlink("/tmp/rocksdb_crashtest_blackbox6OJywh/000070.sst.trash") = 0
...
```

Differential Revision: D20103315

Pulled By: ajkr

fbshipit-source-id: b3e1092747157459d244b047947a979b85c98f48
2020-02-25 16:45:30 -08:00
sdong
fdf882ded2 Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433)
Summary:
When dynamically linking two binaries together, different builds of RocksDB from two sources might cause errors. To provide a tool for user to solve the problem, the RocksDB namespace is changed to a flag which can be overridden in build time.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6433

Test Plan: Build release, all and jtest. Try to build with ROCKSDB_NAMESPACE with another flag.

Differential Revision: D19977691

fbshipit-source-id: aa7f2d0972e1c31d75339ac48478f34f6cfcfb3e
2020-02-20 12:09:57 -08:00
Maysam Yabandeh
01ab882ba3 Fix release warning for unused bg_canceled
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/6357

Differential Revision: D19670931

Pulled By: maysamyabandeh

fbshipit-source-id: d528c4c7f9450f1f38b9d2a36e0d5d0865b39be9
2020-01-31 15:09:10 -08:00
Maysam Yabandeh
2243030bc5 Cancel bg jobs before deleting WritePrepared DB in stress tests (#6355)
Summary:
Background jobs in WritePrepared DB might access the db via a snapshot checker callback. The stress tests therefore should cancel background jobs before deleting the db in ::Reopen.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6355

Differential Revision: D19664132

Pulled By: maysamyabandeh

fbshipit-source-id: 6060a830e8aad0015c10448286ad37c8a346ac01
2020-01-31 10:29:11 -08:00
Burton Li
c9a5e48762 fix build warnnings on MSVC (#6309)
Summary:
Fix build warnings on MSVC. siying
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6309

Differential Revision: D19455012

Pulled By: ltamasi

fbshipit-source-id: 940739f2c92de60e47cc2bed8dd7f921459545a9
2020-01-30 16:07:26 -08:00
sdong
8f2bee6747 Add ReadOptions.auto_prefix_mode (#6314)
Summary:
Add a new option ReadOptions.auto_prefix_mode. When set to true, iterator should return the same result as total order seek, but may choose to do prefix seek internally, based on iterator upper bounds. Also fix two previous bugs when handling prefix extrator changes: (1) reverse iterator should not rely on upper bound to determine prefix. Fix it with skipping prefix check. (2) block-based filter is not handled properly.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6314

Test Plan: (1) add a unit test; (2) add the check to stress test and run see whether it can pass at least one run.

Differential Revision: D19458717

fbshipit-source-id: 51c1bcc5cdd826c2469af201979a39600e779bce
2020-01-28 14:44:05 -08:00
anand76
687119aeaf Variable key length in db_stress (#6273)
Summary:
Undo https://github.com/facebook/rocksdb/issues/6243 and fix the crash test failures.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6273

Test Plan: Run make ubsan_crash_test

Differential Revision: D19331472

Pulled By: anand1976

fbshipit-source-id: 30aa4a36c1b0f77a97159d82bbfd1cd767878e28
2020-01-09 21:27:18 -08:00
sdong
1244abef66 Stress Test: relax prefix iterator check condition (#6269)
Summary:
Right now, when validating prefix iterator, if control iterator is invalidate but prefix iterator shows value, we determine it as a test failure. However, this fails to consider the case where a file or memtable containing a tombstone is filtered out by a prefix bloom filter. The fix is to relax the check in this case. If we are out of prefix range, then ignore the check.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6269

Test Plan: Run crash_test for a short while and it still passes.

Differential Revision: D19317594

fbshipit-source-id: b964a1cdc1df5efe439d4b32f8023e1fbc8598c1
2020-01-08 13:32:06 -08:00