rocksdb

Author	SHA1	Message	Date
Maysam Yabandeh	2f4e288143	Enable partitioned index/filter in stress tests (#5895 ) Summary: This is the 2nd attempt after the revert of https://github.com/facebook/rocksdb/pull/4020 Pull Request resolved: https://github.com/facebook/rocksdb/pull/5895 Test Plan: ``` ./tools/db_crashtest.py blackbox --simple --interval=10 --max_key=10000000 ``` Differential Revision: D17822137 Pulled By: maysamyabandeh fbshipit-source-id: 3d148c0d8cc129080410ff859c04b544223c8ea3	2019-10-08 16:50:21 -07:00
sdong	679a45d0cb	crash_test to do some verification for prefix extractor and iterator bounds. (#5846 ) Summary: For now, crash_test is not able to report any failure for the logic related to iterator upper, lower bounds or iterators, or reseek. These are features prone to errors. Improve db_stress in several ways: (1) For each iterator run, reseek up to 3 times. (2) For every iterator, create control iterator with upper or lower bound, with total order seek. Compare the results with the iterator. (3) Make simple crash test to avoid prefix size to have more coverage. (4) make prefix_size = 0 a valid size and -1 to indicate disabling prefix extractor. Pull Request resolved: https://github.com/facebook/rocksdb/pull/5846 Test Plan: Manually hack the code to create wrong results and see they are caught by the tool. Differential Revision: D17631760 fbshipit-source-id: acd460a177bd2124a5ffd7fff490702dba63030b	2019-09-27 11:10:44 -07:00
Maysam Yabandeh	6ec6a4a9a4	Remove snap_refresh_nanos option (#5826 ) Summary: The snap_refresh_nanos option didn't bring much benefit. Remove the feature to simplify the code. Pull Request resolved: https://github.com/facebook/rocksdb/pull/5826 Differential Revision: D17467147 Pulled By: maysamyabandeh fbshipit-source-id: 4f950b046990d0d1292d7fc04c2ccafaf751c7f0	2019-09-18 20:26:04 -07:00
Levi Tamasi	94d62d771e	Temporarily disable partitioned index/filter in stress test (#5811 ) Summary: PR https://github.com/facebook/rocksdb/issues/4020 enabled partitioned indexes/filters in stress tests; however, this causes assertion failures in BatchedOpsStressTest. This patch disables them until we can root cause the failures. Pull Request resolved: https://github.com/facebook/rocksdb/pull/5811 Test Plan: Ran the script and made sure it only uses the binary search index. Differential Revision: D17399366 Pulled By: ltamasi fbshipit-source-id: adb116e6297f9c6ccd7ac15b6a16c9aa91f21ac5	2019-09-16 11:41:35 -07:00
Levi Tamasi	d35ffd569c	Temporarily disable hash index in stress tests (#5792 ) Summary: PR https://github.com/facebook/rocksdb/issues/4020 implicitly enabled the hash index as well in stress/crash tests, resulting in assertion failures in Block. This patch disables the hash index until we can pinpoint the root cause of these issues. Pull Request resolved: https://github.com/facebook/rocksdb/pull/5792 Test Plan: Ran tools/db_crashtest.py and made sure it only uses index types 0 and 2 (binary search and partitioned index). Differential Revision: D17346777 Pulled By: ltamasi fbshipit-source-id: b4318f37f1fda3ee1bbff4ef2c2f556ca9e6b551	2019-09-12 12:11:34 -07:00
Andrew Kryczka	dd2a35f13f	Support partitioned index and filters in stress/crash tests (#4020 ) Summary: - In `db_stress`, support choosing index type and whether to enable filter partitioning, and randomly set those options in crash test - When partitioned filter is enabled by crash test, force partitioned index to also be enabled since it's a prerequisite Pull Request resolved: https://github.com/facebook/rocksdb/pull/4020 Test Plan: currently this is blocked on fixing the bug that crash test caught: ``` $ TEST_TMPDIR=/data/compaction_bench python ./tools/db_crashtest.py blackbox --simple --interval=10 --max_key=10000000 ... Verification failed for column family 0 key 937501: Value not found: NotFound: Crash-recovery verification failed :( ``` Differential Revision: D8508683 Pulled By: maysamyabandeh fbshipit-source-id: 0337e5d0558bcef26b1f3699f47265a2c1e99629	2019-09-11 14:13:38 -07:00
sdong	1daff8f85a	crash_test to skip compaction TTL for FIFO compaction. (#5749 ) Summary: https://github.com/facebook/rocksdb/pull/5741 added compaction TTL to crash test, but it causes assertion failures for FIFO compaction. Disable this combination for now while we debug the assertion failure. Pull Request resolved: https://github.com/facebook/rocksdb/pull/5749 Test Plan: Run crash test and observe that when compaction_style=2, compaction_ttl is always 0. Differential Revision: D17078292 fbshipit-source-id: 446821a3b9739956094d5e4f9be1251a15b57f5d	2019-08-27 17:55:37 -07:00
sdong	1d6a10f52d	Extend stress test to cover periodic compaction and compaction TTL (#5741 ) Summary: Covering periodic compaction and compaction TTL can help us expose potential issues. Add it there. Randomly select values for these two options. Pull Request resolved: https://github.com/facebook/rocksdb/pull/5741 Test Plan: Run crash_test and see the parameters generated. Differential Revision: D17059515 fbshipit-source-id: 8213974846a0b6a22fc13be705825c9054d1d097	2019-08-26 15:03:25 -07:00
sdong	d8a27d9331	Atomic Flush Crash Test also covers the case that WAL is enabled. (#5729 ) Summary: AtomicFlushStressTest is a powerful test, but right now we only run it for atomic_flush=true + disable_wal=true. We further extend it to the case where atomic_flush=false + disable_wal = false. All the workload generation and validation can stay the same. Atomic flush crash test is also changed to switch between the two test scenarios. It makes the name "atomic flush crash test" out of sync with what it really does. We leave it as it is to avoid troubles with continuous test set-up. Pull Request resolved: https://github.com/facebook/rocksdb/pull/5729 Test Plan: Run "CRASH_TEST_KILL_ODD=188 TEST_TMPDIR=/dev/shm/ USE_CLANG=1 make whitebox_crash_test_with_atomic_flush", observe the settings used and see it passed. Differential Revision: D16969791 fbshipit-source-id: 56e37487000ae631e31b0100acd7bdc441c04163	2019-08-22 16:32:55 -07:00
sdong	8e12638f3d	Slightly adjust atomic white box test's kill odd (#5717 ) Summary: Atomic white box test's kill odd is the same as normal test. However, in the scenario that only WritableFileWriter::Append() is blacklisted, WritableFileWriter::Flush() dominates the killing odds. Normally, most of WritableFileWriter::Flush() are called in WAL writes, where every write triggers a WAL flush. In atomic test, WAL is disabled, so the kill happens less frequently than we anticipated. In some rare cases, the kill ended up not happening (for reasons I still don't fully understand) and caused the stress test to time out. If WAL is disabled, make the kill 5x as likely to trigger. Pull Request resolved: https://github.com/facebook/rocksdb/pull/5717 Test Plan: Run whitebox_crash_test_with_atomic_flush and whitebox_crash_test and observe the kill odds printed out. Differential Revision: D16897237 fbshipit-source-id: cbf5d96f6fc0e980523d0f1f94bf4e72cdb82d1c	2019-08-19 10:51:59 -07:00
Yanqin Jin	a78503bd6c	Temporarily disable snapshot list refresh for atomic flush stress test (#5581 ) Summary: Atomic flush test started to fail after https://github.com/facebook/rocksdb/issues/5099. Then https://github.com/facebook/rocksdb/issues/5278 provided a fix after which the same error occurred much less frequently. However, it still occurs occasionally. Not sure what the root cause is. This PR disables the feature of snapshot list refresh, and we should keep an eye on the failure in the future. Pull Request resolved: https://github.com/facebook/rocksdb/pull/5581 Differential Revision: D16295985 Pulled By: riversand963 fbshipit-source-id: c9e62e65133c52c21b07097de359632ca62571e4	2019-07-22 14:38:16 -07:00
Tim Hatch	a6a9213a36	Fix interpreter lines for files with python2-only syntax. Reviewed By: lisroach Differential Revision: D15362271 fbshipit-source-id: 48fab12ab6e55a8537b19b4623d2545ca9950ec5	2019-07-09 10:51:37 -07:00
Maysam Yabandeh	f9842869cf	Disable pipeline writes in stress test (#5445 ) Summary: The tsan crash tests are failing with a data race complaint with the pipelined write option. Temporarily disable it until its concurrency issues are fixed. Pull Request resolved: https://github.com/facebook/rocksdb/pull/5445 Differential Revision: D15783824 Pulled By: maysamyabandeh fbshipit-source-id: 413a0c3230b86f524fc7eeea2cf8e8375406e65b	2019-06-12 11:12:36 -07:00
anand76	181bb43f08	Fix bugs in FilePickerMultiGet (#5292 ) Summary: This PR fixes a couple of bugs in FilePickerMultiGet that were causing db_stress test failures. The failures were caused by - 1. Improper handling of a key that matches the user key portion of an L0 file's largest key. In this case, the curr_index_in_curr_level file index in L0 for that key was getting incremented, but batch_iter_ was not advanced. By design, all keys in a batch are supposed to be checked against an L0 file before advancing to the next L0 file. Not advancing to the next key in the batch was causing a double increment of curr_index_in_curr_level due to the same key being processed again. 2. Improper handling of a key that matches the user key portion of the largest key in the last file of L1 and higher. This was resulting in a premature end to the processing of the batch for that level when the next key in the batch is a duplicate. Typically, the keys in MultiGet will not be duplicates, but it's good to handle that case correctly. Test - asan_crash make check Pull Request resolved: https://github.com/facebook/rocksdb/pull/5292 Differential Revision: D15282530 Pulled By: anand1976 fbshipit-source-id: d1a6a86e0af273169c3632db22a44d79c66a581f	2019-05-09 13:18:00 -07:00
anand76	930bfa5750	Disable MultiGet from db_stress (#5284 ) Summary: Disable it for now until we can get stress tests to pass consistently. Pull Request resolved: https://github.com/facebook/rocksdb/pull/5284 Differential Revision: D15230727 Pulled By: anand1976 fbshipit-source-id: 239baacdb3c4cd4fb7c4447f7582b9042501d752	2019-05-06 18:26:50 -07:00
Maysam Yabandeh	6a40ee5eb1	Refresh snapshot list during long compactions (2nd attempt) (#5278 ) Summary: Part of compaction cpu goes to processing snapshot list, the larger the list the bigger the overhead. Although the lifetime of most of the snapshots is much shorter than the lifetime of compactions, the compaction conservatively operates on the list of snapshots that it initially obtained. This patch allows the snapshot list to be updated via a callback if the compaction is taking long. This should let the compaction continue more efficiently with a much smaller snapshot list. For simplicity, the feature is disabled in two cases: i) When more than one sub-compaction are sharing the same snapshot list, ii) when Range Delete is used in which the range delete aggregator has its own copy of snapshot list. This fixes the reverted https://github.com/facebook/rocksdb/pull/5099 issue with range deletes. Pull Request resolved: https://github.com/facebook/rocksdb/pull/5278 Differential Revision: D15203291 Pulled By: maysamyabandeh fbshipit-source-id: fa645611e606aa222c7ce53176dc5bb6f259c258	2019-05-03 17:30:22 -07:00
anand76	434ccf2df4	Add option to use MultiGet in db_stress (#5264 ) Summary: The new option will pick a batch size randomly in the range 1-64. It will then space the keys in the batch by random intervals. Pull Request resolved: https://github.com/facebook/rocksdb/pull/5264 Differential Revision: D15175522 Pulled By: anand1976 fbshipit-source-id: c16baa69d0f1ff4cf53c55c813ddd82c8aeb58fc	2019-05-01 23:06:56 -07:00
Andrew Kryczka	b02d0c238d	Init compression dict handle before reading meta-blocks (#5267 ) Summary: At least one of the meta-block loading functions (`ReadRangeDelBlock`) uses the same block reading function (`NewDataBlockIterator`) as data block reads, which means it uses the dictionary handle. However, the dictionary handle was uninitialized while reading meta-blocks, causing readers to receive an error. This situation was only noticed when `cache_index_and_filter_blocks=true`. This PR initializes the handle to null while reading meta-blocks to prevent the error. It also adds support to `db_stress` / `db_crashtest.py` for `cache_index_and_filter_blocks`. Fixes #5263. Pull Request resolved: https://github.com/facebook/rocksdb/pull/5267 Differential Revision: D15149264 Pulled By: maysamyabandeh fbshipit-source-id: 991d38a306c62db5976778bfb050fa3cd4a0671b	2019-04-30 09:50:49 -07:00
Yanqin Jin	210b49cac9	Disable pipelined write in atomic flush stress test (#5266 ) Summary: Since currently pipelined write allows one thread to perform memtable writes while another thread is traversing the `flush_scheduler_`, it will cause an assertion failure in `FlushScheduler::Clear`. To unblock crash recovery tests, we temporarily disable pipelined write when atomic flush is enabled. Pull Request resolved: https://github.com/facebook/rocksdb/pull/5266 Differential Revision: D15142285 Pulled By: riversand963 fbshipit-source-id: a0c20fe4ac543e08feaed602414f982054df7831	2019-04-30 08:12:42 -07:00
Fosco Marotto	6c2bf9e916	Add copyright headers per FB open-source checkup tool. (#5199 ) Summary: internal task: T35568575 Pull Request resolved: https://github.com/facebook/rocksdb/pull/5199 Differential Revision: D14962794 Pulled By: gfosco fbshipit-source-id: 93838ede6d0235eaecff90d200faed9a8515bbbe	2019-04-18 10:55:01 -07:00
Andrew Kryczka	2263f86901	exercise WAL recycling in crash test (#5070 ) Summary: Since this feature affects the WAL behavior, it seems important our crash-recovery tests cover it. Pull Request resolved: https://github.com/facebook/rocksdb/pull/5070 Differential Revision: D14470085 Pulled By: miasantreble fbshipit-source-id: 9b9682a718a926d57d055e0a5ec867efbd2eb9c1	2019-03-15 12:03:26 -07:00
Andrew Kryczka	1218704b61	Fix `compression_zstd_max_train_bytes` coverage in stress test (#4957 ) Summary: Previously `finalize_and_sanitize` function was always zeroing out `compression_zstd_max_train_bytes`. It was only supposed to do that when non-ZSTD compression was used. But since `--compression_type` was an unknown argument (i.e., one that `db_crashtest.py` does not recognize and blindly forwards to `db_stress`), `finalize_and_sanitize` could not tell whether ZSTD was used. This PR fixes it simply by making `--compression_type` a known argument with snappy as default (same as `db_stress`). Pull Request resolved: https://github.com/facebook/rocksdb/pull/4957 Differential Revision: D13994302 Pulled By: ajkr fbshipit-source-id: 1b0baea7331397822830970d3698642eb7a7df65	2019-02-11 14:56:39 -08:00
Andrew Kryczka	68d949b3e3	Enable DeleteRange in stress/crash tests (#4483 ) Summary: Set `delrangepercent=1` when `test_batches_snapshots=false`. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4483 Differential Revision: D10324361 Pulled By: ajkr fbshipit-source-id: 0cde1f1504f9493408a0c6493b976d7e5f5b2d23	2018-12-18 13:42:49 -08:00
Andrew Kryczka	8d2b74d287	Refine db_stress params for atomic flush (#4781 ) Summary: Separate flag for enabling option from flag for enabling dedicated atomic stress test. I have found setting the former without setting the latter can detect different problems. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4781 Differential Revision: D13463211 Pulled By: ajkr fbshipit-source-id: 054f777885b2dc7d5ea99faafa21d6537eee45fd	2018-12-13 22:10:38 -08:00
Yanqin Jin	912bbbbc72	Enable crash-recovery stress test for atomic flush (#4605 ) Summary: This PR adds test of atomic flush to our continuous stress tests. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4605 Differential Revision: D12840607 Pulled By: riversand963 fbshipit-source-id: 0da187572791a59530065a7952697c05b1197ad9	2018-10-30 14:03:36 -07:00
Zhongyi Xie	9b3cf908a6	add missing range in random.choice argument (#4397 ) Summary: This will fix the broken asan crash test: > Traceback (most recent call last): File "tools/db_crashtest.py", line 384, in <module> main() File "tools/db_crashtest.py", line 368, in main parser.add_argument("--" + k, type=type(v() if callable(v) else v)) File "tools/db_crashtest.py", line 59, in <lambda> "index_block_restart_interval": lambda: random.choice(1, 16), TypeError: choice() takes exactly 2 arguments (3 given) Pull Request resolved: https://github.com/facebook/rocksdb/pull/4397 Differential Revision: D9933041 Pulled By: miasantreble fbshipit-source-id: 10998e5bc6b6a5cea3e4088b18465affc246e639	2018-09-19 12:13:20 -07:00
Maysam Yabandeh	a0ebec3804	Extend crash test with index_block_restart_interval (#4383 ) Summary: The default for index_block_restart_interval is 1 but some use 16 in production. The patch extends crash test to test both values. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4383 Differential Revision: D9887304 Pulled By: maysamyabandeh fbshipit-source-id: a8d00fea974a79ad563f9f4d9d7b069e9f746a8f	2018-09-18 15:43:29 -07:00
Andrew Kryczka	8c25204633	Support manual flush in stress/crash tests (#4368 ) Summary: - Made stress test call `Flush()` periodically according to `--flush_one_in` flag. - Enabled by default in crash test. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4368 Differential Revision: D9838593 Pulled By: ajkr fbshipit-source-id: fe5a6e49b36e5ea752acc3aa8be364f8ef34d9cc	2018-09-17 12:27:55 -07:00
Maysam Yabandeh	d122025891	Extend stress test to format_version 4 (#4265 ) Summary: Stress tests currently cover format_version 2 and 3. The patch adds 4 as well. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4265 Differential Revision: D9323185 Pulled By: maysamyabandeh fbshipit-source-id: 54d11e41ecae09bae14cadd7313f07c9a3db5a57	2018-08-14 14:13:33 -07:00
Andrew Kryczka	6175b4b294	Support dictionary compression in stress/crash tests (#4234 ) Summary: - Add `--compression_max_dict_bytes` and `--compression_zstd_max_train_bytes` flags to stress test - Randomly enable/disable the above flags in crash test - Set `--compression_type=zstd` in FB-specific crash test runs Pull Request resolved: https://github.com/facebook/rocksdb/pull/4234 Differential Revision: D9187207 Pulled By: ajkr fbshipit-source-id: 8d78cf8d8e1165f2cd1c32e069b73726b5bc1fd2	2018-08-06 15:27:29 -07:00
Siying Dong	4b0a43574a	db_stress to cover upper bound in iterators (#4162 ) Summary: db_stress doesn't cover upper or lower bound in iterators. Try to cover it by randomly assigning a random one. Also in prefix scan tests, with 50% of the chance, set next prefix as the upper bound. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4162 Differential Revision: D8953507 Pulled By: siying fbshipit-source-id: f0f04e9cb6c07cbebbb82b892ca23e0daeea708b	2018-07-23 10:45:29 -07:00
Peter (Stig) Edwards	2694b6dc26	Remove unused imports, from python scripts. (#4057 ) Summary: Also remove redefined variable. As reported on https://lgtm.com/projects/g/facebook/rocksdb/ Closes https://github.com/facebook/rocksdb/pull/4057 Differential Revision: D8648342 Pulled By: ajkr fbshipit-source-id: afd2ba84d1364d316010179edd44777e64ca9183	2018-06-26 12:43:04 -07:00
Andrew Kryczka	7f3a634e06	Support pipelined write in stress/crash tests Summary: Closes https://github.com/facebook/rocksdb/pull/4019 Differential Revision: D8508681 Pulled By: ajkr fbshipit-source-id: 23a3c07d642386446e322b02e69cdf70d12ef009	2018-06-19 09:14:12 -07:00
Andrew Kryczka	8585059ae0	Support backup and checkpoint in db_stress (#4005 ) Summary: Add the `backup_one_in` and `checkpoint_one_in` options to periodically trigger backups and checkpoints. The directory names contain thread ID to avoid clashing with parallel backups/checkpoints. Enable checkpoint in crash test so our CI runs will use it. Didn't enable backup in crash test since it copies all the files which is too slow. Closes https://github.com/facebook/rocksdb/pull/4005 Differential Revision: D8472275 Pulled By: ajkr fbshipit-source-id: ff91bdc37caac4ffd97aea8df96b3983313ac1d5	2018-06-18 19:28:18 -07:00
Andrew Kryczka	de2c6fb158	Fix stderr processing in crash test (#4006 ) Summary: Fixed bug where `db_stress` output a line with a warning followed by a line with an error, and `db_crashtest.py` considered that a success. For example: ``` WARNING: prefix_size is non-zero but memtablerep != prefix_hash open error: Corruption: SST file is ahead of WALs ``` Closes https://github.com/facebook/rocksdb/pull/4006 Differential Revision: D8473463 Pulled By: ajkr fbshipit-source-id: 60461bdd7491d9d26c63f7d4ee522a0f88ba3de7	2018-06-18 17:58:13 -07:00
Andrew Kryczka	7497f992e0	Run manual compaction in stress/crash tests (#3936 ) Summary: - Add support to `db_stress` for `CompactRange` - Enable `CompactRange` and `CompactFiles` in crash tests Closes https://github.com/facebook/rocksdb/pull/3936 Differential Revision: D8230953 Pulled By: ajkr fbshipit-source-id: 208f9980b5bc8c204b1fa726e83791ad674e21e8	2018-06-13 16:45:28 -07:00
Maysam Yabandeh	d0c38c0c8c	Extend some tests to format_version=3 (#3942 ) Summary: format_version=3 changes the format of SST index. This is however not being tested currently since tests only work with the default format_version which is currently 2. The patch extends the most related tests to also test for format_version=3. Closes https://github.com/facebook/rocksdb/pull/3942 Differential Revision: D8238413 Pulled By: maysamyabandeh fbshipit-source-id: 915725f55753dd8e9188e802bf471c23645ad035	2018-06-04 20:13:00 -07:00
Andrew Kryczka	4f297ad05f	Fix crash test check for direct I/O Summary: We need to keep the DB directory around since the direct IO check in "db_crashtest.py" relies on it existing. This PR fixes an issue where it was removed after each stress test run during the second half of whitebox crash testing. Closes https://github.com/facebook/rocksdb/pull/3946 Differential Revision: D8247998 Pulled By: ajkr fbshipit-source-id: 4e7cffbdab9b40df125e7842d0d59916e76261d3	2018-06-03 21:42:12 -07:00
Andrew Kryczka	88c3ee2d31	Configure direct I/O statically in db_stress Summary: Previously `db_stress` attempted to configure direct I/O dynamically in `SetOptions()` which had multiple problems (ummm must've never been tested): - It's a DB option so SetDBOptions should've been called instead - It's not a dynamic option so even SetDBOptions would fail - It required enabling SyncPoint to mask O_DIRECT since it had no way to detect whether the DB directory was in tmpfs or not. This required locking that consumed ~80% of db_stress CPU. In this PR I delete the broken dynamic config and instead configure it statically, only enabling it if the DB directory truly supports O_DIRECT. Closes https://github.com/facebook/rocksdb/pull/3939 Differential Revision: D8238120 Pulled By: ajkr fbshipit-source-id: 60bb2deebe6c9b54a3f788079261715b4a229279	2018-06-01 16:42:34 -07:00
Andrew Kryczka	d19f568abf	Refactor argument handling in db_crashtest.py Summary: - Any options unknown to `db_crashtest.py` are now passed directly to `db_stress`. This way, we won't need to update `db_crashtest.py` every time `db_stress` gets a new option. - Remove `db_crashtest.py` redundant arguments where the value is the same as `db_stress`'s default - Remove `db_crashtest.py` redundant arguments where the value is the same in a previously applied options map. For example, default_params are always applied before whitebox_default_params, so if they require the same value for an argument, that value only needs to be provided in default_params. - Made the simple option maps applied in addition to the regular option maps. Previously they were exclusive which led to lots of duplication Closes https://github.com/facebook/rocksdb/pull/3809 Differential Revision: D7885779 Pulled By: ajkr fbshipit-source-id: 3a3243b55724d6d5bff36e939b582b9b62c538a8	2018-05-09 13:42:41 -07:00
Andrew Kryczka	46152d53bf	Second attempt at db_stress crash-recovery verification Summary: - Original commit: `a4fb1f8c04` - Revert commit (we reverted as a quick fix to get crash tests passing): `6afe22db2e` This PR includes the contents of the original commit plus two bug fixes, which are: - In whitebox crash test, only set `--expected_values_path` for `db_stress` runs in the first half of the crash test's duration. In the second half, a fresh DB is created for each `db_stress` run, so we cannot maintain expected state across `db_stress` runs. - Made `Exists()` return true for `UNKNOWN_SENTINEL` values. I previously had an assert in `Exists()` that value was not `UNKNOWN_SENTINEL`. But it is possible for post-crash-recovery expected values to be `UNKNOWN_SENTINEL` (i.e., if the crash happens in the middle of an update), in which case this assertion would be tripped. The effect of returning true in this case is there may be cases where a `SingleDelete` deletes no data. But if we had returned false, the effect would be calling `SingleDelete` on a key with multiple older versions, which is not supported. Closes https://github.com/facebook/rocksdb/pull/3793 Differential Revision: D7811671 Pulled By: ajkr fbshipit-source-id: 67e0295bfb1695ff9674837f2e05bb29c50efc30	2018-04-30 12:27:34 -07:00
Andrew Kryczka	6afe22db2e	revert db_stress crash-recovery verification Summary: crash-recovery verification is failing in the whitebox testing, which may or may not be a valid correctness issue -- need more time to investigate. In the meantime, reverting so we don't mask other failures. Closes https://github.com/facebook/rocksdb/pull/3786 Differential Revision: D7794516 Pulled By: ajkr fbshipit-source-id: 28ccdfdb9ec9b3b0fb08c15cbf9d2e282201ff33	2018-04-27 12:57:01 -07:00
Andrew Kryczka	db36f222d8	Allow options file in db_stress and db_crashtest Summary: - When options file is provided to db_stress, take supported options from the file instead of from flags - Call `BuildOptionsTable` after `Open` so it can use `options_` once it has been populated either from flags or from file - Allow options filename to be passed via `db_crashtest.py` Closes https://github.com/facebook/rocksdb/pull/3768 Differential Revision: D7755331 Pulled By: ajkr fbshipit-source-id: 5205cc5deb0d74d677b9832174153812bab9a60a	2018-04-26 18:42:07 -07:00
Andrew Kryczka	a4fb1f8c04	Add crash-recovery correctness check to db_stress Summary: Previously, our `db_stress` tool held the expected state of the DB in-memory, so after crash-recovery, there was no way to verify data correctness. This PR adds an option, `--expected_values_file`, which specifies a file holding the expected values. In black-box testing, the `db_stress` process can be killed arbitrarily, so updates to the `--expected_values_file` must be atomic. We achieve this by `mmap`ing the file and relying on `std::atomic<uint32_t>` for atomicity. Actually this doesn't provide a total guarantee on what we want as `std::atomic<uint32_t>` could, in theory, be translated into multiple stores surrounded by a mutex. We can verify our assumption by looking at `std::atomic::is_always_lock_free`. For the `mmap`'d file, we didn't have an existing way to expose its contents as a raw memory buffer. This PR adds it in the `Env::NewMemoryMappedFileBuffer` function, and `MemoryMappedFileBuffer` class. `db_crashtest.py` is updated to use an expected values file for black-box testing. On the first iteration (when the DB is created), an empty file is provided as `db_stress` will populate it when it runs. On subsequent iterations, that same filename is provided so `db_stress` can check the data is as expected on startup. Closes https://github.com/facebook/rocksdb/pull/3629 Differential Revision: D7463144 Pulled By: ajkr fbshipit-source-id: c8f3e82c93e045a90055e2468316be155633bd8b	2018-04-24 15:58:22 -07:00
Andrew Kryczka	b058a33705	Reduce default --nooverwritepercent in black-box crash tests Summary: Previously `python tools/db_crashtest.py blackbox` would do no useful work as the crash interval (two minutes) was shorter than the preparation phase. The preparation phase is slow because of the ridiculously inefficient way it computes which keys should not be overwritten. It was doing this for 60M keys since default values were `FLAGS_nooverwritepercent == 60` and `FLAGS_max_key == 100000000`. Move the "nooverwritepercent" override from whitebox-specific to the general options so it also applies to blackbox test runs. Now preparation phase takes a few seconds. Closes https://github.com/facebook/rocksdb/pull/3671 Differential Revision: D7457732 Pulled By: ajkr fbshipit-source-id: 601f4461a6a7e49e50449dcf15aebc9b8a98d6f0	2018-04-03 15:28:40 -07:00
Mark Isaacson	b8eb32f8cf	Suppress lint in old files Summary: Grandfather in super old lint issues to make a clean slate for moving forward that allows us to have stronger enforcement on new issues. Reviewed By: yiwu-arbug Differential Revision: D6821806 fbshipit-source-id: 22797d31ec58e9eb0255d3b66fedfcfcb0dc127c	2018-01-29 12:56:42 -08:00
Andrew Kryczka	d75793d6b4	db_stress support long-held snapshots Summary: Add options to `db_stress` (correctness testing tool) to randomly acquire snapshot and release it after some period of time. It's useful for correctness testing of #3009, as well as other parts of compaction that behave differently depending on which snapshots are held. Closes https://github.com/facebook/rocksdb/pull/3038 Differential Revision: D6086501 Pulled By: ajkr fbshipit-source-id: 3ec0d8666c78ac507f1f808887c4ff759ba9b865	2017-10-20 15:26:59 -07:00
Siying Dong	b87ee6f773	Use more keys per lock in daily TSAN crash test Summary: TSAN shows an error when we grab too many locks at the same time. In TSAN crash test, make one shard key cover 2^22 keys so that not many locks will be held at the same time. Closes https://github.com/facebook/rocksdb/pull/2719 Differential Revision: D5609035 Pulled By: siying fbshipit-source-id: 930e5d63fff92dbc193dc154c4c615efbdf06c6a	2017-08-10 17:56:57 -07:00
Daniel Black	67510eeff3	db_crashtest.py: remove need for shell Summary: Before: $ ps -ef build 1713 16 0 Jul11 ? 00:00:00 make crash_test build 3437 1713 0 Jul11 ? 00:00:00 python -u tools/db_crashtest.py --simple blackbox build 3438 3437 0 Jul11 ? 00:00:00 [sh] <defunct> build 3440 1 99 Jul11 ? 5-03:01:25 ./db_stress --max_background_compactions=1 --max_write_buffer_number=3 --sync=0 --reopen=20 --write_buffer_size=33554432 --delpercent=5 --block_size=16384 --allow_concurrent_me After: build 1706 16 0 02:52 ? 00:00:01 make crash_test build 3432 1706 0 02:55 ? 00:00:00 python -u tools/db_crashtest.py --simple blackbox build 4452 3432 99 04:35 ? 00:01:42 ./db_stress --max_background_compactions=1 --max_write_buffer_number=3 --sync=0 --reopen=20 --write_buffer_size=33554432 --delpercent=5 --block_size=16384 --allow_concurr Closes https://github.com/facebook/rocksdb/pull/2571 Differential Revision: D5421580 Pulled By: maysamyabandeh fbshipit-source-id: d6c3970c38ea0fa23da653f4385e8e25d83f5c9f	2017-07-14 09:11:03 -07:00
Aaron Gao	ba7da434ae	fix db_stress crash caused by buggy kernel warning Summary: filter the warning out and only print it once. Closes https://github.com/facebook/rocksdb/pull/2137 Differential Revision: D4870925 Pulled By: lightmark fbshipit-source-id: 91b363ce7f70bce88b0780337f408fc4649139b8	2017-04-11 16:56:59 -07:00
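
Several commits in this log describe constraints applied to randomly chosen crash-test options: partitioned filters force a partitioned index (#4020), compaction TTL is zeroed when `compaction_style=2`, i.e. FIFO (#5749), and `compression_zstd_max_train_bytes` is zeroed only when ZSTD is not in use (#4957). The block below is a minimal sketch of that pattern, not the actual `finalize_and_sanitize` from `db_crashtest.py`. The function name `sanitize_params`, the keys `partition_filters` and `index_type`, and the string value `"zstd"` are assumptions made for illustration.

```python
def sanitize_params(params):
    """Return a copy of params with incompatible combinations resolved.

    Illustrative only: key names and values are assumptions, not the
    real option names used by db_crashtest.py.
    """
    out = dict(params)

    # Partitioned filters need a partitioned index as a prerequisite.
    # Index type 2 is assumed to mean "partitioned index" here.
    if out.get("partition_filters") == 1:
        out["index_type"] = 2

    # Compaction TTL is skipped for FIFO compaction (compaction_style=2).
    if out.get("compaction_style") == 2:
        out["compaction_ttl"] = 0

    # Dictionary training bytes only apply when ZSTD is the compression type.
    if out.get("compression_type") != "zstd":
        out["compression_zstd_max_train_bytes"] = 0

    return out


sanitized = sanitize_params({
    "partition_filters": 1,
    "index_type": 0,
    "compaction_style": 2,
    "compaction_ttl": 1000,
    "compression_type": "snappy",
    "compression_zstd_max_train_bytes": 65536,
})
```

Returning a copy keeps the randomly generated input intact, which makes it easier to log both the drawn and the sanitized values when a run fails.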


92 Commits
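
Two commits in this log concern how `db_crashtest.py` handles arguments: #4397 fixes `random.choice(1, 16)` by passing a list, and the argument-handling refactor (#3809) forwards options the script does not recognize to `db_stress`. The block below is a rough sketch of both ideas using only the standard library. It is not the script's real code: the function name `parse_crashtest_args` and the two-entry `default_params` map are assumptions, with `reopen=20` borrowed from the command line quoted in the log.

```python
import argparse
import random

# Defaults may be plain values or zero-argument callables that draw a
# random value. random.choice takes ONE sequence, hence the list.
default_params = {
    "index_block_restart_interval": lambda: random.choice([1, 16]),
    "reopen": 20,
}


def parse_crashtest_args(argv):
    """Split argv into known parameters and options to forward untouched."""
    parser = argparse.ArgumentParser()
    for name, value in default_params.items():
        # Infer the argument type from the (possibly generated) default.
        sample = value() if callable(value) else value
        parser.add_argument("--" + name, type=type(sample))

    # parse_known_args returns unrecognized options instead of failing.
    known, unknown = parser.parse_known_args(argv)

    # Resolve defaults first, then apply any explicit overrides.
    params = {k: (v() if callable(v) else v) for k, v in default_params.items()}
    params.update({k: v for k, v in vars(known).items() if v is not None})
    return params, unknown


params, forwarded = parse_crashtest_args(
    ["--reopen=5", "--compression_type=zstd"]
)
```

Here `forwarded` holds the options that would be appended to the `db_stress` command line unchanged, so new `db_stress` flags need no matching change in the wrapper.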