rocksdb

Author	SHA1	Message	Date
anand76	10c141a3b7	Ensure all MultiGet IO errors are propagated to user (#6403 ) Summary: Unrevert the previous fix to propagate error status, and an additional fix to not treat a memtable lookup MergeInProgress status as an error. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6403 Test Plan: Unit tests Tried running stress tests but couldn't repro the stress failure Differential Revision: D19846721 Pulled By: anand1976 fbshipit-source-id: 7db10cccbdc863d9b559497f0a46b608d2488ca4	2020-02-13 11:55:42 -08:00
sdong	2c62c227ae	By default turn IO Uring off. (#6405 ) Summary: We realized bugs related to IO Uring. Turn it off by default. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6405 Test Plan: Manually run build_tools/build_detect_platform and observe outputs. Differential Revision: D19862792 fbshipit-source-id: 5d5e8e2762997b72a145ae59389ef3d7e4ccd060	2020-02-12 18:27:20 -08:00
anand76	0bc8750e82	Force a new manifest file if append to current one fails (#6331 ) Summary: Fix for issue https://github.com/facebook/rocksdb/issues/6316 When an append/sync of the manifest file fails due to an IO error such as NoSpace, we don't always put the DB in read-only mode. This is true for flush and compactions, as well as foreground operations such as column family add/drop, CompactFiles etc. Subsequent changes to the DB will be recorded in the same manifest file, which would have a corrupted record in the middle due to the previous failure. On the next DB::Open(), it will fail to process the full manifest and data will be lost. To fix this, we reset VersionSet::descriptor_log_ on append/sync failure, which will force a new manifest file to be written on the next append. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6331 Test Plan: Add new unit tests in error_handler_test.cc Differential Revision: D19632951 Pulled By: anand1976 fbshipit-source-id: 68d527cb6e59a94cbbbf9f5a17a7f464381d51e3	2020-01-31 16:26:22 -08:00
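The reset-on-failure idea in the commit above can be sketched as follows. This is a minimal in-memory illustration, not the RocksDB code: `VersionSetSketch`, `ManifestFile`, and the `inject_failure` parameter are hypothetical names, and only `descriptor_log_` mirrors a name from the commit message.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical in-memory stand-in for a manifest file.
struct ManifestFile {
  int number = 0;
  std::vector<std::string> records;
};

// Hypothetical, heavily simplified model of the manifest-writing path.
class VersionSetSketch {
 public:
  // Appends one record. `inject_failure` simulates an IO error such as
  // NoSpace. Returns true on success.
  bool LogAndApply(const std::string& record, bool inject_failure) {
    if (!descriptor_log_) {
      // No open manifest: start a fresh one under a new file number.
      descriptor_log_.reset(new ManifestFile());
      descriptor_log_->number = next_file_number_++;
    }
    assert(descriptor_log_ != nullptr);
    if (inject_failure) {
      // The current file may now hold a partial record, so drop it.
      // The next append will go to a new manifest file.
      descriptor_log_.reset();
      return false;
    }
    descriptor_log_->records.push_back(record);
    return true;
  }

  // Returns the open manifest's number, or -1 if none is open.
  int current_manifest_number() const {
    return descriptor_log_ ? descriptor_log_->number : -1;
  }

 private:
  std::unique_ptr<ManifestFile> descriptor_log_;
  int next_file_number_ = 1;
};
```

In this sketch, a failed append leaves no manifest open, and the following successful append lands in a file with a new number rather than after the possibly corrupted record.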
sdong	f6b3de76e5	Revert "Fix kHashSearch bug with SeekForPrev (#6297 )" This reverts commit `d2b4d42d4b`.	2020-01-27 14:58:30 -08:00
sdong	974dfc3de6	Revert "Fix a bug caused by recent fix of Prefix Hash (#6302 )" This reverts commit `f8b5ef85ec`.	2020-01-27 14:56:41 -08:00
sdong	cad5db1c3e	Revert "Fix another bug caused by recent hash index fix (#6305 )" This reverts commit `d87cffaea4`.	2020-01-27 14:56:30 -08:00
Fosco Marotto	bd698e4f55	Update version for next release, 6.7.0 (#6320 ) Summary: Adjusted history for 6.6.1 and 6.6.2, switched master version to 6.7.0. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6320 Differential Revision: D19499272 Pulled By: gfosco fbshipit-source-id: 2bafb2456951f231e411e9c03aaa4c044f497684	2020-01-24 15:36:32 -08:00
Maysam Yabandeh	c4bc30e12d	Implement PinnableSlice::remove_prefix (#6330 ) Summary: The function was left unimplemented. Although we currently don't have a use for it, it was declared with an assert(0) to prevent mistakenly using the remove_prefix of the parent class. The function body with only assert(0), however, causes issues with some compilers' warning levels. The patch implements the function to avoid the warning. It also piggybacks a minor fix for warnings about unnecessary semicolons after function definitions. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6330 Differential Revision: D19559062 Pulled By: maysamyabandeh fbshipit-source-id: 3a022484f688c9abd4556e5412bcc2628ab96a00	2020-01-24 13:04:53 -08:00
Levi Tamasi	f34782a67d	Fix the "records dropped" statistics (#6325 ) Summary: The earlier code used two conflicting definitions for the number of input records going into a compaction, one based on the `rocksdb.num.entries` table property and one based on `CompactionIterationStats`. The first one is correct and in line with how output records are counted, while the second one incorrectly ignores input records in various cases when the `CompactionIterator` advances or reseeks the input iterator (this can happen, amongst other cases, when dealing with `SingleDelete`s, regular `Delete`s, `Merge`s, and compaction filters). This can result in the code undercounting the input records and computing an incorrect value for "records dropped" during the compaction. The patch fixes this by switching over to the correct (table property based) input record count for "records dropped". Pull Request resolved: https://github.com/facebook/rocksdb/pull/6325 Test Plan: Tested using `make check` and `db_bench`. Differential Revision: D19525491 Pulled By: ltamasi fbshipit-source-id: 4340b0b2f41546db8e356db70ca02199e48fa636	2020-01-23 15:27:22 -08:00
anand76	0672a6db64	Fix queue manipulation in WriteThread::BeginWriteStall() (#6322 ) Summary: When there is a write stall, the active write group leader calls ```BeginWriteStall()``` to walk the queue of writers and remove any with the ```no_slowdown``` option set. There was a bug in the code which updated the back pointer but not the forward pointer (```link_newer```), corrupting the list and causing some threads to wait forever. This PR fixes it. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6322 Test Plan: Add a unit test in db_write_test Differential Revision: D19538313 Pulled By: anand1976 fbshipit-source-id: 6fbed819e594913f435886606f5d36f74f235c3a	2020-01-23 14:01:28 -08:00
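The list corruption described above comes from unlinking a node in only one direction. A minimal sketch of a correct unlink follows. The `Writer` struct and `RemoveNoSlowdownWriters` are simplified, hypothetical stand-ins, and only `no_slowdown` and `link_newer` are names taken from the commit message.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified stand-in for a queued writer.
struct Writer {
  bool no_slowdown = false;
  Writer* link_older = nullptr;  // toward the oldest writer in the queue
  Writer* link_newer = nullptr;  // toward the newest writer in the queue
};

// Walks from the newest writer toward older ones and unlinks every writer
// that has no_slowdown set. Returns the new newest writer (nullptr if the
// queue becomes empty). Unlinked writers are appended to `removed`.
Writer* RemoveNoSlowdownWriters(Writer* newest, std::vector<Writer*>* removed) {
  assert(removed != nullptr);
  Writer* w = newest;
  while (w != nullptr) {
    Writer* older = w->link_older;
    if (w->no_slowdown) {
      Writer* newer = w->link_newer;
      // Both neighbours must be repaired. Updating only one direction
      // leaves a stale pointer behind and corrupts the list.
      if (older != nullptr) older->link_newer = newer;
      if (newer != nullptr) {
        newer->link_older = older;
      } else {
        newest = older;  // the removed writer was the newest one
      }
      w->link_older = nullptr;
      w->link_newer = nullptr;
      removed->push_back(w);
    }
    w = older;
  }
  return newest;
}
```

With four writers where the second and fourth have `no_slowdown` set, the first and third end up linked to each other in both directions and the third becomes the newest writer.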
Maysam Yabandeh	967a2d953f	Revert "crash_test to enable block-based table hash index (#6310 )" (#6327 ) Summary: This reverts commit `8e309b35bb`. The stress tests are failing. Revert it until we figure out the root cause. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6327 Differential Revision: D19537657 Pulled By: maysamyabandeh fbshipit-source-id: bf34a5dd720825957729e136e9a5a729a240e61a	2020-01-23 09:09:17 -08:00
Maysam Yabandeh	cb1142e00d	Set index_block_restart_interval of kHashSearch to 1 in stress test (#6324 ) Summary: kHashSearch is incompatible with larger than 1 values for index_block_restart_interval. Setting it to 1 in stress tests would avoid confusion about the test parameters. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6324 Differential Revision: D19525669 Pulled By: maysamyabandeh fbshipit-source-id: fbf3a797e0ebcebb4d32eba3728cf3583906fc8a	2020-01-22 16:33:21 -08:00
matthewvon	e6e8b9e871	Correct pragma once problem with Bazel on Windows (#6321 ) Summary: This is a simple edit to make two #include file paths within range_del_aggregator.{h,cc} consistent with everywhere else. The impact of this inconsistency is that it actually breaks a Bazel-based build on the Windows platform. The same pragma once failure occurs with both Windows Visual C++ 2019 and clang for Windows 9.0. Bazel's "sandboxing" of the builds causes both compilers to not properly recognize "rocksdb/types.h" and "include/rocksdb/types.h" as the same file (also comparator.h). My guess is that the backslash versus forward slash mixing within path names is the underlying issue. But everything builds fine once the include paths in these two source files are consistent with the rest of the repository. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6321 Differential Revision: D19506585 Pulled By: ltamasi fbshipit-source-id: 294c346607edc433ab99eaabc9c880ee7426817a	2020-01-21 16:12:43 -08:00
Levi Tamasi	d305f13e21	Make DBCompactionTest.SkipStatsUpdateTest more robust (#6306 ) Summary: Currently, this test case tries to infer whether `VersionStorageInfo::UpdateAccumulatedStats` was called during open by checking the number of files opened against an arbitrary threshold (10). This makes the test brittle and results in sporadic failures. The patch changes the test case to use sync points to directly test whether `UpdateAccumulatedStats` was called. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6306 Test Plan: `make check` Differential Revision: D19439544 Pulled By: ltamasi fbshipit-source-id: ceb7adf578222636a0f51740872d0278cd1a914f	2020-01-21 12:55:55 -08:00
sdong	8e309b35bb	crash_test to enable block-based table hash index (#6310 ) Summary: Block-based table hash index has been disabled in crash test due to bugs. We fixed a bug and re-enabled it. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6310 Test Plan: Finish one round of "crash_test_with_atomic_flush" test successfully while exclusively running hash index. Another run also ran for several hours without failure. Differential Revision: D19455856 fbshipit-source-id: 1192752d2c1e81ed7e5c5c7a9481c841582d5274	2020-01-21 12:27:30 -08:00
Peter Dillinger	8aa99fc71e	Warn on excessive keys for legacy Bloom filter with 32-bit hash (#6317 ) Summary: With many millions of keys, the old Bloom filter implementation for the block-based table (format_version <= 4) would have excessive FP rate due to the limitations of feeding the Bloom filter with a 32-bit hash. This change computes an estimated inflated FP rate due to this effect and warns in the log whenever an SST filter is constructed (almost certainly a "full" not "partitioned" filter) that exceeds 1.5x FP rate due to this effect. The detailed condition is only checked if 3 million keys or more have been added to a filter, as this should be a lower bound for common bits/key settings (< 20). Recommended remedies include smaller SST file size, using format_version >= 5 (for new Bloom filter), or using partitioned filters. This does not change behavior other than generating warnings for some constructed filters using the old implementation. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6317 Test Plan: Example with warning, 15M keys @ 15 bits / key: (working_mem_size_mb is just to stop after building one filter if it's large) $ ./filter_bench -quick -impl=0 -working_mem_size_mb=1 -bits_per_key=15 -average_keys_per_filter=15000000 2>&1 \| grep 'FP rate' [WARN] [/block_based/filter_policy.cc:292] Using legacy SST/BBT Bloom filter with excessive key count (15.0M @ 15bpk), causing estimated 1.8x higher filter FP rate. Consider using new Bloom with format_version>=5, smaller SST file size, or partitioned filters. 
Predicted FP rate %: 0.766702 Average FP rate %: 0.66846 Example without warning (150K keys): $ ./filter_bench -quick -impl=0 -working_mem_size_mb=1 -bits_per_key=15 -average_keys_per_filter=150000 2>&1 \| grep 'FP rate' Predicted FP rate %: 0.422857 Average FP rate %: 0.379301 $ With more samples at 15 bits/key: 150K keys -> no warning; actual: 0.379% FP rate (baseline) 1M keys -> no warning; actual: 0.396% FP rate, 1.045x 9M keys -> no warning; actual: 0.563% FP rate, 1.485x 10M keys -> warning (1.5x); actual: 0.564% FP rate, 1.488x 15M keys -> warning (1.8x); actual: 0.668% FP rate, 1.76x 25M keys -> warning (2.4x); actual: 0.880% FP rate, 2.32x At 10 bits/key: 150K keys -> no warning; actual: 1.17% FP rate (baseline) 1M keys -> no warning; actual: 1.16% FP rate 10M keys -> no warning; actual: 1.32% FP rate, 1.13x 25M keys -> no warning; actual: 1.63% FP rate, 1.39x 35M keys -> warning (1.6x); actual: 1.81% FP rate, 1.55x At 5 bits/key: 150K keys -> no warning; actual: 9.32% FP rate (baseline) 25M keys -> no warning; actual: 9.62% FP rate, 1.03x 200M keys -> no warning; actual: 12.2% FP rate, 1.31x 250M keys -> warning (1.5x); actual: 12.8% FP rate, 1.37x 300M keys -> warning (1.6x); actual: 13.4% FP rate, 1.43x The reason for the modest inaccuracy at low bits/key is that the assumption of independence between a collision between 32-hash values feeding the filter and an FP in the filter is not quite true for implementations using "simple" logic to compute indices from the stock hash result. There's math on this in my dissertation, but I don't think it's worth the effort just for these extreme cases (> 100 million keys and low-ish bits/key). Differential Revision: D19471715 Pulled By: pdillinger fbshipit-source-id: f80c96893a09bf1152630ff0b964e5cdd7e35c68	2020-01-20 21:31:47 -08:00
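The inflation estimate described above can be approximated with a short calculation. This is a sketch under the independence assumption the commit message mentions, not the formula used in filter_policy.cc. The function names are hypothetical, and the hash-collision probability uses the approximation 1 - exp(-n / 2^32).

```cpp
#include <cassert>
#include <cmath>

// Probability that a queried key's 32-bit hash equals the hash of at least
// one of `num_keys` stored keys, assuming uniformly distributed hashes.
// Uses the approximation 1 - exp(-n / 2^32).
double HashCollisionRate(double num_keys) {
  assert(num_keys >= 0.0);
  return -std::expm1(-num_keys / 4294967296.0);
}

// Combines the filter's own FP rate with the hash-collision rate, treating
// the two events as independent (an assumption, as noted in the commit).
double InflatedFpRate(double base_fp_rate, double num_keys) {
  assert(base_fp_rate >= 0.0 && base_fp_rate <= 1.0);
  const double c = HashCollisionRate(num_keys);
  return base_fp_rate + c - base_fp_rate * c;
}
```

Taking the predicted rate of 0.422857% from the 150K-key run as the base, `InflatedFpRate(0.00422857, 15e6)` comes out at roughly 0.77%, about 1.8x the base. That is in line with the 1.8x warning and the 0.766702% prediction shown for 15M keys at 15 bits/key.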
Peter Dillinger	4b86fe1123	Log warning for high bits/key in legacy Bloom filter (#6312 ) Summary: Help users that would benefit most from new Bloom filter implementation by logging a warning that recommends the using format_version >= 5. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6312 Test Plan: $ (for BPK in 10 13 14 19 20 50; do ./filter_bench -quick -impl=0 -bits_per_key=$BPK -m_queries=1 2>&1; done) \| grep 'its/key' Bits/key actual: 10.0647 Bits/key actual: 13.0593 [WARN] [/block_based/filter_policy.cc:546] Using legacy Bloom filter with high (14) bits/key. Significant filter space and/or accuracy improvement is available with format_verion>=5. Bits/key actual: 14.0581 [WARN] [/block_based/filter_policy.cc:546] Using legacy Bloom filter with high (19) bits/key. Significant filter space and/or accuracy improvement is available with format_verion>=5. Bits/key actual: 19.0542 [WARN] [/block_based/filter_policy.cc:546] Using legacy Bloom filter with high (20) bits/key. Dramatic filter space and/or accuracy improvement is available with format_verion>=5. Bits/key actual: 20.0584 [WARN] [/block_based/filter_policy.cc:546] Using legacy Bloom filter with high (50) bits/key. Dramatic filter space and/or accuracy improvement is available with format_verion>=5. Bits/key actual: 50.0577 Differential Revision: D19457191 Pulled By: pdillinger fbshipit-source-id: 073d94cde5c70e03a160f953e1100c15ea83eda4	2020-01-17 19:37:35 -08:00
chenyou-fdu	931876e86e	Separate enable-WAL and disable-WAL writer to avoid unwanted data in log files (#6290 ) Summary: When we do concurrent writes, different write operations may have WAL enabled or disabled. But the data from write operations with WAL disabled will still be logged into log files, which leads to extra disk write/sync since we do not want any guarantee for this part of the data. Details can be found in https://github.com/facebook/rocksdb/issues/6280. This PR avoids mixing the two types in a write group. The advantage is simpler reasoning about the write group content. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6290 Differential Revision: D19448598 Pulled By: maysamyabandeh fbshipit-source-id: 3d990a0f79a78ea1bfc90773f6ebafc1884c20de	2020-01-17 15:54:55 -08:00
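One way to keep the two kinds of writes apart is to stop growing a write group at the first writer whose WAL setting differs from the leader's. The sketch below illustrates that idea. `PendingWrite` and `BuildWriteGroup` are hypothetical, and stopping at the first mismatch rather than skipping over it is an assumption of this sketch.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified description of a queued write.
struct PendingWrite {
  int id;
  bool disable_wal;
};

// Forms a write group led by queue[0]. Followers are taken in queue order
// until one disagrees with the leader's disable_wal setting, so a group
// never mixes WAL-enabled and WAL-disabled writes. Returns the member ids.
std::vector<int> BuildWriteGroup(const std::vector<PendingWrite>& queue) {
  std::vector<int> group;
  if (queue.empty()) return group;
  const bool leader_disable_wal = queue.front().disable_wal;
  for (const PendingWrite& w : queue) {
    if (w.disable_wal != leader_disable_wal) break;
    group.push_back(w.id);
  }
  assert(!group.empty());  // the leader always joins its own group
  return group;
}
```

For a queue of writes 1 and 2 with WAL enabled, 3 with WAL disabled, and 4 with WAL enabled, the group contains only 1 and 2. Write 3 would lead the next group.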
Matt Bell	7e5b04d04f	Expose atomic flush option in C API (#6307 ) Summary: This PR adds a `rocksdb_options_set_atomic_flush` function to the C API. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6307 Differential Revision: D19451313 Pulled By: ltamasi fbshipit-source-id: 750495642ef55b1ea7e13477f85c38cd6574849c	2020-01-17 12:57:48 -08:00
sdong	6b64aed4c0	Fix bug which causes crash_test to always run on sync mode (#6304 ) Summary: A previous change was meant to make db_stress run in sync=1 mode 1/20 of the time in crash_test, but a bug caused it to always run in sync=1 mode. Fix it. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6304 Test Plan: Start and kill "python -u tools/db_crashtest.py --simple whitebox" multiple times and observe that most times sync=0 is used while some times sync=1 is used. Differential Revision: D19433000 fbshipit-source-id: 7a0adba39b17a1b3acbbd791bb0cdb743b91fa95	2020-01-17 01:46:48 -08:00
sdong	d87cffaea4	Fix another bug caused by recent hash index fix (#6305 ) Summary: Recent bug fix related to hash index introduced a new bug: hash index can return NotFound but it is not handled by BlockBasedTable::Get(). The end result is that Get() stops being executed too early. Fix it by ignoring NotFound code in Get(). Pull Request resolved: https://github.com/facebook/rocksdb/pull/6305 Test Plan: A problematic DB used to return NotFound incorrectly, and is now able to return the correct result. Will try to construct a unit test too. Differential Revision: D19438925 fbshipit-source-id: e751afa8c13728d56511cfeb1bc811ecb99f3217	2020-01-17 01:41:04 -08:00
Levi Tamasi	73f65b457e	Adjust thread pool sizes when setting max_background_jobs dynamically (#6300 ) Summary: https://github.com/facebook/rocksdb/pull/2205 introduced a new configuration option called `max_background_jobs`, superseding the earlier options `max_background_flushes` and `max_background_compactions`. However, unlike `max_background_compactions`, setting `max_background_jobs` dynamically through the `SetDBOptions` interface does not adjust the size of the thread pools (see https://github.com/facebook/rocksdb/issues/6298). The patch fixes this. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6300 Test Plan: Extended unit test. Differential Revision: D19430899 Pulled By: ltamasi fbshipit-source-id: 704006605b3c13c3d1b997ccc0831ee369721074	2020-01-16 14:35:10 -08:00
Cheng Chang	86623a7153	Update example of optimistic transaction (#6074 ) Summary: Add asserts to show the intentions of result explicitly. Add examples to show the effect of optimistic transaction more clearly. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6074 Test Plan: `cd examples && make optimistic_transaction_example && ./optimistic_transaction_example` Differential Revision: D18964309 Pulled By: cheng-chang fbshipit-source-id: a524616ed9981edf2fd37ae61c5ed18c5cf25f55	2020-01-16 14:04:44 -08:00
sdong	f8b5ef85ec	Fix a bug caused by recent fix of Prefix Hash (#6302 ) Summary: Recent fix to Prefix Hash https://github.com/facebook/rocksdb/pull/6292 caused a bug that the newly created NotFound status in hash index is never reset. This causes reseek or implicit reseek to return wrong results sometimes. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6302 Test Plan: Add a unit test that would fail. Without the fix, crash test with hash test would fail within several seconds. With the fix, it runs for several minutes before failing with another failure. Differential Revision: D19424572 fbshipit-source-id: c5276f36a95fd0e2837e30190476d2fe21ed8566	2020-01-16 10:47:20 -08:00
Levi Tamasi	b7f1b3e51c	Access Maven Central over HTTPS (#6301 ) Summary: As of 1/15/2020, Maven Central does not support plain HTTP. Because of this, our Travis and AppVeyor builds have started failing during the assertj download step. This patch will hopefully fix these issues. See https://blog.sonatype.com/central-repository-moving-to-https for more info. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6301 Test Plan: Will monitor the builds. ("I don't always test my changes but when I do, I do it in production.") Differential Revision: D19422923 Pulled By: ltamasi fbshipit-source-id: 76f9a8564a5b66ddc721d705f9cbfc736bf7a97d	2020-01-15 17:54:53 -08:00
sdong	d2b4d42d4b	Fix kHashSearch bug with SeekForPrev (#6297 ) Summary: When prefix is enabled, the expected behavior when the prefix of the target does not exist is for Seek to seek to any key larger than the target and for SeekForPrev to seek to any key less than the target. Currently, the prefix index (kHashSearch) returns OK status but sets Invalid() to indicate two cases: i) a prefix of the searched key does not exist, ii) the key is beyond the range of the keys in the SST file. The SeekForPrev implementation in BlockBasedTable thus does not have enough information to know when it should set the index key to first (to return a key smaller than target). The patch fixes that by returning NotFound status for cases where the prefix does not exist. SeekForPrev in BlockBasedTable accordingly calls SeekToFirst instead of SeekToLast on the index iterator. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6297 Test Plan: SeekForPrev of a non-existing prefix is added to block_test.cc, and a test case is added in db_test2, which fails without the fix. Differential Revision: D19404695 fbshipit-source-id: cafbbf95f8f60ff9ede9ccc99d25bfa1cf6fcdc3	2020-01-15 14:28:39 -08:00
Levi Tamasi	1dd7873e08	Remove earlier partial BlobDB GC implementation (#6278 ) Summary: In addition to removing the earlier partially implemented garbage collection logic from the BlobDB codebase, the patch also removes the test cases (as well as the related sync points, as appropriate) that were only relevant for the old implementation, and reworks the remaining ones so they use the new GC logic. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6278 Test Plan: `make check` Differential Revision: D19335226 Pulled By: ltamasi fbshipit-source-id: 0cc1794bc9892feda1426ed5522a318f3cb1b692	2020-01-14 15:08:44 -08:00
sdong	76c117b24b	Fix LITE test build broken by recent commit (#6295 ) Summary: A recent commit adds a unit test that uses a function not available in LITE build. Fix it by avoiding the call Pull Request resolved: https://github.com/facebook/rocksdb/pull/6295 Test Plan: Run the test in LITE build and see it passes. Differential Revision: D19395678 fbshipit-source-id: 37b42835bae02511630d80f7cafb1179401bc033	2020-01-14 13:17:04 -08:00
Maysam Yabandeh	d4b7fbf0d5	kHashSearch incompatible with index_block_restart_interval>1 (#6294 ) Summary: kHashSearch index type is incompatible with index_block_restart_interval larger than 1. The patch asserts that and also resets index_block_restart_interval value if it is incompatible with kHashSearch. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6294 Differential Revision: D19394229 Pulled By: maysamyabandeh fbshipit-source-id: 8a12712ab25e81094a7f71ecd43f773dd4fb6acd	2020-01-14 11:21:27 -08:00
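The sanitization described above can be sketched as below. `IndexType`, `TableOptionsSketch`, and `SanitizeTableOptions` are hypothetical stand-ins rather than the RocksDB types, and resetting the interval to 1 is an assumption based on the incompatibility stated in the commit message.

```cpp
#include <cassert>

// Hypothetical stand-ins for the relevant table options.
enum class IndexType { kBinarySearch, kHashSearch };

struct TableOptionsSketch {
  IndexType index_type = IndexType::kBinarySearch;
  int index_block_restart_interval = 1;
};

// Returns a copy of `opts` in which index_block_restart_interval is forced
// back to 1 whenever the hash index is selected. Other index types keep the
// caller's value.
TableOptionsSketch SanitizeTableOptions(TableOptionsSketch opts) {
  assert(opts.index_block_restart_interval >= 1);
  if (opts.index_type == IndexType::kHashSearch &&
      opts.index_block_restart_interval != 1) {
    opts.index_block_restart_interval = 1;
  }
  return opts;
}
```

Sanitizing up front means later code can rely on the invariant (and assert it) instead of handling the unsupported combination.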
sdong	894c6d21af	Bug when multiple files at one level contains the same smallest key (#6285 ) Summary: The fractional cascading index is not correctly generated when two files at the same level contains the same smallest or largest user key. The result would be that it would hit an assertion in debug mode and lower level files might be skipped. This might cause wrong results when the same user keys are of merge operands and Get() is called using the exact user key. In that case, the lower files would need to further checked. The fix is to fix the fractional cascading index. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6285 Test Plan: Add a unit test which would cause the assertion which would be fixed. Differential Revision: D19358426 fbshipit-source-id: 39b2b1558075fd95e99491d462a67f9f2298c48e	2020-01-13 16:27:42 -08:00
Qinfan Wu	6733be033e	More const pointers in C API (#6283 ) Summary: This makes it easier to call the functions from Rust as otherwise they require mutable types. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6283 Differential Revision: D19349991 Pulled By: wqfish fbshipit-source-id: e8da7a75efe8cd97757baef8ca844a054f2519b4	2020-01-10 19:27:09 -08:00
Sagar Vemuri	cfa585611d	Consider all compaction input files to compute the oldest ancestor time (#6279 ) Summary: Look at all compaction input files to compute the oldest ancestor time. In https://github.com/facebook/rocksdb/issues/5992 we changed how creation_time (aka oldest-ancestor-time) table property of compaction output files is computed from max(creation-time-of-all-compaction-inputs) to min(creation-time-of-all-inputs). This exposed a bug where, during compaction, the creation_time:s of only the L0 compaction inputs were being looked at, and all other input levels were being ignored. This PR fixes the issue. Some TTL compactions when using Level-Style compactions might not have run due to this bug. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6279 Test Plan: Enhanced the unit tests to validate that the correct time is propagated to the compaction outputs. Differential Revision: D19337812 Pulled By: sagar0 fbshipit-source-id: edf8a72f11e405e93032ff5f45590816debe0bb4	2020-01-10 19:02:42 -08:00
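The difference between looking at only the L0 inputs and looking at every input level can be shown with a small sketch. `InputFile` and `OldestAncestorTime` are hypothetical names. The only behavior taken from the commit message is that the output's creation_time is the minimum over all compaction inputs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified description of a compaction input file.
struct InputFile {
  int level;
  uint64_t creation_time;  // seconds since epoch
};

// Returns the smallest creation_time across all input files, regardless of
// which level each file comes from. `inputs` must not be empty.
uint64_t OldestAncestorTime(const std::vector<InputFile>& inputs) {
  assert(!inputs.empty());
  uint64_t oldest = inputs.front().creation_time;
  for (const InputFile& f : inputs) {
    oldest = std::min(oldest, f.creation_time);
  }
  return oldest;
}
```

With L0 inputs created at times 300 and 250 and an L1 input created at time 100, the result is 100. Considering only the L0 files would have produced 250, so the output would look younger than its oldest data, which is how a TTL compaction could be missed.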
Maysam Yabandeh	eff5e076f5	unordered_write incompatible with max_successive_merges (#6284 ) Summary: unordered_write is incompatible with non-zero max_successive_merges. Although we check this at runtime, we currently don't prevent the user from setting this combination in options. This has led stress tests to fail when this combination is tried in ::SetOptions. The patch fixes that and also reverts the changes performed by https://github.com/facebook/rocksdb/pull/6254, in which max_successive_merges was mistakenly declared incompatible with unordered_write. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6284 Differential Revision: D19356115 Pulled By: maysamyabandeh fbshipit-source-id: f06dadec777622bd75f267361c022735cf8cecb6	2020-01-10 16:53:19 -08:00
anand76	687119aeaf	Variable key length in db_stress (#6273 ) Summary: Undo https://github.com/facebook/rocksdb/issues/6243 and fix the crash test failures. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6273 Test Plan: Run make ubsan_crash_test Differential Revision: D19331472 Pulled By: anand1976 fbshipit-source-id: 30aa4a36c1b0f77a97159d82bbfd1cd767878e28	2020-01-09 21:27:18 -08:00
Yanqin Jin	6a9989381f	Fix compilation under LITE (#6277 ) Summary: Fix compilation under LITE by putting `#ifndef ROCKSDB_LITE` around a code block. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6277 Differential Revision: D19334157 Pulled By: riversand963 fbshipit-source-id: 947111ed68aa550f5ea424b216c1442a8af9e32b	2020-01-09 15:57:39 -08:00
sdong	39410bcb3d	Fix some shadow warning (#6242 ) Summary: Some shadow warnings show up when using gcc 4.8. An example: ./utilities/blob_db/blob_compaction_filter.h: In constructor ‘rocksdb::blob_db::BlobIndexCompactionFilterFactoryBase::BlobIndexCompactionFilterFactoryBase(rocksdb::blob_db::BlobDBImpl*, rocksdb::Env*, rocksdb::Statistics*)’: ./utilities/blob_db/blob_compaction_filter.h:121:7: error: declaration of ‘blob_db_impl’ shadows a member of 'this' [-Werror=shadow] : blob_db_impl_(blob_db_impl), env_(_env), statistics_(_statistics) {} ^ Fix them. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6242 Test Plan: Build and see the warnings go away. Differential Revision: D19217789 fbshipit-source-id: 8ef631941f23dab47a388e060adec24b72efd65e	2020-01-08 18:20:13 -08:00
Yanqin Jin	cfd9732f65	Remove inaccurate code comment (#6274 ) Summary: Remove a comment. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6274 Differential Revision: D19323151 Pulled By: riversand963 fbshipit-source-id: d0d804d6882edcd94e35544ef45578b32ff1caae	2020-01-08 17:51:42 -08:00
Huisheng Liu	e5b476f551	Update file indexer to take timestamp into consideration (#6205 ) Summary: Exclude timestamp in key comparison during boundary calculation to avoid key versions being excluded. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6205 Differential Revision: D19166765 Pulled By: riversand963 fbshipit-source-id: bbe08816fef8de349a83ebd59a595ad844021f24	2020-01-08 16:31:23 -08:00
Amber1990Zhang	941bd15aed	add user nebula (#6271 ) Summary: As title. add a new user [Nebula Graph](https://github.com/vesoft-inc/nebula) to the user doc. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6271 Differential Revision: D19319345 fbshipit-source-id: 52a54372cecc701c34da4ea6b1cf27f3b7498efb	2020-01-08 13:46:43 -08:00
sdong	1244abef66	Stress Test: relax prefix iterator check condition (#6269 ) Summary: Right now, when validating the prefix iterator, if the control iterator is invalid but the prefix iterator shows a value, we treat it as a test failure. However, this fails to consider the case where a file or memtable containing a tombstone is filtered out by a prefix bloom filter. The fix is to relax the check in this case. If we are out of prefix range, then ignore the check. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6269 Test Plan: Run crash_test for a short while and it still passes. Differential Revision: D19317594 fbshipit-source-id: b964a1cdc1df5efe439d4b32f8023e1fbc8598c1	2020-01-08 13:32:06 -08:00
Maysam Yabandeh	f4a378be3e	Print out non-ok DB::Open status in db_stress (#6272 ) Summary: The crash test is failing with non-ok status after TransactionDB::Open. This patch adds more debugging information. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6272 Differential Revision: D19314527 Pulled By: maysamyabandeh fbshipit-source-id: d45ecb0f2144e052fb4b5fdd483150440991a3b4	2020-01-08 12:10:55 -08:00
Adam Retter	6477075f2c	JMH microbenchmarks for RocksJava (#6241 ) Summary: This is the start of some JMH microbenchmarks for RocksJava. Such benchmarks can help us decide on performance improvements of the Java API. At the moment, I have only added benchmarks for various Comparator options, as that is one of the first areas where I want to improve performance. I plan to expand this to many more tests. Details of how to compile and run the benchmarks are in the `README.md`. A run of these on a XEON 3.5 GHz 4vCPU (QEMU Virtual CPU version 2.5+) / 8GB RAM KVM with Ubuntu 18.04, OpenJDK 1.8.0_232, and gcc 8.3.0 produced the following: ``` # Run complete. Total time: 01:43:17 REMEMBER: The numbers below are just data. To gain reusable insights, you need to follow up on why the numbers are the way they are. Use profilers (see -prof, -lprof), design factorial experiments, perform baseline and negative tests that provide experimental control, make sure the benchmarking environment is safe on JVM/OS/HW level, ask for reviews from the domain experts. Do not assume the numbers tell you what you want them to tell. 
Benchmark (comparatorName) Mode Cnt Score Error Units ComparatorBenchmarks.put native_bytewise thrpt 25 122373.920 ± 2200.538 ops/s ComparatorBenchmarks.put java_bytewise_adaptive_mutex thrpt 25 17388.201 ± 1444.006 ops/s ComparatorBenchmarks.put java_bytewise_non-adaptive_mutex thrpt 25 16887.150 ± 1632.204 ops/s ComparatorBenchmarks.put java_direct_bytewise_adaptive_mutex thrpt 25 15644.572 ± 1791.189 ops/s ComparatorBenchmarks.put java_direct_bytewise_non-adaptive_mutex thrpt 25 14869.601 ± 2252.135 ops/s ComparatorBenchmarks.put native_reverse_bytewise thrpt 25 116528.735 ± 4168.797 ops/s ComparatorBenchmarks.put java_reverse_bytewise_adaptive_mutex thrpt 25 10651.975 ± 545.998 ops/s ComparatorBenchmarks.put java_reverse_bytewise_non-adaptive_mutex thrpt 25 10514.224 ± 930.069 ops/s ``` Indicating a ~7x difference between comparators implemented natively (C++) and those implemented in Java. Let's see if we can't improve on that in the near future... Pull Request resolved: https://github.com/facebook/rocksdb/pull/6241 Differential Revision: D19290410 Pulled By: pdillinger fbshipit-source-id: 25d44bf3a31de265502ed0c5d8a28cf4c7cb9c0b	2020-01-07 15:46:09 -08:00
Maysam Yabandeh	5709e97a74	Skip CancelAllBackgroundWork if DBImpl is already closed (#6268 ) Summary: WritePreparedTxnDB calls CancelAllBackgroundWork in its destructor to avoid dangling references to it from background job's SnapshotChecker callback. However, if the DBImpl is already closed, the info log might be closed with it, which causes memory leak when CancelAllBackgroundWork tries to print to the info log. The patch fixes that by calling CancelAllBackgroundWork only if the db is not closed already. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6268 Differential Revision: D19303439 Pulled By: maysamyabandeh fbshipit-source-id: 4228a6be7e78d43c90630347baa89b008200bd15	2020-01-07 15:34:27 -08:00
wolfkdy	1ab1231acf	parallel occ (#6240 ) Summary: This is a continuation of https://github.com/facebook/rocksdb/pull/5320/files. I opened a new PR for these purposes; half a year has passed since the old PR was posted, so it's almost impossible to fulfill some of the points below on the old PR, especially 5). 1) add validation modes for optimistic txns 2) modify unittests to test both modes 3) make format 4) refine hash functor 5) push to master Pull Request resolved: https://github.com/facebook/rocksdb/pull/6240 Differential Revision: D19301296 fbshipit-source-id: 5b5b3cbd39558f43947f7d2dec6cd31a06386edb	2020-01-07 14:20:38 -08:00
Huisheng Liu	2fdd8087ce	Implement getfreespace for WinEnv (#6265 ) Summary: A new interface method Env::GetFreeSpace was added in https://github.com/facebook/rocksdb/issues/4164. It needs to be implemented for Windows port. Some error_handler_test cases fail on Windows because recovery cannot succeed without free space being reported. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6265 Differential Revision: D19303065 fbshipit-source-id: 1f1a83e53f334284781cf61feabc996e87b945ca	2020-01-07 13:56:13 -08:00
Yanqin Jin	a8b1085ae2	Fix test in LITE mode (#6267 ) Summary: Currently, the recently-added test DBTest2.SwitchMemtableRaceWithNewManifest fails in LITE mode since SetOptions() returns "Not supported". I do not want to put `#ifndef ROCKSDB_LITE` because it reduces test coverage. Instead, just trigger compaction on a different column family. The bg compaction thread calling LogAndApply() may race with thread calling SwitchMemtable(). Test Plan (dev server): make check OPT=-DROCKSDB_LITE make check or run DBTest2.SwitchMemtableRaceWithNewManifest 100 times. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6267 Differential Revision: D19301309 Pulled By: riversand963 fbshipit-source-id: 88cedcca2f985968ed3bb234d324ffa2aa04ca50	2020-01-07 13:47:03 -08:00
Yanqin Jin	bce5189f4d	Fix error message (#6264 ) Summary: Fix an error message when CURRENT is not found. Test plan (dev server) ``` make check ``` Pull Request resolved: https://github.com/facebook/rocksdb/pull/6264 Differential Revision: D19300699 Pulled By: riversand963 fbshipit-source-id: 303fa206386a125960ecca1dbdeff07422690caf	2020-01-07 12:32:20 -08:00
Connor1996	3e26a94ba1	Add oldest snapshot sequence property (#6228 ) Summary: Add oldest snapshot sequence property, so we can use `db.GetProperty("rocksdb.oldest-snapshot-sequence")` to get the sequence number of the oldest snapshot. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6228 Differential Revision: D19264145 Pulled By: maysamyabandeh fbshipit-source-id: 67fbe5304d89cbc475bd404e30d1299f7b11c010	2020-01-07 08:36:44 -08:00
Yanqin Jin	1aaa145877	Fix a data race for cfd->log_number_ (#6249 ) Summary: A thread calling LogAndApply may release the db mutex when calling WriteCurrentStateToManifest(), which reads cfd->log_number_. Another thread can call SwitchMemtable() and write to cfd->log_number_. The solution is to cache cfd->log_number_ before releasing the mutex in LogAndApply. Test Plan (on devserver): ``` $ COMPILE_WITH_TSAN=1 make db_stress $ ./db_stress --acquire_snapshot_one_in=10000 --avoid_unnecessary_blocking_io=1 --block_size=16384 --bloom_bits=16 --bottommost_compression_type=zstd --cache_index_and_filter_blocks=1 --cache_size=1048576 --checkpoint_one_in=1000000 --checksum_type=kxxHash --clear_column_family_one_in=0 --compact_files_one_in=1000000 --compact_range_one_in=1000000 --compaction_ttl=0 --compression_max_dict_bytes=16384 --compression_type=zstd --compression_zstd_max_train_bytes=0 --continuous_verification_interval=0 --db=/dev/shm/rocksdb/rocksdb_crashtest_blackbox --db_write_buffer_size=1048576 --delpercent=5 --delrangepercent=0 --destroy_db_initially=0 --enable_pipelined_write=0 --flush_one_in=1000000 --format_version=5 --get_live_files_and_wal_files_one_in=1000000 --index_block_restart_interval=5 --index_type=0 --log2_keys_per_lock=22 --long_running_snapshots=0 --max_background_compactions=20 --max_bytes_for_level_base=10485760 --max_key=1000000 --max_manifest_file_size=16384 --max_write_batch_group_size_bytes=16 --max_write_buffer_number=3 --memtablerep=skip_list --mmap_read=0 --nooverwritepercent=1 --open_files=500000 --ops_per_thread=100000000 --partition_filters=0 --pause_background_one_in=1000000 --periodic_compaction_seconds=0 --prefixpercent=5 --progress_reports=0 --readpercent=45 --recycle_log_file_num=0 --reopen=20 --set_options_one_in=10000 --snapshot_hold_ops=100000 --subcompactions=2 --sync=1 --target_file_size_base=2097152 --target_file_size_multiplier=2 --test_batches_snapshots=1 --use_direct_io_for_flush_and_compaction=0 --use_direct_reads=0 
--use_full_merge_v1=0 --use_merge=0 --use_multiget=1 --verify_checksum=1 --verify_checksum_one_in=1000000 --verify_db_one_in=100000 --write_buffer_size=4194304 --write_dbid_to_manifest=1 --writepercent=35 ``` Then, after compiling with TSAN, repeat the following multiple times, e.g. 100 times. ``` $ ./db_test2 --gtest_filter=DBTest2.SwitchMemtableRaceWithNewManifest ``` Pull Request resolved: https://github.com/facebook/rocksdb/pull/6249 Differential Revision: D19235077 Pulled By: riversand963 fbshipit-source-id: 79467b52f48739ce7c27e440caa2447a40653173	2020-01-06 20:09:51 -08:00
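The fix described in the commit above (copy the value while the mutex is still held, then use only the copy once the mutex is released) can be sketched in isolation. This is a minimal illustration in standard C++ and is not the RocksDB change itself; `FakeColumnFamily`, `SwitchLog`, and `SnapshotLogNumberThenWork` are hypothetical names introduced here.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>

// Hypothetical stand-in for a column family whose log number is mutex-guarded.
struct FakeColumnFamily {
  std::mutex mu;
  uint64_t log_number = 0;  // only read or written while mu is held
};

// Writer side: updates the log number under the mutex
// (playing the role that SwitchMemtable() has in the commit message).
void SwitchLog(FakeColumnFamily& cfd, uint64_t new_log) {
  std::lock_guard<std::mutex> guard(cfd.mu);
  cfd.log_number = new_log;
}

// Reader side: caches the value while holding the mutex, releases the mutex,
// then does its slow work using only the cached copy.
uint64_t SnapshotLogNumberThenWork(FakeColumnFamily& cfd) {
  std::unique_lock<std::mutex> lock(cfd.mu);
  const uint64_t cached = cfd.log_number;  // cache before unlocking
  lock.unlock();
  assert(!lock.owns_lock());  // from here on the mutex is not held
  // Slow work (for example, writing state to a file) would go here and would
  // read `cached`, never cfd.log_number, so a concurrent SwitchLog() call
  // cannot race with it.
  return cached;
}
```

The trade-off in this pattern is that the slow work may see a slightly stale value, which is acceptable only when the caller needs a consistent snapshot rather than the latest value.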
Yanqin Jin	946c43a026	Improve error msg for SstFileWriter Merge (#6261 ) Summary: Reword the error message shown when keys are not added in strict ascending order. Specifically, the original error message is not clear when an application tries to call SstFileWriter::Merge() with duplicate keys. Test plan (dev server) ``` make check ``` Pull Request resolved: https://github.com/facebook/rocksdb/pull/6261 Differential Revision: D19290398 Pulled By: riversand963 fbshipit-source-id: 4dc30a701414e6894db2eb024e3734470c22b371	2020-01-06 10:57:22 -08:00
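To illustrate why duplicate keys fall under a "strict ascending order" rule, here is a minimal sketch of such a check in standard C++. It assumes plain bytewise ordering via `std::string::compare`; `OrderedKeyChecker`, `DemoOrderedKeyChecker`, and the message text are hypothetical and are not taken from the SstFileWriter implementation.

```cpp
#include <cassert>
#include <string>

// Hypothetical checker that accepts keys only in strictly ascending order.
class OrderedKeyChecker {
 public:
  // Returns an empty string on success, or an error message if `key` is not
  // strictly greater than the previously accepted key.
  std::string Add(const std::string& key) {
    if (has_last_ && key.compare(last_key_) <= 0) {
      // "<= 0" covers both out-of-order keys and exact duplicates.
      return "keys must be added in strict ascending order";  // illustrative text
    }
    last_key_ = key;
    has_last_ = true;
    return "";
  }

 private:
  std::string last_key_;
  bool has_last_ = false;
};

// Small self-check showing the intended behaviour.
void DemoOrderedKeyChecker() {
  OrderedKeyChecker checker;
  assert(checker.Add("a").empty());   // first key is always accepted
  assert(checker.Add("b").empty());   // strictly greater: accepted
  assert(!checker.Add("b").empty());  // duplicate: rejected
  assert(!checker.Add("a").empty());  // smaller: rejected
  assert(checker.Add("c").empty());   // rejected keys do not change state
}
```

Because the comparison is `<= 0` rather than `< 0`, a repeated key is rejected by the same branch as an out-of-order key, which is why a message that mentions only ordering can be confusing for the duplicate case.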

8683 Commits