rocksdb

Author	SHA1	Message	Date
Zhichao Cao	4146276885	Add ldb_cmd_test to ASSERT_STATUS_CHECKED list (#7499 ) Summary: Add ldb_cmd_test to ASSERT_STATUS_CHECKED list Pull Request resolved: https://github.com/facebook/rocksdb/pull/7499 Test Plan: pass ASSERT_STATUS_CHECKED=1 make -j48 ldb_cmd_test Reviewed By: cheng-chang Differential Revision: D24086203 Pulled By: zhichao-cao fbshipit-source-id: 29592202b1d4335e566de15e7937269d98d57841	2020-10-08 00:00:48 -07:00
Yanqin Jin	002b30c967	Fix clang analyzer (#7518 ) Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/7518 Test Plan: ``` $USE_CLANG=1 make analyze ``` Reviewed By: zhichao-cao Differential Revision: D24175390 Pulled By: riversand963 fbshipit-source-id: c70121652908cf5d450120c38ab65cc595332ca7	2020-10-07 20:11:06 -07:00
Levi Tamasi	1f84611e5d	Clean up BlobLogReader and rename it to BlobLogSequentialReader (#7517 ) Summary: The patch does some cleanup in and around the legacy `BlobLogReader` class: * It renames the class to `BlobLogSequentialReader` to emphasize that it is for sequentially iterating through blobs in a blob file, as opposed to doing random point reads using `BlobIndex`es (which is `BlobFileReader`'s jurisdiction). * It removes some dead code from the old BlobDB implementation that references `BlobLogReader` (namely the method `BlobFile::OpenRandomAccessReader`). * It cleans up some `#include`s and forward declarations. * It fixes some incorrect/outdated comments related to the reader class. * It adds a few assertions to the `Read` methods of the class. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7517 Test Plan: `make check` Reviewed By: riversand963 Differential Revision: D24172611 Pulled By: ltamasi fbshipit-source-id: 43e2ae1eba5c3dd30c1070cb00f217edc45bd64f	2020-10-07 17:48:16 -07:00
Levi Tamasi	22655a398b	Introduce a blob file reader class (#7461 ) Summary: The patch adds a class called `BlobFileReader` that can be used to retrieve blobs using the information available in blob references (e.g. blob file number, offset, and size). This will come in handy when implementing blob support for `Get`, `MultiGet`, and iterators, and also for compaction/garbage collection. When a `BlobFileReader` object is created (using the factory method `Create`), it first checks whether the specified file is potentially valid by comparing the file size against the combined size of the blob file header and footer (files smaller than the threshold are considered malformed). Then, it opens the file, and reads and verifies the header and footer. The verification involves magic number/CRC checks as well as checking for unexpected header/footer fields, e.g. incorrect column family ID or TTL blob files. Blobs can be retrieved using `GetBlob`. `GetBlob` validates the offset and compression type passed by the caller (because of the presence of the header and footer, the specified offset cannot be too close to the start/end of the file; also, the compression type has to match the one in the blob file header), and retrieves and potentially verifies and uncompresses the blob. In particular, when `ReadOptions::verify_checksums` is set, `BlobFileReader` reads the blob record header as well (as opposed to just the blob itself) and verifies the key/value size, the key itself, as well as the CRC of the blob record header and the key/value pair. In addition, the patch exposes the compression type from `BlobIndex` (both using an accessor and via `DebugString`), and adds a blob file read latency histogram to `InternalStats` that can be used with `BlobFileReader`. 
Pull Request resolved: https://github.com/facebook/rocksdb/pull/7461 Test Plan: `make check` Reviewed By: riversand963 Differential Revision: D23999219 Pulled By: ltamasi fbshipit-source-id: deb6b1160d251258b308d5156e2ec063c3e12e5e	2020-10-07 15:44:53 -07:00
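The size and offset checks described for `BlobFileReader` above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the patch: both function names are hypothetical, and the header and footer sizes are passed in as parameters rather than taken from RocksDB.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: a blob file smaller than header + footer cannot be
// well-formed, so it is rejected before any read is attempted.
bool IsPlausibleBlobFileSize(uint64_t file_size, uint64_t header_size,
                             uint64_t footer_size) {
  return file_size >= header_size + footer_size;
}

// Hypothetical helper: a blob must lie entirely between the file header and
// the file footer, so offsets too close to either end are rejected.
bool IsValidBlobOffset(uint64_t offset, uint64_t blob_size, uint64_t file_size,
                       uint64_t header_size, uint64_t footer_size) {
  if (!IsPlausibleBlobFileSize(file_size, header_size, footer_size)) {
    return false;
  }
  const uint64_t end_limit = file_size - footer_size;
  assert(end_limit >= header_size);  // follows from the size check above
  if (offset < header_size || offset > end_limit) {
    return false;
  }
  // Written as a subtraction to avoid overflow in offset + blob_size.
  return blob_size <= end_limit - offset;
}
```

With an assumed 30-byte header and 32-byte footer, a 100-byte file accepts an 8-byte blob at offset 30 but rejects one at offset 10 (inside the header) or at offset 65 (running into the footer).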
Akanksha Mahajan	38d0a365e3	Add Stats for MultiGet (#7366 ) Summary: Add following stats for MultiGet in Histogram to get more insight on MultiGet. 1. Number of index and filter blocks read from file as part of MultiGet request per level. 2. Number of data blocks read from file per level. 3. Number of SST files loaded from file system per level. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7366 Reviewed By: anand1976 Differential Revision: D24127040 Pulled By: akankshamahajan15 fbshipit-source-id: e63a003056b833729b277edc0639c08fb432756b	2020-10-07 13:28:48 -07:00
Jay Zhuang	8891e9a0eb	Disallow trivial move if BottommostLevelCompaction is kForce* (#7368 ) Summary: If `BottommostLevelCompaction.kForce*` is set, compaction should avoid trivial move and always compact the sst to the target size. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7368 Reviewed By: ajkr Differential Revision: D23629525 Pulled By: jay-zhuang fbshipit-source-id: 79f23c79ecb31587e0593b28cce43131107bbcd0	2020-10-07 13:19:31 -07:00
anand76	a242a58301	Enable ASSERT_STATUS_CHECKED for db_universal_compaction_test (#7460 ) Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/7460 Reviewed By: riversand963 Differential Revision: D24057636 Pulled By: anand1976 fbshipit-source-id: bfb13da6993a5e407be20073e4d6751dfb38e442	2020-10-06 14:42:12 -07:00
Yanqin Jin	1bcef3d83c	Make sure assert(false) handles failures too (#7483 ) Summary: In opt mode, assertions are just no-ops. Therefore, we need to report errors instead of just doing an `assert(false)`. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7483 Test Plan: make check Reviewed By: anand1976 Differential Revision: D24142725 Pulled By: riversand963 fbshipit-source-id: 5629556dbe29f00dd09e30a7d5df5e6cf09ee435	2020-10-06 13:52:42 -07:00
Jay Zhuang	53089038de	Fix StallWrite crash with mixed of slowdown/no_slowdown writes (#7508 ) Summary: `BeginWriteStall()` removes no_slowdown writes from the write list and updates `link_newer`, which made `CreateMissingNewerLinks()` assume that the whole write list already had valid `link_newer` pointers, so it failed to create links for all writers. This caused a flaky test and a segfault in release builds. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7508 Test Plan: Add a unit test to reproduce the issue. Reviewed By: anand1976 Differential Revision: D24126601 Pulled By: jay-zhuang fbshipit-source-id: f8ac5dba653f7ee1b0950296427d4f5f8ee34a06	2020-10-06 12:44:20 -07:00
Yanqin Jin	758ead5df7	Enforce status check for corruption_test (#7453 ) Summary: Enforce status check for corruption_test. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7453 Test Plan: ``` ASSERT_STATUS_CHECKED=1 make corruption_test ./corruption_test ``` Reviewed By: jay-zhuang Differential Revision: D24006862 Pulled By: riversand963 fbshipit-source-id: 664677caf4c3007a25cf565cec3d677f2dcea130	2020-10-02 22:11:00 -07:00
sdong	668ee08915	Fix prefix_test for status check (#7495 ) Summary: Fix prefix_test so that it passes when ASSERT_STATUS_CHECKED=1 Pull Request resolved: https://github.com/facebook/rocksdb/pull/7495 Test Plan: Run the test with the option Reviewed By: anand1976 Differential Revision: D24069715 fbshipit-source-id: 54f74b58575a1b49dbdee9ea2d24751fa956b620	2020-10-02 17:01:15 -07:00
Zhichao Cao	b7062f0b2c	Status check enforcement for error_handler_fs_test (#7342 ) Summary: Added status check enforcement for error_handler_fs_test Pull Request resolved: https://github.com/facebook/rocksdb/pull/7342 Test Plan: ASSERT_STATUS_CHECKED=1 make -j48 error_handler_fs_test Reviewed By: akankshamahajan15 Differential Revision: D23972231 Pulled By: zhichao-cao fbshipit-source-id: fa41bfe440012e0c55f2c9507c1d0104e5e93f84	2020-10-02 16:41:13 -07:00
Akanksha Mahajan	7cd760dfdf	Add status check enforcement for column_family_test.cc (#7484 ) Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/7484 Reviewed By: jay-zhuang Differential Revision: D24037616 Pulled By: akankshamahajan15 fbshipit-source-id: 0f63281f81046bcb1b95a7578783285cc6346ece	2020-10-02 13:35:15 -07:00
Yanqin Jin	48d5aa9bab	Enable status check for db_secondary_test (#7487 ) Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/7487 Test Plan: ASSERT_STATUS_CHECKED=1 make db_secondary_test ./db_secondary_test Reviewed By: zhichao-cao Differential Revision: D24071038 Pulled By: riversand963 fbshipit-source-id: e6600c0aecab71c1326b22af263e92bddee5f7ac	2020-10-02 11:23:36 -07:00
Andrew Kryczka	29ed766193	add Status check enforcement for stats_history_test (#7496 ) Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/7496 Reviewed By: zhichao-cao Differential Revision: D24070007 Pulled By: ajkr fbshipit-source-id: 4320413a4d7707774ee23a7e6232714d7ee7a57f	2020-10-02 08:25:30 -07:00
Andrew Kryczka	1e00909730	Periodically flush info log out of application buffer (#7488 ) Summary: This PR schedules a background thread (shared across all DB instances) to flush info log every ten seconds. This improves debuggability in case of RocksDB hanging since it ensures the log messages leading up to the hang will eventually become visible in the log. The bulk of this PR is moving monitoring/stats_dump_scheduler* to db/periodic_work_scheduler* and making the corresponding name changes since now the scheduler handles info log flushing, not just stats dumping. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7488 Reviewed By: riversand963 Differential Revision: D24065165 Pulled By: ajkr fbshipit-source-id: 339c47a0ff43b79fdbd055fbd9fefbb6f9d8d3b5	2020-10-01 19:14:14 -07:00
sdong	94fc676d3f	Fix db_properties_test for ASSERT_STATUS_CHECKED (#7490 ) Summary: Add all status handling in db_properties_test so that it can pass ASSERT_STATUS_CHECKED. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7490 Test Plan: Run the test with ASSERT_STATUS_CHECKED Reviewed By: jay-zhuang Differential Revision: D24065382 fbshipit-source-id: e008916155196891478c964df0226545308ca71d	2020-10-01 17:47:09 -07:00
Zhichao Cao	685cabdafa	Add trace_analyzer_test to ASSERT_STATUS_CHECKED list (#7480 ) Summary: Add trace_analyzer_test to ASSERT_STATUS_CHECKED list Pull Request resolved: https://github.com/facebook/rocksdb/pull/7480 Test Plan: ASSERT_STATUS_CHECKED=1 make -j48 trace_analyzer_test Reviewed By: riversand963 Differential Revision: D24033768 Pulled By: zhichao-cao fbshipit-source-id: b415045e6fab01d6193448650772368c21c6dba6	2020-10-01 15:58:52 -07:00
Peter Dillinger	9082771b86	Add is_full_compaction to CompactionJobStats, cleanup (#7451 ) Summary: This exposes to the listener interface whether a compaction was full or not. Also cleaned up API comment for CompactionJobInfo::stats, which is not of a nullable type. And since CompactionJob is always created with non-null CompactionJobStats, removed conditionals on it being nullptr and instead assert non-null. TODO later: update C and Java interfaces Pull Request resolved: https://github.com/facebook/rocksdb/pull/7451 Test Plan: updated existing unit tests to check new field, make check Reviewed By: ltamasi Differential Revision: D23977796 Pulled By: pdillinger fbshipit-source-id: 1ae7e26cb949631c2b2fb9e696710daf53cc378d	2020-10-01 12:52:58 -07:00
Levi Tamasi	786c1a2cc4	Reduce the number of iterations in DBTest.FileCreationRandomFailure (#7481 ) Summary: `DBTest.FileCreationRandomFailure` frequently times out during our continuous test runs. (It's a case of "stress test posing as unit test.") The patch reduces the number of iterations to avoid this. Note that the lower numbers are still sufficient to trigger both flushes and compactions, so test coverage is still the same. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7481 Test Plan: `make check` Reviewed By: riversand963 Differential Revision: D24034712 Pulled By: ltamasi fbshipit-source-id: 8731a9446e5a121a1041b00f0df473b9f714935a	2020-10-01 10:42:58 -07:00
sdong	7508175558	Introduce options.check_flush_compaction_key_order (#7467 ) Summary: Introduce a new option, options.check_flush_compaction_key_order, set to true by default, which checks the key order during flush and compaction and fails the operation if the order is violated. Also did a minor refactor of the hash checking code, consolidating the hashing logic into a validation class, where the key ordering logic is added. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7467 Test Plan: Add unit tests to validate that the check can catch reordering in flush and compaction, and can be properly disabled. Reviewed By: riversand963 Differential Revision: D24010683 fbshipit-source-id: 8dd6292d2cda8006054e9ded7cfa4bf405f0527c	2020-10-01 10:10:26 -07:00
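A key order check of the kind this commit describes could look roughly like the sketch below. It is a simplified illustration only: the class name `KeyOrderValidator` is hypothetical, and plain bytewise string comparison is assumed in place of whatever comparator the real validation class uses.

```cpp
#include <cassert>
#include <string>

// Hypothetical validator: remembers the previously added key and rejects
// any key that sorts before it. Bytewise ordering is an assumption here.
class KeyOrderValidator {
 public:
  // Returns false if `key` violates ascending order; the caller would then
  // fail the flush or compaction instead of writing out-of-order data.
  bool Add(const std::string& key) {
    if (has_prev_ && key < prev_) {
      return false;
    }
    prev_ = key;
    has_prev_ = true;
    return true;
  }

 private:
  std::string prev_;
  bool has_prev_ = false;
};
```

Feeding the keys "a", "b", "a" in that order would accept the first two and reject the third.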
Koby Kahane	3e745053b7	Fix MSVC-related build issues (#7439 ) Summary: This PR addresses some build and functional issues on MSVC targets, as a step towards an eventual goal of having RocksDB build successfully for Windows on ARM64. Addressed issues include: - BitsSetToOne and CountTrailingZeroBits do not compile on non-x64 MSVC targets. A fallback implementation of BitsSetToOne when Intel intrinsics are not available is added, based on the C++20 `<bit>` popcount implementation in Microsoft's STL. - The implementation of FloorLog2 for MSVC targets (including x64) gives incorrect results. The unit test easily detects this, but CircleCI is currently configured to only run a specific set of tests for Windows CMake builds, so this seems to have been unnoticed. - AsmVolatilePause does not use YieldProcessor on Windows ARM64 targets, even though it is available. - When CondVar::TimedWait calls Microsoft STL's condition_variable::wait_for, it can potentially trigger a bug (just recently fixed in the upcoming VS 16.8's STL) that deadlocks various tests that wait for a timer to execute, since `Timer::Run` doesn't get a chance to execute before being blocked by the test function acquiring the mutex. - In c_test, `GetTempDir` assumes a POSIX-style temp path. - `NormalizePath` did not eliminate consecutive POSIX-style path separators on Windows, resulting in test failures in e.g., wal_manager_test. - Various other test failures. In a followup PR I hope to modify CircleCI's config.yml to invoke all RocksDB unit tests in Windows CMake builds with CTest, instead of the current use of `run_ci_db_test.ps1` which requires individual tests to be specified and is missing many of the existing tests. Notes from peterd: FloorLog2 is not yet used in production code (it's for something in progress). I also added a few more inexpensive platform-dependent tests to Windows CircleCI runs. 
And included facebook/folly#1461 as requested Pull Request resolved: https://github.com/facebook/rocksdb/pull/7439 Reviewed By: jay-zhuang Differential Revision: D24021563 Pulled By: pdillinger fbshipit-source-id: 0ec2027c0d6a494d8a0fe38d9667fc2f7e29f7e7	2020-10-01 09:23:04 -07:00
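For readers unfamiliar with what a non-intrinsic fallback for `BitsSetToOne` or `FloorLog2` involves, here is a minimal portable sketch. The function names are hypothetical and this is a common bit-twiddling formulation, not necessarily the code that was merged.

```cpp
#include <cassert>
#include <cstdint>

// Portable population count (number of bits set to one), no intrinsics.
int PortableBitsSetToOne(uint64_t v) {
  v = v - ((v >> 1) & 0x5555555555555555ULL);             // 2-bit sums
  v = (v & 0x3333333333333333ULL) +
      ((v >> 2) & 0x3333333333333333ULL);                 // 4-bit sums
  v = (v + (v >> 4)) & 0x0F0F0F0F0F0F0F0FULL;             // 8-bit sums
  return static_cast<int>((v * 0x0101010101010101ULL) >> 56);  // add bytes
}

// Portable floor(log2(v)); only defined for v > 0.
int PortableFloorLog2(uint64_t v) {
  assert(v > 0);
  int result = 0;
  while (v >>= 1) {
    ++result;
  }
  return result;
}
```

A unit test over a few powers of two and their neighbours (for example 1, 2, 3, 1024) is usually enough to catch the kind of incorrect `FloorLog2` result the commit message mentions.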
Ramkumar Vadivelu	e04a50923d	Change ParseInternalKey() to return Status instead of bool (#7457 ) Summary: Fixes https://github.com/facebook/rocksdb/issues/7430 Change ParseInternalKey() to return Status instead of bool. db_bench (seekrandom) based before/after results with value size of 100 bytes and 16 bytes can be found at (tests ran on an udb server): https://www.dropbox.com/s/47bwamdy5ozngph/PIK_ret_Status_results.xlsx?dl=0 ![db_bench_results](https://user-images.githubusercontent.com/62277872/94642825-2a21a800-029a-11eb-88f2-124136c83fd3.png) Pull Request resolved: https://github.com/facebook/rocksdb/pull/7457 Reviewed By: ajkr Differential Revision: D24002433 Pulled By: ramvadiv fbshipit-source-id: ac253ecf577a29044c47c3fe254a01e71404c44c	2020-09-30 19:16:47 -07:00
Andrew Kryczka	718e192965	Fix flaky intra-L0 consistency failure regression tests (#7477 ) Summary: Do not assert the number of files after intra-L0 compaction is eligible to run since it could complete (and reduce the number of files) before the assertion executes. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7477 Reviewed By: pdillinger Differential Revision: D24032049 Pulled By: ajkr fbshipit-source-id: e838ac7a24651ebd643b9e5a9d39d2e789c46929	2020-09-30 16:50:24 -07:00
Peter Dillinger	ddbc5dad05	Enable force_consistency_checks by default (#7446 ) Summary: This has been running in production on some key workloads, so we believe it to be safe and extremely low cost. Nevertheless, I've added code to ensure that "force_consistency_checks" is mentioned in any corruption reports so that people know how to disable in case of false positive corruption reports. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7446 Test Plan: make check, CI, temporary debug print new message with ./version_builder_test Reviewed By: ajkr Differential Revision: D23972101 Pulled By: pdillinger fbshipit-source-id: 9623e400f3752577c0ecf977e6d0915562cf9968	2020-09-30 11:57:32 -07:00
sdong	5f33436285	Revert an unnecessary status code check skipping (#7458 ) Summary: https://github.com/facebook/rocksdb/pull/7452 added an unnecessary skip for status code checking. Revert it. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7458 Test Plan: Watch CI to finish Reviewed By: jay-zhuang Differential Revision: D23994390 fbshipit-source-id: a2b50a6326d8073db3386bff3d32acc5a6666e9b	2020-09-30 11:38:39 -07:00
Akanksha Mahajan	9d212d3f0e	Provide users with option to opt-in to get corrupt data in logs/messages (#7420 ) Summary: Add a new option, "allow_data_in_errors". When set, it lets users opt in to error messages that contain corrupted keys/values. Corrupt keys and values will then be included in messages, logs, statuses, etc., giving users useful information about the affected data. The default is false, to prevent user data from being exposed in messages. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7420 Test Plan: 1. make check -j64 2. Add a new test case Reviewed By: ajkr Differential Revision: D23835028 Pulled By: akankshamahajan15 fbshipit-source-id: 8d2eba8fb898e79fcf1fccc07295065a75eb59b1	2020-09-29 23:17:45 -07:00
Jay Zhuang	1bdaef7a06	Status check enforcement for timestamp_basic_test (#7454 ) Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/7454 Reviewed By: riversand963 Differential Revision: D23981719 Pulled By: jay-zhuang fbshipit-source-id: 01073f73e54c17067b886c4a2f179b2804198399	2020-09-29 18:23:27 -07:00
Andrew Kryczka	8115eb520d	add Status check assertions for repair_test (#7455 ) Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/7455 Reviewed By: riversand963 Differential Revision: D23985283 Pulled By: ajkr fbshipit-source-id: 5dd2be62350f6e31d13a1e7821cb848a37699c93	2020-09-29 16:30:08 -07:00
Yanqin Jin	07dc955a1f	Report error of GetChildren (#7459 ) Summary: As title Pull Request resolved: https://github.com/facebook/rocksdb/pull/7459 Test Plan: make check Reviewed By: anand1976 Differential Revision: D23999393 Pulled By: riversand963 fbshipit-source-id: 09df8e1637f4df3616c63ee314de397b35be4e4a	2020-09-29 15:27:00 -07:00
anand76	12ede5ed7c	Remove invalid assertion in compaction_picker_universal.cc (#7421 ) Summary: The assertion checks that there is no overlap in sequence numbers across levels in universal compaction. However, this assumption doesn't hold when there is a delete triggered compaction or a trivial move, as they operate on a subset of a level. Tests - make check Pull Request resolved: https://github.com/facebook/rocksdb/pull/7421 Reviewed By: ajkr Differential Revision: D23872672 Pulled By: anand1976 fbshipit-source-id: c386deab8e01a5746ca996ff1f4ebcae3b15b7d2	2020-09-29 13:29:58 -07:00
sdong	d08a9005b7	Make db_basic_test pass assert status checked (#7452 ) Summary: Add db_basic_test status check list. Some of the warnings are suppressed. It is possible that some of them are due to real bugs. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7452 Test Plan: See CI tests pass. Reviewed By: zhichao-cao Differential Revision: D23979764 fbshipit-source-id: 6151570c2a9b931b0fbb3fe939a94b2bd1583cbe	2020-09-29 09:49:04 -07:00
Zhichao Cao	d71cfe04e4	Add flush_job_test to the list of ASSERT_STATUS_CHECKED tests (#7445 ) Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/7445 Test Plan: pass ASSERT_STATUS_CHECKED=1 make -j48 flush_job_test Reviewed By: akankshamahajan15 Differential Revision: D23969372 Pulled By: zhichao-cao fbshipit-source-id: 498ff45ef84e07ec27a8f35d0874d3371412afe9	2020-09-28 14:59:02 -07:00
Levi Tamasi	1abbc56aba	Add version_builder_test to the list of ASSERT_STATUS_CHECKED tests (#7444 ) Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/7444 Test Plan: `ASSERT_STATUS_CHECKED=1 make version_builder_test -j24` Reviewed By: jay-zhuang Differential Revision: D23965793 Pulled By: ltamasi fbshipit-source-id: 8beaf66548379f21146189cda699d5f6fbb35a1b	2020-09-28 12:12:40 -07:00
Ramkumar Vadivelu	c203e01773	reset refitting_level_ flag to false in error paths (#7403 ) Summary: Reset refitting_level_ flag to false in error paths in DBImpl::ReFitLevel() Pull Request resolved: https://github.com/facebook/rocksdb/pull/7403 Reviewed By: ajkr Differential Revision: D23909028 Pulled By: ramvadiv fbshipit-source-id: 521ad9aadc1b734bef9ef9119d1e1ee1fa8126e9	2020-09-28 11:37:00 -07:00
Peter Dillinger	08552b19d3	Genericize and clean up FastRange (#7436 ) Summary: A generic algorithm in progress depends on a templatized version of fastrange, so this change generalizes it and renames it to fit our style guidelines, FastRange32, FastRange64, and now FastRangeGeneric. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7436 Test Plan: added a few more test cases Reviewed By: jay-zhuang Differential Revision: D23958153 Pulled By: pdillinger fbshipit-source-id: 8c3b76101653417804997e5f076623a25586f3e8	2020-09-28 11:35:00 -07:00
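For context, the fastrange technique maps a hash value to a range with a multiply and a shift instead of a modulo. The sketch below shows the 32-bit case only; the name `FastRange32Sketch` is hypothetical and the signature is an assumption, not the one introduced by this change.

```cpp
#include <cassert>
#include <cstdint>

// Maps a 32-bit hash to [0, range) by taking the high 32 bits of the
// 64-bit product hash * range. Returns 0 when range is 0.
inline uint32_t FastRange32Sketch(uint32_t hash, uint32_t range) {
  const uint64_t product = uint64_t{hash} * uint64_t{range};
  const uint32_t result = static_cast<uint32_t>(product >> 32);
  assert(range == 0 || result < range);  // hash < 2^32 keeps result in range
  return result;
}
```

Because the mapping uses the high bits of the hash, it preserves order: larger hashes never map to smaller buckets, which is one difference from `hash % range`.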
Jay Zhuang	fa92b9dc9f	Fix TSAN build and re-enable the tests (#7386 ) Summary: Resolve TSAN build warnings and re-enable disabled TSAN tests. Not sure if it's a compiler issue or TSAN check issue. Switching from conditional operator to if-else mitigated the problem. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7386 Test Plan: run TSAN check 10 times in circleci. ``` WARNING: ThreadSanitizer: data race (pid=27735) Atomic write of size 8 at 0x7b54000005e8 by thread T32: #0 __tsan_atomic64_store <null> (db_test+0x4cee95) #1 std::__atomic_base<unsigned long>::store(unsigned long, std::memory_order) /usr/bin/../lib/gcc/x86_64-linux-gnu/5.4.0/../../../../include/c++/5.4.0/bits/atomic_base.h:374:2 (db_test+0x78460e) #2 rocksdb::VersionSet::SetLastSequence(unsigned long) /home/circleci/project/./db/version_set.h:1058:20 (db_test+0x78460e) ... Previous read of size 8 at 0x7b54000005e8 by thread T31: #0 bool rocksdb::DBImpl::MultiCFSnapshot<std::unordered_map<unsigned int, rocksdb::DBImpl::MultiGetColumnFamilyData, std::hash<unsigned int>, std::equal_to<unsigned int>, std::allocator<std::pair<unsigned int const, rocksdb::DBImpl::MultiGetColumnFamilyData> > > >(rocksdb::ReadOptions const&, rocksdb::ReadCallback, std::function<rocksdb::DBImpl::MultiGetColumnFamilyData (std::unordered_map<unsigned int, rocksdb::DBImpl::MultiGetColumnFamilyData, std::hash<unsigned int>, std::equal_to<unsigned int>, std::allocator<std::pair<unsigned int const, rocksdb::DBImpl::MultiGetColumnFamilyData> > >::iterator&)>&, std::unordered_map<unsigned int, rocksdb::DBImpl::MultiGetColumnFamilyData, std::hash<unsigned int>, std::equal_to<unsigned int>, std::allocator<std::pair<unsigned int const, rocksdb::DBImpl::MultiGetColumnFamilyData> > >, unsigned long) /home/circleci/project/db/db_impl/db_impl.cc (db_test+0x715087) ``` Reviewed By: ltamasi Differential Revision: D23725226 Pulled By: jay-zhuang 
fbshipit-source-id: a6d662a5ea68111246cd32ec95f3411a25f76bc6	2020-09-25 14:46:28 -07:00
Peter Dillinger	c8a12aa94b	EnableFileDeletions only read field while holding mutex (#7435 ) Summary: Possible fix for a TSAN issue reported in EnableFileDeletions. disable_delete_obsolete_files_ should only be accessed holding the db mutex, but for logging it was being accessed outside holding the mutex, now fixed. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7435 Test Plan: existing tests, watch for recurrence Reviewed By: ltamasi Differential Revision: D23917578 Pulled By: pdillinger fbshipit-source-id: 8573025bca3f6fe169b24b87bbfc4ce9667b0482	2020-09-25 13:34:36 -07:00
Cheng Chang	1a24f4d1d6	Track WAL in MANIFEST: add method to check WAL consistency (#7236 ) Summary: Add a method `CheckWals` in `WalSet` to check the logs on disk. See `CheckWals`'s comments. This method will be used to check consistency of WALs during DB recovery. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7236 Test Plan: a set of tests are added to wal_edit_test.cc. Reviewed By: riversand963 Differential Revision: D23036505 Pulled By: cheng-chang fbshipit-source-id: 5b1d6857ac173429b00f950c32c4a5b8d063a732	2020-09-25 13:25:54 -07:00
Jay Zhuang	8c9fff917c	MultiGet() with timestamp should respect snapshot (#7404 ) Summary: Similar to PR https://github.com/facebook/rocksdb/issues/7227, add a read callback to filter out rows with a higher sequence number. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7404 Reviewed By: riversand963 Differential Revision: D23790762 Pulled By: jay-zhuang fbshipit-source-id: bce854307612f1a22f985ffc934da627d0a139c2	2020-09-25 09:42:01 -07:00
Akanksha Mahajan	9a63bbd391	Add few unit test cases in ASSERT_STATUS_CHECKED build (#7427 ) Summary: Fix few test cases and add them in ASSERT_STATUS_CHECKED build. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7427 Test Plan: 1. ASSERT_STATUS_CHECKED=1 make -j48 check, 2. travis build for ASSERT_STATUS_CHECKED, 3. Without ASSERT_STATUS_CHECKED: make check -j64, CircleCI build and travis build Reviewed By: pdillinger Differential Revision: D23909983 Pulled By: akankshamahajan15 fbshipit-source-id: 42d7e4aea972acb9fcddb7ca73fcb82f93272434	2020-09-24 21:48:57 -07:00
Akanksha Mahajan	98ac6b646a	Add IO Tracer Parser (#7333 ) Summary: Implement a parsing tool, io_tracer_parser, that takes a binary IO trace file (command line argument --io_trace_file) and an output file (--output_file), and dumps the IO trace records to the output file in human-readable form. Also added unit test cases that generate IO trace records and call io_tracer_parser to parse those records. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7333 Test Plan: make check -j64, Add unit test cases. Reviewed By: anand1976 Differential Revision: D23772360 Pulled By: akankshamahajan15 fbshipit-source-id: 9c20519c189362e6663352d08863326f3e496271	2020-09-23 15:50:26 -07:00
Peter Dillinger	31d1cea4f3	Add and fix clang -Wshift-sign-overflow (#7431 ) Summary: This option is apparently used by some teams within Facebook (internal ref T75998621) Pull Request resolved: https://github.com/facebook/rocksdb/pull/7431 Test Plan: USE_CLANG=1 make check before (fails) and after Reviewed By: jay-zhuang Differential Revision: D23876584 Pulled By: pdillinger fbshipit-source-id: abb8b67a1f1aac75327944d266e284b2b6727191	2020-09-23 15:26:17 -07:00
rockeet	b005f96937	db_iter.cc: DBIter::Next(): minor improve (#7407 ) Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/7407 Reviewed By: ajkr Differential Revision: D23817122 Pulled By: jay-zhuang fbshipit-source-id: 62bf43e4d780fad8c682edd750b4800b5b8f4a77	2020-09-23 09:53:24 -07:00
Cheng Chang	00ee89b584	Track WAL in MANIFEST: update WalMetadata for WAL syncing (#7414 ) Summary: There are some tricky behaviors related to WAL sync: - When creating a WAL, the WAL might not be synced, if the WAL directory is not synced, the WAL file's metadata may not even be synced to disk, so during recovery, when listing the WAL directory, the WAL may not even show up. - During each DB::Write, the WriteOption can control whether the WAL should be synced, so a WAL previously not synced on creation can be synced during Write. For each `SyncWAL`, we'll track the synced status and the current WAL size. Previously, we only track the WAL size on closing. During recovery, we check that the on-disk WAL size is >= the last synced size. So this PR introduces `synced_size` and `closed` to `WalMetadata` for the above design update. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7414 Test Plan: - updated wal_edit_test - updated version_edit_test Reviewed By: riversand963 Differential Revision: D23796127 Pulled By: cheng-chang fbshipit-source-id: 5498ab80f537c48a10157e71a4745716aef5cf30	2020-09-22 14:35:14 -07:00
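The recovery rule stated in this commit (the on-disk WAL size must be at least the last synced size) can be illustrated with a small sketch. The struct and function names below are hypothetical stand-ins; only the two field names `synced_size` and `closed` come from the commit message.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the per-WAL metadata described above.
struct WalMetadataSketch {
  uint64_t synced_size = 0;  // size known to be durable after the last sync
  bool closed = false;       // whether the WAL has been closed
};

// Called on each sync: remember the largest size known to be on disk.
void RecordSync(WalMetadataSketch* meta, uint64_t current_size) {
  assert(meta != nullptr);
  if (current_size > meta->synced_size) {
    meta->synced_size = current_size;
  }
}

// Recovery-time check: a WAL shorter than its last synced size has lost
// data that was reported as durable.
bool WalSizeIsConsistent(const WalMetadataSketch& meta,
                         uint64_t on_disk_size) {
  return on_disk_size >= meta.synced_size;
}
```

A WAL that is larger than `synced_size` on disk is still consistent under this rule, since unsynced writes may or may not have reached the disk.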
Yanqin Jin	cd72f8974b	Allow mutex to be released in GetAggregatedIntProperty (#7412 ) Summary: Current implementation holds db mutex while calling `GetAggregatedIntProperty()`. For property kEstimateTableReadersMem, this can be expensive, especially if the number of table readers is high. We can release and re-acquire db mutex if property_info.need_out_of_mutex is true. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7412 Test Plan: make check COMPILE_WITH_ASAN=1 make check COMPILE_WITH_TSAN=1 make check Also test internally on a shadow host. Used bpf to verify the excessively long db mutex holding no longer exists when applications call GetApproximateMemoryUsageByType(). Reviewed By: jay-zhuang Differential Revision: D23794824 Pulled By: riversand963 fbshipit-source-id: 6bc02a59fd25613d343a62cf817467c7122c9721	2020-09-22 12:37:16 -07:00
Peter Dillinger	6727259eb4	Possible fix to flaky db_write_test (#7418 ) Summary: Make the test robust to spurious wakeups on condition variable, and clear sync points to ensure no use-after-free. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7418 Test Plan: repeated runs on updated test, watch CircleCI for recurrence Reviewed By: jay-zhuang Differential Revision: D23828823 Pulled By: pdillinger fbshipit-source-id: af85117d9c02602541a90252840e0e5a6996de5b	2020-09-22 09:57:05 -07:00
Zhichao Cao	485fd9d9db	fix the flaky test failure (#7415 ) Summary: Fix the flaky test failure in error_handler_fs_test. Add the sync point, solve the dependency. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7415 Test Plan: make asan_check, ~/gtest-parallel/gtest-parallel -r 100 ./error_handler_fs_test Reviewed By: siying Differential Revision: D23804330 Pulled By: zhichao-cao fbshipit-source-id: 5175108651f7652e47e15978f2a9c1669ef59d80	2020-09-19 17:57:54 -07:00
Zhichao Cao	c268628c25	Map retryable IO error during Flush without WAL to soft error and no switch memtable during resume (#7310 ) Summary: In the current implementation, any retryable IO error that happens during Flush is mapped to a hard error. In this case, the DB is stopped and writes are stalled until the background error is cleared. In this PR, if WAL is DISABLED, a retryable IO error during Flush is mapped to a soft error, so that the memtable can continue to receive writes. At the same time, if auto resume is triggered, SwitchMemtable will not be called during Flush when resuming the DB, to avoid too many small memtables. Test cases are added. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7310 Test Plan: adding new unit test, pass make check. Reviewed By: anand1976 Differential Revision: D23710892 Pulled By: zhichao-cao fbshipit-source-id: bc4ca50d11c6b23b60d2c0cb171d86d542b038e9	2020-09-17 20:25:45 -07:00
Adam Retter	3ac07a12fe	RocksJava - Add errorIfLogFileExists parameter to RocksDB.openReadOnly (#7046 ) Summary: Expose from C++ API to Java API. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7046 Reviewed By: riversand963 Differential Revision: D23726297 Pulled By: pdillinger fbshipit-source-id: fc66bf626ce6fe9797e7d021ac849eacab91bf6d	2020-09-17 15:41:25 -07:00
Peter Dillinger	93719fc953	Restore file size in backup table file names (and other cleanup) (#7400 ) Summary: Prior to 6.12, backup files using share_files_with_checksum had the file size encoded in the file name, after the last '\_' and before the last '.'. We considered this an implementation detail subject to change, and indeed removed this information from the file name (with an option to use old behavior) because it was considered ineffective/inefficient for file name uniqueness. However, some downstream RocksDB users were relying on this information since the file size is not explicitly in the backup manifest file. The primary purpose of this change is "retrofitting" the 6.12 release (not yet a public release) to simultaneously support the benefits of the new naming scheme (I/O performance and data correctness at scale) and preserve the file size information, both as default behaviors. With this change, we are essentially making the file size information encoded in the file name an official, though obscure, extension of the backup meta file format. We preserve an option (kLegacyCrc32cAndFileSize) to use the original "legacy" naming scheme, with its caveats, and make it easy to omit the file size information (no kFlagIncludeFileSize), for more compact file names. But note that changing the naming scheme used on an existing db and backup directory can lead to transient space amplification, as some files will be stored under two names in the shared_checksum directory. Because some backups were saved using the original 6.12 naming scheme, we offer two ways of dealing with those files: SST files generated by older 6.12 versions can either use the default naming scheme in effect when the SST files were generated (kFlagMatchInterimNaming, default, no transient space amplification) or can use a new naming scheme (no kFlagMatchInterimNaming, potential space amplification because some already stored files get a new name). 
We don't have a natural way to detect which files were generated by previous 6.12 versions, but this change hacks one in by changing DB session ids to now use a more concise encoding, reducing file name length, saving ~dozen bytes from SST files, and making them visually distinct from DB ids so that they are less likely to be mixed up. Two final auxiliary notes: Recognizing that the backup file names have become a de facto part of the backup meta schema, this change makes them easier to parse and extend by putting a distinct marker, 's', before DB session ids embedded in the name. When we extend this to allow custom checksums in the name, they can get their own marker to ensure safe parsing. For backward compatibility, file size does not get a marker but is assumed for `_[0-9]+[.]` Another change from initial 6.12 default behavior is never including file custom checksum in the file name. Looking ahead to 6.13, we do not want the default behavior to cause backup space amplification for someone turning on file custom checksum checking in BackupEngine; we want that to be an easy decision. When implemented, including file custom checksums in backup file names will be a non-default option. Actual file name patterns and priorities, as regexes: kLegacyCrc32cAndFileSize OR pre-6.12 SST file -> [0-9]+_[0-9]+_[0-9]+[.]sst kFlagMatchInterimNaming set (default) AND early 6.12 SST file -> [0-9]+_[0-9a-fA-F-]+[.]sst kUseDbSessionId AND NOT kFlagIncludeFileSize -> [0-9]+_s[0-9A-Z]{20}[.]sst kUseDbSessionId AND kFlagIncludeFileSize (default) -> [0-9]+_s[0-9A-Z]{20}_[0-9]+[.]sst We might add opt-in options for more '\_' separated data in the name, but embedded file size, if present, will always be after last '\_' and before '.sst'. This change was originally applied to version 6.12. (See https://github.com/facebook/rocksdb/issues/7390) Pull Request resolved: https://github.com/facebook/rocksdb/pull/7400 Test Plan: unit tests included. 
Sync point callbacks are used to mimic previous version SST files. Reviewed By: ajkr Differential Revision: D23759587 Pulled By: pdillinger fbshipit-source-id: f62d8af4e0978de0a34f26288cfbe66049b70025	2020-09-17 10:24:22 -07:00
mrambacher	a08d6f18f0	Add more tests to ASSERT_STATUS_CHECKED (#7367 ) Summary: db_options_test options_file_test auto_roll_logger_test options_util_test persistent_cache_test Pull Request resolved: https://github.com/facebook/rocksdb/pull/7367 Reviewed By: jay-zhuang Differential Revision: D23712520 Pulled By: zhichao-cao fbshipit-source-id: 99b331e357f5d6a6aabee89d1bd933002cbb3908	2020-09-16 15:48:07 -07:00
Andrew Kryczka	ec024a86de	More robust sync points for intra-L0 compaction tests (#7382 ) Summary: `IntraL0CompactionAfterFlushCheckConsistencyFail` was flaky by sometimes failing due to no intra-L0 compactions happening. I was able to repro it by putting a `sleep(1)` in the compaction thread before it grabs the lock and picks a compaction. This also showed other intra-L0 tests are affected too, although some of them exhibit hanging forever rather than failing. The problem was that all the flushes/ingestions could finish before any compaction got picked, so it would end up simply picking all the files that the test generates for L0->L1. But, these tests intend only the first few files to be picked for L0->L1, and the subsequent files to be picked for intra-L0. This PR adjusts the sync points of all the intra-L0 tests to enforce this. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7382 Test Plan: run all the `db_compaction_test`s with and without the artificial `sleep()` Reviewed By: jay-zhuang Differential Revision: D23684985 Pulled By: ajkr fbshipit-source-id: 6508399030dddec7738e9853a7b3dc53ef77a584	2020-09-15 22:44:16 -07:00
Andrew Kryczka	9d3b2db9b5	Disable fsync in DB tests with timeouts (#7380 ) Summary: Some tests were encountering 600 second timeout in CI, such as `./db_universal_compaction_test --gtest_filter=NumLevels/DBTestUniversalCompaction.UniversalCompactionTrivialMoveTest2/5`, `./db_properties_test --gtest_filter=DBPropertiesTest.AggregatedTablePropertiesAtLevel`, and `./db_basic_test --gtest_filter=DBBasicTest.MultiGetBatchedSortedMultiFile`. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7380 Test Plan: - `./db_universal_compaction_test --gtest_filter=NumLevels/DBTestUniversalCompaction.UniversalCompactionTrivialMoveTest2/5`: 40 -> 3 seconds - `./db_properties_test --gtest_filter=DBPropertiesTest.AggregatedTablePropertiesAtLevel`: 106 -> 1 second - `./db_basic_test --gtest_filter=DBBasicTest.MultiGetBatchedSortedMultiFile`: 27 -> 1 second Reviewed By: anand1976 Differential Revision: D23674570 Pulled By: ajkr fbshipit-source-id: 4d4ca6a4e2d2e76fcf8b6f6cce91e0f98ba5050c	2020-09-15 18:55:08 -07:00
Levi Tamasi	bf1aeebb6c	Integrate blob file writing with recovery (#7388 ) Summary: The patch adds support for extracting large values into blob files when performing a flush during recovery (when `avoid_flush_during_recovery` is `false`). Blob files are built and added to the `Version` similarly to flush. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7388 Test Plan: `make check` Reviewed By: riversand963 Differential Revision: D23709912 Pulled By: ltamasi fbshipit-source-id: ce48b4227849cf25429ae98574e72b0e1cb9c67d	2020-09-15 17:14:10 -07:00
mrambacher	67bd5401e9	Changes to EncryptedEnv public API (#7279 ) Summary: Cleaned up the public API to use the EncryptedEnv. This change will allow providers to be developed and added to the system easier in the future. It will also allow better integration in the future with the OPTIONS file. - The internal classes were moved out of the public API into an internal "env_encryption_ctr.h" header. Short-cut constructors were added to provide the original API functionality. - The APIs to the constructors were changed to take shared_ptr, rather than raw pointers or references to allow better memory management and alternative implementations. - CreateFromString methods were added to allow future expansion to other provider and cipher implementations through a standard API. Additionally, there was a code duplication in the NewXXXFile methods. This common code was moved under a templatized function. A first-pass at structuring the code was made to potentially allow multiple EncryptionProviders in a single EncryptedEnv. The idea was that different providers may use different cipher keys or different versions/algorithms. The EncryptedEnv should have some means of picking different providers based on information. The groundwork was started for this (the use of the provider_ member variable was localized) but the work has not been completed. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7279 Reviewed By: jay-zhuang Differential Revision: D23709440 Pulled By: zhichao-cao fbshipit-source-id: 0e845fff0e03a52603eb9672b4ade32d063ff2f2	2020-09-15 17:14:10 -07:00
Levi Tamasi	b0e7834100	Integrate blob file writing with the flush logic (#7345 ) Summary: The patch adds support for writing blob files during flush by integrating `BlobFileBuilder` with the flush logic, most importantly, `BuildTable` and `CompactionIterator`. If `enable_blob_files` is set, large values are extracted to blob files and replaced with references. The resulting blob files are then logged to the MANIFEST as part of the flush job's `VersionEdit` and added to the `Version`, similarly to table files. Errors related to writing blob files fail the flush, and any blob files written by such jobs are immediately deleted (again, similarly to how SST files are handled). In addition, the patch extends the logging and statistics around flushes to account for the presence of blob files (e.g. `InternalStats::CompactionStats::bytes_written`, which is used for calculating write amplification, now considers the blob files as well). Pull Request resolved: https://github.com/facebook/rocksdb/pull/7345 Test Plan: Tested using `make check` and `db_bench`. Reviewed By: riversand963 Differential Revision: D23506369 Pulled By: ltamasi fbshipit-source-id: 646885f22dfbe063f650d38a1fedc132f499a159	2020-09-14 21:11:43 -07:00
mrambacher	7d472accdc	Bring the Configurable options together (#5753 ) Summary: This PR merges the functionality of making the ColumnFamilyOptions, TableFactory, and DBOptions into Configurable into a single PR, resolving any merge conflicts Pull Request resolved: https://github.com/facebook/rocksdb/pull/5753 Reviewed By: ajkr Differential Revision: D23385030 Pulled By: zhichao-cao fbshipit-source-id: 8b977a7731556230b9b8c5a081b98e49ee4f160a	2020-09-14 17:01:01 -07:00
anand76	18a3227b12	Add a new IOStatus subcode to indicate that writes are fenced off (#7374 ) Summary: In a distributed file system, directory ownership is enforced by fencing off the previous owner once they've been preempted by a new owner. This PR adds an IOStatus subcode for `StatusCode::IOError` to indicate this. Once this error is returned for a file write, the DB is put in read-only mode and not allowed to resume in read-write mode. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7374 Test Plan: Add new unit tests in `error_handler_fs_test` Reviewed By: riversand963 Differential Revision: D23687777 Pulled By: anand1976 fbshipit-source-id: bef948642089dc0af399057864d9a8ca339e8b2f	2020-09-14 16:04:47 -07:00
Yanqin Jin	205e577694	Cancel tombstone skipping during bottommost compaction (#7356 ) Summary: During bottommost compaction, RocksDB cannot simply drop a tombstone if this tombstone is not in the earliest snapshot. The current behavior is: RocksDB skips other internal keys (of the same user key) in the same snapshot range. In the meantime, RocksDB should check for the `shutting_down` flag. Otherwise, it is possible for a bottommost compaction that has already started running to take a long time to finish, even if the application has tried to cancel all background jobs. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7356 Test Plan: make check Reviewed By: ltamasi Differential Revision: D23663241 Pulled By: riversand963 fbshipit-source-id: 25f8e9b51bc3bfa3353cdf87557800f9d90ee0b5	2020-09-11 17:45:43 -07:00
Peter Dillinger	be8445eea8	Assert valid linked list for write group (#7375 ) Summary: We've seen some segfaults in db_write_test, with at least one suggesting corruption of a write group linked list. Adding an assertion to have this fail in a more specific way if that is the broken invariant. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7375 Test Plan: make check Reviewed By: jay-zhuang Differential Revision: D23638477 Pulled By: pdillinger fbshipit-source-id: a76fd677cad60a3a516bd363947bfd9ce418edc1	2020-09-11 07:58:31 -07:00
Peter Dillinger	92639b93a6	Fix checkpoint file deletion race with avoid_unnecessary_blocking_io (#7369 ) Summary: https://github.com/facebook/rocksdb/issues/3341 guaranteed that upon return of `GetSortedWalFiles` after `DisableFileDeletions`, all pending purges of previously obsolete WAL files will have finished. However, the addition of avoid_unnecessary_blocking_io in https://github.com/facebook/rocksdb/issues/5043 opened a hole in the code making that assurance, which can lead to files to be copied for checkpoint or backup going missing before being copied, with that option enabled. This change patches the hole. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7369 Test Plan: apparent fix to backups in crash test observed. Will work on a unit test for another commit Reviewed By: ajkr Differential Revision: D23620258 Pulled By: pdillinger fbshipit-source-id: bea36b461a5b719c3e3ef802f967bc3e8ae71614	2020-09-10 22:35:25 -07:00
Stanislav Tkach	5c39d8df69	Add getters to the C API for flush, write, cache and compact options (#7321 ) Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/7321 Reviewed By: ajkr Differential Revision: D23590160 fbshipit-source-id: 35d106e732ac37f674222759cdb1dbb31e005ca7	2020-09-09 11:45:27 -07:00
Levi Tamasi	7b1d6c438a	Fix the handling of the case when a blob file with a lower number gets added in VersionBuilder (#7349 ) Summary: When multiple background jobs are generating blob files in parallel, it is actually possible for a blob file to be added with a file number that is lower than the highest one in the base version. (This is a harmless race condition.) The patch fixes the handling of this case and adds a unit test. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7349 Test Plan: `make check` Reviewed By: riversand963 Differential Revision: D23542453 Pulled By: ltamasi fbshipit-source-id: 4ff6f3654bc58c391d10b9870e1cc40b5e3fa8e4	2020-09-09 10:25:12 -07:00
Akanksha Mahajan	0de335e076	Use FSRandomRWFilePtr Object to call underlying file system. (#7198 ) Summary: Replace FSRandomRWFile pointer with FSRandomRWFilePtr object in the rocksdb internal code. This new object wraps FSRandomRWFile pointer. Objective: If tracing is enabled, FSRandomRWFile object returns FSRandomRWFileTracingWrapper pointer that includes all necessary information in IORecord and calls underlying FileSystem and invokes IOTracer to dump that record in a binary file. If tracing is disabled then, underlying FileSystem pointer is returned directly. FSRandomRWFilePtr wrapper class is added to bypass the FSRandomRWFileWrapper when tracing is disabled. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7198 Test Plan: make check -j64 Reviewed By: anand1976 Differential Revision: D23421116 Pulled By: akankshamahajan15 fbshipit-source-id: 8a5ba0e7d9c1ba34c3a6f29829b107c5f09ab6a3	2020-09-08 12:21:58 -07:00
Akanksha Mahajan	b175eceb09	Store FSWritableFilePtr object in WritableFileWriter (#7193 ) Summary: Replace FSWritableFile pointer with FSWritableFilePtr object in WritableFileWriter. This new object wraps FSWritableFile pointer. Objective: If tracing is enabled, FSWritableFilePtr returns FSWritableFileTracingWrapper pointer that includes all necessary information in IORecord and calls underlying FileSystem and invokes IOTracer to dump that record in a binary file. If tracing is disabled, then the underlying FileSystem pointer is returned directly. FSWritableFilePtr wrapper class is added to bypass the FSWritableFileWrapper when tracing is disabled. Test Plan: make check -j64 Pull Request resolved: https://github.com/facebook/rocksdb/pull/7193 Reviewed By: anand1976 Differential Revision: D23355915 Pulled By: akankshamahajan15 fbshipit-source-id: e62a27a13c1fd77e36a6dbafc7006d969bed25cf	2020-09-08 10:56:08 -07:00
Levi Tamasi	423d051124	Clean up SubcompactionState a bit (#7322 ) Summary: The patch cleans up a few things in `CompactionJob::SubcompactionState`: * Instead of using both the member initializer list and in-class initializers (and sometimes both at the same time for the same member), the struct now uniformly uses the latter to initialize integer members. * The default parameter value for the constructor parameter `size` is removed. * The explicitly deleted copy operations are removed, since they are implicitly deleted anyways because of the `unique_ptr` members. * The handwritten move operations, which did not move the member `c_iter` and were not declared `nothrow`, are removed. Note that with the user-declared copy operations gone (see the previous item), we can rely on the compiler to (correctly) generate these methods. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7322 Test Plan: `make check` Reviewed By: siying Differential Revision: D23382408 Pulled By: ltamasi fbshipit-source-id: a4ae5af150161c50ff7bdc07fa145482d0150bfe	2020-09-08 09:24:23 -07:00
Yanqin Jin	ab202e8d72	Add a new stats level to exclude tickers (#7329 ) Summary: Currently, application may pass a statistics object to db but later wants to reduce stats tracking overhead by setting stats level to kExceptHistogramOrTimers (the current lowest level). Tickers will still be incremented, causing up to 1% CPU. We can add a new lowest stats level `kExceptTickers` to disable ticker incrementing as well, thus reducing CPU cycles spent on tickers. Test Plan (devserver): ``` make check make clean DEBUG_LEVEL=0 make db_bench ./db_bench -perf_level=1 -stats_level=0 -statistics -benchmarks=fillseq,readrandom -duration=120 ``` Measure CPU util (%) before and after change: CPU util by rocksdb::RecordTick: 1.1 vs (<0.1) Pull Request resolved: https://github.com/facebook/rocksdb/pull/7329 Reviewed By: pdillinger Differential Revision: D23434014 Pulled By: riversand963 fbshipit-source-id: 72ff0f02a192ac476d4b0044b9f37fd4a22ff0d4	2020-09-04 23:25:03 -07:00
Cheng Chang	3f9b75604d	Fix wrong level args (#7346 ) Summary: The level args should be output level instead of input levels. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7346 Test Plan: make check Reviewed By: ajkr Differential Revision: D23506373 Pulled By: cheng-chang fbshipit-source-id: b2f701d44c13581c5c10c4dbebded4fcd354d641	2020-09-03 23:17:37 -07:00
Eduardo Barreto Alexandre	5b1ccdc191	Expose rocksdb_open_column_families_with_ttl C function (#7314 ) Summary: This PR creates `rocksdb_open_column_families_with_ttl`, which allows C API users to open a DBWithTTL with column families. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7314 Reviewed By: cheng-chang Differential Revision: D23430287 Pulled By: ajkr fbshipit-source-id: 307aa21d170d1402653263a91f6f832ef76afba0	2020-09-03 14:39:58 -07:00
Hiep	d0c1a01c1b	Avoid converting MERGES to PUTS when allow_ingest_behind is true (#7166 ) Summary: - Closes https://github.com/facebook/rocksdb/issues/6490 - Currently, MERGEs are converted to PUTs at the bottommost level or when compaction has reached the beginning of the key, which can wrongly cover a future base-case PUT. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7166 Test Plan: - Automated: `make all check` - Manual: With `allow_ingest_behind = true`, add Merge operations to a key, then run compaction. Then ingest external files to make sure the base case is properly compacted with the existing Merges. Reviewed By: cheng-chang Differential Revision: D23325425 Pulled By: ajkr fbshipit-source-id: 3eb415eb7b381b5453e45245393566153b1abb68	2020-09-03 14:39:58 -07:00
Andrew Kryczka	177f8bd063	Bound L0->Lbase fanout in dynamic leveled compaction (#7325 ) Summary: L0 score is based on size target and number of files. The size target used is `max_bytes_for_level_base`. However, the base level's size can dynamically expand in write burst mode. In fact, it can expand so much that L0->Lbase becomes the highest fanout in target sizes. This doesn't make sense from an efficiency perspective, so this PR bounds the L0->Lbase fanout to the smoothed level multiplier. The L0 scoring based on file count remains unchanged. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7325 Test Plan: contrived benchmark that exhibits the problem: ``` $ TEST_TMPDIR=/data/users/andrewkr/ ./db_bench -benchmarks=filluniquerandom,readrandom -write_buffer_size=1048576 -target_file_size_base=1048576 -max_bytes_for_level_base=4194304 -level0_file_num_compaction_trigger=4 -level_compaction_dynamic_level_bytes=true -compression_type=none -max_background_jobs=12 -rate_limiter_bytes_per_sec=104857600 -benchmark_write_rate_limit=10485760 -num=100000000 ``` Results: - "Burst W-Amp" is the write-amp near the end of the fillrandom benchmark - "Total W-Amp" is the write-amp after readrandom has run a while and all levels no longer need compaction. Per branch (Burst W-Amp / Total W-Amp / fillrandom MB/s): master: 20.2 / 21.5 / 4.7; dynamic-l0-score: 12.6 / 14.1 / 7.2 Reviewed By: siying Differential Revision: D23412935 Pulled By: ajkr fbshipit-source-id: f91f2067188e432dd39deab02f1c56f195057a0e	2020-09-01 19:34:01 -07:00
Levi Tamasi	792d2f906e	Log info about generated blob files in BlobFileBuilder (#7324 ) Summary: The patch adds a log message to `BlobFileBuilder` that is logged upon generating a blob file, similarly to how we log the generation of table files during flush and compaction. The log message contains the column family name, job id, blob file number, and the number and total size of blobs in the new file. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7324 Test Plan: Ran `make check` and checked the actual log messages using a custom `db_bench`. Reviewed By: riversand963 Differential Revision: D23402229 Pulled By: ltamasi fbshipit-source-id: ca42beb4db284b783d1eb2651f321032a45d0c5f	2020-08-31 13:24:12 -07:00
Akanksha Mahajan	963314ffd6	Add unit test for max_write_buffer_size_to_maintain (#7311 ) Summary: Add a unit test case that checks memory usage when max_write_buffer_size_to_maintain is set, verifying whether flushed immutable memtables are trimmed in a timely manner. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7311 Test Plan: Compared the results with those from before the bug fix. Reviewed By: ltamasi Differential Revision: D23321702 Pulled By: akankshamahajan15 fbshipit-source-id: da04ee21137d641a07fd499a9e2749eb036fcb1e	2020-08-28 17:38:05 -07:00
Levi Tamasi	5043960623	Add a blob file builder class that can be used in background jobs (#7306 ) Summary: The patch adds a class called `BlobFileBuilder` that can be used to build and cut blob files in background jobs (flushes/compactions). The class enforces a value size threshold (`min_blob_size`; smaller blobs will be inlined in the LSM tree itself), and supports specifying a blob file size limit (`blob_file_size`), as well as compression (`blob_compression_type`) and checksums for blob files. It also keeps track of the generated blob files and their associated `BlobFileAddition` metadata, which can be applied as part of the background job's `VersionEdit`. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7306 Test Plan: `make check` Reviewed By: riversand963 Differential Revision: D23298817 Pulled By: ltamasi fbshipit-source-id: 38f35d81dab1ba81f15236240612ec173d7f21b5	2020-08-27 11:55:54 -07:00
Akanksha Mahajan	8e0df9050c	Store FSRandomAccessPtr object in RandomAccessFileReader (#7192 ) Summary: Replace FSRandomAccessFile pointer with FSRandomAccessFilePtr object in RandomAccessFileReader. This new object wraps FSRandomAccessFile pointer. Objective: If tracing is enabled, FSRandomAccessFilePtr returns FSRandomAccessFileTracingWrapper pointer that includes all necessary information in IORecord and calls underlying FileSystem and invokes IOTracer to dump that record in a binary file. If tracing is disabled, then the underlying FileSystem pointer is returned directly. FSRandomAccessFilePtr wrapper class is added to bypass the FSRandomAccessFileWrapper when tracing is disabled. Test Plan: make check -j64 Pull Request resolved: https://github.com/facebook/rocksdb/pull/7192 Reviewed By: anand1976 Differential Revision: D23356867 Pulled By: akankshamahajan15 fbshipit-source-id: 48f31168166a17a7444b40be44a9a9d4a5c7182c	2020-08-27 11:21:52 -07:00
Peter Dillinger	9aad24da55	Real fix for race in backup custom checksum checking (#7309 ) Summary: This is a "real" fix for the issue worked around in https://github.com/facebook/rocksdb/issues/7294. To get DB checksum info for live files, we now read the manifest file that will become part of the checkpoint/backup. This requires a little extra handling in taking a custom checkpoint, including only reading the manifest file up to the size prescribed by the checkpoint. This moves GetFileChecksumsFromManifest from backup code to file_checksum_helper.{h,cc} and removes apparently unnecessary checking related to column families. Updated HISTORY.md and warned potential future users of DB::GetLiveFilesChecksumInfo() Pull Request resolved: https://github.com/facebook/rocksdb/pull/7309 Test Plan: updated unit test, before and after Reviewed By: ajkr Differential Revision: D23311994 Pulled By: pdillinger fbshipit-source-id: 741e30a2dc1830e8208f7648fcc8c5f000d4e2d5	2020-08-26 10:39:20 -07:00
sdong	722814e357	Get() to fail with underlying failures in PartitionIndexReader::CacheDependencies() (#7297 ) Summary: Right now all I/O failures under PartitionIndexReader::CacheDependencies() are swallowed. This doesn't impact correctness, but we've made a decision that any I/O error in the read path should now be returned to users for awareness. Return errors in those cases instead. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7297 Test Plan: Add a new unit test that injects errors in this code path and checks that Get() fails. Only one I/O path is hit in PartitionIndexReader::CacheDependencies(). Several option changes were attempted but did not get other pread paths triggered. Not sure whether other failure cases would even be possible. Would rely on continuous stress test to validate it. Reviewed By: anand1976 Differential Revision: D23257950 fbshipit-source-id: 859dbc92fa239996e1bb378329344d3d54168c03	2020-08-25 19:01:05 -07:00
sdong	cecdd5d2ab	Parameterize DBBasicTest.CompactBetweenSnapshots (#7301 ) Summary: DBBasicTest.CompactBetweenSnapshots can time out on some slow-I/O hosts. Parameterize it so that each single test runs shorter. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7301 Test Plan: Run the test and see, in a hacky way, that different runs use different configurations. Reviewed By: ltamasi Differential Revision: D23277733 fbshipit-source-id: 1f717b4131322d175abf9e211131fe7e9b1ef758	2020-08-25 15:42:11 -07:00
Zhichao Cao	d51f88c9e4	Pass SST file checksum information through OnTableFileCreated (#7108 ) Summary: When SST file is created, application is able to know the file information through OnTableFileCreated callback in LogAndNotifyTableFileCreationFinished. Since file checksum information can be useful for application when the SST file is created, we add file_checksum and file_checksum_func_name information to TableFileCreationInfo, which will be passed through OnTableFileCreated. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7108 Test Plan: make check, listener_test. Reviewed By: ajkr Differential Revision: D22470240 Pulled By: zhichao-cao fbshipit-source-id: 92c20344d9b986eadfe3480f3769bf4add0dbaae	2020-08-25 10:46:11 -07:00
Connor1996	416943bf28	Eliminates a no-op compaction upon snapshot release when disabling auto compactions (#7267 ) Summary: After a snapshot is released, RocksDB checks whether it is suitable to trigger bottom compactions. With auto compactions disabled, it may still schedule a compaction when a snapshot is released. However, no compaction job will actually be handled, so the state of the LSM is not changed and a compaction will be triggered again and again every time a snapshot is released. Too-frequent compactions lead to high CPU usage and high db_mutex lock contention, which ultimately affects foreground write duration. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7267 Test Plan: - make check - manual test Reviewed By: akankshamahajan15 Differential Revision: D23252880 Pulled By: ajkr fbshipit-source-id: 4431e071a35d9912a2a3592875db27bae521434b	2020-08-24 22:06:45 -07:00
mrambacher	b7e1c5213f	Add some simulator cache and block tracer tests to ASSERT_STATUS_CHECKED (#7305 ) Summary: More tests now pass. When in doubt, I added a TODO comment to check what should happen with an ignored error. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7305 Reviewed By: akankshamahajan15 Differential Revision: D23301262 Pulled By: ajkr fbshipit-source-id: 5f120edc7393560aefc0633250277bbc7e8de9e6	2020-08-24 16:43:31 -07:00
sdong	21ce018a32	Disable fsync in some ExternalSSTFileTest tests (#7303 ) Summary: Some ExternalSSTFileTest tests run very long in some environments. Disable fsync in some tests to speed them up. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7303 Test Plan: Run these tests. Reviewed By: riversand963 Differential Revision: D23280261 fbshipit-source-id: 0dca862e462f9e6d807f393320a1f82aa5b87e59	2020-08-24 11:26:09 -07:00
Akanksha Mahajan	3844612625	Bug Fix for memtables not trimmed down. (#7296 ) Summary: When a memtable is trimmed in MemTableListVersion, the memtable is only added to delete list if it is the last reference. However it is not the last reference as it is held by the super version. But the super version would not be switched if the delete list is empty. So the memtable is never destroyed and memory usage increases beyond write_buffer_size + max_write_buffer_size_to_maintain. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7296 Test Plan: 1. ./db_bench -benchmarks=randomtransaction -optimistic_transaction_db=1 -statistics -stats_interval_seconds=1 -duration=90 -num=500000 --max_write_buffer_size_to_maintain=16000000 --transaction_set_snapshot Reviewed By: ltamasi Differential Revision: D23267395 Pulled By: akankshamahajan15 fbshipit-source-id: 3a8d437fe9f4015f851ff84c0e29528aa946b650	2020-08-21 13:29:05 -07:00
mrambacher	e9befdebbf	Add EnvTestWithParam::OptionsTest to the ASSERT_STATUS_CHECKED passes (#7283 ) Summary: This test uses database functionality and required more extensive work to get it to pass than the other tests. The DB functionality required for this test now passes the check. When it was unclear what the proper behavior was for unchecked status codes, a TODO was added. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7283 Reviewed By: akankshamahajan15 Differential Revision: D23251497 Pulled By: ajkr fbshipit-source-id: 52b79629bdafa0a58de8ead1d1d66f141b331523	2020-08-20 19:18:35 -07:00
Stanislav Tkach	b288f0131b	Add getters for the read options to the C API (#7289 ) Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/7289 Reviewed By: akankshamahajan15 Differential Revision: D23252520 Pulled By: ajkr fbshipit-source-id: 85cea485a6dcaa1c67c32a83eb49a1b623966609	2020-08-20 16:36:19 -07:00
Cheng Chang	ce4192375d	Track WAL in MANIFEST: minor updates (#7282 ) Summary: The updates resolve comments left from https://github.com/facebook/rocksdb/pull/7164. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7282 Test Plan: wal_edit_test Reviewed By: ltamasi Differential Revision: D23196824 Pulled By: cheng-chang fbshipit-source-id: 797f3fef27fc72114c2be777d9eadd3429da5301	2020-08-20 15:12:00 -07:00
Jay Zhuang	3e422ce0ca	Fix a timer_test deadlock (#7277 ) Summary: There's a potential deadlock caused by the MockTimeEnv time value getting to a large number, which causes TimedWait() to wait forever. The test misuses microseconds as seconds, making it more likely to happen. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7277 Reviewed By: pdillinger Differential Revision: D23183873 Pulled By: jay-zhuang fbshipit-source-id: 6fc38ebd40b4125a99551204b271f91a27e70086	2020-08-20 08:43:13 -07:00
Jay Zhuang	ac7dcfda10	Add missing ComputeCompactionScore() for a new universal manual compaction (#7281 ) Summary: Seems it's only causing assert failure during compaction pick, but in production code, the problematic compactions are excluded at a later step. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7281 Reviewed By: akankshamahajan15 Differential Revision: D23228000 Pulled By: jay-zhuang fbshipit-source-id: 2e4055aeebe0f5a2b07e299e0a2d51a1ad2e216d	2020-08-19 17:42:08 -07:00
Levi Tamasi	b9bb59d49d	Add initial set of options for integrated blob write path (#7280 ) Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/7280 Test Plan: `make check` Reviewed By: riversand963 Differential Revision: D23195192 Pulled By: ltamasi fbshipit-source-id: 743b382de391963e62ba86119e9fbd0233ea3b3a	2020-08-18 18:32:37 -07:00
Akanksha Mahajan	cc24ac14eb	Store FSSequentialFilePtr object in SequenceFileReader (#7190 ) Summary: This diff contains following changes: 1. Replace `FSSequentialFile` pointer with `FSSequentialFilePtr` object that wraps `FSSequentialFile` pointer in `SequenceFileReader`. Objective: If tracing is enabled, `FSSequentialFilePtr` returns `FSSequentialFileTracingWrapper` pointer that includes all necessary information in `IORecord` and calls underlying FileSystem and invokes `IOTracer` to dump that record in a binary file. If tracing is disabled then, underlying `FileSystem` pointer is returned directly. `FSSequentialFilePtr` wrapper class is added to bypass the `FSSequentialFileTracingWrapper` when tracing is disabled. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7190 Test Plan: make check -j64 COMPILE_WITH_TSAN=1 make check -j64 Reviewed By: anand1976 Differential Revision: D23059616 Pulled By: akankshamahajan15 fbshipit-source-id: 1564b94dd1297cd0fbfe2ed5c9cc3e20f7395301	2020-08-18 16:20:54 -07:00
sdong	b194c21bba	Whole DBTest to skip fsync (#7274 ) Summary: After https://github.com/facebook/rocksdb/pull/7036, we still see additional DBTest cases that can time out when running 10 or 20 in parallel. Expand skip-fsync mode to the whole DBTest. To be conservative, other tests still do not use this mode. This commit reinstates https://github.com/facebook/rocksdb/issues/7049, whose un-revert was lost in an automatic infrastructure mis-merge. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7274 Test Plan: Run all existing files. Reviewed By: pdillinger Differential Revision: D23177444 fbshipit-source-id: 1f61690b2ac6333c3b2c87176fef6b2cba086b33	2020-08-17 18:42:25 -07:00
Andrew Kryczka	5d5ff82408	Disable `recycle_log_file_num` with `kTolerateCorruptedTailRecords` (#7271 ) Summary: The two features are naturally incompatible. WAL recycling expects the recovery to succeed upon encountering a corrupt record at the point where new data ends and recycled data remains at the tail. However, `WALRecoveryMode::kTolerateCorruptedTailRecords` must fail upon encountering any such corrupt record, as it cannot differentiate between this and a real corruption, which would cause committed updates to be truncated. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7271 Reviewed By: riversand963 Differential Revision: D23169923 Pulled By: ajkr fbshipit-source-id: 2cf8a3bcd2c9a0ecb0055a84725047a10fd4db50	2020-08-17 18:21:10 -07:00
Yanqin Jin	92593d511a	Add a new EntryType for deletion with timestamp (#7195 ) Summary: Add `kEntryDeleteWithTimestamp` to `EntryType` which is a public API. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7195 Test Plan: make check Reviewed By: ajkr Differential Revision: D22914704 Pulled By: riversand963 fbshipit-source-id: 886f73c6b70c527cad1c8fc9fc8d3afe60e1ea39	2020-08-17 16:26:06 -07:00
Levi Tamasi	9b083cb11c	Build blob file reader/writer classes in LITE mode as well (#7272 ) Summary: The patch makes sure that the functionality required for the new integrated BlobDB implementation (most importantly, the classes related to reading and writing blob files) is also built in LITE mode by removing the corresponding `#ifndef`s. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7272 Test Plan: Ran `make check` in both regular and LITE mode. Reviewed By: zhichao-cao Differential Revision: D23173280 Pulled By: ltamasi fbshipit-source-id: 1596bd1a76409a8a6d83d8f1dbfe08bfdea7ffe6	2020-08-17 15:19:05 -07:00
sdong	1760637539	CompactRange() refit level should confirm destination level is not empty (#7261 ) Summary: There is a potential data race related to CompactRange() with level refitting. After the compaction step and refitting step, some automatic compaction could put data into the destination level and cause the DB to be corrupted. Fix the bug by checking that the target level is empty. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7261 Test Plan: Add a unit test, which would fail with "Corruption: L1 have overlapping ranges '666F6F' seq:6, type:1 vs. '626172' seq:2, type:1", and now it succeeds. Reviewed By: ajkr Differential Revision: D23142269 fbshipit-source-id: 28bc14d5ac934c192260b23a4ce3f10a95e3ee91	2020-08-17 14:21:53 -07:00
matthewvon	2ad88ceae9	Populate cf_id member of CompactionJobInfo for OnCompactionBegin (#6938 ) Summary: Looks like somebody simply missed initializing a member variable. The column family ID, cf_id, is not set during OnCompactionBegin. But it is set properly in the next function for OnCompactionCompleted. Need this cf_id for tracking progress of a Stardog optimize since there may be multiple compactions required for a given column family. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6938 Reviewed By: siying Differential Revision: D23153235 Pulled By: ajkr fbshipit-source-id: 932938de3a4ebbc7ac89702f655583862587d251	2020-08-17 11:57:47 -07:00
Jay Zhuang	69760b4d05	Introduce a global StatsDumpScheduler for stats dumping (#7223 ) Summary: Use a global StatsDumpScheduler for stats dumping across all DB instances, including `DumpStats()` and `PersistStats()`. Before this, there were 2 dedicated threads for every DB instance, one for DumpStats() and one for PersistStats(), which could create lots of threads if there were hundreds of DB instances. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7223 Reviewed By: riversand963 Differential Revision: D23056737 Pulled By: jay-zhuang fbshipit-source-id: 0faa2311142a73433ebb3317361db7cbf43faeba	2020-08-14 20:12:44 -07:00
Yanqin Jin	d758273ceb	Get() with timestamp should respect snapshot (#7227 ) Summary: If user-defined timestamp is enabled, the current implementation can expose newer data to queries even if an older sequence number is specified via read_options.snapshot. This PR makes Get() respect the sequence-number-based snapshot. The solution is simple. Besides using <ukey, ts, seq> to search the index for the key, we also verify that the candidate result's seq is smaller than or equal to the snapshot's seq. This requires passing a seq via `GetContext`, which accounts for the majority of the code change in this PR. Also added a few unit tests to demonstrate standard visibility during point lookup and range scan when timestamp and snapshot are both present. Test plan (devserver): ``` make check $./db_bench --benchmarks=fillseq,readrandom -cache_size=$[64*1024*1024] ``` Result this PR: readrandom : 4.827 micros/op 207180 ops/sec; 22.9 MB/s (1000000 of 1000000 found) master: readrandom : 4.936 micros/op 202610 ops/sec; 22.4 MB/s (1000000 of 1000000 found) Pull Request resolved: https://github.com/facebook/rocksdb/pull/7227 Reviewed By: ltamasi Differential Revision: D23015242 Pulled By: riversand963 fbshipit-source-id: ea7b85a728654553ba357d2e6a207b5e40f7376a	2020-08-14 19:20:58 -07:00
Andrew Kryczka	a1aa3f8385	Disable manual compaction during `ReFitLevel()` (#7250 ) Summary: Manual compaction with `CompactRangeOptions::change_levels` set could refit to a level targeted by another manual compaction. If force_consistency_checks were disabled, it could be possible for overlapping files to be written at that target level. This PR prevents the possibility by calling `DisableManualCompaction()` prior to `ReFitLevel()`. It also improves the manual compaction disabling mechanism to wait for pending manual compactions to complete before returning, and support disabling from multiple threads. Fixes https://github.com/facebook/rocksdb/issues/6432. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7250 Test Plan: crash test command that repro'd the bug reliably: ``` $ TEST_TMPDIR=/dev/shm python tools/db_crashtest.py blackbox --simple -target_file_size_base=524288 -write_buffer_size=1048576 -clear_column_family_one_in=0 -reopen=0 -max_key=10000000 -column_families=1 -max_background_compactions=8 -compact_range_one_in=100000 -compression_type=none -compaction_style=1 -num_levels=5 -universal_min_merge_width=4 -universal_max_merge_width=8 -level0_file_num_compaction_trigger=12 -rate_limiter_bytes_per_sec=1048576000 -universal_max_size_amplification_percent=100 --duration=3600 --interval=60 --use_direct_io_for_flush_and_compaction=0 --use_direct_reads=0 --enable_compaction_filter=0 ``` Reviewed By: ltamasi Differential Revision: D23090800 Pulled By: ajkr fbshipit-source-id: afcbcd51b42ce76789fdb907d8b9ada790709c13	2020-08-14 11:29:52 -07:00

4263 Commits