rocksdb

Author	SHA1	Message	Date
Igor Canadi	9eaff629e3	Make corruption_test more robust Summary: Latest travis failed because of corruption test TableFileIndexData: https://travis-ci.org/facebook/rocksdb/jobs/83732558 This diff makes the test more explicit: 1. create two files 2. corrupt the second's file index 3. expect to get only 5000 keys when range scanning Test Plan: the test is still passing :) Reviewers: sdong, rven, yhchiang, kradhakrishnan, IslamAbdelRahman, anthony Reviewed By: anthony Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D48183	2015-10-05 14:46:28 -07:00
Igor Canadi	bf19dbff44	Fix valgrind - Initialize done variable Summary: Fixes the valgrind warning "Conditional jump or move depends on uninitialised value(s)" Test Plan: valgrind test, no more warning Reviewers: sdong Reviewed By: sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D48177	2015-10-05 10:10:11 -07:00
Igor Canadi	115427ef63	Add APIs PauseBackgroundWork() and ContinueBackgroundWork() Summary: To support a new MongoDB capability, we need to make sure that we don't do any IO for a short period of time. For background, see: * https://jira.mongodb.org/browse/SERVER-20704 * https://jira.mongodb.org/browse/SERVER-18899 To implement that, I add a new API calls PauseBackgroundWork() and ContinueBackgroundWork() which reuse the capability we already have in place for RefitLevel() function. Test Plan: Added a new test in db_test. Made sure that test fails when PauseBackgroundWork() is commented out. Reviewers: IslamAbdelRahman, sdong Reviewed By: sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D47901	2015-10-02 13:17:34 -07:00
Islam AbdelRahman	c29af48d3e	Add max_file_opening_threads to db_bench Summary: Add an option to db_bench for max_file_opening_threads Test Plan: compile and run db_bench Reviewers: sdong, yhchiang, igor Reviewed By: igor Subscribers: dhruba, paultuckfield Differential Revision: https://reviews.facebook.net/D47811	2015-09-30 09:51:31 -07:00
Yueh-Hsuan Chiang	30f74fa964	Make CompactionJobStatsTest.UniversalCompactionTest more robust Summary: CompactionJobStatsTest.UniversalCompactionTest assumes compaction kicks in when the number of L0 files equals to the compaction trigger. However, in some case, the compaction might not catch up the write speed and thus compaction might not kick in until the number of L0 files is GREATER than the compaction trigger. This patch tries to fix this corner case by making the Put thread wait for a potential compaction whenever it flushes. Test Plan: ./compaction_job_stats_test Reviewers: sdong, anthony, IslamAbdelRahman, igor Subscribers: dhruba Differential Revision: https://reviews.facebook.net/D47589	2015-09-28 13:55:53 -07:00
Mike Lin	60fa9cf0b5	Override DBImplReadOnly::SyncWAL() to return NotSupported. Previously, calling it caused program abort.	2015-09-25 21:25:30 -07:00
Yueh-Hsuan Chiang	63e0f86797	Fixed a bug which causes rocksdb.flush.write.bytes stat is always zero Summary: Fixed a bug which causes rocksdb.flush.write.bytes stat is always zero Test Plan: augment existing db_test Reviewers: sdong, anthony, IslamAbdelRahman, igor Reviewed By: igor Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D47595	2015-09-25 13:34:49 -07:00
Yueh-Hsuan Chiang	b6aa3f962d	Fixed a memory leak issue in DBTest.UnremovableSingleDelete Summary: Fixed a memory leak issue in DBTest.UnremovableSingleDelete Test Plan: valgrind --error-exitcode=2 --leak-check=full ./db_test --gtest_filter="UnremovableSingleDelete" Reviewers: sdong, anthony, IslamAbdelRahman, igor Reviewed By: igor Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D47583	2015-09-25 12:07:32 -07:00
Igor Canadi	7b7b5d9f18	[minor] Reuse SleepingBackgroundTask Summary: As title Test Plan: make check Reviewers: yhchiang, sdong Reviewed By: sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D46983	2015-09-25 10:29:44 -07:00
Mayank Pundir	c58bac701c	Fix valgrind failure due to memory leaks Summary: Test cases for IsBottommostLevel function create FileMetaData objects which were not getting deleted in the destructor. Test Plan: Valgrind check on compaction_picker_test Reviewers: yhchiang, igor, sdong Subscribers: rven, kradhakrishnan, IslamAbdelRahman, dhruba, anthony Differential Revision: https://reviews.facebook.net/D47463	2015-09-23 17:41:42 -07:00
Islam AbdelRahman	f03b5c987b	Add experimental DB::AddFile() to plug sst files into empty DB Summary: This is an initial version of bulk load feature This diff allow us to create sst files, and then bulk load them later, right now the restrictions for loading an sst file are (1) Memtables are empty (2) Added sst files have sequence number = 0, and existing values in database have sequence number = 0 (3) Added sst files values are not overlapping Test Plan: unit testing Reviewers: igor, ott, sdong Reviewed By: sdong Subscribers: leveldb, ott, dhruba Differential Revision: https://reviews.facebook.net/D39081	2015-09-23 12:42:43 -07:00
Yueh-Hsuan Chiang	3fdb6e5234	Fixed old lint errors in db/filename.cc Summary: Fixed old lint errors in db/filename.cc Test Plan: make Reviewers: igor, sdong, anthony, IslamAbdelRahman Reviewed By: IslamAbdelRahman Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D47445	2015-09-23 12:39:16 -07:00
Yueh-Hsuan Chiang	b349d22786	Fixed old lint errors in db/filename.h Summary: Fixed old lint errors in db/filename.h Test Plan: make Reviewers: igor, sdong, anthony, IslamAbdelRahman Reviewed By: IslamAbdelRahman Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D47439	2015-09-23 12:22:44 -07:00
sdong	df34aea331	PlainTableReader to support non-mmap mode Summary: PlainTableReader now only allows mmap-mode. Add the support to non-mmap mode for more flexibility. Refactor the codes to move all logic of reading data to PlainTableKeyDecoder, and consolidate the calls to Read() call and ReadVarint32() call. Implement the calls for both of mmap and non-mmap case seperately. For non-mmap mode, make copy of keys in several places when we need to move the buffer after reading the keys. Test Plan: Add the mode of non-mmap case in plain_table_db_test. Run it in valgrind mode too. Subscribers: leveldb, dhruba Differential Revision: https://reviews.facebook.net/D47187	2015-09-23 11:41:07 -07:00
sdong	d0c31641d2	Internal stats WAL file synced to match meaning of the stats of the same name Summary: https://reviews.facebook.net/D23343 changed WAL sync bytes to extra fsync. This change does the same for internal stats. Test Plan: Run all existing unit tests and verify results in db_bench. Reviewers: anthony, rven, igor, MarkCallaghan, kradhakrishnan, yhchiang Reviewed By: yhchiang Subscribers: leveldb, dhruba Differential Revision: https://reviews.facebook.net/D47349	2015-09-22 14:23:11 -07:00
Siying Dong	48b4497f75	Merge pull request #730 from yuslepukhin/fix_write_batch_win_const_expr Fix Windows constexpr issue and '#ifdef' column_family_test in Release.	2015-09-22 11:08:10 -07:00
sdong	f1b9f804e9	Add a mode to always pick the oldest file to compact for each level Summary: Add options.compaction_pri, which specifies the policy about which file to compact first. kCompactionPriByLargestSeq will compact oldest files first. Verified the behavior in db_bench but did not write unit tests yet. Also need to make it settable through option string and dynamically changeable. Test Plan: Will write unit tests Reviewers: igor, rven, anthony, kradhakrishnan, IslamAbdelRahman, yhchiang, MarkCallaghan Reviewed By: yhchiang Subscribers: leveldb, dhruba Differential Revision: https://reviews.facebook.net/D45951	2015-09-21 17:21:59 -07:00
Dmitri Smirnov	2754ec9994	Fix Windows constexpr issue and '#ifdef' column_family_test in Release.	2015-09-21 16:21:01 -07:00
jsteemann	5ec129971b	key_ cannot become nullptr, so no check is needed for that (ignoring the unlikely case that some overrides `operator new throw(std::bad_alloc)` with a function that returns a nullptr)	2015-09-18 20:15:20 +02:00
jsteemann	834b12a8d5	made Size() function const because it does not modify data	2015-09-18 20:10:00 +02:00
Andres Noetzli	014fd55adc	Support for SingleDelete() Summary: This patch fixes #7460559. It introduces SingleDelete as a new database operation. This operation can be used to delete keys that were never overwritten (no put following another put of the same key). If an overwritten key is single deleted the behavior is undefined. Single deletion of a non-existent key has no effect but multiple consecutive single deletions are not allowed (see limitations). In contrast to the conventional Delete() operation, the deletion entry is removed along with the value when the two are lined up in a compaction. Note: The semantics are similar to @igor's prototype that allowed to have this behavior on the granularity of a column family ( https://reviews.facebook.net/D42093 ). This new patch, however, is more aggressive when it comes to removing tombstones: It removes the SingleDelete together with the value whenever there is no snapshot between them while the older patch only did this when the sequence number of the deletion was older than the earliest snapshot. Most of the complex additions are in the Compaction Iterator, all other changes should be relatively straightforward. The patch also includes basic support for single deletions in db_stress and db_bench. Limitations: - Not compatible with cuckoo hash tables - Single deletions cannot be used in combination with merges and normal deletions on the same key (other keys are not affected by this) - Consecutive single deletions are currently not allowed (and older version of this patch supported this so it could be resurrected if needed) Test Plan: make all check Reviewers: yhchiang, sdong, rven, anthony, yoshinorim, igor Reviewed By: igor Subscribers: maykov, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D43179	2015-09-17 11:42:56 -07:00
Venkatesh Radhakrishnan	51e1c11254	Do not flag error if file to be deleted does not exist Summary: Some users have observed errors in the log file when the log file or sst file is already deleted. Test Plan: Make sure that the errors do not appear for already deleted files. Reviewers: sdong Reviewed By: sdong Subscribers: anthony, kradhakrishnan, yhchiang, rven, igor, IslamAbdelRahman, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D47115	2015-09-17 10:21:34 -07:00
Mayank Pundir	a5e312a7a4	Improving condition for bottommost level during compaction Summary: The diff modifies the condition checked to determine the bottommost level during compaction. Previously, absence of files in higher levels alone was used as the condition. Now, the function additionally evaluates if the higher levels have files which have non-overlapping key ranges, then the level can be safely considered as the bottommost level. Test Plan: Unit test cases added and passing. However, unit tests of universal compaction are failing as a result of the changes made in this diff. Need to understand why that is happening. Reviewers: igor Subscribers: dhruba, sdong, lgalanis, meyering Differential Revision: https://reviews.facebook.net/D46473	2015-09-16 17:47:50 -07:00
sdong	9aca7cd6d8	DB::Open() to flush info log after printing DB pointer Summary: Now DB::Open() flushes info log before printing DB pointer, so it may not show up if no activity after DB open. Move log flushing from after printing options to printing DB pointer. Test Plan: make commit-prereq Reviewers: igor, IslamAbdelRahman, yhchiang, kradhakrishnan, anthony, rven Reviewed By: rven Subscribers: leveldb, dhruba Differential Revision: https://reviews.facebook.net/D47121	2015-09-16 16:33:39 -07:00
Yueh-Hsuan Chiang	f21c7415a7	Change the log level of DB start-up log from Warn to Header. Summary: Change the log level of DB start-up log from Warn to Header. Test Plan: db_bench and observe the LOG header Reviewers: igor, anthony, IslamAbdelRahman, sdong Reviewed By: sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D47067	2015-09-16 11:31:45 -07:00
Alexey Maykov	3ebf11ed16	Adding the increment for a counter for a number of WAL syncs Summary: This will unblock the corresponding change in MyRocks Test Plan: ran rocksdb.write_sync test Reviewers: sdong, kolmike Subscribers: dhruba Differential Revision: https://reviews.facebook.net/D46911	2015-09-16 11:00:49 -07:00
Igor Canadi	1b7ea8ce81	Skipped tests shouldn't be failures Summary: If we skip a test, we shouldn't mark `make check` as failure. This fixes travis CI test. Test Plan: Travis CI Reviewers: noetzli, sdong Reviewed By: sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D47031	2015-09-15 18:10:36 -07:00
Ari Ekmekji	5ba3297d0d	Add compaction time to log output Summary: Although compaction time is recorded in the statistics, it is helpful to include this value in the log output corresponding to the end of compaction. Test Plan: make all && make check Reviewers: yhchiang, sdong, igor, noetzli, MarkCallaghan Reviewed By: MarkCallaghan Subscribers: dhruba Differential Revision: https://reviews.facebook.net/D47007	2015-09-15 17:11:44 -07:00
Igor Canadi	0e50a3fcc0	Merge issue with D46773 Summary: There was a merge issue with SleepingBackgroundTask Test Plan: compiles now Reviewers: sdong Reviewed By: sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D46977	2015-09-15 11:35:23 -07:00
Igor Canadi	a7e80379b0	LogAndApply() should fail if the column family has been dropped Summary: This patch finally fixes the ColumnFamilyTest.ReadDroppedColumnFamily test. The test has been failing very sporadically and it was hard to repro. However, I managed to write a new tests that reproes the failure deterministically. Here's what happens: 1. We start the flush for the column family 2. We check if the column family was dropped here: `a3fc49bfdd/db/flush_job.cc (L149)` 3. This check goes through, ends up in InstallMemtableFlushResults() and it goes into LogAndApply() 4. At about this time, we start dropping the column family. Dropping the column family process gets to LogAndApply() at about the same time as LogAndApply() from flush process 5. Drop column family goes through LogAndApply() first, marking the column family as dropped. 6. Flush process gets woken up and gets a chance to write to the MANIFEST. However, this is where it gets stuck: `a3fc49bfdd/db/version_set.cc (L1975)` 7. We see that the column family was dropped, so there is no need to write to the MANIFEST. We return OK. 8. Flush gets OK back from LogAndApply() and it deletes the memtable, thinking that the data is now safely persisted to sst file. The fix is pretty simple. Instead of OK, we return ShutdownInProgress. This is not really true, but we have been using this status code to also mean "this operation was canceled because the column family has been dropped". The fix is only one LOC. All other code is related to tests. I added a new test that reproes the failure. I also moved SleepingBackgroundTask to util/testutil.h (because I needed it in column_family_test for my new test). There's plenty of other places where we reimplement SleepingBackgroundTask, but I'll address that in a separate commit. Test Plan: 1. new test 2. make check 3. Make sure the ColumnFamilyTest.ReadDroppedColumnFamily doesn't fail on Travis: https://travis-ci.org/facebook/rocksdb/jobs/79952386 Reviewers: yhchiang, anthony, IslamAbdelRahman, kradhakrishnan, rven, sdong Reviewed By: sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D46773	2015-09-15 11:28:44 -07:00
Yoshinori Matsunobu	4886073174	Adding Slice::difference_offset() function Summary: There are some use cases in MyRocks to compare two slices and to return the first byte where they differ. It may be useful to add it as a RocksDB Slice function. Test Plan: db_test Reviewers: sdong, rven, igor Reviewed By: igor Subscribers: jkedgar, dhruba Differential Revision: https://reviews.facebook.net/D46935	2015-09-15 10:32:42 -07:00
sdong	f3170b6f6c	DBImpl::FindObsoleteFiles() shouldn't release mutex between getting min_pending_output and scanning files Summary: Releasing mutex between getting min_pending_output and scanning files may cause min_pending_output to be max but some non-final files are found in file scanning, ending up with deleting wrong files. As a recent regression, mutex can be released while waiting for log sync. We move it to after file scanning. Test Plan: Run all existing tests. Don't think it is easy to write a unit test. Maybe we should find a way to assert lock not released so that we can have some test verification for similar cases. Reviewers: igor, anthony, IslamAbdelRahman, kradhakrishnan, yhchiang, kolmike, rven Reviewed By: rven Subscribers: leveldb, dhruba Differential Revision: https://reviews.facebook.net/D46899	2015-09-14 23:39:30 -07:00
Islam AbdelRahman	7cb314b9e6	Skip some tests in ROCKSD_LITE Summary: Skip these tests under ROCKSDB_LITE compaction_job_stats_test corruption_test transactions/transaction_test Test Plan: compile using ROCKSDB_LITE Reviewers: yhchiang, igor, sdong Reviewed By: sdong Subscribers: dhruba Differential Revision: https://reviews.facebook.net/D46923	2015-09-14 16:44:35 -07:00
sdong	5de807ac16	Add options.hard_pending_compaction_bytes_limit to stop writes if compaction lagging behind Summary: Add an option to stop writes if compaction lags behind. If estimated pending compaction bytes is more than threshold specified by options.hard_pending_compaction_bytes_limit, writes will stop until compactions are cleared to under the threshold. Test Plan: Add unit test DBTest.HardLimit Reviewers: rven, kradhakrishnan, anthony, IslamAbdelRahman, yhchiang, igor Reviewed By: igor Subscribers: MarkCallaghan, leveldb, dhruba Differential Revision: https://reviews.facebook.net/D45999	2015-09-14 12:51:16 -07:00
Siying Dong	592f6bf782	Merge pull request #716 from yuslepukhin/refactor_file_reader_writer_win Refactor to support file_reader_writer on Windows.	2015-09-14 12:29:01 -07:00
Ari Ekmekji	03ddce9a01	Add counters for L0 stall while L0-L1 compaction is taking place Summary: Although there are currently counters to keep track of the stall caused by having too many L0 files, there is no distinction as to whether when that stall occurs either (A) L0-L1 compaction is taking place to try and mitigate it, or (B) no L0-L1 compaction has been scheduled at the moment. This diff adds a counter for (A) so that the nature of L0 stalls can be better understood. Test Plan: make all && make check Reviewers: sdong, igor, anthony, noetzli, yhchiang Reviewed By: yhchiang Subscribers: MarkCallaghan, dhruba Differential Revision: https://reviews.facebook.net/D46749	2015-09-14 11:03:37 -07:00
Dmitri Smirnov	ddc8b44998	Address code review comments both GH and internal Fix compilation issues on GCC/CLANG Address Windows Release test build issues due to Sync	2015-09-11 17:36:48 -07:00
Andres Noetzli	34cedaff66	Initialize variable to avoid warning Summary: RocksDB debug version failed to build under gcc-4.8.1 on sandcastle with the following error: ``` db/db_compaction_filter_test.cc:570:33: error: ‘snapshot’ may be used uninitialized in this function [-Werror=maybe-uninitialized] ``` Test Plan: make db_compaction_filter_test && ./db_compaction_filter_test Reviewers: rven, anthony, yhchiang, aekmekji, igor, sdong Reviewed By: igor, sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D46725	2015-09-11 12:07:54 -07:00
Manuel Ung	aeb4612685	Add counters for seek/next/prev Summary: There are currently no statistics on seeks, only on gets. This adds the following counters: rocksdb.number.db.seek rocksdb.number.db.next rocksdb.number.db.prev (number of calls) rocksdb.db.iterate.bytes.read (number of bytes read from key + value using seek/next/prev) rocksdb.number.keys.seek.found rocksdb.number.keys.next.found rocksdb.number.keys.prev.found (number of calls where seek/next/prev found a value) Test Plan: ./db_bench -statistics -benchmarks fillrandom,seekrandom -seek_nexts 5 ./db_bench -statistics -benchmarks fillrandom,seekrandom -seek_nexts 5 -reverse_iterator Reviewers: yhchiang, rven, kradhakrishnan, IslamAbdelRahman, MarkCallaghan, sdong, igor Reviewed By: sdong Subscribers: dhruba Differential Revision: https://reviews.facebook.net/D46605	2015-09-11 11:37:44 -07:00
Islam AbdelRahman	45e9e4f0bb	Refactor NewTableReader to accept TableReaderOptions Summary: Refactoring NewTableReader to accept TableReaderOptions This will make it easier to add new options in the future, for example in this diff https://reviews.facebook.net/D46071 Test Plan: run existing tests Reviewers: igor, yhchiang, anthony, rven, sdong Reviewed By: sdong Subscribers: dhruba Differential Revision: https://reviews.facebook.net/D46179	2015-09-11 11:36:33 -07:00
Andres Noetzli	ddb950f83f	Fixed bug in compaction iterator Summary: During the refactoring, the condition that makes sure that compaction filters are only applied to records newer than the latest snapshot got butchered. This patch fixes the condition and adds a test case. Test Plan: make db_compaction_filter_test && ./db_compaction_filter_test Reviewers: rven, anthony, yhchiang, sdong, aekmekji, igor Reviewed By: igor Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D46707	2015-09-11 10:13:49 -07:00
Dmitri Smirnov	30e82d5c41	Refactor to support file_reader_writer on Windows. Summary. A change https://reviews.facebook.net/differential/diff/224721/ Has attempted to move common functionality out of platform dependent code to a new facility called file_reader_writer. This includes: - perf counters - Buffering - RateLimiting However, the change did not attempt to refactor Windows code. To mitigate, we introduce new quering interfaces such as UseOSBuffer(), GetRequiredBufferAlignment() and ReaderWriterForward() for pure forwarding where required. Introduce WritableFile got a new method Truncate(). This is to communicate to the file as to how much data it has on close. - When space is pre-allocated on Linux it is filled with zeros implicitly, no such thing exist on Windows so we must truncate file on close. - When operating in unbuffered mode the last page is filled with zeros but we still want to truncate. Previously, Close() would take care of it but now buffer management is shifted to the wrappers and the file has no idea about the file true size. This means that Close() on the wrapper level must always include Truncate() as well as wrapper __dtor should call Close() and against double Close(). Move buffered/unbuffered write logic to the wrapper. Utilize Aligned buffer class. Adjust tests and implement Truncate() where necessary. Come up with reasonable defaults for new virtual interfaces. Forward calls for RandomAccessReadAhead class to avoid double buffering and locking (double locking in unbuffered mode on WIndows).	2015-09-11 09:57:02 -07:00
Andres Noetzli	c25f6a85bf	Removed __unused__ attribute Summary: The current build is failing on some platforms due to an __unused__ attribute. This patch prevents the problem by using a pattern similar to MergeHelper (assert not on the variable but inside a condition that uses the variable). We should have better error handling in both cases in the future. Test Plan: make clean all check Reviewers: rven, anthony, yhchiang, sdong, igor, aekmekji Reviewed By: aekmekji Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D46623	2015-09-10 15:16:32 -07:00
Ari Ekmekji	6db0a939d2	Fix DBCompactionTest failure with parallel L0-L1 compactions Summary: The test SuggestCompactRangeNoTwoLevel0Compactions in DBCompactionTest fails when there are parallel L0-L1 compactions taking place because the test makes sure that only one compaction involving L0 takes place at any given time (since before having parallel compactions this was impossible). I changed the test to only run with DBOptions.max_subcompactions=1 so as to not hit this issue which is not a correctness issue but just an inherent changing of assumptions after introducing parallel compactions. This failed after landing https://reviews.facebook.net/D43269#inline-321303 so now this should fix it Test Plan: make all && make check Reviewers: yhchiang, igor, anthony, noetzli, sdong Reviewed By: sdong Subscribers: dhruba Differential Revision: https://reviews.facebook.net/D46617	2015-09-10 14:37:00 -07:00
Andres Noetzli	8aa1f15197	Refactored common code of Builder/CompactionJob out into a CompactionIterator Summary: Builder and CompactionJob share a lot of fairly complex code. This patch refactors this code into a separate class, the CompactionIterator. Because the shared code is fairly complex, this patch hopefully improves maintainability. While there are is a lot of potential for further improvements, the patch is intentionally pretty close to the original structure because the change is already complex enough. Test Plan: make clean all check && ./db_stress Reviewers: rven, anthony, yhchiang, sdong, igor Reviewed By: igor Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D46197	2015-09-10 14:35:25 -07:00
Igor Canadi	95ffc5d2bc	Correct ASSERT_OK() in ReadDroppedColumnFamily Summary: ReadDroppedColumnFamily is consistently failing in Travis CI environment (can't repro locally). I suspect it might be failing with non-OK status. This diff will give us more info about the failure. Test Plan: none Reviewers: sdong, kradhakrishnan Reviewed By: kradhakrishnan Subscribers: kradhakrishnan, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D46611	2015-09-10 14:17:12 -07:00
Ari Ekmekji	3c37b3cccd	Determine boundaries of subcompactions Summary: Up to this point, the subcompactions that make up a compaction job have been divided based on the key range of the L1 files, and each subcompaction has handled the key range of only one file. However DBOption.max_subcompactions allows the user to designate how many subcompactions at most to perform. This patch updates the CompactionJob::GetSubcompactionBoundaries() to determine these divisions accordingly based on that option and other input/system factors. The current approach orders the starting and/or ending keys of certain compaction input files and then generates a histogram to approximate the size covered by the key range between each consecutive pair of keys. Then it groups these ranges into groups so that the sizes are approximately equal to one another. The approach has also been adapted to work for universal compaction as well instead of just for level-based compaction as it was before. These subcompactions are then executed in parallel by locally spawning threads, one for each. The results are then aggregated and the compaction completed. Test Plan: make all && make check Reviewers: yhchiang, anthony, igor, noetzli, sdong Reviewed By: sdong Subscribers: MarkCallaghan, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D43269	2015-09-10 13:50:00 -07:00
krad	1126644082	Relaxing consistency detection to include errors while inserting to memtable as WAL recovery error. Summary: The current code, considers data to be consistent if the record checksum passes. We do have customer issues where the record checksum passed but the data was incomprehensible. There is no way to get out of this error case since all WAL recovery model will consider this error as unrelated to WAL. Relaxing the definition and including errors while inserting to memtable as WAL errors and handing them as per the recovery level. Test Plan: Used customer dump to verify the fix for different level. The db opens for kSkipAnyCorruptedRecords and kPointInTimeRecovery, but fails for kAbsoluteConsistency and kTolerateCorruptedTailRecords. Reviewers: sdon igor CC: leveldb@ Task ID: #7918721 Blame Rev:	2015-09-10 12:56:17 -07:00
sdong	abc7f5fdb2	Make DBTest.ReadLatencyHistogramByLevel more robust Summary: DBTest.ReadLatencyHistogramByLevel was not written as expected. After writes, reads aren't guaranteed to hit data written. It was not expected. Fix it. Test Plan: Run the test multiple times Reviewers: IslamAbdelRahman, rven, anthony, kradhakrishnan, yhchiang, igor Reviewed By: igor Subscribers: leveldb, dhruba Differential Revision: https://reviews.facebook.net/D46587	2015-09-10 11:32:19 -07:00
Igor Canadi	ac9bcb55ce	Set max_open_files based on ulimit Summary: We should never set max_open_files to be bigger than the system's ulimit. Otherwise we will get "Too many open files" errors. See an example in this Travis run: https://travis-ci.org/facebook/rocksdb/jobs/79591566 Test Plan: make check I will also verify that max_max_open_files is reasonable. Reviewers: anthony, kradhakrishnan, IslamAbdelRahman, sdong Reviewed By: sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D46551	2015-09-10 10:49:28 -07:00

1945 Commits