rocksdb

Author	SHA1	Message	Date
Yueh-Hsuan Chiang	be9816b3d9	Fix data race issue when sub-compaction is used in CompactionJob Summary: When subcompaction is used, all subcompactions share the same Compaction pointer in CompactionJob while each subcompaction keeps its mutable stats in SubcompactionState. However, there are still some mutable parts that are currently stored in the shared Compaction pointer. This patch makes three changes: 1. Make the shared Compaction pointer const so that it can never be modified during the compaction. 2. Move necessary states from Compaction to SubcompactionState. 3. Make functions of Compaction const if the function does not modify its internal state. Test Plan: rocksdb and MyRocks test Reviewers: sdong, kradhakrishnan, andrewkr, IslamAbdelRahman Reviewed By: IslamAbdelRahman Subscribers: andrewkr, dhruba, yoshinorim, gunnarku, leveldb Differential Revision: https://reviews.facebook.net/D55923	2016-03-24 19:36:39 -07:00
Praveen Rao	583157f710	Avoid overloaded virtual function	2016-03-22 17:10:31 -07:00
Praveen Rao	136b8e0cad	Merge from master	2016-03-22 12:38:44 -07:00
Praveen Rao	2dcbb3b4f3	Addressed review comments	2016-03-22 12:07:15 -07:00
sdong	b1fafcaca6	Revert "Adding pin_l0_filter_and_index_blocks_in_cache feature." This reverts commit `522de4f59e`. It has bug of index block cleaning up.	2016-03-21 11:50:42 -07:00
agiardullo	fbbb8a6144	Add test for Snapshot 0 Summary: I ran into this assert when stress testing transactions. It's pretty easy to repro. Changing VersionSet::last_sequence_ to start at 1 seems pretty straightforward. We would just need to change the 4 callers of SetLastSequence(), including recovery code. I'd make this change myself, but I do not have enough time to test changes to recovery code-paths this week. But checking in this test case (disabled) for future fixing. Test Plan: n/a Reviewers: yhchiang, kradhakrishnan, andrewkr, anthony, IslamAbdelRahman Reviewed By: IslamAbdelRahman Subscribers: andrewkr, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D55311	2016-03-18 16:16:20 -07:00
Andrew Kryczka	e182f03c1e	Add unit tests for RepairDB Summary: Basic test cases: - Manifest is lost or corrupt - Manifest refers to too many or too few SST files - SST file is corrupt - Unflushed data is present when RepairDB is called Depends on D55065 for its CreateFile() function in file_utils Test Plan: Ran the tests. Reviewers: IslamAbdelRahman, yhchiang, yoshinorim, sdong Reviewed By: sdong Subscribers: leveldb, andrewkr, dhruba Differential Revision: https://reviews.facebook.net/D55485	2016-03-18 15:18:42 -07:00
Praveen Rao	7d371863e5	travis build fixes	2016-03-18 14:43:22 -07:00
Praveen Rao	4f1c74a46e	merge from master	2016-03-18 12:48:01 -07:00
Praveen Rao	f8c2189307	Publish log numbers for column family to wal_filter, and provide log number in the record callback	2016-03-18 12:32:15 -07:00
Marton Trencseni	44756260ae	Reset block cache in failing unit test. Test Plan: make -j40 check OPT=-g, on both /tmp and /dev/shm Reviewers: sdong Reviewed By: sdong Subscribers: andrewkr, dhruba Differential Revision: https://reviews.facebook.net/D55701	2016-03-18 06:13:54 +00:00
Marton Trencseni	522de4f59e	Adding pin_l0_filter_and_index_blocks_in_cache feature. Summary: When a block based table file is opened, if prefetch_index_and_filter is true, it will prefetch the index and filter blocks, putting them into the block cache. What this feature adds: when a L0 block based table file is opened, if pin_l0_filter_and_index_blocks_in_cache is true in the options (and prefetch_index_and_filter is true), then the filter and index blocks aren't released back to the block cache at the end of BlockBasedTableReader::Open(). Instead the table reader takes ownership of them, hence pinning them, ie. the LRU cache will never push them out. Meanwhile in the table reader, further accesses will not hit the block cache, thus avoiding lock contention. When the table reader is destroyed, it releases the pinned blocks (if there were any). This has to happen before the cache is destroyed, so I had to introduce a TableReader::Close(), to guarantee the order of destruction. Test Plan: Added two unit tests for this. Existing unit tests run fine (default is pin_l0_filter_and_index_blocks_in_cache=false). DISABLE_JEMALLOC=1 OPT=-g make all valgrind_check -j32 Mac: OK. Linux: with D55287 patched in it's OK. Reviewers: sdong Reviewed By: sdong Subscribers: andrewkr, leveldb, dhruba Differential Revision: https://reviews.facebook.net/D54801	2016-03-17 22:40:01 +00:00
dhruba borthakur	33d568611d	Merge pull request #1040 from bureau14/master Fixes warnings and ensure correct int behavior on 32-bit platforms.	2016-03-17 03:23:30 -07:00
Edouard A	02e62ebbc8	Fixes warnings and ensure correct int behavior on 32-bit platforms.	2016-03-16 22:57:57 +01:00
Islam AbdelRahman	3ff98bd209	Fix no compression test Summary: DBBlockCacheTest.TestWithCompressedBlockCache depends on compression using snappy, so this test fails when snappy is not available. Block this test when we don't have snappy. https://ci-builds.fb.com/view/rocksdb/job/rocksdb_no_compression/833/console Test Plan: run the test when compression libraries are not available Reviewers: sdong, yiwu Reviewed By: yiwu Subscribers: andrewkr, dhruba Differential Revision: https://reviews.facebook.net/D55413	2016-03-15 12:17:40 -07:00
Dhruba Borthakur	1a2cc27e01	ColumnFamilyOptions SanitizeOptions is buggy on 32-bit platforms. Summary: The pre-existing code is trying to clamp between 65,536 and 0, resulting in clamping to 65,536, resulting in very small buffers, resulting in ShouldFlushNow() being true quite easily, resulting in assertion failing and database performance being "not what it should be". https://github.com/facebook/rocksdb/issues/1018 Test Plan: make check Reviewers: sdong, andrewkr, IslamAbdelRahman, yhchiang, igor Reviewed By: igor Subscribers: leveldb, andrewkr, dhruba Differential Revision: https://reviews.facebook.net/D55455	2016-03-14 16:21:54 -07:00
sdong	b2ae5950ba	Index Reader should not be reused after DB restart Summary: In block based table reader, now we put index reader to block cache, which can be retrieved after DB restart. However, index reader may reference internal comparator, which can be destroyed after DB restarts, causing problems. Fix it by making cache key identical per table reader. Test Plan: Add a new test which failed without the commit but now passes. Reviewers: IslamAbdelRahman Reviewed By: IslamAbdelRahman Subscribers: maro, yhchiang, kradhakrishnan, leveldb, andrewkr, dhruba Differential Revision: https://reviews.facebook.net/D55287	2016-03-14 10:04:09 -07:00
Islam AbdelRahman	580fede347	Aggregate hot Iterator counters in LocalStatistics (DBIter::Next perf regression) Summary: This patch bump the counters in the frequent code path DBIter::Next() / DBIter::Prev() in a local data members and send them to Statistics when the iterator is destroyed A better solution will be to have thread_local implementation for Statistics New performance ``` readseq : 0.035 micros/op 28597881 ops/sec; 3163.7 MB/s 1,851,568,819 stalled-cycles-frontend # 31.29% frontend cycles idle [49.86%] 884,929,823 stalled-cycles-backend # 14.95% backend cycles idle [50.21%] readreverse : 0.071 micros/op 14077393 ops/sec; 1557.3 MB/s 3,239,575,993 stalled-cycles-frontend # 27.36% frontend cycles idle [49.96%] 1,558,253,983 stalled-cycles-backend # 13.16% backend cycles idle [50.14%] ``` Existing performance ``` readreverse : 0.174 micros/op 5732342 ops/sec; 634.1 MB/s 20,570,209,389 stalled-cycles-frontend # 70.71% frontend cycles idle [50.01%] 18,422,816,837 stalled-cycles-backend # 63.33% backend cycles idle [50.04%] readseq : 0.119 micros/op 8400537 ops/sec; 929.3 MB/s 15,634,225,844 stalled-cycles-frontend # 79.07% frontend cycles idle [49.96%] 14,227,427,453 stalled-cycles-backend # 71.95% backend cycles idle [50.09%] ``` Test Plan: unit tests Reviewers: yhchiang, sdong, igor Reviewed By: sdong Subscribers: andrewkr, dhruba Differential Revision: https://reviews.facebook.net/D55107	2016-03-11 19:01:12 -08:00
Baris Yazici	e8e6cf0173	fix: handle_fatal_signal (sig=6) in std::vector<std::string, std::allocator<std::string> >::_M_range_check \| c++/4.8.2/bits/stl_vector.h:794 #174 Summary: Fix for https://github.com/facebook/mysql-5.6/issues/174 When there is no old files to purge, vector.at(i) function was crashing if (old_info_log_file_count != 0 && old_info_log_file_count >= db_options_.keep_log_file_num) { std::sort(old_info_log_files.begin(), old_info_log_files.end()); size_t end = old_info_log_file_count - db_options_.keep_log_file_num; for (unsigned int i = 0; i <= end; i++) { std::string& to_delete = old_info_log_files.at(i); Added check to old_info_log_file_count be non zero. Test Plan: run existing tests Reviewers: gunnarku, vasilep, sdong, yhchiang Reviewed By: yhchiang Subscribers: andrewkr, webscalesql-eng, dhruba Differential Revision: https://reviews.facebook.net/D55245	2016-03-11 11:11:45 -08:00
Andrew Kryczka	d9620239d2	Cleanup stale manifests outside of full purge Summary: - Keep track of obsolete manifests in VersionSet - Updated FindObsoleteFiles() to put obsolete manifests in the JobContext for later use by PurgeObsoleteFiles() - Added test case that verifies a stale manifest is deleted by a non-full purge Test Plan: $ ./backupable_db_test --gtest_filter=BackupableDBTest.ChangeManifestDuringBackupCreation Reviewers: IslamAbdelRahman, yoshinorim, sdong Reviewed By: sdong Subscribers: andrewkr, leveldb, dhruba Differential Revision: https://reviews.facebook.net/D55269	2016-03-10 18:16:21 -08:00
Yi Wu	f71fc77b7c	Cache to have an option to fail Cache::Insert() when full Summary: Cache to have an option to fail Cache::Insert() when full. Update call sites to check status and handle error. I totally have no idea what's correct behavior of all the call sites when they encounter error. Please let me know if you see something wrong or more unit test is needed. Test Plan: make check -j32, see tests pass. Reviewers: anthony, yhchiang, andrewkr, IslamAbdelRahman, kradhakrishnan, sdong Reviewed By: sdong Subscribers: andrewkr, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D54705	2016-03-10 17:35:19 -08:00
Yueh-Hsuan Chiang	765597fa78	Update compaction score right after CompactFiles forms a compaction Summary: This is a follow-up patch of https://reviews.facebook.net/D54891. As the information about files being compacted will also be used when making compaction decision, it is necessary to update the compaction score when a compaction plan has been made but not yet execute. This patch adds a missing call to update the compaction score in CompactFiles(). Test Plan: compact_files_test Reviewers: sdong, IslamAbdelRahman, kradhakrishnan, yiwu, andrewkr Reviewed By: andrewkr Subscribers: andrewkr, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D55227	2016-03-10 14:34:28 -08:00
Yueh-Hsuan Chiang	aa3f02d50c	Improve comment in compaction.h and compaction_picker.h Summary: ReleaseCompactionFiles must be called when DB mutex is held, but the documentation is missing. Test Plan: no code change Reviewers: anthony, IslamAbdelRahman, kradhakrishnan, sdong Reviewed By: sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D54987	2016-03-08 16:46:41 -08:00
sdong	294bdf9ee2	Change Property name from "rocksdb.current_version_number" to "rocksdb.current-super-version-number" Summary: I realized I was wrong again about the naming convention. Let me change it to the correct one. Test Plan: Run unit tests. Reviewers: IslamAbdelRahman, kradhakrishnan, yhchiang, andrewkr Reviewed By: andrewkr Subscribers: leveldb, dhruba Differential Revision: https://reviews.facebook.net/D55041	2016-03-04 18:15:29 -08:00
Yueh-Hsuan Chiang	a7d4eb2f34	Fix a bug where flush does not happen when a manual compaction is running Summary: Currently, when RocksDB runs a manual compaction to refit data into a level, there's a ReFitLevel() process that requires that no bg work is running. When RocksDB plans to ReFitLevel(), it will do the following: 1. pause scheduling new bg work. 2. wait until all bg work has finished. 3. do the ReFitLevel(). 4. unpause scheduling new bg work. However, as it pauses scheduling new bg work in step 1 and waits for all bg work to finish in step 2, RocksDB will stop flushing until all bg work is done (which could take a long time). This patch fixes this issue by changing the way ReFitLevel() pauses the background work: 1. pause scheduling compaction. 2. wait until all bg work has finished. 3. pause scheduling flush. 4. do ReFitLevel(). 5. unpause both flush and compaction. The major difference is that we only pause scheduling compaction in step 1 before waiting for all bg work to finish in step 2. This prevents flush from being blocked for a long time. There is a very rare case in which ReFitLevel() might starve in step 2, but it is unlikely as flush typically finishes very fast. Test Plan: existing test. Reviewers: anthony, IslamAbdelRahman, kradhakrishnan, sdong Reviewed By: sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D55029	2016-03-04 14:24:52 -08:00
Islam AbdelRahman	dfe96c72c3	Fix WriteLevel0TableForRecovery file delete protection Summary: The call to ``` CaptureCurrentFileNumberInPendingOutputs() ``` should be before ``` versions_->NewFileNumber() ``` Right now we are not actually protecting the file from being deleted Test Plan: make check Reviewers: sdong, anthony, yhchiang Reviewed By: yhchiang Subscribers: dhruba Differential Revision: https://reviews.facebook.net/D54645	2016-03-03 18:25:07 -08:00
sdong	ef204df7ef	Compaction always needs to be removed from level0_compactions_in_progress_ for universal compaction Summary: We always put compaction to level0_compactions_in_progress_ for universal compaction, so we should also remove it. The bug causes assert failure when running manual compaction. Test Plan: TEST_TMPDIR=/dev/shm/ ./db_bench --benchmarks=fillrandom,compact --subcompactions=16 --compaction_style=1 always fails on my host. After the fix, it doesn't fail any more. Reviewers: IslamAbdelRahman, andrewkr, kradhakrishnan, yhchiang Reviewed By: yhchiang Subscribers: leveldb, dhruba Differential Revision: https://reviews.facebook.net/D55017	2016-03-02 21:23:28 -08:00
sdong	e79ad9e184	Add Iterator Property rocksdb.iterator.version_number Summary: We want to provide a way to detect whether an iterator is stale and needs to be recreated. Add a iterator property to return version number. Test Plan: Add two unit tests for it. Reviewers: IslamAbdelRahman, yhchiang, anthony, kradhakrishnan, andrewkr Reviewed By: andrewkr Subscribers: leveldb, dhruba Differential Revision: https://reviews.facebook.net/D54921	2016-03-02 16:23:59 -08:00
sdong	19ea40f8b6	Subcompaction boundary keys should not terminate after an empty level Summary: Now we skip to add boundary keys to subcompaction candidates since we see an empty level. This makes subcompaction almost disabled for universal compaction. We should consider all files instead. Test Plan: Run existing tests. Reviewers: IslamAbdelRahman, andrewkr, yhchiang Reviewed By: yhchiang Subscribers: leveldb, dhruba Differential Revision: https://reviews.facebook.net/D55005	2016-03-02 15:45:07 -08:00
sdong	74b660702e	Rename iterator property "rocksdb.iterator.is.key.pinned" => "rocksdb.iterator.is-key-pinned" Summary: Rename iterator property to follow the property naming convention. Test Plan: Run all existing tests. Reviewers: andrewkr, anthony, yhchiang, kradhakrishnan, IslamAbdelRahman Reviewed By: IslamAbdelRahman Subscribers: leveldb, dhruba Differential Revision: https://reviews.facebook.net/D54957	2016-03-01 13:47:12 -08:00
Islam AbdelRahman	6743135ea1	Fix DB::AddFile() issue when PurgeObsoleteFiles() is called Summary: In some situations the DB will scan all existing files in the DB path and delete the ones that are obsolete. If this happens while adding an external sst file, the file could be deleted while we are adding it. This diff fixes this issue. Test Plan: unit test to reproduce the bug; existing unit tests Reviewers: sdong, yhchiang, andrewkr Reviewed By: andrewkr Subscribers: dhruba Differential Revision: https://reviews.facebook.net/D54627	2016-03-01 12:05:29 -08:00
sdong	432f3adf2c	Add DB Property "rocksdb.current_version_number" Summary: Add a DB Property "rocksdb.current_version_number" for users to monitor version changes and stale iterators. Test Plan: Add a unit test. Reviewers: andrewkr, yhchiang, kradhakrishnan, IslamAbdelRahman Reviewed By: IslamAbdelRahman Subscribers: leveldb, dhruba Differential Revision: https://reviews.facebook.net/D54927	2016-03-01 10:55:40 -08:00
sdong	b5b1db167a	Recompute compaction score after scheduling manual compaction Summary: After we made manual compaction runnable concurrently with automatic compaction, we need to run ComputeCompactionScore() to prepare a coming compaction picking call before the compaction finishes. Test Plan: Run existing tests. Reviewers: yhchiang, IslamAbdelRahman, andrewkr, kradhakrishnan, anthony, igor Reviewed By: igor Subscribers: leveldb, dhruba Differential Revision: https://reviews.facebook.net/D54891	2016-02-29 17:17:51 -08:00
sdong	1f5954147b	Introduce Iterator::GetProperty() and replace Iterator::IsKeyPinned() Summary: Add Iterator::GetProperty(), a way for users to communicate with iterator, and replace Iterator::IsKeyPinned() with it. As a follow-up, I'll add a property as the version number attached to the iterator. Test Plan: Rerun existing tests and add a negative test case. Reviewers: yhchiang, andrewkr, kradhakrishnan, anthony, IslamAbdelRahman Reviewed By: IslamAbdelRahman Subscribers: leveldb, dhruba Differential Revision: https://reviews.facebook.net/D54783	2016-02-29 14:01:31 -08:00
Andrew Kryczka	69c471bd9b	Handle concurrent manifest update and backup creation Summary: Fixed two related race conditions in backup creation. (1) CreateNewBackup() uses DB::DisableFileDeletions() to prevent table files from being deleted while it is copying; however, the MANIFEST file could still rotate during this time. The fix is to stop deleting the old manifest in the rotation logic. It will be deleted safely later when PurgeObsoleteFiles() runs (can only happen when file deletions are enabled). (2) CreateNewBackup() did not account for the CURRENT file being mutable. This is significant because the files returned by GetLiveFiles() contain a particular manifest filename, but the manifest to which CURRENT refers can change at any time. This causes problems when CURRENT changes between the call to GetLiveFiles() and when it's copied to the backup directory. To workaround this, I manually forge a CURRENT file referring to the manifest filename returned in GetLiveFiles(). (2) also applies to the checkpointing code, so let me know if this approach is good and I'll make the same change there. Test Plan: new test for roll manifest during backup creation. running the test before this change: $ ./backupable_db_test --gtest_filter=BackupableDBTest.ChangeManifestDuringBackupCreation ... IO error: /tmp/rocksdbtest-9383/backupable_db/MANIFEST-000001: No such file or directory running the test after this change: $ ./backupable_db_test --gtest_filter=BackupableDBTest.ChangeManifestDuringBackupCreation ... [ RUN ] BackupableDBTest.ChangeManifestDuringBackupCreation [ OK ] BackupableDBTest.ChangeManifestDuringBackupCreation (2836 ms) Reviewers: IslamAbdelRahman, anthony, sdong Reviewed By: sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D54711	2016-02-29 12:56:55 -08:00
sdong	8800975fb0	Make DBTestUniversalCompaction.IncreaseUniversalCompactionNumLevels more robust Summary: Based on thread scheduling, DBTestUniversalCompaction.IncreaseUniversalCompactionNumLevels can fail to flush enough files to trigger expected compactions. Fix it by waiting for flush after inserting each key. There is a failure reported: db/db_universal_compaction_test.cc:1134: Failure Expected: (NumTableFilesAtLevel(options.num_levels - 1, 1)) > (0), actual: 0 vs 0 but I can't repro it. Try to fix the bug and see whether it goes away. Test Plan: Run the test multiple times. Reviewers: IslamAbdelRahman, anthony, andrewkr, kradhakrishnan, yhchiang Reviewed By: yhchiang Subscribers: leveldb, dhruba Differential Revision: https://reviews.facebook.net/D54747	2016-02-26 11:59:31 -08:00
sdong	82f15fb15d	Add test to make sure DropColumnFamily doesn't impact existing iterators Summary: Add a test case in ColumnFamilyTest.ReadDroppedColumnFamily to make sure existing iterator is not impacted by column family dropping. Test Plan: N/A Reviewers: igor, yhchiang, anthony, andrewkr, kradhakrishnan, IslamAbdelRahman Reviewed By: IslamAbdelRahman Subscribers: leveldb, dhruba Differential Revision: https://reviews.facebook.net/D54657	2016-02-24 10:25:38 -08:00
sdong	38201b3599	Fix assert failure when DBImpl::SyncWAL() conflicts with log rolling Summary: DBImpl::SyncWAL() releases db mutex before calling DBImpl::MarkLogsSynced(), while inside DBImpl::MarkLogsSynced() we assert there is none or one outstanding log file. However, a memtable switch can happen in between and cause two or more outstanding logs there, failing the assert. The diff adds a unit test that repros the issue and fixes the assert so that the unit test passes. Test Plan: Run the new tests. Reviewers: anthony, kolmike, yhchiang, IslamAbdelRahman, kradhakrishnan, andrewkr Reviewed By: andrewkr Subscribers: leveldb, dhruba Differential Revision: https://reviews.facebook.net/D54621	2016-02-23 11:42:15 -08:00
Andrew Kryczka	b046916656	Redo SyncPoints for flush while rolling test Summary: There was a race condition in the test where the rolling thread acquired the mutex before the flush thread pinned the logger. Rather than add more complicated synchronization to fix it, I followed Siying's suggestion to use SyncPoint in the test code. Comments in the LoadDependency() invocation explain the reason for each of the sync points. Test Plan: Ran test 1000 times for tsan/asan. Will wait for all sandcastle tests to finish before committing since this is a tricky test. Reviewers: IslamAbdelRahman, sdong Reviewed By: sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D54615	2016-02-22 21:32:19 -08:00
Mike Kolupaev	eef63ef807	Fixed CompactFiles() spuriously failing or corrupting DB Summary: We started getting two kinds of crashes since we started using `DB::CompactFiles()`: (1) `CompactFiles()` fails saying something like "/data/logdevice/4440/shard12/012302.sst: No such file or directory", and presumably makes DB read-only, (2) DB fails to open saying "Corruption: Can't access /267000.sst: IO error: /data/logdevice/4440/shard1/267000.sst: No such file or directory". AFAICT, both can be explained by background thread deleting compaction output as "obsolete" while it's being written, before it's committed to manifest. If it ends up committed to the manifest, we get (2); if compaction notices the disappearance and fails, we get (1). The internal tasks t10068021 and t10134177 have some details about the investigation that led to this. Test Plan: `make -j check`; the new test fails to reopen the DB without the fix Reviewers: yhchiang Reviewed By: yhchiang Subscribers: dhruba, sdong Differential Revision: https://reviews.facebook.net/D54561	2016-02-22 13:54:58 -08:00
Dmitri Smirnov	d37d348da8	This addresses build issues on Windows https://github.com/facebook/rocksdb/issues/1002	2016-02-19 12:29:54 -08:00
Andrew Kryczka	d825fc70d4	Use condition variable in log roller test Summary: Previously I just slept until the flush_thread was "probably" ready since proper synchronization in test cases seemed like overkill. But then tsan complained about it, so I did the synchronization (mostly) properly now. Test Plan: $ COMPILE_WITH_TSAN=1 make -j32 auto_roll_logger_test $ ./auto_roll_logger_test Reviewers: anthony, IslamAbdelRahman, sdong Reviewed By: sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D54399	2016-02-18 18:03:53 -08:00
Islam AbdelRahman	df9ba6df62	Introduce SstFileManager::SetMaxAllowedSpaceUsage() to cap disk space usage Summary: Introduce SstFileManager::SetMaxAllowedSpaceUsage() that can be used to limit the maximum space usage allowed for RocksDB. When this limit is exceeded, WriteImpl() will fail and return Status::Aborted() Test Plan: unit testing Reviewers: yhchiang, anthony, andrewkr, sdong Reviewed By: sdong Subscribers: dhruba Differential Revision: https://reviews.facebook.net/D53763	2016-02-17 15:20:23 -08:00
Andrew Kryczka	3943d16780	Fix race conditions in auto-rolling logger Summary: For GetLogFileSize() and Flush(), they previously did not follow the synchronization pattern for accessing logger_. This meant ResetLogger() could cause logger_ destruction while the unsynchronized functions were accessing it, causing a segfault. Also made the mutex instance variable mutable so we can preserve GetLogFileSize()'s const-ness. Test Plan: new test case, it's quite ugly because both threads need to access one of the functions with SyncPoints (PosixLogger::Flush()), and also special handling is needed to prevent the mutex and sync points from conflicting. Reviewers: kradhakrishnan, IslamAbdelRahman, sdong Reviewed By: sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D54237	2016-02-17 12:06:45 -08:00
reid horuff	a7b6f0748a	Improve write_with_callback_test to sync WAL Summary: Currently write_with_callback_test does not test with WAL syncing enabled. This addresses that. Test Plan: write_with_callback_test Reviewers: anthony Reviewed By: anthony Subscribers: leveldb, dhruba, hermanlee4 Differential Revision: https://reviews.facebook.net/D54255	2016-02-16 14:04:14 -08:00
reid horuff	5bcf952a87	Fix WriteImpl empty batch hanging issue Summary: There is an issue in DBImpl::WriteImpl where if an empty writebatch comes in and sync=true then the logs will be marked as being synced yet the sync never actually happens because there is no data in the writebatch. This causes the next incoming batch to hang while waiting for the logs to complete syncing. This fix syncs logs even if the writebatch is empty. Test Plan: DoubleEmptyBatch unit test in transaction_test. Reviewers: yoshinorim, hermanlee4, sdong, ngbronson, anthony Subscribers: leveldb, dhruba Differential Revision: https://reviews.facebook.net/D54057	2016-02-16 12:21:33 -08:00
Mike Kolupaev	44371501f0	Fixed a segfault when compaction fails Summary: We've hit it today. Test Plan: `make -j check`; didn't reproduce the issue Reviewers: yhchiang Reviewed By: yhchiang Subscribers: dhruba Differential Revision: https://reviews.facebook.net/D54219	2016-02-16 11:11:16 -08:00
Jonathan Wiepert	7bd284c374	Separate main from bench functionality to allow customizations Summary: Isolate db_bench functionality from main so custom benchmark code can be written and managed Test Plan: Tested commands ./build_tools/regression_build_test.sh ./db_bench --db=/tmp/rocksdbtest-12321/dbbench --stats_interval_seconds=1 --num=1000 ./db_bench --db=/tmp/rocksdbtest-12321/dbbench --stats_interval_seconds=1 --num=1000 --reads=500 --writes=500 ./db_bench --db=/tmp/rocksdbtest-12321/dbbench --stats_interval_seconds=1 --num=1000 --merge_keys=100 --numdistinct=100 --num_column_families=3 --num_hot_column_families=1 ./db_bench --stats_interval_seconds=1 --num=1000 --bloom_locality=1 --seed=5 --threads=5 ./db_bench --duration=60 --value_size=50 --seek_nexts=10 --reverse_iterator=true --usee_uint64_comparator=true --batch-size=5 ./db_bench --duration=60 --value_size=50 --seek_nexts=10 --reverse_iterator=true --use_uint64_comparator=true --batch_size=5 ./db_bench --duration=60 --value_size=50 --seek_nexts=10 --reverse_iterator=true --usee_uint64_comparator=true --batch-size=5 Test Results - https://phabricator.fb.com/P56130387 Additional tests for: ./db_bench --duration=60 --value_size=50 --seek_nexts=10 --reverse_iterator=true --use_uint64_comparator=true --batch_size=5 --key_size=8 --merge_operator=put ./db_bench --stats_interval_seconds=1 --num=1000 --bloom_locality=1 --seed=5 --threads=5 --merge_operator=uint64add Results: https://phabricator.fb.com/P56130607 Reviewers: yhchiang, sdong Reviewed By: sdong Subscribers: dhruba Differential Revision: https://reviews.facebook.net/D53991	2016-02-16 06:17:31 -08:00
sdong	92a9ccf1a6	Add a new compaction priority that picks file whose overlapping ratio is smallest Summary: Add a new compaction priority as following: For every file, we calculate total size of files overlapping with the file in the next level, over the file's size itself. The file with smallest ratio will be picked first. My "db_bench --fillrandom" shows about 5% less compaction than kOldestSmallestSeqFirst if --hard_pending_compaction_bytes_limit value to keep LSM tree in shape. If not limiting hard_pending_compaction_bytes_limit, improvement is only 1% or 2%. Test Plan: Add a unit test Reviewers: andrewkr, kradhakrishnan, anthony, IslamAbdelRahman, yhchiang Reviewed By: yhchiang Subscribers: MarkCallaghan, leveldb, dhruba Differential Revision: https://reviews.facebook.net/D54075	2016-02-11 15:59:19 -08:00
Peter Mattis	239aaf2fc0	Use user_comparator when comparing against iterate_upper_bound. Fixes #983.	2016-02-11 08:47:16 -05:00

2236 Commits