rocksdb

Author	SHA1	Message	Date
Islam AbdelRahman	3ce3bb3da2	Allowing L0 -> L1 trivial move on sorted data Summary: This diff updates the logic of how we do trivial move, now trivial move can run on any number of files in input level as long as they are not overlapping The conditions for trivial move have been updated Introduced conditions: - Trivial move cannot happen if we have a compaction filter (except if the compaction is not manual) - Input level files cannot be overlapping Removed conditions: - Trivial move only run when the compaction is not manual - Input level should can contain only 1 file More context on what tests failed because of Trivial move ``` DBTest.CompactionsGenerateMultipleFiles This test is expecting compaction on a file in L0 to generate multiple files in L1, this test will fail with trivial move because we end up with one file in L1 ``` ``` DBTest.NoSpaceCompactRange This test expect compaction to fail when we force environment to report running out of space, of course this is not valid in trivial move situation because trivial move does not need any extra space, and did not check for that ``` ``` DBTest.DropWrites Similar to DBTest.NoSpaceCompactRange ``` ``` DBTest.DeleteObsoleteFilesPendingOutputs This test expect that a file in L2 is deleted after it's moved to L3, this is not valid with trivial move because although the file was moved it is now used by L3 ``` ``` CuckooTableDBTest.CompactionIntoMultipleFiles Same as DBTest.CompactionsGenerateMultipleFiles ``` This diff is based on a work by @sdong https://reviews.facebook.net/D34149 Test Plan: make -j64 check Reviewers: rven, sdong, igor Reviewed By: igor Subscribers: yhchiang, ott, march, dhruba, sdong Differential Revision: https://reviews.facebook.net/D34797	2015-06-04 16:51:25 -07:00
Yueh-Hsuan Chiang	bb808eaddb	Changed the CompactionJobStats::output_key_prefix type from char[] to string. Summary: Keys in RocksDB can be arbitrary byte strings. However, in the current CompactionJobStats, smallest_output_key_prefix and largest_output_key_prefix are of type char[] without having a length, which is insufficient to handle non-null terminated strings. This patch change their type to std::string. Test Plan: compaction_job_stats_test Reviewers: igor, rven, IslamAbdelRahman, kradhakrishnan, sdong Reviewed By: sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D39537	2015-06-04 12:31:12 -07:00
Yueh-Hsuan Chiang	0b3172d071	Add EventListener::OnTableFileDeletion() Summary: Add EventListener::OnTableFileDeletion(), which will be called when a table file is deleted. Test Plan: Extend three existing tests in db_test to verify the deleted files. Reviewers: rven, anthony, kradhakrishnan, igor, sdong Reviewed By: sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D38931	2015-06-03 19:57:01 -07:00
Igor Canadi	2d0b9e5f0a	Fix compile on darwin Summary: We need to start doing some CI on Macs. Test Plan: works now Reviewers: yhchiang Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D39489	2015-06-03 16:52:51 -04:00
sdong	3af668ed17	Fix DBTest.MigrateToDynamicLevelMaxBytesBase slowness with valgrind Summary: DBTest.MigrateToDynamicLevelMaxBytesBase with valgrind test is extremely slow. Work it around by not having both threads running everything non-stop. Test Plan: Run the test with valgrind which used to take too long to finish and see it finish in reasonable time. Reviewers: yhchiang, anthony, rven, kradhakrishnan, igor Reviewed By: igor Subscribers: leveldb, dhruba Differential Revision: https://reviews.facebook.net/D39477	2015-06-03 12:08:37 -07:00
Igor Canadi	408cc4b8e0	Revert "Merge pull request #621 from rdallman/c-slice-parts-support" This reverts commit `78382d4ba7`, reversing changes made to `ca8b85ac04`.	2015-06-03 13:34:07 -04:00
Igor Canadi	78382d4ba7	Merge pull request #621 from rdallman/c-slice-parts-support C: add support for WriteBatch SliceParts params	2015-06-03 13:16:14 -04:00
Yueh-Hsuan Chiang	0483dab2ab	Remove a TODO that has been done Summary: Remove a TODO that has been done Test Plan: make Reviewers: sdong, igor Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D39429	2015-06-02 18:38:57 -07:00
Yueh-Hsuan Chiang	8afafc2783	Fix compile warning in db/db_impl Summary: Fix the following compile warning in db/db_impl db/db_impl.cc:1603:19: error: implicit conversion loses integer precision: 'const uint64_t' (aka 'const unsigned long') to 'int' [-Werror,-Wshorten-64-to-32] info.job_id = job_id; ~ ^~~~~~ Test Plan: db_test Reviewers: sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D39423	2015-06-02 17:36:45 -07:00
Yueh-Hsuan Chiang	fe5c6321cb	Allow EventListener::OnCompactionCompleted to return CompactionJobStats. Summary: Allow EventListener::OnCompactionCompleted to return CompactionJobStats, which contains useful information about a compaction. Example CompactionJobStats returned by OnCompactionCompleted(): smallest_output_key_prefix 05000000 largest_output_key_prefix 06990000 elapsed_time 42419 num_input_records 300 num_input_files 3 num_input_files_at_output_level 2 num_output_records 200 num_output_files 1 actual_bytes_input 167200 actual_bytes_output 110688 total_input_raw_key_bytes 5400 total_input_raw_value_bytes 300000 num_records_replaced 100 is_manual_compaction 1 Test Plan: Developed a mega test in db_test which covers 20 variables in CompactionJobStats. Reviewers: rven, igor, anthony, sdong Reviewed By: sdong Subscribers: tnovak, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D38463	2015-06-02 17:07:16 -07:00
Yueh-Hsuan Chiang	3083ed2129	Fixed heap-use-after-free error in compaction_job_test.cc Summary: Fixed heap-use-after-free error in compaction_job_test.cc Test Plan: compaction_job_test Reviewers: sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D39411	2015-06-02 16:23:01 -07:00
Yueh-Hsuan Chiang	ab946af08a	Fix a compile warning in listener_test.cc Summary: Fixed the following compile warning in listener_test.cc: db/listener_test.cc:214:8: error: 'OnTableFileCreated' overrides a member function but is not marked 'override' [-Werror,-Winconsistent-missing-override] 14:16:46 void OnTableFileCreated( Test Plan: make listener_test Reviewers: sdong, igor Subscribers: leveldb	2015-06-02 14:20:27 -07:00
Yueh-Hsuan Chiang	fc83821270	Add EventListener::OnTableFileCreated() Summary: Add EventListener::OnTableFileCreated(), which will be called when a table file is created. This patch is part of the EventLogger and EventListener integration. Test Plan: Augment existing test in db/listener_test.cc Reviewers: anthony, kradhakrishnan, rven, igor, sdong Reviewed By: sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D38865	2015-06-02 14:12:23 -07:00
Yueh-Hsuan Chiang	898e803fc5	Add a stats counter for DB_WRITE back which was mistakenly removed. Summary: Add a stats counter for DB_WRITE back which was mistakenly removed. Test Plan: augment GroupCommitTest Reviewers: sdong Reviewed By: sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D39399	2015-06-02 12:35:12 -07:00
Mike Kolupaev	ec7a944360	more times in perf_context and iostats_context Summary: We occasionally get write stalls (>1s Write() calls) on HDD under read load. The following timers explain almost all of the stalls: - perf_context.db_mutex_lock_nanos - perf_context.db_condition_wait_nanos - iostats_context.open_time - iostats_context.allocate_time - iostats_context.write_time - iostats_context.range_sync_time - iostats_context.logger_time In my experiments each of these occasionally takes >1s on write path under some workload. There are rare cases when Write() takes long but none of these takes long. Test Plan: Added code to our application to write the listed timings to log for slow writes. They usually add up to almost exactly the time Write() call took. Reviewers: rven, yhchiang, sdong Reviewed By: sdong Subscribers: march, dhruba, tnovak Differential Revision: https://reviews.facebook.net/D39177	2015-06-02 02:07:58 -07:00
sdong	4266d4fd90	Allow users to migrate to options.level_compaction_dynamic_level_bytes=true using CompactRange() Summary: In DB::CompactRange(), change parameter "reduce_level" to "change_level". Users can compact all data to the last level if needed. By doing it, users can migrate the DB to options.level_compaction_dynamic_level_bytes=true. Test Plan: Add a unit test for it. Reviewers: yhchiang, anthony, kradhakrishnan, igor, rven Reviewed By: rven Subscribers: leveldb, dhruba Differential Revision: https://reviews.facebook.net/D39099	2015-06-01 18:21:14 -07:00
Yueh-Hsuan Chiang	d333820bad	Removed DBImpl::notifying_events_ Summary: DBImpl::notifying_events_ is a internal counter in DBImpl which is used to prevent DB close when DB is notifying events. However, as the current events all rely on either compaction or flush which already have similar counters to prevent DB close, it is safe to remove notifying_events_. Test Plan: listener_test examples/compact_files_example Reviewers: igor, anthony, kradhakrishnan, rven, sdong Reviewed By: sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D39315	2015-06-01 15:32:23 -07:00
Igor Canadi	a187e66ad0	Merge pull request #617 from rdallman/wb-merge-sliceparts WriteBatch.Merge w/ SliceParts support	2015-05-31 13:34:34 -04:00
Igor Canadi	4c181f08bc	Fix compile on darwin Summary: As title Test Plan: make check Reviewers: anthony Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D39243	2015-05-30 12:25:45 -04:00
agiardullo	bc7a7a400c	fix LITE build Summary: Broken by optimistic transaction diff. (I only built 'release' not 'static_lib' when testing). Test Plan: build Reviewers: yhchiang, sdong, igor Reviewed By: igor Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D39219	2015-05-29 15:22:00 -07:00
agiardullo	dc9d70de65	Optimistic Transactions Summary: Optimistic transactions supporting begin/commit/rollback semantics. Currently relies on checking the memtable to determine if there are any collisions at commit time. Not yet implemented would be a way of enuring the memtable has some minimum amount of history so that we won't fail to commit when the memtable is empty. You should probably start with transaction.h to get an overview of what is currently supported. Test Plan: Added a new test, but still need to look into stress testing. Reviewers: yhchiang, igor, rven, sdong Reviewed By: sdong Subscribers: adamretter, MarkCallaghan, leveldb, dhruba Differential Revision: https://reviews.facebook.net/D33435	2015-05-29 14:36:35 -07:00
Reed Allman	21cd6b7ad8	C: add support for WriteBatch SliceParts params	2015-05-29 10:23:43 -07:00
Reed Allman	a0635ba3f6	WriteBatch.Merge w/ SliceParts support also hooked up WriteBatchInternal	2015-05-29 04:30:03 -07:00
agiardullo	c815351038	Support saving history in memtable_list Summary: For transactions, we are using the memtables to validate that there are no write conflicts. But after flushing, we don't have any memtables, and transactions could fail to commit. So we want to someone keep around some extra history to use for conflict checking. In addition, we want to provide a way to increase the size of this history if too many transactions fail to commit. After chatting with people, it seems like everyone prefers just using Memtables to store this history (instead of a separate history structure). It seems like the best place for this is abstracted inside the memtable_list. I decide to create a separate list in MemtableListVersion as using the same list complicated the flush/installalflushresults logic too much. This diff adds a new parameter to control how much memtable history to keep around after flushing. However, it sounds like people aren't too fond of adding new parameters. So I am making the default size of flushed+not-flushed memtables be set to max_write_buffers. This should not change the maximum amount of memory used, but make it more likely we're using closer the the limit. (We are now postponing deleting flushed memtables until the max_write_buffer limit is reached). So while we might use more memory on average, we are still obeying the limit set (and you could argue it's better to go ahead and use up memory now instead of waiting for a write stall to happen to test this limit). However, if people are opposed to this default behavior, we can easily set it to 0 and require this parameter be set in order to use transactions. Test Plan: Added a xfunc test to play around with setting different values of this parameter in all tests. Added testing in memtablelist_test and planning on adding more testing here. Reviewers: sdong, rven, igor Reviewed By: igor Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D37443	2015-05-28 16:34:24 -07:00
Yueh-Hsuan Chiang	ec4ff4e99c	Rename EventLoggerHelpers EventHelpers Summary: Rename EventLoggerHelpers EventHelpers, as it's going to include all event-related helper functions instead of EventLogger only stuffs. Test Plan: make Reviewers: sdong, rven, anthony Reviewed By: anthony Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D39093	2015-05-28 13:37:47 -07:00
Yueh-Hsuan Chiang	672dda9b3b	[API Change] Move listeners from ColumnFamilyOptions to DBOptions Summary: Move listeners from ColumnFamilyOptions to DBOptions Test Plan: listener_test compact_files_test Reviewers: rven, anthony, sdong Reviewed By: sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D39087	2015-05-28 13:21:39 -07:00
Yueh-Hsuan Chiang	3ab8ffd4dd	Compaction now conditionally boosts the size of deletion entries. Summary: Compaction now boosts the size of deletion entries of a file only when the number of deletion entries is greater than the number of non-deletion entries in the file. The motivation here is that in a stable workload, the number of deletion entries should be roughly equal to the number of non-deletion entries. If we compensate the size of deletion entries in a stable workload, the deletion compensation logic might introduce unwanted effet which changes the shape of LSM tree. Test Plan: db_test --gtest_filter="Deletion" Reviewers: sdong, igor Reviewed By: igor Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D38703	2015-05-26 14:05:38 -07:00
Igor Canadi	a81ac24127	Merge pull request #615 from rdallman/master C: add more block based table stuff, some aux slice transform/merge ops	2015-05-26 14:19:31 -04:00
Yueh-Hsuan Chiang	6d299b70b8	Fixed a bug in EventLoggerHelpers::LogTableFileCreation Summary: Fixed a missing "}" at the end of the generated JSON Log in EventLoggerHelpers::LogTableFileCreation. Test Plan: db_bench Reviewers: igor Reviewed By: igor Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D38919	2015-05-26 10:55:46 -07:00
Yueh-Hsuan Chiang	a0580205c8	Removed an unused private variable in db_impl.h Summary: Removed an unused private variable in db_impl.h Test Plan: make db_test Reviewers: sdong, anthony, igor Reviewed By: igor Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D38925	2015-05-26 10:46:26 -07:00
Reed Allman	9c38ce1d02	C: extra bbto / noop slice transform	2015-05-22 22:56:28 -07:00
Igor Canadi	ea6d3a8ac0	Don't skip last level when calculating compaction stats Summary: We have a bug where we don't report the last level's files as being compacted. This fixes it. Test Plan: See the fix in action here: https://phabricator.fb.com/P19845738 Reviewers: MarkCallaghan, sdong Reviewed By: sdong Subscribers: yhchiang, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D38727	2015-05-22 15:30:43 -04:00
Yueh-Hsuan Chiang	5c224d1b70	Fixed two bugs on logging file deletion. Summary: This patch fixes the following two bugs on logging file deletion. 1. Previously, file deletion failure was only logged in INFO_LEVEL. This patch changes it to ERROR_LEVEL and does some code clean. 2. EventLogger previously will always generate the same log on table file deletion even when file deletion is not successful. Now the resulting status of file deletion will also be logged. Test Plan: make all check Reviewers: sdong, igor Reviewed By: igor Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D38817	2015-05-22 12:10:51 -07:00
Yueh-Hsuan Chiang	dc81efe415	Change the log-level of DB summary and options from INFO_LEVEL to WARN_LEVEL Summary: Change the log-level of DB summary and options from INFO_LEVEL to WARN_LEVEL Test Plan: Use db_bench to verify the log level. Sample output: 2015/05/22-00:20:39.778064 7fff75b41300 [WARN] RocksDB version: 3.11.0 2015/05/22-00:20:39.778095 7fff75b41300 [WARN] Git sha rocksdb_build_git_sha:7fee8775a459134c4cb04baae5bd1687e268f2a0 2015/05/22-00:20:39.778099 7fff75b41300 [WARN] Compile date May 22 2015 2015/05/22-00:20:39.778101 7fff75b41300 [WARN] DB SUMMARY 2015/05/22-00:20:39.778145 7fff75b41300 [WARN] SST files in /tmp/rocksdbtest-691931916/dbbench dir, Total Num: 0, files: 2015/05/22-00:20:39.778148 7fff75b41300 [WARN] Write Ahead Log file in /tmp/rocksdbtest-691931916/dbbench: 2015/05/22-00:20:39.778150 7fff75b41300 [WARN] Options.error_if_exists: 0 2015/05/22-00:20:39.778152 7fff75b41300 [WARN] Options.create_if_missing: 1 2015/05/22-00:20:39.778153 7fff75b41300 [WARN] Options.paranoid_checks: 1 Reviewers: MarkCallaghan, igor, kradhakrishnan Reviewed By: igor Subscribers: sdong, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D38835	2015-05-22 11:54:59 -07:00
Yueh-Hsuan Chiang	687214f878	Ensure ColumnFamilyOptions.num_levels >= 2 when level compaction is used. Summary: Ensure ColumnFamilyOptions.num_levels >= 2 when level compaction is used. Test Plan: Extend SanitizeOptions test in column_family_test Reviewers: sdong, rven, anthony, krishnanm86, igor Reviewed By: igor Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D38829	2015-05-22 11:35:40 -07:00
Yueh-Hsuan Chiang	2abb592688	Avoid logging under mutex in DBImpl::WriteLevel0TableForRecovery(). Summary: Avoid logging under mutex in DBImpl::WriteLevel0TableForRecovery(). Test Plan: make all check Reviewers: igor, sdong Reviewed By: sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D38823	2015-05-22 11:24:12 -07:00
Yueh-Hsuan Chiang	7fee8775a4	Allow EventLogger to directly log from a JSONWriter. Summary: Allow EventLogger to directly log from a JSONWriter. This allows the JSONWriter to be shared by EventLogger and potentially EventListener, which is an important step to integrate EventLogger and EventListener. This patch also rewrites EventLoggerHelpers::LogTableFileCreation(), which uses the new API to generate identical log. Test Plan: Run db_bench in debug mode and make sure the log is correct and no assertions fail. Reviewers: sdong, anthony, igor Reviewed By: igor Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D38709	2015-05-21 15:39:30 -07:00
Igor Canadi	7a3577519f	Don't artificially inflate L0 score Summary: This turns out to be pretty bad because if we prioritize L0->L1 then L1 can grow artificially large, which makes L0->L1 more and more expensive. For example: 256MB @ L0 + 256MB @ L1 --> 512MB @ L1 256MB @ L0 + 512MB @ L1 --> 768MB @ L1 256MB @ L0 + 768MB @ L1 --> 1GB @ L1 .... 256MB @ L0 + 10GB @ L1 --> 10.2GB @ L1 At some point we need to start compacting L1->L2 to speed up L0->L1. Test Plan: The performance improvement is massive for heavy write workload. This is the benchmark I ran: https://phabricator.fb.com/P19842671. Before this change, the benchmark took 47 minutes to complete. After, the benchmark finished in 2minutes. You can see full results here: https://phabricator.fb.com/P19842674 Also, we ran this diff on MongoDB on RocksDB on one replicaset. Before the change, our initial sync was so slow that it couldn't keep up with primary writes. After the change, the import finished without any issues Reviewers: dynamike, MarkCallaghan, rven, yhchiang, sdong Reviewed By: sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D38637	2015-05-21 11:40:48 -07:00
Yueh-Hsuan Chiang	e2c1d4b57f	[Public API Change] Make DB::GetDbIdentity() be const function. Summary: Make DB::GetDbIdentity() be const function. Test Plan: make db_test Reviewers: igor, rven, sdong Reviewed By: sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D38745	2015-05-21 11:01:48 -07:00
Yueh-Hsuan Chiang	812c461c96	Dump db stats in WARN level Summary: Dump db stats in WARN level Test Plan: run db_bench and verify the LOG Reviewers: igor, MarkCallaghan Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D38691	2015-05-19 18:42:17 -07:00
Mark Callaghan	944043d683	Add --wal_bytes_per_sync for db_bench and more IO stats Summary: See https://gist.github.com/mdcallag/89ebb2b8cbd331854865 for the IO stats. I added "Cumulative compaction:" and "Interval compaction:" lines. The IO rates can be confusing. Rates fro per-level stats lines, Wr(MB/s) & Rd(MB/s), are computed using the duration of the compaction job. If the job reads 10MB, writes 9MB and the job (IO & merging) takes 1 second then the rates are 10MB/s for read and 9MB/s for writes. The IO rates in the Cumulative compaction line uses the total uptime. The IO rates in the Interval compaction line uses the interval uptime. So these Cumalative & Interval compaction IO rates cannot be compared to the per-level IO rates. But both forms of the rates are useful for debugging perf. Task ID: # Blame Rev: Test Plan: run db_bench Revert Plan: Database Impact: Memcache Impact: Other Notes: EImportant: - begin PUBLIC platform impact section - Bugzilla: # - end platform impact - Reviewers: igor Reviewed By: igor Subscribers: dhruba Differential Revision: https://reviews.facebook.net/D38667	2015-05-19 16:19:30 -07:00
Igor Canadi	04feaeebb9	Fix comparison between signed and usigned integers Summary: Not sure why this fails on some compilers and doesn't on others. Test Plan: none Reviewers: meyering, sdong Reviewed By: sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D38673	2015-05-19 10:59:30 -07:00
Igor Canadi	4a855c0799	Add an option wal_bytes_per_sync to control sync_file_range for WAL files Summary: sync_file_range is not always asyncronous and thus can block writes if we do this for WAL in the foreground thread. See more here: http://yoshinorimatsunobu.blogspot.com/2014/03/how-syncfilerange-really-works.html Some users don't want us to call sync_file_range on WALs. Some other do. Thus, I'm adding a separate option wal_bytes_per_sync to control calling sync_file_range on WAL files. bytes_per_sync will apply only to table files now. Test Plan: no more sync_file_range for WAL as evidenced by strace Reviewers: yhchiang, rven, sdong Reviewed By: sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D38253	2015-05-18 17:03:59 -07:00
Igor Canadi	b0fdda4ff0	Allow flushes to run in parallel with manual compaction Summary: As title. I spent some time thinking about it and I don't think there should be any issue with running manual compaction and flushes in parallel Test Plan: make check works Reviewers: rven, yhchiang, sdong Reviewed By: yhchiang, sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D38355	2015-05-18 15:34:33 -07:00
sdong	fb5bdbf987	DBTest.DynamicLevelMaxBytesCompactRange: make sure L0 is not empty before running compact range Summary: DBTest.DynamicLevelMaxBytesCompactRange needs to make sure L0 is not empty to properly cover the code paths we want to cover. However, current codes have a bug that might leave the condition not held. Improve the test to ensure it. Test Plan: Run the test in an environment that is used to fail. Also run it many times. Subscribers: leveldb, dhruba Differential Revision: https://reviews.facebook.net/D38631	2015-05-18 11:49:45 -07:00
sdong	6fa7085121	CompactRange skips levels 1 to base_level -1 for dynamic level base size Summary: CompactRange() now is much more expensive for dynamic level base size as it goes through all the levels. Skip those not used levels between level 0 an base level. Test Plan: Run all unit tests Reviewers: yhchiang, rven, anthony, kradhakrishnan, igor Reviewed By: igor Subscribers: leveldb, dhruba Differential Revision: https://reviews.facebook.net/D37125	2015-05-18 10:54:11 -07:00
Yueh-Hsuan Chiang	3f0867c0fe	Allow GetThreadList to report Flush properties. Summary: Allow GetThreadList to report Flush properties, which includes: * job id * number of bytes that has been written since flush started. * total size of input mem-tables Test Plan: ./db_bench --threads=30 --num=1000000 --benchmarks=fillrandom --thread_status_per_interval=100 --value_size=1000 Sample output from db_bench which tracks same flush job ThreadID ThreadType cfName Operation ElapsedTime Stage State OperationProperties 140213879898240 High Pri default Flush 5789 us FlushJob::WriteLevel0Table BytesMemtables 4112835 \| BytesWritten 577104 \| JobID 8 \| ThreadID ThreadType cfName Operation ElapsedTime Stage State OperationProperties 140213879898240 High Pri default Flush 30.634 ms FlushJob::WriteLevel0Table BytesMemtables 4112835 \| BytesWritten 1734865 \| JobID 8 \| Reviewers: rven, igor, sdong Reviewed By: sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D38505	2015-05-15 23:22:22 -07:00
Igor Canadi	7413306d94	Take a chance on a random file when choosing compaction Summary: When trying to compact entire database with SuggestCompactRange(), we'll first try the left-most files. This is pretty bad, because: 1) the left part of LSM tree will be overly compacted, but right part will not be touched 2) First compaction will pick up the left-most file. Second compaction will try to pick up next left-most, but this will not be possible, because there's a big chance that second's file range on N+1 level is already being compacted. I observe both of those problems when running Mongo+RocksDB and trying to compact the DB to clean up tombstones. I'm unable to clean them up :( This diff adds a bit of randomness into choosing a file. First, it chooses a file at random and tries to compact that one. This should solve both problems specified here. Test Plan: make check Reviewers: yhchiang, rven, sdong Reviewed By: sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D38379	2015-05-15 14:14:40 -07:00
sdong	5aad881298	DBTest.DynamicLevelMaxBytesBase2: remove an unnecesary check Summary: DBTest.DynamicLevelMaxBytesBase2 has a check that is not necessary and may fail. Remove it, and add two unrelated check. Test Plan: Run the test Reviewers: yhchiang, rven, kradhakrishnan, anthony, igor Reviewed By: igor Subscribers: leveldb, dhruba Differential Revision: https://reviews.facebook.net/D38457	2015-05-14 09:22:43 -07:00
sdong	ec43a8b9fb	Universal Compaction with multiple levels won't allocate up to output size Summary: Universal compactions with multiple levels should use file preallocation size based on file size if output level is not level 0 Test Plan: Run all tests. Reviewers: igor Reviewed By: igor Subscribers: leveldb, dhruba Differential Revision: https://reviews.facebook.net/D38439	2015-05-13 14:15:46 -07:00

1 2 3 4 5 ...

1677 Commits