rocksdb

Author	SHA1	Message	Date
Islam AbdelRahman	d52f334cbd	Break stalls when no bg work is happening Summary: Current stall will keep sleeping even if there is no Flush/Compactions to wait for, I changed the logic to break the stall if we are not flushing or compacting db_bench command used ``` # fillrandom # memtable size = 10MB # value size = 1 MB # num = 1000 # use /dev/shm ./db_bench --benchmarks="fillrandom,stats" --value_size=1048576 --write_buffer_size=10485760 --num=1000 --delayed_write_rate=XXXXX --db="/dev/shm/new_stall" \| grep "Cumulative stall" ``` ``` Current results # delayed_write_rate = 1000 Kb/sec Cumulative stall: 00:00:9.031 H:M:S # delayed_write_rate = 200 Kb/sec Cumulative stall: 00:00:22.314 H:M:S # delayed_write_rate = 100 Kb/sec Cumulative stall: 00:00:42.784 H:M:S # delayed_write_rate = 50 Kb/sec Cumulative stall: 00:01:23.785 H:M:S # delayed_write_rate = 25 Kb/sec Cumulative stall: 00:02:45.702 H:M:S ``` ``` New results # delayed_write_rate = 1000 Kb/sec Cumulative stall: 00:00:9.017 H:M:S # delayed_write_rate = 200 Kb/sec Cumulative stall: 00 Closes https://github.com/facebook/rocksdb/pull/1884 Differential Revision: D4585439 Pulled By: IslamAbdelRahman fbshipit-source-id: aed2198	2017-03-16 18:24:17 -07:00
Xiaofei Du	7106a994fe	Use monotonic time points in write_controller.cc and rate_limiter.cc Summary: NowMicros() provides non-monotonic time. When wall clock is synchronized or changed, the non-monotonicity time points will affect write rate controllers. This patch changes write_controller.cc and rate_limiter.cc to use monotonic time points. Closes https://github.com/facebook/rocksdb/pull/1865 Differential Revision: D4561732 Pulled By: siying fbshipit-source-id: 95ece62	2017-02-14 18:24:24 -08:00
Siying Dong	cd7c4143d7	Improve Write Stalling System Summary: Current write stalling system has the problem of lacking of positive feedback if the restricted rate is already too low. Users sometimes stack in very low slowdown value. With the diff, we add a positive feedback (increasing the slowdown value) if we recover from slowdown state back to normal. To avoid the positive feedback to keep the slowdown value to be to high, we add issue a negative feedback every time we are close to the stop condition. Experiments show it is easier to reach a relative balance than before. Also increase level0_stop_writes_trigger default from 24 to 32. Since level0_slowdown_writes_trigger default is 20, stop trigger 24 only gives four files as the buffer time to slowdown writes. In order to avoid stop in four files while 20 files have been accumulated, the slowdown value must be very low, which is amost the same as stop. It also doesn't give enough time for the slowdown value to converge. Increase it to 32 will smooth out the system. Closes https://github.com/facebook/rocksdb/pull/1562 Differential Revision: D4218519 Pulled By: siying fbshipit-source-id: 95e4088	2016-11-23 09:24:15 -08:00
Lijun Tang	adb665e0bf	Allowed delayed_write_rate option to be dynamically set. Summary: Closes https://github.com/facebook/rocksdb/pull/1488 Differential Revision: D4157784 Pulled By: siying fbshipit-source-id: f150081	2016-11-12 15:54:11 -08:00
Baraa Hamodi	21e95811d1	Updated all copyright headers to the new format.	2016-02-09 15:12:00 -08:00
Venkatesh Radhakrishnan	3b2a1ddd2e	Add options.base_background_compactions as a number of compaction threads for low compaction debt Summary: If options.base_background_compactions is given, we try to schedule number of compactions not existing this number, only when L0 files increase to certain number, or pending compaction bytes more than certain threshold, we schedule compactions based on options.max_background_compactions. The watermarks are calculated based on slowdown thresholds. Test Plan: Add new test cases in column_family_test. Adding more unit tests. Reviewers: IslamAbdelRahman, yhchiang, kradhakrishnan, rven, anthony Reviewed By: anthony Subscribers: leveldb, dhruba, yoshinorim Differential Revision: https://reviews.facebook.net/D53409	2016-01-29 16:15:53 -08:00
sdong	b9f77ba12b	When slowdown is triggered, reduce the write rate Summary: It's usually hard for users to set a value of options.delayed_write_rate. With this diff, after slowdown condition triggers, we greedily reduce write rate if estimated pending compaction bytes increase. If estimated compaction pending bytes drop, we increase the write rate. Test Plan: Add a unit test Test with db_bench setting: TEST_TMPDIR=/dev/shm/ ./db_bench --benchmarks=fillrandom -num=10000000 --soft_pending_compaction_bytes_limit=1000000000 --hard_pending_compaction_bytes_limit=3000000000 --delayed_write_rate=100000000 and make sure without the commit, write stop will happen, but with the commit, it will not happen. Reviewers: igor, anthony, rven, yhchiang, kradhakrishnan, IslamAbdelRahman Reviewed By: IslamAbdelRahman Subscribers: leveldb, dhruba Differential Revision: https://reviews.facebook.net/D52131	2015-12-23 11:33:15 -08:00
sdong	7842920be5	Slow down writes by bytes written Summary: We slow down data into the database to the rate of options.delayed_write_rate (a new option) with this patch. The thread synchronization approach I take is to still synchronize write controller by DB mutex and GetDelay() is inside DB mutex. Try to minimize the frequency of getting time in GetDelay(). I verified it through db_bench and it seems to work hard_rate_limit is deprecated. options.delayed_write_rate is still not dynamically changeable. Need to work on it as a follow-up. Test Plan: Add new unit tests in db_test Reviewers: yhchiang, rven, kradhakrishnan, anthony, MarkCallaghan, igor Reviewed By: igor Subscribers: ikabiljo, leveldb, dhruba Differential Revision: https://reviews.facebook.net/D36351	2015-06-11 20:42:18 -07:00
Igor Canadi	04ce1b25f3	Fix #284	2014-09-13 14:14:10 -07:00
Igor Canadi	a2bb7c3c33	Push- instead of pull-model for managing Write stalls Summary: Introducing WriteController, which is a source of truth about per-DB write delays. Let's define an DB epoch as a period where there are no flushes and compactions (i.e. new epoch is started when flush or compaction finishes). Each epoch can either: * proceed with all writes without delay * delay all writes by fixed time * stop all writes The three modes are recomputed at each epoch change (flush, compaction), rather than on every write (which is currently the case). When we have a lot of column families, our current pull behavior adds a big overhead, since we need to loop over every column family for every write. With new push model, overhead on Write code-path is minimal. This is just the start. Next step is to also take care of stalls introduced by slow memtable flushes. The final goal is to eliminate function MakeRoomForWrite(), which currently needs to be called for every column family by every write. Test Plan: make check for now. I'll add some unit tests later. Also, perf test. Reviewers: dhruba, yhchiang, MarkCallaghan, sdong, ljin Reviewed By: ljin Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D22791	2014-09-08 11:20:25 -07:00

10 Commits