rocksdb

Author	SHA1	Message	Date
Zhipeng Jia	728f944f0d	Fix computation of size of last sub-compaction	2015-12-22 18:37:51 +08:00
Igor Canadi	8ac7fb8377	Merge pull request #863 from zhangyybuaa/fix_hdfs_error Fix build error with hdfs	2015-12-22 09:27:51 +01:00
Igor Canadi	e53e8219ad	Merge pull request #894 from zhipeng-jia/develop Sorting std::vector instead of using std::set	2015-12-22 09:26:56 +01:00
Zhipeng Jia	e0abec1580	Sorting std::vector instead of using std::set	2015-12-22 14:34:57 +08:00
Alex Yang	33e09c0e19	add call to install superversion and schedule work in enableautocompactions Summary: This patch fixes https://github.com/facebook/mysql-5.6/issues/121 There is a recent change in rocksdb to disable auto compactions on startup: https://reviews.facebook.net/D51147. However, there is a small timing window where a column family needs to be compacted and schedules a compaction, but the scheduled compaction fails when it checks the disable_auto_compactions setting. The expectation is once the application is ready, it will call EnableAutoCompactions() to allow new compactions to go through. However, if the Column family is stalled because L0 is full, and no writes can go through, it is possible the column family may never have a new compaction request get scheduled. EnableAutoCompaction() should probably schedule a new flush and compaction event when it resets disable_auto_compaction. Using InstallSuperVersionAndScheduleWork, we call SchedulePendingFlush, SchedulePendingCompaction, as well as MaybeScheduleFlushOrCompaction on all the column families to avoid the situation above. This is still a first pass for feedback. Could also just call SchedulePendingFlush and SchedulePendingCompaction directly. Test Plan: Run on Asan build cd _build-5.6-ASan/ && ./mysql-test/mtr --mem --big --testcase-timeout=36000 --suite-timeout=12000 --parallel=16 --suite=rocksdb,rocksdb_rpl,rocksdb_sys_vars --mysqld=--default-storage-engine=rocksdb --mysqld=--skip-innodb --mysqld=--default-tmp-storage-engine=MyISAM --mysqld=--rocksdb rocksdb_rpl.rpl_rocksdb_stress_crash --repeat=1000 Ensure that it no longer hangs during the test. Reviewers: hermanlee4, yhchiang, anthony Reviewed By: anthony Subscribers: leveldb, yhchiang, dhruba Differential Revision: https://reviews.facebook.net/D51747	2015-12-21 10:06:49 -08:00
Siying Dong	22c6b50ee8	Merge pull request #893 from zhipeng-jia/develop Fix clang warning regarding implicit conversion	2015-12-21 10:01:22 -08:00
Zhipeng Jia	24c7dae130	Fix clang warning regarding implicit conversion	2015-12-21 23:57:55 +08:00
agiardullo	eff309867e	Do not use timed_mutex in TransactionDB Summary: Stopped using std::timed_mutex as it has known issues in older versions of gcc. Ran into these problems when testing MongoRocks. Test Plan: unit tests. Manual mongo testing on gcc 4.8. Reviewers: igor, yhchiang, rven, IslamAbdelRahman, kradhakrishnan, sdong Reviewed By: sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D52197	2015-12-18 17:26:02 -08:00
Reid Horuff	97ea8afaaf	compaction assertion triggering test fix for sequence zeroing assertion trip	2015-12-18 16:08:31 -08:00
Islam AbdelRahman	521da3abb3	Fix BlockBasedTableTest.BlockCacheLeak valgrind failure Summary: I added this line in my previous patch D48999 (which is incorrect) We should not release the iterator since releasing it will evict the blocks from cache Test Plan: Run the test under valgrind make check Reviewers: rven, yhchiang, sdong Reviewed By: sdong Subscribers: dhruba Differential Revision: https://reviews.facebook.net/D52161	2015-12-18 11:17:21 -08:00
Nathan Bronson	a48382399d	Fix use-after free in db_bench Test Plan: valgrind db_bench Reviewers: igor, sdong Reviewed By: sdong Subscribers: dhruba Differential Revision: https://reviews.facebook.net/D52101	2015-12-18 06:42:57 -08:00
Igor Canadi	bf8ffc1d60	Merge pull request #890 from zhipeng-jia/develop fix typo: sr to picking_sr	2015-12-18 10:08:45 +01:00
Zhipeng Jia	131f7ddf63	fix typo: sr to picking_sr	2015-12-18 17:02:36 +08:00
sdong	c37729a6a6	db_bench: --soft_pending_compaction_bytes_limit should set options.soft_pending_compaction_bytes_limit Summary: Fix a bug that options.soft_pending_compaction_bytes_limit is not actually set with --soft_pending_compaction_bytes_limit Test Plan: Run db_bench with this parameter and make sure the parameter is set correctly. Reviewers: anthony, kradhakrishnan, yhchiang, IslamAbdelRahman, igor, rven Reviewed By: rven Subscribers: leveldb, dhruba Differential Revision: https://reviews.facebook.net/D52125	2015-12-17 18:28:56 -08:00
Venkatesh Radhakrishnan	7b12ae97d4	Add signalall after removing item from manual_compaction deque Summary: When there are waiting manual compactions, we need to signal them after removing the current manual compaction from the deque. Test Plan: ColumnFamilytTest.SameCFManualManualCommaction Reviewers: anthony, IslamAbdelRahman, kradhakrishnan, sdong Reviewed By: sdong Subscribers: dhruba, yoshinorim Differential Revision: https://reviews.facebook.net/D52119	2015-12-17 16:59:00 -08:00
sdong	d72b31774e	Slowdown when writing to the last write buffer Summary: Now if inserting to mem table is much faster than writing to files, there is no mechanism users can rely on to avoid stopping for reaching options.max_write_buffer_number. With the commit, if there are more than four maximum write buffers configured, we slow down to the rate of options.delayed_write_rate while we reach the last one. Test Plan: 1. Add a new unit test. 2. Run db_bench with ./db_bench --benchmarks=fillrandom --num=10000000 --max_background_flushes=6 --batch_size=32 -max_write_buffer_number=4 --delayed_write_rate=500000 --statistics based on hard drive and see stopping is avoided with the commit. Reviewers: yhchiang, IslamAbdelRahman, anthony, rven, kradhakrishnan, igor Reviewed By: igor Subscribers: MarkCallaghan, leveldb, dhruba Differential Revision: https://reviews.facebook.net/D52047	2015-12-17 10:49:08 -08:00
Venkatesh Radhakrishnan	6b2a3ac92c	Add documentation for unschedFunction Summary: Documenting the unschedFunction parameter to Schedule as requested by Michael Kolupaev. Test Plan: build, unit test Reviewers: sdong, IslamAbdelRahman Reviewed By: IslamAbdelRahman Subscribers: kolmike, dhruba Differential Revision: https://reviews.facebook.net/D52089	2015-12-17 10:41:39 -08:00
sdong	167fb919a5	ZSTD to use CompressionOptions.level Summary: Now ZSTD hard code level 1. Change it to use the compression level setting. Test Plan: Run it with hacked codes of sst_dump and show ZSTD compression sizes with different levels. Reviewers: rven, anthony, yhchiang, kradhakrishnan, igor, IslamAbdelRahman Reviewed By: IslamAbdelRahman Subscribers: yoshinorim, leveldb, dhruba Differential Revision: https://reviews.facebook.net/D52041	2015-12-16 16:58:04 -08:00
Islam AbdelRahman	32ff05e971	Bump version to 4.4 Summary: Bump version to 4.4 Test Plan: none Reviewers: sdong, rven, yhchiang, anthony, kradhakrishnan Reviewed By: kradhakrishnan Subscribers: dhruba Differential Revision: https://reviews.facebook.net/D52035	2015-12-16 14:32:58 -08:00
Islam AbdelRahman	aececc209e	Introduce ReadOptions::pin_data (support zero copy for keys) Summary: This patch updates the Iterator API to introduce new functions that allow users to keep the Slices returned by key() valid as long as the Iterator is not deleted ReadOptions::pin_data : If true keep loaded blocks in memory as long as the iterator is not deleted Iterator::IsKeyPinned() : If true, this means that the Slice returned by key() is valid as long as the iterator is not deleted Also add a new option BlockBasedTableOptions::use_delta_encoding to allow users to disable delta_encoding if needed. Benchmark results (using https://phabricator.fb.com/P20083553) ``` // $ du -h /home/tec/local/normal.4K.Snappy/db10077 // 6.1G /home/tec/local/normal.4K.Snappy/db10077 // $ du -h /home/tec/local/zero.8K.LZ4/db10077 // 6.4G /home/tec/local/zero.8K.LZ4/db10077 // Benchmarks for shard db10077 // _build/opt/rocks/benchmark/rocks_copy_benchmark \ // --normal_db_path="/home/tec/local/normal.4K.Snappy/db10077" \ // --zero_db_path="/home/tec/local/zero.8K.LZ4/db10077" // First run // ============================================================================ // rocks/benchmark/RocksCopyBenchmark.cpp relative time/iter iters/s // ============================================================================ // BM_StringCopy 1.73s 576.97m // BM_StringPiece 103.74% 1.67s 598.55m // ============================================================================ // Match rate : 1000000 / 1000000 // Second run // ============================================================================ // rocks/benchmark/RocksCopyBenchmark.cpp relative time/iter iters/s // ============================================================================ // BM_StringCopy 611.99ms 1.63 // BM_StringPiece 203.76% 300.35ms 3.33 // ============================================================================ // Match rate : 1000000 / 1000000 ``` Test Plan: Unit tests Reviewers: sdong, igor, anthony, yhchiang, rven Reviewed By: rven Subscribers: dhruba, lovro, adsharma Differential Revision: https://reviews.facebook.net/D48999	2015-12-16 12:08:30 -08:00
Igor Canadi	e6e505a4d9	Fix examples Summary: For some reason `make librocksdb.a` is not valid anymore. Replace with `make static_lib` Test Plan: cd examples/; make all; Reviewers: sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D52017	2015-12-16 17:04:46 +01:00
Igor Canadi	aa29cc1289	Improve examples/README.md	2015-12-16 16:27:37 +01:00
Gunnar Kudrjavets	97265f5f14	Fix minor bugs in delete operator, snprintf, and size_t usage Summary: List of changes: 1) Fix the snprintf() usage in cases where wrong variable was used to determine the output buffer size. 2) Remove unnecessary checks before calling delete operator. 3) Increase code correctness by using size_t type when getting vector's size. 4) Unify the coding style by removing namespace::std usage at the top of the file to confirm to the majority usage. 5) Fix various lint errors pointed out by 'arc lint'. Test Plan: Code review and build: git diff make clean make -j 32 commit-prereq arc lint Reviewers: kradhakrishnan, sdong, rven, anthony, yhchiang, igor Reviewed By: igor Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D51849	2015-12-15 15:26:20 -08:00
Islam AbdelRahman	b68dc0f83e	Merge pull request #885 from yuslepukhin/fix_size_t_formatting Use port size_t formatting	2015-12-15 14:00:09 -08:00
Dmitri Smirnov	b6d19adcf7	Use port size_t formatting	2015-12-15 11:34:22 -08:00
Igor Canadi	963660eb55	Merge pull request #883 from zhipeng-jia/master Fix typo	2015-12-15 18:12:19 +01:00
Zhipeng Jia	99ae549d37	Fix typo	2015-12-15 23:47:47 +08:00
Islam AbdelRahman	636cd3c714	Clean up listener_test (reuse db_test_util) Summary: Reuse db_test_util in listener_test Test Plan: make listener_test -j64 && ./listener_test USE_CLANG=1 make listener_test -j64 && ./listener_test Reviewers: yhchiang, rven, kradhakrishnan, anthony Reviewed By: anthony Subscribers: dhruba Differential Revision: https://reviews.facebook.net/D51939	2015-12-14 13:36:32 -08:00
Venkatesh Radhakrishnan	030215bf01	Running manual compactions in parallel with other automatic or manual compactions in restricted cases Summary: This diff provides a framework for doing manual compactions in parallel with other compactions. We now have a deque of manual compactions. We also pass manual compactions as an argument from RunManualCompactions down to BackgroundCompactions, so that RunManualCompactions can be reentrant. Parallelism is controlled by the two routines ConflictingManualCompaction to allow/disallow new parallel/manual compactions based on already existing ManualCompactions. In this diff, by default manual compactions still have to run exclusive of other compactions. However, by setting the compaction option, exclusive_manual_compaction to false, it is possible to run other compactions in parallel with a manual compaction. However, we are still restricted to one manual compaction per column family at a time. All of these restrictions will be relaxed in future diffs. I will be adding more tests later. Test Plan: Rocksdb regression + new tests + valgrind Reviewers: igor, anthony, IslamAbdelRahman, kradhakrishnan, yhchiang, sdong Reviewed By: sdong Subscribers: yoshinorim, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D47973	2015-12-14 11:20:34 -08:00
Islam AbdelRahman	d26a4ea621	Merge pull request #882 from SherlockNoMad/BuildFix Fix appVeyor Build problem	2015-12-11 21:27:10 -08:00
SherlockNoMad	768a61486c	Fix appVeyor Build problem	2015-12-11 21:10:49 -08:00
agiardullo	84f98792d6	Transaction::SetWriteOptions() Summary: Add support to change write options after creating a transaction. This is needed for MongoRocks. Test Plan: added test Reviewers: sdong, rven, kradhakrishnan, IslamAbdelRahman, yhchiang Reviewed By: yhchiang Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D51867	2015-12-11 16:08:25 -08:00
agiardullo	3bfd3d39a3	Use SST files for Transaction conflict detection Summary: Currently, transactions can fail even if there is no actual write conflict. This is due to relying on only the memtables to check for write-conflicts. Users have to tune memtable settings to try to avoid this, but it's hard to figure out exactly how to tune these settings. With this diff, TransactionDB will use both memtables and SST files to determine if there are any write conflicts. This relies on the fact that BlockBasedTable stores sequence numbers for all writes that happen after any open snapshot. Also, D50295 is needed to prevent SingleDelete from disappearing writes (the TODOs in this test code will be fixed once the other diff is approved and merged). Note that Optimistic transactions will still rely on tuning memtable settings as we do not want to read from SST while on the write thread. Also, memtable settings can still be used to reduce how often TransactionDB needs to read SST files. Test Plan: unit tests, db bench Reviewers: rven, yhchiang, kradhakrishnan, IslamAbdelRahman, sdong Reviewed By: sdong Subscribers: dhruba, leveldb, yoshinorim Differential Revision: https://reviews.facebook.net/D50475	2015-12-11 12:34:11 -08:00
krad	362d819a14	Improving parser Summary: Improving the parser string to make better error report. Currently the error report fails to capture the assert details. This fix addresses the issue. Test Plan: None Reviewers: CC: leveldb@ Task ID: #6968635 Blame Rev:	2015-12-11 11:06:42 -08:00
Yueh-Hsuan Chiang	00d6edf6a0	Ensure the destruction order of PosixEnv and ThreadLocalPtr Summary: By default, RocksDB initializes the singletons of ThreadLocalPtr first, then initializes PosixEnv via static initializer. Destructors run in reverse order of construction, so PosixEnv is terminated first (calling pthread_mutex_lock), then ThreadLocal (calling pthread_mutex_destroy). However, in certain cases, the application might initialize PosixEnv first, then ThreadLocalPtr. This will cause a core dump at the end of the program (e.g. https://github.com/facebook/mysql-5.6/issues/122) This patch fixes this issue by ensuring the destruction order by moving the global static singletons to function static singletons. Since function static singletons are initialized when the function is first called, this property allows us to enforce the construction order of the static PosixEnv and the singletons of ThreadLocalPtr by calling the function where the ThreadLocalPtr singletons belong right before we initialize the static PosixEnv. Test Plan: Verified in the MyRocks. Reviewers: yoshinorim, IslamAbdelRahman, rven, kradhakrishnan, anthony, sdong, MarkCallaghan Reviewed By: anthony Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D51789	2015-12-11 00:21:58 -08:00
Igor Canadi	64fa43843b	Merge pull request #862 from ceph/wip-env implement EnvMirror	2015-12-10 18:45:07 -08:00
Sage Weil	2074ddd625	env: add EnvMirror This is an Env implementation that mirrors all storage-related methods on two different backend Env's and verifies that they return the same results (return status and read results). This is useful for implementing a new Env and verifying its correctness. Signed-off-by: Sage Weil <sage@redhat.com>	2015-12-10 21:32:45 -05:00
Yueh-Hsuan Chiang	a3ba5915c8	Correct a comment in include/rocksdb/cache.h Summary: Correct a comment in include/rocksdb/cache.h Test Plan: No code change. Reviewers: igor, sdong, IslamAbdelRahman, rven, kradhakrishnan, anthony Reviewed By: anthony Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D51831	2015-12-10 16:39:10 -08:00
Yueh-Hsuan Chiang	f0a8e5a2d8	Fixed the valgrind error in ColumnFamilyTest::CreateAndDropRace Summary: Fixed the valgrind error in ColumnFamilyTest::CreateAndDropRace Test Plan: valgrind --error-exitcode=2 --leak-check=full ./column_family_test Reviewers: kradhakrishnan, rven, anthony, IslamAbdelRahman, sdong Reviewed By: sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D51795	2015-12-10 11:53:53 -08:00
agiardullo	9e44629061	Change SingleDelete to support conflict checking Summary: For Transactions, we want to start using the SST files to do write conflict checking. To do this, we need to make sure that compaction never removes all writes if an earlier snapshot exists. So I had to change the way we process SingleDeletes to sometimes leave a SingleDelete behind when we encounter a Put followed by a SingleDelete. See the comments in this diff for a more detailed explanation. Test Plan: added more unit tests Reviewers: rven, igor, kradhakrishnan, IslamAbdelRahman, yhchiang, sdong Reviewed By: sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D50295	2015-12-10 11:35:38 -08:00
Igor Canadi	c5af8bffbf	Merge pull request #879 from charsyam/feature/typos fix typos in comments	2015-12-10 09:42:11 -08:00
charsyam	c30b499541	fix typos in comments	2015-12-11 01:54:48 +09:00
sdong	56e77f0967	Deprecate options.soft_rate_limit and add options.soft_pending_compaction_bytes_limit Summary: Deprecate options.soft_rate_limit, which is hard to tune, and replace it with options.soft_pending_compaction_bytes_limit, which would trigger the slowdown if estimated pending compaction bytes exceeds the threshold. The hope is to make it more straightforward to tune. Test Plan: Modify DBTest.SoftLimit to cover options.soft_pending_compaction_bytes_limit instead; run all unit tests. Reviewers: IslamAbdelRahman, yhchiang, rven, kradhakrishnan, igor, anthony Reviewed By: anthony Subscribers: leveldb, dhruba Differential Revision: https://reviews.facebook.net/D51117	2015-12-09 18:22:45 -08:00
sdong	d6e1035a1f	A new compaction picking priority that optimizes for write amplification for random updates. Summary: Introduce a compaction picking priority that picks the files that contain the oldest rows to compact. This is a mode that slightly improves write amplification for random update cases. Test Plan: Add a unit test and run it in valgrind too. Reviewers: yhchiang, anthony, IslamAbdelRahman, rven, kradhakrishnan, MarkCallaghan, igor Reviewed By: igor Subscribers: leveldb, dhruba Differential Revision: https://reviews.facebook.net/D51459	2015-12-09 18:13:03 -08:00
Igor Canadi	de6958b2e2	Merge pull request #877 from yuslepukhin/fix_unnecessary_type_truncation Prefer integer arithmetics	2015-12-09 15:14:27 -08:00
Yueh-Hsuan Chiang	0991cee6cd	Merge pull request #815 from SherlockNoMad/CounterFix Fix EstimateNumKeys Counter Inaccurate Issue	2015-12-09 14:10:49 -08:00
yuslepukhin	49957f9a98	Prefer integer arithmetics The code had conversion to double then casting to size_t and then casting uint32_t which caused compiler warning (VS15).	2015-12-09 14:06:23 -08:00
Igor Canadi	0836d265c9	Merge pull request #876 from warrenfalk/wf_win_master Add compaction_iterator and delete_scheduler tests to Windows build	2015-12-09 10:05:21 -08:00
Warren Falk	c6fedf2bf8	Add compaction_iterator and delete_scheduler tests to Windows build	2015-12-09 11:01:02 -05:00
sdong	ac8e56f050	db_bench: in uncompress benchmark, get Snappy size from compressed stream Summary: Now in benchmark "uncompress" in db_bench, we get size from compressed stream for all other compression types except Snappy, where we allocate memory based on parameter. Change it to match to behavior of other compression types. Test Plan: Run ./db_bench --benchmarks=uncompress with snappy and other compression types. Reviewers: yhchiang, kradhakrishnan, anthony, IslamAbdelRahman, igor Reviewed By: igor Subscribers: leveldb, dhruba Differential Revision: https://reviews.facebook.net/D51681	2015-12-08 18:11:58 -08:00


4503 Commits