rocksdb

Author	SHA1	Message	Date
Igor Canadi	64904b39a0	Merge branch 'master' into columnfamilies Conflicts: utilities/backupable/backupable_db.cc	2014-03-17 17:57:14 -07:00
Igor Canadi	5601bc4619	Check starts_with(prefix) in MultiPrefixIterate Summary: We switched to prefix_seek method of seeking. This means that anytime we check Valid(), we also need to check starts_with(prefix) Test Plan: ran db_stress Reviewers: ljin Reviewed By: ljin CC: leveldb Differential Revision: https://reviews.facebook.net/D16953	2014-03-17 17:02:34 -07:00
Igor Canadi	e0c1211555	Merge branch 'master' into columnfamilies Conflicts: db/version_set.cc tools/db_stress.cc	2014-03-17 12:21:05 -07:00
Igor Canadi	9b8a2b52d4	No prefix iterator in db_stress Summary: We're trying to deprecate prefix iterators, so no need to test them in db_stress Test Plan: ran it Reviewers: ljin Reviewed By: ljin CC: leveldb Differential Revision: https://reviews.facebook.net/D16917	2014-03-17 10:02:46 -07:00
Igor Canadi	928ee23567	Change WriteBatch interface	2014-03-14 13:40:06 -07:00
Igor Canadi	e1f56e12cf	Merge branch 'master' into columnfamilies Conflicts: db/db_impl.cc db/db_test.cc tools/db_stress.cc	2014-03-13 13:21:20 -07:00
Igor Canadi	04a1035efe	Revert "DB stress with normal skip list" This reverts commit `86926d8c6a`.	2014-03-12 22:21:13 -07:00
Lei Jin	02a2cb139b	fix VerifyDb in StressTest Summary: this should fix the hash_skip_list issue, but I still see seqno assertion failure in the last run. Will continue investigating and address that in a different diff Test Plan: make whitebox_crash_test Reviewers: igor Reviewed By: igor CC: leveldb Differential Revision: https://reviews.facebook.net/D16851	2014-03-12 22:20:22 -07:00
Igor Canadi	86926d8c6a	DB stress with normal skip list Summary: Hash skip list has issues, causing db_stress to fail badly. For now, switching back to skip_list by default before we figure out root cause. Test Plan: db_stress is happy(ier) Reviewers: ljin Reviewed By: ljin CC: leveldb Differential Revision: https://reviews.facebook.net/D16845	2014-03-12 17:35:27 -07:00
Igor Canadi	dff9214165	Merge branch 'master' into columnfamilies Conflicts: db/db_impl.cc tools/db_stress.cc	2014-03-12 10:17:41 -07:00
Igor Canadi	5ba028c179	DBStress cleanup Summary: ) fixed the comment ) constant 1 was not casted to 64-bit, which (I think) might cause overflow if we shift it too much *) default prefix size to be 7, like it was before Test Plan: compiled Reviewers: ljin Reviewed By: ljin CC: leveldb Differential Revision: https://reviews.facebook.net/D16827	2014-03-12 09:31:06 -07:00
Lei Jin	86ba3e24e3	make assert based on FLAGS_prefix_size Summary: as title Test Plan: running python tools/db_crashtest.py Reviewers: igor Reviewed By: igor CC: leveldb Differential Revision: https://reviews.facebook.net/D16803	2014-03-11 16:33:42 -07:00
Lei Jin	02dab3be11	fix db_stress test Summary: Fix the db_stress test, let is run with HashSkipList for real Test Plan: python tools/db_crashtest.py python tools/db_crashtest2.py Reviewers: igor, haobo Reviewed By: igor CC: leveldb Differential Revision: https://reviews.facebook.net/D16773	2014-03-11 13:44:33 -07:00
Igor Canadi	56ca8338e5	initialize static const outside of class	2014-03-11 13:08:48 -07:00
Igor Canadi	457c78eb89	[CF] db_stress for column families Summary: I had this diff for a while to test column families implementation. Last night, I ran it sucessfully for 10 hours with the command: time ./db_stress --threads=30 --ops_per_thread=200000000 --max_key=5000 --column_families=20 --clear_column_family_one_in=3000000 --verify_before_write=1 --reopen=50 --max_background_compactions=10 --max_background_flushes=10 --db=/tmp/db_stress It is ready to be committed :) Test Plan: Ran it for 10 hours Reviewers: dhruba, haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D16797	2014-03-11 12:06:12 -07:00
Lei Jin	8d007b4aaf	Consolidate SliceTransform object ownership Summary: (1) Fix SanitizeOptions() to also check HashLinkList. The current dynamic case just happens to work because the 2 classes have the same layout. (2) Do not delete SliceTransform object in HashSkipListFactory and HashLinkListFactory destructor. Reason: SanitizeOptions() enforces prefix_extractor and SliceTransform to be the same object when HashFactory is used. This makes the behavior strange: when HashFactory is used, prefix_extractor will be released by RocksDB. If other memtable factory is used, prefix_extractor should be released by user. Test Plan: db_bench && make asan_check Reviewers: haobo, igor, sdong Reviewed By: igor CC: leveldb, dhruba Differential Revision: https://reviews.facebook.net/D16587	2014-03-10 12:56:46 -07:00
Albert Strasheim	df2f92214a	Support for LZ4 compression.	2014-02-08 14:15:51 -08:00
Siying Dong	8477255da3	Moving Some includes from options.h to forward declaration Summary: By removing some includes form options.h and reply on forward declaration, we can more easily reason the dependencies. Test Plan: make all check Reviewers: kailiu, haobo, igor, dhruba Reviewed By: kailiu CC: leveldb Differential Revision: https://reviews.facebook.net/D15411	2014-01-24 17:16:22 -08:00
Igor Canadi	e832e72b31	Revert "Moving to glibc-fb" This reverts commit `d24961b65e`. For some reason, glibc2.17-fb breaks gflags. Reverting for now	2014-01-24 11:50:38 -08:00
Igor Canadi	d24961b65e	Moving to glibc-fb Summary: It looks like we might have some trouble when building the new release with 4.8, since fbcode is using glibc2.17-fb by default and we are using glibc2.17. It was reported by Benjamin Renard in our internal group. This diff moves our fbcode build to use glibc2.17-fb by default. I got some linker errors when compiling, complaining that `google::SetUsageMessage()` was undefined. After deleting all offending lines, the compile was successful and everything works. Test Plan: Compiled Ran ./db_bench ./db_stress ./db_repl_stress Reviewers: kailiu Reviewed By: kailiu CC: leveldb Differential Revision: https://reviews.facebook.net/D15405	2014-01-24 10:24:08 -08:00
Igor Canadi	83681bf9ef	Statistics code cleanup Summary: I'm separating code-cleanup part of https://reviews.facebook.net/D14517. This will make D14517 easier to understand and this diff easier to review. Test Plan: make check Reviewers: haobo, kailiu, sdong, dhruba, tnovak Reviewed By: tnovak CC: leveldb Differential Revision: https://reviews.facebook.net/D15099	2014-01-17 12:46:06 -08:00
Igor Canadi	eb12e47e0e	Killing Transform Rep Summary: Let's get rid of TransformRep and it's children. We have confirmed that HashSkipListRep works better with multifeed, so there is no benefit to keeping this around. This diff is mostly just deleting references to obsoleted functions. I also have a diff for fbcode that we'll need to push when we switch to new release. I had to expose HashSkipListRepFactory in the client header files because db_impl.cc needs access to GetTransform() function for SanitizeOptions. Test Plan: make check Reviewers: dhruba, haobo, kailiu, sdong Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D14397	2013-12-03 12:42:15 -08:00
Igor Canadi	469a9f32a7	Fix two nasty use-after-free-bugs Summary: These bugs were caught by ASAN crash test. 1. The first one, in table/filter_block.cc is very nasty. We first reference entries_ and store the reference to Slice prev. Then, we call entries_.append(), which can change the reference. The Slice prev now points to junk. 2. The second one is a bug in a test, so it's not very serious. Once we set read_opts.prefix, we never clear it, so some other function might still reference it. Test Plan: asan crash test now runs more than 5 mins. Before, it failed immediately. I will run the full one, but the full one takes quite some time (5 hours) Reviewers: dhruba, haobo, kailiu Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D14223	2013-11-19 21:01:48 -08:00
kailiu	97d8e573a6	make util/env_posix.cc work under mac Summary: This diff invoves some more complicated issues in the posix environment. Test Plan: works under mac os. will need to verify dev box. Reviewers: dhruba Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D14061	2013-11-16 23:44:39 -08:00
Kai Liu	88ba331c1a	Add the index/filter block cache Summary: This diff leverage the existing block cache and extend it to cache index/filter block. Test Plan: Added new tests in db_test and table_test The correctness is checked by: 1. make check 2. make valgrind_check Performance is test by: 1. 10 times of build_tools/regression_build_test.sh on two versions of rocksdb before/after the code change. Test results suggests no significant difference between them. For the two key operatons `overwrite` and `readrandom`, the average iops are both 20k and ~260k, with very small variance). 2. db_stress. Reviewers: dhruba Reviewed By: dhruba CC: leveldb, haobo, xjin Differential Revision: https://reviews.facebook.net/D13167	2013-11-12 22:46:51 -08:00
Dhruba Borthakur	b4ad5e89ae	Implement a compressed block cache. Summary: Rocksdb can now support a uncompressed block cache, or a compressed block cache or both. Lookups first look for a block in the uncompressed cache, if it is not found only then it is looked up in the compressed cache. If it is found in the compressed cache, then it is uncompressed and inserted into the uncompressed cache. It is possible that the same block resides in the compressed cache as well as the uncompressed cache at the same time. Both caches have their own individual LRU policy. Test Plan: Unit test case attached. Reviewers: kailiu, sdong, haobo, leveldb Reviewed By: haobo CC: xjin, haobo Differential Revision: https://reviews.facebook.net/D12675	2013-11-01 14:31:35 -07:00
Slobodan Predolac	e44976b199	Conversion of db_bench, db_stress and db_repl_stress to use gflags Summary: Converted db_stress, db_repl_stress and db_bench to use gflags Test Plan: I tested by printing out all the flags from old and new versions. Tried defaults, + various combinations with "interesting flags". Also, tested by running db_crashtest.py and db_crashtest2.py. Reviewers: emayanke, dhruba, haobo, kailiu, sdong Reviewed By: emayanke CC: leveldb, xjin Differential Revision: https://reviews.facebook.net/D13581	2013-10-24 07:43:14 -07:00
Dhruba Borthakur	9cd221094c	Add appropriate LICENSE and Copyright message. Summary: Add appropriate LICENSE and Copyright message. Test Plan: make check Reviewers: CC: Task ID: # Blame Rev:	2013-10-16 17:48:41 -07:00
Dhruba Borthakur	4463b11cad	Migrate names of properties from 'leveldb' prefix to 'rocksdb' prefix. Summary: Migrate names of properties from 'leveldb' prefix to 'rocksdb' prefix. Test Plan: make check Reviewers: emayanke, haobo Reviewed By: haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D13311	2013-10-06 00:14:26 -07:00
Dhruba Borthakur	a143ef9b38	Change namespace from leveldb to rocksdb Summary: Change namespace from leveldb to rocksdb. This allows a single application to link in open-source leveldb code as well as rocksdb code into the same process. Test Plan: compile rocksdb Reviewers: emayanke Reviewed By: emayanke CC: leveldb Differential Revision: https://reviews.facebook.net/D13287	2013-10-04 11:59:26 -07:00
Mayank Agarwal	6b34021fc2	Triggering verify for gets also Summary: Will use iterators to verify keys in the db for half of its keys and Gets for the other half. Test Plan: ./db_stress --max_key=1000 --ops_per_thread=100 Reviewers: dhruba, haobo Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D13227	2013-10-02 11:22:17 -07:00
Natalie Hildebrandt	7edb92b843	Phase 2 of iterator stress test Summary: Using an iterator instead of the Get method, each thread goes through a portion of the database and verifies values by comparing to the shared state. Test Plan: ./db_stress --db=/tmp/tmppp --max_key=10000 --ops_per_thread=10000 To test some basic cases, the following lines can be added (each set in turn) to the verifyDb method with the following expected results: // Should abort with "Unexpected value found" shared.Delete(start); // Should abort with "Value not found" WriteOptions write_opts; db_->Delete(write_opts, Key(start)); // Should succeed WriteOptions write_opts; shared.Delete(start); db_->Delete(write_opts, Key(start)); // Should abort with "Value not found" WriteOptions write_opts; db_->Delete(write_opts, Key(start + (end-start)/2)); // Should abort with "Value not found" db_->Delete(write_opts, Key(end-1)); // Should abort with "Unexpected value" shared.Delete(end-1); // Should abort with "Unexpected value" shared.Delete(start + (end-start)/2); // Should abort with "Value not found" db_->Delete(write_opts, Key(start)); shared.Delete(start); db_->Delete(write_opts, Key(end-1)); db_->Delete(write_opts, Key(end-2)); To test the out of range abort, change the key in the for loop to Key(i+1), so that the key defined by the index i is now outside of the supposed range of the database. Reviewers: emayanke Reviewed By: emayanke CC: dhruba, xjin Differential Revision: https://reviews.facebook.net/D13071	2013-09-30 16:48:00 -07:00
Natalie Hildebrandt	433541823c	Phase 1 of an iterator stress test Summary: Added MultiIterate() which does a seek and some Next/Prev calls. Iterator status is checked only, no data integrity check Test Plan: make db_stress ./db_stress --iterpercent=<nonzero value> --readpercent=, etc. Reviewers: emayanke, dhruba, xjin Reviewed By: emayanke CC: leveldb Differential Revision: https://reviews.facebook.net/D12915	2013-09-19 16:47:24 -07:00
Dhruba Borthakur	4012ca1c7b	Added a parameter to limit the maximum space amplification for universal compaction. Summary: Added a new field called max_size_amplification_ratio in the CompactionOptionsUniversal structure. This determines the maximum percentage overhead of space amplification. The size amplification is defined to be the ratio between the size of the oldest file to the sum of the sizes of all other files. If the size amplification exceeds the specified value, then min_merge_width and max_merge_width are ignored and a full compaction of all files is done. A value of 10 means that the size a database that stores 100 bytes of user data could occupy 110 bytes of physical storage. Test Plan: Unit test DBTest.UniversalCompactionSpaceAmplification added. Reviewers: haobo, emayanke, xjin Reviewed By: haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D12825	2013-09-13 16:27:18 -07:00
Dhruba Borthakur	1186192ed1	Replace include/leveldb with include/rocksdb. Summary: Replace include/leveldb with include/rocksdb. Test Plan: make clean; make check make clean; make release Differential Revision: https://reviews.facebook.net/D12489	2013-08-23 10:51:00 -07:00
Jim Paton	74781a0c49	Add three new MemTableRep's Summary: This patch adds three new MemTableRep's: UnsortedRep, PrefixHashRep, and VectorRep. UnsortedRep stores keys in an std::unordered_map of std::sets. When an iterator is requested, it dumps the keys into an std::set and iterates over that. VectorRep stores keys in an std::vector. When an iterator is requested, it creates a copy of the vector and sorts it using std::sort. The iterator accesses that new vector. PrefixHashRep stores keys in an unordered_map mapping prefixes to ordered sets. I also added one API change. I added a function MemTableRep::MarkImmutable. This function is called when the rep is added to the immutable list. It doesn't do anything yet, but it seems like that could be useful. In particular, for the vectorrep, it means we could elide the extra copy and just sort in place. The only reason I haven't done that yet is because the use of the ArenaAllocator complicates things (I can elaborate on this if needed). Test Plan: make -j32 check ./db_stress --memtablerep=vector ./db_stress --memtablerep=unsorted ./db_stress --memtablerep=prefixhash --prefix_size=10 Reviewers: dhruba, haobo, emayanke Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D12117	2013-08-22 23:10:02 -07:00
Deon Nicholas	b87dcae1a3	Made merge_oprator a shared_ptr; and added TTL unit tests Test Plan: - make all check; - make release; - make stringappend_test; ./stringappend_test Reviewers: haobo, emayanke Reviewed By: haobo CC: leveldb, kailiu Differential Revision: https://reviews.facebook.net/D12381	2013-08-20 13:35:28 -07:00
Deon Nicholas	ad48c3c262	Benchmarking for Merge Operator Summary: Updated db_bench and utilities/merge_operators.h to allow for dynamic benchmarking of merge operators in db_bench. Added a new test (--benchmarks=mergerandom), which performs a bunch of random Merge() operations over random keys. Also added a "--merge_operator=" flag so that the tester can easily benchmark different merge operators. Currently supports the PutOperator and UInt64Add operator. Support for stringappend or list append may come later. Test Plan: 1. make db_bench 2. Test the PutOperator (simulating Put) as follows: ./db_bench --benchmarks=fillrandom,readrandom,updaterandom,readrandom,mergerandom,readrandom --merge_operator=put --threads=2 3. Test the UInt64AddOperator (simulating numeric addition) similarly: ./db_bench --value_size=8 --benchmarks=fillrandom,readrandom,updaterandom,readrandom,mergerandom,readrandom --merge_operator=uint64add --threads=2 Reviewers: haobo, dhruba, zshao, MarkCallaghan Reviewed By: haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D11535	2013-08-15 17:13:07 -07:00
Xing Jin	0a5afd1afc	Minor fix to current codes Summary: Minor fix to current codes, including: coding style, output format, comments. No major logic change. There are only 2 real changes, please see my inline comments. Test Plan: make all check Reviewers: haobo, dhruba, emayanke Differential Revision: https://reviews.facebook.net/D12297	2013-08-14 23:03:57 -07:00
Tyler Harter	7612d496ff	Add prefix scans to db_stress (and bug fix in prefix scan) Summary: Added support for prefix scans. Test Plan: ./db_stress --max_key=4096 --ops_per_thread=10000 Reviewers: dhruba, vamsi Reviewed By: vamsi CC: leveldb Differential Revision: https://reviews.facebook.net/D12267	2013-08-14 16:58:36 -07:00
Dhruba Borthakur	f5fa26b6a9	Merge branch 'performance' of github.com:facebook/rocksdb into performance Conflicts: db/builder.cc db/db_impl.cc db/version_set.cc include/leveldb/statistics.h	2013-08-07 11:58:06 -07:00
Mayank Agarwal	1d7b4765c3	Expose base db object from ttl wrapper Summary: rocksdb replicaiton will need this when writing value+TS from master to slave 'as is' Test Plan: make Reviewers: dhruba, vamsi, haobo Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D11919	2013-08-05 18:44:14 -07:00
Dhruba Borthakur	711a30cb30	Merge branch 'master' into performance Conflicts: include/leveldb/options.h include/leveldb/statistics.h util/options.cc	2013-08-02 10:22:08 -07:00
Mayank Agarwal	59d0b02f8b	Expand KeyMayExist to return the proper value if it can be found in memory and also check block_cache Summary: Removed KeyMayExistImpl because KeyMayExist demanded Get like semantics now. Removed no_io from memtable and imm because we need the proper value now and shouldn't just stop when we see Merge in memtable. Added checks to block_cache. Updated documentation and unit-test Test Plan: make all check;db_stress for 1 hour Reviewers: dhruba, haobo Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D11853	2013-08-01 09:07:46 -07:00
Mayank Agarwal	bf66c10b13	Use KeyMayExist for WriteBatch-Deletes Summary: Introduced KeyMayExist checking during writebatch-delete and removed from Outer Delete API because it uses writebatch-delete. Added code to skip getting Table from disk if not already present in table_cache. Some renaming of variables. Introduced KeyMayExistImpl which allows checking since specified sequence number in GetImpl useful to check partially written writebatch. Changed KeyMayExist to not be pure virtual and provided a default implementation. Expanded unit-tests in db_test to check appropriately. Ran db_stress for 1 hour with ./db_stress --max_key=100000 --ops_per_thread=10000000 --delpercent=50 --filter_deletes=1 --statistics=1. Test Plan: db_stress;make check Reviewers: dhruba, haobo Reviewed By: dhruba CC: leveldb, xjin Differential Revision: https://reviews.facebook.net/D11745	2013-07-23 13:36:50 -07:00
Dhruba Borthakur	4a745a5666	Merge branch 'master' into performance Conflicts: db/version_set.cc include/leveldb/options.h util/options.cc	2013-07-17 15:05:57 -07:00
Mayank Agarwal	2a986919d6	Make rocksdb-deletes faster using bloom filter Summary: Wrote a new function in db_impl.c-CheckKeyMayExist that calls Get but with a new parameter turned on which makes Get return false only if bloom filters can guarantee that key is not in database. Delete calls this function and if the option- deletes_use_filter is turned on and CheckKeyMayExist returns false, the delete will be dropped saving: 1. Put of delete type 2. Space in the db,and 3. Compaction time Test Plan: make all check; will run db_stress and db_bench and enhance unit-test once the basic design gets approved Reviewers: dhruba, haobo, vamsi Reviewed By: haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D11607	2013-07-11 12:11:11 -07:00
Mayank Agarwal	821889e207	Print complete statistics in db_stress Summary: db_stress should alos print complete statistics like db_bench. Needed this when I wanted to measure number of delete-IOs dropped due to CheckKeyMayExist to be introduced to rocksdb codebase later- to make deltes in rocksdb faster Test Plan: make db_stress;./db_stress --max_key=100 --ops_per_thread=1000 --statistics=1 Reviewers: sheki, dhruba, vamsi, haobo Reviewed By: dhruba Differential Revision: https://reviews.facebook.net/D11655	2013-07-10 18:07:13 -07:00
Dhruba Borthakur	116ec527f2	Renamed 'hybrid_compaction' tp be "Universal Compaction'. Summary: All the universal compaction parameters are encapsulated in a new file universal_compaction.h Test Plan: make check	2013-07-03 15:47:53 -07:00
Dhruba Borthakur	47c4191fe8	Reduce write amplification by merging files in L0 back into L0 Summary: There is a new option called hybrid_mode which, when switched on, causes HBase style compactions. Files from L0 are compacted back into L0. This meat of this compaction algorithm is in PickCompactionHybrid(). All files reside in L0. That means all files have overlapping keys. Each file has a time-bound, i.e. each file contains a range of keys that were inserted around the same time. The start-seqno and the end-seqno refers to the timeframe when these keys were inserted. Files that have contiguous seqno are compacted together into a larger file. All files are ordered from most recent to the oldest. The current compaction algorithm starts to look for candidate files starting from the most recent file. It continues to add more files to the same compaction run as long as the sum of the files chosen till now is smaller than the next candidate file size. This logic needs to be debated and validated. The above logic should reduce write amplification to a large extent... will publish numbers shortly. Test Plan: dbstress runs for 6 hours with no data corruption (tested so far). Differential Revision: https://reviews.facebook.net/D11289	2013-06-30 20:07:04 -07:00

1 2

82 Commits