rocksdb

Author	SHA1	Message	Date
Igor Canadi	dc55903293	Improved CompressedCache Summary: This is testing behavior that was reported in https://github.com/facebook/rocksdb/issues/111 No issue was found, but it is still good to commit this and make CompressedCache more robust. Test Plan: this is a plan Reviewers: ljin, dhruba Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D17625	2014-04-09 11:43:14 -07:00
Igor Canadi	b947fdc89d	Column family support for DB::OpenForReadOnly() Summary: When opening DB in read-only mode, client can choose to only specify a subset of column families ("default" column family can't be omitted, though) Test Plan: added a unit test in column_family_test Reviewers: haobo, sdong, ljin, dhruba Reviewed By: haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D17565	2014-04-09 09:56:17 -07:00
Igor Canadi	731e55c01c	Fix GetProperty() test Summary: GetProperty test is flakey. Before this diff: P8635927 After: P8635945 We need to make sure the thread is done before we destruct sleeping tasks. Otherwise, bad things happen. Test Plan: See summary Reviewers: ljin, sdong, haobo, dhruba Reviewed By: ljin CC: leveldb Differential Revision: https://reviews.facebook.net/D17595	2014-04-08 14:57:00 -07:00
Igor Canadi	3d2fe844ab	Merge branch 'master' into columnfamilies Conflicts: db/db_impl.cc db/db_impl.h db/memtable_list.cc db/version_set.cc	2014-04-07 11:31:11 -07:00
Haobo Xu	48bc0c6ad3	[RocksDB] Fix a race condition in GetSortedWalFiles Summary: This patch fixed a race condition where a log file is moved to archived dir in the middle of GetSortedWalFiles. Without the fix, the log file would be missed in the result, which leads to transaction log iterator gap. A test utility SyncPoint is added to help reproducing the race condition. Test Plan: TransactionLogIteratorRace; make check Reviewers: dhruba, ljin Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D17121	2014-04-02 22:12:29 -07:00
sdong	4af1954fd6	Compaction Filter V1 to use old context struct to keep backward compatible Summary: The previous change D15087 changed existing compaction filter, which makes the commonly used class not backward compatible. Revert the older interface. Use a new interface for V2 instead. Test Plan: make all check Reviewers: haobo, yhchiang, igor CC: danguo, dhruba, ljin, igor, leveldb Differential Revision: https://reviews.facebook.net/D17223	2014-04-02 14:57:51 -07:00
Igor Canadi	8555ce2dec	Merge branch 'master' into columnfamilies	2014-04-02 10:48:05 -07:00
sdong	e0a87c4cf1	DBIter to use static allocated char array for saved_key_ (if it is not too long) Summary: DBIter now uses a std::string for saved_key. Based on some profiling, it could be more expensive than we thought. Optimize it with the same technique as LookupKey -- if it is short, we copy it to a static allocated char. Otherwise, dynamically allocate memory for it. Test Plan: make all check Reviewers: haobo, ljin Reviewed By: haobo CC: dhruba, igor, yhchiang, leveldb Differential Revision: https://reviews.facebook.net/D17289	2014-04-01 16:43:11 -07:00
Igor Canadi	ddbd1ece88	Merge branch 'master' into columnfamilies Conflicts: db/db_impl.cc db/db_test.cc db/internal_stats.cc db/internal_stats.h db/version_edit.cc db/version_edit.h db/version_set.cc include/rocksdb/options.h util/options.cc	2014-03-31 13:39:24 -07:00
Igor Canadi	8a139a054c	More valgrind issues! Summary: Fix some more CompactionFilterV2 valgrind issues. Maybe it would make sense for CompactionFilterV2 to delete its prefix_extractor? Test Plan: ran CompactionFilterV2* tests with valgrind. issues before patch -> no issues after Reviewers: haobo, sdong, ljin, dhruba Reviewed By: dhruba CC: leveldb, danguo Differential Revision: https://reviews.facebook.net/D17337	2014-03-29 10:34:47 -07:00
sdong	43a593a6d9	Change default value of some Options Summary: Since we are optimizing for server workloads, some default values are not optimized any more. We change some of those values that I feel it's less prone to regression bugs. Test Plan: make all check Reviewers: dhruba, haobo, ljin, igor, yhchiang Reviewed By: igor CC: leveldb, MarkCallaghan Differential Revision: https://reviews.facebook.net/D16995	2014-03-28 17:09:28 -07:00
Haobo Xu	a92194e5b2	[RocksDB] Add db property "rocksdb.cur-size-active-mem-table" Summary: as title Test Plan: db_test Reviewers: sdong Reviewed By: sdong CC: leveldb Differential Revision: https://reviews.facebook.net/D17217	2014-03-27 15:14:04 -07:00
sdong	6b2e7a2a01	When Options.max_num_files=-1, non level0 files also by pass table cache Summary: This is the part that was not finished when doing the Options.max_num_files=-1 feature. For iterating non level0 SST files (which was done using two level iterator), table cache is not bypassed. With this patch, the leftover feature is done. Test Plan: make all check; change Options.max_num_files=-1 in one of the tests to cover the codes. Reviewers: haobo, igor, dhruba, ljin, yhchiang Reviewed By: haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D17001	2014-03-25 18:40:52 -07:00
Igor Canadi	e86d7dffd7	Merge branch 'master' into columnfamilies	2014-03-25 15:24:02 -07:00
Danny Guo	d9ca83df28	[rocksdb] make init prefix more robust Summary: Currently if client uses kNULLString as the prefix, it will confuse compaction filter v2. This diff added a bool to indicate if the prefix has been initialized. I also added a unit test to cover this case and make sure the new code path is hit. Test Plan: db_test Reviewers: igor, haobo Reviewed By: igor CC: leveldb Differential Revision: https://reviews.facebook.net/D17151	2014-03-25 11:59:40 -07:00
Igor Canadi	e8168382c4	Merge branch 'master' into columnfamilies Conflicts: db/db_impl.cc include/rocksdb/options.h util/options.cc	2014-03-25 11:09:40 -07:00
Danny Guo	b47812fba6	[rocksdb] new CompactionFilterV2 API Summary: This diff adds a new CompactionFilterV2 API that rolls up the decisions of kv pairs during compactions. These kv pairs must share the same key prefix. They are buffered inside the db. typedef std::vector<Slice> SliceVector; virtual std::vector<bool> Filter(int level, const SliceVector& keys, const SliceVector& existing_values, std::vector<std::string>* new_values, std::vector<bool>* values_changed ) const = 0; Application can override the Filter() function to operate on the buffered kv pairs. More details in the inline documentation. Test Plan: make check. Added unit tests to make sure Keep, Delete, Change all work. Reviewers: haobo CCs: leveldb Differential Revision: https://reviews.facebook.net/D15087	2014-03-24 20:47:53 -07:00
Igor Canadi	ac328a86b9	Merge branch 'master' into columnfamilies Conflicts: db/db_impl.cc db/db_test.cc	2014-03-20 14:41:37 -07:00
sdong	f681030c80	Fix DBTest.UniversalCompactionTrigger failure caused by D17067 Summary: D17067 breaks DBTest.UniversalCompactionTrigger because of wrong location of the checking. Fix it. Test Plan: Run the test and make sure it passes. Reviewers: igor, haobo Reviewed By: igor CC: dhruba, ljin, yhchiang, leveldb Differential Revision: https://reviews.facebook.net/D17079	2014-03-20 11:10:11 -07:00
sdong	752ec46cd5	Add a unit test to verify compaction filter context Summary: Add unit tests to make sure CompactionFilterContext::is_manual_compaction_ and CompactionFilterContext::is_full_compaction_ are set correctly. Test Plan: run the new tests. Reviewers: haobo, igor, dhruba, yhchiang, ljin Reviewed By: haobo CC: nkg-, leveldb Differential Revision: https://reviews.facebook.net/D17067	2014-03-19 18:10:48 -07:00
Igor Canadi	e20fa3f8a4	Merge branch 'master' into columnfamilies Conflicts: db/db_impl.cc db/internal_stats.cc db/internal_stats.h db/version_set.cc	2014-03-19 17:22:20 -07:00
Igor Canadi	22507aff6c	Fix compile issue in Mac OS Summary: Compile issues are: * Unused variable env_ * Unused fallocate_with_keep_size_ Test Plan: compiles Reviewers: dhruba, haobo, sdong Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D17043	2014-03-19 15:40:12 -07:00
sdong	71e6a34271	Add a DB property to indicate number of background errors encountered Summary: Add a property to calculate number of background errors encountered to help users build their monitoring Test Plan: Add a unit test. make all check Reviewers: haobo, igor, dhruba Reviewed By: igor CC: ljin, nkg-, yhchiang, leveldb Differential Revision: https://reviews.facebook.net/D16959	2014-03-18 14:28:30 -07:00
Kai Liu	1ec72b37b1	Several easy-to-add properties related to compaction and flushes Summary: To partly address the request @nkg- raised, add three easy-to-add properties to compactions and flushes. Test Plan: run unit tests and add a new unit test to cover new properties. Reviewers: haobo, dhruba Reviewed By: dhruba CC: nkg-, leveldb Differential Revision: https://reviews.facebook.net/D13677	2014-03-18 14:00:09 -07:00
Igor Canadi	e0c1211555	Merge branch 'master' into columnfamilies Conflicts: db/version_set.cc tools/db_stress.cc	2014-03-17 12:21:05 -07:00
sdong	c61c9830d4	Fix a bug that Prev() can hang. Summary: Prev() now can hang when there is a key with more than max_skipped number of appearance internally but all of them are newer than the sequence ID to seek. Add unit tests to confirm the bug and fix it. Test Plan: make all check Reviewers: igor, haobo Reviewed By: igor CC: ljin, yhchiang, leveldb Differential Revision: https://reviews.facebook.net/D16899	2014-03-17 10:00:41 -07:00
Igor Canadi	928ee23567	Change WriteBatch interface	2014-03-14 13:40:06 -07:00
Igor Canadi	e1f56e12cf	Merge branch 'master' into columnfamilies Conflicts: db/db_impl.cc db/db_test.cc tools/db_stress.cc	2014-03-13 13:21:20 -07:00
Kai Liu	11da8bc5df	A heuristic way to check if a memtable is full Summary: This is based on https://reviews.facebook.net/D15027. It's not finished but I would like to give a prototype to avoid arena over-allocation while making better use of the already allocated memory blocks. Instead of checking the approximate memtable size, we will take a deeper look at the arena, which incorporates the essential idea that @sdong suggests: flush when arena has allocated its last and the last is "almost full" Test Plan: N/A Reviewers: haobo, sdong Reviewed By: sdong CC: leveldb, sdong Differential Revision: https://reviews.facebook.net/D15051	2014-03-12 16:40:14 -07:00
Igor Canadi	25c8a1a20f	More bug fixed introduced by code cleanup	2014-03-12 12:28:23 -07:00
Igor Canadi	b5d6ad69fc	Bug fixes introduced by code cleanup	2014-03-12 11:10:26 -07:00
Igor Canadi	2b95dc1542	Revert "Fix bad merge of D16791 and D16767" This reverts commit `839c8ecfcd`.	2014-03-12 09:37:43 -07:00
sdong	839c8ecfcd	Fix bad merge of D16791 and D16767 Summary: A bad Auto-Merge caused log buffer is flushed twice. Remove the unintended one. Test Plan: Should already be tested (the code looks the same as when I ran unit tests). Reviewers: haobo, igor Reviewed By: haobo CC: ljin, yhchiang, leveldb Differential Revision: https://reviews.facebook.net/D16821	2014-03-11 21:31:57 -07:00
Igor Canadi	9634ba42ac	Merge branch 'master' into columnfamilies Conflicts: db/compaction_picker.cc db/db_impl.cc db/db_impl.h db/tailing_iter.cc db/version_set.h include/rocksdb/options.h util/options.cc	2014-03-10 17:26:09 -07:00
sdong	fac58c0504	DBTest: remove perf_context's time > 0 check Summary: DBTest checks perf_context.seek_internal_seek_time > 0 and perf_context.find_next_user_entry_time > 0, which is not reliable. Remove them. Test Plan: ./db_test Reviewers: igor, haobo, ljin Reviewed By: igor CC: dhruba, yhchiang, leveldb Differential Revision: https://reviews.facebook.net/D16737	2014-03-10 14:24:56 -07:00
Lei Jin	8d007b4aaf	Consolidate SliceTransform object ownership Summary: (1) Fix SanitizeOptions() to also check HashLinkList. The current dynamic case just happens to work because the 2 classes have the same layout. (2) Do not delete SliceTransform object in HashSkipListFactory and HashLinkListFactory destructor. Reason: SanitizeOptions() enforces prefix_extractor and SliceTransform to be the same object when HashFactory is used. This makes the behavior strange: when HashFactory is used, prefix_extractor will be released by RocksDB. If other memtable factory is used, prefix_extractor should be released by user. Test Plan: db_bench && make asan_check Reviewers: haobo, igor, sdong Reviewed By: igor CC: leveldb, dhruba Differential Revision: https://reviews.facebook.net/D16587	2014-03-10 12:56:46 -07:00
Igor Canadi	0738ae6dc9	Merge branch 'master' into columnfamilies Conflicts: db/db_impl.cc	2014-03-05 12:25:05 -08:00
Igor Canadi	8ca30bd51b	Merge pull request #47 from mlin/kCompactionStopStyleSimilarSize An initial implementation of kCompactionStopStyleSimilarSize for universal compaction	2014-03-05 10:35:30 -08:00
Igor Canadi	c0ccf43648	MergingIterator assertion Summary: I wrote a test that triggers assertion in MergingIterator. I have not touched that code ever, so I'm looking for somebody with good understanding of the MergingIterator code to fix this. The solution is probably a one-liner. Let me know if you're willing to take a look. Test Plan: This test fails with an assertion `use_heap_ == false` Reviewers: dhruba, haobo, sdong, kailiu Reviewed By: sdong CC: leveldb Differential Revision: https://reviews.facebook.net/D16521	2014-03-05 09:13:07 -08:00
Igor Canadi	9d0577a6be	Merge branch 'master' into columnfamilies Conflicts: db/db_impl.cc db/db_impl.h db/transaction_log_impl.cc db/transaction_log_impl.h include/rocksdb/options.h util/env.cc util/options.cc	2014-03-03 18:29:03 -08:00
Yueh-Hsuan Chiang	a77527f2af	Add ReadOptions to TransactionLogIterator. Summary: Add an optional input parameter ReadOptions to DB::GetUpdatesSince(), which allows the verification of checksums to be disabled by setting ReadOptions::verify_checksums to false. Test Plan: Tests are done off-line and will not be included in the regular unit test. Reviewers: igor Reviewed By: igor CC: leveldb, xjin, dhruba Differential Revision: https://reviews.facebook.net/D16305	2014-02-28 11:50:36 -08:00
Igor Canadi	343c32be7b	[CF] DifferentMergeOperators and DifferentCompactionStyles tests Summary: Two new column family tests: * DifferentMergeOperators -- three column families, one without merge operator, one with add operator and one with append operator. verify that operations work as expected. * DifferentCompactionStyles -- three column families, two with level compactions and one with universal compaction. trigger the compactions and verify they work as expected. Test Plan: nope Reviewers: dhruba, haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D16377	2014-02-26 16:05:24 -08:00
Igor Canadi	8b7ab9951c	[CF] Handle failure in WriteBatch::Handler Summary: * Add ColumnFamilyHandle::GetID() function. Client needs to know column family's ID to be able to construct WriteBatch * Handle WriteBatch::Handler failure gracefully. Since WriteBatch is not a very smart function (it takes raw CF id), client can add data to WriteBatch for column family that doesn't exist. In that case, we need to gracefully return failure status from DB::Write(). To do that, I added a return Status to WriteBatch functions PutCF, DeleteCF and MergeCF. Test Plan: Added test to column_family_test Reviewers: dhruba, haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D16323	2014-02-26 10:10:00 -08:00
Igor Canadi	6aef661230	some improvements to CompressedCache test	2014-02-14 17:47:53 -08:00
Igor Canadi	422bb09cb0	Fix table properties Summary: Adapt table properties to column family world Test Plan: make check Reviewers: kailiu CC: leveldb Differential Revision: https://reviews.facebook.net/D16161	2014-02-14 17:13:10 -08:00
Igor Canadi	76c048183c	Merge branch 'master' into columnfamilies Conflicts: db/db_impl.cc db/db_test.cc include/rocksdb/db.h	2014-02-14 16:46:03 -08:00
Igor Canadi	be7e273d83	fix u/s comparison #83	2014-02-14 16:18:55 -08:00
Igor Canadi	c67d48c852	[CF] DB test to run on non-default column family Summary: This is a huge diff and it was hectic, but the idea is actually quite simple. Every operation (Put, Get, etc.) done on default column family in DBTest is now forwarded to non-default ("pikachu"). The good news is that we had zero test failures! Column families look stable so far. One interesting test that I adapted for column families is MultiThreadedTest. I replaced every Put() with a WriteBatch writing to all column families concurrently. Every Put in the write batch contains unique_id. Instead of Get() I do a multiget across all column families with the same key. If atomicity holds, I expect to see the same unique_id in all column families. Test Plan: This is a test! Reviewers: dhruba, haobo, kailiu, sdong CC: leveldb Differential Revision: https://reviews.facebook.net/D16149	2014-02-14 16:08:59 -08:00
kailiu	63690625cd	Expose the table properties to application Summary: Provide a public API for users to access the table properties for each SSTable. Test Plan: Added unit tests to test the function's correctness under different conditions. Reviewers: haobo, dhruba, sdong Reviewed By: haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D16083	2014-02-13 16:28:21 -08:00
Igor Canadi	ccdb93e775	Merge branch 'master' into columnfamilies Conflicts: db/db_impl.cc db/db_impl.h db/memtable_list.cc db/memtable_list.h db/version_set.cc db/version_set.h	2014-02-12 14:01:30 -08:00
Igor Canadi	b06840aa7d	[CF] Rethinking ColumnFamilyHandle and fix to dropping column families Summary: The change to the public behavior: * When opening a DB or creating new column family client gets a ColumnFamilyHandle. * As long as column family handle is alive, client can do whatever he wants with it, even drop it * Dropped column family can still be read from (using the column family handle) * Added a new call CloseColumnFamily(). Client has to close all column families that he has opened before deleting the DB * As soon as column family is closed, any calls to DB using that column family handle will fail (also any outstanding calls) Internally: * Ref-counting ColumnFamilyData * New thread-safety for ColumnFamilySet * Dropped column families are now completely dropped and their memory cleaned-up Test Plan: added some tests to column_family_test Reviewers: dhruba, haobo, kailiu, sdong CC: leveldb Differential Revision: https://reviews.facebook.net/D16101	2014-02-12 13:47:09 -08:00
Igor Canadi	ca5f1a225a	CompactionContext to include is_manual_compaction Summary: Added a bit more information to compaction context, requested by internal team at FB. Test Plan: Modified CompactionFilter test to make sure is_manual_compaction is properly set. Reviewers: haobo Reviewed By: haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D16095	2014-02-12 12:24:18 -08:00
Lei Jin	5fbf2ef42d	preload table handle on Recover() when max_open_files == -1 Summary: This covers existing table files before DB open happens and avoids contention on table cache Test Plan: db_test Reviewers: haobo, sdong, igor, dhruba Reviewed By: haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D16089	2014-02-12 10:43:27 -08:00
Albert Strasheim	df2f92214a	Support for LZ4 compression.	2014-02-08 14:15:51 -08:00
Igor Canadi	0143abdbb0	Merge branch 'master' into columnfamilies Conflicts: HISTORY.md db/db_impl.cc db/db_impl.h db/db_iter.cc db/db_test.cc db/dbformat.h db/memtable.cc db/memtable_list.cc db/memtable_list.h db/table_cache.cc db/table_cache.h db/version_edit.h db/version_set.cc db/version_set.h db/write_batch.cc db/write_batch_test.cc include/rocksdb/options.h util/options.cc	2014-02-06 15:58:20 -08:00
kailiu	84f8185fc0	Merge branch 'master' into performance Conflicts: HISTORY.md db/db_impl.cc db/memtable.cc	2014-02-05 21:21:00 -08:00
Igor Canadi	2a9271b403	Merge branch 'master' into columnfamilies Conflicts: db/db_impl.cc db/db_impl.h db/db_impl_readonly.cc	2014-02-03 13:47:54 -08:00
Lei Jin	5b3b6549d6	use super_version in NewIterator() and MultiGet() function Summary: Use super_version inside NewIterator to avoid Ref() each component separately under mutex. The newly added bench shows NewIterator QPS increases from 515K to 719K. No meaningful improvement for multiget, I guess due to its relatively small cost compared to the 90 keys fetched in the test. Test Plan: unit test and db_bench Reviewers: igor, sdong Reviewed By: igor CC: leveldb, dhruba Differential Revision: https://reviews.facebook.net/D15609	2014-02-03 13:13:36 -08:00
Igor Canadi	29bacb2eb6	VersionSet cleanup Summary: Removed icmp_ from VersionSet (since it's per-column-family, not per-DB-instance) Unfriended VersionSet and ColumnFamilyData (yay!) Removed VersionSet::NumberLevels() Cleaned up DBImpl Test Plan: make check Reviewers: dhruba, haobo, kailiu CC: leveldb Differential Revision: https://reviews.facebook.net/D15819	2014-02-03 13:10:47 -08:00
Siying Dong	d169b67680	[Performance Branch] PlainTable to encode rows with seqID 0, value type using 1 internal byte. Summary: In PlainTable, use one single byte to represent 8 bytes of internal bytes, if seqID = 0 and it is value type (which should be common for bottom most files). It is to save 7 bytes for uncompressed cases. Test Plan: make all check Reviewers: haobo, dhruba, kailiu Reviewed By: haobo CC: igor, leveldb Differential Revision: https://reviews.facebook.net/D15489	2014-02-03 12:19:30 -08:00
kailiu	4f6cb17bdb	First phase API clean up Summary: Addressed all the issues in https://reviews.facebook.net/D15447. Now most table-related modules are hidden from user land. Test Plan: make check Reviewers: sdong, haobo, dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D15525	2014-02-03 00:30:43 -08:00
kailiu	a5e220f5ef	Merge branch 'master' into performance Conflicts: Makefile db/db_impl.cc db/db_test.cc db/memtable_list.cc db/memtable_list.h table/block_based_table_reader.cc table/table_test.cc util/cache.cc util/coding.cc	2014-01-28 10:35:55 -08:00
Igor Canadi	511b03a5b5	LogAndApply to take ColumnFamilyData Summary: This removes the default implementation of LogAndApply that applied the changed to the default column family by default. It is mostly simple reformatting. Test Plan: make check Reviewers: dhruba, kailiu CC: leveldb Differential Revision: https://reviews.facebook.net/D15465	2014-01-27 13:57:58 -08:00
Mike Lin	af7838de36	address code review comments on `5e3aeb5f8e` - reduce string copying in Compaction::Summary - simplify file number checking in UniversalCompactionStopStyleSimilarSize unit test	2014-01-25 14:12:24 -08:00
Igor Canadi	5356b2a680	Merge branch 'master' into columnfamilies	2014-01-24 18:34:48 -08:00
Siying Dong	8477255da3	Moving Some includes from options.h to forward declaration Summary: By removing some includes from options.h and relying on forward declaration, we can more easily reason about the dependencies. Test Plan: make all check Reviewers: kailiu, haobo, igor, dhruba Reviewed By: kailiu CC: leveldb Differential Revision: https://reviews.facebook.net/D15411	2014-01-24 17:16:22 -08:00
Igor Canadi	1423e7c9de	Merge branch 'master' into columnfamilies Conflicts: db/version_set.cc db/version_set_reduce_num_levels.cc util/ldb_cmd.cc	2014-01-24 15:03:54 -08:00
kailiu	66dc033af3	Temporarily disable caching index/filter blocks Summary: Mixing index/filter blocks with data blocks resulted in some known issues. To make sure in next release our users won't be affected, we added a new option in BlockBasedTableFactory::TableOption to conceal this functionality for now. This patch also introduced a BlockBasedTableReader::OpenOptions, which avoids the "infinite" growth of parameters in BlockBasedTableReader::Open(). Test Plan: make check Reviewers: haobo, sdong, igor, dhruba Reviewed By: igor CC: leveldb, tnovak Differential Revision: https://reviews.facebook.net/D15327	2014-01-24 10:57:15 -08:00
Igor Canadi	28d1a0c6f5	Merge branch 'master' into columnfamilies Conflicts: db/db_impl.cc db/db_impl.h db/db_impl_readonly.h db/db_test.cc include/rocksdb/db.h include/utilities/stackable_db.h	2014-01-24 09:27:29 -08:00
Lei Jin	aba2acb5ec	CompactRange() to return status Summary: as title Test Plan: make all check What else tests shall I cover? Reviewers: igor, haobo CC: Differential Revision: https://reviews.facebook.net/D15339	2014-01-23 16:41:46 -08:00
Kai Liu	054c5dda8c	Merge branch 'master' into performance Conflicts: db/db_impl.cc db/db_test.cc db/memtable.cc db/version_set.cc include/rocksdb/statistics.h util/statistics_imp.h	2014-01-23 16:32:49 -08:00
Tomislav Novak	81c9cc9b3b	Tailing iterator Summary: This diff implements a special type of iterator that doesn't create a snapshot (can be used to read newly inserted data) and is optimized for doing sequential reads. TailingIterator uses current superversion number to determine whether to invalidate its internal iterators. If the version hasn't changed, it can often avoid doing expensive seeks over immutable structures (sst files and immutable memtables). Test Plan: * new unit tests * running LD with this patch Reviewers: igor, dhruba, haobo, sdong, kailiu Reviewed By: sdong CC: leveldb, lovro, march Differential Revision: https://reviews.facebook.net/D15285	2014-01-23 16:26:08 -08:00
Igor Canadi	7c5e583a27	ColumnFamilySet Summary: I created a separate class ColumnFamilySet to keep track of column families. Before we did this in VersionSet and I believe this approach is cleaner. Let me know if you have any comments. I will commit tomorrow. Test Plan: make check Reviewers: dhruba, haobo, kailiu, sdong CC: leveldb Differential Revision: https://reviews.facebook.net/D15357	2014-01-23 14:03:38 -08:00
Igor Canadi	92a022ad07	Merge branch 'master' into columnfamilies Conflicts: db/db_impl.cc db/db_impl.h db/db_impl_readonly.cc db/version_set.cc	2014-01-22 10:59:07 -08:00
Igor Canadi	6fe9b57748	Refactor Recover() code Summary: This diff does two things: * Rethinks how we call Recover() with read_only option. Before, we call it with pointer to memtable where we'd like to apply those changes to. This memtable is set in db_impl_readonly.cc and it's actually DBImpl::mem_. Why don't we just apply updates to mem_ right away? It seems more intuitive. * Changes when we apply updates to manifest. Before, the process is to recover all the logs, flush it to sst files and then do one giant commit that atomically adds all recovered sst files and sets the next log number. This works good enough, but causes some small troubles for my column family approach, since I can't have one VersionEdit apply to more than single column family[1]. The change here is to commit the files recovered from logs right away. Here is the state of the world before the change: 1. Recover log 5, add new sst files to edit 2. Recover log 7, add new sst files to edit 3. Recover log 8, add new sst files to edit 4. Commit all added sst files to manifest and mark log files 5, 7 and 8 as recovered (via SetLogNumber(9) function) After the change, we'll do: 1. Recover log 5, commit the new sst files and set log 5 as recovered 2. Recover log 7, commit the new sst files and set log 7 as recovered 3. Recover log 8, commit the new sst files and set log 8 as recovered The added (small) benefit is that if we fail after (2), the new recovery will only have to recover log 8. In previous case, we'll have to restart the recovery from the beginning. The bigger benefit will be to enable easier integration of multiple column families in Recovery code path. [1] I'm happy to discuss this decision, but I believe this is the cleanest way to go. It also makes backward compatibility much easier. We don't have a requirement of adding multiple column families atomically. Test Plan: make check Reviewers: dhruba, haobo, kailiu, sdong Reviewed By: kailiu CC: leveldb Differential Revision: https://reviews.facebook.net/D15237	2014-01-22 10:45:26 -08:00
Igor Canadi	23f6791c9e	Merge branch 'master' into columnfamilies Conflicts: db/db_impl.cc db/db_impl_readonly.cc db/db_test.cc db/version_edit.cc db/version_edit.h db/version_set.cc db/version_set.h db/version_set_reduce_num_levels.cc	2014-01-21 17:01:52 -08:00
Igor Canadi	83681bf9ef	Statistics code cleanup Summary: I'm separating code-cleanup part of https://reviews.facebook.net/D14517. This will make D14517 easier to understand and this diff easier to review. Test Plan: make check Reviewers: haobo, kailiu, sdong, dhruba, tnovak Reviewed By: tnovak CC: leveldb Differential Revision: https://reviews.facebook.net/D15099	2014-01-17 12:46:06 -08:00
Mike Lin	5e3aeb5f8e	An initial implementation of kCompactionStopStyleSimilarSize for universal compaction	2014-01-16 22:59:34 -08:00
Naman Gupta	1447bb5919	Allow callback to change size of existing value. Change return type of the callback function to an enum status to handle 3 cases. Summary: This diff fixes 2 hacks: * The callback function can modify the existing value inplace, if the merged value fits within the existing buffer size. But currently the existing buffer size is not being modified. Now the callback receives an int* allowing the size to be modified. Since size is encoded as a varint in the internal key for memtable, it might happen that the entire value has to be copied to the new location if the new size varint is smaller than the existing size varint. * The callback function has 3 functionalities 1. Modify existing buffer inplace, and update size correspondingly. Now to indicate that, Returns 1. 2. Generate a new buffer indicating merged value. Returns 2. 3. Fails to do either of above, based on whatever application logic. Returns 0. Test Plan: Just make all for now. I'm adding another unit test to test each scenario. Reviewers: dhruba, haobo Reviewed By: haobo CC: leveldb, sdong, kailiu, xinyaohu, sumeet, danguo Differential Revision: https://reviews.facebook.net/D15195	2014-01-16 15:12:39 -08:00
kailiu	1304d8c8ce	Merge branch 'master' into performance Conflicts: Makefile db/db_impl.cc db/db_impl.h db/db_test.cc db/memtable.cc db/memtable.h db/version_edit.h db/version_set.cc include/rocksdb/options.h util/hash_skiplist_rep.cc util/options.cc	2014-01-15 23:12:31 -08:00
Igor Canadi	d9cd7a063f	Fix CompactRange to apply filter to every key Summary: When doing CompactRange(), we should first flush the memtable and then calculate max_level_with_files. Also, we want to compact all the levels that have files, including level `max_level_with_files`. This patch fixed the unit test. Test Plan: Added a failing unit test and a fix, so it's not failing anymore. Reviewers: dhruba, haobo, sdong Reviewed By: haobo CC: leveldb, xjin Differential Revision: https://reviews.facebook.net/D14421	2014-01-14 16:19:09 -08:00
Igor Canadi	1ed2404f27	Wrong number of levels is Invalid argument now, not corruption	2014-01-14 15:54:11 -08:00
Igor Canadi	6291020284	Fix test	2014-01-14 15:41:30 -08:00
Igor Canadi	055e6df45b	VersionEdit not to take NumLevels() Summary: I will submit a sequence of diffs that are preparing master branch for column families. There are a lot of implicit assumptions in the code that are making column family implementation hard. If I make the change only in column family branch, it will make merging back to master impossible. Most of the diffs will be simple code refactorings, so I hope we can have fast turnaround time. Feel free to grab me in person to discuss any of them. This diff removes number of level check from VersionEdit. It is used only when VersionEdit is read, not written, but has to be set when it is written. I believe it is a right thing to make VersionEdit dumb and check consistency on the caller side. This will also make it much easier to implement Column Families, since different column families can have different number of levels. Test Plan: make check Reviewers: dhruba, haobo, sdong, kailiu Reviewed By: kailiu CC: leveldb Differential Revision: https://reviews.facebook.net/D15159	2014-01-14 15:27:09 -08:00
Igor Canadi	7d9f21cf23	BuildBatchGroup -- memcpy outside of lock Summary: When building batch group, don't actually build a new batch since it requires heavy-weight mem copy and malloc. Only store references to the batches and build the batch group without lock held. Test Plan: `make check` I am also planning to run performance tests. The workload that will benefit from this change is readwhilewriting. I will post the results once I have them. Reviewers: dhruba, haobo, kailiu Reviewed By: haobo CC: leveldb, xjin Differential Revision: https://reviews.facebook.net/D15063	2014-01-14 14:49:31 -08:00
Naman Gupta	8454cfe569	Add read/modify/write functionality to Put() api Summary: The application can set a callback function, which is applied to the previous value and calculates the new value. This new value can be set in place if the previous value existed in the memtable and the new value is smaller than the previous value. Otherwise the new value is added normally. Test Plan: fbmake. Added unit tests. All unit tests pass. Reviewers: dhruba, haobo Reviewed By: haobo CC: sdong, kailiu, xinyaohu, sumeet, leveldb Differential Revision: https://reviews.facebook.net/D14745	2014-01-14 07:55:16 -08:00
Siying Dong	aa0ef6602d	[Performance Branch] If options.max_open_files set to be -1, cache table readers in FileMetadata for Get() and NewIterator() Summary: In some use cases, table readers for all live files should always be cached. In that case, there will be an opportunity to avoid the table cache look-up during Get() and NewIterator(). We define options.max_open_files = -1 to be the mode in which table readers for live files will always be kept. In that mode, table readers are cached in FileMetaData (with a reference count held in table cache), so that when executing table_cache.Get() and table_cache.NewIterator(), LRU cache checking can be by-passed to reduce latency. Test Plan: add a test case in db_test Reviewers: haobo, kailiu Reviewed By: haobo CC: dhruba, igor, leveldb Differential Revision: https://reviews.facebook.net/D15039	2014-01-10 15:57:49 -08:00
Siying Dong	424a524ac9	[Performance Branch] A Hashed Linked List Based Mem Table Summary: Implement a mem table, in which keys are hashed based on prefixes. In each bucket, entries are organized in a sorted linked list. It has the same thread safety guarantee as skip list. The motivation is to optimize memory usage for the case that prefix hashing is primary way of seeking to the entry. Compared to hash skip list implementation, this implementation is more memory efficient, but inside each bucket, search is always linear. The target scenario is that there are only very limited number of records in each hash bucket. Test Plan: Add a test case in db_test Reviewers: haobo, kailiu, dhruba Reviewed By: haobo CC: igor, nkg-, leveldb Differential Revision: https://reviews.facebook.net/D14979	2014-01-09 16:19:11 -08:00
Igor Canadi	19e3ee64ac	Add column family information to WAL Summary: I have added three new value types: * kTypeColumnFamilyDeletion * kTypeColumnFamilyValue * kTypeColumnFamilyMerge which include column family Varint32 before the data (value, deletion and merge). These values are used only in WAL (not in memtables yet). This endeavour required changing some WriteBatch internals. Test Plan: Added a unittest Reviewers: dhruba, haobo, sdong, kailiu CC: leveldb Differential Revision: https://reviews.facebook.net/D15045	2014-01-08 12:53:33 -08:00
Kai Liu	5e7d5629c7	Fix the valgrind issues	2014-01-03 11:48:31 -08:00
kailiu	e72aa37cc5	Merge branch 'master' into performance Conflicts: db/table_cache.cc	2014-01-02 16:34:59 -08:00
Igor Canadi	7535443083	[RocksDB] Support for column families in manifest Summary: <This diff is for Column Family branch> Added fields in manifest file to support adding and deleting column families. Pretty simple change, each version edit record can be: 1. add column family 2. drop column family 3. add and delete N files from a single column family (compactions and flushes will generate such records) Test Plan: make check works, the code is backward compatible Reviewers: dhruba, haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D14733	2014-01-02 04:18:28 -08:00
Igor Canadi	6de1b5b83e	Merge branch 'master' into columnfamilies	2014-01-02 04:18:07 -08:00
Igor Canadi	b60c14f6ee	Support multi-threaded DisableFileDeletions() and EnableFileDeletions() Summary: We don't want two threads to clash if they concurrently call DisableFileDeletions() and EnableFileDeletions(). I'm adding a counter that will enable file deletions only after all DisableFileDeletions() calls have been negated with EnableFileDeletions(). However, we also don't want to break the old behavior, so I added a parameter force to EnableFileDeletions(). If force is true, we will still enable file deletions after every call to EnableFileDeletions(), which is what is happening now. Test Plan: make check Reviewers: dhruba, haobo, sanketh Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D14781	2014-01-02 03:33:42 -08:00
kailiu	f1cec73a76	Merge branch 'master' into performance Conflicts: db/db_impl.cc db/db_test.cc db/memtable.cc db/version_set.cc include/rocksdb/statistics.h	2013-12-27 12:23:17 -08:00
kailiu	079a21ba99	Fix the unused variable warning message in mac os	2013-12-26 15:12:30 -08:00
Siying Dong	abaf26266d	[RocksDB] [Performance Branch] Some Changes to PlainTable format Summary: Some changes to PlainTable format: (1) support variable key length (2) use user defined slice transformer to extract prefixes (3) Run some test cases against PlainTable in db_test and table_test Test Plan: test db_test Reviewers: haobo, kailiu CC: dhruba, igor, leveldb, nkg- Differential Revision: https://reviews.facebook.net/D14457	2013-12-20 12:08:35 -08:00
Igor Canadi	9385a5247e	[RocksDB] [Column Family] Interface proposal Summary: <This diff is for Column Family branch> Sharing some of the work I've done so far. This diff compiles and passes the tests. The biggest change is in options.h - I broke down Options into two parts - DBOptions and ColumnFamilyOptions. DBOptions is DB-specific (env, create_if_missing, block_cache, etc.) and ColumnFamilyOptions is column family-specific (all compaction options, compression options, etc.). Note that this does not break backwards compatibility at all. Further, I created DBWithColumnFamily which inherits DB interface and adds new functions with column family support. Clients can transparently switch to DBWithColumnFamily and it will not break their backwards compatibility. There are a few methods worth checking out: ListColumnFamilies(), MultiNewIterator(), MultiGet() and GetSnapshot(). [GetSnapshot() returns the snapshot across all column families for now - I think that's what we agreed on] Finally, I made small changes to WriteBatch so we are able to atomically insert data across column families. Please provide feedback. Test Plan: make check works, the code is backward compatible Reviewers: dhruba, haobo, sdong, kailiu, emayanke CC: leveldb Differential Revision: https://reviews.facebook.net/D14445	2013-12-18 13:08:22 -08:00
kailiu	0cd1521af5	Completely remove argv_ since no one uses it. There are still warnings in some other environments, so just remove the useless variable `argv_`	2013-12-12 16:36:38 -08:00
kailiu	0e24f97b9f	Revert last commit and add "unused" attribute to suppress warning	2013-12-12 15:40:44 -08:00


312 Commits