rocksdb

Author	SHA1	Message	Date
Igor Canadi	2b8c44639a	Merge branch 'master' into columnfamilies	2014-02-07 11:42:56 -08:00
Yueh-Hsuan Chiang	3ce8d9a988	Add support for plain table format to sst_dump. Summary: This diff enables the command line tool `sst_dump` to work for sst files under plain table format. Changes include: * In tools/sst_dump.cc: - add support for plain table format - display prefix_extractor information when --show_properties is on * In table/format.cc - Now the table magic number of a Footer can be later initialized via ReadFooterFromFile(). * In table/meta_bocks: - add function ReadTableMagicNumber() that reads the magic number of the specified file. Minor fixes: - remove a duplicate #include in table/table_test.cc - fix a commentary typo in include/rocksdb/memtablerep.h - fix lint errors. Test Plan: Runs sst_dump with both block-based and plain-table format files with different arguments, specifically those with --show-properties and --from. * sample output: https://reviews.facebook.net/P261 Reviewers: kailiu, sdong, xjin CC: leveldb Differential Revision: https://reviews.facebook.net/D15903	2014-02-07 11:15:00 -08:00
Igor Canadi	1560bb913e	Readrandom with tailing iterator Summary: Added an option for readrandom benchmark to run with tailing iterator instead of Get. Benefit of tailing iterator is that it doesn't require locking DB mutex on access. I also have some results when running on my machine. The results highly depend on number of cache shards. With our current benchmark setting of 4 table cache shards and 6 block cache shards, I don't see much improvements of using tailing iterator. In that case, we're probably seeing cache mutex contention. Here are the results for different number of shards cache shards tailing iterator get 6 1.38M 1.16M 10 1.58M 1.15M As soon as we get rid of cache mutex contention, we're seeing big improvements in using tailing iterator vs. ordinary get. Test Plan: ran regression test Reviewers: dhruba, haobo, ljin, kailiu, sding Reviewed By: haobo CC: tnovak Differential Revision: https://reviews.facebook.net/D15867	2014-02-07 09:47:47 -08:00
Igor Canadi	d53b188228	Fix some errors detected by coverity scan Summary: Nothing major, just an extra return line and posibility of leaking fb in NewRandomRWFile Test Plan: make check Reviewers: kailiu, dhruba Reviewed By: kailiu CC: leveldb Differential Revision: https://reviews.facebook.net/D15993	2014-02-06 21:59:44 -08:00
kailiu	1d08140e81	Compile -O2 by default and add `make dbg` Summary: To speed up the compilation while allowing us to compile in debug mode. Test Plan: make: see -O2 enabled make dbg: didn't see -O2 Reviewers: igor Reviewed By: igor CC: leveldb, dhruba Differential Revision: https://reviews.facebook.net/D15969	2014-02-06 16:11:35 -08:00
Igor Canadi	0143abdbb0	Merge branch 'master' into columnfamilies Conflicts: HISTORY.md db/db_impl.cc db/db_impl.h db/db_iter.cc db/db_test.cc db/dbformat.h db/memtable.cc db/memtable_list.cc db/memtable_list.h db/table_cache.cc db/table_cache.h db/version_edit.h db/version_set.cc db/version_set.h db/write_batch.cc db/write_batch_test.cc include/rocksdb/options.h util/options.cc	2014-02-06 15:58:20 -08:00
Kai Liu	dcea184948	Update HISTORY.md Remove the unnecessary line led from branch merge.	2014-02-06 15:52:16 -08:00
Igor Canadi	0b4ccf765c	Flushes should always go to HIGH priority thread pool Summary: This is not column-family related diff. It is in columnfamily branch because the change is significant and we want to push it with next major release (3.0). It removes the leveldb notion of one thread pool and expands it to two thread pools by default (HIGH and LOW). Flush process is removed from compaction process and all flush threads are executed on HIGH thread pool, since we don't want long-running compactions to influence flush latency. Test Plan: make check Reviewers: dhruba, haobo, kailiu, sdong CC: leveldb Differential Revision: https://reviews.facebook.net/D15987	2014-02-06 14:17:13 -08:00
Igor Canadi	4564b2e8f9	Merge pull request #76 from lisyarus/fix_zlib_macro_bug Fix build in case zlib not found	2014-02-06 13:05:03 -08:00
Kai Liu	fd0ffbc7ca	Disable the html-based coverage report by default	2014-02-06 12:58:13 -08:00
Igor Canadi	f8d5443efe	[CF] Thread-safety guarantees for ColumnFamilySet Summary: Revised thread-safety guarantees and implemented a way to spinlock the object. Test Plan: make check Reviewers: dhruba, haobo, sdong, kailiu CC: leveldb Differential Revision: https://reviews.facebook.net/D15975	2014-02-06 11:57:36 -08:00
Igor Canadi	8fa8a708ef	[CF] Propagate correct options to WriteBatch::InsertInto Summary: WriteBatch can have multiple column families in one batch. Every column family has different options. So we have to add a way for write batch to get options for an arbitrary column family. This required a bit more acrobatics since lots of interfaces had to be changed. Test Plan: make check Reviewers: dhruba, haobo, sdong, kailiu CC: leveldb Differential Revision: https://reviews.facebook.net/D15957	2014-02-06 10:23:31 -08:00
kailiu	84f8185fc0	Merge branch 'master' into performance Conflicts: HISTORY.md db/db_impl.cc db/memtable.cc	2014-02-05 21:21:00 -08:00
Igor Canadi	b4f441f48a	Fixed a bug introduced by previous commit	2014-02-05 14:58:24 -08:00
Igor Canadi	f276e0e59d	[CF] Options -> DBOptions Summary: Replaced most of occurrences of Options with more specific DBOptions. This brings us very close to supporting different configuration options for each column family. Test Plan: make check Reviewers: dhruba, haobo, kailiu, sdong CC: leveldb Differential Revision: https://reviews.facebook.net/D15933	2014-02-05 14:56:09 -08:00
lisyarus	aa6fbbfae7	Fix build in case zlib not found	2014-02-06 01:34:18 +04:00
Igor Canadi	6e56ab5702	[CF] Add full_options_ to ColumnFamilyData Summary: Lots of code expects Options on construction/function call. My original idea was to split Options argument into ColumnFamilyOptions and DBOptions (the latter only if needed). However, this will require huge code changes very deep in the stack. The better idea is to have ColumnFamilyData hold both ColumnFamilyOptions and Options. ColumnFamilyData::Options would be constructed from DBOptions (same for each column family) and ColumnFamilyOptions (different for each column family) Now when we construct a class or call any method that requires Options, we can just push him ColumnFamilyData::Options and be sure that it's using column-family-specific settings. Test Plan: make check Reviewers: dhruba, haobo, kailiu, sdong CC: leveldb Differential Revision: https://reviews.facebook.net/D15927	2014-02-05 12:26:40 -08:00
Igor Canadi	328ac7ee02	Merge branch 'master' into columnfamilies	2014-02-05 12:00:33 -08:00
Igor Canadi	c24d8c4e90	[CF] Rethink table cache Summary: Adapting table cache to column families is interesting. We want table cache to be global LRU, so if some column families are use not as often as others, we want them to be evicted from cache. However, current TableCache object also constructs tables on its own. If table is not found in the cache, TableCache automatically creates new table. We want each column family to be able to specify different table factory. To solve the problem, we still have a single LRU, but we provide the LRUCache object to TableCache on construction. We have one TableCache per column family, but the underyling cache is shared by all TableCache objects. This allows us to have a global LRU, but still be able to support different table factories for different column families. Also, in the future it will also be able to support different directories for different column families. Test Plan: make check Reviewers: dhruba, haobo, kailiu, sdong CC: leveldb Differential Revision: https://reviews.facebook.net/D15915	2014-02-05 11:55:30 -08:00
Igor Canadi	183ba01a0e	Merge pull request #71 from alberts/crc32 crc32c: choose function in static initialization	2014-02-05 08:23:56 -08:00
Albert Strasheim	d411dc5884	crc32c: choose function in static initialization Before: 4.430 micros/op 225732 ops/sec; 881.8 MB/s (4K per op) After: 4.125 micros/op 242425 ops/sec; 947.0 MB/s (4K per op)	2014-02-04 19:13:57 -08:00
Igor Canadi	7b9f134959	[CF] Move InternalStats to ColumnFamilyData Summary: InternalStats is a messy thing, keeping both DB data and column family data. However, it's better off living in ColumnFamilyData than in DBImpl. For now, at least. Test Plan: make check Reviewers: dhruba, kailiu, haobo, sdong CC: leveldb Differential Revision: https://reviews.facebook.net/D15879	2014-02-04 17:59:50 -08:00
Igor Canadi	73f62255c1	[CF] Split SanitizeOptions into two Summary: There are three SanitizeOption-s now : one for DBOptions, one for ColumnFamilyOptions and one for Options (which just calls the other two) I have also reshuffled some options -- table_cache options and info_log should live in DBOptions, for example. Test Plan: make check doesn't complain Reviewers: dhruba, haobo, kailiu, sdong CC: leveldb Differential Revision: https://reviews.facebook.net/D15873	2014-02-04 17:26:51 -08:00
kailiu	d43ebd8c65	Put table factory back to public api Summary: Previous I am too ambitious to hide every detail about table factory to internal api. However, we cannot pass the compilatoin for external users since we use table factory as the shared_ptr, which requires the definition of table factory's destructor. Test Plan: make check; Reviewers: sdong, haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D15861	2014-02-03 19:51:20 -08:00
Igor Canadi	5e2c4fe766	Get rid of DBImpl::user_comparator() Summary: user_comparator() is a Column Family property, not DBImpl Test Plan: make check Reviewers: dhruba, haobo, kailiu, sdong CC: leveldb Differential Revision: https://reviews.facebook.net/D15855	2014-02-03 17:50:47 -08:00
Igor Canadi	0e22badc08	[column families] Iterator and MultiGet Summary: Support for different column families in Iterator and MultiGet code path. Test Plan: make check Reviewers: dhruba, haobo, kailiu, sdong CC: leveldb Differential Revision: https://reviews.facebook.net/D15849	2014-02-03 17:44:40 -08:00
Igor Canadi	2966d764cd	Fix some 32-bit compile errors Summary: RocksDB doesn't compile on 32-bit architecture apparently. This is attempt to fix some of 32-bit errors. They are reported here: https://gist.github.com/paxos/8789697 Test Plan: RocksDB still compiles on 64-bit :) Reviewers: kailiu Reviewed By: kailiu CC: leveldb Differential Revision: https://reviews.facebook.net/D15825	2014-02-03 13:48:30 -08:00
Igor Canadi	2a9271b403	Merge branch 'master' into columnfamilies Conflicts: db/db_impl.cc db/db_impl.h db/db_impl_readonly.cc	2014-02-03 13:47:54 -08:00
Lei Jin	5b3b6549d6	use super_version in NewIterator() and MultiGet() function Summary: Use super_version insider NewIterator to avoid Ref() each component separately under mutex The new added bench shows NewIterator QPS increases from 515K to 719K No meaningful improvement for multiget I guess due to its relatively small cost comparing to 90 keys fetch in the test. Test Plan: unit test and db_bench Reviewers: igor, sdong Reviewed By: igor CC: leveldb, dhruba Differential Revision: https://reviews.facebook.net/D15609	2014-02-03 13:13:36 -08:00
Igor Canadi	29bacb2eb6	VersionSet cleanup Summary: Removed icmp_ from VersionSet (since it's per-column-family, not per-DB-instance) Unfriended VersionSet and ColumnFamilyData (yay!) Removed VersionSet::NumberLevels() Cleaned up DBImpl Test Plan: make check Reviewers: dhruba, haobo, kailiu CC: leveldb Differential Revision: https://reviews.facebook.net/D15819	2014-02-03 13:10:47 -08:00
Siying Dong	d169b67680	[Performance Branch] PlainTable to encode rows with seqID 0, value type using 1 internal byte. Summary: In PlainTable, use one single byte to represent 8 bytes of internal bytes, if seqID = 0 and it is value type (which should be common for bottom most files). It is to save 7 bytes for uncompressed cases. Test Plan: make all check Reviewers: haobo, dhruba, kailiu Reviewed By: haobo CC: igor, leveldb Differential Revision: https://reviews.facebook.net/D15489	2014-02-03 12:19:30 -08:00
Igor Canadi	5c6ef56152	Fix printf format	2014-02-03 10:25:37 -08:00
Kai Liu	87bda51d77	Merge pull request #58 from mlin/no-stdout Eliminate stdout message when launching a posix thread.	2014-02-03 00:38:11 -08:00
kailiu	4f6cb17bdb	First phase API clean up Summary: Addressed all the issues in https://reviews.facebook.net/D15447. Now most table-related modules are hidden from user land. Test Plan: make check Reviewers: sdong, haobo, dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D15525	2014-02-03 00:30:43 -08:00
Igor Canadi	27a8856c23	Compacting column families Summary: This diff enables non-default column families to get compacted both automatically and also by calling CompactRange() Test Plan: make check Reviewers: dhruba, haobo, kailiu, sdong CC: leveldb Differential Revision: https://reviews.facebook.net/D15813	2014-01-31 19:54:03 -08:00
Igor Canadi	5661ed8b80	Fix reduce_levels_test	2014-01-31 19:44:48 -08:00
Igor Canadi	fe9bd300d6	Merge branch 'master' into columnfamilies	2014-01-31 17:17:22 -08:00
Dhruba Borthakur	30a700657d	Fix corruption_test failure caused by auto-enablement of checksum verification. Summary: Patch https://reviews.facebook.net/D15591 enabled checksum verification by default. This caused the unit test to fail. Test Plan: ./corruption_test Reviewers: igor, kailiu Reviewed By: igor CC: leveldb Differential Revision: https://reviews.facebook.net/D15795	2014-01-31 17:16:38 -08:00
Igor Canadi	f7489123e2	Move compaction picker and internal key comparator to ColumnFamilyData Summary: Compaction picker and internal key comparator are different for each column family (not global), so they should live in ColumnFamilyData Test Plan: make check Reviewers: dhruba, haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D15801	2014-01-31 16:06:55 -08:00
Igor Canadi	5693db2a02	Merge branch 'master' into columnfamilies Conflicts: db/db_impl.cc	2014-01-31 14:45:15 -08:00
Igor Canadi	dbbffbd772	Mark the log_number file number used Summary: VersionSet::next_file_number_ is always assumed to be strictly greater than VersionSet::log_number_. In our new recovery code, we artificially set log_number_ to be (log_number + 1), so that once we flush, we don't recover from the same log file again (this is important because of merge operator non-idempotence) When we set VersionSet::log_number_ to (log_number + 1), we also have to mark that file number used, such that next_file_number_ is increased to a legal level. Otherwise, VersionSet might assert. This has not be a problem so far because here's what happens: 1. assume next_file_number is 5, we're recovering log_number 10 2. in DBImpl::Recover() we call MarkFileNumberUsed with 10. This will set VersionSet::next_file_number_ to 11. 3. If there are some updates, we will call WriteTable0ForRecovery(), which will use file number 11 as a new table file and advance VersionSet::next_file_number_ to 12. 4. When we LogAndApply() with log_number 11, assertion is true: assert(11 <= 12); However, this was a lucky occurrence. Even though this diff doesn't cause a bug, I think the issue is important to fix. Test Plan: In column families I have different recovery logic and this code path asserted. When adding MarkFileNumberUsed(log_number + 1) assert is gone. Reviewers: dhruba, kailiu Reviewed By: kailiu CC: leveldb Differential Revision: https://reviews.facebook.net/D15783	2014-01-31 14:43:16 -08:00
Igor Canadi	3615f534d1	Enable flushing memtables from arbitrary column families Summary: Removed default_cfd_ from all flush code paths. This means we can now flush memtables from arbitrary column families! Test Plan: Added a new unit test Reviewers: dhruba, haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D15789	2014-01-31 14:42:52 -08:00
Siying Dong	56bea9f80d	When using Universal Compaction, Zero out seqID in the last file too Summary: I didn't figure out the reason why the feature of zeroing out earlier sequence ID is disabled in universal compaction. I do see bottommost_level is set correctly. It should simply work if we remove the constraint of universal compaction. Test Plan: make all check Reviewers: haobo, dhruba, kailiu, igor Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D15423	2014-01-31 09:59:52 -08:00
kailiu	4e0298f23c	Clean up arena API Summary: Easy thing goes first. This patch moves arena to internal dir; based on which, the coming patch will deal with memtable_rep. Test Plan: make check Reviewers: haobo, sdong, dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D15615	2014-01-30 22:10:10 -08:00
Dhruba Borthakur	abd70ecc2b	The default settings enable checksum verification on every read. Summary: The default settings enable checksum verification on every read. Test Plan: make check Reviewers: haobo Reviewed By: haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D15591	2014-01-30 19:14:03 -08:00
Igor Canadi	9ca638a86d	Enable iterating column families with a concurrent writer Summary: Sometimes we iterate through column families, and unlock the mutex in the body of the iteration. While mutex is unlocked, some column family might be created or dropped. We need to be able to continue iterating through column families even though our current column family got dropped. This diff implements circular linked lists that connect all column families. It then uses the link list to enable iterating through linked lists. Even if the column family is dropped, its next_ pointer still can be used to advance to another alive column family. Test Plan: make check Reviewers: dhruba, haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D15603	2014-01-30 16:57:52 -08:00
Igor Canadi	6973bb1722	MakeRoomForWrite() support for column families Summary: Making room for write will be the hardest part of the column family implementation. For now, I just iterate through all column families and run MakeRoomForWrite() for every one. Test Plan: make check does not complain Reviewers: dhruba, haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D15597	2014-01-30 16:12:08 -08:00
Igor Canadi	c37e7de669	Merge branch 'master' into columnfamilies Conflicts: db/db_impl.cc db/db_impl.h	2014-01-30 11:47:23 -08:00
Igor Canadi	ac92420fc5	Merge branch 'master' into performance Conflicts: db/db_impl.h	2014-01-30 10:09:23 -08:00
Yueh-Hsuan Chiang	a46ac92138	Allow command line tool sst-dump to display table properties. Summary: Add option '--show_properties' to sst_dump tool to allow displaying property block of the specified files. Test Plan: Run sst_dump with the following arguments, which covers cases affected by this diff: 1. with only --file 2. with both --file and --show_properties 3. with --file, --show_properties, and --from Reviewers: kailiu, xjin Differential Revision: https://reviews.facebook.net/D15453	2014-01-30 01:50:34 -08:00

1 2 3 4 5 ...

1185 Commits