rocksdb

Author	SHA1	Message	Date
Lei Jin	fd5d80d55e	CompactedDB: log using the correct info_log Summary: info_log from supplied Options can be nullptr. Using the one from db_impl. Also call flush after that since no more logging will happen and LOG can contain partial output Test Plan: verified with db_bench Reviewers: igor, yhchiang, sdong Reviewed By: sdong Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D24183	2014-09-29 12:45:04 -07:00
Lei Jin	2faf49d5f1	use GetContext to replace callback function pointer Summary: Instead of passing a callback function pointer and its arg on the Table::Get() interface, pass a GetContext. This makes the interface cleaner and possibly improves perf. Also adding a fast pass for SaveValue() Test Plan: make all check Reviewers: igor, yhchiang, sdong Reviewed By: sdong Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D24057	2014-09-29 11:09:09 -07:00
sdong	389edb6b1b	universal compaction picker: use double for potential overflow Summary: There is a possible overflow case in universal compaction picker. Use double to make the logic straight-forward Test Plan: make all check Reviewers: yhchiang, igor, MarkCallaghan, ljin Reviewed By: ljin Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D23817	2014-09-26 16:17:05 -07:00
Lei Jin	fbd2dafc9f	CompactedDBImpl::MultiGet() for better CuckooTable performance Summary: Add the MultiGet API to allow prefetching. With a file size of 1.5G, I configured it with a 0.9 hash ratio, which fills with 115M keys and results in 2 hash functions; the lookup QPS is ~4.9M/s vs. 3M/s for Get(). It is tricky to set the parameters right. Since file size is determined by a power-of-two factor, the # of keys is fixed in each file. With a big file size (thus a smaller # of files), we are more likely to waste a lot of space in the last file, lowering space utilization as a result. Using a smaller file size can improve the situation, but that harms lookup speed. Test Plan: db_bench Reviewers: yhchiang, sdong, igor Reviewed By: sdong Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D23673	2014-09-25 13:34:51 -07:00
Lei Jin	3c68006109	CompactedDBImpl Summary: Add a CompactedDBImpl that will be enabled when calling OpenForReadOnly() and the DB only has one level (>0) of files. As a performance comparison, CuckooTable performs 2.1M/s with CompactedDBImpl vs. 1.78M/s with ReadOnlyDBImpl. Test Plan: db_bench Reviewers: yhchiang, igor, sdong Reviewed By: sdong Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D23553	2014-09-25 11:14:01 -07:00
Igor Canadi	f7375f39fd	Fix double deletes Summary: While debugging clients compaction issues, I noticed a bunch of delete bugs: P16329995. MakeTableName returns sst file with "/" prefix. We also need "/" prefix when we get the files through GetChildren(), so that we can properly dedup the files. Test Plan: none Reviewers: sdong, yhchiang, ljin Reviewed By: ljin Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D23457	2014-09-25 11:08:16 -07:00
Igor Canadi	21ddcf6e4f	Remove allow_thread_local Summary: See https://reviews.facebook.net/D19365 Test Plan: compiles Reviewers: sdong, yhchiang, ljin Reviewed By: ljin Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D23907	2014-09-24 13:12:16 -07:00
sdong	cdaf44f9ae	Enlarge log size cap when printing file summary Summary: Now the file summary is too small for printing. Enlarge it. To enable it, allow to pass a size to log buffer. Test Plan: Add a unit test. make all check Reviewers: ljin, yhchiang Reviewed By: yhchiang Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D21723	2014-09-23 16:56:34 -07:00
sdong	d0de413f4d	WriteBatchWithIndex to allow different Comparators for different column families Summary: Previously, one single column family is given to WriteBatchWithIndex to index keys for all column families. An extra map from column family ID to comparator is maintained which can override the default comparator given in the constructor. A WriteBatchWithIndex::SetComparatorForCF() is added for user to add comparators per column family. Also move more codes into anonymous namespace. Test Plan: Add a unit test Reviewers: ljin, igor Reviewed By: igor Subscribers: dhruba, leveldb, yhchiang Differential Revision: https://reviews.facebook.net/D23355	2014-09-22 13:47:39 -07:00
Lei Jin	57a32f147f	change target_file_size_base to uint64_t Summary: It constrains the file size to be 4G max with int Test Plan: tried to grep instance and made sure other related variables are also uint64 Reviewers: sdong, yhchiang, igor Reviewed By: igor Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D23697	2014-09-22 11:15:03 -07:00
Lei Jin	5e6aee4325	dont create backup_input if compaction filter v2 is not used Summary: Compaction creates backup_input iterator even though it only needed when compaction filter v2 is enabled Test Plan: make all check Reviewers: sdong, yhchiang, igor Reviewed By: igor Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D23769	2014-09-22 10:36:53 -07:00
Venkatesh Radhakrishnan	f44594743f	RocksDB: Format uint64 using PRIu64 in db_impl.cc Summary: Use PRIu64 to format uint64 in a portable manner Test Plan: Run "make all check" Reviewers: sdong Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D23595	2014-09-18 22:19:41 -07:00
Igor Canadi	90b8c07b48	Fix unit tests errors Summary: Those were introduced with `2fb1fea30f` because the flushing behavior changed when max_background_flushes is > 0. Test Plan: make check Reviewers: ljin, yhchiang, sdong Reviewed By: sdong Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D23577	2014-09-18 13:32:44 -07:00
Lei Jin	51af7c326c	CuckooTable: add one option to allow identity function for the first hash function Summary: MurmurHash becomes expensive when we do millions Get() a second in one thread. Add this option to allow the first hash function to use identity function as hash function. It results in QPS increase from 3.7M/s to ~4.3M/s. I did not observe improvement for end to end RocksDB performance. This may be caused by other bottlenecks that I will address in a separate diff. Test Plan: ``` [ljin@dev1964 rocksdb] ./cuckoo_table_reader_test --enable_perf --file_dir=/dev/shm --write --identity_as_first_hash=0 ==== Test CuckooReaderTest.WhenKeyExists ==== Test CuckooReaderTest.WhenKeyExistsWithUint64Comparator ==== Test CuckooReaderTest.CheckIterator ==== Test CuckooReaderTest.CheckIteratorUint64 ==== Test CuckooReaderTest.WhenKeyNotFound ==== Test CuckooReaderTest.TestReadPerformance With 125829120 items, utilization is 93.75%, number of hash functions: 2. Time taken per op is 0.272us (3.7 Mqps) with batch size of 0, # of found keys 125829120 With 125829120 items, utilization is 93.75%, number of hash functions: 2. Time taken per op is 0.138us (7.2 Mqps) with batch size of 10, # of found keys 125829120 With 125829120 items, utilization is 93.75%, number of hash functions: 2. Time taken per op is 0.142us (7.1 Mqps) with batch size of 25, # of found keys 125829120 With 125829120 items, utilization is 93.75%, number of hash functions: 2. Time taken per op is 0.142us (7.0 Mqps) with batch size of 50, # of found keys 125829120 With 125829120 items, utilization is 93.75%, number of hash functions: 2. Time taken per op is 0.144us (6.9 Mqps) with batch size of 100, # of found keys 125829120 With 104857600 items, utilization is 78.12%, number of hash functions: 2. Time taken per op is 0.201us (5.0 Mqps) with batch size of 0, # of found keys 104857600 With 104857600 items, utilization is 78.12%, number of hash functions: 2. 
Time taken per op is 0.121us (8.3 Mqps) with batch size of 10, # of found keys 104857600 With 104857600 items, utilization is 78.12%, number of hash functions: 2. Time taken per op is 0.123us (8.1 Mqps) with batch size of 25, # of found keys 104857600 With 104857600 items, utilization is 78.12%, number of hash functions: 2. Time taken per op is 0.121us (8.3 Mqps) with batch size of 50, # of found keys 104857600 With 104857600 items, utilization is 78.12%, number of hash functions: 2. Time taken per op is 0.112us (8.9 Mqps) with batch size of 100, # of found keys 104857600 With 83886080 items, utilization is 62.50%, number of hash functions: 2. Time taken per op is 0.251us (4.0 Mqps) with batch size of 0, # of found keys 83886080 With 83886080 items, utilization is 62.50%, number of hash functions: 2. Time taken per op is 0.107us (9.4 Mqps) with batch size of 10, # of found keys 83886080 With 83886080 items, utilization is 62.50%, number of hash functions: 2. Time taken per op is 0.099us (10.1 Mqps) with batch size of 25, # of found keys 83886080 With 83886080 items, utilization is 62.50%, number of hash functions: 2. Time taken per op is 0.100us (10.0 Mqps) with batch size of 50, # of found keys 83886080 With 83886080 items, utilization is 62.50%, number of hash functions: 2. Time taken per op is 0.116us (8.6 Mqps) with batch size of 100, # of found keys 83886080 With 73400320 items, utilization is 54.69%, number of hash functions: 2. Time taken per op is 0.189us (5.3 Mqps) with batch size of 0, # of found keys 73400320 With 73400320 items, utilization is 54.69%, number of hash functions: 2. Time taken per op is 0.095us (10.5 Mqps) with batch size of 10, # of found keys 73400320 With 73400320 items, utilization is 54.69%, number of hash functions: 2. Time taken per op is 0.096us (10.4 Mqps) with batch size of 25, # of found keys 73400320 With 73400320 items, utilization is 54.69%, number of hash functions: 2. 
Time taken per op is 0.098us (10.2 Mqps) with batch size of 50, # of found keys 73400320 With 73400320 items, utilization is 54.69%, number of hash functions: 2. Time taken per op is 0.105us (9.5 Mqps) with batch size of 100, # of found keys 73400320 [ljin@dev1964 rocksdb] ./cuckoo_table_reader_test --enable_perf --file_dir=/dev/shm --write --identity_as_first_hash=1 ==== Test CuckooReaderTest.WhenKeyExists ==== Test CuckooReaderTest.WhenKeyExistsWithUint64Comparator ==== Test CuckooReaderTest.CheckIterator ==== Test CuckooReaderTest.CheckIteratorUint64 ==== Test CuckooReaderTest.WhenKeyNotFound ==== Test CuckooReaderTest.TestReadPerformance With 125829120 items, utilization is 93.75%, number of hash functions: 2. Time taken per op is 0.230us (4.3 Mqps) with batch size of 0, # of found keys 125829120 With 125829120 items, utilization is 93.75%, number of hash functions: 2. Time taken per op is 0.086us (11.7 Mqps) with batch size of 10, # of found keys 125829120 With 125829120 items, utilization is 93.75%, number of hash functions: 2. Time taken per op is 0.088us (11.3 Mqps) with batch size of 25, # of found keys 125829120 With 125829120 items, utilization is 93.75%, number of hash functions: 2. Time taken per op is 0.083us (12.1 Mqps) with batch size of 50, # of found keys 125829120 With 125829120 items, utilization is 93.75%, number of hash functions: 2. Time taken per op is 0.083us (12.1 Mqps) with batch size of 100, # of found keys 125829120 With 104857600 items, utilization is 78.12%, number of hash functions: 2. Time taken per op is 0.159us (6.3 Mqps) with batch size of 0, # of found keys 104857600 With 104857600 items, utilization is 78.12%, number of hash functions: 2. Time taken per op is 0.078us (12.8 Mqps) with batch size of 10, # of found keys 104857600 With 104857600 items, utilization is 78.12%, number of hash functions: 2. 
Time taken per op is 0.080us (12.6 Mqps) with batch size of 25, # of found keys 104857600 With 104857600 items, utilization is 78.12%, number of hash functions: 2. Time taken per op is 0.080us (12.5 Mqps) with batch size of 50, # of found keys 104857600 With 104857600 items, utilization is 78.12%, number of hash functions: 2. Time taken per op is 0.082us (12.2 Mqps) with batch size of 100, # of found keys 104857600 With 83886080 items, utilization is 62.50%, number of hash functions: 2. Time taken per op is 0.154us (6.5 Mqps) with batch size of 0, # of found keys 83886080 With 83886080 items, utilization is 62.50%, number of hash functions: 2. Time taken per op is 0.077us (13.0 Mqps) with batch size of 10, # of found keys 83886080 With 83886080 items, utilization is 62.50%, number of hash functions: 2. Time taken per op is 0.077us (12.9 Mqps) with batch size of 25, # of found keys 83886080 With 83886080 items, utilization is 62.50%, number of hash functions: 2. Time taken per op is 0.078us (12.8 Mqps) with batch size of 50, # of found keys 83886080 With 83886080 items, utilization is 62.50%, number of hash functions: 2. Time taken per op is 0.079us (12.6 Mqps) with batch size of 100, # of found keys 83886080 With 73400320 items, utilization is 54.69%, number of hash functions: 2. Time taken per op is 0.218us (4.6 Mqps) with batch size of 0, # of found keys 73400320 With 73400320 items, utilization is 54.69%, number of hash functions: 2. Time taken per op is 0.083us (12.0 Mqps) with batch size of 10, # of found keys 73400320 With 73400320 items, utilization is 54.69%, number of hash functions: 2. Time taken per op is 0.085us (11.7 Mqps) with batch size of 25, # of found keys 73400320 With 73400320 items, utilization is 54.69%, number of hash functions: 2. Time taken per op is 0.086us (11.6 Mqps) with batch size of 50, # of found keys 73400320 With 73400320 items, utilization is 54.69%, number of hash functions: 2. 
Time taken per op is 0.078us (12.8 Mqps) with batch size of 100, # of found keys 73400320 ``` Reviewers: sdong, igor, yhchiang Reviewed By: igor Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D23451	2014-09-18 11:00:48 -07:00
Igor Canadi	2fb1fea30f	Fix synchronization issues	2014-09-18 10:42:54 -07:00
Lei Jin	a062e1f2c4	SetOptions() for memtable related options Summary: as title Test Plan: make all check I will think a way to set up stress test for this Reviewers: sdong, yhchiang, igor Reviewed By: igor Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D23055	2014-09-17 12:49:13 -07:00
Igor Canadi	60a4aa175e	Test use_mmap_reads Summary: We currently don't test mmap reads as part of db_test. Piggyback it on kWalDir test config. Test Plan: make check Reviewers: ljin, sdong, yhchiang Reviewed By: yhchiang Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D23337	2014-09-17 12:31:53 -07:00
Igor Canadi	4a27a2f193	Don't sync manifest when disableDataSync = true Summary: As we discussed offline Test Plan: compiles Reviewers: yhchiang, sdong, ljin, dhruba Reviewed By: sdong Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D22989	2014-09-15 11:32:01 -07:00
Igor Canadi	04ce1b25f3	Fix #284	2014-09-13 14:14:10 -07:00
Igor Canadi	dee91c259d	WriteThread Summary: This diff just moves the write thread control out of the DBImpl. I will need this as I will control column family data concurrency by only accessing some data in the write thread. That way, we won't have to lock our accesses to column family hash table (mappings from IDs to CFDs). Test Plan: make check Reviewers: sdong, yhchiang, ljin Reviewed By: ljin Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D23301	2014-09-12 16:23:58 -07:00
Igor Canadi	540a257f2c	Fix WAL synced Summary: Uhm... Test Plan: nope Reviewers: sdong, yhchiang, tnovak, ljin Reviewed By: ljin Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D23343	2014-09-12 16:15:29 -07:00
Chilledheart	49fe329e5e	Fix build issue under macosx	2014-09-13 05:05:22 +08:00
Feng Zhu	0352a9fa91	add_wrapped_bloom_test Summary: 1. wrap a filter policy like what fbcode/multifeed/rocksdb/MultifeedRocksDbKey.h to ensure that rocksdb works fine after filterpolicy interface change Test Plan: 1. valgrind ./bloom_test Reviewers: ljin, igor, yhchiang, dhruba, sdong Reviewed By: sdong Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D23229	2014-09-11 16:33:46 -07:00
Igor Canadi	9c0e66ce98	Don't run background jobs (flush, compactions) when bg_error_ is set Summary: If bg_error_ is set, that means that we mark DB read only. However, current behavior still continues the flushes and compactions, even though bg_error_ is set. On the other hand, if bg_error_ is set, we will return Status::OK() from CompactRange(), although the compaction didn't actually succeed. This is clearly not desired behavior. I found this when I was debugging t5132159, although I'm pretty sure these aren't related. Also, when we're shutting down, it's dangerous to exit RunManualCompaction(), since that will destruct ManualCompaction object. Background compaction job might still hold a reference to manual_compaction_ and this will lead to undefined behavior. I changed the behavior so that we only exit RunManualCompaction when manual compaction job is marked done. Test Plan: make check Reviewers: sdong, ljin, yhchiang Reviewed By: yhchiang Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D23223	2014-09-11 16:24:16 -07:00
Igor Canadi	a9639bda84	Fix valgrind test Summary: Get valgrind to stop complaining about uninitialized value Test Plan: valgrind not complaining anymore Reviewers: sdong, yhchiang, ljin Reviewed By: ljin Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D23289	2014-09-11 15:36:30 -07:00
Igor Canadi	d1f24dc7ee	Relax FlushSchedule test Summary: The test makes sure that we don't call flush too often. For that, it's ok to check if we have less than 10 table files. Otherwise, the test is flaky because it's hard to estimate number of entries in the memtable before it gets flushed (any ideas?) Test Plan: Still works, but hopefully less flaky. Reviewers: ljin, sdong, yhchiang Reviewed by: yhchiang Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D23241	2014-09-11 11:00:45 -07:00
Igor Canadi	3d9e6f7759	Push model for flushing memtables Summary: When memtable is full it calls the registered callback. That callback then registers column family as needing the flush. Every write checks if there are some column families that need to be flushed. This completely eliminates the need for MakeRoomForWrite() function and simplifies our Write code-path. There is some complexity with the concurrency when the column family is dropped. I made it a bit less complex by dropping the column family from the write thread in https://reviews.facebook.net/D22965. Let me know if you want to discuss this. Test Plan: make check works. I'll also run db_stress with creating and dropping column families for a while. Reviewers: yhchiang, sdong, ljin Reviewed By: ljin Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D23067	2014-09-10 18:46:09 -07:00
Igor Canadi	059e584dd3	[unit test] CompactRange should fail if we don't have space Summary: See t5106397. Also, few more changes: 1. in unit tests, the assumption is that writes will be dropped when there is no space left on device. I changed the wording around it. 2. InvalidArgument() errors are only when user-provided arguments are invalid. When the file is corrupted, we need to return Status::Corruption Test Plan: make check Reviewers: sdong, ljin Reviewed By: ljin Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D23145	2014-09-10 17:00:00 -07:00
Igor Canadi	a52cecb56c	Fix Mac compile	2014-09-09 18:42:35 -07:00
Jonah Cohen	092f97e219	Fix comments and typos Summary: Correct some comments and typos in RocksDB. Test Plan: Inspection Reviewers: sdong, igor Reviewed By: igor Differential Revision: https://reviews.facebook.net/D23133	2014-09-09 15:20:49 -07:00
Igor Canadi	0a42295a24	Fix SimpleWriteTimeoutTest Summary: In column family's SanitizeOptions() [1], we make sure that min_write_buffer_number_to_merge is normal value. However, this test depended on the fact that setting min_write_buffer_number_to_merge to be bigger than max_write_buffer_number will cause a deadlock. I'm not sure how it worked before. This diff fixes it by scheduling sleeping background task, which will actually block any attempts of flushing. [1] https://github.com/facebook/rocksdb/blob/master/db/column_family.cc#L104 Test Plan: the test works now Reviewers: yhchiang, sdong, ljin Reviewed By: ljin Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D23103	2014-09-09 11:50:05 -07:00
sdong	06d986252a	Always pass MergeContext as pointer, not reference Summary: To follow the coding convention and make sure when passing reference as a parameter it is also const, pass MergeContext as a pointer to mem tables. Test Plan: make all check Reviewers: ljin, igor Reviewed By: igor Subscribers: leveldb, dhruba, yhchiang Differential Revision: https://reviews.facebook.net/D23085	2014-09-09 11:37:32 -07:00
Stanislau Hlebik	d343c3fe46	Improve db recovery Summary: Avoid creating unnecessary sst files while db opening Test Plan: make all check Reviewers: sdong, igor Reviewed By: igor Subscribers: zagfox, yhchiang, ljin, leveldb Differential Revision: https://reviews.facebook.net/D20661	2014-09-09 11:18:50 -07:00
Lei Jin	52311463e9	MemTableOptions Summary: removed reference to options in WriteBatch and DBImpl::Get() Test Plan: make all check Reviewers: yhchiang, igor, sdong Reviewed By: sdong Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D23049	2014-09-08 18:46:52 -07:00
Lei Jin	171d4ff4a2	remove TailingIterator reference in db_impl.h Summary: as title Test Plan: make release Reviewers: igor Differential Revision: https://reviews.facebook.net/D23073	2014-09-08 15:39:53 -07:00
Lei Jin	9b0f7ffa1c	rename version_set options_ to db_options_ to avoid confusion Summary: as title Test Plan: make release Reviewers: sdong, yhchiang, igor Reviewed By: igor Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D23007	2014-09-08 15:25:01 -07:00
Igor Canadi	2d57828d0e	Check stop level trigger-0 before slowdown level-0 trigger Summary: ... Test Plan: Can't repro the test failure, but let's see what jenkins says Reviewers: zagfox, sdong, ljin Reviewed By: sdong, ljin Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D23061	2014-09-08 15:23:58 -07:00
Lei Jin	659d2d50c3	move compaction_filter to immutable_options Summary: all shared_ptrs are in immutable_options now. This will also make options assignment a little cheaper Test Plan: make release Reviewers: sdong, yhchiang, igor Reviewed By: igor Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D23001	2014-09-08 15:09:25 -07:00
Lei Jin	048560a642	reduce references to cfd->options() in DBImpl Summary: I found it is almost impossible to get rid of this function in a single batch. I will take a step by step approach Test Plan: make release Reviewers: sdong, yhchiang, igor Reviewed By: igor Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D22995	2014-09-08 15:04:34 -07:00
sdong	011241bb99	DB::Flush() Do not wait for background threads when there is nothing in mem table Summary: When we have multiple column families, users can issue Flush() on every column family to make sure everything is flushed, even if some of them might be empty. By skipping the wait for empty cases, Flush() can be greatly sped up. Still waiting for people's comments before writing unit tests for it. Test Plan: Will write a unit test to make sure it is correct. Reviewers: ljin, yhchiang, igor Reviewed By: igor Subscribers: leveldb, dhruba Differential Revision: https://reviews.facebook.net/D22953	2014-09-08 13:40:42 -07:00
Igor Canadi	a2bb7c3c33	Push- instead of pull-model for managing Write stalls Summary: Introducing WriteController, which is a source of truth about per-DB write delays. Let's define a DB epoch as a period where there are no flushes and compactions (i.e. new epoch is started when flush or compaction finishes). Each epoch can either: * proceed with all writes without delay * delay all writes by fixed time * stop all writes The three modes are recomputed at each epoch change (flush, compaction), rather than on every write (which is currently the case). When we have a lot of column families, our current pull behavior adds a big overhead, since we need to loop over every column family for every write. With new push model, overhead on Write code-path is minimal. This is just the start. Next step is to also take care of stalls introduced by slow memtable flushes. The final goal is to eliminate function MakeRoomForWrite(), which currently needs to be called for every column family by every write. Test Plan: make check for now. I'll add some unit tests later. Also, perf test. Reviewers: dhruba, yhchiang, MarkCallaghan, sdong, ljin Reviewed By: ljin Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D22791	2014-09-08 11:20:25 -07:00
Feng Zhu	0af157f9bf	Implement full filter for block based table. Summary: 1. Make filter_block.h a base class. Derive block_based_filter_block and full_filter_block. The previous one is the traditional filter block. The full_filter_block is newly added. It would generate a filter block that contains all the keys in SST file. 2. When querying a key, table would first check if full_filter is available. If not, it would go to the exact data block and check using block_based filter. 3. User could choose to use full_filter or traditional(block_based_filter). They would be stored in SST file with different meta index name. "filter.filter_policy" or "full_filter.filter_policy". Then, Table reader is able to know the filter block type. 4. Some optimizations have been done for full_filter_block, thus it requires a different interface compared to the original one in filter_policy.h. 5. Actual implementation of filter bits coding/decoding is placed in util/bloom_impl.cc Benchmark: base commit `1d23b5c470` Command: db_bench --db=/dev/shm/rocksdb --num_levels=6 --key_size=20 --prefix_size=20 --keys_per_prefix=0 --value_size=100 --write_buffer_size=134217728 --max_write_buffer_number=2 --target_file_size_base=33554432 --max_bytes_for_level_base=1073741824 --verify_checksum=false --max_background_compactions=4 --use_plain_table=0 --memtablerep=prefix_hash --open_files=-1 --mmap_read=1 --mmap_write=0 --bloom_bits=10 --bloom_locality=1 --memtable_bloom_bits=500000 --compression_type=lz4 --num=393216000 --use_hash_search=1 --block_size=1024 --block_restart_interval=16 --use_existing_db=1 --threads=1 --benchmarks=readrandom --disable_auto_compactions=1 Read QPS increases by about 30%, from 2230002 to 2991411. Test Plan: make all check valgrind db_test db_stress --use_block_based_filter = 0 ./auto_sanity_test.sh Reviewers: igor, yhchiang, ljin, sdong Reviewed By: sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D20979	2014-09-08 10:37:05 -07:00
Igor Canadi	9360cc690e	Fix valgrind issue	2014-09-08 08:01:25 -07:00
Igor Canadi	9f1c80b556	Drop column family from write thread Summary: If we drop column family only from (single) write thread, we can be sure that nobody will drop the column family while we're writing (and our mutex is released). This greatly simplifies my patch that's getting rid of MakeRoomForWrite(). Test Plan: make check, but also running stress test Reviewers: ljin, sdong Reviewed By: sdong Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D22965	2014-09-05 15:20:05 -07:00
Igor Canadi	8de151bb99	Add db_bench with lots of column families to regression tests Summary: That way we can see when this graph goes up and be happy. Couple of changes: 1. title 2. fix db_bench to delete column families before deleting the DB. this was asserting when compiled in debug mode 3. don't sync manifest when disableDataSync. We discussed this offline. I can move it to separate diff if you'd like Test Plan: ran it Reviewers: sdong, yhchiang, ljin Reviewed By: ljin Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D22815	2014-09-05 14:20:18 -07:00
Lei Jin	c9e419ccb6	rename options_ to db_options_ in DBImpl to avoid confusion Summary: as title Test Plan: make release Reviewers: sdong, igor Reviewed By: igor Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D22935	2014-09-05 11:48:17 -07:00
Radheshyam Balasundaram	5cd0576ffe	Fix compaction bug in Cuckoo Table Builder. Use kvs_.size() instead of num_entries in FileSize() method. Summary: Fix compaction bug in Cuckoo Table Builder. Use kvs_.size() instead of num_entries in FileSize() method. Also added tests. Test Plan: make check all Also ran db_bench to generate multiple files. Reviewers: sdong, ljin Reviewed By: ljin Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D22743	2014-09-05 11:18:01 -07:00
Raghav Pisolkar	0fbb3facc0	fixed memory leak in unit test DBIteratorBoundTest Summary: fixed memory leak in unit test DBIteratorBoundTest Test Plan: ran valgrind test on my unit test Reviewers: sdong Differential Revision: https://reviews.facebook.net/D22911	2014-09-05 10:35:28 -07:00
Lei Jin	adcd2532ca	fix asan check Summary: PlainTable takes reference instead of a copy. Keep a copy in the test code Test Plan: make asan_check Reviewers: sdong, igor Reviewed By: igor Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D22899	2014-09-05 09:53:04 -07:00
liuhuahang	bb6ae0f80c	fix more compile warnings N/A Change-Id: I5b6f9c70aea7d3f3489328834fed323d41106d9f Signed-off-by: liuhuahang <liuhuahang@zerus.co>	2014-09-05 14:14:37 +08:00
Nik Bougalis	4329d74e05	Fix swapped variable names to accurately reflect usage	2014-09-04 20:09:45 -07:00
Stanislau Hlebik	45a5e3ede0	Remove path with arena==nullptr from NewInternalIterator Summary: Simplify code by removing code path which does not use Arena from NewInternalIterator Test Plan: make all check make valgrind_check Reviewers: sdong Reviewed By: sdong Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D22395	2014-09-04 17:40:41 -07:00
Lei Jin	5665e5e285	introduce ImmutableOptions Summary: As a preparation to support updating some options dynamically, I'd like to first introduce ImmutableOptions, which is a subset of Options that cannot be changed during the course of a DB lifetime without restart. ColumnFamily will keep both Options and ImmutableOptions. Any component below ColumnFamily should only take ImmutableOptions in their constructor. Other options should be taken from APIs, which will be allowed to adjust dynamically. I am yet to make changes to memtable and other related classes to take ImmutableOptions in their ctor. That can be done in a separate diff as this one is already pretty big. Test Plan: make all check Reviewers: yhchiang, igor, sdong Reviewed By: sdong Subscribers: leveldb, dhruba Differential Revision: https://reviews.facebook.net/D22545	2014-09-04 16:18:36 -07:00
Raghav Pisolkar	e0b99d4f5d	created a new ReadOptions parameter 'iterate_upper_bound'	2014-09-04 11:00:16 -07:00
liuhuahang	ef5b384729	fix a few compile warnings 1, const qualifiers on return types make no sense and will trigger a compile warning: warning: type qualifiers ignored on function return type [-Wignored-qualifiers] 2, class HistogramImpl has virtual functions and thus should have a virtual destructor 3, with some toolchain, the macro __STDC_FORMAT_MACROS is predefined and thus should be checked before define Change-Id: I69747a03bfae88671bfbb2637c80d17600159c99 Signed-off-by: liuhuahang <liuhuahang@zerus.co>	2014-09-04 23:06:23 +08:00
Lei Jin	9b58c73c7c	call SanitizeDBOptionsByCFOptions() in the right place Summary: It only covers Open() with default column family right now Test Plan: make release Reviewers: igor, yhchiang, sdong Reviewed By: sdong Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D22467	2014-09-02 14:42:23 -07:00
Igor Canadi	a84234a61b	Ignore missing column families Summary: Before this diff, whenever we Write to non-existing column family, Write() would fail. This diff adds an option to not fail a Write() when WriteBatch points to non-existing column family. MongoDB said this would be useful for them, since they might have a transaction updating an index that was dropped by another thread. This way, they don't have to worry about checking if all indexes are alive on every write. They don't care if they lose writes to dropped index. Test Plan: added a small unit test Reviewers: sdong, yhchiang, ljin Reviewed By: ljin Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D22143	2014-09-02 13:29:05 -07:00
Igor Canadi	7f19bb93c6	Merge pull request #242 from tdfischer/perf-timer-destructors Refactor PerfStepTimer to automatically stop on destruct	2014-09-02 13:06:40 -07:00
Feng Zhu	8438a19360	fix dropping column family bug Summary: 1. db/db_impl.cc:2324 (DBImpl::BackgroundCompaction) should not raise bg_error_ when column family is dropped during compaction. Test Plan: 1. db_stress Reviewers: ljin, yhchiang, dhruba, igor, sdong Reviewed By: igor Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D22653	2014-09-02 12:25:58 -07:00
Torrie Fischer	6614a48418	Refactor PerfStepTimer to stop on destruct This eliminates the need to remember to call PERF_TIMER_STOP when a section has been timed. This allows more useful design with the perf timers and enables possible return value optimizations. Simplistic example: class Foo { public: Foo(int v) : m_v(v) {} private: int m_v; }; Foo makeFrobbedFoo(int *err) { *err = 0; return Foo(0); } Foo bar(int *err) { PERF_TIMER_GUARD(some_timer); return makeFrobbedFoo(err); } int main(int argc, char *argv[]) { int err; Foo f = bar(&err); if (err) return -1; return 0; } After bar() is called, perf_context.some_timer would be incremented as if Stop(&perf_context.some_timer) was called at the end, and the compiler is still able to produce optimizations on the return value from makeFrobbedFoo() through to main().	2014-09-02 12:04:22 -07:00
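The stop-on-destruct idea in 6614a48418 is ordinary RAII. Below is a minimal sketch of that pattern, not the PerfStepTimer implementation: `PerfCounters`, `ScopedTimer`, and `TimedMakeFoo` are hypothetical names, and the stop counter exists only so the behavior can be observed.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical stand-in for a perf context; not the RocksDB type.
struct PerfCounters {
  uint64_t some_timer_nanos = 0;  // accumulated elapsed time
  uint64_t some_timer_count = 0;  // number of completed timed sections
};

// Starts timing on construction and records the elapsed time on destruction,
// so no explicit "stop" call is needed on any return path.
class ScopedTimer {
 public:
  explicit ScopedTimer(PerfCounters* counters)
      : counters_(counters), start_(std::chrono::steady_clock::now()) {}

  ~ScopedTimer() {
    const auto elapsed = std::chrono::steady_clock::now() - start_;
    counters_->some_timer_nanos += static_cast<uint64_t>(
        std::chrono::duration_cast<std::chrono::nanoseconds>(elapsed).count());
    ++counters_->some_timer_count;
  }

  ScopedTimer(const ScopedTimer&) = delete;
  ScopedTimer& operator=(const ScopedTimer&) = delete;

 private:
  PerfCounters* counters_;
  std::chrono::steady_clock::time_point start_;
};

struct Foo {
  int v;
};

Foo MakeFoo(int v) { return Foo{v}; }

Foo TimedMakeFoo(PerfCounters* counters, int v) {
  ScopedTimer guard(counters);
  // The guard's destructor runs after the return value has been constructed,
  // so the function can return the callee's result directly.
  return MakeFoo(v);
}
```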
Igor Canadi	990df99a61	Fix ios compile Summary: We need to set contbuild for this :) Test Plan: compiles Reviewers: sdong, yhchiang, ljin Reviewed By: ljin Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D22701	2014-09-02 10:50:15 -07:00
Igor Canadi	7dcadb1d37	Don't let flush preempt compaction in certain cases Summary: I have an application configured with 16 background threads. Write rates are high. L0->L1 compaction is very slow and it limits the concurrency of the system. While it's happening, the other 15 threads are idle. However, when there is a need for a flush, the one thread busy with L0->L1 does the flush, instead of any of the other 15 threads that are just sitting there. This diff prevents that. If there are threads that are idle, we don't let flush preempt compaction. Test Plan: Will run stress test Reviewers: ljin, sdong, yhchiang Reviewed By: sdong, yhchiang Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D22299	2014-09-02 08:34:54 -07:00
Nik Bougalis	f09329cb01	Fix candidate file comparison when using path ids	2014-08-31 00:54:15 -07:00
Tomislav Novak	0f9c43ea36	ForwardIterator: reset incomplete iterators on Seek() Summary: When reading from kBlockCacheTier, ForwardIterator's internal child iterators may end up in the incomplete state (read was unable to complete without doing disk I/O). `ForwardIterator::status()` will correctly report that; however, the iterator may be stuck in that state until all sub-iterators are rebuilt: * `NeedToSeekImmutable()` may return false even if some sub-iterators are incomplete * one of the child iterators may be an empty iterator without any state other than the kIncomplete status (created using `NewErrorIterator()`); seeking on any such iterator has no effect -- we need to construct it again Akin to rebuilding iterators after a superversion bump, this diff makes forward iterator reset all incomplete child iterators when `Seek()` or `Next()` are called. Test Plan: TEST_TMPDIR=/dev/shm/rocksdbtest ROCKSDB_TESTS=TailingIterator ./db_test Reviewers: igor, sdong, ljin Reviewed By: ljin Subscribers: lovro, march, leveldb Differential Revision: https://reviews.facebook.net/D22575	2014-08-29 16:21:29 -07:00
Lei Jin	722d80c374	reduce recordTick overhead in compaction loop Summary: It is too expensive to bump the ticker for every key/value pair Test Plan: make release Reviewers: sdong, yhchiang, igor Reviewed By: igor Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D22527	2014-08-29 09:51:09 -07:00
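One common way to cut per-key counter overhead in a hot loop is to count locally and publish in batches. The sketch below shows that general technique only; the commit message does not say how 722d80c374 does it, and `Ticker`, `ProcessKeys`, and the `batch` parameter are hypothetical.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical shared statistics counter standing in for a ticker.
struct Ticker {
  std::atomic<uint64_t> value{0};
  std::atomic<uint64_t> updates{0};  // how often the shared counter was touched

  void Record(uint64_t n) {
    value.fetch_add(n, std::memory_order_relaxed);
    updates.fetch_add(1, std::memory_order_relaxed);
  }
};

// Processes num_keys entries, touching the shared ticker once per `batch`
// keys instead of once per key. Any remainder is published at the end.
void ProcessKeys(uint64_t num_keys, uint64_t batch, Ticker* ticker) {
  uint64_t pending = 0;
  for (uint64_t i = 0; i < num_keys; ++i) {
    // ... per-key work would go here ...
    if (++pending == batch) {
      ticker->Record(pending);
      pending = 0;
    }
  }
  if (pending > 0) ticker->Record(pending);
}
```

The total stays exact; only the granularity at which other threads observe it changes.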
Igor Canadi	0c26e76b28	Merge pull request #237 from tdfischer/tdfischer/faster-timeout-test test: db: fix test to have a smaller timeout for when it runs on faster ...	2014-08-28 20:40:10 -04:00
Feng Zhu	1d23b5c470	remove_internal_filter_policy Summary: 1. remove class InternalFilterPolicy in db/dbformat.h 2. Transformation from internal key to user key is done in filter_block.cc 3. This is a preparation for patch D20979 Test Plan: make all check valgrind ./db_test Reviewers: igor, yhchiang, ljin, sdong Reviewed By: sdong Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D22509	2014-08-28 17:06:29 -07:00
Igor Canadi	d977e55596	Don't let other compactions run when manual compaction runs Summary: Based on discussions from t4982833. This is just a short-term fix, I plan to revamp manual compaction process as part of t4982812. Also, I think we should schedule automatic compactions at the very end of manual compactions, not when we're done with one level. I made that change as part of this diff. Let me know if you disagree. Test Plan: make check for now Reviewers: sdong, tnovak, yhchiang, ljin Reviewed By: yhchiang Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D22401	2014-08-28 13:06:28 -04:00
Igor Canadi	d5bd6c772b	Fix ios compile Summary: No __thread for ios. Test Plan: compile works for ios now Reviewers: ljin, dhruba Reviewed By: dhruba Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D22491	2014-08-28 12:46:05 -04:00
Radheshyam Balasundaram	4142a3e783	Adding a user comparator for comparing Uint64 slices. Summary: - New Uint64 comparator - Modify Reader and Builder to take custom user comparators instead of bytewise comparator - Modify logic for choosing unused user key in builder - Modify iterator logic in reader - test changes Test Plan: cuckoo_table_{builder,reader,db}_test make check all Reviewers: ljin, sdong Reviewed By: ljin Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D22377	2014-08-27 10:39:31 -07:00
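To show why a numeric comparator can differ from a bytewise one, here is a small sketch that compares 8-byte values as integers. It assumes fixed-width keys stored in the machine's native byte order; that encoding, and the names `EncodeUint64` and `CompareUint64`, are assumptions of this sketch rather than details taken from 4142a3e783.

```cpp
#include <cstdint>
#include <cstring>
#include <string>

// Stores v as 8 raw bytes in native byte order (assumption of this sketch).
std::string EncodeUint64(uint64_t v) {
  std::string s(sizeof(v), '\0');
  std::memcpy(&s[0], &v, sizeof(v));
  return s;
}

// Three-way comparison of two 8-byte keys by numeric value.
// Returns -1, 0, or 1. Both inputs must be exactly 8 bytes long.
int CompareUint64(const std::string& a, const std::string& b) {
  uint64_t x = 0;
  uint64_t y = 0;
  std::memcpy(&x, a.data(), sizeof(x));
  std::memcpy(&y, b.data(), sizeof(y));
  if (x < y) return -1;
  if (x > y) return 1;
  return 0;
}
```

On a little-endian machine a bytewise comparison would order 256 before 1, because the low-order byte comes first; the numeric comparison does not have that problem.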
Radheshyam Balasundaram	b6fd7811eb	Don't do memtable lookup in db_impl_readonly if memtables are empty while opening db. Summary: In DBImpl::Recover method, while loading memtables, also check if memtables are empty. Use this in DBImplReadonly to determine whether to lookup memtable or not. Test Plan: db_test make check all Reviewers: sdong, yhchiang, ljin, igor Reviewed By: ljin Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D22281	2014-08-26 17:19:03 -07:00
Stanislau Hlebik	9dcb75b6d9	Add is-file-deletions-enabled property Summary: Add property 'rocksdb.is-file-deletions-enabled' which equals disable_delete_obsolete_files_ Test Plan: make all check Reviewers: sdong Reviewed By: sdong Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D22119	2014-08-26 16:26:29 -07:00
Lei Jin	1755581f19	improve OptimizeForPointLookup() Summary: also fix HISTORY.md Test Plan: make all check Reviewers: sdong, yhchiang, igor Reviewed By: igor Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D22437	2014-08-26 14:15:00 -07:00
Lei Jin	bda6f3363d	fix valgrind error in c_test caused by BlockBasedTableOptions Summary: It was creating BlockBasedTableOptions object in a loop without calling destroy() Test Plan: valgrind ./c_test --leak-check=full --show-reachable=yes Reviewers: sdong, igor Reviewed By: igor Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D22431	2014-08-26 09:57:25 -07:00
Torrie Fischer	0db6b028e7	Update timeout to 50ms instead of 3.	2014-08-26 09:38:45 -07:00
Lei Jin	23861857c4	ReadOptions.total_order_seek to allow total order seek for block-based table when hash index is enabled Summary: as title Test Plan: table_test Reviewers: igor, yhchiang, sdong Reviewed By: sdong Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D22239	2014-08-25 16:14:30 -07:00
Lei Jin	a98badff16	print table options Summary: Add a virtual function in table factory that will print table options Test Plan: make release Reviewers: igor, yhchiang, sdong Reviewed By: sdong Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D22149	2014-08-25 14:24:09 -07:00
Lei Jin	384400128f	move block based table related options to BlockBasedTableOptions Summary: I will move compression related options in a separate diff since this diff is already pretty lengthy. I guess I will also need to change JNI accordingly :( Test Plan: make all check Reviewers: yhchiang, igor, sdong Reviewed By: igor Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D21915	2014-08-25 14:22:05 -07:00
Igor Canadi	42ea795209	Fix concurrency issue in CompactionPicker Summary: I am currently working on a project that uses RocksDB. While debugging some perf issues, I came across an interesting compaction concurrency issue. Namely, I had 15 idle threads and a good compaction to do, but CompactionPicker returned "Compaction nothing to do". Here's how Internal stats looked: 2014/08/22-08:08:04.551982 7fc7fc3f5700 ------- DUMPING STATS ------- 2014/08/22-08:08:04.552000 7fc7fc3f5700 Compaction Stats [default] Level Files Size(MB) Score Read(GB) Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) RW-Amp W-Amp Rd(MB/s) Wr(MB/s) Rn(cnt) Rnp1(cnt) Wnp1(cnt) Wnew(cnt) Comp(sec) Comp(cnt) Avg(sec) Stall(sec) Stall(cnt) Avg(ms) ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ L0 7/5 353 1.0 0.0 0.0 0.0 2.3 2.3 0.0 0.0 0.0 9.4 0 0 0 0 247 46 5.359 8.53 1 8526.25 L1 2/2 86 1.3 2.6 1.9 0.7 2.6 1.9 2.7 1.3 24.3 24.0 39 19 71 52 109 11 9.938 0.00 0 0.00 L2 26/0 833 1.3 5.7 1.7 4.0 5.2 1.2 6.3 3.0 15.6 14.2 47 112 147 35 373 44 8.468 0.00 0 0.00 L3 12/0 505 0.1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 0 0 0 0 0.000 0.00 0 0.00 Sum 47/7 1778 0.0 8.3 3.6 4.6 10.0 5.4 8.1 4.4 11.6 14.1 86 131 218 87 728 101 7.212 8.53 1 8526.25 Int 0/0 0 0.0 2.4 0.8 1.6 2.7 1.2 11.5 6.1 12.0 13.6 20 43 63 20 203 23 8.845 0.00 0 0.00 Flush(GB): accumulative 2.266, interval 0.444 Stalls(secs): 0.000 level0_slowdown, 0.000 level0_numfiles, 8.526 memtable_compaction, 0.000 leveln_slowdown_soft, 0.000 leveln_slowdown_hard Stalls(count): 0 level0_slowdown, 0 level0_numfiles, 1 memtable_compaction, 0 leveln_slowdown_soft, 0 leveln_slowdown_hard DB Stats Uptime(secs): 336.8 total, 60.4 interval Cumulative writes: 61584000 writes, 6480589 batches, 9.5 writes per batch, 1.39 GB user ingest Cumulative WAL: 0 writes, 0 syncs, 0.00 writes per sync, 0.00 GB written Interval writes: 11235257 writes, 1175050 batches, 9.6 writes per batch, 259.9 MB user ingest Interval WAL: 0 writes, 0 syncs, 0.00 writes per sync, 0.00 MB written To see what happened, go here: `47b452cfcf/db/compaction_picker.cc (L430)` * The for loop started with level 1, because it has the worst score. * PickCompactionBySize on L429 returned nullptr because all files were being compacted * ExpandWhileOverlapping(c) returned true (because that's what it does when it gets nullptr!?) * for loop break-ed, never trying compactions for level 2 :( :( This bug was present at least since January. I have no idea how we didn't find this sooner. Test Plan: Unit testing compaction picker is hard. I tested this by running my service and observing L0->L1 and L2->L3 compactions in parallel. However, for long-term, I opened the task #4968469. @yhchiang is currently refactoring CompactionPicker, hopefully the new version will be unit-testable ;) Here's what my compactions look like after the patch: 2014/08/22-08:50:02.166699 7f3400ffb700 ------- DUMPING STATS ------- 2014/08/22-08:50:02.166722 7f3400ffb700 Compaction Stats [default] Level Files Size(MB) Score Read(GB) Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) RW-Amp W-Amp Rd(MB/s) Wr(MB/s) Rn(cnt) Rnp1(cnt) Wnp1(cnt) Wnew(cnt) Comp(sec) Comp(cnt) Avg(sec) Stall(sec) Stall(cnt) Avg(ms) ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ L0 8/5 404 1.5 0.0 0.0 0.0 4.3 4.3 0.0 0.0 0.0 9.6 0 0 0 0 463 88 5.260 0.00 0 0.00 L1 2/2 60 0.9 4.8 3.9 0.8 4.7 3.9 2.4 1.2 23.9 23.6 80 23 131 108 204 19 10.747 0.00 0 0.00 L2 23/3 697 1.0 11.6 3.5 8.1 10.9 2.8 6.4 3.1 17.7 16.6 95 242 317 75 669 92 7.268 0.00 0 0.00 L3 58/14 2207 0.3 6.2 1.6 4.6 5.9 1.3 7.4 3.6 14.6 13.9 43 121 159 38 436 36 12.106 0.00 0 0.00 Sum 91/24 3368 0.0 22.5 9.1 13.5 25.8 12.4 11.2 6.0 13.0 14.9 218 386 607 221 1772 235 7.538 0.00 0 0.00 Int 0/0 0 0.0 3.2 0.9 2.3 3.6 1.3 15.3 8.0 12.4 13.7 24 66 89 23 266 27 9.838 0.00 0 0.00 Flush(GB): accumulative 4.336, interval 0.444 Stalls(secs): 0.000 level0_slowdown, 0.000 level0_numfiles, 0.000 memtable_compaction, 0.000 leveln_slowdown_soft, 0.000 leveln_slowdown_hard Stalls(count): 0 level0_slowdown, 0 level0_numfiles, 0 memtable_compaction, 0 leveln_slowdown_soft, 0 leveln_slowdown_hard DB Stats Uptime(secs): 577.7 total, 60.1 interval Cumulative writes: 116960736 writes, 11966220 batches, 9.8 writes per batch, 2.64 GB user ingest Cumulative WAL: 0 writes, 0 syncs, 0.00 writes per sync, 0.00 GB written Interval writes: 11643735 writes, 1206136 batches, 9.7 writes per batch, 269.2 MB user ingest Interval WAL: 0 writes, 0 syncs, 0.00 writes per sync, 0.00 MB written Yay for concurrent L0->L1 and L2->L3 compactions! Reviewers: sdong, yhchiang, ljin Reviewed By: yhchiang Subscribers: yhchiang, leveldb Differential Revision: https://reviews.facebook.net/D22305	2014-08-22 11:32:40 -07:00
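The control-flow bug described in 42ea795209 (stopping at the first level that yields nothing instead of moving on to the next one) can be modeled in a few lines. This is a simplified sketch, not the CompactionPicker code: `LevelState`, `PickInLevel`, `PickCompaction`, and the 1.0 score threshold are assumptions made for illustration.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical per-level summary.
struct LevelState {
  int level;
  double score;         // higher means more in need of compaction
  bool all_files_busy;  // every file in the level is already being compacted
};

// Hypothetical stand-in for picking work within one level.
// Returns the level number, or -1 if nothing can be picked there.
int PickInLevel(const LevelState& s) { return s.all_files_busy ? -1 : s.level; }

// Visits levels from worst score to best and returns the first level with
// pickable work, or -1 if there is none.
int PickCompaction(std::vector<LevelState> levels) {
  std::stable_sort(levels.begin(), levels.end(),
                   [](const LevelState& a, const LevelState& b) {
                     return a.score > b.score;
                   });
  for (const LevelState& s : levels) {
    if (s.score < 1.0) break;  // remaining levels do not need compaction
    const int picked = PickInLevel(s);
    if (picked < 0) continue;  // nothing pickable here: try the next level
    return picked;
  }
  return -1;
}
```

With the scores from the first stats dump (L1 at 1.3 with both files busy, L2 at 1.3 with none busy), the `continue` lets the loop reach level 2; a `break` at that point would return nothing, which is the symptom the message reports.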
Yueh-Hsuan Chiang	47b452cfcf	Fix the error of c_test.c Summary: Fix the error of c_test.c Test Plan: make c_test ./c_test	2014-08-20 17:05:29 -07:00
Yueh-Hsuan Chiang	562b7a1f28	Add missing implementation of SanitizeDBOptions in simple_table_db_test.cc Summary: Add missing implementation of SanitizeDBOptions in simple_table_db_test.cc Test Plan: make simple_table_db_test.cc	2014-08-20 16:33:25 -07:00
Yueh-Hsuan Chiang	63a2215c63	Improve Options sanitization and add MmapReadRequired() to TableFactory Summary: Currently, PlainTable must use mmap_reads. When PlainTable is used but allow_mmap_reads is not set, rocksdb will fail in flush. This diff improves Options sanitization and adds MmapReadRequired() to TableFactory. Test Plan: export ROCKSDB_TESTS=PlainTableOptionsSanitizeTest make db_test -j32 ./db_test Reviewers: sdong, ljin Reviewed By: ljin Subscribers: you, leveldb Differential Revision: https://reviews.facebook.net/D21939	2014-08-20 15:53:39 -07:00
sdong	10720a5587	Revert the unintended change that DestroyDB() doesn't clean up info logs. Summary: A previous change triggered a change by mistake: DestroyDB() will keep info logs under DB directory. Revert the unintended change. Test Plan: Add a unit test case to verify it. Reviewers: ljin, yhchiang, igor Reviewed By: igor Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D22209	2014-08-20 12:07:32 -07:00
Torrie Fischer	7c5173d27f	test: db: fix test to have a smaller timeout for when it runs on faster hardware	2014-08-19 13:45:12 -07:00
Radheshyam Balasundaram	162b8151f1	Adding Column Family support in db_bench. Summary: Adding num_column_families flag. Adding support for column families in DoWrite and ReadRandom methods. [Igor, please let me know if this approach sounds good. I shall add it to other methods too.] Test Plan: Ran fillseq on 1M keys and 10 Column families and ran readrandom. Reviewers: sdong, yhchiang, igor, ljin Reviewed By: ljin Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D21387	2014-08-18 18:15:01 -07:00
sdong	28b5c76004	WriteBatchWithIndex: a wrapper of WriteBatch, with a searchable index Summary: Add WriteBatchWithIndex so that a user can query data out of a WriteBatch, to support MongoDB's read-its-own-write. WriteBatchWithIndex uses a skiplist to store the binary index. The index stores the offset of the entry in the write batch. When searching for a key, the key for the entry is read by reading the entry from the write batch at the offset. Define a new iterator class for querying data out of WriteBatchWithIndex. A user can create an iterator of the write batch for one column family, seek to a key and keep calling Next() to see next entries. I will add more unit tests if people are OK about this API. Test Plan: make all check Add unit tests. Reviewers: yhchiang, igor, MarkCallaghan, ljin Reviewed By: ljin Subscribers: dhruba, leveldb, xjin Differential Revision: https://reviews.facebook.net/D21381	2014-08-18 16:37:38 -07:00
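The offset-index idea in 28b5c76004 can be sketched without RocksDB: the index holds only positions into the batch, and keys are read back from the batch whenever two entries are compared. For brevity this sketch swaps the skiplist for a sorted `std::vector` of offsets and handles a single column family; `IndexedBatch`, `Record`, and `KeysFrom` are hypothetical names.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// One entry appended to the batch.
struct Record {
  std::string key;
  std::string value;
};

class IndexedBatch {
 public:
  void Put(const std::string& key, const std::string& value) {
    records_.push_back(Record{key, value});
    const std::size_t offset = records_.size() - 1;
    // Insert after existing entries for the same key so the newest is last.
    index_.insert(UpperBound(key), offset);
  }

  // Reads back the most recent write for key, if the batch contains one.
  bool Get(const std::string& key, std::string* value) const {
    auto pos = UpperBound(key);
    if (pos == index_.begin()) return false;
    const Record& r = records_[*(pos - 1)];
    if (r.key != key) return false;
    *value = r.value;
    return true;
  }

  // Seek-then-Next in miniature: keys of all entries >= target, in order.
  std::vector<std::string> KeysFrom(const std::string& target) const {
    auto pos = std::lower_bound(
        index_.begin(), index_.end(), target,
        [this](std::size_t off, const std::string& k) {
          return records_[off].key < k;
        });
    std::vector<std::string> out;
    for (; pos != index_.end(); ++pos) out.push_back(records_[*pos].key);
    return out;
  }

 private:
  // First index position whose key is strictly greater than `key`.
  std::vector<std::size_t>::const_iterator UpperBound(
      const std::string& key) const {
    return std::upper_bound(
        index_.begin(), index_.end(), key,
        [this](const std::string& k, std::size_t off) {
          return k < records_[off].key;
        });
  }

  std::vector<Record> records_;     // the batch, in write order
  std::vector<std::size_t> index_;  // offsets into records_, sorted by key
};
```

Insertion into a sorted vector is linear, which is the price of keeping the sketch short; a skiplist avoids that cost.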
Radheshyam Balasundaram	36e759d199	Adding Cuckoo Table SST option to db_bench Summary: Adding flags to use cuckoo table SST in db_bench.cc Test Plan: Ran benchmark with fillseq and readrandom Reviewers: sdong, ljin Reviewed By: ljin Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D21729	2014-08-18 11:59:38 -07:00
Igor Canadi	a6fd14c881	Fix valgrind error in c_test	2014-08-18 11:08:51 -07:00
Igor Canadi	c8ecfaedd0	Merge pull request #230 from cockroachdb/spencerkimball/send-user-keys-to-v2-filter Pass parsed user key to prefix extractor in V2 compaction	2014-08-18 11:09:30 -04:00
Yueh-Hsuan Chiang	570ba5aca8	Avoid retrying to read property block from a table when it does not exist. Summary: Avoid retrying to read property block from a table when it does not exist in updating stats for compensating deletion entries. In addition, ReadTableProperties() now returns Status::NotFound instead of Status::Corruption when table properties do not exist in the file. Test Plan: make db_test -j32 export ROCKSDB_TESTS=CompactionDeleteionTrigger ./db_test Reviewers: ljin, sdong Reviewed By: sdong Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D21867	2014-08-15 12:17:44 -07:00
sdong	58b0f9d890	Support purging logs from separate log directory Summary: 1. Support purging info logs from a path separate from the DB path. Refactor the code for generating info log prefixes so that it can be called when generating new files and scanning log directory. 2. Fix the bug of not scanning multiple DB paths (should only impact multiple DB paths) Test Plan: Add unit test for generating and parsing info log files Add end-to-end test in db_test Reviewers: yhchiang, ljin Reviewed By: ljin Subscribers: leveldb, igor, dhruba Differential Revision: https://reviews.facebook.net/D21801	2014-08-14 13:22:50 -07:00
Lei Jin	58c49466d2	Allow env_posix to lower background thread IO priority Summary: This is a linux-specific system call. Test Plan: ran db_bench Reviewers: igor, yhchiang, sdong Reviewed By: sdong Subscribers: haobo, leveldb Differential Revision: https://reviews.facebook.net/D21183	2014-08-13 20:49:58 -07:00
Lei Jin	5a5953b388	Add histogram for DB_SEEK Summary: as title Test Plan: make release Reviewers: sdong, yhchiang Reviewed By: yhchiang Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D21717	2014-08-13 15:56:37 -07:00
Feng Zhu	5e642403a9	log db path info before open Summary: 1. write db MANIFEST, CURRENT, IDENTITY, sst files, log files to log before open Test Plan: run db and check LOG file Reviewers: ljin, yhchiang, igor, dhruba, sdong Reviewed By: sdong Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D21459	2014-08-13 13:45:13 -07:00
Stanislau Hlebik	0c9dc9f8e0	Remove malloc from FormatFileNumber Summary: Replace unnecessary malloc with stack allocation Test Plan: make all check Reviewers: sdong Reviewed By: sdong Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D21771	2014-08-13 11:57:40 -07:00
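Replacing a heap allocation with a stack buffer, as 0c9dc9f8e0 describes, usually means letting the caller supply the storage. The sketch below shows that pattern; the function names, the output format, and the 64-byte buffer size are assumptions and are not taken from the FormatFileNumber source.

```cpp
#include <cinttypes>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Formats into a caller-provided buffer, so no allocation happens here.
// std::snprintf never writes more than out_size bytes and always terminates.
void FormatNumber(uint64_t number, uint32_t path_id, char* out,
                  std::size_t out_size) {
  if (path_id == 0) {
    std::snprintf(out, out_size, "%" PRIu64, number);
  } else {
    std::snprintf(out, out_size, "%" PRIu64 "(path %" PRIu32 ")", number,
                  path_id);
  }
}

// Caller side: the buffer lives on the stack and is released automatically.
std::string Describe(uint64_t number, uint32_t path_id) {
  char buf[64];  // ample for a 20-digit number plus the path suffix
  FormatNumber(number, path_id, buf, sizeof(buf));
  return std::string(buf);
}
```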
sdong	48081777f3	Revert "Include candidate files under options.db_log_dir in FindObsoleteFiles()" This reverts commit `54153ab07a`.	2014-08-12 18:14:27 -07:00
Yueh-Hsuan Chiang	0138b8eba8	Fixed compile errors (signed / unsigned comparison) in cuckoo_table_db_test on Mac Summary: Fixed compile errors (signed / unsigned comparison) in cuckoo_table_db_test on Mac Test Plan: make cuckoo_table_db_test	2014-08-12 17:35:09 -07:00
Yueh-Hsuan Chiang	1562653ba0	Fixed a signed-unsigned comparison error in db_test Summary: Fixed a signed-unsigned comparison error in db_test Test Plan: make db_test	2014-08-12 17:26:47 -07:00
Lei Jin	218857b3f5	remove tailing_iter.h/cc Summary: as title Test Plan: make all check ran db_bench and saw seek stats at the end Reviewers: yhchiang, igor, sdong Reviewed By: sdong Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D21651	2014-08-12 17:13:15 -07:00
Lei Jin	5d0074c471	set bytes_per_sync to 1MB if rate limiter is enabled Summary: as title Test Plan: make all check Reviewers: igor, yhchiang, sdong Reviewed By: sdong Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D21201	2014-08-12 16:42:18 -07:00

1255 Commits