rocksdb

Author	SHA1	Message	Date
Yueh-Hsuan Chiang	8447861896	Fixed -WShadow errors in db/db_test.cc and include/rocksdb/metadata.h Summary: Fixed -WShadow errors in db/db_test.cc and include/rocksdb/metadata.h Test Plan: make	2014-11-07 14:57:51 -08:00
Yueh-Hsuan Chiang	28c82ff1b3	CompactFiles, EventListener and GetDatabaseMetaData Summary: This diff adds three sets of APIs to RocksDB. = GetColumnFamilyMetaData = * This API allows users to obtain the current state of a RocksDB instance on one column family. * See GetColumnFamilyMetaData in include/rocksdb/db.h = EventListener = * A virtual class that allows users to implement a set of call-back functions which will be called when specific events of a RocksDB instance happen. * To register EventListener, simply insert an EventListener to ColumnFamilyOptions::listeners = CompactFiles = * CompactFiles API inputs a set of file numbers and an output level, and RocksDB will try to compact those files into the specified level. = Example = * Example code can be found in example/compact_files_example.cc, which implements a simple external compactor using EventListener, GetColumnFamilyMetaData, and CompactFiles API. Test Plan: listener_test compactor_test example/compact_files_example export ROCKSDB_TESTS=CompactFiles db_test export ROCKSDB_TESTS=MetaData db_test Reviewers: ljin, igor, rven, sdong Reviewed By: sdong Subscribers: MarkCallaghan, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D24705	2014-11-07 14:45:18 -08:00
Igor Canadi	31342c4005	Fix implicit compare	2014-11-07 12:41:05 -08:00
Igor Canadi	9f20395cd6	Turn -Wshadow back on Summary: It turns out that -Wshadow has different rules for gcc than clang. Previous commit fixed clang. This commit fixes the rest of the warnings for gcc. Test Plan: compiles Reviewers: ljin, yhchiang, rven, sdong Reviewed By: sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D28131	2014-11-06 11:14:28 -08:00
sdong	ac95ae1b5d	Make sure WAL is synced for DB::Write() if write batch is empty Summary: This patch makes it a contract that if an empty write batch is passed to DB::Write() and WriteOptions.sync = true, fsync is called to WAL. Test Plan: A new unit test Reviewers: ljin, rven, yhchiang, igor Reviewed By: igor Subscribers: dhruba, MarkCallaghan, leveldb Differential Revision: https://reviews.facebook.net/D28365	2014-11-06 09:48:19 -08:00
Lei Jin	29a9161f34	Note dynamic options in options.h Summary: as title Test Plan: n/a Reviewers: igor, yhchiang, rven, sdong Reviewed By: sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D28287	2014-11-04 16:23:45 -08:00
Lei Jin	fd24ae9d05	SetOptions() to return status and also add it to StackableDB Summary: as title Test Plan: ./db_test Reviewers: sdong, yhchiang, rven, igor Reviewed By: igor Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D28269	2014-11-04 16:23:05 -08:00
sdong	83bf09144b	Bump version number to 3.7 Summary: As title Test Plan: N/A Reviewers: ljin, yhchiang, rven, igor Reviewed By: igor Subscribers: leveldb, dhruba Differential Revision: https://reviews.facebook.net/D28299	2014-11-04 14:52:02 -08:00
sdong	09899f0b51	DB::Open() to automatically increase thread pool size if it is smaller than max number of parallel compactions or flushes Summary: With the patch, thread pool size will be automatically increased if DB's options ask for more parallelism of compactions or flushes. Too many users have been confused by the API. Change it to make it harder for users to make mistakes Test Plan: Add two unit tests to cover the function. Reviewers: yhchiang, rven, igor, MarkCallaghan, ljin Reviewed By: ljin Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D27555	2014-11-03 17:22:34 -08:00
Jonah Cohen	c1a924b9f0	Move convenience.h to /include Summary: Move header file so it can be referenced externally. Test Plan: Rebuild. Reviewers: ljin Reviewed By: ljin Subscribers: dhruba Differential Revision: https://reviews.facebook.net/D28095	2014-10-31 12:08:43 -07:00
Igor Canadi	c2999f54bd	Revert "tmp" This reverts commit `9ab0132360`.	2014-10-29 15:29:33 -07:00
Lei Jin	9ab0132360	tmp Summary: Test Plan: Reviewers: CC: Task ID: # Blame Rev:	2014-10-29 13:36:47 -07:00
Lei Jin	f1841985e4	dynamic inplace_update options Summary: Make inplace_update_support and inplace_update_num_locks dynamic. inplace_callback becomes immutable We are almost free of references to cfd->options() in db_impl Test Plan: unit test Reviewers: igor, yhchiang, rven, sdong Reviewed By: sdong Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D25293	2014-10-27 12:10:13 -07:00
Lei Jin	2dd9bfe3a8	Sanitize block-based table index type and check prefix_extractor Summary: Respond to issue reported https://www.facebook.com/groups/rocksdb.dev/permalink/651090261656158/ Change the Sanitize signature to take both DBOptions and CFOptions Test Plan: unit test Reviewers: sdong, yhchiang, igor Reviewed By: igor Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D25041	2014-10-17 21:18:36 -07:00
Igor Canadi	833357402c	WriteBatchWithIndex supports an iterator that merges its changes with a base iterator. Summary: Add an iterator that combines base_iterator of type Iterator* with delta iterator of type WBWIIterator*. Test Plan: nothing yet. work in progress Reviewers: ljin, igor Reviewed By: igor Subscribers: rven, yhchiang, leveldb Differential Revision: https://reviews.facebook.net/D24741	2014-10-10 19:02:58 -07:00
sdong	4f65fbd197	WriteBatchWithIndex's iterator to support SeekToFirst(), SeekToLast() and Prev() Summary: Support SeekToFirst(), SeekToLast() and Prev() in WBWIIterator, returned by WriteBatchWithIndex::NewIterator(). Test Plan: Write unit test cases to cover the case. Reviewers: ljin, igor Reviewed By: igor Subscribers: rven, yhchiang, leveldb Differential Revision: https://reviews.facebook.net/D24765	2014-10-10 16:19:34 -07:00
sdong	f441b273ae	WriteBatchWithIndex to support an option to overwrite rows when operating the same key Summary: With a new option, when accepting a new key, WriteBatchWithIndex will find an existing index of the same key, and replace the content of it. Test Plan: Add a unit test case. Reviewers: ljin, yhchiang, rven, igor Reviewed By: igor Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D24753	2014-10-10 15:19:21 -07:00
Lei Jin	cd0d581ff5	convert Options from string Summary: Allow accepting Options as a string of key/value pairs Test Plan: unit test Reviewers: yhchiang, sdong, igor Reviewed By: igor Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D24597	2014-10-10 10:00:12 -07:00
Tomislav Novak	88edfd90ae	SkipListRep::LookaheadIterator Summary: This diff introduces the `lookahead` argument to `SkipListFactory()`. This is an optimization for the tailing use case which includes many seeks. E.g. consider the following operations on a skip list iterator: Seek(x), Next(), Next(), Seek(x+2), Next(), Seek(x+3), Next(), Next(), ... If `lookahead` is positive, `SkipListRep` will return an iterator which also keeps track of the previously visited node. Seek() then first does a linear search starting from that node (up to `lookahead` steps). As in the tailing example above, this may require fewer than ~log(n) comparisons as with regular skip list search. Test Plan: Added a new benchmark (`fillseekseq`) which simulates the usage pattern. It first writes N records (with consecutive keys), then measures how much time it takes to read them by calling `Seek()` and `Next()`. $ time ./db_bench -num 10000000 -benchmarks fillseekseq -prefix_size 1 \ -key_size 8 -write_buffer_size $[1024*1024*1024] -value_size 50 \ -seekseq_next 2 -skip_list_lookahead=0 [...] DB path: [/dev/shm/rocksdbtest/dbbench] fillseekseq : 0.389 micros/op 2569047 ops/sec; real 0m21.806s user 0m12.106s sys 0m9.672s $ time ./db_bench [...] -skip_list_lookahead=2 [...] DB path: [/dev/shm/rocksdbtest/dbbench] fillseekseq : 0.153 micros/op 6540684 ops/sec; real 0m19.469s user 0m10.192s sys 0m9.252s Reviewers: ljin, sdong, igor Reviewed By: igor Subscribers: dhruba, leveldb, march, lovro Differential Revision: https://reviews.facebook.net/D23997	2014-10-07 11:48:23 -07:00
Igor Canadi	0908ddcea5	Don't keep managing two rocksdb version Summary: Before this diff, there are two places with rocksdb versions. After the diff: 1. we only have one source of truth for rocksdb version 2. we have a script that we can use to get the version that we can use in other compilations (java, go, etc). Test Plan: make Reviewers: yhchiang, sdong, ljin Reviewed By: ljin Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D24333	2014-10-02 11:59:22 -07:00
Lei Jin	5ec53f3edf	make compaction related options changeable Summary: make compaction related options changeable. Most of changes are tedious, following the same convention: grabs MutableCFOptions at the beginning of compaction under mutex, then pass it throughout the job and register it in SuperVersion at the end. Test Plan: make all check Reviewers: igor, yhchiang, sdong Reviewed By: sdong Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D23349	2014-10-01 16:19:16 -07:00
fyrz	5340484266	Built-in comparator(s) in RocksJava Extended built-in comparators with ReverseBytewiseComparator. Reverse key handling is essential under certain conditions, e.g. when using timestamp-versioned data. Because native comparators were not available through the Java API, both built-in comparators are now exposed via JNI so they can be set at database creation time.	2014-09-26 10:35:12 +02:00
Lei Jin	c6275956e2	improve memory efficiency of cuckoo reader Summary: When creating a new iterator, instead of storing mapping from key to bucket id for sorting, store only bucket id and read key from mmap file based on the id. This reduces from 20 bytes per entry to only 4 bytes. Test Plan: db_bench Reviewers: igor, yhchiang, sdong Reviewed By: sdong Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D23757	2014-09-25 16:15:23 -07:00
Lei Jin	581442d446	option to choose modulo when calculating CuckooTable hash Summary: Using modulo to calculate hash makes lookup ~8% slower. But it has its benefits: file size is more predictable and more space efficient Test Plan: db_bench Reviewers: igor, yhchiang, sdong Reviewed By: sdong Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D23691	2014-09-25 13:53:27 -07:00
Igor Canadi	21ddcf6e4f	Remove allow_thread_local Summary: See https://reviews.facebook.net/D19365 Test Plan: compiles Reviewers: sdong, yhchiang, ljin Reviewed By: ljin Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D23907	2014-09-24 13:12:16 -07:00
Lei Jin	0a29ce5393	re-enable BlockBasedTable::SetupForCompaction() Summary: It was commented out in D22545 by accident. Keep the option in ImmutableOptions for now. I can make it dynamic in https://reviews.facebook.net/D23349 Test Plan: make release Reviewers: sdong, yhchiang, igor Reviewed By: igor Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D23865	2014-09-23 14:18:57 -07:00
sdong	d0de413f4d	WriteBatchWithIndex to allow different Comparators for different column families Summary: Previously, one single column family is given to WriteBatchWithIndex to index keys for all column families. An extra map from column family ID to comparator is maintained which can override the default comparator given in the constructor. A WriteBatchWithIndex::SetComparatorForCF() is added for user to add comparators per column family. Also move more codes into anonymous namespace. Test Plan: Add a unit test Reviewers: ljin, igor Reviewed By: igor Subscribers: dhruba, leveldb, yhchiang Differential Revision: https://reviews.facebook.net/D23355	2014-09-22 13:47:39 -07:00
Lei Jin	57a32f147f	change target_file_size_base to uint64_t Summary: It constrains the file size to be 4G max with int Test Plan: tried to grep instance and made sure other related variables are also uint64 Reviewers: sdong, yhchiang, igor Reviewed By: igor Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D23697	2014-09-22 11:15:03 -07:00
Lei Jin	51af7c326c	CuckooTable: add one option to allow identity function for the first hash function Summary: MurmurHash becomes expensive when we do millions Get() a second in one thread. Add this option to allow the first hash function to use identity function as hash function. It results in QPS increase from 3.7M/s to ~4.3M/s. I did not observe improvement for end to end RocksDB performance. This may be caused by other bottlenecks that I will address in a separate diff. Test Plan: ``` [ljin@dev1964 rocksdb] ./cuckoo_table_reader_test --enable_perf --file_dir=/dev/shm --write --identity_as_first_hash=0 ==== Test CuckooReaderTest.WhenKeyExists ==== Test CuckooReaderTest.WhenKeyExistsWithUint64Comparator ==== Test CuckooReaderTest.CheckIterator ==== Test CuckooReaderTest.CheckIteratorUint64 ==== Test CuckooReaderTest.WhenKeyNotFound ==== Test CuckooReaderTest.TestReadPerformance With 125829120 items, utilization is 93.75%, number of hash functions: 2. Time taken per op is 0.272us (3.7 Mqps) with batch size of 0, # of found keys 125829120 With 125829120 items, utilization is 93.75%, number of hash functions: 2. Time taken per op is 0.138us (7.2 Mqps) with batch size of 10, # of found keys 125829120 With 125829120 items, utilization is 93.75%, number of hash functions: 2. Time taken per op is 0.142us (7.1 Mqps) with batch size of 25, # of found keys 125829120 With 125829120 items, utilization is 93.75%, number of hash functions: 2. Time taken per op is 0.142us (7.0 Mqps) with batch size of 50, # of found keys 125829120 With 125829120 items, utilization is 93.75%, number of hash functions: 2. Time taken per op is 0.144us (6.9 Mqps) with batch size of 100, # of found keys 125829120 With 104857600 items, utilization is 78.12%, number of hash functions: 2. Time taken per op is 0.201us (5.0 Mqps) with batch size of 0, # of found keys 104857600 With 104857600 items, utilization is 78.12%, number of hash functions: 2. 
Time taken per op is 0.121us (8.3 Mqps) with batch size of 10, # of found keys 104857600 With 104857600 items, utilization is 78.12%, number of hash functions: 2. Time taken per op is 0.123us (8.1 Mqps) with batch size of 25, # of found keys 104857600 With 104857600 items, utilization is 78.12%, number of hash functions: 2. Time taken per op is 0.121us (8.3 Mqps) with batch size of 50, # of found keys 104857600 With 104857600 items, utilization is 78.12%, number of hash functions: 2. Time taken per op is 0.112us (8.9 Mqps) with batch size of 100, # of found keys 104857600 With 83886080 items, utilization is 62.50%, number of hash functions: 2. Time taken per op is 0.251us (4.0 Mqps) with batch size of 0, # of found keys 83886080 With 83886080 items, utilization is 62.50%, number of hash functions: 2. Time taken per op is 0.107us (9.4 Mqps) with batch size of 10, # of found keys 83886080 With 83886080 items, utilization is 62.50%, number of hash functions: 2. Time taken per op is 0.099us (10.1 Mqps) with batch size of 25, # of found keys 83886080 With 83886080 items, utilization is 62.50%, number of hash functions: 2. Time taken per op is 0.100us (10.0 Mqps) with batch size of 50, # of found keys 83886080 With 83886080 items, utilization is 62.50%, number of hash functions: 2. Time taken per op is 0.116us (8.6 Mqps) with batch size of 100, # of found keys 83886080 With 73400320 items, utilization is 54.69%, number of hash functions: 2. Time taken per op is 0.189us (5.3 Mqps) with batch size of 0, # of found keys 73400320 With 73400320 items, utilization is 54.69%, number of hash functions: 2. Time taken per op is 0.095us (10.5 Mqps) with batch size of 10, # of found keys 73400320 With 73400320 items, utilization is 54.69%, number of hash functions: 2. Time taken per op is 0.096us (10.4 Mqps) with batch size of 25, # of found keys 73400320 With 73400320 items, utilization is 54.69%, number of hash functions: 2. 
Time taken per op is 0.098us (10.2 Mqps) with batch size of 50, # of found keys 73400320 With 73400320 items, utilization is 54.69%, number of hash functions: 2. Time taken per op is 0.105us (9.5 Mqps) with batch size of 100, # of found keys 73400320 [ljin@dev1964 rocksdb] ./cuckoo_table_reader_test --enable_perf --file_dir=/dev/shm --write --identity_as_first_hash=1 ==== Test CuckooReaderTest.WhenKeyExists ==== Test CuckooReaderTest.WhenKeyExistsWithUint64Comparator ==== Test CuckooReaderTest.CheckIterator ==== Test CuckooReaderTest.CheckIteratorUint64 ==== Test CuckooReaderTest.WhenKeyNotFound ==== Test CuckooReaderTest.TestReadPerformance With 125829120 items, utilization is 93.75%, number of hash functions: 2. Time taken per op is 0.230us (4.3 Mqps) with batch size of 0, # of found keys 125829120 With 125829120 items, utilization is 93.75%, number of hash functions: 2. Time taken per op is 0.086us (11.7 Mqps) with batch size of 10, # of found keys 125829120 With 125829120 items, utilization is 93.75%, number of hash functions: 2. Time taken per op is 0.088us (11.3 Mqps) with batch size of 25, # of found keys 125829120 With 125829120 items, utilization is 93.75%, number of hash functions: 2. Time taken per op is 0.083us (12.1 Mqps) with batch size of 50, # of found keys 125829120 With 125829120 items, utilization is 93.75%, number of hash functions: 2. Time taken per op is 0.083us (12.1 Mqps) with batch size of 100, # of found keys 125829120 With 104857600 items, utilization is 78.12%, number of hash functions: 2. Time taken per op is 0.159us (6.3 Mqps) with batch size of 0, # of found keys 104857600 With 104857600 items, utilization is 78.12%, number of hash functions: 2. Time taken per op is 0.078us (12.8 Mqps) with batch size of 10, # of found keys 104857600 With 104857600 items, utilization is 78.12%, number of hash functions: 2. 
Time taken per op is 0.080us (12.6 Mqps) with batch size of 25, # of found keys 104857600 With 104857600 items, utilization is 78.12%, number of hash functions: 2. Time taken per op is 0.080us (12.5 Mqps) with batch size of 50, # of found keys 104857600 With 104857600 items, utilization is 78.12%, number of hash functions: 2. Time taken per op is 0.082us (12.2 Mqps) with batch size of 100, # of found keys 104857600 With 83886080 items, utilization is 62.50%, number of hash functions: 2. Time taken per op is 0.154us (6.5 Mqps) with batch size of 0, # of found keys 83886080 With 83886080 items, utilization is 62.50%, number of hash functions: 2. Time taken per op is 0.077us (13.0 Mqps) with batch size of 10, # of found keys 83886080 With 83886080 items, utilization is 62.50%, number of hash functions: 2. Time taken per op is 0.077us (12.9 Mqps) with batch size of 25, # of found keys 83886080 With 83886080 items, utilization is 62.50%, number of hash functions: 2. Time taken per op is 0.078us (12.8 Mqps) with batch size of 50, # of found keys 83886080 With 83886080 items, utilization is 62.50%, number of hash functions: 2. Time taken per op is 0.079us (12.6 Mqps) with batch size of 100, # of found keys 83886080 With 73400320 items, utilization is 54.69%, number of hash functions: 2. Time taken per op is 0.218us (4.6 Mqps) with batch size of 0, # of found keys 73400320 With 73400320 items, utilization is 54.69%, number of hash functions: 2. Time taken per op is 0.083us (12.0 Mqps) with batch size of 10, # of found keys 73400320 With 73400320 items, utilization is 54.69%, number of hash functions: 2. Time taken per op is 0.085us (11.7 Mqps) with batch size of 25, # of found keys 73400320 With 73400320 items, utilization is 54.69%, number of hash functions: 2. Time taken per op is 0.086us (11.6 Mqps) with batch size of 50, # of found keys 73400320 With 73400320 items, utilization is 54.69%, number of hash functions: 2. 
Time taken per op is 0.078us (12.8 Mqps) with batch size of 100, # of found keys 73400320 ``` Reviewers: sdong, igor, yhchiang Reviewed By: igor Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D23451	2014-09-18 11:00:48 -07:00
Lei Jin	a062e1f2c4	SetOptions() for memtable related options Summary: as title Test Plan: make all check I will think of a way to set up a stress test for this Reviewers: sdong, yhchiang, igor Reviewed By: igor Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D23055	2014-09-17 12:49:13 -07:00
Lei Jin	e4eca6a1e5	Options conversion function for convenience Summary: as title Test Plan: options_test Reviewers: sdong, yhchiang, igor Reviewed By: igor Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D23283	2014-09-17 12:46:32 -07:00
Igor Canadi	4a27a2f193	Don't sync manifest when disableDataSync = true Summary: As we discussed offline Test Plan: compiles Reviewers: yhchiang, sdong, ljin, dhruba Reviewed By: sdong Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D22989	2014-09-15 11:32:01 -07:00
Jonah Cohen	092f97e219	Fix comments and typos Summary: Correct some comments and typos in RocksDB. Test Plan: Inspection Reviewers: sdong, igor Reviewed By: igor Differential Revision: https://reviews.facebook.net/D23133	2014-09-09 15:20:49 -07:00
Xiaozheng Tie	6cc12860f0	Added a few statistics for BackupableDB Summary: Added the following statistics to BackupableDB: 1. Number of successful and failed backups in class BackupStatistics 2. Time taken to do a backup 3. Number of files in a backup 1 is implemented in the BackupStatistics class 2 and 3 are added in the BackupMeta and BackupInfo class Test Plan: 1 can be tested using BackupStatistics::ToString(), 2 and 3 can be tested in the BackupInfo class Reviewers: sdong, igor2, ljin, igor Reviewed By: igor Differential Revision: https://reviews.facebook.net/D22785	2014-09-09 13:44:42 -07:00
Lei Jin	52311463e9	MemTableOptions Summary: removed reference to options in WriteBatch and DBImpl::Get() Test Plan: make all check Reviewers: yhchiang, igor, sdong Reviewed By: sdong Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D23049	2014-09-08 18:46:52 -07:00
Lei Jin	659d2d50c3	move compaction_filter to immutable_options Summary: all shared_ptrs are in immutable_options now. This will also make options assignment a little cheaper Test Plan: make release Reviewers: sdong, yhchiang, igor Reviewed By: igor Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D23001	2014-09-08 15:09:25 -07:00
Lei Jin	048560a642	reduce references to cfd->options() in DBImpl Summary: I found it is almost impossible to get rid of this function in a single batch. I will take a step by step approach Test Plan: make release Reviewers: sdong, yhchiang, igor Reviewed By: igor Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D22995	2014-09-08 15:04:34 -07:00
Igor Canadi	a2bb7c3c33	Push- instead of pull-model for managing Write stalls Summary: Introducing WriteController, which is a source of truth about per-DB write delays. Let's define a DB epoch as a period where there are no flushes and compactions (i.e. new epoch is started when flush or compaction finishes). Each epoch can either: * proceed with all writes without delay * delay all writes by fixed time * stop all writes The three modes are recomputed at each epoch change (flush, compaction), rather than on every write (which is currently the case). When we have a lot of column families, our current pull behavior adds a big overhead, since we need to loop over every column family for every write. With new push model, overhead on Write code-path is minimal. This is just the start. Next step is to also take care of stalls introduced by slow memtable flushes. The final goal is to eliminate function MakeRoomForWrite(), which currently needs to be called for every column family by every write. Test Plan: make check for now. I'll add some unit tests later. Also, perf test. Reviewers: dhruba, yhchiang, MarkCallaghan, sdong, ljin Reviewed By: ljin Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D22791	2014-09-08 11:20:25 -07:00
Feng Zhu	0af157f9bf	Implement full filter for block based table. Summary: 1. Make filter_block.h a base class. Derive block_based_filter_block and full_filter_block. The previous one is the traditional filter block. The full_filter_block is newly added. It would generate a filter block that contains all the keys in SST file. 2. When querying a key, table would first check if full_filter is available. If not, it would go to the exact data block and check using block_based filter. 3. User could choose to use full_filter or traditional (block_based_filter). They would be stored in SST file with different meta index name. "filter.filter_policy" or "full_filter.filter_policy". Then, Table reader is able to know the filter block type. 4. Some optimizations have been done for full_filter_block, thus it requires a different interface compared to the original one in filter_policy.h. 5. Actual implementation of filter bits coding/decoding is placed in util/bloom_impl.cc Benchmark: base commit `1d23b5c470` Command: db_bench --db=/dev/shm/rocksdb --num_levels=6 --key_size=20 --prefix_size=20 --keys_per_prefix=0 --value_size=100 --write_buffer_size=134217728 --max_write_buffer_number=2 --target_file_size_base=33554432 --max_bytes_for_level_base=1073741824 --verify_checksum=false --max_background_compactions=4 --use_plain_table=0 --memtablerep=prefix_hash --open_files=-1 --mmap_read=1 --mmap_write=0 --bloom_bits=10 --bloom_locality=1 --memtable_bloom_bits=500000 --compression_type=lz4 --num=393216000 --use_hash_search=1 --block_size=1024 --block_restart_interval=16 --use_existing_db=1 --threads=1 --benchmarks=readrandom --disable_auto_compactions=1 Read QPS increases by about 30%, from 2230002 to 2991411. Test Plan: make all check valgrind db_test db_stress --use_block_based_filter = 0 ./auto_sanity_test.sh Reviewers: igor, yhchiang, ljin, sdong Reviewed By: sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D20979	2014-09-08 10:37:05 -07:00
Nik Bougalis	d1cfb71ec7	Remove unused member(s)	2014-09-05 20:50:29 -07:00
liuhuahang	bb6ae0f80c	fix more compile warnings N/A Change-Id: I5b6f9c70aea7d3f3489328834fed323d41106d9f Signed-off-by: liuhuahang <liuhuahang@zerus.co>	2014-09-05 14:14:37 +08:00
Lei Jin	5665e5e285	introduce ImmutableOptions Summary: As a preparation to support updating some options dynamically, I'd like to first introduce ImmutableOptions, which is a subset of Options that cannot be changed during the course of a DB lifetime without restart. ColumnFamily will keep both Options and ImmutableOptions. Any component below ColumnFamily should only take ImmutableOptions in their constructor. Other options should be taken from APIs, which will be allowed to adjust dynamically. I am yet to make changes to memtable and other related classes to take ImmutableOptions in their ctor. That can be done in a separate diff as this one is already pretty big. Test Plan: make all check Reviewers: yhchiang, igor, sdong Reviewed By: sdong Subscribers: leveldb, dhruba Differential Revision: https://reviews.facebook.net/D22545	2014-09-04 16:18:36 -07:00
Raghav Pisolkar	e0b99d4f5d	created a new ReadOptions parameter 'iterate_upper_bound'	2014-09-04 11:00:16 -07:00
Lei Jin	703c3eacd9	comments about the BlockBasedTableOptions migration in Options Summary: as title Test Plan: none Reviewers: sdong, igor Reviewed By: igor Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D22737	2014-09-03 17:01:34 -07:00
Lei Jin	9b58c73c7c	call SanitizeDBOptionsByCFOptions() in the right place Summary: It only covers Open() with default column family right now Test Plan: make release Reviewers: igor, yhchiang, sdong Reviewed By: sdong Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D22467	2014-09-02 14:42:23 -07:00
Igor Canadi	a84234a61b	Ignore missing column families Summary: Before this diff, whenever we Write to non-existing column family, Write() would fail. This diff adds an option to not fail a Write() when WriteBatch points to non-existing column family. MongoDB said this would be useful for them, since they might have a transaction updating an index that was dropped by another thread. This way, they don't have to worry about checking if all indexes are alive on every write. They don't care if they lose writes to dropped index. Test Plan: added a small unit test Reviewers: sdong, yhchiang, ljin Reviewed By: ljin Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D22143	2014-09-02 13:29:05 -07:00
Feng Zhu	8438a19360	fix dropping column family bug Summary: 1. db/db_impl.cc:2324 (DBImpl::BackgroundCompaction) should not raise bg_error_ when column family is dropped during compaction. Test Plan: 1. db_stress Reviewers: ljin, yhchiang, dhruba, igor, sdong Reviewed By: igor Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D22653	2014-09-02 12:25:58 -07:00
Radheshyam Balasundaram	7f71448388	Implementing a cache friendly version of Cuckoo Hash Summary: This implements a cache friendly version of Cuckoo Hash in which, in case of collision, we try to insert in next few locations. The size of the neighborhood to check is taken as an input parameter in builder and stored in the table. Test Plan: make check all cuckoo_table_{db,reader,builder}_test Reviewers: sdong, ljin Reviewed By: ljin Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D22455	2014-08-28 10:42:23 -07:00
Igor Canadi	d5bd6c772b	Fix ios compile Summary: No __thread for ios. Test Plan: compile works for ios now Reviewers: ljin, dhruba Reviewed By: dhruba Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D22491	2014-08-28 12:46:05 -04:00
Lei Jin	1755581f19	improve OptimizeForPointLookup() Summary: also fix HISTORY.md Test Plan: make all check Reviewers: sdong, yhchiang, igor Reviewed By: igor Subscribers: leveldb Differential Revision: https://reviews.facebook.net/D22437	2014-08-26 14:15:00 -07:00
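
Note on 88edfd90ae (SkipListRep::LookaheadIterator): that commit message describes a Seek() that first scans linearly from the previously visited node, for up to `lookahead` steps, before falling back to the regular search. The following is a minimal, self-contained sketch of that idea and not RocksDB's implementation. It assumes unique integer keys, uses a sorted std::vector in place of the skip list, and uses std::lower_bound in place of the regular skip list search. The names LookaheadSeeker and TailingComparisons are hypothetical and exist only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical sketch, not RocksDB code. A sorted vector of unique keys
// stands in for the skip list; std::lower_bound stands in for the regular
// O(log n) skip list search.
class LookaheadSeeker {
 public:
  LookaheadSeeker(std::vector<int> keys, std::size_t lookahead)
      : keys_(std::move(keys)), lookahead_(lookahead), pos_(keys_.size()) {}

  bool Valid() const { return pos_ < keys_.size(); }
  int key() const { return keys_[pos_]; }
  void Next() {
    if (Valid()) ++pos_;
  }
  std::size_t comparisons() const { return comparisons_; }

  // Positions the iterator at the first key >= target.
  void Seek(int target) {
    if (lookahead_ > 0 && Valid() && !Less(target, keys_[pos_])) {
      // Current key <= target, so the answer is at or after pos_.
      // Check the current node and up to lookahead_ nodes after it.
      std::size_t i = pos_;
      for (std::size_t step = 0; step <= lookahead_ && i < keys_.size();
           ++step, ++i) {
        if (!Less(keys_[i], target)) {  // keys_[i] >= target: found it
          pos_ = i;
          return;
        }
      }
    }
    // Fallback: full search over the whole key range.
    auto it = std::lower_bound(keys_.begin(), keys_.end(), target,
                               [this](int a, int b) { return Less(a, b); });
    pos_ = static_cast<std::size_t>(it - keys_.begin());
  }

 private:
  // Counts every key comparison so the two strategies can be compared.
  bool Less(int a, int b) {
    ++comparisons_;
    return a < b;
  }

  std::vector<int> keys_;
  std::size_t lookahead_;
  std::size_t pos_;  // current position; also the hint for the next Seek()
  std::size_t comparisons_ = 0;
};

// Runs a tailing-style pattern (Seek, Next, Seek, Next, ...) over 1000
// consecutive keys and returns the total number of key comparisons.
std::size_t TailingComparisons(std::size_t lookahead) {
  std::vector<int> keys(1000);
  for (int i = 0; i < 1000; ++i) keys[i] = i;
  LookaheadSeeker it(std::move(keys), lookahead);
  it.Seek(0);
  for (int target = 2; target < 1000; target += 2) {
    it.Next();
    it.Seek(target);
    assert(it.Valid() && it.key() == target);
  }
  return it.comparisons();
}
```

In this sketch, calling TailingComparisons(2) should need fewer comparisons than TailingComparisons(0), because each Seek() lands within two steps of the previous position. Seeks that jump backward or far ahead still go through the full search.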

586 Commits