rocksdb

Author	SHA1	Message	Date
Mayank Agarwal	56305221c4	Unify DeleteFile and DeleteWalFiles Summary: This is to simplify rocksdb public APIs and improve the code quality. Created an additional parameter to ParseFileName for log sub type and improved the code for deleting a wal file. Wrote exhaustive unit-tests in delete_file_test Unification of other redundant APIs can be taken up in a separate diff Test Plan: Expanded delete_file test Reviewers: dhruba, haobo, kailiu, sdong Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D13647	2013-10-25 08:32:14 -07:00
Kai Liu	c17607a251	Fix the log number bug when updating MANIFEST file Summary: Crash may occur during the flushes of more than two mem tables. As the info log suggested, even when both were successfully flushed, the recovery process still picks up one of the memtables' logs for recovery. This diff fixes the problem by setting the correct "log number" in MANIFEST. Test Plan: make test; deployed to leaf4 and make sure it doesn't result in crashes of this type. Reviewers: haobo, dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D13659	2013-10-24 21:05:33 -07:00
Haobo Xu	2fb361ad98	[RocksDB] Add perf_context.wal_write_time to track time spent on writing the recovery log. Summary: as title Test Plan: make check; ./perf_context_test Reviewers: dhruba Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D13629	2013-10-23 13:38:39 -07:00
Mayank Agarwal	9b50106f9a	Dbid feature Summary: Create a new type of file on startup if it doesn't already exist called DBID. This will store a unique number generated from boost library's uuid header file. The use-case is to identify the case of a db losing all its data and coming back up either empty or from an image (backup/live replica's recovery). The key point to note is that DBID is not stored in a backup or db snapshot. It's preferable to use Boost for uuid because: 1) A non-standard way of generating uuid is not good 2) /proc/sys/kernel/random/uuid generates a uuid but only on linux environments and the solution would not be clean 3) c++ doesn't have any direct way to get a uuid 4) Boost is a very good library that was already having linkage in rocksdb from third-party Note: I had to update the TOOLCHAIN_REV in build files to get latest version of boost from third-party as the older version had a bug. I had to put Wno-uninitialized in Makefile because boost-1.51 has an uninitialized variable and rocksdb would not compile otherwise. Latest open-source for boost is 1.54 but is not there in third-party. I have notified the concerned people in fbcode about it. @kailiu : While releasing to third-party, an additional dependency will need to be created for boost in TARGETS file. I can help identify. Test Plan: Expand db_test to test 2 cases 1) Restarting db with Id file present - verify that no change to Id 2) Restarting db with Id file deleted - verify that a different Id is there after reopen Also run make all check Reviewers: dhruba, haobo, kailiu, sdong Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D13587	2013-10-22 12:23:34 -07:00
Siying Dong	9edda37027	Universal Compaction to Have a Size Percentage Threshold To Decide Whether to Compress Summary: This patch adds an option for universal compaction to allow us to only compress output files if the files compacted previously did not yet reach a specified ratio, to save CPU costs in some cases. Compression is always skipped for flushing. This is because the size information is not easy to evaluate for flushing case. We can improve it later. Test Plan: add test DBTest.UniversalCompactionCompressRatio1 and DBTest.UniversalCompactionCompressRatio12 Reviewers: dhruba, haobo Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D13467	2013-10-17 13:33:39 -07:00
Dhruba Borthakur	9cd221094c	Add appropriate LICENSE and Copyright message. Summary: Add appropriate LICENSE and Copyright message. Test Plan: make check Reviewers: CC: Task ID: # Blame Rev:	2013-10-16 17:48:41 -07:00
Siying Dong	073cbfc8f0	Enable background flush thread by default and fix issues related to it Summary: Enable background flush thread in this patch and fix unit tests with: (1) After background flush, schedule a background compaction if condition satisfied; (2) Fix a bug that if universal compaction is enabled and number of levels are set to be 0, compaction will not be automatically triggered (3) Fix unit tests to wait for compaction to finish instead of flush, before checking the compaction results. Test Plan: pass all unit tests Reviewers: haobo, xjin, dhruba Reviewed By: haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D13461	2013-10-16 13:32:53 -07:00
Mayank Agarwal	fe3713961e	Features in Transaction log iterator Summary: * Logstore requests a valid change of returning an empty iterator and not an error in case of no log files. * Changed the code to return the writebatch containing the sequence number requested from GetupdatesSince even if it lies in the middle. Earlier we used to return the next writebatch. This also allows me to guarantee that no files played upon by the iterator are redundant. I mean the starting log file has at least a sequence number >= the sequence number requested from GetupdatesSince. * Cleaned up redundant logic in Iterator::Next and made a new function SeekToStartSequence for greater readability and maintainability. * Modified a test in db_test accordingly Please check the logic carefully and suggest improvements. I have a separate patch out for more improvements like restricting reader to read till written sequences. Test Plan: * transaction log iterator tests in db_test, * db_repl_stress. * rocks_log_iterator_test in fbcode/wormhole/rocksdb/test - 2 tests thriving on hacks till now can get simplified * testing on the shadow setup for sigma with replication Reviewers: dhruba, haobo, kailiu, sdong Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D13437	2013-10-14 18:16:21 -07:00
Kai Liu	86ef6c3f74	Add statistics to sst file Summary: So far we only have key/value pairs as well as bloom filter stored in the sst file. It will be great if we are able to store more metadata about this table itself, for example, the entry size, bloom filter name, etc. This diff is the first step of this effort. It allows table to keep the basic statistics mentioned in http://fburl.com/14995441, as well as allowing writing user-collected stats to stats block. After this diff, we will figure out the interface of how to allow user to collect their interested statistics. Test Plan: 1. Added several unit tests. 2. Ran `make check` to ensure it doesn't break other tests. Reviewers: dhruba, haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D13419	2013-10-14 15:56:13 -07:00
Siying Dong	88f2f89068	Change Function names from Compaction->Flush When they really mean Flush Summary: When I debug the unit test failures when enabling background flush thread, I feel the function names can be made clearer for people to understand. Also, if the names are fixed, in many places, some tests' bugs are obvious (and some of those tests are failing). This patch is to clean it up for future maintenance. Test Plan: Run test suites. Reviewers: haobo, dhruba, xjin Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D13431	2013-10-14 15:12:15 -07:00
Mayank Agarwal	a8b4a69de0	Fixing error in ParseFileName causing DestroyDB to fail on archive directory Summary: This careless error was causing ASSERT_OK(DestroyDB) to fail in db_test. Basically .. was being returned as a child of db/archive and ParseFileName returned false on that, but 'type' was set to LogFile from earlier and not reset. The return of ParseFileName was not being checked to delete the log file or not. Test Plan: make all check Reviewers: dhruba, haobo, xjin, kailiu, nkg- Reviewed By: nkg- CC: leveldb Differential Revision: https://reviews.facebook.net/D13413	2013-10-10 18:18:31 -07:00
Naman Gupta	cbf4a06427	Add option for storing transaction logs in a separate dir Summary: In some cases, you might not want to store the data log (write ahead log) files in the same dir as the sst files. An example use case is leaf, which stores sst files in tmpfs. And would like to save the log files in a separate dir (disk) to save memory. Test Plan: make all. Ran db_test test. A few test failing. P2785018. If you guys don't see an obvious problem with the code, maybe somebody from the rocksdb team could help me debug the issue here. Running this on leaf worked well. I could see logs stored on disk, and deleted appropriately after compactions. Obviously this is only one set of options. The unit tests cover different options. Seems like I'm missing some edge cases. Reviewers: dhruba, haobo, leveldb CC: xinyaohu, sumeet Differential Revision: https://reviews.facebook.net/D13239	2013-10-08 17:40:27 -07:00
Dhruba Borthakur	4463b11cad	Migrate names of properties from 'leveldb' prefix to 'rocksdb' prefix. Summary: Migrate names of properties from 'leveldb' prefix to 'rocksdb' prefix. Test Plan: make check Reviewers: emayanke, haobo Reviewed By: haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D13311	2013-10-06 00:14:26 -07:00
Haobo Xu	bf89edf78b	[RocksDB] Added a property "leveldb.num-immutable-mem-table" so that Flush can be called without blocking, and application still has a way to check when it's done also without blocking. Summary: as title Test Plan: DBTest.NumImmutableMemTable Reviewers: dhruba Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D13305	2013-10-05 11:54:08 -07:00
Dhruba Borthakur	0a9f873f4b	Removed scribe, thrift and java modules. Summary: Removed scribe, thrift and java modules. Test Plan: make release make check Reviewers: emayanke Reviewed By: emayanke CC: leveldb Differential Revision: https://reviews.facebook.net/D13293	2013-10-04 15:36:00 -07:00
Dhruba Borthakur	a143ef9b38	Change namespace from leveldb to rocksdb Summary: Change namespace from leveldb to rocksdb. This allows a single application to link in open-source leveldb code as well as rocksdb code into the same process. Test Plan: compile rocksdb Reviewers: emayanke Reviewed By: emayanke CC: leveldb Differential Revision: https://reviews.facebook.net/D13287	2013-10-04 11:59:26 -07:00
Mayank Agarwal	b3ed08129b	Add a statistic to count the number of calls to GetUpdatesSince Summary: This is useful to keep track of refreshes in transaction log iterator Test Plan: make; db_stress --statistics=1 shows it Reviewers: dhruba, haobo Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D13281	2013-10-04 10:47:20 -07:00
Haobo Xu	200c05a23f	[RocksDB] Still honor DisableFileDeletions when purge_log_after_memtable_flush is on Summary: as title Test Plan: make check Reviewers: emayanke Reviewed By: emayanke CC: leveldb Differential Revision: https://reviews.facebook.net/D13263	2013-10-03 16:12:43 -07:00
Haobo Xu	fa798e9e28	[Rocksdb] Submit mem table flush job in a different thread pool Summary: As title. This is just a quick hack and not ready for commit. fails a lot of unit test. I will test/debug it directly in ViewState shadow . Test Plan: Try it in shadow test. Reviewers: dhruba, xjin CC: leveldb Differential Revision: https://reviews.facebook.net/D12933	2013-10-03 14:37:19 -07:00
Xing Jin	658a3ce2fa	Fix SIGSEGV issue in universal compaction Summary: We saw SIGSEGV when set options.num_levels=1 in universal compaction style. Dug into this issue for a while, and finally found the root cause (thank Haobo for discussion). Test Plan: Add new unit test. It throws SIGSEGV without this change. Also run "make all check". Reviewers: haobo, dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D13251	2013-10-02 17:33:31 -07:00
Xing Jin	8eb552bf4d	New unit test for iterator with snapshot Summary: I played with the reported bug about iterator with snapshot: https://code.google.com/p/leveldb/issues/detail?id=200. I turned the original test program (https://code.google.com/p/leveldb/issues/attachmentText?id=200&aid=2000000000&name=test.cc&token=7uOUQW-HFlbAFMUm7EqtaAEy7Tw%3A1378320724136) into a new unit test, but I cannot reproduce the problem. Notice lines 31-34 in above link. I have run the new test with and without such Put() operations. Both succeed. So this diff simply adds the test, without changing any source code. Test Plan: run new test. Reviewers: dhruba, haobo, emayanke Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D12735	2013-09-28 11:39:08 -07:00
Haobo Xu	0c4040681a	[RocksDB] Move last_sequence and last_flushed_sequence_ update back into lock protected area Summary: A previous diff moved these outside of lock protected area. Moved back in now. Also moved tmp_batch_ update outside of lock protected area, as only the single write thread can access it. Test Plan: make check Reviewers: dhruba Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D13137	2013-09-26 20:43:11 -07:00
Haobo Xu	0e422308aa	[RocksDB] Remove Log file immediately after memtable flush Summary: As title. The DB log file life cycle is tied up with the memtable it backs. Once the memtable is flushed to sst and committed, we should be able to delete the log file, without holding the mutex. This is part of the bigger change to avoid FindObsoleteFiles at runtime. It deals with log files. sst files will be dealt with later. Test Plan: make check; db_bench Reviewers: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D11709	2013-09-12 11:54:44 -07:00
Dhruba Borthakur	32c965d417	Flush was hanging because the configured options specified that more than 1 memtable need to be merged. Summary: There is a config option called Options.min_write_buffer_number_to_merge that specifies the minimum number of write buffers to merge in memory before flushing to a file in L0. But in the case when the db is being closed, we should not be using this config, instead we should flush whatever write buffers were available at that time. Test Plan: Unit test attached. Reviewers: haobo, emayanke Reviewed By: haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D12717	2013-09-06 16:28:33 -07:00
Mayank Agarwal	aa5c897d42	Return pathname relative to db dir in LogFile and cleanup AppendSortedWalsOfType Summary: So that replication can just download from wherever LogFile.Pathname is pointing them. Test Plan: make all check;./db_repl_stress Reviewers: dhruba, haobo Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D12609	2013-09-04 13:44:43 -07:00
Xing Jin	42c109cc2e	New ldb command to convert compaction style Summary: Add new command "change_compaction_style" to ldb tool. For universal->level, it shows "nothing to do". For level->universal, it compacts all files into a single one and moves the file to level 0. Also add check for number of files at level 1+ when opening db with universal compaction style. Test Plan: 'make all check'. New unit test for internal conversion function. Also manually test various cmd like: ./ldb change_compaction_style --old_compaction_style=0 --new_compaction_style=1 --db=/tmp/leveldbtest-3088/db_test Reviewers: haobo, dhruba Reviewed By: haobo CC: vamsi, emayanke Differential Revision: https://reviews.facebook.net/D12603	2013-09-04 13:13:08 -07:00
Mayank Agarwal	c34271a5a5	Fix bug in Counters and record Sequencenumber using only TickerCount Summary: The way counters/statistics are implemented in rocksdb demands that enum Tickers and TickerNameMap follow the same order, otherwise statistics exposed from fbcode/rocks get out-of-sync. 2 counters for prefix had violated this order and when I built counters for fbcode/mcrocksdb, statistics for sequence number were appearing out-of-sync. The other change is to record sequence-number using setTickerCount only and not recordTick. This is because of difference in statistics as understood by rocks/utils which uses ServiceData::statistics function and rocksdb statistics. In rocksdb there is just 1 counter for a countername. But in ServiceData there are 4 independent buckets for every countername-Count, Sum, Average and Rate. SetTickerCount and RecordTick update the same variable in rocksdb but different buckets in ServiceData. Therefore, I had to choose one consistent function from RecordTick or SetTickerCount for sequence number in rocksdb. I chose SetTickerCount because the statistics object in options passed during rocksdb-open is user-dependent and SetTickerCount makes sense there. There will be a corresponding diff to mcrocksdb in fbcode shortly. Test Plan: make all check; check ticker value using fprintfs Reviewers: dhruba Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D12669	2013-09-01 17:59:32 -07:00
Mayank Agarwal	ab5c5c28fe	Fix build caused by DeleteFile not tolerating / at the beginning Summary: db->DeleteFile calls ParseFileName to check name that was returned for sst file. Now, sst filename is returned using TableFileName which uses MakeFileName. This puts a / at the front of the name and ParseFileName doesn't like that. Changed ParseFileName to tolerate /s at the beginning. The test delete_file_test used to pass earlier because this behaviour of MakeFileName had been changed a while back to not return a / during which delete_file_test was checked in. But MakeFileName had to be reverted to add / at the front because GetLiveFiles used at many places outside rocksdb used the previous behaviour of MakeFileName. Test Plan: make;./delete_filetest;make all check Reviewers: dhruba, haobo, vamsi Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D12663	2013-09-01 17:59:13 -07:00
Dhruba Borthakur	59de2dbad7	Cleanup DeleteFile API Summary: The DeleteFile API was removing files inside the db-lock. This is now changed to remove files outside the db-lock. The GetLiveFilesMetadata() returns the smallest and largest sequence number of each file as well. Test Plan: deletefile_test Reviewers: emayanke, haobo Reviewed By: haobo CC: leveldb Maniphest Tasks: T63 Differential Revision: https://reviews.facebook.net/D12567	2013-08-28 21:18:58 -07:00
Haobo Xu	48e5ea0c34	[RocksDB] Fix TransformRepFactory related valgrind problem Summary: Let TransformRepFactory own the passed in transform. Also make it better encapsulated. Test Plan: make valgrind_check; Reviewers: dhruba, emayanke Reviewed By: emayanke CC: leveldb Differential Revision: https://reviews.facebook.net/D12591	2013-08-28 19:27:54 -07:00
Dhruba Borthakur	fc0c399d2e	Introduced a new flag non_blocking_io in ReadOptions. Summary: If ReadOptions.non_blocking_io is set to true, then KeyMayExists and Iterators will return data that is cached in RAM. If the Iterator needs to do IO from storage to serve the data, then the Iterator.status() will return Status::IsRetry(). Test Plan: Enhanced unit test DBTest.KeyMayExist to detect if there are IOs issued from storage. Added DBTest.NonBlockingIteration to verify nonblocking Iterations. Reviewers: emayanke, haobo Reviewed By: haobo CC: leveldb Maniphest Tasks: T63 Differential Revision: https://reviews.facebook.net/D12531	2013-08-28 10:49:14 -07:00
Haobo Xu	43eef52001	[RocksDB] move stats counting outside of mutex protected region for DB::Get() Summary: As title. This is possible as tickers are atomic now. db_bench on high qps in-memory multi-thread random get workload, showed ~5% throughput improvement. Test Plan: make check; db_bench; db_stress Reviewers: dhruba Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D12555	2013-08-27 13:36:10 -07:00
Deon Nicholas	573844807c	Fix for no_io Summary: Oops. My bad. Test Plan: Make all check Reviewers: emayanke Reviewed By: emayanke CC: haobo, leveldb, dhruba Differential Revision: https://reviews.facebook.net/D12525	2013-08-23 16:36:01 -07:00
Dhruba Borthakur	1186192ed1	Replace include/leveldb with include/rocksdb. Summary: Replace include/leveldb with include/rocksdb. Test Plan: make clean; make check make clean; make release Differential Revision: https://reviews.facebook.net/D12489	2013-08-23 10:51:00 -07:00
Jim Paton	74781a0c49	Add three new MemTableRep's Summary: This patch adds three new MemTableRep's: UnsortedRep, PrefixHashRep, and VectorRep. UnsortedRep stores keys in an std::unordered_map of std::sets. When an iterator is requested, it dumps the keys into an std::set and iterates over that. VectorRep stores keys in an std::vector. When an iterator is requested, it creates a copy of the vector and sorts it using std::sort. The iterator accesses that new vector. PrefixHashRep stores keys in an unordered_map mapping prefixes to ordered sets. I also added one API change. I added a function MemTableRep::MarkImmutable. This function is called when the rep is added to the immutable list. It doesn't do anything yet, but it seems like that could be useful. In particular, for the vectorrep, it means we could elide the extra copy and just sort in place. The only reason I haven't done that yet is because the use of the ArenaAllocator complicates things (I can elaborate on this if needed). Test Plan: make -j32 check ./db_stress --memtablerep=vector ./db_stress --memtablerep=unsorted ./db_stress --memtablerep=prefixhash --prefix_size=10 Reviewers: dhruba, haobo, emayanke Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D12117	2013-08-22 23:10:02 -07:00
Xing Jin	17dc128048	Pull from https://reviews.facebook.net/D10917 Summary: Pull Mark's patch and slightly revise it. I revised another place in db_impl.cc with similar new formula. Test Plan: make all check. Also run "time ./db_bench --num=2500000000 --numdistinct=2200000000". It has run for 20+ hours and hasn't finished. Looks good so far: Installed stack trace handler for SIGILL SIGSEGV SIGBUS SIGABRT LevelDB: version 2.0 Date: Tue Aug 20 23:11:55 2013 CPU: 32 * Intel(R) Xeon(R) CPU E5-2660 0 @ 2.20GHz CPUCache: 20480 KB Keys: 16 bytes each Values: 100 bytes each (50 bytes after compression) Entries: 2500000000 RawSize: 276565.6 MB (estimated) FileSize: 157356.3 MB (estimated) Write rate limit: 0 Compression: snappy WARNING: Assertions are enabled; benchmarks unnecessarily slow ------------------------------------------------ DB path: [/tmp/leveldbtest-3088/dbbench] fillseq : 7202.000 micros/op 138 ops/sec; DB path: [/tmp/leveldbtest-3088/dbbench] fillsync : 7148.000 micros/op 139 ops/sec; (2500000 ops) DB path: [/tmp/leveldbtest-3088/dbbench] fillrandom : 7105.000 micros/op 140 ops/sec; DB path: [/tmp/leveldbtest-3088/dbbench] overwrite : 6930.000 micros/op 144 ops/sec; DB path: [/tmp/leveldbtest-3088/dbbench] readrandom : 1.020 micros/op 980507 ops/sec; (0 of 2500000000 found) DB path: [/tmp/leveldbtest-3088/dbbench] readrandom : 1.021 micros/op 979620 ops/sec; (0 of 2500000000 found) DB path: [/tmp/leveldbtest-3088/dbbench] readseq : 113.000 micros/op 8849 ops/sec; DB path: [/tmp/leveldbtest-3088/dbbench] readreverse : 102.000 micros/op 9803 ops/sec; DB path: [/tmp/leveldbtest-3088/dbbench] Created bg thread 0x7f0ac17f7700 compact : 111701.000 micros/op 8 ops/sec; DB path: [/tmp/leveldbtest-3088/dbbench] readrandom : 1.020 micros/op 980376 ops/sec; (0 of 2500000000 found) DB path: [/tmp/leveldbtest-3088/dbbench] readseq : 120.000 micros/op 8333 ops/sec; DB path: [/tmp/leveldbtest-3088/dbbench] readreverse : 29.000 micros/op 34482 ops/sec; DB path: [/tmp/leveldbtest-3088/dbbench] ... finished 618100000 ops Reviewers: MarkCallaghan, haobo, dhruba, chip Reviewed By: dhruba Differential Revision: https://reviews.facebook.net/D12441	2013-08-22 22:37:13 -07:00
Simha Venkataramaiah	60bf2b7d4a	Add APIs to query SST file metadata and to delete specific SST files Summary: An api to query the level, key ranges, size etc for each SST file and an api to delete a specific file from the db and all associated state in the bookkeeping datastructures. Notes: Editing the manifest version does not release the obsolete files right away. However deleting the file directly will mess up the iterator. We may need a more aggressive/timely file deletion api. I have used std::unique_ptr - will switch to boost:: since this is external. thoughts? Unit test is fragile right now as it expects the compaction at certain levels. Test Plan: unittest Reviewers: dhruba, vamsi, emayanke CC: zshao, leveldb, haobo Task ID: # Blame Rev:	2013-08-22 15:27:19 -07:00
Deon Nicholas	b87dcae1a3	Made merge_operator a shared_ptr; and added TTL unit tests Test Plan: - make all check; - make release; - make stringappend_test; ./stringappend_test Reviewers: haobo, emayanke Reviewed By: haobo CC: leveldb, kailiu Differential Revision: https://reviews.facebook.net/D12381	2013-08-20 13:35:28 -07:00
Mayank Agarwal	8a3547d38e	API for getting archived log files Summary: Also expanded class LogFile to have startSequence and FileSize and exposed it publicly Test Plan: make all check Reviewers: dhruba, haobo Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D12087	2013-08-19 13:37:04 -07:00
Mayank Agarwal	387ac0f1e1	Expose statistic for sequence number and implement setTickerCount Summary: statistic for sequence number is needed by wormhole. setTickerCount is demanded for this statistic. I can't simply recordTick(max_sequence) when db recovers because the statistic object is owned by client and may/may not be reset during reopen. Eg. statistic is reset in mcrocksdb whereas it is not in db_stress. Therefore it is best to go with setTickerCount Test Plan: ./db_stress ... --statistics=1 and observed expected sequence number Reviewers: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D12327	2013-08-15 23:00:20 -07:00
Xing Jin	0a5afd1afc	Minor fix to current codes Summary: Minor fix to current codes, including: coding style, output format, comments. No major logic change. There are only 2 real changes, please see my inline comments. Test Plan: make all check Reviewers: haobo, dhruba, emayanke Differential Revision: https://reviews.facebook.net/D12297	2013-08-14 23:03:57 -07:00
Mayank Agarwal	f1bf169484	Counter for merge failure Summary: With Merge returning bool, it can keep failing silently (e.g. while failing to fetch timestamp in TTL). We need to detect this through a rocksdb counter which can get bumped whenever Merge returns false. This will also be super-useful for the mcrocksdb-counter service where Merge may fail. Added a counter NUMBER_MERGE_FAILURES and appropriately updated db/merge_helper.cc I felt that it would be better to directly add counter-bumping in Merge as a default function of MergeOperator class but user should not be aware of this, so this approach seems better to me. Test Plan: make all check Reviewers: dnicholas, haobo, dhruba, vamsi CC: leveldb Differential Revision: https://reviews.facebook.net/D12129	2013-08-13 14:25:42 -07:00
Tyler Harter	f5f1842282	Prefix filters for scans (v4) Summary: Similar to v2 (db and table code understands prefixes), but use ReadOptions as in v3. Also, make the CreateFilter code faster and cleaner. Test Plan: make db_test; export LEVELDB_TESTS=PrefixScan; ./db_test Reviewers: dhruba Reviewed By: dhruba CC: haobo, emayanke Differential Revision: https://reviews.facebook.net/D12027	2013-08-13 14:04:56 -07:00
sumeet	3b81df34bd	Separate compaction filter for each compaction Summary: If we have same compaction filter for each compaction, application cannot know about the different compaction processes. Later on, we can put in more details in compaction filter for the application to consume and use it according to its needs. For e.g. In the universal compaction, we have a compaction process involving all the files while others don't involve all the files. Applications may want to collect some stats only during full compaction. Test Plan: run existing unit tests Reviewers: haobo, dhruba Reviewed By: dhruba CC: xinyaohu, leveldb Differential Revision: https://reviews.facebook.net/D12057	2013-08-13 10:56:20 -07:00
Dhruba Borthakur	93d77a27d2	Universal Compaction should keep DeleteMarkers unless it is the earliest file. Summary: The pre-existing code was purging a DeleteMarker if that key did not exist in deeper levels. But in the Universal Compaction Style, all files are in Level0. For compaction runs that did not include the earliest file, we were erroneously purging the DeleteMarkers. The fix is to purge DeleteMarkers only if the compaction includes the earliest file. Test Plan: DBTest.Randomized triggers this code path. Differential Revision: https://reviews.facebook.net/D12081	2013-08-09 14:03:57 -07:00
Xing Jin	8ae905ed63	Fix unit tests for universal compaction (step 2) Summary: Continue fixing existing unit tests for universal compaction. I have tried to apply universal compaction to all unit tests that haven't called ChangeOptions(). I left a few which are either apparently not applicable to universal compaction (because they check files/keys/values at level 1 or above levels), or apparently not related to compaction (e.g., open a file, open a db). I also add a new unit test for universal compaction. Good news is I didn't see any bugs during this round. Test Plan: Ran "make all check" yesterday. Has rebased and is rerunning Reviewers: haobo, dhruba Differential Revision: https://reviews.facebook.net/D12135	2013-08-09 13:35:44 -07:00
Xing Jin	17b8f786a3	Fix unit tests/bugs for universal compaction (first step) Summary: This is the first step to fix unit tests and bugs for universal compaction. I added universal compaction option to ChangeOptions(), and fixed all unit tests calling ChangeOptions(). Some of these tests obviously assume more than 1 level and check file number/values in level 1 or above levels. I set kSkipUniversalCompaction for these tests. The major bug I found is manual compaction with universal compaction never stops. I have put a fix for it. I have also set universal compaction as the default compaction and found at least 20+ unit tests failing. I haven't looked into the details. The next step is to check all unit tests without calling ChangeOptions(). Test Plan: make all check Reviewers: dhruba, haobo Differential Revision: https://reviews.facebook.net/D12051	2013-08-07 14:05:44 -07:00
Dhruba Borthakur	f5fa26b6a9	Merge branch 'performance' of github.com:facebook/rocksdb into performance Conflicts: db/builder.cc db/db_impl.cc db/version_set.cc include/leveldb/statistics.h	2013-08-07 11:58:06 -07:00
Deon Nicholas	c2d7826ced	[RocksDB] [MergeOperator] The new Merge Interface! Uses merge sequences. Summary: Here are the major changes to the Merge Interface. It has been expanded to handle cases where the MergeOperator is not associative. It does so by stacking up merge operations while scanning through the key history (i.e.: during Get() or Compaction), until a valid Put/Delete/end-of-history is encountered; it then applies all of the merge operations in the correct sequence starting with the base/sentinel value. I have also introduced an "AssociativeMerge" function which allows the user to take advantage of associative merge operations (such as in the case of counters). The implementation will always attempt to merge the operations/operands themselves together when they are encountered, and will resort to the "stacking" method if and only if the "associative-merge" fails. This implementation is conjectured to allow MergeOperator to handle the general case, while still providing the user with the ability to take advantage of certain efficiencies in their own merge-operator / data-structure. NOTE: This is a preliminary diff. This must still go through a lot of review, revision, and testing. Feedback welcome! Test Plan: -This is a preliminary diff. I have only just begun testing/debugging it. -I will be testing this with the existing MergeOperator use-cases and unit-tests (counters, string-append, and redis-lists) -I will be "desk-checking" and walking through the code with the help of gdb. -I will find a way of stress-testing the new interface / implementation using db_bench, db_test, merge_test, and/or db_stress. -I will ensure that my tests cover all cases: Get-Memtable, Get-Immutable-Memtable, Get-from-Disk, Iterator-Range-Scan, Flush-Memtable-to-L0, Compaction-L0-L1, Compaction-Ln-L(n+1), Put/Delete found, Put/Delete not-found, end-of-history, end-of-file, etc. -A lot of feedback from the reviewers. Reviewers: haobo, dhruba, zshao, emayanke Reviewed By: haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D11499	2013-08-05 20:14:32 -07:00
Jim Paton	8e792e5896	Add soft_rate_limit stats Summary: This diff adds histogram stats for soft_rate_limit stalls. It also renames the old rate_limit stats to hard_rate_limit. Test Plan: make -j32 check Reviewers: dhruba, haobo, MarkCallaghan Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D12021	2013-08-05 18:45:23 -07:00


226 Commits