rocksdb

Author	SHA1	Message	Date
Kai Liu	b37fda842d	Improve the comment for the shared library in Make file	2013-10-22 22:40:16 -07:00
Igor Canadi	30f1b97a06	Enable blobs to be fragmented Summary: I have implemented a FreeList version that supports fragmented blob chunks. Each block gets allocated and freed in FIFO order. Since the idea for the blocks to be big, we will not take a big hit of non-sequential IO. Free list is also faster, taking only O(k) size in both free and allocate instead of O(N) as before. See more info on the task: https://our.intern.facebook.com/intern/tasks/?t=2990558 Also, I'm taking Slice instead of const char * and size in Put function. Test Plan: unittests Reviewers: haobo, kailiu, dhruba, emayanke Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D13569	2013-10-22 17:44:00 -07:00
Kai Liu	70e87f7866	Update the latest rocksdb version	2013-10-22 14:49:44 -07:00
Mayank Agarwal	9b50106f9a	Dbid feature Summary: Create a new type of file on startup if it doesn't already exist called DBID. This will store a unique number generated from boost library's uuid header file. The use-case is to identify the case of a db losing all its data and coming back up either empty or from an image(backup/live replica's recovery) the key point to note is that DBID is not stored in a backup or db snapshot It's preferable to use Boost for uuid because: 1) A non-standard way of generating uuid is not good 2) /proc/sys/kernel/random/uuid generates a uuid but only on linux environments and the solution would not be clean 3) c++ doesn't have any direct way to get a uuid 4) Boost is a very good library that was already having linkage in rocksdb from third-party Note: I had to update the TOOLCHAIN_REV in build files to get latest verison of boost from third-party as the older version had a bug. I had to put Wno-uninitialized in Makefile because boost-1.51 has an unitialized variable and rocksdb would not comiple otherwise. Latet open-source for boost is 1.54 but is not there in third-party. I have notified the concerned people in fbcode about it. @kailiu : While releasing to third-party, an additional dependency will need to be created for boost in TARGETS file. I can help identify. Test Plan: Expand db_test to test 2 cases 1) Restarting db with Id file present - verify that no change to Id 2)Restarting db with Id file deleted - verify that a different Id is there after reopen Also run make all check Reviewers: dhruba, haobo, kailiu, sdong Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D13587	2013-10-22 12:23:34 -07:00
Mayank Agarwal	ae8e0770b4	Disallow transaction log iterator to skip sequences Summary: This is expected to solve the "gaps in transaction log iterator" problem. * After a lot of observations on the gaps on the sigmafio machines I found that it is due to a race between log reader and writer always. * So when we drop the wormhole subscription and refresh the iterator, the gaps are not there. * It is NOT due to some boundary or corner case left unattended in the iterator logic because I checked many instances of the gaps against their log files with ldb. The log files are NOT corrupted also. * The solution is to not allow the iterator to read incompletely written sequences and detect gaps inside itself and invalidate it which will cause the application to refresh the iterator normally and seek to the required sequence properly. * Thus, the iterator can at least guarantee that it will not give any gaps. Test Plan: * db_test based log iterator tests * db_repl_stress * testing on sigmafio setup to see gaps go away Reviewers: dhruba, haobo Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D13593	2013-10-22 11:45:35 -07:00
Igor Canadi	c674b42d52	Rephrasing the comment Summary: Per @haobo's request, rephrasing the comment for allocate Test Plan: It's a comment! Reviewers: haobo, kailiu Reviewed By: kailiu CC: leveldb Differential Revision: https://reviews.facebook.net/D13575	2013-10-21 10:23:56 -07:00
Kai Liu	43ee5e2b3a	Fix the valgrind error in newly added unittests for table stats Summary: Previous the newly added test called NewBloomFilter without releasing it at the end of the test, which resulted in memory leak and was detected by valgrind. Test Plan: Ran valgrind test.	2013-10-20 22:02:05 -07:00
Igor Canadi	bcc8557901	tmpfs does not support fallocate Summary: This caused Siying's unit test to fail. Test Plan: Unittest Reviewers: dhruba, kailiu, haobo Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D13539	2013-10-17 22:15:57 -07:00
Siying Dong	65428b0c0a	Fix Bug: iterator.Prev() or iterator.SeekToLast() might return the first element instead of the correct one Summary: Recent patch https://reviews.facebook.net/D11865 introduced a regression bug: DBIter::FindPrevUserEntry(), which is called by DBIter::Prev() (and also implicitly if calling iterator.SeekToLast()) might do issue a seek when having skipped too many entries. If the skipped entry just before the seek() is a delete, the saved key is erased so that it seeks to the front, so Prev() would return the first element. This patch fixes the bug by not doing seek() in DBIter::FindNextUserEntry() if saved key has been erased. Test Plan: Add a test DBTest.IterPrevMaxSkip which would fail without the patch and would pass with the change. Reviewers: dhruba, xjin, haobo Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D13557	2013-10-17 18:33:18 -07:00
Siying Dong	9edda37027	Universal Compaction to Have a Size Percentage Threshold To Decide Whether to Compress Summary: This patch adds a option for universal compaction to allow us to only compress output files if the files compacted previously did not yet reach a specified ratio, to save CPU costs in some cases. Compression is always skipped for flushing. This is because the size information is not easy to evaluate for flushing case. We can improve it later. Test Plan: add test DBTest.UniversalCompactionCompressRatio1 and DBTest.UniversalCompactionCompressRatio12 Reviewers: dhruba, haobo Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D13467	2013-10-17 13:33:39 -07:00
Kai Liu	aac44226a0	Add bloom filter to predefined table stats Summary: As title. Test Plan: Updated the unit tests to make sure new statistic is correctly written/read. Reviewers: dhruba, haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D13497	2013-10-17 11:43:06 -07:00
Vamsi Ponnekanti	6731997f64	[ldb compact is not allowing ttl flag] Summary: Allow ttl flag Test Plan: tested on my database that has merge operations and ttl Revert Plan: OK Task ID: #3038186 Reviewers: emayanke, dhruba, haobo Reviewed By: emayanke CC: leveldb Differential Revision: https://reviews.facebook.net/D13503	2013-10-16 18:06:54 -07:00
Dhruba Borthakur	9cd221094c	Add appropriate LICENSE and Copyright message. Summary: Add appropriate LICENSE and Copyright message. Test Plan: make check Reviewers: CC: Task ID: # Blame Rev:	2013-10-16 17:48:41 -07:00
Igor Canadi	fc4616d898	External Value Store Summary: Developing a capability for storing values on external backing file(s). This is just a highly unoptimized first pass - supports: 1) Allocating some portion of external file to be used to store value 2) Freeing the range, enabling it to be reused by other values As next steps, I plan to: 1) Create some kind of stress testing. Once I can measure stuff, I can focus on optimizing. 2) Optimize locking. 3) Optimize freelist data structure. Currently we have O(n) for both freeing and allocation. 4) Figure out how to do recovery. Test Plan: Created a unit test. Reviewers: dhruba, haobo, kailiu Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D13389	2013-10-16 17:33:49 -07:00
Kai Liu	0f31843c15	Fix the patent format Summary: Formatted the PATENT file so that it's easier to read. Test Plan: Reviewers: CC: Task ID: # Blame Rev:	2013-10-16 15:37:32 -07:00
Siying Dong	073cbfc8f0	Enable background flush thread by default and fix issues related to it Summary: Enable background flush thread in this patch and fix unit tests with: (1) After background flush, schedule a background compaction if condition satisfied; (2) Fix a bug that if universal compaction is enabled and number of levels are set to be 0, compaction will not be automatically triggered (3) Fix unit tests to wait for compaction to finish instead of flush, before checking the compaction results. Test Plan: pass all unit tests Reviewers: haobo, xjin, dhruba Reviewed By: haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D13461	2013-10-16 13:32:53 -07:00
Dhruba Borthakur	cb5b2baf18	Added Patent information to the source code repository. Summary: Added Patent information to the source code repository. Test Plan: Reviewers: CC: Task ID: # Blame Rev:	2013-10-15 21:08:47 -07:00
Dhruba Borthakur	b825df81e2	Fix error in previous commit of 'ftruncate' to 'fallocate'. Summary: Fix error in previous commit of 'ftruncate' to 'fallocate'. Test Plan: Reviewers: CC: Task ID: # Blame Rev:	2013-10-15 13:57:29 -07:00
Dhruba Borthakur	8457b74c26	Fix Unit test when run on tmpfs Summary: tmpfs might not support fallocate(). Fix unit test so that this does not cause a unit test to fail. Test Plan: ./env_test Reviewers: emayanke, igor, kailiu Reviewed By: kailiu CC: leveldb Differential Revision: https://reviews.facebook.net/D13455	2013-10-15 12:03:09 -07:00
Mayank Agarwal	da2fd001a6	Fix rocksdb->levledb BytewiseComparator and inverted order of error in db/version_set.cc Summary: This is needed to make existing dbs be able to open and also because BytewiseComparator was not changed since leveldb. The inverted order in the error message caused confusion prebiously Test Plan: make; open existing db Reviewers: leveldb, dhruba Reviewed By: dhruba Differential Revision: https://reviews.facebook.net/D13449	2013-10-14 18:16:54 -07:00
Mayank Agarwal	fe3713961e	Features in Transaction log iterator Summary: * Logstore requests a valid change of reutrning an empty iterator and not an error in case of no log files. * Changed the code to return the writebatch containing the sequence number requested from GetupdatesSince even if it lies in the middle. Earlier we used to return the next writebatch,. This also allows me oto guarantee that no files played upon by the iterator are redundant. I mean the starting log file has at least a sequence number >= the sequence number requested form GetupdatesSince. * Cleaned up redundant logic in Iterator::Next and made a new function SeekToStartSequence for greater readability and maintainibilty. * Modified a test in db_test accordingly Please check the logic carefully and suggest improvements. I have a separate patch out for more improvements like restricting reader to read till written sequences. Test Plan: * transaction log iterator tests in db_test, * db_repl_stress. * rocks_log_iterator_test in fbcode/wormhole/rocksdb/test - 2 tests thriving on hacks till now can get simplified * testing on the shadow setup for sigma with replication Reviewers: dhruba, haobo, kailiu, sdong Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D13437	2013-10-14 18:16:21 -07:00
Kai Liu	86ef6c3f74	Add statistics to sst file Summary: So far we only have key/value pairs as well as bloom filter stored in the sst file. It will be great if we are able to store more metadata about this table itself, for example, the entry size, bloom filter name, etc. This diff is the first step of this effort. It allows table to keep the basic statistics mentioned in http://fburl.com/14995441, as well as allowing writing user-collected stats to stats block. After this diff, we will figure out the interface of how to allow user to collect their interested statistics. Test Plan: 1. Added several unit tests. 2. Ran `make check` to ensure it doesn't break other tests. Reviewers: dhruba, haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D13419	2013-10-14 15:56:13 -07:00
Siying Dong	88f2f89068	Change Function names from Compaction->Flush When they really mean Flush Summary: When I debug the unit test failures when enabling background flush thread, I feel the function names can be made clearer for people to understand. Also, if the names are fixed, in many places, some tests' bugs are obvious (and some of those tests are failing). This patch is to clean it up for future maintenance. Test Plan: Run test suites. Reviewers: haobo, dhruba, xjin Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D13431	2013-10-14 15:12:15 -07:00
sdong	f8509653ba	LRUCache to try to clean entries not referenced first. Summary: With this patch, when LRUCache.Insert() is called and the cache is full, it will first try to free up entries whose reference counter is 1 (would become 0 after remo\ ving from the cache). We do it in two passes, in the first pass, we only try to release those unreferenced entries. If we cannot free enough space after traversing t\ he first remove_scan_cnt_ entries, we start from the beginning again and remove those entries being used. Test Plan: add two unit tests to cover the codes Reviewers: dhruba, haobo, emayanke Reviewed By: emayanke CC: leveldb, emayanke, xjin Differential Revision: https://reviews.facebook.net/D13377	2013-10-11 09:26:21 -07:00
Dhruba Borthakur	c0ce562c32	Bad nfs file checked in a long time back. Summary: Bad nfs file checked in a long time back. Test Plan: Reviewers: CC: Task ID: # Blame Rev:	2013-10-10 22:00:49 -07:00
Mayank Agarwal	a8b4a69de0	Fixing error in ParseFileName causing DestroyDB to fail on archive directory Summary: This careless error was causing ASSERT_OK(DestroyDB) to fail in db_test. Basically .. was being returned as a child of db/archive and ParseFileName returned false on that, but 'type' was set to LogFile from earlier and not reset. The return of ParseFileName was not being checked to delete the log file or not. Test Plan: make all check Reviewers: dhruba, haobo, xjin, kailiu, nkg- Reviewed By: nkg- CC: leveldb Differential Revision: https://reviews.facebook.net/D13413	2013-10-10 18:18:31 -07:00
Siying Dong	40a1e31fa5	Minor: Fix a lint error in cache_test.cc Summary: As title. Fix an lint error: Lint: CppLint Error Single-argument constructor 'Value(int v)' may inadvertently be used as a type conversion constructor. Prefix the function with the 'explicit' keyword to avoid this, or add an /* implicit */ comment to suppress this warning. Test Plan: N/A Reviewers: emayanke, haobo, dhruba Reviewed By: emayanke CC: leveldb Differential Revision: https://reviews.facebook.net/D13401	2013-10-10 13:47:25 -07:00
Igor Canadi	d2ca2bd183	Fixing build failure Summary: virtual NewRandomRWFile is not implemented on EnvHdfs, causing build failure. Test Plan: make clean; make all check Reviewers: dhruba, haobo, kailiu Reviewed By: kailiu CC: leveldb Differential Revision: https://reviews.facebook.net/D13383	2013-10-10 01:01:16 -07:00
Igor Canadi	d0beadd456	Env class that can randomly read and write Summary: I have implemented basic simple use case that I need for External Value Store I'm working on. There is a potential for making this prettier by refactoring/combining WritableFile and RandomAccessFile, avoiding some copypasta. However, I decided to implement just the basic functionality, so I can continue working on the other diff. Test Plan: Added a unittest Reviewers: dhruba, haobo, kailiu Reviewed By: haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D13365	2013-10-10 00:03:08 -07:00
Dhruba Borthakur	7ac3c796f6	Add draft logo. Summary: Add draft logo in jpg format. Test Plan: Reviewers: CC: Task ID: # Blame Rev:	2013-10-09 22:55:30 -07:00
Dhruba Borthakur	6d5f6a4b1a	A bare-bones rocksdb logo. Summary: A hand-crafted rocksdb logo. Test Plan: Reviewers: CC: Task ID: # Blame Rev:	2013-10-09 20:32:19 -07:00
Dhruba Borthakur	3c37955a2f	Remove obsolete namespace mappings. Summary: The previous release 2.4 had a mapping to alias the older namespace to rocksdb. This mapping is not needed in the new release. Test Plan: make check make release Reviewers: emayanke Reviewed By: emayanke CC: leveldb Differential Revision: https://reviews.facebook.net/D13359	2013-10-08 22:53:38 -07:00
Naman Gupta	cbf4a06427	Add option for storing transaction logs in a separate dir Summary: In some cases, you might not want to store the data log (write ahead log) files in the same dir as the sst files. An example use case is leaf, which stores sst files in tmpfs. And would like to save the log files in a separate dir (disk) to save memory. Test Plan: make all. Ran db_test test. A few test failing. P2785018. If you guys don't see an obvious problem with the code, maybe somebody from the rocksdb team could help me debug the issue here. Running this on leaf worked well. I could see logs stored on disk, and deleted appropriately after compactions. Obviously this is only one set of options. The unit tests cover different options. Seems like I'm missing some edge cases. Reviewers: dhruba, haobo, leveldb CC: xinyaohu, sumeet Differential Revision: https://reviews.facebook.net/D13239	2013-10-08 17:40:27 -07:00
Naman Gupta	116071411b	Make db_test more robust Summary: While working on D13239, I noticed that the same options are not used for opening and destroying at db. So adding that. Also added asserts for successful DestroyDB calls. Test Plan: Ran unit tests. Atleast 1 unit test is failing. They failures are a result of some past logic change. I'm not really planning to fix those. But I would like to check this in. And hopefully the respective unit test owners can fix the broken tests Reviewers: leveldb, haobo CC: xinyaohu, sumeet, dhruba Differential Revision: https://reviews.facebook.net/D13329	2013-10-08 13:19:31 -07:00
Kai Liu	1f8ade6bd6	Fix a bug in table builder Summary: In talbe.cc, when reading the metablock, it uses BytewiseComparator(); However in table_builder.cc, we use r->options.comparator. After tracing the creation of r->options.comparator, I found this comparator is an InternalKeyComparator, which wraps the user defined comparator(details can be found in DBImpl::SanitizeOptions(). I encountered this problem when adding metadata about "bloom filter" before. With different comparator, we may fail to do the binary sort. Current code works well since there is only one entry in meta block. Test Plan: make all check I've also tested this change in https://reviews.facebook.net/D8283 before. Reviewers: dhruba, haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D13335	2013-10-07 18:53:19 -07:00
Igor Canadi	fa46ddb41f	Move delete and free outside of crtical section Summary: Split Unref into two parts -> cheap and expensive. Try to call expensive Unref outside of critical section to decrease lock contention. Test Plan: unittests Reviewers: dhruba, haobo Reviewed By: dhruba CC: leveldb, kailiu Differential Revision: https://reviews.facebook.net/D13299	2013-10-07 15:37:40 -07:00
Dhruba Borthakur	1a8c1b0817	Unit test failure in DBTest.NumImmutableMemTable. Summary: Previous patch introduced a unit test failure in DBTest.NumImmutableMemTable because of change in property names. Test Plan: Reviewers: CC: Task ID: # Blame Rev:	2013-10-06 01:12:02 -07:00
Dhruba Borthakur	4463b11cad	Migrate names of properties from 'leveldb' prefix to 'rocksdb' prefix. Summary: Migrate names of properties from 'leveldb' prefix to 'rocksdb' prefix. Test Plan: make check Reviewers: emayanke, haobo Reviewed By: haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D13311	2013-10-06 00:14:26 -07:00
Haobo Xu	bf89edf78b	[RocksDB] Added a property "leveldb.num-immutable-mem-table" so that Flush can be called without blocking, and application still has a way to check when it's done also without blocking. Summary: as title Test Plan: DBTest.NumImmutableMemTable Reviewers: dhruba Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D13305	2013-10-05 11:54:08 -07:00
Dhruba Borthakur	0a9f873f4b	Removed scribe, thrift and java modules. Summary: Removed scribe, thrift and java modules. Test Plan: make release make check Reviewers: emayanke Reviewed By: emayanke CC: leveldb Differential Revision: https://reviews.facebook.net/D13293	2013-10-04 15:36:00 -07:00
Mayank Agarwal	aad2110823	Updating README.fb to have newest verison 2.4 Summary: Test Plan: visual	2013-10-04 12:17:44 -07:00
Dhruba Borthakur	a143ef9b38	Change namespace from leveldb to rocksdb Summary: Change namespace from leveldb to rocksdb. This allows a single application to link in open-source leveldb code as well as rocksdb code into the same process. Test Plan: compile rocksdb Reviewers: emayanke Reviewed By: emayanke CC: leveldb Differential Revision: https://reviews.facebook.net/D13287	2013-10-04 11:59:26 -07:00
Mayank Agarwal	b3ed08129b	Add a statistic to count the number of calls to GetUpdatesSince Summary: This is useful to keep track of refreshes in transaction log iterator Test Plan: make; db_stress --statistics=1 shows it Reviewers: dhruba, haobo Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D13281	2013-10-04 10:47:20 -07:00
Mayank Agarwal	854d236361	Add backward compatible option in GetLiveFiles to choose whether to not Flush first Summary: As explained in comments in GetLiveFiles in db.h, this option will cause flush to be skipped in GetLiveFiles because some use-cases use GetSortedWalFiles after GetLiveFiles to generate more complete snapshots. Using GetSortedWalFiles after GetLiveFiles allows us to not Flush in GetLiveFiles first because wals have everything. Note: file deletions will be disabled before calling GLF or GSWF so live logs will not move to archive logs or get delted. Note: Manifest file is truncated to a proper value in GLF, so it will always reply from the proper wal files on a restart Test Plan: make Reviewers: dhruba, haobo Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D13257	2013-10-04 10:20:10 -07:00
Haobo Xu	200c05a23f	[RocksDB] Still honor DisableFileDeletions when purge_log_after_memtable_flush is on Summary: as title Test Plan: make check Reviewers: emayanke Reviewed By: emayanke CC: leveldb Differential Revision: https://reviews.facebook.net/D13263	2013-10-03 16:12:43 -07:00
Haobo Xu	fa798e9e28	[Rocksdb] Submit mem table flush job in a different thread pool Summary: As title. This is just a quick hack and not ready for commit. fails a lot of unit test. I will test/debug it directly in ViewState shadow . Test Plan: Try it in shadow test. Reviewers: dhruba, xjin CC: leveldb Differential Revision: https://reviews.facebook.net/D12933	2013-10-03 14:37:19 -07:00
Xing Jin	658a3ce2fa	Fix SIGSEGV issue in universal compaction Summary: We saw SIGSEGV when set options.num_levels=1 in universal compaction style. Dug into this issue for a while, and finally found the root cause (thank Haobo for discussion). Test Plan: Add new unit test. It throws SIGSEGV without this change. Also run "make all check". Reviewers: haobo, dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D13251	2013-10-02 17:33:31 -07:00
Mayank Agarwal	6b34021fc2	Triggering verify for gets also Summary: Will use iterators to verify keys in the db for half of its keys and Gets for the other half. Test Plan: ./db_stress --max_key=1000 --ops_per_thread=100 Reviewers: dhruba, haobo Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D13227	2013-10-02 11:22:17 -07:00
Haobo Xu	71046971f0	[RocksDB] Added perf counters to track skipped internal keys during iteration Summary: as title. unit test not polished. this is for a quick live test Test Plan: live Reviewers: dhruba Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D13221	2013-10-02 10:48:41 -07:00
Kai Liu	861f6e48e4	Remove the hard-coded enum value in statistics.h Summary: I am planning to add more to statistics classes but found current way of using enum is very verbose and unnecessarily increase the difficulity of adding new statistics. In this diff I removed the code that explicitly specifies the value of each enum entry. This will help us easily add new statistic items more conveniently without manually adding the value of other enum entries by one. Test Plan: make; make check; Reviewers: haobo, dhruba, xjin, emayanke, vamsi CC: leveldb Differential Revision: https://reviews.facebook.net/D13197	2013-10-01 14:14:06 -07:00

1 2 3 4 5 ...

740 Commits