rocksdb

Author	SHA1	Message	Date
Kai Liu	9996e2d21c	Merge pull request #61 from Yancey1989/master fix compile warning	2014-01-10 10:28:50 -08:00
Yancey	afdd2d1a46	fix compile warning	2014-01-10 17:56:35 +08:00
Siying Dong	237a3da677	StopWatch not to get time if it is created for statistics and it is disabled Summary: Currently, even if statistics is not enabled, StopWatch only for the stats still gets the time of the day, which is wasteful. This patch adds a new option to StopWatch to disable this get in this case. Test Plan: make all check Reviewers: dhruba, haobo, igor CC: leveldb Differential Revision: https://reviews.facebook.net/D14703 Conflicts: db/db_impl.cc	2014-01-09 17:39:48 -08:00
Siying Dong	424a524ac9	[Performance Branch] A Hashed Linked List Based Mem Table Summary: Implement a mem table, in which keys are hashed based on prefixes. In each bucket, entries are organized in a sorted linked list. It has the same thread safety guarantee as skip list. The motivation is to optimize memory usage for the case that prefix hashing is primary way of seeking to the entry. Compared to hash skip list implementation, this implementation is more memory efficient, but inside each bucket, search is always linear. The target scenario is that there are only very limited number of records in each hash bucket. Test Plan: Add a test case in db_test Reviewers: haobo, kailiu, dhruba Reviewed By: haobo CC: igor, nkg-, leveldb Differential Revision: https://reviews.facebook.net/D14979	2014-01-09 16:19:11 -08:00
Igor Canadi	cb37ddf229	Feature requests for BackupableDB Summary: This diff introduces some features that were requested by two internal customers: * Ability for backups not to share table files, because we can't guarantee that equal filename means equal content across replicas * Ability for two threads to call EnableFileDeletions() and DisableFileDeletions() * Ability to stop backup from another thread and not slow down the DB close * Copy the files to the temporary folder first and then atomically rename Test Plan: Added some tests to backupable_db_test Reviewers: dhruba, sanketh, muthu, sdong, haobo Reviewed By: haobo CC: leveldb, sanketh, muthu Differential Revision: https://reviews.facebook.net/D14769	2014-01-09 12:24:28 -08:00
Igor Canadi	d0406675c2	readwhilewriting benchmark Summary: Added readwhilewriting benchmark to our regression tests. Changed block cache shards from 16 to 64, as Mark found that cache mutex contention is a big bottleneck. Test Plan: Ran it. Reviewers: dhruba, haobo, MarkCallaghan, xjin Reviewed By: MarkCallaghan CC: leveldb Differential Revision: https://reviews.facebook.net/D15075	2014-01-08 17:44:58 -08:00
Siying Dong	5575316350	StopWatch not to get time if it is created for statistics and it is disabled Summary: Currently, even if statistics is not enabled, StopWatch only for the stats still gets the time of the day, which is wasteful. This patch adds a new option to StopWatch to disable this get in this case. Test Plan: make all check Reviewers: dhruba, haobo, igor CC: leveldb Differential Revision: https://reviews.facebook.net/D14703	2014-01-08 16:05:36 -08:00
kailiu	12b6d2b839	Separate the aligned and unaligned memory allocation Summary: Use two vectors for different types of memory allocation. Test Plan: run all unit tests. Reviewers: haobo, sdong Reviewed By: haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D15027	2014-01-08 15:11:42 -08:00
Mark Callaghan	50994bf699	Don't always compress L0 files written by memtable flush Summary: Code was always compressing L0 files written by a memtable flush when compression was enabled. Now this is done when min_level_to_compress=0 for leveled compaction and when universal_compaction_size_percent=-1 for universal compaction. Task ID: #3416472 Blame Rev: Test Plan: ran db_bench with compression options Revert Plan: Database Impact: Memcache Impact: Other Notes: EImportant: - begin PUBLIC platform impact section - Bugzilla: # - end platform impact - Reviewers: dhruba, igor, sdong Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D14757	2014-01-07 21:50:26 -08:00
Igor Canadi	a45b7d83ba	Merge pull request #59 from mlin/more-c-bindings C API: add rocksdb_env_set_high_priority_background_threads	2014-01-07 16:33:03 -08:00
Igor Canadi	17a222670b	Merge branch 'master' into performance	2014-01-07 11:04:21 -08:00
Mike Lin	4c75e21c20	Eliminate stdout message when launching a posix thread. This seems out of place as it's the only time RocksDB prints to stdout in the normal course of operations. Thread IDs can still be retrieved from the LOG file: cut -d ' ' -f2 LOG | sort | uniq | egrep -x '[0-9a-f]+'	2014-01-07 10:44:02 -08:00
Tomislav Novak	9f690ec62c	Fix a deadlock in CompactRange() Summary: The way DBImpl::TEST_CompactRange() throttles down the number of bg compactions can cause it to deadlock when CompactRange() is called concurrently from multiple threads. Imagine the following scenario with only two threads (max_background_compactions is 10 and bg_compaction_scheduled_ is initially 0): 1. Thread #1 increments bg_compaction_scheduled_ (to LargeNumber), sets bg_compaction_scheduled_ to 9 (newvalue), schedules the compaction (bg_compaction_scheduled_ is now 10) and waits for it to complete. 2. Thread #2 calls TEST_CompactRange(), increments bg_compaction_scheduled_ (now LargeNumber + 10) and waits on a cv for bg_compaction_scheduled_ to drop to LargeNumber. 3. BG thread completes the first manual compaction, decrements bg_compaction_scheduled_ and wakes up all threads waiting on bg_cv_. Thread #1 runs, increments bg_compaction_scheduled_ by LargeNumber again (now 2*LargeNumber + 9). Since that's more than LargeNumber + newvalue, thread #2 also goes to sleep (waiting on bg_cv_), without resetting bg_compaction_scheduled_. This diff attempts to address the problem by introducing a new counter bg_manual_only_ (when positive, MaybeScheduleFlushOrCompaction() will only schedule manual compactions). Test Plan: I could pretty much consistently reproduce the deadlock with a program that calls CompactRange(nullptr, nullptr) immediately after Write() from multiple threads. This no longer happens with this patch. Tests (make check) pass. Reviewers: dhruba, igor, sdong, haobo Reviewed By: igor CC: leveldb Differential Revision: https://reviews.facebook.net/D14799	2014-01-07 10:37:34 -08:00
kailiu	c370f5597a	Revert change in `8f6e319`.	2014-01-06 11:53:19 -08:00
Kai Liu	be271c3357	Merge pull request #56 from sepeth/refactor-detect-version Refactor build_tools/build_detect_version	2014-01-06 11:50:24 -08:00
kailiu	7e70ff63d6	Fix issue #57	2014-01-06 11:11:19 -08:00
Doğan Çeçen	d800dc567a	Refactor build_tools/build_detect_version	2014-01-06 08:44:43 +02:00
Kai Liu	8f6e31951e	Add a hack to build_detect_platform so it works in all types of fb-servers	2014-01-04 23:47:44 -08:00
Kai Liu	8c4eb71b5d	Fix one more valgrind error in table_test	2014-01-03 18:27:33 -08:00
Kai Liu	5e7d5629c7	Fix the valgrind issues	2014-01-03 11:48:31 -08:00
Kai Liu	774ed89c24	Replace vector with autovector Summary: this diff only replaces the cases where we frequently create vectors with a small number of entries. This diff doesn't aim to improve performance of a specific area, but serves as a small-scale test of autovector to see how it works in real life. Test Plan: make check I also ran the performance tests, however there is no performance gain/loss. All performance numbers are pretty much the same before/after the change. Reviewers: dhruba, haobo, sdong, igor CC: leveldb Differential Revision: https://reviews.facebook.net/D14985	2014-01-02 16:43:35 -08:00
kailiu	e72aa37cc5	Merge branch 'master' into performance Conflicts: db/table_cache.cc	2014-01-02 16:34:59 -08:00
kailiu	476416c27c	Some minor refactoring on the code Summary: I made some cleanup while reading the source code in `db`. Most changes are about style, naming or C++ 11 new features. Test Plan: ran `make check` Reviewers: haobo, dhruba, sdong CC: leveldb Differential Revision: https://reviews.facebook.net/D15009	2014-01-02 16:32:31 -08:00
Kai Liu	463086bce8	Add clang-format rules Summary: The rule file is forked from that in Facebook's repo. I'll add format file for now and team members can tune the rules later. In this patch, I made only two changes in order to be consistent with existing coding style `SpacesBeforeTrailingComments: 2` `ColumnLimit: 80` Test Plan: N/A Reviewers: igor CC: leveldb Differential Revision: https://reviews.facebook.net/D15015	2014-01-02 14:32:15 -08:00
Kai Liu	46950597d0	Automate the preparation step for a new release Summary: Added a script that prepares the repo for facebook's new rocksdb release, which will automatically do some necessary work to make sure this repo is ready for 3rdparty release. Test Plan: Run this script and observed: * new version was created (both in local and remote repo) as a git tag. * build_version.cc was updated * build_detect_platform was changed so that it won't create any new change. Reviewers: haobo, dhruba, sdong, igor CC: leveldb Differential Revision: https://reviews.facebook.net/D15003	2014-01-02 11:35:33 -08:00
kailiu	9281a826f1	Hotfix the bug in table cache's GetSliceForFileNumber Forgot to fix this problem in master branch. Already fixed it in performance branch.	2014-01-02 10:30:42 -08:00
Igor Canadi	b60c14f6ee	Support multi-threaded DisableFileDeletions() and EnableFileDeletions() Summary: We don't want two threads to clash if they concurrently call DisableFileDeletions() and EnableFileDeletions(). I'm adding a counter that will enable file deletions only after all DisableFileDeletions() calls have been negated with EnableFileDeletions(). However, we also don't want to break the old behavior, so I added a parameter force to EnableFileDeletions(). If force is true, we will still enable file deletions after every call to EnableFileDeletions(), which is what is happening now. Test Plan: make check Reviewers: dhruba, haobo, sanketh Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D14781	2014-01-02 03:33:42 -08:00
Igor Canadi	345fb94d26	moving autovector_test after db_test	2014-01-02 03:30:29 -08:00
Igor Canadi	52ea1be90a	Add -DROCKSDB_FALLOCATE_PRESENT to fbcode build	2014-01-02 02:00:04 -08:00
Igor Canadi	2b3aab3ee6	Merge pull request #48 from dyu/master fix build bug	2014-01-02 00:27:18 -08:00
Mike Lin	4b1d049236	C API: add rocksdb_env_set_high_priority_background_threads	2013-12-31 15:14:18 -08:00
Kai Liu	fe030bd1ca	update the latest version in README.fb to 2.7	2013-12-30 16:16:24 -08:00
Kai Liu	5a20744a6a	Simplify build_tools/build_detect_version	2013-12-30 16:14:55 -08:00
Kai Liu	1795397bf0	Update README.fb Update the latest version number.	2013-12-30 14:53:56 -08:00
dyu	e842b99fc5	docs for shared library builds	2013-12-30 21:34:45 +08:00
dyu	a6b476a2ac	tweak build bug fix	2013-12-30 21:33:52 +08:00
kailiu	f1cec73a76	Merge branch 'master' into performance Conflicts: db/db_impl.cc db/db_test.cc db/memtable.cc db/version_set.cc include/rocksdb/statistics.h	2013-12-27 12:23:17 -08:00
dyu	9d4dc0da27	fix build bug from recent commit:`43c386b72e`	2013-12-27 15:19:31 +08:00
Siying Dong	a094f3b3b5	TableCache.FindTable() to avoid the mem copy of file number Summary: I'm not sure what's the purpose of encoding file number to a new buffer for looking up the table cache. It seems to be unnecessary to me. With this patch, we point the lookup key to the address of the int64 of the file number. Test Plan: make all check Reviewers: dhruba, haobo, igor, kailiu Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D14811	2013-12-26 16:57:07 -08:00
Siying Dong	18df47b79a	Avoid malloc in NotFound key status if no message is given. Summary: In some places we have NotFound status created with empty message, but it doesn't avoid a malloc. With this patch, the malloc is avoided for that case. The motivation of it is that I found in db_bench readrandom test when all keys are not existing, about 4% of the total running time is spent on malloc of Status, plus a similar amount of CPU spent on free of them, which is not necessary. Test Plan: make all check Reviewers: dhruba, haobo, igor Reviewed By: haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D14691	2013-12-26 16:23:10 -08:00
Kai Liu	b40c052bfa	Fix all the comparison issue in fb dev servers	2013-12-26 16:13:49 -08:00
kailiu	113a08c929	Fix [-Werror=sign-compare] in autovector_test	2013-12-26 15:47:07 -08:00
kailiu	079a21ba99	Fix the unused variable warning message in mac os	2013-12-26 15:12:30 -08:00
kailiu	c01676e46d	Implement autovector Summary: A vector that leverages pre-allocated stack-based array to achieve better performance for array with small amount of items. Test Plan: Added tests for both correctness and performance Here is the performance benchmark between vector and autovector Please note that in the test "Creation and Insertion Test", the test case were designed with the motivation described below: * no element inserted: internal array of std::vector may not really get initialize. * one element inserted: internal array of std::vector must have initialized. * kSize elements inserted. This shows the most time we'll spend if we keep everything in stack. * 2 * kSize elements inserted. The internal vector of autovector must have been initialized. Note: kSize is the capacity of autovector ===================================================== Creation and Insertion Test ===================================================== created 100000 vectors: each was inserted with 0 elements total time elapsed: 128000 (ns) created 100000 autovectors: each was inserted with 0 elements total time elapsed: 3641000 (ns) created 100000 VectorWithReserveSizes: each was inserted with 0 elements total time elapsed: 9896000 (ns) ----------------------------------- created 100000 vectors: each was inserted with 1 elements total time elapsed: 11089000 (ns) created 100000 autovectors: each was inserted with 1 elements total time elapsed: 5008000 (ns) created 100000 VectorWithReserveSizes: each was inserted with 1 elements total time elapsed: 24271000 (ns) ----------------------------------- created 100000 vectors: each was inserted with 4 elements total time elapsed: 39369000 (ns) created 100000 autovectors: each was inserted with 4 elements total time elapsed: 10121000 (ns) created 100000 VectorWithReserveSizes: each was inserted with 4 elements total time elapsed: 28473000 (ns) ----------------------------------- created 100000 vectors: each was inserted with 8 elements total time elapsed: 75013000 (ns) created 100000 autovectors: each was inserted with 8 elements total time elapsed: 18237000 (ns) created 100000 VectorWithReserveSizes: each was inserted with 8 elements total time elapsed: 42464000 (ns) ----------------------------------- created 100000 vectors: each was inserted with 16 elements total time elapsed: 102319000 (ns) created 100000 autovectors: each was inserted with 16 elements total time elapsed: 76724000 (ns) created 100000 VectorWithReserveSizes: each was inserted with 16 elements total time elapsed: 68285000 (ns) ----------------------------------- ===================================================== Sequence Access Test ===================================================== performed 100000 sequence access against vector size: 4 total time elapsed: 198000 (ns) performed 100000 sequence access against autovector size: 4 total time elapsed: 306000 (ns) ----------------------------------- performed 100000 sequence access against vector size: 8 total time elapsed: 565000 (ns) performed 100000 sequence access against autovector size: 8 total time elapsed: 512000 (ns) ----------------------------------- performed 100000 sequence access against vector size: 16 total time elapsed: 1076000 (ns) performed 100000 sequence access against autovector size: 16 total time elapsed: 1070000 (ns) ----------------------------------- Reviewers: dhruba, haobo, sdong, chip Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D14655	2013-12-26 15:03:47 -08:00
Kai Liu	5643ae1a3f	Merge pull request #32 from jamesgolick/master Only try to use fallocate if it's actually present on the system.	2013-12-26 14:20:22 -08:00
Dhruba Borthakur	71ddb117c8	Add a pointer to the engineering design discussion forum. Summary: Add a pointer to the engineering design discussion forum. Test Plan: Reviewers: CC: Task ID: # Blame Rev:	2013-12-23 12:19:18 -08:00
Haobo Xu	bf4a48ccb3	[RocksDB] [Performance Branch] Revert previous patch. Summary: The previous patch is wrong. rep_.resize(kHeader) just resets the header portion to zero, and should not cause a re-allocation if g++ does it right. I will go ahead and revert it. Test Plan: make check Reviewers: dhruba, sdong CC: leveldb Differential Revision: https://reviews.facebook.net/D14793	2013-12-20 18:20:06 -08:00
Haobo Xu	e94eea4527	[RocksDB] [Performance Branch] Minor fix, Remove string resize from WriteBatch::Clear Summary: tmp_batch_ will get re-allocated for every merged write batch because of the existing resize in WriteBatch::Clear. Note that in DBImpl::BuildBatchGroup, we have a hard coded upper limit of batch size 1<<20 = 1MB already. Test Plan: make check Reviewers: dhruba, sdong CC: leveldb Differential Revision: https://reviews.facebook.net/D14787	2013-12-20 16:29:05 -08:00
Siying Dong	abaf26266d	[RocksDB] [Performance Branch] Some Changes to PlainTable format Summary: Some changes to PlainTable format: (1) support variable key length (2) use user defined slice transformer to extract prefixes (3) Run some test cases against PlainTable in db_test and table_test Test Plan: test db_test Reviewers: haobo, kailiu CC: dhruba, igor, leveldb, nkg- Differential Revision: https://reviews.facebook.net/D14457	2013-12-20 12:08:35 -08:00
Igor Canadi	b26dc95628	Initialize sequence number in BatchResult - issue #39	2013-12-20 10:01:12 -08:00
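
The counting behaviour described in commit b60c14f6ee (multi-threaded DisableFileDeletions() and EnableFileDeletions()) can be illustrated with a minimal sketch. This is not the code from that diff: the class name `FileDeletionGate` and its method names are hypothetical, and the only semantics assumed are the ones stated in the commit message (each disable call must be negated by an enable call, unless `force` is true).

```cpp
#include <mutex>

// Hypothetical illustration of a disable counter; NOT RocksDB's DBImpl code.
class FileDeletionGate {
 public:
  // Each call adds one outstanding "disable" request.
  void Disable() {
    std::lock_guard<std::mutex> lock(mu_);
    ++disable_count_;
  }

  // With force == false, negates one earlier Disable() call.
  // With force == true, clears all outstanding requests (the old behaviour).
  // Returns true if file deletions are enabled after the call.
  bool Enable(bool force) {
    std::lock_guard<std::mutex> lock(mu_);
    if (force) {
      disable_count_ = 0;
    } else if (disable_count_ > 0) {
      --disable_count_;
    }
    return disable_count_ == 0;
  }

  // True when no Disable() request is outstanding.
  bool IsEnabled() const {
    std::lock_guard<std::mutex> lock(mu_);
    return disable_count_ == 0;
  }

 private:
  mutable std::mutex mu_;  // guards disable_count_
  int disable_count_ = 0;  // number of un-negated Disable() calls
};
```

In this sketch, two threads that each call `Disable()` keep deletions off until both have called `Enable(false)`, while a single `Enable(true)` turns deletions back on immediately.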


1259 Commits