Commit Graph

730 Commits

Author SHA1 Message Date
Kai Liu
aac44226a0 Add bloom filter to predefined table stats
Summary: As title.

Test Plan: Updated the unit tests to make sure new statistic is correctly written/read.

Reviewers: dhruba, haobo

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13497
2013-10-17 11:43:06 -07:00
Vamsi Ponnekanti
6731997f64 [ldb compact is not allowing ttl flag]
Summary: Allow ttl flag

Test Plan:
tested on my database that has merge operations and ttl

Revert Plan: OK

Task ID: #3038186

Reviewers: emayanke, dhruba, haobo

Reviewed By: emayanke

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13503
2013-10-16 18:06:54 -07:00
Dhruba Borthakur
9cd221094c Add appropriate LICENSE and Copyright message.
Summary:
Add appropriate LICENSE and Copyright message.

Test Plan:
make check

Reviewers:

CC:

Task ID: #

Blame Rev:
2013-10-16 17:48:41 -07:00
Igor Canadi
fc4616d898 External Value Store
Summary:
Developing a capability for storing values on external backing file(s).

This is just a highly unoptimized first pass - supports:
1) Allocating some portion of external file to be used to store value
2) Freeing the range, enabling it to be reused by other values

As next steps, I plan to:
1) Create some kind of stress testing. Once I can measure stuff, I can focus on optimizing.
2) Optimize locking.
3) Optimize freelist data structure. Currently we have O(n) for both freeing and allocation.
4) Figure out how to do recovery.

Test Plan: Created a unit test.

Reviewers: dhruba, haobo, kailiu

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13389
2013-10-16 17:33:49 -07:00
Kai Liu
0f31843c15 Fix the patent format
Summary:

Formatted the PATENT file so that it's easier to read.

Test Plan:

Reviewers:

CC:

Task ID: #

Blame Rev:
2013-10-16 15:37:32 -07:00
Siying Dong
073cbfc8f0 Enable background flush thread by default and fix issues related to it
Summary:
Enable background flush thread in this patch and fix unit tests with:
(1) After background flush, schedule a background compaction if condition satisfied;
(2) Fix a bug that if universal compaction is enabled and number of levels are set to be 0, compaction will not be automatically triggered
(3) Fix unit tests to wait for compaction to finish instead of flush, before checking the compaction results.

Test Plan: pass all unit tests

Reviewers: haobo, xjin, dhruba

Reviewed By: haobo

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13461
2013-10-16 13:32:53 -07:00
Dhruba Borthakur
cb5b2baf18 Added Patent information to the source code repository.
Summary:
Added Patent information to the source code repository.

Test Plan:

Reviewers:

CC:

Task ID: #

Blame Rev:
2013-10-15 21:08:47 -07:00
Dhruba Borthakur
b825df81e2 Fix error in previous commit of 'ftruncate' to 'fallocate'.
Summary:
Fix error in previous commit of 'ftruncate' to 'fallocate'.

Test Plan:

Reviewers:

CC:

Task ID: #

Blame Rev:
2013-10-15 13:57:29 -07:00
Dhruba Borthakur
8457b74c26 Fix Unit test when run on tmpfs
Summary:
tmpfs might not support fallocate(). Fix unit test so that this
does not cause a unit test to fail.

Test Plan: ./env_test

Reviewers: emayanke, igor, kailiu

Reviewed By: kailiu

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13455
2013-10-15 12:03:09 -07:00
Mayank Agarwal
da2fd001a6 Fix rocksdb->levledb BytewiseComparator and inverted order of error in db/version_set.cc
Summary:
This is needed to make existing dbs be able to open and also because BytewiseComparator was not changed since leveldb.
The inverted order in the error message caused confusion prebiously

Test Plan: make; open existing db

Reviewers: leveldb, dhruba

Reviewed By: dhruba

Differential Revision: https://reviews.facebook.net/D13449
2013-10-14 18:16:54 -07:00
Mayank Agarwal
fe3713961e Features in Transaction log iterator
Summary:
* Logstore requests a valid change of reutrning an empty iterator and not an error in case of no log files.
* Changed the code to return the writebatch containing the sequence number requested from GetupdatesSince even if it lies in the middle. Earlier we used to return the next writebatch,. This also allows me oto guarantee that no files played upon by the iterator are redundant. I mean the starting log file has at least a sequence number >= the sequence number requested form GetupdatesSince.
* Cleaned up redundant logic in Iterator::Next and made a new function SeekToStartSequence for greater readability and maintainibilty.
* Modified a test in db_test accordingly
Please check the logic carefully and suggest improvements. I have a separate patch out for more improvements like restricting reader to read till written sequences.

Test Plan:
* transaction log iterator tests in db_test,
* db_repl_stress.
* rocks_log_iterator_test in fbcode/wormhole/rocksdb/test - 2 tests thriving on hacks till now can get simplified
* testing on the shadow setup for sigma with replication

Reviewers: dhruba, haobo, kailiu, sdong

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13437
2013-10-14 18:16:21 -07:00
Kai Liu
86ef6c3f74 Add statistics to sst file
Summary:
So far we only have key/value pairs as well as bloom filter stored in the
sst file.  It will be great if we are able to store more metadata about
this table itself, for example, the entry size, bloom filter name, etc.

This diff is the first step of this effort. It allows table to keep the
basic statistics mentioned in http://fburl.com/14995441, as well as
allowing writing user-collected stats to stats block.

After this diff, we will figure out the interface of how to allow user to collect their interested statistics.

Test Plan:
1. Added several unit tests.
2. Ran `make check` to ensure it doesn't break other tests.

Reviewers: dhruba, haobo

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13419
2013-10-14 15:56:13 -07:00
Siying Dong
88f2f89068 Change Function names from Compaction->Flush When they really mean Flush
Summary: When I debug the unit test failures when enabling background flush thread, I feel the function names can be made clearer for people to understand. Also, if the names are fixed, in many places, some tests' bugs are obvious (and some of those tests are failing). This patch is to clean it up for future maintenance.

Test Plan: Run test suites.

Reviewers: haobo, dhruba, xjin

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13431
2013-10-14 15:12:15 -07:00
sdong
f8509653ba LRUCache to try to clean entries not referenced first.
Summary:
With this patch, when LRUCache.Insert() is called and the cache is full, it will first try to free up entries whose reference counter is 1 (would become 0 after remo\
ving from the cache). We do it in two passes, in the first pass, we only try to release those unreferenced entries. If we cannot free enough space after traversing t\
he first remove_scan_cnt_ entries, we start from the beginning again and remove those entries being used.

Test Plan: add two unit tests to cover the codes

Reviewers: dhruba, haobo, emayanke

Reviewed By: emayanke

CC: leveldb, emayanke, xjin

Differential Revision: https://reviews.facebook.net/D13377
2013-10-11 09:26:21 -07:00
Dhruba Borthakur
c0ce562c32 Bad nfs file checked in a long time back.
Summary:
Bad nfs file checked in a long time back.

Test Plan:

Reviewers:

CC:

Task ID: #

Blame Rev:
2013-10-10 22:00:49 -07:00
Mayank Agarwal
a8b4a69de0 Fixing error in ParseFileName causing DestroyDB to fail on archive directory
Summary:
This careless error was causing ASSERT_OK(DestroyDB) to fail in db_test.
Basically .. was being returned as a child of db/archive and ParseFileName returned false on that,
but 'type' was set to LogFile from earlier and not reset. The return of ParseFileName was not being checked to delete the log file or not.

Test Plan: make all check

Reviewers: dhruba, haobo, xjin, kailiu, nkg-

Reviewed By: nkg-

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13413
2013-10-10 18:18:31 -07:00
Siying Dong
40a1e31fa5 Minor: Fix a lint error in cache_test.cc
Summary:
As title. Fix an lint error:

Lint: CppLint Error
Single-argument constructor 'Value(int v)' may inadvertently be used as a type conversion constructor. Prefix the function with the 'explicit' keyword to avoid this, or add an /* implicit */ comment to suppress this warning.

Test Plan: N/A

Reviewers: emayanke, haobo, dhruba

Reviewed By: emayanke

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13401
2013-10-10 13:47:25 -07:00
Igor Canadi
d2ca2bd183 Fixing build failure
Summary: virtual NewRandomRWFile is not implemented on EnvHdfs, causing build failure.

Test Plan: make clean; make all check

Reviewers: dhruba, haobo, kailiu

Reviewed By: kailiu

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13383
2013-10-10 01:01:16 -07:00
Igor Canadi
d0beadd456 Env class that can randomly read and write
Summary: I have implemented basic simple use case that I need for External Value Store I'm working on. There is a potential for making this prettier by refactoring/combining WritableFile and RandomAccessFile, avoiding some copypasta. However, I decided to implement just the basic functionality, so I can continue working on the other diff.

Test Plan: Added a unittest

Reviewers: dhruba, haobo, kailiu

Reviewed By: haobo

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13365
2013-10-10 00:03:08 -07:00
Dhruba Borthakur
7ac3c796f6 Add draft logo.
Summary:
Add draft logo in jpg format.

Test Plan:

Reviewers:

CC:

Task ID: #

Blame Rev:
2013-10-09 22:55:30 -07:00
Dhruba Borthakur
6d5f6a4b1a A bare-bones rocksdb logo.
Summary:
A hand-crafted rocksdb logo.

Test Plan:

Reviewers:

CC:

Task ID: #

Blame Rev:
2013-10-09 20:32:19 -07:00
Dhruba Borthakur
3c37955a2f Remove obsolete namespace mappings.
Summary:
The previous release 2.4 had a mapping to alias the older
namespace to rocksdb. This mapping is not needed in the new
release.

Test Plan:
make check
make release

Reviewers: emayanke

Reviewed By: emayanke

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13359
2013-10-08 22:53:38 -07:00
Naman Gupta
cbf4a06427 Add option for storing transaction logs in a separate dir
Summary: In some cases, you might not want to store the data log (write ahead log) files in the same dir as the sst files. An example use case is leaf, which stores sst files in tmpfs. And would like to save the log files in a separate dir (disk) to save memory.

Test Plan: make all. Ran db_test test. A few test failing. P2785018. If you guys don't see an obvious problem with the code, maybe somebody from the rocksdb team could help me debug the issue here. Running this on leaf worked well. I could see logs stored on disk, and deleted appropriately after compactions. Obviously this is only one set of options. The unit tests cover different options. Seems like I'm missing some edge cases.

Reviewers: dhruba, haobo, leveldb

CC: xinyaohu, sumeet

Differential Revision: https://reviews.facebook.net/D13239
2013-10-08 17:40:27 -07:00
Naman Gupta
116071411b Make db_test more robust
Summary: While working on D13239, I noticed that the same options are not used for opening and destroying at db. So adding that. Also added asserts for successful DestroyDB calls.

Test Plan: Ran unit tests. Atleast 1 unit test is failing. They failures are a result of some past logic change. I'm not really planning to fix those. But I would like to check this in. And hopefully the respective unit test owners can fix the broken tests

Reviewers: leveldb, haobo

CC: xinyaohu, sumeet, dhruba

Differential Revision: https://reviews.facebook.net/D13329
2013-10-08 13:19:31 -07:00
Kai Liu
1f8ade6bd6 Fix a bug in table builder
Summary:
In talbe.cc, when reading the metablock, it uses BytewiseComparator();
However in table_builder.cc, we use r->options.comparator. After tracing
the creation of r->options.comparator, I found this comparator is an
InternalKeyComparator, which wraps the user defined comparator(details
can be found in DBImpl::SanitizeOptions().

I encountered this problem when adding metadata about "bloom filter"
before. With different comparator, we may fail to do the binary sort.

Current code works well since there is only one entry in meta block.

Test Plan:
make all check

I've also tested this change in https://reviews.facebook.net/D8283 before.

Reviewers: dhruba, haobo

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13335
2013-10-07 18:53:19 -07:00
Igor Canadi
fa46ddb41f Move delete and free outside of crtical section
Summary: Split Unref into two parts -> cheap and expensive. Try to call expensive Unref outside of critical section to decrease lock contention.

Test Plan: unittests

Reviewers: dhruba, haobo

Reviewed By: dhruba

CC: leveldb, kailiu

Differential Revision: https://reviews.facebook.net/D13299
2013-10-07 15:37:40 -07:00
Dhruba Borthakur
1a8c1b0817 Unit test failure in DBTest.NumImmutableMemTable.
Summary:
Previous patch introduced a unit test failure in
DBTest.NumImmutableMemTable because of change in property names.

Test Plan:

Reviewers:

CC:

Task ID: #

Blame Rev:
2013-10-06 01:12:02 -07:00
Dhruba Borthakur
4463b11cad Migrate names of properties from 'leveldb' prefix to 'rocksdb' prefix.
Summary: Migrate names of properties from 'leveldb' prefix to 'rocksdb' prefix.

Test Plan: make check

Reviewers: emayanke, haobo

Reviewed By: haobo

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13311
2013-10-06 00:14:26 -07:00
Haobo Xu
bf89edf78b [RocksDB] Added a property "leveldb.num-immutable-mem-table" so that Flush can be called without blocking, and application still has a way to check when it's done also without blocking.
Summary: as title

Test Plan: DBTest.NumImmutableMemTable

Reviewers: dhruba

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13305
2013-10-05 11:54:08 -07:00
Dhruba Borthakur
0a9f873f4b Removed scribe, thrift and java modules.
Summary: Removed scribe, thrift and java modules.

Test Plan:
make release
make check

Reviewers: emayanke

Reviewed By: emayanke

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13293
2013-10-04 15:36:00 -07:00
Mayank Agarwal
aad2110823 Updating README.fb to have newest verison 2.4
Summary:

Test Plan: visual
2013-10-04 12:17:44 -07:00
Dhruba Borthakur
a143ef9b38 Change namespace from leveldb to rocksdb
Summary:
Change namespace from leveldb to rocksdb. This allows a single
application to link in open-source leveldb code as well as
rocksdb code into the same process.

Test Plan: compile rocksdb

Reviewers: emayanke

Reviewed By: emayanke

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13287
2013-10-04 11:59:26 -07:00
Mayank Agarwal
b3ed08129b Add a statistic to count the number of calls to GetUpdatesSince
Summary: This is useful to keep track of refreshes in transaction log iterator

Test Plan: make; db_stress --statistics=1 shows it

Reviewers: dhruba, haobo

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13281
2013-10-04 10:47:20 -07:00
Mayank Agarwal
854d236361 Add backward compatible option in GetLiveFiles to choose whether to not Flush first
Summary:
As explained in comments in GetLiveFiles in db.h, this option will cause flush to be skipped in GetLiveFiles because some use-cases use GetSortedWalFiles after GetLiveFiles to generate more complete snapshots.
Using GetSortedWalFiles after GetLiveFiles allows us to not Flush in GetLiveFiles first because wals have everything.
Note: file deletions will be disabled before calling GLF or GSWF so live logs will not move to archive logs or get delted.
Note: Manifest file is truncated to a proper value in GLF, so it will always reply from the proper wal files on a restart

Test Plan: make

Reviewers: dhruba, haobo

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13257
2013-10-04 10:20:10 -07:00
Haobo Xu
200c05a23f [RocksDB] Still honor DisableFileDeletions when purge_log_after_memtable_flush is on
Summary: as title

Test Plan: make check

Reviewers: emayanke

Reviewed By: emayanke

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13263
2013-10-03 16:12:43 -07:00
Haobo Xu
fa798e9e28 [Rocksdb] Submit mem table flush job in a different thread pool
Summary: As title. This is just a quick hack and not ready for commit. fails a lot of unit test. I will test/debug it directly in ViewState shadow .

Test Plan: Try it in shadow test.

Reviewers: dhruba, xjin

CC: leveldb

Differential Revision: https://reviews.facebook.net/D12933
2013-10-03 14:37:19 -07:00
Xing Jin
658a3ce2fa Fix SIGSEGV issue in universal compaction
Summary:
We saw SIGSEGV when set options.num_levels=1 in universal compaction
style. Dug into this issue for a while, and finally found the root cause (thank Haobo for discussion).

Test Plan: Add new unit test. It throws SIGSEGV without this change. Also run "make all check".

Reviewers: haobo, dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13251
2013-10-02 17:33:31 -07:00
Mayank Agarwal
6b34021fc2 Triggering verify for gets also
Summary: Will use iterators to verify keys in the db for half of its keys and Gets for the other half.

Test Plan: ./db_stress --max_key=1000 --ops_per_thread=100

Reviewers: dhruba, haobo

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13227
2013-10-02 11:22:17 -07:00
Haobo Xu
71046971f0 [RocksDB] Added perf counters to track skipped internal keys during iteration
Summary: as title. unit test not polished. this is for a quick live test

Test Plan: live

Reviewers: dhruba

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13221
2013-10-02 10:48:41 -07:00
Kai Liu
861f6e48e4 Remove the hard-coded enum value in statistics.h
Summary:
I am planning to add more to statistics classes but found current way of using enum is very verbose and unnecessarily increase the
difficulity of adding new statistics.

In this diff I removed the code that explicitly specifies the value of each enum entry. This will help us easily add new statistic
items more conveniently without manually adding the value of other enum entries by one.

Test Plan: make; make check;

Reviewers: haobo, dhruba, xjin, emayanke, vamsi

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13197
2013-10-01 14:14:06 -07:00
Natalie Hildebrandt
7edb92b843 Phase 2 of iterator stress test
Summary: Using an iterator instead of the Get method, each thread goes through a portion of the database and verifies values by comparing to the shared state.

Test Plan:
./db_stress --db=/tmp/tmppp --max_key=10000 --ops_per_thread=10000

To test some basic cases, the following lines can be added (each set in turn) to the verifyDb method with the following expected results:

    // Should abort with "Unexpected value found"
    shared.Delete(start);

    // Should abort with "Value not found"
    WriteOptions write_opts;
    db_->Delete(write_opts, Key(start));

    // Should succeed
    WriteOptions write_opts;
    shared.Delete(start);
     db_->Delete(write_opts, Key(start));

    // Should abort with "Value not found"
    WriteOptions write_opts;
    db_->Delete(write_opts, Key(start + (end-start)/2));

    // Should abort with "Value not found"
    db_->Delete(write_opts, Key(end-1));

    // Should abort with "Unexpected value"
    shared.Delete(end-1);

    // Should abort with "Unexpected value"
    shared.Delete(start + (end-start)/2);

    // Should abort with "Value not found"
    db_->Delete(write_opts, Key(start));
    shared.Delete(start);
    db_->Delete(write_opts, Key(end-1));
    db_->Delete(write_opts, Key(end-2));

To test the out of range abort, change the key in the for loop to Key(i+1), so that the key defined by the index i is now outside of the supposed range of the database.

Reviewers: emayanke

Reviewed By: emayanke

CC: dhruba, xjin

Differential Revision: https://reviews.facebook.net/D13071
2013-09-30 16:48:00 -07:00
Haobo Xu
22bb7c754b [RocksDB] print the name of options.memtable_factory in LOG so we know
Summary: as title

Test Plan: make check

Reviewers: dhruba, emayanke

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13179
2013-09-28 20:57:29 -07:00
Xing Jin
8eb552bf4d New unit test for iterator with snapshot
Summary:
I played with the reported bug about iterator with snapshot:
https://code.google.com/p/leveldb/issues/detail?id=200.

I turned the original test program
(https://code.google.com/p/leveldb/issues/attachmentText?id=200&aid=2000000000&name=test.cc&token=7uOUQW-HFlbAFMUm7EqtaAEy7Tw%3A1378320724136)
into a new unit test, but I cannot reproduce the problem. Notice lines
31-34 in above link. I have ran the new test with and without such Put()
operations. Both succeed.

So this diff simply adds the test, without changing any source codes.

Test Plan: run new test.

Reviewers: dhruba, haobo, emayanke

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D12735
2013-09-28 11:39:08 -07:00
Haobo Xu
0c4040681a [RocksDB] Move last_sequence and last_flushed_sequence_ update back into lock protected area
Summary: A previous diff moved these outside of lock protected area. Moved back in now. Also moved tmp_batch_ update outside of lock protected area, as only the single write thread can access it.

Test Plan: make check

Reviewers: dhruba

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13137
2013-09-26 20:43:11 -07:00
Haobo Xu
08740b15a4 [RocksDB] Fix skiplist sequential insertion optimization
Summary: The original optimization missed updating links other than the lowest level.

Test Plan: make check; perf_context_test

Reviewers: dhruba

Reviewed By: dhruba

CC: leveldb, adsharma

Differential Revision: https://reviews.facebook.net/D13119
2013-09-26 15:17:03 -07:00
Haobo Xu
e0aa19a94e [RocbsDB] Add an option to enable set based memtable for perf_context_test
Summary:
as title.
Some result:

-- Sequential insertion of 1M key/value with stock skip list (all in on memtable)
time ./perf_context_test  --total_keys=1000000  --use_set_based_memetable=0
Inserting 1000000 key/value pairs
...
Put uesr key comparison:
Count: 1000000  Average: 8.0179  StdDev: 176.34
Min: 0.0000  Median: 2.5555  Max: 88933.0000
Percentiles: P50: 2.56 P75: 2.83 P99: 58.21 P99.9: 133.62 P99.99: 987.50
Get uesr key comparison:
Count: 1000000  Average: 43.4465  StdDev: 379.03
Min: 2.0000  Median: 36.0195  Max: 88939.0000
Percentiles: P50: 36.02 P75: 43.66 P99: 112.98 P99.9: 824.84 P99.99: 7615.38
real	0m21.345s
user	0m14.723s
sys	0m5.677s

-- Sequential insertion of 1M key/value with set based memtable (all in on memtable)
time ./perf_context_test  --total_keys=1000000  --use_set_based_memetable=1
Inserting 1000000 key/value pairs
...
Put uesr key comparison:
Count: 1000000  Average: 61.5022  StdDev: 6.49
Min: 0.0000  Median: 62.4295  Max: 71.0000
Percentiles: P50: 62.43 P75: 66.61 P99: 71.00 P99.9: 71.00 P99.99: 71.00
Get uesr key comparison:
Count: 1000000  Average: 29.3810  StdDev: 3.20
Min: 1.0000  Median: 29.1801  Max: 34.0000
Percentiles: P50: 29.18 P75: 32.06 P99: 34.00 P99.9: 34.00 P99.99: 34.00
real	0m28.875s
user	0m21.699s
sys	0m5.749s

Worst case comparison for a Put is 88933 (skiplist) vs 71 (set based memetable)

Of course, there's other in-efficiency in set based memtable implementation, which lead to the overall worst performance. However, P99 behavior advantage is very very obvious.

Test Plan: ./perf_context_test and viewstate shadow testing

Reviewers: dhruba

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13095
2013-09-25 22:49:18 -07:00
Dhruba Borthakur
f1a60e5c3e The vector rep implementation was segfaulting because of incorrect initialization of vector.
Summary:
The constructor for Vector memtable has a parameter called 'count'
that specifies the capacity of the vector to be reserved at allocation
time. It was incorrectly used to initialize the size of the vector.

Test Plan: Enhanced db_test.

Reviewers: haobo, xjin, emayanke

Reviewed By: haobo

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13083
2013-09-25 11:33:52 -07:00
Dhruba Borthakur
87d6eb2f6b Implement apis in the Environment to clear out pages in the OS cache.
Summary:
Added a new api to the Environment that allows clearing out not-needed
pages from the OS cache. This will be helpful when the compressed
block cache replaces the OS cache.

Test Plan: EnvPosixTest.InvalidateCache

Reviewers: haobo

Reviewed By: haobo

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13041
2013-09-23 22:05:03 -07:00
Natalie Hildebrandt
9262061b0d Fixing crashing tests to include iterpercent param
Summary: Adding in the iterpercent flag to tests.

Test Plan: make crash_test

Reviewers: emayanke

Reviewed By: emayanke

Differential Revision: https://reviews.facebook.net/D13035
2013-09-20 16:27:22 -07:00
Dhruba Borthakur
5e9f3a9aa7 Better locking in vectorrep that increases throughput to match speed of storage.
Summary:
There is a use-case where we want to insert data into rocksdb as
fast as possible. Vector rep is used for this purpose.

The background flush thread needs to flush the vectorrep to
storage. It acquires the dblock then sorts the vector, releases
the dblock and then writes the sorted vector to storage. This is
suboptimal because the lock is held during the sort, which
prevents new writes for occuring.

This patch moves the sorting of the vector rep to outside the
db mutex. Performance is now as fastas the underlying storage
system. If you are doing buffered writes to rocksdb files, then
you can observe throughput upwards of 200 MB/sec writes.

This is an early draft and not yet ready to be reviewed.

Test Plan:
make check

Task ID: #

Blame Rev:

Reviewers: haobo

Reviewed By: haobo

CC: leveldb, haobo

Differential Revision: https://reviews.facebook.net/D12987
2013-09-19 21:48:10 -07:00