Commit Graph

1465 Commits

Author SHA1 Message Date
sdong
4d16a9a633 VersionBuilder to optimize for applying a later edit deleting files added by previous edits
Summary: During recovery, VersionBuilder::Apply() was called multiple times. If the DB is open for long enough, most of files added earlier will be deleted by later deletes. In current solution, sorting added file happens first and then deletes are applied. In this patch, deletes are applied when possible inside Apply(), which can significantly reduce the sorting time in some cases.

Test Plan:
Add unit tests in version_builder
valgrind_check
Open a manifest of 50MB, with 9K live files. The manifest read time reduced from 1.6 seconds to 0.7 seconds.

Reviewers: rven, yhchiang, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D30765
2015-01-07 10:36:58 -08:00
Igor Canadi
7731d51c82 Simplify column family concurrency
Summary:
This patch changes concurrency guarantees around ColumnFamilySet::column_families_ and ColumnFamilySet::column_families_data_.

Before:
* When mutating: lock DB mutex and spin lock
* When reading: lock DB mutex OR spin lock

After:
* When mutating: lock DB mutex and be in write thread
* When reading: lock DB mutex or be in write thread

That way, we eliminate the spin lock that protects these hash maps and  simplify concurrency. That means we don't need to lock the spin lock during writing, since writing is mutually exclusive with column family create/drop (the only operations that mutate those hash maps).

With these new restrictions, I also needed to move column family create to the write thread (column family drop was already in the write thread).

Even though we don't need to lock the spin lock during write, impact on performance should be minimal -- the spin lock is almost never busy, so locking it is almost free.

This addresses task t5116919.

Test Plan:
make check

Stress test with lots and lots of column family drop and create:

   time ./db_stress --threads=30 --ops_per_thread=5000000 --max_key=5000 --column_families=200 --clear_column_family_one_in=100000 --verify_before_write=0  --reopen=15 --max_background_compactions=10 --max_background_flushes=10 --db=/fast-rocksdb-tmp/db_stress/

Reviewers: yhchiang, rven, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D30651
2015-01-06 12:44:21 -08:00
Igor Canadi
07aa4e0e35 Fix compaction summary log for trivial move
Summary: When trivial move commit is done, we log the summary of the input version instead of current. This is inconsistent with other log messages and confusing.

Test Plan: compiles

Reviewers: sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D30939
2015-01-05 17:32:49 -08:00
Leonidas Galanis
9d5bd411be benchmark.sh won't run through all tests properly if one specifies wal_dir to be different than db directory.
Summary:
A command line like this to run all the tests:
source benchmark.config.sh && nohup ./benchmark.sh 'bulkload,fillseq,overwrite,filluniquerandom,readrandom,readwhilewriting'
where
benchmark.config.sh is:
export DB_DIR=/data/mysql/rocksdata
export WAL_DIR=/txlogs/rockswal
export OUTPUT_DIR=/root/rocks_benchmarking/output

Will fail for the tests that need a new DB .

Also 1) set disable_data_sync=0 and 2) add debug mode to run through all the tests more quickly

Test Plan: run ./benchmark.sh 'debug,bulkload,fillseq,overwrite,filluniquerandom,readrandom,readwhilewriting' and verify that there are no complaints about WAL dir not being empty.

Reviewers: sdong, yhchiang, rven, igor

Reviewed By: igor

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D30909
2015-01-05 15:36:47 -08:00
Igor Canadi
62ad0a9b19 Deprecating skip_log_error_on_recovery
Summary:
Since https://reviews.facebook.net/D16119, we ignore partial tailing writes. Because of that, we no longer need skip_log_error_on_recovery.

The documentation says "Skip log corruption error on recovery (If client is ok with losing most recent changes)", while the option actually ignores any corruption of the WAL (not only just the most recent changes). This is very dangerous and can lead to DB inconsistencies. This was originally set up to ignore partial tailing writes, which we now do automatically (after D16119). I have digged up old task t2416297 which confirms my findings.

Test Plan: There was actually no tests that verified correct behavior of skip_log_error_on_recovery.

Reviewers: yhchiang, rven, dhruba, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D30603
2015-01-05 13:35:56 -08:00
Igor Canadi
fa0b126c0c Fix corruption_test -- if status is not OK, return status -- during recovery 2015-01-05 10:49:41 -08:00
Igor Canadi
d7b4bb62a7 Fail DB::Open() on WAL corruption
Summary:
This is a serious bug. If paranod_check == true and WAL is corrupted, we don't fail DB::Open(). I tried going into history and it seems we've been doing this for a long long time.

I found this when investigating t5852041.

Test Plan: Added unit test to verify correct behavior.

Reviewers: yhchiang, rven, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D30597
2015-01-05 10:26:34 -08:00
sdong
e9ca358157 Fix CLANG build for db_bench
Summary: CLANG was broken for a recent change in db_ench. Fix it.

Test Plan: Build db_bench using CLANG.

Reviewers: rven, igor, yhchiang

Reviewed By: yhchiang

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D30801
2014-12-30 18:33:49 -08:00
Yueh-Hsuan Chiang
bf287b76e0 Add structures for exposing thread events and operations.
Summary:
Add structures for exposing events and operations.  Event describes
high-level action about a thread such as doing compaciton or
doing flush, while an operation describes lower-level action
of a thread such as reading / writing a SST table, waiting for
mutex.  Events and operations are designed to be independent.
One thread would typically involve in one event and one operation.

Code instrument will be in a separate diff.

Test Plan:
Add unit-tests in thread_list_test
make dbg -j32
./thread_list_test
export ROCKSDB_TESTS=ThreadList
./db_test

Reviewers: ljin, igor, sdong

Reviewed By: sdong

Subscribers: rven, jonahcohen, dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D29781
2014-12-30 10:39:13 -08:00
sdong
a801c1fb09 db_bench --num_hot_column_families to be default off
Summary: Having --num_hot_column_families default on fails some existing regression tests. By default turn it off

Test Plan: Run db_bench to make sure it is default off.

Reviewers: yhchiang, rven, igor

Reviewed By: igor

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D30705
2014-12-24 09:00:23 -08:00
sdong
ddc81440d5 db_bench to add an option as number of hot column families to add to
Summary:
Add option --num_hot_column_families in db_bench. If it is set, write options will first write to that number of column families, and then move on to next set of hot column families. The working set of column families can be smaller than total number of CFs.

It is to test how RocksDB can handle cold column families

Test Plan: Run db_bench with  --num_hot_column_families set and not set.

Reviewers: yhchiang, rven, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D30663
2014-12-23 16:52:05 -08:00
Yueh-Hsuan Chiang
a944afd356 Fixed a compile error in db/db_impl.cc on ROCKSDB_LITE 2014-12-23 16:19:40 -08:00
Chris BeHanna
d232cb156b Fix the build with -DNDEBUG.
Dike out the body of VerifyCompactionResult.  With assert() compiled out, the
loop index variable in the inner loop was unused, breaking the build when
-Werror is enabled.
2014-12-22 17:06:18 -06:00
Yueh-Hsuan Chiang
45bab305f9 Move GetThreadList() feature under Env.
Summary:
GetThreadList() feature depends on the thread creation and destruction, which is currently handled under Env.
This patch moves GetThreadList() feature under Env to better manage the dependency of GetThreadList() feature
on thread creation and destruction.

Renamed ThreadStatusImpl to ThreadStatusUpdater.  Add ThreadStatusUtil, which is a static class contains
utility functions for ThreadStatusUpdater.

Test Plan: run db_test, thread_list_test and db_bench and verify the life cycle of Env and ThreadStatusUpdater is properly managed.

Reviewers: igor, sdong

Reviewed By: sdong

Subscribers: ljin, dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D30057
2014-12-22 12:20:17 -08:00
Igor Canadi
4fd26f287c Only execute flush from compaction if max_background_flushes = 0
Summary: As title. We shouldn't need to execute flush from compaction if there are dedicated threads doing flushes.

Test Plan: make check

Reviewers: rven, yhchiang, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D30579
2014-12-22 12:05:14 +01:00
Igor Canadi
0acc738810 Speed up FindObsoleteFiles()
Summary:
There are two versions of FindObsoleteFiles():
* full scan, which is executed every 6 hours (and it's terribly slow)
* no full scan, which is executed every time a background process finishes and iterator is deleted

This diff is optimizing the second case (no full scan). Here's what we do before the diff:
* Get the list of obsolete files (files with ref==0). Some files in obsolete_files set might actually be live.
* Get the list of live files to avoid deleting files that are live.
* Delete files that are in obsolete_files and not in live_files.

After this diff:
* The only files with ref==0 that are still live are files that have been part of move compaction. Don't include moved files in obsolete_files.
* Get the list of obsolete files (which exclude moved files).
* No need to get the list of live files, since all files in obsolete_files need to be deleted.

I'll post the benchmark results, but you can get the feel of it here: https://reviews.facebook.net/D30123

This depends on D30123.

P.S. We should do full scan only in failure scenarios, not every 6 hours. I'll do this in a follow-up diff.

Test Plan:
One new unit test. Made sure that unit test fails if we don't have a `if (!f->moved)` safeguard in ~Version.

make check

Big number of compactions and flushes:

  ./db_stress --threads=30 --ops_per_thread=20000000 --max_key=10000 --column_families=20 --clear_column_family_one_in=10000000 --verify_before_write=0  --reopen=15 --max_background_compactions=10 --max_background_flushes=10 --db=/fast-rocksdb-tmp/db_stress --prefixpercent=0 --iterpercent=0 --writepercent=75 --db_write_buffer_size=2000000

Reviewers: yhchiang, rven, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D30249
2014-12-22 12:04:45 +01:00
Igor Canadi
f8999fcf31 Fix a SIGSEGV in BackgroundFlush
Summary:
This one wasn't easy to find :)

What happens is we go through all cfds on flush_queue_ and find no cfds to flush, *but* the cfd is set to the last CF we looped through and following code assumes we want it flushed.

BTW @sdong do you think we should also make BackgroundFlush() only check a single cfd for flushing instead of doing this `while (!flush_queue_.empty())`?

Test Plan: regression test no longer fails

Reviewers: sdong, rven, yhchiang

Reviewed By: yhchiang

Subscribers: sdong, dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D30591
2014-12-21 00:23:28 -08:00
Igor Canadi
fdb6be4e24 Rewritten system for scheduling background work
Summary:
When scaling to higher number of column families, the worst bottleneck was MaybeScheduleFlushOrCompaction(), which did a for loop over all column families while holding a mutex. This patch addresses the issue.

The approach is similar to our earlier efforts: instead of a pull-model, where we do something for every column family, we can do a push-based model -- when we detect that column family is ready to be flushed/compacted, we add it to the flush_queue_/compaction_queue_. That way we don't need to loop over every column family in MaybeScheduleFlushOrCompaction.

Here are the performance results:

Command:

    ./db_bench --write_buffer_size=268435456 --db_write_buffer_size=268435456 --db=/fast-rocksdb-tmp/rocks_lots_of_cf --use_existing_db=0 --open_files=55000 --statistics=1 --histogram=1 --disable_data_sync=1 --max_write_buffer_number=2 --sync=0 --benchmarks=fillrandom --threads=16 --num_column_families=5000  --disable_wal=1 --max_background_flushes=16 --max_background_compactions=16 --level0_file_num_compaction_trigger=2 --level0_slowdown_writes_trigger=2 --level0_stop_writes_trigger=3 --hard_rate_limit=1 --num=33333333 --writes=33333333

Before the patch:

     fillrandom   :      26.950 micros/op 37105 ops/sec;    4.1 MB/s

After the patch:

      fillrandom   :      17.404 micros/op 57456 ops/sec;    6.4 MB/s

Next bottleneck is VersionSet::AddLiveFiles, which is painfully slow when we have a lot of files. This is coming in the next patch, but when I removed that code, here's what I got:

      fillrandom   :       7.590 micros/op 131758 ops/sec;   14.6 MB/s

Test Plan:
make check

two stress tests:

Big number of compactions and flushes:

    ./db_stress --threads=30 --ops_per_thread=20000000 --max_key=10000 --column_families=20 --clear_column_family_one_in=10000000 --verify_before_write=0  --reopen=15 --max_background_compactions=10 --max_background_flushes=10 --db=/fast-rocksdb-tmp/db_stress --prefixpercent=0 --iterpercent=0 --writepercent=75 --db_write_buffer_size=2000000

max_background_flushes=0, to verify that this case also works correctly

    ./db_stress --threads=30 --ops_per_thread=2000000 --max_key=10000 --column_families=20 --clear_column_family_one_in=10000000 --verify_before_write=0  --reopen=3 --max_background_compactions=3 --max_background_flushes=0 --db=/fast-rocksdb-tmp/db_stress --prefixpercent=0 --iterpercent=0 --writepercent=75 --db_write_buffer_size=2000000

Reviewers: ljin, rven, yhchiang, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D30123
2014-12-19 20:38:12 +01:00
Yueh-Hsuan Chiang
25f70a5abb Avoid unnecessary unlock and lock mutex when notifying events.
Summary: Avoid unnecessary unlock and lock mutex when notifying events.

Test Plan: ./listener_test

Reviewers: igor

Reviewed By: igor

Subscribers: sdong, dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D30267
2014-12-16 17:10:23 -08:00
Venkatesh Radhakrishnan
7661e5a76e Move the file copy out of the mutex.
Summary:
We now release the mutex before copying the files in the case
of the trivial move. This path does not use the compaction job.

Test Plan: DBTest.LevelCompactionThirdPath

Reviewers: yhchiang, igor, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D30381
2014-12-16 16:57:22 -08:00
Venkatesh Radhakrishnan
153f4f0719 RocksDB: Allow Level-Style Compaction to Place Files in Different Paths
Summary:
Allow Level-style compaction to place files in different paths
This diff provides the code for task 4854591. We now support level-compaction
to place files in different paths by specifying  them in db_paths  along with
the minimum level for files to store in that path.

Test Plan: ManualLevelCompactionOutputPathId in db_test.cc

Reviewers: yhchiang, MarkCallaghan, dhruba, yoshinorim, sdong

Reviewed By: sdong

Subscribers: yoshinorim, dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D29799
2014-12-15 21:48:16 -08:00
sdong
7ab1526c0e Add an assert and avoid std::sort(autovector) to investigate an ASAN issue
Summary:
ASAN build fails once for this error:

14:04:52 ==== Test DBTest.CompactFilesOnLevelCompaction
14:04:52 db_test: db/version_set.cc:1062: void rocksdb::VersionStorageInfo::AddFile(int, rocksdb::FileMetaData*): Assertion `level <= 0 || level_files->empty() || internal_comparator_->Compare( (*level_files)[level_files->size() - 1]->largest, f->smallest) < 0' failed.

Not abling figure out reason. We use std:vector for sorting for save and add one more assert to help figure out whether it is the sorting's problem.

Test Plan: make all check

Reviewers: yhchiang, rven, igor

Reviewed By: igor

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D30117
2014-12-12 12:44:00 -08:00
sdong
d7a486668c Improve scalability of DB::GetSnapshot()
Summary: Now DB::GetSnapshot() doesn't scale to more column families, as it needs to go through all the column families to find whether snapshot is supported. This patch optimizes it.

Test Plan:
Add unit tests to cover negative cases.
make all check

Reviewers: yhchiang, rven, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D30093
2014-12-11 13:27:57 -08:00
sdong
0ab0242f37 VersionBuilder to use unordered set and map to store added and deleted files
Summary: Set operations in VerisonBuilder is shown as a performance bottleneck of restarting DB when there are lots of files. Make both of added_files and deleted_files use unordered set or map. Only when adding the files, sort the added files.

Test Plan: make all check

Reviewers: yhchiang, rven, igor

Reviewed By: igor

Subscribers: hermanlee4, leveldb, dhruba, ljin

Differential Revision: https://reviews.facebook.net/D30051
2014-12-10 18:53:30 -08:00
sdong
046ba7d47c Fix calculation of max_total_wal_size in db_options_.max_total_wal_size == 0 case
Summary: This is a regression bug introduced by https://reviews.facebook.net/D24729 . max_total_wal_size would be off the target it should be more and more in the case that the a user holds the current super version after flush or compaction. This patch fixes it

Test Plan: make all check

Reviewers: yhchiang, rven, igor

Reviewed By: igor

Subscribers: ljin, yoshinorim, MarkCallaghan, hermanlee4, dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D29961
2014-12-08 15:26:35 -08:00
Leonidas Galanis
635c61fd3b Fix problem with create_if_missing option when wal_dir is used
Summary: When wal_dir is used, DestroyDB is not passed the wal_dir option and so we get a Corruption exception.

Test Plan:
Verified manually that the following command line works now:
./db_bench --db=/mnt/db/rocksdb ... --disable_wal=0 --wal_dir=/data/users/rocksdb/WAL... --benchmarks=filluniquerandom --use_existing_db=0...

Reviewers: sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D29859
2014-12-08 12:53:24 -08:00
sdong
1f04066cab Add DBProperty to return number of snapshots and time for oldest snapshot
Summary:
Add a counter in SnapshotList to show number of snapshots. Also a unix timestamp in every snapshot.
Add two DB Properties to return number of snapshots and timestamp of the oldest one.

Test Plan: Add unit test checking

Reviewers: yhchiang, rven, igor

Reviewed By: igor

Subscribers: leveldb, dhruba, MarkCallaghan

Differential Revision: https://reviews.facebook.net/D29919
2014-12-05 17:07:49 -08:00
Yueh-Hsuan Chiang
a94d54aa47 Remove the use of exception in WriteBatch::Handler
Summary:
Remove the use of exception in WriteBatch::Handler.  Now the default
implementations of Put, Merge, and Delete in WriteBatch::Handler are no-op.

Test Plan:
Add three test cases in write_batch_test
./write_batch_test

Reviewers: sdong, igor

Reviewed By: sdong, igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D29835
2014-12-04 12:01:55 -08:00
Yueh-Hsuan Chiang
5f719d7202 Replace exception by setting valid_ = false in DBIter::MergeValuesNewToOld()
Summary: Replace exception by setting valid_ = false in DBIter::MergeValuesNewToOld().

Test Plan:
Not sure if I am right at this, but it seems we currently don't have a good
way to test that code path as it requires dynamically set merge_operator = nullptr
at the time while Merge() is calling.

Reviewers: igor, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D29811
2014-12-04 11:11:11 -08:00
Mark Callaghan
32a0a03844 Add Moved(GB) to Compaction IO stats
Summary:
Adds counter for bytes moved (files pushed down a level rather than compacted) to compaction
IO stats as Moved(GB). From the output removed these infrequently used columns: RW-Amp, Rn(cnt), Rnp1(cnt),
Wnp1(cnt), Wnew(cnt).
Example old output:
Level   Files   Size(MB) Score Read(GB)  Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) RW-Amp W-Amp Rd(MB/s) Wr(MB/s)  Rn(cnt) Rnp1(cnt) Wnp1(cnt) Wnew(cnt)  Comp(sec) Comp(cnt) Avg(sec) Stall(sec) Stall(cnt) Avg(ms) RecordIn RecordDrop
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  L0     0/0          0   0.0      0.0     0.0      0.0    2130.8   2130.8    0.0   0.0      0.0    109.1        0         0         0         0      20002     25068    0.798      28.75     182059    0.16       0          0
  L1   142/0        509   1.0   4618.5  2036.5   2582.0    4602.1   2020.2    4.5   2.3     88.5     88.1    24220    701246   1215528    514282      53466      4229   12.643       0.00          0    0.002032745988  300688729

Example new output:
Level   Files   Size(MB) Score Read(GB)  Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) Moved(GB) W-Amp Rd(MB/s) Wr(MB/s) Comp(sec) Comp(cnt) Avg(sec) Stall(sec) Stall(cnt) Avg(ms)     RecordIn   RecordDrop
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  L0     7/0         13   1.8      0.0     0.0      0.0       0.6      0.6       0.0   0.0      0.0     14.7        44       353    0.124       0.03        626    0.05            0            0
  L1     9/0         16   1.6      0.0     0.0      0.0       0.0      0.0       0.6   0.0      0.0      0.0         0         0    0.000       0.00          0    0.00            0            0

Task ID: #

Blame Rev:

Test Plan:
make check, run db_bench --fillseq --stats_per_interval --stats_interval and look at output

Revert Plan:

Database Impact:

Memcache Impact:

Other Notes:

EImportant:

- begin *PUBLIC* platform impact section -
Bugzilla: #
- end platform impact -

Reviewers: igor

Reviewed By: igor

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D29787
2014-12-03 18:28:39 -08:00
Jonah Cohen
a14b7873ee Enforce write buffer memory limit across column families
Summary:
Introduces a new class for managing write buffer memory across column
families.  We supplement ColumnFamilyOptions::write_buffer_size with
ColumnFamilyOptions::write_buffer, a shared pointer to a WriteBuffer
instance that enforces memory limits before flushing out to disk.

Test Plan: Added SharedWriteBuffer unit test to db_test.cc

Reviewers: sdong, rven, ljin, igor

Reviewed By: igor

Subscribers: tnovak, yhchiang, dhruba, xjin, MarkCallaghan, yoshinorim

Differential Revision: https://reviews.facebook.net/D22581
2014-12-02 12:09:20 -08:00
Igor Canadi
37d73d597e Fix linters
Summary:
Two fixes:
1. if cpplint is not present on the system, don't return a confusing error in the linter
2. Add include_alpha, which means our includes should be sorted lexicographically

Test Plan: Tried unsorting our includes, lint complained

Reviewers: rven, ljin, yhchiang, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D28845
2014-12-02 13:53:39 -05:00
Yueh-Hsuan Chiang
bcf9086899 Block Universal and FIFO compactions in ROCKSDB_LITE
Summary: Block Universal and FIFO compactions in ROCKSDB_LITE

Test Plan:
make shared_lib -j32
make OPT=-DROCKSDB_LITE shared_lib

Reviewers: ljin, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D29589
2014-11-26 15:45:11 -08:00
Yueh-Hsuan Chiang
73d72ed5c7 Block ReadOnlyDB in ROCKSDB_LITE
Summary:
db_imp_readonly.o is one of the big obj file.  If it's not a necessary
feature, we should probably block it in ROCKSDB_LITE.

    1322704 Nov 24 16:55 db/db_impl_readonly.o

Test Plan:
make shared_lib -j32
make ROCKSDB_LITE shared_lib -j32

Reviewers: ljin, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D29583
2014-11-26 11:37:59 -08:00
Matt Amos
26109d487a Store upper bound Slice with the same lifetime as the ReadOptions so that we can provide a pointer to it. 2014-11-26 11:29:13 +00:00
Matt Amos
805bac6d25 Add test for upper bounds on iterators using C interface. 2014-11-25 23:07:40 +00:00
Yueh-Hsuan Chiang
274ba62707 Block internal_stats in ROCKSDB_LITE
Summary: Block internal_stats in ROCKSDB_LITE.

Test Plan: make OPT=-DROCKSDB_LITE shared_lib

Reviewers: ljin, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D29541
2014-11-25 12:01:27 -08:00
Reed Allman
88dd8d889b c api: add max wal total to opts 2014-11-24 22:00:29 -08:00
Yueh-Hsuan Chiang
13de000f07 Add rocksdb::ToString() to address cases where std::to_string is not available.
Summary:
In some environment such as android, the c++ library does not have
std::to_string.  This path adds rocksdb::ToString(), which wraps std::to_string
when std::to_string is not available, and implements std::to_string
in the other case.

Test Plan:
make dbg -j32
./db_test
make clean
make dbg OPT=-DOS_ANDROID -j32
./db_test

Reviewers: ljin, sdong, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D29181
2014-11-24 20:44:49 -08:00
Yueh-Hsuan Chiang
90ee85f8e1 Improve listener_test to avoid possible false alarm
Summary:
Improve listener_test to avoid possible false alarm

Test Plan:
./listener_test
2014-11-24 18:28:06 -08:00
Lei Jin
2946e37a08 remove unreliable test in db/cuckoo_table_db_test.cc
Summary:
This compaction trigger does not seem to test any thing specific to
cuckoo table. Remove it.

Test Plan: make all check

Reviewers: igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D29523
2014-11-24 15:18:09 -08:00
Lei Jin
9c7ca65d21 free builders in VersionSet::DumpManifest
Summary:
Reported by bootcamper
This causes ldb tool to fail the assertion in ~ColumnFamilyData()

Test Plan:
./ldb --db=/tmp/test_db1 --create_if_missing put a1 b1
./ldb manifest_dump --path=/tmp/test_db1/MANIFEST-000001

Reviewers: sdong, yhchiang, rven, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D29517
2014-11-24 15:03:08 -08:00
Venkatesh Radhakrishnan
3257221499 Fixes valgrind error in GetSnapshotLink. Free checkpoint now.
Summary: Free checkpoint after its directory is removed.

Test Plan: Run valgrind with GetSnapshotLink.

Reviewers: igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D29493
2014-11-24 10:20:50 -08:00
Yueh-Hsuan Chiang
569853ed10 Fix leak when create_missing_column_families=true on ThreadStatus
Summary:
An entry of ConstantColumnFamilyInfo is created when:
1. DB::Open
2. CreateColumnFamily.

However, there are cases that DB::Open could also call CreateColumnFamily
when create_missing_column_families=true.  As a result, it will create
duplicate ConstantColumnFamilyInfo and one of them would be leaked.

Test Plan: ./deletefile_test

Reviewers: igor, sdong, ljin

Reviewed By: ljin

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D29307
2014-11-22 00:04:41 -08:00
sdong
3a40c427b9 Fix db_bench on CLANG mode
Summary: "build all" breaks in Clang mode with db_bench. Fix it.

Test Plan: USE_CLANG=1 make all

Reviewers: ljin, rven, yhchiang, igor

Reviewed By: igor

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D29379
2014-11-21 11:30:22 -08:00
Yueh-Hsuan Chiang
aa31fc5068 Improve listener_test by ensuring flushes are completed before assert.
Summary: Improve listener_test by ensuring flushes are completed before assert.

Test Plan: listener_test

Reviewers: ljin, sdong, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D29319
2014-11-21 10:22:05 -08:00
Igor Canadi
cd278584c9 Clean up StringSplit
Summary: stringSplit is not how we name our functions. Also, we had two StringSplit's in the codebase

Test Plan: make check

Reviewers: yhchiang, dhruba

Reviewed By: dhruba

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D29361
2014-11-21 11:05:28 -05:00
Yueh-Hsuan Chiang
4b63fcbff3 Add enable_thread_tracking to DBOptions
Summary:
Add enable_thread_tracking to DBOptions to allow
tracking thread status related to the DB.  Default is off.

Test Plan:
export ROCKSDB_TESTS=ThreadList
./db_test

Reviewers: ljin, sdong, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D29289
2014-11-20 21:13:18 -08:00
Venkatesh Radhakrishnan
004f416b77 Moved checkpoint to utilities
Summary:
Moved checkpoint to utilities.
Addressed comments by Igor, Siying, Dhruba

Test Plan: db_test/SnapshotLink

Reviewers: dhruba, igor, sdong

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D29079
2014-11-20 15:54:47 -08:00
Yueh-Hsuan Chiang
d0c5f28a5c Introduce GetThreadList API
Summary:
Add GetThreadList API, which allows developer to track the
status of each process.  Currently, calling GetThreadList will
only get the list of background threads in RocksDB with their
thread-id and thread-type (priority) set.  Will add more support
on this in the later diffs.

ThreadStatus currently has the following properties:

  // An unique ID for the thread.
  const uint64_t thread_id;

  // The type of the thread, it could be ROCKSDB_HIGH_PRIORITY,
  // ROCKSDB_LOW_PRIORITY, and USER_THREAD
  const ThreadType thread_type;

  // The name of the DB instance where the thread is currently
  // involved with.  It would be set to empty string if the thread
  // does not involve in any DB operation.
  const std::string db_name;

  // The name of the column family where the thread is currently
  // It would be set to empty string if the thread does not involve
  // in any column family.
  const std::string cf_name;

  // The event that the current thread is involved.
  // It would be set to empty string if the information about event
  // is not currently available.

Test Plan:
./thread_list_test
export ROCKSDB_TESTS=GetThreadList
./db_test

Reviewers: rven, igor, sdong, ljin

Reviewed By: ljin

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D25047
2014-11-20 10:49:32 -08:00
Lei Jin
8d3f8f9696 remove all remaining references to cfd->options()
Summary:
The very last reference happens in DBImpl::GetOptions()
I built with both DBImpl::GetOptions() and ColumnFamilyData::options() commented out

Test Plan: make all check

Reviewers: sdong, yhchiang, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D29073
2014-11-18 10:20:10 -08:00
Lei Jin
1e4a45aac8 remove cfd->options() in DBImpl::NotifyOnFlushCompleted
Summary: We should not reference cfd->options() directly!

Test Plan: make release

Reviewers: sdong, rven, igor, yhchiang

Reviewed By: igor, yhchiang

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D29061
2014-11-18 10:19:48 -08:00
Yueh-Hsuan Chiang
98e59f9813 Fixed a bug which could hide non-ok status in CompactionJob::Run()
Summary: Fixed a bug which could hide non-ok status in CompactionJob::Run()

Test Plan: make

Reviewers: sdong, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D28995
2014-11-16 21:52:23 -08:00
Igor Canadi
23295b74b6 Clean job_context 2014-11-14 16:57:17 -08:00
Igor Canadi
84af2ff8d3 Clean job context in DeleteFile 2014-11-14 16:20:24 -08:00
Igor Canadi
5c04acda08 Explicitly clean JobContext
Summary: This way we can gurantee that old MemTables get destructed before DBImpl gets destructed, which might be useful if we want to make them depend on state from DBImpl.

Test Plan: make check with asserts in JobContext's destructor

Reviewers: ljin, sdong, yhchiang, rven, jonahcohen

Reviewed By: jonahcohen

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D28959
2014-11-14 15:43:10 -08:00
Igor Canadi
26dc5da96c Fix compaction_job_test 2014-11-14 13:42:13 -08:00
Igor Canadi
5f583d2a9c Merge pull request #394 from lalinsky/cuckoo-c
Cuckoo table options missing in the C interface
2014-11-14 13:23:06 -08:00
Igor Canadi
04ca7481d2 Fix build 2014-11-14 11:52:17 -08:00
Venkatesh Radhakrishnan
6c1b040cc9 Provide openable snapshots
Summary: Store links to live files in directory on same disk

Test Plan:
Take snapshot and open it. Added a test GetSnapshotLink in
db_test.

Reviewers: sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D28713
2014-11-14 11:38:26 -08:00
Igor Canadi
9be338cf9d CompactionJobTest
Summary:
This is just a simple test that passes two files though a compaction. It shows the framework so that people can continue building new compaction *unit* tests.
In the future we might want to move some Compaction* tests from DBTest here. For example, CompactBetweenSnapshot seems a good candidate.

Hopefully this test can be simpler when we mock out VersionSet.

Test Plan: this is a test

Reviewers: ljin, rven, yhchiang, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D28449
2014-11-14 11:35:48 -08:00
Lukáš Lalinský
e6c3cc6574 Add very basic tests to make sure the C cuckoo table options compile and run 2014-11-14 11:31:52 -08:00
Lukáš Lalinský
c44a292781 Add cuckoo table options to the C interface 2014-11-14 11:00:39 -08:00
sdong
a177742a9b Make db_stress built for ROCKSDB_LITE
Summary:
Make db_stress built for ROCKSDB_LITE.
The test doesn't pass tough. It seg fault quickly. But I took a look and it doesn't seem to be related to lite version. Likely to be a bug inside RocksDB.

Test Plan: make db_stress

Reviewers: yhchiang, rven, ljin, igor

Reviewed By: igor

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D28797
2014-11-14 10:20:51 -08:00
sdong
f822129b32 Add a unit test for behavior when merge operator and compaction filter co-exist.
Summary: Add a unit test in db_test to verify the behavior when both of merge operator and compaction filter apply to a key when merging.

Test Plan: Run the new test

Reviewers: ljin, yhchiang, rven, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D28455
2014-11-13 15:21:12 -08:00
Yueh-Hsuan Chiang
4161de92a3 Fix SIGSEGV
Summary: As a short-term fix, let's go back to previous way of calculating NeedsCompaction(). SIGSEGV happens because NeedsCompaction() can happen before super_version (and thus MutableCFOptions) is initialized.

Test Plan: make check

Reviewers: ljin, sdong, rven, yhchiang

Reviewed By: yhchiang

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D28875
2014-11-13 15:21:04 -08:00
Igor Canadi
772bc97f13 No CompactFiles in ROCKSDB_LITE
Summary: It adds lots of code.

Test Plan: compile for iOS, compile for mac. works.

Reviewers: rven, sdong, ljin, yhchiang

Reviewed By: yhchiang

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D28857
2014-11-13 16:45:33 -05:00
Yueh-Hsuan Chiang
1d1a64f58a Move NeedsCompaction() from VersionStorageInfo to CompactionPicker
Summary:
Move NeedsCompaction() from VersionStorageInfo to CompactionPicker
to allow different compaction strategy to have their own way to
determine whether doing compaction is necessary.

When compaction style is set to kCompactionStyleNone, then
NeedsCompaction() will always return false.

Test Plan:
export ROCKSDB_TESTS=Compact
./db_test

Reviewers: ljin, sdong, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D28719
2014-11-13 13:41:43 -08:00
Igor Canadi
25f273027b Fix iOS compile with -Wshorten-64-to-32
Summary: So iOS size_t is 32-bit, so we need to static_cast<size_t> any uint64_t :(

Test Plan: TARGET_OS=IOS make static_lib

Reviewers: dhruba, ljin, yhchiang, rven, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D28743
2014-11-13 14:39:30 -05:00
sdong
fa50abb726 Fix bug of reading from empty DB.
Summary: I found that db_stress sometimes segfault on my machine. Fix the bug.

Test Plan: make all check. Run db_stress

Reviewers: ljin, yhchiang, rven, igor

Reviewed By: igor

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D28803
2014-11-13 10:42:18 -08:00
Yueh-Hsuan Chiang
9759495229 Fixed clang compile error in version_builder_test
Summary: Fixed clang compile error in version_builder_test

Test Plan: ./version_builder_test

Reviewers: igor, sdong

Reviewed By: sdong

Subscribers: sdong, dhruba

Differential Revision: https://reviews.facebook.net/D28731
2014-11-11 15:22:06 -08:00
Yueh-Hsuan Chiang
5811419357 Fixed GetEstimatedActiveKeys
Summary:
Fixed a bug in GetEstimatedActiveKeys which does not normalized
the sampled information correctly.

Add a test in version_builder_test.

Test Plan: version_builder_test

Reviewers: ljin, igor, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D28707
2014-11-11 14:28:18 -08:00
Igor Canadi
767777c2bd Turn on -Wshorten-64-to-32 and fix all the errors
Summary:
We need to turn on -Wshorten-64-to-32 for mobile. See D1671432 (internal phabricator) for details.

This diff turns on the warning flag and fixes all the errors. There were also some interesting errors that I might call bugs, especially in plain table. Going forward, I think it makes sense to have this flag turned on and be very very careful when converting 64-bit to 32-bit variables.

Test Plan: compiles

Reviewers: ljin, rven, yhchiang, sdong

Reviewed By: yhchiang

Subscribers: bobbaldwin, dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D28689
2014-11-11 16:47:22 -05:00
Igor Canadi
113796c493 Fix NewFileNumber()
Summary: I mistakenly changed the behavior to ++next_file_number_ instead of next_file_number_++, as it should have been: 344edbb044/db/version_set.h (L539)

Test Plan: none. not sure if this would break anything. It's just different behavior, so I'd rather not risk

Reviewers: ljin, rven, yhchiang, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D28557
2014-11-11 06:58:47 -08:00
Tomislav Novak
35c8c814e8 Make ForwardIterator::status() more efficient
Summary:
In D19581 I made `ForwardIterator::status()` check all child iterators,
including immutable ones. It's, however, not necessary to do it every
time -- it'll suffice to check only when they're used and their status
could change.

This diff:
* introduces `immutable_status_` which is updated by `Seek()` and `Next()`
* removes special handling of `kIncomplete` status in those methods

Test Plan:
* `db_test`
* hacked ReadSequential in db_bench.cc to check `status()` in addition to
  validity:

```
   $ ./db_bench -use_existing_db -benchmarks readseq -disable_auto_compactions \
      -use_tailing_iterator  # without this patch
   Keys:       16 bytes each
   Values:     100 bytes each (50 bytes after compression)
   Entries:    1000000
   [...]
   DB path: [/dev/shm/rocksdbtest/dbbench]
   readseq      :       0.562 micros/op 1778103 ops/sec;   98.4 MB/s
   $ ./db_bench -use_existing_db -benchmarks readseq -disable_auto_compactions \
      -use_tailing_iterator  # with the patch
   readseq      :       0.433 micros/op 2311363 ops/sec;  127.8 MB/s
```

Reviewers: igor, sdong, ljin

Reviewed By: ljin

Subscribers: dhruba, leveldb, march, lovro

Differential Revision: https://reviews.facebook.net/D24063
2014-11-10 15:44:20 -08:00
Igor Canadi
c7ee9c3ab7 Fix -Wnon-virtual-dtor errors
Summary: This breaks mongo+rocks build https://mci.10gen.com/task_log_raw/rocksdb_ubuntu1404_rocksdb_c6e8e3d868660dc66b3bbd438cdc135df6356c5a_14_11_10_21_36_10_compile_ubuntu1404_rocksdb/0?type=T

Test Plan: m check + -Wnon-virtual-dtor

Reviewers: sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D28653
2014-11-10 17:39:38 -05:00
Igor Canadi
746252197b Merge pull request #384 from msb-at-yahoo/compaction-filter-2-empty-changed-values
CompactionFilterV2: eliminate an often unnecessary allocation.
2014-11-10 16:12:15 -05:00
Igor Canadi
4a3bd2bad2 Optimize usage of Status in CompactionJob
Summary: Based on @ljin feedback

Test Plan: compiles

Reviewers: ljin, yhchiang, sdong

Reviewed By: sdong

Subscribers: ljin, dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D28515
2014-11-10 11:57:58 -08:00
Igor Canadi
e3d3567b5b Get rid of mutex in CompactionJob's state
Summary: Based on @sdong's feedback in the diff, we shouldn't keep db_mutex in CompactionJob's state. This diff removes db_mutex from CompactionJob state, by making next_file_number_ atomic. That way we only need to pass the lock to InstallCompactionResults() because of LogAndApply()

Test Plan: make check

Reviewers: ljin, yhchiang, rven, sdong

Reviewed By: sdong

Subscribers: sdong, dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D28491
2014-11-07 15:44:12 -08:00
Yueh-Hsuan Chiang
344edbb044 Fixed the shadowing in db/compaction.cc and include/rocksdb/db.h
Summary:
Fixed the shadowing in db/compaction.cc and include/rocksdb/db.h

Test Plan:
make
2014-11-07 15:22:10 -08:00
Yueh-Hsuan Chiang
b8b3903429 Fixed compile error in db/db_impl.cc
Summary:
Fixed compile error in db/db_impl.cc

Test Plan:
make
2014-11-07 15:13:01 -08:00
Yueh-Hsuan Chiang
b622ba5d6b Fixed compile error in db/flush_job.cc
Summary:
Fixed compile error in db/flush_job.cc

Test Plan:
make
2014-11-07 15:11:36 -08:00
Yueh-Hsuan Chiang
642ac9d8ab Fixed compile error in db/compaction.cc and db/compaction_picker.cc
Summary:
Fixed compile error in db/compaction.cc and db/compaction_picker.cc

Test Plan:
make
2014-11-07 15:08:12 -08:00
Igor Canadi
68effa0348 Fix -Wshadow for tools
Summary: Previously I made `make check` work with -Wshadow, but there are some tools that are not compiled using `make check`.

Test Plan: make all

Reviewers: yhchiang, rven, ljin, sdong

Reviewed By: ljin, sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D28497
2014-11-07 15:04:30 -08:00
Yueh-Hsuan Chiang
8447861896 Fixed -WShadow errors in db/db_test.cc and include/rocksdb/metadata.h
Summary:
Fixed -WShadow errors in db/db_test.cc and include/rocksdb/metadata.h

Test Plan:
make
2014-11-07 14:57:51 -08:00
Yueh-Hsuan Chiang
28c82ff1b3 CompactFiles, EventListener and GetDatabaseMetaData
Summary:
This diff adds three sets of APIs to RocksDB.

= GetColumnFamilyMetaData =
* This APIs allow users to obtain the current state of a RocksDB instance on one column family.
* See GetColumnFamilyMetaData in include/rocksdb/db.h

= EventListener =
* A virtual class that allows users to implement a set of
  call-back functions which will be called when specific
  events of a RocksDB instance happens.
* To register EventListener, simply insert an EventListener to ColumnFamilyOptions::listeners

= CompactFiles =
* CompactFiles API inputs a set of file numbers and an output level, and RocksDB
  will try to compact those files into the specified level.

= Example =
* Example code can be found in example/compact_files_example.cc, which implements
  a simple external compactor using EventListener, GetColumnFamilyMetaData, and
  CompactFiles API.

Test Plan:
listener_test
compactor_test
example/compact_files_example
export ROCKSDB_TESTS=CompactFiles
db_test
export ROCKSDB_TESTS=MetaData
db_test

Reviewers: ljin, igor, rven, sdong

Reviewed By: sdong

Subscribers: MarkCallaghan, dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D24705
2014-11-07 14:45:18 -08:00
Igor Canadi
a0f887c9e4 Fix compile 2014-11-07 12:07:43 -08:00
Igor Canadi
53af5d877d Redesign pending_outputs_
Summary:
Here's a prototype of redesigning pending_outputs_. This way, we don't have to expose pending_outputs_ to other classes (CompactionJob, FlushJob, MemtableList). DBImpl takes care of it.

Still have to write some comments, but should be good enough to start the discussion.

Test Plan: make check, will also run stress test

Reviewers: ljin, sdong, rven, yhchiang

Reviewed By: yhchiang

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D28353
2014-11-07 11:50:34 -08:00
Jonah Cohen
ec101cd49a Correctly test both compaction styles in CompactionDeletionTriggerReopen
Summary:
CompactionDeletionTriggerReopen wasn't actually testing universal
compaction.

Test Plan: db_test

Reviewers: sdong, igor

Reviewed By: igor

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D28443
2014-11-06 17:28:49 -08:00
Yueh-Hsuan Chiang
8d87467bb0 Make PartialCompactionFailure Test more robust again.
Summary:
Make PartialCompactionFailure Test more robust again by
blocking background compaction until we simulate the
file creation error.

Test Plan:
export ROCKSDB_TESTS=PartialCompactionFailure
./db_test

Reviewers: sdong, igor, ljin

Reviewed By: ljin

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D28431
2014-11-06 17:07:52 -08:00
Lei Jin
64d302d304 make DropWritesFlush deterministic
Summary:
TEST_WaitForFlush should wait until it sees error when parameter is set
to true so we don't need to loop and timeout

Test Plan: ROCKSDB_TESTS=DropWritesFlush ./db_test

Reviewers: sdong, igor

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D28419
2014-11-06 16:07:07 -08:00
Yueh-Hsuan Chiang
e526b71402 Make PartialCompactionFailure Test more robust.
Summary: Make PartialCompactionFailure Test more robust.

Test Plan:
export ROCKSDB_TESTS=PartialCompactionFailure
./db_test

Reviewers: ljin, sdong, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D28425
2014-11-06 13:53:02 -08:00
Igor Canadi
9f20395cd6 Turn -Wshadow back on
Summary: It turns out that -Wshadow has different rules for gcc than clang. Previous commit fixed clang. This commits fixes the rest of the warnings for gcc.

Test Plan: compiles

Reviewers: ljin, yhchiang, rven, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D28131
2014-11-06 11:14:28 -08:00
sdong
367a3f9cb4 Improve DBTest.GroupCommitTest: artificially slowdown log writing to trigger group commit
Summary: In order to avoid random failure of DBTest.GroupCommitTest, artificially sleep 100 microseconds in each log writing.

Test Plan: Run the test in a machine where valgrind version of the test always fails multiple times and see it always succeed.

Reviewers: igor, yhchiang, rven, ljin

Reviewed By: ljin

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D28401
2014-11-06 10:48:06 -08:00
sdong
ac95ae1b5d Make sure WAL is synced for DB::Write() if write batch is empty
Summary: This patch makes it a contract that if an empty write batch is passed to DB::Write() and WriteOptions.sync = true, fsync is called to WAL.

Test Plan: A new unit test

Reviewers: ljin, rven, yhchiang, igor

Reviewed By: igor

Subscribers: dhruba, MarkCallaghan, leveldb

Differential Revision: https://reviews.facebook.net/D28365
2014-11-06 09:48:19 -08:00
Shi Feng
ea18b944a7 Add db_bench option --report_file_operations
Summary: Add db_bench option --report_file_operations

Test Plan:
./db_bench --report_file_operations
Observe outputs on # of file operations

Reviewers: ljin, MarkCallaghan, sdong

Reviewed By: sdong

Subscribers: yhchiang, rven, igor, dhruba

Differential Revision: https://reviews.facebook.net/D27945
2014-11-05 18:40:18 -08:00
sdong
2ea1219eb6 Fix RecordIn and RecordDrop stats
Summary:
1. fix possible overflow of the two stats by using uint64_t
2. use a similar source of data to calculate RecordDrop. Previous one is not correct.

Test Plan: See outputs of db_bench settings, and the results look reasonable

Reviewers: MarkCallaghan, ljin, igor

Reviewed By: igor

Subscribers: rven, leveldb, yhchiang, dhruba

Differential Revision: https://reviews.facebook.net/D28155
2014-11-05 11:03:34 -08:00
maurice barnum
76f6c7c7c4 CompactionFilterV2: eliminate an often unnecessary allocation.
If a compaction filter implementation is simply filtering values, then
allocating the "changed values" bitmap is an extra memory allocation
that adds no value. Additionally, the compaction implementation has to
do marginally more work to calculate the offset into the bitmap
(vector<bool> specialization) for each record the filter did not mark
for deletion.

Explicitly handle the case where compact_->value_changed_buf_ is empty.
2014-11-05 05:31:11 +00:00
Lei Jin
fd24ae9d05 SetOptions() to return status and also add it to StackableDB
Summary: as title

Test Plan: ./db_test

Reviewers: sdong, yhchiang, rven, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D28269
2014-11-04 16:23:05 -08:00
Lei Jin
b1267750fb fix the asan check
Summary: as title

Test Plan: ran it

Reviewers: yhchiang, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D28311
2014-11-04 15:58:14 -08:00