Commit Graph

514 Commits

Author SHA1 Message Date
sdong
b54d4dd435 tools/sst_dump_tool_imp.h not to depend on "util/testutil.h"
Summary:
util/testutil.h doesn't seem to be used in tools/sst_dump_tool_imp.h. Remove it.
Also move some other include to tools/sst_dump_tool.cc instead.

Test Plan: Build with GCC, CLANG and with GCC 4.81 and 4.9.

Reviewers: yuslepukhin, yhchiang, rven, anthony, IslamAbdelRahman

Reviewed By: IslamAbdelRahman

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D52791
2016-01-13 11:34:53 -08:00
Alexander Fenster
e16438bb86 fixing build warning 2016-01-11 11:23:33 -08:00
Alexander Fenster
b73fbbaf64 added --no_value option to ldb scan to dump key only 2016-01-11 10:51:42 -08:00
Gunnar Kudrjavets
b1a3b4c0d0 Make ldb automagically determine the file type and use the correct dumping function
Summary:
This set of changes implements the following design: `ldb` will utilize `--path` parameter which can be used to specify a file name. Tool will then apply some heuristic to determine how to output the data properly. The design decision is not to probe the file content, but use file names to determine what dumping function to call.

Usage examples:

Understands that path points to a manifest file and dumps it.
`./ldb --path=/tmp/test_db/MANIFEST-000023 dump`

Understands that path points to a WAL file and dumps it.
`./ldb --path=/tmp/test_db/000024.log dump --header`

Understands that path points to a SST file and dumps it.
`./ldb --path=/tmp/test_db/000007.sst dump`

Figures out that none of the supported file types are applicable and outputs
an appropriate error message.
`./ldb --path=/tmp/cron.log dump`

Test Plan:
Basics:

git diff
make clean
make -j 32 commit-prereq
arc lint

More specific testing (done as part of commit-prereq, but can be iterated separately when making isolated changes):

make clean
make ldb
python tools/ldb_test.py
make rocksdb_dump
make rocksdb_undump
sh tools/rocksdb_dump_test.sh

Reviewers: rven, IslamAbdelRahman, yhchiang, kradhakrishnan, anthony, igor, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D52269
2016-01-06 14:19:08 -08:00
Mark Callaghan
4041903ecd Enhance db_bench write rate limit
Summary:
1) changes tools/{benchmark,run_flash_bench}.sh to optionally use the write rate limit
2) removes code for --writes_per_second and switches the 'background' write rate limit
to use --benchmark_write_rate_limit

Replaces https://reviews.facebook.net/D49113

Task ID: #9555881

Blame Rev:

Test Plan:
tools/run_flash_bench.sh

Revert Plan:

Database Impact:

Memcache Impact:

Other Notes:

EImportant:

- begin *PUBLIC* platform impact section -
Bugzilla: #
- end platform impact -

Reviewers: igor

Reviewed By: igor

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D52485
2016-01-04 12:01:27 -08:00
Siying Dong
298ba27ae2 Merge pull request #846 from yuslepukhin/enble_c4244_lossofdata
Enable MS compiler warning c4244.
2015-12-23 22:59:42 -08:00
Andrew Kryczka
e089db40f9 Skip bottom-level filter block caching when hit-optimized
Summary:
When Get() or NewIterator() trigger file loads, skip caching the filter block if
(1) optimize_filters_for_hits is set and (2) the file is on the bottommost
level. Also skip checking filters under the same conditions, which means that
for a preloaded file or a file that was trivially-moved to the bottom level, its
filter block will eventually expire from the cache.

- added parameters/instance variables in various places in order to propagate the config ("skip_filters") from version_set to block_based_table_reader
- in BlockBasedTable::Rep, this optimization prevents filter from being loaded when the file is opened simply by setting filter_policy = nullptr
- in BlockBasedTable::Get/BlockBasedTable::NewIterator, this optimization prevents filter from being used (even if it was loaded already) by setting filter = nullptr

Test Plan:
updated unit test:

  $ ./db_test --gtest_filter=DBTest.OptimizeFiltersForHits

will also run 'make check'

Reviewers: sdong, igor, paultuckfield, anthony, rven, kradhakrishnan, IslamAbdelRahman, yhchiang

Reviewed By: yhchiang

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D51633
2015-12-23 10:15:07 -08:00
Dmitri Smirnov
236fe21c92 Enable MS compiler warning c4244.
Mostly due to the fact that there are differences in sizes of int,long
  on 64 bit systems vs GNU.
2015-12-11 16:47:34 -08:00
charsyam
c30b499541 fix typos in comments 2015-12-11 01:54:48 +09:00
yuslepukhin
78de0c9222 Fix up VS 15 build.
Fix warnings
 Take advantage of native snprintf on VS 15
2015-12-08 08:38:21 -08:00
Islam AbdelRahman
88e0527724 Reduce moving memory in LDB::ScanCommand
Summary:
Based on https://github.com/facebook/rocksdb/issues/843
It looks that when the data is hot we spend significant amount of time moving data out of RocksDB blocks. This patch reduce moving memory when possible

Original performance
```
$ time ./ldb --db=/home/tec/local/ellina_test/testdb scan > /dev/null
real	0m16.736s
user	0m11.993s
sys	0m4.725s
```

Performance after reducing memcpy
```
$ time ./ldb --db=/home/tec/local/ellina_test/testdb scan > /dev/null
real	0m11.590s
user	0m6.983s
sys	0m4.595s
```

Test Plan:
dump the output of the scan into 2 files and verifying the are exactly the same
make check

Reviewers: sdong, yhchiang, anthony, rven, igor

Reviewed By: igor

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D51093
2015-11-19 22:26:37 -08:00
sdong
51fce92e11 "ldb compact" should force bottommost level compaction
Summary: Now "ldb compact" skips the bottommost level compaction. This is an unintended behavior change. Reverting it now. Maybe we need to add another mode later for it.

Test Plan: Run a manual test of 'ldb' to make sure bottom most level is compacted.

Reviewers: IslamAbdelRahman, yhchiang, anthony, kradhakrishnan, rven

Reviewed By: rven

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D50925
2015-11-17 18:07:11 -08:00
Dmitri Smirnov
ee2c3236dd Fix compilation problem on Windows.
char is not a valid template parameter for std::uniform_int_distribution
  according to the standard. Replacing with int should be just fine.
2015-10-29 11:29:18 -07:00
Igor Canadi
c97667d9f1 Fix RocksDB lite build for write_stress
Summary: We don't have access to GetLiveFilesMetadata() in RocksDB lite. If compiling write_stress for lite, I skip the check for leaked files, which depends on this function.

Test Plan: OPT=-DROCKSDB_LITE m write_stress

Reviewers: sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D49647
2015-10-28 16:37:39 -07:00
Igor Canadi
4b66d95344 Write stress test
Summary:
The goal of this diff is to create a simple stress test with focus on catching:
* bugs in compaction/flush processes, especially the ones that cause assertion errors
* bugs in the code that deletes obsolete files

There are two parts of the test:
* write_stress, a binary that writes to the database
* write_stress_runner.py, a script that invokes and kills write_stress

Here are some interesting parts of write_stress:
* Runs with very high concurrency of compactions and flushes (32 threads total) and tries to create a huge amount of small files
* The keys written to the database are not uniformly distributed -- there is a 3-character prefix that mutates occasionally (in prefix mutator thread), in such a way that the first character mutates slower than second, which mutates slower than third character. That way, the compaction stress tests some interesting compaction features like trivial moves and bottommost level calculation
* There is a thread that creates an iterator, holds it for couple of seconds and then iterates over all keys. This is supposed to test RocksDB's abilities to keep the files alive when there are references to them.
* Some writes trigger WAL sync. This is stress testing our WAL sync code.
* At the end of the run, we make sure that we didn't leak any of the sst files

write_stress_runner.py changes the mode in which we run write_stress and also kills and restarts it. There are some interesting characteristics:
* At the beginning we divide the full test runtime into smaller parts -- shorter runtimes (couple of seconds) and longer runtimes (100, 1000) seconds
* The first time we run write_stress, we destroy the old DB. Every next time during the test, we use the same DB.
* We can run in kill mode or clean-restart mode. Kill mode kills the write_stress violently.
* We can run in mode where delete_obsolete_files_with_fullscan is true or false
* We can run with low_open_files mode turned on or off. When it's turned on, we configure table cache to only hold a couple of files -- that way we need to reopen files every time we access them.

Another goal was to create a stress test without a lot of parameters. So tools/write_stress_runner.py should only take one parameter -- runtime_sec and it should figure out everything else on its own.

In a separate diff, I'll add this new test to our nightly legocastle runs.

Test Plan:
The goal of this test was to retroactively catch the following bugs: D33045, D48201, D46899, D42399. I failed to reproduce D48201, but all others have been caught!

When i reverted https://reviews.facebook.net/D33045:

     ./write_stress --runtime_sec=200 --low_open_files_mode=true
     Iterator statuts not OK: IO error: /fast-rocksdb-tmp/rocksdb_test/write_stress/089166.sst: No such file or directory

When i reverted https://reviews.facebook.net/D42399:

    python tools/write_stress_runner.py --runtime_sec=5000
    Running write_stress, will kill after 5 seconds: ./write_stress --runtime_sec=-1
    Running write_stress, will kill after 2 seconds: ./write_stress --runtime_sec=-1 --destroy_db=false --delete_obsolete_files_with_fullscan=true
    Running write_stress, will kill after 7 seconds: ./write_stress --runtime_sec=-1 --destroy_db=false
    Running write_stress, will kill after 5 seconds: ./write_stress --runtime_sec=-1 --destroy_db=false
    Running write_stress, will kill after 8 seconds: ./write_stress --runtime_sec=-1 --destroy_db=false --low_open_files_mode=true
    Write to DB failed: IO error: /fast-rocksdb-tmp/rocksdb_test/write_stress/019250.sst: No such file or directory
    ERROR: write_stress died with exitcode=-6

When i reverted https://reviews.facebook.net/D46899:

    python tools/write_stress_runner.py --runtime_sec=1000
    runtime: 1000
    Going to execute write stress for [3, 3, 100, 3, 2, 100, 1, 788]
    Running write_stress for 3 seconds: ./write_stress --runtime_sec=3 --low_open_files_mode=true
    Running write_stress for 3 seconds: ./write_stress --runtime_sec=3 --destroy_db=false --delete_obsolete_files_with_fullscan=true
    Running write_stress, will kill after 100 seconds: ./write_stress --runtime_sec=-1 --destroy_db=false --delete_obsolete_files_with_fullscan=true
    write_stress: db/db_impl.cc:2070: void rocksdb::DBImpl::MarkLogsSynced(uint64_t, bool, const rocksdb::Status&): Assertion `log.getting_synced' failed.
    ERROR: write_stress died with exitcode=-6

Reviewers: IslamAbdelRahman, yhchiang, rven, kradhakrishnan, sdong, anthony

Reviewed By: anthony

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D49533
2015-10-28 16:15:07 -07:00
sdong
ab0f3b964f crash_test to trigger some less frequent crash point more frequently
Summary: crash_test still has a very low chance to hit some crash point. Have another mode for covering them more likely.

Test Plan: Run crash_test and see db_stress is called with expected prameters.

Reviewers: kradhakrishnan, igor, anthony, rven, IslamAbdelRahman, yhchiang

Reviewed By: yhchiang

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D49473
2015-10-27 12:06:06 -07:00
Siying Dong
138876a62c Merge pull request #746 from ceph/wip-recycle
Add Options.recycle_log_file_num for Recycling WAL Files
2015-10-26 15:01:28 -07:00
Shusen Liu
d0d13ebf67 fix bug in db_crashtest.py
Summary:
in tools/db_crashtest.py, cmd_params['db'] by default is a lambda expression, not the actual db_name.
fix by get the db_name before passing it to gen_cmd.

Test Plan: run `make crashtest`

Reviewers: sdong

Reviewed By: sdong

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D49119
2015-10-20 22:01:11 -07:00
Shusen Liu
033c6f1add T7916298, bug fix
Summary: dbname => cmd_params['db']

Test Plan: Run `make crash_test`

Reviewers: sdong

Reviewed By: sdong

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D49077
2015-10-19 21:09:35 -07:00
Shusen Liu
4575de5b9e #7916298: merge tools/db_crashtest2.py into tools/db_crashtest.py
Summary:
merge tools/db_crashtest2.py into tools/db_crashtest.py

python tools/db_crashtest.py -h  # show help message, ALL parameters can be overwrite by arguments

Example usages:
python tools/db_crashtest.py blackbox  # run blackbox with default parameters
python tools/db_crashtest.py blackbox --simple
python tools/db_crashtest.py whitebox  # run whitebox with default parameters
python tools/db_crashtest.py whitebox --simple

all default parameters are identical to previous version.

Test Plan: `make crash_test` and make sure it can run with expected parameters pased to db_stress.

Reviewers: igor, rven, anthony, IslamAbdelRahman, yhchiang, sdong

Reviewed By: sdong

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D48567
2015-10-19 13:24:55 -07:00
Sage Weil
3ac13c99d1 log_reader: pass log_number and optional info_log to ctor
We will need the log number to validate the recycle-style CRCs.  The log
is helpful for debugging, but optional, as not all callers have it.

Signed-off-by: Sage Weil <sage@redhat.com>
2015-10-18 21:24:32 -04:00
sdong
f9ba79ecd6 crash_test to trigger fail points other than file appending more frequently
Summary:
For half of the crash_test run, disable fail point for file appending, in order to trigger other fail point more frequently.
Also, tune crash test parameter a little bit for it to initialize faster.

Test Plan: Run crash_test and make sure it issues db_stress commands as expected.

Reviewers: yhchiang, igor, rven, anthony, IslamAbdelRahman, kradhakrishnan

Reviewed By: kradhakrishnan

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D48843
2015-10-16 11:35:27 -07:00
sdong
680156ca61 crash_test to run with data sync on
Summary: Mode of data sync off is a much less used than the case of data sync on. Crash test should cover the more common case than a corner case. So turn data sync on in crash tests.

Test Plan: Run crash test and make sure it can run with expected parameters pased to db_stres.

Reviewers: IslamAbdelRahman, anthony, igor, rven, kradhakrishnan, yhchiang

Reviewed By: yhchiang

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D48729
2015-10-15 14:37:08 -07:00
sdong
e1a5ff857b Allow users to disable some kill points in db_stress
Summary:
Give a name for every kill point, and allow users to disable some kill points based on prefixes. The kill points can be passed by db_stress through a command line paramter. This provides a way for users to boost the chance of triggering low frequency kill points
This allow follow up changes in crash test scripts to improve crash test coverage.

Test Plan:
Manually run db_stress with variable values of --kill_random_test and --kill_prefix_blacklist. Like this:
 --kill_random_test=2 --kill_prefix_blacklist=Posix,WritableFileWriter::Append,WritableFileWriter::WriteBuffered,WritableFileWriter::Sync

Reviewers: igor, kradhakrishnan, rven, IslamAbdelRahman, yhchiang

Reviewed By: yhchiang

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D48735
2015-10-15 14:33:13 -07:00
Islam AbdelRahman
6d730b4ae7 Block tests under ROCKSDB_LITE
Summary:
This patch will block all tests (not including db_test) that don't compile / fail under ROCKSDB_LITE

Test Plan:
OPT=-DROCKSDB_LITE make db_compaction_filter_test -j64 &&
OPT=-DROCKSDB_LITE make db_compaction_test -j64 &&
OPT=-DROCKSDB_LITE make db_dynamic_level_test -j64 &&
OPT=-DROCKSDB_LITE make db_log_iter_test -j64 &&
OPT=-DROCKSDB_LITE make db_tailing_iter_test -j64 &&
OPT=-DROCKSDB_LITE make db_universal_compaction_test -j64 &&
OPT=-DROCKSDB_LITE make ldb_cmd_test -j64

make clean

make db_compaction_filter_test -j64 &&
make db_compaction_test -j64 &&
make db_dynamic_level_test -j64 &&
make db_log_iter_test -j64 &&
make db_tailing_iter_test -j64 &&
make db_universal_compaction_test -j64 &&
make ldb_cmd_test -j64

Reviewers: yhchiang, igor, sdong

Reviewed By: sdong

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D48723
2015-10-15 10:51:00 -07:00
Venkatesh Radhakrishnan
63e507c59c Move ldb and sst_dump from utils to tools.
Summary: As part of cleaning up dependencies for tech debt week, we are moving ldb and sst_dump tools from util to tools, since they are tools.

Test Plan: make check

Reviewers: sdong

Reviewed By: sdong

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D48747
2015-10-14 17:08:28 -07:00
Islam AbdelRahman
9babaeed16 Update dump_tool and undump_tool to accept Options
Summary:
Refactor dump_tool and undump_tool so that it's possible to use them with customized options
for example setting a specific comparator similar to what Dragon is doing with the LdbTool

https://phabricator.fb.com/diffusion/FBCODE/browse/master/dragon/tools/Ldb.cpp

Test Plan:
compiles
used it to dump / undump a dragon shard

Reviewers: sdong, yhchiang, igor

Reviewed By: igor

Subscribers: dhruba, adsharma

Differential Revision: https://reviews.facebook.net/D47853
2015-10-05 19:49:48 -07:00
Yueh-Hsuan Chiang
a263002a36 Fixed a tsan warning in db_stress.cc
Summary:
Fixed the following tsan warning in db_stress.cc

  WARNING: ThreadSanitizer: data race (pid=3163194)
  Read of size 8 at 0x7fd1797cb518 by thread T32:
    #0 VerifyDb tools/db_stress.cc:1731 (db_stress+0x000000040674)
    #1 rocksdb::StressTest::ThreadBody(void*) tools/db_stress.cc:1191 (db_stress+0x0000000625a9)
    #2 StartThreadWrapper util/env_posix.cc:1648 (db_stress+0x00000028bbbd)

  Previous write of size 8 at 0x7fd1797cb518 by thread T31:
    #0 VerifyDb tools/db_stress.cc:1726 (db_stress+0x00000004072a)
    #1 rocksdb::StressTest::ThreadBody(void*) tools/db_stress.cc:1191 (db_stress+0x0000000625a9)
    #2 StartThreadWrapper util/env_posix.cc:1648 (db_stress+0x00000028bbbd)

The cause is that in VerifyDb(), the static local const variable long max_key
can be read and written at the same time.  This patch fixed it by making it
non-static.

Test Plan: db_stress

Reviewers: igor, sdong, IslamAbdelRahman, anthony

Reviewed By: anthony

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D47703
2015-09-28 12:06:43 -07:00
Andres Noetzli
014fd55adc Support for SingleDelete()
Summary:
This patch fixes #7460559. It introduces SingleDelete as a new database
operation. This operation can be used to delete keys that were never
overwritten (no put following another put of the same key). If an overwritten
key is single deleted the behavior is undefined. Single deletion of a
non-existent key has no effect but multiple consecutive single deletions are
not allowed (see limitations).

In contrast to the conventional Delete() operation, the deletion entry is
removed along with the value when the two are lined up in a compaction. Note:
The semantics are similar to @igor's prototype that allowed to have this
behavior on the granularity of a column family (
https://reviews.facebook.net/D42093 ). This new patch, however, is more
aggressive when it comes to removing tombstones: It removes the SingleDelete
together with the value whenever there is no snapshot between them while the
older patch only did this when the sequence number of the deletion was older
than the earliest snapshot.

Most of the complex additions are in the Compaction Iterator, all other changes
should be relatively straightforward. The patch also includes basic support for
single deletions in db_stress and db_bench.

Limitations:
- Not compatible with cuckoo hash tables
- Single deletions cannot be used in combination with merges and normal
  deletions on the same key (other keys are not affected by this)
- Consecutive single deletions are currently not allowed (and older version of
  this patch supported this so it could be resurrected if needed)

Test Plan: make all check

Reviewers: yhchiang, sdong, rven, anthony, yoshinorim, igor

Reviewed By: igor

Subscribers: maykov, dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D43179
2015-09-17 11:42:56 -07:00
Dmitry Marakasov
4b0b0201c9 Fix `integer overflow in expression' error 2015-09-15 14:41:00 +03:00
Amit Arya
7a31960ee9 Tests for ManifestDumpCommand and ListColumnFamiliesCommand
Summary:
Added tests for two LDBCommands namely i) ManifestDumpCommand and ii) ListColumnFamiliesCommand.
+ Minor fix in the sscanf formatter (along relace C cast with C++ cast) + replacing localtime with localtime_r which is thread safe.

Test Plan: make all && ./tools/ldb_test.py

Reviewers: anthony, igor, IslamAbdelRahman, kradhakrishnan, lgalanis, rven, sdong

Reviewed By: sdong

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D45819
2015-09-08 14:23:42 -07:00
sdong
7a0dbdf3ac Add ZSTD (not final format) compression type
Summary: Add ZSTD compression type. The same way as adding LZ4.

Test Plan: run all tests. Generate files in db_bench. Make sure reads succeed. But the SST files cannot be opened in older versions. Also some other adhoc tests.

Reviewers: rven, anthony, IslamAbdelRahman, kradhakrishnan, igor

Reviewed By: igor

Subscribers: MarkCallaghan, maykov, yoshinorim, leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D45747
2015-08-28 11:01:13 -07:00
Mark Callaghan
4c81ac0c59 Fix benchmark report script
Summary:
db_bench output now displays Percentile many times with --statistics after
read IO latency histograms were added. So I only need the last one in the report output.

Task ID: #

Blame Rev:

Test Plan:
run run_flash_bench.sh

Revert Plan:

Database Impact:

Memcache Impact:

Other Notes:

EImportant:

- begin *PUBLIC* platform impact section -
Bugzilla: #
- end platform impact -

Reviewers: igor

Reviewed By: igor

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D45093
2015-08-22 12:18:00 -07:00
Ari Ekmekji
b6def58f73 Changed 'num_subcompactions' to the more accurate 'max_subcompactions'
Summary:
Up until this point we had DbOptions.num_subcompactions, but
it is semantically more correct to call this max_subcompactions since
we will schedule *up to* DbOptions.max_subcompactions smaller compactions
at a time during a compaction job.

I also added a --subcompactions option to db_bench

Test Plan: make all   make check

Reviewers: sdong, igor, anthony, yhchiang

Reviewed By: yhchiang

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D45069
2015-08-21 14:25:34 -07:00
Mark Callaghan
41a0e2811d Improve defaults for benchmarks
Summary:
Changes include:
* don't sync-on-commit for single writer thread in readwhile... tests
* make default block size 8kb rather than 4kb to avoid too small blocks after compression
* use snappy instead of zlib to avoid stalls from compression latency
* disable statistics
* use bytes_per_sync=8M to reduce throughput loss on disk
* use open_files=-1 to reduce mutex contention

Task ID: #

Blame Rev:

Test Plan:
run benchmark

Revert Plan:

Database Impact:

Memcache Impact:

Other Notes:

EImportant:

- begin *PUBLIC* platform impact section -
Bugzilla: #
- end platform impact -

Reviewers: igor

Reviewed By: igor

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D44961
2015-08-20 18:59:10 -07:00
Ari Ekmekji
f0da6977a3 [Parallel L0-L1 Compaction Prep]: Giving Subcompactions Their Own State
Summary:
In prepration for running multiple threads at the same time during
a compaction job, this patch assigns each subcompaction its own state
(instead of sharing the one global CompactionState). Each subcompaction then
uses this state to update its statistics, keep track of its snapshots, etc.
during the course of execution. Then at the end of all the executions the
statistics are aggregated across the subcompactions so that the final result
is the same as if only one larger compaction had run.

Test Plan: ./db_test  ./db_compaction_test  ./compaction_job_test

Reviewers: sdong, anthony, igor, noetzli, yhchiang

Reviewed By: yhchiang

Subscribers: MarkCallaghan, dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D43239
2015-08-18 11:06:23 -07:00
Andres Notzli
4249f159d5 Removing duplicate code in db_bench/db_stress, fixing typos
Summary:
While working on single delete support for db_bench, I realized that
db_bench/db_stress contain a bunch of duplicate code related to
copmression and found some typos. This patch removes duplicate code,
typos and a redundant #ifndef in internal_stats.cc.

Test Plan: make db_stress && make db_bench && ./db_bench --benchmarks=compress,uncompress

Reviewers: yhchiang, sdong, rven, anthony, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D43965
2015-08-11 11:46:15 -07:00
sdong
cf3e05304f crash_test cleans up directory before testing if TEST_TMPDIR is set
Summary: In a recent change, crash_test can put data under TEST_TMPDIR. However, the directory is not cleaned before running the test, which may cause unexpected results. Clean it.

Test Plan: Run white and black box crash test against non-existing, or non-empty but not compactible DBs, and make sure it works as expected.

Reviewers: kradhakrishnan, rven, yhchiang, IslamAbdelRahman

Reviewed By: IslamAbdelRahman

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D43515
2015-08-04 14:59:28 -07:00
sdong
e2a3bfe74b First half of whitebox_crash_test to keep crashing the same DB
Summary: Currently, whitebox crash test is not really executed, because the DB is destroyed after each crash. With this fix, in the first half of the time, DB will keep opening the crashed DB and continue from there.

Test Plan: "make whitebox_crash_test" and see the same DB keeps crashing and being reopened.

Reviewers: IslamAbdelRahman, yhchiang, rven, kradhakrishnan

Reviewed By: kradhakrishnan

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D43503
2015-08-04 12:16:44 -07:00
sdong
2e73bd4ff2 crash_test to put DB under TEST_TMPDIR
Summary: Currently crash_test only puts data under /tmp. It is less flexible if we want to cover different file systems or media. Make crash_test to appreciate TEST_TMPDIR so that users can run it against another file system.

Test Plan: Run blackbox_crash_test and whitebox_crash_test with or without TEST_TMPDIR set and make sure DBs are put in the right place

Reviewers: kradhakrishnan, yhchiang, rven, IslamAbdelRahman

Reviewed By: IslamAbdelRahman

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D43509
2015-08-04 12:10:55 -07:00
sdong
1205bdbcee crash_test to cover simply cases
Summary:
crash_test now only runs complicated options, multiple column families, prefix hash, frequently changing options, many compaction threads, etc. These options are good to cover new features but we loss coverage in most common use cases. Furthermore, by running only for multiple column families, we are not able to create LSM trees that are large enough to cover some stress cases.
Make half of crash_test runs the simply tests: single column family, default mem table, one compaction thread, no change options.

Test Plan: Run crash_test

Reviewers: rven, yhchiang, IslamAbdelRahman, kradhakrishnan

Reviewed By: kradhakrishnan

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D43461
2015-08-04 12:08:38 -07:00
Andres Noetzli
bd852bf118 Fixed typos in db_stress
Summary: Fixed typos.

Test Plan: None

Reviewers: igor, yhchiang, sdong, anthony, rven

Reviewed By: rven

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D43365
2015-07-31 14:11:43 -07:00
sdong
7bfae3a723 tools/db_crashtest2.py should run on the same DB
Summary:
Crash tests are supposed to restart the same DB after crashing, but it is now opening a different DB. Fix it.
It's probably a leftover of https://reviews.facebook.net/D17073

Test Plan: Run the test and make sure the same Db is opened.

Reviewers: kradhakrishnan, rven, igor, IslamAbdelRahman, yhchiang, anthony

Reviewed By: anthony

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D43197
2015-07-29 15:50:37 -07:00
Dmitri Smirnov
31b35c902e Add missing tests, fix db_sanity
Add heap_test, merge_helper_test
 Fix uninitialized pointers in db_sanity_test that cause SIGSEV when DB::Open fails in case compression is not linked.
2015-07-21 18:04:28 -07:00
Igor Canadi
35ca59364c Don't let flushes preempt compactions
Summary:
When we first started, max_background_flushes was 0 by default and compaction thread was executing flushes (since there was no flush thread). Then, we switched the default max_background_flushes to 1. However, we still support the case where there is no flush thread and flushes are done in compaction. This is making our code a bit more complicated. By not supporting this use-case we can make our code simpler.

We have a special case that when you set max_background_flushes to 0, we
schedule the flush to execute on the compaction thread.

Test Plan: make check (there might be some unit tests that depend on this behavior)

Reviewers: IslamAbdelRahman, yhchiang, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D41931
2015-07-17 12:02:52 -07:00
Igor Canadi
e94c510c3f Make ldb_test not depend on compression
Summary: This is failing our tsan tests. Our new behavior is to fail DB::Open() if the requested compression is not available. The easiest fix is to make ldb_test not depend on compression.

Test Plan: python tools/ldb_test.py

Reviewers: sdong, yhchiang, anthony

Reviewed By: anthony

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D42075
2015-07-14 23:13:23 -07:00
Igor Canadi
a9c5109515 Deprecate purge_redundant_kvs_while_flush
Summary: This option is guarding the feature implemented 2 and a half years ago: D8991. The feature was enabled by default back then and has been running without issues. There is no reason why any client would turn this feature off. I found no reference in fbcode.

Test Plan: none

Reviewers: sdong, yhchiang, anthony, dhruba

Reviewed By: dhruba

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D42063
2015-07-14 13:07:02 +02:00
Islam AbdelRahman
e290f5d3ce Block reduce_levels_test in ROCKSDB_LITE
Summary: Block reduce_levels_test in ROCKSDB_LITE as LDBCommand is not supported

Test Plan:
make reduce_levels_test -j64
OPT=-DROCKSDB_LITE make reduce_levels_test -j64
make check -j64

Reviewers: sdong, igor, yhchiang

Reviewed By: yhchiang

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D41967
2015-07-13 18:45:55 -07:00
sdong
f9728640f3 "make format" against last 10 commits
Summary: This helps Windows port to format their changes, as discussed. Might have formatted some other codes too becasue last 10 commits include more.

Test Plan: Build it.

Reviewers: anthony, IslamAbdelRahman, kradhakrishnan, yhchiang, igor

Reviewed By: igor

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D41961
2015-07-13 13:50:18 -07:00
Dmitri Smirnov
9dbde7277c Merge remote-tracking branch 'origin' into ms_win_port 2015-07-02 11:34:22 -07:00
Dmitri Smirnov
18285c1e2f Windows Port from Microsoft
Summary: Make RocksDb build and run on Windows to be functionally
 complete and performant. All existing test cases run with no
 regressions. Performance numbers are in the pull-request.

 Test plan: make all of the existing unit tests pass, obtain perf numbers.

 Co-authored-by: Praveen Rao praveensinghrao@outlook.com
 Co-authored-by: Sherlock Huang baihan.huang@gmail.com
 Co-authored-by: Alex Zinoviev alexander.zinoviev@me.com
 Co-authored-by: Dmitri Smirnov dmitrism@microsoft.com
2015-07-01 16:13:56 -07:00
Igor Canadi
619167ee66 Fix mac compile
Summary: as title

Test Plan: make check

Reviewers: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D40785
2015-06-26 10:29:24 -07:00
Michael Callahan
15325bf55b First version of rocksdb_dump and rocksdb_undump.
Summary: Hack up rocksdb_dump and rocksdb_undump utilities to get this task rolling/promote discussion.

Test Plan: Dump/undump databases recursively to see if nothing is lost.

Reviewers: sdong, yhchiang, rven, anthony, kradhakrishnan, igor

Reviewed By: igor

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D37269
2015-06-19 16:24:36 -07:00
Yueh-Hsuan Chiang
2e764f06ea [API Change] Improve EventListener::OnFlushCompleted interface
Summary:
EventListener::OnFlushCompleted() now passes a structure instead
of a list of parameters.  This minimizes the API change in the
future.

Test Plan:
listener_test
compact_files_test
example/compact_files_example

Reviewers: kradhakrishnan, sdong, IslamAbdelRahman, rven, igor

Reviewed By: rven, igor

Subscribers: IslamAbdelRahman, rven, dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D39543
2015-06-05 12:28:51 -07:00
Yueh-Hsuan Chiang
fc83821270 Add EventListener::OnTableFileCreated()
Summary:
Add EventListener::OnTableFileCreated(), which will be called
when a table file is created.  This patch is part of the
EventLogger and EventListener integration.

Test Plan: Augment existing test in db/listener_test.cc

Reviewers: anthony, kradhakrishnan, rven, igor, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D38865
2015-06-02 14:12:23 -07:00
Yueh-Hsuan Chiang
16c197627a Fixed db_stress
Summary:
Fixed db_stress by correcting the verification of column family
names in the Listener of db_stress

Test Plan: db_stress

Reviewers: igor, sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D39255
2015-05-30 14:26:00 -07:00
Yueh-Hsuan Chiang
832271f6b1 Fixed a compile warning in db_stress in NDEBUG mode.
Summary: Fixed a compile warning in db_stress in NDEBUG mode.

Test Plan: make OPT=-DNDEBUG db_stress

Reviewers: sdong, anthony

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D39213
2015-05-29 15:00:25 -07:00
agiardullo
dc9d70de65 Optimistic Transactions
Summary: Optimistic transactions supporting begin/commit/rollback semantics.  Currently relies on checking the memtable to determine if there are any collisions at commit time.  Not yet implemented would be a way of enuring the memtable has some minimum amount of history so that we won't fail to commit when the memtable is empty.  You should probably start with transaction.h to get an overview of what is currently supported.

Test Plan: Added a new test, but still need to look into stress testing.

Reviewers: yhchiang, igor, rven, sdong

Reviewed By: sdong

Subscribers: adamretter, MarkCallaghan, leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D33435
2015-05-29 14:36:35 -07:00
Yueh-Hsuan Chiang
d5a0c0e69b Fixed a compile warning in db_stress
Summary:
Fixed the following compile warning in db_stress:
error: 'OnCompactionCompleted' overrides a member function but is not marked 'override' [-Werror,-Winconsistent-missing-override]

Test Plan: make db_stress

Reviewers: sdong, igor, anthony

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D39207
2015-05-29 13:37:59 -07:00
Yueh-Hsuan Chiang
9ffc8ba024 Include EventListener in stress test.
Summary: Include EventListener in stress test.

Test Plan: make blackbox_crash_test whitebox_crash_test

Reviewers: anthony, igor, rven, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D39105
2015-05-29 13:17:49 -07:00
agiardullo
c815351038 Support saving history in memtable_list
Summary:
For transactions, we are using the memtables to validate that there are no write conflicts.  But after flushing, we don't have any memtables, and transactions could fail to commit.  So we want to someone keep around some extra history to use for conflict checking.  In addition, we want to provide a way to increase the size of this history if too many transactions fail to commit.

After chatting with people, it seems like everyone prefers just using Memtables to store this history (instead of a separate history structure).  It seems like the best place for this is abstracted inside the memtable_list.  I decide to create a separate list in MemtableListVersion as using the same list complicated the flush/installalflushresults logic too much.

This diff adds a new parameter to control how much memtable history to keep around after flushing.  However, it sounds like people aren't too fond of adding new parameters.  So I am making the default size of flushed+not-flushed memtables be set to max_write_buffers.  This should not change the maximum amount of memory used, but make it more likely we're using closer the the limit.  (We are now postponing deleting flushed memtables until the max_write_buffer limit is reached).  So while we might use more memory on average, we are still obeying the limit set (and you could argue it's better to go ahead and use up memory now instead of waiting for a write stall to happen to test this limit).

However, if people are opposed to this default behavior, we can easily set it to 0 and require this parameter be set in order to use transactions.

Test Plan: Added a xfunc test to play around with setting different values of this parameter in all tests.  Added testing in memtablelist_test and planning on adding more testing here.

Reviewers: sdong, rven, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D37443
2015-05-28 16:34:24 -07:00
Aaron Schlesinger
6116ccc232 moving dockerfile to root 2015-05-22 16:06:53 -07:00
Aaron Schlesinger
d90cee9fd3 adding docker build script and dockerfile 2015-05-22 16:03:39 -07:00
Mark Callaghan
88044340c1 Add Size-GB column to benchmark reports
Summary:
See https://gist.github.com/mdcallag/b867ee051d765760be0d for a sample

Task ID: #

Blame Rev:

Test Plan:
Revert Plan:

Database Impact:

Memcache Impact:

Other Notes:

EImportant:

- begin *PUBLIC* platform impact section -
Bugzilla: #
- end platform impact -

Reviewers: igor

Reviewed By: igor

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D37971
2015-05-02 07:46:12 -07:00
Mark Callaghan
a087f80e9d Add scripts to run leveldb benchmark
Summary:
This runs a benchmark for LevelDB similar to what we have
in tools/run_flash_bench.sh. It requires changes to db_bench that I published
in a LevelDB fork on github.  Some results are at:
http://smalldatum.blogspot.com/2015/04/comparing-leveldb-and-rocksdb-take-2.html

Sample output:
ops/sec	mb/sec	usec/op	avg	p50	Test
525	16.4	1904.5	1904.5	111.0	fillseq.v32768
75187	15.5	13.3	13.3	4.4	fillseq.v200
28328	5.8	35.3	35.3	4.7	overwrite.t1.s0
175438	0.0	5.7	5.7	4.4	readrandom.t1
28490	5.9	35.1	35.1	4.7	overwrite.t1.s0
121951	0.0	8.2	8.2	5.7	readwhilewriting.t1

Task ID: #

Blame Rev:

Test Plan:
Revert Plan:

Database Impact:

Memcache Impact:

Other Notes:

EImportant:

- begin *PUBLIC* platform impact section -
Bugzilla: #
- end platform impact -

Reviewers: igor

Reviewed By: igor

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D37749
2015-04-27 19:32:56 -07:00
Igor Canadi
4961a9622c Fix build
Summary: Build broken by 6ede020dc4

Test Plan: make all

Reviewers: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D37689
2015-04-25 21:12:52 -07:00
clark.kang
6ede020dc4 fix typos 2015-04-25 18:14:27 +09:00
Mark Callaghan
283a042969 Set --seed per test
Summary:
This is done to avoid having each thread use the same seed between runs
of db_bench. Without this we can inflate the OS filesystem cache hit rate on
reads for read heavy tests and generally see the same key sequences get generated
between teste runs.

Task ID: #

Blame Rev:

Test Plan:
Revert Plan:

Database Impact:

Memcache Impact:

Other Notes:

EImportant:

- begin *PUBLIC* platform impact section -
Bugzilla: #
- end platform impact -

Reviewers: igor

Reviewed By: igor

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D37563
2015-04-23 09:18:25 -07:00
Mark Callaghan
78dbd087d1 Improve benchmark scripts
Summary:
This adds:
1) use of --level_compaction_dynamic_level_bytes=true
2) use of --bytes_per_sync=2M
The second is a big win for disks. The first helps in general.

This also adds a new test, fillseq with 32kb values to increase the peak
ingest and make it more likely that storage limits throughput.

Sample outpout from the first 3 tests - https://gist.github.com/mdcallag/e793bd3038e367b05d6f

Task ID: #

Blame Rev:

Test Plan:
Revert Plan:

Database Impact:

Memcache Impact:

Other Notes:

EImportant:

- begin *PUBLIC* platform impact section -
Bugzilla: #
- end platform impact -

Reviewers: igor

Reviewed By: igor

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D37509
2015-04-22 13:23:08 -07:00
Mark Callaghan
9da8748016 Get benchmark.sh loads to run faster
Summary:
This changes loads to use vector memtable and disable the WAL. This also
increases the chance we will see IO bottlenecks during loads which is good to stress
test HW. But I also think it is a good way to load data quickly as this is a bulk
operation and the WAL isn't needed.

The two numbers below are the MB/sec rates for fillseq, bulkload using a skiplist
or vector memtable and the WAL enabled or disabled. There is a big benefit from
using the vector memtable and WAL disabled. Alas there is also a perf bug in
the use of std::sort for ordered input when the vector is flushed. Task is open
for that.
  112, 66 - skiplist with wal
  250, 116 - skiplist without wal
  110, 108 - vector with wal
  232, 370 - vector without wal

Task ID: #

Blame Rev:

Test Plan:
Revert Plan:

Database Impact:

Memcache Impact:

Other Notes:

EImportant:

- begin *PUBLIC* platform impact section -
Bugzilla: #
- end platform impact -

Reviewers: igor

Reviewed By: igor

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D36957
2015-04-13 17:18:07 -07:00
sdong
ee9bdd38a1 Script to check whether RocksDB can read DB generated by previous releases and vice versa
Summary: Add a script, which checks out changes from a list of tags, build them and load the same data into it. In the last, checkout the target build and make sure it can successfully open DB and read all the data. It is implemented through ldb tool, because ldb tool is available from all previous builds so that we don't have to cross build anything.

Test Plan: Run the script.

Reviewers: yhchiang, rven, anthony, kradhakrishnan, igor

Reviewed By: igor

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D36639
2015-04-08 16:04:59 -07:00
Mark Callaghan
3be82bc894 Add p99.9 and p99.99 response time to benchmark report, add new summary report
Summary:
This adds p99.9 and p99.99 response times to the benchmark report and
adds a second report, report2.txt that has tests listed in test order rather
than the time in which they were run, so overwrite tests are listed for
all thread counts, then update etc.

Also changes fillseq to compress all levels to avoid write-amp from rewriting
uncompressed files when they reach the first level to compress.

Increase max_write_buffer_number to avoid stalls during fillseq and make
max_background_flushes agree with max_write_buffer_number.

See https://gist.github.com/mdcallag/297ff4316a25cb2988f7 for an example
of the new report (report2.txt)

Task ID: #

Blame Rev:

Test Plan:
Revert Plan:

Database Impact:

Memcache Impact:

Other Notes:

EImportant:

- begin *PUBLIC* platform impact section -
Bugzilla: #
- end platform impact -

Reviewers: igor

Reviewed By: igor

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D36537
2015-04-06 10:42:12 -07:00
Mark Callaghan
1bd70fb54a Add --stats_interval_seconds to db_bench
Summary:
The --stats_interval_seconds determines interval for stats reporting
and overrides --stats_interval when set. I also changed tools/benchmark.sh
to report stats every 60 seconds so I can avoid trying to figure out a
good value for --stats_interval per test and per storage device.

Task ID: #6631621

Blame Rev:

Test Plan:
run tools/run_flash_bench, look at output

Revert Plan:

Database Impact:

Memcache Impact:

Other Notes:

EImportant:

- begin *PUBLIC* platform impact section -
Bugzilla: #
- end platform impact -

Reviewers: igor

Reviewed By: igor

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D36189
2015-03-30 12:58:32 -07:00
Mark Callaghan
99ec2412e5 Make the benchmark scripts configurable and add tests
Summary:
This makes run_flash_bench.sh configurable. Previously it was hardwired for 1B keys and tests
ran for 12 hours each. That kept me from using it. This makes it configuable, adds more tests,
makes the duration per-test configurable and refactors the test scripts.

Adds the seekrandomwhilemerging test to db_bench which is the same as seekrandomwhilewriting except
the writer thread does Merge rather than Put.

Forces the stall-time column in compaction IO stats to use a fixed format (H:M:S) which makes
it easier to scrape and parse. Also adds an option to AppendHumanMicros to force a fixed format.
Sometimes automation and humans want different format.

Calls thread->stats.AddBytes(bytes); in db_bench for more tests to get the MB/sec summary
stats in the output at test end.

Adds the average ingest rate to compaction IO stats. Output now looks like:
https://gist.github.com/mdcallag/2bd64d18be1b93adc494

More information on the benchmark output is at https://gist.github.com/mdcallag/db43a58bd5ac624f01e1

For benchmark.sh changes default RocksDB configuration to reduce stalls:
* min_level_to_compress from 2 to 3
* hard_rate_limit from 2 to 3
* max_grandparent_overlap_factor and max_bytes_for_level_multiplier from 10 to 8
* L0 file count triggers from 4,8,12 to 4,12,20 for (start,stall,stop)

Task ID: #6596829

Blame Rev:

Test Plan:
run tools/run_flash_bench.sh

Revert Plan:

Database Impact:

Memcache Impact:

Other Notes:

EImportant:

- begin *PUBLIC* platform impact section -
Bugzilla: #
- end platform impact -

Reviewers: igor

Reviewed By: igor

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D36075
2015-03-30 11:28:25 -07:00
Yueh-Hsuan Chiang
cfa576402c Make auto_sanity_test always use the db_sanity_test.cc of the newer commit.
Summary:
Whenever we add new tests in db_sanity_test.cc, the verification test
will fail since the old version db_sanity_test.cc does not have the
newly added test.  This patch makes auto_sanity_test.sh always use
the db_sanity_test.cc of the newer commit.

As a result, a macro guard is added to allow db_sanity_test.cc to be
backward compatible.

Test Plan: tools/auto_sanity_check.sh

Reviewers: sdong, rven, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D35997
2015-03-27 11:32:49 -07:00
Mark Callaghan
dfccc7b4e2 Add readwhilemerging benchmark
Summary:
This is like readwhilewriting but uses Merge rather than Put in the writer thread.
I am using it for in-progress benchmarks. I don't think the other benchmarks for Merge
cover this behavior. The purpose for this test is to measure read performance when
readers might have to merge results. This will also benefit from work-in-progress
to add skewed key generation.

Task ID: #

Blame Rev:

Test Plan:
Revert Plan:

Database Impact:

Memcache Impact:

Other Notes:

EImportant:

- begin *PUBLIC* platform impact section -
Bugzilla: #
- end platform impact -

Reviewers: igor

Reviewed By: igor

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D35115
2015-03-18 13:50:52 -07:00
Igor Sugak
b4b69e4f77 rocksdb: switch to gtest
Summary:
Our existing test notation is very similar to what is used in gtest. It makes it easy to adopt what is different.
In this diff I modify existing [[ https://code.google.com/p/googletest/wiki/Primer#Test_Fixtures:_Using_the_Same_Data_Configuration_for_Multiple_Te | test fixture ]] classes to inherit from `testing::Test`. Also for unit tests that use fixture class, `TEST` is replaced with `TEST_F` as required in gtest.

There are several custom `main` functions in our existing tests. To make this transition easier, I modify all `main` functions to fallow gtest notation. But eventually we can remove them and use implementation of `main` that gtest provides.

```lang=bash
% cat ~/transform
#!/bin/sh
files=$(git ls-files '*test\.cc')
for file in $files
do
  if grep -q "rocksdb::test::RunAllTests()" $file
  then
    if grep -Eq '^class \w+Test {' $file
    then
      perl -pi -e 's/^(class \w+Test) {/${1}: public testing::Test {/g' $file
      perl -pi -e 's/^(TEST)/${1}_F/g' $file
    fi
    perl -pi -e 's/(int main.*\{)/${1}::testing::InitGoogleTest(&argc, argv);/g' $file
    perl -pi -e 's/rocksdb::test::RunAllTests/RUN_ALL_TESTS/g' $file
  fi
done
% sh ~/transform
% make format
```

Second iteration of this diff contains only scripted changes.

Third iteration contains manual changes to fix last errors and make it compilable.

Test Plan:
Build and notice no errors.
```lang=bash
% USE_CLANG=1 make check -j55
```
Tests are still testing.

Reviewers: meyering, sdong, rven, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D35157
2015-03-17 14:08:00 -07:00
Igor Sugak
9fd6edf81c rocksdb: Replace ASSERT* with EXPECT* in functions that does not return void value
Summary:
gtest does not use exceptions to fail a unit test by design, and `ASSERT*`s are implemented using `return`. As a consequence we cannot use `ASSERT*` in a function that does not return `void` value ([[ https://code.google.com/p/googletest/wiki/AdvancedGuide#Assertion_Placement | 1]]), and have to fix our existing code. This diff does this in a generic way, with no manual changes.

In order to detect all existing `ASSERT*` that are used in functions that doesn't return void value, I change the code to generate compile errors for such cases.

In `util/testharness.h` I defined `EXPECT*` assertions, the same way as `ASSERT*`, and redefined `ASSERT*` to return `void`. Then executed:

```lang=bash
% USE_CLANG=1 make all -j55 -k 2> build.log
% perl -naF: -e 'print "-- -number=".$F[1]." ".$F[0]."\n" if  /: error:/' \
build.log | xargs -L 1 perl -spi -e 's/ASSERT/EXPECT/g if $. == $number'
% make format
```
After that I reverted back change to `ASSERT*` in `util/testharness.h`. But preserved introduced `EXPECT*`, which is the same as `ASSERT*`. This will be deleted once switched to gtest.

This diff is independent and contains manual changes only in `util/testharness.h`.

Test Plan:
Make sure all tests are passing.
```lang=bash
% USE_CLANG=1 make check
```

Reviewers: igor, lgalanis, sdong, yufei.zhu, rven, meyering

Reviewed By: meyering

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D33333
2015-03-16 20:52:32 -07:00
Mark Callaghan
58878f1c6a Switch to use_existing_db=1 for updaterandom and mergerandom
Summary:
Without this change about half of the updaterandom reads and merge puts will be for keys that don't exist.
I think it is better for these tests to start with a full database and use fillseq to fill it.

Task ID: #

Blame Rev:

Test Plan:
Revert Plan:

Database Impact:

Memcache Impact:

Other Notes:

EImportant:

- begin *PUBLIC* platform impact section -
Bugzilla: #
- end platform impact -

Reviewers: igor

Reviewed By: igor

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D35043
2015-03-14 08:36:57 -07:00
Leonidas Galanis
e126e0da5b Single threaded tests -> sync=0 Multi threaded tests -> sync=1 by default unless DB_BENCH_NO_SYNC is defined
Summary:
Single threaded tests -> sync=0 Multi threaded tests -> sync=1 by default unless DB_BENCH_NO_SYNC is defined.

Also added updaterandom and mergerandom with putOperator. I am waiting for some results from udb on this.

Test Plan:
DB_BENCH_NO_SYNC=1 WAL_DIR=/tmp OUTPUT_DIR=/tmp/b DB_DIR=/tmp ./tools/benchmark.sh debug,bulkload,fillseq,overwrite,filluniquerandom,readrandom,readwhilewriting,updaterandom,mergerandom

WAL_DIR=/tmp OUTPUT_DIR=/tmp/b DB_DIR=/tmp ./tools/benchmark.sh debug,bulkload,fillseq,overwrite,filluniquerandom,readrandom,readwhilewriting,updaterandom,mergerandom

Verify sync settings

Reviewers: sdong, MarkCallaghan, igor, rven

Reviewed By: igor, rven

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D34185
2015-03-06 14:12:53 -08:00
Igor Sugak
62247ffa3b rocksdb: Add missing override
Summary:
When using latest clang (3.6 or 3.7/trunck) rocksdb is failing with many errors. Almost all of them are missing override errors. This diff adds missing override keyword. No manual changes.

Prerequisites: bear and clang 3.5 build with extra tools

```lang=bash
% USE_CLANG=1 bear make all # generate a compilation database http://clang.llvm.org/docs/JSONCompilationDatabase.html
% clang-modernize -p . -include . -add-override
% make format
```

Test Plan:
Make sure all tests are passing.
```lang=bash
% #Use default fb code clang.
% make check
```
Verify less error and no missing override errors.
```lang=bash
% # Have trunk clang present in path.
% ROCKSDB_NO_FBCODE=1 CC=clang CXX=clang++ make
```

Reviewers: igor, kradhakrishnan, rven, meyering, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D34077
2015-02-26 11:28:41 -08:00
Igor Sugak
62f7a1be4f rocksdb: Fixed 'Dead assignment' and 'Dead initialization' scan-build warnings
Summary:
This diff contains trivial fixes for 6 scan-build warnings:

**db/c_test.c**
`db` variable is never read. Removed assignment.
scan-build report:
http://home.fburl.com/~sugak/latest20/report-9b77d2.html#EndPath

**db/db_iter.cc**
`skipping` local variable is assigned to false. Then in the next switch block the only "non return" case assign `skipping` to true, the rest cases don't use it and all do return.
scan-build report:
http://home.fburl.com/~sugak/latest20/report-13fca7.html#EndPath

**db/log_reader.cc**
In `bool Reader::SkipToInitialBlock()` `offset_in_block` local variable is assigned to 0 `if (offset_in_block > kBlockSize - 6)` and then never used. Removed the assignment and renamed it to `initial_offset_in_block` to avoid confusion.
scan-build report:
http://home.fburl.com/~sugak/latest20/report-a618dd.html#EndPath

In `bool Reader::ReadRecord(Slice* record, std::string* scratch)` local variable `in_fragmented_record` in switch case `kFullType` block is assigned to false and then does `return` without use. In the other switch case `kFirstType` block the same `in_fragmented_record` is assigned to false, but later assigned to true without prior use. Removed assignment for both cases.
scan-build reprots:
http://home.fburl.com/~sugak/latest20/report-bb86b0.html#EndPath
http://home.fburl.com/~sugak/latest20/report-a975be.html#EndPath

**table/plain_table_key_coding.cc**
Local variable `user_key_size` is assigned when declared. But then in both places where it is used assigned to `static_cast<uint32_t>(key.size() - 8)`. Changed to initialize the variable to the proper value in declaration.
scan-build report:
http://home.fburl.com/~sugak/latest20/report-9e6b86.html#EndPath

**tools/db_stress.cc**
Missing `break` in switch case block. This seems to be a bug. Added missing `break`.

Test Plan:
Make sure all tests are passing and scan-build does not report 'Dead assignment' and 'Dead initialization' bugs.
```lang=bash
% make check
% make analyze
```

Reviewers: meyering, igor, kradhakrishnan, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D33795
2015-02-23 14:10:09 -08:00
Ramki Balasubramanian
5d1151deba Added simple monitoring script to monitor overusage of memory in db_bench
Summary: rockuse more memory that asked to. Monitor and report.

Test Plan: run the pro with conditions to simulate the overusage. It should report that the process is using more memory than needed.

Reviewers: yhchiang, rven, sdong, igor

Reviewed By: igor

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D33249
2015-02-11 18:40:11 -08:00
Igor Canadi
e8bf2310a0 Remove blob store from the codebase
Summary: We don't have plans to work on this in the short term. If we ever resurrect the project, we can find the code in the history. No need for it to linger around

Test Plan: no test

Reviewers: yhchiang, rven, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D32349
2015-01-27 16:55:33 -08:00
Igor Canadi
2bb059007b Change db_stress to work with format_version == 2 2015-01-14 16:25:36 -08:00
Igor Canadi
9ab5adfc59 New BlockBasedTable version -- better compressed block format
Summary:
This diff adds BlockBasedTable format_version = 2. New format version brings better compressed block format for these compressions:
1) Zlib -- encode decompressed size in compressed block header
2) BZip2 -- encode decompressed size in compressed block header
3) LZ4 and LZ4HC -- instead of doing memcpy of size_t encode size as varint32. memcpy is very bad because the DB is not portable accross big/little endian machines or even platforms where size_t might be 8 or 4 bytes.

It does not affect format for snappy.

If you write a new database with format_version = 2, it will not be readable by RocksDB versions before 3.10. DB::Open() will return corruption in that case.

Test Plan:
Added a new test in db_test.
I will also run db_bench and verify VSIZE when block_cache == 1GB

Reviewers: yhchiang, rven, MarkCallaghan, dhruba, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D31461
2015-01-14 16:24:24 -08:00
Igor Canadi
516a04267e Add LZ4 compression to sanity test
Summary: This will be used to test format changes in https://reviews.facebook.net/D31461

Test Plan: run it

Reviewers: MarkCallaghan, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D31515
2015-01-14 09:45:29 -08:00
Leonidas Galanis
9d5bd411be benchmark.sh won't run through all tests properly if one specifies wal_dir to be different than db directory.
Summary:
A command line like this to run all the tests:
source benchmark.config.sh && nohup ./benchmark.sh 'bulkload,fillseq,overwrite,filluniquerandom,readrandom,readwhilewriting'
where
benchmark.config.sh is:
export DB_DIR=/data/mysql/rocksdata
export WAL_DIR=/txlogs/rockswal
export OUTPUT_DIR=/root/rocks_benchmarking/output

Will fail for the tests that need a new DB .

Also 1) set disable_data_sync=0 and 2) add debug mode to run through all the tests more quickly

Test Plan: run ./benchmark.sh 'debug,bulkload,fillseq,overwrite,filluniquerandom,readrandom,readwhilewriting' and verify that there are no complaints about WAL dir not being empty.

Reviewers: sdong, yhchiang, rven, igor

Reviewed By: igor

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D30909
2015-01-05 15:36:47 -08:00
Qiao Yang
cef6f84393 Added 'dump_live_files' command to ldb tool.
Summary:
Priliminary diff to solicit comments.
Given DB path, dump all SST files (key/value and properties), WAL file and manifest
files. What command options do we need to support for this command? Maybe
output_hex for keys?

Test Plan: Create additional ldb unit tests.

Reviewers: sdong, rven

Reviewed By: rven

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D29547
2014-12-12 17:50:36 -08:00
Lei Jin
e93f044d99 add range scan test to benchmark script
Summary: as title

Test Plan: ran it

Reviewers: yhchiang, igor, sdong, MarkCallaghan

Reviewed By: MarkCallaghan

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D25563
2014-12-10 13:04:58 -08:00
Yueh-Hsuan Chiang
a5d4fc0a25 Fix compile warning in db_stress
Summary:
Fix compile warning in db_stress

Test Plan:
make db_stress
2014-12-04 11:59:29 -08:00
Yueh-Hsuan Chiang
97c1940882 Fix compile warning in db_stress.cc on Mac
Summary:
Fix the following compile warning in db_stress.cc on Mac
tools/db_stress.cc:1688:52: error: format specifies type 'unsigned long' but the argument has type '::google::uint64' (aka 'unsigned long long') [-Werror,-Wformat]
    fprintf(stdout, "DB-write-buffer-size: %lu\n", FLAGS_db_write_buffer_size);
                                           ~~~     ^~~~~~~~~~~~~~~~~~~~~~~~~~
                                           %llu

Test Plan:
make
2014-12-04 11:19:12 -08:00
Jonah Cohen
a14b7873ee Enforce write buffer memory limit across column families
Summary:
Introduces a new class for managing write buffer memory across column
families.  We supplement ColumnFamilyOptions::write_buffer_size with
ColumnFamilyOptions::write_buffer, a shared pointer to a WriteBuffer
instance that enforces memory limits before flushing out to disk.

Test Plan: Added SharedWriteBuffer unit test to db_test.cc

Reviewers: sdong, rven, ljin, igor

Reviewed By: igor

Subscribers: tnovak, yhchiang, dhruba, xjin, MarkCallaghan, yoshinorim

Differential Revision: https://reviews.facebook.net/D22581
2014-12-02 12:09:20 -08:00
Yueh-Hsuan Chiang
13de000f07 Add rocksdb::ToString() to address cases where std::to_string is not available.
Summary:
In some environment such as android, the c++ library does not have
std::to_string.  This path adds rocksdb::ToString(), which wraps std::to_string
when std::to_string is not available, and implements std::to_string
in the other case.

Test Plan:
make dbg -j32
./db_test
make clean
make dbg OPT=-DOS_ANDROID -j32
./db_test

Reviewers: ljin, sdong, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D29181
2014-11-24 20:44:49 -08:00
Saghm Rossi
bafce61979 first rdb commit
Summary: First commit for rdb shell

Test Plan: unit_test.js does simple assertions on most of the main functionality; will update with rest of tests

Reviewers: igor, rven, lijn, yhciang, sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D28749
2014-11-20 23:33:00 -05:00
sdong
a177742a9b Make db_stress built for ROCKSDB_LITE
Summary:
Make db_stress built for ROCKSDB_LITE.
The test doesn't pass tough. It seg fault quickly. But I took a look and it doesn't seem to be related to lite version. Likely to be a bug inside RocksDB.

Test Plan: make db_stress

Reviewers: yhchiang, rven, ljin, igor

Reviewed By: igor

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D28797
2014-11-14 10:20:51 -08:00
Igor Canadi
767777c2bd Turn on -Wshorten-64-to-32 and fix all the errors
Summary:
We need to turn on -Wshorten-64-to-32 for mobile. See D1671432 (internal phabricator) for details.

This diff turns on the warning flag and fixes all the errors. There were also some interesting errors that I might call bugs, especially in plain table. Going forward, I think it makes sense to have this flag turned on and be very very careful when converting 64-bit to 32-bit variables.

Test Plan: compiles

Reviewers: ljin, rven, yhchiang, sdong

Reviewed By: yhchiang

Subscribers: bobbaldwin, dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D28689
2014-11-11 16:47:22 -05:00
Igor Canadi
00211f9c5b Fix SIGSEGV in db_stresS 2014-11-08 13:01:31 -08:00
Federico Piccinini
543df158c5 Expose sst_dump functionality as library call.
Summary:
Refactor sst_dump to follow the same structure as ldb. Introduce a
SSTDump interface.

Test Plan: built sst_dump and tried it manually.

Reviewers: sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D28485
2014-11-07 17:23:58 -08:00
Igor Canadi
68effa0348 Fix -Wshadow for tools
Summary: Previously I made `make check` work with -Wshadow, but there are some tools that are not compiled using `make check`.

Test Plan: make all

Reviewers: yhchiang, rven, ljin, sdong

Reviewed By: ljin, sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D28497
2014-11-07 15:04:30 -08:00
Lei Jin
fd24ae9d05 SetOptions() to return status and also add it to StackableDB
Summary: as title

Test Plan: ./db_test

Reviewers: sdong, yhchiang, rven, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D28269
2014-11-04 16:23:05 -08:00
Igor Canadi
8e79ce68ce Revert "Fix lint errors and coding style of ldb related codes."
This reverts commit bc9f36fd5e.
2014-10-31 19:22:49 -07:00
Yueh-Hsuan Chiang
bc9f36fd5e Fix lint errors and coding style of ldb related codes.
Summary: Fix lint errors and coding style of ldb related codes.

Test Plan: ./ldb

Reviewers: ljin, sdong, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D28125
2014-10-31 15:01:39 -07:00
Venkatesh Radhakrishnan
7e01d12026 Add support for in place update for db_stress
Summary:
Added two flags which operate as follows:
in_place_update: enable in_place_update for default column family
set_in_place_one_in: toggles the value of the option inplace_update_support with a probability of 1/N

Test Plan:
Run db_stress with the two flags above set.
Specifically tried in_place_update set to true and set_in_place_one_in set to 10,000.

Reviewers: ljin, igor, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D28029
2014-10-31 12:02:14 -07:00
Igor Canadi
c1c68bce43 remove atomic_pointer.h references 2014-10-27 15:12:20 -07:00
Igor Canadi
48842ab316 Deprecate AtomicPointer
Summary: RocksDB already depends on C++11, so we might as well all the goodness that C++11 provides. This means that we don't need AtomicPointer anymore. The less things in port/, the easier it will be to port to other platforms.

Test Plan: make check + careful visual review verifying that NoBarried got memory_order_relaxed, while Acquire/Release methods got memory_order_acquire and memory_order_release

Reviewers: rven, yhchiang, ljin, sdong

Reviewed By: ljin

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D27543
2014-10-27 14:50:21 -07:00
Lei Jin
714c63c584 db_stress for dynamic options
Summary: Allow SetOptions() during db_stress test

Test Plan: make crash_test

Reviewers: sdong, yhchiang, rven, igor

Reviewed By: igor

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D25497
2014-10-27 12:11:16 -07:00
Lei Jin
f18b4a4847 minor update to benchmark script
Summary: Try to match some parameters from Dhruba's benchmarks on github

Test Plan: ran it

Reviewers: sdong, yhchiang, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D24687
2014-10-10 09:55:40 -07:00
Igor Canadi
90b8c07b48 Fix unit tests errors
Summary: Those were introduced with 2fb1fea30f because the flushing behavior changed when max_background_flushes is > 0.

Test Plan: make check

Reviewers: ljin, yhchiang, sdong

Reviewed By: sdong

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D23577
2014-09-18 13:32:44 -07:00
Lei Jin
add22e3515 standardize scripts to run RocksDB benchmarks
Summary:
Hope these scripts will allow people to run/repro benchmark easily
I think it is time to re-run flash benchmarks and report results
Please comment if any other benchmark runs are needed

Test Plan: ran it

Reviewers: yhchiang, igor, sdong

Reviewed By: igor

Subscribers: dhruba, MarkCallaghan, leveldb

Differential Revision: https://reviews.facebook.net/D23139
2014-09-12 16:25:35 -07:00
Igor Canadi
88841bd007 Explicitly cast char to signed char in Hash()
Summary:
The compilers we use treat char as signed. However, this is not guarantee of C standard and some compilers (for ARM platform for example), treat char as unsigned. Code that assumes that char is either signed or unsigned is wrong.

This change explicitly casts the char to signed version. This will not break any of our use cases on x86, which, I believe are all of them. In case somebody out there is using RocksDB on ARM AND using bloom filters, they're going to have a bad time. However, it is very unlikely that this is the case.

Test Plan: sanity test with previous commit (with new sanity test)

Reviewers: yhchiang, ljin, sdong

Reviewed By: ljin

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D22767
2014-09-08 18:57:40 -07:00
Feng Zhu
0af157f9bf Implement full filter for block based table.
Summary:
1. Make filter_block.h a base class. Derive block_based_filter_block and full_filter_block. The previous one is the traditional filter block. The full_filter_block is newly added. It would generate a filter block that contain all the keys in SST file.

2. When querying a key, table would first check if full_filter is available. If not, it would go to the exact data block and check using block_based filter.

3. User could choose to use full_filter or tradional(block_based_filter). They would be stored in SST file with different meta index name. "filter.filter_policy" or "full_filter.filter_policy". Then, Table reader is able to know the fllter block type.

4. Some optimizations have been done for full_filter_block, thus it requires a different interface compared to the original one in filter_policy.h.

5. Actual implementation of filter bits coding/decoding is placed in util/bloom_impl.cc

Benchmark: base commit 1d23b5c470
Command:
db_bench --db=/dev/shm/rocksdb --num_levels=6 --key_size=20 --prefix_size=20 --keys_per_prefix=0 --value_size=100 --write_buffer_size=134217728 --max_write_buffer_number=2 --target_file_size_base=33554432 --max_bytes_for_level_base=1073741824 --verify_checksum=false --max_background_compactions=4 --use_plain_table=0 --memtablerep=prefix_hash --open_files=-1 --mmap_read=1 --mmap_write=0 --bloom_bits=10 --bloom_locality=1 --memtable_bloom_bits=500000 --compression_type=lz4 --num=393216000 --use_hash_search=1 --block_size=1024 --block_restart_interval=16 --use_existing_db=1 --threads=1 --benchmarks=readrandom —disable_auto_compactions=1
Read QPS increase for about 30% from 2230002 to 2991411.

Test Plan:
make all check
valgrind db_test
db_stress --use_block_based_filter = 0
./auto_sanity_test.sh

Reviewers: igor, yhchiang, ljin, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D20979
2014-09-08 10:37:05 -07:00
Lei Jin
5665e5e285 introduce ImmutableOptions
Summary:
As a preparation to support updating some options dynamically, I'd like
to first introduce ImmutableOptions, which is a subset of Options that
cannot be changed during the course of a DB lifetime without restart.

ColumnFamily will keep both Options and ImmutableOptions. Any component
below ColumnFamily should only take ImmutableOptions in their
constructor. Other options should be taken from APIs, which will be
allowed to adjust dynamically.

I am yet to make changes to memtable and other related classes to take
ImmutableOptions in their ctor. That can be done in a seprate diff as
this one is already pretty big.

Test Plan: make all check

Reviewers: yhchiang, igor, sdong

Reviewed By: sdong

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D22545
2014-09-04 16:18:36 -07:00
Feng Zhu
8ed70fc209 add assert to db Put in db_stress test
Summary:
1. assert db->Put to be true in db_stress
2. begin column family with name "1".

Test Plan: 1. ./db_stress

Reviewers: ljin, yhchiang, dhruba, sdong, igor

Reviewed By: sdong, igor

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D22659
2014-09-02 13:21:59 -07:00
Lei Jin
384400128f move block based table related options BlockBasedTableOptions
Summary:
I will move compression related options in a separate diff since this
diff is already pretty lengthy.
I guess I will also need to change JNI accordingly :(

Test Plan: make all check

Reviewers: yhchiang, igor, sdong

Reviewed By: igor

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D21915
2014-08-25 14:22:05 -07:00
Feng Zhu
6c4c159b2c fix_sst_dump_for_old_sst_format
Summary:
1. fix segment error when dumping old sst format (no properties nor stats)
2. Enable dumpping old sst format

Test Plan:
Generate block based sst file with "properties", and one with "stats" and one without neither.
Read it using sst_dump

Reviewers: ljin, igor, yhchiang, dhruba, sdong

Reviewed By: sdong

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D21837
2014-08-14 09:59:41 -07:00
Igor Canadi
0ff183a0d9 Move include/utilities/*.h to include/rocksdb/utilities/*.h
Summary:
All public headers need to be under `include/rocksdb` directory. Otherwise, clients include our header files like this:

    #include <rocksdb/db.h>
    #include <utilities/backupable_db.h> // still our public header!

Also, internally, we include:

    #include "utilities/backupable/backupable_db.h" // internal header
    #include "utilities/backupable_db.h" // public header

which is confusing.

This way, when we install rocksdb as a system library, we can just copy `include/rocksdb` directory to system's header files. We can't really copy `utilities` directory to system's header files.

Test Plan: compiles

Reviewers: dhruba, ljin, yhchiang, sdong

Reviewed By: sdong

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D20409
2014-07-23 10:21:38 -04:00
Stanislau Hlebik
92d73cbe78 Add PlainTableOptions
Summary:
Since we have a lot of options for PlainTable, add a struct PlainTableOptions
to manage them

Test Plan: make all check

Reviewers: sdong

Reviewed By: sdong

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D20175
2014-07-18 00:08:38 -07:00
Igor Canadi
591c2a3b4b [db stress] Don't drop column families if there's only 1 2014-07-14 07:56:07 -07:00
Stanislau Hlebik
30c81e7717 Removing NewTotalOrderPlainTableFactory
Summary:
Seems like NewTotalOrderPlainTableFactory is useless and is semantically incorrect.
Total order mode indicator is prefix_extractor == nullptr,
but NewTotalOrderPlainTableFactory doesn't set it to be nullptr. That's why some tests
in plain_table_db_tests is incorrect.

Test Plan: make all check

Reviewers: sdong

Reviewed By: sdong

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D19587
2014-07-10 11:32:04 -07:00
Feng Zhu
c4018e771c In tools/db_stress.cc, set proper value in NewHashSkipListRepFactory's bucket_size
Summary:
    Now that the arena is used to allocate space for hashskiplist's bucket. The bucket size
    need to be set small enough to avoid "should_flush_" failure in memtable's assertion.

Test Plan:
    make all check

Reviewers: sdong

Reviewed By: sdong

Subscribers: igor

Differential Revision: https://reviews.facebook.net/D19371
2014-07-01 11:02:42 -07:00
Yueh-Hsuan Chiang
bf4b1528d8 Fix compile error in reduce_levels_test.
Summary:
Fixed the following compile error.
    tools/reduce_levels_test.cc:89:31: error: no matching function for call to 'InitFromCmdLineArgs'
      LDBCommand* level_reducer = LDBCommand::InitFromCmdLineArgs(args);
                                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    ./util/ldb_cmd.h:56:22: note: candidate function not viable: requires 3 arguments, but 1 was provided
      static LDBCommand* InitFromCmdLineArgs(
                         ^
    ./util/ldb_cmd.h:62:22: note: candidate function not viable: requires 4 arguments, but 1 was provided
      static LDBCommand* InitFromCmdLineArgs(
                         ^
    1 error generated.

Test Plan:
make reduce_levels_test
./reduce_levels_test

Reviewers: haobo, sdong, ljin

Reviewed By: ljin

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D19251
2014-06-23 17:48:20 -06:00
Igor Canadi
d4a8423334 Remove seek compaction
Summary:
As discussed in our internal group, we don't get much use of seek compaction at the moment, while it's making code more complicated and slower in some cases.

This diff removes seek compaction and (hopefully) all code that was introduced to support seek compaction.

There is one test case that relied on didIO information. I'll try to find another way to implement it.

Test Plan: make check

Reviewers: sdong, haobo, yhchiang, ljin, dhruba

Reviewed By: ljin

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D19161
2014-06-20 10:23:02 +02:00
sdong
edd47c5104 PlainTable to encode to avoid to rewrite prefix when it is the same as the previous key
Summary:
Add a encoding feature of PlainTable to encode PlainTable's keys to save some bytes for the same prefixes.
The data format is documented in table/plain_table_factory.h

Test Plan: Add unit test coverage in plain_table_db_test

Reviewers: yhchiang, igor, dhruba, ljin, haobo

Reviewed By: haobo

Subscribers: nkg-, leveldb

Differential Revision: https://reviews.facebook.net/D18735
2014-06-18 20:41:52 -07:00
sdong
9202d9b625 Fix sst_dump for PlainTable
Summary: sst_dump now doesn't work well for PlainTable. Not sure when it started, but this should fix it.

Test Plan: Run sst_dump against a file that used to fail.

Reviewers: yhchiang, haobo, igor

Reviewed By: igor

Subscribers: dhruba, ljin, leveldb

Differential Revision: https://reviews.facebook.net/D19023
2014-06-12 11:03:03 -07:00
sdong
ee5a51e6ce sst_dump: still try to print out table properties even if failing to read the file
Summary: Even if the file is corrupted, table properties are usually available to print out. Now sst_dump would just fail without printing table properties. With this patch, table properties are still try to be printed out.

Test Plan: run sst_dump against multiple scenarios

Reviewers: igor, yhchiang, ljin, haobo

Reviewed By: haobo

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D18981
2014-06-09 11:21:44 -07:00
Igor Canadi
a0191c9dfe Create Missing Column Families
Summary: Provide an convenience option to create column families if they are missing from the DB. Task #4460490

Test Plan: added unit test. also, stress test for some time

Reviewers: sdong, haobo, dhruba, ljin, yhchiang

Reviewed By: yhchiang

Subscribers: yhchiang, leveldb

Differential Revision: https://reviews.facebook.net/D18951
2014-06-06 18:04:56 -07:00
sdong
b92a19a431 sst_dump: Set dummy prefix extractor for binary search index in block based table
Summary: Now sst_dump fails in block based tables if binary search index is used, as it requires a prefix extractor. Add it.

Test Plan: Run it against such a file to make sure it fixes the problem.

Reviewers: yhchiang, kailiu

Reviewed By: kailiu

Subscribers: ljin, igor, dhruba, haobo, leveldb

Differential Revision: https://reviews.facebook.net/D18927
2014-06-05 15:37:23 -07:00
sdong
593bb2c40b db_stress to add an option to periodically change background thread pool size.
Summary: Add an option to indicates the variation of background threads. If the flag is not 0, for every 100 milliseconds, adjust thread pool size to a value within the range.

Test Plan: run db_stress

Reviewers: haobo, dhruba, igor, ljin

Reviewed By: ljin

Subscribers: leveldb, nkg-

Differential Revision: https://reviews.facebook.net/D18759
2014-06-02 10:32:09 -07:00
Igor Canadi
3a1cf1281b Run FIFO compaction as part of db_crashtest2.py
Summary: As title

Test Plan: ran it

Reviewers: dhruba

Reviewed By: dhruba

Differential Revision: https://reviews.facebook.net/D18783
2014-05-22 10:24:24 -07:00
Chilledheart
fa16ef394c Fix errors while building with clang 2014-05-15 12:34:53 +08:00
Igor Canadi
a1068c91a1 Make RocksDB work with newer gflags
Summary:
Newer gflags switched from `google` namespace to `gflags` namespace. See: https://github.com/facebook/rocksdb/issues/139 and https://github.com/facebook/rocksdb/issues/102

Unfortunately, they don't define any macro with their namespace, so we need to actually try to compile gflags with two different namespace to figure out which one is the correct one.

Test Plan: works in fbcode environemnt. I'll also try in ubutnu with newer gflags

Reviewers: dhruba

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D18537
2014-05-08 17:25:13 -07:00
Igor Canadi
0afc8bc29a xxHash
Summary:
Originally: https://github.com/facebook/rocksdb/pull/87/files

I'm taking over to apply some finishing touches

Test Plan: will add tests

Reviewers: dhruba, haobo, sdong, yhchiang, ljin

Reviewed By: yhchiang

CC: leveldb

Differential Revision: https://reviews.facebook.net/D18315
2014-05-01 14:09:32 -04:00
Igor Canadi
91ef2eae23 Use new DBWithTTL API in tests 2014-04-28 23:46:24 -04:00
Lei Jin
3995e801ab kill ReadOptions.prefix and .prefix_seek
Summary:
also add an override option total_order_iteration if you want to use full
iterator with prefix_extractor

Test Plan: make all check

Reviewers: igor, haobo, sdong, yhchiang

Reviewed By: haobo

CC: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D17805
2014-04-25 12:21:34 -07:00
Igor Canadi
472a80a3ae Initialize verification_failed in db_stress 2014-04-24 06:53:11 -07:00
Igor Canadi
2413a06c7b Improve stability of db_stress
Summary:
Currently, whenever DB Verification fails we bail out by calling `exit(1)`. This is kind of bad since it causes unclean shutdown and spew of error log messages like:

    05:03:27 pthread lock: Invalid argument
    05:03:27 pthread lock: Invalid argument
    05:03:27 pthread lock: Invalid argument
    05:03:27 pthread lock: Invalid argument
    05:03:27 pthread lock: Invalid argument
    05:03:27 pthread lock: Invalid argument
    05:03:27 pthread lock: Invalid argument
    05:03:27 pthread lock: Invalid argument
    05:03:27 pthread lock: Invalid argument

This diff adds a new parameter that is set to true when verification fails. It can then use the parameter to bail out safely.

Test Plan: Casued artificail failure. Verified that exit was clean.

Reviewers: dhruba, haobo, ljin

Reviewed By: haobo

CC: leveldb

Differential Revision: https://reviews.facebook.net/D18243
2014-04-24 09:22:58 -04:00
Yueh-Hsuan Chiang
af6ad113a8 Fix SIGFAULT when running sst_dump on v2.6 db
Summary: Fix the sigfault when running sst_dump on v2.6 db.

Test Plan:
    git checkout bba6595b1f
    make clean
    make db_bench
    ./db_bench --db=/tmp/some/db --benchmarks=fillseq
    arc patch D18039
    make clean
    make sst_dump
    ./sst_dump --file=/tmp/some/db --command=check

Reviewers: igor, haobo, sdong

Reviewed By: sdong

CC: leveldb

Differential Revision: https://reviews.facebook.net/D18039
2014-04-21 17:49:47 -07:00
Igor Canadi
8dc34364d2 Rename "benchmark" back to "bench".
Also, make `benchharness.cc` not compiled into rocksdb library.
2014-04-21 13:12:15 -07:00
Pratyush Seth
ff1b5df4c6 Added benchmark functionality on the lines of folly/Benchmark.h
Summary: Added benchmark functionality on the lines of folly/Benchmark.h

Test Plan: Added unit tests

Reviewers: igor, haobo, sdong, ljin, yhchiang, dhruba

Reviewed By: igor

CC: leveldb

Differential Revision: https://reviews.facebook.net/D17973
2014-04-21 12:29:55 -07:00
Lei Jin
0f2d768191 hints for narrowing down FindFile range and avoiding checking unrelevant L0 files
Summary:
The file tree structure in Version is prebuilt and the range of each file is known.
On the Get() code path, we do binary search in FindFile() by comparing
target key with each file's largest key and also check the range for each L0 file.
With some pre-calculated knowledge, each key comparision that has been done can serve
as a hint to narrow down further searches:
(1) If a key falls within a L0 file's range, we can safely skip the next
file if its range does not overlap with the current one.
(2) If a key falls within a file's range in level L0 - Ln-1, we should only
need to binary search in the next level for files that overlap with the current one.

(1) will be able to skip some files depending one the key distribution.
(2) can greatly reduce the range of binary search, especially for bottom
levels, given that one file most likely only overlaps with N files from
the level below (where N is max_bytes_for_level_multiplier). So on level
L, we will only look at ~N files instead of N^L files.

Some inital results: measured with 500M key DB, when write is light (10k/s = 1.2M/s), this
improves QPS ~7% on top of blocked bloom. When write is heavier (80k/s =
9.6M/s), it gives us ~13% improvement.

Test Plan: make all check

Reviewers: haobo, igor, dhruba, sdong, yhchiang

Reviewed By: haobo

CC: leveldb

Differential Revision: https://reviews.facebook.net/D17205
2014-04-21 09:10:12 -07:00
Igor Canadi
ce353c2474 Nuke tools/shell
Summary: We don't use or build this code

Test Plan: builds

Reviewers: dhruba

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D17979
2014-04-17 14:43:42 -07:00
Igor Canadi
23c8f89b57 Revert "Don't compile ldb tool into static library"
This reverts commit e296577ef6.
2014-04-15 11:29:02 -07:00
Igor Canadi
a347ffe92f Revert "Fix sst_dump and reduce_levels_test compile errors"
This reverts commit d8f00b4109.
2014-04-15 11:28:52 -07:00
Igor Canadi
d8f00b4109 Fix sst_dump and reduce_levels_test compile errors 2014-04-15 11:13:12 -07:00
Igor Canadi
e296577ef6 Don't compile ldb tool into static library
Summary:
This is first step of my effort to reduce size of librocksdb.a for use in mobile.

ldb object files are huge and are ment to be used as a command line tool. I moved them to `tools/` directory and include them only when compiling `ldb`

This diff reduced librocksdb.a from 42MB to 39MB on my mac (not stripped).

Test Plan: ran ldb

Reviewers: dhruba, haobo, sdong, ljin, yhchiang

Reviewed By: yhchiang

CC: leveldb

Differential Revision: https://reviews.facebook.net/D17823
2014-04-15 10:52:39 -07:00
Igor Canadi
4daea66343 Turn on -Wmissing-prototypes
Summary: Compiling for iOS has by default turned on -Wmissing-prototypes, which causes rocksdb to fail compiling. This diff turns on -Wmissing-prototypes in our compile options and cleans up all functions with missing prototypes.

Test Plan: compiles

Reviewers: dhruba, haobo, ljin, sdong

Reviewed By: ljin

CC: leveldb

Differential Revision: https://reviews.facebook.net/D17649
2014-04-09 21:17:14 -07:00
Igor Canadi
b947fdc89d Column family support for DB::OpenForReadOnly()
Summary: When opening DB in read-only mode, client can choose to only specify a subset of column families ("default" column family can't be omitted, though)

Test Plan: added a unit test in column_family_test

Reviewers: haobo, sdong, ljin, dhruba

Reviewed By: haobo

CC: leveldb

Differential Revision: https://reviews.facebook.net/D17565
2014-04-09 09:56:17 -07:00
Igor Canadi
3d2fe844ab Merge branch 'master' into columnfamilies
Conflicts:
	db/db_impl.cc
	db/db_impl.h
	db/memtable_list.cc
	db/version_set.cc
2014-04-07 11:31:11 -07:00
Yueh-Hsuan Chiang
c0b9fa8b3e Add script auto_sanity_test.sh to perform auto sanity test
Summary:
Add script auto_sanity_test.sh to perform auto sanity test

usage: auto_sanity_test.sh [new_commit] [old_commit]

Running without commit parameter will do the sanity test with the latest
and the latest 10 commit.

Test Plan: ./auto_sanity_test.sh

Reviewers: haobo, igor

Reviewed By: igor

CC: leveldb

Differential Revision: https://reviews.facebook.net/D17397
2014-04-03 10:21:46 -07:00
Igor Canadi
8555ce2dec Merge branch 'master' into columnfamilies 2014-04-02 10:48:05 -07:00
Igor Canadi
05080dae3f fix db_sanity_test 2014-03-31 17:06:53 -07:00
Igor Canadi
76642b81f4 Increase done even if progress_reports is false 2014-03-20 16:52:59 -07:00
Igor Canadi
ac328a86b9 Merge branch 'master' into columnfamilies
Conflicts:
	db/db_impl.cc
	db/db_test.cc
2014-03-20 14:41:37 -07:00
Yiting Li
7981a43274 Consistency Check Function
Summary: Added a function/command to check the consistency of live files' meta data

Test Plan:
Manual test (size mismatch, file not exist).
Command test script.

Reviewers: haobo

Reviewed By: haobo

CC: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D16935
2014-03-20 13:42:45 -07:00
Igor Canadi
4cfb0eb40f Delete rocksdb dir after crashtest
Summary:
We should clean up after ourselves. Last crashtest failed because the disk on the cibuild machine was full.

Also, db_crashtest is supposed to run the test on the same database in every iteration. This isn't the case currently, because every run creates a new tmp directory. It's fixed with this diff.

Test Plan: ran it

Reviewers: ljin

Reviewed By: ljin

CC: leveldb

Differential Revision: https://reviews.facebook.net/D17073
2014-03-20 11:11:08 -07:00
Igor Canadi
159928dfa5 Added flag progress_reports in db_stress 2014-03-19 09:58:41 -07:00
Igor Canadi
64904b39a0 Merge branch 'master' into columnfamilies
Conflicts:
	utilities/backupable/backupable_db.cc
2014-03-17 17:57:14 -07:00
Igor Canadi
5601bc4619 Check starts_with(prefix) in MultiPrefixIterate
Summary: We switched to prefix_seek method of seeking. This means that anytime we check Valid(), we also need to check starts_with(prefix)

Test Plan: ran db_stress

Reviewers: ljin

Reviewed By: ljin

CC: leveldb

Differential Revision: https://reviews.facebook.net/D16953
2014-03-17 17:02:34 -07:00
Igor Canadi
e0c1211555 Merge branch 'master' into columnfamilies
Conflicts:
	db/version_set.cc
	tools/db_stress.cc
2014-03-17 12:21:05 -07:00
Igor Canadi
9b8a2b52d4 No prefix iterator in db_stress
Summary: We're trying to deprecate prefix iterators, so no need to test them in db_stress

Test Plan: ran it

Reviewers: ljin

Reviewed By: ljin

CC: leveldb

Differential Revision: https://reviews.facebook.net/D16917
2014-03-17 10:02:46 -07:00
Igor Canadi
928ee23567 Change WriteBatch interface 2014-03-14 13:40:06 -07:00
Igor Canadi
e1f56e12cf Merge branch 'master' into columnfamilies
Conflicts:
	db/db_impl.cc
	db/db_test.cc
	tools/db_stress.cc
2014-03-13 13:21:20 -07:00
Igor Canadi
04a1035efe Revert "DB stress with normal skip list"
This reverts commit 86926d8c6a.
2014-03-12 22:21:13 -07:00
Lei Jin
02a2cb139b fix VerifyDb in StressTest
Summary:
this should fix the hash_skip_list issue, but I still see seqno
assertion failure in the last run. Will continue investigating and
address that in a different diff

Test Plan: make whitebox_crash_test

Reviewers: igor

Reviewed By: igor

CC: leveldb

Differential Revision: https://reviews.facebook.net/D16851
2014-03-12 22:20:22 -07:00
Igor Canadi
86926d8c6a DB stress with normal skip list
Summary:
Hash skip list has issues, causing db_stress to fail badly.

For now, switching back to skip_list by default before we figure out root cause.

Test Plan: db_stress is happy(ier)

Reviewers: ljin

Reviewed By: ljin

CC: leveldb

Differential Revision: https://reviews.facebook.net/D16845
2014-03-12 17:35:27 -07:00
Igor Canadi
7b7793e97a Don't sync in stress test
Summary: Syncing in stress test makes it run much much much slower. It also doesn't add much value IMO.

Test Plan: no

Reviewers: ljin

Reviewed By: ljin

CC: leveldb

Differential Revision: https://reviews.facebook.net/D16839
2014-03-12 15:12:09 -07:00
Igor Canadi
dff9214165 Merge branch 'master' into columnfamilies
Conflicts:
	db/db_impl.cc
	tools/db_stress.cc
2014-03-12 10:17:41 -07:00
Igor Canadi
2b95dc1542 Revert "Fix bad merge of D16791 and D16767"
This reverts commit 839c8ecfcd.
2014-03-12 09:37:43 -07:00
Igor Canadi
5ba028c179 DBStress cleanup
Summary:
*) fixed the comment
*) constant 1 was not casted to 64-bit, which (I think) might cause overflow if we shift it too much
*) default prefix size to be 7, like it was before

Test Plan: compiled

Reviewers: ljin

Reviewed By: ljin

CC: leveldb

Differential Revision: https://reviews.facebook.net/D16827
2014-03-12 09:31:06 -07:00
sdong
839c8ecfcd Fix bad merge of D16791 and D16767
Summary: A bad Auto-Merge caused log buffer is flushed twice. Remove the unintended one.

Test Plan: Should already be tested (the code looks the same as when I ran unit tests).

Reviewers: haobo, igor

Reviewed By: haobo

CC: ljin, yhchiang, leveldb

Differential Revision: https://reviews.facebook.net/D16821
2014-03-11 21:31:57 -07:00
Lei Jin
86ba3e24e3 make assert based on FLAGS_prefix_size
Summary: as title

Test Plan: running python tools/db_crashtest.py

Reviewers: igor

Reviewed By: igor

CC: leveldb

Differential Revision: https://reviews.facebook.net/D16803
2014-03-11 16:33:42 -07:00
Lei Jin
02dab3be11 fix db_stress test
Summary: Fix the db_stress test, let is run with HashSkipList for real

Test Plan:
python tools/db_crashtest.py
python tools/db_crashtest2.py

Reviewers: igor, haobo

Reviewed By: igor

CC: leveldb

Differential Revision: https://reviews.facebook.net/D16773
2014-03-11 13:44:33 -07:00
Igor Canadi
56ca8338e5 initialize static const outside of class 2014-03-11 13:08:48 -07:00
Igor Canadi
457c78eb89 [CF] db_stress for column families
Summary:
I had this diff for a while to test column families implementation. Last night, I ran it sucessfully for 10 hours with the command:

   time ./db_stress --threads=30 --ops_per_thread=200000000 --max_key=5000 --column_families=20 --clear_column_family_one_in=3000000 --verify_before_write=1  --reopen=50 --max_background_compactions=10 --max_background_flushes=10 --db=/tmp/db_stress

It is ready to be committed :)

Test Plan: Ran it for 10 hours

Reviewers: dhruba, haobo

CC: leveldb

Differential Revision: https://reviews.facebook.net/D16797
2014-03-11 12:06:12 -07:00
Lei Jin
8d007b4aaf Consolidate SliceTransform object ownership
Summary:
(1) Fix SanitizeOptions() to also check HashLinkList. The current
dynamic case just happens to work because the 2 classes have the same
layout.
(2) Do not delete SliceTransform object in HashSkipListFactory and
HashLinkListFactory destructor. Reason: SanitizeOptions() enforces
prefix_extractor and SliceTransform to be the same object when
Hash**Factory is used. This makes the behavior strange: when
Hash**Factory is used, prefix_extractor will be released by RocksDB. If
other memtable factory is used, prefix_extractor should be released by
user.

Test Plan: db_bench && make asan_check

Reviewers: haobo, igor, sdong

Reviewed By: igor

CC: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D16587
2014-03-10 12:56:46 -07:00
Lei Jin
cff908db09 fix ldb_test TtlPutGet test
Summary:
If the last byte of the timestamp is 0, the output value of the ttl db
can cut off earlier. Change to compare hex format instead.

Test Plan:
this does not happen consistently, but it did not fail after the last
100 runs
for i in {1..100}; do python ./tools/ldb_test.py; done

Reviewers: igor, haobo

Reviewed By: igor

CC: leveldb

Differential Revision: https://reviews.facebook.net/D16725
2014-03-10 12:11:46 -07:00
Igor Canadi
9c8ad62691 DB Sanity Test
Summary:
@kailiu mentioned on meeting yesterday that we sometimes have trouble opening DB created by old version with the new version. This will be very important to test for column families, since I'm changing disk format for the MANIFEST.

I added a tool that can help us test that. Usage:
./db_sanity_test <path> create
will create a bunch of DBs under <path>
<change RocksDB version>
./db_sanity_test <path> verify
will verify consistency of DBs created under <path>

Test Plan: ran the db_sanity_test

Reviewers: kailiu, dhruba, haobo

Reviewed By: kailiu

CC: leveldb, kailiu, xjin

Differential Revision: https://reviews.facebook.net/D16605
2014-03-06 11:36:39 -08:00
Igor Canadi
97eddef235 Reopen DB in crash test
Summary:
Why don't we automatically reopen DB when running crash test (running in our nightly build)? If I understand correctly, crashtest is manually reopenning the DB, but then the DB does not check its consistency when you kill db_stress process and then re-run it again.
Does this make sense?

Test Plan: not reall

Reviewers: dhruba, haobo, emayanke

Reviewed By: emayanke

CC: leveldb

Differential Revision: https://reviews.facebook.net/D16167
2014-03-03 17:10:30 -08:00
Igor Canadi
8e634d3ea4 Merge pull request #74 from alberts/lz4
Support for LZ4 compression.
2014-02-10 15:46:56 -08:00
Albert Strasheim
df2f92214a Support for LZ4 compression. 2014-02-08 14:15:51 -08:00
Kai Liu
b8ea5e36b3 Fix incompatible compilation in Linux server 2014-02-07 19:47:48 -08:00
kailiu
161ab42a8a Make table properties shareable
Summary:
We are going to expose properties of all tables to end users through "some" db interface.
However, current design doesn't naturally fit for this need, which is because:

1. If a table presents in table cache, we cannot simply return the reference to its table properties, because the table may be destroy after compaction (and we don't want to hold the ref of the version).
2. Copy table properties is OK, but it's slow.

Thus in this diff, I change the table reader's interface to return a shared pointer (for const table properties), instead a const refernce.

Test Plan: `make check` passed

Reviewers: haobo, sdong, dhruba

Reviewed By: haobo

CC: leveldb

Differential Revision: https://reviews.facebook.net/D15999
2014-02-07 19:26:49 -08:00
Yueh-Hsuan Chiang
3ce8d9a988 Add support for plain table format to sst_dump.
Summary:
This diff enables the command line tool `sst_dump` to work for sst files
under plain table format.  Changes include:
  * In tools/sst_dump.cc:
    - add support for plain table format
    - display prefix_extractor information when --show_properties is on
  * In table/format.cc
    - Now the table magic number of a Footer can be later initialized
      via ReadFooterFromFile().
  * In table/meta_bocks:
    - add function ReadTableMagicNumber() that reads the magic number of
      the specified file.

Minor fixes:
 - remove a duplicate #include in table/table_test.cc
 - fix a commentary typo in include/rocksdb/memtablerep.h
 - fix lint errors.

Test Plan:
Runs sst_dump with both block-based and plain-table format files with
different arguments, specifically those with --show-properties and --from.

* sample output:
  https://reviews.facebook.net/P261

Reviewers: kailiu, sdong, xjin

CC: leveldb

Differential Revision: https://reviews.facebook.net/D15903
2014-02-07 11:15:00 -08:00
kailiu
84f8185fc0 Merge branch 'master' into performance
Conflicts:
	HISTORY.md
	db/db_impl.cc
	db/memtable.cc
2014-02-05 21:21:00 -08:00
Igor Canadi
2966d764cd Fix some 32-bit compile errors
Summary: RocksDB doesn't compile on 32-bit architecture apparently. This is attempt to fix some of 32-bit errors. They are reported here: https://gist.github.com/paxos/8789697

Test Plan: RocksDB still compiles on 64-bit :)

Reviewers: kailiu

Reviewed By: kailiu

CC: leveldb

Differential Revision: https://reviews.facebook.net/D15825
2014-02-03 13:48:30 -08:00
Siying Dong
d169b67680 [Performance Branch] PlainTable to encode rows with seqID 0, value type using 1 internal byte.
Summary: In PlainTable, use one single byte to represent 8 bytes of internal bytes, if seqID = 0 and it is value type (which should be common for bottom most files). It is to save 7 bytes for uncompressed cases.

Test Plan: make all check

Reviewers: haobo, dhruba, kailiu

Reviewed By: haobo

CC: igor, leveldb

Differential Revision: https://reviews.facebook.net/D15489
2014-02-03 12:19:30 -08:00
kailiu
4f6cb17bdb First phase API clean up
Summary:
Addressed all the issues in https://reviews.facebook.net/D15447.
Now most table-related modules are hidden from user land.

Test Plan: make check

Reviewers: sdong, haobo, dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D15525
2014-02-03 00:30:43 -08:00
kailiu
4e0298f23c Clean up arena API
Summary:
Easy thing goes first. This patch moves arena to internal dir; based
on which, the coming patch will deal with memtable_rep.

Test Plan: make check

Reviewers: haobo, sdong, dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D15615
2014-01-30 22:10:10 -08:00
Yueh-Hsuan Chiang
a46ac92138 Allow command line tool sst-dump to display table properties.
Summary:
Add option '--show_properties' to sst_dump tool to allow displaying
property block of the specified files.

Test Plan:
Run sst_dump with the following arguments, which covers cases affected by
this diff:

  1. with only --file
  2. with both --file and --show_properties
  3. with --file, --show_properties, and --from

Reviewers: kailiu, xjin

Differential Revision: https://reviews.facebook.net/D15453
2014-01-30 01:50:34 -08:00
Siying Dong
8477255da3 Moving Some includes from options.h to forward declaration
Summary: By removing some includes form options.h and reply on forward declaration, we can more easily reason the dependencies.

Test Plan: make all check

Reviewers: kailiu, haobo, igor, dhruba

Reviewed By: kailiu

CC: leveldb

Differential Revision: https://reviews.facebook.net/D15411
2014-01-24 17:16:22 -08:00
Igor Canadi
e832e72b31 Revert "Moving to glibc-fb"
This reverts commit d24961b65e.

For some reason, glibc2.17-fb breaks gflags. Reverting for now
2014-01-24 11:50:38 -08:00
Igor Canadi
d24961b65e Moving to glibc-fb
Summary:
It looks like we might have some trouble when building the new release with 4.8, since fbcode is using glibc2.17-fb by default and we are using glibc2.17. It was reported by Benjamin Renard in our internal group.

This diff moves our fbcode build to use glibc2.17-fb by default. I got some linker errors when compiling, complaining that `google::SetUsageMessage()` was undefined. After deleting all offending lines, the compile was successful and everything works.

Test Plan:
Compiled
Ran ./db_bench ./db_stress ./db_repl_stress

Reviewers: kailiu

Reviewed By: kailiu

CC: leveldb

Differential Revision: https://reviews.facebook.net/D15405
2014-01-24 10:24:08 -08:00
Igor Canadi
83681bf9ef Statistics code cleanup
Summary: I'm separating code-cleanup part of https://reviews.facebook.net/D14517. This will make D14517 easier to understand and this diff easier to review.

Test Plan: make check

Reviewers: haobo, kailiu, sdong, dhruba, tnovak

Reviewed By: tnovak

CC: leveldb

Differential Revision: https://reviews.facebook.net/D15099
2014-01-17 12:46:06 -08:00
Igor Canadi
eb12e47e0e Killing Transform Rep
Summary:
Let's get rid of TransformRep and it's children. We have confirmed that HashSkipListRep works better with multifeed, so there is no benefit to keeping this around.

This diff is mostly just deleting references to obsoleted functions. I also have a diff for fbcode that we'll need to push when we switch to new release.

I had to expose HashSkipListRepFactory in the client header files because db_impl.cc needs access to GetTransform() function for SanitizeOptions.

Test Plan: make check

Reviewers: dhruba, haobo, kailiu, sdong

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D14397
2013-12-03 12:42:15 -08:00
Igor Canadi
469a9f32a7 Fix two nasty use-after-free-bugs
Summary:
These bugs were caught by ASAN crash test.
1. The first one, in table/filter_block.cc is very nasty. We first reference entries_ and store the reference to Slice prev. Then, we call entries_.append(), which can change the reference. The Slice prev now points to junk.
2. The second one is a bug in a test, so it's not very serious. Once we set read_opts.prefix, we never clear it, so some other function might still reference it.

Test Plan: asan crash test now runs more than 5 mins. Before, it failed immediately. I will run the full one, but the full one takes quite some time (5 hours)

Reviewers: dhruba, haobo, kailiu

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D14223
2013-11-19 21:01:48 -08:00
kailiu
97d8e573a6 make util/env_posix.cc work under mac
Summary: This diff invoves some more complicated issues in the posix environment.

Test Plan: works under mac os. will need to verify dev box.

Reviewers: dhruba

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D14061
2013-11-16 23:44:39 -08:00
Pascal Borreli
443e04e62d Fixed typos 2013-11-16 11:21:34 +00:00
Kai Liu
88ba331c1a Add the index/filter block cache
Summary: This diff leverage the existing block cache and extend it to cache index/filter block.

Test Plan:
Added new tests in db_test and table_test

The correctness is checked by:

1. make check
2. make valgrind_check

Performance is test by:

1. 10 times of build_tools/regression_build_test.sh on two versions of rocksdb before/after the code change. Test results suggests no significant difference between them. For the two key operatons `overwrite` and `readrandom`, the average iops are both 20k and ~260k, with very small variance).
2. db_stress.

Reviewers: dhruba

Reviewed By: dhruba

CC: leveldb, haobo, xjin

Differential Revision: https://reviews.facebook.net/D13167
2013-11-12 22:46:51 -08:00
Kai Liu
22e1b04deb Quick fix for a string format
Summary:

Fix one more string format issue that throws warning in mac
2013-11-12 21:22:32 -08:00
shamdor
c2be2cba04 WAL log retention policy based on archive size.
Summary:
Archive cleaning will still happen every WAL_ttl seconds
but archived logs will be deleted only if archive size
is greater then a WAL_size_limit value.
Empty archived logs will be deleted evety WAL_ttl.

Test Plan:
1. Unit tests pass.
2. Benchmark.

Reviewers: emayanke, dhruba, haobo, sdong, kailiu, igor

Reviewed By: emayanke

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13869
2013-11-06 18:46:28 -08:00
Dhruba Borthakur
b4ad5e89ae Implement a compressed block cache.
Summary:
Rocksdb can now support a uncompressed block cache, or a compressed
block cache or both. Lookups first look for a block in the
uncompressed cache, if it is not found only then it is looked up
in the compressed cache. If it is found in the compressed cache,
then it is uncompressed and inserted into the uncompressed cache.

It is possible that the same block resides in the compressed cache
as well as the uncompressed cache at the same time. Both caches
have their own individual LRU policy.

Test Plan: Unit test case attached.

Reviewers: kailiu, sdong, haobo, leveldb

Reviewed By: haobo

CC: xjin, haobo

Differential Revision: https://reviews.facebook.net/D12675
2013-11-01 14:31:35 -07:00
Piyush Garg
1e4375d2ef Task #3071144 Enhance ldb (db dump tool for leveldb) to report row counters for each row type
Summary: Added an option --count_delim=<char> which takes the given character as delimiter ('.' by default) and reports count of each row type found in the db

Test Plan:
1. Created test in file (for DBDumperCommand) rocksdb/tools/ldb_test.py which puts various key value pair in db and checks the output using dump --count_delim ,--count_delim="." and --count_delim=",".
2. Created test in file (for InternalDumperCommand) rocksdb/tools/ldb_test.py which puts various key value pair in db and checks the output using dump --count_delim ,--count_delim="." and --count_delim=",".
3. Manually created a database with several keys of several type and verified by running the command
   ./ldb db=<path> dump --count_delim="<char>"
   ./ldb db=<path> idump --count_delim="<char>"

Reviewers: vamsi, dhruba, emayanke, kailiu

Reviewed By: vamsi

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13815
2013-11-01 13:59:14 -07:00
Igor Canadi
138a8eee8e Fix make release
Summary: Don't define if already defined.

Test Plan: Running make release in parallel, will not commit if it fails.

Reviewers: emayanke

Reviewed By: emayanke

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13833
2013-10-31 11:47:22 -07:00
Siying Dong
f03b2df010 Follow-up Cleaning-up After D13521
Summary:
This patch is to address @haobo's comments on D13521:
1. rename Table to be TableReader and make its factory function to be GetTableReader
2. move the compression type selection logic out of TableBuilder but to compaction logic
3. more accurate comments
4. Move stat name constants into BlockBasedTable implementation.
5. remove some uncleaned codes in simple_table_db_test

Test Plan: pass test suites.

Reviewers: haobo, dhruba, kailiu

Reviewed By: haobo

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13785
2013-10-30 10:52:33 -07:00
Siying Dong
d4eec30ed0 Make "Table" pluggable
Summary: This patch makes Table and TableBuilder a abstract class and make all the implementation of the current table into BlockedBasedTable and BlockedBasedTable Builder.

Test Plan: Make db_test.cc to work with block based table. Add a new test simple_table_db_test.cc where a different simple table format is implemented.

Reviewers: dhruba, haobo, kailiu, emayanke, vamsi

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13521
2013-10-28 17:54:09 -07:00
Igor Canadi
8ace6b0f91 Run benchmark with no debug
Summary: assert(Overlap) significantly slows down the benchmark. Ignore assertions when executing blob_store_bench.

Test Plan: Ran the benchmark

Reviewers: dhruba, haobo, kailiu

Reviewed By: kailiu

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13737
2013-10-28 17:42:33 -07:00
Igor Canadi
17991cd5a0 Fix data race in BlobStore benchmark
Summary: Apparently C++ doesn't like it if you copy around its atomic<> variables. When running a benchmark for a longer time, benchmark used to stall. Changed WorkerThread in config to WorkerThread*. It works now.

Test Plan: Ran benchmark

Reviewers: dhruba

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13731
2013-10-28 16:31:44 -07:00
Slobodan Predolac
e44976b199 Conversion of db_bench, db_stress and db_repl_stress to use gflags
Summary: Converted db_stress, db_repl_stress and db_bench to use gflags

Test Plan: I tested by printing out all the flags from old and new versions. Tried defaults, + various combinations with "interesting flags". Also, tested by running db_crashtest.py and db_crashtest2.py.

Reviewers: emayanke, dhruba, haobo, kailiu, sdong

Reviewed By: emayanke

CC: leveldb, xjin

Differential Revision: https://reviews.facebook.net/D13581
2013-10-24 07:43:14 -07:00
Igor Canadi
7e2c1ba173 BlobStore Benchmark
Summary:
Finally, arc diff works again! This has been sitting in my repo for a while.

I would like some comments on my BlobStore benchmark. We don't have to check this in.

Also, I don't do any fsync in the BlobStore, so this is all extremely fast. I'm not sure what durability guarantees we need from the BlobStore.

Test Plan: Nope

Reviewers: dhruba, haobo, kailiu, emayanke

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13527
2013-10-23 17:31:12 -07:00
Dhruba Borthakur
9cd221094c Add appropriate LICENSE and Copyright message.
Summary:
Add appropriate LICENSE and Copyright message.

Test Plan:
make check

Reviewers:

CC:

Task ID: #

Blame Rev:
2013-10-16 17:48:41 -07:00
Siying Dong
88f2f89068 Change Function names from Compaction->Flush When they really mean Flush
Summary: When I debug the unit test failures when enabling background flush thread, I feel the function names can be made clearer for people to understand. Also, if the names are fixed, in many places, some tests' bugs are obvious (and some of those tests are failing). This patch is to clean it up for future maintenance.

Test Plan: Run test suites.

Reviewers: haobo, dhruba, xjin

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13431
2013-10-14 15:12:15 -07:00
Dhruba Borthakur
4463b11cad Migrate names of properties from 'leveldb' prefix to 'rocksdb' prefix.
Summary: Migrate names of properties from 'leveldb' prefix to 'rocksdb' prefix.

Test Plan: make check

Reviewers: emayanke, haobo

Reviewed By: haobo

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13311
2013-10-06 00:14:26 -07:00
Dhruba Borthakur
a143ef9b38 Change namespace from leveldb to rocksdb
Summary:
Change namespace from leveldb to rocksdb. This allows a single
application to link in open-source leveldb code as well as
rocksdb code into the same process.

Test Plan: compile rocksdb

Reviewers: emayanke

Reviewed By: emayanke

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13287
2013-10-04 11:59:26 -07:00
Mayank Agarwal
6b34021fc2 Triggering verify for gets also
Summary: Will use iterators to verify keys in the db for half of its keys and Gets for the other half.

Test Plan: ./db_stress --max_key=1000 --ops_per_thread=100

Reviewers: dhruba, haobo

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13227
2013-10-02 11:22:17 -07:00
Natalie Hildebrandt
7edb92b843 Phase 2 of iterator stress test
Summary: Using an iterator instead of the Get method, each thread goes through a portion of the database and verifies values by comparing to the shared state.

Test Plan:
./db_stress --db=/tmp/tmppp --max_key=10000 --ops_per_thread=10000

To test some basic cases, the following lines can be added (each set in turn) to the verifyDb method with the following expected results:

    // Should abort with "Unexpected value found"
    shared.Delete(start);

    // Should abort with "Value not found"
    WriteOptions write_opts;
    db_->Delete(write_opts, Key(start));

    // Should succeed
    WriteOptions write_opts;
    shared.Delete(start);
     db_->Delete(write_opts, Key(start));

    // Should abort with "Value not found"
    WriteOptions write_opts;
    db_->Delete(write_opts, Key(start + (end-start)/2));

    // Should abort with "Value not found"
    db_->Delete(write_opts, Key(end-1));

    // Should abort with "Unexpected value"
    shared.Delete(end-1);

    // Should abort with "Unexpected value"
    shared.Delete(start + (end-start)/2);

    // Should abort with "Value not found"
    db_->Delete(write_opts, Key(start));
    shared.Delete(start);
    db_->Delete(write_opts, Key(end-1));
    db_->Delete(write_opts, Key(end-2));

To test the out of range abort, change the key in the for loop to Key(i+1), so that the key defined by the index i is now outside of the supposed range of the database.

Reviewers: emayanke

Reviewed By: emayanke

CC: dhruba, xjin

Differential Revision: https://reviews.facebook.net/D13071
2013-09-30 16:48:00 -07:00
Natalie Hildebrandt
9262061b0d Fixing crashing tests to include iterpercent param
Summary: Adding in the iterpercent flag to tests.

Test Plan: make crash_test

Reviewers: emayanke

Reviewed By: emayanke

Differential Revision: https://reviews.facebook.net/D13035
2013-09-20 16:27:22 -07:00
Natalie Hildebrandt
433541823c Phase 1 of an iterator stress test
Summary:
Added MultiIterate() which does a seek and some Next/Prev
calls.  Iterator status is checked only, no data integrity check

Test Plan:
make db_stress
./db_stress --iterpercent=<nonzero value> --readpercent=, etc.

Reviewers: emayanke, dhruba, xjin

Reviewed By: emayanke

CC: leveldb

Differential Revision: https://reviews.facebook.net/D12915
2013-09-19 16:47:24 -07:00
Dhruba Borthakur
4012ca1c7b Added a parameter to limit the maximum space amplification for universal compaction.
Summary:
Added a new field called max_size_amplification_ratio in the
CompactionOptionsUniversal structure. This determines the maximum
percentage overhead of space amplification.

The size amplification is defined to be the ratio between the size of
the oldest file to the sum of the sizes of all other files. If the
size amplification exceeds the specified value, then min_merge_width
and max_merge_width are ignored and a full compaction of all files is done.
A value of 10 means that the size a database that stores 100 bytes
of user data could occupy 110 bytes of physical storage.

Test Plan: Unit test DBTest.UniversalCompactionSpaceAmplification added.

Reviewers: haobo, emayanke, xjin

Reviewed By: haobo

CC: leveldb

Differential Revision: https://reviews.facebook.net/D12825
2013-09-13 16:27:18 -07:00
Dhruba Borthakur
1186192ed1 Replace include/leveldb with include/rocksdb.
Summary: Replace include/leveldb with include/rocksdb.

Test Plan:
make clean; make check
make clean; make release

Differential Revision: https://reviews.facebook.net/D12489
2013-08-23 10:51:00 -07:00
Jim Paton
74781a0c49 Add three new MemTableRep's
Summary:
This patch adds three new MemTableRep's: UnsortedRep, PrefixHashRep, and VectorRep.

UnsortedRep stores keys in an std::unordered_map of std::sets. When an iterator is requested, it dumps the keys into an std::set and iterates over that.

VectorRep stores keys in an std::vector. When an iterator is requested, it creates a copy of the vector and sorts it using std::sort. The iterator accesses that new vector.

PrefixHashRep stores keys in an unordered_map mapping prefixes to ordered sets.

I also added one API change. I added a function MemTableRep::MarkImmutable. This function is called when the rep is added to the immutable list. It doesn't do anything yet, but it seems like that could be useful. In particular, for the vectorrep, it means we could elide the extra copy and just sort in place. The only reason I haven't done that yet is because the use of the ArenaAllocator complicates things (I can elaborate on this if needed).

Test Plan:
make -j32 check
./db_stress --memtablerep=vector
./db_stress --memtablerep=unsorted
./db_stress --memtablerep=prefixhash --prefix_size=10

Reviewers: dhruba, haobo, emayanke

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D12117
2013-08-22 23:10:02 -07:00
Xing Jin
af732c7ba8 Add universal compaction to db_stress nightly build
Summary:
Most code change in this diff is code cleanup/rewrite. The logic changes include:

(1) add universal compaction to db_crashtest2.py
(2) randomly set --test_batches_snapshots to be 0 or 1 in db_crashtest2.py. Old codes always use 1.
(3) use different tmp directory as db directory in different runs. I saw some intermittent errors in my local tests. Use of different tmp directory seems to be able to solve the issue.

Test Plan: Have run "make crashtest" for multiple times. Also run "make all check"

Reviewers: emayanke, dhruba, haobo

Reviewed By: emayanke

Differential Revision: https://reviews.facebook.net/D12369
2013-08-20 17:37:49 -07:00
Deon Nicholas
b87dcae1a3 Made merge_oprator a shared_ptr; and added TTL unit tests
Test Plan:
- make all check;
- make release;
- make stringappend_test; ./stringappend_test

Reviewers: haobo, emayanke

Reviewed By: haobo

CC: leveldb, kailiu

Differential Revision: https://reviews.facebook.net/D12381
2013-08-20 13:35:28 -07:00
Mayank Agarwal
8a3547d38e API for getting archived log files
Summary: Also expanded class LogFile to have startSequene and FileSize and exposed it publicly

Test Plan: make all check

Reviewers: dhruba, haobo

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D12087
2013-08-19 13:37:04 -07:00
Deon Nicholas
ad48c3c262 Benchmarking for Merge Operator
Summary:
Updated db_bench and utilities/merge_operators.h to allow for dynamic benchmarking
of merge operators in db_bench. Added a new test (--benchmarks=mergerandom), which performs
a bunch of random Merge() operations over random keys. Also added a "--merge_operator=" flag
so that the tester can easily benchmark different merge operators. Currently supports
the PutOperator and UInt64Add operator. Support for stringappend or list append may come later.

Test Plan:
	1. make db_bench
	2. Test the PutOperator (simulating Put) as follows:
./db_bench --benchmarks=fillrandom,readrandom,updaterandom,readrandom,mergerandom,readrandom --merge_operator=put
--threads=2

3. Test the UInt64AddOperator (simulating numeric addition) similarly:
./db_bench --value_size=8 --benchmarks=fillrandom,readrandom,updaterandom,readrandom,mergerandom,readrandom
--merge_operator=uint64add --threads=2

Reviewers: haobo, dhruba, zshao, MarkCallaghan

Reviewed By: haobo

CC: leveldb

Differential Revision: https://reviews.facebook.net/D11535
2013-08-15 17:13:07 -07:00
Tyler Harter
85d83a150b Update crashtests to match D12267
Summary: I changed the db_stress configs, but forgot to update the scripts using the old configs.

Test Plan: 'make blackbox_crash_test' and 'make whitebox_crash_test' start running normally now (I haven't run them til the end, though).

Reviewers: vamsi

Reviewed By: vamsi

CC: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D12303
2013-08-15 10:14:32 -07:00
Xing Jin
0a5afd1afc Minor fix to current codes
Summary:
Minor fix to current codes, including: coding style, output format,
comments. No major logic change. There are only 2 real changes, please see my inline comments.

Test Plan: make all check

Reviewers: haobo, dhruba, emayanke

Differential Revision: https://reviews.facebook.net/D12297
2013-08-14 23:03:57 -07:00
Tyler Harter
7612d496ff Add prefix scans to db_stress (and bug fix in prefix scan)
Summary: Added support for prefix scans.

Test Plan: ./db_stress --max_key=4096 --ops_per_thread=10000

Reviewers: dhruba, vamsi

Reviewed By: vamsi

CC: leveldb

Differential Revision: https://reviews.facebook.net/D12267
2013-08-14 16:58:36 -07:00
Dhruba Borthakur
f3967a5132 Merge remote-tracking branch 'origin' into performance 2013-08-12 09:58:50 -07:00
Haobo Xu
58a0ae06dc [RocksDB] Improve sst_dump to take user key range
Summary: The ability to dump internal keys associated with certain user keys, directly from sst files, is very useful for diagnosis. Will incorporate it directly into ldb later.

Test Plan: run it

Reviewers: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D12075
2013-08-08 15:31:12 -07:00
Dhruba Borthakur
f5fa26b6a9 Merge branch 'performance' of github.com:facebook/rocksdb into performance
Conflicts:
	db/builder.cc
	db/db_impl.cc
	db/version_set.cc
	include/leveldb/statistics.h
2013-08-07 11:58:06 -07:00
Mayank Agarwal
1d7b4765c3 Expose base db object from ttl wrapper
Summary: rocksdb replicaiton will need this when writing value+TS from master to slave 'as is'

Test Plan: make

Reviewers: dhruba, vamsi, haobo

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D11919
2013-08-05 18:44:14 -07:00
Dhruba Borthakur
711a30cb30 Merge branch 'master' into performance
Conflicts:
	include/leveldb/options.h
	include/leveldb/statistics.h
	util/options.cc
2013-08-02 10:22:08 -07:00
Mayank Agarwal
59d0b02f8b Expand KeyMayExist to return the proper value if it can be found in memory and also check block_cache
Summary: Removed KeyMayExistImpl because KeyMayExist demanded Get like semantics now. Removed no_io from memtable and imm because we need the proper value now and shouldn't just stop when we see Merge in memtable. Added checks to block_cache. Updated documentation and unit-test

Test Plan: make all check;db_stress for 1 hour

Reviewers: dhruba, haobo

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D11853
2013-08-01 09:07:46 -07:00
Mayank Agarwal
f3baeecd44 Adding filter_deletes to crash_tests run in jenkins
Summary: filter_deletes options introduced in db_stress makes it drop Deletes on key if KeyMayExist(key) returns false on the key. code change was simple and tested so not wasting reviewer's time.

Test Plan: maek crash_test; python tools/db_crashtest[1|2].py

CC: dhruba, vamsi

Differential Revision: https://reviews.facebook.net/D11769
2013-07-23 13:49:16 -07:00
Mayank Agarwal
bf66c10b13 Use KeyMayExist for WriteBatch-Deletes
Summary:
Introduced KeyMayExist checking during writebatch-delete and removed from Outer Delete API because it uses writebatch-delete.
Added code to skip getting Table from disk if not already present in table_cache.
Some renaming of variables.
Introduced KeyMayExistImpl which allows checking since specified sequence number in GetImpl useful to check partially written writebatch.
Changed KeyMayExist to not be pure virtual and provided a default implementation.
Expanded unit-tests in db_test to check appropriately.
Ran db_stress for 1 hour with ./db_stress --max_key=100000 --ops_per_thread=10000000 --delpercent=50 --filter_deletes=1 --statistics=1.

Test Plan: db_stress;make check

Reviewers: dhruba, haobo

Reviewed By: dhruba

CC: leveldb, xjin

Differential Revision: https://reviews.facebook.net/D11745
2013-07-23 13:36:50 -07:00
Dhruba Borthakur
4a745a5666 Merge branch 'master' into performance
Conflicts:
	db/version_set.cc
	include/leveldb/options.h
	util/options.cc
2013-07-17 15:05:57 -07:00
Mayank Agarwal
2a986919d6 Make rocksdb-deletes faster using bloom filter
Summary:
Wrote a new function in db_impl.c-CheckKeyMayExist that calls Get but with a new parameter turned on which makes Get return false only if bloom filters can guarantee that key is not in database. Delete calls this function and if the option- deletes_use_filter is turned on and CheckKeyMayExist returns false, the delete will be dropped saving:
1. Put of delete type
2. Space in the db,and
3. Compaction time

Test Plan:
make all check;
will run db_stress and db_bench and enhance unit-test once the basic design gets approved

Reviewers: dhruba, haobo, vamsi

Reviewed By: haobo

CC: leveldb

Differential Revision: https://reviews.facebook.net/D11607
2013-07-11 12:11:11 -07:00
Mayank Agarwal
821889e207 Print complete statistics in db_stress
Summary: db_stress should alos print complete statistics like db_bench. Needed this when I wanted to measure number of delete-IOs dropped due to CheckKeyMayExist to be introduced to rocksdb codebase later- to make deltes in rocksdb faster

Test Plan: make db_stress;./db_stress --max_key=100 --ops_per_thread=1000 --statistics=1

Reviewers: sheki, dhruba, vamsi, haobo

Reviewed By: dhruba

Differential Revision: https://reviews.facebook.net/D11655
2013-07-10 18:07:13 -07:00
Dhruba Borthakur
116ec527f2 Renamed 'hybrid_compaction' tp be "Universal Compaction'.
Summary:
All the universal compaction parameters are encapsulated in
a new file universal_compaction.h

Test Plan:
make check
2013-07-03 15:47:53 -07:00
Dhruba Borthakur
47c4191fe8 Reduce write amplification by merging files in L0 back into L0
Summary:
There is a new option called hybrid_mode which, when switched on,
causes HBase style compactions.  Files from L0 are
compacted back into L0. This meat of this compaction algorithm
is in PickCompactionHybrid().

All files reside in L0. That means all files have overlapping
keys. Each file has a time-bound, i.e. each file contains a
range of keys that were inserted around the same time. The
start-seqno and the end-seqno refers to the timeframe when
these keys were inserted.  Files that have contiguous seqno
are compacted together into a larger file. All files are
ordered from most recent to the oldest.

The current compaction algorithm starts to look for
candidate files starting from the most recent file. It continues to
add more files to the same compaction run as long as the
sum of the files chosen till now is smaller than the next
candidate file size. This logic needs to be debated
and validated.

The above logic should reduce write amplification to a
large extent... will publish numbers shortly.

Test Plan: dbstress runs for 6 hours with no data corruption (tested so far).

Differential Revision: https://reviews.facebook.net/D11289
2013-06-30 20:07:04 -07:00
Haobo Xu
0f78fad9f5 [RocksDB] add back --mmap_read options to crashtest
Summary: As title, now that db_stress supports --map_read properly

Test Plan: make crash_test

Reviewers: vamsi, emayanke, dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D11391
2013-06-19 16:15:59 -07:00
Haobo Xu
96be2c4ee0 [RocksDB] Add mmap_read option for db_stress
Summary: as title, also removed an incorrect assertion

Test Plan: make check; db_stress --mmap_read=1; db_stress --mmap_read=0

Reviewers: dhruba, emayanke

CC: leveldb

Differential Revision: https://reviews.facebook.net/D11367
2013-06-19 10:28:32 -07:00
Dhruba Borthakur
836534debd Enhance dbstress to allow specifying compaction trigger for L0.
Summary:
Rocksdb allos specifying the number of files in L0 that triggers
compactions. Expose this api as a command line parameter for
running db_stress.

Test Plan: Run test

Reviewers: sheki, emayanke

Reviewed By: emayanke

CC: leveldb

Differential Revision: https://reviews.facebook.net/D11343
2013-06-17 14:15:09 -07:00
Haobo Xu
0c2a2dd5e8 [RocksDB] Fix build. Removed deprecated option --mmap_read from db_crashtest
Summary: As title

Test Plan: db_crashtest

Reviewers: vamsi, emayanke

CC: leveldb

Differential Revision: https://reviews.facebook.net/D11271
2013-06-13 13:48:35 -07:00
Haobo Xu
bdf1085944 [RocksDB] cleanup EnvOptions
Summary:
This diff simplifies EnvOptions by treating it as POD, similar to Options.
- virtual functions are removed and member fields are accessed directly.
- StorageOptions is removed.
- Options.allow_readahead and Options.allow_readahead_compactions are deprecated.
- Unused global variables are removed: useOsBuffer, useFsReadAhead, useMmapRead, useMmapWrite

Test Plan: make check; db_stress

Reviewers: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D11175
2013-06-12 11:17:19 -07:00
Mayank Agarwal
7a6bd8e975 Modifying options to db_stress when it is run with db_crashtest
Summary: These extra options caught some bugs. Will be run via Jenkins now with the crash_test

Test Plan: ./make crashtest

Reviewers: dhruba, vamsi

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D11151
2013-06-09 09:58:46 -07:00
Vamsi Ponnekanti
3bb9449906 [Fix whilebox crash test failure]
Summary:
I think the check for "error" that I added had caused
false alarm. Fixed that.

Test Plan:
Revert Plan: OK

Task ID: #

Reviewers: emayanke, dhruba

Reviewed By: emayanke

Differential Revision: https://reviews.facebook.net/D11139
2013-06-07 11:34:46 -07:00
Vamsi Ponnekanti
5cf7a00bda [Make most of the changes suggested by Aaron]
Summary: $title

Test Plan:
Revert Plan: OK

Task ID: #

Reviewers: emayanke, akushner

Reviewed By: akushner

Differential Revision: https://reviews.facebook.net/D10923
2013-06-06 17:31:45 -07:00
Haobo Xu
c2e2460f8a [RocksDB] Expose DBStatistics
Summary: Make Statistics usable by client

Test Plan: make check; db_bench

Reviewers: dhruba

Reviewed By: dhruba

Differential Revision: https://reviews.facebook.net/D10899
2013-05-23 11:49:38 -07:00