Summary:
Release builds are failing on Linux with the error:
```
tools/db_stress.cc: In function ‘int main(int, char**)’:
tools/db_stress.cc:2365:12: error: ‘rocksdb::SyncPoint’ has not been declared
rocksdb::SyncPoint::GetInstance()->SetCallBack(
^
tools/db_stress.cc:2370:12: error: ‘rocksdb::SyncPoint’ has not been declared
rocksdb::SyncPoint::GetInstance()->SetCallBack(
^
tools/db_stress.cc:2375:12: error: ‘rocksdb::SyncPoint’ has not been declared
rocksdb::SyncPoint::GetInstance()->EnableProcessing();
^
make[1]: *** [tools/db_stress.o] Error 1
make[1]: Leaving directory `/data/sandcastle/boxes/trunk-git-rocksdb-public'
make: *** [release] Error 2
```
Closes https://github.com/facebook/rocksdb/pull/2355
Differential Revision: D5113552
Pulled By: sagar0
fbshipit-source-id: 351df707277787da5633ba4a40e52edc7c895dc4
Summary:
Disable direct reads for log and manifest. Direct reads should not affect sequential_file
Also add kDirectIO for option_config_ in db_test_util
Closes https://github.com/facebook/rocksdb/pull/2337
Differential Revision: D5100261
Pulled By: lightmark
fbshipit-source-id: 0ebfd13b93fa1b8f9acae514ac44f8125a05868b
Summary:
- Introduced an include/ file dedicated to db-related debug functions to avoid making db.h more complex
- Added debugging function, `GetAllKeyVersions()`, to return a listing of internal data for a range of user keys. The new `struct KeyVersion` exposes data similar to internal key without exposing any internal type.
- Migrated the "ldb idump" subcommand to use this function
- The API takes an inclusive-exclusive range to match behavior of "ldb idump". This will be quite annoying for users who want to query a single user key's versions :(.
Closes https://github.com/facebook/rocksdb/pull/2232
Differential Revision: D4976007
Pulled By: ajkr
fbshipit-source-id: cab375da53a7595d6575af2b7e3b776aa3ad793e
Summary:
Allow an option for users to do some compaction in FIFO compaction, to pay some write amplification for fewer number of files.
Closes https://github.com/facebook/rocksdb/pull/2163
Differential Revision: D4895953
Pulled By: siying
fbshipit-source-id: a1ab608dd0627211f3e1f588a2e97159646e1231
Summary:
Replacement of #2147
The change was squashed due to a lot of conflicts.
Closes https://github.com/facebook/rocksdb/pull/2194
Differential Revision: D4929799
Pulled By: siying
fbshipit-source-id: 5cd49c254737a1d5ac13f3c035f128e86524c581
Summary:
In a previous commit, I changed the way to checkout release branches from "git checkout <branch_name>" to "git checkout origin/<branch_name>". However, this doesn't seem to work in our CI environment. Revert it.
Closes https://github.com/facebook/rocksdb/pull/2189
Differential Revision: D4922294
Pulled By: siying
fbshipit-source-id: 482c17f9b05e6ccb190876b050682fe5a458103d
Summary:
tools/check_format_compatible.sh will check a newer version of RocksDB can open option files generated by older version releases. In order to achieve that, a new parameter "--try_load_options" is added to ldb. With this parameter set, if option file exists, we load the option file and use it to open the DB. With this opiton set, we can validate option loading logic.
Closes https://github.com/facebook/rocksdb/pull/2178
Differential Revision: D4914989
Pulled By: siying
fbshipit-source-id: db114f7724fcb41e5e9483116d84d7c4b8389ca4
Summary:
Tested by running it on a remote machine.
I could not run it on the particular remote machine which has a different location for time command since it is busy and the script does not allow concurrent runs. So I tested it by hacking the script and replacing the command with "\$(hostname)" and confirmed that the scripts prints out the host name of the remote machine.
Closes https://github.com/facebook/rocksdb/pull/2181
Differential Revision: D4921654
Pulled By: maysamyabandeh
fbshipit-source-id: 8abb5ea9f7234f3c50a749576ccbb47ff605beb9
Summary:
Need to add more recent versions to tools/check_format_compatible.sh to meka sure backward and forward compatibility.
Closes https://github.com/facebook/rocksdb/pull/2175
Differential Revision: D4911585
Pulled By: siying
fbshipit-source-id: 943e6488757efb11bb6720d811c7ba949915c9de
Summary:
Add a function to allow users to reset internal stats without restarting the DB.
Closes https://github.com/facebook/rocksdb/pull/2167
Differential Revision: D4907939
Pulled By: siying
fbshipit-source-id: ab2dd85b88aabe9380da7485320a1d460d3e1f68
Summary:
Replace Options::use_direct_writes with Options::use_direct_io_for_flush_and_compaction
Now if Options::use_direct_io_for_flush_and_compaction = true, we will enable direct io for both reads and writes for flush and compaction job. Whereas Options::use_direct_reads controls user reads like iterator and Get().
Closes https://github.com/facebook/rocksdb/pull/2117
Differential Revision: D4860912
Pulled By: lightmark
fbshipit-source-id: d93575a8a5e780cf7e40797287edc425ee648c19
Summary:
filter the warning out and only print it once.
Closes https://github.com/facebook/rocksdb/pull/2137
Differential Revision: D4870925
Pulled By: lightmark
fbshipit-source-id: 91b363ce7f70bce88b0780337f408fc4649139b8
Summary:
Run the time command before regression tests, parse the output, and add the numbers to the report.
Closes https://github.com/facebook/rocksdb/pull/2101
Differential Revision: D4862781
Pulled By: maysamyabandeh
fbshipit-source-id: 4a81caa5d14187d67093aad154c8f0ad56aba901
Summary:
Check the result of the benchmark againt a specified truth_db, which is
expected to be produced using the same benchmark but perhaps on a
different commit or with different configs.
The verification is simple and assumes that key/values are generated
deterministically. This assumption would break if db_bench using rand
variable differently from the benchmark that produced truth_db.
Currently it is checked to work on fillrandom and readwhilewriting.
A param finish_after_writes is added to ensure that the background
writing thread will write the same number of entries between two
benchmarks.
Example:
$ TEST_TMPDIR=/dev/shm/truth_db ./db_bench
--benchmarks="fillrandom,readwhilewriting" --num=200000
--finish_after_writes=true
$ TEST_TMPDIR=/dev/shm/tmpdb ./db_bench
--benchmarks="fillrandom,readwhilewriting,verify" --truth_db
/dev/shm/truth_db/dbbench --num=200000 --finish_after_writes=true
Verifying db <= truth_db...
Verifying db >= truth_db...
...Verified
Closes https://github.com/facebook/rocksdb/pull/2098
Differential Revision: D4839233
Pulled By: maysamyabandeh
fbshipit-source-id: 2f4ed31
Summary:
This is an effort to club all string related utility functions into one common place, in string_util, so that it is easier for everyone to know what string processing functions are available. Right now they seem to be spread out across multiple modules, like logging and options_helper.
Check the sub-commits for easier reviewing.
Closes https://github.com/facebook/rocksdb/pull/2094
Differential Revision: D4837730
Pulled By: sagar0
fbshipit-source-id: 344278a
Summary:
Move some files under util/ to new directories env/, monitoring/ options/ and cache/
Closes https://github.com/facebook/rocksdb/pull/2090
Differential Revision: D4833681
Pulled By: siying
fbshipit-source-id: 2fd8bef
Summary:
Rebuilding regression db at the first run every monday.
Closes https://github.com/facebook/rocksdb/pull/2093
Differential Revision: D4836961
Pulled By: lightmark
fbshipit-source-id: 22d6c25
Summary:
It is confusing to have auto_roll_logger to stay under db/, which has nothing to do with database. Move filename together as it is a dependency.
Closes https://github.com/facebook/rocksdb/pull/2080
Differential Revision: D4821141
Pulled By: siying
fbshipit-source-id: ca7d768
Summary:
Fix the bug that previous test existence locally
Closes https://github.com/facebook/rocksdb/pull/2074
Differential Revision: D4813631
Pulled By: lightmark
fbshipit-source-id: eceb0d9
Summary:
add ldb to regression test in order to enable reuse db by creating checkpoint
Closes https://github.com/facebook/rocksdb/pull/2030
Differential Revision: D4800549
Pulled By: lightmark
fbshipit-source-id: d3a7325
Summary:
Added fifo benchmark to db_bench.
One thing i am not sure is that i am using CompactRange() instead of CompactFiles(). (may cause performance skew because CompactionRange() is not happening in current thread?) For CompactFiles(), for some reason FIFO compaction doesn't work as expected. More insight is welcomed. I guess FIFO compaction doesn't work with file names? igorcanadi
test cmd:
./db_bench --compaction_style=2 --benchmarks=fillseqdeterministic --disable_auto_compactions --num_levels=1 --fifo_compaction_max_table_files_size_mb=10
---------------------- DB 0 LSM ---------------------
Level[0]: /000014.sst(size: 4211014 bytes)
fillseqdeterministic : 4.731 micros/op 211381 ops/sec; 23.4 MB/s
Closes https://github.com/facebook/rocksdb/pull/1734
Differential Revision: D4774964
Pulled By: siying
fbshipit-source-id: 9d08df6
Summary:
PlainTable now supports non-mmap mode. We don't need to check it anymore.
Closes https://github.com/facebook/rocksdb/pull/1882
Differential Revision: D4751643
Pulled By: siying
fbshipit-source-id: ab14540
Summary:
Removed max_grandparent_overlap_factor from benchmark.sh since it is not a valid option anymore.
Closes https://github.com/facebook/rocksdb/pull/2015
Differential Revision: D4748229
Pulled By: lgalanis
fbshipit-source-id: c3869ea
Summary:
PinnableSlice
Summary:
Currently the point lookup values are copied to a string provided by the
user. This incures an extra memcpy cost. This patch allows doing point lookup
via a PinnableSlice which pins the source memory location (instead of
copying their content) and releases them after the content is consumed
by the user. The old API of Get(string) is translated to the new API
underneath.
Here is the summary for improvements:
value 100 byte: 1.8% regular, 1.2% merge values
value 1k byte: 11.5% regular, 7.5% merge values
value 10k byte: 26% regular, 29.9% merge values
The improvement for merge could be more if we extend this approach to
pin the merge output and delay the full merge operation until the user
actually needs it. We have put that for future work.
PS:
Sometimes we observe a small decrease in performance when switching from
t5452014 to this patch but with the old Get(string) API. The d
Closes https://github.com/facebook/rocksdb/pull/1756
Differential Revision: D4391738
Pulled By: maysamyabandeh
fbshipit-source-id: 6f3edd3
Summary:
Add the flag --prefix to the sst_dump tool
This flag is similar to, and exclusive from, the --from flag.
--prefix=0x00FF will return all rows prefixed with 0x00FF.
The --to flag may also be specified and will work as expected.
These changes were used to help in debugging the power cycle corruption issue and theses changes were tested by scanning through a udb.
Closes https://github.com/facebook/rocksdb/pull/1984
Differential Revision: D4691814
Pulled By: reidHoruff
fbshipit-source-id: 027f261
Summary:
This reverts commit d43adf21bb.
The patch has caused problems in regression tests. Will revert it for now until we figure how to debug the problems regression tests.
Closes https://github.com/facebook/rocksdb/pull/1975
Differential Revision: D4682880
Pulled By: maysamyabandeh
fbshipit-source-id: 84df83a
Summary:
It augments the regression benchmarks with a time command, parses the output, and print them to the SUMMARY.csv file.
I tested a variation of the script locally. Any idea how to do run a test that also involves writing to scuba tables?
Closes https://github.com/facebook/rocksdb/pull/1967
Differential Revision: D4679470
Pulled By: maysamyabandeh
fbshipit-source-id: 44dac30
Summary:
Also extracted the common logic into a base class, BackupableCommand.
Closes https://github.com/facebook/rocksdb/pull/1939
Differential Revision: D4630121
Pulled By: ajkr
fbshipit-source-id: 04bb067
Summary:
This option is needed to be enabled for Direct IO
and I cannot think of a reason where we need to disable it
remove it and default it to true
Closes https://github.com/facebook/rocksdb/pull/1944
Differential Revision: D4641088
Pulled By: IslamAbdelRahman
fbshipit-source-id: d7085b9
Summary:
The default behavior was too weird because, previously, we got the L0 file size limit (64MB) from Options default and L1+ file size limit (2MB) from the hardcoded value. We should get both from Options default.
Closes https://github.com/facebook/rocksdb/pull/1943
Differential Revision: D4640301
Pulled By: ajkr
fbshipit-source-id: fd8c0fd
Summary:
add direct_io and compaction_readahead_size in db_stress
test direct_io under db_stress with compaction_readahead_size enabled to capture bugs found in production.
`./db_stress --allow_concurrent_memtable_write=0 --use_direct_reads --use_direct_writes --compaction_readahead_size=4096`
Closes https://github.com/facebook/rocksdb/pull/1906
Differential Revision: D4604514
Pulled By: IslamAbdelRahman
fbshipit-source-id: ebbf0ee
Summary:
Remove disableDataSync, and another similarly named disable_data_sync options.
This is being done to simplify options, and also because the performance gains of this feature can be achieved by other methods.
Closes https://github.com/facebook/rocksdb/pull/1859
Differential Revision: D4541292
Pulled By: sagar0
fbshipit-source-id: 5b3a6ca
Summary:
move the argument description to the right section, make it clearer that the '=' sign is required, and use it in one subcommand where it seemed forgotten before.
Closes https://github.com/facebook/rocksdb/pull/1840
Differential Revision: D4515096
Pulled By: ajkr
fbshipit-source-id: 7b5b1c1
Summary:
I want to be able to, e.g., DECLARE_string(statistics_string); in my application such that I can override the default value of statistics_string. For this to work, we need to remove the unnamed namespace containing all the flags, and make sure all variables/functions covered by that namespace are static.
Replaces #1828 due to internal tool issues.
Closes https://github.com/facebook/rocksdb/pull/1844
Differential Revision: D4515124
Pulled By: ajkr
fbshipit-source-id: 23b695e
Summary:
introduce new methods into a public threadpool interface,
- allow submission of std::functions as they allow greater flexibility.
- add Joining methods to the implementation to join scheduled and submitted jobs with
an option to cancel jobs that did not start executing.
- Remove ugly `#ifdefs` between pthread and std implementation, make it uniform.
- introduce pimpl for a drop in replacement of the implementation
- Introduce rocksdb::port::Thread typedef which is a replacement for std::thread. On Posix Thread defaults as before std::thread.
- Implement WindowsThread that allocates memory in a more controllable manner than windows std::thread with a replaceable implementation.
- should be no functionality changes.
Closes https://github.com/facebook/rocksdb/pull/1823
Differential Revision: D4492902
Pulled By: siying
fbshipit-source-id: c74cb11
Summary:
Added -statistics_string to deserialize a Statistics object using the factory functions registered by applications.
Closes https://github.com/facebook/rocksdb/pull/1812
Differential Revision: D4469811
Pulled By: ajkr
fbshipit-source-id: 2d80862
Summary:
The Env registration framework supports registering client Envs and selecting which one to instantiate according to a text field. This enabled things like adding the -env_uri argument to db_bench, so the same binary could be reused with different Envs just by changing CLI config.
Now this problem has come up again in a non-Env context, as I want to instantiate a client Statistics implementation from db_bench, which is configured entirely via text parameters. Also, in the future we may wish to use it for deserializing client objects when loading OPTIONS file.
This diff generalizes the Env registration logic to work with arbitrary types.
- Generalized registration and instantiation code by templating them
- The entire implementation is in a header file as that's Google style guide's recommendation for template definitions
- Pattern match with std::regex_match rather than checking prefix, which was the previous behavior
- Rename functions/files to be non-Env-specific
Closes https://github.com/facebook/rocksdb/pull/1776
Differential Revision: D4421933
Pulled By: ajkr
fbshipit-source-id: 34647d1
Summary:
Currently the point lookup values are copied to a string provided by the user.
This incures an extra memcpy cost. This patch allows doing point lookup
via a PinnableSlice which pins the source memory location (instead of
copying their content) and releases them after the content is consumed
by the user. The old API of Get(string) is translated to the new API
underneath.
Here is the summary for improvements:
1. value 100 byte: 1.8% regular, 1.2% merge values
2. value 1k byte: 11.5% regular, 7.5% merge values
3. value 10k byte: 26% regular, 29.9% merge values
The improvement for merge could be more if we extend this approach to
pin the merge output and delay the full merge operation until the user
actually needs it. We have put that for future work.
PS:
Sometimes we observe a small decrease in performance when switching from
t5452014 to this patch but with the old Get(string) API. The difference
is a little and could be noise. More importantly it is safely
cancelled
Closes https://github.com/facebook/rocksdb/pull/1732
Differential Revision: D4374613
Pulled By: maysamyabandeh
fbshipit-source-id: a077f1a
Summary:
?tion
Add flags in db_bench to test with block cache mid-point insertion.
Also update sst_dump to dump total block sizes of each type. I find it
useful to look at these test db stats and I don't know if we have them
elsewhere.
Closes https://github.com/facebook/rocksdb/pull/1706
Differential Revision: D4355812
Pulled By: yiwu-arbug
fbshipit-source-id: 3e4a348
Summary:
Add the parameter in db_bench to help users to measure latency histogram with constant read rate.
Closes https://github.com/facebook/rocksdb/pull/1683
Differential Revision: D4341387
Pulled By: siying
fbshipit-source-id: 1b4b276
Summary:
Fixes compile error:
In file included from ./util/statistics.h:17:0,
from ./util/stop_watch.h:8,
from ./util/perf_step_timer.h:9,
from ./util/iostats_context_imp.h:8,
from ./util/posix_logger.h:27,
from ./port/util_logger.h:18,
from ./db/auto_roll_logger.h:15,
from db/auto_roll_logger.cc:6:
./util/thread_local.h:65:16: error: 'function' in namespace 'std' does not name a template type
typedef std::function<void(void*, void*)> FoldFunc;
Closes https://github.com/facebook/rocksdb/pull/1656
Differential Revision: D4318702
Pulled By: yiwu-arbug
fbshipit-source-id: 8c5d17a
Summary:
Include a dump of user_collected_properties in sst_dump
Closes https://github.com/facebook/rocksdb/pull/1668
Differential Revision: D4325078
Pulled By: IslamAbdelRahman
fbshipit-source-id: 226b6d6
Summary:
char is unsigned on power by default causing this test to fail with the FF case. ppc64 return 255 while x86 returned -1. Casting works on both platforms.
Closes https://github.com/facebook/rocksdb/pull/1500
Differential Revision: D4308775
Pulled By: yiwu-arbug
fbshipit-source-id: db3e6e0
Summary:
The info log file ("LOG") is stored in the db directory by default. When the db is on a distributed env, this is unnecessarily slow. So, I added an option to db_bench to just print the info log messages to stderr.
Closes https://github.com/facebook/rocksdb/pull/1641
Differential Revision: D4309348
Pulled By: ajkr
fbshipit-source-id: 1e6f851
Summary:
made db_stress capable of adding range deletions to its db and verifying their correctness. i'll make db_crashtest.py use this option later once the collapsing optimization (https://github.com/facebook/rocksdb/pull/1614) is committed because currently it slows down the test too much.
Closes https://github.com/facebook/rocksdb/pull/1625
Differential Revision: D4293939
Pulled By: ajkr
fbshipit-source-id: d3beb3a
Summary:
When enabled, this option replaces range tombstones with a sequence of
point tombstones covering the same range. This can be used to A/B test perf of
range tombstones vs sequential point tombstones, and help us find the cross-over
point, i.e., the size of the range above which range tombstones outperform point
tombstones.
Closes https://github.com/facebook/rocksdb/pull/1594
Differential Revision: D4246312
Pulled By: ajkr
fbshipit-source-id: 3b00b23
Summary:
Reduce number of comparisons in heap by caching which child node in the first level is smallest (left_child or right_child)
So next time we can compare directly against the smallest child
I see that the total number of calls to comparator drops significantly when using this optimization
Before caching (~2mil key comparison for iterating the DB)
```
$ DEBUG_LEVEL=0 make db_bench -j64 && ./db_bench --benchmarks="readseq" --db="/dev/shm/heap_opt" --use_existing_db --disable_auto_compactions --cache_size=1000000000 --perf_level=2
readseq : 0.338 micros/op 2959201 ops/sec; 327.4 MB/s user_key_comparison_count = 2000008
```
After caching (~1mil key comparison for iterating the DB)
```
$ DEBUG_LEVEL=0 make db_bench -j64 && ./db_bench --benchmarks="readseq" --db="/dev/shm/heap_opt" --use_existing_db --disable_auto_compactions --cache_size=1000000000 --perf_level=2
readseq : 0.309 micros/op 3236801 ops/sec; 358.1 MB/s user_key_comparison_count = 1000011
```
It also improves
Closes https://github.com/facebook/rocksdb/pull/1600
Differential Revision: D4256027
Pulled By: IslamAbdelRahman
fbshipit-source-id: 76fcc66
Summary:
Now that we have userspace persisted cache, we don't need flashcache anymore.
Closes https://github.com/facebook/rocksdb/pull/1588
Differential Revision: D4245114
Pulled By: igorcanadi
fbshipit-source-id: e2c1c72
Summary:
- Made RangeDelAggregator's InternalKeyComparator member a reference-to-const so we don't need to copy-construct it. Also added InternalKeyComparator to ImmutableCFOptions so we don't need to construct one for each DBIter.
- Made MemTable::NewRangeTombstoneIterator and the table readers' NewRangeTombstoneIterator() functions return nullptr instead of NewEmptyInternalIterator to avoid the allocation. Updated callers accordingly.
Closes https://github.com/facebook/rocksdb/pull/1548
Differential Revision: D4208169
Pulled By: ajkr
fbshipit-source-id: 2fd65cf
Summary:
Added a few options to configure when to add range tombstones during
any benchmark involving writes.
Closes https://github.com/facebook/rocksdb/pull/1522
Differential Revision: D4187388
Pulled By: ajkr
fbshipit-source-id: 2c8a473
Summary:
Add a subcommand to ldb with which we can delete a range of keys.
Closes https://github.com/facebook/rocksdb/pull/1521
Differential Revision: D4186338
Pulled By: ajkr
fbshipit-source-id: b8e9861
Summary:
enhancing sst_dump to be able to parse internal key
Closes https://github.com/facebook/rocksdb/pull/1482
Differential Revision: D4154175
Pulled By: siying
fbshipit-source-id: b0e28b1
Summary:
Use 16384 as e.g .value for ldb the --compression_max_dict_bytes option.
I think 14 was copy and pasted from the options in the lines above.
Closes https://github.com/facebook/rocksdb/pull/1483
Differential Revision: D4154393
Pulled By: siying
fbshipit-source-id: ef53a69
Summary:
Note: reviewed in https://reviews.facebook.net/D65115
- DBIter maintains a range tombstone accumulator. We don't cleanup obsolete tombstones yet, so if the user seeks back and forth, the same tombstones would be added to the accumulator multiple times.
- DBImpl::NewInternalIterator() (used to make DBIter's underlying iterator) adds memtable/L0 range tombstones, L1+ range tombstones are added on-demand during NewSecondaryIterator() (see D62205)
- DBIter uses ShouldDelete() when advancing to check whether keys are covered by range tombstones
Closes https://github.com/facebook/rocksdb/pull/1464
Differential Revision: D4131753
Pulled By: ajkr
fbshipit-source-id: be86559
Summary:
Fix the bug that --dump_malloc_stats is set before opening the DB.
Closes https://github.com/facebook/rocksdb/pull/1447
Differential Revision: D4106001
Pulled By: siying
fbshipit-source-id: 4e746da
Summary:
Some older tags don't run GCC 4.8 with FB internal setting. Fixed them and created branches. Change the format compatible script accordingly.
Also add more releases to check format compatibility.
* enable cmake to work on linux and osx also
* port part of build_detect_platform not covered by thirdparty.inc
to cmake.
- detect fallocate()
- detect malloc_usable_size()
- detect JeMalloc
- detect snappy
* check for asan,tsan,ubsan
* create 'build_version.cc' in build directory.
* add `check` target to support 'make check'.
* add `tools` target to match its counterpart in Makefile.
* use `date` on non-win32 platforms.
* pass different cflags on non-win32 platforms
* detect pthead library using FindThread cmake module.
* enable CMP0042 to silence the cmake warning on osx
* reorder the linked libraries. because testutillib references gtest, to
enable the linker to find the referenced symbols, we need to put gtest
after testutillib.
Signed-off-by: Marcus Watts <mwatts@redhat.com>
Signed-off-by: Kefu Chai <kchai@redhat.com>
* hash_table_bench.cc: fix build without gflags
Signed-off-by: Kefu Chai <kchai@redhat.com>
* remove gtest from librocksdb linkage
testharness.cc is included in librocksdb sources, and it uses gtest. but
gtest is not supposed to be part of the public API of librocksdb. so, in
this change, the testharness.cc is moved out out librocksdb, and is
built as an object target, then linked with the tools and tests instead.
Signed-off-by: Marcus Watts <mwatts@redhat.com>
Signed-off-by: Kefu Chai <kchai@redhat.com>
Summary: Use ImmutableDBOptions/MutableDBOptions internally and DBOptions only for user-facing APIs. MutableDBOptions is barely a placeholder for now. I'll start to move options to MutableDBOptions in following diffs.
Test Plan:
make all check
Reviewers: yhchiang, IslamAbdelRahman, sdong
Reviewed By: sdong
Subscribers: andrewkr, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D64065
Summary:
ZSTD 1.0.0 is coming. We can finally add a support of ZSTD without worrying about compatibility.
Still keep ZSTDNotFinal for compatibility reason.
Test Plan: Run all tests. Run db_bench with ZSTD version with RocksDB built with ZSTD 1.0 and older.
Reviewers: andrewkr, IslamAbdelRahman
Reviewed By: IslamAbdelRahman
Subscribers: cyan, igor, IslamAbdelRahman, leveldb, andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D63141
Summary: There's no reference to ImmutableCFOptions elsewhere in /include/rocksdb. ImmutableCFOptions was introduced in this commit (5665e5e285) but later its reference in /include/rocksdb/table.h is removed.
Test Plan:
make all check
Reviewers: IslamAbdelRahman, sdong, yhchiang
Reviewed By: yhchiang
Subscribers: yhchiang, andrewkr, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D63177
Summary: Fix using `arg[i].thread` after deleting it
Test Plan: run clang_analyze
Subscribers: andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D63171
Summary: To reduce number of options, merge source_compaction_factor, max_grandparent_overlap_bytes and expanded_compaction_factor into max_compaction_bytes.
Test Plan: Add two new unit tests. Run all existing tests, including jtest.
Reviewers: yhchiang, igor, IslamAbdelRahman
Reviewed By: IslamAbdelRahman
Subscribers: leveldb, andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D59829
Summary:
Add ReadOptions::read_amp_bytes_per_bit option which allow us to create a bitmap for every data block we read
the bitmap will contain (block_size / read_amp_bytes_per_bit) bits.
We will use this bitmap to mark which bytes have been used of the block so we can calculate the read amplification
Test Plan: added new tests
Reviewers: andrewkr, yhchiang, sdong
Reviewed By: sdong
Subscribers: yiwu, leveldb, march, andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D58707
Summary:
Clock-based cache implemenetation aim to have better concurreny than
default LRU cache. See inline comments for implementation details.
Test Plan:
Update cache_test to run on both LRUCache and ClockCache. Adding some
new tests to catch some of the bugs that I fixed while implementing the
cache.
Reviewers: kradhakrishnan, sdong
Reviewed By: sdong
Subscribers: andrewkr, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D61647
Summary:
This is a proof of concept of a RocksDB blob log file. The actual value of the Put() is appended to a blob log using normal data block format, and the handle of the block is written as the value of the key in RocksDB.
The prototype only supports Put() and Get(). It doesn't support DB restart, garbage collection, Write() call, iterator, snapshots, etc.
Test Plan: Add unit tests.
Reviewers: arahut
Reviewed By: arahut
Subscribers: kradhakrishnan, leveldb, andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D61485
Summary: As title.
Test Plan: Run the benchmark with and without the parameter.
Reviewers: yiwu, andrewkr, lightmark, IslamAbdelRahman
Reviewed By: IslamAbdelRahman
Subscribers: leveldb, andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D61491
Summary: Add a benchmark to `db_bench`. In this benchmark, a write thread will populate time series data in the format of 'id | timestamp', and multiple read threads will randomly retrieve all data from one id at a time.
Test Plan: Run the benchmark: `num=134217728;bpl=536870912;mb=67108864;overlap=10;mcz=2;del=300000000;levels=6;ctrig=4;delay=8;stop=12;wbn=3;mbc=20;wbs=134217728;dds=0;sync=0;t=32;vs=800;bs=4096;cs=17179869184;of=500000;wps=0;si=10000000; kir=100000; dir=/data/users/jhli/test/; ./db_bench --benchmarks=timeseries --disable_seek_compaction=1 --mmap_read=0 --statistics=1 --histogram=1 --num=$num --threads=$t --value_size=$vs --block_size=$bs --cache_size=$cs --bloom_bits=10 --cache_numshardbits=6 --open_files=$of --verify_checksum=1 --db=$dir --sync=$sync --disable_wal=0 --compression_type=none --stats_interval=$si --compression_ratio=1 --disable_data_sync=$dds --write_buffer_size=$wbs --target_file_size_base=$mb --max_write_buffer_number=$wbn --max_background_compactions=$mbc --level0_file_num_compaction_trigger=$ctrig --level0_slowdown_writes_trigger=$delay --level0_stop_writes_trigger=$stop --num_levels=$levels --delete_obsolete_files_period_micros=$del --min_level_to_compress=$mcz --max_grandparent_overlap_factor=$overlap --stats_per_interval=1 --max_bytes_for_level_base=$bpl --use_existing_db=0 --key_id_range=$kir`
Reviewers: andrewkr, sdong
Reviewed By: sdong
Subscribers: lgalanis, andrewkr, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D60651
Summary:
db_stress test is now failing because of this scenario
- run db_stress with merge_operator enabled (now we have a db with merge operands)
- run db_stress with merge_operator disabled (now when we fail to open the db)
the solution is to pass the merge_operator to the DB even if we are not going to do any merge operations
Test Plan: Check the failure
Reviewers: andrewkr, yiwu, sdong
Reviewed By: sdong
Subscribers: andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D61311
Summary: OnTableFileCreated() now is also called when the file creaion fails. In that case, we shouldn't assert the file size is not 0.
Test Plan: Run crash test
Reviewers: yiwu, andrewkr, IslamAbdelRahman
Reviewed By: IslamAbdelRahman
Subscribers: IslamAbdelRahman, leveldb, andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D61137
Summary: Extend the option memtable_prefix_bloom_huge_page_tlb_size from just putting memtable bloom filter to huge page to memtable itself too.
Test Plan: Run all existing tests.
Reviewers: IslamAbdelRahman, yhchiang, andrewkr
Reviewed By: andrewkr
Subscribers: leveldb, andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D60513
Summary:
- Added a new subcommand, 'ldb restore', that restores from backup
- Made backup_env_uri optional (also for 'ldb backup') because it can use db_env when backup_env isn't provided
Test Plan:
verify backup and restore commands work:
$ ./ldb backup --db=/data/users/andrewkr/rocksdb/db_bench-out/ --num_threads 1 --backup_dir=/data/users/andrewkr/rocksdb/db_bench-out/backup/
$ ./ldb restore --db=/data/users/andrewkr/rocksdb/db_bench-out/restore/ --num_threads 1 --backup_dir=/data/users/andrewkr/rocksdb/db_bench-out/backup/
Reviewers: sdong, wanning
Reviewed By: wanning
Subscribers: andrewkr, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D60849
Summary:
Tsan crash white-box test was disabled because it never ends. Re-enable it with reduced killing odds.
Add a parameter in crash test script to allow we pass it through an environment variable.
Test Plan: Run it manually.
Reviewers: IslamAbdelRahman
Reviewed By: IslamAbdelRahman
Subscribers: leveldb, andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D61053
getline on std::cin can be very inefficient when ldb is loading large values, with high CPU usage in libc _IO_(un)getc, this is because of the performance penalty that comes from synchronizing stdio and iostream buffers.
See the reproducers and tests in #1133 .
If an ifstream on /dev/stdin is used (when available) then using ldb to load large values can be much more efficient.
I thought for ldb load, that this approach is preferable to using <cstdio> or std::ios_base::sync_with_stdio(false).
I couldn't think of a use case where ldb load would need to support reading unbuffered input, an alternative approach would be to add support for passing --input_file=/dev/stdin.
I have a CLA in place, thanks.
The CI tests were failing at the time of https://github.com/facebook/rocksdb/pull/1156, so this change and PR will supersede it.
* Added new statistics and refactored to allow ioptions to be passed around as required to access environment and statistics pointers (and, as a convenient side effect, info_log pointer).
* Prevent incrementing compression counter when compression is turned off in options.
* Prevent incrementing compression counter when compression is turned off in options.
* Added two more supported compression types to test code in db_test.cc
* Prevent incrementing compression counter when compression is turned off in options.
* Added new StatsLevel that excludes compression timing.
* Fixed casting error in coding.h
* Fixed CompressionStatsTest for new StatsLevel.
* Removed unused variable that was breaking the Linux build
Summary: add backup support for ldb tool, and use it to run load test for backup on two HDFS envs
Test Plan: first generate some db, then compile against load test in fbcode, run load_test --db=<db path> backup --backup_env_uri=<URI of backup env> --backup_dir=<backup directory> --num_threads=<number of thread>
Reviewers: andrewkr
Reviewed By: andrewkr
Subscribers: andrewkr, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D60633
Summary: Add option write_buffer_manager to help users control total memory spent on memtables across multiple DB instances.
Test Plan: Add a new unit test.
Reviewers: yhchiang, IslamAbdelRahman
Reviewed By: IslamAbdelRahman
Subscribers: adela, benj, sumeet, muthu, leveldb, andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D59925
* Fixed Windows build error in CMakeLists.txt and perf_level error in db_bench_tool.cc
* Changed hard-coded perf levels in db_bench_tool.cc to enum values from perf_level.h
* Replaced remaining FLAGS_perf_level > 0
Summary:
This diff makes the following improvement in regression_test.sh:
1. Add NUM_OPS and DELETE_TEST_PATH to regression_test.sh:
* NUM_OPS: The number of operations that will be issued
in EACH thread.
Default: $NUM_KEYS / $NUM_THREADS
* DELETE_TEST_PATH: If true, then the test directory
will be deleted after the script ends.
Default: 0
2. Add more information in SUMMARY.csv
3. Fix a bug in regression_test.sh where each thread in fillseq will all issue $NUM_KEYS writes.
4. Add --deletes in db_bench, which allows us to control the number of deletes instead of must using FLAGS_num.
Test Plan: run regression test with and without DELETE_TEST_PATH and NUM_OPS
Reviewers: yiwu, sdong, IslamAbdelRahman, gunnarku
Reviewed By: gunnarku
Subscribers: andrewkr, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D60039
Summary: filter_deletes option was removed, remove it from crash_test to fix it
Test Plan: make crash_test
Reviewers: yhchiang, andrewkr, sdong
Reviewed By: sdong
Subscribers: andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D59901
Summary: filter_deltes is not a frequently used feature. Remove it.
Test Plan: Run all test suites.
Reviewers: igor, yhchiang, IslamAbdelRahman
Reviewed By: IslamAbdelRahman
Subscribers: leveldb, andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D59427
Summary:
The sandcastle setup doesn't provide a remote with our branches. Need
to fetch them directly from github.
Test Plan:
ran script on devserver and the relevant commands on a sandcastle
host.
Reviewers: IslamAbdelRahman, lightmark, kradhakrishnan, sdong
Reviewed By: sdong
Subscribers: andrewkr, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D59511
Summary:
memtable_prefix_bloom_probes is not a critical option. Remove it to reduce number of options.
It's easier for users to make mistakes with memtable_prefix_bloom_bits, turn it to memtable_prefix_bloom_bits_ratio
Test Plan: Run all existing tests
Reviewers: yhchiang, igor, IslamAbdelRahman
Reviewed By: IslamAbdelRahman
Subscribers: gunnarku, yoshinorim, MarkCallaghan, leveldb, andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D59199
Summary:
Added an option, --env_uri. When provided, it is used as an argument to
NewEnvFromUri(), which instantiates an Env based on it.
Test Plan:
built a simple binary that registers ChrootEnv for prefix "/", then
ran:
$ ./tmp --env_uri /tmp/ --db /abcde
/tmp/ is the chroot directory and /abcde is the db_name. Then I verified
db_bench uses /tmp/abcde
Reviewers: sdong, kradhakrishnan, lightmark
Reviewed By: lightmark
Subscribers: andrewkr, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D59325
Summary:
We had to go back and update the g++ path for 4.4.fb-4.8.fb. So the
path is now fixed on the branches, but can't be fixed on the tags since they're
immutable. By making format compatibility tests use branch names (when
available), backported fixes like this will be used without having to re-release.
Also removed v1.5.7 and v2.1 because make fails.
Test Plan:
$ build_tools/rocksdb-lego-determinator run_format_compatible
Reviewers: sdong, lightmark, IslamAbdelRahman
Reviewed By: IslamAbdelRahman
Subscribers: andrewkr, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D59355
Summary:
This patch does the following improvement for regression_test.sh
* Allow regression_test.sh to specify OPTIONS_FILE.
* Add header comments that includes examples on how to run the script
and introduce all configurable parameters.
* bug fix.
Test Plan: Run the example commands in the header comments of regression_test.sh
Reviewers: sdong, yiwu, gunnarku
Reviewed By: gunnarku
Subscribers: andrewkr, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D59175
Summary:
This patch allows db_bench to initialize it's RocksDB Options via a
options file, specified by the --options_file flag. Note that if
--options_file flag is set, then it has higher priority than the
command-line argument.
Test Plan: db_bench_tool_test
Reviewers: sdong, IslamAbdelRahman, kradhakrishnan, yiwu, andrewkr
Reviewed By: andrewkr
Subscribers: andrewkr, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D58533
Summary:
This patch does the following improvement on the regression_test.sh
* allows db_bench being executed at a remost host while storing the
benchmark results locally.
* kills all db_bench related processes before running db_bench
* better error handling.
Test Plan:
1. Run regression_test.sh both locally and remotely
2. Run multiple regression_test.sh at the same time and make sure
i. Only one runs successfully.
ii. The one that runs successfully will kill all other db_bench
processes before it runs any benchmark.
Reviewers: sdong, yiwu, gunnarku
Reviewed By: gunnarku
Subscribers: andrewkr, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D58611
Summary: The function wrapper for LDBCommand::SelectCommand is too long so that Windows build fails with warning "decorated name length exceeded, name was truncated". Shrink the length by using a struct.
Test Plan: Build on both of Linux and Windows and make sure the warning doesn't show in either platform.
Reviewers: andrewkr, adsharma, IslamAbdelRahman
Reviewed By: IslamAbdelRahman
Subscribers: leveldb, andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D58965
Summary:
A couple of notes from the diff:
- The namespace block I added at the top of table_properties_collector.cc was in reaction to an issue i was having with PutVarint64 and reusing the "val" string. I'm not sure this is the cleanest way of doing this, but abstracting this out at least results in the correct behavior.
- I chose "rocksdb.merge.operands" as the property name. I am open to suggestions for better names.
- The change to sst_dump_tool.cc seems a bit inelegant to me. Is there a better way to do the if-else block?
Test Plan:
I added a test case in table_properties_collector_test.cc. It adds two merge operands and checks to make sure that both of them are reflected by GetMergeOperands. It also checks to make sure the wasPropertyPresent bool is properly set in the method.
Running both of these tests should pass:
./table_properties_collector_test
./sst_dump_test
Reviewers: IslamAbdelRahman
Reviewed By: IslamAbdelRahman
Subscribers: andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D58119
Summary:
Added a new abstraction to cache page to RocksDB designed for the read
cache use.
RocksDB current block cache is more of an object cache. For the persistent read cache
project, what we need is a page cache equivalent. This changes adds a cache
abstraction to RocksDB to cache pages called PersistentCache. PersistentCache can cache
uncompressed pages or raw pages (content as in filesystem). The user can
choose to operate PersistentCache either in COMPRESSED or UNCOMPRESSED mode.
Blame Rev:
Test Plan: Run unit tests
Reviewers: sdong
Subscribers: andrewkr, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D55707
Summary:
So a customized ldb tool can pass it's own Selector.
Such a selector is expected to call LDBCommand::SelectCommand
and then add some of its own customized commands
Test Plan: make ldb
Reviewers: sdong, IslamAbdelRahman
Reviewed By: IslamAbdelRahman
Subscribers: andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D57249
Summary:
The implementation remains where it is. Only the
header is exported. This is so that a customized
ldb tool can print help along with its own
extra commands
Test Plan: make ldb
Reviewers: sdong, IslamAbdelRahman
Reviewed By: IslamAbdelRahman
Subscribers: andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D57255
Summary: Adds three new WriteBatch data types: Prepare(xid), Commit(xid), Rollback(xid). Prepare(xid) should precede the (single) operation to which is applies. There can obviously be multiple Prepare(xid) markers. There should only be one Rollback(xid) or Commit(xid) marker yet not both. None of this logic is currently enforced and will most likely be implemented further up such as in the memtableinserter. All three markers are similar to PutLogData in that they are writebatch meta-data, ie stored but not counted. All three markers differ from PutLogData in that they will actually be written to disk. As for WriteBatchWithIndex, Prepare, Commit, Rollback are all implemented just as PutLogData and none are tested just as PutLogData.
Test Plan: single unit test in write_batch_test.
Reviewers: hermanlee4, sdong, anthony
Subscribers: leveldb, dhruba, vasilep, andrewkr
Differential Revision: https://reviews.facebook.net/D57867
Summary:
This diff includes an initial script running a set of benchmarks for
regression test. The script does the following things:
checkout the specified rocksdb commit (or origin/master as default)
make clean && DEBUG_LEVEL=0 make db_bench
setup test directories
run set of benchmarks and store results
Currently, the script will run couple benchmarks, store all the benchmark
output, extract micros per op and percentile information for each benchmark
and store them in a single SUMMARY.csv file. The SUMMARY.csv will make the
follow-up regression detection easier.
In addition, the current script only takes env arguments to set important
attributes of db_bench. Will follow-up with a patch that allows db_bench
to construct options from an options file.
Test Plan:
NUM_KEYS=100 ./tools/regression_test.sh
Sample SUMMARY.csv file:
commit id, benchmark, ms-per-op, p50, p75, p99, p99.9, p99.99
7e23ddf575890510e7d2fc7a79b31a1bbf317917, fillseq, 15.28, 54.66, 77.14, 5000.00, 17900.00, 18483.00
7e23ddf575890510e7d2fc7a79b31a1bbf317917, overwrite, 13.54, 57.69, 86.39, 3000.00, 15600.00, 17013.00
7e23ddf575890510e7d2fc7a79b31a1bbf317917, readrandom, 1.04, 0.80, 1.67, 293.33, 395.00, 504.00
7e23ddf575890510e7d2fc7a79b31a1bbf317917, readwhilewriting, 2.75, 1.01, 1.87, 200.00, 460.00, 485.00
7e23ddf575890510e7d2fc7a79b31a1bbf317917, deleterandom, 3.64, 48.12, 70.09, 200.00, 336.67, 347.00
7e23ddf575890510e7d2fc7a79b31a1bbf317917, seekrandom, 24.31, 391.87, 513.69, 872.73, 990.00, 1048.00
7e23ddf575890510e7d2fc7a79b31a1bbf317917, seekrandomwhilewriting, 14.02, 185.14, 294.15, 700.00, 1440.00, 1527.00
Reviewers: sdong, IslamAbdelRahman, kradhakrishnan, yiwu, andrewkr, gunnarku
Reviewed By: gunnarku
Subscribers: gunnarku, MarkCallaghan, andrewkr, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D57597
Summary:
This is needed so that rocksdb users can add more
commands to the included ldb tool by adding more custom
commands.
Test Plan: make -j ldb
Reviewers: sdong, IslamAbdelRahman
Reviewed By: IslamAbdelRahman
Subscribers: andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D57243
Summary:
crash_test grep for 'fail' string in the output and if found it consider that we failed.
Update the output to use something else
Test Plan: make crash_test (still running)
Reviewers: yhchiang, sdong, andrewkr
Reviewed By: andrewkr
Subscribers: andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D57381
Summary:
This adds a new metablock containing a shared dictionary that is used
to compress all data blocks in the SST file. The size of the shared dictionary
is configurable in CompressionOptions and defaults to 0. It's currently only
used for zlib/lz4/lz4hc, but the block will be stored in the SST regardless of
the compression type if the user chooses a nonzero dictionary size.
During compaction, computes the dictionary by randomly sampling the first
output file in each subcompaction. It pre-computes the intervals to sample
by assuming the output file will have the maximum allowable length. In case
the file is smaller, some of the pre-computed sampling intervals can be beyond
end-of-file, in which case we skip over those samples and the dictionary will
be a bit smaller. After the dictionary is generated using the first file in a
subcompaction, it is loaded into the compression library before writing each
block in each subsequent file of that subcompaction.
On the read path, gets the dictionary from the metablock, if it exists. Then,
loads that dictionary into the compression library before reading each block.
Test Plan: new unit test
Reviewers: yhchiang, IslamAbdelRahman, cyan, sdong
Reviewed By: sdong
Subscribers: andrewkr, yoshinorim, kradhakrishnan, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D52287
Summary:
As db_stress with CompactFiles possibly catches a previous bug currently,
temporarily disable CompactFiles in db_stress in its default setting
to allows new bug to be detected while investigating the bug in CompactFiles.
Test Plan: crash test
Reviewers: sdong, kradhakrishnan, IslamAbdelRahman
Reviewed By: IslamAbdelRahman
Subscribers: andrewkr, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D57333
Summary:
Introduced option to dump malloc statistics using new option flag.
Added new command line option to db_bench tool to enable this
funtionality.
Also extended build to support environments with/without jemalloc.
Test Plan:
1) Build rocksdb using `make` command. Launch the following command
`./db_bench --benchmarks=fillrandom --dump_malloc_stats=true
--num=10000000` end verified that jemalloc dump is present in LOG file.
2) Build rocksdb using `DISABLE_JEMALLOC=1 make db_bench -j32` and ran
the same db_bench tool and found the following message in LOG file:
"Please compile with jemalloc to enable malloc dump".
3) Also built rocksdb using `make` command on MacOS to verify behavior
in non-FB environment.
Also to debug build configuration change temporary changed
AM_DEFAULT_VERBOSITY = 1 in Makefile to see compiler and build
tools output. For case 1) -DROCKSDB_JEMALLOC was present in compiler
command line. For both 2) and 3) this flag was not present.
Reviewers: sdong
Reviewed By: sdong
Subscribers: andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D57321
Comparable with Snappy on comp ratio.
Implemented using Windows API, does not require external package.
Avaiable since Windows 8 and server 2012.
Use -DXPRESS=1 with CMake to enable.
Summary:
Enable testing CompactFiles in db_stress by adding flag test_compact_files
to db_stress.
Test Plan:
./db_stress --test_compact_files=1 --compaction_style=0 --allow_concurrent_memtable_write=false --ops_per_thread=100000
./db_stress --test_compact_files=1 --compaction_style=1 --allow_concurrent_memtable_write=false --ops_per_thread=100000
Sample output (note that it's normal to have some CompactFiles() failed):
Stress Test : 491.891 micros/op 65054 ops/sec
: Wrote 21.98 MB (0.45 MB/sec) (45% of 3200352 ops)
: Wrote 1440728 times
: Deleted 441616 times
: Single deleted 38181 times
: 319251 read and 19025 found the key
: Prefix scanned 640520 times
: Iterator size sum is 9691415
: Iterated 319704 times
: Got errors 0 times
: 1323 CompactFiles() succeed
: 32 CompactFiles() failed
2016/04/11-15:50:58 Verification successful
Reviewers: sdong, IslamAbdelRahman, kradhakrishnan, yiwu, andrewkr
Reviewed By: andrewkr
Subscribers: andrewkr, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D56565
Summary:
This is taken from the "Write(GB)" column in compaction stats, so the
units should be GB, not MB.
Test Plan: none
Reviewers: sdong, yhchiang, IslamAbdelRahman
Reviewed By: IslamAbdelRahman
Subscribers: leveldb, andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D56889
Summary: It is useful to print out IO stats in flush jobs too. Extend options.compaction_measure_io_stats to flush jobs and raname it.
Test Plan: Try db_bench and see the stats are printed out.
Reviewers: yhchiang
Reviewed By: yhchiang
Subscribers: kradhakrishnan, yiwu, IslamAbdelRahman, leveldb, andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D56769
* Update db_bench_tool.cc
I fixed the wrong letters, LevelDB -> rocksDB, because I thought of LevelDB as the wrong presentation.
the following show my fix :
fprintf(stderr, "LevelDB: version %d.%d\n",
kMajorVersion, kMinorVersion);
----------------->
fprintf(stderr, "rocksDB: version %d.%d\n",
kMajorVersion, kMinorVersion);
* Update db_bench_tool.cc
* Update db_bench_tool.cc
Summary:
strings.h header does not exist on Windows. So, we can try another way
to compare strings ignoring case.
Test Plan:
built and ran:
$ ./ldb_cmd_test
Reviewers: sdong, yhchiang, IslamAbdelRahman
Reviewed By: IslamAbdelRahman
Subscribers: andrewkr, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D56535
Summary:
Current Message
```
sst_dump [--command=check|scan|none|raw] [--verify_checksum] --file=data_dir_OR_sst_file [--output_hex] [--input_key_hex] [--from=<user_key>] [--to=<user_key>] [--read_num=NUM] [--show_properties] [--show_compression_sizes] [--show_compression_sizes [--set_block_size=<block_size>]]
```
New message
```
sst_dump --file=<data_dir_OR_sst_file> [--command=check|scan|raw]
--file=<data_dir_OR_sst_file>
Path to SST file or directory containing SST files
--command=check|scan|raw
check: Iterate over entries in files but dont print anything except if an error is encounterd (default command)
scan: Iterate over entries in files and print them to screen
raw: Dump all the table contents to <file_name>_dump.txt
--output_hex
Can be combined with scan command to print the keys and values in Hex
--from=<user_key>
Key to start reading from when executing check|scan
--to=<user_key>
Key to stop reading at when executing check|scan
--read_num=<num>
Maximum number of entries to read when executing check|scan
--verify_checksum
Verify file checksum when executing check|scan
--input_key_hex
Can be combined with --from and --to to indicate that these values are encoded in Hex
--show_properties
Print table properties after iterating over the file
--show_compression_sizes
Independent command that will recreate the SST file using 16K block size with different
compressions and report the size of the file using such compression
--set_block_size=<block_size>
Can be combined with --show_compression_sizes to set the block size that will be used
when trying different compression algorithms
```
Test Plan: none
Reviewers: yhchiang, andrewkr, kradhakrishnan, yiwu, sdong
Reviewed By: sdong
Subscribers: andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D56325
Summary:
Added the column family name to the properties block. This property
is omitted only if the property is unavailable, such as when RepairDB()
writes SST files.
In a next diff, I will change RepairDB to use this new property for
deciding to which column family an existing SST file belongs. If this
property is missing, it will add it to the "unknown" column family (same
as its existing behavior).
Test Plan:
New unit test:
$ ./db_table_properties_test --gtest_filter=DBTablePropertiesTest.GetColumnFamilyNameProperty
Reviewers: IslamAbdelRahman, yhchiang, sdong
Reviewed By: sdong
Subscribers: andrewkr, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D55605
Summary:
Adapted a stderr logger from the option tests. Moved it to a separate
header so we can reuse it, e.g., from ldb subcommands for faster debugging. This
is especially useful to make errors/warnings more visible when running
"ldb repair", which involves potential data loss.
Test Plan:
ran options_test and "ldb repair"
$ ./ldb repair --db=./tmp/
[WARN] **** Repaired rocksdb ./tmp/; recovered 1 files; 588bytes. Some data may have been lost. ****
OK
Reviewers: IslamAbdelRahman, yhchiang, sdong
Reviewed By: sdong
Subscribers: andrewkr, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D56151
Summary:
When a block based table file is opened, if prefetch_index_and_filter is true, it will prefetch the index and filter blocks, putting them into the block cache.
What this feature adds: when a L0 block based table file is opened, if pin_l0_filter_and_index_blocks_in_cache is true in the options (and prefetch_index_and_filter is true), then the filter and index blocks aren't released back to the block cache at the end of BlockBasedTableReader::Open(). Instead the table reader takes ownership of them, hence pinning them, ie. the LRU cache will never push them out. Meanwhile in the table reader, further accesses will not hit the block cache, thus avoiding lock contention.
Test Plan:
'export TEST_TMPDIR=/dev/shm/ && DISABLE_JEMALLOC=1 OPT=-g make all valgrind_check -j32' is OK.
I didn't run the Java tests, I don't have Java set up on my devserver.
Reviewers: sdong
Reviewed By: sdong
Subscribers: andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D56133
Summary:
Undefined Behavior Sanitizer (ubsan) is //a good thing// which will help us to find sneaky bugs with low cost. Please see http://developerblog.redhat.com/2014/10/16/gcc-undefined-behavior-sanitizer-ubsan/ for more details and official GCC documentation for more context: https://gcc.gnu.org/onlinedocs/gcc/Instrumentation-Options.html.
Changes itself are quite simple and pretty much imitating whatever is implemented for ASan.
Hooking the UBsan validation build to Sandcastle is a separate step and will be dealt as separate diff because code is in internal repository.
Test Plan: Make sure that that there no regressions when it comes to builds and test pass rate.
Reviewers: leveldb, sdong, IslamAbdelRahman, yhchiang
Reviewed By: yhchiang
Subscribers: andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D56049
Summary:
Expose the inverse of ToString(hex=true) on Slice: Slice::DecodeHex
Refactor the other implementation of to/from hex in ldb_cmd.h to use the Slice
version
(Difference between the 2 is whether 0x is expected/produced in front of the hex
string or not)
Eliminated support for invalid odd length hex string - this is now invalid
instead of having 1/2 byte set
Added (inverse of HexToString) test for LDBCommand::StringToHex which also
indirectly tests Slice::ToString(true)
After moving the original implementation from ldb_cmd.h, updated it to much simpler/efficient version
(originally/inspired from https://github.com/facebook/wdt/blob/master/util/EncryptionUtils.cpp#L140-L169 )
Test Plan: run tests
Reviewers: uddipta, sdong
Reviewed By: sdong
Subscribers: andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D56121
record.size() should not be less than 12.
This "magic number" seems to be the WriteBatch header (8 byte sequence
and 4 byte count). Replaced all the places where "12" was used
by WriteBatchInternal::kHeader.
Summary: Test seems to fail if we don't use consistent version between testing forward and backward compatibility.
Test Plan: Run the script (with some version removed manually to make it shorter)
Reviewers: IslamAbdelRahman
Reviewed By: IslamAbdelRahman
Subscribers: leveldb, andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D55773
Summary: After introducing a less forward-compatible change, update the backward compatible checking tool.
Test Plan: Run it.
Reviewers: IslamAbdelRahman
Reviewed By: IslamAbdelRahman
Subscribers: IslamAbdelRahman, leveldb, andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D55695
Summary:
When a block based table file is opened, if prefetch_index_and_filter is true, it will prefetch the index and filter blocks, putting them into the block cache.
What this feature adds: when a L0 block based table file is opened, if pin_l0_filter_and_index_blocks_in_cache is true in the options (and prefetch_index_and_filter is true), then the filter and index blocks aren't released back to the block cache at the end of BlockBasedTableReader::Open(). Instead the table reader takes ownership of them, hence pinning them, ie. the LRU cache will never push them out. Meanwhile in the table reader, further accesses will not hit the block cache, thus avoiding lock contention.
When the table reader is destroyed, it releases the pinned blocks (if there were any). This has to happen before the cache is destroyed, so I had to introduce a TableReader::Close(), to guarantee the order of destruction.
Test Plan:
Added two unit tests for this. Existing unit tests run fine (default is pin_l0_filter_and_index_blocks_in_cache=false).
DISABLE_JEMALLOC=1 OPT=-g make all valgrind_check -j32
Mac: OK.
Linux: with D55287 patched in it's OK.
Reviewers: sdong
Reviewed By: sdong
Subscribers: andrewkr, leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D54801
Summary: For compactions benchmarks (both level and universal) we'll use `--max_write_buffer_number=4`. For all the other benchmarks which don't customize the value of `--max_background_flushes` we'll continue using `--max_write_buffer_number=8`.
Test Plan:
To validate basic correctness and command-line options:
```
cd ~/rocksdb
NKEYS=10000000 ./tools/run_flash_bench.sh
```
Reviewers: MarkCallaghan
Reviewed By: MarkCallaghan
Subscribers: andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D55497
Summary: Set of updates to the subcompaction benchmark script which are based on our internal discussions. The intent behind the changes is to make sure that the scripts will correctly reflect how we're doing the actual benchmarking.
Test Plan: Tested by exercising the full set of compaction benchmarks and validating the execution and consistency of results.
Reviewers: MarkCallaghan, sdong, yhchiang
Reviewed By: yhchiang
Subscribers: andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D55461
Summary: This will make it easier for admins and devs to use RepairDB.
Test Plan:
Tried deleting the manifest and verified it recovers:
$ ldb --create_if_missing --db=/tmp/test_db put ok ok
$ rm -f /tmp/test_db/MANIFEST-000001
$ ./ldb --db=/tmp/test_db repair
$ ldb --db=/tmp/test_db get ok
ok
Reviewers: yhchiang, sdong, IslamAbdelRahman
Reviewed By: IslamAbdelRahman
Subscribers: leveldb, andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D55359
Summary: Refactored db_bench transaction stress tests so that they can be called from unit tests as well.
Test Plan: run new unit test as well as db_bench
Reviewers: yhchiang, IslamAbdelRahman, sdong
Reviewed By: IslamAbdelRahman
Subscribers: andrewkr, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D55203
Summary: After closely working with Mark, Siying, and Yueh-Hsuan this set of changes reflects the updates needed to measure RocksDB subcompaction performance in a correct manner. The essence of the benchmark is executing `fillrandom` followed by `compact` with the correct set of options for various number of subcompactions specified.
Test Plan: Tested internally to verify correctness and reliability.
Reviewers: sdong, yhchiang, MarkCallaghan
Reviewed By: MarkCallaghan
Subscribers: dhruba
Differential Revision: https://reviews.facebook.net/D55089
Test Plan:
Built and ran with gflags:
% ./db_bench
LevelDB: version 4.5
Date: Tue Feb 16 12:04:23 2016
CPU: 40 * Intel(R) Xeon(R) CPU E5-2660 v2 @ 2.20GHz
...
And without gflags:
% ./db_bench
Please install gflags to run rocksdb tools
%
Reviewers: sdong, igor
Reviewed By: igor
Subscribers: igor, dhruba
Differential Revision: https://reviews.facebook.net/D54243
Summary:
Implement a benchmark for universal compaction based on the feature description (see below), in-person discussions, and reading source code:
https://github.com/facebook/rocksdb/wiki/RocksDB-Tuning-Guidehttps://github.com/facebook/rocksdb/wiki/Universal-Compactionhttps://github.com/facebook/rocksdb/wiki/RocksDB-Tuning-Guide#universal-compaction
Universal compaction benchmark is based on `overwrite` benchmark, adding compaction specific options to it, and executing it for different values of subcompaction to understand the impact of scaling out subcompactions for a particular scenario.
Test Plan:
- Execute the benchmark on various machines for multiple iterations to verify the reliability.
- Observe the output to make sure that compaction is taking place.
- Observe the execution to make sure that arguments passed to `db_bench` are correct.
Reviewers: sdong, MarkCallaghan
Reviewed By: MarkCallaghan
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D54045
Summary: Default crash test uses prefix hash memtable, which is not compatible to concurrent memtable. Allow prefix test run with skip list and use skip list memtable when concurrent insert is used.
Test Plan: Run "python -u tools/db_crashtest.py whitebox" and watch sometimes skip list is used.
Reviewers: anthony, yhchiang, kradhakrishnan, andrewkr, IslamAbdelRahman
Reviewed By: IslamAbdelRahman
Subscribers: leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D53907
Summary: If users turn on concurrent insert but the memtable doesn't support it, they might see unexcepted crash. Fix it by explicitly fail.
Test Plan:
Run different setting of stress_test and make sure it fails correctly.
Will add a unit test too.
Reviewers: anthony, kradhakrishnan, IslamAbdelRahman, yhchiang, andrewkr, ngbronson
Reviewed By: ngbronson
Subscribers: leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D53895
Summary: This set of changes is part of the work to introduce benchmark for universal style compaction in RocksDB. It's conceptually separate from the compaction work, so sending it out as a separate diff to get it out of the way.
Test Plan:
- Run `./tools/run_flash_bench.sh`.
- Look at the contents of `report.txt` and `report2.txt` to make sure that data is reported and attributed correctly.
- During `db_bench` execution time make sure that the correct flags are passed to `--disable_wal` depending on the benchmark being executed.
Reviewers: MarkCallaghan
Reviewed By: MarkCallaghan
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D53865
Summary: Add an option of --allow_concurrent_memtable_write in stress test and cover it in crash test
Test Plan: Run crash test and make sure three combinations of the two options show up randomly.
Reviewers: IslamAbdelRahman, yhchiang, andrewkr, anthony, kradhakrishnan
Reviewed By: kradhakrishnan
Subscribers: leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D53811
Summary:
Some of the tests aren't considered to be critical when it comes to getting key benchmarking data for RocksDB. Therefore we'll introduce an environment variable `SKIP_LOW_PRI_TESTS` which enables skipping those test cases. By default all the tests will be run. If you want to optimize the test-case execution then do the following:
`
$ export SKIP_LOW_PRI_TESTS=1
$ ./tools/run_flash_bench.sh
`
Test Plan: Verified that when `SKIP_LOW_PRI_TESTS` is not set then `benchmark.sh` is called for all the scenarios and when `SKIP_LOW_PRI_TESTS` is set to `1` then `benchmark.sh` is called only for the test-cases which are critical.
Reviewers: MarkCallaghan
Reviewed By: MarkCallaghan
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D53739
Summary:
Add an option --column_family option, so that users can query or update specific column family.
Also add an create column family parameter to make unit test easier.
Still need to add unit tests.
Test Plan: Will add a test case in ldb python test.
Reviewers: yhchiang, rven, andrewkr, IslamAbdelRahman, kradhakrishnan, anthony
Reviewed By: anthony
Subscribers: leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D53265
Summary:
In crash test, when coming to each kill point, we start a random class using seed as current second. With this approach, for every second, the random number used is the same. However, in each second, there are multiple kill points with different frequency. It makes it hard to reason about chance of kill point to trigger. With this commit, we use thread local random seed to generate the random number, so that it will take different values per second, hoping it makes chances of killing much easier to reason about.
Also significantly reduce the kill odd to make sure time before kiling is similar as before.
Test Plan: Run white box crash test and see the killing happens as expected and the run time time before killing reasonable.
Reviewers: kradhakrishnan, IslamAbdelRahman, rven, yhchiang, andrewkr, anthony
Reviewed By: anthony
Subscribers: leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D52971
Summary:
util/testutil.h doesn't seem to be used in tools/sst_dump_tool_imp.h. Remove it.
Also move some other include to tools/sst_dump_tool.cc instead.
Test Plan: Build with GCC, CLANG and with GCC 4.81 and 4.9.
Reviewers: yuslepukhin, yhchiang, rven, anthony, IslamAbdelRahman
Reviewed By: IslamAbdelRahman
Subscribers: leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D52791
Summary:
This set of changes implements the following design: `ldb` will utilize `--path` parameter which can be used to specify a file name. Tool will then apply some heuristic to determine how to output the data properly. The design decision is not to probe the file content, but use file names to determine what dumping function to call.
Usage examples:
Understands that path points to a manifest file and dumps it.
`./ldb --path=/tmp/test_db/MANIFEST-000023 dump`
Understands that path points to a WAL file and dumps it.
`./ldb --path=/tmp/test_db/000024.log dump --header`
Understands that path points to a SST file and dumps it.
`./ldb --path=/tmp/test_db/000007.sst dump`
Figures out that none of the supported file types are applicable and outputs
an appropriate error message.
`./ldb --path=/tmp/cron.log dump`
Test Plan:
Basics:
git diff
make clean
make -j 32 commit-prereq
arc lint
More specific testing (done as part of commit-prereq, but can be iterated separately when making isolated changes):
make clean
make ldb
python tools/ldb_test.py
make rocksdb_dump
make rocksdb_undump
sh tools/rocksdb_dump_test.sh
Reviewers: rven, IslamAbdelRahman, yhchiang, kradhakrishnan, anthony, igor, sdong
Reviewed By: sdong
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D52269
Summary:
1) changes tools/{benchmark,run_flash_bench}.sh to optionally use the write rate limit
2) removes code for --writes_per_second and switches the 'background' write rate limit
to use --benchmark_write_rate_limit
Replaces https://reviews.facebook.net/D49113
Task ID: #9555881
Blame Rev:
Test Plan:
tools/run_flash_bench.sh
Revert Plan:
Database Impact:
Memcache Impact:
Other Notes:
EImportant:
- begin *PUBLIC* platform impact section -
Bugzilla: #
- end platform impact -
Reviewers: igor
Reviewed By: igor
Subscribers: dhruba
Differential Revision: https://reviews.facebook.net/D52485
Summary:
When Get() or NewIterator() trigger file loads, skip caching the filter block if
(1) optimize_filters_for_hits is set and (2) the file is on the bottommost
level. Also skip checking filters under the same conditions, which means that
for a preloaded file or a file that was trivially-moved to the bottom level, its
filter block will eventually expire from the cache.
- added parameters/instance variables in various places in order to propagate the config ("skip_filters") from version_set to block_based_table_reader
- in BlockBasedTable::Rep, this optimization prevents filter from being loaded when the file is opened simply by setting filter_policy = nullptr
- in BlockBasedTable::Get/BlockBasedTable::NewIterator, this optimization prevents filter from being used (even if it was loaded already) by setting filter = nullptr
Test Plan:
updated unit test:
$ ./db_test --gtest_filter=DBTest.OptimizeFiltersForHits
will also run 'make check'
Reviewers: sdong, igor, paultuckfield, anthony, rven, kradhakrishnan, IslamAbdelRahman, yhchiang
Reviewed By: yhchiang
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D51633