Commit Graph

9155 Commits

Author SHA1 Message Date
Peter Dillinger
b810e62b39 Clarifying comments in db.h (#6768)
Summary:
And fix a confusingly worded log message
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6768

Reviewed By: anand1976

Differential Revision: D21284527

Pulled By: pdillinger

fbshipit-source-id: f03c1422c229a901c3a65e524740452349626164
2020-04-28 15:26:03 -07:00
Peter Dillinger
bae6f58696 Basic MultiGet support for partitioned filters (#6757)
Summary:
In MultiGet, access each applicable filter partition only once
per batch, rather than for each applicable key. Also,

* Fix Bloom stats for MultiGet
* Fix/refactor MultiGetContext::Range::KeysLeft, including
* Add efficient BitsSetToOne implementation
* Assert that MultiGetContext::Range does not go beyond shift range

Performance test: Generate db:

    $ ./db_bench --benchmarks=fillrandom --num=15000000 --cache_index_and_filter_blocks -bloom_bits=10 -partition_index_and_filters=true
    ...

Before (middle performing run of three; note some missing Bloom stats):

    $ ./db_bench --use-existing-db --benchmarks=multireadrandom --num=15000000 --cache_index_and_filter_blocks --bloom_bits=10 --threads=16 --cache_size=20000000 -partition_index_and_filters -batch_size=32 -multiread_batched -statistics --duration=20 2>&1 | egrep 'micros/op|block.cache.filter.hit|bloom.filter.(full|use)|number.multiget'
    multireadrandom :      26.403 micros/op 597517 ops/sec; (548427 of 671968 found)
    rocksdb.block.cache.filter.hit COUNT : 83443275
    rocksdb.bloom.filter.useful COUNT : 0
    rocksdb.bloom.filter.full.positive COUNT : 0
    rocksdb.bloom.filter.full.true.positive COUNT : 7931450
    rocksdb.number.multiget.get COUNT : 385984
    rocksdb.number.multiget.keys.read COUNT : 12351488
    rocksdb.number.multiget.bytes.read COUNT : 793145000
    rocksdb.number.multiget.keys.found COUNT : 7931450

After (middle performing run of three):

    $ ./db_bench_new --use-existing-db --benchmarks=multireadrandom --num=15000000 --cache_index_and_filter_blocks --bloom_bits=10 --threads=16 --cache_size=20000000 -partition_index_and_filters -batch_size=32 -multiread_batched -statistics --duration=20 2>&1 | egrep 'micros/op|block.cache.filter.hit|bloom.filter.(full|use)|number.multiget'
    multireadrandom :      21.024 micros/op 752963 ops/sec; (705188 of 863968 found)
    rocksdb.block.cache.filter.hit COUNT : 49856682
    rocksdb.bloom.filter.useful COUNT : 45684579
    rocksdb.bloom.filter.full.positive COUNT : 10395458
    rocksdb.bloom.filter.full.true.positive COUNT : 9908456
    rocksdb.number.multiget.get COUNT : 481984
    rocksdb.number.multiget.keys.read COUNT : 15423488
    rocksdb.number.multiget.bytes.read COUNT : 990845600
    rocksdb.number.multiget.keys.found COUNT : 9908456

So that's about 25% higher throughput even for random keys
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6757

Test Plan: unit test included

Reviewed By: anand1976

Differential Revision: D21243256

Pulled By: pdillinger

fbshipit-source-id: 5644a1468d9e8c8575be02f4e04bc5d62dbbb57f
2020-04-28 14:49:34 -07:00
Peter Dillinger
a7f0b27b39 HISTORY.md update for bzip upgrade (#6767)
Summary:
See https://github.com/facebook/rocksdb/issues/6714 and https://github.com/facebook/rocksdb/issues/6703
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6767

Reviewed By: riversand963

Differential Revision: D21283307

Pulled By: pdillinger

fbshipit-source-id: 8463bec725669d13846c728ad4b5bde43f9a84f8
2020-04-28 12:29:31 -07:00
Peter Dillinger
4574d7513d Update HISTORY.md for block cache redundant adds (#6764)
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/6764

Reviewed By: ltamasi

Differential Revision: D21267108

Pulled By: pdillinger

fbshipit-source-id: a3dfe2dbe4e8f6309a53eb72903ef58d52308f97
2020-04-28 08:26:43 -07:00
Yanqin Jin
d4398e08fc Fix timestamp support for MultiGet (#6748)
Summary:
1. Avoid nullptr dereference when passing timestamp to KeyContext creation.
2. Construct LookupKey correctly with timestamp when creating MultiGetContext.
3. Compare without timestamp when sorting KeyContexts.

Fixes https://github.com/facebook/rocksdb/issues/6745

Test plan (dev server):
make check
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6748

Reviewed By: pdillinger

Differential Revision: D21258691

Pulled By: riversand963

fbshipit-source-id: 44e65b759c18b9986947783edf03be4f890bb004
2020-04-27 22:49:56 -07:00
Cheng Chang
4cd859edf1 Fix build under LITE (#6758)
Summary:
GetSupportedCompressions needs to be defined under LITE.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6758

Test Plan: build under LITE

Reviewed By: zhichao-cao

Differential Revision: D21247937

Pulled By: cheng-chang

fbshipit-source-id: 880e59d3e107cdd736d16427a68c5641d1318fb4
2020-04-27 16:55:14 -07:00
Levi Tamasi
bea91d5d61 Destroy any ColumnFamilyHandles in BlobDB::Open upon error (#6763)
Summary:
If an error happens during BlobDBImpl::Open after the base DB has been
opened, we need to destroy the `ColumnFamilyHandle`s returned by `DB::Open`
to prevent an assertion in `ColumnFamilySet`'s destructor from being hit.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6763

Test Plan: Ran `make check` and tested using the BlobDB mode of `db_bench`.

Reviewed By: riversand963

Differential Revision: D21262643

Pulled By: ltamasi

fbshipit-source-id: 60ebc7ab19be66cf37fbe5f6d8957d58470f3d3b
2020-04-27 16:45:13 -07:00
Albert Hse-Lin Chen
cc8d16efd6 Fixed minor typo in comment for MergeOperator::FullMergeV2() (#6759)
Summary:
Fixed minor typo in comment for FullMergeV2().
Last operand up to snapshot should be +4 instead of +3.

Signed-off-by: Albert Hse-Lin Chen <hselin@kalista.io>
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6759

Reviewed By: cheng-chang

Differential Revision: D21260295

Pulled By: zhichao-cao

fbshipit-source-id: cc942306f246c8606538feb30bfdf6df9fb6c54e
2020-04-27 14:44:47 -07:00
Peter Dillinger
249eff0f30 Stats for redundant insertions into block cache (#6681)
Summary:
Since read threads do not coordinate on loading data into block
cache, two threads between Lookup and Insert can end up loading and
inserting the same data. This is particularly concerning with
cache_index_and_filter_blocks since those are hot and more likely to
be race targets if ejected from (or not pre-populated in) the cache.

Particularly with moves toward disaggregated / network storage, the cost
of redundant retrieval might be high, and we should at least have some
hard statistics from which we can estimate impact.

Example with full filter thrashing "cliff":

    $ ./db_bench --benchmarks=fillrandom --num=15000000 --cache_index_and_filter_blocks -bloom_bits=10
    ...
    $ ./db_bench --db=/tmp/rocksdbtest-172704/dbbench --use_existing_db --benchmarks=readrandom,stats --num=200000 --cache_index_and_filter_blocks --cache_size=$((130 * 1024 * 1024)) --bloom_bits=10 --threads=16 -statistics 2>&1 | egrep '^rocksdb.block.cache.(.*add|.*redundant)' | grep -v compress | sort
    rocksdb.block.cache.add COUNT : 14181
    rocksdb.block.cache.add.failures COUNT : 0
    rocksdb.block.cache.add.redundant COUNT : 476
    rocksdb.block.cache.data.add COUNT : 12749
    rocksdb.block.cache.data.add.redundant COUNT : 18
    rocksdb.block.cache.filter.add COUNT : 1003
    rocksdb.block.cache.filter.add.redundant COUNT : 217
    rocksdb.block.cache.index.add COUNT : 429
    rocksdb.block.cache.index.add.redundant COUNT : 241
    $ ./db_bench --db=/tmp/rocksdbtest-172704/dbbench --use_existing_db --benchmarks=readrandom,stats --num=200000 --cache_index_and_filter_blocks --cache_size=$((120 * 1024 * 1024)) --bloom_bits=10 --threads=16 -statistics 2>&1 | egrep '^rocksdb.block.cache.(.*add|.*redundant)' | grep -v compress | sort
    rocksdb.block.cache.add COUNT : 1182223
    rocksdb.block.cache.add.failures COUNT : 0
    rocksdb.block.cache.add.redundant COUNT : 302728
    rocksdb.block.cache.data.add COUNT : 31425
    rocksdb.block.cache.data.add.redundant COUNT : 12
    rocksdb.block.cache.filter.add COUNT : 795455
    rocksdb.block.cache.filter.add.redundant COUNT : 130238
    rocksdb.block.cache.index.add COUNT : 355343
    rocksdb.block.cache.index.add.redundant COUNT : 172478
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6681

Test Plan: Some manual testing (above) and unit test covering key metrics is included

Reviewed By: ltamasi

Differential Revision: D21134113

Pulled By: pdillinger

fbshipit-source-id: c11497b5f00f4ffdfe919823904e52d0a1a91d87
2020-04-27 13:20:27 -07:00
Akanksha Mahajan
75b13ea94a Allow sst_dump to check size of different compression levels and report time (#6634)
Summary:
Summary : 1. Add two arguments --compression_level_from and --compression_level_to to check
	  the compression size with different compression level in the given range. Users must
          specify one compression type else it will error out. Both from and to levels must
	  also be specified together.
	  2. Display the time taken to compress each file with different compressions by default.

Test Plan : make -j64 check
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6634

Test Plan: make -j64 check

Reviewed By: anand1976

Differential Revision: D20810282

Pulled By: akankshamahajan15

fbshipit-source-id: ac9098d3c079a1fad098f6678dbedb4d888a791b
2020-04-27 12:36:16 -07:00
Peter Dillinger
791e5714a5 Understand common build variables passed as make variables (#6740)
Summary:
Some common build variables like USE_CLANG and
COMPILE_WITH_UBSAN did not work if specified as make variables, as in
`make USE_CLANG=1 check` etc. rather than (in theory less hygienic)
`USE_CLANG=1 make check`. This patches Makefile to export some commonly
used ones to build_detect_platform so that they work. (I'm skeptical of
a broad `export` in Makefile because it's hard to predict how random
make variables might affect various invoked tools.)
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6740

Test Plan: manual / CI

Reviewed By: siying

Differential Revision: D21229011

Pulled By: pdillinger

fbshipit-source-id: b00c69b23eb2a13105bc8d860ce2d1e61ac5a355
2020-04-27 10:48:49 -07:00
Yanqin Jin
3b2f2719eb Update buckifier to unblock future internal release (#6726)
Summary:
Some recent PRs added new source files or modified TARGETS file manually.
During next internal release, executing the following command will revert the
manual changes.
Update buckifier so that the following command
```
python buckfier/buckify_rocksdb.py
```
does not change TARGETS file.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6726

Test Plan:
```
python buckifier/buckify_rocksdb.py
```

Reviewed By: siying

Differential Revision: D21098930

Pulled By: riversand963

fbshipit-source-id: e884f507fefef88163363c9097a460c98f1ed850
2020-04-26 17:35:37 -07:00
Cheng Chang
0a77617820 Disable O_DIRECT in stress test when db directory does not support direct IO (#6727)
Summary:
In crash test, the db directory might be set to /dev/shm or /tmp, in certain environments such as internal testing infrastructure, neither of these directories support direct IO, so direct IO is never enabled in crash test.

This PR sets up SyncPoints in direct IO related code paths to disable O_DIRECT flag in calls to `open`, so the direct IO code paths will be executed, all direct IO related assertions will be checked, but no real direct IO request will be issued to the file system.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6727

Test Plan:
export CRASH_TEST_EXT_ARGS="--use_direct_reads=1 --mmap_read=0"
make -j24 crash_test

Reviewed By: zhichao-cao

Differential Revision: D21139250

Pulled By: cheng-chang

fbshipit-source-id: db9adfe78d91aa4759835b1af91c5db7b27b62ee
2020-04-25 00:01:03 -07:00
Cheng Chang
40497a875a Reduce memory copies when fetching and uncompressing blocks from SST files (#6689)
Summary:
In https://github.com/facebook/rocksdb/pull/6455, we modified the interface of `RandomAccessFileReader::Read` to be able to get rid of memcpy in direct IO mode.
This PR applies the new interface to `BlockFetcher` when reading blocks from SST files in direct IO mode.

Without this PR, in direct IO mode, when fetching and uncompressing compressed blocks, `BlockFetcher` will first copy the raw compressed block into `BlockFetcher::compressed_buf_` or `BlockFetcher::stack_buf_` inside `RandomAccessFileReader::Read` depending on the block size. then during uncompressing, it will copy the uncompressed block into `BlockFetcher::heap_buf_`.

In this PR, we get rid of the first memcpy and directly uncompress the block from `direct_io_buf_` to `heap_buf_`.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6689

Test Plan: A new unit test `block_fetcher_test` is added.

Reviewed By: anand1976

Differential Revision: D21006729

Pulled By: cheng-chang

fbshipit-source-id: 2370b92c24075692423b81277415feb2aed5d980
2020-04-24 15:32:56 -07:00
Cheng Chang
1758f76f2d Fix unused variable of r in release mode (#6750)
Summary:
In release mode, asserts are not compiled, so `r` is not used, causing compiler warnings.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6750

Test Plan: make check under release mode

Reviewed By: anand1976

Differential Revision: D21220365

Pulled By: cheng-chang

fbshipit-source-id: fd4afa9843d54af68c4da8660ec61549803e1167
2020-04-24 15:14:13 -07:00
anand76
9e7b7e2c08 Silence false alarms in db_stress fault injection (#6741)
Summary:
False alarms are caused by codepaths that intentionally swallow IO
errors.

Tests:
make crash_test
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6741

Reviewed By: ltamasi

Differential Revision: D21181138

Pulled By: anand1976

fbshipit-source-id: 5ccfbc68eb192033488de6269e59c00f2c65ce00
2020-04-24 13:06:12 -07:00
Yanqin Jin
e04f3bce4f Update CURRENT file after best-efforts recovery (#6746)
Summary:
After a successful recovery, the CURRENT file should be updated to point to the valid MANIFEST.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6746

Test Plan: make check

Reviewed By: anand1976

Differential Revision: D21189876

Pulled By: riversand963

fbshipit-source-id: 7537b49988c5c425ebe9505a5cc260de351ad79b
2020-04-23 16:21:09 -07:00
Cheng Chang
51bdfae010 Check alignment of MultiRead requests in direct IO mode (#6739)
Summary:
Add assertions to check direct IO's alignment requirements in MultiRead.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6739

Test Plan: make check

Reviewed By: siying

Differential Revision: D21143825

Pulled By: cheng-chang

fbshipit-source-id: 26f1623b062a1851080771128feac0669a61f5e9
2020-04-23 15:19:31 -07:00
Levi Tamasi
bc51e33d9c Make sure (Shared)BlobFileMetaData are owned by shared_ptrs (#6749)
Summary:
The patch makes a couple of small cleanups to `SharedBlobFileMetaData` and `BlobFileMetaData`:
* It makes the constructors private and introduces factory methods to ensure these objects are always owned by `shared_ptr`s. Note that `SharedBlobFileMetaData` has an additional factory that takes a deleter object; we can utilize this to e.g. notify `VersionSet` when a blob file becomes obsolete (which is exactly when `SharedBlobFileMetaData` is destroyed).
* It disables move operations explicitly instead of relying on them being suppressed because of a user-declared destructor.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6749

Test Plan: `make check`

Reviewed By: riversand963

Differential Revision: D21206947

Pulled By: ltamasi

fbshipit-source-id: 9094c14cc335b3e226f883e5a0df4f87a5cdeb95
2020-04-23 13:44:29 -07:00
Ibrahim Jarif
ae77880223 Fix some typos in code comments (#6733)
Summary:
This PR fixes some typos in code comments.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6733

Reviewed By: siying

Differential Revision: D21209037

Pulled By: zhichao-cao

fbshipit-source-id: d9274611fab1f5e992998c8c4117b8078c4cbc69
2020-04-23 12:28:49 -07:00
mrambacher
4cbc19d2a1 Add a ConfigOptions for use in comparing objects and converting to/from strings (#6389)
Summary:
The methods in convenience.h are used to compare/convert objects to/from strings.  There is a mishmash of parameters in use here with more needed in the future.  This PR replaces those parameters with a single structure.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6389

Reviewed By: siying

Differential Revision: D21163707

Pulled By: zhichao-cao

fbshipit-source-id: f807b4cc7e2b0af3871536b69546b2604dfa81bd
2020-04-21 17:38:17 -07:00
anand76
c1ccd6b6af Implement deadline support for MultiGet (#6710)
Summary:
Initial implementation of ReadOptions.deadline for MultiGet. If the request takes longer than the deadline, the keys not yet found will be returned with Status::TimedOut(). This
implementation enforces the deadline in DBImpl, which is fairly high
level. Its best effort and may not check the deadline after every key
lookup, but may do so after a batch of keys.

In subsequent stages, we will extend this to passing a timeout down to the FileSystem.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6710

Test Plan: Add new unit tests

Reviewed By: riversand963

Differential Revision: D21149158

Pulled By: anand1976

fbshipit-source-id: 9f44eecffeb40873f5034ed59a66d21f9f88879e
2020-04-21 14:51:51 -07:00
Tomas Kolda
6ee66cf8f0 Prevents Table Cache to open same files more times (#6707)
Summary:
In highly concurrent requests table cache opens same file more times which lowers purpose of max_open_files. Fixes (https://github.com/facebook/rocksdb/issues/6699)
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6707

Reviewed By: ltamasi

Differential Revision: D21044965

fbshipit-source-id: f6e91d90b60dad86e518b5147021da42460ee1d2
2020-04-21 13:16:31 -07:00
Andrew Kryczka
f9155a3404 Prevent uninitialized load in IndexBlockIter (#6736)
Summary:
When index block is empty or an error happens while reading it,
`Invalidate()` is called rather than `Initialize()`. So `Seek()` must
not refer to member variables that are only initialized in
`Initialize()` until it is sure `Initialize()` has been called.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6736

Reviewed By: siying

Differential Revision: D21139641

Pulled By: ajkr

fbshipit-source-id: 71c58cc1adbd795dc3729dd5023bf7df1515ff32
2020-04-20 16:32:43 -07:00
Akanksha Mahajan
03a1d95db0 Set max_background_flushes dynamically (#6701)
Summary:
1. Add changes so that max_background_flushes can be set dynamically.
                   2. Add a testcase DBOptionsTest.SetBackgroundFlushThreads which set the
                        max_background_flushes dynamically using SetDBOptions.

TestPlan:  1. make -j64 check
                  2. Using new testcase DBOptionsTest.SetBackgroundFlushThreads
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6701

Reviewed By: ajkr

Differential Revision: D21028010

Pulled By: akankshamahajan15

fbshipit-source-id: 5f949e4a8fd3c32537b637947b7ee09a69cfc7c1
2020-04-20 16:19:02 -07:00
Peter Dillinger
31da5e34c1 C++20 compatibility (#6697)
Summary:
Based on https://github.com/facebook/rocksdb/issues/6648 (CLA Signed), but heavily modified / extended:

* Implicit capture of this via [=] deprecated in C++20, and [=,this] not standard before C++20 -> now using explicit capture lists
* Implicit copy operator deprecated in gcc 9 -> add explicit '= default' definition
* std::random_shuffle deprecated in C++17 and removed in C++20 -> migrated to a replacement in RocksDB random.h API
* Add the ability to build with different std version though -DCMAKE_CXX_STANDARD=11/14/17/20 on the cmake command line
* Minimal rebuild flag of MSVC is deprecated and is forbidden with /std:c++latest (C++20)
* Added MSVC 2019 C++11 & MSVC 2019 C++20 in AppVeyor
* Added GCC 9 C++11 & GCC9 C++20 in Travis
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6697

Test Plan: make check and CI

Reviewed By: cheng-chang

Differential Revision: D21020318

Pulled By: pdillinger

fbshipit-source-id: 12311be5dbd8675a0e2c817f7ec50fa11c18ab91
2020-04-20 13:24:25 -07:00
sdong
fe206f4f7c crash_test to cover index_type kBinarySearchWithFirstKey (#6721)
Summary:
Recently index_type kBinarySearchWithFirstKey is improved so that the API guarantee is exactly the same as other types and it is ready for wide production. We should cover it in crash tst.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6721

Test Plan: Run crash_test

Reviewed By: anand1976

Differential Revision: D21099781

fbshipit-source-id: fda91eba831d9eacbb140c703e9768bb1701f935
2020-04-20 12:57:15 -07:00
Peter Dillinger
45d2b4efca Fix tabs and lint-ignores (#6734)
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/6734

Reviewed By: cheng-chang

Differential Revision: D21134556

Pulled By: pdillinger

fbshipit-source-id: 3636cc1d1333137b70031f8277458781c21631fb
2020-04-20 11:39:31 -07:00
Yanqin Jin
243852ec15 Add IsDirectory() to Env and FS (#6711)
Summary:
IsDirectory() is a common API to check whether a path is a regular file or
directory.
POSIX: call stat() and use S_ISDIR(st_mode)
Windows: PathIsDirectoryA() and PathIsDirectoryW()
HDFS: FileSystem.IsDirectory()
Java: File.IsDirectory()
...
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6711

Test Plan: make check

Reviewed By: anand1976

Differential Revision: D21053520

Pulled By: riversand963

fbshipit-source-id: 680aadfd8ce982b63689190cf31b3145d5a89e27
2020-04-17 14:39:18 -07:00
sdong
63d82e57b9 crash_test to cover small max_open_files (#6719)
Summary:
RocksDB behavior is different while max_open_files is small or large. Add the coverage to small max_open_files.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6719

Test Plan: Run crash_test

Reviewed By: pdillinger

Differential Revision: D21081021

fbshipit-source-id: e3e211761a9bd25d93d19a61c1f7b62d48cf5e3c
2020-04-17 11:00:07 -07:00
Nicolas Pépin-Perreault
9e6f3efcd2 Add RocksIterator::Refresh (#6573)
Summary:
This PR exposes the `Iterator::Refresh` method to the Java API by adding it on the `RocksIteratorInterface` interface. There are three concrete implementations: `RocksIterator`, `SstFileReaderIterator`, and `WBWIRocksIterator`. For the first two cases, the JNI side simply delegates to the underlying `Iterator::Refresh` method; in the last case, as it doesn't share an ancestor, and per the discussion in https://github.com/facebook/rocksdb/issues/3465, a `Status::NotSupported` exception is thrown.

As the last PR had no activity in a while, I'm opening a new one - I'm completely fine with merging the previous PR if it gets completed before this is reviewed.

Let me know if there's anything missing or anything else I can do 👍
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6573

Reviewed By: cheng-chang

Differential Revision: D20604666

Pulled By: pdillinger

fbshipit-source-id: 4de17df1180c3b87b76cfdd77b674b81fc0563f7
2020-04-16 15:55:26 -07:00
Adam Retter
9ca49bd4df Keep building RocksJava on all architectures (#6583)
Summary:
Adding solid support for multiple architectures was initially triggered by RocksJava users. As such I would like to keep the CI for RocksJava on all architectures, to ensure we don't break backwards compatibility.

pdillinger okay let's see how long it takes to complete Travis-CI with this one...
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6583

Reviewed By: cheng-chang

Differential Revision: D21036718

Pulled By: pdillinger

fbshipit-source-id: 97afe0db2e4c575cc0284fdc1d4cc45d5deb2272
2020-04-16 15:50:45 -07:00
Adam Retter
5fef0ffd66 Update RocksJava static version of bzip2 (#6714)
Summary:
Updates the version of bzip2 used for RocksJava static builds.

Please, can we also get this cherry-picked to:

1. 6.7.fb
2. 6.8.fb
3. 6.9.fb
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6714

Reviewed By: cheng-chang

Differential Revision: D21067233

Pulled By: pdillinger

fbshipit-source-id: 8164b7eb99c5ca7b2021ab8c371ba9ded4cb4f7e
2020-04-16 15:35:24 -07:00
Andrew Kryczka
6717ada899 Fix CF import with overlapping SST files (#6663)
Summary:
Invariant checking should use internal key comparator rather than
`sstableKeyCompare()`. The latter was intended for checking whether a
compaction input file's neighboring files need to be included in the
same compaction. Using it for invariant checking was leading to false
positives for files with overlapping endpoints.

Fixes https://github.com/facebook/rocksdb/issues/6647.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6663

Test Plan: regression test

Reviewed By: zhichao-cao

Differential Revision: D20910466

Pulled By: ajkr

fbshipit-source-id: f0b70dad7c4096fce635cab7a36f16e14f74ae3f
2020-04-16 13:16:06 -07:00
sdong
73523baeb1 crash_test to cover options.avoid_flush_during_recovery (#6712)
Summary:
Options.avoid_flush_during_recovery is uncovered in crash_test. Add the coverage with a chance of 1/8, as it is a less frequently used options.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6712

Test Plan: Run crash_test and see the option can be used or not used by chance.

Reviewed By: ltamasi

Differential Revision: D21056566

fbshipit-source-id: c3b1521517cfc204786e6ef8c6acd7fffda64793
2020-04-16 12:11:45 -07:00
Yueh-Hsuan Chiang
5801af4646 Add env_fault_injection argument to db_stress (#6687)
Summary:
Add env_fault_injection argument to db_stress.  When enabled,
FaultInjectionTestEnv will be used instead.  Currently this
option does not support running with other env setting.

This will allow
us to later manually produce error when running db_crashtest.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6687

Test Plan:
make db_stress -j32
./db_stress --env_fault_injection
./db_stress --env_fault_injection --hdfs   // expect error message

Reviewed By: ajkr

Differential Revision: D21014683

Pulled By: yhchiang

fbshipit-source-id: 0724aeac37efd57adb72a37defe6dbd3bfa8106a
2020-04-16 11:13:44 -07:00
Cheng Chang
2767972386 Fix warning when O_CLOEXEC is not defined (#6695)
Summary:
Compilation fails on systems that do not support O_CLOEXEC. Fix it.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6695

Test Plan: compile without O_CLOEXEC support

Reviewed By: anand1976

Differential Revision: D21011850

Pulled By: cheng-chang

fbshipit-source-id: f1bf1cce2aa65c7d10b5a9613e941db30e928347
2020-04-16 11:02:50 -07:00
Mike Kolupaev
e45673dece Properly report IO errors when IndexType::kBinarySearchWithFirstKey is used (#6621)
Summary:
Context: Index type `kBinarySearchWithFirstKey` added the ability for sst file iterator to sometimes report a key from index without reading the corresponding data block. This is useful when sst blocks are cut at some meaningful boundaries (e.g. one block per key prefix), and many seeks land between blocks (e.g. for each prefix, the ranges of keys in different sst files are nearly disjoint, so a typical seek needs to read a data block from only one file even if all files have the prefix). But this added a new error condition, which rocksdb code was really not equipped to deal with: `InternalIterator::value()` may fail with an IO error or Status::Incomplete, but it's just a method returning a Slice, with no way to report error instead. Before this PR, this type of error wasn't handled at all (an empty slice was returned), and kBinarySearchWithFirstKey implementation was considered a prototype.

Now that we (LogDevice) have experimented with kBinarySearchWithFirstKey for a while and confirmed that it's really useful, this PR is adding the missing error handling.

It's a pretty inconvenient situation implementation-wise. The error needs to be reported from InternalIterator when trying to access value. But there are ~700 call sites of `InternalIterator::value()`, most of which either can't hit the error condition (because the iterator is reading from memtable or from index or something) or wouldn't benefit from the deferred loading of the value (e.g. compaction iterator that reads all values anyway). Adding error handling to all these call sites would needlessly bloat the code. So instead I made the deferred value loading optional: only the call sites that may use deferred loading have to call the new method `PrepareValue()` before calling `value()`. The feature is enabled with a new bool argument `allow_unprepared_value` to a bunch of methods that create iterators (it wouldn't make sense to put it in ReadOptions because it's completely internal to iterators, with virtually no user-visible effect). Lmk if you have better ideas.

Note that the deferred value loading only happens for *internal* iterators. The user-visible iterator (DBIter) always prepares the value before returning from Seek/Next/etc. We could go further and add an API to defer that value loading too, but that's most likely not useful for LogDevice, so it doesn't seem worth the complexity for now.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6621

Test Plan: make -j5 check . Will also deploy to some logdevice test clusters and look at stats.

Reviewed By: siying

Differential Revision: D20786930

Pulled By: al13n321

fbshipit-source-id: 6da77d918bad3780522e918f17f4d5513d3e99ee
2020-04-15 17:40:44 -07:00
anand76
610a09ccff Remove a printf from db_stress that's not useful info (#6705)
Summary:
This was causing db_crashtest.py to wrongly assume an error by parsing the output. Hopefully this will stabilize the crash tests.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6705

Test Plan: make blackbox_crash_test

Reviewed By: ltamasi

Differential Revision: D21043335

Pulled By: anand1976

fbshipit-source-id: 5cddd112b124d4e2ebd11724a17d4ef0f50c1cf8
2020-04-15 12:13:35 -07:00
sdong
165560fb32 Two Improvements to tools/check_format_compatible.sh (#6702)
Summary:
Improve it in two ways:
1. tools/check_format_compatible.sh is not friendly to run outside FB environment. remove the hard-coded http proxy setting. Instead, move it to Legocastle configuration
2. Always disable warning as error, so that older build is more likely to pass.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6702

Test Plan: Run the test and make sure at least it doesn't break.

Reviewed By: riversand963

Differential Revision: D21033329

fbshipit-source-id: 88b4ec1ec49547b772790050a165466bdc4a62a0
2020-04-15 11:28:11 -07:00
anand76
234e2ed5b6 Fix a couple of bugs in db_stress fault injection (#6700)
Summary:
1. Fix a memory leak in FaultInjectionTestFS in the stack trace related
code
2. Check status of all MultiGet keys before deciding whether an error
was swallowed, instead of assuming an ok status for any key means an
undetected error
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6700

Test Plan: Run db_stress with asan and fault injection

Reviewed By: cheng-chang

Differential Revision: D21021498

Pulled By: anand1976

fbshipit-source-id: 489191efd1ab0fa834923a1e1d57253a7a315465
2020-04-14 11:06:55 -07:00
Cheng Chang
9ae8058d95 Suppress file deletion error message in FaultInjectionTestEnv (#6696)
Summary:
The error message is causing problems in the crash tests due to the
error parsing logic in db_crashtest.py.

This is a follow up PR for https://github.com/facebook/rocksdb/pull/6694.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6696

Test Plan: make check

Reviewed By: zhichao-cao

Differential Revision: D21021875

Pulled By: cheng-chang

fbshipit-source-id: 11e3f536df16941a89949ebcd2147cd8dfa3fbe0
2020-04-14 10:55:10 -07:00
anand76
3d6d7bcf17 Log CompactOnDeletionCollectorFactory parameters on DB open (#6686)
Summary:
Log it in the info log to help in troubleshooting. It is logged as follows -
```
2020/04/10-10:51:39.886662 7ffff7fef340                   Options.table_properties_collectors: CompactOnDeletionCollector (Sliding window size = 100 Deletion trigger = 90);
```

Tests:
make check
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6686

Reviewed By: ltamasi

Differential Revision: D21002442

Pulled By: anand1976

fbshipit-source-id: 7adf0dbae7f1febcb00ce61fea5097118ede5c6a
2020-04-13 19:58:04 -07:00
Zhichao Cao
38dfa406ff Add NewFileChecksumGenCrc32cFactory to file checksum (#6688)
Summary:
Add NewFileChecksumGenCrc32cFactory to file checksum public interface such that applications can use the build in crc32 checksum factory.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6688

Test Plan: pass make asan_check

Reviewed By: riversand963

Differential Revision: D21006859

Pulled By: zhichao-cao

fbshipit-source-id: ea8a45196a8b77c310728ab05f6cc0f49f3baef0
2020-04-13 19:13:41 -07:00
Ziyue Yang
41563b61db Fix data racing of BlockBasedTableBuilder::ParallelCompressionRep::first_block (#6640)
Summary:
BlockBasedTableBuilder::ParallelCompressionRep::first_block can be read in
Flush() and written in BGWorkWriteRawBlock() concurrently. This commit fixes
the issue by reading first_block out before pushing the block to compression
and write.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6640

Test Plan: Run all tests concurrently with TSAN.

Reviewed By: cheng-chang

Differential Revision: D20851370

fbshipit-source-id: 6f039222e8319d31e15f1b45e05c106527253f72
2020-04-13 16:24:57 -07:00
anand76
d9cad3a526 Suppress file deletion error message in FaultInjectionTestFS (#6694)
Summary:
The error message is causing problems in the crash tests due to the
error parsing logic in db_crashtest.py.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6694

Reviewed By: siying

Differential Revision: D20998531

Pulled By: anand1976

fbshipit-source-id: 89cb54a5f5bb664ae6d239c37559f10e14c5ea07
2020-04-13 15:18:38 -07:00
Andrew Kryczka
9eca6d651d fix comparison count for format_version=3 indexes (#6650)
Summary:
In index blocks since `format_version=3`, user keys are written
rather than internal keys. When reading such blocks, the comparator is
obtained via `InternalKeyComparator::user_comparator()`. That function
must not return an unwrapped result as the wrapper class provides
accounting logic to populate `PerfContext::user_key_comparison_count`.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6650

Test Plan:
ran db_bench and verified
`PerfContext::user_key_comparison_count` became larger.

Reviewed By: cheng-chang

Differential Revision: D20866325

Pulled By: ajkr

fbshipit-source-id: ad755d46bda31157dacc5b66e532279f19ad538c
2020-04-13 11:18:37 -07:00
anand76
79c838eb0f Fix a few bugs in db_stress fault injection (#6693)
Summary:
Fix the following issues -
1. Output parsing error in db_crashtest.py
2. Memory leak on exit
3. False alarm on filter block read error
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6693

Test Plan: asan_crash

Reviewed By: cheng-chang

Differential Revision: D20990399

Pulled By: anand1976

fbshipit-source-id: 178ee0dd7c69a4bc5db698379db0dedb29281699
2020-04-13 11:01:03 -07:00
Yanqin Jin
eeb3cf3f58 Fix release build (#6690)
Summary:
Fix release build caused by variable defined but unused.

Test plan (devserver)
```
make release
```
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6690

Reviewed By: cheng-chang

Differential Revision: D20980571

Pulled By: riversand963

fbshipit-source-id: c3f3b13f81dce4bdb19876dc2e710d5902ff8a02
2020-04-11 22:04:04 -07:00
anand76
5c19a441c4 Fault injection in db_stress (#6538)
Summary:
This PR implements a fault injection mechanism for injecting errors in reads in db_stress. The FaultInjectionTestFS is used for this purpose. A thread local structure is used to track the errors, so that each db_stress thread can independently enable/disable error injection and verify observed errors against expected errors. This is initially enabled only for Get and MultiGet, but can be extended to iterator as well once its proven stable.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6538

Test Plan:
crash_test
make check

Reviewed By: riversand963

Differential Revision: D20714347

Pulled By: anand1976

fbshipit-source-id: d7598321d4a2d72bda0ced57411a337a91d87dc7
2020-04-10 17:21:26 -07:00