Commit Graph

663 Commits

Author SHA1 Message Date
Zhongyi Xie
954b496b3f fix memory leak in two_level_iterator
Summary:
this PR fixes a few failed contbuild:
1. ASAN memory leak in Block::NewIterator (table/block.cc:429). the proper destruction of first_level_iter_ and second_level_iter_ of two_level_iterator.cc is missing from the code after the refactoring in https://github.com/facebook/rocksdb/pull/3406
2. various unused param errors introduced by https://github.com/facebook/rocksdb/pull/3662
3. updated comment for `ForceReleaseCachedEntry` to emphasize the use of `force_erase` flag.
Closes https://github.com/facebook/rocksdb/pull/3718

Reviewed By: maysamyabandeh

Differential Revision: D7621192

Pulled By: miasantreble

fbshipit-source-id: 476c94264083a0730ded957c29de7807e4f5b146
2018-04-15 17:26:26 -07:00
Amy Tai
28087acd79 Implemented Knuth shuffle to construct permutation for selecting no_o…
Summary:
…verwrite_keys. Also changed each no_overwrite_key set to an unordered set, otherwise Knuth shuffle only gets you 2x time improvement, because insertion (and subsequent internal sorting) into an ordered set is the bottleneck.

With this change, each iteration of permutation construction and prefix selection takes around 40 secs, as opposed to 360 secs previously. However, this still means that with the default 10 CF per blackbox test case, the test is going to time out given the default interval of 200 secs.

Also, there is currently an assertion error affecting all blackbox tests in db_crashtest.py; this assertion error will be fixed in a future PR.
Closes https://github.com/facebook/rocksdb/pull/3699

Differential Revision: D7624616

Pulled By: amytai

fbshipit-source-id: ea64fbe83407ff96c1c0ecabbc6c830576939393
2018-04-13 22:13:13 -07:00
David Lai
3be9b36453 comment unused parameters to turn on -Wunused-parameter flag
Summary:
This PR comments out the rest of the unused arguments which allow us to turn on the -Wunused-parameter flag. This is the second part of a codemod relating to https://github.com/facebook/rocksdb/pull/3557.
Closes https://github.com/facebook/rocksdb/pull/3662

Differential Revision: D7426121

Pulled By: Dayvedde

fbshipit-source-id: 223994923b42bd4953eb016a0129e47560f7e352
2018-04-12 17:59:16 -07:00
Maysam Yabandeh
eb5a295440 WritePrepared Txn: add write_committed option to dump_wal
Summary:
Currently dump_wal cannot print the prepared records from the WAL that is generated by WRITE_PREPARED write policy since the default reaction of the handler is to return NotSupported if markers of WRITE_PREPARED are encountered. This patch enables the admin to pass --write_committed=false option, which will be accordingly passed to the handler. Note that DBFileDumperCommand and DBDumperCommand are still not updated by this patch but firstly they are not urgent and secondly we need to revise this approach later when we also add WRITE_UNPREPARED markers so I leave it for future work.

Tested by running it on a WAL generated by WRITE_PREPARED:
$ ./ldb dump_wal --walfile=/dev/shm/dbbench/000003.log  | grep BEGIN_PREARE | head -1
1,2,70,0,BEGIN_PREARE
$ ./ldb dump_wal --walfile=/dev/shm/dbbench/000003.log --write_committed=false | grep BEGIN_PREARE | head -1
1,2,70,0,BEGIN_PREARE PUT(0) : 0x30303031313330313938 PUT(0) : 0x30303032353732313935 END_PREPARE(0x74786E31313535383434323738303738363938313335312D30)
Closes https://github.com/facebook/rocksdb/pull/3682

Differential Revision: D7522090

Pulled By: maysamyabandeh

fbshipit-source-id: a0332207261c61e18b2f9dfbe9feecd9a1339aca
2018-04-07 21:56:42 -07:00
Phani Shekhar Mantripragada
446b32cfc3 Support for Column family specific paths.
Summary:
In this change, an option to set different paths for different column families is added.
This option is set via cf_paths setting of ColumnFamilyOptions. This option will work in a similar fashion to db_paths setting. Cf_paths is a vector of Dbpath values which contains a pair of the absolute path and target size. Multiple levels in a Column family can go to different paths if cf_paths has more than one path.
To maintain backward compatibility, if cf_paths is not specified for a column family, db_paths setting will be used. Note that, if db_paths setting is also not specified, RocksDB already has code to use db_name as the only path.

Changes :
1) A new member "cf_paths" is added to ImmutableCfOptions. This is set, based on cf_paths setting of ColumnFamilyOptions and db_paths setting of ImmutableDbOptions.  This member is used to identify the path information whenever files are accessed.
2) Validation checks are added for cf_paths setting based on existing checks for db_paths setting.
3) DestroyDB, PurgeObsoleteFiles etc. are edited to support multiple cf_paths.
4) Unit tests are added appropriately.
Closes https://github.com/facebook/rocksdb/pull/3102

Differential Revision: D6951697

Pulled By: ajkr

fbshipit-source-id: 60d2262862b0a8fd6605b09ccb0da32bb331787d
2018-04-05 19:58:20 -07:00
Yi Wu
36a9f22931 Blob DB: blob_dump to show uncompressed values
Summary:
Make blob_dump tool able to show uncompressed values if the blob file is compressed. Also show total compressed vs. raw size at the end if --show_summary is provided.
Closes https://github.com/facebook/rocksdb/pull/3633

Differential Revision: D7348926

Pulled By: yiwu-arbug

fbshipit-source-id: ca709cb4ed5cf6a550ff2987df8033df81516f8e
2018-04-05 11:12:16 -07:00
Andrew Kryczka
b058a33705 Reduce default --nooverwritepercent in black-box crash tests
Summary:
Previously `python tools/db_crashtest.py blackbox` would do no useful work as the crash interval (two minutes) was shorter than the preparation phase. The preparation phase is slow because of the ridiculously inefficient way it computes which keys should not be overwritten. It was doing this for 60M keys since default values were `FLAGS_nooverwritepercent == 60` and `FLAGS_max_key == 100000000`.

Move the "nooverwritepercent" override from whitebox-specific to the general options so it also applies to blackbox test runs. Now preparation phase takes a few seconds.
Closes https://github.com/facebook/rocksdb/pull/3671

Differential Revision: D7457732

Pulled By: ajkr

fbshipit-source-id: 601f4461a6a7e49e50449dcf15aebc9b8a98d6f0
2018-04-03 15:28:40 -07:00
Anand Ananthabhotla
f9f4d40f93 Align SST file data blocks to avoid spanning multiple pages
Summary:
Provide a block_align option in BlockBasedTableOptions to allow
alignment of SST file data blocks. This will avoid higher
IOPS/throughput load due to < 4KB data blocks spanning 2 4KB pages.
When this option is set to true, the block alignment is set to lower of
block size and 4KB.
Closes https://github.com/facebook/rocksdb/pull/3502

Differential Revision: D7400897

Pulled By: anand1976

fbshipit-source-id: 04cc3bd144e88e3431a4f97604e63ad7a0f06d44
2018-03-26 20:26:10 -07:00
Sagar Vemuri
a993c0139d Add 5.11 and 5.12 to tools/check_format_compatible.sh
Summary: Closes https://github.com/facebook/rocksdb/pull/3646

Differential Revision: D7384727

Pulled By: sagar0

fbshipit-source-id: f713af7adb2ffea5303bbf0fac8a8a1630af7b38
2018-03-23 12:43:06 -07:00
Siying Dong
6383e42362 benchmark.sh to use --max_background_job
Summary: Closes https://github.com/facebook/rocksdb/pull/3632

Differential Revision: D7347012

Pulled By: siying

fbshipit-source-id: 46230ec4a917ccf4c478825b07e92b4665a4820b
2018-03-20 18:57:55 -07:00
Bruce Mitchener
a3a3f5497c Fix some typos in comments and docs.
Summary: Closes https://github.com/facebook/rocksdb/pull/3568

Differential Revision: D7170953

Pulled By: siying

fbshipit-source-id: 9cfb8dd88b7266da920c0e0c1e10fb2c5af0641c
2018-03-08 10:27:25 -08:00
Yi Wu
b864bc9b5b Blob DB: Improve FIFO eviction
Summary:
Improving blob db FIFO eviction with the following changes,
* Change blob_dir_size to max_db_size. Take into account SST file size when computing DB size.
* FIFO now only take into account live sst files and live blob files. It is normal for disk usage to go over max_db_size because there are obsolete sst files and blob files pending deletion.
* FIFO eviction now also evict TTL blob files that's still open. It doesn't evict non-TTL blob files.
* If FIFO is triggered, it will pass an expiration and the current sequence number to compaction filter. Compaction filter will then filter inlined keys to evict those with an earlier expiration and smaller sequence number. So call LSM FIFO.
* Compaction filter also filter those blob indexes where corresponding blob file is gone.
* Add an event listener to listen compaction/flush event and update sst file size.
* Implement DB::Close() to make sure base db, as well as event listener and compaction filter, destruct before blob db.
* More blob db statistics around FIFO.
* Fix some locking issue when accessing a blob file.
Closes https://github.com/facebook/rocksdb/pull/3556

Differential Revision: D7139328

Pulled By: yiwu-arbug

fbshipit-source-id: ea5edb07b33dfceacb2682f4789bea61de28bbfa
2018-03-06 11:57:42 -08:00
Pooya Shareghi
0a2354ca8f Added bytes XOR merge operator
Summary:
Closes https://github.com/facebook/rocksdb/pull/575

I fixed the merge conflicts etc.
Closes https://github.com/facebook/rocksdb/pull/3065

Differential Revision: D7128233

Pulled By: sagar0

fbshipit-source-id: 2c23a48c9f0432c290b0cd16a12fb691bb37820c
2018-03-06 10:27:36 -08:00
Andrew Kryczka
5d68243e61 Comment out unused variables
Summary:
Submitting on behalf of another employee.
Closes https://github.com/facebook/rocksdb/pull/3557

Differential Revision: D7146025

Pulled By: ajkr

fbshipit-source-id: 495ca5db5beec3789e671e26f78170957704e77e
2018-03-05 13:13:41 -08:00
Maysam Yabandeh
d060421c77 Fix a leak in prepared_section_completed_
Summary:
The zeroed entries were not removed from prepared_section_completed_ map. This patch adds a unit test to show the problem and fixes that by refactoring the code. The new code is more efficient since i) it uses two separate mutex to avoid contention between commit and prepare threads, ii) it uses a sorted vector for maintaining uniq log entires with prepare which avoids a very large heap with many duplicate entries.
Closes https://github.com/facebook/rocksdb/pull/3545

Differential Revision: D7106071

Pulled By: maysamyabandeh

fbshipit-source-id: b3ae17cb6cd37ef10b6b35e0086c15c758768a48
2018-03-01 20:41:56 -08:00
Igor Sugak
aba3409740 Back out "[codemod] - comment out unused parameters"
Reviewed By: igorsugak

fbshipit-source-id: 4a93675cc1931089ddd574cacdb15d228b1e5f37
2018-02-22 12:43:17 -08:00
David Lai
f4a030ce81 - comment out unused parameters
Reviewed By: everiq, igorsugak

Differential Revision: D7046710

fbshipit-source-id: 8e10b1f1e2aecebbfb229c742e214db887e5a461
2018-02-22 09:44:23 -08:00
Andrew Kryczka
1960e73e21 fix handling of empty string as checkpoint directory
Summary:
- made `CreateCheckpoint` properly return `InvalidArgument` when called with an empty directory. Previously it triggered an assertion failure due to a bug in the logic.
- made `ldb` set empty `checkpoint_dir` if that's what the user specifies, so that we can use it to properly test `CreateCheckpoint` in the future.

Differential Revision: D6874562

fbshipit-source-id: dcc1bd41768261d9338987fa7711444289707ed7
2018-02-20 16:44:00 -08:00
Yi Wu
989d12313c Legocastle job to report lite build binary size to scuba
Summary:
Add a legocastle job to continuously build the last 10 commits every 4 hours and report lite build binary size to scuba.
Closes https://github.com/facebook/rocksdb/pull/3511

Differential Revision: D7001730

Pulled By: yiwu-arbug

fbshipit-source-id: 7c8ca87c46d663c786a0d32be69ebbe7b19a5eb9
2018-02-15 17:27:24 -08:00
Andrew Kryczka
0a0fad447b db_bench separate options for partition index and filters
Summary:
Some workloads (like my current benchmarking) may want partitioned indexes without partitioned filters. Particularly, when `-optimize_filters_for_hits=true`, the total index size may be larger than the total filter size, so it can make sense to hold all filters in-memory but not all indexes.
Closes https://github.com/facebook/rocksdb/pull/3492

Differential Revision: D6970092

Pulled By: ajkr

fbshipit-source-id: b7fa1828e1d13829339aefb90fd56eb7c5337f61
2018-02-12 14:57:13 -08:00
Chinmay Kamat
9fc72d6f16 Compilation fixes for powerpc build, -Wparentheses-equality error and missing header guards
Summary:
This pull request contains miscellaneous compilation fixes.

Thanks,
Chinmay
Closes https://github.com/facebook/rocksdb/pull/3462

Differential Revision: D6941424

Pulled By: sagar0

fbshipit-source-id: fe9c26507bf131221f2466740204bff40a15614a
2018-02-09 14:12:43 -08:00
Tamir Duberstein
cd5092e168 Suppress unused warnings
Summary:
- Use `__unused__` everywhere
- Suppress unused warnings in Release mode
    + This currently affects non-MSVC builds (e.g. mingw64).
Closes https://github.com/facebook/rocksdb/pull/3448

Differential Revision: D6885496

Pulled By: miasantreble

fbshipit-source-id: f2f6adacec940cc3851a9eee328fafbf61aad211
2018-02-02 12:27:07 -08:00
Siying Dong
e2d4b0efb1 db_bench: sanity check CuckooTable with mmap_read option
Summary:
This is to avoid run time error. Fail the db_bench immediately if cuckoo table is used but mmap_read is not specified.
Closes https://github.com/facebook/rocksdb/pull/3420

Differential Revision: D6838284

Pulled By: siying

fbshipit-source-id: 20893fa28d40fadc31e4ff154bed02f5a1bad341
2018-01-29 14:27:32 -08:00
Mark Isaacson
b8eb32f8cf Suppress lint in old files
Summary: Grandfather in super old lint issues to make a clean slate for moving forward that allows us to have stronger enforcement on new issues.

Reviewed By: yiwu-arbug

Differential Revision: D6821806

fbshipit-source-id: 22797d31ec58e9eb0255d3b66fedfcfcb0dc127c
2018-01-29 12:56:42 -08:00
Andrew Kryczka
9f7ccc8445 fix db_bench filluniquerandom key count assertion
Summary:
It failed every time. I guess people usually ran with assertions disabled.
Closes https://github.com/facebook/rocksdb/pull/3422

Differential Revision: D6822984

Pulled By: ajkr

fbshipit-source-id: 2e90db75618b26ac1c46ddfa9e03c095c7bf16e3
2018-01-29 11:43:21 -08:00
Andrew Kryczka
0e6e405fec db_bench support for memtable in-place update
Summary: Closes https://github.com/facebook/rocksdb/pull/3416

Differential Revision: D6820606

Pulled By: ajkr

fbshipit-source-id: 5035ffb33ade8d50520cafeb685ee8c8fcf1cca8
2018-01-26 10:57:49 -08:00
Siying Dong
47ad6b81ff Add 5.10.fb to tools/check_format_compatible.sh
Summary: Closes https://github.com/facebook/rocksdb/pull/3383

Differential Revision: D6762375

Pulled By: siying

fbshipit-source-id: dc1e0dc9718ffb59ffe42e2a2c844b67f935a5fb
2018-01-19 12:42:07 -08:00
Anand Ananthabhotla
199405192d Add a BlockBasedTableOption to turn off index block compression.
Summary:
Add a new bool option index_uncompressed in BlockBasedTableOptions.
Closes https://github.com/facebook/rocksdb/pull/3303

Differential Revision: D6686161

Pulled By: anand1976

fbshipit-source-id: 748b46993d48a01e5f89b6bd3e41f06a59ec6054
2018-01-10 15:11:59 -08:00
Yi Wu
46ec52499e Fix db_bench write being disabled in lite build
Summary:
The macro was added by mistake in #2372
Closes https://github.com/facebook/rocksdb/pull/3343

Differential Revision: D6681356

Pulled By: yiwu-arbug

fbshipit-source-id: 4180172fb0eaef4189c07f219241e0c261c03461
2018-01-09 10:57:29 -08:00
Maysam Yabandeh
00b33c2474 WritePrepared Txn: address some pending TODOs
Summary:
This patch addresses a couple of minor TODOs for WritePrepared Txn such as double checking some assert statements at runtime as well, skip extra AddPrepared in non-2pc transactions, and safety check for infinite loops.
Closes https://github.com/facebook/rocksdb/pull/3302

Differential Revision: D6617002

Pulled By: maysamyabandeh

fbshipit-source-id: ef6673c139cb49f64c0879508d2f573b78609aca
2018-01-09 08:57:20 -08:00
yingsu00
f54d7f5fea Port 3 way SSE4.2 crc32c implementation from Folly
Summary:
**# Summary**

RocksDB uses SSE crc32 intrinsics to calculate the crc32 values but it does it in single way fashion (not pipelined on single CPU core). Intel's whitepaper () published an algorithm that uses 3-way pipelining for the crc32 intrinsics, then use pclmulqdq intrinsic to combine the values. Because pclmulqdq has overhead on its own, this algorithm will show perf gains on buffers larger than 216 bytes, which makes RocksDB a perfect user, since most of the buffers RocksDB call crc32c on is over 4KB. Initial db_bench show tremendous CPU gain.

This change uses the 3-way SSE algorithm by default. The old SSE algorithm is now behind a compiler tag NO_THREEWAY_CRC32C. If user compiles the code with NO_THREEWAY_CRC32C=1 then the old SSE Crc32c algorithm would be used. If the server does not have SSE4.2 at the run time the slow way (Non SSE) will be used.

**# Performance Test Results**
We ran the FillRandom and ReadRandom benchmarks in db_bench. ReadRandom is the point of interest here since it calculates the CRC32 for the in-mem buffers. We did 3 runs for each algorithm.

Before this change the CRC32 value computation takes about 11.5% of total CPU cost, and with the new 3-way algorithm it reduced to around 4.5%. The overall throughput also improved from 25.53MB/s to 27.63MB/s.

1) ReadRandom in db_bench overall metrics

    PER RUN
    Algorithm | run | micros/op | ops/sec |Throughput (MB/s)
    3-way      |  1   | 4.143   | 241387 | 26.7
    3-way      |  2   | 3.775   | 264872 | 29.3
    3-way      | 3    | 4.116   | 242929 | 26.9
    FastCrc32c|1  | 4.037   | 247727 | 27.4
    FastCrc32c|2  | 4.648   | 215166 | 23.8
    FastCrc32c|3  | 4.352   | 229799 | 25.4

     AVG
    Algorithm     |    Average of micros/op |   Average of ops/sec |    Average of Throughput (MB/s)
    3-way           |     4.01                               |      249,729                 |      27.63
    FastCrc32c  |     4.35                              |     230,897                  |      25.53

 2)   Crc32c computation CPU cost (inclusive samples percentage)
    PER RUN
    Implementation | run |  TotalSamples   | Crc32c percentage
    3-way                 |  1    |  4,572,250,000 | 4.37%
    3-way                 |  2    |  3,779,250,000 | 4.62%
    3-way                 |  3    |  4,129,500,000 | 4.48%
    FastCrc32c       |  1    |  4,663,500,000 | 11.24%
    FastCrc32c       |  2    |  4,047,500,000 | 12.34%
    FastCrc32c       |  3    |  4,366,750,000 | 11.68%

 **# Test Plan**
     make -j64 corruption_test && ./corruption_test
      By default it uses 3-way SSE algorithm

     NO_THREEWAY_CRC32C=1 make -j64 corruption_test && ./corruption_test

    make clean && DEBUG_LEVEL=0 make -j64 db_bench
    make clean && DEBUG_LEVEL=0 NO_THREEWAY_CRC32C=1 make -j64 db_bench
Closes https://github.com/facebook/rocksdb/pull/3173

Differential Revision: D6330882

Pulled By: yingsu00

fbshipit-source-id: 8ec3d89719533b63b536a736663ca6f0dd4482e9
2017-12-19 18:26:49 -08:00
Maysam Yabandeh
95583e1532 db_stress: skip snapshot check if cf is dropped
Summary:
We added a new verification that ensures a value that snapshot reads when is released is the same as when it was created. This test however fails when the cf is dropped in between. The patch skips the tests if that was the case.
Closes https://github.com/facebook/rocksdb/pull/3279

Differential Revision: D6581584

Pulled By: maysamyabandeh

fbshipit-source-id: afe37d371c0f91818d2e279b3949b810e112e8eb
2017-12-15 16:28:04 -08:00
Maysam Yabandeh
cd2e5cae7f WritePrepared Txn: make db_stress transactional
Summary:
Add "--use_txn" option to use transactional API in db_stress, default being WRITE_PREPARED policy, which is the main intention of modifying db_stress. It also extend the existing snapshots to verify that before releasing a snapshot a read from it returns the same value as before.
Closes https://github.com/facebook/rocksdb/pull/3243

Differential Revision: D6556912

Pulled By: maysamyabandeh

fbshipit-source-id: 1ae31465be362d44bd06e635e2e9e49a1da11268
2017-12-13 11:57:29 -08:00
Yi Wu
e1c569c324 Fix clang-analyzer false-positive on ldb_cmd.cc
Summary:
clang-analyzer complaint about db_ being nullptr, but it couldn't be because it checks exec_stats before proceed. Add an assert to get around the false-positive.

Test Plan
`make analyze`
Closes https://github.com/facebook/rocksdb/pull/3236

Differential Revision: D6505417

Pulled By: yiwu-arbug

fbshipit-source-id: e5b65764ea994dd9e4bab3e697b97dc70dc22cab
2017-12-06 22:58:46 -08:00
Sagar Vemuri
d51fcb21f4 Blob DB: Add db_bench options
Summary:
Adding more BlobDB db_bench options which are needed for benchmarking.
Closes https://github.com/facebook/rocksdb/pull/3230

Differential Revision: D6500711

Pulled By: sagar0

fbshipit-source-id: 91d63122905854ef7c9148a0235568719146e6c5
2017-12-06 20:44:12 -08:00
Yi Wu
7f04af32a5 ldb to allow db with --try_load_options and without an options file
Summary:
This is to fix tools/check_format_compatible.sh. The tool try to open
old versions of rocksdb with the provided options file. When options
file is missing (e.g. rocksdb 2.2), it should still proceed with default
options.
Closes https://github.com/facebook/rocksdb/pull/3232

Differential Revision: D6503955

Pulled By: yiwu-arbug

fbshipit-source-id: e44cfcce7ddc7d12cf83466ed3f3fe7624aa78b8
2017-12-06 16:42:26 -08:00
Yi Wu
b5798bd324 Add missing recent versions to format compatible test
Summary:
Add recent versions for format compatible test. We should probably update the script to auto include available versions (by looking at include/rocksdb/versions.h and deduce branch names), but we can do it later.
Closes https://github.com/facebook/rocksdb/pull/3233

Differential Revision: D6503631

Pulled By: yiwu-arbug

fbshipit-source-id: e2b01d1ef6e784ff6ffa1bd75d741755e3c69a8c
2017-12-06 16:13:50 -08:00
Andrew Kryczka
63f1c0a57d fix gflags namespace
Summary:
I started adding gflags support for cmake on linux and got frustrated that I'd need to duplicate the build_detect_platform logic, which determines namespace based on attempting compilation. We can do it differently -- use the GFLAGS_NAMESPACE macro if available, and if not, that indicates it's an old gflags version without configurable namespace so we can simply hardcode "google".
Closes https://github.com/facebook/rocksdb/pull/3212

Differential Revision: D6456973

Pulled By: ajkr

fbshipit-source-id: 3e6d5bde3ca00d4496a120a7caf4687399f5d656
2017-12-01 10:42:05 -08:00
Maysam Yabandeh
18dcf7f98d WritePrepared Txn: PreReleaseCallback
Summary:
Add PreReleaseCallback to be called at the end of WriteImpl but before publishing the sequence number. The callback is used in WritePrepareTxn to i) update the commit map, ii) update the last published sequence number in the 2nd write queue. It also ensures that all the commits will go to the 2nd queue.
These changes will ensure that the commit map is updated before the sequence number is published and used by reading snapshots. If we use two write queues, the snapshots will use the seq number published by the 2nd queue. If we use one write queue (the default, the snapshots will use the last seq number in the memtable, which also indicates the last published seq number.
Closes https://github.com/facebook/rocksdb/pull/3205

Differential Revision: D6438959

Pulled By: maysamyabandeh

fbshipit-source-id: f8b6c434e94bc5f5ab9cb696879d4c23e2577ab9
2017-11-30 23:50:45 -08:00
Andrew Kryczka
ed3af9ef99 improve ldb CLI option support
Summary:
- Made CLI arguments take precedence over options file when both are provided. Note some of the CLI args are not settable via options file, like `--compression_max_dict_bytes`, so it's necessary to allow both ways of providing options simultaneously.
- Changed `PrepareOptionsForOpenDB` to update the proper `ColumnFamilyOptions` if one exists for the user's `--column_family_name` argument. I supported this only in the base class, `LDBCommand`, so it works for the general arguments. Will defer adding support for subcommand-specific arguments.
- Made the command fail if `--try_load_options` is provided and loading options file returns NotFound. I found the previous behavior of silently continuing confusing.
Closes https://github.com/facebook/rocksdb/pull/3144

Differential Revision: D6270544

Pulled By: ajkr

fbshipit-source-id: 7c2eac9f9b38720523d74466fb9e78db53561367
2017-11-28 17:28:58 -08:00
Prashant D
c1ed005a21 tools: Fix coverity issues
Summary:
tools/ldb_cmd.cc:
```
310  ignore_unknown_options_ = IsFlagPresent(flags, ARG_IGNORE_UNKNOWN_OPTIONS);

CID 1322798 (#1 of 1): Uninitialized pointer field (UNINIT_CTOR)
5. uninit_member: Non-static class member db_ttl_ is not initialized in this constructor nor in any functions that it calls.
311}
```
Closes https://github.com/facebook/rocksdb/pull/3122

Differential Revision: D6428576

Pulled By: sagar0

fbshipit-source-id: d77f04dd201f7f1d9f59ef88a215ee7ad7b934e9
2017-11-28 15:27:41 -08:00
Sagar Vemuri
8954f830a0 Blob DB: db_bench flag to control BlobDB's garbage collection
Summary:
flag: blob_db_enable_gc, to control BlobDb's enable_garbage_collection.
Closes https://github.com/facebook/rocksdb/pull/3190

Differential Revision: D6383395

Pulled By: sagar0

fbshipit-source-id: 4134e835150748c425b8187264273a54c6d8381c
2017-11-20 23:26:15 -08:00
Yi Wu
9871ea4357 Regression test build binaries with PORTABLE=1
Summary:
We hit "Illegal instruction" error in regression test with "shlx" instruction. Setting PORTABLE=1 to resolve it.
Closes https://github.com/facebook/rocksdb/pull/3165

Differential Revision: D6321972

Pulled By: yiwu-arbug

fbshipit-source-id: cc9fe0dbd4698d1b66a750a0b062f66899862719
2017-11-13 21:26:24 -08:00
Andrew Kryczka
114896c4e0 db_bench compression options
Summary:
- moved existing compression options to `InitializeOptionsGeneral` since they cannot be set through options file
- added flag for `zstd_max_train_bytes` which was recently introduced by #3057
Closes https://github.com/facebook/rocksdb/pull/3128

Differential Revision: D6240460

Pulled By: ajkr

fbshipit-source-id: 27dbebd86a55de237ba6a45cc79cff9214e82ebc
2017-11-07 14:00:03 -08:00
Andrew Kryczka
65c95d9c59 support db_bench compact benchmark on bottommost files
Summary:
Without this option, running the compact benchmark on a DB containing only bottommost files simply returned immediately.
Closes https://github.com/facebook/rocksdb/pull/3138

Differential Revision: D6256660

Pulled By: ajkr

fbshipit-source-id: e3b64543acd503d821066f4200daa201d4fb3a9d
2017-11-07 10:57:24 -08:00
Andrew Kryczka
4d43c6a6a4 db_stress snapshot compatibility with reopens
Summary:
- Release all snapshots before crashing and reopening the DB. Without this, we may attempt to release snapshots from an old DB using a new DB. That tripped an assertion.
- Release multiple snapshots in the same operation if needed. Without this, we would sometimes leak snapshots.
Closes https://github.com/facebook/rocksdb/pull/3098

Differential Revision: D6194923

Pulled By: ajkr

fbshipit-source-id: b9c89bcca7ebcbb6c7802c616f9d1175a005aadf
2017-10-31 01:26:08 -07:00
Andrew Kryczka
d75793d6b4 db_stress support long-held snapshots
Summary:
Add options to `db_stress` (correctness testing tool) to randomly acquire snapshot and release it after some period of time. It's useful for correctness testing of #3009, as well as other parts of compaction that behave differently depending on which snapshots are held.
Closes https://github.com/facebook/rocksdb/pull/3038

Differential Revision: D6086501

Pulled By: ajkr

fbshipit-source-id: 3ec0d8666c78ac507f1f808887c4ff759ba9b865
2017-10-20 15:26:59 -07:00
Dmitri Smirnov
ebab2e2d42 Enable MSVC W4 with a few exceptions. Fix warnings and bugs
Summary: Closes https://github.com/facebook/rocksdb/pull/3018

Differential Revision: D6079011

Pulled By: yiwu-arbug

fbshipit-source-id: 988a721e7e7617967859dba71d660fc69f4dff57
2017-10-19 10:57:12 -07:00
Andrew Kryczka
731895214b db_bench randomtransaction print throughput
Summary:
print throughput in MB/s upon finishing randomtransaction benchmark
Closes https://github.com/facebook/rocksdb/pull/3016

Differential Revision: D6070426

Pulled By: ajkr

fbshipit-source-id: 69df43beed4c374a36d826e761ca3a83e1fdcbf5
2017-10-16 18:42:25 -07:00
Andrew Kryczka
1026e794a3 rate limit auto-tuning
Summary:
Dynamic adjustment of rate limit according to demand for background I/O. It increases by a factor when limiter is drained too frequently, and decreases by the same factor when limiter is not drained frequently enough. The parameters for this behavior are fixed in `GenericRateLimiter::Tune`. Other changes:

- make rate limiter's `Env*` configurable for testing
- track num drain intervals in RateLimiter so we don't have to rely on stats, which may be shared across different DB instances from the ones that share the RateLimiter.
Closes https://github.com/facebook/rocksdb/pull/2899

Differential Revision: D5858704

Pulled By: ajkr

fbshipit-source-id: cc2bac30f85e7f6fd63655d0a6732ef9ed7403b1
2017-10-04 19:15:01 -07:00
Siying Dong
2a3363d52e ldb dump can print histogram of value size
Summary:
Make "ldb dump --count_only" print histogram of value size. Also, fix a bug that "ldb dump --path=<db_path>" doesn't work.
Closes https://github.com/facebook/rocksdb/pull/2944

Differential Revision: D5954527

Pulled By: siying

fbshipit-source-id: c620a444ec544258b8d113f5f663c375dd53d6be
2017-10-02 09:41:17 -07:00
Andrew Kryczka
8fc3de3c62 make rate limiter a general option
Summary:
it's unsupported in options file, so the flag should be respected by db_bench even when an options file is provided.
Closes https://github.com/facebook/rocksdb/pull/2910

Differential Revision: D5869836

Pulled By: ajkr

fbshipit-source-id: f67f591ae083e95e989f86b6fad50765d2e3d855
2017-09-21 11:11:00 -07:00
Amy Xu
5785b1fcb8 Fix naming in InternalKey
Summary:
- Switched all instances of SetMinPossibleForUserKey and SetMaxPossibleForUserKey in accordance to InternalKeyComparator's comparison logic
Closes https://github.com/facebook/rocksdb/pull/2868

Differential Revision: D5804152

Pulled By: axxufb

fbshipit-source-id: 80be35e04f2e8abc35cc64abe1fecb03af24e183
2017-09-12 17:17:42 -07:00
Kamalalochana Subbaiah
e612e31740 Updated CRC32 Power Optimization Changes
Summary:
Support for PowerPC Architecture
Detecting AltiVec Support
Closes https://github.com/facebook/rocksdb/pull/2716

Differential Revision: D5606836

Pulled By: siying

fbshipit-source-id: 720262453b1546e5fdbbc668eff56848164113f3
2017-08-31 14:16:30 -07:00
Dmitri Smirnov
8fa4d108a2 Try to switch to Stduio 2017
Summary: Closes https://github.com/facebook/rocksdb/pull/2802

Differential Revision: D5746710

Pulled By: siying

fbshipit-source-id: daa621ba5fccb84c0d6cdb7755c5e09319c45cb4
2017-08-31 10:30:27 -07:00
Sagar Vemuri
06b37eef7b Set defaults for high-pri and low-pri thread pools in regression test script
Summary:
**Summary**:
Set defaults for high-pri and low-pri thread pools in regression test script.

**Reason for this change**:
With #2680 , high-pri and low-pri thread pools get different numbers than before if  `num_high_pri_threads` and `num_low_pri_threads` options are not explicitly passed to db_bench in regression test script ... leading to a false-positive regression.

**Test Plan**:
REMOTE_HOST=udb1671.prn3 TEST_MODE=1 FBSOURCE=~/fbsource ~/fbsource/fbcode/rocks/tools/debug_regression_test.sh viewstate  (with very minor changes to the internals).

Observe P50 and P99 which showed up as regressions in our graphs.

Stats with the commit prior to #2680 , ie. 4f81ab3 :
seekrandomwhilewriting :      75.096 micros/op 13316 ops/sec;  168.6 MB/s (7499074 of 7500000 found)
Microseconds per seek:
Count: 120000000 Average: 1197.7254  StdDev: 33.35
Min: 187  Median: 980.5292  Max: 1816424
Percentiles: **P50: 980.53** P75: 1494.57 **P99: 4185.64** P99.9: 7800.11 P99.99: 15039.64

Stats at #2680, ie. at commit dce6d5a (false-positive regression):
seekrandomwhilewriting :      85.330 micros/op 11719 ops/sec;  148.4 MB/s (7499073 of 7500000 found)
Microseconds per seek:
Count: 120000000 Average: 1362.3261  StdDev: 27.86
Min: 185  Median: 1088.1915  Max: 652760
Percentiles: **P50: 1088.19** P75: 1658.12 **P99: 5361.15** P99.9: 7997.95 P99.99: 11730.07

Stats with the current change on top of dce6d5a :
seekrandomwhilewriting :      77.780 micros/op 12856 ops/sec;  162.8 MB/s (7499102 of 7500000 found)
Microseconds per seek:
Count: 120000000 Average: 1226.6744  StdDev: 17.16
Min: 185  Median: 994.2956  Max: 2553530
Percentiles: **P50: 994.30** P75: 1513.68 **P99: 4284.30** P99.9: 9338.64 P99.99: 23008.86
Closes https://github.com/facebook/rocksdb/pull/2801

Differential Revision: D5742338

Pulled By: sagar0

fbshipit-source-id: cc5d727c1a131f2a7070d1bb892efbe929b976ff
2017-08-30 17:29:34 -07:00
Andrew Kryczka
7fbb9eccaf support disabling checksum in block-based table
Summary:
store a zero as the checksum when disabled since it's easier to keep block trailer a fixed length.
Closes https://github.com/facebook/rocksdb/pull/2781

Differential Revision: D5694702

Pulled By: ajkr

fbshipit-source-id: 69cea9da415778ba2b600dfd9d0dfc8cb5188ecd
2017-08-23 19:40:47 -07:00
yiwu-arbug
c1384a7076 fix db_stress uint64_t to int32 cast
Summary:
Clang complain about an cast from uint64_t to int32 in db_stress. Fixing it.
Closes https://github.com/facebook/rocksdb/pull/2755

Differential Revision: D5655947

Pulled By: yiwu-arbug

fbshipit-source-id: cfac10e796e0adfef4727090b50975b0d6e2c9be
2017-08-17 17:56:55 -07:00
Andrew Kryczka
23593171c4 minor improvements to db_stress
Summary:
fix some things that made this command hard to use from CLI:

- use default values for `target_file_size_base` and `max_bytes_for_level_base`. previously we were using small values for these but default value of `write_buffer_size`, which led to enormous number of L1 files.
- failure message for `value_size_mult` too big. previously there was just an assert, so in non-debug mode it'd overrun the value buffer and crash mysteriously.
- only print verification success if there's no failure. before it'd print both in the failure case.
- support `memtable_prefix_bloom_size_ratio`
- support `num_bottom_pri_threads` (universal compaction)
Closes https://github.com/facebook/rocksdb/pull/2741

Differential Revision: D5629495

Pulled By: ajkr

fbshipit-source-id: ddad97d6d4ba0884e7c0f933b0a359712514fc1d
2017-08-16 19:13:01 -07:00
Andrew Kryczka
7aa96db7a2 db_stress rolling active window
Summary:
Support a window of `active_width` keys that rolls through `[0, max_key)` over the duration of the test. Operations only affect keys inside the window. This gives us the ability to detect L0->L0 deletion bug (#2722).
Closes https://github.com/facebook/rocksdb/pull/2739

Differential Revision: D5628555

Pulled By: ajkr

fbshipit-source-id: 9cb2d8f4ab1a7c73f7797b8e19f7094970ea8749
2017-08-15 12:02:16 -07:00
Andrew Kryczka
8254e9b57c make sst_dump compression size command consistent
Summary:
- like other subcommands, reporting compression sizes should be specified with the `--command` CLI arg.
- also added `--compression_types` arg as it's useful to restrict the types of compression used, at least in my dictionary compression experiments.
Closes https://github.com/facebook/rocksdb/pull/2706

Differential Revision: D5589520

Pulled By: ajkr

fbshipit-source-id: 305bb4ebcc95eecc8a85523cd3b1050619c9ddc5
2017-08-11 16:03:44 -07:00
Andrew Kryczka
74f18c1301 db_bench support for non-uniform column family ops
Summary:
Previously we could only select the CF on which to operate uniformly at random. This is a limitation, e.g., when testing universal compaction as all CFs would need to run full compaction at roughly the same time, which isn't realistic.

This PR allows the user to specify the probability distribution for selecting CFs via the `--column_family_distribution` argument.
Closes https://github.com/facebook/rocksdb/pull/2677

Differential Revision: D5544436

Pulled By: ajkr

fbshipit-source-id: 478d56260995236ae90895ce5bd51f38882e185a
2017-08-11 13:57:17 -07:00
Siying Dong
666a005f9b Support prefetch last 512KB with direct I/O in block based file reader
Summary:
Right now, if direct I/O is enabled, prefetching the last 512KB cannot be applied, except compaction inputs or readahead is enabled for iterators. This can create a lot of I/O for HDD cases. To solve the problem, the 512KB is prefetched in block based table if direct I/O is enabled. The prefetched buffer is passed in totegher with random access file reader, so that we try to read from the buffer before reading from the file. This can be extended in the future to support flexible user iterator readahead too.
Closes https://github.com/facebook/rocksdb/pull/2708

Differential Revision: D5593091

Pulled By: siying

fbshipit-source-id: ee36ff6d8af11c312a2622272b21957a7b5c81e7
2017-08-11 12:16:45 -07:00
Siying Dong
b87ee6f773 Use more keys per lock in daily TSAN crash test
Summary:
TSAN shows error when we grab too many locks at the same time. In TSAN crash test, make one shard key cover 2^22 keys so that no many keys will be hold at the same time.
Closes https://github.com/facebook/rocksdb/pull/2719

Differential Revision: D5609035

Pulled By: siying

fbshipit-source-id: 930e5d63fff92dbc193dc154c4c615efbdf06c6a
2017-08-10 17:56:57 -07:00
Aaron G
7848f0b24c add VerifyChecksum() to db.h
Summary:
We need a tool to check any sst file corruption in the db.
It will check all the sst files in current version and read all the blocks (data, meta, index) with checksum verification. If any verification fails, the function will return non-OK status.
Closes https://github.com/facebook/rocksdb/pull/2498

Differential Revision: D5324269

Pulled By: lightmark

fbshipit-source-id: 6f8a272008b722402a772acfc804524c9d1a483b
2017-08-09 15:58:13 -07:00
Andrew Kryczka
dce6d5a838 db_bench background work thread pool size arguments
Summary:
The background thread pools' sizes weren't easily configurable by `max_background_compactions` and `max_background_flushes` in multi-instance setups. Introduced separate arguments for their sizes.
Closes https://github.com/facebook/rocksdb/pull/2680

Differential Revision: D5550675

Pulled By: ajkr

fbshipit-source-id: bab5f0a7bc5db63bb084d0c10facbe437096367d
2017-08-03 21:41:49 -07:00
Alan Somers
5883a1ae24 Fix /bin/bash shebangs
Summary:
"/bin/bash" is a Linuxism.  "/usr/bin/env bash" is portable.
Closes https://github.com/facebook/rocksdb/pull/2646

Differential Revision: D5556259

Pulled By: ajkr

fbshipit-source-id: cbffd38ecdbfffb2438969ec007ab345ed893ccb
2017-08-03 15:56:46 -07:00
Andrew Kryczka
cc01985db0 Introduce bottom-pri thread pool for large universal compactions
Summary:
When we had a single thread pool for compactions, a thread could be busy for a long time (minutes) executing a compaction involving the bottom level. In multi-instance setups, the entire thread pool could be consumed by such bottom-level compactions. Then, top-level compactions (e.g., a few L0 files) would be blocked for a long time ("head-of-line blocking"). Such top-level compactions are critical to prevent compaction stalls as they can quickly reduce number of L0 files / sorted runs.

This diff introduces a bottom-priority queue for universal compactions including the bottom level. This alleviates the head-of-line blocking situation for fast, top-level compactions.

- Added `Env::Priority::BOTTOM` thread pool. This feature is only enabled if user explicitly configures it to have a positive number of threads.
- Changed `ThreadPoolImpl`'s default thread limit from one to zero. This change is invisible to users as we call `IncBackgroundThreadsIfNeeded` on the low-pri/high-pri pools during `DB::Open` with values of at least one. It is necessary, though, for bottom-pri to start with zero threads so the feature is disabled by default.
- Separated `ManualCompaction` into two parts in `PrepickedCompaction`. `PrepickedCompaction` is used for any compaction that's picked outside of its execution thread, either manual or automatic.
- Forward universal compactions involving last level to the bottom pool (worker thread's entry point is `BGWorkBottomCompaction`).
- Track `bg_bottom_compaction_scheduled_` so we can wait for bottom-level compactions to finish. We don't count them against the background jobs limits. So users of this feature will get an extra compaction for free.
Closes https://github.com/facebook/rocksdb/pull/2580

Differential Revision: D5422916

Pulled By: ajkr

fbshipit-source-id: a74bd11f1ea4933df3739b16808bb21fcd512333
2017-08-03 15:43:29 -07:00
Andrew Kryczka
060ccd4f84 support multiple CFs with OPTIONS file
Summary:
Move an option necessary for running db_bench on multiple CFs into the general initialization area, so it works with both flag-based init and OPTIONS-based init.
Closes https://github.com/facebook/rocksdb/pull/2675

Differential Revision: D5541378

Pulled By: ajkr

fbshipit-source-id: 169926cb4ae95c17974f744faf7cc794d41e5c0a
2017-08-02 16:27:01 -07:00
Yi Wu
1900771bd2 Dump Blob DB options to info log
Summary:
* Dump blob db options to info log
* Remove BlobDBOptionsImpl to disallow dynamic cast *BlobDBOptions into *BlobDBOptionsImpl. Move options there to be constants or into BlobDBOptions. The dynamic cast is broken after #2645
* Change some of the default options
* Remove blob_db_options.min_blob_size, which is unimplemented. Will implement it soon.
Closes https://github.com/facebook/rocksdb/pull/2671

Differential Revision: D5529912

Pulled By: yiwu-arbug

fbshipit-source-id: dcd58ca981db5bcc7f123b65a0d6f6ae0dc703c7
2017-08-01 13:01:47 -07:00
Siying Dong
21696ba502 Replace dynamic_cast<>
Summary:
Replace dynamic_cast<> so that users can choose to build with RTTI off, so that they can save several bytes per object, and get tiny more memory available.
Some nontrivial changes:
1. Add Comparator::GetRootComparator() to get around the internal comparator hack
2. Add the two experiemental functions to DB
3. Add TableFactory::GetOptionString() to avoid unnecessary casting to get the option string
4. Since 3 is done, move the parsing option functions for table factory to table factory files too, to be symmetric.
Closes https://github.com/facebook/rocksdb/pull/2645

Differential Revision: D5502723

Pulled By: siying

fbshipit-source-id: fd13cec5601cf68a554d87bfcf056f2ffa5fbf7c
2017-07-28 16:27:16 -07:00
Andrew Kryczka
f33f113683 fix db_bench argument type
Summary:
it should be a bool
Closes https://github.com/facebook/rocksdb/pull/2653

Differential Revision: D5506148

Pulled By: ajkr

fbshipit-source-id: f142f0f3aa8b678c68adef12e5ac6e1e163306f3
2017-07-27 12:14:37 -07:00
Siying Dong
c281b44829 Revert "CRC32 Power Optimization Changes"
Summary:
This reverts commit 2289d38115.
Closes https://github.com/facebook/rocksdb/pull/2652

Differential Revision: D5506163

Pulled By: siying

fbshipit-source-id: 105e31dd9d99090453a6b9f32c165206cd3affa3
2017-07-26 19:31:36 -07:00
Kamalalochana Subbaiah
2289d38115 CRC32 Power Optimization Changes
Summary:
Support for PowerPC Architecture
Detecting AltiVec Support
Closes https://github.com/facebook/rocksdb/pull/2353

Differential Revision: D5210948

Pulled By: siying

fbshipit-source-id: 859a8c063d37697addd89ba2b8a14e5efd5d24bf
2017-07-26 09:42:29 -07:00
Sagar Vemuri
72502cf227 Revert "comment out unused parameters"
Summary:
This reverts the previous commit 1d7048c598, which broke the build.

Did a `git revert 1d7048c`.
Closes https://github.com/facebook/rocksdb/pull/2627

Differential Revision: D5476473

Pulled By: sagar0

fbshipit-source-id: 4756ff5c0dfc88c17eceb00e02c36176de728d06
2017-07-21 18:26:26 -07:00
Victor Gao
1d7048c598 comment out unused parameters
Summary: This uses `clang-tidy` to comment out unused parameters (in functions, methods and lambdas) in fbcode. Cases that the tool failed to handle are fixed manually.

Reviewed By: igorsugak

Differential Revision: D5454343

fbshipit-source-id: 5dee339b4334e25e963891b519a5aa81fbf627b2
2017-07-21 14:57:44 -07:00
Maysam Yabandeh
2f375154ea checkout local branch in check_format_compatible.sh
Summary:
For forward_compatible_checkout_objs the local branch is already created in previous step. This patch avoid recreating it. This should address "fatal: A branch named '3.10.fb' already exists." errors.
Closes https://github.com/facebook/rocksdb/pull/2606

Differential Revision: D5443786

Pulled By: maysamyabandeh

fbshipit-source-id: 69d5a67b87677429cf36e3a467bd114d341f3b9c
2017-07-18 10:42:17 -07:00
Maysam Yabandeh
ddb22ac59c avoid collision with master branch in check format
Summary:
The new local branch specified with -b cannot be called master. Use tmp prefix to avoid name collision.
Closes https://github.com/facebook/rocksdb/pull/2600

Differential Revision: D5442944

Pulled By: maysamyabandeh

fbshipit-source-id: 4a623d9b21d6cc01bee812b2799790315bdf5f6e
2017-07-18 08:27:33 -07:00
Maysam Yabandeh
0c03a7f17d set the remote for git checkout
Summary:
This will fix the error: "error: pathspec '2.2.fb.branch' did not match any file(s) known to git."

Tested by manually sshing to sandcastle and running the command.
Closes https://github.com/facebook/rocksdb/pull/2599

Differential Revision: D5441130

Pulled By: maysamyabandeh

fbshipit-source-id: a22fd6a52221471bafbba8990394b499535e5812
2017-07-17 19:41:50 -07:00
Chris Lamb
b2dd192fed tools/write_stress.cc: Correct "1204" typos.
Summary:
Should be 1024, obviously :)
Closes https://github.com/facebook/rocksdb/pull/2592

Differential Revision: D5435269

Pulled By: ajkr

fbshipit-source-id: c59338a3900798a4733f0b205e534f21215cf049
2017-07-17 11:27:10 -07:00
Siying Dong
3c327ac2d0 Change RocksDB License
Summary: Closes https://github.com/facebook/rocksdb/pull/2589

Differential Revision: D5431502

Pulled By: siying

fbshipit-source-id: 8ebf8c87883daa9daa54b2303d11ce01ab1f6f75
2017-07-15 16:11:23 -07:00
Daniel Black
67510eeff3 db_crashtest.py: remove need for shell
Summary:
Before:
$ ps -ef

build     1713    16  0 Jul11 ?        00:00:00 make crash_test
build     3437  1713  0 Jul11 ?        00:00:00 python -u tools/db_crashtest.py --simple blackbox
build     3438  3437  0 Jul11 ?        00:00:00 [sh] <defunct>
build     3440     1 99 Jul11 ?        5-03:01:25 ./db_stress --max_background_compactions=1 --max_write_buffer_number=3 --sync=0 --reopen=20 --write_buffer_size=33554432 --delpercent=5 --block_size=16384 --allow_concurrent_me

After:

build      1706     16  0 02:52 ?        00:00:01 make crash_test
build      3432   1706  0 02:55 ?        00:00:00 python -u tools/db_crashtest.py --simple blackbox
build      4452   3432 99 04:35 ?        00:01:42 ./db_stress --max_background_compactions=1 --max_write_buffer_number=3 --sync=0 --reopen=20 --write_buffer_size=33554432 --delpercent=5 --block_size=16384 --allow_concurr
Closes https://github.com/facebook/rocksdb/pull/2571

Differential Revision: D5421580

Pulled By: maysamyabandeh

fbshipit-source-id: d6c3970c38ea0fa23da653f4385e8e25d83f5c9f
2017-07-14 09:11:03 -07:00
Yi Wu
5bfb67d90f Enable write rate limit for updaterandom benchmark
Summary:
We have FLAGS_benchmark_write_rate_limit to limit write rate in
db_bench, but it was not in use for updaterandom benchmark.
Closes https://github.com/facebook/rocksdb/pull/2578

Differential Revision: D5420328

Pulled By: yiwu-arbug

fbshipit-source-id: 5fa48c2b88f2f2dc83d615cb9c40c472bc916835
2017-07-13 16:27:38 -07:00
Siying Dong
98d1a5510b db_bench to by default verify checksum
Summary: Closes https://github.com/facebook/rocksdb/pull/2575

Differential Revision: D5417350

Pulled By: siying

fbshipit-source-id: 4bc11e35a7256167a5a7d2f586f2ac74c0deddb0
2017-07-13 12:14:17 -07:00
Yi Wu
c32f27223b Fixes db_bench with blob db
Summary:
* Create info log before db open to make blob db able to log to LOG file.
* Properly destroy blob db.
Closes https://github.com/facebook/rocksdb/pull/2567

Differential Revision: D5400034

Pulled By: yiwu-arbug

fbshipit-source-id: a49cfaf4b5c67d42d4cbb872bd5a9441828c17ce
2017-07-11 14:14:38 -07:00
Daniel Black
fcd99d27c9 db_bench_tool: fix buffer size
Summary:
Found by gcc warning:

x86_64-pc-linux-gnu-g++ --version
x86_64-pc-linux-gnu-g++ (GCC) 7.1.1 20170710

tools/db_bench_tool.cc: In member function 'void rocksdb::Benchmark::RandomWithVerify(rocksdb::ThreadState*)':
tools/db_bench_tool.cc:4430:8: error: '%lu' directive output may be truncated writing between 1 and 19 bytes into a region of size between 0 and 66 [-Werror=format-truncation=]
   void RandomWithVerify(ThreadState* thread) {
        ^~~~~~~~~~~~~~~~
tools/db_bench_tool.cc:4430:8: note: directive argument in the range [0, 9223372036854775807]
tools/db_bench_tool.cc:4492:13: note: 'snprintf' output between 37 and 128 bytes into a destination of size 100
     snprintf(msg, sizeof(msg),
     ~~~~~~~~^~~~~~~~~~~~~~~~~~
              "( get:%" PRIu64 " put:%" PRIu64 " del:%" PRIu64 " total:%" \
              ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
              PRIu64 " found:%" PRIu64 ")",
              ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
              gets_done, puts_done, deletes_done, readwrites_, found);
              ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
cc1plus: all warnings being treated as errors
Makefile:1707: recipe for target 'tools/db_bench_tool.o' failed
Closes https://github.com/facebook/rocksdb/pull/2558

Differential Revision: D5398703

Pulled By: siying

fbshipit-source-id: 6ffa552bbd8b59cfc2c36289f86ff9b9acca8ca6
2017-07-11 10:41:06 -07:00
Aaron Gao
87128bd5c5 fix regression test
Summary:
delete the checkout line
Closes https://github.com/facebook/rocksdb/pull/2556

Differential Revision: D5394446

Pulled By: lightmark

fbshipit-source-id: 4afaf35e7ad0378d12169d13a3131a92b428b5d2
2017-07-10 17:17:31 -07:00
Andrew Kryczka
657df29eaf Add max_background_jobs to db_bench
Summary:
As titled. Also fixed an off-by-one error causing us to add one less range deletion than the user specified.
Closes https://github.com/facebook/rocksdb/pull/2544

Differential Revision: D5383451

Pulled By: ajkr

fbshipit-source-id: cbd5890c33f09bbb5c0c1f4bb952a1add32336e0
2017-07-07 18:13:21 -07:00
Siying Dong
7c4a9e6c92 Initialize a variable in ldb to make code analysis tool happy
Summary: Closes https://github.com/facebook/rocksdb/pull/2529

Differential Revision: D5378787

Pulled By: siying

fbshipit-source-id: 801ecacc2804f77cb3c9e5e665829e439db68800
2017-07-06 15:56:57 -07:00
Andrew Kryczka
33042573db Fix GetCurrentTime() initialization for valgrind
Summary:
Valgrind had false positive complaints about the initialization pattern for `GetCurrentTime()`'s argument in #2480. We can instead have the client initialize the time variable before calling `GetCurrentTime()`, and have `GetCurrentTime()` promise to only overwrite it in success case.
Closes https://github.com/facebook/rocksdb/pull/2526

Differential Revision: D5358689

Pulled By: ajkr

fbshipit-source-id: 857b189f24c19196f6bb299216f3e23e7bc4be42
2017-07-05 12:12:00 -07:00
Siying Dong
1cb8c6de63 Add -enable_pipelined_write to db_bench and add two defaults
Summary:
Expose pipeline write in db_bench and change the default to parallel memtable inserts
Closes https://github.com/facebook/rocksdb/pull/2527

Differential Revision: D5359825

Pulled By: siying

fbshipit-source-id: e30755feb07ff19a731c4058acf101e02de4e197
2017-06-30 15:27:03 -07:00
Yi Wu
67b417d624 fix format compatible test
Summary:
The comma "," is not a valid separator for bash arrays.
Closes https://github.com/facebook/rocksdb/pull/2516

Differential Revision: D5348101

Pulled By: yiwu-arbug

fbshipit-source-id: 8f0afdac368e21076eb7366b7df7dbaaf158cf96
2017-06-29 10:42:14 -07:00
Mike Kolupaev
397ab11152 Improve Status message for block checksum mismatches
Summary:
We've got some DBs where iterators return Status with message "Corruption: block checksum mismatch" all the time. That's not very informative. It would be much easier to investigate if the error message contained the file name - then we would know e.g. how old the corrupted file is, which would be very useful for finding the root cause. This PR adds file name, offset and other stuff to some block corruption-related status messages.

It doesn't improve all the error messages, just a few that were easy to improve. I'm mostly interested in "block checksum mismatch" and "Bad table magic number" since they're the only corruption errors that I've ever seen in the wild.
Closes https://github.com/facebook/rocksdb/pull/2507

Differential Revision: D5345702

Pulled By: al13n321

fbshipit-source-id: fc8023d43f1935ad927cef1b9c55481ab3cb1339
2017-06-28 21:27:01 -07:00
Sagar Vemuri
1cd45cd1b3 FIFO Compaction with TTL
Summary:
Introducing FIFO compactions with TTL.

FIFO compaction is based on size only which makes it tricky to enable in production as use cases can have organic growth. A user requested an option to drop files based on the time of their creation instead of the total size.

To address that request:
- Added a new TTL option to FIFO compaction options.
- Updated FIFO compaction score to take TTL into consideration.
- Added a new table property, creation_time, to keep track of when the SST file is created.
- Creation_time is set as below:
  - On Flush: Set to the time of flush.
  - On Compaction: Set to the max creation_time of all the files involved in the compaction.
  - On Repair and Recovery: Set to the time of repair/recovery.
  - Old files created prior to this code change will have a creation_time of 0.
- FIFO compaction with TTL is enabled when ttl > 0. All files older than ttl will be deleted during compaction. i.e. `if (file.creation_time < (current_time - ttl)) then delete(file)`. This will enable cases where you might want to delete all files older than, say, 1 day.
- FIFO compaction will fall back to the prior way of deleting files based on size if:
  - the creation_time of all files involved in compaction is 0.
  - the total size (of all SST files combined) does not drop below `compaction_options_fifo.max_table_files_size` even if the files older than ttl are deleted.

This feature is not supported if max_open_files != -1 or with table formats other than Block-based.

**Test Plan:**
Added tests.

**Benchmark results:**
Base: FIFO with max size: 100MB ::
```
svemuri@dev15905 ~/rocksdb (fifo-compaction) $ TEST_TMPDIR=/dev/shm ./db_bench --benchmarks=readwhilewriting --num=5000000 --threads=16 --compaction_style=2 --fifo_compaction_max_table_files_size_mb=100

readwhilewriting :       1.924 micros/op 519858 ops/sec;   13.6 MB/s (1176277 of 5000000 found)
```

With TTL (a low one for testing) ::
```
svemuri@dev15905 ~/rocksdb (fifo-compaction) $ TEST_TMPDIR=/dev/shm ./db_bench --benchmarks=readwhilewriting --num=5000000 --threads=16 --compaction_style=2 --fifo_compaction_max_table_files_size_mb=100 --fifo_compaction_ttl=20

readwhilewriting :       1.902 micros/op 525817 ops/sec;   13.7 MB/s (1185057 of 5000000 found)
```
Example Log lines:
```
2017/06/26-15:17:24.609249 7fd5a45ff700 (Original Log Time 2017/06/26-15:17:24.609177) [db/compaction_picker.cc:1471] [default] FIFO compaction: picking file 40 with creation time 1498515423 for deletion
2017/06/26-15:17:24.609255 7fd5a45ff700 (Original Log Time 2017/06/26-15:17:24.609234) [db/db_impl_compaction_flush.cc:1541] [default] Deleted 1 files
...
2017/06/26-15:17:25.553185 7fd5a61a5800 [DEBUG] [db/db_impl_files.cc:309] [JOB 0] Delete /dev/shm/dbbench/000040.sst type=2 #40 -- OK
2017/06/26-15:17:25.553205 7fd5a61a5800 EVENT_LOG_v1 {"time_micros": 1498515445553199, "job": 0, "event": "table_file_deletion", "file_number": 40}
```

SST Files remaining in the dbbench dir, after db_bench execution completed:
```
svemuri@dev15905 ~/rocksdb (fifo-compaction)  $ ls -l /dev/shm//dbbench/*.sst
-rw-r--r--. 1 svemuri users 30749887 Jun 26 15:17 /dev/shm//dbbench/000042.sst
-rw-r--r--. 1 svemuri users 30768779 Jun 26 15:17 /dev/shm//dbbench/000044.sst
-rw-r--r--. 1 svemuri users 30757481 Jun 26 15:17 /dev/shm//dbbench/000046.sst
```
Closes https://github.com/facebook/rocksdb/pull/2480

Differential Revision: D5305116

Pulled By: sagar0

fbshipit-source-id: 3e5cfcf5dd07ed2211b5b37492eb235b45139174
2017-06-27 17:11:48 -07:00
Yi Wu
dc3d2e4d21 update compatible test
Summary:
update compatible test to include 5.5 and 5.6 branch.
Closes https://github.com/facebook/rocksdb/pull/2501

Differential Revision: D5325220

Pulled By: yiwu-arbug

fbshipit-source-id: 5f5271491e6dd2d7b2cf73a7142f38a571553bc4
2017-06-27 11:00:59 -07:00
Maysam Yabandeh
2c98b06bff Remove pin_slice option by making it the default
Summary:
This would simplify db_bench_tool.cc
Closes https://github.com/facebook/rocksdb/pull/2457

Differential Revision: D5259035

Pulled By: maysamyabandeh

fbshipit-source-id: 0a9c3abda624070fe2650200b885ad7e1c60182c
2017-06-15 16:14:08 -07:00
Maysam Yabandeh
c80c6115de add db_bench options for partitioning
Summary: Closes https://github.com/facebook/rocksdb/pull/2456

Differential Revision: D5259083

Pulled By: maysamyabandeh

fbshipit-source-id: 1ed1746da7a8baadf4772d023d927c6c4e6b112a
2017-06-15 16:14:08 -07:00
Sagar Vemuri
89ad9f3adb Allow ignoring unknown options when loading options from a file
Summary:
Added a flag, `ignore_unknown_options`, to skip unknown options when loading an options file (using `LoadLatestOptions`/`LoadOptionsFromFile`) or while verifying options (using `CheckOptionsCompatibility`). This will help in downgrading the db to an older version.

Also added `--ignore_unknown_options` flag to ldb

**Example Use case:**
In MyRocks, if copying from newer version to older version, it is often impossible to start because of new RocksDB options that don't exist in older version, even though data format is compatible.
MyRocks uses these load and verify functions in [ha_rocksdb.cc::check_rocksdb_options_compatibility](e004fd9f41/storage/rocksdb/ha_rocksdb.cc (L3348-L3401)).

**Test Plan:**
Updated the unit tests.
`make check`

ldb:
$ ./ldb --db=/tmp/test_db --create_if_missing put a1 b1
OK

Now edit /tmp/test_db/<OPTIONS-file> and add an unknown option.

Try loading the options now, and it fails:
$ ./ldb --db=/tmp/test_db --try_load_options get a1
Failed: Invalid argument: Unrecognized option DBOptions:: abcd

Passes with the new --ignore_unknown_options flag
$ ./ldb --db=/tmp/test_db --try_load_options --ignore_unknown_options get a1
b1
Closes https://github.com/facebook/rocksdb/pull/2423

Differential Revision: D5212091

Pulled By: sagar0

fbshipit-source-id: 2ec17636feb47dc0351b53a77e5f15ef7cbf2ca7
2017-06-13 16:58:01 -07:00
Andrew Kryczka
c217e0b9c7 Call RateLimiter for compaction reads
Summary:
Allow users to rate limit background work based on read bytes, written bytes, or sum of read and written bytes. Support these by changing the RateLimiter API, so no additional options were needed.
Closes https://github.com/facebook/rocksdb/pull/2433

Differential Revision: D5216946

Pulled By: ajkr

fbshipit-source-id: aec57a8357dbb4bfde2003261094d786d94f724e
2017-06-13 14:56:46 -07:00
hyunwoo
c7662a44a4 fixed typo
Summary:
fixed typo
Closes https://github.com/facebook/rocksdb/pull/2376

Differential Revision: D5183630

Pulled By: ajkr

fbshipit-source-id: 133cfd0445959e70aa2cd1a12151bf3c0c5c3ac5
2017-06-05 11:27:34 -07:00