Commit Graph

6053 Commits

Author SHA1 Message Date
Yi Wu
8c1f5c254f Fix db_bench build break with blob db
Summary:
Lite build does not recognize FLAGS_use_blob_db. Fixing it.
Closes https://github.com/facebook/rocksdb/pull/2372

Reviewed By: anirbanr-fb

Differential Revision: D5130773

Pulled By: yiwu-arbug

fbshipit-source-id: 43131d9d0be5811f2129af562be72cca26369cb3
2017-05-30 13:40:29 -07:00
Yi Wu
0ccaba2a05 Fix rocksdb.estimate-num-keys DB property underflow
Summary:
rocksdb.estimate-num-keys is computed from `estimate_num_keys - 2 * estimate_num_deletes`. If `2 * estimate_num_deletes > estimate_num_keys`, the subtraction underflows. Fixing it.
Closes https://github.com/facebook/rocksdb/pull/2348

Differential Revision: D5109272

Pulled By: yiwu-arbug

fbshipit-source-id: e1bfb91346a59b7282a282b615002507e9d7c246
2017-05-23 12:27:53 -07:00
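A minimal sketch of the clamping idea behind 0ccaba2a05, under stated assumptions: `EstimateNumKeys` and its parameter names are hypothetical and are not taken from the RocksDB source. It only illustrates how an unsigned `entries - 2 * deletes` can be guarded against wrapping around.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not the RocksDB implementation.
// Returns entries - 2 * deletes, clamped at zero so the unsigned
// subtraction can never wrap around to a huge value.
std::uint64_t EstimateNumKeys(std::uint64_t entries, std::uint64_t deletes) {
  // Comparing against entries / 2 avoids computing 2 * deletes up front,
  // which could itself overflow for very large delete counts.
  if (deletes > entries / 2) {
    return 0;
  }
  std::uint64_t result = entries - 2 * deletes;
  assert(result <= entries);  // sanity check: no wraparound happened
  return result;
}
```

With 10 entries and 6 deletes, the unguarded expression would wrap to a value near 2^64; the guarded version reports 0 instead.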
Sagar Vemuri
4646cd45c4 Bump version to 5.4.5 and update HISTORY.md 2017-05-19 13:25:59 -07:00
Adam Retter
2d6abcade4 Facility for cross-building RocksJava using Docker
Summary:
As an alternative to Vagrant, we can now also use Docker to cross-build RocksDB. The advantages are:

1. The Docker images are fixed; they include all the latest updates and build tools.
2. The Vagrant image required scripts that ran on every build to update CentOS and install the build tools. This led to slow repeatable builds; we don't need to do this with Docker, as the updates and tools are already in the provided images.

The Docker images I have used have their Docker build files here: https://github.com/evolvedbinary/docker-rocksjava and the images themselves are available from Docker hub: https://hub.docker.com/r/evolvedbinary/rocksjava/

I have added the following targets to the `Makefile`:
1. `rocksdbjavastaticreleasedocker` this uses Docker to perform the cross-builds. It is basically the Docker version of the existing Vagrant `rocksdbjavastaticrelease` target.
2. `rocksdbjavastaticpublishdocker` delegates to `rocksdbjavastaticreleasedocker` and then `rocksdbjavastaticpublishcentral` to upload the artifacts to Maven Central. Equivalent to the existing Vagrant target: `rocksdbjavastaticpublish`
Closes https://github.com/facebook/rocksdb/pull/2278

Differential Revision: D5048206

Pulled By: yiwu-arbug

fbshipit-source-id: 78fa96ef9d966fe09638ed01de282cd4e31961a9
2017-05-17 17:12:02 -07:00
Adam Retter
0b129d1f7d Make sure that zstd is statically linked correctly in the Java static build
Summary:
Closes https://github.com/facebook/rocksdb/issues/2280
Closes https://github.com/facebook/rocksdb/pull/2292

Differential Revision: D5061259

Pulled By: sagar0

fbshipit-source-id: eec89111d114c04beee5870a4eb4b51857754783
2017-05-17 17:07:55 -07:00
Adam Retter
58ef1ca899 Build and link with ZStd when creating the static RocksJava build
Summary: Closes https://github.com/facebook/rocksdb/pull/2279

Differential Revision: D5048161

Pulled By: yiwu-arbug

fbshipit-source-id: 43742ff93137e0a35ea7e855692c9e9a0cd41968
2017-05-17 17:07:09 -07:00
Yi Wu
13712712de s/std::snprintf/snprintf
Summary:
Looks like std::snprintf is not available on all platforms (e.g. MSVC 2010). Change it back to snprintf, for which we have a macro in port.h to work around compatibility issues.
Closes https://github.com/facebook/rocksdb/pull/2308

Differential Revision: D5070988

Pulled By: yiwu-arbug

fbshipit-source-id: bedfc1660bab0431c583ad434b7e68265e1211b1
2017-05-16 12:05:01 -07:00
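A small sketch of the unqualified `snprintf` call style that 13712712de and ab8129ab8a settle on. The helper name `BlobFileName` and the `"%06llu.blob"` format are hypothetical, chosen only for illustration; the `port.h` macro mentioned in the summary is not reproduced here.

```cpp
#include <cassert>
#include <stdio.h>  // declares snprintf in the global namespace
#include <string>

// Hypothetical helper: formats a number into a name such as "000042.blob".
std::string BlobFileName(unsigned long long number) {
  char buf[32];
  // Unqualified snprintf rather than std::snprintf, so the call still
  // compiles on toolchains that do not provide the std:: overload.
  int n = snprintf(buf, sizeof(buf), "%06llu.blob", number);
  // A 64-bit number needs at most 20 digits plus ".blob", so buf is large enough.
  assert(n > 0 && static_cast<size_t>(n) < sizeof(buf));
  return std::string(buf, static_cast<size_t>(n));
}
```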
Yi Wu
ab8129ab8a Fix build error with blob DB.
Summary:
snprintf is in <stdio.h> and not in namespace std.
Closes https://github.com/facebook/rocksdb/pull/2287

Reviewed By: anirbanr-fb

Differential Revision: D5054752

Pulled By: yiwu-arbug

fbshipit-source-id: 356807ec38f3c7d95951cdb41f31a3d3ae0714d4
2017-05-15 14:21:07 -07:00
Aaron Gao
87f35fbd3e fix readamp test type inconsistency 2017-05-12 14:14:42 -07:00
Yi Wu
9e58371114 Bump version to 5.4.4 and update HISTORY.md 2017-05-11 11:43:56 -07:00
Adam Retter
1d4dc5eb22 Fixes the CentOS 5 cross-building of RocksJava
Summary:
Updates to CentOS 5 have been archived as CentOS 5 is EOL. We now pull the updates from the vault. This is a stopgap solution; I will send a PR in a couple of days which uses fixed Docker containers (with the updates pre-installed) instead.

sagar0 Here you go :-)
Closes https://github.com/facebook/rocksdb/pull/2270

Differential Revision: D5033637

Pulled By: sagar0

fbshipit-source-id: a9312dd1bc18bfb8653f06ffa0a1512b4415720d
2017-05-11 11:37:41 -07:00
Yi Wu
b54951489c Add missing files of blob_db to CMake file
Summary:
Some of the files from #2269 were not added to the CMake file. Adding them to fix the Windows build.
Closes https://github.com/facebook/rocksdb/pull/2276

Differential Revision: D5043487

Pulled By: yiwu-arbug

fbshipit-source-id: 4eba853e9d92574353abce21d77d30e47ce43d3d
2017-05-11 11:25:05 -07:00
Anirban Rahut
e8727ff6e0 Blob storage pr
Summary:
The final pull request for Blob Storage.
Closes https://github.com/facebook/rocksdb/pull/2269

Differential Revision: D5033189

Pulled By: yiwu-arbug

fbshipit-source-id: 6356b683ccd58cbf38a1dc55e2ea400feecd5d06
2017-05-11 11:24:46 -07:00
Yi Wu
a6e1cf9d20 Fix ColumnFamilyTest:BulkAddDrop
Summary:
Fix ColumnFamilyTest:BulkAddDrop not deleting CF handles at the end, which caused an ASAN failure.
Closes https://github.com/facebook/rocksdb/pull/2275

Differential Revision: D5040724

Pulled By: yiwu-arbug

fbshipit-source-id: 86cd4070c944d01173a3cc36462bb800698af192
2017-05-11 11:23:14 -07:00
Yi Wu
ded1d5a1af Add bulk create/drop column family API
Summary:
Adding DB::CreateColumnFamilies() and DB::DropColumnFamilies() to bulk create/drop column families. This addresses the problem that creating/dropping 1k column families takes minutes. The bottleneck is that we persist the options file for every single column family create/drop, and then parse the persisted options file for verification, which takes a lot of CPU time.

The new APIs simply create/drop column families individually and persist the options file once at the end. This brings creating 1k column families down to within ~0.1s. A further improvement would be to merge the manifest writes into one IO.
Closes https://github.com/facebook/rocksdb/pull/2248

Differential Revision: D5001578

Pulled By: yiwu-arbug

fbshipit-source-id: d4e00bda671451e0b314c13e12ad194b1704aa03
2017-05-11 11:22:50 -07:00
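A toy model of why the bulk API in ded1d5a1af helps, under stated assumptions: `ToyDb`, `MakeNames`, and the persist counter are hypothetical and do not touch RocksDB. The sketch only counts how many times an options file would be persisted on each path.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy model, not RocksDB code: counts options-file persists.
class ToyDb {
 public:
  // Per-family path: persists the options file after every single create.
  void CreateColumnFamily(const std::string& name) {
    families_.push_back(name);
    PersistOptionsFile();
  }

  // Bulk path: creates the families one by one, persists once at the end.
  void CreateColumnFamilies(const std::vector<std::string>& names) {
    for (const std::string& name : names) {
      families_.push_back(name);
    }
    PersistOptionsFile();
  }

  std::size_t persist_count() const { return persist_count_; }
  std::size_t family_count() const { return families_.size(); }

 private:
  // Stands in for writing the options file and parsing it back for verification.
  void PersistOptionsFile() { ++persist_count_; }

  std::vector<std::string> families_;
  std::size_t persist_count_ = 0;
};

// Hypothetical helper producing the names "cf0" .. "cf<n-1>".
std::vector<std::string> MakeNames(std::size_t n) {
  std::vector<std::string> names;
  names.reserve(n);
  for (std::size_t i = 0; i < n; ++i) {
    names.push_back("cf" + std::to_string(i));
  }
  return names;
}
```

Creating 1000 families one at a time triggers 1000 persists in this model, while the bulk call triggers one; the expensive step is paid once instead of per family.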
Siying Dong
dc0bbf78f7 Fix an issue of manual / auto compaction data race
Summary:
A data race between a manual and an auto compaction can cause a scheduled automatic compaction to be cancelled and never rescheduled again. This may cause a condition of hanging forever. Fix this by always making sure the cancelled compaction is put back to the compaction queue.
Closes https://github.com/facebook/rocksdb/pull/2238

Differential Revision: D4984591

Pulled By: siying

fbshipit-source-id: 3ab153886403c7b991896dcb2158b96cac12f227
2017-05-11 11:17:26 -07:00
Aaron Gao
b634fd7162 Bump version to 5.4.3 2017-05-10 14:18:54 -07:00
Aaron Gao
22f277e034 fix readampbitmap tests
Summary:
fix test failure of ReadAmpBitmap and ReadAmpBitmapLiveInCacheAfterDBClose.
test ReadAmpBitmapLiveInCacheAfterDBClose individually and make check
Closes https://github.com/facebook/rocksdb/pull/2271

Differential Revision: D5038133

Pulled By: lightmark

fbshipit-source-id: 803cd6f45ccfdd14a9d9473c8af311033e164be8
2017-05-10 14:06:54 -07:00
Aaron Gao
7e62c5d67a unbiase readamp bitmap
Summary:
Consider BlockReadAmpBitmap with bytes_per_bit = 32. Suppose bytes [a, b) were used, while bytes [a-32, a) and [b+1, b+33) weren't used; more formally, the union of ranges passed to BlockReadAmpBitmap::Mark() contains [a, b) and doesn't intersect with [a-32, a) and [b+1, b+33). Then bits [floor(a/32), ceil(b/32)] will be set, and so the number of useful bytes will be estimated as (ceil(b/32) - floor(a/32)) * 32, which is on average equal to b-a+31.

An extreme example: if we use 1 byte from each block, it'll be counted as 32 bytes from each block.

It's easy to remove this bias by slightly changing the semantics of the bitmap. Currently each bit represents a byte range [i*32, (i+1)*32).

This diff makes each bit represent a single byte: i*32 + X, where X is a random number in [0, 31] generated when bitmap is created. So, e.g., if you read a single byte at random, with probability 31/32 it won't be counted at all, and with probability 1/32 it will be counted as 32 bytes; so, on average it's counted as 1 byte.

*But there is one exception: the last bit will always be set the old way.*

(*) - assuming read_amp_bytes_per_bit = 32.
Closes https://github.com/facebook/rocksdb/pull/2259

Differential Revision: D5035652

Pulled By: lightmark

fbshipit-source-id: bd98b1b9b49fbe61f9e3781d07f624e3cbd92356
2017-05-10 14:06:54 -07:00
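A toy comparison of the two bitmap semantics described in 7e62c5d67a. This is a sketch, not the RocksDB class: `ToyReadAmpBitmap` is hypothetical, ranges are treated as half-open `[start, end)`, the offset X is passed in by the caller instead of being drawn at random, and the "last bit" exception mentioned in the summary is not modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model of the old and new marking schemes (not the RocksDB class).
class ToyReadAmpBitmap {
 public:
  // offset is the per-bitmap X in [0, bytes_per_bit).
  ToyReadAmpBitmap(std::size_t block_size, std::size_t bytes_per_bit,
                   std::size_t offset)
      : bytes_per_bit_(bytes_per_bit),
        offset_(offset),
        bits_((block_size + bytes_per_bit - 1) / bytes_per_bit, false) {
    assert(bytes_per_bit > 0 && offset < bytes_per_bit);
  }

  // Old semantics: bit i stands for the whole range [i*bpb, (i+1)*bpb),
  // so touching any byte of that range sets the bit.
  void MarkOld(std::size_t start, std::size_t end) {
    if (start >= end) return;
    std::size_t last = (end - 1) / bytes_per_bit_;
    for (std::size_t i = start / bytes_per_bit_; i <= last && i < bits_.size(); ++i) {
      bits_[i] = true;
    }
  }

  // New semantics: bit i stands for the single byte i*bpb + X,
  // so the bit is set only if that sampled byte was read.
  void MarkNew(std::size_t start, std::size_t end) {
    for (std::size_t i = 0; i < bits_.size(); ++i) {
      std::size_t sample = i * bytes_per_bit_ + offset_;
      if (sample >= start && sample < end) {
        bits_[i] = true;
      }
    }
  }

  // Each set bit is counted as bytes_per_bit useful bytes.
  std::size_t EstimatedUsefulBytes() const {
    std::size_t set_bits = 0;
    for (bool bit : bits_) {
      if (bit) ++set_bits;
    }
    return set_bits * bytes_per_bit_;
  }

 private:
  std::size_t bytes_per_bit_;
  std::size_t offset_;
  std::vector<bool> bits_;
};
```

Reading the single byte 40 of a 128-byte block counts as 32 bytes under the old scheme. Under the new scheme it counts as 32 bytes only when X = 8 and as 0 otherwise, so averaged over the 32 possible offsets it counts as 1 byte.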
Aaron Gao
2014cdf2d0 do not read next datablock if upperbound is reached
Summary:
Currently, if iterate_upper_bound is set, we continue reading until we get a key >= upper_bound. In many cases neighboring data blocks have a user key gap between them, and our index key will be a user key in the middle of that gap, chosen to be shorter. For example, if we have blocks:
[a b c d][f g h]
Then the index key for the first block will be 'e'.
Then, if the upper bound is any key between 'd' and 'e', for example d1, d2, ..., d99999999999, we don't have to read the second block: we already know the iteration is done once we reach the last key that is smaller than the upper bound.

This diff can reduce RA in most cases.
Closes https://github.com/facebook/rocksdb/pull/2239

Differential Revision: D4990693

Pulled By: lightmark

fbshipit-source-id: ab30ea2e3c6edf3fddd5efed3c34fcf7739827ff
2017-05-10 14:06:33 -07:00
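The decision described in 2014cdf2d0 can be sketched as a single comparison. This is an illustration under assumptions, not the RocksDB iterator code: `NeedNextBlock` is a hypothetical name, keys are compared bytewise as `std::string`, the upper bound is treated as exclusive, and the index key is assumed to be >= every key in its block and < every key in the next block.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper. index_key is the separator stored for the current
// block; upper_bound is the exclusive bound set on the iterator.
bool NeedNextBlock(const std::string& index_key,
                   const std::string& upper_bound) {
  // If upper_bound <= index_key, every key in the next block is
  // > index_key >= upper_bound, so that block holds nothing in range
  // and does not have to be read.
  return index_key < upper_bound;
}
```

For the `[a b c d][f g h]` example with index key 'e', an upper bound of "d1" gives `false` (the second block is skipped), while an upper bound of "g" gives `true`.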
Aaron Gao
459e00b365 Roundup read bytes in ReadaheadRandomAccessFile
Summary:
Fix alignment in ReadaheadRandomAccessFile
Closes https://github.com/facebook/rocksdb/pull/2253

Differential Revision: D5012336

Pulled By: lightmark

fbshipit-source-id: 10d2c829520cb787227ef653ef63d5d701725778
2017-05-09 15:18:49 -07:00
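The summary of 459e00b365 does not spell out the arithmetic, so the following is only a generic sketch of rounding a byte count to an alignment boundary. `Roundup` and `Rounddown` are hypothetical helper names here, and 512 in the note below is an assumed alignment.

```cpp
#include <cassert>
#include <cstddef>

// Largest multiple of alignment that is <= x.
std::size_t Rounddown(std::size_t x, std::size_t alignment) {
  assert(alignment > 0);
  return x - x % alignment;
}

// Smallest multiple of alignment that is >= x.
std::size_t Roundup(std::size_t x, std::size_t alignment) {
  assert(alignment > 0);
  return ((x + alignment - 1) / alignment) * alignment;
}
```

With an alignment of 512, a 1000-byte read is rounded up to 1024 bytes, and an already aligned 1024-byte read stays at 1024.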
Aaron Gao
49412d93e2 fix memory alignment with logical sector size
Summary:
We align the buffer with the logical sector size, so we should not test it against the page size, which is usually 4k.
Closes https://github.com/facebook/rocksdb/pull/2245

Differential Revision: D5001842

Pulled By: lightmark

fbshipit-source-id: a7135fcf6351c6db363e8908956b1e193a4a6291
2017-05-09 15:16:57 -07:00
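A sketch of the point made in 49412d93e2, assuming a 512-byte logical sector size for illustration (the commit does not state a value). `IsAligned` is a hypothetical helper that takes the address as an integer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true if address is a multiple of alignment.
bool IsAligned(std::uintptr_t address, std::size_t alignment) {
  assert(alignment > 0);
  return address % alignment == 0;
}
```

An address such as 1536 (3 * 512) is correctly aligned to a 512-byte sector but fails a 4096-byte page-size check, which is why a test that checks against the page size can reject a buffer that is in fact properly aligned.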
Islam AbdelRahman
5a8e732594 Bump version to 5.4.2 2017-05-08 17:35:44 +00:00
Andrew Kryczka
f2e68d2bec Avoid calling fallocate with UINT64_MAX
Summary:
When user doesn't set a limit on compaction output file size, let's use the sum of the input files' sizes. This will avoid passing UINT64_MAX as fallocate()'s length. Reported in #2249.

Test setup:
- command: `TEST_TMPDIR=/data/rocksdb-test/ strace -e fallocate ./db_compaction_test --gtest_filter=DBCompactionTest.ManualCompactionUnknownOutputSize`
- filesystem: xfs

before this diff:
`fallocate(10, 01, 0, 1844674407370955160) = -1 ENOSPC (No space left on device)`

after this diff:
`fallocate(10, 01, 0, 1977)              = 0`
Closes https://github.com/facebook/rocksdb/pull/2252

Differential Revision: D5007275

Pulled By: ajkr

fbshipit-source-id: 4491404a6ae8a41328aede2e2d6f4d9ac3e38880
2017-05-08 17:32:32 +00:00
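A minimal sketch of the fallback described in f2e68d2bec, under stated assumptions: `PreallocationLength` and its parameters are hypothetical and this is not the RocksDB code path. It treats `UINT64_MAX` as "no limit set" and falls back to the sum of the input file sizes.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical helper: picks the length to pass to fallocate().
std::uint64_t PreallocationLength(
    std::uint64_t max_output_file_size,
    const std::vector<std::uint64_t>& input_file_sizes) {
  // A real limit was configured: use it as is.
  if (max_output_file_size != std::numeric_limits<std::uint64_t>::max()) {
    return max_output_file_size;
  }
  // No limit: the output cannot be larger than all inputs combined,
  // so their total size is used as the bound.
  std::uint64_t total = 0;
  for (std::uint64_t size : input_file_sizes) {
    total += size;
  }
  return total;
}
```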
Yi Wu
b07369836e Update HISTORY.md for 5.4.1 2017-05-01 21:49:09 -07:00
Maysam Yabandeh
f3dc93bcf1 Avoid pinning when row cache is accessed
Summary:
With row cache enabled, the table cache takes a shortcut when reading data. This path needs to be updated to take advantage of pinnable slice. In the meantime we are disabling pinning in this path.
Closes https://github.com/facebook/rocksdb/pull/2237

Differential Revision: D4982389

Pulled By: maysamyabandeh

fbshipit-source-id: 542630d0cf23cfb1f0c397da82e7053df7966591
2017-05-01 21:46:40 -07:00
Yi Wu
30a6d4e3ed Update HISTORY.md for 5.4.1 2017-04-28 13:56:40 -07:00
Aaron Gao
6d29d8b3fa add prefetch to PosixRandomAccessFile in buffered io
Summary:
Every time a compaction/flush finishes, we issue user reads to put the table into the block cache, which involves a couple of IOs that read the footer, index blocks, meta block, etc. So we implement Prefetch here to reduce IO.
Closes https://github.com/facebook/rocksdb/pull/2196

Differential Revision: D4931782

Pulled By: lightmark

fbshipit-source-id: 5a13d58dcab209964352322217193bbf7ff78149
2017-04-28 13:48:23 -07:00
Aaron Gao
1530f38baa fix WritableFile buffer size in direct IO
Summary:
Fix the buffer size in case people use the buffer size as their block_size.
Closes https://github.com/facebook/rocksdb/pull/2198

Differential Revision: D4956878

Pulled By: lightmark

fbshipit-source-id: 8bb0dc9c133887aadcd625d5261a3d1110b71473
2017-04-28 13:45:30 -07:00
Yi Wu
2b621c1f83 Bump version to 5.4.1 2017-04-28 13:36:19 -07:00
Yi Wu
71acb4c122 Fix WriteBatchWithIndex address use after scope error
Summary:
Fix use after scope error caught by ASAN.
Closes https://github.com/facebook/rocksdb/pull/2228

Differential Revision: D4968028

Pulled By: yiwu-arbug

fbshipit-source-id: a2a266c98634237494ab4fb2d666bc938127aeb2
2017-04-28 13:31:19 -07:00
Maysam Yabandeh
ebbce5b10d Respect deprecated flag in table options
Summary: Closes https://github.com/facebook/rocksdb/pull/2197

Differential Revision: D4932434

Pulled By: maysamyabandeh

fbshipit-source-id: 6c83c12d6d47e3f0640ab84954944215968f266f
2017-04-21 17:52:06 -07:00
Andrew Kryczka
fa586740e3 Change L0 compaction score using level size
Summary:
The goal is to avoid the problem of small number of L0 files triggering compaction to base level (which increased write-amp), while still allowing L0 compaction-by-size (so intra-L0 compactions cause score to increase).
Closes https://github.com/facebook/rocksdb/pull/2172

Differential Revision: D4908552

Pulled By: ajkr

fbshipit-source-id: 4b170142b2b368e24bd7948b2a6f24c69fabf73d
2017-04-19 17:27:12 -07:00
Maysam Yabandeh
4623a5521f Re-add index_per_partition but as deprecated
Summary:
index_per_partition should have been deprecated instead of being removed. Removing it is causing backward compatibility issues.
Closes https://github.com/facebook/rocksdb/pull/2173

Differential Revision: D4910947

Pulled By: maysamyabandeh

fbshipit-source-id: 5c52939381847d232ede6866606f67f2b4b857ae
2017-04-18 20:47:48 -07:00
Yi Wu
6df24fcffc Hide event listeners from lite build
Summary:
Fixing lite build failure introduced by #2169.
Closes https://github.com/facebook/rocksdb/pull/2174

Reviewed By: sagar0

Differential Revision: D4910619

Pulled By: yiwu-arbug

fbshipit-source-id: 5213b7b7431cc258688793c8c28153025588d8d9
2017-04-18 18:02:42 -07:00
Siying Dong
8a1c34903c Add DB:ResetStats()
Summary:
Add a function to allow users to reset internal stats without restarting the DB.
Closes https://github.com/facebook/rocksdb/pull/2167

Differential Revision: D4907939

Pulled By: siying

fbshipit-source-id: ab2dd85b88aabe9380da7485320a1d460d3e1f68
2017-04-18 17:22:35 -07:00
Yi Wu
48fc484950 Blob storage helper methods
Summary:
Split out interfaces needed for blob storage from #1560, including
* CompactionEventListener and OnFlushBegin listener interfaces.
* Blob filename support.
Closes https://github.com/facebook/rocksdb/pull/2169

Differential Revision: D4905463

Pulled By: yiwu-arbug

fbshipit-source-id: 564e73448f1b7a367e5e46216a521e57ea9011b5
2017-04-18 12:44:15 -07:00
Aaron Gao
1265ed7abb remove warning
Summary:
st_blocks shows 16 though the right value is 8. This happens occasionally, which seems to be a bug.
Closes https://github.com/facebook/rocksdb/pull/2160

Differential Revision: D4893542

Pulled By: lightmark

fbshipit-source-id: 68e832586b58bbc6162efbe83ce273f1570d5be3
2017-04-14 19:50:21 -07:00
Aaron Gao
95c5e2dc6e readahead backwards from sst end
Summary:
prefetch some data from the end of the file for each compaction to reduce IO.
Closes https://github.com/facebook/rocksdb/pull/2149

Differential Revision: D4880576

Pulled By: lightmark

fbshipit-source-id: aa767cd1afc84c541837fbf1ad6c0d45b34d3932
2017-04-14 19:50:09 -07:00
Aaron Gao
8d7edd5908 change use_direct_writes to use_direct_io_for_flush_and_compaction
Summary:
Replace Options::use_direct_writes with Options::use_direct_io_for_flush_and_compaction
Now if Options::use_direct_io_for_flush_and_compaction = true, we will enable direct IO for both reads and writes in flush and compaction jobs, whereas Options::use_direct_reads controls user reads such as iterators and Get().
Closes https://github.com/facebook/rocksdb/pull/2117

Differential Revision: D4860912

Pulled By: lightmark

fbshipit-source-id: d93575a8a5e780cf7e40797287edc425ee648c19
2017-04-14 16:19:53 -07:00
Aaron Gao
b6f6b73a9c add space for buggy kernel warning
Summary:
add the missing space
Closes https://github.com/facebook/rocksdb/pull/2150

Differential Revision: D4880696

Pulled By: lightmark

fbshipit-source-id: a4e0ad6a8ea45d6469d3f6c8514fdeb4cf10aaf5
2017-04-14 16:19:41 -07:00
Sagar Vemuri
415be221cb RocksDB Release 5.4 : Update HISTORY.md and build version.
Summary: Closes https://github.com/facebook/rocksdb/pull/2142

Reviewed By: siying

Differential Revision: D4874696

Pulled By: sagar0

fbshipit-source-id: 03e6e21735bb74e5a37cc913aabb2c250af558cc
2017-04-12 17:36:27 -07:00
Daniel Black
3eab41d7c4 java dependencies test -s -> use test -d
Summary:
To correct a build process where the JAVA_TEST_LIBDIR is a symlink to a cache directory.

Test -s (size 0) on symlinks returns true, resulting in a mkdir over the top of the symlink resulting in failure.

As a solution -d checks if it is a directory (or the symlink refers to a directory), which works in the case of real directories and symlinks to directories.

Trivial I know but it was really easy for me to use a symlink here to prevent frequent downloads in a CI environment.

Thanks for your consideration.
Closes https://github.com/facebook/rocksdb/pull/1917

Differential Revision: D4612263

Pulled By: siying

fbshipit-source-id: 4d458f8e1760068cdd6b5eae4bce6e12c400df41
2017-04-12 15:13:41 -07:00
Siying Dong
a22ed4eab1 internal_repo_rocksdb to build Java and RocksDB LITE
Summary: Build Java and RocksDB LITE as a customized unit test under internal_repo_rocksdb. One thing I'm not sure about is whether these two tests are triggered in every flavor.

Reviewed By: IslamAbdelRahman

Differential Revision: D4855868

fbshipit-source-id: 82a1628b458744d7692bbd29ef7424cca1294031
2017-04-12 15:13:41 -07:00
Islam AbdelRahman
9f2cc59ec5 sync TARGETS file 2017-04-11 18:17:47 -07:00
Aaron Gao
10d7546961 set readahead buffer size from roundup(user_size) + 4k to roundup(use…
Summary:
Users usually set the readahead buffer to a multiple of 4k and, beyond that, usually to a multiple of blocks.
Previously we set the real buffer size to 512 * n + 4k, which may introduce an additional block read.
Closes https://github.com/facebook/rocksdb/pull/2138

Differential Revision: D4871504

Pulled By: lightmark

fbshipit-source-id: b070faa51d92e976e8e8468c00692699e585e243
2017-04-11 17:13:33 -07:00
Aaron Gao
ba7da434ae fix db_stress crash caused by buggy kernel warning
Summary:
filter the warning out and only print it once.
Closes https://github.com/facebook/rocksdb/pull/2137

Differential Revision: D4870925

Pulled By: lightmark

fbshipit-source-id: 91b363ce7f70bce88b0780337f408fc4649139b8
2017-04-11 16:56:59 -07:00
Siying Dong
6257837d83 Add ROCKSDB_JAVA_NO_COMPRESSION flag
Summary:
In some CI test environments, compression libraries can't be built successfully. It still helps to build RocksDB there. Provide an option to skip downloading and building the compression libraries.
Closes https://github.com/facebook/rocksdb/pull/2135

Differential Revision: D4872617

Pulled By: siying

fbshipit-source-id: bb21ac373bc62a2528cdf1ca4547e05fcae86214
2017-04-11 16:56:59 -07:00
Sagar Vemuri
6a6723ee1e Move MergeOperatorPinning tests to be with other merge operator tests
Summary:
Moved MergeOperatorPinning tests from db_test2.cc to db_merge_operator_test.cc.

[This is the same code as PR #2104, which has already been reviewed, but I am creating a new PR as I cannot import from #2104 onto phabricator anymore even after rebasing. I'll close and discard #2104.]
Closes https://github.com/facebook/rocksdb/pull/2125

Differential Revision: D4863312

Pulled By: sagar0

fbshipit-source-id: 0f71a7690aa09c1d03ee85ce2bc1d2d89e4f4399
2017-04-11 16:15:06 -07:00
Maysam Yabandeh
6a8d5c015b Revert "Report cpu usage using time command"
Summary:
This reverts commit 97ec8a1349.
Closes https://github.com/facebook/rocksdb/pull/2136

Differential Revision: D4870610

Pulled By: maysamyabandeh

fbshipit-source-id: cdbfba135b065562f38f704f350a9a4e63a9a122
2017-04-11 13:57:58 -07:00