6138 Commits

Author SHA1 Message Date
Aaron Gao
9e72939029 only FALLOC_FL_PUNCH_HOLE when ftruncate is buggy
Summary:
In RocksDB, we sometimes preallocate the estimated space for a file to have better perf with fallocate (if supported). Usually it is a little bit bigger than the real resulting file size. At this time, we have to let the Filesystem reclaim the space not used.

Ideally, calling ftruncate to truncate the file to its real size should be enough. HOWEVER, it isn't on tmpfs, which we witness in our case, with some buggy kernel version. ftruncate a file with preallocated space doesn't change number of the blocks used by the file, which means the space not used by the file is not returned to the filesystems. So in this case we need fallocate with FALLOC_FL_PUNCH_HOLE to explicitly reclaim the used blocks. It is a hack to cope with the kernel bug and usually we should not need it.
Closes https://github.com/facebook/rocksdb/pull/2102

Differential Revision: D4848934

Pulled By: lightmark

fbshipit-source-id: f1b40b5
2017-04-06 18:25:03 -07:00
Sagar Vemuri
343b59d6ee Move various string utility functions into string_util
Summary:
This is an effort to club all string related utility functions into one common place, in string_util, so that it is easier for everyone to know what string processing functions are available. Right now they seem to be spread out across multiple modules, like logging and options_helper.

Check the sub-commits for easier reviewing.
Closes https://github.com/facebook/rocksdb/pull/2094

Differential Revision: D4837730

Pulled By: sagar0

fbshipit-source-id: 344278a
2017-04-06 14:54:12 -07:00
Ayappan
1d068f6067 Fix CompactRange incorrect buffer release
Summary:
While running `make jtest` using IBM Java, it fails at compactRangeToLevel with the below error.

```
Run: org.rocksdb.RocksDBTest testing now -> compactRangeToLevel
JVMJNCK056E JNI error in ReleaseByteArrayElements: Got memory 0x00003FFF94AA8908 from object 0x00000000000C7F78, releasing from 0x00000000000C7F68
JVMJNCK077E Error detected in org/rocksdb/RocksDB.compactRange0(J[BI[BIZII)V

JVMJNCK024E JNI error detected. Aborting.
JVMJNCK025I Use -Xcheck:jni:nonfatal to continue running when errors are detected.

Fatal error: JNI error
Makefile:205: recipe for target 'run_test' failed
make[1]: *** [run_test] Error 87
make[1]: Leaving directory '/home/ubuntu/rocksdb/java'
Makefile:1542: recipe for target 'jtest' failed
make: *** [jtest] Error 2
```

After checking the code, it is vivid that we are messing up the `ReleaseByteArrayElements` args in `rocksdb_compactrange_helper`.

```
   .................
   1959     s = db->CompactRange(compact_options, &begin_slice, &end_slice);
   1960   }
Closes https://github.com/facebook/rocksdb/pull/2060

Differential Revision: D4831427

Pulled By: yiwu-arbug

fbshipit-source-id: dd02037
2017-04-06 14:09:13 -07:00
Yi Wu
df6f5a3772 Move memtable related files into memtable directory
Summary:
Move memtable related files into memtable directory.
Closes https://github.com/facebook/rocksdb/pull/2087

Differential Revision: D4829242

Pulled By: yiwu-arbug

fbshipit-source-id: ca70ab6
2017-04-06 14:09:13 -07:00
Tamir Duberstein
107c5f6a60 CMake: more MinGW fixes
Summary:
siying this is a resubmission of #2081 with the 4th commit fixed. From that commit message:

> Note that the previous use of quotes in PLATFORM_{CC,CXX}FLAGS was
incorrect and caused GCC to produce the incorrect define:
>
>  #define ROCKSDB_JEMALLOC -DJEMALLOC_NO_DEMANGLE 1
>
> This was the cause of the Linux build failure on the previous version
of this change.

I've tested this locally, and the Linux build succeeds now.
Closes https://github.com/facebook/rocksdb/pull/2097

Differential Revision: D4839964

Pulled By: siying

fbshipit-source-id: cc51322
2017-04-06 14:09:13 -07:00
Siying Dong
d2dce5611a Move some files under util/ to separate dirs
Summary:
Move some files under util/ to new directories env/, monitoring/ options/ and cache/
Closes https://github.com/facebook/rocksdb/pull/2090

Differential Revision: D4833681

Pulled By: siying

fbshipit-source-id: 2fd8bef
2017-04-05 19:09:16 -07:00
Islam AbdelRahman
c50e3750dc Use a human readable size for level report
Summary:
Current
```
** Compaction Stats [default] **
Level    Files   Size(MB} Score Read(GB}  Rn(GB} Rnp1(GB} Write(GB} Wnew(GB} Moved(GB} W-Amp Rd(MB/s} Wr(MB/s} Comp(sec} Comp(cnt} Avg(sec} KeyIn KeyDrop
----------------------------------------------------------------------------------------------------------------------------------------------------------
  L0      2/0      49.02   0.5      0.0     0.0      0.0       0.0      0.0       0.0   0.0      0.0     76.1         1         2    0.322       0      0
 Sum      2/0      49.02   0.0      0.0     0.0      0.0       0.0      0.0       0.0   1.0      0.0     76.1         1         2    0.322       0      0
 Int      0/0       0.00   0.0      0.0     0.0      0.0       0.0      0.0       0.0   1.0      0.0     76.1         1         2    0.322       0      0
```

New
```
** Compaction Stats [default] **
Level    Files   Size     Score Read(GB)  Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) Moved(GB) W-Amp Rd(MB/s) Wr(MB/s) Comp(sec) Comp(cnt) Avg(sec) KeyIn Key
Closes https://github.com/facebook/rocksdb/pull/2055

Differential Revision: D4804576

Pulled By: IslamAbdelRahman

fbshipit-source-id: 719be6a
2017-04-05 17:24:19 -07:00
Siying Dong
ce64b8b719 Divide db/db_impl.cc
Summary:
db_impl.cc is too large to manage. Divide db_impl.cc into db/db_impl.cc, db/db_impl_compaction_flush.cc, db/db_impl_files.cc, db/db_impl_open.cc and db/db_impl_write.cc.
Closes https://github.com/facebook/rocksdb/pull/2095

Differential Revision: D4838188

Pulled By: siying

fbshipit-source-id: c5f3059
2017-04-05 17:24:19 -07:00
Aaron Gao
02799ad77a Revert "delete fallocate with punch_hole"
Summary:
This reverts commit 0fd574926cc9be7309c2247092d6b337fb022a5d.
It breaks tmpfs on kernel 4.0 or earlier. We will wait for the fix before remove this part
Closes https://github.com/facebook/rocksdb/pull/2096

Differential Revision: D4839661

Pulled By: lightmark

fbshipit-source-id: 574a51f
2017-04-05 16:10:09 -07:00
Maysam Yabandeh
e2a7b202c3 Release note for partition filters
Summary:
I tagged it experimental since the configuring the filter partitions by size will not be ready for this release.
Closes https://github.com/facebook/rocksdb/pull/2049

Differential Revision: D4831575

Pulled By: maysamyabandeh

fbshipit-source-id: fcc2d25
2017-04-05 12:24:22 -07:00
Aaron Gao
af256eb2b7 build db every monday
Summary:
Rebuilding regression db at the first run every monday.
Closes https://github.com/facebook/rocksdb/pull/2093

Differential Revision: D4836961

Pulled By: lightmark

fbshipit-source-id: 22d6c25
2017-04-05 12:24:21 -07:00
Dmitri Smirnov
e5a1372b24 Rework test running script.
Summary:
Rework test running script.
  New options SuiteRun - runs specified executables as google suite
  test cases in parallel.
  Run - this option now runs executables in parallel the same as 'tests'
  RunAll - scans for test executables and attempts to run them all
  as suites except those mentiones in RunOnly (hardcoded in the script)
  or specified either in $ExcludeTestCases $ExcludeTestExe
Closes https://github.com/facebook/rocksdb/pull/2089

Differential Revision: D4832212

Pulled By: yiwu-arbug

fbshipit-source-id: 954990c
2017-04-05 11:39:20 -07:00
Andrew Kryczka
d659faad54 Level-based L0->L0 compaction
Summary:
Level-based L0->L0 compaction operates on spans of files that aren't currently being compacted. It reduces the number of L0 files, thus making write stall conditions harder to reach.

- L0->L0 is triggered when base level is unavailable due to pending compactions
- L0->L0 always outputs one file of at most `max_level0_burst_file_size` bytes.
- Subcompactions are disabled for L0->L0 since we want to output one file.
- Input files are chosen as the longest span of available files that will fit within the size limit. This minimizes number of files in L0.
Closes https://github.com/facebook/rocksdb/pull/2027

Differential Revision: D4760318

Pulled By: ajkr

fbshipit-source-id: 9d07183
2017-04-04 18:09:11 -07:00
Jevon Qiao
a12306fab7 Add a notice on gflags installation in INSTALL.md
Summary:
Add a notice on gflags installation to help people build and install
RocksDB from source code correctly.
Please see https://github.com/facebook/rocksdb/issues/1775
Closes https://github.com/facebook/rocksdb/pull/2061

Differential Revision: D4831323

Pulled By: yiwu-arbug

fbshipit-source-id: 0df1f0e
2017-04-04 16:39:25 -07:00
Siying Dong
43010a929f Revert "[rocksdb][PR] CMake: more MinGW fixes"
fbshipit-source-id: 43b4529
2017-04-04 16:24:26 -07:00
Islam AbdelRahman
a30b75cdcf Add buckifier script to github repo
Summary:
Add buckifier script and TARGETS file to github repo
Closes https://github.com/facebook/rocksdb/pull/2083

Differential Revision: D4825822

Pulled By: IslamAbdelRahman

fbshipit-source-id: 205f4a7
2017-04-04 16:24:26 -07:00
Tamir Duberstein
3450ac8c1b CMake: more MinGW fixes
Summary:
See individual commits.

yuslepukhin siying
Closes https://github.com/facebook/rocksdb/pull/2081

Differential Revision: D4824639

Pulled By: IslamAbdelRahman

fbshipit-source-id: 2fc2b00
2017-04-04 15:09:17 -07:00
Aaron Gao
90cfd46458 update IterKey that can get user key and internal key explicitly
Summary:
to void future bug that caused by the mix of userkey/internalkey
Closes https://github.com/facebook/rocksdb/pull/2084

Differential Revision: D4825889

Pulled By: lightmark

fbshipit-source-id: 28411db
2017-04-04 14:24:20 -07:00
Andrew Kryczka
e2c6c06366 add TimedEnv
Summary:
I've needed Env timing measurements a few times now, so finally built something for it.
Closes https://github.com/facebook/rocksdb/pull/2073

Differential Revision: D4811231

Pulled By: ajkr

fbshipit-source-id: 218a249
2017-04-04 11:24:12 -07:00
Yi Wu
9e44531803 Refactor WriteImpl (pipeline write part 1)
Summary:
Refactor WriteImpl() so when I plug-in the pipeline write code (which is
an alternative approach for WriteThread), some of the logic can be
reuse. I split out the following methods from WriteImpl():

* PreprocessWrite()
* HandleWALFull() (previous MaybeFlushColumnFamilies())
* HandleWriteBufferFull()
* WriteToWAL()

Also adding a constructor to WriteThread::Writer, and move WriteContext into db_impl.h.
No real logic change in this patch.
Closes https://github.com/facebook/rocksdb/pull/2042

Differential Revision: D4781014

Pulled By: yiwu-arbug

fbshipit-source-id: d45ca18
2017-04-04 10:24:32 -07:00
Siying Dong
6ef8c620d3 Move auto_roll_logger and filename out of db/
Summary:
It is confusing to have auto_roll_logger to stay under db/, which has nothing to do with database. Move filename together as it is a dependency.
Closes https://github.com/facebook/rocksdb/pull/2080

Differential Revision: D4821141

Pulled By: siying

fbshipit-source-id: ca7d768
2017-04-03 18:39:14 -07:00
Maysam Yabandeh
a1c469d719 Add release notes for PinnableSlice
Summary: Closes https://github.com/facebook/rocksdb/pull/2037

Differential Revision: D4822085

Pulled By: maysamyabandeh

fbshipit-source-id: 9d5a986
2017-04-03 15:09:19 -07:00
Aaron Gao
0537f515c2 fix run_remote with strong quoting
Summary:
add \' pair to avoid bash expanding before run remotely
Closes https://github.com/facebook/rocksdb/pull/2079

Differential Revision: D4821342

Pulled By: lightmark

fbshipit-source-id: 418ba41
2017-04-03 13:09:12 -07:00
Doogie Lee
72e6000947 fixed misses on Centos library installation instructions
Summary:
Following the instructions on INSTALL.md,
I found some errors while installing gflags and zstandard libraries on Centos and fixed them.
Thanks :)
Closes https://github.com/facebook/rocksdb/pull/2077

Differential Revision: D4820804

Pulled By: ajkr

fbshipit-source-id: db4abb3
2017-04-03 12:09:14 -07:00
Siying Dong
88cc81df5c auto_roll_logger_test to move away from real sleep
Summary:
auto_roll_logger_test relies on timing conditon that some operations finish within 1 seconds. This caused flaky tests. Move away from real timing and sleep and use fake time to verify the time-based rolling.
Closes https://github.com/facebook/rocksdb/pull/2066

Differential Revision: D4810647

Pulled By: siying

fbshipit-source-id: c54d994
2017-04-03 11:39:09 -07:00
Nikhil Benesch
d25e28d584 replace sometimes-undefined uint type with unsigned int
Summary:
`uint` is nonstandard and not a built-in type on all compilers; replace it
with the always-valid `unsigned int`. I assume this went unnoticed because
it's inside an `#ifdef ROCKDB_JEMALLOC`.
Closes https://github.com/facebook/rocksdb/pull/2075

Differential Revision: D4820427

Pulled By: ajkr

fbshipit-source-id: 0876561
2017-04-03 11:39:09 -07:00
Andrew Kryczka
a1d7e487b3 Add L0 write-amp to compaction level stats
Summary:
Previously it always showed 0.0 for L0 write-amp because we were dividing by bytes read from non-output level. For L0, we should instead divide by bytes ingested to the DB. Note the numerator (bytes written to L0) includes flush bytes.
Closes https://github.com/facebook/rocksdb/pull/2078

Differential Revision: D4816902

Pulled By: ajkr

fbshipit-source-id: 7dca31a
2017-04-03 11:24:10 -07:00
Tamir Duberstein
b6d6090630 CMake: support AVX2 in MinGW
Summary:
yuslepukhin

See individual commits.
Closes https://github.com/facebook/rocksdb/pull/2051

Differential Revision: D4819697

Pulled By: siying

fbshipit-source-id: 75c1de4
2017-04-03 10:11:46 -07:00
Aaron Gao
bd7d13835e test remote instead run remote in regression test
Summary:
avoid exit when $? != 0
Closes https://github.com/facebook/rocksdb/pull/2076

Differential Revision: D4815655

Pulled By: lightmark

fbshipit-source-id: 829299e
2017-03-31 18:54:11 -07:00
Aaron Gao
c81a805fe6 test db existence on the remote host
Summary:
Fix the bug that previous test existence locally
Closes https://github.com/facebook/rocksdb/pull/2074

Differential Revision: D4813631

Pulled By: lightmark

fbshipit-source-id: eceb0d9
2017-03-31 14:54:17 -07:00
Aaron Gao
5fc1e6765a add -rf when remove db in regression test
Summary:
force remove db dir when rebuilding
Closes https://github.com/facebook/rocksdb/pull/2067

Differential Revision: D4811926

Pulled By: lightmark

fbshipit-source-id: ab068a2
2017-03-31 12:10:23 -07:00
Adam Retter
4ab4049f27 gflags has moved to GitHub
Summary:
Closes https://github.com/facebook/rocksdb/issues/2068
Closes https://github.com/facebook/rocksdb/pull/2072

Differential Revision: D4810855

Pulled By: ajkr

fbshipit-source-id: 288ddb9
2017-03-31 10:09:21 -07:00
Andrew Kryczka
4e0065015d make all DB::Get overloads virtual
Summary:
some fbcode services override it, we need to keep it virtual.

original change: #1756
Closes https://github.com/facebook/rocksdb/pull/2065

Differential Revision: D4808123

Pulled By: ajkr

fbshipit-source-id: 5eaeea7
2017-03-30 23:39:14 -07:00
Orgad Shaneh
6401a8b76b Fix build with MinGW
Summary:
There still are many warnings (most of them about invalid printf format
for long long), but it builds if FAIL_ON_WARNINGS is disabled.
Closes https://github.com/facebook/rocksdb/pull/2052

Differential Revision: D4807355

Pulled By: siying

fbshipit-source-id: ef03786
2017-03-30 16:54:52 -07:00
Andrew Kryczka
80fe5b3855 disable test: DeleteSchedulerTest.DynamicRateLimiting1
Summary:
temporarily disable since it isn't working on travis.
Closes https://github.com/facebook/rocksdb/pull/2064

Differential Revision: D4807373

Pulled By: ajkr

fbshipit-source-id: f2bb2b0
2017-03-30 16:54:52 -07:00
Andrew Kryczka
a9c86f51b7 backup garbage collect shared_checksum tmp files
Summary:
previously we only cleaned up .tmp files under "shared/" and "private/" directories in case the previous backup failed. we need to do the same for "shared_checksum/"; otherwise, the subsequent backup will fail if it tries to backup at least one of the same files.
Closes https://github.com/facebook/rocksdb/pull/2062

Differential Revision: D4805599

Pulled By: ajkr

fbshipit-source-id: eaa6088
2017-03-30 14:54:12 -07:00
Aaron Gao
da175f7ecc exit with code 2 when there is already a db_bench running in regression test
Summary: Closes https://github.com/facebook/rocksdb/pull/2063

Differential Revision: D4805507

Pulled By: lightmark

fbshipit-source-id: 59ccb18
2017-03-30 14:09:11 -07:00
Adam Retter
0ee7f04039 Added missing options to RocksJava
Summary:
This adds almost all missing options to RocksJava
Closes https://github.com/facebook/rocksdb/pull/2039

Differential Revision: D4779991

Pulled By: siying

fbshipit-source-id: 4a1bf28
2017-03-30 12:09:21 -07:00
Sagar Vemuri
c6d04f2ecf Option to fail a request as incomplete when skipping too many internal keys
Summary:
Operations like Seek/Next/Prev sometimes take too long to complete when there are many internal keys to be skipped. Adding an option, max_skippable_internal_keys -- which could be used to set a threshold for the maximum number of keys that can be skipped, will help to address these cases where it is much better to fail a request (as incomplete) than to wait for a considerable time for the request to complete.

This feature -- to fail an iterator seek request as incomplete, is disabled by default when max_skippable_internal_keys = 0. It is enabled only when max_skippable_internal_keys > 0.

This feature is based on the discussion mentioned in the PR https://github.com/facebook/rocksdb/pull/1084.
Closes https://github.com/facebook/rocksdb/pull/2000

Differential Revision: D4753223

Pulled By: sagar0

fbshipit-source-id: 1c973f7
2017-03-30 12:09:21 -07:00
Herman Lee
58179ec4a6 Cleanup of ThreadStatusUtil structures should use the DB's reference
Summary:
instead of thread_local

The cleanup path for the rocksdb database might not have the
thread_updater_local_cache_ pointer initialized because the thread
executing the cleanup is likely not a rocksdb thread. This results in a
memory leak detected by Valgrind. The cleanup code path should use the
thread_status_updater pointer obtained from the DB object instead of a
thread local one.
Closes https://github.com/facebook/rocksdb/pull/2059

Differential Revision: D4801611

Pulled By: hermanlee

fbshipit-source-id: 407d7de
2017-03-30 10:39:13 -07:00
Aaron Gao
f3607640a6 add ldb build to regression test
Summary:
add ldb to regression test in order to enable reuse db by creating checkpoint
Closes https://github.com/facebook/rocksdb/pull/2030

Differential Revision: D4800549

Pulled By: lightmark

fbshipit-source-id: d3a7325
2017-03-29 18:24:11 -07:00
Min Wei
8a8c967460 Enable Fast CRC32 for Win64
Summary:
Currently the fast crc32 path is not enabled on Windows. I am trying to enable it here, hopefully, with the minimum impact to the existing code structure.
Closes https://github.com/facebook/rocksdb/pull/2033

Differential Revision: D4770635

Pulled By: siying

fbshipit-source-id: 676f8b8
2017-03-29 17:39:19 -07:00
Mikhail Antonov
f9813b853d Added SstFileWriter construtor without explicit comparator to JNI api
Summary:
Adding API missing after 1ffbdfd9a7 (diff-b94146418eed4a9c1bf324041b95b279).

adamretter  IslamAbdelRahman

Tested locally.
Closes https://github.com/facebook/rocksdb/pull/2028

Differential Revision: D4762817

Pulled By: IslamAbdelRahman

fbshipit-source-id: 833f478
2017-03-29 17:24:14 -07:00
Sharan Suryanarayanan
8d3cb4f207 Added naming of backup engine threads
Summary:
Changed the naming of backup engine threads from "ldb" to "backup_engine"
Closes https://github.com/facebook/rocksdb/pull/2053

Differential Revision: D4799325

Pulled By: ajkr

fbshipit-source-id: 046893f
2017-03-29 17:10:46 -07:00
Siying Dong
67d7623794 Expose the stalling information through DB::GetProperty()
Summary:
Add two DB properties: rocksdb.actual_delayed_write_rate and rocksdb.is_write_stooped, for people to know whether current writes are being throttled.
Closes https://github.com/facebook/rocksdb/pull/2043

Differential Revision: D4782975

Pulled By: siying

fbshipit-source-id: 6b2f5cf
2017-03-29 11:54:20 -07:00
Aaron Gao
0fd574926c delete fallocate with punch_hole
Summary:
As discuss in this thread:
https://www.facebook.com/groups/rocksdb.dev/permalink/1218043868294125/

We remove fallocate with FALLOC_FL_PUNCH_HOLE because the recent bug on xfs in kernel 4.x+ that align file size to page size even with FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE.
Closes https://github.com/facebook/rocksdb/pull/2038

Differential Revision: D4779974

Pulled By: siying

fbshipit-source-id: 5f54625
2017-03-28 15:54:12 -07:00
Yi Wu
41fe9ad75b Hide usage of compaction_options_fifo from lite build
Summary:
...to fix lite build error.
Closes https://github.com/facebook/rocksdb/pull/2046

Differential Revision: D4785910

Pulled By: yiwu-arbug

fbshipit-source-id: b591f27
2017-03-28 13:39:13 -07:00
Maysam Yabandeh
e7731d119a Configure index partition size
Summary:
Allow the users to specify the target index partition size.

With this patch an index partition is cut before its estimated in-memory size goes above the configured value for metadata_block_size. The filter partitions are still cut right after an index partition is cut.
Closes https://github.com/facebook/rocksdb/pull/2041

Differential Revision: D4780216

Pulled By: maysamyabandeh

fbshipit-source-id: 95a0831
2017-03-28 12:09:12 -07:00
Ayappan
69c8d524a3 Fix jni library name for PowerPC Architecture
Summary:
Right now, building rocksdbjava in PowerPC is broken due to JNI library name. I figured it out that "uname -m" and java's os.arch matches in PowerPC architecture. I made use of this advantage to fix the issue. More info can found from this issue  --> https://github.com/facebook/rocksdb/issues/1317
Closes https://github.com/facebook/rocksdb/pull/2040

Differential Revision: D4779967

Pulled By: siying

fbshipit-source-id: 259f939
2017-03-27 14:09:11 -07:00
Maysam Yabandeh
34a70859bc Fix segmentation fault caused by #1961
Summary:
Fixes #1961 which causes a segfault when filter_policy is nullptr and both
pin_l0_filter_and_index_blocks_in_cache/cache_index_and_filter_blocks
are set.
Closes https://github.com/facebook/rocksdb/pull/2029

Differential Revision: D4764862

Pulled By: maysamyabandeh

fbshipit-source-id: 05bd695
2017-03-24 17:24:11 -07:00