Summary:
- Change data_[b] to data_[b / 8] in DynamicBloom::Prefetch, as b means the b-th bit in data_ and data_[b / 8] is the proper byte in data_.
Closes https://github.com/facebook/rocksdb/pull/1935
Differential Revision: D4628696
Pulled By: siying
fbshipit-source-id: bc5a0c6
Summary:
For the sake of making our options simpler, we should keep options.h as simple as possible and move more advanced/less common options to advaned_options.h
I started with ColumnFamilyOptions and also did some re-ordering
I have moved all ColumnFamilyOptions to advanced_options.h and only left these options in options.h
```
const Comparator* comparator = BytewiseComparator();
std::shared_ptr<MergeOperator> merge_operator = nullptr;
const CompactionFilter* compaction_filter = nullptr;
std::shared_ptr<CompactionFilterFactory> compaction_filter_factory = nullptr;
size_t write_buffer_size = 64 << 20;
CompressionType compression;
int level0_file_num_compaction_trigger = 4;
bool disable_auto_compactions = false;
```
Please feel free to comment on specific options if you think they should be advanced or should not be
Closes https://github.com/facebook/rocksdb/pull/1847
Differential Revision: D4519996
Pulled By: IslamAbdelRahman
fbshipit-source-id: abebd9a
Summary:
I have manually audited the entire RocksJava code base.
Sorry for the large pull-request, I have broken it down into many small atomic commits though.
My initial intention was to fix the warnings that appear when running RocksJava on Java 8 with `-Xcheck:jni`, for example when running `make jtest` you would see many errors similar to:
```
WARNING in native method: JNI call made without checking exceptions when required to from CallObjectMethod
WARNING in native method: JNI call made without checking exceptions when required to from CallVoidMethod
WARNING in native method: JNI call made without checking exceptions when required to from CallStaticVoidMethod
...
```
A few of those warnings still remain, however they seem to come directly from the JVM and are not directly related to RocksJava; I am in contact with the OpenJDK hostpot-dev mailing list about these - http://mail.openjdk.java.net/pipermail/hotspot-dev/2017-February/025981.html.
As a result of fixing these, I realised we were not r
Closes https://github.com/facebook/rocksdb/pull/1890
Differential Revision: D4591758
Pulled By: siying
fbshipit-source-id: 7f7fdf4
Summary:
Separate a smal subset of tests in DBTest to DBBasicTest. Tests in DBTest don't have to run in CI tests on platforms like OSX, as long as they are covered by Linux.
Closes https://github.com/facebook/rocksdb/pull/1924
Differential Revision: D4616702
Pulled By: siying
fbshipit-source-id: 13e6549
Summary:
Travis is short of OSX resource. Try to move platform independent test suites out of OSX
Closes https://github.com/facebook/rocksdb/pull/1922
Differential Revision: D4616070
Pulled By: siying
fbshipit-source-id: 786342c
Summary:
A previous fix to DBTestUniversalCompaction.UniversalCompactionTrivialMoveTest2 didn't address the right problem. The problem is L0->L0 compaction is not trivial move in the scenario, not parallel compactions. Fix this.
Closes https://github.com/facebook/rocksdb/pull/1911
Differential Revision: D4608955
Pulled By: siying
fbshipit-source-id: 7a712cb
Summary:
The PATH update should put the Java path in the beginning, rather than the end. Otherwise, it will be overwritten. Also upgrade the Java version.
Closes https://github.com/facebook/rocksdb/pull/1912
Differential Revision: D4609854
Pulled By: siying
fbshipit-source-id: 3dc04f2
Summary:
valgrind tests always timeout with parallel run. Black list some slowest ones. It is better to run fewer tests than always have the tests timeout.
Closes https://github.com/facebook/rocksdb/pull/1908
Differential Revision: D4607875
Pulled By: siying
fbshipit-source-id: 7062664
Summary:
The assertion in Abandon() fails when called after Finish() fails. Finish() already closes the builder so there's no need to call Abandon().
Closes https://github.com/facebook/rocksdb/pull/1901
Differential Revision: D4601373
Pulled By: ajkr
fbshipit-source-id: e5678be
Summary:
…action
The two options, min_partial_merge_operands and verify_checksums_in_compaction, are not seldom used. Remove them to reduce the total number of options. Also remove them from Java and C interface.
Closes https://github.com/facebook/rocksdb/pull/1902
Differential Revision: D4601219
Pulled By: siying
fbshipit-source-id: aad4cb2
Summary:
add direct_io and compaction_readahead_size in db_stress
test direct_io under db_stress with compaction_readahead_size enabled to capture bugs found in production.
`./db_stress --allow_concurrent_memtable_write=0 --use_direct_reads --use_direct_writes --compaction_readahead_size=4096`
Closes https://github.com/facebook/rocksdb/pull/1906
Differential Revision: D4604514
Pulled By: IslamAbdelRahman
fbshipit-source-id: ebbf0ee
Summary:
querying logical sector size from the device instead of hardcoding it for linux platform.
Closes https://github.com/facebook/rocksdb/pull/1875
Differential Revision: D4591946
Pulled By: ajkr
fbshipit-source-id: 4e9805c
Summary:
InsertPathnameToSizeBytes() is called on shared/ and shared_checksum/ directories, which only exist for certain configurations. If we try to list a non-existent directory's contents, some Envs will dump an error message. Let's avoid this by checking whether the directory exists before listing its contents.
Closes https://github.com/facebook/rocksdb/pull/1895
Differential Revision: D4596301
Pulled By: ajkr
fbshipit-source-id: c809679
Summary:
`WriteBatchWithIndex` has an incorrect implicitly-generated move constructor (it will copy the pointer causing a double-free on destruction). Just switch to `unique_ptr` so we get correct move semantics for free.
Closes https://github.com/facebook/rocksdb/pull/1899
Differential Revision: D4598896
Pulled By: ajkr
fbshipit-source-id: 2373d47
Summary:
As the last step in backup creation, the .tmp directory is renamed omitting the .tmp suffix. In case the process terminates before this, the .tmp directory will be left behind. Even if this happens, we want future backups to succeed, so I added some checks/cleanup for this case.
Closes https://github.com/facebook/rocksdb/pull/1896
Differential Revision: D4597323
Pulled By: ajkr
fbshipit-source-id: 48900d8
Summary:
Missing this function will cause RandomAccessFileReader not doing alignment in Direct IO mode, which introduce an IOError: invalid argument.
Closes https://github.com/facebook/rocksdb/pull/1900
Differential Revision: D4601261
Pulled By: lightmark
fbshipit-source-id: c3eadf1
Summary:
we occasionally missing this call so the file size will be wrong
Closes https://github.com/facebook/rocksdb/pull/1894
Differential Revision: D4598446
Pulled By: lightmark
fbshipit-source-id: 42b6ef5
Summary:
Some places autodetected. These are the two places that didn't.
closes#1498
Still unsure if the following instances of 4 * 1024 need fixing in:
util/io_posix.h
include/rocksdb/table.h (appears to be blocksize and different)
utilities/persistent_cache/block_cache_tier.cc
utilities/persistent_cache/persistent_cache_test.h
include/rocksdb/env.h
util/env_posix.cc
db/column_family.cc
Closes https://github.com/facebook/rocksdb/pull/1499
Differential Revision: D4593640
Pulled By: yiwu-arbug
fbshipit-source-id: efc48de
Summary:
This is a trivial fix for OOMs we've seen a few days ago in logdevice.
RocksDB get into the following state:
(1) Write throughput is too high for flushes to keep up. Compactions are out of the picture - automatic compactions are disabled, and for manual compactions we don't care that much if they fall behind. We write to many CFs, with only a few L0 sst files in each, so compactions are not needed most of the time.
(2) total_log_size_ is consistently greater than GetMaxTotalWalSize(). It doesn't get smaller since flushes are falling ever further behind.
(3) Total size of memtables is way above db_write_buffer_size and keeps growing. But the write_buffer_manager_->ShouldFlush() is not checked because (2) prevents it (for no good reason, afaict; this is what this commit fixes).
(4) Every call to WriteImpl() hits the MaybeFlushColumnFamilies() path. This keeps flushing the memtables one by one in order of increasing log file number.
(5) No write stalling trigger is hit. We rely on max_write_buffer_number
Closes https://github.com/facebook/rocksdb/pull/1893
Differential Revision: D4593590
Pulled By: yiwu-arbug
fbshipit-source-id: af79c5f
Summary:
MongoRocks is still using some deprecated functions, return them temporarily
Closes https://github.com/facebook/rocksdb/pull/1892
Differential Revision: D4592451
Pulled By: IslamAbdelRahman
fbshipit-source-id: 5e6be3e
Summary:
reimplement the compaction expansion on lower level.
Considering such a case:
input level file: 1[B E] 2[F G] 3[H I] 4 [J M]
output level file: 5[A C] 6[D K] 7[L O]
If we initially pick file 2, now we will compact file 2 and 6. But we can safely compact 2, 3 and 6 without expanding the output level.
The previous code is messy and wrong.
In this diff, I first determine the input range [a, b], and output range [c, d],
then we get the range [e,f] = [min(a, c), max(b, d] and put all eligible clean-cut files within [e, f] into this compaction.
**Note: clean-cut means the files don't have the same user key on the boundaries of some files that are not chosen in this compaction**.
Closes https://github.com/facebook/rocksdb/pull/1760
Differential Revision: D4395564
Pulled By: lightmark
fbshipit-source-id: 2dc2c5c
Summary:
The option has been deprecated for two years and has no effect. Removing.
Closes https://github.com/facebook/rocksdb/pull/1866
Differential Revision: D4555203
Pulled By: yiwu-arbug
fbshipit-source-id: c48f627
Summary:
I'd like to propose a patch to expose a new IOError type with subcode kStaleFile to allow to detect when ESTALE error is returned. This allows the rocksdb consumers to handle this error separately from other IOErrors.
I've also added a missing string representation for the kDeadlock subcode, I believe calling ToString() on Status object with that subcode would result in an out of band access in the msgs array,
Please let me know if you have any questions or would like me to make any changes to this pull request.
Closes https://github.com/facebook/rocksdb/pull/1748
Differential Revision: D4387675
Pulled By: IslamAbdelRahman
fbshipit-source-id: 67feb13
Summary:
Remove functions that we deprecated long time ago in db.h
Closes https://github.com/facebook/rocksdb/pull/1878
Differential Revision: D4576521
Pulled By: IslamAbdelRahman
fbshipit-source-id: dfddad1
Summary:
The Makefile in the examples directory contained an empty line contain a tab character. This made my Emacs ask on every save `Suspicious line 10. Save anyway? (y or n)`
Closes https://github.com/facebook/rocksdb/pull/1872
Differential Revision: D4573881
Pulled By: siying
fbshipit-source-id: fb3b4ee
Summary:
fix lite bugs
disable direct io in lite mode
Closes https://github.com/facebook/rocksdb/pull/1870
Differential Revision: D4559866
Pulled By: yiwu-arbug
fbshipit-source-id: 3761c51
Summary:
RepairDB isn't included in rocksdb lite, so don't test it.
Closes https://github.com/facebook/rocksdb/pull/1873
Differential Revision: D4565094
Pulled By: ajkr
fbshipit-source-id: 8cc0898
Summary:
NowMicros() provides non-monotonic time. When wall clock is
synchronized or changed, the non-monotonicity time points will affect write rate
controllers. This patch changes write_controller.cc and rate_limiter.cc to use
monotonic time points.
Closes https://github.com/facebook/rocksdb/pull/1865
Differential Revision: D4561732
Pulled By: siying
fbshipit-source-id: 95ece62
Summary:
Seems to me `has_unpersisted_data_` is read from read thread and write
from write thread concurrently without synchronization. Making it an
atomic.
I update the logic not because seeing any problem with it, but it just
feel confusing.
Closes https://github.com/facebook/rocksdb/pull/1869
Differential Revision: D4555837
Pulled By: yiwu-arbug
fbshipit-source-id: eff2ab8
Summary:
Remove disableDataSync, and another similarly named disable_data_sync options.
This is being done to simplify options, and also because the performance gains of this feature can be achieved by other methods.
Closes https://github.com/facebook/rocksdb/pull/1859
Differential Revision: D4541292
Pulled By: sagar0
fbshipit-source-id: 5b3a6ca
Summary:
The previous version of zlib is no longer available. I have also updated the versions of the other static libraries and added checkum checks for the downloads; This is related to https://github.com/facebook/rocksdb/issues/1769
Closes https://github.com/facebook/rocksdb/pull/1863
Differential Revision: D4550742
Pulled By: yiwu-arbug
fbshipit-source-id: 4414150
Summary:
Record the first parsed sequence number as the minimum
so we can find the true minimum otherwise everything is larger than zero.
Fix the comparator name comparision.
Closes https://github.com/facebook/rocksdb/pull/1858
Differential Revision: D4544365
Pulled By: ajkr
fbshipit-source-id: 439cbc2
Summary:
It was really annoying to have two places (top and bottom of compaction loop) where we cut output files. I had bugs in both DeleteRange and dictionary compression due to updating only one of the two. This diff consolidates the file-cutting logic to the bottom of the compaction loop.
Keep in mind that my goal with input_status is to be consistent with the past behavior, even though I'm not sure it's ideal.
Closes https://github.com/facebook/rocksdb/pull/1832
Differential Revision: D4503038
Pulled By: ajkr
fbshipit-source-id: 7da5213
Summary:
Announce the experimetnal two-level index feature in HISTORY.md. Also updated the default for index_per_partition to 1024.
Closes https://github.com/facebook/rocksdb/pull/1855
Differential Revision: D4530102
Pulled By: maysamyabandeh
fbshipit-source-id: b0fc6ff