Summary:
Allow the users to specify the target index partition size.
With this patch an index partition is cut before its estimated in-memory size goes above the configured value for metadata_block_size. The filter partitions are still cut right after an index partition is cut.
Closes https://github.com/facebook/rocksdb/pull/2041
Differential Revision: D4780216
Pulled By: maysamyabandeh
fbshipit-source-id: 95a0831
Summary:
Right now, building rocksdbjava in PowerPC is broken due to JNI library name. I figured it out that "uname -m" and java's os.arch matches in PowerPC architecture. I made use of this advantage to fix the issue. More info can found from this issue --> https://github.com/facebook/rocksdb/issues/1317
Closes https://github.com/facebook/rocksdb/pull/2040
Differential Revision: D4779967
Pulled By: siying
fbshipit-source-id: 259f939
Summary:
Fixes#1961 which causes a segfault when filter_policy is nullptr and both
pin_l0_filter_and_index_blocks_in_cache/cache_index_and_filter_blocks
are set.
Closes https://github.com/facebook/rocksdb/pull/2029
Differential Revision: D4764862
Pulled By: maysamyabandeh
fbshipit-source-id: 05bd695
Summary:
Added fifo benchmark to db_bench.
One thing i am not sure is that i am using CompactRange() instead of CompactFiles(). (may cause performance skew because CompactionRange() is not happening in current thread?) For CompactFiles(), for some reason FIFO compaction doesn't work as expected. More insight is welcomed. I guess FIFO compaction doesn't work with file names? igorcanadi
test cmd:
./db_bench --compaction_style=2 --benchmarks=fillseqdeterministic --disable_auto_compactions --num_levels=1 --fifo_compaction_max_table_files_size_mb=10
---------------------- DB 0 LSM ---------------------
Level[0]: /000014.sst(size: 4211014 bytes)
fillseqdeterministic : 4.731 micros/op 211381 ops/sec; 23.4 MB/s
Closes https://github.com/facebook/rocksdb/pull/1734
Differential Revision: D4774964
Pulled By: siying
fbshipit-source-id: 9d08df6
Summary:
need to consistently include "rocksdb/persistent_cache.h" to fix internal build
Closes https://github.com/facebook/rocksdb/pull/2034
Differential Revision: D4768101
Pulled By: ajkr
fbshipit-source-id: 2ecb07f
Summary:
I've added functions to the C API to support WriteBatchWithIndex as requested in #1833.
I've also added unit tests to c_test
I've implemented the WriteBatchWithIndex variation of every function available for regular WriteBatch. And added additional functions unique to WriteBatchWithIndex.
For now, the following is omitted:
1. The ability to create WriteBatchWithIndex's custom batch-only iterator as I'm not sure what its purpose is. It should be possible to add later if anyone wants it.
2. The ability to create the batch with a fallback comparator, since it appears to be unnecessary. I believe the column family comparator will be used for this, meaning those using a custom comparator can just use the column family variations.
Closes https://github.com/facebook/rocksdb/pull/1985
Differential Revision: D4760039
Pulled By: siying
fbshipit-source-id: 393227e
Summary:
Since non_shn CI was made to run in parallel, /dev/shm is automatically used. It defeated the purpose of the test to cover a non-ramfs file system.
Closes https://github.com/facebook/rocksdb/pull/2031
Differential Revision: D4764804
Pulled By: siying
fbshipit-source-id: 5666bda
Summary:
options_file_example should be added in .gitignore so that it does not show up as an untracked file in `git status`.
Closes https://github.com/facebook/rocksdb/pull/2026
Differential Revision: D4759402
Pulled By: sagar0
fbshipit-source-id: d7fe133
Summary:
MemTableInserter default constructs Post processing info
std::map. However, on Windows with 2015 STL the default
constructed map still dynamically allocates one node
which shows up on a profiler and we loose ~40% throughput
on fillrandom benchmark.
Solution: declare a map as std::aligned storage and optionally
construct.
This addresses https://github.com/facebook/rocksdb/issues/1976
Before:
-------------------------------------------------------------------
Initializing RocksDB Options from command-line flags
DB path: [k:\data\BulkLoadRandom_10M_fillonly]
fillrandom : 2.775 micros/op 360334 ops/sec; 280.4 MB/s
Microseconds per write:
Count: 10000000 Average: 2.7749 StdDev: 39.92
Min: 1 Median: 2.0826 Max: 26051
Percentiles: P50: 2.08 P75: 2.55 P99: 3.55 P99.9: 9.58 P99.99: 51.5**6
------------------------------------------------------
After:
Initializing RocksDB Options from command-line flags
DB path: [k:\data\BulkLoadRandom_10M_fillon
Closes https://github.com/facebook/rocksdb/pull/2011
Differential Revision: D4740823
Pulled By: siying
fbshipit-source-id: 1daaa2c
Summary:
PlainTable now supports non-mmap mode. We don't need to check it anymore.
Closes https://github.com/facebook/rocksdb/pull/1882
Differential Revision: D4751643
Pulled By: siying
fbshipit-source-id: ab14540
Summary:
Add a parameter to Checkpoint::CreateCheckpoint() so that flush can be skipped if total log file size is within a threshold.
Closes https://github.com/facebook/rocksdb/pull/1993
Differential Revision: D4719842
Pulled By: siying
fbshipit-source-id: 4f9d9e1
Summary:
Removed max_grandparent_overlap_factor from benchmark.sh since it is not a valid option anymore.
Closes https://github.com/facebook/rocksdb/pull/2015
Differential Revision: D4748229
Pulled By: lgalanis
fbshipit-source-id: c3869ea
Summary:
Previously, when DB write buffer size triggers, we always pick the CF with most data in its memtable to flush. This approach can minimize total flush happens. Change the behavior to always pick the oldest unflushed CF, which makes it the same behavior when max_total_wal_size hits. This approach will minimize size used by max_total_wal_size.
Closes https://github.com/facebook/rocksdb/pull/1987
Differential Revision: D4703214
Pulled By: siying
fbshipit-source-id: 9ff8b09
Summary:
in buffered io, the filesize_ is the real size.
Closes https://github.com/facebook/rocksdb/pull/1991
Differential Revision: D4711433
Pulled By: lightmark
fbshipit-source-id: ad604b9
Summary:
"DEPRECATED" is ambiguous. Make it clear that those options not supported won't take effect.
Closes https://github.com/facebook/rocksdb/pull/1995
Differential Revision: D4724241
Pulled By: siying
fbshipit-source-id: 1e812b8
Summary:
PinnableSlice
Summary:
Currently the point lookup values are copied to a string provided by the
user. This incures an extra memcpy cost. This patch allows doing point lookup
via a PinnableSlice which pins the source memory location (instead of
copying their content) and releases them after the content is consumed
by the user. The old API of Get(string) is translated to the new API
underneath.
Here is the summary for improvements:
value 100 byte: 1.8% regular, 1.2% merge values
value 1k byte: 11.5% regular, 7.5% merge values
value 10k byte: 26% regular, 29.9% merge values
The improvement for merge could be more if we extend this approach to
pin the merge output and delay the full merge operation until the user
actually needs it. We have put that for future work.
PS:
Sometimes we observe a small decrease in performance when switching from
t5452014 to this patch but with the old Get(string) API. The d
Closes https://github.com/facebook/rocksdb/pull/1756
Differential Revision: D4391738
Pulled By: maysamyabandeh
fbshipit-source-id: 6f3edd3
Summary:
The comparator param in SstFileWriter constructor is redundant as it already exists as a field in options. So the current SstFileWriter constructor should be deprecated in favor of a new one which does not take a comparator.
Note that the jni/java apis have not been touched yet.
Closes https://github.com/facebook/rocksdb/pull/1978
Differential Revision: D4685629
Pulled By: sagar0
fbshipit-source-id: 372ce96
Summary:
Add the flag --prefix to the sst_dump tool
This flag is similar to, and exclusive from, the --from flag.
--prefix=0x00FF will return all rows prefixed with 0x00FF.
The --to flag may also be specified and will work as expected.
These changes were used to help in debugging the power cycle corruption issue and theses changes were tested by scanning through a udb.
Closes https://github.com/facebook/rocksdb/pull/1984
Differential Revision: D4691814
Pulled By: reidHoruff
fbshipit-source-id: 027f261
Summary:
Fixing some bugs in MockEnv so it be actually used.
Closes https://github.com/facebook/rocksdb/pull/1914
Differential Revision: D4609923
Pulled By: maysamyabandeh
fbshipit-source-id: ca25735
Summary:
Without the cast, the build will break on Windows.
Closes https://github.com/facebook/rocksdb/pull/1982
Differential Revision: D4690462
Pulled By: ajkr
fbshipit-source-id: c493b6c
Summary:
This reverts commit d43adf21bb.
The patch has caused problems in regression tests. Will revert it for now until we figure how to debug the problems regression tests.
Closes https://github.com/facebook/rocksdb/pull/1975
Differential Revision: D4682880
Pulled By: maysamyabandeh
fbshipit-source-id: 84df83a
Summary:
It augments the regression benchmarks with a time command, parses the output, and print them to the SUMMARY.csv file.
I tested a variation of the script locally. Any idea how to do run a test that also involves writing to scuba tables?
Closes https://github.com/facebook/rocksdb/pull/1967
Differential Revision: D4679470
Pulled By: maysamyabandeh
fbshipit-source-id: 44dac30
Summary:
the 50%+ drained constraint wasn't working consistently in some of our test environments, maybe their resources are too low. relax the constraints a bit.
Closes https://github.com/facebook/rocksdb/pull/1970
Differential Revision: D4679419
Pulled By: ajkr
fbshipit-source-id: 3789cd8
Summary:
This fixes an issue when the most recent readers assume that alignment is always set even if direct io is off.
Also adjust slightly appveyor script to run db_basic_test cases concurrently.
Closes https://github.com/facebook/rocksdb/pull/1959
Differential Revision: D4671972
Pulled By: IslamAbdelRahman
fbshipit-source-id: 1886620