Summary:
Refactor WriteImpl() so when I plug-in the pipeline write code (which is
an alternative approach for WriteThread), some of the logic can be
reuse. I split out the following methods from WriteImpl():
* PreprocessWrite()
* HandleWALFull() (previous MaybeFlushColumnFamilies())
* HandleWriteBufferFull()
* WriteToWAL()
Also adding a constructor to WriteThread::Writer, and move WriteContext into db_impl.h.
No real logic change in this patch.
Closes https://github.com/facebook/rocksdb/pull/2042
Differential Revision: D4781014
Pulled By: yiwu-arbug
fbshipit-source-id: d45ca18
Summary:
It is confusing to have auto_roll_logger to stay under db/, which has nothing to do with database. Move filename together as it is a dependency.
Closes https://github.com/facebook/rocksdb/pull/2080
Differential Revision: D4821141
Pulled By: siying
fbshipit-source-id: ca7d768
Summary:
Following the instructions on INSTALL.md,
I found some errors while installing gflags and zstandard libraries on Centos and fixed them.
Thanks :)
Closes https://github.com/facebook/rocksdb/pull/2077
Differential Revision: D4820804
Pulled By: ajkr
fbshipit-source-id: db4abb3
Summary:
auto_roll_logger_test relies on timing conditon that some operations finish within 1 seconds. This caused flaky tests. Move away from real timing and sleep and use fake time to verify the time-based rolling.
Closes https://github.com/facebook/rocksdb/pull/2066
Differential Revision: D4810647
Pulled By: siying
fbshipit-source-id: c54d994
Summary:
`uint` is nonstandard and not a built-in type on all compilers; replace it
with the always-valid `unsigned int`. I assume this went unnoticed because
it's inside an `#ifdef ROCKDB_JEMALLOC`.
Closes https://github.com/facebook/rocksdb/pull/2075
Differential Revision: D4820427
Pulled By: ajkr
fbshipit-source-id: 0876561
Summary:
Previously it always showed 0.0 for L0 write-amp because we were dividing by bytes read from non-output level. For L0, we should instead divide by bytes ingested to the DB. Note the numerator (bytes written to L0) includes flush bytes.
Closes https://github.com/facebook/rocksdb/pull/2078
Differential Revision: D4816902
Pulled By: ajkr
fbshipit-source-id: 7dca31a
Summary:
Fix the bug that previous test existence locally
Closes https://github.com/facebook/rocksdb/pull/2074
Differential Revision: D4813631
Pulled By: lightmark
fbshipit-source-id: eceb0d9
Summary:
some fbcode services override it, we need to keep it virtual.
original change: #1756
Closes https://github.com/facebook/rocksdb/pull/2065
Differential Revision: D4808123
Pulled By: ajkr
fbshipit-source-id: 5eaeea7
Summary:
There still are many warnings (most of them about invalid printf format
for long long), but it builds if FAIL_ON_WARNINGS is disabled.
Closes https://github.com/facebook/rocksdb/pull/2052
Differential Revision: D4807355
Pulled By: siying
fbshipit-source-id: ef03786
Summary:
temporarily disable since it isn't working on travis.
Closes https://github.com/facebook/rocksdb/pull/2064
Differential Revision: D4807373
Pulled By: ajkr
fbshipit-source-id: f2bb2b0
Summary:
previously we only cleaned up .tmp files under "shared/" and "private/" directories in case the previous backup failed. we need to do the same for "shared_checksum/"; otherwise, the subsequent backup will fail if it tries to backup at least one of the same files.
Closes https://github.com/facebook/rocksdb/pull/2062
Differential Revision: D4805599
Pulled By: ajkr
fbshipit-source-id: eaa6088
Summary:
This adds almost all missing options to RocksJava
Closes https://github.com/facebook/rocksdb/pull/2039
Differential Revision: D4779991
Pulled By: siying
fbshipit-source-id: 4a1bf28
Summary:
Operations like Seek/Next/Prev sometimes take too long to complete when there are many internal keys to be skipped. Adding an option, max_skippable_internal_keys -- which could be used to set a threshold for the maximum number of keys that can be skipped, will help to address these cases where it is much better to fail a request (as incomplete) than to wait for a considerable time for the request to complete.
This feature -- to fail an iterator seek request as incomplete, is disabled by default when max_skippable_internal_keys = 0. It is enabled only when max_skippable_internal_keys > 0.
This feature is based on the discussion mentioned in the PR https://github.com/facebook/rocksdb/pull/1084.
Closes https://github.com/facebook/rocksdb/pull/2000
Differential Revision: D4753223
Pulled By: sagar0
fbshipit-source-id: 1c973f7
Summary:
instead of thread_local
The cleanup path for the rocksdb database might not have the
thread_updater_local_cache_ pointer initialized because the thread
executing the cleanup is likely not a rocksdb thread. This results in a
memory leak detected by Valgrind. The cleanup code path should use the
thread_status_updater pointer obtained from the DB object instead of a
thread local one.
Closes https://github.com/facebook/rocksdb/pull/2059
Differential Revision: D4801611
Pulled By: hermanlee
fbshipit-source-id: 407d7de
Summary:
add ldb to regression test in order to enable reuse db by creating checkpoint
Closes https://github.com/facebook/rocksdb/pull/2030
Differential Revision: D4800549
Pulled By: lightmark
fbshipit-source-id: d3a7325
Summary:
Currently the fast crc32 path is not enabled on Windows. I am trying to enable it here, hopefully, with the minimum impact to the existing code structure.
Closes https://github.com/facebook/rocksdb/pull/2033
Differential Revision: D4770635
Pulled By: siying
fbshipit-source-id: 676f8b8
Summary:
Add two DB properties: rocksdb.actual_delayed_write_rate and rocksdb.is_write_stooped, for people to know whether current writes are being throttled.
Closes https://github.com/facebook/rocksdb/pull/2043
Differential Revision: D4782975
Pulled By: siying
fbshipit-source-id: 6b2f5cf
Summary:
Allow the users to specify the target index partition size.
With this patch an index partition is cut before its estimated in-memory size goes above the configured value for metadata_block_size. The filter partitions are still cut right after an index partition is cut.
Closes https://github.com/facebook/rocksdb/pull/2041
Differential Revision: D4780216
Pulled By: maysamyabandeh
fbshipit-source-id: 95a0831
Summary:
Right now, building rocksdbjava in PowerPC is broken due to JNI library name. I figured it out that "uname -m" and java's os.arch matches in PowerPC architecture. I made use of this advantage to fix the issue. More info can found from this issue --> https://github.com/facebook/rocksdb/issues/1317
Closes https://github.com/facebook/rocksdb/pull/2040
Differential Revision: D4779967
Pulled By: siying
fbshipit-source-id: 259f939
Summary:
Fixes#1961 which causes a segfault when filter_policy is nullptr and both
pin_l0_filter_and_index_blocks_in_cache/cache_index_and_filter_blocks
are set.
Closes https://github.com/facebook/rocksdb/pull/2029
Differential Revision: D4764862
Pulled By: maysamyabandeh
fbshipit-source-id: 05bd695
Summary:
Added fifo benchmark to db_bench.
One thing i am not sure is that i am using CompactRange() instead of CompactFiles(). (may cause performance skew because CompactionRange() is not happening in current thread?) For CompactFiles(), for some reason FIFO compaction doesn't work as expected. More insight is welcomed. I guess FIFO compaction doesn't work with file names? igorcanadi
test cmd:
./db_bench --compaction_style=2 --benchmarks=fillseqdeterministic --disable_auto_compactions --num_levels=1 --fifo_compaction_max_table_files_size_mb=10
---------------------- DB 0 LSM ---------------------
Level[0]: /000014.sst(size: 4211014 bytes)
fillseqdeterministic : 4.731 micros/op 211381 ops/sec; 23.4 MB/s
Closes https://github.com/facebook/rocksdb/pull/1734
Differential Revision: D4774964
Pulled By: siying
fbshipit-source-id: 9d08df6
Summary:
need to consistently include "rocksdb/persistent_cache.h" to fix internal build
Closes https://github.com/facebook/rocksdb/pull/2034
Differential Revision: D4768101
Pulled By: ajkr
fbshipit-source-id: 2ecb07f
Summary:
I've added functions to the C API to support WriteBatchWithIndex as requested in #1833.
I've also added unit tests to c_test
I've implemented the WriteBatchWithIndex variation of every function available for regular WriteBatch. And added additional functions unique to WriteBatchWithIndex.
For now, the following is omitted:
1. The ability to create WriteBatchWithIndex's custom batch-only iterator as I'm not sure what its purpose is. It should be possible to add later if anyone wants it.
2. The ability to create the batch with a fallback comparator, since it appears to be unnecessary. I believe the column family comparator will be used for this, meaning those using a custom comparator can just use the column family variations.
Closes https://github.com/facebook/rocksdb/pull/1985
Differential Revision: D4760039
Pulled By: siying
fbshipit-source-id: 393227e
Summary:
Since non_shn CI was made to run in parallel, /dev/shm is automatically used. It defeated the purpose of the test to cover a non-ramfs file system.
Closes https://github.com/facebook/rocksdb/pull/2031
Differential Revision: D4764804
Pulled By: siying
fbshipit-source-id: 5666bda
Summary:
options_file_example should be added in .gitignore so that it does not show up as an untracked file in `git status`.
Closes https://github.com/facebook/rocksdb/pull/2026
Differential Revision: D4759402
Pulled By: sagar0
fbshipit-source-id: d7fe133
Summary:
MemTableInserter default constructs Post processing info
std::map. However, on Windows with 2015 STL the default
constructed map still dynamically allocates one node
which shows up on a profiler and we loose ~40% throughput
on fillrandom benchmark.
Solution: declare a map as std::aligned storage and optionally
construct.
This addresses https://github.com/facebook/rocksdb/issues/1976
Before:
-------------------------------------------------------------------
Initializing RocksDB Options from command-line flags
DB path: [k:\data\BulkLoadRandom_10M_fillonly]
fillrandom : 2.775 micros/op 360334 ops/sec; 280.4 MB/s
Microseconds per write:
Count: 10000000 Average: 2.7749 StdDev: 39.92
Min: 1 Median: 2.0826 Max: 26051
Percentiles: P50: 2.08 P75: 2.55 P99: 3.55 P99.9: 9.58 P99.99: 51.5**6
------------------------------------------------------
After:
Initializing RocksDB Options from command-line flags
DB path: [k:\data\BulkLoadRandom_10M_fillon
Closes https://github.com/facebook/rocksdb/pull/2011
Differential Revision: D4740823
Pulled By: siying
fbshipit-source-id: 1daaa2c
Summary:
PlainTable now supports non-mmap mode. We don't need to check it anymore.
Closes https://github.com/facebook/rocksdb/pull/1882
Differential Revision: D4751643
Pulled By: siying
fbshipit-source-id: ab14540
Summary:
Add a parameter to Checkpoint::CreateCheckpoint() so that flush can be skipped if total log file size is within a threshold.
Closes https://github.com/facebook/rocksdb/pull/1993
Differential Revision: D4719842
Pulled By: siying
fbshipit-source-id: 4f9d9e1
Summary:
Removed max_grandparent_overlap_factor from benchmark.sh since it is not a valid option anymore.
Closes https://github.com/facebook/rocksdb/pull/2015
Differential Revision: D4748229
Pulled By: lgalanis
fbshipit-source-id: c3869ea