rocksdb

Author	SHA1	Message	Date
Davide Angelocola	c9539ede76	Fix integer overflow in TraceOptions (#9157 ) Summary: Hello from a happy user of rocksdb java :-) Default constructor of TraceOptions is supposed to initialize size to 64GB but the expression contains an integer overflow. Simple test case with JShell: ``` jshell> 64 * 1024 * 1024 * 1024 $1 ==> 0 jshell> 64L * 1024 * 1024 * 1024 $2 ==> 68719476736 ``` Pull Request resolved: https://github.com/facebook/rocksdb/pull/9157 Reviewed By: pdillinger, zhichao-cao Differential Revision: D32369273 Pulled By: mrambacher fbshipit-source-id: 6a0c95fff7a91f27ff15d65b662c6b101756b450	2021-11-17 08:41:48 -08:00
Zhichao Cao	b694cd0e0d	Add tiered storage related read bytes stats to Statistic (#9123 ) Summary: Add three read-bytes counters to Statistics. They will be used by storage tiering to get read information for files with different temperatures. Pull Request resolved: https://github.com/facebook/rocksdb/pull/9123 Test Plan: added new test cases. Reviewed By: siying Differential Revision: D32154745 Pulled By: zhichao-cao fbshipit-source-id: b7905d6dae469a72428742364ec07b634b6f15da	2021-11-16 15:17:17 -08:00
Adam Retter	1a8eec461b	Remove invalid RocksJava native entry (#9147 ) Summary: It seems that an incorrect native source file entry was introduced in https://github.com/facebook/rocksdb/pull/8999. For some reason it appears that CI was not run against that PR, and so the problem was not detected. This PR fixes the problem by removing the invalid entry, allowing RocksJava to build correctly again. Pull Request resolved: https://github.com/facebook/rocksdb/pull/9147 Reviewed By: pdillinger Differential Revision: D32300976 fbshipit-source-id: dbd763b806bacf0fc08f4deaf07c63d0a266c4cf	2021-11-09 17:21:58 -08:00
Alan Paxton	e5b34f5867	Fb 5789 max total WAL size clarification (#9108 ) Summary: Add clarification/extension to comments on max_total_wal_size and the Java wrapper MaxTotalWalSize to better explain the effect of the option on log file sizes. Closes https://github.com/facebook/rocksdb/issues/5789 Pull Request resolved: https://github.com/facebook/rocksdb/pull/9108 Reviewed By: pdillinger Differential Revision: D32066640 Pulled By: mrambacher fbshipit-source-id: 7d5affc87e4119019054af9c884a2ea01d68f5b7	2021-11-08 08:54:37 -08:00
Adam Retter	be351f4754	Restore Java 7 Compatibility (#9103 ) Summary: RocksDB should still compile on Java 7. Pull Request resolved: https://github.com/facebook/rocksdb/pull/9103 Reviewed By: pdillinger Differential Revision: D32067561 Pulled By: mrambacher fbshipit-source-id: bbe9c18c8007ab3e113de4add56a84c9bde61c8e	2021-11-08 08:21:02 -08:00
Alan Paxton	ec9082d698	Regression tests for tickets fixed by previous change. (#9019 ) Summary: closes https://github.com/facebook/rocksdb/issues/5891 closes https://github.com/facebook/rocksdb/issues/2001 Java BytewiseComparator is now unsigned compliant, consistent with the default C++ comparator, which has always been thus. Consequently 2 tickets reporting the previous broken state can be closed. This test confirms that the following issues were in fact resolved by a change made between 6.2.2 and 6.22.1, to wit https://github.com/facebook/rocksdb/commit/7242dae7 which as part of its effect, changed the Java bytewise comparators. Pull Request resolved: https://github.com/facebook/rocksdb/pull/9019 Reviewed By: pdillinger Differential Revision: D31610910 Pulled By: mrambacher fbshipit-source-id: 664230f1377a1aa270136edd63eea2c206b907e9	2021-11-01 15:06:47 -07:00
Alan Paxton	73e6b89fad	Java wrapper for blob_gc_force_threshold as blobGarbageCollectionForceThreshold (#9109 ) Summary: Extra option added as a supplement to https://github.com/facebook/rocksdb/pull/8999 Closes https://github.com/facebook/rocksdb/issues/8221 Pull Request resolved: https://github.com/facebook/rocksdb/pull/9109 Reviewed By: mrambacher Differential Revision: D32065039 Pulled By: ltamasi fbshipit-source-id: 6c484050a30fe0523850a8a3c95dc85b0a501362	2021-11-01 11:59:10 -07:00
myasuka	dc00e4b120	Introduce allowStall option for write buffer manager constructor (#9076 ) Summary: https://github.com/facebook/rocksdb/pull/7898 enabled the write buffer manager to stall writes when memory_usage exceeds buffer_size, which is really useful for limiting memory usage when running in containers. However, this feature is not yet visible to RocksJava. This PR introduces it for RocksJava. Pull Request resolved: https://github.com/facebook/rocksdb/pull/9076 Reviewed By: akankshamahajan15 Differential Revision: D31931092 Pulled By: anand1976 fbshipit-source-id: 5531c16a87598663a02368c07b5e13a503164578	2021-10-26 12:09:54 -07:00
Jonathan Albrecht	e970248602	Add support for building on s390x platform (#8962 ) Summary: This PR adds support for building on s390x including updating travis CI. It uses the previous work in https://github.com/facebook/rocksdb/pull/6168 and adds some more changes to get all current tests (make check and jni tests) to pass. The tests were run with snappy, lz4, bzip2 and zstd all compiled in. There are a few pieces still needed to get the travis build working that I don't think I can do. adamretter is this something you could help with? 1. A prebuilt https://rocksdb-deps.s3-us-west-2.amazonaws.com/cmake/cmake-3.14.5-Linux-s390x.deb package 2. A https://hub.docker.com/r/evolvedbinary/rocksjava s390x image Not sure if there is more required for travis. Happy to help in any way I can. Pull Request resolved: https://github.com/facebook/rocksdb/pull/8962 Reviewed By: mrambacher Differential Revision: D31802198 Pulled By: pdillinger fbshipit-source-id: 683511466fa6b505f85ba5a9964a268c6151f0c2	2021-10-22 10:13:15 -07:00
Alan Paxton	8d615a2b1d	New-style blob option bindings, Java option getter and improve/fix option parsing (#8999 ) Summary: Implementation of https://github.com/facebook/rocksdb/issues/8221, plus/including extension of Java options API to allow the get() of options from RocksDB. The extension allows more comprehensive testing of options at the Java side, by validating that the options are set at the C++ side. Variations on methods: MutableColumnFamilyOptions.MutableColumnFamilyOptionsBuilder getOptions() MutableDBOptions.MutableDBOptionsBuilder getDBOptions() retrieve the options via RocksDB C++ interfaces, and parse the resulting string into one of the Java-style option objects. This necessitated generalising the parsing of option strings in Java, which now parses the full range of option strings returned by the C++ interface, rather than a useful subset. This necessitates the list-separator being changed to :(colon) from , (comma). Pull Request resolved: https://github.com/facebook/rocksdb/pull/8999 Reviewed By: jay-zhuang Differential Revision: D31655487 Pulled By: ltamasi fbshipit-source-id: c38e98145c81c61dc38238b0df580db176ce4efd	2021-10-19 09:21:52 -07:00
Alan Paxton	86cf7266c3	keyMayExist() supports ByteBuffer (#9013 ) Summary: closes https://github.com/facebook/rocksdb/issues/7917 Implemented ByteBuffer API variants of Java keyMayExist() uniformly with and without column families, read options and return data values. Implemented 2 supporting C++ JNI methods. Pull Request resolved: https://github.com/facebook/rocksdb/pull/9013 Reviewed By: mrambacher Differential Revision: D31665989 Pulled By: jay-zhuang fbshipit-source-id: 8adc1730217dba38d6fa7b31d788650a33e28af1	2021-10-18 17:20:07 -07:00
Alan Paxton	f5526af8ed	Fix multiget throwing NPE for num of keys > 70k (#9012 ) Summary: closes https://github.com/facebook/rocksdb/issues/8039 Unnecessary use of multiple local JNI references at the same time, 1 per key, was limiting the size of the key array. The local references don't need to be held simultaneously, so if we rearrange the code we can make it work for bigger key arrays. Incidentally, make errors throw helpful exception messages rather than returning a null pointer. Pull Request resolved: https://github.com/facebook/rocksdb/pull/9012 Reviewed By: mrambacher Differential Revision: D31580862 Pulled By: jay-zhuang fbshipit-source-id: ce05831d52ede332e1b20e74d2dc621d219b9616	2021-10-14 11:48:12 -07:00
Jay Zhuang	6b34eb0ebc	Add remote compaction read/write bytes statistics (#8939 ) Summary: Add basic read/write bytes statistics on the primary side: `REMOTE_COMPACT_READ_BYTES` `REMOTE_COMPACT_WRITE_BYTES` Fixed existing statistics missing some IO for remote compaction. Pull Request resolved: https://github.com/facebook/rocksdb/pull/8939 Test Plan: CI Reviewed By: ajkr Differential Revision: D31074672 Pulled By: jay-zhuang fbshipit-source-id: c57afdba369990185008ffaec7e3fe7c62e8902f	2021-09-28 14:00:37 -07:00
Peter Dillinger	cb5b851ff8	Add (& fix) some simple source code checks (#8821 ) Summary: * Don't hardcode namespace rocksdb (use ROCKSDB_NAMESPACE) * Don't #include <rocksdb/...> (use double quotes) * Support putting NOCOMMIT (any case) in source code that should not be committed/pushed in current state. These will be run with `make check` and in GitHub actions Pull Request resolved: https://github.com/facebook/rocksdb/pull/8821 Test Plan: existing tests, manually try out new checks Reviewed By: zhichao-cao Differential Revision: D30791726 Pulled By: pdillinger fbshipit-source-id: 399c883f312be24d9e55c58951d4013e18429d92	2021-09-07 21:19:27 -07:00
Andrew Kryczka	9308ff366c	Bytes read/written stats for `CreateNewBackup*()` (#8819 ) Summary: Gets `Statistics` from the options associated with the `DB` undergoing backup, and populates new ticker stats with the thread-local `IOContext` read/write counters for the threads doing backup work. Pull Request resolved: https://github.com/facebook/rocksdb/pull/8819 Reviewed By: pdillinger Differential Revision: D30779238 Pulled By: ajkr fbshipit-source-id: 75ccafc355f90906df5cf80367f7245b985772d8	2021-09-07 18:25:16 -07:00
Andrew Kryczka	941543721d	Bytes read stat for `VerifyChecksum()` and `VerifyFileChecksums()` APIs (#8741 ) Summary: - Clarified some comments on compatibility for adding new ticker stats - Added read I/O stats for `VerifyChecksum()` and `VerifyFileChecksums()` APIs Pull Request resolved: https://github.com/facebook/rocksdb/pull/8741 Test Plan: new unit test Reviewed By: zhichao-cao Differential Revision: D30708578 Pulled By: ajkr fbshipit-source-id: d06b961f7e199ae92c266b683e39870aa8f63449	2021-09-07 13:28:29 -07:00
Peter Dillinger	c9cd5d25a8	Remove some unneeded code (#8736 ) Summary: * FullKey and ParseFullKey appear to serve no purpose in the public API (or anything else) so removed. Only use in one test updated. * NumberToString serves no purpose vs. ToString so removed, numerous calls updated * Remove unnecessary forward declarations in metadata.h by re-arranging class definitions. * Remove some unneeded semicolons Pull Request resolved: https://github.com/facebook/rocksdb/pull/8736 Test Plan: existing tests Reviewed By: mrambacher Differential Revision: D30700039 Pulled By: pdillinger fbshipit-source-id: 1e436a576f511a6ed8b4d97af7cc8216bc729af2	2021-09-01 14:28:58 -07:00
anand76	add68bd28a	Add a stat to count secondary cache hits (#8666 ) Summary: Add a stat for secondary cache hits. The ```Cache::Lookup``` API had an unused ```stats``` parameter. This PR uses that to pass the pointer to a ```Statistics``` object that ```LRUCache``` uses to record the stat. Pull Request resolved: https://github.com/facebook/rocksdb/pull/8666 Test Plan: Update a unit test in lru_cache_test Reviewed By: zhichao-cao Differential Revision: D30353816 Pulled By: anand1976 fbshipit-source-id: 2046f78b460428877a26ffdd2bb914ae47dfbe77	2021-08-16 21:01:14 -07:00
sdong	e7c24168d8	Move old files to warm tier in FIFO compactions (#8310 ) Summary: Some FIFO users want to keep the data for longer, but the old data is rarely accessed. This feature allows users to configure FIFO compaction so that data older than a threshold is moved to a warm storage tier. Pull Request resolved: https://github.com/facebook/rocksdb/pull/8310 Test Plan: Add several unit tests. Reviewed By: ajkr Differential Revision: D28493792 fbshipit-source-id: c14824ea634814dee5278b449ab5c98b6e0b5501	2021-08-09 12:51:14 -07:00
Brendan MacDonell	8ca081780b	Correct javadoc for Env#setBackgroundThreads(int) (#8576 ) Summary: By default, the low priority pool is not the flush pool, so calling `Env#setBackgroundThreads` without providing a priority will not do what the caller expected. Pull Request resolved: https://github.com/facebook/rocksdb/pull/8576 Reviewed By: ajkr Differential Revision: D29925154 Pulled By: mrambacher fbshipit-source-id: cd7211fc374e7d9929a9b88ea0a5ba8134b76099	2021-08-06 08:52:14 -07:00
Mikhail Golubev	8f52972cf9	Allow to use a string as a delimiter in StringAppendOperator (#8536 ) Summary: An arbitrary string can be used as a delimiter in StringAppend merge operator flavor. In particular, it allows using an empty string, combining binary values for the same key byte-to-byte one next to another. Pull Request resolved: https://github.com/facebook/rocksdb/pull/8536 Reviewed By: mrambacher Differential Revision: D29962120 Pulled By: zhichao-cao fbshipit-source-id: 4ef5d846a47835cf428a11200409e30e2dbffc4f	2021-08-02 16:50:41 -07:00
Anatolii Zhmaiev	9ddb55a8f6	Add periodic_compaction_seconds option to RocksJava (#8579 ) Summary: Fixes https://github.com/facebook/rocksdb/issues/8578 Pull Request resolved: https://github.com/facebook/rocksdb/pull/8579 Reviewed By: ajkr Differential Revision: D29895081 Pulled By: mrambacher fbshipit-source-id: 3e4120e26a3e8252f8301d657c0aaa0b8550cddf	2021-07-26 17:33:42 -07:00
Peter Dillinger	df5dc73bec	Don't hold DB mutex for block cache entry stat scans (#8538 ) Summary: I previously didn't notice the DB mutex was being held during block cache entry stat scans, probably because I primarily checked for read performance regressions, because they require the block cache and are traditionally latency-sensitive. This change does some refactoring to avoid holding DB mutex and to avoid triggering and waiting for a scan in GetProperty("rocksdb.cfstats"). Some tests have to be updated because now the stats collector is populated in the Cache aggressively on DB startup rather than lazily. (I hope to clean up some of this added complexity in the future.) This change also ensures proper treatment of need_out_of_mutex for non-int DB properties. Pull Request resolved: https://github.com/facebook/rocksdb/pull/8538 Test Plan: Added unit test logic that uses sync points to fail if the DB mutex is held during a scan, covering the various ways that a scan might be triggered. Performance test - the known impact to holding the DB mutex is on TransactionDB, and the easiest way to see the impact is to hack the scan code to almost always miss and take an artificially long time scanning. Here I've injected an unconditional 5s sleep at the call to ApplyToAllEntries. 
Before (hacked): $ TEST_TMPDIR=/dev/shm ./db_bench.base_xxx -benchmarks=randomtransaction,stats -cache_index_and_filter_blocks=1 -bloom_bits=10 -partition_index_and_filters=1 -duration=30 -stats_dump_period_sec=12 -cache_size=100000000 -statistics -transaction_db 2>&1 | egrep 'db.db.write.micros|micros/op' randomtransaction : 433.219 micros/op 2308 ops/sec; 0.1 MB/s ( transactions:78999 aborts:0) rocksdb.db.write.micros P50 : 16.135883 P95 : 36.622503 P99 : 66.036115 P100 : 5000614.000000 COUNT : 149677 SUM : 8364856 $ TEST_TMPDIR=/dev/shm ./db_bench.base_xxx -benchmarks=randomtransaction,stats -cache_index_and_filter_blocks=1 -bloom_bits=10 -partition_index_and_filters=1 -duration=30 -stats_dump_period_sec=12 -cache_size=100000000 -statistics -transaction_db 2>&1 | egrep 'db.db.write.micros|micros/op' randomtransaction : 448.802 micros/op 2228 ops/sec; 0.1 MB/s ( transactions:75999 aborts:0) rocksdb.db.write.micros P50 : 16.629221 P95 : 37.320607 P99 : 72.144341 P100 : 5000871.000000 COUNT : 143995 SUM : 13472323 Notice the 5s P100 write time. 
After (hacked): $ TEST_TMPDIR=/dev/shm ./db_bench.new_xxx -benchmarks=randomtransaction,stats -cache_index_and_filter_blocks=1 -bloom_bits=10 -partition_index_and_filters=1 -duration=30 -stats_dump_period_sec=12 -cache_size=100000000 -statistics -transaction_db 2>&1 | egrep 'db.db.write.micros|micros/op' randomtransaction : 303.645 micros/op 3293 ops/sec; 0.1 MB/s ( transactions:98999 aborts:0) rocksdb.db.write.micros P50 : 16.061871 P95 : 33.978834 P99 : 60.018017 P100 : 616315.000000 COUNT : 187619 SUM : 4097407 $ TEST_TMPDIR=/dev/shm ./db_bench.new_xxx -benchmarks=randomtransaction,stats -cache_index_and_filter_blocks=1 -bloom_bits=10 -partition_index_and_filters=1 -duration=30 -stats_dump_period_sec=12 -cache_size=100000000 -statistics -transaction_db 2>&1 | egrep 'db.db.write.micros|micros/op' randomtransaction : 310.383 micros/op 3221 ops/sec; 0.1 MB/s ( transactions:96999 aborts:0) rocksdb.db.write.micros P50 : 16.270026 P95 : 35.786844 P99 : 64.302878 P100 : 603088.000000 COUNT : 183819 SUM : 4095918 P100 write is now ~0.6s. 
Not good, but it's the same even if I completely bypass all the scanning code: $ TEST_TMPDIR=/dev/shm ./db_bench.new_skip -benchmarks=randomtransaction,stats -cache_index_and_filter_blocks=1 -bloom_bits=10 -partition_index_and_filters=1 -duration=30 -stats_dump_period_sec=12 -cache_size=100000000 -statistics -transaction_db 2>&1 | egrep 'db.db.write.micros|micros/op' randomtransaction : 311.365 micros/op 3211 ops/sec; 0.1 MB/s ( transactions:96999 aborts:0) rocksdb.db.write.micros P50 : 16.274362 P95 : 36.221184 P99 : 68.809783 P100 : 649808.000000 COUNT : 183819 SUM : 4156767 $ TEST_TMPDIR=/dev/shm ./db_bench.new_skip -benchmarks=randomtransaction,stats -cache_index_and_filter_blocks=1 -bloom_bits=10 -partition_index_and_filters=1 -duration=30 -stats_dump_period_sec=12 -cache_size=100000000 -statistics -transaction_db 2>&1 | egrep 'db.db.write.micros|micros/op' randomtransaction : 308.395 micros/op 3242 ops/sec; 0.1 MB/s ( transactions:97999 aborts:0) rocksdb.db.write.micros P50 : 16.106222 P95 : 37.202403 P99 : 67.081875 P100 : 598091.000000 COUNT : 185714 SUM : 4098832 No substantial difference. Reviewed By: siying Differential Revision: D29738847 Pulled By: pdillinger fbshipit-source-id: 1c5c155f5a1b62e4fea0fd4eeb515a8b7474027b	2021-07-16 14:13:08 -07:00
longlijian	4e4ec16957	Replace the namespace "rocksdb" to "ROCKSDB_NAMESPACE" (#8531 ) Summary: For more detail, see https://github.com/facebook/rocksdb/issues/6433 (https://github.com/facebook/rocksdb/pull/6433). Pull Request resolved: https://github.com/facebook/rocksdb/pull/8531 Reviewed By: siying Differential Revision: D29717057 Pulled By: ajkr fbshipit-source-id: 3ccad9501e5612590e54a7cf8c447118f323c7f4	2021-07-15 17:23:39 -07:00
Baptiste Lemaire	e817bc9628	Added memtable garbage statistics (#8411 ) Summary: 2 new statistics counters are added to RocksDB: `MEMTABLE_PAYLOAD_BYTES_AT_FLUSH` and `MEMTABLE_GARBAGE_BYTES_AT_FLUSH`. The former tracks how many raw bytes of useful data are present on the memtable at flush time, whereas the latter tracks how many of these raw bytes are considered garbage, meaning that they ended up not being imported on the SSTables resulting from the flush operations. Unit test: run `make db_flush_test -j$(nproc); ./db_flush_test` to run the unit test. This executable includes 3 tests, which test support and correct stat calculations for workloads with inserts, deletes, and DeleteRanges. The parameters are set such that the workloads are performed on a single memtable, and a single SSTable is created as a result of the flush operation. The flush operation is manually called in the test file. The tests verify that the values of these 2 statistics counters introduced in this PR can be exactly predicted, showing that we have a full understanding of the underlying operations. Performance testing: `./db_bench -statistics -benchmarks=fillrandom -num=10000000` repeated 10 times. Timing done using "date" function in a bash script. _Results_: Original Rocksdb fork: mean 66.6 sec, std 1.18 sec. This feature branch: mean 67.4 sec, std 1.35 sec. Pull Request resolved: https://github.com/facebook/rocksdb/pull/8411 Reviewed By: akankshamahajan15 Differential Revision: D29150629 Pulled By: bjlemaire fbshipit-source-id: 7b3c2e86d50c6aa34fa50fd134282eacb543a5b1	2021-06-18 04:57:27 -07:00
Sidi Mohamed EL AATIFI	298edae941	Fix a typo in Javadoc (#8394 ) Summary: iterateLowerBound Slice representing the lower bound Pull Request resolved: https://github.com/facebook/rocksdb/pull/8394 Reviewed By: ajkr Differential Revision: D29085721 Pulled By: jay-zhuang fbshipit-source-id: a154375879395c48e9bd3794d296e70316894056	2021-06-17 12:02:57 -07:00
mrambacher	8948dc8524	Make ImmutableOptions struct that inherits from ImmutableCFOptions and ImmutableDBOptions (#8262 ) Summary: The ImmutableCFOptions contained a bunch of fields that belonged to the ImmutableDBOptions. This change cleans that up by introducing an ImmutableOptions struct. Following the pattern of Options struct, this class inherits from the DB and CFOption structs (of the Immutable form). Only one structural change (the ImmutableCFOptions::fs was changed to a shared_ptr from a raw one) is in this PR. All of the other changes involve moving the member variables from the ImmutableCFOptions into the ImmutableOptions and changing member variables or function parameters as required for compilation purposes. Follow-on PRs may do a further clean-up of the code, such as renaming variables (such as "ImmutableOptions cf_options") and potentially eliminating un-needed function parameters (there is no longer a need to pass both an ImmutableDBOptions and an ImmutableOptions to a function). Pull Request resolved: https://github.com/facebook/rocksdb/pull/8262 Reviewed By: pdillinger Differential Revision: D28226540 Pulled By: mrambacher fbshipit-source-id: 18ae71eadc879dedbe38b1eb8e6f9ff5c7147dbf	2021-05-05 14:00:17 -07:00
Adam Retter	69c986825e	Fix javadoc for keyMayExist (#8232 ) Summary: Closes https://github.com/facebook/rocksdb/issues/6985 Pull Request resolved: https://github.com/facebook/rocksdb/pull/8232 Reviewed By: jay-zhuang Differential Revision: D27999779 Pulled By: mrambacher fbshipit-source-id: a37c88d93bde2692b8be9e46e673dda7bea701b2	2021-04-26 08:34:10 -07:00
Yanqin Jin	a376c22066	Handle rename() failure in non-local FS (#8192 ) Summary: In a distributed environment, a file `rename()` operation can succeed on server (remote) side, but the client can somehow return non-ok status to RocksDB. Possible reasons include network partition, connection issue, etc. This happens in `rocksdb::SetCurrentFile()`, which can be called in `LogAndApply() -> ProcessManifestWrites()` if RocksDB tries to switch to a new MANIFEST. We currently always delete the new MANIFEST if an error occurs. This is problematic in distributed world. If the server-side successfully updates the CURRENT file via renaming, then a subsequent `DB::Open()` will try to look for the new MANIFEST and fail. As a fix, we can track the execution result of IO operations on the new MANIFEST. - If IO operations on the new MANIFEST fail, then we know the CURRENT must point to the original MANIFEST. Therefore, it is safe to remove the new MANIFEST. - If IO operations on the new MANIFEST all succeed, but somehow we end up in the clean up code block, then we do not know whether CURRENT points to the new or old MANIFEST. (For local POSIX-compliant FS, it should still point to old MANIFEST, but it does not matter if we keep the new MANIFEST.) Therefore, we keep the new MANIFEST. - Any future `LogAndApply()` will switch to a new MANIFEST and update CURRENT. - If process reopens the db immediately after the failure, then the CURRENT file can point to either the new MANIFEST or the old one, both of which exist. Therefore, recovery can succeed and ignore the other. Pull Request resolved: https://github.com/facebook/rocksdb/pull/8192 Test Plan: make check Reviewed By: zhichao-cao Differential Revision: D27804648 Pulled By: riversand963 fbshipit-source-id: 9c16f2a5ce41bc6aadf085e48449b19ede8423e4	2021-04-19 18:11:13 -07:00
Andrew Kryczka	1ba2b8a568	Add sample_for_compression results to table properties (#8139 ) Summary: Added `TableProperties::{fast,slow}_compression_estimated_data_size`. These properties are present in block-based tables when `ColumnFamilyOptions::sample_for_compression > 0` and the necessary compression library is supported when the file is generated. They contain estimates of what `TableProperties::data_size` would be if the "fast"/"slow" compression library had been used instead. One limitation is we do not record exactly which "fast" (LZ4 or Snappy) or "slow" (ZSTD or Zlib) compression library produced the result. Pull Request resolved: https://github.com/facebook/rocksdb/pull/8139 Test Plan: - new unit test - ran `db_bench` with `sample_for_compression=1`; verified the `data_size` property matches the `{slow,fast}_compression_estimated_data_size` when the same compression type is used for the output file compression and the sampled compression Reviewed By: riversand963 Differential Revision: D27454338 Pulled By: ajkr fbshipit-source-id: 9529293de93ddac7f03b2e149d746e9f634abac4	2021-03-31 18:21:50 -07:00
Jay Zhuang	a781b103da	Fix getApproximateMemTableStats() return type (#8098 ) Summary: Which should return 2 long instead of an array. Pull Request resolved: https://github.com/facebook/rocksdb/pull/8098 Reviewed By: mrambacher Differential Revision: D27308741 Pulled By: jay-zhuang fbshipit-source-id: 44beea2bd28cf6779b048bebc98f2426fe95e25c	2021-03-31 09:46:47 -07:00
Vlad Artamonov	4a6bc47b2e	Fix possible mistype in a comment (#8086 ) Summary: This is a small fix to what I think is a mistype in two comments in `DBOptionsInterface.java`. If it was not an error, feel free to close. Pull Request resolved: https://github.com/facebook/rocksdb/pull/8086 Reviewed By: ajkr Differential Revision: D27260488 Pulled By: mrambacher fbshipit-source-id: 469daadaf6039d5b5187132b8e0c7c3672842f21	2021-03-23 12:37:24 -07:00
Zhichao Cao	08ec5e7321	Add the statistics and info log for Error handler (#8050 ) Summary: Add statistics and info log for error handler: counters for bg error, bg io error, bg retryable io error, auto resume, auto resume total retry, and auto resume success; histogram for auto resume retry count in each recovery call. Pull Request resolved: https://github.com/facebook/rocksdb/pull/8050 Test Plan: make check and add test to error_handler_fs_test Reviewed By: anand1976 Differential Revision: D26990565 Pulled By: zhichao-cao fbshipit-source-id: 49f71e8ea4e9db8b189943976404205b56ab883f	2021-03-17 22:38:13 -07:00
Xiaopeng Zhang	c603f2f898	support getUsage and getPinnedUsage in JavaAPI for Cache (#7925 ) Summary: support getUsage and getPinnedUsage in JavaAPI for Cache also fix a typo in LRUCacheTest.java that the highPriPoolRatio is not valid(set 5, I guess it means 0.05) Pull Request resolved: https://github.com/facebook/rocksdb/pull/7925 Reviewed By: mrambacher Differential Revision: D26900241 Pulled By: ajkr fbshipit-source-id: 735d1e40a16fa8919c89c7c7154ba7f81208ec33	2021-03-17 09:30:33 -07:00
stefan-zobel	8d9088464b	Java-API: Fix minor Javadoc copy-paste errors (#8034 ) Summary: Fixes 3 minor Javadoc copy-paste errors in the `RocksDB#newIterator()` and `Transaction#getIterator()` variants that take a column family handle but are talking about iterating over "the database" or "the default column family". Pull Request resolved: https://github.com/facebook/rocksdb/pull/8034 Reviewed By: jay-zhuang Differential Revision: D26877667 Pulled By: mrambacher fbshipit-source-id: 95dd95b667c496e389f221acc9a91b340e4b63bf	2021-03-16 18:07:09 -07:00
stefan-zobel	cc34da75b5	Java-API: byteCompressionType should be declared as primitive type byte (#7981 ) Summary: The variable `byteCompressionType` is only assigned values of primitive type and is never 'null', but it is declared with the boxed type 'Byte'. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7981 Reviewed By: ajkr Differential Revision: D26546600 Pulled By: jay-zhuang fbshipit-source-id: 07b579cdfcfc2262a448ca3626e216416fd05892	2021-03-09 22:05:16 -08:00
Peter Dillinger	0028e3398b	Make format_version=5 new default (#8017 ) Summary: Haven't seen any production issues with new Bloom filter and it's now > 1 year old (added in 6.6.0). Updated check_format_compatible.sh and HISTORY.md Pull Request resolved: https://github.com/facebook/rocksdb/pull/8017 Test Plan: tests updated (or prior bugs fixed) Reviewed By: ajkr Differential Revision: D26762197 Pulled By: pdillinger fbshipit-source-id: 0e755c46b443087c1544da0fd545beb9c403d1c2	2021-03-09 12:42:53 -08:00
stefan-zobel	430842f948	Java-API: Missing space in string literal (#7982 ) Summary: `TtlDB.open()`: missing space after 'column' `AdvancedColumnFamilyOptionsInterface.setLevelCompactionDynamicLevelBytes()`: missing space after 'cause' Pull Request resolved: https://github.com/facebook/rocksdb/pull/7982 Reviewed By: ajkr Differential Revision: D26546632 Pulled By: jay-zhuang fbshipit-source-id: 885dedcaa2200842764fbac9ce3766d54e1c8914	2021-03-09 11:30:29 -08:00
Andrew Kryczka	d904233d2f	Limit buffering for collecting samples for compression dictionary (#7970 ) Summary: For dictionary compression, we need to collect some representative samples of the data to be compressed, which we use to either generate or train (when `CompressionOptions::zstd_max_train_bytes > 0`) a dictionary. Previously, the strategy was to buffer all the data blocks during flush, and up to the target file size during compaction. That strategy allowed us to randomly pick samples from as wide a range as possible that'd be guaranteed to land in a single output file. However, some users try to make huge files in memory-constrained environments, where this strategy can cause OOM. This PR introduces an option, `CompressionOptions::max_dict_buffer_bytes`, that limits how much data blocks are buffered before we switch to unbuffered mode (which means creating the per-SST dictionary, writing out the buffered data, and compressing/writing new blocks as soon as they are built). It is not strict as we currently buffer more than just data blocks -- also keys are buffered. But it does make a step towards giving users predictable memory usage. Related changes include: - Changed sampling for dictionary compression to select unique data blocks when there is limited availability of data blocks - Made use of `BlockBuilder::SwapAndReset()` to save an allocation+memcpy when buffering data blocks for building a dictionary - Changed `ParseBoolean()` to accept an input containing characters after the boolean. This is necessary since, with this PR, a value for `CompressionOptions::enabled` is no longer necessarily the final component in the `CompressionOptions` string. 
Pull Request resolved: https://github.com/facebook/rocksdb/pull/7970 Test Plan: - updated `CompressionOptions` unit tests to verify limit is respected (to the extent expected in the current implementation) in various scenarios of flush/compaction to bottommost/non-bottommost level - looked at jemalloc heap profiles right before and after switching to unbuffered mode during flush/compaction. Verified memory usage in buffering is proportional to the limit set. Reviewed By: pdillinger Differential Revision: D26467994 Pulled By: ajkr fbshipit-source-id: 3da4ef9fba59974e4ef40e40c01611002c861465	2021-02-19 14:09:54 -08:00
stefan-zobel	251143f8fb	rocksdbjni: Possible NPE in RocksDB.setOptions #7869 (#7909 ) Summary: Fix for https://github.com/facebook/rocksdb/issues/7869 Pull Request resolved: https://github.com/facebook/rocksdb/pull/7909 Reviewed By: akankshamahajan15 Differential Revision: D26181440 Pulled By: ajkr fbshipit-source-id: f323aec9d91e177fa873599b99801b391cf094b1	2021-02-18 15:48:39 -08:00
Xiaopeng Zhang	bf6795aea0	fix java sample typo and replace deprecated code with latest (#7906 ) Summary: 1. replace deprecated code in sample java with latest api 2. fix optimistictransaction sample code typo Pull Request resolved: https://github.com/facebook/rocksdb/pull/7906 Reviewed By: ajkr Differential Revision: D26127429 Pulled By: jay-zhuang fbshipit-source-id: f015ad1435f565cffb8798a4fb5afc44c72d73d7	2021-02-01 14:45:34 -08:00
Xiaopeng Zhang	36963dc2ca	fix write option typo in java samples (#7894) Summary: This is a trivial PR for the RocksDB Java samples that fixes what appears to be a typo in the write options: to do a synchronous write, the WAL should not be disabled. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7894 Reviewed By: jay-zhuang Differential Revision: D26047128 Pulled By: mrambacher fbshipit-source-id: a06ce54cb61af0d3f2578a709c34a0b1ccecb0b2	2021-01-26 19:13:08 -08:00
Tomas Kolda	d76a8eeef7	Fixing Windows build using CMake (#7854 ) Summary: Builds were not producing Windows binaries properly in 6.15 branch: ``` 00:00:46.413 Tests run: 11, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 0.183 sec <<< FAILURE! - in org.rocksdb.EventListenerTest 00:00:46.414 testAllCallbacksInvocation(org.rocksdb.EventListenerTest) Time elapsed: 0.012 sec <<< ERROR! 00:00:46.414 java.lang.UnsatisfiedLinkError: org.rocksdb.test.TestableEventListener.invokeAllCallbacks(J)V 00:00:46.414 at org.rocksdb.test.TestableEventListener.invokeAllCallbacks(Native Method) 00:00:46.414 at org.rocksdb.test.TestableEventListener.invokeAllCallbacks(TestableEventListener.java:19) 00:00:46.414 at org.rocksdb.EventListenerTest.testAllCallbacksInvocation(EventListenerTest.java:436) ``` ``` 00:00:41.497 "D:\j\workspace\RocksDB_Build_Windows\build\java\rocksdbjni_headers.vcxproj" (default target) (3) -> 00:00:41.497 (CustomBuild target) -> 00:00:41.497 CUSTOMBUILD : error : Could not find class file for 'org.rocksdb.TestableEventListener'. [D:\j\workspace\RocksDB_Build_Windows\build\java\rocksdbjni_headers.vcxproj] ``` Also failed on Linux as library was not initialized yet: ``` 00:01:25.103 Running org.rocksdb.NativeComparatorWrapperTest 00:01:25.133 Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.006 sec <<< FAILURE! - in org.rocksdb.NativeComparatorWrapperTest 00:01:25.133 rountrip(org.rocksdb.NativeComparatorWrapperTest) Time elapsed: 0.002 sec <<< ERROR! 
00:01:25.133 java.lang.UnsatisfiedLinkError: org.rocksdb.NativeComparatorWrapperTest$NativeStringComparatorWrapper.newStringComparator()J 00:01:25.133 at org.rocksdb.NativeComparatorWrapperTest$NativeStringComparatorWrapper.newStringComparator(Native Method) 00:01:25.133 at org.rocksdb.NativeComparatorWrapperTest$NativeStringComparatorWrapper.initializeNative(NativeComparatorWrapperTest.java:87) 00:01:25.133 at org.rocksdb.RocksCallbackObject.<init>(RocksCallbackObject.java:28) 00:01:25.133 at org.rocksdb.AbstractComparator.<init>(AbstractComparator.java:20) 00:01:25.133 at org.rocksdb.NativeComparatorWrapper.<init>(NativeComparatorWrapper.java:16) 00:01:25.133 at org.rocksdb.NativeComparatorWrapperTest$NativeStringComparatorWrapper.<init>(NativeComparatorWrapperTest.java:82) 00:01:25.133 at org.rocksdb.NativeComparatorWrapperTest.rountrip(NativeComparatorWrapperTest.java:30) ``` Pull Request resolved: https://github.com/facebook/rocksdb/pull/7854 Reviewed By: jay-zhuang Differential Revision: D25873378 Pulled By: ajkr fbshipit-source-id: 88afb08bfd30edff31f17da063e636df0769cbfe	2021-01-15 17:53:16 -08:00
Tomas Kolda	1001bc01c9	Read Options to support direct slice (#7132) Summary: This request adds support for using DirectSlice in the ReadOptions lower/upper bounds. To be more efficient, I have added setLength to DirectSlice so that I can update just the length of the slice used from the direct buffer. It is also needed because an iterator keeps a pointer to the original slice when it is created, so setting a new slice in the options does not help (the existing one must be reused). With this approach, one can modify the slice at any time while operating on the iterator. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7132 Reviewed By: zhichao-cao Differential Revision: D25840092 Pulled By: jay-zhuang fbshipit-source-id: 760167baf61568c9a35138145c4bf9b06824cb71	2021-01-15 17:05:18 -08:00
Tomas Kolda	ac956f2bea	S390 Linux is failing tests ColumnFamilyOptionsTest.cfPaths (#7853) Summary: Fix ColumnFamilyOptionsTest.cfPaths and OptionsTest.cfPaths in the 6.15 branch (and probably other branches, including master). The has_exception variable was not initialized, which caused test failures and incorrect behavior on the s390 platform (and maybe others, as the variable's content is undefined). adamretter please take a look. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7853 Reviewed By: akankshamahajan15 Differential Revision: D25901639 Pulled By: jay-zhuang fbshipit-source-id: 151b5db27b495fc6d8ed54c0eccbde2508215ac5	2021-01-15 16:32:31 -08:00
Adam Retter	3e6ee9f82e	Update the versions of the test dependencies used for RocksJava (#7805 ) Summary: Update the versions of the dependencies used for testing RocksJava. pdillinger Please can you add the following to your S3 bucket: 1. https://repo1.maven.org/maven2/junit/junit/4.13.1/junit-4.13.1.jar 2. https://repo1.maven.org/maven2/org/hamcrest/hamcrest/2.2/hamcrest-2.2.jar 3. https://repo1.maven.org/maven2/cglib/cglib/3.3.0/cglib-3.3.0.jar 4. https://repo1.maven.org/maven2/org/assertj/assertj-core/2.9.0/assertj-core-2.9.0.jar Thanks. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7805 Reviewed By: akankshamahajan15 Differential Revision: D25906134 Pulled By: jay-zhuang fbshipit-source-id: 1c6c7d461a73abaff1796bb31f0ad90dcbdef1a0	2021-01-13 16:01:38 -08:00
Laurent Goujon	0426d4a4ee	Fix Java hashCode implementation (#7860 ) Summary: Classes ColumnFamilyHandle and CapturingWriteBatchHandler.Event have byte array fields as part of their identity, but they do not use the arrays' content to compute the instance's hash, and instead rely on the arrays' identity, causing instances to have different hashcodes although they are equal. The PR addresses it by using the arrays' content to compute the hash, like the equals method does. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7860 Reviewed By: jay-zhuang Differential Revision: D25901327 Pulled By: akankshamahajan15 fbshipit-source-id: 347e7b3d2ba7befe7faa956b033e6421b9d0c235	2021-01-13 10:04:42 -08:00
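The hashCode issue described in the commit above can be sketched in plain Java. This is an illustration only, not the RocksJava code: `NamedHandle` and its fields are hypothetical stand-ins for a class whose identity includes a byte array. The point is that `Arrays.hashCode(byte[])` hashes the array's content, whereas calling `hashCode()` on the array itself (or passing it to `Objects.hash`) uses the array's identity.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class HashCodeSketch {
    // Hypothetical class whose identity includes a byte array field.
    static final class NamedHandle {
        private final byte[] name;
        private final int id;

        NamedHandle(byte[] name, int id) {
            this.name = name.clone();   // defensive copy, so each instance owns its array
            this.id = id;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) {
                return true;
            }
            if (!(o instanceof NamedHandle)) {
                return false;
            }
            NamedHandle other = (NamedHandle) o;
            // Compare the array by content, not by reference.
            return id == other.id && Arrays.equals(name, other.name);
        }

        @Override
        public int hashCode() {
            // Hash the array by content so that equal objects have equal hashes.
            // name.hashCode() would depend on the array's identity instead.
            return 31 * Arrays.hashCode(name) + id;
        }
    }

    public static void main(String[] args) {
        NamedHandle a = new NamedHandle("default".getBytes(StandardCharsets.UTF_8), 1);
        NamedHandle b = new NamedHandle("default".getBytes(StandardCharsets.UTF_8), 1);
        System.out.println("equal: " + a.equals(b));
        System.out.println("same hash: " + (a.hashCode() == b.hashCode()));
    }
}
```

Keeping `equals` and `hashCode` consistent matters in practice because hash-based collections such as `HashMap` and `HashSet` look up entries by hash first and only then call `equals`.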
Adam Retter	62afa968c2	Fix various small build issues, Java API naming (#7776 ) Summary: * Compatibility with older GCC. * Compatibility with older jemalloc libraries. * Remove Docker warning when building i686 binaries. * Fix case inconsistency in Java API naming (potential update to HISTORY.md deferred) Pull Request resolved: https://github.com/facebook/rocksdb/pull/7776 Reviewed By: akankshamahajan15 Differential Revision: D25607235 Pulled By: pdillinger fbshipit-source-id: 7ab0fb7fa7a34e97ed0bec991f5081acb095777d	2020-12-18 16:12:26 -08:00
Adam Retter	29d12748b0	Fix failing RocksJava test compilation and add CI (#7769 ) Summary: * Fixes a Java test compilation issue on macOS * Cleans up CircleCI RocksDBJava build config * Adds CircleCI for RocksDBJava on MacOS * Ensures backwards compatibility with older macOS via CircleCI * Fixes RocksJava static builds ordering * Adds missing RocksJava static builds to CircleCI for Mac and Linux * Improves parallelism in RocksJava builds * Reduces the size of the machines used for RocksJava CircleCI as they don't need to be so large (Saves credits) Pull Request resolved: https://github.com/facebook/rocksdb/pull/7769 Reviewed By: akankshamahajan15 Differential Revision: D25601293 Pulled By: pdillinger fbshipit-source-id: 0a0bb9906f65438fe143487d78e37e1947364d08	2020-12-16 16:00:02 -08:00
Cheng Chang	5e794b0841	Fix a recovery corner case (#7621 ) Summary: Consider the following sequence of events: 1. Db flushed an SST with file number N, appended to MANIFEST, and tried to sync the MANIFEST. 2. Syncing MANIFEST failed and db crashed. 3. Db tried to recover with this MANIFEST. In the meantime, no entry about the newly-flushed SST was found in the MANIFEST. Therefore, RocksDB replayed WAL and tried to flush to an SST file reusing the same file number N. This failed because file system does not support overwrite. Then Db deleted this file. 4. Db crashed again. 5. Db tried to recover. When db read the MANIFEST, there was an entry referencing N.sst. This could happen probably because the append in step 1 finally reached the MANIFEST and became visible. Since N.sst had been deleted in step 3, recovery failed. It is possible that N.sst created in step 1 is valid. Although step 3 would still fail since the MANIFEST was not synced properly in step 1 and 2, deleting N.sst would make it impossible for the db to recover even if the remaining part of MANIFEST was appended and visible after step 5. After this PR, in step 3, immediately after recovering from MANIFEST, a new MANIFEST is created, then we find that N.sst is not referenced in the MANIFEST, so we delete it, and we'll not reuse N as file number. Then in step 5, since the new MANIFEST does not contain N.sst, the recovery failure situation in step 5 won't happen. Pull Request resolved: https://github.com/facebook/rocksdb/pull/7621 Test Plan: 1. some tests are updated, because these tests assume that new MANIFEST is created after WAL recovery. 2. a new unit test is added in db_basic_test to simulate step 3. Reviewed By: riversand963 Differential Revision: D24668144 Pulled By: cheng-chang fbshipit-source-id: 90d7487fbad2bc3714f5ede46ea949895b15ae3b	2020-11-07 22:23:27 -08:00

801 Commits