rocksdb

Author	SHA1	Message	Date
Andrew Kryczka	8dd0a7e11a	add comment in SuperVersion referencing logic Summary: The referencing logic is super confusing so added a comment at the part that took me longest to figure out. Closes https://github.com/facebook/rocksdb/pull/2996 Differential Revision: D6034969 Pulled By: ajkr fbshipit-source-id: 9cc2e744c1f79d6d57d378f86ed59238a5f583db	2017-10-11 15:12:31 -07:00
Yi Wu	fb4ae4d810	fix DBImpl::NewInternalIterator super-version leak on failure Summary: Close #2955 Closes https://github.com/facebook/rocksdb/pull/2960 Differential Revision: D5962872 Pulled By: yiwu-arbug fbshipit-source-id: a6472d5c015bea3dc476c572ff5a5c90259e6059	2017-10-11 14:57:43 -07:00
Zhongyi Xie	0f3b36964e	Fix counter for memtable updates Summary: Right now in `PutCFImpl` we always increment NUMBER_KEYS_UPDATED counter for both in-place update or insertion. This PR fixes this by using the correct counter for either case. Closes https://github.com/facebook/rocksdb/pull/2986 Differential Revision: D6016300 Pulled By: miasantreble fbshipit-source-id: 0aed327522e659450d533d1c47d3a9f568fac65d	2017-10-10 21:26:11 -07:00
Andrew Kryczka	70aa942153	fix file numbers after repair Summary: The file numbers assigned post-repair were sometimes smaller than older files' numbers due to `LogAndApply` saving the wrong next file number in the manifest. - Mark the highest file seen during repair as used before `LogAndApply` so the correct next file number will be stored. - Renamed `MarkFileNumberUsedDuringRecovery` to `MarkFileNumberUsed` since now it's used during repair in addition to during recovery - Added `TEST_Current_Next_FileNo` to expose the next file number for the unit test. Closes https://github.com/facebook/rocksdb/pull/2988 Differential Revision: D6018083 Pulled By: ajkr fbshipit-source-id: 3f25cbf74439cb8f16dd12af90b67f9f9f75e718	2017-10-10 13:12:37 -07:00
Jay Patel	1a61ba179e	compaction picker to use max_bytes_for_level_multiplier_additional Summary: Hi, As part of some optimization, we're using multiple DB locations (tmpfs and spindle) to store data and configured max_bytes_for_level_multiplier_additional. But, max_bytes_for_level_multiplier_additional is not used to compute the actual size for the level while picking the DB location. So, even if DB location does not have space, RocksDB mistakenly puts the level at that location. Can someone pls. verify the fix? Let me know any other changes required. Thanks, Jay Closes https://github.com/facebook/rocksdb/pull/2704 Differential Revision: D5992515 Pulled By: ajkr fbshipit-source-id: cbbc6c0e0a7dbdca91c72e0f37b218c4cec57e28	2017-10-09 22:59:02 -07:00
Yi Wu	8c392a31d7	WritePrepared Txn: Iterator Summary: On iterator create, take a snapshot, create a ReadCallback and pass the ReadCallback to the underlying DBIter to check if key is committed. Closes https://github.com/facebook/rocksdb/pull/2981 Differential Revision: D6001471 Pulled By: yiwu-arbug fbshipit-source-id: 3565c4cdaf25370ba47008b0e0cb65b31dfe79fe	2017-10-09 17:15:28 -07:00
Maysam Yabandeh	ec6c5383d0	WritePrepared Txn: end-to-end tests Summary: Enable WritePrepared policy for existing transaction tests. Closes https://github.com/facebook/rocksdb/pull/2972 Differential Revision: D5993614 Pulled By: maysamyabandeh fbshipit-source-id: d1eb53e2920c4e2a56434bb001231c98426f3509	2017-10-06 14:26:45 -07:00
Yi Wu	d1b74b0c82	WritePrepared Txn: Compaction/Flush Summary: Update Compaction/Flush to support WritePreparedTxnDB: Add SnapshotChecker which is a proxy to query WritePreparedTxnDB::IsInSnapshot. Pass SnapshotChecker to DBImpl on WritePreparedTxnDB open. CompactionIterator use it to check if a key has been committed and if it is visible to a snapshot. In CompactionIterator: * check if key has been committed. If not, output uncommitted keys AS-IS. * use SnapshotChecker to check if key is visible to a snapshot when in need. * do not output key with seq = 0 if the key is not committed. Closes https://github.com/facebook/rocksdb/pull/2926 Differential Revision: D5902907 Pulled By: yiwu-arbug fbshipit-source-id: 945e037fdf0aa652dc5ba0ad879461040baa0320	2017-10-06 10:41:53 -07:00
Adrien Schildknecht	01542400a8	Inform caller when rocksdb is stalling writes Summary: Add a new function in Listener to let the caller know when rocksdb is stalling writes. Closes https://github.com/facebook/rocksdb/pull/2897 Differential Revision: D5860124 Pulled By: schischi fbshipit-source-id: ee791606169aa64f772c86f817cebf02624e05e1	2017-10-05 18:11:43 -07:00
Andrew Kryczka	821887036e	pin L0 filters/indexes for compaction outputs Summary: We need to tell the iterator the compaction output file's level so it can apply proper optimizations, like pinning filter and index blocks when user enables `pin_l0_filter_and_index_blocks_in_cache` and the output file's level is zero. Closes https://github.com/facebook/rocksdb/pull/2949 Differential Revision: D5945597 Pulled By: ajkr fbshipit-source-id: 2389decf9026ffaa32d45801a77d002529f64a62	2017-10-03 16:27:28 -07:00
Sagar Vemuri	377e004048	Fix DBOptionsTest.SetBytesPerSync test when run with no compression Summary: Also made the test easier to understand: - changed the value size to ~1MB. - switched to NoCompression. We don't need compression in this test for dynamic options anyway. The test failures started with #2893. Closes https://github.com/facebook/rocksdb/pull/2957 Differential Revision: D5959392 Pulled By: sagar0 fbshipit-source-id: 2d55641e429246328bc6d10fcb9ef540d6ce07da	2017-10-03 13:42:11 -07:00
Yi Wu	d1cab2b64e	Add ValueType::kTypeBlobIndex Summary: Add kTypeBlobIndex value type, which will be used by blob db only, to insert a (key, blob_offset) KV pair. The purpose is to 1. Make it possible to open an existing rocksdb instance as blob db. Existing values will be of kTypeValue type, while values inserted by blob db will be of kTypeBlobIndex. 2. Make rocksdb able to detect if the db contains values written by blob db, and if so return an error. 3. Make it possible to have blob db optionally store a value in an SST file (with kTypeValue type) or as a blob value (with kTypeBlobIndex type). The root db (DBImpl) basically treats kTypeBlobIndex entries as normal values on write. On Get, if is_blob is provided, return whether the value read is of kTypeBlobIndex type, or return Status::NotSupported() status if is_blob is not provided. On scan, the allow_blob flag is passed, and if the flag is true, return whether the value is of kTypeBlobIndex type via iter->IsBlob(). Changes on the blob db side will be in a separate patch. Closes https://github.com/facebook/rocksdb/pull/2886 Differential Revision: D5838431 Pulled By: yiwu-arbug fbshipit-source-id: 3c5306c62bc13bb11abc03422ec5cbcea1203cca	2017-10-03 09:11:23 -07:00
Andrew Kryczka	880411f54c	disable populating block cache for in-place updates Summary: There's no point populating the block cache during this read. The key we read is guaranteed to be overwritten with a new `kValueType` key immediately afterwards, so can't be accessed again. A user was seeing high turnover of data blocks, at least partially due to this. Closes https://github.com/facebook/rocksdb/pull/2959 Differential Revision: D5961672 Pulled By: ajkr fbshipit-source-id: e7cb27c156c5db3b32af355c780efb99dbdf087c	2017-10-02 20:41:24 -07:00
Aliaksei Sandryhaila	cf51d3eb73	Remove an "unused" variable Summary: PR 2893 introduced a variable that is only used in TEST_SYNC_POINT_CALLBACK. When RocksDB is not built in debug mode, this method is not compiled in, and the variable is unused, which triggers a compiler error. This patch reverts the corresponding part of #2893. Closes https://github.com/facebook/rocksdb/pull/2956 Reviewed By: yiwu-arbug Differential Revision: D5955679 Pulled By: asandryh fbshipit-source-id: ac4a8e85b22da7f02efb117cd2e4a6e07ba73390	2017-10-02 15:26:29 -07:00
Andrew Kryczka	5df172da2f	fix deletion-triggered compaction in table builder Summary: It was broken when `NotifyCollectTableCollectorsOnFinish` was introduced. That function called `Finish` on each of the `TablePropertiesCollector`s, and `CompactOnDeletionCollector::Finish()` was resetting all its internal state. Then, when we checked whether compaction is necessary, the flag had already been cleared. Fixed above issue by avoiding resetting internal state during `Finish()`. Multiple calls to `Finish()` are allowed, but callers cannot invoke `AddUserKey()` on the collector after any finishes. Closes https://github.com/facebook/rocksdb/pull/2936 Differential Revision: D5918659 Pulled By: ajkr fbshipit-source-id: 4f05e9d80e50ee762ba1e611d8d22620029dca6b	2017-09-28 18:17:30 -07:00
Maysam Yabandeh	385049baf2	WritePrepared Txn: Recovery Summary: Recover txns from the WAL. Also added some unit tests. Closes https://github.com/facebook/rocksdb/pull/2901 Differential Revision: D5859596 Pulled By: maysamyabandeh fbshipit-source-id: 6424967b231388093b4effffe0a3b1b7ec8caeb0	2017-09-28 16:56:45 -07:00
Sagar Vemuri	93c2b91740	Introduce conditional merge-operator invocation in point lookups Summary: For every merge operand encountered for a key in the read path we now have the ability to decide whether to look further (to retrieve more merge operands for the key) or stop and invoke the merge operator to return the value. To use this facility, the user needs to override the `ShouldMerge()` method with a condition that terminates the search when true. This has a couple of advantages: 1. It helps in limiting the number of merge operands that are looked at to compute a value as part of a user Get operation. 2. It allows peeking at a merge key-value to see if further merge operands need to be looked at. Example: Limiting the number of merge operands that are looked at: Let's say you have 10 merge operands for a key spread over various levels. If you only want RocksDB to look at the latest two merge operands instead of all 10 to compute the value, it is now possible with this PR. You can set the condition in `ShouldMerge()` to return true when the size of the operand list is 2. Look at the example implementation in the unit test. Without this PR, a Get might look at all the 10 merge operands in different levels before invoking the merge-operator. Added a new unit test. Made sure that there is no perf regression by running benchmarks. Command line to Load data: ``` TEST_TMPDIR=/dev/shm ./db_bench --benchmarks="mergerandom" --merge_operator="uint64add" --num=10000000 ... 
mergerandom : 12.861 micros/op 77757 ops/sec; 8.6 MB/s ( updates:10000000) ``` ReadRandomMergeRandom benchmark results: Command line: ``` TEST_TMPDIR=/dev/shm ./db_bench --benchmarks="readrandommergerandom" --merge_operator="uint64add" --num=10000000 ``` Base -- Without this code change (on commit `fc7476b`): ``` readrandommergerandom : 38.586 micros/op 25916 ops/sec; (reads:3001599 merges:6998401 total:10000000 hits:842235 maxlength:8) ``` With this code change: ``` readrandommergerandom : 38.653 micros/op 25870 ops/sec; (reads:3001599 merges:6998401 total:10000000 hits:842235 maxlength:8) ``` Closes https://github.com/facebook/rocksdb/pull/2923 Differential Revision: D5898239 Pulled By: sagar0 fbshipit-source-id: daefa325019f77968639a75c851d46352c2303ef	2017-09-28 15:58:49 -07:00
Quinn Jarrell	6a541afcc4	Make bytes_per_sync and wal_bytes_per_sync mutable Summary: SUMMARY Moves the bytes_per_sync and wal_bytes_per_sync options from immutableoptions to mutable options. Also if wal_bytes_per_sync is changed, the wal file and memtables are flushed. TEST PLAN ran make check all passed Two new tests SetBytesPerSync, SetWalBytesPerSync check that after issuing setoptions with a new value for the var, the db options have the new value. Closes https://github.com/facebook/rocksdb/pull/2893 Reviewed By: yiwu-arbug Differential Revision: D5845814 Pulled By: TheRushingWookie fbshipit-source-id: 93b52d779ce623691b546679dcd984a06d2ad1bd	2017-09-27 17:49:45 -07:00
Maysam Yabandeh	aa67bae6cf	Break down PinnedDataIteratorRandomized Summary: It's timing out under tsan. Closes https://github.com/facebook/rocksdb/pull/2928 Differential Revision: D5911766 Pulled By: maysamyabandeh fbshipit-source-id: 2faacc07752ac8713a3a2abb5a4c4b7ae3bdf208	2017-09-26 14:27:30 -07:00
Zhongyi Xie	1d6700f9e6	Add test kPointInTimeRecoveryCFConsistency Summary: Context/problem: - CFs may be flushed at different times - A WAL can only be deleted after all CFs have flushed beyond end of that WAL. - Point-in-time recovery might stop upon reaching the first corruption. - Some CFs may have already flushed beyond that point, while others haven't. We should fail the Open() instead of proceeding with inconsistent CFs. Closes https://github.com/facebook/rocksdb/pull/2900 Differential Revision: D5863281 Pulled By: miasantreble fbshipit-source-id: 180dbaf83d96c804cff49b3c406312a4ae61313e	2017-09-22 17:26:36 -07:00
Andrew Kryczka	4708a6875c	Repair DBs with trailing slash in name Summary: Problem: - `DB::SanitizeOptions` strips trailing slash from `wal_dir` but not `dbname` - We check whether `wal_dir` and `dbname` refer to the same directory using string equality: https://github.com/facebook/rocksdb/blob/master/db/repair.cc#L258 - Providing `dbname` with trailing slash causes default `wal_dir` to be misidentified as a separate directory. - Then the repair tries to add all SST files to the `VersionEdit` twice (once for `dbname` dir, once for `wal_dir`) and fails with coredump. Solution: - Add a new `Env` function, `AreFilesSame`, which uses device and inode number to check whether files are the same. It's currently only implemented in `PosixEnv`. - Migrate repair to use `AreFilesSame` to check whether `dbname` and `wal_dir` are same. If unsupported, falls back to string comparison. Closes https://github.com/facebook/rocksdb/pull/2827 Differential Revision: D5761349 Pulled By: ajkr fbshipit-source-id: c839d548678b742af1166d60b09abd94e5476238	2017-09-22 12:42:22 -07:00
Andrew Kryczka	fc7476bec1	fix populating range deletions in forward iterator Summary: fixes #2902 Closes https://github.com/facebook/rocksdb/pull/2917 Differential Revision: D5887175 Pulled By: ajkr fbshipit-source-id: 364e292c636a3238bfc53b0fb9a01ff2f82dcbb9	2017-09-21 17:56:38 -07:00
PhaniShekhar	65a9cd6168	Use L1 size as estimate for L0 size in LevelCompactionBuilder::GetPathID Summary: Fix for [2461](https://github.com/facebook/rocksdb/issues/2461). Problem: When using multiple db_paths setting with RocksDB, RocksDB incorrectly calculates the size of L1 in LevelCompactionBuilder::GetPathId. max_bytes_for_level_base is used as L0 size and L1 size is calculated as (L0 size * max_bytes_for_level_multiplier). However, L1 size should be max_bytes_for_level_base. Solution: Use max_bytes_for_level_base as L1 size. Also, use L1 size as the estimated size of L0. Closes https://github.com/facebook/rocksdb/pull/2903 Differential Revision: D5885442 Pulled By: maysamyabandeh fbshipit-source-id: 036da1c9298d173b9b80479cc6661ee4b7a951f6	2017-09-21 15:57:58 -07:00
Yi Wu	b4596c6174	Fix Get does not return super version on error Summary: This is caught when I was testing #2886. Closes https://github.com/facebook/rocksdb/pull/2907 Differential Revision: D5863153 Pulled By: yiwu-arbug fbshipit-source-id: 8c54759ba1a0dc101f24ab50423e35731300612d	2017-09-19 12:01:09 -07:00
Maysam Yabandeh	60beefd6e0	WritePrepared Txn: Advance seq one per batch Summary: By default the seq number in DB is increased once per written key. WritePrepared txns require the seq to be increased once per batch so that the seq can be used as the prepare timestamp by which the transaction is identified. We also need to increase seq for the commit marker since it gives a unique id to the commit timestamp of transactions. Two unit tests are added to verify our understanding of how the seq should be increased. The recovery path requires much more work and is left to another patch. Closes https://github.com/facebook/rocksdb/pull/2885 Differential Revision: D5837843 Pulled By: maysamyabandeh fbshipit-source-id: a08960b93d727e1cf438c254d0c2636fb133cc1c	2017-09-18 14:45:08 -07:00
Maysam Yabandeh	c57050b770	Use the default copy constructor in Options Summary: Our current implementation of (semi-)copy constructor of DBOptions and ColumnFamilyOptions seems to intend value by value copy, which is what the default copy constructor does anyway. Moreover not using the default constructor has the risk of forgetting to add newly added options. As an example, allow_2pc seems to be forgotten in the copy constructor which was causing one of the unit tests not seeing its effect. Closes https://github.com/facebook/rocksdb/pull/2888 Differential Revision: D5846368 Pulled By: maysamyabandeh fbshipit-source-id: 1ee92a2aeae93886754b7bc039c3411ea2458683	2017-09-15 17:15:10 -07:00
Yi Wu	6b3c71f6ed	Fix DBImpl::NotifyOnCompactionCompleted data race Summary: Access of `cfd->current()` needs to hold db mutex. The data race is caught by TSAN but hard to reproduce: https://gist.github.com/yiwu-arbug/0fc6dc0de915297a1740aa9610be9373 Closes https://github.com/facebook/rocksdb/pull/2894 Differential Revision: D5843884 Pulled By: yiwu-arbug fbshipit-source-id: 0a30a421bc96f51840821538ad6453dc0815a942	2017-09-15 11:56:31 -07:00
Siying Dong	edcbb36944	Three code-level optimization to Iterator::Next() Summary: Three small optimizations: (1) iter_->IsKeyPinned() shouldn't be called if read_options.pin_data is not true. This may trigger function call all the way down the iterator tree. (2) reuse the iterator key object in DBIter::FindNextUserEntryInternal(). The constructor of the class has some overheads. (3) Move the switching direction logic in MergingIterator::Next() to a separate function. These three in total improves readseq performance by about 3% in my benchmark setting. Closes https://github.com/facebook/rocksdb/pull/2880 Differential Revision: D5829252 Pulled By: siying fbshipit-source-id: 991aea10c6d6c3b43769cb4db168db62954ad1e3	2017-09-14 17:57:31 -07:00
Siying Dong	885b1c682e	Two small refactoring for better inlining Summary: Move uncommon code paths in RangeDelAggregator::ShouldDelete() and IterKey::EnlargeBufferIfNeeded() to a separate function, so that the inlined structure can be more optimized. Optimize it because these places show up in CPU profiling, though minimally. The performance is really hard to measure. I ran db_bench with readseq benchmark against in-memory DB many times. The variation is big, but it seems to show 1% improvements. Closes https://github.com/facebook/rocksdb/pull/2877 Differential Revision: D5828123 Pulled By: siying fbshipit-source-id: 41a49e229f91e9f8409f85cc6f0dc70e31334e4b	2017-09-14 15:41:49 -07:00
Oleksandr Anyshchenko	ffac68367f	Added save points for transactions C API Summary: Added possibility to set save points in transactions and then rollback to them Closes https://github.com/facebook/rocksdb/pull/2876 Differential Revision: D5825829 Pulled By: yiwu-arbug fbshipit-source-id: 62168992340bbcddecdaea3baa2a678475d1429d	2017-09-14 14:18:59 -07:00
Yi Wu	a843df668b	Fix use-after-free in c_tset Summary: Fix asan error introduced by #2823 Closes https://github.com/facebook/rocksdb/pull/2879 Differential Revision: D5828454 Pulled By: yiwu-arbug fbshipit-source-id: 50777855667f4e7b634279a654c3bfa01a1ac729	2017-09-13 16:12:02 -07:00
Andrew Kryczka	464fb36de9	fix hanging after CompactFiles with L0 overlap Summary: Bug report: https://www.facebook.com/groups/rocksdb.dev/permalink/1389452781153232/ Non-empty `level0_compactions_in_progress_` was aborting `CompactFiles` after incrementing `bg_compaction_scheduled_`, and in that case we never decremented it. This blocked future compactions and prevented DB close as we wait for scheduled compactions to finish/abort during close. I eliminated `CompactFiles`'s dependency on `level0_compactions_in_progress_`. Since it takes a contiguous span of L0 files -- through the last L0 file if any L1+ files are included -- it's fine to run in parallel with other compactions involving L0. We make the same assumption in intra-L0 compaction. Closes https://github.com/facebook/rocksdb/pull/2849 Differential Revision: D5780440 Pulled By: ajkr fbshipit-source-id: 15b15d3faf5a699aed4b82a58352d4a7bb23e027	2017-09-13 15:41:38 -07:00
Oleksandr Anyshchenko	72e4190918	Additions for `OptimisticTransactionDB` in C API Summary: Added some bindings for `OptimisticTransactionDB` in C API Closes https://github.com/facebook/rocksdb/pull/2823 Differential Revision: D5820672 Pulled By: yiwu-arbug fbshipit-source-id: 7efd17f619cc0741feddd2050b8fc856f9288350	2017-09-13 12:12:11 -07:00
Amy Xu	5785b1fcb8	Fix naming in InternalKey Summary: - Switched all instances of SetMinPossibleForUserKey and SetMaxPossibleForUserKey in accordance to InternalKeyComparator's comparison logic Closes https://github.com/facebook/rocksdb/pull/2868 Differential Revision: D5804152 Pulled By: axxufb fbshipit-source-id: 80be35e04f2e8abc35cc64abe1fecb03af24e183	2017-09-12 17:17:42 -07:00
Maysam Yabandeh	2d30aaae47	Exclude incompatible options in test Summary: options.enable_pipelined_write and options.concurrent_prepare are incompatible and should not be set together. Closes https://github.com/facebook/rocksdb/pull/2875 Differential Revision: D5818358 Pulled By: maysamyabandeh fbshipit-source-id: dad862508f00817ab302f8b61729accf38315fb8	2017-09-12 14:58:46 -07:00
Archit Mishra	3c42807794	do not call merge when checking to see if key exists Summary: Changes: * added check for value before merge is called on code path that should check if key exists Closes https://github.com/facebook/rocksdb/pull/2814 Reviewed By: IslamAbdelRahman Differential Revision: D5743966 Pulled By: armishra fbshipit-source-id: 6ac4283bc510c8ca50827d87ef0ba631f2b33b18	2017-09-12 12:02:53 -07:00
Andrew Kryczka	025b85b4ac	speedup DBTest.EncodeDecompressedBlockSizeTest Summary: it sometimes takes more than 10 minutes (i.e., times out) on our internal CI. mainly because bzip is super slow. so I reduced the amount of work it tries to do. Closes https://github.com/facebook/rocksdb/pull/2856 Differential Revision: D5795883 Pulled By: ajkr fbshipit-source-id: e69f986ae60b44ecc26b6b024abd0f13bdf3a3c5	2017-09-12 11:26:47 -07:00
Siying Dong	64b6452e0c	Make InternalKeyComparator final and directly use it in merging iterator Summary: Merging iterator invokes InternalKeyComparator.Compare() frequently to heap merge. By making InternalKeyComparator final and having the merging iterator use InternalKeyComparator directly rather than through the Iterator interface, we give the compiler a chance to avoid one more virtual function call where possible. I ran the readseq benchmark in a memory-only use case to make sure the performance at least doesn't regress. I have to disable the final keyword in debug builds, as a hack test class depends on overriding the class. Closes https://github.com/facebook/rocksdb/pull/2860 Differential Revision: D5800461 Pulled By: siying fbshipit-source-id: ab876f22a09bb5c560740911412336e0e25ccb53	2017-09-11 12:04:21 -07:00
Siying Dong	2dd22e5449	Make DBIter class final Summary: DBIter is referenced in ArenaWrappedDBIter, which is a simple wrapper. If DBIter is final, some virtual function call can be avoided. Some functions can even be inlined, like DBIter.value() to ArenaWrappedDBIter.value() and DBIter.key() to ArenaWrappedDBIter.key(). The performance gain is hard to measure. I just ran the memory-only benchmark for readseq and saw it didn't regress. There shouldn't be any harm doing it. Just give compiler more choices. Closes https://github.com/facebook/rocksdb/pull/2859 Differential Revision: D5799888 Pulled By: siying fbshipit-source-id: 829788f91310c40282dcfb7e412e6ef489931143	2017-09-11 12:04:21 -07:00
Huachao Huang	2a5915049e	Fix missing BYTES_PER_WRITE for pipeline write Summary: Closes https://github.com/facebook/rocksdb/pull/2862 Differential Revision: D5805638 Pulled By: yiwu-arbug fbshipit-source-id: 72d38c74395690023a719f400daff01527645a17	2017-09-11 11:41:27 -07:00
Maysam Yabandeh	f46464d383	write-prepared txn: call IsInSnapshot Summary: This patch instruments the read path to verify each read value against an optional ReadCallback class. If the value is rejected, the reader moves on to the next value. The WritePreparedTxn makes use of this feature to skip sequence numbers that are not in the read snapshot. Closes https://github.com/facebook/rocksdb/pull/2850 Differential Revision: D5787375 Pulled By: maysamyabandeh fbshipit-source-id: 49d808b3062ab35e7ae98ad388f659757794184c	2017-09-11 09:14:48 -07:00
Andrew Kryczka	3cd7ea2e8a	rename stall-related internal stats Summary: Some of these names, like `MEMTABLE_COMPACTION`, did not mean anything. Tried to give them descriptive names. Closes https://github.com/facebook/rocksdb/pull/2852 Differential Revision: D5782822 Pulled By: ajkr fbshipit-source-id: f2695c4124af4073da4492d7135bae2411220f3a	2017-09-07 18:26:18 -07:00
Siying Dong	0e99323ac2	Fix CLANG Analyze Summary: clang analyze shows warnings after we upgrade the CLANG version. Fix them. Closes https://github.com/facebook/rocksdb/pull/2839 Differential Revision: D5769060 Pulled By: siying fbshipit-source-id: 3f8e4df715590d8984f6564b608fa08cfdfa5f14	2017-09-07 14:28:06 -07:00
Andrew Kryczka	10ddd59ba7	fix CompactFiles inclusion of older L0 files Summary: if we're moving any L0 files down, we need to include older L0 files since they may contain older versions of the keys being moved down. Closes https://github.com/facebook/rocksdb/pull/2845 Differential Revision: D5773800 Pulled By: ajkr fbshipit-source-id: 9f0770a8eaaeea4c87df2e7a2a1d65bf9d7f4f7e	2017-09-06 11:42:25 -07:00
Kamalalochana Subbaiah	e612e31740	Updated CRC32 Power Optimization Changes Summary: Support for PowerPC Architecture Detecting AltiVec Support Closes https://github.com/facebook/rocksdb/pull/2716 Differential Revision: D5606836 Pulled By: siying fbshipit-source-id: 720262453b1546e5fdbbc668eff56848164113f3	2017-08-31 14:16:30 -07:00
Andrew Kryczka	c10cf166fa	Dump non-final ZSTD compression type support Summary: Closes https://github.com/facebook/rocksdb/pull/2810 Differential Revision: D5739947 Pulled By: ajkr fbshipit-source-id: 09f99718b6b083c2711dcf17f7b68c305f3fd261	2017-08-30 16:41:24 -07:00
Artem Danilov	8a6708f5f2	Extend property map with compaction stats Summary: This branch extends existing property map which keeps values in doubles to keep values in strings so that it can be used to provide wider range of properties. The immediate need for that is to provide IO stall stats in an easy parseable way to MyRocks which is also part of this branch. Closes https://github.com/facebook/rocksdb/pull/2794 Differential Revision: D5717676 Pulled By: Tema fbshipit-source-id: e34ba5b79ba774697f7b97ce1138d8fd55471b8a	2017-08-30 15:26:55 -07:00
Huachao Huang	0980dc6c9a	Fix wrong smallest key of delete range tombstones Summary: Since tombstones are not stored in order, we may get a wrong smallest key if we only consider the first added tombstone. Check https://github.com/facebook/rocksdb/issues/2752 for more details. Closes https://github.com/facebook/rocksdb/pull/2799 Differential Revision: D5728217 Pulled By: ajkr fbshipit-source-id: 4a53edb0ca80d2a9fcf10749e52d47d57d6417d3	2017-08-29 18:41:35 -07:00
Andrew Kryczka	b767972313	avoid use-after-move error Summary: * db/range_del_aggregator.cc (AddTombstone): Avoid a potential use-after-move bug. The original code would both use and move `tombstone` in a context where the order of those operations is not specified. The fix is to perform the use on a new, preceding statement. Author: meyering Closes https://github.com/facebook/rocksdb/pull/2796 Differential Revision: D5721163 Pulled By: ajkr fbshipit-source-id: a1d328d6a77a17c6425e8069860a202e615e2f48	2017-08-29 12:11:56 -07:00
Maysam Yabandeh	fbfa3e7a43	WriteAtPrepare: Efficient read from snapshot list Summary: Divide the old snapshots to two lists: a few that fit into a cached array and the rest in a vector, which is expected to be empty in normal cases. The former is to optimize concurrent reads from snapshots without requiring locks. It is done by an array of std::atomic, from which std::memory_order_acquire reads are compiled to simple read instructions in most of the x86_64 architectures. Closes https://github.com/facebook/rocksdb/pull/2758 Differential Revision: D5660504 Pulled By: maysamyabandeh fbshipit-source-id: 524fcf9a8e7f90a92324536456912a99aaa6740c	2017-08-26 01:00:38 -07:00
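
The split snapshot list described in commit fbfa3e7a43 above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the RocksDB implementation: the class name `SnapshotListSketch`, the cache size of 4, the `total_` counter, and the mutex-guarded overflow vector are all hypothetical, and the sketch makes no attempt to give readers a consistent view while an update is in flight.

```cpp
#include <algorithm>
#include <array>
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <vector>

// Hypothetical cache size; the real value is not stated in the commit message.
constexpr std::size_t kCacheSize = 4;

// Hypothetical sketch: the first few snapshot sequence numbers live in an
// array of std::atomic (read without a lock), the rest in a locked vector
// that is expected to be empty in the common case.
class SnapshotListSketch {
 public:
  // Writer path: replace the whole list of snapshot sequence numbers.
  void Update(const std::vector<uint64_t>& snapshots) {
    std::lock_guard<std::mutex> guard(mutex_);
    const std::size_t cached = std::min(snapshots.size(), kCacheSize);
    for (std::size_t i = 0; i < cached; ++i) {
      cache_[i].store(snapshots[i], std::memory_order_release);
    }
    overflow_.assign(snapshots.begin() + static_cast<std::ptrdiff_t>(cached),
                     snapshots.end());
    // Publish the new size last so readers see the stored entries.
    total_.store(snapshots.size(), std::memory_order_release);
  }

  // Reader path: scan the cached array with acquire loads and take the
  // mutex only when some snapshots spilled into the overflow vector.
  bool Contains(uint64_t seq) {
    const std::size_t total = total_.load(std::memory_order_acquire);
    const std::size_t cached = std::min(total, kCacheSize);
    for (std::size_t i = 0; i < cached; ++i) {
      if (cache_[i].load(std::memory_order_acquire) == seq) {
        return true;
      }
    }
    if (total <= kCacheSize) {
      return false;  // Common case: no lock was needed.
    }
    std::lock_guard<std::mutex> guard(mutex_);
    return std::find(overflow_.begin(), overflow_.end(), seq) !=
           overflow_.end();
  }

 private:
  std::array<std::atomic<uint64_t>, kCacheSize> cache_{};
  std::atomic<std::size_t> total_{0};
  std::mutex mutex_;                // Guards overflow_ and serializes writers.
  std::vector<uint64_t> overflow_;  // Snapshots that do not fit in cache_.
};
```

With six snapshots, `Update({10, 20, 30, 40, 50, 60})` places the first four in the atomic array and the last two in the vector, so `Contains(20)` returns without locking while `Contains(60)` falls back to the mutex.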


2922 Commits