rocksdb

Author	SHA1	Message	Date
Andrew Kryczka	6a3eebbab0	support multiple db_paths in SstFileManager Summary: Now that files scheduled for deletion are kept in the same directory, we don't need to constrain deletion scheduler to `db_paths[0]`. Previously this was done because there was a separate trash directory, and this constraint prevented files from being accidentally copied to another filesystem when they're scheduled for deletion. Closes https://github.com/facebook/rocksdb/pull/3544 Differential Revision: D7093786 Pulled By: ajkr fbshipit-source-id: 202f5c92d925eafebec1281fb95bb5828d33414f	2018-03-06 12:43:51 -08:00
Fosco Marotto	d518fe1da6	uint64_t and size_t changes to compile for iOS Summary: In attempting to build a static lib for use in iOS, I ran in to lots of type errors between uint64_t and size_t. This PR contains the changes I made to get `TARGET_OS=IOS make static_lib` to succeed while also getting Xcode to build successfully with the resulting `librocksdb.a` library imported. This also compiles for me on macOS and tests fine, but I'm really not sure if I made the correct decisions about where to `static_cast` and where to change types. Also up for discussion: is iOS worth supporting? Getting the static lib is just part one, we aren't providing any bridging headers or wrappers like the ObjectiveRocks project, it won't be a great experience. Closes https://github.com/facebook/rocksdb/pull/3503 Differential Revision: D7106457 Pulled By: gfosco fbshipit-source-id: 82ac2073de7e1f09b91f6b4faea91d18bd311f8e	2018-03-06 12:43:51 -08:00
Dmitri Smirnov	c364eb42b5	Windows cumulative patch Summary: This patch addressed several issues. Portability including db_test std::thread -> port::Thread Cc: @ and %z to ROCKSDB portable macro. Cc: maysamyabandeh Implement Env::AreFilesSame Make the implementation of file unique number more robust Get rid of C-runtime and go directly to Windows API when dealing with file primitives. Implement GetSectorSize() and align unbuffered read on the value if available. Adjust Windows Logger for the new interface, implement CloseImpl() Cc: anand1976 Fix test running script issue where $status var was of incorrect scope so the failures were swallowed and not reported. DestroyDB() creates a logger and opens a LOG file in the directory being cleaned up. This holds a lock on the folder and the cleanup is prevented. This fails one of the checkpoint tests. We observe the same in production. We close the log file in this change. Fix DBTest2.ReadAmpBitmapLiveInCacheAfterDBClose failure where the test attempts to open a directory with NewRandomAccessFile which does not work on Windows. Fix DBTest.SoftLimit as it is dependent on thread timing. CC: yiwu-arbug Closes https://github.com/facebook/rocksdb/pull/3552 Differential Revision: D7156304 Pulled By: siying fbshipit-source-id: 43db0a757f1dfceffeb2b7988043156639173f5b	2018-03-06 11:57:43 -08:00
Yi Wu	b864bc9b5b	Blob DB: Improve FIFO eviction Summary: Improving blob db FIFO eviction with the following changes, * Change blob_dir_size to max_db_size. Take into account SST file size when computing DB size. * FIFO now only take into account live sst files and live blob files. It is normal for disk usage to go over max_db_size because there are obsolete sst files and blob files pending deletion. * FIFO eviction now also evict TTL blob files that's still open. It doesn't evict non-TTL blob files. * If FIFO is triggered, it will pass an expiration and the current sequence number to compaction filter. Compaction filter will then filter inlined keys to evict those with an earlier expiration and smaller sequence number. So call LSM FIFO. * Compaction filter also filter those blob indexes where corresponding blob file is gone. * Add an event listener to listen compaction/flush event and update sst file size. * Implement DB::Close() to make sure base db, as well as event listener and compaction filter, destruct before blob db. * More blob db statistics around FIFO. * Fix some locking issue when accessing a blob file. Closes https://github.com/facebook/rocksdb/pull/3556 Differential Revision: D7139328 Pulled By: yiwu-arbug fbshipit-source-id: ea5edb07b33dfceacb2682f4789bea61de28bbfa	2018-03-06 11:57:42 -08:00
Maysam Yabandeh	62277e15c3	WritePrepared Txn: Move DuplicateDetector to util Summary: Move DuplicateDetector and SetComparator to its own header file in util. It would also address a complaint in the unity test. Closes https://github.com/facebook/rocksdb/pull/3567 Differential Revision: D7163268 Pulled By: maysamyabandeh fbshipit-source-id: 6ddf82773473646dbbc1284ae601a78c4907c778	2018-03-05 23:57:12 -08:00
Huachao Huang	9cb4856dbd	Don't need to UpdateFilesByCompactionPri for kCompactionStyleNone Summary: Closes https://github.com/facebook/rocksdb/pull/3563 Differential Revision: D7154653 Pulled By: ajkr fbshipit-source-id: 4f32fb1b02451a934504c40be22b07fb1f2deb9c	2018-03-05 17:57:39 -08:00
Andrew Kryczka	5d68243e61	Comment out unused variables Summary: Submitting on behalf of another employee. Closes https://github.com/facebook/rocksdb/pull/3557 Differential Revision: D7146025 Pulled By: ajkr fbshipit-source-id: 495ca5db5beec3789e671e26f78170957704e77e	2018-03-05 13:13:41 -08:00
Maysam Yabandeh	680864ae54	WritePrepared Txn: Fix bug with duplicate keys during recovery Summary: Fix the following bugs: - During recovery a duplicate key was inserted twice into the write batch of the recovery transaction, once when the memtable returns false (because it was a duplicate) and once for the 2nd attempt. This would result in a different SubBatch count measured when the recovered transaction is committing. - If a cf is flushed during recovery the memtable is not available to assist in detecting the duplicate key. This could result in not advancing the sequence number when iterating over duplicate keys of a flushed cf and hence inserting the next key with the wrong sequence number. - SubBatchCounter would reset the comparator to default comparator after the first duplicate key. The 2nd duplicate key hence would have gone through a wrong comparator and not been detected. Closes https://github.com/facebook/rocksdb/pull/3562 Differential Revision: D7149440 Pulled By: maysamyabandeh fbshipit-source-id: 91ec317b165f363f5d11ff8b8c47c81cebb8ed77	2018-03-05 10:57:59 -08:00
Sagar Vemuri	15f55e5e06	Fix TSAN timeout in MergeOperatorPinningTest.Randomized/x test Summary: [FB - Internal] MergeOperatorPinningTest.Randomized/x tests are frequently failing with timeouts when run with tsan, as they are exceeding 10 minute limit for tests. The tests are in turn getting disabled due to frequent failures. I halved the number of rounds to make the test complete sooner. This reduces the number of testing iterations a little, but it still is much better than totally letting the test be disabled. Closes https://github.com/facebook/rocksdb/pull/3523 Differential Revision: D7031498 Pulled By: sagar0 fbshipit-source-id: 9a694f2176b235259920a42bf24bca5346f7cff1	2018-03-02 16:27:21 -08:00
Yi Wu	1209b6db5c	Blob DB: remove existing garbage collection implementation Summary: Red diff to remove existing implementation of garbage collection. The current approach is reference counting kind of approach and require a lot of effort to get the size counter right on compaction and deletion. I'm going to go with a simple mark-sweep kind of approach and will send another PR for that. CompactionEventListener was added solely for blob db and it adds complexity and overhead to compaction iterator. Removing it as well. Closes https://github.com/facebook/rocksdb/pull/3551 Differential Revision: D7130190 Pulled By: yiwu-arbug fbshipit-source-id: c3a375ad2639a3f6ed179df6eda602372cc5b8df	2018-03-02 12:57:23 -08:00
Maysam Yabandeh	d060421c77	Fix a leak in prepared_section_completed_ Summary: The zeroed entries were not removed from prepared_section_completed_ map. This patch adds a unit test to show the problem and fixes that by refactoring the code. The new code is more efficient since i) it uses two separate mutexes to avoid contention between commit and prepare threads, ii) it uses a sorted vector for maintaining unique log entries with prepare which avoids a very large heap with many duplicate entries. Closes https://github.com/facebook/rocksdb/pull/3545 Differential Revision: D7106071 Pulled By: maysamyabandeh fbshipit-source-id: b3ae17cb6cd37ef10b6b35e0086c15c758768a48	2018-03-01 20:41:56 -08:00
Yi Wu	bf937cf15b	Add "rocksdb.live-sst-files-size" DB property Summary: Add "rocksdb.live-sst-files-size" DB property which only include files of latest version. Existing "rocksdb.total-sst-files-size" include files from all versions and thus include files that's obsolete but not yet deleted. I'm going to use this new property to cap blob db sst + blob files size. Closes https://github.com/facebook/rocksdb/pull/3548 Differential Revision: D7116939 Pulled By: yiwu-arbug fbshipit-source-id: c6a52e45ce0f24ef78708156e1a923c1dd6bc79a	2018-03-01 18:01:10 -08:00
leviathan1995	ec5843dca9	Comment typo Summary: Closes https://github.com/facebook/rocksdb/pull/3546 Differential Revision: D7111708 Pulled By: ajkr fbshipit-source-id: 522a4a00eb3e34c73afcb86c1f75cd2e90e7608d	2018-02-28 09:56:45 -08:00
Andrew Kryczka	3ae0047278	skip CompactRange flush based on memtable contents Summary: CompactRange has a call to Flush because we guarantee that, at the time it's called, all existing keys in the range will be pushed through the user's compaction filter. However, previously the flush was done blindly, so it'd happen even if the memtable does not contain keys in the range specified by the user. This caused unnecessarily many L0 files to be created, leading to write stalls in some cases. This PR checks the memtable's contents, and decides to flush only if it overlaps with `CompactRange`'s range. - Move the memtable overlap check logic from `ExternalSstFileIngestionJob` to `ColumnFamilyData::RangesOverlapWithMemtables` - Reuse the above logic in `CompactRange` and skip flushing if no overlap Closes https://github.com/facebook/rocksdb/pull/3520 Differential Revision: D7018897 Pulled By: ajkr fbshipit-source-id: a3c6b1cfae56687b49dd89ccac7c948e53545934	2018-02-27 17:12:44 -08:00
Zhongyi Xie	ad05cbb182	DB:Open should fail on tmpfs when use_direct_reads=true Summary: Before: > $ TEST_TMPDIR=/dev/shm ./db_bench -use_direct_reads=true -benchmarks=readrandomwriterandom -num=10000000 -reads=100000 -write_buffer_size=1048576 -target_file_size_base=1048576 -max_bytes_for_level_base=4194304 -max_background_jobs=12 -readwritepercent=50 -key_size=16 -value_size=48 -threads=32 DB path: [/dev/shm/dbbench] put error: IO error: While open a file for random read: /dev/shm/dbbench/000007.sst: Invalid argument put error: IO error: While open a file for random read: /dev/shm/dbbench/000007.sst: Invalid argument put error: IO error: While open a file for random read: /dev/shm/dbbench/000007.sst: Invalid argument put error: IO error: While open a file for random read: /dev/shm/dbbench/000007.sst: Invalid argument put error: IO error: While open a file for random read: /dev/shm/dbbench/000007.sst: Invalid argument put error: IO error: While open a file for random read: /dev/shm/dbbench/000007.sst: Invalid argument put error: IO error: While open a file for random read: /dev/shm/dbbench/000007.sst: Invalid argument put error: IO error: While open a file for random read: /dev/shm/dbbench/000007.sst: Invalid argument put error: IO error: While open a file for random read: /dev/shm/dbbench/000007.sst: Invalid argument db_bench: tpp.c:84: __pthread_tpp_change_priority: Assertion `new_prio == -1 \|\| (new_prio >= fifo_min_prio && new_prio <= fifo_max_prio)' failed. 
put error: IO error: While open a file for random read: /dev/shm/dbbench/000007.sst: Invalid argument put error: IO error: While open a file for random read: /dev/shm/dbbench/000007.sst: Invalid argument After: > TEST_TMPDIR=/dev/shm ./db_bench -use_direct_reads=true -benchmarks=readrandomwriterandom -num=10000000 -reads=100000 -write_buffer_size=1048576 -target_file_size_base=1048576 -max_bytes_for_level_base=4194304 -max_background_jobs=12 -readwritepercent=50 -key_size=16 -value_size=48 -threads=32 Initializing RocksDB Options from the specified file Initializing RocksDB Options from command-line flags open error: Not implemented: Direct I/O is not supported by the specified DB. Closes https://github.com/facebook/rocksdb/pull/3539 Differential Revision: D7082658 Pulled By: miasantreble fbshipit-source-id: f9d9c6ec3b5e9e049cab52154940ee101ba4d342	2018-02-26 14:58:06 -08:00
Anand Ananthabhotla	dfbe52e099	Fix the Logger::Close() and DBImpl::Close() design pattern Summary: The recent Logger::Close() and DBImpl::Close() implementation rely on calling the CloseImpl() virtual function from the destructor, which will not work. Refactor the implementation to have a private close helper function in derived classes that can be called by both CloseImpl() and the destructor. Closes https://github.com/facebook/rocksdb/pull/3528 Reviewed By: gfosco Differential Revision: D7049303 Pulled By: anand1976 fbshipit-source-id: 76a64cbf403209216dfe4864ecf96b5d7f3db9f4	2018-02-23 13:57:26 -08:00
Siying Dong	30649dc6a1	Have a different function when ROCKSDB_JEMALLOC=0 Summary: Some sanitizer is not happy with parameter name with ROCKSDB_JEMALLOC not set. Use another function instead. Closes https://github.com/facebook/rocksdb/pull/3536 Differential Revision: D7064849 Pulled By: siying fbshipit-source-id: c6ae94e044686176af1259df9172453d52c2f9d5	2018-02-23 11:42:33 -08:00
Igor Sugak	aba3409740	Back out "[codemod] - comment out unused parameters" Reviewed By: igorsugak fbshipit-source-id: 4a93675cc1931089ddd574cacdb15d228b1e5f37	2018-02-22 12:43:17 -08:00
David Lai	f4a030ce81	- comment out unused parameters Reviewed By: everiq, igorsugak Differential Revision: D7046710 fbshipit-source-id: 8e10b1f1e2aecebbfb229c742e214db887e5a461	2018-02-22 09:44:23 -08:00
Sagar Vemuri	8ada876dfe	Add rocksdb.iterator.internal-key property Summary: Added a new iterator property: `rocksdb.iterator.internal-key` to get the internal-key (converted to user key) at which the iterator stopped. Closes https://github.com/facebook/rocksdb/pull/3525 Differential Revision: D7033694 Pulled By: sagar0 fbshipit-source-id: d51e6c00f5e9d766c6276ef79774b81c6c5216f8	2018-02-20 19:12:09 -08:00
Maysam Yabandeh	c178da053b	WritePrepared Txn: optimizations for sysbench update_noindex Summary: These are optimizations that we applied to improve sysbench's update_noindex performance. 1. Make use of LIKELY compiler hint 2. Move std::atomic to the subclass 3. Make use of skip_prepared in non-2pc transactions. Closes https://github.com/facebook/rocksdb/pull/3512 Differential Revision: D7000075 Pulled By: maysamyabandeh fbshipit-source-id: 1ab8292584df1f6305a4992973fb1b7933632181	2018-02-16 08:42:31 -08:00
Mike Kolupaev	97307d888f	Fix deadlock in ColumnFamilyData::InstallSuperVersion() Summary: Deadlock: a memtable flush holds DB::mutex_ and calls ThreadLocalPtr::Scrape(), which locks ThreadLocalPtr mutex; meanwhile, a thread exit handler locks ThreadLocalPtr mutex and calls SuperVersionUnrefHandle, which tries to lock DB::mutex_. This deadlock is hit all the time on our workload. It blocks our release. In general, the problem is that ThreadLocalPtr takes an arbitrary callback and calls it while holding a lock on a global mutex. The same global mutex is (at least in some cases) locked by almost all ThreadLocalPtr methods, on any instance of ThreadLocalPtr. So, there'll be a deadlock if the callback tries to do anything to any instance of ThreadLocalPtr, or waits for another thread to do so. So, probably the only safe way to use ThreadLocalPtr callbacks is to do only do simple and lock-free things in them. This PR fixes the deadlock by making sure that local_sv_ never holds the last reference to a SuperVersion, and therefore SuperVersionUnrefHandle never has to do any nontrivial cleanup. I also searched for other uses of ThreadLocalPtr to see if they may have similar bugs. There's only one other use, in transaction_lock_mgr.cc, and it looks fine. Closes https://github.com/facebook/rocksdb/pull/3510 Reviewed By: sagar0 Differential Revision: D7005346 Pulled By: al13n321 fbshipit-source-id: 37575591b84f07a891d6659e87e784660fde815f	2018-02-16 08:13:34 -08:00
Maysam Yabandeh	8eb1d445c3	Unbreak MemTableRep API change Summary: The MemTableRep API was broken by this commit: 813719e9525f647aaebf19ca3d4bb6f1c63e2648 This patch reverts the changes and instead adds InsertKey (and etc.) overloads to extend the MemTableRep API without breaking the existing classes that inherit from it. Closes https://github.com/facebook/rocksdb/pull/3513 Differential Revision: D7004134 Pulled By: maysamyabandeh fbshipit-source-id: e568d91fe1e17dd76c0c1f6c7dd51a18633b1c4f	2018-02-15 17:27:24 -08:00
jsteemann	4e7a182d09	Several small "fixes" Summary: - removed a few unneeded variables - fused some variable declarations and their assignments - fixed right-trimming code in string_util.cc to not underflow - simplified an assertion - moved non-nullptr check assertion before dereferencing of that pointer - pass an std::string function parameter by const reference instead of by value (avoiding potential copy) Closes https://github.com/facebook/rocksdb/pull/3507 Differential Revision: D7004679 Pulled By: sagar0 fbshipit-source-id: 52944952d9b56dfcac3bea3cd7878e315bb563c4	2018-02-15 16:57:37 -08:00
Zhongyi Xie	c88c57cde1	Tweak external file ingestion seqno logic under universal compaction Summary: Right now it is possible that a file gets assigned to L0 but also assigned the seqno from a higher level which it doesn't fit. Under the current impl, it is possible that seqno in lower levels (Ln) can be equal to smallest seqno of higher levels (Ln-1), which is undesirable from universal compaction's point of view. This should fix the intermittent failure of `ExternalSSTFileBasicTest.IngestFileWithGlobalSeqnoPickedSeqno` Closes https://github.com/facebook/rocksdb/pull/3411 Differential Revision: D6813802 Pulled By: miasantreble fbshipit-source-id: 693d0462fa94725ccfb9d8858743e6d2d9992d14	2018-02-15 14:13:39 -08:00
Fosco Marotto	ba6ee1f749	Fix 2 more unused reference errors VS2017 Summary: As in #3425 Closes https://github.com/facebook/rocksdb/pull/3497 Differential Revision: D6979588 Pulled By: gfosco fbshipit-source-id: e9fb32d04ad45575dfe9de1d79348d158e474197	2018-02-14 11:12:36 -08:00
Igor Sugak	d08d05cb62	fix UBSAN errors in fault_injection_test Summary: This fixes shift and signed-integer-overflow UBSAN checks in fault_injection_test by using a larger and unsigned type. Closes https://github.com/facebook/rocksdb/pull/3498 Reviewed By: siying Differential Revision: D6981116 Pulled By: igorsugak fbshipit-source-id: 3688f62cce570534b161e9b5f42109ebc9ae5a2c	2018-02-13 14:12:40 -08:00
Siying Dong	dadf01672a	Rename one of the two LevelIterator Summary: A new LevelIterator was recently created. Rename the old one to make unity build happy. It's also not a good idea to have two classes in the same name anyway. Closes https://github.com/facebook/rocksdb/pull/3499 Differential Revision: D6979325 Pulled By: siying fbshipit-source-id: 3a032d93fe205650a08e92e5262594731ec726bb	2018-02-13 13:57:58 -08:00
Siying Dong	b555ed30a4	Customized BlockBasedTableIterator and LevelIterator Summary: Use a customized BlockBasedTableIterator and LevelIterator to replace current implementations leveraging two-level-iterator. Hope the customized logic will make code easier to understand. As a side effect, BlockBasedTableIterator reduces the allocation for the data block iterator object, and avoids the virtual function call to it, because we can directly reference BlockIter, a final class. Similarly, LevelIterator reduces virtual function call to the dummy iterator iterating the file metadata. It also enabled further optimization. The upper bound check is also moved from index block to data block. This implementation fits this iterator better. After the change, forward iterator is slightly optimized to ensure we trim those iterators. The two-level-iterator now is only used by partitioned index, so it is simplified. Closes https://github.com/facebook/rocksdb/pull/3406 Differential Revision: D6809041 Pulled By: siying fbshipit-source-id: 7da3b9b1d3c8e9d9405302c15920af1fcaf50ffa	2018-02-12 17:12:25 -08:00
Andrew Kryczka	ee1c802675	Add delay before flush in CompactRange to avoid write stalling Summary: - Refactored logic for checking write stall condition to a helper function: `GetWriteStallConditionAndCause`. Now it is decoupled from the logic for updating WriteController / stats in `RecalculateWriteStallConditions`, so we can reuse it for predicting whether write stall will occur. - Updated `CompactRange` to first check whether the one additional immutable memtable / L0 file would cause stalling before it flushes. If so, it waits until that is no longer true. - Updated `bg_cv_` to be signaled on `SetOptions` calls. The stall conditions `CompactRange` cares about can change when (1) flush finishes, (2) compaction finishes, or (3) options dynamically change. The cv was already signaled for (1) and (2) but not yet for (3). Closes https://github.com/facebook/rocksdb/pull/3381 Differential Revision: D6754983 Pulled By: ajkr fbshipit-source-id: 5613e03f1524df7192dc6ae885d40fd8f091d972	2018-02-12 15:42:47 -08:00
Zhongyi Xie	3f1bb07351	make flush_reason_ atomic to keep TSAN happy Summary: Closes https://github.com/facebook/rocksdb/pull/3487 Differential Revision: D6967098 Pulled By: miasantreble fbshipit-source-id: 48e0accf2e3b3f589ddb797ff8083c8520269bf0	2018-02-12 13:28:18 -08:00
Siying Dong	ef29d2a234	Explicitly fail writes if key or value is not smaller than 4GB Summary: Right now, users will encounter unexpected behavior if they use a key or value larger than 4GB. We should explicitly fail the queries. Closes https://github.com/facebook/rocksdb/pull/3484 Differential Revision: D6953895 Pulled By: siying fbshipit-source-id: b60491e1af064fc5d52971956661f6c18ceac24f	2018-02-09 14:57:54 -08:00
Yi Wu	fe228da0a9	WritePrepared Txn: Support merge operator Summary: CompactionIterator invoke MergeHelper::MergeUntil() to do partial merge between snapshot boundaries. Previously it only depend on sequence number to tell snapshot boundary, but we also need to make use of snapshot_checker to verify visibility of the merge operands to the snapshots. For example, say there is a snapshot with seq = 2 but only can see data with seq <= 1. There are three merges, each with seq = 1, 2, 3. A correct compaction output would be (1),(2+3). Without taking snapshot_checker into account when generating merge result, compaction will generate output (1+2),(3). By filtering uncommitted keys with read callback, the read path already take care of merges well and don't need additional updates. Closes https://github.com/facebook/rocksdb/pull/3475 Differential Revision: D6926087 Pulled By: yiwu-arbug fbshipit-source-id: 8f539d6f897cfe29b6dc27a8992f68c2a629d40a	2018-02-09 14:57:54 -08:00
Zhongyi Xie	945f618ba5	log flush reason for better debugging experience Summary: It's always a mystery from the logs why flush was triggered -- user triggered it manually, WriteBufferManager triggered it, logs were full, write buffer was full, etc. This PR logs Flush reason whenever a flush is scheduled. Closes https://github.com/facebook/rocksdb/pull/3401 Differential Revision: D6788142 Pulled By: miasantreble fbshipit-source-id: a867e54d493c06adf5172bd36a180fb3faae3511	2018-02-09 12:12:43 -08:00
Siying Dong	821e0b1683	Disable options_settable_test in UBSAN and fix UBSAN failure in blob_… Summary: …db_test options_settable_test won't pass UBSAN so disable it. blob_db_test fails in UBSAN as SnapshotList doesn't initialize all the fields in dummy snapshot. Fix it. I don't understand why only blob_db_test fails though. Closes https://github.com/facebook/rocksdb/pull/3477 Differential Revision: D6928681 Pulled By: siying fbshipit-source-id: e31dd300fcdecdfd4f6af279a0987fd0cdec5122	2018-02-07 14:42:26 -08:00
Yi Wu	81736d8afe	WritePrepared Txn: update compaction_iterator_test and db_iterator_test Summary: Update compaction_iterator_test with write-prepared transaction DB related tests. Transaction related tests are group in CompactionIteratorWithSnapshotCheckerTest. The existing test are duplicated to make them also test with dummy SnapshotChecker that will say every key is visible to every snapshot (this is okay, we still compare sequence number to verify visibility). Merge related tests are disabled and will be revisit in another PR. Existing db_iterator_tests are also duplicated to test with dummy read_callback that will say every key is committed. Closes https://github.com/facebook/rocksdb/pull/3466 Differential Revision: D6909253 Pulled By: yiwu-arbug fbshipit-source-id: 2ae4656b843a55e2e9ff8beecf21f2832f96cd25	2018-02-06 14:12:13 -08:00
Maysam Yabandeh	88d8b2a2f5	WritePrepared Txn: Duplicate Keys, Txn Part Summary: This patch takes advantage of memtable being able to detect duplicate <key,seq> and returning TryAgain to handle duplicate keys in WritePrepared Txns. Through WriteBatchWithIndex's index it detects existence of at least a duplicate key in the write batch. If duplicate key was reported, it then pays the cost of counting the number of sub-batches by iterating over the write batch and passes it to DBImpl::Write. DB will make use of the provided batch_count to assign proper sequence numbers before sending them to the WAL. When later inserting the batch to the memtable, it increases the seq each time memtable reports a duplicate (a sub-batch in our counting) and tries again. Closes https://github.com/facebook/rocksdb/pull/3455 Differential Revision: D6873699 Pulled By: maysamyabandeh fbshipit-source-id: db8487526c3a5dc1ddda0ea49f0f979b26ae648d	2018-02-05 18:43:24 -08:00
Anand Ananthabhotla	4b124fb9d3	Handle error return from WriteBuffer() Summary: There are a couple of places where we swallow any error from WriteBuffer() - in SwitchMemtable() and DBImpl::CloseImpl(). Propagate the error up in those cases rather than ignoring it. Closes https://github.com/facebook/rocksdb/pull/3404 Differential Revision: D6879954 Pulled By: anand1976 fbshipit-source-id: 2ef88b554be5286b0a8bad7384ba17a105395bdb	2018-02-05 13:59:34 -08:00
Mike Kolupaev	cb5b8f2090	Fix use-after-free in tailing iterator with merge operator Summary: ForwardIterator::SVCleanup() sometimes didn't pin superversion when it was supposed to. See the added test for the scenario. Here's the ASAN output of the added test without the fix (using `COMPILE_WITH_ASAN=1 make`): https://pastebin.com/9rD0Ywws Closes https://github.com/facebook/rocksdb/pull/3415 Differential Revision: D6817414 Pulled By: al13n321 fbshipit-source-id: bc80c44ea78a3a1fa885dfa448a26111f91afb24	2018-02-02 21:26:28 -08:00
Tamir Duberstein	cd5092e168	Suppress unused warnings Summary: - Use `__unused__` everywhere - Suppress unused warnings in Release mode + This currently affects non-MSVC builds (e.g. mingw64). Closes https://github.com/facebook/rocksdb/pull/3448 Differential Revision: D6885496 Pulled By: miasantreble fbshipit-source-id: f2f6adacec940cc3851a9eee328fafbf61aad211	2018-02-02 12:27:07 -08:00
Fosco Marotto	ba8aa8fdc8	Upgrade Appveyor to VS2017 Summary: Per some discussions, this will switch our Appveyor testing to use Visual Studio 2017. Closes https://github.com/facebook/rocksdb/pull/3445 Differential Revision: D6874918 Pulled By: gfosco fbshipit-source-id: c5a0032ca9f37f0d3baeae35c59d850d528c3176	2018-02-01 13:57:01 -08:00
Maysam Yabandeh	813719e952	WritePrepared Txn: Duplicate Keys, Memtable part Summary: Currently DB does not accept duplicate keys (keys with the same user key and the same sequence number). If Memtable returns false when receiving such keys, we can benefit from this signal to properly increase the sequence number in the rare cases when we have a duplicate key in the write batch written to DB under WritePrepared transactions. Closes https://github.com/facebook/rocksdb/pull/3418 Differential Revision: D6822412 Pulled By: maysamyabandeh fbshipit-source-id: adea3ce5073131cd38ed52b16bea0673b1a19e77	2018-01-31 18:57:07 -08:00
Fosco Marotto	5400800a56	Work around VS2017 warning for unused reference Summary: For #3407 Closes https://github.com/facebook/rocksdb/pull/3425 Differential Revision: D6836900 Pulled By: gfosco fbshipit-source-id: 7bcaf7a1beeeeabb7c05584f2745e7b4a2473497	2018-01-31 11:58:10 -08:00
Andrew Kryczka	ab5ab36ac2	fix DBTest2.ReadAmpBitmapLiveInCacheAfterDBClose file ID support check Summary: Updated the test case to handle tmpfs mounted at directories different from "/dev/shm/". Closes https://github.com/facebook/rocksdb/pull/3440 Differential Revision: D6848213 Pulled By: ajkr fbshipit-source-id: 465e9dbf0921d0930161f732db6b3766bb030589	2018-01-30 16:50:42 -08:00
Huachao Huang	ab43ff58b5	Delete files in multiple ranges at once Summary: Using `DeleteFilesInRange` to delete files in a lot of ranges can be slow, because `VersionSet::LogAndApply` is expensive. This PR adds a new `DeleteFilesInRange` function to delete files in multiple ranges at once. Close https://github.com/facebook/rocksdb/issues/2951 Closes https://github.com/facebook/rocksdb/pull/3431 Differential Revision: D6849228 Pulled By: ajkr fbshipit-source-id: daeedcabd8def4b1d9ee95a58266dee77b5d68cb	2018-01-30 13:56:39 -08:00
Yi Wu	4bdf06e78f	Fix DBFlushTest::ManualFlushWithMinWriteBufferNumberToMerge dead lock Summary: In the test, there can be a dead lock between background flush thread and foreground main thread as following: * background flush thread: - holding db mutex, while - waiting on "DBImpl::FlushMemTableToOutputFile:BeforeInstallSV" sync point. * foreground thread: - waiting for db mutex to write "key2" Fixing by let background flush thread wait without holding db mutex. Closes https://github.com/facebook/rocksdb/pull/3436 Differential Revision: D6841334 Pulled By: yiwu-arbug fbshipit-source-id: b020768ac94e166e40953c5d09e505515a5f244d	2018-01-29 18:56:47 -08:00
Sagar Vemuri	e6605e5302	Tests for dynamic universal compaction options Summary: Added a test for three dynamic universal compaction options, in the realm of read amplification: - size_ratio - min_merge_width - max_merge_width Also updated DynamicUniversalCompactionSizeAmplification by adding a check on compaction reason. Found a bug in compaction reason setting while working on this PR, and fixed in #3412 . TODO for later: Still to add tests for these options: compression_size_percent, stop_style and trivial_move. Closes https://github.com/facebook/rocksdb/pull/3419 Differential Revision: D6822217 Pulled By: sagar0 fbshipit-source-id: 074573fca6389053cbac229891a0163f38bb56c4	2018-01-29 16:42:45 -08:00
Zhongyi Xie	3fe0937180	Use block cache to track memory usage when ReadOptions.fill_cache=false Summary: ReadOptions.fill_cache is set in compaction inputs and can be set by users in their queries too. It tells RocksDB not to put a data block used to block cache. The memory used by the data block is, however, not trackable by users. To make the system more manageable, we can cost the block to block cache while using it, and then release it after using. Closes https://github.com/facebook/rocksdb/pull/3333 Differential Revision: D6670230 Pulled By: miasantreble fbshipit-source-id: ab848d3ed286bd081a13ee1903de357b56cbc308	2018-01-29 14:43:10 -08:00
Mark Isaacson	b8eb32f8cf	Suppress lint in old files Summary: Grandfather in super old lint issues to make a clean slate for moving forward that allows us to have stronger enforcement on new issues. Reviewed By: yiwu-arbug Differential Revision: D6821806 fbshipit-source-id: 22797d31ec58e9eb0255d3b66fedfcfcb0dc127c	2018-01-29 12:56:42 -08:00
Sagar Vemuri	7fcc1d0ddf	Incorrect Universal Compaction reason Summary: While writing tests for dynamic Universal Compaction options, I found that the compaction reasons we set for size-ratio based and sorted-run based universal compactions are swapped with each other. Fixed it. Closes https://github.com/facebook/rocksdb/pull/3412 Differential Revision: D6820540 Pulled By: sagar0 fbshipit-source-id: 270a188968ba25b2c96a8339904416c4c87ff5b3	2018-01-26 11:12:40 -08:00


3077 Commits