rocksdb

Author	SHA1	Message	Date
Reid Horuff	5cf176ca15	Fix for 2PC causing WAL to grow too large Summary: Consider the following single column family scenario: prepare in log A commit in log B WAL is too large, flush all CFs to releast log A CFA is on log B so we do not see CFA is depending on log A so no flush is requested To fix this we must also consider the log containing the prepare section when determining what log a CF is dependent on. Closes https://github.com/facebook/rocksdb/pull/1768 Differential Revision: D4403265 Pulled By: reidHoruff fbshipit-source-id: ce800ff	2017-01-19 15:39:12 -08:00
Dmitri Smirnov	3c233ca4ea	Fix Windows environment issues Summary: Enable directIO on WritableFileImpl::Append with offset being current length of the file. Enable UniqueID tests on Windows, disable others but leeting them to compile. Unique tests are valuable to detect failures on different filesystems and upcoming ReFS. Clear output in WinEnv Getchildren.This is different from previous strategy, do not touch output on failure. Make sure DBTest.OpenWhenOpen works with windows error message Closes https://github.com/facebook/rocksdb/pull/1746 Differential Revision: D4385681 Pulled By: IslamAbdelRahman fbshipit-source-id: c07b702	2017-01-09 15:54:12 -08:00
Maysam Yabandeh	7631734563	Fix the error in ColumnFamiliesTest Summary: In the test the last change to AAAZZZ in handles[1] is deleting it. The result of the get must be NotFound then. Previosuly the test did not check for the return value of Get and assumed that the status is ok. It then move ahead asserting the returned value. The passed-by-reference string value however was not changed (since the key was not found) and the asserted value is what it contained before doing the Get. Closes https://github.com/facebook/rocksdb/pull/1753 Differential Revision: D4390982 Pulled By: maysamyabandeh fbshipit-source-id: dd55a34	2017-01-09 14:09:13 -08:00
Daniel Black	816c1e30ca	gcc-7 requires include <functional> for std::function Summary: Fixes compile error: In file included from ./util/statistics.h:17:0, from ./util/stop_watch.h:8, from ./util/perf_step_timer.h:9, from ./util/iostats_context_imp.h:8, from ./util/posix_logger.h:27, from ./port/util_logger.h:18, from ./db/auto_roll_logger.h:15, from db/auto_roll_logger.cc:6: ./util/thread_local.h:65:16: error: 'function' in namespace 'std' does not name a template type typedef std::function<void(void, void)> FoldFunc; Closes https://github.com/facebook/rocksdb/pull/1656 Differential Revision: D4318702 Pulled By: yiwu-arbug fbshipit-source-id: 8c5d17a	2016-12-16 11:24:18 -08:00
Daniel Black	0ab6fc167f	Gcc-7 buffer size insufficient Summary: Bunch of commits related to insufficient buffer size. Errors in individual commits. Closes https://github.com/facebook/rocksdb/pull/1673 Differential Revision: D4332127 Pulled By: IslamAbdelRahman fbshipit-source-id: 878f73c	2016-12-14 19:24:26 -08:00
Yi Wu	c26a4d8e8a	Fix compile error in trasaction_lock_mgr.cc Summary: Fix error on mac/windows build since they don't recognize `uint`. Closes https://github.com/facebook/rocksdb/pull/1624 Differential Revision: D4287139 Pulled By: yiwu-arbug fbshipit-source-id: b7cc88f	2016-12-06 14:39:16 -08:00
Manuel Ung	2005c88a75	Implement non-exclusive locks Summary: This is an implementation of non-exclusive locks for pessimistic transactions. It is relatively simple and does not prevent starvation (ie. it's possible that request for exclusive access will never be granted if there are always threads holding shared access). It is done by changing `KeyLockInfo` to hold an set a transaction ids, instead of just one, and adding a flag specifying whether this lock is currently held with exclusive access or not. Some implementation notes: - Some lock diagnostic functions had to be updated to return a set of transaction ids for a given lock, eg. `GetWaitingTxn` and `GetLockStatusData`. - Deadlock detection is a bit more complicated since a transaction can now wait on multiple other transactions. A BFS is done in this case, and deadlock detection depth is now just a limit on the number of transactions we visit. - Expirable transactions do not work efficiently with shared locks at the moment, but that's okay for now. Closes https://github.com/facebook/rocksdb/pull/1573 Differential Revision: D4239097 Pulled By: lth fbshipit-source-id: da7c074	2016-12-05 17:39:17 -08:00
Manuel Ung	e63350e726	Use more efficient hash map for deadlock detection Summary: Currently, deadlock cycles are held in std::unordered_map. The problem with it is that it allocates/deallocates memory on every insertion/deletion. This limits throughput since we're doing this expensive operation while holding a global mutex. Fix this by using a vector which caches memory instead. Running the deadlock stress test, this change increased throughput from 39k txns/s -> 49k txns/s. The effect is more noticeable in MyRocks. Closes https://github.com/facebook/rocksdb/pull/1545 Differential Revision: D4205662 Pulled By: lth fbshipit-source-id: ff990e4	2016-11-19 11:39:15 -08:00
Reid Horuff	1ca5f6d132	Fix 2PC Recovery SeqId Miscount Summary: Originally sequence ids were calculated, in recovery, based off of the first seqid found if the first log recovered. The working seqid was then incremented from that value based on every insertion that took place. This was faulty because of the potential for missing log files or inserts that skipped the WAL. The current recovery scheme grabs sequence from current recovering batch and increments using memtableinserter to track how many actual inserts take place. This works for 2PC batches as well scenarios where some logs are missing or inserts that skip the WAL. Closes https://github.com/facebook/rocksdb/pull/1486 Differential Revision: D4156064 Pulled By: reidHoruff fbshipit-source-id: a6da8d9	2016-11-10 11:09:22 -08:00
Reid Horuff	d133b08f68	Use correct sequence number when creating memtable Summary: copied from: `5ebfd2623a` Opening existing RocksDB attempts recovery from log files, which uses wrong sequence number to create the memtable. This is a regression introduced in change `a400336`. This change includes a test demonstrating the problem, without the fix the test fails with "Operation failed. Try again.: Transaction could not check for conflicts for operation at SequenceNumber 1 as the MemTable only contains changes newer than SequenceNumber 2. Increasing the value of the max_write_buffer_number_to_maintain option could reduce the frequency of this error" This change is a joint effort by Peter 'Stig' Edwards thatsafunnyname and me. Closes https://github.com/facebook/rocksdb/pull/1458 Differential Revision: D4143791 Pulled By: reidHoruff fbshipit-source-id: 5a25033	2016-11-09 12:24:17 -08:00
Reid Horuff	4dfaa6610a	Make IsDeadlockDetect() virtual member of Transaction Summary: Make `IsDeadlockDetect()` virtual member of base class `Transaction` for ease of use in MyRocks Test Plan: compiles. compiles into MyRocks call-site. Reviewers: mung Reviewed By: mung Subscribers: andrewkr, dhruba Differential Revision: https://reviews.facebook.net/D65385	2016-10-21 14:47:59 -07:00
Manuel Ung	4edd39fda2	Implement deadlock detection Summary: Implement deadlock detection. This is done by maintaining a TxnID -> TxnID map which represents the edges in the wait for graph (this is named `wait_txn_map_`). Test Plan: transaction_test Reviewers: IslamAbdelRahman, sdong Reviewed By: sdong Subscribers: andrewkr, dhruba Differential Revision: https://reviews.facebook.net/D64491	2016-10-19 19:45:57 -07:00
Reid Horuff	8c55bb87c8	Make Lock Info test multiple column families Summary: Modifies the lock info export test to test multiple column families after I was experiencing a bug while developing the MyRocks front-end for this. Test Plan: is test. Reviewers: mung Reviewed By: mung Subscribers: andrewkr, dhruba Differential Revision: https://reviews.facebook.net/D64725	2016-10-07 15:04:05 -07:00
Reid Horuff	37737c3a6b	Expose Transaction State Publicly Summary: This exposes a transactions state through a public api rather than through a public member variable. I also do some name refactoring. ExecutionStatus => TransactionState exec_status_ => trx_state_ Test Plan: It compiles and transaction_test passes. Reviewers: IslamAbdelRahman Reviewed By: IslamAbdelRahman Subscribers: andrewkr, mung, dhruba, sdong Differential Revision: https://reviews.facebook.net/D64689	2016-10-07 11:58:53 -07:00
Reid Horuff	2c1f95291d	Add facility to write only a portion of WriteBatch to WAL Summary: When constructing a write batch a client may now call MarkWalTerminationPoint() on that batch. No batch operations after this call will be added written to the WAL but will still be inserted into the Memtable. This facility is used to remove one of the three WriteImpl calls in 2PC transactions. This produces a ~1% perf improvement. ``` RocksDB - unoptimized 2pc, sync_binlog=1, disable_2pc=off INFO 2016-08-31 14:30:38,814 [main]: REQUEST PHASE COMPLETED. 75000000 requests done in 2619 seconds. Requests/second = 28628 RocksDB - optimized 2pc , sync_binlog=1, disable_2pc=off INFO 2016-08-31 16:26:59,442 [main]: REQUEST PHASE COMPLETED. 75000000 requests done in 2581 seconds. Requests/second = 29054 ``` Test Plan: Two unit tests added. Reviewers: sdong, yiwu, IslamAbdelRahman Reviewed By: yiwu Subscribers: hermanlee4, dhruba, andrewkr Differential Revision: https://reviews.facebook.net/D64599	2016-10-07 11:32:10 -07:00
Manuel Ung	be1f1092c9	Expose transaction id, lock state information and transaction wait information Summary: This diff does 3 things: Expose TransactionID so that we can identify transactions when we retrieve locking and lock wait information. This is exposed as `Transaction::GetID`. Expose lock state information by locking all stripes in all column families and copying their contents to a data structure. This is exposed as `TransactionDB::GetLockStatusData`. Adds support for tracking the transaction and the key being waited on, and exposes this as `Transaction::GetWaitingTxn`. Test Plan: unit tests Reviewers: horuff, sdong Reviewed By: sdong Subscribers: vasilep, hermanlee4, andrewkr, dhruba Differential Revision: https://reviews.facebook.net/D64413	2016-09-30 11:41:21 -07:00
Wanning Jiang	78837f5d61	TableBuilder / TableReader support for range deletion Summary: 1. Range Deletion Tombstone structure 2. Modify Add() in table_builder to make it usable for adding range del tombstones 3. Expose NewTombstoneIterator() API in table_reader Test Plan: table_test.cc (now BlockBasedTableBuilder::Add() only accepts InternalKey. I make table_test only pass InternalKey to BlockBasedTableBuidler. Also test writing/reading range deletion tombstones in table_test ) Reviewers: sdong, IslamAbdelRahman, lightmark, andrewkr Reviewed By: andrewkr Subscribers: andrewkr, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D61473	2016-08-19 15:10:31 -07:00
Aaron Gao	76a67cf741	support stackableDB as the baseDB of transactionDB Summary: make transactionDB working with StackableDB Test Plan: make all check -j64 Reviewers: andrewkr, yiwu, sdong Reviewed By: sdong Subscribers: andrewkr, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D60705	2016-08-11 14:19:33 -07:00
Aaron Gao	7c190070b4	delete unnessary pointer cast in beginInternalTransaction() function Summary: use of dynamic_cast<TransactionImpl*> is unnecessary and also introduce difficulty for fbrocksdb support of TransactionDB Test Plan: ./transaction_test Reviewers: sdong, IslamAbdelRahman, andrewkr Reviewed By: andrewkr Subscribers: andrewkr, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D60501	2016-07-07 16:34:41 -07:00
Reid Horuff	892e9d3047	make transaction WriteOptions modifiable	2016-06-27 12:53:30 -07:00
Islam AbdelRahman	05c5c39a7c	Fix build	2016-05-18 00:41:14 -07:00
Reid Horuff	a6254f2bd4	Long outstanding prepare test Summary: This tests that a prepared transaction is not lost after several crashes, restarts, and memtable flushes. Test Plan: TwoPhaseLongPrepareTest Reviewers: sdong Subscribers: hermanlee4, andrewkr, dhruba Differential Revision: https://reviews.facebook.net/D58185	2016-05-17 18:57:06 -07:00
Islam AbdelRahman	2ead115116	Fix TransactionTest.TwoPhaseMultiThreadTest under TSAN Summary: TransactionTest.TwoPhaseMultiThreadTest runs forever under TSAN and our CI builds time out looks like the reason is that some threads keep running and other threads dont get a chance to increment the counter Test Plan: run the test under TSAN Reviewers: sdong, horuff Reviewed By: horuff Subscribers: andrewkr, dhruba Differential Revision: https://reviews.facebook.net/D58359	2016-05-17 18:54:27 -07:00
Islam AbdelRahman	f6aedb62c0	Fix Transaction memory leak Summary: - Make sure we clean up recovered_transactions_ on DBImpl destructor - delete leaked txns and env in TransactionTest Test Plan: Run transaction_test under valgrind Reviewers: sdong, andrewkr, yhchiang, horuff Reviewed By: horuff Subscribers: andrewkr, dhruba Differential Revision: https://reviews.facebook.net/D58263	2016-05-16 16:32:55 -07:00
Reid Horuff	40123b3805	signed vs unsigned comparison fix	2016-05-11 14:22:43 -07:00
Reid Horuff	c27061dae7	[rocksdb] 2PC double recovery bug fix Summary: 1. prepare() 2. crash 3. recover 4. commit() 5. crash 6. data is lost This is due to the transaction data still only residing in the WAL but because the logs were flushed on the first recovery the data is ignored on the second recovery. We must scan all logs found on recovery and only ignore redundant data at the time of replay. It is not possible to know which logs still contain relevant data at time of recovery. We cannot simply ignore a log because all of the non-2pc data it contains has already been written to L0. The changes made to MemTableInserter are to ensure that prepared sections are still recovered even if all of the non-2pc data in that log has already been flushed to L0. Test Plan: Provided test. Reviewers: sdong Subscribers: andrewkr, hermanlee4, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D57729	2016-05-10 14:06:07 -07:00
Reid Horuff	a657ee9a9c	[rocksdb] Recovery path sequence miscount fix Summary: Consider the following WAL with 4 batch entries prefixed with their sequence at time of memtable insert. [1: BEGIN_PREPARE, PUT, PUT, PUT, PUT, END_PREPARE(a)] [1: BEGIN_PREPARE, PUT, PUT, PUT, PUT, END_PREPARE(b)] [4: COMMIT(a)] [7: COMMIT(b)] The first two batches do not consume any sequence numbers so are both prefixed with seq=1. For 2pc commit, memtable insertion takes place before COMMIT batch is written to WAL. We can see that sequence number consumption takes place between WAL entries giving us the seemingly sparse sequence prefix for WAL entries. This is a valid WAL. Because with 2PC markers one WriteBatch points to another batch containing its inserts a writebatch can consume more or less sequence numbers than the number of sequence consuming entries that it contains. We can see that, given the entries in the WAL, 6 sequence ids were consumed. Yet on recovery the maximum sequence consumed would be 7 + 3 (the number of sequence numbers consumed by COMMIT(b)) So, now upon recovery we must track the actual consumption of sequence numbers. In the provided scenario there will be no sequence gaps, but it is possible to produce a sequence gap. This should not be a problem though. correct? Test Plan: provided test. Reviewers: sdong Subscribers: andrewkr, leveldb, dhruba, hermanlee4 Differential Revision: https://reviews.facebook.net/D57645	2016-05-10 14:06:07 -07:00
Reid Horuff	8a66c85e90	[rocksdb] Two Phase Transaction Summary: Two Phase Commit addition to RocksDB. See wiki: https://github.com/facebook/rocksdb/wiki/Two-Phase-Commit-Implementation Quip: https://fb.quip.com/pxZrAyrx53r3 Depends on: WriteBatch modification: https://reviews.facebook.net/D54093 Memtable Log Referencing and Prepared Batch Recovery: https://reviews.facebook.net/D56919 Test Plan: - SimpleTwoPhaseTransactionTest - PersistentTwoPhaseTransactionTest. - TwoPhaseRollbackTest - TwoPhaseMultiThreadTest - TwoPhaseLogRollingTest - TwoPhaseEmptyWriteTest - TwoPhaseExpirationTest Reviewers: IslamAbdelRahman, sdong Reviewed By: sdong Subscribers: leveldb, hermanlee4, andrewkr, vasilep, dhruba, santoshb Differential Revision: https://reviews.facebook.net/D56925	2016-05-10 14:06:07 -07:00
Li Peng	6d4832a998	Merge pull request #1101 from flyd1005/wip-fix-typo fix typos and remove duplicated words	2016-04-28 02:30:44 -07:00
SherlockNoMad	f11b0df121	Fix AppVeyor build error	2016-03-15 10:57:33 -07:00
agiardullo	790252805d	Add multithreaded transaction test Summary: Refactored db_bench transaction stress tests so that they can be called from unit tests as well. Test Plan: run new unit test as well as db_bench Reviewers: yhchiang, IslamAbdelRahman, sdong Reviewed By: IslamAbdelRahman Subscribers: andrewkr, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D55203	2016-03-11 15:16:52 -08:00
agiardullo	2200295ee1	optimistic transactions support for reinitialization Summary: Extend optimization in D53835 to optimistic transactions for completeness. Test Plan: added test Reviewers: sdong, IslamAbdelRahman, horuff, jkedgar Reviewed By: horuff Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D55059	2016-03-07 19:03:09 -08:00
agiardullo	200080ed72	Improve snapshot handling for Transaction reinitialization Summary: Previously, reusing a transaction (by passing it as an argument to BeginTransaction) would not clear the transaction's snapshot. This is not a clear, well-definited behavior. Test Plan: improved test Reviewers: sdong, IslamAbdelRahman, horuff, jkedgar Reviewed By: jkedgar Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D55053	2016-03-07 13:28:11 -08:00
agiardullo	5ea9aa3c14	TransactionDB:ReinitializeTransaction Summary: Add function to reinitialize a transaction object so that it can be reused. This is an optimization so users can potentially avoid reallocating transaction objects. Test Plan: added tests Reviewers: yhchiang, kradhakrishnan, IslamAbdelRahman, sdong Reviewed By: sdong Subscribers: jkedgar, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D53835	2016-02-29 16:27:32 -08:00
agiardullo	d08d50295c	Fix transaction locking Summary: Broke transaction locking in 4.4 in D52197. Will cherry-pick this change into 4.4 (which hasn't yet been fully released). Repro'd using db_bench. Test Plan: unit tests and db_Bench Reviewers: sdong, yhchiang, kradhakrishnan, ngbronson Reviewed By: ngbronson Subscribers: ngbronson, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D54021	2016-02-16 17:15:05 -08:00
reid horuff	5bcf952a87	Fix WriteImpl empty batch hanging issue Summary: There is an issue in DBImpl::WriteImpl where if an empty writebatch comes in and sync=true then the logs will be marked as being synced yet the sync never actually happens because there is no data in the writebatch. This causes the next incoming batch to hang while waiting for the logs to complete syncing. This fix syncs logs even if the writebatch is empty. Test Plan: DoubleEmptyBatch unit test in transaction_test. Reviewers: yoshinorim, hermanlee4, sdong, ngbronson, anthony Subscribers: leveldb, dhruba Differential Revision: https://reviews.facebook.net/D54057	2016-02-16 12:21:33 -08:00
Igor Canadi	c90d63a23d	can_unlock set but not used Test Plan: I couldn't repro, but I hope this fixes it. See the error here: https://evergreen.mongodb.com/task_log_raw/mongodb_mongo_master_ubuntu1404_rocksdb_compile_6e9fd902d5cb25aef992363efa128640affd5196_16_02_11_04_33_37/0?type=T Reviewers: yhchiang, andrewkr, sdong, anthony Reviewed By: anthony Subscribers: meyering, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D54123	2016-02-16 11:24:40 -08:00
Yueh-Hsuan Chiang	3a67bffaa8	Fix an ASAN error in transaction_test.cc Summary: One test in transaction_test.cc forgets to call SyncPoint::DisableProcessing(). As a result, a program might to access the SyncPoint singleton after it already goes out of scope. This patch fix this error by calling SyncPoint::DisableProcessing(). Test Plan: transaction_test Reviewers: sdong, IslamAbdelRahman, kradhakrishnan, anthony Reviewed By: anthony Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D54033	2016-02-10 12:06:59 -08:00
Baraa Hamodi	21e95811d1	Updated all copyright headers to the new format.	2016-02-09 15:12:00 -08:00
agiardullo	fe93bf9b5d	Transaction::UndoGetForUpdate Summary: MyRocks wants to be able to un-lock a key that was just locked by GetForUpdate(). To do this safely, I am now keeping track of the number of reads(for update) and writes for each key in a transaction. UndoGetForUpdate() will only unlock a key if it hasn't been written and the read count reaches 0. Test Plan: more unit tests Reviewers: igor, rven, yhchiang, spetrunia, sdong Reviewed By: spetrunia, sdong Subscribers: spetrunia, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D47043	2016-02-09 10:46:11 -08:00
reid horuff	6f71d3b68b	Improve perf of Pessimistic Transaction expirations (and optimistic transactions) Summary: copy from task 8196669: 1) Optimistic transactions do not support batching writes from different threads. 2) Pessimistic transactions do not support batching writes if an expiration time is set. In these 2 cases, we currently do not do any write batching in DBImpl::WriteImpl() because there is a WriteCallback that could decide at the last minute to abort the write. But we could support batching write operations with callbacks if we make sure to process the callbacks correctly. To do this, we would first need to modify write_thread.cc to stop preventing writes with callbacks from being batched together. Then we would need to change DBImpl::WriteImpl() to call all WriteCallback's in a batch, only write the batches that succeed, and correctly set the state of each batch's WriteThread::Writer. Test Plan: Added test WriteWithCallbackTest to write_callback_test.cc which creates multiple client threads and verifies that writes are batched and executed properly. Reviewers: hermanlee4, anthony, ngbronson Subscribers: leveldb, dhruba Differential Revision: https://reviews.facebook.net/D52863	2016-02-05 10:44:13 -08:00
Gabriela Jacques da Silva	0c2bd5cb4b	Removing data race from expirable transactions Summary: Doing inline checking of transaction expiration instead of using a callback. Test Plan: To be added Reviewers: anthony Reviewed By: anthony Subscribers: leveldb, dhruba Differential Revision: https://reviews.facebook.net/D53673	2016-02-02 18:37:44 -08:00
agiardullo	45768ade4f	transaction allocation perf improvements Summary: Removed a couple of memory allocations Test Plan: changes covered by existing tests Reviewers: rven, yhchiang, sdong Reviewed By: sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D53523	2016-01-28 17:32:28 -08:00
agiardullo	eff309867e	Do not use timed_mutex in TransactionDB Summary: Stopped using std::timed_mutex as it has known issues in older versiong of gcc. Ran into these problems when testing MongoRocks. Test Plan: unit tests. Manual mongo testing on gcc 4.8. Reviewers: igor, yhchiang, rven, IslamAbdelRahman, kradhakrishnan, sdong Reviewed By: sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D52197	2015-12-18 17:26:02 -08:00
Dmitri Smirnov	b6d19adcf7	Use port size_t formatting	2015-12-15 11:34:22 -08:00
agiardullo	84f98792d6	Transaction::SetWriteOptions() Summary: Add support to change write options after creating a transaction. This is needed for MongoRocks. Test Plan: added test Reviewers: sdong, rven, kradhakrishnan, IslamAbdelRahman, yhchiang Reviewed By: yhchiang Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D51867	2015-12-11 16:08:25 -08:00
agiardullo	3bfd3d39a3	Use SST files for Transaction conflict detection Summary: Currently, transactions can fail even if there is no actual write conflict. This is due to relying on only the memtables to check for write-conflicts. Users have to tune memtable settings to try to avoid this, but it's hard to figure out exactly how to tune these settings. With this diff, TransactionDB will use both memtables and SST files to determine if there are any write conflicts. This relies on the fact that BlockBasedTable stores sequence numbers for all writes that happen after any open snapshot. Also, D50295 is needed to prevent SingleDelete from disappearing writes (the TODOs in this test code will be fixed once the other diff is approved and merged). Note that Optimistic transactions will still rely on tuning memtable settings as we do not want to read from SST while on the write thread. Also, memtable settings can still be used to reduce how often TransactionDB needs to read SST files. Test Plan: unit tests, db bench Reviewers: rven, yhchiang, kradhakrishnan, IslamAbdelRahman, sdong Reviewed By: sdong Subscribers: dhruba, leveldb, yoshinorim Differential Revision: https://reviews.facebook.net/D50475	2015-12-11 12:34:11 -08:00
agiardullo	e5c5f23814	Support marking snapshots for write-conflict checking - Take 2 Summary: D51183 was reverted due to breaking the LITE build. This diff is the same as D51183 but with a fix for the LITE BUILD(D51693) Test Plan: run all unit tests Reviewers: sdong Reviewed By: sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D51711	2015-12-08 16:47:31 -08:00
sdong	1d63c3d610	Revert "Support marking snapshots for write-conflict checking" This reverts commit `ec704aafdc` for it broke RocksDB LITE build.	2015-12-08 09:27:17 -08:00
agiardullo	ec704aafdc	Support marking snapshots for write-conflict checking Summary: D50475 enables using SST files for transaction write-conflict checking. In order for this to work, we need to make sure not to compact out SingleDeletes when there is an earlier transaction snapshot(D50295). If there is a long-held snapshot, this could reduce the benefit of the SingleDelete optimization. This diff allows Transactions to mark snapshots as being used for write-conflict checking. Then, during compaction, we will be able to optimize SingleDeletes better in the future. This diff adds a flag to SnapshotImpl which is used by Transactions. This diff also passes the earliest write-conflict snapshot's sequence number to CompactionIterator. This diff does not actually change Compaction (after this diff is pushed, D50295 will be able to use this information). Test Plan: no behavior change, ran existing tests Reviewers: rven, kradhakrishnan, yhchiang, IslamAbdelRahman, sdong Reviewed By: sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D51183	2015-12-07 19:40:51 -08:00

1 2

72 Commits