rocksdb/db
agiardullo 3bfd3d39a3 Use SST files for Transaction conflict detection
Summary:
Currently, transactions can fail even when there is no actual write conflict, because conflict checking relies only on the memtables. Users have to tune memtable settings to try to avoid this, but it is hard to work out exactly how to tune them.

With this diff, TransactionDB will use both memtables and SST files to determine whether there are any write conflicts. This relies on the fact that BlockBasedTable stores sequence numbers for all writes that happen after any open snapshot. Also, D50295 is needed to prevent SingleDelete from making writes disappear (the TODOs in this test code will be fixed once the other diff is approved and merged).

Note that Optimistic transactions will still rely on tuning memtable settings, as we do not want to read from SST files while on the write thread. Also, memtable settings can still be used to reduce how often TransactionDB needs to read SST files.
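
The idea described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the code in this diff: ToyDB, Layer, earliest_memtable_seq, CheckMemtablesOnly and CheckWithSst are hypothetical names, and each storage layer is reduced to a map from key to the sequence number of its latest write. A check that looks only at the memtables must give up when the snapshot is older than the history the memtables still hold; falling back to the SST files lets it reach a definite answer.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>
#include <string>

using SequenceNumber = std::uint64_t;

// Hypothetical stand-in for one storage layer: key -> sequence number of the
// latest write to that key within the layer.
using Layer = std::map<std::string, SequenceNumber>;

struct ToyDB {
  Layer memtables;  // recent writes still held in memory
  Layer sst_files;  // older writes already flushed
  // Assumption of this model: every write with a sequence number >= this
  // value is still present in `memtables`.
  SequenceNumber earliest_memtable_seq = 0;
};

enum class CheckResult { kNoConflict, kConflict, kUnknown };

std::optional<SequenceNumber> LatestSeq(const Layer& layer,
                                        const std::string& key) {
  auto it = layer.find(key);
  if (it == layer.end()) return std::nullopt;
  return it->second;
}

// Memtable-only check. Returns kUnknown when the snapshot predates the
// memtable history, which is the case where a transaction would have to fail
// even though no real conflict may exist.
CheckResult CheckMemtablesOnly(const ToyDB& db, const std::string& key,
                               SequenceNumber snapshot_seq) {
  if (auto seq = LatestSeq(db.memtables, key)) {
    return *seq > snapshot_seq ? CheckResult::kConflict
                               : CheckResult::kNoConflict;
  }
  // Key absent from the memtables: a newer write may have been flushed
  // already (slightly conservative comparison).
  if (snapshot_seq < db.earliest_memtable_seq) return CheckResult::kUnknown;
  return CheckResult::kNoConflict;
}

// Same check, but consult the SST layer whenever the memtables cannot decide.
CheckResult CheckWithSst(const ToyDB& db, const std::string& key,
                         SequenceNumber snapshot_seq) {
  CheckResult r = CheckMemtablesOnly(db, key, snapshot_seq);
  if (r != CheckResult::kUnknown) return r;
  if (auto seq = LatestSeq(db.sst_files, key)) {
    return *seq > snapshot_seq ? CheckResult::kConflict
                               : CheckResult::kNoConflict;
  }
  return CheckResult::kNoConflict;  // key was never written
}

// Small usage example for the two checks.
void RunExample() {
  ToyDB db;
  db.memtables = {{"a", 12}};
  db.sst_files = {{"b", 5}, {"c", 8}};
  db.earliest_memtable_seq = 10;
  // Snapshot at sequence 7 is older than the memtable history.
  assert(CheckMemtablesOnly(db, "b", 7) == CheckResult::kUnknown);
  assert(CheckWithSst(db, "b", 7) == CheckResult::kNoConflict);
  assert(CheckWithSst(db, "c", 7) == CheckResult::kConflict);
}
```

In this model, keeping more history in the memtables (a smaller earliest_memtable_seq) makes the kUnknown case rarer, which mirrors the note that memtable settings can still reduce how often SST files need to be read.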

Test Plan: unit tests, db bench

Reviewers: rven, yhchiang, kradhakrishnan, IslamAbdelRahman, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb, yoshinorim

Differential Revision: https://reviews.facebook.net/D50475
2015-12-11 12:34:11 -08:00
builder.cc Use SST files for Transaction conflict detection 2015-12-11 12:34:11 -08:00
builder.h Use SST files for Transaction conflict detection 2015-12-11 12:34:11 -08:00
c_test.c Deprecate CompactionFilterV2 2015-07-17 18:59:11 +02:00
c.cc Move skip_table_builder_flush to BlockBasedTableOption 2015-10-30 18:33:01 -07:00
column_family_test.cc Fixed the valgrind error in ColumnFamilyTest::CreateAndDropRace 2015-12-10 11:53:53 -08:00
column_family.cc Deprecate options.soft_rate_limit and add options.soft_pending_compaction_bytes_limit 2015-12-09 18:22:45 -08:00
column_family.h Total SST files size DB Property 2015-08-20 11:47:19 -07:00
compact_files_test.cc Make sure that CompactFiles does not run two parallel Level 0 compactions 2015-11-13 12:01:00 -08:00
compacted_db_impl.cc Remove db_impl_readonly dependency on utilities 2015-07-14 11:32:54 -07:00
compacted_db_impl.h Remove db_impl_readonly dependency on utilities 2015-07-14 11:32:54 -07:00
compaction_iterator_test.cc Support marking snapshots for write-conflict checking - Take 2 2015-12-08 16:47:31 -08:00
compaction_iterator.cc Use SST files for Transaction conflict detection 2015-12-11 12:34:11 -08:00
compaction_iterator.h Change SingleDelete to support conflict checking 2015-12-10 11:35:38 -08:00
compaction_job_stats_test.cc No need to #ifdef test only code on windows 2015-10-22 15:15:37 -07:00
compaction_job_test.cc Change SingleDelete to support conflict checking 2015-12-10 11:35:38 -08:00
compaction_job.cc Support marking snapshots for write-conflict checking - Take 2 2015-12-08 16:47:31 -08:00
compaction_job.h fix typos in comments 2015-12-11 01:54:48 +09:00
compaction_picker_test.cc Fix condition for bottommost level 2015-10-05 17:40:18 -07:00
compaction_picker.cc UniversalCompactionPicker::PickCompaction(): avoid to form compactions if there is no file 2015-11-16 10:32:45 -08:00
compaction_picker.h Make sure that CompactFiles does not run two parallel Level 0 compactions 2015-11-13 12:01:00 -08:00
compaction.cc Enable C4267 warning 2015-11-24 16:33:09 +03:00
compaction.h Passing table properties to compaction callback 2015-10-09 18:10:55 -07:00
comparator_db_test.cc Moving memtable related files from util to a new directory memtable 2015-10-16 14:10:33 -07:00
convenience.cc move convenience.h out of utilities 2015-07-15 14:51:51 -07:00
corruption_test.cc Switch to thread-local random for skiplist 2015-11-09 19:25:22 -08:00
cuckoo_table_db_test.cc Block cuckoo table tests in ROCKSDB_LITE 2015-07-20 10:50:46 -07:00
db_bench.cc Deprecate options.soft_rate_limit and add options.soft_pending_compaction_bytes_limit 2015-12-09 18:22:45 -08:00
db_compaction_filter_test.cc Have a way for compaction filter to ignore snapshots 2015-11-20 15:57:26 -08:00
db_compaction_test.cc A new compaction picking priority that optimizes for write amplification for random updates. 2015-12-09 18:13:03 -08:00
db_dynamic_level_test.cc No need to #ifdef test only code on windows 2015-10-22 15:15:37 -07:00
db_filesnapshot.cc Add wal files to Checkpoint for multiple column families. 2015-06-19 16:08:31 -07:00
db_impl_debug.cc Add Memory Insight support to utilities 2015-11-03 17:52:17 -08:00
db_impl_experimental.cc Clean up InstallSuperVersion 2015-06-17 12:37:59 -07:00
db_impl_readonly.cc Remove db_impl_readonly dependency on utilities 2015-07-14 11:32:54 -07:00
db_impl_readonly.h Override DBImplReadOnly::SyncWAL() to return NotSupported. Previously, calling it caused program abort. 2015-09-25 21:25:30 -07:00
db_impl.cc Use SST files for Transaction conflict detection 2015-12-11 12:34:11 -08:00
db_impl.h Use SST files for Transaction conflict detection 2015-12-11 12:34:11 -08:00
db_inplace_update_test.cc Clean up dependency: Move db_test_util.* to db directory 2015-10-12 13:05:42 -07:00
db_iter_test.cc No need to #ifdef test only code on windows 2015-10-22 15:15:37 -07:00
db_iter.cc Revert previous behavior of internal_key_skipped_count 2015-11-30 21:55:05 -08:00
db_iter.h Prefix-based iterating only shows keys in prefix 2015-11-05 13:24:05 -08:00
db_log_iter_test.cc No need to #ifdef test only code on windows 2015-10-22 15:15:37 -07:00
db_table_properties_test.cc Avoid empty ranges vector with subsequent zero element access 2015-12-02 14:50:33 -08:00
db_tailing_iter_test.cc Reuse file iterators in tailing iterator when memtable is flushed 2015-11-13 15:50:59 -08:00
db_test_util.cc Merge pull request #853 from Vaisman/enable_C4267_warning 2015-12-08 17:59:24 -08:00
db_test_util.h Merge pull request #853 from Vaisman/enable_C4267_warning 2015-12-08 17:59:24 -08:00
db_test.cc fix typos in comments 2015-12-11 01:54:48 +09:00
db_universal_compaction_test.cc Fix valgrind failure in IncreaseUniversalCompactionNumLevels 2015-12-08 11:45:29 -08:00
db_wal_test.cc No need to #ifdef test only code on windows 2015-10-22 15:15:37 -07:00
dbformat_test.cc Avoid manipulating const char* arrays 2015-07-14 00:21:41 -07:00
dbformat.cc Support for SingleDelete() 2015-09-17 11:42:56 -07:00
dbformat.h key_ cannot become nullptr, so no check is needed for that 2015-09-18 20:15:20 +02:00
deletefile_test.cc Improved FileExists API 2015-07-20 17:20:40 -07:00
event_helpers.cc Passing table properties to compaction callback 2015-10-09 18:10:55 -07:00
event_helpers.h Add EventListener::OnTableFileDeletion() 2015-06-03 19:57:01 -07:00
experimental.cc Implement DB::PromoteL0 method 2015-04-23 12:10:36 -07:00
fault_injection_test.cc No need to #ifdef test only code on windows 2015-10-22 15:15:37 -07:00
file_indexer_test.cc Fix possible SIGSEGV in CompactRange (github issue #596) 2015-04-29 10:52:31 -07:00
file_indexer.cc Fix possible SIGSEGV in CompactRange (github issue #596) 2015-04-29 10:52:31 -07:00
file_indexer.h fix typos in comments 2015-12-11 01:54:48 +09:00
filename_test.cc rocksdb: switch to gtest 2015-03-17 14:08:00 -07:00
filename.cc Enable RocksDB to persist Options file. 2015-11-10 22:58:01 -08:00
filename.h Enable RocksDB to persist Options file. 2015-11-10 22:58:01 -08:00
flush_job_test.cc Use SST files for Transaction conflict detection 2015-12-11 12:34:11 -08:00
flush_job.cc Use SST files for Transaction conflict detection 2015-12-11 12:34:11 -08:00
flush_job.h Use SST files for Transaction conflict detection 2015-12-11 12:34:11 -08:00
flush_scheduler.cc Don't return (or dereference) dangling pointer 2014-10-02 14:33:16 -07:00
flush_scheduler.h Fix data race #1 2015-01-26 11:48:07 -08:00
forward_iterator_bench.cc Block forward_iterator_bench under MAC and Windows 2015-11-17 11:51:37 -08:00
forward_iterator.cc Fix forward_iterator allocation of vector. 2015-11-17 10:27:51 -08:00
forward_iterator.h Reuse file iterators in tailing iterator when memtable is flushed 2015-11-13 15:50:59 -08:00
inlineskiplist_test.cc InlineSkipList - part 2/3 2015-11-24 14:30:56 -08:00
inlineskiplist.h InlineSkipList part 3/3 - new skiplist type that colocates key and node 2015-11-24 15:16:02 -08:00
internal_stats.cc Deprecate options.soft_rate_limit and add options.soft_pending_compaction_bytes_limit 2015-12-09 18:22:45 -08:00
internal_stats.h Deprecate options.soft_rate_limit and add options.soft_pending_compaction_bytes_limit 2015-12-09 18:22:45 -08:00
job_context.h fixed leaking log::Writers 2015-07-07 12:10:10 -07:00
listener_test.cc Lint everything 2015-11-16 12:56:21 -08:00
log_format.h log_{reader,write}: recyclable record format 2015-10-19 17:24:05 -04:00
log_reader.cc log_{reader,write}: recyclable record format 2015-10-19 17:24:05 -04:00
log_reader.h log_{reader,write}: recyclable record format 2015-10-19 17:24:05 -04:00
log_test.cc log_{reader,write}: recyclable record format 2015-10-19 17:24:05 -04:00
log_writer.cc Enable C4267 warning 2015-11-24 16:33:09 +03:00
log_writer.h Enable C4267 warning 2015-11-24 16:33:09 +03:00
managed_iterator.cc Windows Port from Microsoft 2015-07-01 16:13:56 -07:00
managed_iterator.h Fixed xfunc related compile errors in ROCKSDB_LITE 2015-04-09 21:05:18 -07:00
manual_compaction_test.cc Move manual_compaction_test.cc from util to db 2015-10-14 11:06:27 -07:00
memtable_allocator.cc Enforce write buffer memory limit across column families 2014-12-02 12:09:20 -08:00
memtable_allocator.h Enforce write buffer memory limit across column families 2014-12-02 12:09:20 -08:00
memtable_list_test.cc Removing duplicate code 2015-08-05 07:33:27 -07:00
memtable_list.cc Seperate InternalIterator from Iterator 2015-10-13 15:32:13 -07:00
memtable_list.h Seperate InternalIterator from Iterator 2015-10-13 15:32:13 -07:00
memtable.cc Seperate InternalIterator from Iterator 2015-10-13 15:32:13 -07:00
memtable.h Seperate InternalIterator from Iterator 2015-10-13 15:32:13 -07:00
memtablerep_bench.cc Merge pull request #811 from OverlordQ/unused-variable-warning 2015-11-02 12:44:27 -08:00
merge_context.h API to fetch from both a WriteBatchWithIndex and the db 2015-05-11 14:51:51 -07:00
merge_helper_test.cc Compaction filter on merge operands 2015-10-07 09:30:03 -07:00
merge_helper.cc Seperate InternalIterator from Iterator 2015-10-13 15:32:13 -07:00
merge_helper.h Seperate InternalIterator from Iterator 2015-10-13 15:32:13 -07:00
merge_operator.cc Call merge operators with empty values 2015-06-26 11:35:46 -07:00
merge_test.cc Make merge_test runnable in ROCKSDB_LITE 2015-07-20 11:17:52 -07:00
options_file_test.cc Fixed build failure of RocksDBLite test on options_file_test.cc 2015-11-10 23:23:36 -08:00
perf_context_test.cc Make perf_context.db_mutex_lock_nanos and db_condition_wait_nanos only measures DB Mutex 2015-10-13 10:41:48 -07:00
plain_table_db_test.cc PlainTableReader to support non-mmap mode 2015-09-23 11:41:07 -07:00
prefix_test.cc Prefix-based iterating only shows keys in prefix 2015-11-05 13:24:05 -08:00
repair.cc Use SST files for Transaction conflict detection 2015-12-11 12:34:11 -08:00
skiplist_test.cc rocksdb: switch to gtest 2015-03-17 14:08:00 -07:00
skiplist.h Switch to thread-local random for skiplist 2015-11-09 19:25:22 -08:00
slice.cc Create an abstract interface for write batches 2015-03-17 19:23:08 -07:00
snapshot_impl.cc Support marking snapshots for write-conflict checking - Take 2 2015-12-08 16:47:31 -08:00
snapshot_impl.h Use SST files for Transaction conflict detection 2015-12-11 12:34:11 -08:00
table_cache.cc Use SST files for Transaction conflict detection 2015-12-11 12:34:11 -08:00
table_cache.h Seperate InternalIterator from Iterator 2015-10-13 15:32:13 -07:00
table_properties_collector_test.cc Pass column family ID to table property collector 2015-10-09 14:36:51 -07:00
table_properties_collector.cc Support for SingleDelete() 2015-09-17 11:42:56 -07:00
table_properties_collector.h Pass column family ID to table property collector 2015-10-09 14:36:51 -07:00
transaction_log_impl.cc log_reader: pass log_number and optional info_log to ctor 2015-10-18 21:24:32 -04:00
transaction_log_impl.h Move rate_limiter, write buffering, most perf context instrumentation and most random kill out of Env 2015-07-17 16:58:18 -07:00
version_builder_test.cc Add a mode to always pick the oldest file to compact for each level 2015-09-21 17:21:59 -07:00
version_builder.cc EstimatedNumKeys Counter Inaccurate 2015-12-07 10:51:08 -08:00
version_builder.h Log more information for the add file with overlapping range failure 2015-10-19 17:31:13 -07:00
version_edit_test.cc New Manifest format to allow customized fields in NewFile. 2015-10-08 15:51:45 -07:00
version_edit.cc New Manifest format to allow customized fields in NewFile. 2015-10-08 15:51:45 -07:00
version_edit.h New Manifest format to allow customized fields in NewFile. 2015-10-08 15:51:45 -07:00
version_set_test.cc Report live data size estimate 2015-07-21 21:33:20 -07:00
version_set.cc Use SST files for Transaction conflict detection 2015-12-11 12:34:11 -08:00
version_set.h Use SST files for Transaction conflict detection 2015-12-11 12:34:11 -08:00
wal_manager_test.cc log_writer: pass log number and whether recycling is enabled to ctor 2015-10-18 21:24:32 -04:00
wal_manager.cc log_reader: pass log_number and optional info_log to ctor 2015-10-18 21:24:32 -04:00
wal_manager.h Fix -Wnon-virtual-dtor errors 2014-11-10 17:39:38 -05:00
write_batch_base.cc Support for SingleDelete() 2015-09-17 11:42:56 -07:00
write_batch_internal.h Don't merge WriteBatch-es if WAL is disabled 2015-11-12 10:50:38 -08:00
write_batch_test.cc track WriteBatch contents 2015-11-10 16:56:06 -08:00
write_batch.cc Don't merge WriteBatch-es if WAL is disabled 2015-11-12 10:50:38 -08:00
write_callback_test.cc Fix compile for write_callback_test in ROCKSDB_LITE 2015-07-20 10:54:15 -07:00
write_callback.h Optimistic Transactions 2015-05-29 14:36:35 -07:00
write_controller_test.cc Slow down writes by bytes written 2015-06-11 20:42:18 -07:00
write_controller.cc fix typos in comments 2015-12-11 01:54:48 +09:00
write_controller.h Slow down writes by bytes written 2015-06-11 20:42:18 -07:00
write_thread.cc Resubmit the fix for a race condition in persisting options 2015-12-08 17:01:02 -08:00
write_thread.h reduce db mutex contention for write batch groups 2015-08-14 10:55:43 -07:00
writebuffer.h Enforce write buffer memory limit across column families 2014-12-02 12:09:20 -08:00