rocksdb/db
Andrew Kryczka fea2b1dfb2 Copy Get() result when file reads use mmap
Summary:
For iterator reads, a `SuperVersion` is pinned to preserve a snapshot of SST files, and `Block`s are pinned to allow `key()` and `value()` to return pointers directly into a RocksDB memory region. This works for both non-mmap reads, where the block owns the memory region, and mmap reads, where the file owns the memory region.

For point reads with `PinnableSlice`, only the `Block` object is pinned. This works for non-mmap reads because the block owns the memory region, so even if the file is deleted after compaction, the memory region survives. However, for mmap reads, file deletion causes the memory region to which the `PinnableSlice` refers to be unmapped.   The result is usually a segfault upon accessing the `PinnableSlice`, although sometimes it returned wrong results (I repro'd this a bunch of times with `db_stress`).

This PR copies the value into the `PinnableSlice` when it comes from mmap'd memory. We can tell whether the `Block` owns its memory using `Block::cachable()`, which is unset when reads do not use the provided buffer as is the case with mmap file reads. When that is false we ensure the result of `Get()` is copied.

This feels like a short-term solution as ideally we'd have the `PinnableSlice` pin the mmap'd memory so we can do zero-copy reads. It seemed hard so I chose this approach to fix correctness in the meantime.
Closes https://github.com/facebook/rocksdb/pull/3881

Differential Revision: D8076288

Pulled By: ajkr

fbshipit-source-id: 31d78ec010198723522323dbc6ea325122a46b08
2018-06-01 16:57:58 -07:00
..
builder.cc Move prefix_extractor to MutableCFOptions 2018-05-21 14:43:11 -07:00
builder.h Move prefix_extractor to MutableCFOptions 2018-05-21 14:43:11 -07:00
c_test.c comment unused parameters to turn on -Wunused-parameter flag 2018-04-12 17:59:16 -07:00
c.cc add c api rocksdb_sstfilewriter_file_size 2018-06-01 09:43:59 -07:00
column_family_test.cc Fix compilation error when OPT="-DROCKSDB_LITE". 2018-05-29 12:28:59 -07:00
column_family.cc Skip deleted WALs during recovery 2018-05-03 15:43:09 -07:00
column_family.h Add max_subcompactions as a compaction option 2018-04-27 11:57:39 -07:00
compact_files_test.cc fix memory leak in two_level_iterator 2018-04-15 17:26:26 -07:00
compacted_db_impl.cc Move prefix_extractor to MutableCFOptions 2018-05-21 14:43:11 -07:00
compacted_db_impl.h Comment out unused variables 2018-03-05 13:13:41 -08:00
compaction_iteration_stats.h add counter for deletion dropping optimization 2017-08-19 14:10:08 -07:00
compaction_iterator_test.cc Comment out unused variables 2018-03-05 13:13:41 -08:00
compaction_iterator.cc Blob DB: Improve FIFO eviction 2018-03-06 11:57:42 -08:00
compaction_iterator.h Comment out unused variables 2018-03-05 13:13:41 -08:00
compaction_job_stats_test.cc fix memory leak in two_level_iterator 2018-04-15 17:26:26 -07:00
compaction_job_test.cc Add max_subcompactions as a compaction option 2018-04-27 11:57:39 -07:00
compaction_job.cc Move prefix_extractor to MutableCFOptions 2018-05-21 14:43:11 -07:00
compaction_job.h MaxFileSizeForLevel: adjust max_file_size for dynamic level compaction 2018-05-03 16:42:13 -07:00
compaction_picker_test.cc Delete triggered compaction for universal style 2018-05-29 15:44:34 -07:00
compaction_picker_universal.cc Delete triggered compaction for universal style 2018-05-29 15:44:34 -07:00
compaction_picker_universal.h Delete triggered compaction for universal style 2018-05-29 15:44:34 -07:00
compaction_picker.cc Delete triggered compaction for universal style 2018-05-29 15:44:34 -07:00
compaction_picker.h Delete triggered compaction for universal style 2018-05-29 15:44:34 -07:00
compaction.cc MaxFileSizeForLevel: adjust max_file_size for dynamic level compaction 2018-05-03 16:42:13 -07:00
compaction.h MaxFileSizeForLevel: adjust max_file_size for dynamic level compaction 2018-05-03 16:42:13 -07:00
comparator_db_test.cc Implement key shortening functions in ReverseBytewiseComparator 2018-05-17 18:27:16 -07:00
convenience.cc Move prefix_extractor to MutableCFOptions 2018-05-21 14:43:11 -07:00
corruption_test.cc Change and clarify the relationship between Valid(), status() and Seek*() for all iterators. Also fix some bugs 2018-05-17 02:56:56 -07:00
cuckoo_table_db_test.cc fix memory leak in two_level_iterator 2018-04-15 17:26:26 -07:00
db_basic_test.cc Disallow compactions if there isn't enough free space 2018-03-06 16:27:54 -08:00
db_blob_index_test.cc fix lite build 2017-10-17 08:57:09 -07:00
db_block_cache_test.cc LRUCache midpoint insertion 2018-05-24 15:57:33 -07:00
db_bloom_filter_test.cc comment unused parameters to turn on -Wunused-parameter flag 2018-04-12 17:59:16 -07:00
db_compaction_filter_test.cc Remove tests from ROCKSDB_VALGRIND_RUN 2018-05-30 16:15:16 -07:00
db_compaction_test.cc Remove tests from ROCKSDB_VALGRIND_RUN 2018-05-30 16:15:16 -07:00
db_dynamic_level_test.cc fix memory leak in two_level_iterator 2018-04-15 17:26:26 -07:00
db_encryption_test.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
db_filesnapshot.cc fix behavior does not match name for "IsFileDeletionsEnabled" 2018-03-21 22:13:34 -07:00
db_flush_test.cc Skip deleted WALs during recovery 2018-05-03 15:43:09 -07:00
db_impl_compaction_flush.cc Bottommost level-based compactions in bottom-pri pool 2018-05-14 14:57:15 -07:00
db_impl_debug.cc Pass manual_wal_flush also to the first wal file 2018-05-14 10:57:56 -07:00
db_impl_experimental.cc Inform caller when rocksdb is stalling writes 2017-10-05 18:11:43 -07:00
db_impl_files.cc DBImpl::FindObsoleteFiles() not to call GetChildren() on the same path 2018-05-31 12:58:33 -07:00
db_impl_open.cc Pass manual_wal_flush also to the first wal file 2018-05-14 10:57:56 -07:00
db_impl_readonly.cc Move prefix_extractor to MutableCFOptions 2018-05-21 14:43:11 -07:00
db_impl_readonly.h allowing CompactFiles to return new file names 2018-03-15 11:58:12 -07:00
db_impl_write.cc fix deadlock with enable_pipelined_write=true and max_successive_merges > 0 2018-05-31 11:13:14 -07:00
db_impl.cc Move prefix_extractor to MutableCFOptions 2018-05-21 14:43:11 -07:00
db_impl.h Pass manual_wal_flush also to the first wal file 2018-05-14 10:57:56 -07:00
db_info_dumper.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
db_info_dumper.h Change RocksDB License 2017-07-15 16:11:23 -07:00
db_inplace_update_test.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
db_io_failure_test.cc Fix LITE unit tests 2017-07-26 21:11:47 -07:00
db_iter_stress_test.cc Move prefix_extractor to MutableCFOptions 2018-05-21 14:43:11 -07:00
db_iter_test.cc Move prefix_extractor to MutableCFOptions 2018-05-21 14:43:11 -07:00
db_iter.cc Move prefix_extractor to MutableCFOptions 2018-05-21 14:43:11 -07:00
db_iter.h Move prefix_extractor to MutableCFOptions 2018-05-21 14:43:11 -07:00
db_iterator_test.cc Move prefix_extractor to MutableCFOptions 2018-05-21 14:43:11 -07:00
db_log_iter_test.cc fix memory leak in two_level_iterator 2018-04-15 17:26:26 -07:00
db_memtable_test.cc Comment out unused variables 2018-03-05 13:13:41 -08:00
db_merge_operator_test.cc Fix TSAN timeout in MergeOperatorPinningTest.Randomized/x test 2018-03-02 16:27:21 -08:00
db_options_test.cc PersistRocksDBOptions() to use WritableFileWriter 2018-05-21 16:42:22 -07:00
db_properties_test.cc Exclude seq from index keys 2018-05-25 18:42:43 -07:00
db_range_del_test.cc Add max_subcompactions as a compaction option 2018-04-27 11:57:39 -07:00
db_sst_test.cc comment unused parameters to turn on -Wunused-parameter flag 2018-04-12 17:59:16 -07:00
db_statistics_test.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
db_table_properties_test.cc fix deletion-triggered compaction in table builder 2017-09-28 18:17:30 -07:00
db_tailing_iter_test.cc fix memory leak in two_level_iterator 2018-04-15 17:26:26 -07:00
db_test2.cc Copy Get() result when file reads use mmap 2018-06-01 16:57:58 -07:00
db_test_util.cc Support for Column family specific paths. 2018-04-05 19:58:20 -07:00
db_test_util.h Fix the bug of some test scenarios being put after kEnd 2018-05-31 19:28:00 -07:00
db_test.cc Remove tests from ROCKSDB_VALGRIND_RUN 2018-05-30 16:15:16 -07:00
db_universal_compaction_test.cc Remove tests from ROCKSDB_VALGRIND_RUN 2018-05-30 16:15:16 -07:00
db_wal_test.cc Skip deleted WALs during recovery 2018-05-03 15:43:09 -07:00
db_write_test.cc Suppress tsan lock-order-inversion on FlushWAL 2018-05-14 21:13:35 -07:00
dbformat_test.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
dbformat.cc add kEntryRangeDeletion 2018-04-13 11:27:17 -07:00
dbformat.h Imporve perf of random read and insert compare by suggesting inlining to the compiler 2018-03-23 13:26:55 -07:00
deletefile_test.cc fix memory leak in two_level_iterator 2018-04-15 17:26:26 -07:00
event_helpers.cc comment unused parameters to turn on -Wunused-parameter flag 2018-04-12 17:59:16 -07:00
event_helpers.h Change RocksDB License 2017-07-15 16:11:23 -07:00
experimental.cc comment unused parameters to turn on -Wunused-parameter flag 2018-04-12 17:59:16 -07:00
external_sst_file_basic_test.cc optimize file ingestion checks for range deletion overlap 2017-11-28 11:27:02 -08:00
external_sst_file_ingestion_job.cc Move prefix_extractor to MutableCFOptions 2018-05-21 14:43:11 -07:00
external_sst_file_ingestion_job.h Move prefix_extractor to MutableCFOptions 2018-05-21 14:43:11 -07:00
external_sst_file_test.cc Skip deleted WALs during recovery 2018-05-03 15:43:09 -07:00
fault_injection_test.cc Split FaultInjectionTest.FaultTest to avoid timeout 2018-05-07 12:29:58 -07:00
file_indexer_test.cc Comment out unused variables 2018-03-05 13:13:41 -08:00
file_indexer.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
file_indexer.h Change RocksDB License 2017-07-15 16:11:23 -07:00
filename_test.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
flush_job_test.cc Skip deleted WALs during recovery 2018-05-03 15:43:09 -07:00
flush_job.cc Skip deleted WALs during recovery 2018-05-03 15:43:09 -07:00
flush_job.h Skip deleted WALs during recovery 2018-05-03 15:43:09 -07:00
flush_scheduler.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
flush_scheduler.h Change RocksDB License 2017-07-15 16:11:23 -07:00
forward_iterator_bench.cc fix gflags namespace 2017-12-01 10:42:05 -08:00
forward_iterator.cc Move prefix_extractor to MutableCFOptions 2018-05-21 14:43:11 -07:00
forward_iterator.h Comment out unused variables 2018-03-05 13:13:41 -08:00
internal_stats.cc Fix VersionStorageInfo::EstimateLiveDataSize seg fault 2018-05-28 11:27:08 -07:00
internal_stats.h Skip deleted WALs during recovery 2018-05-03 15:43:09 -07:00
job_context.h Introduce and use the option to disable stall notifications structures 2018-05-09 10:13:53 -07:00
listener_test.cc fix some text in comments. 2018-04-10 15:59:24 -07:00
log_format.h Change RocksDB License 2017-07-15 16:11:23 -07:00
log_reader.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
log_reader.h Suppress lint in old files 2018-01-29 12:56:42 -08:00
log_test.cc Use nullptr instead of NULL / 0 more consistently. 2018-03-07 12:42:12 -08:00
log_writer.cc Pass manual_wal_flush also to the first wal file 2018-05-14 10:57:56 -07:00
log_writer.h Pass manual_wal_flush also to the first wal file 2018-05-14 10:57:56 -07:00
logs_with_prep_tracker.cc Skip deleted WALs during recovery 2018-05-03 15:43:09 -07:00
logs_with_prep_tracker.h Skip deleted WALs during recovery 2018-05-03 15:43:09 -07:00
malloc_stats.cc Disallow compactions if there isn't enough free space 2018-03-06 16:27:54 -08:00
malloc_stats.h Change RocksDB License 2017-07-15 16:11:23 -07:00
managed_iterator.cc Change and clarify the relationship between Valid(), status() and Seek*() for all iterators. Also fix some bugs 2018-05-17 02:56:56 -07:00
managed_iterator.h Change and clarify the relationship between Valid(), status() and Seek*() for all iterators. Also fix some bugs 2018-05-17 02:56:56 -07:00
manual_compaction_test.cc Speedup ManualCompactionTest.Test 2018-05-03 16:13:09 -07:00
memtable_list_test.cc Skip deleted WALs during recovery 2018-05-03 15:43:09 -07:00
memtable_list.cc Skip deleted WALs during recovery 2018-05-03 15:43:09 -07:00
memtable_list.h Skip deleted WALs during recovery 2018-05-03 15:43:09 -07:00
memtable.cc Move prefix_extractor to MutableCFOptions 2018-05-21 14:43:11 -07:00
memtable.h InlineSkiplist: don't decode keys unnecessarily during comparisons 2018-03-23 12:14:30 -07:00
merge_context.h Change RocksDB License 2017-07-15 16:11:23 -07:00
merge_helper_test.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
merge_helper.cc WritePrepared Txn: Support merge operator 2018-02-09 14:57:54 -08:00
merge_helper.h WritePrepared Txn: Support merge operator 2018-02-09 14:57:54 -08:00
merge_operator.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
merge_test.cc Comment out unused variables 2018-03-05 13:13:41 -08:00
obsolete_files_test.cc fix memory leak in two_level_iterator 2018-04-15 17:26:26 -07:00
options_file_test.cc fix memory leak in two_level_iterator 2018-04-15 17:26:26 -07:00
perf_context_test.cc Improve write time breakdown stats 2018-04-23 17:58:54 -07:00
pinned_iterators_manager.h Change RocksDB License 2017-07-15 16:11:23 -07:00
plain_table_db_test.cc Move prefix_extractor to MutableCFOptions 2018-05-21 14:43:11 -07:00
pre_release_callback.h Fix pre_release callback argument list. 2018-04-05 11:12:16 -07:00
prefix_test.cc fix memory leak in two_level_iterator 2018-04-15 17:26:26 -07:00
range_del_aggregator_test.cc optimize file ingestion checks for range deletion overlap 2017-11-28 11:27:02 -08:00
range_del_aggregator.cc Assert keys/values pinned by range deletion meta-block iterators 2018-05-21 09:57:00 -07:00
range_del_aggregator.h Recommit "Avoid adding tombstones of the same file to RangeDelAggregator multiple times" 2018-05-04 16:45:15 -07:00
read_callback.h write-prepared txn: call IsInSnapshot 2017-09-11 09:14:48 -07:00
repair_test.cc fix memory leak in two_level_iterator 2018-04-15 17:26:26 -07:00
repair.cc Move prefix_extractor to MutableCFOptions 2018-05-21 14:43:11 -07:00
snapshot_checker.h Comment out unused variables 2018-03-05 13:13:41 -08:00
snapshot_impl.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
snapshot_impl.h WritePrepared Txn: smallest_prepare optimization 2018-04-02 20:27:41 -07:00
table_cache.cc Move prefix_extractor to MutableCFOptions 2018-05-21 14:43:11 -07:00
table_cache.h Move prefix_extractor to MutableCFOptions 2018-05-21 14:43:11 -07:00
table_properties_collector_test.cc Move prefix_extractor to MutableCFOptions 2018-05-21 14:43:11 -07:00
table_properties_collector.cc Comment out unused variables 2018-03-05 13:13:41 -08:00
table_properties_collector.h Comment out unused variables 2018-03-05 13:13:41 -08:00
transaction_log_impl.cc Comment out unused variables 2018-03-05 13:13:41 -08:00
transaction_log_impl.h WritePrepared Txn: Refactor conf params 2017-11-10 17:28:12 -08:00
version_builder_test.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
version_builder.cc Move prefix_extractor to MutableCFOptions 2018-05-21 14:43:11 -07:00
version_builder.h Move prefix_extractor to MutableCFOptions 2018-05-21 14:43:11 -07:00
version_edit_test.cc Skip deleted WALs during recovery 2018-05-03 15:43:09 -07:00
version_edit.cc Specify the underlying type of enums. 2018-05-23 16:12:59 -07:00
version_edit.h Skip deleted WALs during recovery 2018-05-03 15:43:09 -07:00
version_set_test.cc Comment out unused variables 2018-03-05 13:13:41 -08:00
version_set.cc Move prefix_extractor to MutableCFOptions 2018-05-21 14:43:11 -07:00
version_set.h Move prefix_extractor to MutableCFOptions 2018-05-21 14:43:11 -07:00
wal_manager_test.cc fix memory leak in two_level_iterator 2018-04-15 17:26:26 -07:00
wal_manager.cc Fix memleak when DB::DeleteFile() 2018-01-11 18:57:33 -08:00
wal_manager.h Fix memleak when DB::DeleteFile() 2018-01-11 18:57:33 -08:00
write_batch_base.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
write_batch_internal.h Fix some typos in comments and docs. 2018-03-08 10:27:25 -08:00
write_batch_test.cc Comment out unused variables 2018-03-05 13:13:41 -08:00
write_batch.cc WritePrepared Txn: enable TryAgain for duplicates at the end of the batch 2018-04-20 15:28:19 -07:00
write_callback_test.cc fix memory leak in two_level_iterator 2018-04-15 17:26:26 -07:00
write_callback.h Change RocksDB License 2017-07-15 16:11:23 -07:00
write_controller_test.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
write_controller.cc Change RocksDB License 2017-07-15 16:11:23 -07:00
write_controller.h Change RocksDB License 2017-07-15 16:11:23 -07:00
write_thread.cc Avoid sleep in DBTest.GroupCommitTest to fix flakiness 2018-05-22 12:16:25 -07:00
write_thread.h WritePrepared Txn: Duplicate Keys, Txn Part 2018-02-05 18:43:24 -08:00