5d4fddfa52
Summary: With WritePrepared transaction, flush/compaction can contain uncommitted keys, and those keys can get committed during compaction. If a snapshot is taken before the key is committed, it should not see the key. On the other hand, compaction grab the list of snapshots at its beginning, and only consider those snapshots to dedup keys. Consider the case: ``` seq = 1: put "foo" = "bar" seq = 2: transaction T: delete "foo", prepare seq = 3: compaction start seq = 4: take snapshot S seq = 5: transaction T: commit. ... seq = N: compaction iterator reached key "foo". ``` When compaction start, the list of snapshot is empty. Compaction doesn't take snapshot S into account. When it reached "foo", transaction T is committed. Compaction may think the value "foo=bar" is not visible by any snapshot (which is wrong), and compact the value out. The fix is to explicitly take a snapshot before compaction grabbing the list of snapshots. Compaction will then has to keep keys visible to this snapshot. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4883 Differential Revision: D13668775 Pulled By: maysamyabandeh fbshipit-source-id: 1cab9615f94b7d3e8522cc3d44c3a14c7d4720e4 |
||
---|---|---|
.. | ||
backupable | ||
blob_db | ||
cassandra | ||
checkpoint | ||
compaction_filters | ||
convenience | ||
leveldb_options | ||
lua | ||
memory | ||
merge_operators | ||
option_change_migration | ||
options | ||
persistent_cache | ||
simulator_cache | ||
table_properties_collectors | ||
trace | ||
transactions | ||
ttl | ||
write_batch_with_index | ||
debug.cc | ||
env_librados_test.cc | ||
env_librados.cc | ||
env_librados.md | ||
env_mirror_test.cc | ||
env_mirror.cc | ||
env_timed_test.cc | ||
env_timed.cc | ||
merge_operators.h | ||
object_registry_test.cc | ||
util_merge_operators_test.cc |