A library that provides an embeddable, persistent key-value store for fast storage.
Andrew Kryczka 791723c1ec Fix race condition in db_stress thread setup (#9314)
Summary:
We need to grab `SharedState`'s mutex while calling `IncThreads()` or `IncBgThreads()`. Otherwise the newly launched threads can simultaneously access the thread counters to check if every thread has finished initializing.
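The locking pattern described above can be sketched in a few lines. This is a minimal illustration, not the code from the PR: `SharedStateSketch`, `ThreadBodySketch`, `RunSketch`, `IncInitialized()`, and `NumInitialized()` are hypothetical names, and only `IncThreads()` and `AllInitialized()` are taken from the summary. It assumes plain `std::mutex` and `std::condition_variable` rather than the wrappers `db_stress` actually uses.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for db_stress's SharedState. Only the method names
// IncThreads() and AllInitialized() come from the commit message.
class SharedStateSketch {
 public:
  std::mutex& GetMutex() { return mu_; }
  std::condition_variable& GetCondVar() { return cv_; }

  // The caller must hold GetMutex() for every method below.
  void IncThreads() { ++num_threads_; }
  void IncInitialized() { ++num_initialized_; }
  bool AllInitialized() const { return num_initialized_ >= num_threads_; }
  int NumInitialized() const { return num_initialized_; }

 private:
  std::mutex mu_;
  std::condition_variable cv_;
  int num_threads_ = 0;
  int num_initialized_ = 0;
};

// Worker: record that this thread is initialized, and wake the launcher if
// every thread counted so far has reported in.
void ThreadBodySketch(SharedStateSketch* shared) {
  std::lock_guard<std::mutex> lock(shared->GetMutex());
  shared->IncInitialized();
  if (shared->AllInitialized()) {
    shared->GetCondVar().notify_all();
  }
}

// Launcher: returns the number of initialized threads once all have started.
int RunSketch(int num_threads) {
  SharedStateSketch shared;
  std::vector<std::thread> threads;
  for (int i = 0; i < num_threads; ++i) {
    {
      // The point of the fix: hold the mutex while bumping the counter,
      // because already-launched workers read it in AllInitialized().
      std::lock_guard<std::mutex> lock(shared.GetMutex());
      shared.IncThreads();
    }
    threads.emplace_back(ThreadBodySketch, &shared);
  }
  int initialized = 0;
  {
    std::unique_lock<std::mutex> lock(shared.GetMutex());
    shared.GetCondVar().wait(lock,
                             [&shared] { return shared.AllInitialized(); });
    assert(shared.AllInitialized());
    initialized = shared.NumInitialized();
  }
  for (std::thread& t : threads) t.join();
  return initialized;
}
```

Calling `RunSketch(8)` from a `main` function launches eight workers and returns once all of them have reported in. Removing the `lock_guard` around `IncThreads()` reintroduces an unsynchronized write of the kind TSAN flags.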

Repro command:

```
$ rm -rf /dev/shm/rocksdb/rocksdb_crashtest_{whitebox,expected}/ && mkdir -p /dev/shm/rocksdb/rocksdb_crashtest_{whitebox,expected}/ && ./db_stress --acquire_snapshot_one_in=10000 --atomic_flush=1 --avoid_flush_during_recovery=0 --avoid_unnecessary_blocking_io=1 --backup_max_size=104857600 --backup_one_in=100000 --batch_protection_bytes_per_key=0 --block_size=16384 --bloom_bits=131.8094496796033 --bottommost_compression_type=zlib --cache_index_and_filter_blocks=1 --cache_size=1048576 --checkpoint_one_in=1000000 --checksum_type=kCRC32c --clear_column_family_one_in=0 --compact_files_one_in=1000000 --compact_range_one_in=1000000 --compaction_style=1 --compaction_ttl=0 --compression_max_dict_buffer_bytes=134217727 --compression_max_dict_bytes=16384 --compression_parallel_threads=1 --compression_type=zstd --compression_zstd_max_train_bytes=65536 --continuous_verification_interval=0 --db=/dev/shm/rocksdb/rocksdb_crashtest_whitebox --db_write_buffer_size=8388608 --delpercent=5 --delrangepercent=0 --destroy_db_initially=0 --disable_wal=1 --enable_compaction_filter=0 --enable_pipelined_write=0 --fail_if_options_file_error=1 --file_checksum_impl=crc32c --flush_one_in=1000000 --format_version=5 --get_current_wal_file_one_in=0 --get_live_files_one_in=1000000 --get_property_one_in=1000000 --get_sorted_wal_files_one_in=0 --index_block_restart_interval=15 --index_type=3 --iterpercent=10 --key_len_percent_dist=1,30,69 --level_compaction_dynamic_level_bytes=True --log2_keys_per_lock=22 --long_running_snapshots=0 --mark_for_compaction_one_file_in=10 --max_background_compactions=20 --max_bytes_for_level_base=10485760 --max_key=1000000 --max_key_len=3 --max_manifest_file_size=1073741824 --max_write_batch_group_size_bytes=1048576 --max_write_buffer_number=3 --max_write_buffer_size_to_maintain=4194304 --memtablerep=skip_list --mmap_read=1 --mock_direct_io=False --nooverwritepercent=1 --open_files=500000 --open_metadata_write_fault_one_in=0 --open_read_fault_one_in=32 
--open_write_fault_one_in=0 --ops_per_thread=20000 --optimize_filters_for_memory=1 --paranoid_file_checks=0 --partition_filters=0 --partition_pinning=0 --pause_background_one_in=1000000 --periodic_compaction_seconds=0 --prefixpercent=5 --prepopulate_block_cache=1 --progress_reports=0 --read_fault_one_in=1000 --readpercent=45 --recycle_log_file_num=1 --reopen=0 --ribbon_starting_level=999 --secondary_cache_fault_one_in=32 --snapshot_hold_ops=100000 --sst_file_manager_bytes_per_sec=104857600 --sst_file_manager_bytes_per_truncate=1048576 --subcompactions=2 --sync=0 --sync_fault_injection=False --target_file_size_base=2097152 --target_file_size_multiplier=2 --test_batches_snapshots=1 --test_cf_consistency=1 --top_level_index_pinning=0 --unpartitioned_pinning=0 --use_block_based_filter=1 --use_clock_cache=0 --use_direct_io_for_flush_and_compaction=0 --use_direct_reads=0 --use_full_merge_v1=1 --use_merge=0 --use_multiget=1 --user_timestamp_size=0 --verify_checksum=1 --verify_checksum_one_in=1000000 --verify_db_one_in=100000 --write_buffer_size=1048576 --write_dbid_to_manifest=1 --write_fault_one_in=0 --writepercent=35
```

TSAN error:

```
WARNING: ThreadSanitizer: data race (pid=2750142)
  Read of size 4 at 0x7ffc21d7f58c by thread T39 (mutexes: write M670895590377780496):
    #0 rocksdb::SharedState::AllInitialized() const db_stress_tool/db_stress_shared_state.h:204 (db_stress+0x4fd307)
    #1 rocksdb::ThreadBody(void*) db_stress_tool/db_stress_driver.cc:26 (db_stress+0x4fd307)
    #2 StartThreadWrapper env/env_posix.cc:454 (db_stress+0x84472f)

  Previous write of size 4 at 0x7ffc21d7f58c by main thread:
    #0 rocksdb::SharedState::IncThreads() db_stress_tool/db_stress_shared_state.h:194 (db_stress+0x4fd779)
    #1 rocksdb::RunStressTest(rocksdb::StressTest*) db_stress_tool/db_stress_driver.cc:78 (db_stress+0x4fd779)
    #2 rocksdb::db_stress_tool(int, char**) db_stress_tool/db_stress_tool.cc:348 (db_stress+0x4b97dc)
    #3 main db_stress_tool/db_stress.cc:21 (db_stress+0x47a351)

  Location is stack of main thread.

  Location is global '<null>' at 0x000000000000 ([stack]+0x00000001d58c)

  Mutex M670895590377780496 is already destroyed.

  Thread T39 (tid=2750211, running) created by main thread at:
    #0 pthread_create /home/engshare/third-party2/gcc/9.x/src/gcc-10.x/libsanitizer/tsan/tsan_interceptors.cc:964 (libtsan.so.0+0x613c3)
    #1 StartThread env/env_posix.cc:464 (db_stress+0x8463c2)
    #2 rocksdb::CompositeEnvWrapper::StartThread(void (*)(void*), void*) env/composite_env_wrapper.h:288 (db_stress+0x4bcd20)
    #3 rocksdb::EnvWrapper::StartThread(void (*)(void*), void*) include/rocksdb/env.h:1475 (db_stress+0x4bb950)
    #4 rocksdb::RunStressTest(rocksdb::StressTest*) db_stress_tool/db_stress_driver.cc:80 (db_stress+0x4fd9d2)
    #5 rocksdb::db_stress_tool(int, char**) db_stress_tool/db_stress_tool.cc:348 (db_stress+0x4b97dc)
    #6 main db_stress_tool/db_stress.cc:21 (db_stress+0x47a351)

 ThreadSanitizer: data race db_stress_tool/db_stress_shared_state.h:204 in rocksdb::SharedState::AllInitialized() const
```

Pull Request resolved: https://github.com/facebook/rocksdb/pull/9314

Test Plan: verified repro command works after this PR.

Reviewed By: jay-zhuang

Differential Revision: D33217698

Pulled By: ajkr

fbshipit-source-id: 79358fe5adb779fc9dcf80643cc102d4b467fc38
2021-12-20 13:05:23 -08:00
.circleci Fix some CI output (#9193) 2021-11-20 10:21:58 -08:00
.github/workflows Add (& fix) some simple source code checks (#8821) 2021-09-07 21:19:27 -07:00
buckifier Update TARGETS and related scripts (#9310) 2021-12-17 11:51:51 -08:00
build_tools More improvements to output for CircleCI (#9201) 2021-11-23 22:10:27 -08:00
cache Fix unity build with SUPPORT_CLOCK_CACHE (#9309) 2021-12-17 14:15:07 -08:00
cmake gcc-11 and cmake related cleanup (#9286) 2021-12-17 17:04:35 -08:00
coverage Remove asan_symbolize.py for internal asan build (#8737) 2021-09-07 15:39:11 -07:00
db Make MemoryAllocator into a Customizable class (#8980) 2021-12-17 04:20:47 -08:00
db_stress_tool Fix race condition in db_stress thread setup (#9314) 2021-12-20 13:05:23 -08:00
docs Misc doc fixes (#8983) 2021-10-07 11:22:17 -07:00
env New stable, fixed-length cache keys (#9126) 2021-12-16 17:15:13 -08:00
examples Add (& fix) some simple source code checks (#8821) 2021-09-07 21:19:27 -07:00
file Fix a bug causing duplicate trailing entries in WritableFile (buffered IO) (#9236) 2021-12-13 09:00:36 -08:00
fuzz Make EventListener into a Customizable Class (#8473) 2021-07-27 07:47:02 -07:00
hdfs fix build with 'USE_HDFS' on windows (#6950) 2020-06-12 16:21:50 -07:00
include/rocksdb Update to version 6.28 (#9312) 2021-12-17 16:20:39 -08:00
java fix java doc issues (#9253) 2021-12-16 21:04:41 -08:00
logging Use system-wide thread ID in info log lines (#9164) 2021-11-12 19:46:06 -08:00
memory Skip MemoryAllocatorTest in LITE mode (#9318) 2021-12-20 10:28:29 -08:00
memtable Minor improvement to CacheReservationManager/WriteBufferManager/CompressionDictBuilding (#9139) 2021-11-05 16:13:47 -07:00
microbench Skip directory fsync for filesystem btrfs (#8903) 2021-11-03 12:21:27 -07:00
monitoring Add tiered storage related read bytes stats to Statistic (#9123) 2021-11-16 15:17:17 -08:00
options Make MemoryAllocator into a Customizable class (#8980) 2021-12-17 04:20:47 -08:00
plugin Add initial CMake support to plugin (#9214) 2021-11-30 17:16:53 -08:00
port Fix/improve 'must free heap allocations' code (#9209) 2021-11-29 10:53:52 -08:00
table Make MemoryAllocator into a Customizable class (#8980) 2021-12-17 04:20:47 -08:00
test_util More refactoring ahead of footer & meta changes (#9240) 2021-12-10 08:13:26 -08:00
third-party Add support for building on s390x platform (#8962) 2021-10-22 10:13:15 -07:00
tools Verify recovery correctness in multi-CF blackbox crash test (#9303) 2021-12-16 09:05:40 -08:00
trace_replay Cleanup includes in dbformat.h (#8930) 2021-09-29 04:04:40 -07:00
util gcc-11 and cmake related cleanup (#9286) 2021-12-17 17:04:35 -08:00
utilities Make MemoryAllocator into a Customizable class (#8980) 2021-12-17 04:20:47 -08:00
.clang-format A script that automatically reformat affected lines 2014-01-14 12:21:24 -08:00
.gitignore gitignore cmake-build-* for CLion integration (#7933) 2021-02-19 13:43:15 -08:00
.lgtm.yml Create lgtm.yml for LGTM.com C/C++ analysis (#4058) 2018-06-26 12:43:04 -07:00
.travis.yml Re-enable 390x+cmake* Travis jobs (#9110) 2021-11-03 20:30:15 -07:00
.watchmanconfig Added .watchmanconfig file to rocksdb repo (#5593) 2019-07-19 15:00:33 -07:00
appveyor.yml Remove 2019 from appveyor (#7038) 2020-06-29 14:31:41 -07:00
AUTHORS Update RocksDB Authors File 2017-10-18 14:42:10 -07:00
CMakeLists.txt gcc-11 and cmake related cleanup (#9286) 2021-12-17 17:04:35 -08:00
CODE_OF_CONDUCT.md Adopt Contributor Covenant 2019-08-29 23:21:01 -07:00
CONTRIBUTING.md Add Code of Conduct 2017-12-05 18:42:35 -08:00
COPYING Add GPLv2 as an alternative license. 2017-04-27 18:06:12 -07:00
DEFAULT_OPTIONS_HISTORY.md options.delayed_write_rate use the rate of rate_limiter by default. 2017-05-24 09:58:24 -07:00
defs.bzl Make testpilot recognize that these tests have coverage instrumentation 2020-03-20 11:23:23 -07:00
DUMP_FORMAT.md First version of rocksdb_dump and rocksdb_undump. 2015-06-19 16:24:36 -07:00
HISTORY.md Update to version 6.28 (#9312) 2021-12-17 16:20:39 -08:00
INSTALL.md Update installation instructions (#8158) 2021-04-06 16:02:04 -07:00
issue_template.md Add Google Group to Issue Template 2020-01-28 14:40:37 -08:00
LANGUAGE-BINDINGS.md Update branch name to "main" in README/LANGUAGE_BINDINGS (#8727) 2021-09-01 15:26:34 -07:00
LICENSE.Apache Change RocksDB License 2017-07-15 16:11:23 -07:00
LICENSE.leveldb Add back the LevelDB license file 2017-07-16 18:42:18 -07:00
Makefile Make MemoryAllocator into a Customizable class (#8980) 2021-12-17 04:20:47 -08:00
PLUGINS.md Add ZenFS to plugin list (#8218) 2021-04-22 11:12:40 -07:00
README.md Update branch name to "main" in README/LANGUAGE_BINDINGS (#8727) 2021-09-01 15:26:34 -07:00
ROCKSDB_LITE.md Fix some typos in comments and docs. 2018-03-08 10:27:25 -08:00
src.mk Make MemoryAllocator into a Customizable class (#8980) 2021-12-17 04:20:47 -08:00
TARGETS Update TARGETS and related scripts (#9310) 2021-12-17 11:51:51 -08:00
thirdparty.inc Fix build jemalloc api (#5470) 2019-06-24 17:40:32 -07:00
USERS.md Update USERS.md (#8923) 2021-10-01 16:10:35 -07:00
Vagrantfile Adding CentOS 7 Vagrantfile & build script 2018-02-26 15:27:17 -08:00
WINDOWS_PORT.md Update branch name in WINDOWS_PORT.md (#8745) 2021-09-01 19:26:39 -07:00

RocksDB: A Persistent Key-Value Store for Flash and RAM Storage


RocksDB is developed and maintained by the Facebook Database Engineering Team. It is built on earlier work on LevelDB by Sanjay Ghemawat (sanjay@google.com) and Jeff Dean (jeff@google.com).

This code is a library that forms the core building block for a fast key-value server, especially suited for storing data on flash drives. It has a Log-Structured-Merge-Database (LSM) design with flexible tradeoffs between Write-Amplification-Factor (WAF), Read-Amplification-Factor (RAF) and Space-Amplification-Factor (SAF). It has multi-threaded compactions, making it especially suitable for storing multiple terabytes of data in a single database.
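To make the amplification tradeoffs concrete, here is a toy in-memory sketch of the LSM idea. It is not RocksDB's implementation or API: `ToyLsm` and all of its members are hypothetical, runs are plain `std::map`s instead of on-disk files, and compaction is a single full merge. Writes go to a small sorted buffer that is flushed into immutable runs; reads may have to probe several runs (read amplification); stale versions linger until compaction rewrites the data (space amplification traded against write amplification).

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy LSM sketch. All names are hypothetical and nothing here touches disk.
class ToyLsm {
 public:
  explicit ToyLsm(std::size_t memtable_limit) : limit_(memtable_limit) {}

  // Writes land in the in-memory buffer, which is flushed once it is full.
  void Put(const std::string& key, const std::string& value) {
    memtable_[key] = value;
    if (memtable_.size() >= limit_) Flush();
  }

  // Checks the buffer first, then the runs from newest to oldest. Every
  // extra run is one more place to look.
  bool Get(const std::string& key, std::string* value) const {
    auto it = memtable_.find(key);
    if (it != memtable_.end()) {
      *value = it->second;
      return true;
    }
    for (auto run = runs_.rbegin(); run != runs_.rend(); ++run) {
      auto hit = run->find(key);
      if (hit != run->end()) {
        *value = hit->second;
        return true;
      }
    }
    return false;
  }

  // Merges all runs into one. Iterating oldest first lets newer values
  // overwrite stale ones, which reclaims space at the cost of rewriting data.
  void Compact() {
    std::map<std::string, std::string> merged;
    for (const auto& run : runs_) {
      for (const auto& kv : run) merged[kv.first] = kv.second;
    }
    runs_.clear();
    if (!merged.empty()) runs_.push_back(std::move(merged));
  }

  std::size_t NumRuns() const { return runs_.size(); }

  // Total entries held, counting stale duplicates across runs.
  std::size_t StoredEntries() const {
    std::size_t total = memtable_.size();
    for (const auto& run : runs_) total += run.size();
    return total;
  }

 private:
  void Flush() {
    runs_.push_back(std::move(memtable_));
    memtable_.clear();  // reset the moved-from buffer to a known empty state
  }

  std::size_t limit_;
  std::map<std::string, std::string> memtable_;
  std::vector<std::map<std::string, std::string>> runs_;  // oldest first
};

// Small usage walk-through: an overwritten key occupies space in two runs
// until compaction merges them.
void ToyLsmDemo() {
  ToyLsm db(2);
  db.Put("a", "1");
  db.Put("b", "2");  // buffer full: first run flushed
  db.Put("a", "3");
  db.Put("c", "4");  // second run flushed
  assert(db.NumRuns() == 2);
  db.Compact();
  assert(db.NumRuns() == 1);
}
```

Calling `ToyLsmDemo()` from a `main` function exercises the flush and compaction paths. A real engine such as RocksDB persists runs as files, organizes them into levels, and runs compactions on background threads, none of which this sketch attempts.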

Start with example usage here: https://github.com/facebook/rocksdb/tree/main/examples

See the GitHub wiki (https://github.com/facebook/rocksdb/wiki) for more explanation.

The public interface is in include/. Callers should not include or rely on the details of any other header files in this package. Those internal APIs may be changed without warning.

Design discussions are conducted in https://www.facebook.com/groups/rocksdb.dev/ and https://rocksdb.slack.com/

License

RocksDB is dual-licensed under both the GPLv2 (found in the COPYING file in the root directory) and Apache 2.0 License (found in the LICENSE.Apache file in the root directory). You may select, at your option, one of the above-listed licenses.