A library that provides an embeddable, persistent key-value store for fast storage.
Igor Canadi 7731d51c82 Simplify column family concurrency
Summary:
This patch changes concurrency guarantees around ColumnFamilySet::column_families_ and ColumnFamilySet::column_families_data_.

Before:
* When mutating: lock DB mutex and spin lock
* When reading: lock DB mutex OR spin lock

After:
* When mutating: lock DB mutex and be in write thread
* When reading: lock DB mutex or be in write thread

That way, we eliminate the spin lock that protects these hash maps and simplify concurrency. Writes no longer need to lock the spin lock, since writing is mutually exclusive with column family create/drop (the only operations that mutate those hash maps).
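The "After" rule can be sketched as a small model. This is an illustration only, not the code in this patch: `ToyColumnFamilySet`, `CallerContext` and their members are hypothetical names, and the two protections are modelled as plain flags rather than a real mutex and write queue. The point it shows is that a mutation needs both protections, so a reader holding either one can never overlap with a mutator.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical description of what the calling thread currently holds.
struct CallerContext {
  bool holds_db_mutex;
  bool in_write_thread;
};

// Toy stand-in for the column family hash maps; not RocksDB's actual class.
class ToyColumnFamilySet {
 public:
  // Mutating requires the DB mutex AND being in the write thread.
  static bool CanMutate(const CallerContext& c) {
    return c.holds_db_mutex && c.in_write_thread;
  }

  // Reading requires the DB mutex OR being in the write thread.
  static bool CanRead(const CallerContext& c) {
    return c.holds_db_mutex || c.in_write_thread;
  }

  // Returns false (and changes nothing) if the caller breaks the rule.
  bool Create(const CallerContext& c, const std::string& name, int id) {
    if (!CanMutate(c)) return false;
    ids_[name] = id;
    return true;
  }

  // Returns true only if the caller may mutate and the name existed.
  bool Drop(const CallerContext& c, const std::string& name) {
    if (!CanMutate(c)) return false;
    return ids_.erase(name) > 0;
  }

  // Returns the id, or -1 if the name is unknown or the caller may not read.
  int Find(const CallerContext& c, const std::string& name) const {
    if (!CanRead(c)) return -1;
    auto it = ids_.find(name);
    return it == ids_.end() ? -1 : it->second;
  }

 private:
  std::unordered_map<std::string, int> ids_;
};
```

In this model a caller that only holds the mutex can look up a column family but cannot create or drop one, which mirrors why create had to move into the write thread.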

With these new restrictions, I also needed to move column family create to the write thread (column family drop was already in the write thread).

Even though writes no longer lock the spin lock, the impact on performance should be minimal -- the spin lock was almost never busy, so locking it was almost free.

This addresses task t5116919.

Test Plan:
make check

Stress test with lots and lots of column family drop and create:

   time ./db_stress --threads=30 --ops_per_thread=5000000 --max_key=5000 --column_families=200 --clear_column_family_one_in=100000 --verify_before_write=0  --reopen=15 --max_background_compactions=10 --max_background_flushes=10 --db=/fast-rocksdb-tmp/db_stress/

Reviewers: yhchiang, rven, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D30651
2015-01-06 12:44:21 -08:00
build_tools Remove -mtune=native because it's redundant 2014-12-19 09:06:45 -08:00
coverage Fix coverage script 2014-11-03 14:53:00 -08:00
db Simplify column family concurrency 2015-01-06 12:44:21 -08:00
doc Remove seek compaction 2014-06-20 10:23:02 +02:00
examples Clean up compile for c_simple_example 2014-12-23 17:32:30 +01:00
hdfs Replace exception by abort() in dummy HdfsEnv implementation. 2014-12-05 13:30:57 -08:00
helpers/memenv Turn -Wshadow back on 2014-11-06 11:14:28 -08:00
include Deprecating skip_log_error_on_recovery 2015-01-05 13:35:56 -08:00
java Merge pull request #444 from adamretter/java-api-fix 2014-12-23 15:26:45 +01:00
linters Fix linters 2014-12-02 13:53:39 -05:00
port Add rocksdb::ToString() to address cases where std::to_string is not available. 2014-11-24 20:44:49 -08:00
table Dump routine to BlockBasedTableReader 2014-12-23 13:24:07 -08:00
third-party/rapidjson Fix a rapidjson compile error in mac. 2014-06-23 17:09:24 -06:00
tools benchmark.sh won't run through all tests properly if one specifies wal_dir to be different than db directory. 2015-01-05 15:36:47 -08:00
util Deprecating skip_log_error_on_recovery 2015-01-05 13:35:56 -08:00
utilities Fix errors when using -Wshorten-64-to-32. 2015-01-05 21:21:04 +08:00
.arcconfig Improve/fix bugs for the cpp linter 2014-02-13 17:48:11 -08:00
.clang-format A script that automatically reformat affected lines 2014-01-14 12:21:24 -08:00
.gitignore Ignore IntelliJ idea project files and ignore java/out folder 2014-10-21 15:52:27 +01:00
.travis.yml Don't parallelize the build in travis 2014-11-14 16:23:56 -08:00
AUTHORS Add AUTHORS file. Fix #203 2014-09-29 10:52:18 -07:00
CONTRIBUTING.md facebook accounts are not required for CLA signers 2014-07-08 05:57:54 -04:00
HISTORY.md Deprecating skip_log_error_on_recovery 2015-01-05 13:35:56 -08:00
INSTALL.md Optimize default compile to compilation platform by default 2014-12-15 11:29:41 +01:00
LICENSE Fix copyright year 2014-03-12 12:06:58 -07:00
Makefile Dump routine to BlockBasedTableReader 2014-12-23 13:24:07 -08:00
PATENTS Fix the patent format 2013-10-16 15:37:32 -07:00
README.md Replaced "built on on earlier work" by "built on earlier work" in README.md 2014-09-17 01:16:17 -07:00
ROCKSDB_LITE.md RocksDBLite 2014-04-15 13:39:26 -07:00
Vagrantfile Package generation for Ubuntu and CentOS 2014-09-29 16:09:46 -07:00

RocksDB: A Persistent Key-Value Store for Flash and RAM Storage


RocksDB is developed and maintained by the Facebook Database Engineering Team. It is built on earlier work on LevelDB by Sanjay Ghemawat (sanjay@google.com) and Jeff Dean (jeff@google.com).

This code is a library that forms the core building block for a fast key-value server, especially suited for storing data on flash drives. It has a Log-Structured-Merge-Database (LSM) design with flexible tradeoffs between Write-Amplification-Factor (WAF), Read-Amplification-Factor (RAF) and Space-Amplification-Factor (SAF). It has multi-threaded compactions, making it especially suitable for storing multiple terabytes of data in a single database.
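To make the LSM tradeoffs concrete, here is a toy sketch of the general idea. It is not RocksDB's implementation and uses none of its API: `ToyLsm` and all of its members are hypothetical, everything lives in memory, and compaction is a single-threaded full merge. Writes go to an in-memory table that is flushed into immutable sorted runs, reads probe the newest data first, and compaction rewrites data so that later reads touch fewer runs.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy log-structured merge store; an illustration, not RocksDB code.
class ToyLsm {
 public:
  explicit ToyLsm(std::size_t memtable_limit) : limit_(memtable_limit) {}

  // Writes land in the memtable; a full memtable becomes a new sorted run.
  void Put(const std::string& key, const std::string& value) {
    memtable_[key] = value;
    if (memtable_.size() >= limit_) Flush();
  }

  // Looks in the memtable, then in runs from newest to oldest.
  // *probes receives the number of structures consulted.
  bool Get(const std::string& key, std::string* value,
           std::size_t* probes) const {
    std::size_t n = 1;  // the memtable is always consulted
    auto it = memtable_.find(key);
    if (it != memtable_.end()) {
      *value = it->second;
      *probes = n;
      return true;
    }
    for (auto run = runs_.rbegin(); run != runs_.rend(); ++run) {
      ++n;
      auto hit = run->find(key);
      if (hit != run->end()) {
        *value = hit->second;
        *probes = n;
        return true;
      }
    }
    *probes = n;
    return false;
  }

  // Merges all runs into one; entries from newer runs overwrite older ones.
  void Compact() {
    std::map<std::string, std::string> merged;
    for (const auto& run : runs_) {  // oldest first
      for (const auto& kv : run) merged[kv.first] = kv.second;
    }
    runs_.clear();
    if (!merged.empty()) runs_.push_back(std::move(merged));
  }

  std::size_t RunCount() const { return runs_.size(); }

 private:
  void Flush() {
    runs_.push_back(std::move(memtable_));
    memtable_.clear();  // leave the moved-from map in a known empty state
  }

  std::size_t limit_;
  std::map<std::string, std::string> memtable_;
  std::vector<std::map<std::string, std::string>> runs_;  // oldest first
};
```

In this model, each extra run adds one probe to a lookup that misses newer data (read amplification), stale versions of a key occupy space until merged (space amplification), and `Compact()` pays for fewer probes by rewriting entries that were already written once (write amplification).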

Start with example usage here: https://github.com/facebook/rocksdb/tree/master/examples

See the GitHub wiki for more explanation.

The public interface is in include/. Callers should not include or rely on the details of any other header files in this package. Those internal APIs may be changed without warning.

Design discussions are conducted in https://www.facebook.com/groups/rocksdb.dev/