A library that provides an embeddable, persistent key-value store for fast storage.
Haobo Xu 9ba82786ce [RocksDB] Provide contiguous sequence number even in case of write failure
Summary: Replication logic would be simplified if we can guarantee that the write sequence number is always contiguous, even if a write failure occurs. Dhruba and I looked at the sequence number generation part of the code. It seems fixable. Note that if the WAL write was successful and the insert into the memtable was not, we would be in an unfortunate state. The approach in this diff is: an IO error is expected, so an error status will be returned to the client and the sequence number will not be advanced; an in-memory error is not expected, so we panic.

Test Plan: make check; db_stress

Reviewers: dhruba, sheki

CC: leveldb

Differential Revision: https://reviews.facebook.net/D11439
2013-07-08 15:31:09 -07:00
db [RocksDB] Provide contiguous sequence number even in case of write failure 2013-07-08 15:31:09 -07:00
doc merge 1.5 2012-08-28 11:43:33 -07:00
hdfs Ability to configure bufferedio-reads, filesystem-readaheads and mmap-read-write per database. 2013-03-20 23:14:03 -07:00
helpers/memenv [RocksDB] cleanup EnvOptions 2013-06-12 11:17:19 -07:00
include Update rocksdb version 2013-07-01 14:30:04 -07:00
java Pom changes to make release 1.5.7 for java. 2013-01-10 10:43:43 -08:00
linters/src fixing linters. 2012-12-14 14:05:27 -08:00
port Fix Zlib_Compress and Zlib_Uncompress 2013-06-18 16:57:42 -07:00
scribe fix db_test error with scribe logger turned on 2012-08-28 11:22:58 -07:00
snappy Build with gcc-4.7.1-glibc-2.14.1. 2012-09-17 10:56:26 -07:00
table [Rocksdb] Record WriteBlock Times into a histogram 2013-06-17 10:11:10 -07:00
thrift Implement RowLocks for assoc schema 2012-10-03 23:19:01 -07:00
tools [RocksDB] add back --mmap_read options to crashtest 2013-06-19 16:15:59 -07:00
util [RocksDB] Support internal key/value dump for ldb 2013-07-03 10:41:31 -07:00
utilities Added stringappend_test back into the unit tests. 2013-06-26 11:41:13 -07:00
VALGRIND_LOGS Use version 3.8.1 for valgrind in third_party and do away with log files 2013-03-06 17:47:31 -08:00
.arcconfig Enable linting in arc. 2013-02-01 11:34:25 -08:00
.gitignore Various build cleanups/improvements 2013-01-14 18:40:22 -08:00
build_detect_platform Modify build_detect_platform to run fbcode.*.* irrespective of $PATH 2013-05-14 22:09:01 -07:00
build_detect_version Make the build-time show up in the leveldb library. 2013-03-11 10:33:15 -07:00
build_java.sh Release 1.5.6 for Java code + Script to automate it. 2012-12-17 12:11:11 -08:00
e Enhance db_bench 2013-03-14 16:00:23 -07:00
fbcode.clang31.sh Cleanup TODO/NEWS/AUTHORS files 2013-01-25 09:11:26 -08:00
fbcode.gcc471.sh Updating fbcode.gcc471.sh to use jemalloc 3.3.1 2013-03-13 15:34:50 -07:00
LICENSE reverting disastrous MOE commit, returning to r21 2011-04-19 23:11:15 +00:00
Makefile Update rocksdb version 2013-07-01 14:30:04 -07:00
README Use posix_fallocate as default. 2013-03-13 13:50:26 -07:00
README.fb Release 1.5.9.fb to third party 2013-04-10 17:23:58 -07:00
regression_build_test.sh Minor improvements to the regression testing 2013-01-16 14:47:20 -08:00
valgrind_test.sh make clean in valgrind_test.sh first 2013-04-23 14:25:19 -07:00

rocksdb: A persistent key-value store for flash storage
Authors: The Facebook Database Engineering Team

This code is a library that forms the core building block for a fast
key-value server, especially suited for storing data on flash drives.
It has a Log-Structured-Merge-Database (LSM) design with flexible tradeoffs
between Write-Amplification-Factor (WAF), Read-Amplification-Factor (RAF)
and Space-Amplification-Factor (SAF). It has multi-threaded compactions,
making it especially suitable for storing multiple terabytes of data in a
single database.
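
To make the three amplification factors concrete, here is a minimal sketch
that computes them from raw counters. It assumes the commonly used
definitions (storage bytes written per user byte written, storage reads per
lookup, and storage bytes used per byte of live data). The struct and
function names are hypothetical and are not part of this library's API.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical counters for illustration only; not a type from this library.
struct AmplificationCounters {
  std::uint64_t user_bytes_written;     // bytes the application asked to store
  std::uint64_t storage_bytes_written;  // bytes actually written (log, flushes, compactions)
  std::uint64_t live_data_bytes;        // logical size of the current key/value data
  std::uint64_t storage_bytes_used;     // bytes the database occupies on storage
  std::uint64_t lookups;                // number of point lookups issued
  std::uint64_t storage_reads;          // storage reads performed to serve those lookups
};

// WAF: storage bytes written per byte written by the user.
double WriteAmplification(const AmplificationCounters& c) {
  assert(c.user_bytes_written > 0);
  return static_cast<double>(c.storage_bytes_written) / c.user_bytes_written;
}

// RAF: storage reads needed per lookup.
double ReadAmplification(const AmplificationCounters& c) {
  assert(c.lookups > 0);
  return static_cast<double>(c.storage_reads) / c.lookups;
}

// SAF: bytes occupied on storage per byte of live data.
double SpaceAmplification(const AmplificationCounters& c) {
  assert(c.live_data_bytes > 0);
  return static_cast<double>(c.storage_bytes_used) / c.live_data_bytes;
}
```

With these definitions, a configuration that compacts more aggressively
would typically raise WAF while lowering RAF and SAF, which is the kind of
tradeoff the paragraph above refers to.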

The core of this code has been derived from open-source leveldb.

The code under this directory implements a system for maintaining a
persistent key/value store.

See doc/index.html for more explanation.
See doc/impl.html for a brief overview of the implementation.

The public interface is in include/*.h.  Callers should not include or
rely on the details of any other header files in this package.  Those
internal APIs may be changed without warning.

Guide to header files:

include/db.h
    Main interface to the DB: Start here

include/options.h
    Control over the behavior of an entire database, and also
    control over the behavior of individual reads and writes.

include/comparator.h
    Abstraction for user-specified comparison function.  If you want
    just bytewise comparison of keys, you can use the default comparator,
    but clients can write their own comparator implementations if they
    want custom ordering (e.g. to handle different character
    encodings, etc.)

include/iterator.h
    Interface for iterating over data. You can get an iterator
    from a DB object.

include/write_batch.h
    Interface for atomically applying multiple updates to a database.

include/slice.h
    A simple module for maintaining a pointer and a length into some
    other byte array.

include/status.h
    Status is returned from many of the public interfaces and is used
    to report success and various kinds of errors.

include/env.h
    Abstraction of the OS environment.  A POSIX implementation of
    this interface is in util/env_posix.cc.

include/table.h
include/table_builder.h
    Lower-level modules that most clients probably won't use directly
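
The slice and comparator ideas described above can be illustrated without
the library itself. The sketch below is a self-contained stand-in, not the
actual contents of include/slice.h or include/comparator.h: MiniSlice,
MakeSlice, BytewiseCompare and CaseInsensitiveCompare are hypothetical names
chosen for this example. It shows a non-owning pointer-plus-length view, a
bytewise three-way comparison, and one possible custom ordering.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for the slice concept: a pointer and a length into
// a byte array owned by someone else. The caller must keep that array alive.
struct MiniSlice {
  const char* data;
  std::size_t size;
};

// Builds a view over an existing string without copying its bytes.
MiniSlice MakeSlice(const std::string& s) {
  return MiniSlice{s.data(), s.size()};
}

// Bytewise ordering: returns <0, 0, or >0, like memcmp. A key that is a
// strict prefix of another sorts first.
int BytewiseCompare(MiniSlice a, MiniSlice b) {
  const std::size_t min_len = std::min(a.size, b.size);
  int r = (min_len == 0) ? 0 : std::memcmp(a.data, b.data, min_len);
  if (r == 0) {
    if (a.size < b.size) r = -1;
    else if (a.size > b.size) r = 1;
  }
  return r;
}

// One possible custom ordering: ASCII case-insensitive comparison.
int CaseInsensitiveCompare(MiniSlice a, MiniSlice b) {
  const std::size_t min_len = std::min(a.size, b.size);
  for (std::size_t i = 0; i < min_len; ++i) {
    const int ca = std::tolower(static_cast<unsigned char>(a.data[i]));
    const int cb = std::tolower(static_cast<unsigned char>(b.data[i]));
    if (ca != cb) return ca < cb ? -1 : 1;
  }
  if (a.size == b.size) return 0;
  return a.size < b.size ? -1 : 1;
}
```

Under the bytewise ordering "Apple" sorts before "apple", while the
case-insensitive ordering treats them as equal. A database must be opened
with the same ordering it was created with, since the ordering determines
how keys are laid out on storage.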