Summary:
A generic rate limiter that can be shared by threads and rocksdb
instances. Will use this to smooth out write traffic generated by
compaction and flush. This will help us get better p99 behavior on flash
storage.
Test Plan:
unit test output
==== Test RateLimiterTest.Rate
request size [1 - 1023], limit 10 KB/sec, actual rate: 10.374969 KB/sec, elapsed 2002265
request size [1 - 2047], limit 20 KB/sec, actual rate: 20.771242 KB/sec, elapsed 2002139
request size [1 - 4095], limit 40 KB/sec, actual rate: 41.285299 KB/sec, elapsed 2202424
request size [1 - 8191], limit 80 KB/sec, actual rate: 81.371605 KB/sec, elapsed 2402558
request size [1 - 16383], limit 160 KB/sec, actual rate: 162.541268 KB/sec, elapsed 3303500
Reviewers: yhchiang, igor, sdong
Reviewed By: sdong
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D19359
Summary:
After evaluating options for JSON storage, I decided to implement our own. The reason is that we'll be able to optimize it better and we get to reduce unnecessary dependencies (which is what we'd get with folly).
I also plan to write a serializer/deserializer for JSONDocument with our own binary format similar to BSON. That way we'll store binary JSON format in RocksDB instead of the plain-text JSON. This means less storage and faster deserialization.
There are still some inefficiencies left here. I plan to optimize them after we develop a functioning DocumentDB. That way we can move and iterate faster.
Test Plan: added a unit test
Reviewers: dhruba, haobo, sdong, ljin, yhchiang
Reviewed By: haobo
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D18831
Summary: We have a lot of problems with gflags. However, when compiling rocksdb static library, we don't need gflags dependency. Reorganize INSTALL.md such that first-time customers don't need any dependency installed to actually build rocksdb static library.
Test Plan: none
Reviewers: dhruba, haobo
Reviewed By: dhruba
CC: leveldb
Differential Revision: https://reviews.facebook.net/D18501
Summary:
db_test includes Benchmark for LogAndApply. This diff removes it from db_test and puts it into a separate log_and_apply bench. I just wanted to play around with our new benchmark framework and figure out how it works.
I would also like to show you how great it is! I believe right set of microbenchmarks can speed up our productivity a lot and help catch early regressions.
Test Plan: no
Reviewers: dhruba, haobo, sdong, ljin, yhchiang
Reviewed By: yhchiang
CC: leveldb
Differential Revision: https://reviews.facebook.net/D18261
Summary: Added benchmark functionality on the lines of folly/Benchmark.h
Test Plan: Added unit tests
Reviewers: igor, haobo, sdong, ljin, yhchiang, dhruba
Reviewed By: igor
CC: leveldb
Differential Revision: https://reviews.facebook.net/D17973
Summary:
The file tree structure in Version is prebuilt and the range of each file is known.
On the Get() code path, we do binary search in FindFile() by comparing
target key with each file's largest key and also check the range for each L0 file.
With some pre-calculated knowledge, each key comparision that has been done can serve
as a hint to narrow down further searches:
(1) If a key falls within a L0 file's range, we can safely skip the next
file if its range does not overlap with the current one.
(2) If a key falls within a file's range in level L0 - Ln-1, we should only
need to binary search in the next level for files that overlap with the current one.
(1) will be able to skip some files depending one the key distribution.
(2) can greatly reduce the range of binary search, especially for bottom
levels, given that one file most likely only overlaps with N files from
the level below (where N is max_bytes_for_level_multiplier). So on level
L, we will only look at ~N files instead of N^L files.
Some inital results: measured with 500M key DB, when write is light (10k/s = 1.2M/s), this
improves QPS ~7% on top of blocked bloom. When write is heavier (80k/s =
9.6M/s), it gives us ~13% improvement.
Test Plan: make all check
Reviewers: haobo, igor, dhruba, sdong, yhchiang
Reviewed By: haobo
CC: leveldb
Differential Revision: https://reviews.facebook.net/D17205
Summary:
Add a skeleton binding and test for BackupableDB which shows that BackupableDB
and RocksDB can share the same JNI calls.
Test Plan:
make rocksdbjava
make jtest
Reviewers: haobo, ankgup87, sdong, dhruba
Reviewed By: haobo
CC: leveldb
Differential Revision: https://reviews.facebook.net/D17793
Summary:
We don't really need sync_point.o if we're compiling with NDEBUG.
This diff depends on D17823
Test Plan: compiles
Reviewers: haobo, ljin, sdong
Reviewed By: haobo
CC: leveldb
Differential Revision: https://reviews.facebook.net/D17829
Summary:
Introducing RocksDBLite! Removes all the non-essential features and reduces the binary size. This effort should help our adoption on mobile.
Binary size when compiling for IOS (`TARGET_OS=IOS m static_lib`) is down to 9MB from 15MB (without stripping)
Test Plan: compiles :)
Reviewers: dhruba, haobo, ljin, sdong, yhchiang
Reviewed By: yhchiang
CC: leveldb
Differential Revision: https://reviews.facebook.net/D17835
Summary:
This is first step of my effort to reduce size of librocksdb.a for use in mobile.
ldb object files are huge and are ment to be used as a command line tool. I moved them to `tools/` directory and include them only when compiling `ldb`
This diff reduced librocksdb.a from 42MB to 39MB on my mac (not stripped).
Test Plan: ran ldb
Reviewers: dhruba, haobo, sdong, ljin, yhchiang
Reviewed By: yhchiang
CC: leveldb
Differential Revision: https://reviews.facebook.net/D17823
Summary:
* Add a benchmark for java binding for rocksdb. The java benchmark
is a complete rewrite based on the c++ db/db_bench.cc and the
DbBenchmark in dain's java leveldb.
* Support multithreading.
* 'readseq' is currently not supported as it requires RocksDB Iterator.
* usage:
--benchmarks
Comma-separated list of operations to run in the specified order
Actual benchmarks:
fillseq -- write N values in sequential key order in async mode
fillrandom -- write N values in random key order in async mode
fillbatch -- write N/1000 batch where each batch has 1000 values
in random key order in sync mode
fillsync -- write N/100 values in random key order in sync mode
fill100K -- write N/1000 100K values in random order in async mode
readseq -- read N times sequentially
readrandom -- read N times in random order
readhot -- read N times in random order from 1% section of DB
Meta Operations:
delete -- delete DB
DEFAULT: [fillseq, readrandom, fillrandom]
--compression_ratio
Arrange to generate values that shrink to this fraction of
their original size after compression
DEFAULT: 0.5
--use_existing_db
If true, do not destroy the existing database. If you set this
flag and also specify a benchmark that wants a fresh database, that benchmark will fail.
DEFAULT: false
--num
Number of key/values to place in database.
DEFAULT: 1000000
--threads
Number of concurrent threads to run.
DEFAULT: 1
--reads
Number of read operations to do. If negative, do --nums reads.
--key_size
The size of each key in bytes.
DEFAULT: 16
--value_size
The size of each value in bytes.
DEFAULT: 100
--write_buffer_size
Number of bytes to buffer in memtable before compacting
(initialized to default value by 'main'.)
DEFAULT: 4194304
--cache_size
Number of bytes to use as a cache of uncompressed data.
Negative means use default settings.
DEFAULT: -1
--seed
Seed base for random number generators.
DEFAULT: 0
--db
Use the db with the following name.
DEFAULT: /tmp/rocksdbjni-bench
* Add RocksDB.write().
Test Plan: make jbench
Reviewers: haobo, sdong, dhruba, ankgup87
Reviewed By: haobo
CC: leveldb
Differential Revision: https://reviews.facebook.net/D17433
Summary: This will allow us to disable them completely for iOS or for better performance
Test Plan: will run make all check
Reviewers: igor, haobo, dhruba
Reviewed By: haobo
CC: leveldb
Differential Revision: https://reviews.facebook.net/D17511
Summary:
I had to make number of changes to the code and Makefile:
* Add `make lib`, that will create static library without debug info. We need this to avoid growing binary too much. Currently it's 14MB.
* Remove cpuinfo() function and use __SSE4_2__ macro. We actually used the macro as part of Fast_CRC32() function.
As a result, I also accidentally fixed this issue: https://www.facebook.com/groups/rocksdb.dev/permalink/549700778461774/?stream_ref=2
* Remove __thread locals in OS_MACOSX
Test Plan: `make lib PLATFORM=IOS`
Reviewers: ljin, haobo, dhruba, sdong
Reviewed By: haobo
CC: leveldb
Differential Revision: https://reviews.facebook.net/D17475
Summary:
* Add java api for rocksdb::WriteBatch and rocksdb::WriteOptions, which are necessary components
for running benchmark.
* Add java test for org.rocksdb.WriteBatch and org.rocksdb.WriteOptions.
* Add remove() to org.rocksdb.RocksDB, and add put() and remove() to RocksDB which take
org.rocksdb.WriteOptions.
Test Plan: make jtest
Reviewers: haobo, sdong, dhruba
Reviewed By: sdong
CC: leveldb
Differential Revision: https://reviews.facebook.net/D17373
Summary:
* [java] Add a java api for rocksdb::Options, currently only supports create_if_missing.
* [java] Add a test for RocksDBException in RocksDBSample.
Test Plan: make jtest
Reviewers: haobo, sdong
Reviewed By: haobo
CC: leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D17385
Summary:
This patch stores gps locations in rocksdb.
Each object is uniquely identified by an id. Each object has
a gps (latitude, longitude) associated with it. The geodb
supports looking up an object either by its gps location
or by its id. There is a method to retrieve all objects
within a circular radius centered at a specified gps location.
Test Plan: Simple unit-test attached.
Reviewers: leveldb, haobo
Reviewed By: haobo
CC: leveldb, tecbot, haobo
Differential Revision: https://reviews.facebook.net/D15567
Summary:
This diff contains a simple jni library for rocksdb which supports open, get,
put and closeusing default options (including Options, ReadOptions, and
WriteOptions.) In the usual case, Java developers can use the c++ rocksdb
library in the way similar to the following:
RocksDB db = RocksDB.open(path_to_db);
...
db.put("hello".getBytes(), "world".getBytes();
byte[] value = db.get("hello".getBytes());
...
db.close();
Specifically, this diff has the following major classes:
* RocksDB: a Java wrapper class which forwards the operations
from the java side to c++ rocksdb library.
* RocksDBException: ncapsulates the error of an operation.
This exception type is used to describe an internal error from
the c++ rocksdb library.
This diff also include a simple java sample code calling c++ rocksdb library.
To build the rocksdb jni library, simply run make jni, and make jtest will try to
build and run the sample code.
Note that if the rocksdb is not built with the default glibc that Java uses,
java will try to load the wrong glibc during the run time. As a result,
the sample code might not work properly during the run time.
Test Plan:
* make jni
* make jtest
Reviewers: haobo, dhruba, sdong, igor, ljin
Reviewed By: dhruba
CC: leveldb, xjin
Differential Revision: https://reviews.facebook.net/D17109
Summary:
@kailiu mentioned on meeting yesterday that we sometimes have trouble opening DB created by old version with the new version. This will be very important to test for column families, since I'm changing disk format for the MANIFEST.
I added a tool that can help us test that. Usage:
./db_sanity_test <path> create
will create a bunch of DBs under <path>
<change RocksDB version>
./db_sanity_test <path> verify
will verify consistency of DBs created under <path>
Test Plan: ran the db_sanity_test
Reviewers: kailiu, dhruba, haobo
Reviewed By: kailiu
CC: leveldb, kailiu, xjin
Differential Revision: https://reviews.facebook.net/D16605
Summary:
this is the key component extracted from diff: https://reviews.facebook.net/D14271
I separate it to a dedicated patch to make the review easier.
Test Plan: added a unit test and passed it.
Reviewers: haobo, sdong, dhruba
CC: leveldb
Differential Revision: https://reviews.facebook.net/D16245