Summary:
Make inplace_update_support and inplace_update_num_locks dynamic.
inplace_callback becomes immutable
We are almost free of references to cfd->options() in db_impl
Test Plan: unit test
Reviewers: igor, yhchiang, rven, sdong
Reviewed By: sdong
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D25293
Summary: Since we depend on C++11, we might as well use it for timing instead of this platform-dependent code.
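A minimal sketch of what C++11-only timing can look like, assuming std::chrono::steady_clock is an acceptable clock. NowMicros and TimeMicros are hypothetical helper names for illustration, not the functions touched by this diff.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical helper: microseconds read from a monotonic clock via <chrono>.
inline uint64_t NowMicros() {
  using namespace std::chrono;
  return static_cast<uint64_t>(
      duration_cast<microseconds>(steady_clock::now().time_since_epoch())
          .count());
}

// Hypothetical helper: runs fn once and returns the elapsed microseconds.
template <typename Fn>
uint64_t TimeMicros(Fn&& fn) {
  const uint64_t start = NowMicros();
  fn();
  const uint64_t end = NowMicros();
  assert(end >= start);  // steady_clock never goes backwards
  return end - start;
}
```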
Test Plan: Ran autovector_test, which reports time and confirmed that output is similar to master
Reviewers: ljin, sdong, yhchiang, rven, dhruba
Reviewed By: dhruba
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D25587
Summary: as title
Test Plan: unit test
Reviewers: sdong, yhchiang, rven, igor
Reviewed By: igor
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D25347
Summary:
This is not a critical option. Make it dynamic so that we can remove
more references to cfd->options()
Test Plan: unit test
Reviewers: yhchiang, sdong, igor
Reviewed By: igor
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D24957
Summary: as title
Test Plan: make release
Reviewers: sdong, yhchiang, igor
Reviewed By: igor
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D24903
Summary: as title
Test Plan:
unit test
I am only able to build the test case for hard_rate_limit.
soft_rate_limit is essentially the same thing as hard_rate_limit
Reviewers: igor, sdong, yhchiang
Reviewed By: yhchiang
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D24759
Summary: Add more tests as well
Test Plan: unit test
Reviewers: igor, sdong, yhchiang
Reviewed By: sdong
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D24747
Summary: as title
Test Plan: unit test
Reviewers: sdong, yhchiang, igor
Reviewed By: igor
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D24729
Summary:
ldb to support --fix_prefix_len to allow us to verify more cases.
Also fix a small issue that --bloom_bits might not be applied if --block_size is not given.
Test Plan: run ldb tool against an example DB.
Reviewers: ljin, yhchiang, rven, igor
Reviewed By: igor
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D24819
Summary:
Fixed the following error in Mac:
./util/testharness.h:93:19: error: comparison of integers of different signs: 'const unsigned long' and 'const int' [-Werror,-Wsign-compare]
BINARY_OP(IsEq, ==)
~~~~~~~~~~~~~~~~^~~
./util/testharness.h:86:14: note: expanded from macro 'BINARY_OP'
if (! (x op y)) { \
^
util/options_test.cc:269:3: note: in instantiation of function template specialization 'rocksdb::test::Tester::IsEq<unsigned long, int>' requested here
ASSERT_EQ(new_cf_opt.write_buffer_size, 5);
^
Test Plan:
options_test
Summary: Allow accepting Options as a string of key/value pairs
Test Plan: unit test
Reviewers: yhchiang, sdong, igor
Reviewed By: igor
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D24597
Summary:
This diff introduces the `lookahead` argument to `SkipListFactory()`. This is an
optimization for the tailing use case which includes many seeks. E.g. consider
the following operations on a skip list iterator:
Seek(x), Next(), Next(), Seek(x+2), Next(), Seek(x+3), Next(), Next(), ...
If `lookahead` is positive, `SkipListRep` will return an iterator which also
keeps track of the previously visited node. Seek() then first does a linear
search starting from that node (up to `lookahead` steps). As in the tailing
example above, this may require fewer than ~log(n) comparisons as with regular
skip list search.
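The lookahead idea can be sketched as follows. This is an illustration under assumptions, not the SkipListRep code: a sorted std::vector stands in for the skip list, std::lower_bound stands in for the regular O(log n) search, and LookaheadIteratorSketch is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// A sorted vector stands in for the skip list; the iterator remembers its
// last position and tries a short linear scan from there before falling
// back to the regular logarithmic search.
class LookaheadIteratorSketch {
 public:
  LookaheadIteratorSketch(const std::vector<int>& keys, std::size_t lookahead)
      : keys_(keys), lookahead_(lookahead), pos_(keys.size()) {}

  bool Valid() const { return pos_ < keys_.size(); }
  int key() const { assert(Valid()); return keys_[pos_]; }
  void Next() { assert(Valid()); ++pos_; }
  std::size_t fast_seeks() const { return fast_seeks_; }

  // Positions the iterator at the first key >= target.
  void Seek(int target) {
    if (Valid() && keys_[pos_] < target) {
      // Every key up to pos_ is < target, so the answer lies ahead of pos_.
      for (std::size_t step = 1; step <= lookahead_; ++step) {
        const std::size_t p = pos_ + step;
        if (p == keys_.size() || keys_[p] >= target) {
          pos_ = p;
          ++fast_seeks_;
          return;
        }
      }
    }
    // Fallback: regular search over the whole structure.
    pos_ = static_cast<std::size_t>(
        std::lower_bound(keys_.begin(), keys_.end(), target) - keys_.begin());
  }

 private:
  const std::vector<int>& keys_;
  const std::size_t lookahead_;
  std::size_t pos_;
  std::size_t fast_seeks_ = 0;
};
```

With lookahead set to 0 the scan loop never runs, so every Seek() takes the regular search path.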
Test Plan:
Added a new benchmark (`fillseekseq`) which simulates the usage pattern. It
first writes N records (with consecutive keys), then measures how much time it
takes to read them by calling `Seek()` and `Next()`.
$ time ./db_bench -num 10000000 -benchmarks fillseekseq -prefix_size 1 \
-key_size 8 -write_buffer_size $[1024*1024*1024] -value_size 50 \
-seekseq_next 2 -skip_list_lookahead=0
[...]
DB path: [/dev/shm/rocksdbtest/dbbench]
fillseekseq : 0.389 micros/op 2569047 ops/sec;
real 0m21.806s
user 0m12.106s
sys 0m9.672s
$ time ./db_bench [...] -skip_list_lookahead=2
[...]
DB path: [/dev/shm/rocksdbtest/dbbench]
fillseekseq : 0.153 micros/op 6540684 ops/sec;
real 0m19.469s
user 0m10.192s
sys 0m9.252s
Reviewers: ljin, sdong, igor
Reviewed By: igor
Subscribers: dhruba, leveldb, march, lovro
Differential Revision: https://reviews.facebook.net/D23997
Summary:
Make compaction-related options changeable. Most of the changes are tedious and
follow the same convention: grab MutableCFOptions at the beginning
of compaction under the mutex, pass it throughout the job, and register
it in SuperVersion at the end.
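The convention can be sketched roughly as below. The types are simplified stand-ins (ColumnFamilySketch and CompactionJobSketch are hypothetical names, and the two option fields are only examples); the point is that the job copies the options once under the mutex and never re-reads them.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>

// Simplified stand-in for the mutable option set.
struct MutableCFOptions {
  uint64_t target_file_size_base = 2 << 20;
  int level0_file_num_compaction_trigger = 4;
};

// Hypothetical holder of the latest options, guarded by a mutex.
class ColumnFamilySketch {
 public:
  void SetOptions(const MutableCFOptions& o) {
    std::lock_guard<std::mutex> lock(mu_);
    current_ = o;
  }
  MutableCFOptions GetLatestOptions() const {
    std::lock_guard<std::mutex> lock(mu_);
    return current_;  // copied while the mutex is held
  }

 private:
  mutable std::mutex mu_;
  MutableCFOptions current_;
};

// The job takes one snapshot at construction and uses it throughout, so a
// concurrent SetOptions() cannot change its behavior midway.
struct CompactionJobSketch {
  explicit CompactionJobSketch(const ColumnFamilySketch& cf)
      : opts(cf.GetLatestOptions()) {}
  bool ShouldCompact(int l0_files) const {
    return l0_files >= opts.level0_file_num_compaction_trigger;
  }
  const MutableCFOptions opts;
};
```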
Test Plan: make all check
Reviewers: igor, yhchiang, sdong
Reviewed By: sdong
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D23349
Prefer prefix ++operator for non-primitive types like iterators for
performance reasons. Prefix ++/-- operators avoid creating a temporary
copy.
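To illustrate the difference, here is a small sketch with a hypothetical iterator type that counts its own copies. The canonical postfix form has to save the old value before incrementing, while the prefix form does not.

```cpp
#include <cassert>

// Hypothetical iterator-like type that counts how often it is copied.
struct CountingIterator {
  int pos = 0;
  static int copies;

  CountingIterator() = default;
  CountingIterator(const CountingIterator& o) : pos(o.pos) { ++copies; }

  // Prefix: increment in place and return a reference, no copy.
  CountingIterator& operator++() {
    ++pos;
    return *this;
  }

  // Postfix: must keep the old value around, so at least one copy is made.
  CountingIterator operator++(int) {
    CountingIterator old(*this);
    ++pos;
    return old;
  }
};

int CountingIterator::copies = 0;
```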
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
Add comment to enable cppcheck suppression of intentional null
pointer deref via --inline-suppr option.
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
Extended built-in comparators with ReverseBytewiseComparator.
Reverse key handling is essential under certain conditions, e.g. when
using timestamp-versioned data.
Native comparators were not available through the Java API, so both built-in comparators
are now exposed via JNI and can be set at database creation time.
Summary:
The buffer for the file summary is now too small for printing. Enlarge it.
To enable this, allow passing a size to the log buffer.
Test Plan:
Add a unit test.
make all check
Reviewers: ljin, yhchiang
Reviewed By: yhchiang
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D21723
Summary:
It was commented out in D22545 by accident. Keep the option in
ImmutableOptions for now. I can make it dynamic in
https://reviews.facebook.net/D23349
Test Plan: make release
Reviewers: sdong, yhchiang, igor
Reviewed By: igor
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D23865
Summary: It constrains the file size to be 4G max with int
Test Plan:
tried to grep instance and made sure other related variables are also
uint64
Reviewers: sdong, yhchiang, igor
Reviewed By: igor
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D23697
Summary:
compression_size_percent is an int but was printed as
an unsigned int. So the default of -1 is displayed as a big number.
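A small sketch of the symptom, using hypothetical helper names. The unsigned variant converts explicitly so the behavior is well defined; the original code presumably passed the int straight to an unsigned format specifier.

```cpp
#include <cassert>
#include <climits>
#include <cstdio>
#include <string>

// Formats the value with the matching signed specifier.
inline std::string FormatAsSigned(int v) {
  char buf[32];
  std::snprintf(buf, sizeof(buf), "%d", v);
  return buf;
}

// Formats the same value as if it were unsigned: -1 wraps to UINT_MAX.
inline std::string FormatAsUnsigned(int v) {
  char buf[32];
  std::snprintf(buf, sizeof(buf), "%u", static_cast<unsigned int>(v));
  return buf;
}
```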
Test Plan: make check
Reviewers: sdong
Reviewed By: sdong
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D23679
Summary: To avoid false positive test failures when the file system doesn't support fallocate. In EnvTest.AllocateTest, we first make a simple fallocate call and check the error codes to rule out the possibility that it is not supported. Skip the test if the error code indicates it is not supported.
Test Plan: Run the test and make sure it passes on file systems supporting and not supporting fallocate
Reviewers: yhchiang, ljin, igor
Reviewed By: igor
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D23667
Summary: as title
Test Plan:
make all check
I will think of a way to set up a stress test for this
Reviewers: sdong, yhchiang, igor
Reviewed By: igor
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D23055
Summary: as title
Test Plan: options_test
Reviewers: sdong, yhchiang, igor
Reviewed By: igor
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D23283
Summary: This works on my compiler, but it turns out some compilers don't implicitly add constness, see: https://github.com/facebook/rocksdb/issues/284. This diff adds constness explicitly.
Test Plan: still compiles
Reviewers: sdong
Reviewed By: sdong
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D23409
Summary:
The compilers we use treat char as signed. However, this is not guaranteed by the C standard, and some compilers (for the ARM platform, for example) treat char as unsigned. Code that assumes that char is either signed or unsigned is wrong.
This change explicitly casts the char to its signed version. This will not break any of our use cases on x86, which, I believe, are all of them. In case somebody out there is using RocksDB on ARM AND using bloom filters, they're going to have a bad time. However, it is very unlikely that this is the case.
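A sketch of the kind of cast involved, under the assumption that bytes are widened before being mixed into a hash. MixBytes is a hypothetical function, not the hash used by the bloom filter.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Folds bytes into a 32-bit value. Each byte is widened through signed char
// explicitly, so the result does not depend on whether plain char is signed.
// (Before C++20 the out-of-range conversion to signed char is
// implementation-defined; mainstream compilers wrap modulo 256.)
inline uint32_t MixBytes(const char* data, std::size_t n) {
  uint32_t h = 0;
  for (std::size_t i = 0; i < n; ++i) {
    h = h * 31 + static_cast<uint32_t>(static_cast<signed char>(data[i]));
  }
  return h;
}
```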
Test Plan: sanity test with previous commit (with new sanity test)
Reviewers: yhchiang, ljin, sdong
Reviewed By: ljin
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D22767
Summary: removed reference to options in WriteBatch and DBImpl::Get()
Test Plan: make all check
Reviewers: yhchiang, igor, sdong
Reviewed By: sdong
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D23049
Summary:
all shared_ptrs are in immutable_options now. This will also make
options assignment a little cheaper
Test Plan: make release
Reviewers: sdong, yhchiang, igor
Reviewed By: igor
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D23001
Summary:
I found it is almost impossible to get rid of this function in a single
batch. I will take a step by step approach
Test Plan: make release
Reviewers: sdong, yhchiang, igor
Reviewed By: igor
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D22995
Summary:
Introducing WriteController, which is a source of truth about per-DB write delays. Let's define a DB epoch as a period where there are no flushes and compactions (i.e. a new epoch is started when a flush or compaction finishes). Each epoch can either:
* proceed with all writes without delay
* delay all writes by fixed time
* stop all writes
The three modes are recomputed at each epoch change (flush, compaction), rather than on every write (which is currently the case).
When we have a lot of column families, our current pull behavior adds a big overhead, since we need to loop over every column family for every write. With new push model, overhead on Write code-path is minimal.
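The push model can be sketched like this. Everything here is an assumption for illustration: the class name, the EpochState inputs, and the level-0 thresholds used to choose a mode are hypothetical. The shape to note is that the mode is recomputed only on epoch change, while the write path does a single atomic load.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

enum class WriteMode { kNormal, kDelayed, kStopped };

// Hypothetical inputs that decide the mode for the next epoch.
struct EpochState {
  int l0_files;
  int slowdown_trigger;
  int stop_trigger;
};

class WriteControllerSketch {
 public:
  explicit WriteControllerSketch(uint64_t delay_micros = 1000)
      : delay_micros_(delay_micros) {}

  // Called when a flush or compaction finishes, i.e. on epoch change.
  void OnEpochChange(const EpochState& s) {
    WriteMode m = WriteMode::kNormal;
    if (s.l0_files >= s.stop_trigger) {
      m = WriteMode::kStopped;
    } else if (s.l0_files >= s.slowdown_trigger) {
      m = WriteMode::kDelayed;
    }
    mode_.store(m, std::memory_order_release);
  }

  // Write path: one atomic load, no loop over column families.
  WriteMode mode() const { return mode_.load(std::memory_order_acquire); }

  uint64_t DelayMicros() const {
    return mode() == WriteMode::kDelayed ? delay_micros_ : 0;
  }

 private:
  std::atomic<WriteMode> mode_{WriteMode::kNormal};
  uint64_t delay_micros_;
};
```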
This is just the start. Next step is to also take care of stalls introduced by slow memtable flushes. The final goal is to eliminate function MakeRoomForWrite(), which currently needs to be called for every column family by every write.
Test Plan: make check for now. I'll add some unit tests later. Also, perf test.
Reviewers: dhruba, yhchiang, MarkCallaghan, sdong, ljin
Reviewed By: ljin
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D22791
Summary:
1. Make filter_block.h a base class. Derive block_based_filter_block and full_filter_block. The former is the traditional filter block. The full_filter_block is newly added. It generates a filter block that contains all the keys in the SST file.
2. When querying a key, the table first checks if full_filter is available. If not, it goes to the exact data block and checks using the block_based filter.
3. Users can choose to use full_filter or the traditional block_based_filter. They are stored in the SST file under different meta index names: "filter.filter_policy" or "full_filter.filter_policy". The table reader can then tell the filter block type.
4. Some optimizations have been done for full_filter_block, thus it requires a different interface compared to the original one in filter_policy.h.
5. Actual implementation of filter bits coding/decoding is placed in util/bloom_impl.cc
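The class layout and query order in points 1 and 2 can be sketched as follows. This is an illustration only: the class names are hypothetical, and exact key sets stand in for the Bloom filter bits, so there are no false positives here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>
#include <utility>
#include <vector>

using KeySet = std::unordered_set<std::string>;

// Base interface shared by both filter block kinds.
class FilterBlockSketch {
 public:
  virtual ~FilterBlockSketch() = default;
  virtual bool IsBlockBased() const = 0;
  virtual bool KeyMayMatch(const std::string& key, std::size_t block) const = 0;
};

// One filter covering every key in the file; the block index is ignored.
class FullFilterSketch : public FilterBlockSketch {
 public:
  explicit FullFilterSketch(KeySet keys) : keys_(std::move(keys)) {}
  bool IsBlockBased() const override { return false; }
  bool KeyMayMatch(const std::string& key, std::size_t) const override {
    return keys_.count(key) > 0;
  }

 private:
  KeySet keys_;
};

// One filter per data block.
class BlockBasedFilterSketch : public FilterBlockSketch {
 public:
  explicit BlockBasedFilterSketch(std::vector<KeySet> blocks)
      : blocks_(std::move(blocks)) {}
  bool IsBlockBased() const override { return true; }
  bool KeyMayMatch(const std::string& key, std::size_t block) const override {
    return block < blocks_.size() && blocks_[block].count(key) > 0;
  }

 private:
  std::vector<KeySet> blocks_;
};

// Query path: a full filter is consulted without locating the data block;
// a block-based filter needs the block index first.
template <typename LocateBlock>
bool TableMayContain(const FilterBlockSketch* filter, const std::string& key,
                     LocateBlock locate_block) {
  if (filter == nullptr) return true;  // no filter: must read the block
  if (!filter->IsBlockBased()) return filter->KeyMayMatch(key, 0);
  return filter->KeyMayMatch(key, locate_block(key));
}
```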
Benchmark: base commit 1d23b5c470
Command:
db_bench --db=/dev/shm/rocksdb --num_levels=6 --key_size=20 --prefix_size=20 --keys_per_prefix=0 --value_size=100 --write_buffer_size=134217728 --max_write_buffer_number=2 --target_file_size_base=33554432 --max_bytes_for_level_base=1073741824 --verify_checksum=false --max_background_compactions=4 --use_plain_table=0 --memtablerep=prefix_hash --open_files=-1 --mmap_read=1 --mmap_write=0 --bloom_bits=10 --bloom_locality=1 --memtable_bloom_bits=500000 --compression_type=lz4 --num=393216000 --use_hash_search=1 --block_size=1024 --block_restart_interval=16 --use_existing_db=1 --threads=1 --benchmarks=readrandom --disable_auto_compactions=1
Read QPS increases by about 30%, from 2230002 to 2991411.
Test Plan:
make all check
valgrind db_test
db_stress --use_block_based_filter=0
./auto_sanity_test.sh
Reviewers: igor, yhchiang, ljin, sdong
Reviewed By: sdong
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D20979
Summary:
Simplify code by removing the code path which does not use Arena
from NewInternalIterator
Test Plan:
make all check
make valgrind_check
Reviewers: sdong
Reviewed By: sdong
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D22395
Summary:
As a preparation to support updating some options dynamically, I'd like
to first introduce ImmutableOptions, which is a subset of Options that
cannot be changed during the course of a DB lifetime without restart.
ColumnFamily will keep both Options and ImmutableOptions. Any component
below ColumnFamily should only take ImmutableOptions in their
constructor. Other options should be taken from APIs, which will be
allowed to adjust dynamically.
I have yet to make changes to memtable and other related classes to take
ImmutableOptions in their ctor. That can be done in a separate diff as
this one is already pretty big.
Test Plan: make all check
Reviewers: yhchiang, igor, sdong
Reviewed By: sdong
Subscribers: leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D22545
Summary:
Lots of travis builds are failing because of EnvPosixTest.RandomAccessUniqueID: https://travis-ci.org/facebook/rocksdb/builds/34400833
This is the result of their environment and not because of a bug in RocksDB.
Also note that RocksDB works correctly even when the UniqueID feature is not present in the system (as is the case with OS X)
Test Plan:
OPT=-DTRAVIS make env_test && ./env_test
Observed that offending tests are not being run
Reviewers: sdong, yhchiang, ljin
Reviewed By: ljin
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D22803
1. const qualifiers on return types make no sense and trigger a compile warning: warning: type qualifiers ignored on function return type [-Wignored-qualifiers]
2. class HistogramImpl has virtual functions and thus should have a virtual destructor
3. with some toolchains, the macro __STDC_FORMAT_MACROS is predefined and thus should be checked before it is defined
Change-Id: I69747a03bfae88671bfbb2637c80d17600159c99
Signed-off-by: liuhuahang <liuhuahang@zerus.co>
This eliminates the need to remember to call PERF_TIMER_STOP when a section has
been timed. This allows more useful design with the perf timers and enables
possible return value optimizations. Simplistic example:
class Foo {
 public:
  explicit Foo(int v = 0) : m_v(v) {}

 private:
  int m_v;
};

// The out-parameter is named err because errno is a macro in <cerrno>.
Foo makeFrobbedFoo(int* err)
{
  *err = 0;
  return Foo();
}

Foo bar(int* err)
{
  // The timer stops when the guard goes out of scope.
  PERF_TIMER_GUARD(some_timer);
  return makeFrobbedFoo(err);
}

int main(int argc, char* argv[])
{
  int err;
  Foo f = bar(&err);
  if (err)
    return -1;
  return 0;
}
After bar() is called, perf_context.some_timer would be incremented as if
Stop(&perf_context.some_timer) was called at the end, and the compiler is still
able to produce optimizations on the return value from makeFrobbedFoo() through
to main().
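For readers unfamiliar with the pattern, a guard of this kind can be sketched with plain RAII. This is a stand-alone illustration with hypothetical names (PerfTimerGuardSketch, SlowLookup), not the definition of PERF_TIMER_GUARD.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <string>
#include <thread>

// Adds the elapsed nanoseconds to *metric when the guard is destroyed, so
// every return path of the enclosing scope is timed without an explicit stop.
class PerfTimerGuardSketch {
 public:
  explicit PerfTimerGuardSketch(uint64_t* metric)
      : metric_(metric), start_(std::chrono::steady_clock::now()) {}
  ~PerfTimerGuardSketch() {
    const auto elapsed = std::chrono::steady_clock::now() - start_;
    *metric_ += static_cast<uint64_t>(
        std::chrono::duration_cast<std::chrono::nanoseconds>(elapsed).count());
  }
  PerfTimerGuardSketch(const PerfTimerGuardSketch&) = delete;
  PerfTimerGuardSketch& operator=(const PerfTimerGuardSketch&) = delete;

 private:
  uint64_t* metric_;
  std::chrono::steady_clock::time_point start_;
};

// Hypothetical timed function: the value is returned directly and the guard's
// destructor runs after the return value has been constructed.
inline std::string SlowLookup(uint64_t* nanos) {
  PerfTimerGuardSketch guard(nanos);
  std::this_thread::sleep_for(std::chrono::milliseconds(2));
  return std::string("value");
}
```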
Summary:
BlockBasedTable sst file size can grow to a large size when universal
compaction is used. When index block exceeds 2G, pread seems to fail and
return truncated data and causes "truncated block" error. I tried to use
```
#define _FILE_OFFSET_BITS 64
```
But the problem still persists. Splitting a big write/read into smaller
batches seems to solve the problem.
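The batching approach can be sketched as below. To keep the example self-contained, a pread-like callback stands in for the real system call; ReadAtFn, ReadInChunks, and the max_chunk parameter are hypothetical names, and the chunk size used by the actual fix is not stated here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <functional>
#include <string>

// pread-like callback: reads up to n bytes at offset into buf and returns the
// number of bytes read (0 at end of file, negative on error).
using ReadAtFn =
    std::function<std::ptrdiff_t(char* buf, std::size_t n, uint64_t offset)>;

// Reads exactly n bytes by issuing requests of at most max_chunk bytes each.
// Returns false on error or if the source ends early.
inline bool ReadInChunks(const ReadAtFn& read_at, char* buf, std::size_t n,
                         uint64_t offset, std::size_t max_chunk) {
  assert(max_chunk > 0);
  std::size_t done = 0;
  while (done < n) {
    const std::size_t want = std::min(max_chunk, n - done);
    const std::ptrdiff_t got = read_at(buf + done, want, offset + done);
    if (got <= 0) return false;
    done += static_cast<std::size_t>(got);
  }
  return true;
}
```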
Test Plan:
successfully compacted a case with resulting sst file at ~90G (2.1G
index block size)
Reviewers: yhchiang, igor, sdong
Reviewed By: sdong
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D22569
Summary: No __thread for ios.
Test Plan: compile works for ios now
Reviewers: ljin, dhruba
Reviewed By: dhruba
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D22491
Summary: This assert makes Insert O(n^2) instead of O(n) in debug mode. Memtable insert is in the critical path. No need to assert uniqueness of the key here, since we're adding a sequence number to it anyway.
Test Plan: none
Reviewers: sdong, ljin
Reviewed By: ljin
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D22443
Summary:
- New Uint64 comparator
- Modify Reader and Builder to take custom user comparators instead of bytewise comparator
- Modify logic for choosing unused user key in builder
- Modify iterator logic in reader
- test changes
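A comparator of this kind can be sketched as follows, assuming keys are exactly 8 bytes holding a uint64 in native byte order. The function names are hypothetical and this is not the comparator added by the diff.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>

// Encodes a uint64 as an 8-byte key in native byte order.
inline std::string EncodeKey(uint64_t v) {
  return std::string(reinterpret_cast<const char*>(&v), sizeof(v));
}

// Decodes an 8-byte key; memcpy avoids unaligned access.
inline uint64_t DecodeKey(const std::string& key) {
  assert(key.size() == sizeof(uint64_t));
  uint64_t v;
  std::memcpy(&v, key.data(), sizeof(v));
  return v;
}

// Three-way comparison by numeric value rather than by raw bytes.
inline int CompareUint64Keys(const std::string& a, const std::string& b) {
  const uint64_t x = DecodeKey(a);
  const uint64_t y = DecodeKey(b);
  return x < y ? -1 : (x > y ? 1 : 0);
}
```

Numeric and bytewise order can disagree for such keys (on a little-endian machine 1 and 256 compare differently byte by byte), which is why the reader and builder need to honor the user comparator.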
Test Plan:
cuckoo_table_{builder,reader,db}_test
make check all
Reviewers: ljin, sdong
Reviewed By: ljin
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D22377
Summary: also fix HISTORY.md
Test Plan: make all check
Reviewers: sdong, yhchiang, igor
Reviewed By: igor
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D22437
Summary: Add a virtual function in table factory that will print table options
Test Plan: make release
Reviewers: igor, yhchiang, sdong
Reviewed By: sdong
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D22149
Summary:
I will move compression related options in a separate diff since this
diff is already pretty lengthy.
I guess I will also need to change JNI accordingly :(
Test Plan: make all check
Reviewers: yhchiang, igor, sdong
Reviewed By: igor
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D21915
Summary:
ManifestDumpCommand::DoCommand was allocating a VersionSet and never
freeing it.
Test Plan: make
Reviewers: igor
Reviewed By: igor
Differential Revision: https://reviews.facebook.net/D22221
Summary: I was checking some functions in coding.h and coding.cc when I noticed these unused functions. Let's remove them.
Test Plan: compiles
Reviewers: sdong, ljin, yhchiang, dhruba
Reviewed By: dhruba
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D22077
Summary:
auto_roll_logger_test fails from time to time. I wasn't able to repro
the issue but by looking at the code, it seems like the initial ctime_
value can be set to the boundary of the second so it may still have a
chance to get rolled when interval is set to 1 second.
```
util/auto_roll_logger_test.cc:120: failed: 118 > 708
==19470== Syscall param msync(start) points to unaddressable byte(s)
==19470== at 0x4E46CE0: __msync_nocancel (in
/usr/local/fbcode/gcc-4.8.1-glibc-2.17/lib/libpthread-2.17.so)
==19470== by 0x584EFB: access_mem (Ginit.c:137)
==19470== by 0x5834E3: _ULx86_64_access_reg (libunwind_i.h:162)
==19470== by 0x585601: apply_reg_state (Gparser.c:742)
==19470== by 0x5866BE: _ULx86_64_dwarf_find_save_locs (Gparser.c:883)
==19470== by 0x584550: _ULx86_64_dwarf_step (Gstep.c:34)
==19470== by 0x583653: _ULx86_64_step (Gstep.c:71)
==19470== by 0x583FD2: _ULx86_64_tdep_trace (Gtrace.c:217)
==19470== by 0x5831C3: backtrace (backtrace.c:69)
```
Test Plan: ./auto_roll_logger_test
Reviewers: sdong, yhchiang, igor
Reviewed By: igor
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D21951
Summary: The prefix and postfix operators were mixed up in the autovector class.
Test Plan: Inspection
Reviewers: sdong, kailiu
Reviewed By: kailiu
Differential Revision: https://reviews.facebook.net/D21873
Summary: This is a linux-specific system call.
Test Plan: ran db_bench
Reviewers: igor, yhchiang, sdong
Reviewed By: sdong
Subscribers: haobo, leveldb
Differential Revision: https://reviews.facebook.net/D21183
Summary:
Was looking at an issue. All options are the same except
compaction_filter was missed from a newer package. Our option dump does
not capture that
Test Plan: make release
Reviewers: sdong, igor, yhchiang
Reviewed By: yhchiang
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D21765
Summary: Write the DB's MANIFEST, CURRENT, IDENTITY, sst files, and log files to the log before open
Test Plan: run db and check LOG file
Reviewers: ljin, yhchiang, igor, dhruba, sdong
Reviewed By: sdong
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D21459
Summary: as title
Test Plan: make all check
Reviewers: igor, yhchiang, sdong
Reviewed By: sdong
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D21201
* Script for building the unity.cc file via Makefile
* Unity executable Makefile target for testing builds
* Source code changes to fix compilation of unity build
Summary: Made some small changes to fix the broken mac build
Test Plan: make check all in both linux and mac. All tests pass.
Reviewers: sdong, igor, ljin, yhchiang
Reviewed By: ljin, yhchiang
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D20895
Summary:
We now read table properties in VersionSet::LogAndApply(), which requires options.db_paths to be set. Since ldb_cmd creates a VersionSet directly without initializing db_paths, this causes a seg fault. This patch fixes it by initializing db_paths.
log_and_apply_bench still segfaults, because the table cache is nullptr in the VersionSet it creates.
Test Plan: Run ldb dump_manifest which used to fail.
Reviewers: yhchiang, ljin, igor
Reviewed By: igor
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D20751
Summary: So that we can avoid calling NowSecs() in MakeRoomForWrite twice
Test Plan: make all check
Reviewers: yhchiang, igor, sdong
Reviewed By: sdong
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D20529
Summary:
Make StatisticsImpl able to forward stats to a provided statistics
implementation. The main purpose is to allow us to collect internal
stats in the future even when the user supplies a custom statistics
implementation. It avoids instrumenting 2 sets of stats collection code.
One immediate use case is the tuning advisor, which needs to collect some
internal stats that users may not be interested in.
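The forwarding idea can be sketched like this. The interface and class names are simplified stand-ins (StatisticsSketch, MapStatistics, ForwardingStatistics are hypothetical), with tickers reduced to integer ids.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <utility>

// Simplified stand-in for a statistics interface.
class StatisticsSketch {
 public:
  virtual ~StatisticsSketch() = default;
  virtual void RecordTick(uint32_t ticker, uint64_t count) = 0;
  virtual uint64_t GetTickerCount(uint32_t ticker) const = 0;
};

// Stand-in for a user-supplied implementation.
class MapStatistics : public StatisticsSketch {
 public:
  void RecordTick(uint32_t ticker, uint64_t count) override {
    counts_[ticker] += count;
  }
  uint64_t GetTickerCount(uint32_t ticker) const override {
    const auto it = counts_.find(ticker);
    return it == counts_.end() ? 0 : it->second;
  }

 private:
  std::map<uint32_t, uint64_t> counts_;
};

// Records every tick internally and forwards it to the user's object, if any,
// so call sites are instrumented only once.
class ForwardingStatistics : public StatisticsSketch {
 public:
  explicit ForwardingStatistics(std::shared_ptr<StatisticsSketch> user)
      : user_(std::move(user)) {}
  void RecordTick(uint32_t ticker, uint64_t count) override {
    internal_.RecordTick(ticker, count);
    if (user_) user_->RecordTick(ticker, count);
  }
  uint64_t GetTickerCount(uint32_t ticker) const override {
    return internal_.GetTickerCount(ticker);
  }

 private:
  std::shared_ptr<StatisticsSketch> user_;
  MapStatistics internal_;
};
```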
Test Plan:
ran db_bench and see stats show up at the end of run
Will run make all check since some tests rely on statistics
Reviewers: yhchiang, sdong, igor
Reviewed By: sdong
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D20145
Summary:
User gets an undefined error since the definition is not exposed.
Also re-enable the db test with only upper bound check
Test Plan: db_test, rate_limit_test
Reviewers: igor, yhchiang, sdong
Reviewed By: sdong
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D20403
Summary:
Fixed the following compile error by replacing pow by shift, as it computes
power of 2.
util/options_builder.cc:133:14: error: no member named 'pow' in namespace 'std'
std::pow(2, std::max(0, std::min(3, level0_stop_writes_trigger -
~~~~~^
1 error generated.
make: *** [util/options_builder.o] Error 1
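The replacement amounts to something like the sketch below. ClampedPowerOfTwo is a hypothetical helper; the clamp to [0, 3] mirrors the std::max/std::min bounds visible in the error output above.

```cpp
#include <algorithm>
#include <cassert>

// Computes 2^exponent with the exponent clamped to [0, 3], using a shift
// instead of std::pow so no <cmath> dependency is needed.
inline int ClampedPowerOfTwo(int exponent) {
  const int e = std::max(0, std::min(3, exponent));
  return 1 << e;
}
```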
Test Plan: make success in mac and linux
Reviewers: ljin, igor, sdong
Reviewed By: sdong
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D20475
Summary:
All public headers need to be under `include/rocksdb` directory. Otherwise, clients include our header files like this:
#include <rocksdb/db.h>
#include <utilities/backupable_db.h> // still our public header!
Also, internally, we include:
#include "utilities/backupable/backupable_db.h" // internal header
#include "utilities/backupable_db.h" // public header
which is confusing.
This way, when we install rocksdb as a system library, we can just copy `include/rocksdb` directory to system's header files. We can't really copy `utilities` directory to system's header files.
Test Plan: compiles
Reviewers: dhruba, ljin, yhchiang, sdong
Reviewed By: sdong
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D20409
Summary:
Add a function GetOptions() which, based on four parameters the user gives (read and write amplification thresholds, memory budget for mem tables, and target DB size), picks a compaction style and its parameters. Background threads are not touched yet.
One limit of this algorithm: since compression rate and key/value size are hard to predict, it's hard to predict level 0 file size from write buffer size. Simply assume a 1:1 ratio here.
Sample results: https://reviews.facebook.net/P477
Test Plan: Will add a unit test where some sample scenarios are given and check that they pick results that make sense
Reviewers: yhchiang, dhruba, haobo, igor, ljin
Reviewed By: ljin
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D18741
Summary:
Adding option to save PlainTable index and bloom filter in SST file.
If there is no bloom block and/or index block, PlainTableReader builds
new ones. Otherwise PlainTableReader just uses these blocks.
Test Plan: make all check
Reviewers: sdong
Reviewed By: sdong
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D19527
Summary:
This patch adds a target size parameter to options.db_paths, and universal compaction will use it to determine which DB path to place a new file in.
Level-style stays the same.
Test Plan: Add new unit tests
Reviewers: ljin, yhchiang
Reviewed By: yhchiang
Subscribers: MarkCallaghan, dhruba, igor, leveldb
Differential Revision: https://reviews.facebook.net/D19869
Summary: Browsing through the code, looks like StatsLogger is not used at all!
Test Plan: compiles
Reviewers: ljin, sdong, yhchiang, dhruba
Reviewed By: dhruba
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D19827
Summary: Add a function to return the perf level. It is to allow a wrapper of DB to increase the perf level and restore the original perf level after finishing the function call.
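The save-and-restore use case can be sketched with a scoped guard. To stay self-contained, the level enum and the get/set functions below are local stand-ins with hypothetical names, not the library's declarations.

```cpp
#include <cassert>

// Local stand-ins for a per-thread perf level and its accessors.
enum class PerfLevelSketch { kDisable, kEnableCount, kEnableTime };

inline PerfLevelSketch& CurrentPerfLevel() {
  thread_local PerfLevelSketch level = PerfLevelSketch::kEnableCount;
  return level;
}
inline PerfLevelSketch GetPerfLevelSketch() { return CurrentPerfLevel(); }
inline void SetPerfLevelSketch(PerfLevelSketch l) { CurrentPerfLevel() = l; }

// Raises the perf level for one scope and restores the caller's level on
// exit, which is only possible because the current level can be read.
class ScopedPerfLevel {
 public:
  explicit ScopedPerfLevel(PerfLevelSketch level)
      : saved_(GetPerfLevelSketch()) {
    SetPerfLevelSketch(level);
  }
  ~ScopedPerfLevel() { SetPerfLevelSketch(saved_); }
  ScopedPerfLevel(const ScopedPerfLevel&) = delete;
  ScopedPerfLevel& operator=(const ScopedPerfLevel&) = delete;

 private:
  PerfLevelSketch saved_;
};
```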
Test Plan: Add a verification in db_test
Reviewers: yhchiang, igor, ljin
Reviewed By: ljin
Subscribers: xjin, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D19551
Summary:
Add option and plugin rate limiter for PosixWritableFile. The rate
limiter only applies to flush and compaction. WAL and MANIFEST are
excluded from this enforcement.
Test Plan: db_test
Reviewers: igor, yhchiang, sdong
Reviewed By: sdong
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D19425
Summary:
A generic rate limiter that can be shared by threads and rocksdb
instances. Will use this to smooth out write traffic generated by
compaction and flush. This will help us get better p99 behavior on flash
storage.
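One common way to build such a limiter is a token bucket, sketched below. This is a swapped-in textbook technique for illustration, not necessarily the algorithm in this diff; the class name and parameters are hypothetical, and time is passed in explicitly to keep the example deterministic.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <mutex>

// Token bucket: budget accrues at bytes_per_sec up to burst_bytes. The mutex
// makes one instance shareable across threads.
class TokenBucketSketch {
 public:
  TokenBucketSketch(uint64_t bytes_per_sec, uint64_t burst_bytes)
      : rate_(bytes_per_sec),
        burst_(burst_bytes),
        available_(burst_bytes),
        last_micros_(0) {}

  // Consumes `bytes` and returns true if the budget at now_micros allows it;
  // otherwise leaves the budget untouched and returns false.
  bool TryRequest(uint64_t bytes, uint64_t now_micros) {
    std::lock_guard<std::mutex> lock(mu_);
    Refill(now_micros);
    if (bytes > available_) return false;
    available_ -= bytes;
    return true;
  }

 private:
  void Refill(uint64_t now_micros) {
    if (now_micros <= last_micros_) return;
    const uint64_t earned = (now_micros - last_micros_) * rate_ / 1000000;
    if (earned == 0) return;  // too little time passed; keep accumulating
    available_ = std::min(burst_, available_ + earned);
    last_micros_ = now_micros;
  }

  std::mutex mu_;
  const uint64_t rate_;
  const uint64_t burst_;
  uint64_t available_;
  uint64_t last_micros_;
};
```

A caller that gets false would wait and retry, which is what smooths out the write traffic.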
Test Plan:
unit test output
==== Test RateLimiterTest.Rate
request size [1 - 1023], limit 10 KB/sec, actual rate: 10.374969 KB/sec, elapsed 2002265
request size [1 - 2047], limit 20 KB/sec, actual rate: 20.771242 KB/sec, elapsed 2002139
request size [1 - 4095], limit 40 KB/sec, actual rate: 41.285299 KB/sec, elapsed 2202424
request size [1 - 8191], limit 80 KB/sec, actual rate: 81.371605 KB/sec, elapsed 2402558
request size [1 - 16383], limit 160 KB/sec, actual rate: 162.541268 KB/sec, elapsed 3303500
Reviewers: yhchiang, igor, sdong
Reviewed By: sdong
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D19359
Summary:
This diff allows the I/O stats about Flush and Compaction to be reported
in a more accurate way. Instead of measuring the size of a file, it
measures I/O cost on a per-read/write basis.
Test Plan: make all check
Reviewers: sdong, igor, ljin
Reviewed By: ljin
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D19383
Summary:
This diff adds timeout_hint_us to WriteOptions. If it's non-zero, then
1) writes associated with this option MAY be aborted when they have been
waiting for longer than the specified time. If a write is aborted,
it will return Status::TimedOut.
2) the stall time of the associated write caused by flush or compaction
will be limited by timeout_hint_us.
The default value of timeout_hint_us is 0 (i.e., OFF).
The statistics of timed-out writes will be recorded in WRITE_TIMEDOUT.
Test Plan:
export ROCKSDB_TESTS=WriteTimeoutAndDelayTest
make db_test
./db_test
Reviewers: igor, ljin, haobo, sdong
Reviewed By: sdong
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D18837