Summary:
BlockBasedTable sst file size can grow to a large size when universal
compaction is used. When index block exceeds 2G, pread seems to fail and
return truncated data and causes "trucated block" error. I tried to use
```
#define _FILE_OFFSET_BITS 64
```
But the problem still persists. Splitting a big write/read into smaller
batches seems to solve the problem.
Test Plan:
successfully compacted a case with resulting sst file at ~90G (2.1G
index block size)
Reviewers: yhchiang, igor, sdong
Reviewed By: sdong
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D22569
Summary: No __thread for ios.
Test Plan: compile works for ios now
Reviewers: ljin, dhruba
Reviewed By: dhruba
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D22491
Summary: This assert makes Insert O(n^2) instead of O(n) in debug mode. Memtable insert is in the critical path. No need to assert uniqunnes of the key here, since we're adding a sequence number to it anyway.
Test Plan: none
Reviewers: sdong, ljin
Reviewed By: ljin
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D22443
Summary:
- New Uint64 comparator
- Modify Reader and Builder to take custom user comparators instead of bytewise comparator
- Modify logic for choosing unused user key in builder
- Modify iterator logic in reader
- test changes
Test Plan:
cuckoo_table_{builder,reader,db}_test
make check all
Reviewers: ljin, sdong
Reviewed By: ljin
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D22377
Summary: also fix HISTORY.md
Test Plan: make all check
Reviewers: sdong, yhchiang, igor
Reviewed By: igor
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D22437
Summary: Add a virtual function in table factory that will print table options
Test Plan: make release
Reviewers: igor, yhchiang, sdong
Reviewed By: sdong
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D22149
Summary:
I will move compression related options in a separate diff since this
diff is already pretty lengthy.
I guess I will also need to change JNI accordingly :(
Test Plan: make all check
Reviewers: yhchiang, igor, sdong
Reviewed By: igor
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D21915
Summary:
ManifestDumpCommand::DoCommand was allocating a VersionSet and never
freeing it.
Test Plan: make
Reviewers: igor
Reviewed By: igor
Differential Revision: https://reviews.facebook.net/D22221
Summary: I was checking some functions in coding.h and coding.cc when I noticed these unused functions. Let's remove them.
Test Plan: compiles
Reviewers: sdong, ljin, yhchiang, dhruba
Reviewed By: dhruba
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D22077
Summary:
auto_roll_logger_test fails from time to time. I wasn't able to repro
the issue but by looking at the code, it seems like the initial ctime_
value can be set to the boundary of the second so it may still have a
chance to get rolled when interval is set to 1 second.
```
util/auto_roll_logger_test.cc:120: failed: 118 > 708
==19470== Syscall param msync(start) points to unaddressable byte(s)
==19470== at 0x4E46CE0: __msync_nocancel (in
/usr/local/fbcode/gcc-4.8.1-glibc-2.17/lib/libpthread-2.17.so)
==19470== by 0x584EFB: access_mem (Ginit.c:137)
==19470== by 0x5834E3: _ULx86_64_access_reg (libunwind_i.h:162)
==19470== by 0x585601: apply_reg_state (Gparser.c:742)
==19470== by 0x5866BE: _ULx86_64_dwarf_find_save_locs (Gparser.c:883)
==19470== by 0x584550: _ULx86_64_dwarf_step (Gstep.c:34)
==19470== by 0x583653: _ULx86_64_step (Gstep.c:71)
==19470== by 0x583FD2: _ULx86_64_tdep_trace (Gtrace.c:217)
==19470== by 0x5831C3: backtrace (backtrace.c:69)
Test Plan: ./auto_roll_logger_test
Reviewers: sdong, yhchiang, igor
Reviewed By: igor
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D21951
Summary: The prefix and postfix operators were mixed up in the autovector class.
Test Plan: Inspection
Reviewers: sdong, kailiu
Reviewed By: kailiu
Differential Revision: https://reviews.facebook.net/D21873
Summary: This is a linux-specific system call.
Test Plan: ran db_bench
Reviewers: igor, yhchiang, sdong
Reviewed By: sdong
Subscribers: haobo, leveldb
Differential Revision: https://reviews.facebook.net/D21183
Summary:
Was looking at an issue. All options are the same except
compaction_filter was missed from a newer package. Our option dump does
not capture that
Test Plan: make release
Reviewers: sdong, igor, yhchiang
Reviewed By: yhchiang
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D21765
Summary: 1. write db MANIFEST, CURRENT, IDENTITY, sst files, log files to log before open
Test Plan: run db and check LOG file
Reviewers: ljin, yhchiang, igor, dhruba, sdong
Reviewed By: sdong
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D21459
Summary: as title
Test Plan: make all check
Reviewers: igor, yhchiang, sdong
Reviewed By: sdong
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D21201
* Script for building the unity.cc file via Makefile
* Unity executable Makefile target for testing builds
* Source code changes to fix compilation of unity build
Summary: Made some small changes to fix the broken mac build
Test Plan: make check all in both linux and mac. All tests pass.
Reviewers: sdong, igor, ljin, yhchiang
Reviewed By: ljin, yhchiang
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D20895
Summary:
We now reads table properties in VersionSet::LogAndApply(), which requires options.db_paths to be set. But since ldb_cmd directly creates VersionSet without initialization db_paths, causing a seg fault. This patch fix it by initializing db_paths.
log_and_apply_bench still shows segfault, because table cache is nullptr in VersionSet created.
Test Plan: Run ldb dump_manifest which used to fail.
Reviewers: yhchiang, ljin, igor
Reviewed By: igor
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D20751
Summary: So that we can avoid calling NowSecs() in MakeRoomForWrite twice
Test Plan: make all check
Reviewers: yhchiang, igor, sdong
Reviewed By: sdong
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D20529
Summary:
Make StatisticsImpl being able to forward stats to provided statistics
implementation. The main purpose is to allow us to collect internal
stats in the future even when user supplies custom statistics
implementation. It avoids intrumenting 2 sets of stats collection code.
One immediate use case is tuning advisor, which needs to collect some
internal stats, users may not be interested.
Test Plan:
ran db_bench and see stats show up at the end of run
Will run make all check since some tests rely on statistics
Reviewers: yhchiang, sdong, igor
Reviewed By: sdong
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D20145
Summary:
Make StatisticsImpl being able to forward stats to provided statistics
implementation. The main purpose is to allow us to collect internal
stats in the future even when user supplies custom statistics
implementation. It avoids intrumenting 2 sets of stats collection code.
One immediate use case is tuning advisor, which needs to collect some
internal stats, users may not be interested.
Test Plan:
ran db_bench and see stats show up at the end of run
Will run make all check since some tests rely on statistics
Reviewers: yhchiang, sdong, igor
Reviewed By: sdong
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D20145
Summary:
User gets undefinied error since the definition is not exposed.
Also re-enable the db test with only upper bound check
Test Plan: db_test, rate_limit_test
Reviewers: igor, yhchiang, sdong
Reviewed By: sdong
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D20403
Summary:
Fixed the following compile error by replacing pow by shift, as it computes
power of 2.
util/options_builder.cc:133:14: error: no member named 'pow' in namespace 'std'
std::pow(2, std::max(0, std::min(3, level0_stop_writes_trigger -
~~~~~^
1 error generated.
make: *** [util/options_builder.o] Error 1
Test Plan: make success in mac and linux
Reviewers: ljin, igor, sdong
Reviewed By: sdong
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D20475
Summary:
All public headers need to be under `include/rocksdb` directory. Otherwise, clients include our header files like this:
#include <rocksdb/db.h>
#include <utilities/backupable_db.h> // still our public header!
Also, internally, we include:
#include "utilities/backupable/backupable_db.h" // internal header
#include "utilities/backupable_db.h" // public header
which is confusing.
This way, when we install rocksdb as a system library, we can just copy `include/rocksdb` directory to system's header files. We can't really copy `utilities` directory to system's header files.
Test Plan: compiles
Reviewers: dhruba, ljin, yhchiang, sdong
Reviewed By: sdong
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D20409
Summary:
Add a function GetOptions(), where based on four parameters users give: read/write amplification threshold, memory budget for mem tables and target DB size, it picks up a compaction style and parameters for them. Background threads are not touched yet.
One limit of this algorithm: since compression rate and key/value size are hard to predict, it's hard to predict level 0 file size from write buffer size. Simply make 1:1 ratio here.
Sample results: https://reviews.facebook.net/P477
Test Plan: Will add some a unit test where some sample scenarios are given and see they pick the results that make sense
Reviewers: yhchiang, dhruba, haobo, igor, ljin
Reviewed By: ljin
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D18741
Summary:
Adding option to save PlainTable index and bloom filter in SST file.
If there is no bloom block and/or index block, PlainTableReader builds
new ones. Otherwise PlainTableReader just use these blocks.
Test Plan: make all check
Reviewers: sdong
Reviewed By: sdong
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D19527
Summary:
This patch adds a target size parameter in options.db_paths and universal compaction will base it to determine which DB path to place a new file.
Level-style stays the same.
Test Plan: Add new unit tests
Reviewers: ljin, yhchiang
Reviewed By: yhchiang
Subscribers: MarkCallaghan, dhruba, igor, leveldb
Differential Revision: https://reviews.facebook.net/D19869
Summary: Browsing through the code, looks like StatsLogger is not used at all!
Test Plan: compiles
Reviewers: ljin, sdong, yhchiang, dhruba
Reviewed By: dhruba
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D19827
Summary: Add a function to return the perf level. It is to allow a wrapper of DB to increase the perf level and restore the original perf level after finishing the function call.
Test Plan: Add a verification in db_test
Reviewers: yhchiang, igor, ljin
Reviewed By: ljin
Subscribers: xjin, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D19551
Summary:
Add option and plugin rate limiter for PosixWritableFile. The rate
limiter only applies to flush and compaction. WAL and MANIFEST are
excluded from this enforcement.
Test Plan: db_test
Reviewers: igor, yhchiang, sdong
Reviewed By: sdong
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D19425
Summary:
A generic rate limiter that can be shared by threads and rocksdb
instances. Will use this to smooth out write traffic generated by
compaction and flush. This will help us get better p99 behavior on flash
storage.
Test Plan:
unit test output
==== Test RateLimiterTest.Rate
request size [1 - 1023], limit 10 KB/sec, actual rate: 10.374969 KB/sec, elapsed 2002265
request size [1 - 2047], limit 20 KB/sec, actual rate: 20.771242 KB/sec, elapsed 2002139
request size [1 - 4095], limit 40 KB/sec, actual rate: 41.285299 KB/sec, elapsed 2202424
request size [1 - 8191], limit 80 KB/sec, actual rate: 81.371605 KB/sec, elapsed 2402558
request size [1 - 16383], limit 160 KB/sec, actual rate: 162.541268 KB/sec, elapsed 3303500
Reviewers: yhchiang, igor, sdong
Reviewed By: sdong
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D19359
Summary:
This diff allows the I/O stats about Flush and Compaction to be reported
in a more accurate way. Instead of measuring the size of a file, it
measure I/O cost in per read / write basis.
Test Plan: make all check
Reviewers: sdong, igor, ljin
Reviewed By: ljin
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D19383
Summary:
This diff adds timeout_hint_us to WriteOptions. If it's non-zero, then
1) writes associated with this options MAY be aborted when it has been
waiting for longer than the specified time. If an abortion happens,
associated writes will return Status::TimeOut.
2) the stall time of the associated write caused by flush or compaction
will be limited by timeout_hint_us.
The default value of timeout_hint_us is 0 (i.e., OFF.)
The statistics of timeout writes will be recorded in WRITE_TIMEDOUT.
Test Plan:
export ROCKSDB_TESTS=WriteTimeoutAndDelayTest
make db_test
./db_test
Reviewers: igor, ljin, haobo, sdong
Reviewed By: sdong
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D18837
Summary:
In this patch, we allow RocksDB to support multiple DB paths internally.
No user interface is supported yet so this patch is silent to users.
Test Plan: make all check
Reviewers: igor, haobo, ljin, yhchiang
Reviewed By: yhchiang
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D18921
Summary:
In this patch, we enhance HashLinkList memtable to reduce performance outliers when a bucket contains too many entries. We switch to skip list for this case to enable binary search.
Add threshold_use_skiplist parameter to determine when a bucket needs to switch to skip list.
The new data structure is documented in comments in the codes.
Test Plan:
make all check
set threshold_use_skiplist in several tests
Reviewers: yhchiang, haobo, ljin
Reviewed By: yhchiang, ljin
Subscribers: nkg-, xjin, dhruba, yhchiang, leveldb
Differential Revision: https://reviews.facebook.net/D19299
Summary:
Bloomfilter and hashskiplist's buckets_ allocated by memtable's arena
DynamicBloom: pass arena via constructor, allocate space in SetTotalBits
HashSkipListRep: allocate space of buckets_ using arena.
do not delete it in deconstructor because arena would take care of it.
Several test files are changed.
Test Plan:
make all check
Reviewers: ljin, haobo, yhchiang, sdong
Reviewed By: sdong
Subscribers: igor, dhruba
Differential Revision: https://reviews.facebook.net/D19335
Summary:
Fixed the following warning:
util/options.cc: In constructor ‘rocksdb::ColumnFamilyOptions::ColumnFamilyOptions(const rocksdb::Options&)’:
util/options.cc:157:58: error: comparison between signed and unsigned integer expressions [-Werror=sign-compare]
if (max_bytes_for_level_multiplier_additional.size() < num_levels) {
^
Test Plan: make all check
Reviewers: haobo, sdong, ljin
Reviewed By: ljin
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D19293
Summary: It seems to me that when ever function MemTableRep::GetIterator(const Slice& slice) is used, we can use MemTableRep::GetDynamicPrefixIterator() instead. Just delete it to simplify the codes.
Test Plan: make all check
Reviewers: yhchiang, ljin
Reviewed By: ljin
Subscribers: xjin, dhruba, haobo, leveldb
Differential Revision: https://reviews.facebook.net/D19281
Summary:
Currently, when num_levels has been changed to > 7, internally
it will not resize max_bytes_for_level_multiplier_additional.
As a result, max_bytes_for_level_multiplier_additional.size() will
be smaller than num_levels, which causes heap-buffer-overflow.
Test Plan: make all check
Reviewers: haobo, sdong, ljin
Reviewed By: ljin
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D19275
Summary:
Revert the default setting of InitFromCmdLineArgs() as all the callers
currently provide full set of arguments.
Test Plan:
make reduce_levels_test
./reduce_levels_test
Reviewers: haobo, ljin
Reviewed By: ljin
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D19257
Summary:
Fixed the following compile error.
tools/reduce_levels_test.cc:89:31: error: no matching function for call to 'InitFromCmdLineArgs'
LDBCommand* level_reducer = LDBCommand::InitFromCmdLineArgs(args);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./util/ldb_cmd.h:56:22: note: candidate function not viable: requires 3 arguments, but 1 was provided
static LDBCommand* InitFromCmdLineArgs(
^
./util/ldb_cmd.h:62:22: note: candidate function not viable: requires 4 arguments, but 1 was provided
static LDBCommand* InitFromCmdLineArgs(
^
1 error generated.
Test Plan:
make reduce_levels_test
./reduce_levels_test
Reviewers: haobo, ljin, sdong
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D19251
Summary: Currently ldb tool dump keys either in ascii format or hex format - neither is ideal if the key has a binary structure and is not readable in ascii. This diff also allows LDB tool to be customized in ways beyond DB options.
Test Plan: verify that key formatter works with some simple db with binary key.
Reviewers: sdong, ljin
Reviewed By: ljin
Subscribers: leveldb
Differential Revision: https://reviews.facebook.net/D19209