Commit Graph

598 Commits

Author SHA1 Message Date
sdong
c4d32b9336 Add some include<functional> 2019-10-31 14:54:56 -07:00
Igor Canadi
7f19bb93c6 Merge pull request #242 from tdfischer/perf-timer-destructors
Refactor PerfStepTimer to automatically stop on destruct
2014-09-02 13:06:40 -07:00
Torrie Fischer
6614a48418 Refactor PerfStepTimer to stop on destruct
This eliminates the need to remember to call PERF_TIMER_STOP when a section has
been timed. This allows more useful design with the perf timers and enables
possible return value optimizations. Simplistic example:

class Foo {
  public:
    Foo(int v) : m_v(v);
  private:
    int m_v;
}

Foo makeFrobbedFoo(int *errno)
{
  *errno = 0;
  return Foo();
}

Foo bar(int *errno)
{
  PERF_TIMER_GUARD(some_timer);

  return makeFrobbedFoo(errno);
}

int main(int argc, char[] argv)
{
  Foo f;
  int errno;

  f = bar(&errno);

  if (errno)
    return -1;
  return 0;
}

After bar() is called, perf_context.some_timer would be incremented as if
Stop(&perf_context.some_timer) was called at the end, and the compiler is still
able to produce optimizations on the return value from makeFrobbedFoo() through
to main().
2014-09-02 12:04:22 -07:00
Lei Jin
7e9f28cb23 limit max bytes that can be read/written per pread/write syscall
Summary:
BlockBasedTable sst file size can grow to a large size when universal
compaction is used. When index block exceeds 2G, pread seems to fail and
return truncated data and causes "trucated block" error. I tried to use
```
  #define _FILE_OFFSET_BITS 64
```
But the problem still persists. Splitting a big write/read into smaller
batches seems to solve the problem.

Test Plan:
successfully compacted a case with resulting sst file at ~90G (2.1G
index block size)

Reviewers: yhchiang, igor, sdong

Reviewed By: sdong

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D22569
2014-08-29 21:21:49 -07:00
Igor Canadi
d5bd6c772b Fix ios compile
Summary: No __thread for ios.

Test Plan: compile works for ios now

Reviewers: ljin, dhruba

Reviewed By: dhruba

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D22491
2014-08-28 12:46:05 -04:00
Igor Canadi
536e9973e3 Remove assert in vector rep
Summary: This assert makes Insert O(n^2) instead of O(n) in debug mode. Memtable insert is in the critical path. No need to assert uniqunnes of the key here, since we're adding a sequence number to it anyway.

Test Plan: none

Reviewers: sdong, ljin

Reviewed By: ljin

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D22443
2014-08-27 11:05:41 -07:00
Radheshyam Balasundaram
4142a3e783 Adding a user comparator for comparing Uint64 slices.
Summary:
- New Uint64 comparator
- Modify Reader and Builder to take custom user comparators instead of bytewise comparator
- Modify logic for choosing unused user key in builder
- Modify iterator logic in reader
- test changes

Test Plan:
cuckoo_table_{builder,reader,db}_test
make check all

Reviewers: ljin, sdong

Reviewed By: ljin

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D22377
2014-08-27 10:39:31 -07:00
Lei Jin
1755581f19 improve OptimizeForPointLookup()
Summary: also fix HISTORY.md

Test Plan: make all check

Reviewers: sdong, yhchiang, igor

Reviewed By: igor

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D22437
2014-08-26 14:15:00 -07:00
Igor Canadi
d9c0785812 Fix assertion in PosixRandomAccessFile
Summary:
See https://github.com/facebook/rocksdb/issues/244#issuecomment-53372297
Also see this: https://github.com/facebook/rocksdb/blob/master/util/env_posix.cc#L1075

Test Plan: compiles

Reviewers: yhchiang, ljin, sdong

Reviewed By: ljin

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D22419
2014-08-26 15:28:36 -04:00
Lei Jin
a98badff16 print table options
Summary: Add a virtual function in table factory that will print table options

Test Plan: make release

Reviewers: igor, yhchiang, sdong

Reviewed By: sdong

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D22149
2014-08-25 14:24:09 -07:00
Lei Jin
384400128f move block based table related options BlockBasedTableOptions
Summary:
I will move compression related options in a separate diff since this
diff is already pretty lengthy.
I guess I will also need to change JNI accordingly :(

Test Plan: make all check

Reviewers: yhchiang, igor, sdong

Reviewed By: igor

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D21915
2014-08-25 14:22:05 -07:00
Jonah Cohen
e173bf9c19 Eliminate VersionSet memory leak
Summary:
ManifestDumpCommand::DoCommand was allocating a VersionSet and never
freeing it.

Test Plan: make

Reviewers: igor

Reviewed By: igor

Differential Revision: https://reviews.facebook.net/D22221
2014-08-20 13:52:03 -07:00
Igor Canadi
6929b08616 Remove BitStream* tests 2014-08-19 09:52:54 -04:00
Igor Canadi
50b790c6d4 Removing BitStream* functions
Summary: I was checking some functions in coding.h and coding.cc when I noticed these unused functions. Let's remove them.

Test Plan: compiles

Reviewers: sdong, ljin, yhchiang, dhruba

Reviewed By: dhruba

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D22077
2014-08-19 06:48:21 -07:00
Lei Jin
c7037153c3 attempt to fix auto_roll_logger_test
Summary:
auto_roll_logger_test fails from time to time. I wasn't able to repro
the issue but by looking at the code, it seems like the initial ctime_
value can be set to the boundary of the second so it may still have a
chance to get rolled when interval is set to 1 second.

```
util/auto_roll_logger_test.cc:120: failed: 118 > 708
==19470== Syscall param msync(start) points to unaddressable byte(s)
==19470==    at 0x4E46CE0: __msync_nocancel (in
/usr/local/fbcode/gcc-4.8.1-glibc-2.17/lib/libpthread-2.17.so)
==19470==    by 0x584EFB: access_mem (Ginit.c:137)
==19470==    by 0x5834E3: _ULx86_64_access_reg (libunwind_i.h:162)
==19470==    by 0x585601: apply_reg_state (Gparser.c:742)
==19470==    by 0x5866BE: _ULx86_64_dwarf_find_save_locs (Gparser.c:883)
==19470==    by 0x584550: _ULx86_64_dwarf_step (Gstep.c:34)
==19470==    by 0x583653: _ULx86_64_step (Gstep.c:71)
==19470==    by 0x583FD2: _ULx86_64_tdep_trace (Gtrace.c:217)
==19470==    by 0x5831C3: backtrace (backtrace.c:69)

Test Plan: ./auto_roll_logger_test

Reviewers: sdong, yhchiang, igor

Reviewed By: igor

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D21951
2014-08-18 10:23:18 -07:00
Lei Jin
625b9ef4e0 Merge pull request #234 from bbiao/master
fix compile error under Mac OS X
2014-08-14 20:34:48 -07:00
Jonah Cohen
59a2763d5c Fix typo huage => huge
Test Plan: Inspection

Reviewers: sdong

Reviewed By: sdong

Differential Revision: https://reviews.facebook.net/D21891
2014-08-14 17:01:20 -07:00
Jonah Cohen
f611935e9a Fix autovector iterator increment/decrement comments
Summary: The prefix and postfix operators were mixed up in the autovector class.

Test Plan: Inspection

Reviewers: sdong, kailiu

Reviewed By: kailiu

Differential Revision: https://reviews.facebook.net/D21873
2014-08-14 14:56:11 -07:00
ZHANG Biao
8dfe2fdd51 fix compile error under Mac OS X 2014-08-14 20:01:01 +08:00
Lei Jin
58c49466d2 Allow env_posix to lower background thread IO priority
Summary: This is a linux-specific system call.

Test Plan: ran db_bench

Reviewers: igor, yhchiang, sdong

Reviewed By: sdong

Subscribers: haobo, leveldb

Differential Revision: https://reviews.facebook.net/D21183
2014-08-13 20:49:58 -07:00
Feng Zhu
6a2be31f14 fix_valgrind_error_caused_in_db_info_dummper
Summary: 1. add default value to FileType type, thus avoid valgrind error

Test Plan: valgrind ./db_test

Reviewers: sdong

Reviewed By: sdong

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D21807
2014-08-13 17:53:43 -07:00
Lei Jin
e91ebf1399 print compaction_filter name in Options.Dump
Summary:
Was looking at an issue. All options are the same except
compaction_filter was missed from a newer package. Our option dump does
not capture that

Test Plan: make release

Reviewers: sdong, igor, yhchiang

Reviewed By: yhchiang

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D21765
2014-08-13 15:58:31 -07:00
Feng Zhu
5e642403a9 log db path info before open
Summary: 1. write db MANIFEST, CURRENT, IDENTITY, sst files, log files to log before open

Test Plan: run db and check LOG file

Reviewers: ljin, yhchiang, igor, dhruba, sdong

Reviewed By: sdong

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D21459
2014-08-13 13:45:13 -07:00
Lei Jin
5d0074c471 set bytes_per_sync to 1MB if rate limiter is enabled
Summary: as title

Test Plan: make all check

Reviewers: igor, yhchiang, sdong

Reviewed By: sdong

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D21201
2014-08-12 16:42:18 -07:00
miguelportilla
93e6b5e9d9 Changes to support unity build:
* Script for building the unity.cc file via Makefile
* Unity executable Makefile target for testing builds
* Source code changes to fix compilation of unity build
2014-08-11 13:22:47 -04:00
Radheshyam Balasundaram
0c9d03ba10 Fixing broken Mac build
Summary: Made some small changes to fix the broken mac build

Test Plan: make check all in both linux and mac. All tests pass.

Reviewers: sdong, igor, ljin, yhchiang

Reviewed By: ljin, yhchiang

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D20895
2014-07-31 20:52:13 -07:00
sdong
473e829784 Fix ldb dump_manifest
Summary:
We now reads table properties in VersionSet::LogAndApply(), which requires options.db_paths to be set. But since ldb_cmd directly creates VersionSet without initialization db_paths, causing a seg fault. This patch fix it by initializing db_paths.

log_and_apply_bench still shows segfault, because table cache is nullptr in VersionSet created.

Test Plan: Run ldb dump_manifest which used to fail.

Reviewers: yhchiang, ljin, igor

Reviewed By: igor

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D20751
2014-07-30 10:17:48 -07:00
Lei Jin
1bd3431f7c Change StopWatch interface
Summary: So that we can avoid calling NowSecs() in MakeRoomForWrite twice

Test Plan: make all check

Reviewers: yhchiang, igor, sdong

Reviewed By: sdong

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D20529
2014-07-28 12:22:37 -07:00
Lei Jin
f6ca226c17 make statistics forward-able
Summary:
Make StatisticsImpl being able to forward stats to provided statistics
implementation. The main purpose is to allow us to collect internal
stats in the future even when user supplies custom statistics
implementation. It avoids intrumenting 2 sets of stats collection code.
One immediate use case is tuning advisor, which needs to collect some
internal stats, users may not be interested.

Test Plan:
ran db_bench and see stats show up at the end of run
Will run make all check since some tests rely on statistics

Reviewers: yhchiang, sdong, igor

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D20145
2014-07-28 12:10:49 -07:00
Lei Jin
40fa8a4cd5 make statistics forward-able
Summary:
Make StatisticsImpl being able to forward stats to provided statistics
implementation. The main purpose is to allow us to collect internal
stats in the future even when user supplies custom statistics
implementation. It avoids intrumenting 2 sets of stats collection code.
One immediate use case is tuning advisor, which needs to collect some
internal stats, users may not be interested.

Test Plan:
ran db_bench and see stats show up at the end of run
Will run make all check since some tests rely on statistics

Reviewers: yhchiang, sdong, igor

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D20145
2014-07-28 12:05:36 -07:00
Lei Jin
d650612c4c expose RateLimiter definition
Summary:
User gets undefinied error since the definition is not exposed.
Also re-enable the db test with only upper bound check

Test Plan: db_test, rate_limit_test

Reviewers: igor, yhchiang, sdong

Reviewed By: sdong

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D20403
2014-07-25 15:17:06 -07:00
Yueh-Hsuan Chiang
00f56dfa28 Fixed a compile error in util/options_builder.cc
Summary:
Fixed the following compile error by replacing pow by shift, as it computes
power of 2.

util/options_builder.cc:133:14: error: no member named 'pow' in namespace 'std'
        std::pow(2, std::max(0, std::min(3, level0_stop_writes_trigger -
        ~~~~~^
1 error generated.
make: *** [util/options_builder.o] Error 1

Test Plan: make success in mac and linux

Reviewers: ljin, igor, sdong

Reviewed By: sdong

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D20475
2014-07-23 10:31:32 -07:00
Igor Canadi
0ff183a0d9 Move include/utilities/*.h to include/rocksdb/utilities/*.h
Summary:
All public headers need to be under `include/rocksdb` directory. Otherwise, clients include our header files like this:

    #include <rocksdb/db.h>
    #include <utilities/backupable_db.h> // still our public header!

Also, internally, we include:

    #include "utilities/backupable/backupable_db.h" // internal header
    #include "utilities/backupable_db.h" // public header

which is confusing.

This way, when we install rocksdb as a system library, we can just copy `include/rocksdb` directory to system's header files. We can't really copy `utilities` directory to system's header files.

Test Plan: compiles

Reviewers: dhruba, ljin, yhchiang, sdong

Reviewed By: sdong

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D20409
2014-07-23 10:21:38 -04:00
sdong
e6de02103a Add a utility function to guess optimized options based on constraints
Summary:
Add a function GetOptions(), where based on four parameters users give: read/write amplification threshold, memory budget for mem tables and target DB size, it picks up a compaction style and parameters for them. Background threads are not touched yet.

One limit of this algorithm: since compression rate and key/value size are hard to predict, it's hard to predict level 0 file size from write buffer size. Simply make 1:1 ratio here.

Sample results: https://reviews.facebook.net/P477

Test Plan: Will add some a unit test where some sample scenarios are given and see they pick the results that make sense

Reviewers: yhchiang, dhruba, haobo, igor, ljin

Reviewed By: ljin

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D18741
2014-07-22 15:24:21 -07:00
Stanislau Hlebik
9d70cce047 Adding option to save PlainTable index and bloom filter in SST file.
Summary:
Adding option to save PlainTable index and bloom filter in SST file.
If there is no bloom block and/or index block, PlainTableReader builds
new ones. Otherwise PlainTableReader just use these blocks.

Test Plan: make all check

Reviewers: sdong

Reviewed By: sdong

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D19527
2014-07-18 16:58:13 -07:00
sdong
0abaed2e08 Support multiple DB directories in universal compaction style
Summary:
This patch adds a target size parameter in options.db_paths and universal compaction will base it to determine which DB path to place a new file.
Level-style stays the same.

Test Plan: Add new unit tests

Reviewers: ljin, yhchiang

Reviewed By: yhchiang

Subscribers: MarkCallaghan, dhruba, igor, leveldb

Differential Revision: https://reviews.facebook.net/D19869
2014-07-15 12:06:28 -07:00
Igor Canadi
20c056306b Remove stats logger
Summary: Browsing through the code, looks like StatsLogger is not used at all!

Test Plan: compiles

Reviewers: ljin, sdong, yhchiang, dhruba

Reviewed By: dhruba

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D19827
2014-07-15 09:16:32 -04:00
Igor Canadi
5ff6633588 Fix mac compile 2014-07-10 13:19:47 -07:00
sdong
36de0e5359 Add a function to return current perf level
Summary: Add a function to return the perf level. It is to allow a wrapper of DB to increase the perf level and restore the original perf level after finishing the function call.

Test Plan: Add a verification in db_test

Reviewers: yhchiang, igor, ljin

Reviewed By: ljin

Subscribers: xjin, dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D19551
2014-07-10 11:35:48 -07:00
Lei Jin
534357ca3a integrate rate limiter into rocksdb
Summary:
Add option and plugin rate limiter for PosixWritableFile. The rate
limiter only applies to flush and compaction. WAL and MANIFEST are
excluded from this enforcement.

Test Plan: db_test

Reviewers: igor, yhchiang, sdong

Reviewed By: sdong

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D19425
2014-07-08 12:31:49 -07:00
Lei Jin
5ef1ba7ff5 generic rate limiter
Summary:
A generic rate limiter that can be shared by threads and rocksdb
instances. Will use this to smooth out write traffic generated by
compaction and flush. This will help us get better p99 behavior on flash
storage.

Test Plan:
unit test output
==== Test RateLimiterTest.Rate
request size [1 - 1023], limit 10 KB/sec, actual rate: 10.374969 KB/sec, elapsed 2002265
request size [1 - 2047], limit 20 KB/sec, actual rate: 20.771242 KB/sec, elapsed 2002139
request size [1 - 4095], limit 40 KB/sec, actual rate: 41.285299 KB/sec, elapsed 2202424
request size [1 - 8191], limit 80 KB/sec, actual rate: 81.371605 KB/sec, elapsed 2402558
request size [1 - 16383], limit 160 KB/sec, actual rate: 162.541268 KB/sec, elapsed 3303500

Reviewers: yhchiang, igor, sdong

Reviewed By: sdong

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D19359
2014-07-08 11:41:57 -07:00
Yueh-Hsuan Chiang
90a6aca48e Finer report I/O stats about Flush and Compaction.
Summary:
This diff allows the I/O stats about Flush and Compaction to be reported
in a more accurate way.  Instead of measuring the size of a file, it
measure I/O cost in per read / write basis.

Test Plan: make all check

Reviewers: sdong, igor, ljin

Reviewed By: ljin

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D19383
2014-07-03 16:28:03 -07:00
Yueh-Hsuan Chiang
d4d338de33 Add timeout_hint_us to WriteOptions and introduce Status::TimeOut.
Summary:
This diff adds timeout_hint_us to WriteOptions.  If it's non-zero, then
1) writes associated with this options MAY be aborted when it has been
  waiting for longer than the specified time.  If an abortion happens,
  associated writes will return Status::TimeOut.
2) the stall time of the associated write caused by flush or compaction
  will be limited by timeout_hint_us.

The default value of timeout_hint_us is 0 (i.e., OFF.)

The statistics of timeout writes will be recorded in WRITE_TIMEDOUT.

Test Plan:
export ROCKSDB_TESTS=WriteTimeoutAndDelayTest
make db_test
./db_test

Reviewers: igor, ljin, haobo, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D18837
2014-07-03 15:47:02 -07:00
sdong
2459f7ec4e Support Multiple DB paths (without having an interface to expose to users)
Summary:
In this patch, we allow RocksDB to support multiple DB paths internally.
No user interface is supported yet so this patch is silent to users.

Test Plan: make all check

Reviewers: igor, haobo, ljin, yhchiang

Reviewed By: yhchiang

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D18921
2014-07-02 21:14:44 -07:00
Igor Canadi
d3f63f03ad Fix 32-bit errors
Summary: https://www.facebook.com/groups/rocksdb.dev/permalink/590438347721350/

Test Plan: compiles

Reviewers: sdong, ljin, yhchiang

Reviewed By: yhchiang

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D19197
2014-07-02 11:40:16 +02:00
sdong
9c332aa11a HashLinkList memtable switches a bucket to a skip list to reduce performance outliers
Summary:
In this patch, we enhance HashLinkList memtable to reduce performance outliers when a bucket contains too many entries. We switch to skip list for this case to enable binary search.

Add threshold_use_skiplist parameter to determine when a bucket needs to switch to skip list.

The new data structure is documented in comments in the codes.

Test Plan:
make all check
set threshold_use_skiplist in several tests

Reviewers: yhchiang, haobo, ljin

Reviewed By: yhchiang, ljin

Subscribers: nkg-, xjin, dhruba, yhchiang, leveldb

Differential Revision: https://reviews.facebook.net/D19299
2014-07-01 17:14:15 -07:00
Feng Zhu
5656367416 use arena to allocate memtable's bloomfilter and hashskiplist's buckets_
Summary:
    Bloomfilter and hashskiplist's buckets_ allocated by memtable's arena
    DynamicBloom: pass arena via constructor, allocate space in SetTotalBits
    HashSkipListRep: allocate space of buckets_ using arena.
       do not delete it in deconstructor because arena would take care of it.
    Several test files are changed.

Test Plan:
    make all check

Reviewers: ljin, haobo, yhchiang, sdong

Reviewed By: sdong

Subscribers: igor, dhruba

Differential Revision: https://reviews.facebook.net/D19335
2014-06-30 15:54:31 -07:00
Yueh-Hsuan Chiang
81c5d98900 Fixed a comparison between signed and unsigned integers in options.cc
Summary:
Fixed the following warning:

util/options.cc: In constructor ‘rocksdb::ColumnFamilyOptions::ColumnFamilyOptions(const rocksdb::Options&)’:
util/options.cc:157:58: error: comparison between signed and unsigned integer expressions [-Werror=sign-compare]
   if (max_bytes_for_level_multiplier_additional.size() < num_levels) {
                                                             ^

Test Plan: make all check

Reviewers: haobo, sdong, ljin

Reviewed By: ljin

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D19293
2014-06-25 15:31:30 -06:00
sdong
19de6a7aad Remove MemTableRep::GetIterator(const Slice& slice)
Summary: It seems to me that when ever function MemTableRep::GetIterator(const Slice& slice) is used, we can use MemTableRep::GetDynamicPrefixIterator() instead. Just delete it to simplify the codes.

Test Plan: make all check

Reviewers: yhchiang, ljin

Reviewed By: ljin

Subscribers: xjin, dhruba, haobo, leveldb

Differential Revision: https://reviews.facebook.net/D19281
2014-06-25 14:09:29 -07:00
Yueh-Hsuan Chiang
55531fd089 Fixed heap-buffer-overflow issue when Options.num_levels > 7.
Summary:
Currently, when num_levels has been changed to > 7, internally
it will not resize max_bytes_for_level_multiplier_additional.
As a result, max_bytes_for_level_multiplier_additional.size() will
be smaller than num_levels, which causes heap-buffer-overflow.

Test Plan: make all check

Reviewers: haobo, sdong, ljin

Reviewed By: ljin

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D19275
2014-06-25 14:11:12 -06:00