Commit Graph

6466 Commits

Author SHA1 Message Date
Dhruba Borthakur
27bbef1180 Free obsolete memtables outside the dbmutex.
Summary:
Large memory allocations and frees are costly and best done outside the
db-mutex. The memtables are already allocated outside the db-mutex but
they were being freed while holding the db-mutex.
This patch frees obsolete memtables outside the db-mutex.

Test Plan:
make check
db_stress

Unit tests pass, I am in the process of running stress tests.

Reviewers: haobo, igor, emayanke

Reviewed By: haobo

CC: reconnect.grayhat, leveldb

Differential Revision: https://reviews.facebook.net/D14319
2013-11-25 21:04:48 -08:00
Igor Canadi
3ce3658411 DB::GetOptions()
Summary: We need access to options for BackupableDB

Test Plan: make check

Reviewers: dhruba

Reviewed By: dhruba

CC: leveldb, reconnect.grayhat

Differential Revision: https://reviews.facebook.net/D14331
2013-11-25 15:51:50 -08:00
Igor Canadi
793fdd6731 We should compile with -fPIC on non-fbcode environments also 2013-11-25 15:49:02 -08:00
Igor Canadi
e37221f56b Memtable regression test
Summary: The old regression tests didn't cover memtable part at all. This is an atempt to also measure memtable performance in regression tests.

Test Plan: Ran regression_build_test.sh

Reviewers: dhruba, haobo, kailiu, emayanke

Reviewed By: dhruba

CC: leveldb, reconnect.grayhat

Differential Revision: https://reviews.facebook.net/D14325
2013-11-25 15:34:23 -08:00
Igor Canadi
11c26bd4a4 [RocksDB] Interface changes required for BackupableDB
Summary: This is part of https://reviews.facebook.net/D14295 -- smaller diff that is easier to review

Test Plan: make asan_check

Reviewers: dhruba, haobo, emayanke

Reviewed By: emayanke

CC: leveldb, kailiu, reconnect.grayhat

Differential Revision: https://reviews.facebook.net/D14301
2013-11-25 12:39:23 -08:00
Dhruba Borthakur
299f5c76bb Create new log file outside the dbmutex.
Summary:
All filesystem Io should be done outside the dbmutex. There was one place
when we have to roll the transaction log that we were creating the new log file
while holding the dbmutex.

I rearranged this code so that the act of creating the new transaction log
file is done without holding the dbmutex.  I also allocate the new memtable
outside the dbmutex, this is important because creating the memtable
could be heavyweight.

Test Plan: make check and dbstress

Reviewers: haobo, igor

Reviewed By: haobo

CC: leveldb, reconnect.grayhat

Differential Revision: https://reviews.facebook.net/D14283
2013-11-25 11:23:42 -08:00
Haobo Xu
5b825d6964 [RocksDB] Use raw pointer instead of shared pointer when passing Statistics object internally
Summary: liveness of the statistics object is already ensured by the shared pointer in DB options. There's no reason to pass again shared pointer among internal functions. Raw pointer is sufficient and efficient.

Test Plan: make check

Reviewers: dhruba, MarkCallaghan, igor

Reviewed By: dhruba

CC: leveldb, reconnect.grayhat

Differential Revision: https://reviews.facebook.net/D14289
2013-11-25 10:38:15 -08:00
Siying Dong
dfa1460d88 [For Performance Branch] Bloom filter in PlainTableIterator::Seek() - Update 1
Summary:
Address @haobo's comments in D14277

Test Plan: ./indexed_table_db_test

Reviewers: haobo

CC:

Task ID: #

Blame Rev:
2013-11-21 23:33:45 -08:00
Siying Dong
718488abc5 Add BloomFilter to PlainTableIterator::Seek()
Summary:
This patch adds a simple bloom filter in PlainTableIterator::Seek()

Test Plan: N/A

Reviewers:

CC:

Task ID: #

Blame Rev:
2013-11-21 22:26:39 -08:00
kailiu
0c93df912e Improve the readability of the TableProperties::ToString() 2013-11-21 17:54:23 -08:00
Siying Dong
3e35aa6412 Revert "Allow users to profile a query and see bottleneck of the query"
This reverts commit 3d8ac31d71.
2013-11-21 17:40:39 -08:00
Siying Dong
b135d01e7b Allow users to profile a query and see bottleneck of the query
Summary:
Provide a framework to profile a query in detail to figure out latency bottleneck. Currently, in Get(), Put() and iterators, 2-3 simple timing is used. We can easily add more profile counters to the framework later.

Test Plan: Enable this profiling in seveal existing tests.

Reviewers: haobo, dhruba, kailiu, emayanke, vamsi, igor

CC: leveldb

Differential Revision: https://reviews.facebook.net/D14001

Conflicts:
	table/merger.cc
2013-11-21 17:39:19 -08:00
Siying Dong
3d8ac31d71 Allow users to profile a query and see bottleneck of the query
Summary:
Provide a framework to profile a query in detail to figure out latency bottleneck. Currently, in Get(), Put() and iterators, 2-3 simple timing is used. We can easily add more profile counters to the framework later.

Test Plan: Enable this profiling in seveal existing tests.

Reviewers: haobo, dhruba, kailiu, emayanke, vamsi, igor

CC: leveldb

Differential Revision: https://reviews.facebook.net/D14001
2013-11-21 16:29:57 -08:00
Siying Dong
58e1956d50 [Only for Performance Branch] A Hacky patch to lazily generate memtable key for prefix-hashed memtables.
Summary:
For prefix mem tables, encoding mem table key may be unnecessary if the prefix doesn't have any key. This patch is a little bit hacky but I want to try out the performance gain of removing this lazy initialization.

In longer term, we might want to revisit the way we abstract mem tables implementations.

Test Plan: make all check

Reviewers: haobo, igor, kailiu

Reviewed By: igor

CC: leveldb

Differential Revision: https://reviews.facebook.net/D14265
2013-11-20 20:49:23 -08:00
kailiu
7b10fe9fac Fix a memory leak happened in table_test 2013-11-20 20:49:23 -08:00
Siying Dong
b59d4d5a50 A Simple Plain Table
Summary:
A Simple plain table format. No block structure. When creating the table reader, scanning the full table to create indexes.

Test Plan:Add unit test

Reviewers:haobo,dhruba,kailiu

CC:

Task ID: #

Blame Rev:
2013-11-20 18:44:22 -08:00
kailiu
56589ab81f Add TableOptions for BlockBasedTableFactory
We are having more and more options to specify for this table so it makes sense to have a TableOptions for future extension.
2013-11-20 18:42:12 -08:00
Siying Dong
071fb0d77b Inline a couple of functions and put one save lazily clearing
Summary:
Machine several functions inline.
Also, in DBIter.Seek() make value cleaning up lazily done.
These are for the use case that Seek() are called lots of times but few return values.

Test Plan: make all check

Differential Revision: https://reviews.facebook.net/D14217
2013-11-20 17:32:57 -08:00
Siying Dong
15b31b57df MergingIterator.Seek() to lazily initialize MinHeap
Summary:
For the use cases that prefix filtering is enabled, initializing heaps when doing MergingIterator.Seek() might introduce non-negligible costs. This patch makes it lazily done.

Test Plan: make all check

Reviewers: haobo,dhruba,kailiu

CC:

Task ID: #

Blame Rev:
2013-11-20 17:29:35 -08:00
Kai Liu
db2e2615f8 Fix the format in INSTALL.md 2013-11-20 14:55:33 -08:00
Kai Liu
f5acb2eebb Add more guide for mac users. 2013-11-20 14:54:53 -08:00
Kai Liu
618250b5c5 Update the installation guide for mac users. 2013-11-20 14:38:17 -08:00
Haobo Xu
37b459f0aa [RocksDB] Test diff on performance branch
Summary: trivia comment change

Test Plan: Go through the step ofs developing under the performance branch

Reviewers: dhruba, kailiu, sdong

CC: leveldb

Differential Revision: https://reviews.facebook.net/D14259
2013-11-20 14:34:52 -08:00
kailiu
1c8b819be2 Fix a memory leak happened in table_test 2013-11-20 13:45:32 -08:00
Haobo Xu
a617227a36 [RocksDB] fix prefix_test
Summary: user comparator needs to work if either input is prefix only.

Test Plan: ./prefix_test --write_buffer_size=100000 --total_prefixes=10000 --items_per_prefix=10

Reviewers: dhruba, igor

Reviewed By: igor

CC: leveldb

Differential Revision: https://reviews.facebook.net/D14241
2013-11-20 09:16:23 -08:00
kailiu
0142d38b69 fix issue #11
URL: https://github.com/facebook/rocksdb/issues/11
2013-11-20 01:17:44 -08:00
kailiu
6eb5649800 Move flush_block_policy from Options to TableFactory
Summary:
Previously we introduce a `flush_block_policy_factory` in Options, however, that options is strongly releated to Table based tables.
It will make more sense to move it to block based table's own factory class.

Test Plan: make check to pass existing tests

Reviewers: dhruba, haobo

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D14211
2013-11-19 22:00:48 -08:00
Igor Canadi
469a9f32a7 Fix two nasty use-after-free-bugs
Summary:
These bugs were caught by ASAN crash test.
1. The first one, in table/filter_block.cc is very nasty. We first reference entries_ and store the reference to Slice prev. Then, we call entries_.append(), which can change the reference. The Slice prev now points to junk.
2. The second one is a bug in a test, so it's not very serious. Once we set read_opts.prefix, we never clear it, so some other function might still reference it.

Test Plan: asan crash test now runs more than 5 mins. Before, it failed immediately. I will run the full one, but the full one takes quite some time (5 hours)

Reviewers: dhruba, haobo, kailiu

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D14223
2013-11-19 21:01:48 -08:00
Igor Canadi
8906ab59f7 Split asan_check into asan_check and asan_crash_test 2013-11-19 16:44:40 -08:00
Igor Canadi
92d905026b make asan_check
Summary: Add asan_check rule to Makefile. After we add this, we will create Jenkins run that will check for asan errors!

Test Plan: make asan_check

Reviewers: dhruba, kailiu, haobo

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D14205
2013-11-19 16:33:24 -08:00
kailiu
1415f8820d Improve the "table stats"
Summary:
The primary motivation of the changes is to make it easier to figure out the inside of the tables.

* rename "table stats" to "table properties" since now we have more than "integers" to store in the property block.
* Add filter block size to the basic table properties.
* Whenever a table is built, we'll log the table properties (the sample output is in Test Plan).
* Make an api to expose deleted keys.

Test Plan:
Passed all existing test. and the sample output of table stats:

    ==================================================================
        Basic Properties
    ------------------------------------------------------------------
                  # data blocks: 1
                      # entries: 1

                   raw key size: 9
           raw average key size: 9
                 raw value size: 9
         raw average value size: 0

                data block size: 25
               index block size: 27
              filter block size: 18
         (estimated) table size: 70

                  filter policy: rocksdb.BuiltinBloomFilter
    ==================================================================
        User collected properties: InternalKeyPropertiesCollector
    ------------------------------------------------------------------
                    kDeletedKeys: 1
    ==================================================================

Reviewers: dhruba, haobo

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D14187
2013-11-19 16:29:42 -08:00
Igor Canadi
f045871f1c Remove libevent
Summary: We don't need that dependency

Test Plan: make check

Reviewers: dhruba

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D14199
2013-11-18 21:33:16 -08:00
Igor Canadi
ec0acfbca1 Fix stack overflow
Summary:
Sure, let me put 8 bytes in that int32_t.

Brought to you by ASAN!

Test Plan: ttl_test

Reviewers: dhruba, haobo, kailiu, emayanke

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D14193
2013-11-18 20:56:21 -08:00
Igor Canadi
07e8078b17 Fix BZip constants
Summary: We were using ZLIB constants in BZIP code path. This caused some errors, like: https://github.com/facebook/rocksdb/issues/8

Test Plan: make clean; make check

Reviewers: dhruba, kailiu

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D14175
2013-11-18 20:33:34 -08:00
Igor Canadi
e51f55d7ae Use gcc4.7.1 on CentOS 5.2
Summary:
For some reason, snappy on CentOS 5.2 when compiled with gcc 4.8.1 segfaults on strcmp. (!?)

Add an if to compile with gcc4.7.1 if you're compiling on CentOS 5.2. Please update your devservers to CentOS 6.

Test Plan:
make clean; make check
on both my devserver (CentOS 6) and dhruba's (CentOS 5.2)

Reviewers: dhruba, kailiu

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D14169
2013-11-18 20:18:45 -08:00
Igor Canadi
7f156be9d4 [fbcode] Also add glibc and libgcc includes
Summary: We also need to use custom glibc and libgcc includes instead of system ones.

Test Plan:
'make clean; make check'.

Will also try on @dhruba's dev server.

Reviewers: dhruba, kailiu

Reviewed By: kailiu

CC: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D14157
2013-11-18 13:50:34 -08:00
Igor Canadi
0d25449d57 Compilation.md
Summary: Explained what it takes to compile it on linux.

Test Plan: -

Reviewers: dhruba, kailiu

Reviewed By: kailiu

CC: leveldb

Differential Revision: https://reviews.facebook.net/D14151
2013-11-18 11:44:17 -08:00
Igor Canadi
f611aba559 Move the compiler back to 4.8.1 + more small fixes
Summary:
1. Moved the compiler back to 4.8.1 and uses Centos 5.2 binaries if OS is Centos 5.2.

2. Fixes this issue: https://github.com/facebook/rocksdb/issues/7

3. We use lot of c++11 features, so we can't pretend we can compile without them. Makes it a first class dependency.

4. Fix blob_store_test, which failes on Ubuntu with "too many files opened" error

5. Removed dependency on port/port_chromium.h, which does not even exist on our system

Test Plan: make clean; make check

Reviewers: dhruba, kailiu

Reviewed By: kailiu

CC: leveldb

Differential Revision: https://reviews.facebook.net/D14145
2013-11-18 11:40:16 -08:00
kailiu
6c6d5bc3b0 Add a compilation guide to rocksdb
Summary: as title.

Test Plan: no

Reviewers: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D14121
2013-11-18 11:01:16 -08:00
Dhruba Borthakur
31295b0a1b Add License message to public header files.
Summary:
Add License message to public header files.

Test Plan:

Reviewers:

CC:

Task ID: #

Blame Rev:
2013-11-18 10:21:35 -08:00
Igor Canadi
37eedfb8c1 Move back to gcc4.7.1
Summary: Dhruba can't compile on gcc4.8.1 so I'm moving temporarily back to 4.7.1 until we figure out what's wrong with 4.8. on his server.

Test Plan: It can compile on my devserver, but please 'arc patch' this diff and try compiling on your machine

Reviewers: dhruba, kailiu

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D14139
2013-11-18 10:20:32 -08:00
Igor Canadi
ce26e9a522 Remove snappy from RocksDB distribution
Summary:
Argumentation here: https://github.com/facebook/rocksdb/issues/9

Even though we include snappy in the distribution, we do not link with it if we don't have snappy installed on the system.

Installing snappy is easy nowadays, just type:
sudo apt-get install libsnappy-dev

Test Plan: compile on ubuntu

Reviewers: dhruba, kailiu

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D14133
2013-11-17 22:05:00 -08:00
Igor Canadi
fc61428288 Include <unistd.h> in db_test
Summary: This is the only compile issue in Ubuntu. It might be better to include <unistd.h> only in env_posix and add Truncate function to Env, but since we use truncate only in db_test, I don't think it makes much sense.

Test Plan: Rocksdb now compiles on Ubuntu!

Reviewers: dhruba, kailiu

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D14127
2013-11-17 21:58:16 -08:00
Igor Canadi
de9ce7d439 Upgrading compiler to gcc4.8.1
Summary:
Finally did it - the trick was in using --dynamic-linker option. This is first step to running ASAN.

All of our code seems to compile just fine on 4.8.1. However, I still left fbcode.471.sh in the 'build_tools/' just in case.

Test Plan: make clean; make

Reviewers: dhruba, haobo, kailiu, emayanke, sdong

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D14109
2013-11-17 13:52:55 -08:00
Kai Liu
75df72f2a5 Change the logic in KeyMayExist()
Summary:
Previously in KeyMayExist(), if DB::Get() returns non-Status::OK(), we assumes key may not exist.
However, as if index block is not in block cache, Status::Incomplete() will return. Worse still, if
options::filter_delete is enabled, we may falsely ignore the "delete" operation:

  https://github.com/facebook/rocksdb/blob/master/db/write_batch.cc#L217-L220

This diff fixes this bug and will let crash-test pass.

Test Plan:
Ran:

  ./db_stress --test_batches_snapshots=1 --ops_per_thread=1000000 --threads=32 --write_buffer_size=4194304 --destroy_db_initially=1 --reopen=0 --readpercent=5 --prefixpercent=45 --writepercent=35 --delpercent=5 --iterpercent=10 --db=/home/kailiu/local/newer --max_key=100000000 --disable_seek_compaction=0 --mmap_read=0 --block_size=16384 --cache_size=1048576 --open_files=500000 --verify_checksum=1 --sync=0 --disable_wal=0 --disable_data_sync=0 --target_file_size_base=2097152
--target_file_size_multiplier=2 --max_write_buffer_number=3 --max_background_compactions=20 --max_bytes_for_level_base=10485760 --filter_deletes=1

Previously we'll see crash happens very soon.

Reviewers: igor, dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D14115
2013-11-17 01:00:34 -08:00
kailiu
97d8e573a6 make util/env_posix.cc work under mac
Summary: This diff invoves some more complicated issues in the posix environment.

Test Plan: works under mac os. will need to verify dev box.

Reviewers: dhruba

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D14061
2013-11-16 23:44:39 -08:00
Kai Liu
7604e2f70c Make the options in table_builder/block_builder less misleading
Summary:
By original design, the regular `block options` and index `block options` in table_builder is mutable. We can use ChangeOptions to change the options directly.

However, with my last change, `BlockBuilder` no longer hold the reference to the index_block_options -- as a result, any changes made after the creation of index block builder will be of no effect.

But still the code is very error-prone and developers can easily fall into the trap without aware of it. To avoid this problem from happening in the future, I deleted the `ChangeOptions` and the `index_block_options`, as well as many other changes to make it less misleading.

Test Plan:
make
make check
make release

Reviewers: dhruba, haobo

Reviewed By: haobo

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13707
2013-11-16 23:21:15 -08:00
Kai Liu
f7a2b972a8 Reformat CONTRIBUTING.md with less than 80 characters. 2013-11-16 22:51:09 -08:00
Igor Canadi
72756c7404 Merge pull request #5 from pborreli/typos
Fixed typos
2013-11-16 17:46:29 -08:00
Igor Canadi
f46827212e Added CONTRIBUTING.md 2013-11-16 14:35:33 -08:00