Commit Graph

852 Commits

Author SHA1 Message Date
Dhruba Borthakur
299f5c76bb Create new log file outside the dbmutex.
Summary:
All filesystem Io should be done outside the dbmutex. There was one place
when we have to roll the transaction log that we were creating the new log file
while holding the dbmutex.

I rearranged this code so that the act of creating the new transaction log
file is done without holding the dbmutex.  I also allocate the new memtable
outside the dbmutex, this is important because creating the memtable
could be heavyweight.

Test Plan: make check and dbstress

Reviewers: haobo, igor

Reviewed By: haobo

CC: leveldb, reconnect.grayhat

Differential Revision: https://reviews.facebook.net/D14283
2013-11-25 11:23:42 -08:00
Haobo Xu
5b825d6964 [RocksDB] Use raw pointer instead of shared pointer when passing Statistics object internally
Summary: liveness of the statistics object is already ensured by the shared pointer in DB options. There's no reason to pass again shared pointer among internal functions. Raw pointer is sufficient and efficient.

Test Plan: make check

Reviewers: dhruba, MarkCallaghan, igor

Reviewed By: dhruba

CC: leveldb, reconnect.grayhat

Differential Revision: https://reviews.facebook.net/D14289
2013-11-25 10:38:15 -08:00
kailiu
0c93df912e Improve the readability of the TableProperties::ToString() 2013-11-21 17:54:23 -08:00
Siying Dong
3e35aa6412 Revert "Allow users to profile a query and see bottleneck of the query"
This reverts commit 3d8ac31d71.
2013-11-21 17:40:39 -08:00
Siying Dong
3d8ac31d71 Allow users to profile a query and see bottleneck of the query
Summary:
Provide a framework to profile a query in detail to figure out latency bottleneck. Currently, in Get(), Put() and iterators, 2-3 simple timing is used. We can easily add more profile counters to the framework later.

Test Plan: Enable this profiling in seveal existing tests.

Reviewers: haobo, dhruba, kailiu, emayanke, vamsi, igor

CC: leveldb

Differential Revision: https://reviews.facebook.net/D14001
2013-11-21 16:29:57 -08:00
kailiu
56589ab81f Add TableOptions for BlockBasedTableFactory
We are having more and more options to specify for this table so it makes sense to have a TableOptions for future extension.
2013-11-20 18:42:12 -08:00
Kai Liu
db2e2615f8 Fix the format in INSTALL.md 2013-11-20 14:55:33 -08:00
Kai Liu
f5acb2eebb Add more guide for mac users. 2013-11-20 14:54:53 -08:00
Kai Liu
618250b5c5 Update the installation guide for mac users. 2013-11-20 14:38:17 -08:00
kailiu
1c8b819be2 Fix a memory leak happened in table_test 2013-11-20 13:45:32 -08:00
Haobo Xu
a617227a36 [RocksDB] fix prefix_test
Summary: user comparator needs to work if either input is prefix only.

Test Plan: ./prefix_test --write_buffer_size=100000 --total_prefixes=10000 --items_per_prefix=10

Reviewers: dhruba, igor

Reviewed By: igor

CC: leveldb

Differential Revision: https://reviews.facebook.net/D14241
2013-11-20 09:16:23 -08:00
kailiu
0142d38b69 fix issue #11
URL: https://github.com/facebook/rocksdb/issues/11
2013-11-20 01:17:44 -08:00
kailiu
6eb5649800 Move flush_block_policy from Options to TableFactory
Summary:
Previously we introduce a `flush_block_policy_factory` in Options, however, that options is strongly releated to Table based tables.
It will make more sense to move it to block based table's own factory class.

Test Plan: make check to pass existing tests

Reviewers: dhruba, haobo

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D14211
2013-11-19 22:00:48 -08:00
Igor Canadi
469a9f32a7 Fix two nasty use-after-free-bugs
Summary:
These bugs were caught by ASAN crash test.
1. The first one, in table/filter_block.cc is very nasty. We first reference entries_ and store the reference to Slice prev. Then, we call entries_.append(), which can change the reference. The Slice prev now points to junk.
2. The second one is a bug in a test, so it's not very serious. Once we set read_opts.prefix, we never clear it, so some other function might still reference it.

Test Plan: asan crash test now runs more than 5 mins. Before, it failed immediately. I will run the full one, but the full one takes quite some time (5 hours)

Reviewers: dhruba, haobo, kailiu

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D14223
2013-11-19 21:01:48 -08:00
Igor Canadi
8906ab59f7 Split asan_check into asan_check and asan_crash_test 2013-11-19 16:44:40 -08:00
Igor Canadi
92d905026b make asan_check
Summary: Add asan_check rule to Makefile. After we add this, we will create Jenkins run that will check for asan errors!

Test Plan: make asan_check

Reviewers: dhruba, kailiu, haobo

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D14205
2013-11-19 16:33:24 -08:00
kailiu
1415f8820d Improve the "table stats"
Summary:
The primary motivation of the changes is to make it easier to figure out the inside of the tables.

* rename "table stats" to "table properties" since now we have more than "integers" to store in the property block.
* Add filter block size to the basic table properties.
* Whenever a table is built, we'll log the table properties (the sample output is in Test Plan).
* Make an api to expose deleted keys.

Test Plan:
Passed all existing test. and the sample output of table stats:

    ==================================================================
        Basic Properties
    ------------------------------------------------------------------
                  # data blocks: 1
                      # entries: 1

                   raw key size: 9
           raw average key size: 9
                 raw value size: 9
         raw average value size: 0

                data block size: 25
               index block size: 27
              filter block size: 18
         (estimated) table size: 70

                  filter policy: rocksdb.BuiltinBloomFilter
    ==================================================================
        User collected properties: InternalKeyPropertiesCollector
    ------------------------------------------------------------------
                    kDeletedKeys: 1
    ==================================================================

Reviewers: dhruba, haobo

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D14187
2013-11-19 16:29:42 -08:00
Igor Canadi
f045871f1c Remove libevent
Summary: We don't need that dependency

Test Plan: make check

Reviewers: dhruba

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D14199
2013-11-18 21:33:16 -08:00
Igor Canadi
ec0acfbca1 Fix stack overflow
Summary:
Sure, let me put 8 bytes in that int32_t.

Brought to you by ASAN!

Test Plan: ttl_test

Reviewers: dhruba, haobo, kailiu, emayanke

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D14193
2013-11-18 20:56:21 -08:00
Igor Canadi
07e8078b17 Fix BZip constants
Summary: We were using ZLIB constants in BZIP code path. This caused some errors, like: https://github.com/facebook/rocksdb/issues/8

Test Plan: make clean; make check

Reviewers: dhruba, kailiu

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D14175
2013-11-18 20:33:34 -08:00
Igor Canadi
e51f55d7ae Use gcc4.7.1 on CentOS 5.2
Summary:
For some reason, snappy on CentOS 5.2 when compiled with gcc 4.8.1 segfaults on strcmp. (!?)

Add an if to compile with gcc4.7.1 if you're compiling on CentOS 5.2. Please update your devservers to CentOS 6.

Test Plan:
make clean; make check
on both my devserver (CentOS 6) and dhruba's (CentOS 5.2)

Reviewers: dhruba, kailiu

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D14169
2013-11-18 20:18:45 -08:00
Igor Canadi
7f156be9d4 [fbcode] Also add glibc and libgcc includes
Summary: We also need to use custom glibc and libgcc includes instead of system ones.

Test Plan:
'make clean; make check'.

Will also try on @dhruba's dev server.

Reviewers: dhruba, kailiu

Reviewed By: kailiu

CC: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D14157
2013-11-18 13:50:34 -08:00
Igor Canadi
0d25449d57 Compilation.md
Summary: Explained what it takes to compile it on linux.

Test Plan: -

Reviewers: dhruba, kailiu

Reviewed By: kailiu

CC: leveldb

Differential Revision: https://reviews.facebook.net/D14151
2013-11-18 11:44:17 -08:00
Igor Canadi
f611aba559 Move the compiler back to 4.8.1 + more small fixes
Summary:
1. Moved the compiler back to 4.8.1 and uses Centos 5.2 binaries if OS is Centos 5.2.

2. Fixes this issue: https://github.com/facebook/rocksdb/issues/7

3. We use lot of c++11 features, so we can't pretend we can compile without them. Makes it a first class dependency.

4. Fix blob_store_test, which failes on Ubuntu with "too many files opened" error

5. Removed dependency on port/port_chromium.h, which does not even exist on our system

Test Plan: make clean; make check

Reviewers: dhruba, kailiu

Reviewed By: kailiu

CC: leveldb

Differential Revision: https://reviews.facebook.net/D14145
2013-11-18 11:40:16 -08:00
kailiu
6c6d5bc3b0 Add a compilation guide to rocksdb
Summary: as title.

Test Plan: no

Reviewers: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D14121
2013-11-18 11:01:16 -08:00
Dhruba Borthakur
31295b0a1b Add License message to public header files.
Summary:
Add License message to public header files.

Test Plan:

Reviewers:

CC:

Task ID: #

Blame Rev:
2013-11-18 10:21:35 -08:00
Igor Canadi
37eedfb8c1 Move back to gcc4.7.1
Summary: Dhruba can't compile on gcc4.8.1 so I'm moving temporarily back to 4.7.1 until we figure out what's wrong with 4.8. on his server.

Test Plan: It can compile on my devserver, but please 'arc patch' this diff and try compiling on your machine

Reviewers: dhruba, kailiu

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D14139
2013-11-18 10:20:32 -08:00
Igor Canadi
ce26e9a522 Remove snappy from RocksDB distribution
Summary:
Argumentation here: https://github.com/facebook/rocksdb/issues/9

Even though we include snappy in the distribution, we do not link with it if we don't have snappy installed on the system.

Installing snappy is easy nowadays, just type:
sudo apt-get install libsnappy-dev

Test Plan: compile on ubuntu

Reviewers: dhruba, kailiu

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D14133
2013-11-17 22:05:00 -08:00
Igor Canadi
fc61428288 Include <unistd.h> in db_test
Summary: This is the only compile issue in Ubuntu. It might be better to include <unistd.h> only in env_posix and add Truncate function to Env, but since we use truncate only in db_test, I don't think it makes much sense.

Test Plan: Rocksdb now compiles on Ubuntu!

Reviewers: dhruba, kailiu

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D14127
2013-11-17 21:58:16 -08:00
Igor Canadi
de9ce7d439 Upgrading compiler to gcc4.8.1
Summary:
Finally did it - the trick was in using --dynamic-linker option. This is first step to running ASAN.

All of our code seems to compile just fine on 4.8.1. However, I still left fbcode.471.sh in the 'build_tools/' just in case.

Test Plan: make clean; make

Reviewers: dhruba, haobo, kailiu, emayanke, sdong

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D14109
2013-11-17 13:52:55 -08:00
Kai Liu
75df72f2a5 Change the logic in KeyMayExist()
Summary:
Previously in KeyMayExist(), if DB::Get() returns non-Status::OK(), we assumes key may not exist.
However, as if index block is not in block cache, Status::Incomplete() will return. Worse still, if
options::filter_delete is enabled, we may falsely ignore the "delete" operation:

  https://github.com/facebook/rocksdb/blob/master/db/write_batch.cc#L217-L220

This diff fixes this bug and will let crash-test pass.

Test Plan:
Ran:

  ./db_stress --test_batches_snapshots=1 --ops_per_thread=1000000 --threads=32 --write_buffer_size=4194304 --destroy_db_initially=1 --reopen=0 --readpercent=5 --prefixpercent=45 --writepercent=35 --delpercent=5 --iterpercent=10 --db=/home/kailiu/local/newer --max_key=100000000 --disable_seek_compaction=0 --mmap_read=0 --block_size=16384 --cache_size=1048576 --open_files=500000 --verify_checksum=1 --sync=0 --disable_wal=0 --disable_data_sync=0 --target_file_size_base=2097152
--target_file_size_multiplier=2 --max_write_buffer_number=3 --max_background_compactions=20 --max_bytes_for_level_base=10485760 --filter_deletes=1

Previously we'll see crash happens very soon.

Reviewers: igor, dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D14115
2013-11-17 01:00:34 -08:00
kailiu
97d8e573a6 make util/env_posix.cc work under mac
Summary: This diff invoves some more complicated issues in the posix environment.

Test Plan: works under mac os. will need to verify dev box.

Reviewers: dhruba

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D14061
2013-11-16 23:44:39 -08:00
Kai Liu
7604e2f70c Make the options in table_builder/block_builder less misleading
Summary:
By original design, the regular `block options` and index `block options` in table_builder is mutable. We can use ChangeOptions to change the options directly.

However, with my last change, `BlockBuilder` no longer hold the reference to the index_block_options -- as a result, any changes made after the creation of index block builder will be of no effect.

But still the code is very error-prone and developers can easily fall into the trap without aware of it. To avoid this problem from happening in the future, I deleted the `ChangeOptions` and the `index_block_options`, as well as many other changes to make it less misleading.

Test Plan:
make
make check
make release

Reviewers: dhruba, haobo

Reviewed By: haobo

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13707
2013-11-16 23:21:15 -08:00
Kai Liu
f7a2b972a8 Reformat CONTRIBUTING.md with less than 80 characters. 2013-11-16 22:51:09 -08:00
Igor Canadi
72756c7404 Merge pull request #5 from pborreli/typos
Fixed typos
2013-11-16 17:46:29 -08:00
Igor Canadi
f46827212e Added CONTRIBUTING.md 2013-11-16 14:35:33 -08:00
Pascal Borreli
443e04e62d Fixed typos 2013-11-16 11:21:34 +00:00
Siying Dong
55baa3d955 Add an option to table_reader_bench to access the table from DB And Iterating non-existing prefix case.
Summary: This patch adds an option to table_reader_bench that queries run against DB level (which has one table). It is useful if user wants to see the extra costs DB level introduces.

Test Plan: Run the benchmark with and without the new parameter

Reviewers: haobo, dhruba, kailiu

Reviewed By: kailiu

CC: leveldb

Differential Revision: https://reviews.facebook.net/D13863
2013-11-15 22:23:12 -08:00
Kai Liu
3d56b0698c Fix typo. 2013-11-15 22:15:42 -08:00
Igor Canadi
21905dd4a8 Start DeleteFileTest with clean plate
Summary:
Remove all the files from the test dir before the test. The test failed when there were some old files still in the directory, since it checks the file counts.
This is what caused jenkins' test failures. It was running fine on my machine so it was hard to repro.

Test Plan:
1. create an extra 000001.log file in the test directory
2. run a ./deletefile_test - test failes
3. patch ./deletefile_test with this
4. test succeeds

Reviewers: haobo, dhruba

Reviewed By: haobo

CC: leveldb

Differential Revision: https://reviews.facebook.net/D14097
2013-11-15 16:30:23 -08:00
Kai Liu
bd828264f3 Fix a typo in README 2013-11-15 15:36:37 -08:00
Igor Canadi
29c931f70b Avoid populating live set if we don't need to
Summary: Also changed some comments

Test Plan: ./deletefile_test

Reviewers: haobo

Reviewed By: haobo

CC: leveldb

Differential Revision: https://reviews.facebook.net/D14091
2013-11-14 22:42:02 -08:00
Igor Canadi
a0ce3fd00a PurgeObsoleteFiles() unittest
Summary:
Created a unittest that verifies that automatic deletion performed by PurgeObsoleteFiles() works correctly.

Also, few small fixes on the logic part -- call version_set_->GetObsoleteFiles() in FindObsoleteFiles() instead of on some arbitrary positions.

Test Plan: Created a unit test

Reviewers: dhruba, haobo, nkg-

Reviewed By: haobo

CC: leveldb

Differential Revision: https://reviews.facebook.net/D14079
2013-11-14 18:03:57 -08:00
Vamsi Ponnekanti
94dde686bb [Merge operand meant for key K is being applied on wrong key]
Summary:
We iterate until we find a different key than original key.
ikey is pointing to next key when we break out of loop.
After the loop we apply all merge operands meant for original key
on the next key!

Test Plan:
Need to give a build to Marcin to test out.

Revert Plan: OK

Task ID: #3181932

Reviewers: haobo, emayanke, dhruba

Reviewed By: haobo

CC: leveldb

Differential Revision: https://reviews.facebook.net/D14073
2013-11-14 17:13:24 -08:00
Igor Canadi
e0ad0f26b8 Fix bloom filters
Summary: https://reviews.facebook.net/D13167 broke bloom filters. If filter is not in cache, we want to return true (safe thing). Am I right?

Test Plan: when benchmarking https://reviews.facebook.net/D14031 I got different results when using bloom filters vs. when not using them. This fixed the issue. I will also be putting this change to the other diff, but that one will probably be in review for longer time.

Reviewers: kailiu, dhruba, haobo

Reviewed By: kailiu

CC: leveldb

Differential Revision: https://reviews.facebook.net/D14085
2013-11-14 14:05:15 -08:00
Igor Canadi
fda8142f29 Delete log files in the correct dir
Summary: Log files are stored in wal_dir, not dbname_

Test Plan: deletfile_test

Reviewers: nkg-

Reviewed By: nkg-

CC: leveldb

Differential Revision: https://reviews.facebook.net/D14067
2013-11-13 14:54:54 -08:00
Kai Liu
80bb81c6fe Add the correct table_factory for tables in table_tests 2013-11-12 23:54:31 -08:00
Kai Liu
88ba331c1a Add the index/filter block cache
Summary: This diff leverage the existing block cache and extend it to cache index/filter block.

Test Plan:
Added new tests in db_test and table_test

The correctness is checked by:

1. make check
2. make valgrind_check

Performance is test by:

1. 10 times of build_tools/regression_build_test.sh on two versions of rocksdb before/after the code change. Test results suggests no significant difference between them. For the two key operatons `overwrite` and `readrandom`, the average iops are both 20k and ~260k, with very small variance).
2. db_stress.

Reviewers: dhruba

Reviewed By: dhruba

CC: leveldb, haobo, xjin

Differential Revision: https://reviews.facebook.net/D13167
2013-11-12 22:46:51 -08:00
Kai Liu
aed9f1fa5e The updated sed still doesn't work in mac, revert. 2013-11-12 21:40:25 -08:00
Kai Liu
f3b3316a07 Fix a sed command issue that cannot generated *.d files
Summary:

The original sed command is not recognized by mac's sed, which generates ".d-e" extension instead of ".d"

Test Plan:

make clean && make -j32
2013-11-12 21:26:19 -08:00