Summary:
Argumentation here: https://github.com/facebook/rocksdb/issues/9
Even though we include snappy in the distribution, we do not link with it if we don't have snappy installed on the system.
Installing snappy is easy nowadays, just type:
sudo apt-get install libsnappy-dev
Test Plan: compile on ubuntu
Reviewers: dhruba, kailiu
Reviewed By: dhruba
CC: leveldb
Differential Revision: https://reviews.facebook.net/D14133
Summary: This is the only compile issue in Ubuntu. It might be better to include <unistd.h> only in env_posix and add Truncate function to Env, but since we use truncate only in db_test, I don't think it makes much sense.
Test Plan: Rocksdb now compiles on Ubuntu!
Reviewers: dhruba, kailiu
Reviewed By: dhruba
CC: leveldb
Differential Revision: https://reviews.facebook.net/D14127
Summary:
Finally did it - the trick was in using --dynamic-linker option. This is first step to running ASAN.
All of our code seems to compile just fine on 4.8.1. However, I still left fbcode.471.sh in the 'build_tools/' just in case.
Test Plan: make clean; make
Reviewers: dhruba, haobo, kailiu, emayanke, sdong
Reviewed By: dhruba
CC: leveldb
Differential Revision: https://reviews.facebook.net/D14109
Summary:
Previously in KeyMayExist(), if DB::Get() returns non-Status::OK(), we assumes key may not exist.
However, as if index block is not in block cache, Status::Incomplete() will return. Worse still, if
options::filter_delete is enabled, we may falsely ignore the "delete" operation:
https://github.com/facebook/rocksdb/blob/master/db/write_batch.cc#L217-L220
This diff fixes this bug and will let crash-test pass.
Test Plan:
Ran:
./db_stress --test_batches_snapshots=1 --ops_per_thread=1000000 --threads=32 --write_buffer_size=4194304 --destroy_db_initially=1 --reopen=0 --readpercent=5 --prefixpercent=45 --writepercent=35 --delpercent=5 --iterpercent=10 --db=/home/kailiu/local/newer --max_key=100000000 --disable_seek_compaction=0 --mmap_read=0 --block_size=16384 --cache_size=1048576 --open_files=500000 --verify_checksum=1 --sync=0 --disable_wal=0 --disable_data_sync=0 --target_file_size_base=2097152
--target_file_size_multiplier=2 --max_write_buffer_number=3 --max_background_compactions=20 --max_bytes_for_level_base=10485760 --filter_deletes=1
Previously we'll see crash happens very soon.
Reviewers: igor, dhruba
CC: leveldb
Differential Revision: https://reviews.facebook.net/D14115
Summary: This diff invoves some more complicated issues in the posix environment.
Test Plan: works under mac os. will need to verify dev box.
Reviewers: dhruba
Reviewed By: dhruba
CC: leveldb
Differential Revision: https://reviews.facebook.net/D14061
Summary:
By original design, the regular `block options` and index `block options` in table_builder is mutable. We can use ChangeOptions to change the options directly.
However, with my last change, `BlockBuilder` no longer hold the reference to the index_block_options -- as a result, any changes made after the creation of index block builder will be of no effect.
But still the code is very error-prone and developers can easily fall into the trap without aware of it. To avoid this problem from happening in the future, I deleted the `ChangeOptions` and the `index_block_options`, as well as many other changes to make it less misleading.
Test Plan:
make
make check
make release
Reviewers: dhruba, haobo
Reviewed By: haobo
CC: leveldb
Differential Revision: https://reviews.facebook.net/D13707
Summary: This patch adds an option to table_reader_bench that queries run against DB level (which has one table). It is useful if user wants to see the extra costs DB level introduces.
Test Plan: Run the benchmark with and without the new parameter
Reviewers: haobo, dhruba, kailiu
Reviewed By: kailiu
CC: leveldb
Differential Revision: https://reviews.facebook.net/D13863
Summary:
Remove all the files from the test dir before the test. The test failed when there were some old files still in the directory, since it checks the file counts.
This is what caused jenkins' test failures. It was running fine on my machine so it was hard to repro.
Test Plan:
1. create an extra 000001.log file in the test directory
2. run a ./deletefile_test - test failes
3. patch ./deletefile_test with this
4. test succeeds
Reviewers: haobo, dhruba
Reviewed By: haobo
CC: leveldb
Differential Revision: https://reviews.facebook.net/D14097
Summary:
Created a unittest that verifies that automatic deletion performed by PurgeObsoleteFiles() works correctly.
Also, few small fixes on the logic part -- call version_set_->GetObsoleteFiles() in FindObsoleteFiles() instead of on some arbitrary positions.
Test Plan: Created a unit test
Reviewers: dhruba, haobo, nkg-
Reviewed By: haobo
CC: leveldb
Differential Revision: https://reviews.facebook.net/D14079
Summary:
We iterate until we find a different key than original key.
ikey is pointing to next key when we break out of loop.
After the loop we apply all merge operands meant for original key
on the next key!
Test Plan:
Need to give a build to Marcin to test out.
Revert Plan: OK
Task ID: #3181932
Reviewers: haobo, emayanke, dhruba
Reviewed By: haobo
CC: leveldb
Differential Revision: https://reviews.facebook.net/D14073
Summary: https://reviews.facebook.net/D13167 broke bloom filters. If filter is not in cache, we want to return true (safe thing). Am I right?
Test Plan: when benchmarking https://reviews.facebook.net/D14031 I got different results when using bloom filters vs. when not using them. This fixed the issue. I will also be putting this change to the other diff, but that one will probably be in review for longer time.
Reviewers: kailiu, dhruba, haobo
Reviewed By: kailiu
CC: leveldb
Differential Revision: https://reviews.facebook.net/D14085
Summary: This diff leverage the existing block cache and extend it to cache index/filter block.
Test Plan:
Added new tests in db_test and table_test
The correctness is checked by:
1. make check
2. make valgrind_check
Performance is test by:
1. 10 times of build_tools/regression_build_test.sh on two versions of rocksdb before/after the code change. Test results suggests no significant difference between them. For the two key operatons `overwrite` and `readrandom`, the average iops are both 20k and ~260k, with very small variance).
2. db_stress.
Reviewers: dhruba
Reviewed By: dhruba
CC: leveldb, haobo, xjin
Differential Revision: https://reviews.facebook.net/D13167
Summary:
mac and our dev server has totally differnt definition of uint64_t, therefore fixing the warning in mac has actually made code in linux uncompileable.
Test Plan:
make clean && make -j32
Summary: One more fix! In some cases, our filenames start with "/". Apparently, env_ can't handle filenames with double //
Test Plan:
deletefile_test does not include this line in the LOG anymore:
2013/11/12-18:11:43.150149 7fe4a6fff700 RenameFile logfile #3 FAILED -- IO error: /tmp/rocksdbtest-3574/deletefile_test//000003.log: No such file or directory
Reviewers: dhruba, haobo
Reviewed By: haobo
CC: leveldb
Differential Revision: https://reviews.facebook.net/D14055
Summary: The work to make sure mac os compiles rocksdb is not completed yet. But at least we can start cleaning some warnings captured only by g++ from mac os..
Test Plan: ran make in mac os
Reviewers: dhruba
Reviewed By: dhruba
CC: leveldb
Differential Revision: https://reviews.facebook.net/D14049
Summary:
Broke the compile when I removed purge_log_after_memtable_flush.
sorrybus
Test Plan: make db_bench works now
Reviewers: haobo
Reviewed By: haobo
CC: leveldb
Differential Revision: https://reviews.facebook.net/D14037
Summary:
@haobo's suggestions from https://reviews.facebook.net/D13827
Renaming some variables, deprecating purge_log_after_flush, changing for loop into auto for loop.
I have not implemented deleting objects outside of mutex yet because it would require a big code change - we would delete object in db_impl, which currently does not know anything about object because it's defined in version_edit.h (FileMetaData). We should do it at some point, though.
Test Plan: Ran deletefile_test
Reviewers: haobo
Reviewed By: haobo
CC: leveldb, haobo
Differential Revision: https://reviews.facebook.net/D14025
Summary: FindObsoleteFiles() has to be called before PurgeObsoleteFiles() because FindObsoleteFiles() sets manifest_file_number, log_number and prev_log_number to valid values.
Test Plan: deletefile_test now works
Reviewers: dhruba, emayanke, kailiu
Reviewed By: kailiu
CC: leveldb
Differential Revision: https://reviews.facebook.net/D13995
Summary:
Changed leveldb documentation with rocksdb in doc/index.html. Added some of the important options from options.h to doc.
Also removed benchmark files and impl.h, since this is all replaced by RocksDB wikis.
Test Plan: -
Reviewers: dhruba, haobo, kailiu, emayanke, sdong
Reviewed By: dhruba
CC: leveldb
Differential Revision: https://reviews.facebook.net/D13977
Summary: In our project, when writing to the database, we want to form the value as the concatenation of a small header and a larger payload. It's a shame to have to copy the payload just so we can give RocksDB API a linear view of the value. Since RocksDB makes a copy internally, it's easy to support gather writes.
Test Plan: write_batch_test, new test case
Reviewers: dhruba
CC: leveldb
Differential Revision: https://reviews.facebook.net/D13947
Summary:
Here's one solution we discussed on speeding up FindObsoleteFiles. Keep a set of all files in DBImpl and update the set every time we create a file. I probably missed few other spots where we create a file.
It might speed things up a bit, but makes code uglier. I don't really like it.
Much better approach would be to abstract all file handling to a separate class. Think of it as layer between DBImpl and Env. Having a separate class deal with file namings and deletion would benefit both code cleanliness (especially with huge DBImpl) and speed things up. It will take a huge effort to do this, though.
Let's discuss offline today.
Test Plan: Ran ./db_stress, verified that files are getting deleted
Reviewers: dhruba, haobo, kailiu, emayanke
Reviewed By: dhruba
Differential Revision: https://reviews.facebook.net/D13827
Summary: Changed the name and interface for creating HashSkipListRep. Forgot to change it in db_test.
Test Plan: make db_test
Reviewers: haobo
Reviewed By: haobo
Differential Revision: https://reviews.facebook.net/D13965
Summary: What @haobo done with TransformRep, now in TransformRepNoLock. Similar implementation, except that I made DynamicIterator a subclass of Iterator which makes me have less iterator initializations.
Test Plan: ./prefix_test. Seeing huge savings vs. TransformRep again!
Reviewers: dhruba, haobo, sdong, kailiu
Reviewed By: haobo
CC: leveldb, haobo
Differential Revision: https://reviews.facebook.net/D13953
Summary: Allow block based table to configure the way flushing the blocks. This feature will allow us to add support for prefix-aligned block.
Test Plan: make check
Reviewers: dhruba, haobo, sdong, igor
Reviewed By: sdong
CC: leveldb
Differential Revision: https://reviews.facebook.net/D13875
Summary: I this bug from valgrind report and found a place that may potentially leak memory.
Test Plan: re-ran the valgrind and no error any more
Reviewers: emayanke
Reviewed By: emayanke
CC: leveldb
Differential Revision: https://reviews.facebook.net/D13959
Summary:
Added a new call LogFlush() that flushes the log contents to the OS buffers. We never call it with lock held.
We call it once for every Read/Write and often in compaction/flush process so the frequency should not be a problem.
Test Plan: db_test
Reviewers: dhruba, haobo, kailiu, emayanke
Reviewed By: dhruba
CC: leveldb
Differential Revision: https://reviews.facebook.net/D13935
Summary: Added a prefix_seek flag in ReadOptions to indicate that Seek is prefix aware(might not return data with different prefix), and also not bound to a specific prefix. Multiple Seeks and range scans can be invoked on the same iterator. If a specific prefix is specified, this flag will be ignored. Just a quick prototype that works for PrefixHashRep, the new lockless memtable could be easily extended with this support too.
Test Plan: test it on Leaf
Reviewers: dhruba, kailiu, sdong, igor
Reviewed By: igor
CC: leveldb
Differential Revision: https://reviews.facebook.net/D13929
Summary:
Archive cleaning will still happen every WAL_ttl seconds
but archived logs will be deleted only if archive size
is greater then a WAL_size_limit value.
Empty archived logs will be deleted evety WAL_ttl.
Test Plan:
1. Unit tests pass.
2. Benchmark.
Reviewers: emayanke, dhruba, haobo, sdong, kailiu, igor
Reviewed By: emayanke
CC: leveldb
Differential Revision: https://reviews.facebook.net/D13869
Summary: We have to be able to catch last few log outputs before a crash
Test Plan: no
Reviewers: dhruba
Reviewed By: dhruba
CC: leveldb
Differential Revision: https://reviews.facebook.net/D13917
Summary:
This fixes#3130525. Dhruba's suggestion and Tnovak's implementation :)
The issue was with SkipEmptyDataBlocksForward(), but I also changed SkipEmptyDataBlocksBackward(). Is that OK?
Test Plan: Run the logdevice test
Reviewers: dhruba
Reviewed By: dhruba
CC: leveldb
Differential Revision: https://reviews.facebook.net/D13911