Summary:
My understanding is that the purpose of write stall triggers are to wait for auto-compaction to catch up. Without auto-compaction, we don't need to stall writes.
Also with this diff, flush/compaction conditions are recalculated on dynamic option change. Previously the conditions are recalculate only when write stall options are changed.
Test Plan: See the new test. Removed two tests that are no longer valid.
Reviewers: IslamAbdelRahman, sdong
Reviewed By: sdong
Subscribers: andrewkr, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D61437
Summary:
patch for diff https://reviews.facebook.net/D58587
Also change StopWatch class to add a fifth param named overwrite which decides whether to overwrite *elapse or add on it.
Test Plan: make all check -j64
Reviewers: sdong, IslamAbdelRahman
Reviewed By: IslamAbdelRahman
Subscribers: andrewkr, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D61239
Summary: MyRocks is adding support for the user of the SstFileWriter which needs a comparator. It would be more convenient to get the comparator from the column family (which already has to have it) than to have caller keep track of it.
Test Plan: Standard tests (adding one for the new method)
Reviewers: IslamAbdelRahman, sdong
Reviewed By: sdong
Subscribers: andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D61155
Summary: Regex support for c++ is very inconsistent across compilers, converting
the logic to simple string manipulation.
Test Plan: Local test
Subscribers: andrewkr, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D61377
Summary:
parallel tests are broken because gnu_parallel is reading deprecated options from `/etc/parallel/config`
Fix this by passing `--plain` to ignore `/etc/parallel/config`
Test Plan: make check -j64
Reviewers: kradhakrishnan, sdong, andrewkr, yiwu, arahut
Reviewed By: arahut
Subscribers: andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D61359
Summary:
We may not have permission on all /dev/shm to fix the sticky bit.
Making the sticky bit fix advisory.
Test Plan: Run CI job locally
Subscribers: andrewkr, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D61371
Summary:
Fixing build break on Mac
(1) uint64_t fix
(2) O_DIRECT works only for Linux
Test Plan: Build and test on Mac and Unix
Subscribers: andrewkr, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D61353
Summary:
Experiments on column-aware encodings. Supported features: 1) extract data blocks from SST file and encode with specified encodings; 2) Decode encoded data back into row format; 3) Directly extract data blocks and write in row format (without prefix encoding); 4) Get column distribution statistics for column format; 5) Dump data blocks separated by columns in human-readable format.
There is still on-going work on this diff. More refactoring is necessary.
Test Plan: Wrote tests in `column_aware_encoding_test.cc`. More tests should be added.
Reviewers: sdong
Reviewed By: sdong
Subscribers: arahut, andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D60027
Summary:
The patch is a continuation of part 5. It glues the abstraction for
file layout and metadata, and flush out the implementation of the API. It
adds unit tests for the implementation.
Test Plan: Run unit tests
Subscribers: andrewkr, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D57549
Summary: Add a benchmark to `db_bench`. In this benchmark, a write thread will populate time series data in the format of 'id | timestamp', and multiple read threads will randomly retrieve all data from one id at a time.
Test Plan: Run the benchmark: `num=134217728;bpl=536870912;mb=67108864;overlap=10;mcz=2;del=300000000;levels=6;ctrig=4;delay=8;stop=12;wbn=3;mbc=20;wbs=134217728;dds=0;sync=0;t=32;vs=800;bs=4096;cs=17179869184;of=500000;wps=0;si=10000000; kir=100000; dir=/data/users/jhli/test/; ./db_bench --benchmarks=timeseries --disable_seek_compaction=1 --mmap_read=0 --statistics=1 --histogram=1 --num=$num --threads=$t --value_size=$vs --block_size=$bs --cache_size=$cs --bloom_bits=10 --cache_numshardbits=6 --open_files=$of --verify_checksum=1 --db=$dir --sync=$sync --disable_wal=0 --compression_type=none --stats_interval=$si --compression_ratio=1 --disable_data_sync=$dds --write_buffer_size=$wbs --target_file_size_base=$mb --max_write_buffer_number=$wbn --max_background_compactions=$mbc --level0_file_num_compaction_trigger=$ctrig --level0_slowdown_writes_trigger=$delay --level0_stop_writes_trigger=$stop --num_levels=$levels --delete_obsolete_files_period_micros=$del --min_level_to_compress=$mcz --max_grandparent_overlap_factor=$overlap --stats_per_interval=1 --max_bytes_for_level_base=$bpl --use_existing_db=0 --key_id_range=$kir`
Reviewers: andrewkr, sdong
Reviewed By: sdong
Subscribers: lgalanis, andrewkr, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D60651
Summary:
DBTest.CompressionStatsTest on non_shm test where the storage device is slow
DBTest.CompressionStatsTest assumes that a flush happens to check the number of compressed blocks.
This is not always true if the Flush is slow, make the test more deterministic by forcing a flush before doing the check
Test Plan: Run the test locally
Reviewers: andrewkr, yiwu, lightmark, sdong
Reviewed By: sdong
Subscribers: andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D61317
Summary:
db_stress test is now failing because of this scenario
- run db_stress with merge_operator enabled (now we have a db with merge operands)
- run db_stress with merge_operator disabled (now when we fail to open the db)
the solution is to pass the merge_operator to the DB even if we are not going to do any merge operations
Test Plan: Check the failure
Reviewers: andrewkr, yiwu, sdong
Reviewed By: sdong
Subscribers: andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D61311
Summary:
The call stack used to look like this during static initialization:
#0 0x00000000008032d1 in rocksdb::ThreadLocalPtr::StaticMeta::StaticMeta() (this=0x7ffff683b300) at util/thread_local.cc:172
#1 0x00000000008030a7 in rocksdb::ThreadLocalPtr::Instance() () at util/thread_local.cc:135
#2 0x000000000080310f in rocksdb::ThreadLocalPtr::StaticMeta::Mutex() () at util/thread_local.cc:141
#3 0x0000000000803103 in rocksdb::ThreadLocalPtr::StaticMeta::InitSingletons() () at util/thread_local.cc:139
#4 0x000000000080305d in rocksdb::ThreadLocalPtr::InitSingletons() () at util/thread_local.cc:106
It involves outer/inner classes and the call stacks goes
outer->inner->outer->inner, which is too difficult to understand. We can avoid
a level of back-and-forth by skipping StaticMeta::InitSingletons(), which
doesn't initialize anything beyond what ThreadLocalPtr::Instance() already
initializes.
Now the call stack looks like this during static initialization:
#0 0x00000000008032c5 in rocksdb::ThreadLocalPtr::StaticMeta::StaticMeta() (this=0x7ffff683b300) at util/thread_local.cc:170
#1 0x00000000008030a7 in rocksdb::ThreadLocalPtr::Instance() () at util/thread_local.cc:135
#2 0x000000000080305d in rocksdb::ThreadLocalPtr::InitSingletons() () at util/thread_local.cc:106
Test Plan:
unit tests
verify StaticMeta::mutex_ is still initialized in DefaultEnv() (StaticMeta::mutex_ is the only variable intended to be initialized via StaticMeta::InitSingletons() which I removed)
#0 0x00000000005cee17 in rocksdb::port::Mutex::Mutex(bool) (this=0x7ffff69500b0, adaptive=false) at port/port_posix.cc:52
#1 0x0000000000769cf8 in rocksdb::ThreadLocalPtr::StaticMeta::StaticMeta() (this=0x7ffff6950000) at util/thread_local.cc:168
#2 0x0000000000769a53 in rocksdb::ThreadLocalPtr::Instance() () at util/thread_local.cc:133
#3 0x0000000000769a09 in rocksdb::ThreadLocalPtr::InitSingletons() () at util/thread_local.cc:105
#4 0x0000000000647d98 in rocksdb::Env::Default() () at util/env_posix.cc:845
Reviewers: lightmark, yhchiang, sdong
Reviewed By: sdong
Subscribers: arahut, IslamAbdelRahman, yiwu, andrewkr, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D60813
Summary: regression tests to make sure seek keys not in domain would not fail assertion
Test Plan:
```
[gzh@dev6163.prn2 ~/local/rocksdb] ./prefix_test --gtest_filter=SamePrefixTest.*
/tmp/rocksdbtest-112628/prefix_test
Note: Google Test filter = SamePrefixTest.*
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from SamePrefixTest
[ RUN ] SamePrefixTest.InDomainTest
[ OK ] SamePrefixTest.InDomainTest (211 ms)
[----------] 1 test from SamePrefixTest (211 ms total)
[----------] Global test environment tear-down
[==========] 1 test from 1 test case ran. (211 ms total)
```
Reviewers: andrewkr, sdong
Reviewed By: sdong
Subscribers: andrewkr, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D61161
Summary: We rely on a basic evaluation logic: when using an expression `A || B`, the `B` part will only get evaluated when `A` fails. Therefore the task creation tool is guaranteed to run if previous build/test step failed. To indicate the correct return value to shell, the task creation tool will call `exit(1)` which will cause Sandcastle to mark it as a failure.
Test Plan:
- Land the changes.
- Trigger a RocksDB contrun - observe the results.
- Look at the results from the nightly runs and fix issues as necessary.
Current testing so far has been in isolation for various components.
Reviewers: andrewkr, IslamAbdelRahman, sdong
Reviewed By: sdong
Subscribers: andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D60909
Summary: OnTableFileCreated() now is also called when the file creaion fails. In that case, we shouldn't assert the file size is not 0.
Test Plan: Run crash test
Reviewers: yiwu, andrewkr, IslamAbdelRahman
Reviewed By: IslamAbdelRahman
Subscribers: IslamAbdelRahman, leveldb, andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D61137
Summary:
Removing moreutils from sandcastle and adding gnu parallel.
Then passing in J= nproc command
Test Plan: Testing on sandcastle
Reviewers: sdong, kradhakrishnan
Reviewed By: kradhakrishnan
Subscribers: andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D61017
Summary: Increase test timeout to some tests to unblock CI.
Test Plan: watch how it runs.
Reviewers: kradhakrishnan, andrewkr, IslamAbdelRahman
Reviewed By: IslamAbdelRahman
Subscribers: IslamAbdelRahman, leveldb, andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D61263
Summary: Extend the option memtable_prefix_bloom_huge_page_tlb_size from just putting memtable bloom filter to huge page to memtable itself too.
Test Plan: Run all existing tests.
Reviewers: IslamAbdelRahman, yhchiang, andrewkr
Reviewed By: andrewkr
Subscribers: leveldb, andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D60513
Summary: We may wrongly drop delete operation if we pick a file with the entry to be delete, the put entry of the same user key is in the next file in the level, and the next file is not picked. We expand compaction inputs for output level too.
Test Plan: Add unit tests that reproduct the bug of dropping delete entry. Change compaction_picker_test to assert the new behavior.
Reviewers: IslamAbdelRahman, igor
Reviewed By: igor
Subscribers: leveldb, andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D61173
Summary: std::make_unique is not standard and not always available, remove it
Test Plan: Run "make clean jclean rocksdbjava jtest -j8" on my mac
Reviewers: yhchiang, yiwu, sdong, andrewkr
Reviewed By: andrewkr
Subscribers: andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D61143
Summary: old typos with FILTER/INDEX_CACHE
Test Plan: still pass this unit test
Reviewers: andrewkr
Reviewed By: andrewkr
Subscribers: andrewkr, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D61185
Summary:
- Added a new subcommand, 'ldb restore', that restores from backup
- Made backup_env_uri optional (also for 'ldb backup') because it can use db_env when backup_env isn't provided
Test Plan:
verify backup and restore commands work:
$ ./ldb backup --db=/data/users/andrewkr/rocksdb/db_bench-out/ --num_threads 1 --backup_dir=/data/users/andrewkr/rocksdb/db_bench-out/backup/
$ ./ldb restore --db=/data/users/andrewkr/rocksdb/db_bench-out/restore/ --num_threads 1 --backup_dir=/data/users/andrewkr/rocksdb/db_bench-out/backup/
Reviewers: sdong, wanning
Reviewed By: wanning
Subscribers: andrewkr, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D60849
Summary:
this script creates a summary message based on the test type and
output. Previously it only ran during contbuild. This diff makes it also
run on commit-triggered diffs.
Test Plan: will see if it works on diff's sandcastle tests
Reviewers: kradhakrishnan, lightmark, sdong
Reviewed By: sdong
Subscribers: andrewkr, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D60879
Summary:
TickersNameMap is not consistent with Tickers enum.
this cause us to report wrong statistics and sometimes to access TickersNameMap outside it's boundary causing crashes (in Fb303 statistics)
Test Plan: added new unit test
Reviewers: sdong, kradhakrishnan
Reviewed By: kradhakrishnan
Subscribers: andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D61083
Summary:
MergeContext::copied_operands contain strings that MergeContext::operand_list_ Slices point to
It's possible that when MergeContext::copied_operands grow, these strings are moved and there place in memory is changed, this will cause MergeContext::operand_list_ to point to invalid memory.
fix this problem by using unique_ptr<string> instead of string
Test Plan: run tests under mac/clang
Reviewers: sdong, yiwu
Reviewed By: yiwu
Subscribers: andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D61023
Summary: The test is flaky on Travis in osx environment. The background flush the test wanting to block can run behind the L2 manual compaction, making the test actually blocking the L2 compaction and won't able to proceed.
Test Plan: Test run on travis
Reviewers: kradhakrishnan, sdong, andrewkr, IslamAbdelRahman
Reviewed By: IslamAbdelRahman
Subscribers: andrewkr, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D61101
Summary: RocksDB lite don't support dynamic options. Disable the two test from lite build, and assert `SetOptions` should return `status::OK`.
Test Plan: Run the db_options test under lite build and normal build.
Reviewers: sdong
Reviewed By: sdong
Subscribers: andrewkr, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D61119
Summary:
Stale log files can be deleted out of order. This can happen for various reasons. One of the reason is that no data is ever inserted to a column family and we have an optimization to update its log number, but not all the old log files are cleaned up (the case shown in the unit tests added). It can also happen when we simply delete multiple log files out of order.
This causes data corruption because we simply increase seqID after processing the next row and we may end up with writing data with smaller seqID than what is already flushed to memtables.
In DB recovery, for the oldest files we are replaying, if there it contains no data for any column family, we ignore the sequence IDs in the file.
Test Plan: Add two unit tests that fail without the fix.
Reviewers: IslamAbdelRahman, igor, yiwu
Reviewed By: yiwu
Subscribers: hermanlee4, yoshinorim, leveldb, andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D60891
Summary:
Tsan crash white-box test was disabled because it never ends. Re-enable it with reduced killing odds.
Add a parameter in crash test script to allow we pass it through an environment variable.
Test Plan: Run it manually.
Reviewers: IslamAbdelRahman
Reviewed By: IslamAbdelRahman
Subscribers: leveldb, andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D61053
getline on std::cin can be very inefficient when ldb is loading large values, with high CPU usage in libc _IO_(un)getc, this is because of the performance penalty that comes from synchronizing stdio and iostream buffers.
See the reproducers and tests in #1133 .
If an ifstream on /dev/stdin is used (when available) then using ldb to load large values can be much more efficient.
I thought for ldb load, that this approach is preferable to using <cstdio> or std::ios_base::sync_with_stdio(false).
I couldn't think of a use case where ldb load would need to support reading unbuffered input, an alternative approach would be to add support for passing --input_file=/dev/stdin.
I have a CLA in place, thanks.
The CI tests were failing at the time of https://github.com/facebook/rocksdb/pull/1156, so this change and PR will supersede it.
* fixes 1220: rocksjni build fails on Windows due to variable-size array declaration
using new keyword to create variable-size arrays in order to satisfy most of the compilers
* fixes 1220: rocksjni build fails on Windows due to variable-size array declaration
using unique_ptr keyword to create variable-size arrays in order to satisfy most of the compilers
Summary: Multiput atomiciy is broken across multiple column families if we don't sync WAL before flushing one column family. The WAL file may contain a write batch containing writes to a key to the CF to be flushed and a key to other CF. If we don't sync WAL before flushing, if machine crashes after flushing, the write batch will only be partial recovered. Data to other CFs are lost.
Test Plan: Add a new unit test which will fail without the diff.
Reviewers: yhchiang, IslamAbdelRahman, igor, yiwu
Reviewed By: yiwu
Subscribers: yiwu, leveldb, andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D60915