Commit Graph

3874 Commits

Author SHA1 Message Date
Yueh-Hsuan Chiang
acee2b08a2 Fixed endless loop in DBIter::FindPrevUserKey()
Summary: Fixed endless loop in DBIter::FindPrevUserKey()

Test Plan: ./db_stress --test_batches_snapshots=1 --threads=32 --write_buffer_size=4194304 --destroy_db_initially=0 --reopen=20 --readpercent=45 --prefixpercent=5 --writepercent=35 --delpercent=5 --iterpercent=10 --db=/tmp/rocksdb_crashtest_KdCI5F --max_key=100000000 --mmap_read=0 --block_size=16384 --cache_size=1048576 --open_files=500000 --verify_checksum=1 --sync=0 --progress_reports=0 --disable_wal=0 --disable_data_sync=1 --target_file_size_base=2097152 --target_file_size_multiplier=2 --max_write_buffer_number=3 --max_background_compactions=20 --max_bytes_for_level_base=10485760 --filter_deletes=0 --memtablerep=prefix_hash --prefix_size=7 --ops_per_thread=200 --kill_random_test=97

Reviewers: tnovak, igor, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D41085
2015-07-02 16:10:31 -07:00
Mike Kolupaev
218487d8dc [wal changes 1/3] fixed unbounded wal growth in some workloads
Summary:
This fixes the following scenario we've hit:
 - we reached max_total_wal_size, created a new wal and scheduled flushing all memtables corresponding to the old one,
 - before the last of these flushes started its column family was dropped; the last background flush call was a no-op; no one removed the old wal from alive_logs_,
 - hours have passed and no flushes happened even though lots of data was written; data is written to different column families, compactions are disabled; old column families are dropped before memtable grows big enough to trigger a flush; the old wal still sits in alive_logs_ preventing max_total_wal_size limit from kicking in,
 - a few more hours pass and we run out disk space because of one huge .log file.

Test Plan: `make check`; backported the new test, checked that it fails without this diff

Reviewers: igor

Reviewed By: igor

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D40893
2015-07-02 14:27:00 -07:00
Dmitri Smirnov
feb99c31a4 Merge remote-tracking branch 'origin' into ms_win_port 2015-07-02 13:14:18 -07:00
Aaron Feldman
e70115e71b Fix unity build by removing anonymous namespace
Summary: see title

Test Plan: run 'make unity'

Reviewers: igor

Reviewed By: igor

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D41079
2015-07-02 12:27:35 -07:00
agiardullo
4159f5b87b Prepare 3.12
Summary: About to cut release

Test Plan: none

Reviewers: igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D41061
2015-07-02 12:20:36 -07:00
Aaron Feldman
a69bc91e37 Multithreaded backup and restore in BackupEngineImpl
Summary:
Add a new field: BackupableDBOptions.max_background_copies.
CreateNewBackup() and RestoreDBFromBackup() will use this number of threads to perform copies.
If there is a backup rate limit, then max_background_copies must be 1.
Update backupable_db_test.cc to test multi-threaded backup and restore.
Update backupable_db_test.cc to test backups when the backup environment is not the same as the database environment.

Test Plan:
Run ./backupable_db_test
Run valgrind ./backupable_db_test
Run with TSAN and ASAN

Reviewers: yhchiang, rven, anthony, sdong, igor

Reviewed By: igor

Subscribers: yhchiang, anthony, sdong, leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D40725
2015-07-02 11:35:51 -07:00
Dmitri Smirnov
9dbde7277c Merge remote-tracking branch 'origin' into ms_win_port 2015-07-02 11:34:22 -07:00
Yueh-Hsuan Chiang
03d433ee65 [RocksJava] Fixed test failures
Summary:
The option bottommost_level_compaction was introduced lately.
This option breaks the Java API behavior. To prevent the library
from doing so we set that option to a fixed value in Java.

In future we are going to remove that portion and replace the
hardcoded options using a more flexible way.

Fixed bug introduced by WriteBatchWithIndex Patch

Lately icanadi changed the behavior of WriteBatchWithIndex.
See commit: 821cff114e

This commit solves problems introduced by above mentioned commit.

Test Plan:
make rocksdbjava
make jtest

Reviewers: adamretter, ankgup87, yhchiang

Reviewed By: yhchiang

Subscribers: igor, dhruba

Differential Revision: https://reviews.facebook.net/D40647
2015-07-01 23:22:03 -07:00
Dmitri Smirnov
326da912de Add string.h to Histogram as we init the array out of curly braces 2015-07-01 17:21:38 -07:00
Dmitri Smirnov
ca2fe2c1b6 Address GCC compilation issues
invalid suffix on literal
 no return statement in function returning non-void CuckooStep::operator=
 extra qualification ‘rocksdb::spatial::Variant::
 dereferencing type-punned pointer will break strict-aliasing rules
2015-07-01 17:04:08 -07:00
Dmitri Smirnov
19e13a595d Fix header inclusion 2015-07-01 16:35:51 -07:00
Dmitri Smirnov
18285c1e2f Windows Port from Microsoft
Summary: Make RocksDb build and run on Windows to be functionally
 complete and performant. All existing test cases run with no
 regressions. Performance numbers are in the pull-request.

 Test plan: make all of the existing unit tests pass, obtain perf numbers.

 Co-authored-by: Praveen Rao praveensinghrao@outlook.com
 Co-authored-by: Sherlock Huang baihan.huang@gmail.com
 Co-authored-by: Alex Zinoviev alexander.zinoviev@me.com
 Co-authored-by: Dmitri Smirnov dmitrism@microsoft.com
2015-07-01 16:13:56 -07:00
Yueh-Hsuan Chiang
c00948d5e1 [RocksJava] Fix test failure of compactRangeToLevel
Summary:
Rewrite Java tests compactRangeToLevel and compactRangeToLevelColumnFamily
to make them more deterministic and robust.

Test Plan:
make rocksdbjava
make jtest

Reviewers: anthony, fyrz, adamretter, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D40941
2015-07-01 16:03:56 -07:00
sdong
05e2831966 Allocate LevelFileIteratorState and LevelFileNumIterator from DB iterator's arena
Summary: Try to allocate LevelFileIteratorState and LevelFileNumIterator from DB iterator's arena, instead of calling malloc and free.

Test Plan: valgrind check

Reviewers: rven, yhchiang, anthony, kradhakrishnan, igor

Reviewed By: igor

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D40929
2015-06-30 17:30:38 -07:00
Igor Canadi
436ed904da Add rpath option to production builds for 4.8.1 toolchain
Summary: Copy change from D37533 to gcc 4.8.1 config

Test Plan: make db_bench, `ldd db_bench`, try running it

Reviewers: MarkCallaghan, anthony

Reviewed By: anthony

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D40845
2015-06-30 13:30:54 -07:00
krad
b0f1927dbb Increasing timeout for drop writes.
Summary: We have a race in the way test works. We avoided the race by adding the
wait to the counter. I thought 1s was eternity, but that is not true in some
scenarios. Increasing the timeout to 10s and adding warnings.

Also, adding nosleep to avoid the case where the wakeup thread is waiting behind
the sleeping thread for scheduling.

Test Plan: Run make check

Reviewers: siying igorcanadi

CC: leveldb@

Task ID: #7312624

Blame Rev:
2015-06-30 11:11:56 -07:00
Tomislav Novak
ec70fea4c4 Fix a comparison in DBIter::FindPrevUserKey()
Summary:
When seek target is a merge key (`kTypeMerge`), `DBIter::FindNextUserEntry()`
advances the underlying iterator _past_ the current key (`saved_key_`); see
`MergeValuesNewToOld()`. However, `FindPrevUserKey()` assumes that `iter_`
points to an entry with the same user key as `saved_key_`. As a result,
`it->Seek(key) && it->Prev()` can cause the iterator to be positioned at the
_next_, instead of the previous, entry (new test, written by @lovro, reproduces
the bug).

This diff changes `FindPrevUserKey()` to also skip keys that are _greater_ than
`saved_key_`.

Test Plan: db_test

Reviewers: igor, sdong

Reviewed By: sdong

Subscribers: leveldb, dhruba, lovro

Differential Revision: https://reviews.facebook.net/D40791
2015-06-29 17:04:03 -07:00
Yueh-Hsuan Chiang
501591c423 Make column_family_test runnable in ROCKSDB_LITE
Summary: Make column_family_test runnable in ROCKSDB_LITE.

Test Plan: column_family_test

Reviewers: sdong, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D40251
2015-06-29 14:39:01 -07:00
krad
91cb82f34e Merge branch 'master' of github.com:facebook/rocksdb 2015-06-29 11:47:24 -07:00
Igor Canadi
09f5a4b486 set -e in fb_compile_mongo.sh
Summary: Based on @anthony's feedback, we want to fail early if our static linking fails.

Test Plan: none

Reviewers: anthony

Reviewed By: anthony

Subscribers: dhruba, anthony, leveldb

Differential Revision: https://reviews.facebook.net/D40839
2015-06-29 11:43:25 -07:00
krad
6199cba998 Fix race in unit test.
Summary: Avoid falling victim to race condition.

Test Plan: Run the unit test

Reviewers: sdong igor

CC: leveldb@

Task ID: #7312624

Blame Rev:
2015-06-29 11:40:21 -07:00
Igor Canadi
0a019d74a0 Use malloc_usable_size() for accounting block cache size
Summary:
Currently, when we insert something into block cache, we say that the block cache capacity decreased by the size of the block. However, size of the block might be less than the actual memory used by this object. For example, 4.5KB block will actually use 8KB of memory. So even if we configure block cache to 10GB, our actually memory usage of block cache will be 20GB!

This problem showed up a lot in testing and just recently also showed up in MongoRocks production where we were using 30GB more memory than expected.

This diff will fix the problem. Instead of counting the block size, we will count memory used by the block. That way, a block cache configured to be 10GB will actually use only 10GB of memory.

I'm using non-portable function and I couldn't find info on portability on Google. However, it seems to work on Linux, which will cover majority of our use-cases.

Test Plan:
1. fill up mongo instance with 80GB of data
2. restart mongo with block cache size configured to 10GB
3. do a table scan in mongo
4. memory usage before the diff: 12GB. memory usage after the diff: 10.5GB

Reviewers: sdong, MarkCallaghan, rven, yhchiang

Reviewed By: yhchiang

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D40635
2015-06-26 11:48:09 -07:00
Igor Canadi
4cbc4e6f88 Call merge operators with empty values
Summary: It's not really nice to call user's API with garbage data in new_value. This diff makes sure that new_value is empty before calling the merge operator.

Test Plan: Added assert to Merge operator in merge_test

Reviewers: sdong, yhchiang

Reviewed By: yhchiang

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D40773
2015-06-26 11:35:46 -07:00
Igor Canadi
619167ee66 Fix mac compile
Summary: as title

Test Plan: make check

Reviewers: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D40785
2015-06-26 10:29:24 -07:00
Igor Canadi
472e64d39e Improve fb_compile_mongo.sh
Summary: If we create a new temp directory for each build, scons will recompile everything because we have different parameters. Instead, let's set up a constant path to our static lib. That way we won't have to recompile.

Test Plan: Run fb_compile_mongo.sh twice -- second time it didn't recompile everything

Reviewers: MarkCallaghan, anthony

Reviewed By: anthony

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D40707
2015-06-26 10:24:08 -07:00
Venkatesh Radhakrishnan
c9cd404bcd Make flush check for shutdown
Summary:
Fixes task 7156865 where a compaction causes a hang in flush
memtable if CancelAllBackgroundWork was called prior to it.
Stack trace is in : https://phabricator.fb.com/P19848829
We end up waiting for a flush which will never happen because there are no background threads.

Test Plan: PreShutdownFlush

Reviewers: sdong, igor

Reviewed By: sdong, igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D40617
2015-06-25 14:43:25 -07:00
Poornima Chozhiyath Raman
4fb09c6871 Updating SeekToLast with upper bound
Summary: #7124486: RocksDB's Iterator.SeekToLast should seek to the last key before iterate_upper_bound if presents

Test Plan: ./db_iter_test run successfully with the new testcase

Reviewers: rven, yhchiang, igor, anthony, kradhakrishnan, sdong

Reviewed By: sdong

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D40425
2015-06-25 09:44:30 -07:00
Igor Canadi
dadc429767 Reproducible MongoRocks compile with FB toolchain
Summary:
Added a script that will compile MongoRocks with the same flags as RocksDB binary. On FB infra, we can now do:

  cd ~/rocksdb; make static_lib
  cd ~/mongo; ~/rocksdb/build_tools/fb_compile_mongo.sh

No need to upgrade the g++ on the devbox (like Aaron and I did) or maintain a separate script to compile (like Mark did)

fb_compile_mongo.sh gets the settings from fbcode_config.sh, so it also makes it easier to upgrade the environment one day.

Test Plan: Compiled mongod with new script. Also, ldd output looks good: https://phabricator.fb.com/P19891602

Reviewers: AaronFeldman, MarkCallaghan, anthony

Reviewed By: anthony

Subscribers: anthony, dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D40659
2015-06-24 15:09:55 -07:00
Yueh-Hsuan Chiang
62a8fd154a Make stringappend_test runnable in ROCKSDB_LITE
Summary: Make stringappend_test runnable in ROCKSDB_LITE

Test Plan: stringappend_test

Reviewers: sdong, rven, anthony, kradhakrishnan, IslamAbdelRahman, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D40593
2015-06-24 15:01:43 -07:00
Yueh-Hsuan Chiang
48da7a9cad Improve the comment for BYTES_READ in statistics.
Summary:
BYTES_READ only count the number of logical bytes read from
the DB::Get() function.  It neither includes all logical bytes read
nor indicates IO read bytes.

This patch improves the comment for BYTES_READ.

Test Plan: Only change comment.

Reviewers: sdong, rven, anthony, kradhakrishnan, IslamAbdelRahman, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D40599
2015-06-24 15:00:51 -07:00
Yueh-Hsuan Chiang
72cab88959 Block redis_test in ROCKSDB_LITE
Summary: Block redis_test in ROCKSDB_LITE as utilities not supported in ROCKSDB_LITE.

Test Plan: redis_test

Reviewers: sdong, igor, rven, anthony, kradhakrishnan, IslamAbdelRahman

Reviewed By: IslamAbdelRahman

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D40587
2015-06-24 01:44:21 -07:00
Yueh-Hsuan Chiang
dec2c9f564 Make table_properties_collector_test runnable in ROCKSDB_LITE
Summary: Make table_properties_collector_test runnable in ROCKSDB_LITE

Test Plan: table_properties_collector_test

Reviewers: sdong, rven, anthony, kradhakrishnan, IslamAbdelRahman, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D40581
2015-06-24 01:38:53 -07:00
Yueh-Hsuan Chiang
0b1ffe2e1d Remove -Wl,--no-as-needed flag when making shared_lib in OSX and IOS
Summary:
Remove -Wl,--no-as-needed flag when making shared_lib in OSX and IOS as
those environment doe not have compile option --no-as-needed

  ld: unknown option: --no-as-needed
  clang: error: linker command failed with exit code 1 (use -v to see invocation)

Test Plan: make shared_lib

Reviewers: meyering, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D40353
2015-06-23 16:32:59 -07:00
Islam AbdelRahman
674b1181cf Bottommost level compaction option
Summary: Replace force_bottommost_level_compaction in CompactRangeOption with an option that allow the user to (always skip, always compact, compact if compaction filter is present) the bottommost level for level based compaction.

Test Plan: make check

Reviewers: sdong, yhchiang, igor

Reviewed By: igor

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D40527
2015-06-23 13:32:40 -07:00
Giuseppe Ottaviano
782a1590f9 Implement a table-level row cache
Summary:
Implementation of a table-level row cache.
It only caches point queries done through the `DB::Get` interface, queries done through the `Iterator` interface will completely skip the cache.

Supports snapshots and merge operations.

Test Plan: Ran `make valgrind_check commit-prereq`

Reviewers: igor, philipp, sdong

Reviewed By: sdong

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D39849
2015-06-23 10:25:45 -07:00
krad
de85e4cadf Introduce WAL recovery consistency levels
Summary:
The "one size fits all" approach with WAL recovery will only introduce inconvenience for our varied clients as we go forward. The current recovery is a bit heuristic. We introduce the following levels of consistency while replaying the WAL.

1. RecoverAfterRestart (kTolerateCorruptedTailRecords)

This mocks the current recovery mode.

2. RecoverAfterCleanShutdown (kAbsoluteConsistency)

This is ideal for unit test and cases where the store is shutdown cleanly. We tolerate no corruption or incomplete writes.

3. RecoverPointInTime (kPointInTimeRecovery)

This is ideal when using devices with controller cache or file systems which can loose data on restart. We recover upto the point were is no corruption or incomplete write.

4. RecoverAfterDisaster (kSkipAnyCorruptRecord)

This is ideal mode to recover data. We tolerate corruption and incomplete writes, and we hop over those sections that we cannot make sense of salvaging as many records as possible.

Test Plan:
(1) Run added unit test to cover all levels.
(2) Run make check.

Reviewers: leveldb, sdong, igor

Subscribers: yoshinorim, dhruba

Differential Revision: https://reviews.facebook.net/D38487
2015-06-22 15:28:12 -07:00
Islam AbdelRahman
530534fceb Fix trivial move merge
Summary: Fixing bad merge

Test Plan: make -j64 check (this is not enough to verify the fix)

Reviewers: igor, sdong

Reviewed By: sdong

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D40521
2015-06-22 15:20:30 -07:00
krad
7015fd81c4 Add read_nanos to IOStatsContext.
Summary: MyRocks need a mechanism to track read outliers. We need to expose this
stat.

Test Plan: None

Reviewers: sdong

CC: leveldb

Task ID: #7152512

Blame Rev:
2015-06-22 11:09:35 -07:00
Aaron Feldman
7160f5d80c Fix broken gflags link
Summary: Fix broken gflags link

Test Plan: Follow the link

Reviewers: igor

Reviewed By: igor

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D40503
2015-06-22 09:31:52 -07:00
Jesper Lundgren
dda74111ae add setMaxTableFilesSize Options unit test 2015-06-20 16:24:09 +08:00
Jesper Lundgren
d62b6ed838 add setMaxTableFilesSize to JNI interface 2015-06-20 16:23:22 +08:00
Venkatesh Radhakrishnan
e1d3c7dbe4 Fixing valgrind error in checkpoint_test
Summary: Fixed a valgrind issue in checkpoint_test

Test Plan: valgrind on checkpoint_test

Reviewers: igor, anthony

Reviewed By: anthony

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D40455
2015-06-19 20:21:23 -07:00
Michael Callahan
3bdec09cb7 Remove ldb_tests.py from make check until it is working again.
Summary: Recent checkin added ldb_test.py to the make check target but the test fails.  Remove it again for now and make task.

Test Plan: No more ldb_tests.py running

Reviewers: igor, anthony

Reviewed By: anthony

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D40449
2015-06-19 17:41:49 -07:00
Michael Callahan
15325bf55b First version of rocksdb_dump and rocksdb_undump.
Summary: Hack up rocksdb_dump and rocksdb_undump utilities to get this task rolling/promote discussion.

Test Plan: Dump/undump databases recursively to see if nothing is lost.

Reviewers: sdong, yhchiang, rven, anthony, kradhakrishnan, igor

Reviewed By: igor

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D37269
2015-06-19 16:24:36 -07:00
Venkatesh Radhakrishnan
04251e1e3a Add wal files to Checkpoint for multiple column families.
Summary:
When there are multiple column families, the flush in
GetLiveFiles is not atomic, so that there are entries in the wal files
which are needed to get a consisten RocksDB. We now add the log files to
the checkpoint.

Test Plan:
CheckpointCF - This test forces more data to be written to
the other column families after the flush of the first column family but
before the second.

Reviewers: igor, yhchiang, IslamAbdelRahman, anthony, kradhakrishnan, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D40323
2015-06-19 16:08:31 -07:00
Aaron Feldman
18cc5018b7 Fix memory leaks in PinnedUsageTest
Summary: See title

Test Plan: Run valgrind ./cache_test

Reviewers: igor

Reviewed By: igor

Subscribers: anthony, dhruba

Differential Revision: https://reviews.facebook.net/D40419
2015-06-19 09:43:08 -07:00
Igor Canadi
bf03f59c11 Disable CompressLevelCompaction() if Zlib is not supported
Summary: CompressLevelCompaction() depends on Zlib. We should skip it when zlib is not present.

Test Plan: `make check` without zlib

Reviewers: yhchiang

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D40401
2015-06-18 18:46:26 -07:00
Yueh-Hsuan Chiang
df719d4964 Make autovector_test runnable in ROCKSDB_LITE
Summary: Make autovector_test runnable in ROCKSDB_LITE

Test Plan: autovector_test

Reviewers: sdong, rven, anthony, kradhakrishnan, IslamAbdelRahman, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D40245
2015-06-18 15:58:00 -07:00
Yueh-Hsuan Chiang
4d6d47688c Block geodb_test in ROCKSDB_LITE
Summary:
Block geodb_test in ROCKSDB_LITE as geodb is not supported
in ROCKSDB_LITE

Test Plan: geodb_test

Reviewers: sdong, rven, anthony, kradhakrishnan, IslamAbdelRahman, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D40335
2015-06-18 15:57:19 -07:00
Yueh-Hsuan Chiang
71b438c4a6 Remove unused target --- compactor_test
Summary:
Remove compactor_test, which depends on a directory not exist
in our code base.
    make compactor_test
    GEN      util/build_version.cc
    GEN      util/build_version.cc
    make: *** No rule to make target `utilities/compaction/compactor_test.o', needed by `compactor_test'.  Stop.

Test Plan: verify the output message of make compactor_test

Reviewers: rven, anthony, kradhakrishnan, igor, IslamAbdelRahman, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D40341
2015-06-18 15:54:52 -07:00