Commit Graph

50 Commits

Author SHA1 Message Date
Islam AbdelRahman
574b543f80 Rename merger.h -> merging_iterator.h
Summary:
merger.h was always a confusing name for me, simply give the file a better name
Closes https://github.com/facebook/rocksdb/pull/1836

Differential Revision: D4505357

Pulled By: IslamAbdelRahman

fbshipit-source-id: 07b28d8
2017-02-02 16:54:19 -08:00
Andrew Kryczka
9da4d542fe Range deletions unsupported in tailing iterator
Summary:
change the iterator status to NotSupported as soon as a range tombstone
is encountered by a ForwardIterator.
Closes https://github.com/facebook/rocksdb/pull/1593

Differential Revision: D4246294

Pulled By: ajkr

fbshipit-source-id: aef9f49
2017-01-23 13:39:12 -08:00
Andrew Kryczka
48e8baebc0 Decouple data iterator and range deletion iterator in TableCache
Summary:
Previously we used TableCache::NewIterator() for multiple purposes (data
block iterator and range deletion iterator), and returned non-ok status in
the data block iterator. In one case where the caller only used the range
deletion block iterator (9e7cf3469b/db/version_set.cc (L965-L973)),
we didn't check/free the data block iterator containing non-ok status, which
caused a valgrind error.

So, this diff decouples creation of data block and range deletion block iterators,
and updates the callers accordingly. Both functions can return non-ok status
in an InternalIterator. Since the non-ok status is returned in an iterator that the
callers will definitely use, it should be more usable/less error-prone.
Closes https://github.com/facebook/rocksdb/pull/1513

Differential Revision: D4181423

Pulled By: ajkr

fbshipit-source-id: 835b8f5
2016-11-15 17:24:28 -08:00
Islam AbdelRahman
193221e0a1 Fix Forward Iterator Seek()/SeekToFirst()
Summary:
In ForwardIterator::SeekInternal(), we may end up passing empty Slice representing an internal key to InternalKeyComparator::Compare.
and when we try to extract the user key from this empty Slice, we will create a slice with size = 0 - 8 ( which will overflow and cause us to read invalid memory as well )

Scenarios to reproduce these issues are in the unit tests
Closes https://github.com/facebook/rocksdb/pull/1467

Differential Revision: D4136660

Pulled By: lightmark

fbshipit-source-id: 151e128
2016-11-08 13:54:31 -08:00
Aaron Gao
bc429de490 revert fractional cascading in farward iterator
Summary: As offline discussion with Siying, revert this since it has bug with seek.

Test Plan: make check -j64

Reviewers: yiwu, andrewkr, sdong

Reviewed By: sdong

Subscribers: andrewkr, dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D65559
2016-10-28 10:25:39 -07:00
Aaron Gao
f517d9dd09 Add SeekForPrev() to Iterator
Summary:
Add new Iterator API, `SeekForPrev`: find the last key that <= target key
support prefix_extractor
support prefix_same_as_start
support upper_bound
not supported in iterators without Prev()

Also add tests in db_iter_test and db_iterator_test

Pass all tests
Cheers!

Test Plan: make all check -j64

Reviewers: andrewkr, yiwu, IslamAbdelRahman, sdong

Reviewed By: sdong

Subscribers: andrewkr, dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D64149
2016-09-27 18:20:57 -07:00
Islam AbdelRahman
a297643f2e Fix valgrind memory leak 2016-08-11 23:34:19 -07:00
Islam AbdelRahman
d11c09d9e2 Eliminate memcpy from ForwardIterator
Summary:
This diff update ForwardIterator to support pinning keys and values, which will allow DBIter to take advantage of that and eliminate memcpy when executing merge operators
This diff is stacked on D61305

Test Plan:
existing tests (updated them to test tailing iterator)
new test

Reviewers: andrewkr, yhchiang, sdong

Reviewed By: sdong

Subscribers: andrewkr, dhruba

Differential Revision: https://reviews.facebook.net/D60009
2016-08-11 19:10:16 -07:00
sdong
56dd034115 read_options.background_purge_on_iterator_cleanup to cover forward iterator and log file closing too.
Summary: With read_options.background_purge_on_iterator_cleanup=true, File deletion and closing can still happen in forward iterator, or WAL file closing. Cover those cases too.

Test Plan: I am adding unit tests.

Reviewers: andrewkr, IslamAbdelRahman, yiwu

Reviewed By: yiwu

Subscribers: leveldb, andrewkr, dhruba

Differential Revision: https://reviews.facebook.net/D61503
2016-08-10 13:16:41 -07:00
sdong
294bdf9ee2 Change Property name from "rocksdb.current_version_number" to "rocksdb.current-super-version-number"
Summary: I realized I again is wrong about the naming convention. Let me change it to the correct one.

Test Plan: Run unit tests.

Reviewers: IslamAbdelRahman, kradhakrishnan, yhchiang, andrewkr

Reviewed By: andrewkr

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D55041
2016-03-04 18:15:29 -08:00
sdong
e79ad9e184 Add Iterator Property rocksdb.iterator.version_number
Summary: We want to provide a way to detect whether an iterator is stale and needs to be recreated. Add a iterator property to return version number.

Test Plan: Add two unit tests for it.

Reviewers: IslamAbdelRahman, yhchiang, anthony, kradhakrishnan, andrewkr

Reviewed By: andrewkr

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D54921
2016-03-02 16:23:59 -08:00
Baraa Hamodi
21e95811d1 Updated all copyright headers to the new format. 2016-02-09 15:12:00 -08:00
Gunnar Kudrjavets
97265f5f14 Fix minor bugs in delete operator, snprintf, and size_t usage
Summary:
List of changes:

1) Fix the snprintf() usage in cases where wrong variable was used to determine the output buffer size.

2) Remove unnecessary checks before calling delete operator.

3) Increase code correctness by using size_t type when getting vector's size.

4) Unify the coding style by removing namespace::std usage at the top of the file to confirm to the majority usage.

5) Fix various lint errors pointed out by 'arc lint'.

Test Plan:
Code review and build:

git diff
make clean
make -j 32 commit-prereq
arc lint

Reviewers: kradhakrishnan, sdong, rven, anthony, yhchiang, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D51849
2015-12-15 15:26:20 -08:00
Venkatesh Radhakrishnan
9b8c9be0b5 Fix forward_iterator allocation of vector.
Summary:
db_tailing_iter_test was failing on some platforms because of
an incorrect allocation and use. This diff fixes the issue.

Test Plan:
db_tailing_iter_test
Run valgrind for db_tailing_iter_test

Reviewers: igor, sdong

Reviewed By: sdong

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D50835
2015-11-17 10:27:51 -08:00
Dmitri Smirnov
cb9459f85c Fix empty vector write in ForwardIterator 2015-11-16 13:58:10 -08:00
Venkatesh Radhakrishnan
7824444bfc Reuse file iterators in tailing iterator when memtable is flushed
Summary:
Under a tailing workload, there were increased block cache
misses when a memtable was flushed because we were rebuilding iterators
in that case since the version set changed. This was exacerbated in the
case of iterate_upper_bound, since file iterators which were over the
iterate_upper_bound would have been deleted and are now brought back as
part of the Rebuild, only to be deleted again. We now renew the iterators
and only build iterators for files which are added and delete file
iterators for files which are deleted.
Refer to https://reviews.facebook.net/D50463 for previous version

Test Plan: DBTestTailingIterator.TailingIteratorTrimSeekToNext

Reviewers: anthony, IslamAbdelRahman, igor, tnovak, yhchiang, sdong

Reviewed By: sdong

Subscribers: yhchiang, march, dhruba, leveldb, lovro

Differential Revision: https://reviews.facebook.net/D50679
2015-11-13 15:50:59 -08:00
sdong
35ad531be3 Seperate InternalIterator from Iterator
Summary:
Separate a new class InternalIterator from class Iterator, when the look-up is done internally, which also means they operate on key with sequence ID and type.

This change will enable potential future optimizations but for now InternalIterator's functions are still the same as Iterator's.
At the same time, separate the cleanup function to a separate class and let both of InternalIterator and Iterator inherit from it.

Test Plan: Run all existing tests.

Reviewers: igor, yhchiang, anthony, kradhakrishnan, IslamAbdelRahman, rven

Reviewed By: rven

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D48549
2015-10-13 15:32:13 -07:00
Andres Notzli
e17e92ea19 Relaxed assert in forward iterator
Summary:
It looks like in some cases an assert in SeekInternal failed when computing the
hints for the next level because user_key was the same as the largest key and
not strictly smaller. Relaxing the assert to expect smaller or equal keys.

Test Plan: make clean all check

Reviewers: sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D46443
2015-09-08 17:15:11 -07:00
Andres Noetzli
3a0df7f161 Fixed comparison in ForwardIterator when computing hint for GetNextLevelIndex()
Summary: When computing the hint for GetNextLevelIndex(), ForwardIterator was doing a redundant comparison. This patch fixes the comparison (using https://github.com/facebook/rocksdb/blob/master/db/version_set.cc#L158 as a reference) and moves it inside an assert because we expect `level_files[f_idx]` to contain the next key after Seek(), so user_key should always be smaller than the largest key.

Test Plan: make clean all check

Reviewers: rven, anthony, yhchiang, igor, sdong

Reviewed By: sdong

Subscribers: tnovak, sdong, dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D46227
2015-09-08 09:47:54 -07:00
Venkatesh Radhakrishnan
91f3c90792 Fix case when forward iterator misses a new update
Summary:
This diff fixes a case when the forward iterator misses a new
insert when the mutable iterator is not current. The test is also
improved and the check for deleted iterators is made more informative.

Test Plan: DBTailingIteratorTest.*Trim

Reviewers: tnovak, sdong

Reviewed By: sdong

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D46167
2015-09-04 14:28:45 -07:00
Tomislav Novak
5508122ed6 Fix a perf regression in ForwardIterator
Summary:
I noticed that memtable iterator usually crosses the `iterate_upper_bound`
threshold when tailing. Changes introduced in D43833 made `NeedToSeekImmutable`
always return true in such case, even when `Seek()` only needs to rewind the
memtable iterator. In a test I ran, this caused the "tailing efficiency"
(ratio of calls to `Seek()` that only affect the memtable versus all seeks)
to drop almost to zero.

This diff attempts to fix the regression by using a different flag to indicate
that `current_` is over the limit instead of resetting `valid_` in
`UpdateCurrent()`.

Test Plan: `DBTestTailingIterator.TailingIteratorUpperBound`

Reviewers: sdong, rven

Reviewed By: rven

Subscribers: dhruba, march

Differential Revision: https://reviews.facebook.net/D45909
2015-09-01 09:54:30 -07:00
Venkatesh Radhakrishnan
cb164bfc48 Do not delete iterators for immutable memtables.
Summary:
The immutable memtable iterators are allocated from an arena and there
is no benefit from deleting these. Also the immutable memtables
themselves will continue to be in memory until the version set
containing it is alive. We will not remove immutable memtable iterators
over the upper bound. We now add immutable iterators to the test.

Test Plan: db_tailing_iter_test.TailingIteratorTrimSeekToNext

Reviewers: tnovak, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D45597
2015-08-28 11:07:07 -07:00
Venkatesh Radhakrishnan
4d28a7d8ab Add a whitebox test for deleted file iterators.
Summary:
We have earlier added a feature to delete file iterators when the
current key is over the iterate upper bound. We now add a whitebox test
to check if the file iterators were actually deleted.

Test Plan: Add check for a range which has deleted iterators.

Reviewers: sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D45321
2015-08-25 13:40:58 -07:00
Venkatesh Radhakrishnan
249fb4f881 Fix use of deleted file iterators with incomplete iterators
Summary:
After deleting file iterators which are over the iterate upper
bound, we also need to check for null pointers in
ResetIncompletIterators.

Test Plan: db_tailing_iter_test.TailingIteratorTrimSeekToNext

Reviewers: tnovak, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D45525
2015-08-25 13:38:35 -07:00
Andres Noetzli
b604d2562f Removing unused variables to fix build
Summary: Removing two unused variables that prevented compilation.

Test Plan: make all

Reviewers: rven, sdong, yhchiang, anthony, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D44991
2015-08-19 16:57:40 -07:00
Venkatesh Radhakrishnan
1b114eed4d Free file iterators for files which are above the iterate upper bound to Improve memory utilization
Summary:
This diff improves the memory utilization for tailing iterators RocksDB,
by freeing file iterators which are over the upper bound.
It is an updating on Siying's original diff for improving the memory usage for
tailing iterators. The changes for the seek and next path are now complete
and a test has been added to exercise these paths while deleting file iterators
which are above the upper bound.

Test Plan: db_tailing_iter_test.TailingIteratorTrimSeekToNext

Reviewers: march, tnovak, sdong

Reviewed By: sdong

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D43833
2015-08-19 16:05:51 -07:00
Venkatesh Radhakrishnan
e58e1b18e7 Make tailing iterator show new entries in memtable.
Summary:
Reseek mutable_iter if it is invalid in Next and immutable_iter
is invalid.

Test Plan: DBTestTailingIterator.TailingIteratorSeekToNext

Reviewers: tnovak, march, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D44865
2015-08-18 14:40:06 -07:00
sdong
72613657f0 Measure file read latency histogram per level
Summary: In internal stats, remember read latency histogram, if statistics is enabled. It can be retrieved from DB::GetProperty() with "rocksdb.dbstats" property, if it is enabled.

Test Plan: Manually run db_bench and prints out "rocksdb.dbstats" by hand and make sure it prints out as expected

Reviewers: igor, IslamAbdelRahman, rven, kradhakrishnan, anthony, yhchiang

Reviewed By: yhchiang

Subscribers: MarkCallaghan, leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D44193
2015-08-14 17:32:42 -07:00
Yueh-Hsuan Chiang
4ce5be4255 fixed leaking log::Writers
Summary: Fixes valgrind errors in column_family_test.

Test Plan: `make check`, `make valgrind_check`

Reviewers: igor, yhchiang

Reviewed By: yhchiang

Subscribers: dhruba

Differential Revision: https://reviews.facebook.net/D41181
2015-07-07 12:10:10 -07:00
Igor Sugak
62247ffa3b rocksdb: Add missing override
Summary:
When using latest clang (3.6 or 3.7/trunck) rocksdb is failing with many errors. Almost all of them are missing override errors. This diff adds missing override keyword. No manual changes.

Prerequisites: bear and clang 3.5 build with extra tools

```lang=bash
% USE_CLANG=1 bear make all # generate a compilation database http://clang.llvm.org/docs/JSONCompilationDatabase.html
% clang-modernize -p . -include . -add-override
% make format
```

Test Plan:
Make sure all tests are passing.
```lang=bash
% #Use default fb code clang.
% make check
```
Verify less error and no missing override errors.
```lang=bash
% # Have trunk clang present in path.
% ROCKSDB_NO_FBCODE=1 CC=clang CXX=clang++ make
```

Reviewers: igor, kradhakrishnan, rven, meyering, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D34077
2015-02-26 11:28:41 -08:00
Igor Canadi
e7ea51a8e7 Introduce job_id for flush and compaction
Summary:
It would be good to assing background job their IDs. Two benefits:
1) makes LOGs more readable
2) I might use it in my EventLogger, which will try to make our LOG easier to read/query/visualize

Test Plan: ran rocksdb, read the LOG

Reviewers: sdong, rven, yhchiang

Reviewed By: yhchiang

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D31617
2015-02-12 09:54:48 -08:00
Lei Jin
8d3f8f9696 remove all remaining references to cfd->options()
Summary:
The very last reference happens in DBImpl::GetOptions()
I built with both DBImpl::GetOptions() and ColumnFamilyData::options() commented out

Test Plan: make all check

Reviewers: sdong, yhchiang, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D29073
2014-11-18 10:20:10 -08:00
Igor Canadi
767777c2bd Turn on -Wshorten-64-to-32 and fix all the errors
Summary:
We need to turn on -Wshorten-64-to-32 for mobile. See D1671432 (internal phabricator) for details.

This diff turns on the warning flag and fixes all the errors. There were also some interesting errors that I might call bugs, especially in plain table. Going forward, I think it makes sense to have this flag turned on and be very very careful when converting 64-bit to 32-bit variables.

Test Plan: compiles

Reviewers: ljin, rven, yhchiang, sdong

Reviewed By: yhchiang

Subscribers: bobbaldwin, dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D28689
2014-11-11 16:47:22 -05:00
Tomislav Novak
35c8c814e8 Make ForwardIterator::status() more efficient
Summary:
In D19581 I made `ForwardIterator::status()` check all child iterators,
including immutable ones. It's, however, not necessary to do it every
time -- it'll suffice to check only when they're used and their status
could change.

This diff:
* introduces `immutable_status_` which is updated by `Seek()` and `Next()`
* removes special handling of `kIncomplete` status in those methods

Test Plan:
* `db_test`
* hacked ReadSequential in db_bench.cc to check `status()` in addition to
  validity:

```
   $ ./db_bench -use_existing_db -benchmarks readseq -disable_auto_compactions \
      -use_tailing_iterator  # without this patch
   Keys:       16 bytes each
   Values:     100 bytes each (50 bytes after compression)
   Entries:    1000000
   [...]
   DB path: [/dev/shm/rocksdbtest/dbbench]
   readseq      :       0.562 micros/op 1778103 ops/sec;   98.4 MB/s
   $ ./db_bench -use_existing_db -benchmarks readseq -disable_auto_compactions \
      -use_tailing_iterator  # with the patch
   readseq      :       0.433 micros/op 2311363 ops/sec;  127.8 MB/s
```

Reviewers: igor, sdong, ljin

Reviewed By: ljin

Subscribers: dhruba, leveldb, march, lovro

Differential Revision: https://reviews.facebook.net/D24063
2014-11-10 15:44:20 -08:00
sdong
ac6afaf9ef Enforce naming convention of getters in version_set.h
Summary: Enforce the accessier naming convention in functions in version_set.h

Test Plan: make all check

Reviewers: ljin, yhchiang, rven, igor

Reviewed By: igor

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D28143
2014-11-04 09:59:05 -08:00
sdong
4d2ba38b65 Make VersionBuilder unit testable
Summary:
Rename Version::Builder to VersionBuilder and expose its definition to a header.
Make VerisonBuilder not reference Version or ColumnFamilyData, only working with VersionStorageInfo.
Add version_builder_test which has a simple test.

Test Plan: make all check

Reviewers: rven, yhchiang, igor, ljin

Reviewed By: igor

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D27969
2014-10-31 10:44:06 -07:00
sdong
76d1c28e82 Make CompactionPicker more easily tested
Summary:
Make compaction picker easier to test.
The basic idea is to separate a minimum subcomponent of Version to VersionStorageInfo, which just responsible to LSM tree. A stub VersionStorageInfo can then be easily created and passed into compaction picker so that we can check the outputs.

It now passes most tests. Still two things need to be done:
(1) deal with the FIFO compaction's file size.
(2) write an example test to make sure the interface can do the job.

Add a compaction_picker_test to make sure compaction picker codes can be easily unit tested.

Test Plan:
Pass all unit tests and compaction_picker_test

Reviewers: yhchiang, rven, igor, ljin

Reviewed By: ljin

Subscribers: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D27639
2014-10-29 15:16:53 -07:00
Igor Canadi
a39e931e50 FlushProcess
Summary:
Abstract out FlushProcess and take it out of DBImpl.
This also includes taking DeletionState outside of DBImpl.

Currently this diff is only doing the refactoring. Future work includes:
1. Decoupling flush_process.cc, make it depend on less state
2. Write flush_process_test, which will mock out everything that FlushProcess depends on and test it in isolation

Test Plan: make check

Reviewers: rven, yhchiang, sdong, ljin

Reviewed By: ljin

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D27561
2014-10-28 11:54:33 -07:00
Lei Jin
eb357af58c unfriend ForwardIterator from VersionSet
Summary: as title

Test Plan:
make release
will run full test on all stacked diffs

Reviewers: sdong, yhchiang, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D27597
2014-10-28 10:08:41 -07:00
Lei Jin
574028679b dynamic max_sequential_skip_in_iterations
Summary:
This is not a critical options. Making it dynamic so that we can remove
more reference to cfd->options()

Test Plan: unit test

Reviewers: yhchiang, sdong, igor

Reviewed By: igor

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D24957
2014-10-23 15:34:21 -07:00
Tomislav Novak
187b29938c ForwardIterator: update prev_key_ only if prefix hasn't changed
Summary:
Since ForwardIterator is on a level below DBIter, the latter may call Next() on
it (e.g. in order to skip deletion markers). Since this also updates
`prev_key_`, it may prevent the Seek() optimization.

For example, assume that there's only one SST file and it contains the following
entries: 0101, 0201 (`ValueType::kTypeDeletion`, i.e. a tombstone record), 0201
(`kTypeValue`), 0202. Memtable is empty. `Seek(0102)` will result in `prev_key_`
being set to `0201` instead of `0102`, since `DBIter::Seek()` will call
`ForwardIterator::Next()` to skip record 0201. Therefore, when `Seek(0102)` is
called again, `NeedToSeekImmutable()` will return true.

This fix relies on `prefix_extractor_` to detect prefix changes. `prev_key_` is
only set to `current_->key()` as long as they have the same prefix.

I also made a small change to `NeedToSeekImmutable()` so it no longer returns
true when the db is empty (i.e. there's nothing but a memtable).

Test Plan:
   $ TEST_TMPDIR=/dev/shm/rocksdbtest ROCKSDB_TESTS=TailingIterator ./db_test

Reviewers: sdong, igor, ljin

Reviewed By: ljin

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D23823
2014-10-01 17:10:48 -07:00
Stanislau Hlebik
45a5e3ede0 Remove path with arena==nullptr from NewInternalIterator
Summary:
Simply code by removing code path which does not use Arena
from NewInternalIterator

Test Plan:
make all check
make valgrind_check

Reviewers: sdong

Reviewed By: sdong

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D22395
2014-09-04 17:40:41 -07:00
Tomislav Novak
0f9c43ea36 ForwardIterator: reset incomplete iterators on Seek()
Summary:
When reading from kBlockCacheTier, ForwardIterator's internal child iterators
may end up in the incomplete state (read was unable to complete without doing
disk I/O). `ForwardIterator::status()` will correctly report that; however, the
iterator may be stuck in that state until all sub-iterators are rebuilt:

  * `NeedToSeekImmutable()` may return false even if some sub-iterators are
    incomplete
  * one of the child iterators may be an empty iterator without any state other
    that the kIncomplete status (created using `NewErrorIterator()`); seeking on
    any such iterator has no effect -- we need to construct it again

Akin to rebuilding iterators after a superversion bump, this diff makes forward
iterator reset all incomplete child iterators when `Seek()` or `Next()` are
called.

Test Plan: TEST_TMPDIR=/dev/shm/rocksdbtest ROCKSDB_TESTS=TailingIterator ./db_test

Reviewers: igor, sdong, ljin

Reviewed By: ljin

Subscribers: lovro, march, leveldb

Differential Revision: https://reviews.facebook.net/D22575
2014-08-29 16:21:29 -07:00
Tomislav Novak
3b97ee96c4 ForwardIterator seek bugfix
Summary:
If `NeedToSeekImmutable()` returns false, `SeekInternal()` won't reset the
contents of `immutable_min_heap_`. However, since it calls `UpdateCurrent()`
unconditionally, if `current_` is one of immutable iterators (previously popped
from `immutable_min_heap_`), `UpdateCurrent()` will overwrite it. As a result,
if old `current_` in fact pointed to the smallest entry, forward iterator will
skip some records.

Fix implemented in this diff pushes `current_` back to `immutable_min_heap_`
before calling `UpdateCurrent()`.

Test Plan:
New unit test (courtesy of @lovro):
   $ ROCKSDB_TESTS=TailingIteratorSeekToSame ./db_test

Reviewers: igor, dhruba, haobo, ljin

Reviewed By: ljin

Subscribers: lovro, leveldb

Differential Revision: https://reviews.facebook.net/D19653
2014-07-10 16:46:13 -07:00
Tomislav Novak
105c1e099b ForwardIterator::status() checks all child iterators
Summary:
Forward iterator only checked `status_` and `mutable_iter_->status()`, which is
not sufficient. For example, when reading exclusively from cache
(kBlockCacheTier), `mutable_iter_->status()` may return kOk (e.g. there's
nothing in the memtable), but one of immutable iterators could be in
kIncomplete. In this case, `ForwardIterator::status()` ought to return that
status instead of kOk.

This diff changes `status()` to also check `imm_iters_`, `l0_iters_`, and
`level_iters_`.

Test Plan:
  ROCKSDB_TESTS=TailingIteratorIncomplete ./db_test

Reviewers: ljin, igor

Reviewed By: igor

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D19581
2014-07-10 12:43:12 -07:00
Lei Jin
b278ae8e50 Apply fractional cascading in ForwardIterator::Seek()
Summary:
Use search hint to reduce FindFile range thus avoid comparison
For a small DB with 50M keys, perf_context counter shows it reduces
comparison from 2B to 1.3B for a 15-minute run. No perf change was
observed for 1 seek thread, but quite good improvement was seen for 32
seek threads, when CPU was busy.
will post detail results when ready

Test Plan: db_bench and db_test

Reviewers: haobo, sdong, dhruba, igor

Reviewed By: igor

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D18879
2014-07-08 11:40:42 -07:00
sdong
cadc1adffa Refactor: group metadata needed to open an SST file to a separate copyable struct
Summary:
We added multiple fields to FileMetaData recently and are planning to add more.
This refactoring separate the minimum information for accessing the file. This object is copyable (FileMetaData is not copyable since the ref counter). I hope this refactoring can enable further improvements:

(1) use it to design a more efficient data structure to speed up read queries.
(2) in the future, when we add information of storage level, we can easily do the encoding, instead of enlarge this structure, which might expand memory work set for file meta data.

The definition is same as current EncodedFileMetaData used in two level iterator, so now the logic in two level iterator is easier to understand.

Test Plan: make all check

Reviewers: haobo, igor, ljin

Reviewed By: ljin

Subscribers: leveldb, dhruba, yhchiang

Differential Revision: https://reviews.facebook.net/D18933
2014-06-16 16:10:52 -07:00
Lei Jin
77db08f27b fix forward iterator bug
Summary: obvious

Test Plan: db_test

Reviewers: sdong, haobo, igor

Reviewed By: igor

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D18987
2014-06-10 09:57:26 -07:00
Igor Canadi
fd27001072 Fix compile errors on Mac
Summary: https://phabricator.fb.com/P11372644

Test Plan: compiles

Reviewers: sdong, ljin

Reviewed By: ljin

Subscribers: leveldb

Differential Revision: https://reviews.facebook.net/D18873
2014-06-03 12:28:58 -07:00
Lei Jin
388d2054c7 forward iterator
Summary:
Forward iterator puts everything together in a flat structure instead of
a hierarchy of nested iterators. this should simplify the code and
provide better performance. It also enables more optimization since all
information are accessiable in one place.
Init evaluation shows about 6% improvement

Test Plan: db_test and db_bench

Reviewers: dhruba, igor, tnovak, sdong, haobo

Reviewed By: haobo

Subscribers: sdong, leveldb

Differential Revision: https://reviews.facebook.net/D18795
2014-05-30 14:31:55 -07:00