Commit Graph

8 Commits

Author SHA1 Message Date
Deon Nicholas
c2d7826ced [RocksDB] [MergeOperator] The new Merge Interface! Uses merge sequences.
Summary:
Here are the major changes to the Merge Interface. It has been expanded
to handle cases where the MergeOperator is not associative. It does so by stacking
up merge operations while scanning through the key history (i.e.: during Get() or
Compaction), until a valid Put/Delete/end-of-history is encountered; it then
applies all of the merge operations in the correct sequence starting with the
base/sentinel value.

I have also introduced an "AssociativeMerge" function which allows the user to
take advantage of associative merge operations (such as in the case of counters).
The implementation will always attempt to merge the operations/operands themselves
together when they are encountered, and will resort to the "stacking" method if
and only if the "associative-merge" fails.

This implementation is conjectured to allow MergeOperator to handle the general
case, while still providing the user with the ability to take advantage of certain
efficiencies in their own merge-operator / data-structure.

NOTE: This is a preliminary diff. This must still go through a lot of review,
revision, and testing. Feedback welcome!

Test Plan:
  -This is a preliminary diff. I have only just begun testing/debugging it.
  -I will be testing this with the existing MergeOperator use-cases and unit-tests
(counters, string-append, and redis-lists)
  -I will be "desk-checking" and walking through the code with the help gdb.
  -I will find a way of stress-testing the new interface / implementation using
db_bench, db_test, merge_test, and/or db_stress.
  -I will ensure that my tests cover all cases: Get-Memtable,
Get-Immutable-Memtable, Get-from-Disk, Iterator-Range-Scan, Flush-Memtable-to-L0,
Compaction-L0-L1, Compaction-Ln-L(n+1), Put/Delete found, Put/Delete not-found,
end-of-history, end-of-file, etc.
  -A lot of feedback from the reviewers.

Reviewers: haobo, dhruba, zshao, emayanke

Reviewed By: haobo

CC: leveldb

Differential Revision: https://reviews.facebook.net/D11499
2013-08-05 20:14:32 -07:00
Haobo Xu
05e8854085 [Rocksdb] Support Merge operation in rocksdb
Summary:
This diff introduces a new Merge operation into rocksdb.
The purpose of this review is mostly getting feedback from the team (everyone please) on the design.

Please focus on the four files under include/leveldb/, as they spell the client visible interface change.
include/leveldb/db.h
include/leveldb/merge_operator.h
include/leveldb/options.h
include/leveldb/write_batch.h

Please go over local/my_test.cc carefully, as it is a concerete use case.

Please also review the impelmentation files to see if the straw man implementation makes sense.

Note that, the diff does pass all make check and truly supports forward iterator over db and a version
of Get that's based on iterator.

Future work:
- Integration with compaction
- A raw Get implementation

I am working on a wiki that explains the design and implementation choices, but coding comes
just naturally and I think it might be a good idea to share the code earlier. The code is
heavily commented.

Test Plan: run all local tests

Reviewers: dhruba, heyongqiang

Reviewed By: dhruba

CC: leveldb, zshao, sheki, emayanke, MarkCallaghan

Differential Revision: https://reviews.facebook.net/D9651
2013-05-03 16:59:02 -07:00
Haobo Xu
eb6d139666 [RocksDB] Move table.h to table/
Summary:
- don't see a point exposing table.h to the public.
- fixed make clean to remove also *.d files.

Test Plan: make check; db_stress

Reviewers: dhruba, heyongqiang

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D10479
2013-04-22 16:07:56 -07:00
amayank
b2c50f1c3f Fix for the weird behaviour encountered by ldb Get where it could read only the second-latest value
Summary:
Changed the Get and Scan options with openForReadOnly mode to have access to the memtable.
Changed the visibility of NewInternalIterator in db_impl from private to protected so that
the derived class db_impl_read_only can call that in its NewIterator function for the
scan case. The previous approach which changed the default for flush_on_destroy_ from false to true
caused many problems in the unit tests due to empty sst files that it created. All
unit tests pass now.

Test Plan: make clean; make all check; ldb put and get and scans

Reviewers: dhruba, heyongqiang, sheki

Reviewed By: dhruba

CC: kosievdmerwe, zshao, dilipj, kailiu

Differential Revision: https://reviews.facebook.net/D8697
2013-02-20 10:45:52 -08:00
amayank
f3901e0647 Revert "Fix for the weird behaviour encountered by ldb Get where it could read only the second-latest value"
This reverts commit 4c696ed001.
2013-02-18 22:32:27 -08:00
amayank
4c696ed001 Fix for the weird behaviour encountered by ldb Get where it could read only the second-latest value
Summary:
flush_on_destroy has a default value of false and the memtable is flushed
in the dbimpl-destructor only when that is set to true. Because we want the memtable to be flushed everytime that
the destructor is called(db is closed) and the cases where we work with the memtable only are very less
it is a good idea to give this a default value of true. Thus the put from ldb
wil have its data flushed to disk in the destructor and the next Get will be able to
read it when opened with OpenForReadOnly. The reason that ldb could read the latest value when
the db was opened in the normal Open mode is that the Get from normal Open first reads
the memtable and directly finds the latest value written there and the Get from OpenForReadOnly
doesn't have access to the memtable (which is correct because all its Put/Modify) are disabled

Test Plan: make all; ldb put and get and scans

Reviewers: dhruba, heyongqiang, sheki

Reviewed By: heyongqiang

CC: kosievdmerwe, zshao, dilipj, kailiu

Differential Revision: https://reviews.facebook.net/D8631
2013-02-15 16:56:06 -08:00
Dhruba Borthakur
f4c2b7cf97 Enhance ReadOnly mode to process the all committed transactions.
Summary:
Leveldb has an api OpenForReadOnly() that opens the database
in readonly mode. This call had an option to not process the
transaction log.  This patch removes this option and always
processes all transactions that had been committed. It has
been done in such a way that it does not create/write to
any new files in the process. The invariant of "no-writes"
to the leveldb data directory is still true.

This enhancement allows multiple threads to open the same database
in readonly mode and access all trancations that were committed right
upto the OpenForReadOnly call.

I changed the public API to match the new semantics because
there are no users who are currently using this api.

Test Plan: make clean check

Reviewers: sheki

Reviewed By: sheki

CC: leveldb

Differential Revision: https://reviews.facebook.net/D7479
2012-12-19 16:30:46 -08:00
heyongqiang
3fcf533ed0 Add a readonly db
Summary: as subject

Test Plan: run db_bench readrandom

Reviewers: dhruba

Reviewed By: dhruba

CC: MarkCallaghan, emayanke, sheki

Differential Revision: https://reviews.facebook.net/D6495
2012-11-07 14:19:48 -08:00