Summary:
This diff does two things:
(1) Log::Reader does not report a corruption when the last record in a log or manifest file is truncated (meaning that log writer died in the middle of the write). Inherited the code from LevelDB: https://code.google.com/p/leveldb/source/detail?r=269fc6ca9416129248db5ca57050cd5d39d177c8#
(2) Turn off mmap writes for all writes to log and manifest files
(2) is necessary because if we use mmap writes, the last record is not truncated, but is actually filled with zeros, making checksum fail. It is hard to recover from checksum failing.
Test Plan:
Added unit tests from LevelDB
Actually recovered a "corrupted" MANIFEST file.
Reviewers: dhruba, haobo
Reviewed By: haobo
CC: leveldb
Differential Revision: https://reviews.facebook.net/D16119
Summary: Added list_column_family command and also updated dump_manifest
Test Plan: no
Reviewers: dhruba, haobo
CC: leveldb
Differential Revision: https://reviews.facebook.net/D16419
Summary:
InfoLogLevel test now checks the number of lines of the output log file
instead of the number of bytes in the log file.
This diff fixes the issue that the previous InfoLogLevel test in
auto_roll_logger_test passed in make check but fails when valgrind
is used.
Test Plan: run with make check and valgrind.
Reviewers: kailiu
Reviewed By: kailiu
CC: leveldb
Differential Revision: https://reviews.facebook.net/D16407
Summary: as title
Test Plan:
asan_check
will post results later
Reviewers: haobo, igor, dhruba, sdong
Reviewed By: haobo
CC: leveldb
Differential Revision: https://reviews.facebook.net/D16257
Summary: fix the memory leak that was captured by jenkin build.
Test Plan: ran the valgrind test locally
Reviewers: yhchiang
CC: leveldb
Differential Revision: https://reviews.facebook.net/D16389
Summary:
* Now each Log related function has a variant that takes an additional
argument indicating its log level, which is one of the following:
- DEBUG, INFO, WARN, ERROR, FATAL.
* To ensure backward-compatibility, old version Log functions are kept
unchanged.
* Logger now has a member variable indicating its log level. Any incoming
Log request which log level is lower than Logger's log level will not
be output.
* The output of the newer version Log will be prefixed by its log level.
Test Plan:
Add a LogType test in auto_roll_logger_test.cc
= Sample log output =
2014/02/11-00:03:07.683895 7feded179840 [DEBUG] this is the message to be written to the log file!!
2014/02/11-00:03:07.683898 7feded179840 [INFO] this is the message to be written to the log file!!
2014/02/11-00:03:07.683900 7feded179840 [WARN] this is the message to be written to the log file!!
2014/02/11-00:03:07.683903 7feded179840 [ERROR] this is the message to be written to the log file!!
2014/02/11-00:03:07.683906 7feded179840 [FATAL] this is the message to be written to the log file!!
Reviewers: dhruba, xjin, kailiu
Reviewed By: kailiu
CC: leveldb
Differential Revision: https://reviews.facebook.net/D16071
Summary:
This is not a generic thread local implementation in the sense that it
only takes pointer. But it does support multiple instances per thread
and lets user plugin function to perform cleanup when thread exits or an
instance gets destroyed.
Test Plan: unit test for now
Reviewers: haobo, igor, sdong, dhruba
Reviewed By: igor
CC: leveldb, kailiu
Differential Revision: https://reviews.facebook.net/D16131
Summary: A simple benchmark that simulates WAL append. It can be used to test different platform/file system's performance on WAL.
Test Plan: run it.
Reviewers: haobo, kailiu
Reviewed By: haobo
CC: igor, dhruba, i.am.jin.lei, yhchiang, leveldb, nkg-
Differential Revision: https://reviews.facebook.net/D16239
Summary:
The change to the public behavior:
* When opening a DB or creating new column family client gets a ColumnFamilyHandle.
* As long as column family handle is alive, client can do whatever he wants with it, even drop it
* Dropped column family can still be read from (using the column family handle)
* Added a new call CloseColumnFamily(). Client has to close all column families that he has opened before deleting the DB
* As soon as column family is closed, any calls to DB using that column family handle will fail (also any outstanding calls)
Internally:
* Ref-counting ColumnFamilyData
* New thread-safety for ColumnFamilySet
* Dropped column families are now completely dropped and their memory cleaned-up
Test Plan: added some tests to column_family_test
Reviewers: dhruba, haobo, kailiu, sdong
CC: leveldb
Differential Revision: https://reviews.facebook.net/D16101
Summary: Clean up IOErrors so that it only indicates errors talking to device.
Test Plan: make all check
Reviewers: igor, haobo, dhruba, emayanke
Reviewed By: igor
CC: leveldb
Differential Revision: https://reviews.facebook.net/D15831
Summary:
This patch optimized Get() code paths by avoiding malloc of iterators. Iterator creation is moved to mem table rep implementations, where a callback is called when any key is found. This is the same practice as what we do in (SST) table readers.
db_bench result for readrandom following a writeseq, with no compression, single thread and tmpfs, we see throughput improved to 144958 from 139027, about 3%.
Test Plan: make all check
Reviewers: dhruba, haobo, igor
Reviewed By: haobo
CC: leveldb, yhchiang
Differential Revision: https://reviews.facebook.net/D14685
Summary: When we open a DB, we should dump only DBOptions and then when we create a new column family, we dump ColumnFamilyOptions for each one.
Test Plan: make check, confirm contents of the LOG
Reviewers: dhruba, haobo, sdong, kailiu
CC: leveldb
Differential Revision: https://reviews.facebook.net/D16011
Summary: Nothing major, just an extra return line and posibility of leaking fb in NewRandomRWFile
Test Plan: make check
Reviewers: kailiu, dhruba
Reviewed By: kailiu
CC: leveldb
Differential Revision: https://reviews.facebook.net/D15993
Summary:
This is not column-family related diff. It is in columnfamily branch because the change is significant and we want to push it with next major release (3.0).
It removes the leveldb notion of one thread pool and expands it to two thread pools by default (HIGH and LOW). Flush process is removed from compaction process and all flush threads are executed on HIGH thread pool, since we don't want long-running compactions to influence flush latency.
Test Plan: make check
Reviewers: dhruba, haobo, kailiu, sdong
CC: leveldb
Differential Revision: https://reviews.facebook.net/D15987
Summary: Replaced most of occurrences of Options with more specific DBOptions. This brings us very close to supporting different configuration options for each column family.
Test Plan: make check
Reviewers: dhruba, haobo, kailiu, sdong
CC: leveldb
Differential Revision: https://reviews.facebook.net/D15933
Summary:
Adapting table cache to column families is interesting. We want table cache to be global LRU, so if some column families are use not as often as others, we want them to be evicted from cache. However, current TableCache object also constructs tables on its own. If table is not found in the cache, TableCache automatically creates new table. We want each column family to be able to specify different table factory.
To solve the problem, we still have a single LRU, but we provide the LRUCache object to TableCache on construction. We have one TableCache per column family, but the underyling cache is shared by all TableCache objects.
This allows us to have a global LRU, but still be able to support different table factories for different column families. Also, in the future it will also be able to support different directories for different column families.
Test Plan: make check
Reviewers: dhruba, haobo, kailiu, sdong
CC: leveldb
Differential Revision: https://reviews.facebook.net/D15915
Summary:
There are three SanitizeOption-s now : one for DBOptions, one for ColumnFamilyOptions and one for Options (which just calls the other two)
I have also reshuffled some options -- table_cache options and info_log should live in DBOptions, for example.
Test Plan: make check doesn't complain
Reviewers: dhruba, haobo, kailiu, sdong
CC: leveldb
Differential Revision: https://reviews.facebook.net/D15873
Summary: RocksDB doesn't compile on 32-bit architecture apparently. This is attempt to fix some of 32-bit errors. They are reported here: https://gist.github.com/paxos/8789697
Test Plan: RocksDB still compiles on 64-bit :)
Reviewers: kailiu
Reviewed By: kailiu
CC: leveldb
Differential Revision: https://reviews.facebook.net/D15825
Summary: In PlainTable, use one single byte to represent 8 bytes of internal bytes, if seqID = 0 and it is value type (which should be common for bottom most files). It is to save 7 bytes for uncompressed cases.
Test Plan: make all check
Reviewers: haobo, dhruba, kailiu
Reviewed By: haobo
CC: igor, leveldb
Differential Revision: https://reviews.facebook.net/D15489
Summary: Compaction picker and internal key comparator are different for each column family (not global), so they should live in ColumnFamilyData
Test Plan: make check
Reviewers: dhruba, haobo
CC: leveldb
Differential Revision: https://reviews.facebook.net/D15801
Summary:
Easy thing goes first. This patch moves arena to internal dir; based
on which, the coming patch will deal with memtable_rep.
Test Plan: make check
Reviewers: haobo, sdong, dhruba
CC: leveldb
Differential Revision: https://reviews.facebook.net/D15615
Summary: ColumnFamilyData grew a lot, there's much more data that it holds now. It makes more sense to encapsulate it better by making it a class.
Test Plan: make check
Reviewers: dhruba, haobo, kailiu, sdong
CC: leveldb
Differential Revision: https://reviews.facebook.net/D15579
Summary: All memtables and immutable memtables are moved from DBImpl to ColumnFamilyData. For now, they are all referenced from default column family in DBImpl. It shouldn't be hard to get them from custom column family.
Test Plan: make check
Reviewers: dhruba, kailiu, sdong
CC: leveldb
Differential Revision: https://reviews.facebook.net/D15459
Summary:
@dhruba, I'm not sure where we need to sync the directory. I implemented the function in Env() and added the dir sync just after we close the newly created file in the builder.
Should I also add FsyncDir() to new files that get created by a compaction?
Test Plan: Confirmed that FsyncDir is returning Status::OK()
Reviewers: dhruba, haobo
Reviewed By: dhruba
CC: leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D14751
Summary: Converting from length prefixed buffer back to internal key costs some CPU but it is not necessary. In this patch, internal keys are pass though the functions so that we don't need to convert back to it.
Test Plan: make all check
Reviewers: haobo, kailiu
Reviewed By: kailiu
CC: igor, leveldb
Differential Revision: https://reviews.facebook.net/D15393