Commit Graph

159 Commits

Author SHA1 Message Date
Haobo Xu
4ca3c67bd3 [RocksDB] Cleanup compaction filter to use a class interface, instead of function pointer and additional context pointer.
Summary:
This diff replaces compaction_filter_args and CompactionFilter with a single compaction_filter parameter. It gives CompactionFilter better encapsulation and a similar look to Comparator and MergeOpertor, which improves consistency of the overall interface.
The change is not backward compatible. Nevertheless, the two references in fbcode are not in production yet.

Test Plan: make check

Reviewers: dhruba

Reviewed By: dhruba

CC: leveldb, zshao

Differential Revision: https://reviews.facebook.net/D10773
2013-05-13 14:06:10 -07:00
Haobo Xu
05e8854085 [Rocksdb] Support Merge operation in rocksdb
Summary:
This diff introduces a new Merge operation into rocksdb.
The purpose of this review is mostly getting feedback from the team (everyone please) on the design.

Please focus on the four files under include/leveldb/, as they spell the client visible interface change.
include/leveldb/db.h
include/leveldb/merge_operator.h
include/leveldb/options.h
include/leveldb/write_batch.h

Please go over local/my_test.cc carefully, as it is a concerete use case.

Please also review the impelmentation files to see if the straw man implementation makes sense.

Note that, the diff does pass all make check and truly supports forward iterator over db and a version
of Get that's based on iterator.

Future work:
- Integration with compaction
- A raw Get implementation

I am working on a wiki that explains the design and implementation choices, but coding comes
just naturally and I think it might be a good idea to share the code earlier. The code is
heavily commented.

Test Plan: run all local tests

Reviewers: dhruba, heyongqiang

Reviewed By: dhruba

CC: leveldb, zshao, sheki, emayanke, MarkCallaghan

Differential Revision: https://reviews.facebook.net/D9651
2013-05-03 16:59:02 -07:00
Kai Liu
958b9c80e1 Avoid global static initialization in Env::Default()
Summary:
Mark's task description from #2316777

Env::Default() comes from util/env_posix.cc

This is a static global.

static PosixEnv default_env;

Env* Env::Default() {
  return &default_env;
}

-----

These globals assume default_env was initialized first. I don't think that is safe or correct to do (http://stackoverflow.com/questions/1005685/c-static-initialization-order)

const string AutoRollLoggerTest::kTestDir(
test::TmpDir() + "/db_log_test");
const string AutoRollLoggerTest::kLogFile(
test::TmpDir() + "/db_log_test/LOG");
Env* AutoRollLoggerTest::env = Env::Default();

Test Plan:
run make clean && make && make check
But how can I know if it works in Ubuntu?

Reviewers: MarkCallaghan, chip

Reviewed By: chip

CC: leveldb, dhruba, haobo

Differential Revision: https://reviews.facebook.net/D10491
2013-04-22 18:10:28 -07:00
Dhruba Borthakur
3cb7bf8170 Initialize parameters in the constructor.
Summary:
RocksDB doesn't build on Ubuntu VM .. shoudl be fixed with this patch.

g++ --version
g++ (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3

util/env_posix.cc:68:24: sorry, unimplemented: non-static data member initializers
util/env_posix.cc:68:24: error: ISO C++ forbids in-class initialization of non-const static member ‘use_os_buffer’
util/env_posix.cc:113:24: sorry, unimplemented: non-static data member initializers
util/env_posix.cc:113:24: error: ISO C++ forbids in-class initialization of non-const static member ‘use_os_buffer

Test Plan: make check

Reviewers: sheki, leveldb

Reviewed By: sheki

Differential Revision: https://reviews.facebook.net/D10461
2013-04-22 14:41:45 -07:00
Mark Callaghan
b1ff9ac9c5 Add --writes_per_second rate limit, print p99.99 in histogram
Summary:
Adds the --writes_per_second rate limit for the readwhilewriting test.
The purpose is to optionally avoid saturating storage with writes & compaction
and test read response time when some writes are being done.

Changes the histogram code to also print the p99.99 value

Task ID: #

Blame Rev:

Test Plan:
make check, ran db_bench with it

Revert Plan:

Database Impact:

Memcache Impact:

Other Notes:

EImportant:

- begin *PUBLIC* platform impact section -
Bugzilla: #
- end platform impact -

Reviewers: haobo

Reviewed By: haobo

CC: leveldb

Differential Revision: https://reviews.facebook.net/D10305
2013-04-20 10:26:51 -07:00
Haobo Xu
e0b60923ee [RocksDB] fix build
Summary: forgot to include signal_test.cc

Test Plan: make check

Reviewers: sheki

Reviewed By: sheki

CC: leveldb

Differential Revision: https://reviews.facebook.net/D10281
2013-04-20 10:26:51 -07:00
Haobo Xu
1255dcd446 [RocksDB] Add stacktrace signal handler
Summary:
This diff provides the ability to print out a stacktrace when the process receives certain signals.
Currently, we enable this for the following signals (program error related):
SIGILL SIGSEGV SIGBUS SIGABRT
Application simply #include "util/stack_trace.h" and call leveldb::InstallStackTraceHandler() during initialization, if signal handler is needed. It's not done automatically when openning db, because it's the application(process)'s responsibility to install signal handler and some applications might already have their own (like fbcode).

Sample output:
Received signal 11 (Segmentation fault)
#0  0x408ff0 ./signal_test() [0x408ff0] /home/haobo/rocksdb/util/signal_test.cc:4
#1  0x40827d ./signal_test() [0x40827d] /home/haobo/rocksdb/util/signal_test.cc:24
#2  0x7f8bb183172e /usr/local/fbcode/gcc-4.7.1-glibc-2.14.1/lib/libc.so.6(__libc_start_main+0x10e) [0x7f8bb183172e] ??:0
#3  0x408ebc ./signal_test() [0x408ebc] /home/engshare/third-party/src/glibc/glibc-2.14.1/glibc-2.14.1/csu/../sysdeps/x86_64/elf/start.S:113
Segmentation fault (core dumped)

For each frame, we print the raw pointer, the symbol provided by backtrace_symbols (still not good enough), and the source file/line. Note that address translation is done by directly shell out to addr2line. ??:0 means addr2line fails to do the translation. Hacky, but I think it's good for now.

Test Plan: signal_test.cc

Reviewers: dhruba, MarkCallaghan

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D10173
2013-04-20 10:26:50 -07:00
Haobo Xu
a29fc171a6 [RocksDB] posix_logger does not compile on non-linux platform
Summary: As title. Found out this when testing stack_trace.cc portability.

Test Plan: make check; manual test 'non-linux' build by forcing OS_LINUX2

Reviewers: dhruba, heyongqiang

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D10263
2013-04-15 19:18:51 -07:00
Abhishek Kona
dae7379050 [RocksDB] Expose LDB functioanality as a library call - clients can build their own LDB binary with additional options
Summary: Primarily a refactor. Introduced LDBTool interface to which customers can plug in their options and this will create their own version of ldb tool.

Test Plan: made ldb tool and tried it.

Reviewers: dhruba, heyongqiang

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D10191
2013-04-11 20:21:49 -07:00
Mayank Agarwal
6594fef7ef Exit and Join the background compaction threads while running rocksdb tests
Summary:
The background compaction threads are never exitted and therefore caused
memory-leaks while running rpcksdb tests. Have changed the PosixEnv destructor to exit and join them and changed the tests likewise
The memory leaked has reduced from 320 bytes to 64 bytes in all the tests. The 64
bytes is relating to
pthread_exit, but still have to figure out why. The stack-trace right now with
table_test.cc = 64 bytes in 1 blocks are possibly lost in loss record 4 of 5
   at 0x475D8C: malloc (jemalloc.c:914)
   by 0x400D69E: _dl_map_object_deps (dl-deps.c:505)
   by 0x4013393: dl_open_worker (dl-open.c:263)
   by 0x400F015: _dl_catch_error (dl-error.c:178)
   by 0x4013B2B: _dl_open (dl-open.c:569)
   by 0x5D3E913: do_dlopen (dl-libc.c:86)
   by 0x400F015: _dl_catch_error (dl-error.c:178)
   by 0x5D3E9D6: __libc_dlopen_mode (dl-libc.c:47)
   by 0x5048BF3: pthread_cancel_init (unwind-forcedunwind.c:53)
   by 0x5048DC9: _Unwind_ForcedUnwind (unwind-forcedunwind.c:126)
   by 0x5046D9F: __pthread_unwind (unwind.c:130)
   by 0x50413A4: pthread_exit (pthreadP.h:289)

Test Plan: make all check

Reviewers: dhruba, sheki, haobo

Reviewed By: dhruba

CC: leveldb, chip

Differential Revision: https://reviews.facebook.net/D9573
2013-04-10 14:50:25 -07:00
heyongqiang
e21ba94a69 Set FD_CLOEXEC after each file open
Summary: as subject. This is causing problem in adsconv. Ideally, this flags should be set in open. But that is only supported in Linux kernel ≥2.6.23 and glibc ≥2.7.

Test Plan:
db_test

run db_test

Reviewers: dhruba, MarkCallaghan, haobo

Reviewed By: dhruba

CC: leveldb, chip

Differential Revision: https://reviews.facebook.net/D10089
2013-04-10 14:44:06 -07:00
Mayank Agarwal
adb4e4509b Fixing delete in env_posix.cc
Summary: Was deleting incorrectly. Should delete the whole array.

Test Plan: make;valgrind stops complaining about Mismatched free/delete

Reviewers: dhruba, sheki

Reviewed By: sheki

CC: leveldb, haobo

Differential Revision: https://reviews.facebook.net/D10059
2013-04-09 11:49:35 -07:00
Haobo Xu
a77bc3d14c [RocksDB] Fix LRUCache Eviction problem
Summary:
1. The stock LRUCache nukes itself whenever the working set (the total number of entries not released by client at a certain time) is bigger than the cache capacity.
See https://our.dev.facebook.com/intern/tasks/?t=2252281
2. There's a bug in shard calculation leading to segmentation fault when only one shard is needed.

Test Plan: make check

Reviewers: dhruba, heyongqiang

Reviewed By: heyongqiang

CC: leveldb, zshao, sheki

Differential Revision: https://reviews.facebook.net/D9927
2013-04-04 11:22:50 -07:00
Haobo Xu
d815082159 [RocksDB] env_posix cleanup
Summary:
1. SetBackgroundThreads was not thread safe
2. queue_size_ does not seem necessary
3. moved condition signal after shared state change. Even though the original
   order is in practice ok (because the mutex is still held), it looks fishy
   and non-intuitive.

Test Plan: make check

Reviewers: dhruba

Reviewed By: dhruba

CC: leveldb, zshao

Differential Revision: https://reviews.facebook.net/D9825
2013-04-02 11:36:51 -07:00
Abhishek Kona
7fdd5f5b33 Use non-mmapd files for Write-Ahead Files
Summary:
Use non mmapd files for Write-Ahead log.
Earlier use of MMaped files. made the log iterator read ahead and miss records.
Now the reader and writer will point to the same physical location.

There is no perf regression :
./db_bench --benchmarks=fillseq --db=/dev/shm/mmap_test --num=$(million 20) --use_existing_db=0 --threads=2
with This diff :
fillseq      :      10.756 micros/op 185281 ops/sec;   20.5 MB/s
without this dif :
fillseq      :      11.085 micros/op 179676 ops/sec;   19.9 MB/s

Test Plan: unit test included

Reviewers: dhruba, heyongqiang

Reviewed By: heyongqiang

CC: leveldb

Differential Revision: https://reviews.facebook.net/D9741
2013-03-28 13:13:35 -07:00
Abhishek Kona
63f216ee0a memory manage statistics
Summary:
Earlier Statistics object was a raw pointer. This meant the user had to clear up
the Statistics object after creating the database. In most use cases the database is created in a function and the statistics pointer is out of scope. Hence the statistics object would never be deleted.
Now Using a shared_ptr to manage this.

Want this in before the next release.

Test Plan: make all check.

Reviewers: dhruba, emayanke

Reviewed By: emayanke

CC: leveldb

Differential Revision: https://reviews.facebook.net/D9735
2013-03-27 11:27:39 -07:00
Simon Marlow
a8bf8fe504 Integrate the manifest_dump command with ldb
Summary:
Syntax:

   manifest_dump [--verbose] --num=<manifest_num>

e.g.

$ ./ldb --db=/home/smarlow/tmp/testdb manifest_dump --num=12
manifest_file_number 13 next_file_number 14 last_sequence 3 log_number
11  prev_log_number 0
--- level 0 --- version# 0 ---
 6:116['a1' @ 1 : 1 .. 'a1' @ 1 : 1]
 10:130['a3' @ 2 : 1 .. 'a4' @ 3 : 1]
--- level 1 --- version# 0 ---
--- level 2 --- version# 0 ---
--- level 3 --- version# 0 ---
--- level 4 --- version# 0 ---
--- level 5 --- version# 0 ---
--- level 6 --- version# 0 ---

Test Plan: - Tested on an example DB (see output in summary)

Reviewers: sheki, dhruba

Reviewed By: sheki

CC: leveldb, heyongqiang

Differential Revision: https://reviews.facebook.net/D9609
2013-03-22 09:17:30 -07:00
Mayank Agarwal
38d54832f7 Initialize variable in constructor for PosixEnv::checkedDiskForMmap_
Summary: This caused compilation problems on some gcc platforms during the third-partyrelease

Test Plan: make

Reviewers: sheki

Reviewed By: sheki

Differential Revision: https://reviews.facebook.net/D9627
2013-03-21 11:26:50 -07:00
Dhruba Borthakur
ad96563b79 Ability to configure bufferedio-reads, filesystem-readaheads and mmap-read-write per database.
Summary:
This patch allows an application to specify whether to use bufferedio,
reads-via-mmaps and writes-via-mmaps per database. Earlier, there
was a global static variable that was used to configure this functionality.

The default setting remains the same (and is backward compatible):
 1. use bufferedio
 2. do not use mmaps for reads
 3. use mmap for writes
 4. use readaheads for reads needed for compaction

I also added a parameter to db_bench to be able to explicitly specify
whether to do readaheads for compactions or not.

Test Plan: make check

Reviewers: sheki, heyongqiang, MarkCallaghan

Reviewed By: sheki

CC: leveldb

Differential Revision: https://reviews.facebook.net/D9429
2013-03-20 23:14:03 -07:00
Mayank Agarwal
a6f4275403 Removing boost from ldb_cmd.cc
Summary: Getting rid of boost in our github codebase which caused problems on third-party

Test Plan: make ldb; python tools/ldb_test.py

Reviewers: sheki, dhruba

Reviewed By: sheki

Differential Revision: https://reviews.facebook.net/D9543
2013-03-20 11:19:12 -07:00
Mayank Agarwal
48abc06049 Using return value of fwrite in posix_logger.h
Summary: Was causing error(warning) in third-party saying unused result

Test Plan: make

Reviewers: sheki, dhruba

Reviewed By: dhruba

Differential Revision: https://reviews.facebook.net/D9447
2013-03-19 21:33:01 -07:00
Mayank Agarwal
487168cdcf Fixed sign-comparison in rocksdb code-base and fixed Makefile
Summary: Makefile had options to ignore sign-comparisons and unused-parameters, which should be there. Also fixed the specific errors in the code-base

Test Plan: make

Reviewers: chip, dhruba

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D9531
2013-03-19 14:35:23 -07:00
Mayank Agarwal
f04cc368f7 Fixing a careless mistake in ldb
Summary: negation of the condition checked currently had to be checkd actually

Test Plan: make ldb; python ldb_test.py

Reviewers: sheki, dhruba

Reviewed By: sheki

Differential Revision: https://reviews.facebook.net/D9459
2013-03-15 13:59:11 -07:00
Mayank Agarwal
a78fb5e8bc Doing away with boost in ldb_cmd.h
Summary: boost functions cause complications while deploying to third-party

Test Plan: make

Reviewers: sheki, dhruba

Reviewed By: sheki

Differential Revision: https://reviews.facebook.net/D9441
2013-03-14 18:16:46 -07:00
Abhishek Kona
1ba5abca97 Use posix_fallocate as default.
Summary:
Ftruncate does not throw an error on disk-full. This causes Sig-bus in
the case where the database tries to issue a Put call on a full-disk.

Use posix_fallocate for allocation instead of truncate.
Add a check to use MMaped files only on ext4, xfs and tempfs, as
posix_fallocate is very slow on ext3 and older.

Test Plan: make all check

Reviewers: dhruba, chip

Reviewed By: dhruba

CC: adsharma, leveldb

Differential Revision: https://reviews.facebook.net/D9291
2013-03-13 13:50:26 -07:00
Mayank Agarwal
5b278b53ae Fix valgrind errors in rocksdb tests: auto_roll_logger_test, reduce_levels_test
Summary: Fix for memory leaks in rocksdb tests. Also modified the variable NUM_FAILED_TESTS to print the actual number of failed tests.

Test Plan: make <test>; valgrind --leak-check=full ./<test>

Reviewers: sheki, dhruba

Reviewed By: sheki

CC: leveldb

Differential Revision: https://reviews.facebook.net/D9333
2013-03-12 16:03:16 -07:00
Dhruba Borthakur
469724be7f Add appropriate parameters to make bulk-load go faster.
Summary:
1. Create only 2 levels so that manual compactions are fast.
2. Set target file size to a large value

Test Plan: make clean check

Reviewers: kailiu, zshao

Reviewed By: zshao

CC: leveldb

Differential Revision: https://reviews.facebook.net/D9231
2013-03-08 10:52:16 -08:00
Zheng Shao
7b43500794 [RocksDB] Add bulk_load option to Options and ldb
Summary:
Add a shortcut function to make it easier for people
to efficiently bulk_load data into RocksDB.

Test Plan:
Tried ldb with "--bulk_load" and "--bulk_load --compact" and verified the outcome.
Needs to consult the team on how to test this automatically.

Reviewers: sheki, dhruba, emayanke, heyongqiang

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D8907
2013-03-05 00:34:53 -08:00
Mark Callaghan
993543d1be Add rate_delay_limit_milliseconds
Summary:
This adds the rate_delay_limit_milliseconds option to make the delay
configurable in MakeRoomForWrite when the max compaction score is too high.
This delay is called the Ln slowdown. This change also counts the Ln slowdown
per level to make it possible to see where the stalls occur.

From IO-bound performance testing, the Level N stalls occur:
* with compression -> at the largest uncompressed level. This makes sense
                      because compaction for compressed levels is much
                      slower. When Lx is uncompressed and Lx+1 is compressed
                      then files pile up at Lx because the (Lx,Lx+1)->Lx+1
                      compaction process is the first to be slowed by
                      compression.
* without compression -> at level 1

Task ID: #1832108

Blame Rev:

Test Plan:
run with real data, added test

Revert Plan:

Database Impact:

Memcache Impact:

Other Notes:

EImportant:

- begin *PUBLIC* platform impact section -
Bugzilla: #
- end platform impact -

Reviewers: dhruba

Reviewed By: dhruba

Differential Revision: https://reviews.facebook.net/D9045
2013-03-04 07:41:15 -08:00
Dhruba Borthakur
806e264350 Ability for rocksdb to compact when flushing the in-memory memtable to a file in L0.
Summary:
Rocks accumulates recent writes and deletes in the in-memory memtable.
When the memtable is full, it writes the contents on the memtable to
a file in L0.

This patch removes redundant records at the time of the flush. If there
are multiple versions of the same key in the memtable, then only the
most recent one is dumped into the output file. The purging of
redundant records occur only if the most recent snapshot is earlier
than the earliest record in the memtable.

Should we switch on this feature by default or should we keep this feature
turned off in the default settings?

Test Plan: Added test case to db_test.cc

Reviewers: sheki, vamsi, emayanke, heyongqiang

Reviewed By: sheki

CC: leveldb

Differential Revision: https://reviews.facebook.net/D8991
2013-03-04 00:01:47 -08:00
Abhishek Kona
c41f1e995c Codemod NULL to nullptr
Summary:
scripted NULL to nullptr in
* include/leveldb/
* db/
* table/
* util/

Test Plan: make all check

Reviewers: dhruba, emayanke

Reviewed By: emayanke

CC: leveldb

Differential Revision: https://reviews.facebook.net/D9003
2013-02-28 18:04:58 -08:00
Abhishek Kona
9bf91c74b8 ldb waldump to print the keys along with other stats + NULL to nullptr in ldb_cmd.cc
Summary: LDB tool to print the deleted/put keys in hex in the wal file.

Test Plan: run ldb on a  db to check if output was satisfactory

Reviewers: dhruba

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D8691
2013-02-20 11:01:37 -08:00
amayank
b2c50f1c3f Fix for the weird behaviour encountered by ldb Get where it could read only the second-latest value
Summary:
Changed the Get and Scan options with openForReadOnly mode to have access to the memtable.
Changed the visibility of NewInternalIterator in db_impl from private to protected so that
the derived class db_impl_read_only can call that in its NewIterator function for the
scan case. The previous approach which changed the default for flush_on_destroy_ from false to true
caused many problems in the unit tests due to empty sst files that it created. All
unit tests pass now.

Test Plan: make clean; make all check; ldb put and get and scans

Reviewers: dhruba, heyongqiang, sheki

Reviewed By: dhruba

CC: kosievdmerwe, zshao, dilipj, kailiu

Differential Revision: https://reviews.facebook.net/D8697
2013-02-20 10:45:52 -08:00
Abhishek Kona
fe10200ddc Introduce histogram in statistics.h
Summary:
* Introduce is histogram in statistics.h
* stop watch to measure time.
* introduce two timers as a poc.
Replaced NULL with nullptr to fight some lint errors
Should be useful for google.

Test Plan:
ran db_bench and check stats.
make all check

Reviewers: dhruba, heyongqiang

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D8637
2013-02-20 10:43:32 -08:00
Kai Liu
45f0030458 Fix the "IO error" in auto_roll_logger_test
Summary:

I missed InitTestDb() in one of my tess. InitTestDb() initializes the test directory, without which the test will throw IO error.

This problem didn't occur before because I've already run the tests before so the test directory is already there.

Test Plan:

Reviewers: dhruba

CC:

Task ID: #

Blame Rev:
2013-02-19 00:13:22 -08:00
amayank
f3901e0647 Revert "Fix for the weird behaviour encountered by ldb Get where it could read only the second-latest value"
This reverts commit 4c696ed001.
2013-02-18 22:32:27 -08:00
amayank
4c696ed001 Fix for the weird behaviour encountered by ldb Get where it could read only the second-latest value
Summary:
flush_on_destroy has a default value of false and the memtable is flushed
in the dbimpl-destructor only when that is set to true. Because we want the memtable to be flushed everytime that
the destructor is called(db is closed) and the cases where we work with the memtable only are very less
it is a good idea to give this a default value of true. Thus the put from ldb
wil have its data flushed to disk in the destructor and the next Get will be able to
read it when opened with OpenForReadOnly. The reason that ldb could read the latest value when
the db was opened in the normal Open mode is that the Get from normal Open first reads
the memtable and directly finds the latest value written there and the Get from OpenForReadOnly
doesn't have access to the memtable (which is correct because all its Put/Modify) are disabled

Test Plan: make all; ldb put and get and scans

Reviewers: dhruba, heyongqiang, sheki

Reviewed By: heyongqiang

CC: kosievdmerwe, zshao, dilipj, kailiu

Differential Revision: https://reviews.facebook.net/D8631
2013-02-15 16:56:06 -08:00
Kai Liu
aaa0cbb97a Fix the warning introduced by auto_roll_logger_test
Summary: Fix the warning [-Werror=format-security] and [-Werror=unused-result].

Test Plan:
enforced the Werror and run make

Task ID: 2101673

Blame Rev:

Reviewers: heyongqiang

Differential Revision: https://reviews.facebook.net/D8553
2013-02-13 15:29:35 -08:00
Chip Turner
f02db1c118 Add zlib to our builds and tweak histogram output
Summary:
$SUBJECT -- cosmetic fix for histograms, print P75/P99, and
make sure zlib is enabled for our command line tools.

Test Plan: compile, test db_bench with --compression_type=zlib

Reviewers: heyongqiang

Reviewed By: heyongqiang

CC: adsharma, leveldb

Differential Revision: https://reviews.facebook.net/D8445
2013-02-07 15:31:53 -08:00
Kai Liu
b63aafce42 Allow the logs to be purged by TTL.
Summary:
* Add a SplitByTTLLogger to enable this feature. In this diff I implemented generalized AutoSplitLoggerBase class to simplify the
development of such classes.
* Refactor the existing AutoSplitLogger and fix several bugs.

Test Plan:
* Added a unit tests for different types of "auto splitable" loggers individually.
* Tested the composited logger which allows the log files to be splitted by both TTL and log size.

Reviewers: heyongqiang, dhruba

Reviewed By: heyongqiang

CC: zshao, leveldb

Differential Revision: https://reviews.facebook.net/D8037
2013-02-04 19:42:40 -08:00
Abhishek Kona
4dc02f7b7a Initialize all doubles to 0 in histogram.cc
Summary:
The existing code did not initialize a few doubles in histogram.cc.
Cropped up when I wrote a unit-test.

Test Plan: make all check

Reviewers: chip

Reviewed By: chip

CC: leveldb

Differential Revision: https://reviews.facebook.net/D8319
2013-01-31 17:31:43 -08:00
Abhishek Kona
009034cf12 Performant util/histogram.
Summary:
Earlier way to record in histogram=>
Linear search BucketLimit array to find the bucket and increment the
counter
Current way to record in histogram=>
Store a HistMap statically which points the buckets of each value in the
range [kFirstValue, kLastValue);

In the proccess use vectors instead of array's and refactor some code to
HistogramHelper class.

Test Plan:
run db_bench with histogram=1 and see a histogram being
printed.

Reviewers: dhruba, chip, heyongqiang

Reviewed By: chip

CC: leveldb

Differential Revision: https://reviews.facebook.net/D8265
2013-01-31 16:10:34 -08:00
Kosie van der Merwe
4dcc0c89f4 Fixed cache key for block cache
Summary:
Added function to `RandomAccessFile` to generate an unique ID for that file. Currently only `PosixRandomAccessFile` has this behaviour implemented and only on Linux.

Changed how key is generated in `Table::BlockReader`.

Added tests to check whether the unique ID is stable, unique and not a prefix of another unique ID. Added tests to see that `Table` uses the cache more efficiently.

Test Plan: make check

Reviewers: chip, vamsi, dhruba

Reviewed By: chip

CC: leveldb

Differential Revision: https://reviews.facebook.net/D8145
2013-01-31 15:20:24 -08:00
Chip Turner
2c3565285e Add OS_LINUX ifdef protections around fallocate parts
Summary: fallocate is linux only, so let's protect it with ifdef's

Test Plan: make

Reviewers: sheki, dhruba

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D8223
2013-01-28 12:03:35 -08:00
Dilip Antony Joseph
11ce6a060e Enhanced ldb to support data access commands
Summary: Added put/get/scan/batchput/delete/approxsize

Test Plan: Added pyunit script to test the newly added commands

Reviewers: chip, leveldb

Reviewed By: chip

CC: zshao, emayanke

Differential Revision: https://reviews.facebook.net/D7947
2013-01-28 11:38:26 -08:00
Chip Turner
0b83a83191 Fix poor error on num_levels mismatch and few other minor improvements
Summary:
Previously, if you opened a db with num_levels set lower than
the database, you received the unhelpful message "Corruption:
VersionEdit: new-file entry."  Now you get a more verbose message
describing the issue.

Also, fix handling of compression_levels (both the run-over-the-end
issue and the memory management of it).

Lastly, unique_ptr'ify a couple of minor calls.

Test Plan: make check

Reviewers: dhruba

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D8151
2013-01-25 15:37:26 -08:00
Chip Turner
772f75b3fb Stop continually re-creating build_version.c
Summary:
We continually rebuilt build_version.c because we put the
current date into it, but that's what __DATE__ already is.  This makes
builds faster.

This also fixes an issue with 'make clean FOO' not working properly.

Also tweak the build rules to be more consistent, always have warnings,
and add a 'make release' rule to handle flags for release builds.

Test Plan: make, make clean

Reviewers: dhruba

Reviewed By: dhruba

Differential Revision: https://reviews.facebook.net/D8139
2013-01-24 17:51:39 -08:00
Chip Turner
3dafdfb2c4 Use fallocate to prevent excessive allocation of sst files and logs
Summary:
On some filesystems, pre-allocation can be a considerable
amount of space.  xfs in our production environment pre-allocates by
1GB, for instance.  By using fallocate to inform the kernel of our
expected file sizes, we eliminate this wasteage (that isn't recovered
until the file is closed which, in the case of LOG files, can be a
considerable amount of time).

Test Plan:
created an xfs loopback filesystem, mounted with
allocsize=4M, and ran db_stress.  LOG file without this change was 4M,
and with it it was 128k then grew to normal size.

Reviewers: dhruba

Reviewed By: dhruba

CC: adsharma, leveldb

Differential Revision: https://reviews.facebook.net/D7953
2013-01-24 12:25:13 -08:00
Chip Turner
2fdf91a4f8 Fix a number of object lifetime/ownership issues
Summary:
Replace manual memory management with std::unique_ptr in a
number of places; not exhaustive, but this fixes a few leaks with file
handles as well as clarifies semantics of the ownership of file handles
with log classes.

Test Plan: db_stress, make check

Reviewers: dhruba

Reviewed By: dhruba

CC: zshao, leveldb, heyongqiang

Differential Revision: https://reviews.facebook.net/D8043
2013-01-23 16:54:11 -08:00
Abhishek Kona
7d5a4383bb rollover manifest file.
Summary:
Check in LogAndApply if the file size is more than the limit set in
Options.
Things to consider : will this be expensive?

Test Plan: make all check. Inputs on a new unit test?

Reviewers: dhruba

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D7701
2013-01-16 12:09:44 -08:00