2035 Commits

Author SHA1 Message Date
sdong
86a0133d05 PlainTableReader to expose index size to users
Summary:
This is a temporary solution to expose index sizes to users from PlainTableReader before we persist them to files.
In this patch, the memory consumption of the indexes used by PlainTableReader is reported as two user-defined properties, so that users can monitor them.

Test Plan:
Add a unit test.
make all check

Reviewers: haobo, ljin

Reviewed By: haobo

CC: nkg-, yhchiang, igor, ljin, dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D18195
2014-04-22 19:29:05 -07:00
Igor Canadi
1068d2fa60 Revert "Better port::Mutex::AssertHeld() and AssertNotHeld()"
This reverts commit ddafceb6c2.
2014-04-22 18:38:10 -07:00
Igor Canadi
ddafceb6c2 Better port::Mutex::AssertHeld() and AssertNotHeld()
Summary:
Using ThreadLocalPtr as a flag to determine if a mutex is locked or not enables us to implement AssertNotHeld(). It also makes AssertHeld() actually correct.

I had to remove port::Mutex as a dependency for util/thread_local.h, but that's fine since we can just use std::mutex :)
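
The idea of tracking lock ownership per thread can be sketched as follows. This is only an illustration, not the code from this commit (which was later reverted): it swaps RocksDB's ThreadLocalPtr for a C++11 `thread_local` set, and `TrackedMutex` and `IsHeldByThisThread()` are hypothetical names.

```cpp
#include <cassert>
#include <mutex>
#include <unordered_set>

// Hypothetical wrapper for illustration; this is not RocksDB's port::Mutex.
class TrackedMutex {
 public:
  void Lock() {
    mu_.lock();
    HeldSet().insert(this);  // record ownership for the calling thread
  }

  void Unlock() {
    HeldSet().erase(this);  // clear ownership before releasing the lock
    mu_.unlock();
  }

  // True only if the calling thread currently holds this mutex.
  bool IsHeldByThisThread() const { return HeldSet().count(this) > 0; }

  void AssertHeld() const { assert(IsHeldByThisThread()); }
  void AssertNotHeld() const { assert(!IsHeldByThisThread()); }

 private:
  // One set per thread, listing the mutexes that thread has locked.
  static std::unordered_set<const TrackedMutex*>& HeldSet() {
    thread_local std::unordered_set<const TrackedMutex*> held;
    return held;
  }

  std::mutex mu_;
};
```

Because the bookkeeping is thread-local, a mutex locked by one thread reports as not held when queried from any other thread, which is what makes an AssertNotHeld() check possible at all.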

Test Plan: make check

Reviewers: ljin, dhruba, haobo, sdong, yhchiang

Reviewed By: ljin

CC: leveldb

Differential Revision: https://reviews.facebook.net/D18171
2014-04-22 17:26:21 -07:00
Yueh-Hsuan Chiang
29123408b0 Merge pull request #125 from ankgup87/master
[Java] Add bloom filter JNI bindings
2014-04-22 13:46:42 -07:00
Ankit Gupta
042221ba32 Merge branch 'master' of https://github.com/facebook/rocksdb 2014-04-22 13:05:40 -07:00
Igor Canadi
3992aec8fa Support for column families in TTL DB
Summary:
This will enable people using TTL DB to do so with multiple column families. They can also specify different TTLs for each one.

TODO: Implement CreateColumnFamily() in TTL world.

Test Plan: Added a very simple sanity test.

Reviewers: dhruba, haobo, ljin, sdong, yhchiang

Reviewed By: haobo

CC: leveldb, alberts

Differential Revision: https://reviews.facebook.net/D17859
2014-04-22 11:27:33 -07:00
Ankit Gupta
dd9f6f0a31 Fix formatting 2014-04-22 10:51:39 -07:00
James Pearce
e557297acc New CLA form 2014-04-22 09:12:19 -07:00
Ankit Gupta
7a5106fbea Add doc 2014-04-22 09:01:57 -07:00
Ankit Gupta
2214fd8a15 Refactor filter impl 2014-04-22 08:58:43 -07:00
Ankit Gupta
89cb481aa1 Fix doc 2014-04-22 00:09:40 -07:00
Ankit Gupta
677b0d6d3f Refactor filter impl 2014-04-22 00:04:56 -07:00
Ankit Gupta
5e797cf0dd Change filter implementation 2014-04-21 23:56:19 -07:00
Ankit Gupta
cea2be20b6 Fix formatting 2014-04-21 20:27:09 -07:00
Ankit Gupta
dc4b27ac48 Add bloom filters 2014-04-21 20:25:30 -07:00
Yueh-Hsuan Chiang
af6ad113a8 Fix segfault when running sst_dump on v2.6 db
Summary: Fix the segfault when running sst_dump on v2.6 db.

Test Plan:
    git checkout bba6595b1f
    make clean
    make db_bench
    ./db_bench --db=/tmp/some/db --benchmarks=fillseq
    arc patch D18039
    make clean
    make sst_dump
    ./sst_dump --file=/tmp/some/db --command=check

Reviewers: igor, haobo, sdong

Reviewed By: sdong

CC: leveldb

Differential Revision: https://reviews.facebook.net/D18039
2014-04-21 17:49:47 -07:00
Igor Canadi
c2da9e5997 Flush before Fsync()/Sync()
Summary: Calling Fsync()/Sync() on a file should guarantee that whatever you have written to the file is now persisted. This is currently not the case, since some data may still be left in the application cache when we call Fsync()/Sync(). For example, BuildTable() calls Fsync() without the flush, assuming all sst data is now persisted, but it's actually not. This may result in big inconsistencies.
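
A minimal sketch of the flush-then-sync ordering, using an in-memory stand-in for the file. `FakeFile` and `BufferedWriter` are hypothetical names invented for this illustration; they are not RocksDB classes and do not reflect how the actual patch is written.

```cpp
#include <cassert>
#include <string>

// Hypothetical in-memory model of a file: data first reaches the OS cache
// and only becomes durable once Fsync() is called.
struct FakeFile {
  std::string os_cache;  // handed to the OS, not yet durable
  std::string on_disk;   // durable after Fsync()

  void Write(const std::string& data) { os_cache += data; }
  void Fsync() {
    on_disk += os_cache;
    os_cache.clear();
  }
};

// Hypothetical writer that keeps an application-level buffer in front of
// the file, which is the situation the commit message describes.
class BufferedWriter {
 public:
  explicit BufferedWriter(FakeFile* file) : file_(file) {
    assert(file_ != nullptr);
  }

  void Append(const std::string& data) { buffer_ += data; }

  // Hands the application buffer to the OS.
  void Flush() {
    file_->Write(buffer_);
    buffer_.clear();
  }

  // Flushes first, so data still sitting in buffer_ is included in the sync.
  void Sync() {
    Flush();
    file_->Fsync();
  }

 private:
  FakeFile* file_;
  std::string buffer_;
};
```

Without the Flush() call inside Sync(), appended data would stay in `buffer_` and `on_disk` would remain empty even though the caller believes the data is persisted.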

Test Plan: no test

Reviewers: sdong, dhruba, haobo, ljin, yhchiang

Reviewed By: sdong

CC: leveldb

Differential Revision: https://reviews.facebook.net/D18159
2014-04-21 17:45:04 -07:00
Igor Canadi
ba16c1f410 Move benchmark timing to Env::NowNanos() 2014-04-21 17:43:51 -07:00
Yueh-Hsuan Chiang
e316af5f16 [Java] Add Java binding and Java test for ReadOptions.
Summary: Add Java binding and test for rocksdb::ReadOptions.

Test Plan:
make rocksdbjava
make jtest

Reviewers: haobo, dhruba, sdong, ankgup87, rsumbaly, swapnilghike, zzbennett

Reviewed By: haobo

CC: leveldb

Differential Revision: https://reviews.facebook.net/D18129
2014-04-21 15:52:59 -07:00
Igor Canadi
d0939cdcea Single-threaded asan_crash_test 2014-04-21 15:42:28 -07:00
Yueh-Hsuan Chiang
ef8b8a8ef6 [Java] Add Java bindings for memtables and sst format.
Summary:
Add Java bindings for memtables and sst format.  Specifically,
add two abstract Java classes --- MemTableConfig and SstFormatConfig.
Each MemTable / SST implementation should have its own config class
that extends MemTableConfig / SstFormatConfig respectively and is passed
to Options via setMemTableConfig / setSstConfig.

Test Plan:
make rocksdbjava
make jdb_test
make jdb_bench
java/jdb_bench.sh \
  --benchmarks=fillseq,readrandom,readwhilewriting \
  --memtablerep=hash_skiplist \
  --use_plain_table=1 \
  --key_size=20 \
  --prefix_size=12 \
  --value_size=100 \
  --cache_size=17179869184 \
  --disable_wal=0 \
  --sync=0

Reviewers: haobo, ankgup87, sdong

Reviewed By: haobo

CC: leveldb, dhruba

Differential Revision: https://reviews.facebook.net/D17997
2014-04-21 15:40:46 -07:00
Igor Canadi
8dc34364d2 Rename "benchmark" back to "bench".
Also, make `benchharness.cc` not compiled into rocksdb library.
2014-04-21 13:12:15 -07:00
Igor Canadi
05c168658e Relax env_test::AllocateTest 2014-04-21 12:56:32 -07:00
Pratyush Seth
ff1b5df4c6 Added benchmark functionality on the lines of folly/Benchmark.h
Summary: Added benchmark functionality on the lines of folly/Benchmark.h

Test Plan: Added unit tests

Reviewers: igor, haobo, sdong, ljin, yhchiang, dhruba

Reviewed By: igor

CC: leveldb

Differential Revision: https://reviews.facebook.net/D17973
2014-04-21 12:29:55 -07:00
Igor Canadi
c7076a7a05 Fix Allocate test
Summary: For some reason, on a subset of our continuous build machines, preallocation is allocating 8 blocks more than it should. Let's relax the test a little bit -- now we require the test to allocate *at least* the number of blocks we asked for.

Test Plan: no

Reviewers: ljin, haobo, sdong

Reviewed By: ljin

CC: leveldb

Differential Revision: https://reviews.facebook.net/D18141
2014-04-21 12:12:02 -07:00
Igor Canadi
f813279da5 Remove TransactionLogIteratorRace when -DNDEBUG 2014-04-21 11:08:30 -07:00
Igor Canadi
11e8525422 Merge pull request #124 from ankgup87/master
[Java] Add iterator JNI bindings
2014-04-21 09:59:39 -07:00
Lei Jin
0f2d768191 hints for narrowing down FindFile range and avoiding checking irrelevant L0 files
Summary:
The file tree structure in Version is prebuilt and the range of each file is known.
On the Get() code path, we do binary search in FindFile() by comparing
target key with each file's largest key and also check the range for each L0 file.
With some pre-calculated knowledge, each key comparison that has been done can serve
as a hint to narrow down further searches:
(1) If a key falls within a L0 file's range, we can safely skip the next
file if its range does not overlap with the current one.
(2) If a key falls within a file's range in level L0 - Ln-1, we should only
need to binary search in the next level for files that overlap with the current one.

(1) will be able to skip some files depending on the key distribution.
(2) can greatly reduce the range of binary search, especially for bottom
levels, given that one file most likely only overlaps with N files from
the level below (where N is max_bytes_for_level_multiplier). So on level
L, we will only look at ~N files instead of N^L files.
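
Hint (2) can be sketched roughly as below. This is a simplified illustration under stated assumptions, not the code from this patch: keys are plain integers, each level is a sorted vector of non-overlapping files, and `FileRange`, `IndexRange`, `OverlapHint` and `LookupWithHint` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified file metadata: integer keys, inclusive range (assumption).
struct FileRange {
  int smallest;
  int largest;
};

// Half-open range [begin, end) of file indexes in a level.
struct IndexRange {
  std::size_t begin;
  std::size_t end;
};

// Binary search in files[lo, hi) for the first file whose largest key is
// >= key. Returns hi if there is no such file.
std::size_t FindFile(const std::vector<FileRange>& files, int key,
                     std::size_t lo, std::size_t hi) {
  auto it = std::lower_bound(
      files.begin() + lo, files.begin() + hi, key,
      [](const FileRange& f, int k) { return f.largest < k; });
  return static_cast<std::size_t>(it - files.begin());
}

// Precomputed once per file: which files in the next level overlap `parent`.
IndexRange OverlapHint(const FileRange& parent,
                       const std::vector<FileRange>& next) {
  std::size_t begin = FindFile(next, parent.smallest, 0, next.size());
  std::size_t end = begin;
  while (end < next.size() && next[end].smallest <= parent.largest) ++end;
  return {begin, end};
}

// Given that `key` falls inside `parent`, search only the hinted slice of
// the next level. Returns the index of the containing file, or -1.
int LookupWithHint(const FileRange& parent, const IndexRange& hint,
                   const std::vector<FileRange>& next, int key) {
  assert(parent.smallest <= key && key <= parent.largest);
  std::size_t i = FindFile(next, key, hint.begin, hint.end);
  if (i == hint.end || next[i].smallest > key) return -1;
  return static_cast<int>(i);
}
```

Any file in the next level that contains the key must overlap the parent file, so restricting the binary search to the hinted slice gives the same answer as searching the whole level while touching far fewer files.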

Some initial results: measured with a 500M-key DB, when writes are light (10k/s = 1.2M/s), this
improves QPS ~7% on top of blocked bloom. When writes are heavier (80k/s =
9.6M/s), it gives us ~13% improvement.

Test Plan: make all check

Reviewers: haobo, igor, dhruba, sdong, yhchiang

Reviewed By: haobo

CC: leveldb

Differential Revision: https://reviews.facebook.net/D17205
2014-04-21 09:10:12 -07:00
Ankit Gupta
bbdd550b66 Remove getIterator function from portal 2014-04-19 23:17:42 -07:00
Ankit Gupta
1574e0c41a Add doc 2014-04-19 13:21:06 -07:00
Ankit Gupta
06b590dd7c Add doc 2014-04-19 13:13:01 -07:00
Ankit Gupta
dc28a726c1 Add doc + refactor + fix formatting 2014-04-19 13:05:21 -07:00
Ankit Gupta
1d6c1e018f Add more iterator JNI bindings 2014-04-19 12:55:28 -07:00
Ankit Gupta
eda398491a Add more iterator functions 2014-04-19 03:35:01 -07:00
Ankit Gupta
5bbeefaa49 Adding iterator JNI binding 2014-04-19 03:26:22 -07:00
sdong
27d3bc184e Use a different approach to make sure BlockBasedTableReader can use hash index on older files
Summary:
A recent commit, e37dd216f9, makes sure the hash index can be used when reading existing files. This patch achieves the same goal in a different way:
(1) For now, always write kBinarySearch to files, regardless of the BlockBasedTableOptions.IndexType setting.
(2) When reading a file, read out the field and make sure it is kBinarySearch, while always using the index type specified by the user.

The reason for doing this is to reserve the kHashSearch property value on disk for future use. If we now wrote out a binary index under both kHashSearch and kBinarySearch, we would have to introduce a new flag in the future for an on-disk hash index, otherwise compatibility would break. Also, we want the real index type and the type shown in the properties block to be consistent.

Test Plan: make all check

Reviewers: haobo, kailiu

Reviewed By: kailiu

CC: igor, ljin, yhchiang, xjin, dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D18009
2014-04-18 14:09:21 -07:00
Igor Canadi
35c968f3a5 Merge pull request #122 from ankgup87/master
[Java] Add statistics JNI bindings
2014-04-18 13:53:52 -07:00
Ankit Gupta
686fdea811 Fix formatting issues 2014-04-18 10:48:48 -07:00
Ankit Gupta
ebd85e8f3a Fix build 2014-04-18 10:47:03 -07:00
Ankit Gupta
dc291f5bf0 Merge branch 'master' of https://github.com/facebook/rocksdb
Conflicts:
	Makefile
	java/Makefile
	java/org/rocksdb/Options.java
	java/rocksjni/portal.h
2014-04-18 10:32:14 -07:00
Igor Canadi
1a8abe7276 Merge pull request #120 from jamesgpearce/master
Added period
2014-04-18 09:45:48 -07:00
James Pearce
a745089554 Added period
This is a PR to test some tooling; please do not merge without talking to @jamesgpearce :)
2014-04-18 09:33:27 -07:00
Yueh-Hsuan Chiang
9b2a0939cf [Java] Add Java bindings for 30 options for rocksdb::DBOptions.
Summary:
1. Add Java bindings for 30 options for rocksdb::DBOptions.
2. Add org.rocksdb.test.OptionsTest
3. Code is semi-auto-generated; JavaDocs are manually polished.

Test Plan:
make rocksdbjava
make jtest

Reviewers: haobo, ankgup87, sdong, dhruba

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D18015
2014-04-18 01:14:29 -07:00
Yueh-Hsuan Chiang
bb6fd15a6e [Java] Add a basic binding and test for BackupableDB and StackableDB.
Summary:
Add a skeleton binding and test for BackupableDB which shows that BackupableDB
and RocksDB can share the same JNI calls.

Test Plan:
make rocksdbjava
make jtest

Reviewers: haobo, ankgup87, sdong, dhruba

Reviewed By: haobo

CC: leveldb

Differential Revision: https://reviews.facebook.net/D17793
2014-04-17 17:28:51 -07:00
sdong
651792251a Fix bugs introduced by D17961
Summary:
D17961 has two bugs:
(1) the two-level iterator fails to populate FileMetaData.table_reader, causing a performance regression.
(2) the table cache handles the !status.ok() case in the wrong place, causing a segfault which shouldn't happen.

Test Plan: make all check

Reviewers: ljin, igor, haobo

Reviewed By: ljin

CC: yhchiang, dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D17991
2014-04-17 17:25:28 -07:00
Igor Canadi
ce353c2474 Nuke tools/shell
Summary: We don't use or build this code

Test Plan: builds

Reviewers: dhruba

Reviewed By: dhruba

CC: leveldb

Differential Revision: https://reviews.facebook.net/D17979
2014-04-17 14:43:42 -07:00
Igor Canadi
86ae8203e6 Fix ifdef NDEBUG 2014-04-17 14:29:28 -07:00
sdong
fa430bfd04 Minimize accessing multiple objects in Version::Get()
Summary:
One of our profiling runs shows that Version::Get() is sometimes slow when fetching pointers to user comparators or other global objects. In this patch:
(1) we keep pointers to immutable objects in Version to avoid accessing them through options objects or cfd objects
(2) table_reader is cached directly in FileMetaData so that the table cache doesn't have to go through the handle first to fetch it
(3) If level 0 has fewer than 3 files, skip the filtering logic based on SST tables' key ranges. The smallest and largest keys are stored in separate memory locations, which can cause cache misses

Test Plan: make all check

Reviewers: haobo, ljin

Reviewed By: haobo

CC: igor, yhchiang, nkg-, leveldb

Differential Revision: https://reviews.facebook.net/D17739
2014-04-17 14:14:00 -07:00
Kai Liu
e37dd216f9 Index type doesn't have to be persisted
Summary:

With the recent changes, there is no need to check the property block about the index block type.
If users want to use it, they don't need any disk format change; everything happens on the fly.

Also another team encountered an error while reading the index type from properties.

Test Plan:

ran all the tests

Reviewers: sdong

CC:

Task ID: #

Blame Rev:
2014-04-17 11:08:12 -07:00
Igor Canadi
62551b1c4e Don't compile sync_point if NDEBUG
Summary:
We don't really need sync_point.o if we're compiling with NDEBUG.

This diff depends on D17823

Test Plan: compiles

Reviewers: haobo, ljin, sdong

Reviewed By: haobo

CC: leveldb

Differential Revision: https://reviews.facebook.net/D17829
2014-04-17 10:49:58 -07:00