Summary: This implement the Java interface by using JNI
Test Plan: compile test
Reviewers: dhruba
Reviewed By: dhruba
Differential Revision: https://reviews.facebook.net/D5925
Summary:
Each assoc is identified by (id1, assocType). This is the rowkey.
Each row has a read/write rowlock. There is statically allocated array
of 2000 read/write locks. A rowkey is murmur-hashed to one of the
read/write locks.
assocPut and assocDelete acquires the rowlock in Write mode.
The key-updates are done within the rowlock with a atomic nosync
batch write to leveldb. Then the rowlock is released and
a write-with-sync is done to sync leveldb transaction log.
Test Plan: added unit test
Reviewers: heyongqiang
Reviewed By: heyongqiang
Differential Revision: https://reviews.facebook.net/D5859
Summary:
We have seen that reading data via the pread call (instead of
mmap) is much faster on Linux 2.6.x kernels. This patch makes
an equivalent option to switch off mmaps for the write path
as well.
db_bench --mmap_write=0 will use write() instead of mmap() to
write data to a file.
This change is backward compatible, the default
option is to continue using mmap for writing to a file.
Test Plan: "make check all"
Differential Revision: https://reviews.facebook.net/D5781
Summary:
The option is zero by default and in that case reporting is unchanged.
By unchanged, the interval at which stats are reported is scaled after each
report and newline is not issued after each report so one line is rewritten.
When non-zero it specifies the constant interval (in operations) at which
statistics are reported and the stats include the rate per interval. This
makes it easier to determine whether QPS changes over the duration of the test.
Task ID: #
Blame Rev:
Test Plan:
run db_bench
Revert Plan:
Database Impact:
Memcache Impact:
Other Notes:
EImportant:
- begin *PUBLIC* platform impact section -
Bugzilla: #
- end platform impact -
Reviewers: dhruba
Reviewed By: dhruba
CC: heyongqiang
Differential Revision: https://reviews.facebook.net/D5817
Summary:
Implement ReadWrite locks for leveldb. These will be helpful
to implement a read-modify-write operation (e.g. atomic increments).
Test Plan: does not modify any existing code
Reviewers: heyongqiang
Reviewed By: heyongqiang
CC: MarkCallaghan
Differential Revision: https://reviews.facebook.net/D5787
Summary:
A simple CLI which calles DB->CompactRange()
Can take String key's as range.
Test Plan:
Inserted data into a table.
Waited for a minute, used compact tool on it. File modification time's
changed so Compact did something on the files.
Existing unit tests work.
Reviewers: heyongqiang, dhruba
Reviewed By: dhruba
Differential Revision: https://reviews.facebook.net/D5697
Summary: Print the block cache size in the LOG.
Test Plan: run db_bench and look at LOG. This is helpful while I was debugging one use-case.
Reviewers: heyongqiang, MarkCallaghan
Reviewed By: heyongqiang
Differential Revision: https://reviews.facebook.net/D5739
Summary:
In the current code, a Get() call can trigger compaction if it has to look at more than one file. This causes unnecessary compaction because looking at more than one file is a penalty only if the file is not yet in the cache. Also, th current code counts these files before the bloom filter check is applied.
This patch counts a 'seek' only if the file fails the bloom filter
check and has to read in data block(s) from the storage.
This patch also counts a 'seek' if a file is not present in the file-cache, because opening a file means that its index blocks need to be read into cache.
Test Plan: unit test attached. I will probably add one more unti tests.
Reviewers: heyongqiang
Reviewed By: heyongqiang
CC: MarkCallaghan
Differential Revision: https://reviews.facebook.net/D5709
Summary:
Create a tool to iterate through keys and dump values. Current options
as follows:
db_dump --start=[START_KEY] --end=[END_KEY] --max_keys=[NUM] --stats
[PATH]
START_KEY: First key to start at
END_KEY: Key to end at (not inclusive)
NUM: Maximum number of keys to dump
PATH: Path to leveldb DB
The --stats command line argument prints out the DB stats before dumping
the keys.
Test Plan:
- Tested with invalid args
- Tested with invalid path
- Used empty DB
- Used filled DB
- Tried various permutations of command line options
Reviewers: dhruba, heyongqiang
Reviewed By: dhruba
Differential Revision: https://reviews.facebook.net/D5643
Summary:
If ReadCompaction is switched off, then it is better to not even
submit background compaction jobs. I see about 3% increase in
read-throughput on a pure memory database.
Test Plan: run db_bench
Reviewers: heyongqiang
Reviewed By: heyongqiang
Differential Revision: https://reviews.facebook.net/D5673
Summary:
The GetLiveFiles() api lists the set of sst files and the current
MANIFEST file. But the database continues to append new data to the
MANIFEST file even when the application is backing it up to the
backup location. This means that the database-version that is
stored in the MANIFEST FILE in the backup location
does not correspond to the sst files returned by GetLiveFiles.
This API adds a new parameter to GetLiveFiles. This new parmeter
returns the current size of the MANIFEST file.
Test Plan: Unit test attached.
Reviewers: heyongqiang
Reviewed By: heyongqiang
Differential Revision: https://reviews.facebook.net/D5631
Summary:
Keeping symbols in the binary increases the size of the library but makes
it easier to debug. The optimization level is still -O2, so this should
have no impact on performance.
Test Plan: make all
Reviewers: heyongqiang, MarkCallaghan
Reviewed By: MarkCallaghan
CC: MarkCallaghan
Differential Revision: https://reviews.facebook.net/D5601
Summary:
The background threads are necessary for compaction.
For slower storage, it might be necessary to have more than
one compaction thread per DB. This patch allows creating
a configurable number of worker threads.
The default reamins at 1 (to maintain backward compatibility).
Test Plan:
run all unit tests. changes to db-bench coming in
a separate patch.
Reviewers: heyongqiang
Reviewed By: heyongqiang
CC: MarkCallaghan
Differential Revision: https://reviews.facebook.net/D5559
Summary: Print out the compile version in the LOG.
Test Plan: run dbbench and verify LOG
Reviewers: heyongqiang
Reviewed By: heyongqiang
Differential Revision: https://reviews.facebook.net/D5529
Summary: Use correct version of jemalloc.
Test Plan: run unit tests
Reviewers: heyongqiang
Reviewed By: heyongqiang
Differential Revision: https://reviews.facebook.net/D5487
Summary:
as subject. This diff should be good for benchmarking.
will send another diff to make it better in the case the seek compaction is enable.
In that coming diff, will not count a seek if the bloomfilter filters.
Test Plan: build
Reviewers: dhruba, MarkCallaghan
Reviewed By: MarkCallaghan
Differential Revision: https://reviews.facebook.net/D5481
Summary:
A set of apis that allows an application to backup data from the
leveldb database based on a set of files.
Test Plan: unint test attached. more coming soon.
Reviewers: heyongqiang
Reviewed By: heyongqiang
Differential Revision: https://reviews.facebook.net/D5439
Summary:
as subject. this can be used for benchmarking.
If we want it for some cases, we can do more changes to make this part of the option.
Test Plan: db_test
Reviewers: dhruba
CC: MarkCallaghan
Differential Revision: https://reviews.facebook.net/D5451
Summary:
Reads via mmap on concurrent workloads are much slower than pread.
For example on a 24-core server with storage that can do 100k IOPS or more
I can get no more than 10k IOPS with mmap reads and 32+ threads.
Test Plan: db_bench benchmarks
Reviewers: dhruba, heyongqiang
Reviewed By: heyongqiang
Differential Revision: https://reviews.facebook.net/D5433
Summary:
FLAGS_cache_size is a long, no need to scan %lld into a size_t
for it (which generates a compiler warning)
Test Plan: run db_bench
Reviewers: dhruba, heyongqiang
Reviewed By: heyongqiang
CC: heyongqiang
Differential Revision: https://reviews.facebook.net/D5427
Summary:
This adds an option to db_bench to specify the compression algorithm to
use for LevelDB
Test Plan: ran db_bench
Reviewers: dhruba
Reviewed By: dhruba
Differential Revision: https://reviews.facebook.net/D5421
Summary:
Ability to switch off filesystem read-aheads. This change is
backward-compatible: the default setting is to allow file
system read-aheads.
Test Plan: run benchmarks
Reviewers: heyongqiang, adsharma
Reviewed By: heyongqiang
Differential Revision: https://reviews.facebook.net/D5391
Summary:
When posix_fadvise(offset, offset) is usedm it frees up only those
pages in that specified range. But the filesystem could have done some
read-aheads and those get cached in the OS cache.
Do not cache readahead-pages in the OS cache.
Test Plan: run db_bench benchmark.
Reviewers: vamsi, heyongqiang
Reviewed By: heyongqiang
Differential Revision: https://reviews.facebook.net/D5379
Summary: Enable db_bench to specify block size.
Test Plan: compile and run
Reviewers: heyongqiang
Reviewed By: heyongqiang
Differential Revision: https://reviews.facebook.net/D5373
Summary: added a new option db_log_dir, which points the log dir. Inside that dir, in order to make log names unique, the log file name is prefixed with the leveldb data dir absolute path.
Test Plan: db_test
Reviewers: dhruba
Reviewed By: dhruba
Differential Revision: https://reviews.facebook.net/D5205
Summary: If none of reads or writes are specified by user, then pick the FLAGS_NUM as the number of iterations in the ReadRandomWriteRandom test. If either reads or writes are defined, then use their maximum.
Test Plan: run benchmark
Reviewers: heyongqiang
Reviewed By: heyongqiang
Differential Revision: https://reviews.facebook.net/D5217
Summary: Do not use scribe for release builds.
Test Plan: build fbcode
Reviewers: heyongqiang
Reviewed By: heyongqiang
Differential Revision: https://reviews.facebook.net/D5139
Summary:
This patch enables the db_bench benchmark to issue both random reads and random writes at the same time. This options can be trigged via
./db_bench --benchmarks=readrandomwriterandom
The default percetage of reads is 90.
One can change the percentage of reads by specifying the --readwritepercent.
./db_bench --benchmarks=readrandomwriterandom=50
This is a feature request from Jeffro asking for leveldb performance with a 90:10 read:write ratio.
Test Plan: run on test machine.
Reviewers: heyongqiang
Reviewed By: heyongqiang
Differential Revision: https://reviews.facebook.net/D5067
Summary:
Clean up compiler warnings generated by -Wall option.
make clean all OPT=-Wall
This is a pre-requisite before making a new release.
Test Plan: compile and run unit tests
Reviewers: heyongqiang
Reviewed By: heyongqiang
Differential Revision: https://reviews.facebook.net/D5019
Summary:
The numbers of shards that the block cache is divided into is
configurable. However, if the user specifies that he/she wants
the block cache to be divided into more than 2**20 pieces, then
the system will rey to allocate a huge array of that size) that
could fail.
It is better to limit the sharding of the block cache to an
upper bound. The default sharding is 16 shards (i.e. 2**4)
and the maximum is now 2 million shards (i.e. 2**20).
Also, fixed a bug with the LRUCache where the numShardBits
should be a private member of the LRUCache object rather than
a static variable.
Test Plan:
run db_bench with --cache_numshardbits=64.
Task ID: #
Blame Rev:
Reviewers: heyongqiang
Reviewed By: heyongqiang
Differential Revision: https://reviews.facebook.net/D5013
Summary: as subject. ported the change from google code leveldb 1.5
Test Plan: run db_test
Reviewers: dhruba
Differential Revision: https://reviews.facebook.net/D4839