A library that provides an embeddable, persistent key-value store for fast storage.
Igor Canadi 0a019d74a0 Use malloc_usable_size() for accounting block cache size
Summary:
Currently, when we insert something into the block cache, we say that the block cache capacity decreased by the size of the block. However, the size of the block might be less than the actual memory used by this object. For example, a 4.5KB block will actually use 8KB of memory. So even if we configure the block cache to 10GB, our actual memory usage of the block cache can approach 20GB!

This problem showed up a lot in testing and just recently also showed up in MongoRocks production where we were using 30GB more memory than expected.

This diff will fix the problem. Instead of counting the block size, we will count memory used by the block. That way, a block cache configured to be 10GB will actually use only 10GB of memory.

I'm using a non-portable function and I couldn't find info on its portability on Google. However, it seems to work on Linux, which will cover the majority of our use-cases.
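
A minimal sketch of the accounting idea described above, not the code in this diff. `ChargeFor` and `MeasureCharge` are hypothetical helper names introduced for illustration. The sketch assumes `malloc_usable_size()` is available through `<malloc.h>` on Linux and falls back to the requested size elsewhere.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#if defined(__linux__)
#include <malloc.h>  // malloc_usable_size(); non-portable
#endif

// Hypothetical helper: number of bytes to charge against the cache
// capacity for a heap block that was requested with `requested` bytes.
inline std::size_t ChargeFor(void* block, std::size_t requested) {
#if defined(__linux__)
  // Ask the allocator how many bytes it actually reserved for this block.
  std::size_t usable = malloc_usable_size(block);
  // Never charge less than what was requested.
  return usable >= requested ? usable : requested;
#else
  // Assumption: no portable equivalent, so fall back to the requested size.
  (void)block;
  return requested;
#endif
}

// Hypothetical demo helper: allocate a block, compute its charge, free it.
inline std::size_t MeasureCharge(std::size_t requested) {
  void* block = std::malloc(requested);
  assert(block != nullptr);
  std::size_t charge = ChargeFor(block, requested);
  std::free(block);
  return charge;
}
```

Calling `MeasureCharge(4608)` returns the amount that would be charged for a 4.5KB block. The exact value depends on the allocator in use, so it is not shown here.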

Test Plan:
1. fill up mongo instance with 80GB of data
2. restart mongo with block cache size configured to 10GB
3. do a table scan in mongo
4. memory usage before the diff: 12GB. memory usage after the diff: 10.5GB

Reviewers: sdong, MarkCallaghan, rven, yhchiang

Reviewed By: yhchiang

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D40635
2015-06-26 11:48:09 -07:00
arcanist_util Integrate Jenkins with Phabricator 2015-04-07 11:56:29 -07:00
build_tools Use malloc_usable_size() for accounting block cache size 2015-06-26 11:48:09 -07:00
coverage Fix coverage script 2014-11-03 14:53:00 -08:00
db Call merge operators with empty values 2015-06-26 11:35:46 -07:00
doc Remove seek compaction 2014-06-20 10:23:02 +02:00
examples [API Change] Improve EventListener::OnFlushCompleted interface 2015-06-05 12:28:51 -07:00
hdfs Add Env::GetThreadID(), which returns the ID of the current thread. 2015-06-11 14:18:02 -07:00
include Call merge operators with empty values 2015-06-26 11:35:46 -07:00
java Use CompactRangeOptions for CompactRange 2015-06-17 14:36:14 -07:00
port Build for CYGWIN 2015-04-23 21:33:44 -07:00
table Use malloc_usable_size() for accounting block cache size 2015-06-26 11:48:09 -07:00
third-party Update COMMIT.md 2015-03-30 17:48:16 -07:00
tools Fix mac compile 2015-06-26 10:29:24 -07:00
util Implement a table-level row cache 2015-06-23 10:25:45 -07:00
utilities Make stringappend_test runnable in ROCKSDB_LITE 2015-06-24 15:01:43 -07:00
.arcconfig Integrate Jenkins with Phabricator 2015-04-07 11:56:29 -07:00
.clang-format A script that automatically reformat affected lines 2014-01-14 12:21:24 -08:00
.gitignore First version of rocksdb_dump and rocksdb_undump. 2015-06-19 16:24:36 -07:00
.travis.yml Don't preinstall jemalloc in Travis 2015-04-24 18:43:07 -07:00
AUTHORS Add AUTHORS file. Fix #203 2014-09-29 10:52:18 -07:00
CONTRIBUTING.md facebook accounts are not required for CLA signers 2014-07-08 05:57:54 -04:00
DUMP_FORMAT.md First version of rocksdb_dump and rocksdb_undump. 2015-06-19 16:24:36 -07:00
HISTORY.md Use malloc_usable_size() for accounting block cache size 2015-06-26 11:48:09 -07:00
INSTALL.md Fix broken gflags link 2015-06-22 09:31:52 -07:00
LICENSE Fix copyright year 2014-03-12 12:06:58 -07:00
Makefile Remove -Wl,--no-as-needed flag when making shared_lib in OSX and IOS 2015-06-23 16:32:59 -07:00
PATENTS Update Patent Grant. 2015-04-13 10:33:43 +01:00
README.md Replaced "built on on earlier work" by "built on earlier work" in README.md 2014-09-17 01:16:17 -07:00
ROCKSDB_LITE.md Optimistic Transactions 2015-05-29 14:36:35 -07:00
src.mk Add wal files to Checkpoint for multiple column families. 2015-06-19 16:08:31 -07:00
USERS.md Add Yahoo's blog post about Sherpa to USERS.md 2015-06-09 12:55:58 -07:00
Vagrantfile RocksDB on FreeBSD support 2015-02-26 15:19:17 -08:00

RocksDB: A Persistent Key-Value Store for Flash and RAM Storage


RocksDB is developed and maintained by the Facebook Database Engineering Team. It is built on earlier work on LevelDB by Sanjay Ghemawat (sanjay@google.com) and Jeff Dean (jeff@google.com).

This code is a library that forms the core building block for a fast key-value server, especially suited for storing data on flash drives. It has a Log-Structured-Merge-Database (LSM) design with flexible tradeoffs between Write-Amplification-Factor (WAF), Read-Amplification-Factor (RAF) and Space-Amplification-Factor (SAF). It has multi-threaded compactions, making it especially suitable for storing multiple terabytes of data in a single database.

Start with example usage here: https://github.com/facebook/rocksdb/tree/master/examples

See the GitHub wiki for more explanation.

The public interface is in include/. Callers should not include or rely on the details of any other header files in this package. Those internal APIs may be changed without warning.

Design discussions are conducted in https://www.facebook.com/groups/rocksdb.dev/