rocksdb

Andrew Kryczka 6c40806e51 Digest ZSTD compression dictionary once per SST file (#4251 )

Summary:
In RocksDB, for a given SST file, all data blocks are compressed with the same dictionary. When we compress a block using the dictionary's raw bytes, the compression library first has to digest the dictionary to get it into a usable form. This digestion work is redundant and ideally should be done once per file.

ZSTD offers APIs for the caller to create and reuse a digested dictionary object (`ZSTD_CDict`). In this PR, we call `ZSTD_createCDict` once per file to digest the raw bytes. Then we use `ZSTD_compress_usingCDict` to compress each data block using the pre-digested dictionary. Once the file is created, `ZSTD_freeCDict` releases the resources held by the digested dictionary.

There are a couple of other changes included in this PR:

- Changed the parameter object for (un)compression functions from `CompressionContext`/`UncompressionContext` to `CompressionInfo`/`UncompressionInfo`. This avoids the previous pattern, where `CompressionContext`/`UncompressionContext` had to be mutated before calling a (un)compression function depending on whether a dictionary should be used. I felt that mutation was error-prone, so I eliminated it.
- Added support for digested uncompression dictionaries (`ZSTD_DDict`) as well. However, this PR does not support reusing them across uncompression calls for the same file. That work is deferred to a later PR, in which we will store the `ZSTD_DDict` objects in the block cache.
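The first change in the list above, replacing a mutated context with a per-call parameter object, can be illustrated with a small sketch. The types `SketchDict` and `SketchCompressionInfo` and the function `Describe` are hypothetical stand-ins invented for this example; they are not the real `CompressionInfo`/`UncompressionInfo` classes, whose members are not shown here.

```cpp
#include <string>

// Hypothetical dictionary type; an empty `raw` means "no dictionary".
struct SketchDict {
  std::string raw;
  static const SketchDict& Empty() {
    static const SketchDict empty{};
    return empty;
  }
};

// Hypothetical immutable parameter object: everything one call needs is
// fixed at construction, so nothing has to be toggled before the call.
class SketchCompressionInfo {
 public:
  SketchCompressionInfo(int level, const SketchDict& dict)
      : level_(level), dict_(dict) {}
  int level() const { return level_; }
  const SketchDict& dict() const { return dict_; }

 private:
  const int level_;
  const SketchDict& dict_;
};

// Stand-in for a compression function: it only reads the info object.
std::string Describe(const SketchCompressionInfo& info) {
  return "level=" + std::to_string(info.level()) +
         (info.dict().raw.empty() ? " no-dict" : " dict");
}
```

With this shape, a caller that wants a dictionary for one call and none for the next builds two small info objects instead of flipping state on a shared context, so a forgotten reset cannot leak a dictionary into the wrong call.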
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4251

Differential Revision: D9257078

Pulled By: ajkr

fbshipit-source-id: 21b8cb6bbdd48e459f1c62343780ab66c0a64438

2018-08-23 19:28:18 -07:00

buckifier

Remove two CI tests (#4110 )

2018-07-12 11:43:25 -07:00

build_tools

Release 5.16 (#4298 )

2018-08-21 14:43:08 -07:00

cache

Support group commits of version edits (#3944 )

2018-06-28 12:34:39 -07:00

cmake

Search paths provided by intel's "tbbvars.sh".

2018-05-07 14:28:36 -07:00

coverage

Remove unused imports, from python scripts. (#4057 )

2018-06-26 12:43:04 -07:00

db

Invoke OnTableFileCreated for empty SSTs (#4307 )

2018-08-23 18:27:30 -07:00

docs

Advisor: README and blog, and also tests for DBBenchRunner, DatabaseOptions (#4201 )

2018-08-01 16:13:09 -07:00

env

Fix the build failure with OS_ANDROID (#4232 )

2018-08-08 08:12:02 -07:00

examples

Pin top-level index on partitioned index/filter blocks (#4037 )

2018-06-22 15:27:46 -07:00

hdfs

Comment out unused variables

2018-03-05 13:13:41 -08:00

include/rocksdb

Adding a method for memtable class for memtable getting flushed. (#4304 )

2018-08-23 17:14:25 -07:00

java

Add CompactRangeOptions for Java (#4220 )

2018-08-17 10:57:25 -07:00

memtable

Suppress clang analyzer error (#4299 )

2018-08-21 16:43:05 -07:00

monitoring

Support group commits of version edits (#3944 )

2018-06-28 12:34:39 -07:00

options

Add path to WritableFileWriter. (#4039 )

2018-08-23 10:12:58 -07:00

port

Add path to WritableFileWriter. (#4039 )

2018-08-23 10:12:58 -07:00

table

Digest ZSTD compression dictionary once per SST file (#4251 )

2018-08-23 19:28:18 -07:00

third-party

Add GCC 8 to Travis (#3433 )

2018-07-13 10:58:06 -07:00

tools

Digest ZSTD compression dictionary once per SST file (#4251 )

2018-08-23 19:28:18 -07:00

util

Digest ZSTD compression dictionary once per SST file (#4251 )

2018-08-23 19:28:18 -07:00

utilities

Digest ZSTD compression dictionary once per SST file (#4251 )

2018-08-23 19:28:18 -07:00

.clang-format

A script that automatically reformat affected lines

2014-01-14 12:21:24 -08:00

.gitignore

RocksDB Trace Analyzer (#4091 )

2018-08-13 11:44:02 -07:00

.lgtm.yml

Create lgtm.yml for LGTM.com C/C++ analysis (#4058 )

2018-06-26 12:43:04 -07:00

.travis.yml

Add GCC 8 to Travis (#3433 )

2018-07-13 10:58:06 -07:00

appveyor.yml

Upgrade Appveyor to VS2017

2018-02-01 13:57:01 -08:00

AUTHORS

Update RocksDB Authors File

2017-10-18 14:42:10 -07:00

CMakeLists.txt

Improve point-lookup performance using a data block hash index (#4174 )

2018-08-15 14:30:03 -07:00

CODE_OF_CONDUCT.md

Add Code of Conduct

2017-12-05 18:42:35 -08:00

CONTRIBUTING.md

Add Code of Conduct

2017-12-05 18:42:35 -08:00

COPYING

Add GPLv2 as an alternative license.

2017-04-27 18:06:12 -07:00

DEFAULT_OPTIONS_HISTORY.md

options.delayed_write_rate use the rate of rate_limiter by default.

2017-05-24 09:58:24 -07:00

DUMP_FORMAT.md

First version of rocksdb_dump and rocksdb_undump.

2015-06-19 16:24:36 -07:00

HISTORY.md

Invoke OnTableFileCreated for empty SSTs (#4307 )

2018-08-23 18:27:30 -07:00

INSTALL.md

Enable compilation on OpenBSD

2018-03-19 12:30:05 -07:00

issue_template.md

Add a template for issues

2017-09-29 11:41:28 -07:00

LANGUAGE-BINDINGS.md

Added PingCaps Rust RocksDB and ObjectiveRocks (#4065 )

2018-06-27 15:43:21 -07:00

LICENSE.Apache

Change RocksDB License

2017-07-15 16:11:23 -07:00

LICENSE.leveldb

Add back the LevelDB license file

2017-07-16 18:42:18 -07:00

Makefile

Adjusted the Makefile of trace_analyzer to isolate the Gflags from other (#4290 )

2018-08-21 10:47:24 -07:00

README.md

Create lgtm.yml for LGTM.com C/C++ analysis (#4058 )

2018-06-26 12:43:04 -07:00

ROCKSDB_LITE.md

Fix some typos in comments and docs.

2018-03-08 10:27:25 -08:00

src.mk

Adjusted the Makefile of trace_analyzer to isolate the Gflags from other (#4290 )

2018-08-21 10:47:24 -07:00

TARGETS

Improve point-lookup performance using a data block hash index (#4174 )

2018-08-15 14:30:03 -07:00

thirdparty.inc

Provide a way to override windows memory allocator with jemalloc for ZSTD

2018-06-04 12:12:48 -07:00

USERS.md

Support range deletion tombstones in IngestExternalFile SSTs (#3778 )

2018-07-13 22:43:09 -07:00

Vagrantfile

Adding CentOS 7 Vagrantfile & build script

2018-02-26 15:27:17 -08:00

WINDOWS_PORT.md

Add GCC 8 to Travis (#3433 )

2018-07-13 10:58:06 -07:00

README.md

RocksDB: A Persistent Key-Value Store for Flash and RAM Storage

RocksDB is developed and maintained by the Facebook Database Engineering Team. It is built on earlier work on LevelDB by Sanjay Ghemawat (sanjay@google.com) and Jeff Dean (jeff@google.com).

This code is a library that forms the core building block for a fast key-value server, especially suited for storing data on flash drives. It has a Log-Structured-Merge-Database (LSM) design with flexible tradeoffs between Write-Amplification-Factor (WAF), Read-Amplification-Factor (RAF) and Space-Amplification-Factor (SAF). It has multi-threaded compactions, making it especially suitable for storing multiple terabytes of data in a single database.

Start with example usage here: https://github.com/facebook/rocksdb/tree/master/examples

See the GitHub wiki for more explanation.

The public interface is in include/. Callers should not include or rely on the details of any other header files in this package. Those internal APIs may be changed without warning.

Design discussions are conducted in https://www.facebook.com/groups/rocksdb.dev/

License

RocksDB is dual-licensed under both the GPLv2 (found in the COPYING file in the root directory) and Apache 2.0 License (found in the LICENSE.Apache file in the root directory). You may select, at your option, one of the above-listed licenses.

Languages

C++ 82.1%

Java 10.3%

C 2.5%

Python 1.7%

Perl 1.1%

Other 2.1%