Commit Graph

241 Commits

Author SHA1 Message Date
Levi Tamasi
3e1bf771a3 Make it possible to force the garbage collection of the oldest blob files (#8994)
Summary:
The current BlobDB garbage collection logic works by relocating the valid
blobs from the oldest blob files as they are encountered during compaction,
and cleaning up blob files once they contain nothing but garbage. However,
with sufficiently skewed workloads, it is theoretically possible to end up in a
situation when few or no compactions get scheduled for the SST files that contain
references to the oldest blob files, which can lead to increased space amp due
to the lack of GC.

In order to efficiently handle such workloads, the patch adds a new BlobDB
configuration option called `blob_garbage_collection_force_threshold`,
which signals to BlobDB to schedule targeted compactions for the SST files
that keep alive the oldest batch of blob files if the overall ratio of garbage in
the given blob files meets the threshold *and* all the given blob files are
eligible for GC based on `blob_garbage_collection_age_cutoff`. (For example,
if the new option is set to 0.9, targeted compactions will get scheduled if the
sum of garbage bytes meets or exceeds 90% of the sum of total bytes in the
oldest blob files, assuming all affected blob files are below the age-based cutoff.)
The net result of these targeted compactions is that the valid blobs in the oldest
blob files are relocated and the oldest blob files themselves cleaned up (since
*all* SST files that rely on them get compacted away).

These targeted compactions are similar to periodic compactions in the sense
that they force certain SST files that otherwise would not get picked up to undergo
compaction and also in the sense that instead of merging files from multiple levels,
they target a single file. (Note: such compactions might still include neighboring files
from the same level due to the need of having a "clean cut" boundary but they never
include any files from any other level.)

This functionality is currently only supported with the leveled compaction style
and is inactive by default (since the default value is set to 1.0, i.e. 100%).

Pull Request resolved: https://github.com/facebook/rocksdb/pull/8994

Test Plan: Ran `make check` and tested using `db_bench` and the stress/crash tests.

Reviewed By: riversand963

Differential Revision: D31489850

Pulled By: ltamasi

fbshipit-source-id: 44057d511726a0e2a03c5d9313d7511b3f0c4eab
2021-10-11 18:03:01 -07:00
Peter Dillinger
2a383f21f4 Add Bloom/Ribbon hybrid API support (#8679)
Summary:
This is essentially resurrection and fixing of the part of
https://github.com/facebook/rocksdb/issues/8198 that was reverted in https://github.com/facebook/rocksdb/issues/8212, using data added in https://github.com/facebook/rocksdb/issues/8246. Basically,
when configuring Ribbon filter, you can specify an LSM level before which
Bloom will be used instead of Ribbon. But Bloom is only considered for
Leveled and Universal compaction styles and file going into a known LSM
level. This way, SST file writer, FIFO compaction, etc. use Ribbon filter as
you would expect with NewRibbonFilterPolicy.

So that this can be controlled with a single int value and so that flushes
can be distinguished from intra-L0, we consider flush to go to level -1 for
the purposes of this option. (Explained in API comment.)

I also expect the most common and recommended Ribbon configuration to
use Bloom during flush, to minimize slowing down writes and because according
to my estimates, Ribbon only pays off if the structure lives in memory for
more than an hour. Thus, I have changed the default for NewRibbonFilterPolicy
to be this mild hybrid configuration. I don't really want to add something like
NewHybridFilterPolicy because at least the mild hybrid configuration (Bloom for
flush, Ribbon otherwise) should be considered a natural choice.

C APIs also updated, but because they don't support overloading,
rocksdb_filterpolicy_create_ribbon is kept pure ribbon for clarity and
rocksdb_filterpolicy_create_ribbon_hybrid must be called for a hybrid
configuration. While touching C API, I changed bits per key options from
int to double.

BuiltinFilterPolicy is needed so that LevelThresholdFilterPolicy doesn't inherit
unused fields from BloomFilterPolicy.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/8679

Test Plan: new + updated tests, including crash test

Reviewed By: jay-zhuang

Differential Revision: D30445797

Pulled By: pdillinger

fbshipit-source-id: 6f5aeddfd6d79f7e55493b563c2d1d2d568892e1
2021-08-20 18:00:16 -07:00
Baptiste Lemaire
e3a96c4823 Memtable sampling for mempurge heuristic. (#8628)
Summary:
Changes the API of the MemPurge process: the `bool experimental_allow_mempurge` and `experimental_mempurge_policy` flags have been replaced by a `double experimental_mempurge_threshold` option.
This change of API reflects another major change introduced in this PR: the MemPurgeDecider() function now works by sampling the memtables being flushed to estimate the overall amount of useful payload (payload minus the garbage), and then compare this useful payload estimate with the `double experimental_mempurge_threshold` value.
Therefore, when the value of this flag is `0.0` (default value), mempurge is simply deactivated. On the other hand, a value of `DBL_MAX` would be equivalent to always going through a mempurge regardless of the garbage ratio estimate.
At the moment, a `double experimental_mempurge_threshold` value else than 0.0 or `DBL_MAX` is opnly supported`with the `SkipList` memtable representation.
Regarding the sampling, this PR includes the introduction of a `MemTable::UniqueRandomSample` function that collects (approximately) random entries from the memtable by using the new `SkipList::Iterator::RandomSeek()` under the hood, or by iterating through each memtable entry, depending on the target sample size and the total number of entries.
The unit tests have been readapted to support this new API.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/8628

Reviewed By: pdillinger

Differential Revision: D30149315

Pulled By: bjlemaire

fbshipit-source-id: 1feef5390c95db6f4480ab4434716533d3947f27
2021-08-10 18:09:03 -07:00
Roy Crihfield
d4b75d295f Add more C bindings for OptimisticTransactionDB (#8526)
Summary:
* `rocksdb_optimistictransactiondb_checkpoint_object_create`
* `rocksdb_optimistictransactiondb_write`

Pull Request resolved: https://github.com/facebook/rocksdb/pull/8526

Reviewed By: ajkr

Differential Revision: D30076822

Pulled By: jay-zhuang

fbshipit-source-id: a59956a8d5449e75d39a8087fbb2bad148cf697d
2021-08-06 19:10:48 -07:00
qieqieplus
bb485e986a Add ribbon filter to C API (#8486)
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/8486

Reviewed By: jay-zhuang

Differential Revision: D29625501

Pulled By: pdillinger

fbshipit-source-id: e6e2a455ae62a71f3a202278a751b9bba17ad03c
2021-07-09 16:22:48 -07:00
Baptiste Lemaire
9dc887ece0 Memtable "MemPurge" prototype (#8454)
Summary:
Implement an experimental feature called "MemPurge", which consists in purging "garbage" bytes out of a memtable and reuse the memtable struct instead of making it immutable and eventually flushing its content to storage.
The prototype is by default deactivated and is not intended for use. It is intended for correctness and validation testing. At the moment, the "MemPurge" feature can be switched on by using the `options.experimental_allow_mempurge` flag. For this early stage, when the allow_mempurge flag is set to `true`, all the flush operations will be rerouted to perform a MemPurge. This is a temporary design decision that will give us the time to explore meaningful heuristics to use MemPurge at the right time for relevant workloads . Moreover, the current MemPurge operation only supports `Puts`, `Deletes`, `DeleteRange` operations, and handles `Iterators` as well as `CompactionFilter`s that are invoked at flush time .
Three unit tests are added to `db_flush_test.cc` to test if MemPurge works correctly (and checks that the previously mentioned operations are fully supported thoroughly tested).
One noticeable design decision is the timing of the MemPurge operation in the memtable workflow: for this prototype, the mempurge happens when the memtable is switched (and usually made immutable). This is an inefficient process because it implies that the entirety of the MemPurge operation happens while holding the db_mutex. Future commits will make the MemPurge operation a background task (akin to the regular flush operation) and aim at drastically enhancing the performance of this operation. The MemPurge is also not fully "WAL-compatible" yet, but when the WAL is full, or when the regular MemPurge operation fails (or when the purged memtable still needs to be flushed), a regular flush operation takes place. Later commits will also correct these behaviors.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/8454

Reviewed By: anand1976

Differential Revision: D29433971

Pulled By: bjlemaire

fbshipit-source-id: 6af48213554e35048a7e03816955100a80a26dc5
2021-07-02 05:23:02 -07:00
Stanislav Tkach
83d1a66598 Expose CompressionOptions::parallel_threads through C API (#8302)
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/8302

Reviewed By: jay-zhuang

Differential Revision: D28499262

Pulled By: ajkr

fbshipit-source-id: 7b17b79af871d874dfca76db9bca0d640a6cd854
2021-05-17 22:53:04 -07:00
Duarte Nunes
3949731de3 Add WAL flush API to C client (#8226)
Summary:
The C client is missing the`manual_wal_flush` option and the `flush_wal` API.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/8226

Reviewed By: ajkr

Differential Revision: D28000869

Pulled By: jay-zhuang

fbshipit-source-id: ed44937e7e7e75bc0dfa870a14147fbeef0c38f8
2021-04-27 14:56:23 -07:00
Sahir Hoda
13c655a887 New C API to expose NewCompactOnDeletionCollectorFactory (#8233)
Summary:
New C API rocksdb_options_add_compact_on_deletion_collector_factory to expose NewCompactOnDeletionCollectorFactory

Pull Request resolved: https://github.com/facebook/rocksdb/pull/8233

Reviewed By: mrambacher

Differential Revision: D28018381

Pulled By: anand1976

fbshipit-source-id: 674c9ed902c91ff0d9f09e7a60c5f37b907604c6
2021-04-27 10:14:04 -07:00
Sahir Hoda
d65d7d657d Expose JemallocNodumpAllocator to C API (#8178)
Summary:
Add new C APIs to create the JemallocNodumpAllocator and set it on a Cache object.

`make test` passes with and without `DISABLE_JEMALLOC=1`.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/8178

Reviewed By: jay-zhuang

Differential Revision: D27944631

Pulled By: ajkr

fbshipit-source-id: 2531729aa285a8985c58f22f093c4d53029c4a7b
2021-04-22 22:22:34 -07:00
mrambacher
4c41e51c07 Add Blob Options to C API (#8148)
Summary:
Added the Blob option settings from the AdvancedColmnFamilyOptions to the C API.

There are no tests for getting/setting options in the C API currently, hence no specific test plans.  Should there be a some?

Pull Request resolved: https://github.com/facebook/rocksdb/pull/8148

Reviewed By: ltamasi

Differential Revision: D27568495

Pulled By: mrambacher

fbshipit-source-id: 3a52b784467ea2c4bc58be5f75c5d41f0a5c55d6
2021-04-16 05:56:00 -07:00
Sahir Hoda
139778dfb3 Expose Cache::DisownData in C API (#8160)
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/8160

Reviewed By: riversand963

Differential Revision: D27672474

Pulled By: ajkr

fbshipit-source-id: fdbbc3398f0b1d4cef6b68636e5caf369c34b3a7
2021-04-09 10:39:11 -07:00
anand76
c5f52714fb Use malloc in rocksdb_transaction_get_snapshot (#8114)
Summary:
The snapshot structure returned by rocksdb_transaction_get_snapshot is
supposed to be freed by calling rocksdb_free(), so allocate using malloc
rather than new. Fixes https://github.com/facebook/rocksdb/issues/6112

Pull Request resolved: https://github.com/facebook/rocksdb/pull/8114

Reviewed By: akankshamahajan15

Differential Revision: D27362923

Pulled By: anand1976

fbshipit-source-id: e93a8b1ffe26dafbe22529907f72b796ae971214
2021-03-26 15:51:34 -07:00
storagezhang
d9be6556aa Include C++ standard library headers instead of C compatibility headers (#8068)
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/8068

Reviewed By: zhichao-cao

Differential Revision: D27147685

Pulled By: riversand963

fbshipit-source-id: 5428b1c0142ecae17c977fba31a6d49b52983d1c
2021-03-19 12:09:47 -07:00
storagezhang
c706324208 Add default in switch (#8065)
Summary:
switch may not cover all branch in `db/c.cc`:

```c++
void rocksdb_options_set_access_hint_on_compaction_start(
    rocksdb_options_t* opt, int v) {
  switch(v) {
    case 0:
      opt->rep.access_hint_on_compaction_start =
          ROCKSDB_NAMESPACE::Options::NONE;
      break;
    case 1:
      opt->rep.access_hint_on_compaction_start =
          ROCKSDB_NAMESPACE::Options::NORMAL;
      break;
    case 2:
      opt->rep.access_hint_on_compaction_start =
          ROCKSDB_NAMESPACE::Options::SEQUENTIAL;
      break;
    case 3:
      opt->rep.access_hint_on_compaction_start =
          ROCKSDB_NAMESPACE::Options::WILLNEED;
      break;
  }
}
```

Pull Request resolved: https://github.com/facebook/rocksdb/pull/8065

Reviewed By: riversand963

Differential Revision: D27102892

Pulled By: zhichao-cao

fbshipit-source-id: ad1d20d192712878e61597311ba75b55df0066d7
2021-03-19 11:57:52 -07:00
Andrew Kryczka
d904233d2f Limit buffering for collecting samples for compression dictionary (#7970)
Summary:
For dictionary compression, we need to collect some representative samples of the data to be compressed, which we use to either generate or train (when `CompressionOptions::zstd_max_train_bytes > 0`) a dictionary. Previously, the strategy was to buffer all the data blocks during flush, and up to the target file size during compaction. That strategy allowed us to randomly pick samples from as wide a range as possible that'd be guaranteed to land in a single output file.

However, some users try to make huge files in memory-constrained environments, where this strategy can cause OOM. This PR introduces an option, `CompressionOptions::max_dict_buffer_bytes`, that limits how much data blocks are buffered before we switch to unbuffered mode (which means creating the per-SST dictionary, writing out the buffered data, and compressing/writing new blocks as soon as they are built). It is not strict as we currently buffer more than just data blocks -- also keys are buffered. But it does make a step towards giving users predictable memory usage.

Related changes include:

- Changed sampling for dictionary compression to select unique data blocks when there is limited availability of data blocks
- Made use of `BlockBuilder::SwapAndReset()` to save an allocation+memcpy when buffering data blocks for building a dictionary
- Changed `ParseBoolean()` to accept an input containing characters after the boolean. This is necessary since, with this PR, a value for `CompressionOptions::enabled` is no longer necessarily the final component in the `CompressionOptions` string.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7970

Test Plan:
- updated `CompressionOptions` unit tests to verify limit is respected (to the extent expected in the current implementation) in various scenarios of flush/compaction to bottommost/non-bottommost level
- looked at jemalloc heap profiles right before and after switching to unbuffered mode during flush/compaction. Verified memory usage in buffering is proportional to the limit set.

Reviewed By: pdillinger

Differential Revision: D26467994

Pulled By: ajkr

fbshipit-source-id: 3da4ef9fba59974e4ef40e40c01611002c861465
2021-02-19 14:09:54 -08:00
Stanislav Tkach
3feee6db17 Add get/set deadline and io_timeout C functions (read options) (#7914)
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/7914

Reviewed By: jay-zhuang

Differential Revision: D26184409

Pulled By: ajkr

fbshipit-source-id: 8e30faac5223ec80c22e2b617af67775322065d8
2021-02-04 17:00:58 -08:00
Adam Retter
6e0f62f2b6 Add more tests to ASSERT_STATUS_CHECKED (3), API change (#7715)
Summary:
Third batch of adding more tests to ASSERT_STATUS_CHECKED.

* db_compaction_filter_test
* db_compaction_test
* db_dynamic_level_test
* db_inplace_update_test
* db_sst_test
* db_tailing_iter_test
* db_io_failure_test

Also update GetApproximateSizes APIs to all return Status.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7715

Reviewed By: jay-zhuang

Differential Revision: D25806896

Pulled By: pdillinger

fbshipit-source-id: 6cb9d62ba5a756c645812754c596ad3995d7c262
2021-01-06 14:15:02 -08:00
Yanqin Jin
394210f280 Remove unused includes (#7604)
Summary:
This is a PR generated **semi-automatically** by an internal tool to remove unused includes and `using` statements.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7604

Test Plan: make check

Reviewed By: ajkr

Differential Revision: D24579392

Pulled By: riversand963

fbshipit-source-id: c4bfa6c6b08da1de186690d37eb73d8fff45aecd
2020-10-28 23:22:27 -07:00
Stanislav Tkach
ed90e2a450 Add getters to the C API for env, universal compaction options and fifo compaction options (#7501)
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/7501

Reviewed By: ltamasi

Differential Revision: D24344109

Pulled By: pdillinger

fbshipit-source-id: d9a2b1b1cc8c8d8a96f13b8ae6814380caa10c96
2020-10-16 11:04:01 -07:00
Stanislav Tkach
1a83f5a8ac Expose BackupableDBOptions in the C API (#7550)
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/7550

Reviewed By: jay-zhuang

Differential Revision: D24315343

Pulled By: ajkr

fbshipit-source-id: fc7855b630a50c00dcb940241942295932732f39
2020-10-14 17:51:47 -07:00
Adam Retter
3ac07a12fe RocksJava - Add errorIfLogFileExists parameter to RocksDB.openReadOnly (#7046)
Summary:
Expose from C++ API to Java API.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7046

Reviewed By: riversand963

Differential Revision: D23726297

Pulled By: pdillinger

fbshipit-source-id: fc66bf626ce6fe9797e7d021ac849eacab91bf6d
2020-09-17 15:41:25 -07:00
Stanislav Tkach
5c39d8df69 Add getters to the C API for flush, write, cache and compact options (#7321)
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/7321

Reviewed By: ajkr

Differential Revision: D23590160

fbshipit-source-id: 35d106e732ac37f674222759cdb1dbb31e005ca7
2020-09-09 11:45:27 -07:00
Eduardo Barreto Alexandre
5b1ccdc191 Expose rocksdb_open_column_families_with_ttl C function (#7314)
Summary:
This PR creates `rocksdb_open_column_families_with_ttl` which allows C API users to open a DBWithTLL with column families.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7314

Reviewed By: cheng-chang

Differential Revision: D23430287

Pulled By: ajkr

fbshipit-source-id: 307aa21d170d1402653263a91f6f832ef76afba0
2020-09-03 14:39:58 -07:00
Stanislav Tkach
b288f0131b Add getters for the read options to the C API (#7289)
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/7289

Reviewed By: akankshamahajan15

Differential Revision: D23252520

Pulled By: ajkr

fbshipit-source-id: 85cea485a6dcaa1c67c32a83eb49a1b623966609
2020-08-20 16:36:19 -07:00
codingsh
50f206ad84 feat: export SetBackgroundThreads(n, Env::BOTTOM); (#7191)
Summary:
- https://github.com/rust-rocksdb/rust-rocksdb/pull/448

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7191

Reviewed By: riversand963

Differential Revision: D22809066

Pulled By: ajkr

fbshipit-source-id: 036939f9a28cacc3f677c318d1aed97fe5f4f85e
2020-07-29 12:24:13 -07:00
codingsh
83ea266b43 export stats_persist_period_sec (#7168)
Summary:
fixed
 - https://github.com/rust-rocksdb/rust-rocksdb/issues/447
 -  https://github.com/rust-rocksdb/rust-rocksdb/pull/448

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7168

Reviewed By: cheng-chang

Differential Revision: D22736013

Pulled By: ajkr

fbshipit-source-id: fdd784aa75d26a367b9108b05ffdd94a2ae117d3
2020-07-28 13:05:34 -07:00
Stanislav Tkach
393e486e3e Add getters for options to the C API (#7094)
Summary:
Along with https://github.com/facebook/rocksdb/issues/6925 and https://github.com/facebook/rocksdb/issues/6998, this should add getters for all Options fields except several ones with non-trivial interface (for example rocksdb_options_set_min_level_to_compress).

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7094

Reviewed By: riversand963

Differential Revision: D22479800

Pulled By: pdillinger

fbshipit-source-id: d14f305e12cfe268d07e0fe229d55cef299c792a
2020-07-10 14:30:04 -07:00
rafael-aero
712458fc34 Add RestoreDBFromLatestBackup to C API, add new C# package (#7092)
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/7092

Reviewed By: riversand963

Differential Revision: D22412323

Pulled By: ajkr

fbshipit-source-id: 3fc1c63bb19a8cd2c0ae620800c28f199a7f494b
2020-07-08 11:56:41 -07:00
Stanislav Tkach
1b85d57cf5 Expose KeyMayExist in the C API (#7021)
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/7021

Reviewed By: ajkr

Differential Revision: D22246297

Pulled By: pdillinger

fbshipit-source-id: 81dfd0a49e4d5ce0c9f00772c17cca425757ea24
2020-06-29 12:21:53 -07:00
Stanislav Tkach
70b5d95dc7 Add (more) getters for options to the C API (#6998)
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/6998

Reviewed By: ajkr

Differential Revision: D22211700

Pulled By: pdillinger

fbshipit-source-id: 1141c20527dee5e13205059bf8e83927063c4c1e
2020-06-25 13:53:33 -07:00
Stanislav Tkach
b7c825d5cf Add (some) getters for options to the C API (#6925)
Summary:
Additionally I have extended the incomplete test added in the https://github.com/facebook/rocksdb/issues/6880.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6925

Reviewed By: ajkr

Differential Revision: D21869788

Pulled By: pdillinger

fbshipit-source-id: e9db80f259c57ca1bdcbc2c66cb938cb1ac26e48
2020-06-03 17:08:50 -07:00
Anatoly Zhmur
22e5c513c2 Add zstd_max_train_bytes to c interop (#6796)
Summary:
Added setting of zstd_max_train_bytes compression option parameter to c interop.

rocksdb_options_set_bottommost_compression_options was using bool parameter and thus not exported, updated it to unsigned char and added to c.h as well.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6796

Reviewed By: cheng-chang

Differential Revision: D21611471

Pulled By: ajkr

fbshipit-source-id: caaaf153de934837ad9af283c7f8c025ff0b0cf5
2020-06-03 12:27:12 -07:00
Stanislav Tkach
38f988d3b4 Expose rocksdb_options_copy function to the C API (#6880)
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/6880

Reviewed By: ajkr

Differential Revision: D21842752

Pulled By: pdillinger

fbshipit-source-id: eda326f551ddd9cb397681544b9e9799ea614e52
2020-06-02 13:48:51 -07:00
Stanislav Tkach
70aaa9ceeb Expose CancellAllBackgroundWork to C api (#6832)
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/6832

Reviewed By: cheng-chang

Differential Revision: D21498186

Pulled By: riversand963

fbshipit-source-id: 66bb0d7c06af2bf0df3c6a09b61bca2fb81f2dd3
2020-05-12 14:50:52 -07:00
Kirill Abrosimov
3ff603171d added new functions to c-api (#5630)
Summary:
Few functions from options added to C-api
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5630

Reviewed By: anand1976

Differential Revision: D20896731

Pulled By: ltamasi

fbshipit-source-id: e4215a58b3c2429ec44e3f0d0381cbf86700fb14
2020-04-07 14:45:39 -07:00
Zaiyang Li
fcec56e86c Add function to set row cache on rocksdb_options_t (#6442)
Summary:
Adding a C API function to set `row_cache` on `rocksdb_options_t` as this functionality is missing.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6442

Differential Revision: D20036813

Pulled By: riversand963

fbshipit-source-id: c1fa95ea343345fbc1e57961d0d048e0e79be373
2020-02-21 11:12:42 -08:00
sdong
fdf882ded2 Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433)
Summary:
When dynamically linking two binaries together, different builds of RocksDB from two sources might cause errors. To provide a tool for user to solve the problem, the RocksDB namespace is changed to a flag which can be overridden in build time.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6433

Test Plan: Build release, all and jtest. Try to build with ROCKSDB_NAMESPACE with another flag.

Differential Revision: D19977691

fbshipit-source-id: aa7f2d0972e1c31d75339ac48478f34f6cfcfb3e
2020-02-20 12:09:57 -08:00
Mike Kolupaev
637e64b9ac Add an option to prevent DB::Open() from querying sizes of all sst files (#6353)
Summary:
When paranoid_checks is on, DBImpl::CheckConsistency() iterates over all sst files and calls Env::GetFileSize() for each of them. As far as I could understand, this is pretty arbitrary and doesn't affect correctness - if filesystem doesn't corrupt fsynced files, the file sizes will always match; if it does, it may as well corrupt contents as well as sizes, and rocksdb doesn't check contents on open.

If there are thousands of sst files, getting all their sizes takes a while. If, on top of that, Env is overridden to use some remote storage instead of local filesystem, it can be *really* slow and overload the remote storage service. This PR adds an option to not do GetFileSize(); instead it does GetChildren() for parent directory to check that all the expected sst files are at least present, but doesn't check their sizes.

We can't just disable paranoid_checks instead because paranoid_checks do a few other important things: make the DB read-only on write errors, print error messages on read errors, etc.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6353

Test Plan: ran the added sanity check unit test. Will try it out in a LogDevice test cluster where the GetFileSize() calls are causing a lot of trouble.

Differential Revision: D19656425

Pulled By: al13n321

fbshipit-source-id: c2c421b367633033760d1f56747bad206d1fbf82
2020-02-04 01:27:26 -08:00
Yanqin Jin
f2fbc5d668 Shorten certain test names to avoid infra failure (#6352)
Summary:
Unit test names, together with other components,  are used to create log files
during some internal testing. Overly long names cause infra failure due to file
names being too long.

Look for internal tests.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6352

Differential Revision: D19649307

Pulled By: riversand963

fbshipit-source-id: 6f29de096e33c0eaa87d9c8702f810eda50059e7
2020-01-30 23:10:24 -08:00
Matt Bell
7e5b04d04f Expose atomic flush option in C API (#6307)
Summary:
This PR adds a `rocksdb_options_set_atomic_flush` function to the C API.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6307

Differential Revision: D19451313

Pulled By: ltamasi

fbshipit-source-id: 750495642ef55b1ea7e13477f85c38cd6574849c
2020-01-17 12:57:48 -08:00
Qinfan Wu
6733be033e More const pointers in C API (#6283)
Summary:
This makes it easier to call the functions from Rust as otherwise they require mutable types.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6283

Differential Revision: D19349991

Pulled By: wqfish

fbshipit-source-id: e8da7a75efe8cd97757baef8ca844a054f2519b4
2020-01-10 19:27:09 -08:00
Yanqin Jin
bce5189f4d Fix error message (#6264)
Summary:
Fix an error message when CURRENT is not found.

Test plan (dev server)
```
make check
```
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6264

Differential Revision: D19300699

Pulled By: riversand963

fbshipit-source-id: 303fa206386a125960ecca1dbdeff07422690caf
2020-01-07 12:32:20 -08:00
Qinfan Wu
edaaa1fff2 Add range delete function to C-API (#6259)
Summary:
It seems that the C-API doesn't expose the range delete functionality at the moment, so add the API.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6259

Differential Revision: D19290320

Pulled By: pdillinger

fbshipit-source-id: 3f403a4c3446d2042d55f1ece7cdc9c040f40c27
2020-01-06 10:46:21 -08:00
sdong
a68dff5c35 Apply formatter to some recent commits (#6138)
Summary:
Formatter somehow complains some recent lines changed. Apply them to make the formatter happy.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6138

Test Plan: See CI passes.

Differential Revision: D18895950

fbshipit-source-id: 7d1696cf3e3a682bc10a30cdca748a23c6565255
2019-12-09 15:49:49 -08:00
Peter Dillinger
e43d2c4424 Fix & test rocksdb_filterpolicy_create_bloom_full (#6132)
Summary:
Add overrides needed in FilterPolicy wrapper to fix
rocksdb_filterpolicy_create_bloom_full (see issue https://github.com/facebook/rocksdb/issues/6129). Re-enabled
assertion in BloomFilterPolicy::CreateFilter that was being violated.
Expanded c_test to identify Bloom filter implementations by FP counts.
(Without the fix, updated test will trigger assertion and fail otherwise
without the assertion.)

Fixes https://github.com/facebook/rocksdb/issues/6129
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6132

Test Plan: updated c_test, also run under valgrind.

Differential Revision: D18864911

Pulled By: pdillinger

fbshipit-source-id: 08e81d7b5368b08e501cd402ef5583f2650c19fa
2019-12-09 12:21:14 -08:00
David Palm
048472f620 Add missing DataBlock-releated functions to the C-API (#6101)
Summary:
Adds two missing functions to the C-API:

- `rocksdb_block_based_options_set_data_block_index_type`
- `rocksdb_block_based_options_set_data_block_hash_ratio`

This enables users in other languages to enjoy the new(-ish) feature.

The changes here are partially overlapping with [another PR](https://github.com/facebook/rocksdb/pull/5630) but are more focused on the DataBlock indexing options.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6101

Differential Revision: D18765639

fbshipit-source-id: 4a8947e71b179f26fa1eb83c267dd47ee64ac3b3
2019-12-02 11:00:09 -08:00
Maysam Yabandeh
6ec6a4a9a4 Remove snap_refresh_nanos option (#5826)
Summary:
The snap_refresh_nanos option didn't bring much benefit. Remove the feature to simplify the code.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5826

Differential Revision: D17467147

Pulled By: maysamyabandeh

fbshipit-source-id: 4f950b046990d0d1292d7fc04c2ccafaf751c7f0
2019-09-18 20:26:04 -07:00
Lingjing You
1a928c22a0 Add insert hints for each writebatch (#5728)
Summary:
Add insert hints for each writebatch so that they can be used in concurrent write, and add write option to enable it.

Bench result (qps):

`./db_bench --benchmarks=fillseq -allow_concurrent_memtable_write=true -num=4000000 -batch-size=1 -threads=1 -db=/data3/ylj/tmp -write_buffer_size=536870912 -num_column_families=4`

master:

| batch size \ thread num | 1       | 2       | 4       | 8       |
| ----------------------- | ------- | ------- | ------- | ------- |
| 1                       | 387883  | 220790  | 308294  | 490998  |
| 10                      | 1397208 | 978911  | 1275684 | 1733395 |
| 100                     | 2045414 | 1589927 | 1798782 | 2681039 |
| 1000                    | 2228038 | 1698252 | 1839877 | 2863490 |

fillseq with writebatch hint:

| batch size \ thread num | 1       | 2       | 4       | 8       |
| ----------------------- | ------- | ------- | ------- | ------- |
| 1                       | 286005  | 223570  | 300024  | 466981  |
| 10                      | 970374  | 813308  | 1399299 | 1753588 |
| 100                     | 1962768 | 1983023 | 2676577 | 3086426 |
| 1000                    | 2195853 | 2676782 | 3231048 | 3638143 |
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5728

Differential Revision: D17297240

fbshipit-source-id: b053590a6d77871f1ef2f911a7bd013b3899b26c
2019-09-12 17:15:18 -07:00
Zhongyi Xie
2f41ecfe75 Refactor trimming logic for immutable memtables (#5022)
Summary:
MyRocks currently sets `max_write_buffer_number_to_maintain` in order to maintain enough history for transaction conflict checking. The effectiveness of this approach depends on the size of memtables. When memtables are small, it may not keep enough history; when memtables are large, this may consume too much memory.
We are proposing a new way to configure memtable list history: by limiting the memory usage of immutable memtables. The new option is `max_write_buffer_size_to_maintain` and it will take precedence over the old `max_write_buffer_number_to_maintain` if they are both set to non-zero values. The new option accounts for the total memory usage of flushed immutable memtables and mutable memtable. When the total usage exceeds the limit, RocksDB may start dropping immutable memtables (which is also called trimming history), starting from the oldest one.
The semantics of the old option actually works both as an upper bound and lower bound. History trimming will start if number of immutable memtables exceeds the limit, but it will never go below (limit-1) due to history trimming.
In order the mimic the behavior with the new option, history trimming will stop if dropping the next immutable memtable causes the total memory usage go below the size limit. For example, assuming the size limit is set to 64MB, and there are 3 immutable memtables with sizes of 20, 30, 30. Although the total memory usage is 80MB > 64MB, dropping the oldest memtable will reduce the memory usage to 60MB < 64MB, so in this case no memtable will be dropped.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5022

Differential Revision: D14394062

Pulled By: miasantreble

fbshipit-source-id: 60457a509c6af89d0993f988c9b5c2aa9e45f5c5
2019-08-23 13:55:34 -07:00