Levi Tamasi 0cc0543893 Mitigate the overhead of building the hash of file locations (#9504)
Summary:
The patch builds on the refactoring done in https://github.com/facebook/rocksdb/issues/9494
and improves the performance of building the hash of file
locations in `VersionStorageInfo` in two ways. First, the hash
building is moved from `AddFile` (which is called under the DB mutex)
to a separate post-processing step done as part of `PrepareForVersionAppend`
(during which the mutex is *not* held). Second, the space necessary
for the hash is preallocated to prevent costly reallocation/rehashing
operations. These changes mitigate the overhead of the file location hash,
which can be significant with certain workloads where the baseline CPU usage
is low (see https://github.com/facebook/rocksdb/issues/9351,
which is a workload where keys are sorted, WAL is turned
off, the vector memtable implementation is used, and there are lots of small
SST files).

Fixes https://github.com/facebook/rocksdb/issues/9351

Pull Request resolved: https://github.com/facebook/rocksdb/pull/9504

Test Plan:
`make check`

```
numactl --interleave=all ./db_bench --benchmarks=fillseq --allow_concurrent_memtable_write=false --level0_file_num_compaction_trigger=4 --level0_slowdown_writes_trigger=20 --level0_stop_writes_trigger=30 --max_background_jobs=8 --max_write_buffer_number=8 --db=/data/ltamasi-dbbench --wal_dir=/data/ltamasi-dbbench --num=800000000 --num_levels=8 --key_size=20 --value_size=400 --block_size=8192 --cache_size=51539607552 --cache_numshardbits=6 --compression_max_dict_bytes=0 --compression_ratio=0.5 --compression_type=lz4 --bytes_per_sync=8388608 --cache_index_and_filter_blocks=1 --cache_high_pri_pool_ratio=0.5 --benchmark_write_rate_limit=0 --write_buffer_size=16777216 --target_file_size_base=16777216 --max_bytes_for_level_base=67108864 --verify_checksum=1 --delete_obsolete_files_period_micros=62914560 --max_bytes_for_level_multiplier=8 --statistics=0 --stats_per_interval=1 --stats_interval_seconds=20 --histogram=1 --bloom_bits=10 --open_files=-1 --subcompactions=1 --compaction_style=0 --level_compaction_dynamic_level_bytes=true --pin_l0_filter_and_index_blocks_in_cache=1 --soft_pending_compaction_bytes_limit=167503724544 --hard_pending_compaction_bytes_limit=335007449088 --min_level_to_compress=0 --use_existing_db=0 --sync=0 --threads=1 --memtablerep=vector --disable_wal=1 --seed=<some_seed>
```

Final statistics before this patch:
```
Cumulative writes: 0 writes, 697M keys, 0 commit groups, 0.0 writes per commit group, ingest: 283.25 GB, 241.08 MB/s
Interval writes: 0 writes, 1264K keys, 0 commit groups, 0.0 writes per commit group, ingest: 525.69 MB, 176.67 MB/s
```

With the patch:
```
Cumulative writes: 0 writes, 759M keys, 0 commit groups, 0.0 writes per commit group, ingest: 308.57 GB, 262.63 MB/s
Interval writes: 0 writes, 1555K keys, 0 commit groups, 0.0 writes per commit group, ingest: 646.61 MB, 215.11 MB/s
```

Reviewed By: riversand963

Differential Revision: D34014734

Pulled By: ltamasi

fbshipit-source-id: acb2703677451d5ccaa7e9d950844b33d240695b
2022-02-07 09:17:14 -08:00
2022-02-04 17:13:10 -08:00
2022-02-04 17:13:10 -08:00
2022-02-04 17:13:10 -08:00
2022-02-04 17:13:10 -08:00
2022-01-12 09:31:12 -08:00
2022-02-04 17:13:10 -08:00
2022-02-04 17:13:10 -08:00
2022-02-04 17:13:10 -08:00
2017-10-18 14:42:10 -07:00
2022-02-04 17:13:10 -08:00
2019-08-29 23:21:01 -07:00
2017-12-05 18:42:35 -08:00
2017-04-27 18:06:12 -07:00
2022-02-04 17:13:10 -08:00
2022-02-04 17:13:10 -08:00
2017-07-15 16:11:23 -07:00
2022-02-04 17:13:10 -08:00
2019-06-24 17:40:32 -07:00
2021-10-01 16:10:35 -07:00

RocksDB: A Persistent Key-Value Store for Flash and RAM Storage

CircleCI Status TravisCI Status Appveyor Build status PPC64le Build Status

RocksDB is developed and maintained by Facebook Database Engineering Team. It is built on earlier work on LevelDB by Sanjay Ghemawat (sanjay@google.com) and Jeff Dean (jeff@google.com)

This code is a library that forms the core building block for a fast key-value server, especially suited for storing data on flash drives. It has a Log-Structured-Merge-Database (LSM) design with flexible tradeoffs between Write-Amplification-Factor (WAF), Read-Amplification-Factor (RAF) and Space-Amplification-Factor (SAF). It has multi-threaded compactions, making it especially suitable for storing multiple terabytes of data in a single database.

Start with example usage here: https://github.com/facebook/rocksdb/tree/main/examples

See the github wiki for more explanation.

The public interface is in include/. Callers should not include or rely on the details of any other header files in this package. Those internal APIs may be changed without warning.

Questions and discussions are welcome on the RocksDB Developers Public Facebook group and email list on Google Groups.

License

RocksDB is dual-licensed under both the GPLv2 (found in the COPYING file in the root directory) and Apache 2.0 License (found in the LICENSE.Apache file in the root directory). You may select, at your option, one of the above-listed licenses.

Description
A library that provides an embeddable, persistent key-value store for fast storage.
Readme 271 MiB
Languages
C++ 82.1%
Java 10.3%
C 2.5%
Python 1.7%
Perl 1.1%
Other 2.1%