Compare commits

35 Commits

Author SHA1 Message Date
przemyslaw.skibinski@percona.com
5d5e8110ff Fix GitHub issue #3716: gcc-8 warnings
Summary:
Fix the following gcc-8 warnings:
- conflicting C language linkage declaration [-Werror]
- writing to an object with no trivial copy-assignment [-Werror=class-memaccess]
- array subscript -1 is below array bounds [-Werror=array-bounds]

Solves https://github.com/facebook/rocksdb/issues/3716
Closes https://github.com/facebook/rocksdb/pull/3736

Differential Revision: D7684161

Pulled By: yiwu-arbug

fbshipit-source-id: 47c0423d26b74add251f1d3595211eee1e41e54a
2019-10-30 13:30:56 -07:00
Vijay Nadimpalli
448d0096fe Making platform 007 (gcc 7) default in build_detect_platform.sh (#5947)
Summary:
Making platform 007 (gcc 7) default in build_detect_platform.sh.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5947

Differential Revision: D18038837

Pulled By: vjnadimpalli

fbshipit-source-id: 9ac2ddaa93bf328a416faec028970e039886378e
2019-10-30 13:30:56 -07:00
Andrew Kryczka
6da4b446be Add latest toolchain (gcc-8, etc.) build support for fbcode users (#4923)
Summary:
- When building with internal dependencies, specify this toolchain by setting `ROCKSDB_FBCODE_BUILD_WITH_PLATFORM007=1`
- It is not enabled by default. However, it is enabled for TSAN builds in CI since there is a known problem with TSAN in gcc-5: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71090
- I did not add support for Lua since (1) we agreed to deprecate it, and (2) we only have an internal build for v5.3 with this toolchain while that has breaking changes compared to our current version (v5.2).
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4923

Differential Revision: D13827226

Pulled By: ajkr

fbshipit-source-id: 9aa3388ed3679777cfb15ef8cbcb83c07f62f947
2019-10-30 13:30:56 -07:00
Andrew Kryczka
808e429bd6 update history and bump version to 5.10.5 2018-03-16 14:22:42 -07:00
Andrew Kryczka
8b99935452 Fix WAL corruption from checkpoint/backup race condition
Summary:
`Writer::WriteBuffer` was always called at the beginning of checkpoint/backup. But that log writer has no internal synchronization, which meant the same buffer could be flushed twice in a race condition case, causing a WAL entry to be duplicated. Then subsequent WAL entries would be at unexpected offsets, causing the 32KB block boundaries to be overlapped and manifesting as a corruption.

This PR fixes the behavior to only use `WriteBuffer` (via `FlushWAL`) in checkpoint/backup when manual WAL flush is enabled. In that case, users are responsible for providing synchronization between WAL flushes. We can also consider removing the call entirely.
Closes https://github.com/facebook/rocksdb/pull/3603

Differential Revision: D7277447

Pulled By: ajkr

fbshipit-source-id: 1b15bd7fd930511222b075418c10de0aaa70a35a
2018-03-16 14:22:05 -07:00
sdong
492ab7c7d9 Update HISTORY.md 2018-03-01 11:16:05 -08:00
Adam Retter
0d4fb0ef1e Fixes the build on Windows
Summary:
As discovered during v5.9.2 release, and forward-ported.
Closes https://github.com/facebook/rocksdb/pull/3323

Differential Revision: D6657209

Pulled By: siying

fbshipit-source-id: b560d9f8ddb89e0ffaff7c895ec80f68ddf7dab4
2018-02-26 16:10:58 -08:00
Andrew Kryczka
5b0481ef49 update history and bump version 2018-02-22 13:58:31 -08:00
Andrew Kryczka
2f078d57e7 BackupEngine gluster-friendly file naming convention
Summary:
Use the rsync tempfile naming convention in our `BackupEngine`. The temp file follows the format, `.<filename>.<suffix>`, which is later renamed to `<filename>`. We fix `tmp` as the `<suffix>` as we don't need to use random bytes for now. The benefit is gluster treats this tempfile naming convention specially and applies hashing only to `<filename>`, so the file won't need to be linked or moved when it's renamed. Our gluster team suggested this will make things operationally easier.
Closes https://github.com/facebook/rocksdb/pull/3463
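
A minimal sketch of the naming rule described in this summary, assuming a plain `/`-separated path. `TempFileName` is a hypothetical helper written for illustration and is not taken from `BackupEngine`; only the `.<filename>.<suffix>` format and the fixed `tmp` suffix come from the commit message.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not RocksDB code): builds the rsync-style temp name
// ".<filename>.<suffix>" for the last component of a '/'-separated path.
std::string TempFileName(const std::string& path,
                         const std::string& suffix = "tmp") {
  assert(!path.empty() && path.back() != '/');
  const std::string::size_type slash = path.rfind('/');
  // Directory part (including the trailing '/'), or empty if there is none.
  const std::string dir =
      (slash == std::string::npos) ? "" : path.substr(0, slash + 1);
  const std::string file =
      (slash == std::string::npos) ? path : path.substr(slash + 1);
  return dir + "." + file + "." + suffix;
}
```

Once the copy completes, the temp file would be renamed to the plain `<filename>`, which is the step the summary says gluster handles without relinking.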

Differential Revision: D6893333

Pulled By: ajkr

fbshipit-source-id: fd7622978f4b2487fce33cde40dd3124f16bcaa8
2018-02-22 13:53:42 -08:00
sdong
ac329cce6e Update HISTORY.md 2018-02-21 15:12:10 -08:00
Sagar Vemuri
3300ca0f39 Add rocksdb.iterator.internal-key property
Summary:
Added a new iterator property: `rocksdb.iterator.internal-key` to get the internal-key (converted to user key) at which the iterator stopped.
Closes https://github.com/facebook/rocksdb/pull/3525

Differential Revision: D7033694

Pulled By: sagar0

fbshipit-source-id: d51e6c00f5e9d766c6276ef79774b81c6c5216f8
2018-02-21 10:20:59 -08:00
Sagar Vemuri
aa3b8bb460 Bump version to 5.10.3 2018-02-16 14:18:14 -08:00
Mike Kolupaev
0493719649 Fix deadlock in ColumnFamilyData::InstallSuperVersion()
Summary:
Deadlock: a memtable flush holds DB::mutex_ and calls ThreadLocalPtr::Scrape(), which locks ThreadLocalPtr mutex; meanwhile, a thread exit handler locks ThreadLocalPtr mutex and calls SuperVersionUnrefHandle, which tries to lock DB::mutex_.

This deadlock is hit all the time on our workload. It blocks our release.

In general, the problem is that ThreadLocalPtr takes an arbitrary callback and calls it while holding a lock on a global mutex. The same global mutex is (at least in some cases) locked by almost all ThreadLocalPtr methods, on any instance of ThreadLocalPtr. So, there'll be a deadlock if the callback tries to do anything to any instance of ThreadLocalPtr, or waits for another thread to do so.

So, probably the only safe way to use ThreadLocalPtr callbacks is to do only simple and lock-free things in them.

This PR fixes the deadlock by making sure that local_sv_ never holds the last reference to a SuperVersion, and therefore SuperVersionUnrefHandle never has to do any nontrivial cleanup.

I also searched for other uses of ThreadLocalPtr to see if they may have similar bugs. There's only one other use, in transaction_lock_mgr.cc, and it looks fine.
Closes https://github.com/facebook/rocksdb/pull/3510

Reviewed By: sagar0

Differential Revision: D7005346

Pulled By: al13n321

fbshipit-source-id: 37575591b84f07a891d6659e87e784660fde815f
2018-02-16 11:34:46 -08:00
Siying Dong
7ad79400af Direct I/O writable file should do fsync in Close()
Summary:
We don't call fsync() after truncating a direct I/O writable file (in fact, we never call fsync() at all). This can leave file metadata unpersisted on disk after the file is generated. Call fsync() in Close() instead.
Closes https://github.com/facebook/rocksdb/pull/3500

Differential Revision: D6981482

Pulled By: siying

fbshipit-source-id: 7e2b591b7e5dd1b96fc0775515b8b9e6092980ef
2018-02-13 16:52:07 -08:00
Wouter Beek
257eb0a41c FIXED: string buffers potentially too small to fit formatted write
Summary:
This fixes the following warnings when compiled with GCC7:

util/transaction_test_util.cc: In static member function ‘static rocksdb::Status rocksdb::RandomTransactionInserter::DBGet(rocksdb::DB*, rocksdb::Transaction*, rocksdb::ReadOptions&, uint16_t, uint64_t, bool, uint64_t*, std::__cxx11::string*, bool*)’:
util/transaction_test_util.cc:75:8: error: ‘snprintf’ output may be truncated before the last format character [-Werror=format-truncation=]
 Status RandomTransactionInserter::DBGet(
        ^~~~~~~~~~~~~~~~~~~~~~~~~
util/transaction_test_util.cc:84:11: note: ‘snprintf’ output between 5 and 6 bytes into a destination of size 5
   snprintf(prefix_buf, sizeof(prefix_buf), "%.4u", set_i + 1);
   ~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
util/transaction_test_util.cc: In static member function ‘static rocksdb::Status rocksdb::RandomTransactionInserter::Verify(rocksdb::DB*, uint16_t, uint64_t, bool, rocksdb::Random64*)’:
util/transaction_test_util.cc:245:8: error: ‘snprintf’ output may be truncated before the last format character [-Werror=format-truncation=]
 Status RandomTransactionInserter::Verify(DB* db, uint16_t num_sets,
        ^~~~~~~~~~~~~~~~~~~~~~~~~
util/transaction_test_util.cc:268:13: note: ‘snprintf’ output between 5 and 6 bytes into a destination of size 5
     snprintf(prefix_buf, sizeof(prefix_buf), "%.4u", set_i + 1);
     ~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Closes https://github.com/facebook/rocksdb/pull/3295
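
The warning arises because `"%.4u"` prints at least four digits, so a five-digit value plus the terminating NUL needs 6 bytes while the buffer holds 5. A small sketch of how the return value of `snprintf` exposes this; `FormatsWithoutTruncation` is a hypothetical name used only for illustration and is not part of the fix itself.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Returns true if formatting `value` with "%.4u" fits in `buf_size` bytes,
// including the terminating NUL. snprintf reports the length it would have
// written, so a result >= buf_size means the output was truncated.
bool FormatsWithoutTruncation(unsigned value, char* buf,
                              std::size_t buf_size) {
  assert(buf != nullptr && buf_size > 0);
  const int needed = std::snprintf(buf, buf_size, "%.4u", value);
  return needed >= 0 && static_cast<std::size_t>(needed) < buf_size;
}
```

With a 5-byte buffer, four-digit values fit and five-digit values do not; a 6-byte buffer covers every value that `set_i + 1` can take for a `uint16_t` `set_i`.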

Differential Revision: D6609411

Pulled By: maysamyabandeh

fbshipit-source-id: 33f0add471056eb59db2f8bd4366e6dfbb1a187d
2018-02-13 13:37:32 -08:00
Jun Wu
54c1984579 crc32: suppress -Wimplicit-fallthrough warnings
Summary:
Work around a bunch of "implicit-fallthrough" compiler errors, like:

```
util/crc32c.cc:533:7: error: this statement may fall through [-Werror=implicit-fallthrough=]
   crc = _mm_crc32_u64(crc, *(uint64_t*)(buf + offset));
       ^
util/crc32c.cc:1016:9: note: in expansion of macro ‘CRCsinglet’
         CRCsinglet(crc0, next, -2 * 8);
         ^~~~~~~~~~
util/crc32c.cc:1017:7: note: here
       case 1:
```
Closes https://github.com/facebook/rocksdb/pull/3339
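
For context, one common way to tell the compiler that a fall-through is intentional is the C++17 `[[fallthrough]]` attribute. The sketch below illustrates that general technique only; it is an assumption that a reader wants this form, and the commit itself may use a different mechanism. `SumTail` is a hypothetical function, unrelated to `util/crc32c.cc`.

```cpp
#include <cassert>
#include <cstdint>

// Sums the trailing `remaining` bytes (0..3) of a buffer. Each case
// deliberately continues into the next one, and [[fallthrough]] marks that
// as intentional so -Wimplicit-fallthrough stays quiet.
uint32_t SumTail(const uint8_t* p, int remaining) {
  assert(remaining >= 0 && remaining <= 3);
  uint32_t sum = 0;
  switch (remaining) {
    case 3:
      sum += p[2];
      [[fallthrough]];
    case 2:
      sum += p[1];
      [[fallthrough]];
    case 1:
      sum += p[0];
      break;
    default:
      break;
  }
  return sum;
}
```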

Reviewed By: sagar0

Differential Revision: D6874736

Pulled By: quark-zju

fbshipit-source-id: eec9f3bc135e12fca336928d01711006d5c3cb16
2018-02-12 12:58:31 -08:00
Mark Isaacson
7ac544656d Suppress lint in old files
Summary: Grandfather in super old lint issues to make a clean slate for moving forward that allows us to have stronger enforcement on new issues.

Reviewed By: yiwu-arbug

Differential Revision: D6821806

fbshipit-source-id: 22797d31ec58e9eb0255d3b66fedfcfcb0dc127c
2018-01-30 17:15:39 -08:00
Yi Wu
c1e70e73ea Bump version to 5.10.2 2018-01-30 11:13:10 -08:00
Yi Wu
d91f420dd9 StackableDB optionally take shared ownership of the underlying DB
Summary:
Allow StackableDB to optionally take a shared_ptr on construction and thus hold shared ownership of the underlying DB.
Closes https://github.com/facebook/rocksdb/pull/3423
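
A rough sketch of the ownership pattern this summary describes, using hypothetical stand-in types (`Db`, `StackableWrapper`) rather than the real RocksDB classes. The assumption that the raw-pointer constructor takes sole ownership is mine, made for illustration.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct Db {  // hypothetical stand-in for the wrapped database type
  virtual ~Db() = default;
};

class StackableWrapper {
 public:
  // Raw pointer: the wrapper is assumed to be the sole owner.
  explicit StackableWrapper(Db* db) : db_(db) { assert(db_ != nullptr); }
  // shared_ptr: the wrapper shares ownership with the caller.
  explicit StackableWrapper(std::shared_ptr<Db> db)
      : db_(db.get()), shared_db_(std::move(db)) {
    assert(db_ != nullptr);
  }
  ~StackableWrapper() {
    if (shared_db_ == nullptr) delete db_;  // only delete when sole owner
  }
  StackableWrapper(const StackableWrapper&) = delete;
  StackableWrapper& operator=(const StackableWrapper&) = delete;

  Db* GetBaseDb() const { return db_; }

 private:
  Db* db_;                         // declared first so it is initialized first
  std::shared_ptr<Db> shared_db_;  // empty when constructed from a raw pointer
};
```

With the shared form, the underlying object stays alive for as long as either the caller or the wrapper still holds a reference.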

Differential Revision: D6824163

Pulled By: yiwu-arbug

fbshipit-source-id: dbdc30c42e007533a987ef413785e192340f03eb
2018-01-30 11:12:37 -08:00
Sagar Vemuri
1454a5ccfe Fix PowerPC dynamic java build
Summary:
Java build on PPC64le has been broken for a few months, due to #2716. Fixing it with the least amount of changes.
(We should clean up a little around this code when time permits).

This should fix the build failures seen in http://140.211.168.68:8080/job/Rocksdb/ .
Closes https://github.com/facebook/rocksdb/pull/3359

Differential Revision: D6712938

Pulled By: sagar0

fbshipit-source-id: 3046e8f072180693de2af4762934ec1ace309ca4
2018-01-18 18:02:26 -08:00
Yi Wu
d7defe4fa3 Bump version to 5.10.1 2018-01-18 18:02:03 -08:00
Yi Wu
e7e0aa7a2a Fix Flush() keep waiting after flush finish
Summary:
Flush() call could be waiting indefinitely if min_write_buffer_number_to_merge is used. Consider the sequence:
1. User calls Flush() with flush_options.wait = true.
2. The manual flush starts in the background.
3. A new memtable becomes immutable because of writes. The new memtable will not trigger a flush if min_write_buffer_number_to_merge is not reached.
4. The manual flush finishes.

Because the new memtable created at step 3 has not been flushed, the previous logic of WaitForFlushMemTable() keeps waiting, even though the memtables it intended to flush have been flushed.

Now, instead of only checking whether there are any more memtables to flush, WaitForFlushMemTable() also checks the ID of the earliest memtable. If that ID is larger than the ID of the latest memtable at the time the flush was initiated, all memtables that existed when the flush started have been flushed.
Closes https://github.com/facebook/rocksdb/pull/3378
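
The completion check described in this summary can be modeled in a few lines. This is a simplified sketch under stated assumptions: memtable IDs increase over time, and `pending_ids` lists the immutable memtables that are not yet flushed, oldest first. `ManualFlushFinished` is a hypothetical name, not the actual `WaitForFlushMemTable()` code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <deque>

// `pending_ids`: IDs of immutable memtables still waiting to be flushed,
// oldest first. `latest_id_at_start`: ID of the newest memtable that existed
// when the manual flush was requested.
bool ManualFlushFinished(const std::deque<uint64_t>& pending_ids,
                         uint64_t latest_id_at_start) {
  assert(std::is_sorted(pending_ids.begin(), pending_ids.end()));
  if (pending_ids.empty()) return true;  // nothing left at all
  // Everything still pending was created after the flush began, so the
  // memtables the caller asked for are already on disk.
  return pending_ids.front() > latest_id_at_start;
}
```

In the sequence above, the memtable created at step 3 has a larger ID than any memtable present at step 1, so the wait can end after step 4 even though that memtable is still pending.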

Differential Revision: D6746789

Pulled By: yiwu-arbug

fbshipit-source-id: 35e698f71c7f90b06337a93e6825f4ea3b619bfa
2018-01-18 17:52:02 -08:00
Adam Retter
6c1a7243d8 FreeBSD build support for RocksDB and RocksJava
Summary:
Tested on a clean FreeBSD 11.01 x64.

Closes https://github.com/facebook/rocksdb/pull/1423
Closes https://github.com/facebook/rocksdb/pull/3357

Differential Revision: D6705868

Pulled By: sagar0

fbshipit-source-id: cbccbbdafd4f42922512ca03619a5d5583a425fd
2018-01-11 13:42:22 -08:00
Andrew Kryczka
c43d91eef7 fix powerpc java static build
Summary:
added support for C and asm files as required for e612e317409e8a9d74cf05db0bd733403305f459.
Closes https://github.com/facebook/rocksdb/pull/3299

Differential Revision: D6612479

Pulled By: ajkr

fbshipit-source-id: 6263ed7c1602f249460421825c76b5721f396163
2018-01-11 12:41:44 -08:00
Sagar Vemuri
cf2c88cb4d Fix zstd/zdict include path for java static build
Summary:
With the ZSTD dictionary generator support added in #3057
`PORTABLE=1 ROCKSDB_NO_FBCODE=1 make rocksdbjavastatic` fails as it can't find zdict.h. Specifically due to:
e3a06f12d2/util/compression.h (L39)
In java static builds zstd code gets directly downloaded from https://github.com/facebook/zstd , and in there zdict.h is under dictBuilder directory. So, I modified libzstd.a target to use `make install` to collect all the header files into a single location and used that as the zstd's include path.
Closes https://github.com/facebook/rocksdb/pull/3260

Differential Revision: D6669850

Pulled By: sagar0

fbshipit-source-id: f8a7562a670e5aed4c4fb6034a921697590d7285
2018-01-10 17:50:53 -08:00
Yi Wu
f897be87ae Fix db_bench write being disabled in lite build
Summary:
The macro was added by mistake in #2372
Closes https://github.com/facebook/rocksdb/pull/3343

Differential Revision: D6681356

Pulled By: yiwu-arbug

fbshipit-source-id: 4180172fb0eaef4189c07f219241e0c261c03461
2018-01-10 17:50:03 -08:00
Siying Dong
8e42c3f8b8 Remove GCC parameter "-march=native" for ARM
Summary:
Most popular versions of GCC can't identify platform on ARM if "-march=native" is specified. Remove it to unblock most people.
Closes https://github.com/facebook/rocksdb/pull/3346

Differential Revision: D6690544

Pulled By: siying

fbshipit-source-id: bbaba9fe2645b6b37144b36ea75beeff88992b49
2018-01-10 17:49:56 -08:00
Siying Dong
0d2818d8d7 Fix a wrong log formatting
Summary:
I experienced a weird segfault because of a type mismatch in log formatting. Fix it.
Closes https://github.com/facebook/rocksdb/pull/3345
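
The commit message does not show the offending line, so the following is only a generic illustration of the bug class: a printf-style conversion must match the argument type, and the `<cinttypes>` macros are a portable way to format fixed-width integers. `FormatFileSize` and its message text are hypothetical.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Formats two 64-bit values. Using "%d" or "%s" here instead of PRIu64
// would be undefined behavior and can crash inside the formatting routine.
std::string FormatFileSize(uint64_t file_number, uint64_t bytes) {
  char buf[64];  // large enough for two 20-digit numbers plus the text
  const int n = std::snprintf(buf, sizeof(buf),
                              "file #%" PRIu64 ": %" PRIu64 " bytes",
                              file_number, bytes);
  assert(n > 0 && static_cast<std::size_t>(n) < sizeof(buf));
  return std::string(buf, static_cast<std::size_t>(n));
}
```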

Differential Revision: D6687224

Pulled By: siying

fbshipit-source-id: c51fb1c008b7ebc3efdc353a4adad3e8f5b3e9de
2018-01-10 17:49:43 -08:00
sdong
f7285ce5be Fix HISTORY.md 2017-12-27 13:19:53 -08:00
yingsu00
76698fe15e Port 3 way SSE4.2 crc32c implementation from Folly
Summary:
**# Summary**

RocksDB uses SSE crc32 intrinsics to calculate crc32 values, but it does so in a single-way fashion (not pipelined on a single CPU core). Intel's whitepaper () published an algorithm that uses 3-way pipelining for the crc32 intrinsics, then uses the pclmulqdq intrinsic to combine the values. Because pclmulqdq has overhead of its own, this algorithm shows perf gains on buffers larger than 216 bytes, which makes RocksDB a perfect user, since most of the buffers RocksDB calls crc32c on are over 4KB. Initial db_bench runs show a tremendous CPU gain.

This change uses the 3-way SSE algorithm by default. The old SSE algorithm is now behind a compiler tag, NO_THREEWAY_CRC32C. If the user compiles the code with NO_THREEWAY_CRC32C=1, the old SSE crc32c algorithm is used. If the server does not have SSE4.2 at run time, the slow (non-SSE) path is used.
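
Whichever path is selected, the result must be the same CRC-32C (Castagnoli) checksum. As a point of reference, here is a portable bit-at-a-time sketch of that function. It is deliberately slow, uses no intrinsics, and is not RocksDB's fallback implementation; `Crc32cBitwise` is a hypothetical name. It can serve as a cross-check when comparing faster variants.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Bitwise CRC-32C using the reflected Castagnoli polynomial 0x82F63B78,
// with the conventional initial value and final XOR of 0xFFFFFFFF.
uint32_t Crc32cBitwise(const void* data, std::size_t n) {
  assert(data != nullptr || n == 0);
  const auto* p = static_cast<const uint8_t*>(data);
  uint32_t crc = 0xFFFFFFFFu;
  for (std::size_t i = 0; i < n; ++i) {
    crc ^= p[i];
    for (int bit = 0; bit < 8; ++bit) {
      // If the low bit is set, shift and XOR in the polynomial.
      crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
  }
  return crc ^ 0xFFFFFFFFu;
}
```

The standard check value for the nine ASCII bytes `123456789` is `0xE3069283`, which any correct CRC-32C implementation should reproduce.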

**# Performance Test Results**
We ran the FillRandom and ReadRandom benchmarks in db_bench. ReadRandom is the point of interest here since it calculates the CRC32 for the in-mem buffers. We did 3 runs for each algorithm.

Before this change, the CRC32 value computation took about 11.5% of total CPU cost; with the new 3-way algorithm it is reduced to around 4.5%. The overall throughput also improved from 25.53MB/s to 27.63MB/s.

1) ReadRandom in db_bench overall metrics

    PER RUN
    Algorithm  | run | micros/op | ops/sec | Throughput (MB/s)
    3-way      | 1   | 4.143     | 241387  | 26.7
    3-way      | 2   | 3.775     | 264872  | 29.3
    3-way      | 3   | 4.116     | 242929  | 26.9
    FastCrc32c | 1   | 4.037     | 247727  | 27.4
    FastCrc32c | 2   | 4.648     | 215166  | 23.8
    FastCrc32c | 3   | 4.352     | 229799  | 25.4

    AVG
    Algorithm  | Average of micros/op | Average of ops/sec | Average of Throughput (MB/s)
    3-way      | 4.01                 | 249,729            | 27.63
    FastCrc32c | 4.35                 | 230,897            | 25.53

2) Crc32c computation CPU cost (inclusive samples percentage)

    PER RUN
    Implementation | run | TotalSamples  | Crc32c percentage
    3-way          | 1   | 4,572,250,000 | 4.37%
    3-way          | 2   | 3,779,250,000 | 4.62%
    3-way          | 3   | 4,129,500,000 | 4.48%
    FastCrc32c     | 1   | 4,663,500,000 | 11.24%
    FastCrc32c     | 2   | 4,047,500,000 | 12.34%
    FastCrc32c     | 3   | 4,366,750,000 | 11.68%

**# Test Plan**

    make -j64 corruption_test && ./corruption_test

By default it uses the 3-way SSE algorithm.

    NO_THREEWAY_CRC32C=1 make -j64 corruption_test && ./corruption_test

    make clean && DEBUG_LEVEL=0 make -j64 db_bench
    make clean && DEBUG_LEVEL=0 NO_THREEWAY_CRC32C=1 make -j64 db_bench
Closes https://github.com/facebook/rocksdb/pull/3173

Differential Revision: D6330882

Pulled By: yingsu00

fbshipit-source-id: 8ec3d89719533b63b536a736663ca6f0dd4482e9
2017-12-27 13:16:32 -08:00
Zhongyi Xie
9196e80b15 Reduce heavy hitter for Get operation
Summary:
This PR addresses the following heavy hitters in `Get` operation by moving calls to `StatisticsImpl::recordTick` from `BlockBasedTable` to `Version::Get`

- rocksdb.block.cache.bytes.write
- rocksdb.block.cache.add
- rocksdb.block.cache.data.miss
- rocksdb.block.cache.data.bytes.insert
- rocksdb.block.cache.data.add
- rocksdb.block.cache.hit
- rocksdb.block.cache.data.hit
- rocksdb.block.cache.bytes.read

The db_bench statistics before and after the change are:

|1GB block read|Children      |Self  |Command          |Shared Object        |Symbol|
|---|---|---|---|---|---|
|master:     |4.22%     |1.31%  |db_bench  |db_bench  |[.] rocksdb::StatisticsImpl::recordTick|
|updated:    |0.51%     |0.21%  |db_bench  |db_bench  |[.] rocksdb::StatisticsImpl::recordTick|
|     	     |0.14%     |0.14%  |db_bench  |db_bench  |[.] rocksdb::GetContext::record_counters|

|1MB block read|Children      |Self  |Command          |Shared Object        |Symbol|
|---|---|---|---|---|---|
|master:    |3.48%     |1.08%  |db_bench  |db_bench  |[.] rocksdb::StatisticsImpl::recordTick|
|updated:    |0.80%     |0.31%  |db_bench  |db_bench  |[.] rocksdb::StatisticsImpl::recordTick|
|    	     |0.35%     |0.35%  |db_bench  |db_bench  |[.] rocksdb::GetContext::record_counters|
Closes https://github.com/facebook/rocksdb/pull/3172

Differential Revision: D6330532

Pulled By: miasantreble

fbshipit-source-id: 2b492959e00a3db29e9437ecdcc5e48ca4ec5741
2017-12-14 16:52:06 -08:00
Siying Dong
243ca14116 Print out compression type of new SST files in logging
Summary: Closes https://github.com/facebook/rocksdb/pull/3264

Differential Revision: D6552768

Pulled By: siying

fbshipit-source-id: 6303110aff22f341d5cff41f8d2d4f138a53652d
2017-12-14 16:51:26 -08:00
Siying Dong
1c3c1c4340 NUMBER_BLOCK_COMPRESSED, etc, shouldn't be treated as timer counter
Summary:
NUMBER_BLOCK_DECOMPRESSED and NUMBER_BLOCK_COMPRESSED are not reported unless the stats level contains detailed timers, which is wrong. They are normal counters. Fix it.
Closes https://github.com/facebook/rocksdb/pull/3263

Differential Revision: D6552519

Pulled By: siying

fbshipit-source-id: 40899ccea7b2856bb39752616657c0bfd432f6f9
2017-12-14 16:51:13 -08:00
Orvid King
3e40def2c1 Fix the build with MSVC 2017
Summary:
There were a few places where MSVC's implicit truncation warnings were getting triggered, which was causing the MSVC build to fail due to warnings being treated as errors. This resolves the issues by making the truncations in some places explicit, and by making it so there are no truncations of literals.

Fixes #3239
Supersedes #3259
Closes https://github.com/facebook/rocksdb/pull/3273
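
As an illustration of "making the truncations explicit", a narrowing conversion can be written with `static_cast` and guarded by a range check, which documents the intent and avoids the implicit-conversion warning. `ToU32` is a hypothetical helper, not code from this commit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>

// Narrows a size to 32 bits on purpose. The assert documents the assumption
// that the value fits; the cast tells the compiler the narrowing is intended.
uint32_t ToU32(std::size_t n) {
  assert(n <= std::numeric_limits<uint32_t>::max());
  return static_cast<uint32_t>(n);
}
```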

Reviewed By: yiwu-arbug

Differential Revision: D6569204

Pulled By: Orvid

fbshipit-source-id: c188cf1cf98d9acb6d94b71875041cc81f8ff088
2017-12-14 16:49:25 -08:00
Siying Dong
766f062bbb Switch version to 5.10
Summary: Closes https://github.com/facebook/rocksdb/pull/3252

Differential Revision: D6539373

Pulled By: siying

fbshipit-source-id: ce7c3d3fe625852179055295da9cf7bc80755025
2017-12-11 15:44:51 -08:00
82 changed files with 1834 additions and 317 deletions

View File

@ -169,7 +169,7 @@ if(PORTABLE)
# MSVC does not need a separate compiler flag to enable SSE4.2; if nmmintrin.h
# is available, it is available by default.
if(FORCE_SSE42 AND NOT MSVC)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -msse4.2")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -msse4.2 -mpclmul")
endif()
else()
if(MSVC)
@ -181,13 +181,18 @@ endif()
include(CheckCXXSourceCompiles)
if(NOT MSVC)
set(CMAKE_REQUIRED_FLAGS "-msse4.2")
set(CMAKE_REQUIRED_FLAGS "-msse4.2 -mpclmul")
endif()
CHECK_CXX_SOURCE_COMPILES("
#include <cstdint>
#include <nmmintrin.h>
#include <wmmintrin.h>
int main() {
volatile uint32_t x = _mm_crc32_u32(0, 0);
const auto a = _mm_set_epi64x(0, 0);
const auto b = _mm_set_epi64x(0, 0);
const auto c = _mm_clmulepi64_si128(a, b, 0x00);
auto d = _mm_cvtsi128_si64(c);
}
" HAVE_SSE42)
unset(CMAKE_REQUIRED_FLAGS)
@ -608,7 +613,7 @@ if(HAVE_SSE42 AND NOT FORCE_SSE42)
if(NOT MSVC)
set_source_files_properties(
util/crc32c.cc
PROPERTIES COMPILE_FLAGS "-msse4.2")
PROPERTIES COMPILE_FLAGS "-msse4.2 -mpclmul")
endif()
endif()

View File

@ -1,9 +1,31 @@
# Rocksdb Change Log
## Unreleased
## 5.10.5 (03/16/2018)
### Bug fixes
* Fix WAL corruption caused by race condition between user write thread and backup/checkpoint thread.
## 5.10.4 (02/22/2018)
### New Features
* Follow rsync-style naming convention for BackupEngine tempfiles. This enables some optimizations when run on GlusterFS.
* Fix regression of Java build break on Windows.
## 5.10.3 (02/21/2018)
### Bug fixes
* Fix build break regression using gcc-7
* Direct I/O writable file should do fsync in Close()
### New Features
* Add rocksdb.iterator.internal-key property
## 5.10.1 (01/18/2018)
### Bug Fixes
* Fix DB::Flush() keep waiting after flush finish under certain condition.
## 5.10.0 (12/11/2017)
### Public API Change
* When running `make` with environment variable `USE_SSE` set and `PORTABLE` unset, will use all machine features available locally. Previously this combination only compiled SSE-related features.
### New Features
* CRC32C is now using the 3-way pipelined SSE algorithm `crc32c_3way` on supported platforms to improve performance. The system will choose to use this algorithm on supported platforms automatically whenever possible. If PCLMULQDQ is not supported it will fall back to the old Fast_CRC32 algorithm.
* Provide lifetime hints when writing files on Linux. This reduces hardware write-amp on storage devices supporting multiple streams.
* Add a DB stat, `NUMBER_ITER_SKIP`, which returns how many internal keys were skipped during iterations (e.g., due to being tombstones or duplicate versions of a key).
* Add PerfContext counters, `key_lock_wait_count` and `key_lock_wait_time`, which measure the number of times transactions wait on key locks and total amount of time waiting.

View File

@ -107,6 +107,41 @@ to build a portable binary, add `PORTABLE=1` before your make commands, like thi
* run `brew tap homebrew/versions; brew install gcc48 --use-llvm` to install gcc 4.8 (or higher).
* run `brew install rocksdb`
* **FreeBSD** (11.01):
* You can either install RocksDB from the Ports system using `cd /usr/ports/databases/rocksdb && make install`, or you can follow the details below to install dependencies and compile from source code:
* Install the dependencies for RocksDB:
export BATCH=YES
cd /usr/ports/devel/gmake && make install
cd /usr/ports/devel/gflags && make install
cd /usr/ports/archivers/snappy && make install
cd /usr/ports/archivers/bzip2 && make install
cd /usr/ports/archivers/liblz4 && make install
cd /usr/ports/archivers/zstd && make install
cd /usr/ports/devel/git && make install
* Install the dependencies for RocksJava (optional):
export BATCH=yes
cd /usr/ports/java/openjdk7 && make install
* Build RocksDB from source:
cd ~
git clone https://github.com/facebook/rocksdb.git
cd rocksdb
gmake static_lib
* Build RocksJava from source (optional):
cd rocksdb
export JAVA_HOME=/usr/local/openjdk7
gmake rocksdbjava
* **iOS**:
* Run: `TARGET_OS=IOS make static_lib`. When building the project which uses rocksdb iOS library, make sure to define two important pre-processing macros: `ROCKSDB_LITE` and `IOS_CROSS_COMPILE`.

View File

@ -305,6 +305,9 @@ LDFLAGS += $(LUA_LIB)
endif
ifeq ($(NO_THREEWAY_CRC32C), 1)
CXXFLAGS += -DNO_THREEWAY_CRC32C
endif
CFLAGS += $(WARNING_FLAGS) -I. -I./include $(PLATFORM_CCFLAGS) $(OPT)
CXXFLAGS += $(WARNING_FLAGS) -I. -I./include $(PLATFORM_CXXFLAGS) $(OPT) -Woverloaded-virtual -Wnon-virtual-dtor -Wno-missing-field-initializers
@ -341,6 +344,8 @@ ifeq ($(HAVE_POWER8),1)
LIB_CC_OBJECTS = $(LIB_SOURCES:.cc=.o)
LIBOBJECTS += $(LIB_SOURCES_C:.c=.o)
LIBOBJECTS += $(LIB_SOURCES_ASM:.S=.o)
else
LIB_CC_OBJECTS = $(LIB_SOURCES:.cc=.o)
endif
LIBOBJECTS += $(TOOL_LIB_SOURCES:.cc=.o)
@ -1568,7 +1573,7 @@ else
endif
endif
ifeq ($(PLATFORM), OS_FREEBSD)
JAVA_INCLUDE += -I$(JAVA_HOME)/include/freebsd
JAVA_INCLUDE = -I$(JAVA_HOME)/include -I$(JAVA_HOME)/include/freebsd
ROCKSDBJNILIB = librocksdbjni-freebsd$(ARCH).so
ROCKSDB_JAR = rocksdbjni-$(ROCKSDB_MAJOR).$(ROCKSDB_MINOR).$(ROCKSDB_PATCH)-freebsd$(ARCH).jar
endif
@ -1594,7 +1599,7 @@ libz.a:
exit 1; \
fi
tar xvzf zlib-$(ZLIB_VER).tar.gz
cd zlib-$(ZLIB_VER) && CFLAGS='-fPIC ${EXTRA_CFLAGS}' LDFLAGS='${EXTRA_LDFLAGS}' ./configure --static && make
cd zlib-$(ZLIB_VER) && CFLAGS='-fPIC ${EXTRA_CFLAGS}' LDFLAGS='${EXTRA_LDFLAGS}' ./configure --static && $(MAKE)
cp zlib-$(ZLIB_VER)/libz.a .
libbz2.a:
@ -1606,7 +1611,7 @@ libbz2.a:
exit 1; \
fi
tar xvzf bzip2-$(BZIP2_VER).tar.gz
cd bzip2-$(BZIP2_VER) && make CFLAGS='-fPIC -O2 -g -D_FILE_OFFSET_BITS=64 ${EXTRA_CFLAGS}' AR='ar ${EXTRA_ARFLAGS}'
cd bzip2-$(BZIP2_VER) && $(MAKE) CFLAGS='-fPIC -O2 -g -D_FILE_OFFSET_BITS=64 ${EXTRA_CFLAGS}' AR='ar ${EXTRA_ARFLAGS}'
cp bzip2-$(BZIP2_VER)/libbz2.a .
libsnappy.a:
@ -1619,7 +1624,7 @@ libsnappy.a:
fi
tar xvzf snappy-$(SNAPPY_VER).tar.gz
cd snappy-$(SNAPPY_VER) && CFLAGS='${EXTRA_CFLAGS}' CXXFLAGS='${EXTRA_CXXFLAGS}' LDFLAGS='${EXTRA_LDFLAGS}' ./configure --with-pic --enable-static --disable-shared
cd snappy-$(SNAPPY_VER) && make ${SNAPPY_MAKE_TARGET}
cd snappy-$(SNAPPY_VER) && $(MAKE) ${SNAPPY_MAKE_TARGET}
cp snappy-$(SNAPPY_VER)/.libs/libsnappy.a .
liblz4.a:
@ -1632,7 +1637,7 @@ liblz4.a:
exit 1; \
fi
tar xvzf lz4-$(LZ4_VER).tar.gz
cd lz4-$(LZ4_VER)/lib && make CFLAGS='-fPIC -O2 ${EXTRA_CFLAGS}' all
cd lz4-$(LZ4_VER)/lib && $(MAKE) CFLAGS='-fPIC -O2 ${EXTRA_CFLAGS}' all
cp lz4-$(LZ4_VER)/lib/liblz4.a .
libzstd.a:
@ -1645,29 +1650,45 @@ libzstd.a:
exit 1; \
fi
tar xvzf zstd-$(ZSTD_VER).tar.gz
cd zstd-$(ZSTD_VER)/lib && make CFLAGS='-fPIC -O2 ${EXTRA_CFLAGS}' all
cd zstd-$(ZSTD_VER)/lib && DESTDIR=. PREFIX= $(MAKE) CFLAGS='-fPIC -O2 ${EXTRA_CFLAGS}' install
cp zstd-$(ZSTD_VER)/lib/libzstd.a .
# A version of each $(LIBOBJECTS) compiled with -fPIC and a fixed set of static compression libraries
java_static_libobjects = $(patsubst %,jls/%,$(LIBOBJECTS))
java_static_libobjects = $(patsubst %,jls/%,$(LIB_CC_OBJECTS))
CLEAN_FILES += jls
java_static_all_libobjects = $(java_static_libobjects)
ifneq ($(ROCKSDB_JAVA_NO_COMPRESSION), 1)
JAVA_COMPRESSIONS = libz.a libbz2.a libsnappy.a liblz4.a libzstd.a
endif
JAVA_STATIC_FLAGS = -DZLIB -DBZIP2 -DSNAPPY -DLZ4 -DZSTD
JAVA_STATIC_INCLUDES = -I./zlib-$(ZLIB_VER) -I./bzip2-$(BZIP2_VER) -I./snappy-$(SNAPPY_VER) -I./lz4-$(LZ4_VER)/lib -I./zstd-$(ZSTD_VER)/lib
JAVA_STATIC_INCLUDES = -I./zlib-$(ZLIB_VER) -I./bzip2-$(BZIP2_VER) -I./snappy-$(SNAPPY_VER) -I./lz4-$(LZ4_VER)/lib -I./zstd-$(ZSTD_VER)/lib/include
ifeq ($(HAVE_POWER8),1)
JAVA_STATIC_C_LIBOBJECTS = $(patsubst %.c.o,jls/%.c.o,$(LIB_SOURCES_C:.c=.o))
JAVA_STATIC_ASM_LIBOBJECTS = $(patsubst %.S.o,jls/%.S.o,$(LIB_SOURCES_ASM:.S=.o))
java_static_ppc_libobjects = $(JAVA_STATIC_C_LIBOBJECTS) $(JAVA_STATIC_ASM_LIBOBJECTS)
jls/util/crc32c_ppc.o: util/crc32c_ppc.c
$(AM_V_CC)$(CC) $(CFLAGS) $(JAVA_STATIC_FLAGS) $(JAVA_STATIC_INCLUDES) -c $< -o $@
jls/util/crc32c_ppc_asm.o: util/crc32c_ppc_asm.S
$(AM_V_CC)$(CC) $(CFLAGS) $(JAVA_STATIC_FLAGS) $(JAVA_STATIC_INCLUDES) -c $< -o $@
java_static_all_libobjects += $(java_static_ppc_libobjects)
endif
$(java_static_libobjects): jls/%.o: %.cc $(JAVA_COMPRESSIONS)
$(AM_V_CC)mkdir -p $(@D) && $(CXX) $(CXXFLAGS) $(JAVA_STATIC_FLAGS) $(JAVA_STATIC_INCLUDES) -fPIC -c $< -o $@ $(COVERAGEFLAGS)
rocksdbjavastatic: $(java_static_libobjects)
rocksdbjavastatic: $(java_static_all_libobjects)
cd java;$(MAKE) javalib;
rm -f ./java/target/$(ROCKSDBJNILIB)
$(CXX) $(CXXFLAGS) -I./java/. $(JAVA_INCLUDE) -shared -fPIC \
-o ./java/target/$(ROCKSDBJNILIB) $(JNI_NATIVE_SOURCES) \
$(java_static_libobjects) $(COVERAGEFLAGS) \
$(java_static_all_libobjects) $(COVERAGEFLAGS) \
$(JAVA_COMPRESSIONS) $(JAVA_STATIC_LDFLAGS)
cd java/target;strip $(STRIPFLAGS) $(ROCKSDBJNILIB)
cd java;jar -cf target/$(ROCKSDB_JAR) HISTORY*.md
@ -1728,7 +1749,7 @@ JAVA_C_LIBOBJECTS = $(patsubst %.c.o,jl/%.c.o,$(JAVA_C_OBJECTS))
JAVA_ASM_LIBOBJECTS = $(patsubst %.S.o,jl/%.S.o,$(JAVA_ASM_OBJECTS))
endif
java_libobjects = $(patsubst %,jl/%,$(LIBOBJECTS))
java_libobjects = $(patsubst %,jl/%,$(LIB_CC_OBJECTS))
CLEAN_FILES += jl
java_all_libobjects = $(java_libobjects)

View File

@ -36,7 +36,7 @@ def parse_src_mk(repo_path):
# get all .cc / .c files
def get_cc_files(repo_path):
cc_files = []
for root, dirnames, filenames in os.walk(repo_path):
for root, dirnames, filenames in os.walk(repo_path): # noqa: B007 T25377293 Grandfathered in
root = root[(len(repo_path) + 1):]
if "java" in root:
# Skip java

View File

@ -1,4 +1,5 @@
#!/usr/bin/env bash
# Create a tmp directory for the test to use
TEST_DIR=$(mktemp -d /dev/shm/fbcode_rocksdb_XXXXXXX)
# shellcheck disable=SC2068
TEST_TMPDIR="$TEST_DIR" $@ && rm -rf "$TEST_DIR"

View File

@ -51,11 +51,13 @@ if [ -z "$ROCKSDB_NO_FBCODE" -a -d /mnt/gvfs/third-party ]; then
FBCODE_BUILD="true"
# If we're compiling with TSAN we need pic build
PIC_BUILD=$COMPILE_WITH_TSAN
if [ -z "$ROCKSDB_FBCODE_BUILD_WITH_481" ]; then
source "$PWD/build_tools/fbcode_config.sh"
else
if [ -n "$ROCKSDB_FBCODE_BUILD_WITH_481" ]; then
# we need this to build with MySQL. Don't use for other purposes.
source "$PWD/build_tools/fbcode_config4.8.1.sh"
elif [ -n "$ROCKSDB_FBCODE_BUILD_WITH_5xx" ]; then
source "$PWD/build_tools/fbcode_config.sh"
else
source "$PWD/build_tools/fbcode_config_platform007.sh"
fi
fi
@ -141,6 +143,7 @@ case "$TARGET_OS" in
;;
FreeBSD)
PLATFORM=OS_FREEBSD
CXX=clang++
COMMON_FLAGS="$COMMON_FLAGS -fno-builtin-memcmp -D_REENTRANT -DOS_FREEBSD"
PLATFORM_LDFLAGS="$PLATFORM_LDFLAGS -lpthread"
# PORT_FILES=port/freebsd/freebsd_specific.cc
@ -481,13 +484,16 @@ if test -z "$PORTABLE"; then
COMMON_FLAGS="$COMMON_FLAGS -mcpu=$POWER -mtune=$POWER "
elif test -n "`echo $TARGET_ARCHITECTURE | grep ^s390x`"; then
COMMON_FLAGS="$COMMON_FLAGS -march=z10 "
elif test -n "`echo $TARGET_ARCHITECTURE | grep ^arm`"; then
# TODO: Handle this with approprite options.
COMMON_FLAGS="$COMMON_FLAGS"
elif [ "$TARGET_OS" != AIX ] && [ "$TARGET_OS" != SunOS ]; then
COMMON_FLAGS="$COMMON_FLAGS -march=native "
elif test "$USE_SSE"; then
COMMON_FLAGS="$COMMON_FLAGS -msse4.2"
COMMON_FLAGS="$COMMON_FLAGS -msse4.2 -mpclmul"
fi
elif test "$USE_SSE"; then
COMMON_FLAGS="$COMMON_FLAGS -msse4.2"
COMMON_FLAGS="$COMMON_FLAGS -msse4.2 -mpclmul"
fi
$CXX $PLATFORM_CXXFLAGS $COMMON_FLAGS -x c++ - -o /dev/null 2>/dev/null <<EOF
@ -501,6 +507,24 @@ if [ "$?" = 0 ]; then
COMMON_FLAGS="$COMMON_FLAGS -DHAVE_SSE42"
elif test "$USE_SSE"; then
echo "warning: USE_SSE specified but compiler could not use SSE intrinsics, disabling"
exit 1
fi
$CXX $PLATFORM_CXXFLAGS $COMMON_FLAGS -x c++ - -o /dev/null 2>/dev/null <<EOF
#include <cstdint>
#include <wmmintrin.h>
int main() {
const auto a = _mm_set_epi64x(0, 0);
const auto b = _mm_set_epi64x(0, 0);
const auto c = _mm_clmulepi64_si128(a, b, 0x00);
auto d = _mm_cvtsi128_si64(c);
}
EOF
if [ "$?" = 0 ]; then
COMMON_FLAGS="$COMMON_FLAGS -DHAVE_PCLMUL"
elif test "$USE_SSE"; then
echo "warning: USE_SSE specified but compiler could not use PCLMUL intrinsics, disabling"
exit 1
fi
# iOS doesn't support thread-local storage, but this check would erroneously
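
The probe above only decides whether `-DHAVE_PCLMUL` is added to COMMON_FLAGS. How the C++ sources consume that macro is not shown in this hunk, so the following is a minimal sketch under assumptions: `ClmulLow64` is a hypothetical helper, not a RocksDB function, and the portable fallback loop is an illustration rather than the project's code. It uses the same intrinsics as the probe when the macro is defined.

```cpp
#include <cstdint>
#ifdef HAVE_PCLMUL
#include <wmmintrin.h>  // _mm_clmulepi64_si128, as in the configure probe
#endif

// Hypothetical helper: low 64 bits of the carry-less product of a and b.
uint64_t ClmulLow64(uint64_t a, uint64_t b) {
#ifdef HAVE_PCLMUL
  // Hardware path, only compiled when the build script defined HAVE_PCLMUL.
  const auto va = _mm_set_epi64x(0, static_cast<long long>(a));
  const auto vb = _mm_set_epi64x(0, static_cast<long long>(b));
  const auto prod = _mm_clmulepi64_si128(va, vb, 0x00);
  return static_cast<uint64_t>(_mm_cvtsi128_si64(prod));
#else
  // Portable fallback (assumption): XOR shifted copies of a for each set
  // bit of b, i.e. polynomial multiplication over GF(2).
  uint64_t result = 0;
  for (int i = 0; i < 64; ++i) {
    if ((b >> i) & 1) {
      result ^= a << i;
    }
  }
  return result;
#endif
}
```

Both paths compute the same value, so callers would not need to know which one was compiled in.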


@ -13,10 +13,12 @@ error=0
function log {
DATE=`date +%Y-%m-%d:%H:%M:%S`
# shellcheck disable=SC2068
echo $DATE $@
}
function log_err {
# shellcheck disable=SC2145
log "ERROR: $@ Error code: $error."
}


@ -1,3 +1,4 @@
# shellcheck disable=SC2148
GCC_BASE=/mnt/gvfs/third-party2/gcc/8219ec1bcedf8ad9da05e121e193364de2cc4f61/5.x/centos6-native/c447969
CLANG_BASE=/mnt/gvfs/third-party2/llvm-fb/64d8d58e3d84f8bde7a029763d4f5baf39d0d5b9/stable/centos6-native/6aaf4de
LIBGCC_BASE=/mnt/gvfs/third-party2/libgcc/ba9be983c81de7299b59fe71950c664a84dcb5f8/5.x/gcc-5-glibc-2.23/339d858


@ -1,3 +1,4 @@
# shellcheck disable=SC2148
GCC_BASE=/mnt/gvfs/third-party2/gcc/cf7d14c625ce30bae1a4661c2319c5a283e4dd22/4.8.1/centos6-native/cc6c9dc
CLANG_BASE=/mnt/gvfs/third-party2/llvm-fb/8598c375b0e94e1448182eb3df034704144a838d/stable/centos6-native/3f16ddd
LIBGCC_BASE=/mnt/gvfs/third-party2/libgcc/d6e0a7da6faba45f5e5b1638f9edd7afc2f34e7d/4.8.1/gcc-4.8.1-glibc-2.17/8aac7fc


@ -0,0 +1,18 @@
GCC_BASE=/mnt/gvfs/third-party2/gcc/6e8e715624fd15256a7970073387793dfcf79b46/7.x/centos7-native/b2ef2b6
CLANG_BASE=/mnt/gvfs/third-party2/llvm-fb/ef37e1faa1c29782abfac1ae65a291b9b7966f6d/stable/centos7-native/c9f9104
LIBGCC_BASE=/mnt/gvfs/third-party2/libgcc/c67031f0f739ac61575a061518d6ef5038f99f90/7.x/platform007/5620abc
GLIBC_BASE=/mnt/gvfs/third-party2/glibc/60d6f124a78798b73944f5ba87c2306ae3460153/2.26/platform007/f259413
SNAPPY_BASE=/mnt/gvfs/third-party2/snappy/7f9bdaada18f59bc27ec2b0871eb8a6144343aef/1.1.3/platform007/ca4da3d
ZLIB_BASE=/mnt/gvfs/third-party2/zlib/22c2d65676fb7c23cfa797c4f6937f38b026f3cf/1.2.8/platform007/ca4da3d
BZIP2_BASE=/mnt/gvfs/third-party2/bzip2/dc49a21c5fceec6456a7a28a94dcd16690af1337/1.0.6/platform007/ca4da3d
LZ4_BASE=/mnt/gvfs/third-party2/lz4/907b498203d297947f3bb70b9466f47e100f1873/r131/platform007/ca4da3d
ZSTD_BASE=/mnt/gvfs/third-party2/zstd/3ee276cbacfad3074e3f07bf826ac47f06970f4e/1.3.5/platform007/15a3614
GFLAGS_BASE=/mnt/gvfs/third-party2/gflags/0b9929d2588991c65a57168bf88aff2db87c5d48/2.2.0/platform007/ca4da3d
JEMALLOC_BASE=/mnt/gvfs/third-party2/jemalloc/9c910d36d6235cc40e8ff559358f1833452300ca/master/platform007/5b0f53e
NUMA_BASE=/mnt/gvfs/third-party2/numa/9cbf2460284c669ed19c3ccb200a71f7dd7e53c7/2.0.11/platform007/ca4da3d
LIBUNWIND_BASE=/mnt/gvfs/third-party2/libunwind/bf3d7497fe4e6d007354f0adffa16ce3003f8338/1.3/platform007/6f3e0a9
TBB_BASE=/mnt/gvfs/third-party2/tbb/ff4e0b093534704d8abab678a4fd7f5ea7b094c7/2018_U5/platform007/ca4da3d
KERNEL_HEADERS_BASE=/mnt/gvfs/third-party2/kernel-headers/b5c4a61a5c483ba24722005ae07895971a2ac707/fb/platform007/da39a3e
BINUTILS_BASE=/mnt/gvfs/third-party2/binutils/92ff90349e2f43ea0a8246d8b1cf17b6869013e3/2.29.1/centos7-native/da39a3e
VALGRIND_BASE=/mnt/gvfs/third-party2/valgrind/f3f697a28122e6bcd513273dd9c1ff23852fc59f/3.13.0/platform007/ca4da3d
LUA_BASE=/mnt/gvfs/third-party2/lua/f0cd714433206d5139df61659eb7b28b1dea6683/5.3.4/platform007/5007832


@ -0,0 +1,157 @@
#!/bin/sh
#
# Set environment variables so that we can compile rocksdb using
# fbcode settings. It uses the latest g++ and clang compilers and also
# uses jemalloc
# Environment variables that change the behavior of this script:
# PIC_BUILD -- if true, it will only take pic versions of libraries from fbcode. libraries that don't have pic variant will not be included
BASEDIR=`dirname $BASH_SOURCE`
source "$BASEDIR/dependencies_platform007.sh"
CFLAGS=""
# libgcc
LIBGCC_INCLUDE="$LIBGCC_BASE/include/c++/7.3.0"
LIBGCC_LIBS=" -L $LIBGCC_BASE/lib"
# glibc
GLIBC_INCLUDE="$GLIBC_BASE/include"
GLIBC_LIBS=" -L $GLIBC_BASE/lib"
# snappy
SNAPPY_INCLUDE=" -I $SNAPPY_BASE/include/"
if test -z $PIC_BUILD; then
SNAPPY_LIBS=" $SNAPPY_BASE/lib/libsnappy.a"
else
SNAPPY_LIBS=" $SNAPPY_BASE/lib/libsnappy_pic.a"
fi
CFLAGS+=" -DSNAPPY"
if test -z $PIC_BUILD; then
# location of zlib headers and libraries
ZLIB_INCLUDE=" -I $ZLIB_BASE/include/"
ZLIB_LIBS=" $ZLIB_BASE/lib/libz.a"
CFLAGS+=" -DZLIB"
# location of bzip headers and libraries
BZIP_INCLUDE=" -I $BZIP2_BASE/include/"
BZIP_LIBS=" $BZIP2_BASE/lib/libbz2.a"
CFLAGS+=" -DBZIP2"
LZ4_INCLUDE=" -I $LZ4_BASE/include/"
LZ4_LIBS=" $LZ4_BASE/lib/liblz4.a"
CFLAGS+=" -DLZ4"
fi
ZSTD_INCLUDE=" -I $ZSTD_BASE/include/"
if test -z $PIC_BUILD; then
ZSTD_LIBS=" $ZSTD_BASE/lib/libzstd.a"
else
ZSTD_LIBS=" $ZSTD_BASE/lib/libzstd_pic.a"
fi
CFLAGS+=" -DZSTD"
# location of gflags headers and libraries
GFLAGS_INCLUDE=" -I $GFLAGS_BASE/include/"
if test -z $PIC_BUILD; then
GFLAGS_LIBS=" $GFLAGS_BASE/lib/libgflags.a"
else
GFLAGS_LIBS=" $GFLAGS_BASE/lib/libgflags_pic.a"
fi
CFLAGS+=" -DGFLAGS=gflags"
# location of jemalloc
JEMALLOC_INCLUDE=" -I $JEMALLOC_BASE/include/"
JEMALLOC_LIB=" $JEMALLOC_BASE/lib/libjemalloc.a"
if test -z $PIC_BUILD; then
# location of numa
NUMA_INCLUDE=" -I $NUMA_BASE/include/"
NUMA_LIB=" $NUMA_BASE/lib/libnuma.a"
CFLAGS+=" -DNUMA"
# location of libunwind
LIBUNWIND="$LIBUNWIND_BASE/lib/libunwind.a"
fi
# location of TBB
TBB_INCLUDE=" -isystem $TBB_BASE/include/"
if test -z $PIC_BUILD; then
TBB_LIBS="$TBB_BASE/lib/libtbb.a"
else
TBB_LIBS="$TBB_BASE/lib/libtbb_pic.a"
fi
CFLAGS+=" -DTBB"
# use Intel SSE support for checksum calculations
export USE_SSE=1
export PORTABLE=1
BINUTILS="$BINUTILS_BASE/bin"
AR="$BINUTILS/ar"
DEPS_INCLUDE="$SNAPPY_INCLUDE $ZLIB_INCLUDE $BZIP_INCLUDE $LZ4_INCLUDE $ZSTD_INCLUDE $GFLAGS_INCLUDE $NUMA_INCLUDE $TBB_INCLUDE"
STDLIBS="-L $GCC_BASE/lib64"
CLANG_BIN="$CLANG_BASE/bin"
CLANG_LIB="$CLANG_BASE/lib"
CLANG_SRC="$CLANG_BASE/../../src"
CLANG_ANALYZER="$CLANG_BIN/clang++"
CLANG_SCAN_BUILD="$CLANG_SRC/llvm/tools/clang/tools/scan-build/bin/scan-build"
if [ -z "$USE_CLANG" ]; then
# gcc
CC="$GCC_BASE/bin/gcc"
CXX="$GCC_BASE/bin/g++"
CFLAGS+=" -B$BINUTILS/gold"
CFLAGS+=" -isystem $LIBGCC_INCLUDE"
CFLAGS+=" -isystem $GLIBC_INCLUDE"
JEMALLOC=1
else
# clang
CLANG_INCLUDE="$CLANG_LIB/clang/stable/include"
CC="$CLANG_BIN/clang"
CXX="$CLANG_BIN/clang++"
KERNEL_HEADERS_INCLUDE="$KERNEL_HEADERS_BASE/include"
CFLAGS+=" -B$BINUTILS/gold -nostdinc -nostdlib"
CFLAGS+=" -isystem $LIBGCC_BASE/include/c++/7.x "
CFLAGS+=" -isystem $LIBGCC_BASE/include/c++/7.x/x86_64-facebook-linux "
CFLAGS+=" -isystem $GLIBC_INCLUDE"
CFLAGS+=" -isystem $LIBGCC_INCLUDE"
CFLAGS+=" -isystem $CLANG_INCLUDE"
CFLAGS+=" -isystem $KERNEL_HEADERS_INCLUDE/linux "
CFLAGS+=" -isystem $KERNEL_HEADERS_INCLUDE "
CFLAGS+=" -Wno-expansion-to-defined "
CXXFLAGS="-nostdinc++"
fi
CFLAGS+=" $DEPS_INCLUDE"
CFLAGS+=" -DROCKSDB_PLATFORM_POSIX -DROCKSDB_LIB_IO_POSIX -DROCKSDB_FALLOCATE_PRESENT -DROCKSDB_MALLOC_USABLE_SIZE -DROCKSDB_RANGESYNC_PRESENT -DROCKSDB_SCHED_GETCPU_PRESENT -DROCKSDB_SUPPORT_THREAD_LOCAL -DHAVE_SSE42"
CXXFLAGS+=" $CFLAGS"
EXEC_LDFLAGS=" $SNAPPY_LIBS $ZLIB_LIBS $BZIP_LIBS $LZ4_LIBS $ZSTD_LIBS $GFLAGS_LIBS $NUMA_LIB $TBB_LIBS"
EXEC_LDFLAGS+=" -B$BINUTILS/gold"
EXEC_LDFLAGS+=" -Wl,--dynamic-linker,/usr/local/fbcode/platform007/lib/ld.so"
EXEC_LDFLAGS+=" $LIBUNWIND"
EXEC_LDFLAGS+=" -Wl,-rpath=/usr/local/fbcode/platform007/lib"
# required by libtbb
EXEC_LDFLAGS+=" -ldl"
PLATFORM_LDFLAGS="$LIBGCC_LIBS $GLIBC_LIBS $STDLIBS -lgcc -lstdc++"
EXEC_LDFLAGS_SHARED="$SNAPPY_LIBS $ZLIB_LIBS $BZIP_LIBS $LZ4_LIBS $ZSTD_LIBS $GFLAGS_LIBS $TBB_LIBS"
VALGRIND_VER="$VALGRIND_BASE/bin/"
# lua not supported because it's on track for deprecation, I think
LUA_PATH=
LUA_LIB=
export CC CXX AR CFLAGS CXXFLAGS EXEC_LDFLAGS EXEC_LDFLAGS_SHARED VALGRIND_VER JEMALLOC_LIB JEMALLOC_INCLUDE CLANG_ANALYZER CLANG_SCAN_BUILD LUA_PATH LUA_LIB


@ -1,3 +1,4 @@
# shellcheck disable=SC1113
#/usr/bin/env bash
set -e
@ -28,12 +29,14 @@ function package() {
if dpkg --get-selections | grep --quiet $1; then
log "$1 is already installed. skipping."
else
# shellcheck disable=SC2068
apt-get install $@ -y
fi
elif [[ $OS = "centos" ]]; then
if rpm -qa | grep --quiet $1; then
log "$1 is already installed. skipping."
else
# shellcheck disable=SC2068
yum install $@ -y
fi
fi
@ -52,6 +55,7 @@ function gem_install() {
if gem list | grep --quiet $1; then
log "$1 is already installed. skipping."
else
# shellcheck disable=SC2068
gem install $@
fi
}
@ -125,4 +129,5 @@ function main() {
include $LIB_DIR
}
# shellcheck disable=SC2068
main $@


@ -85,8 +85,9 @@ NON_SHM="TMPD=/tmp/rocksdb_test_tmp"
GCC_481="ROCKSDB_FBCODE_BUILD_WITH_481=1"
ASAN="COMPILE_WITH_ASAN=1"
CLANG="USE_CLANG=1"
LITE="OPT=\"-DROCKSDB_LITE -g\""
TSAN="COMPILE_WITH_TSAN=1"
# in gcc-5 there are known problems with TSAN like https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71090.
# using platform007 gives us gcc-8 or higher which has that bug fixed.
TSAN="ROCKSDB_FBCODE_BUILD_WITH_PLATFORM007=1 COMPILE_WITH_TSAN=1"
UBSAN="COMPILE_WITH_UBSAN=1"
DISABLE_JEMALLOC="DISABLE_JEMALLOC=1"
HTTP_PROXY="https_proxy=http://fwdproxy.29.prn1:8080 http_proxy=http://fwdproxy.29.prn1:8080 ftp_proxy=http://fwdproxy.29.prn1:8080"


@ -53,6 +53,45 @@ function get_lib_base()
log_variable $__res_var
}
###########################################################
# platform007 dependencies #
###########################################################
OUTPUT="$BASEDIR/dependencies_platform007.sh"
rm -f "$OUTPUT"
touch "$OUTPUT"
echo "Writing dependencies to $OUTPUT"
# Compilers locations
GCC_BASE=`readlink -f $TP2_LATEST/gcc/7.x/centos7-native/*/`
CLANG_BASE=`readlink -f $TP2_LATEST/llvm-fb/stable/centos7-native/*/`
log_variable GCC_BASE
log_variable CLANG_BASE
# Libraries locations
get_lib_base libgcc 7.x platform007
get_lib_base glibc 2.26 platform007
get_lib_base snappy LATEST platform007
get_lib_base zlib LATEST platform007
get_lib_base bzip2 LATEST platform007
get_lib_base lz4 LATEST platform007
get_lib_base zstd LATEST platform007
get_lib_base gflags LATEST platform007
get_lib_base jemalloc LATEST platform007
get_lib_base numa LATEST platform007
get_lib_base libunwind LATEST platform007
get_lib_base tbb LATEST platform007
get_lib_base kernel-headers fb platform007
get_lib_base binutils LATEST centos7-native
get_lib_base valgrind LATEST platform007
get_lib_base lua 5.3.4 platform007
git diff $OUTPUT
###########################################################
# 5.x dependencies #
###########################################################


@ -72,7 +72,7 @@ def display_file_coverage(per_file_coverage, total_coverage):
header_template = \
"%" + str(max_file_name_length) + "s\t%s\t%s"
separator = "-" * (max_file_name_length + 10 + 20)
print header_template % ("Filename", "Coverage", "Lines")
print header_template % ("Filename", "Coverage", "Lines") # noqa: E999 T25377293 Grandfathered in
print separator
# -- Print body

db/c.cc

@ -1388,23 +1388,24 @@ void rocksdb_writebatch_put_log_data(
b->rep.PutLogData(Slice(blob, len));
}
class H : public WriteBatch::Handler {
public:
void* state_;
void (*put_)(void*, const char* k, size_t klen, const char* v, size_t vlen);
void (*deleted_)(void*, const char* k, size_t klen);
virtual void Put(const Slice& key, const Slice& value) override {
(*put_)(state_, key.data(), key.size(), value.data(), value.size());
}
virtual void Delete(const Slice& key) override {
(*deleted_)(state_, key.data(), key.size());
}
};
void rocksdb_writebatch_iterate(
rocksdb_writebatch_t* b,
void* state,
void (*put)(void*, const char* k, size_t klen, const char* v, size_t vlen),
void (*deleted)(void*, const char* k, size_t klen)) {
class H : public WriteBatch::Handler {
public:
void* state_;
void (*put_)(void*, const char* k, size_t klen, const char* v, size_t vlen);
void (*deleted_)(void*, const char* k, size_t klen);
virtual void Put(const Slice& key, const Slice& value) override {
(*put_)(state_, key.data(), key.size(), value.data(), value.size());
}
virtual void Delete(const Slice& key) override {
(*deleted_)(state_, key.data(), key.size());
}
};
H handler;
handler.state_ = state;
handler.put_ = put;
@ -1649,18 +1650,6 @@ void rocksdb_writebatch_wi_iterate(
void* state,
void (*put)(void*, const char* k, size_t klen, const char* v, size_t vlen),
void (*deleted)(void*, const char* k, size_t klen)) {
class H : public WriteBatch::Handler {
public:
void* state_;
void (*put_)(void*, const char* k, size_t klen, const char* v, size_t vlen);
void (*deleted_)(void*, const char* k, size_t klen);
virtual void Put(const Slice& key, const Slice& value) override {
(*put_)(state_, key.data(), key.size(), value.data(), value.size());
}
virtual void Delete(const Slice& key) override {
(*deleted_)(state_, key.data(), key.size());
}
};
H handler;
handler.state_ = state;
handler.put_ = put;
@ -3104,20 +3093,21 @@ void rocksdb_slicetransform_destroy(rocksdb_slicetransform_t* st) {
delete st;
}
struct Wrapper : public rocksdb_slicetransform_t {
const SliceTransform* rep_;
~Wrapper() { delete rep_; }
const char* Name() const override { return rep_->Name(); }
Slice Transform(const Slice& src) const override {
return rep_->Transform(src);
}
bool InDomain(const Slice& src) const override {
return rep_->InDomain(src);
}
bool InRange(const Slice& src) const override { return rep_->InRange(src); }
static void DoNothing(void*) { }
};
rocksdb_slicetransform_t* rocksdb_slicetransform_create_fixed_prefix(size_t prefixLen) {
struct Wrapper : public rocksdb_slicetransform_t {
const SliceTransform* rep_;
~Wrapper() { delete rep_; }
const char* Name() const override { return rep_->Name(); }
Slice Transform(const Slice& src) const override {
return rep_->Transform(src);
}
bool InDomain(const Slice& src) const override {
return rep_->InDomain(src);
}
bool InRange(const Slice& src) const override { return rep_->InRange(src); }
static void DoNothing(void*) { }
};
Wrapper* wrapper = new Wrapper;
wrapper->rep_ = rocksdb::NewFixedPrefixTransform(prefixLen);
wrapper->state_ = nullptr;
@ -3126,19 +3116,6 @@ rocksdb_slicetransform_t* rocksdb_slicetransform_create_fixed_prefix(size_t pref
}
rocksdb_slicetransform_t* rocksdb_slicetransform_create_noop() {
struct Wrapper : public rocksdb_slicetransform_t {
const SliceTransform* rep_;
~Wrapper() { delete rep_; }
const char* Name() const override { return rep_->Name(); }
Slice Transform(const Slice& src) const override {
return rep_->Transform(src);
}
bool InDomain(const Slice& src) const override {
return rep_->InDomain(src);
}
bool InRange(const Slice& src) const override { return rep_->InRange(src); }
static void DoNothing(void*) { }
};
Wrapper* wrapper = new Wrapper;
wrapper->rep_ = rocksdb::NewNoopTransform();
wrapper->state_ = nullptr;


@ -344,12 +344,13 @@ void SuperVersionUnrefHandle(void* ptr) {
// When latter happens, we are in ~ColumnFamilyData(), no get should happen as
// well.
SuperVersion* sv = static_cast<SuperVersion*>(ptr);
if (sv->Unref()) {
sv->db_mutex->Lock();
sv->Cleanup();
sv->db_mutex->Unlock();
delete sv;
}
bool was_last_ref __attribute__((__unused__));
was_last_ref = sv->Unref();
// Thread-local SuperVersions can't outlive ColumnFamilyData::super_version_.
// This is important because we can't do SuperVersion cleanup here.
// That would require locking DB mutex, which would deadlock because
// SuperVersionUnrefHandle is called with locked ThreadLocalPtr mutex.
assert(!was_last_ref);
}
} // anonymous namespace
@ -385,7 +386,8 @@ ColumnFamilyData::ColumnFamilyData(
pending_flush_(false),
pending_compaction_(false),
prev_compaction_needed_bytes_(0),
allow_2pc_(db_options.allow_2pc) {
allow_2pc_(db_options.allow_2pc),
last_memtable_id_(0) {
Ref();
// Convert user defined table properties collector factories to internal ones.
@ -965,6 +967,12 @@ void ColumnFamilyData::InstallSuperVersion(
RecalculateWriteStallConditions(mutable_cf_options);
if (old_superversion != nullptr) {
// Reset SuperVersions cached in thread local storage.
// This should be done before old_superversion->Unref(). That's to ensure
// that local_sv_ never holds the last reference to SuperVersion, since
// it has no means to safely do SuperVersion cleanup.
ResetThreadLocalSuperVersions();
if (old_superversion->mutable_cf_options.write_buffer_size !=
mutable_cf_options.write_buffer_size) {
mem_->UpdateWriteBufferSize(mutable_cf_options.write_buffer_size);
@ -980,9 +988,6 @@ void ColumnFamilyData::InstallSuperVersion(
sv_context->superversions_to_free.push_back(old_superversion);
}
}
// Reset SuperVersions cached in thread local storage
ResetThreadLocalSuperVersions();
}
void ColumnFamilyData::ResetThreadLocalSuperVersions() {
@ -994,10 +999,12 @@ void ColumnFamilyData::ResetThreadLocalSuperVersions() {
continue;
}
auto sv = static_cast<SuperVersion*>(ptr);
if (sv->Unref()) {
sv->Cleanup();
delete sv;
}
bool was_last_ref __attribute__((__unused__));
was_last_ref = sv->Unref();
// sv couldn't have been the last reference because
// ResetThreadLocalSuperVersions() is called before
// unref'ing super_version_.
assert(!was_last_ref);
}
}
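
The comments in this file argue that the thread-local cache must be reset before the owner's reference is dropped, so the cached reference is never the last one. A minimal sketch of that ordering, assuming a simplified reference count (`RefCounted` and `CacheReleasedBeforeOwner` are hypothetical names, not RocksDB types):

```cpp
#include <atomic>

// Hypothetical, simplified stand-in for SuperVersion's reference count.
struct RefCounted {
  std::atomic<int> refs{0};
  void Ref() { refs.fetch_add(1, std::memory_order_relaxed); }
  // Returns true if this call released the last reference.
  bool Unref() { return refs.fetch_sub(1, std::memory_order_acq_rel) == 1; }
};

// Takes an owner reference and a cached reference, then releases them in
// the order the diff establishes: cache first, owner second.
// Returns true if the cached reference was not the last one and the
// owner's reference was.
bool CacheReleasedBeforeOwner(RefCounted& sv) {
  sv.Ref();  // owner's reference (super_version_ in the diff)
  sv.Ref();  // cached reference (local_sv_ in the diff)
  const bool cache_was_last = sv.Unref();  // reset the cache first
  const bool owner_was_last = sv.Unref();  // then unref the owner
  return !cache_was_last && owner_was_last;
}
```

With this ordering, cleanup that needs the DB mutex only ever runs on the owner's path, which is the property the new asserts check.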


@ -239,7 +239,11 @@ class ColumnFamilyData {
void SetCurrent(Version* _current);
uint64_t GetNumLiveVersions() const; // REQUIRE: DB mutex held
uint64_t GetTotalSstFilesSize() const; // REQUIRE: DB mutex held
void SetMemtable(MemTable* new_mem) { mem_ = new_mem; }
void SetMemtable(MemTable* new_mem) {
uint64_t memtable_id = last_memtable_id_.fetch_add(1) + 1;
new_mem->SetID(memtable_id);
mem_ = new_mem;
}
// calculate the oldest log needed for the durability of this column family
uint64_t OldestLogToKeep();
@ -419,6 +423,9 @@ class ColumnFamilyData {
// if the database was opened with 2pc enabled
bool allow_2pc_;
// Memtable id to track flush.
std::atomic<uint64_t> last_memtable_id_;
};
// ColumnFamilySet has interesting thread-safety requirements


@ -625,7 +625,8 @@ Status CompactionJob::Install(const MutableCFOptions& mutable_cf_options) {
"[%s] compacted to: %s, MB/sec: %.1f rd, %.1f wr, level %d, "
"files in(%d, %d) out(%d) "
"MB in(%.1f, %.1f) out(%.1f), read-write-amplify(%.1f) "
"write-amplify(%.1f) %s, records in: %d, records dropped: %d\n",
"write-amplify(%.1f) %s, records in: %" PRIu64
", records dropped: %" PRIu64 " output_compression: %s\n",
cfd->GetName().c_str(), vstorage->LevelSummary(&tmp), bytes_read_per_sec,
bytes_written_per_sec, compact_->compaction->output_level(),
stats.num_input_files_in_non_output_levels,
@ -634,20 +635,23 @@ Status CompactionJob::Install(const MutableCFOptions& mutable_cf_options) {
stats.bytes_read_output_level / 1048576.0,
stats.bytes_written / 1048576.0, read_write_amp, write_amp,
status.ToString().c_str(), stats.num_input_records,
stats.num_dropped_records);
stats.num_dropped_records,
CompressionTypeToString(compact_->compaction->output_compression())
.c_str());
UpdateCompactionJobStats(stats);
auto stream = event_logger_->LogToBuffer(log_buffer_);
stream << "job" << job_id_
<< "event" << "compaction_finished"
stream << "job" << job_id_ << "event"
<< "compaction_finished"
<< "compaction_time_micros" << compaction_stats_.micros
<< "output_level" << compact_->compaction->output_level()
<< "num_output_files" << compact_->NumOutputFiles()
<< "total_output_size" << compact_->total_bytes
<< "num_input_records" << compact_->num_input_records
<< "num_output_records" << compact_->num_output_records
<< "num_subcompactions" << compact_->sub_compact_states.size();
<< "total_output_size" << compact_->total_bytes << "num_input_records"
<< compact_->num_input_records << "num_output_records"
<< compact_->num_output_records << "num_subcompactions"
<< compact_->sub_compact_states.size() << "output_compression"
<< CompressionTypeToString(compact_->compaction->output_compression());
if (compaction_job_stats_ != nullptr) {
stream << "num_single_delete_mismatches"
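
The log-format change in this hunk replaces `%d` with `PRIu64` for the 64-bit record counters. A small sketch of the same formatting pattern, assuming a hypothetical helper `FormatRecordCounts` that is not part of RocksDB:

```cpp
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: formats two 64-bit counters with PRIu64, the macro
// from <cinttypes> that expands to the correct printf conversion for
// uint64_t on the target platform.
std::string FormatRecordCounts(uint64_t records_in, uint64_t records_dropped) {
  char buf[128];
  std::snprintf(buf, sizeof(buf),
                "records in: %" PRIu64 ", records dropped: %" PRIu64,
                records_in, records_dropped);
  return std::string(buf);
}
```

Passing a `uint64_t` to `%d` is undefined behavior in printf-style functions, so values above the `int` range are a case where the old format could print garbage.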


@ -126,6 +126,41 @@ TEST_F(DBFlushTest, FlushInLowPriThreadPool) {
ASSERT_EQ(1, num_compactions);
}
TEST_F(DBFlushTest, ManualFlushWithMinWriteBufferNumberToMerge) {
Options options = CurrentOptions();
options.write_buffer_size = 100;
options.max_write_buffer_number = 4;
options.min_write_buffer_number_to_merge = 3;
Reopen(options);
SyncPoint::GetInstance()->LoadDependency(
{{"DBImpl::BGWorkFlush",
"DBFlushTest::ManualFlushWithMinWriteBufferNumberToMerge:1"},
{"DBFlushTest::ManualFlushWithMinWriteBufferNumberToMerge:2",
"DBImpl::FlushMemTableToOutputFile:BeforeInstallSV"}});
SyncPoint::GetInstance()->EnableProcessing();
ASSERT_OK(Put("key1", "value1"));
port::Thread t([&]() {
// This call waits for the flush to finish, i.e. with flush_options.wait = true.
ASSERT_OK(Flush());
});
// Wait for flush start.
TEST_SYNC_POINT("DBFlushTest::ManualFlushWithMinWriteBufferNumberToMerge:1");
// Insert a second memtable before the manual flush finishes.
// At the end of the manual flush job, it will check if further flush
// is needed, but it will not trigger flush of the second memtable because
// min_write_buffer_number_to_merge is not reached.
ASSERT_OK(Put("key2", "value2"));
ASSERT_OK(dbfull()->TEST_SwitchMemtable());
TEST_SYNC_POINT("DBFlushTest::ManualFlushWithMinWriteBufferNumberToMerge:2");
// Manual flush should return, without waiting for flush indefinitely.
t.join();
}
TEST_P(DBFlushDirectIOTest, DirectIO) {
Options options;
options.create_if_missing = true;


@ -810,8 +810,12 @@ class DBImpl : public DB {
Status FlushMemTable(ColumnFamilyData* cfd, const FlushOptions& options,
bool writes_stopped = false);
// Wait for memtable flushed
Status WaitForFlushMemTable(ColumnFamilyData* cfd);
// Wait for memtable flushed.
// If flush_memtable_id is non-null, wait until the memtable with that ID
// gets flushed. Otherwise, wait until the column family doesn't have any
// memtable pending flush.
Status WaitForFlushMemTable(ColumnFamilyData* cfd,
const uint64_t* flush_memtable_id = nullptr);
// REQUIRES: mutex locked
Status SwitchWAL(WriteContext* write_context);


@ -134,6 +134,7 @@ Status DBImpl::FlushMemTableToOutputFile(
}
if (s.ok()) {
TEST_SYNC_POINT("DBImpl::FlushMemTableToOutputFile:BeforeInstallSV");
InstallSuperVersionAndScheduleWork(cfd, &job_context->superversion_context,
mutable_cf_options);
if (made_progress) {
@ -809,7 +810,13 @@ int DBImpl::Level0StopWriteTrigger(ColumnFamilyHandle* column_family) {
Status DBImpl::Flush(const FlushOptions& flush_options,
ColumnFamilyHandle* column_family) {
auto cfh = reinterpret_cast<ColumnFamilyHandleImpl*>(column_family);
return FlushMemTable(cfh->cfd(), flush_options);
ROCKS_LOG_INFO(immutable_db_options_.info_log, "[%s] Manual flush start.",
cfh->GetName().c_str());
Status s = FlushMemTable(cfh->cfd(), flush_options);
ROCKS_LOG_INFO(immutable_db_options_.info_log,
"[%s] Manual flush finished, status: %s\n",
cfh->GetName().c_str(), s.ToString().c_str());
return s;
}
Status DBImpl::RunManualCompaction(ColumnFamilyData* cfd, int input_level,
@ -944,6 +951,7 @@ Status DBImpl::FlushMemTable(ColumnFamilyData* cfd,
const FlushOptions& flush_options,
bool writes_stopped) {
Status s;
uint64_t flush_memtable_id = 0;
{
WriteContext context;
InstrumentedMutexLock guard_lock(&mutex_);
@ -961,6 +969,7 @@ Status DBImpl::FlushMemTable(ColumnFamilyData* cfd,
// SwitchMemtable() will release and reacquire mutex during execution
s = SwitchMemtable(cfd, &context);
flush_memtable_id = cfd->imm()->GetLatestMemTableID();
if (!writes_stopped) {
write_thread_.ExitUnbatched(&w);
@ -975,16 +984,19 @@ Status DBImpl::FlushMemTable(ColumnFamilyData* cfd,
if (s.ok() && flush_options.wait) {
// Wait until the flush completes
s = WaitForFlushMemTable(cfd);
s = WaitForFlushMemTable(cfd, &flush_memtable_id);
}
return s;
}
Status DBImpl::WaitForFlushMemTable(ColumnFamilyData* cfd) {
Status DBImpl::WaitForFlushMemTable(ColumnFamilyData* cfd,
const uint64_t* flush_memtable_id) {
Status s;
// Wait until the flush completes
InstrumentedMutexLock l(&mutex_);
while (cfd->imm()->NumNotFlushed() > 0 && bg_error_.ok()) {
while (cfd->imm()->NumNotFlushed() > 0 && bg_error_.ok() &&
(flush_memtable_id == nullptr ||
cfd->imm()->GetEarliestMemTableID() <= *flush_memtable_id)) {
if (shutting_down_.load(std::memory_order_acquire)) {
return Status::ShutdownInProgress();
}
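
The new loop condition in WaitForFlushMemTable keeps waiting only while the earliest unflushed memtable ID is at or below the target ID. A minimal sketch of that predicate, assuming a simplified container (`PendingMemtables` and `MustKeepWaiting` are hypothetical names, an empty list stands in for `NumNotFlushed() == 0`, and the `bg_error_` and shutdown checks are left out):

```cpp
#include <cstdint>
#include <deque>
#include <limits>

// Hypothetical stand-in for MemTableList: newest immutable memtable ID at
// the front, oldest at the back, mirroring the accessors in this change.
struct PendingMemtables {
  std::deque<uint64_t> ids;  // front = latest, back = earliest

  uint64_t Earliest() const {
    return ids.empty() ? std::numeric_limits<uint64_t>::max() : ids.back();
  }
  uint64_t Latest() const { return ids.empty() ? 0 : ids.front(); }
};

// Returns true while a waiter must keep waiting.
// With target_id == nullptr, wait until nothing is pending (old behavior).
// Otherwise, stop as soon as every memtable up to *target_id is flushed.
bool MustKeepWaiting(const PendingMemtables& imm, const uint64_t* target_id) {
  if (imm.ids.empty()) {
    return false;
  }
  return target_id == nullptr || imm.Earliest() <= *target_id;
}
```

This matches the scenario in ManualFlushWithMinWriteBufferNumberToMerge: a second memtable that is switched in during the flush has a larger ID, so it no longer blocks the manual flush from returning.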


@ -217,6 +217,9 @@ class DBIter final: public Iterator {
*prop = "Iterator is not valid.";
}
return Status::OK();
} else if (prop_name == "rocksdb.iterator.internal-key") {
*prop = saved_key_.GetUserKey().ToString();
return Status::OK();
}
return Status::InvalidArgument("Undentified property.");
}


@ -55,6 +55,7 @@ TEST_F(DBIteratorTest, IteratorProperty) {
Options options = CurrentOptions();
CreateAndReopenWithCF({"pikachu"}, options);
Put(1, "1", "2");
Delete(1, "2");
ReadOptions ropt;
ropt.pin_data = false;
{
@ -64,9 +65,15 @@ TEST_F(DBIteratorTest, IteratorProperty) {
ASSERT_NOK(iter->GetProperty("non_existing.value", &prop_value));
ASSERT_OK(iter->GetProperty("rocksdb.iterator.is-key-pinned", &prop_value));
ASSERT_EQ("0", prop_value);
ASSERT_OK(iter->GetProperty("rocksdb.iterator.internal-key", &prop_value));
ASSERT_EQ("1", prop_value);
iter->Next();
ASSERT_OK(iter->GetProperty("rocksdb.iterator.is-key-pinned", &prop_value));
ASSERT_EQ("Iterator is not valid.", prop_value);
// Get internal key at which the iteration stopped (tombstone in this case).
ASSERT_OK(iter->GetProperty("rocksdb.iterator.internal-key", &prop_value));
ASSERT_EQ("2", prop_value);
}
Close();
}
@ -2157,6 +2164,48 @@ TEST_F(DBIteratorTest, SkipStatistics) {
ASSERT_EQ(skip_count, TestGetTickerCount(options, NUMBER_ITER_SKIP));
}
TEST_F(DBIteratorTest, SeekAfterHittingManyInternalKeys) {
Options options = CurrentOptions();
DestroyAndReopen(options);
ReadOptions ropts;
ropts.max_skippable_internal_keys = 2;
Put("1", "val_1");
// Add more tombstones than max_skippable_internal_keys so that Next() fails.
Delete("2");
Delete("3");
Delete("4");
Delete("5");
Put("6", "val_6");
unique_ptr<Iterator> iter(db_->NewIterator(ropts));
iter->SeekToFirst();
ASSERT_TRUE(iter->Valid());
ASSERT_EQ(iter->key().ToString(), "1");
ASSERT_EQ(iter->value().ToString(), "val_1");
// This should fail as incomplete due to too many non-visible internal keys on
// the way to the next valid user key.
iter->Next();
ASSERT_TRUE(!iter->Valid());
ASSERT_TRUE(iter->status().IsIncomplete());
// Get the internal key at which Next() failed.
std::string prop_value;
ASSERT_OK(iter->GetProperty("rocksdb.iterator.internal-key", &prop_value));
ASSERT_EQ("4", prop_value);
// Create a new iterator to seek to the internal key.
unique_ptr<Iterator> iter2(db_->NewIterator(ropts));
iter2->Seek(prop_value);
ASSERT_TRUE(iter2->Valid());
ASSERT_OK(iter2->status());
ASSERT_EQ(iter2->key().ToString(), "6");
ASSERT_EQ(iter2->value().ToString(), "val_6");
}
} // namespace rocksdb
int main(int argc, char** argv) {


@ -5651,6 +5651,50 @@ TEST_F(DBTest, PauseBackgroundWorkTest) {
// now it's done
ASSERT_TRUE(done.load());
}
// Keep spawning short-living threads that create an iterator and quit.
// Meanwhile in another thread keep flushing memtables.
// This used to cause a deadlock.
TEST_F(DBTest, ThreadLocalPtrDeadlock) {
std::atomic<int> flushes_done{0};
std::atomic<int> threads_destroyed{0};
auto done = [&] {
return flushes_done.load() > 10;
};
std::thread flushing_thread([&] {
for (int i = 0; !done(); ++i) {
ASSERT_OK(db_->Put(WriteOptions(), Slice("hi"),
Slice(std::to_string(i).c_str())));
ASSERT_OK(db_->Flush(FlushOptions()));
int cnt = ++flushes_done;
fprintf(stderr, "Flushed %d times\n", cnt);
}
});
std::vector<std::thread> thread_spawning_threads(10);
for (auto& t: thread_spawning_threads) {
t = std::thread([&] {
while (!done()) {
{
std::thread tmp_thread([&] {
auto it = db_->NewIterator(ReadOptions());
delete it;
});
tmp_thread.join();
}
++threads_destroyed;
}
});
}
for (auto& t: thread_spawning_threads) {
t.join();
}
flushing_thread.join();
fprintf(stderr, "Done. Flushed %d times, destroyed %d threads\n",
flushes_done.load(), threads_destroyed.load());
}
} // namespace rocksdb
int main(int argc, char** argv) {


@ -209,6 +209,8 @@ Status FlushJob::Run(FileMetaData* file_meta) {
auto stream = event_logger_->LogToBuffer(log_buffer_);
stream << "job" << job_context_->job_id << "event"
<< "flush_finished";
stream << "output_compression"
<< CompressionTypeToString(output_compression_);
stream << "lsm_state";
stream.StartArray();
auto vstorage = cfd_->current()->storage_info();


@ -54,6 +54,7 @@ class Reader {
// The Reader will start reading at the first record located at physical
// position >= initial_offset within the file.
Reader(std::shared_ptr<Logger> info_log,
// @lint-ignore TXT2 T25377293 Grandfathered in
unique_ptr<SequentialFileReader>&& file,
Reporter* reporter, bool checksum, uint64_t initial_offset,
uint64_t log_num);


@ -368,6 +368,11 @@ class MemTable {
return oldest_key_time_.load(std::memory_order_relaxed);
}
// REQUIRES: db_mutex held.
void SetID(uint64_t id) { id_ = id; }
uint64_t GetID() const { return id_; }
private:
enum FlushStateEnum { FLUSH_NOT_REQUESTED, FLUSH_REQUESTED, FLUSH_SCHEDULED };
@ -437,6 +442,9 @@ class MemTable {
// Timestamp of oldest key
std::atomic<uint64_t> oldest_key_time_;
// Memtable id to track flush.
uint64_t id_ = 0;
// Returns a heuristic flush decision
bool ShouldFlushNow() const;


@ -5,11 +5,12 @@
//
#pragma once
#include <string>
#include <list>
#include <vector>
#include <set>
#include <deque>
#include <limits>
#include <list>
#include <set>
#include <string>
#include <vector>
#include "db/dbformat.h"
#include "db/memtable.h"
@ -244,6 +245,22 @@ class MemTableList {
uint64_t GetMinLogContainingPrepSection();
uint64_t GetEarliestMemTableID() const {
auto& memlist = current_->memlist_;
if (memlist.empty()) {
return std::numeric_limits<uint64_t>::max();
}
return memlist.back()->GetID();
}
uint64_t GetLatestMemTableID() const {
auto& memlist = current_->memlist_;
if (memlist.empty()) {
return 0;
}
return memlist.front()->GetID();
}
private:
// DB mutex held
void InstallNewVersion();


@ -980,10 +980,12 @@ void Version::Get(const ReadOptions& read_options, const LookupKey& k,
storage_info_.num_non_empty_levels_, &storage_info_.file_indexer_,
user_comparator(), internal_comparator());
FdWithKeyRange* f = fp.GetNextFile();
while (f != nullptr) {
if (get_context.sample()) {
sample_file_read_inc(f->file_metadata);
}
*status = table_cache_->Get(
read_options, *internal_comparator(), f->fd, ikey, &get_context,
cfd_->internal_stats()->GetFileReadHist(fp.GetHitFileLevel()),
@ -995,10 +997,21 @@ void Version::Get(const ReadOptions& read_options, const LookupKey& k,
return;
}
// report the counters before returning
if (get_context.State() != GetContext::kNotFound &&
get_context.State() != GetContext::kMerge) {
for (uint32_t t = 0; t < Tickers::TICKER_ENUM_MAX; t++) {
if (get_context.tickers_value[t] > 0) {
RecordTick(db_statistics_, t, get_context.tickers_value[t]);
}
}
}
switch (get_context.State()) {
case GetContext::kNotFound:
// Keep searching in other files
break;
case GetContext::kMerge:
break;
case GetContext::kFound:
if (fp.GetHitFileLevel() == 0) {
RecordTick(db_statistics_, GET_HIT_L0);
@ -1015,8 +1028,6 @@ void Version::Get(const ReadOptions& read_options, const LookupKey& k,
case GetContext::kCorrupt:
*status = Status::Corruption("corrupted key for ", user_key);
return;
case GetContext::kMerge:
break;
case GetContext::kBlobIndex:
ROCKS_LOG_ERROR(info_log_, "Encounter unexpected blob index.");
*status = Status::NotSupported(
@ -1027,6 +1038,11 @@ void Version::Get(const ReadOptions& read_options, const LookupKey& k,
f = fp.GetNextFile();
}
for (uint32_t t = 0; t < Tickers::TICKER_ENUM_MAX; t++) {
if (get_context.tickers_value[t] > 0) {
RecordTick(db_statistics_, t, get_context.tickers_value[t]);
}
}
if (GetContext::kMerge == get_context.State()) {
if (!merge_operator_) {
*status = Status::InvalidArgument(


@ -1,3 +1,4 @@
# shellcheck disable=SC2148
export USE_HDFS=1
export LD_LIBRARY_PATH=$JAVA_HOME/jre/lib/amd64/server:$JAVA_HOME/jre/lib/amd64:/usr/lib/hadoop/lib/native


@ -97,6 +97,9 @@ class Iterator : public Cleanable {
// Property "rocksdb.iterator.super-version-number":
// LSM version used by the iterator. The same format as DB Property
// kCurrentSuperVersionNumber. See its comment for more information.
// Property "rocksdb.iterator.internal-key":
// Get the user-key portion of the internal key at which the iteration
// stopped.
virtual Status GetProperty(std::string prop_name, std::string* prop);
private:


@ -4,6 +4,7 @@
#pragma once
#include <map>
#include <memory>
#include <string>
#include "rocksdb/db.h"
@ -18,11 +19,20 @@ namespace rocksdb {
// This class contains APIs to stack rocksdb wrappers.Eg. Stack TTL over base d
class StackableDB : public DB {
public:
// StackableDB is the owner of db now!
// StackableDB take sole ownership of the underlying db.
explicit StackableDB(DB* db) : db_(db) {}
// StackableDB take shared ownership of the underlying db.
explicit StackableDB(std::shared_ptr<DB> db)
: db_(db.get()), shared_db_ptr_(db) {}
~StackableDB() {
delete db_;
if (shared_db_ptr_ == nullptr) {
delete db_;
} else {
assert(shared_db_ptr_.get() == db_);
}
db_ = nullptr;
}
virtual DB* GetBaseDB() {
@ -373,6 +383,7 @@ class StackableDB : public DB {
protected:
DB* db_;
std::shared_ptr<DB> shared_db_ptr_;
};
} // namespace rocksdb
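
A minimal standalone sketch of the ownership rule this hunk introduces, assuming hypothetical stand-in types `Base` and `Wrapper` (neither is a RocksDB class). An object passed as a raw pointer is deleted by the wrapper, while an object passed as a `std::shared_ptr` is released only when its last shared owner goes away.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-in for the wrapped object. It counts live instances so
// the effect of each ownership mode can be observed.
struct Base {
  static int live;
  Base() { ++live; }
  ~Base() { --live; }
};
int Base::live = 0;

// Hypothetical wrapper mirroring the two constructors in the hunk above.
class Wrapper {
 public:
  // Sole ownership: the wrapper deletes the object in its destructor.
  explicit Wrapper(Base* base) : base_(base) {}
  // Shared ownership: the stored shared_ptr copy keeps the object alive.
  explicit Wrapper(std::shared_ptr<Base> base)
      : base_(base.get()), shared_base_(std::move(base)) {}
  Wrapper(const Wrapper&) = delete;
  Wrapper& operator=(const Wrapper&) = delete;

  ~Wrapper() {
    if (shared_base_ == nullptr) {
      delete base_;  // raw-pointer path: we are the only owner
    } else {
      assert(shared_base_.get() == base_);  // shared path: do not delete
    }
    base_ = nullptr;
  }

 private:
  Base* base_;  // declared first, so it is initialized before the move
  std::shared_ptr<Base> shared_base_;
};

// Exercises both paths and returns the number of Base objects still alive
// just before the local shared_ptr is destroyed.
int DemoOwnership() {
  { Wrapper sole(new Base()); }  // deleted by ~Wrapper
  auto shared = std::make_shared<Base>();
  { Wrapper w(shared); }  // survives ~Wrapper because `shared` still owns it
  return Base::live;
}
```

In this sketch `DemoOwnership()` returns 1, since only the shared object is alive at the `return`; it is released once `shared` goes out of scope.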


@ -5,8 +5,8 @@
#pragma once
#define ROCKSDB_MAJOR 5
#define ROCKSDB_MINOR 9
#define ROCKSDB_PATCH 0
#define ROCKSDB_MINOR 10
#define ROCKSDB_PATCH 5
// Do not use these. We made the mistake of declaring macros starting with
// double underscore. Now we have to live with our choice. We'll deprecate these


@ -7,6 +7,8 @@ set(JNI_NATIVE_SOURCES
rocksjni/clock_cache.cc
rocksjni/columnfamilyhandle.cc
rocksjni/compaction_filter.cc
rocksjni/compaction_filter_factory.cc
rocksjni/compaction_filter_factory_jnicallback.cc
rocksjni/compaction_options_fifo.cc
rocksjni/compaction_options_universal.cc
rocksjni/comparator.cc
@ -17,6 +19,7 @@ set(JNI_NATIVE_SOURCES
rocksjni/filter.cc
rocksjni/ingest_external_file_options.cc
rocksjni/iterator.cc
rocksjni/jnicallback.cc
rocksjni/loggerjnicallback.cc
rocksjni/lru_cache.cc
rocksjni/memtablejni.cc
@ -25,6 +28,7 @@ set(JNI_NATIVE_SOURCES
rocksjni/options_util.cc
rocksjni/ratelimiterjni.cc
rocksjni/remove_emptyvalue_compactionfilterjni.cc
rocksjni/rocks_callback_object.cc
rocksjni/cassandra_compactionfilterjni.cc
rocksjni/cassandra_value_operator.cc
rocksjni/restorejni.cc


@ -1,3 +1,4 @@
# shellcheck disable=SC2148
PLATFORM=64
if [ `getconf LONG_BIT` != "64" ]
then
@ -7,4 +8,5 @@ fi
ROCKS_JAR=`find target -name rocksdbjni*.jar`
echo "Running benchmark in $PLATFORM-Bit mode."
# shellcheck disable=SC2068
java -server -d$PLATFORM -XX:NewSize=4m -XX:+AggressiveOpts -Djava.library.path=target -cp "${ROCKS_JAR}:benchmark/target/classes" org.rocksdb.benchmark.DbBenchmark $@


@ -34,4 +34,5 @@ void Java_org_rocksdb_AbstractCompactionFilterFactory_disposeInternal(
auto* ptr_sptr_cff =
reinterpret_cast<std::shared_ptr<rocksdb::CompactionFilterFactoryJniCallback> *>(jhandle);
delete ptr_sptr_cff;
// @lint-ignore TXT4 T25377293 Grandfathered in
}


@ -146,4 +146,5 @@ void Java_org_rocksdb_IngestExternalFileOptions_disposeInternal(
auto* options =
reinterpret_cast<rocksdb::IngestExternalFileOptions*>(jhandle);
delete options;
// @lint-ignore TXT4 T25377293 Grandfathered in
}


@ -49,4 +49,5 @@ JniCallback::~JniCallback() {
releaseJniEnv(attached_thread);
}
// @lint-ignore TXT4 T25377293 Grandfathered in
} // namespace rocksdb


@ -25,4 +25,5 @@ namespace rocksdb {
};
}
// @lint-ignore TXT4 T25377293 Grandfathered in
#endif // JAVA_ROCKSJNI_JNICALLBACK_H_


@ -24,4 +24,5 @@ void Java_org_rocksdb_RocksCallbackObject_disposeInternal(
// 2) Comparator -> BaseComparatorJniCallback + JniCallback -> ComparatorJniCallback
// I think this is okay, as Comparator and JniCallback both have virtual destructors...
delete reinterpret_cast<rocksdb::JniCallback*>(handle);
// @lint-ignore TXT4 T25377293 Grandfathered in
}


@ -30,4 +30,5 @@ namespace rocksdb {
return true;
}
// @lint-ignore TXT4 T25377293 Grandfathered in
};


@ -30,4 +30,5 @@ namespace rocksdb {
} // namespace rocksdb
// @lint-ignore TXT4 T25377293 Grandfathered in
#endif // JAVA_ROCKSJNI_STATISTICSJNI_H_


@ -12,6 +12,10 @@ public class Environment {
return (OS.contains("win"));
}
public static boolean isFreeBSD() {
return (OS.contains("freebsd"));
}
public static boolean isMac() {
return (OS.contains("mac"));
}
@ -54,6 +58,8 @@ public class Environment {
}
} else if (isMac()) {
return String.format("%sjni-osx", name);
} else if (isFreeBSD()) {
return String.format("%sjni-freebsd%s", name, is64Bit() ? "64" : "32");
} else if (isAix() && is64Bit()) {
return String.format("%sjni-aix64", name);
} else if (isSolaris()) {
@ -71,7 +77,7 @@ public class Environment {
}
private static String appendLibOsSuffix(final String libraryFileName, final boolean shared) {
if (isUnix() || isAix() || isSolaris()) {
if (isUnix() || isAix() || isSolaris() || isFreeBSD()) {
return libraryFileName + ".so";
} else if (isMac()) {
return libraryFileName + (shared ? ".dylib" : ".jnilib");


@ -279,7 +279,7 @@ struct InlineSkipList<Comparator>::Node {
// next_[0]. This is used for passing data from AllocateKey to Insert.
void StashHeight(const int height) {
assert(sizeof(int) <= sizeof(next_[0]));
memcpy(&next_[0], &height, sizeof(int));
memcpy(static_cast<void*>(&next_[0]), &height, sizeof(int));
}
// Retrieves the value passed to StashHeight. Undefined after a call
@ -299,30 +299,30 @@ struct InlineSkipList<Comparator>::Node {
assert(n >= 0);
// Use an 'acquire load' so that we observe a fully initialized
// version of the returned Node.
return (next_[-n].load(std::memory_order_acquire));
return ((&next_[0] - n)->load(std::memory_order_acquire));
}
void SetNext(int n, Node* x) {
assert(n >= 0);
// Use a 'release store' so that anybody who reads through this
// pointer observes a fully initialized version of the inserted node.
next_[-n].store(x, std::memory_order_release);
(&next_[0] - n)->store(x, std::memory_order_release);
}
bool CASNext(int n, Node* expected, Node* x) {
assert(n >= 0);
return next_[-n].compare_exchange_strong(expected, x);
return (&next_[0] - n)->compare_exchange_strong(expected, x);
}
// No-barrier variants that can be safely used in a few locations.
Node* NoBarrier_Next(int n) {
assert(n >= 0);
return next_[-n].load(std::memory_order_relaxed);
return (&next_[0] - n)->load(std::memory_order_relaxed);
}
void NoBarrier_SetNext(int n, Node* x) {
assert(n >= 0);
next_[-n].store(x, std::memory_order_relaxed);
(&next_[0] - n)->store(x, std::memory_order_relaxed);
}
// Insert node after prev on specific level.
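
The accessor changes above swap a negative subscript (`next_[-n]`) for explicit pointer arithmetic (`&next_[0] - n`), which addresses the same slot without triggering the gcc-8 array-bounds warning named in the commit message. The following is a simplified sketch of that addressing scheme, not the skip list's real memory layout. The `Links` struct, `kMaxHeight`, and the use of `int` payloads are assumptions made for illustration.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical illustration: all levels live in one array and the level-0
// slot is stored last, so level n sits n slots *before* level 0.
constexpr int kMaxHeight = 4;

struct Links {
  std::atomic<int> slots[kMaxHeight];

  // Pointer to the level-0 slot (the last element of the array).
  std::atomic<int>* Level0() { return &slots[kMaxHeight - 1]; }

  // Reach level n by subtracting from the level-0 pointer instead of
  // writing a negative array subscript.
  void Set(int n, int v) {
    assert(n >= 0 && n < kMaxHeight);
    (Level0() - n)->store(v, std::memory_order_release);
  }
  int Get(int n) {
    assert(n >= 0 && n < kMaxHeight);
    return (Level0() - n)->load(std::memory_order_acquire);
  }
};

// Fills every level and returns the value stored at level 2.
int DemoLinks() {
  Links l;
  for (int n = 0; n < kMaxHeight; ++n) {
    l.Set(n, 10 + n);
  }
  // Level 0 is the last slot; the highest level is the first slot.
  assert(l.slots[kMaxHeight - 1].load() == 10);
  assert(l.slots[0].load() == 10 + kMaxHeight - 1);
  return l.Get(2);
}
```

Because every access stays inside `slots`, the pointer arithmetic here is well defined. The `StashHeight` change in the same hunk (casting the `memcpy` destination to `void*`) is left out of the sketch on purpose.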


@ -32,7 +32,7 @@ namespace port {
namespace {
#ifdef OS_LINUX
#if defined(OS_LINUX) || defined(OS_FREEBSD)
const char* GetExecutableName() {
static char name[1024];


@ -527,11 +527,11 @@ void BlockBasedTableBuilder::WriteBlock(const Slice& raw_block_contents,
RecordTick(r->ioptions.statistics, NUMBER_BLOCK_NOT_COMPRESSED);
type = kNoCompression;
block_contents = raw_block_contents;
} else if (type != kNoCompression &&
ShouldReportDetailedTime(r->ioptions.env,
r->ioptions.statistics)) {
MeasureTime(r->ioptions.statistics, COMPRESSION_TIMES_NANOS,
timer.ElapsedNanos());
} else if (type != kNoCompression) {
if (ShouldReportDetailedTime(r->ioptions.env, r->ioptions.statistics)) {
MeasureTime(r->ioptions.statistics, COMPRESSION_TIMES_NANOS,
timer.ElapsedNanos());
}
MeasureTime(r->ioptions.statistics, BYTES_COMPRESSED,
raw_block_contents.size());
RecordTick(r->ioptions.statistics, NUMBER_BLOCK_COMPRESSED);


@ -126,22 +126,37 @@ Slice GetCacheKeyFromOffset(const char* cache_key_prefix,
Cache::Handle* GetEntryFromCache(Cache* block_cache, const Slice& key,
Tickers block_cache_miss_ticker,
Tickers block_cache_hit_ticker,
Statistics* statistics) {
Statistics* statistics,
GetContext* get_context) {
auto cache_handle = block_cache->Lookup(key, statistics);
if (cache_handle != nullptr) {
PERF_COUNTER_ADD(block_cache_hit_count, 1);
// overall cache hit
RecordTick(statistics, BLOCK_CACHE_HIT);
// total bytes read from cache
RecordTick(statistics, BLOCK_CACHE_BYTES_READ,
block_cache->GetUsage(cache_handle));
// block-type specific cache hit
RecordTick(statistics, block_cache_hit_ticker);
if (get_context != nullptr) {
// overall cache hit
get_context->RecordCounters(BLOCK_CACHE_HIT, 1);
// total bytes read from cache
get_context->RecordCounters(BLOCK_CACHE_BYTES_READ,
block_cache->GetUsage(cache_handle));
// block-type specific cache hit
get_context->RecordCounters(block_cache_hit_ticker, 1);
} else {
// overall cache hit
RecordTick(statistics, BLOCK_CACHE_HIT);
// total bytes read from cache
RecordTick(statistics, BLOCK_CACHE_BYTES_READ,
block_cache->GetUsage(cache_handle));
RecordTick(statistics, block_cache_hit_ticker);
}
} else {
// overall cache miss
RecordTick(statistics, BLOCK_CACHE_MISS);
// block-type specific cache miss
RecordTick(statistics, block_cache_miss_ticker);
if (get_context != nullptr) {
// overall cache miss
get_context->RecordCounters(BLOCK_CACHE_MISS, 1);
// block-type specific cache miss
get_context->RecordCounters(block_cache_miss_ticker, 1);
} else {
RecordTick(statistics, BLOCK_CACHE_MISS);
RecordTick(statistics, block_cache_miss_ticker);
}
}
return cache_handle;
@ -253,9 +268,11 @@ class PartitionIndexReader : public IndexReader, public Cleanable {
compression_dict = rep->compression_dict_block->data;
}
const bool is_index = true;
s = table_->MaybeLoadDataBlockToCache(prefetch_buffer.get(), rep, ro,
handle, compression_dict, &block,
is_index);
// TODO: Support counter batch update for partitioned index and
// filter blocks
s = table_->MaybeLoadDataBlockToCache(
prefetch_buffer.get(), rep, ro, handle, compression_dict, &block,
is_index, nullptr /* get_context */);
assert(s.ok() || block.value == nullptr);
if (s.ok() && block.value != nullptr) {
@ -779,7 +796,8 @@ Status BlockBasedTable::Open(const ImmutableCFOptions& ioptions,
ReadOptions read_options;
s = MaybeLoadDataBlockToCache(
prefetch_buffer.get(), rep, read_options, rep->range_del_handle,
Slice() /* compression_dict */, &rep->range_del_entry);
Slice() /* compression_dict */, &rep->range_del_entry,
false /* is_index */, nullptr /* get_context */);
if (!s.ok()) {
ROCKS_LOG_WARN(
rep->ioptions.info_log,
@ -955,8 +973,8 @@ Status BlockBasedTable::GetDataBlockFromCache(
Cache* block_cache, Cache* block_cache_compressed,
const ImmutableCFOptions& ioptions, const ReadOptions& read_options,
BlockBasedTable::CachableEntry<Block>* block, uint32_t format_version,
const Slice& compression_dict, size_t read_amp_bytes_per_bit,
bool is_index) {
const Slice& compression_dict, size_t read_amp_bytes_per_bit, bool is_index,
GetContext* get_context) {
Status s;
Block* compressed_block = nullptr;
Cache::Handle* block_cache_compressed_handle = nullptr;
@ -967,7 +985,8 @@ Status BlockBasedTable::GetDataBlockFromCache(
block->cache_handle = GetEntryFromCache(
block_cache, block_cache_key,
is_index ? BLOCK_CACHE_INDEX_MISS : BLOCK_CACHE_DATA_MISS,
is_index ? BLOCK_CACHE_INDEX_HIT : BLOCK_CACHE_DATA_HIT, statistics);
is_index ? BLOCK_CACHE_INDEX_HIT : BLOCK_CACHE_DATA_HIT, statistics,
get_context);
if (block->cache_handle != nullptr) {
block->value =
reinterpret_cast<Block*>(block_cache->Value(block->cache_handle));
@ -1020,18 +1039,36 @@ Status BlockBasedTable::GetDataBlockFromCache(
block_cache->TEST_mark_as_data_block(block_cache_key,
block->value->usable_size());
if (s.ok()) {
RecordTick(statistics, BLOCK_CACHE_ADD);
if (is_index) {
RecordTick(statistics, BLOCK_CACHE_INDEX_ADD);
RecordTick(statistics, BLOCK_CACHE_INDEX_BYTES_INSERT,
block->value->usable_size());
if (get_context != nullptr) {
get_context->RecordCounters(BLOCK_CACHE_ADD, 1);
get_context->RecordCounters(BLOCK_CACHE_BYTES_WRITE,
block->value->usable_size());
} else {
RecordTick(statistics, BLOCK_CACHE_DATA_ADD);
RecordTick(statistics, BLOCK_CACHE_DATA_BYTES_INSERT,
RecordTick(statistics, BLOCK_CACHE_ADD);
RecordTick(statistics, BLOCK_CACHE_BYTES_WRITE,
block->value->usable_size());
}
RecordTick(statistics, BLOCK_CACHE_BYTES_WRITE,
block->value->usable_size());
if (is_index) {
if (get_context != nullptr) {
get_context->RecordCounters(BLOCK_CACHE_INDEX_ADD, 1);
get_context->RecordCounters(BLOCK_CACHE_INDEX_BYTES_INSERT,
block->value->usable_size());
} else {
RecordTick(statistics, BLOCK_CACHE_INDEX_ADD);
RecordTick(statistics, BLOCK_CACHE_INDEX_BYTES_INSERT,
block->value->usable_size());
}
} else {
if (get_context != nullptr) {
get_context->RecordCounters(BLOCK_CACHE_DATA_ADD, 1);
get_context->RecordCounters(BLOCK_CACHE_DATA_BYTES_INSERT,
block->value->usable_size());
} else {
RecordTick(statistics, BLOCK_CACHE_DATA_ADD);
RecordTick(statistics, BLOCK_CACHE_DATA_BYTES_INSERT,
block->value->usable_size());
}
}
} else {
RecordTick(statistics, BLOCK_CACHE_ADD_FAILURES);
delete block->value;
@ -1051,7 +1088,7 @@ Status BlockBasedTable::PutDataBlockToCache(
const ReadOptions& read_options, const ImmutableCFOptions& ioptions,
CachableEntry<Block>* block, Block* raw_block, uint32_t format_version,
const Slice& compression_dict, size_t read_amp_bytes_per_bit, bool is_index,
Cache::Priority priority) {
Cache::Priority priority, GetContext* get_context) {
assert(raw_block->compression_type() == kNoCompression ||
block_cache_compressed != nullptr);
@ -1104,18 +1141,36 @@ Status BlockBasedTable::PutDataBlockToCache(
block->value->usable_size());
if (s.ok()) {
assert(block->cache_handle != nullptr);
RecordTick(statistics, BLOCK_CACHE_ADD);
if (is_index) {
RecordTick(statistics, BLOCK_CACHE_INDEX_ADD);
RecordTick(statistics, BLOCK_CACHE_INDEX_BYTES_INSERT,
block->value->usable_size());
if (get_context != nullptr) {
get_context->RecordCounters(BLOCK_CACHE_ADD, 1);
get_context->RecordCounters(BLOCK_CACHE_BYTES_WRITE,
block->value->usable_size());
} else {
RecordTick(statistics, BLOCK_CACHE_DATA_ADD);
RecordTick(statistics, BLOCK_CACHE_DATA_BYTES_INSERT,
RecordTick(statistics, BLOCK_CACHE_ADD);
RecordTick(statistics, BLOCK_CACHE_BYTES_WRITE,
block->value->usable_size());
}
RecordTick(statistics, BLOCK_CACHE_BYTES_WRITE,
block->value->usable_size());
if (is_index) {
if (get_context != nullptr) {
get_context->RecordCounters(BLOCK_CACHE_INDEX_ADD, 1);
get_context->RecordCounters(BLOCK_CACHE_INDEX_BYTES_INSERT,
block->value->usable_size());
} else {
RecordTick(statistics, BLOCK_CACHE_INDEX_ADD);
RecordTick(statistics, BLOCK_CACHE_INDEX_BYTES_INSERT,
block->value->usable_size());
}
} else {
if (get_context != nullptr) {
get_context->RecordCounters(BLOCK_CACHE_DATA_ADD, 1);
get_context->RecordCounters(BLOCK_CACHE_DATA_BYTES_INSERT,
block->value->usable_size());
} else {
RecordTick(statistics, BLOCK_CACHE_DATA_ADD);
RecordTick(statistics, BLOCK_CACHE_DATA_BYTES_INSERT,
block->value->usable_size());
}
}
assert(reinterpret_cast<Block*>(
block_cache->Value(block->cache_handle)) == block->value);
} else {
@ -1188,16 +1243,18 @@ FilterBlockReader* BlockBasedTable::ReadFilter(
}
BlockBasedTable::CachableEntry<FilterBlockReader> BlockBasedTable::GetFilter(
FilePrefetchBuffer* prefetch_buffer, bool no_io) const {
FilePrefetchBuffer* prefetch_buffer, bool no_io,
GetContext* get_context) const {
const BlockHandle& filter_blk_handle = rep_->filter_handle;
const bool is_a_filter_partition = true;
return GetFilter(prefetch_buffer, filter_blk_handle, !is_a_filter_partition,
no_io);
no_io, get_context);
}
BlockBasedTable::CachableEntry<FilterBlockReader> BlockBasedTable::GetFilter(
FilePrefetchBuffer* prefetch_buffer, const BlockHandle& filter_blk_handle,
const bool is_a_filter_partition, bool no_io) const {
const bool is_a_filter_partition, bool no_io,
GetContext* get_context) const {
// If cache_index_and_filter_blocks is false, filter should be pre-populated.
// We will return rep_->filter anyway. rep_->filter can be nullptr if filter
// read fails at Open() time. We don't want to reload again since it will
@ -1227,7 +1284,7 @@ BlockBasedTable::CachableEntry<FilterBlockReader> BlockBasedTable::GetFilter(
Statistics* statistics = rep_->ioptions.statistics;
auto cache_handle =
GetEntryFromCache(block_cache, key, BLOCK_CACHE_FILTER_MISS,
BLOCK_CACHE_FILTER_HIT, statistics);
BLOCK_CACHE_FILTER_HIT, statistics, get_context);
FilterBlockReader* filter = nullptr;
if (cache_handle != nullptr) {
@ -1247,10 +1304,19 @@ BlockBasedTable::CachableEntry<FilterBlockReader> BlockBasedTable::GetFilter(
? Cache::Priority::HIGH
: Cache::Priority::LOW);
if (s.ok()) {
RecordTick(statistics, BLOCK_CACHE_ADD);
RecordTick(statistics, BLOCK_CACHE_FILTER_ADD);
RecordTick(statistics, BLOCK_CACHE_FILTER_BYTES_INSERT, filter->size());
RecordTick(statistics, BLOCK_CACHE_BYTES_WRITE, filter->size());
if (get_context != nullptr) {
get_context->RecordCounters(BLOCK_CACHE_ADD, 1);
get_context->RecordCounters(BLOCK_CACHE_BYTES_WRITE, filter->size());
get_context->RecordCounters(BLOCK_CACHE_FILTER_ADD, 1);
get_context->RecordCounters(BLOCK_CACHE_FILTER_BYTES_INSERT,
filter->size());
} else {
RecordTick(statistics, BLOCK_CACHE_ADD);
RecordTick(statistics, BLOCK_CACHE_BYTES_WRITE, filter->size());
RecordTick(statistics, BLOCK_CACHE_FILTER_ADD);
RecordTick(statistics, BLOCK_CACHE_FILTER_BYTES_INSERT,
filter->size());
}
} else {
RecordTick(statistics, BLOCK_CACHE_ADD_FAILURES);
delete filter;
@ -1264,7 +1330,7 @@ BlockBasedTable::CachableEntry<FilterBlockReader> BlockBasedTable::GetFilter(
InternalIterator* BlockBasedTable::NewIndexIterator(
const ReadOptions& read_options, BlockIter* input_iter,
CachableEntry<IndexReader>* index_entry) {
CachableEntry<IndexReader>* index_entry, GetContext* get_context) {
// index reader has already been pre-populated.
if (rep_->index_reader) {
return rep_->index_reader->NewIterator(
@ -1287,7 +1353,7 @@ InternalIterator* BlockBasedTable::NewIndexIterator(
Statistics* statistics = rep_->ioptions.statistics;
auto cache_handle =
GetEntryFromCache(block_cache, key, BLOCK_CACHE_INDEX_MISS,
BLOCK_CACHE_INDEX_HIT, statistics);
BLOCK_CACHE_INDEX_HIT, statistics, get_context);
if (cache_handle == nullptr && no_io) {
if (input_iter != nullptr) {
@ -1322,10 +1388,15 @@ InternalIterator* BlockBasedTable::NewIndexIterator(
if (s.ok()) {
size_t usable_size = index_reader->usable_size();
RecordTick(statistics, BLOCK_CACHE_ADD);
if (get_context != nullptr) {
get_context->RecordCounters(BLOCK_CACHE_ADD, 1);
get_context->RecordCounters(BLOCK_CACHE_BYTES_WRITE, usable_size);
} else {
RecordTick(statistics, BLOCK_CACHE_ADD);
RecordTick(statistics, BLOCK_CACHE_BYTES_WRITE, usable_size);
}
RecordTick(statistics, BLOCK_CACHE_INDEX_ADD);
RecordTick(statistics, BLOCK_CACHE_INDEX_BYTES_INSERT, usable_size);
RecordTick(statistics, BLOCK_CACHE_BYTES_WRITE, usable_size);
} else {
if (index_reader != nullptr) {
delete index_reader;
@ -1359,13 +1430,14 @@ InternalIterator* BlockBasedTable::NewIndexIterator(
InternalIterator* BlockBasedTable::NewDataBlockIterator(
Rep* rep, const ReadOptions& ro, const Slice& index_value,
BlockIter* input_iter, bool is_index) {
BlockIter* input_iter, bool is_index, GetContext* get_context) {
BlockHandle handle;
Slice input = index_value;
// We intentionally allow extra stuff in index_value so that we
// can add more features in the future.
Status s = handle.DecodeFrom(&input);
return NewDataBlockIterator(rep, ro, handle, input_iter, is_index, s);
return NewDataBlockIterator(rep, ro, handle, input_iter, is_index,
get_context, s);
}
// Convert an index iterator value (i.e., an encoded BlockHandle)
@ -1374,7 +1446,7 @@ InternalIterator* BlockBasedTable::NewDataBlockIterator(
// If input_iter is not null, update this iter and return it
InternalIterator* BlockBasedTable::NewDataBlockIterator(
Rep* rep, const ReadOptions& ro, const BlockHandle& handle,
BlockIter* input_iter, bool is_index, Status s) {
BlockIter* input_iter, bool is_index, GetContext* get_context, Status s) {
PERF_TIMER_GUARD(new_table_block_iter_nanos);
const bool no_io = (ro.read_tier == kBlockCacheTier);
@ -1386,7 +1458,8 @@ InternalIterator* BlockBasedTable::NewDataBlockIterator(
compression_dict = rep->compression_dict_block->data;
}
s = MaybeLoadDataBlockToCache(nullptr /*prefetch_buffer*/, rep, ro, handle,
compression_dict, &block, is_index);
compression_dict, &block, is_index,
get_context);
}
// Didn't get any data from block caches.
@ -1437,7 +1510,7 @@ InternalIterator* BlockBasedTable::NewDataBlockIterator(
Status BlockBasedTable::MaybeLoadDataBlockToCache(
FilePrefetchBuffer* prefetch_buffer, Rep* rep, const ReadOptions& ro,
const BlockHandle& handle, Slice compression_dict,
CachableEntry<Block>* block_entry, bool is_index) {
CachableEntry<Block>* block_entry, bool is_index, GetContext* get_context) {
assert(block_entry != nullptr);
const bool no_io = (ro.read_tier == kBlockCacheTier);
Cache* block_cache = rep->table_options.block_cache.get();
@ -1468,7 +1541,7 @@ Status BlockBasedTable::MaybeLoadDataBlockToCache(
s = GetDataBlockFromCache(
key, ckey, block_cache, block_cache_compressed, rep->ioptions, ro,
block_entry, rep->table_options.format_version, compression_dict,
rep->table_options.read_amp_bytes_per_bit, is_index);
rep->table_options.read_amp_bytes_per_bit, is_index, get_context);
if (block_entry->value == nullptr && !no_io && ro.fill_cache) {
std::unique_ptr<Block> raw_block;
@ -1487,11 +1560,11 @@ Status BlockBasedTable::MaybeLoadDataBlockToCache(
block_entry, raw_block.release(), rep->table_options.format_version,
compression_dict, rep->table_options.read_amp_bytes_per_bit,
is_index,
is_index &&
rep->table_options
.cache_index_and_filter_blocks_with_high_priority
is_index && rep->table_options
.cache_index_and_filter_blocks_with_high_priority
? Cache::Priority::HIGH
: Cache::Priority::LOW);
: Cache::Priority::LOW,
get_context);
}
}
}
@ -1535,8 +1608,9 @@ BlockBasedTable::BlockEntryIteratorState::NewSecondaryIterator(
&rep->internal_comparator, nullptr, true, rep->ioptions.statistics);
}
}
return NewDataBlockIterator(rep, read_options_, handle, nullptr, is_index_,
s);
return NewDataBlockIterator(rep, read_options_, handle,
/* input_iter */ nullptr, is_index_,
/* get_context */ nullptr, s);
}
bool BlockBasedTable::BlockEntryIteratorState::PrefixMayMatch(
@ -1730,8 +1804,9 @@ Status BlockBasedTable::Get(const ReadOptions& read_options, const Slice& key,
const bool no_io = read_options.read_tier == kBlockCacheTier;
CachableEntry<FilterBlockReader> filter_entry;
if (!skip_filters) {
filter_entry = GetFilter(/*prefetch_buffer*/ nullptr,
read_options.read_tier == kBlockCacheTier);
filter_entry =
GetFilter(/*prefetch_buffer*/ nullptr,
read_options.read_tier == kBlockCacheTier, get_context);
}
FilterBlockReader* filter = filter_entry.value;
@ -1741,7 +1816,8 @@ Status BlockBasedTable::Get(const ReadOptions& read_options, const Slice& key,
RecordTick(rep_->ioptions.statistics, BLOOM_FILTER_USEFUL);
} else {
BlockIter iiter_on_stack;
auto iiter = NewIndexIterator(read_options, &iiter_on_stack);
auto iiter = NewIndexIterator(read_options, &iiter_on_stack,
/* index_entry */ nullptr, get_context);
std::unique_ptr<InternalIterator> iiter_unique_ptr;
if (iiter != &iiter_on_stack) {
iiter_unique_ptr.reset(iiter);
@ -1765,7 +1841,8 @@ Status BlockBasedTable::Get(const ReadOptions& read_options, const Slice& key,
break;
} else {
BlockIter biter;
NewDataBlockIterator(rep_, read_options, iiter->value(), &biter);
NewDataBlockIterator(rep_, read_options, iiter->value(), &biter, false,
get_context);
if (read_options.read_tier == kBlockCacheTier &&
biter.status().IsIncomplete()) {


@ -215,15 +215,14 @@ class BlockBasedTable : public TableReader {
private:
friend class MockedBlockBasedTable;
// input_iter: if it is not null, update this one and return it as Iterator
static InternalIterator* NewDataBlockIterator(Rep* rep, const ReadOptions& ro,
const Slice& index_value,
BlockIter* input_iter = nullptr,
bool is_index = false);
static InternalIterator* NewDataBlockIterator(Rep* rep, const ReadOptions& ro,
const BlockHandle& block_hanlde,
BlockIter* input_iter = nullptr,
bool is_index = false,
Status s = Status());
static InternalIterator* NewDataBlockIterator(
Rep* rep, const ReadOptions& ro, const Slice& index_value,
BlockIter* input_iter = nullptr, bool is_index = false,
GetContext* get_context = nullptr);
static InternalIterator* NewDataBlockIterator(
Rep* rep, const ReadOptions& ro, const BlockHandle& block_hanlde,
BlockIter* input_iter = nullptr, bool is_index = false,
GetContext* get_context = nullptr, Status s = Status());
// If block cache enabled (compressed or uncompressed), looks for the block
// identified by handle in (1) uncompressed cache, (2) compressed cache, and
// then (3) file. If found, inserts into the cache(s) that were searched
@ -238,16 +237,19 @@ class BlockBasedTable : public TableReader {
const BlockHandle& handle,
Slice compression_dict,
CachableEntry<Block>* block_entry,
bool is_index = false);
bool is_index = false,
GetContext* get_context = nullptr);
// For the following two functions:
// if `no_io == true`, we will not try to read filter/index from sst file
// were they not present in cache yet.
CachableEntry<FilterBlockReader> GetFilter(
FilePrefetchBuffer* prefetch_buffer = nullptr, bool no_io = false) const;
FilePrefetchBuffer* prefetch_buffer = nullptr, bool no_io = false,
GetContext* get_context = nullptr) const;
virtual CachableEntry<FilterBlockReader> GetFilter(
FilePrefetchBuffer* prefetch_buffer, const BlockHandle& filter_blk_handle,
const bool is_a_filter_partition, bool no_io) const;
const bool is_a_filter_partition, bool no_io,
GetContext* get_context) const;
// Get the iterator from the index reader.
// If input_iter is not set, return new Iterator
@ -261,7 +263,8 @@ class BlockBasedTable : public TableReader {
// kBlockCacheTier
InternalIterator* NewIndexIterator(
const ReadOptions& read_options, BlockIter* input_iter = nullptr,
CachableEntry<IndexReader>* index_entry = nullptr);
CachableEntry<IndexReader>* index_entry = nullptr,
GetContext* get_context = nullptr);
// Read block cache from block caches (if set): block_cache and
// block_cache_compressed.
@ -275,7 +278,7 @@ class BlockBasedTable : public TableReader {
const ImmutableCFOptions& ioptions, const ReadOptions& read_options,
BlockBasedTable::CachableEntry<Block>* block, uint32_t format_version,
const Slice& compression_dict, size_t read_amp_bytes_per_bit,
bool is_index = false);
bool is_index = false, GetContext* get_context = nullptr);
// Put a raw block (maybe compressed) to the corresponding block caches.
// This method will perform decompression against raw_block if needed and then
@ -293,7 +296,8 @@ class BlockBasedTable : public TableReader {
const ReadOptions& read_options, const ImmutableCFOptions& ioptions,
CachableEntry<Block>* block, Block* raw_block, uint32_t format_version,
const Slice& compression_dict, size_t read_amp_bytes_per_bit,
bool is_index = false, Cache::Priority pri = Cache::Priority::LOW);
bool is_index = false, Cache::Priority pri = Cache::Priority::LOW,
GetContext* get_context = nullptr);
// Calls (*handle_result)(arg, ...) repeatedly, starting with the entry found
// after a call to Seek(key), until handle_result returns false.


@ -49,6 +49,7 @@ class CuckooBuilderTest : public testing::Test {
uint64_t read_file_size;
ASSERT_OK(env_->GetFileSize(fname, &read_file_size));
// @lint-ignore TXT2 T25377293 Grandfathered in
Options options;
options.allow_mmap_reads = true;
ImmutableCFOptions ioptions(options);


@ -568,9 +568,9 @@ Status UncompressBlockContentsForCompressionType(
if(ShouldReportDetailedTime(ioptions.env, ioptions.statistics)){
MeasureTime(ioptions.statistics, DECOMPRESSION_TIMES_NANOS,
timer.ElapsedNanos());
MeasureTime(ioptions.statistics, BYTES_DECOMPRESSED, contents->data.size());
RecordTick(ioptions.statistics, NUMBER_BLOCK_DECOMPRESSED);
}
MeasureTime(ioptions.statistics, BYTES_DECOMPRESSED, contents->data.size());
RecordTick(ioptions.statistics, NUMBER_BLOCK_DECOMPRESSED);
return Status::OK();
}


@ -87,6 +87,13 @@ void GetContext::SaveValue(const Slice& value, SequenceNumber seq) {
}
}
void GetContext::RecordCounters(Tickers ticker, size_t val) {
if (ticker == Tickers::TICKER_ENUM_MAX) {
return;
}
tickers_value[ticker] += static_cast<uint64_t>(val);
}
bool GetContext::SaveValue(const ParsedInternalKey& parsed_key,
const Slice& value, Cleanable* value_pinner) {
assert((state_ != kMerge && parsed_key.type != kTypeMerge) ||


@ -9,6 +9,7 @@
#include "db/range_del_aggregator.h"
#include "db/read_callback.h"
#include "rocksdb/env.h"
#include "rocksdb/statistics.h"
#include "rocksdb/types.h"
#include "table/block.h"
@ -26,6 +27,7 @@ class GetContext {
kMerge, // saver contains the current merge result (the operands)
kBlobIndex,
};
uint64_t tickers_value[Tickers::TICKER_ENUM_MAX] = {0};
GetContext(const Comparator* ucmp, const MergeOperator* merge_operator,
Logger* logger, Statistics* statistics, GetState init_state,
@ -72,6 +74,8 @@ class GetContext {
return true;
}
void RecordCounters(Tickers ticker, size_t val);
private:
const Comparator* ucmp_;
const MergeOperator* merge_operator_;
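
The `tickers_value` array and `RecordCounters` above let a single `Get` accumulate block-cache counters locally and report them to the shared statistics object in one pass, as the loops added to `Version::Get` do. A rough standalone sketch of that batching idea follows. The names `Ticker`, `SharedStats`, and `RequestCounters` are hypothetical, and resetting the local counters after a flush is an assumption of this sketch, not something the diff does.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical ticker ids; kTickerMax plays the role of TICKER_ENUM_MAX.
enum Ticker : uint32_t { kCacheHit = 0, kCacheMiss, kBytesRead, kTickerMax };

// Shared across threads, so every update is an atomic read-modify-write.
struct SharedStats {
  std::array<std::atomic<uint64_t>, kTickerMax> values{};
  void RecordTick(uint32_t ticker, uint64_t count) {
    values[ticker].fetch_add(count, std::memory_order_relaxed);
  }
};

// Owned by one request, so plain integers are enough.
struct RequestCounters {
  uint64_t values[kTickerMax] = {0};

  void Record(Ticker ticker, size_t val) {
    if (ticker == kTickerMax) {
      return;  // sentinel value: nothing to record
    }
    values[ticker] += static_cast<uint64_t>(val);
  }

  // Publishes each non-zero counter with a single shared update and returns
  // how many shared updates were issued. Resetting is this sketch's choice.
  int FlushTo(SharedStats* stats) {
    int updates = 0;
    for (uint32_t t = 0; t < kTickerMax; ++t) {
      if (values[t] > 0) {
        stats->RecordTick(t, values[t]);
        values[t] = 0;
        ++updates;
      }
    }
    return updates;
  }
};

// Records three events locally, flushes once, and returns the shared
// bytes-read total.
uint64_t DemoBatching() {
  SharedStats stats;
  RequestCounters ctx;
  ctx.Record(kCacheHit, 1);
  ctx.Record(kCacheHit, 1);
  ctx.Record(kBytesRead, 4096);
  int updates = ctx.FlushTo(&stats);
  assert(updates == 2);  // three local records, two shared updates
  assert(stats.values[kCacheHit].load() == 2);
  return stats.values[kBytesRead].load();
}
```

The trade-off is that the shared counters lag until the flush, which is presumably why the diff flushes both before the early returns and after the file loop.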


@ -231,7 +231,8 @@ PartitionedFilterBlockReader::GetFilterPartition(
}
}
return table_->GetFilter(/*prefetch_buffer*/ nullptr, fltr_blk_handle,
is_a_filter_partition, no_io);
is_a_filter_partition, no_io,
/* get_context */ nullptr);
} else {
auto filter = table_->ReadFilter(prefetch_buffer, fltr_blk_handle,
is_a_filter_partition);
@ -295,7 +296,8 @@ void PartitionedFilterBlockReader::CacheDependencies(bool pin) {
const bool no_io = true;
const bool is_a_filter_partition = true;
auto filter = table_->GetFilter(prefetch_buffer.get(), handle,
is_a_filter_partition, !no_io);
is_a_filter_partition, !no_io,
/* get_context */ nullptr);
if (LIKELY(filter.IsSet())) {
if (pin) {
filter_map_[handle.offset()] = std::move(filter);


@ -29,7 +29,8 @@ class MockedBlockBasedTable : public BlockBasedTable {
virtual CachableEntry<FilterBlockReader> GetFilter(
FilePrefetchBuffer*, const BlockHandle& filter_blk_handle,
const bool /* unused */, bool /* unused */) const override {
const bool /* unused */, bool /* unused */,
GetContext* /* unused */) const override {
Slice slice = slices[filter_blk_handle.offset()];
auto obj = new FullFilterBlockReader(
nullptr, true, BlockContents(slice, false, kNoCompression),


@ -1,3 +1,4 @@
# shellcheck disable=SC2148
TMP_DIR="${TMPDIR:-/tmp}/rocksdb-sanity-test"
if [ "$#" -lt 2 ]; then


@ -449,6 +449,7 @@ echo "===== Benchmark ====="
# Run!!!
IFS=',' read -a jobs <<< $1
# shellcheck disable=SC2068
for job in ${jobs[@]}; do
if [ $job != debug ]; then


@ -151,6 +151,7 @@ echo "===== Benchmark ====="
# Run!!!
IFS=',' read -a jobs <<< $1
# shellcheck disable=SC2068
for job in ${jobs[@]}; do
if [ $job != debug ]; then


@ -2752,8 +2752,10 @@ void VerifyDBFromDB(std::string& truth_db_name) {
void Crc32c(ThreadState* thread) {
// Checksum about 500MB of data total
const int size = 4096;
const char* label = "(4K per op)";
const int size = FLAGS_block_size; // use --block_size option for db_bench
std::string labels = "(" + ToString(FLAGS_block_size) + " per op)";
const char* label = labels.c_str();
std::string data(size, 'x');
int64_t bytes = 0;
uint32_t crc = 0;
@ -3659,9 +3661,7 @@ void VerifyDBFromDB(std::string& truth_db_name) {
}
}
if (!use_blob_db_) {
#ifndef ROCKSDB_LITE
s = db_with_cfh->db->Write(write_options_, &batch);
#endif // ROCKSDB_LITE
}
thread->stats.FinishedOps(db_with_cfh, db_with_cfh->db,
entries_per_batch_, kWrite);
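
The `Crc32c` hunk above replaces the fixed "(4K per op)" label with one built from `FLAGS_block_size`. A small sketch of that label construction follows, assuming a hypothetical helper `MakeLabel` and using `std::to_string` in place of RocksDB's own `ToString`.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: formats the per-operation size the way the new
// benchmark label does. std::to_string stands in for RocksDB's ToString.
std::string MakeLabel(int block_size) {
  return "(" + std::to_string(block_size) + " per op)";
}

// Returns the label for a 4096-byte block. The std::string owns the buffer,
// so a pointer taken with c_str() is only valid while the string is alive
// and unmodified.
std::string DemoLabel() {
  std::string label = MakeLabel(4096);
  const char* raw = label.c_str();
  assert(raw[0] == '(');
  return label;
}
```

Note that with this formatting a 4096-byte block is labeled "(4096 per op)", not "(4K per op)".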

@ -317,7 +317,7 @@ def whitebox_crash_main(args):
cmd = gen_cmd(dict(cmd_params.items() + additional_opts.items()
+ {'db': dbname}.items()))
print "Running:" + ' '.join(cmd) + "\n"
print "Running:" + ' '.join(cmd) + "\n" # noqa: E999 T25377293 Grandfathered in
popen = subprocess.Popen(cmd, stdout=subprocess.PIPE,
stderr=subprocess.STDOUT)

@ -229,7 +229,7 @@ class LDBTestCase(unittest.TestCase):
self.assertRunFAIL("get --ttl a3")
self.assertRunOK("checkconsistency", "OK")
def testInvalidCmdLines(self):
def testInvalidCmdLines(self): # noqa: F811 T25377293 Grandfathered in
print "Running testInvalidCmdLines..."
# db not specified
self.assertRunFAILFull("put 0x6133 0x6233 --hex --create_if_missing")
@ -516,7 +516,7 @@ class LDBTestCase(unittest.TestCase):
def testColumnFamilies(self):
print "Running testColumnFamilies..."
dbPath = os.path.join(self.TMP_DIR, self.DB_NAME)
dbPath = os.path.join(self.TMP_DIR, self.DB_NAME) # noqa: F841 T25377293 Grandfathered in
self.assertRunOK("put cf1_1 1 --create_if_missing", "OK")
self.assertRunOK("put cf1_2 2 --create_if_missing", "OK")
self.assertRunOK("put cf1_3 3 --try_load_options", "OK")

@ -465,4 +465,5 @@ function cleanup_test_directory {
############################################################################
# shellcheck disable=SC2068
main $@

@ -1,3 +1,4 @@
# shellcheck disable=SC2148
TESTDIR=`mktemp -d ${TMPDIR:-/tmp}/rocksdb-dump-test.XXXXX`
DUMPFILE="tools/sample-dump.dmp"

@ -21,7 +21,7 @@ def generate_runtimes(total_runtime):
def main(args):
runtimes = generate_runtimes(int(args.runtime_sec))
print "Going to execute write stress for " + str(runtimes)
print "Going to execute write stress for " + str(runtimes) # noqa: E999 T25377293 Grandfathered in
first_time = True
for runtime in runtimes:

@ -9,12 +9,11 @@
//
// A portable implementation of crc32c, optimized to handle
// four bytes at a time.
#include "util/crc32c.h"
#include <stdint.h>
#ifdef HAVE_SSE42
#include <nmmintrin.h>
#include <wmmintrin.h>
#endif
#include "util/coding.h"
@ -352,6 +351,7 @@ static inline void Fast_CRC32(uint64_t* l, uint8_t const **p) {
template<void (*CRC32)(uint64_t*, uint8_t const**)>
uint32_t ExtendImpl(uint32_t crc, const char* buf, size_t size) {
const uint8_t *p = reinterpret_cast<const uint8_t *>(buf);
const uint8_t *e = p + size;
uint64_t l = crc ^ 0xffffffffu;
@ -395,13 +395,14 @@ uint32_t ExtendImpl(uint32_t crc, const char* buf, size_t size) {
// Detect if SSE42 or not.
#ifndef HAVE_POWER8
static bool isSSE42() {
#ifndef HAVE_SSE42
return false;
#elif defined(__GNUC__) && defined(__x86_64__) && !defined(IOS_CROSS_COMPILE)
uint32_t c_;
__asm__("cpuid" : "=c"(c_) : "a"(1) : "ebx", "edx");
return c_ & (1U << 20); // copied from CpuId.h in Folly.
return c_ & (1U << 20); // copied from CpuId.h in Folly. Test SSE42
#elif defined(_WIN64)
int info[4];
__cpuidex(info, 0x00000001, 0);
@ -410,7 +411,26 @@ static bool isSSE42() {
return false;
#endif
}
static bool isPCLMULQDQ() {
#ifndef HAVE_SSE42
// in build_detect_platform we set this macro when both SSE42 and PCLMULQDQ are
// supported by compiler
return false;
#elif defined(__GNUC__) && defined(__x86_64__) && !defined(IOS_CROSS_COMPILE)
uint32_t c_;
__asm__("cpuid" : "=c"(c_) : "a"(1) : "ebx", "edx");
return c_ & (1U << 1); // PCLMULQDQ is in bit 1 (not bit 0)
#elif defined(_WIN64)
int info[4];
__cpuidex(info, 0x00000001, 0);
return (info[2] & ((int)1 << 1)) != 0;
#else
return false;
#endif
}
#endif // HAVE_POWER8
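The two detection functions above both read ECX from CPUID leaf 1 and test a single bit: bit 20 for SSE4.2 and bit 1 for PCLMULQDQ. As a minimal sketch of how those bits map onto the three implementations selected later in this diff, the helper below takes an already-fetched ECX value and returns a tier. The names `CrcTier` and `TierFromCpuidEcx` are hypothetical and not part of the change; the real code also depends on compile-time macros that this sketch ignores.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical tiers, standing in for ExtendImpl<Slow_CRC32>,
// ExtendImpl<Fast_CRC32> and crc32c_3way respectively.
enum class CrcTier { kPortable, kSse42, kSse42Pclmul };

// CPUID leaf 1, ECX feature bits.
constexpr uint32_t kEcxPclmulqdq = 1u << 1;  // carry-less multiply
constexpr uint32_t kEcxSse42 = 1u << 20;     // crc32 instruction

// Decide the tier from an ECX value that the caller obtained from CPUID.
// PCLMULQDQ is only considered when SSE4.2 is present, mirroring the
// nesting of the checks in the diff.
CrcTier TierFromCpuidEcx(uint32_t ecx) {
  if ((ecx & kEcxSse42) == 0) {
    return CrcTier::kPortable;
  }
  return (ecx & kEcxPclmulqdq) != 0 ? CrcTier::kSse42Pclmul : CrcTier::kSse42;
}
```

Keeping the bit test separate from the `cpuid` instruction makes the decision logic testable on any host, including ones without either feature.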
typedef uint32_t (*Function)(uint32_t, const char*, size_t);
@ -440,13 +460,6 @@ static bool isAltiVec() {
}
#endif
static inline Function Choose_Extend() {
#ifndef HAVE_POWER8
return isSSE42() ? ExtendImpl<Fast_CRC32> : ExtendImpl<Slow_CRC32>;
#else
return isAltiVec() ? ExtendPPCImpl : ExtendImpl<Slow_CRC32>;
#endif
}
std::string IsFastCrc32Supported() {
bool has_fast_crc = false;
@ -475,11 +488,727 @@ std::string IsFastCrc32Supported() {
return fast_zero_msg;
}
static Function ChosenExtend = Choose_Extend();
/*
* Copyright 2016 Ferry Toth, Exalon Delft BV, The Netherlands
* This software is provided 'as-is', without any express or implied
* warranty. In no event will the author be held liable for any damages
* arising from the use of this software.
* Permission is granted to anyone to use this software for any purpose,
* including commercial applications, and to alter it and redistribute it
* freely, subject to the following restrictions:
* 1. The origin of this software must not be misrepresented; you must not
* claim that you wrote the original software. If you use this software
* in a product, an acknowledgment in the product documentation would be
* appreciated but is not required.
* 2. Altered source versions must be plainly marked as such, and must not be
* misrepresented as being the original software.
* 3. This notice may not be removed or altered from any source distribution.
* Ferry Toth
* ftoth@exalondelft.nl
*
* https://github.com/htot/crc32c
*
* Modified by Facebook
*
* Original intel whitepaper:
* "Fast CRC Computation for iSCSI Polynomial Using CRC32 Instruction"
* https://www.intel.com/content/dam/www/public/us/en/documents/white-papers/crc-iscsi-polynomial-crc32-instruction-paper.pdf
*
* This version is from the folly library, created by Dave Watson <davejwatson@fb.com>
*
*/
#if defined HAVE_SSE42 && defined HAVE_PCLMUL
#define CRCtriplet(crc, buf, offset) \
crc##0 = _mm_crc32_u64(crc##0, *(buf##0 + offset)); \
crc##1 = _mm_crc32_u64(crc##1, *(buf##1 + offset)); \
crc##2 = _mm_crc32_u64(crc##2, *(buf##2 + offset));
#define CRCduplet(crc, buf, offset) \
crc##0 = _mm_crc32_u64(crc##0, *(buf##0 + offset)); \
crc##1 = _mm_crc32_u64(crc##1, *(buf##1 + offset));
#define CRCsinglet(crc, buf, offset) \
crc = _mm_crc32_u64(crc, *(uint64_t*)(buf + offset));
// Numbers taken directly from intel whitepaper.
// clang-format off
const uint64_t clmul_constants[] = {
0x14cd00bd6, 0x105ec76f0, 0x0ba4fc28e, 0x14cd00bd6,
0x1d82c63da, 0x0f20c0dfe, 0x09e4addf8, 0x0ba4fc28e,
0x039d3b296, 0x1384aa63a, 0x102f9b8a2, 0x1d82c63da,
0x14237f5e6, 0x01c291d04, 0x00d3b6092, 0x09e4addf8,
0x0c96cfdc0, 0x0740eef02, 0x18266e456, 0x039d3b296,
0x0daece73e, 0x0083a6eec, 0x0ab7aff2a, 0x102f9b8a2,
0x1248ea574, 0x1c1733996, 0x083348832, 0x14237f5e6,
0x12c743124, 0x02ad91c30, 0x0b9e02b86, 0x00d3b6092,
0x018b33a4e, 0x06992cea2, 0x1b331e26a, 0x0c96cfdc0,
0x17d35ba46, 0x07e908048, 0x1bf2e8b8a, 0x18266e456,
0x1a3e0968a, 0x11ed1f9d8, 0x0ce7f39f4, 0x0daece73e,
0x061d82e56, 0x0f1d0f55e, 0x0d270f1a2, 0x0ab7aff2a,
0x1c3f5f66c, 0x0a87ab8a8, 0x12ed0daac, 0x1248ea574,
0x065863b64, 0x08462d800, 0x11eef4f8e, 0x083348832,
0x1ee54f54c, 0x071d111a8, 0x0b3e32c28, 0x12c743124,
0x0064f7f26, 0x0ffd852c6, 0x0dd7e3b0c, 0x0b9e02b86,
0x0f285651c, 0x0dcb17aa4, 0x010746f3c, 0x018b33a4e,
0x1c24afea4, 0x0f37c5aee, 0x0271d9844, 0x1b331e26a,
0x08e766a0c, 0x06051d5a2, 0x093a5f730, 0x17d35ba46,
0x06cb08e5c, 0x11d5ca20e, 0x06b749fb2, 0x1bf2e8b8a,
0x1167f94f2, 0x021f3d99c, 0x0cec3662e, 0x1a3e0968a,
0x19329634a, 0x08f158014, 0x0e6fc4e6a, 0x0ce7f39f4,
0x08227bb8a, 0x1a5e82106, 0x0b0cd4768, 0x061d82e56,
0x13c2b89c4, 0x188815ab2, 0x0d7a4825c, 0x0d270f1a2,
0x10f5ff2ba, 0x105405f3e, 0x00167d312, 0x1c3f5f66c,
0x0f6076544, 0x0e9adf796, 0x026f6a60a, 0x12ed0daac,
0x1a2adb74e, 0x096638b34, 0x19d34af3a, 0x065863b64,
0x049c3cc9c, 0x1e50585a0, 0x068bce87a, 0x11eef4f8e,
0x1524fa6c6, 0x19f1c69dc, 0x16cba8aca, 0x1ee54f54c,
0x042d98888, 0x12913343e, 0x1329d9f7e, 0x0b3e32c28,
0x1b1c69528, 0x088f25a3a, 0x02178513a, 0x0064f7f26,
0x0e0ac139e, 0x04e36f0b0, 0x0170076fa, 0x0dd7e3b0c,
0x141a1a2e2, 0x0bd6f81f8, 0x16ad828b4, 0x0f285651c,
0x041d17b64, 0x19425cbba, 0x1fae1cc66, 0x010746f3c,
0x1a75b4b00, 0x18db37e8a, 0x0f872e54c, 0x1c24afea4,
0x01e41e9fc, 0x04c144932, 0x086d8e4d2, 0x0271d9844,
0x160f7af7a, 0x052148f02, 0x05bb8f1bc, 0x08e766a0c,
0x0a90fd27a, 0x0a3c6f37a, 0x0b3af077a, 0x093a5f730,
0x04984d782, 0x1d22c238e, 0x0ca6ef3ac, 0x06cb08e5c,
0x0234e0b26, 0x063ded06a, 0x1d88abd4a, 0x06b749fb2,
0x04597456a, 0x04d56973c, 0x0e9e28eb4, 0x1167f94f2,
0x07b3ff57a, 0x19385bf2e, 0x0c9c8b782, 0x0cec3662e,
0x13a9cba9e, 0x0e417f38a, 0x093e106a4, 0x19329634a,
0x167001a9c, 0x14e727980, 0x1ddffc5d4, 0x0e6fc4e6a,
0x00df04680, 0x0d104b8fc, 0x02342001e, 0x08227bb8a,
0x00a2a8d7e, 0x05b397730, 0x168763fa6, 0x0b0cd4768,
0x1ed5a407a, 0x0e78eb416, 0x0d2c3ed1a, 0x13c2b89c4,
0x0995a5724, 0x1641378f0, 0x19b1afbc4, 0x0d7a4825c,
0x109ffedc0, 0x08d96551c, 0x0f2271e60, 0x10f5ff2ba,
0x00b0bf8ca, 0x00bf80dd2, 0x123888b7a, 0x00167d312,
0x1e888f7dc, 0x18dcddd1c, 0x002ee03b2, 0x0f6076544,
0x183e8d8fe, 0x06a45d2b2, 0x133d7a042, 0x026f6a60a,
0x116b0f50c, 0x1dd3e10e8, 0x05fabe670, 0x1a2adb74e,
0x130004488, 0x0de87806c, 0x000bcf5f6, 0x19d34af3a,
0x18f0c7078, 0x014338754, 0x017f27698, 0x049c3cc9c,
0x058ca5f00, 0x15e3e77ee, 0x1af900c24, 0x068bce87a,
0x0b5cfca28, 0x0dd07448e, 0x0ded288f8, 0x1524fa6c6,
0x059f229bc, 0x1d8048348, 0x06d390dec, 0x16cba8aca,
0x037170390, 0x0a3e3e02c, 0x06353c1cc, 0x042d98888,
0x0c4584f5c, 0x0d73c7bea, 0x1f16a3418, 0x1329d9f7e,
0x0531377e2, 0x185137662, 0x1d8d9ca7c, 0x1b1c69528,
0x0b25b29f2, 0x18a08b5bc, 0x19fb2a8b0, 0x02178513a,
0x1a08fe6ac, 0x1da758ae0, 0x045cddf4e, 0x0e0ac139e,
0x1a91647f2, 0x169cf9eb0, 0x1a0f717c4, 0x0170076fa,
};
// Compute the crc32c value for a buffer smaller than 8 bytes
inline void align_to_8(
size_t len,
uint64_t& crc0, // crc so far, updated on return
const unsigned char*& next) { // next data pointer, updated on return
uint32_t crc32bit = static_cast<uint32_t>(crc0);
if (len & 0x04) {
crc32bit = _mm_crc32_u32(crc32bit, *(uint32_t*)next);
next += sizeof(uint32_t);
}
if (len & 0x02) {
crc32bit = _mm_crc32_u16(crc32bit, *(uint16_t*)next);
next += sizeof(uint16_t);
}
if (len & 0x01) {
crc32bit = _mm_crc32_u8(crc32bit, *(next));
next++;
}
crc0 = crc32bit;
}
//
// CombineCRC performs pclmulqdq multiplication of 2 partial CRC's and a well
// chosen constant and xor's these with the remaining CRC.
//
inline uint64_t CombineCRC(
size_t block_size,
uint64_t crc0,
uint64_t crc1,
uint64_t crc2,
const uint64_t* next2) {
const auto multiplier =
*(reinterpret_cast<const __m128i*>(clmul_constants) + block_size - 1);
const auto crc0_xmm = _mm_set_epi64x(0, crc0);
const auto res0 = _mm_clmulepi64_si128(crc0_xmm, multiplier, 0x00);
const auto crc1_xmm = _mm_set_epi64x(0, crc1);
const auto res1 = _mm_clmulepi64_si128(crc1_xmm, multiplier, 0x10);
const auto res = _mm_xor_si128(res0, res1);
crc0 = _mm_cvtsi128_si64(res);
crc0 = crc0 ^ *((uint64_t*)next2 - 1);
crc2 = _mm_crc32_u64(crc2, crc0);
return crc2;
}
// Compute CRC-32C using the Intel hardware instruction.
uint32_t crc32c_3way(uint32_t crc, const char* buf, size_t len) {
const unsigned char* next = (const unsigned char*)buf;
uint64_t count;
uint64_t crc0, crc1, crc2;
crc0 = crc ^ 0xffffffffu;
if (len >= 8) {
// if len > 216 then align and use triplets
if (len > 216) {
{
// Work on the bytes (< 8) before the first 8-byte alignment addr starts
uint64_t align_bytes = (8 - (uintptr_t)next) & 7;
len -= align_bytes;
align_to_8(align_bytes, crc0, next);
}
// Now work on the remaining blocks
count = len / 24; // number of triplets
len %= 24; // bytes remaining
uint64_t n = count >> 7; // #blocks = first block + full blocks
uint64_t block_size = count & 127;
if (block_size == 0) {
block_size = 128;
} else {
n++;
}
// points to the first byte of the next block
const uint64_t* next0 = (uint64_t*)next + block_size;
const uint64_t* next1 = next0 + block_size;
const uint64_t* next2 = next1 + block_size;
crc1 = crc2 = 0;
// Use Duff's device, a do/while loop inside a switch()
// statement. This needs to execute at least once; round len
// down to nearest triplet multiple
switch (block_size) {
case 128:
do {
// jumps here for a full block of len 128
CRCtriplet(crc, next, -128);
// FALLTHRU
case 127:
// jumps here or below for the first block smaller than 128
CRCtriplet(crc, next, -127);
// FALLTHRU
case 126:
CRCtriplet(crc, next, -126);
// FALLTHRU
case 125:
CRCtriplet(crc, next, -125);
// FALLTHRU
case 124:
CRCtriplet(crc, next, -124);
// FALLTHRU
case 123:
CRCtriplet(crc, next, -123);
// FALLTHRU
case 122:
CRCtriplet(crc, next, -122);
// FALLTHRU
case 121:
CRCtriplet(crc, next, -121);
// FALLTHRU
case 120:
CRCtriplet(crc, next, -120);
// FALLTHRU
case 119:
CRCtriplet(crc, next, -119);
// FALLTHRU
case 118:
CRCtriplet(crc, next, -118);
// FALLTHRU
case 117:
CRCtriplet(crc, next, -117);
// FALLTHRU
case 116:
CRCtriplet(crc, next, -116);
// FALLTHRU
case 115:
CRCtriplet(crc, next, -115);
// FALLTHRU
case 114:
CRCtriplet(crc, next, -114);
// FALLTHRU
case 113:
CRCtriplet(crc, next, -113);
// FALLTHRU
case 112:
CRCtriplet(crc, next, -112);
// FALLTHRU
case 111:
CRCtriplet(crc, next, -111);
// FALLTHRU
case 110:
CRCtriplet(crc, next, -110);
// FALLTHRU
case 109:
CRCtriplet(crc, next, -109);
// FALLTHRU
case 108:
CRCtriplet(crc, next, -108);
// FALLTHRU
case 107:
CRCtriplet(crc, next, -107);
// FALLTHRU
case 106:
CRCtriplet(crc, next, -106);
// FALLTHRU
case 105:
CRCtriplet(crc, next, -105);
// FALLTHRU
case 104:
CRCtriplet(crc, next, -104);
// FALLTHRU
case 103:
CRCtriplet(crc, next, -103);
// FALLTHRU
case 102:
CRCtriplet(crc, next, -102);
// FALLTHRU
case 101:
CRCtriplet(crc, next, -101);
// FALLTHRU
case 100:
CRCtriplet(crc, next, -100);
// FALLTHRU
case 99:
CRCtriplet(crc, next, -99);
// FALLTHRU
case 98:
CRCtriplet(crc, next, -98);
// FALLTHRU
case 97:
CRCtriplet(crc, next, -97);
// FALLTHRU
case 96:
CRCtriplet(crc, next, -96);
// FALLTHRU
case 95:
CRCtriplet(crc, next, -95);
// FALLTHRU
case 94:
CRCtriplet(crc, next, -94);
// FALLTHRU
case 93:
CRCtriplet(crc, next, -93);
// FALLTHRU
case 92:
CRCtriplet(crc, next, -92);
// FALLTHRU
case 91:
CRCtriplet(crc, next, -91);
// FALLTHRU
case 90:
CRCtriplet(crc, next, -90);
// FALLTHRU
case 89:
CRCtriplet(crc, next, -89);
// FALLTHRU
case 88:
CRCtriplet(crc, next, -88);
// FALLTHRU
case 87:
CRCtriplet(crc, next, -87);
// FALLTHRU
case 86:
CRCtriplet(crc, next, -86);
// FALLTHRU
case 85:
CRCtriplet(crc, next, -85);
// FALLTHRU
case 84:
CRCtriplet(crc, next, -84);
// FALLTHRU
case 83:
CRCtriplet(crc, next, -83);
// FALLTHRU
case 82:
CRCtriplet(crc, next, -82);
// FALLTHRU
case 81:
CRCtriplet(crc, next, -81);
// FALLTHRU
case 80:
CRCtriplet(crc, next, -80);
// FALLTHRU
case 79:
CRCtriplet(crc, next, -79);
// FALLTHRU
case 78:
CRCtriplet(crc, next, -78);
// FALLTHRU
case 77:
CRCtriplet(crc, next, -77);
// FALLTHRU
case 76:
CRCtriplet(crc, next, -76);
// FALLTHRU
case 75:
CRCtriplet(crc, next, -75);
// FALLTHRU
case 74:
CRCtriplet(crc, next, -74);
// FALLTHRU
case 73:
CRCtriplet(crc, next, -73);
// FALLTHRU
case 72:
CRCtriplet(crc, next, -72);
// FALLTHRU
case 71:
CRCtriplet(crc, next, -71);
// FALLTHRU
case 70:
CRCtriplet(crc, next, -70);
// FALLTHRU
case 69:
CRCtriplet(crc, next, -69);
// FALLTHRU
case 68:
CRCtriplet(crc, next, -68);
// FALLTHRU
case 67:
CRCtriplet(crc, next, -67);
// FALLTHRU
case 66:
CRCtriplet(crc, next, -66);
// FALLTHRU
case 65:
CRCtriplet(crc, next, -65);
// FALLTHRU
case 64:
CRCtriplet(crc, next, -64);
// FALLTHRU
case 63:
CRCtriplet(crc, next, -63);
// FALLTHRU
case 62:
CRCtriplet(crc, next, -62);
// FALLTHRU
case 61:
CRCtriplet(crc, next, -61);
// FALLTHRU
case 60:
CRCtriplet(crc, next, -60);
// FALLTHRU
case 59:
CRCtriplet(crc, next, -59);
// FALLTHRU
case 58:
CRCtriplet(crc, next, -58);
// FALLTHRU
case 57:
CRCtriplet(crc, next, -57);
// FALLTHRU
case 56:
CRCtriplet(crc, next, -56);
// FALLTHRU
case 55:
CRCtriplet(crc, next, -55);
// FALLTHRU
case 54:
CRCtriplet(crc, next, -54);
// FALLTHRU
case 53:
CRCtriplet(crc, next, -53);
// FALLTHRU
case 52:
CRCtriplet(crc, next, -52);
// FALLTHRU
case 51:
CRCtriplet(crc, next, -51);
// FALLTHRU
case 50:
CRCtriplet(crc, next, -50);
// FALLTHRU
case 49:
CRCtriplet(crc, next, -49);
// FALLTHRU
case 48:
CRCtriplet(crc, next, -48);
// FALLTHRU
case 47:
CRCtriplet(crc, next, -47);
// FALLTHRU
case 46:
CRCtriplet(crc, next, -46);
// FALLTHRU
case 45:
CRCtriplet(crc, next, -45);
// FALLTHRU
case 44:
CRCtriplet(crc, next, -44);
// FALLTHRU
case 43:
CRCtriplet(crc, next, -43);
// FALLTHRU
case 42:
CRCtriplet(crc, next, -42);
// FALLTHRU
case 41:
CRCtriplet(crc, next, -41);
// FALLTHRU
case 40:
CRCtriplet(crc, next, -40);
// FALLTHRU
case 39:
CRCtriplet(crc, next, -39);
// FALLTHRU
case 38:
CRCtriplet(crc, next, -38);
// FALLTHRU
case 37:
CRCtriplet(crc, next, -37);
// FALLTHRU
case 36:
CRCtriplet(crc, next, -36);
// FALLTHRU
case 35:
CRCtriplet(crc, next, -35);
// FALLTHRU
case 34:
CRCtriplet(crc, next, -34);
// FALLTHRU
case 33:
CRCtriplet(crc, next, -33);
// FALLTHRU
case 32:
CRCtriplet(crc, next, -32);
// FALLTHRU
case 31:
CRCtriplet(crc, next, -31);
// FALLTHRU
case 30:
CRCtriplet(crc, next, -30);
// FALLTHRU
case 29:
CRCtriplet(crc, next, -29);
// FALLTHRU
case 28:
CRCtriplet(crc, next, -28);
// FALLTHRU
case 27:
CRCtriplet(crc, next, -27);
// FALLTHRU
case 26:
CRCtriplet(crc, next, -26);
// FALLTHRU
case 25:
CRCtriplet(crc, next, -25);
// FALLTHRU
case 24:
CRCtriplet(crc, next, -24);
// FALLTHRU
case 23:
CRCtriplet(crc, next, -23);
// FALLTHRU
case 22:
CRCtriplet(crc, next, -22);
// FALLTHRU
case 21:
CRCtriplet(crc, next, -21);
// FALLTHRU
case 20:
CRCtriplet(crc, next, -20);
// FALLTHRU
case 19:
CRCtriplet(crc, next, -19);
// FALLTHRU
case 18:
CRCtriplet(crc, next, -18);
// FALLTHRU
case 17:
CRCtriplet(crc, next, -17);
// FALLTHRU
case 16:
CRCtriplet(crc, next, -16);
// FALLTHRU
case 15:
CRCtriplet(crc, next, -15);
// FALLTHRU
case 14:
CRCtriplet(crc, next, -14);
// FALLTHRU
case 13:
CRCtriplet(crc, next, -13);
// FALLTHRU
case 12:
CRCtriplet(crc, next, -12);
// FALLTHRU
case 11:
CRCtriplet(crc, next, -11);
// FALLTHRU
case 10:
CRCtriplet(crc, next, -10);
// FALLTHRU
case 9:
CRCtriplet(crc, next, -9);
// FALLTHRU
case 8:
CRCtriplet(crc, next, -8);
// FALLTHRU
case 7:
CRCtriplet(crc, next, -7);
// FALLTHRU
case 6:
CRCtriplet(crc, next, -6);
// FALLTHRU
case 5:
CRCtriplet(crc, next, -5);
// FALLTHRU
case 4:
CRCtriplet(crc, next, -4);
// FALLTHRU
case 3:
CRCtriplet(crc, next, -3);
// FALLTHRU
case 2:
CRCtriplet(crc, next, -2);
// FALLTHRU
case 1:
CRCduplet(crc, next, -1); // the final triplet is actually only 2
//{ CombineCRC(); }
crc0 = CombineCRC(block_size, crc0, crc1, crc2, next2);
if (--n > 0) {
crc1 = crc2 = 0;
block_size = 128;
// points to the first byte of the next block
next0 = next2 + 128;
next1 = next0 + 128; // from here on all blocks are 128 long
next2 = next1 + 128;
}
// FALLTHRU
case 0:;
} while (n > 0);
}
next = (const unsigned char*)next2;
}
uint64_t count2 = len >> 3; // 216 or fewer bytes is 27 or fewer singlets
len = len & 7;
next += (count2 * 8);
switch (count2) {
case 27:
CRCsinglet(crc0, next, -27 * 8);
// FALLTHRU
case 26:
CRCsinglet(crc0, next, -26 * 8);
// FALLTHRU
case 25:
CRCsinglet(crc0, next, -25 * 8);
// FALLTHRU
case 24:
CRCsinglet(crc0, next, -24 * 8);
// FALLTHRU
case 23:
CRCsinglet(crc0, next, -23 * 8);
// FALLTHRU
case 22:
CRCsinglet(crc0, next, -22 * 8);
// FALLTHRU
case 21:
CRCsinglet(crc0, next, -21 * 8);
// FALLTHRU
case 20:
CRCsinglet(crc0, next, -20 * 8);
// FALLTHRU
case 19:
CRCsinglet(crc0, next, -19 * 8);
// FALLTHRU
case 18:
CRCsinglet(crc0, next, -18 * 8);
// FALLTHRU
case 17:
CRCsinglet(crc0, next, -17 * 8);
// FALLTHRU
case 16:
CRCsinglet(crc0, next, -16 * 8);
// FALLTHRU
case 15:
CRCsinglet(crc0, next, -15 * 8);
// FALLTHRU
case 14:
CRCsinglet(crc0, next, -14 * 8);
// FALLTHRU
case 13:
CRCsinglet(crc0, next, -13 * 8);
// FALLTHRU
case 12:
CRCsinglet(crc0, next, -12 * 8);
// FALLTHRU
case 11:
CRCsinglet(crc0, next, -11 * 8);
// FALLTHRU
case 10:
CRCsinglet(crc0, next, -10 * 8);
// FALLTHRU
case 9:
CRCsinglet(crc0, next, -9 * 8);
// FALLTHRU
case 8:
CRCsinglet(crc0, next, -8 * 8);
// FALLTHRU
case 7:
CRCsinglet(crc0, next, -7 * 8);
// FALLTHRU
case 6:
CRCsinglet(crc0, next, -6 * 8);
// FALLTHRU
case 5:
CRCsinglet(crc0, next, -5 * 8);
// FALLTHRU
case 4:
CRCsinglet(crc0, next, -4 * 8);
// FALLTHRU
case 3:
CRCsinglet(crc0, next, -3 * 8);
// FALLTHRU
case 2:
CRCsinglet(crc0, next, -2 * 8);
// FALLTHRU
case 1:
CRCsinglet(crc0, next, -1 * 8);
// FALLTHRU
case 0:;
}
}
{
align_to_8(len, crc0, next);
return (uint32_t)crc0 ^ 0xffffffffu;
}
}
#endif //HAVE_SSE42 && HAVE_PCLMUL
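The block arithmetic at the top of `crc32c_3way` (`count`, `n`, `block_size`) can be easier to follow in isolation. The sketch below reproduces only that arithmetic: it splits a length into 24-byte triplets, makes the first block the short one (or a full 128 if the count divides evenly), and reports the leftover bytes. `TripletPlan` and `PlanTriplets` are hypothetical names, and the sketch assumes `len` is at least 24, which holds on the path in the diff because it is only taken for `len > 216`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical summary of how the 3-way loop partitions its input.
struct TripletPlan {
  uint64_t first_block;  // triplets in the first block (1..128)
  uint64_t blocks;       // total number of blocks, first one included
  uint64_t tail_bytes;   // bytes left over for the singlet path
};

// Same arithmetic as in the diff: one triplet covers 3 * 8 = 24 bytes and
// a full block holds 128 triplets.
TripletPlan PlanTriplets(uint64_t len) {
  uint64_t count = len / 24;          // number of triplets
  uint64_t tail = len % 24;           // bytes remaining
  uint64_t n = count >> 7;            // number of full blocks
  uint64_t block_size = count & 127;  // size of the short first block
  if (block_size == 0) {
    block_size = 128;                 // first block is itself a full block
  } else {
    n++;                              // count the short first block too
  }
  return TripletPlan{block_size, n, tail};
}
```

For any plan, `first_block + (blocks - 1) * 128` equals the total triplet count, which is why every block after the first can be hard-coded to 128 in the loop.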
static inline Function Choose_Extend() {
#ifndef HAVE_POWER8
if (isSSE42()) {
if (isPCLMULQDQ()) {
#if defined HAVE_SSE42 && defined HAVE_PCLMUL && !defined NO_THREEWAY_CRC32C
return crc32c_3way;
#else
return ExtendImpl<Fast_CRC32>; // Fast_CRC32 will check HAVE_SSE42 itself
#endif
}
else { // no runtime PCLMULQDQ support but has SSE42 support
return ExtendImpl<Fast_CRC32>;
}
} // end of isSSE42()
else {
return ExtendImpl<Slow_CRC32>;
}
#else //HAVE_POWER8
return isAltiVec() ? ExtendPPCImpl : ExtendImpl<Slow_CRC32>;
#endif
}
static Function ChosenExtend = Choose_Extend();
uint32_t Extend(uint32_t crc, const char* buf, size_t size) {
return ChosenExtend(crc, buf, size);
}
} // namespace crc32c
} // namespace rocksdb
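When reviewing a hardware-accelerated path such as the one above, a slow bit-at-a-time CRC-32C is a handy cross-check. The sketch below is not part of this change; `ReferenceExtend` and `ReferenceValue` are hypothetical names. It assumes the same convention visible in `ExtendImpl` and `crc32c_3way`, where the incoming and outgoing CRC are both XORed with `0xffffffff`, and it uses the reflected Castagnoli polynomial `0x82F63B78`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Bit-at-a-time CRC-32C. Slow, but short enough to audit by eye.
uint32_t ReferenceExtend(uint32_t crc, const char* buf, size_t size) {
  uint32_t l = crc ^ 0xffffffffu;  // undo the final XOR of the previous call
  for (size_t i = 0; i < size; ++i) {
    l ^= static_cast<uint8_t>(buf[i]);
    for (int bit = 0; bit < 8; ++bit) {
      // Shift right by one; XOR in the polynomial if the dropped bit was 1.
      l = (l >> 1) ^ (0x82F63B78u & (0u - (l & 1u)));
    }
  }
  return l ^ 0xffffffffu;
}

// CRC of a whole buffer, starting from the initial value 0.
uint32_t ReferenceValue(const char* buf, size_t size) {
  return ReferenceExtend(0, buf, size);
}
```

Because of the XOR on entry and exit, extending the CRC of a prefix with the remaining bytes gives the same result as one pass over the whole buffer, so a reference like this can be compared against both single and stitched computations.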

@ -6,7 +6,6 @@
// Copyright (c) 2011 The LevelDB Authors. All rights reserved.
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file. See the AUTHORS file for names of contributors.
#include "util/crc32c.h"
#include "util/testharness.h"
@ -15,7 +14,57 @@ namespace crc32c {
class CRC { };
// Tests for 3-way crc32c algorithm. We need these tests because it uses
// different lookup tables than the original Fast_CRC32
const unsigned int BUFFER_SIZE = 512 * 1024 * sizeof(uint64_t);
char buffer[BUFFER_SIZE];
struct ExpectedResult {
size_t offset;
size_t length;
uint32_t crc32c;
};
ExpectedResult expectedResults[] = {
// Zero-byte input
{ 0, 0, ~0U },
// Small aligned inputs to test special cases in SIMD implementations
{ 8, 1, 1543413366 },
{ 8, 2, 523493126 },
{ 8, 3, 1560427360 },
{ 8, 4, 3422504776 },
{ 8, 5, 447841138 },
{ 8, 6, 3910050499 },
{ 8, 7, 3346241981 },
// Small unaligned inputs
{ 9, 1, 3855826643 },
{ 10, 2, 560880875 },
{ 11, 3, 1479707779 },
{ 12, 4, 2237687071 },
{ 13, 5, 4063855784 },
{ 14, 6, 2553454047 },
{ 15, 7, 1349220140 },
// Larger inputs to test leftover chunks at the end of aligned blocks
{ 8, 8, 627613930 },
{ 8, 9, 2105929409 },
{ 8, 10, 2447068514 },
{ 8, 11, 863807079 },
{ 8, 12, 292050879 },
{ 8, 13, 1411837737 },
{ 8, 14, 2614515001 },
{ 8, 15, 3579076296 },
{ 8, 16, 2897079161 },
{ 8, 17, 675168386 },
// Much larger inputs
{ 0, BUFFER_SIZE, 2096790750 },
{ 1, BUFFER_SIZE / 2, 3854797577 },
};
TEST(CRC, StandardResults) {
// Original Fast_CRC32 tests.
// From rfc3720 section B.4.
char buf[32];
@ -50,6 +99,24 @@ TEST(CRC, StandardResults) {
0x00, 0x00, 0x00, 0x00,
};
ASSERT_EQ(0xd9963a56, Value(reinterpret_cast<char*>(data), sizeof(data)));
// 3-Way Crc32c tests ported from folly.
// Test 1: single computation
for (auto expected : expectedResults) {
uint32_t result = Value(buffer + expected.offset, expected.length);
EXPECT_EQ(~expected.crc32c, result);
}
// Test 2: stitching two computations
for (auto expected : expectedResults) {
size_t partialLength = expected.length / 2;
uint32_t partialChecksum = Value(buffer + expected.offset, partialLength);
uint32_t result = Extend(partialChecksum,
buffer + expected.offset + partialLength,
expected.length - partialLength);
EXPECT_EQ(~expected.crc32c, result);
}
}
TEST(CRC, Values) {
@ -72,7 +139,36 @@ TEST(CRC, Mask) {
} // namespace crc32c
} // namespace rocksdb
// copied from folly
const uint64_t FNV_64_HASH_START = 14695981039346656037ULL;
inline uint64_t fnv64_buf(const void* buf,
size_t n,
uint64_t hash = FNV_64_HASH_START) {
// forcing signed char, since other platforms can use unsigned
const signed char* char_buf = reinterpret_cast<const signed char*>(buf);
for (size_t i = 0; i < n; ++i) {
hash += (hash << 1) + (hash << 4) + (hash << 5) + (hash << 7) +
(hash << 8) + (hash << 40);
hash ^= char_buf[i];
}
return hash;
}
int main(int argc, char** argv) {
::testing::InitGoogleTest(&argc, argv);
// Populate a buffer with a deterministic pattern
// on which to compute checksums
const uint8_t* src = (uint8_t*)rocksdb::crc32c::buffer;
uint64_t* dst = (uint64_t*)rocksdb::crc32c::buffer;
const uint64_t* end = (const uint64_t*)(rocksdb::crc32c::buffer + rocksdb::crc32c::BUFFER_SIZE);
*dst++ = 0;
while (dst < end) {
*dst++ = fnv64_buf((const char*)src, sizeof(uint64_t));
src += sizeof(uint64_t);
}
return RUN_ALL_TESTS();
}

@ -239,6 +239,9 @@ Status WritableFileWriter::Close() {
// we need to let the file know where data ends.
if (use_direct_io()) {
interim = writable_file_->Truncate(filesize_);
if (interim.ok()) {
interim = writable_file_->Fsync();
}
if (!interim.ok() && s.ok()) {
s = interim;
}

@ -24,6 +24,13 @@ namespace rocksdb {
// pointer (if not NULL) when one of the following happens:
// (1) a thread terminates
// (2) a ThreadLocalPtr is destroyed
//
// Warning: this function is called while holding a global mutex. The same mutex
// is used (at least in some cases) by most methods of ThreadLocalPtr, and it's
// shared across all instances of ThreadLocalPtr. Therefore extra care
// is needed to avoid deadlocks. In particular, the handler shouldn't lock any
// mutexes and shouldn't call any methods of any ThreadLocalPtr instances,
// unless you know what you're doing.
typedef void (*UnrefHandler)(void* ptr);
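As an illustration of the warning above, a handler that follows it does nothing beyond releasing the object it was handed: no mutex acquisition and no calls into any ThreadLocalPtr. The sketch below is a standalone example, not code from this change; `CachedState`, `g_released` and `ReleaseCachedState` are hypothetical, and the typedef is repeated locally only so the block compiles on its own.

```cpp
#include <atomic>
#include <cassert>

// Local copy of the handler signature so the sketch is self-contained.
typedef void (*UnrefHandler)(void* ptr);

// Hypothetical per-thread object stored behind the pointer.
struct CachedState {
  int value;
};

// Lock-free bookkeeping only; an atomic counter needs no mutex.
std::atomic<int> g_released{0};

// Frees the object and returns. It takes no locks and touches no
// ThreadLocalPtr, so it cannot re-enter the global mutex held by the caller.
void ReleaseCachedState(void* ptr) {
  delete static_cast<CachedState*>(ptr);
  g_released.fetch_add(1, std::memory_order_relaxed);
}
```

Any work that does need a lock would have to be deferred to outside the handler, for example by queueing it for a later point where the global mutex is not held.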
// ThreadLocalPtr stores only values of pointer type. Different from

@ -77,8 +77,9 @@ Status RandomTransactionInserter::DBGet(
uint64_t ikey, bool get_for_update, uint64_t* int_value,
std::string* full_key, bool* unexpected_error) {
Status s;
// four digits and zero end char
char prefix_buf[5];
// Five digits (since the largest uint16_t is 65535) plus the NUL
// end char.
char prefix_buf[6];
// Pad prefix appropriately so we can iterate over each set
snprintf(prefix_buf, sizeof(prefix_buf), "%.4u", set_i + 1);
// key format: [SET#][random#]
@ -124,7 +125,7 @@ bool RandomTransactionInserter::DoInsert(DB* db, Transaction* txn,
bool unexpected_error = false;
std::vector<uint16_t> set_vec(num_sets_);
std::iota(set_vec.begin(), set_vec.end(), 0);
std::iota(set_vec.begin(), set_vec.end(), static_cast<uint16_t>(0));
std::random_shuffle(set_vec.begin(), set_vec.end(),
[&](uint64_t r) { return rand_->Uniform(r); });
// For each set, pick a key at random and increment it
@ -254,15 +255,17 @@ Status RandomTransactionInserter::Verify(DB* db, uint16_t num_sets,
}
std::vector<uint16_t> set_vec(num_sets);
std::iota(set_vec.begin(), set_vec.end(), 0);
std::iota(set_vec.begin(), set_vec.end(), static_cast<uint16_t>(0));
if (rand) {
std::random_shuffle(set_vec.begin(), set_vec.end(),
[&](uint64_t r) { return rand->Uniform(r); });
}
// For each set of keys with the same prefix, sum all the values
for (uint16_t set_i : set_vec) {
// four digits and zero end char
char prefix_buf[5];
// Five digits (since the largest uint16_t is 65535) plus the NUL
// end char.
char prefix_buf[6];
assert(set_i + 1 <= 9999);
snprintf(prefix_buf, sizeof(prefix_buf), "%.4u", set_i + 1);
uint64_t total = 0;
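The buffer growth from 5 to 6 bytes follows from the format: `%.4u` pads to at least four digits but does not cap the width, so `set_i + 1` can print as five digits when `set_i` is a large `uint16_t`. A minimal sketch of the sizing, with hypothetical helper names `SetPrefix` and `WouldTruncate`, is below; it uses the fact that `snprintf` with a null buffer and zero size returns the length the output would need.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Formats set_i + 1 the same way as the diff, into a 6-byte buffer:
// up to five digits plus the terminating NUL.
std::string SetPrefix(uint16_t set_i) {
  char prefix_buf[6];
  std::snprintf(prefix_buf, sizeof(prefix_buf), "%.4u",
                static_cast<unsigned>(set_i) + 1u);
  return std::string(prefix_buf);
}

// Returns true if formatting `value` with "%.4u" would not fit in a buffer
// of `capacity` bytes (the capacity must also hold the NUL terminator).
bool WouldTruncate(unsigned value, size_t capacity) {
  int needed = std::snprintf(nullptr, 0, "%.4u", value);
  return needed < 0 || static_cast<size_t>(needed) >= capacity;
}
```

With the old 5-byte buffer, any value above 9999 would have been silently cut to four characters, which is the case `WouldTruncate` flags.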

@ -146,11 +146,17 @@ class BackupEngineImpl : public BackupEngine {
class BackupMeta {
public:
BackupMeta(const std::string& meta_filename,
BackupMeta(
const std::string& meta_filename, const std::string& meta_tmp_filename,
std::unordered_map<std::string, std::shared_ptr<FileInfo>>* file_infos,
Env* env)
: timestamp_(0), sequence_number_(0), size_(0),
meta_filename_(meta_filename), file_infos_(file_infos), env_(env) {}
: timestamp_(0),
sequence_number_(0),
size_(0),
meta_filename_(meta_filename),
meta_tmp_filename_(meta_tmp_filename),
file_infos_(file_infos),
env_(env) {}
BackupMeta(const BackupMeta&) = delete;
BackupMeta& operator=(const BackupMeta&) = delete;
@ -228,6 +234,7 @@ class BackupEngineImpl : public BackupEngine {
uint64_t size_;
std::string app_metadata_;
std::string const meta_filename_;
std::string const meta_tmp_filename_;
// files with relative paths (without "/" prefix!!)
std::vector<std::shared_ptr<FileInfo>> files_;
std::unordered_map<std::string, std::shared_ptr<FileInfo>>* file_infos_;
@ -257,12 +264,14 @@ class BackupEngineImpl : public BackupEngine {
inline std::string GetSharedFileRel(const std::string& file = "",
bool tmp = false) const {
assert(file.size() == 0 || file[0] != '/');
return "shared/" + file + (tmp ? ".tmp" : "");
return std::string("shared/") + (tmp ? "." : "") + file +
(tmp ? ".tmp" : "");
}
inline std::string GetSharedFileWithChecksumRel(const std::string& file = "",
bool tmp = false) const {
assert(file.size() == 0 || file[0] != '/');
return GetSharedChecksumDirRel() + "/" + file + (tmp ? ".tmp" : "");
return GetSharedChecksumDirRel() + "/" + (tmp ? "." : "") + file +
(tmp ? ".tmp" : "");
}
inline std::string GetSharedFileWithChecksum(const std::string& file,
const uint32_t checksum_value,
@ -283,8 +292,9 @@ class BackupEngineImpl : public BackupEngine {
inline std::string GetBackupMetaDir() const {
return GetAbsolutePath("meta");
}
inline std::string GetBackupMetaFile(BackupID backup_id) const {
return GetBackupMetaDir() + "/" + rocksdb::ToString(backup_id);
inline std::string GetBackupMetaFile(BackupID backup_id, bool tmp) const {
return GetBackupMetaDir() + "/" + (tmp ? "." : "") +
rocksdb::ToString(backup_id) + (tmp ? ".tmp" : "");
}
// If size_limit == 0, there is no size limit, copy everything.
@ -605,10 +615,11 @@ Status BackupEngineImpl::Initialize() {
continue;
}
assert(backups_.find(backup_id) == backups_.end());
backups_.insert(
std::make_pair(backup_id, unique_ptr<BackupMeta>(new BackupMeta(
GetBackupMetaFile(backup_id),
&backuped_file_infos_, backup_env_))));
backups_.insert(std::make_pair(
backup_id, unique_ptr<BackupMeta>(new BackupMeta(
GetBackupMetaFile(backup_id, false /* tmp */),
GetBackupMetaFile(backup_id, true /* tmp */),
&backuped_file_infos_, backup_env_))));
}
latest_backup_id_ = 0;
@ -736,10 +747,11 @@ Status BackupEngineImpl::CreateNewBackupWithMetadata(
BackupID new_backup_id = latest_backup_id_ + 1;
assert(backups_.find(new_backup_id) == backups_.end());
auto ret = backups_.insert(
std::make_pair(new_backup_id, unique_ptr<BackupMeta>(new BackupMeta(
GetBackupMetaFile(new_backup_id),
&backuped_file_infos_, backup_env_))));
auto ret = backups_.insert(std::make_pair(
new_backup_id, unique_ptr<BackupMeta>(new BackupMeta(
GetBackupMetaFile(new_backup_id, false /* tmp */),
GetBackupMetaFile(new_backup_id, true /* tmp */),
&backuped_file_infos_, backup_env_))));
assert(ret.second == true);
auto& new_backup = ret.first->second;
new_backup->RecordTimestamp();
@ -1708,8 +1720,7 @@ Status BackupEngineImpl::BackupMeta::StoreToFile(bool sync) {
EnvOptions env_options;
env_options.use_mmap_writes = false;
env_options.use_direct_writes = false;
s = env_->NewWritableFile(meta_filename_ + ".tmp", &backup_meta_file,
env_options);
s = env_->NewWritableFile(meta_tmp_filename_, &backup_meta_file, env_options);
if (!s.ok()) {
return s;
}
@ -1749,7 +1760,7 @@ Status BackupEngineImpl::BackupMeta::StoreToFile(bool sync) {
s = backup_meta_file->Close();
}
if (s.ok()) {
s = env_->RenameFile(meta_filename_ + ".tmp", meta_filename_);
s = env_->RenameFile(meta_tmp_filename_, meta_filename_);
}
return s;
}
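The naming helpers changed in this file all follow one pattern: the temporary file lives in the same directory as the final file, with a leading `.` and a trailing `.tmp` on the base name, and `StoreToFile` renames it into place once the write has succeeded. The sketch below captures only the naming rule for a generic relative path; `TmpSibling` is a hypothetical helper, not a function in BackupEngineImpl.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Given "dir/name", returns "dir/.name.tmp". A path without any '/' is
// treated as a bare file name.
std::string TmpSibling(const std::string& path) {
  size_t slash = path.find_last_of('/');
  size_t base = (slash == std::string::npos) ? 0 : slash + 1;
  return path.substr(0, base) + "." + path.substr(base) + ".tmp";
}
```

For example, `TmpSibling("meta/1")` yields `meta/.1.tmp` and `TmpSibling("shared/00010.sst")` yields `shared/.00010.sst.tmp`, matching the paths the updated tests expect.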

@ -810,9 +810,9 @@ TEST_F(BackupableDBTest, NoDoubleCopy) {
test_db_env_->SetFilenamesForMockedAttrs(dummy_db_->live_files_);
ASSERT_OK(backup_engine_->CreateNewBackup(db_.get(), false));
std::vector<std::string> should_have_written = {
"/shared/00010.sst.tmp", "/shared/00011.sst.tmp",
"/shared/.00010.sst.tmp", "/shared/.00011.sst.tmp",
"/private/1.tmp/CURRENT", "/private/1.tmp/MANIFEST-01",
"/private/1.tmp/00011.log", "/meta/1.tmp"};
"/private/1.tmp/00011.log", "/meta/.1.tmp"};
AppendPath(backupdir_, should_have_written);
test_backup_env_->AssertWrittenFiles(should_have_written);
@@ -828,9 +828,9 @@ TEST_F(BackupableDBTest, NoDoubleCopy) {
 ASSERT_OK(backup_engine_->CreateNewBackup(db_.get(), false));
 // should not open 00010.sst - it's already there
-should_have_written = {"/shared/00015.sst.tmp", "/private/2.tmp/CURRENT",
+should_have_written = {"/shared/.00015.sst.tmp", "/private/2.tmp/CURRENT",
 "/private/2.tmp/MANIFEST-01",
-"/private/2.tmp/00011.log", "/meta/2.tmp"};
+"/private/2.tmp/00011.log", "/meta/.2.tmp"};
 AppendPath(backupdir_, should_have_written);
 test_backup_env_->AssertWrittenFiles(should_have_written);
@@ -1169,7 +1169,7 @@ TEST_F(BackupableDBTest, DeleteTmpFiles) {
 } else {
 shared_tmp += "/shared";
 }
-shared_tmp += "/00006.sst.tmp";
+shared_tmp += "/.00006.sst.tmp";
 std::string private_tmp_dir = backupdir_ + "/private/10.tmp";
 std::string private_tmp_file = private_tmp_dir + "/00003.sst";
 file_manager_->WriteToFile(shared_tmp, "tmp");

@@ -317,10 +317,10 @@ TEST(RowValueTest, PurgeTtlShouldRemvoeAllColumnsExpired) {
 int64_t now = time(nullptr);
 auto row_value = CreateTestRowValue({
-std::make_tuple(kColumn, 0, ToMicroSeconds(now)),
-std::make_tuple(kExpiringColumn, 1, ToMicroSeconds(now - kTtl - 10)), //expired
-std::make_tuple(kExpiringColumn, 2, ToMicroSeconds(now)), // not expired
-std::make_tuple(kTombstone, 3, ToMicroSeconds(now))
+CreateTestColumnSpec(kColumn, 0, ToMicroSeconds(now)),
+CreateTestColumnSpec(kExpiringColumn, 1, ToMicroSeconds(now - kTtl - 10)), //expired
+CreateTestColumnSpec(kExpiringColumn, 2, ToMicroSeconds(now)), // not expired
+CreateTestColumnSpec(kTombstone, 3, ToMicroSeconds(now))
 });
 bool changed = false;
@@ -339,10 +339,10 @@ TEST(RowValueTest, ExpireTtlShouldConvertExpiredColumnsToTombstones) {
 int64_t now = time(nullptr);
 auto row_value = CreateTestRowValue({
-std::make_tuple(kColumn, 0, ToMicroSeconds(now)),
-std::make_tuple(kExpiringColumn, 1, ToMicroSeconds(now - kTtl - 10)), //expired
-std::make_tuple(kExpiringColumn, 2, ToMicroSeconds(now)), // not expired
-std::make_tuple(kTombstone, 3, ToMicroSeconds(now))
+CreateTestColumnSpec(kColumn, 0, ToMicroSeconds(now)),
+CreateTestColumnSpec(kExpiringColumn, 1, ToMicroSeconds(now - kTtl - 10)), //expired
+CreateTestColumnSpec(kExpiringColumn, 2, ToMicroSeconds(now)), // not expired
+CreateTestColumnSpec(kTombstone, 3, ToMicroSeconds(now))
 });
 bool changed = false;

@@ -145,21 +145,21 @@ TEST_F(CassandraFunctionalTest, SimpleMergeTest) {
 int64_t now = time(nullptr);
 store.Append("k1", CreateTestRowValue({
-std::make_tuple(kTombstone, 0, ToMicroSeconds(now + 5)),
-std::make_tuple(kColumn, 1, ToMicroSeconds(now + 8)),
-std::make_tuple(kExpiringColumn, 2, ToMicroSeconds(now + 5)),
+CreateTestColumnSpec(kTombstone, 0, ToMicroSeconds(now + 5)),
+CreateTestColumnSpec(kColumn, 1, ToMicroSeconds(now + 8)),
+CreateTestColumnSpec(kExpiringColumn, 2, ToMicroSeconds(now + 5)),
 }));
 store.Append("k1",CreateTestRowValue({
-std::make_tuple(kColumn, 0, ToMicroSeconds(now + 2)),
-std::make_tuple(kExpiringColumn, 1, ToMicroSeconds(now + 5)),
-std::make_tuple(kTombstone, 2, ToMicroSeconds(now + 7)),
-std::make_tuple(kExpiringColumn, 7, ToMicroSeconds(now + 17)),
+CreateTestColumnSpec(kColumn, 0, ToMicroSeconds(now + 2)),
+CreateTestColumnSpec(kExpiringColumn, 1, ToMicroSeconds(now + 5)),
+CreateTestColumnSpec(kTombstone, 2, ToMicroSeconds(now + 7)),
+CreateTestColumnSpec(kExpiringColumn, 7, ToMicroSeconds(now + 17)),
 }));
 store.Append("k1", CreateTestRowValue({
-std::make_tuple(kExpiringColumn, 0, ToMicroSeconds(now + 6)),
-std::make_tuple(kTombstone, 1, ToMicroSeconds(now + 5)),
-std::make_tuple(kColumn, 2, ToMicroSeconds(now + 4)),
-std::make_tuple(kTombstone, 11, ToMicroSeconds(now + 11)),
+CreateTestColumnSpec(kExpiringColumn, 0, ToMicroSeconds(now + 6)),
+CreateTestColumnSpec(kTombstone, 1, ToMicroSeconds(now + 5)),
+CreateTestColumnSpec(kColumn, 2, ToMicroSeconds(now + 4)),
+CreateTestColumnSpec(kTombstone, 11, ToMicroSeconds(now + 11)),
 }));
 auto ret = store.Get("k1");
@@ -180,16 +180,16 @@ TEST_F(CassandraFunctionalTest,
 int64_t now= time(nullptr);
 store.Append("k1", CreateTestRowValue({
-std::make_tuple(kExpiringColumn, 0, ToMicroSeconds(now - kTtl - 20)), //expired
-std::make_tuple(kExpiringColumn, 1, ToMicroSeconds(now - kTtl + 10)), // not expired
-std::make_tuple(kTombstone, 3, ToMicroSeconds(now))
+CreateTestColumnSpec(kExpiringColumn, 0, ToMicroSeconds(now - kTtl - 20)), //expired
+CreateTestColumnSpec(kExpiringColumn, 1, ToMicroSeconds(now - kTtl + 10)), // not expired
+CreateTestColumnSpec(kTombstone, 3, ToMicroSeconds(now))
 }));
 store.Flush();
 store.Append("k1",CreateTestRowValue({
-std::make_tuple(kExpiringColumn, 0, ToMicroSeconds(now - kTtl - 10)), //expired
-std::make_tuple(kColumn, 2, ToMicroSeconds(now))
+CreateTestColumnSpec(kExpiringColumn, 0, ToMicroSeconds(now - kTtl - 10)), //expired
+CreateTestColumnSpec(kColumn, 2, ToMicroSeconds(now))
 }));
 store.Flush();
@@ -213,16 +213,16 @@ TEST_F(CassandraFunctionalTest,
 int64_t now = time(nullptr);
 store.Append("k1", CreateTestRowValue({
-std::make_tuple(kExpiringColumn, 0, ToMicroSeconds(now - kTtl - 20)), //expired
-std::make_tuple(kExpiringColumn, 1, ToMicroSeconds(now)), // not expired
-std::make_tuple(kTombstone, 3, ToMicroSeconds(now))
+CreateTestColumnSpec(kExpiringColumn, 0, ToMicroSeconds(now - kTtl - 20)), //expired
+CreateTestColumnSpec(kExpiringColumn, 1, ToMicroSeconds(now)), // not expired
+CreateTestColumnSpec(kTombstone, 3, ToMicroSeconds(now))
 }));
 store.Flush();
 store.Append("k1",CreateTestRowValue({
-std::make_tuple(kExpiringColumn, 0, ToMicroSeconds(now - kTtl - 10)), //expired
-std::make_tuple(kColumn, 2, ToMicroSeconds(now))
+CreateTestColumnSpec(kExpiringColumn, 0, ToMicroSeconds(now - kTtl - 10)), //expired
+CreateTestColumnSpec(kColumn, 2, ToMicroSeconds(now))
 }));
 store.Flush();
@@ -244,14 +244,14 @@ TEST_F(CassandraFunctionalTest,
 int64_t now = time(nullptr);
 store.Append("k1", CreateTestRowValue({
-std::make_tuple(kExpiringColumn, 0, ToMicroSeconds(now - kTtl - 20)),
-std::make_tuple(kExpiringColumn, 1, ToMicroSeconds(now - kTtl - 20)),
+CreateTestColumnSpec(kExpiringColumn, 0, ToMicroSeconds(now - kTtl - 20)),
+CreateTestColumnSpec(kExpiringColumn, 1, ToMicroSeconds(now - kTtl - 20)),
 }));
 store.Flush();
 store.Append("k1",CreateTestRowValue({
-std::make_tuple(kExpiringColumn, 0, ToMicroSeconds(now - kTtl - 10)),
+CreateTestColumnSpec(kExpiringColumn, 0, ToMicroSeconds(now - kTtl - 10)),
 }));
 store.Flush();
@@ -266,18 +266,18 @@ TEST_F(CassandraFunctionalTest,
 int64_t now = time(nullptr);
 store.Append("k1", CreateTestRowValue({
-std::make_tuple(kTombstone, 0, ToMicroSeconds(now - gc_grace_period_in_seconds_ - 1)),
-std::make_tuple(kColumn, 1, ToMicroSeconds(now))
+CreateTestColumnSpec(kTombstone, 0, ToMicroSeconds(now - gc_grace_period_in_seconds_ - 1)),
+CreateTestColumnSpec(kColumn, 1, ToMicroSeconds(now))
 }));
 store.Append("k2", CreateTestRowValue({
-std::make_tuple(kColumn, 0, ToMicroSeconds(now))
+CreateTestColumnSpec(kColumn, 0, ToMicroSeconds(now))
 }));
 store.Flush();
 store.Append("k1",CreateTestRowValue({
-std::make_tuple(kColumn, 1, ToMicroSeconds(now)),
+CreateTestColumnSpec(kColumn, 1, ToMicroSeconds(now)),
 }));
 store.Flush();
@@ -296,7 +296,7 @@ TEST_F(CassandraFunctionalTest, CompactionShouldRemoveTombstoneFromPut) {
 int64_t now = time(nullptr);
 store.Put("k1", CreateTestRowValue({
-std::make_tuple(kTombstone, 0, ToMicroSeconds(now - gc_grace_period_in_seconds_ - 1)),
+CreateTestColumnSpec(kTombstone, 0, ToMicroSeconds(now - gc_grace_period_in_seconds_ - 1)),
 }));
 store.Flush();

@@ -15,27 +15,27 @@ TEST(RowValueMergeTest, Merge) {
 std::vector<RowValue> row_values;
 row_values.push_back(
 CreateTestRowValue({
-std::make_tuple(kTombstone, 0, 5),
-std::make_tuple(kColumn, 1, 8),
-std::make_tuple(kExpiringColumn, 2, 5),
+CreateTestColumnSpec(kTombstone, 0, 5),
+CreateTestColumnSpec(kColumn, 1, 8),
+CreateTestColumnSpec(kExpiringColumn, 2, 5),
 })
 );
 row_values.push_back(
 CreateTestRowValue({
-std::make_tuple(kColumn, 0, 2),
-std::make_tuple(kExpiringColumn, 1, 5),
-std::make_tuple(kTombstone, 2, 7),
-std::make_tuple(kExpiringColumn, 7, 17),
+CreateTestColumnSpec(kColumn, 0, 2),
+CreateTestColumnSpec(kExpiringColumn, 1, 5),
+CreateTestColumnSpec(kTombstone, 2, 7),
+CreateTestColumnSpec(kExpiringColumn, 7, 17),
 })
 );
 row_values.push_back(
 CreateTestRowValue({
-std::make_tuple(kExpiringColumn, 0, 6),
-std::make_tuple(kTombstone, 1, 5),
-std::make_tuple(kColumn, 2, 4),
-std::make_tuple(kTombstone, 11, 11),
+CreateTestColumnSpec(kExpiringColumn, 0, 6),
+CreateTestColumnSpec(kTombstone, 1, 5),
+CreateTestColumnSpec(kColumn, 2, 4),
+CreateTestColumnSpec(kTombstone, 11, 11),
 })
 );
@@ -60,24 +60,24 @@ TEST(RowValueMergeTest, MergeWithRowTombstone) {
 // This row's timestamp is smaller than tombstone.
 row_values.push_back(
 CreateTestRowValue({
-std::make_tuple(kColumn, 0, 5),
-std::make_tuple(kColumn, 1, 6),
+CreateTestColumnSpec(kColumn, 0, 5),
+CreateTestColumnSpec(kColumn, 1, 6),
 })
 );
 // Some of the column's row is smaller, some is larger.
 row_values.push_back(
 CreateTestRowValue({
-std::make_tuple(kColumn, 2, 10),
-std::make_tuple(kColumn, 3, 12),
+CreateTestColumnSpec(kColumn, 2, 10),
+CreateTestColumnSpec(kColumn, 3, 12),
 })
 );
 // All of the column's rows are larger than tombstone.
 row_values.push_back(
 CreateTestRowValue({
-std::make_tuple(kColumn, 4, 13),
-std::make_tuple(kColumn, 5, 14),
+CreateTestColumnSpec(kColumn, 4, 13),
+CreateTestColumnSpec(kColumn, 5, 14),
 })
 );

@@ -129,7 +129,7 @@ std::shared_ptr<Tombstone> ExpiringColumn::ToTombstone() const {
 int64_t marked_for_delete_at =
 std::chrono::duration_cast<std::chrono::microseconds>(expired_at).count();
 return std::make_shared<Tombstone>(
-ColumnTypeMask::DELETION_MASK,
+static_cast<int8_t>(ColumnTypeMask::DELETION_MASK),
 Index(),
 local_deletion_time,
 marked_for_delete_at);

@@ -29,6 +29,12 @@ std::shared_ptr<ColumnBase> CreateTestColumn(int8_t mask,
 }
 }
+std::tuple<int8_t, int8_t, int64_t> CreateTestColumnSpec(int8_t mask,
+int8_t index,
+int64_t timestamp) {
+return std::make_tuple(mask, index, timestamp);
+}
 RowValue CreateTestRowValue(
 std::vector<std::tuple<int8_t, int8_t, int64_t>> column_specs) {
 std::vector<std::shared_ptr<ColumnBase>> columns;

@@ -23,6 +23,10 @@ std::shared_ptr<ColumnBase> CreateTestColumn(int8_t mask,
 int8_t index,
 int64_t timestamp);
+std::tuple<int8_t, int8_t, int64_t> CreateTestColumnSpec(int8_t mask,
+int8_t index,
+int64_t timestamp);
 RowValue CreateTestRowValue(
 std::vector<std::tuple<int8_t, int8_t, int64_t>> column_specs);
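
The diff does not state why the call sites moved from std::make_tuple to CreateTestColumnSpec. One observable difference is the resulting tuple type: std::make_tuple deduces element types from its arguments, while a helper with fixed parameter types always returns std::tuple<int8_t, int8_t, int64_t>. A small sketch of that difference, using a hypothetical MakeSpec that mirrors the helper's signature:

```cpp
#include <cassert>
#include <cstdint>
#include <tuple>
#include <type_traits>

// Hypothetical stand-in for CreateTestColumnSpec: the parameter types fix
// the element types of the returned tuple, whatever the caller passes.
std::tuple<int8_t, int8_t, int64_t> MakeSpec(int8_t mask, int8_t index,
                                             int64_t timestamp) {
  return std::make_tuple(mask, index, timestamp);
}

using Spec = std::tuple<int8_t, int8_t, int64_t>;

// Integer literals are converted at the call, so the result is exactly Spec.
static_assert(std::is_same<decltype(MakeSpec(0, 0, 0)), Spec>::value,
              "helper always yields the fixed tuple type");

// make_tuple keeps the deduced types: the literals 0 and 5 stay int.
static_assert(
    std::is_same<decltype(std::make_tuple(int8_t{0}, 0, 5)),
                 std::tuple<int8_t, int, int>>::value,
    "make_tuple deduces element types from its arguments");
```

With the helper, every element of the initializer list passed to CreateTestRowValue already has the vector's element type, so no tuple-to-tuple conversion is involved at the call site.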

@@ -207,7 +207,9 @@ Status CheckpointImpl::CreateCustomCheckpoint(
 TEST_SYNC_POINT("CheckpointImpl::CreateCheckpoint:SavedLiveFiles1");
 TEST_SYNC_POINT("CheckpointImpl::CreateCheckpoint:SavedLiveFiles2");
-db_->FlushWAL(false /* sync */);
+if (db_options.manual_wal_flush) {
+db_->FlushWAL(false /* sync */);
+}
 }
 // if we have more than one column family, we need to also get WAL files
 if (s.ok()) {

@@ -1485,4 +1485,5 @@ EnvLibrados* EnvLibrados::Default() {
 default_pool_name);
 return &default_env;
 }
+// @lint-ignore TXT4 T25377293 Grandfathered in
 }

@@ -89,6 +89,7 @@ class CacheActivityLogger {
 log_line += key.ToString(true);
 log_line += " - ";
 AppendNumberTo(&log_line, size);
+// @lint-ignore TXT2 T25377293 Grandfathered in
 log_line += "\n";
 // line format: "ADD - <KEY> - <KEY-SIZE>"

@@ -193,6 +193,7 @@ TEST_F(SimCacheTest, SimCacheLogging) {
 ASSERT_EQ(add_num, num_block_entries);
 // Log things again but stop logging automatically after reaching 512 bytes
+// @lint-ignore TXT2 T25377293 Grandfathered in
 int max_size = 512;
 ASSERT_OK(sim_cache->StartActivityLogging(log_file, env_, max_size));
 for (int it = 0; it < 10; it++) {