Compare commits


36 Commits
main ... 7.0.fb

Author SHA1 Message Date
Hui Xiao
985c8741ad Conditionally declare and define variable that is unused in LITE mode (#9854)
Summary:
Context:
As mentioned in https://github.com/facebook/rocksdb/issues/9701, we hit the following error with `LITE=1 make static_lib` for v7.0.2:
```
  CC       file/sequence_file_reader.o
  CC       file/sst_file_manager_impl.o
  CC       file/writable_file_writer.o
In file included from file/writable_file_writer.cc:10:
./file/writable_file_writer.h:163:15: error: private field 'temperature_' is not used [-Werror,-Wunused-private-field]
  Temperature temperature_;
              ^
1 error generated.
make: *** [file/writable_file_writer.o] Error 1
```

Fix: as titled, conditionally declare and define the unused variable so it is only present outside LITE mode.
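
A minimal sketch of that guard pattern (illustrative only; the real change is in `file/writable_file_writer.h`, and `WritableFileWriterLike` is a hypothetical stand-in):
```
enum class Temperature { kUnknown, kHot, kWarm, kCold };

class WritableFileWriterLike {
 public:
  explicit WritableFileWriterLike(Temperature t) {
#ifndef ROCKSDB_LITE
    temperature_ = t;
#else
    (void)t;  // parameter is intentionally unused in LITE builds
#endif
  }

#ifndef ROCKSDB_LITE
  // The member is only declared, initialized, and read outside LITE mode,
  // so -Wunused-private-field cannot fire when LITE=1 strips these paths.
  Temperature GetFileTemperature() const { return temperature_; }
#endif

 private:
#ifndef ROCKSDB_LITE
  Temperature temperature_ = Temperature::kUnknown;
#endif
};
```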

Pull Request resolved: https://github.com/facebook/rocksdb/pull/9854

Test Plan:
- Local `LITE=1 make static_lib` reveals the same error and error is gone after this fix
- CI

Reviewed By: ajkr, jay-zhuang

Differential Revision: D35706585

Pulled By: hx235

fbshipit-source-id: 7743310298231ad6866304ffa2225c8abdc91d9a
2022-04-19 12:00:09 -07:00
Adam Retter
da47b71229 Retrieve ZLib from the archive (#9783) 2022-04-04 18:08:36 -07:00
Yuriy Chernyshov
076ac983f4 Do not rely on ADL when invoking std::max_element (#9608)
Summary:
Certain STLs use raw pointers and ADL does not work for them.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/9608

Reviewed By: ajkr

Differential Revision: D34583012

Pulled By: riversand963

fbshipit-source-id: 7de6bbc8a080c3e7243ce0d758fe83f1663168aa
2022-03-29 12:53:45 -07:00
Andrew Kryczka
9622795deb update version.h and HISTORY.md for 7.0.4 2022-03-29 12:50:05 -07:00
myasuka
e7de808d35 Enable READ_BLOCK_COMPACTION_MICROS to track stats (#9722)
Summary:
After commit [d642c60](d642c60bdc), the stats `READ_BLOCK_COMPACTION_MICROS` cannot record any compaction read duration and always reports zero.

This PR distinguishes `READ_BLOCK_COMPACTION_MICROS` from `READ_BLOCK_GET_MICROS` so that `READ_BLOCK_COMPACTION_MICROS` records the correct stats.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/9722

Reviewed By: ajkr

Differential Revision: D35021870

Pulled By: jay-zhuang

fbshipit-source-id: f1a804994265e51465de64c2a08f2e0eeb6fc5a3
2022-03-29 12:48:08 -07:00
Peter Dillinger
8b2cfef9e0 Fix heap use-after-free race with DropColumnFamily (#9730)
Summary:
Although ColumnFamilySet comments say that the DB mutex can be
released during iteration as long as you hold a ref, this is not quite
true: UnrefAndTryDelete might delete the cfd right before it is needed
to get ->next_ for the next iteration of the loop.

This change solves the problem by making a wrapper class that makes such
iteration easier while handling the tricky details of UnrefAndTryDelete
on the previous cfd only after getting next_ in operator++.

FreeDeadColumnFamilies should already have been obsolete; this removes
it for good. Similarly, ColumnFamilySet::iterator doesn't need to check
for cfd with 0 refs, because those are immediately deleted.
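
For illustration, a minimal sketch of the "ref the next element before unreffing the previous one" iteration pattern described above; `CfdNode` and `RefedIterator` are hypothetical stand-ins, not the actual RocksDB classes:
```
struct CfdNode {
  CfdNode* next_ = nullptr;
  void Ref() {}
  bool UnrefAndTryDelete() { return false; }  // may delete *this when refs hit 0
};

class RefedIterator {
 public:
  explicit RefedIterator(CfdNode* cfd) : cfd_(cfd) {
    if (cfd_ != nullptr) cfd_->Ref();
  }
  RefedIterator& operator++() {
    CfdNode* prev = cfd_;
    cfd_ = cfd_->next_;          // read next_ while we still hold a ref on prev
    if (cfd_ != nullptr) cfd_->Ref();
    prev->UnrefAndTryDelete();   // prev may be deleted now, but we no longer need it
    return *this;
  }
  CfdNode* operator*() const { return cfd_; }

 private:
  CfdNode* cfd_;
};
```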

Pull Request resolved: https://github.com/facebook/rocksdb/pull/9730

Test Plan:
The issue was reported by ASAN on unit tests like
DBLogicalBlockSizeCacheTest.CreateColumnFamily (very rare); keep watching.

Reviewed By: ltamasi

Differential Revision: D35038143

Pulled By: pdillinger

fbshipit-source-id: 0a5478d5be96c135343a00603711b7df43ae19c9
2022-03-29 12:47:41 -07:00
Yanqin Jin
2de3bfc7d1 Fix a race condition in WAL tracking causing DB open failure (#9715)
Summary:
There is a race condition if WAL tracking in the MANIFEST is enabled in a database that disables 2PC.

The race condition is between two background flush threads trying to install flush results to the MANIFEST.

Consider an example database with two column families: "default" (cfd0) and "cf1" (cfd1). Initially,
both column families have one mutable (active) memtable whose data is backed by 6.log.

1. Trigger a manual flush for "cf1", creating a 7.log
2. Insert another key to "default", and trigger flush for "default", creating 8.log
3. BgFlushThread1 finishes writing 9.sst
4. BgFlushThread2 finishes writing 10.sst

```
Time  BgFlushThread1                                    BgFlushThread2
 |    mutex_.Lock()
 |    precompute min_wal_to_keep as 6
 |    mutex_.Unlock()
 |                                                     mutex_.Lock()
 |                                                     precompute min_wal_to_keep as 6
 |                                                     join MANIFEST write queue and mutex_.Unlock()
 |    write to MANIFEST
 |    mutex_.Lock()
 |    cfd1->log_number = 7
 |    Signal bg_flush_2 and mutex_.Unlock()
 |                                                     wake up and mutex_.Lock()
 |                                                     cfd0->log_number = 8
 |                                                     FindObsoleteFiles() with job_context->log_number == 7
 |                                                     mutex_.Unlock()
 |                                                     PurgeObsoleteFiles() deletes 6.log
 V
```

As shown in the above, BgFlushThread2 thinks that the min wal to keep is 6.log because "cf1" has unflushed data in 6.log (cf1.log_number=6).
Similarly, BgFlushThread1 thinks that the min wal to keep is also 6.log because "default" has unflushed data (default.log_number=6).
No WAL deletion will be written to MANIFEST because 6 is equal to `versions_->wals_.min_wal_number_to_keep`,
due to https://github.com/facebook/rocksdb/blob/7.1.fb/db/memtable_list.cc#L513:L514.
The bg flush thread that finishes last will perform file purging. `job_context.log_number` will be evaluated as 7, i.e.
the min wal that contains unflushed data, causing 6.log to be deleted. However, MANIFEST thinks 6.log should still exist.
If you close the db at this point, you won't be able to re-open it if `track_and_verify_wal_in_manifest` is true.

We must handle the case of multiple bg flush threads, and it is difficult for one bg flush thread to know
the correct min wal number until the other bg flush threads have finished committing to the manifest and updated
the `cfd::log_number`.
To fix this issue, we rename an existing variable `min_log_number_to_keep_2pc` to `min_log_number_to_keep`,
and use it to track WAL file deletion in non-2pc mode as well.
This variable is updated only 1) during recovery with mutex held, or 2) in the MANIFEST write thread.
`min_log_number_to_keep` means RocksDB will delete WALs below it, although there may be WALs
above it which are also obsolete. Formally, we will have [min_wal_to_keep, max_obsolete_wal]. During recovery, we
make sure that only WALs above max_obsolete_wal are checked and added back to `alive_log_files_`.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/9715

Test Plan:
```
make check
```
Also ran stress test below (with asan) to make sure it completes successfully.
```
TEST_TMPDIR=/dev/shm/rocksdb OPT=-g ASAN_OPTIONS=disable_coredump=0 \
CRASH_TEST_EXT_ARGS=--compression_type=zstd SKIP_FORMAT_BUCK_CHECKS=1 \
make J=52 -j52 blackbox_asan_crash_test
```

Reviewed By: ltamasi

Differential Revision: D34984412

Pulled By: riversand963

fbshipit-source-id: c7b21a8d84751bb55ea79c9f387103d21b231005
2022-03-29 12:46:43 -07:00
KNOEEE
6733f75086 Fix a bug in PosixClock (#9695)
Summary:
Multiplier here should be 1e6 to get microseconds.
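
For reference, a minimal sketch of the unit conversion the message refers to (illustrative, not the exact RocksDB code):
```
#include <cstdint>

// Converting (seconds, nanoseconds) to microseconds requires multiplying the
// seconds by 1,000,000 (1e6), not 1,000.
uint64_t ToMicroseconds(uint64_t secs, uint64_t nanos) {
  return secs * 1000000 + nanos / 1000;
}
```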

Pull Request resolved: https://github.com/facebook/rocksdb/pull/9695

Reviewed By: ajkr

Differential Revision: D34897086

Pulled By: jay-zhuang

fbshipit-source-id: 9c1d0811ea740ba0a007edc2da199edbd000b88b
2022-03-29 12:45:42 -07:00
duyuqi
fdd0e3a875 fix a bug, c api, if enable inplace_update_support, and use create sn… (#9471)
Summary:
Releasing a snapshot through the C API will core dump when `inplace_update_support` is enabled and a snapshot has been created.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/9471

Reviewed By: akankshamahajan15

Differential Revision: D34965103

Pulled By: riversand963

fbshipit-source-id: c3aeeb9ea7126c2eda1466102794fecf57b6ab77
2022-03-29 12:45:29 -07:00
Yanqin Jin
7789c45856 Fix assertion error by doing comparison with mutex (#9717)
Summary:
On CircleCI MacOS instances, we have been seeing the following assertion error:
```
Assertion failed: (alive_log_files_tail_ == alive_log_files_.rbegin()), function WriteToWAL, file /Users/distiller/project/db/db_impl/db_impl_write.cc, line 1213.
Received signal 6 (Abort trap: 6)
#0   0x1
#1   abort (in libsystem_c.dylib) + 120
#2   err (in libsystem_c.dylib) + 0
#3   rocksdb::DBImpl::WriteToWAL(rocksdb::WriteBatch const&, rocksdb::log::Writer*, unsigned long long*, unsigned long long*, rocksdb::Env::IOPriority, bool, bool) (in librocksdb.7.0.0.dylib) (db_impl_write.cc:1213)
#4   rocksdb::DBImpl::WriteToWAL(rocksdb::WriteThread::WriteGroup const&, rocksdb::log::Writer*, unsigned long long*, bool, bool, unsigned long long) (in librocksdb.7.0.0.dylib) (db_impl_write.cc:1251)
#5   rocksdb::DBImpl::WriteImpl(rocksdb::WriteOptions const&, rocksdb::WriteBatch*, rocksdb::WriteCallback*, unsigned long long*, unsigned long long, bool, unsigned long long*, unsigned long, rocksdb::PreReleaseCallback*) (in librocksdb.7.0.0.dylib) (db_impl_write.cc:421)
#6   rocksdb::DBImpl::Write(rocksdb::WriteOptions const&, rocksdb::WriteBatch*) (in librocksdb.7.0.0.dylib) (db_impl_write.cc:109)
#7   rocksdb::DB::Put(rocksdb::WriteOptions const&, rocksdb::ColumnFamilyHandle*, rocksdb::Slice const&, rocksdb::Slice const&, rocksdb::Slice const&) (in librocksdb.7.0.0.dylib) (db_impl_write.cc:2159)
#8   rocksdb::DBImpl::Put(rocksdb::WriteOptions const&, rocksdb::ColumnFamilyHandle*, rocksdb::Slice const&, rocksdb::Slice const&, rocksdb::Slice const&) (in librocksdb.7.0.0.dylib) (db_impl_write.cc:37)
#9   rocksdb::DB::Put(rocksdb::WriteOptions const&, rocksdb::Slice const&, rocksdb::Slice const&, rocksdb::Slice const&) (in librocksdb.7.0.0.dylib) (db.h:382)
#10  rocksdb::DBBasicTestWithTimestampPrefixSeek_IterateWithPrefix_Test::TestBody() (in db_with_timestamp_basic_test) (db_with_timestamp_basic_test.cc:2926)
#11  void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) (in db_with_timestamp_basic_test) (gtest-all.cc:3899)
#12  void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) (in db_with_timestamp_basic_test) (gtest-all.cc:3935)
#13  testing::Test::Run() (in db_with_timestamp_basic_test) (gtest-all.cc:3980)
#14  testing::TestInfo::Run() (in db_with_timestamp_basic_test) (gtest-all.cc:4153)
#15  testing::TestCase::Run() (in db_with_timestamp_basic_test) (gtest-all.cc:4266)
#16  testing::internal::UnitTestImpl::RunAllTests() (in db_with_timestamp_basic_test) (gtest-all.cc:6632)
#17  bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) (in db_with_timestamp_basic_test) (gtest-all.cc:3899)
#18  bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) (in db_with_timestamp_basic_test) (gtest-all.cc:3935)
#19  testing::UnitTest::Run() (in db_with_timestamp_basic_test) (gtest-all.cc:6242)
#20  RUN_ALL_TESTS() (in db_with_timestamp_basic_test) (gtest.h:22110)
#21  main (in db_with_timestamp_basic_test) (db_with_timestamp_basic_test.cc:3150)
#22  start (in libdyld.dylib) + 1
```

It's likely caused by concurrent, unprotected access to the deque, even though `back()` is never popped,
and we are comparing `rbegin()` with a cached `riterator`. To be safe, do the comparison only if we hold the mutex.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/9717

Test Plan:
One example:
SSH to a CircleCI macOS instance and run
```
gtest-parallel -r 1000 -w 8 ./db_test --gtest_filter=DBTest.FlushesInParallelWithCompactRange
```

Reviewed By: pdillinger

Differential Revision: D34990696

Pulled By: riversand963

fbshipit-source-id: 62dd48ae6fedbda53d0a64d73de9b948b4c26eee
2022-03-29 12:45:07 -07:00
Yanqin Jin
1ac5d37276 Fix race condition caused by concurrent accesses to forceMmapOff_ when opening Posix WritableFile (#9685)
Summary:
Pull Request resolved: https://github.com/facebook/rocksdb/pull/9685

Our TSAN reports a race condition as follows when running test
```
gtest-parallel -r 100 ./external_sst_file_test --gtest_filter=ExternalSSTFileTest.MultiThreaded
```
leads to the following

```
WARNING: ThreadSanitizer: data race (pid=2683148)
  Write of size 1 at 0x556fede63340 by thread T7:
    #0 rocksdb::(anonymous namespace)::PosixFileSystem::OpenWritableFile(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, rocksdb::FileOptions const&, bool, std::unique_ptr<rocksdb::FSWritableFile, std::default_delete<rocksdb::FSWritableFile> >*, rocksdb::IODebugContext*) internal_repo_rocksdb/repo/env/fs_posix.cc:334 (external_sst_file_test+0xb61ac4)
    #1 rocksdb::(anonymous namespace)::PosixFileSystem::ReopenWritableFile(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, rocksdb::FileOptions const&, std::unique_ptr<rocksdb::FSWritableFile, std::default_delete<rocksdb::FSWritableFile> >*, rocksdb::IODebugContext*) internal_repo_rocksdb/repo/env/fs_posix.cc:382 (external_sst_file_test+0xb5ba96)
    #2 rocksdb::CompositeEnv::ReopenWritableFile(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::unique_ptr<rocksdb::WritableFile, std::default_delete<rocksdb::WritableFile> >*, rocksdb::EnvOptions const&) internal_repo_rocksdb/repo/env/composite_env.cc:334 (external_sst_file_test+0xa6ab7f)
    #3 rocksdb::EnvWrapper::ReopenWritableFile(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::unique_ptr<rocksdb::WritableFile, std::default_delete<rocksdb::WritableFile> >*, rocksdb::EnvOptions const&) internal_repo_rocksdb/repo/include/rocksdb/env.h:1428 (external_sst_file_test+0x561f3e)
Previous read of size 1 at 0x556fede63340 by thread T4:
    #0 rocksdb::(anonymous namespace)::PosixFileSystem::OpenWritableFile(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, rocksdb::FileOptions const&, bool, std::unique_ptr<rocksdb::FSWritableFile, std::default_delete<rocksdb::FSWritableFile> >*, rocksdb::IODebugContext*) internal_repo_rocksdb/repo/env/fs_posix.cc:328 (external_sst_file_test+0xb61a70)
    #1 rocksdb::(anonymous namespace)::PosixFileSystem::ReopenWritableFile(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator
...
```

Fix by making sure the following block gets executed only once:
```
      if (!checkedDiskForMmap_) {
        // this will be executed once in the program's lifetime.
        // do not use mmapWrite on non ext-3/xfs/tmpfs systems.
        if (!SupportsFastAllocate(fname)) {
          forceMmapOff_ = true;
        }
        checkedDiskForMmap_ = true;
      }
```
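
One common way to make this kind of one-time initialization thread-safe is `std::call_once`; below is a minimal sketch of that idea, not necessarily the exact fix that was applied (`SupportsFastAllocateFor` is a hypothetical stand-in):
```
#include <mutex>
#include <string>

static std::once_flag mmap_check_once;
static bool force_mmap_off = false;

// Placeholder for the real filesystem probe; always "supported" here.
static bool SupportsFastAllocateFor(const std::string& /*fname*/) { return true; }

void CheckDiskForMmap(const std::string& fname) {
  std::call_once(mmap_check_once, [&fname] {
    if (!SupportsFastAllocateFor(fname)) {
      force_mmap_off = true;  // decided exactly once for the process lifetime
    }
  });
}
```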

Reviewed By: pdillinger

Differential Revision: D34780308

fbshipit-source-id: b761f66b24c8b5b8389d86ea371c8542b8d869d5
2022-03-29 12:44:31 -07:00
Yanqin Jin
4ae8687e88 Fix TSAN caused by calling rend() and pop_front(). (#9698)
Summary:
PR9686 makes `WriteToWAL()` call `assert(...!=rend())` while holding neither the
db mutex nor the log mutex. Another thread may concurrently call
`pop_front()`, causing a race condition.
To fix, assert only if the mutex is held.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/9698

Test Plan: COMPILE_WITH_TSAN=1 make check

Reviewed By: jay-zhuang

Differential Revision: D34898535

Pulled By: riversand963

fbshipit-source-id: 1ddfa5bf1b6ae8d409cab6ff6e1b5321c6803da9
2022-03-29 12:43:09 -07:00
Yanqin Jin
3e18c47db0 Fix a TSAN-reported bug caused by concurrent accesses to std::deque (#9686)
Summary:
Pull Request resolved: https://github.com/facebook/rocksdb/pull/9686

According to https://www.cplusplus.com/reference/deque/deque/back/,
"
The container is accessed (neither the const nor the non-const versions modify the container).
The last element is potentially accessed or modified by the caller. Concurrently accessing or modifying other elements is safe.
"

Also according to https://www.cplusplus.com/reference/deque/deque/pop_front/,
"
The container is modified.
The first element is modified. Concurrently accessing or modifying other elements is safe (although see iterator validity above).
"
In RocksDB, we never pop the last element of `DBImpl::alive_log_files_`. We have been
exploiting this fact and the above two properties when ensuring correctness when
`DBImpl::alive_log_files_` may be accessed concurrently. Specifically, it can be accessed
in the write path when the db mutex is released; sometimes the log_mutex_ is held. It can also be accessed in `FindObsoleteFiles()`,
where the db mutex is always held, and during recovery, where the db mutex is also held.
Given the fact that we never pop the last element of alive_log_files_, we currently do not
acquire additional locks when accessing it in `WriteToWAL()` as follows
```
alive_log_files_.back().AddSize(log_entry.size());
```

This is problematic.

Check source code of deque.h
```
  back() _GLIBCXX_NOEXCEPT
  {
__glibcxx_requires_nonempty();
...
  }

  pop_front() _GLIBCXX_NOEXCEPT
  {
...
  if (this->_M_impl._M_start._M_cur
      != this->_M_impl._M_start._M_last - 1)
    {
      ...
      ++this->_M_impl._M_start._M_cur;
    }
  ...
  }
```

`back()` will actually call `__glibcxx_requires_nonempty()` first.
If `__glibcxx_requires_nonempty()` is enabled and not an empty macro,
it will call `empty()`
```
bool empty() {
return this->_M_impl._M_finish == this->_M_impl._M_start;
}
```
You can see that it will access `this->_M_impl._M_start`, racing with `pop_front()`.
Therefore, TSAN will actually catch the bug in this case.

To be able to use TSAN on our library and unit tests, we should always coordinate
concurrent accesses to STL containers properly.

One option is to pass information about the db mutex and log mutex into `WriteToWAL()`; otherwise
it's impossible to know which mutex to acquire inside the function.

To fix this, we instead cache the tail of `alive_log_files_` by reference, so that we do not have to call `back()` in `WriteToWAL()`.
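
A minimal sketch of the "cache the tail by reference" idea, using a simplified stand-in for the alive-log-files entry type (illustrative, not the actual RocksDB code):
```
#include <cstdint>
#include <deque>

struct LogFileNumberSize {
  uint64_t number;
  uint64_t size;
  void AddSize(uint64_t n) { size += n; }
};

int main() {
  std::deque<LogFileNumberSize> alive_log_files;
  alive_log_files.push_back({6, 0});
  alive_log_files.push_back({7, 0});

  // Captured under the DB mutex in the real code. Because the last element is
  // never popped, this reference stays valid even after pop_front().
  LogFileNumberSize& tail = alive_log_files.back();

  alive_log_files.pop_front();  // invalidates only references to the erased element
  tail.AddSize(128);            // no call to back() on the hot write path
  return 0;
}
```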

Reviewed By: pdillinger

Differential Revision: D34780309

fbshipit-source-id: 1def9821f0c437f2736c6a26445d75890377889b
2022-03-29 12:42:09 -07:00
anand76
802093fe44 Fix issue #9627 (#9657)
Summary:
SMB mounts do not support hard links. The ENOTSUP error code is
returned, which should be interpreted by PosixFileSystem as
IOStatus::NotSupported().
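
A hedged sketch of the errno-to-status mapping described above; `Status` here is a simplified stand-in for `IOStatus`, not the real class:
```
#include <cerrno>
#include <cstring>
#include <string>
#include <unistd.h>

struct Status {
  static Status OK() { return {true, ""}; }
  static Status NotSupported(const std::string& m) { return {false, m}; }
  static Status IOError(const std::string& m) { return {false, m}; }
  bool ok;
  std::string msg;
};

Status LinkFile(const std::string& src, const std::string& dst) {
  if (link(src.c_str(), dst.c_str()) != 0) {
    if (errno == ENOTSUP) {
      // e.g. SMB mounts: report "not supported" rather than a generic I/O error
      return Status::NotSupported(std::strerror(errno));
    }
    return Status::IOError(std::strerror(errno));
  }
  return Status::OK();
}
```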

Pull Request resolved: https://github.com/facebook/rocksdb/pull/9657

Reviewed By: mrambacher

Differential Revision: D34634783

Pulled By: anand1976

fbshipit-source-id: 0d57f5b2e6118e4c20e9ed1a293327428c3aecac
2022-03-29 12:37:22 -07:00
Jay Zhuang
63a39d9110 Fix a race condition when disable and enable manual compaction (#9694)
Summary:
In https://github.com/facebook/rocksdb/issues/9659, when `DisableManualCompaction()` is issued, the foreground
manual compaction thread does not have to wait for the background compaction
thread to finish. This could be a problem: if the user re-enables
manual compaction with `EnableManualCompaction()`, it may re-enable the
BG compaction which was supposed to be cancelled.
This patch makes the FG compaction wait on
`manual_compaction_state.done`, which is set either by the BG compaction or by the
Unschedule callback. Then when the FG manual compaction thread returns, it
should not have BG compaction running, so a shared_ptr is no longer needed
for `manual_compaction_state`.
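
A minimal sketch of the "wait on done" idea, using simplified stand-ins rather than the actual RocksDB types:
```
#include <condition_variable>
#include <mutex>

struct ManualCompactionState {
  std::mutex mu;
  std::condition_variable cv;
  bool done = false;

  // Called by the BG compaction or by the Unschedule callback.
  void MarkDone() {
    { std::lock_guard<std::mutex> lk(mu); done = true; }
    cv.notify_all();
  }

  // Called by the FG manual compaction thread; once this returns, no BG work
  // for this manual compaction can still be running.
  void WaitUntilDone() {
    std::unique_lock<std::mutex> lk(mu);
    cv.wait(lk, [this] { return done; });
  }
};
```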

Pull Request resolved: https://github.com/facebook/rocksdb/pull/9694

Test Plan: a StressTest and unittest

Reviewed By: ajkr

Differential Revision: D34885472

Pulled By: jay-zhuang

fbshipit-source-id: e6476175b43e8c59cd49f5c09241036a0716c274
2022-03-29 12:36:27 -07:00
Jay Zhuang
6996495c1b Minor fix for Windows build with zlib (#9699)
Summary:
```
conversion from 'size_t' to 'uLong', possible loss of data
```

Fix https://github.com/facebook/rocksdb/issues/9688
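
For illustration, the usual way to silence this MSVC warning is to make the narrowing explicit at the zlib call boundary (hypothetical helper, not the actual call site):
```
#include <cstddef>
#include <zlib.h>  // for uLong

// Explicit cast documents (and silences) the size_t -> uLong narrowing.
inline uLong ToZlibLength(std::size_t n) { return static_cast<uLong>(n); }
```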

Pull Request resolved: https://github.com/facebook/rocksdb/pull/9699

Reviewed By: riversand963

Differential Revision: D34901116

Pulled By: jay-zhuang

fbshipit-source-id: 969148a7a8c023449bd85055a1f0eec71d0a9b3f
2022-03-29 09:46:33 -07:00
Peter Dillinger
b7119ff818 Update HISTORY.md and version.h for 7.0.3
Summary: Also clarified in HISTORY.md that this patch release breaks
binary compatibility (because of FilterPolicy vtable)
2022-03-23 10:32:57 -07:00
Peter Dillinger
418e58e6a2 Fix a major performance bug in 7.0 re: filter compatibility (#9736)
Summary:
Bloom filters generated by pre-7.0 releases are not read by
7.0.x releases (and vice-versa) due to changes to FilterPolicy::Name()
in https://github.com/facebook/rocksdb/issues/9590. This can severely impact read performance and read I/O on
upgrade or downgrade with existing DB, but not data correctness.

To fix, we go back to using the old, unified name in SST metadata but (for
a while anyway) recognize the aliases that could be generated by early
7.0.x releases. This unfortunately requires a public API change to avoid
interfering with all the good changes from https://github.com/facebook/rocksdb/issues/9590, but the API change
only affects users with custom FilterPolicy, which should be very few.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/9736

Test Plan:
manual

Generate DBs with
```
./db_bench.7.0 -db=/dev/shm/rocksdb.7.0 -bloom_bits=10 -cache_index_and_filter_blocks=1 -benchmarks=fillrandom -num=10000000 -compaction_style=2 -fifo_compaction_max_table_files_size_mb=10000 -fifo_compaction_allow_compaction=0
```
and similar. Compare with
```
for IMPL in 6.29 7.0 fixed; do for DB in 6.29 7.0 fixed; do echo "Testing $IMPL on $DB:"; ./db_bench.$IMPL -db=/dev/shm/rocksdb.$DB -use_existing_db -readonly -bloom_bits=10 -benchmarks=readrandom -num=10000000 -compaction_style=2 -fifo_compaction_max_table_files_size_mb=10000 -fifo_compaction_allow_compaction=0 -duration=10 2>&1 | grep micros/op; done; done
```

Results:
```
Testing 6.29 on 6.29:
readrandom   :      34.381 micros/op 29085 ops/sec;    3.2 MB/s (291999 of 291999 found)
Testing 6.29 on 7.0:
readrandom   :     190.443 micros/op 5249 ops/sec;    0.6 MB/s (52999 of 52999 found)
Testing 6.29 on fixed:
readrandom   :      40.148 micros/op 24907 ops/sec;    2.8 MB/s (249999 of 249999 found)
Testing 7.0 on 6.29:
readrandom   :     229.430 micros/op 4357 ops/sec;    0.5 MB/s (43999 of 43999 found)
Testing 7.0 on 7.0:
readrandom   :      33.348 micros/op 29986 ops/sec;    3.3 MB/s (299999 of 299999 found)
Testing 7.0 on fixed:
readrandom   :     152.734 micros/op 6546 ops/sec;    0.7 MB/s (65999 of 65999 found)
Testing fixed on 6.29:
readrandom   :      32.024 micros/op 31224 ops/sec;    3.5 MB/s (312999 of 312999 found)
Testing fixed on 7.0:
readrandom   :      33.990 micros/op 29390 ops/sec;    3.3 MB/s (294999 of 294999 found)
Testing fixed on fixed:
readrandom   :      28.714 micros/op 34825 ops/sec;    3.9 MB/s (348999 of 348999 found)
```

Just paying attention to the order of magnitude of ops/sec (short test
durations, lots of noise), it's clear that with the fix we can read both <= 6.29
and >= 7.0 DBs at full speed, whereas neither 6.29 nor 7.0 can do both. And the 6.29
release can properly read the fixed DB at full speed.

Reviewed By: siying, ajkr

Differential Revision: D35057844

Pulled By: pdillinger

fbshipit-source-id: a46893a6af4bf084375ebe4728066d00eb08f050
2022-03-23 10:10:56 -07:00
gukaifeng
0e4c063c16 fix a bug of the ticker NO_FILE_OPENS (#9677)
Summary:
In the original code, the `NO_FILE_OPENS` ticker was incremented regardless of whether the file was successfully opened or not, and counts could even be repeated, which can lead to a skewed count.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/9677

Reviewed By: jay-zhuang

Differential Revision: D34725733

Pulled By: ajkr

fbshipit-source-id: 841234ed03802c0105fd2107d82a740265ead576
2022-03-15 10:47:44 -07:00
Jermy Li
fddc1a79dd fix: Reusing-Iterator reads stale keys after DeleteRange() performed (#9258)
Summary:
fix https://github.com/facebook/rocksdb/issues/9255

Pull Request resolved: https://github.com/facebook/rocksdb/pull/9258

Reviewed By: pdillinger

Differential Revision: D34879684

Pulled By: ajkr

fbshipit-source-id: 5934f4b7524dc27ecdf1430e0456a0fc02958fc7
2022-03-15 10:46:54 -07:00
Tomas Kolda
9db48d6e08 NPE in Java_org_rocksdb_ColumnFamilyOptions_setSstPartitionerFactory (#9622)
Summary:
There was a mistake that incorrectly cast SstPartitionerFactory (missed the shared pointer). It worked for the database (correct cast), but not for the column family: trying to set it on a family caused an access violation.

I have also added a test and improved it. The older version was passing even without the sst partitioner, which is weird, because on Level1 we had two SST files with the same key "aaaa1". I was not sure if that was a new feature, so I changed the test to use overlapping keys: "aaaa0" - "aaaa2" overlaps "aaaa1".

Pull Request resolved: https://github.com/facebook/rocksdb/pull/9622

Reviewed By: ajkr

Differential Revision: D34871968

Pulled By: pdillinger

fbshipit-source-id: a08009766da49fc198692a610e8beb19caf737e6
2022-03-15 10:42:49 -07:00
Yuriy Chernyshov
348414d502 #include <winioctl.h> as MSDN prescribes (#9612)
Summary:
The recommendation can be found e.g. [here](https://docs.microsoft.com/en-us/windows/win32/api/winioctl/ns-winioctl-storage_property_query).

While `<windows.h>` transitively includes `<winioctl.h>` by default, this can be switched off by `/DWIN32_LEAN_AND_MEAN` which forces the user to include-what-you-use.
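
A minimal illustration of the include-what-you-use point, based only on the behavior described above:
```
// With /DWIN32_LEAN_AND_MEAN, <windows.h> no longer drags in <winioctl.h>,
// so code using the storage IOCTL types must include it explicitly.
#define WIN32_LEAN_AND_MEAN
#include <windows.h>
#include <winioctl.h>  // required explicitly for STORAGE_PROPERTY_QUERY

STORAGE_PROPERTY_QUERY MakeDevicePropertyQuery() {
  STORAGE_PROPERTY_QUERY query = {};
  query.PropertyId = StorageDeviceSeekPenaltyProperty;
  query.QueryType = PropertyStandardQuery;
  return query;
}
```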

Pull Request resolved: https://github.com/facebook/rocksdb/pull/9612

Reviewed By: riversand963

Differential Revision: D34845629

Pulled By: ajkr

fbshipit-source-id: 1ef9273074e3d84864c6833a7de6eb9df81e29d9
2022-03-15 10:42:25 -07:00
Andrew Kryczka
7b6ca63ed0 update HISTORY.md and version.h for 7.0.2 2022-03-14 09:38:23 -07:00
Jay Zhuang
9fe3a53443 DisableManualCompaction may fail to cancel an unscheduled task (#9659)
Summary:
https://github.com/facebook/rocksdb/issues/9625 didn't change the unschedule condition, which was waiting for the background thread to clean up the compaction.
Make sure we only unschedule the task when it's actually scheduled.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/9659

Reviewed By: ajkr

Differential Revision: D34651820

Pulled By: jay-zhuang

fbshipit-source-id: 23f42081b15ec8886cd81cbf131b116e0c74dc2f
2022-03-12 22:17:59 -08:00
Jay Zhuang
22e011fe0a Fix a timer crash caused by invalid memory management (#9656)
Summary:
The timer could crash when multiple DB instances do heavy DB open and close
operations concurrently. This is caused by adding a timer task with a
smaller timestamp than the currently running task. Fix it by moving the code that
gets the new task's timestamp inside the timer mutex protection.
And other fixes:
- Disallow adding a duplicated function name to the timer
- Fix a minor memory leak in the timer when a running task is cancelled

Pull Request resolved: https://github.com/facebook/rocksdb/pull/9656

Reviewed By: ajkr

Differential Revision: D34626296

Pulled By: jay-zhuang

fbshipit-source-id: 6b6d96a5149746bf503546244912a9e41a0c5f6b
2022-03-12 22:10:28 -08:00
Andrew Kryczka
bbae679ce7 Avoid popcnt on Windows when unavailable and in portable builds (#9680)
Summary:
Fixes https://github.com/facebook/rocksdb/issues/9560. Only use the popcnt intrinsic when HAVE_SSE42 is set. Also avoid setting it based on a compiler test in portable builds because such a test will pass on MSVC even without proper arch flags (ref: https://devblogs.microsoft.com/oldnewthing/20201026-00/?p=104397).

Pull Request resolved: https://github.com/facebook/rocksdb/pull/9680

Test Plan: verified the combinations of -DPORTABLE and -DFORCE_SSE42 produce expected compiler flags on Linux. Verified MSVC build using PORTABLE=1 (in CircleCI) does not set HAVE_SSE42.

Reviewed By: pdillinger

Differential Revision: D34739033

Pulled By: ajkr

fbshipit-source-id: d10456f3392945fc3e59430a1777840f7b60b276
2022-03-09 22:51:01 -08:00
Adam Retter
fe9e363ba1 Fix RocksJava releases for macOS (#9662)
Summary:
Addresses the problems described in https://github.com/facebook/rocksdb/pull/9254#issuecomment-1054598516 and https://github.com/facebook/rocksdb/pull/9254#issuecomment-1059574837 that have blocked a RocksJava release

**NOTE** Also needs to be ported to 6.29.fb branch.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/9662

Reviewed By: ajkr

Differential Revision: D34689200

Pulled By: pdillinger

fbshipit-source-id: c62fe34c54f05be5a00ee1daec8ec7454baa5eb8
2022-03-07 15:29:08 -08:00
Jonathan Albrecht
3a4d73d337 Reenable s390x platform_dependent travis job (#9631)
Summary:
Fix g++ -march=native detection and reenable s390x in travis

This PR fixes s390x assembler messages:
```
Error: invalid switch -march=z14
Error: unrecognized option -march=z14
```

The s390x travis build was failing with gcc-7 because the assembler on
ubuntu 16.04 is too old to recognize the z14 model, so it doesn't work
with -march=native on a z14 machine. This PR fixes the check for the
-march=native flag so that the assembler gets called and correctly
fails on ubuntu 16.04, which causes the build to fall back to
-march=z196, which works.

The other changes are needed so builds work more consistently on
s390x:

1. Set make parallelism to 1 for s390x: The default was 4 previously
but I saw frequent internal compiler errors on travis probably due to
low resources. The `platform_dependent` job works more consistently
but is roughly 10 minutes slower although it varies.
2. Remove status_checked jobs, as we are relying on CircleCI for
these now and do not really need platform coverage on them.

Fixes https://github.com/facebook/rocksdb/issues/9524

Pull Request resolved: https://github.com/facebook/rocksdb/pull/9631

Test Plan: CI

Reviewed By: ajkr

Differential Revision: D34553989

Pulled By: pdillinger

fbshipit-source-id: a6e3a7276446721c4c0bebc4ed217c2ca2b53f11
2022-03-07 10:32:43 -08:00
Andrew Kryczka
4ab2fe77f3 update HISTORY.md and version.h for 7.0.1 2022-03-02 21:42:52 -08:00
Yanqin Jin
596c606739 Fix bug causing incorrect data returned by snapshot read (#9648)
Summary:
This bug affects use cases that meet the following conditions:
- (has only the default column family or disables WAL) and
- has at least one event listener
- atomic flush is NOT affected.

If the above conditions are met, RocksDB can release the db mutex before picking all the
existing memtables to flush. In the meantime, a snapshot can be created and the db's sequence
number can still be incremented. The upcoming flush will ignore this snapshot.
A later read using this snapshot can return an incorrect result.

To fix this issue, we call the listeners' callbacks after picking the memtables so that we avoid
creating snapshots during this interval.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/9648

Test Plan: make check

Reviewed By: ajkr

Differential Revision: D34555456

Pulled By: riversand963

fbshipit-source-id: 1438981e9f069a5916686b1a0ad7627f734cf0ee
2022-03-02 21:41:05 -08:00
jingkai.yuan
a6fd3332e5 Fix corruption error when compressing blob data with zlib. (#9572)
Summary:
An output buffer sized to the plain data length may not be big enough if the compression actually expands the data. So use deflateBound() to get an upper limit on the compressed output size before calling deflate().
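
A hedged sketch of the `deflateBound()` usage described above (a standalone zlib example, not the RocksDB blob code):
```
#include <string>
#include <zlib.h>

bool CompressWithZlib(const std::string& input, std::string* output) {
  z_stream stream{};
  if (deflateInit(&stream, Z_DEFAULT_COMPRESSION) != Z_OK) return false;

  // Upper bound on the compressed size for this stream configuration, so
  // deflate() cannot run out of space even when compression expands the data.
  uLong bound = deflateBound(&stream, static_cast<uLong>(input.size()));
  output->resize(bound);

  stream.next_in = reinterpret_cast<Bytef*>(const_cast<char*>(input.data()));
  stream.avail_in = static_cast<uInt>(input.size());
  stream.next_out = reinterpret_cast<Bytef*>(&(*output)[0]);
  stream.avail_out = static_cast<uInt>(bound);

  int rc = deflate(&stream, Z_FINISH);
  bool ok = (rc == Z_STREAM_END);
  if (ok) output->resize(stream.total_out);
  deflateEnd(&stream);
  return ok;
}
```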

Pull Request resolved: https://github.com/facebook/rocksdb/pull/9572

Reviewed By: riversand963

Differential Revision: D34326475

Pulled By: ajkr

fbshipit-source-id: 4b679cb7a83a62782a127785b4d5eb9aa4646449
2022-03-02 21:39:21 -08:00
Hui Xiao
f278e0e2d3 Deflake DBErrorHandlingFSTest.MultiCFWALWriteError (#9496)
Summary:
**Context:**
As part of https://github.com/facebook/rocksdb/pull/6949, file deletion is disabled for faulty database on the IOError of MANIFEST write/sync and [re-enabled again during `DBImpl::Resume()` if all recovery is completed](e66199d848 (diff-d9341fbe2a5d4089b93b22c5ed7f666bc311b378c26d0786f4b50c290e460187R396)). Before re-enabling file deletion, it `assert(versions_->io_status().ok());`, which IMO assumes `versions_` is **the** `version_` in the recovery process.

However, this is not necessarily true due to `s = error_handler_.ClearBGError();` happening before that assertion can unblock some foreground thread by [`EventHelpers::NotifyOnErrorRecoveryEnd()`](3122cb4358/db/error_handler.cc (L552-L553)) as part of the `ClearBGError()`. That foreground thread can do whatever it wants including closing/reopening the db and clean up that same `versions_`.

As a consequence, `assert(versions_->io_status().ok());` will access `io_status()` of a nullptr, and tests like `DBErrorHandlingFSTest.MultiCFWALWriteError` become flaky. The unblocked foreground thread (in this case, the testing thread) proceeds to [reopen the db](https://github.com/facebook/rocksdb/blob/6.29.fb/db/error_handler_fs_test.cc?fbclid=IwAR1kQOxSbTUmaHQPAGz5jdMHXtDsDFKiFl8rifX-vIz4B23Y0S9jBkssSCg#L1494), where [`versions_` gets reset to nullptr](https://github.com/facebook/rocksdb/blob/6.29.fb/db/db_impl/db_impl.cc?fbclid=IwAR2uRhwBiPKgmE9q_6CM2mzbfwjoRgsGpXOrHruSJUDcAKc9rYZtVSvKdOY#L678) as part of the old db clean-up. If this happens right before `assert(versions_->io_status().ok());` gets executed in the background thread, then we can see errors like
```
db/db_impl/db_impl.cc:420:5: runtime error: member call on null pointer of type 'rocksdb::VersionSet'
assert(versions_->io_status().ok());
```

**Summary:**
- I proposed to call `s = error_handler_.ClearBGError();` after we know it's fine to wake up the foreground, which I think is right before we log `ROCKS_LOG_INFO(immutable_db_options_.info_log, "Successfully resumed DB");`
   - As context, the original https://github.com/facebook/rocksdb/pull/3997 introducing `DBImpl::Resume()` calls `s = error_handler_.ClearBGError();` very close to calling `ROCKS_LOG_INFO(immutable_db_options_.info_log, "Successfully resumed DB");`, while the later https://github.com/facebook/rocksdb/pull/6949 distances these two calls a bit.
   - And it seems fine that `s = error_handler_.ClearBGError();` happens after `EnableFileDeletions(/*force=*/true);`, at least syntax-wise, since these two functions are orthogonal. It also seems okay to re-enable file deletion before `s = error_handler_.ClearBGError();`, which basically just resets some state variables.
- In addition, to preserve the previous behavior of https://github.com/facebook/rocksdb/pull/6949, where the status of re-enabling file deletion is not taken into account in the general status of resuming the db, I separated `enable_file_deletion_s` from the general `s`.
- In addition, to make `ROCKS_LOG_INFO(immutable_db_options_.info_log, "Successfully resumed DB");` clearer, I separated it into its own if-block.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/9496

Test Plan:
- Manually reproduce the assertion failure in `DBErrorHandlingFSTest.MultiCFWALWriteError` by injecting a sleep like below so that it's more likely for `assert(versions_->io_status().ok());` to execute after [reopening the db](https://github.com/facebook/rocksdb/blob/6.29.fb/db/error_handler_fs_test.cc?fbclid=IwAR1kQOxSbTUmaHQPAGz5jdMHXtDsDFKiFl8rifX-vIz4B23Y0S9jBkssSCg#L1494) in the foreground (i.e., testing) thread
```
sleep(1);
assert(versions_->io_status().ok());
```
   `python3 gtest-parallel/gtest_parallel.py -r 100 -w 100 rocksdb/error_handler_fs_test --gtest_filter=DBErrorHandlingFSTest.MultiCFWALWriteError`
   ```
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from DBErrorHandlingFSTest
[ RUN      ] DBErrorHandlingFSTest.MultiCFWALWriteError
Received signal 11 (Segmentation fault)
#0   rocksdb/error_handler_fs_test() [0x5818a4] rocksdb::DBImpl::ResumeImpl(rocksdb::DBRecoverContext)  /data/users/huixiao/rocksdb/db/db_impl/db_impl.cc:421
#1   rocksdb/error_handler_fs_test() [0x6379ff] rocksdb::ErrorHandler::RecoverFromBGError(bool) /data/users/huixiao/rocksdb/db/error_handler.cc:600
#2   rocksdb/error_handler_fs_test() [0x7c5362] rocksdb::SstFileManagerImpl::ClearError()       /data/users/huixiao/rocksdb/file/sst_file_manager_impl.cc:310
#3   rocksdb/error_handler_fs_test()
   ```
- The assertion failure does not happen with PR
`python3 gtest-parallel/gtest_parallel.py -r 100 -w 100 rocksdb/error_handler_fs_test --gtest_filter=DBErrorHandlingFSTest.MultiCFWALWriteError`
`[100/100] DBErrorHandlingFSTest.MultiCFWALWriteError (43785 ms)  `

Reviewed By: riversand963, anand1976

Differential Revision: D33990099

Pulled By: hx235

fbshipit-source-id: 2e0259a471fa8892ff177da91b3e1c0792dd7bab
2022-03-02 21:38:50 -08:00
Jay Zhuang
0344b1d331 Unschedule manual compaction from thread-pool queue (#9625)
Summary:
PR https://github.com/facebook/rocksdb/issues/9557 introduced a race condition between the manual compaction
foreground thread and the background compaction thread.
This PR adds the ability to really unschedule a manual compaction from the
thread-pool queue by differentiating the tag name for manual compaction and
other tasks.
Also fix an issue where db `close()` didn't cancel the manual compaction thread.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/9625

Test Plan: unit tests no longer hang

Reviewed By: ajkr

Differential Revision: D34410811

Pulled By: jay-zhuang

fbshipit-source-id: cb14065eabb8cf1345fa042b5652d4f788c0c40c
2022-03-02 21:37:10 -08:00
Andrew Kryczka
a4bf9311da Handle failures in block-based table size/offset approximation (#9615)
Summary:
In crash test with fault injection, we were seeing stack traces like the following:

```
#3  0x00007f75f763c533 in __GI___assert_fail (assertion=assertion@entry=0x1c5b2a0 "end_offset >= start_offset", file=file@entry=0x1c580a0 "table/block_based/block_based_table_reader.cc", line=line@entry=3245,
function=function@entry=0x1c60e60 "virtual uint64_t rocksdb::BlockBasedTable::ApproximateSize(const rocksdb::Slice&, const rocksdb::Slice&, rocksdb::TableReaderCaller)") at assert.c:101
#4  0x00000000010ea9b4 in rocksdb::BlockBasedTable::ApproximateSize (this=<optimized out>, start=..., end=..., caller=<optimized out>) at table/block_based/block_based_table_reader.cc:3224
#5  0x0000000000be61fb in rocksdb::TableCache::ApproximateSize (this=0x60f0000161b0, start=..., end=..., fd=..., caller=caller@entry=rocksdb::kCompaction, internal_comparator=..., prefix_extractor=...) at db/table_cache.cc:719
#6  0x0000000000c3eaec in rocksdb::VersionSet::ApproximateSize (this=<optimized out>, v=<optimized out>, f=..., start=..., end=..., caller=<optimized out>) at ./db/version_set.h:850
#7  0x0000000000c6ebc3 in rocksdb::VersionSet::ApproximateSize (this=<optimized out>, options=..., v=v@entry=0x621000047500, start=..., end=..., start_level=start_level@entry=0, end_level=<optimized out>, caller=<optimized out>)
at db/version_set.cc:5657
#8  0x000000000166e894 in rocksdb::CompactionJob::GenSubcompactionBoundaries (this=<optimized out>) at ./include/rocksdb/options.h:1869
#9  0x000000000168c526 in rocksdb::CompactionJob::Prepare (this=this@entry=0x7f75f3ffcf00) at db/compaction/compaction_job.cc:546
```

The problem occurred in `ApproximateSize()` when the index `Seek()` for the first `ApproximateDataOffsetOf()` encountered an I/O error, while the second `Seek()` did not. In the old code that scenario caused `start_offset == data_size`, thus it was easy to trip the assertion that `end_offset >= start_offset`.

The fix is to set `start_offset == 0` when the first index `Seek()` fails, and `end_offset == data_size` when the second index `Seek()` fails. I doubt these give an "on average correct" answer for how this function is used, but I/O errors in index seeks are hopefully rare, it looked consistent with what was already there, and it was easier to calculate.
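
A minimal sketch of that fallback logic, modeling the two index seeks as optional offsets (illustrative, not the actual `BlockBasedTable` code):
```
#include <cstdint>
#include <optional>

// A failed seek for the range start falls back to 0, and a failed seek for the
// range end falls back to the total data size, keeping end_offset >= start_offset.
uint64_t ApproximateRangeSize(std::optional<uint64_t> start_seek,
                              std::optional<uint64_t> end_seek,
                              uint64_t data_size) {
  uint64_t start_offset = start_seek.value_or(0);
  uint64_t end_offset = end_seek.value_or(data_size);
  return end_offset - start_offset;
}
```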

Pull Request resolved: https://github.com/facebook/rocksdb/pull/9615

Test Plan:
run the repro command for a while and stopped seeing coredumps -

```
$ while !  ./db_stress --block_size=128 --cache_size=32768 --clear_column_family_one_in=0 --column_families=1 --continuous_verification_interval=0 --db=/dev/shm/rocksdb_crashtest --delpercent=4 --delrangepercent=1 --destroy_db_initially=0 --expected_values_dir=/dev/shm/rocksdb_crashtest_expected --index_type=2 --iterpercent=10  --kill_random_test=18887 --max_key=1000000 --max_bytes_for_level_base=2048576 --nooverwritepercent=1 --open_files=-1 --open_read_fault_one_in=32 --ops_per_thread=1000000 --prefixpercent=5 --read_fault_one_in=0 --readpercent=45 --reopen=0 --skip_verifydb=1 --subcompactions=2 --target_file_size_base=524288 --test_batches_snapshots=0 --value_size_mult=32 --write_buffer_size=524288 --writepercent=35  ; do : ; done
```

Reviewed By: pdillinger

Differential Revision: D34383069

Pulled By: ajkr

fbshipit-source-id: fac26c3b20ea962e75387515ba5f2724dc48719f
2022-03-02 21:35:15 -08:00
Andrew Kryczka
b251c4f5a9 Dedicate cacheline for DB mutex (#9637)
Summary:
We found a case of cacheline bouncing due to writers locking/unlocking `mutex_` and readers accessing `block_cache_tracer_`. We discovered it only after the issue was fixed by https://github.com/facebook/rocksdb/issues/9462 shifting the `DBImpl` members such that `mutex_` and `block_cache_tracer_` were naturally placed in separate cachelines in our regression testing setup. This PR forces the cacheline alignment of `mutex_` so we don't accidentally reintroduce the problem.
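
A minimal sketch of dedicating a cache line to the mutex (illustrative; the actual change aligns `DBImpl::mutex_`):
```
#include <mutex>

struct DbLike {
  // Writers lock/unlock this constantly; keep it on its own cache line.
  alignas(64) std::mutex mutex_;
  // Read-mostly member that should not share a cache line with mutex_,
  // otherwise writer traffic on mutex_ causes false sharing with readers.
  alignas(64) const void* block_cache_tracer_ = nullptr;
};
```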

Pull Request resolved: https://github.com/facebook/rocksdb/pull/9637

Reviewed By: riversand963

Differential Revision: D34502233

Pulled By: ajkr

fbshipit-source-id: 46aa313b7fe83e80c3de254e332b6fb242434c07
2022-03-02 21:33:07 -08:00
Andrew Kryczka
a12be569af Update HISTORY.md and version.h for 7.0 release (#9609)
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/9609

Reviewed By: riversand963

Differential Revision: D34370309

Pulled By: ajkr

fbshipit-source-id: 5fc9306439aefa4b2d61d847534ea6758c30b6a5
2022-02-20 16:40:15 -08:00
563 changed files with 15988 additions and 33758 deletions


@ -2,6 +2,7 @@ version: 2.1
orbs:
win: circleci/windows@2.4.0
slack: circleci/slack@3.4.2
aliases:
- &notify-on-main-failure
@ -56,6 +57,7 @@ commands:
post-steps:
steps:
- slack/status: *notify-on-main-failure
- store_test_results: # store test result if there's any
path: /tmp/test-results
- store_artifacts: # store LOG for debugging if there's any
@ -85,6 +87,7 @@ commands:
- run:
name: Install Clang 13
command: |
wget -O - https://apt.llvm.org/llvm-snapshot.gpg.key | sudo apt-key add -
echo "deb http://apt.llvm.org/focal/ llvm-toolchain-focal-13 main" | sudo tee -a /etc/apt/sources.list
echo "deb-src http://apt.llvm.org/focal/ llvm-toolchain-focal-13 main" | sudo tee -a /etc/apt/sources.list
echo "APT::Acquire::Retries \"10\";" | sudo tee -a /etc/apt/apt.conf.d/80-retries # llvm.org unreliable
@ -100,22 +103,10 @@ commands:
install-benchmark:
steps:
- run:
name: Install ninja build
command: sudo apt-get update -y && sudo apt-get install -y ninja-build
- run:
name: Install benchmark
command: |
git clone --depth 1 --branch v1.6.1 https://github.com/google/benchmark.git ~/benchmark
cd ~/benchmark && mkdir build && cd build
cmake .. -GNinja -DCMAKE_BUILD_TYPE=Release -DBENCHMARK_ENABLE_GTEST_TESTS=0
ninja && sudo ninja install
install-valgrind:
steps:
- run:
name: Install valgrind
command: sudo apt-get update -y && sudo apt-get install -y valgrind
sudo apt-get update -y && sudo apt-get install -y libbenchmark-dev
upgrade-cmake:
steps:
@ -180,42 +171,19 @@ jobs:
- increase-max-open-files-on-macos
- install-gflags-on-macos
- pre-steps-macos
- run: ulimit -S -n `ulimit -H -n` && OPT=-DCIRCLECI make V=1 J=32 -j32 all
- run: ulimit -S -n 1048576 && OPT=-DCIRCLECI make V=1 J=32 -j32 check
- post-steps
build-macos-cmake:
macos:
xcode: 12.5.1
resource_class: large
parameters:
run_even_tests:
description: run even or odd tests, used to split tests to 2 groups
type: boolean
default: true
steps:
- increase-max-open-files-on-macos
- install-cmake-on-macos
- install-gflags-on-macos
- pre-steps-macos
- run:
name: "cmake generate project file"
command: ulimit -S -n `ulimit -H -n` && mkdir build && cd build && cmake -DWITH_GFLAGS=1 ..
- run:
name: "Build tests"
command: cd build && make V=1 -j32
- when:
condition: << parameters.run_even_tests >>
steps:
- run:
name: "Run even tests"
command: ulimit -S -n `ulimit -H -n` && cd build && ctest -j32 -I 0,,2
- when:
condition:
not: << parameters.run_even_tests >>
steps:
- run:
name: "Run odd tests"
command: ulimit -S -n `ulimit -H -n` && cd build && ctest -j32 -I 1,,2
- run: ulimit -S -n 1048576 && (mkdir build && cd build && cmake -DWITH_GFLAGS=1 .. && make V=1 -j32 && ctest -j10)
- post-steps
build-linux:
@ -228,16 +196,14 @@ jobs:
- run: make V=1 J=32 -j32 check
- post-steps
build-linux-encrypted_env-no_compression:
build-linux-encrypted-env:
machine:
image: ubuntu-2004:202111-02
resource_class: 2xlarge
steps:
- pre-steps
- install-gflags
- run: ENCRYPTED_ENV=1 ROCKSDB_DISABLE_SNAPPY=1 ROCKSDB_DISABLE_ZLIB=1 ROCKSDB_DISABLE_BZIP=1 ROCKSDB_DISABLE_LZ4=1 ROCKSDB_DISABLE_ZSTD=1 make V=1 J=32 -j32 check
- run: |
./sst_dump --help | egrep -q 'Supported compression types: kNoCompression$' # Verify no compiled in compression
- run: ENCRYPTED_ENV=1 make V=1 J=32 -j32 check
- post-steps
build-linux-shared_lib-alt_namespace-status_checked:
@ -253,38 +219,38 @@ jobs:
build-linux-release:
machine:
image: ubuntu-2004:202111-02
resource_class: 2xlarge
resource_class: large
steps:
- checkout # check out the code in the project directory
- run: make V=1 -j32 release
- run: make V=1 -j8 release
- run: if ./db_stress --version; then false; else true; fi # ensure without gflags
- install-gflags
- run: make V=1 -j32 release
- run: make V=1 -j8 release
- run: ./db_stress --version # ensure with gflags
- post-steps
build-linux-release-rtti:
machine:
image: ubuntu-2004:202111-02
resource_class: xlarge
resource_class: large
steps:
- checkout # check out the code in the project directory
- run: make clean
- run: USE_RTTI=1 DEBUG_LEVEL=0 make V=1 -j16 static_lib tools db_bench
- run: USE_RTTI=1 DEBUG_LEVEL=0 make V=1 -j8 static_lib tools db_bench
- run: if ./db_stress --version; then false; else true; fi # ensure without gflags
- run: sudo apt-get update -y && sudo apt-get install -y libgflags-dev
- run: make clean
- run: USE_RTTI=1 DEBUG_LEVEL=0 make V=1 -j16 static_lib tools db_bench
- run: USE_RTTI=1 DEBUG_LEVEL=0 make V=1 -j8 static_lib tools db_bench
- run: ./db_stress --version # ensure with gflags
build-linux-lite:
machine:
image: ubuntu-2004:202111-02
resource_class: large
resource_class: 2xlarge
steps:
- pre-steps
- install-gflags
- run: LITE=1 make V=1 J=8 -j8 check
- run: LITE=1 make V=1 J=32 -j32 check
- post-steps
build-linux-lite-release:
@ -325,30 +291,11 @@ jobs:
machine:
image: ubuntu-2004:202111-02
resource_class: 2xlarge
# find test list by `make list_all_tests`
parameters:
start_test:
default: ""
type: string
end_test:
default: ""
type: string
steps:
- pre-steps
- install-gflags
- install-clang-10
- install-gtest-parallel
- run:
name: "Build unit tests"
command: |
echo "env: $(env)"
ROCKSDBTESTS_START=<<parameters.start_test>> ROCKSDBTESTS_END=<<parameters.end_test>> ROCKSDBTESTS_SUBSET_TESTS_TO_FILE=/tmp/test_list COMPILE_WITH_TSAN=1 CC=clang-10 CXX=clang++-10 ROCKSDB_DISABLE_ALIGNED_NEW=1 USE_CLANG=1 make V=1 -j32 --output-sync=target build_subset_tests
- run:
name: "Run unit tests in parallel"
command: |
sed -i 's/[[:space:]]*$//; s/ / \.\//g; s/.*/.\/&/' /tmp/test_list
cat /tmp/test_list
gtest-parallel $(</tmp/test_list) --output_dir=/tmp | cat # pipe to cat to continuously output status on circleci UI. Otherwise, no status will be printed while the job is running.
- run: COMPILE_WITH_TSAN=1 CC=clang-10 CXX=clang++-10 ROCKSDB_DISABLE_ALIGNED_NEW=1 USE_CLANG=1 make V=1 -j32 check # aligned new doesn't work for reason we haven't figured out.
- post-steps
build-linux-clang10-ubsan:
@ -362,17 +309,6 @@ jobs:
- run: COMPILE_WITH_UBSAN=1 OPT="-fsanitize-blacklist=.circleci/ubsan_suppression_list.txt" CC=clang-10 CXX=clang++-10 ROCKSDB_DISABLE_ALIGNED_NEW=1 USE_CLANG=1 make V=1 -j32 ubsan_check # aligned new doesn't work for reason we haven't figured out
- post-steps
build-linux-valgrind:
machine:
image: ubuntu-2004:202111-02
resource_class: 2xlarge
steps:
- pre-steps
- install-gflags
- install-valgrind
- run: PORTABLE=1 make V=1 -j32 valgrind_test
- post-steps
build-linux-clang10-clang-analyze:
machine:
image: ubuntu-2004:202111-02
@ -385,7 +321,7 @@ jobs:
- run: CC=clang-10 CXX=clang++-10 ROCKSDB_DISABLE_ALIGNED_NEW=1 CLANG_ANALYZER="/usr/bin/clang++-10" CLANG_SCAN_BUILD=scan-build-10 USE_CLANG=1 make V=1 -j32 analyze # aligned new doesn't work for reason we haven't figured out. For unknown, reason passing "clang++-10" as CLANG_ANALYZER doesn't work, and we need a full path.
- post-steps
build-linux-cmake-with-folly:
build-linux-cmake:
machine:
image: ubuntu-2004:202111-02
resource_class: 2xlarge
@ -393,11 +329,10 @@ jobs:
- pre-steps
- install-gflags
- upgrade-cmake
- run: make checkout_folly
- run: (mkdir build && cd build && cmake -DUSE_FOLLY=1 -DWITH_GFLAGS=1 .. && make V=1 -j20 && ctest -j20)
- run: (mkdir build && cd build && cmake -DWITH_GFLAGS=1 .. && make V=1 -j20 && ctest -j20)
- post-steps
build-linux-cmake-with-benchmark:
build-linux-cmake-ubuntu-20:
machine:
image: ubuntu-2004:202111-02
resource_class: 2xlarge
@ -411,35 +346,32 @@ jobs:
build-linux-unity-and-headers:
docker: # executor type
- image: gcc:latest
environment:
EXTRA_CXXFLAGS: -mno-avx512f # Warnings-as-error in avx512fintrin.h, would be used on newer hardware
resource_class: large
steps:
- checkout # check out the code in the project directory
- run: apt-get update -y && apt-get install -y libgflags-dev
- run: make V=1 -j8 unity_test
- run: TEST_TMPDIR=/dev/shm && make V=1 -j8 unity_test
- run: make V=1 -j8 -k check-headers # could be moved to a different build
- post-steps
build-linux-gcc-7-with-folly:
build-linux-gcc-7:
machine:
image: ubuntu-2004:202111-02
resource_class: 2xlarge
steps:
- pre-steps
- run: sudo add-apt-repository -y ppa:ubuntu-toolchain-r/test && sudo apt-get update -y && sudo apt-get install gcc-7 g++-7 libgflags-dev
- run: make checkout_folly
- run: USE_FOLLY=1 CC=gcc-7 CXX=g++-7 V=1 make -j32 check
- run: CC=gcc-7 CXX=g++-7 V=1 make -j32 check
- post-steps
build-linux-gcc-8-no_test_run:
machine:
image: ubuntu-2004:202111-02
resource_class: xlarge
resource_class: large
steps:
- pre-steps
- run: sudo add-apt-repository -y ppa:ubuntu-toolchain-r/test && sudo apt-get update -y && sudo apt-get install gcc-8 g++-8 libgflags-dev
- run: CC=gcc-8 CXX=g++-8 V=1 make -j16 all
- run: CC=gcc-8 CXX=g++-8 V=1 make -j8 all
- post-steps
build-linux-gcc-10-cxx20-no_test_run:
@ -449,7 +381,7 @@ jobs:
steps:
- pre-steps
- run: sudo apt-get update -y && sudo apt-get install gcc-10 g++-10 libgflags-dev
- run: CC=gcc-10 CXX=g++-10 V=1 ROCKSDB_CXX_STANDARD=c++20 make -j16 all
- run: CC=gcc-10 CXX=g++-10 V=1 SKIP_LINK=1 ROCKSDB_CXX_STANDARD=c++20 make -j16 all # Linking broken because libgflags compiled with newer ABI
- post-steps
build-linux-gcc-11-no_test_run:
@ -459,8 +391,7 @@ jobs:
steps:
- pre-steps
- run: sudo add-apt-repository -y ppa:ubuntu-toolchain-r/test && sudo apt-get update -y && sudo apt-get install gcc-11 g++-11 libgflags-dev
- install-benchmark
- run: CC=gcc-11 CXX=g++-11 V=1 make -j16 all microbench
- run: CC=gcc-11 CXX=g++-11 V=1 SKIP_LINK=1 make -j16 all # Linking broken because libgflags compiled with newer ABI
- post-steps
build-linux-clang-13-no_test_run:
@ -470,43 +401,18 @@ jobs:
steps:
- pre-steps
- install-clang-13
- install-benchmark
- run: CC=clang-13 CXX=clang++-13 USE_CLANG=1 make -j16 all microbench
- post-steps
# Ensure ASAN+UBSAN with folly, and full testsuite with clang 13
build-linux-clang-13-asan-ubsan-with-folly:
machine:
image: ubuntu-2004:202111-02
resource_class: 2xlarge
steps:
- pre-steps
- install-clang-13
- install-gflags
- run: make checkout_folly
- run: CC=clang-13 CXX=clang++-13 USE_CLANG=1 USE_FOLLY=1 COMPILE_WITH_UBSAN=1 COMPILE_WITH_ASAN=1 make -j32 check
- run: CC=clang-13 CXX=clang++-13 USE_CLANG=1 make -j16 all
- post-steps
# This job is only to make sure the microbench tests are able to run, the benchmark result is not meaningful as the CI host is changing.
build-linux-run-microbench:
build-linux-microbench:
machine:
image: ubuntu-2004:202111-02
resource_class: 2xlarge
resource_class: xlarge
steps:
- pre-steps
- install-benchmark
- run: DEBUG_LEVEL=0 make -j32 run_microbench
- post-steps
build-linux-mini-crashtest:
machine:
image: ubuntu-2004:202111-02
resource_class: large
steps:
- pre-steps
- install-gflags
- install-compression-libs
- run: ulimit -S -n `ulimit -H -n` && make V=1 -j8 CRASH_TEST_EXT_ARGS=--duration=960 blackbox_crash_test_with_atomic_flush
- run: DEBUG_LEVEL=0 make microbench
- post-steps
build-windows:
@ -593,6 +499,9 @@ jobs:
echo 'export PATH=$JAVA_HOME/bin:$PATH' >> $BASH_ENV
which java && java -version
which javac && javac -version
- run:
name: "Build RocksDBJava Shared Library"
command: make V=1 J=8 -j8 rocksdbjava
- run:
name: "Test RocksDBJava"
command: make V=1 J=8 -j8 jtest
@ -622,7 +531,7 @@ jobs:
build-macos-java:
macos:
xcode: 12.5.1
resource_class: large
resource_class: medium
environment:
JAVA_HOME: /Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home
ROCKSDB_DISABLE_JEMALLOC: 1 # jemalloc causes java 8 crash
@ -638,15 +547,18 @@ jobs:
echo 'export PATH=$JAVA_HOME/bin:$PATH' >> $BASH_ENV
which java && java -version
which javac && javac -version
- run:
name: "Build RocksDBJava Shared Library"
command: make V=1 J=8 -j8 rocksdbjava
- run:
name: "Test RocksDBJava"
command: make V=1 J=16 -j16 jtest
command: make V=1 J=8 -j8 jtest
- post-steps
build-macos-java-static:
macos:
xcode: 12.5.1
resource_class: large
resource_class: medium
environment:
JAVA_HOME: /Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home
steps:
@ -664,13 +576,13 @@ jobs:
which javac && javac -version
- run:
name: "Build RocksDBJava x86 and ARM Static Libraries"
command: make V=1 J=16 -j16 rocksdbjavastaticosx
command: make V=1 J=8 -j8 rocksdbjavastaticosx
- post-steps
build-macos-java-static-universal:
macos:
xcode: 12.5.1
resource_class: large
resource_class: medium
environment:
JAVA_HOME: /Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home
steps:
@ -688,7 +600,7 @@ jobs:
which javac && javac -version
- run:
name: "Build RocksDBJava Universal Binary Static Library"
command: make V=1 J=16 -j16 rocksdbjavastaticosx_ub
command: make V=1 J=8 -j8 rocksdbjavastaticosx_ub
- post-steps
build-examples:
@ -707,7 +619,7 @@ jobs:
build-cmake-mingw:
machine:
image: ubuntu-2004:202111-02
resource_class: large
resource_class: 2xlarge
steps:
- pre-steps
- install-gflags
@ -731,12 +643,30 @@ jobs:
machine:
image: ubuntu-2004:202111-02
resource_class: 2xlarge
environment:
TEST_TMPDIR: /tmp/rocksdb_test_tmp
parameters:
start_test:
default: ""
type: string
end_test:
default: ""
type: string
steps:
- pre-steps
- install-gflags
- run: make V=1 -j32 check
- install-gtest-parallel
- run:
name: "Build unit tests"
command: |
echo "env: $(env)"
echo "** done env"
ROCKSDBTESTS_START=<<parameters.start_test>> ROCKSDBTESTS_END=<<parameters.end_test>> ROCKSDBTESTS_SUBSET_TESTS_TO_FILE=/tmp/test_list make V=1 -j32 --output-sync=target build_subset_tests
- run:
name: "Run unit tests in parallel"
command: |
sed -i 's/[[:space:]]*$//; s/ / \.\//g; s/.*/.\/&/' /tmp/test_list
cat /tmp/test_list
export TEST_TMPDIR=/tmp/rocksdb_test_tmp
gtest-parallel $(</tmp/test_list) --output_dir=/tmp | cat # pipe to cat to continuously output status on circleci UI. Otherwise, no status will be printed while the job is running.
- post-steps
build-linux-arm-test-full:
@ -828,76 +758,116 @@ jobs:
workflows:
version: 2
jobs-linux-run-tests:
build-linux:
jobs:
- build-linux
- build-linux-cmake-with-folly
- build-linux-gcc-7-with-folly
- build-linux-cmake-with-benchmark
- build-linux-encrypted_env-no_compression
- build-linux-lite
jobs-linux-run-tests-san:
build-linux-cmake:
jobs:
- build-linux-cmake
- build-linux-cmake-ubuntu-20
build-linux-encrypted-env:
jobs:
- build-linux-encrypted-env
build-linux-shared_lib-alt_namespace-status_checked:
jobs:
- build-linux-clang10-asan
- build-linux-clang10-ubsan
- build-linux-clang10-mini-tsan:
start_test: ""
end_test: "env_test"
- build-linux-clang10-mini-tsan:
start_test: "env_test"
end_test: ""
- build-linux-shared_lib-alt_namespace-status_checked
jobs-linux-no-test-run:
build-linux-lite:
jobs:
- build-linux-lite
build-linux-release:
jobs:
- build-linux-release
build-linux-release-rtti:
jobs:
- build-linux-release-rtti
build-linux-lite-release:
jobs:
- build-linux-lite-release
- build-examples
- build-fuzzers
- build-linux-clang-no_test_run
- build-linux-clang-13-no_test_run
- build-linux-gcc-8-no_test_run
- build-linux-gcc-10-cxx20-no_test_run
- build-linux-gcc-11-no_test_run
- build-linux-arm-cmake-no_test_run
jobs-linux-other-checks:
build-linux-clang10-asan:
jobs:
- build-linux-clang10-asan
build-linux-clang10-mini-tsan:
jobs:
- build-linux-clang10-mini-tsan
build-linux-clang10-ubsan:
jobs:
- build-linux-clang10-ubsan
build-linux-clang10-clang-analyze:
jobs:
- build-linux-clang10-clang-analyze
build-linux-unity-and-headers:
jobs:
- build-linux-unity-and-headers
- build-linux-mini-crashtest
jobs-windows:
build-windows-vs2019:
jobs:
- build-windows:
name: "build-windows-vs2019"
build-windows-vs2019-cxx20:
jobs:
- build-windows:
name: "build-windows-vs2019-cxx20"
extra_cmake_opt: -DCMAKE_CXX_STANDARD=20
build-windows-vs2017:
jobs:
- build-windows:
name: "build-windows-vs2017"
vs_year: "2017"
cmake_generator: "Visual Studio 15 Win64"
- build-cmake-mingw
jobs-java:
build-java:
jobs:
- build-linux-java
- build-linux-java-static
- build-macos-java
- build-macos-java-static
- build-macos-java-static-universal
jobs-macos:
build-examples:
jobs:
- build-examples
build-linux-non-shm:
jobs:
- build-linux-non-shm:
start_test: ""
end_test: "db_options_test" # make sure unique in src.mk
- build-linux-non-shm:
start_test: "db_options_test" # make sure unique in src.mk
end_test: "filename_test" # make sure unique in src.mk
- build-linux-non-shm:
start_test: "filename_test" # make sure unique in src.mk
end_test: "statistics_test" # make sure unique in src.mk
- build-linux-non-shm:
start_test: "statistics_test" # make sure unique in src.mk
end_test: ""
build-linux-compilers-no_test_run:
jobs:
- build-linux-clang-no_test_run
- build-linux-clang-13-no_test_run
- build-linux-gcc-7
- build-linux-gcc-8-no_test_run
- build-linux-gcc-10-cxx20-no_test_run
- build-linux-gcc-11-no_test_run
- build-linux-arm-cmake-no_test_run
build-macos:
jobs:
- build-macos
- build-macos-cmake:
run_even_tests: true
- build-macos-cmake:
run_even_tests: false
jobs-linux-arm:
build-macos-cmake:
jobs:
- build-macos-cmake
build-cmake-mingw:
jobs:
- build-cmake-mingw
build-linux-arm:
jobs:
- build-linux-arm
build-microbench:
jobs:
- build-linux-microbench
build-fuzzers:
jobs:
- build-fuzzers
nightly:
triggers:
- schedule:
cron: "0 9 * * *"
cron: "0 0 * * *"
filters:
branches:
only:
@ -905,7 +875,3 @@ workflows:
jobs:
- build-format-compatible
- build-linux-arm-test-full
- build-linux-run-microbench
- build-linux-non-shm
- build-linux-clang-13-asan-ubsan-with-folly
- build-linux-valgrind


@ -32,7 +32,7 @@ jobs:
- name: Download clang-format-diff.py
uses: wei/wget@v1
with:
args: https://raw.githubusercontent.com/llvm/llvm-project/release/12.x/clang/tools/clang-format/clang-format-diff.py
args: https://raw.githubusercontent.com/llvm/llvm-project/main/clang/tools/clang-format/clang-format-diff.py
- name: Check format
run: VERBOSE_CHECK=1 make check-format

.gitignore vendored

@ -95,4 +95,3 @@ fuzz/proto/gen/
fuzz/crash-*
cmake-build-*
third-party/folly/


@ -239,10 +239,10 @@ script:
OPT=-DTRAVIS LIB_MODE=shared V=1 ROCKSDBTESTS_PLATFORM_DEPENDENT=only make -j$MK_PARALLEL all_but_some_tests check_some
;;
1)
OPT=-DTRAVIS LIB_MODE=shared V=1 ROCKSDBTESTS_PLATFORM_DEPENDENT=exclude ROCKSDBTESTS_END=backup_engine_test make -j$MK_PARALLEL check_some
OPT=-DTRAVIS LIB_MODE=shared V=1 ROCKSDBTESTS_PLATFORM_DEPENDENT=exclude ROCKSDBTESTS_END=backupable_db_test make -j$MK_PARALLEL check_some
;;
2)
OPT="-DTRAVIS -DROCKSDB_NAMESPACE=alternative_rocksdb_ns" LIB_MODE=shared V=1 make -j$MK_PARALLEL tools && OPT="-DTRAVIS -DROCKSDB_NAMESPACE=alternative_rocksdb_ns" LIB_MODE=shared V=1 ROCKSDBTESTS_PLATFORM_DEPENDENT=exclude ROCKSDBTESTS_START=backup_engine_test ROCKSDBTESTS_END=db_universal_compaction_test make -j$MK_PARALLEL check_some
OPT="-DTRAVIS -DROCKSDB_NAMESPACE=alternative_rocksdb_ns" LIB_MODE=shared V=1 make -j$MK_PARALLEL tools && OPT="-DTRAVIS -DROCKSDB_NAMESPACE=alternative_rocksdb_ns" LIB_MODE=shared V=1 ROCKSDBTESTS_PLATFORM_DEPENDENT=exclude ROCKSDBTESTS_START=backupable_db_test ROCKSDBTESTS_END=db_universal_compaction_test make -j$MK_PARALLEL check_some
;;
3)
OPT=-DTRAVIS LIB_MODE=shared V=1 ROCKSDBTESTS_PLATFORM_DEPENDENT=exclude ROCKSDBTESTS_START=db_universal_compaction_test ROCKSDBTESTS_END=table_properties_collector_test make -j$MK_PARALLEL check_some


@ -2,7 +2,7 @@
# This cmake build is for Windows 64-bit only.
#
# Prerequisites:
# You must have at least Visual Studio 2019. Start the Developer Command Prompt window that is a part of Visual Studio installation.
# You must have at least Visual Studio 2015 Update 3. Start the Developer Command Prompt window that is a part of Visual Studio installation.
# Run the build commands from within the Developer Command Prompt window to have paths to the compiler and runtime libraries set.
# You must have git.exe in your %PATH% environment variable.
#
@ -40,8 +40,6 @@ include(GoogleTest)
get_rocksdb_version(rocksdb_VERSION)
project(rocksdb
VERSION ${rocksdb_VERSION}
DESCRIPTION "An embeddable persistent key-value store for fast storage"
HOMEPAGE_URL https://rocksdb.org/
LANGUAGES CXX C ASM)
if(POLICY CMP0042)
@ -80,6 +78,19 @@ if ($ENV{CIRCLECI})
add_definitions(-DCIRCLECI)
endif()
# third-party/folly is only validated to work on Linux and Windows for now.
# So only turn it on there by default.
if(CMAKE_SYSTEM_NAME MATCHES "Linux|Windows")
if(MSVC AND MSVC_VERSION LESS 1910)
# Folly does not compile with MSVC older than VS2017
option(WITH_FOLLY_DISTRIBUTED_MUTEX "build with folly::DistributedMutex" OFF)
else()
option(WITH_FOLLY_DISTRIBUTED_MUTEX "build with folly::DistributedMutex" ON)
endif()
else()
option(WITH_FOLLY_DISTRIBUTED_MUTEX "build with folly::DistributedMutex" OFF)
endif()
if( NOT DEFINED CMAKE_CXX_STANDARD )
set(CMAKE_CXX_STANDARD 17)
endif()
@ -171,6 +182,26 @@ else()
endif()
endif()
string(TIMESTAMP TS "%Y-%m-%d %H:%M:%S" UTC)
set(BUILD_DATE "${TS}" CACHE STRING "the time we first built rocksdb")
find_package(Git)
if(GIT_FOUND AND EXISTS "${CMAKE_CURRENT_SOURCE_DIR}/.git")
execute_process(WORKING_DIRECTORY "${CMAKE_CURRENT_SOURCE_DIR}" OUTPUT_VARIABLE GIT_SHA COMMAND "${GIT_EXECUTABLE}" rev-parse HEAD )
execute_process(WORKING_DIRECTORY "${CMAKE_CURRENT_SOURCE_DIR}" RESULT_VARIABLE GIT_MOD COMMAND "${GIT_EXECUTABLE}" diff-index HEAD --quiet)
execute_process(WORKING_DIRECTORY "${CMAKE_CURRENT_SOURCE_DIR}" OUTPUT_VARIABLE GIT_DATE COMMAND "${GIT_EXECUTABLE}" log -1 --date=format:"%Y-%m-%d %T" --format="%ad")
execute_process(WORKING_DIRECTORY "${CMAKE_CURRENT_SOURCE_DIR}" OUTPUT_VARIABLE GIT_TAG RESULT_VARIABLE rv COMMAND "${GIT_EXECUTABLE}" symbolic-ref -q --short HEAD OUTPUT_STRIP_TRAILING_WHITESPACE)
if (rv AND NOT rv EQUAL 0)
execute_process(WORKING_DIRECTORY "${CMAKE_CURRENT_SOURCE_DIR}" OUTPUT_VARIABLE GIT_TAG COMMAND "${GIT_EXECUTABLE}" describe --tags --exact-match OUTPUT_STRIP_TRAILING_WHITESPACE)
endif()
else()
set(GIT_SHA 0)
set(GIT_MOD 1)
endif()
string(REGEX REPLACE "[^0-9a-fA-F]+" "" GIT_SHA "${GIT_SHA}")
string(REGEX REPLACE "[^0-9: /-]+" "" GIT_DATE "${GIT_DATE}")
option(WITH_MD_LIBRARY "build with MD" ON)
if(WIN32 AND MSVC)
if(WITH_MD_LIBRARY)
@ -180,6 +211,9 @@ if(WIN32 AND MSVC)
endif()
endif()
set(BUILD_VERSION_CC ${CMAKE_BINARY_DIR}/build_version.cc)
configure_file(util/build_version.cc.in ${BUILD_VERSION_CC} @ONLY)
if(MSVC)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} /Zi /nologo /EHsc /GS /Gd /GR /GF /fp:precise /Zc:wchar_t /Zc:forScope /errorReport:queue")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} /FC /d2Zi+ /W4 /wd4127 /wd4800 /wd4996 /wd4351 /wd4100 /wd4204 /wd4324")
@ -422,32 +456,30 @@ if (ASSERT_STATUS_CHECKED)
add_definitions(-DROCKSDB_ASSERT_STATUS_CHECKED)
endif()
# RTTI is by default AUTO which enables it in debug and disables it in release.
set(USE_RTTI AUTO CACHE STRING "Enable RTTI in builds")
set_property(CACHE USE_RTTI PROPERTY STRINGS AUTO ON OFF)
if(USE_RTTI STREQUAL "AUTO")
if(DEFINED USE_RTTI)
if(USE_RTTI)
message(STATUS "Enabling RTTI")
set(CMAKE_CXX_FLAGS_DEBUG "${CMAKE_CXX_FLAGS_DEBUG} -DROCKSDB_USE_RTTI")
set(CMAKE_CXX_FLAGS_RELEASE "${CMAKE_CXX_FLAGS_RELEASE} -DROCKSDB_USE_RTTI")
else()
if(MSVC)
message(STATUS "Disabling RTTI in Release builds. Always on in Debug.")
set(CMAKE_CXX_FLAGS_DEBUG "${CMAKE_CXX_FLAGS_DEBUG} -DROCKSDB_USE_RTTI")
set(CMAKE_CXX_FLAGS_RELEASE "${CMAKE_CXX_FLAGS_RELEASE} /GR-")
else()
message(STATUS "Disabling RTTI in Release builds")
set(CMAKE_CXX_FLAGS_DEBUG "${CMAKE_CXX_FLAGS_DEBUG} -fno-rtti")
set(CMAKE_CXX_FLAGS_RELEASE "${CMAKE_CXX_FLAGS_RELEASE} -fno-rtti")
endif()
endif()
else()
message(STATUS "Enabling RTTI in Debug builds only (default)")
set(CMAKE_CXX_FLAGS_DEBUG "${CMAKE_CXX_FLAGS_DEBUG} -DROCKSDB_USE_RTTI")
if(MSVC)
set(CMAKE_CXX_FLAGS_RELEASE "${CMAKE_CXX_FLAGS_RELEASE} /GR-")
set(CMAKE_CXX_FLAGS_RELEASE "${CMAKE_CXX_FLAGS_RELEASE} /GR-")
else()
set(CMAKE_CXX_FLAGS_RELEASE "${CMAKE_CXX_FLAGS_RELEASE} -fno-rtti")
endif()
elseif(USE_RTTI)
message(STATUS "Enabling RTTI in all builds")
set(CMAKE_CXX_FLAGS_DEBUG "${CMAKE_CXX_FLAGS_DEBUG} -DROCKSDB_USE_RTTI")
set(CMAKE_CXX_FLAGS_RELEASE "${CMAKE_CXX_FLAGS_RELEASE} -DROCKSDB_USE_RTTI")
else()
if(MSVC)
message(STATUS "Disabling RTTI in Release builds. Always on in Debug.")
set(CMAKE_CXX_FLAGS_DEBUG "${CMAKE_CXX_FLAGS_DEBUG} -DROCKSDB_USE_RTTI")
set(CMAKE_CXX_FLAGS_RELEASE "${CMAKE_CXX_FLAGS_RELEASE} /GR-")
else()
message(STATUS "Disabling RTTI in all builds")
set(CMAKE_CXX_FLAGS_DEBUG "${CMAKE_CXX_FLAGS_DEBUG} -fno-rtti")
set(CMAKE_CXX_FLAGS_RELEASE "${CMAKE_CXX_FLAGS_RELEASE} -fno-rtti")
endif()
endif()
# Used to run CI build and tests so we can run faster
@ -583,9 +615,8 @@ endif()
include_directories(${PROJECT_SOURCE_DIR})
include_directories(${PROJECT_SOURCE_DIR}/include)
if(USE_FOLLY)
if(WITH_FOLLY_DISTRIBUTED_MUTEX)
include_directories(${PROJECT_SOURCE_DIR}/third-party/folly)
add_definitions(-DUSE_FOLLY -DFOLLY_NO_CONFIG)
endif()
find_package(Threads REQUIRED)
@ -597,8 +628,6 @@ set(SOURCES
cache/cache_key.cc
cache/cache_reservation_manager.cc
cache/clock_cache.cc
cache/compressed_secondary_cache.cc
cache/fast_lru_cache.cc
cache/lru_cache.cc
cache/sharded_cache.cc
db/arena_wrapped_db_iter.cc
@ -799,11 +828,9 @@ set(SOURCES
trace_replay/trace_record_result.cc
trace_replay/trace_record.cc
trace_replay/trace_replay.cc
util/cleanable.cc
util/coding.cc
util/compaction_job_stats_impl.cc
util/comparator.cc
util/compression.cc
util/compression_context_cache.cc
util/concurrent_task_limiter_impl.cc
util/crc32c.cc
@ -820,8 +847,7 @@ set(SOURCES
util/thread_local.cc
util/threadpool_imp.cc
util/xxhash.cc
utilities/agg_merge/agg_merge.cc
utilities/backup/backup_engine.cc
utilities/backupable/backupable_db.cc
utilities/blob_db/blob_compaction_filter.cc
utilities/blob_db/blob_db.cc
utilities/blob_db/blob_db_impl.cc
@ -970,12 +996,13 @@ else()
env/io_posix.cc)
endif()
if(USE_FOLLY)
if(WITH_FOLLY_DISTRIBUTED_MUTEX)
list(APPEND SOURCES
third-party/folly/folly/container/detail/F14Table.cpp
third-party/folly/folly/lang/SafeAssert.cpp
third-party/folly/folly/lang/ToAscii.cpp
third-party/folly/folly/ScopeGuard.cpp)
third-party/folly/folly/detail/Futex.cpp
third-party/folly/folly/synchronization/AtomicNotification.cpp
third-party/folly/folly/synchronization/DistributedMutex.cpp
third-party/folly/folly/synchronization/ParkingLot.cpp
third-party/folly/folly/synchronization/WaitOptions.cpp)
endif()
set(ROCKSDB_STATIC_LIB rocksdb${ARTIFACT_SUFFIX})
@ -990,60 +1017,6 @@ else()
set(SYSTEM_LIBS ${CMAKE_THREAD_LIBS_INIT})
endif()
set(ROCKSDB_PLUGIN_EXTERNS "")
set(ROCKSDB_PLUGIN_BUILTINS "")
message(STATUS "ROCKSDB PLUGINS TO BUILD ${ROCKSDB_PLUGINS}")
list(APPEND PLUGINS ${ROCKSDB_PLUGINS})
foreach(PLUGIN IN LISTS PLUGINS)
set(PLUGIN_ROOT "${CMAKE_SOURCE_DIR}/plugin/${PLUGIN}/")
message("including rocksb plugin ${PLUGIN_ROOT}")
set(PLUGINMKFILE "${PLUGIN_ROOT}${PLUGIN}.mk")
if (NOT EXISTS ${PLUGINMKFILE})
message(FATAL_ERROR "Missing plugin makefile: ${PLUGINMKFILE}")
endif()
file(READ ${PLUGINMKFILE} PLUGINMK)
string(REGEX MATCH "SOURCES = ([^\n]*)" FOO ${PLUGINMK})
set(MK_SOURCES ${CMAKE_MATCH_1})
separate_arguments(MK_SOURCES)
foreach(MK_FILE IN LISTS MK_SOURCES)
list(APPEND SOURCES "${PLUGIN_ROOT}${MK_FILE}")
endforeach()
string(REGEX MATCH "_FUNC = ([^\n]*)" FOO ${PLUGINMK})
if (NOT ${CMAKE_MATCH_1} STREQUAL "")
string(APPEND ROCKSDB_PLUGIN_BUILTINS "{\"${PLUGIN}\", " ${CMAKE_MATCH_1} "},")
string(APPEND ROCKSDB_PLUGIN_EXTERNS "int " ${CMAKE_MATCH_1} "(ROCKSDB_NAMESPACE::ObjectLibrary&, const std::string&); ")
endif()
string(REGEX MATCH "_LIBS = ([^\n]*)" FOO ${PLUGINMK})
if (NOT ${CMAKE_MATCH_1} STREQUAL "")
list(APPEND THIRDPARTY_LIBS "${CMAKE_MATCH_1}")
endif()
message("THIRDPARTY_LIBS=${THIRDPARTY_LIBS}")
#TODO: We need to set any compile/link-time flags and add any link libraries
endforeach()
string(TIMESTAMP TS "%Y-%m-%d %H:%M:%S" UTC)
set(BUILD_DATE "${TS}" CACHE STRING "the time we first built rocksdb")
find_package(Git)
if(GIT_FOUND AND EXISTS "${CMAKE_CURRENT_SOURCE_DIR}/.git")
execute_process(WORKING_DIRECTORY "${CMAKE_CURRENT_SOURCE_DIR}" OUTPUT_VARIABLE GIT_SHA COMMAND "${GIT_EXECUTABLE}" rev-parse HEAD )
execute_process(WORKING_DIRECTORY "${CMAKE_CURRENT_SOURCE_DIR}" RESULT_VARIABLE GIT_MOD COMMAND "${GIT_EXECUTABLE}" diff-index HEAD --quiet)
execute_process(WORKING_DIRECTORY "${CMAKE_CURRENT_SOURCE_DIR}" OUTPUT_VARIABLE GIT_DATE COMMAND "${GIT_EXECUTABLE}" log -1 --date=format:"%Y-%m-%d %T" --format="%ad")
execute_process(WORKING_DIRECTORY "${CMAKE_CURRENT_SOURCE_DIR}" OUTPUT_VARIABLE GIT_TAG RESULT_VARIABLE rv COMMAND "${GIT_EXECUTABLE}" symbolic-ref -q --short HEAD OUTPUT_STRIP_TRAILING_WHITESPACE)
if (rv AND NOT rv EQUAL 0)
execute_process(WORKING_DIRECTORY "${CMAKE_CURRENT_SOURCE_DIR}" OUTPUT_VARIABLE GIT_TAG COMMAND "${GIT_EXECUTABLE}" describe --tags --exact-match OUTPUT_STRIP_TRAILING_WHITESPACE)
endif()
else()
set(GIT_SHA 0)
set(GIT_MOD 1)
endif()
string(REGEX REPLACE "[^0-9a-fA-F]+" "" GIT_SHA "${GIT_SHA}")
string(REGEX REPLACE "[^0-9: /-]+" "" GIT_DATE "${GIT_DATE}")
set(BUILD_VERSION_CC ${CMAKE_BINARY_DIR}/build_version.cc)
configure_file(util/build_version.cc.in ${BUILD_VERSION_CC} @ONLY)
add_library(${ROCKSDB_STATIC_LIB} STATIC ${SOURCES} ${BUILD_VERSION_CC})
target_link_libraries(${ROCKSDB_STATIC_LIB} PRIVATE
${THIRDPARTY_LIBS} ${SYSTEM_LIBS})
@ -1123,12 +1096,6 @@ if(NOT WIN32 OR ROCKSDB_INSTALL_ON_WINDOWS)
COMPATIBILITY SameMajorVersion
)
configure_file(
${CMAKE_CURRENT_SOURCE_DIR}/${PROJECT_NAME}.pc.in
${CMAKE_CURRENT_BINARY_DIR}/${PROJECT_NAME}.pc
@ONLY
)
install(DIRECTORY include/rocksdb COMPONENT devel DESTINATION "${CMAKE_INSTALL_INCLUDEDIR}")
install(DIRECTORY "${PROJECT_SOURCE_DIR}/cmake/modules" COMPONENT devel DESTINATION ${package_config_destination})
@ -1167,13 +1134,6 @@ if(NOT WIN32 OR ROCKSDB_INSTALL_ON_WINDOWS)
COMPONENT devel
DESTINATION ${package_config_destination}
)
install(
FILES
${CMAKE_CURRENT_BINARY_DIR}/${PROJECT_NAME}.pc
COMPONENT devel
DESTINATION ${CMAKE_INSTALL_LIBDIR}/pkgconfig
)
endif()
option(WITH_ALL_TESTS "Build all test, rather than a small subset" ON)
@ -1195,7 +1155,6 @@ if(WITH_TESTS)
list(APPEND TESTS
cache/cache_reservation_manager_test.cc
cache/cache_test.cc
cache/compressed_secondary_cache_test.cc
cache/lru_cache_test.cc
db/blob/blob_counting_iterator_test.cc
db/blob/blob_file_addition_test.cc
@ -1353,8 +1312,7 @@ if(WITH_TESTS)
util/thread_list_test.cc
util/thread_local_test.cc
util/work_queue_test.cc
utilities/agg_merge/agg_merge_test.cc
utilities/backup/backup_engine_test.cc
utilities/backupable/backupable_db_test.cc
utilities/blob_db/blob_db_test.cc
utilities/cassandra/cassandra_functional_test.cc
utilities/cassandra/cassandra_format_test.cc
@ -1375,7 +1333,6 @@ if(WITH_TESTS)
utilities/transactions/optimistic_transaction_test.cc
utilities/transactions/transaction_test.cc
utilities/transactions/lock/point/point_lock_manager_test.cc
utilities/transactions/write_committed_transaction_ts_test.cc
utilities/transactions/write_prepared_transaction_test.cc
utilities/transactions/write_unprepared_transaction_test.cc
utilities/transactions/lock/range/range_locking_test.cc
@ -1385,11 +1342,14 @@ if(WITH_TESTS)
)
endif()
if(WITH_FOLLY_DISTRIBUTED_MUTEX)
list(APPEND TESTS third-party/folly/folly/synchronization/test/DistributedMutexTest.cpp)
endif()
set(TESTUTIL_SOURCE
db/db_test_util.cc
monitoring/thread_status_updater_debug.cc
table/mock_table.cc
utilities/agg_merge/test_agg_merge.cc
utilities/cassandra/test_utils.cc
)
enable_testing()


@ -1,97 +1,29 @@
# Rocksdb Change Log
## Unreleased
## 7.0.4 (03/29/2022)
### Bug Fixes
* Fixed a bug where manual flush would block forever even though flush options had wait=false.
* Fixed a bug where RocksDB could corrupt DBs with `avoid_flush_during_recovery == true` by removing valid WALs, leading to `Status::Corruption` with message like "SST file is ahead of WALs" when attempting to reopen.
* Fixed a bug in the async_io path where an incorrect length of data is read by FilePrefetchBuffer if data is consumed from two populated buffers and a request for more data is sent.
* Fixed a CompactionFilter bug: compaction filters used `Delete` to remove keys even when the keys should be removed with `SingleDelete`. Mixing `Delete` and `SingleDelete` may cause undefined behavior.
* Fixed a bug which might cause a process crash when an I/O error happens while reading an index block in MultiGet().
### New Features
* DB::GetLiveFilesStorageInfo is ready for production use.
* Added new stats: PREFETCHED_BYTES_DISCARDED records the number of prefetched bytes discarded by RocksDB FilePrefetchBuffer on destruction, and POLL_WAIT_MICROS records the wait time for FS::Poll API completion (a usage sketch follows this list).
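Below is a minimal, hedged sketch of how these tickers could be read through the statistics API; the ticker names are taken from the entry above and assumed to be `Tickers` enum values in `rocksdb/statistics.h`, and the DB path is illustrative only.

```cpp
// Minimal sketch: enable statistics and read the tickers named above.
// PREFETCHED_BYTES_DISCARDED and POLL_WAIT_MICROS are assumed to be Tickers
// enum values from <rocksdb/statistics.h>, as named in the changelog entry.
#include <cassert>
#include <iostream>

#include "rocksdb/db.h"
#include "rocksdb/options.h"
#include "rocksdb/statistics.h"

int main() {
  rocksdb::Options options;
  options.create_if_missing = true;
  options.statistics = rocksdb::CreateDBStatistics();  // collect tickers

  rocksdb::DB* db = nullptr;
  rocksdb::Status s = rocksdb::DB::Open(options, "/tmp/stats_demo_db", &db);
  assert(s.ok());

  // ... perform reads that exercise FilePrefetchBuffer ...

  std::cout << "prefetched bytes discarded: "
            << options.statistics->getTickerCount(
                   rocksdb::PREFETCHED_BYTES_DISCARDED)
            << "\npoll wait micros: "
            << options.statistics->getTickerCount(rocksdb::POLL_WAIT_MICROS)
            << std::endl;
  delete db;
  return 0;
}
```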
### Public API changes
* Add rollback_deletion_type_callback to TransactionDBOptions so that write-prepared transactions know whether to issue a Delete or SingleDelete to cancel a previous key written during prior prepare phase. The PR aims to prevent mixing SingleDeletes and Deletes for the same key that can lead to undefined behaviors for write-prepared transactions.
* EXPERIMENTAL: Add new API AbortIO in file_system to abort the read requests submitted asynchronously.
* CompactionFilter::Decision has a new value: kRemoveWithSingleDelete. If CompactionFilter returns this decision, then CompactionIterator will use `SingleDelete` to mark a key as removed.
* Renamed CompactionFilter::Decision::kRemoveWithSingleDelete to kPurge since the latter sounds more general and hides the implementation details of how the compaction iterator handles keys (a minimal sketch follows this list).
* Added ability to specify functions for Prepare and Validate to OptionsTypeInfo. Added methods to OptionTypeInfo to set the functions via an API. These methods are intended for RocksDB plugin developers for configuration management.
* Added a new immutable DB option, enforce_single_del_contracts. If set to false (default is true), compaction will NOT fail due to a single delete followed by a delete for the same key. The purpose of this temporary option is to help existing use cases migrate.
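A minimal sketch (not part of the release notes themselves) of a compaction filter that returns the renamed `kPurge` decision, together with the new `enforce_single_del_contracts` option; both member names are taken from the entries above and assumed to exist as described.

```cpp
// Minimal sketch, assuming CompactionFilter::Decision::kPurge and
// Options::enforce_single_del_contracts exist as described above.
#include <string>

#include "rocksdb/compaction_filter.h"
#include "rocksdb/options.h"

class PurgeOddKeysFilter : public rocksdb::CompactionFilter {
 public:
  const char* Name() const override { return "PurgeOddKeysFilter"; }

  Decision FilterV2(int /*level*/, const rocksdb::Slice& key,
                    ValueType /*value_type*/,
                    const rocksdb::Slice& /*existing_value*/,
                    std::string* /*new_value*/,
                    std::string* /*skip_until*/) const override {
    // Hypothetical policy: drop keys ending in '1'. kPurge lets the
    // compaction iterator choose between Delete and SingleDelete, so the
    // SingleDelete contract is not violated.
    if (!key.empty() && key[key.size() - 1] == '1') {
      return Decision::kPurge;
    }
    return Decision::kKeep;
  }
};

void ConfigureOptions(rocksdb::Options& options, PurgeOddKeysFilter* filter) {
  options.compaction_filter = filter;
  // Temporary escape hatch while migrating existing use cases (default true).
  options.enforce_single_del_contracts = true;
}
```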
### Bug Fixes
* RocksDB used to call the FileSystem::Poll API during FilePrefetchBuffer destruction, which impacted performance because it waited for the completion of read requests that were no longer needed. Calling FileSystem::AbortIO to abort those requests instead fixes that performance issue.
* Fixed unnecessary block cache contention when queries within a MultiGet batch and across parallel batches access the same data block, which previously could cause severely degraded performance in this unusual case. (In more typical MultiGet cases, this fix is expected to yield a small or negligible performance improvement.)
### Behavior changes
* Enforce the existing contract of SingleDelete so that SingleDelete cannot be mixed with Delete because it leads to undefined behavior. Fix a number of unit tests that violate the contract but happen to pass.
* ldb `--try_load_options` now defaults to true if `--db` is specified and a new DB is not being created; the user can still explicitly disable that with `--try_load_options=false` (or explicitly enable it with `--try_load_options`).
## 7.2.0 (04/15/2022)
### Bug Fixes
* Fixed a bug which caused RocksDB failures when a DB was accessed using a UNC path.
* Fixed a race condition when disabling and re-enabling manual compaction.
* Fixed a race condition for `alive_log_files_` in non-two-write-queues mode. The race is between the write_thread_ in WriteToWAL() and another thread executing `FindObsoleteFiles()`. The race condition will be caught if `__glibcxx_requires_nonempty` is enabled.
* Fixed a race condition when mmaping a WritableFile on POSIX.
* Fixed a race condition when 2PC is disabled and WAL tracking in the MANIFEST is enabled. The race condition is between two background flush threads trying to install flush results, causing a WAL deletion not tracked in the MANIFEST. A future DB open may fail.
* Fixed a heap use-after-free race with DropColumnFamily.
* Fixed a bug that `rocksdb.read.block.compaction.micros` cannot track compaction stats (#9722).
* Fixed the `file_type`, `relative_filename` and `directory` fields returned by `GetLiveFilesMetaData()`, which were added when inheriting from `FileStorageInfo`.
* Fixed a bug affecting `track_and_verify_wals_in_manifest`. Without the fix, application may see "open error: Corruption: Missing WAL with log number" while trying to open the db. The corruption is a false alarm but prevents DB open (#9766).
* Fix segfault in FilePrefetchBuffer with async_io as it doesn't wait for pending jobs to complete on destruction.
* Fix ERROR_HANDLER_AUTORESUME_RETRY_COUNT stat whose value was set wrong in portal.h
* Fixed a bug affecting both non-TransactionDB with avoid_flush_during_recovery = true and TransactionDB: in case of a crash, min_log_number_to_keep may not change on recovery, and persisting a new MANIFEST with advanced log_numbers for some column families results in a "column family inconsistency" error on the second recovery. As a solution, the corrupted WALs whose numbers are larger than the corrupted WAL and smaller than the new WAL will be moved to the archive folder.
* Fixed a bug in RocksDB DB::Open() which may create and write to two new MANIFEST files even before recovery succeeds. Now writes to MANIFEST are persisted only after recovery is successful.
### New Features
* For db_bench when --seed=0 or --seed is not set then it uses the current time as the seed value. Previously it used the value 1000.
* For db_bench when --benchmark lists multiple tests and each test uses a seed for a RNG then the seeds across tests will no longer be repeated.
* Added an option to dynamically charge the updating estimated memory usage of a block-based table reader to the block cache, if a block cache is available. To enable this feature, set `BlockBasedTableOptions::reserve_table_reader_memory = true` (see the sketch at the end of this release's entries).
* Added a new stat ASYNC_READ_BYTES that counts the number of bytes read during async read calls; users can check it to see whether the async code path is being used by RocksDB's internal automatic prefetching for sequential reads.
* Enable async prefetching if ReadOptions.readahead_size is set along with ReadOptions.async_io in FilePrefetchBuffer.
* Add event listener support on remote compaction compactor side.
* Added a dedicated integer DB property `rocksdb.live-blob-file-garbage-size` that exposes the total amount of garbage in the blob files in the current version.
* RocksDB does internal auto prefetching if it notices sequential reads. It starts with readahead size `initial_auto_readahead_size` which now can be configured through BlockBasedTableOptions.
* Add a merge operator that allows users to register specific aggregation functions so that they can do aggregation using different aggregation types for different keys. See comments in include/rocksdb/utilities/agg_merge.h for actual usage. The feature is experimental, the format is subject to change, and we won't provide a migration tool.
* Meta-internal / Experimental: Improve CPU performance by replacing many uses of std::unordered_map with folly::F14FastMap when RocksDB is compiled together with Folly.
* Experimental: Add CompressedSecondaryCache, a concrete implementation of rocksdb::SecondaryCache, that integrates with compression libraries (e.g. LZ4) to hold compressed blocks.
### Behavior changes
* Disallow usage of commit-time-write-batch for write-prepared/write-unprepared transactions if TransactionOptions::use_only_the_last_commit_time_batch_for_recovery is false to prevent two (or more) uncommitted versions of the same key in the database. Otherwise, bottommost compaction may violate the internal key uniqueness invariant of SSTs if the sequence numbers of both internal keys are zeroed out (#9794).
* Make DB::GetUpdatesSince() return NotSupported early for write-prepared/write-unprepared transactions, as the API contract indicates.
### Public API changes
* Exposed APIs to examine results of block cache stats collections in a structured way. In particular, users of `GetMapProperty()` with property `kBlockCacheEntryStats` can now use the functions in `BlockCacheEntryStatsMapKeys` to find stats in the map.
* Add `fail_if_not_bottommost_level` to IngestExternalFileOptions so that ingestion will fail if the file(s) cannot be ingested to the bottommost level.
* Add output parameter `is_in_sec_cache` to `SecondaryCache::Lookup()`. It is to indicate whether the handle is possibly erased from the secondary cache after the Lookup.
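A minimal sketch combining several of the 7.2.0 additions above (`reserve_table_reader_memory`, `ReadOptions::async_io`, and the `rocksdb.live-blob-file-garbage-size` property); the field and property names come from these entries and should be treated as assumptions rather than a verified API listing.

```cpp
// Minimal sketch of a few 7.2.0 options/properties named above. Field and
// property names are taken from the changelog entries (assumptions).
#include <cassert>
#include <cstdint>
#include <string>

#include "rocksdb/db.h"
#include "rocksdb/options.h"
#include "rocksdb/table.h"

int main() {
  rocksdb::BlockBasedTableOptions table_options;
  // Charge the table reader's estimated memory usage to the block cache.
  table_options.reserve_table_reader_memory = true;

  rocksdb::Options options;
  options.create_if_missing = true;
  options.table_factory.reset(
      rocksdb::NewBlockBasedTableFactory(table_options));

  rocksdb::DB* db = nullptr;
  rocksdb::Status s = rocksdb::DB::Open(options, "/tmp/v720_demo_db", &db);
  assert(s.ok());

  // Opt into async prefetching for sequential scans.
  rocksdb::ReadOptions read_options;
  read_options.readahead_size = 2 << 20;  // 2 MiB
  read_options.async_io = true;
  rocksdb::Iterator* it = db->NewIterator(read_options);
  for (it->SeekToFirst(); it->Valid(); it->Next()) {
    // ... consume it->key() / it->value() ...
  }
  delete it;

  // New integer property exposing total garbage in blob files.
  uint64_t blob_garbage = 0;
  db->GetIntProperty("rocksdb.live-blob-file-garbage-size", &blob_garbage);

  delete db;
  return 0;
}
```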
## 7.1.0 (03/23/2022)
### New Features
* Allow WriteBatchWithIndex to index a WriteBatch that includes keys with user-defined timestamps. The index itself does not have timestamp.
* Add support for user-defined timestamps to write-committed transaction without API change. The `TransactionDB` layer APIs do not allow timestamps because we require that all user-defined-timestamps-aware operations go through the `Transaction` APIs.
* Added BlobDB options to `ldb`
* `BlockBasedTableOptions::detect_filter_construct_corruption` can now be dynamically configured using `DB::SetOptions`.
* Automatically recover from retryable read IO errors during background flush/compaction.
* Experimental support for preserving file Temperatures through backup and restore, and for updating DB metadata for outside changes to file Temperature (`UpdateManifestForFilesState` or `ldb update_manifest --update_temperatures`).
* Experimental support for async_io in ReadOptions which is used by FilePrefetchBuffer to prefetch some of the data asynchronously, if reads are sequential and auto readahead is enabled by rocksdb internally.
## 7.0.3 (03/23/2022)
### Bug Fixes
* Fixed a major performance bug in which Bloom filters generated by pre-7.0 releases are not read by early 7.0.x releases (and vice-versa) due to changes to FilterPolicy::Name() in #9590. This can severely impact read performance and read I/O on upgrade or downgrade with existing DB, but not data correctness.
* Fixed a data race on `versions_` between `DBImpl::ResumeImpl()` and threads waiting for recovery to complete (#9496)
* Fixed a bug caused by a race among flush, incoming writes and taking snapshots. Queries to snapshots created with this race condition can return incorrect results, e.g. resurfacing deleted data.
* Fixed a bug that DB flush uses `options.compression` even when `options.compression_per_level` is set.
* Fixed a bug that DisableManualCompaction may assert when disabling an unscheduled manual compaction.
* Fixed a race condition when canceling manual compaction with `DisableManualCompaction`. Also DB close can cancel the manual compaction thread.
* Fixed a potential timer crash when opening and closing a DB concurrently.
* Fixed a race condition for `alive_log_files_` in non-two-write-queues mode. The race is between the write_thread_ in WriteToWAL() and another thread executing `FindObsoleteFiles()`. The race condition will be caught if `__glibcxx_requires_nonempty` is enabled.
* Fixed a bug that `Iterator::Refresh()` reads stale keys after DeleteRange() is performed.
* Fixed a race condition when disabling and re-enabling manual compaction.
* Fixed automatic error recovery failure in atomic flush.
* Fixed a race condition when mmaping a WritableFile on POSIX.
### Public API changes
* Added pure virtual FilterPolicy::CompatibilityName(), which is needed for fixing major performance bug involving FilterPolicy naming in SST metadata without affecting Customizable aspect of FilterPolicy. This change only affects those with their own custom or wrapper FilterPolicy classes.
* `options.compression_per_level` is dynamically changeable with `SetOptions()`.
* Added `WriteOptions::rate_limiter_priority`. When set to something other than `Env::IO_TOTAL`, the internal rate limiter (`DBOptions::rate_limiter`) will be charged at the specified priority for writes associated with the API to which the `WriteOptions` was provided. Currently the support covers automatic WAL flushes, which happen during live updates (`Put()`, `Write()`, `Delete()`, etc.) when `WriteOptions::disableWAL == false` and `DBOptions::manual_wal_flush == false` (a usage sketch follows this section).
* Add DB::OpenAndTrimHistory API. This API will open the DB and trim data to the timestamp specified by trim_ts (data with a timestamp larger than the specified trim bound will be removed). This API should only be used during recovery of timestamp-enabled column families. If the column family doesn't have timestamps enabled, this API won't trim any data on that column family. This API is not compatible with the avoid_flush_during_recovery option.
* Remove BlockBasedTableOptions.hash_index_allow_collision which already takes no effect.
* Added pure virtual FilterPolicy::CompatibilityName(), which is needed for fixing major performance bug involving FilterPolicy naming in SST metadata without affecting Customizable aspect of FilterPolicy. For source code, this change only affects those with their own custom or wrapper FilterPolicy classes, but does break compiled library binary compatibility in a patch release.
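A minimal sketch, based on the `WriteOptions::rate_limiter_priority` entry above, of charging automatic WAL flushes to the DB-wide rate limiter; the field name and accepted values are taken from that entry rather than independently verified.

```cpp
// Minimal sketch: charge the automatic WAL flush for a write to the
// DB-wide rate limiter. WriteOptions::rate_limiter_priority is assumed to
// exist as described in the changelog entry above.
#include <cassert>

#include "rocksdb/db.h"
#include "rocksdb/env.h"
#include "rocksdb/options.h"
#include "rocksdb/rate_limiter.h"

int main() {
  rocksdb::Options options;
  options.create_if_missing = true;
  // Shared rate limiter: ~10 MB/s across all charged I/O.
  options.rate_limiter.reset(
      rocksdb::NewGenericRateLimiter(10 * 1024 * 1024 /* bytes per second */));

  rocksdb::DB* db = nullptr;
  rocksdb::Status s = rocksdb::DB::Open(options, "/tmp/rl_demo_db", &db);
  assert(s.ok());

  rocksdb::WriteOptions write_options;
  // Charge this write's automatic WAL flush at user priority.
  write_options.rate_limiter_priority = rocksdb::Env::IO_USER;
  s = db->Put(write_options, "key", "value");
  assert(s.ok());

  delete db;
  return 0;
}
```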
## 7.0.2 (03/12/2022)
* Fixed a bug that DisableManualCompaction may assert when disabling an unscheduled manual compaction.
## 7.0.1 (03/02/2022)
### Bug Fixes
* Fixed a race condition when canceling manual compaction with `DisableManualCompaction`. Also DB close can cancel the manual compaction thread.
* Fixed a data race on `versions_` between `DBImpl::ResumeImpl()` and threads waiting for recovery to complete (#9496)
* Fixed a bug caused by a race among flush, incoming writes and taking snapshots. Queries to snapshots created with this race condition can return incorrect results, e.g. resurfacing deleted data.
## 7.0.0 (02/20/2022)
### Bug Fixes
@ -1721,7 +1653,7 @@ if set to something > 0 user will see 2 changes in iterators behavior 1) only ke
* Added a new way to report QPS from db_bench (check out --report_file and --report_interval_seconds)
* Added a cache for individual rows. See DBOptions::row_cache for more info.
* Several new features on EventListener (see include/rocksdb/listener.h):
- OnCompactionCompleted() now returns per-compaction job statistics, defined in include/rocksdb/compaction_job_stats.h.
- OnCompationCompleted() now returns per-compaction job statistics, defined in include/rocksdb/compaction_job_stats.h.
- Added OnTableFileCreated() and OnTableFileDeleted().
* Add compaction_options_universal.enable_trivial_move to true, to allow trivial move while performing universal compaction. Trivial move will happen only when all the input files are non overlapping.


@ -47,7 +47,7 @@ to build a portable binary, add `PORTABLE=1` before your make commands, like thi
* If you wish to build the RocksJava static target, then cmake is required for building Snappy.
* If you wish to run microbench (e.g, `make microbench`, `make ribbon_bench` or `cmake -DWITH_BENCHMARK=1`), Google benchmark >= 1.6.0 is needed.
* If you wish to run microbench (e.g, `make microbench`, `make ribbon_bench` or `cmake -DWITH_BENCHMARK=1`), Google benchmark < 1.6.0 is needed.
## Supported platforms
@ -180,7 +180,8 @@ to build a portable binary, add `PORTABLE=1` before your make commands, like thi
* **iOS**:
* Run: `TARGET_OS=IOS make static_lib`. When building the project which uses rocksdb iOS library, make sure to define two important pre-processing macros: `ROCKSDB_LITE` and `IOS_CROSS_COMPILE`.
* **Windows** (Visual Studio 2017 and up):
* **Windows**:
* For building with MS Visual Studio 13 you will need Update 4 installed.
* Read and follow the instructions at CMakeLists.txt
* Or install via [vcpkg](https://github.com/microsoft/vcpkg)
* run `vcpkg install rocksdb:x64-windows`

Makefile

@ -8,7 +8,11 @@
BASH_EXISTS := $(shell which bash)
SHELL := $(shell which bash)
include common.mk
# Default to python3. Some distros like CentOS 8 do not have `python`.
ifeq ($(origin PYTHON), undefined)
PYTHON := $(shell which python3 || which python || echo python3)
endif
export PYTHON
CLEAN_FILES = # deliberately empty, so we can append below.
CFLAGS += ${EXTRA_CFLAGS}
@ -232,20 +236,14 @@ include make_config.mk
ROCKSDB_PLUGIN_MKS = $(foreach plugin, $(ROCKSDB_PLUGINS), plugin/$(plugin)/*.mk)
include $(ROCKSDB_PLUGIN_MKS)
ROCKSDB_PLUGIN_PROTO =ROCKSDB_NAMESPACE::ObjectLibrary\&, const std::string\&
ROCKSDB_PLUGIN_SOURCES = $(foreach p, $(ROCKSDB_PLUGINS), $(foreach source, $($(p)_SOURCES), plugin/$(p)/$(source)))
ROCKSDB_PLUGIN_HEADERS = $(foreach p, $(ROCKSDB_PLUGINS), $(foreach header, $($(p)_HEADERS), plugin/$(p)/$(header)))
ROCKSDB_PLUGIN_LIBS = $(foreach p, $(ROCKSDB_PLUGINS), $(foreach lib, $($(p)_LIBS), -l$(lib)))
ROCKSDB_PLUGIN_W_FUNCS = $(foreach p, $(ROCKSDB_PLUGINS), $(if $($(p)_FUNC), $(p)))
ROCKSDB_PLUGIN_EXTERNS = $(foreach p, $(ROCKSDB_PLUGIN_W_FUNCS), int $($(p)_FUNC)($(ROCKSDB_PLUGIN_PROTO));)
ROCKSDB_PLUGIN_BUILTINS = $(foreach p, $(ROCKSDB_PLUGIN_W_FUNCS), {\"$(p)\"\, $($(p)_FUNC)}\,)
ROCKSDB_PLUGIN_LDFLAGS = $(foreach plugin, $(ROCKSDB_PLUGINS), $($(plugin)_LDFLAGS))
ROCKSDB_PLUGIN_SOURCES = $(foreach plugin, $(ROCKSDB_PLUGINS), $(foreach source, $($(plugin)_SOURCES), plugin/$(plugin)/$(source)))
ROCKSDB_PLUGIN_HEADERS = $(foreach plugin, $(ROCKSDB_PLUGINS), $(foreach header, $($(plugin)_HEADERS), plugin/$(plugin)/$(header)))
ROCKSDB_PLUGIN_PKGCONFIG_REQUIRES = $(foreach plugin, $(ROCKSDB_PLUGINS), $($(plugin)_PKGCONFIG_REQUIRES))
CXXFLAGS += $(foreach plugin, $(ROCKSDB_PLUGINS), $($(plugin)_CXXFLAGS))
PLATFORM_LDFLAGS += $(ROCKSDB_PLUGIN_LDFLAGS)
# Patch up the link flags for JNI from the plugins
ROCKSDB_PLUGIN_LDFLAGS = $(foreach plugin, $(ROCKSDB_PLUGINS), $($(plugin)_LDFLAGS))
PLATFORM_LDFLAGS += $(ROCKSDB_PLUGIN_LDFLAGS)
JAVA_LDFLAGS += $(ROCKSDB_PLUGIN_LDFLAGS)
JAVA_STATIC_LDFLAGS += $(ROCKSDB_PLUGIN_LDFLAGS)
@ -288,7 +286,7 @@ missing_make_config_paths := $(shell \
grep "\./\S*\|/\S*" -o $(CURDIR)/make_config.mk | \
while read path; \
do [ -e $$path ] || echo $$path; \
done | sort | uniq | grep -v "/DOES/NOT/EXIST")
done | sort | uniq)
$(foreach path, $(missing_make_config_paths), \
$(warning Warning: $(path) does not exist))
@ -340,8 +338,6 @@ endif
# ASAN doesn't work well with jemalloc. If we're compiling with ASAN, we should use regular malloc.
ifdef COMPILE_WITH_ASAN
DISABLE_JEMALLOC=1
ASAN_OPTIONS?=detect_stack_use_after_return=1
export ASAN_OPTIONS
EXEC_LDFLAGS += -fsanitize=address
PLATFORM_CCFLAGS += -fsanitize=address
PLATFORM_CXXFLAGS += -fsanitize=address
@ -402,10 +398,6 @@ ifndef DISABLE_JEMALLOC
ifdef JEMALLOC
PLATFORM_CXXFLAGS += -DROCKSDB_JEMALLOC -DJEMALLOC_NO_DEMANGLE
PLATFORM_CCFLAGS += -DROCKSDB_JEMALLOC -DJEMALLOC_NO_DEMANGLE
ifeq ($(USE_FOLLY),1)
PLATFORM_CXXFLAGS += -DUSE_JEMALLOC
PLATFORM_CCFLAGS += -DUSE_JEMALLOC
endif
endif
ifdef WITH_JEMALLOC_FLAG
PLATFORM_LDFLAGS += -ljemalloc
@ -416,8 +408,8 @@ ifndef DISABLE_JEMALLOC
PLATFORM_CCFLAGS += $(JEMALLOC_INCLUDE)
endif
ifndef USE_FOLLY
USE_FOLLY=0
ifndef USE_FOLLY_DISTRIBUTED_MUTEX
USE_FOLLY_DISTRIBUTED_MUTEX=0
endif
ifndef GTEST_THROW_ON_FAILURE
@ -437,12 +429,8 @@ else
PLATFORM_CXXFLAGS += -isystem $(GTEST_DIR)
endif
# This provides a Makefile simulation of a Meta-internal folly integration.
# It is not validated for general use.
ifeq ($(USE_FOLLY),1)
ifeq (,$(FOLLY_DIR))
FOLLY_DIR = ./third-party/folly
endif
ifeq ($(USE_FOLLY_DISTRIBUTED_MUTEX),1)
FOLLY_DIR = ./third-party/folly
# AIX: pre-defined system headers are surrounded by an extern "C" block
ifeq ($(PLATFORM), OS_AIX)
PLATFORM_CCFLAGS += -I$(FOLLY_DIR)
@ -451,8 +439,6 @@ ifeq ($(USE_FOLLY),1)
PLATFORM_CCFLAGS += -isystem $(FOLLY_DIR)
PLATFORM_CXXFLAGS += -isystem $(FOLLY_DIR)
endif
PLATFORM_CCFLAGS += -DUSE_FOLLY -DFOLLY_NO_CONFIG
PLATFORM_CXXFLAGS += -DUSE_FOLLY -DFOLLY_NO_CONFIG
endif
ifdef TEST_CACHE_LINE_SIZE
@ -539,7 +525,7 @@ LIB_OBJECTS += $(patsubst %.c, $(OBJ_DIR)/%.o, $(LIB_SOURCES_C))
LIB_OBJECTS += $(patsubst %.S, $(OBJ_DIR)/%.o, $(LIB_SOURCES_ASM))
endif
ifeq ($(USE_FOLLY),1)
ifeq ($(USE_FOLLY_DISTRIBUTED_MUTEX),1)
LIB_OBJECTS += $(patsubst %.cpp, $(OBJ_DIR)/%.o, $(FOLLY_SOURCES))
endif
@ -574,6 +560,11 @@ ALL_SOURCES += $(ROCKSDB_PLUGIN_SOURCES)
TESTS = $(patsubst %.cc, %, $(notdir $(TEST_MAIN_SOURCES)))
TESTS += $(patsubst %.c, %, $(notdir $(TEST_MAIN_SOURCES_C)))
ifeq ($(USE_FOLLY_DISTRIBUTED_MUTEX),1)
TESTS += folly_synchronization_distributed_mutex_test
ALL_SOURCES += third-party/folly/folly/synchronization/test/DistributedMutexTest.cc
endif
# `make check-headers` to verify that each header file includes its own
# dependencies
ifneq ($(filter check-headers, $(MAKECMDGOALS)),)
@ -598,6 +589,9 @@ am__v_CCH_1 =
check-headers: $(HEADER_OK_FILES)
# options_settable_test doesn't pass with UBSAN as we use hack in the test
ifdef COMPILE_WITH_UBSAN
TESTS := $(shell echo $(TESTS) | sed 's/\boptions_settable_test\b//g')
endif
ifdef ASSERT_STATUS_CHECKED
# TODO: finish fixing all tests to pass this check
TESTS_FAILING_ASC = \
@ -617,13 +611,10 @@ ROCKSDBTESTS_SUBSET ?= $(TESTS)
# env_test - suspicious use of test::TmpDir
# deletefile_test - serial because it generates giant temporary files in
# its various tests. Parallel can fill up your /dev/shm
# db_bloom_filter_test - serial because excessive space usage by instances
# of DBFilterConstructionReserveMemoryTestWithParam can fill up /dev/shm
NON_PARALLEL_TEST = \
c_test \
env_test \
deletefile_test \
db_bloom_filter_test \
PARALLEL_TEST = $(filter-out $(NON_PARALLEL_TEST), $(TESTS))
@ -741,7 +732,7 @@ else
git_mod := $(shell git diff-index HEAD --quiet 2>/dev/null; echo $$?)
git_date := $(shell git log -1 --date=format:"%Y-%m-%d %T" --format="%ad" 2>/dev/null)
endif
gen_build_version = sed -e s/@GIT_SHA@/$(git_sha)/ -e s:@GIT_TAG@:"$(git_tag)": -e s/@GIT_MOD@/"$(git_mod)"/ -e s/@BUILD_DATE@/"$(build_date)"/ -e s/@GIT_DATE@/"$(git_date)"/ -e s/@ROCKSDB_PLUGIN_BUILTINS@/'$(ROCKSDB_PLUGIN_BUILTINS)'/ -e s/@ROCKSDB_PLUGIN_EXTERNS@/"$(ROCKSDB_PLUGIN_EXTERNS)"/ util/build_version.cc.in
gen_build_version = sed -e s/@GIT_SHA@/$(git_sha)/ -e s:@GIT_TAG@:"$(git_tag)": -e s/@GIT_MOD@/"$(git_mod)"/ -e s/@BUILD_DATE@/"$(build_date)"/ -e s/@GIT_DATE@/"$(git_date)"/ util/build_version.cc.in
# Record the version of the source that we are compiling.
# We keep a record of the git revision in this file. It is then built
@ -795,10 +786,15 @@ $(SHARED4): $(LIB_OBJECTS)
$(AM_V_CCLD) $(CXX) $(PLATFORM_SHARED_LDFLAGS)$(SHARED3) $(LIB_OBJECTS) $(LDFLAGS) -o $@
endif # PLATFORM_SHARED_EXT
.PHONY: check clean coverage ldb_tests package dbg gen-pc build_size \
release tags tags0 valgrind_check format static_lib shared_lib all \
rocksdbjavastatic rocksdbjava install install-static install-shared \
uninstall analyze tools tools_lib check-headers checkout_folly
.PHONY: blackbox_crash_test check clean coverage crash_test ldb_tests package \
release tags tags0 valgrind_check whitebox_crash_test format static_lib shared_lib all \
dbg rocksdbjavastatic rocksdbjava gen-pc install install-static install-shared uninstall \
analyze tools tools_lib check-headers \
blackbox_crash_test_with_atomic_flush whitebox_crash_test_with_atomic_flush \
blackbox_crash_test_with_txn whitebox_crash_test_with_txn \
blackbox_crash_test_with_best_efforts_recovery \
blackbox_crash_test_with_ts whitebox_crash_test_with_ts
all: $(LIBRARY) $(BENCHMARKS) tools tools_lib test_libs $(TESTS)
@ -819,8 +815,6 @@ test_libs: $(TEST_LIBS)
benchmarks: $(BENCHMARKS)
microbench: $(MICROBENCHS)
run_microbench: $(MICROBENCHS)
for t in $(MICROBENCHS); do echo "===== Running benchmark $$t (`date`)"; ./$$t || exit 1; done;
dbg: $(LIBRARY) $(BENCHMARKS) tools $(TESTS)
@ -832,8 +826,20 @@ release: clean
coverage: clean
COVERAGEFLAGS="-fprofile-arcs -ftest-coverage" LDFLAGS+="-lgcov" $(MAKE) J=1 all check
cd coverage && ./coverage_test.sh
# Delete intermediate files
$(FIND) . -type f \( -name "*.gcda" -o -name "*.gcno" \) -exec rm -f {} \;
# Delete intermediate files
$(FIND) . -type f -regex ".*\.\(\(gcda\)\|\(gcno\)\)" -exec rm -f {} \;
ifneq (,$(filter check parallel_check,$(MAKECMDGOALS)),)
# Use /dev/shm if it has the sticky bit set (otherwise, /tmp),
# and create a randomly-named rocksdb.XXXX directory therein.
# We'll use that directory in the "make check" rules.
ifeq ($(TMPD),)
TMPDIR := $(shell echo $${TMPDIR:-/tmp})
TMPD := $(shell f=/dev/shm; test -k $$f || f=$(TMPDIR); \
perl -le 'use File::Temp "tempdir";' \
-e 'print tempdir("'$$f'/rocksdb.XXXX", CLEANUP => 0)')
endif
endif
# Run all tests in parallel, accumulating per-test logs in t/log-*.
#
@ -874,7 +880,7 @@ $(parallel_tests):
TEST_SCRIPT=t/run-$$TEST_BINARY-$${TEST_NAME//\//-}; \
printf '%s\n' \
'#!/bin/sh' \
"d=\$(TEST_TMPDIR)$$TEST_SCRIPT" \
"d=\$(TMPD)$$TEST_SCRIPT" \
'mkdir -p $$d' \
"TEST_TMPDIR=\$$d $(DRIVER) ./$$TEST_BINARY --gtest_filter=$$TEST_NAME" \
> $$TEST_SCRIPT; \
@ -903,7 +909,7 @@ gen_parallel_tests:
# 107.816 PASS t/DBTest.EncodeDecompressedBlockSizeTest
#
slow_test_regexp = \
^.*MySQLStyleTransactionTest.*$$|^.*SnapshotConcurrentAccessTest.*$$|^.*SeqAdvanceConcurrentTest.*$$|^t/run-table_test-HarnessTest.Randomized$$|^t/run-db_test-.*(?:FileCreationRandomFailure|EncodeDecompressedBlockSizeTest)$$|^.*RecoverFromCorruptedWALWithoutFlush$$
^.*SnapshotConcurrentAccessTest.*$$|^.*SeqAdvanceConcurrentTest.*$$|^t/run-table_test-HarnessTest.Randomized$$|^t/run-db_test-.*(?:FileCreationRandomFailure|EncodeDecompressedBlockSizeTest)$$|^.*RecoverFromCorruptedWALWithoutFlush$$
prioritize_long_running_tests = \
perl -pe 's,($(slow_test_regexp)),100 $$1,' \
| sort -k1,1gr \
@ -934,6 +940,7 @@ endif
.PHONY: check_0
check_0:
$(AM_V_GEN)export TEST_TMPDIR=$(TMPD); \
printf '%s\n' '' \
'To monitor subtest <duration,pass/fail,name>,' \
' run "make watch-log" in a separate window' ''; \
@ -944,8 +951,7 @@ check_0:
| $(prioritize_long_running_tests) \
| grep -E '$(tests-regexp)' \
| grep -E -v '$(EXCLUDE_TESTS_REGEX)' \
| build_tools/gnu_parallel -j$(J) --plain --joblog=LOG --eta --gnu \
--tmpdir=$(TEST_TMPDIR) '{} $(parallel_redir)' ; \
| build_tools/gnu_parallel -j$(J) --plain --joblog=LOG --eta --gnu '{} $(parallel_redir)' ; \
parallel_retcode=$$? ; \
awk '{ if ($$7 != 0 || $$8 != 0) { if ($$7 == "Exitval") { h = $$0; } else { if (!f) print h; print; f = 1 } } } END { if(f) exit 1; }' < LOG ; \
awk_retcode=$$?; \
@ -956,6 +962,7 @@ valgrind-exclude-regexp = InlineSkipTest.ConcurrentInsert|TransactionStressTest.
.PHONY: valgrind_check_0
valgrind_check_0: test_log_prefix := valgrind_
valgrind_check_0:
$(AM_V_GEN)export TEST_TMPDIR=$(TMPD); \
printf '%s\n' '' \
'To monitor subtest <duration,pass/fail,name>,' \
' run "make watch-log" in a separate window' ''; \
@ -967,11 +974,10 @@ valgrind_check_0:
| grep -E '$(tests-regexp)' \
| grep -E -v '$(valgrind-exclude-regexp)' \
| build_tools/gnu_parallel -j$(J) --plain --joblog=LOG --eta --gnu \
--tmpdir=$(TEST_TMPDIR) \
'(if [[ "{}" == "./"* ]] ; then $(DRIVER) {}; else {}; fi) \
'(if [[ "{}" == "./"* ]] ; then $(DRIVER) {}; else {}; fi) \
$(parallel_redir)' \
CLEAN_FILES += t LOG $(TEST_TMPDIR)
CLEAN_FILES += t LOG $(TMPD)
# When running parallel "make check", you can monitor its progress
# from another window.
@ -994,12 +1000,12 @@ check: all
&& (build_tools/gnu_parallel --gnu --help 2>/dev/null) | \
grep -q 'GNU Parallel'; \
then \
$(MAKE) T="$$t" check_0; \
$(MAKE) T="$$t" TMPD=$(TMPD) check_0; \
else \
for t in $(TESTS); do \
echo "===== Running $$t (`date`)"; ./$$t || exit 1; done; \
fi
rm -rf $(TEST_TMPDIR)
rm -rf $(TMPD)
ifneq ($(PLATFORM), OS_AIX)
$(PYTHON) tools/check_all_python.py
ifeq ($(filter -DROCKSDB_LITE,$(OPT)),)
@ -1023,7 +1029,65 @@ check_some: $(ROCKSDBTESTS_SUBSET)
ldb_tests: ldb
$(PYTHON) tools/ldb_test.py
include crash_test.mk
crash_test:
# Do not parallelize
$(MAKE) whitebox_crash_test
$(MAKE) blackbox_crash_test
crash_test_with_atomic_flush:
# Do not parallelize
$(MAKE) whitebox_crash_test_with_atomic_flush
$(MAKE) blackbox_crash_test_with_atomic_flush
crash_test_with_txn:
# Do not parallelize
$(MAKE) whitebox_crash_test_with_txn
$(MAKE) blackbox_crash_test_with_txn
crash_test_with_best_efforts_recovery: blackbox_crash_test_with_best_efforts_recovery
crash_test_with_ts:
# Do not parallelize
$(MAKE) whitebox_crash_test_with_ts
$(MAKE) blackbox_crash_test_with_ts
blackbox_crash_test: db_stress
$(PYTHON) -u tools/db_crashtest.py --simple blackbox $(CRASH_TEST_EXT_ARGS)
$(PYTHON) -u tools/db_crashtest.py blackbox $(CRASH_TEST_EXT_ARGS)
blackbox_crash_test_with_atomic_flush: db_stress
$(PYTHON) -u tools/db_crashtest.py --cf_consistency blackbox $(CRASH_TEST_EXT_ARGS)
blackbox_crash_test_with_txn: db_stress
$(PYTHON) -u tools/db_crashtest.py --txn blackbox $(CRASH_TEST_EXT_ARGS)
blackbox_crash_test_with_best_efforts_recovery: db_stress
$(PYTHON) -u tools/db_crashtest.py --test_best_efforts_recovery blackbox $(CRASH_TEST_EXT_ARGS)
blackbox_crash_test_with_ts: db_stress
$(PYTHON) -u tools/db_crashtest.py --enable_ts blackbox $(CRASH_TEST_EXT_ARGS)
ifeq ($(CRASH_TEST_KILL_ODD),)
CRASH_TEST_KILL_ODD=888887
endif
whitebox_crash_test: db_stress
$(PYTHON) -u tools/db_crashtest.py --simple whitebox --random_kill_odd \
$(CRASH_TEST_KILL_ODD) $(CRASH_TEST_EXT_ARGS)
$(PYTHON) -u tools/db_crashtest.py whitebox --random_kill_odd \
$(CRASH_TEST_KILL_ODD) $(CRASH_TEST_EXT_ARGS)
whitebox_crash_test_with_atomic_flush: db_stress
$(PYTHON) -u tools/db_crashtest.py --cf_consistency whitebox --random_kill_odd \
$(CRASH_TEST_KILL_ODD) $(CRASH_TEST_EXT_ARGS)
whitebox_crash_test_with_txn: db_stress
$(PYTHON) -u tools/db_crashtest.py --txn whitebox --random_kill_odd \
$(CRASH_TEST_KILL_ODD) $(CRASH_TEST_EXT_ARGS)
whitebox_crash_test_with_ts: db_stress
$(PYTHON) -u tools/db_crashtest.py --enable_ts whitebox --random_kill_odd \
$(CRASH_TEST_KILL_ODD) $(CRASH_TEST_EXT_ARGS)
asan_check: clean
COMPILE_WITH_ASAN=1 $(MAKE) check -j32
@ -1096,11 +1160,11 @@ valgrind_test_some:
valgrind_check: $(TESTS)
$(MAKE) DRIVER="$(VALGRIND_VER) $(VALGRIND_OPTS)" gen_parallel_tests
$(AM_V_GEN)if test "$(J)" != 1 \
&& (build_tools/gnu_parallel --gnu --help 2>/dev/null) | \
&& (build_tools/gnu_parallel --gnu --help 2>/dev/null) | \
grep -q 'GNU Parallel'; \
then \
$(MAKE) \
DRIVER="$(VALGRIND_VER) $(VALGRIND_OPTS)" valgrind_check_0; \
$(MAKE) TMPD=$(TMPD) \
DRIVER="$(VALGRIND_VER) $(VALGRIND_OPTS)" valgrind_check_0; \
else \
for t in $(filter-out %skiplist_test options_settable_test,$(TESTS)); do \
$(VALGRIND_VER) $(VALGRIND_OPTS) ./$$t; \
@ -1120,6 +1184,27 @@ valgrind_check_some: $(ROCKSDBTESTS_SUBSET)
fi; \
done
ifneq ($(PAR_TEST),)
parloop:
ret_bad=0; \
for t in $(PAR_TEST); do \
echo "===== Running $$t in parallel $(NUM_PAR) (`date`)";\
if [ $(db_test) -eq 1 ]; then \
seq $(J) | v="$$t" build_tools/gnu_parallel --gnu --plain 's=$(TMPD)/rdb-{}; export TEST_TMPDIR=$$s;' \
'timeout 2m ./db_test --gtest_filter=$$v >> $$s/log-{} 2>1'; \
else\
seq $(J) | v="./$$t" build_tools/gnu_parallel --gnu --plain 's=$(TMPD)/rdb-{};' \
'export TEST_TMPDIR=$$s; timeout 10m $$v >> $$s/log-{} 2>1'; \
fi; \
ret_code=$$?; \
if [ $$ret_code -ne 0 ]; then \
ret_bad=$$ret_code; \
echo $$t exited with $$ret_code; \
fi; \
done; \
exit $$ret_bad;
endif
test_names = \
./db_test --gtest_list_tests \
| perl -n \
@ -1127,6 +1212,24 @@ test_names = \
-e '/^(\s*)(\S+)/; !$$1 and do {$$p=$$2; break};' \
-e 'print qq! $$p$$2!'
parallel_check: $(TESTS)
$(AM_V_GEN)if test "$(J)" > 1 \
&& (build_tools/gnu_parallel --gnu --help 2>/dev/null) | \
grep -q 'GNU Parallel'; \
then \
echo Running in parallel $(J); \
else \
echo "Need to have GNU Parallel and J > 1"; exit 1; \
fi; \
ret_bad=0; \
echo $(J);\
echo Test Dir: $(TMPD); \
seq $(J) | build_tools/gnu_parallel --gnu --plain 's=$(TMPD)/rdb-{}; rm -rf $$s; mkdir $$s'; \
$(MAKE) PAR_TEST="$(shell $(test_names))" TMPD=$(TMPD) \
J=$(J) db_test=1 parloop; \
$(MAKE) PAR_TEST="$(filter-out db_test, $(TESTS))" \
TMPD=$(TMPD) J=$(J) db_test=0 parloop;
analyze: clean
USE_CLANG=1 $(MAKE) analyze_incremental
@ -1169,9 +1272,9 @@ clean-rocks:
rm -f $(BENCHMARKS) $(TOOLS) $(TESTS) $(PARALLEL_TEST) $(ALL_STATIC_LIBS) $(ALL_SHARED_LIBS) $(MICROBENCHS)
rm -rf $(CLEAN_FILES) ios-x86 ios-arm scan_build_report
$(FIND) . -name "*.[oda]" -exec rm -f {} \;
$(FIND) . -type f \( -name "*.gcda" -o -name "*.gcno" \) -exec rm -f {} \;
$(FIND) . -type f -regex ".*\.\(\(gcda\)\|\(gcno\)\)" -exec rm -f {} \;
clean-rocksjava: clean-rocks
clean-rocksjava:
rm -rf jl jls
cd java && $(MAKE) clean
@ -1255,6 +1358,11 @@ trace_analyzer: $(OBJ_DIR)/tools/trace_analyzer.o $(ANALYZE_OBJECTS) $(TOOLS_LIB
block_cache_trace_analyzer: $(OBJ_DIR)/tools/block_cache_analyzer/block_cache_trace_analyzer_tool.o $(ANALYZE_OBJECTS) $(TOOLS_LIBRARY) $(LIBRARY)
$(AM_LINK)
ifeq ($(USE_FOLLY_DISTRIBUTED_MUTEX),1)
folly_synchronization_distributed_mutex_test: $(OBJ_DIR)/third-party/folly/folly/synchronization/test/DistributedMutexTest.o $(TEST_LIBRARY) $(LIBRARY)
$(AM_LINK)
endif
cache_bench: $(OBJ_DIR)/cache/cache_bench.o $(CACHE_BENCH_OBJECTS) $(LIBRARY)
$(AM_LINK)
@ -1321,9 +1429,6 @@ ribbon_test: $(OBJ_DIR)/util/ribbon_test.o $(TEST_LIBRARY) $(LIBRARY)
option_change_migration_test: $(OBJ_DIR)/utilities/option_change_migration/option_change_migration_test.o $(TEST_LIBRARY) $(LIBRARY)
$(AM_LINK)
agg_merge_test: $(OBJ_DIR)/utilities/agg_merge/agg_merge_test.o $(TEST_LIBRARY) $(LIBRARY)
$(AM_LINK)
stringappend_test: $(OBJ_DIR)/utilities/merge_operators/string_append/stringappend_test.o $(TEST_LIBRARY) $(LIBRARY)
$(AM_LINK)
@ -1504,7 +1609,7 @@ perf_context_test: $(OBJ_DIR)/db/perf_context_test.o $(TEST_LIBRARY) $(LIBRARY)
prefix_test: $(OBJ_DIR)/db/prefix_test.o $(TEST_LIBRARY) $(LIBRARY)
$(AM_LINK)
backup_engine_test: $(OBJ_DIR)/utilities/backup/backup_engine_test.o $(TEST_LIBRARY) $(LIBRARY)
backupable_db_test: $(OBJ_DIR)/utilities/backupable/backupable_db_test.o $(TEST_LIBRARY) $(LIBRARY)
$(AM_LINK)
checkpoint_test: $(OBJ_DIR)/utilities/checkpoint/checkpoint_test.o $(TEST_LIBRARY) $(LIBRARY)
@ -1759,9 +1864,6 @@ point_lock_manager_test: utilities/transactions/lock/point/point_lock_manager_te
transaction_test: $(OBJ_DIR)/utilities/transactions/transaction_test.o $(TEST_LIBRARY) $(LIBRARY)
$(AM_LINK)
write_committed_transaction_ts_test: $(OBJ_DIR)/utilities/transactions/write_committed_transaction_ts_test.o $(TEST_LIBRARY) $(LIBRARY)
$(AM_LINK)
write_prepared_transaction_test: $(OBJ_DIR)/utilities/transactions/write_prepared_transaction_test.o $(TEST_LIBRARY) $(LIBRARY)
$(AM_LINK)
@ -1795,9 +1897,6 @@ statistics_test: $(OBJ_DIR)/monitoring/statistics_test.o $(TEST_LIBRARY) $(LIBRA
stats_history_test: $(OBJ_DIR)/monitoring/stats_history_test.o $(TEST_LIBRARY) $(LIBRARY)
$(AM_LINK)
compressed_secondary_cache_test: $(OBJ_DIR)/cache/compressed_secondary_cache_test.o $(TEST_LIBRARY) $(LIBRARY)
$(AM_LINK)
lru_cache_test: $(OBJ_DIR)/cache/lru_cache_test.o $(TEST_LIBRARY) $(LIBRARY)
$(AM_LINK)
@ -1997,9 +2096,9 @@ ROCKSDB_JAVADOCS_JAR = rocksdbjni-$(ROCKSDB_JAVA_VERSION)-javadoc.jar
ROCKSDB_SOURCES_JAR = rocksdbjni-$(ROCKSDB_JAVA_VERSION)-sources.jar
SHA256_CMD = sha256sum
ZLIB_VER ?= 1.2.12
ZLIB_SHA256 ?= 91844808532e5ce316b3c010929493c0244f3d37593afd6de04f71821d5136d9
ZLIB_DOWNLOAD_BASE ?= http://zlib.net
ZLIB_VER ?= 1.2.11
ZLIB_SHA256 ?= c3e5e9fdd5004dcb542feda5ee4f0ff0744628baf8ed2dd5d66f8ca1197cb1a1
ZLIB_DOWNLOAD_BASE ?= https://zlib.net/fossils
BZIP2_VER ?= 1.0.8
BZIP2_SHA256 ?= ab5a03176ee106d3f0fa90e381da478ddae405918153cca248e682cd0c4a2269
BZIP2_DOWNLOAD_BASE ?= http://sourceware.org/pub/bzip2
@ -2325,58 +2424,10 @@ jtest: rocksdbjava
jdb_bench:
cd java;$(MAKE) db_bench;
commit_prereq:
echo "TODO: bring this back using parts of old precommit_checker.py and rocksdb-lego-determinator"
false # J=$(J) build_tools/precommit_checker.py unit clang_unit release clang_release tsan asan ubsan lite unit_non_shm
# $(MAKE) clean && $(MAKE) jclean && $(MAKE) rocksdbjava;
# For public CI runs, checkout folly in a way that can build with RocksDB.
# This is mostly intended as a test-only simulation of Meta-internal folly
# integration.
checkout_folly:
if [ -e third-party/folly ]; then \
cd third-party/folly && git fetch origin; \
else \
cd third-party && git clone https://github.com/facebook/folly.git; \
fi
@# Pin to a particular version for public CI, so that PR authors don't
@# need to worry about folly breaking our integration. Update periodically
cd third-party/folly && git reset --hard 98b9b2c1124e99f50f9085ddee74ce32afffc665
@# A hack to remove boost dependency.
@# NOTE: this hack is not needed if using FBCODE compiler config
perl -pi -e 's/^(#include <boost)/\/\/$$1/' third-party/folly/folly/functional/Invoke.h
# ---------------------------------------------------------------------------
# Build size testing
# ---------------------------------------------------------------------------
REPORT_BUILD_STATISTIC?=echo STATISTIC:
build_size:
# === normal build, static ===
$(MAKE) clean
$(MAKE) static_lib
$(REPORT_BUILD_STATISTIC) rocksdb.build_size.static_lib $$(stat --printf="%s" librocksdb.a)
strip librocksdb.a
$(REPORT_BUILD_STATISTIC) rocksdb.build_size.static_lib_stripped $$(stat --printf="%s" librocksdb.a)
# === normal build, shared ===
$(MAKE) clean
$(MAKE) shared_lib
$(REPORT_BUILD_STATISTIC) rocksdb.build_size.shared_lib $$(stat --printf="%s" `readlink -f librocksdb.so`)
strip `readlink -f librocksdb.so`
$(REPORT_BUILD_STATISTIC) rocksdb.build_size.shared_lib_stripped $$(stat --printf="%s" `readlink -f librocksdb.so`)
# === lite build, static ===
$(MAKE) clean
$(MAKE) LITE=1 static_lib
$(REPORT_BUILD_STATISTIC) rocksdb.build_size.static_lib_lite $$(stat --printf="%s" librocksdb.a)
strip librocksdb.a
$(REPORT_BUILD_STATISTIC) rocksdb.build_size.static_lib_lite_stripped $$(stat --printf="%s" librocksdb.a)
# === lite build, shared ===
$(MAKE) clean
$(MAKE) LITE=1 shared_lib
$(REPORT_BUILD_STATISTIC) rocksdb.build_size.shared_lib_lite $$(stat --printf="%s" `readlink -f librocksdb.so`)
strip `readlink -f librocksdb.so`
$(REPORT_BUILD_STATISTIC) rocksdb.build_size.shared_lib_lite_stripped $$(stat --printf="%s" `readlink -f librocksdb.so`)
commit_prereq: build_tools/rocksdb-lego-determinator \
build_tools/precommit_checker.py
J=$(J) build_tools/precommit_checker.py unit unit_481 clang_unit release release_481 clang_release tsan asan ubsan lite unit_non_shm
$(MAKE) clean && $(MAKE) jclean && $(MAKE) rocksdbjava;
# ---------------------------------------------------------------------------
# Platform-specific compilation
@ -2430,7 +2481,7 @@ endif
ifneq ($(SKIP_DEPENDS), 1)
DEPFILES = $(patsubst %.cc, $(OBJ_DIR)/%.cc.d, $(ALL_SOURCES))
DEPFILES += $(patsubst %.c, $(OBJ_DIR)/%.c.d, $(LIB_SOURCES_C) $(TEST_MAIN_SOURCES_C))
ifeq ($(USE_FOLLY),1)
ifeq ($(USE_FOLLY_DISTRIBUTED_MUTEX),1)
DEPFILES +=$(patsubst %.cpp, $(OBJ_DIR)/%.cpp.d, $(FOLLY_SOURCES))
endif
endif
@ -2473,12 +2524,9 @@ endif
build_subset_tests: $(ROCKSDBTESTS_SUBSET)
$(AM_V_GEN)if [ -n "$${ROCKSDBTESTS_SUBSET_TESTS_TO_FILE}" ]; then echo "$(ROCKSDBTESTS_SUBSET)" > "$${ROCKSDBTESTS_SUBSET_TESTS_TO_FILE}"; else echo "$(ROCKSDBTESTS_SUBSET)"; fi
list_all_tests:
echo "$(ROCKSDBTESTS_SUBSET)"
# Remove the rules for which dependencies should not be generated and see if any are left.
#If so, include the dependencies; if not, do not include the dependency files
ROCKS_DEP_RULES=$(filter-out clean format check-format check-buck-targets check-headers check-sources jclean jtest package analyze tags rocksdbjavastatic% unity.% unity_test checkout_folly, $(MAKECMDGOALS))
ROCKS_DEP_RULES=$(filter-out clean format check-format check-buck-targets check-headers check-sources jclean jtest package analyze tags rocksdbjavastatic% unity.% unity_test, $(MAKECMDGOALS))
ifneq ("$(ROCKS_DEP_RULES)", "")
-include $(DEPFILES)
endif


@ -4,4 +4,3 @@ This is the list of all known third-party plugins for RocksDB. If something is m
* [HDFS](https://github.com/riversand963/rocksdb-hdfs-env): an Env used for interacting with HDFS. Migrated from main RocksDB repo
* [ZenFS](https://github.com/westerndigitalcorporation/zenfs): a file system for zoned block devices
* [RADOS](https://github.com/riversand963/rocksdb-rados-env): an Env used for interacting with RADOS. Migrated from RocksDB main repo.
* [PMEM](https://github.com/pmem/pmem-rocksdb-plugin): a collection of plugins to enable Persistent Memory on RocksDB.


@ -4,7 +4,7 @@ RocksDBLite is a project focused on mobile use cases, which don't need a lot of
Some examples of the features disabled by ROCKSDB_LITE:
* compiled-in support for LDB tool
* No backup engine
* No backupable DB
* No support for replication (which we provide in form of TransactionalIterator)
* No advanced monitoring tools
* No special-purpose memtables that are highly optimized for specific use cases

TARGETS (4044 changed lines; diff suppressed because it is too large)

Two further file diffs suppressed because they are too large.

buckifier/buckify_rocksdb.py (38 changed lines; Executable file → Normal file)

@ -1,4 +1,3 @@
#!/usr/bin/env python3
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.
from __future__ import absolute_import
from __future__ import division
@ -144,8 +143,7 @@ def generate_targets(repo_path, deps_map):
src_mk["LIB_SOURCES"] +
# always add range_tree, it's only excluded on ppc64, which we don't use internally
src_mk["RANGE_TREE_SOURCES"] +
src_mk["TOOL_LIB_SOURCES"],
deps=["//folly/container:f14_hash"])
src_mk["TOOL_LIB_SOURCES"])
# rocksdb_whole_archive_lib
TARGETS.add_library(
"rocksdb_whole_archive_lib",
@ -153,7 +151,7 @@ def generate_targets(repo_path, deps_map):
# always add range_tree, it's only excluded on ppc64, which we don't use internally
src_mk["RANGE_TREE_SOURCES"] +
src_mk["TOOL_LIB_SOURCES"],
deps=["//folly/container:f14_hash"],
deps=None,
headers=None,
extra_external_deps="",
link_whole=True)
@ -185,10 +183,6 @@ def generate_targets(repo_path, deps_map):
src_mk.get("ANALYZER_LIB_SOURCES", [])
+ src_mk.get('STRESS_LIB_SOURCES', [])
+ ["test_util/testutil.cc"])
# db_stress binary
TARGETS.add_binary("db_stress",
["db_stress_tool/db_stress.cc"],
[":rocksdb_stress_lib"])
# bench binaries
for src in src_mk.get("MICROBENCH_SOURCES", []):
name = src.rsplit('/',1)[1].split('.')[0] if '/' in src else src.split('.')[0]
@ -216,34 +210,14 @@ def generate_targets(repo_path, deps_map):
with open(f"{repo_path}/buckifier/bench.json") as json_file:
fast_fancy_bench_config_list = json.load(json_file)
for config_dict in fast_fancy_bench_config_list:
clean_benchmarks = {}
benchmarks = config_dict['benchmarks']
for binary, benchmark_dict in benchmarks.items():
clean_benchmarks[binary] = {}
for benchmark, overloaded_metric_list in benchmark_dict.items():
clean_benchmarks[binary][benchmark] = []
for metric in overloaded_metric_list:
if not isinstance(metric, dict):
clean_benchmarks[binary][benchmark].append(metric)
TARGETS.add_fancy_bench_config(config_dict['name'], clean_benchmarks, False, config_dict['expected_runtime_one_iter'], config_dict['sl_iterations'], config_dict['regression_threshold'])
TARGETS.add_fancy_bench_config(config_dict['name'],config_dict['benchmarks'], False, config_dict['expected_runtime'])
with open(f"{repo_path}/buckifier/bench-slow.json") as json_file:
slow_fancy_bench_config_list = json.load(json_file)
for config_dict in slow_fancy_bench_config_list:
clean_benchmarks = {}
benchmarks = config_dict['benchmarks']
for binary, benchmark_dict in benchmarks.items():
clean_benchmarks[binary] = {}
for benchmark, overloaded_metric_list in benchmark_dict.items():
clean_benchmarks[binary][benchmark] = []
for metric in overloaded_metric_list:
if not isinstance(metric, dict):
clean_benchmarks[binary][benchmark].append(metric)
for config_dict in slow_fancy_bench_config_list:
TARGETS.add_fancy_bench_config(config_dict['name']+"_slow", clean_benchmarks, True, config_dict['expected_runtime_one_iter'], config_dict['sl_iterations'], config_dict['regression_threshold'])
# it is better that servicelab experiments break
# than rocksdb github ci
except Exception:
TARGETS.add_fancy_bench_config(config_dict['name']+"_slow",config_dict['benchmarks'], True, config_dict['expected_runtime'])
except (FileNotFoundError, KeyError):
pass
TARGETS.add_test_header()


@ -92,14 +92,12 @@ add_c_test_wrapper()
# will not be included.
""")
def add_fancy_bench_config(self, name, bench_config, slow, expected_runtime, sl_iterations, regression_threshold):
def add_fancy_bench_config(self, name, bench_config, slow, expected_runtime):
self.targets_file.write(targets_cfg.fancy_bench_template.format(
name=name,
bench_config=pprint.pformat(bench_config),
slow=slow,
expected_runtime=expected_runtime,
sl_iterations=sl_iterations,
regression_threshold=regression_threshold
).encode("utf-8"))
def register_test(self,


@ -41,6 +41,6 @@ cpp_unittest_wrapper(name="{test_name}",
"""
fancy_bench_template = """
fancy_bench_wrapper(suite_name="{name}", binary_to_bench_to_metric_list_map={bench_config}, slow={slow}, expected_runtime={expected_runtime}, sl_iterations={sl_iterations}, regression_threshold={regression_threshold})
fancy_bench_wrapper(suite_name="{name}", binary_to_bench_to_metric_list_map={bench_config}, slow={slow}, expected_runtime={expected_runtime})
"""


@ -63,8 +63,13 @@ if [ -z "$ROCKSDB_NO_FBCODE" -a -d /mnt/gvfs/third-party ]; then
if [ "$LIB_MODE" == "shared" ]; then
PIC_BUILD=1
fi
if [ -n "$ROCKSDB_FBCODE_BUILD_WITH_PLATFORM010" ]; then
source "$PWD/build_tools/fbcode_config_platform010.sh"
if [ -n "$ROCKSDB_FBCODE_BUILD_WITH_481" ]; then
# we need this to build with MySQL. Don't use for other purposes.
source "$PWD/build_tools/fbcode_config4.8.1.sh"
elif [ -n "$ROCKSDB_FBCODE_BUILD_WITH_5xx" ]; then
source "$PWD/build_tools/fbcode_config.sh"
elif [ -n "$ROCKSDB_FBCODE_BUILD_WITH_PLATFORM007" ]; then
source "$PWD/build_tools/fbcode_config_platform007.sh"
elif [ -n "$ROCKSDB_FBCODE_BUILD_WITH_PLATFORM009" ]; then
source "$PWD/build_tools/fbcode_config_platform009.sh"
else
@ -469,7 +474,7 @@ EOF
if ! test $ROCKSDB_DISABLE_MEMKIND; then
# Test whether memkind library is installed
$CXX $PLATFORM_CXXFLAGS $LDFLAGS -x c++ - -o test.o -lmemkind 2>/dev/null <<EOF
$CXX $PLATFORM_CXXFLAGS $COMMON_FLAGS -lmemkind -x c++ - -o test.o 2>/dev/null <<EOF
#include <memkind.h>
int main() {
memkind_malloc(MEMKIND_DAX_KMEM, 1024);
@ -593,7 +598,7 @@ EOF
fi
if ! test $ROCKSDB_DISABLE_BENCHMARK; then
# Test whether google benchmark is available
$CXX $PLATFORM_CXXFLAGS -x c++ - -o /dev/null -lbenchmark -lpthread 2>/dev/null <<EOF
$CXX $PLATFORM_CXXFLAGS -x c++ - -o /dev/null -lbenchmark 2>/dev/null <<EOF
#include <benchmark/benchmark.h>
int main() {}
EOF
@ -633,9 +638,6 @@ if test "0$PORTABLE" -eq 0; then
COMMON_FLAGS="$COMMON_FLAGS -march=z196 "
fi
COMMON_FLAGS="$COMMON_FLAGS"
elif test -n "`echo $TARGET_ARCHITECTURE | grep ^riscv64`"; then
RISC_ISA=$(cat /proc/cpuinfo | grep isa | head -1 | cut --delimiter=: -f 2 | cut -b 2-)
COMMON_FLAGS="$COMMON_FLAGS -march=${RISC_ISA}"
elif [ "$TARGET_OS" == "IOS" ]; then
COMMON_FLAGS="$COMMON_FLAGS"
elif [ "$TARGET_OS" == "AIX" ] || [ "$TARGET_OS" == "SunOS" ]; then
@ -656,11 +658,6 @@ else
COMMON_FLAGS="$COMMON_FLAGS -march=z196 "
fi
if test -n "`echo $TARGET_ARCHITECTURE | grep ^riscv64`"; then
RISC_ISA=$(cat /proc/cpuinfo | grep isa | head -1 | cut --delimiter=: -f 2 | cut -b 2-)
COMMON_FLAGS="$COMMON_FLAGS -march=${RISC_ISA}"
fi
if [[ "${PLATFORM}" == "OS_MACOSX" ]]; then
# For portability compile for macOS 10.12 (2016) or newer
COMMON_FLAGS="$COMMON_FLAGS -mmacosx-version-min=10.12"
@ -880,8 +877,8 @@ if test -n "$WITH_JEMALLOC_FLAG"; then
echo "WITH_JEMALLOC_FLAG=$WITH_JEMALLOC_FLAG" >> "$OUTPUT"
fi
echo "LUA_PATH=$LUA_PATH" >> "$OUTPUT"
if test -n "$USE_FOLLY"; then
echo "USE_FOLLY=$USE_FOLLY" >> "$OUTPUT"
if test -n "$USE_FOLLY_DISTRIBUTED_MUTEX"; then
echo "USE_FOLLY_DISTRIBUTED_MUTEX=$USE_FOLLY_DISTRIBUTED_MUTEX" >> "$OUTPUT"
fi
if test -n "$PPC_LIBC_IS_GNU"; then
echo "PPC_LIBC_IS_GNU=$PPC_LIBC_IS_GNU" >> "$OUTPUT"


@ -0,0 +1,19 @@
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.
GCC_BASE=/mnt/gvfs/third-party2/gcc/7331085db891a2ef4a88a48a751d834e8d68f4cb/5.x/centos7-native/c447969
CLANG_BASE=/mnt/gvfs/third-party2/llvm-fb/1bd23f9917738974ad0ff305aa23eb5f93f18305/9.0.0/centos7-native/c9f9104
LIBGCC_BASE=/mnt/gvfs/third-party2/libgcc/6ace84e956873d53638c738b6f65f3f469cca74c/5.x/gcc-5-glibc-2.23/339d858
GLIBC_BASE=/mnt/gvfs/third-party2/glibc/192b0f42d63dcf6210d6ceae387b49af049e6e0c/2.23/gcc-5-glibc-2.23/ca1d1c0
SNAPPY_BASE=/mnt/gvfs/third-party2/snappy/7f9bdaada18f59bc27ec2b0871eb8a6144343aef/1.1.3/gcc-5-glibc-2.23/9bc6787
ZLIB_BASE=/mnt/gvfs/third-party2/zlib/2d9f0b9a4274cc21f61272a9e89bdb859bce8f1f/1.2.8/gcc-5-glibc-2.23/9bc6787
BZIP2_BASE=/mnt/gvfs/third-party2/bzip2/dc49a21c5fceec6456a7a28a94dcd16690af1337/1.0.6/gcc-5-glibc-2.23/9bc6787
LZ4_BASE=/mnt/gvfs/third-party2/lz4/0f607f8fc442ea7d6b876931b1898bb573d5e5da/1.9.1/gcc-5-glibc-2.23/9bc6787
ZSTD_BASE=/mnt/gvfs/third-party2/zstd/ca22bc441a4eb709e9e0b1f9fec9750fed7b31c5/1.4.x/gcc-5-glibc-2.23/03859b5
GFLAGS_BASE=/mnt/gvfs/third-party2/gflags/0b9929d2588991c65a57168bf88aff2db87c5d48/2.2.0/gcc-5-glibc-2.23/9bc6787
JEMALLOC_BASE=/mnt/gvfs/third-party2/jemalloc/c26f08f47ac35fc31da2633b7da92d6b863246eb/master/gcc-5-glibc-2.23/0c8f76d
NUMA_BASE=/mnt/gvfs/third-party2/numa/3f3fb57a5ccc5fd21c66416c0b83e0aa76a05376/2.0.11/gcc-5-glibc-2.23/9bc6787
LIBUNWIND_BASE=/mnt/gvfs/third-party2/libunwind/40c73d874898b386a71847f1b99115d93822d11f/1.4/gcc-5-glibc-2.23/b443de1
TBB_BASE=/mnt/gvfs/third-party2/tbb/4ce8e8dba77cdbd81b75d6f0c32fd7a1b76a11ec/2018_U5/gcc-5-glibc-2.23/9bc6787
KERNEL_HEADERS_BASE=/mnt/gvfs/third-party2/kernel-headers/fb251ecd2f5ae16f8671f7014c246e52a748fe0b/4.0.9-36_fbk5_2933_gd092e3f/gcc-5-glibc-2.23/da39a3e
BINUTILS_BASE=/mnt/gvfs/third-party2/binutils/2e3cb7d119b3cea5f1e738cc13a1ac69f49eb875/2.29.1/centos7-native/da39a3e
VALGRIND_BASE=/mnt/gvfs/third-party2/valgrind/d42d152a15636529b0861ec493927200ebebca8e/3.15.0/gcc-5-glibc-2.23/9bc6787
LUA_BASE=/mnt/gvfs/third-party2/lua/f0cd714433206d5139df61659eb7b28b1dea6683/5.2.3/gcc-5-glibc-2.23/65372bd


@ -0,0 +1,20 @@
# shellcheck disable=SC2148
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.
GCC_BASE=/mnt/gvfs/third-party2/gcc/cf7d14c625ce30bae1a4661c2319c5a283e4dd22/4.8.1/centos6-native/cc6c9dc
CLANG_BASE=/mnt/gvfs/third-party2/llvm-fb/8598c375b0e94e1448182eb3df034704144a838d/stable/centos6-native/3f16ddd
LIBGCC_BASE=/mnt/gvfs/third-party2/libgcc/d6e0a7da6faba45f5e5b1638f9edd7afc2f34e7d/4.8.1/gcc-4.8.1-glibc-2.17/8aac7fc
GLIBC_BASE=/mnt/gvfs/third-party2/glibc/d282e6e8f3d20f4e40a516834847bdc038e07973/2.17/gcc-4.8.1-glibc-2.17/99df8fc
SNAPPY_BASE=/mnt/gvfs/third-party2/snappy/8c38a4c1e52b4c2cc8a9cdc31b9c947ed7dbfcb4/1.1.3/gcc-4.8.1-glibc-2.17/c3f970a
ZLIB_BASE=/mnt/gvfs/third-party2/zlib/0882df3713c7a84f15abe368dc004581f20b39d7/1.2.8/gcc-4.8.1-glibc-2.17/c3f970a
BZIP2_BASE=/mnt/gvfs/third-party2/bzip2/740325875f6729f42d28deaa2147b0854f3a347e/1.0.6/gcc-4.8.1-glibc-2.17/c3f970a
LZ4_BASE=/mnt/gvfs/third-party2/lz4/0e790b441e2d9acd68d51e1d2e028f88c6a79ddf/r131/gcc-4.8.1-glibc-2.17/c3f970a
ZSTD_BASE=/mnt/gvfs/third-party2/zstd/9455f75ff7f4831dc9fda02a6a0f8c68922fad8f/1.0.0/gcc-4.8.1-glibc-2.17/c3f970a
GFLAGS_BASE=/mnt/gvfs/third-party2/gflags/f001a51b2854957676d07306ef3abf67186b5c8b/2.1.1/gcc-4.8.1-glibc-2.17/c3f970a
JEMALLOC_BASE=/mnt/gvfs/third-party2/jemalloc/fc8a13ca1fffa4d0765c716c5a0b49f0c107518f/master/gcc-4.8.1-glibc-2.17/8d31e51
NUMA_BASE=/mnt/gvfs/third-party2/numa/17c514c4d102a25ca15f4558be564eeed76f4b6a/2.0.8/gcc-4.8.1-glibc-2.17/c3f970a
LIBUNWIND_BASE=/mnt/gvfs/third-party2/libunwind/ad576de2a1ea560c4d3434304f0fc4e079bede42/trunk/gcc-4.8.1-glibc-2.17/675d945
TBB_BASE=/mnt/gvfs/third-party2/tbb/9d9a554877d0c5bef330fe818ab7178806dd316a/4.0_update2/gcc-4.8.1-glibc-2.17/c3f970a
KERNEL_HEADERS_BASE=/mnt/gvfs/third-party2/kernel-headers/7c111ff27e0c466235163f00f280a9d617c3d2ec/4.0.9-36_fbk5_2933_gd092e3f/gcc-4.8.1-glibc-2.17/da39a3e
BINUTILS_BASE=/mnt/gvfs/third-party2/binutils/b7fd454c4b10c6a81015d4524ed06cdeab558490/2.26/centos6-native/da39a3e
VALGRIND_BASE=/mnt/gvfs/third-party2/valgrind/d7f4d4d86674a57668e3a96f76f0e17dd0eb8765/3.8.1/gcc-4.8.1-glibc-2.17/c3f970a
LUA_BASE=/mnt/gvfs/third-party2/lua/61e4abf5813bbc39bc4f548757ccfcadde175a48/5.2.3/centos6-native/730f94e


@ -0,0 +1,20 @@
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.
GCC_BASE=/mnt/gvfs/third-party2/gcc/7331085db891a2ef4a88a48a751d834e8d68f4cb/7.x/centos7-native/b2ef2b6
CLANG_BASE=/mnt/gvfs/third-party2/llvm-fb/963d9aeda70cc4779885b1277484fe7544a04e3e/9.0.0/platform007/9e92d53/
LIBGCC_BASE=/mnt/gvfs/third-party2/libgcc/6ace84e956873d53638c738b6f65f3f469cca74c/7.x/platform007/5620abc
GLIBC_BASE=/mnt/gvfs/third-party2/glibc/192b0f42d63dcf6210d6ceae387b49af049e6e0c/2.26/platform007/f259413
SNAPPY_BASE=/mnt/gvfs/third-party2/snappy/7f9bdaada18f59bc27ec2b0871eb8a6144343aef/1.1.3/platform007/ca4da3d
ZLIB_BASE=/mnt/gvfs/third-party2/zlib/2d9f0b9a4274cc21f61272a9e89bdb859bce8f1f/1.2.8/platform007/ca4da3d
BZIP2_BASE=/mnt/gvfs/third-party2/bzip2/dc49a21c5fceec6456a7a28a94dcd16690af1337/1.0.6/platform007/ca4da3d
LZ4_BASE=/mnt/gvfs/third-party2/lz4/0f607f8fc442ea7d6b876931b1898bb573d5e5da/1.9.1/platform007/ca4da3d
ZSTD_BASE=/mnt/gvfs/third-party2/zstd/ca22bc441a4eb709e9e0b1f9fec9750fed7b31c5/1.4.x/platform007/15a3614
GFLAGS_BASE=/mnt/gvfs/third-party2/gflags/0b9929d2588991c65a57168bf88aff2db87c5d48/2.2.0/platform007/ca4da3d
JEMALLOC_BASE=/mnt/gvfs/third-party2/jemalloc/c26f08f47ac35fc31da2633b7da92d6b863246eb/master/platform007/c26c002
NUMA_BASE=/mnt/gvfs/third-party2/numa/3f3fb57a5ccc5fd21c66416c0b83e0aa76a05376/2.0.11/platform007/ca4da3d
LIBUNWIND_BASE=/mnt/gvfs/third-party2/libunwind/40c73d874898b386a71847f1b99115d93822d11f/1.4/platform007/6f3e0a9
TBB_BASE=/mnt/gvfs/third-party2/tbb/4ce8e8dba77cdbd81b75d6f0c32fd7a1b76a11ec/2018_U5/platform007/ca4da3d
LIBURING_BASE=/mnt/gvfs/third-party2/liburing/79427253fd0d42677255aacfe6d13bfe63f752eb/20190828/platform007/ca4da3d
KERNEL_HEADERS_BASE=/mnt/gvfs/third-party2/kernel-headers/fb251ecd2f5ae16f8671f7014c246e52a748fe0b/fb/platform007/da39a3e
BINUTILS_BASE=/mnt/gvfs/third-party2/binutils/ab9f09bba370e7066cafd4eb59752db93f2e8312/2.29.1/platform007/15a3614
VALGRIND_BASE=/mnt/gvfs/third-party2/valgrind/d42d152a15636529b0861ec493927200ebebca8e/3.15.0/platform007/ca4da3d
LUA_BASE=/mnt/gvfs/third-party2/lua/f0cd714433206d5139df61659eb7b28b1dea6683/5.3.4/platform007/5007832


@ -18,5 +18,4 @@ KERNEL_HEADERS_BASE=/mnt/gvfs/third-party2/kernel-headers/32b8a2407b634df3f8f948
BINUTILS_BASE=/mnt/gvfs/third-party2/binutils/08634589372fa5f237bfd374e8c644a8364e78c1/2.32/platform009/ba86d1f/
VALGRIND_BASE=/mnt/gvfs/third-party2/valgrind/6ae525939ad02e5e676855082fbbc7828dbafeac/3.15.0/platform009/7f3b187
LUA_BASE=/mnt/gvfs/third-party2/lua/162efd9561a3d21f6869f4814011e9cf1b3ff4dc/5.3.4/platform009/a6271c4
BENCHMARK_BASE=/mnt/gvfs/third-party2/benchmark/30bf49ad6414325e17f3425b0edcb64239427ae3/1.6.1/platform009/7f3b187
BOOST_BASE=/mnt/gvfs/third-party2/boost/201b7d74941e54b436dfa364a063aa6d2cd7de4c/1.69.0/platform009/8a7ffdf
BENCHMARK_BASE=/mnt/gvfs/third-party2/benchmark/bce8d9564eaf161700aa3a20b1051564acf555fb/1.5.5/platform009/7f3b187


@ -1,22 +0,0 @@
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.
# The file is generated using update_dependencies.sh.
GCC_BASE=/mnt/gvfs/third-party2/gcc/e40bde78650fa91b8405a857e3f10bf336633fb0/11.x/centos7-native/886b5eb
CLANG_BASE=/mnt/gvfs/third-party2/llvm-fb/2043340983c032915adbb6f78903dc855b65aee8/12/platform010/9520e0f
LIBGCC_BASE=/mnt/gvfs/third-party2/libgcc/c00dcc6a3e4125c7e8b248e9a79c14b78ac9e0ca/11.x/platform010/5684a5a
GLIBC_BASE=/mnt/gvfs/third-party2/glibc/0b9c8e4b060eda62f3bc1c6127bbe1256697569b/2.34/platform010/f259413
SNAPPY_BASE=/mnt/gvfs/third-party2/snappy/bc9647f7912b131315827d65cb6189c21f381d05/1.1.3/platform010/76ebdda
ZLIB_BASE=/mnt/gvfs/third-party2/zlib/a6f5f3f1d063d2d00cd02fc12f0f05fc3ab3a994/1.2.11/platform010/76ebdda
BZIP2_BASE=/mnt/gvfs/third-party2/bzip2/09703139cfc376bd8a82642385a0e97726b28287/1.0.6/platform010/76ebdda
LZ4_BASE=/mnt/gvfs/third-party2/lz4/60220d6a5bf7722b9cc239a1368c596619b12060/1.9.1/platform010/76ebdda
ZSTD_BASE=/mnt/gvfs/third-party2/zstd/50eace8143eaaea9473deae1f3283e0049e05633/1.4.x/platform010/64091f4
GFLAGS_BASE=/mnt/gvfs/third-party2/gflags/5d27e5919771603da06000a027b12f799e58a4f7/2.2.0/platform010/76ebdda
JEMALLOC_BASE=/mnt/gvfs/third-party2/jemalloc/b62912d333ef33f9760efa6219dbe3fe6abb3b0e/master/platform010/f57cc4a
NUMA_BASE=/mnt/gvfs/third-party2/numa/6b412770957aa3c8a87e5e0dcd8cc2f45f393bc0/2.0.11/platform010/76ebdda
LIBUNWIND_BASE=/mnt/gvfs/third-party2/libunwind/52f69816e936e147664ad717eb71a1a0e9dc973a/1.4/platform010/5074a48
TBB_BASE=/mnt/gvfs/third-party2/tbb/c9cc192099fa84c0dcd0ffeedd44a373ad6e4925/2018_U5/platform010/76ebdda
LIBURING_BASE=/mnt/gvfs/third-party2/liburing/a98e2d137007e3ebf7f33bd6f99c2c56bdaf8488/20210212/platform010/76ebdda
BENCHMARK_BASE=/mnt/gvfs/third-party2/benchmark/780c7a0f9cf0967961e69ad08e61cddd85d61821/trunk/platform010/76ebdda
KERNEL_HEADERS_BASE=/mnt/gvfs/third-party2/kernel-headers/02d9f76aaaba580611cf75e741753c800c7fdc12/fb/platform010/da39a3e
BINUTILS_BASE=/mnt/gvfs/third-party2/binutils/938dc3f064ef3a48c0446f5b11d788d50b3eb5ee/2.37/centos7-native/da39a3e
VALGRIND_BASE=/mnt/gvfs/third-party2/valgrind/429a6b3203eb415f1599bd15183659153129188e/3.15.0/platform010/76ebdda
LUA_BASE=/mnt/gvfs/third-party2/lua/363787fa5cac2a8aa20638909210443278fa138e/5.3.4/platform010/9079c97


@ -21,48 +21,38 @@ LIBGCC_LIBS=" -L $LIBGCC_BASE/lib"
GLIBC_INCLUDE="$GLIBC_BASE/include"
GLIBC_LIBS=" -L $GLIBC_BASE/lib"
if ! test $ROCKSDB_DISABLE_SNAPPY; then
# snappy
SNAPPY_INCLUDE=" -I $SNAPPY_BASE/include/"
if test -z $PIC_BUILD; then
SNAPPY_LIBS=" $SNAPPY_BASE/lib/libsnappy.a"
else
SNAPPY_LIBS=" $SNAPPY_BASE/lib/libsnappy_pic.a"
fi
CFLAGS+=" -DSNAPPY"
# snappy
SNAPPY_INCLUDE=" -I $SNAPPY_BASE/include/"
if test -z $PIC_BUILD; then
SNAPPY_LIBS=" $SNAPPY_BASE/lib/libsnappy.a"
else
SNAPPY_LIBS=" $SNAPPY_BASE/lib/libsnappy_pic.a"
fi
CFLAGS+=" -DSNAPPY"
if test -z $PIC_BUILD; then
if ! test $ROCKSDB_DISABLE_ZLIB; then
# location of zlib headers and libraries
ZLIB_INCLUDE=" -I $ZLIB_BASE/include/"
ZLIB_LIBS=" $ZLIB_BASE/lib/libz.a"
CFLAGS+=" -DZLIB"
fi
# location of zlib headers and libraries
ZLIB_INCLUDE=" -I $ZLIB_BASE/include/"
ZLIB_LIBS=" $ZLIB_BASE/lib/libz.a"
CFLAGS+=" -DZLIB"
if ! test $ROCKSDB_DISABLE_BZIP; then
# location of bzip headers and libraries
BZIP_INCLUDE=" -I $BZIP2_BASE/include/"
BZIP_LIBS=" $BZIP2_BASE/lib/libbz2.a"
CFLAGS+=" -DBZIP2"
fi
# location of bzip headers and libraries
BZIP_INCLUDE=" -I $BZIP2_BASE/include/"
BZIP_LIBS=" $BZIP2_BASE/lib/libbz2.a"
CFLAGS+=" -DBZIP2"
if ! test $ROCKSDB_DISABLE_LZ4; then
LZ4_INCLUDE=" -I $LZ4_BASE/include/"
LZ4_LIBS=" $LZ4_BASE/lib/liblz4.a"
CFLAGS+=" -DLZ4"
fi
LZ4_INCLUDE=" -I $LZ4_BASE/include/"
LZ4_LIBS=" $LZ4_BASE/lib/liblz4.a"
CFLAGS+=" -DLZ4"
fi
if ! test $ROCKSDB_DISABLE_ZSTD; then
ZSTD_INCLUDE=" -I $ZSTD_BASE/include/"
if test -z $PIC_BUILD; then
ZSTD_LIBS=" $ZSTD_BASE/lib/libzstd.a"
else
ZSTD_LIBS=" $ZSTD_BASE/lib/libzstd_pic.a"
fi
CFLAGS+=" -DZSTD -DZSTD_STATIC_LINKING_ONLY"
ZSTD_INCLUDE=" -I $ZSTD_BASE/include/"
if test -z $PIC_BUILD; then
ZSTD_LIBS=" $ZSTD_BASE/lib/libzstd.a"
else
ZSTD_LIBS=" $ZSTD_BASE/lib/libzstd_pic.a"
fi
CFLAGS+=" -DZSTD -DZSTD_STATIC_LINKING_ONLY"
# location of gflags headers and libraries
GFLAGS_INCLUDE=" -I $GFLAGS_BASE/include/"
@ -172,4 +162,6 @@ else
LUA_LIB=" $LUA_PATH/lib/liblua_pic.a"
fi
USE_FOLLY_DISTRIBUTED_MUTEX=1
export CC CXX AR CFLAGS CXXFLAGS EXEC_LDFLAGS EXEC_LDFLAGS_SHARED VALGRIND_VER JEMALLOC_LIB JEMALLOC_INCLUDE CLANG_ANALYZER CLANG_SCAN_BUILD LUA_PATH LUA_LIB


@ -0,0 +1,120 @@
#!/bin/sh
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.
#
# Set environment variables so that we can compile rocksdb using
# fbcode settings. It uses the latest g++ compiler and also
# uses jemalloc
BASEDIR=`dirname $BASH_SOURCE`
source "$BASEDIR/dependencies_4.8.1.sh"
# location of libgcc
LIBGCC_INCLUDE="$LIBGCC_BASE/include"
LIBGCC_LIBS=" -L $LIBGCC_BASE/lib"
# location of glibc
GLIBC_INCLUDE="$GLIBC_BASE/include"
GLIBC_LIBS=" -L $GLIBC_BASE/lib"
# location of snappy headers and libraries
SNAPPY_INCLUDE=" -I $SNAPPY_BASE/include"
SNAPPY_LIBS=" $SNAPPY_BASE/lib/libsnappy.a"
# location of zlib headers and libraries
ZLIB_INCLUDE=" -I $ZLIB_BASE/include"
ZLIB_LIBS=" $ZLIB_BASE/lib/libz.a"
# location of bzip headers and libraries
BZIP2_INCLUDE=" -I $BZIP2_BASE/include/"
BZIP2_LIBS=" $BZIP2_BASE/lib/libbz2.a"
LZ4_INCLUDE=" -I $LZ4_BASE/include"
LZ4_LIBS=" $LZ4_BASE/lib/liblz4.a"
ZSTD_INCLUDE=" -I $ZSTD_BASE/include"
ZSTD_LIBS=" $ZSTD_BASE/lib/libzstd.a"
# location of gflags headers and libraries
GFLAGS_INCLUDE=" -I $GFLAGS_BASE/include/"
GFLAGS_LIBS=" $GFLAGS_BASE/lib/libgflags.a"
# location of jemalloc
JEMALLOC_INCLUDE=" -I $JEMALLOC_BASE/include"
JEMALLOC_LIB="$JEMALLOC_BASE/lib/libjemalloc.a"
# location of numa
NUMA_INCLUDE=" -I $NUMA_BASE/include/"
NUMA_LIB=" $NUMA_BASE/lib/libnuma.a"
# location of libunwind
LIBUNWIND="$LIBUNWIND_BASE/lib/libunwind.a"
# location of tbb
TBB_INCLUDE=" -isystem $TBB_BASE/include/"
TBB_LIBS="$TBB_BASE/lib/libtbb.a"
test "$USE_SSE" || USE_SSE=1
export USE_SSE
test "$PORTABLE" || PORTABLE=1
export PORTABLE
BINUTILS="$BINUTILS_BASE/bin"
AR="$BINUTILS/ar"
DEPS_INCLUDE="$SNAPPY_INCLUDE $ZLIB_INCLUDE $BZIP2_INCLUDE $LZ4_INCLUDE $ZSTD_INCLUDE $GFLAGS_INCLUDE $NUMA_INCLUDE $TBB_INCLUDE"
STDLIBS="-L $GCC_BASE/lib64"
if [ -z "$USE_CLANG" ]; then
# gcc
CC="$GCC_BASE/bin/gcc"
CXX="$GCC_BASE/bin/g++"
CXX="$GCC_BASE/bin/gcc-ar"
CFLAGS="-B$BINUTILS/gold -m64 -mtune=generic"
CFLAGS+=" -isystem $GLIBC_INCLUDE"
CFLAGS+=" -isystem $LIBGCC_INCLUDE"
JEMALLOC=1
else
# clang
CLANG_BIN="$CLANG_BASE/bin"
CLANG_LIB="$CLANG_BASE/lib"
CLANG_INCLUDE="$CLANG_LIB/clang/*/include"
CC="$CLANG_BIN/clang"
CXX="$CLANG_BIN/clang++"
AR="$CLANG_BIN/llvm-ar"
KERNEL_HEADERS_INCLUDE="$KERNEL_HEADERS_BASE/include/"
CFLAGS="-B$BINUTILS/gold -nostdinc -nostdlib"
CFLAGS+=" -isystem $LIBGCC_BASE/include/c++/4.8.1 "
CFLAGS+=" -isystem $LIBGCC_BASE/include/c++/4.8.1/x86_64-facebook-linux "
CFLAGS+=" -isystem $GLIBC_INCLUDE"
CFLAGS+=" -isystem $LIBGCC_INCLUDE"
CFLAGS+=" -isystem $CLANG_INCLUDE"
CFLAGS+=" -isystem $KERNEL_HEADERS_INCLUDE/linux "
CFLAGS+=" -isystem $KERNEL_HEADERS_INCLUDE "
CXXFLAGS="-nostdinc++"
fi
CFLAGS+=" $DEPS_INCLUDE"
CFLAGS+=" -DROCKSDB_PLATFORM_POSIX -DROCKSDB_LIB_IO_POSIX -DROCKSDB_FALLOCATE_PRESENT -DROCKSDB_MALLOC_USABLE_SIZE -DROCKSDB_RANGESYNC_PRESENT -DROCKSDB_SCHED_GETCPU_PRESENT -DROCKSDB_SUPPORT_THREAD_LOCAL -DHAVE_SSE42"
CFLAGS+=" -DSNAPPY -DGFLAGS=google -DZLIB -DBZIP2 -DLZ4 -DZSTD -DNUMA -DTBB"
CXXFLAGS+=" $CFLAGS"
EXEC_LDFLAGS=" $SNAPPY_LIBS $ZLIB_LIBS $BZIP2_LIBS $LZ4_LIBS $ZSTD_LIBS $GFLAGS_LIBS $NUMA_LIB $TBB_LIBS"
EXEC_LDFLAGS+=" -Wl,--dynamic-linker,/usr/local/fbcode/gcc-4.8.1-glibc-2.17/lib/ld.so"
EXEC_LDFLAGS+=" $LIBUNWIND"
EXEC_LDFLAGS+=" -Wl,-rpath=/usr/local/fbcode/gcc-4.8.1-glibc-2.17/lib"
# required by libtbb
EXEC_LDFLAGS+=" -ldl"
PLATFORM_LDFLAGS="$LIBGCC_LIBS $GLIBC_LIBS $STDLIBS -lgcc -lstdc++"
EXEC_LDFLAGS_SHARED="$SNAPPY_LIBS $ZLIB_LIBS $BZIP2_LIBS $LZ4_LIBS $ZSTD_LIBS $GFLAGS_LIBS"
VALGRIND_VER="$VALGRIND_BASE/bin/"
LUA_PATH="$LUA_BASE"
export CC CXX AR CFLAGS CXXFLAGS EXEC_LDFLAGS EXEC_LDFLAGS_SHARED VALGRIND_VER JEMALLOC_LIB JEMALLOC_INCLUDE LUA_PATH


@ -0,0 +1,170 @@
#!/bin/sh
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.
#
# Set environment variables so that we can compile rocksdb using
# fbcode settings. It uses the latest g++ and clang compilers and also
# uses jemalloc
# Environment variables that change the behavior of this script:
# PIC_BUILD -- if true, it will only take pic versions of libraries from fbcode. libraries that don't have pic variant will not be included
BASEDIR=`dirname $BASH_SOURCE`
source "$BASEDIR/dependencies_platform007.sh"
CFLAGS=""
# libgcc
LIBGCC_INCLUDE="$LIBGCC_BASE/include/c++/7.3.0"
LIBGCC_LIBS=" -L $LIBGCC_BASE/lib"
# glibc
GLIBC_INCLUDE="$GLIBC_BASE/include"
GLIBC_LIBS=" -L $GLIBC_BASE/lib"
# snappy
SNAPPY_INCLUDE=" -I $SNAPPY_BASE/include/"
if test -z $PIC_BUILD; then
SNAPPY_LIBS=" $SNAPPY_BASE/lib/libsnappy.a"
else
SNAPPY_LIBS=" $SNAPPY_BASE/lib/libsnappy_pic.a"
fi
CFLAGS+=" -DSNAPPY"
if test -z $PIC_BUILD; then
# location of zlib headers and libraries
ZLIB_INCLUDE=" -I $ZLIB_BASE/include/"
ZLIB_LIBS=" $ZLIB_BASE/lib/libz.a"
CFLAGS+=" -DZLIB"
# location of bzip headers and libraries
BZIP_INCLUDE=" -I $BZIP2_BASE/include/"
BZIP_LIBS=" $BZIP2_BASE/lib/libbz2.a"
CFLAGS+=" -DBZIP2"
LZ4_INCLUDE=" -I $LZ4_BASE/include/"
LZ4_LIBS=" $LZ4_BASE/lib/liblz4.a"
CFLAGS+=" -DLZ4"
fi
ZSTD_INCLUDE=" -I $ZSTD_BASE/include/"
if test -z $PIC_BUILD; then
ZSTD_LIBS=" $ZSTD_BASE/lib/libzstd.a"
else
ZSTD_LIBS=" $ZSTD_BASE/lib/libzstd_pic.a"
fi
CFLAGS+=" -DZSTD"
# location of gflags headers and libraries
GFLAGS_INCLUDE=" -I $GFLAGS_BASE/include/"
if test -z $PIC_BUILD; then
GFLAGS_LIBS=" $GFLAGS_BASE/lib/libgflags.a"
else
GFLAGS_LIBS=" $GFLAGS_BASE/lib/libgflags_pic.a"
fi
CFLAGS+=" -DGFLAGS=gflags"
# location of jemalloc
JEMALLOC_INCLUDE=" -I $JEMALLOC_BASE/include/"
JEMALLOC_LIB=" $JEMALLOC_BASE/lib/libjemalloc.a"
if test -z $PIC_BUILD; then
# location of numa
NUMA_INCLUDE=" -I $NUMA_BASE/include/"
NUMA_LIB=" $NUMA_BASE/lib/libnuma.a"
CFLAGS+=" -DNUMA"
# location of libunwind
LIBUNWIND="$LIBUNWIND_BASE/lib/libunwind.a"
fi
# location of TBB
TBB_INCLUDE=" -isystem $TBB_BASE/include/"
if test -z $PIC_BUILD; then
TBB_LIBS="$TBB_BASE/lib/libtbb.a"
else
TBB_LIBS="$TBB_BASE/lib/libtbb_pic.a"
fi
CFLAGS+=" -DTBB"
# location of LIBURING
LIBURING_INCLUDE=" -isystem $LIBURING_BASE/include/"
if test -z $PIC_BUILD; then
LIBURING_LIBS="$LIBURING_BASE/lib/liburing.a"
else
LIBURING_LIBS="$LIBURING_BASE/lib/liburing_pic.a"
fi
CFLAGS+=" -DLIBURING"
test "$USE_SSE" || USE_SSE=1
export USE_SSE
test "$PORTABLE" || PORTABLE=1
export PORTABLE
BINUTILS="$BINUTILS_BASE/bin"
AR="$BINUTILS/ar"
DEPS_INCLUDE="$SNAPPY_INCLUDE $ZLIB_INCLUDE $BZIP_INCLUDE $LZ4_INCLUDE $ZSTD_INCLUDE $GFLAGS_INCLUDE $NUMA_INCLUDE $TBB_INCLUDE $LIBURING_INCLUDE"
STDLIBS="-L $GCC_BASE/lib64"
CLANG_BIN="$CLANG_BASE/bin"
CLANG_LIB="$CLANG_BASE/lib"
CLANG_SRC="$CLANG_BASE/../../src"
CLANG_ANALYZER="$CLANG_BIN/clang++"
CLANG_SCAN_BUILD="$CLANG_SRC/llvm/tools/clang/tools/scan-build/bin/scan-build"
if [ -z "$USE_CLANG" ]; then
# gcc
CC="$GCC_BASE/bin/gcc"
CXX="$GCC_BASE/bin/g++"
AR="$GCC_BASE/bin/gcc-ar"
CFLAGS+=" -B$BINUTILS/gold"
CFLAGS+=" -isystem $LIBGCC_INCLUDE"
CFLAGS+=" -isystem $GLIBC_INCLUDE"
JEMALLOC=1
else
# clang
CLANG_INCLUDE="$CLANG_LIB/clang/stable/include"
CC="$CLANG_BIN/clang"
CXX="$CLANG_BIN/clang++"
AR="$CLANG_BIN/llvm-ar"
KERNEL_HEADERS_INCLUDE="$KERNEL_HEADERS_BASE/include"
CFLAGS+=" -B$BINUTILS/gold -nostdinc -nostdlib"
CFLAGS+=" -isystem $LIBGCC_BASE/include/c++/7.x "
CFLAGS+=" -isystem $LIBGCC_BASE/include/c++/7.x/x86_64-facebook-linux "
CFLAGS+=" -isystem $GLIBC_INCLUDE"
CFLAGS+=" -isystem $LIBGCC_INCLUDE"
CFLAGS+=" -isystem $CLANG_INCLUDE"
CFLAGS+=" -isystem $KERNEL_HEADERS_INCLUDE/linux "
CFLAGS+=" -isystem $KERNEL_HEADERS_INCLUDE "
CFLAGS+=" -Wno-expansion-to-defined "
CXXFLAGS="-nostdinc++"
fi
CFLAGS+=" $DEPS_INCLUDE"
CFLAGS+=" -DROCKSDB_PLATFORM_POSIX -DROCKSDB_LIB_IO_POSIX -DROCKSDB_FALLOCATE_PRESENT -DROCKSDB_MALLOC_USABLE_SIZE -DROCKSDB_RANGESYNC_PRESENT -DROCKSDB_SCHED_GETCPU_PRESENT -DROCKSDB_SUPPORT_THREAD_LOCAL -DHAVE_SSE42 -DROCKSDB_IOURING_PRESENT"
CXXFLAGS+=" $CFLAGS"
EXEC_LDFLAGS=" $SNAPPY_LIBS $ZLIB_LIBS $BZIP_LIBS $LZ4_LIBS $ZSTD_LIBS $GFLAGS_LIBS $NUMA_LIB $TBB_LIBS $LIBURING_LIBS"
EXEC_LDFLAGS+=" -B$BINUTILS/gold"
EXEC_LDFLAGS+=" -Wl,--dynamic-linker,/usr/local/fbcode/platform007/lib/ld.so"
EXEC_LDFLAGS+=" $LIBUNWIND"
EXEC_LDFLAGS+=" -Wl,-rpath=/usr/local/fbcode/platform007/lib"
# required by libtbb
EXEC_LDFLAGS+=" -ldl"
PLATFORM_LDFLAGS="$LIBGCC_LIBS $GLIBC_LIBS $STDLIBS -lgcc -lstdc++"
EXEC_LDFLAGS_SHARED="$SNAPPY_LIBS $ZLIB_LIBS $BZIP_LIBS $LZ4_LIBS $ZSTD_LIBS $GFLAGS_LIBS $TBB_LIBS $LIBURING_LIBS"
VALGRIND_VER="$VALGRIND_BASE/bin/"
# lua not supported because it's on track for deprecation, I think
LUA_PATH=
LUA_LIB=
export CC CXX AR CFLAGS CXXFLAGS EXEC_LDFLAGS EXEC_LDFLAGS_SHARED VALGRIND_VER JEMALLOC_LIB JEMALLOC_INCLUDE CLANG_ANALYZER CLANG_SCAN_BUILD LUA_PATH LUA_LIB


@ -27,38 +27,28 @@ else
MAYBE_PIC=_pic
fi
if ! test $ROCKSDB_DISABLE_SNAPPY; then
# snappy
SNAPPY_INCLUDE=" -I $SNAPPY_BASE/include/"
SNAPPY_LIBS=" $SNAPPY_BASE/lib/libsnappy${MAYBE_PIC}.a"
CFLAGS+=" -DSNAPPY"
fi
# snappy
SNAPPY_INCLUDE=" -I $SNAPPY_BASE/include/"
SNAPPY_LIBS=" $SNAPPY_BASE/lib/libsnappy${MAYBE_PIC}.a"
CFLAGS+=" -DSNAPPY"
if ! test $ROCKSDB_DISABLE_ZLIB; then
# location of zlib headers and libraries
ZLIB_INCLUDE=" -I $ZLIB_BASE/include/"
ZLIB_LIBS=" $ZLIB_BASE/lib/libz${MAYBE_PIC}.a"
CFLAGS+=" -DZLIB"
fi
# location of zlib headers and libraries
ZLIB_INCLUDE=" -I $ZLIB_BASE/include/"
ZLIB_LIBS=" $ZLIB_BASE/lib/libz${MAYBE_PIC}.a"
CFLAGS+=" -DZLIB"
if ! test $ROCKSDB_DISABLE_BZIP; then
# location of bzip headers and libraries
BZIP_INCLUDE=" -I $BZIP2_BASE/include/"
BZIP_LIBS=" $BZIP2_BASE/lib/libbz2${MAYBE_PIC}.a"
CFLAGS+=" -DBZIP2"
fi
# location of bzip headers and libraries
BZIP_INCLUDE=" -I $BZIP2_BASE/include/"
BZIP_LIBS=" $BZIP2_BASE/lib/libbz2${MAYBE_PIC}.a"
CFLAGS+=" -DBZIP2"
if ! test $ROCKSDB_DISABLE_LZ4; then
LZ4_INCLUDE=" -I $LZ4_BASE/include/"
LZ4_LIBS=" $LZ4_BASE/lib/liblz4${MAYBE_PIC}.a"
CFLAGS+=" -DLZ4"
fi
LZ4_INCLUDE=" -I $LZ4_BASE/include/"
LZ4_LIBS=" $LZ4_BASE/lib/liblz4${MAYBE_PIC}.a"
CFLAGS+=" -DLZ4"
if ! test $ROCKSDB_DISABLE_ZSTD; then
ZSTD_INCLUDE=" -I $ZSTD_BASE/include/"
ZSTD_LIBS=" $ZSTD_BASE/lib/libzstd${MAYBE_PIC}.a"
CFLAGS+=" -DZSTD"
fi
ZSTD_INCLUDE=" -I $ZSTD_BASE/include/"
ZSTD_LIBS=" $ZSTD_BASE/lib/libzstd${MAYBE_PIC}.a"
CFLAGS+=" -DZSTD"
# location of gflags headers and libraries
GFLAGS_INCLUDE=" -I $GFLAGS_BASE/include/"
@ -68,8 +58,6 @@ CFLAGS+=" -DGFLAGS=gflags"
BENCHMARK_INCLUDE=" -I $BENCHMARK_BASE/include/"
BENCHMARK_LIBS=" $BENCHMARK_BASE/lib/libbenchmark${MAYBE_PIC}.a"
BOOST_INCLUDE=" -I $BOOST_BASE/include/"
# location of jemalloc
JEMALLOC_INCLUDE=" -I $JEMALLOC_BASE/include/"
JEMALLOC_LIB=" $JEMALLOC_BASE/lib/libjemalloc${MAYBE_PIC}.a"
@ -101,7 +89,7 @@ BINUTILS="$BINUTILS_BASE/bin"
AR="$BINUTILS/ar"
AS="$BINUTILS/as"
DEPS_INCLUDE="$SNAPPY_INCLUDE $ZLIB_INCLUDE $BZIP_INCLUDE $LZ4_INCLUDE $ZSTD_INCLUDE $GFLAGS_INCLUDE $NUMA_INCLUDE $TBB_INCLUDE $LIBURING_INCLUDE $BENCHMARK_INCLUDE $BOOST_INCLUDE"
DEPS_INCLUDE="$SNAPPY_INCLUDE $ZLIB_INCLUDE $BZIP_INCLUDE $LZ4_INCLUDE $ZSTD_INCLUDE $GFLAGS_INCLUDE $NUMA_INCLUDE $TBB_INCLUDE $LIBURING_INCLUDE $BENCHMARK_INCLUDE"
STDLIBS="-L $GCC_BASE/lib64"


@ -1,175 +0,0 @@
#!/bin/sh
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.
#
# Set environment variables so that we can compile rocksdb using
# fbcode settings. It uses the latest g++ and clang compilers and also
# uses jemalloc
# Environment variables that change the behavior of this script:
# PIC_BUILD -- if true, it will only take pic versions of libraries from fbcode. libraries that don't have pic variant will not be included
BASEDIR=`dirname $BASH_SOURCE`
source "$BASEDIR/dependencies_platform010.sh"
# Disallow using libraries from default locations as they might not be compatible with platform010 libraries.
CFLAGS=" --sysroot=/DOES/NOT/EXIST"
# libgcc
LIBGCC_INCLUDE="$LIBGCC_BASE/include/c++/trunk"
LIBGCC_LIBS=" -L $LIBGCC_BASE/lib -B$LIBGCC_BASE/lib/gcc/x86_64-facebook-linux/trunk/"
# glibc
GLIBC_INCLUDE="$GLIBC_BASE/include"
GLIBC_LIBS=" -L $GLIBC_BASE/lib"
GLIBC_LIBS+=" -B$GLIBC_BASE/lib"
if test -z $PIC_BUILD; then
MAYBE_PIC=
else
MAYBE_PIC=_pic
fi
if ! test $ROCKSDB_DISABLE_SNAPPY; then
# snappy
SNAPPY_INCLUDE=" -I $SNAPPY_BASE/include/"
SNAPPY_LIBS=" $SNAPPY_BASE/lib/libsnappy${MAYBE_PIC}.a"
CFLAGS+=" -DSNAPPY"
fi
if ! test $ROCKSDB_DISABLE_ZLIB; then
# location of zlib headers and libraries
ZLIB_INCLUDE=" -I $ZLIB_BASE/include/"
ZLIB_LIBS=" $ZLIB_BASE/lib/libz${MAYBE_PIC}.a"
CFLAGS+=" -DZLIB"
fi
if ! test $ROCKSDB_DISABLE_BZIP; then
# location of bzip headers and libraries
BZIP_INCLUDE=" -I $BZIP2_BASE/include/"
BZIP_LIBS=" $BZIP2_BASE/lib/libbz2${MAYBE_PIC}.a"
CFLAGS+=" -DBZIP2"
fi
if ! test $ROCKSDB_DISABLE_LZ4; then
LZ4_INCLUDE=" -I $LZ4_BASE/include/"
LZ4_LIBS=" $LZ4_BASE/lib/liblz4${MAYBE_PIC}.a"
CFLAGS+=" -DLZ4"
fi
if ! test $ROCKSDB_DISABLE_ZSTD; then
ZSTD_INCLUDE=" -I $ZSTD_BASE/include/"
ZSTD_LIBS=" $ZSTD_BASE/lib/libzstd${MAYBE_PIC}.a"
CFLAGS+=" -DZSTD"
fi
# location of gflags headers and libraries
GFLAGS_INCLUDE=" -I $GFLAGS_BASE/include/"
GFLAGS_LIBS=" $GFLAGS_BASE/lib/libgflags${MAYBE_PIC}.a"
CFLAGS+=" -DGFLAGS=gflags"
BENCHMARK_INCLUDE=" -I $BENCHMARK_BASE/include/"
BENCHMARK_LIBS=" $BENCHMARK_BASE/lib/libbenchmark${MAYBE_PIC}.a"
# location of jemalloc
JEMALLOC_INCLUDE=" -I $JEMALLOC_BASE/include/"
JEMALLOC_LIB=" $JEMALLOC_BASE/lib/libjemalloc${MAYBE_PIC}.a"
# location of numa
NUMA_INCLUDE=" -I $NUMA_BASE/include/"
NUMA_LIB=" $NUMA_BASE/lib/libnuma${MAYBE_PIC}.a"
CFLAGS+=" -DNUMA"
# location of libunwind
LIBUNWIND="$LIBUNWIND_BASE/lib/libunwind${MAYBE_PIC}.a"
# location of TBB
TBB_INCLUDE=" -isystem $TBB_BASE/include/"
TBB_LIBS="$TBB_BASE/lib/libtbb${MAYBE_PIC}.a"
CFLAGS+=" -DTBB"
# location of LIBURING
LIBURING_INCLUDE=" -isystem $LIBURING_BASE/include/"
LIBURING_LIBS="$LIBURING_BASE/lib/liburing${MAYBE_PIC}.a"
CFLAGS+=" -DLIBURING"
test "$USE_SSE" || USE_SSE=1
export USE_SSE
test "$PORTABLE" || PORTABLE=1
export PORTABLE
BINUTILS="$BINUTILS_BASE/bin"
AR="$BINUTILS/ar"
AS="$BINUTILS/as"
DEPS_INCLUDE="$SNAPPY_INCLUDE $ZLIB_INCLUDE $BZIP_INCLUDE $LZ4_INCLUDE $ZSTD_INCLUDE $GFLAGS_INCLUDE $NUMA_INCLUDE $TBB_INCLUDE $LIBURING_INCLUDE $BENCHMARK_INCLUDE"
STDLIBS="-L $GCC_BASE/lib64"
CLANG_BIN="$CLANG_BASE/bin"
CLANG_LIB="$CLANG_BASE/lib"
CLANG_SRC="$CLANG_BASE/../../src"
CLANG_ANALYZER="$CLANG_BIN/clang++"
CLANG_SCAN_BUILD="$CLANG_SRC/llvm/clang/tools/scan-build/bin/scan-build"
if [ -z "$USE_CLANG" ]; then
# gcc
CC="$GCC_BASE/bin/gcc"
CXX="$GCC_BASE/bin/g++"
AR="$GCC_BASE/bin/gcc-ar"
CFLAGS+=" -B$BINUTILS -nostdinc -nostdlib"
CFLAGS+=" -I$GCC_BASE/include"
CFLAGS+=" -isystem $GCC_BASE/lib/gcc/x86_64-redhat-linux-gnu/11.2.1/include"
CFLAGS+=" -isystem $GCC_BASE/lib/gcc/x86_64-redhat-linux-gnu/11.2.1/install-tools/include"
CFLAGS+=" -isystem $GCC_BASE/lib/gcc/x86_64-redhat-linux-gnu/11.2.1/include-fixed/"
CFLAGS+=" -isystem $LIBGCC_INCLUDE"
CFLAGS+=" -isystem $GLIBC_INCLUDE"
CFLAGS+=" -I$GLIBC_INCLUDE"
CFLAGS+=" -I$LIBGCC_BASE/include"
CFLAGS+=" -I$LIBGCC_BASE/include/c++/11.x/"
CFLAGS+=" -I$LIBGCC_BASE/include/c++/11.x/x86_64-facebook-linux/"
CFLAGS+=" -I$LIBGCC_BASE/include/c++/11.x/backward"
CFLAGS+=" -isystem $GLIBC_INCLUDE -I$GLIBC_INCLUDE"
JEMALLOC=1
else
# clang
CLANG_INCLUDE="$CLANG_LIB/clang/stable/include"
CC="$CLANG_BIN/clang"
CXX="$CLANG_BIN/clang++"
AR="$CLANG_BIN/llvm-ar"
CFLAGS+=" -B$BINUTILS -nostdinc -nostdlib"
CFLAGS+=" -isystem $LIBGCC_BASE/include/c++/trunk "
CFLAGS+=" -isystem $LIBGCC_BASE/include/c++/trunk/x86_64-facebook-linux "
CFLAGS+=" -isystem $GLIBC_INCLUDE"
CFLAGS+=" -isystem $LIBGCC_INCLUDE"
CFLAGS+=" -isystem $CLANG_INCLUDE"
CFLAGS+=" -Wno-expansion-to-defined "
CXXFLAGS="-nostdinc++"
fi
KERNEL_HEADERS_INCLUDE="$KERNEL_HEADERS_BASE/include"
CFLAGS+=" -isystem $KERNEL_HEADERS_INCLUDE/linux "
CFLAGS+=" -isystem $KERNEL_HEADERS_INCLUDE "
CFLAGS+=" $DEPS_INCLUDE"
CFLAGS+=" -DROCKSDB_PLATFORM_POSIX -DROCKSDB_LIB_IO_POSIX -DROCKSDB_FALLOCATE_PRESENT -DROCKSDB_MALLOC_USABLE_SIZE -DROCKSDB_RANGESYNC_PRESENT -DROCKSDB_SCHED_GETCPU_PRESENT -DROCKSDB_SUPPORT_THREAD_LOCAL -DHAVE_SSE42 -DROCKSDB_IOURING_PRESENT"
CXXFLAGS+=" $CFLAGS"
EXEC_LDFLAGS=" $SNAPPY_LIBS $ZLIB_LIBS $BZIP_LIBS $LZ4_LIBS $ZSTD_LIBS $GFLAGS_LIBS $NUMA_LIB $TBB_LIBS $LIBURING_LIBS $BENCHMARK_LIBS"
EXEC_LDFLAGS+=" -Wl,--dynamic-linker,/usr/local/fbcode/platform010/lib/ld.so"
EXEC_LDFLAGS+=" $LIBUNWIND"
EXEC_LDFLAGS+=" -Wl,-rpath=/usr/local/fbcode/platform010/lib"
EXEC_LDFLAGS+=" -Wl,-rpath=$GCC_BASE/lib64"
# required by libtbb
EXEC_LDFLAGS+=" -ldl"
PLATFORM_LDFLAGS="$LIBGCC_LIBS $GLIBC_LIBS $STDLIBS -lgcc -lstdc++"
PLATFORM_LDFLAGS+=" -B$BINUTILS"
EXEC_LDFLAGS_SHARED="$SNAPPY_LIBS $ZLIB_LIBS $BZIP_LIBS $LZ4_LIBS $ZSTD_LIBS $GFLAGS_LIBS $TBB_LIBS $LIBURING_LIBS $BENCHMARK_LIBS"
VALGRIND_VER="$VALGRIND_BASE/bin/"
export CC CXX AR AS CFLAGS CXXFLAGS EXEC_LDFLAGS EXEC_LDFLAGS_SHARED VALGRIND_VER JEMALLOC_LIB JEMALLOC_INCLUDE CLANG_ANALYZER CLANG_SCAN_BUILD LUA_PATH LUA_LIB

build_tools/precommit_checker.py (new Executable file, 209 lines)

@ -0,0 +1,209 @@
#!/usr/bin/env python2.7
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
from __future__ import unicode_literals
import argparse
import commands
import subprocess
import sys
import re
import os
import time
#
# Simple logger
#
class Log:
def __init__(self, filename):
self.filename = filename
self.f = open(self.filename, 'w+', 0)
def caption(self, str):
line = "\n##### %s #####\n" % str
if self.f:
self.f.write("%s \n" % line)
else:
print(line)
def error(self, str):
data = "\n\n##### ERROR ##### %s" % str
if self.f:
self.f.write("%s \n" % data)
else:
print(data)
def log(self, str):
if self.f:
self.f.write("%s \n" % str)
else:
print(str)
#
# Shell Environment
#
class Env(object):
def __init__(self, logfile, tests):
self.tests = tests
self.log = Log(logfile)
def shell(self, cmd, path=os.getcwd()):
if path:
os.chdir(path)
self.log.log("==== shell session ===========================")
self.log.log("%s> %s" % (path, cmd))
status = subprocess.call("cd %s; %s" % (path, cmd), shell=True,
stdout=self.log.f, stderr=self.log.f)
self.log.log("status = %s" % status)
self.log.log("============================================== \n\n")
return status
def GetOutput(self, cmd, path=os.getcwd()):
if path:
os.chdir(path)
self.log.log("==== shell session ===========================")
self.log.log("%s> %s" % (path, cmd))
status, out = commands.getstatusoutput(cmd)
self.log.log("status = %s" % status)
self.log.log("out = %s" % out)
self.log.log("============================================== \n\n")
return status, out
#
# Pre-commit checker
#
class PreCommitChecker(Env):
def __init__(self, args):
Env.__init__(self, args.logfile, args.tests)
self.ignore_failure = args.ignore_failure
#
# Get commands for a given job from the determinator file
#
def get_commands(self, test):
status, out = self.GetOutput(
"RATIO=1 build_tools/rocksdb-lego-determinator %s" % test, ".")
return status, out
#
# Run a specific CI job
#
def run_test(self, test):
self.log.caption("Running test %s locally" % test)
# get commands for the CI job determinator
status, cmds = self.get_commands(test)
if status != 0:
self.log.error("Error getting commands for test %s" % test)
return False
# Parse the JSON to extract the commands to run
cmds = re.findall("'shell':'([^\']*)'", cmds)
if len(cmds) == 0:
self.log.log("No commands found")
return False
# Run commands
for cmd in cmds:
# Replace J=<..> with the local environment variable
if "J" in os.environ:
cmd = cmd.replace("J=1", "J=%s" % os.environ["J"])
cmd = cmd.replace("make ", "make -j%s " % os.environ["J"])
# Run the command
status = self.shell(cmd, ".")
if status != 0:
self.log.error("Error running command %s for test %s"
% (cmd, test))
return False
return True
#
# Run specified CI jobs
#
def run_tests(self):
if not self.tests:
self.log.error("Invalid args. Please provide tests")
return False
self.print_separator()
self.print_row("TEST", "RESULT")
self.print_separator()
result = True
for test in self.tests:
start_time = time.time()
self.print_test(test)
result = self.run_test(test)
elapsed_min = (time.time() - start_time) / 60
if not result:
self.log.error("Error running test %s" % test)
self.print_result("FAIL (%dm)" % elapsed_min)
if not self.ignore_failure:
return False
result = False
else:
self.print_result("PASS (%dm)" % elapsed_min)
self.print_separator()
return result
#
# Print a line
#
def print_separator(self):
print("".ljust(60, "-"))
#
# Print two columns
#
def print_row(self, c0, c1):
print("%s%s" % (c0.ljust(40), c1.ljust(20)))
def print_test(self, test):
print(test.ljust(40), end="")
sys.stdout.flush()
def print_result(self, result):
print(result.ljust(20))
#
# Main
#
parser = argparse.ArgumentParser(description='RocksDB pre-commit checker.')
# --log <logfile>
parser.add_argument('--logfile', default='/tmp/precommit-check.log',
help='Log file. Default is /tmp/precommit-check.log')
# --ignore_failure
parser.add_argument('--ignore_failure', action='store_true', default=False,
help='Continue running the remaining tests even if one fails')
# <test ....>
parser.add_argument('tests', nargs='+',
help='CI test(s) to run. e.g: unit punit asan tsan ubsan')
args = parser.parse_args()
checker = PreCommitChecker(args)
print("Please follow log %s" % checker.log.filename)
if not checker.run_tests():
print("Error running tests. Please check log file %s"
% checker.log.filename)
sys.exit(1)
sys.exit(0)
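This is the script the old commit_prereq Makefile target invoked; a local run consistent with that target looks roughly like the following (J sets the parallelism substituted into the generated commands):
```
# Logs to /tmp/precommit-check.log by default; see --logfile above.
J=8 build_tools/precommit_checker.py unit clang_unit release asan
```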

File diff suppressed because it is too large


@ -42,7 +42,7 @@ $RunOnly.Add("c_test") | Out-Null
$RunOnly.Add("compact_on_deletion_collector_test") | Out-Null
$RunOnly.Add("merge_test") | Out-Null
$RunOnly.Add("stringappend_test") | Out-Null # Apparently incorrectly written
$RunOnly.Add("backup_engine_test") | Out-Null # Disabled
$RunOnly.Add("backupable_db_test") | Out-Null # Disabled
$RunOnly.Add("timer_queue_test") | Out-Null # Not a gtest
if($RunAll -and $SuiteRun -ne "") {
@ -491,3 +491,5 @@ if(!$script:success) {
}
exit 0


@ -9,7 +9,6 @@ OUTPUT=""
function log_header()
{
echo "# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved." >> "$OUTPUT"
echo "# The file is generated using update_dependencies.sh." >> "$OUTPUT"
}
@ -19,7 +18,7 @@ function log_variable()
}
TP2_LATEST="/data/users/$USER/fbsource/fbcode/third-party2/"
TP2_LATEST="/mnt/vol/engshare/fbcode/third-party2"
## $1 => lib name
## $2 => lib version (if not provided, will try to pick latest)
## $3 => platform (if not provided, will try to pick latest gcc)
@ -51,8 +50,6 @@ function get_lib_base()
fi
result=`ls -1d $result/*/ | head -n1`
echo Finding link $result
# lib_name => LIB_NAME_BASE
local __res_var=${lib_name^^}"_BASE"
@ -64,10 +61,10 @@ function get_lib_base()
}
###########################################################
# platform010 dependencies #
# platform007 dependencies #
###########################################################
OUTPUT="$BASEDIR/dependencies_platform010.sh"
OUTPUT="$BASEDIR/dependencies_platform007.sh"
rm -f "$OUTPUT"
touch "$OUTPUT"
@ -75,42 +72,40 @@ touch "$OUTPUT"
echo "Writing dependencies to $OUTPUT"
# Compilers locations
GCC_BASE=`readlink -f $TP2_LATEST/gcc/11.x/centos7-native/*/`
CLANG_BASE=`readlink -f $TP2_LATEST/llvm-fb/12/platform010/*/`
GCC_BASE=`readlink -f $TP2_LATEST/gcc/7.x/centos7-native/*/`
CLANG_BASE=`readlink -f $TP2_LATEST/llvm-fb/stable/centos7-native/*/`
log_header
log_variable GCC_BASE
log_variable CLANG_BASE
# Libraries locations
get_lib_base libgcc 11.x platform010
get_lib_base glibc 2.34 platform010
get_lib_base snappy LATEST platform010
get_lib_base zlib LATEST platform010
get_lib_base bzip2 LATEST platform010
get_lib_base lz4 LATEST platform010
get_lib_base zstd LATEST platform010
get_lib_base gflags LATEST platform010
get_lib_base jemalloc LATEST platform010
get_lib_base numa LATEST platform010
get_lib_base libunwind LATEST platform010
get_lib_base tbb 2018_U5 platform010
get_lib_base liburing LATEST platform010
get_lib_base benchmark LATEST platform010
get_lib_base libgcc 7.x platform007
get_lib_base glibc 2.26 platform007
get_lib_base snappy LATEST platform007
get_lib_base zlib LATEST platform007
get_lib_base bzip2 LATEST platform007
get_lib_base lz4 LATEST platform007
get_lib_base zstd LATEST platform007
get_lib_base gflags LATEST platform007
get_lib_base jemalloc LATEST platform007
get_lib_base numa LATEST platform007
get_lib_base libunwind LATEST platform007
get_lib_base tbb LATEST platform007
get_lib_base liburing LATEST platform007
get_lib_base kernel-headers fb platform010
get_lib_base kernel-headers fb platform007
get_lib_base binutils LATEST centos7-native
get_lib_base valgrind LATEST platform010
get_lib_base lua 5.3.4 platform010
get_lib_base valgrind LATEST platform007
get_lib_base lua 5.3.4 platform007
git diff $OUTPUT
###########################################################
# platform009 dependencies #
# 5.x dependencies #
###########################################################
OUTPUT="$BASEDIR/dependencies_platform009.sh"
OUTPUT="$BASEDIR/dependencies.sh"
rm -f "$OUTPUT"
touch "$OUTPUT"
@ -118,32 +113,70 @@ touch "$OUTPUT"
echo "Writing dependencies to $OUTPUT"
# Compilers locations
GCC_BASE=`readlink -f $TP2_LATEST/gcc/9.x/centos7-native/*/`
CLANG_BASE=`readlink -f $TP2_LATEST/llvm-fb/9.0.0/platform009/*/`
GCC_BASE=`readlink -f $TP2_LATEST/gcc/5.x/centos7-native/*/`
CLANG_BASE=`readlink -f $TP2_LATEST/llvm-fb/stable/centos7-native/*/`
log_header
log_variable GCC_BASE
log_variable CLANG_BASE
# Libraries locations
get_lib_base libgcc 9.x platform009
get_lib_base glibc 2.30 platform009
get_lib_base snappy LATEST platform009
get_lib_base zlib LATEST platform009
get_lib_base bzip2 LATEST platform009
get_lib_base lz4 LATEST platform009
get_lib_base zstd LATEST platform009
get_lib_base gflags LATEST platform009
get_lib_base jemalloc LATEST platform009
get_lib_base numa LATEST platform009
get_lib_base libunwind LATEST platform009
get_lib_base tbb 2018_U5 platform009
get_lib_base liburing LATEST platform009
get_lib_base benchmark LATEST platform009
get_lib_base libgcc 5.x gcc-5-glibc-2.23
get_lib_base glibc 2.23 gcc-5-glibc-2.23
get_lib_base snappy LATEST gcc-5-glibc-2.23
get_lib_base zlib LATEST gcc-5-glibc-2.23
get_lib_base bzip2 LATEST gcc-5-glibc-2.23
get_lib_base lz4 LATEST gcc-5-glibc-2.23
get_lib_base zstd LATEST gcc-5-glibc-2.23
get_lib_base gflags LATEST gcc-5-glibc-2.23
get_lib_base jemalloc LATEST gcc-5-glibc-2.23
get_lib_base numa LATEST gcc-5-glibc-2.23
get_lib_base libunwind LATEST gcc-5-glibc-2.23
get_lib_base tbb LATEST gcc-5-glibc-2.23
get_lib_base kernel-headers fb platform009
get_lib_base kernel-headers 4.0.9-36_fbk5_2933_gd092e3f gcc-5-glibc-2.23
get_lib_base binutils LATEST centos7-native
get_lib_base valgrind LATEST platform009
get_lib_base lua 5.3.4 platform009
get_lib_base valgrind LATEST gcc-5-glibc-2.23
get_lib_base lua 5.2.3 gcc-5-glibc-2.23
git diff $OUTPUT
###########################################################
# 4.8.1 dependencies #
###########################################################
OUTPUT="$BASEDIR/dependencies_4.8.1.sh"
rm -f "$OUTPUT"
touch "$OUTPUT"
echo "Writing 4.8.1 dependencies to $OUTPUT"
# Compilers locations
GCC_BASE=`readlink -f $TP2_LATEST/gcc/4.8.1/centos6-native/*/`
CLANG_BASE=`readlink -f $TP2_LATEST/llvm-fb/stable/centos6-native/*/`
log_header
log_variable GCC_BASE
log_variable CLANG_BASE
# Libraries locations
get_lib_base libgcc 4.8.1 gcc-4.8.1-glibc-2.17
get_lib_base glibc 2.17 gcc-4.8.1-glibc-2.17
get_lib_base snappy LATEST gcc-4.8.1-glibc-2.17
get_lib_base zlib LATEST gcc-4.8.1-glibc-2.17
get_lib_base bzip2 LATEST gcc-4.8.1-glibc-2.17
get_lib_base lz4 LATEST gcc-4.8.1-glibc-2.17
get_lib_base zstd LATEST gcc-4.8.1-glibc-2.17
get_lib_base gflags LATEST gcc-4.8.1-glibc-2.17
get_lib_base jemalloc LATEST gcc-4.8.1-glibc-2.17
get_lib_base numa LATEST gcc-4.8.1-glibc-2.17
get_lib_base libunwind LATEST gcc-4.8.1-glibc-2.17
get_lib_base tbb 4.0_update2 gcc-4.8.1-glibc-2.17
get_lib_base kernel-headers LATEST gcc-4.8.1-glibc-2.17
get_lib_base binutils LATEST centos6-native
get_lib_base valgrind 3.8.1 gcc-4.8.1-glibc-2.17
get_lib_base lua 5.2.3 centos6-native
git diff $OUTPUT


@ -11,7 +11,7 @@
namespace ROCKSDB_NAMESPACE {
std::array<std::string, kNumCacheEntryRoles> kCacheEntryRoleToCamelString{{
std::array<const char*, kNumCacheEntryRoles> kCacheEntryRoleToCamelString{{
"DataBlock",
"FilterBlock",
"FilterMetaBlock",
@ -21,11 +21,10 @@ std::array<std::string, kNumCacheEntryRoles> kCacheEntryRoleToCamelString{{
"WriteBuffer",
"CompressionDictionaryBuildingBuffer",
"FilterConstruction",
"BlockBasedTableReader",
"Misc",
}};
std::array<std::string, kNumCacheEntryRoles> kCacheEntryRoleToHyphenString{{
std::array<const char*, kNumCacheEntryRoles> kCacheEntryRoleToHyphenString{{
"data-block",
"filter-block",
"filter-meta-block",
@ -35,76 +34,19 @@ std::array<std::string, kNumCacheEntryRoles> kCacheEntryRoleToHyphenString{{
"write-buffer",
"compression-dictionary-building-buffer",
"filter-construction",
"block-based-table-reader",
"misc",
}};
const std::string& GetCacheEntryRoleName(CacheEntryRole role) {
return kCacheEntryRoleToHyphenString[static_cast<size_t>(role)];
}
const std::string& BlockCacheEntryStatsMapKeys::CacheId() {
static const std::string kCacheId = "id";
return kCacheId;
}
const std::string& BlockCacheEntryStatsMapKeys::CacheCapacityBytes() {
static const std::string kCacheCapacityBytes = "capacity";
return kCacheCapacityBytes;
}
const std::string&
BlockCacheEntryStatsMapKeys::LastCollectionDurationSeconds() {
static const std::string kLastCollectionDurationSeconds =
"secs_for_last_collection";
return kLastCollectionDurationSeconds;
}
const std::string& BlockCacheEntryStatsMapKeys::LastCollectionAgeSeconds() {
static const std::string kLastCollectionAgeSeconds =
"secs_since_last_collection";
return kLastCollectionAgeSeconds;
}
namespace {
std::string GetPrefixedCacheEntryRoleName(const std::string& prefix,
CacheEntryRole role) {
const std::string& role_name = GetCacheEntryRoleName(role);
std::string prefixed_role_name;
prefixed_role_name.reserve(prefix.size() + role_name.size());
prefixed_role_name.append(prefix);
prefixed_role_name.append(role_name);
return prefixed_role_name;
}
} // namespace
std::string BlockCacheEntryStatsMapKeys::EntryCount(CacheEntryRole role) {
const static std::string kPrefix = "count.";
return GetPrefixedCacheEntryRoleName(kPrefix, role);
}
std::string BlockCacheEntryStatsMapKeys::UsedBytes(CacheEntryRole role) {
const static std::string kPrefix = "bytes.";
return GetPrefixedCacheEntryRoleName(kPrefix, role);
}
std::string BlockCacheEntryStatsMapKeys::UsedPercent(CacheEntryRole role) {
const static std::string kPrefix = "percent.";
return GetPrefixedCacheEntryRoleName(kPrefix, role);
}
namespace {
struct Registry {
std::mutex mutex;
UnorderedMap<Cache::DeleterFn, CacheEntryRole> role_map;
std::unordered_map<Cache::DeleterFn, CacheEntryRole> role_map;
void Register(Cache::DeleterFn fn, CacheEntryRole role) {
std::lock_guard<std::mutex> lock(mutex);
role_map[fn] = role;
}
UnorderedMap<Cache::DeleterFn, CacheEntryRole> Copy() {
std::unordered_map<Cache::DeleterFn, CacheEntryRole> Copy() {
std::lock_guard<std::mutex> lock(mutex);
return role_map;
}
@ -121,7 +63,7 @@ void RegisterCacheDeleterRole(Cache::DeleterFn fn, CacheEntryRole role) {
GetRegistry().Register(fn, role);
}
UnorderedMap<Cache::DeleterFn, CacheEntryRole> CopyCacheDeleterRoleMap() {
std::unordered_map<Cache::DeleterFn, CacheEntryRole> CopyCacheDeleterRoleMap() {
return GetRegistry().Copy();
}


@ -9,15 +9,46 @@
#include <cstdint>
#include <memory>
#include <type_traits>
#include <unordered_map>
#include "rocksdb/cache.h"
#include "util/hash_containers.h"
namespace ROCKSDB_NAMESPACE {
extern std::array<std::string, kNumCacheEntryRoles>
// Classifications of block cache entries, for reporting statistics
// Adding new enum to this class requires corresponding updates to
// kCacheEntryRoleToCamelString and kCacheEntryRoleToHyphenString
enum class CacheEntryRole {
// Block-based table data block
kDataBlock,
// Block-based table filter block (full or partitioned)
kFilterBlock,
// Block-based table metadata block for partitioned filter
kFilterMetaBlock,
// Block-based table deprecated filter block (old "block-based" filter)
kDeprecatedFilterBlock,
// Block-based table index block
kIndexBlock,
// Other kinds of block-based table block
kOtherBlock,
// WriteBufferManager reservations to account for memtable usage
kWriteBuffer,
// BlockBasedTableBuilder reservations to account for
// compression dictionary building buffer's memory usage
kCompressionDictionaryBuildingBuffer,
// Filter reservations to account for
// (new) bloom and ribbon filter construction's memory usage
kFilterConstruction,
// Default bucket, for miscellaneous cache entries. Do not use for
// entries that could potentially add up to large usage.
kMisc,
};
constexpr uint32_t kNumCacheEntryRoles =
static_cast<uint32_t>(CacheEntryRole::kMisc) + 1;
extern std::array<const char*, kNumCacheEntryRoles>
kCacheEntryRoleToCamelString;
extern std::array<std::string, kNumCacheEntryRoles>
extern std::array<const char*, kNumCacheEntryRoles>
kCacheEntryRoleToHyphenString;
// To associate cache entries with their role, we use a hack on the
@ -44,7 +75,7 @@ void RegisterCacheDeleterRole(Cache::DeleterFn fn, CacheEntryRole role);
// * This is suitable for preparing for batch operations, like with
// CacheEntryStatsCollector.
// * The number of mappings should be sufficiently small (dozens).
UnorderedMap<Cache::DeleterFn, CacheEntryRole> CopyCacheDeleterRoleMap();
std::unordered_map<Cache::DeleterFn, CacheEntryRole> CopyCacheDeleterRoleMap();
// ************************************************************** //
// An automatic registration infrastructure. This enables code


@ -17,30 +17,12 @@
#include "rocksdb/cache.h"
#include "rocksdb/slice.h"
#include "rocksdb/status.h"
#include "table/block_based/reader_common.h"
#include "table/block_based/block_based_table_reader.h"
#include "util/coding.h"
namespace ROCKSDB_NAMESPACE {
template <CacheEntryRole R>
CacheReservationManagerImpl<R>::CacheReservationHandle::CacheReservationHandle(
std::size_t incremental_memory_used,
std::shared_ptr<CacheReservationManagerImpl> cache_res_mgr)
: incremental_memory_used_(incremental_memory_used) {
assert(cache_res_mgr);
cache_res_mgr_ = cache_res_mgr;
}
template <CacheEntryRole R>
CacheReservationManagerImpl<
R>::CacheReservationHandle::~CacheReservationHandle() {
Status s = cache_res_mgr_->ReleaseCacheReservation(incremental_memory_used_);
s.PermitUncheckedError();
}
template <CacheEntryRole R>
CacheReservationManagerImpl<R>::CacheReservationManagerImpl(
std::shared_ptr<Cache> cache, bool delayed_decrease)
CacheReservationManager::CacheReservationManager(std::shared_ptr<Cache> cache,
bool delayed_decrease)
: delayed_decrease_(delayed_decrease),
cache_allocated_size_(0),
memory_used_(0) {
@ -48,15 +30,14 @@ CacheReservationManagerImpl<R>::CacheReservationManagerImpl(
cache_ = cache;
}
template <CacheEntryRole R>
CacheReservationManagerImpl<R>::~CacheReservationManagerImpl() {
CacheReservationManager::~CacheReservationManager() {
for (auto* handle : dummy_handles_) {
cache_->Release(handle, true);
}
}
template <CacheEntryRole R>
Status CacheReservationManagerImpl<R>::UpdateCacheReservation(
Status CacheReservationManager::UpdateCacheReservation(
std::size_t new_mem_used) {
memory_used_ = new_mem_used;
std::size_t cur_cache_allocated_size =
@ -64,7 +45,7 @@ Status CacheReservationManagerImpl<R>::UpdateCacheReservation(
if (new_mem_used == cur_cache_allocated_size) {
return Status::OK();
} else if (new_mem_used > cur_cache_allocated_size) {
Status s = IncreaseCacheReservation(new_mem_used);
Status s = IncreaseCacheReservation<R>(new_mem_used);
return s;
} else {
// In delayed decrease mode, we don't decrease cache reservation
@ -85,32 +66,41 @@ Status CacheReservationManagerImpl<R>::UpdateCacheReservation(
}
}
// Explicitly instantiate templates for "CacheEntryRole" values we use.
// This makes it possible to keep the template definitions in the .cc file.
template Status CacheReservationManager::UpdateCacheReservation<
CacheEntryRole::kWriteBuffer>(std::size_t new_mem_used);
template Status CacheReservationManager::UpdateCacheReservation<
CacheEntryRole::kCompressionDictionaryBuildingBuffer>(
std::size_t new_mem_used);
// For cache reservation manager unit tests
template Status CacheReservationManager::UpdateCacheReservation<
CacheEntryRole::kMisc>(std::size_t new_mem_used);
template <CacheEntryRole R>
Status CacheReservationManagerImpl<R>::MakeCacheReservation(
Status CacheReservationManager::MakeCacheReservation(
std::size_t incremental_memory_used,
std::unique_ptr<CacheReservationManager::CacheReservationHandle>* handle) {
assert(handle);
std::unique_ptr<CacheReservationHandle<R>>* handle) {
assert(handle != nullptr);
Status s =
UpdateCacheReservation(GetTotalMemoryUsed() + incremental_memory_used);
(*handle).reset(new CacheReservationManagerImpl::CacheReservationHandle(
incremental_memory_used,
std::enable_shared_from_this<
CacheReservationManagerImpl<R>>::shared_from_this()));
UpdateCacheReservation<R>(GetTotalMemoryUsed() + incremental_memory_used);
(*handle).reset(new CacheReservationHandle<R>(incremental_memory_used,
shared_from_this()));
return s;
}
template <CacheEntryRole R>
Status CacheReservationManagerImpl<R>::ReleaseCacheReservation(
std::size_t incremental_memory_used) {
assert(GetTotalMemoryUsed() >= incremental_memory_used);
std::size_t updated_total_mem_used =
GetTotalMemoryUsed() - incremental_memory_used;
Status s = UpdateCacheReservation(updated_total_mem_used);
return s;
}
template Status
CacheReservationManager::MakeCacheReservation<CacheEntryRole::kMisc>(
std::size_t incremental_memory_used,
std::unique_ptr<CacheReservationHandle<CacheEntryRole::kMisc>>* handle);
template Status CacheReservationManager::MakeCacheReservation<
CacheEntryRole::kFilterConstruction>(
std::size_t incremental_memory_used,
std::unique_ptr<
CacheReservationHandle<CacheEntryRole::kFilterConstruction>>* handle);
template <CacheEntryRole R>
Status CacheReservationManagerImpl<R>::IncreaseCacheReservation(
Status CacheReservationManager::IncreaseCacheReservation(
std::size_t new_mem_used) {
Status return_status = Status::OK();
while (new_mem_used > cache_allocated_size_.load(std::memory_order_relaxed)) {
@ -128,8 +118,7 @@ Status CacheReservationManagerImpl<R>::IncreaseCacheReservation(
return return_status;
}
template <CacheEntryRole R>
Status CacheReservationManagerImpl<R>::DecreaseCacheReservation(
Status CacheReservationManager::DecreaseCacheReservation(
std::size_t new_mem_used) {
Status return_status = Status::OK();
@ -148,18 +137,15 @@ Status CacheReservationManagerImpl<R>::DecreaseCacheReservation(
return return_status;
}
template <CacheEntryRole R>
std::size_t CacheReservationManagerImpl<R>::GetTotalReservedCacheSize() {
std::size_t CacheReservationManager::GetTotalReservedCacheSize() {
return cache_allocated_size_.load(std::memory_order_relaxed);
}
template <CacheEntryRole R>
std::size_t CacheReservationManagerImpl<R>::GetTotalMemoryUsed() {
std::size_t CacheReservationManager::GetTotalMemoryUsed() {
return memory_used_;
}
template <CacheEntryRole R>
Slice CacheReservationManagerImpl<R>::GetNextCacheKey() {
Slice CacheReservationManager::GetNextCacheKey() {
// Calling this function will have the side-effect of changing the
// underlying cache_key_ that is shared among other keys generated from this
// function. Therefore please make sure the previous keys are saved/copied
@ -169,15 +155,34 @@ Slice CacheReservationManagerImpl<R>::GetNextCacheKey() {
}
template <CacheEntryRole R>
Cache::DeleterFn CacheReservationManagerImpl<R>::TEST_GetNoopDeleterForRole() {
Cache::DeleterFn CacheReservationManager::TEST_GetNoopDeleterForRole() {
return GetNoopDeleterForRole<R>();
}
template class CacheReservationManagerImpl<
CacheEntryRole::kBlockBasedTableReader>;
template class CacheReservationManagerImpl<
CacheEntryRole::kCompressionDictionaryBuildingBuffer>;
template class CacheReservationManagerImpl<CacheEntryRole::kFilterConstruction>;
template class CacheReservationManagerImpl<CacheEntryRole::kMisc>;
template class CacheReservationManagerImpl<CacheEntryRole::kWriteBuffer>;
template Cache::DeleterFn CacheReservationManager::TEST_GetNoopDeleterForRole<
CacheEntryRole::kFilterConstruction>();
template <CacheEntryRole R>
CacheReservationHandle<R>::CacheReservationHandle(
std::size_t incremental_memory_used,
std::shared_ptr<CacheReservationManager> cache_res_mgr)
: incremental_memory_used_(incremental_memory_used) {
assert(cache_res_mgr != nullptr);
cache_res_mgr_ = cache_res_mgr;
}
template <CacheEntryRole R>
CacheReservationHandle<R>::~CacheReservationHandle() {
assert(cache_res_mgr_ != nullptr);
assert(cache_res_mgr_->GetTotalMemoryUsed() >= incremental_memory_used_);
Status s = cache_res_mgr_->UpdateCacheReservation<R>(
cache_res_mgr_->GetTotalMemoryUsed() - incremental_memory_used_);
s.PermitUncheckedError();
}
// Explicitly instantiate templates for "CacheEntryRole" values we use.
// This makes it possible to keep the template definitions in the .cc file.
template class CacheReservationHandle<CacheEntryRole::kMisc>;
template class CacheReservationHandle<CacheEntryRole::kFilterConstruction>;
} // namespace ROCKSDB_NAMESPACE
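
The "explicitly instantiate templates ... keep the template definitions in the .cc file" comments above refer to a standard C++ technique. A minimal sketch of how it is usually laid out, with hypothetical names (shown as one listing for brevity; in practice the two halves live in a header and a source file):

```
#include <cstddef>
#include <iostream>

// --- header part: declare the template, but do not define Add() here ---
template <int Tag>
class Counter {
 public:
  void Add(std::size_t n);
  std::size_t total() const { return total_; }

 private:
  std::size_t total_ = 0;
};

// --- source part: define members out of line ---
template <int Tag>
void Counter<Tag>::Add(std::size_t n) {
  total_ += n;
}

// Explicit instantiations: without these, code in other translation units
// that only sees the header would fail to link, because Add()'s definition
// is not visible there.
template class Counter<1>;
template class Counter<2>;

int main() {
  Counter<1> c;
  c.Add(3);
  std::cout << c.total() << "\n";  // 3
  return 0;
}
```
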


@ -13,90 +13,54 @@
#include <cstddef>
#include <cstdint>
#include <memory>
#include <mutex>
#include <vector>
#include "cache/cache_entry_roles.h"
#include "cache/cache_key.h"
#include "rocksdb/cache.h"
#include "rocksdb/slice.h"
#include "rocksdb/status.h"
#include "table/block_based/block_based_table_reader.h"
#include "util/coding.h"
namespace ROCKSDB_NAMESPACE {
// CacheReservationManager is an interface for reserving cache space for the
// memory used
class CacheReservationManager {
public:
// CacheReservationHandle is for managing the lifetime of a cache reservation
// for an incremental amount of memory used (i.e, incremental_memory_used)
class CacheReservationHandle {
public:
virtual ~CacheReservationHandle() {}
};
virtual ~CacheReservationManager() {}
virtual Status UpdateCacheReservation(std::size_t new_memory_used) = 0;
virtual Status MakeCacheReservation(
std::size_t incremental_memory_used,
std::unique_ptr<CacheReservationManager::CacheReservationHandle>
*handle) = 0;
virtual std::size_t GetTotalReservedCacheSize() = 0;
virtual std::size_t GetTotalMemoryUsed() = 0;
};
// CacheReservationManagerImpl implements interface CacheReservationManager
// for reserving cache space for the memory used by inserting/releasing dummy
// entries in the cache.
template <CacheEntryRole R>
class CacheReservationHandle;
// CacheReservationManager is for reserving cache space for the memory used
// through inserting/releasing dummy entries in the cache.
//
// This class is NOT thread-safe, except that GetTotalReservedCacheSize()
// can be called without external synchronization.
template <CacheEntryRole R>
class CacheReservationManagerImpl
: public CacheReservationManager,
public std::enable_shared_from_this<CacheReservationManagerImpl<R>> {
class CacheReservationManager
: public std::enable_shared_from_this<CacheReservationManager> {
public:
class CacheReservationHandle
: public CacheReservationManager::CacheReservationHandle {
public:
CacheReservationHandle(
std::size_t incremental_memory_used,
std::shared_ptr<CacheReservationManagerImpl> cache_res_mgr);
~CacheReservationHandle() override;
private:
std::size_t incremental_memory_used_;
std::shared_ptr<CacheReservationManagerImpl> cache_res_mgr_;
};
// Construct a CacheReservationManagerImpl
// Construct a CacheReservationManager
// @param cache The cache where dummy entries are inserted and released for
// reserving cache space
// @param delayed_decrease If set true, then dummy entries won't be released
// immediately when memory usage decreases.
// immediately when memory usage decreases.
// Instead, it will be released when the memory usage
// decreases to 3/4 of what we have reserved so far.
// This is for saving some future dummy entry
// insertion when memory usage increases are likely to
// happen in the near future.
//
// REQUIRED: cache is not nullptr
explicit CacheReservationManagerImpl(std::shared_ptr<Cache> cache,
bool delayed_decrease = false);
explicit CacheReservationManager(std::shared_ptr<Cache> cache,
bool delayed_decrease = false);
// no copy constructor, copy assignment, move constructor, move assignment
CacheReservationManagerImpl(const CacheReservationManagerImpl &) = delete;
CacheReservationManagerImpl &operator=(const CacheReservationManagerImpl &) =
delete;
CacheReservationManagerImpl(CacheReservationManagerImpl &&) = delete;
CacheReservationManagerImpl &operator=(CacheReservationManagerImpl &&) =
delete;
CacheReservationManager(const CacheReservationManager &) = delete;
CacheReservationManager &operator=(const CacheReservationManager &) = delete;
CacheReservationManager(CacheReservationManager &&) = delete;
CacheReservationManager &operator=(CacheReservationManager &&) = delete;
~CacheReservationManagerImpl() override;
~CacheReservationManager();
// One of the two ways of reserving/releasing cache space,
// see MakeCacheReservation() for the other.
//
// Use ONLY one of these two ways to prevent unexpected behavior.
template <CacheEntryRole R>
// One of the two ways of reserving/releasing cache,
// see CacheReservationManager::MakeCacheReservation() for the other.
// Use ONLY one of them to prevent unexpected behavior.
//
// Insert and release dummy entries in the cache to
// match the size of total dummy entries with the least multiple of
@ -126,13 +90,11 @@ class CacheReservationManagerImpl
// Otherwise, it returns the first non-ok status;
// On releasing dummy entries, it always returns Status::OK().
// On keeping dummy entries the same, it always returns Status::OK().
Status UpdateCacheReservation(std::size_t new_memory_used) override;
Status UpdateCacheReservation(std::size_t new_memory_used);
// One of the two ways of reserving cache space and releasing is done through
// destruction of CacheReservationHandle.
// See UpdateCacheReservation() for the other way.
//
// Use ONLY one of these two ways to prevent unexpected behavior.
// One of the two ways of reserving/releasing cache,
// see CacheReservationManager::UpdateCacheReservation() for the other.
// Use ONLY one of them to prevent unexpected behavior.
//
// Insert dummy entries in the cache for the incremental memory usage
// to match the size of total dummy entries with the least multiple of
@ -156,19 +118,21 @@ class CacheReservationManagerImpl
// calling MakeCacheReservation() is needed if you want
// GetTotalMemoryUsed() indeed returns the latest memory used.
//
// @param handle A pointer to std::unique_ptr<CacheReservationHandle> that
// manages the lifetime of the cache reservation represented by the
// handle.
// @param handle A pointer to std::unique_ptr<CacheReservationHandle<R>> that
// manages the lifetime of the handle and its cache reservation.
//
// @return It returns Status::OK() if all dummy
// entry insertions succeed.
// Otherwise, it returns the first non-ok status;
//
// REQUIRES: handle != nullptr
// REQUIRES: The CacheReservationManager object is NOT managed by
// std::unique_ptr as CacheReservationHandle needs to
// share ownership of the CacheReservationManager object.
template <CacheEntryRole R>
Status MakeCacheReservation(
std::size_t incremental_memory_used,
std::unique_ptr<CacheReservationManager::CacheReservationHandle> *handle)
override;
std::unique_ptr<CacheReservationHandle<R>> *handle);
// Return the size of the cache (which is a multiple of kSizeDummyEntry)
// successfully reserved by calling UpdateCacheReservation().
@ -178,25 +142,25 @@ class CacheReservationManagerImpl
// smaller number than the actual reserved cache size due to
// the returned number will always be a multiple of kSizeDummyEntry
// and cache full might happen in the middle of inserting a dummy entry.
std::size_t GetTotalReservedCacheSize() override;
std::size_t GetTotalReservedCacheSize();
// Return the latest total memory used indicated by the most recent call of
// UpdateCacheReservation(std::size_t new_memory_used);
std::size_t GetTotalMemoryUsed() override;
std::size_t GetTotalMemoryUsed();
static constexpr std::size_t GetDummyEntrySize() { return kSizeDummyEntry; }
// For testing only - it is to help ensure the NoopDeleterForRole<R>
// accessed from CacheReservationManagerImpl and the one accessed from the
// test are from the same translation units
// accessed from CacheReservationManager and the one accessed from the test
// are from the same translation units
template <CacheEntryRole R>
static Cache::DeleterFn TEST_GetNoopDeleterForRole();
private:
static constexpr std::size_t kSizeDummyEntry = 256 * 1024;
Slice GetNextCacheKey();
Status ReleaseCacheReservation(std::size_t incremental_memory_used);
template <CacheEntryRole R>
Status IncreaseCacheReservation(std::size_t new_mem_used);
Status DecreaseCacheReservation(std::size_t new_mem_used);
@ -208,81 +172,20 @@ class CacheReservationManagerImpl
CacheKey cache_key_;
};
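
To make the reservation arithmetic described in the comments above more concrete, here is a simplified, standalone model of just the accounting: memory is covered by the least multiple of kSizeDummyEntry, and in delayed-decrease mode nothing is released until usage falls below 3/4 of the current reservation. The real class additionally inserts and releases actual dummy entries in the Cache and propagates insertion failures, which this sketch omits.

```
#include <cstddef>
#include <iostream>

// Simplified model of the reservation accounting only; no real Cache here.
class ReservationModel {
 public:
  explicit ReservationModel(bool delayed_decrease)
      : delayed_decrease_(delayed_decrease) {}

  static constexpr std::size_t kSizeDummyEntry = 256 * 1024;

  void UpdateCacheReservation(std::size_t new_mem_used) {
    memory_used_ = new_mem_used;
    if (new_mem_used > reserved_) {
      // Round the target up to the least multiple of kSizeDummyEntry that
      // covers new_mem_used, mimicking dummy-entry insertion.
      reserved_ = RoundUp(new_mem_used);
    } else if (new_mem_used < reserved_) {
      // In delayed-decrease mode, keep the reservation until usage drops
      // below 3/4 of what has been reserved so far.
      if (delayed_decrease_ && new_mem_used >= reserved_ / 4 * 3) {
        return;
      }
      reserved_ = RoundUp(new_mem_used);
    }
  }

  std::size_t GetTotalReservedCacheSize() const { return reserved_; }

 private:
  static std::size_t RoundUp(std::size_t mem) {
    return (mem + kSizeDummyEntry - 1) / kSizeDummyEntry * kSizeDummyEntry;
  }
  bool delayed_decrease_;
  std::size_t memory_used_ = 0;
  std::size_t reserved_ = 0;
};

int main() {
  ReservationModel m(/*delayed_decrease=*/true);
  m.UpdateCacheReservation(8 * ReservationModel::kSizeDummyEntry);
  m.UpdateCacheReservation(6 * ReservationModel::kSizeDummyEntry);  // kept
  std::cout << m.GetTotalReservedCacheSize() /
                   ReservationModel::kSizeDummyEntry
            << "\n";  // still 8
  m.UpdateCacheReservation(6 * ReservationModel::kSizeDummyEntry - 1);
  std::cout << m.GetTotalReservedCacheSize() /
                   ReservationModel::kSizeDummyEntry
            << "\n";  // now 6
  return 0;
}
```

This matches the behavior exercised by the delayed-decrease test later in this diff (reserve 8 entries, stay at 8 when usage drops to 6 or 7 entries, shrink to 6 once usage falls just below 6 entries).
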
class ConcurrentCacheReservationManager
: public CacheReservationManager,
public std::enable_shared_from_this<ConcurrentCacheReservationManager> {
// CacheReservationHandle is for managing the lifetime of a cache reservation
// This class is NOT thread-safe
template <CacheEntryRole R>
class CacheReservationHandle {
public:
class CacheReservationHandle
: public CacheReservationManager::CacheReservationHandle {
public:
CacheReservationHandle(
std::shared_ptr<ConcurrentCacheReservationManager> cache_res_mgr,
std::unique_ptr<CacheReservationManager::CacheReservationHandle>
cache_res_handle) {
assert(cache_res_mgr && cache_res_handle);
cache_res_mgr_ = cache_res_mgr;
cache_res_handle_ = std::move(cache_res_handle);
}
~CacheReservationHandle() override {
std::lock_guard<std::mutex> lock(cache_res_mgr_->cache_res_mgr_mu_);
cache_res_handle_.reset();
}
private:
std::shared_ptr<ConcurrentCacheReservationManager> cache_res_mgr_;
std::unique_ptr<CacheReservationManager::CacheReservationHandle>
cache_res_handle_;
};
explicit ConcurrentCacheReservationManager(
std::shared_ptr<CacheReservationManager> cache_res_mgr) {
cache_res_mgr_ = std::move(cache_res_mgr);
}
ConcurrentCacheReservationManager(const ConcurrentCacheReservationManager &) =
delete;
ConcurrentCacheReservationManager &operator=(
const ConcurrentCacheReservationManager &) = delete;
ConcurrentCacheReservationManager(ConcurrentCacheReservationManager &&) =
delete;
ConcurrentCacheReservationManager &operator=(
ConcurrentCacheReservationManager &&) = delete;
~ConcurrentCacheReservationManager() override {}
inline Status UpdateCacheReservation(std::size_t new_memory_used) override {
std::lock_guard<std::mutex> lock(cache_res_mgr_mu_);
return cache_res_mgr_->UpdateCacheReservation(new_memory_used);
}
inline Status MakeCacheReservation(
// REQUIRES: cache_res_mgr != nullptr
explicit CacheReservationHandle(
std::size_t incremental_memory_used,
std::unique_ptr<CacheReservationManager::CacheReservationHandle> *handle)
override {
std::unique_ptr<CacheReservationManager::CacheReservationHandle>
wrapped_handle;
Status s;
{
std::lock_guard<std::mutex> lock(cache_res_mgr_mu_);
s = cache_res_mgr_->MakeCacheReservation(incremental_memory_used,
&wrapped_handle);
}
(*handle).reset(
new ConcurrentCacheReservationManager::CacheReservationHandle(
std::enable_shared_from_this<
ConcurrentCacheReservationManager>::shared_from_this(),
std::move(wrapped_handle)));
return s;
}
inline std::size_t GetTotalReservedCacheSize() override {
return cache_res_mgr_->GetTotalReservedCacheSize();
}
inline std::size_t GetTotalMemoryUsed() override {
std::lock_guard<std::mutex> lock(cache_res_mgr_mu_);
return cache_res_mgr_->GetTotalMemoryUsed();
}
std::shared_ptr<CacheReservationManager> cache_res_mgr);
~CacheReservationHandle();
private:
std::mutex cache_res_mgr_mu_;
std::size_t incremental_memory_used_;
std::shared_ptr<CacheReservationManager> cache_res_mgr_;
};
} // namespace ROCKSDB_NAMESPACE
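
The handle-based path declared above is an RAII pattern: MakeCacheReservation records an increment and hands back a handle, and destroying the handle returns exactly that increment. A stripped-down sketch of only that ownership shape (hypothetical names, no real cache, no error handling):

```
#include <cstddef>
#include <iostream>
#include <memory>

class Manager : public std::enable_shared_from_this<Manager> {
 public:
  class Handle {
   public:
    Handle(std::size_t inc, std::shared_ptr<Manager> mgr)
        : inc_(inc), mgr_(std::move(mgr)) {}
    // Destruction releases exactly the increment this handle reserved.
    ~Handle() { mgr_->Update(mgr_->used_ - inc_); }

   private:
    std::size_t inc_;
    std::shared_ptr<Manager> mgr_;
  };

  void MakeReservation(std::size_t inc, std::unique_ptr<Handle>* out) {
    Update(used_ + inc);
    // shared_from_this() is why the manager must be owned by a shared_ptr,
    // matching the REQUIRES note in the header above.
    out->reset(new Handle(inc, shared_from_this()));
  }

  std::size_t used() const { return used_; }

 private:
  void Update(std::size_t new_used) { used_ = new_used; }
  std::size_t used_ = 0;
};

int main() {
  auto mgr = std::make_shared<Manager>();
  std::unique_ptr<Manager::Handle> h1, h2;
  mgr->MakeReservation(100, &h1);
  mgr->MakeReservation(200, &h2);
  std::cout << mgr->used() << "\n";  // 300
  h1.reset();                        // releasing h1 gives back its 100
  std::cout << mgr->used() << "\n";  // 200
  return 0;
}
```
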


@ -15,6 +15,7 @@
#include "cache/cache_entry_roles.h"
#include "rocksdb/cache.h"
#include "rocksdb/slice.h"
#include "table/block_based/block_based_table_reader.h"
#include "test_util/testharness.h"
#include "util/coding.h"
@ -22,24 +23,25 @@ namespace ROCKSDB_NAMESPACE {
class CacheReservationManagerTest : public ::testing::Test {
protected:
static constexpr std::size_t kSizeDummyEntry =
CacheReservationManagerImpl<CacheEntryRole::kMisc>::GetDummyEntrySize();
CacheReservationManager::GetDummyEntrySize();
static constexpr std::size_t kCacheCapacity = 4096 * kSizeDummyEntry;
static constexpr int kNumShardBits = 0; // 2^0 shard
static constexpr std::size_t kMetaDataChargeOverhead = 10000;
std::shared_ptr<Cache> cache = NewLRUCache(kCacheCapacity, kNumShardBits);
std::shared_ptr<CacheReservationManager> test_cache_rev_mng;
std::unique_ptr<CacheReservationManager> test_cache_rev_mng;
CacheReservationManagerTest() {
test_cache_rev_mng =
std::make_shared<CacheReservationManagerImpl<CacheEntryRole::kMisc>>(
cache);
test_cache_rev_mng.reset(new CacheReservationManager(cache));
}
};
TEST_F(CacheReservationManagerTest, GenerateCacheKey) {
std::size_t new_mem_used = 1 * kSizeDummyEntry;
Status s = test_cache_rev_mng->UpdateCacheReservation(new_mem_used);
Status s =
test_cache_rev_mng
->UpdateCacheReservation<ROCKSDB_NAMESPACE::CacheEntryRole::kMisc>(
new_mem_used);
ASSERT_EQ(s, Status::OK());
ASSERT_GE(cache->GetPinnedUsage(), 1 * kSizeDummyEntry);
ASSERT_LT(cache->GetPinnedUsage(),
@ -47,14 +49,13 @@ TEST_F(CacheReservationManagerTest, GenerateCacheKey) {
// Next unique Cache key
CacheKey ckey = CacheKey::CreateUniqueForCacheLifetime(cache.get());
// Get to the underlying values
using PairU64 = std::array<uint64_t, 2>;
auto& ckey_pair = *reinterpret_cast<PairU64*>(&ckey);
// Back it up to the one used by CRM (using CacheKey implementation details)
ckey_pair[1]--;
using PairU64 = std::pair<uint64_t, uint64_t>;
auto& ckey_pair = *reinterpret_cast<PairU64*>(&ckey);
ckey_pair.second--;
// Specific key (subject to implementation details)
EXPECT_EQ(ckey_pair, PairU64({0, 2}));
EXPECT_EQ(ckey_pair, PairU64(0, 2));
Cache::Handle* handle = cache->Lookup(ckey.AsSlice());
EXPECT_NE(handle, nullptr)
@ -65,7 +66,10 @@ TEST_F(CacheReservationManagerTest, GenerateCacheKey) {
TEST_F(CacheReservationManagerTest, KeepCacheReservationTheSame) {
std::size_t new_mem_used = 1 * kSizeDummyEntry;
Status s = test_cache_rev_mng->UpdateCacheReservation(new_mem_used);
Status s =
test_cache_rev_mng
->UpdateCacheReservation<ROCKSDB_NAMESPACE::CacheEntryRole::kMisc>(
new_mem_used);
ASSERT_EQ(s, Status::OK());
ASSERT_EQ(test_cache_rev_mng->GetTotalReservedCacheSize(),
1 * kSizeDummyEntry);
@ -75,7 +79,9 @@ TEST_F(CacheReservationManagerTest, KeepCacheReservationTheSame) {
ASSERT_LT(initial_pinned_usage,
1 * kSizeDummyEntry + kMetaDataChargeOverhead);
s = test_cache_rev_mng->UpdateCacheReservation(new_mem_used);
s = test_cache_rev_mng
->UpdateCacheReservation<ROCKSDB_NAMESPACE::CacheEntryRole::kMisc>(
new_mem_used);
EXPECT_EQ(s, Status::OK())
<< "Failed to keep cache reservation the same when new_mem_used equals "
"to current cache reservation";
@ -94,7 +100,10 @@ TEST_F(CacheReservationManagerTest, KeepCacheReservationTheSame) {
TEST_F(CacheReservationManagerTest,
IncreaseCacheReservationByMultiplesOfDummyEntrySize) {
std::size_t new_mem_used = 2 * kSizeDummyEntry;
Status s = test_cache_rev_mng->UpdateCacheReservation(new_mem_used);
Status s =
test_cache_rev_mng
->UpdateCacheReservation<ROCKSDB_NAMESPACE::CacheEntryRole::kMisc>(
new_mem_used);
EXPECT_EQ(s, Status::OK())
<< "Failed to increase cache reservation correctly";
EXPECT_EQ(test_cache_rev_mng->GetTotalReservedCacheSize(),
@ -112,7 +121,10 @@ TEST_F(CacheReservationManagerTest,
TEST_F(CacheReservationManagerTest,
IncreaseCacheReservationNotByMultiplesOfDummyEntrySize) {
std::size_t new_mem_used = 2 * kSizeDummyEntry + kSizeDummyEntry / 2;
Status s = test_cache_rev_mng->UpdateCacheReservation(new_mem_used);
Status s =
test_cache_rev_mng
->UpdateCacheReservation<ROCKSDB_NAMESPACE::CacheEntryRole::kMisc>(
new_mem_used);
EXPECT_EQ(s, Status::OK())
<< "Failed to increase cache reservation correctly";
EXPECT_EQ(test_cache_rev_mng->GetTotalReservedCacheSize(),
@ -131,7 +143,7 @@ TEST(CacheReservationManagerIncreaseReservcationOnFullCacheTest,
IncreaseCacheReservationOnFullCache) {
constexpr std::size_t kSizeDummyEntry =
CacheReservationManagerImpl<CacheEntryRole::kMisc>::GetDummyEntrySize();
CacheReservationManager::GetDummyEntrySize();
constexpr std::size_t kSmallCacheCapacity = 4 * kSizeDummyEntry;
constexpr std::size_t kBigCacheCapacity = 4096 * kSizeDummyEntry;
constexpr std::size_t kMetaDataChargeOverhead = 10000;
@ -141,12 +153,14 @@ TEST(CacheReservationManagerIncreaseReservcationOnFullCacheTest,
lo.num_shard_bits = 0; // 2^0 shard
lo.strict_capacity_limit = true;
std::shared_ptr<Cache> cache = NewLRUCache(lo);
std::shared_ptr<CacheReservationManager> test_cache_rev_mng =
std::make_shared<CacheReservationManagerImpl<CacheEntryRole::kMisc>>(
cache);
std::unique_ptr<CacheReservationManager> test_cache_rev_mng(
new CacheReservationManager(cache));
std::size_t new_mem_used = kSmallCacheCapacity + 1;
Status s = test_cache_rev_mng->UpdateCacheReservation(new_mem_used);
Status s =
test_cache_rev_mng
->UpdateCacheReservation<ROCKSDB_NAMESPACE::CacheEntryRole::kMisc>(
new_mem_used);
EXPECT_EQ(s, Status::Incomplete())
<< "Failed to return status to indicate failure of dummy entry insertion "
"during cache reservation on full cache";
@ -169,7 +183,9 @@ TEST(CacheReservationManagerIncreaseReservcationOnFullCacheTest,
"encountering cache resevation failure due to full cache";
new_mem_used = kSmallCacheCapacity / 2; // 2 dummy entries
s = test_cache_rev_mng->UpdateCacheReservation(new_mem_used);
s = test_cache_rev_mng
->UpdateCacheReservation<ROCKSDB_NAMESPACE::CacheEntryRole::kMisc>(
new_mem_used);
EXPECT_EQ(s, Status::OK())
<< "Failed to decrease cache reservation after encountering cache "
"reservation failure due to full cache";
@ -191,7 +207,9 @@ TEST(CacheReservationManagerIncreaseReservcationOnFullCacheTest,
// Create cache full again for subsequent tests
new_mem_used = kSmallCacheCapacity + 1;
s = test_cache_rev_mng->UpdateCacheReservation(new_mem_used);
s = test_cache_rev_mng
->UpdateCacheReservation<ROCKSDB_NAMESPACE::CacheEntryRole::kMisc>(
new_mem_used);
EXPECT_EQ(s, Status::Incomplete())
<< "Failed to return status to indicate failure of dummy entry insertion "
"during cache reservation on full cache";
@ -217,7 +235,9 @@ TEST(CacheReservationManagerIncreaseReservcationOnFullCacheTest,
// succeed
cache->SetCapacity(kBigCacheCapacity);
new_mem_used = kSmallCacheCapacity + 1;
s = test_cache_rev_mng->UpdateCacheReservation(new_mem_used);
s = test_cache_rev_mng
->UpdateCacheReservation<ROCKSDB_NAMESPACE::CacheEntryRole::kMisc>(
new_mem_used);
EXPECT_EQ(s, Status::OK())
<< "Failed to increase cache reservation after increasing cache capacity "
"and mitigating cache full error";
@ -239,7 +259,10 @@ TEST(CacheReservationManagerIncreaseReservcationOnFullCacheTest,
TEST_F(CacheReservationManagerTest,
DecreaseCacheReservationByMultiplesOfDummyEntrySize) {
std::size_t new_mem_used = 2 * kSizeDummyEntry;
Status s = test_cache_rev_mng->UpdateCacheReservation(new_mem_used);
Status s =
test_cache_rev_mng
->UpdateCacheReservation<ROCKSDB_NAMESPACE::CacheEntryRole::kMisc>(
new_mem_used);
ASSERT_EQ(s, Status::OK());
ASSERT_EQ(test_cache_rev_mng->GetTotalReservedCacheSize(),
2 * kSizeDummyEntry);
@ -249,7 +272,9 @@ TEST_F(CacheReservationManagerTest,
2 * kSizeDummyEntry + kMetaDataChargeOverhead);
new_mem_used = 1 * kSizeDummyEntry;
s = test_cache_rev_mng->UpdateCacheReservation(new_mem_used);
s = test_cache_rev_mng
->UpdateCacheReservation<ROCKSDB_NAMESPACE::CacheEntryRole::kMisc>(
new_mem_used);
EXPECT_EQ(s, Status::OK())
<< "Failed to decrease cache reservation correctly";
EXPECT_EQ(test_cache_rev_mng->GetTotalReservedCacheSize(),
@ -267,7 +292,10 @@ TEST_F(CacheReservationManagerTest,
TEST_F(CacheReservationManagerTest,
DecreaseCacheReservationNotByMultiplesOfDummyEntrySize) {
std::size_t new_mem_used = 2 * kSizeDummyEntry;
Status s = test_cache_rev_mng->UpdateCacheReservation(new_mem_used);
Status s =
test_cache_rev_mng
->UpdateCacheReservation<ROCKSDB_NAMESPACE::CacheEntryRole::kMisc>(
new_mem_used);
ASSERT_EQ(s, Status::OK());
ASSERT_EQ(test_cache_rev_mng->GetTotalReservedCacheSize(),
2 * kSizeDummyEntry);
@ -277,7 +305,9 @@ TEST_F(CacheReservationManagerTest,
2 * kSizeDummyEntry + kMetaDataChargeOverhead);
new_mem_used = kSizeDummyEntry / 2;
s = test_cache_rev_mng->UpdateCacheReservation(new_mem_used);
s = test_cache_rev_mng
->UpdateCacheReservation<ROCKSDB_NAMESPACE::CacheEntryRole::kMisc>(
new_mem_used);
EXPECT_EQ(s, Status::OK())
<< "Failed to decrease cache reservation correctly";
EXPECT_EQ(test_cache_rev_mng->GetTotalReservedCacheSize(),
@ -295,7 +325,7 @@ TEST_F(CacheReservationManagerTest,
TEST(CacheReservationManagerWithDelayedDecreaseTest,
DecreaseCacheReservationWithDelayedDecrease) {
constexpr std::size_t kSizeDummyEntry =
CacheReservationManagerImpl<CacheEntryRole::kMisc>::GetDummyEntrySize();
CacheReservationManager::GetDummyEntrySize();
constexpr std::size_t kCacheCapacity = 4096 * kSizeDummyEntry;
constexpr std::size_t kMetaDataChargeOverhead = 10000;
@ -303,12 +333,14 @@ TEST(CacheReservationManagerWithDelayedDecreaseTest,
lo.capacity = kCacheCapacity;
lo.num_shard_bits = 0;
std::shared_ptr<Cache> cache = NewLRUCache(lo);
std::shared_ptr<CacheReservationManager> test_cache_rev_mng =
std::make_shared<CacheReservationManagerImpl<CacheEntryRole::kMisc>>(
cache, true /* delayed_decrease */);
std::unique_ptr<CacheReservationManager> test_cache_rev_mng(
new CacheReservationManager(cache, true /* delayed_decrease */));
std::size_t new_mem_used = 8 * kSizeDummyEntry;
Status s = test_cache_rev_mng->UpdateCacheReservation(new_mem_used);
Status s =
test_cache_rev_mng
->UpdateCacheReservation<ROCKSDB_NAMESPACE::CacheEntryRole::kMisc>(
new_mem_used);
ASSERT_EQ(s, Status::OK());
ASSERT_EQ(test_cache_rev_mng->GetTotalReservedCacheSize(),
8 * kSizeDummyEntry);
@ -319,7 +351,9 @@ TEST(CacheReservationManagerWithDelayedDecreaseTest,
8 * kSizeDummyEntry + kMetaDataChargeOverhead);
new_mem_used = 6 * kSizeDummyEntry;
s = test_cache_rev_mng->UpdateCacheReservation(new_mem_used);
s = test_cache_rev_mng
->UpdateCacheReservation<ROCKSDB_NAMESPACE::CacheEntryRole::kMisc>(
new_mem_used);
EXPECT_EQ(s, Status::OK()) << "Failed to delay decreasing cache reservation";
EXPECT_EQ(test_cache_rev_mng->GetTotalReservedCacheSize(),
8 * kSizeDummyEntry)
@ -331,7 +365,9 @@ TEST(CacheReservationManagerWithDelayedDecreaseTest,
<< "Failed to delay decreasing underlying dummy entries in cache";
new_mem_used = 7 * kSizeDummyEntry;
s = test_cache_rev_mng->UpdateCacheReservation(new_mem_used);
s = test_cache_rev_mng
->UpdateCacheReservation<ROCKSDB_NAMESPACE::CacheEntryRole::kMisc>(
new_mem_used);
EXPECT_EQ(s, Status::OK()) << "Failed to delay decreasing cache reservation";
EXPECT_EQ(test_cache_rev_mng->GetTotalReservedCacheSize(),
8 * kSizeDummyEntry)
@ -343,7 +379,9 @@ TEST(CacheReservationManagerWithDelayedDecreaseTest,
<< "Failed to delay decreasing underlying dummy entries in cache";
new_mem_used = 6 * kSizeDummyEntry - 1;
s = test_cache_rev_mng->UpdateCacheReservation(new_mem_used);
s = test_cache_rev_mng
->UpdateCacheReservation<ROCKSDB_NAMESPACE::CacheEntryRole::kMisc>(
new_mem_used);
EXPECT_EQ(s, Status::OK())
<< "Failed to decrease cache reservation correctly when new_mem_used < "
"GetTotalReservedCacheSize() * 3 / 4 on delayed decrease mode";
@ -367,7 +405,7 @@ TEST(CacheReservationManagerWithDelayedDecreaseTest,
TEST(CacheReservationManagerDestructorTest,
ReleaseRemainingDummyEntriesOnDestruction) {
constexpr std::size_t kSizeDummyEntry =
CacheReservationManagerImpl<CacheEntryRole::kMisc>::GetDummyEntrySize();
CacheReservationManager::GetDummyEntrySize();
constexpr std::size_t kCacheCapacity = 4096 * kSizeDummyEntry;
constexpr std::size_t kMetaDataChargeOverhead = 10000;
@ -376,11 +414,13 @@ TEST(CacheReservationManagerDestructorTest,
lo.num_shard_bits = 0;
std::shared_ptr<Cache> cache = NewLRUCache(lo);
{
std::shared_ptr<CacheReservationManager> test_cache_rev_mng =
std::make_shared<CacheReservationManagerImpl<CacheEntryRole::kMisc>>(
cache);
std::unique_ptr<CacheReservationManager> test_cache_rev_mng(
new CacheReservationManager(cache));
std::size_t new_mem_used = 1 * kSizeDummyEntry;
Status s = test_cache_rev_mng->UpdateCacheReservation(new_mem_used);
Status s =
test_cache_rev_mng
->UpdateCacheReservation<ROCKSDB_NAMESPACE::CacheEntryRole::kMisc>(
new_mem_used);
ASSERT_EQ(s, Status::OK());
ASSERT_GE(cache->GetPinnedUsage(), 1 * kSizeDummyEntry);
ASSERT_LT(cache->GetPinnedUsage(),
@ -402,19 +442,18 @@ TEST(CacheReservationHandleTest, HandleTest) {
std::shared_ptr<Cache> cache = NewLRUCache(lo);
std::shared_ptr<CacheReservationManager> test_cache_rev_mng(
std::make_shared<CacheReservationManagerImpl<CacheEntryRole::kMisc>>(
cache));
std::make_shared<CacheReservationManager>(cache));
std::size_t mem_used = 0;
const std::size_t incremental_mem_used_handle_1 = 1 * kSizeDummyEntry;
const std::size_t incremental_mem_used_handle_2 = 2 * kSizeDummyEntry;
std::unique_ptr<CacheReservationManager::CacheReservationHandle> handle_1,
std::unique_ptr<CacheReservationHandle<CacheEntryRole::kMisc>> handle_1,
handle_2;
// To test consecutive CacheReservationManager::MakeCacheReservation works
// correctly in terms of returning the handle as well as updating cache
// reservation and the latest total memory used
Status s = test_cache_rev_mng->MakeCacheReservation(
Status s = test_cache_rev_mng->MakeCacheReservation<CacheEntryRole::kMisc>(
incremental_mem_used_handle_1, &handle_1);
mem_used = mem_used + incremental_mem_used_handle_1;
ASSERT_EQ(s, Status::OK());
@ -424,8 +463,8 @@ TEST(CacheReservationHandleTest, HandleTest) {
EXPECT_GE(cache->GetPinnedUsage(), mem_used);
EXPECT_LT(cache->GetPinnedUsage(), mem_used + kMetaDataChargeOverhead);
s = test_cache_rev_mng->MakeCacheReservation(incremental_mem_used_handle_2,
&handle_2);
s = test_cache_rev_mng->MakeCacheReservation<CacheEntryRole::kMisc>(
incremental_mem_used_handle_2, &handle_2);
mem_used = mem_used + incremental_mem_used_handle_2;
ASSERT_EQ(s, Status::OK());
EXPECT_TRUE(handle_2 != nullptr);
@ -434,9 +473,8 @@ TEST(CacheReservationHandleTest, HandleTest) {
EXPECT_GE(cache->GetPinnedUsage(), mem_used);
EXPECT_LT(cache->GetPinnedUsage(), mem_used + kMetaDataChargeOverhead);
// To test
// CacheReservationManager::CacheReservationHandle::~CacheReservationHandle()
// works correctly in releasing the cache reserved for the handle
// To test CacheReservationHandle::~CacheReservationHandle() works correctly
// in releasing the cache reserved for the handle
handle_1.reset();
EXPECT_TRUE(handle_1 == nullptr);
mem_used = mem_used - incremental_mem_used_handle_1;

cache/cache_test.cc

@ -14,9 +14,7 @@
#include <iostream>
#include <string>
#include <vector>
#include "cache/clock_cache.h"
#include "cache/fast_lru_cache.h"
#include "cache/lru_cache.h"
#include "test_util/testharness.h"
#include "util/coding.h"
@ -41,7 +39,6 @@ static int DecodeValue(void* v) {
const std::string kLRU = "lru";
const std::string kClock = "clock";
const std::string kFast = "fast";
void dumbDeleter(const Slice& /*key*/, void* /*value*/) {}
@ -86,9 +83,6 @@ class CacheTest : public testing::TestWithParam<std::string> {
if (type == kClock) {
return NewClockCache(capacity);
}
if (type == kFast) {
return NewFastLRUCache(capacity);
}
return nullptr;
}
@ -109,10 +103,6 @@ class CacheTest : public testing::TestWithParam<std::string> {
return NewClockCache(capacity, num_shard_bits, strict_capacity_limit,
charge_policy);
}
if (type == kFast) {
return NewFastLRUCache(capacity, num_shard_bits, strict_capacity_limit,
charge_policy);
}
return nullptr;
}
@ -193,7 +183,7 @@ TEST_P(CacheTest, UsageTest) {
// make sure the cache will be overloaded
for (uint64_t i = 1; i < kCapacity; ++i) {
auto key = std::to_string(i);
auto key = ToString(i);
ASSERT_OK(cache->Insert(key, reinterpret_cast<void*>(value), key.size() + 5,
dumbDeleter));
ASSERT_OK(precise_cache->Insert(key, reinterpret_cast<void*>(value),
@ -265,7 +255,7 @@ TEST_P(CacheTest, PinnedUsageTest) {
// check that overloading the cache does not change the pinned usage
for (uint64_t i = 1; i < 2 * kCapacity; ++i) {
auto key = std::to_string(i);
auto key = ToString(i);
ASSERT_OK(cache->Insert(key, reinterpret_cast<void*>(value), key.size() + 5,
dumbDeleter));
ASSERT_OK(precise_cache->Insert(key, reinterpret_cast<void*>(value),
@ -585,7 +575,7 @@ TEST_P(CacheTest, SetCapacity) {
std::vector<Cache::Handle*> handles(10);
// Insert 5 entries, but not releasing.
for (size_t i = 0; i < 5; i++) {
std::string key = std::to_string(i + 1);
std::string key = ToString(i+1);
Status s = cache->Insert(key, new Value(i + 1), 1, &deleter, &handles[i]);
ASSERT_TRUE(s.ok());
}
@ -600,7 +590,7 @@ TEST_P(CacheTest, SetCapacity) {
// then decrease capacity to 7, final capacity should be 7
// and usage should be 7
for (size_t i = 5; i < 10; i++) {
std::string key = std::to_string(i + 1);
std::string key = ToString(i+1);
Status s = cache->Insert(key, new Value(i + 1), 1, &deleter, &handles[i]);
ASSERT_TRUE(s.ok());
}
@ -631,7 +621,7 @@ TEST_P(LRUCacheTest, SetStrictCapacityLimit) {
std::vector<Cache::Handle*> handles(10);
Status s;
for (size_t i = 0; i < 10; i++) {
std::string key = std::to_string(i + 1);
std::string key = ToString(i + 1);
s = cache->Insert(key, new Value(i + 1), 1, &deleter, &handles[i]);
ASSERT_OK(s);
ASSERT_NE(nullptr, handles[i]);
@ -655,7 +645,7 @@ TEST_P(LRUCacheTest, SetStrictCapacityLimit) {
// test3: init with flag being true.
std::shared_ptr<Cache> cache2 = NewCache(5, 0, true);
for (size_t i = 0; i < 5; i++) {
std::string key = std::to_string(i + 1);
std::string key = ToString(i + 1);
s = cache2->Insert(key, new Value(i + 1), 1, &deleter, &handles[i]);
ASSERT_OK(s);
ASSERT_NE(nullptr, handles[i]);
@ -685,14 +675,14 @@ TEST_P(CacheTest, OverCapacity) {
// Insert n+1 entries, but not releasing.
for (size_t i = 0; i < n + 1; i++) {
std::string key = std::to_string(i + 1);
std::string key = ToString(i+1);
Status s = cache->Insert(key, new Value(i + 1), 1, &deleter, &handles[i]);
ASSERT_TRUE(s.ok());
}
// Guess what's in the cache now?
for (size_t i = 0; i < n + 1; i++) {
std::string key = std::to_string(i + 1);
std::string key = ToString(i+1);
auto h = cache->Lookup(key);
ASSERT_TRUE(h != nullptr);
if (h) cache->Release(h);
@ -713,7 +703,7 @@ TEST_P(CacheTest, OverCapacity) {
// This is consistent with the LRU policy since the element 0
// was released first
for (size_t i = 0; i < n + 1; i++) {
std::string key = std::to_string(i + 1);
std::string key = ToString(i+1);
auto h = cache->Lookup(key);
if (h) {
ASSERT_NE(i, 0U);
@ -754,9 +744,9 @@ TEST_P(CacheTest, ApplyToAllEntriesTest) {
std::vector<std::string> callback_state;
const auto callback = [&](const Slice& key, void* value, size_t charge,
Cache::DeleterFn deleter) {
callback_state.push_back(std::to_string(DecodeKey(key)) + "," +
std::to_string(DecodeValue(value)) + "," +
std::to_string(charge));
callback_state.push_back(ToString(DecodeKey(key)) + "," +
ToString(DecodeValue(value)) + "," +
ToString(charge));
assert(deleter == &CacheTest::Deleter);
};
@ -765,8 +755,8 @@ TEST_P(CacheTest, ApplyToAllEntriesTest) {
for (int i = 0; i < 10; ++i) {
Insert(i, i * 2, i + 1);
inserted.push_back(std::to_string(i) + "," + std::to_string(i * 2) + "," +
std::to_string(i + 1));
inserted.push_back(ToString(i) + "," + ToString(i * 2) + "," +
ToString(i + 1));
}
cache_->ApplyToAllEntries(callback, /*opts*/ {});
@ -848,13 +838,11 @@ TEST_P(CacheTest, GetChargeAndDeleter) {
std::shared_ptr<Cache> (*new_clock_cache_func)(
size_t, int, bool, CacheMetadataChargePolicy) = NewClockCache;
INSTANTIATE_TEST_CASE_P(CacheTestInstance, CacheTest,
testing::Values(kLRU, kClock, kFast));
testing::Values(kLRU, kClock));
#else
INSTANTIATE_TEST_CASE_P(CacheTestInstance, CacheTest,
testing::Values(kLRU, kFast));
INSTANTIATE_TEST_CASE_P(CacheTestInstance, CacheTest, testing::Values(kLRU));
#endif // SUPPORT_CLOCK_CACHE
INSTANTIATE_TEST_CASE_P(CacheTestInstance, LRUCacheTest,
testing::Values(kLRU, kFast));
INSTANTIATE_TEST_CASE_P(CacheTestInstance, LRUCacheTest, testing::Values(kLRU));
} // namespace ROCKSDB_NAMESPACE

cache/clock_cache.cc

@ -286,8 +286,8 @@ class ClockCacheShard final : public CacheShard {
return Lookup(key, hash);
}
bool Release(Cache::Handle* handle, bool /*useful*/,
bool erase_if_last_ref) override {
return Release(handle, erase_if_last_ref);
bool force_erase) override {
return Release(handle, force_erase);
}
bool IsReady(Cache::Handle* /*handle*/) override { return true; }
void Wait(Cache::Handle* /*handle*/) override {}
@ -297,7 +297,7 @@ class ClockCacheShard final : public CacheShard {
//
// Not necessary to hold mutex_ before being called.
bool Ref(Cache::Handle* handle) override;
bool Release(Cache::Handle* handle, bool erase_if_last_ref = false) override;
bool Release(Cache::Handle* handle, bool force_erase = false) override;
void Erase(const Slice& key, uint32_t hash) override;
bool EraseAndConfirm(const Slice& key, uint32_t hash,
CleanupContext* context);
@ -687,7 +687,7 @@ Status ClockCacheShard::Insert(const Slice& key, uint32_t hash, void* value,
Status s;
if (out_handle != nullptr) {
if (handle == nullptr) {
s = Status::Incomplete("Insert failed due to CLOCK cache being full.");
s = Status::Incomplete("Insert failed due to LRU cache being full.");
} else {
*out_handle = reinterpret_cast<Cache::Handle*>(handle);
}
@ -725,11 +725,11 @@ Cache::Handle* ClockCacheShard::Lookup(const Slice& key, uint32_t hash) {
return reinterpret_cast<Cache::Handle*>(handle);
}
bool ClockCacheShard::Release(Cache::Handle* h, bool erase_if_last_ref) {
bool ClockCacheShard::Release(Cache::Handle* h, bool force_erase) {
CleanupContext context;
CacheHandle* handle = reinterpret_cast<CacheHandle*>(h);
bool erased = Unref(handle, true, &context);
if (erase_if_last_ref && !erased) {
if (force_erase && !erased) {
erased = EraseAndConfirm(handle->key, handle->hash, &context);
}
Cleanup(context);
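
The change above is only a parameter rename (erase_if_last_ref on main, force_erase on 7.0.fb); the behavior is the same in both branches: drop one reference, and if that was the last reference and the flag is set, erase the entry as well. A small usage sketch against the public Cache API, assuming the classic Insert/Lookup/Release overloads used elsewhere in this diff:

```
#include <cassert>
#include <memory>

#include "rocksdb/cache.h"

using ROCKSDB_NAMESPACE::Cache;
using ROCKSDB_NAMESPACE::NewLRUCache;
using ROCKSDB_NAMESPACE::Slice;
using ROCKSDB_NAMESPACE::Status;

namespace {
void NoopDeleter(const Slice& /*key*/, void* /*value*/) {}
}  // namespace

int main() {
  std::shared_ptr<Cache> cache = NewLRUCache(/*capacity=*/1 << 20);

  static int value = 42;
  Cache::Handle* handle = nullptr;
  Status s = cache->Insert("key", &value, /*charge=*/1, &NoopDeleter, &handle);
  assert(s.ok());

  // With the second argument set, the entry is erased from the cache once the
  // last reference is dropped, instead of staying resident for later lookups.
  cache->Release(handle, /*erase_if_last_ref=*/true);
  assert(cache->Lookup("key") == nullptr);
  return 0;
}
```
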


@ -1,171 +0,0 @@
// Copyright (c) 2011-present, Facebook, Inc. All rights reserved.
// This source code is licensed under both the GPLv2 (found in the
// COPYING file in the root directory) and Apache 2.0 License
// (found in the LICENSE.Apache file in the root directory).
#include "cache/compressed_secondary_cache.h"
#include <memory>
#include "memory/memory_allocator.h"
#include "util/compression.h"
#include "util/string_util.h"
namespace ROCKSDB_NAMESPACE {
namespace {
void DeletionCallback(const Slice& /*key*/, void* obj) {
delete reinterpret_cast<CacheAllocationPtr*>(obj);
obj = nullptr;
}
} // namespace
CompressedSecondaryCache::CompressedSecondaryCache(
size_t capacity, int num_shard_bits, bool strict_capacity_limit,
double high_pri_pool_ratio,
std::shared_ptr<MemoryAllocator> memory_allocator, bool use_adaptive_mutex,
CacheMetadataChargePolicy metadata_charge_policy,
CompressionType compression_type, uint32_t compress_format_version)
: cache_options_(capacity, num_shard_bits, strict_capacity_limit,
high_pri_pool_ratio, memory_allocator, use_adaptive_mutex,
metadata_charge_policy, compression_type,
compress_format_version) {
cache_ = NewLRUCache(capacity, num_shard_bits, strict_capacity_limit,
high_pri_pool_ratio, memory_allocator,
use_adaptive_mutex, metadata_charge_policy);
}
CompressedSecondaryCache::~CompressedSecondaryCache() { cache_.reset(); }
std::unique_ptr<SecondaryCacheResultHandle> CompressedSecondaryCache::Lookup(
const Slice& key, const Cache::CreateCallback& create_cb, bool /*wait*/,
bool& is_in_sec_cache) {
std::unique_ptr<SecondaryCacheResultHandle> handle;
is_in_sec_cache = false;
Cache::Handle* lru_handle = cache_->Lookup(key);
if (lru_handle == nullptr) {
return handle;
}
CacheAllocationPtr* ptr =
reinterpret_cast<CacheAllocationPtr*>(cache_->Value(lru_handle));
void* value = nullptr;
size_t charge = 0;
Status s;
if (cache_options_.compression_type == kNoCompression) {
s = create_cb(ptr->get(), cache_->GetCharge(lru_handle), &value, &charge);
} else {
UncompressionContext uncompression_context(cache_options_.compression_type);
UncompressionInfo uncompression_info(uncompression_context,
UncompressionDict::GetEmptyDict(),
cache_options_.compression_type);
size_t uncompressed_size = 0;
CacheAllocationPtr uncompressed;
uncompressed = UncompressData(
uncompression_info, (char*)ptr->get(), cache_->GetCharge(lru_handle),
&uncompressed_size, cache_options_.compress_format_version,
cache_options_.memory_allocator.get());
if (!uncompressed) {
cache_->Release(lru_handle, /* erase_if_last_ref */ true);
return handle;
}
s = create_cb(uncompressed.get(), uncompressed_size, &value, &charge);
}
if (!s.ok()) {
cache_->Release(lru_handle, /* erase_if_last_ref */ true);
return handle;
}
cache_->Release(lru_handle, /* erase_if_last_ref */ true);
handle.reset(new CompressedSecondaryCacheResultHandle(value, charge));
return handle;
}
Status CompressedSecondaryCache::Insert(const Slice& key, void* value,
const Cache::CacheItemHelper* helper) {
size_t size = (*helper->size_cb)(value);
CacheAllocationPtr ptr =
AllocateBlock(size, cache_options_.memory_allocator.get());
Status s = (*helper->saveto_cb)(value, 0, size, ptr.get());
if (!s.ok()) {
return s;
}
Slice val(ptr.get(), size);
std::string compressed_val;
if (cache_options_.compression_type != kNoCompression) {
CompressionOptions compression_opts;
CompressionContext compression_context(cache_options_.compression_type);
uint64_t sample_for_compression = 0;
CompressionInfo compression_info(
compression_opts, compression_context, CompressionDict::GetEmptyDict(),
cache_options_.compression_type, sample_for_compression);
bool success =
CompressData(val, compression_info,
cache_options_.compress_format_version, &compressed_val);
if (!success) {
return Status::Corruption("Error compressing value.");
}
val = Slice(compressed_val);
size = compressed_val.size();
ptr = AllocateBlock(size, cache_options_.memory_allocator.get());
memcpy(ptr.get(), compressed_val.data(), size);
}
CacheAllocationPtr* buf = new CacheAllocationPtr(std::move(ptr));
return cache_->Insert(key, buf, size, DeletionCallback);
}
void CompressedSecondaryCache::Erase(const Slice& key) { cache_->Erase(key); }
std::string CompressedSecondaryCache::GetPrintableOptions() const {
std::string ret;
ret.reserve(20000);
const int kBufferSize = 200;
char buffer[kBufferSize];
ret.append(cache_->GetPrintableOptions());
snprintf(buffer, kBufferSize, " compression_type : %s\n",
CompressionTypeToString(cache_options_.compression_type).c_str());
ret.append(buffer);
snprintf(buffer, kBufferSize, " compression_type : %d\n",
cache_options_.compress_format_version);
ret.append(buffer);
return ret;
}
std::shared_ptr<SecondaryCache> NewCompressedSecondaryCache(
size_t capacity, int num_shard_bits, bool strict_capacity_limit,
double high_pri_pool_ratio,
std::shared_ptr<MemoryAllocator> memory_allocator, bool use_adaptive_mutex,
CacheMetadataChargePolicy metadata_charge_policy,
CompressionType compression_type, uint32_t compress_format_version) {
return std::make_shared<CompressedSecondaryCache>(
capacity, num_shard_bits, strict_capacity_limit, high_pri_pool_ratio,
memory_allocator, use_adaptive_mutex, metadata_charge_policy,
compression_type, compress_format_version);
}
std::shared_ptr<SecondaryCache> NewCompressedSecondaryCache(
const CompressedSecondaryCacheOptions& opts) {
// The secondary_cache is disabled for this LRUCache instance.
assert(opts.secondary_cache == nullptr);
return NewCompressedSecondaryCache(
opts.capacity, opts.num_shard_bits, opts.strict_capacity_limit,
opts.high_pri_pool_ratio, opts.memory_allocator, opts.use_adaptive_mutex,
opts.metadata_charge_policy, opts.compression_type,
opts.compress_format_version);
}
} // namespace ROCKSDB_NAMESPACE
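
Putting the factory functions above in context, the typical wiring (closely following the integration tests later in this diff) creates the compressed secondary cache first and attaches it to an LRU block cache through LRUCacheOptions::secondary_cache. Treat the capacities below as illustrative placeholders, not tuned values:

```
#include <memory>

#include "rocksdb/cache.h"
#include "rocksdb/secondary_cache.h"

using namespace ROCKSDB_NAMESPACE;

int main() {
  // Secondary (compressed) tier.
  CompressedSecondaryCacheOptions sec_opts;
  sec_opts.capacity = 64 << 20;  // 64 MiB, illustrative only
  sec_opts.num_shard_bits = 0;
  std::shared_ptr<SecondaryCache> sec_cache =
      NewCompressedSecondaryCache(sec_opts);

  // Primary LRU block cache, with the secondary cache attached.
  LRUCacheOptions lru_opts;
  lru_opts.capacity = 8 << 20;  // 8 MiB, illustrative only
  lru_opts.num_shard_bits = 0;
  lru_opts.secondary_cache = sec_cache;
  std::shared_ptr<Cache> block_cache = NewLRUCache(lru_opts);

  // block_cache can now be assigned to BlockBasedTableOptions::block_cache.
  (void)block_cache;
  return 0;
}
```
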


@ -1,86 +0,0 @@
// Copyright (c) 2011-present, Facebook, Inc. All rights reserved.
// This source code is licensed under both the GPLv2 (found in the
// COPYING file in the root directory) and Apache 2.0 License
// (found in the LICENSE.Apache file in the root directory).
#pragma once
#include <memory>
#include "cache/lru_cache.h"
#include "memory/memory_allocator.h"
#include "rocksdb/secondary_cache.h"
#include "rocksdb/slice.h"
#include "rocksdb/status.h"
#include "util/compression.h"
namespace ROCKSDB_NAMESPACE {
class CompressedSecondaryCacheResultHandle : public SecondaryCacheResultHandle {
public:
CompressedSecondaryCacheResultHandle(void* value, size_t size)
: value_(value), size_(size) {}
virtual ~CompressedSecondaryCacheResultHandle() override = default;
CompressedSecondaryCacheResultHandle(
const CompressedSecondaryCacheResultHandle&) = delete;
CompressedSecondaryCacheResultHandle& operator=(
const CompressedSecondaryCacheResultHandle&) = delete;
bool IsReady() override { return true; }
void Wait() override {}
void* Value() override { return value_; }
size_t Size() override { return size_; }
private:
void* value_;
size_t size_;
};
// The CompressedSecondaryCache is a concrete implementation of
// rocksdb::SecondaryCache.
//
// Users can also cast a pointer to it and call methods on
// it directly, especially custom methods that may be added
// in the future. For example -
// std::shared_ptr<rocksdb::SecondaryCache> cache =
// NewCompressedSecondaryCache(opts);
// static_cast<CompressedSecondaryCache*>(cache.get())->Erase(key);
class CompressedSecondaryCache : public SecondaryCache {
public:
CompressedSecondaryCache(
size_t capacity, int num_shard_bits, bool strict_capacity_limit,
double high_pri_pool_ratio,
std::shared_ptr<MemoryAllocator> memory_allocator = nullptr,
bool use_adaptive_mutex = kDefaultToAdaptiveMutex,
CacheMetadataChargePolicy metadata_charge_policy =
kDontChargeCacheMetadata,
CompressionType compression_type = CompressionType::kLZ4Compression,
uint32_t compress_format_version = 2);
virtual ~CompressedSecondaryCache() override;
const char* Name() const override { return "CompressedSecondaryCache"; }
Status Insert(const Slice& key, void* value,
const Cache::CacheItemHelper* helper) override;
std::unique_ptr<SecondaryCacheResultHandle> Lookup(
const Slice& key, const Cache::CreateCallback& create_cb, bool /*wait*/,
bool& is_in_sec_cache) override;
void Erase(const Slice& key) override;
void WaitAll(std::vector<SecondaryCacheResultHandle*> /*handles*/) override {}
std::string GetPrintableOptions() const override;
private:
std::shared_ptr<Cache> cache_;
CompressedSecondaryCacheOptions cache_options_;
};
} // namespace ROCKSDB_NAMESPACE
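
The Lookup() contract declared above (hand the stored bytes to a create callback, which rebuilds the cached object and reports its charge) is easiest to see with a toy, dependency-free stand-in. Everything below is hypothetical; compression, cache handles, and error statuses are omitted, and the callback returns bool where the real Cache::CreateCallback returns a Status:

```
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Analogous in shape to Cache::CreateCallback: raw bytes in, object + charge out.
using CreateCallback = std::function<bool(const void* buf, std::size_t size,
                                          void** out_obj, std::size_t* charge)>;

class ToySecondaryStore {
 public:
  // Insert stores a flat copy of the serialized value (the real class
  // compresses this buffer before storing it).
  void Insert(const std::string& key, const void* data, std::size_t size) {
    const char* p = static_cast<const char*>(data);
    blobs_[key].assign(p, p + size);
  }

  // Lookup rebuilds the object through the callback, like handle->Value().
  void* Lookup(const std::string& key, const CreateCallback& create_cb,
               std::size_t* charge) {
    auto it = blobs_.find(key);
    if (it == blobs_.end()) {
      return nullptr;
    }
    void* obj = nullptr;
    if (!create_cb(it->second.data(), it->second.size(), &obj, charge)) {
      return nullptr;
    }
    return obj;
  }

 private:
  std::map<std::string, std::vector<char>> blobs_;
};

int main() {
  ToySecondaryStore store;
  const char payload[] = "hello";
  store.Insert("k1", payload, sizeof(payload));

  std::size_t charge = 0;
  CreateCallback create_cb = [](const void* buf, std::size_t size,
                                void** out_obj, std::size_t* c) {
    *out_obj = new std::string(static_cast<const char*>(buf), size);
    *c = size;
    return true;
  };
  auto* value =
      static_cast<std::string*>(store.Lookup("k1", create_cb, &charge));
  delete value;  // the caller owns the rebuilt object in this toy example
  return 0;
}
```
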


@ -1,607 +0,0 @@
// Copyright (c) 2011-present, Facebook, Inc. All rights reserved.
// This source code is licensed under both the GPLv2 (found in the
// COPYING file in the root directory) and Apache 2.0 License
// (found in the LICENSE.Apache file in the root directory).
#include "cache/compressed_secondary_cache.h"
#include <algorithm>
#include <cstdint>
#include "memory/jemalloc_nodump_allocator.h"
#include "memory/memory_allocator.h"
#include "test_util/testharness.h"
#include "test_util/testutil.h"
#include "util/compression.h"
#include "util/random.h"
namespace ROCKSDB_NAMESPACE {
class CompressedSecondaryCacheTest : public testing::Test {
public:
CompressedSecondaryCacheTest() : fail_create_(false) {}
~CompressedSecondaryCacheTest() {}
protected:
class TestItem {
public:
TestItem(const char* buf, size_t size) : buf_(new char[size]), size_(size) {
memcpy(buf_.get(), buf, size);
}
~TestItem() {}
char* Buf() { return buf_.get(); }
size_t Size() { return size_; }
private:
std::unique_ptr<char[]> buf_;
size_t size_;
};
static size_t SizeCallback(void* obj) {
return reinterpret_cast<TestItem*>(obj)->Size();
}
static Status SaveToCallback(void* from_obj, size_t from_offset,
size_t length, void* out) {
TestItem* item = reinterpret_cast<TestItem*>(from_obj);
const char* buf = item->Buf();
EXPECT_EQ(length, item->Size());
EXPECT_EQ(from_offset, 0);
memcpy(out, buf, length);
return Status::OK();
}
static void DeletionCallback(const Slice& /*key*/, void* obj) {
delete reinterpret_cast<TestItem*>(obj);
obj = nullptr;
}
static Cache::CacheItemHelper helper_;
static Status SaveToCallbackFail(void* /*obj*/, size_t /*offset*/,
size_t /*size*/, void* /*out*/) {
return Status::NotSupported();
}
static Cache::CacheItemHelper helper_fail_;
Cache::CreateCallback test_item_creator = [&](const void* buf, size_t size,
void** out_obj,
size_t* charge) -> Status {
if (fail_create_) {
return Status::NotSupported();
}
*out_obj = reinterpret_cast<void*>(new TestItem((char*)buf, size));
*charge = size;
return Status::OK();
};
void SetFailCreate(bool fail) { fail_create_ = fail; }
void BasicTest(bool sec_cache_is_compressed, bool use_jemalloc) {
CompressedSecondaryCacheOptions opts;
opts.capacity = 2048;
opts.num_shard_bits = 0;
opts.metadata_charge_policy = kDontChargeCacheMetadata;
if (sec_cache_is_compressed) {
if (!LZ4_Supported()) {
ROCKSDB_GTEST_SKIP("This test requires LZ4 support.");
opts.compression_type = CompressionType::kNoCompression;
}
} else {
opts.compression_type = CompressionType::kNoCompression;
}
if (use_jemalloc) {
JemallocAllocatorOptions jopts;
std::shared_ptr<MemoryAllocator> allocator;
std::string msg;
if (JemallocNodumpAllocator::IsSupported(&msg)) {
Status s = NewJemallocNodumpAllocator(jopts, &allocator);
if (s.ok()) {
opts.memory_allocator = allocator;
}
} else {
ROCKSDB_GTEST_BYPASS("JEMALLOC not supported");
}
}
std::shared_ptr<SecondaryCache> sec_cache =
NewCompressedSecondaryCache(opts);
bool is_in_sec_cache{true};
// Lookup a non-existent key.
std::unique_ptr<SecondaryCacheResultHandle> handle0 =
sec_cache->Lookup("k0", test_item_creator, true, is_in_sec_cache);
ASSERT_EQ(handle0, nullptr);
Random rnd(301);
// Insert and Lookup the first item.
std::string str1;
test::CompressibleString(&rnd, 0.25, 1000, &str1);
TestItem item1(str1.data(), str1.length());
ASSERT_OK(sec_cache->Insert("k1", &item1,
&CompressedSecondaryCacheTest::helper_));
std::unique_ptr<SecondaryCacheResultHandle> handle1 =
sec_cache->Lookup("k1", test_item_creator, true, is_in_sec_cache);
ASSERT_NE(handle1, nullptr);
ASSERT_FALSE(is_in_sec_cache);
std::unique_ptr<TestItem> val1 =
std::unique_ptr<TestItem>(static_cast<TestItem*>(handle1->Value()));
ASSERT_NE(val1, nullptr);
ASSERT_EQ(memcmp(val1->Buf(), item1.Buf(), item1.Size()), 0);
// Lookup the first item again.
std::unique_ptr<SecondaryCacheResultHandle> handle1_1 =
sec_cache->Lookup("k1", test_item_creator, true, is_in_sec_cache);
ASSERT_EQ(handle1_1, nullptr);
// Insert and Lookup the second item.
std::string str2;
test::CompressibleString(&rnd, 0.5, 1000, &str2);
TestItem item2(str2.data(), str2.length());
ASSERT_OK(sec_cache->Insert("k2", &item2,
&CompressedSecondaryCacheTest::helper_));
std::unique_ptr<SecondaryCacheResultHandle> handle2 =
sec_cache->Lookup("k2", test_item_creator, true, is_in_sec_cache);
ASSERT_NE(handle2, nullptr);
std::unique_ptr<TestItem> val2 =
std::unique_ptr<TestItem>(static_cast<TestItem*>(handle2->Value()));
ASSERT_NE(val2, nullptr);
ASSERT_EQ(memcmp(val2->Buf(), item2.Buf(), item2.Size()), 0);
std::vector<SecondaryCacheResultHandle*> handles = {handle1.get(),
handle2.get()};
sec_cache->WaitAll(handles);
sec_cache.reset();
}
void FailsTest(bool sec_cache_is_compressed) {
CompressedSecondaryCacheOptions secondary_cache_opts;
if (sec_cache_is_compressed) {
if (!LZ4_Supported()) {
ROCKSDB_GTEST_SKIP("This test requires LZ4 support.");
secondary_cache_opts.compression_type = CompressionType::kNoCompression;
}
} else {
secondary_cache_opts.compression_type = CompressionType::kNoCompression;
}
secondary_cache_opts.capacity = 1100;
secondary_cache_opts.num_shard_bits = 0;
secondary_cache_opts.metadata_charge_policy = kDontChargeCacheMetadata;
std::shared_ptr<SecondaryCache> sec_cache =
NewCompressedSecondaryCache(secondary_cache_opts);
// Insert and Lookup the first item.
Random rnd(301);
std::string str1(rnd.RandomString(1000));
TestItem item1(str1.data(), str1.length());
ASSERT_OK(sec_cache->Insert("k1", &item1,
&CompressedSecondaryCacheTest::helper_));
// Insert and Lookup the second item.
std::string str2(rnd.RandomString(200));
TestItem item2(str2.data(), str2.length());
// k1 is evicted.
ASSERT_OK(sec_cache->Insert("k2", &item2,
&CompressedSecondaryCacheTest::helper_));
bool is_in_sec_cache{false};
std::unique_ptr<SecondaryCacheResultHandle> handle1_1 =
sec_cache->Lookup("k1", test_item_creator, true, is_in_sec_cache);
ASSERT_EQ(handle1_1, nullptr);
std::unique_ptr<SecondaryCacheResultHandle> handle2 =
sec_cache->Lookup("k2", test_item_creator, true, is_in_sec_cache);
ASSERT_NE(handle2, nullptr);
std::unique_ptr<TestItem> val2 =
std::unique_ptr<TestItem>(static_cast<TestItem*>(handle2->Value()));
ASSERT_NE(val2, nullptr);
ASSERT_EQ(memcmp(val2->Buf(), item2.Buf(), item2.Size()), 0);
// Create Fails.
SetFailCreate(true);
std::unique_ptr<SecondaryCacheResultHandle> handle2_1 =
sec_cache->Lookup("k2", test_item_creator, true, is_in_sec_cache);
ASSERT_EQ(handle2_1, nullptr);
// Save Fails.
std::string str3 = rnd.RandomString(10);
TestItem item3(str3.data(), str3.length());
ASSERT_NOK(sec_cache->Insert("k3", &item3,
&CompressedSecondaryCacheTest::helper_fail_));
sec_cache.reset();
}
void BasicIntegrationTest(bool sec_cache_is_compressed) {
CompressedSecondaryCacheOptions secondary_cache_opts;
if (sec_cache_is_compressed) {
if (!LZ4_Supported()) {
ROCKSDB_GTEST_SKIP("This test requires LZ4 support.");
secondary_cache_opts.compression_type = CompressionType::kNoCompression;
}
} else {
secondary_cache_opts.compression_type = CompressionType::kNoCompression;
}
secondary_cache_opts.capacity = 2300;
secondary_cache_opts.num_shard_bits = 0;
secondary_cache_opts.metadata_charge_policy = kDontChargeCacheMetadata;
std::shared_ptr<SecondaryCache> secondary_cache =
NewCompressedSecondaryCache(secondary_cache_opts);
LRUCacheOptions lru_cache_opts(1024, 0, false, 0.5, nullptr,
kDefaultToAdaptiveMutex,
kDontChargeCacheMetadata);
lru_cache_opts.secondary_cache = secondary_cache;
std::shared_ptr<Cache> cache = NewLRUCache(lru_cache_opts);
std::shared_ptr<Statistics> stats = CreateDBStatistics();
Random rnd(301);
std::string str1 = rnd.RandomString(1010);
std::string str1_clone{str1};
TestItem* item1 = new TestItem(str1.data(), str1.length());
ASSERT_OK(cache->Insert("k1", item1, &CompressedSecondaryCacheTest::helper_,
str1.length()));
std::string str2 = rnd.RandomString(1020);
TestItem* item2 = new TestItem(str2.data(), str2.length());
// After Insert, lru cache contains k2 and secondary cache contains k1.
ASSERT_OK(cache->Insert("k2", item2, &CompressedSecondaryCacheTest::helper_,
str2.length()));
std::string str3 = rnd.RandomString(1020);
TestItem* item3 = new TestItem(str3.data(), str3.length());
// After Insert, lru cache contains k3 and secondary cache contains k1 and
// k2
ASSERT_OK(cache->Insert("k3", item3, &CompressedSecondaryCacheTest::helper_,
str3.length()));
Cache::Handle* handle;
handle = cache->Lookup("k3", &CompressedSecondaryCacheTest::helper_,
test_item_creator, Cache::Priority::LOW, true,
stats.get());
ASSERT_NE(handle, nullptr);
TestItem* val3 = static_cast<TestItem*>(cache->Value(handle));
ASSERT_NE(val3, nullptr);
ASSERT_EQ(memcmp(val3->Buf(), item3->Buf(), item3->Size()), 0);
cache->Release(handle);
// Lookup a non-existent key.
handle = cache->Lookup("k0", &CompressedSecondaryCacheTest::helper_,
test_item_creator, Cache::Priority::LOW, true,
stats.get());
ASSERT_EQ(handle, nullptr);
// This Lookup should promote k1 and erase k1 from the secondary cache,
// then k3 is demoted. So k2 and k3 are in the secondary cache.
handle = cache->Lookup("k1", &CompressedSecondaryCacheTest::helper_,
test_item_creator, Cache::Priority::LOW, true,
stats.get());
ASSERT_NE(handle, nullptr);
TestItem* val1_1 = static_cast<TestItem*>(cache->Value(handle));
ASSERT_NE(val1_1, nullptr);
ASSERT_EQ(memcmp(val1_1->Buf(), str1_clone.data(), str1_clone.size()), 0);
cache->Release(handle);
handle = cache->Lookup("k2", &CompressedSecondaryCacheTest::helper_,
test_item_creator, Cache::Priority::LOW, true,
stats.get());
ASSERT_NE(handle, nullptr);
cache->Release(handle);
cache.reset();
secondary_cache.reset();
}
void BasicIntegrationFailTest(bool sec_cache_is_compressed) {
CompressedSecondaryCacheOptions secondary_cache_opts;
if (sec_cache_is_compressed) {
if (!LZ4_Supported()) {
ROCKSDB_GTEST_SKIP("This test requires LZ4 support.");
secondary_cache_opts.compression_type = CompressionType::kNoCompression;
}
} else {
secondary_cache_opts.compression_type = CompressionType::kNoCompression;
}
secondary_cache_opts.capacity = 2048;
secondary_cache_opts.num_shard_bits = 0;
secondary_cache_opts.metadata_charge_policy = kDontChargeCacheMetadata;
std::shared_ptr<SecondaryCache> secondary_cache =
NewCompressedSecondaryCache(secondary_cache_opts);
LRUCacheOptions opts(1024, 0, false, 0.5, nullptr, kDefaultToAdaptiveMutex,
kDontChargeCacheMetadata);
opts.secondary_cache = secondary_cache;
std::shared_ptr<Cache> cache = NewLRUCache(opts);
Random rnd(301);
std::string str1 = rnd.RandomString(1020);
auto item1 =
std::unique_ptr<TestItem>(new TestItem(str1.data(), str1.length()));
ASSERT_NOK(cache->Insert("k1", item1.get(), nullptr, str1.length()));
ASSERT_OK(cache->Insert("k1", item1.get(),
&CompressedSecondaryCacheTest::helper_,
str1.length()));
item1.release(); // Appease clang-analyze "potential memory leak"
Cache::Handle* handle;
handle = cache->Lookup("k2", nullptr, test_item_creator,
Cache::Priority::LOW, true);
ASSERT_EQ(handle, nullptr);
handle = cache->Lookup("k2", &CompressedSecondaryCacheTest::helper_,
test_item_creator, Cache::Priority::LOW, false);
ASSERT_EQ(handle, nullptr);
cache.reset();
secondary_cache.reset();
}
void IntegrationSaveFailTest(bool sec_cache_is_compressed) {
CompressedSecondaryCacheOptions secondary_cache_opts;
if (sec_cache_is_compressed) {
if (!LZ4_Supported()) {
ROCKSDB_GTEST_SKIP("This test requires LZ4 support.");
secondary_cache_opts.compression_type = CompressionType::kNoCompression;
}
} else {
secondary_cache_opts.compression_type = CompressionType::kNoCompression;
}
secondary_cache_opts.capacity = 2048;
secondary_cache_opts.num_shard_bits = 0;
secondary_cache_opts.metadata_charge_policy = kDontChargeCacheMetadata;
std::shared_ptr<SecondaryCache> secondary_cache =
NewCompressedSecondaryCache(secondary_cache_opts);
LRUCacheOptions opts(1024, 0, false, 0.5, nullptr, kDefaultToAdaptiveMutex,
kDontChargeCacheMetadata);
opts.secondary_cache = secondary_cache;
std::shared_ptr<Cache> cache = NewLRUCache(opts);
Random rnd(301);
std::string str1 = rnd.RandomString(1020);
TestItem* item1 = new TestItem(str1.data(), str1.length());
ASSERT_OK(cache->Insert("k1", item1,
&CompressedSecondaryCacheTest::helper_fail_,
str1.length()));
std::string str2 = rnd.RandomString(1020);
TestItem* item2 = new TestItem(str2.data(), str2.length());
// k1 should be demoted to the secondary cache.
ASSERT_OK(cache->Insert("k2", item2,
&CompressedSecondaryCacheTest::helper_fail_,
str2.length()));
Cache::Handle* handle;
handle = cache->Lookup("k2", &CompressedSecondaryCacheTest::helper_fail_,
test_item_creator, Cache::Priority::LOW, true);
ASSERT_NE(handle, nullptr);
cache->Release(handle);
// This lookup should fail, since k1 demotion would have failed
handle = cache->Lookup("k1", &CompressedSecondaryCacheTest::helper_fail_,
test_item_creator, Cache::Priority::LOW, true);
ASSERT_EQ(handle, nullptr);
// Since k1 didn't get promoted, k2 should still be in cache
handle = cache->Lookup("k2", &CompressedSecondaryCacheTest::helper_fail_,
test_item_creator, Cache::Priority::LOW, true);
ASSERT_NE(handle, nullptr);
cache->Release(handle);
cache.reset();
secondary_cache.reset();
}
void IntegrationCreateFailTest(bool sec_cache_is_compressed) {
CompressedSecondaryCacheOptions secondary_cache_opts;
if (sec_cache_is_compressed) {
if (!LZ4_Supported()) {
ROCKSDB_GTEST_SKIP("This test requires LZ4 support.");
secondary_cache_opts.compression_type = CompressionType::kNoCompression;
}
} else {
secondary_cache_opts.compression_type = CompressionType::kNoCompression;
}
secondary_cache_opts.capacity = 2048;
secondary_cache_opts.num_shard_bits = 0;
secondary_cache_opts.metadata_charge_policy = kDontChargeCacheMetadata;
std::shared_ptr<SecondaryCache> secondary_cache =
NewCompressedSecondaryCache(secondary_cache_opts);
LRUCacheOptions opts(1024, 0, false, 0.5, nullptr, kDefaultToAdaptiveMutex,
kDontChargeCacheMetadata);
opts.secondary_cache = secondary_cache;
std::shared_ptr<Cache> cache = NewLRUCache(opts);
Random rnd(301);
std::string str1 = rnd.RandomString(1020);
TestItem* item1 = new TestItem(str1.data(), str1.length());
ASSERT_OK(cache->Insert("k1", item1, &CompressedSecondaryCacheTest::helper_,
str1.length()));
std::string str2 = rnd.RandomString(1020);
TestItem* item2 = new TestItem(str2.data(), str2.length());
// k1 should be demoted to the secondary cache.
ASSERT_OK(cache->Insert("k2", item2, &CompressedSecondaryCacheTest::helper_,
str2.length()));
Cache::Handle* handle;
SetFailCreate(true);
handle = cache->Lookup("k2", &CompressedSecondaryCacheTest::helper_,
test_item_creator, Cache::Priority::LOW, true);
ASSERT_NE(handle, nullptr);
cache->Release(handle);
// This lookup should fail, since k1 creation would have failed
handle = cache->Lookup("k1", &CompressedSecondaryCacheTest::helper_,
test_item_creator, Cache::Priority::LOW, true);
ASSERT_EQ(handle, nullptr);
// Since k1 didn't get promoted, k2 should still be in cache
handle = cache->Lookup("k2", &CompressedSecondaryCacheTest::helper_,
test_item_creator, Cache::Priority::LOW, true);
ASSERT_NE(handle, nullptr);
cache->Release(handle);
cache.reset();
secondary_cache.reset();
}
void IntegrationFullCapacityTest(bool sec_cache_is_compressed) {
CompressedSecondaryCacheOptions secondary_cache_opts;
if (sec_cache_is_compressed) {
if (!LZ4_Supported()) {
ROCKSDB_GTEST_SKIP("This test requires LZ4 support.");
secondary_cache_opts.compression_type = CompressionType::kNoCompression;
}
} else {
secondary_cache_opts.compression_type = CompressionType::kNoCompression;
}
secondary_cache_opts.capacity = 2048;
secondary_cache_opts.num_shard_bits = 0;
secondary_cache_opts.metadata_charge_policy = kDontChargeCacheMetadata;
std::shared_ptr<SecondaryCache> secondary_cache =
NewCompressedSecondaryCache(secondary_cache_opts);
LRUCacheOptions opts(1024, 0, /*_strict_capacity_limit=*/true, 0.5, nullptr,
kDefaultToAdaptiveMutex, kDontChargeCacheMetadata);
opts.secondary_cache = secondary_cache;
std::shared_ptr<Cache> cache = NewLRUCache(opts);
Random rnd(301);
std::string str1 = rnd.RandomString(1020);
TestItem* item1 = new TestItem(str1.data(), str1.length());
ASSERT_OK(cache->Insert("k1", item1, &CompressedSecondaryCacheTest::helper_,
str1.length()));
std::string str2 = rnd.RandomString(1020);
TestItem* item2 = new TestItem(str2.data(), str2.length());
// k1 should be demoted to the secondary cache.
ASSERT_OK(cache->Insert("k2", item2, &CompressedSecondaryCacheTest::helper_,
str2.length()));
Cache::Handle* handle2;
handle2 = cache->Lookup("k2", &CompressedSecondaryCacheTest::helper_,
test_item_creator, Cache::Priority::LOW, true);
ASSERT_NE(handle2, nullptr);
cache->Release(handle2);
// k1 promotion should fail due to the block cache being at capacity,
// but the lookup should still succeed
Cache::Handle* handle1;
handle1 = cache->Lookup("k1", &CompressedSecondaryCacheTest::helper_,
test_item_creator, Cache::Priority::LOW, true);
ASSERT_NE(handle1, nullptr);
cache->Release(handle1);
// Since k1 didn't get inserted, k2 should still be in cache
handle2 = cache->Lookup("k2", &CompressedSecondaryCacheTest::helper_,
test_item_creator, Cache::Priority::LOW, true);
ASSERT_NE(handle2, nullptr);
cache->Release(handle2);
cache.reset();
secondary_cache.reset();
}
private:
bool fail_create_;
};
Cache::CacheItemHelper CompressedSecondaryCacheTest::helper_(
CompressedSecondaryCacheTest::SizeCallback,
CompressedSecondaryCacheTest::SaveToCallback,
CompressedSecondaryCacheTest::DeletionCallback);
Cache::CacheItemHelper CompressedSecondaryCacheTest::helper_fail_(
CompressedSecondaryCacheTest::SizeCallback,
CompressedSecondaryCacheTest::SaveToCallbackFail,
CompressedSecondaryCacheTest::DeletionCallback);
TEST_F(CompressedSecondaryCacheTest, BasicTestWithNoCompression) {
BasicTest(false, false);
}
TEST_F(CompressedSecondaryCacheTest,
BasicTestWithMemoryAllocatorAndNoCompression) {
BasicTest(false, true);
}
TEST_F(CompressedSecondaryCacheTest, BasicTestWithCompression) {
BasicTest(true, false);
}
TEST_F(CompressedSecondaryCacheTest,
BasicTestWithMemoryAllocatorAndCompression) {
BasicTest(true, true);
}
TEST_F(CompressedSecondaryCacheTest, FailsTestWithNoCompression) {
FailsTest(false);
}
TEST_F(CompressedSecondaryCacheTest, FailsTestWithCompression) {
FailsTest(true);
}
TEST_F(CompressedSecondaryCacheTest, BasicIntegrationTestWithNoCompression) {
BasicIntegrationTest(false);
}
TEST_F(CompressedSecondaryCacheTest, BasicIntegrationTestWithCompression) {
BasicIntegrationTest(true);
}
TEST_F(CompressedSecondaryCacheTest,
BasicIntegrationFailTestWithNoCompression) {
BasicIntegrationFailTest(false);
}
TEST_F(CompressedSecondaryCacheTest, BasicIntegrationFailTestWithCompression) {
BasicIntegrationFailTest(true);
}
TEST_F(CompressedSecondaryCacheTest, IntegrationSaveFailTestWithNoCompression) {
IntegrationSaveFailTest(false);
}
TEST_F(CompressedSecondaryCacheTest, IntegrationSaveFailTestWithCompression) {
IntegrationSaveFailTest(true);
}
TEST_F(CompressedSecondaryCacheTest,
IntegrationCreateFailTestWithNoCompression) {
IntegrationCreateFailTest(false);
}
TEST_F(CompressedSecondaryCacheTest, IntegrationCreateFailTestWithCompression) {
IntegrationCreateFailTest(true);
}
TEST_F(CompressedSecondaryCacheTest,
IntegrationFullCapacityTestWithNoCompression) {
IntegrationFullCapacityTest(false);
}
TEST_F(CompressedSecondaryCacheTest,
IntegrationFullCapacityTestWithCompression) {
IntegrationFullCapacityTest(true);
}
} // namespace ROCKSDB_NAMESPACE
int main(int argc, char** argv) {
::testing::InitGoogleTest(&argc, argv);
return RUN_ALL_TESTS();
}
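
The integration tests above pair a small NewLRUCache block cache with a CompressedSecondaryCache and drive demotion/promotion through Insert/Lookup. Below is a minimal application-side sketch of the same wiring, not taken from the diff; it assumes the factories used by the tests (NewCompressedSecondaryCache, NewLRUCache) and the option structs are visible through the public rocksdb/cache.h header on this branch, and the capacities are placeholders rather than production-sized values.

```
#include <memory>

#include "rocksdb/cache.h"  // assumed to declare LRUCacheOptions,
                            // CompressedSecondaryCacheOptions and the
                            // NewCompressedSecondaryCache factory

int main() {
  using namespace ROCKSDB_NAMESPACE;

  // Compressed (secondary) tier, configured like BasicIntegrationTest.
  CompressedSecondaryCacheOptions sec_opts;
  sec_opts.capacity = 64 << 20;  // 64 MB for the compressed tier
  sec_opts.num_shard_bits = 0;
  sec_opts.metadata_charge_policy = kDontChargeCacheMetadata;
  std::shared_ptr<SecondaryCache> sec_cache =
      NewCompressedSecondaryCache(sec_opts);

  // Primary (uncompressed) tier; entries evicted from it are offered to the
  // secondary cache, and secondary hits are promoted back on Lookup.
  LRUCacheOptions lru_opts(8 << 20 /* capacity */, 0 /* num_shard_bits */,
                           false /* strict_capacity_limit */,
                           0.5 /* high_pri_pool_ratio */,
                           nullptr /* memory_allocator */,
                           kDefaultToAdaptiveMutex, kDontChargeCacheMetadata);
  lru_opts.secondary_cache = sec_cache;
  std::shared_ptr<Cache> block_cache = NewLRUCache(lru_opts);

  // block_cache would then be assigned to BlockBasedTableOptions::block_cache.
  return block_cache != nullptr ? 0 : 1;
}
```
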

cache/fast_lru_cache.cc vendored

@ -1,511 +0,0 @@
// Copyright (c) 2011-present, Facebook, Inc. All rights reserved.
// This source code is licensed under both the GPLv2 (found in the
// COPYING file in the root directory) and Apache 2.0 License
// (found in the LICENSE.Apache file in the root directory).
//
// Copyright (c) 2011 The LevelDB Authors. All rights reserved.
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file. See the AUTHORS file for names of contributors.
#include "cache/fast_lru_cache.h"
#include <cassert>
#include <cstdint>
#include <cstdio>
#include "monitoring/perf_context_imp.h"
#include "monitoring/statistics.h"
#include "port/lang.h"
#include "util/mutexlock.h"
namespace ROCKSDB_NAMESPACE {
namespace fast_lru_cache {
LRUHandleTable::LRUHandleTable(int max_upper_hash_bits)
: length_bits_(/* historical starting size*/ 4),
list_(new LRUHandle* [size_t{1} << length_bits_] {}),
elems_(0),
max_length_bits_(max_upper_hash_bits) {}
LRUHandleTable::~LRUHandleTable() {
ApplyToEntriesRange(
[](LRUHandle* h) {
if (!h->HasRefs()) {
h->Free();
}
},
0, uint32_t{1} << length_bits_);
}
LRUHandle* LRUHandleTable::Lookup(const Slice& key, uint32_t hash) {
return *FindPointer(key, hash);
}
LRUHandle* LRUHandleTable::Insert(LRUHandle* h) {
LRUHandle** ptr = FindPointer(h->key(), h->hash);
LRUHandle* old = *ptr;
h->next_hash = (old == nullptr ? nullptr : old->next_hash);
*ptr = h;
if (old == nullptr) {
++elems_;
if ((elems_ >> length_bits_) > 0) { // elems_ >= length
// Since each cache entry is fairly large, we aim for a small
// average linked list length (<= 1).
Resize();
}
}
return old;
}
LRUHandle* LRUHandleTable::Remove(const Slice& key, uint32_t hash) {
LRUHandle** ptr = FindPointer(key, hash);
LRUHandle* result = *ptr;
if (result != nullptr) {
*ptr = result->next_hash;
--elems_;
}
return result;
}
LRUHandle** LRUHandleTable::FindPointer(const Slice& key, uint32_t hash) {
LRUHandle** ptr = &list_[hash >> (32 - length_bits_)];
while (*ptr != nullptr && ((*ptr)->hash != hash || key != (*ptr)->key())) {
ptr = &(*ptr)->next_hash;
}
return ptr;
}
void LRUHandleTable::Resize() {
if (length_bits_ >= max_length_bits_) {
// Due to reaching limit of hash information, if we made the table bigger,
// we would allocate more addresses but only the same number would be used.
return;
}
if (length_bits_ >= 31) {
// Avoid undefined behavior shifting uint32_t by 32.
return;
}
uint32_t old_length = uint32_t{1} << length_bits_;
int new_length_bits = length_bits_ + 1;
std::unique_ptr<LRUHandle* []> new_list {
new LRUHandle* [size_t{1} << new_length_bits] {}
};
uint32_t count = 0;
for (uint32_t i = 0; i < old_length; i++) {
LRUHandle* h = list_[i];
while (h != nullptr) {
LRUHandle* next = h->next_hash;
uint32_t hash = h->hash;
LRUHandle** ptr = &new_list[hash >> (32 - new_length_bits)];
h->next_hash = *ptr;
*ptr = h;
h = next;
count++;
}
}
assert(elems_ == count);
list_ = std::move(new_list);
length_bits_ = new_length_bits;
}
LRUCacheShard::LRUCacheShard(size_t capacity, bool strict_capacity_limit,
CacheMetadataChargePolicy metadata_charge_policy,
int max_upper_hash_bits)
: capacity_(0),
strict_capacity_limit_(strict_capacity_limit),
table_(max_upper_hash_bits),
usage_(0),
lru_usage_(0) {
set_metadata_charge_policy(metadata_charge_policy);
// Make empty circular linked list.
lru_.next = &lru_;
lru_.prev = &lru_;
lru_low_pri_ = &lru_;
SetCapacity(capacity);
}
void LRUCacheShard::EraseUnRefEntries() {
autovector<LRUHandle*> last_reference_list;
{
MutexLock l(&mutex_);
while (lru_.next != &lru_) {
LRUHandle* old = lru_.next;
// LRU list contains only elements which can be evicted.
assert(old->InCache() && !old->HasRefs());
LRU_Remove(old);
table_.Remove(old->key(), old->hash);
old->SetInCache(false);
size_t total_charge = old->CalcTotalCharge(metadata_charge_policy_);
assert(usage_ >= total_charge);
usage_ -= total_charge;
last_reference_list.push_back(old);
}
}
// Free the entries here outside of mutex for performance reasons.
for (auto entry : last_reference_list) {
entry->Free();
}
}
void LRUCacheShard::ApplyToSomeEntries(
const std::function<void(const Slice& key, void* value, size_t charge,
DeleterFn deleter)>& callback,
uint32_t average_entries_per_lock, uint32_t* state) {
// The state is essentially going to be the starting hash, which works
// nicely even if we resize between calls because we use upper-most
// hash bits for table indexes.
MutexLock l(&mutex_);
uint32_t length_bits = table_.GetLengthBits();
uint32_t length = uint32_t{1} << length_bits;
assert(average_entries_per_lock > 0);
// Assuming we are called with same average_entries_per_lock repeatedly,
// this simplifies some logic (index_end will not overflow).
assert(average_entries_per_lock < length || *state == 0);
uint32_t index_begin = *state >> (32 - length_bits);
uint32_t index_end = index_begin + average_entries_per_lock;
if (index_end >= length) {
// Going to end
index_end = length;
*state = UINT32_MAX;
} else {
*state = index_end << (32 - length_bits);
}
table_.ApplyToEntriesRange(
[callback](LRUHandle* h) {
callback(h->key(), h->value, h->charge, h->deleter);
},
index_begin, index_end);
}
void LRUCacheShard::LRU_Remove(LRUHandle* e) {
assert(e->next != nullptr);
assert(e->prev != nullptr);
e->next->prev = e->prev;
e->prev->next = e->next;
e->prev = e->next = nullptr;
size_t total_charge = e->CalcTotalCharge(metadata_charge_policy_);
assert(lru_usage_ >= total_charge);
lru_usage_ -= total_charge;
}
void LRUCacheShard::LRU_Insert(LRUHandle* e) {
assert(e->next == nullptr);
assert(e->prev == nullptr);
size_t total_charge = e->CalcTotalCharge(metadata_charge_policy_);
// Insert "e" at the head of the LRU list.
e->next = &lru_;
e->prev = lru_.prev;
e->prev->next = e;
e->next->prev = e;
lru_usage_ += total_charge;
}
void LRUCacheShard::EvictFromLRU(size_t charge,
autovector<LRUHandle*>* deleted) {
while ((usage_ + charge) > capacity_ && lru_.next != &lru_) {
LRUHandle* old = lru_.next;
// LRU list contains only elements which can be evicted.
assert(old->InCache() && !old->HasRefs());
LRU_Remove(old);
table_.Remove(old->key(), old->hash);
old->SetInCache(false);
size_t old_total_charge = old->CalcTotalCharge(metadata_charge_policy_);
assert(usage_ >= old_total_charge);
usage_ -= old_total_charge;
deleted->push_back(old);
}
}
void LRUCacheShard::SetCapacity(size_t capacity) {
autovector<LRUHandle*> last_reference_list;
{
MutexLock l(&mutex_);
capacity_ = capacity;
EvictFromLRU(0, &last_reference_list);
}
// Free the entries here outside of mutex for performance reasons.
for (auto entry : last_reference_list) {
entry->Free();
}
}
void LRUCacheShard::SetStrictCapacityLimit(bool strict_capacity_limit) {
MutexLock l(&mutex_);
strict_capacity_limit_ = strict_capacity_limit;
}
Status LRUCacheShard::InsertItem(LRUHandle* e, Cache::Handle** handle,
bool free_handle_on_fail) {
Status s = Status::OK();
autovector<LRUHandle*> last_reference_list;
size_t total_charge = e->CalcTotalCharge(metadata_charge_policy_);
{
MutexLock l(&mutex_);
// Free the space following strict LRU policy until enough space
// is freed or the lru list is empty.
EvictFromLRU(total_charge, &last_reference_list);
if ((usage_ + total_charge) > capacity_ &&
(strict_capacity_limit_ || handle == nullptr)) {
e->SetInCache(false);
if (handle == nullptr) {
// Don't insert the entry but still return ok, as if the entry inserted
// into cache and get evicted immediately.
last_reference_list.push_back(e);
} else {
if (free_handle_on_fail) {
delete[] reinterpret_cast<char*>(e);
*handle = nullptr;
}
s = Status::Incomplete("Insert failed due to LRU cache being full.");
}
} else {
// Insert into the cache. Note that the cache might get larger than its
// capacity if not enough space was freed up.
LRUHandle* old = table_.Insert(e);
usage_ += total_charge;
if (old != nullptr) {
s = Status::OkOverwritten();
assert(old->InCache());
old->SetInCache(false);
if (!old->HasRefs()) {
// old is on LRU because it's in cache and its reference count is 0.
LRU_Remove(old);
size_t old_total_charge =
old->CalcTotalCharge(metadata_charge_policy_);
assert(usage_ >= old_total_charge);
usage_ -= old_total_charge;
last_reference_list.push_back(old);
}
}
if (handle == nullptr) {
LRU_Insert(e);
} else {
// If caller already holds a ref, no need to take one here.
if (!e->HasRefs()) {
e->Ref();
}
*handle = reinterpret_cast<Cache::Handle*>(e);
}
}
}
// Free the entries here outside of mutex for performance reasons.
for (auto entry : last_reference_list) {
entry->Free();
}
return s;
}
Cache::Handle* LRUCacheShard::Lookup(const Slice& key, uint32_t hash) {
LRUHandle* e = nullptr;
{
MutexLock l(&mutex_);
e = table_.Lookup(key, hash);
if (e != nullptr) {
assert(e->InCache());
if (!e->HasRefs()) {
// The entry is in LRU since it's in hash and has no external references
LRU_Remove(e);
}
e->Ref();
}
}
return reinterpret_cast<Cache::Handle*>(e);
}
bool LRUCacheShard::Ref(Cache::Handle* h) {
LRUHandle* e = reinterpret_cast<LRUHandle*>(h);
MutexLock l(&mutex_);
// To create another reference - entry must be already externally referenced.
assert(e->HasRefs());
e->Ref();
return true;
}
bool LRUCacheShard::Release(Cache::Handle* handle, bool erase_if_last_ref) {
if (handle == nullptr) {
return false;
}
LRUHandle* e = reinterpret_cast<LRUHandle*>(handle);
bool last_reference = false;
{
MutexLock l(&mutex_);
last_reference = e->Unref();
if (last_reference && e->InCache()) {
// The item is still in cache, and nobody else holds a reference to it.
if (usage_ > capacity_ || erase_if_last_ref) {
// The LRU list must be empty since the cache is full.
assert(lru_.next == &lru_ || erase_if_last_ref);
// Take this opportunity and remove the item.
table_.Remove(e->key(), e->hash);
e->SetInCache(false);
} else {
// Put the item back on the LRU list, and don't free it.
LRU_Insert(e);
last_reference = false;
}
}
// If it was the last reference, then decrement the cache usage.
if (last_reference) {
size_t total_charge = e->CalcTotalCharge(metadata_charge_policy_);
assert(usage_ >= total_charge);
usage_ -= total_charge;
}
}
// Free the entry here outside of mutex for performance reasons.
if (last_reference) {
e->Free();
}
return last_reference;
}
Status LRUCacheShard::Insert(const Slice& key, uint32_t hash, void* value,
size_t charge, Cache::DeleterFn deleter,
Cache::Handle** handle,
Cache::Priority /*priority*/) {
// Allocate the memory here outside of the mutex.
// If the cache is full, we'll have to release it.
// It shouldn't happen very often though.
LRUHandle* e = reinterpret_cast<LRUHandle*>(
new char[sizeof(LRUHandle) - 1 + key.size()]);
e->value = value;
e->flags = 0;
e->deleter = deleter;
e->charge = charge;
e->key_length = key.size();
e->hash = hash;
e->refs = 0;
e->next = e->prev = nullptr;
e->SetInCache(true);
memcpy(e->key_data, key.data(), key.size());
return InsertItem(e, handle, /* free_handle_on_fail */ true);
}
void LRUCacheShard::Erase(const Slice& key, uint32_t hash) {
LRUHandle* e;
bool last_reference = false;
{
MutexLock l(&mutex_);
e = table_.Remove(key, hash);
if (e != nullptr) {
assert(e->InCache());
e->SetInCache(false);
if (!e->HasRefs()) {
// The entry is in LRU since it's in hash and has no external references
LRU_Remove(e);
size_t total_charge = e->CalcTotalCharge(metadata_charge_policy_);
assert(usage_ >= total_charge);
usage_ -= total_charge;
last_reference = true;
}
}
}
// Free the entry here outside of mutex for performance reasons.
// last_reference will only be true if e != nullptr.
if (last_reference) {
e->Free();
}
}
size_t LRUCacheShard::GetUsage() const {
MutexLock l(&mutex_);
return usage_;
}
size_t LRUCacheShard::GetPinnedUsage() const {
MutexLock l(&mutex_);
assert(usage_ >= lru_usage_);
return usage_ - lru_usage_;
}
std::string LRUCacheShard::GetPrintableOptions() const { return std::string{}; }
LRUCache::LRUCache(size_t capacity, int num_shard_bits,
bool strict_capacity_limit,
CacheMetadataChargePolicy metadata_charge_policy)
: ShardedCache(capacity, num_shard_bits, strict_capacity_limit) {
num_shards_ = 1 << num_shard_bits;
shards_ = reinterpret_cast<LRUCacheShard*>(
port::cacheline_aligned_alloc(sizeof(LRUCacheShard) * num_shards_));
size_t per_shard = (capacity + (num_shards_ - 1)) / num_shards_;
for (int i = 0; i < num_shards_; i++) {
new (&shards_[i])
LRUCacheShard(per_shard, strict_capacity_limit, metadata_charge_policy,
/* max_upper_hash_bits */ 32 - num_shard_bits);
}
}
LRUCache::~LRUCache() {
if (shards_ != nullptr) {
assert(num_shards_ > 0);
for (int i = 0; i < num_shards_; i++) {
shards_[i].~LRUCacheShard();
}
port::cacheline_aligned_free(shards_);
}
}
CacheShard* LRUCache::GetShard(uint32_t shard) {
return reinterpret_cast<CacheShard*>(&shards_[shard]);
}
const CacheShard* LRUCache::GetShard(uint32_t shard) const {
return reinterpret_cast<CacheShard*>(&shards_[shard]);
}
void* LRUCache::Value(Handle* handle) {
return reinterpret_cast<const LRUHandle*>(handle)->value;
}
size_t LRUCache::GetCharge(Handle* handle) const {
return reinterpret_cast<const LRUHandle*>(handle)->charge;
}
Cache::DeleterFn LRUCache::GetDeleter(Handle* handle) const {
auto h = reinterpret_cast<const LRUHandle*>(handle);
return h->deleter;
}
uint32_t LRUCache::GetHash(Handle* handle) const {
return reinterpret_cast<const LRUHandle*>(handle)->hash;
}
void LRUCache::DisownData() {
// Leak data only if that won't generate an ASAN/valgrind warning.
if (!kMustFreeHeapAllocations) {
shards_ = nullptr;
num_shards_ = 0;
}
}
} // namespace fast_lru_cache
std::shared_ptr<Cache> NewFastLRUCache(
size_t capacity, int num_shard_bits, bool strict_capacity_limit,
CacheMetadataChargePolicy metadata_charge_policy) {
if (num_shard_bits >= 20) {
return nullptr; // The cache cannot be sharded into too many fine pieces.
}
if (num_shard_bits < 0) {
num_shard_bits = GetDefaultCacheShardBits(capacity);
}
return std::make_shared<fast_lru_cache::LRUCache>(
capacity, num_shard_bits, strict_capacity_limit, metadata_charge_policy);
}
} // namespace ROCKSDB_NAMESPACE
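
NewFastLRUCache() above is the single entry point into this experimental cache. A hedged usage sketch follows; it is illustrative only, and the include path is an assumption about building inside the RocksDB tree, since this internal header is not part of the public API.

```
#include <memory>

#include "cache/fast_lru_cache.h"  // internal header, shown below

int main() {
  // num_shard_bits = -1 lets GetDefaultCacheShardBits() pick the shard count;
  // a value >= 20 would make the factory return nullptr, as shown above.
  std::shared_ptr<ROCKSDB_NAMESPACE::Cache> cache =
      ROCKSDB_NAMESPACE::NewFastLRUCache(
          32 << 20 /* capacity */, -1 /* num_shard_bits */,
          false /* strict_capacity_limit */,
          ROCKSDB_NAMESPACE::kDontChargeCacheMetadata);
  return cache != nullptr ? 0 : 1;
}
```
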

cache/fast_lru_cache.h vendored

@ -1,299 +0,0 @@
// Copyright (c) 2011-present, Facebook, Inc. All rights reserved
// This source code is licensed under both the GPLv2 (found in the
// COPYING file in the root directory) and Apache 2.0 License
// (found in the LICENSE.Apache file in the root directory).
//
// Copyright (c) 2011 The LevelDB Authors. All rights reserved.
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file. See the AUTHORS file for names of contributors.
#pragma once
#include <memory>
#include <string>
#include "cache/sharded_cache.h"
#include "port/lang.h"
#include "port/malloc.h"
#include "port/port.h"
#include "rocksdb/secondary_cache.h"
#include "util/autovector.h"
namespace ROCKSDB_NAMESPACE {
namespace fast_lru_cache {
// An experimental (under development!) alternative to LRUCache
struct LRUHandle {
void* value;
Cache::DeleterFn deleter;
LRUHandle* next_hash;
LRUHandle* next;
LRUHandle* prev;
size_t charge; // TODO(opt): Only allow uint32_t?
size_t key_length;
// The hash of key(). Used for fast sharding and comparisons.
uint32_t hash;
// The number of external refs to this entry. The cache itself is not counted.
uint32_t refs;
enum Flags : uint8_t {
// Whether this entry is referenced by the hash table.
IN_CACHE = (1 << 0),
};
uint8_t flags;
// Beginning of the key (MUST BE THE LAST FIELD IN THIS STRUCT!)
char key_data[1];
Slice key() const { return Slice(key_data, key_length); }
// Increase the reference count by 1.
void Ref() { refs++; }
// Just reduce the reference count by 1. Return true if it was last reference.
bool Unref() {
assert(refs > 0);
refs--;
return refs == 0;
}
// Return true if there are external refs, false otherwise.
bool HasRefs() const { return refs > 0; }
bool InCache() const { return flags & IN_CACHE; }
void SetInCache(bool in_cache) {
if (in_cache) {
flags |= IN_CACHE;
} else {
flags &= ~IN_CACHE;
}
}
void Free() {
assert(refs == 0);
if (deleter) {
(*deleter)(key(), value);
}
delete[] reinterpret_cast<char*>(this);
}
// Calculate the memory usage by metadata.
inline size_t CalcTotalCharge(
CacheMetadataChargePolicy metadata_charge_policy) {
size_t meta_charge = 0;
if (metadata_charge_policy == kFullChargeCacheMetadata) {
#ifdef ROCKSDB_MALLOC_USABLE_SIZE
meta_charge += malloc_usable_size(static_cast<void*>(this));
#else
// This is the size that is used when a new handle is created.
meta_charge += sizeof(LRUHandle) - 1 + key_length;
#endif
}
return charge + meta_charge;
}
};
// We provide our own simple hash table since it removes a whole bunch
// of porting hacks and is also faster than some of the built-in hash
// table implementations in some of the compiler/runtime combinations
// we have tested. E.g., readrandom speeds up by ~5% over the g++
// 4.4.3's builtin hashtable.
class LRUHandleTable {
public:
// If the table uses more hash bits than `max_upper_hash_bits`,
// it will eat into the bits used for sharding, which are constant
// for a given LRUHandleTable.
explicit LRUHandleTable(int max_upper_hash_bits);
~LRUHandleTable();
LRUHandle* Lookup(const Slice& key, uint32_t hash);
LRUHandle* Insert(LRUHandle* h);
LRUHandle* Remove(const Slice& key, uint32_t hash);
template <typename T>
void ApplyToEntriesRange(T func, uint32_t index_begin, uint32_t index_end) {
for (uint32_t i = index_begin; i < index_end; i++) {
LRUHandle* h = list_[i];
while (h != nullptr) {
auto n = h->next_hash;
assert(h->InCache());
func(h);
h = n;
}
}
}
int GetLengthBits() const { return length_bits_; }
private:
// Return a pointer to slot that points to a cache entry that
// matches key/hash. If there is no such cache entry, return a
// pointer to the trailing slot in the corresponding linked list.
LRUHandle** FindPointer(const Slice& key, uint32_t hash);
void Resize();
// Number of hash bits (upper because lower bits used for sharding)
// used for table index. Length == 1 << length_bits_
int length_bits_;
// The table consists of an array of buckets where each bucket is
// a linked list of cache entries that hash into the bucket.
std::unique_ptr<LRUHandle*[]> list_;
// Number of elements currently in the table.
uint32_t elems_;
// Set from max_upper_hash_bits (see constructor).
const int max_length_bits_;
};
// A single shard of sharded cache.
class ALIGN_AS(CACHE_LINE_SIZE) LRUCacheShard final : public CacheShard {
public:
LRUCacheShard(size_t capacity, bool strict_capacity_limit,
CacheMetadataChargePolicy metadata_charge_policy,
int max_upper_hash_bits);
~LRUCacheShard() override = default;
// Separate from constructor so caller can easily make an array of LRUCache.
// If current usage is more than the new capacity, the function will attempt
// to free the needed space.
void SetCapacity(size_t capacity) override;
// Set the flag to reject insertion if cache if full.
void SetStrictCapacityLimit(bool strict_capacity_limit) override;
// Like Cache methods, but with an extra "hash" parameter.
Status Insert(const Slice& key, uint32_t hash, void* value, size_t charge,
Cache::DeleterFn deleter, Cache::Handle** handle,
Cache::Priority priority) override;
Status Insert(const Slice& key, uint32_t hash, void* value,
const Cache::CacheItemHelper* helper, size_t charge,
Cache::Handle** handle, Cache::Priority priority) override {
return Insert(key, hash, value, charge, helper->del_cb, handle, priority);
}
Cache::Handle* Lookup(const Slice& key, uint32_t hash,
const Cache::CacheItemHelper* /*helper*/,
const Cache::CreateCallback& /*create_cb*/,
Cache::Priority /*priority*/, bool /*wait*/,
Statistics* /*stats*/) override {
return Lookup(key, hash);
}
Cache::Handle* Lookup(const Slice& key, uint32_t hash) override;
bool Release(Cache::Handle* handle, bool /*useful*/,
bool erase_if_last_ref) override {
return Release(handle, erase_if_last_ref);
}
bool IsReady(Cache::Handle* /*handle*/) override { return true; }
void Wait(Cache::Handle* /*handle*/) override {}
bool Ref(Cache::Handle* handle) override;
bool Release(Cache::Handle* handle, bool erase_if_last_ref = false) override;
void Erase(const Slice& key, uint32_t hash) override;
size_t GetUsage() const override;
size_t GetPinnedUsage() const override;
void ApplyToSomeEntries(
const std::function<void(const Slice& key, void* value, size_t charge,
DeleterFn deleter)>& callback,
uint32_t average_entries_per_lock, uint32_t* state) override;
void EraseUnRefEntries() override;
std::string GetPrintableOptions() const override;
private:
friend class LRUCache;
// Insert an item into the hash table and, if handle is null, insert into
// the LRU list. Older items are evicted as necessary. If the cache is full
// and free_handle_on_fail is true, the item is deleted and handle is set to
// nullptr.
Status InsertItem(LRUHandle* item, Cache::Handle** handle,
bool free_handle_on_fail);
void LRU_Remove(LRUHandle* e);
void LRU_Insert(LRUHandle* e);
// Free some space following strict LRU policy until enough space
// to hold (usage_ + charge) is freed or the lru list is empty
// This function is not thread safe - it needs to be executed while
// holding the mutex_.
void EvictFromLRU(size_t charge, autovector<LRUHandle*>* deleted);
// Initialized before use.
size_t capacity_;
// Whether to reject insertion if cache reaches its full capacity.
bool strict_capacity_limit_;
// Dummy head of LRU list.
// lru.prev is newest entry, lru.next is oldest entry.
// LRU contains items which can be evicted, i.e. referenced only by the cache
LRUHandle lru_;
// Pointer to head of low-pri pool in LRU list.
LRUHandle* lru_low_pri_;
// ------------^^^^^^^^^^^^^-----------
// Not frequently modified data members
// ------------------------------------
//
// We separate data members that are updated frequently from the ones that
// are not frequently updated so that they don't share the same cache line,
// which would lead to false sharing
//
// ------------------------------------
// Frequently modified data members
// ------------vvvvvvvvvvvvv-----------
LRUHandleTable table_;
// Memory size for entries residing in the cache.
size_t usage_;
// Memory size for entries residing only in the LRU list.
size_t lru_usage_;
// mutex_ protects the following state.
// We don't count mutex_ as the cache's internal state so semantically we
// don't mind mutex_ invoking the non-const actions.
mutable port::Mutex mutex_;
};
class LRUCache
#ifdef NDEBUG
final
#endif
: public ShardedCache {
public:
LRUCache(size_t capacity, int num_shard_bits, bool strict_capacity_limit,
CacheMetadataChargePolicy metadata_charge_policy =
kDontChargeCacheMetadata);
~LRUCache() override;
const char* Name() const override { return "LRUCache"; }
CacheShard* GetShard(uint32_t shard) override;
const CacheShard* GetShard(uint32_t shard) const override;
void* Value(Handle* handle) override;
size_t GetCharge(Handle* handle) const override;
uint32_t GetHash(Handle* handle) const override;
DeleterFn GetDeleter(Handle* handle) const override;
void DisownData() override;
private:
LRUCacheShard* shards_ = nullptr;
int num_shards_ = 0;
};
} // namespace fast_lru_cache
std::shared_ptr<Cache> NewFastLRUCache(
size_t capacity, int num_shard_bits = -1,
bool strict_capacity_limit = false,
CacheMetadataChargePolicy metadata_charge_policy =
kDefaultCacheMetadataChargePolicy);
} // namespace ROCKSDB_NAMESPACE
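
The max_upper_hash_bits plumbing above splits the 32-bit key hash between shard selection (low bits) and the per-shard table index (upper length_bits_ bits), which is why LRUCache passes 32 - num_shard_bits to each shard. The following is a small standalone sketch of that bit budget using made-up values; it is not RocksDB code.

```
#include <cstdint>
#include <cstdio>

int main() {
  const uint32_t hash = 0xA1B2C3D4u;  // example key hash
  const int num_shard_bits = 4;       // 16 shards in the sharded cache
  const int length_bits = 8;          // this shard's table has 1 << 8 buckets

  // Low bits pick the shard, so only the upper 32 - num_shard_bits bits are
  // usable for the table index (max_upper_hash_bits in the header above).
  const uint32_t shard = hash & ((1u << num_shard_bits) - 1);

  // The table indexes with the top length_bits, matching FindPointer():
  //   list_[hash >> (32 - length_bits_)]
  const uint32_t bucket = hash >> (32 - length_bits);

  std::printf("shard=%u bucket=%u\n", shard, bucket);
  return 0;
}
```
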

cache/lru_cache.cc vendored

@ -19,7 +19,6 @@
#include "util/mutexlock.h"
namespace ROCKSDB_NAMESPACE {
namespace lru_cache {
LRUHandleTable::LRUHandleTable(int max_upper_hash_bits)
: length_bits_(/* historical starting size*/ 4),
@ -77,12 +76,13 @@ LRUHandle** LRUHandleTable::FindPointer(const Slice& key, uint32_t hash) {
void LRUHandleTable::Resize() {
if (length_bits_ >= max_length_bits_) {
// Due to reaching limit of hash information, if we made the table bigger,
// we would allocate more addresses but only the same number would be used.
// Due to reaching limit of hash information, if we made the table
// bigger, we would allocate more addresses but only the same
// number would be used.
return;
}
if (length_bits_ >= 31) {
// Avoid undefined behavior shifting uint32_t by 32.
// Avoid undefined behavior shifting uint32_t by 32
return;
}
@ -125,7 +125,7 @@ LRUCacheShard::LRUCacheShard(
mutex_(use_adaptive_mutex),
secondary_cache_(secondary_cache) {
set_metadata_charge_policy(metadata_charge_policy);
// Make empty circular linked list.
// Make empty circular linked list
lru_.next = &lru_;
lru_.prev = &lru_;
lru_low_pri_ = &lru_;
@ -138,7 +138,7 @@ void LRUCacheShard::EraseUnRefEntries() {
MutexLock l(&mutex_);
while (lru_.next != &lru_) {
LRUHandle* old = lru_.next;
// LRU list contains only elements which can be evicted.
// LRU list contains only elements which can be evicted
assert(old->InCache() && !old->HasRefs());
LRU_Remove(old);
table_.Remove(old->key(), old->hash);
@ -168,7 +168,7 @@ void LRUCacheShard::ApplyToSomeEntries(
assert(average_entries_per_lock > 0);
// Assuming we are called with same average_entries_per_lock repeatedly,
// this simplifies some logic (index_end will not overflow).
// this simplifies some logic (index_end will not overflow)
assert(average_entries_per_lock < length || *state == 0);
uint32_t index_begin = *state >> (32 - length_bits);
@ -274,7 +274,7 @@ void LRUCacheShard::EvictFromLRU(size_t charge,
autovector<LRUHandle*>* deleted) {
while ((usage_ + charge) > capacity_ && lru_.next != &lru_) {
LRUHandle* old = lru_.next;
// LRU list contains only elements which can be evicted.
// LRU list contains only elements which can be evicted
assert(old->InCache() && !old->HasRefs());
LRU_Remove(old);
table_.Remove(old->key(), old->hash);
@ -295,11 +295,11 @@ void LRUCacheShard::SetCapacity(size_t capacity) {
EvictFromLRU(0, &last_reference_list);
}
// Try to insert the evicted entries into tiered cache.
// Free the entries outside of mutex for performance reasons.
// Try to insert the evicted entries into tiered cache
// Free the entries outside of mutex for performance reasons
for (auto entry : last_reference_list) {
if (secondary_cache_ && entry->IsSecondaryCacheCompatible() &&
!entry->IsInSecondaryCache()) {
!entry->IsPromoted()) {
secondary_cache_->Insert(entry->key(), entry->value, entry->info_.helper)
.PermitUncheckedError();
}
@ -322,7 +322,7 @@ Status LRUCacheShard::InsertItem(LRUHandle* e, Cache::Handle** handle,
MutexLock l(&mutex_);
// Free the space following strict LRU policy until enough space
// is freed or the lru list is empty.
// is freed or the lru list is empty
EvictFromLRU(total_charge, &last_reference_list);
if ((usage_ + total_charge) > capacity_ &&
@ -349,7 +349,7 @@ Status LRUCacheShard::InsertItem(LRUHandle* e, Cache::Handle** handle,
assert(old->InCache());
old->SetInCache(false);
if (!old->HasRefs()) {
// old is on LRU because it's in cache and its reference count is 0.
// old is on LRU because it's in cache and its reference count is 0
LRU_Remove(old);
size_t old_total_charge =
old->CalcTotalCharge(metadata_charge_policy_);
@ -361,7 +361,7 @@ Status LRUCacheShard::InsertItem(LRUHandle* e, Cache::Handle** handle,
if (handle == nullptr) {
LRU_Insert(e);
} else {
// If caller already holds a ref, no need to take one here.
// If caller already holds a ref, no need to take one here
if (!e->HasRefs()) {
e->Ref();
}
@ -370,11 +370,11 @@ Status LRUCacheShard::InsertItem(LRUHandle* e, Cache::Handle** handle,
}
}
// Try to insert the evicted entries into the secondary cache.
// Free the entries here outside of mutex for performance reasons.
// Try to insert the evicted entries into the secondary cache
// Free the entries here outside of mutex for performance reasons
for (auto entry : last_reference_list) {
if (secondary_cache_ && entry->IsSecondaryCacheCompatible() &&
!entry->IsInSecondaryCache()) {
!entry->IsPromoted()) {
secondary_cache_->Insert(entry->key(), entry->value, entry->info_.helper)
.PermitUncheckedError();
}
@ -390,6 +390,7 @@ void LRUCacheShard::Promote(LRUHandle* e) {
assert(secondary_handle->IsReady());
e->SetIncomplete(false);
e->SetInCache(true);
e->SetPromoted(true);
e->value = secondary_handle->Value();
e->charge = secondary_handle->Size();
delete secondary_handle;
@ -403,7 +404,7 @@ void LRUCacheShard::Promote(LRUHandle* e) {
Status s = InsertItem(e, &handle, /*free_handle_on_fail=*/false);
if (!s.ok()) {
// Item is in memory, but not accounted against the cache capacity.
// When the handle is released, the item should get deleted.
// When the handle is released, the item should get deleted
assert(!e->InCache());
}
} else {
@ -436,8 +437,8 @@ Cache::Handle* LRUCacheShard::Lookup(
}
// If handle table lookup failed, then allocate a handle outside the
// mutex if we're going to lookup in the secondary cache.
// Only support synchronous for now.
// mutex if we're going to lookup in the secondary cache
// Only support synchronous for now
// TODO: Support asynchronous lookup in secondary cache
if (!e && secondary_cache_ && helper && helper->saveto_cb) {
// For objects from the secondary cache, we expect the caller to provide
@ -446,9 +447,8 @@ Cache::Handle* LRUCacheShard::Lookup(
// accounting purposes, which we won't demote to the secondary cache
// anyway.
assert(create_cb && helper->del_cb);
bool is_in_sec_cache{false};
std::unique_ptr<SecondaryCacheResultHandle> secondary_handle =
secondary_cache_->Lookup(key, create_cb, wait, is_in_sec_cache);
secondary_cache_->Lookup(key, create_cb, wait);
if (secondary_handle != nullptr) {
e = reinterpret_cast<LRUHandle*>(
new char[sizeof(LRUHandle) - 1 + key.size()]);
@ -468,9 +468,8 @@ Cache::Handle* LRUCacheShard::Lookup(
if (wait) {
Promote(e);
e->SetIsInSecondaryCache(is_in_sec_cache);
if (!e->value) {
// The secondary cache returned a handle, but the lookup failed.
// The secondary cache returned a handle, but the lookup failed
e->Unref();
e->Free();
e = nullptr;
@ -480,9 +479,8 @@ Cache::Handle* LRUCacheShard::Lookup(
}
} else {
// If wait is false, we always return a handle and let the caller
// release the handle after checking for success or failure.
// release the handle after checking for success or failure
e->SetIncomplete(true);
e->SetIsInSecondaryCache(is_in_sec_cache);
// This may be slightly inaccurate, if the lookup eventually fails.
// But the probability is very low.
PERF_COUNTER_ADD(secondary_cache_hit_count, 1);
@ -496,7 +494,7 @@ Cache::Handle* LRUCacheShard::Lookup(
bool LRUCacheShard::Ref(Cache::Handle* h) {
LRUHandle* e = reinterpret_cast<LRUHandle*>(h);
MutexLock l(&mutex_);
// To create another reference - entry must be already externally referenced.
// To create another reference - entry must be already externally referenced
assert(e->HasRefs());
e->Ref();
return true;
@ -509,7 +507,7 @@ void LRUCacheShard::SetHighPriorityPoolRatio(double high_pri_pool_ratio) {
MaintainPoolSize();
}
bool LRUCacheShard::Release(Cache::Handle* handle, bool erase_if_last_ref) {
bool LRUCacheShard::Release(Cache::Handle* handle, bool force_erase) {
if (handle == nullptr) {
return false;
}
@ -519,15 +517,15 @@ bool LRUCacheShard::Release(Cache::Handle* handle, bool erase_if_last_ref) {
MutexLock l(&mutex_);
last_reference = e->Unref();
if (last_reference && e->InCache()) {
// The item is still in cache, and nobody else holds a reference to it.
if (usage_ > capacity_ || erase_if_last_ref) {
// The LRU list must be empty since the cache is full.
assert(lru_.next == &lru_ || erase_if_last_ref);
// Take this opportunity and remove the item.
// The item is still in cache, and nobody else holds a reference to it
if (usage_ > capacity_ || force_erase) {
// The LRU list must be empty since the cache is full
assert(lru_.next == &lru_ || force_erase);
// Take this opportunity and remove the item
table_.Remove(e->key(), e->hash);
e->SetInCache(false);
} else {
// Put the item back on the LRU list, and don't free it.
// Put the item back on the LRU list, and don't free it
LRU_Insert(e);
last_reference = false;
}
@ -544,7 +542,7 @@ bool LRUCacheShard::Release(Cache::Handle* handle, bool erase_if_last_ref) {
}
}
// Free the entry here outside of mutex for performance reasons.
// Free the entry here outside of mutex for performance reasons
if (last_reference) {
e->Free();
}
@ -556,8 +554,8 @@ Status LRUCacheShard::Insert(const Slice& key, uint32_t hash, void* value,
void (*deleter)(const Slice& key, void* value),
const Cache::CacheItemHelper* helper,
Cache::Handle** handle, Cache::Priority priority) {
// Allocate the memory here outside of the mutex.
// If the cache is full, we'll have to release it.
// Allocate the memory here outside of the mutex
// If the cache is full, we'll have to release it
// It shouldn't happen very often though.
LRUHandle* e = reinterpret_cast<LRUHandle*>(
new char[sizeof(LRUHandle) - 1 + key.size()]);
@ -605,8 +603,8 @@ void LRUCacheShard::Erase(const Slice& key, uint32_t hash) {
}
}
// Free the entry here outside of mutex for performance reasons.
// last_reference will only be true if e != nullptr.
// Free the entry here outside of mutex for performance reasons
// last_reference will only be true if e != nullptr
if (last_reference) {
e->Free();
}
@ -707,7 +705,7 @@ uint32_t LRUCache::GetHash(Handle* handle) const {
}
void LRUCache::DisownData() {
// Leak data only if that won't generate an ASAN/valgrind warning.
// Leak data only if that won't generate an ASAN/valgrind warning
if (!kMustFreeHeapAllocations) {
shards_ = nullptr;
num_shards_ = 0;
@ -760,8 +758,6 @@ void LRUCache::WaitAll(std::vector<Handle*>& handles) {
}
}
} // namespace lru_cache
std::shared_ptr<Cache> NewLRUCache(
size_t capacity, int num_shard_bits, bool strict_capacity_limit,
double high_pri_pool_ratio,
@ -769,10 +765,10 @@ std::shared_ptr<Cache> NewLRUCache(
CacheMetadataChargePolicy metadata_charge_policy,
const std::shared_ptr<SecondaryCache>& secondary_cache) {
if (num_shard_bits >= 20) {
return nullptr; // The cache cannot be sharded into too many fine pieces.
return nullptr; // the cache cannot be sharded into too many fine pieces
}
if (high_pri_pool_ratio < 0.0 || high_pri_pool_ratio > 1.0) {
// Invalid high_pri_pool_ratio
// invalid high_pri_pool_ratio
return nullptr;
}
if (num_shard_bits < 0) {

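The SetCapacity() and InsertItem() hunks above change the guard that decides whether an evicted entry is written to the secondary cache: main tests !IsInSecondaryCache() while 7.0.fb tests !IsPromoted(). A minimal sketch of that decision with the flag reads reduced to plain booleans (illustrative only, not RocksDB code):

```
#include <cstdio>

// Mirrors the filter applied to each entry on last_reference_list above:
//   secondary_cache_ && entry->IsSecondaryCacheCompatible() && !entry->IsPromoted()
static bool ShouldDemoteToSecondaryCache(bool have_secondary_cache,
                                         bool entry_is_compatible,
                                         bool entry_was_promoted) {
  return have_secondary_cache && entry_is_compatible && !entry_was_promoted;
}

int main() {
  // An entry just promoted out of the secondary cache is not written back on
  // eviction; any other secondary-cache-compatible entry is.
  std::printf("%d\n", ShouldDemoteToSecondaryCache(true, true, false));  // 1
  std::printf("%d\n", ShouldDemoteToSecondaryCache(true, true, true));   // 0
  return 0;
}
```
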
cache/lru_cache.h vendored

@ -19,7 +19,6 @@
#include "util/autovector.h"
namespace ROCKSDB_NAMESPACE {
namespace lru_cache {
// LRU cache implementation. This class is not thread-safe.
@ -82,12 +81,12 @@ struct LRUHandle {
IN_HIGH_PRI_POOL = (1 << 2),
// Whether this entry has had any lookups (hits).
HAS_HIT = (1 << 3),
// Can this be inserted into the secondary cache.
// Can this be inserted into the secondary cache
IS_SECONDARY_CACHE_COMPATIBLE = (1 << 4),
// Is the handle still being read from a lower tier.
// Is the handle still being read from a lower tier
IS_PENDING = (1 << 5),
// Whether this handle is still in a lower tier
IS_IN_SECONDARY_CACHE = (1 << 6),
// Has the item been promoted from a lower tier
IS_PROMOTED = (1 << 6),
};
uint8_t flags;
@ -130,7 +129,7 @@ struct LRUHandle {
#endif // __SANITIZE_THREAD__
}
bool IsPending() const { return flags & IS_PENDING; }
bool IsInSecondaryCache() const { return flags & IS_IN_SECONDARY_CACHE; }
bool IsPromoted() const { return flags & IS_PROMOTED; }
void SetInCache(bool in_cache) {
if (in_cache) {
@ -177,11 +176,11 @@ struct LRUHandle {
}
}
void SetIsInSecondaryCache(bool is_in_secondary_cache) {
if (is_in_secondary_cache) {
flags |= IS_IN_SECONDARY_CACHE;
void SetPromoted(bool promoted) {
if (promoted) {
flags |= IS_PROMOTED;
} else {
flags &= ~IS_IN_SECONDARY_CACHE;
flags &= ~IS_PROMOTED;
}
}
@ -209,7 +208,7 @@ struct LRUHandle {
delete[] reinterpret_cast<char*>(this);
}
// Calculate the memory usage by metadata.
// Calculate the memory usage by metadata
inline size_t CalcTotalCharge(
CacheMetadataChargePolicy metadata_charge_policy) {
size_t meta_charge = 0;
@ -217,7 +216,7 @@ struct LRUHandle {
#ifdef ROCKSDB_MALLOC_USABLE_SIZE
meta_charge += malloc_usable_size(static_cast<void*>(this));
#else
// This is the size that is used when a new handle is created.
// This is the size that is used when a new handle is created
meta_charge += sizeof(LRUHandle) - 1 + key_length;
#endif
}
@ -273,10 +272,10 @@ class LRUHandleTable {
// a linked list of cache entries that hash into the bucket.
std::unique_ptr<LRUHandle*[]> list_;
// Number of elements currently in the table.
// Number of elements currently in the table
uint32_t elems_;
// Set from max_upper_hash_bits (see constructor).
// Set from max_upper_hash_bits (see constructor)
const int max_length_bits_;
};
@ -292,7 +291,7 @@ class ALIGN_AS(CACHE_LINE_SIZE) LRUCacheShard final : public CacheShard {
// Separate from constructor so caller can easily make an array of LRUCache
// if current usage is more than new capacity, the function will attempt to
// free the needed space.
// free the needed space
virtual void SetCapacity(size_t capacity) override;
// Set the flag to reject insertion if cache if full.
@ -315,7 +314,8 @@ class ALIGN_AS(CACHE_LINE_SIZE) LRUCacheShard final : public CacheShard {
assert(helper);
return Insert(key, hash, value, charge, nullptr, helper, handle, priority);
}
// If helper_cb is null, the values of the following arguments don't matter.
// If helper_cb is null, the values of the following arguments don't
// matter
virtual Cache::Handle* Lookup(const Slice& key, uint32_t hash,
const ShardedCache::CacheItemHelper* helper,
const ShardedCache::CreateCallback& create_cb,
@ -326,14 +326,14 @@ class ALIGN_AS(CACHE_LINE_SIZE) LRUCacheShard final : public CacheShard {
nullptr);
}
virtual bool Release(Cache::Handle* handle, bool /*useful*/,
bool erase_if_last_ref) override {
return Release(handle, erase_if_last_ref);
bool force_erase) override {
return Release(handle, force_erase);
}
virtual bool IsReady(Cache::Handle* /*handle*/) override;
virtual void Wait(Cache::Handle* /*handle*/) override {}
virtual bool Ref(Cache::Handle* handle) override;
virtual bool Release(Cache::Handle* handle,
bool erase_if_last_ref = false) override;
bool force_erase = false) override;
virtual void Erase(const Slice& key, uint32_t hash) override;
// Although in some platforms the update of size_t is atomic, to make sure
@ -354,8 +354,8 @@ class ALIGN_AS(CACHE_LINE_SIZE) LRUCacheShard final : public CacheShard {
void TEST_GetLRUList(LRUHandle** lru, LRUHandle** lru_low_pri);
// Retrieves number of elements in LRU, for unit test purpose only.
// Not threadsafe.
// Retrieves number of elements in LRU, for unit test purpose only
// not threadsafe
size_t TEST_GetLRUSize();
// Retrieves high pri pool ratio
@ -365,16 +365,14 @@ class ALIGN_AS(CACHE_LINE_SIZE) LRUCacheShard final : public CacheShard {
friend class LRUCache;
// Insert an item into the hash table and, if handle is null, insert into
// the LRU list. Older items are evicted as necessary. If the cache is full
// and free_handle_on_fail is true, the item is deleted and handle is set to
// nullptr.
// and free_handle_on_fail is true, the item is deleted and handle is set to.
Status InsertItem(LRUHandle* item, Cache::Handle** handle,
bool free_handle_on_fail);
Status Insert(const Slice& key, uint32_t hash, void* value, size_t charge,
DeleterFn deleter, const Cache::CacheItemHelper* helper,
Cache::Handle** handle, Cache::Priority priority);
// Promote an item looked up from the secondary cache to the LRU cache.
// The item may be still in the secondary cache.
// It is only inserted into the hash table and not the LRU list, and only
// Promote an item looked up from the secondary cache to the LRU cache. The
// item is only inserted into the hash table and not the LRU list, and only
// if the cache is not at full capacity, as is the case during Insert. The
// caller should hold a reference on the LRUHandle. When the caller releases
// the last reference, the item is added to the LRU list.
@ -391,7 +389,7 @@ class ALIGN_AS(CACHE_LINE_SIZE) LRUCacheShard final : public CacheShard {
// Free some space following strict LRU policy until enough space
// to hold (usage_ + charge) is freed or the lru list is empty
// This function is not thread safe - it needs to be executed while
// holding the mutex_.
// holding the mutex_
void EvictFromLRU(size_t charge, autovector<LRUHandle*>* deleted);
// Initialized before use.
@ -431,10 +429,10 @@ class ALIGN_AS(CACHE_LINE_SIZE) LRUCacheShard final : public CacheShard {
// ------------vvvvvvvvvvvvv-----------
LRUHandleTable table_;
// Memory size for entries residing in the cache.
// Memory size for entries residing in the cache
size_t usage_;
// Memory size for entries residing only in the LRU list.
// Memory size for entries residing only in the LRU list
size_t lru_usage_;
// mutex_ protects the following state.
@ -469,9 +467,9 @@ class LRUCache
virtual void DisownData() override;
virtual void WaitAll(std::vector<Handle*>& handles) override;
// Retrieves number of elements in LRU, for unit test purpose only.
// Retrieves number of elements in LRU, for unit test purpose only
size_t TEST_GetLRUSize();
// Retrieves high pri pool ratio.
// Retrieves high pri pool ratio
double GetHighPriPoolRatio();
private:
@ -480,10 +478,4 @@ class LRUCache
std::shared_ptr<SecondaryCache> secondary_cache_;
};
} // namespace lru_cache
using LRUCache = lru_cache::LRUCache;
using LRUHandle = lru_cache::LRUHandle;
using LRUCacheShard = lru_cache::LRUCacheShard;
} // namespace ROCKSDB_NAMESPACE
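
Several hunks above only rename Release()'s second parameter (erase_if_last_ref on main, force_erase on 7.0.fb); the semantics are unchanged. A hedged caller-side sketch of that flag against the public Cache API, assuming a cache created with NewLRUCache as in the tests:

```
#include <memory>
#include <string>

#include "rocksdb/cache.h"

int main() {
  using namespace ROCKSDB_NAMESPACE;
  std::shared_ptr<Cache> cache = NewLRUCache(1 << 20 /* capacity */);

  auto* value = new std::string("payload");
  auto deleter = [](const Slice& /*key*/, void* v) {
    delete static_cast<std::string*>(v);
  };
  cache->Insert("key", value, value->size(), deleter).PermitUncheckedError();

  if (Cache::Handle* h = cache->Lookup("key")) {
    // With the flag set, dropping the last external reference also erases the
    // entry from the shard's hash table instead of putting it back on the LRU
    // list (see the Release() hunks above).
    cache->Release(h, /*erase_if_last_ref=*/true);
  }
  return 0;
}
```
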


@ -266,13 +266,12 @@ class TestSecondaryCache : public SecondaryCache {
}
std::unique_ptr<SecondaryCacheResultHandle> Lookup(
const Slice& key, const Cache::CreateCallback& create_cb, bool /*wait*/,
bool& is_in_sec_cache) override {
const Slice& key, const Cache::CreateCallback& create_cb,
bool /*wait*/) override {
std::string key_str = key.ToString();
TEST_SYNC_POINT_CALLBACK("TestSecondaryCache::Lookup", &key_str);
std::unique_ptr<SecondaryCacheResultHandle> secondary_handle;
is_in_sec_cache = false;
ResultType type = ResultType::SUCCESS;
auto iter = result_map_.find(key.ToString());
if (iter != result_map_.end()) {
@ -297,7 +296,6 @@ class TestSecondaryCache : public SecondaryCache {
if (s.ok()) {
secondary_handle.reset(new TestSecondaryCacheResultHandle(
cache_.get(), handle, value, charge, type));
is_in_sec_cache = true;
} else {
cache_->Release(handle);
}
@ -385,10 +383,10 @@ class DBSecondaryCacheTest : public DBTestBase {
std::unique_ptr<Env> fault_env_;
};
class LRUCacheSecondaryCacheTest : public LRUCacheTest {
class LRUSecondaryCacheTest : public LRUCacheTest {
public:
LRUCacheSecondaryCacheTest() : fail_create_(false) {}
~LRUCacheSecondaryCacheTest() {}
LRUSecondaryCacheTest() : fail_create_(false) {}
~LRUSecondaryCacheTest() {}
protected:
class TestItem {
@ -451,17 +449,16 @@ class LRUCacheSecondaryCacheTest : public LRUCacheTest {
bool fail_create_;
};
Cache::CacheItemHelper LRUCacheSecondaryCacheTest::helper_(
LRUCacheSecondaryCacheTest::SizeCallback,
LRUCacheSecondaryCacheTest::SaveToCallback,
LRUCacheSecondaryCacheTest::DeletionCallback);
Cache::CacheItemHelper LRUSecondaryCacheTest::helper_(
LRUSecondaryCacheTest::SizeCallback, LRUSecondaryCacheTest::SaveToCallback,
LRUSecondaryCacheTest::DeletionCallback);
Cache::CacheItemHelper LRUCacheSecondaryCacheTest::helper_fail_(
LRUCacheSecondaryCacheTest::SizeCallback,
LRUCacheSecondaryCacheTest::SaveToCallbackFail,
LRUCacheSecondaryCacheTest::DeletionCallback);
Cache::CacheItemHelper LRUSecondaryCacheTest::helper_fail_(
LRUSecondaryCacheTest::SizeCallback,
LRUSecondaryCacheTest::SaveToCallbackFail,
LRUSecondaryCacheTest::DeletionCallback);
TEST_F(LRUCacheSecondaryCacheTest, BasicTest) {
TEST_F(LRUSecondaryCacheTest, BasicTest) {
LRUCacheOptions opts(1024, 0, false, 0.5, nullptr, kDefaultToAdaptiveMutex,
kDontChargeCacheMetadata);
std::shared_ptr<TestSecondaryCache> secondary_cache =
@ -473,25 +470,25 @@ TEST_F(LRUCacheSecondaryCacheTest, BasicTest) {
Random rnd(301);
std::string str1 = rnd.RandomString(1020);
TestItem* item1 = new TestItem(str1.data(), str1.length());
ASSERT_OK(cache->Insert("k1", item1, &LRUCacheSecondaryCacheTest::helper_,
ASSERT_OK(cache->Insert("k1", item1, &LRUSecondaryCacheTest::helper_,
str1.length()));
std::string str2 = rnd.RandomString(1020);
TestItem* item2 = new TestItem(str2.data(), str2.length());
// k1 should be demoted to NVM
ASSERT_OK(cache->Insert("k2", item2, &LRUCacheSecondaryCacheTest::helper_,
ASSERT_OK(cache->Insert("k2", item2, &LRUSecondaryCacheTest::helper_,
str2.length()));
get_perf_context()->Reset();
Cache::Handle* handle;
handle =
cache->Lookup("k2", &LRUCacheSecondaryCacheTest::helper_,
test_item_creator, Cache::Priority::LOW, true, stats.get());
cache->Lookup("k2", &LRUSecondaryCacheTest::helper_, test_item_creator,
Cache::Priority::LOW, true, stats.get());
ASSERT_NE(handle, nullptr);
cache->Release(handle);
// This lookup should promote k1 and demote k2
handle =
cache->Lookup("k1", &LRUCacheSecondaryCacheTest::helper_,
test_item_creator, Cache::Priority::LOW, true, stats.get());
cache->Lookup("k1", &LRUSecondaryCacheTest::helper_, test_item_creator,
Cache::Priority::LOW, true, stats.get());
ASSERT_NE(handle, nullptr);
cache->Release(handle);
ASSERT_EQ(secondary_cache->num_inserts(), 2u);
@ -505,7 +502,7 @@ TEST_F(LRUCacheSecondaryCacheTest, BasicTest) {
secondary_cache.reset();
}
TEST_F(LRUCacheSecondaryCacheTest, BasicFailTest) {
TEST_F(LRUSecondaryCacheTest, BasicFailTest) {
LRUCacheOptions opts(1024, 0, false, 0.5, nullptr, kDefaultToAdaptiveMutex,
kDontChargeCacheMetadata);
std::shared_ptr<TestSecondaryCache> secondary_cache =
@ -518,15 +515,15 @@ TEST_F(LRUCacheSecondaryCacheTest, BasicFailTest) {
auto item1 = std::make_unique<TestItem>(str1.data(), str1.length());
ASSERT_TRUE(cache->Insert("k1", item1.get(), nullptr, str1.length())
.IsInvalidArgument());
ASSERT_OK(cache->Insert("k1", item1.get(),
&LRUCacheSecondaryCacheTest::helper_, str1.length()));
ASSERT_OK(cache->Insert("k1", item1.get(), &LRUSecondaryCacheTest::helper_,
str1.length()));
item1.release(); // Appease clang-analyze "potential memory leak"
Cache::Handle* handle;
handle = cache->Lookup("k2", nullptr, test_item_creator, Cache::Priority::LOW,
true);
ASSERT_EQ(handle, nullptr);
handle = cache->Lookup("k2", &LRUCacheSecondaryCacheTest::helper_,
handle = cache->Lookup("k2", &LRUSecondaryCacheTest::helper_,
test_item_creator, Cache::Priority::LOW, false);
ASSERT_EQ(handle, nullptr);
@ -534,7 +531,7 @@ TEST_F(LRUCacheSecondaryCacheTest, BasicFailTest) {
secondary_cache.reset();
}
TEST_F(LRUCacheSecondaryCacheTest, SaveFailTest) {
TEST_F(LRUSecondaryCacheTest, SaveFailTest) {
LRUCacheOptions opts(1024, 0, false, 0.5, nullptr, kDefaultToAdaptiveMutex,
kDontChargeCacheMetadata);
std::shared_ptr<TestSecondaryCache> secondary_cache =
@ -545,66 +542,25 @@ TEST_F(LRUCacheSecondaryCacheTest, SaveFailTest) {
Random rnd(301);
std::string str1 = rnd.RandomString(1020);
TestItem* item1 = new TestItem(str1.data(), str1.length());
ASSERT_OK(cache->Insert(
"k1", item1, &LRUCacheSecondaryCacheTest::helper_fail_, str1.length()));
std::string str2 = rnd.RandomString(1020);
TestItem* item2 = new TestItem(str2.data(), str2.length());
// k1 should be demoted to NVM
ASSERT_OK(cache->Insert(
"k2", item2, &LRUCacheSecondaryCacheTest::helper_fail_, str2.length()));
Cache::Handle* handle;
handle = cache->Lookup("k2", &LRUCacheSecondaryCacheTest::helper_fail_,
test_item_creator, Cache::Priority::LOW, true);
ASSERT_NE(handle, nullptr);
cache->Release(handle);
// This lookup should fail, since k1 demotion would have failed
handle = cache->Lookup("k1", &LRUCacheSecondaryCacheTest::helper_fail_,
test_item_creator, Cache::Priority::LOW, true);
ASSERT_EQ(handle, nullptr);
// Since k1 didn't get promoted, k2 should still be in cache
handle = cache->Lookup("k2", &LRUCacheSecondaryCacheTest::helper_fail_,
test_item_creator, Cache::Priority::LOW, true);
ASSERT_NE(handle, nullptr);
cache->Release(handle);
ASSERT_EQ(secondary_cache->num_inserts(), 1u);
ASSERT_EQ(secondary_cache->num_lookups(), 1u);
cache.reset();
secondary_cache.reset();
}
TEST_F(LRUCacheSecondaryCacheTest, CreateFailTest) {
LRUCacheOptions opts(1024, 0, false, 0.5, nullptr, kDefaultToAdaptiveMutex,
kDontChargeCacheMetadata);
std::shared_ptr<TestSecondaryCache> secondary_cache =
std::make_shared<TestSecondaryCache>(2048);
opts.secondary_cache = secondary_cache;
std::shared_ptr<Cache> cache = NewLRUCache(opts);
Random rnd(301);
std::string str1 = rnd.RandomString(1020);
TestItem* item1 = new TestItem(str1.data(), str1.length());
ASSERT_OK(cache->Insert("k1", item1, &LRUCacheSecondaryCacheTest::helper_,
ASSERT_OK(cache->Insert("k1", item1, &LRUSecondaryCacheTest::helper_fail_,
str1.length()));
std::string str2 = rnd.RandomString(1020);
TestItem* item2 = new TestItem(str2.data(), str2.length());
// k1 should be demoted to NVM
ASSERT_OK(cache->Insert("k2", item2, &LRUCacheSecondaryCacheTest::helper_,
ASSERT_OK(cache->Insert("k2", item2, &LRUSecondaryCacheTest::helper_fail_,
str2.length()));
Cache::Handle* handle;
SetFailCreate(true);
handle = cache->Lookup("k2", &LRUCacheSecondaryCacheTest::helper_,
handle = cache->Lookup("k2", &LRUSecondaryCacheTest::helper_fail_,
test_item_creator, Cache::Priority::LOW, true);
ASSERT_NE(handle, nullptr);
cache->Release(handle);
// This lookup should fail, since k1 creation would have failed
handle = cache->Lookup("k1", &LRUCacheSecondaryCacheTest::helper_,
// This lookup should fail, since k1 demotion would have failed
handle = cache->Lookup("k1", &LRUSecondaryCacheTest::helper_fail_,
test_item_creator, Cache::Priority::LOW, true);
ASSERT_EQ(handle, nullptr);
// Since k1 didn't get promoted, k2 should still be in cache
handle = cache->Lookup("k2", &LRUCacheSecondaryCacheTest::helper_,
handle = cache->Lookup("k2", &LRUSecondaryCacheTest::helper_fail_,
test_item_creator, Cache::Priority::LOW, true);
ASSERT_NE(handle, nullptr);
cache->Release(handle);
@ -615,7 +571,48 @@ TEST_F(LRUCacheSecondaryCacheTest, CreateFailTest) {
secondary_cache.reset();
}
TEST_F(LRUCacheSecondaryCacheTest, FullCapacityTest) {
TEST_F(LRUSecondaryCacheTest, CreateFailTest) {
LRUCacheOptions opts(1024, 0, false, 0.5, nullptr, kDefaultToAdaptiveMutex,
kDontChargeCacheMetadata);
std::shared_ptr<TestSecondaryCache> secondary_cache =
std::make_shared<TestSecondaryCache>(2048);
opts.secondary_cache = secondary_cache;
std::shared_ptr<Cache> cache = NewLRUCache(opts);
Random rnd(301);
std::string str1 = rnd.RandomString(1020);
TestItem* item1 = new TestItem(str1.data(), str1.length());
ASSERT_OK(cache->Insert("k1", item1, &LRUSecondaryCacheTest::helper_,
str1.length()));
std::string str2 = rnd.RandomString(1020);
TestItem* item2 = new TestItem(str2.data(), str2.length());
// k1 should be demoted to NVM
ASSERT_OK(cache->Insert("k2", item2, &LRUSecondaryCacheTest::helper_,
str2.length()));
Cache::Handle* handle;
SetFailCreate(true);
handle = cache->Lookup("k2", &LRUSecondaryCacheTest::helper_,
test_item_creator, Cache::Priority::LOW, true);
ASSERT_NE(handle, nullptr);
cache->Release(handle);
// This lookup should fail, since k1 creation would have failed
handle = cache->Lookup("k1", &LRUSecondaryCacheTest::helper_,
test_item_creator, Cache::Priority::LOW, true);
ASSERT_EQ(handle, nullptr);
// Since k1 didn't get promoted, k2 should still be in cache
handle = cache->Lookup("k2", &LRUSecondaryCacheTest::helper_,
test_item_creator, Cache::Priority::LOW, true);
ASSERT_NE(handle, nullptr);
cache->Release(handle);
ASSERT_EQ(secondary_cache->num_inserts(), 1u);
ASSERT_EQ(secondary_cache->num_lookups(), 1u);
cache.reset();
secondary_cache.reset();
}
TEST_F(LRUSecondaryCacheTest, FullCapacityTest) {
LRUCacheOptions opts(1024, 0, /*_strict_capacity_limit=*/true, 0.5, nullptr,
kDefaultToAdaptiveMutex, kDontChargeCacheMetadata);
std::shared_ptr<TestSecondaryCache> secondary_cache =
@ -626,28 +623,28 @@ TEST_F(LRUCacheSecondaryCacheTest, FullCapacityTest) {
Random rnd(301);
std::string str1 = rnd.RandomString(1020);
TestItem* item1 = new TestItem(str1.data(), str1.length());
ASSERT_OK(cache->Insert("k1", item1, &LRUCacheSecondaryCacheTest::helper_,
ASSERT_OK(cache->Insert("k1", item1, &LRUSecondaryCacheTest::helper_,
str1.length()));
std::string str2 = rnd.RandomString(1020);
TestItem* item2 = new TestItem(str2.data(), str2.length());
// k1 should be demoted to NVM
ASSERT_OK(cache->Insert("k2", item2, &LRUCacheSecondaryCacheTest::helper_,
ASSERT_OK(cache->Insert("k2", item2, &LRUSecondaryCacheTest::helper_,
str2.length()));
Cache::Handle* handle;
handle = cache->Lookup("k2", &LRUCacheSecondaryCacheTest::helper_,
handle = cache->Lookup("k2", &LRUSecondaryCacheTest::helper_,
test_item_creator, Cache::Priority::LOW, true);
ASSERT_NE(handle, nullptr);
// k1 promotion should fail due to the block cache being at capacity,
// but the lookup should still succeed
Cache::Handle* handle2;
handle2 = cache->Lookup("k1", &LRUCacheSecondaryCacheTest::helper_,
handle2 = cache->Lookup("k1", &LRUSecondaryCacheTest::helper_,
test_item_creator, Cache::Priority::LOW, true);
ASSERT_NE(handle2, nullptr);
// Since k1 didn't get inserted, k2 should still be in cache
cache->Release(handle);
cache->Release(handle2);
handle = cache->Lookup("k2", &LRUCacheSecondaryCacheTest::helper_,
handle = cache->Lookup("k2", &LRUSecondaryCacheTest::helper_,
test_item_creator, Cache::Priority::LOW, true);
ASSERT_NE(handle, nullptr);
cache->Release(handle);
@ -1049,7 +1046,7 @@ TEST_F(DBSecondaryCacheTest, SecondaryCacheFailureTest) {
Destroy(options);
}
TEST_F(LRUCacheSecondaryCacheTest, BasicWaitAllTest) {
TEST_F(LRUSecondaryCacheTest, BasicWaitAllTest) {
LRUCacheOptions opts(1024, 2, false, 0.5, nullptr, kDefaultToAdaptiveMutex,
kDontChargeCacheMetadata);
std::shared_ptr<TestSecondaryCache> secondary_cache =
@ -1065,8 +1062,7 @@ TEST_F(LRUCacheSecondaryCacheTest, BasicWaitAllTest) {
values.emplace_back(str);
TestItem* item = new TestItem(str.data(), str.length());
ASSERT_OK(cache->Insert("k" + std::to_string(i), item,
&LRUCacheSecondaryCacheTest::helper_,
str.length()));
&LRUSecondaryCacheTest::helper_, str.length()));
}
// Force all entries to be evicted to the secondary cache
cache->SetCapacity(0);
@ -1079,9 +1075,9 @@ TEST_F(LRUCacheSecondaryCacheTest, BasicWaitAllTest) {
{"k5", TestSecondaryCache::ResultType::FAIL}});
std::vector<Cache::Handle*> results;
for (int i = 0; i < 6; ++i) {
results.emplace_back(cache->Lookup(
"k" + std::to_string(i), &LRUCacheSecondaryCacheTest::helper_,
test_item_creator, Cache::Priority::LOW, false));
results.emplace_back(
cache->Lookup("k" + std::to_string(i), &LRUSecondaryCacheTest::helper_,
test_item_creator, Cache::Priority::LOW, false));
}
cache->WaitAll(results);
for (int i = 0; i < 6; ++i) {
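The tests above attach a TestSecondaryCache to the block cache through LRUCacheOptions. As a minimal sketch outside the diff (assuming the public LRUCacheOptions/SecondaryCache API from rocksdb/cache.h and rocksdb/secondary_cache.h; MakeCacheWithSecondary is a hypothetical helper, not RocksDB code):
```
// Sketch only: mirrors the opts.secondary_cache pattern used by the tests above.
#include <memory>
#include "rocksdb/cache.h"
#include "rocksdb/secondary_cache.h"

using namespace ROCKSDB_NAMESPACE;

std::shared_ptr<Cache> MakeCacheWithSecondary(
    std::shared_ptr<SecondaryCache> secondary) {
  LRUCacheOptions opts(/*capacity=*/1024, /*num_shard_bits=*/0,
                       /*strict_capacity_limit=*/false,
                       /*high_pri_pool_ratio=*/0.5);
  // Entries evicted from the primary LRU cache can be demoted to `secondary`,
  // and Lookup() may promote them back, as exercised by the tests above.
  opts.secondary_cache = std::move(secondary);
  return NewLRUCache(opts);
}
```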

View File

@ -104,15 +104,14 @@ bool ShardedCache::Ref(Handle* handle) {
return GetShard(Shard(hash))->Ref(handle);
}
bool ShardedCache::Release(Handle* handle, bool erase_if_last_ref) {
bool ShardedCache::Release(Handle* handle, bool force_erase) {
uint32_t hash = GetHash(handle);
return GetShard(Shard(hash))->Release(handle, erase_if_last_ref);
return GetShard(Shard(hash))->Release(handle, force_erase);
}
bool ShardedCache::Release(Handle* handle, bool useful,
bool erase_if_last_ref) {
bool ShardedCache::Release(Handle* handle, bool useful, bool force_erase) {
uint32_t hash = GetHash(handle);
return GetShard(Shard(hash))->Release(handle, useful, erase_if_last_ref);
return GetShard(Shard(hash))->Release(handle, useful, force_erase);
}
void ShardedCache::Erase(const Slice& key) {

View File

@ -37,11 +37,11 @@ class CacheShard {
Cache::Priority priority, bool wait,
Statistics* stats) = 0;
virtual bool Release(Cache::Handle* handle, bool useful,
bool erase_if_last_ref) = 0;
bool force_erase) = 0;
virtual bool IsReady(Cache::Handle* handle) = 0;
virtual void Wait(Cache::Handle* handle) = 0;
virtual bool Ref(Cache::Handle* handle) = 0;
virtual bool Release(Cache::Handle* handle, bool erase_if_last_ref) = 0;
virtual bool Release(Cache::Handle* handle, bool force_erase) = 0;
virtual void Erase(const Slice& key, uint32_t hash) = 0;
virtual void SetCapacity(size_t capacity) = 0;
virtual void SetStrictCapacityLimit(bool strict_capacity_limit) = 0;
@ -94,11 +94,11 @@ class ShardedCache : public Cache {
const CreateCallback& create_cb, Priority priority,
bool wait, Statistics* stats = nullptr) override;
virtual bool Release(Handle* handle, bool useful,
bool erase_if_last_ref = false) override;
bool force_erase = false) override;
virtual bool IsReady(Handle* handle) override;
virtual void Wait(Handle* handle) override;
virtual bool Ref(Handle* handle) override;
virtual bool Release(Handle* handle, bool erase_if_last_ref = false) override;
virtual bool Release(Handle* handle, bool force_erase = false) override;
virtual void Erase(const Slice& key) override;
virtual uint64_t NewId() override;
virtual size_t GetCapacity() const override;
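This hunk renames the boolean parameter of Cache::Release (erase_if_last_ref on main, force_erase on this branch); the behavior is unchanged. A minimal usage sketch, assuming the public Cache API from rocksdb/cache.h:
```
// Sketch only: passing true as the second argument erases the entry once the
// last outstanding handle is released, instead of leaving it in the LRU list.
#include <cassert>
#include <memory>
#include "rocksdb/cache.h"

using ROCKSDB_NAMESPACE::Cache;
using ROCKSDB_NAMESPACE::NewLRUCache;
using ROCKSDB_NAMESPACE::Slice;

static void NoopDeleter(const Slice& /*key*/, void* /*value*/) {}

int main() {
  std::shared_ptr<Cache> cache = NewLRUCache(1024 /* capacity */);
  assert(cache->Insert("k", nullptr /* value */, 1 /* charge */, &NoopDeleter).ok());
  Cache::Handle* h = cache->Lookup("k");
  assert(h != nullptr);
  cache->Release(h, /*force_erase=*/true);  // erase_if_last_ref on main
  assert(cache->Lookup("k") == nullptr);    // entry was erased above
  return 0;
}
```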

View File

@ -1,30 +0,0 @@
ifndef PYTHON
# Default to python3. Some distros like CentOS 8 do not have `python`.
ifeq ($(origin PYTHON), undefined)
PYTHON := $(shell which python3 || which python || echo python3)
endif
export PYTHON
endif
# To setup tmp directory, first recognize some old variables for setting
# test tmp directory or base tmp directory. TEST_TMPDIR is usually read
# by RocksDB tools through Env/FileSystem::GetTestDirectory.
ifeq ($(TEST_TMPDIR),)
TEST_TMPDIR := $(TMPD)
endif
ifeq ($(TEST_TMPDIR),)
ifeq ($(BASE_TMPDIR),)
BASE_TMPDIR :=$(TMPDIR)
endif
ifeq ($(BASE_TMPDIR),)
BASE_TMPDIR :=/tmp
endif
# Use /dev/shm if it has the sticky bit set (otherwise, /tmp or other
# base dir), and create a randomly-named rocksdb.XXXX directory therein.
TEST_TMPDIR := $(shell f=/dev/shm; test -k $$f || f=$(BASE_TMPDIR); \
perl -le 'use File::Temp "tempdir";' \
-e 'print tempdir("'$$f'/rocksdb.XXXX", CLEANUP => 0)')
endif
export TEST_TMPDIR
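The comment in the removed common.mk notes that RocksDB tools read the exported TEST_TMPDIR via Env/FileSystem::GetTestDirectory. A minimal sketch of that consumer side (illustrative only, assuming the public Env API):
```
// Sketch only: Env::GetTestDirectory() returns $TEST_TMPDIR when it is set,
// otherwise a per-user default under /tmp.
#include <iostream>
#include <string>
#include "rocksdb/env.h"

int main() {
  ROCKSDB_NAMESPACE::Env* env = ROCKSDB_NAMESPACE::Env::Default();
  std::string dir;
  ROCKSDB_NAMESPACE::Status s = env->GetTestDirectory(&dir);
  if (s.ok()) {
    std::cout << "test dir: " << dir << std::endl;
  }
  return s.ok() ? 0 : 1;
}
```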

View File

@ -12,7 +12,7 @@ fi
ROOT=".."
# Fetch right version of gcov
if [ -d /mnt/gvfs/third-party -a -z "$CXX" ]; then
source $ROOT/build_tools/fbcode_config_platform009.sh
source $ROOT/build_tools/fbcode_config_platform007.sh
GCOV=$GCC_BASE/bin/gcov
else
GCOV=$(which gcov)

View File

@ -1,93 +0,0 @@
# This file is used by Meta-internal infrastructure as well as by Makefile
# When included from Makefile, there are rules to build DB_STRESS_CMD. When
# used directly with `make -f crashtest.mk ...` there will be no rules to
# build DB_STRESS_CMD so it must exist prior.
DB_STRESS_CMD?=./db_stress
include common.mk
CRASHTEST_MAKE=$(MAKE) -f crash_test.mk
CRASHTEST_PY=$(PYTHON) -u tools/db_crashtest.py --stress_cmd=$(DB_STRESS_CMD)
.PHONY: crash_test crash_test_with_atomic_flush crash_test_with_txn \
crash_test_with_best_efforts_recovery crash_test_with_ts \
blackbox_crash_test blackbox_crash_test_with_atomic_flush \
blackbox_crash_test_with_txn blackbox_crash_test_with_ts \
blackbox_crash_test_with_best_efforts_recovery \
whitebox_crash_test whitebox_crash_test_with_atomic_flush \
whitebox_crash_test_with_txn whitebox_crash_test_with_ts \
blackbox_crash_test_with_multiops_wc_txn \
blackbox_crash_test_with_multiops_wp_txn
crash_test: $(DB_STRESS_CMD)
# Do not parallelize
$(CRASHTEST_MAKE) whitebox_crash_test
$(CRASHTEST_MAKE) blackbox_crash_test
crash_test_with_atomic_flush: $(DB_STRESS_CMD)
# Do not parallelize
$(CRASHTEST_MAKE) whitebox_crash_test_with_atomic_flush
$(CRASHTEST_MAKE) blackbox_crash_test_with_atomic_flush
crash_test_with_txn: $(DB_STRESS_CMD)
# Do not parallelize
$(CRASHTEST_MAKE) whitebox_crash_test_with_txn
$(CRASHTEST_MAKE) blackbox_crash_test_with_txn
crash_test_with_best_efforts_recovery: blackbox_crash_test_with_best_efforts_recovery
crash_test_with_ts: $(DB_STRESS_CMD)
# Do not parallelize
$(CRASHTEST_MAKE) whitebox_crash_test_with_ts
$(CRASHTEST_MAKE) blackbox_crash_test_with_ts
crash_test_with_multiops_wc_txn: $(DB_STRESS_CMD)
$(CRASHTEST_MAKE) blackbox_crash_test_with_multiops_wc_txn
crash_test_with_multiops_wp_txn: $(DB_STRESS_CMD)
$(CRASHTEST_MAKE) blackbox_crash_test_with_multiops_wp_txn
blackbox_crash_test: $(DB_STRESS_CMD)
$(CRASHTEST_PY) --simple blackbox $(CRASH_TEST_EXT_ARGS)
$(CRASHTEST_PY) blackbox $(CRASH_TEST_EXT_ARGS)
blackbox_crash_test_with_atomic_flush: $(DB_STRESS_CMD)
$(CRASHTEST_PY) --cf_consistency blackbox $(CRASH_TEST_EXT_ARGS)
blackbox_crash_test_with_txn: $(DB_STRESS_CMD)
$(CRASHTEST_PY) --txn blackbox $(CRASH_TEST_EXT_ARGS)
blackbox_crash_test_with_best_efforts_recovery: $(DB_STRESS_CMD)
$(CRASHTEST_PY) --test_best_efforts_recovery blackbox $(CRASH_TEST_EXT_ARGS)
blackbox_crash_test_with_ts: $(DB_STRESS_CMD)
$(CRASHTEST_PY) --enable_ts blackbox $(CRASH_TEST_EXT_ARGS)
blackbox_crash_test_with_multiops_wc_txn: $(DB_STRESS_CMD)
$(CRASHTEST_PY) --test_multiops_txn --write_policy write_committed blackbox $(CRASH_TEST_EXT_ARGS)
blackbox_crash_test_with_multiops_wp_txn: $(DB_STRESS_CMD)
$(CRASHTEST_PY) --test_multiops_txn --write_policy write_prepared blackbox $(CRASH_TEST_EXT_ARGS)
ifeq ($(CRASH_TEST_KILL_ODD),)
CRASH_TEST_KILL_ODD=888887
endif
whitebox_crash_test: $(DB_STRESS_CMD)
$(CRASHTEST_PY) --simple whitebox --random_kill_odd \
$(CRASH_TEST_KILL_ODD) $(CRASH_TEST_EXT_ARGS)
$(CRASHTEST_PY) whitebox --random_kill_odd \
$(CRASH_TEST_KILL_ODD) $(CRASH_TEST_EXT_ARGS)
whitebox_crash_test_with_atomic_flush: $(DB_STRESS_CMD)
$(CRASHTEST_PY) --cf_consistency whitebox --random_kill_odd \
$(CRASH_TEST_KILL_ODD) $(CRASH_TEST_EXT_ARGS)
whitebox_crash_test_with_txn: $(DB_STRESS_CMD)
$(CRASHTEST_PY) --txn whitebox --random_kill_odd \
$(CRASH_TEST_KILL_ODD) $(CRASH_TEST_EXT_ARGS)
whitebox_crash_test_with_ts: $(DB_STRESS_CMD)
$(CRASHTEST_PY) --enable_ts whitebox --random_kill_odd \
$(CRASH_TEST_KILL_ODD) $(CRASH_TEST_EXT_ARGS)

View File

@ -23,7 +23,7 @@ Status ArenaWrappedDBIter::GetProperty(std::string prop_name,
if (prop_name == "rocksdb.iterator.super-version-number") {
// First try to pass the value returned from inner iterator.
if (!db_iter_->GetProperty(prop_name, prop).ok()) {
*prop = std::to_string(sv_number_);
*prop = ToString(sv_number_);
}
return Status::OK();
}
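The fallback above computes the super version number when the inner iterator cannot report it. A minimal caller-side sketch, assuming the public Iterator::GetProperty API (SuperVersionNumber is a hypothetical helper):
```
// Sketch only: reads the property handled in the hunk above.
#include <string>
#include "rocksdb/iterator.h"

std::string SuperVersionNumber(ROCKSDB_NAMESPACE::Iterator* it) {
  std::string prop;
  ROCKSDB_NAMESPACE::Status s =
      it->GetProperty("rocksdb.iterator.super-version-number", &prop);
  return s.ok() ? prop : std::string("unknown");
}
```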

View File

@ -96,9 +96,9 @@ class BlobIndex {
assert(slice.size() > 0);
type_ = static_cast<Type>(*slice.data());
if (type_ >= Type::kUnknown) {
return Status::Corruption(kErrorMessage,
"Unknown blob index type: " +
std::to_string(static_cast<char>(type_)));
return Status::Corruption(
kErrorMessage,
"Unknown blob index type: " + ToString(static_cast<char>(type_)));
}
slice = Slice(slice.data() + 1, slice.size() - 1);
if (HasTTL()) {

View File

@ -366,26 +366,19 @@ TEST_F(DBBlobBasicTest, GetBlob_CorruptIndex) {
Reopen(options);
constexpr char key[] = "key";
constexpr char blob[] = "blob";
ASSERT_OK(Put(key, blob));
// Fake a corrupt blob index.
const std::string blob_index("foobar");
WriteBatch batch;
ASSERT_OK(WriteBatchInternal::PutBlobIndex(&batch, 0, key, blob_index));
ASSERT_OK(db_->Write(WriteOptions(), &batch));
ASSERT_OK(Flush());
SyncPoint::GetInstance()->SetCallBack(
"Version::Get::TamperWithBlobIndex", [](void* arg) {
Slice* const blob_index = static_cast<Slice*>(arg);
assert(blob_index);
assert(!blob_index->empty());
blob_index->remove_prefix(1);
});
SyncPoint::GetInstance()->EnableProcessing();
PinnableSlice result;
ASSERT_TRUE(db_->Get(ReadOptions(), db_->DefaultColumnFamily(), key, &result)
.IsCorruption());
SyncPoint::GetInstance()->DisableProcessing();
SyncPoint::GetInstance()->ClearAllCallBacks();
}
TEST_F(DBBlobBasicTest, MultiGetBlob_CorruptIndex) {
@ -408,27 +401,17 @@ TEST_F(DBBlobBasicTest, MultiGetBlob_CorruptIndex) {
}
constexpr char key[] = "key";
constexpr char blob[] = "blob";
ASSERT_OK(Put(key, blob));
keys[kNumOfKeys] = key;
{
// Fake a corrupt blob index.
const std::string blob_index("foobar");
WriteBatch batch;
ASSERT_OK(WriteBatchInternal::PutBlobIndex(&batch, 0, key, blob_index));
ASSERT_OK(db_->Write(WriteOptions(), &batch));
keys[kNumOfKeys] = Slice(static_cast<const char*>(key), sizeof(key) - 1);
}
ASSERT_OK(Flush());
SyncPoint::GetInstance()->SetCallBack(
"Version::MultiGet::TamperWithBlobIndex", [&key](void* arg) {
KeyContext* const key_context = static_cast<KeyContext*>(arg);
assert(key_context);
assert(key_context->key);
if (*(key_context->key) == key) {
Slice* const blob_index = key_context->value;
assert(blob_index);
assert(!blob_index->empty());
blob_index->remove_prefix(1);
}
});
SyncPoint::GetInstance()->EnableProcessing();
std::array<PinnableSlice, kNumOfKeys + 1> values;
std::array<Status, kNumOfKeys + 1> statuses;
db_->MultiGet(ReadOptions(), dbfull()->DefaultColumnFamily(), kNumOfKeys + 1,
@ -442,9 +425,6 @@ TEST_F(DBBlobBasicTest, MultiGetBlob_CorruptIndex) {
ASSERT_TRUE(statuses[i].IsCorruption());
}
}
SyncPoint::GetInstance()->DisableProcessing();
SyncPoint::GetInstance()->ClearAllCallBacks();
}
TEST_F(DBBlobBasicTest, MultiGetBlob_ExceedSoftLimit) {
@ -753,14 +733,6 @@ TEST_F(DBBlobBasicTest, Properties) {
&live_blob_file_size));
ASSERT_EQ(live_blob_file_size, total_expected_size);
// Total amount of garbage in live blob files
{
uint64_t live_blob_file_garbage_size = 0;
ASSERT_TRUE(db_->GetIntProperty(DB::Properties::kLiveBlobFileGarbageSize,
&live_blob_file_garbage_size));
ASSERT_EQ(live_blob_file_garbage_size, 0);
}
// Total size of all blob files across all versions
// Note: this should be the same as above since we only have one
// version at this point.
@ -796,14 +768,6 @@ TEST_F(DBBlobBasicTest, Properties) {
<< "\nBlob file space amplification: " << expected_space_amp << '\n';
ASSERT_EQ(blob_stats, oss.str());
// Total amount of garbage in live blob files
{
uint64_t live_blob_file_garbage_size = 0;
ASSERT_TRUE(db_->GetIntProperty(DB::Properties::kLiveBlobFileGarbageSize,
&live_blob_file_garbage_size));
ASSERT_EQ(live_blob_file_garbage_size, expected_garbage_size);
}
}
TEST_F(DBBlobBasicTest, PropertiesMultiVersion) {
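The Properties hunks above drop the checks for DB::Properties::kLiveBlobFileGarbageSize, a property present on main but not on this branch. A minimal sketch of querying it where available (illustrative only):
```
// Sketch only: GetIntProperty returns false if the property is not supported
// by the running version, so callers should not assume it exists.
#include <cstdint>
#include "rocksdb/db.h"

uint64_t LiveBlobFileGarbage(ROCKSDB_NAMESPACE::DB* db) {
  uint64_t garbage = 0;
  if (!db->GetIntProperty(
          ROCKSDB_NAMESPACE::DB::Properties::kLiveBlobFileGarbageSize,
          &garbage)) {
    return 0;  // property unavailable on this branch
  }
  return garbage;
}
```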

View File

@ -415,30 +415,16 @@ TEST_F(DBBlobCompactionTest, CorruptedBlobIndex) {
new ValueMutationFilter(""));
options.compaction_filter = compaction_filter_guard.get();
DestroyAndReopen(options);
// Mock a corrupted blob index
constexpr char key[] = "key";
constexpr char blob[] = "blob";
ASSERT_OK(Put(key, blob));
std::string blob_idx("blob_idx");
WriteBatch write_batch;
ASSERT_OK(WriteBatchInternal::PutBlobIndex(&write_batch, 0, key, blob_idx));
ASSERT_OK(db_->Write(WriteOptions(), &write_batch));
ASSERT_OK(Flush());
SyncPoint::GetInstance()->SetCallBack(
"CompactionIterator::InvokeFilterIfNeeded::TamperWithBlobIndex",
[](void* arg) {
Slice* const blob_index = static_cast<Slice*>(arg);
assert(blob_index);
assert(!blob_index->empty());
blob_index->remove_prefix(1);
});
SyncPoint::GetInstance()->EnableProcessing();
ASSERT_TRUE(db_->CompactRange(CompactRangeOptions(), /*begin=*/nullptr,
/*end=*/nullptr)
.IsCorruption());
SyncPoint::GetInstance()->DisableProcessing();
SyncPoint::GetInstance()->ClearAllCallBacks();
Close();
}

View File

@ -13,7 +13,6 @@
#include <vector>
#include "db/arena_wrapped_db_iter.h"
#include "db/blob/blob_index.h"
#include "db/column_family.h"
#include "db/db_iter.h"
#include "db/db_test_util.h"
@ -139,39 +138,20 @@ class DBBlobIndexTest : public DBTestBase {
}
};
// Note: the following test case pertains to the StackableDB-based BlobDB
// implementation. We should be able to write kTypeBlobIndex to memtables and
// SST files.
// Should be able to write kTypeBlobIndex to memtables and SST files.
TEST_F(DBBlobIndexTest, Write) {
for (auto tier : kAllTiers) {
DestroyAndReopen(GetTestOptions());
std::vector<std::pair<std::string, std::string>> key_values;
constexpr size_t num_key_values = 5;
key_values.reserve(num_key_values);
for (size_t i = 1; i <= num_key_values; ++i) {
std::string key = "key" + std::to_string(i);
std::string blob_index;
BlobIndex::EncodeInlinedTTL(&blob_index, /* expiration */ 9876543210,
"blob" + std::to_string(i));
key_values.emplace_back(std::move(key), std::move(blob_index));
}
for (const auto& key_value : key_values) {
for (int i = 1; i <= 5; i++) {
std::string index = ToString(i);
WriteBatch batch;
ASSERT_OK(PutBlobIndex(&batch, key_value.first, key_value.second));
ASSERT_OK(PutBlobIndex(&batch, "key" + index, "blob" + index));
ASSERT_OK(Write(&batch));
}
MoveDataTo(tier);
for (const auto& key_value : key_values) {
ASSERT_EQ(GetBlobIndex(key_value.first), key_value.second);
for (int i = 1; i <= 5; i++) {
std::string index = ToString(i);
ASSERT_EQ("blob" + index, GetBlobIndex("key" + index));
}
}
}
@ -184,19 +164,13 @@ TEST_F(DBBlobIndexTest, Write) {
// accidentally opening the base DB of a stacked BlobDB and actual corruption
// when using the integrated BlobDB.
TEST_F(DBBlobIndexTest, Get) {
std::string blob_index;
BlobIndex::EncodeInlinedTTL(&blob_index, /* expiration */ 9876543210, "blob");
for (auto tier : kAllTiers) {
DestroyAndReopen(GetTestOptions());
WriteBatch batch;
ASSERT_OK(batch.Put("key", "value"));
ASSERT_OK(PutBlobIndex(&batch, "blob_key", blob_index));
ASSERT_OK(PutBlobIndex(&batch, "blob_key", "blob_index"));
ASSERT_OK(Write(&batch));
MoveDataTo(tier);
// Verify normal value
bool is_blob_index = false;
PinnableSlice value;
@ -204,7 +178,6 @@ TEST_F(DBBlobIndexTest, Get) {
ASSERT_EQ("value", GetImpl("key"));
ASSERT_EQ("value", GetImpl("key", &is_blob_index));
ASSERT_FALSE(is_blob_index);
// Verify blob index
if (tier <= kImmutableMemtables) {
ASSERT_TRUE(Get("blob_key", &value).IsNotSupported());
@ -213,7 +186,7 @@ TEST_F(DBBlobIndexTest, Get) {
ASSERT_TRUE(Get("blob_key", &value).IsCorruption());
ASSERT_EQ("CORRUPTION", GetImpl("blob_key"));
}
ASSERT_EQ(blob_index, GetImpl("blob_key", &is_blob_index));
ASSERT_EQ("blob_index", GetImpl("blob_key", &is_blob_index));
ASSERT_TRUE(is_blob_index);
}
}
@ -223,14 +196,11 @@ TEST_F(DBBlobIndexTest, Get) {
// if blob index is updated with a normal value. See the test case above for
// more details.
TEST_F(DBBlobIndexTest, Updated) {
std::string blob_index;
BlobIndex::EncodeInlinedTTL(&blob_index, /* expiration */ 9876543210, "blob");
for (auto tier : kAllTiers) {
DestroyAndReopen(GetTestOptions());
WriteBatch batch;
for (int i = 0; i < 10; i++) {
ASSERT_OK(PutBlobIndex(&batch, "key" + std::to_string(i), blob_index));
ASSERT_OK(PutBlobIndex(&batch, "key" + ToString(i), "blob_index"));
}
ASSERT_OK(Write(&batch));
// Avoid blob values from being purged.
@ -248,7 +218,7 @@ TEST_F(DBBlobIndexTest, Updated) {
ASSERT_OK(dbfull()->DeleteRange(WriteOptions(), cfh(), "key6", "key9"));
MoveDataTo(tier);
for (int i = 0; i < 10; i++) {
ASSERT_EQ(blob_index, GetBlobIndex("key" + std::to_string(i), snapshot));
ASSERT_EQ("blob_index", GetBlobIndex("key" + ToString(i), snapshot));
}
ASSERT_EQ("new_value", Get("key1"));
if (tier <= kImmutableMemtables) {
@ -260,9 +230,9 @@ TEST_F(DBBlobIndexTest, Updated) {
ASSERT_EQ("NOT_FOUND", Get("key4"));
ASSERT_EQ("a,b,c", GetImpl("key5"));
for (int i = 6; i < 9; i++) {
ASSERT_EQ("NOT_FOUND", Get("key" + std::to_string(i)));
ASSERT_EQ("NOT_FOUND", Get("key" + ToString(i)));
}
ASSERT_EQ(blob_index, GetBlobIndex("key9"));
ASSERT_EQ("blob_index", GetBlobIndex("key9"));
dbfull()->ReleaseSnapshot(snapshot);
}
}
@ -301,7 +271,7 @@ TEST_F(DBBlobIndexTest, Iterate) {
};
auto get_value = [&](int index, int version) {
return get_key(index) + "_value" + std::to_string(version);
return get_key(index) + "_value" + ToString(version);
};
auto check_iterator = [&](Iterator* iterator, Status::Code expected_status,
@ -501,7 +471,7 @@ TEST_F(DBBlobIndexTest, IntegratedBlobIterate) {
auto get_key = [](size_t index) { return ("key" + std::to_string(index)); };
auto get_value = [&](size_t index, size_t version) {
return get_key(index) + "_value" + std::to_string(version);
return get_key(index) + "_value" + ToString(version);
};
auto check_iterator = [&](Iterator* iterator, Status expected_status,

View File

@ -62,9 +62,9 @@ Status BuildTable(
FileMetaData* meta, std::vector<BlobFileAddition>* blob_file_additions,
std::vector<SequenceNumber> snapshots,
SequenceNumber earliest_write_conflict_snapshot,
SequenceNumber job_snapshot, SnapshotChecker* snapshot_checker,
bool paranoid_file_checks, InternalStats* internal_stats,
IOStatus* io_status, const std::shared_ptr<IOTracer>& io_tracer,
SnapshotChecker* snapshot_checker, bool paranoid_file_checks,
InternalStats* internal_stats, IOStatus* io_status,
const std::shared_ptr<IOTracer>& io_tracer,
BlobFileCreationReason blob_creation_reason, EventLogger* event_logger,
int job_id, const Env::IOPriority io_priority,
TableProperties* table_properties, Env::WriteLifeTimeHint write_hint,
@ -115,7 +115,6 @@ Status BuildTable(
assert(fs);
TableProperties tp;
bool table_file_created = false;
if (iter->Valid() || !range_del_agg->IsEmpty()) {
std::unique_ptr<CompactionFilter> compaction_filter;
if (ioptions.compaction_filter_factory != nullptr &&
@ -159,8 +158,6 @@ Status BuildTable(
file_checksum_func_name);
return s;
}
table_file_created = true;
FileTypeSet tmp_set = ioptions.checksum_handoff_file_types;
file->SetIOPriority(io_priority);
file->SetWriteLifeTimeHint(write_hint);
@ -192,14 +189,12 @@ Status BuildTable(
CompactionIterator c_iter(
iter, tboptions.internal_comparator.user_comparator(), &merge,
kMaxSequenceNumber, &snapshots, earliest_write_conflict_snapshot,
job_snapshot, snapshot_checker, env,
ShouldReportDetailedTime(env, ioptions.stats),
snapshot_checker, env, ShouldReportDetailedTime(env, ioptions.stats),
true /* internal key corruption is not ok */, range_del_agg.get(),
blob_file_builder.get(), ioptions.allow_data_in_errors,
ioptions.enforce_single_del_contracts,
/*compaction=*/nullptr, compaction_filter.get(),
/*shutting_down=*/nullptr,
/*manual_compaction_paused=*/nullptr,
/*preserve_deletes_seqnum=*/0, /*manual_compaction_paused=*/nullptr,
/*manual_compaction_canceled=*/nullptr, db_options.info_log,
full_history_ts_low);
@ -216,11 +211,7 @@ Status BuildTable(
break;
}
builder->Add(key, value);
s = meta->UpdateBoundaries(key, value, ikey.sequence, ikey.type);
if (!s.ok()) {
break;
}
meta->UpdateBoundaries(key, value, ikey.sequence, ikey.type);
// TODO(noetzli): Update stats after flush, too.
if (io_priority == Env::IO_HIGH &&
@ -327,7 +318,6 @@ Status BuildTable(
// TODO Also check the IO status when create the Iterator.
TEST_SYNC_POINT("BuildTable:BeforeOutputValidation");
if (s.ok() && !empty) {
// Verify that the table is usable
// We set for_compaction to false and don't OptimizeForCompactionTableRead
@ -375,17 +365,15 @@ Status BuildTable(
constexpr IODebugContext* dbg = nullptr;
if (table_file_created) {
Status ignored = fs->DeleteFile(fname, IOOptions(), dbg);
ignored.PermitUncheckedError();
}
Status ignored = fs->DeleteFile(fname, IOOptions(), dbg);
ignored.PermitUncheckedError();
assert(blob_file_additions || blob_file_paths.empty());
if (blob_file_additions) {
for (const std::string& blob_file_path : blob_file_paths) {
Status ignored = DeleteDBFile(&db_options, blob_file_path, dbname,
/*force_bg=*/false, /*force_fg=*/false);
ignored = DeleteDBFile(&db_options, blob_file_path, dbname,
/*force_bg=*/false, /*force_fg=*/false);
ignored.PermitUncheckedError();
TEST_SYNC_POINT("BuildTable::AfterDeleteFile");
}

View File

@ -57,9 +57,9 @@ extern Status BuildTable(
FileMetaData* meta, std::vector<BlobFileAddition>* blob_file_additions,
std::vector<SequenceNumber> snapshots,
SequenceNumber earliest_write_conflict_snapshot,
SequenceNumber job_snapshot, SnapshotChecker* snapshot_checker,
bool paranoid_file_checks, InternalStats* internal_stats,
IOStatus* io_status, const std::shared_ptr<IOTracer>& io_tracer,
SnapshotChecker* snapshot_checker, bool paranoid_file_checks,
InternalStats* internal_stats, IOStatus* io_status,
const std::shared_ptr<IOTracer>& io_tracer,
BlobFileCreationReason blob_creation_reason,
EventLogger* event_logger = nullptr, int job_id = 0,
const Env::IOPriority io_priority = Env::IO_HIGH,

db/c.cc (50 lines changed)
View File

@ -1163,43 +1163,6 @@ void rocksdb_multi_get_cf(
}
}
void rocksdb_batched_multi_get_cf(rocksdb_t* db,
const rocksdb_readoptions_t* options,
rocksdb_column_family_handle_t* column_family,
size_t num_keys, const char* const* keys_list,
const size_t* keys_list_sizes,
rocksdb_pinnableslice_t** values, char** errs,
const bool sorted_input) {
Slice* key_slices = new Slice[num_keys];
PinnableSlice* value_slices = new PinnableSlice[num_keys];
Status* statuses = new Status[num_keys];
for (size_t i = 0; i < num_keys; ++i) {
key_slices[i] = Slice(keys_list[i], keys_list_sizes[i]);
}
db->rep->MultiGet(options->rep, column_family->rep, num_keys, key_slices,
value_slices, statuses, sorted_input);
for (size_t i = 0; i < num_keys; ++i) {
if (statuses[i].ok()) {
values[i] = new (rocksdb_pinnableslice_t);
values[i]->rep = std::move(value_slices[i]);
errs[i] = nullptr;
} else {
values[i] = nullptr;
if (!statuses[i].IsNotFound()) {
errs[i] = strdup(statuses[i].ToString().c_str());
} else {
errs[i] = nullptr;
}
}
}
delete[] key_slices;
delete[] value_slices;
delete[] statuses;
}
unsigned char rocksdb_key_may_exist(rocksdb_t* db,
const rocksdb_readoptions_t* options,
const char* key, size_t key_len,
@ -2334,6 +2297,11 @@ void rocksdb_block_based_options_set_data_block_hash_ratio(
options->rep.data_block_hash_table_util_ratio = v;
}
void rocksdb_block_based_options_set_hash_index_allow_collision(
rocksdb_block_based_table_options_t* options, unsigned char v) {
options->rep.hash_index_allow_collision = v;
}
void rocksdb_block_based_options_set_cache_index_and_filter_blocks(
rocksdb_block_based_table_options_t* options, unsigned char v) {
options->rep.cache_index_and_filter_blocks = v;
@ -4230,14 +4198,6 @@ rocksdb_cache_t* rocksdb_cache_create_lru(size_t capacity) {
return c;
}
rocksdb_cache_t* rocksdb_cache_create_lru_with_strict_capacity_limit(
size_t capacity) {
rocksdb_cache_t* c = new rocksdb_cache_t;
c->rep = NewLRUCache(capacity);
c->rep->SetStrictCapacityLimit(true);
return c;
}
rocksdb_cache_t* rocksdb_cache_create_lru_opts(
rocksdb_lru_cache_options_t* opt) {
rocksdb_cache_t* c = new rocksdb_cache_t;

View File

@ -1260,18 +1260,15 @@ int main(int argc, char** argv) {
rocksdb_writebatch_clear(wb);
rocksdb_writebatch_put_cf(wb, handles[1], "bar", 3, "b", 1);
rocksdb_writebatch_put_cf(wb, handles[1], "box", 3, "c", 1);
rocksdb_writebatch_put_cf(wb, handles[1], "buff", 4, "rocksdb", 7);
rocksdb_writebatch_delete_cf(wb, handles[1], "bar", 3);
rocksdb_write(db, woptions, wb, &err);
CheckNoError(err);
CheckGetCF(db, roptions, handles[1], "baz", NULL);
CheckGetCF(db, roptions, handles[1], "bar", NULL);
CheckGetCF(db, roptions, handles[1], "box", "c");
CheckGetCF(db, roptions, handles[1], "buff", "rocksdb");
CheckPinGetCF(db, roptions, handles[1], "baz", NULL);
CheckPinGetCF(db, roptions, handles[1], "bar", NULL);
CheckPinGetCF(db, roptions, handles[1], "box", "c");
CheckPinGetCF(db, roptions, handles[1], "buff", "rocksdb");
rocksdb_writebatch_destroy(wb);
rocksdb_flush_wal(db, 1, &err);
@ -1302,26 +1299,6 @@ int main(int argc, char** argv) {
Free(&vals[i]);
}
{
const char* batched_keys[4] = {"box", "buff", "barfooxx", "box"};
const size_t batched_keys_sizes[4] = {3, 4, 8, 3};
const char* expected_value[4] = {"c", "rocksdb", NULL, "c"};
char* batched_errs[4];
rocksdb_pinnableslice_t* pvals[4];
rocksdb_batched_multi_get_cf(db, roptions, handles[1], 4, batched_keys,
batched_keys_sizes, pvals, batched_errs,
false);
const char* val;
size_t val_len;
for (i = 0; i < 4; ++i) {
val = rocksdb_pinnableslice_value(pvals[i], &val_len);
CheckNoError(batched_errs[i]);
CheckEqual(expected_value[i], val, val_len);
rocksdb_pinnableslice_destroy(pvals[i]);
}
}
{
unsigned char value_found = 0;
@ -1353,7 +1330,7 @@ int main(int argc, char** argv) {
for (i = 0; rocksdb_iter_valid(iter) != 0; rocksdb_iter_next(iter)) {
i++;
}
CheckCondition(i == 4);
CheckCondition(i == 3);
rocksdb_iter_get_error(iter, &err);
CheckNoError(err);
rocksdb_iter_destroy(iter);
@ -1377,7 +1354,7 @@ int main(int argc, char** argv) {
for (i = 0; rocksdb_iter_valid(iter) != 0; rocksdb_iter_next(iter)) {
i++;
}
CheckCondition(i == 4);
CheckCondition(i == 3);
rocksdb_iter_get_error(iter, &err);
CheckNoError(err);
rocksdb_iter_destroy(iter);
@ -2471,7 +2448,7 @@ int main(int argc, char** argv) {
rocksdb_fifo_compaction_options_destroy(fco);
}
StartPhase("backup_engine_option");
StartPhase("backupable_db_option");
{
rocksdb_backup_engine_options_t* bdo;
bdo = rocksdb_backup_engine_options_create("path");

View File

@ -501,8 +501,7 @@ std::vector<std::string> ColumnFamilyData::GetDbPaths() const {
return paths;
}
const uint32_t ColumnFamilyData::kDummyColumnFamilyDataId =
std::numeric_limits<uint32_t>::max();
const uint32_t ColumnFamilyData::kDummyColumnFamilyDataId = port::kMaxUint32;
ColumnFamilyData::ColumnFamilyData(
uint32_t id, const std::string& name, Version* _dummy_versions,
@ -827,8 +826,8 @@ int GetL0ThresholdSpeedupCompaction(int level0_file_num_compaction_trigger,
// condition.
// Or twice as compaction trigger, if it is smaller.
int64_t res = std::min(twice_level0_trigger, one_fourth_trigger_slowdown);
if (res >= std::numeric_limits<int32_t>::max()) {
return std::numeric_limits<int32_t>::max();
if (res >= port::kMaxInt32) {
return port::kMaxInt32;
} else {
// res fits in int
return static_cast<int>(res);
@ -1165,12 +1164,12 @@ Compaction* ColumnFamilyData::CompactRange(
int output_level, const CompactRangeOptions& compact_range_options,
const InternalKey* begin, const InternalKey* end,
InternalKey** compaction_end, bool* conflict,
uint64_t max_file_num_to_ignore, const std::string& trim_ts) {
uint64_t max_file_num_to_ignore) {
auto* result = compaction_picker_->CompactRange(
GetName(), mutable_cf_options, mutable_db_options,
current_->storage_info(), input_level, output_level,
compact_range_options, begin, end, compaction_end, conflict,
max_file_num_to_ignore, trim_ts);
max_file_num_to_ignore);
if (result != nullptr) {
result->SetInputVersion(current_);
}

View File

@ -9,10 +9,10 @@
#pragma once
#include <atomic>
#include <string>
#include <unordered_map>
#include <string>
#include <vector>
#include <atomic>
#include "db/memtable_list.h"
#include "db/table_cache.h"
@ -25,7 +25,6 @@
#include "rocksdb/env.h"
#include "rocksdb/options.h"
#include "trace_replay/block_cache_tracer.h"
#include "util/hash_containers.h"
#include "util/thread_local.h"
namespace ROCKSDB_NAMESPACE {
@ -414,8 +413,7 @@ class ColumnFamilyData {
const CompactRangeOptions& compact_range_options,
const InternalKey* begin, const InternalKey* end,
InternalKey** compaction_end, bool* manual_conflict,
uint64_t max_file_num_to_ignore,
const std::string& trim_ts);
uint64_t max_file_num_to_ignore);
CompactionPicker* compaction_picker() { return compaction_picker_.get(); }
// thread-safe
@ -706,8 +704,8 @@ class ColumnFamilySet {
// * when reading, at least one condition needs to be satisfied:
// 1. DB mutex locked
// 2. accessed from a single-threaded write thread
UnorderedMap<std::string, uint32_t> column_families_;
UnorderedMap<uint32_t, ColumnFamilyData*> column_family_data_;
std::unordered_map<std::string, uint32_t> column_families_;
std::unordered_map<uint32_t, ColumnFamilyData*> column_family_data_;
uint32_t max_column_family_;
const FileOptions file_options_;

View File

@ -383,7 +383,7 @@ class ColumnFamilyTestBase : public testing::Test {
int NumTableFilesAtLevel(int level, int cf) {
return GetProperty(cf,
"rocksdb.num-files-at-level" + std::to_string(level));
"rocksdb.num-files-at-level" + ToString(level));
}
#ifndef ROCKSDB_LITE
@ -783,7 +783,7 @@ TEST_P(ColumnFamilyTest, BulkAddDrop) {
std::vector<std::string> cf_names;
std::vector<ColumnFamilyHandle*> cf_handles;
for (int i = 1; i <= kNumCF; i++) {
cf_names.push_back("cf1-" + std::to_string(i));
cf_names.push_back("cf1-" + ToString(i));
}
ASSERT_OK(db_->CreateColumnFamilies(cf_options, cf_names, &cf_handles));
for (int i = 1; i <= kNumCF; i++) {
@ -796,8 +796,7 @@ TEST_P(ColumnFamilyTest, BulkAddDrop) {
}
cf_handles.clear();
for (int i = 1; i <= kNumCF; i++) {
cf_descriptors.emplace_back("cf2-" + std::to_string(i),
ColumnFamilyOptions());
cf_descriptors.emplace_back("cf2-" + ToString(i), ColumnFamilyOptions());
}
ASSERT_OK(db_->CreateColumnFamilies(cf_descriptors, &cf_handles));
for (int i = 1; i <= kNumCF; i++) {
@ -821,7 +820,7 @@ TEST_P(ColumnFamilyTest, DropTest) {
Open({"default"});
CreateColumnFamiliesAndReopen({"pikachu"});
for (int i = 0; i < 100; ++i) {
ASSERT_OK(Put(1, std::to_string(i), "bar" + std::to_string(i)));
ASSERT_OK(Put(1, ToString(i), "bar" + ToString(i)));
}
ASSERT_OK(Flush(1));
@ -1345,7 +1344,7 @@ TEST_P(ColumnFamilyTest, DifferentCompactionStyles) {
PutRandomData(1, 10, 12000);
PutRandomData(1, 1, 10);
WaitForFlush(1);
AssertFilesPerLevel(std::to_string(i + 1), 1);
AssertFilesPerLevel(ToString(i + 1), 1);
}
// SETUP column family "two" -- level style with 4 levels
@ -1353,7 +1352,7 @@ TEST_P(ColumnFamilyTest, DifferentCompactionStyles) {
PutRandomData(2, 10, 12000);
PutRandomData(2, 1, 10);
WaitForFlush(2);
AssertFilesPerLevel(std::to_string(i + 1), 2);
AssertFilesPerLevel(ToString(i + 1), 2);
}
// TRIGGER compaction "one"
@ -1417,7 +1416,7 @@ TEST_P(ColumnFamilyTest, MultipleManualCompactions) {
PutRandomData(1, 10, 12000, true);
PutRandomData(1, 1, 10, true);
WaitForFlush(1);
AssertFilesPerLevel(std::to_string(i + 1), 1);
AssertFilesPerLevel(ToString(i + 1), 1);
}
bool cf_1_1 = true;
ROCKSDB_NAMESPACE::SyncPoint::GetInstance()->LoadDependency(
@ -1447,7 +1446,7 @@ TEST_P(ColumnFamilyTest, MultipleManualCompactions) {
PutRandomData(2, 10, 12000);
PutRandomData(2, 1, 10);
WaitForFlush(2);
AssertFilesPerLevel(std::to_string(i + 1), 2);
AssertFilesPerLevel(ToString(i + 1), 2);
}
threads.emplace_back([&] {
TEST_SYNC_POINT("ColumnFamilyTest::MultiManual:1");
@ -1534,7 +1533,7 @@ TEST_P(ColumnFamilyTest, AutomaticAndManualCompactions) {
PutRandomData(1, 10, 12000, true);
PutRandomData(1, 1, 10, true);
WaitForFlush(1);
AssertFilesPerLevel(std::to_string(i + 1), 1);
AssertFilesPerLevel(ToString(i + 1), 1);
}
TEST_SYNC_POINT("ColumnFamilyTest::AutoManual:1");
@ -1544,7 +1543,7 @@ TEST_P(ColumnFamilyTest, AutomaticAndManualCompactions) {
PutRandomData(2, 10, 12000);
PutRandomData(2, 1, 10);
WaitForFlush(2);
AssertFilesPerLevel(std::to_string(i + 1), 2);
AssertFilesPerLevel(ToString(i + 1), 2);
}
ROCKSDB_NAMESPACE::port::Thread threads([&] {
CompactRangeOptions compact_options;
@ -1616,7 +1615,7 @@ TEST_P(ColumnFamilyTest, ManualAndAutomaticCompactions) {
PutRandomData(1, 10, 12000, true);
PutRandomData(1, 1, 10, true);
WaitForFlush(1);
AssertFilesPerLevel(std::to_string(i + 1), 1);
AssertFilesPerLevel(ToString(i + 1), 1);
}
bool cf_1_1 = true;
bool cf_1_2 = true;
@ -1651,7 +1650,7 @@ TEST_P(ColumnFamilyTest, ManualAndAutomaticCompactions) {
PutRandomData(2, 10, 12000);
PutRandomData(2, 1, 10);
WaitForFlush(2);
AssertFilesPerLevel(std::to_string(i + 1), 2);
AssertFilesPerLevel(ToString(i + 1), 2);
}
TEST_SYNC_POINT("ColumnFamilyTest::ManualAuto:5");
threads.join();
@ -1710,7 +1709,7 @@ TEST_P(ColumnFamilyTest, SameCFManualManualCompactions) {
PutRandomData(1, 10, 12000, true);
PutRandomData(1, 1, 10, true);
WaitForFlush(1);
AssertFilesPerLevel(std::to_string(i + 1), 1);
AssertFilesPerLevel(ToString(i + 1), 1);
}
bool cf_1_1 = true;
bool cf_1_2 = true;
@ -1749,8 +1748,8 @@ TEST_P(ColumnFamilyTest, SameCFManualManualCompactions) {
PutRandomData(1, 10, 12000, true);
PutRandomData(1, 1, 10, true);
WaitForFlush(1);
AssertFilesPerLevel(
std::to_string(one.level0_file_num_compaction_trigger + i), 1);
AssertFilesPerLevel(ToString(one.level0_file_num_compaction_trigger + i),
1);
}
ROCKSDB_NAMESPACE::port::Thread threads1([&] {
@ -1812,7 +1811,7 @@ TEST_P(ColumnFamilyTest, SameCFManualAutomaticCompactions) {
PutRandomData(1, 10, 12000, true);
PutRandomData(1, 1, 10, true);
WaitForFlush(1);
AssertFilesPerLevel(std::to_string(i + 1), 1);
AssertFilesPerLevel(ToString(i + 1), 1);
}
bool cf_1_1 = true;
bool cf_1_2 = true;
@ -1850,8 +1849,8 @@ TEST_P(ColumnFamilyTest, SameCFManualAutomaticCompactions) {
PutRandomData(1, 10, 12000, true);
PutRandomData(1, 1, 10, true);
WaitForFlush(1);
AssertFilesPerLevel(
std::to_string(one.level0_file_num_compaction_trigger + i), 1);
AssertFilesPerLevel(ToString(one.level0_file_num_compaction_trigger + i),
1);
}
TEST_SYNC_POINT("ColumnFamilyTest::ManualAuto:1");
@ -1905,7 +1904,7 @@ TEST_P(ColumnFamilyTest, SameCFManualAutomaticCompactionsLevel) {
PutRandomData(1, 10, 12000, true);
PutRandomData(1, 1, 10, true);
WaitForFlush(1);
AssertFilesPerLevel(std::to_string(i + 1), 1);
AssertFilesPerLevel(ToString(i + 1), 1);
}
bool cf_1_1 = true;
bool cf_1_2 = true;
@ -1943,8 +1942,8 @@ TEST_P(ColumnFamilyTest, SameCFManualAutomaticCompactionsLevel) {
PutRandomData(1, 10, 12000, true);
PutRandomData(1, 1, 10, true);
WaitForFlush(1);
AssertFilesPerLevel(
std::to_string(one.level0_file_num_compaction_trigger + i), 1);
AssertFilesPerLevel(ToString(one.level0_file_num_compaction_trigger + i),
1);
}
TEST_SYNC_POINT("ColumnFamilyTest::ManualAuto:1");
@ -2025,7 +2024,7 @@ TEST_P(ColumnFamilyTest, SameCFAutomaticManualCompactions) {
PutRandomData(1, 10, 12000, true);
PutRandomData(1, 1, 10, true);
WaitForFlush(1);
AssertFilesPerLevel(std::to_string(i + 1), 1);
AssertFilesPerLevel(ToString(i + 1), 1);
}
TEST_SYNC_POINT("ColumnFamilyTest::AutoManual:5");

View File

@ -91,8 +91,8 @@ TEST_F(CompactFilesTest, L0ConflictsFiles) {
// create couple files
// Background compaction starts and waits in BackgroundCallCompaction:0
for (int i = 0; i < kLevel0Trigger * 4; ++i) {
ASSERT_OK(db->Put(WriteOptions(), std::to_string(i), ""));
ASSERT_OK(db->Put(WriteOptions(), std::to_string(100 - i), ""));
ASSERT_OK(db->Put(WriteOptions(), ToString(i), ""));
ASSERT_OK(db->Put(WriteOptions(), ToString(100 - i), ""));
ASSERT_OK(db->Flush(FlushOptions()));
}
@ -136,20 +136,17 @@ TEST_F(CompactFilesTest, MultipleLevel) {
// create couple files in L0, L3, L4 and L5
for (int i = 5; i > 2; --i) {
collector->ClearFlushedFiles();
ASSERT_OK(db->Put(WriteOptions(), std::to_string(i), ""));
ASSERT_OK(db->Put(WriteOptions(), ToString(i), ""));
ASSERT_OK(db->Flush(FlushOptions()));
// Ensure background work is fully finished including listener callbacks
// before accessing listener state.
ASSERT_OK(static_cast_with_check<DBImpl>(db)->TEST_WaitForBackgroundWork());
auto l0_files = collector->GetFlushedFiles();
ASSERT_OK(db->CompactFiles(CompactionOptions(), l0_files, i));
std::string prop;
ASSERT_TRUE(db->GetProperty(
"rocksdb.num-files-at-level" + std::to_string(i), &prop));
ASSERT_TRUE(
db->GetProperty("rocksdb.num-files-at-level" + ToString(i), &prop));
ASSERT_EQ("1", prop);
}
ASSERT_OK(db->Put(WriteOptions(), std::to_string(0), ""));
ASSERT_OK(db->Put(WriteOptions(), ToString(0), ""));
ASSERT_OK(db->Flush(FlushOptions()));
ColumnFamilyMetaData meta;
@ -218,7 +215,7 @@ TEST_F(CompactFilesTest, ObsoleteFiles) {
// create couple files
for (int i = 1000; i < 2000; ++i) {
ASSERT_OK(db->Put(WriteOptions(), std::to_string(i),
ASSERT_OK(db->Put(WriteOptions(), ToString(i),
std::string(kWriteBufferSize / 10, 'a' + (i % 26))));
}
@ -257,14 +254,14 @@ TEST_F(CompactFilesTest, NotCutOutputOnLevel0) {
// create couple files
for (int i = 0; i < 500; ++i) {
ASSERT_OK(db->Put(WriteOptions(), std::to_string(i),
ASSERT_OK(db->Put(WriteOptions(), ToString(i),
std::string(1000, 'a' + (i % 26))));
}
ASSERT_OK(static_cast_with_check<DBImpl>(db)->TEST_WaitForFlushMemTable());
auto l0_files_1 = collector->GetFlushedFiles();
collector->ClearFlushedFiles();
for (int i = 0; i < 500; ++i) {
ASSERT_OK(db->Put(WriteOptions(), std::to_string(i),
ASSERT_OK(db->Put(WriteOptions(), ToString(i),
std::string(1000, 'a' + (i % 26))));
}
ASSERT_OK(static_cast_with_check<DBImpl>(db)->TEST_WaitForFlushMemTable());
@ -295,13 +292,10 @@ TEST_F(CompactFilesTest, CapturingPendingFiles) {
// Create 5 files.
for (int i = 0; i < 5; ++i) {
ASSERT_OK(db->Put(WriteOptions(), "key" + std::to_string(i), "value"));
ASSERT_OK(db->Put(WriteOptions(), "key" + ToString(i), "value"));
ASSERT_OK(db->Flush(FlushOptions()));
}
// Ensure background work is fully finished including listener callbacks
// before accessing listener state.
ASSERT_OK(static_cast_with_check<DBImpl>(db)->TEST_WaitForBackgroundWork());
auto l0_files = collector->GetFlushedFiles();
EXPECT_EQ(5, l0_files.size());
@ -420,9 +414,6 @@ TEST_F(CompactFilesTest, SentinelCompressionType) {
ASSERT_OK(db->Put(WriteOptions(), "key", "val"));
ASSERT_OK(db->Flush(FlushOptions()));
// Ensure background work is fully finished including listener callbacks
// before accessing listener state.
ASSERT_OK(static_cast_with_check<DBImpl>(db)->TEST_WaitForBackgroundWork());
auto l0_files = collector->GetFlushedFiles();
ASSERT_EQ(1, l0_files.size());
@ -465,7 +456,7 @@ TEST_F(CompactFilesTest, GetCompactionJobInfo) {
// create couple files
for (int i = 0; i < 500; ++i) {
ASSERT_OK(db->Put(WriteOptions(), std::to_string(i),
ASSERT_OK(db->Put(WriteOptions(), ToString(i),
std::string(1000, 'a' + (i % 26))));
}
ASSERT_OK(static_cast_with_check<DBImpl>(db)->TEST_WaitForFlushMemTable());

View File

@ -213,8 +213,8 @@ Compaction::Compaction(
uint32_t _output_path_id, CompressionType _compression,
CompressionOptions _compression_opts, Temperature _output_temperature,
uint32_t _max_subcompactions, std::vector<FileMetaData*> _grandparents,
bool _manual_compaction, const std::string& _trim_ts, double _score,
bool _deletion_compaction, CompactionReason _compaction_reason)
bool _manual_compaction, double _score, bool _deletion_compaction,
CompactionReason _compaction_reason)
: input_vstorage_(vstorage),
start_level_(_inputs[0].level),
output_level_(_output_level),
@ -237,7 +237,6 @@ Compaction::Compaction(
bottommost_level_(IsBottommostLevel(output_level_, vstorage, inputs_)),
is_full_compaction_(IsFullCompaction(vstorage, inputs_)),
is_manual_compaction_(_manual_compaction),
trim_ts_(_trim_ts),
is_trivial_move_(false),
compaction_reason_(_compaction_reason),
notify_on_compaction_completion_(false) {
@ -278,9 +277,9 @@ Compaction::~Compaction() {
bool Compaction::InputCompressionMatchesOutput() const {
int base_level = input_vstorage_->base_level();
bool matches =
(GetCompressionType(input_vstorage_, mutable_cf_options_, start_level_,
base_level) == output_compression_);
bool matches = (GetCompressionType(immutable_options_, input_vstorage_,
mutable_cf_options_, start_level_,
base_level) == output_compression_);
if (matches) {
TEST_SYNC_POINT("Compaction::InputCompressionMatchesOutput:Matches");
return true;
@ -518,7 +517,7 @@ uint64_t Compaction::OutputFilePreallocationSize() const {
}
}
if (max_output_file_size_ != std::numeric_limits<uint64_t>::max() &&
if (max_output_file_size_ != port::kMaxUint64 &&
(immutable_options_.compaction_style == kCompactionStyleLevel ||
output_level() > 0)) {
preallocation_size = std::min(max_output_file_size_, preallocation_size);
@ -616,7 +615,7 @@ bool Compaction::DoesInputReferenceBlobFiles() const {
uint64_t Compaction::MinInputFileOldestAncesterTime(
const InternalKey* start, const InternalKey* end) const {
uint64_t min_oldest_ancester_time = std::numeric_limits<uint64_t>::max();
uint64_t min_oldest_ancester_time = port::kMaxUint64;
const InternalKeyComparator& icmp =
column_family_data()->internal_comparator();
for (const auto& level_files : inputs_) {

View File

@ -79,8 +79,8 @@ class Compaction {
CompressionOptions compression_opts,
Temperature output_temperature, uint32_t max_subcompactions,
std::vector<FileMetaData*> grandparents,
bool manual_compaction = false, const std::string& trim_ts = "",
double score = -1, bool deletion_compaction = false,
bool manual_compaction = false, double score = -1,
bool deletion_compaction = false,
CompactionReason compaction_reason = CompactionReason::kUnknown);
// No copying allowed
@ -208,8 +208,6 @@ class Compaction {
// Was this compaction triggered manually by the client?
bool is_manual_compaction() const { return is_manual_compaction_; }
std::string trim_ts() const { return trim_ts_; }
// Used when allow_trivial_move option is set in
// Universal compaction. If all the input files are
// non overlapping, then is_trivial_move_ variable
@ -387,9 +385,6 @@ class Compaction {
// Is this compaction requested by the client?
const bool is_manual_compaction_;
// The data with timestamp > trim_ts_ will be removed
const std::string trim_ts_;
// True if we can do trivial move in Universal multi level
// compaction
bool is_trivial_move_;

View File

@ -24,39 +24,40 @@ CompactionIterator::CompactionIterator(
InternalIterator* input, const Comparator* cmp, MergeHelper* merge_helper,
SequenceNumber last_sequence, std::vector<SequenceNumber>* snapshots,
SequenceNumber earliest_write_conflict_snapshot,
SequenceNumber job_snapshot, const SnapshotChecker* snapshot_checker,
Env* env, bool report_detailed_time, bool expect_valid_internal_key,
const SnapshotChecker* snapshot_checker, Env* env,
bool report_detailed_time, bool expect_valid_internal_key,
CompactionRangeDelAggregator* range_del_agg,
BlobFileBuilder* blob_file_builder, bool allow_data_in_errors,
bool enforce_single_del_contracts, const Compaction* compaction,
const CompactionFilter* compaction_filter,
const Compaction* compaction, const CompactionFilter* compaction_filter,
const std::atomic<bool>* shutting_down,
const SequenceNumber preserve_deletes_seqnum,
const std::atomic<int>* manual_compaction_paused,
const std::atomic<bool>* manual_compaction_canceled,
const std::shared_ptr<Logger> info_log,
const std::string* full_history_ts_low)
: CompactionIterator(
input, cmp, merge_helper, last_sequence, snapshots,
earliest_write_conflict_snapshot, job_snapshot, snapshot_checker, env,
earliest_write_conflict_snapshot, snapshot_checker, env,
report_detailed_time, expect_valid_internal_key, range_del_agg,
blob_file_builder, allow_data_in_errors, enforce_single_del_contracts,
blob_file_builder, allow_data_in_errors,
std::unique_ptr<CompactionProxy>(
compaction ? new RealCompaction(compaction) : nullptr),
compaction_filter, shutting_down, manual_compaction_paused,
manual_compaction_canceled, info_log, full_history_ts_low) {}
compaction_filter, shutting_down, preserve_deletes_seqnum,
manual_compaction_paused, manual_compaction_canceled, info_log,
full_history_ts_low) {}
CompactionIterator::CompactionIterator(
InternalIterator* input, const Comparator* cmp, MergeHelper* merge_helper,
SequenceNumber /*last_sequence*/, std::vector<SequenceNumber>* snapshots,
SequenceNumber earliest_write_conflict_snapshot,
SequenceNumber job_snapshot, const SnapshotChecker* snapshot_checker,
Env* env, bool report_detailed_time, bool expect_valid_internal_key,
const SnapshotChecker* snapshot_checker, Env* env,
bool report_detailed_time, bool expect_valid_internal_key,
CompactionRangeDelAggregator* range_del_agg,
BlobFileBuilder* blob_file_builder, bool allow_data_in_errors,
bool enforce_single_del_contracts,
std::unique_ptr<CompactionProxy> compaction,
const CompactionFilter* compaction_filter,
const std::atomic<bool>* shutting_down,
const SequenceNumber preserve_deletes_seqnum,
const std::atomic<int>* manual_compaction_paused,
const std::atomic<bool>* manual_compaction_canceled,
const std::shared_ptr<Logger> info_log,
@ -67,7 +68,6 @@ CompactionIterator::CompactionIterator(
merge_helper_(merge_helper),
snapshots_(snapshots),
earliest_write_conflict_snapshot_(earliest_write_conflict_snapshot),
job_snapshot_(job_snapshot),
snapshot_checker_(snapshot_checker),
env_(env),
clock_(env_->GetSystemClock().get()),
@ -80,9 +80,9 @@ CompactionIterator::CompactionIterator(
shutting_down_(shutting_down),
manual_compaction_paused_(manual_compaction_paused),
manual_compaction_canceled_(manual_compaction_canceled),
preserve_deletes_seqnum_(preserve_deletes_seqnum),
info_log_(info_log),
allow_data_in_errors_(allow_data_in_errors),
enforce_single_del_contracts_(enforce_single_del_contracts),
timestamp_size_(cmp_ ? cmp_->timestamp_size() : 0),
full_history_ts_low_(full_history_ts_low),
current_user_key_sequence_(0),
@ -237,10 +237,6 @@ bool CompactionIterator::InvokeFilterIfNeeded(bool* need_skip,
return false;
}
TEST_SYNC_POINT_CALLBACK(
"CompactionIterator::InvokeFilterIfNeeded::TamperWithBlobIndex",
&value_);
// For integrated BlobDB impl, CompactionIterator reads blob value.
// For Stacked BlobDB impl, the corresponding CompactionFilter's
// FilterV2 method should read the blob value.
@ -310,14 +306,6 @@ bool CompactionIterator::InvokeFilterIfNeeded(bool* need_skip,
// no value associated with delete
value_.clear();
iter_stats_.num_record_drop_user++;
} else if (filter == CompactionFilter::Decision::kPurge) {
// convert the current key to a single delete; key_ is pointing into
// current_key_ at this point, so updating current_key_ updates key()
ikey_.type = kTypeSingleDeletion;
current_key_.UpdateInternalKey(ikey_.sequence, kTypeSingleDeletion);
// no value associated with single delete
value_.clear();
iter_stats_.num_record_drop_user++;
} else if (filter == CompactionFilter::Decision::kChangeValue) {
if (ikey_.type == kTypeBlobIndex) {
// value transfer from blob file to inlined data
@ -636,39 +624,24 @@ void CompactionIterator::NextFromInput() {
TEST_SYNC_POINT_CALLBACK(
"CompactionIterator::NextFromInput:SingleDelete:2", nullptr);
if (next_ikey.type == kTypeSingleDeletion) {
if (next_ikey.type == kTypeSingleDeletion ||
next_ikey.type == kTypeDeletion) {
// We encountered two SingleDeletes for same key in a row. This
// could be due to unexpected user input. If write-(un)prepared
// transaction is used, this could also be due to releasing an old
// snapshot between a Put and its matching SingleDelete.
// Furthermore, if write-(un)prepared transaction is rolled back
// after prepare, we will write a Delete to cancel a prior Put. If
// old snapshot is released between a later Put and its matching
// SingleDelete, we will end up with a Delete followed by
// SingleDelete.
// Skip the first SingleDelete and let the next iteration decide
// how to handle the second SingleDelete.
// how to handle the second SingleDelete or Delete.
// First SingleDelete has been skipped since we already called
// input_.Next().
++iter_stats_.num_record_drop_obsolete;
++iter_stats_.num_single_del_mismatch;
} else if (next_ikey.type == kTypeDeletion) {
std::ostringstream oss;
oss << "Found SD and type: " << static_cast<int>(next_ikey.type)
<< " on the same key, violating the contract "
"of SingleDelete. Check your application to make sure the "
"application does not mix SingleDelete and Delete for "
"the same key. If you are using "
"write-prepared/write-unprepared transactions, and use "
"SingleDelete to delete certain keys, then make sure "
"TransactionDBOptions::rollback_deletion_type_callback is "
"configured properly. Mixing SD and DEL can lead to "
"undefined behaviors";
++iter_stats_.num_record_drop_obsolete;
++iter_stats_.num_single_del_mismatch;
if (enforce_single_del_contracts_) {
ROCKS_LOG_ERROR(info_log_, "%s", oss.str().c_str());
valid_ = false;
status_ = Status::Corruption(oss.str());
return;
}
ROCKS_LOG_WARN(info_log_, "%s", oss.str().c_str());
} else if (!is_timestamp_eligible_for_gc) {
// We cannot drop the SingleDelete as timestamp is enabled, and
// timestamp of this key is greater than or equal to
@ -785,6 +758,7 @@ void CompactionIterator::NextFromInput() {
(ikey_.type == kTypeDeletionWithTimestamp &&
cmp_with_history_ts_low_ < 0)) &&
DefinitelyInSnapshot(ikey_.sequence, earliest_snapshot_) &&
ikeyNotNeededForIncrementalSnapshot() &&
compaction_->KeyNotExistsBeyondOutputLevel(ikey_.user_key,
&level_ptrs_)) {
// TODO(noetzli): This is the only place where we use compaction_
@ -818,7 +792,7 @@ void CompactionIterator::NextFromInput() {
} else if ((ikey_.type == kTypeDeletion ||
(ikey_.type == kTypeDeletionWithTimestamp &&
cmp_with_history_ts_low_ < 0)) &&
bottommost_level_) {
bottommost_level_ && ikeyNotNeededForIncrementalSnapshot()) {
// Handle the case where we have a delete key at the bottom most level
// We can skip outputting the key iff there are no subsequent puts for this
// key
@ -980,10 +954,6 @@ void CompactionIterator::GarbageCollectBlobIfNeeded() {
// GC for integrated BlobDB
if (compaction_->enable_blob_garbage_collection()) {
TEST_SYNC_POINT_CALLBACK(
"CompactionIterator::GarbageCollectBlobIfNeeded::TamperWithBlobIndex",
&value_);
BlobIndex blob_index;
{
@ -1090,9 +1060,10 @@ void CompactionIterator::PrepareOutput() {
// Can we do the same for levels above bottom level as long as
// KeyNotExistsBeyondOutputLevel() return true?
if (valid_ && compaction_ != nullptr &&
!compaction_->allow_ingest_behind() && bottommost_level_ &&
!compaction_->allow_ingest_behind() &&
ikeyNotNeededForIncrementalSnapshot() && bottommost_level_ &&
DefinitelyInSnapshot(ikey_.sequence, earliest_snapshot_) &&
ikey_.type != kTypeMerge && current_key_committed_) {
ikey_.type != kTypeMerge) {
assert(ikey_.type != kTypeDeletion);
assert(ikey_.type != kTypeSingleDeletion ||
(timestamp_size_ || full_history_ts_low_));
@ -1168,6 +1139,13 @@ inline SequenceNumber CompactionIterator::findEarliestVisibleSnapshot(
return kMaxSequenceNumber;
}
// Used in two places: prevents deletion markers from being dropped if they may
// still be needed, and disables seqnum zero-out in PrepareOutput for recent keys.
inline bool CompactionIterator::ikeyNotNeededForIncrementalSnapshot() {
return (!compaction_->preserve_deletes()) ||
(ikey_.sequence < preserve_deletes_seqnum_);
}
uint64_t CompactionIterator::ComputeBlobGarbageCollectionCutoffFileNumber(
const CompactionProxy* compaction) {
if (!compaction) {

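The hunks above differ in how the compaction iterator reacts when a SingleDelete is followed by another deletion entry for the same key: main reports it as a contract violation (and returns a Corruption status when enforce_single_del_contracts is set), while the 7.0 branch skips the first entry and counts it as a single-delete mismatch. At the user level the contract is the same on both branches; a minimal sketch using only the public Put/SingleDelete calls (database path and keys are illustrative):

```cpp
#include <cassert>
#include "rocksdb/db.h"

int main() {
  rocksdb::DB* db = nullptr;
  rocksdb::Options options;
  options.create_if_missing = true;
  rocksdb::Status s = rocksdb::DB::Open(options, "/tmp/sd_contract_demo", &db);
  assert(s.ok());

  // Supported: a key written at most once since the last deletion,
  // removed with SingleDelete.
  s = db->Put(rocksdb::WriteOptions(), "k1", "v1");
  assert(s.ok());
  s = db->SingleDelete(rocksdb::WriteOptions(), "k1");
  assert(s.ok());

  // Not supported: mixing Delete and SingleDelete (or issuing two
  // SingleDeletes) for the same key. As the hunk above shows, compaction
  // handles such pairs differently on the two branches, so the outcome
  // is unspecified.
  // db->Delete(rocksdb::WriteOptions(), "k2");
  // db->SingleDelete(rocksdb::WriteOptions(), "k2");  // contract violation

  delete db;
  return 0;
}
```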
View File

@ -92,6 +92,8 @@ class CompactionIterator {
virtual bool allow_ingest_behind() const = 0;
virtual bool preserve_deletes() const = 0;
virtual bool allow_mmap_reads() const = 0;
virtual bool enable_blob_garbage_collection() const = 0;
@ -137,6 +139,8 @@ class CompactionIterator {
return compaction_->immutable_options()->allow_ingest_behind;
}
bool preserve_deletes() const override { return false; }
bool allow_mmap_reads() const override {
return compaction_->immutable_options()->allow_mmap_reads;
}
@ -172,13 +176,14 @@ class CompactionIterator {
InternalIterator* input, const Comparator* cmp, MergeHelper* merge_helper,
SequenceNumber last_sequence, std::vector<SequenceNumber>* snapshots,
SequenceNumber earliest_write_conflict_snapshot,
SequenceNumber job_snapshot, const SnapshotChecker* snapshot_checker,
Env* env, bool report_detailed_time, bool expect_valid_internal_key,
const SnapshotChecker* snapshot_checker, Env* env,
bool report_detailed_time, bool expect_valid_internal_key,
CompactionRangeDelAggregator* range_del_agg,
BlobFileBuilder* blob_file_builder, bool allow_data_in_errors,
bool enforce_single_del_contracts, const Compaction* compaction = nullptr,
const Compaction* compaction = nullptr,
const CompactionFilter* compaction_filter = nullptr,
const std::atomic<bool>* shutting_down = nullptr,
const SequenceNumber preserve_deletes_seqnum = 0,
const std::atomic<int>* manual_compaction_paused = nullptr,
const std::atomic<bool>* manual_compaction_canceled = nullptr,
const std::shared_ptr<Logger> info_log = nullptr,
@ -189,14 +194,14 @@ class CompactionIterator {
InternalIterator* input, const Comparator* cmp, MergeHelper* merge_helper,
SequenceNumber last_sequence, std::vector<SequenceNumber>* snapshots,
SequenceNumber earliest_write_conflict_snapshot,
SequenceNumber job_snapshot, const SnapshotChecker* snapshot_checker,
Env* env, bool report_detailed_time, bool expect_valid_internal_key,
const SnapshotChecker* snapshot_checker, Env* env,
bool report_detailed_time, bool expect_valid_internal_key,
CompactionRangeDelAggregator* range_del_agg,
BlobFileBuilder* blob_file_builder, bool allow_data_in_errors,
bool enforce_single_del_contracts,
std::unique_ptr<CompactionProxy> compaction,
const CompactionFilter* compaction_filter = nullptr,
const std::atomic<bool>* shutting_down = nullptr,
const SequenceNumber preserve_deletes_seqnum = 0,
const std::atomic<int>* manual_compaction_paused = nullptr,
const std::atomic<bool>* manual_compaction_canceled = nullptr,
const std::shared_ptr<Logger> info_log = nullptr,
@ -267,9 +272,14 @@ class CompactionIterator {
inline SequenceNumber findEarliestVisibleSnapshot(
SequenceNumber in, SequenceNumber* prev_snapshot);
// Checks whether the currently seen ikey_ is needed for an
// incremental (differential) snapshot and hence cannot be dropped,
// nor have its seqnum zeroed out, even if all other conditions for that are met.
inline bool ikeyNotNeededForIncrementalSnapshot();
inline bool KeyCommitted(SequenceNumber sequence) {
return snapshot_checker_ == nullptr ||
snapshot_checker_->CheckInSnapshot(sequence, job_snapshot_) ==
snapshot_checker_->CheckInSnapshot(sequence, kMaxSequenceNumber) ==
SnapshotCheckerResult::kInSnapshot;
}
@ -310,7 +320,6 @@ class CompactionIterator {
std::unordered_set<SequenceNumber> released_snapshots_;
std::vector<SequenceNumber>::const_iterator earliest_snapshot_iter_;
const SequenceNumber earliest_write_conflict_snapshot_;
const SequenceNumber job_snapshot_;
const SnapshotChecker* const snapshot_checker_;
Env* env_;
SystemClock* clock_;
@ -323,6 +332,7 @@ class CompactionIterator {
const std::atomic<bool>* shutting_down_;
const std::atomic<int>* manual_compaction_paused_;
const std::atomic<bool>* manual_compaction_canceled_;
const SequenceNumber preserve_deletes_seqnum_;
bool bottommost_level_;
bool valid_ = false;
bool visible_at_tip_;
@ -333,8 +343,6 @@ class CompactionIterator {
bool allow_data_in_errors_;
const bool enforce_single_del_contracts_;
// Comes from comparator.
const size_t timestamp_size_;

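The preserve_deletes_seqnum_ plumbing retained on the 7.0 branch (and removed on main) backs the incremental ("differential") snapshot feature that ikeyNotNeededForIncrementalSnapshot() serves. A hedged sketch of how it is driven from the public API, assuming the DBOptions::preserve_deletes flag and DB::SetPreserveDeletesSequenceNumber() call present on the 7.0 branch; the path is illustrative:

```cpp
#include <cassert>
#include "rocksdb/db.h"

// Sketch only: preserve_deletes and SetPreserveDeletesSequenceNumber() exist
// on the 7.0 branch but not on main, matching the hunks above.
void PreserveDeletesSketch() {
  rocksdb::Options options;
  options.create_if_missing = true;
  options.preserve_deletes = true;  // keep deletion markers for recent seqnums

  rocksdb::DB* db = nullptr;
  rocksdb::Status s =
      rocksdb::DB::Open(options, "/tmp/preserve_deletes_demo", &db);
  assert(s.ok());

  // Markers with a sequence number at or above the value passed here are not
  // dropped by compaction and their seqnums are not zeroed out, which is the
  // condition ikeyNotNeededForIncrementalSnapshot() checks.
  db->SetPreserveDeletesSequenceNumber(db->GetLatestSequenceNumber());

  delete db;
}
```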
View File

@ -166,6 +166,8 @@ class FakeCompaction : public CompactionIterator::CompactionProxy {
bool allow_ingest_behind() const override { return is_allow_ingest_behind; }
bool preserve_deletes() const override { return false; }
bool allow_mmap_reads() const override { return false; }
bool enable_blob_garbage_collection() const override { return false; }
@ -275,12 +277,11 @@ class CompactionIteratorTest : public testing::TestWithParam<bool> {
iter_->SeekToFirst();
c_iter_.reset(new CompactionIterator(
iter_.get(), cmp_, merge_helper_.get(), last_sequence, &snapshots_,
earliest_write_conflict_snapshot, kMaxSequenceNumber,
snapshot_checker_.get(), Env::Default(),
false /* report_detailed_time */, false, range_del_agg_.get(),
nullptr /* blob_file_builder */, true /*allow_data_in_errors*/,
true /*enforce_single_del_contracts*/, std::move(compaction), filter,
&shutting_down_,
earliest_write_conflict_snapshot, snapshot_checker_.get(),
Env::Default(), false /* report_detailed_time */, false,
range_del_agg_.get(), nullptr /* blob_file_builder */,
true /*allow_data_in_errors*/, std::move(compaction), filter,
&shutting_down_, /*preserve_deletes_seqnum=*/0,
/*manual_compaction_paused=*/nullptr,
/*manual_compaction_canceled=*/nullptr, /*info_log=*/nullptr,
full_history_ts_low));
@ -314,7 +315,7 @@ class CompactionIteratorTest : public testing::TestWithParam<bool> {
key_not_exists_beyond_output_level, full_history_ts_low);
c_iter_->SeekToFirst();
for (size_t i = 0; i < expected_keys.size(); i++) {
std::string info = "i = " + std::to_string(i);
std::string info = "i = " + ToString(i);
ASSERT_TRUE(c_iter_->Valid()) << info;
ASSERT_OK(c_iter_->status()) << info;
ASSERT_EQ(expected_keys[i], c_iter_->key().ToString()) << info;

View File

@ -31,7 +31,6 @@
#include "db/dbformat.h"
#include "db/error_handler.h"
#include "db/event_helpers.h"
#include "db/history_trimming_iterator.h"
#include "db/log_reader.h"
#include "db/log_writer.h"
#include "db/memtable.h"
@ -417,22 +416,20 @@ CompactionJob::CompactionJob(
int job_id, Compaction* compaction, const ImmutableDBOptions& db_options,
const MutableDBOptions& mutable_db_options, const FileOptions& file_options,
VersionSet* versions, const std::atomic<bool>* shutting_down,
LogBuffer* log_buffer, FSDirectory* db_directory,
FSDirectory* output_directory, FSDirectory* blob_output_directory,
Statistics* stats, InstrumentedMutex* db_mutex,
ErrorHandler* db_error_handler,
const SequenceNumber preserve_deletes_seqnum, LogBuffer* log_buffer,
FSDirectory* db_directory, FSDirectory* output_directory,
FSDirectory* blob_output_directory, Statistics* stats,
InstrumentedMutex* db_mutex, ErrorHandler* db_error_handler,
std::vector<SequenceNumber> existing_snapshots,
SequenceNumber earliest_write_conflict_snapshot,
const SnapshotChecker* snapshot_checker, JobContext* job_context,
std::shared_ptr<Cache> table_cache, EventLogger* event_logger,
bool paranoid_file_checks, bool measure_io_stats, const std::string& dbname,
CompactionJobStats* compaction_job_stats, Env::Priority thread_pri,
const std::shared_ptr<IOTracer>& io_tracer,
const SnapshotChecker* snapshot_checker, std::shared_ptr<Cache> table_cache,
EventLogger* event_logger, bool paranoid_file_checks, bool measure_io_stats,
const std::string& dbname, CompactionJobStats* compaction_job_stats,
Env::Priority thread_pri, const std::shared_ptr<IOTracer>& io_tracer,
const std::atomic<int>* manual_compaction_paused,
const std::atomic<bool>* manual_compaction_canceled,
const std::string& db_id, const std::string& db_session_id,
std::string full_history_ts_low, std::string trim_ts,
BlobFileCompletionCallback* blob_callback)
std::string full_history_ts_low, BlobFileCompletionCallback* blob_callback)
: compact_(new CompactionState(compaction)),
compaction_stats_(compaction->compaction_reason(), 1),
db_options_(db_options),
@ -457,6 +454,7 @@ CompactionJob::CompactionJob(
shutting_down_(shutting_down),
manual_compaction_paused_(manual_compaction_paused),
manual_compaction_canceled_(manual_compaction_canceled),
preserve_deletes_seqnum_(preserve_deletes_seqnum),
db_directory_(db_directory),
blob_output_directory_(blob_output_directory),
db_mutex_(db_mutex),
@ -464,14 +462,12 @@ CompactionJob::CompactionJob(
existing_snapshots_(std::move(existing_snapshots)),
earliest_write_conflict_snapshot_(earliest_write_conflict_snapshot),
snapshot_checker_(snapshot_checker),
job_context_(job_context),
table_cache_(std::move(table_cache)),
event_logger_(event_logger),
paranoid_file_checks_(paranoid_file_checks),
measure_io_stats_(measure_io_stats),
thread_pri_(thread_pri),
full_history_ts_low_(std::move(full_history_ts_low)),
trim_ts_(std::move(trim_ts)),
blob_callback_(blob_callback) {
assert(compaction_job_stats_ != nullptr);
assert(log_buffer_ != nullptr);
@ -1253,7 +1249,7 @@ void CompactionJob::NotifyOnSubcompactionBegin(
if (shutting_down_->load(std::memory_order_acquire)) {
return;
}
if (c->is_manual_compaction() && manual_compaction_paused_ &&
if (c->is_manual_compaction() &&
manual_compaction_paused_->load(std::memory_order_acquire) > 0) {
return;
}
@ -1342,8 +1338,6 @@ void CompactionJob::ProcessKeyValueCompaction(SubcompactionState* sub_compact) {
CompactionRangeDelAggregator range_del_agg(&cfd->internal_comparator(),
existing_snapshots_);
// TODO: since we already use C++17, should use
// std::optional<const Slice> instead.
const Slice* const start = sub_compact->start;
const Slice* const end = sub_compact->end;
@ -1365,13 +1359,9 @@ void CompactionJob::ProcessKeyValueCompaction(SubcompactionState* sub_compact) {
// Although the v2 aggregator is what the level iterator(s) know about,
// the AddTombstones calls will be propagated down to the v1 aggregator.
std::unique_ptr<InternalIterator> raw_input(versions_->MakeInputIterator(
read_options, sub_compact->compaction, &range_del_agg,
file_options_for_read_,
(start == nullptr) ? std::optional<const Slice>{}
: std::optional<const Slice>{*start},
(end == nullptr) ? std::optional<const Slice>{}
: std::optional<const Slice>{*end}));
std::unique_ptr<InternalIterator> raw_input(
versions_->MakeInputIterator(read_options, sub_compact->compaction,
&range_del_agg, file_options_for_read_));
InternalIterator* input = raw_input.get();
IterKey start_ikey;
@ -1390,28 +1380,21 @@ void CompactionJob::ProcessKeyValueCompaction(SubcompactionState* sub_compact) {
std::unique_ptr<InternalIterator> clip;
if (start || end) {
clip = std::make_unique<ClippingIterator>(
clip.reset(new ClippingIterator(
raw_input.get(), start ? &start_slice : nullptr,
end ? &end_slice : nullptr, &cfd->internal_comparator());
end ? &end_slice : nullptr, &cfd->internal_comparator()));
input = clip.get();
}
std::unique_ptr<InternalIterator> blob_counter;
if (sub_compact->compaction->DoesInputReferenceBlobFiles()) {
sub_compact->blob_garbage_meter = std::make_unique<BlobGarbageMeter>();
blob_counter = std::make_unique<BlobCountingIterator>(
input, sub_compact->blob_garbage_meter.get());
sub_compact->blob_garbage_meter.reset(new BlobGarbageMeter);
blob_counter.reset(
new BlobCountingIterator(input, sub_compact->blob_garbage_meter.get()));
input = blob_counter.get();
}
std::unique_ptr<InternalIterator> trim_history_iter;
if (cfd->user_comparator()->timestamp_size() > 0 && !trim_ts_.empty()) {
trim_history_iter = std::make_unique<HistoryTrimmingIterator>(
input, cfd->user_comparator(), trim_ts_);
input = trim_history_iter.get();
}
input->SeekToFirst();
AutoThreadOperationStageUpdater stage_updater(
@ -1470,17 +1453,14 @@ void CompactionJob::ProcessKeyValueCompaction(SubcompactionState* sub_compact) {
Status status;
const std::string* const full_history_ts_low =
full_history_ts_low_.empty() ? nullptr : &full_history_ts_low_;
const SequenceNumber job_snapshot_seq =
job_context_ ? job_context_->GetJobSnapshotSequence()
: kMaxSequenceNumber;
sub_compact->c_iter.reset(new CompactionIterator(
input, cfd->user_comparator(), &merge, versions_->LastSequence(),
&existing_snapshots_, earliest_write_conflict_snapshot_, job_snapshot_seq,
&existing_snapshots_, earliest_write_conflict_snapshot_,
snapshot_checker_, env_, ShouldReportDetailedTime(env_, stats_),
/*expect_valid_internal_key=*/true, &range_del_agg,
blob_file_builder.get(), db_options_.allow_data_in_errors,
db_options_.enforce_single_del_contracts, sub_compact->compaction,
compaction_filter, shutting_down_, manual_compaction_paused_,
sub_compact->compaction, compaction_filter, shutting_down_,
preserve_deletes_seqnum_, manual_compaction_paused_,
manual_compaction_canceled_, db_options_.info_log, full_history_ts_low));
auto c_iter = sub_compact->c_iter.get();
c_iter->SeekToFirst();
@ -1533,15 +1513,11 @@ void CompactionJob::ProcessKeyValueCompaction(SubcompactionState* sub_compact) {
break;
}
const ParsedInternalKey& ikey = c_iter->ikey();
status = sub_compact->current_output()->meta.UpdateBoundaries(
key, value, ikey.sequence, ikey.type);
if (!status.ok()) {
break;
}
sub_compact->current_output_file_size =
sub_compact->builder->EstimatedFileSize();
const ParsedInternalKey& ikey = c_iter->ikey();
sub_compact->current_output()->meta.UpdateBoundaries(
key, value, ikey.sequence, ikey.type);
sub_compact->num_output_records++;
// Close output file if it is big enough. Two possibilities determine it's
@ -1974,8 +1950,7 @@ Status CompactionJob::FinishCompactionOutputFile(
refined_oldest_ancester_time =
sub_compact->compaction->MinInputFileOldestAncesterTime(
&(meta->smallest), &(meta->largest));
if (refined_oldest_ancester_time !=
std::numeric_limits<uint64_t>::max()) {
if (refined_oldest_ancester_time != port::kMaxUint64) {
meta->oldest_ancester_time = refined_oldest_ancester_time;
}
}
@ -2265,7 +2240,7 @@ Status CompactionJob::OpenCompactionOutputFile(
sub_compact->compaction->MinInputFileOldestAncesterTime(
(sub_compact->start != nullptr) ? &tmp_start : nullptr,
(sub_compact->end != nullptr) ? &tmp_end : nullptr);
if (oldest_ancester_time == std::numeric_limits<uint64_t>::max()) {
if (oldest_ancester_time == port::kMaxUint64) {
oldest_ancester_time = current_time;
}
@ -2459,7 +2434,7 @@ void CompactionJob::LogCompaction() {
<< "compaction_reason"
<< GetCompactionReasonString(compaction->compaction_reason());
for (size_t i = 0; i < compaction->num_input_levels(); ++i) {
stream << ("files_L" + std::to_string(compaction->level(i)));
stream << ("files_L" + ToString(compaction->level(i)));
stream.StartArray();
for (auto f : *compaction->inputs(i)) {
stream << f->fd.GetNumber();
@ -2497,20 +2472,19 @@ CompactionServiceCompactionJob::CompactionServiceCompactionJob(
std::vector<SequenceNumber> existing_snapshots,
std::shared_ptr<Cache> table_cache, EventLogger* event_logger,
const std::string& dbname, const std::shared_ptr<IOTracer>& io_tracer,
const std::atomic<bool>* manual_compaction_canceled,
const std::string& db_id, const std::string& db_session_id,
const std::string& output_path,
const CompactionServiceInput& compaction_service_input,
CompactionServiceResult* compaction_service_result)
: CompactionJob(
job_id, compaction, db_options, mutable_db_options, file_options,
versions, shutting_down, log_buffer, nullptr, output_directory,
versions, shutting_down, 0, log_buffer, nullptr, output_directory,
nullptr, stats, db_mutex, db_error_handler, existing_snapshots,
kMaxSequenceNumber, nullptr, nullptr, table_cache, event_logger,
kMaxSequenceNumber, nullptr, table_cache, event_logger,
compaction->mutable_cf_options()->paranoid_file_checks,
compaction->mutable_cf_options()->report_bg_io_stats, dbname,
&(compaction_service_result->stats), Env::Priority::USER, io_tracer,
nullptr, manual_compaction_canceled, db_id, db_session_id,
nullptr, nullptr, db_id, db_session_id,
compaction->column_family_data()->GetFullHistoryTsLow()),
output_path_(output_path),
compaction_input_(compaction_service_input),
@ -2952,7 +2926,6 @@ static std::unordered_map<std::string, OptionTypeInfo> cs_result_type_info = {
const void* addr1, const void* addr2, std::string* mismatch) {
const auto status1 = static_cast<const Status*>(addr1);
const auto status2 = static_cast<const Status*>(addr2);
StatusSerializationAdapter adatper1(*status1);
StatusSerializationAdapter adapter2(*status2);
return OptionTypeInfo::TypesAreEqual(opts, status_adapter_type_info,
@ -3010,7 +2983,7 @@ Status CompactionServiceInput::Read(const std::string& data_str,
} else {
return Status::NotSupported(
"Compaction Service Input data version not supported: " +
std::to_string(format_version));
ToString(format_version));
}
}
@ -3039,7 +3012,7 @@ Status CompactionServiceResult::Read(const std::string& data_str,
} else {
return Status::NotSupported(
"Compaction Service Result data version not supported: " +
std::to_string(format_version));
ToString(format_version));
}
}

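Several hunks above only swap spellings of the same "unknown" sentinel: main writes std::numeric_limits&lt;uint64_t&gt;::max() where the 7.0 branch writes port::kMaxUint64. A self-contained sketch of the fallback around oldest_ancester_time (the helper name is illustrative, not RocksDB code):

```cpp
#include <cstdint>
#include <limits>

// port::kMaxUint64 (7.0 branch) and std::numeric_limits<uint64_t>::max() (main)
// are the same value; both mark the oldest-ancestor time as "unknown".
constexpr uint64_t kUnknownTime = std::numeric_limits<uint64_t>::max();

// Illustrative helper mirroring the fallback in OpenCompactionOutputFile:
// when the ancestor time is unknown, fall back to the current wall-clock time.
uint64_t ResolveOldestAncestorTime(uint64_t candidate, uint64_t current_time) {
  return candidate == kUnknownTime ? current_time : candidate;
}
```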
View File

@ -67,13 +67,14 @@ class CompactionJob {
int job_id, Compaction* compaction, const ImmutableDBOptions& db_options,
const MutableDBOptions& mutable_db_options,
const FileOptions& file_options, VersionSet* versions,
const std::atomic<bool>* shutting_down, LogBuffer* log_buffer,
const std::atomic<bool>* shutting_down,
const SequenceNumber preserve_deletes_seqnum, LogBuffer* log_buffer,
FSDirectory* db_directory, FSDirectory* output_directory,
FSDirectory* blob_output_directory, Statistics* stats,
InstrumentedMutex* db_mutex, ErrorHandler* db_error_handler,
std::vector<SequenceNumber> existing_snapshots,
SequenceNumber earliest_write_conflict_snapshot,
const SnapshotChecker* snapshot_checker, JobContext* job_context,
const SnapshotChecker* snapshot_checker,
std::shared_ptr<Cache> table_cache, EventLogger* event_logger,
bool paranoid_file_checks, bool measure_io_stats,
const std::string& dbname, CompactionJobStats* compaction_job_stats,
@ -81,7 +82,7 @@ class CompactionJob {
const std::atomic<int>* manual_compaction_paused = nullptr,
const std::atomic<bool>* manual_compaction_canceled = nullptr,
const std::string& db_id = "", const std::string& db_session_id = "",
std::string full_history_ts_low = "", std::string trim_ts = "",
std::string full_history_ts_low = "",
BlobFileCompletionCallback* blob_callback = nullptr);
virtual ~CompactionJob();
@ -195,6 +196,7 @@ class CompactionJob {
const std::atomic<bool>* shutting_down_;
const std::atomic<int>* manual_compaction_paused_;
const std::atomic<bool>* manual_compaction_canceled_;
const SequenceNumber preserve_deletes_seqnum_;
FSDirectory* db_directory_;
FSDirectory* blob_output_directory_;
InstrumentedMutex* db_mutex_;
@ -212,8 +214,6 @@ class CompactionJob {
const SnapshotChecker* const snapshot_checker_;
JobContext* job_context_;
std::shared_ptr<Cache> table_cache_;
EventLogger* event_logger_;
@ -226,7 +226,6 @@ class CompactionJob {
std::vector<uint64_t> sizes_;
Env::Priority thread_pri_;
std::string full_history_ts_low_;
std::string trim_ts_;
BlobFileCompletionCallback* blob_callback_;
uint64_t GetCompactionId(SubcompactionState* sub_compact);
@ -345,7 +344,6 @@ class CompactionServiceCompactionJob : private CompactionJob {
std::vector<SequenceNumber> existing_snapshots,
std::shared_ptr<Cache> table_cache, EventLogger* event_logger,
const std::string& dbname, const std::shared_ptr<IOTracer>& io_tracer,
const std::atomic<bool>* manual_compaction_canceled,
const std::string& db_id, const std::string& db_session_id,
const std::string& output_path,
const CompactionServiceInput& compaction_service_input,

View File

@ -268,10 +268,10 @@ class CompactionJobStatsTest : public testing::Test,
if (cf == 0) {
// default cfd
EXPECT_TRUE(db_->GetProperty(
"rocksdb.num-files-at-level" + std::to_string(level), &property));
"rocksdb.num-files-at-level" + ToString(level), &property));
} else {
EXPECT_TRUE(db_->GetProperty(
handles_[cf], "rocksdb.num-files-at-level" + std::to_string(level),
handles_[cf], "rocksdb.num-files-at-level" + ToString(level),
&property));
}
return atoi(property.c_str());
@ -672,7 +672,7 @@ TEST_P(CompactionJobStatsTest, CompactionJobStatsTest) {
snprintf(buf, kBufSize, "%d", ++num_L0_files);
ASSERT_EQ(std::string(buf), FilesPerLevel(1));
}
ASSERT_EQ(std::to_string(num_L0_files), FilesPerLevel(1));
ASSERT_EQ(ToString(num_L0_files), FilesPerLevel(1));
// 2nd Phase: perform L0 -> L1 compaction.
int L0_compaction_count = 6;

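The only change in this test is the string helper used to build the property name (std::to_string on main, ToString on the 7.0 branch); the property itself is part of the public DB::GetProperty API. A minimal example of querying it, mirroring what the test does:

```cpp
#include <cassert>
#include <cstdlib>
#include <string>
#include "rocksdb/db.h"

// Query the number of SST files at a given level, as the test above does with
// "rocksdb.num-files-at-level" + <level>.
int NumFilesAtLevel(rocksdb::DB* db, int level) {
  std::string value;
  bool ok = db->GetProperty(
      "rocksdb.num-files-at-level" + std::to_string(level), &value);
  assert(ok);
  return std::atoi(value.c_str());
}
```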
View File

@ -87,6 +87,7 @@ class CompactionJobTestBase : public testing::Test {
/*block_cache_tracer=*/nullptr,
/*io_tracer=*/nullptr, /*db_session_id*/ "")),
shutting_down_(false),
preserve_deletes_seqnum_(0),
mock_table_factory_(new mock::MockTableFactory()),
error_handler_(nullptr, db_options_, &mutex_),
encode_u64_ts_(std::move(encode_u64_ts)) {
@ -236,8 +237,8 @@ class CompactionJobTestBase : public testing::Test {
for (int i = 0; i < 2; ++i) {
auto contents = mock::MakeMockFile();
for (int k = 0; k < kKeysPerFile; ++k) {
auto key = std::to_string(i * kMatchingKeys + k);
auto value = std::to_string(i * kKeysPerFile + k);
auto key = ToString(i * kMatchingKeys + k);
auto value = ToString(i * kKeysPerFile + k);
InternalKey internal_key(key, ++sequence_number, kTypeValue);
// This is how the key will look like once it's written in bottommost
@ -353,11 +354,11 @@ class CompactionJobTestBase : public testing::Test {
ucmp_->timestamp_size() == full_history_ts_low_.size());
CompactionJob compaction_job(
0, &compaction, db_options_, mutable_db_options_, env_options_,
versions_.get(), &shutting_down_, &log_buffer, nullptr, nullptr,
nullptr, nullptr, &mutex_, &error_handler_, snapshots,
earliest_write_conflict_snapshot, snapshot_checker, nullptr,
table_cache_, &event_logger, false, false, dbname_,
&compaction_job_stats_, Env::Priority::USER, nullptr /* IOTracer */,
versions_.get(), &shutting_down_, preserve_deletes_seqnum_, &log_buffer,
nullptr, nullptr, nullptr, nullptr, &mutex_, &error_handler_, snapshots,
earliest_write_conflict_snapshot, snapshot_checker, table_cache_,
&event_logger, false, false, dbname_, &compaction_job_stats_,
Env::Priority::USER, nullptr /* IOTracer */,
/*manual_compaction_paused=*/nullptr,
/*manual_compaction_canceled=*/nullptr, /*db_id=*/"",
/*db_session_id=*/"", full_history_ts_low_);
@ -408,6 +409,7 @@ class CompactionJobTestBase : public testing::Test {
std::unique_ptr<VersionSet> versions_;
InstrumentedMutex mutex_;
std::atomic<bool> shutting_down_;
SequenceNumber preserve_deletes_seqnum_;
std::shared_ptr<mock::MockTableFactory> mock_table_factory_;
CompactionJobStats compaction_job_stats_;
ColumnFamilyData* cfd_;
@ -890,10 +892,10 @@ TEST_F(CompactionJobTest, MultiSingleDelete) {
// -> Snapshot Put
// K: SDel SDel Put SDel Put Put Snapshot SDel Put SDel SDel Put SDel
// -> Snapshot Put Snapshot SDel
// L: SDel Put SDel Put SDel Snapshot SDel Put SDel SDel Put SDel
// -> Snapshot SDel Put SDel
// M: (Put) SDel Put SDel Put SDel Snapshot Put SDel SDel Put SDel SDel
// -> SDel Snapshot Put SDel
// L: SDel Put Del Put SDel Snapshot Del Put Del SDel Put SDel
// -> Snapshot SDel
// M: (Put) SDel Put Del Put SDel Snapshot Put Del SDel Put SDel Del
// -> SDel Snapshot Del
NewDB();
auto file1 = mock::MakeMockFile({
@ -924,14 +926,14 @@ TEST_F(CompactionJobTest, MultiSingleDelete) {
{KeyStr("L", 16U, kTypeSingleDeletion), ""},
{KeyStr("L", 15U, kTypeValue), "val"},
{KeyStr("L", 14U, kTypeSingleDeletion), ""},
{KeyStr("L", 13U, kTypeSingleDeletion), ""},
{KeyStr("L", 13U, kTypeDeletion), ""},
{KeyStr("L", 12U, kTypeValue), "val"},
{KeyStr("L", 11U, kTypeSingleDeletion), ""},
{KeyStr("M", 16U, kTypeSingleDeletion), ""},
{KeyStr("L", 11U, kTypeDeletion), ""},
{KeyStr("M", 16U, kTypeDeletion), ""},
{KeyStr("M", 15U, kTypeSingleDeletion), ""},
{KeyStr("M", 14U, kTypeValue), "val"},
{KeyStr("M", 13U, kTypeSingleDeletion), ""},
{KeyStr("M", 12U, kTypeSingleDeletion), ""},
{KeyStr("M", 12U, kTypeDeletion), ""},
{KeyStr("M", 11U, kTypeValue), "val"},
});
AddMockFile(file1);
@ -972,12 +974,12 @@ TEST_F(CompactionJobTest, MultiSingleDelete) {
{KeyStr("K", 1U, kTypeSingleDeletion), ""},
{KeyStr("L", 5U, kTypeSingleDeletion), ""},
{KeyStr("L", 4U, kTypeValue), "val"},
{KeyStr("L", 3U, kTypeSingleDeletion), ""},
{KeyStr("L", 3U, kTypeDeletion), ""},
{KeyStr("L", 2U, kTypeValue), "val"},
{KeyStr("L", 1U, kTypeSingleDeletion), ""},
{KeyStr("M", 10U, kTypeSingleDeletion), ""},
{KeyStr("M", 7U, kTypeValue), "val"},
{KeyStr("M", 5U, kTypeSingleDeletion), ""},
{KeyStr("M", 5U, kTypeDeletion), ""},
{KeyStr("M", 4U, kTypeValue), "val"},
{KeyStr("M", 3U, kTypeSingleDeletion), ""},
});
@ -1019,9 +1021,7 @@ TEST_F(CompactionJobTest, MultiSingleDelete) {
{KeyStr("K", 8U, kTypeValue), "val3"},
{KeyStr("L", 16U, kTypeSingleDeletion), ""},
{KeyStr("L", 15U, kTypeValue), ""},
{KeyStr("L", 11U, kTypeSingleDeletion), ""},
{KeyStr("M", 15U, kTypeSingleDeletion), ""},
{KeyStr("M", 14U, kTypeValue), ""},
{KeyStr("M", 16U, kTypeDeletion), ""},
{KeyStr("M", 3U, kTypeSingleDeletion), ""}});
SetLastSequence(22U);
@ -1107,21 +1107,6 @@ TEST_F(CompactionJobTest, OldestBlobFileNumber) {
/* expected_oldest_blob_file_number */ 19);
}
TEST_F(CompactionJobTest, NoEnforceSingleDeleteContract) {
db_options_.enforce_single_del_contracts = false;
NewDB();
auto file =
mock::MakeMockFile({{KeyStr("a", 4U, kTypeSingleDeletion), ""},
{KeyStr("a", 3U, kTypeDeletion), "dontcare"}});
AddMockFile(file);
SetLastSequence(4U);
auto expected_results = mock::MakeMockFile();
auto files = cfd_->current()->storage_info()->LevelFiles(0);
RunCompaction({files}, expected_results);
}
TEST_F(CompactionJobTest, InputSerialization) {
// Setup a random CompactionServiceInput
CompactionServiceInput input;

View File

@ -65,7 +65,7 @@ bool FindIntraL0Compaction(const std::vector<FileMetaData*>& level_files,
size_t compact_bytes = static_cast<size_t>(level_files[start]->fd.file_size);
uint64_t compensated_compact_bytes =
level_files[start]->compensated_file_size;
size_t compact_bytes_per_del_file = std::numeric_limits<size_t>::max();
size_t compact_bytes_per_del_file = port::kMaxSizet;
// Compaction range will be [start, limit).
size_t limit;
// Pull in files until the amount of compaction work per deleted file begins
@ -100,7 +100,8 @@ bool FindIntraL0Compaction(const std::vector<FileMetaData*>& level_files,
// If enable_compression is false, then compression is always disabled no
// matter what the values of the other two parameters are.
// Otherwise, the compression type is determined based on options and level.
CompressionType GetCompressionType(const VersionStorageInfo* vstorage,
CompressionType GetCompressionType(const ImmutableCFOptions& ioptions,
const VersionStorageInfo* vstorage,
const MutableCFOptions& mutable_cf_options,
int level, int base_level,
const bool enable_compression) {
@ -117,19 +118,17 @@ CompressionType GetCompressionType(const VersionStorageInfo* vstorage,
}
// If the user has specified a different compression level for each level,
// then pick the compression for that level.
if (!mutable_cf_options.compression_per_level.empty()) {
if (!ioptions.compression_per_level.empty()) {
assert(level == 0 || level >= base_level);
int idx = (level == 0) ? 0 : level - base_level + 1;
const int n =
static_cast<int>(mutable_cf_options.compression_per_level.size()) - 1;
const int n = static_cast<int>(ioptions.compression_per_level.size()) - 1;
// It is possible for level_ to be -1; in that case, we use level
// 0's compression. This occurs mostly in backwards compatibility
// situations when the builder doesn't know what level the file
// belongs to. Likewise, if level is beyond the end of the
// specified compression levels, use the last value.
return mutable_cf_options
.compression_per_level[std::max(0, std::min(idx, n))];
return ioptions.compression_per_level[std::max(0, std::min(idx, n))];
} else {
return mutable_cf_options.compression;
}
@ -348,8 +347,9 @@ Compaction* CompactionPicker::CompactFiles(
} else {
base_level = 1;
}
compression_type = GetCompressionType(vstorage, mutable_cf_options,
output_level, base_level);
compression_type =
GetCompressionType(ioptions_, vstorage, mutable_cf_options,
output_level, base_level);
} else {
// TODO(ajkr): `CompactionOptions` offers configurable `CompressionType`
// without configurable `CompressionOptions`, which is inconsistent.
@ -401,7 +401,7 @@ Status CompactionPicker::GetCompactionInputsFromFileNumbers(
"Cannot find matched SST files for the following file numbers:");
for (auto fn : *input_set) {
message += " ";
message += std::to_string(fn);
message += ToString(fn);
}
return Status::InvalidArgument(message);
}
@ -571,7 +571,7 @@ Compaction* CompactionPicker::CompactRange(
int input_level, int output_level,
const CompactRangeOptions& compact_range_options, const InternalKey* begin,
const InternalKey* end, InternalKey** compaction_end, bool* manual_conflict,
uint64_t max_file_num_to_ignore, const std::string& trim_ts) {
uint64_t max_file_num_to_ignore) {
// CompactionPickerFIFO has its own implementation of compact range
assert(ioptions_.compaction_style != kCompactionStyleFIFO);
@ -637,10 +637,12 @@ Compaction* CompactionPicker::CompactRange(
ioptions_.compaction_style),
/* max_compaction_bytes */ LLONG_MAX,
compact_range_options.target_path_id,
GetCompressionType(vstorage, mutable_cf_options, output_level, 1),
GetCompressionType(ioptions_, vstorage, mutable_cf_options,
output_level, 1),
GetCompressionOptions(mutable_cf_options, vstorage, output_level),
Temperature::kUnknown, compact_range_options.max_subcompactions,
/* grandparents */ {}, /* is manual */ true, trim_ts);
/* grandparents */ {},
/* is manual */ true);
RegisterCompaction(c);
vstorage->ComputeCompactionScore(ioptions_, mutable_cf_options);
return c;
@ -717,7 +719,7 @@ Compaction* CompactionPicker::CompactRange(
// files that are created during the current compaction.
if (compact_range_options.bottommost_level_compaction ==
BottommostLevelCompaction::kForceOptimized &&
max_file_num_to_ignore != std::numeric_limits<uint64_t>::max()) {
max_file_num_to_ignore != port::kMaxUint64) {
assert(input_level == output_level);
// inputs_shrunk holds a continuous subset of input files which were all
// created before the current manual compaction
@ -814,12 +816,12 @@ Compaction* CompactionPicker::CompactRange(
ioptions_.level_compaction_dynamic_level_bytes),
mutable_cf_options.max_compaction_bytes,
compact_range_options.target_path_id,
GetCompressionType(vstorage, mutable_cf_options, output_level,
GetCompressionType(ioptions_, vstorage, mutable_cf_options, output_level,
vstorage->base_level()),
GetCompressionOptions(mutable_cf_options, vstorage, output_level),
Temperature::kUnknown, compact_range_options.max_subcompactions,
std::move(grandparents),
/* is manual compaction */ true, trim_ts);
/* is manual compaction */ true);
TEST_SYNC_POINT_CALLBACK("CompactionPicker::CompactRange:Return", compaction);
RegisterCompaction(compaction);
@ -1004,14 +1006,14 @@ Status CompactionPicker::SanitizeCompactionInputFiles(
return Status::InvalidArgument(
"Output level for column family " + cf_meta.name +
" must between [0, " +
std::to_string(cf_meta.levels[cf_meta.levels.size() - 1].level) + "].");
ToString(cf_meta.levels[cf_meta.levels.size() - 1].level) + "].");
}
if (output_level > MaxOutputLevel()) {
return Status::InvalidArgument(
"Exceed the maximum output level defined by "
"the current compaction algorithm --- " +
std::to_string(MaxOutputLevel()));
ToString(MaxOutputLevel()));
}
if (output_level < 0) {
@ -1061,8 +1063,8 @@ Status CompactionPicker::SanitizeCompactionInputFiles(
return Status::InvalidArgument(
"Cannot compact file to up level, input file: " +
MakeTableFileName("", file_num) + " level " +
std::to_string(input_file_level) + " > output level " +
std::to_string(output_level));
ToString(input_file_level) + " > output level " +
ToString(output_level));
}
}

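The GetCompressionType hunks mainly change where compression_per_level is read from (MutableCFOptions on main, ImmutableCFOptions on the 7.0 branch); the index math is identical on both sides. A self-contained sketch of that selection with a worked example; the enum and helper below are illustrative stand-ins, not the RocksDB types:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

enum class Compression { kNone, kSnappy, kZSTD };  // stand-in for CompressionType

// Mirrors the per-level lookup shown above: level 0 uses entry 0, any other
// level uses (level - base_level + 1), clamped to the last configured entry.
Compression PickPerLevelCompression(const std::vector<Compression>& per_level,
                                    int level, int base_level) {
  const int idx = (level == 0) ? 0 : level - base_level + 1;
  const int n = static_cast<int>(per_level.size()) - 1;
  return per_level[std::max(0, std::min(idx, n))];
}

int main() {
  const std::vector<Compression> per_level = {
      Compression::kNone, Compression::kSnappy, Compression::kZSTD};
  // With base_level = 3: level 3 -> index 1 (kSnappy), level 5 -> index 3,
  // clamped to 2 (kZSTD), and level 0 -> index 0 (kNone).
  assert(PickPerLevelCompression(per_level, 3, 3) == Compression::kSnappy);
  assert(PickPerLevelCompression(per_level, 5, 3) == Compression::kZSTD);
  assert(PickPerLevelCompression(per_level, 0, 3) == Compression::kNone);
  return 0;
}
```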
View File

@ -78,7 +78,7 @@ class CompactionPicker {
const CompactRangeOptions& compact_range_options,
const InternalKey* begin, const InternalKey* end,
InternalKey** compaction_end, bool* manual_conflict,
uint64_t max_file_num_to_ignore, const std::string& trim_ts);
uint64_t max_file_num_to_ignore);
// The maximum allowed output level. Default value is NumberLevels() - 1.
virtual int MaxOutputLevel() const { return NumberLevels() - 1; }
@ -270,8 +270,7 @@ class NullCompactionPicker : public CompactionPicker {
const InternalKey* /*end*/,
InternalKey** /*compaction_end*/,
bool* /*manual_conflict*/,
uint64_t /*max_file_num_to_ignore*/,
const std::string& /*trim_ts*/) override {
uint64_t /*max_file_num_to_ignore*/) override {
return nullptr;
}
@ -305,7 +304,8 @@ bool FindIntraL0Compaction(
CompactionInputFiles* comp_inputs,
SequenceNumber earliest_mem_seqno = kMaxSequenceNumber);
CompressionType GetCompressionType(const VersionStorageInfo* vstorage,
CompressionType GetCompressionType(const ImmutableCFOptions& ioptions,
const VersionStorageInfo* vstorage,
const MutableCFOptions& mutable_cf_options,
int level, int base_level,
const bool enable_compression = true);

View File

@ -115,7 +115,7 @@ Compaction* FIFOCompactionPicker::PickTTLCompaction(
std::move(inputs), 0, 0, 0, 0, kNoCompression,
mutable_cf_options.compression_opts, Temperature::kUnknown,
/* max_subcompactions */ 0, {}, /* is manual */ false,
/* trim_ts */ "", vstorage->CompactionScore(0),
vstorage->CompactionScore(0),
/* is deletion compaction */ true, CompactionReason::kFIFOTtl);
return c;
}
@ -157,8 +157,8 @@ Compaction* FIFOCompactionPicker::PickSizeCompaction(
0 /* max compaction bytes, not applicable */,
0 /* output path ID */, mutable_cf_options.compression,
mutable_cf_options.compression_opts, Temperature::kUnknown,
0 /* max_subcompactions */, {}, /* is manual */ false,
/* trim_ts */ "", vstorage->CompactionScore(0),
0 /* max_subcompactions */, {},
/* is manual */ false, vstorage->CompactionScore(0),
/* is deletion compaction */ false,
CompactionReason::kFIFOReduceNumFiles);
return c;
@ -208,7 +208,7 @@ Compaction* FIFOCompactionPicker::PickSizeCompaction(
std::move(inputs), 0, 0, 0, 0, kNoCompression,
mutable_cf_options.compression_opts, Temperature::kUnknown,
/* max_subcompactions */ 0, {}, /* is manual */ false,
/* trim_ts */ "", vstorage->CompactionScore(0),
vstorage->CompactionScore(0),
/* is deletion compaction */ true, CompactionReason::kFIFOMaxSize);
return c;
}
@ -313,7 +313,7 @@ Compaction* FIFOCompactionPicker::PickCompactionToWarm(
0 /* max compaction bytes, not applicable */, 0 /* output path ID */,
mutable_cf_options.compression, mutable_cf_options.compression_opts,
Temperature::kWarm,
/* max_subcompactions */ 0, {}, /* is manual */ false, /* trim_ts */ "",
/* max_subcompactions */ 0, {}, /* is manual */ false,
vstorage->CompactionScore(0),
/* is deletion compaction */ false, CompactionReason::kChangeTemperature);
return c;
@ -349,7 +349,7 @@ Compaction* FIFOCompactionPicker::CompactRange(
const CompactRangeOptions& /*compact_range_options*/,
const InternalKey* /*begin*/, const InternalKey* /*end*/,
InternalKey** compaction_end, bool* /*manual_conflict*/,
uint64_t /*max_file_num_to_ignore*/, const std::string& /*trim_ts*/) {
uint64_t /*max_file_num_to_ignore*/) {
#ifdef NDEBUG
(void)input_level;
(void)output_level;

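The FIFO picker hunks above only drop the trim_ts argument; the TTL, size, and temperature-change compactions they construct (kFIFOTtl, kFIFOMaxSize, kChangeTemperature) are driven by public options. A brief configuration sketch: max_table_files_size and ttl are standard option names, while age_for_warm is my assumption for the warm-tier path and should be verified against the headers:

```cpp
#include "rocksdb/options.h"

// Hedged configuration sketch for the three FIFO compaction reasons above.
rocksdb::Options MakeFifoOptions() {
  rocksdb::Options options;
  options.compaction_style = rocksdb::kCompactionStyleFIFO;
  // kFIFOMaxSize: oldest files are dropped once total size exceeds this cap.
  options.compaction_options_fifo.max_table_files_size = 1ULL << 30;  // 1 GB
  // kFIFOTtl: files older than this are dropped.
  options.ttl = 60 * 60 * 24;  // 1 day
  // kChangeTemperature (field name assumed): files older than this are
  // rewritten to warm storage, matching Temperature::kWarm above.
  options.compaction_options_fifo.age_for_warm = 60 * 60;  // 1 hour
  return options;
}
```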
View File

@ -32,7 +32,7 @@ class FIFOCompactionPicker : public CompactionPicker {
const CompactRangeOptions& compact_range_options,
const InternalKey* begin, const InternalKey* end,
InternalKey** compaction_end, bool* manual_conflict,
uint64_t max_file_num_to_ignore, const std::string& trim_ts) override;
uint64_t max_file_num_to_ignore) override;
// The maximum allowed output level. Always returns 0.
virtual int MaxOutputLevel() const override { return 0; }

View File

@ -343,13 +343,12 @@ Compaction* LevelCompactionBuilder::GetCompaction() {
ioptions_.level_compaction_dynamic_level_bytes),
mutable_cf_options_.max_compaction_bytes,
GetPathId(ioptions_, mutable_cf_options_, output_level_),
GetCompressionType(vstorage_, mutable_cf_options_, output_level_,
vstorage_->base_level()),
GetCompressionType(ioptions_, vstorage_, mutable_cf_options_,
output_level_, vstorage_->base_level()),
GetCompressionOptions(mutable_cf_options_, vstorage_, output_level_),
Temperature::kUnknown,
/* max_subcompactions */ 0, std::move(grandparents_), is_manual_,
/* trim_ts */ "", start_level_score_, false /* deletion_compaction */,
compaction_reason_);
start_level_score_, false /* deletion_compaction */, compaction_reason_);
// If it's level 0 compaction, make sure we don't execute any other level 0
// compactions in parallel
@ -504,7 +503,7 @@ bool LevelCompactionBuilder::PickIntraL0Compaction() {
return false;
}
return FindIntraL0Compaction(level_files, kMinFilesForIntraL0Compaction,
std::numeric_limits<uint64_t>::max(),
port::kMaxUint64,
mutable_cf_options_.max_compaction_bytes,
&start_level_inputs_, earliest_mem_seqno_);
}

View File

@ -273,9 +273,9 @@ TEST_F(CompactionPickerTest, NeedsCompactionLevel) {
// start a brand new version in each test.
NewVersionStorage(kLevels, kCompactionStyleLevel);
for (int i = 0; i < file_count; ++i) {
Add(level, i, std::to_string((i + 100) * 1000).c_str(),
std::to_string((i + 100) * 1000 + 999).c_str(), file_size, 0,
i * 100, i * 100 + 99);
Add(level, i, ToString((i + 100) * 1000).c_str(),
ToString((i + 100) * 1000 + 999).c_str(),
file_size, 0, i * 100, i * 100 + 99);
}
UpdateVersionStorageInfo();
ASSERT_EQ(vstorage_->CompactionScoreLevel(0), level);
@ -439,8 +439,8 @@ TEST_F(CompactionPickerTest, NeedsCompactionUniversal) {
for (int i = 1;
i <= mutable_cf_options_.level0_file_num_compaction_trigger * 2; ++i) {
NewVersionStorage(1, kCompactionStyleUniversal);
Add(0, i, std::to_string((i + 100) * 1000).c_str(),
std::to_string((i + 100) * 1000 + 999).c_str(), 1000000, 0, i * 100,
Add(0, i, ToString((i + 100) * 1000).c_str(),
ToString((i + 100) * 1000 + 999).c_str(), 1000000, 0, i * 100,
i * 100 + 99);
UpdateVersionStorageInfo();
ASSERT_EQ(level_compaction_picker.NeedsCompaction(vstorage_.get()),
@ -852,17 +852,17 @@ TEST_F(CompactionPickerTest, UniversalIncrementalSpace4) {
// L3: (1101, 1180) (1201, 1280) ... (7901, 7908)
// L4: (1130, 1150) (1160, 1210) (1230, 1250) (1260 1310) ... (7960, 8010)
for (int i = 11; i < 79; i++) {
Add(3, 100 + i * 3, std::to_string(i * 100).c_str(),
std::to_string(i * 100 + 80).c_str(), kFileSize, 0, 200, 251);
Add(3, 100 + i * 3, ToString(i * 100).c_str(),
ToString(i * 100 + 80).c_str(), kFileSize, 0, 200, 251);
// Add a tie breaker
if (i == 66) {
Add(3, 10000U, "6690", "6699", kFileSize, 0, 200, 251);
}
Add(4, 100 + i * 3 + 1, std::to_string(i * 100 + 30).c_str(),
std::to_string(i * 100 + 50).c_str(), kFileSize, 0, 200, 251);
Add(4, 100 + i * 3 + 2, std::to_string(i * 100 + 60).c_str(),
std::to_string(i * 100 + 110).c_str(), kFileSize, 0, 200, 251);
Add(4, 100 + i * 3 + 1, ToString(i * 100 + 30).c_str(),
ToString(i * 100 + 50).c_str(), kFileSize, 0, 200, 251);
Add(4, 100 + i * 3 + 2, ToString(i * 100 + 60).c_str(),
ToString(i * 100 + 110).c_str(), kFileSize, 0, 200, 251);
}
UpdateVersionStorageInfo();
@ -899,14 +899,14 @@ TEST_F(CompactionPickerTest, UniversalIncrementalSpace5) {
// L3: (1101, 1180) (1201, 1280) ... (7901, 7908)
// L4: (1130, 1150) (1160, 1210) (1230, 1250) (1260 1310) ... (7960, 8010)
for (int i = 11; i < 70; i++) {
Add(3, 100 + i * 3, std::to_string(i * 100).c_str(),
std::to_string(i * 100 + 80).c_str(),
Add(3, 100 + i * 3, ToString(i * 100).c_str(),
ToString(i * 100 + 80).c_str(),
i % 10 == 9 ? kFileSize * 100 : kFileSize, 0, 200, 251);
Add(4, 100 + i * 3 + 1, std::to_string(i * 100 + 30).c_str(),
std::to_string(i * 100 + 50).c_str(), kFileSize, 0, 200, 251);
Add(4, 100 + i * 3 + 2, std::to_string(i * 100 + 60).c_str(),
std::to_string(i * 100 + 110).c_str(), kFileSize, 0, 200, 251);
Add(4, 100 + i * 3 + 1, ToString(i * 100 + 30).c_str(),
ToString(i * 100 + 50).c_str(), kFileSize, 0, 200, 251);
Add(4, 100 + i * 3 + 2, ToString(i * 100 + 60).c_str(),
ToString(i * 100 + 110).c_str(), kFileSize, 0, 200, 251);
}
UpdateVersionStorageInfo();
@ -941,8 +941,8 @@ TEST_F(CompactionPickerTest, NeedsCompactionFIFO) {
// size of L0 files.
for (int i = 1; i <= kFileCount; ++i) {
NewVersionStorage(1, kCompactionStyleFIFO);
Add(0, i, std::to_string((i + 100) * 1000).c_str(),
std::to_string((i + 100) * 1000 + 999).c_str(), kFileSize, 0, i * 100,
Add(0, i, ToString((i + 100) * 1000).c_str(),
ToString((i + 100) * 1000 + 999).c_str(), kFileSize, 0, i * 100,
i * 100 + 99);
UpdateVersionStorageInfo();
ASSERT_EQ(fifo_compaction_picker.NeedsCompaction(vstorage_.get()),
@ -2648,13 +2648,12 @@ TEST_F(CompactionPickerTest, UniversalMarkedManualCompaction) {
ASSERT_EQ(3U, vstorage_->FilesMarkedForCompaction().size());
bool manual_conflict = false;
InternalKey* manual_end = nullptr;
InternalKey* manual_end = NULL;
std::unique_ptr<Compaction> compaction(
universal_compaction_picker.CompactRange(
cf_name_, mutable_cf_options_, mutable_db_options_, vstorage_.get(),
ColumnFamilyData::kCompactAllLevels, 6, CompactRangeOptions(),
nullptr, nullptr, &manual_end, &manual_conflict,
std::numeric_limits<uint64_t>::max(), ""));
ColumnFamilyData::kCompactAllLevels, 6, CompactRangeOptions(), NULL,
NULL, &manual_end, &manual_conflict, port::kMaxUint64));
ASSERT_TRUE(compaction);

View File

@ -746,19 +746,19 @@ Compaction* UniversalCompactionBuilder::PickCompactionToReduceSortedRuns(
} else {
compaction_reason = CompactionReason::kUniversalSortedRunNum;
}
return new Compaction(vstorage_, ioptions_, mutable_cf_options_,
mutable_db_options_, std::move(inputs), output_level,
MaxFileSizeForLevel(mutable_cf_options_, output_level,
kCompactionStyleUniversal),
GetMaxOverlappingBytes(), path_id,
GetCompressionType(vstorage_, mutable_cf_options_,
start_level, 1, enable_compression),
GetCompressionOptions(mutable_cf_options_, vstorage_,
start_level, enable_compression),
Temperature::kUnknown,
/* max_subcompactions */ 0, grandparents,
/* is manual */ false, /* trim_ts */ "", score_,
false /* deletion_compaction */, compaction_reason);
return new Compaction(
vstorage_, ioptions_, mutable_cf_options_, mutable_db_options_,
std::move(inputs), output_level,
MaxFileSizeForLevel(mutable_cf_options_, output_level,
kCompactionStyleUniversal),
GetMaxOverlappingBytes(), path_id,
GetCompressionType(ioptions_, vstorage_, mutable_cf_options_, start_level,
1, enable_compression),
GetCompressionOptions(mutable_cf_options_, vstorage_, start_level,
enable_compression),
Temperature::kUnknown,
/* max_subcompactions */ 0, grandparents, /* is manual */ false, score_,
false /* deletion_compaction */, compaction_reason);
}
// Look at overall size amplification. If size amplification
@ -1076,13 +1076,13 @@ Compaction* UniversalCompactionBuilder::PickIncrementalForReduceSizeAmp(
MaxFileSizeForLevel(mutable_cf_options_, output_level,
kCompactionStyleUniversal),
GetMaxOverlappingBytes(), path_id,
GetCompressionType(vstorage_, mutable_cf_options_, output_level, 1,
true /* enable_compression */),
GetCompressionType(ioptions_, vstorage_, mutable_cf_options_,
output_level, 1, true /* enable_compression */),
GetCompressionOptions(mutable_cf_options_, vstorage_, output_level,
true /* enable_compression */),
Temperature::kUnknown,
/* max_subcompactions */ 0, /* grandparents */ {}, /* is manual */ false,
/* trim_ts */ "", score_, false /* deletion_compaction */,
score_, false /* deletion_compaction */,
CompactionReason::kUniversalSizeAmplification);
}
@ -1220,11 +1220,12 @@ Compaction* UniversalCompactionBuilder::PickDeleteTriggeredCompaction() {
MaxFileSizeForLevel(mutable_cf_options_, output_level,
kCompactionStyleUniversal),
/* max_grandparent_overlap_bytes */ GetMaxOverlappingBytes(), path_id,
GetCompressionType(vstorage_, mutable_cf_options_, output_level, 1),
GetCompressionType(ioptions_, vstorage_, mutable_cf_options_,
output_level, 1),
GetCompressionOptions(mutable_cf_options_, vstorage_, output_level),
Temperature::kUnknown,
/* max_subcompactions */ 0, grandparents, /* is manual */ false,
/* trim_ts */ "", score_, false /* deletion_compaction */,
/* max_subcompactions */ 0, grandparents, /* is manual */ false, score_,
false /* deletion_compaction */,
CompactionReason::kFilesMarkedForCompaction);
}
@ -1293,14 +1294,13 @@ Compaction* UniversalCompactionBuilder::PickCompactionToOldest(
MaxFileSizeForLevel(mutable_cf_options_, output_level,
kCompactionStyleUniversal),
GetMaxOverlappingBytes(), path_id,
GetCompressionType(vstorage_, mutable_cf_options_, output_level, 1,
true /* enable_compression */),
GetCompressionType(ioptions_, vstorage_, mutable_cf_options_,
output_level, 1, true /* enable_compression */),
GetCompressionOptions(mutable_cf_options_, vstorage_, output_level,
true /* enable_compression */),
Temperature::kUnknown,
/* max_subcompactions */ 0, /* grandparents */ {}, /* is manual */ false,
/* trim_ts */ "", score_, false /* deletion_compaction */,
compaction_reason);
score_, false /* deletion_compaction */, compaction_reason);
}
Compaction* UniversalCompactionBuilder::PickPeriodicCompaction() {
@ -1371,7 +1371,7 @@ Compaction* UniversalCompactionBuilder::PickPeriodicCompaction() {
uint64_t UniversalCompactionBuilder::GetMaxOverlappingBytes() const {
if (!mutable_cf_options_.compaction_options_universal.incremental) {
return std::numeric_limits<uint64_t>::max();
return port::kMaxUint64;
} else {
// Try to align cutting boundary with files at the next level if the
// file isn't end up with 1/2 of target size, or it would overlap

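GetMaxOverlappingBytes above keys off compaction_options_universal.incremental: when incremental universal compaction is disabled, the overlap cap is the unbounded sentinel; when enabled, cutting boundaries are aligned with the next level. A short configuration sketch, assuming the incremental and max_size_amplification_percent fields of CompactionOptionsUniversal:

```cpp
#include "rocksdb/options.h"

// Sketch: the incremental flag referenced in GetMaxOverlappingBytes() above.
rocksdb::Options MakeUniversalOptions() {
  rocksdb::Options options;
  options.compaction_style = rocksdb::kCompactionStyleUniversal;
  // Size-amplification trigger for the kUniversalSizeAmplification reason above.
  options.compaction_options_universal.max_size_amplification_percent = 200;
  // With incremental = false, GetMaxOverlappingBytes() returns the "no limit"
  // sentinel; with true, overlap with the next level is bounded.
  options.compaction_options_universal.incremental = true;
  return options;
}
```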
View File

@ -12,16 +12,13 @@ namespace ROCKSDB_NAMESPACE {
class MyTestCompactionService : public CompactionService {
public:
MyTestCompactionService(
std::string db_path, Options& options,
std::shared_ptr<Statistics>& statistics,
std::vector<std::shared_ptr<EventListener>>& listeners)
MyTestCompactionService(std::string db_path, Options& options,
std::shared_ptr<Statistics>& statistics)
: db_path_(std::move(db_path)),
options_(options),
statistics_(statistics),
start_info_("na", "na", "na", 0, Env::TOTAL),
wait_info_("na", "na", "na", 0, Env::TOTAL),
listeners_(listeners) {}
wait_info_("na", "na", "na", 0, Env::TOTAL) {}
static const char* kClassName() { return "MyTestCompactionService"; }
@ -74,15 +71,9 @@ class MyTestCompactionService : public CompactionService {
options_override.table_factory = options_.table_factory;
options_override.sst_partitioner_factory = options_.sst_partitioner_factory;
options_override.statistics = statistics_;
if (!listeners_.empty()) {
options_override.listeners = listeners_;
}
OpenAndCompactOptions options;
options.canceled = &canceled_;
Status s = DB::OpenAndCompact(
options, db_path_, db_path_ + "/" + std::to_string(info.job_id),
db_path_, db_path_ + "/" + ROCKSDB_NAMESPACE::ToString(info.job_id),
compaction_input, compaction_service_result, options_override);
if (is_override_wait_result_) {
*compaction_service_result = override_wait_result_;
@ -121,8 +112,6 @@ class MyTestCompactionService : public CompactionService {
is_override_wait_status_ = false;
}
void SetCanceled(bool canceled) { canceled_ = canceled; }
private:
InstrumentedMutex mutex_;
std::atomic_int compaction_num_{0};
@ -140,8 +129,6 @@ class MyTestCompactionService : public CompactionService {
CompactionServiceJobStatus::kFailure;
bool is_override_wait_result_ = false;
std::string override_wait_result_;
std::vector<std::shared_ptr<EventListener>> listeners_;
std::atomic_bool canceled_{false};
};
class CompactionServiceTest : public DBTestBase {
@ -157,7 +144,7 @@ class CompactionServiceTest : public DBTestBase {
compactor_statistics_ = CreateDBStatistics();
compaction_service_ = std::make_shared<MyTestCompactionService>(
dbname_, *options, compactor_statistics_, remote_listeners);
dbname_, *options, compactor_statistics_);
options->compaction_service = compaction_service_;
DestroyAndReopen(*options);
}
@ -176,7 +163,7 @@ class CompactionServiceTest : public DBTestBase {
for (int i = 0; i < 20; i++) {
for (int j = 0; j < 10; j++) {
int key_id = i * 10 + j;
ASSERT_OK(Put(Key(key_id), "value" + std::to_string(key_id)));
ASSERT_OK(Put(Key(key_id), "value" + ToString(key_id)));
}
ASSERT_OK(Flush());
}
@ -186,7 +173,7 @@ class CompactionServiceTest : public DBTestBase {
for (int i = 0; i < 10; i++) {
for (int j = 0; j < 10; j++) {
int key_id = i * 20 + j * 2;
ASSERT_OK(Put(Key(key_id), "value_new" + std::to_string(key_id)));
ASSERT_OK(Put(Key(key_id), "value_new" + ToString(key_id)));
}
ASSERT_OK(Flush());
}
@ -198,15 +185,13 @@ class CompactionServiceTest : public DBTestBase {
for (int i = 0; i < 200; i++) {
auto result = Get(Key(i));
if (i % 2) {
ASSERT_EQ(result, "value" + std::to_string(i));
ASSERT_EQ(result, "value" + ToString(i));
} else {
ASSERT_EQ(result, "value_new" + std::to_string(i));
ASSERT_EQ(result, "value_new" + ToString(i));
}
}
}
std::vector<std::shared_ptr<EventListener>> remote_listeners;
private:
std::shared_ptr<Statistics> compactor_statistics_;
std::shared_ptr<Statistics> primary_statistics_;
@ -223,7 +208,7 @@ TEST_F(CompactionServiceTest, BasicCompactions) {
for (int i = 0; i < 20; i++) {
for (int j = 0; j < 10; j++) {
int key_id = i * 10 + j;
ASSERT_OK(Put(Key(key_id), "value" + std::to_string(key_id)));
ASSERT_OK(Put(Key(key_id), "value" + ToString(key_id)));
}
ASSERT_OK(Flush());
}
@ -231,7 +216,7 @@ TEST_F(CompactionServiceTest, BasicCompactions) {
for (int i = 0; i < 10; i++) {
for (int j = 0; j < 10; j++) {
int key_id = i * 20 + j * 2;
ASSERT_OK(Put(Key(key_id), "value_new" + std::to_string(key_id)));
ASSERT_OK(Put(Key(key_id), "value_new" + ToString(key_id)));
}
ASSERT_OK(Flush());
}
@ -241,9 +226,9 @@ TEST_F(CompactionServiceTest, BasicCompactions) {
for (int i = 0; i < 200; i++) {
auto result = Get(Key(i));
if (i % 2) {
ASSERT_EQ(result, "value" + std::to_string(i));
ASSERT_EQ(result, "value" + ToString(i));
} else {
ASSERT_EQ(result, "value_new" + std::to_string(i));
ASSERT_EQ(result, "value_new" + ToString(i));
}
}
auto my_cs = GetCompactionService();
@ -280,7 +265,7 @@ TEST_F(CompactionServiceTest, BasicCompactions) {
for (int i = 0; i < 10; i++) {
for (int j = 0; j < 10; j++) {
int key_id = i * 20 + j * 2;
s = Put(Key(key_id), "value_new" + std::to_string(key_id));
s = Put(Key(key_id), "value_new" + ToString(key_id));
if (s.IsAborted()) {
break;
}
@ -337,51 +322,6 @@ TEST_F(CompactionServiceTest, ManualCompaction) {
VerifyTestData();
}
TEST_F(CompactionServiceTest, CancelCompactionOnRemoteSide) {
Options options = CurrentOptions();
options.disable_auto_compactions = true;
ReopenWithCompactionService(&options);
GenerateTestData();
auto my_cs = GetCompactionService();
std::string start_str = Key(15);
std::string end_str = Key(45);
Slice start(start_str);
Slice end(end_str);
uint64_t comp_num = my_cs->GetCompactionNum();
// Test cancel compaction at the beginning
my_cs->SetCanceled(true);
auto s = db_->CompactRange(CompactRangeOptions(), &start, &end);
ASSERT_TRUE(s.IsIncomplete());
// compaction number is not increased
ASSERT_GE(my_cs->GetCompactionNum(), comp_num);
VerifyTestData();
// Test cancel compaction in progress
ReopenWithCompactionService(&options);
GenerateTestData();
my_cs = GetCompactionService();
my_cs->SetCanceled(false);
std::atomic_bool cancel_issued{false};
SyncPoint::GetInstance()->SetCallBack("CompactionJob::Run():Inprogress",
[&](void* /*arg*/) {
cancel_issued = true;
my_cs->SetCanceled(true);
});
SyncPoint::GetInstance()->EnableProcessing();
s = db_->CompactRange(CompactRangeOptions(), &start, &end);
ASSERT_TRUE(s.IsIncomplete());
ASSERT_TRUE(cancel_issued);
// compaction number is not increased
ASSERT_GE(my_cs->GetCompactionNum(), comp_num);
VerifyTestData();
}
TEST_F(CompactionServiceTest, FailedToStart) {
Options options = CurrentOptions();
options.disable_auto_compactions = true;
@ -467,7 +407,7 @@ TEST_F(CompactionServiceTest, CompactionFilter) {
for (int i = 0; i < 20; i++) {
for (int j = 0; j < 10; j++) {
int key_id = i * 10 + j;
ASSERT_OK(Put(Key(key_id), "value" + std::to_string(key_id)));
ASSERT_OK(Put(Key(key_id), "value" + ToString(key_id)));
}
ASSERT_OK(Flush());
}
@ -475,7 +415,7 @@ TEST_F(CompactionServiceTest, CompactionFilter) {
for (int i = 0; i < 10; i++) {
for (int j = 0; j < 10; j++) {
int key_id = i * 20 + j * 2;
ASSERT_OK(Put(Key(key_id), "value_new" + std::to_string(key_id)));
ASSERT_OK(Put(Key(key_id), "value_new" + ToString(key_id)));
}
ASSERT_OK(Flush());
}
@ -489,9 +429,9 @@ TEST_F(CompactionServiceTest, CompactionFilter) {
if (i > 5 && i <= 105) {
ASSERT_EQ(result, "NOT_FOUND");
} else if (i % 2) {
ASSERT_EQ(result, "value" + std::to_string(i));
ASSERT_EQ(result, "value" + ToString(i));
} else {
ASSERT_EQ(result, "value_new" + std::to_string(i));
ASSERT_EQ(result, "value_new" + ToString(i));
}
}
auto my_cs = GetCompactionService();
@ -546,9 +486,9 @@ TEST_F(CompactionServiceTest, ConcurrentCompaction) {
for (int i = 0; i < 200; i++) {
auto result = Get(Key(i));
if (i % 2) {
ASSERT_EQ(result, "value" + std::to_string(i));
ASSERT_EQ(result, "value" + ToString(i));
} else {
ASSERT_EQ(result, "value_new" + std::to_string(i));
ASSERT_EQ(result, "value_new" + ToString(i));
}
}
auto my_cs = GetCompactionService();
@ -563,7 +503,7 @@ TEST_F(CompactionServiceTest, CompactionInfo) {
for (int i = 0; i < 20; i++) {
for (int j = 0; j < 10; j++) {
int key_id = i * 10 + j;
ASSERT_OK(Put(Key(key_id), "value" + std::to_string(key_id)));
ASSERT_OK(Put(Key(key_id), "value" + ToString(key_id)));
}
ASSERT_OK(Flush());
}
@ -571,7 +511,7 @@ TEST_F(CompactionServiceTest, CompactionInfo) {
for (int i = 0; i < 10; i++) {
for (int j = 0; j < 10; j++) {
int key_id = i * 20 + j * 2;
ASSERT_OK(Put(Key(key_id), "value_new" + std::to_string(key_id)));
ASSERT_OK(Put(Key(key_id), "value_new" + ToString(key_id)));
}
ASSERT_OK(Flush());
}
@ -616,7 +556,7 @@ TEST_F(CompactionServiceTest, CompactionInfo) {
for (int i = 0; i < 20; i++) {
for (int j = 0; j < 10; j++) {
int key_id = i * 10 + j;
ASSERT_OK(Put(Key(key_id), "value" + std::to_string(key_id)));
ASSERT_OK(Put(Key(key_id), "value" + ToString(key_id)));
}
ASSERT_OK(Flush());
}
@ -624,7 +564,7 @@ TEST_F(CompactionServiceTest, CompactionInfo) {
for (int i = 0; i < 4; i++) {
for (int j = 0; j < 10; j++) {
int key_id = i * 20 + j * 2;
ASSERT_OK(Put(Key(key_id), "value_new" + std::to_string(key_id)));
ASSERT_OK(Put(Key(key_id), "value_new" + ToString(key_id)));
}
ASSERT_OK(Flush());
}
@ -652,7 +592,7 @@ TEST_F(CompactionServiceTest, FallbackLocalAuto) {
for (int i = 0; i < 20; i++) {
for (int j = 0; j < 10; j++) {
int key_id = i * 10 + j;
ASSERT_OK(Put(Key(key_id), "value" + std::to_string(key_id)));
ASSERT_OK(Put(Key(key_id), "value" + ToString(key_id)));
}
ASSERT_OK(Flush());
}
@ -660,7 +600,7 @@ TEST_F(CompactionServiceTest, FallbackLocalAuto) {
for (int i = 0; i < 10; i++) {
for (int j = 0; j < 10; j++) {
int key_id = i * 20 + j * 2;
ASSERT_OK(Put(Key(key_id), "value_new" + std::to_string(key_id)));
ASSERT_OK(Put(Key(key_id), "value_new" + ToString(key_id)));
}
ASSERT_OK(Flush());
}
@ -670,9 +610,9 @@ TEST_F(CompactionServiceTest, FallbackLocalAuto) {
for (int i = 0; i < 200; i++) {
auto result = Get(Key(i));
if (i % 2) {
ASSERT_EQ(result, "value" + std::to_string(i));
ASSERT_EQ(result, "value" + ToString(i));
} else {
ASSERT_EQ(result, "value_new" + std::to_string(i));
ASSERT_EQ(result, "value_new" + ToString(i));
}
}
@ -745,88 +685,6 @@ TEST_F(CompactionServiceTest, FallbackLocalManual) {
VerifyTestData();
}
TEST_F(CompactionServiceTest, RemoteEventListener) {
class RemoteEventListenerTest : public EventListener {
public:
const char* Name() const override { return "RemoteEventListenerTest"; }
void OnSubcompactionBegin(const SubcompactionJobInfo& info) override {
auto result = on_going_compactions.emplace(info.job_id);
ASSERT_TRUE(result.second); // make sure there's no duplication
compaction_num++;
EventListener::OnSubcompactionBegin(info);
}
void OnSubcompactionCompleted(const SubcompactionJobInfo& info) override {
auto num = on_going_compactions.erase(info.job_id);
ASSERT_TRUE(num == 1); // make sure the compaction id exists
EventListener::OnSubcompactionCompleted(info);
}
void OnTableFileCreated(const TableFileCreationInfo& info) override {
ASSERT_EQ(on_going_compactions.count(info.job_id), 1);
file_created++;
EventListener::OnTableFileCreated(info);
}
void OnTableFileCreationStarted(
const TableFileCreationBriefInfo& info) override {
ASSERT_EQ(on_going_compactions.count(info.job_id), 1);
file_creation_started++;
EventListener::OnTableFileCreationStarted(info);
}
bool ShouldBeNotifiedOnFileIO() override {
file_io_notified++;
return EventListener::ShouldBeNotifiedOnFileIO();
}
std::atomic_uint64_t file_io_notified{0};
std::atomic_uint64_t file_creation_started{0};
std::atomic_uint64_t file_created{0};
std::set<int> on_going_compactions; // store the job_id
std::atomic_uint64_t compaction_num{0};
};
auto listener = new RemoteEventListenerTest();
remote_listeners.emplace_back(listener);
Options options = CurrentOptions();
ReopenWithCompactionService(&options);
for (int i = 0; i < 20; i++) {
for (int j = 0; j < 10; j++) {
int key_id = i * 10 + j;
ASSERT_OK(Put(Key(key_id), "value" + std::to_string(key_id)));
}
ASSERT_OK(Flush());
}
for (int i = 0; i < 10; i++) {
for (int j = 0; j < 10; j++) {
int key_id = i * 20 + j * 2;
ASSERT_OK(Put(Key(key_id), "value_new" + std::to_string(key_id)));
}
ASSERT_OK(Flush());
}
ASSERT_OK(dbfull()->TEST_WaitForCompact());
// check the events are triggered
ASSERT_TRUE(listener->file_io_notified > 0);
ASSERT_TRUE(listener->file_creation_started > 0);
ASSERT_TRUE(listener->file_created > 0);
ASSERT_TRUE(listener->compaction_num > 0);
ASSERT_TRUE(listener->on_going_compactions.empty());
// verify result
for (int i = 0; i < 200; i++) {
auto result = Get(Key(i));
if (i % 2) {
ASSERT_EQ(result, "value" + std::to_string(i));
} else {
ASSERT_EQ(result, "value_new" + std::to_string(i));
}
}
}
} // namespace ROCKSDB_NAMESPACE
int main(int argc, char** argv) {

View File

@ -397,7 +397,7 @@ TEST_P(ComparatorDBTest, DoubleComparator) {
for (uint32_t j = 0; j < divide_order; j++) {
to_divide *= 10.0;
}
source_strings.push_back(std::to_string(r / to_divide));
source_strings.push_back(ToString(r / to_divide));
}
DoRandomIteraratorTest(GetDB(), source_strings, &rnd, 200, 1000, 66);

View File

@ -40,8 +40,7 @@ Status VerifySstFileChecksum(const Options& options,
Status VerifySstFileChecksum(const Options& options,
const EnvOptions& env_options,
const ReadOptions& read_options,
const std::string& file_path,
const SequenceNumber& largest_seqno) {
const std::string& file_path) {
std::unique_ptr<FSRandomAccessFile> file;
uint64_t file_size;
InternalKeyComparator internal_comparator(options.comparator);
@ -62,13 +61,12 @@ Status VerifySstFileChecksum(const Options& options,
nullptr /* stats */, 0 /* hist_type */, nullptr /* file_read_hist */,
ioptions.rate_limiter.get()));
const bool kImmortal = true;
auto reader_options = TableReaderOptions(
ioptions, options.prefix_extractor, env_options, internal_comparator,
false /* skip_filters */, !kImmortal, false /* force_direct_prefetch */,
-1 /* level */);
reader_options.largest_seqno = largest_seqno;
s = ioptions.table_factory->NewTableReader(
reader_options, std::move(file_reader), file_size, &table_reader,
TableReaderOptions(ioptions, options.prefix_extractor, env_options,
internal_comparator, false /* skip_filters */,
!kImmortal, false /* force_direct_prefetch */,
-1 /* level */),
std::move(file_reader), file_size, &table_reader,
false /* prefetch_index_and_filter_in_cache */);
if (!s.ok()) {
return s;

View File

@ -26,7 +26,6 @@
#include "rocksdb/db.h"
#include "rocksdb/env.h"
#include "rocksdb/table.h"
#include "rocksdb/utilities/transaction_db.h"
#include "rocksdb/write_batch.h"
#include "table/block_based/block_based_table_builder.h"
#include "table/meta_blocks.h"
@ -276,42 +275,6 @@ class CorruptionTest : public testing::Test {
}
return Slice(*storage);
}
void GetSortedWalFiles(std::vector<uint64_t>& file_nums) {
std::vector<std::string> tmp_files;
ASSERT_OK(env_->GetChildren(dbname_, &tmp_files));
FileType type = kWalFile;
for (const auto& file : tmp_files) {
uint64_t number = 0;
if (ParseFileName(file, &number, &type) && type == kWalFile) {
file_nums.push_back(number);
}
}
std::sort(file_nums.begin(), file_nums.end());
}
void CorruptFileWithTruncation(FileType file, uint64_t number,
uint64_t bytes_to_truncate = 0) {
std::string path;
switch (file) {
case FileType::kWalFile:
path = LogFileName(dbname_, number);
break;
// TODO: Add other file types as this method is being used for those file
// types.
default:
return;
}
uint64_t old_size = 0;
ASSERT_OK(env_->GetFileSize(path, &old_size));
assert(old_size > bytes_to_truncate);
uint64_t new_size = old_size - bytes_to_truncate;
// If bytes_to_truncate == 0, it will do full truncation.
if (bytes_to_truncate == 0) {
new_size = 0;
}
ASSERT_OK(test::TruncateFile(env_, path, new_size));
}
};
TEST_F(CorruptionTest, Recovery) {
@ -337,72 +300,6 @@ TEST_F(CorruptionTest, Recovery) {
Check(36, 36);
}
TEST_F(CorruptionTest, PostPITRCorruptionWALsRetained) {
// Repro for a bug where WALs following the point-in-time recovery were not
// retained, leading to the next recovery failing.
CloseDb();
options_.wal_recovery_mode = WALRecoveryMode::kPointInTimeRecovery;
const std::string test_cf_name = "test_cf";
std::vector<ColumnFamilyDescriptor> cf_descs;
cf_descs.emplace_back(kDefaultColumnFamilyName, ColumnFamilyOptions());
cf_descs.emplace_back(test_cf_name, ColumnFamilyOptions());
uint64_t log_num;
{
options_.create_missing_column_families = true;
std::vector<ColumnFamilyHandle*> cfhs;
ASSERT_OK(DB::Open(options_, dbname_, cf_descs, &cfhs, &db_));
assert(db_ != nullptr); // suppress false clang-analyze report
ASSERT_OK(db_->Put(WriteOptions(), cfhs[0], "k", "v"));
ASSERT_OK(db_->Put(WriteOptions(), cfhs[1], "k", "v"));
ASSERT_OK(db_->Put(WriteOptions(), cfhs[0], "k2", "v2"));
std::vector<uint64_t> file_nums;
GetSortedWalFiles(file_nums);
log_num = file_nums.back();
for (auto* cfh : cfhs) {
delete cfh;
}
CloseDb();
}
CorruptFileWithTruncation(FileType::kWalFile, log_num,
/*bytes_to_truncate=*/1);
{
// Recover "k" -> "v" for both CFs. "k2" -> "v2" is lost due to truncation.
options_.avoid_flush_during_recovery = true;
std::vector<ColumnFamilyHandle*> cfhs;
ASSERT_OK(DB::Open(options_, dbname_, cf_descs, &cfhs, &db_));
assert(db_ != nullptr); // suppress false clang-analyze report
// Flush one but not both CFs and write some data so there's a seqno gap
// between the PITR corruption and the next DB session's first WAL.
ASSERT_OK(db_->Put(WriteOptions(), cfhs[1], "k2", "v2"));
ASSERT_OK(db_->Flush(FlushOptions(), cfhs[1]));
for (auto* cfh : cfhs) {
delete cfh;
}
CloseDb();
}
// With the bug, this DB open would remove the WALs following the PITR
// corruption. Then, the next recovery would fail.
for (int i = 0; i < 2; ++i) {
std::vector<ColumnFamilyHandle*> cfhs;
ASSERT_OK(DB::Open(options_, dbname_, cf_descs, &cfhs, &db_));
assert(db_ != nullptr); // suppress false clang-analyze report
for (auto* cfh : cfhs) {
delete cfh;
}
CloseDb();
}
}
TEST_F(CorruptionTest, RecoverWriteError) {
env_->writable_file_error_ = true;
Status s = TryReopen();
@ -1015,480 +912,6 @@ TEST_F(CorruptionTest, VerifyWholeTableChecksum) {
ASSERT_EQ(1, count);
}
class CrashDuringRecoveryWithCorruptionTest
: public CorruptionTest,
public testing::WithParamInterface<std::tuple<bool, bool>> {
public:
explicit CrashDuringRecoveryWithCorruptionTest()
: CorruptionTest(),
avoid_flush_during_recovery_(std::get<0>(GetParam())),
track_and_verify_wals_in_manifest_(std::get<1>(GetParam())) {}
protected:
const bool avoid_flush_during_recovery_;
const bool track_and_verify_wals_in_manifest_;
};
INSTANTIATE_TEST_CASE_P(CorruptionTest, CrashDuringRecoveryWithCorruptionTest,
::testing::Values(std::make_tuple(true, false),
std::make_tuple(false, false),
std::make_tuple(true, true),
std::make_tuple(false, true)));
// In the case of a non-TransactionDB with avoid_flush_during_recovery = true,
// RocksDB won't flush the data from the WAL to L0 for all column families if
// possible. As a result, not all column families can increase their
// log_numbers, and min_log_number_to_keep won't change.
// RocksDB may prematurely persist a new MANIFEST that advances log_numbers for
// some column families even before we can declare the DB to be in a consistent
// state after recovery (which is when the new WAL is synced).
//
// If there is a power failure before we sync the new WAL, we will end up in a
// situation in which, after persisting the MANIFEST, RocksDB will see some
// column families' log_numbers larger than the corrupted WAL's number, and the
// "Column family inconsistency: SST file contains data beyond the point of
// corruption" error will be hit, causing recovery to fail.
//
// With the fix, RocksDB persists a new MANIFEST with the column families'
// advanced log_numbers only after the new WAL is synced, ensuring RocksDB is in
// a consistent state. RocksDB writes an empty WriteBatch as a sentinel to the
// new WAL, which is synced immediately afterwards. The sequence number of the
// sentinel WriteBatch is the next sequence number immediately after the largest
// sequence number recovered from previous WALs and the MANIFEST, which keeps
// the DB in a consistent state.
// If a future recovery starts from the new MANIFEST, it means the new WAL was
// successfully synced. Due to the sentinel empty write batch at the beginning,
// kPointInTimeRecovery of the WAL is guaranteed to go past this point.
// If a future recovery starts from the old MANIFEST, it means writing the new
// MANIFEST failed. It won't have the "SST ahead of WAL" error.
//
// The combination of corrupting a WAL and injecting an error during subsequent
// re-open exposes the bug of prematurely persisting a new MANIFEST with
// advanced ColumnFamilyData::log_number.
TEST_P(CrashDuringRecoveryWithCorruptionTest, DISABLED_CrashDuringRecovery) {
CloseDb();
Options options;
options.track_and_verify_wals_in_manifest =
track_and_verify_wals_in_manifest_;
options.wal_recovery_mode = WALRecoveryMode::kPointInTimeRecovery;
options.avoid_flush_during_recovery = false;
options.env = env_;
ASSERT_OK(DestroyDB(dbname_, options));
options.create_if_missing = true;
options.max_write_buffer_number = 8;
Reopen(&options);
Status s;
const std::string test_cf_name = "test_cf";
ColumnFamilyHandle* cfh = nullptr;
s = db_->CreateColumnFamily(options, test_cf_name, &cfh);
ASSERT_OK(s);
delete cfh;
CloseDb();
std::vector<ColumnFamilyDescriptor> cf_descs;
cf_descs.emplace_back(kDefaultColumnFamilyName, options);
cf_descs.emplace_back(test_cf_name, options);
std::vector<ColumnFamilyHandle*> handles;
// 1. Open and populate the DB. Write and flush default_cf several times to
// advance wal number so that some column families have advanced log_number
// while others don't.
{
ASSERT_OK(DB::Open(options, dbname_, cf_descs, &handles, &db_));
auto* dbimpl = static_cast_with_check<DBImpl>(db_);
assert(dbimpl);
// Write one key to test_cf.
ASSERT_OK(db_->Put(WriteOptions(), handles[1], "old_key", "dontcare"));
ASSERT_OK(db_->Flush(FlushOptions(), handles[1]));
// Write to default_cf and flush this cf several times to advance wal
// number. TEST_SwitchMemtable makes sure WALs are not synced so the test can
// corrupt an un-synced WAL.
for (int i = 0; i < 2; ++i) {
ASSERT_OK(db_->Put(WriteOptions(), "key" + std::to_string(i), "value"));
ASSERT_OK(dbimpl->TEST_SwitchMemtable());
}
for (auto* h : handles) {
delete h;
}
handles.clear();
CloseDb();
}
// 2. Corrupt the second-to-last un-synced WAL file to emulate a power reset
// that caused the DB to lose the un-synced WAL.
{
std::vector<uint64_t> file_nums;
GetSortedWalFiles(file_nums);
size_t size = file_nums.size();
assert(size >= 2);
uint64_t log_num = file_nums[size - 2];
CorruptFileWithTruncation(FileType::kWalFile, log_num,
/*bytes_to_truncate=*/8);
}
// 3. After the first crash, reopen the DB, which contains the corrupted WAL.
// The default family has a higher log number than the corrupted WAL's number.
//
// Case1: If avoid_flush_during_recovery = true, RocksDB won't flush the data
// from WAL to L0 for all column families (test_cf_name in this case). As a
// result, not all column families can increase their log_numbers, and
// min_log_number_to_keep won't change.
//
// Case2: If avoid_flush_during_recovery = false, all column families have
// flushed their data from WAL to L0 during recovery, and none of them will
// ever need to read the WALs again.
// 4. Fault is injected to fail the recovery.
{
SyncPoint::GetInstance()->DisableProcessing();
SyncPoint::GetInstance()->ClearAllCallBacks();
SyncPoint::GetInstance()->SetCallBack(
"DBImpl::GetLogSizeAndMaybeTruncate:0", [&](void* arg) {
auto* tmp_s = reinterpret_cast<Status*>(arg);
assert(tmp_s);
*tmp_s = Status::IOError("Injected");
});
SyncPoint::GetInstance()->EnableProcessing();
handles.clear();
options.avoid_flush_during_recovery = true;
s = DB::Open(options, dbname_, cf_descs, &handles, &db_);
ASSERT_TRUE(s.IsIOError());
ASSERT_EQ("IO error: Injected", s.ToString());
for (auto* h : handles) {
delete h;
}
CloseDb();
SyncPoint::GetInstance()->DisableProcessing();
SyncPoint::GetInstance()->ClearAllCallBacks();
}
// 5. After the second crash, reopen the DB with the second corruption. The
// default family has a higher log number than the corrupted WAL's number.
//
// Case1: If avoid_flush_during_recovery = true, we persist a new
// MANIFEST with advanced log_numbers for some column families only after
// syncing the WAL. So during the second crash, RocksDB will skip the corrupted
// WAL files as they have been moved to a different folder. Since the newly
// synced WAL file's sequence number (sentinel WriteBatch) will be the next
// sequence number immediately after the largest sequence number recovered
// from previous WALs and the MANIFEST, the DB will be in a consistent state
// and opens successfully.
//
// Case2: If avoid_flush_during_recovery = false, the corrupted WAL is below
// this number. So during a second crash after persisting the new MANIFEST,
// RocksDB will skip the corrupted WAL(s) because they are all below this
// bound. Therefore, we won't hit the "column family inconsistency" error
// message.
{
options.avoid_flush_during_recovery = avoid_flush_during_recovery_;
ASSERT_OK(DB::Open(options, dbname_, cf_descs, &handles, &db_));
for (auto* h : handles) {
delete h;
}
handles.clear();
CloseDb();
}
}
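The recovery ordering described in the comment above DISABLED_CrashDuringRecovery can be sketched in isolation. The FakeWal and FakeManifest types and the FinishRecovery helper below are hypothetical stand-ins for illustration only, not RocksDB internals; a minimal, self-contained sketch of "sync the sentinel WAL first, persist the MANIFEST second" might look like this:
```
// Standalone sketch of the recovery-ordering fix described above. FakeWal and
// FakeManifest are hypothetical stand-ins, not RocksDB APIs.
#include <cstdint>
#include <iostream>
#include <vector>

struct FakeWal {
  std::vector<uint64_t> seqnos;  // sequence numbers of batches in this WAL
  bool synced = false;
  void AddEmptySentinel(uint64_t next_seqno) { seqnos.push_back(next_seqno); }
  bool Sync() {  // pretend the fsync succeeded
    synced = true;
    return true;
  }
};

struct FakeManifest {
  uint64_t min_log_number_to_keep = 0;
  bool persisted = false;
};

// Only after the new WAL (with its sentinel batch) is durable do we persist a
// MANIFEST that advances log numbers; otherwise the old MANIFEST stays valid.
bool FinishRecovery(FakeWal& new_wal, FakeManifest& manifest,
                    uint64_t largest_recovered_seqno, uint64_t new_log_number) {
  new_wal.AddEmptySentinel(largest_recovered_seqno + 1);
  if (!new_wal.Sync()) {
    return false;  // crash here => next recovery starts from the old MANIFEST
  }
  manifest.min_log_number_to_keep = new_log_number;
  manifest.persisted = true;  // crash after this is safe: the WAL is durable
  return true;
}

int main() {
  FakeWal wal;
  FakeManifest manifest;
  bool ok = FinishRecovery(wal, manifest, /*largest_recovered_seqno=*/41,
                           /*new_log_number=*/7);
  std::cout << (ok ? "recovered" : "failed") << ", keep logs >= "
            << manifest.min_log_number_to_keep << "\n";
  return 0;
}
```
Crashing before the sync leaves the old MANIFEST authoritative, which is the "no SST ahead of WAL" guarantee the test exercises.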
// In the case of a TransactionDB, two-phase commit is enabled. The prepare
// section of an uncommitted transaction always needs to be kept. Even if we
// perform a flush during recovery, we may still need to hold an old WAL. The
// min_log_number_to_keep won't change, and the "Column family inconsistency:
// SST file contains data beyond the point of corruption" error will be hit,
// causing recovery to fail.
//
// With the fix, RocksDB persists a new MANIFEST with the column families'
// advanced log_numbers only after the new WAL is synced, ensuring RocksDB is in
// a consistent state. RocksDB writes an empty WriteBatch as a sentinel to the
// new WAL, which is synced immediately afterwards. The sequence number of the
// sentinel WriteBatch is the next sequence number immediately after the largest
// sequence number recovered from previous WALs and the MANIFEST, which keeps
// the DB in a consistent state.
// If a future recovery starts from the new MANIFEST, it means the new WAL was
// successfully synced. Due to the sentinel empty write batch at the beginning,
// kPointInTimeRecovery of the WAL is guaranteed to go past this point.
// If a future recovery starts from the old MANIFEST, it means writing the new
// MANIFEST failed. It won't have the "SST ahead of WAL" error.
//
// The combination of corrupting a WAL and injecting an error during subsequent
// re-open exposes the bug of prematurely persisting a new MANIFEST with
// advanced ColumnFamilyData::log_number.
TEST_P(CrashDuringRecoveryWithCorruptionTest,
DISABLED_TxnDbCrashDuringRecovery) {
CloseDb();
Options options;
options.wal_recovery_mode = WALRecoveryMode::kPointInTimeRecovery;
options.track_and_verify_wals_in_manifest =
track_and_verify_wals_in_manifest_;
options.avoid_flush_during_recovery = false;
options.env = env_;
ASSERT_OK(DestroyDB(dbname_, options));
options.create_if_missing = true;
options.max_write_buffer_number = 3;
Reopen(&options);
// Create cf test_cf_name.
ColumnFamilyHandle* cfh = nullptr;
const std::string test_cf_name = "test_cf";
Status s = db_->CreateColumnFamily(options, test_cf_name, &cfh);
ASSERT_OK(s);
delete cfh;
CloseDb();
std::vector<ColumnFamilyDescriptor> cf_descs;
cf_descs.emplace_back(kDefaultColumnFamilyName, options);
cf_descs.emplace_back(test_cf_name, options);
std::vector<ColumnFamilyHandle*> handles;
TransactionDB* txn_db = nullptr;
TransactionDBOptions txn_db_opts;
// 1. Open and populate the DB. Write and flush default_cf several times to
// advance wal number so that some column families have advanced log_number
// while others don't.
{
ASSERT_OK(TransactionDB::Open(options, txn_db_opts, dbname_, cf_descs,
&handles, &txn_db));
auto* txn = txn_db->BeginTransaction(WriteOptions(), TransactionOptions());
// Put cf1
ASSERT_OK(txn->Put(handles[1], "foo", "value"));
ASSERT_OK(txn->SetName("txn0"));
ASSERT_OK(txn->Prepare());
ASSERT_OK(txn_db->Flush(FlushOptions()));
delete txn;
txn = nullptr;
auto* dbimpl = static_cast_with_check<DBImpl>(txn_db->GetRootDB());
assert(dbimpl);
// Put and flush cf0
for (int i = 0; i < 2; ++i) {
ASSERT_OK(txn_db->Put(WriteOptions(), "dontcare", "value"));
ASSERT_OK(dbimpl->TEST_SwitchMemtable());
}
// Put cf1
txn = txn_db->BeginTransaction(WriteOptions(), TransactionOptions());
ASSERT_OK(txn->Put(handles[1], "foo1", "value"));
ASSERT_OK(txn->Commit());
delete txn;
txn = nullptr;
for (auto* h : handles) {
delete h;
}
handles.clear();
delete txn_db;
}
// 2. Corrupt the second-to-last WAL to emulate a power reset that caused the
// DB to lose the un-synced WAL.
{
std::vector<uint64_t> file_nums;
GetSortedWalFiles(file_nums);
size_t size = file_nums.size();
assert(size >= 2);
uint64_t log_num = file_nums[size - 2];
CorruptFileWithTruncation(FileType::kWalFile, log_num,
/*bytes_to_truncate=*/8);
}
// 3. After the first crash, reopen the DB, which contains the corrupted WAL.
// The default family has a higher log number than the corrupted WAL's number.
// There may be old WAL files that it must not delete because they can contain
// data of uncommitted transactions. As a result, min_log_number_to_keep won't
// change.
{
SyncPoint::GetInstance()->DisableProcessing();
SyncPoint::GetInstance()->ClearAllCallBacks();
SyncPoint::GetInstance()->SetCallBack(
"DBImpl::Open::BeforeSyncWAL", [&](void* arg) {
auto* tmp_s = reinterpret_cast<Status*>(arg);
assert(tmp_s);
*tmp_s = Status::IOError("Injected");
});
SyncPoint::GetInstance()->EnableProcessing();
handles.clear();
s = TransactionDB::Open(options, txn_db_opts, dbname_, cf_descs, &handles,
&txn_db);
ASSERT_TRUE(s.IsIOError());
ASSERT_EQ("IO error: Injected", s.ToString());
for (auto* h : handles) {
delete h;
}
CloseDb();
SyncPoint::GetInstance()->DisableProcessing();
SyncPoint::GetInstance()->ClearAllCallBacks();
}
// 4. Corrupt max_wal_num.
{
std::vector<uint64_t> file_nums;
GetSortedWalFiles(file_nums);
size_t size = file_nums.size();
assert(size >= 2);
uint64_t log_num = file_nums[size - 1];
CorruptFileWithTruncation(FileType::kWalFile, log_num);
}
// 5. After the second crash, reopen the DB with the second corruption. The
// default family has a higher log number than the corrupted WAL's number.
// We persist a new MANIFEST with advanced log_numbers for some column
// families only after syncing the WAL. So during the second crash, RocksDB
// will skip the corrupted WAL files as they have been moved to a different
// folder. Since the newly synced WAL file's sequence number (sentinel
// WriteBatch) will be the next sequence number immediately after the largest
// sequence number recovered from previous WALs and the MANIFEST, the DB will
// be in a consistent state and opens successfully.
{
ASSERT_OK(TransactionDB::Open(options, txn_db_opts, dbname_, cf_descs,
&handles, &txn_db));
for (auto* h : handles) {
delete h;
}
delete txn_db;
}
}
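The two-phase-commit dependency described in the comment above DISABLED_TxnDbCrashDuringRecovery, where a prepared but uncommitted transaction pins its WAL, can be reproduced with the public TransactionDB API. A minimal sketch follows; the database path and transaction name are placeholders, and error handling is reduced to asserts:
```
#include <cassert>
#include "rocksdb/utilities/transaction.h"
#include "rocksdb/utilities/transaction_db.h"

using namespace ROCKSDB_NAMESPACE;

int main() {
  Options options;
  options.create_if_missing = true;
  TransactionDBOptions txn_db_options;
  TransactionDB* txn_db = nullptr;
  // "/tmp/txn_2pc_demo" is a placeholder path for this sketch.
  Status s =
      TransactionDB::Open(options, txn_db_options, "/tmp/txn_2pc_demo", &txn_db);
  assert(s.ok());

  // Phase 1: the prepare record lands in the current WAL. Until the commit (or
  // rollback) record is written, RocksDB must keep that WAL around even if the
  // memtables are flushed, so min_log_number_to_keep may not advance.
  Transaction* txn = txn_db->BeginTransaction(WriteOptions());
  s = txn->SetName("xid1");  // a name is required before Prepare()
  assert(s.ok());
  s = txn->Put("foo", "value");
  assert(s.ok());
  s = txn->Prepare();
  assert(s.ok());

  // ... flushes and WAL switches may happen here without freeing the old WAL ...

  // Phase 2: committing releases the dependency on the old WAL.
  s = txn->Commit();
  assert(s.ok());

  delete txn;
  delete txn_db;
  return 0;
}
```
This mirrors the first block of the test above, where the prepared "txn0" keeps an old WAL live across several memtable switches before it is finally committed.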
// This test is similar to
// CrashDuringRecoveryWithCorruptionTest.CrashDuringRecovery, except that it
// calls flush and corrupts the last WAL. It calls flush to sync some of the
// WALs; the remaining WALs are unsynced, and one of them is then corrupted to
// simulate a crash.
//
// In the case of a non-TransactionDB with avoid_flush_during_recovery = true,
// RocksDB won't flush the data from the WAL to L0 for all column families if
// possible. As a result, not all column families can increase their
// log_numbers, and min_log_number_to_keep won't change.
// RocksDB may prematurely persist a new MANIFEST that advances log_numbers for
// some column families even before we can declare the DB to be in a consistent
// state after recovery (which is when the new WAL is synced).
//
// If there is a power failure before we sync the new WAL, we will end up in a
// situation in which, after persisting the MANIFEST, RocksDB will see some
// column families' log_numbers larger than the corrupted WAL's number, and the
// "Column family inconsistency: SST file contains data beyond the point of
// corruption" error will be hit, causing recovery to fail.
//
// With the fix, RocksDB persists a new MANIFEST with the column families'
// advanced log_numbers only after the new WAL is synced, ensuring RocksDB is in
// a consistent state. RocksDB writes an empty WriteBatch as a sentinel to the
// new WAL, which is synced immediately afterwards. The sequence number of the
// sentinel WriteBatch is the next sequence number immediately after the largest
// sequence number recovered from previous WALs and the MANIFEST, which keeps
// the DB in a consistent state.
// If a future recovery starts from the new MANIFEST, it means the new WAL was
// successfully synced. Due to the sentinel empty write batch at the beginning,
// kPointInTimeRecovery of the WAL is guaranteed to go past this point.
// If a future recovery starts from the old MANIFEST, it means writing the new
// MANIFEST failed. It won't have the "SST ahead of WAL" error.
// The combination of corrupting a WAL and injecting an error during subsequent
// re-open exposes the bug of prematurely persisting a new MANIFEST with
// advanced ColumnFamilyData::log_number.
TEST_P(CrashDuringRecoveryWithCorruptionTest,
DISABLED_CrashDuringRecoveryWithFlush) {
CloseDb();
Options options;
options.wal_recovery_mode = WALRecoveryMode::kPointInTimeRecovery;
options.avoid_flush_during_recovery = false;
options.env = env_;
options.create_if_missing = true;
ASSERT_OK(DestroyDB(dbname_, options));
Reopen(&options);
ColumnFamilyHandle* cfh = nullptr;
const std::string test_cf_name = "test_cf";
Status s = db_->CreateColumnFamily(options, test_cf_name, &cfh);
ASSERT_OK(s);
delete cfh;
CloseDb();
std::vector<ColumnFamilyDescriptor> cf_descs;
cf_descs.emplace_back(kDefaultColumnFamilyName, options);
cf_descs.emplace_back(test_cf_name, options);
std::vector<ColumnFamilyHandle*> handles;
{
ASSERT_OK(DB::Open(options, dbname_, cf_descs, &handles, &db_));
// Write one key to test_cf.
ASSERT_OK(db_->Put(WriteOptions(), handles[1], "old_key", "dontcare"));
// Write to default_cf and flush this cf several times to advance wal
// number.
for (int i = 0; i < 2; ++i) {
ASSERT_OK(db_->Put(WriteOptions(), "key" + std::to_string(i), "value"));
ASSERT_OK(db_->Flush(FlushOptions()));
}
ASSERT_OK(db_->Put(WriteOptions(), handles[1], "dontcare", "dontcare"));
for (auto* h : handles) {
delete h;
}
handles.clear();
CloseDb();
}
// Corrupt the last un-synced WAL file to emulate a power reset that
// caused the DB to lose the un-synced WAL.
{
std::vector<uint64_t> file_nums;
GetSortedWalFiles(file_nums);
size_t size = file_nums.size();
uint64_t log_num = file_nums[size - 1];
CorruptFileWithTruncation(FileType::kWalFile, log_num,
/*bytes_to_truncate=*/8);
}
// Fault is injected to fail the recovery.
{
SyncPoint::GetInstance()->DisableProcessing();
SyncPoint::GetInstance()->ClearAllCallBacks();
SyncPoint::GetInstance()->SetCallBack(
"DBImpl::GetLogSizeAndMaybeTruncate:0", [&](void* arg) {
auto* tmp_s = reinterpret_cast<Status*>(arg);
assert(tmp_s);
*tmp_s = Status::IOError("Injected");
});
SyncPoint::GetInstance()->EnableProcessing();
handles.clear();
options.avoid_flush_during_recovery = true;
s = DB::Open(options, dbname_, cf_descs, &handles, &db_);
ASSERT_TRUE(s.IsIOError());
ASSERT_EQ("IO error: Injected", s.ToString());
for (auto* h : handles) {
delete h;
}
CloseDb();
SyncPoint::GetInstance()->DisableProcessing();
SyncPoint::GetInstance()->ClearAllCallBacks();
}
// Reopen db again
{
options.avoid_flush_during_recovery = avoid_flush_during_recovery_;
ASSERT_OK(DB::Open(options, dbname_, cf_descs, &handles, &db_));
for (auto* h : handles) {
delete h;
}
}
}
} // namespace ROCKSDB_NAMESPACE
int main(int argc, char** argv) {

View File

@ -95,8 +95,8 @@ class CuckooTableDBTest : public testing::Test {
int NumTableFilesAtLevel(int level) {
std::string property;
EXPECT_TRUE(db_->GetProperty(
"rocksdb.num-files-at-level" + std::to_string(level), &property));
EXPECT_TRUE(db_->GetProperty("rocksdb.num-files-at-level" + ToString(level),
&property));
return atoi(property.c_str());
}

View File

@ -175,10 +175,7 @@ TEST_F(DBBasicTest, ReadOnlyDB) {
ASSERT_TRUE(db_->SyncWAL().IsNotSupported());
}
// TODO akanksha: Update the test to check that combination
// does not actually write to FS (use open read-only with
// CompositeEnvWrapper+ReadOnlyFileSystem).
TEST_F(DBBasicTest, DISABLED_ReadOnlyDBWithWriteDBIdToManifestSet) {
TEST_F(DBBasicTest, ReadOnlyDBWithWriteDBIdToManifestSet) {
ASSERT_OK(Put("foo", "v1"));
ASSERT_OK(Put("bar", "v2"));
ASSERT_OK(Put("foo", "v3"));
@ -3783,7 +3780,7 @@ TEST_P(DBBasicTestDeadline, PointLookupDeadline) {
Random rnd(301);
for (int i = 0; i < 400; ++i) {
std::string key = "k" + std::to_string(i);
std::string key = "k" + ToString(i);
ASSERT_OK(Put(key, rnd.RandomString(100)));
}
ASSERT_OK(Flush());
@ -3866,7 +3863,7 @@ TEST_P(DBBasicTestDeadline, IteratorDeadline) {
Random rnd(301);
for (int i = 0; i < 400; ++i) {
std::string key = "k" + std::to_string(i);
std::string key = "k" + ToString(i);
ASSERT_OK(Put(key, rnd.RandomString(100)));
}
ASSERT_OK(Flush());

View File

@ -13,7 +13,6 @@
#include "cache/cache_entry_roles.h"
#include "cache/cache_key.h"
#include "cache/fast_lru_cache.h"
#include "cache/lru_cache.h"
#include "db/column_family.h"
#include "db/db_impl/db_impl.h"
@ -76,7 +75,7 @@ class DBBlockCacheTest : public DBTestBase {
void InitTable(const Options& /*options*/) {
std::string value(kValueSize, 'a');
for (size_t i = 0; i < kNumBlocks; i++) {
ASSERT_OK(Put(std::to_string(i), value.c_str()));
ASSERT_OK(Put(ToString(i), value.c_str()));
}
}
@ -205,7 +204,7 @@ TEST_F(DBBlockCacheTest, IteratorBlockCacheUsage) {
ASSERT_EQ(0, cache->GetUsage());
iter = db_->NewIterator(read_options);
iter->Seek(std::to_string(0));
iter->Seek(ToString(0));
ASSERT_LT(0, cache->GetUsage());
delete iter;
iter = nullptr;
@ -236,7 +235,7 @@ TEST_F(DBBlockCacheTest, TestWithoutCompressedBlockCache) {
// Load blocks into cache.
for (size_t i = 0; i + 1 < kNumBlocks; i++) {
iter = db_->NewIterator(read_options);
iter->Seek(std::to_string(i));
iter->Seek(ToString(i));
ASSERT_OK(iter->status());
CheckCacheCounters(options, 1, 0, 1, 0);
iterators[i].reset(iter);
@ -249,7 +248,7 @@ TEST_F(DBBlockCacheTest, TestWithoutCompressedBlockCache) {
// Test with strict capacity limit.
cache->SetStrictCapacityLimit(true);
iter = db_->NewIterator(read_options);
iter->Seek(std::to_string(kNumBlocks - 1));
iter->Seek(ToString(kNumBlocks - 1));
ASSERT_TRUE(iter->status().IsIncomplete());
CheckCacheCounters(options, 1, 0, 0, 1);
delete iter;
@ -263,7 +262,7 @@ TEST_F(DBBlockCacheTest, TestWithoutCompressedBlockCache) {
ASSERT_EQ(0, cache->GetPinnedUsage());
for (size_t i = 0; i + 1 < kNumBlocks; i++) {
iter = db_->NewIterator(read_options);
iter->Seek(std::to_string(i));
iter->Seek(ToString(i));
ASSERT_OK(iter->status());
CheckCacheCounters(options, 0, 1, 0, 0);
iterators[i].reset(iter);
@ -289,7 +288,7 @@ TEST_F(DBBlockCacheTest, TestWithCompressedBlockCache) {
std::string value(kValueSize, 'a');
for (size_t i = 0; i < kNumBlocks; i++) {
ASSERT_OK(Put(std::to_string(i), value));
ASSERT_OK(Put(ToString(i), value));
ASSERT_OK(Flush());
}
@ -313,7 +312,7 @@ TEST_F(DBBlockCacheTest, TestWithCompressedBlockCache) {
// Load blocks into cache.
for (size_t i = 0; i < kNumBlocks - 1; i++) {
ASSERT_EQ(value, Get(std::to_string(i)));
ASSERT_EQ(value, Get(ToString(i)));
CheckCacheCounters(options, 1, 0, 1, 0);
CheckCompressedCacheCounters(options, 1, 0, 1, 0);
}
@ -334,7 +333,7 @@ TEST_F(DBBlockCacheTest, TestWithCompressedBlockCache) {
// Load last key block.
ASSERT_EQ("Result incomplete: Insert failed due to LRU cache being full.",
Get(std::to_string(kNumBlocks - 1)));
Get(ToString(kNumBlocks - 1)));
// Failure will also record the miss counter.
CheckCacheCounters(options, 1, 0, 0, 1);
CheckCompressedCacheCounters(options, 1, 0, 1, 0);
@ -343,7 +342,7 @@ TEST_F(DBBlockCacheTest, TestWithCompressedBlockCache) {
// cache and load into block cache.
cache->SetStrictCapacityLimit(false);
// Load last key block.
ASSERT_EQ(value, Get(std::to_string(kNumBlocks - 1)));
ASSERT_EQ(value, Get(ToString(kNumBlocks - 1)));
CheckCacheCounters(options, 1, 0, 1, 0);
CheckCompressedCacheCounters(options, 0, 1, 0, 0);
}
@ -568,7 +567,7 @@ TEST_F(DBBlockCacheTest, FillCacheAndIterateDB) {
Iterator* iter = nullptr;
iter = db_->NewIterator(read_options);
iter->Seek(std::to_string(0));
iter->Seek(ToString(0));
while (iter->Valid()) {
iter->Next();
}
@ -646,10 +645,10 @@ TEST_F(DBBlockCacheTest, WarmCacheWithDataBlocksDuringFlush) {
std::string value(kValueSize, 'a');
for (size_t i = 1; i <= kNumBlocks; i++) {
ASSERT_OK(Put(std::to_string(i), value));
ASSERT_OK(Put(ToString(i), value));
ASSERT_OK(Flush());
ASSERT_EQ(i, options.statistics->getTickerCount(BLOCK_CACHE_DATA_ADD));
ASSERT_EQ(value, Get(std::to_string(i)));
ASSERT_EQ(value, Get(ToString(i)));
ASSERT_EQ(0, options.statistics->getTickerCount(BLOCK_CACHE_DATA_MISS));
ASSERT_EQ(i, options.statistics->getTickerCount(BLOCK_CACHE_DATA_HIT));
}
@ -706,7 +705,7 @@ TEST_P(DBBlockCacheTest1, WarmCacheWithBlocksDuringFlush) {
std::string value(kValueSize, 'a');
for (size_t i = 1; i <= kNumBlocks; i++) {
ASSERT_OK(Put(std::to_string(i), value));
ASSERT_OK(Put(ToString(i), value));
ASSERT_OK(Flush());
ASSERT_EQ(i, options.statistics->getTickerCount(BLOCK_CACHE_DATA_ADD));
if (filter_type == 1) {
@ -718,7 +717,7 @@ TEST_P(DBBlockCacheTest1, WarmCacheWithBlocksDuringFlush) {
ASSERT_EQ(i, options.statistics->getTickerCount(BLOCK_CACHE_INDEX_ADD));
ASSERT_EQ(i, options.statistics->getTickerCount(BLOCK_CACHE_FILTER_ADD));
}
ASSERT_EQ(value, Get(std::to_string(i)));
ASSERT_EQ(value, Get(ToString(i)));
ASSERT_EQ(0, options.statistics->getTickerCount(BLOCK_CACHE_DATA_MISS));
ASSERT_EQ(i, options.statistics->getTickerCount(BLOCK_CACHE_DATA_HIT));
@ -773,12 +772,12 @@ TEST_F(DBBlockCacheTest, DynamicallyWarmCacheDuringFlush) {
std::string value(kValueSize, 'a');
for (size_t i = 1; i <= 5; i++) {
ASSERT_OK(Put(std::to_string(i), value));
ASSERT_OK(Put(ToString(i), value));
ASSERT_OK(Flush());
ASSERT_EQ(1,
options.statistics->getAndResetTickerCount(BLOCK_CACHE_DATA_ADD));
ASSERT_EQ(value, Get(std::to_string(i)));
ASSERT_EQ(value, Get(ToString(i)));
ASSERT_EQ(0,
options.statistics->getAndResetTickerCount(BLOCK_CACHE_DATA_ADD));
ASSERT_EQ(
@ -791,12 +790,12 @@ TEST_F(DBBlockCacheTest, DynamicallyWarmCacheDuringFlush) {
{{"block_based_table_factory", "{prepopulate_block_cache=kDisable;}"}}));
for (size_t i = 6; i <= kNumBlocks; i++) {
ASSERT_OK(Put(std::to_string(i), value));
ASSERT_OK(Put(ToString(i), value));
ASSERT_OK(Flush());
ASSERT_EQ(0,
options.statistics->getAndResetTickerCount(BLOCK_CACHE_DATA_ADD));
ASSERT_EQ(value, Get(std::to_string(i)));
ASSERT_EQ(value, Get(ToString(i)));
ASSERT_EQ(1,
options.statistics->getAndResetTickerCount(BLOCK_CACHE_DATA_ADD));
ASSERT_EQ(
@ -935,8 +934,7 @@ TEST_F(DBBlockCacheTest, AddRedundantStats) {
int iterations_tested = 0;
for (std::shared_ptr<Cache> base_cache :
{NewLRUCache(capacity, num_shard_bits),
NewClockCache(capacity, num_shard_bits),
NewFastLRUCache(capacity, num_shard_bits)}) {
NewClockCache(capacity, num_shard_bits)}) {
if (!base_cache) {
// Skip clock cache when not supported
continue;
@ -1290,8 +1288,7 @@ TEST_F(DBBlockCacheTest, CacheEntryRoleStats) {
int iterations_tested = 0;
for (bool partition : {false, true}) {
for (std::shared_ptr<Cache> cache :
{NewLRUCache(capacity), NewClockCache(capacity),
NewFastLRUCache(capacity)}) {
{NewLRUCache(capacity), NewClockCache(capacity)}) {
if (!cache) {
// Skip clock cache when not supported
continue;
@ -1407,11 +1404,21 @@ TEST_F(DBBlockCacheTest, CacheEntryRoleStats) {
ASSERT_TRUE(
db_->GetMapProperty(DB::Properties::kBlockCacheEntryStats, &values));
for (size_t i = 0; i < kNumCacheEntryRoles; ++i) {
auto role = static_cast<CacheEntryRole>(i);
EXPECT_EQ(std::to_string(expected[i]),
values[BlockCacheEntryStatsMapKeys::EntryCount(role)]);
}
EXPECT_EQ(
ToString(expected[static_cast<size_t>(CacheEntryRole::kIndexBlock)]),
values["count.index-block"]);
EXPECT_EQ(
ToString(expected[static_cast<size_t>(CacheEntryRole::kDataBlock)]),
values["count.data-block"]);
EXPECT_EQ(
ToString(expected[static_cast<size_t>(CacheEntryRole::kFilterBlock)]),
values["count.filter-block"]);
EXPECT_EQ(
ToString(
prev_expected[static_cast<size_t>(CacheEntryRole::kWriteBuffer)]),
values["count.write-buffer"]);
EXPECT_EQ(ToString(expected[static_cast<size_t>(CacheEntryRole::kMisc)]),
values["count.misc"]);
// Add one for kWriteBuffer
{
@ -1422,20 +1429,18 @@ TEST_F(DBBlockCacheTest, CacheEntryRoleStats) {
// re-scanning stats, but not totally aggressive.
// Within some time window, we will get cached entry stats
env_->MockSleepForSeconds(1);
EXPECT_EQ(std::to_string(prev_expected[static_cast<size_t>(
EXPECT_EQ(ToString(prev_expected[static_cast<size_t>(
CacheEntryRole::kWriteBuffer)]),
values[BlockCacheEntryStatsMapKeys::EntryCount(
CacheEntryRole::kWriteBuffer)]);
values["count.write-buffer"]);
// Not enough for a "background" miss but enough for a "foreground" miss
env_->MockSleepForSeconds(45);
ASSERT_TRUE(db_->GetMapProperty(DB::Properties::kBlockCacheEntryStats,
&values));
EXPECT_EQ(
std::to_string(
ToString(
expected[static_cast<size_t>(CacheEntryRole::kWriteBuffer)]),
values[BlockCacheEntryStatsMapKeys::EntryCount(
CacheEntryRole::kWriteBuffer)]);
values["count.write-buffer"]);
}
prev_expected = expected;
@ -1640,7 +1645,7 @@ TEST_P(DBBlockCacheKeyTest, StableCacheKeys) {
SstFileWriter sst_file_writer(EnvOptions(), options);
std::vector<std::string> external;
for (int i = 0; i < 2; ++i) {
std::string f = dbname_ + "/external" + std::to_string(i) + ".sst";
std::string f = dbname_ + "/external" + ToString(i) + ".sst";
external.push_back(f);
ASSERT_OK(sst_file_writer.Open(f));
ASSERT_OK(sst_file_writer.Put(Key(key_count), "abc"));
@ -1724,7 +1729,7 @@ class CacheKeyTest : public testing::Test {
// Like SemiStructuredUniqueIdGen::GenerateNext
tp_.db_session_id = EncodeSessionId(base_session_upper_,
base_session_lower_ ^ session_counter_);
tp_.db_id = std::to_string(db_id_);
tp_.db_id = ToString(db_id_);
tp_.orig_file_number = file_number_;
bool is_stable;
std::string cur_session_id = ""; // ignored

View File

@ -17,11 +17,8 @@
#include "db/db_test_util.h"
#include "options/options_helper.h"
#include "port/stack_trace.h"
#include "rocksdb/advanced_options.h"
#include "rocksdb/convenience.h"
#include "rocksdb/filter_policy.h"
#include "rocksdb/perf_context.h"
#include "rocksdb/statistics.h"
#include "rocksdb/table.h"
#include "table/block_based/block_based_table_reader.h"
#include "table/block_based/filter_policy_internal.h"
@ -111,7 +108,6 @@ TEST_P(DBBloomFilterTestDefFormatVersion, KeyMayExist) {
options_override.filter_policy = Create(20, bfp_impl_);
options_override.partition_filters = partition_filters_;
options_override.metadata_block_size = 32;
options_override.full_block_cache = true;
Options options = CurrentOptions(options_override);
if (partition_filters_) {
auto* table_options =
@ -802,87 +798,83 @@ TEST_F(DBBloomFilterTest, BloomFilterRate) {
}
}
namespace {
struct CompatibilityConfig {
std::shared_ptr<const FilterPolicy> policy;
bool partitioned;
uint32_t format_version;
void SetInTableOptions(BlockBasedTableOptions* table_options) {
table_options->filter_policy = policy;
table_options->partition_filters = partitioned;
if (partitioned) {
table_options->index_type =
BlockBasedTableOptions::IndexType::kTwoLevelIndexSearch;
} else {
table_options->index_type =
BlockBasedTableOptions::IndexType::kBinarySearch;
}
table_options->format_version = format_version;
}
};
// High bits per key -> almost no FPs
std::shared_ptr<const FilterPolicy> kCompatibilityBloomPolicy{
NewBloomFilterPolicy(20)};
// bloom_before_level=-1 -> always use Ribbon
std::shared_ptr<const FilterPolicy> kCompatibilityRibbonPolicy{
NewRibbonFilterPolicy(20, -1)};
std::vector<CompatibilityConfig> kCompatibilityConfigs = {
{Create(20, kDeprecatedBlock), false,
BlockBasedTableOptions().format_version},
{kCompatibilityBloomPolicy, false, BlockBasedTableOptions().format_version},
{kCompatibilityBloomPolicy, true, BlockBasedTableOptions().format_version},
{kCompatibilityBloomPolicy, false, /* legacy Bloom */ 4U},
{kCompatibilityRibbonPolicy, false,
BlockBasedTableOptions().format_version},
{kCompatibilityRibbonPolicy, true, BlockBasedTableOptions().format_version},
};
} // namespace
TEST_F(DBBloomFilterTest, BloomFilterCompatibility) {
Options options = CurrentOptions();
options.statistics = ROCKSDB_NAMESPACE::CreateDBStatistics();
options.level0_file_num_compaction_trigger =
static_cast<int>(kCompatibilityConfigs.size()) + 1;
options.max_open_files = -1;
BlockBasedTableOptions table_options;
table_options.filter_policy.reset(NewBloomFilterPolicy(10, true));
options.table_factory.reset(NewBlockBasedTableFactory(table_options));
Close();
// Create with block based filter
CreateAndReopenWithCF({"pikachu"}, options);
// Create one file for each kind of filter. Each file covers a distinct key
// range.
for (size_t i = 0; i < kCompatibilityConfigs.size(); ++i) {
BlockBasedTableOptions table_options;
kCompatibilityConfigs[i].SetInTableOptions(&table_options);
ASSERT_TRUE(table_options.filter_policy != nullptr);
options.table_factory.reset(NewBlockBasedTableFactory(table_options));
Reopen(options);
std::string prefix = std::to_string(i) + "_";
ASSERT_OK(Put(prefix + "A", "val"));
ASSERT_OK(Put(prefix + "Z", "val"));
ASSERT_OK(Flush());
const int maxKey = 10000;
for (int i = 0; i < maxKey; i++) {
ASSERT_OK(Put(1, Key(i), Key(i)));
}
ASSERT_OK(Put(1, Key(maxKey + 55555), Key(maxKey + 55555)));
Flush(1);
// Test filter is used between each pair of {reader,writer} configurations,
// because any built-in FilterPolicy should be able to read filters from any
// other built-in FilterPolicy
for (size_t i = 0; i < kCompatibilityConfigs.size(); ++i) {
// Check db with full filter
table_options.filter_policy.reset(NewBloomFilterPolicy(10, false));
options.table_factory.reset(NewBlockBasedTableFactory(table_options));
ReopenWithColumnFamilies({"default", "pikachu"}, options);
// Check if they can be found
for (int i = 0; i < maxKey; i++) {
ASSERT_EQ(Key(i), Get(1, Key(i)));
}
ASSERT_EQ(TestGetTickerCount(options, BLOOM_FILTER_USEFUL), 0);
// Check db with partitioned full filter
table_options.partition_filters = true;
table_options.index_type =
BlockBasedTableOptions::IndexType::kTwoLevelIndexSearch;
table_options.filter_policy.reset(NewBloomFilterPolicy(10, false));
options.table_factory.reset(NewBlockBasedTableFactory(table_options));
ReopenWithColumnFamilies({"default", "pikachu"}, options);
// Check if they can be found
for (int i = 0; i < maxKey; i++) {
ASSERT_EQ(Key(i), Get(1, Key(i)));
}
ASSERT_EQ(TestGetTickerCount(options, BLOOM_FILTER_USEFUL), 0);
}
TEST_F(DBBloomFilterTest, BloomFilterReverseCompatibility) {
for (bool partition_filters : {true, false}) {
Options options = CurrentOptions();
options.statistics = ROCKSDB_NAMESPACE::CreateDBStatistics();
BlockBasedTableOptions table_options;
kCompatibilityConfigs[i].SetInTableOptions(&table_options);
options.table_factory.reset(NewBlockBasedTableFactory(table_options));
Reopen(options);
for (size_t j = 0; j < kCompatibilityConfigs.size(); ++j) {
std::string prefix = std::to_string(j) + "_";
ASSERT_EQ("val", Get(prefix + "A")); // Filter positive
ASSERT_EQ("val", Get(prefix + "Z")); // Filter positive
// Filter negative, with high probability
ASSERT_EQ("NOT_FOUND", Get(prefix + "Q"));
// FULL_POSITIVE does not include block-based filter case (j == 0)
EXPECT_EQ(TestGetAndResetTickerCount(options, BLOOM_FILTER_FULL_POSITIVE),
j == 0 ? 0 : 2);
EXPECT_EQ(TestGetAndResetTickerCount(options, BLOOM_FILTER_USEFUL), 1);
if (partition_filters) {
table_options.partition_filters = true;
table_options.index_type =
BlockBasedTableOptions::IndexType::kTwoLevelIndexSearch;
}
table_options.filter_policy.reset(NewBloomFilterPolicy(10, false));
options.table_factory.reset(NewBlockBasedTableFactory(table_options));
DestroyAndReopen(options);
// Create with full filter
CreateAndReopenWithCF({"pikachu"}, options);
const int maxKey = 10000;
for (int i = 0; i < maxKey; i++) {
ASSERT_OK(Put(1, Key(i), Key(i)));
}
ASSERT_OK(Put(1, Key(maxKey + 55555), Key(maxKey + 55555)));
Flush(1);
// Check db with block_based filter
table_options.filter_policy.reset(NewBloomFilterPolicy(10, true));
options.table_factory.reset(NewBlockBasedTableFactory(table_options));
ReopenWithColumnFamilies({"default", "pikachu"}, options);
// Check if they can be found
for (int i = 0; i < maxKey; i++) {
ASSERT_EQ(Key(i), Get(1, Key(i)));
}
ASSERT_EQ(TestGetTickerCount(options, BLOOM_FILTER_USEFUL), 0);
}
}
@ -927,7 +919,7 @@ class FilterConstructResPeakTrackingCache : public CacheWrapper {
}
using Cache::Release;
bool Release(Handle* handle, bool erase_if_last_ref = false) override {
bool Release(Handle* handle, bool force_erase = false) override {
auto deleter = GetDeleter(handle);
if (deleter == kNoopDeleterForFilterConstruction) {
if (!last_peak_tracked_) {
@ -937,7 +929,7 @@ class FilterConstructResPeakTrackingCache : public CacheWrapper {
}
cur_cache_res_ -= GetCharge(handle);
}
bool is_successful = target_->Release(handle, erase_if_last_ref);
bool is_successful = target_->Release(handle, force_erase);
return is_successful;
}
@ -960,8 +952,8 @@ class FilterConstructResPeakTrackingCache : public CacheWrapper {
const Cache::DeleterFn
FilterConstructResPeakTrackingCache::kNoopDeleterForFilterConstruction =
CacheReservationManagerImpl<
CacheEntryRole::kFilterConstruction>::TEST_GetNoopDeleterForRole();
CacheReservationManager::TEST_GetNoopDeleterForRole<
CacheEntryRole::kFilterConstruction>();
// To align with the type of hash entry being reserved in implementation.
using FilterConstructionReserveMemoryHash = uint64_t;
@ -991,9 +983,7 @@ class DBFilterConstructionReserveMemoryTestWithParam
// trigger at least 1 dummy entry reservation each for hash entries and
// final filter, we need a large number of keys to ensure we have at least
// two partitions.
num_key_ = 18 *
CacheReservationManagerImpl<
CacheEntryRole::kFilterConstruction>::GetDummyEntrySize() /
num_key_ = 18 * CacheReservationManager::GetDummyEntrySize() /
sizeof(FilterConstructionReserveMemoryHash);
} else if (policy_ == kFastLocalBloom) {
// For Bloom Filter + FullFilter case, since we design the num_key_ to
@ -1003,9 +993,7 @@ class DBFilterConstructionReserveMemoryTestWithParam
// behavior and we don't need a large number of keys to verify we
// indeed charge the final filter for cache reservation, even though final
// filter is a lot smaller than hash entries.
num_key_ = 1 *
CacheReservationManagerImpl<
CacheEntryRole::kFilterConstruction>::GetDummyEntrySize() /
num_key_ = 1 * CacheReservationManager::GetDummyEntrySize() /
sizeof(FilterConstructionReserveMemoryHash);
} else {
// For Ribbon Filter + FullFilter case, we need a large enough number of
@ -1013,9 +1001,7 @@ class DBFilterConstructionReserveMemoryTestWithParam
// reservation will trigger at least another dummy entry (or equivalently
// to saying, causing another peak in cache reservation) as banding
// reservation might not be a multiple of dummy entry.
num_key_ = 12 *
CacheReservationManagerImpl<
CacheEntryRole::kFilterConstruction>::GetDummyEntrySize() /
num_key_ = 12 * CacheReservationManager::GetDummyEntrySize() /
sizeof(FilterConstructionReserveMemoryHash);
}
}
@ -1075,8 +1061,7 @@ class DBFilterConstructionReserveMemoryTestWithParam
};
INSTANTIATE_TEST_CASE_P(
DBFilterConstructionReserveMemoryTestWithParam,
DBFilterConstructionReserveMemoryTestWithParam,
BlockBasedTableOptions, DBFilterConstructionReserveMemoryTestWithParam,
::testing::Values(std::make_tuple(false, kFastLocalBloom, false, false),
std::make_tuple(true, kFastLocalBloom, false, false),
@ -1092,7 +1077,7 @@ INSTANTIATE_TEST_CASE_P(
std::make_tuple(true, kDeprecatedBlock, false, false),
std::make_tuple(true, kLegacyBloom, false, false)));
// TODO: Speed up this test, and reduce disk space usage (~700MB)
// TODO: Speed up this test.
// The current test inserts many keys (on the scale of dummy entry size)
// in order to make the small memory user (e.g., final filter, partitioned hash
// entries/filter/banding), which is proportional to the number of
@ -1171,8 +1156,8 @@ TEST_P(DBFilterConstructionReserveMemoryTestWithParam, ReserveMemory) {
return;
}
const std::size_t kDummyEntrySize = CacheReservationManagerImpl<
CacheEntryRole::kFilterConstruction>::GetDummyEntrySize();
const std::size_t kDummyEntrySize =
CacheReservationManager::GetDummyEntrySize();
const std::size_t predicted_hash_entries_cache_res =
num_key * sizeof(FilterConstructionReserveMemoryHash);
@ -1360,12 +1345,9 @@ TEST_P(DBFilterConstructionReserveMemoryTestWithParam, ReserveMemory) {
*
*/
if (!partition_filters) {
ASSERT_GE(
std::floor(
1.0 * predicted_final_filter_cache_res /
CacheReservationManagerImpl<
CacheEntryRole::kFilterConstruction>::GetDummyEntrySize()),
1)
ASSERT_GE(std::floor(1.0 * predicted_final_filter_cache_res /
CacheReservationManager::GetDummyEntrySize()),
1)
<< "Final filter cache reservation too small for this test - please "
"increase the number of keys";
if (!detect_filter_construct_corruption) {
@ -1558,93 +1540,6 @@ TEST_P(DBFilterConstructionCorruptionTestWithParam, DetectCorruption) {
"TamperFilter");
}
// RocksDB lite does not support dynamic options
#ifndef ROCKSDB_LITE
TEST_P(DBFilterConstructionCorruptionTestWithParam,
DynamicallyTurnOnAndOffDetectConstructCorruption) {
Options options = CurrentOptions();
BlockBasedTableOptions table_options = GetBlockBasedTableOptions();
// We intend to turn on
// table_options.detect_filter_construct_corruption dynamically
// therefore we override this test parameter's value
table_options.detect_filter_construct_corruption = false;
options.table_factory.reset(NewBlockBasedTableFactory(table_options));
options.create_if_missing = true;
int num_key = static_cast<int>(GetNumKey());
Status s;
DestroyAndReopen(options);
// Case 1: !table_options.detect_filter_construct_corruption
for (int i = 0; i < num_key; i++) {
ASSERT_OK(Put(Key(i), Key(i)));
}
SyncPoint::GetInstance()->SetCallBack(
"XXPH3FilterBitsBuilder::Finish::TamperHashEntries", [&](void* arg) {
std::deque<uint64_t>* hash_entries_to_corrupt =
(std::deque<uint64_t>*)arg;
assert(!hash_entries_to_corrupt->empty());
*(hash_entries_to_corrupt->begin()) =
*(hash_entries_to_corrupt->begin()) ^ uint64_t { 1 };
});
SyncPoint::GetInstance()->EnableProcessing();
s = Flush();
SyncPoint::GetInstance()->DisableProcessing();
SyncPoint::GetInstance()->ClearCallBack(
"XXPH3FilterBitsBuilder::Finish::"
"TamperHashEntries");
ASSERT_FALSE(table_options.detect_filter_construct_corruption);
EXPECT_TRUE(s.ok());
// Case 2: dynamically turn on
// table_options.detect_filter_construct_corruption
ASSERT_OK(db_->SetOptions({{"block_based_table_factory",
"{detect_filter_construct_corruption=true;}"}}));
for (int i = 0; i < num_key; i++) {
ASSERT_OK(Put(Key(i), Key(i)));
}
SyncPoint::GetInstance()->SetCallBack(
"XXPH3FilterBitsBuilder::Finish::TamperHashEntries", [&](void* arg) {
std::deque<uint64_t>* hash_entries_to_corrupt =
(std::deque<uint64_t>*)arg;
assert(!hash_entries_to_corrupt->empty());
*(hash_entries_to_corrupt->begin()) =
*(hash_entries_to_corrupt->begin()) ^ uint64_t { 1 };
});
SyncPoint::GetInstance()->EnableProcessing();
s = Flush();
SyncPoint::GetInstance()->DisableProcessing();
SyncPoint::GetInstance()->ClearCallBack(
"XXPH3FilterBitsBuilder::Finish::"
"TamperHashEntries");
auto updated_table_options =
db_->GetOptions().table_factory->GetOptions<BlockBasedTableOptions>();
EXPECT_TRUE(updated_table_options->detect_filter_construct_corruption);
EXPECT_TRUE(s.IsCorruption());
EXPECT_TRUE(s.ToString().find("Filter's hash entries checksum mismatched") !=
std::string::npos);
// Case 3: dynamically turn off
// table_options.detect_filter_construct_corruption
ASSERT_OK(db_->SetOptions({{"block_based_table_factory",
"{detect_filter_construct_corruption=false;}"}}));
updated_table_options =
db_->GetOptions().table_factory->GetOptions<BlockBasedTableOptions>();
EXPECT_FALSE(updated_table_options->detect_filter_construct_corruption);
}
#endif // ROCKSDB_LITE
namespace {
// NOTE: This class is referenced by HISTORY.md as a model for a wrapper
// FilterPolicy selecting among configurations based on context.
@ -1713,11 +1608,11 @@ class TestingContextCustomFilterPolicy
test_report_ +=
OptionsHelper::compaction_style_to_string[context.compaction_style];
test_report_ += ",n=";
test_report_ += std::to_string(context.num_levels);
test_report_ += ROCKSDB_NAMESPACE::ToString(context.num_levels);
test_report_ += ",l=";
test_report_ += std::to_string(context.level_at_creation);
test_report_ += ROCKSDB_NAMESPACE::ToString(context.level_at_creation);
test_report_ += ",b=";
test_report_ += std::to_string(int{context.is_bottommost});
test_report_ += ROCKSDB_NAMESPACE::ToString(int{context.is_bottommost});
test_report_ += ",r=";
test_report_ += table_file_creation_reason_to_string[context.reason];
test_report_ += "\n";
@ -2274,6 +2169,7 @@ class BloomStatsTestWithParam
options_.table_factory.reset(NewPlainTableFactory(table_options));
} else {
BlockBasedTableOptions table_options;
table_options.hash_index_allow_collision = false;
if (partition_filters_) {
assert(bfp_impl_ != kDeprecatedBlock);
table_options.partition_filters = partition_filters_;

View File

@ -454,7 +454,7 @@ TEST_F(DBTestCompactionFilter, CompactionFilterDeletesAll) {
// put some data
for (int table = 0; table < 4; ++table) {
for (int i = 0; i < 10 + table; ++i) {
ASSERT_OK(Put(std::to_string(table * 100 + i), "val"));
ASSERT_OK(Put(ToString(table * 100 + i), "val"));
}
ASSERT_OK(Flush());
}
@ -755,7 +755,7 @@ TEST_F(DBTestCompactionFilter, CompactionFilterContextCfId) {
#ifndef ROCKSDB_LITE
// Compaction filters apply to all records, regardless of snapshots.
TEST_F(DBTestCompactionFilter, CompactionFilterIgnoreSnapshot) {
std::string five = std::to_string(5);
std::string five = ToString(5);
Options options = CurrentOptions();
options.compaction_filter_factory = std::make_shared<DeleteISFilterFactory>();
options.disable_auto_compactions = true;
@ -766,7 +766,7 @@ TEST_F(DBTestCompactionFilter, CompactionFilterIgnoreSnapshot) {
const Snapshot* snapshot = nullptr;
for (int table = 0; table < 4; ++table) {
for (int i = 0; i < 10; ++i) {
ASSERT_OK(Put(std::to_string(table * 100 + i), "val"));
ASSERT_OK(Put(ToString(table * 100 + i), "val"));
}
ASSERT_OK(Flush());
@ -968,71 +968,6 @@ TEST_F(DBTestCompactionFilter, IgnoreSnapshotsFalseRecovery) {
ASSERT_TRUE(TryReopen(options).IsNotSupported());
}
TEST_F(DBTestCompactionFilter, DropKeyWithSingleDelete) {
Options options = GetDefaultOptions();
options.create_if_missing = true;
Reopen(options);
ASSERT_OK(Put("a", "v0"));
ASSERT_OK(Put("b", "v0"));
const Snapshot* snapshot = db_->GetSnapshot();
ASSERT_OK(SingleDelete("b"));
ASSERT_OK(Flush());
{
CompactRangeOptions cro;
cro.change_level = true;
cro.target_level = options.num_levels - 1;
ASSERT_OK(db_->CompactRange(cro, nullptr, nullptr));
}
db_->ReleaseSnapshot(snapshot);
Close();
class DeleteFilterV2 : public CompactionFilter {
public:
Decision FilterV2(int /*level*/, const Slice& key, ValueType /*value_type*/,
const Slice& /*existing_value*/,
std::string* /*new_value*/,
std::string* /*skip_until*/) const override {
if (key.starts_with("b")) {
return Decision::kPurge;
}
return Decision::kRemove;
}
const char* Name() const override { return "DeleteFilterV2"; }
} delete_filter_v2;
options.compaction_filter = &delete_filter_v2;
options.level0_file_num_compaction_trigger = 2;
Reopen(options);
ASSERT_OK(Put("b", "v1"));
ASSERT_OK(Put("x", "v1"));
ASSERT_OK(Flush());
ASSERT_OK(Put("r", "v1"));
ASSERT_OK(Put("z", "v1"));
ASSERT_OK(Flush());
ASSERT_OK(dbfull()->TEST_WaitForCompact());
Close();
options.compaction_filter = nullptr;
Reopen(options);
ASSERT_OK(SingleDelete("b"));
ASSERT_OK(Flush());
{
CompactRangeOptions cro;
cro.bottommost_level_compaction = BottommostLevelCompaction::kForce;
ASSERT_OK(db_->CompactRange(cro, nullptr, nullptr));
}
}
} // namespace ROCKSDB_NAMESPACE
int main(int argc, char** argv) {

View File

@ -1354,74 +1354,6 @@ TEST_P(DBCompactionTestWithParam, TrivialMoveTargetLevel) {
}
}
TEST_P(DBCompactionTestWithParam, PartialOverlappingL0) {
class SubCompactionEventListener : public EventListener {
public:
void OnSubcompactionCompleted(const SubcompactionJobInfo&) override {
sub_compaction_finished_++;
}
std::atomic<int> sub_compaction_finished_{0};
};
Options options = CurrentOptions();
options.disable_auto_compactions = true;
options.write_buffer_size = 10 * 1024 * 1024;
options.max_subcompactions = max_subcompactions_;
SubCompactionEventListener* listener = new SubCompactionEventListener();
options.listeners.emplace_back(listener);
DestroyAndReopen(options);
// For subcompaction to trigger, the output level needs to be non-empty.
ASSERT_OK(Put("key", ""));
ASSERT_OK(Put("kez", ""));
ASSERT_OK(Flush());
ASSERT_OK(Put("key", ""));
ASSERT_OK(Put("kez", ""));
ASSERT_OK(Flush());
ASSERT_OK(db_->CompactRange(CompactRangeOptions(), nullptr, nullptr));
// Ranges that overlap only slightly, so that they won't be trivially
// moved but subcompaction ranges would only contain a subset of files.
std::vector<std::pair<int32_t, int32_t>> ranges = {
{100, 199}, {198, 399}, {397, 600}, {598, 800}, {799, 900}, {895, 999},
};
int32_t value_size = 10 * 1024; // 10 KB
Random rnd(301);
std::map<int32_t, std::string> values;
for (size_t i = 0; i < ranges.size(); i++) {
for (int32_t j = ranges[i].first; j <= ranges[i].second; j++) {
values[j] = rnd.RandomString(value_size);
ASSERT_OK(Put(Key(j), values[j]));
}
ASSERT_OK(Flush());
}
int32_t level0_files = NumTableFilesAtLevel(0, 0);
ASSERT_EQ(level0_files, ranges.size()); // Multiple files in L0
ASSERT_EQ(NumTableFilesAtLevel(1, 0), 1); // One file in L1
listener->sub_compaction_finished_ = 0;
ASSERT_OK(db_->EnableAutoCompaction({db_->DefaultColumnFamily()}));
ASSERT_OK(dbfull()->TEST_WaitForCompact());
if (max_subcompactions_ > 3) {
// RocksDB might not generate the exact number of subcompactions.
// Here we validate that at least some subcompactions happened.
ASSERT_GT(listener->sub_compaction_finished_.load(), 2);
}
// We expect that all the files were compacted to L1
ASSERT_EQ(NumTableFilesAtLevel(0, 0), 0);
ASSERT_GT(NumTableFilesAtLevel(1, 0), 1);
for (size_t i = 0; i < ranges.size(); i++) {
for (int32_t j = ranges[i].first; j <= ranges[i].second; j++) {
ASSERT_EQ(Get(Key(j)), values[j]);
}
}
}
TEST_P(DBCompactionTestWithParam, ManualCompactionPartial) {
int32_t trivial_move = 0;
int32_t non_trivial_move = 0;
@ -2409,30 +2341,6 @@ TEST_P(DBCompactionTestWithParam, LevelCompactionCFPathUse) {
check_getvalues();
{ // Also verify GetLiveFilesStorageInfo with db_paths / cf_paths
std::vector<LiveFileStorageInfo> new_infos;
LiveFilesStorageInfoOptions lfsio;
lfsio.wal_size_for_flush = UINT64_MAX; // no flush
ASSERT_OK(db_->GetLiveFilesStorageInfo(lfsio, &new_infos));
std::unordered_map<std::string, int> live_sst_by_dir;
for (auto& info : new_infos) {
if (info.file_type == kTableFile) {
live_sst_by_dir[info.directory]++;
// Verify file on disk (no directory confusion)
uint64_t size;
ASSERT_OK(env_->GetFileSize(
info.directory + "/" + info.relative_filename, &size));
ASSERT_EQ(info.size, size);
}
}
ASSERT_EQ(3U * 3U, live_sst_by_dir.size());
for (auto& paths : {options.db_paths, cf_opt1.cf_paths, cf_opt2.cf_paths}) {
ASSERT_EQ(1, live_sst_by_dir[paths[0].path]);
ASSERT_EQ(4, live_sst_by_dir[paths[1].path]);
ASSERT_EQ(2, live_sst_by_dir[paths[2].path]);
}
}
ReopenWithColumnFamilies({"default", "one", "two"}, option_vector);
check_getvalues();
@ -2817,7 +2725,7 @@ TEST_P(DBCompactionTestWithParam, DISABLED_CompactFilesOnLevelCompaction) {
Random rnd(301);
for (int key = 64 * kEntriesPerBuffer; key >= 0; --key) {
ASSERT_OK(Put(1, std::to_string(key), rnd.RandomString(kTestValueSize)));
ASSERT_OK(Put(1, ToString(key), rnd.RandomString(kTestValueSize)));
}
ASSERT_OK(dbfull()->TEST_WaitForFlushMemTable(handles_[1]));
ASSERT_OK(dbfull()->TEST_WaitForCompact());
@ -2849,7 +2757,7 @@ TEST_P(DBCompactionTestWithParam, DISABLED_CompactFilesOnLevelCompaction) {
// make sure all key-values are still there.
for (int key = 64 * kEntriesPerBuffer; key >= 0; --key) {
ASSERT_NE(Get(1, std::to_string(key)), "NOT_FOUND");
ASSERT_NE(Get(1, ToString(key)), "NOT_FOUND");
}
}
@ -4404,8 +4312,7 @@ TEST_F(DBCompactionTest, LevelPeriodicCompactionWithCompactionFilters) {
for (CompactionFilterType comp_filter_type :
{kUseCompactionFilter, kUseCompactionFilterFactory}) {
// Assert that periodic compactions are not enabled.
ASSERT_EQ(std::numeric_limits<uint64_t>::max() - 1,
options.periodic_compaction_seconds);
ASSERT_EQ(port::kMaxUint64 - 1, options.periodic_compaction_seconds);
if (comp_filter_type == kUseCompactionFilter) {
options.compaction_filter = &test_compaction_filter;
@ -4668,9 +4575,9 @@ TEST_F(DBCompactionTest, CompactRangeSkipFlushAfterDelay) {
});
TEST_SYNC_POINT("DBCompactionTest::CompactRangeSkipFlushAfterDelay:PreFlush");
ASSERT_OK(Put(std::to_string(0), rnd.RandomString(1024)));
ASSERT_OK(Put(ToString(0), rnd.RandomString(1024)));
ASSERT_OK(dbfull()->Flush(flush_opts));
ASSERT_OK(Put(std::to_string(0), rnd.RandomString(1024)));
ASSERT_OK(Put(ToString(0), rnd.RandomString(1024)));
TEST_SYNC_POINT("DBCompactionTest::CompactRangeSkipFlushAfterDelay:PostFlush");
manual_compaction_thread.join();
@ -4679,7 +4586,7 @@ TEST_F(DBCompactionTest, CompactRangeSkipFlushAfterDelay) {
std::string num_keys_in_memtable;
ASSERT_TRUE(db_->GetProperty(DB::Properties::kNumEntriesActiveMemTable,
&num_keys_in_memtable));
ASSERT_EQ(std::to_string(1), num_keys_in_memtable);
ASSERT_EQ(ToString(1), num_keys_in_memtable);
ROCKSDB_NAMESPACE::SyncPoint::GetInstance()->DisableProcessing();
}
@ -4828,7 +4735,7 @@ TEST_F(DBCompactionTest, SubcompactionEvent) {
for (int i = 0; i < 4; i++) {
for (int j = 0; j < 10; j++) {
int key_id = i * 10 + j;
ASSERT_OK(Put(Key(key_id), "value" + std::to_string(key_id)));
ASSERT_OK(Put(Key(key_id), "value" + ToString(key_id)));
}
ASSERT_OK(Flush());
}
@ -4838,7 +4745,7 @@ TEST_F(DBCompactionTest, SubcompactionEvent) {
for (int i = 0; i < 2; i++) {
for (int j = 0; j < 10; j++) {
int key_id = i * 20 + j * 2;
ASSERT_OK(Put(Key(key_id), "value" + std::to_string(key_id)));
ASSERT_OK(Put(Key(key_id), "value" + ToString(key_id)));
}
ASSERT_OK(Flush());
}
@ -5830,7 +5737,7 @@ TEST_P(DBCompactionTestWithBottommostParam, SequenceKeysManualCompaction) {
}
ASSERT_OK(dbfull()->TEST_WaitForFlushMemTable());
ASSERT_EQ(std::to_string(kSstNum), FilesPerLevel(0));
ASSERT_EQ(ToString(kSstNum), FilesPerLevel(0));
auto cro = CompactRangeOptions();
cro.bottommost_level_compaction = bottommost_level_compaction_;
@ -5843,7 +5750,7 @@ TEST_P(DBCompactionTestWithBottommostParam, SequenceKeysManualCompaction) {
ASSERT_EQ("0,1", FilesPerLevel(0));
} else {
// Just trivial move from level 0 -> 1
ASSERT_EQ("0," + std::to_string(kSstNum), FilesPerLevel(0));
ASSERT_EQ("0," + ToString(kSstNum), FilesPerLevel(0));
}
}
@ -6509,29 +6416,20 @@ TEST_F(DBCompactionTest, CompactionWithBlobGCError_CorruptIndex) {
ASSERT_OK(Put(third_key, third_value));
constexpr char fourth_key[] = "fourth_key";
constexpr char fourth_value[] = "fourth_value";
ASSERT_OK(Put(fourth_key, fourth_value));
constexpr char corrupt_blob_index[] = "foobar";
WriteBatch batch;
ASSERT_OK(WriteBatchInternal::PutBlobIndex(&batch, 0, fourth_key,
corrupt_blob_index));
ASSERT_OK(db_->Write(WriteOptions(), &batch));
ASSERT_OK(Flush());
SyncPoint::GetInstance()->SetCallBack(
"CompactionIterator::GarbageCollectBlobIfNeeded::TamperWithBlobIndex",
[](void* arg) {
Slice* const blob_index = static_cast<Slice*>(arg);
assert(blob_index);
assert(!blob_index->empty());
blob_index->remove_prefix(1);
});
SyncPoint::GetInstance()->EnableProcessing();
constexpr Slice* begin = nullptr;
constexpr Slice* end = nullptr;
ASSERT_TRUE(
db_->CompactRange(CompactRangeOptions(), begin, end).IsCorruption());
SyncPoint::GetInstance()->DisableProcessing();
SyncPoint::GetInstance()->ClearAllCallBacks();
}
TEST_F(DBCompactionTest, CompactionWithBlobGCError_InlinedTTLIndex) {
@ -7174,7 +7072,7 @@ TEST_F(DBCompactionTest, DisableManualCompactionThreadQueueFull) {
ASSERT_OK(Put(Key(2), "value2"));
ASSERT_OK(Flush());
}
ASSERT_EQ(std::to_string(kNumL0Files + (kNumL0Files / 2)), FilesPerLevel(0));
ASSERT_EQ(ToString(kNumL0Files + (kNumL0Files / 2)), FilesPerLevel(0));
db_->DisableManualCompaction();
@ -7231,7 +7129,7 @@ TEST_F(DBCompactionTest, DisableManualCompactionThreadQueueFullDBClose) {
ASSERT_OK(Put(Key(2), "value2"));
ASSERT_OK(Flush());
}
ASSERT_EQ(std::to_string(kNumL0Files + (kNumL0Files / 2)), FilesPerLevel(0));
ASSERT_EQ(ToString(kNumL0Files + (kNumL0Files / 2)), FilesPerLevel(0));
db_->DisableManualCompaction();
@ -7291,7 +7189,7 @@ TEST_F(DBCompactionTest, DBCloseWithManualCompaction) {
ASSERT_OK(Put(Key(2), "value2"));
ASSERT_OK(Flush());
}
ASSERT_EQ(std::to_string(kNumL0Files + (kNumL0Files / 2)), FilesPerLevel(0));
ASSERT_EQ(ToString(kNumL0Files + (kNumL0Files / 2)), FilesPerLevel(0));
// Close DB with manual compaction and auto triggered compaction in the queue.
auto s = db_->Close();

View File

@ -96,7 +96,7 @@ Status DBImpl::GetLiveFiles(std::vector<std::string>& ret,
3); // for CURRENT + MANIFEST + OPTIONS
// create names of the live files. The names are not absolute
// paths, instead they are relative to dbname_.
// paths, instead they are relative to dbname_;
for (const auto& table_file_number : live_table_files) {
ret.emplace_back(MakeTableFileName("", table_file_number));
}
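A hedged usage sketch for GetLiveFiles() as described in the comment above: the returned names are relative to the DB directory, so callers typically prefix the DB name themselves. The helper name and the prefixing policy are illustrative assumptions, not part of this diff.
#include <cstdint>
#include <string>
#include <vector>
#include "rocksdb/db.h"
ROCKSDB_NAMESPACE::Status ListLiveFiles(ROCKSDB_NAMESPACE::DB* db,
                                        std::vector<std::string>* abs_paths) {
  std::vector<std::string> relative_names;
  uint64_t manifest_size = 0;
  // flush_memtable=true flushes first so memtable data is covered by the list.
  ROCKSDB_NAMESPACE::Status s =
      db->GetLiveFiles(relative_names, &manifest_size, /*flush_memtable=*/true);
  if (s.ok()) {
    for (const auto& name : relative_names) {
      // Returned names carry a leading '/' (e.g. "/000123.sst").
      abs_paths->push_back(db->GetName() + name);
    }
  }
  return s;
}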
@ -166,7 +166,7 @@ Status DBImpl::GetCurrentWalFile(std::unique_ptr<LogFile>* current_log_file) {
Status DBImpl::GetLiveFilesStorageInfo(
const LiveFilesStorageInfoOptions& opts,
std::vector<LiveFileStorageInfo>* files) {
// To avoid returning partial results, only move results to files on success.
// To avoid returning partial results, only move to ouput on success
assert(files);
files->clear();
std::vector<LiveFileStorageInfo> results;
@ -177,10 +177,10 @@ Status DBImpl::GetLiveFilesStorageInfo(
VectorLogPtr live_wal_files;
bool flush_memtable = true;
if (!immutable_db_options_.allow_2pc) {
if (opts.wal_size_for_flush == std::numeric_limits<uint64_t>::max()) {
if (opts.wal_size_for_flush == port::kMaxUint64) {
flush_memtable = false;
} else if (opts.wal_size_for_flush > 0) {
// If the outstanding log files are small, we skip the flush.
// If out standing log files are small, we skip the flush.
s = GetSortedWalFiles(live_wal_files);
if (!s.ok()) {
@ -313,7 +313,8 @@ Status DBImpl::GetLiveFilesStorageInfo(
info.relative_filename = kCurrentFileName;
info.directory = GetName();
info.file_type = kCurrentFile;
// CURRENT could be replaced so we have to record the contents as needed.
// CURRENT could be replaced so we have to record the contents we want
// for it
info.replacement_contents = manifest_fname + "\n";
info.size = manifest_fname.size() + 1;
if (opts.include_checksum_info) {
@ -355,7 +356,7 @@ Status DBImpl::GetLiveFilesStorageInfo(
TEST_SYNC_POINT("CheckpointImpl::CreateCustomCheckpoint:AfterGetLive1");
TEST_SYNC_POINT("CheckpointImpl::CreateCustomCheckpoint:AfterGetLive2");
// If we have more than one column family, we also need to get WAL files.
// if we have more than one column family, we need to also get WAL files
if (s.ok()) {
s = GetSortedWalFiles(live_wal_files);
}
@ -393,7 +394,7 @@ Status DBImpl::GetLiveFilesStorageInfo(
}
if (s.ok()) {
// Only move results to output on success.
// Only move output on success
*files = std::move(results);
}
return s;
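For reference, a brief usage sketch of GetLiveFilesStorageInfo(), mirroring the pattern in the test hunk earlier in this diff; the option values and the helper name are illustrative, not taken from the change itself.
#include <cstdint>
#include <limits>
#include <vector>
#include "rocksdb/db.h"
#include "rocksdb/metadata.h"
ROCKSDB_NAMESPACE::Status CollectLiveFileInfo(
    ROCKSDB_NAMESPACE::DB* db,
    std::vector<ROCKSDB_NAMESPACE::LiveFileStorageInfo>* out) {
  ROCKSDB_NAMESPACE::LiveFilesStorageInfoOptions opts;
  // A maximal threshold means "do not flush just for this call", as in the
  // LevelCompactionCFPathUse test above.
  opts.wal_size_for_flush = std::numeric_limits<uint64_t>::max();
  return db->GetLiveFilesStorageInfo(opts, out);
}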

View File

@ -1374,140 +1374,6 @@ TEST_F(DBFlushTest, DISABLED_MemPurgeWALSupport) {
} while (ChangeWalOptions());
}
TEST_F(DBFlushTest, MemPurgeCorrectLogNumberAndSSTFileCreation) {
// Before our bug fix, we noticed that when 2 memtables were
// being flushed (with one memtable being the output of a
// previous MemPurge and one memtable being a newly-sealed memtable),
// the SST file created was not properly added to the DB version
// (via the VersionEdit obj), leading to data loss (the SST file
// was later being purged as an obsolete file).
// Therefore, we reproduce this scenario to test our fix.
Options options = CurrentOptions();
options.create_if_missing = true;
options.compression = kNoCompression;
options.inplace_update_support = false;
options.allow_concurrent_memtable_write = true;
// Enforce size of a single MemTable to 1 MB (1 MB = 1048576 bytes).
options.write_buffer_size = 1 << 20;
// Activate the MemPurge prototype.
options.experimental_mempurge_threshold = 1.0;
// Force to have more than one memtable to trigger a flush.
// For some reason this option does not seem to be enforced,
// so the following test is designed to make sure that we
// are testing the correct test case.
options.min_write_buffer_number_to_merge = 3;
options.max_write_buffer_number = 5;
options.max_write_buffer_size_to_maintain = 2 * (options.write_buffer_size);
options.disable_auto_compactions = true;
ASSERT_OK(TryReopen(options));
std::atomic<uint32_t> mempurge_count{0};
std::atomic<uint32_t> sst_count{0};
ROCKSDB_NAMESPACE::SyncPoint::GetInstance()->SetCallBack(
"DBImpl::FlushJob:MemPurgeSuccessful",
[&](void* /*arg*/) { mempurge_count++; });
ROCKSDB_NAMESPACE::SyncPoint::GetInstance()->SetCallBack(
"DBImpl::FlushJob:SSTFileCreated", [&](void* /*arg*/) { sst_count++; });
ROCKSDB_NAMESPACE::SyncPoint::GetInstance()->EnableProcessing();
// Dummy variable used for the following callback function.
uint64_t ZERO = 0;
// We will first execute mempurge operations exclusively.
// Therefore, when the first flush is triggered, we want to make
// sure there is at least 2 memtables being flushed: one output
// from a previous mempurge, and one newly sealed memtable.
// This is when we observed in the past that some SST files created
// were not properly added to the DB version (via the VersionEdit obj).
std::atomic<uint64_t> num_memtable_at_first_flush(0);
ROCKSDB_NAMESPACE::SyncPoint::GetInstance()->SetCallBack(
"FlushJob::WriteLevel0Table:num_memtables", [&](void* arg) {
uint64_t* mems_size = reinterpret_cast<uint64_t*>(arg);
// atomic_compare_exchange_strong sometimes updates the value
// of ZERO (the "expected" object), so we make sure ZERO is indeed...
// zero.
ZERO = 0;
std::atomic_compare_exchange_strong(&num_memtable_at_first_flush, &ZERO,
*mems_size);
});
const std::vector<std::string> KEYS = {
"ThisIsKey1", "ThisIsKey2", "ThisIsKey3", "ThisIsKey4", "ThisIsKey5",
"ThisIsKey6", "ThisIsKey7", "ThisIsKey8", "ThisIsKey9"};
const std::string NOT_FOUND = "NOT_FOUND";
Random rnd(117);
const uint64_t NUM_REPEAT_OVERWRITES = 100;
const uint64_t NUM_RAND_INSERTS = 500;
const uint64_t RAND_VALUES_LENGTH = 10240;
std::string key, value;
std::vector<std::string> values(9, "");
// Keys used to check that no SST file disappeared.
for (uint64_t k = 0; k < 5; k++) {
values[k] = rnd.RandomString(RAND_VALUES_LENGTH);
ASSERT_OK(Put(KEYS[k], values[k]));
}
// Insertion of K-V pairs, multiple times.
// Trigger at least one mempurge and no SST file creation.
for (size_t i = 0; i < NUM_REPEAT_OVERWRITES; i++) {
// Create value strings of arbitrary length RAND_VALUES_LENGTH bytes.
for (uint64_t k = 5; k < values.size(); k++) {
values[k] = rnd.RandomString(RAND_VALUES_LENGTH);
ASSERT_OK(Put(KEYS[k], values[k]));
}
// Check database consistency.
for (uint64_t k = 0; k < values.size(); k++) {
ASSERT_EQ(Get(KEYS[k]), values[k]);
}
}
// Check that there was at least one mempurge
uint32_t expected_min_mempurge_count = 1;
// Check that no SST files were created during flush.
uint32_t expected_sst_count = 0;
EXPECT_GE(mempurge_count.load(), expected_min_mempurge_count);
EXPECT_EQ(sst_count.load(), expected_sst_count);
// Trigger an SST file creation and no mempurge.
for (size_t i = 0; i < NUM_RAND_INSERTS; i++) {
key = rnd.RandomString(RAND_VALUES_LENGTH);
// Create value strings of arbitrary length RAND_VALUES_LENGTH bytes.
value = rnd.RandomString(RAND_VALUES_LENGTH);
ASSERT_OK(Put(key, value));
// Check database consistency.
for (uint64_t k = 0; k < values.size(); k++) {
ASSERT_EQ(Get(KEYS[k]), values[k]);
}
ASSERT_EQ(Get(key), value);
}
// Check that at least one SST file was created during flush.
expected_sst_count = 1;
EXPECT_GE(sst_count.load(), expected_sst_count);
// Oddly enough, num_memtable_at_first_flush is not enforced to be
// equal to min_write_buffer_number_to_merge. So we assert instead that
// the first SST file creation came from at least two memtables: one output
// memtable from a previous mempurge and one newly sealed memtable. This
// is the scenario where we observed that some SST files created
// were not properly added to the DB version before our bug fix.
ASSERT_GE(num_memtable_at_first_flush.load(), 2);
// Check that no data was lost after SST file creation.
for (uint64_t k = 0; k < values.size(); k++) {
ASSERT_EQ(Get(KEYS[k]), values[k]);
}
// Extra check of database consistency.
ASSERT_EQ(Get(key), value);
Close();
}
TEST_P(DBFlushDirectIOTest, DirectIO) {
Options options;
options.create_if_missing = true;
@ -1709,9 +1575,6 @@ TEST_F(DBFlushTest, FireOnFlushCompletedAfterCommittedResult) {
flush_opts.wait = false;
ASSERT_OK(db_->Flush(flush_opts));
t1.join();
// Ensure background work is fully finished including listener callbacks
// before accessing listener state.
ASSERT_OK(dbfull()->TEST_WaitForBackgroundWork());
ASSERT_TRUE(listener->completed1);
ASSERT_TRUE(listener->completed2);
SyncPoint::GetInstance()->DisableProcessing();
@ -2271,7 +2134,7 @@ TEST_P(DBAtomicFlushTest, ManualFlushUnder2PC) {
// The recovered min log number with prepared data should be non-zero.
// In 2pc mode, MinLogNumberToKeep returns the
// VersionSet::min_log_number_to_keep recovered from MANIFEST, if it's 0,
// VersionSet::min_log_number_to_keep_2pc recovered from MANIFEST, if it's 0,
// it means atomic flush didn't write the min_log_number_to_keep to MANIFEST.
cfs.push_back(kDefaultColumnFamilyName);
ASSERT_OK(TryReopenWithColumnFamilies(cfs, options));
@ -2356,7 +2219,7 @@ TEST_P(DBAtomicFlushTest, PrecomputeMinLogNumberToKeepNon2PC) {
ASSERT_OK(Flush(cf_ids));
uint64_t log_num_after_flush = dbfull()->TEST_GetCurrentLogNumber();
uint64_t min_log_number_to_keep = std::numeric_limits<uint64_t>::max();
uint64_t min_log_number_to_keep = port::kMaxUint64;
autovector<ColumnFamilyData*> flushed_cfds;
autovector<autovector<VersionEdit*>> flush_edits;
for (size_t i = 0; i != num_cfs; ++i) {

View File

@ -101,7 +101,6 @@
#include "util/compression.h"
#include "util/crc32c.h"
#include "util/defer.h"
#include "util/hash_containers.h"
#include "util/mutexlock.h"
#include "util/stop_watch.h"
#include "util/string_util.h"
@ -120,16 +119,18 @@ CompressionType GetCompressionFlush(
// Compressing memtable flushes might not help unless the sequential load
// optimization is used for leveled compaction. Otherwise the CPU and
// latency overhead is not offset by saving much space.
if (ioptions.compaction_style == kCompactionStyleUniversal &&
mutable_cf_options.compaction_options_universal
.compression_size_percent >= 0) {
return kNoCompression;
}
if (mutable_cf_options.compression_per_level.empty()) {
return mutable_cf_options.compression;
} else {
if (ioptions.compaction_style == kCompactionStyleUniversal) {
if (mutable_cf_options.compaction_options_universal
.compression_size_percent < 0) {
return mutable_cf_options.compression;
} else {
return kNoCompression;
}
} else if (!ioptions.compression_per_level.empty()) {
// For leveled compress when min_level_to_compress != 0.
return mutable_cf_options.compression_per_level[0];
return ioptions.compression_per_level[0];
} else {
return mutable_cf_options.compression;
}
}
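The comment above explains why flush output may skip compression; below is a standalone sketch of that decision in the main-branch shape, using simplified stand-in types rather than the actual RocksDB option structs.
#include <vector>
enum class CompactionStyle { kLeveled, kUniversal };
enum class CompressionType { kNoCompression, kSnappyCompression };
struct FlushOpts {
  CompactionStyle compaction_style = CompactionStyle::kLeveled;
  int universal_compression_size_percent = -1;
  std::vector<CompressionType> compression_per_level;
  CompressionType compression = CompressionType::kSnappyCompression;
};
CompressionType CompressionForFlush(const FlushOpts& o) {
  // Universal compaction with a size-percent policy decides compression
  // later, per output file, so the flush itself stays uncompressed.
  if (o.compaction_style == CompactionStyle::kUniversal &&
      o.universal_compression_size_percent >= 0) {
    return CompressionType::kNoCompression;
  }
  // With per-level settings under leveled compaction, flush output lands in L0.
  if (!o.compression_per_level.empty()) {
    return o.compression_per_level[0];
  }
  return o.compression;
}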
@ -567,7 +568,7 @@ Status DBImpl::CloseHelper() {
// flushing by first checking if there is a need for
// flushing (but need to implement something
// else than imm()->IsFlushPending() because the output
// memtables added to imm() don't trigger flushes).
// memtables added to imm() dont trigger flushes).
if (immutable_db_options_.experimental_mempurge_threshold > 0.0) {
Status flush_ret;
mutex_.Unlock();
@ -849,8 +850,7 @@ void DBImpl::PersistStats() {
if (stats_slice_.find(stat.first) != stats_slice_.end()) {
uint64_t delta = stat.second - stats_slice_[stat.first];
s = batch.Put(persist_stats_cf_handle_,
Slice(key, std::min(100, length)),
std::to_string(delta));
Slice(key, std::min(100, length)), ToString(delta));
}
}
}
@ -1083,6 +1083,7 @@ Status DBImpl::SetOptions(
MutableCFOptions new_options;
Status s;
Status persist_options_status;
persist_options_status.PermitUncheckedError(); // Allow uninitialized access
SuperVersionContext sv_context(/* create_superversion */ true);
{
auto db_options = GetDBOptions();
@ -1118,11 +1119,9 @@ Status DBImpl::SetOptions(
"[%s] SetOptions() succeeded", cfd->GetName().c_str());
new_options.Dump(immutable_db_options_.info_log.get());
if (!persist_options_status.ok()) {
// NOTE: WriteOptionsFile already logs on failure
s = persist_options_status;
}
} else {
persist_options_status.PermitUncheckedError(); // less important
ROCKS_LOG_WARN(immutable_db_options_.info_log, "[%s] SetOptions() failed",
cfd->GetName().c_str());
}
@ -1891,8 +1890,6 @@ Status DBImpl::GetImpl(const ReadOptions& read_options, const Slice& key,
return s;
}
}
TEST_SYNC_POINT("DBImpl::GetImpl:PostMemTableGet:0");
TEST_SYNC_POINT("DBImpl::GetImpl:PostMemTableGet:1");
PinnedIteratorsManager pinned_iters_mgr;
if (!done) {
PERF_TIMER_GUARD(get_from_output_files_time);
@ -1910,6 +1907,8 @@ Status DBImpl::GetImpl(const ReadOptions& read_options, const Slice& key,
{
PERF_TIMER_GUARD(get_post_process_time);
ReturnAndCleanupSuperVersion(cfd, sv);
RecordTick(stats_, NUMBER_KEYS_READ);
size_t size = 0;
if (s.ok()) {
@ -1936,8 +1935,6 @@ Status DBImpl::GetImpl(const ReadOptions& read_options, const Slice& key,
PERF_COUNTER_ADD(get_read_bytes, size);
}
RecordInHistogram(stats_, BYTES_PER_READ, size);
ReturnAndCleanupSuperVersion(cfd, sv);
}
return s;
}
@ -2002,7 +1999,7 @@ std::vector<Status> DBImpl::MultiGet(
SequenceNumber consistent_seqnum;
UnorderedMap<uint32_t, MultiGetColumnFamilyData> multiget_cf_data(
std::unordered_map<uint32_t, MultiGetColumnFamilyData> multiget_cf_data(
column_family.size());
for (auto cf : column_family) {
auto cfh = static_cast_with_check<ColumnFamilyHandleImpl>(cf);
@ -2014,13 +2011,13 @@ std::vector<Status> DBImpl::MultiGet(
}
std::function<MultiGetColumnFamilyData*(
UnorderedMap<uint32_t, MultiGetColumnFamilyData>::iterator&)>
std::unordered_map<uint32_t, MultiGetColumnFamilyData>::iterator&)>
iter_deref_lambda =
[](UnorderedMap<uint32_t, MultiGetColumnFamilyData>::iterator&
[](std::unordered_map<uint32_t, MultiGetColumnFamilyData>::iterator&
cf_iter) { return &cf_iter->second; };
bool unref_only =
MultiCFSnapshot<UnorderedMap<uint32_t, MultiGetColumnFamilyData>>(
MultiCFSnapshot<std::unordered_map<uint32_t, MultiGetColumnFamilyData>>(
read_options, nullptr, iter_deref_lambda, &multiget_cf_data,
&consistent_seqnum);
@ -3356,7 +3353,7 @@ bool DBImpl::GetProperty(ColumnFamilyHandle* column_family,
bool ret_value =
GetIntPropertyInternal(cfd, *property_info, false, &int_value);
if (ret_value) {
*value = std::to_string(int_value);
*value = ToString(int_value);
}
return ret_value;
} else if (property_info->handle_string) {
@ -3683,11 +3680,6 @@ Status DBImpl::GetUpdatesSince(
SequenceNumber seq, std::unique_ptr<TransactionLogIterator>* iter,
const TransactionLogIterator::ReadOptions& read_options) {
RecordTick(stats_, GET_UPDATES_SINCE_CALLS);
if (seq_per_batch_) {
return Status::NotSupported(
"This API is not yet compatible with write-prepared/write-unprepared "
"transactions");
}
if (seq > versions_->LastSequence()) {
return Status::NotFound("Requested sequence not yet written in the db");
}
@ -3991,8 +3983,8 @@ Status DBImpl::CheckConsistency() {
} else if (fsize != md.size) {
corruption_messages += "Sst file size mismatch: " + file_path +
". Size recorded in manifest " +
std::to_string(md.size) + ", actual size " +
std::to_string(fsize) + "\n";
ToString(md.size) + ", actual size " +
ToString(fsize) + "\n";
}
}
}
@ -5124,8 +5116,8 @@ Status DBImpl::VerifyChecksumInternal(const ReadOptions& read_options,
fmeta->file_checksum_func_name, fname,
read_options);
} else {
s = ROCKSDB_NAMESPACE::VerifySstFileChecksum(
opts, file_options_, read_options, fname, fd.largest_seqno);
s = ROCKSDB_NAMESPACE::VerifySstFileChecksum(opts, file_options_,
read_options, fname);
}
RecordTick(stats_, VERIFY_CHECKSUM_READ_BYTES,
IOSTATS(bytes_read) - prev_bytes_read);
@ -5339,7 +5331,7 @@ Status DBImpl::ReserveFileNumbersBeforeIngestion(
Status DBImpl::GetCreationTimeOfOldestFile(uint64_t* creation_time) {
if (mutable_db_options_.max_open_files == -1) {
uint64_t oldest_time = std::numeric_limits<uint64_t>::max();
uint64_t oldest_time = port::kMaxUint64;
for (auto cfd : *versions_->GetColumnFamilySet()) {
if (!cfd->IsDropped()) {
uint64_t ctime;

View File

@ -663,8 +663,7 @@ class DBImpl : public DB {
const CompactRangeOptions& compact_range_options,
const Slice* begin, const Slice* end,
bool exclusive, bool disallow_trivial_move,
uint64_t max_file_num_to_ignore,
const std::string& trim_ts);
uint64_t max_file_num_to_ignore);
// Return an internal iterator over the current state of the database.
// The keys of this iterator are internal keys (see format.h).
@ -1248,8 +1247,7 @@ class DBImpl : public DB {
Status CompactRangeInternal(const CompactRangeOptions& options,
ColumnFamilyHandle* column_family,
const Slice* begin, const Slice* end,
const std::string& trim_ts);
const Slice* begin, const Slice* end);
// The following two functions can only be called when:
// 1. WriteThread::Writer::EnterUnbatched() is used.
@ -1783,12 +1781,8 @@ class DBImpl : public DB {
WriteBatch* tmp_batch, size_t* write_with_wal,
WriteBatch** to_be_cached_state);
// rate_limiter_priority is used to charge `DBOptions::rate_limiter`
// for automatic WAL flush (`Options::manual_wal_flush` == false)
// associated with this WriteToWAL
IOStatus WriteToWAL(const WriteBatch& merged_batch, log::Writer* log_writer,
uint64_t* log_used, uint64_t* log_size,
Env::IOPriority rate_limiter_priority,
bool with_db_mutex = false, bool with_log_mutex = false);
IOStatus WriteToWAL(const WriteThread::WriteGroup& write_group,
@ -2144,7 +2138,7 @@ class DBImpl : public DB {
// only during recovery. Using this feature enables not writing the state to
// memtable on normal writes and hence improving the throughput. Each new
// write of the state will replace the previous state entirely even if the
// keys in the two consecutive states do not overlap.
// keys in the two consecuitive states do not overlap.
// It is protected by log_write_mutex_ when two_write_queues_ is enabled.
// Otherwise only the head of write_thread_ can access it.
WriteBatch cached_recoverable_state_;
@ -2299,7 +2293,7 @@ class DBImpl : public DB {
static const int KEEP_LOG_FILE_NUM = 1000;
// MSVC version 1800 still does not have constexpr for ::max()
static const uint64_t kNoTimeOut = std::numeric_limits<uint64_t>::max();
static const uint64_t kNoTimeOut = port::kMaxUint64;
std::string db_absolute_path_;
@ -2369,6 +2363,11 @@ class DBImpl : public DB {
// DB::Open() or passed to us
bool own_sfm_;
// Default value is 0 which means ALL deletes are
// preserved. Note that this has no effect if preserve_deletes is false.
const std::atomic<SequenceNumber> preserve_deletes_seqnum_{0};
const bool preserve_deletes_ = false;
// Flag to check whether Close() has been called on this DB
bool closed_;
// save the closing status, for re-calling the close()

View File

@ -188,7 +188,7 @@ Status DBImpl::FlushMemTableToOutputFile(
// a memtable without knowing such snapshot(s).
uint64_t max_memtable_id = needs_to_sync_closed_wals
? cfd->imm()->GetLatestMemTableID()
: std::numeric_limits<uint64_t>::max();
: port::kMaxUint64;
// If needs_to_sync_closed_wals is false, then the flush job will pick ALL
// existing memtables of the column family when PickMemTable() is called
@ -263,6 +263,11 @@ Status DBImpl::FlushMemTableToOutputFile(
if (!s.ok() && need_cancel) {
flush_job.Cancel();
}
IOStatus io_s = IOStatus::OK();
io_s = flush_job.io_status();
if (s.ok()) {
s = io_s;
}
if (s.ok()) {
InstallSuperVersionAndScheduleWork(cfd, superversion_context,
@ -298,7 +303,9 @@ Status DBImpl::FlushMemTableToOutputFile(
}
if (!s.ok() && !s.IsShutdownInProgress() && !s.IsColumnFamilyDropped()) {
if (log_io_s.ok()) {
if (!io_s.ok() && !io_s.IsShutdownInProgress() &&
!io_s.IsColumnFamilyDropped()) {
assert(log_io_s.ok());
// Error while writing to MANIFEST.
// In fact, versions_->io_status() can also be the result of renaming
// CURRENT file. With current code, it's just difficult to tell. So just
@ -309,19 +316,24 @@ Status DBImpl::FlushMemTableToOutputFile(
// error), all the Manifest write will be map to soft error.
// TODO: kManifestWriteNoWAL and kFlushNoWAL are misleading. Refactor is
// needed.
error_handler_.SetBGError(s,
error_handler_.SetBGError(io_s,
BackgroundErrorReason::kManifestWriteNoWAL);
} else {
// If WAL sync is successful (either WAL size is 0 or there is no IO
// error), all the other SST file write errors will be set as
// kFlushNoWAL.
error_handler_.SetBGError(s, BackgroundErrorReason::kFlushNoWAL);
error_handler_.SetBGError(io_s, BackgroundErrorReason::kFlushNoWAL);
}
} else {
assert(s == log_io_s);
Status new_bg_error = s;
error_handler_.SetBGError(new_bg_error, BackgroundErrorReason::kFlush);
if (log_io_s.ok()) {
Status new_bg_error = s;
error_handler_.SetBGError(new_bg_error, BackgroundErrorReason::kFlush);
}
}
} else {
// If we got here, then we decided not to care about the io_s status (either
// from never needing it or from ignoring the flush job status).
io_s.PermitUncheckedError();
}
// If flush ran smoothly and no mempurge happened,
// install the new SST file path.
@ -370,14 +382,13 @@ Status DBImpl::FlushMemTablesToOutputFiles(
&earliest_write_conflict_snapshot, &snapshot_checker);
const auto& bg_flush_arg = bg_flush_args[0];
ColumnFamilyData* cfd = bg_flush_arg.cfd_;
// intentional infrequent copy for each flush
MutableCFOptions mutable_cf_options_copy = *cfd->GetLatestMutableCFOptions();
MutableCFOptions mutable_cf_options = *cfd->GetLatestMutableCFOptions();
SuperVersionContext* superversion_context =
bg_flush_arg.superversion_context_;
Status s = FlushMemTableToOutputFile(
cfd, mutable_cf_options_copy, made_progress, job_context,
superversion_context, snapshot_seqs, earliest_write_conflict_snapshot,
snapshot_checker, log_buffer, thread_pri);
cfd, mutable_cf_options, made_progress, job_context, superversion_context,
snapshot_seqs, earliest_write_conflict_snapshot, snapshot_checker,
log_buffer, thread_pri);
return s;
}
@ -404,7 +415,6 @@ Status DBImpl::AtomicFlushMemTablesToOutputFiles(
for (const auto cfd : cfds) {
assert(cfd->imm()->NumNotFlushed() != 0);
assert(cfd->imm()->IsFlushPending());
assert(cfd->GetFlushReason() == cfds[0]->GetFlushReason());
}
#endif /* !NDEBUG */
@ -491,10 +501,12 @@ Status DBImpl::AtomicFlushMemTablesToOutputFiles(
// exec_status stores the execution status of flush_jobs as
// <bool /* executed */, Status /* status code */>
autovector<std::pair<bool, Status>> exec_status;
autovector<IOStatus> io_status;
std::vector<bool> pick_status;
for (int i = 0; i != num_cfs; ++i) {
// Initially all jobs are not executed, with status OK.
exec_status.emplace_back(false, Status::OK());
io_status.emplace_back(IOStatus::OK());
pick_status.push_back(false);
}
@ -514,6 +526,7 @@ Status DBImpl::AtomicFlushMemTablesToOutputFiles(
jobs[i]->Run(&logs_with_prep_tracker_, &file_meta[i],
&(switched_to_mempurge.at(i)));
exec_status[i].first = true;
io_status[i] = jobs[i]->io_status();
}
if (num_cfs > 1) {
TEST_SYNC_POINT(
@ -527,6 +540,7 @@ Status DBImpl::AtomicFlushMemTablesToOutputFiles(
&logs_with_prep_tracker_, file_meta.data() /* &file_meta[0] */,
switched_to_mempurge.empty() ? nullptr : &(switched_to_mempurge.at(0)));
exec_status[0].first = true;
io_status[0] = jobs[0]->io_status();
Status error_status;
for (const auto& e : exec_status) {
@ -545,6 +559,21 @@ Status DBImpl::AtomicFlushMemTablesToOutputFiles(
s = error_status.ok() ? s : error_status;
}
IOStatus io_s = IOStatus::OK();
if (io_s.ok()) {
IOStatus io_error = IOStatus::OK();
for (int i = 0; i != static_cast<int>(io_status.size()); i++) {
if (!io_status[i].ok() && !io_status[i].IsShutdownInProgress() &&
!io_status[i].IsColumnFamilyDropped()) {
io_error = io_status[i];
}
}
io_s = io_error;
if (s.ok() && !io_s.ok()) {
s = io_s;
}
}
if (s.IsColumnFamilyDropped()) {
s = Status::OK();
}
@ -617,10 +646,7 @@ Status DBImpl::AtomicFlushMemTablesToOutputFiles(
return std::make_pair(Status::OK(), !ready);
};
bool resuming_from_bg_err =
error_handler_.IsDBStopped() ||
(cfds[0]->GetFlushReason() == FlushReason::kErrorRecovery ||
cfds[0]->GetFlushReason() == FlushReason::kErrorRecoveryRetryFlush);
bool resuming_from_bg_err = error_handler_.IsDBStopped();
while ((!resuming_from_bg_err || error_handler_.GetRecoveryError().ok())) {
std::pair<Status, bool> res = wait_to_install_func();
@ -635,10 +661,7 @@ Status DBImpl::AtomicFlushMemTablesToOutputFiles(
}
atomic_flush_install_cv_.Wait();
resuming_from_bg_err =
error_handler_.IsDBStopped() ||
(cfds[0]->GetFlushReason() == FlushReason::kErrorRecovery ||
cfds[0]->GetFlushReason() == FlushReason::kErrorRecoveryRetryFlush);
resuming_from_bg_err = error_handler_.IsDBStopped();
}
if (!resuming_from_bg_err) {
@ -762,7 +785,8 @@ Status DBImpl::AtomicFlushMemTablesToOutputFiles(
// Need to undo atomic flush if something went wrong, i.e. s is not OK and
// it is not because of CF drop.
if (!s.ok() && !s.IsColumnFamilyDropped()) {
if (log_io_s.ok()) {
if (!io_s.ok() && !io_s.IsColumnFamilyDropped()) {
assert(log_io_s.ok());
// Error while writing to MANIFEST.
// In fact, versions_->io_status() can also be the result of renaming
// CURRENT file. With current code, it's just difficult to tell. So just
@ -773,18 +797,19 @@ Status DBImpl::AtomicFlushMemTablesToOutputFiles(
// error), all the Manifest write will be map to soft error.
// TODO: kManifestWriteNoWAL and kFlushNoWAL are misleading. Refactor
// is needed.
error_handler_.SetBGError(s,
error_handler_.SetBGError(io_s,
BackgroundErrorReason::kManifestWriteNoWAL);
} else {
// If WAL sync is successful (either WAL size is 0 or there is no IO
// error), all the other SST file write errors will be set as
// kFlushNoWAL.
error_handler_.SetBGError(s, BackgroundErrorReason::kFlushNoWAL);
error_handler_.SetBGError(io_s, BackgroundErrorReason::kFlushNoWAL);
}
} else {
assert(s == log_io_s);
Status new_bg_error = s;
error_handler_.SetBGError(new_bg_error, BackgroundErrorReason::kFlush);
if (log_io_s.ok()) {
Status new_bg_error = s;
error_handler_.SetBGError(new_bg_error, BackgroundErrorReason::kFlush);
}
}
}
@ -901,7 +926,7 @@ Status DBImpl::CompactRange(const CompactRangeOptions& options,
size_t ts_sz = ucmp->timestamp_size();
if (ts_sz == 0) {
return CompactRangeInternal(options, column_family, begin_without_ts,
end_without_ts, "" /*trim_ts*/);
end_without_ts);
}
std::string begin_str;
@ -923,7 +948,7 @@ Status DBImpl::CompactRange(const CompactRangeOptions& options,
Slice* end_with_ts = end_without_ts ? &end : nullptr;
return CompactRangeInternal(options, column_family, begin_with_ts,
end_with_ts, "" /*trim_ts*/);
end_with_ts);
}
Status DBImpl::IncreaseFullHistoryTsLow(ColumnFamilyHandle* column_family,
@ -969,8 +994,7 @@ Status DBImpl::IncreaseFullHistoryTsLowImpl(ColumnFamilyData* cfd,
Status DBImpl::CompactRangeInternal(const CompactRangeOptions& options,
ColumnFamilyHandle* column_family,
const Slice* begin, const Slice* end,
const std::string& trim_ts) {
const Slice* begin, const Slice* end) {
auto cfh = static_cast_with_check<ColumnFamilyHandleImpl>(column_family);
auto cfd = cfh->cfd();
@ -1041,8 +1065,7 @@ Status DBImpl::CompactRangeInternal(const CompactRangeOptions& options,
}
s = RunManualCompaction(cfd, ColumnFamilyData::kCompactAllLevels,
final_output_level, options, begin, end, exclusive,
false, std::numeric_limits<uint64_t>::max(),
trim_ts);
false, port::kMaxUint64);
} else {
int first_overlapped_level = kInvalidLevel;
int max_overlapped_level = kInvalidLevel;
@ -1079,7 +1102,7 @@ Status DBImpl::CompactRangeInternal(const CompactRangeOptions& options,
if (s.ok() && first_overlapped_level != kInvalidLevel) {
// max_file_num_to_ignore can be used to filter out newly created SST
// files, useful for bottom level compaction in a manual compaction
uint64_t max_file_num_to_ignore = std::numeric_limits<uint64_t>::max();
uint64_t max_file_num_to_ignore = port::kMaxUint64;
uint64_t next_file_number = versions_->current_next_file_number();
final_output_level = max_overlapped_level;
int output_level;
@ -1128,13 +1151,9 @@ Status DBImpl::CompactRangeInternal(const CompactRangeOptions& options,
disallow_trivial_move = true;
}
}
// trim_ts need real compaction to remove latest record
if (!trim_ts.empty()) {
disallow_trivial_move = true;
}
s = RunManualCompaction(cfd, level, output_level, options, begin, end,
exclusive, disallow_trivial_move,
max_file_num_to_ignore, trim_ts);
max_file_num_to_ignore);
if (!s.ok()) {
break;
}
@ -1364,17 +1383,16 @@ Status DBImpl::CompactFilesImpl(
CompactionJob compaction_job(
job_context->job_id, c.get(), immutable_db_options_, mutable_db_options_,
file_options_for_compaction_, versions_.get(), &shutting_down_,
log_buffer, directories_.GetDbDir(),
preserve_deletes_seqnum_.load(), log_buffer, directories_.GetDbDir(),
GetDataDir(c->column_family_data(), c->output_path_id()),
GetDataDir(c->column_family_data(), 0), stats_, &mutex_, &error_handler_,
snapshot_seqs, earliest_write_conflict_snapshot, snapshot_checker,
job_context, table_cache_, &event_logger_,
table_cache_, &event_logger_,
c->mutable_cf_options()->paranoid_file_checks,
c->mutable_cf_options()->report_bg_io_stats, dbname_,
&compaction_job_stats, Env::Priority::USER, io_tracer_,
&manual_compaction_paused_, nullptr, db_id_, db_session_id_,
c->column_family_data()->GetFullHistoryTsLow(), c->trim_ts(),
&blob_callback_);
c->column_family_data()->GetFullHistoryTsLow(), &blob_callback_);
// Creating a compaction influences the compaction score because the score
// takes running compactions into account (by skipping files that are already
@ -1763,7 +1781,7 @@ Status DBImpl::RunManualCompaction(
ColumnFamilyData* cfd, int input_level, int output_level,
const CompactRangeOptions& compact_range_options, const Slice* begin,
const Slice* end, bool exclusive, bool disallow_trivial_move,
uint64_t max_file_num_to_ignore, const std::string& trim_ts) {
uint64_t max_file_num_to_ignore) {
assert(input_level == ColumnFamilyData::kCompactAllLevels ||
input_level >= 0);
@ -1875,7 +1893,7 @@ Status DBImpl::RunManualCompaction(
*manual.cfd->GetLatestMutableCFOptions(), mutable_db_options_,
manual.input_level, manual.output_level, compact_range_options,
manual.begin, manual.end, &manual.manual_end, &manual_conflict,
max_file_num_to_ignore, trim_ts)) == nullptr &&
max_file_num_to_ignore)) == nullptr &&
manual_conflict))) {
// exclusive manual compactions should not see a conflict during
// CompactRange
@ -2014,7 +2032,7 @@ Status DBImpl::FlushMemTable(ColumnFamilyData* cfd,
// be created and scheduled, status::OK() will be returned.
s = SwitchMemtable(cfd, &context);
}
const uint64_t flush_memtable_id = std::numeric_limits<uint64_t>::max();
const uint64_t flush_memtable_id = port::kMaxUint64;
if (s.ok()) {
if (cfd->imm()->NumNotFlushed() != 0 || !cfd->mem()->IsEmpty() ||
!cached_recoverable_state_empty_.load()) {
@ -3358,18 +3376,19 @@ Status DBImpl::BackgroundCompaction(bool* made_progress,
CompactionJob compaction_job(
job_context->job_id, c.get(), immutable_db_options_,
mutable_db_options_, file_options_for_compaction_, versions_.get(),
&shutting_down_, log_buffer, directories_.GetDbDir(),
&shutting_down_, preserve_deletes_seqnum_.load(), log_buffer,
directories_.GetDbDir(),
GetDataDir(c->column_family_data(), c->output_path_id()),
GetDataDir(c->column_family_data(), 0), stats_, &mutex_,
&error_handler_, snapshot_seqs, earliest_write_conflict_snapshot,
snapshot_checker, job_context, table_cache_, &event_logger_,
snapshot_checker, table_cache_, &event_logger_,
c->mutable_cf_options()->paranoid_file_checks,
c->mutable_cf_options()->report_bg_io_stats, dbname_,
&compaction_job_stats, thread_pri, io_tracer_,
is_manual ? &manual_compaction_paused_ : nullptr,
is_manual ? manual_compaction->canceled : nullptr, db_id_,
db_session_id_, c->column_family_data()->GetFullHistoryTsLow(),
c->trim_ts(), &blob_callback_);
&blob_callback_);
compaction_job.Prepare();
NotifyOnCompactionBegin(c->column_family_data(), c.get(), status,

View File

@ -118,11 +118,9 @@ Status DBImpl::TEST_CompactRange(int level, const Slice* begin,
cfd->ioptions()->compaction_style == kCompactionStyleFIFO)
? level
: level + 1;
return RunManualCompaction(
cfd, level, output_level, CompactRangeOptions(), begin, end, true,
disallow_trivial_move,
std::numeric_limits<uint64_t>::max() /*max_file_num_to_ignore*/,
"" /*trim_ts*/);
return RunManualCompaction(cfd, level, output_level, CompactRangeOptions(),
begin, end, true, disallow_trivial_move,
port::kMaxUint64 /*max_file_num_to_ignore*/);
}
Status DBImpl::TEST_SwitchMemtable(ColumnFamilyData* cfd) {

View File

@ -761,7 +761,7 @@ uint64_t PrecomputeMinLogNumberToKeepNon2PC(
assert(!cfds_to_flush.empty());
assert(cfds_to_flush.size() == edit_lists.size());
uint64_t min_log_number_to_keep = std::numeric_limits<uint64_t>::max();
uint64_t min_log_number_to_keep = port::kMaxUint64;
for (const auto& edit_list : edit_lists) {
uint64_t log = 0;
for (const auto& e : edit_list) {
@ -773,7 +773,7 @@ uint64_t PrecomputeMinLogNumberToKeepNon2PC(
min_log_number_to_keep = std::min(min_log_number_to_keep, log);
}
}
if (min_log_number_to_keep == std::numeric_limits<uint64_t>::max()) {
if (min_log_number_to_keep == port::kMaxUint64) {
min_log_number_to_keep = cfds_to_flush[0]->GetLogNumber();
for (size_t i = 1; i < cfds_to_flush.size(); i++) {
min_log_number_to_keep =
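To make the reduction above easier to follow, here is an illustrative sketch: take the smallest WAL number recorded by the edits of the column families being flushed, and fall back to the CFs' current log numbers when no edit carries one. Types are simplified stand-ins, not the RocksDB structures.
#include <algorithm>
#include <cstdint>
#include <limits>
#include <vector>
uint64_t MinLogNumberToKeep(const std::vector<std::vector<uint64_t>>& edit_logs,
                            const std::vector<uint64_t>& cf_log_numbers) {
  uint64_t min_log = std::numeric_limits<uint64_t>::max();
  for (const auto& edits : edit_logs) {
    uint64_t log = 0;
    for (uint64_t l : edits) {
      if (l != 0) log = l;  // last recorded log number wins within one list
    }
    if (log != 0) min_log = std::min(min_log, log);
  }
  if (min_log == std::numeric_limits<uint64_t>::max()) {
    for (uint64_t l : cf_log_numbers) min_log = std::min(min_log, l);
  }
  return min_log;
}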

View File

@ -760,11 +760,11 @@ Status DBImpl::PersistentStatsProcessFormatVersion() {
WriteBatch batch;
if (s.ok()) {
s = batch.Put(persist_stats_cf_handle_, kFormatVersionKeyString,
std::to_string(kStatsCFCurrentFormatVersion));
ToString(kStatsCFCurrentFormatVersion));
}
if (s.ok()) {
s = batch.Put(persist_stats_cf_handle_, kCompatibleVersionKeyString,
std::to_string(kStatsCFCompatibleFormatVersion));
ToString(kStatsCFCompatibleFormatVersion));
}
if (s.ok()) {
WriteOptions wo;
@ -947,6 +947,7 @@ Status DBImpl::RecoverLogFiles(const std::vector<uint64_t>& wal_numbers,
// Read all the records and add to a memtable
std::string scratch;
Slice record;
WriteBatch batch;
TEST_SYNC_POINT_CALLBACK("DBImpl::RecoverLogFiles:BeforeReadWal",
/*arg=*/nullptr);
@ -960,15 +961,10 @@ Status DBImpl::RecoverLogFiles(const std::vector<uint64_t>& wal_numbers,
continue;
}
// We create a new batch and initialize with a valid prot_info_ to store
// the data checksums
WriteBatch batch(0, 0, 8, 0);
status = WriteBatchInternal::SetContents(&batch, record);
if (!status.ok()) {
return status;
}
SequenceNumber sequence = WriteBatchInternal::Sequence(&batch);
if (immutable_db_options_.wal_recovery_mode ==
@ -1326,7 +1322,6 @@ Status DBImpl::GetLogSizeAndMaybeTruncate(uint64_t wal_number, bool truncate,
Status s;
// This gets the apparent size of the WALs, not including preallocated space.
s = env_->GetFileSize(fname, &log.size);
TEST_SYNC_POINT_CALLBACK("DBImpl::GetLogSizeAndMaybeTruncate:0", /*arg=*/&s);
if (s.ok() && truncate) {
std::unique_ptr<FSWritableFile> last_log;
Status truncate_status = fs_->ReopenWritableFile(
@ -1470,9 +1465,9 @@ Status DBImpl::WriteLevel0TableForRecovery(int job_id, ColumnFamilyData* cfd,
dbname_, versions_.get(), immutable_db_options_, tboptions,
file_options_for_compaction_, cfd->table_cache(), iter.get(),
std::move(range_del_iters), &meta, &blob_file_additions,
snapshot_seqs, earliest_write_conflict_snapshot, kMaxSequenceNumber,
snapshot_checker, paranoid_file_checks, cfd->internal_stats(), &io_s,
io_tracer_, BlobFileCreationReason::kRecovery, &event_logger_, job_id,
snapshot_seqs, earliest_write_conflict_snapshot, snapshot_checker,
paranoid_file_checks, cfd->internal_stats(), &io_s, io_tracer_,
BlobFileCreationReason::kRecovery, &event_logger_, job_id,
Env::IO_HIGH, nullptr /* table_properties */, write_hint,
nullptr /*full_history_ts_low*/, &blob_callback_);
LogFlush(immutable_db_options_.info_log);
@ -1571,72 +1566,6 @@ Status DB::Open(const DBOptions& db_options, const std::string& dbname,
!kSeqPerBatch, kBatchPerTxn);
}
// TODO: Implement the trimming in flush code path.
// TODO: Perform trimming before inserting into memtable during recovery.
// TODO: Pick files with max_timestamp > trim_ts by each file's timestamp meta
// info, and handle only these files to reduce io.
Status DB::OpenAndTrimHistory(
const DBOptions& db_options, const std::string& dbname,
const std::vector<ColumnFamilyDescriptor>& column_families,
std::vector<ColumnFamilyHandle*>* handles, DB** dbptr,
std::string trim_ts) {
assert(dbptr != nullptr);
assert(handles != nullptr);
auto validate_options = [&db_options] {
if (db_options.avoid_flush_during_recovery) {
return Status::InvalidArgument(
"avoid_flush_during_recovery incompatible with "
"OpenAndTrimHistory");
}
return Status::OK();
};
auto s = validate_options();
if (!s.ok()) {
return s;
}
DB* db = nullptr;
s = DB::Open(db_options, dbname, column_families, handles, &db);
if (!s.ok()) {
return s;
}
assert(db);
CompactRangeOptions options;
options.bottommost_level_compaction =
BottommostLevelCompaction::kForceOptimized;
auto db_impl = static_cast_with_check<DBImpl>(db);
for (auto handle : *handles) {
assert(handle != nullptr);
auto cfh = static_cast_with_check<ColumnFamilyHandleImpl>(handle);
auto cfd = cfh->cfd();
assert(cfd != nullptr);
// Only compact column families with timestamp enabled
if (cfd->user_comparator() != nullptr &&
cfd->user_comparator()->timestamp_size() > 0) {
s = db_impl->CompactRangeInternal(options, handle, nullptr, nullptr,
trim_ts);
if (!s.ok()) {
break;
}
}
}
auto clean_op = [&handles, &db] {
for (auto handle : *handles) {
auto temp_s = db->DestroyColumnFamilyHandle(handle);
assert(temp_s.ok());
}
handles->clear();
delete db;
};
if (!s.ok()) {
clean_op();
return s;
}
*dbptr = db;
return s;
}
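A rough usage sketch for the DB::OpenAndTrimHistory() entry point shown above. It assumes the column family's user comparator is timestamp-aware and that trim_ts is already encoded in that comparator's timestamp format; the helper and its parameters are illustrative, not taken from this diff.
#include <string>
#include <vector>
#include "rocksdb/db.h"
ROCKSDB_NAMESPACE::Status OpenTrimmed(
    const std::string& dbname, const std::string& trim_ts,
    std::vector<ROCKSDB_NAMESPACE::ColumnFamilyHandle*>* handles,
    ROCKSDB_NAMESPACE::DB** db) {
  ROCKSDB_NAMESPACE::DBOptions db_options;
  // The API rejects avoid_flush_during_recovery, so leave it at false.
  db_options.avoid_flush_during_recovery = false;
  ROCKSDB_NAMESPACE::ColumnFamilyOptions cf_opts;
  // cf_opts.comparator would be set to a timestamp-aware comparator here;
  // only such column families are compacted by OpenAndTrimHistory.
  std::vector<ROCKSDB_NAMESPACE::ColumnFamilyDescriptor> cfs = {
      {ROCKSDB_NAMESPACE::kDefaultColumnFamilyName, cf_opts}};
  return ROCKSDB_NAMESPACE::DB::OpenAndTrimHistory(db_options, dbname, cfs,
                                                   handles, db, trim_ts);
}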
IOStatus DBImpl::CreateWAL(uint64_t log_file_num, uint64_t recycle_log_number,
size_t preallocate_block_size,
log::Writer** new_log) {
@ -1822,11 +1751,10 @@ Status DBImpl::Open(const DBOptions& db_options, const std::string& dbname,
uint64_t log_used, log_size;
log::Writer* log_writer = impl->logs_.back().writer;
s = impl->WriteToWAL(empty_batch, log_writer, &log_used, &log_size,
Env::IO_TOTAL, /*with_db_mutex==*/true);
/*with_db_mutex==*/true);
if (s.ok()) {
// Need to fsync, otherwise it might get lost after a power reset.
s = impl->FlushWAL(false);
TEST_SYNC_POINT_CALLBACK("DBImpl::Open::BeforeSyncWAL", /*arg=*/&s);
if (s.ok()) {
s = log_writer->file()->Sync(impl->immutable_db_options_.use_fsync);
}
@ -1977,10 +1905,10 @@ Status DBImpl::Open(const DBOptions& db_options, const std::string& dbname,
"DB::Open() failed --- Unable to persist Options file",
persist_options_status.ToString());
}
}
if (!s.ok()) {
} else {
ROCKS_LOG_WARN(impl->immutable_db_options_.info_log,
"DB::Open() failed: %s", s.ToString().c_str());
"Persisting Option File error: %s",
persist_options_status.ToString().c_str());
}
if (s.ok()) {
s = impl->StartPeriodicWorkScheduler();

View File

@ -247,16 +247,15 @@ Status DBImplSecondary::RecoverLogFiles(
if (seq_of_batch <= seq) {
continue;
}
auto curr_log_num = std::numeric_limits<uint64_t>::max();
auto curr_log_num = port::kMaxUint64;
if (cfd_to_current_log_.count(cfd) > 0) {
curr_log_num = cfd_to_current_log_[cfd];
}
// If the active memtable contains records added by replaying an
// earlier WAL, then we need to seal the memtable, add it to the
// immutable memtable list and create a new active memtable.
if (!cfd->mem()->IsEmpty() &&
(curr_log_num == std::numeric_limits<uint64_t>::max() ||
curr_log_num != log_number)) {
if (!cfd->mem()->IsEmpty() && (curr_log_num == port::kMaxUint64 ||
curr_log_num != log_number)) {
const MutableCFOptions mutable_cf_options =
*cfd->GetLatestMutableCFOptions();
MemTable* new_mem =
@ -711,11 +710,8 @@ Status DB::OpenAsSecondary(
}
Status DBImplSecondary::CompactWithoutInstallation(
const OpenAndCompactOptions& options, ColumnFamilyHandle* cfh,
const CompactionServiceInput& input, CompactionServiceResult* result) {
if (options.canceled && options.canceled->load(std::memory_order_acquire)) {
return Status::Incomplete(Status::SubCode::kManualCompactionPaused);
}
ColumnFamilyHandle* cfh, const CompactionServiceInput& input,
CompactionServiceResult* result) {
InstrumentedMutexLock l(&mutex_);
auto cfd = static_cast_with_check<ColumnFamilyHandleImpl>(cfh)->cfd();
if (!cfd) {
@ -777,7 +773,7 @@ Status DBImplSecondary::CompactWithoutInstallation(
file_options_for_compaction_, versions_.get(), &shutting_down_,
&log_buffer, output_dir.get(), stats_, &mutex_, &error_handler_,
input.snapshots, table_cache_, &event_logger_, dbname_, io_tracer_,
options.canceled, db_id_, db_session_id_, secondary_path_, input, result);
db_id_, db_session_id_, secondary_path_, input, result);
mutex_.Unlock();
s = compaction_job.Run();
@ -796,13 +792,9 @@ Status DBImplSecondary::CompactWithoutInstallation(
}
Status DB::OpenAndCompact(
const OpenAndCompactOptions& options, const std::string& name,
const std::string& output_directory, const std::string& input,
std::string* output,
const std::string& name, const std::string& output_directory,
const std::string& input, std::string* result,
const CompactionServiceOptionsOverride& override_options) {
if (options.canceled && options.canceled->load(std::memory_order_acquire)) {
return Status::Incomplete(Status::SubCode::kManualCompactionPaused);
}
CompactionServiceInput compaction_input;
Status s = CompactionServiceInput::Read(input, &compaction_input);
if (!s.ok()) {
@ -832,7 +824,6 @@ Status DB::OpenAndCompact(
override_options.table_factory;
compaction_input.column_family.options.sst_partitioner_factory =
override_options.sst_partitioner_factory;
compaction_input.db_options.listeners = override_options.listeners;
std::vector<ColumnFamilyDescriptor> column_families;
column_families.push_back(compaction_input.column_family);
@ -856,10 +847,10 @@ Status DB::OpenAndCompact(
CompactionServiceResult compaction_result;
DBImplSecondary* db_secondary = static_cast_with_check<DBImplSecondary>(db);
assert(handles.size() > 0);
s = db_secondary->CompactWithoutInstallation(
options, handles[0], compaction_input, &compaction_result);
s = db_secondary->CompactWithoutInstallation(handles[0], compaction_input,
&compaction_result);
Status serialization_status = compaction_result.Write(output);
Status serialization_status = compaction_result.Write(result);
for (auto& handle : handles) {
delete handle;
@ -871,14 +862,6 @@ Status DB::OpenAndCompact(
return s;
}
Status DB::OpenAndCompact(
const std::string& name, const std::string& output_directory,
const std::string& input, std::string* output,
const CompactionServiceOptionsOverride& override_options) {
return OpenAndCompact(OpenAndCompactOptions(), name, output_directory, input,
output, override_options);
}
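The main branch adds an early-cancel check at the top of both entry points above: an optional atomic flag in the options is consulted before any work starts. A minimal sketch of that pattern follows, with illustrative names rather than the actual OpenAndCompactOptions type.
#include <atomic>
struct OpenAndCompactOpts {
  std::atomic<bool>* canceled = nullptr;  // owned by the caller
};
bool ShouldAbortEarly(const OpenAndCompactOpts& options) {
  // Acquire ordering matches the load in the diff above, so a cancellation
  // requested by another thread is observed before compaction begins.
  return options.canceled != nullptr &&
         options.canceled->load(std::memory_order_acquire);
}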
#else // !ROCKSDB_LITE
Status DB::OpenAsSecondary(const Options& /*options*/,

View File

@ -228,11 +228,10 @@ class DBImplSecondary : public DBImpl {
Status CheckConsistency() override;
#ifndef NDEBUG
Status TEST_CompactWithoutInstallation(const OpenAndCompactOptions& options,
ColumnFamilyHandle* cfh,
Status TEST_CompactWithoutInstallation(ColumnFamilyHandle* cfh,
const CompactionServiceInput& input,
CompactionServiceResult* result) {
return CompactWithoutInstallation(options, cfh, input, result);
return CompactWithoutInstallation(cfh, input, result);
}
#endif // NDEBUG
@ -347,8 +346,7 @@ class DBImplSecondary : public DBImpl {
// Run compaction without installation; the output files will be placed in the
// secondary DB path. The LSM tree won't be changed, and the secondary DB is
// still in read-only mode.
Status CompactWithoutInstallation(const OpenAndCompactOptions& options,
ColumnFamilyHandle* cfh,
Status CompactWithoutInstallation(ColumnFamilyHandle* cfh,
const CompactionServiceInput& input,
CompactionServiceResult* result);

Some files were not shown because too many files have changed in this diff.