Summary:
It's useful to build RocksDB using a more recent clang version in CI. Add a CircleCI build and fix some issues with it.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/7025
Test Plan: See all tests pass.
Reviewed By: pdillinger
Differential Revision: D22215700
fbshipit-source-id: 914a729c2cd3f3ac4a627cc0ac58d4691dca2168
Summary:
EncryptEnv class is both declared and defined within env_encryption.cc. This makes it really tough to derive new classes from that base.
This branch moves declaration of the class to rocksdb/env_encryption.h. The change facilitates making new encryption modules (such as an upcoming openssl AES CTR pull request) possible / easy.
The only coding change was to add the EncryptEnv object to env_basic_test.cc.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6830
Reviewed By: riversand963
Differential Revision: D21706593
Pulled By: ajkr
fbshipit-source-id: 64d2da95a1569ceeb9b1549c3bec5404cf4c89f0
Summary:
`Env::LowerThreadPoolCPUPriority` takes a new parameter `CpuPriority`, so the pool can be lowered to a specific priority such as `CpuPriority::kIdle`. Previously, the priority was always lowered to `CpuPriority::kLow`.
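A minimal usage sketch (assuming the new overload is `Status Env::LowerThreadPoolCPUPriority(Priority, CpuPriority)` and that `CpuPriority` exposes `kIdle`; treat the exact signature as an assumption):
```
#include "rocksdb/env.h"

int main() {
  rocksdb::Env* env = rocksdb::Env::Default();
  // Previously the pool could only be lowered to kLow; the new parameter
  // allows picking a specific level such as kIdle.
  rocksdb::Status s = env->LowerThreadPoolCPUPriority(
      rocksdb::Env::Priority::BOTTOM, rocksdb::CpuPriority::kIdle);
  return s.ok() ? 0 : 1;
}
```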
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6969
Test Plan: unit test `EnvPosixTest::LowerThreadPoolCpuPriority` added to `env_test.cc`.
Reviewed By: siying
Differential Revision: D22011169
Pulled By: cheng-chang
fbshipit-source-id: 568878c24a924912e35cef00c552d4a63431cdf4
Summary:
When operation on an open file descriptor fails, we should close the file descriptor.
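A minimal POSIX sketch of the pattern this change enforces (an illustration, not the exact code paths touched by the PR):
```
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

// Returns -1 on failure and never leaks the descriptor when an operation
// on the freshly opened fd fails.
int OpenAndStat(const char* fname) {
  int fd = open(fname, O_RDWR);
  if (fd < 0) {
    return -1;
  }
  struct stat sbuf;
  if (fstat(fd, &sbuf) != 0) {
    close(fd);  // the fix: close the fd before returning the error
    return -1;
  }
  return fd;
}
```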
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6936
Test Plan: make check
Reviewed By: pdillinger
Differential Revision: D21885458
Pulled By: riversand963
fbshipit-source-id: ba077a76b256a8537f21e22e4ec198f45390bf50
Summary:
When */RunMany/* tests are run individually, e.g. ChrootEnvWithDirectIO/EnvPosixTestWithParam.RunMany/0, they hang. This is because they insert into the background thread pools without initializing them. Fix it.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6931
Test Plan: Run ChrootEnvWithDirectIO/EnvPosixTestWithParam.RunMany/0 by itself and see it passes.
Reviewed By: riversand963
Differential Revision: D21875603
fbshipit-source-id: 7f848174c1a660254a2b1f7e11cca5370793ba30
Summary:
This reverts commit 8d87e9cea1.
Based on offline discussions, it's too early to upgrade to gtest 1.10, as it prevents some developers from using an older version of gtest to integrate with some other systems. Revert it for now.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6923
Reviewed By: pdillinger
Differential Revision: D21864799
fbshipit-source-id: d0726b1ff649fc911b9378f1763316200bd363fc
Summary:
GetTestDirectory implies a file system operation (it creates the
default test directory if missing), so it should be routed to
the FileSystem rather than the Env.
Also remove the GetTestDirectory implementation in the PosixEnv,
since it overrides GetTestDirectory in CompositeEnv making it
impossible to override with a custom FileSystem.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6896
Reviewed By: cheng-chang
Differential Revision: D21868984
Pulled By: ajkr
fbshipit-source-id: e79bfef758d06dacef727c54b96abe62e78726fd
Summary:
`x.size() - 1` or `y - 1` can wrap around to an extremely large value when `x.size()` or `y` is 0, since they are unsigned types. The end condition of `i` in the for loop then becomes extremely large, potentially causing a segmentation fault. Fix them.
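A small illustration of the failure mode and one possible guard (hypothetical variable names):
```
#include <cstddef>
#include <string>

void Example(const std::string& x) {
  // Broken: if x.size() == 0, x.size() - 1 wraps around to a huge value and
  // the loop reads far out of bounds.
  // for (size_t i = 0; i <= x.size() - 1; ++i) { ... }

  // Fixed: never subtract from an unsigned value that may be zero.
  for (size_t i = 0; i + 1 <= x.size(); ++i) {
    (void)x[i];
  }
}
```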
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6902
Test Plan: pass make asan_check
Reviewed By: ajkr
Differential Revision: D21843767
Pulled By: zhichao-cao
fbshipit-source-id: 5b8b88155ac5a93d86246d832e89905a783bb5a1
Summary:
If both direct IO and io_uring are enabled and io_uring returns a partial result, we try to read the remaining part of the request, but the starting address/offset of the remaining part might not be aligned to the block size; in direct IO mode, this unaligned offset causes a bug.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6853
Test Plan: run make check with both direct IO and IO uring enabled, this is covered by one of the continuous tests.
Reviewed By: anand1976
Differential Revision: D21603023
Pulled By: cheng-chang
fbshipit-source-id: 942f6a11ff21e1892af6c4464e02bab4c707787c
Summary:
The dynamic_cast in the filter benchmark causes release mode to fail due to
no-rtti. Replace with static_cast_with_check.
Signed-off-by: Derrick Pallas <derrick@pallas.us>
Addition by peterd: Remove unnecessary 2nd template arg on all static_cast_with_check
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6732
Reviewed By: ltamasi
Differential Revision: D21304260
Pulled By: pdillinger
fbshipit-source-id: 6e8eb437c4ca5a16dbbfa4053d67c4ad55f1608c
Summary:
In release mode, asserts are compiled out, so `r` is not used, causing compiler warnings.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6750
Test Plan: make check under release mode
Reviewed By: anand1976
Differential Revision: D21220365
Pulled By: cheng-chang
fbshipit-source-id: fd4afa9843d54af68c4da8660ec61549803e1167
Summary:
IsDirectory() is a common API to check whether a path is a regular file or
directory.
POSIX: call stat() and use S_ISDIR(st_mode)
Windows: PathIsDirectoryA() and PathIsDirectoryW()
HDFS: FileSystem.IsDirectory()
Java: File.IsDirectory()
...
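A minimal sketch of the POSIX variant listed above (stat() + S_ISDIR); the real RocksDB method also reports errors through Status/IOStatus, so this is only an illustration:
```
#include <sys/stat.h>
#include <string>

// Sets *is_dir to true iff `path` refers to a directory; returns false if
// stat() itself fails.
bool IsDirectory(const std::string& path, bool* is_dir) {
  struct stat sbuf;
  if (stat(path.c_str(), &sbuf) != 0) {
    return false;
  }
  *is_dir = S_ISDIR(sbuf.st_mode);
  return true;
}
```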
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6711
Test Plan: make check
Reviewed By: anand1976
Differential Revision: D21053520
Pulled By: riversand963
fbshipit-source-id: 680aadfd8ce982b63689190cf31b3145d5a89e27
Summary:
Compilation fails on systems that do not support O_CLOEXEC. Fix it.
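One common way to keep such code compiling on platforms without O_CLOEXEC is to define the flag to zero so it becomes a no-op; a sketch of that idea (not necessarily the exact fix in this PR):
```
#include <fcntl.h>

#ifndef O_CLOEXEC
// Platforms without close-on-exec support: make the flag a no-op so the
// open() call sites still compile.
#define O_CLOEXEC 0
#endif

int OpenForRead(const char* fname) {
  return open(fname, O_RDONLY | O_CLOEXEC);
}
```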
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6695
Test Plan: compile without O_CLOEXEC support
Reviewed By: anand1976
Differential Revision: D21011850
Pulled By: cheng-chang
fbshipit-source-id: f1bf1cce2aa65c7d10b5a9613e941db30e928347
Summary:
Add `encrypt_data_time` and `decrypt_data_time` perf_context counters to time encryption/decryption time when `EnvEncryption` is enabled.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6596
Test Plan: CI
Reviewed By: anand1976
Differential Revision: D20678617
fbshipit-source-id: 7b57536143aa38509cde011f704de33382169e07
Summary:
This change updates PosixSequentialFile::Read to call clearerr()
before fread()ing again after an EINTR is returned on a previous
fread.
The original fix is from bd8f1ebb91.
Fixing https://github.com/facebook/rocksdb/issues/6509
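A condensed sketch of the retry loop described above (a hypothetical helper, not the exact PosixSequentialFile code):
```
#include <cerrno>
#include <cstdio>

// Read up to n bytes, retrying on EINTR. Without clearerr(), the error flag
// left by the interrupted fread() would make every subsequent call return 0
// immediately.
size_t ReadWithRetry(FILE* file, char* scratch, size_t n) {
  size_t r = 0;
  do {
    clearerr(file);
    r = fread(scratch, 1, n, file);
  } while (r == 0 && ferror(file) && errno == EINTR);
  return r;
}
```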
Signed-off-by: phantomape <cxucheng@outlook.com>
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6609
Reviewed By: zhichao-cao
Differential Revision: D20731482
Pulled By: riversand963
fbshipit-source-id: 7f1f3a1449077d5560f45c465a78d08633740ba0
Summary:
In the current code base, we use Status to get and store the returned status from the call. Specifically, for IO related functions, the current Status cannot reflect the IO Error details such as error scope, error retryable attribute, and others. With the implementation of https://github.com/facebook/rocksdb/issues/5761, we have the new Wrapper for IO, which returns IOStatus instead of Status. However, the IOStatus is purged at the lower level of write path and transferred to Status.
The first job of this PR is to pass the IOStatus to the write path (flush, WAL write, and compaction). The second job is to identify a retryable IO error as a HardError and set bg_error_ accordingly. In this case, the DB instance becomes read-only. The user is informed of the Status and needs to take action to deal with it (e.g., call db->Resume()).
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6487
Test Plan: Added the testing case to error_handler_fs_test. Pass make asan_check
Reviewed By: anand1976
Differential Revision: D20685017
Pulled By: zhichao-cao
fbshipit-source-id: ff85f042896243abcd6ef37877834e26f36b6eb0
Summary:
The current Env/FileSystem API separation has a couple of issues -
1. It requires the user to specify 2 options - ```Options::env``` and ```Options::file_system``` - which means they have to make code changes to benefit from the new APIs. Furthermore, there is a risk of accessing the same APIs in two different ways, through Env in the old way and through FileSystem in the new way. The two may not always match, for example, if env is ```PosixEnv``` and FileSystem is a custom implementation. Any stray RocksDB calls to env will use the ```PosixEnv``` implementation rather than the file_system implementation.
2. There needs to be a simple way for the FileSystem developer to instantiate an Env for backward compatibility purposes.
This PR solves the above issues and simplifies the migration in the following ways -
1. Embed a shared_ptr to the ```FileSystem``` in the ```Env```, and remove ```Options::file_system``` as a configurable option. This way, no code changes will be required in application code to benefit from the new API. The default Env constructor uses a ```LegacyFileSystemWrapper``` as the embedded ```FileSystem```.
1a. - This also makes it more robust by ensuring that even if RocksDB
has some stray calls to Env APIs rather than FileSystem, they will go
through the same object and thus there is no risk of getting out of
sync.
2. Provide a ```NewCompositeEnv()``` API that can be used to construct a
PosixEnv with a custom FileSystem implementation. This eliminates an
indirection to call Env APIs, and relieves the FileSystem developer of
the burden of having to implement wrappers for the Env APIs.
3. Add a couple of missing FileSystem APIs - ```SanitizeEnvOptions()``` and
```NewLogger()```
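A minimal sketch of point 2 above, assuming `NewCompositeEnv()` takes a `std::shared_ptr<FileSystem>` and returns an owning `Env` pointer that can be plugged into `Options::env` (the exact header and return type may differ):
```
#include <memory>
#include "rocksdb/env.h"
#include "rocksdb/file_system.h"
#include "rocksdb/options.h"

int main() {
  // Any custom FileSystem implementation can be used here; the default
  // posix FileSystem keeps the sketch self-contained.
  std::shared_ptr<rocksdb::FileSystem> fs = rocksdb::FileSystem::Default();
  std::unique_ptr<rocksdb::Env> env = rocksdb::NewCompositeEnv(fs);
  rocksdb::Options options;
  options.env = env.get();  // no separate Options::file_system needed
  return 0;
}
```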
Tests:
1. New unit tests
2. make check and make asan_check
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6552
Reviewed By: riversand963
Differential Revision: D20592038
Pulled By: anand1976
fbshipit-source-id: c3801ad4153f96d21d5a3ae26c92ba454d1bf1f7
Summary:
There are situations when RocksDB tries to recover, but the db is in an inconsistent state due to SST files referenced in the MANIFEST being missing. In this case, previous RocksDB will just fail the recovery and return a non-ok status.
This PR enables another possibility. During recovery, RocksDB checks possible MANIFEST files, and try to recover to the most recent state without missing table file. `VersionSet::Recover()` applies version edits incrementally and "materializes" a version only when this version does not reference any missing table file. After processing the entire MANIFEST, the version created last will be the latest version.
`DBImpl::Recover()` calls `VersionSet::Recover()`. Afterwards, WAL replay will *not* be performed.
To use this capability, set `options.best_efforts_recovery = true` when opening the db. Best-efforts recovery is currently incompatible with atomic flush.
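A minimal sketch of opening a db with this capability (the only new piece is the option):
```
#include "rocksdb/db.h"
#include "rocksdb/options.h"

int main() {
  rocksdb::Options options;
  // Recover to the most recent state that references no missing table file,
  // instead of failing the recovery outright.
  options.best_efforts_recovery = true;
  rocksdb::DB* db = nullptr;
  rocksdb::Status s = rocksdb::DB::Open(options, "/path/to/db", &db);
  delete db;
  return s.ok() ? 0 : 1;
}
```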
Test plan (on devserver):
```
$make check
$COMPILE_WITH_ASAN=1 make all && make check
```
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6334
Reviewed By: anand1976
Differential Revision: D19778960
Pulled By: riversand963
fbshipit-source-id: c27ea80f29bc952e7d3311ecf5ee9c54393b40a8
Summary:
When `use_direct_reads` and `use_direct_writes` are `false`, `logical_sector_size_` inside various `*File` implementations are not actually used, so `GetLogicalBlockSize` does not necessarily need to be called for `logical_sector_size_`, just set a default page size.
This is a follow up PR for https://github.com/facebook/rocksdb/pull/6457.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6522
Test Plan: make check
Reviewed By: siying
Differential Revision: D20408885
Pulled By: cheng-chang
fbshipit-source-id: f2d3808f41265237e7fa2c0be9f084f8fa97fe3d
Summary:
In Linux, when reopening a DB with many SST files, profiling shows that 100% of system CPU time is spent for a couple of seconds in `GetLogicalBufferSize`. This slows down MyRocks' recovery time when a site is down.
This PR introduces two new APIs:
1. `Env::RegisterDbPaths` and `Env::UnregisterDbPaths` let `DB` tell the env when it starts or stops using its database directories. The `PosixFileSystem` takes this opportunity to set up a cache from database directories to the corresponding logical block sizes.
2. `LogicalBlockSizeCache` is defined only for OS_LINUX to cache the logical block sizes.
Other modifications:
1. rename `logical buffer size` to `logical block size` to be consistent with Linux terms.
2. declare `GetLogicalBlockSize` in `PosixHelper` to expose it to `PosixFileSystem`.
3. change the functions `IOError` and `IOStatus` in `env/io_posix.h` to have external linkage since they are used in other translation units too.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6457
Test Plan:
1. A new unit test is added for `LogicalBlockSizeCache` in `env/io_posix_test.cc`.
2. A new integration test is added for `DB` operations related to the cache in `db/db_logical_block_size_cache_test.cc`.
`make check`
Differential Revision: D20131243
Pulled By: cheng-chang
fbshipit-source-id: 3077c50f8065c0bffb544d8f49fb10bba9408d04
Summary:
When users fail to open a DB with file lock failure, it is sometimes hard for users to debug. We now include the time the lock is acquired and the thread ID that acquired the lock, to help users debug problems like this. Default Env's thread ID is used.
Since the type of lockedFiles is changed, rename it to follow the naming convention too.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6507
Test Plan: Add a unit test and improve an existing test to validate the case.
Differential Revision: D20378333
fbshipit-source-id: 312fe0e9733fd1d1e9969c321b90ce523cf4708a
Summary:
MultiRead tests in env_test cannot simulate the io_uring case where queries need to be submitted in multiple rounds. Add a new unit test to cover more requests per MultiRead.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6452
Test Plan: Run it and see it pass when liburing is enabled or not enabled.
Differential Revision: D20078924
fbshipit-source-id: 6cff7fe345a4c5aa47135186e6181bf00df02b68
Summary:
Make kPageSize extern const size_t (used in draft https://github.com/facebook/rocksdb/issues/6427)
Make kLittleEndian constexpr bool
Clarify a couple of comments
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6443
Test Plan: make check, CI
Differential Revision: D20044558
Pulled By: pdillinger
fbshipit-source-id: e0c5cc13229c82726280dc0ddcba4078346b8418
Summary:
The logic that handles io_uring partial results was wrong. Fix the logic by putting it into a queue and continue reading.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6441
Test Plan: Make sure this patch fixes the application test case where the bug was discovered; in env_test, add a unit test that simulates partial results and make sure the results are still correct.
Differential Revision: D20018616
fbshipit-source-id: 5398a7e34d74c26d52aa69dfd604e93e95d99c62
Summary:
When dynamically linking two binaries together, different builds of RocksDB from two sources might cause errors. To provide a tool for users to solve the problem, the RocksDB namespace is changed to a flag which can be overridden at build time.
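A sketch of how the flag is typically consumed, assuming the headers default the macro when the build does not override it (e.g. with `-DROCKSDB_NAMESPACE=myrocks`):
```
// In the headers (sketch): default the namespace when not overridden.
#ifndef ROCKSDB_NAMESPACE
#define ROCKSDB_NAMESPACE rocksdb
#endif

namespace ROCKSDB_NAMESPACE {
class DB;  // all RocksDB symbols live under the configurable namespace
}  // namespace ROCKSDB_NAMESPACE

// User code can stay namespace-agnostic:
using RocksDb = ROCKSDB_NAMESPACE::DB;
```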
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6433
Test Plan: Build release, all and jtest. Try to build with ROCKSDB_NAMESPACE with another flag.
Differential Revision: D19977691
fbshipit-source-id: aa7f2d0972e1c31d75339ac48478f34f6cfcfb3e
Summary:
The DecreaseNumBgThreads test keeps failing on Windows in AppVeyor.
It fails because it depends on a timed wait for the tasks to be dequeued from the threadpool's internal queue, but within the specified time, the task might not have been scheduled onto the newly created threads.
https://github.com/facebook/rocksdb/pull/6232 tried to fix this by waiting for a longer time to let the threads get scheduled.
This PR tries to fix this by replacing the timed wait with synchronization on the task's internal condition variable.
When the number of threads increases, instead of guessing the time needed for the task to be scheduled, it directly blocks on the condition variable until the task starts running.
But when the number of threads is reduced, it still does a timed wait; this no longer leads to flakiness, and we will try to remove these timed waits in a future PR.
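A generic sketch of the synchronization pattern (blocking on the task's own condition variable instead of a timed wait; names are hypothetical):
```
#include <condition_variable>
#include <mutex>

struct TaskState {
  std::mutex mu;
  std::condition_variable cv;
  bool started = false;
};

// Called from inside the scheduled task.
void OnTaskStart(TaskState* state) {
  std::lock_guard<std::mutex> lock(state->mu);
  state->started = true;
  state->cv.notify_all();
}

// Called from the test: blocks until the task actually runs instead of
// guessing a sleep duration, which was the source of the flakiness.
void WaitForTaskStart(TaskState* state) {
  std::unique_lock<std::mutex> lock(state->mu);
  state->cv.wait(lock, [state] { return state->started; });
}
```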
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6393
Test Plan: Wait to see whether AppVeyor tests pass.
Differential Revision: D19890928
Pulled By: cheng-chang
fbshipit-source-id: 4e56e4addf625c98c0876e62d9d57a6f0a156f76
Summary:
It's a minor refactoring. We have two ReadFileToString() functions and they are very similar. Make the one with the Env argument call the one with the FS argument instead.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6366
Test Plan: Run all existing tests
Differential Revision: D19712332
fbshipit-source-id: 5ae6fabf6355938690d95cda52afd1f39e0a7823
Summary:
Some shadow warnings show up when using gcc 4.8. An example:
./utilities/blob_db/blob_compaction_filter.h: In constructor ‘rocksdb::blob_db::BlobIndexCompactionFilterFactoryBase::BlobIndexCompactionFilterFactoryBase(rocksdb::blob_db::BlobDBImpl*, rocksdb::Env*, rocksdb::Statistics*)’:
./utilities/blob_db/blob_compaction_filter.h:121:7: error: declaration of ‘blob_db_impl’ shadows a member of 'this' [-Werror=shadow]
: blob_db_impl_(blob_db_impl), env_(_env), statistics_(_statistics) {}
^
Fix them.
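A minimal illustration of the fix pattern: rename the constructor parameters so they no longer shadow the members (the class is simplified; only the naming convention matters here):
```
class Holder {
 public:
  // Before: a parameter named `blob_db_impl` shadowed the member
  // `blob_db_impl_` under gcc 4.8's -Wshadow. Prefixing the parameters with
  // an underscore removes the shadow.
  Holder(void* _blob_db_impl, void* _env, void* _statistics)
      : blob_db_impl_(_blob_db_impl), env_(_env), statistics_(_statistics) {}

 private:
  void* blob_db_impl_;
  void* env_;
  void* statistics_;
};
```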
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6242
Test Plan: Build and see the warnings go away.
Differential Revision: D19217789
fbshipit-source-id: 8ef631941f23dab47a388e060adec24b72efd65e
Summary:
The current Env API encompasses both storage/file operations, as well as OS related operations. Most of the APIs return a Status, which does not have enough metadata about an error, such as whether its retry-able or not, scope (i.e fault domain) of the error etc., that may be required in order to properly handle a storage error. The file APIs also do not provide enough control over the IO SLA, such as timeout, prioritization, hinting about placement and redundancy etc.
This PR separates out the file/storage APIs from Env into a new FileSystem class. The APIs are updated to return an IOStatus with metadata about the error, as well as to take an IOOptions structure as input in order to allow more control over the IO.
The user can set both ```options.env``` and ```options.file_system``` to specify that RocksDB should use the former for OS related operations and the latter for storage operations. Internally, a ```CompositeEnvWrapper``` has been introduced that inherits from ```Env``` and redirects individual methods to either an ```Env``` implementation or the ```FileSystem``` as appropriate. When options are sanitized during ```DB::Open```, ```options.env``` is replaced with a newly allocated ```CompositeEnvWrapper``` instance if both env and file_system have been specified. This way, the rest of the RocksDB code can continue to function as before.
This PR also ports PosixEnv to the new API by splitting it into two - PosixEnv and PosixFileSystem. PosixEnv is defined as a sub-class of CompositeEnvWrapper, and threading/time functions are overridden with Posix specific implementations in order to avoid an extra level of indirection.
The ```CompositeEnvWrapper``` translates ```IOStatus``` return code to ```Status```, and sets the severity to ```kSoftError``` if the io_status is retryable. The error handling code in RocksDB can then recover the DB automatically.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5761
Differential Revision: D18868376
Pulled By: anand1976
fbshipit-source-id: 39efe18a162ea746fabac6360ff529baba48486f
Summary:
ASAN reports:
internal_repo_rocksdb/repo:db_test - MultiThreaded/MultiThreadedDBTest.MultiThreaded/43: fatal
==2692739==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x6130000500ca at pc 0x0000006be780 bp 0x7efef85ccd20 sp 0x7efef85cc4d0
[CONTEXT] === How to use this, how to get the raw stack trace, and more: fburl.com/ASAN ===
[CONTEXT] READ of size 331 at 0x6130000500ca thread T195
[CONTEXT] #0 db_test_bin+0x6be77f __interceptor_strlen.part.35
[CONTEXT] #1 internal_repo_rocksdb/repo/include/rocksdb/slice.h:55 rocksdb::Slice::Slice(char const*)
[CONTEXT] #2 internal_repo_rocksdb/repo/env/io_posix.cc:522 rocksdb::PosixRandomAccessFile::MultiRead(rocksdb::ReadRequest*, unsigned long)
I looked at env/io_posix.cc:522 but don't see a reason why the line needs to be there at all, because its value is overwritten before it is ever used. So it must be a line that was put there by mistake. Remove it.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6135
Test Plan: Rerun the same test which passes after the fix. Run all the tests and make sure they all pass.
Differential Revision: D18880251
fbshipit-source-id: 3b84ac6a05b67b529c4202e0ceb4c047460f44f2
Summary:
db_stress_tool.cc is now a giant file. In order to make it easier to improve and maintain, break it down into multiple source files.
Most classes are turned into their own files. Separate .h and .cc files are created for gflag definitions. Another pair of .h and .cc files is created for some common functions. Some test execution logic that is only loosely related to class StressTest is moved to db_stress_driver.h and db_stress_driver.cc. All the files are located under db_stress_tool/. The directory name is chosen because if we end it with either stress or test, .gitignore will ignore any file under it and make it prone to issues in development.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6134
Test Plan: Build under GCC7 with and without LITE on using GNU Make. Build with GCC 4.8. Build with cmake with -DWITH_TOOL=1
Differential Revision: D18876064
fbshipit-source-id: b25d0a7451840f31ac0f5ebb0068785f783fdf7d
Summary:
Right now, PosixRandomAccessFile::MultiRead() executes read requests one at a time. In this PR, it leverages the io_uring library to run them in parallel, even when page cache is enabled. This function will fall back if the kernel version doesn't support io_uring.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5881
Test Plan: Run the unit test on a kernel version supporting io_uring and make sure all tests pass, and run the same test on a kernel version not supporting it and see it pass. Before merging, will also run the stress test and see it passes.
Differential Revision: D17742266
fbshipit-source-id: e05699c925ac04fdb42379456a4e23e4ebcb803a
Summary:
Since we do not evict a file's blocks from block cache before that file
is deleted, we require a file's cache ID prefix is both unique and
non-reusable. However, the Windows functionality we were relying on only
guaranteed uniqueness. That meant a newly created file could be assigned
the same cache ID prefix as a deleted file. If the newly created file
had block offsets matching the deleted file, full cache keys could be
exactly the same, resulting in obsolete data blocks returned from cache
when trying to read from the new file.
We noticed this when running on FAT32 where compaction was writing out
of order keys due to reading obsolete blocks from its input files. The
functionality is documented as behaving the same on NTFS, although I
wasn't able to repro it there.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5844
Test Plan:
we had a reliable repro of out-of-order keys on FAT32 that
was fixed by this change
Differential Revision: D17752442
fbshipit-source-id: 95d983f9196cf415f269e19293b97341edbf7e00
Summary:
This PR allows for the creation of custom env when using sst_dump. If
the user does not set options.env or set options.env to nullptr, then sst_dump
will automatically try to create a custom env depending on the path to the sst
file or db directory. In order to use this feature, the user must call
ObjectRegistry::Register() beforehand.
Test Plan (on devserver):
```
$make all && make check
```
All tests must pass to ensure this change does not break anything.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5845
Differential Revision: D17678038
Pulled By: riversand963
fbshipit-source-id: 58ecb4b3f75246d52b07c4c924a63ee61c1ee626
Summary:
Further apply formatter to more recent commits.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5830
Test Plan: Run all existing tests.
Differential Revision: D17488031
fbshipit-source-id: 137458fd94d56dd271b8b40c522b03036943a2ab
Summary:
For our default block cache, each additional entry has extra memory overhead. It includes an LRUHandle (72 bytes currently) and the cache key (two varint64s: file id and offset). The usage is not negligible. For example, for block_size=4k, the overhead accounts for an extra 2% memory usage for the cache. The patch charges the cache for this extra usage, reducing untracked memory usage outside the block cache. The feature is enabled by default and can be disabled by passing kDontChargeCacheMetadata to the cache constructor.
This PR builds up on https://github.com/facebook/rocksdb/issues/4258
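A usage sketch, assuming the policy is selected through `LRUCacheOptions::metadata_charge_policy` (hedged; the exact plumbing may differ):
```
#include <memory>
#include "rocksdb/cache.h"

int main() {
  rocksdb::LRUCacheOptions cache_opts;
  cache_opts.capacity = static_cast<size_t>(1) << 30;  // 1 GB
  // Metadata charging is on by default; opt out to restore the old accounting.
  cache_opts.metadata_charge_policy = rocksdb::kDontChargeCacheMetadata;
  std::shared_ptr<rocksdb::Cache> cache = rocksdb::NewLRUCache(cache_opts);
  return cache != nullptr ? 0 : 1;
}
```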
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5797
Test Plan:
- Existing tests are updated to either disable the feature when the test has too much dependency on the old way of accounting the usage or increasing the cache capacity to account for the additional charge of metadata.
- The Usage tests in cache_test.cc are augmented to test the cache usage under kFullChargeCacheMetadata.
Differential Revision: D17396833
Pulled By: maysamyabandeh
fbshipit-source-id: 7684ccb9f8a40ca595e4f5efcdb03623afea0c6f
Summary:
Use `delete` to disable automatically generated methods instead of making them private, and put the constructors together for clarity. This modification causes an unused-field warning, so add the unused attribute to suppress it.
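A small generic illustration of the two patterns mentioned (deleted special members instead of private ones, and an explicitly marked unused field):
```
class NonCopyable {
 public:
  NonCopyable() = default;
  // Disable copying via `= delete` rather than a private declaration;
  // misuse now fails at the call site with a clear diagnostic.
  NonCopyable(const NonCopyable&) = delete;
  NonCopyable& operator=(const NonCopyable&) = delete;

 private:
  // Field kept for layout reasons but not read on all platforms.
  int reserved_ __attribute__((__unused__)) = 0;
};
```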
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5009
Differential Revision: D17288733
fbshipit-source-id: 8a767ce096f185f1db01bd28fc88fef1cdd921f3
Summary:
On older macOS like 10.10 we saw the following compiler error:
```
/go/src/github.com/cockroachdb/cockroach/c-deps/rocksdb/env/env_posix.cc:845:19:
error: use of undeclared identifier 'CLOCK_THREAD_CPUTIME_ID'
clock_gettime(CLOCK_THREAD_CPUTIME_ID, &ts);
^
```
According to mac's `man clock_gettime`: "These functions first appeared in Mac
OSX 10.12". So we should not try to compile it on earlier versions.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5570
Test Plan:
verified it compiles now on 10.10. Also did some investigation to
ensure it does not cause regression on macOS 10.12+, although I do not
have access to such an environment to really test.
Differential Revision: D17322629
Pulled By: riversand963
fbshipit-source-id: e0a412223854f826b4d83e6d15c3739ff4620d7d
Summary:
Fixes https://github.com/facebook/rocksdb/issues/5734. Reading the code, the assert doesn't quite make sense to me, since `dataSize` and `fileOffset` have no correlation. But my knowledge about `EncryptedEnv` is very limited.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5735
Test Plan:
run `ENCRYPTED_ENV=1 ./db_encryption_test`
Signed-off-by: Yi Wu <yiwu@pingcap.com>
Differential Revision: D17133849
fbshipit-source-id: bb7262d308e5b2503c400b180edc252668df0ef0
Summary:
The ObjectRegistry class replaces the Registrar and NewCustomObjects. Objects are registered with the registry by Type (the class must implement the static const char *Type() method).
This change is necessary for a few reasons:
- By having a class (rather than static template instances), the class can be passed between compilation units, meaning that objects could be registered and shared from a dynamic library with an executable.
- By having a class with instances, different units could have different objects registered. This could be useful if, for example, one Option allowed for a dynamic library and one did not.
When combined with some other PRs (being able to load shared libraries, a Configurable interface to configure objects to/from string), this code will allow objects in external shared libraries to be added to a RocksDB image at run-time, rather than requiring every new extension to be built into the main library and called explicitly by every program.
Test plan (on riversand963's devserver)
```
$COMPILE_WITH_ASAN=1 make -j32 all && sleep 1 && make check
```
All tests pass.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5293
Differential Revision: D16363396
Pulled By: riversand963
fbshipit-source-id: fbe4acb615bfc11103eef40a0b288845791c0180
Summary:
Current PosixLogger performs IO operations using posix calls. Thus the
current implementation will not work for non-posix env. Created a new
logger class EnvLogger that uses env specific WritableFileWriter for IO operations.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5491
Test Plan: make check
Differential Revision: D15909002
Pulled By: ggaurav28
fbshipit-source-id: 13a8105176e8e42db0c59798d48cb6a0dbccc965
Summary:
The usage of `AlignedBuffer` in env_encryption.cc writes and reads to/from the AlignedBuffer's internal buffer directly without going through AlignedBuffer's APIs (like `Append` and `Read`), causing encapsulation to break in some cases. The writes are especially problematic as after the data is written to the buffer (directly using either memmove or memcpy), the size of the buffer is not updated ... causing the AlignedBuffer to lose track of the encapsulated buffer's current size.
Fixed this by updating the buffer size after every write.
Todo for later:
Add an overloaded method to AlignedBuffer to support a memmove in addition to a memcpy. The encryption env does a memmove, and hence I couldn't switch to using `AlignedBuffer.Append()`.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5396
Test Plan: `make check`
Differential Revision: D15764756
Pulled By: sagar0
fbshipit-source-id: 2e24b52bd3b4b5056c5c1da157f91ddf89370183
Summary:
The changes in 8272a6de57 were untested with `USE_HDFS=1`. There were a couple of compiler errors. This PR fixes them.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5444
Test Plan:
```
$ EXTRA_LDFLAGS="-L/tmp/hadoop-3.1.2/lib/native/" EXTRA_CXXFLAGS="-I/tmp/hadoop-3.1.2/include" USE_HDFS=1 make -j12 check
```
Differential Revision: D15885009
fbshipit-source-id: 2a0a63739e0b9a2819b461ad63ce1292c4833fe2
Summary:
`sync_file_range` returns `ENOSYS` on Windows Subsystem for Linux even
when using a supposedly supported filesystem like ext4. To handle this
case we can do a dynamic check that a no-op `sync_file_range`
invocation, which is accomplished by passing zero for the `flags`
argument, succeeds.
Also I rearranged the function and comments to hopefully make it more
easily understandable.
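For illustration, a sketch of the dynamic check described above (Linux-only): a no-op call (flags == 0) that only fails when the kernel or filesystem genuinely lacks support (not the exact helper added by the PR):
```
#include <fcntl.h>

// Returns true if sync_file_range() appears usable on this fd. Passing zero
// for flags makes the call a no-op, so a failure here indicates ENOSYS
// (e.g. on Windows Subsystem for Linux) rather than a real write error.
bool IsSyncFileRangeSupported(int fd) {
  return sync_file_range(fd, 0 /* offset */, 0 /* nbytes */, 0 /* flags */) == 0;
}
```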
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5416
Differential Revision: D15807061
fbshipit-source-id: d31d94e1f228b7850ea500e6199f8b5daf8cfbd3
Summary:
This affects some TSAN builds:
env/env_test.cc: In member function ‘virtual void rocksdb::EnvPosixTestWithParam_MultiRead_Test::TestBody()’:
env/env_test.cc:1126:76: error: type qualifiers ignored on cast result type [-Werror=ignored-qualifiers]
auto data = NewAligned(kSectorSize * 8, static_cast<const char>(i + 1));
^
env/env_test.cc:1154:77: error: type qualifiers ignored on cast result type [-Werror=ignored-qualifiers]
auto buf = NewAligned(kSectorSize * 8, static_cast<const char>(i*2 + 1));
^
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5432
Differential Revision: D15727277
Pulled By: ltamasi
fbshipit-source-id: dc0e687b123e7c4d703ccc0c16b7167e07d1c9b0
Summary:
The previous code has a warning when compiled with tsan, leading to an error since we have -Werror.
Compilation result
```
In file included from ./env/env_chroot.h:12,
from env/env_test.cc:40:
./include/rocksdb/env.h: In instantiation of ‘rocksdb::Status rocksdb::DynamicLibrary::LoadFunction(const string&, std::function<T>*) [with T = void*(void*, const char*); std::__cxx11::string = std::__cxx11::basic_string<char>]’:
env/env_test.cc:260:5: required from here
./include/rocksdb/env.h:1010:17: error: cast between incompatible function types from ‘rocksdb::DynamicLibrary::FunctionPtr’ {aka ‘void* (*)()’} to ‘void* (*)(void*, const char*)’ [-Werror=cast-function-type]
*function = reinterpret_cast<T*>(ptr);
^~~~~~~~~~~~~~~~~~~~~~~~~
cc1plus: all warnings being treated as errors
make: *** [env/env_test.o] Error 1
```
It also has another warning reported by the clang analyzer:
```
env/env_posix.cc:141:11: warning: Value stored to 'err' during its initialization is never read
char* err = dlerror(); // Clear any old error
^~~ ~~~~~~~~~
1 warning generated.
```
Test plan (on my devserver).
```
$make clean
$OPT=-g ROCKSDB_FBCODE_BUILD_WITH_PLATFORM007=1 COMPILE_WITH_TSAN=1 make -j32
$
$make clean
$USE_CLANG=1 TEST_TMPDIR=/dev/shm/rocksdb OPT=-g make -j1 analyze
```
Both should pass.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5414
Differential Revision: D15637315
Pulled By: riversand963
fbshipit-source-id: 8e307483761019a4d5998cab92d49516d7edffbf
Summary:
Define the Env::MultiRead() method to allow callers to request multiple block reads in one shot. The underlying Env implementation can parallelize it if it chooses to in order to reduce the overall IO latency.
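A hedged caller-side sketch, assuming each request carries an offset, length, scratch buffer, and its own result/status (field and method names follow the public header but may differ slightly):
```
#include <vector>
#include "rocksdb/env.h"

// `file` is an already-opened random-access file handle.
rocksdb::Status ReadTwoBlocks(rocksdb::RandomAccessFile* file, char* buf) {
  std::vector<rocksdb::ReadRequest> reqs(2);
  reqs[0].offset = 0;
  reqs[0].len = 4096;
  reqs[0].scratch = buf;
  reqs[1].offset = 1 << 20;
  reqs[1].len = 4096;
  reqs[1].scratch = buf + 4096;
  // The underlying implementation may issue both reads in parallel; each
  // request also carries its own result slice and status.
  return file->MultiRead(reqs.data(), reqs.size());
}
```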
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5311
Differential Revision: D15502172
Pulled By: anand1976
fbshipit-source-id: 2b228269c2e11b5f54694d6b2bb3119c8a8ce2b9
Summary:
This change adds a Dynamic Library class to the RocksDB Env. Dynamic libraries are populated via the Env::LoadLibrary method.
The addition of dynamic library support allows for a few different features to be developed:
1. The compression code can be changed to use dynamic library support. This would allow RocksDB to determine at run-time what compression packages were installed. This change would eliminate the need to make sure the build-time and run-time environment had the same library set. It would also simplify some of the Java build issues (where it attempts to build and include various packages inside the RocksDB jars).
2. Along with other features (to be provided in a subsequent PR), this change would allow code/configurations to be added to RocksDB at run-time. For example, the build system includes code for building an "rados" environment and adding "Cassandra" features. Instead of these extensions being built into the base RocksDB code, these extensions could be loaded at run-time as required/appropriate, either by configuration or explicitly.
We intend to push out other changes in support of the extending RocksDB at run-time via configurations.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5281
Differential Revision: D15447613
Pulled By: riversand963
fbshipit-source-id: 452cd4f54511c0bceee18f6d9d919aae9fd25fef
Summary:
Many logging related source files are under util/. It will be more structured if they are together.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5387
Differential Revision: D15579036
Pulled By: siying
fbshipit-source-id: 3850134ed50b8c0bb40a0c8ae1f184fa4081303f
Summary:
There are too many types of files under util/. Some test related files don't belong there or are only loosely related. Move them to a new directory test_util/, so that util/ is cleaner.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5377
Differential Revision: D15551366
Pulled By: siying
fbshipit-source-id: 0f5c8653832354ef8caa31749c0143815d719e2c
Summary:
This is a workaround for the issue described in #5169.
It has been tested on a database with very large values, but no dedicated test has been added to the code base.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5213
Differential Revision: D15243116
Pulled By: siying
fbshipit-source-id: e0c226a6cd71a60924dcd7ce7af74abcb4054484
Summary:
The existing implementation does not guarantee bytes reach disk every `bytes_per_sync` when writing SST files, or every `wal_bytes_per_sync` when writing WALs. This can cause confusing behavior for users who enable this feature to avoid large syncs during flush and compaction, but then end up hitting them anyways.
My understanding of the existing behavior is we used `sync_file_range` with `SYNC_FILE_RANGE_WRITE` to submit ranges for async writeback, such that we could continue processing the next range of bytes while that I/O is happening. I believe we can preserve that benefit while also limiting how far the processing can get ahead of the I/O, which prevents huge syncs from happening when the file finishes.
Consider this `sync_file_range` usage: `sync_file_range(fd_, 0, static_cast<off_t>(offset + nbytes), SYNC_FILE_RANGE_WAIT_BEFORE | SYNC_FILE_RANGE_WRITE)`. Expanding the range to start at 0 and adding the `SYNC_FILE_RANGE_WAIT_BEFORE` flag causes any pending writeback (like from a previous call to `sync_file_range`) to finish before it proceeds to submit the latest `nbytes` for writeback. The latest `nbytes` are still written back asynchronously, unless processing exceeds I/O speed, in which case the following `sync_file_range` will need to wait on it.
There is a second change in this PR to use `fdatasync` when `sync_file_range` is unavailable (determined statically) or has some known problem with the underlying filesystem (determined dynamically).
The above two changes only apply when the user enables a new option, `strict_bytes_per_sync`.
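A usage sketch of the new option together with the existing sync knobs whose behavior it tightens:
```
#include "rocksdb/options.h"

rocksdb::Options MakeOptions() {
  rocksdb::Options options;
  options.bytes_per_sync = 1 << 20;      // SST writes: submit writeback every 1MB
  options.wal_bytes_per_sync = 1 << 20;  // WAL writes
  // New: also wait for previously submitted ranges, so dirty data cannot run
  // far ahead of the I/O and trigger a huge sync when the file is finished.
  options.strict_bytes_per_sync = true;
  return options;
}
```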
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5183
Differential Revision: D14953553
Pulled By: siying
fbshipit-source-id: 445c3862e019fb7b470f9c7f314fc231b62706e9
Summary:
Change the behavior of OptimizeForSmallDb() so that it is less likely to go out of memory.
Change the behavior of OptimizeForPointLookup() to take advantage of the new memtable whole key filter, and move away from prefix extractor as well as hash-based indexing, as they are prone to misuse.
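A usage sketch (assuming the helper takes a block cache size in MB, as in the public header around this time):
```
#include "rocksdb/options.h"

rocksdb::Options MakePointLookupOptions() {
  rocksdb::Options options;
  // Tunes the options for point lookups; after this change it relies on the
  // memtable whole-key filter rather than a prefix extractor or hash index.
  options.OptimizeForPointLookup(/*block_cache_size_mb=*/64);
  return options;
}
```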
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5165
Differential Revision: D14880709
Pulled By: siying
fbshipit-source-id: 9af30e3c9e151eceea6d6b38701a58f1f9fb692d
Summary:
This fix should help reading from encrypted files if the file-to-be-read
is smaller than expected. For example, when using the encrypted env and
making it read a journal file of exactly 0 bytes size, the encrypted env
code crashes with SIGSEGV in its Decrypt function, as there is no check
if the read attempts to read over the file's boundaries (as specified
originally by the `dataSize` parameter).
The most important problem this patch addresses is however that there is
no size underflow check in `CTREncryptionProvider::CreateCipherStream`:
The stream to be read will be initialized to a size of always
`prefix.size() - (2 * blockSize)`. If the prefix however is smaller than
twice the block size, this will obviously assume a _very_ large stream
and read over the bounds. The patch adds a check here as follows:
  // If the prefix is smaller than twice the block size, we would below read a
  // very large chunk of the file (and very likely read over the bounds)
  assert(prefix.size() >= 2 * blockSize);
  if (prefix.size() < 2 * blockSize) {
    return Status::Corruption("Unable to read from file " + fname + ": read attempt would read beyond file bounds");
  }
so embedders can catch the error in their release builds.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5160
Differential Revision: D14834633
Pulled By: sagar0
fbshipit-source-id: 47aa39a6db8977252cede054c7eb9a663b9a3484
Summary:
Fix some hdfs-related code so that it can compile and run 'db_stress'
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5122
Differential Revision: D14675495
Pulled By: riversand963
fbshipit-source-id: cac280479efcf5451982558947eac1732e8bc45a
Summary:
This PR allows RocksDB to run in single-primary, multi-secondary process mode.
The writer is a regular RocksDB (e.g. an `DBImpl`) instance playing the role of a primary.
Multiple `DBImplSecondary` processes (secondaries) share the same set of SST files, MANIFEST, WAL files with the primary. Secondaries tail the MANIFEST of the primary and apply updates to their own in-memory state of the file system, e.g. `VersionStorageInfo`.
This PR has several components:
1. (Originally in #4745). Add a `PathNotFound` subcode to `IOError` to denote the failure when a secondary tries to open a file which has been deleted by the primary.
2. (Similar to #4602). Add `FragmentBufferedReader` to handle partially-read, trailing record at the end of a log from where future read can continue.
3. (Originally in #4710 and #4820). Add implementation of the secondary, i.e. `DBImplSecondary`.
3.1 Tail the primary's MANIFEST during recovery.
3.2 Tail the primary's MANIFEST during normal processing by calling `ReadAndApply`.
3.3 Tailing WAL will be in a future PR.
4. Add an example in 'examples/multi_processes_example.cc' to demonstrate the usage of secondary RocksDB instance in a multi-process setting. Instructions to run the example can be found at the beginning of the source code.
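A minimal sketch of opening a secondary instance, assuming the `DB::OpenAsSecondary` entry point introduced for this mode (paths are placeholders):
```
#include "rocksdb/db.h"
#include "rocksdb/options.h"

int main() {
  rocksdb::Options options;
  options.max_open_files = -1;  // secondaries typically keep all files open
  rocksdb::DB* secondary = nullptr;
  // "/path/to/primary" is the primary's db path; "/path/to/secondary" holds
  // the secondary's own private files such as its info log.
  rocksdb::Status s = rocksdb::DB::OpenAsSecondary(
      options, "/path/to/primary", "/path/to/secondary", &secondary);
  if (s.ok()) {
    // Catch up with the primary's MANIFEST on demand.
    s = secondary->TryCatchUpWithPrimary();
  }
  delete secondary;
  return s.ok() ? 0 : 1;
}
```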
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4899
Differential Revision: D14510945
Pulled By: riversand963
fbshipit-source-id: 4ac1c5693e6012ad23f7b4b42d3c374fecbe8886
Summary:
User report has shown that sometimes `BlockBasedTable::SetupCacheKeyPrefix` would assert when trying to generate an id from the file. The actual cause seems to be hardware related but we might be better off without the incorrect assertion
See T42178927 for more information
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5102
Differential Revision: D14604677
Pulled By: miasantreble
fbshipit-source-id: fcb09207ebdc4fa66e941afbc0523d84797e7ad7
Summary:
[RocksDB] Make it easier for users to load options from option file and set shared block cache.
Right now, it requires several dynamic casts for users to set a shared block cache in the option structs loaded from the option file.
If people don't do that, every CF of every DB will generate its own 8MB block cache. It's not a usable setting. So we are dragging every user who loads options from the file into such a mess.
Instead, we should allow them to pass their cache object to LoadLatestOptions() and LoadOptionsFromFile(), so that those loaded option structs will have the shared block cache.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5063
Differential Revision: D14518584
Pulled By: rashmishrm
fbshipit-source-id: c91430ff9425a0e67d76fc67931d755f491ca5aa
Summary:
The compiler flag `-DROCKSDB_FALLOCATE_PRESENT` was only set when
`fallocate`, `FALLOC_FL_KEEP_SIZE`, and `FALLOC_FL_PUNCH_HOLE` were all
present. However, the last of the three is not really necessary for the
primary `fallocate` use case; furthermore, it was introduced only in later
Linux kernel versions (2.6.38+).
This PR changes the flag `-DROCKSDB_FALLOCATE_PRESENT` to only require
`fallocate` and `FALLOC_FL_KEEP_SIZE` to be present. There is a separate
check for `FALLOC_FL_PUNCH_HOLE` only in the place where it is used.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5023
Differential Revision: D14248487
Pulled By: siying
fbshipit-source-id: a10ed0b902fa755988e957bd2dcec9081ec0502e
Summary:
The info log header feature never worked well, because log level Header was not
translated to Logger::LogHeader() call. Fix it.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4980
Differential Revision: D14087283
Pulled By: siying
fbshipit-source-id: 7e7d03ce35fa8d13d4ee549f46f7326f7bc0006d
Summary:
An NVMe device path doesn't contain "block", e.g. "nvme/nvme0/nvme0n1" or "nvme/nvme0/nvme0n1/nvme0n1p1". The last directory, such as "nvme0n1p1", should be removed if the NVMe drive is partitioned.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4866
Differential Revision: D13627824
Pulled By: riversand963
fbshipit-source-id: 09ab968f349f3dbb890beea20193f1359b17d317
Summary:
Measure CPU time consumed for a compaction and report it in the stats report
Enable NowCPUNanos() to work for MacOS
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4889
Differential Revision: D13701276
Pulled By: zinoale
fbshipit-source-id: 5024e5bbccd4dd10fd90d947870237f436445055
Summary:
Introduce the first CPU timing counter, perf_context.get_cpu_nanos. This opens a door to more CPU counters in the future.
Only Posix Env has it implemented using clock_gettime() with CLOCK_THREAD_CPUTIME_ID. How accurate the counter is depends on the platform.
Make PerfStepTimer take an Env as an argument, and sometimes pass it in. The direct reason is to let the unit tests use SpecialEnv, where we can inject logic. But in the long term, this is a good change.
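A usage sketch of the new counter, assuming the perf level that gates CPU timing is `kEnableTimeAndCPUTimeExceptForMutex` (names hedged):
```
#include <cstdint>
#include <string>
#include "rocksdb/db.h"
#include "rocksdb/perf_context.h"
#include "rocksdb/perf_level.h"

void TimedGet(rocksdb::DB* db, const rocksdb::Slice& key) {
  rocksdb::SetPerfLevel(rocksdb::PerfLevel::kEnableTimeAndCPUTimeExceptForMutex);
  rocksdb::get_perf_context()->Reset();
  std::string value;
  rocksdb::Status s = db->Get(rocksdb::ReadOptions(), key, &value);
  // CPU nanoseconds spent inside Get(), measured on Posix via
  // clock_gettime(CLOCK_THREAD_CPUTIME_ID).
  uint64_t cpu_nanos = rocksdb::get_perf_context()->get_cpu_nanos;
  (void)s;
  (void)cpu_nanos;
}
```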
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4741
Differential Revision: D13287798
Pulled By: siying
fbshipit-source-id: 090361049d9d5095d1d1a369fe1338d2e2e1c73f
Summary:
Ran the following commands to recursively change all the files under RocksDB:
```
find . -type f -name "*.cc" -exec sed -i 's/ unique_ptr/ std::unique_ptr/g' {} +
find . -type f -name "*.cc" -exec sed -i 's/<unique_ptr/<std::unique_ptr/g' {} +
find . -type f -name "*.cc" -exec sed -i 's/ shared_ptr/ std::shared_ptr/g' {} +
find . -type f -name "*.cc" -exec sed -i 's/<shared_ptr/<std::shared_ptr/g' {} +
```
Running `make format` updated some formatting on the files touched.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4638
Differential Revision: D12934992
Pulled By: sagar0
fbshipit-source-id: 45a15d23c230cdd64c08f9c0243e5183934338a8
Summary:
`WritableFileWrapper` was missing some newer methods that were added to `WritableFile`. Without these functions, the missing wrapper methods would fallback to using the default implementations in WritableFile instead of using the corresponding implementations in, say, `PosixWritableFile` or `WinWritableFile`.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4584
Differential Revision: D10559199
Pulled By: sagar0
fbshipit-source-id: 0d0f18a486aee727d5b8eebd3110a41988e27391
Summary:
Currently statistics are supposed to be dumped to info log at intervals of `options.stats_dump_period_sec`. However the implementation choice was to bind it with compaction thread, meaning if the database has been serving very light traffic, the stats may not get dumped at all.
We decided to separate stats dumping into a new timed thread using `TimerQueue`, which is already used in blob_db. This will allow us to schedule new timed tasks with more deterministic behavior.
Tested with db_bench using `--stats_dump_period_sec=20` in command line:
> LOG:2018/09/17-14:07:45.575025 7fe99fbfe700 [WARN] [db/db_impl.cc:605] ------- DUMPING STATS -------
LOG:2018/09/17-14:08:05.643286 7fe99fbfe700 [WARN] [db/db_impl.cc:605] ------- DUMPING STATS -------
LOG:2018/09/17-14:08:25.691325 7fe99fbfe700 [WARN] [db/db_impl.cc:605] ------- DUMPING STATS -------
LOG:2018/09/17-14:08:45.740989 7fe99fbfe700 [WARN] [db/db_impl.cc:605] ------- DUMPING STATS -------
LOG content:
> 2018/09/17-14:07:45.575025 7fe99fbfe700 [WARN] [db/db_impl.cc:605] ------- DUMPING STATS -------
2018/09/17-14:07:45.575080 7fe99fbfe700 [WARN] [db/db_impl.cc:606]
** DB Stats **
Uptime(secs): 20.0 total, 20.0 interval
Cumulative writes: 4447K writes, 4447K keys, 4447K commit groups, 1.0 writes per commit group, ingest: 5.57 GB, 285.01 MB/s
Cumulative WAL: 4447K writes, 0 syncs, 4447638.00 writes per sync, written: 5.57 GB, 285.01 MB/s
Cumulative stall: 00:00:0.012 H:M:S, 0.1 percent
Interval writes: 4447K writes, 4447K keys, 4447K commit groups, 1.0 writes per commit group, ingest: 5700.71 MB, 285.01 MB/s
Interval WAL: 4447K writes, 0 syncs, 4447638.00 writes per sync, written: 5.57 MB, 285.01 MB/s
Interval stall: 00:00:0.012 H:M:S, 0.1 percent
** Compaction Stats [default] **
Level Files Size Score Read(GB) Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) Moved(GB) W-Amp Rd(MB/s) Wr(MB/s) Comp(sec) Comp(cnt) Avg(sec) KeyIn KeyDrop
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4382
Differential Revision: D9933051
Pulled By: miasantreble
fbshipit-source-id: 6d12bb1e4977674eea4bf2d2ac6d486b814bb2fa
Summary:
The assert in PosixEnv::FileExists is currently based on the return value of `access` syscall. Instead it should be based on errno.
Initially I wanted to remove this assert as [`access`](https://linux.die.net/man/2/access) can error out in a few other cases (like EROFS). But on thinking more, it feels like the assert is doing the right thing ... it's good to crash on EROFS, EFAULT, EINVAL, and other major filesystem related problems so that the user is immediately aware of the problems while testing.
(I think it might be ok to crash on EIO as well, but there might be a specific reason why it was decided not to crash for EIO, and I don't have that context. So I am letting the assert checks remain as-is for now.)
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4427
Differential Revision: D10037200
Pulled By: sagar0
fbshipit-source-id: 5cc96116a2e53cef701f444a8b5290576f311e51
Summary:
This commit implements automatic recovery from a Status::NoSpace() error
during background operations such as write callback, flush and
compaction. The broad design is as follows -
1. Compaction errors are treated as soft errors and don't put the
database in read-only mode. A compaction is delayed until enough free
disk space is available to accommodate the compaction outputs, which is
estimated based on the input size. This means that users can continue to
write, and we rely on the WriteController to delay or stop writes if the
compaction debt becomes too high due to persistent low disk space
condition
2. Errors during write callback and flush are treated as hard errors,
i.e. the database is put in read-only mode and goes back to read-write
mode only after certain recovery actions are taken.
3. Both types of recovery rely on the SstFileManagerImpl to poll for
sufficient disk space. We assume that there is a 1-1 mapping between an
SFM and the underlying OS storage container. For cases where multiple
DBs are hosted on a single storage container, the user is expected to
allocate a single SFM instance and use the same one for all the DBs. If
no SFM is specified by the user, DBImpl::Open() will allocate one, but
this will be one per DB and each DB will recover independently. The
recovery implemented by SFM is as follows -
a) On the first occurrence of an out of space error during compaction,
subsequent
compactions will be delayed until the disk free space check indicates
enough available space. The required space is computed as the sum of
input sizes.
b) The free space check requirement will be removed once the amount of
free space is greater than the size reserved by in progress
compactions when the first error occurred
c) If the out of space error is a hard error, a background thread in
SFM will poll for sufficient headroom before triggering the recovery
of the database and putting it back in read-write mode. The headroom is
calculated as the sum of the write_buffer_size of all the DB instances
associated with the SFM
4. EventListener callbacks will be called at the start and completion of
automatic recovery. Users can disable the auto recovery in the start
callback, and later initiate it manually by calling DB::Resume()
Todo:
1. More extensive testing
2. Add disk full condition to db_stress (follow-on PR)
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4164
Differential Revision: D9846378
Pulled By: anand1976
fbshipit-source-id: 80ea875dbd7f00205e19c82215ff6e37da10da4a
Summary:
As you know, almost all compilers support the "pragma once" keyword instead of using include guards. To keep consistency between header files, all header files are edited.
Besides this, try to fix some warnings about loss of data.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4339
Differential Revision: D9654990
Pulled By: ajkr
fbshipit-source-id: c2cf3d2d03a599847684bed81378c401920ca848
Summary:
In our application we spawn helper child processes concurrently with
opening rocksdb. In one situation I observed that the child process had inherited
the rocksdb lock file as well as directory handles to the rocksdb storage location.
The code in env_posix takes care to set CLOEXEC but doesn't use `O_CLOEXEC` at the
time that the files are opened which means that there is a window of opportunity
to leak the descriptors across a fork/exec boundary.
This diff introduces a helper that can conditionally set the `O_CLOEXEC` bit for
the open call using the same logic as that in the existing helper for setting
that flag post-open.
I've preserved the post-open logic for systems that don't have `O_CLOEXEC`.
I've introduced setting `O_CLOEXEC` for what appears to be a number of temporary
or transient files and directory handles; I suspect that none of the files
opened by Rocks are intended to be inherited by a forked child process.
In one case, `fopen` is used to open a file. I've added the use of the glibc-specific `e`
mode to turn on `O_CLOEXEC` for this case. While this doesn't cover all posix systems,
it is an improvement for our common deployment system.
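A condensed sketch of the helper's idea: fold O_CLOEXEC into the open() flags when available, otherwise fall back to the post-open fcntl() path (function names here are hypothetical):
```
#include <fcntl.h>

// Add close-on-exec at open() time when the platform supports it, so there is
// no window between open() and fcntl() in which a concurrent fork/exec can
// inherit the descriptor.
int CloexecFlags(int flags) {
#if defined(O_CLOEXEC)
  return flags | O_CLOEXEC;
#else
  return flags;  // caller must still set FD_CLOEXEC via fcntl() after open
#endif
}

int OpenDbFile(const char* fname) {
  return open(fname, CloexecFlags(O_RDWR | O_CREAT), 0644);
}
```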
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4328
Reviewed By: ajkr
Differential Revision: D9553046
Pulled By: wez
fbshipit-source-id: acdb89f7a85ca649b22fe3c3bd76f82142bec2bf
Summary:
sysmacros.h should be included in the OS_ANDROID build as well, otherwise the compiler would complain: error: use of undeclared identifier 'major'.
Fixes https://github.com/facebook/rocksdb/issues/4231
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4232
Differential Revision: D9217350
Pulled By: maysamyabandeh
fbshipit-source-id: 21f4b62dbbda3163120ac0b38b95d95d35d67dce
Summary:
The patch makes sure that two parallel test threads will operate on different db paths. This enables using open source tools such as gtest-parallel to run the tests of a file in parallel.
Example: ``` ~/gtest-parallel/gtest-parallel ./table_test```
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4135
Differential Revision: D8846653
Pulled By: maysamyabandeh
fbshipit-source-id: 799bad1abb260e3d346bcb680d2ae207a852ba84
Summary:
The original `EnvPosixTest.RunImmediately` assumes that after scheduling
a background thread, the thread is guaranteed to complete after 0.1 second.
I do not know about any non-real-time OS/runtime providing this guarantee. Nor
does C++11 standard say anything about this in the documentation of `std::thread`.
In fact, we have observed this test failure multiple times on appveyor, and we
haven't been able to reproduce the failure deterministically. Therefore,
I disable this test for now until we know for sure how it used to fail.
Instead, I add another test `EnvPosixTest.RunEventually` that checks that
a thread will be scheduled eventually.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4126
Differential Revision: D8827086
Pulled By: riversand963
fbshipit-source-id: abc5cb655f90d50b791493da5eeb3716885dfe93
Summary:
Right now, slow deletion with ftruncate doesn't work well with checkpoints because it ruins hard-linked files in checkpoints. To fix it, check that the file has no other hard link before ftruncating it.
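A sketch of the check described above (generic POSIX, not the exact SstFileManager code):
```
#include <sys/stat.h>

// Only truncate a to-be-deleted file when no other hard link shares the
// inode; otherwise truncating would corrupt the copy kept by a checkpoint.
bool SafeToTruncate(const char* fname) {
  struct stat sbuf;
  if (stat(fname, &sbuf) != 0) {
    return false;
  }
  return sbuf.st_nlink <= 1;
}
```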
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4093
Differential Revision: D8730360
Pulled By: siying
fbshipit-source-id: 756eea5bce8a87b9a2ea3a5bfa190b2cab6f75df
Summary:
Moved the direct-IO assertion to the top in `PosixSequentialFile::PositionedRead`, as it doesn't make sense to check for sector alignments before checking for direct IO.
Closes https://github.com/facebook/rocksdb/pull/3891
Differential Revision: D8267972
Pulled By: sagar0
fbshipit-source-id: 0ecf77c0fb5c35747a4ddbc15e278918c0849af7
Summary:
Here are some fixes for build on Solaris Sparc.
It also fixes the CRC test on big-endian platforms.
Closes https://github.com/facebook/rocksdb/pull/4000
Differential Revision: D8455394
Pulled By: ajkr
fbshipit-source-id: c9289a7b541a5628139c6b77e84368e14dc3d174
Summary:
Rebased and resubmitting #1831 on behalf of stevelittle.
The problem is when a single process attempts to open the same DB twice, the second attempt fails due to LOCK file held. If the second attempt had opened the LOCK file, it'll now need to close it, and closing causes the file to be unlocked. Then, any subsequent attempt to open the DB will succeed, which is the wrong behavior.
The solution was to track which files a process has locked in PosixEnv, and check those before opening a LOCK file.
Fixes #1780.
Closes https://github.com/facebook/rocksdb/pull/3993
Differential Revision: D8398984
Pulled By: ajkr
fbshipit-source-id: 2755fe66950a0c9de63075f932f9e15768041918
Summary:
PR https://github.com/facebook/rocksdb/pull/3838 made some changes that triggers lint warnings.
Run `make format` to fix formatting as suggested by siying.
Also piggyback two changes:
1) fix singleton destruction order for windows and posix env
2) fix two clang warnings
Closes https://github.com/facebook/rocksdb/pull/3954
Differential Revision: D8272041
Pulled By: miasantreble
fbshipit-source-id: 7c4fd12bd17aac13534520de0c733328aa3c6c9f
Summary:
Ensure the PosixEnv singleton is destroyed first since its destructor waits for background threads to all complete. This ensures background threads cannot hit sync points after the SyncPoint singleton is destroyed, which was previously possible.
Closes https://github.com/facebook/rocksdb/pull/3951
Differential Revision: D8265295
Pulled By: ajkr
fbshipit-source-id: 7738dd458c5d993a78377dd0420e82badada81ab
Summary:
```PosixMmapReadableFile::fd_``` is closed after creation, but needs to remain open for the lifetime of `PosixMmapReadableFile` since it is used whenever `InvalidateCache` is called.
Closes https://github.com/facebook/rocksdb/pull/2764
Differential Revision: D8152515
Pulled By: ajkr
fbshipit-source-id: b738a6a55ba4e392f9b0f374ff396a1e61c64f65
Summary:
Catch up with Posix features
NewWritableRWFile must fail when the file does not exist
Implement Env::Truncate()
Adjust Env options optimization functions
Implement MemoryMappedBuffer on Windows.
Closes https://github.com/facebook/rocksdb/pull/3857
Differential Revision: D8053610
Pulled By: ajkr
fbshipit-source-id: ccd0d46c29648a9f6f496873bc1c9d6c5547487e