Summary:
This ports `folly::DistributedMutex` into RocksDB. The PR includes everything else needed to compile and use DistributedMutex as a component within folly. Most files are unchanged except for some portability stuff and includes.
For now, I've put this under `rocksdb/third-party`, but if there is a better folder to put this under, let me know. I also am not sure how or where to put unit tests for third-party stuff like this. It seems like gtest is included already, but I need to link with it from another third-party folder.
This also includes some other common components from folly
- folly/Optional
- folly/ScopeGuard (In particular `SCOPE_EXIT`)
- folly/synchronization/ParkingLot (A portable futex-like interface)
- folly/synchronization/AtomicNotification (The standard C++ interface for futexes)
- folly/Indestructible (For singletons that don't get destroyed without allocations)
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5642
Differential Revision: D16544439
fbshipit-source-id: 179b98b5dcddc3075926d31a30f92fd064245731
Summary:
This change adds a Dynamic Library class to the RocksDB Env. Dynamic libraries are populated via the Env::LoadLibrary method.
The addition of dynamic library support allows for a few different features to be developed:
1. The compression code can be changed to use dynamic library support. This would allow RocksDB to determine at run-time what compression packages were installed. This change would eliminate the need to make sure the build-time and run-time environment had the same library set. It would also simplify some of the Java build issues (where it attempts to build and include various packages inside the RocksDB jars).
2. Along with other features (to be provided in a subsequent PR), this change would allow code/configurations to be added to RocksDB at run-time. For example, the build system includes code for building an "rados" environment and adding "Cassandra" features. Instead of these extensions being built into the base RocksDB code, these extensions could be loaded at run-time as required/appropriate, either by configuration or explicitly.
We intend to push out other changes in support of the extending RocksDB at run-time via configurations.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5281
Differential Revision: D15447613
Pulled By: riversand963
fbshipit-source-id: 452cd4f54511c0bceee18f6d9d919aae9fd25fef
Summary:
Fix some hdfs-related code so that it can compile and run 'db_stress'
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5122
Differential Revision: D14675495
Pulled By: riversand963
fbshipit-source-id: cac280479efcf5451982558947eac1732e8bc45a
Summary:
The compiler flag `-DROCKSDB_FALLOCATE_PRESENT` was only set when
`fallocate`, `FALLOC_FL_KEEP_SIZE`, and `FALLOC_FL_PUNCH_HOLE` were all
present. However, the last of the three is not really necessary for the
primary `fallocate` use case; furthermore, it was introduced only in later
Linux kernel versions (2.6.38+).
This PR changes the flag `-DROCKSDB_FALLOCATE_PRESENT` to only require
`fallocate` and `FALLOC_FL_KEEP_SIZE` to be present. There is a separate
check for `FALLOC_FL_PUNCH_HOLE` only in the place where it is used.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5023
Differential Revision: D14248487
Pulled By: siying
fbshipit-source-id: a10ed0b902fa755988e957bd2dcec9081ec0502e
Summary:
- When building with internal dependencies, specify this toolchain by setting `ROCKSDB_FBCODE_BUILD_WITH_PLATFORM007=1`
- It is not enabled by default. However, it is enabled for TSAN builds in CI since there is a known problem with TSAN in gcc-5: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71090
- I did not add support for Lua since (1) we agreed to deprecate it, and (2) we only have an internal build for v5.3 with this toolchain while that has breaking changes compared to our current version (v5.2).
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4923
Differential Revision: D13827226
Pulled By: ajkr
fbshipit-source-id: 9aa3388ed3679777cfb15ef8cbcb83c07f62f947
Summary:
In fbcode when we build with clang7++, although -faligned-new is available in compile phase, we link with an older version of libstdc++.a and it doesn't come with aligned-new support (e.g. `nm libstdc++.a | grep align_val_t` return empty). In this case the previous -faligned-new detection can pass but will end up with link error. Fixing it by only have the detection for non-fbcode build.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4576
Differential Revision: D10500008
Pulled By: yiwu-arbug
fbshipit-source-id: b375de4fbb61d2a08e54ab709441aa8e7b4b08cf
Summary:
clang compilation is failing due to a4fb1f8c04. In that commit I added a call to `std::atomic::is_lock_free` which was evidently relying on a compiler builtin only present in gcc.
Drawbacks to this fix are:
- users may need to install libatomic
- there might be cases where clang is used even though USE_CLANG is unset (e.g., when clang is the only available compiler). I didn't figure out how to add -latomic in those cases...
An alternative fix mentioned in http://lists.llvm.org/pipermail/llvm-bugs/2017-August/057263.html is using -stdlib=libc++ with clang.
Closes https://github.com/facebook/rocksdb/pull/3769
Differential Revision: D7756261
Pulled By: ajkr
fbshipit-source-id: 26888300683fa9970ab5950239d1aa217e8efd49
Summary:
I modified the Makefile so that we can compile rocksdb on OpenBSD.
The instructions for building have been added to INSTALL.md.
The whole compilation process works fine like this on OpenBSD-current
Closes https://github.com/facebook/rocksdb/pull/3617
Differential Revision: D7323754
Pulled By: siying
fbshipit-source-id: 990037d1cc69138d22f85bd77ef4dc8c1ba9edea
Summary:
In attempting to build a static lib for use in iOS, I ran in to lots of type errors between uint64_t and size_t. This PR contains the changes I made to get `TARGET_OS=IOS make static_lib` to succeed while also getting Xcode to build successfully with the resulting `librocksdb.a` library imported.
This also compiles for me on macOS and tests fine, but I'm really not sure if I made the correct decisions about where to `static_cast` and where to change types.
Also up for discussion: is iOS worth supporting? Getting the static lib is just part one, we aren't providing any bridging headers or wrappers like the ObjectiveRocks project, it won't be a great experience.
Closes https://github.com/facebook/rocksdb/pull/3503
Differential Revision: D7106457
Pulled By: gfosco
fbshipit-source-id: 82ac2073de7e1f09b91f6b4faea91d18bd311f8e
Summary:
Most popular versions of GCC can't identify platform on ARM if "-march=native" is specified. Remove it to unblock most people.
Closes https://github.com/facebook/rocksdb/pull/3346
Differential Revision: D6690544
Pulled By: siying
fbshipit-source-id: bbaba9fe2645b6b37144b36ea75beeff88992b49
Summary:
**# Summary**
RocksDB uses SSE crc32 intrinsics to calculate the crc32 values but it does it in single way fashion (not pipelined on single CPU core). Intel's whitepaper () published an algorithm that uses 3-way pipelining for the crc32 intrinsics, then use pclmulqdq intrinsic to combine the values. Because pclmulqdq has overhead on its own, this algorithm will show perf gains on buffers larger than 216 bytes, which makes RocksDB a perfect user, since most of the buffers RocksDB call crc32c on is over 4KB. Initial db_bench show tremendous CPU gain.
This change uses the 3-way SSE algorithm by default. The old SSE algorithm is now behind a compiler tag NO_THREEWAY_CRC32C. If user compiles the code with NO_THREEWAY_CRC32C=1 then the old SSE Crc32c algorithm would be used. If the server does not have SSE4.2 at the run time the slow way (Non SSE) will be used.
**# Performance Test Results**
We ran the FillRandom and ReadRandom benchmarks in db_bench. ReadRandom is the point of interest here since it calculates the CRC32 for the in-mem buffers. We did 3 runs for each algorithm.
Before this change the CRC32 value computation takes about 11.5% of total CPU cost, and with the new 3-way algorithm it reduced to around 4.5%. The overall throughput also improved from 25.53MB/s to 27.63MB/s.
1) ReadRandom in db_bench overall metrics
PER RUN
Algorithm | run | micros/op | ops/sec |Throughput (MB/s)
3-way | 1 | 4.143 | 241387 | 26.7
3-way | 2 | 3.775 | 264872 | 29.3
3-way | 3 | 4.116 | 242929 | 26.9
FastCrc32c|1 | 4.037 | 247727 | 27.4
FastCrc32c|2 | 4.648 | 215166 | 23.8
FastCrc32c|3 | 4.352 | 229799 | 25.4
AVG
Algorithm | Average of micros/op | Average of ops/sec | Average of Throughput (MB/s)
3-way | 4.01 | 249,729 | 27.63
FastCrc32c | 4.35 | 230,897 | 25.53
2) Crc32c computation CPU cost (inclusive samples percentage)
PER RUN
Implementation | run | TotalSamples | Crc32c percentage
3-way | 1 | 4,572,250,000 | 4.37%
3-way | 2 | 3,779,250,000 | 4.62%
3-way | 3 | 4,129,500,000 | 4.48%
FastCrc32c | 1 | 4,663,500,000 | 11.24%
FastCrc32c | 2 | 4,047,500,000 | 12.34%
FastCrc32c | 3 | 4,366,750,000 | 11.68%
**# Test Plan**
make -j64 corruption_test && ./corruption_test
By default it uses 3-way SSE algorithm
NO_THREEWAY_CRC32C=1 make -j64 corruption_test && ./corruption_test
make clean && DEBUG_LEVEL=0 make -j64 db_bench
make clean && DEBUG_LEVEL=0 NO_THREEWAY_CRC32C=1 make -j64 db_bench
Closes https://github.com/facebook/rocksdb/pull/3173
Differential Revision: D6330882
Pulled By: yingsu00
fbshipit-source-id: 8ec3d89719533b63b536a736663ca6f0dd4482e9
Summary:
I started adding gflags support for cmake on linux and got frustrated that I'd need to duplicate the build_detect_platform logic, which determines namespace based on attempting compilation. We can do it differently -- use the GFLAGS_NAMESPACE macro if available, and if not, that indicates it's an old gflags version without configurable namespace so we can simply hardcode "google".
Closes https://github.com/facebook/rocksdb/pull/3212
Differential Revision: D6456973
Pulled By: ajkr
fbshipit-source-id: 3e6d5bde3ca00d4496a120a7caf4687399f5d656
Summary:
If possible, use -march or -mcpu to get enable all features available on the local CPU or architecture. Only if this is impossible, we will manually set -msse4.2. It should be safe as there'll be a warning printed if `USE_SSE` is set and the provided flags are insufficient to support SSE4.2.
Closes https://github.com/facebook/rocksdb/pull/3156
Differential Revision: D6304703
Pulled By: ajkr
fbshipit-source-id: 030a53491263300cae7fafb429114d87acc828ef
Summary:
With some compilers, `-std=c++11` is necessary for <cstdint> to be
available. Pass this flag via $PLATFORM_CXXFLAGS. Fixes#2488.
Closes https://github.com/facebook/rocksdb/pull/2545
Differential Revision: D5620610
Pulled By: yiwu-arbug
fbshipit-source-id: 2f975b8c1ad52e283e677d9a33543abd064f13ce
Summary:
We've had a couple CockroachDB users fail to build RocksDB on exotic platforms, so I figured I'd try my hand at solving these issues upstream. The problems stem from a) `USE_SSE=1` being too aggressive about turning on SSE4.2, even on toolchains that don't support SSE4.2 and b) RocksDB attempting to detect support for thread-local storage based on OS, even though it can vary by compiler on the same OS.
See the individual commit messages for details. Regarding SSE support, this PR should change virtually nothing for non-CMake based builds. `make`, `PORTABLE=1 make`, `USE_SSE=1 make`, and `PORTABLE=1 USE_SSE=1 make` function exactly as before, except that SSE support will be automatically disabled when a simple SSE4.2-using test program fails to compile, as it does on OpenBSD. (OpenBSD's ports GCC supports SSE4.2, but its binutils do not, so `__SSE_4_2__` is defined but an SSE4.2-using program will fail to assemble.) A warning is emitted in this case. The CMake build is modified to support the same set of options, except that `USE_SSE` is spelled `FORCE_SSE42` because `USE_SSE` is rather useless now that we can automatically detect SSE support, and I figure changing options in the CMake build is less disruptive than changing the non-CMake build.
I've tested these changes on all the platforms I can get my hands on (macOS, Windows MSVC, Windows MinGW, and OpenBSD) and it all works splendidly. Let me know if there's anything you object to—I obviously don't mean to break any of your build pipelines in the process of fixing ours downstream.
Closes https://github.com/facebook/rocksdb/pull/2199
Differential Revision: D5054042
Pulled By: yiwu-arbug
fbshipit-source-id: 938e1fc665c049c02ae15698e1409155b8e72171
Summary:
- added a feature test in build_detect_platform to check whether sched_getcpu() is available. glibc offers it only on some platforms (e.g., linux but not mac); this way should be easier than maintaining a list of platforms on which it's available.
- refactored PhysicalCoreID() to be simpler / less repetitive. ordered the conditional compilation clauses from most-to-least preferred
Closes https://github.com/facebook/rocksdb/pull/2272
Differential Revision: D5038093
Pulled By: ajkr
fbshipit-source-id: 81d7db3cc620250de220bdeb3194b2b3d7673de7
Summary:
Replacement of #2147
The change was squashed due to a lot of conflicts.
Closes https://github.com/facebook/rocksdb/pull/2194
Differential Revision: D4929799
Pulled By: siying
fbshipit-source-id: 5cd49c254737a1d5ac13f3c035f128e86524c581
Summary:
This diff includes an implementation of CompactionFilter that allows
users to write CompactionFilter in Lua. With this ability, users can
dynamically change compaction filter logic without requiring building
the rocksdb binary and restarting the database.
To compile, WITH_LUA_PATH must be specified to the base directory
of lua.
Closes https://github.com/facebook/rocksdb/pull/1478
Differential Revision: D4150138
Pulled By: yhchiang
fbshipit-source-id: ed84222
Summary: Splitting the makefile part of D55581.
Test Plan:
make all check -j32
ROCKSDB_FBCODE_BUILD_WITH_481=1 make all check -j32
ROCKSDB_NO_FBCODE=1 make all check -j32
export TBB_BASE=/mnt/gvfs/third-party2/tbb/afa54b33cfcf93f1d90a3160cdb894d6d63d5dca/4.0_update2/gcc-4.9-glibc-2.20/e9936bf;
ROCKSDB_NO_FBCODE=1 CFLAGS="-I $TBB_BASE/include" LDFLAGS="-L $TBB_BASE/lib -Wl,-rpath=$TBB_BASE/lib" make all check -j32
Reviewers: IslamAbdelRahman, sdong
Reviewed By: sdong
Subscribers: kradhakrishnan, yhchiang, IslamAbdelRahman, andrewkr, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D56979
Summary:
We have replaced LEVELDB_PLATFORM_POSIX with ROCKSDB_PLATFORM_POSIX in our code
replace it in Makefile
Test Plan: make check
Reviewers: sdong, yiwu, andrewkr, lightmark
Reviewed By: lightmark
Subscribers: andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D61365
Summary: It looks like we mistakenly enable JEMALLOC even if it's not available on the machine, that's why travis is failing
Test Plan:
check on my devserver
check on my mac
Reviewers: sdong
Reviewed By: sdong
Subscribers: andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D57345
Summary:
Introduced option to dump malloc statistics using new option flag.
Added new command line option to db_bench tool to enable this
funtionality.
Also extended build to support environments with/without jemalloc.
Test Plan:
1) Build rocksdb using `make` command. Launch the following command
`./db_bench --benchmarks=fillrandom --dump_malloc_stats=true
--num=10000000` end verified that jemalloc dump is present in LOG file.
2) Build rocksdb using `DISABLE_JEMALLOC=1 make db_bench -j32` and ran
the same db_bench tool and found the following message in LOG file:
"Please compile with jemalloc to enable malloc dump".
3) Also built rocksdb using `make` command on MacOS to verify behavior
in non-FB environment.
Also to debug build configuration change temporary changed
AM_DEFAULT_VERBOSITY = 1 in Makefile to see compiler and build
tools output. For case 1) -DROCKSDB_JEMALLOC was present in compiler
command line. For both 2) and 3) this flag was not present.
Reviewers: sdong
Reviewed By: sdong
Subscribers: andrewkr, dhruba
Differential Revision: https://reviews.facebook.net/D57321
* Musl libc does not provide adaptive mutex. Added feature test for PTHREAD_MUTEX_ADAPTIVE_NP.
* Musl libc does not provide backtrace(3). Added a feature check for backtrace(3).
* Fixed compiler error.
* Musl libc does not implement backtrace(3). Added platform check for libexecinfo.
* Alpine does not appear to support gcc -pg option. By default (gcc has PIE option enabled) it fails with:
gcc: error: -pie and -pg|p|profile are incompatible when linking
When -fno-PIE and -nopie are used it fails with:
/usr/lib/gcc/x86_64-alpine-linux-musl/5.3.0/../../../../x86_64-alpine-linux-musl/bin/ld: cannot find gcrt1.o: No such file or directory
Added gcc -pg platform test and output PROFILING_FLAGS accordingly. Replaced pg var in Makefile with PROFILING_FLAGS.
* fix segfault when TEST_IOCTL_FRIENDLY_TMPDIR is undefined and default candidates are not suitable
* use ASSERT_DOUBLE_EQ instead of ASSERT_EQ
* When compiled with ROCKSDB_MALLOC_USABLE_SIZE UniversalCompactionFourPaths and UniversalCompactionSecondPathRatio tests fail due to premature memtable flushes on systems with 16-byte alignment. Arena runs out of block space before GenerateNewFile() completes.
Increased options.write_buffer_size.
Summary:
see https://github.com/facebook/rocksdb/issues/977; there are issues
with fallocate() on certain filesystems/kernel versions that can lead it to pre-
allocating blocks but never freeing them, even if they're unused.
Test Plan:
verified build commands omit DROCKSDB_FALLOCATE_PRESENT when this env
variable is set.
without disabling it:
$ ROCKSDB_NO_FBCODE=1 make -n env_test | grep -q DROCKSDB_FALLOCATE_PRESENT ; echo $?
0
with disabling it:
$ ROCKSDB_NO_FBCODE=1 DISABLE_FALLOCATE=1 make -n env_test | grep -q DROCKSDB_FALLOCATE_PRESENT ; echo $?
1
Reviewers: kradhakrishnan, yhchiang, sdong
Reviewed By: sdong
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D54069
These simple changes are required to allow builds on ppc64[le] systems
consistent with X86. The Makefile now recognizes both ppc64 and ppc64le, and
in the absence of PORTABLE=1, the code will be built analogously to the X86
-march=native.
Note that although GCC supports -mcpu=native -mtune=native on POWER, it
doesn't work correctly on all systems. This is why we need to get the actual
machine model from the AUX vector.
Summary: This patch splits the posix storage backend into Env and
the actual *File implementations. The motivation is to allow other Envs
to use posix as a library. This enables a storage backend different from
posix to split its secondary storage between a normal file system
partition managed by posix, and it own media.
Test Plan: No new functionality is added to posix Env or the library,
thus the current tests should suffice.
in the Java jar. Also build the linux libraries using the portable flag to fix a problem with
the linux32 build and improve the general portability of the RocksDB dynamic libraries.
==> linux32: util/crc32c.cc:318:39: error: ‘_mm_crc32_u64’ was not declared in this scope
Summary: Add ZSTD compression type. The same way as adding LZ4.
Test Plan: run all tests. Generate files in db_bench. Make sure reads succeed. But the SST files cannot be opened in older versions. Also some other adhoc tests.
Reviewers: rven, anthony, IslamAbdelRahman, kradhakrishnan, igor
Reviewed By: igor
Subscribers: MarkCallaghan, maykov, yoshinorim, leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D45747
Summary:
Currently, when we insert something into block cache, we say that the block cache capacity decreased by the size of the block. However, size of the block might be less than the actual memory used by this object. For example, 4.5KB block will actually use 8KB of memory. So even if we configure block cache to 10GB, our actually memory usage of block cache will be 20GB!
This problem showed up a lot in testing and just recently also showed up in MongoRocks production where we were using 30GB more memory than expected.
This diff will fix the problem. Instead of counting the block size, we will count memory used by the block. That way, a block cache configured to be 10GB will actually use only 10GB of memory.
I'm using non-portable function and I couldn't find info on portability on Google. However, it seems to work on Linux, which will cover majority of our use-cases.
Test Plan:
1. fill up mongo instance with 80GB of data
2. restart mongo with block cache size configured to 10GB
3. do a table scan in mongo
4. memory usage before the diff: 12GB. memory usage after the diff: 10.5GB
Reviewers: sdong, MarkCallaghan, rven, yhchiang
Reviewed By: yhchiang
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D40635
Summary:
Remove -Wl,--no-as-needed flag when making shared_lib in OSX and IOS as
those environment doe not have compile option --no-as-needed
ld: unknown option: --no-as-needed
clang: error: linker command failed with exit code 1 (use -v to see invocation)
Test Plan: make shared_lib
Reviewers: meyering, igor
Reviewed By: igor
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D40353
Summary:
Cygwin doesn't support -fPIC. Remove it.
Not sure whether we can build shared library in Cygwin but at least it can build without warning.
Test Plan: Build under Cygwin
Reviewers: yhchiang, rven, kradhakrishnan, igor
Reviewed By: igor
Subscribers: leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D40077
Summary:
Make it build for CYGWIN.
Need to define "-std=gnu++11" instead of "-std=c++11" and use some replacement functions.
Test Plan: Build it and run some unit tests in CYGWIN
Reviewers: yhchiang, rven, anthony, kradhakrishnan, igor
Reviewed By: igor
Subscribers: leveldb, dhruba
Differential Revision: https://reviews.facebook.net/D37605
Summary: Currently open source rocksdb only builds with tcmalloc. This diff first checks if jemalloc is available. If it is, it compiles with jemalloc. If it isn't, it checks for tcmalloc.
Test Plan: Tried this out on my Ubuntu virtual machine and confirms that jemalloc is correctly detected and compiled.
Reviewers: MarkCallaghan, yhchiang, rven, sdong
Reviewed By: sdong
Subscribers: adamretter, meyering, dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D36789
Summary: `echo` correctly interpretes \n on mac, but not on linux. On linux you have to give it `-e` to interpret \n. Unfortunately, `-e` options is not available on Mac. Go back to old way of checking gflags
Test Plan: build_tools/build_detect_platform on mac and linux
Reviewers: sdong
Subscribers: dhruba
Differential Revision: https://reviews.facebook.net/D37515
Summary: We should send error output to /dev/null
Test Plan: none
Reviewers: sdong
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D37449
Summary:
If the system has gflags with both `google` and `gflags` namespaces installed, we try to define GFLAGS as two things. This breaks the compile.
Fix: Use `else if` -- try compiling with `google` namespace only if compile with `gflags` failed
Test Plan: build_tools/build_detect_platform correctly identifies gflags
Reviewers: lgalanis
Reviewed By: lgalanis
Subscribers: dhruba, leveldb
Differential Revision: https://reviews.facebook.net/D37389