Commit Graph

122 Commits

Author SHA1 Message Date
Peter Dillinger
31da5e34c1 C++20 compatibility (#6697)
Summary:
Based on https://github.com/facebook/rocksdb/issues/6648 (CLA Signed), but heavily modified / extended:

* Implicit capture of this via [=] deprecated in C++20, and [=,this] not standard before C++20 -> now using explicit capture lists
* Implicit copy operator deprecated in gcc 9 -> add explicit '= default' definition
* std::random_shuffle deprecated in C++17 and removed in C++20 -> migrated to a replacement in RocksDB random.h API
* Add the ability to build with different std version though -DCMAKE_CXX_STANDARD=11/14/17/20 on the cmake command line
* Minimal rebuild flag of MSVC is deprecated and is forbidden with /std:c++latest (C++20)
* Added MSVC 2019 C++11 & MSVC 2019 C++20 in AppVeyor
* Added GCC 9 C++11 & GCC9 C++20 in Travis
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6697

Test Plan: make check and CI

Reviewed By: cheng-chang

Differential Revision: D21020318

Pulled By: pdillinger

fbshipit-source-id: 12311be5dbd8675a0e2c817f7ec50fa11c18ab91
2020-04-20 13:24:25 -07:00
anand76
5c19a441c4 Fault injection in db_stress (#6538)
Summary:
This PR implements a fault injection mechanism for injecting errors in reads in db_stress. The FaultInjectionTestFS is used for this purpose. A thread local structure is used to track the errors, so that each db_stress thread can independently enable/disable error injection and verify observed errors against expected errors. This is initially enabled only for Get and MultiGet, but can be extended to iterator as well once its proven stable.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6538

Test Plan:
crash_test
make check

Reviewed By: riversand963

Differential Revision: D20714347

Pulled By: anand1976

fbshipit-source-id: d7598321d4a2d72bda0ced57411a337a91d87dc7
2020-04-10 17:21:26 -07:00
Luca Giacchino
66a95f0fac Provide an allocator for new memory type to be used with RocksDB block cache (#6214)
Summary:
New memory technologies are being developed by various hardware vendors (Intel DCPMM is one such technology currently available). These new memory types require different libraries for allocation and management (such as PMDK and memkind). The high capacities available make it possible to provision large caches (up to several TBs in size), beyond what is achievable with DRAM.
The new allocator provided in this PR uses the memkind library to allocate memory on different media.

**Performance**

We tested the new allocator using db_bench.
- For each test, we vary the size of the block cache (relative to the size of the uncompressed data in the database).
- The database is filled sequentially. Throughput is then measured with a readrandom benchmark.
- We use a uniform distribution as a worst-case scenario.

The plot shows throughput (ops/s) relative to a configuration with no block cache and default allocator.
For all tests, p99 latency is below 500 us.

![image](https://user-images.githubusercontent.com/26400080/71108594-42479100-2178-11ea-8231-8a775bbc92db.png)

**Changes**

- Add MemkindKmemAllocator
- Add --use_cache_memkind_kmem_allocator db_bench option (to create an LRU block cache with the new allocator)
- Add detection of memkind library with KMEM DAX support
- Add test for MemkindKmemAllocator

**Minimum Requirements**

- kernel 5.3.12
- ndctl v67 - https://github.com/pmem/ndctl
- memkind v1.10.0 - https://github.com/memkind/memkind

**Memory Configuration**

The allocator uses the MEMKIND_DAX_KMEM memory kind. Follow the instructions on[ memkind’s GitHub page](https://github.com/memkind/memkind) to set up NVDIMM memory accordingly.

Note on memory allocation with NVDIMM memory exposed as system memory.
- The MemkindKmemAllocator will only allocate from NVDIMM memory (using memkind_malloc with MEMKIND_DAX_KMEM kind).
- The default allocator is not restricted to RAM by default. Based on NUMA node latency, the kernel should allocate from local RAM preferentially, but it’s a kernel decision. numactl --preferred/--membind can be used to allocate preferentially/exclusively from the local RAM node.

**Usage**

When creating an LRU cache, pass a MemkindKmemAllocator object as argument.
For example (replace capacity with the desired value in bytes):

```
#include "rocksdb/cache.h"
#include "memory/memkind_kmem_allocator.h"

NewLRUCache(
    capacity /*size_t*/,
    6 /*cache_numshardbits*/,
    false /*strict_capacity_limit*/,
    false /*cache_high_pri_pool_ratio*/,
    std::make_shared<MemkindKmemAllocator>());
```

Refer to [RocksDB’s block cache documentation](https://github.com/facebook/rocksdb/wiki/Block-Cache) to assign the LRU cache as block cache for a database.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6214

Reviewed By: cheng-chang

Differential Revision: D19292435

fbshipit-source-id: 7202f47b769e7722b539c86c2ffd669f64d7b4e1
2020-04-09 20:47:23 -07:00
Adam Retter
0772768d07 Force Java version on Travis CI (#6512)
Summary:
In the `.travis.yml` file the `jdk: openjdk7` element is ignored when `language: cpp`. So whatever version of the JDK that was installed in the Travis container was used - typically JDK 11.

To ensure our RocksJava builds are working, we now instead install and use OpenJDK 8. Ideally we would use OpenJDK 7, as RocksJava supports Java 7, but many of the newer Travis containers don't support Java 7, so Java 8 is the next best thing.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6512

Differential Revision: D20388296

Pulled By: pdillinger

fbshipit-source-id: 8bbe6b59b70cfab7fe81ff63867d907fefdd2df1
2020-03-12 12:24:51 -07:00
Fabrice Fontaine
8bbd76edbf Check for sys/auxv.h (#6359)
Summary:
Check for sys/auxv.h and getauxval before using them as they are not
always available (for example on uclibc)

Signed-off-by: Fabrice Fontaine <fontaine.fabrice@gmail.com>
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6359

Differential Revision: D20239797

fbshipit-source-id: 175a098094d81545628c2372e7c388e70a32fd48
2020-03-03 18:09:59 -08:00
sdong
6e97d4de00 By default turn IO Uring off. (#6405)
Summary:
We realized bugs related to IO Uring. Turn it off by default.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6405

Test Plan: Manually run build_tools/build_detect_platform and observe outputs.

Differential Revision: D19862792

fbshipit-source-id: 5d5e8e2762997b72a145ae59389ef3d7e4ccd060
2020-02-12 18:01:49 -08:00
sdong
e3a82bb934 PosixRandomAccessFile::MultiRead() to use I/O uring if supported (#5881)
Summary:
Right now, PosixRandomAccessFile::MultiRead() executes read requests in parallel. In this PR, it leverages I/O Uring library to run it in parallel, even when page cache is enabled. This function will fall back if the kernel version doesn't support it.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5881

Test Plan: Run the unit test on a kernel version supporting it and make sure all tests pass, and run a unit test on kernel version supporting it and see it pass. Before merging, will also run stress test and see it passes.

Differential Revision: D17742266

fbshipit-source-id: e05699c925ac04fdb42379456a4e23e4ebcb803a
2019-12-07 20:55:52 -08:00
Peter Dillinger
f059c7d9b9 New Bloom filter implementation for full and partitioned filters (#6007)
Summary:
Adds an improved, replacement Bloom filter implementation (FastLocalBloom) for full and partitioned filters in the block-based table. This replacement is faster and more accurate, especially for high bits per key or millions of keys in a single filter.

Speed

The improved speed, at least on recent x86_64, comes from
* Using fastrange instead of modulo (%)
* Using our new hash function (XXH3 preview, added in a previous commit), which is much faster for large keys and only *slightly* slower on keys around 12 bytes if hashing the same size many thousands of times in a row.
* Optimizing the Bloom filter queries with AVX2 SIMD operations. (Added AVX2 to the USE_SSE=1 build.) Careful design was required to support (a) SIMD-optimized queries, (b) compatible non-SIMD code that's simple and efficient, (c) flexible choice of number of probes, and (d) essentially maximized accuracy for a cache-local Bloom filter. Probes are made eight at a time, so any number of probes up to 8 is the same speed, then up to 16, etc.
* Prefetching cache lines when building the filter. Although this optimization could be applied to the old structure as well, it seems to balance out the small added cost of accumulating 64 bit hashes for adding to the filter rather than 32 bit hashes.

Here's nominal speed data from filter_bench (200MB in filters, about 10k keys each, 10 bits filter data / key, 6 probes, avg key size 24 bytes, includes hashing time) on Skylake DE (relatively low clock speed):

$ ./filter_bench -quick -impl=2 -net_includes_hashing # New Bloom filter
Build avg ns/key: 47.7135
Mixed inside/outside queries...
  Single filter net ns/op: 26.2825
  Random filter net ns/op: 150.459
    Average FP rate %: 0.954651
$ ./filter_bench -quick -impl=0 -net_includes_hashing # Old Bloom filter
Build avg ns/key: 47.2245
Mixed inside/outside queries...
  Single filter net ns/op: 63.2978
  Random filter net ns/op: 188.038
    Average FP rate %: 1.13823

Similar build time but dramatically faster query times on hot data (63 ns to 26 ns), and somewhat faster on stale data (188 ns to 150 ns). Performance differences on batched and skewed query loads are between these extremes as expected.

The only other interesting thing about speed is "inside" (query key was added to filter) vs. "outside" (query key was not added to filter) query times. The non-SIMD implementations are substantially slower when most queries are "outside" vs. "inside". This goes against what one might expect or would have observed years ago, as "outside" queries only need about two probes on average, due to short-circuiting, while "inside" always have num_probes (say 6). The problem is probably the nastily unpredictable branch. The SIMD implementation has few branches (very predictable) and has pretty consistent running time regardless of query outcome.

Accuracy

The generally improved accuracy (re: Issue https://github.com/facebook/rocksdb/issues/5857) comes from a better design for probing indices
within a cache line (re: Issue https://github.com/facebook/rocksdb/issues/4120) and improved accuracy for millions of keys in a single filter from using a 64-bit hash function (XXH3p). Design details in code comments.

Accuracy data (generalizes, except old impl gets worse with millions of keys):
Memory bits per key: FP rate percent old impl -> FP rate percent new impl
6: 5.70953 -> 5.69888
8: 2.45766 -> 2.29709
10: 1.13977 -> 0.959254
12: 0.662498 -> 0.411593
16: 0.353023 -> 0.0873754
24: 0.261552 -> 0.0060971
50: 0.225453 -> ~0.00003 (less than 1 in a million queries are FP)

Fixes https://github.com/facebook/rocksdb/issues/5857
Fixes https://github.com/facebook/rocksdb/issues/4120

Unlike the old implementation, this implementation has a fixed cache line size (64 bytes). At 10 bits per key, the accuracy of this new implementation is very close to the old implementation with 128-byte cache line size. If there's sufficient demand, this implementation could be generalized.

Compatibility

Although old releases would see the new structure as corrupt filter data and read the table as if there's no filter, we've decided only to enable the new Bloom filter with new format_version=5. This provides a smooth path for automatic adoption over time, with an option for early opt-in.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6007

Test Plan: filter_bench has been used thoroughly to validate speed, accuracy, and correctness. Unit tests have been carefully updated to exercise new and old implementations, as well as the logic to select an implementation based on context (format_version).

Differential Revision: D18294749

Pulled By: pdillinger

fbshipit-source-id: d44c9db3696e4d0a17caaec47075b7755c262c5f
2019-11-13 16:44:01 -08:00
Peter Dillinger
ca7ccbe2ea Misc hashing updates / upgrades (#5909)
Summary:
- Updated our included xxhash implementation to version 0.7.2 (== the latest dev version as of 2019-10-09).
- Using XXH_NAMESPACE (like other fb projects) to avoid potential name collisions.
- Added fastrange64, and unit tests for it and fastrange32. These are faster alternatives to hash % range.
- Use preview version of XXH3 instead of MurmurHash64A for NPHash64
-- Had to update cache_test to increase probability of passing for any given hash function.
- Use fastrange64 instead of % with uses of NPHash64
-- Had to fix WritePreparedTransactionTest.CommitOfDelayedPrepared to avoid deadlock apparently caused by new hash collision.
- Set default seed for NPHash64 because specifying a seed rarely makes sense for it.
- Removed unnecessary include xxhash.h in a popular .h file
- Rename preview version of XXH3 to XXH3p for clarity and to ease backward compatibility in case final version of XXH3 is integrated.

Relying on existing unit tests for NPHash64-related changes. Each new implementation of fastrange64 passed unit tests when manipulating my local build to select it. I haven't done any integration performance tests, but I consider the improved performance of the pieces being swapped in to be well established.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5909

Differential Revision: D18125196

Pulled By: pdillinger

fbshipit-source-id: f6bf83d49d20cbb2549926adf454fd035f0ecc0d
2019-10-24 17:16:46 -07:00
Vijay Nadimpalli
2ce6aa5f39 Making platform 007 (gcc 7) default in build_detect_platform.sh (#5947)
Summary:
Making platform 007 (gcc 7) default in build_detect_platform.sh.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5947

Differential Revision: D18038837

Pulled By: vjnadimpalli

fbshipit-source-id: 9ac2ddaa93bf328a416faec028970e039886378e
2019-10-21 12:09:29 -07:00
Peter Dillinger
6a171724b7 Clean up + fix build scripts re: USE_SSE= and PORTABLE= (#5800)
Summary:
In preparing to utilize a new Intel instruction extension, I
noticed problems with the existing build script in regard to the
existing utilized extensions, either with USE_SSE or PORTABLE flags.

* PORTABLE=0 was interpreted the same as PORTABLE=1. Now empty and 0
mean the same. (I guess you were not supposed to set PORTABLE= if you
wanted non-portable--except that...)
* The Facebook build script extensions would set PORTABLE=1 even if
it's already set in a make var or environment. Now it does not override
a non-empty setting, so use PORTABLE=0 for fully optimized build,
overriding Facebook environment default.
* Put in an explanation of the USE_SSE flag where it's used by
build_detect_platform, and cleaned up some confusing/redundant
associated logic.
* If USE_SSE was set and expected intrinsics were not available,
build_detect_platform would exit early but build would proceed with
broken, incomplete configuration. Now warning is gracefully recovered.
* If USE_SSE was set and expected intrinsics were not available,
build would still try to use flags like -msse4.2 etc. which could lead
to unexpected compilation failure or binary incompatibility. Now those
flags are not used if the warning is issued.

This should not break or change existing, valid build scripts.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5800

Test Plan: manual case testing

Differential Revision: D17369543

Pulled By: pdillinger

fbshipit-source-id: 4ee244911680ae71144d272c40aceea548e3ce88
2019-09-13 11:07:13 -07:00
Aaryaman Sagar
38b03c840e Port folly/synchronization/DistributedMutex to rocksdb (#5642)
Summary:
This ports `folly::DistributedMutex` into RocksDB. The PR includes everything else needed to compile and use DistributedMutex as a component within folly. Most files are unchanged except for some portability stuff and includes.

For now, I've put this under `rocksdb/third-party`, but if there is a better folder to put this under, let me know. I also am not sure how or where to put unit tests for third-party stuff like this. It seems like gtest is included already, but I need to link with it from another third-party folder.

This also includes some other common components from folly

- folly/Optional
- folly/ScopeGuard (In particular `SCOPE_EXIT`)
- folly/synchronization/ParkingLot (A portable futex-like interface)
- folly/synchronization/AtomicNotification (The standard C++ interface for futexes)
- folly/Indestructible (For singletons that don't get destroyed without allocations)
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5642

Differential Revision: D16544439

fbshipit-source-id: 179b98b5dcddc3075926d31a30f92fd064245731
2019-08-07 14:34:19 -07:00
siddontang
4bd0cf541d build on ARM64 (#5450)
Summary:
Support building RocksDB on AWS ARM64

```
uname -m
aarch64
```
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5450

Differential Revision: D15879851

fbshipit-source-id: a9b56520a2cd9921338305a06d7103a40a3300b8
2019-06-18 11:27:45 -07:00
Patrick Zhang
5c76ba9dc4 Support rocksdbjava aarch64 build and test (#5258)
Summary:
Verified with an Ampere Computing eMAG aarch64 system.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5258

Differential Revision: D15807309

Pulled By: maysamyabandeh

fbshipit-source-id: ab85d2fd3fe40e6094430ab0eba557b1e979510d
2019-06-13 11:48:10 -07:00
Mark Rambacher
c8267120d8 Add support for loading dynamic libraries into the RocksDB environment (#5281)
Summary:
This change adds a Dynamic Library class to the RocksDB Env.  Dynamic libraries are populated via the  Env::LoadLibrary method.

The addition of dynamic library support allows for a few different features to be developed:
1.  The compression code can be changed to use dynamic library support.  This would allow RocksDB to determine at run-time what compression packages were installed.  This change would eliminate the need to make sure the build-time and run-time environment had the same library set.  It would also simplify some of the Java build issues (where it attempts to build and include various packages inside the RocksDB jars).

2.  Along with other features (to be provided in a subsequent PR), this change would allow code/configurations to be added to RocksDB at run-time.  For example, the build system includes code for building an "rados" environment and adding "Cassandra" features.  Instead of these extensions being built into the base RocksDB code, these extensions could be loaded at run-time as required/appropriate, either by configuration or explicitly.

We intend to push out other changes in support of the extending RocksDB at run-time via configurations.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5281

Differential Revision: D15447613

Pulled By: riversand963

fbshipit-source-id: 452cd4f54511c0bceee18f6d9d919aae9fd25fef
2019-06-03 23:02:56 -07:00
Azat Khuzhin
29a198564d Fixes for build_detect_platform
Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/5255

Differential Revision: D15246532

Pulled By: riversand963

fbshipit-source-id: 96a21509666152788fa2f956e865a6bed7c8f474
2019-05-15 16:01:08 -07:00
Yanqin Jin
d77476ef55 Fix db_stress for custom env (#5122)
Summary:
Fix some hdfs-related code so that it can compile and run 'db_stress'
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5122

Differential Revision: D14675495

Pulled By: riversand963

fbshipit-source-id: cac280479efcf5451982558947eac1732e8bc45a
2019-03-28 19:20:27 -07:00
Andrew Kryczka
186b3afaa8 Use fallocate even if hole-punching unsupported (#5023)
Summary:
The compiler flag `-DROCKSDB_FALLOCATE_PRESENT` was only set when
`fallocate`, `FALLOC_FL_KEEP_SIZE`, and `FALLOC_FL_PUNCH_HOLE` were all
present. However, the last of the three is not really necessary for the
primary `fallocate` use case; furthermore, it was introduced only in later
Linux kernel versions (2.6.38+).

This PR changes the flag `-DROCKSDB_FALLOCATE_PRESENT` to only require
`fallocate` and `FALLOC_FL_KEEP_SIZE` to be present. There is a separate
check for `FALLOC_FL_PUNCH_HOLE` only in the place where it is used.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/5023

Differential Revision: D14248487

Pulled By: siying

fbshipit-source-id: a10ed0b902fa755988e957bd2dcec9081ec0502e
2019-03-04 15:43:17 -08:00
Andrew Kryczka
e242fa4664 Add latest toolchain (gcc-8, etc.) build support for fbcode users (#4923)
Summary:
- When building with internal dependencies, specify this toolchain by setting `ROCKSDB_FBCODE_BUILD_WITH_PLATFORM007=1`
- It is not enabled by default. However, it is enabled for TSAN builds in CI since there is a known problem with TSAN in gcc-5: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71090
- I did not add support for Lua since (1) we agreed to deprecate it, and (2) we only have an internal build for v5.3 with this toolchain while that has breaking changes compared to our current version (v5.2).
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4923

Differential Revision: D13827226

Pulled By: ajkr

fbshipit-source-id: 9aa3388ed3679777cfb15ef8cbcb83c07f62f947
2019-01-28 11:26:32 -08:00
Adam Retter
75a333ad2d Fix build on macOS when jemalloc is installed via HomeBrew (#4767)
Summary:
Closes https://github.com/facebook/rocksdb/issues/2954

This doesn't necessarily mean that Rocks runs correctly with jemalloc on macOS... but at least now the build completes!
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4767

Differential Revision: D13495195

Pulled By: sagar0

fbshipit-source-id: d3fda0d6358a93e9803d215e3406f117be367c42
2018-12-17 16:30:20 -08:00
Yi Wu
742302a1a3 Fix compile error with aligned-new (#4576)
Summary:
In fbcode when we build with clang7++, although -faligned-new is available in compile phase, we link with an older version of libstdc++.a and it doesn't come with aligned-new support (e.g. `nm libstdc++.a | grep align_val_t` return empty). In this case the previous -faligned-new detection can pass but will end up with link error. Fixing it by only have the detection for non-fbcode build.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4576

Differential Revision: D10500008

Pulled By: yiwu-arbug

fbshipit-source-id: b375de4fbb61d2a08e54ab709441aa8e7b4b08cf
2018-10-23 10:55:41 -07:00
Andrew Kryczka
9b89479e64 Pass -latomic to linker when using clang
Summary:
clang compilation is failing due to a4fb1f8c04. In that commit I added a call to `std::atomic::is_lock_free` which was evidently relying on a compiler builtin only present in gcc.

Drawbacks to this fix are:

- users may need to install libatomic
- there might be cases where clang is used even though USE_CLANG is unset (e.g., when clang is the only available compiler). I didn't figure out how to add -latomic in those cases...

An alternative fix mentioned in http://lists.llvm.org/pipermail/llvm-bugs/2017-August/057263.html is using -stdlib=libc++ with clang.
Closes https://github.com/facebook/rocksdb/pull/3769

Differential Revision: D7756261

Pulled By: ajkr

fbshipit-source-id: 26888300683fa9970ab5950239d1aa217e8efd49
2018-04-25 12:13:41 -07:00
Adam Retter
12b400e814 Some small improvements to the build_tools
Summary: Closes https://github.com/facebook/rocksdb/pull/3664

Differential Revision: D7459433

Pulled By: sagar0

fbshipit-source-id: 3817e5d45fc70e83cb26f9800eaa0f4566c8dc0e
2018-04-02 23:57:41 -07:00
Tobias Tschinkowitz
ccb761364d Enable compilation on OpenBSD
Summary:
I modified the Makefile so that we can compile rocksdb on OpenBSD.
The instructions for building have been added to INSTALL.md.
The whole compilation process works fine like this on OpenBSD-current
Closes https://github.com/facebook/rocksdb/pull/3617

Differential Revision: D7323754

Pulled By: siying

fbshipit-source-id: 990037d1cc69138d22f85bd77ef4dc8c1ba9edea
2018-03-19 12:30:05 -07:00
Fosco Marotto
d518fe1da6 uint64_t and size_t changes to compile for iOS
Summary:
In attempting to build a static lib for use in iOS, I ran in to lots of type errors between uint64_t and size_t.  This PR contains the changes I made to get `TARGET_OS=IOS make static_lib` to succeed while also getting Xcode to build successfully with the resulting `librocksdb.a` library imported.

This also compiles for me on macOS and tests fine, but I'm really not sure if I made the correct decisions about where to `static_cast` and where to change types.

Also up for discussion: is iOS worth supporting?  Getting the static lib is just part one, we aren't providing any bridging headers or wrappers like the ObjectiveRocks project, it won't be a great experience.
Closes https://github.com/facebook/rocksdb/pull/3503

Differential Revision: D7106457

Pulled By: gfosco

fbshipit-source-id: 82ac2073de7e1f09b91f6b4faea91d18bd311f8e
2018-03-06 12:43:51 -08:00
Adam Retter
a53c571d2d FreeBSD build support for RocksDB and RocksJava
Summary:
Tested on a clean FreeBSD 11.01 x64.

Closes https://github.com/facebook/rocksdb/pull/1423
Closes https://github.com/facebook/rocksdb/pull/3357

Differential Revision: D6705868

Pulled By: sagar0

fbshipit-source-id: cbccbbdafd4f42922512ca03619a5d5583a425fd
2018-01-11 13:29:55 -08:00
Siying Dong
a478e85697 Remove GCC parameter "-march=native" for ARM
Summary:
Most popular versions of GCC can't identify platform on ARM if "-march=native" is specified. Remove it to unblock most people.
Closes https://github.com/facebook/rocksdb/pull/3346

Differential Revision: D6690544

Pulled By: siying

fbshipit-source-id: bbaba9fe2645b6b37144b36ea75beeff88992b49
2018-01-09 18:27:03 -08:00
yingsu00
f54d7f5fea Port 3 way SSE4.2 crc32c implementation from Folly
Summary:
**# Summary**

RocksDB uses SSE crc32 intrinsics to calculate the crc32 values but it does it in single way fashion (not pipelined on single CPU core). Intel's whitepaper () published an algorithm that uses 3-way pipelining for the crc32 intrinsics, then use pclmulqdq intrinsic to combine the values. Because pclmulqdq has overhead on its own, this algorithm will show perf gains on buffers larger than 216 bytes, which makes RocksDB a perfect user, since most of the buffers RocksDB call crc32c on is over 4KB. Initial db_bench show tremendous CPU gain.

This change uses the 3-way SSE algorithm by default. The old SSE algorithm is now behind a compiler tag NO_THREEWAY_CRC32C. If user compiles the code with NO_THREEWAY_CRC32C=1 then the old SSE Crc32c algorithm would be used. If the server does not have SSE4.2 at the run time the slow way (Non SSE) will be used.

**# Performance Test Results**
We ran the FillRandom and ReadRandom benchmarks in db_bench. ReadRandom is the point of interest here since it calculates the CRC32 for the in-mem buffers. We did 3 runs for each algorithm.

Before this change the CRC32 value computation takes about 11.5% of total CPU cost, and with the new 3-way algorithm it reduced to around 4.5%. The overall throughput also improved from 25.53MB/s to 27.63MB/s.

1) ReadRandom in db_bench overall metrics

    PER RUN
    Algorithm | run | micros/op | ops/sec |Throughput (MB/s)
    3-way      |  1   | 4.143   | 241387 | 26.7
    3-way      |  2   | 3.775   | 264872 | 29.3
    3-way      | 3    | 4.116   | 242929 | 26.9
    FastCrc32c|1  | 4.037   | 247727 | 27.4
    FastCrc32c|2  | 4.648   | 215166 | 23.8
    FastCrc32c|3  | 4.352   | 229799 | 25.4

     AVG
    Algorithm     |    Average of micros/op |   Average of ops/sec |    Average of Throughput (MB/s)
    3-way           |     4.01                               |      249,729                 |      27.63
    FastCrc32c  |     4.35                              |     230,897                  |      25.53

 2)   Crc32c computation CPU cost (inclusive samples percentage)
    PER RUN
    Implementation | run |  TotalSamples   | Crc32c percentage
    3-way                 |  1    |  4,572,250,000 | 4.37%
    3-way                 |  2    |  3,779,250,000 | 4.62%
    3-way                 |  3    |  4,129,500,000 | 4.48%
    FastCrc32c       |  1    |  4,663,500,000 | 11.24%
    FastCrc32c       |  2    |  4,047,500,000 | 12.34%
    FastCrc32c       |  3    |  4,366,750,000 | 11.68%

 **# Test Plan**
     make -j64 corruption_test && ./corruption_test
      By default it uses 3-way SSE algorithm

     NO_THREEWAY_CRC32C=1 make -j64 corruption_test && ./corruption_test

    make clean && DEBUG_LEVEL=0 make -j64 db_bench
    make clean && DEBUG_LEVEL=0 NO_THREEWAY_CRC32C=1 make -j64 db_bench
Closes https://github.com/facebook/rocksdb/pull/3173

Differential Revision: D6330882

Pulled By: yingsu00

fbshipit-source-id: 8ec3d89719533b63b536a736663ca6f0dd4482e9
2017-12-19 18:26:49 -08:00
Adam Novak
a37d734596 Add ROCKSDB_DISABLE_* environment variables
Summary:
Should fix #3036.
Closes https://github.com/facebook/rocksdb/pull/3042

Differential Revision: D6452921

Pulled By: sagar0

fbshipit-source-id: eaf11e43fee1f8747006530cfc0c7a358f1c2f0f
2017-12-05 15:12:46 -08:00
Andrew Kryczka
63f1c0a57d fix gflags namespace
Summary:
I started adding gflags support for cmake on linux and got frustrated that I'd need to duplicate the build_detect_platform logic, which determines namespace based on attempting compilation. We can do it differently -- use the GFLAGS_NAMESPACE macro if available, and if not, that indicates it's an old gflags version without configurable namespace so we can simply hardcode "google".
Closes https://github.com/facebook/rocksdb/pull/3212

Differential Revision: D6456973

Pulled By: ajkr

fbshipit-source-id: 3e6d5bde3ca00d4496a120a7caf4687399f5d656
2017-12-01 10:42:05 -08:00
Andrew Kryczka
07c2738ffa prefer enabling cpu features via -march/-mcpu
Summary:
If possible, use -march or -mcpu to get enable all features available on the local CPU or architecture. Only if this is impossible, we will manually set -msse4.2. It should be safe as there'll be a warning printed if `USE_SSE` is set and the provided flags are insufficient to support SSE4.2.
Closes https://github.com/facebook/rocksdb/pull/3156

Differential Revision: D6304703

Pulled By: ajkr

fbshipit-source-id: 030a53491263300cae7fafb429114d87acc828ef
2017-11-10 16:57:11 -08:00
Nikhil Benesch
c5f0c6cc66 compile with correct flags to determine SSE4.2 support
Summary:
With some compilers, `-std=c++11` is necessary for <cstdint> to be
available. Pass this flag via $PLATFORM_CXXFLAGS. Fixes #2488.
Closes https://github.com/facebook/rocksdb/pull/2545

Differential Revision: D5620610

Pulled By: yiwu-arbug

fbshipit-source-id: 2f975b8c1ad52e283e677d9a33543abd064f13ce
2017-08-13 21:47:45 -07:00
Maysam Yabandeh
627c9f1abb Don't add -ljemalloc when DISABLE_JEMALLOC is set
Summary:
fixes #2555
Closes https://github.com/facebook/rocksdb/pull/2684

Differential Revision: D5560527

Pulled By: maysamyabandeh

fbshipit-source-id: 6e1d874ae0b4e699a77203d9d52d0bb8f59013b0
2017-08-04 10:42:32 -07:00
Nikhil Benesch
11c5d4741a cross-platform compatibility improvements
Summary:
We've had a couple CockroachDB users fail to build RocksDB on exotic platforms, so I figured I'd try my hand at solving these issues upstream. The problems stem from a) `USE_SSE=1` being too aggressive about turning on SSE4.2, even on toolchains that don't support SSE4.2 and b) RocksDB attempting to detect support for thread-local storage based on OS, even though it can vary by compiler on the same OS.

See the individual commit messages for details. Regarding SSE support, this PR should change virtually nothing for non-CMake based builds. `make`, `PORTABLE=1 make`, `USE_SSE=1 make`, and `PORTABLE=1 USE_SSE=1 make` function exactly as before, except that SSE support will be automatically disabled when a simple SSE4.2-using test program fails to compile, as it does on OpenBSD. (OpenBSD's ports GCC supports SSE4.2, but its binutils do not, so `__SSE_4_2__` is defined but an SSE4.2-using program will fail to assemble.) A warning is emitted in this case. The CMake build is modified to support the same set of options, except that `USE_SSE` is spelled `FORCE_SSE42` because `USE_SSE` is rather useless now that we can automatically detect SSE support, and I figure changing options in the CMake build is less disruptive than changing the non-CMake build.

I've tested these changes on all the platforms I can get my hands on (macOS, Windows MSVC, Windows MinGW, and OpenBSD) and it all works splendidly. Let me know if there's anything you object to—I obviously don't mean to break any of your build pipelines in the process of fixing ours downstream.
Closes https://github.com/facebook/rocksdb/pull/2199

Differential Revision: D5054042

Pulled By: yiwu-arbug

fbshipit-source-id: 938e1fc665c049c02ae15698e1409155b8e72171
2017-05-15 16:15:38 -07:00
Andrew Kryczka
be421b0b16 portable sched_getcpu calls
Summary:
- added a feature test in build_detect_platform to check whether sched_getcpu() is available. glibc offers it only on some platforms (e.g., linux but not mac); this way should be easier than maintaining a list of platforms on which it's available.
- refactored PhysicalCoreID() to be simpler / less repetitive. ordered the conditional compilation clauses from most-to-least preferred
Closes https://github.com/facebook/rocksdb/pull/2272

Differential Revision: D5038093

Pulled By: ajkr

fbshipit-source-id: 81d7db3cc620250de220bdeb3194b2b3d7673de7
2017-05-10 12:29:23 -07:00
Tomas Kolda
04d58970cb AIX and Solaris Sparc Support
Summary:
Replacement of #2147

The change was squashed due to a lot of conflicts.
Closes https://github.com/facebook/rocksdb/pull/2194

Differential Revision: D4929799

Pulled By: siying

fbshipit-source-id: 5cd49c254737a1d5ac13f3c035f128e86524c581
2017-04-21 20:48:04 -07:00
Yueh-Hsuan Chiang
647eafdc21 Introduce Lua Extension: RocksLuaCompactionFilter
Summary:
This diff includes an implementation of CompactionFilter that allows
users to write CompactionFilter in Lua.  With this ability, users can
dynamically change compaction filter logic without requiring building
the rocksdb binary and restarting the database.

To compile, WITH_LUA_PATH must be specified to the base directory
of lua.
Closes https://github.com/facebook/rocksdb/pull/1478

Differential Revision: D4150138

Pulled By: yhchiang

fbshipit-source-id: ed84222
2016-11-16 15:39:12 -08:00
Alex Robinson
b10d65c2a4 Update and slightly clarify instructions in build_detect_platform (#1301) 2016-08-25 10:40:38 -07:00
Yi Wu
ff17a2abf3 Adding TBB as dependency.
Summary: Splitting the makefile part of D55581.

Test Plan:
  make all check -j32
  ROCKSDB_FBCODE_BUILD_WITH_481=1 make all check -j32
  ROCKSDB_NO_FBCODE=1 make all check -j32

  export TBB_BASE=/mnt/gvfs/third-party2/tbb/afa54b33cfcf93f1d90a3160cdb894d6d63d5dca/4.0_update2/gcc-4.9-glibc-2.20/e9936bf;
  ROCKSDB_NO_FBCODE=1 CFLAGS="-I $TBB_BASE/include" LDFLAGS="-L $TBB_BASE/lib -Wl,-rpath=$TBB_BASE/lib" make all check -j32

Reviewers: IslamAbdelRahman, sdong

Reviewed By: sdong

Subscribers: kradhakrishnan, yhchiang, IslamAbdelRahman, andrewkr, dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D56979
2016-08-18 10:44:29 -07:00
Islam AbdelRahman
4990c0d1a5 Remove deprecated LEVELDB_PLATFORM_POSIX
Summary:
We have replaced LEVELDB_PLATFORM_POSIX with ROCKSDB_PLATFORM_POSIX in our code
replace it in Makefile

Test Plan: make check

Reviewers: sdong, yiwu, andrewkr, lightmark

Reviewed By: lightmark

Subscribers: andrewkr, dhruba

Differential Revision: https://reviews.facebook.net/D61365
2016-08-03 11:07:53 -07:00
woonhak.kang
5c858ddd20 fix errata in libnuma test (#1244) 2016-07-26 19:28:45 -07:00
Islam AbdelRahman
0850bc5147 Fix build on machines without jemalloc
Summary: It looks like we mistakenly enable JEMALLOC even if it's not available on the machine, that's why travis is failing

Test Plan:
check on my devserver
check on my mac

Reviewers: sdong

Reviewed By: sdong

Subscribers: andrewkr, dhruba

Differential Revision: https://reviews.facebook.net/D57345
2016-04-27 18:25:19 -07:00
Sergey Makarenko
1c80dfab24 Print memory allocation counters
Summary:
Introduced option to dump malloc statistics using new option flag.
    Added new command line option to db_bench tool to enable this
    funtionality.
    Also extended build to support environments with/without jemalloc.

Test Plan:
1) Build rocksdb using `make` command. Launch the following command
    `./db_bench --benchmarks=fillrandom --dump_malloc_stats=true
    --num=10000000` end verified that jemalloc dump is present in LOG file.
    2) Build rocksdb using `DISABLE_JEMALLOC=1  make db_bench -j32` and ran
    the same db_bench tool and found the following message in LOG file:
    "Please compile with jemalloc to enable malloc dump".
    3) Also built rocksdb using `make` command on MacOS to verify behavior
    in non-FB environment.
    Also to debug build configuration change temporary changed
    AM_DEFAULT_VERBOSITY = 1 in Makefile to see compiler and build
    tools output. For case 1) -DROCKSDB_JEMALLOC was present in compiler
    command line. For both 2) and 3) this flag was not present.

Reviewers: sdong

Reviewed By: sdong

Subscribers: andrewkr, dhruba

Differential Revision: https://reviews.facebook.net/D57321
2016-04-27 16:23:33 -07:00
dx9
b71c4e613f Alpine Linux Build (#990)
* Musl libc does not provide adaptive mutex. Added feature test for PTHREAD_MUTEX_ADAPTIVE_NP.

* Musl libc does not provide backtrace(3). Added a feature check for backtrace(3).

* Fixed compiler error.

* Musl libc does not implement backtrace(3). Added platform check for libexecinfo.

* Alpine does not appear to support gcc -pg option. By default (gcc has PIE option enabled) it fails with:

gcc: error: -pie and -pg|p|profile are incompatible when linking

When -fno-PIE and -nopie are used it fails with:

/usr/lib/gcc/x86_64-alpine-linux-musl/5.3.0/../../../../x86_64-alpine-linux-musl/bin/ld: cannot find gcrt1.o: No such file or directory

Added gcc -pg platform test and output PROFILING_FLAGS accordingly. Replaced pg var in Makefile with PROFILING_FLAGS.

* fix segfault when TEST_IOCTL_FRIENDLY_TMPDIR is undefined and default candidates are not suitable

* use ASSERT_DOUBLE_EQ instead of ASSERT_EQ

* When compiled with ROCKSDB_MALLOC_USABLE_SIZE UniversalCompactionFourPaths and UniversalCompactionSecondPathRatio tests fail due to premature memtable flushes on systems with 16-byte alignment. Arena runs out of block space before GenerateNewFile() completes.

Increased options.write_buffer_size.
2016-04-22 16:49:12 -07:00
root
3373c81fa8 Modify build_tools/build_detect_platform to detect and set -march=z10 on Linux s390x. 2016-02-29 15:02:52 -05:00
root
21f17aaa60 Modified Makefile and build_tools/build_detect_platform to compile on Linux s390x. 2016-02-26 18:03:07 -05:00
Andrew Kryczka
6a2b4fcb80 Add flag to forcibly disable fallocate
Summary:
see https://github.com/facebook/rocksdb/issues/977; there are issues
with fallocate() on certain filesystems/kernel versions that can lead it to pre-
allocating blocks but never freeing them, even if they're unused.

Test Plan:
verified build commands omit DROCKSDB_FALLOCATE_PRESENT when this env
variable is set.

without disabling it:

  $ ROCKSDB_NO_FBCODE=1 make -n env_test | grep -q DROCKSDB_FALLOCATE_PRESENT ; echo $?
  0

with disabling it:

  $ ROCKSDB_NO_FBCODE=1 DISABLE_FALLOCATE=1 make -n env_test | grep -q DROCKSDB_FALLOCATE_PRESENT ; echo $?
  1

Reviewers: kradhakrishnan, yhchiang, sdong

Reviewed By: sdong

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D54069
2016-02-11 17:00:01 -08:00
bcbrock
f423f05dcd Simple changes to support builds for ppc64[le] consistent with X86
These simple changes are required to allow builds on ppc64[le] systems
consistent with X86. The Makefile now recognizes both ppc64 and ppc64le, and
in the absence of PORTABLE=1, the code will be built analogously to the X86
-march=native.

Note that although GCC supports -mcpu=native -mtune=native on POWER, it
doesn't work correctly on all systems. This is why we need to get the actual
machine model from the AUX vector.
2016-01-19 09:08:19 -06:00
Javier González
6e6dd5f6f9 Split posix storage backend into Env and library
Summary: This patch splits the posix storage backend into Env and
the actual *File implementations. The motivation is to allow other Envs
to use posix as a library. This enables a storage backend different from
posix to split its secondary storage between a normal file system
partition managed by posix, and it own media.

Test Plan: No new functionality is added to posix Env or the library,
thus the current tests should suffice.
2015-10-22 17:31:31 +02:00
James Lent
5a7222782a Ensure that the compression libraries are statically linked into dynamic libraries included
in the Java jar.  Also build the linux libraries using the portable flag to fix a problem with
the linux32 build and improve the general portability of the RocksDB dynamic libraries.
==> linux32: util/crc32c.cc:318:39: error: ‘_mm_crc32_u64’ was not declared in this scope
2015-10-09 14:41:40 -04:00