rocksdb

Author	SHA1	Message	Date
Maysam Yabandeh	349542332a	Fix race condition on options_file_number_ (#4780 ) Summary: options_file_number_ must be written under db::mutex_ since its read is protected by mutex_ in ::GetLiveFiles(). However currently it is written in ::RenameTempFileToOptionsFile() which according to its contract must be called without holding db::mutex_. The patch fixes the race condition by also acquiring the mutex_ before writing options_file_number_. Also it does that only if the rename of option file is successful. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4780 Differential Revision: D13461411 Pulled By: maysamyabandeh fbshipit-source-id: 2d5bae96a1f3e969ef2505b737cf2d7ae749787b	2018-12-13 19:27:38 -08:00
Yanqin Jin	4fce44fc8b	Improve flushing multiple column families (#4708 ) Summary: If one column family is dropped, we should simply skip it and continue to flush other active ones. Currently we use Status::ShutdownInProgress to notify caller of column families being dropped. In the future, we should consider using a different Status code. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4708 Differential Revision: D13378954 Pulled By: riversand963 fbshipit-source-id: 42f248cdf2d32d4c0f677cd39012694b8f1328ca	2018-12-13 15:12:40 -08:00
Maysam Yabandeh	67e5b5420e	Reduce runtime of compact_on_deletion_collector_test (#4779 ) Summary: It sometimes times out when it is run with TSAN. The patch reduces the iterations from 50 to 30. This reduces the normal runtime from 5.2 to 3.1 seconds and should similarly address the TSAN timeout problem. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4779 Differential Revision: D13456862 Pulled By: maysamyabandeh fbshipit-source-id: fdc0ad7d781b1c33b771d2415ff5fa2f1b5e2537	2018-12-13 14:47:08 -08:00
DorianZheng	2670fe8c73	Get `CompactionJobInfo` from CompactFiles Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/4716 Differential Revision: D13207677 Pulled By: ajkr fbshipit-source-id: d0ccf5a66df6cbb07288b0c5ebad81fd9df3926b	2018-12-13 14:21:24 -08:00
Burton Li	a8b9891f95	Concurrent task limiter for compaction thread control (#4332 ) Summary: The PR aims to resolve the issue: https://github.com/facebook/rocksdb/issues/3972#issue-330771918 We have a rocksdb created with leveled-compaction with multiple column families (CFs); some of the CFs use HDD to store big and less frequently accessed data and others use SSD. When there is continuous write traffic going to all CFs, the compaction thread pool is mostly occupied by the slow HDD compactions, which prevents the SSD bandwidth from being fully utilized. Since atomic writes and transactions are needed across CFs, splitting it into multiple rocksdb instances is not an option for us. With the compaction thread control, we got a 30%+ HDD write throughput gain, and also much smoother SSD writes since fewer write stalls happen. ConcurrentTaskLimiter can be shared by multiple CFs across rocksdb instances, so the feature works not only for multi-CF scenarios but also for multi-rocksdb scenarios that need disk IO resource control per tenant. The usage is straightforward: e.g.: // // Enable compaction thread limiter thru ColumnFamilyOptions // std::shared_ptr<ConcurrentTaskLimiter> ctl(NewConcurrentTaskLimiter("foo_limiter", 4)); Options options; ColumnFamilyOptions cf_opt(options); cf_opt.compaction_thread_limiter = ctl; ... // // Compaction thread limiter can be tuned or disabled on-the-fly // ctl->SetMaxOutstandingTask(12); // enlarge to 12 tasks ... ctl->ResetMaxOutstandingTask(); // disable (bypass) thread limiter ctl->SetMaxOutstandingTask(-1); // Same as above ... 
ctl->SetMaxOutstandingTask(0); // full throttle (0 task) // // Sharing compaction thread limiter among CFs (to resolve multiple storage perf issue) // std::shared_ptr<ConcurrentTaskLimiter> ctl_ssd(NewConcurrentTaskLimiter("ssd_limiter", 8)); std::shared_ptr<ConcurrentTaskLimiter> ctl_hdd(NewConcurrentTaskLimiter("hdd_limiter", 4)); Options options; ColumnFamilyOptions cf_opt_ssd1(options); ColumnFamilyOptions cf_opt_ssd2(options); ColumnFamilyOptions cf_opt_hdd1(options); ColumnFamilyOptions cf_opt_hdd2(options); ColumnFamilyOptions cf_opt_hdd3(options); // SSD CFs cf_opt_ssd1.compaction_thread_limiter = ctl_ssd; cf_opt_ssd2.compaction_thread_limiter = ctl_ssd; // HDD CFs cf_opt_hdd1.compaction_thread_limiter = ctl_hdd; cf_opt_hdd2.compaction_thread_limiter = ctl_hdd; cf_opt_hdd3.compaction_thread_limiter = ctl_hdd; ... // // The limiter is disabled by default (or set to nullptr explicitly) // Options options; ColumnFamilyOptions cf_opt(options); cf_opt.compaction_thread_limiter = nullptr; Pull Request resolved: https://github.com/facebook/rocksdb/pull/4332 Differential Revision: D13226590 Pulled By: siying fbshipit-source-id: 14307aec55b8bd59c8223d04aa6db3c03d1b0c1d	2018-12-13 13:18:28 -08:00
Maysam Yabandeh	0aa17c1002	Fix flaky test DBCompactionTest::DeleteFileRange (#4776 ) Summary: The test has been failing sporadically probably because the configured compaction options were actually unused. Verified that by the following: ``` ~/gtest-parallel/gtest-parallel ./db_compaction_test --gtest_filter=DBCompactionTest.DeleteFileRange --repeat=1000 ``` Pull Request resolved: https://github.com/facebook/rocksdb/pull/4776 Differential Revision: D13441052 Pulled By: maysamyabandeh fbshipit-source-id: d35075b9e6cef9b9c9d0d571f9cd72ade8eda55d	2018-12-12 16:32:14 -08:00
DorianZheng	4862720e08	Expose column family id to `FlushJobInfo` Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/4772 Differential Revision: D13428923 Pulled By: ajkr fbshipit-source-id: e351e9c5eea97816db25429e129357a8af90712a	2018-12-11 20:33:42 -08:00
Siying Dong	ae25546a7a	Direct I/O Close() shouldn't rewrite the last block (#4771 ) Summary: In Direct I/O case, WritableFileWriter::Close() rewrites the last block again, even if there is nothing new. The reason is that, Close() flushes the buffer. For non-direct I/O case, the buffer is empty in this case so it is a no-op. However, in direct I/O case, the partial data in the last block is kept in the buffer because it needs to be rewritten for the next write. This piece of data is flushed again. This commit fixes it by skipping this write out if `pending_sync_` flag shows that there isn't new data since last sync. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4771 Differential Revision: D13420426 Pulled By: siying fbshipit-source-id: 9d39ec9a215b1425d4ed40d85e0eba1f5daa75c6	2018-12-11 13:55:02 -08:00
Tathagata Das	49666d76cf	Fix swallowing of exception in Java RocksDB when loading native library (#4728 ) Summary: This PR fixes #4721. When an exception is caught and thrown as a different exception, then the original exception should be inserted as a cause of the new exception. This bug in RocksDB was swallowing the underlying exception from `NativeLibraryLoader` and throwing the following exception ``` ... Caused by: java.lang.RuntimeException: Unable to load the RocksDB shared libraryjava.nio.channels.ClosedByInterruptException at org.rocksdb.RocksDB.loadLibrary(RocksDB.java:67) at org.rocksdb.RocksDB.<clinit>(RocksDB.java:35) ... 73 more ``` The fix is simple and self-explanatory. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4728 Differential Revision: D13418371 Pulled By: sagar0 fbshipit-source-id: d76c25af2a83a0f8ba62cc8d7b721bfddc85fdf1	2018-12-11 12:18:44 -08:00
Abhishek Madan	cad248f5c6	Prepare FragmentedRangeTombstoneIterator for use in compaction (#4740 ) Summary: To support the flush/compaction use cases of RangeDelAggregator in v2, FragmentedRangeTombstoneIterator now supports dropping tombstones that cannot be read in the compaction output file. Furthermore, FragmentedRangeTombstoneIterator supports the "snapshot striping" use case by allowing an iterator to be split by a list of snapshots. RangeDelAggregatorV2 will use these changes in a follow-up change. In the process of making these changes, other miscellaneous cleanups were also done in these files. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4740 Differential Revision: D13287382 Pulled By: abhimadan fbshipit-source-id: f5aeb03e1b3058049b80c02a558ee48f723fa48c	2018-12-11 12:10:48 -08:00
Adam Retter	d3daa0db8b	RocksJava must compile on JDK7 (#4768 ) Summary: Fixes some RocksJava regressions recently introduced, whereby RocksJava would not build on JDK 7. These should have been visible on Travis-CI! Pull Request resolved: https://github.com/facebook/rocksdb/pull/4768 Differential Revision: D13418173 Pulled By: sagar0 fbshipit-source-id: 57bf223188887f84d9e072031af2e0d2c8a69c30	2018-12-11 11:40:23 -08:00
Sagar Vemuri	dde3ef1116	Change directory where ExternalSSTFileBasicTest runs (#4766 ) Summary: Change the directory where ExternalSSTFileBasicTest* tests run. Problem: Without this change, I spent considerable time chasing around a non-existent issue as ExternalSSTFileTest.* and ExternalSSTFileBasicTest.* create similar directories. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4766 Differential Revision: D13409384 Pulled By: sagar0 fbshipit-source-id: c33e1f4d505dfa6efbc788d6c57cdb680053ded3	2018-12-11 10:21:37 -08:00
Adam Retter	f8943ec003	Fix issues with RocksJava dropColumnFamily (#4770 ) Summary: Closes https://github.com/facebook/rocksdb/issues/4409 Pull Request resolved: https://github.com/facebook/rocksdb/pull/4770 Differential Revision: D13416802 Pulled By: ajkr fbshipit-source-id: 8a351e9b80dc9eeb6073467fbc67cd2f544917b0	2018-12-11 09:17:57 -08:00
Ben Clay	8261e0026b	Promote CompactionFilter* accessors to ColumnFamilyOptionsInterface (#3461 ) Summary: When adding CompactionFilter and CompactionFilterFactory settings to the Java layer, ColumnFamilyOptions was modified directly instead of ColumnFamilyOptionsInterface. This meant that the old-style Options monolith was left behind. This patch fixes that, by: - promoting the CompactionFilter + CompactionFilterFactory setters from ColumnFamilyOptions -> ColumnFamilyOptionsInterface - adding getters in ColumnFamilyOptionsInterface - implementing setters in Options - implementing getters in both ColumnFamilyOptions and Options - adding testcases - reusing a test CompactionFilterFactory by moving it to a common location Pull Request resolved: https://github.com/facebook/rocksdb/pull/3461 Differential Revision: D13278788 Pulled By: sagar0 fbshipit-source-id: 72602c6eb97dc80734e718abb5e2e9958d3c753b	2018-12-10 15:54:52 -08:00
Abhishek Madan	64aabc9183	Properly set smallest key of subcompaction output (#4723 ) Summary: It is possible to see a situation like the following when subcompactions are enabled: 1. A subcompaction boundary is set to `[b, e)`. 2. The first output file in a subcompaction has `c@20` as its smallest key 3. The range tombstone `[a, d)30` is encountered. 4. The tombstone is written to the range-del meta block and the new smallest key is set to `b@0` (since no keys in this subcompaction's output can be smaller than `b`). 5. A key `b@10` in a lower level will now reappear, since it is not covered by the truncated start key `b@0`. In general, unless the smallest data key in a file has a seqnum of 0, it is not safe to truncate a tombstone at the start key to have a seqnum of 0, since it can expose keys with a seqnum greater than 0 but less than the tombstone's actual seqnum. To fix this, when the lower bound of a file is from the subcompaction boundaries, we now set the seqnum of an artificially extended smallest key to the tombstone's seqnum. This is safe because subcompactions operate over disjoint sets of keys, and the subcompactions that can experience this problem are not the first subcompaction (which is unbounded on the left). Furthermore, there is now an assertion to detect the described anomalous case. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4723 Differential Revision: D13236188 Pulled By: abhimadan fbshipit-source-id: a6da6a113f2de1e2ff307ca72e055300c8fe5692	2018-12-10 12:38:31 -08:00
Adam Singer	10e7de7705	Reduce javadoc warnings (#4764 ) Summary: Compile logs have a bit of noise due to missing javadoc annotations. Updating docs to reduce. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4764 Differential Revision: D13400193 Pulled By: sagar0 fbshipit-source-id: 65c7efb70747cc3bb35a336a6881ea6536ae5ff4	2018-12-10 11:08:38 -08:00
Maysam Yabandeh	21fca397cc	Fix inline comments for assumed_tracked (#4762 ) Summary: Fix the definition of assumed_tracked in Transaction that was introduced in #4680 Pull Request resolved: https://github.com/facebook/rocksdb/pull/4762 Differential Revision: D13399150 Pulled By: maysamyabandeh fbshipit-source-id: 2a30fe49e3c44adacd7e45cd48eae95023ca9dca	2018-12-10 09:56:21 -08:00
Yanqin Jin	f307479ba6	Enable checkpoint of read-only db (#4681 ) Summary: 1. DBImplReadOnly::GetLiveFiles should not return NotSupported. Instead, it should call DBImpl::GetLiveFiles(flush_memtable=false). 2. In DBImpl::Recover, we should also recover the OPTIONS file name and/or number so that an immediate subsequent GetLiveFiles will get the correct OPTIONS name. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4681 Differential Revision: D13069205 Pulled By: riversand963 fbshipit-source-id: 3e6a0174307d06db5a01feb099b306cea1f7f88a	2018-12-07 17:06:02 -08:00
Anand Ananthabhotla	1b01d23be2	Add PerfContext counters for index/filter block cache stats (#4540 ) Summary: Add counters to track block cache index/filter hits and misses. We currently count aggregate hits and misses, which includes index/filter/data blocks. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4540 Differential Revision: D10459652 Pulled By: anand1976 fbshipit-source-id: 0c59eee7f12f5103dcb6686f0e7995babe63d425	2018-12-07 15:07:56 -08:00
Adam Retter	4048762cbe	Updated the CentOS 6 Docker build for RocksJava to a newer GCC toolchain (#4756 ) Summary: Uses a newer build toolchain but the same old GLIBC when building releases of RocksJava for Linux x64 in the Docker Container. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4756 Differential Revision: D13383575 Pulled By: sagar0 fbshipit-source-id: 27c58814876e434d5fa61395e6664cfc5f6830b1	2018-12-07 14:37:26 -08:00
Sagar Vemuri	0463f61837	Refactor BlockBasedTable::Open (#4636 ) Summary: Refactored and simplified `BlockBasedTable::Open` to be similar to `BlockBasedTableBuilder::Finish` as both these functions complement each other. Also added `BlockBasedTableBuilder::WriteFooter` along the way. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4636 Differential Revision: D12933319 Pulled By: sagar0 fbshipit-source-id: 1ff1d02f6d80a63b5ba720a1fc75e71c7344137b	2018-12-07 13:18:44 -08:00
Pengchao Wang	c41c60be13	fix tombstone collectable test (#4755 ) Summary: the original test does not give enough time difference between tombstone write time and the expire time point, which makes the test flaky. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4755 Reviewed By: maysamyabandeh Differential Revision: D13369681 Pulled By: wpc fbshipit-source-id: 22576f354c63cd0b39d8b35c3913303707503ea9	2018-12-07 10:13:54 -08:00
Maysam Yabandeh	b878f93c70	Extend Transaction::GetForUpdate with do_validate (#4680 ) Summary: Transaction::GetForUpdate is extended with a do_validate parameter with default value of true. If false it skips validating the snapshot (if there is any) before doing the read. After the read it also returns the latest value (expects the ReadOptions::snapshot to be nullptr). This allows RocksDB applications to use GetForUpdate similarly to how InnoDB does. Similarly ::Merge, ::Put, ::Delete, and ::SingleDelete are extended with assume_exclusive_tracked with default value of false. If true, it indicates that the call is assumed to be after a ::GetForUpdate(do_validate=false). The Java APIs are accordingly updated. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4680 Differential Revision: D13068508 Pulled By: maysamyabandeh fbshipit-source-id: f0b59db28f7f6a078b60844d902057140765e67d	2018-12-06 17:49:00 -08:00
Yanqin Jin	1d679e35fd	Update HISTORY.md (#4753 ) Summary: As titled. Update history to include a recent bug fix in `9be3e6b488`. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4753 Differential Revision: D13350286 Pulled By: riversand963 fbshipit-source-id: b6324780dee4cb1757bc2209403a08531c150c08	2018-12-05 16:55:58 -08:00
Yanqin Jin	9be3e6b488	Allow file-ingest-triggered flush to skip waiting for write-stall clear (#4751 ) Summary: When write stall has already been triggered due to number of L0 files reaching threshold, file ingestion must proceed with its flush without waiting for the write stall condition to be cleared by the compaction, because compaction can wait for ingestion to finish (circular wait). In order to avoid this wait, we can set `FlushOptions.allow_write_stall` to be true (default is false). Setting it to false can cause deadlock. This can happen when the number of compaction threads is low. Consider the following ``` Time compaction_thread ingestion_thread \| num_running_ingest_file_++ \| while(num_running_ingest_file_>0){wait} \| flush V ``` Pull Request resolved: https://github.com/facebook/rocksdb/pull/4751 Differential Revision: D13343037 Pulled By: riversand963 fbshipit-source-id: d3b95938814af46ec4c463feff0b50c70bd8b23f	2018-12-05 14:59:29 -08:00
Yanqin Jin	b96fccb1e6	Move a function to critical section (#4752 ) Summary: Test plan ``` $make clean && make -j32 all check ``` Pull Request resolved: https://github.com/facebook/rocksdb/pull/4752 Differential Revision: D13344705 Pulled By: riversand963 fbshipit-source-id: fc3a43174d09d70ccc2b09decd78e1da1b6ba9d1	2018-12-05 13:12:09 -08:00
anand76	e58d76955a	Fix buck dev mode fbcode builds (#4747 ) Summary: Don't enable ROCKSDB_JEMALLOC unless the build mode is opt and default allocator is jemalloc. In dev mode, this is causing compile/link errors such as - ``` stderr: buck-out/dev/gen/rocksdb/src/rocksdb_lib#compile-pic-malloc_stats.cc.o4768b59e,gcc-5-glibc-2.23-clang/db/malloc_stats.cc.o:malloc_stats.cc:function rocksdb::DumpMallocStats(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*): error: undefined reference to 'malloc_stats_print' clang-7.0: error: linker command failed with exit code 1 ``` Pull Request resolved: https://github.com/facebook/rocksdb/pull/4747 Differential Revision: D13324840 Pulled By: anand1976 fbshipit-source-id: 45ffbd4f63fe4d9e8a0473d8f066155e4ef64a14	2018-12-05 10:40:31 -08:00
Zhongyi Xie	2f1ca4e838	Revert "BaseDeltaIterator: always check valid() before accessing key(… (#4744 ) Summary: …) (#4702)" This reverts commit `3a18bb3e15`. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4744 Differential Revision: D13311869 Pulled By: miasantreble fbshipit-source-id: 6300b12cc34828d8b9274e907a3aef1506d5d553	2018-12-03 23:38:27 -08:00
Fosco Marotto	55479eb572	Update History for fast-forwarded 5.18 branch Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/4704 Differential Revision: D13283300 Pulled By: gfosco fbshipit-source-id: cb4fdaa93137e0bba64b781ba7e8fe31b19e5656	2018-11-30 16:25:09 -08:00
Zhongyi Xie	3a18bb3e15	BaseDeltaIterator: always check valid() before accessing key() (#4702 ) Summary: Current implementation of `current_over_upper_bound_` fails to take into consideration that keys might be invalid in either base iterator or delta iterator. Calling key() in such scenario will lead to assertion failure and runtime errors. This PR addresses the bug by adding check for valid keys before calling `IsOverUpperBound()`, also added test coverage for iterate_upper_bound usage in BaseDeltaIterator Also recommit https://github.com/facebook/rocksdb/pull/4656 (It was reverted earlier due to bugs) Pull Request resolved: https://github.com/facebook/rocksdb/pull/4702 Differential Revision: D13146643 Pulled By: miasantreble fbshipit-source-id: 6d136929da12d0f2e2a5cea474a8038ec5cdf1d0	2018-11-30 15:35:13 -08:00
Siying Dong	6e938c904f	Make NewBloomFilterPolicy() use full filter by default (#4735 ) Summary: Full block (use_block_based_builder=false) Bloom filter has clear CPU saving benefits but with limitation of using temp memory when building an SST file proportional to the SST file size. We reduced the chance of having large SST files with multi-level universal compaction. Now we change to a default with better performance. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4735 Differential Revision: D13266674 Pulled By: siying fbshipit-source-id: 7594a4c3e32568a5a2adce22bb0e46553e55c602	2018-11-30 13:13:27 -08:00
Zhongyi Xie	b0f3d9b478	fix unused param "options" error in jemalloc_nodump_allocator.cc (#4738 ) Summary: Currently tests are failing on master with the following message: > util/jemalloc_nodump_allocator.cc:132:8: error: unused parameter ‘options’ [-Werror=unused-parameter] Status NewJemallocNodumpAllocator( This PR attempts to fix the issue Pull Request resolved: https://github.com/facebook/rocksdb/pull/4738 Differential Revision: D13278804 Pulled By: miasantreble fbshipit-source-id: 64a6204aa685bd85d8b5080655cafef9980fac2f	2018-11-30 12:08:55 -08:00
Maysam Yabandeh	f1b0841f06	WritePrepared: followup fix for snapshot double release issue (#4734 ) Summary: The fix in #4727 for double snapshot release was incomplete since it does not properly remove the duplicate entries in the snapshot list after finding that a snapshot is still valid. The patch does that and also improves the unit test to show the issue. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4734 Differential Revision: D13266260 Pulled By: maysamyabandeh fbshipit-source-id: 351e2c40cca45a87b757774c11af74182314911e	2018-11-29 21:01:57 -08:00
Yi Wu	cf1df5d3cb	JemallocNodumpAllocator: option to limit tcache memory usage (#4736 ) Summary: Add option to limit tcache usage by allocation size. This is to reduce total tcache size in case there are many user threads accessing the allocator and incur non-trivial memory usage. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4736 Differential Revision: D13269305 Pulled By: yiwu-arbug fbshipit-source-id: 95a9b7fc67facd66837c849137e30e137112e19d	2018-11-29 17:33:40 -08:00
Sagar Vemuri	70645355ad	Move FIFOCompactionPicker to a separate file (#4724 ) Summary: Summary: Simplified the code layout by moving FIFOCompactionPicker to a separate file. Why?: While trying to add ttl functionality to universal compaction, I found that `FIFOCompactionPicker` class and its impl methods to be interspersed between `LevelCompactionPicker` methods which kind-of made the code a little hard to traverse. So I moved `FIFOCompactionPicker` to a separate compaction_picker_fifo.h/cc file, similar to `UniversalCompactionPicker`. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4724 Differential Revision: D13227914 Pulled By: sagar0 fbshipit-source-id: 89471766ea67fa4d87664a41c057dd7df4b3d4e3	2018-11-29 16:04:52 -08:00
Yanqin Jin	8d7bc76f36	Fix a flaky test DBFlushTest.SyncFail (#4633 ) Summary: There is a race condition in DBFlushTest.SyncFail, as illustrated below. ``` time thread1 bg_flush_thread \| Flush(wait=false, cfd) \| refs_before=cfd->current()->TEST_refs() PickMemtable calls cfd->current()->Ref() V ``` The race condition between thread1 getting the ref count of cfd's current version and bg_flush_thread incrementing the cfd's current version makes it possible for later assertion on refs_before to fail. Therefore, we add test sync points to enforce the order and assert on the ref count before and after PickMemtable is called in bg_flush_thread. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4633 Differential Revision: D12967131 Pulled By: riversand963 fbshipit-source-id: a99d2bacb7869ec5d8d03b24ef2babc0e6ae1a3b	2018-11-29 13:39:56 -08:00
Kefu Chai	7dbee38716	db/repair: reset Repair::db_lock_ in ctor (#4683 ) Summary: there is a chance that * the caller tries to repair the db when holding the db_lock, in that case the env implementation might not set the `lock` parameter of Repairer::Run(). * the caller somehow never calls Repairer::Run(). either way, the destructor of Repair will compare the uninitialized db_lock_ with nullptr, and try to unlock it. there is a good chance that the db_lock_ is not nullptr, then boom. Signed-off-by: Kefu Chai <tchaikov@gmail.com> Pull Request resolved: https://github.com/facebook/rocksdb/pull/4683 Differential Revision: D13260287 Pulled By: riversand963 fbshipit-source-id: 878a119d2e9f10a0fa17ee62cf3fb24b33d49fa5	2018-11-29 11:26:41 -08:00
anand76	8d9b4d9741	Fix failure of sst_file_reader_test in LITE mode regression test (#4725 ) Summary: Add a dummy main() in sst_file_reader_test for ROCKSDB_LITE to fix link failure in regression Pull Request resolved: https://github.com/facebook/rocksdb/pull/4725 Differential Revision: D13252885 Pulled By: anand1976 fbshipit-source-id: 0e22b964815e2bf01aff7d03ed4ae59d44fa86f1	2018-11-29 10:51:41 -08:00
Maysam Yabandeh	1a5a93ff74	WritePrepared: Fix double snapshot release issue (#4727 ) Summary: Currently the garbage collection of items in old_commit_map_ was done upon ::ReleaseSnapshot. The assumption behind this method was that the sequence numbers of snapshots are unique, which is incorrect. In the very rare cases that two consecutive snapshots have the same sequence number, this could lead to the release of the first snapshot affecting the old_commit_map_ that is necessary to service the reads of the second snapshot. The bug would be triggered only if i) two snapshots have the same seq, ii) both of them are very old (older than the last ~4m transactions), and iii) there is a commit entry overlapping with the snapshot seq number. It is fixed by doing the cleanup of old_commit_map_ in UpdateSnapshot: the new list of snapshots is compared with the old one and the missing sequence numbers are concluded released. If two snapshots have the same seq number, after the release of one of them, the seq number still appears in the snapshot list and thus is not cleaned up prematurely. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4727 Differential Revision: D13246495 Pulled By: maysamyabandeh fbshipit-source-id: 93b87a5042afd8060889df245526d3f5d29de9fe	2018-11-28 19:03:31 -08:00
Yi Wu	512a5e3ef8	Fix BlockBasedTable not always using memory allocator if available (#4678 ) Summary: Fix block based table reader not using memory_allocator when allocating index blocks and compression dictionary blocks. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4678 Differential Revision: D13054594 Pulled By: yiwu-arbug fbshipit-source-id: 379f25bcc665395662511c4f873f4b7b55104ce2	2018-11-28 18:01:24 -08:00
Abhishek Madan	8fe1e06ca0	Clean up FragmentedRangeTombstoneList (#4692 ) Summary: Removed `one_time_use` flag, which removed the need for some tests, and changed all `NewRangeTombstoneIterator` methods to return `FragmentedRangeTombstoneIterators`. These changes also led to removing `RangeDelAggregatorV2::AddUnfragmentedTombstones` and one of the `MemTableListVersion::AddRangeTombstoneIterators` methods. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4692 Differential Revision: D13106570 Pulled By: abhimadan fbshipit-source-id: cbab5432d7fc2d9cdfd8d9d40361a1bffaa8f845	2018-11-28 15:29:02 -08:00
Zhichao Cao	7125e24619	Add the max trace file size limitation option to Tracing (#4610 ) Summary: If the user does not end the trace manually, the tracing will continue, which can potentially use up all the storage space and cause problems. In this PR, the max trace file size is added to the TraceOptions; users can set the value if needed, otherwise the default is 64GB. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4610 Differential Revision: D12893400 Pulled By: zhichao-cao fbshipit-source-id: acf4b5a6076bb691778bdfbac4864e1006758953	2018-11-27 14:27:05 -08:00
Sagar Vemuri	c94f073e5e	Fix Mac build break in casting (#4722 ) Summary: Mac build is failing with the below error: ``` $ make db_bench -j8 ... ... tools/db_bench_tool.cc:4583:25: error: no matching function for call to 'max' (uint64_t)std::max(0l, seek_pos - FLAGS_max_scan_distance), ^~~~~~~~ /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/include/c++/v1/algorithm:2717:1: note: candidate template ignored: deduced conflicting types for parameter '_Tp' ('long' vs. 'long long') max(const _Tp& __a, const _Tp& __b) ^ /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/include/c++/v1/algorithm:2727:1: note: candidate template ignored: could not match 'initializer_list<type-parameter-0-0>' against 'long' max(initializer_list<_Tp> __t, _Compare __comp) ^ /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/include/c++/v1/algorithm:2709:1: note: candidate function template not viable: requires 3 arguments, but 2 were provided max(const _Tp& __a, const _Tp& __b, _Compare __comp) ^ /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/include/c++/v1/algorithm:2735:1: note: candidate function template not viable: requires single argument '__t', but 2 arguments were provided max(initializer_list<_Tp> __t) ^ 1 error generated. make: *** [tools/db_bench_tool.o] Error 1 ``` My compiler version: Mac OS X Mojave ``` $ clang++ --version Apple LLVM version 10.0.0 (clang-1000.11.45.5) Target: x86_64-apple-darwin18.2.0 Thread model: posix InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin ``` Pull Request resolved: https://github.com/facebook/rocksdb/pull/4722 Differential Revision: D13220196 Pulled By: sagar0 fbshipit-source-id: 01e5e928288a5613027c83a26ad8aedf04438b14	2018-11-27 13:30:16 -08:00
Huachao Huang	5e72bc113a	Add SstFileReader to read sst files (#4717 ) Summary: A user friendly sst file reader is useful when we want to access sst files outside of RocksDB. For example, we can generate an sst file with SstFileWriter and send it to other places, then use SstFileReader to read the file and process the entries in other ways. Also rename the original SstFileReader to SstFileDumper because of name conflict, and seems SstFileDumper is more appropriate for tools. TODO: there is only a very simple test now, because I want to get some feedback first. If the changes look good, I will add more tests soon. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4717 Differential Revision: D13212686 Pulled By: ajkr fbshipit-source-id: 737593383264c954b79e63edaf44aaae0d947e56	2018-11-27 13:02:23 -08:00
Adam Singer	3fa80f0e85	Remove enable_internal_stats (#4714 ) Summary: Simple patch to address comments in [statistics.h#L65](https://github.com/facebook/rocksdb/blob/master/monitoring/statistics.h#L65\|statistics.h#L65) `TODO(ajkr): clean this up since there are no internal stats anymore` Pull Request resolved: https://github.com/facebook/rocksdb/pull/4714 Differential Revision: D13208093 Pulled By: ajkr fbshipit-source-id: 4468badb850592411147539f859082644f5296f6	2018-11-27 12:58:58 -08:00
Abhishek Madan	e76448185c	Remove DeleteRange experimental comment (#4709 ) Summary: DeleteRange is now ready for production use. Change the header comment to reflect this, and update HISTORY.md with the feature's status. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4709 Differential Revision: D13209055 Pulled By: abhimadan fbshipit-source-id: 65423eb1a4927cf593c38254cd87c322f73ae137	2018-11-27 11:11:35 -08:00
Adam Singer	1db4a096d4	Test mapping of Histograms and HistogramsNameMap (#4720 ) Summary: Adding sanity check test for mapping of `Histograms` and `HistogramsNameMap` ``` [==========] Running 2 tests from 1 test case. [----------] Global test environment set-up. [----------] 2 tests from StatisticsTest [ RUN ] StatisticsTest.SanityTickers [ OK ] StatisticsTest.SanityTickers (0 ms) [ RUN ] StatisticsTest.SanityHistograms [ OK ] StatisticsTest.SanityHistograms (0 ms) [----------] 2 tests from StatisticsTest (0 ms total) [----------] Global test environment tear-down [==========] 2 tests from 1 test case ran. (0 ms total) [ PASSED ] 2 tests. ``` Pull Request resolved: https://github.com/facebook/rocksdb/pull/4720 Differential Revision: D13217061 Pulled By: ajkr fbshipit-source-id: 6427f4e684c36b2f3c3440808b74fee86a364683	2018-11-27 10:48:30 -08:00
Sagar Vemuri	a2dec2ed08	Fix Java to C++ ticker conversions (#4719 ) Summary: Added back `NO_ITERATORS` and moved `NO_ITERATOR_CREATED` to the end of `toCppTickers`. This is a leftover fix which is needed in addition to `a138e351bc` to correctly convert java tickers to c++ tickers. `a138e351bc` only updated `toJavaTickerType` but both `toJavaTickerType` and `toCppTickers` need to be changed. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4719 Differential Revision: D13208847 Pulled By: sagar0 fbshipit-source-id: 53a42f3d6ffe04034acfde972d73040b92b4c1af	2018-11-27 10:17:07 -08:00
Po-Chuan Hsieh	60deb4485e	Fix build with ROCKSDB_LITE and -Wunused-private-field (#4715 ) Summary: The error message of databases/rocksdb-lite (FreeBSD port) is as follows: ``` tools/db_bench_tool.cc:1976:16: error: private field 'trace_options_' is not used [-Werror,-Wunused-private-field] TraceOptions trace_options_; ^ ``` Pull Request resolved: https://github.com/facebook/rocksdb/pull/4715 Differential Revision: D13207902 Pulled By: ajkr fbshipit-source-id: be3c612eba656aeddb77e35e2f201dd25dc92f7e	2018-11-26 21:35:38 -08:00
Soli	f1837595a3	FIX #3278 : Move global const object definitions from .h to .cc (#4691 ) Summary: Summary We should declare constants in headers and define them in source files. But this commit is only aimed at compound types. I don't know if it is necessary to do the same thing to fundamental types. I used this command to find all of the constant definitions in header files. `find . -name "*.h" | xargs grep -e "^const .*=.*"` And here is what I found: ``` ./db/version_edit.h:const uint64_t kFileNumberMask = 0x3FFFFFFFFFFFFFFF; ./include/rocksdb/env.h:const size_t kDefaultPageSize = 4 * 1024; ./include/rocksdb/statistics.h:const std::vector<std::pair<Tickers, std::string>> TickersNameMap = { ./include/rocksdb/statistics.h:const std::vector<std::pair<Histograms, std::string>> HistogramsNameMap = { ./include/rocksdb/table.h:const uint32_t kPlainTableVariableLength = 0; ./include/rocksdb/utilities/transaction_db.h:const uint32_t kInitialMaxDeadlocks = 5; ./port/port_posix.h:const uint32_t kMaxUint32 = std::numeric_limits<uint32_t>::max(); ./port/port_posix.h:const int kMaxInt32 = std::numeric_limits<int32_t>::max(); ./port/port_posix.h:const uint64_t kMaxUint64 = std::numeric_limits<uint64_t>::max(); ./port/port_posix.h:const int64_t kMaxInt64 = std::numeric_limits<int64_t>::max(); ./port/port_posix.h:const size_t kMaxSizet = std::numeric_limits<size_t>::max(); ./port/win/port_win.h:const uint32_t kMaxUint32 = UINT32_MAX; ./port/win/port_win.h:const int kMaxInt32 = INT32_MAX; ./port/win/port_win.h:const int64_t kMaxInt64 = INT64_MAX; ./port/win/port_win.h:const uint64_t kMaxUint64 = UINT64_MAX; ./port/win/port_win.h:const size_t kMaxSizet = UINT64_MAX; ./port/win/port_win.h:const size_t kMaxSizet = UINT_MAX; ./port/win/port_win.h:const uint32_t kMaxUint32 = std::numeric_limits<uint32_t>::max(); ./port/win/port_win.h:const int kMaxInt32 = std::numeric_limits<int>::max(); ./port/win/port_win.h:const uint64_t kMaxUint64 = std::numeric_limits<uint64_t>::max(); ./port/win/port_win.h:const int64_t kMaxInt64 = std::numeric_limits<int64_t>::max(); ./port/win/port_win.h:const size_t kMaxSizet = std::numeric_limits<size_t>::max(); ./port/win/port_win.h:const bool kLittleEndian = true; ./table/cuckoo_table_factory.h:const uint32_t kCuckooMurmurSeedMultiplier = 816922183; ./table/data_block_hash_index.h:const uint8_t kNoEntry = 255; ./table/data_block_hash_index.h:const uint8_t kCollision = 254; ./table/data_block_hash_index.h:const uint8_t kMaxRestartSupportedByHashIndex = 253; ./table/data_block_hash_index.h:const size_t kMaxBlockSizeSupportedByHashIndex = 1u << 16; ./table/data_block_hash_index.h:const double kDefaultUtilRatio = 0.75; ./table/filter_block.h:const uint64_t kNotValid = ULLONG_MAX; ./table/format.h:const int kMagicNumberLengthByte = 8; ./third-party/fbson/FbsonJsonParser.h:const char* const kJsonDelim = " ,]}\t\r\n"; ./third-party/fbson/FbsonJsonParser.h:const char* const kWhiteSpace = " \t\n\r"; ./third-party/gtest-1.7.0/fused-src/gtest/gtest.h:const BiggestInt kMaxBiggestInt = ./third-party/gtest-1.7.0/fused-src/gtest/gtest.h:const char kDeathTestStyleFlag[] = "death_test_style"; ./third-party/gtest-1.7.0/fused-src/gtest/gtest.h:const char kDeathTestUseFork[] = "death_test_use_fork"; ./third-party/gtest-1.7.0/fused-src/gtest/gtest.h:const char kInternalRunDeathTestFlag[] = "internal_run_death_test"; ./third-party/gtest-1.7.0/fused-src/gtest/gtest.h:const char* pets[] = {"cat", "dog"}; ./third-party/gtest-1.7.0/fused-src/gtest/gtest.h:const size_t kProtobufOneLinerMaxLength = 50; ./third-party/gtest-1.7.0/fused-src/gtest/gtest.h:const int kMaxStackTraceDepth = 100; ./third-party/gtest-1.7.0/fused-src/gtest/gtest.h:const T* WithParamInterface<T>::parameter_ = NULL; ./util/coding.h:const unsigned int kMaxVarint64Length = 10; ./util/filename.h:const size_t kFormatFileNumberBufSize = 38; ./util/testutil.h:const SliceTransform* RandomSliceTransform(Random* rnd, int pre_defined = -1); ./util/trace_replay.h:const std::string kTraceMagic = "feedcafedeadbeef"; ./util/trace_replay.h:const unsigned int kTraceTimestampSize = 8; ./util/trace_replay.h:const unsigned int kTraceTypeSize = 1; ./util/trace_replay.h:const unsigned int kTracePayloadLengthSize = 4; ./util/trace_replay.h:const unsigned int kTraceMetadataSize = ./utilities/cassandra/serialize.h:const int64_t kCharMask = 0xFFLL; ./utilities/cassandra/serialize.h:const int32_t kBitsPerByte = 8; ``` And these 3 lines are related to this commit: ``` ./include/rocksdb/statistics.h:const std::vector<std::pair<Tickers, std::string>> TickersNameMap = { ./include/rocksdb/statistics.h:const std::vector<std::pair<Histograms, std::string>> HistogramsNameMap = { ./util/trace_replay.h:const std::string kTraceMagic = "feedcafedeadbeef"; ``` Any comments would be appreciated. Thanks. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4691 Differential Revision: D13208049 Pulled By: ajkr fbshipit-source-id: e5ee55fdaec5447fc5798c6721e2821e7cdc0d5b	2018-11-26 21:32:03 -08:00

7653 Commits