Commit Graph

5918 Commits

Author SHA1 Message Date
Maysam Yabandeh
69d5262c81 Two-level Indexes
Summary:
Partition Index blocks and use a Partition-index as a 2nd level index.

The two-level index can be used by setting
BlockBasedTableOptions::kTwoLevelIndexSearch as the index type and
configuring BlockBasedTableOptions::index_per_partition

t15539501
Closes https://github.com/facebook/rocksdb/pull/1814

Differential Revision: D4473535

Pulled By: maysamyabandeh

fbshipit-source-id: bffb87e
2017-02-06 16:39:12 -08:00
Dmitri Smirnov
0a4cdde50a Windows thread
Summary:
introduce new methods into a public threadpool interface,
- allow submission of std::functions as they allow greater flexibility.
- add Joining methods to the implementation to join scheduled and submitted jobs with
  an option to cancel jobs that did not start executing.
- Remove ugly `#ifdefs` between pthread and std implementation, make it uniform.
- introduce pimpl for a drop in replacement of the implementation
- Introduce rocksdb::port::Thread typedef which is a replacement for std::thread.  On Posix Thread defaults as before std::thread.
- Implement WindowsThread that allocates memory in a more controllable manner than windows std::thread with a replaceable implementation.
- should be no functionality changes.
Closes https://github.com/facebook/rocksdb/pull/1823

Differential Revision: D4492902

Pulled By: siying

fbshipit-source-id: c74cb11
2017-02-06 14:54:18 -08:00
Vitaliy Liptchinsky
1aaa898cf1 Adding GetApproximateMemTableStats method
Summary:
Added method that returns approx num of entries as well as size for memtables.
Closes https://github.com/facebook/rocksdb/pull/1841

Differential Revision: D4511990

Pulled By: VitaliyLi

fbshipit-source-id: 9a4576e
2017-02-06 14:54:16 -08:00
Rhys Parry
9fc23c55f2 Use gcc-4.9-glibc-2.20-fb python in precommit_checker
Summary:
The gcc-4.8.1-glibc-2.17 platform is deprecated and will be removed
soon.
Closes https://github.com/facebook/rocksdb/pull/1839

Differential Revision: D4509684

Pulled By: siying

fbshipit-source-id: 3efe296
2017-02-03 13:39:18 -08:00
Andrew Kryczka
b797e42157 Dump compression dictionary meta-block
Summary:
make sst_dump print size/contents of the dictionary meta-block for easier debugging
Closes https://github.com/facebook/rocksdb/pull/1837

Differential Revision: D4506399

Pulled By: ajkr

fbshipit-source-id: b9bf668
2017-02-03 12:39:16 -08:00
Siying Dong
036d668b19 Fix wrong result in data race case related to Get()
Summary:
In theory, Get() can get a wrong result, if it races in a special with with flush. The bug can be reproduced in DBTest2.GetRaceFlush. Fix this bug by getting snapshot after referencing the super version.
Closes https://github.com/facebook/rocksdb/pull/1816

Differential Revision: D4475958

Pulled By: siying

fbshipit-source-id: bd9e67a
2017-02-03 11:39:15 -08:00
Islam AbdelRahman
574b543f80 Rename merger.h -> merging_iterator.h
Summary:
merger.h was always a confusing name for me, simply give the file a better name
Closes https://github.com/facebook/rocksdb/pull/1836

Differential Revision: D4505357

Pulled By: IslamAbdelRahman

fbshipit-source-id: 07b28d8
2017-02-02 16:54:19 -08:00
Dmitri Smirnov
add8b50cc9 Move ThreadLocal implementation into .cc
Summary: Closes https://github.com/facebook/rocksdb/pull/1829

Differential Revision: D4502314

Pulled By: siying

fbshipit-source-id: f46fac1
2017-02-02 14:09:12 -08:00
Siying Dong
71d2496afc Fix arc setting for Facebook internal tools
Summary: Closes https://github.com/facebook/rocksdb/pull/1834

Differential Revision: D4503469

Pulled By: siying

fbshipit-source-id: 761bfa6
2017-02-02 13:24:16 -08:00
Siying Dong
f289d9f4ac Fix OSX build break after the fallocate change
Summary:
The recent update about fallocate failed OSX build. Fix it.
Closes https://github.com/facebook/rocksdb/pull/1830

Differential Revision: D4500235

Pulled By: siying

fbshipit-source-id: a5f2b40
2017-02-02 10:39:11 -08:00
Siying Dong
4a3e7d320c Change the default of delayed slowdown value to 16MB/s
Summary:
Change the default of delayed slowdown value to 16MB/s and further increase the L0 stop condition to 36 files.
Closes https://github.com/facebook/rocksdb/pull/1821

Differential Revision: D4489229

Pulled By: siying

fbshipit-source-id: 1003981
2017-02-01 20:39:17 -08:00
Siying Dong
0513e21f9b RangeSync() should work with ROCKSDB_FALLOCATE_PRESENT not set
Summary: Closes https://github.com/facebook/rocksdb/pull/1824

Differential Revision: D4493862

Pulled By: siying

fbshipit-source-id: c168446
2017-02-01 10:24:20 -08:00
Islam AbdelRahman
8b369ae5bd Cleaner default options using C++11 in-class init
Summary:
C++11 in-class initialization is cleaner and makes it the default more explicit to our users and more visible.
Use it for ColumnFamilyOptions and DBOptions
Closes https://github.com/facebook/rocksdb/pull/1822

Differential Revision: D4490473

Pulled By: IslamAbdelRahman

fbshipit-source-id: c493a87
2017-01-31 18:09:15 -08:00
Islam AbdelRahman
ec79a7b53c Dedup code in option.cc and db_options.cc
Summary:
The code in DBOptions::Dump is simply a duplicate of the code in ImmutableDBOptions::Dump and MutableDBOptions.Dump

consolidate duplicate code.

tested visually
Closes https://github.com/facebook/rocksdb/pull/1818

Differential Revision: D4486710

Pulled By: IslamAbdelRahman

fbshipit-source-id: 7085189
2017-01-31 17:39:12 -08:00
oranagra
b96372dead improving the C wrapper
Summary:
- rocksdb_property_int (so that we don't have to parse strings)
- and rocksdb_set_options (to allow controlling options via strings)
- a few other missing options exposed
- a documentation comment fix
Closes https://github.com/facebook/rocksdb/pull/1793

Differential Revision: D4456569

Pulled By: yiwu-arbug

fbshipit-source-id: 9f1fac1
2017-01-27 17:39:16 -08:00
Siying Dong
04c4ec41d1 Change corruption_test to use 4 bits.
Summary:
In the patch which LRU cache was made use dynamic shard bits, I changed to 2 shard bits to make the test happy. Look like it is occasionally still unhappy. Change it to 4 shard bits.
Closes https://github.com/facebook/rocksdb/pull/1815

Differential Revision: D4475849

Pulled By: siying

fbshipit-source-id: 575ff00
2017-01-27 11:24:16 -08:00
Siying Dong
2d75cd40d3 NewLRUCache() to pick number of shard bits based on capacity if not given
Summary:
If the users use the NewLRUCache() without passing in the number of shard bits, instead of using hard-coded 6, we'll determine it based on capacity.
Closes https://github.com/facebook/rocksdb/pull/1584

Differential Revision: D4242517

Pulled By: siying

fbshipit-source-id: 86b0f18
2017-01-27 06:39:12 -08:00
Siying Dong
f25f1ec60b Add test DBTest2.GetRaceFlush which can expose a data race bug
Summary:
A current data race issue in Get() and Flush() can cause a Get() to return wrong results when a flush happened in the middle. Disable the test for now.
Closes https://github.com/facebook/rocksdb/pull/1813

Differential Revision: D4472310

Pulled By: siying

fbshipit-source-id: 5755ebd
2017-01-26 16:39:14 -08:00
Andrew Kryczka
37d4a79e99 Deserialize custom Statistics object in db_bench
Summary:
Added -statistics_string to deserialize a Statistics object using the factory functions registered by applications.
Closes https://github.com/facebook/rocksdb/pull/1812

Differential Revision: D4469811

Pulled By: ajkr

fbshipit-source-id: 2d80862
2017-01-26 15:24:16 -08:00
Andrew Kryczka
3b35134e4b Avoid cache lookups for range deletion meta-block
Summary:
I added the Cache::Ref() function a couple weeks ago (#1761) to make this feature possible. Like other meta-blocks, rep_->range_del_entry holds a cache handle to pin the range deletion block in uncompressed block cache for the duration of the table reader's lifetime. We can reuse this cache handle to create an iterator over this meta-block without any cache lookup. Ref() is used to increment the cache handle's refcount in case the returned iterator outlives the table reader.
Closes https://github.com/facebook/rocksdb/pull/1801

Differential Revision: D4458782

Pulled By: ajkr

fbshipit-source-id: 2883f10
2017-01-26 11:24:13 -08:00
Andrew Kryczka
94a0c32e73 Fix LRU Ref() for handles with external references only
Summary:
For case !handle->InCache() && handle->refs >= 1 (the third case mentioned in lru_cache.h), the key was overwritten by Insert(). In this case, the refcount can still be incremented, and the cache handle will never enter LRU list. Fix Ref() logic for this case.
Closes https://github.com/facebook/rocksdb/pull/1808

Differential Revision: D4467656

Pulled By: ajkr

fbshipit-source-id: c0784d8
2017-01-26 10:54:15 -08:00
Andrew Kryczka
17c1180603 Generalize Env registration framework
Summary:
The Env registration framework supports registering client Envs and selecting which one to instantiate according to a text field. This enabled things like adding the -env_uri argument to db_bench, so the same binary could be reused with different Envs just by changing CLI config.

Now this problem has come up again in a non-Env context, as I want to instantiate a client Statistics implementation from db_bench, which is configured entirely via text parameters. Also, in the future we may wish to use it for deserializing client objects when loading OPTIONS file.

This diff generalizes the Env registration logic to work with arbitrary types.

- Generalized registration and instantiation code by templating them
- The entire implementation is in a header file as that's Google style guide's recommendation for template definitions
- Pattern match with std::regex_match rather than checking prefix, which was the previous behavior
- Rename functions/files to be non-Env-specific
Closes https://github.com/facebook/rocksdb/pull/1776

Differential Revision: D4421933

Pulled By: ajkr

fbshipit-source-id: 34647d1
2017-01-25 16:09:14 -08:00
sdong
07dddd5f7e EnvPosixTestWithParam should wait for all threads to finish
Summary:
If we don't wait for the threads to finish after each run, the thread queue may not be empty while the next test starts to run, which can cause unexpected behaviors.

Also make some of the relaxed read/write more restrict.
Closes https://github.com/facebook/rocksdb/pull/1590

Reviewed By: AsyncDBConnMarkedDownDBException

Differential Revision: D4245922

Pulled By: AsyncDBConnMarkedDownDBException

fbshipit-source-id: f83b74b
2017-01-25 15:54:13 -08:00
sdong
5dad9d6d28 Avoid logs_ operation out of DB mutex
Summary:
logs_.back() is called out of DB mutex, which can cause data race. We move the access into the DB mutex protection area.
Closes https://github.com/facebook/rocksdb/pull/1774

Reviewed By: AsyncDBConnMarkedDownDBException

Differential Revision: D4417472

Pulled By: AsyncDBConnMarkedDownDBException

fbshipit-source-id: 2da1f1e
2017-01-25 15:54:13 -08:00
Islam AbdelRahman
a7b13919bf Fix CompactFiles() bug when used with CompactionFilter using SuperVersion
Summary:
GetAndRefSuperVersion() should not be called again in the same thread before ReturnAndCleanupSuperVersion() is called.

If we have a compaction filter that is using DB::Get, This will happen
```
CompactFiles() {
  GetAndRefSuperVersion() // -- first call
    ..
    CompactionFilter() {
      GetAndRefSuperVersion() // -- second call
      ReturnAndCleanupSuperVersion()
    }
    ..
  ReturnAndCleanupSuperVersion()
}
```

We solve this issue in the same way Iterator is solving it, but using GetReferencedSuperVersion()

This was discovered in https://github.com/facebook/mysql-5.6/issues/427 by alxyang
Closes https://github.com/facebook/rocksdb/pull/1803

Differential Revision: D4460155

Pulled By: IslamAbdelRahman

fbshipit-source-id: 5e54322
2017-01-25 14:09:13 -08:00
Andrew Kryczka
616a1464ea Fix DeleteRange including sentinels in output files
Summary:
when writing RangeDelAggregator::AddToBuilder, I forgot that there are sentinel tombstones in the middle of the interval map since gaps between real tombstones are represented with sentinels.

blame: #1614
Closes https://github.com/facebook/rocksdb/pull/1804

Differential Revision: D4460426

Pulled By: ajkr

fbshipit-source-id: 69444b5
2017-01-25 11:09:12 -08:00
Zongzhi Chen
c918c4b76a Update USERS.md add user Pika
Summary:
add rocksdb users pika
Closes https://github.com/facebook/rocksdb/pull/1805

Differential Revision: D4462809

Pulled By: siying

fbshipit-source-id: 7289886
2017-01-25 10:39:17 -08:00
Islam AbdelRahman
03ca2ac8a9 Remove function from DBImpl that are not used anywhere
Summary:
GetAndRefSuperVersionUnlocked
ReturnAndCleanupSuperVersionUnlocked
GetColumnFamilyHandleUnlocked

Are dead code that are not used any where
Closes https://github.com/facebook/rocksdb/pull/1802

Differential Revision: D4459948

Pulled By: IslamAbdelRahman

fbshipit-source-id: 30fa89d
2017-01-24 19:24:13 -08:00
Andrew Kryczka
b0029bc7fa Test merge op covered by range deletion in memtable
Summary:
It's a test case for #1797. Also got rid of kTypeDeletion in the conditional since we treat it the same as kTypeRangeDeletion.
Closes https://github.com/facebook/rocksdb/pull/1800

Differential Revision: D4451300

Pulled By: ajkr

fbshipit-source-id: b39dda1
2017-01-24 13:39:11 -08:00
Andrew Kryczka
d438e1ec17 Test range deletion block outlives table reader
Summary:
This test ensures RangeDelAggregator can still access blocks even if it outlives the table readers that created them (detailed description in comments).

I plan to optimize away the extra cache lookup we currently do in BlockBasedTable::NewRangeTombstoneIterator(), as it is ~5% CPU in my random read benchmark in a database with 1k tombstones. This test will help make sure nothing breaks in the process.
Closes https://github.com/facebook/rocksdb/pull/1739

Differential Revision: D4375954

Pulled By: ajkr

fbshipit-source-id: aef9357
2017-01-24 13:24:14 -08:00
Arun Sharma
fba726e555 Version librocksdb.so
Summary:
After make install, I see a directory hierarchy that looks like

```
./usr
./usr/include
./usr/include/rocksdb
./usr/include/rocksdb/filter_policy.h
[..]
./usr/include/rocksdb/iterator.h
./usr/include/rocksdb/utilities
./usr/include/rocksdb/utilities/ldb_cmd_execute_result.h
./usr/include/rocksdb/utilities/lua
./usr/include/rocksdb/utilities/lua/rocks_lua_custom_library.h
./usr/include/rocksdb/utilities/lua/rocks_lua_util.h
./usr/include/rocksdb/utilities/lua/rocks_lua_compaction_filter.h
./usr/include/rocksdb/utilities/backupable_db.h
[..]
./usr/include/rocksdb/utilities/env_registry.h
[..]
./usr/include/rocksdb/env.h
./usr/lib64
./usr/lib64/librocksdb.so.5
./usr/lib64/librocksdb.so.5.0.0
./usr/lib64/librocksdb.so
./usr/lib64/librocksdb.a
```
Closes https://github.com/facebook/rocksdb/pull/1798

Differential Revision: D4456536

Pulled By: yiwu-arbug

fbshipit-source-id: 5494e91
2017-01-24 11:09:10 -08:00
Andrew Kryczka
9da4d542fe Range deletions unsupported in tailing iterator
Summary:
change the iterator status to NotSupported as soon as a range tombstone
is encountered by a ForwardIterator.
Closes https://github.com/facebook/rocksdb/pull/1593

Differential Revision: D4246294

Pulled By: ajkr

fbshipit-source-id: aef9f49
2017-01-23 13:39:12 -08:00
Hyeonseok Oh
f2b4939da4 fixed typo
Summary:
I fixed exisit -> exist
Closes https://github.com/facebook/rocksdb/pull/1799

Differential Revision: D4451466

Pulled By: yiwu-arbug

fbshipit-source-id: b447c3a
2017-01-23 12:54:13 -08:00
yinqiwen
973f1b78fd memtable: delete merge value for range deleteion
Summary: Closes https://github.com/facebook/rocksdb/pull/1797

Differential Revision: D4448004

Pulled By: ajkr

fbshipit-source-id: 3ffc27c
2017-01-23 12:24:14 -08:00
jsteemann
aebfd1703b fix non-portable behavior in encoder
Summary:
using ~0UL for mask uses a uint32_t at least in MSVC, but a uint64_t is required for it to work properly
Closes https://github.com/facebook/rocksdb/pull/1777

Differential Revision: D4444004

Pulled By: yiwu-arbug

fbshipit-source-id: 057cc42
2017-01-20 16:39:22 -08:00
Vitaliy Liptchinsky
753ff84a3d Fix get approx size
Summary:
Fixing GetApproximateSize bug for the case of computing stats for mem tables only.
Closes https://github.com/facebook/rocksdb/pull/1795

Differential Revision: D4445507

Pulled By: IslamAbdelRahman

fbshipit-source-id: 3905846
2017-01-20 15:54:12 -08:00
Arun Sharma
d7ea44f2f9 Fixup a couple of builds errors on Linux.
Summary:
The libraries produced on linux are now named
librocksdb.a
librocksdb.so

Other fixes:

* Also link with -lrt to avoid linker errors.
* Generalize comments at the top to include Linux
* Move -lgtest before -lpthread to avoid linker errors
* move add_subdirectory(tools) to the end so it picks up
  the right libraries
Closes https://github.com/facebook/rocksdb/pull/1364

Differential Revision: D4444138

Pulled By: yiwu-arbug

fbshipit-source-id: f0e2c19
2017-01-20 13:24:13 -08:00
Jay Lee
537da370da c: allow set savepoint to writebatch
Summary:
Allow set SavePoint to WriteBatch in C ABI.
Closes https://github.com/facebook/rocksdb/pull/1698

Differential Revision: D4378556

Pulled By: yiwu-arbug

fbshipit-source-id: afca746
2017-01-20 13:24:13 -08:00
Min Wei
af6ec4d78e fix batchresult handle leak
Summary:
This is related to PR https://github.com/facebook/rocksdb/pull/1642

Sorry about this extra PR, as my workflow was messed up when I checked in another work item by accident.
Closes https://github.com/facebook/rocksdb/pull/1681

Differential Revision: D4379513

Pulled By: yiwu-arbug

fbshipit-source-id: a668d4c
2017-01-20 13:24:12 -08:00
Adam Retter
e29bb934f7 Zlib 1.2.8 is no longer available, switched to 1.2.10
Summary: Closes https://github.com/facebook/rocksdb/pull/1741

Differential Revision: D4380116

Pulled By: yiwu-arbug

fbshipit-source-id: 7cf09df
2017-01-20 13:24:12 -08:00
Changli Gao
5ac97314e7 Fix std::out_of_range when DBOptions::keep_log_file_num is zero
Summary:
We should validate this option, otherwise we may see
std::out_of_range thrown at: db/db_impl.cc:1124

1123     for (unsigned int i = 0; i <= end; i++) {
1124       std::string& to_delete = old_info_log_files.at(i);
1125       std::string full_path_to_delete =
1126           (immutable_db_options_.db_log_dir.empty()
Closes https://github.com/facebook/rocksdb/pull/1722

Differential Revision: D4379495

Pulled By: yiwu-arbug

fbshipit-source-id: e136552
2017-01-20 13:24:12 -08:00
Kefu Chai
4e35ffdfab cmake: check -momit-leaf-frame-pointer before using it
Summary:
because not all archs support this option. see
https://gcc.gnu.org/onlinedocs/gcc/Option-Summary.html.

also do not pass "-fno-omit-frame-pointer" and
"-momit-leaf-frame-pointer" to compiler if ${CMAKE_BUILD_TYPE} is
"Debug". this matches the behaviour of DEBUG_LEVEL=2 in Makefile.

Signed-off-by: Kefu Chai <tchaikov@gmail.com>
Closes https://github.com/facebook/rocksdb/pull/1762

Differential Revision: D4444036

Pulled By: yiwu-arbug

fbshipit-source-id: 8596fbe
2017-01-20 13:09:11 -08:00
Shu Zhang
3c0852d1da Make ingest external file backward compatible
Summary: Closes https://github.com/facebook/rocksdb/pull/1783

Differential Revision: D4443463

Pulled By: IslamAbdelRahman

fbshipit-source-id: 39d21d6
2017-01-20 12:09:19 -08:00
Siying Dong
0e8dfd6062 Fix OptimizeForPointLookup()
Summary:
If users directly call OptimizeForPointLookup(), it is broken as the option isn't compatible with parallel memtable insert. Fix it by using memtable bloomo filter instead.
Closes https://github.com/facebook/rocksdb/pull/1791

Differential Revision: D4442836

Pulled By: siying

fbshipit-source-id: bf6c9cd
2017-01-20 10:54:12 -08:00
Vitaliy Liptchinsky
e840213d6e Change DB::GetApproximateSizes for more flexibility needed for MyRocks
Summary:
Added an option to GetApproximateSizes to exclude file stats, as MyRocks has those counted exactly and we need only stats from memtables.
Closes https://github.com/facebook/rocksdb/pull/1787

Differential Revision: D4441111

Pulled By: IslamAbdelRahman

fbshipit-source-id: c11f4c3
2017-01-20 09:39:11 -08:00
Yi Wu
9239103cd4 Flush job should release reference current version if sync log failed
Summary:
Fix the bug when sync log fail, FlushJob::Run() will not be execute and
reference to cfd->current() will not be release.
Closes https://github.com/facebook/rocksdb/pull/1792

Differential Revision: D4441316

Pulled By: yiwu-arbug

fbshipit-source-id: 5523e28
2017-01-19 23:09:15 -08:00
Islam AbdelRahman
da54d36a96 Disable IngestExternalFile in ReadOnly mode
Summary:
Disable IngestExternalFile() in read only mode
Closes https://github.com/facebook/rocksdb/pull/1781

Differential Revision: D4439179

Pulled By: IslamAbdelRahman

fbshipit-source-id: b7e46e7
2017-01-19 15:54:19 -08:00
Reid Horuff
5cf176ca15 Fix for 2PC causing WAL to grow too large
Summary:
Consider the following single column family scenario:
prepare in log A
commit in log B
*WAL is too large, flush all CFs to releast log A*
*CFA is on log B so we do not see CFA is depending on log A so no flush is requested*

To fix this we must also consider the log containing the prepare section when determining what log a CF is dependent on.
Closes https://github.com/facebook/rocksdb/pull/1768

Differential Revision: D4403265

Pulled By: reidHoruff

fbshipit-source-id: ce800ff
2017-01-19 15:39:12 -08:00
Islam AbdelRahman
4a73bb0b43 Split travis jobs
Summary:
Some travis jobs are running out of space, splitting to more jobs should reduce the chance of that happening
Closes https://github.com/facebook/rocksdb/pull/1789

Differential Revision: D4438039

Pulled By: IslamAbdelRahman

fbshipit-source-id: 05787ff
2017-01-19 13:39:12 -08:00
Dmitri Smirnov
c70d3c7ade Enable DBTest.GroupCommit as it runs in a reasonlable time now.
Summary:
Remove ReRuns as they only waste time.
  Add env_basic_test to get more foundation coverage.
Closes https://github.com/facebook/rocksdb/pull/1788

Differential Revision: D4431433

Pulled By: IslamAbdelRahman

fbshipit-source-id: 50d07f8
2017-01-18 14:24:11 -08:00