Commit Graph

54 Commits

Author SHA1 Message Date
Yi Wu
04d373b260 BlobDB: handle IO error on read (#4410)
Summary:
Fix IO error on read not being handle and crashing the DB. With the fix we properly return the error.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4410

Differential Revision: D9979246

Pulled By: yiwu-arbug

fbshipit-source-id: 111a85675067a29c03cb60e9a34103f4ff636694
2018-09-20 16:58:45 -07:00
Yi Wu
462ed70d64 BlobDB: GetLiveFiles and GetLiveFilesMetadata return relative path (#4326)
Summary:
`GetLiveFiles` and `GetLiveFilesMetadata` should return path relative to db path.

It is a separate issue when `path_relative` is false how can we return relative path. But `DBImpl::GetLiveFiles` don't handle it as well when there are multiple `db_paths`.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4326

Differential Revision: D9545904

Pulled By: yiwu-arbug

fbshipit-source-id: 6762d879fcb561df2b612e6fdfb4a6b51db03f5d
2018-08-31 12:12:49 -07:00
Yi Wu
38ad3c9f8a BlobDB: Avoid returning garbage value on key not found (#4321)
Summary:
When reading an expired key using `Get(..., std::string* value)` API, BlobDB first read the index entry and decode expiration from it. In this case, although BlobDB reset the PinnableSlice, the index entry is stored in user provided string `value`. The value will be returned as a garbage value, despite status being NotFound. Fixing it by use a different PinnableSlice to read the index entry.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4321

Differential Revision: D9519042

Pulled By: yiwu-arbug

fbshipit-source-id: f054c951a1fa98265228be94f931904ed7056677
2018-08-27 16:28:39 -07:00
Yi Wu
a6d3de4e7a BlobDB: Implement DisableFileDeletions (#4314)
Summary:
`DB::DiableFileDeletions` and `DB::EnableFileDeletions` are used for applications to stop RocksDB background jobs to delete files while they are doing replication. Implement these methods for BlobDB. `DeleteObsolteFiles` now needs to check `disable_file_deletions_` before starting, and will hold `delete_file_mutex_` the whole time while it is running. `DisableFileDeletions` needs to wait on `delete_file_mutex_` for running `DeleteObsolteFiles` job and set `disable_file_deletions_` flag.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4314

Differential Revision: D9501373

Pulled By: yiwu-arbug

fbshipit-source-id: 81064c1228f1724eff46da22b50ff765b16292cd
2018-08-27 10:58:29 -07:00
Yi Wu
7188bd34f3 BlobDB: Fix expired file not being evicted (#4294)
Summary:
Fix expired file not being evicted from the DB. We have a background task (previously called `CheckSeqFiles` and I rename it to `EvictExpiredFiles`) to scan and remove expired files, but it only close the files, not marking them as expired.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4294

Differential Revision: D9415984

Pulled By: yiwu-arbug

fbshipit-source-id: eff7bf0331c52a7ccdb02318602bff7f64f3ef3d
2018-08-20 22:42:33 -07:00
Siying Dong
9c0c8f5ff6 GetAllKeyVersions() to take an extra argument of max_num_ikeys. (#4271)
Summary:
Right now, `ldb idump` may have memory out of control if there is a big range of tombstones. Add an option to cut maxinum number of keys in GetAllKeyVersions(), and push down --max_num_ikeys from ldb.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4271

Differential Revision: D9369149

Pulled By: siying

fbshipit-source-id: 7cbb797b7d2fa16573495a7e84937456d3ff25bf
2018-08-16 15:57:08 -07:00
Yi Wu
c970358574 BlobDB: Can return expiration together with Get() (#4227)
Summary:
Add API to allow fetching expiration of a key with `Get()`.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4227

Differential Revision: D9169897

Pulled By: yiwu-arbug

fbshipit-source-id: 2a6f216c493dc75731ddcef1daa689b517fab31b
2018-08-06 17:43:14 -07:00
Yi Wu
140f256da2 BlobDB: Cleanup TTLExtractor interface (#4229)
Summary:
Cleanup TTLExtractor interface. The original purpose of it is to allow our users keep using existing `Write()` interface but allow it to accept TTL via `TTLExtractor`. However the interface is confusing. Will replace it with something like `WriteWithTTL(batch, ttl)` in the future.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4229

Differential Revision: D9174390

Pulled By: yiwu-arbug

fbshipit-source-id: 68201703d784408b851336ab4dd9b84188245b2d
2018-08-06 11:58:05 -07:00
Maysam Yabandeh
8581a93a6b Per-thread unique test db names (#4135)
Summary:
The patch makes sure that two parallel test threads will operate on different db paths. This enables using open source tools such as gtest-parallel to run the tests of a file in parallel.
Example: ``` ~/gtest-parallel/gtest-parallel ./table_test```
Pull Request resolved: https://github.com/facebook/rocksdb/pull/4135

Differential Revision: D8846653

Pulled By: maysamyabandeh

fbshipit-source-id: 799bad1abb260e3d346bcb680d2ae207a852ba84
2018-07-13 17:27:39 -07:00
Yi Wu
6d454d7376 BlobDB: is_fifo=true also evict non-TTL blob files (#4049)
Summary:
Previously with is_fifo=true we only evict TTL file. Changing it to also evict non-TTL files from oldest to newest, after exhausted TTL files.
Closes https://github.com/facebook/rocksdb/pull/4049

Differential Revision: D8604597

Pulled By: yiwu-arbug

fbshipit-source-id: bc4209ee27c1528ce4b72833e6f1e1bff80082c1
2018-06-25 22:43:05 -07:00
Zhongyi Xie
954b496b3f fix memory leak in two_level_iterator
Summary:
this PR fixes a few failed contbuild:
1. ASAN memory leak in Block::NewIterator (table/block.cc:429). the proper destruction of first_level_iter_ and second_level_iter_ of two_level_iterator.cc is missing from the code after the refactoring in https://github.com/facebook/rocksdb/pull/3406
2. various unused param errors introduced by https://github.com/facebook/rocksdb/pull/3662
3. updated comment for `ForceReleaseCachedEntry` to emphasize the use of `force_erase` flag.
Closes https://github.com/facebook/rocksdb/pull/3718

Reviewed By: maysamyabandeh

Differential Revision: D7621192

Pulled By: miasantreble

fbshipit-source-id: 476c94264083a0730ded957c29de7807e4f5b146
2018-04-15 17:26:26 -07:00
Yi Wu
f1b7b790c9 BlobDB: Fix BlobDBImpl::GCFileAndUpdateLSM issues
Summary:
* Fix BlobDBImpl::GCFileAndUpdateLSM doesn't close the new file, and the new file will not be able to be garbage collected later.
* Fix BlobDBImpl::GCFileAndUpdateLSM doesn't copy over metadata from old file to new file.
Closes https://github.com/facebook/rocksdb/pull/3639

Differential Revision: D7355092

Pulled By: yiwu-arbug

fbshipit-source-id: 4fa3594ac5ce376bed1af04a545c532cfc0088c4
2018-03-21 16:30:09 -07:00
Yi Wu
b864bc9b5b Blob DB: Improve FIFO eviction
Summary:
Improving blob db FIFO eviction with the following changes,
* Change blob_dir_size to max_db_size. Take into account SST file size when computing DB size.
* FIFO now only take into account live sst files and live blob files. It is normal for disk usage to go over max_db_size because there are obsolete sst files and blob files pending deletion.
* FIFO eviction now also evict TTL blob files that's still open. It doesn't evict non-TTL blob files.
* If FIFO is triggered, it will pass an expiration and the current sequence number to compaction filter. Compaction filter will then filter inlined keys to evict those with an earlier expiration and smaller sequence number. So call LSM FIFO.
* Compaction filter also filter those blob indexes where corresponding blob file is gone.
* Add an event listener to listen compaction/flush event and update sst file size.
* Implement DB::Close() to make sure base db, as well as event listener and compaction filter, destruct before blob db.
* More blob db statistics around FIFO.
* Fix some locking issue when accessing a blob file.
Closes https://github.com/facebook/rocksdb/pull/3556

Differential Revision: D7139328

Pulled By: yiwu-arbug

fbshipit-source-id: ea5edb07b33dfceacb2682f4789bea61de28bbfa
2018-03-06 11:57:42 -08:00
Yi Wu
f2f034ef3b Blob DB: fix crash when DB full but no candidate file to evict
Summary:
When blob_files is empty, std::min_element will return blobfiles.end(), which cannot be dereference. Fixing it.
Closes https://github.com/facebook/rocksdb/pull/3387

Differential Revision: D6764927

Pulled By: yiwu-arbug

fbshipit-source-id: 86f78700132be95760d35ac63480dfd3a8bbe17a
2018-01-19 16:26:50 -08:00
Yi Wu
237b292515 BlobDB: Remove the need to get sequence number per write
Summary:
Previously we store sequence number range of each blob files, and use the sequence number range to check if the file can be possibly visible by a snapshot. But it adds complexity to the code, since the sequence number is only available after a write. (The current implementation get sequence number by calling GetLatestSequenceNumber(), which is wrong.) With the patch, we are not storing sequence number range, and check if snapshot_sequence < obsolete_sequence to decide if the file is visible by a snapshot (previously we check if first_sequence <= snapshot_sequence < obsolete_sequence).
Closes https://github.com/facebook/rocksdb/pull/3274

Differential Revision: D6571497

Pulled By: yiwu-arbug

fbshipit-source-id: ca06479dc1fcd8782f6525b62b7762cd47d61909
2017-12-15 13:27:30 -08:00
Yi Wu
78279350aa Blob DB: Add statistics
Summary:
Adding a list of blob db counters.

Also remove WaStats() which doesn't expose the stats and can be substitute by (BLOB_DB_BYTES_WRITTEN / BLOB_DB_BLOB_FILE_BYTES_WRITTEN).
Closes https://github.com/facebook/rocksdb/pull/3193

Differential Revision: D6394216

Pulled By: yiwu-arbug

fbshipit-source-id: 017508c8ff3fcd7ea7403c64d0f9834b24816803
2017-11-28 11:58:49 -08:00
Yi Wu
42564ada53 Blob DB: not using PinnableSlice move assignment
Summary:
The current implementation of PinnableSlice move assignment have an issue #3163. We are moving away from it instead of try to get the move assignment right, since it is too tricky.
Closes https://github.com/facebook/rocksdb/pull/3164

Differential Revision: D6319201

Pulled By: yiwu-arbug

fbshipit-source-id: 8f3279021f3710da4a4caa14fd238ed2df902c48
2017-11-13 18:12:20 -08:00
Dmitri Smirnov
f8e2db0717 Fix crashes, address test issues and adjust windows test script
Summary:
Add per-exe execution capability
  Add fix parsing of groups/tests
  Add timer test exclusion

 Fix unit tests
  Ifdef threadpool specific tests that do not pass on Vista threadpool.
  Remove spurious outout from prefix_test so test case listing works
  properly.
  Fix not using standard test directories results in file creation errors
  in sst_dump_test.

  BlobDb fixes:
    In C++ end() iterators can not be dereferenced. They are not valid.
	When deleting blob_db_ set it to nullptr before any other code executes.
	Not fixed:. On Windows you can not delete a file while it is open.
	[ RUN      ] BlobDBTest.ReadWhileGC
	d:\dev\rocksdb\rocksdb\utilities\blob_db\blob_db_test.cc(75): error: DestroyBlobDB(dbname_, options, bdb_options)
	IO error: Failed to delete: d:/mnt/db\testrocksdb-17444/blob_db_test/blob_dir/000001.blob: Permission denied
	d:\dev\rocksdb\rocksdb\utilities\blob_db\blob_db_test.cc(75): error: DestroyBlobDB(dbname_, options, bdb_options)
	IO error: Failed to delete: d:/mnt/db\testrocksdb-17444/blob_db_test/blob_dir/000001.blob: Permission denied

  write_batch
    Should not call front() if there is a chance the container is empty
Closes https://github.com/facebook/rocksdb/pull/3152

Differential Revision: D6293274

Pulled By: sagar0

fbshipit-source-id: 318c3717c22087fae13b18715dffb24565dbd956
2017-11-10 10:41:57 -08:00
Yi Wu
4f9f124347 Blob DB: use compression in file header instead of global options
Summary:
To fix the issue of failing to decompress existing value after reopen DB with a different compression settings.
Closes https://github.com/facebook/rocksdb/pull/3142

Differential Revision: D6267260

Pulled By: yiwu-arbug

fbshipit-source-id: c7cf7f3e33b0cd25520abf4771cdf9180cc02a5f
2017-11-07 17:42:17 -08:00
Yi Wu
2581c0a5a1 Blob DB: Fix BlobDBTest::SnapshotAndGarbageCollection asan failure
Summary:
Fix unreleased snapshot at the end of the test.
Closes https://github.com/facebook/rocksdb/pull/3126

Differential Revision: D6232867

Pulled By: yiwu-arbug

fbshipit-source-id: 651ca3144fc573ea2ab0ab20f0a752fb4a101d26
2017-11-03 10:26:59 -07:00
Yi Wu
62578d80c1 Blob DB: Add compaction filter to remove expired blob index entries
Summary:
After adding expiration to blob index in #3066, we are now able to add a compaction filter to cleanup expired blob index entries.
Closes https://github.com/facebook/rocksdb/pull/3090

Differential Revision: D6183812

Pulled By: yiwu-arbug

fbshipit-source-id: 9cb03267a9702975290e758c9c176a2c03530b83
2017-11-02 17:27:38 -07:00
Yi Wu
7bfa88037e Blob DB: fix snapshot handling
Summary:
Blob db will keep blob file if data in the file is visible to an active snapshot. Before this patch it checks whether there is an active snapshot has sequence number greater than the earliest sequence in the file. This is problematic since we take snapshot on every read, if it keep having reads, old blob files will not be cleanup. Change to check if there is an active snapshot falls in the range of [earliest_sequence, obsolete_sequence) where obsolete sequence is
1. if data is relocated to another file by garbage collection, it is the latest sequence at the time garbage collection finish
2. otherwise, it is the latest sequence of the file
Closes https://github.com/facebook/rocksdb/pull/3087

Differential Revision: D6182519

Pulled By: yiwu-arbug

fbshipit-source-id: cdf4c35281f782eb2a9ad6a87b6727bbdff27a45
2017-11-02 15:58:27 -07:00
Yi Wu
167ba599ec Blob DB: Fix flaky BlobDBTest::GCExpiredKeyWhileOverwriting test
Summary:
The test intent to wait until key being overwritten until proceed with garbage collection. It failed to wait for `PutUntil` finally finish. Fixing it.
Closes https://github.com/facebook/rocksdb/pull/3116

Differential Revision: D6222833

Pulled By: yiwu-arbug

fbshipit-source-id: fa9b57a772b92a66cf250b44e7975c43f62f45c5
2017-11-02 13:27:34 -07:00
Sagar Vemuri
25ac1697b4 Blob DB: Evict oldest blob file when close to blob db size limit
Summary:
Evict oldest blob file and put it in obsolete_files list when close to blob db size limit. The file will be delete when the `DeleteObsoleteFiles` background job runs next time.
For now I set `kEvictOldestFileAtSize` constant, which controls when to evict the oldest file, at 90%. It could be tweaked or made into an option if really needed; I didn't want to expose it as an option pre-maturely as there are already too many :) .
Closes https://github.com/facebook/rocksdb/pull/3094

Differential Revision: D6187340

Pulled By: sagar0

fbshipit-source-id: 687f8262101b9301bf964b94025a2fe9d8573421
2017-11-02 12:11:21 -07:00
Yi Wu
f6082d1944 Blob DB: cleanup unused options
Summary:
* cleanup num_concurrent_simple_blobs. We don't do concurrent writes (by taking write_mutex_) so it doesn't make sense to have multiple non TTL files open. We can revisit later when we want to improve writes.
* cleanup eviction callback. we don't have plan to use it now.
* rename s/open_simple_blob_files_/open_non_ttl_file_/ and s/open_blob_files_/open_ttl_files_/ to avoid confusion.
Closes https://github.com/facebook/rocksdb/pull/3088

Differential Revision: D6182598

Pulled By: yiwu-arbug

fbshipit-source-id: 99e6f5e01fa66d31309cdb06ce48502464bac6ad
2017-10-31 16:42:08 -07:00
Yi Wu
3ebb7ba7b9 Blob DB: update blob file format
Summary:
Changing blob file format and some code cleanup around the change. The change with blob log format are:
* Remove timestamp field in blob file header, blob file footer and blob records. The field is not being use and often confuse with expiration field.
* Blob file header now come with column family id, which always equal to default column family id. It leaves room for future support of column family.
* Compression field in blob file header now is a standalone byte (instead of compact encode with flags field)
* Blob file footer now come with its own crc.
* Key length now being uint64_t instead of uint32_t
* Blob CRC now checksum both key and value (instead of value only).
* Some reordering of the fields.

The list of cleanups:
* Better inline comments in blob_log_format.h
* rename ttlrange_t and snrange_t to ExpirationRange and SequenceRange respectively.
* simplify blob_db::Reader
* Move crc checking logic to inside blob_log_format.cc
Closes https://github.com/facebook/rocksdb/pull/3081

Differential Revision: D6171304

Pulled By: yiwu-arbug

fbshipit-source-id: e4373e0d39264441b7e2fbd0caba93ddd99ea2af
2017-10-27 13:27:12 -07:00
Yi Wu
5a2a6483dc Blob DB: Inline small values in base DB
Summary:
Adding the `min_blob_size` option to allow storing small values in base db (in LSM tree) together with the key. The goal is to improve performance for small values, while taking advantage of blob db's low write amplification for large values.

Also adding expiration timestamp to blob index. It will be useful to evict stale blob indexes in base db by adding a compaction filter. I'll work on the compaction filter in future patches.

See blob_index.h for the new blob index format. There are 4 cases when writing a new key:
* small value w/o TTL: put in base db as normal value (i.e. ValueType::kTypeValue)
* small value w/ TTL: put (type, expiration, value) to base db.
* large value w/o TTL: write value to blob log and put (type, file, offset, size, compression) to base db.
* large value w/TTL: write value to blob log and put (type, expiration, file, offset, size, compression) to base db.
Closes https://github.com/facebook/rocksdb/pull/3066

Differential Revision: D6142115

Pulled By: yiwu-arbug

fbshipit-source-id: 9526e76e19f0839310a3f5f2a43772a4ad182cd0
2017-10-26 12:30:54 -07:00
Sagar Vemuri
96e3a600ba Return write error on reaching blob dir size limit
Summary:
I found that we continue accepting writes even when the blob db goes beyond the configured blob directory size limit. Now, we return an error for writes on reaching `blob_dir_size` limit and if `is_fifo` is set to false. (We cannot just drop any file when `is_fifo` is true.)

Deleting the oldest file when `is_fifo` is true will be handled in a later PR.
Closes https://github.com/facebook/rocksdb/pull/3060

Differential Revision: D6136156

Pulled By: sagar0

fbshipit-source-id: 2f11cb3f2eedfa94524fbfa2613dd64bfad7a23c
2017-10-25 16:30:37 -07:00
Yi Wu
66a2c44ef4 Add DB::Properties::kEstimateOldestKeyTime
Summary:
With FIFO compaction we would like to get the oldest data time for monitoring. The problem is we don't have timestamp for each key in the DB. As an approximation, we expose the earliest of sst file "creation_time" property.

My plan is to override the property with a more accurate value with blob db, where we actually have timestamp.
Closes https://github.com/facebook/rocksdb/pull/2842

Differential Revision: D5770600

Pulled By: yiwu-arbug

fbshipit-source-id: 03833c8f10bbfbee62f8ea5c0d03c0cafb5d853a
2017-10-23 15:27:27 -07:00
Yi Wu
eaaef91178 Blob DB: Store blob index as kTypeBlobIndex in base db
Summary:
Blob db insert blob index to base db as kTypeBlobIndex type, to tell apart values written by plain rocksdb or blob db. This is to make it possible to migrate from existing rocksdb to blob db.

Also with the patch blob db garbage collection get away from OptimisticTransaction. Instead it use a custom write callback to achieve similar behavior as OptimisticTransaction. This is because we need to pass the is_blob_index flag to DBImpl::Get but OptimisticTransaction don't support it.
Closes https://github.com/facebook/rocksdb/pull/3000

Differential Revision: D6050044

Pulled By: yiwu-arbug

fbshipit-source-id: 61dc72ab9977625e75f78cd968e7d8a3976e3632
2017-10-17 17:28:11 -07:00
Yi Wu
0552029b5c Blob DB: not writing sequence number as blob record footer
Summary:
Previously each time we write a blob we write blog_record_header + key + value + blob_record_footer to blob log. The footer only contains a sequence and a crc for the sequence number. The sequence number was used in garbage collection to verify the value is recent. After #2703 we moved to use optimistic transaction and no longer use sequence number from the footer. Remove the footer altogether.

There's another usage of sequence number and we are keeping it: Each blob log file keep track of sequence number range of keys in it, and use it to check if it is reference by a snapshot, before being deleted.
Closes https://github.com/facebook/rocksdb/pull/3005

Differential Revision: D6057585

Pulled By: yiwu-arbug

fbshipit-source-id: d6da53c457a316e9723f359a1b47facfc3ffe090
2017-10-17 12:13:08 -07:00
Zhongyi Xie
e2548366e1 add GetLiveFiles and GetLiveFilesMetaData for BlobDB
Summary: Closes https://github.com/facebook/rocksdb/pull/2976

Differential Revision: D5994759

Pulled By: miasantreble

fbshipit-source-id: 985c31dccb957cb970c302f813cd07a1e8cb6438
2017-10-09 19:56:04 -07:00
Yi Wu
dcd36a6aee Make it explicit blob db doesn't support CF
Summary:
Blob db doesn't currently support column families. Return NotSupported status explicitly.
Closes https://github.com/facebook/rocksdb/pull/2825

Differential Revision: D5757438

Pulled By: yiwu-arbug

fbshipit-source-id: 44de9408fd032c98e8ae337d4db4ed37169bd9fa
2017-09-08 11:11:04 -07:00
Yi Wu
503db684f7 make blob file close synchronous
Summary:
Fixing flaky blob_db_test.

To close a blob file, blob db used to add a CloseSeqWrite job to the background thread to close it. Changing file close to be synchronous in order to simplify logic, and fix flaky blob_db_test.
Closes https://github.com/facebook/rocksdb/pull/2787

Differential Revision: D5699387

Pulled By: yiwu-arbug

fbshipit-source-id: dd07a945cd435cd3808fce7ee4ea57817409474a
2017-08-25 10:41:49 -07:00
yiwu-arbug
5b68b114f1 Blob db create a snapshot before every read
Summary:
If GC kicks in between

* A Get() reads index entry from base db.
* The Get() read from a blob file

The GC can delete the corresponding blob file, making the key not found. Fortunately we have existing logic to avoid deleting a blob file if it is referenced by a snapshot. So the fix is to explicitly create a snapshot before reading index entry from base db.
Closes https://github.com/facebook/rocksdb/pull/2754

Differential Revision: D5655956

Pulled By: yiwu-arbug

fbshipit-source-id: e4ccbc51331362542e7343175bbcbdea5830f544
2017-08-20 18:26:19 -07:00
yiwu-arbug
4624ae52c9 GC the oldest file when out of space
Summary:
When out of space, blob db should GC the oldest file. The current implementation GC the newest one instead. Fixing it.
Closes https://github.com/facebook/rocksdb/pull/2757

Differential Revision: D5657611

Pulled By: yiwu-arbug

fbshipit-source-id: 56c30a4c52e6ab04551dda8c5c46006d4070b28d
2017-08-20 17:11:06 -07:00
follitude
ac8fb77afd fix some misspellings
Summary:
PTAL ajkr
Closes https://github.com/facebook/rocksdb/pull/2750

Differential Revision: D5648052

Pulled By: ajkr

fbshipit-source-id: 7cd1ddd61364d5a55a10fdd293fa74b2bf89dd98
2017-08-16 21:57:20 -07:00
yiwu-arbug
e5a1b727c0 Fix blob DB transaction usage while GC
Summary:
While GC, blob DB use optimistic transaction to delete or replace the index entry in LSM, to guarantee correctness if there's a normal write writing to the same key. However, the previous implementation doesn't call SetSnapshot() nor use GetForUpdate() of transaction API, instead it do its own sequence number checking before beginning the transaction. A normal write can sneak in after the sequence number check and overwrite the key, and the GC will delete or relocate the old version of the key by mistake. Update the code to property use GetForUpdate() to check the existing index entry.

After the patch the sequence number store with each blob record is useless, So I'm considering remove the sequence number from blob record, in another patch.
Closes https://github.com/facebook/rocksdb/pull/2703

Differential Revision: D5589178

Pulled By: yiwu-arbug

fbshipit-source-id: 8dc960cd5f4e61b36024ba7c32d05584ce149c24
2017-08-11 12:43:17 -07:00
Yi Wu
92afe830f9 Update all blob db TTL and timestamps to uint64_t
Summary:
The current blob db implementation use mix of int32_t, uint32_t and uint64_t for TTL and expiration. Update all timestamps to uint64_t for consistency.
Closes https://github.com/facebook/rocksdb/pull/2683

Differential Revision: D5557103

Pulled By: yiwu-arbug

fbshipit-source-id: e4eab2691629a755e614e8cf1eed9c3a681d0c42
2017-08-03 17:57:30 -07:00
Yi Wu
0b814ba92d Allow concurrent writes to blob db
Summary:
I'm going with brute-force solution, just letting Put() and Write() holding a mutex before writing. May improve concurrent writing with finer granularity locking later.
Closes https://github.com/facebook/rocksdb/pull/2682

Differential Revision: D5552690

Pulled By: yiwu-arbug

fbshipit-source-id: 039abd675b5d274a7af6428198d1733cafecef4c
2017-08-03 15:11:26 -07:00
Yi Wu
2c45ada4c4 Blob DB garbage collection should keep keys with newer version
Summary:
Fix the bug where if blob db garbage collection revmoe keys with newer version. It shouldn't delete the key from base db when sequence number in base db is not equal to the one in blob log.
Closes https://github.com/facebook/rocksdb/pull/2678

Differential Revision: D5549752

Pulled By: yiwu-arbug

fbshipit-source-id: abb8649260963b5c389748023970fd746279d227
2017-08-03 13:12:12 -07:00
Yi Wu
1900771bd2 Dump Blob DB options to info log
Summary:
* Dump blob db options to info log
* Remove BlobDBOptionsImpl to disallow dynamic cast *BlobDBOptions into *BlobDBOptionsImpl. Move options there to be constants or into BlobDBOptions. The dynamic cast is broken after #2645
* Change some of the default options
* Remove blob_db_options.min_blob_size, which is unimplemented. Will implement it soon.
Closes https://github.com/facebook/rocksdb/pull/2671

Differential Revision: D5529912

Pulled By: yiwu-arbug

fbshipit-source-id: dcd58ca981db5bcc7f123b65a0d6f6ae0dc703c7
2017-08-01 13:01:47 -07:00
Yi Wu
6083bc79f8 Blob DB TTL extractor
Summary:
Introducing blob_db::TTLExtractor to replace extract_ttl_fn. The TTL
extractor can be use to extract TTL from keys insert with Put or
WriteBatch. Change over existing extract_ttl_fn are:
* If value is changed, it will be return via std::string* (rather than Slice*). With Slice* the new value has to be part of the existing value. With std::string* the limitation is removed.
* It can optionally return TTL or expiration.

Other changes in this PR:
* replace `std::chrono::system_clock` with `Env::NowMicros` so that I can mock time in tests.
* add several TTL tests.
* other minor naming change.
Closes https://github.com/facebook/rocksdb/pull/2659

Differential Revision: D5512627

Pulled By: yiwu-arbug

fbshipit-source-id: 0dfcb00d74d060b8534c6130c808e4d5d0a54440
2017-07-27 23:26:04 -07:00
Siying Dong
3c327ac2d0 Change RocksDB License
Summary: Closes https://github.com/facebook/rocksdb/pull/2589

Differential Revision: D5431502

Pulled By: siying

fbshipit-source-id: 8ebf8c87883daa9daa54b2303d11ce01ab1f6f75
2017-07-15 16:11:23 -07:00
foolenough
21b17d7686 Fix BlobDB::Get which only get out the value offset
Summary:
Blob db use StackableDB::get which only get out the
value offset, but not the value.
Fix by making BlobDB::Get override the designated getter.
Closes https://github.com/facebook/rocksdb/pull/2553

Differential Revision: D5396823

Pulled By: yiwu-arbug

fbshipit-source-id: 5a7d1cf77ee44490f836a6537225955382296878
2017-07-12 17:57:40 -07:00
Siying Dong
18c63af6ef Make "make analyze" happy
Summary:
"make analyze" is reporting some errors. It's complicated to look but it seems to me that they are all false positive. Anyway, I think cleaning them up is a good idea. Some of the changes are hacky but I don't know a better way.
Closes https://github.com/facebook/rocksdb/pull/2508

Differential Revision: D5341710

Pulled By: siying

fbshipit-source-id: 6070e430e0e41a080ef441e05e8ec827d45efab6
2017-06-28 15:42:27 -07:00
Dmitri Smirnov
a21db161c9 Implement ReopenWritibaleFile on Windows and other fixes
Summary:
Make default impl return NoSupported so the db_blob
  tests exist in a meaningful manner.
  Replace std::thread to port::Thread
Closes https://github.com/facebook/rocksdb/pull/2465

Differential Revision: D5275563

Pulled By: yiwu-arbug

fbshipit-source-id: cedf1a18a2c05e20d768c1308b3f3224dbd70ab6
2017-06-20 10:31:13 -07:00
Yi Wu
ae8571f5c2 Fix blob db compression bug
Summary:
`CompressBlock()` will return the uncompressed slice (i.e. `Slice(value_unc)`) if compression ratio is not good enough. This is undesired. We need to always assign the compressed slice to `value`.
Closes https://github.com/facebook/rocksdb/pull/2447

Differential Revision: D5244682

Pulled By: yiwu-arbug

fbshipit-source-id: 6989dd8852c9622822ba9acec9beea02007dff09
2017-06-14 13:56:42 -07:00
Yi Wu
7a380deff7 Update blob_db_test
Summary:
I'm trying to improve unit test of blob db. I'm rewriting blob db test. In this patch:
* Rewrite tests of basic put/write/delete operations.
* Add disable_background_tasks to BlobDBOptionsImpl to allow me not running any background job for basic unit tests.
* Move DestroyBlobDB out from BlobDBImpl to be a standalone function.
* Remove all garbage collection related tests. Will rewrite them in following patch.
* Disabled compression test since it is failing. Will fix in a followup patch.
Closes https://github.com/facebook/rocksdb/pull/2446

Differential Revision: D5243306

Pulled By: yiwu-arbug

fbshipit-source-id: 157c71ad3b699307cb88baa3830e9b6e74f8e939
2017-06-14 13:12:34 -07:00
Yi Wu
91e2aa3ce2 write exact sequence number for each put in write batch
Summary:
At the beginning of write batch write, grab the latest sequence from base db and assume sequence number will increment by 1 for each put and delete, and write the exact sequence number with each put. This is assuming we are the only writer to increment sequence number (no external file ingestion, etc) and there should be no holes in the sequence number.

Also having some minor naming changes.
Closes https://github.com/facebook/rocksdb/pull/2402

Differential Revision: D5176134

Pulled By: yiwu-arbug

fbshipit-source-id: cb4712ee44478d5a2e5951213a10b72f08fe8c88
2017-06-13 12:42:36 -07:00