1cd45cd1b3
Summary: Introducing FIFO compactions with TTL. FIFO compaction is based on size only which makes it tricky to enable in production as use cases can have organic growth. A user requested an option to drop files based on the time of their creation instead of the total size. To address that request: - Added a new TTL option to FIFO compaction options. - Updated FIFO compaction score to take TTL into consideration. - Added a new table property, creation_time, to keep track of when the SST file is created. - Creation_time is set as below: - On Flush: Set to the time of flush. - On Compaction: Set to the max creation_time of all the files involved in the compaction. - On Repair and Recovery: Set to the time of repair/recovery. - Old files created prior to this code change will have a creation_time of 0. - FIFO compaction with TTL is enabled when ttl > 0. All files older than ttl will be deleted during compaction. i.e. `if (file.creation_time < (current_time - ttl)) then delete(file)`. This will enable cases where you might want to delete all files older than, say, 1 day. - FIFO compaction will fall back to the prior way of deleting files based on size if: - the creation_time of all files involved in compaction is 0. - the total size (of all SST files combined) does not drop below `compaction_options_fifo.max_table_files_size` even if the files older than ttl are deleted. This feature is not supported if max_open_files != -1 or with table formats other than Block-based. **Test Plan:** Added tests. **Benchmark results:** Base: FIFO with max size: 100MB :: ``` svemuri@dev15905 ~/rocksdb (fifo-compaction) $ TEST_TMPDIR=/dev/shm ./db_bench --benchmarks=readwhilewriting --num=5000000 --threads=16 --compaction_style=2 --fifo_compaction_max_table_files_size_mb=100 readwhilewriting : 1.924 micros/op 519858 ops/sec; 13.6 MB/s (1176277 of 5000000 found) ``` With TTL (a low one for testing) :: ``` svemuri@dev15905 ~/rocksdb (fifo-compaction) $ TEST_TMPDIR=/dev/shm ./db_bench --benchmarks=readwhilewriting --num=5000000 --threads=16 --compaction_style=2 --fifo_compaction_max_table_files_size_mb=100 --fifo_compaction_ttl=20 readwhilewriting : 1.902 micros/op 525817 ops/sec; 13.7 MB/s (1185057 of 5000000 found) ``` Example Log lines: ``` 2017/06/26-15:17:24.609249 7fd5a45ff700 (Original Log Time 2017/06/26-15:17:24.609177) [db/compaction_picker.cc:1471] [default] FIFO compaction: picking file 40 with creation time 1498515423 for deletion 2017/06/26-15:17:24.609255 7fd5a45ff700 (Original Log Time 2017/06/26-15:17:24.609234) [db/db_impl_compaction_flush.cc:1541] [default] Deleted 1 files ... 2017/06/26-15:17:25.553185 7fd5a61a5800 [DEBUG] [db/db_impl_files.cc:309] [JOB 0] Delete /dev/shm/dbbench/000040.sst type=2 #40 -- OK 2017/06/26-15:17:25.553205 7fd5a61a5800 EVENT_LOG_v1 {"time_micros": 1498515445553199, "job": 0, "event": "table_file_deletion", "file_number": 40} ``` SST Files remaining in the dbbench dir, after db_bench execution completed: ``` svemuri@dev15905 ~/rocksdb (fifo-compaction) $ ls -l /dev/shm//dbbench/*.sst -rw-r--r--. 1 svemuri users 30749887 Jun 26 15:17 /dev/shm//dbbench/000042.sst -rw-r--r--. 1 svemuri users 30768779 Jun 26 15:17 /dev/shm//dbbench/000044.sst -rw-r--r--. 1 svemuri users 30757481 Jun 26 15:17 /dev/shm//dbbench/000046.sst ``` Closes https://github.com/facebook/rocksdb/pull/2480 Differential Revision: D5305116 Pulled By: sagar0 fbshipit-source-id: 3e5cfcf5dd07ed2211b5b37492eb235b45139174
85 lines
3.6 KiB
C++
85 lines
3.6 KiB
C++
// Copyright (c) 2011-present, Facebook, Inc. All rights reserved.
|
|
// This source code is licensed under the BSD-style license found in the
|
|
// LICENSE file in the root directory of this source tree. An additional grant
|
|
// of patent rights can be found in the PATENTS file in the same directory.
|
|
// This source code is also licensed under the GPLv2 license found in the
|
|
// COPYING file in the root directory of this source tree.
|
|
// Copyright (c) 2011 The LevelDB Authors. All rights reserved.
|
|
// Use of this source code is governed by a BSD-style license that can be
|
|
// found in the LICENSE file. See the AUTHORS file for names of contributors.
|
|
#pragma once
|
|
#include <string>
|
|
#include <utility>
|
|
#include <vector>
|
|
#include "db/table_properties_collector.h"
|
|
#include "options/cf_options.h"
|
|
#include "rocksdb/comparator.h"
|
|
#include "rocksdb/env.h"
|
|
#include "rocksdb/listener.h"
|
|
#include "rocksdb/options.h"
|
|
#include "rocksdb/status.h"
|
|
#include "rocksdb/table_properties.h"
|
|
#include "rocksdb/types.h"
|
|
#include "table/scoped_arena_iterator.h"
|
|
#include "util/event_logger.h"
|
|
|
|
namespace rocksdb {
|
|
|
|
struct Options;
|
|
struct FileMetaData;
|
|
|
|
class Env;
|
|
struct EnvOptions;
|
|
class Iterator;
|
|
class TableCache;
|
|
class VersionEdit;
|
|
class TableBuilder;
|
|
class WritableFileWriter;
|
|
class InternalStats;
|
|
class InternalIterator;
|
|
|
|
// @param column_family_name Name of the column family that is also identified
|
|
// by column_family_id, or empty string if unknown. It must outlive the
|
|
// TableBuilder returned by this function.
|
|
// @param compression_dict Data for presetting the compression library's
|
|
// dictionary, or nullptr.
|
|
TableBuilder* NewTableBuilder(
|
|
const ImmutableCFOptions& options,
|
|
const InternalKeyComparator& internal_comparator,
|
|
const std::vector<std::unique_ptr<IntTblPropCollectorFactory>>*
|
|
int_tbl_prop_collector_factories,
|
|
uint32_t column_family_id, const std::string& column_family_name,
|
|
WritableFileWriter* file, const CompressionType compression_type,
|
|
const CompressionOptions& compression_opts, int level,
|
|
const std::string* compression_dict = nullptr,
|
|
const bool skip_filters = false, const uint64_t creation_time = 0);
|
|
|
|
// Build a Table file from the contents of *iter. The generated file
|
|
// will be named according to number specified in meta. On success, the rest of
|
|
// *meta will be filled with metadata about the generated table.
|
|
// If no data is present in *iter, meta->file_size will be set to
|
|
// zero, and no Table file will be produced.
|
|
//
|
|
// @param column_family_name Name of the column family that is also identified
|
|
// by column_family_id, or empty string if unknown.
|
|
extern Status BuildTable(
|
|
const std::string& dbname, Env* env, const ImmutableCFOptions& options,
|
|
const MutableCFOptions& mutable_cf_options, const EnvOptions& env_options,
|
|
TableCache* table_cache, InternalIterator* iter,
|
|
std::unique_ptr<InternalIterator> range_del_iter, FileMetaData* meta,
|
|
const InternalKeyComparator& internal_comparator,
|
|
const std::vector<std::unique_ptr<IntTblPropCollectorFactory>>*
|
|
int_tbl_prop_collector_factories,
|
|
uint32_t column_family_id, const std::string& column_family_name,
|
|
std::vector<SequenceNumber> snapshots,
|
|
SequenceNumber earliest_write_conflict_snapshot,
|
|
const CompressionType compression,
|
|
const CompressionOptions& compression_opts, bool paranoid_file_checks,
|
|
InternalStats* internal_stats, TableFileCreationReason reason,
|
|
EventLogger* event_logger = nullptr, int job_id = 0,
|
|
const Env::IOPriority io_priority = Env::IO_HIGH,
|
|
TableProperties* table_properties = nullptr, int level = -1,
|
|
const uint64_t creation_time = 0);
|
|
|
|
} // namespace rocksdb
|