Sagar Vemuri df23e80e5f Improve performance of long range scans with readahead
Summary:
This change improves the performance of iterators doing long range scans (e.g. big/full table scans in MyRocks) by using readahead and prefetching additional data on each disk IO. This prefetching is automatically enabled on noticing more than 2 IOs for the same table file during iteration. The readahead size starts with 8KB and is exponentially increased on each additional sequential IO, up to a max of 256 KB. This helps in cutting down the number of IOs needed to complete the range scan.

Constraints:
- The prefetched data is stored by the OS in page cache. So this currently works only for non direct-reads use-cases i.e applications which use page cache. (Direct-I/O support will be enabled in a later PR).
- This gets currently enabled only when ReadOptions.readahead_size = 0 (which is the default value).

Thanks to siying for the original idea and implementation.

**Benchmarks:**
Data fill:
```
TEST_TMPDIR=/data/users/$USER/benchmarks/iter ./db_bench -benchmarks=fillrandom -num=1000000000 -compression_type="none" -level_compaction_dynamic_level_bytes
```
Do a long range scan: Seekrandom with large number of nexts
```
TEST_TMPDIR=/data/users/$USER/benchmarks/iter ./db_bench -benchmarks=seekrandom -duration=60 -num=1000000000 -use_existing_db -seek_nexts=10000 -statistics -histogram
```

Page cache was cleared before each experiment with the command:
```
sudo sh -c "echo 3 > /proc/sys/vm/drop_caches"
```
```
Before:
seekrandom   :   34020.945 micros/op 29 ops/sec;   32.5 MB/s (1636 of 1999 found)
With this change:
seekrandom   :    8726.912 micros/op 114 ops/sec;  126.8 MB/s (5702 of 6999 found)
```
~3.9X performance improvement.

Also verified with strace and gdb that the readahead size is increasing as expected.
```
strace -e readahead -f -T -t -p <db_bench process pid>
```
Closes https://github.com/facebook/rocksdb/pull/3282

Differential Revision: D6586477

Pulled By: sagar0

fbshipit-source-id: 8a118a0ed4594fbb7f5b1cafb242d7a4033cb58c
2018-02-05 15:00:23 -08:00
..
2017-07-15 16:11:23 -07:00
2017-07-15 16:11:23 -07:00
2017-12-11 15:27:32 -08:00
2017-12-12 12:12:38 -08:00
2017-07-15 16:11:23 -07:00
2017-07-15 16:11:23 -07:00
2017-12-07 11:57:36 -08:00
2017-07-17 10:41:56 -07:00
2017-07-15 16:11:23 -07:00
2017-07-15 16:11:23 -07:00
2017-07-15 16:11:23 -07:00
2017-07-15 16:11:23 -07:00
2017-07-28 16:27:16 -07:00
2017-07-15 16:11:23 -07:00
2017-12-11 15:27:32 -08:00
2017-07-15 16:11:23 -07:00
2017-10-12 18:28:24 -07:00
2017-12-11 15:27:32 -08:00
2017-07-15 16:11:23 -07:00
2017-07-15 16:11:23 -07:00
2017-07-15 16:11:23 -07:00
2017-12-07 11:57:36 -08:00
2017-07-15 16:11:23 -07:00
2017-07-15 16:11:23 -07:00
2017-12-01 10:42:05 -08:00
2017-08-09 15:58:13 -07:00
2017-08-18 10:56:20 -07:00
2017-07-15 16:11:23 -07:00
2017-07-15 16:11:23 -07:00