rocksdb

Author	SHA1	Message	Date
Islam AbdelRahman	4a21b1402c	Cache heap::downheap() root comparison (optimize heap cmp call) Summary: Reduce number of comparisons in heap by caching which child node in the first level is smallest (left_child or right_child) So next time we can compare directly against the smallest child I see that the total number of calls to comparator drops significantly when using this optimization Before caching (~2mil key comparison for iterating the DB) ``` $ DEBUG_LEVEL=0 make db_bench -j64 && ./db_bench --benchmarks="readseq" --db="/dev/shm/heap_opt" --use_existing_db --disable_auto_compactions --cache_size=1000000000 --perf_level=2 readseq : 0.338 micros/op 2959201 ops/sec; 327.4 MB/s user_key_comparison_count = 2000008 ``` After caching (~1mil key comparison for iterating the DB) ``` $ DEBUG_LEVEL=0 make db_bench -j64 && ./db_bench --benchmarks="readseq" --db="/dev/shm/heap_opt" --use_existing_db --disable_auto_compactions --cache_size=1000000000 --perf_level=2 readseq : 0.309 micros/op 3236801 ops/sec; 358.1 MB/s user_key_comparison_count = 1000011 ``` It also improves Closes https://github.com/facebook/rocksdb/pull/1600 Differential Revision: D4256027 Pulled By: IslamAbdelRahman fbshipit-source-id: 76fcc66	2016-12-01 13:39:14 -08:00
Baraa Hamodi	21e95811d1	Updated all copyright headers to the new format.	2016-02-09 15:12:00 -08:00
lovro	e1c99e10c1	Replace std::priority_queue in MergingIterator with custom heap, take 2 Summary: Repeat of `b6655a679d` (reverted in `b7a2369fb2`) with a proper fix for the issue that `57d216ea65` was trying to fix. Test Plan: make check for i in $(seq 100); do ./db_stress --test_batches_snapshots=1 --threads=32 --write_buffer_size=4194304 --destroy_db_initially=0 --reopen=20 --readpercent=45 --prefixpercent=5 --writepercent=35 --delpercent=5 --iterpercent=10 --db=/tmp/rocksdb_crashtest_KdCI5F --max_key=100000000 --mmap_read=0 --block_size=16384 --cache_size=1048576 --open_files=500000 --verify_checksum=1 --sync=0 --progress_reports=0 --disable_wal=0 --disable_data_sync=1 --target_file_size_base=2097152 --target_file_size_multiplier=2 --max_write_buffer_number=3 --max_background_compactions=20 --max_bytes_for_level_base=10485760 --filter_deletes=0 --memtablerep=prefix_hash --prefix_size=7 --ops_per_thread=200 \|\| break; done Reviewers: anthony, sdong, igor, yhchiang Reviewed By: igor, yhchiang Subscribers: dhruba Differential Revision: https://reviews.facebook.net/D41391	2015-07-15 03:34:40 -07:00
Yueh-Hsuan Chiang	b7a2369fb2	Revert "Replace std::priority_queue in MergingIterator with custom heap" Summary: This patch reverts "Replace std::priority_queue in MergingIterator with custom heap" (commit commit `b6655a679d`) as it causes db_stress failure. Test Plan: ./db_stress --test_batches_snapshots=1 --threads=32 --write_buffer_size=4194304 --destroy_db_initially=0 --reopen=20 --readpercent=45 --prefixpercent=5 --writepercent=35 --delpercent=5 --iterpercent=10 --db=/tmp/rocksdb_crashtest_KdCI5F --max_key=100000000 --mmap_read=0 --block_size=16384 --cache_size=1048576 --open_files=500000 --verify_checksum=1 --sync=0 --progress_reports=0 --disable_wal=0 --disable_data_sync=1 --target_file_size_base=2097152 --target_file_size_multiplier=2 --max_write_buffer_number=3 --max_background_compactions=20 --max_bytes_for_level_base=10485760 --filter_deletes=0 --memtablerep=prefix_hash --prefix_size=7 --ops_per_thread=200 --kill_random_test=97 Reviewers: igor, anthony, lovro, sdong Reviewed By: sdong Subscribers: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D41343	2015-07-07 14:45:20 -07:00
lovro	b6655a679d	Replace std::priority_queue in MergingIterator with custom heap Summary: While profiling compaction in our service I noticed a lot of CPU (~15% of compaction) being spent in MergingIterator and key comparison. Looking at the code I found MergingIterator was (understandably) using std::priority_queue for the multiway merge. Keys in our dataset include sequence numbers that increase with time. Adjacent keys in an L0 file are very likely to be adjacent in the full database. Consequently, compaction will often pick a chunk of rows from the same L0 file before switching to another one. It would be great to avoid the O(log K) operation per row while compacting. This diff replaces std::priority_queue with a custom binary heap implementation. It has a "replace top" operation that is cheap when the new top is the same as the old one (i.e. the priority of the top entry is decreased but it still stays on top). Test Plan: make check To test the effect on performance, I generated databases with data patterns that mimic what I describe in the summary (rows have a mostly increasing sequence number). I see a 10-15% CPU decrease for compaction (and a matching throughput improvement on tmpfs). The exact improvement depends on the number of L0 files and the amount of locality. Performance on randomly distributed keys seems on par with the old code. Reviewers: kailiu, sdong, igor Reviewed By: igor Subscribers: yoshinorim, dhruba, tnovak Differential Revision: https://reviews.facebook.net/D29133	2015-07-06 04:24:09 -07:00

5 Commits