[LINUX] Deadline IO scheduler tunables

https://www.kernel.org/doc/html/latest/block/deadline-iosched.html

Deadline IO scheduler tunables

This little file attempts to document how the deadline io scheduler works. In particular, it will clarify the meaning of the exposed tunables that may be of interest to power users.

Selecting IO schedulers

Refer to Documentation/block/switching-sched.rst for information on selecting an io scheduler on a per-device basis.

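As a minimal sketch (the device name "sda" is an assumption for illustration, and writing to sysfs requires root privileges), the active scheduler for a device can be inspected and switched through sysfs; the scheduler shown in brackets is the one currently in use:

    # Sketch: inspect and switch the I/O scheduler for one block device via sysfs.
    # The device name "sda" is an assumption; writing requires root privileges.
    dev = "sda"
    path = f"/sys/block/{dev}/queue/scheduler"

    with open(path) as f:
        print(f.read().strip())      # e.g. "[mq-deadline] kyber bfq none"

    with open(path, "w") as f:
        f.write("mq-deadline")       # select the deadline scheduler for this device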

read_expire (in ms)

The goal of the deadline io scheduler is to attempt to guarantee a start service time for a request. As we focus mainly on read latencies, this is tunable. When a read request first enters the io scheduler, it is assigned a deadline that is the current time + the read_expire value in units of milliseconds.

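For example (a minimal sketch, assuming mq-deadline is the active scheduler on a device named "sda" and the change is made as root), the tunable is exposed under the per-device iosched directory in sysfs:

    # Sketch: read and adjust read_expire (in milliseconds) for one device.
    # Assumes mq-deadline is active on "sda"; writing requires root privileges.
    path = "/sys/block/sda/queue/iosched/read_expire"

    with open(path) as f:
        print("current read_expire (ms):", f.read().strip())

    with open(path, "w") as f:
        f.write("250")   # illustrative value: tighten the read deadline to 250 ms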

write_expire (in ms)

Similar to read_expire mentioned above, but for writes.

fifo_batch (number of requests)

Requests are grouped into batches of a particular data direction (read or write) which are serviced in increasing sector order. To limit extra seeking, deadline expiries are only checked between batches. fifo_batch controls the maximum number of requests per batch.

This parameter tunes the balance between per-request latency and aggregate throughput. When low latency is the primary concern, smaller is better (where a value of 1 yields first-come first-served behaviour). Increasing fifo_batch generally improves throughput, at the cost of latency variation.

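As a rough illustration of the two extremes of that trade-off (same assumptions as above: mq-deadline on "sda", run as root; the values are illustrative, not recommendations):

    # Sketch: push fifo_batch toward latency or toward throughput.
    # Assumes mq-deadline on "sda"; writing requires root privileges.
    path = "/sys/block/sda/queue/iosched/fifo_batch"

    def set_fifo_batch(value: int) -> None:
        with open(path, "w") as f:
            f.write(str(value))

    set_fifo_batch(1)     # ~first-come first-served: lowest per-request latency
    # set_fifo_batch(32)  # bigger batches: more throughput, more latency variation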

writes_starved (number of dispatches)

When we have to move requests from the io scheduler queue to the block device dispatch queue, we always give a preference to reads. However, we don’t want to starve writes indefinitely either. So writes_starved controls how many times we give preference to reads over writes. When that has been done writes_starved number of times, we dispatch some writes based on the same criteria as reads.

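The mechanism can be pictured with a highly simplified toy model (an illustrative sketch only, not the kernel's implementation; the real scheduler counts batch dispatches rather than individual requests):

    # Toy model of the writes_starved preference described above (illustrative only).
    from collections import deque

    def dispatch(reads, writes, writes_starved=2):
        """Prefer reads, but never pass writes over more than writes_starved times."""
        starved = 0                      # times writes have been passed over
        while reads or writes:
            if reads and (not writes or starved < writes_starved):
                yield "read", reads.popleft()
                if writes:
                    starved += 1         # a waiting write was skipped again
            else:
                yield "write", writes.popleft()
                starved = 0              # writes got their turn; reset the counter

    reads = deque(["R1", "R2", "R3", "R4", "R5"])
    writes = deque(["W1", "W2"])
    print(list(dispatch(reads, writes)))
    # [('read', 'R1'), ('read', 'R2'), ('write', 'W1'), ('read', 'R3'),
    #  ('read', 'R4'), ('write', 'W2'), ('read', 'R5')]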

front_merges (bool)

Sometimes it happens that a request enters the io scheduler that is contiguous with a request that is already on the queue. Either it fits in the back of that request, or it fits at the front. That is called either a back merge candidate or a front merge candidate. Due to the way files are typically laid out, back merges are much more common than front merges. For some work loads, you may even know that it is a waste of time to spend any time attempting to front merge requests.

Setting front_merges to 0 disables this functionality. Front merges may still occur due to the cached last_merge hint, but since that comes at basically 0 cost we leave that on. We simply disable the rbtree front sector lookup when the io scheduler merge function is called.

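For workloads where front merges are known to be pointless, the lookup can be turned off through sysfs (a minimal sketch, again assuming mq-deadline on "sda" and root privileges):

    # Sketch: disable the rbtree front-sector lookup for merge attempts.
    # Assumes mq-deadline on "sda"; writing requires root privileges.
    path = "/sys/block/sda/queue/iosched/front_merges"

    with open(path, "w") as f:
        f.write("0")   # 0 = skip front-merge lookups; 1 re-enables them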

Nov 11 2002, Jens Axboe <jens.axboe@oracle.com>


This document is originally part of the Linux kernel source tree, so it is treated under the kernel's license, GPLv2.

https://www.kernel.org/doc/html/latest/index.html

Licensing documentation

The following describes the license of the Linux kernel source code (GPLv2), how to properly mark the license of individual files in the source tree, as well as links to the full license text.

https://www.kernel.org/doc/html/latest/process/license-rules.html#kernel-licensing
