IO statistics

From OpenVZ Virtuozzo Containers Wiki
Revision as of 10:52, 12 May 2017 by Finist (talk | contribs)

This page describes the IO statistics collected at the IO-scheduler level. These statistics reflect the container's actual work with disks, which is different from what is shown by IO accounting.

Kernel interface

The statistics are reported via proc files. They are available in kernels starting from 028stab069.1.

Files

  • /proc/bc/$id/iostat
statistics for beancounter $id
  • /proc/bc/iostat
statistics for all beancounters

Format

Each file contains one row for each disk-beancounter pair.

Columns are:

N   name                 type     description
1   disk                 string   Disk device name, e.g. sda or hda, or a special queue (like fuse or flush)
2   ub id                integer  Beancounter id
3   state                char     Currently unused (always '.')
4   busy queues          integer  The number of queues with requests (see below)
5   on dispatch          integer  Currently unused (always '0')
6   activations count    integer  Currently unused (always '0')
7   wait time            integer  Total time in waiting state, in milliseconds
8   used time            integer  Total time in active state, in milliseconds
9   requests completed   integer  The number of completed requests
10  sectors transferred  integer  The number of 512-byte sectors transferred (includes both reads and writes)

New columns may be added at the end of each row in the future, so parsers should ignore trailing fields they do not recognize.

Separate rows exist for the fuse and flush queues. They report only the request and sector counters (all other fields are always 0).

For an example of parsing code, see the parse_proc_iostat() function in vzstat.c.
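The row format described above can be read with a single sscanf() call. The following is a minimal sketch, not the vzstat.c implementation: the struct iostat_row type, its field names, and the parse_iostat_line() function are hypothetical names chosen for illustration. It reads the ten documented columns and ignores anything after them.

```c
#include <assert.h>
#include <stdio.h>

/* One row of an iostat proc file. Type and field names are illustrative. */
struct iostat_row {
    char disk[32];                /* column 1: device name or special queue */
    unsigned long ub_id;          /* column 2: beancounter id */
    char state;                   /* column 3: currently always '.' */
    unsigned long busy_queues;    /* column 4 */
    unsigned long on_dispatch;    /* column 5: currently always 0 */
    unsigned long activations;    /* column 6: currently always 0 */
    unsigned long long wait_ms;   /* column 7: wait time, milliseconds */
    unsigned long long used_ms;   /* column 8: used time, milliseconds */
    unsigned long long requests;  /* column 9: requests completed */
    unsigned long long sectors;   /* column 10: 512-byte sectors transferred */
};

/* Parse one row. Returns 0 on success, -1 if fewer than ten fields matched.
 * Fields after the tenth are ignored. */
int parse_iostat_line(const char *line, struct iostat_row *row)
{
    assert(line != NULL && row != NULL);
    int n = sscanf(line, "%31s %lu %c %lu %lu %lu %llu %llu %llu %llu",
                   row->disk, &row->ub_id, &row->state,
                   &row->busy_queues, &row->on_dispatch, &row->activations,
                   &row->wait_ms, &row->used_ms,
                   &row->requests, &row->sectors);
    return n == 10 ? 0 : -1;
}
```

A caller would typically read the file line by line with fgets() and pass each line to this function.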

I/O schedulers

Check available/active I/O schedulers for block device "sda":

# cat /sys/block/sda/queue/scheduler
noop deadline [cfq]
  • for the "cfq" I/O scheduler: a separate line for the block device is added to the iostat proc file
# cat /proc/bc/100/iostat
flush 100 . 0 0 0 0 0 7389 1893968 0 0
fuse 100 . 0 0 0 0 0 0 0 0 0
sda 100 . 0 0 0 9000 1843380 245216 55845488 245028 188
  • for the "deadline" I/O scheduler: no additional per-device line is added; the iops counters for such devices are added to the "flush" line counters (the iops limit works)
  • for the "noop" I/O scheduler: iops are not counted (the iops limit does not work)
  • for devices with no I/O scheduler (such as logical devices or ceph rbd devices): iops are not counted (the iops limit does not work)
# cat /sys/block/dm-0/queue/scheduler
none
# cat /sys/block/rbd0/queue/scheduler
none
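The rules above can be checked programmatically. The sketch below extracts the active scheduler (the bracketed entry) from the contents of a queue/scheduler file and maps it to the accounting behaviour listed in this section. The function and enum names are hypothetical, and the mapping only restates the list above; it is not taken from kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* How iops for a device show up in iostat, per the list in this section. */
enum iops_accounting {
    IOPS_PER_DEVICE,   /* cfq: a separate row for the device */
    IOPS_IN_FLUSH,     /* deadline: counted in the "flush" row */
    IOPS_NOT_COUNTED   /* noop, none, anything else */
};

/* Copy the active scheduler name from the contents of
 * /sys/block/<dev>/queue/scheduler into out. The active entry is the one in
 * square brackets; contents without brackets (e.g. "none") are copied as is,
 * without the trailing newline. Returns 0 on success, -1 if out is too small. */
int active_scheduler(const char *contents, char *out, size_t outlen)
{
    assert(contents != NULL && out != NULL && outlen > 0);
    const char *start = strchr(contents, '[');
    size_t len;

    if (start != NULL && strchr(start, ']') != NULL) {
        start++;                                    /* skip '[' */
        len = (size_t)(strchr(start, ']') - start);
    } else {
        start = contents;
        len = strcspn(contents, "\n");
    }
    if (len >= outlen)
        return -1;
    memcpy(out, start, len);
    out[len] = '\0';
    return 0;
}

/* Map a scheduler name to the accounting behaviour described above. */
enum iops_accounting iops_accounting_for(const char *scheduler)
{
    if (strcmp(scheduler, "cfq") == 0)
        return IOPS_PER_DEVICE;
    if (strcmp(scheduler, "deadline") == 0)
        return IOPS_IN_FLUSH;
    return IOPS_NOT_COUNTED;
}
```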

Queues

Each beancounter may have many queues with requests. Typically there is one queue per task for synchronous requests (e.g. reads) and a fixed number of queues per beancounter for asynchronous requests (e.g. cached writes).

Interpretation

Disk usage times

Disk usage should be reported in a top-like style. Consider the following code:

read_iostat(&a);
sleep(interval);
read_iostat(&b);

Now the following numbers should be calculated and shown:

active  = sum(b.used_time - a.used_time) * 100 / (interval * 1000);
waiting = sum(b.wait_time - a.wait_time) * 100 / (interval * 1000);
idle    = 100 - (active + waiting);

The sum function adds up the times over all disks of the beancounter. The times are reported in milliseconds while sleep() takes seconds, hence the factor of 1000.

Additionally, two more values should be shown for each beancounter.

IO speed

The value

sum(b.transferred_sectors - a.transferred_sectors) * 512 / interval

denotes the speed of the IO performed by the beancounter, in bytes per second when the interval is given in seconds.

Average request size

The value

(b.transferred_sectors - a.transferred_sectors) / (b.requests_completed - a.requests_completed)

denotes the average request size, in 512-byte sectors, for a beancounter on a particular disk.
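The formula is undefined when no requests completed during the interval, so an implementation needs to handle that case. The sketch below returns 0 then; both that choice and the function name are assumptions made for illustration.

```c
#include <assert.h>

/* Average request size, in 512-byte sectors, for one disk-beancounter pair
 * between two samples. Returns 0 if no requests completed in between. */
double avg_request_sectors(unsigned long long sectors_a,
                           unsigned long long sectors_b,
                           unsigned long long requests_a,
                           unsigned long long requests_b)
{
    assert(sectors_b >= sectors_a && requests_b >= requests_a);
    unsigned long long requests = requests_b - requests_a;

    if (requests == 0)
        return 0.0;   /* avoid division by zero on an idle disk */
    return (double)(sectors_b - sectors_a) / (double)requests;
}
```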

See also