Knowledge: Choosing The Right Configuration


Large server setups are quite typical for KVM on z. One common pitfall is that system and application defaults that work well for a small number of virtual servers may not hold up for a large number of virtual servers, or in combination with huge host resources. The following sections present a list of snags you may encounter, along with suggestions on how to resolve them.

System-wide I/O Limits

To achieve the best possible performance, QEMU should be configured to use Linux native asynchronous I/O (AIO). With a large number of virtual servers running on the KVM host, the number of outstanding asynchronous I/O requests can exceed the system-wide limit, which is exposed in /proc/sys/fs/aio-max-nr and can be adjusted via sysctl fs.aio-max-nr. If this happens, it might not be possible to start any more virtual servers.
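For illustration, a disk definition in a guest's libvirt domain XML can request native AIO along these lines (the device path and target shown here are placeholders, not taken from this article):
   <disk type='block' device='disk'>
      <!-- io='native' selects Linux native AIO; cache='none' is commonly used with it -->
      <driver name='qemu' type='raw' io='native' cache='none'/>
      <!-- placeholder host device, adjust to your setup -->
      <source dev='/dev/sdx'/>
      <target dev='vda' bus='virtio'/>
   </disk>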
Use the following command to check the current and maximum number of asynchronous I/O requests:
   $ sysctl fs.aio-nr fs.aio-max-nr
   fs.aio-nr = 0
   fs.aio-max-nr = 65536
To prevent any issues, we recommend increasing the maximum to a higher value, for example:
   $ sysctl -w fs.aio-max-nr=4194304
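Note that a value set with sysctl does not persist across reboots. To make the change permanent, add the setting to /etc/sysctl.conf or to a file under /etc/sysctl.d/ (the file name below is just an example):
   # e.g. in /etc/sysctl.d/99-kvm.conf
   fs.aio-max-nr = 4194304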

Paging Performance

The storage servers used by z Systems typically do not suffer from seek latencies, and are similar to SSDs in this respect. However, FCP LUNs, for example, are treated as regular SCSI disks, and the I/O scheduler will therefore apply strategies to avoid seeks. This can adversely affect performance, for example when paging.
Therefore, we recommend configuring all disks as non-rotational. Use
   $ cat /sys/block/<sdx>/queue/rotational
to check the current setting. Use the following command to mark the respective device as non-rotational:
   $ echo 0 > /sys/block/<sdx>/queue/rotational
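Note that, like the sysctl setting above, this value does not persist across reboots. One common way to make it permanent is a udev rule along the following lines (the rule file name is an example, and you may want to narrow the device match to the disks in question):
   # e.g. in /etc/udev/rules.d/60-nonrotational.rules
   ACTION=="add|change", KERNEL=="sd*", ATTR{queue/rotational}="0"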

QEMU User Limits

QEMU instances started by libvirt typically run as the dedicated user qemu and are subject to the host system's per-user limits. When deploying many guests, the limits on both the number of processes and the number of open file descriptors can be exceeded. Therefore, we recommend increasing QEMU's limits for the number of processes and files.
Use the following commands to determine the current number of QEMU processes and their open files:
   # Number of QEMU processes
   $ ps -afem | grep qemu | grep -v grep | wc -l
   463

   # Number of open files
   $ for i in $(ps -afe | grep qemu | grep -v grep | awk \
     '{print($2)}'); do ls -1 /proc/$i/fd; done | wc -l
   13891
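To see how close these counts are to the limits actually in effect, you can inspect one of the running QEMU processes. The following sketch assumes that the guests run as user qemu, as described above:
   # Limits in effect for one of the QEMU processes
   $ cat /proc/$(pgrep -u qemu | head -1)/limits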
We recommend editing /etc/libvirt/qemu.conf and setting the following values to avoid running into resource limits:
   max_processes = 10000
   max_files = 100000
Note that these numbers were chosen to be on the safe side for large deployments. If you run only a small number of virtual servers, the defaults may well be sufficient.

Libvirt Client Connections

libvirt clients like virsh or virt-manager communicate with the libvirt daemon via remote procedure calls (RPCs). The number of concurrent RPC requests is limited, both per client and globally.
The default limits should suffice in many situations where a single human administrator uses standard tools to manage virtual servers. However, in some situations many libvirt requests may be processed in parallel, for example when a script starts a large number of virtual servers after the host has been booted.
For large deployments, we recommend allowing up to 200 requests to be processed in parallel, of which up to 10 per client connection. Edit /etc/libvirt/libvirtd.conf and use the following settings:
   max_workers = 200
   max_requests = 200
   max_client_requests = 10
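Note that libvirtd reads its configuration files, including qemu.conf and libvirtd.conf, at startup. Therefore, restart the daemon for the new settings to take effect, for example:
   $ systemctl restart libvirtd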
