New KVM deployment bugs and recommendations (Ubuntu 14.04: qemu 2.0, libvirt 1.2.4, Linux 3.10)

New Linux KVM/QEMU deployment running on Ubuntu 14.04 with a Linux 3.10 kernel and openvswitch. The hardware is two SSDs in RAID1 and two 7200 RPM HDDs in RAID1, both built with mdadm. bcache is used to cache the HDD array.

Bugs

  • hv_vapic ("vapic state='on'" in libvirt) prevents Windows Server 2008 R2 and later VMs from booting if the CPU is an Intel Ivy Bridge or newer (check /sys/module/kvm_intel/parameters/enable_apicv) – Red Hat Bugzilla
  • Linux 3.12 or greater (Ubuntu 14.04 ships with 3.13) has issues with virtio-net NIC offloading (TSO, RX and TX checksumming): TCP sessions can't be established across virtual machines in certain situations (think of a virtual machine acting as a firewall) – Debian bug report
  • Windows virtual machines still freeze or show high latency with a virtio NIC, even with the latest signed drivers available from the Fedora Project
  • Still have issues with "Russian roulette" of network interfaces with openvswitch – Blog post
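
To see whether a host is a candidate for the hv_vapic bug, the sysfs parameter named in the first item can be read directly. This is only a minimal sketch: the helper name apicv_state and its optional path argument are made up for illustration, and it assumes the parameter reads as Y or N (a kernel boolean module parameter).

```shell
#!/bin/sh
# Report whether kvm_intel has APICv enabled.
# The optional first argument overrides the sysfs path (hypothetical helper
# argument, handy for testing); the default is the path from the list above.
apicv_state() {
    param="${1:-/sys/module/kvm_intel/parameters/enable_apicv}"
    if [ ! -r "$param" ]; then
        # Module not loaded, not an Intel host, or parameter not present.
        echo "unknown"
        return 0
    fi
    case "$(cat "$param")" in
        Y|1) echo "enabled" ;;
        *)   echo "disabled" ;;
    esac
}

apicv_state
```

If this prints "enabled", the host is in the group where the hv_vapic item above applies.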

Recommendations

Installed Packages

System
apt-get install haveged ntp sysstat irqbalance acpid
Linux KVM, openvswitch, virt-install, virt-top
apt-get install qemu-kvm libvirt-bin virtinst virt-top openvswitch-switch sysfsutils iotop gdisk iftop
bcache
apt-get install python-software-properties
add-apt-repository ppa:g2p/storage && apt-get update && apt-get install bcache-tools

Tuning memory, scheduler, and I/O subsystems for Linux KVM

Taken from the RHEL 6 tuned profile (virtual-host)

/etc/sysctl.conf
kernel.sched_min_granularity_ns=10000000
kernel.sched_wakeup_granularity_ns=15000000
vm.dirty_ratio=10
vm.dirty_background_ratio=5
vm.swappiness=10
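
The values can be loaded without a reboot using sysctl -p (as root) and then read back. The following is a rough sketch for checking one key at a time; check_sysctl is a hypothetical helper, and its third argument (an alternative root instead of /proc/sys) exists only to make it testable.

```shell
#!/bin/sh
# Compare the live value of a sysctl key with the expected one.
# $1 = dotted key, $2 = expected value, $3 = optional root (default /proc/sys).
check_sysctl() {
    key="$1"; want="$2"; root="${3:-/proc/sys}"
    # Dotted keys map to paths under /proc/sys (vm.swappiness -> vm/swappiness).
    path="$root/$(echo "$key" | tr . /)"
    if [ ! -r "$path" ]; then
        echo "$key: not available"
        return 1
    fi
    have="$(cat "$path")"
    if [ "$have" = "$want" ]; then
        echo "$key: ok ($have)"
    else
        echo "$key: expected $want, found $have"
        return 1
    fi
}

# Usage (as root):
#   sysctl -p /etc/sysctl.conf
#   check_sysctl vm.swappiness 10
#   check_sysctl vm.dirty_ratio 10
```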

Disable experimental virtio-net zero copy transmit

RHEL 7 has experimental_zcopytx disabled by default.

/etc/modprobe.d/vhost-net.conf
options vhost_net experimental_zcopytx=0

Use virtio-blk for guests, and enable Multiqueue virtio-net (except Windows)

Linux KVM page describing Multiqueue

libvirt
<devices>
  <interface type='network'>
    <model type='virtio'/>
    <driver name='vhost' queues='4'/>
  </interface>
</devices>

Set queues to the number of virtual processors assigned to the virtual machine. Don't forget to enable the vhost_net kernel module: edit /etc/default/qemu-kvm and set VHOST_NET_ENABLED=1.

Make sure to enable Multiqueue support in the guest

ethtool -L eth0 combined 4
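
Rather than hard-coding 4, the queue count in the guest can be derived from the number of virtual processors, capped at what the NIC reports as its maximum. This is a sketch under assumptions: max_combined and min are hypothetical helpers, eth0 is an assumed interface name, and the parsing relies on ethtool -l printing the "Pre-set maximums" section before the current settings.

```shell
#!/bin/sh
# Print the pre-set maximum of combined channels from `ethtool -l` output
# read on stdin (the first "Combined:" line is in "Pre-set maximums").
max_combined() {
    awk '/^Combined:/ { print $2; exit }'
}

# Print the smaller of two non-negative integers.
min() {
    if [ "$1" -le "$2" ]; then echo "$1"; else echo "$2"; fi
}

# Usage inside the guest (as root):
#   max=$(ethtool -l eth0 | max_combined)
#   ethtool -L eth0 combined "$(min "$(nproc)" "$max")"
```

The setting made with ethtool -L does not survive a reboot on its own, so it needs to be reapplied from whatever mechanism the guest uses to bring up the interface.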

Use the deadline scheduler and enable transparent hugepages for KVM

/etc/default/grub
GRUB_CMDLINE_LINUX_DEFAULT="elevator=deadline transparent_hugepage=always"

Don't forget to run update-grub to regenerate the GRUB configuration; the new kernel parameters take effect on the next boot.
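
After the reboot, both settings can be confirmed from sysfs, where the kernel marks the active choice with square brackets (for example "noop [deadline] cfq"). A small sketch follows; active_choice is a hypothetical helper and sda is an assumed device name.

```shell
#!/bin/sh
# Print the bracketed (active) entry from a sysfs selector file such as
# /sys/block/<dev>/queue/scheduler or
# /sys/kernel/mm/transparent_hugepage/enabled.
active_choice() {
    sed -n 's/.*\[\([^]]*\)].*/\1/p' "$1"
}

# Usage:
#   active_choice /sys/block/sda/queue/scheduler
#   active_choice /sys/kernel/mm/transparent_hugepage/enabled
```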

For Windows guests, take advantage of Hyper-V enlightenments and use the e1000 Ethernet adapter

Linux KVM presentation on Hyper-V enlightenment (slightly outdated)

  • hv_vapic (for "supported processors") for Virtual APIC; see the hv_vapic entry under Bugs before enabling it on Intel Ivy Bridge or newer
  • hv_time (aka "hypervclock") for TSC invariant timestamps passed to guest
  • hv_relaxed to prevent BSOD under high load (when a timer can't be serviced when expected)
  • hv_spinlocks sets how many times the guest retries acquiring a spinlock before it notifies the hypervisor of the contention (the retries attribute in libvirt)
libvirt
<features>
  <acpi/>
  <apic/>
  <hyperv>
    <relaxed state='on'/>
    <vapic state='on'/>
    <spinlocks state='on' retries='4096'/>
  </hyperv>
</features>
<clock offset='localtime'>
  <timer name='hypervclock' present='yes'/>
  <timer name='hpet' present='no'/>
</clock>
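
To confirm that the enlightenments actually reached QEMU, the hv_ flags can be picked out of the command line of the running guest. This is a sketch only: hv_flags is a hypothetical helper, and the process name in the usage line is an assumption that depends on which emulator binary libvirt launches on the host.

```shell
#!/bin/sh
# Print every hv_* token found in a QEMU command line read on stdin.
# Tokens are split on commas and spaces, one per output line.
hv_flags() {
    tr ', ' '\n\n' | grep '^hv_'
}

# Usage (process name is an assumption, adjust to the emulator in use):
#   ps -o args= -C qemu-system-x86_64 | hv_flags
```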

Build and install longterm Linux 3.10 kernel for stability (and working openvswitch with virtio-net)

apt-get -y install build-essential
cd /usr/local/src
wget https://www.kernel.org/pub/linux/kernel/v3.x/linux-3.10.44.tar.xz
tar -Jxf linux-3.10.44.tar.xz
cd linux-3.10.44
cp "/boot/config-$(uname -r)" .config
make olddefconfig
make -j"$(nproc)" INSTALL_MOD_STRIP=1 deb-pkg
dpkg -i ../*.deb
apt-mark hold linux-libc-dev
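
After rebooting, it is worth checking that the host really came up on the 3.10 kernel and not on a 3.12 or newer one, since that is where the virtio-net offloading problem listed under Bugs starts. A minimal sketch, with kernel_older_than_3_12 as a hypothetical helper that takes an optional release string (defaulting to uname -r):

```shell
#!/bin/sh
# Succeed if kernel release $1 (default: the running kernel) is older
# than 3.12. Only the major and minor numbers are compared.
kernel_older_than_3_12() {
    rel="${1:-$(uname -r)}"
    major="${rel%%.*}"          # text before the first dot
    rest="${rel#*.}"            # text after the first dot
    minor="${rest%%[!0-9]*}"    # leading digits of the remainder
    [ "$major" -lt 3 ] || { [ "$major" -eq 3 ] && [ "${minor:-0}" -lt 12 ]; }
}

# Usage:
#   kernel_older_than_3_12 && echo "kernel is below 3.12" || echo "kernel is 3.12 or newer"
```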

Time keeping is king on FreeBSD – TSC and "how not to have time go backwards in guest"

/etc/sysctl.conf
kern.timecounter.hardware=ACPI-fast
/boot/loader.conf
virtio_load="YES"
virtio_pci_load="YES"
virtio_blk_load="YES"
if_vtnet_load="YES"
virtio_balloon_load="YES"
kern.timecounter.smp_tsc="1"
kern.timecounter.invariant_tsc="1"
libvirt
<clock offset='localtime'>
  <timer name='rtc' tickpolicy='catchup'/>
  <timer name='pit' tickpolicy='delay'/>
  <timer name='hpet' present='no'/>
</clock>
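
Inside the FreeBSD guest, the timecounter in use and the ones on offer can be read back with sysctl. The sketch below checks whether a given timecounter is offered at all; has_timecounter is a hypothetical helper, and it assumes the usual name(quality) format of kern.timecounter.choice.

```shell
#!/bin/sh
# Succeed if timecounter $1 appears in `sysctl -n kern.timecounter.choice`
# output read on stdin, e.g. "TSC-low(1000) ACPI-fast(900) i8254(0)".
has_timecounter() {
    tr ' ' '\n' | grep "^$1(" >/dev/null
}

# Usage inside the FreeBSD guest:
#   sysctl -n kern.timecounter.choice | has_timecounter ACPI-fast && echo "ACPI-fast is offered"
#   sysctl -n kern.timecounter.hardware
```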