openvswitch and libvirt: vnet port "russian roulette" on restart (solution)

Update: This issue has been resolved in the libvirt 1.2.7 release (or the corresponding upstream commit). The instructions below are no longer required if your distribution has updated the package.

libvirt has openvswitch integration. When a virtual machine that uses openvswitch for its network port is started, libvirt creates a vnetX interface (where X is an incrementing number, starting from 0) and destroys it again on shutdown. openvswitch's configuration is persistent: the vnetX port created by libvirt is saved to a database and is still present after the next reboot.

As outlined in my bug report submitted in September 2013, this quickly breaks down if libvirtd is shut down after openvswitch (because libvirt can no longer delete the ports it created) or if the machine is restarted or shut down uncleanly. If you have virtual machines on different VLANs or interfaces, stale ports can quickly end up assigned to the wrong virtual machine, because libvirt doesn't error out if the interface already exists when it tries to create it (imagine swapping the LAN and WAN ports on a firewall).

I solved this by creating an upstart job override on the Ubuntu LTS releases in /etc/init/openvswitch-switch.override, which deletes every leftover vnetX port each time openvswitch starts:

post-start script
    ovs-vsctl show | grep 'Port "vnet[0-9]*"' | awk -F'"' '{print $2}' | xargs -I {} ovs-vsctl del-port {} || :
end script
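
Before wiring the cleanup into the init job, it can help to preview which ports it would remove. The following is a minimal sketch and not part of the original fix: list_stale_vnet_ports is a hypothetical helper name, and it assumes that ovs-vsctl show prints ports as Port "vnetN", as the pipeline above does.

```shell
#!/bin/sh
# Hypothetical helper: read `ovs-vsctl show` output on stdin and print
# the names of libvirt-style vnetN ports, one per line.
# Assumes the `Port "vnetN"` output format matched by the job above.
list_stale_vnet_ports() {
    sed -n 's/^[[:space:]]*Port "\(vnet[0-9][0-9]*\)"[[:space:]]*$/\1/p'
}

# Dry run: show what would be deleted without touching the switch.
#   ovs-vsctl show | list_stale_vnet_ports
# Actual cleanup, equivalent in spirit to the post-start script:
#   ovs-vsctl show | list_stale_vnet_ports | xargs -I {} ovs-vsctl del-port {}
```

Unlike the grep pattern, this sketch requires at least one digit after "vnet" and ignores the matching Interface lines, so only port names are printed.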

I've tested this issue and confirmed its existence on the openSUSE 12.3 (Dartmouth), Debian (stable) and Ubuntu 12.04/14.04 (LTS) distributions.

Networking with a gateway not on the local subnet on NetBSD at OVH

NetBSD has a networking FAQ with an entry titled "Networking with a gateway not on the local subnet". Unfortunately, the recipe it provides doesn't actually work "in the real world": the route command it gives does not make the network stack send an ARP who-has for the gateway's IP address, so you have to set the gateway's MAC address statically.

I figured out a work-around for this, based on some insight from people on the NetBSD tech-talk mailing list. This allows you to use NetBSD as a guest operating system on providers such as OVH and Hetzner:

# ifconfig fxp0 inet 10.0.0.1 
# route add -net 192.168.0.1/32 -cloning -link fxp0 -iface 
# route add default -ifa 10.0.0.1 192.168.0.1

The trick was to use route cloning, and to use a net definition instead of a host definition. Now NetBSD will send an ARP who-has request for the gateway IP address.

To supplement the OVH bridge client guide that is available on their Wiki, the same commands fit into the following template:

# ifconfig fxp0 inet Fail.over.IP netmask 255.255.255.255 broadcast Fail.over.IP 
# route add -net Your.Server.IP.254/32 -cloning -link fxp0 -iface 
# route add default -ifa Fail.over.IP Your.Server.IP.254

This should allow you to use NetBSD as a guest and not get blocked by OVH robots that check for too many ARP requests.
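
The template can also be filled in mechanically. The sketch below is illustrative only: ovh_netbsd_commands is a hypothetical helper, the addresses in the usage comment are documentation placeholders, and it assumes (as the template implies) that the gateway is the server's IP with the last octet replaced by 254. It only prints the commands, so they can be reviewed before being run as root.

```shell
#!/bin/sh
# Hypothetical helper: print the three NetBSD commands from the template
# above for a given interface, fail-over IP and host server IP.
# Assumption: the gateway is the server's IP with the last octet set to 254.
ovh_netbsd_commands() {
    iface=$1        # e.g. fxp0
    failover_ip=$2  # the fail-over IP assigned to the guest
    server_ip=$3    # the physical server's main IP
    gateway="${server_ip%.*}.254"   # strip the last octet, append 254
    echo "ifconfig $iface inet $failover_ip netmask 255.255.255.255 broadcast $failover_ip"
    echo "route add -net $gateway/32 -cloning -link $iface -iface"
    echo "route add default -ifa $failover_ip $gateway"
}

# Usage (placeholder addresses):
#   ovh_netbsd_commands fxp0 198.51.100.7 203.0.113.57
```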

What Linux/*BSD distributions have Syncookies enabled by default?

In light of the recently published article on Quick Blind TCP Connection Spoofing with SYN Cookies, I wanted to see which operating systems and distributions have Syncookies enabled by default.

Distribution        Sysctl                   Default
Ubuntu Linux 12.04  net.ipv4.tcp_syncookies  On
Debian Linux 6      net.ipv4.tcp_syncookies  Off
Debian Linux 7      net.ipv4.tcp_syncookies  On
CentOS 5            net.ipv4.tcp_syncookies  On
CentOS 6            net.ipv4.tcp_syncookies  On
FreeBSD 8           net.inet.tcp.syncookies  On
Solaris 10          Not Implemented          Off
OpenBSD 5.3         Not Implemented          Off

I'm not sure that turning off Syncookies is the best idea, due to the potential DoS effects from disabling them – applications should use something besides IP addresses for authentication.
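
To check a particular Linux host rather than rely on distribution defaults, the sysctl can be read directly. This is a small sketch with a hypothetical helper name, describe_syncookies; the meaning of the values (0, 1, 2) follows the Linux kernel's documentation for net.ipv4.tcp_syncookies, and the value 2 may not be available on older kernels.

```shell
#!/bin/sh
# Hypothetical helper: translate a Linux net.ipv4.tcp_syncookies value
# into a short human-readable description.
describe_syncookies() {
    case "$1" in
        0) echo "off" ;;
        1) echo "on (used when the SYN backlog overflows)" ;;
        2) echo "on (sent unconditionally)" ;;
        *) echo "unknown value: $1" ;;
    esac
}

# On a Linux host:
#   describe_syncookies "$(sysctl -n net.ipv4.tcp_syncookies)"
```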

KVM PCI Passthrough of an AHCI SATA controller to a guest causing data corruption

I recently migrated from VMware ESXi to Linux KVM. Under ESXi I had been using PCI Passthrough to pass an Intel AHCI SATA controller through to a guest. I implemented the same setup on KVM by enabling IOMMU on the host and passing the AHCI SATA controller through to the guest.

After a week or two, I started seeing the following messages in /var/log/syslog on the guest:

Aug  6 13:25:28 yama kernel: [78351.258573] XFS (md0): Corruption detected. Unmount and run xfs_repair
Aug  6 13:25:28 yama kernel: [78351.259102] XFS (md0): Corruption detected. Unmount and run xfs_repair
Aug  6 13:25:28 yama kernel: [78351.259616] XFS (md0): metadata I/O error: block 0x31214bd0 ("xfs_trans_read_buf_map") error 117 numblks 16
Aug  6 13:25:28 yama kernel: [78351.260203] XFS (md0): xfs_imap_to_bp: xfs_trans_read_buf() returned error 117.
Aug  6 13:29:10 yama kernel: [78573.533933] XFS (md0): Invalid inode number 0xfeffffffffffffff
Aug  6 13:29:10 yama kernel: [78573.533940] XFS (md0): Internal error xfs_dir_ino_validate at line 160 of file /build/buildd/linux-lts-raring-3.8.0/fs/xfs/xfs_dir2.c.  Caller 0xffffffffa045cd96
Aug  6 13:29:10 yama kernel: [78573.533940]
Aug  6 13:29:10 yama kernel: [78573.538440] Pid: 1723, comm: kworker/0:1H Tainted: GF            3.8.0-27-generic #40~precise3-Ubuntu
Aug  6 13:29:10 yama kernel: [78573.538443] Call Trace:
Aug  6 13:29:10 yama kernel: [78573.538496]  [<ffffffffa042316f>] xfs_error_report+0x3f/0x50 [xfs]
Aug  6 13:29:10 yama kernel: [78573.538537]  [<ffffffffa045cd96>] ? __xfs_dir2_data_check+0x1e6/0x4a0 [xfs]
Aug  6 13:29:10 yama kernel: [78573.538560]  [<ffffffffa045a150>] xfs_dir_ino_validate+0x90/0xe0 [xfs]
Aug  6 13:29:10 yama kernel: [78573.538579]  [<ffffffffa045cd96>] __xfs_dir2_data_check+0x1e6/0x4a0 [xfs]
Aug  6 13:29:10 yama kernel: [78573.538598]  [<ffffffffa045d0ca>] xfs_dir2_data_verify+0x7a/0x90 [xfs]
Aug  6 13:29:10 yama kernel: [78573.538637]  [<ffffffff810135aa>] ? __switch_to+0x12a/0x4a0
Aug  6 13:29:10 yama kernel: [78573.538664]  [<ffffffffa045d195>] xfs_dir2_data_reada_verify+0x95/0xa0 [xfs]
Aug  6 13:29:10 yama kernel: [78573.538675]  [<ffffffff8108e2aa>] ? finish_task_switch+0x4a/0xf0
Aug  6 13:29:10 yama kernel: [78573.538697]  [<ffffffffa042133f>] xfs_buf_iodone_work+0x3f/0xa0 [xfs]
Aug  6 13:29:10 yama kernel: [78573.538706]  [<ffffffff81078c21>] process_one_work+0x141/0x490
Aug  6 13:29:10 yama kernel: [78573.538710]  [<ffffffff81079be8>] worker_thread+0x168/0x400
Aug  6 13:29:10 yama kernel: [78573.538714]  [<ffffffff81079a80>] ? manage_workers+0x120/0x120
Aug  6 13:29:10 yama kernel: [78573.538721]  [<ffffffff8107f0f0>] kthread+0xc0/0xd0
Aug  6 13:29:10 yama kernel: [78573.538726]  [<ffffffff8107f030>] ? flush_kthread_worker+0xb0/0xb0
Aug  6 13:29:10 yama kernel: [78573.538730]  [<ffffffff816fc6ac>] ret_from_fork+0x7c/0xb0
Aug  6 13:29:10 yama kernel: [78573.538735]  [<ffffffff8107f030>] ? flush_kthread_worker+0xb0/0xb0

I initially used xfs_repair on the file system, thinking that the issue was caused by a number of power failures that happened when the machine was running ESXi. However, this did not resolve the issue and made the problem worse. Eventually I decided to scrap the file system, and pulled a drive from the array to back up the data and re-create the file system.

The drive that I pulled from the array for backups started showing the same issues with XFS corruption.

After further investigation via trial and error, I determined that KVM PCI Passthrough was causing the issue and decided to just pass through an array to the guest using virtio-blk instead. This solved the corruption problem and I haven't had any issues (knock on wood) since!

Accessing USB devices as non-root: writing udev rules the easy way

I recently purchased a TEMPered USB thermometer, which I wanted to use as non-root with an open source utility called TEMPered. All the recipes I found required root to access the /dev/hidraw0 device that this particular USB device exposes, which of course was not acceptable.

systemd (and udev, in general – I believe) has a handy utility called udevadm. You can use this tool to query a device on your system, for example:

udevadm info --query=all --name=/dev/hidraw0 --attribute-walk

This lets you retrieve all the attributes required to craft a rules file to put in /etc/udev/rules.d. I have created the following to expose the PCsensor TEMPerV1.4 to any user that is part of the group temper:

# TEMPer1.4 USB thermometer
SUBSYSTEM=="hidraw", ATTRS{idVendor}=="0c45", ATTRS{idProduct}=="7401", GROUP="temper", MODE="0660"

I placed this in a file called /etc/udev/rules.d/60-temper.rules. You can now use TEMPered as a non-root user who is a member of the group in question!
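
The same approach works for other hidraw devices. As a rough sketch, the rule can be generated from the attribute walk itself: make_hidraw_rule is a hypothetical helper (not part of udev or TEMPered) that takes the first idVendor and idProduct it sees, which in the attribute walk should belong to the nearest USB parent of the device. Check the printed rule before installing it.

```shell
#!/bin/sh
# Hypothetical helper: read `udevadm info --attribute-walk` output on stdin
# and print a hidraw rule for the first USB vendor/product IDs found.
# $1 is the group that should get read/write access to the device.
make_hidraw_rule() {
    awk -v group="$1" -F'"' '
        index($0, "ATTRS{idVendor}==")  && vendor == ""  { vendor = $2 }
        index($0, "ATTRS{idProduct}==") && product == "" { product = $2 }
        END {
            # Fail if the walk did not contain both IDs.
            if (vendor == "" || product == "") exit 1
            printf "SUBSYSTEM==\"hidraw\", ATTRS{idVendor}==\"%s\", ATTRS{idProduct}==\"%s\", GROUP=\"%s\", MODE=\"0660\"\n", vendor, product, group
        }'
}

# Usage (as root for the last two steps):
#   udevadm info --name=/dev/hidraw0 --attribute-walk | make_hidraw_rule temper
#   ... save the printed line to /etc/udev/rules.d/60-temper.rules ...
#   udevadm control --reload-rules && udevadm trigger
```

Reloading the rules and re-triggering (or simply re-plugging the device) applies the new group and mode without a reboot.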