I recently migrated from VMware ESXi to Linux KVM. Under ESXi I had been using PCI passthrough to hand an Intel AHCI SATA controller to a guest, so I recreated the same setup on the KVM host by enabling the IOMMU and passing the AHCI SATA controller through to the guest.
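For reference, the passthrough setup on the KVM host looked roughly like this; the kernel parameter and libvirt XML below are only a sketch, and the PCI address 0000:00:1f.2 is a placeholder rather than the actual address of my controller:

# Enable the IOMMU on the host (Intel CPU) via the kernel command line, then update GRUB and reboot
GRUB_CMDLINE_LINUX_DEFAULT="... intel_iommu=on"

# Attach the SATA controller to the guest (e.g. via virsh edit <guest>)
<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <address domain='0x0000' bus='0x00' slot='0x1f' function='0x2'/>
  </source>
</hostdev>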
After a week or two, I started seeing the following messages in /var/log/syslog on the guest:
Aug 6 13:25:28 yama kernel: [78351.258573] XFS (md0): Corruption detected. Unmount and run xfs_repair
Aug 6 13:25:28 yama kernel: [78351.259102] XFS (md0): Corruption detected. Unmount and run xfs_repair
Aug 6 13:25:28 yama kernel: [78351.259616] XFS (md0): metadata I/O error: block 0x31214bd0 ("xfs_trans_read_buf_map") error 117 numblks 16
Aug 6 13:25:28 yama kernel: [78351.260203] XFS (md0): xfs_imap_to_bp: xfs_trans_read_buf() returned error 117.
Aug 6 13:29:10 yama kernel: [78573.533933] XFS (md0): Invalid inode number 0xfeffffffffffffff
Aug 6 13:29:10 yama kernel: [78573.533940] XFS (md0): Internal error xfs_dir_ino_validate at line 160 of file /build/buildd/linux-lts-raring-3.8.0/fs/xfs/xfs_dir2.c. Caller 0xffffffffa045cd96
Aug 6 13:29:10 yama kernel: [78573.533940]
Aug 6 13:29:10 yama kernel: [78573.538440] Pid: 1723, comm: kworker/0:1H Tainted: GF 3.8.0-27-generic #40~precise3-Ubuntu
Aug 6 13:29:10 yama kernel: [78573.538443] Call Trace:
Aug 6 13:29:10 yama kernel: [78573.538496] [<ffffffffa042316f>] xfs_error_report+0x3f/0x50 [xfs]
Aug 6 13:29:10 yama kernel: [78573.538537] [<ffffffffa045cd96>] ? __xfs_dir2_data_check+0x1e6/0x4a0 [xfs]
Aug 6 13:29:10 yama kernel: [78573.538560] [<ffffffffa045a150>] xfs_dir_ino_validate+0x90/0xe0 [xfs]
Aug 6 13:29:10 yama kernel: [78573.538579] [<ffffffffa045cd96>] __xfs_dir2_data_check+0x1e6/0x4a0 [xfs]
Aug 6 13:29:10 yama kernel: [78573.538598] [<ffffffffa045d0ca>] xfs_dir2_data_verify+0x7a/0x90 [xfs]
Aug 6 13:29:10 yama kernel: [78573.538637] [<ffffffff810135aa>] ? __switch_to+0x12a/0x4a0
Aug 6 13:29:10 yama kernel: [78573.538664] [<ffffffffa045d195>] xfs_dir2_data_reada_verify+0x95/0xa0 [xfs]
Aug 6 13:29:10 yama kernel: [78573.538675] [<ffffffff8108e2aa>] ? finish_task_switch+0x4a/0xf0
Aug 6 13:29:10 yama kernel: [78573.538697] [<ffffffffa042133f>] xfs_buf_iodone_work+0x3f/0xa0 [xfs]
Aug 6 13:29:10 yama kernel: [78573.538706] [<ffffffff81078c21>] process_one_work+0x141/0x490
Aug 6 13:29:10 yama kernel: [78573.538710] [<ffffffff81079be8>] worker_thread+0x168/0x400
Aug 6 13:29:10 yama kernel: [78573.538714] [<ffffffff81079a80>] ? manage_workers+0x120/0x120
Aug 6 13:29:10 yama kernel: [78573.538721] [<ffffffff8107f0f0>] kthread+0xc0/0xd0
Aug 6 13:29:10 yama kernel: [78573.538726] [<ffffffff8107f030>] ? flush_kthread_worker+0xb0/0xb0
Aug 6 13:29:10 yama kernel: [78573.538730] [<ffffffff816fc6ac>] ret_from_fork+0x7c/0xb0
Aug 6 13:29:10 yama kernel: [78573.538735] [<ffffffff8107f030>] ? flush_kthread_worker+0xb0/0xb0
I initially ran xfs_repair on the file system, thinking the corruption was caused by a number of power failures that had happened while the machine was still running ESXi. However, this did not resolve the issue; if anything, it made the problem worse. Eventually I decided to scrap the file system, and pulled a drive from the array to back up the data and re-create the file system.
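The repair attempts themselves were nothing exotic, roughly the following (the array shows up as md0 inside the guest, as in the log above):

umount /dev/md0
xfs_repair /dev/md0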
The drive that I pulled from the array for backups started showing the same issues with XFS corruption.
After further trial-and-error investigation, I determined that KVM PCI passthrough was causing the issue, so I stopped passing the controller through and instead exposed the array to the guest as a virtio-blk disk. This solved the corruption problem and I haven't had any issues (knock on wood) since!
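For completeness, the replacement setup is just an ordinary virtio disk definition in the guest's libvirt XML, along these lines; the host-side device path shown here is an example of an md array assembled on the host, not necessarily my exact path:

<disk type='block' device='disk'>
  <driver name='qemu' type='raw' cache='none'/>
  <source dev='/dev/md0'/>
  <target dev='vda' bus='virtio'/>
</disk>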