why is virtio-scsi much slower than virtio-blk in my experiment (over a ceph rbd image)?

Zhaohui Yang · Aug 19, 2016 · Viewed 8.9k times

Hi, I recently ran an experiment with virtio-scsi over RBD through the QEMU target (for its DISCARD/TRIM support), and compared the throughput and IOPS with those of a virtio-blk over RBD setup on the same machine, using fio in the guest. It turns out the sequential read/write throughput is about 7 times lower (42.3 MB/s vs 309 MB/s) and the random read/write IOPS are about 10 times lower (546 vs 5705).

What I did was set up a virtual machine using OpenStack Juno, which gave me the virtio-blk over RBD setup. Then I modified the relevant part of the libvirt domain XML, from this:

<disk type='network' device='disk'>
  <driver name='qemu' type='raw' cache='writeback'/>
  <auth username='cinder'>
    <secret type='ceph' uuid='482b83f9-be95-448e-87cc-9fa602196590'/>
  </auth>
  <source protocol='rbd' name='vms/c504ea8b-18e6-491e-9470-41c60aa50b81_disk'>
    <host name='192.168.20.105' port='6789'/>
  </source>
  <target dev='vda' bus='virtio'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
</disk>

to this:

<disk type='network' device='disk'>
  <driver name='qemu' type='raw' cache='writeback' discard='unmap'/>
  <auth username='cinder'>
    <secret type='ceph' uuid='482b83f9-be95-448e-87cc-9fa602196590'/>
  </auth>
  <source protocol='rbd' name='vms/c504ea8b-18e6-491e-9470-41c60aa50b81_disk'>
    <host name='192.168.20.105' port='6789'/>
  </source>
  <target dev='vda' bus='scsi'/>
  <controller type='scsi' model='virtio-scsi' index='0'/>
</disk>

The software versions are:

qemu 2.5.1

libvirt 1.2.2

kernel 3.18.0-031800-generic #201412071935 SMP Mon Dec 8 00:36:34 UTC 2014 x86_64 (an Ubuntu 14.04 kernel)

And the hypervisor is KVM.

I don't think the performance difference between virtio-scsi and virtio-blk should be that large. So please point out what I did wrong, and how to achieve reasonable performance.

One constraint is that I want a solution that works with OpenStack (ideally with Juno) without much patching or coding around it. For example, I have heard of virtio-scsi + vhost-scsi + scsi-mq, but that does not seem to be available in OpenStack right now.

Answer

Austin Hemmelgarn · Aug 3, 2017

The simple answer is that VirtIO SCSI is slightly more complex than VirtIO Block. Borrowing the simple description from here:

VirtIO Block has the following layers:

guest: app -> Block Layer -> virtio-blk
host: QEMU -> Block Layer -> Block Device Driver -> Hardware

Whereas VirtIO SCSI looks like this:

guest: app -> Block Layer -> SCSI Layer -> scsi_mod
host: QEMU -> Block Layer -> SCSI Layer -> Block Device Driver -> Hardware

In essence, VirtIO SCSI has to go through another translation layer compared to VirtIO Block.

For most cases using local devices, it will as a result be slower. There are a couple of odd specific cases where the reverse is sometimes true though, namely:

  • Direct passthrough of host SCSI LUNs to the VirtIO SCSI adapter. This is marginally faster because it bypasses the block layer on the host side (see the first sketch below).
  • QEMU native access to iSCSI devices. This is sometimes faster because it avoids the host block and SCSI layers entirely, and doesn't have to translate from VirtIO Block commands to SCSI commands (see the second sketch below).
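
For reference, here is roughly what those two configurations look like in libvirt domain XML. These are minimal sketches only: the device path, IQN, host address, and controller/unit numbers are placeholders, not values from the question's setup.

First, passing a host SCSI LUN straight through to the guest's VirtIO SCSI adapter:

<disk type='block' device='lun'>
  <driver name='qemu' type='raw' cache='none'/>
  <!-- placeholder: the host block node backing the SCSI LUN -->
  <source dev='/dev/sdb'/>
  <target dev='sda' bus='scsi'/>
  <address type='drive' controller='0' bus='0' target='0' unit='0'/>
</disk>

Second, letting QEMU talk to an iSCSI target directly, instead of going through the host's block and SCSI layers:

<disk type='network' device='disk'>
  <driver name='qemu' type='raw'/>
  <!-- placeholder IQN and portal address -->
  <source protocol='iscsi' name='iqn.2016-08.com.example:target/1'>
    <host name='192.168.20.200' port='3260'/>
  </source>
  <target dev='sdb' bus='scsi'/>
</disk>

In both cases the disk attaches to a virtio-scsi controller declared under <devices>, just like the one in the question.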

For the record though, there are three non-performance-related benefits to using VirtIO SCSI over VirtIO Block:

  1. It supports far more devices. VirtIO Block exposes one PCI device per block device, which limits things to around 21-24 devices, whereas VirtIO SCSI uses only one PCI device and can handle an absolutely astronomical number of LUNs on that device (see the example at the end of this answer).
  2. VirtIO SCSI supports the SCSI UNMAP command (TRIM in ATA terms, DISCARD in Linux kernel terms). This is important if you're on thinly provisioned storage.
  3. VirtIO SCSI exposes devices as regular SCSI nodes, whereas VirtIO Block uses a special device major. This isn't usually very important, but can be helpful when converting from a physical system.
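
As a hypothetical illustration of point 1, two (or two hundred) disks can hang off the same virtio-scsi controller, and therefore the same PCI slot. The source paths and addresses below are made-up placeholders:

<controller type='scsi' model='virtio-scsi' index='0'/>
<disk type='block' device='disk'>
  <driver name='qemu' type='raw'/>
  <!-- placeholder backing volume -->
  <source dev='/dev/vg0/disk1'/>
  <target dev='sda' bus='scsi'/>
  <address type='drive' controller='0' bus='0' target='0' unit='0'/>
</disk>
<disk type='block' device='disk'>
  <driver name='qemu' type='raw'/>
  <!-- placeholder backing volume -->
  <source dev='/dev/vg0/disk2'/>
  <target dev='sdb' bus='scsi'/>
  <address type='drive' controller='0' bus='0' target='0' unit='1'/>
</disk>

With VirtIO Block, each of those disks would instead need its own <address type='pci' .../> slot, which is where the 21-24 device ceiling comes from.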