virtio,pc,pci: features, fixes

virtio sound card support
 
 vhost-user: back-end state migration
 
 cxl:
      line length reduction
      enabling fabric management
 
 vhost-vdpa:
      shadow virtqueue hash calculation Support
      shadow virtqueue RSS Support
 
 tests:
     CPU topology related smbios test cases
 
 Fixes, cleanups all over the place
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmVKDDoPHG1zdEByZWRo
 YXQuY29tAAoJECgfDbjSjVRpF08H/0Zts8uvkHbgiOEJw4JMHU6/VaCipfIYsp01
 GSfwYOyEsXJ7GIxKWaCiMnWXEm7tebNCPKf3DoUtcAojQj3vuF9XbWBKw/bfRn83
 nGO/iiwbYViSKxkwqUI+Up5YiN9o0M8gBFrY0kScPezbnYmo5u2bcADdEEq6gH68
 D0Ea8i+WmszL891ypvgCDBL2ObDk3qX3vA5Q6J2I+HKX2ofJM59BwaKwS5ghw+IG
 BmbKXUZJNjUQfN9dQ7vJuiuqdknJ2xUzwW2Vn612ffarbOZB1DZ6ruWlrHty5TjX
 0w4IXEJPBgZYbX9oc6zvTQnbLDBJbDU89mnme0TcmNMKWmQKTtc=
 =vEv+
 -----END PGP SIGNATURE-----

Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging

virtio,pc,pci: features, fixes

virtio sound card support

vhost-user: back-end state migration

cxl:
     line length reduction
     enabling fabric management

vhost-vdpa:
     shadow virtqueue hash calculation Support
     shadow virtqueue RSS Support

tests:
    CPU topology related smbios test cases

Fixes, cleanups all over the place

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmVKDDoPHG1zdEByZWRo
# YXQuY29tAAoJECgfDbjSjVRpF08H/0Zts8uvkHbgiOEJw4JMHU6/VaCipfIYsp01
# GSfwYOyEsXJ7GIxKWaCiMnWXEm7tebNCPKf3DoUtcAojQj3vuF9XbWBKw/bfRn83
# nGO/iiwbYViSKxkwqUI+Up5YiN9o0M8gBFrY0kScPezbnYmo5u2bcADdEEq6gH68
# D0Ea8i+WmszL891ypvgCDBL2ObDk3qX3vA5Q6J2I+HKX2ofJM59BwaKwS5ghw+IG
# BmbKXUZJNjUQfN9dQ7vJuiuqdknJ2xUzwW2Vn612ffarbOZB1DZ6ruWlrHty5TjX
# 0w4IXEJPBgZYbX9oc6zvTQnbLDBJbDU89mnme0TcmNMKWmQKTtc=
# =vEv+
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 07 Nov 2023 18:06:50 HKT
# gpg:                using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg:                issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (63 commits)
  acpi/tests/avocado/bits: enable console logging from bits VM
  acpi/tests/avocado/bits: enforce 32-bit SMBIOS entry point
  hw/cxl: Add tunneled command support to mailbox for switch cci.
  hw/cxl: Add dummy security state get
  hw/cxl/type3: Cleanup multiple CXL_TYPE3() calls in read/write functions
  hw/cxl/mbox: Add Get Background Operation Status Command
  hw/cxl: Add support for device sanitation
  hw/cxl/mbox: Wire up interrupts for background completion
  hw/cxl/mbox: Add support for background operations
  hw/cxl: Implement Physical Ports status retrieval
  hw/pci-bridge/cxl_downstream: Set default link width and link speed
  hw/cxl/mbox: Add Physical Switch Identify command.
  hw/cxl/mbox: Add Information and Status / Identify command
  hw/cxl: Add a switch mailbox CCI function
  hw/pci-bridge/cxl_upstream: Move defintion of device to header.
  hw/cxl/mbox: Generalize the CCI command processing
  hw/cxl/mbox: Pull the CCI definition out of the CXLDeviceState
  hw/cxl/mbox: Split mailbox command payload into separate input and output
  hw/cxl/mbox: Pull the payload out of struct cxl_cmd and make instances constant
  hw/cxl: Fix a QEMU_BUILD_BUG_ON() in switch statement scope issue.
  ...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This commit is contained in:
Stefan Hajnoczi 2023-11-07 18:59:40 +08:00
commit f6b615b52d
53 changed files with 4477 additions and 324 deletions

View file

@ -108,6 +108,43 @@ A vring state description
:num: a 32-bit number
A vring descriptor index for split virtqueues
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+-------------+---------------------+
| vring index | index in avail ring |
+-------------+---------------------+
:vring index: 32-bit index of the respective virtqueue
:index in avail ring: 32-bit value, of which currently only the lower 16
bits are used:
- Bits 015: Index of the next *Available Ring* descriptor that the
back-end will process. This is a free-running index that is not
wrapped by the ring size.
- Bits 1631: Reserved (set to zero)
Vring descriptor indices for packed virtqueues
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+-------------+--------------------+
| vring index | descriptor indices |
+-------------+--------------------+
:vring index: 32-bit index of the respective virtqueue
:descriptor indices: 32-bit value:
- Bits 014: Index of the next *Available Ring* descriptor that the
back-end will process. This is a free-running index that is not
wrapped by the ring size.
- Bit 15: Driver (Available) Ring Wrap Counter
- Bits 1630: Index of the entry in the *Used Ring* where the back-end
will place the next descriptor. This is a free-running index that
is not wrapped by the ring size.
- Bit 31: Device (Used) Ring Wrap Counter
A vring address description
^^^^^^^^^^^^^^^^^^^^^^^^^^^
@ -285,6 +322,32 @@ VhostUserShared
:UUID: 16 bytes UUID, whose first three components (a 32-bit value, then
two 16-bit values) are stored in big endian.
Device state transfer parameters
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+--------------------+-----------------+
| transfer direction | migration phase |
+--------------------+-----------------+
:transfer direction: a 32-bit enum, describing the direction in which
the state is transferred:
- 0: Save: Transfer the state from the back-end to the front-end,
which happens on the source side of migration
- 1: Load: Transfer the state from the front-end to the back-end,
which happens on the destination side of migration
:migration phase: a 32-bit enum, describing the state in which the VM
guest and devices are:
- 0: Stopped (in the period after the transfer of memory-mapped
regions before switch-over to the destination): The VM guest is
stopped, and the vhost-user device is suspended (see
:ref:`Suspended device state <suspended_device_state>`).
In the future, additional phases might be added e.g. to allow
iterative migration while the device is running.
C structure
-----------
@ -344,6 +407,7 @@ in the ancillary data:
* ``VHOST_USER_SET_VRING_ERR``
* ``VHOST_USER_SET_BACKEND_REQ_FD`` (previous name ``VHOST_USER_SET_SLAVE_REQ_FD``)
* ``VHOST_USER_SET_INFLIGHT_FD`` (if ``VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD``)
* ``VHOST_USER_SET_DEVICE_STATE_FD``
If *front-end* is unable to send the full message or receives a wrong
reply it will close the connection. An optional reconnection mechanism
@ -374,35 +438,50 @@ negotiation.
Ring states
-----------
Rings can be in one of three states:
Rings have two independent states: started/stopped, and enabled/disabled.
* stopped: the back-end must not process the ring at all.
* While a ring is stopped, the back-end must not process the ring at
all, regardless of whether it is enabled or disabled. The
enabled/disabled state should still be tracked, though, so it can come
into effect once the ring is started.
* started but disabled: the back-end must process the ring without
* started and disabled: The back-end must process the ring without
causing any side effects. For example, for a networking device,
in the disabled state the back-end must not supply any new RX packets,
but must process and discard any TX packets.
* started and enabled.
* started and enabled: The back-end must process the ring normally, i.e.
process all requests and execute them.
Each ring is initialized in a stopped state. The back-end must start
ring upon receiving a kick (that is, detecting that file descriptor is
readable) on the descriptor specified by ``VHOST_USER_SET_VRING_KICK``
or receiving the in-band message ``VHOST_USER_VRING_KICK`` if negotiated,
and stop ring upon receiving ``VHOST_USER_GET_VRING_BASE``.
Each ring is initialized in a stopped and disabled state. The back-end
must start a ring upon receiving a kick (that is, detecting that file
descriptor is readable) on the descriptor specified by
``VHOST_USER_SET_VRING_KICK`` or receiving the in-band message
``VHOST_USER_VRING_KICK`` if negotiated, and stop a ring upon receiving
``VHOST_USER_GET_VRING_BASE``.
Rings can be enabled or disabled by ``VHOST_USER_SET_VRING_ENABLE``.
If ``VHOST_USER_F_PROTOCOL_FEATURES`` has not been negotiated, the
ring starts directly in the enabled state.
If ``VHOST_USER_F_PROTOCOL_FEATURES`` has been negotiated, the ring is
initialized in a disabled state and is enabled by
``VHOST_USER_SET_VRING_ENABLE`` with parameter 1.
In addition, upon receiving a ``VHOST_USER_SET_FEATURES`` message from
the front-end without ``VHOST_USER_F_PROTOCOL_FEATURES`` set, the
back-end must enable all rings immediately.
While processing the rings (whether they are enabled or not), the back-end
must support changing some configuration aspects on the fly.
.. _suspended_device_state:
Suspended device state
^^^^^^^^^^^^^^^^^^^^^^
While all vrings are stopped, the device is *suspended*. In addition to
not processing any vring (because they are stopped), the device must:
* not write to any guest memory regions,
* not send any notifications to the guest,
* not send any messages to the front-end,
* still process and reply to messages from the front-end.
Multiple queue support
----------------------
@ -490,7 +569,8 @@ ancillary data, it may be used to inform the front-end that the log has
been modified.
Once the source has finished migration, rings will be stopped by the
source. No further update must be done before rings are restarted.
source (:ref:`Suspended device state <suspended_device_state>`). No
further update must be done before rings are restarted.
In postcopy migration the back-end is started before all the memory has
been received from the source host, and care must be taken to avoid
@ -502,6 +582,80 @@ it performs WAKE ioctl's on the userfaultfd to wake the stalled
back-end. The front-end indicates support for this via the
``VHOST_USER_PROTOCOL_F_PAGEFAULT`` feature.
.. _migrating_backend_state:
Migrating back-end state
^^^^^^^^^^^^^^^^^^^^^^^^
Migrating device state involves transferring the state from one
back-end, called the source, to another back-end, called the
destination. After migration, the destination transparently resumes
operation without requiring the driver to re-initialize the device at
the VIRTIO level. If the migration fails, then the source can
transparently resume operation until another migration attempt is made.
Generally, the front-end is connected to a virtual machine guest (which
contains the driver), which has its own state to transfer between source
and destination, and therefore will have an implementation-specific
mechanism to do so. The ``VHOST_USER_PROTOCOL_F_DEVICE_STATE`` feature
provides functionality to have the front-end include the back-end's
state in this transfer operation so the back-end does not need to
implement its own mechanism, and so the virtual machine may have its
complete state, including vhost-user devices' states, contained within a
single stream of data.
To do this, the back-end state is transferred from back-end to front-end
on the source side, and vice versa on the destination side. This
transfer happens over a channel that is negotiated using the
``VHOST_USER_SET_DEVICE_STATE_FD`` message. This message has two
parameters:
* Direction of transfer: On the source, the data is saved, transferring
it from the back-end to the front-end. On the destination, the data
is loaded, transferring it from the front-end to the back-end.
* Migration phase: Currently, the only supported phase is the period
after the transfer of memory-mapped regions before switch-over to the
destination, when both the source and destination devices are
suspended (:ref:`Suspended device state <suspended_device_state>`).
In the future, additional phases might be supported to allow iterative
migration while the device is running.
The nature of the channel is implementation-defined, but it must
generally behave like a pipe: The writing end will write all the data it
has into it, signalling the end of data by closing its end. The reading
end must read all of this data (until encountering the end of file) and
process it.
* When saving, the writing end is the source back-end, and the reading
end is the source front-end. After reading the state data from the
channel, the source front-end must transfer it to the destination
front-end through an implementation-defined mechanism.
* When loading, the writing end is the destination front-end, and the
reading end is the destination back-end. After reading the state data
from the channel, the destination back-end must deserialize its
internal state from that data and set itself up to allow the driver to
seamlessly resume operation on the VIRTIO level.
Seamlessly resuming operation means that the migration must be
transparent to the guest driver, which operates on the VIRTIO level.
This driver will not perform any re-initialization steps, but continue
to use the device as if no migration had occurred. The vhost-user
front-end, however, will re-initialize the vhost state on the
destination, following the usual protocol for establishing a connection
to a vhost-user back-end: This includes, for example, setting up memory
mappings and kick and call FDs as necessary, negotiating protocol
features, or setting the initial vring base indices (to the same value
as on the source side, so that operation can resume).
Both on the source and on the destination side, after the respective
front-end has seen all data transferred (when the transfer FD has been
closed), it sends the ``VHOST_USER_CHECK_DEVICE_STATE`` message to
verify that data transfer was successful in the back-end, too. The
back-end responds once it knows whether the transfer and processing was
successful or not.
Memory access
-------------
@ -896,6 +1050,7 @@ Protocol features
#define VHOST_USER_PROTOCOL_F_STATUS 16
#define VHOST_USER_PROTOCOL_F_XEN_MMAP 17
#define VHOST_USER_PROTOCOL_F_SHARED_OBJECT 18
#define VHOST_USER_PROTOCOL_F_DEVICE_STATE 19
Front-end message types
-----------------------
@ -1042,18 +1197,54 @@ Front-end message types
``VHOST_USER_SET_VRING_BASE``
:id: 10
:equivalent ioctl: ``VHOST_SET_VRING_BASE``
:request payload: vring state description
:request payload: vring descriptor index/indices
:reply payload: N/A
Sets the base offset in the available vring.
Sets the next index to use for descriptors in this vring:
* For a split virtqueue, sets only the next descriptor index to
process in the *Available Ring*. The device is supposed to read the
next index in the *Used Ring* from the respective vring structure in
guest memory.
* For a packed virtqueue, both indices are supplied, as they are not
explicitly available in memory.
Consequently, the payload type is specific to the type of virt queue
(*a vring descriptor index for split virtqueues* vs. *vring descriptor
indices for packed virtqueues*).
``VHOST_USER_GET_VRING_BASE``
:id: 11
:equivalent ioctl: ``VHOST_USER_GET_VRING_BASE``
:request payload: vring state description
:reply payload: vring state description
:reply payload: vring descriptor index/indices
Get the available vring base offset.
Stops the vring and returns the current descriptor index or indices:
* For a split virtqueue, returns only the 16-bit next descriptor
index to process in the *Available Ring*. Note that this may
differ from the available ring index in the vring structure in
memory, which points to where the driver will put new available
descriptors. For the *Used Ring*, the device only needs the next
descriptor index at which to put new descriptors, which is the
value in the vring structure in memory, so this value is not
covered by this message.
* For a packed virtqueue, neither index is explicitly available to
read from memory, so both indices (as maintained by the device) are
returned.
Consequently, the payload type is specific to the type of virt queue
(*a vring descriptor index for split virtqueues* vs. *vring descriptor
indices for packed virtqueues*).
When and as long as all of a devices vrings are stopped, it is
*suspended*, see :ref:`Suspended device state
<suspended_device_state>`.
The request payloads *num* field is currently reserved and must be
set to 0.
``VHOST_USER_SET_VRING_KICK``
:id: 12
@ -1464,6 +1655,76 @@ Front-end message types
the requested UUID. Back-end will reply passing the fd when the operation
is successful, or no fd otherwise.
``VHOST_USER_SET_DEVICE_STATE_FD``
:id: 42
:equivalent ioctl: N/A
:request payload: device state transfer parameters
:reply payload: ``u64``
Front-end and back-end negotiate a channel over which to transfer the
back-ends internal state during migration. Either side (front-end or
back-end) may create the channel. The nature of this channel is not
restricted or defined in this document, but whichever side creates it
must create a file descriptor that is provided to the respectively
other side, allowing access to the channel. This FD must behave as
follows:
* For the writing end, it must allow writing the whole back-end state
sequentially. Closing the file descriptor signals the end of
transfer.
* For the reading end, it must allow reading the whole back-end state
sequentially. The end of file signals the end of the transfer.
For example, the channel may be a pipe, in which case the two ends of
the pipe fulfill these requirements respectively.
Initially, the front-end creates a channel along with such an FD. It
passes the FD to the back-end as ancillary data of a
``VHOST_USER_SET_DEVICE_STATE_FD`` message. The back-end may create a
different transfer channel, passing the respective FD back to the
front-end as ancillary data of the reply. If so, the front-end must
then discard its channel and use the one provided by the back-end.
Whether the back-end should decide to use its own channel is decided
based on efficiency: If the channel is a pipe, both ends will most
likely need to copy data into and out of it. Any channel that allows
for more efficient processing on at least one end, e.g. through
zero-copy, is considered more efficient and thus preferred. If the
back-end can provide such a channel, it should decide to use it.
The request payload contains parameters for the subsequent data
transfer, as described in the :ref:`Migrating back-end state
<migrating_backend_state>` section.
The value returned is both an indication for success, and whether a
file descriptor for a back-end-provided channel is returned: Bits 07
are 0 on success, and non-zero on error. Bit 8 is the invalid FD
flag; this flag is set when there is no file descriptor returned.
When this flag is not set, the front-end must use the returned file
descriptor as its end of the transfer channel. The back-end must not
both indicate an error and return a file descriptor.
Using this function requires prior negotiation of the
``VHOST_USER_PROTOCOL_F_DEVICE_STATE`` feature.
``VHOST_USER_CHECK_DEVICE_STATE``
:id: 43
:equivalent ioctl: N/A
:request payload: N/A
:reply payload: ``u64``
After transferring the back-ends internal state during migration (see
the :ref:`Migrating back-end state <migrating_backend_state>`
section), check whether the back-end was able to successfully fully
process the state.
The value returned indicates success or error; 0 is success, any
non-zero value is an error.
Using this function requires prior negotiation of the
``VHOST_USER_PROTOCOL_F_DEVICE_STATE`` feature.
Back-end message types
----------------------

View file

@ -93,6 +93,7 @@ Emulated Devices
devices/vhost-user.rst
devices/virtio-gpu.rst
devices/virtio-pmem.rst
devices/virtio-snd.rst
devices/vhost-user-rng.rst
devices/canokey.rst
devices/usb-u2f.rst

View file

@ -0,0 +1,49 @@
virtio sound
============
This document explains the setup and usage of the Virtio sound device.
The Virtio sound device is a paravirtualized sound card device.
Linux kernel support
--------------------
Virtio sound requires a guest Linux kernel built with the
``CONFIG_SND_VIRTIO`` option.
Description
-----------
Virtio sound implements capture and playback from inside a guest using the
configured audio backend of the host machine.
Device properties
-----------------
The Virtio sound device can be configured with the following properties:
* ``jacks`` number of physical jacks (Unimplemented).
* ``streams`` number of PCM streams. At the moment, no stream configuration is supported: the first one will always be a playback stream, an optional second will always be a capture stream. Adding more will cycle stream directions from playback to capture.
* ``chmaps`` number of channel maps (Unimplemented).
All streams are stereo and have the default channel positions ``Front left, right``.
Examples
--------
Add an audio device and an audio backend at once with ``-audio`` and ``model=virtio``:
* pulseaudio: ``-audio driver=pa,model=virtio``
or ``-audio driver=pa,model=virtio,server=/run/user/1000/pulse/native``
* sdl: ``-audio driver=sdl,model=virtio``
* coreaudio: ``-audio driver=coreaudio,model=virtio``
etc.
To specifically add virtualized sound devices, you have to specify a PCI device
and an audio backend listed with ``-audio driver=help`` that works on your host
machine, e.g.:
::
-device virtio-sound-pci,audiodev=my_audiodev \
-audiodev alsa,id=my_audiodev