Commit graph

998 commits

Author SHA1 Message Date
Steve Sistare
6f06e3729a vfio/pci: export MSI functions
Export various MSI functions, renamed with a vfio_pci prefix, for use by
CPR in subsequent patches.  No functional change.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/qemu-devel/1749569991-25171-18-git-send-email-steven.sistare@oracle.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-06-11 14:01:58 +02:00
Steve Sistare
6d7696f329 vfio/pci: vfio_notifier_cleanup
Move event_notifier_cleanup calls to a helper vfio_notifier_cleanup.
This version is trivial, and does not yet use the vdev and nr parameters.
No functional change.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/qemu-devel/1749569991-25171-17-git-send-email-steven.sistare@oracle.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-06-11 14:01:58 +02:00
Steve Sistare
c2559182c8 vfio/pci: vfio_notifier_init cpr parameters
Pass vdev and nr to vfio_notifier_init, for use by CPR in a subsequent
patch.  No functional change.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/qemu-devel/1749569991-25171-16-git-send-email-steven.sistare@oracle.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-06-11 14:01:58 +02:00
Steve Sistare
d364d802fe vfio/pci: pass vector to virq functions
Pass the vector number to vfio_connect_kvm_msi_virq and
vfio_remove_kvm_msi_virq, so it can be passed to their subroutines in
a subsequent patch.  No functional change.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/qemu-devel/1749569991-25171-15-git-send-email-steven.sistare@oracle.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-06-11 14:01:58 +02:00
Steve Sistare
8f5c696026 vfio/pci: vfio_notifier_init
Move event_notifier_init calls to a helper vfio_notifier_init.
This version is trivial, but it will be expanded to support CPR
in subsequent patches.  No functional change.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/qemu-devel/1749569991-25171-14-git-send-email-steven.sistare@oracle.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-06-11 14:01:58 +02:00
Steve Sistare
906f524ef1 vfio/pci: vfio_pci_vector_init
Extract a subroutine vfio_pci_vector_init.  No functional change.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/qemu-devel/1749569991-25171-13-git-send-email-steven.sistare@oracle.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-06-11 14:01:58 +02:00
Steve Sistare
031fbb7110 vfio-pci: skip reset during cpr
Do not reset a vfio-pci device during CPR, and do not complain if the
kernel's PCI config space changes for non-emulated bits between the
vmstate save and load, which can happen due to ongoing interrupt activity.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/qemu-devel/1749569991-25171-12-git-send-email-steven.sistare@oracle.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-06-11 14:01:58 +02:00
Steve Sistare
24c156dcd9 pci: skip reset during cpr
Do not reset a vfio-pci device during CPR.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://lore.kernel.org/qemu-devel/1749576403-25355-1-git-send-email-steven.sistare@oracle.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-06-11 14:01:58 +02:00
Steve Sistare
eba1f657cb vfio/container: recover from unmap-all-vaddr failure
If there are multiple containers and unmap-all fails for some container, we
need to remap vaddr for the other containers for which unmap-all succeeded.
Recover by walking all address ranges of all containers to restore the vaddr
for each.  Do so by invoking the vfio listener callback, and passing a new
"remap" flag that tells it to restore a mapping without re-allocating new
userland data structures.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/qemu-devel/1749569991-25171-9-git-send-email-steven.sistare@oracle.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-06-11 14:01:58 +02:00
Steve Sistare
dac0dd68d9 vfio/container: mdev cpr blocker
During CPR, after VFIO_DMA_UNMAP_FLAG_VADDR, the vaddr is temporarily
invalid, so mediated devices cannot be supported.  Add a blocker for them.
This restriction will not apply to iommufd containers when CPR is added
for them in a future patch.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/qemu-devel/1749569991-25171-8-git-send-email-steven.sistare@oracle.com
[ clg: Fixed context change in VFIODevice ]
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-06-11 14:01:58 +02:00
Steve Sistare
7e9f214113 vfio/container: restore DMA vaddr
In new QEMU, do not register the memory listener at device creation time.
Register it later, in the container post_load handler, after all vmstate
that may affect regions and mapping boundaries has been loaded.  The
post_load registration will cause the listener to invoke its callback on
each flat section, and the calls will match the mappings remembered by the
kernel.

The listener calls a special dma_map handler that passes the new VA of each
section to the kernel using VFIO_DMA_MAP_FLAG_VADDR.  Restore the normal
handler at the end.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/qemu-devel/1749569991-25171-7-git-send-email-steven.sistare@oracle.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-06-11 14:01:58 +02:00
Steve Sistare
1faadd9630 vfio/container: discard old DMA vaddr
In the container pre_save handler, discard the virtual addresses in DMA
mappings with VFIO_DMA_UNMAP_FLAG_VADDR, because guest RAM will be
remapped at a different VA after in new QEMU.  DMA to already-mapped
pages continues.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/qemu-devel/1749569991-25171-6-git-send-email-steven.sistare@oracle.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-06-11 14:01:58 +02:00
Steve Sistare
c29a65ed68 vfio/container: preserve descriptors
At vfio creation time, save the value of vfio container, group, and device
descriptors in CPR state.  On qemu restart, vfio_realize() finds and uses
the saved descriptors.

During reuse, device and iommu state is already configured, so operations
in vfio_realize that would modify the configuration, such as vfio ioctl's,
are skipped.  The result is that vfio_realize constructs qemu data
structures that reflect the current state of the device.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Link: https://lore.kernel.org/qemu-devel/1749569991-25171-5-git-send-email-steven.sistare@oracle.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-06-11 14:01:58 +02:00
Steve Sistare
54857b0816 vfio/container: register container for cpr
Register a legacy container for cpr-transfer, replacing the generic CPR
register call with a more specific legacy container register call.  Add a
blocker if the kernel does not support VFIO_UPDATE_VADDR or VFIO_UNMAP_ALL.

This is mostly boiler plate.  The fields to to saved and restored are added
in subsequent patches.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/qemu-devel/1749569991-25171-4-git-send-email-steven.sistare@oracle.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-06-11 14:01:58 +02:00
John Levon
a574b06144 vfio: mark posted writes in region write callbacks
For vfio-user, the region write implementation needs to know if the
write is posted; add the necessary plumbing to support this.

Signed-off-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/qemu-devel/20250607001056.335310-5-john.levon@nutanix.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-06-11 14:01:58 +02:00
John Levon
59adfc6f18 vfio: add per-region fd support
For vfio-user, each region has its own fd rather than sharing
vbasedev's. Add the necessary plumbing to support this, and use the
correct fd in vfio_region_mmap().

Signed-off-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/qemu-devel/20250607001056.335310-4-john.levon@nutanix.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-06-11 14:01:58 +02:00
John Levon
7163c0bca7 vfio: export PCI helpers needed for vfio-user
The vfio-user code will need to re-use various parts of the vfio PCI
code. Export them in hw/vfio/pci.h, and rename them to the vfio_pci_*
namespace.

Signed-off-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Mark Cave-Ayland <mark.caveayland@nutanix.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/qemu-devel/20250607001056.335310-2-john.levon@nutanix.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-06-11 14:01:58 +02:00
Rorie Reyes
fd03360215 hw/vfio/ap: Storing event information for an AP configuration change event
These functions can be invoked by the function that handles interception
of the CHSC SEI instruction for requests indicating the accessibility of
one or more adjunct processors has changed.

Signed-off-by: Rorie Reyes <rreyes@linux.ibm.com>
Reviewed-by: Anthony Krowiak <akrowiak@linux.ibm.com>
Link: https://lore.kernel.org/qemu-devel/20250609164418.17585-4-rreyes@linux.ibm.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-06-11 14:01:58 +02:00
Rorie Reyes
bc36d14e13 hw/vfio/ap: store object indicating AP config changed in a queue
Creates an object indicating that an AP configuration change event
has been received and stores it in a queue. These objects will later
be used to store event information for an AP configuration change
when the CHSC instruction is intercepted.

Signed-off-by: Rorie Reyes <rreyes@linux.ibm.com>
Reviewed-by: Anthony Krowiak <akrowiak@linux.ibm.com>
Link: https://lore.kernel.org/qemu-devel/20250609164418.17585-3-rreyes@linux.ibm.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-06-11 14:01:58 +02:00
Rorie Reyes
0fb8a62fe4 hw/vfio/ap: notification handler for AP config changed event
Register an event notifier handler to process AP configuration
change events by queuing the event and generating a CRW to let
the guest know its AP configuration has changed

Signed-off-by: Rorie Reyes <rreyes@linux.ibm.com>
Reviewed-by: Anthony Krowiak <akrowiak@linux.ibm.com>
Link: https://lore.kernel.org/qemu-devel/20250609164418.17585-2-rreyes@linux.ibm.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-06-11 14:01:58 +02:00
Zhenzhong Duan
6c4f752ea7 vfio/pci: Fix instance_size of VFIO_PCI_BASE
Currently the final instance_size of VFIO_PCI_BASE is sizeof(PCIDevice).
It should be sizeof(VFIOPCIDevice), VFIO_PCI uses same structure as
base class VFIO_PCI_BASE, so no need to set its instance_size explicitly.

This isn't catastrophic only because VFIO_PCI_BASE is an abstract class.

Fixes: d4e392d0a9 ("vfio: add vfio-pci-base class")
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
Link: https://lore.kernel.org/qemu-devel/20250611024228.423666-1-zhenzhong.duan@intel.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-06-11 14:01:58 +02:00
Zhenzhong Duan
144227dbc4 vfio/container: Fix vfio_listener_commit()
It's wrong to call into listener_begin callback in vfio_listener_commit().
Currently this impacts vfio-user.

Fixes: d9b7d8b699 ("vfio/container: pass listener_begin/commit callbacks")
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: John Levon <john.levon@nutanix.com>
Link: https://lore.kernel.org/qemu-devel/20250609115433.401775-1-zhenzhong.duan@intel.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-06-11 14:01:58 +02:00
Steve Sistare
3ed34463a2 vfio: move vfio-cpr.h
Move vfio-cpr.h to include/hw/vfio, because it will need to be included by
other files there.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Link: https://lore.kernel.org/qemu-devel/1748546679-154091-9-git-send-email-steven.sistare@oracle.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-06-05 10:40:38 +02:00
Steve Sistare
2372f8d94a vfio: vfio_find_ram_discard_listener
Define vfio_find_ram_discard_listener as a subroutine so additional calls to
it may be added in a subsequent patch.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Link: https://lore.kernel.org/qemu-devel/1748546679-154091-8-git-send-email-steven.sistare@oracle.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-06-05 10:40:38 +02:00
Zhenzhong Duan
1ab3d93fd2 vfio/iommufd: Save vendor specific device info
Some device information returned by ioctl(IOMMU_GET_HW_INFO) are vendor
specific. Save them as raw data in a union supporting different vendors,
then vendor IOMMU can query the raw data with its fixed format for
capability directly.

Because IOMMU_GET_HW_INFO is only supported in linux, so declare those
capability related structures with CONFIG_LINUX.

Suggested-by: Eric Auger <eric.auger@redhat.com>
Suggested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Link: https://lore.kernel.org/qemu-devel/20250604062115.4004200-5-zhenzhong.duan@intel.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-06-05 10:40:38 +02:00
Zhenzhong Duan
e50a3ead97 vfio/iommufd: Implement [at|de]tach_hwpt handlers
Implement [at|de]tach_hwpt handlers in VFIO subsystem. vIOMMU
utilizes them to attach to or detach from hwpt on host side.

Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Link: https://lore.kernel.org/qemu-devel/20250604062115.4004200-4-zhenzhong.duan@intel.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-06-05 10:40:38 +02:00
Zhenzhong Duan
98962d6298 vfio/iommufd: Add properties and handlers to TYPE_HOST_IOMMU_DEVICE_IOMMUFD
Enhance HostIOMMUDeviceIOMMUFD object with 3 new members, specific
to the iommufd BE + 2 new class functions.

IOMMUFD BE includes IOMMUFD handle, devid and hwpt_id. IOMMUFD handle
and devid are used to allocate/free ioas and hwpt. hwpt_id is used to
re-attach IOMMUFD backed device to its default VFIO sub-system created
hwpt, i.e., when vIOMMU is disabled by guest. These properties are
initialized in hiod::realize() after attachment.

2 new class functions are [at|de]tach_hwpt(). They are used to
attach/detach hwpt. VFIO and VDPA can have different implementions,
so implementation will be in sub-class instead of HostIOMMUDeviceIOMMUFD,
e.g., in HostIOMMUDeviceIOMMUFDVFIO.

Add two wrappers host_iommu_device_iommufd_[at|de]tach_hwpt to wrap the
two functions.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Link: https://lore.kernel.org/qemu-devel/20250604062115.4004200-3-zhenzhong.duan@intel.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-06-05 10:40:38 +02:00
John Levon
44d0acf834 vfio/container: pass MemoryRegion to DMA operations
Pass through the MemoryRegion to DMA operation handlers of vfio
containers. The vfio-user container will need this later, to translate
the vaddr into an offset for the dma map vfio-user message; CPR will
also will need this.

Originally-by: John Johnson <john.g.johnson@oracle.com>
Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Signed-off-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Steve Sistare <steven.sistare@oracle.com>
Link: https://lore.kernel.org/qemu-devel/20250521215534.2688540-1-john.levon@nutanix.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-06-05 10:40:38 +02:00
Steve Sistare
e3353d63e1 vfio: return mr from vfio_get_xlat_addr
Modify memory_get_xlat_addr and vfio_get_xlat_addr to return the memory
region that the translated address is found in.  This will be needed by
CPR in a subsequent patch to map blocks using IOMMU_IOAS_MAP_FILE.

Also return the xlat offset, so we can simplify the interface by removing
the out parameters that can be trivially derived from mr and xlat.

Lastly, rename the functions to  to memory_translate_iotlb() and
vfio_translate_iotlb().

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: John Levon <john.levon@nutanix.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Link: https://lore.kernel.org/qemu-devel/1747661203-136490-1-git-send-email-steven.sistare@oracle.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-06-05 10:40:38 +02:00
Tomita Moeko
0992ea07db vfio/igd: Fix incorrect error propagation in vfio_pci_igd_opregion_detect()
In vfio_pci_igd_opregion_detect(), errp will be set when the device does
not have OpRegion or is hotplugged. This errp will be propagated to
pci_qdev_realize(), which interprets it as failure, causing unexpected
termination on devices without OpRegion like SR-IOV VFs or discrete
GPUs. Fix it by not setting errp in vfio_pci_igd_opregion_detect().

This patch also checks if the device has OpRegion before hotplug status
to prevent unwanted warning messages on non-IGD devices.

Fixes: c0273e77f2 ("vfio/igd: Detect IGD device by OpRegion")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2968
Reported-by: Edmund Raile <edmund.raile@protonmail.com>
Link: https://lore.kernel.org/qemu-devel/30044d14-17ec-46e3-b9c3-63d27a5bde27@gmail.com
Tested-by: Edmund Raile <edmund.raile@protonmail.com>
Signed-off-by: Tomita Moeko <tomitamoeko@gmail.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Corvin Köhne <c.koehne@beckhoff.com>
Link: https://lore.kernel.org/qemu-devel/20250522151636.20001-1-tomitamoeko@gmail.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-06-05 10:40:38 +02:00
Zhenzhong Duan
1c729ca886 vfio/iommufd: Add comment emphasizing no movement of hiod->realize() call
The nested IOMMU support needs device and hwpt id which are generated
only after attachment. Hiod encapsulates these information in realize()
and passes to vIOMMU.

Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/qemu-devel/20250521110301.3313877-1-zhenzhong.duan@intel.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-06-05 10:40:38 +02:00
John Levon
a483ad5347 vfio: refactor out IRQ signalling setup
This makes for a slightly more readable vfio_msix_vector_do_use()
implementation, and we will rely on this shortly.

Signed-off-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/qemu-devel/20250520150419.2172078-5-john.levon@nutanix.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-06-05 10:40:38 +02:00
John Levon
1539945946 vfio: move config space read into vfio_pci_config_setup()
Small cleanup that reduces duplicate code for vfio-user and reduces the
size of vfio_realize(); while we're here, correct that name to
vfio_pci_realize().

Signed-off-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/qemu-devel/20250520150419.2172078-4-john.levon@nutanix.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-06-05 10:40:38 +02:00
John Levon
33528f255a vfio: move more cleanup into vfio_pci_put_device()
All of the cleanup can be done in the same place, and vfio-user will
want to do the same.

Signed-off-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/qemu-devel/20250520150419.2172078-3-john.levon@nutanix.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-06-05 10:40:38 +02:00
Edmund Raile
aca0a50452 vfio/igd: OpRegion not found fix error typo
Signed-off-by: Edmund Raile <edmund.raile@protonmail.com>
Reviewed-by: Tomita Moeko <tomitamoeko@gmail.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/qemu-devel/MFFbQoTpea_CK5ELq8oJ-a3Q57wo7ywQlrIqDvdIDKhUuCm59VUz2QzvdojO5r_wb_7SHifU0Kym3loj4eASPhdzYpjtiMCTePzyg1zrroo=@protonmail.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-06-05 10:40:38 +02:00
Haoqian He
e0f300b36d system/runstate: add VM state change cb with return value
This patch adds the new VM state change cb type `VMChangeStateHandlerWithRet`,
which has return value for `VMChangeStateEntry`.

Thus, we can register a new VM state change cb with return value for device.
Note that `VMChangeStateHandler` and `VMChangeStateHandlerWithRet` are mutually
exclusive and cannot be provided at the same time.

This patch is the pre patch for 'vhost-user: return failure if backend crashes
when live migration', which makes the live migration aware of the loss of
connection with the vhost-user backend and aborts the live migration.

Virtio device will use VMChangeStateHandlerWithRet.

Signed-off-by: Haoqian He <haoqian.he@smartx.com>
Message-Id: <20250416024729.3289157-2-haoqian.he@smartx.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-05-14 05:39:14 -04:00
Stefan Hajnoczi
7be29f2f1a vfio queue:
* Preparatory changes for the introduction of CPR support
 * Automatic enablement of OpRegion for IGD device passthrough
 * Linux headers update
 * Preparatory changes for the introduction of vfio-user
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEoPZlSPBIlev+awtgUaNDx8/77KEFAmgd/0kACgkQUaNDx8/7
 7KHRmRAArw1PXMCmoVBBeLcZ8BZPGjBZHtsvRzwS1yhVnNQadlpDlq4wd9HrfDFK
 BTr7//Ag2Q1dKgibesh0A8hSjorXHUGQCmdkcCuGGTFnEwC86q5jCH1lUxgI0cs5
 3bVwc43zhXGoKqmo07g4f2UFbjDYHe89LgWz2c7TFFGz7Tda/LCOdhnmXlXcIwz+
 v1ocutXd7VbDWvUzN7uZbf0SIH3Zj3p96dwmpLDtdzdliDA0JidNvS27+Z5gtvWe
 O+1NW9MDzNfd6zLXCxL3GLeT61WZCe1dRCHEPX4cBo+DhnrifsC25DtJwYlDFvi2
 NMFfGzdKcEVSpeDp7WeM6MJgCZsGHC7ytmAKOKgN2M2kFSj3SI3sTFNlE1rzUhe6
 yjjCa59HzNLIi7L7xYCrVtCLGC/VXOp9kh67Sjs7FY7v778QUEdiudFBdBki7Bwh
 bpRhdFJgCLHuKc6XrM7hsMnsRyM28MywyfHDo3M/pRSFNKfeImW6zSMXnyncZztK
 W8e8OIz2DBMfH8pIu8hPw9Gsm5VAAs4aVmVFNa0CLl0oBko0Ew2YXcA5pTK5gGqv
 x24uc/BhbLcfFUtK0OnP4N/B4rcoADebPV2u4eWoUK3aF5u4+7BY235bFuoTj+sb
 55DPDyWm5cmkX58Tdq46tD39dbD1hlUYkcydPbANH51wYx/lPpc=
 =OqYP
 -----END PGP SIGNATURE-----

Merge tag 'pull-vfio-20250509' of https://github.com/legoater/qemu into staging

vfio queue:

* Preparatory changes for the introduction of CPR support
* Automatic enablement of OpRegion for IGD device passthrough
* Linux headers update
* Preparatory changes for the introduction of vfio-user

# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEoPZlSPBIlev+awtgUaNDx8/77KEFAmgd/0kACgkQUaNDx8/7
# 7KHRmRAArw1PXMCmoVBBeLcZ8BZPGjBZHtsvRzwS1yhVnNQadlpDlq4wd9HrfDFK
# BTr7//Ag2Q1dKgibesh0A8hSjorXHUGQCmdkcCuGGTFnEwC86q5jCH1lUxgI0cs5
# 3bVwc43zhXGoKqmo07g4f2UFbjDYHe89LgWz2c7TFFGz7Tda/LCOdhnmXlXcIwz+
# v1ocutXd7VbDWvUzN7uZbf0SIH3Zj3p96dwmpLDtdzdliDA0JidNvS27+Z5gtvWe
# O+1NW9MDzNfd6zLXCxL3GLeT61WZCe1dRCHEPX4cBo+DhnrifsC25DtJwYlDFvi2
# NMFfGzdKcEVSpeDp7WeM6MJgCZsGHC7ytmAKOKgN2M2kFSj3SI3sTFNlE1rzUhe6
# yjjCa59HzNLIi7L7xYCrVtCLGC/VXOp9kh67Sjs7FY7v778QUEdiudFBdBki7Bwh
# bpRhdFJgCLHuKc6XrM7hsMnsRyM28MywyfHDo3M/pRSFNKfeImW6zSMXnyncZztK
# W8e8OIz2DBMfH8pIu8hPw9Gsm5VAAs4aVmVFNa0CLl0oBko0Ew2YXcA5pTK5gGqv
# x24uc/BhbLcfFUtK0OnP4N/B4rcoADebPV2u4eWoUK3aF5u4+7BY235bFuoTj+sb
# 55DPDyWm5cmkX58Tdq46tD39dbD1hlUYkcydPbANH51wYx/lPpc=
# =OqYP
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 09 May 2025 09:12:41 EDT
# gpg:                using RSA key A0F66548F04895EBFE6B0B6051A343C7CFFBECA1
# gpg: Good signature from "Cédric Le Goater <clg@redhat.com>" [full]
# gpg:                 aka "Cédric Le Goater <clg@kaod.org>" [full]
# Primary key fingerprint: A0F6 6548 F048 95EB FE6B  0B60 51A3 43C7 CFFB ECA1

* tag 'pull-vfio-20250509' of https://github.com/legoater/qemu: (28 commits)
  vfio/container: pass listener_begin/commit callbacks
  vfio: add vfio-pci-base class
  vfio: add read/write to device IO ops vector
  vfio: add region info cache
  vfio: add device IO ops vector
  vfio: implement unmap all for DMA unmap callbacks
  vfio: add unmap_all flag to DMA unmap callback
  vfio: add vfio_pci_config_space_read/write()
  vfio: add strread/writeerror()
  vfio: consistently handle return value for helpers
  vfio: add vfio_device_get_irq_info() helper
  vfio: add vfio_attach_device_by_iommu_type()
  vfio: add vfio_device_unprepare()
  vfio: add vfio_device_prepare()
  linux-headers: Update to Linux v6.15-rc3
  linux-header: update-linux-header script changes
  vfio/igd: Remove generation limitation for IGD passthrough
  vfio/igd: Only emulate GGC register when x-igd-gms is set
  vfio/igd: Allow overriding GMS with 0xf0 to 0xfe on Gen9+
  vfio/igd: Enable OpRegion by default
  ...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2025-05-09 12:04:35 -04:00
John Levon
d9b7d8b699 vfio/container: pass listener_begin/commit callbacks
The vfio-user container will later need to hook into these callbacks;
set up vfio to use them, and optionally pass them through to the
container.

Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: John Levon <john.levon@nutanix.com>
Link: https://lore.kernel.org/qemu-devel/20250507152020.1254632-15-john.levon@nutanix.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-05-09 12:42:28 +02:00
John Levon
d4e392d0a9 vfio: add vfio-pci-base class
Split out parts of TYPE_VFIO_PCI into a base TYPE_VFIO_PCI_BASE,
although we have not yet introduced another subclass, so all the
properties have remained in TYPE_VFIO_PCI.

Note that currently there is no need for additional data for
TYPE_VFIO_PCI, so it shares the same C struct type as
TYPE_VFIO_PCI_BASE, VFIOPCIDevice.

Originally-by: John Johnson <john.g.johnson@oracle.com>
Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
Signed-off-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/qemu-devel/20250507152020.1254632-14-john.levon@nutanix.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-05-09 12:42:28 +02:00
John Levon
776066ac90 vfio: add read/write to device IO ops vector
Now we have the region info cache, add ->region_read/write device I/O
operations instead of explicit pread()/pwrite() system calls.

Signed-off-by: John Levon <john.levon@nutanix.com>
Link: https://lore.kernel.org/qemu-devel/20250507152020.1254632-13-john.levon@nutanix.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-05-09 12:42:28 +02:00
John Levon
95cdb02451 vfio: add region info cache
Instead of requesting region information on demand with
VFIO_DEVICE_GET_REGION_INFO, maintain a cache: this will become
necessary for performance for vfio-user, where this call becomes a
message over the control socket, so is of higher overhead than the
traditional path.

We will also need it to generalize region accesses, as that means we
can't use ->config_offset for configuration space accesses, but must
look up the region offset (if relevant) each time.

Originally-by: John Johnson <john.g.johnson@oracle.com>
Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
Signed-off-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/qemu-devel/20250507152020.1254632-12-john.levon@nutanix.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-05-09 12:42:28 +02:00
John Levon
38bf025d0d vfio: add device IO ops vector
For vfio-user, device operations such as IRQ handling and region
read/writes are implemented in userspace over the control socket, not
ioctl() to the vfio kernel driver; add an ops vector to generalize this,
and implement vfio_device_io_ops_ioctl for interacting with the kernel
vfio driver.

Originally-by: John Johnson <john.g.johnson@oracle.com>
Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
Signed-off-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/qemu-devel/20250507152020.1254632-11-john.levon@nutanix.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-05-09 12:42:28 +02:00
John Levon
9458d9b4dc vfio: implement unmap all for DMA unmap callbacks
Handle unmap_all in the DMA unmap handlers rather than in the caller.

Signed-off-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/qemu-devel/20250507152020.1254632-10-john.levon@nutanix.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-05-09 12:42:28 +02:00
John Levon
5a22b50591 vfio: add unmap_all flag to DMA unmap callback
We'll use this parameter shortly; this just adds the plumbing.

Signed-off-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/qemu-devel/20250507152020.1254632-9-john.levon@nutanix.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-05-09 12:42:28 +02:00
John Levon
cae04b5634 vfio: add vfio_pci_config_space_read/write()
Add these helpers that access config space and return an -errno style
return.

Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: John Levon <john.levon@nutanix.com>
Link: https://lore.kernel.org/qemu-devel/20250507152020.1254632-8-john.levon@nutanix.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-05-09 12:42:28 +02:00
John Levon
2e27becf17 vfio: consistently handle return value for helpers
Various bits of code that call vfio device APIs should consistently use
the "return -errno" approach for passing errors back, rather than
presuming errno is (still) set correctly.

Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: John Levon <john.levon@nutanix.com>
Link: https://lore.kernel.org/qemu-devel/20250507152020.1254632-6-john.levon@nutanix.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-05-09 12:42:28 +02:00
John Levon
5321e623eb vfio: add vfio_device_get_irq_info() helper
Add a helper similar to vfio_device_get_region_info() and use it
everywhere.

Replace a couple of needless allocations with stack variables.

As a side-effect, this fixes a minor error reporting issue in the call
from vfio_msix_early_setup().

Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: John Levon <john.levon@nutanix.com>
Link: https://lore.kernel.org/qemu-devel/20250507152020.1254632-5-john.levon@nutanix.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-05-09 12:42:28 +02:00
John Levon
ef73671f0b vfio: add vfio_attach_device_by_iommu_type()
Allow attachment by explicitly passing a TYPE_VFIO_IOMMU_* string;
vfio-user will use this later.

Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: John Levon <john.levon@nutanix.com>
Link: https://lore.kernel.org/qemu-devel/20250507152020.1254632-4-john.levon@nutanix.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-05-09 12:42:28 +02:00
John Levon
d60fb709cf vfio: add vfio_device_unprepare()
Add a helper that's the inverse of vfio_device_prepare().

Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: John Levon <john.levon@nutanix.com>
Link: https://lore.kernel.org/qemu-devel/20250507152020.1254632-3-john.levon@nutanix.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-05-09 12:42:28 +02:00
John Levon
a901682f53 vfio: add vfio_device_prepare()
Commonize some initialization code shared by the legacy and iommufd vfio
implementations.

Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: John Levon <john.levon@nutanix.com>
Link: https://lore.kernel.org/qemu-devel/20250507152020.1254632-2-john.levon@nutanix.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-05-09 12:42:28 +02:00