Currently we depend on .realize() calling iommufd_backend_get_device_info()
to get IOMMU capabilities and check for dirty page tracking support.
By make a extra separate call, this dependency is removed. This happens
only during device attach, it's not a hot path.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/qemu-devel/20250423072824.3647952-2-zhenzhong.duan@intel.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
"hw/vfio/vfio-common.h" has been emptied of most of its declarations
by the previous changes and the only declarations left are related to
VFIODevice. Rename it to "hw/vfio/vfio-device.h" and make the
necessary adjustments.
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Link: https://lore.kernel.org/qemu-devel/20250326075122.1299361-36-clg@redhat.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
This hides the MemoryListener implementation and makes the code common
to both IOMMU backends, legacy and IOMMUFD.
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: John Levon <john.levon@nutanix.com>
Link: https://lore.kernel.org/qemu-devel/20250326075122.1299361-35-clg@redhat.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
File "common.c" has been emptied of most of its definitions by the
previous changes and the only definitions left are related to the VFIO
MemoryListener handlers. Rename it to "listener.c" and introduce its
associated "vfio-listener.h" header file for the declarations.
Cleanup a little the includes while at it.
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Avihai Horon <avihaih@nvidia.com>
Link: https://lore.kernel.org/qemu-devel/20250326075122.1299361-33-clg@redhat.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Rename to vfio_container_query_dirty_bitmap() to be consistent with
the VFIO container routine naming scheme.
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Avihai Horon <avihaih@nvidia.com>
Link: https://lore.kernel.org/qemu-devel/20250326075122.1299361-32-clg@redhat.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Use the prefix 'vfio_container_devices_' to reflect the routine simply
loops over the container's device list.
Reviewed-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Avihai Horon <avihaih@nvidia.com>
Link: https://lore.kernel.org/qemu-devel/20250326075122.1299361-31-clg@redhat.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Also rename vfio_devices_all_device_dirty_tracking_started() while at
it and use the prefix 'vfio_container_devices_' for routines simply
looping over the container's device list.
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Avihai Horon <avihaih@nvidia.com>
Link: https://lore.kernel.org/qemu-devel/20250326075122.1299361-30-clg@redhat.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
vfio_container_query_dirty_bitmap() is only used in "container-base.c".
Also, rename to vfio_container_iommu_query_dirty_bitmap() to reflect it
is using the VFIO IOMMU backend device ->query_dirty_bitmap() handler.
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Avihai Horon <avihaih@nvidia.com>
Link: https://lore.kernel.org/qemu-devel/20250326075122.1299361-29-clg@redhat.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
vfio_devices_query_dirty_bitmap() is only used in "container-base.c".
Also, rename to vfio_container_devices_query_dirty_bitmap() to reflect
with the prefix 'vfio_container_devices_' that it simply loops over
the container's device list.
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Avihai Horon <avihaih@nvidia.com>
Link: https://lore.kernel.org/qemu-devel/20250326075122.1299361-28-clg@redhat.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Routines of common.c :
vfio_devices_all_dirty_tracking_started
vfio_devices_all_device_dirty_tracking
vfio_devices_query_dirty_bitmap
vfio_get_dirty_bitmap
are all related to dirty page tracking directly at the container level
or at the container device level. Naming is a bit confusing. We will
propose new names in the following changes.
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Avihai Horon <avihaih@nvidia.com>
Link: https://lore.kernel.org/qemu-devel/20250326075122.1299361-27-clg@redhat.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Pass-through devices of a VM are not necessarily in the same group and
all groups/address_spaces need to be scanned when the machine is
reset. Commit f16f39c3fc ("Implement PCI hot reset") introduced a VM
reset handler for this purpose. Move it under device.c
Also reintroduce the comment which explained the context and was lost
along the way.
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: John Levon <john.levon@nutanix.com>
Link: https://lore.kernel.org/qemu-devel/20250326075122.1299361-26-clg@redhat.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Gather all CPR related declarations into "vfio-cpr.h" to reduce exposure
of VFIO internals in "hw/vfio/vfio-common.h". These were introduced in
commit d9fa4223b3 ("vfio: register container for cpr").
Order file list in meson.build while at it.
Cc: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Steve Sistare <steven.sistare@oracle.com>
Link: https://lore.kernel.org/qemu-devel/20250326075122.1299361-22-clg@redhat.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Move all VFIODevice related routines of "helpers.c" into a new "device.c"
file.
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: John Levon <john.levon@nutanix.com>
Link: https://lore.kernel.org/qemu-devel/20250326075122.1299361-21-clg@redhat.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
VFIOAddressSpace is a common object used by VFIOContainerBase which is
declared in "hw/vfio/vfio-container-base.h". Move the VFIOAddressSpace
related services into "container-base.c".
While at it, rename :
vfio_get_address_space -> vfio_address_space_get
vfio_put_address_space -> vfio_address_space_put
to better reflect the namespace these routines belong to.
Reviewed-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Link: https://lore.kernel.org/qemu-devel/20250326075122.1299361-15-clg@redhat.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Gather all VFIORegion related declarations and definitions into their
own files to reduce exposure of VFIO internals in "hw/vfio/vfio-common.h".
They were introduced for 'vfio-platform' support in commits
db0da029a1 ("vfio: Generalize region support") and a664477db8
("hw/vfio/pci: Introduce VFIORegion").
To be noted that the 'vfio-platform' devices have been deprecated and
will be removed in QEMU 10.2. Until then, make the declarations
available externally for 'sysbus-fdt.c'.
Cc: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Link: https://lore.kernel.org/qemu-devel/20250326075122.1299361-12-clg@redhat.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Gather all VFIOIOMMUFD related declarations introduced by commits
5ee3dc7af7 ("vfio/iommufd: Implement the iommufd backend") and
5b1e96e654 ("vfio/iommufd: Introduce auto domain creation") into
"vfio-iommufd.h". This to reduce exposure of VFIO internals in
"hw/vfio/vfio-common.h".
Cc: Joao Martins <joao.m.martins@oracle.com>
Cc: Yi Liu <yi.l.liu@intel.com>
Reviewed-by: John Levon <john.levon@nutanix.com>
Link: https://lore.kernel.org/qemu-devel/20250318095415.670319-10-clg@redhat.com
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Link: https://lore.kernel.org/qemu-devel/20250326075122.1299361-11-clg@redhat.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
These routines are migration related. Move their declaration and
implementation under the migration files.
Reviewed-by: Prasad Pandit <pjp@fedoraproject.org>
Reviewed-by: John Levon <john.levon@nutanix.com>
Link: https://lore.kernel.org/qemu-devel/20250326075122.1299361-8-clg@redhat.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
The migration core subsystem makes use of the VFIO migration API to
collect statistics on the number of bytes transferred. These services
are declared in "hw/vfio/vfio-common.h" which also contains VFIO
internal declarations. Move the migration declarations into a new
header file "hw/vfio/vfio-migration.h" to reduce the exposure of VFIO
internals.
While at it, use a 'vfio_migration_' prefix for these services.
To be noted, vfio_migration_add_bytes_transferred() is a VFIO
migration internal service which we will be moved in the subsequent
patches.
Cc: Kirti Wankhede <kwankhede@nvidia.com>
Cc: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Prasad Pandit <pjp@fedoraproject.org>
Reviewed-by: John Levon <john.levon@nutanix.com>
Reviewed-by: Avihai Horon <avihaih@nvidia.com>
Link: https://lore.kernel.org/qemu-devel/20250326075122.1299361-4-clg@redhat.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
An L2 KVM guest fails to boot inside a pSeries LPAR when booted with a
memory more than 128 GB and PCI device passthrough. The L2 guest also
crashes when it is booted with a memory greater than 128 GB and a PCI
device is hotplugged later.
The issue arises from a conditional check for `levels > 1` in
`spapr_tce_create_table()` within L1 KVM. This check is meant to prevent
multi-level TCEs, which are not supported by the PowerVM hypervisor. As
a result, when QEMU makes a `VFIO_IOMMU_SPAPR_TCE_CREATE` ioctl call
with `levels > 1`, it triggers the conditional check and returns
`EINVAL`, causing the guest to crash with the following errors:
2025-03-04T06:36:36.133117Z qemu-system-ppc64: Failed to create a window, ret = -1 (Invalid argument)
2025-03-04T06:36:36.133176Z qemu-system-ppc64: Failed to create SPAPR window: Invalid argument
qemu: hardware error: vfio: DMA mapping failed, unable to continue
Fix this by checking the supported DDW "levels" returned by the
VFIO_IOMMU_SPAPR_TCE_GET_INFO ioctl before attempting the TCE create
ioctl in KVM.
The patch has been tested on KVM guests with memory configurations of up
to 390GB, and 450GB on PowerVM and bare-metal environments respectively.
Signed-off-by: Amit Machhiwal <amachhiw@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/qemu-devel/20250408124042.2695955-3-amachhiw@linux.ibm.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Introduce an Error ** parameter to vfio_spapr_create_window() to enable
structured error reporting. This allows the function to propagate
detailed errors back to callers.
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Amit Machhiwal <amachhiw@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/qemu-devel/20250408124042.2695955-2-amachhiw@linux.ibm.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
VFIO uses migration_file_set_error() in a couple of places where an
'Error **' parameter is not provided. In MemoryListener handlers :
vfio_listener_region_add
vfio_listener_log_global_stop
vfio_listener_log_sync
and in callback routines for IOMMU notifiers :
vfio_iommu_map_notify
vfio_iommu_map_dirty_notify
Hopefully, one day, we will be able to extend these callbacks with an
'Error **' parameter and avoid setting the global migration error.
Until then, it seems sensible to clearly identify the use cases, which
are limited, and open code vfio_migration_set_error(). One other
benefit is an improved error reporting when migration is running.
While at it, slightly modify error reporting to only report errors
when migration is not active and not always as is currently done.
Cc: Prasad Pandit <pjp@fedoraproject.org>
Reviewed-by: Avihai Horon <avihaih@nvidia.com>
Link: https://lore.kernel.org/qemu-devel/20250324123315.637827-1-clg@redhat.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
-----BEGIN PGP SIGNATURE-----
iLMEAAEKAB0WIQS4/x2g0v3LLaCcbCxAov/yOSY+3wUCaAmmRQAKCRBAov/yOSY+
3yZoA/4udi9ZmLsaiPqfKCS+0eF8XScIT493lVD359lFTBTT7mshh9PPhTLzdtiC
8fcfYi7jSjfC9gGTjPgnNCOzKIg3Gbdl61AFDgIwd8q/5HQAgonHAywTUtmqDaPK
bXZ/JkkJQby2dla6015XKQS/d/EXWHgYjrcb1JZIRoaLworZPw==
=zBCJ
-----END PGP SIGNATURE-----
Merge tag 'pull-loongarch-20250424' of https://github.com/gaosong715/qemu into staging
pull-loongarch-20230424
# -----BEGIN PGP SIGNATURE-----
#
# iLMEAAEKAB0WIQS4/x2g0v3LLaCcbCxAov/yOSY+3wUCaAmmRQAKCRBAov/yOSY+
# 3yZoA/4udi9ZmLsaiPqfKCS+0eF8XScIT493lVD359lFTBTT7mshh9PPhTLzdtiC
# 8fcfYi7jSjfC9gGTjPgnNCOzKIg3Gbdl61AFDgIwd8q/5HQAgonHAywTUtmqDaPK
# bXZ/JkkJQby2dla6015XKQS/d/EXWHgYjrcb1JZIRoaLworZPw==
# =zBCJ
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 23 Apr 2025 22:47:33 EDT
# gpg: using RSA key B8FF1DA0D2FDCB2DA09C6C2C40A2FFF239263EDF
# gpg: Good signature from "Song Gao <m17746591750@163.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: B8FF 1DA0 D2FD CB2D A09C 6C2C 40A2 FFF2 3926 3EDF
* tag 'pull-loongarch-20250424' of https://github.com/gaosong715/qemu:
target/loongarch: Guard BCEQZ/BCNEZ instructions with FP feature
target/loongarch: Add CRC feature flag and use it to gate CRC instructions
linux-user/loongarch64: Decode BRK break codes for FPE signals
target/loongarch: Move definition of TCG specified function to tcg directory
target/loongarch: Add static definition with function loongarch_tlb_search()
target/loongarch: Move function loongarch_tlb_search to directory tcg
target/loongarch: Define function loongarch_get_addr_from_tlb() non-static
target/loongarch: Set function loongarch_map_address() with common code
target/loongarch: Add stub function loongarch_get_addr_from_tlb
target/loongarch: Move function get_dir_base_width to common directory
target/loongarch: Add function loongarch_get_addr_from_tlb
target/loongarch: Move header file helper.h to directory tcg
hw/intc/loongarch_pch_msi: Remove gpio input handler
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
A few functions now end with a label. The next commit will clean them
up.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20250407082643.2310002-3-armbru@redhat.com>
[Straightforward conflict with commit 988ad4cceb (hw/loongarch/virt:
Fix cpuslot::cpu set at last in virt_cpu_plug()) resolved]
Coccinelle's indentation of virt_create_plic() results in a long line.
Avoid that by mimicking the old indentation manually.
Don't touch tests/tcg/mips/user/. I'm not sure these files are ours
to make style cleanups on. They might be imported third-party code,
which we should leave as is to not complicate future updates.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20250407082643.2310002-2-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
MSI interrupt is triggered by writing message on specified memory address.
In generic it is used by PCI devices, and no device is connected pch MSI
irqchip with GPIO pin line method, here remove gpio input setting for MSI
controller.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Tested-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20250410085004.3577627-1-maobibo@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>