RISC-V IO Mapping Table (RIMT) is a new static ACPI table used to
communicate IOMMU information to the OS. Add support for creating this
table when the IOMMU is present. The specification is frozen and
available at [1].
[1] - https://github.com/riscv-non-isa/riscv-acpi-rimt/releases/download/v0.99/rimt-spec.pdf
Signed-off-by: Sunil V L <sunilvl@ventanamicro.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-ID: <20250322043139.2003479-3-sunilvl@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
When the IOMMU is implemented as a PCI device, its BDF is created
locally in virt.c. However, the same BDF is also required in
virt-acpi-build.c to support ACPI. Therefore, make this information part
of the global RISCVVirtState structure so that it can be accessed
outside of virt.c as well.
Signed-off-by: Sunil V L <sunilvl@ventanamicro.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-ID: <20250322043139.2003479-2-sunilvl@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
vhost-scsi now supports scsi hotplug
cxl gained a bag of new operations, motably media operations
virtio-net now supports SR-IOV emulation
pci-testdev now supports backing memory bar with host memory
amd iommu now supports migration
fixes all over the place
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQFDBAABCgAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmgkg0UPHG1zdEByZWRo
YXQuY29tAAoJECgfDbjSjVRpcDIH+wbrq7DzG+BVOraYtmD69BQCzYszby1mAWry
2OUYuAx9Oh+DsAwbzwbBdh9+SmJoi1oJ/d8rzSK328hdDrpCaPmc7bcBdAWJ3YcB
bGNPyJ+9eJLRXtlceGIhfAOMLIB0ugXGkHLQ61zlVCTg4Xwnj7/dQp2tAQ1BkTwW
Azc7ujBoJOBF3WVpa1Pqw0t1m3K74bwanOlkIg/JUWXk27sgP2YMnyrcpOu9Iz1T
VazgobyHo5y15V0wvd05w4Bk7cJSHwgW+y3DtgTtIffetIaAbSRgl3Pl5Ic1yKcX
ofg9aDFN6m0S8tv4WgFc+rT3Xaa/aPue9awjD5sEEldRasWKKNo=
=847R
-----END PGP SIGNATURE-----
Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging
virtio,pci,pc: fixes, features
vhost-scsi now supports scsi hotplug
cxl gained a bag of new operations, motably media operations
virtio-net now supports SR-IOV emulation
pci-testdev now supports backing memory bar with host memory
amd iommu now supports migration
fixes all over the place
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCgAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmgkg0UPHG1zdEByZWRo
# YXQuY29tAAoJECgfDbjSjVRpcDIH+wbrq7DzG+BVOraYtmD69BQCzYszby1mAWry
# 2OUYuAx9Oh+DsAwbzwbBdh9+SmJoi1oJ/d8rzSK328hdDrpCaPmc7bcBdAWJ3YcB
# bGNPyJ+9eJLRXtlceGIhfAOMLIB0ugXGkHLQ61zlVCTg4Xwnj7/dQp2tAQ1BkTwW
# Azc7ujBoJOBF3WVpa1Pqw0t1m3K74bwanOlkIg/JUWXk27sgP2YMnyrcpOu9Iz1T
# VazgobyHo5y15V0wvd05w4Bk7cJSHwgW+y3DtgTtIffetIaAbSRgl3Pl5Ic1yKcX
# ofg9aDFN6m0S8tv4WgFc+rT3Xaa/aPue9awjD5sEEldRasWKKNo=
# =847R
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 14 May 2025 07:49:25 EDT
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (27 commits)
hw/i386/amd_iommu: Allow migration when explicitly create the AMDVI-PCI device
hw/i386/amd_iommu: Isolate AMDVI-PCI from amd-iommu device to allow full control over the PCI device creation
intel_iommu: Take locks when looking for and creating address spaces
intel_iommu: Use BQL_LOCK_GUARD to manage cleanup automatically
virtio: Move virtio_reset()
virtio: Call set_features during reset
vhost-scsi: support VIRTIO_SCSI_F_HOTPLUG
vhost-user: return failure if backend crash when live migration
vhost: return failure if stop virtqueue failed in vhost_dev_stop
system/runstate: add VM state change cb with return value
pci-testdev.c: Add membar-backed option for backing membar
pcie_sriov: Make a PCI device with user-created VF ARI-capable
docs: Document composable SR-IOV device
virtio-net: Implement SR-IOV VF
virtio-pci: Implement SR-IOV PF
pcie_sriov: Allow user to create SR-IOV device
pcie_sriov: Check PCI Express for SR-IOV PF
pcie_sriov: Ensure PF and VF are mutually exclusive
hw/pci: Fix SR-IOV VF number calculation
hw/pci: Do not add ROM BAR for SR-IOV VF
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* Fix a memleak in s390x code
* Skip some functional tests if the corresponding feature is not available
-----BEGIN PGP SIGNATURE-----
iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmgkfWURHHRodXRoQHJl
ZGhhdC5jb20ACgkQLtnXdP5wLbXaKA/+K/buSKZWNcvrXtU4AqEyIjicvUsbY79S
BGmwTjO46uDzlqTIOxGJ2uBAocXSlNJ7YsvH75vBHWHF3Vy6LB1zPWDgaYTz7XkA
K9GqtrmRdlPArKa1Q7ot0tJ/wu7lzQuccieJJwNJhotMC3C4dl1HSpp+u/rmk7gG
vG9l5Cdi34BWXp2QCKPdrNs++4mOudLSJtYhBlSpxIaBe6h2LoHmKJNEmD9x4Xcg
SWTqalpWUhJW4L3zCj1JXWv6HAyR6GG7+7FLr5FkorSDG/sMX7+09GLE1/BLlD87
KtZlTBkcbXs+eXmP4y+qtskI0ca4dLaZnfIq8/v0wqCXvfOUM4Xi0E2HvGmHeI4u
rvC/ZhK2RztMZbVMFXHSmCFJvpi2sGgH+sIHt18BJzkAC+nx0ZdCz81fgKVERHhJ
1ZnsRiMcf7dI6yEgbJ89vZihv3WbyCcwlnyLDN+lovZzCYTvxPLn5SRH0LEm4kN5
N/qRwTTlPM4xCGCSc3JEGJVDDy36ojVfvGMFt4ZcFehcpkfcLznw7QYjk3QDwI2N
58FImsf2VVEl4sdpzpi6zfutMhFuL1N0m/kXb8GBonekXYTPtyBMqHsmhyRe5xXN
vP9paghpU0xBuDMtmZWyq4RCubZNESA7wAbSf0+VcC/1Uhjc3QS5820kV7/WVwsU
VwObtSEAG1c=
=zUob
-----END PGP SIGNATURE-----
Merge tag 'pull-request-2025-05-14' of https://gitlab.com/thuth/qemu into staging
* Removal of obsolete s390x machines
* Fix a memleak in s390x code
* Skip some functional tests if the corresponding feature is not available
# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmgkfWURHHRodXRoQHJl
# ZGhhdC5jb20ACgkQLtnXdP5wLbXaKA/+K/buSKZWNcvrXtU4AqEyIjicvUsbY79S
# BGmwTjO46uDzlqTIOxGJ2uBAocXSlNJ7YsvH75vBHWHF3Vy6LB1zPWDgaYTz7XkA
# K9GqtrmRdlPArKa1Q7ot0tJ/wu7lzQuccieJJwNJhotMC3C4dl1HSpp+u/rmk7gG
# vG9l5Cdi34BWXp2QCKPdrNs++4mOudLSJtYhBlSpxIaBe6h2LoHmKJNEmD9x4Xcg
# SWTqalpWUhJW4L3zCj1JXWv6HAyR6GG7+7FLr5FkorSDG/sMX7+09GLE1/BLlD87
# KtZlTBkcbXs+eXmP4y+qtskI0ca4dLaZnfIq8/v0wqCXvfOUM4Xi0E2HvGmHeI4u
# rvC/ZhK2RztMZbVMFXHSmCFJvpi2sGgH+sIHt18BJzkAC+nx0ZdCz81fgKVERHhJ
# 1ZnsRiMcf7dI6yEgbJ89vZihv3WbyCcwlnyLDN+lovZzCYTvxPLn5SRH0LEm4kN5
# N/qRwTTlPM4xCGCSc3JEGJVDDy36ojVfvGMFt4ZcFehcpkfcLznw7QYjk3QDwI2N
# 58FImsf2VVEl4sdpzpi6zfutMhFuL1N0m/kXb8GBonekXYTPtyBMqHsmhyRe5xXN
# vP9paghpU0xBuDMtmZWyq4RCubZNESA7wAbSf0+VcC/1Uhjc3QS5820kV7/WVwsU
# VwObtSEAG1c=
# =zUob
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 14 May 2025 07:24:21 EDT
# gpg: using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5
# gpg: issuer "thuth@redhat.com"
# gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [full]
# gpg: aka "Thomas Huth <thuth@redhat.com>" [full]
# gpg: aka "Thomas Huth <huth@tuxfamily.org>" [full]
# gpg: aka "Thomas Huth <th.huth@posteo.de>" [unknown]
# Primary key fingerprint: 27B8 8847 EEE0 2501 18F3 EAB9 2ED9 D774 FE70 2DB5
* tag 'pull-request-2025-05-14' of https://gitlab.com/thuth/qemu:
tests/functional: Skip the screendump tests if the command is not available
tests/functional/test_s390x_tuxrun: Check whether the machine is available
include/hw/dma/xlnx_dpdma: Remove dependency on console.h
s390x: Fix leak in machine_set_loadparm
hw/s390x/s390-virtio-ccw: Remove the deprecated 4.0 machine type
hw/s390x/s390-virtio-ccw: Remove the deprecated 3.1 machine type
hw/s390x: Remove the obsolete hpage_1m_allowed switch
hw/s390x/s390-virtio-ccw: Remove the deprecated 3.0 machine type
hw/s390x/s390-virtio-ccw: Remove the deprecated 2.12 machine type
target/s390x: Rename the qemu_V2_11 feature set to qemu_MIN
hw/s390x/event-facility: Remove the obsolete "allow_all_mask_sizes" code
hw/s390x/s390-virtio-ccw: Remove the deprecated 2.11 machine type
hw/s390x/s390-virtio-ccw: Remove the deprecated 2.10 machine type
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Commit cd59f50ab0 caused a regression on nvme hotplugging for devices
with an implicit nvm subsystem.
The nvme-subsys device was incorrectly left with being marked as
non-hotpluggable. Fix this.
Cc: qemu-stable@nongnu.org
Reported-by: Stéphane Graber <stgraber@stgraber.org>
Tested-by: Stéphane Graber <stgraber@stgraber.org>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2950
Fixes: cd59f50ab0 ("hw/nvme: always initialize a subsystem")
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
In hw/arm and include/hw/arm, some source files for the OMAP SoC
and the sx1 boards that are our only remaining OMAP boards still
have hard-coded tabs (almost entirely used for the indent on
inline comments, not for actual code indent).
Replace the tabs with spaces using vim :retab. I used 4 spaces
except in some defines and comments where I tried to put
everything aligned in the same column for better readability.
This commit is a purely whitespace-only change.
Signed-off-by: Santiago Monserrat Campanello <santimonserr@gmail.com>
Message-id: 20250505131130.82206-1-santimonserr@gmail.com
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/373
[PMM: expanded commit message]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Currently we call gdb_init_cpu() in cpu_common_initfn(), which is
very early in the CPU object's init->realize creation sequence. In
particular this happens before the architecture-specific subclass's
init fn has even run. This means that gdb_init_cpu() can only do
things that depend strictly on the class, not on the object, because
the CPUState* that it is passed is currently half-initialized.
In commit a1f728ecc9 we accidentally broke this rule, by adding
a call to the gdb_get_core_xml_file method which takes the CPUState.
At the moment we get away with this because the only implementation
doesn't actually look at the pointer it is passed. However the whole
reason we created that method was so that we could make the "which
XML file?" decision based on a property of the CPU object, and we
currently can't change the Arm implementation of the method to do
what we want without causing wrong behaviour or a crash.
The ordering restrictions here are:
* we must call gdb_init_cpu before:
- any call to gdb_register_coprocessor()
- any use of the gdb_num_regs field (this is only used
in code that's about to call gdb_register_coprocessor()
and wants to know the first register number of the
set of registers it's about to add)
* we must call gdb_init_cpu after CPU properties have been
set, which is to say somewhere in realize
The function cpu_exec_realizefn() meets both of these requirements,
as it is called by the architecture-specific CPU realize function
early in realize, before any calls ot gdb_register_coprocessor().
Move the gdb_init_cpu() call to there.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@amd.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20250429132200.605611-4-peter.maydell@linaro.org
Add migration support for AMD IOMMU model by saving necessary AMDVIState
parameters for MMIO registers, device table, command buffer, and event
buffers.
Also change devtab_len type from size_t to uint64_t to avoid 32-bit build
issue.
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Message-Id: <20250504170405.12623-3-suravee.suthikulpanit@amd.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Current amd-iommu model internally creates an AMDVI-PCI device. Here is
a snippet from info qtree:
bus: main-system-bus
type System
dev: amd-iommu, id ""
xtsup = false
pci-id = ""
intremap = "on"
device-iotlb = false
pt = true
...
dev: q35-pcihost, id ""
MCFG = -1 (0xffffffffffffffff)
pci-hole64-size = 34359738368 (32 GiB)
below-4g-mem-size = 134217728 (128 MiB)
above-4g-mem-size = 0 (0 B)
smm-ranges = true
x-pci-hole64-fix = true
x-config-reg-migration-enabled = true
bypass-iommu = false
bus: pcie.0
type PCIE
dev: AMDVI-PCI, id ""
addr = 01.0
romfile = ""
romsize = 4294967295 (0xffffffff)
rombar = -1 (0xffffffffffffffff)
multifunction = false
x-pcie-lnksta-dllla = true
x-pcie-extcap-init = true
failover_pair_id = ""
acpi-index = 0 (0x0)
x-pcie-err-unc-mask = true
x-pcie-ari-nextfn-1 = false
x-max-bounce-buffer-size = 4096 (4 KiB)
x-pcie-ext-tag = true
busnr = 0 (0x0)
class Class 0806, addr 00:01.0, pci id 1022:0000 (sub 1af4:1100)
...
This prohibits users from specifying the PCI topology for the amd-iommu device,
which becomes a problem when trying to support VM migration since it does not
guarantee the same enumeration of AMD IOMMU device.
Therefore, allow the 'AMDVI-PCI' device to optionally be pre-created and
associated with a 'amd-iommu' device via a new 'pci-id' parameter on the
latter.
For example:
-device AMDVI-PCI,id=iommupci0,bus=pcie.0,addr=0x05 \
-device amd-iommu,intremap=on,pt=on,xtsup=on,pci-id=iommupci0 \
For backward-compatibility, internally create the AMDVI-PCI device if not
specified on the CLI.
Co-developed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Message-Id: <20250504170405.12623-2-suravee.suthikulpanit@amd.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
vtd_find_add_as can be called by multiple threads which leads to a race
condition. Taking the IOMMU lock ensures we avoid such a race.
Moreover we also need to take the bql to avoid an assert to fail in
memory_region_add_subregion_overlap when actually allocating a new
address space.
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
Message-Id: <20250430124750.240412-3-clement.mathieu--drif@eviden.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iLMEAAEKAB0WIQS4/x2g0v3LLaCcbCxAov/yOSY+3wUCaCRNgwAKCRBAov/yOSY+
343NBACeXLcXkNfPDRsuYC/Z0iYrMO8HuQ6VAcN1f4H+qP6Uo7ywb13GpJTLmewD
iYmD93qVZBAglSUWhaVzBZbAjGFzZSDLcO0bmfsMvmUaIJfIZkJqRG01shk9iMMR
zDLEax9udJdhJxBPCINNonXHds4vYKasjUureKd1SJidiBKG4w==
=wdkQ
-----END PGP SIGNATURE-----
Merge tag 'pull-loongarch-20250514' of https://github.com/gaosong715/qemu into staging
pull-loongarch-20250514
# -----BEGIN PGP SIGNATURE-----
#
# iLMEAAEKAB0WIQS4/x2g0v3LLaCcbCxAov/yOSY+3wUCaCRNgwAKCRBAov/yOSY+
# 343NBACeXLcXkNfPDRsuYC/Z0iYrMO8HuQ6VAcN1f4H+qP6Uo7ywb13GpJTLmewD
# iYmD93qVZBAglSUWhaVzBZbAjGFzZSDLcO0bmfsMvmUaIJfIZkJqRG01shk9iMMR
# zDLEax9udJdhJxBPCINNonXHds4vYKasjUureKd1SJidiBKG4w==
# =wdkQ
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 14 May 2025 04:00:03 EDT
# gpg: using RSA key B8FF1DA0D2FDCB2DA09C6C2C40A2FFF239263EDF
# gpg: Good signature from "Song Gao <m17746591750@163.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: B8FF 1DA0 D2FD CB2D A09C 6C2C 40A2 FFF2 3926 3EDF
* tag 'pull-loongarch-20250514' of https://github.com/gaosong715/qemu:
hw/loongarch/boot: Adjust the loading position of the initrd
hw/intc/loongarch_pch: Merge three memory region into one
hw/intc/loongarch_pch: Set flexible memory access size with iomem region
hw/intc/loongarch_pch: Rename memory region iomem32_low with iomem
hw/intc/loongarch_pch: Use unified trace event for memory region ops
hw/intc/loongarch_pch: Use generic write callback for iomem8 region
hw/intc/loongarch_pch: Use generic write callback for iomem32_high region
hw/intc/loongarch_pch: Use generic write callback for iomem32_low region
hw/intc/loongarch_pch: Use generic read callback for iomem8 region
hw/intc/loongarch_pch: Use generic read callback for iomem32_high region
hw/intc/loongarch_pch: Use generic read callback for iomem32_low region
hw/intc/loongarch_pch: Discard write operation with ISR register
hw/intc/loongarch_pch: Use relative address in MemoryRegionOps
hw/intc/loongarch_pch: Set version information at initial stage
hw/intc/loongarch_pch: Remove some duplicate macro
hw/intc/loongarch_pch: Modify register name PCH_PIC_xxx_OFFSET with PCH_PIC_xxx
hw/intc/loongarch_pch: Modify name of some registers
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
vtd_switch_address_space needs to take the BQL if not already held.
Use BQL_LOCK_GUARD to make the iommu implementation more consistent.
Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com>
Message-Id: <20250430124750.240412-2-clement.mathieu--drif@eviden.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Move virtio_reset() to a later part of the file to remove the forward
declaration of virtio_set_features_nocheck() and to prepare the
situation that we need more operations to perform during reset.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20250421-reset-v2-2-e4c1ead88ea1@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
virtio-net expects set_features() will be called when the feature set
used by the guest changes to update the number of virtqueues but it is
not called during reset, which will clear all features, leaving the
queues added for VIRTIO_NET_F_MQ or VIRTIO_NET_F_RSS. Not only these
extra queues are visible to the guest, they will cause segmentation
fault during migration.
Call set_features() during reset to remove those queues for virtio-net
as we call set_status(). It will also prevent similar bugs for
virtio-net and other devices in the future.
Fixes: f9d6dbf0bf ("virtio-net: remove virtio queues if the guest doesn't support multiqueue")
Buglink: https://issues.redhat.com/browse/RHEL-73842
Cc: qemu-stable@nongnu.org
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20250421-reset-v2-1-e4c1ead88ea1@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
So far there isn't way to test host kernel vhost-scsi event queue path,
because VIRTIO_SCSI_F_HOTPLUG isn't supported by QEMU.
virtio-scsi.c and vhost-user-scsi.c already support VIRTIO_SCSI_F_HOTPLUG
as property "hotplug".
Add support to vhost-scsi.c to help evaluate and test event queue.
To test the feature:
1. Create vhost-scsi target with targetcli.
targetcli /backstores/fileio create name=storage file_or_dev=disk01.raw
targetcli /vhost create naa.1123451234512345
targetcli /vhost/naa.1123451234512345/tpg1/luns create /backstores/fileio/storage
2. Create QEMU instance with vhost-scsi.
-device vhost-scsi-pci,wwpn=naa.1123451234512345,hotplug=true
3. Once guest bootup, hotplug a new LUN from host.
targetcli /backstores/fileio create name=storage02 file_or_dev=disk02.raw
targetcli /vhost/naa.1123451234512345/tpg1/luns create /backstores/fileio/storage02
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Message-Id: <20250203005215.1502-1-dongli.zhang@oracle.com>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Live migration should be terminated if the vhost-user backend crashes
before the migration completes.
Specifically, since the vhost device will be stopped when VM is stopped
before the end of the live migration, in current implementation if the
backend crashes, vhost-user device set_status() won't return failure,
live migration won't perceive the disconnection between QEMU and the
backend.
When the VM is migrated to the destination, the inflight IO will be
resubmitted, and if the IO was completed out of order before, it will
cause IO error.
To fix this issue:
1. Add the return value to set_status() for VirtioDeviceClass.
a. For the vhost-user device, return failure when the backend crashes.
b. For other virtio devices, always return 0.
2. Return failure if vhost_dev_stop() failed for vhost-user device.
If QEMU loses connection with the vhost-user backend, virtio set_status()
can return failure to the upper layer, migration_completion() can handle
the error, terminate the live migration, and restore the VM, so that
inflight IO can be completed normally.
Signed-off-by: Haoqian He <haoqian.he@smartx.com>
Message-Id: <20250416024729.3289157-4-haoqian.he@smartx.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch captures the error of vhost_virtqueue_stop() in vhost_dev_stop()
and returns the error upward.
Specifically, if QEMU is disconnected from the vhost backend, some actions
in vhost_dev_stop() will fail, such as sending vhost-user messages to the
backend (GET_VRING_BASE, SET_VRING_ENABLE) and vhost_reset_status.
Considering that both set_vring_enable and vhost_reset_status require setting
the specific virtio feature bit, we can capture vhost_virtqueue_stop()'s
error to indicate that QEMU has lost connection with the backend.
This patch is the pre patch for 'vhost-user: return failure if backend crashes
when live migration', which makes the live migration aware of the loss of
connection with the vhost-user backend and aborts the live migration.
Signed-off-by: Haoqian He <haoqian.he@smartx.com>
Message-Id: <20250416024729.3289157-3-haoqian.he@smartx.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch adds the new VM state change cb type `VMChangeStateHandlerWithRet`,
which has return value for `VMChangeStateEntry`.
Thus, we can register a new VM state change cb with return value for device.
Note that `VMChangeStateHandler` and `VMChangeStateHandlerWithRet` are mutually
exclusive and cannot be provided at the same time.
This patch is the pre patch for 'vhost-user: return failure if backend crashes
when live migration', which makes the live migration aware of the loss of
connection with the vhost-user backend and aborts the live migration.
Virtio device will use VMChangeStateHandlerWithRet.
Signed-off-by: Haoqian He <haoqian.he@smartx.com>
Message-Id: <20250416024729.3289157-2-haoqian.he@smartx.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The pci-testdev device allows for an optional BAR. We have
historically used this without backing to test that systems and OSes
can accomodate large PCI BARs. However to help test p2pdma operations
it is helpful to add an option to back this BAR with host memory.
We add a membar-backed boolean parameter and when set to true or on we
do a host RAM backing. The default is false which ensures backward
compatability.
Signed-off-by: Stephen Bates <sbates@raithlin.com>
Message-Id: <Z_6JhDtn5PlaDgB_@MKMSTEBATES01.amd.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
A virtio-net device can be added as a SR-IOV VF to another virtio-pci
device that will be the PF.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20250314-sriov-v9-7-57dae8ae3ab5@daynix.com>
Tested-by: Yui Washizu <yui.washidu@gmail.com>
Tested-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Allow user to attach SR-IOV VF to a virtio-pci PF.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20250314-sriov-v9-6-57dae8ae3ab5@daynix.com>
Tested-by: Yui Washizu <yui.washidu@gmail.com>
Tested-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
A user can create a SR-IOV device by specifying the PF with the
sriov-pf property of the VFs. The VFs must be added before the PF.
A user-creatable VF must have PCIDeviceClass::sriov_vf_user_creatable
set. Such a VF cannot refer to the PF because it is created before the
PF.
A PF that user-creatable VFs can be attached calls
pcie_sriov_pf_init_from_user_created_vfs() during realization and
pcie_sriov_pf_exit() when exiting.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20250314-sriov-v9-5-57dae8ae3ab5@daynix.com>
Tested-by: Yui Washizu <yui.washidu@gmail.com>
Tested-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
A device cannot be a SR-IOV PF and a VF at the same time.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20250314-sriov-v9-3-57dae8ae3ab5@daynix.com>
Tested-by: Yui Washizu <yui.washidu@gmail.com>
Tested-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
pci_config_get_bar_addr() had a division by vf_stride. vf_stride needs
to be non-zero when there are multiple VFs, but the specification does
not prohibit to make it zero when there is only one VF.
Do not perform the division for the first VF to avoid division by zero.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20250314-sriov-v9-2-57dae8ae3ab5@daynix.com>
Tested-by: Yui Washizu <yui.washidu@gmail.com>
Tested-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
A SR-IOV VF cannot have a ROM BAR.
Co-developed-by: Yui Washizu <yui.washidu@gmail.com>
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20250314-sriov-v9-1-57dae8ae3ab5@daynix.com>
Tested-by: Yui Washizu <yui.washidu@gmail.com>
Tested-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
1) get alert configuration(Opcode 4201h)
2) set alert configuration(Opcode 4202h)
Signed-off-by: Sweta Kumari <s5.kumari@samsung.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20250305092501.191929-7-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
CXL spec 3.2 section 8.2.10.9.5.3 describes media operations commands.
CXL devices supports media operations Sanitize and Write zero command.
Signed-off-by: Vinayak Holikatti <vinayak.kh@samsung.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20250305092501.191929-6-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Move the code of calculation of sanitize duration into function
for usability by other sanitize routines
Estimate times based on:
https://pmem.io/documents/NVDIMM_DSM_Interface-V1.8.pdf
Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
Signed-off-by: Vinayak Holikatti <vinayak.kh@samsung.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20250305092501.191929-5-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add Get/Set Response Message Limit commands.
Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Fan Ni <fan.ni@samsung.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20250305092501.191929-3-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
As of 3.1 spec, background commands can be canceled with a new
abort command. Implement the support, which is advertised in
the CEL. No ad-hoc context undoing is necessary as all the
command logic of the running bg command is done upon completion.
Arbitrarily, the on-going background cmd will not be aborted if
already at least 85% done;
A mutex is introduced to stabilize mbox request cancel command vs
the timer callback being fired scenarios (as well as reading the
mbox registers). While some operations under critical regions
may be unnecessary (irq notifying, cmd callbacks), this is not
a path where performance is important, so simplicity is preferred.
Tested-by: Ajay Joshi <ajay.opensrc@micron.com>
Reviewed-by: Ajay Joshi <ajay.opensrc@micron.com>
Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20250305092501.191929-2-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
When only the -kernel parameter is used to load the elf kernel, the initrd
is loaded in the ram. If the initrd size is too large, the loading fails,
resulting in a VM startup failure. This patch first loads initrd near
the kernel.
When the nearby memory space of the kernel is insufficient, it tries to
load it to the starting position of high memory. If there is still not
enough, qemu will report an error and ask the user to increase the memory
space for the virtual machine to boot.
Signed-off-by: Xianglai Li <lixianglai@loongson.cn>
Message-Id: <20250506080946.817092-1-lixianglai@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>
Since memory region iomem supports memory access size with 1/2/4/8,
it can be used for memory region iomem8 and iomem32_high. Now remove
memory region iomem8 and iomem32_high, merge them into iomem together.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20250507023754.1877445-5-maobibo@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>
The original iomem region only supports 4 bytes access size, set it ok
with 1/2/4/8 bytes. Also unaligned memory access is not supported.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20250507023754.1877445-4-maobibo@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>
Rename memory region iomem32_low with iomem, also change ops name
as follows:
loongarch_pch_pic_reg32_low_ops --> loongarch_pch_pic_ops
loongarch_pch_pic_low_readw --> loongarch_pch_pic_read
loongarch_pch_pic_low_writew --> loongarch_pch_pic_write
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20250507023754.1877445-3-maobibo@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>
Add trace event trace_loongarch_pch_pic_read(), replaces the following
three events:
trace_loongarch_pch_pic_low_readw()
trace_loongarch_pch_pic_high_readw()
trace_loongarch_pch_pic_readb()
The similiar with write trace event.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20250507023754.1877445-2-maobibo@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>
Add iomem8 region register write operation emulation in generic write
function loongarch_pch_pic_write(), and use this function for iomem8
region.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20250507023754.1877445-1-maobibo@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>
Add iomem32_high region register write operation emulation in generic
write function loongarch_pch_pic_write(), and use this function for
iomem32_high region.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20250507023148.1877287-12-maobibo@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>
For memory region iomem32_low, generic write callback is used.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20250507023148.1877287-11-maobibo@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>
Add iomem8 region register read operation emulation in generic read
function loongarch_pch_pic_read(), and use this function for iomem8
region.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20250507023148.1877287-10-maobibo@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>
Add register read operation emulation in generic read function
loongarch_pch_pic_read(), and use this function for iomem32_high region.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20250507023148.1877287-9-maobibo@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>
For memory region iomem32_low, generic read callback is used.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20250507023148.1877287-8-maobibo@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>
With the latest 7A1000 user manual, interrupt status register ISR is
read only. Here discard write operation with ISR register.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20250507023148.1877287-7-maobibo@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>
Parameter address for read and write callback in MemoryRegionOps is
relative offset with base address of this MemoryRegionOps. It can
be directly used as offset and offset calculation can be removed.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20250507023148.1877287-6-maobibo@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>
Register PCH_PIC_INT_ID constains version and supported irq number
information, and it is read only register. The detailed value can
be set at initial stage, rather than read callback.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20250507023148.1877287-5-maobibo@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>