Currently if the user requests via -machine dumpdtb=file.dtb that we
dump the DTB, but the machine doesn't have a DTB, we silently ignore
the option. This is confusing to users, and is a legacy of the old
board-specific implementation of the option, where if the execution
codepath didn't go via a call to qemu_fdt_dumpdtb() we would never
handle the option.
Now we handle the option in one place in machine.c, we can provide
the user with a useful message if they asked us to dump a DTB when
none exists. qmp_dumpdtb() already produces this error; remove the
logic in handle_machine_dumpdtb() that was there specifically to
avoid hitting it.
While we're here, beef up the error message a bit with a hint, and
make it consistent about "an FDT" rather than "a FDT". (In the
qmp_dumpdtb() case this needs an ERRP_GUARD to make
error_append_hint() work when the caller passes error_fatal.)
Note that the three places where we might report "doesn't have an
FDT" are hit in different situations:
(1) in handle_machine_dumpdtb(), if CONFIG_FDT is not set: this is
because the QEMU binary was built without libfdt at all. The
build system will not let you build with a machine type that
needs an FDT but no libfdt, so here we know both that the machine
doesn't use FDT and that QEMU doesn't have the support:
(2) in the device_tree-stub.c qmp_dumpdtb(): this is used when
we had libfdt at build time but the target architecture didn't
enable any machines which did "select DEVICE_TREE", so here we
know that the machine doesn't use FDT.
(3) in qmp_dumpdtb(), if current_machine->fdt is NULL all we know
is that this machine never set it. That might be because it doesn't
use FDT, or it might be because the user didn't pass an FDT
on the command line and the machine doesn't autogenerate an FDT.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2733
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20250206151214.2947842-7-peter.maydell@linaro.org
Currently we handle the 'dumpdtb' machine sub-option ad-hoc in every
board model that has an FDT. It's up to the board code to make sure
it calls qemu_fdt_dumpdtb() in the right place.
This means we're inconsistent and often just ignore the user's
command line argument:
* if the board doesn't have an FDT at all
* if the board supports FDT, but there happens not to be one
present (usually because of a missing -fdt option)
This isn't very helpful because it gives the user no clue why their
option was ignored.
However, in order to support the QMP/HMP dumpdtb commands we require
now that every FDT machine stores a pointer to the FDT in
MachineState::fdt. This means we can handle -machine dumpdtb
centrally by calling the qmp_dumpdtb() function, unifying its
handling with the QMP/HMP commands. All the board code calls to
qemu_fdt_dumpdtb() can then be removed.
For this commit we retain the existing behaviour that if there
is no FDT we silently ignore the -machine dumpdtb option.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
We're currently changing the way the source multifd migration handles
the shutdown of the multifd channels when TLS is in use to perform a
clean termination by calling gnutls_bye().
Older src QEMUs will always close the channel without terminating the
TLS session. New dst QEMUs treat an unclean termination as an error.
Add multifd_clean_tls_termination (default true) that can be switched
on the destination whenever a src QEMU <= 9.2 is in use.
(Note that the compat property is only strictly necessary for src
QEMUs older than 9.1. Due to synchronization coincidences, src QEMUs
9.1 and 9.2 can put the destination in a condition where it doesn't
see the unclean termination. Still, make the property more inclusive
to facilitate potential backports.)
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
* Fix the broken aarch64_tcg_plugins test
* Add test for 64-bit mac99 machine
* Add a Linux-based test for the 40p machine
* Fix issues with record/replay of some s390x instructions
* Fix node.js crashes on emulated s390x due to a bug in the MVC instruction
* Enable virtio-balloon-pci and virtio-mem-pci on s390x
* Fix a libslirp v4.9.0 compilation problem
-----BEGIN PGP SIGNATURE-----
iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmebewIRHHRodXRoQHJl
ZGhhdC5jb20ACgkQLtnXdP5wLbUfZQ//WHrZNVQNe0d+wOtAa5Zj4X9RpadHeGO9
WCKtBWZ1tDADHiVkZzU6L6q/LYM5FcAOE+Kah/xr8rtf6he+LCYQ0RDHbgY6/oUE
t9TkIeph59+MMvBXWJ8flngaoVtxe8l2aYem8wk3ATPZtHyMQAZ5PAjY3+WYQAGc
gm13k1AMD4mA6mBUOs67QSitTqBQsunKpb1IvpyBjtv9NBl61L8h5hWn0bsxa8yC
3KKZhw6Nclc8RVe33e6ZDrHrBi9klORd6Z+7fJ4w8Yj+C48ogfbQx+Zvb82jXhRe
2qGdVb6cF7LVQ5D3pECBK7yo4Lkd7MYnNvn+EmbTXhj1y5MSPdokP6k0ZWkhhkCP
2kIY0o5tFipdxkdDpCptU3gYJLdQFbNX2MqDFY0KeurLDGe4o6jIoRNmdZ67TJei
zleLlcEatoyRqpCKqTNMDVeWgza3ngykhiQIrG9PMPCRQET0N4qY6db35hzDujLI
NVuI1traCLawfCDYiMnU59dOxWSHy1bwSfnUxhZ92+Fl3AOb6c6PzhpkIGl/grwT
8T8EcjFyA4hpaHHKjCeNgSrKt9N0Ka2G3l9oF8eWwJm4KAlwtYBDvfVb+juGBP9+
8gW0lXA8tYy/P5XgPQ0N5Z8coc1xUrYBhC7v70ud3ponMmmTdhRnosey2cOFUGsN
/U7avgXIm0Q=
=rEzl
-----END PGP SIGNATURE-----
Merge tag 'pull-request-2025-01-30' of https://gitlab.com/thuth/qemu into staging
* Convert more avocado tests to the functional framework
* Fix the broken aarch64_tcg_plugins test
* Add test for 64-bit mac99 machine
* Add a Linux-based test for the 40p machine
* Fix issues with record/replay of some s390x instructions
* Fix node.js crashes on emulated s390x due to a bug in the MVC instruction
* Enable virtio-balloon-pci and virtio-mem-pci on s390x
* Fix a libslirp v4.9.0 compilation problem
# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmebewIRHHRodXRoQHJl
# ZGhhdC5jb20ACgkQLtnXdP5wLbUfZQ//WHrZNVQNe0d+wOtAa5Zj4X9RpadHeGO9
# WCKtBWZ1tDADHiVkZzU6L6q/LYM5FcAOE+Kah/xr8rtf6he+LCYQ0RDHbgY6/oUE
# t9TkIeph59+MMvBXWJ8flngaoVtxe8l2aYem8wk3ATPZtHyMQAZ5PAjY3+WYQAGc
# gm13k1AMD4mA6mBUOs67QSitTqBQsunKpb1IvpyBjtv9NBl61L8h5hWn0bsxa8yC
# 3KKZhw6Nclc8RVe33e6ZDrHrBi9klORd6Z+7fJ4w8Yj+C48ogfbQx+Zvb82jXhRe
# 2qGdVb6cF7LVQ5D3pECBK7yo4Lkd7MYnNvn+EmbTXhj1y5MSPdokP6k0ZWkhhkCP
# 2kIY0o5tFipdxkdDpCptU3gYJLdQFbNX2MqDFY0KeurLDGe4o6jIoRNmdZ67TJei
# zleLlcEatoyRqpCKqTNMDVeWgza3ngykhiQIrG9PMPCRQET0N4qY6db35hzDujLI
# NVuI1traCLawfCDYiMnU59dOxWSHy1bwSfnUxhZ92+Fl3AOb6c6PzhpkIGl/grwT
# 8T8EcjFyA4hpaHHKjCeNgSrKt9N0Ka2G3l9oF8eWwJm4KAlwtYBDvfVb+juGBP9+
# 8gW0lXA8tYy/P5XgPQ0N5Z8coc1xUrYBhC7v70ud3ponMmmTdhRnosey2cOFUGsN
# /U7avgXIm0Q=
# =rEzl
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 30 Jan 2025 08:13:38 EST
# gpg: using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5
# gpg: issuer "thuth@redhat.com"
# gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [full]
# gpg: aka "Thomas Huth <thuth@redhat.com>" [full]
# gpg: aka "Thomas Huth <huth@tuxfamily.org>" [full]
# gpg: aka "Thomas Huth <th.huth@posteo.de>" [unknown]
# Primary key fingerprint: 27B8 8847 EEE0 2501 18F3 EAB9 2ED9 D774 FE70 2DB5
* tag 'pull-request-2025-01-30' of https://gitlab.com/thuth/qemu:
net/slirp: libslirp 4.9.0 compatibility
tests/functional/test_mips_malta: Convert the mips big endian replay tests
tests/functional/test_mips64el_malta: Convert the mips64el replay tests
tests/functional/test_mipsel_malta: Convert the mipsel replay tests
tests/functional: Add the ReplayKernelBase class
tests/functional: Add a decorator for skipping long running tests
tests/functional: Extend PPC 40p test with Linux boot
s390x/s390-virtio-ccw: Support plugging PCI-based virtio memory devices
virtio-mem-pci: Allow setting nvectors, so we can use MSI-X
virtio-balloon-pci: Allow setting nvectors, so we can use MSI-X
hw/s390x/s390-virtio-ccw: Fix a record/replay deadlock
tests/tcg/s390x: Test modifying code using the MVC instruction
target/s390x: Fix MVC not always invalidating translation blocks
target/s390x: Fix PPNO execution with icount
tests/functional/test_mips_malta: Fix comment about endianness of the test
tests/functional: Add a ppc64 mac99 test
tests/functional: Fix the aarch64_tcg_plugins test
tests/functional: Convert the migration avocado test
tests/functional: Fix broken decorators with lamda functions
tests/functional/qemu_test/decorators: Fix bad check for imports
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Let's do it similar as virtio-balloon-pci. With this change, we can
use virtio-mem-pci on s390x, although plugging will still fail until
properly wired up in the machine.
No need to worry about transitional/non_transitional devices, because they
don't exist for virtio-mem.
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20250128185705.1609038-2-david@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Most virtio-pci devices allow MSI-X. Add it to virtio-balloon-pci, but
only enable it in new machine types, so we don't break migration of
existing machine types between different qemu versions.
This copies what was done for virtio-rng-pci in:
9ea02e8f13 ("virtio-rng-pci: Allow setting nvectors, so we can use MSI-X")
bad9c5a516 ("virtio-rng-pci: fix migration compat for vectors")
62bdb88715 ("virtio-rng-pci: fix transitional migration compat for vectors")
Acked-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
Tested-by: Mario Casquero <mcasquer@redhat.com>
Message-ID: <20250115161425.246348-1-arbab@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Allocate auxilliary guest RAM as an anonymous file that is shareable
with an external process. This option applies to memory allocated as
a side effect of creating various devices. It does not apply to
memory-backend-objects, whether explicitly specified on the command
line, or implicitly created by the -m command line option.
This option is intended to support new migration modes, in which the
memory region can be transferred in place to a new QEMU process, by sending
the memfd file descriptor to the process. Memory contents are preserved,
and if the mode also transfers device descriptors, then pages that are
locked in memory for DMA remain locked. This behavior is a pre-requisite
for supporting vfio, vdpa, and iommufd devices with the new modes.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/1736967650-129648-7-git-send-email-steven.sistare@oracle.com
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Pointer authentication on aarch64 is pretty expensive (up to 50% of
execution time) when running a virtual machine with tcg and -cpu max
(which enables pauth=on).
The advice is always: use pauth-impdef=on.
Our documentation even mentions it "by default" in
docs/system/introduction.rst.
Thus, we change the default to use impdef by default. This does not
affect kvm or hvf acceleration, since pauth algorithm used is the one
from host cpu.
This change is retro compatible, in terms of cli, with previous
versions, as the semantic of using -cpu max,pauth-impdef=on, and -cpu
max,pauth-qarma3=on is preserved.
The new option introduced in previous patch and matching old default is
-cpu max,pauth-qarma5=on.
It is retro compatible with migration as well, by defining a backcompat
property, that will use qarma5 by default for virt machine <= 9.2.
Tested by saving and restoring a vm from qemu 9.2.0 into qemu-master
(10.0) for cpus neoverse-n2 and max.
Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241219183211.3493974-3-pierrick.bouvier@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
"exec/confidential-guest-support.h" is specific to system
emulation, so move it under the system/ namespace.
Mechanical change doing:
$ sed -i \
-e 's,exec/confidential-guest-support.h,sysemu/confidential-guest-support.h,' \
$(git grep -l exec/confidential-guest-support.h)
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-Id: <20241218155913.72288-2-philmd@linaro.org>
Headers in include/sysemu/ are not only related to system
*emulation*, they are also used by virtualization. Rename
as system/ which is clearer.
Files renamed manually then mechanical change using sed tool.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Lei Yang <leiyang@redhat.com>
Message-Id: <20241203172445.28576-1-philmd@linaro.org>
Always explicitly create QEMU system containers upfront.
Root containers will be created when trying to fetch the root object the
1st time. They are:
/objects
/chardevs
/backend
Machine sub-containers will be created only until machine is being
initialized. They are:
/machine/unattached
/machine/peripheral
/machine/peripheral-anon
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20241121192202.4155849-8-peterx@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Add new -shim command line option, wire up for the x86 loader.
When specified load shim into the new "etc/boot/shim" fw_cfg file.
Needs OVMF changes too to be actually useful.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-ID: <20240905141211.1253307-6-kraxel@redhat.com>
The 'maxmem' parameter parsed on the command line is held in uint64_t
and then assigned to the MachineState field that is 'ram_addr_t'. This
assignment will wrap on 32-bit hosts, silently changing the user's
config request if it were over-sized.
Improve the existing diagnositics for validating 'size', and add the
same diagnostics for 'maxmem'
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Tested-by: Ani Sinha <anisinha@redhat.com>
Reviewed-by: Ani Sinha <anisinha@redhat.com>
Message-ID: <20241127114057.255995-1-berrange@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The x86 and ARM need to allow user to configure cache properties
(current only topology):
* For x86, the default cache topology model (of max/host CPU) does not
always match the Host's real physical cache topology. Performance can
increase when the configured virtual topology is closer to the
physical topology than a default topology would be.
* For ARM, QEMU can't get the cache topology information from the CPU
registers, then user configuration is necessary. Additionally, the
cache information is also needed for MPAM emulation (for TCG) to
build the right PPTT.
Define smp-cache related enumeration and properties in QAPI, so that
user could configure cache properties for SMP system through -machine in
the subsequent patch.
Cache enumeration (CacheLevelAndType) is implemented as the combination
of cache level (level 1/2/3) and cache type (data/instruction/unified).
Currently, separated L1 cache (L1 data cache and L1 instruction cache)
with unified higher-level cache (e.g., unified L2 and L3 caches), is the
most common cache architectures.
Therefore, enumerate the L1 D-cache, L1 I-cache, L2 cache and L3 cache
with smp-cache object to add the basic cache topology support. Other
kinds of caches (e.g., L1 unified or L2/L3 separated caches) can be
added directly into CacheLevelAndType if necessary.
Cache properties (SmpCacheProperties) currently only contains cache
topology information, and other cache properties can be added in it
if necessary.
Note, define cache topology based on CPU topology level with two
reasons:
1. In practice, a cache will always be bound to the CPU container
(either private in the CPU container or shared among multiple
containers), and CPU container is often expressed in terms of CPU
topology level.
2. The x86's cache-related CPUIDs encode cache topology based on APIC
ID's CPU topology layout. And the ACPI PPTT table that ARM/RISCV
relies on also requires CPU containers to help indicate the private
shared hierarchy of the cache. Therefore, for SMP systems, it is
natural to use the CPU topology hierarchy directly in QEMU to define
the cache topology.
With smp-cache QAPI support, add smp cache topology for machine by
parsing the smp-cache object list.
Also add the helper to access/update cache topology level of machine.
Suggested-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-ID: <20241101083331.340178-4-zhao1.liu@intel.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Include the missing "qemu/units.h" to fix when refactoring code:
../hw/core/machine.c:743:34: error: use of undeclared identifier 'MiB'
743 | mc->default_ram_size = 128 * MiB;
| ^
../hw/core/machine.c:750:44: error: use of undeclared identifier 'TiB'
750 | mc->smbios_memory_device_size = 2047 * TiB;
| ^
and "qemu/error-report.h" to fix:
../hw/core/machine.c:1029:13: error: call to undeclared function 'error_report' [-Wimplicit-function-declaration]
1029 | error_report("NUMA node %" PRIu16 " is missing, use "
| ^
../hw/core/machine.c:1240:9: error: call to undeclared function 'warn_report' [-Wimplicit-function-declaration]
1240 | warn_report("CPU model %s is deprecated -- %s",
| ^
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-Id: <20240930221900.59525-2-philmd@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
CXL now can use Generic Port Affinity Structures.
CXL now allows control of link speed and width
vhost-user-blk now supports live resize, by means of
a new device-sync-config command
amd iommu now supports interrupt remapping
pcie devices now report extended tag field support
intel_iommu dropped support for Transient Mapping, to match VTD spec
arch agnostic ACPI infrastructure for vCPU Hotplug
Fixes, cleanups all over the place.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmcpNqUPHG1zdEByZWRo
YXQuY29tAAoJECgfDbjSjVRp/2oH/0qO33prhDa48J5mqT9NuJzzYwp5QHKF9Zjv
fDAplMUEmfxZIEgJchcyDWPYTGX2geT4pCFhRWioZMIR/0JyzrFgSwsk1kL88cMh
46gzhNVD6ybyPJ7O0Zq3GLy5jo7rlw/n+fFxKAuRCzcbK/fmH8gNC+RwW1IP64Na
HDczYilHUhnO7yKZFQzQNQVbK4BckrG1bu0Fcx0EMUQBf4V6x7GLOrT+3hkKYcr6
+DG5DmUmv20or/FXnu2Ye+MzR8Ebx6JVK3A3sXEE4Ns2CCzK9QLzeeyc2aU13jWN
OpZ6WcKF8HqYprIwnSsMTxhPcq0/c7TvrGrazVwna5RUBMyjjvc=
=zSX4
-----END PGP SIGNATURE-----
Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging
virtio,pc,pci: features, fixes, cleanups
CXL now can use Generic Port Affinity Structures.
CXL now allows control of link speed and width
vhost-user-blk now supports live resize, by means of
a new device-sync-config command
amd iommu now supports interrupt remapping
pcie devices now report extended tag field support
intel_iommu dropped support for Transient Mapping, to match VTD spec
arch agnostic ACPI infrastructure for vCPU Hotplug
Fixes, cleanups all over the place.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmcpNqUPHG1zdEByZWRo
# YXQuY29tAAoJECgfDbjSjVRp/2oH/0qO33prhDa48J5mqT9NuJzzYwp5QHKF9Zjv
# fDAplMUEmfxZIEgJchcyDWPYTGX2geT4pCFhRWioZMIR/0JyzrFgSwsk1kL88cMh
# 46gzhNVD6ybyPJ7O0Zq3GLy5jo7rlw/n+fFxKAuRCzcbK/fmH8gNC+RwW1IP64Na
# HDczYilHUhnO7yKZFQzQNQVbK4BckrG1bu0Fcx0EMUQBf4V6x7GLOrT+3hkKYcr6
# +DG5DmUmv20or/FXnu2Ye+MzR8Ebx6JVK3A3sXEE4Ns2CCzK9QLzeeyc2aU13jWN
# OpZ6WcKF8HqYprIwnSsMTxhPcq0/c7TvrGrazVwna5RUBMyjjvc=
# =zSX4
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 04 Nov 2024 21:03:33 GMT
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (65 commits)
intel_iommu: Add missed reserved bit check for IEC descriptor
intel_iommu: Add missed sanity check for 256-bit invalidation queue
intel_iommu: Send IQE event when setting reserved bit in IQT_TAIL
hw/acpi: Update GED with vCPU Hotplug VMSD for migration
tests/qtest/bios-tables-test: Update DSDT golden masters for x86/{pc,q35}
hw/acpi: Update ACPI `_STA` method with QOM vCPU ACPI Hotplug states
qtest: allow ACPI DSDT Table changes
hw/acpi: Make CPUs ACPI `presence` conditional during vCPU hot-unplug
hw/pci: Add parenthesis to PCI_BUILD_BDF macro
hw/cxl: Ensure there is enough data to read the input header in cmd_get_physical_port_state()
hw/cxl: Ensure there is enough data for the header in cmd_ccls_set_lsa()
hw/cxl: Check that writes do not go beyond end of target attributes
hw/cxl: Ensuring enough data to read parameters in cmd_tunnel_management_cmd()
hw/cxl: Avoid accesses beyond the end of cel_log.
hw/cxl: Check the length of data requested fits in get_log()
hw/cxl: Check enough data in cmd_firmware_update_transfer()
hw/cxl: Check input length is large enough in cmd_events_clear_records()
hw/cxl: Check input includes at least the header in cmd_features_set_feature()
hw/cxl: Check size of input data to dynamic capacity mailbox commands
hw/cxl/cxl-mailbox-util: Fix output buffer index update when retrieving DC extents
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* Big cleanup of deprecated machines
* Power11 support for spapr
* XIVE improvements
* Goodbye to Cedric and David as ppc reviewers, thank you both o7
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEETkN92lZhb0MpsKeVZ7MCdqhiHK4FAmcoEicACgkQZ7MCdqhi
HK5M8Q//fz+ZkJndXkBjb1Oinx+q+eVtNm2JrvcWIsXyhG3K+6VxYPp69H+SRv/Z
TWuUqMQPxq8mhQvBJlDAttp/oaUEiOcCRvs/iUoBN12L4mVxXfdoT88TZ4frN3eP
8bePq+DW2N/7gpmsJm5CyEZPpcf9AjVHgLRp3KYFkOJ/14uzvuwnocU39gl+2IUh
MXHTedQgMNXaKorJXk1NVdM6NxMuVhOvwxAs6ya2gwhxyA5tteo5PiQOnDJWkejf
xg3RRsNzGYcs1Qg/3kFIf3RfEB0aYbPxROM8IfPaJWKN5KnMggj/JAkHyK1x/V3J
wml7+cB0doMt/yRiuYJhXpyrtOqpvjRWPA6RhxECWW2kwrovv8NAF8IrFnw9NvOQ
QC66ZaaFcbAcFrVT1e/iggU76d01II6m4OAgKcXw+FRHgps4VU9y83j7ApNnNUWN
IXp9hkzoHi5VwX0FrG4ELUr2iEf1HASMvM8EZ/0AxzWj5iNtQB8lFsrEdaGVXyIS
M5JaJeNjCn4koCyYaFSctH5eKtbzIwnGWnDcdTwaOuQ+9itBvY8O+HZalE6sAc5S
kLFZ7i/Ut/qxbY5pMumt8LKD4pR1SsOxFB8dJCmn/f/tvRGtIVsoY6btNe4M0+24
42MxZbWO6W379C32bwbtsPiGA+aLSgShjP4cWm9cgRjz4RJFnwg=
=vmIG
-----END PGP SIGNATURE-----
Merge tag 'pull-ppc-for-9.2-1-20241104' of https://gitlab.com/npiggin/qemu into staging
* Various bug fixes
* Big cleanup of deprecated machines
* Power11 support for spapr
* XIVE improvements
* Goodbye to Cedric and David as ppc reviewers, thank you both o7
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCgAdFiEETkN92lZhb0MpsKeVZ7MCdqhiHK4FAmcoEicACgkQZ7MCdqhi
# HK5M8Q//fz+ZkJndXkBjb1Oinx+q+eVtNm2JrvcWIsXyhG3K+6VxYPp69H+SRv/Z
# TWuUqMQPxq8mhQvBJlDAttp/oaUEiOcCRvs/iUoBN12L4mVxXfdoT88TZ4frN3eP
# 8bePq+DW2N/7gpmsJm5CyEZPpcf9AjVHgLRp3KYFkOJ/14uzvuwnocU39gl+2IUh
# MXHTedQgMNXaKorJXk1NVdM6NxMuVhOvwxAs6ya2gwhxyA5tteo5PiQOnDJWkejf
# xg3RRsNzGYcs1Qg/3kFIf3RfEB0aYbPxROM8IfPaJWKN5KnMggj/JAkHyK1x/V3J
# wml7+cB0doMt/yRiuYJhXpyrtOqpvjRWPA6RhxECWW2kwrovv8NAF8IrFnw9NvOQ
# QC66ZaaFcbAcFrVT1e/iggU76d01II6m4OAgKcXw+FRHgps4VU9y83j7ApNnNUWN
# IXp9hkzoHi5VwX0FrG4ELUr2iEf1HASMvM8EZ/0AxzWj5iNtQB8lFsrEdaGVXyIS
# M5JaJeNjCn4koCyYaFSctH5eKtbzIwnGWnDcdTwaOuQ+9itBvY8O+HZalE6sAc5S
# kLFZ7i/Ut/qxbY5pMumt8LKD4pR1SsOxFB8dJCmn/f/tvRGtIVsoY6btNe4M0+24
# 42MxZbWO6W379C32bwbtsPiGA+aLSgShjP4cWm9cgRjz4RJFnwg=
# =vmIG
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 04 Nov 2024 00:15:35 GMT
# gpg: using RSA key 4E437DDA56616F4329B0A79567B30276A8621CAE
# gpg: Good signature from "Nicholas Piggin <npiggin@gmail.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 4E43 7DDA 5661 6F43 29B0 A795 67B3 0276 A862 1CAE
* tag 'pull-ppc-for-9.2-1-20241104' of https://gitlab.com/npiggin/qemu: (67 commits)
MAINTAINERS: Remove myself as reviewer
MAINTAINERS: Remove myself from XIVE
MAINTAINERS: Remove myself from the PowerNV machines
hw/ppc: Consolidate ppc440 initial mapping creation functions
hw/ppc: Consolidate e500 initial mapping creation functions
tests/qtest: Add XIVE tests for the powernv10 machine
pnv/xive2: TIMA CI ops using alternative offsets or byte lengths
pnv/xive2: TIMA support for 8-byte OS context push for PHYP
pnv/xive: Update PIPR when updating CPPR
pnv/xive: Add special handling for pool targets
ppc/xive2: Support "Pull Thread Context to Odd Thread Reporting Line"
ppc/xive2: Change context/ring specific functions to be generic
ppc/xive2: Support "Pull Thread Context to Register" operation
ppc/xive2: Allow 1-byte write of Target field in TIMA
ppc/xive2: Dump the VP-group and crowd tables with 'info pic'
ppc/xive2: Dump more NVP state with 'info pic'
pnv/xive2: Support for "OS LGS Push" TIMA operation
ppc/xive2: Support TIMA "Pull OS Context to Odd Thread Reporting Line"
pnv/xive2: Define OGEN field in the TIMA
pnv/xive: TIMA patch sets pre-req alignment and formatting changes
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
>From what I read PCI has 32 transactions, PCI Express devices can handle
256 with Extended tag enabled (spec mentions also larger values but I
lack PCIe knowledge).
QEMU leaves 'Extended tag field' with 0 as value:
Capabilities: [e0] Express (v1) Root Complex Integrated Endpoint, IntMsgNum 0
DevCap: MaxPayload 128 bytes, PhantFunc 0
ExtTag- RBE+ FLReset- TEE-IO-
SBSA ACS has test 824 which checks for PCIe device capabilities. BSA
specification [1] (SBSA is on top of BSA) in section F.3.2 lists
expected values for Device Capabilities Register:
Device Capabilities Register Requirement
Role based error reporting RCEC and RCiEP: Hardwired to 1
Endpoint L0s acceptable latency RCEC and RCiEP: Hardwired to 0
L1 acceptable latency RCEC and RCiEP: Hardwired to 0
Captured slot power limit scale RCEC and RCiEP: Hardwired to 0
Captured slot power limit value RCEC and RCiEP: Hardwired to 0
Max payload size value must be compliant with PCIe spec
Phantom functions RCEC and RCiEP: Recommendation is to
hardwire this bit to 0.
Extended tag field Hardwired to 1
1. https://developer.arm.com/documentation/den0094/c/
This change enables Extended tag field. All versioned platforms should
have it disabled for older versions (tested with Arm/virt).
Signed-off-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
Message-Id: <20241023113820.486017-1-marcin.juszkiewicz@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Commit 1392617d35 intended to tag pseries-2.1 - 2.11 machines as
deprecated with reasons mentioned in its commit log.
Removing pseries-2.3 specific code with this patch for now.
While at it, also remove the dynamic-reconfiguration option which was
introduced to disable it by default for legacy machines until pseries-2.3.
Suggested-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Commit 1392617d35 intended to tag pseries-2.1 - 2.11 machines as
deprecated with reasons mentioned in its commit log.
Removing pseries-2.2 specific code with this patch for now.
Suggested-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Commit 1392617d35 intended to tag pseries-2.1 - 2.11 machines as
deprecated with reasons mentioned in its commit log.
Removing pseries-2.1 specific code with this patch for now.
Suggested-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
This is in preparation for the next commit where the nitro-enclave
machine type will need to instead use a memfd backend, for the built-in
vhost-user-vsock device to work.
Signed-off-by: Dorjoy Chowdhury <dorjoychy111@gmail.com>
Reviewed-by: Alexander Graf <graf@amazon.com>
Link: https://lore.kernel.org/r/20241008211727.49088-5-dorjoychy111@gmail.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
pci: Initial support for SPDM Responders
cxl: Add support for scan media, feature commands, device patrol scrub
control, DDR5 ECS control, firmware updates
virtio: in-order support
virtio-net: support for SR-IOV emulation (note: known issues on s390,
might get reverted if not fixed)
smbios: memory device size is now configurable per Machine
cpu: architecture agnostic code to support vCPU Hotplug
Fixes, cleanups all over the place.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmae9l8PHG1zdEByZWRo
YXQuY29tAAoJECgfDbjSjVRp8fYH/impBH9nViO/WK48io4mLSkl0EUL8Y/xrMvH
zKFCKaXq8D96VTt1Z4EGKYgwG0voBKZaCEKYU/0ARGnSlSwxINQ8ROCnBWMfn2sx
yQt08EXVMznNLtXjc6U5zCoCi6SaV85GH40No3MUFXBQt29ZSlFqO/fuHGZHYBwS
wuVKvTjjNF4EsGt3rS4Qsv6BwZWMM+dE6yXpKWk68kR8IGp+6QGxkMbWt9uEX2Md
VuemKVnFYw0XGCGy5K+ZkvoA2DGpEw0QxVSOMs8CI55Oc9SkTKz5fUSzXXGo1if+
M1CTjOPJu6pMym6gy6XpFa8/QioDA/jE2vBQvfJ64TwhJDV159s=
=k8e9
-----END PGP SIGNATURE-----
Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging
virtio,pci,pc: features,fixes
pci: Initial support for SPDM Responders
cxl: Add support for scan media, feature commands, device patrol scrub
control, DDR5 ECS control, firmware updates
virtio: in-order support
virtio-net: support for SR-IOV emulation (note: known issues on s390,
might get reverted if not fixed)
smbios: memory device size is now configurable per Machine
cpu: architecture agnostic code to support vCPU Hotplug
Fixes, cleanups all over the place.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmae9l8PHG1zdEByZWRo
# YXQuY29tAAoJECgfDbjSjVRp8fYH/impBH9nViO/WK48io4mLSkl0EUL8Y/xrMvH
# zKFCKaXq8D96VTt1Z4EGKYgwG0voBKZaCEKYU/0ARGnSlSwxINQ8ROCnBWMfn2sx
# yQt08EXVMznNLtXjc6U5zCoCi6SaV85GH40No3MUFXBQt29ZSlFqO/fuHGZHYBwS
# wuVKvTjjNF4EsGt3rS4Qsv6BwZWMM+dE6yXpKWk68kR8IGp+6QGxkMbWt9uEX2Md
# VuemKVnFYw0XGCGy5K+ZkvoA2DGpEw0QxVSOMs8CI55Oc9SkTKz5fUSzXXGo1if+
# M1CTjOPJu6pMym6gy6XpFa8/QioDA/jE2vBQvfJ64TwhJDV159s=
# =k8e9
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 23 Jul 2024 10:16:31 AM AEST
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [undefined]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [undefined]
# gpg: WARNING: The key's User ID is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (61 commits)
hw/nvme: Add SPDM over DOE support
backends: Initial support for SPDM socket support
hw/pci: Add all Data Object Types defined in PCIe r6.0
tests/acpi: Add expected ACPI AML files for RISC-V
tests/qtest/bios-tables-test.c: Enable basic testing for RISC-V
tests/acpi: Add empty ACPI data files for RISC-V
tests/qtest/bios-tables-test.c: Remove the fall back path
tests/acpi: update expected DSDT blob for aarch64 and microvm
acpi/gpex: Create PCI link devices outside PCI root bridge
tests/acpi: Allow DSDT acpi table changes for aarch64
hw/riscv/virt-acpi-build.c: Update the HID of RISC-V UART
hw/riscv/virt-acpi-build.c: Add namespace devices for PLIC and APLIC
virtio-iommu: Add trace point on virtio_iommu_detach_endpoint_from_domain
hw/vfio/common: Add vfio_listener_region_del_iommu trace event
virtio-iommu: Remove the end point on detach
virtio-iommu: Free [host_]resv_ranges on unset_iommu_devices
virtio-iommu: Remove probe_done
Revert "virtio-iommu: Clear IOMMUDevice when VFIO device is unplugged"
gdbstub: Add helper function to unregister GDB register space
physmem: Add helper function to destroy CPU AddressSpace
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Currently QEMU describes initial[1] RAM* in SMBIOS as a series of
virtual DIMMs (capped at 16Gb max) using type 17 structure entries.
Which is fine for the most cases. However when starting guest
with terabytes of RAM this leads to too many memory device
structures, which eventually upsets linux kernel as it reserves
only 64K for these entries and when that border is crossed out
it runs out of reserved memory.
Instead of partitioning initial RAM on 16Gb DIMMs, use maximum
possible chunk size that SMBIOS spec allows[2]. Which lets
encode RAM in lower 31 bits of 32bit field (which amounts upto
2047Tb per DIMM).
As result initial RAM will generate only one type 17 structure
until host/guest reach ability to use more RAM in the future.
Compat changes:
We can't unconditionally change chunk size as it will break
QEMU<->guest ABI (and migration). Thus introduce a new machine
class field that would let older versioned machines to use
legacy 16Gb chunks, while new(er) machine type[s] use maximum
possible chunk size.
PS:
While it might seem to be risky to rise max entry size this large
(much beyond of what current physical RAM modules support),
I'd not expect it causing much issues, modulo uncovering bugs
in software running within guest. And those should be fixed
on guest side to handle SMBIOS spec properly, especially if
guest is expected to support so huge RAM configs.
In worst case, QEMU can reduce chunk size later if we would
care enough about introducing a workaround for some 'unfixable'
guest OS, either by fixing up the next machine type or
giving users a CLI option to customize it.
1) Initial RAM - is RAM configured with help '-m SIZE' CLI option/
implicitly defined by machine. It doesn't include memory
configured with help of '-device' option[s] (pcdimm,nvdimm,...)
2) SMBIOS 3.1.0 7.18.5 Memory Device — Extended Size
PS:
* tested on 8Tb host with RHEL6 guest, which seems to parse
type 17 SMBIOS table entries correctly (according to 'dmidecode').
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20240715122417.4059293-1-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
"make check SPEED=slow" is currently failing the device-introspect-test on
older machine types since introspecting "scsi-block" is causing an abort:
$ ./qemu-system-x86_64 -M pc-q35-8.0 -monitor stdio
QEMU 9.0.50 monitor - type 'help' for more information
(qemu) device_add scsi-block,help
Unexpected error in object_property_find_err() at
../../devel/qemu/qom/object.c:1357:
can't apply global scsi-disk-base.migrate-emulated-scsi-request=false:
Property 'scsi-block.migrate-emulated-scsi-request' not found
Aborted (core dumped)
The problem is that the compat code tries to change the
"migrate-emulated-scsi-request" property for all devices that are
derived from "scsi-block", but the property has only been added
to "scsi-hd" and "scsi-cd" via the DEFINE_SCSI_DISK_PROPERTIES macro.
Thus let's fix the problem by only changing the property on the devices
that really have this property.
Fixes: b4912afa5f ("scsi-disk: Fix crash for VM configured with USB CDROM after live migration")
Message-ID: <20240703090904.909720-1-thuth@redhat.com>
Acked-by: Hyman Huang <yong.huang@smartx.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Recent SDHCI expect cards to support the v3.01 spec
to negociate lower I/O voltage. Select it by default.
Versioned machine types with a version of 9.0 or
earlier retain the old default (spec v2.00).
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Tested-by: Andrew Jeffery <andrew@codeconstruct.com.au>
Tested-by: Cédric Le Goater <clg@redhat.com>
Message-Id: <20240703134356.85972-2-philmd@linaro.org>
In current code, when guest does S3, virtio-gpu are reset due to the
bit No_Soft_Reset is not set. After resetting, the display resources
of virtio-gpu are destroyed, then the display can't come back and only
show blank after resuming.
Implement No_Soft_Reset bit of PCI_PM_CTRL register, then guest can check
this bit, if this bit is set, the devices resetting will not be done, and
then the display can work after resuming.
No_Soft_Reset bit is implemented for all virtio devices, and was tested
only on virtio-gpu device. Set it false by default for safety.
Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
Message-Id: <20240606102205.114671-3-Jiqian.Chen@amd.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
For VMs configured with the USB CDROM device:
-drive file=/path/to/local/file,id=drive-usb-disk0,media=cdrom,readonly=on...
-device usb-storage,drive=drive-usb-disk0,id=usb-disk0...
QEMU process may crash after live migration, to reproduce the issue,
configure VM (Guest OS ubuntu 20.04 or 21.10) with the following XML:
<disk type='file' device='cdrom'>
<driver name='qemu' type='raw'/>
<source file='/path/to/share_fs/cdrom.iso'/>
<target dev='sda' bus='usb'/>
<readonly/>
<address type='usb' bus='0' port='2'/>
</disk>
<controller type='usb' index='0' model='piix3-uhci'/>
Do the live migration repeatedly, crash may happen after live migratoin,
trace log at the source before live migration is as follows:
324808@1711972823.521945:usb_uhci_frame_start nr 319
324808@1711972823.521978:usb_uhci_qh_load qh 0x35cb5400
324808@1711972823.521989:usb_uhci_qh_load qh 0x35cb5480
324808@1711972823.521997:usb_uhci_td_load qh 0x35cb5480, td 0x35cbe000, ctrl 0x0, token 0xffe07f69
324808@1711972823.522010:usb_uhci_td_nextqh qh 0x35cb5480, td 0x35cbe000
324808@1711972823.522022:usb_uhci_qh_load qh 0x35cb5680
324808@1711972823.522030:usb_uhci_td_load qh 0x35cb5680, td 0x75ac5180, ctrl 0x19800000, token 0x3c903e1
324808@1711972823.522045:usb_uhci_packet_add token 0x103e1, td 0x75ac5180
324808@1711972823.522056:usb_packet_state_change bus 0, port 2, ep 2, packet 0x559f9ba14b00, state undef -> setup
324808@1711972823.522079:usb_msd_cmd_submit lun 0, tag 0x472, flags 0x00000080, len 10, data-len 8
324808@1711972823.522107:scsi_req_parsed target 0 lun 0 tag 1138 command 74 dir 1 length 8
324808@1711972823.522124:scsi_req_parsed_lba target 0 lun 0 tag 1138 command 74 lba 4096
324808@1711972823.522139:scsi_req_alloc target 0 lun 0 tag 1138
324808@1711972823.522169:scsi_req_continue target 0 lun 0 tag 1138
324808@1711972823.522181:scsi_req_data target 0 lun 0 tag 1138 len 8
324808@1711972823.522194:usb_packet_state_change bus 0, port 2, ep 2, packet 0x559f9ba14b00, state setup -> complete
324808@1711972823.522209:usb_uhci_packet_complete_success token 0x103e1, td 0x75ac5180
324808@1711972823.522219:usb_uhci_packet_del token 0x103e1, td 0x75ac5180
324808@1711972823.522232:usb_uhci_td_complete qh 0x35cb5680, td 0x75ac5180
trace log at the destination after live migration is as follows:
3286206@1711972823.951646:usb_uhci_frame_start nr 320
3286206@1711972823.951663:usb_uhci_qh_load qh 0x35cb5100
3286206@1711972823.951671:usb_uhci_qh_load qh 0x35cb5480
3286206@1711972823.951680:usb_uhci_td_load qh 0x35cb5480, td 0x35cbe000, ctrl 0x1000000, token 0xffe07f69
3286206@1711972823.951693:usb_uhci_td_nextqh qh 0x35cb5480, td 0x35cbe000
3286206@1711972823.951702:usb_uhci_qh_load qh 0x35cb5700
3286206@1711972823.951709:usb_uhci_td_load qh 0x35cb5700, td 0x75ac5240, ctrl 0x39800000, token 0xe08369
3286206@1711972823.951727:usb_uhci_queue_add token 0x8369
3286206@1711972823.951735:usb_uhci_packet_add token 0x8369, td 0x75ac5240
3286206@1711972823.951746:usb_packet_state_change bus 0, port 2, ep 1, packet 0x56066b2fb5a0, state undef -> setup
3286206@1711972823.951766:usb_msd_data_in 8/8 (scsi 8)
2024-04-01 12:00:24.665+0000: shutting down, reason=crashed
The backtrace reveals the following:
Program terminated with signal SIGSEGV, Segmentation fault.
0 __memmove_sse2_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:312
312 movq -8(%rsi,%rdx), %rcx
[Current thread is 1 (Thread 0x7f0a9025fc00 (LWP 3286206))]
(gdb) bt
0 __memmove_sse2_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:312
1 memcpy (__len=8, __src=<optimized out>, __dest=<optimized out>) at /usr/include/bits/string_fortified.h:34
2 iov_from_buf_full (iov=<optimized out>, iov_cnt=<optimized out>, offset=<optimized out>, buf=0x0, bytes=bytes@entry=8) at ../util/iov.c:33
3 iov_from_buf (bytes=8, buf=<optimized out>, offset=<optimized out>, iov_cnt=<optimized out>, iov=<optimized out>)
at /usr/src/debug/qemu-6-6.2.0-75.7.oe1.smartx.git.40.x86_64/include/qemu/iov.h:49
4 usb_packet_copy (p=p@entry=0x56066b2fb5a0, ptr=<optimized out>, bytes=bytes@entry=8) at ../hw/usb/core.c:636
5 usb_msd_copy_data (s=s@entry=0x56066c62c770, p=p@entry=0x56066b2fb5a0) at ../hw/usb/dev-storage.c:186
6 usb_msd_handle_data (dev=0x56066c62c770, p=0x56066b2fb5a0) at ../hw/usb/dev-storage.c:496
7 usb_handle_packet (dev=0x56066c62c770, p=p@entry=0x56066b2fb5a0) at ../hw/usb/core.c:455
8 uhci_handle_td (s=s@entry=0x56066bd5f210, q=0x56066bb7fbd0, q@entry=0x0, qh_addr=qh_addr@entry=902518530, td=td@entry=0x7fffe6e788f0, td_addr=<optimized out>,
int_mask=int_mask@entry=0x7fffe6e788e4) at ../hw/usb/hcd-uhci.c:885
9 uhci_process_frame (s=s@entry=0x56066bd5f210) at ../hw/usb/hcd-uhci.c:1061
10 uhci_frame_timer (opaque=opaque@entry=0x56066bd5f210) at ../hw/usb/hcd-uhci.c:1159
11 timerlist_run_timers (timer_list=0x56066af26bd0) at ../util/qemu-timer.c:642
12 qemu_clock_run_timers (type=QEMU_CLOCK_VIRTUAL) at ../util/qemu-timer.c:656
13 qemu_clock_run_all_timers () at ../util/qemu-timer.c:738
14 main_loop_wait (nonblocking=nonblocking@entry=0) at ../util/main-loop.c:542
15 qemu_main_loop () at ../softmmu/runstate.c:739
16 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at ../softmmu/main.c:52
(gdb) frame 5
(gdb) p ((SCSIDiskReq *)s->req)->iov
$1 = {iov_base = 0x0, iov_len = 0}
(gdb) p/x s->req->tag
$2 = 0x472
When designing the USB mass storage device model, QEMU places SCSI disk
device as the backend of USB mass storage device. In addition, USB mass
device driver in Guest OS conforms to the "Universal Serial Bus Mass
Storage Class Bulk-Only Transport" specification in order to simulate
the transform behavior between a USB controller and a USB mass device.
The following shows the protocol hierarchy:
+----------------+
CDROM driver | scsi command | CDROM
+----------------+
+-----------------------+
USB mass | USB Mass Storage Class| USB mass
storage driver | Bulk-Only Transport | storage device
+-----------------------+
+----------------+
USB Controller | USB Protocol | USB device
+----------------+
In the USB protocol layer, between the USB controller and USB device, at
least two USB packets will be transformed when guest OS send a
read operation to USB mass storage device:
1. The CBW packet, which will be delivered to the USB device's Bulk-Out
endpoint. In order to simulate a read operation, the USB mass storage
device parses the CBW and converts it to a SCSI command, which would be
executed by CDROM(represented as SCSI disk in QEMU internally), and store
the result data of the SCSI command in a buffer.
2. The DATA-IN packet, which will be delivered from the USB device's
Bulk-In endpoint(fetched directly from the preceding buffer) to the USB
controller.
We consider UHCI to be the controller. The two packets mentioned above may
have been processed by UHCI in two separate frame entries of the Frame List
, and also described by two different TDs. Unlike the physical environment,
a virtualized environment requires the QEMU to make sure that the result
data of CBW is not lost and is delivered to the UHCI controller.
Currently, these types of SCSI requests are not migrated, so QEMU cannot
ensure the result data of the IO operation is not lost if there are
inflight emulated SCSI requests during the live migration.
Assume for the moment that the USB mass storage device is processing the
CBW and storing the result data of the read operation to a buffre, live
migration happens and moves the VM to the destination while not migrating
the result data of the read operation.
After migration, when UHCI at the destination issues a DATA-IN request to
the USB mass storage device, a crash happens because USB mass storage device
fetches the result data and get nothing.
The scenario this patch addresses is this one.
Theoretically, any device that uses the SCSI disk as a back-end would be
affected by this issue. In this case, it is the USB CDROM.
To fix it, inflight emulated SCSI request be migrated during live migration,
similar to the DMA SCSI request.
Signed-off-by: Hyman Huang <yong.huang@smartx.com>
Message-ID: <878c8f093f3fc2f584b5c31cb2490d9f6a12131a.1716531409.git.yong.huang@smartx.com>
[Do not bump migration version, introduce compat property instead. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Detect early unsupported MADV_MERGEABLE and MADV_DONTDUMP, and print a clearer
error message that points to the deficiency of the host.
Cc: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Otherwise, starting any guest on a non-Linux guests results in
qemu-system-arm: Couldn't set property 'merge' on 'memory-backend-ram': Invalid argument
Cc: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Ask the ConfidentialGuestSupport object whether to use guest_memfd
for KVM-backend private memory. This bool can be set in instance_init
(or user_complete) so that it is available when the machine is created.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The legacy SCSI passthrough functionality has never been enabled for
VIRTIO 1.0 and was deprecated more than four years ago.
Get rid of it---almost, because QEMU is advertising it unconditionally
for legacy virtio-blk devices. Just parse the header and return a
nonzero status.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
- Li Zhijian's COLO minor fixes
- Marc-André's virtio-gpu fix
- Fiona's virtio-net USO fix
- A couple of migration-test fixes from Thomas
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEqhtIsKIjJqWkw2TPx5jcdBvsMZ0FAmZObggQHGZhcm9zYXNA
c3VzZS5kZQAKCRDHmNx0G+wxnWE8D/49RGE+g29qyk9aKx3lU8mSq+ZzmX5GncBt
5+Mx5qoHDsBCQTE+dQpEVIoeMJ2HIbgbOML4qsnp6Hw/4/TWkfwC/R6+ZmHBevRk
fVLkVh2JMHVg8Tq+0FO1X1QnMU03uJ7EAuWdDa8HqlJ5dQY/K3gDaku8oQBXk96X
13pChSbMob76tdb+wiwbdEakabigH7XfrPdI6lzI8MCGTIcPKc/UKTFYuoj/OsNx
raqy+uBtvKtfHxiaYnIgHIPNAF/1f4tP3iAOcPoZWIMXWxFkE8+ANDJAbWo6xIcL
DGg/wEzZO/OnXLjOhjvLBUHK/fx4wQ5bsqA09BVxoRyBGblkXr+bcwBLYjgiEqzT
aniPiAx5W/Db+T7HqZPIWesFYj3cmcwvYUTrx/RPMdC0epG+ZczDMtescHdZbxvt
Pjs3nFeCLhyYcVhlTI72eXRCxdd/26+r6/OmrBC2+GaZrybM61TvNo+3XvO0Pfhi
UmwF2EN27XmSMelLvH/MnflUVgBHKDs3CCQzDlxreHq2jMVR0SL7LU5wMJJ58Iok
M3u74izQM25bwYxiASH+4iRn0puH1mOwgOx28W0uiQfZY/678/lCnwa1Tul15BRE
fIQZJhyIGzhSpwLqEXmdXdlLQs1isqIgpd/mzKgZ285nLr7kz+4gxCUqiXgVbrl7
P45Dym1u4g==
=DDrh
-----END PGP SIGNATURE-----
Merge tag 'migration-20240522-pull-request' of https://gitlab.com/farosas/qemu into staging
Migration pull request
- Li Zhijian's COLO minor fixes
- Marc-André's virtio-gpu fix
- Fiona's virtio-net USO fix
- A couple of migration-test fixes from Thomas
# -----BEGIN PGP SIGNATURE-----
#
# iQJEBAABCAAuFiEEqhtIsKIjJqWkw2TPx5jcdBvsMZ0FAmZObggQHGZhcm9zYXNA
# c3VzZS5kZQAKCRDHmNx0G+wxnWE8D/49RGE+g29qyk9aKx3lU8mSq+ZzmX5GncBt
# 5+Mx5qoHDsBCQTE+dQpEVIoeMJ2HIbgbOML4qsnp6Hw/4/TWkfwC/R6+ZmHBevRk
# fVLkVh2JMHVg8Tq+0FO1X1QnMU03uJ7EAuWdDa8HqlJ5dQY/K3gDaku8oQBXk96X
# 13pChSbMob76tdb+wiwbdEakabigH7XfrPdI6lzI8MCGTIcPKc/UKTFYuoj/OsNx
# raqy+uBtvKtfHxiaYnIgHIPNAF/1f4tP3iAOcPoZWIMXWxFkE8+ANDJAbWo6xIcL
# DGg/wEzZO/OnXLjOhjvLBUHK/fx4wQ5bsqA09BVxoRyBGblkXr+bcwBLYjgiEqzT
# aniPiAx5W/Db+T7HqZPIWesFYj3cmcwvYUTrx/RPMdC0epG+ZczDMtescHdZbxvt
# Pjs3nFeCLhyYcVhlTI72eXRCxdd/26+r6/OmrBC2+GaZrybM61TvNo+3XvO0Pfhi
# UmwF2EN27XmSMelLvH/MnflUVgBHKDs3CCQzDlxreHq2jMVR0SL7LU5wMJJ58Iok
# M3u74izQM25bwYxiASH+4iRn0puH1mOwgOx28W0uiQfZY/678/lCnwa1Tul15BRE
# fIQZJhyIGzhSpwLqEXmdXdlLQs1isqIgpd/mzKgZ285nLr7kz+4gxCUqiXgVbrl7
# P45Dym1u4g==
# =DDrh
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 22 May 2024 03:13:28 PM PDT
# gpg: using RSA key AA1B48B0A22326A5A4C364CFC798DC741BEC319D
# gpg: issuer "farosas@suse.de"
# gpg: Good signature from "Fabiano Rosas <farosas@suse.de>" [unknown]
# gpg: aka "Fabiano Almeida Rosas <fabiano.rosas@suse.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: AA1B 48B0 A223 26A5 A4C3 64CF C798 DC74 1BEC 319D
* tag 'migration-20240522-pull-request' of https://gitlab.com/farosas/qemu:
tests/qtest/migration-test: Fix the check for a successful run of analyze-migration.py
tests/qtest/migration-test: Run some basic tests on s390x and ppc64 with TCG, too
hw/core/machine: move compatibility flags for VirtIO-net USO to machine 8.1
virtio-gpu: fix v2 migration
migration: fix a typo
migration: add "exists" info to load-state-field trace
migration/colo: Tidy up bql_unlock() around bdrv_activate_all()
migration/colo: make colo_incoming_co() return void
migration/colo: Minor fix for colo error message
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Migration from an 8.2 or 9.0 binary to an 8.1 binary with machine
version 8.1 can fail with:
> kvm: Features 0x1c0010130afffa7 unsupported. Allowed features: 0x10179bfffe7
> kvm: Failed to load virtio-net:virtio
> kvm: error while loading state for instance 0x0 of device '0000:00:12.0/virtio-net'
> kvm: load of migration failed: Operation not permitted
The series
53da8b5a99 virtio-net: Add support for USO features
9da1684954 virtio-net: Add USO flags to vhost support.
f03e0cf63b tap: Add check for USO features
2ab0ec3121 tap: Add USO support to tap device.
only landed in QEMU 8.2, so the compatibility flags should be part of
machine version 8.1.
Moving the flags unfortunately breaks forward migration with machine
version 8.1 from a binary without this patch to a binary with this
patch.
Fixes: 53da8b5a99 ("virtio-net: Add support for USO features")
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Commit dfcf74fa ("virtio-gpu: fix scanout migration post-load") broke
forward/backward version migration. Versioning of nested VMSD structures
is not straightforward, as the wire format doesn't have nested
structures versions. Introduce x-scanout-vmstate-version and a field
test to save/load appropriately according to the machine version.
Fixes: dfcf74fa ("virtio-gpu: fix scanout migration post-load")
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fiona Ebner <f.ebner@proxmox.com>
Tested-by: Fiona Ebner <f.ebner@proxmox.com>
[fixed long lines]
Signed-off-by: Fabiano Rosas <farosas@suse.de>
In case of migration, during restore operation, qemu checks config space of the
pci device with the config space in the migration stream captured during save
operation. In case of config space data mismatch, restore operation is failed.
config space check is done in function get_pci_config_device(). By default VSC
(vendor-specific-capability) in config space is checked.
Due to qemu's config space check for VSC, live migration is broken across NVIDIA
vGPU devices in situation where source and destination host driver is different.
In this situation, Vendor Specific Information in VSC varies on the destination
to ensure vGPU feature capabilities exposed to the guest driver are compatible
with destination host.
If a vfio-pci device is migration capable and vfio-pci vendor driver is OK with
volatile Vendor Specific Info in VSC then qemu should exempt config space check
for Vendor Specific Info. It is vendor driver's responsibility to ensure that
VSC is consistent across migration. Here consistency could mean that VSC format
should be same on source and destination, however actual Vendor Specific Info
may not be byte-to-byte identical.
This patch skips the check for Vendor Specific Information in VSC for VFIO-PCI
device by clearing pdev->cmask[] offsets. Config space check is still enforced
for 3 byte VSC header. If cmask[] is not set for an offset, then qemu skips
config space check for that offset.
VSC check is skipped for machine types >= 9.1. The check would be enforced on
older machine types (<= 9.0).
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Vinayak Kale <vkale@nvidia.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
The 'compress' migration capability enables the old compression code
which has shown issues over the years and is thought to be less stable
and tested than the more recent multifd-based compression. The old
compression code has been deprecated in 8.2 and now is time to remove
it.
Deprecation commit 864128df46 ("migration: Deprecate old compression
method").
Acked-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
In previous versions of the Arm architecture, the frequency of the
generic timers as reported in CNTFRQ_EL0 could be any IMPDEF value,
and for QEMU we picked 62.5MHz, giving a timer tick period of 16ns.
In Armv8.6, the architecture standardized this frequency to 1GHz.
Because there is no ID register feature field that indicates whether
a CPU is v8.6 or that it ought to have this counter frequency, we
implement this by changing our default CNTFRQ value for all CPUs,
with exceptions for backwards compatibility:
* CPU types which we already implement will retain the old
default value. None of these are v8.6 CPUs, so this is
architecturally OK.
* CPUs used in versioned machine types with a version of 9.0
or earlier will retain the old default value.
The upshot is that the only CPU type that changes is 'max'; but any
new type we add in future (whether v8.6 or not) will also get the new
1GHz default.
It remains the case that the machine model can override the default
value via the 'cntfrq' QOM property (regardless of the CPU type).
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240426122913.3427983-5-peter.maydell@linaro.org
Module is a level above the core, thereby supporting numa
configuration on the module level can bring user more numa flexibility.
This is the natural further support for module level.
Add module level support in numa configuration.
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Message-ID: <20240424154929.1487382-5-zhao1.liu@intel.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Add "modules" parameter parsing support in -smp.
Suggested-by: Xiaoyao Li <xiaoyao.li@intel.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240424154929.1487382-3-zhao1.liu@intel.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
In x86, module is the topology level above core, which contains a set
of cores that share certain resources (in current products, the resource
usually includes L2 cache, as well as module scoped features and MSRs).
Though smp.clusters could also share the L2 cache resource [1], there
are following reasons that drive us to introduce the new smp.modules:
* As the CPU topology abstraction in device tree [2], cluster supports
nesting (though currently QEMU hasn't support that). In contrast,
(x86) module does not support nesting.
* Due to nesting, there is great flexibility in sharing resources
on cluster, rather than narrowing cluster down to sharing L2 (and
L3 tags) as the lowest topology level that contains cores.
* Flexible nesting of cluster allows it to correspond to any level
between the x86 package and core.
* In Linux kernel, x86's cluster only represents the L2 cache domain
but QEMU's smp.clusters is the CPU topology level. Linux kernel will
also expose module level topology information in sysfs for x86. To
avoid cluster ambiguity and keep a consistent CPU topology naming
style with the Linux kernel, we introduce module level for x86.
The module is, in existing hardware practice, the lowest layer that
contains the core, while the cluster is able to have a higher
topological scope than the module due to its nesting.
Therefore, place the module between the cluster and the core:
drawer/book/socket/die/cluster/module/core/thread
With the above topological hierarchy order, introduce module level
support in MachineState and MachineClass.
[1]: https://lore.kernel.org/qemu-devel/c3d68005-54e0-b8fe-8dc1-5989fe3c7e69@huawei.com/
[2]: https://www.kernel.org/doc/Documentation/devicetree/bindings/cpu/cpu-topology.txt
Suggested-by: Xiaoyao Li <xiaoyao.li@intel.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Message-ID: <20240424154929.1487382-2-zhao1.liu@intel.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Add a new member "guest_memfd" to memory backends. When it's set
to true, it enables RAM_GUEST_MEMFD in ram_flags, thus private kvm
guest_memfd will be allocated during RAMBlock allocation.
Memory backend's @guest_memfd is wired with @require_guest_memfd
field of MachineState. It avoid looking up the machine in phymem.c.
MachineState::require_guest_memfd is supposed to be set by any VMs
that requires KVM guest memfd as private memory, e.g., TDX VM.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-ID: <20240320083945.991426-8-michael.roth@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
more memslots support in libvhost-user
support PCIe Gen5/Gen6 link speeds in pcie
more traces in vdpa
network simulation devices support in vdpa
SMBIOS type 9 descriptor implementation
Bump max_cpus to 4096 vcpus in q35
aw-bits and granule options in VIRTIO-IOMMU
Support report NUMA nodes for device memory using GI in acpi
Beginning of shutdown event support in pvpanic
fixes, cleanups all over the place.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmXw0TMPHG1zdEByZWRo
YXQuY29tAAoJECgfDbjSjVRp8x4H+gLMoGwaGAX7gDGPgn2Ix4j/3kO77ZJ9X9k/
1KqZu/9eMS1j2Ei+vZqf05w7qRjxxhwDq3ilEXF/+UFqgAehLqpRRB8j5inqvzYt
+jv0DbL11PBp/oFjWcytm5CbiVsvq8KlqCF29VNzc162XdtcduUOWagL96y8lJfZ
uPrOoyeR7SMH9lp3LLLHWgu+9W4nOS03RroZ6Umj40y5B7yR0Rrppz8lMw5AoQtr
0gMRnFhYXeiW6CXdz+Tzcr7XfvkkYDi/j7ibiNSURLBfOpZa6Y8+kJGKxz5H1K1G
6ZY4PBcOpQzl+NMrktPHogczgJgOK10t+1i/R3bGZYw2Qn/93Eg=
=C0UU
-----END PGP SIGNATURE-----
Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging
virtio,pc,pci: features, cleanups, fixes
more memslots support in libvhost-user
support PCIe Gen5/Gen6 link speeds in pcie
more traces in vdpa
network simulation devices support in vdpa
SMBIOS type 9 descriptor implementation
Bump max_cpus to 4096 vcpus in q35
aw-bits and granule options in VIRTIO-IOMMU
Support report NUMA nodes for device memory using GI in acpi
Beginning of shutdown event support in pvpanic
fixes, cleanups all over the place.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmXw0TMPHG1zdEByZWRo
# YXQuY29tAAoJECgfDbjSjVRp8x4H+gLMoGwaGAX7gDGPgn2Ix4j/3kO77ZJ9X9k/
# 1KqZu/9eMS1j2Ei+vZqf05w7qRjxxhwDq3ilEXF/+UFqgAehLqpRRB8j5inqvzYt
# +jv0DbL11PBp/oFjWcytm5CbiVsvq8KlqCF29VNzc162XdtcduUOWagL96y8lJfZ
# uPrOoyeR7SMH9lp3LLLHWgu+9W4nOS03RroZ6Umj40y5B7yR0Rrppz8lMw5AoQtr
# 0gMRnFhYXeiW6CXdz+Tzcr7XfvkkYDi/j7ibiNSURLBfOpZa6Y8+kJGKxz5H1K1G
# 6ZY4PBcOpQzl+NMrktPHogczgJgOK10t+1i/R3bGZYw2Qn/93Eg=
# =C0UU
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 12 Mar 2024 22:03:31 GMT
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (68 commits)
docs/specs/pvpanic: document shutdown event
hw/cxl: Fix missing reserved data in CXL Device DVSEC
hmat acpi: Fix out of bounds access due to missing use of indirection
hmat acpi: Do not add Memory Proximity Domain Attributes Structure targetting non existent memory.
qemu-options.hx: Document the virtio-iommu-pci aw-bits option
hw/arm/virt: Set virtio-iommu aw-bits default value to 48
hw/i386/q35: Set virtio-iommu aw-bits default value to 39
virtio-iommu: Add an option to define the input range width
virtio-iommu: Trace domain range limits as unsigned int
qemu-options.hx: Document the virtio-iommu-pci granule option
virtio-iommu: Change the default granule to the host page size
virtio-iommu: Add a granule property
hw/i386/acpi-build: Add support for SRAT Generic Initiator structures
hw/acpi: Implement the SRAT GI affinity structure
qom: new object to associate device to NUMA node
hw/i386/pc: Inline pc_cmos_init() into pc_cmos_init_late() and remove it
hw/i386/pc: Set "normal" boot device order in pc_basic_device_init()
hw/i386/pc: Avoid one use of the current_machine global
hw/i386/pc: Remove "rtc_state" link again
Revert "hw/i386/pc: Confine system flash handling to pc_sysfw"
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
# Conflicts:
# hw/core/machine.c
Currently the default input range can extend to 64 bits. On x86,
when the virtio-iommu protects vfio devices, the physical iommu
may support only 39 bits. Let's set the default to 39, as done
for the intel-iommu.
We use hw_compat_8_2 to handle the compatibility for machines
before 9.0 which used to have a virtio-iommu default input range
of 64 bits.
Of course if aw-bits is set from the command line, the default
is overriden.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20240307134445.92296-8-eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
We used to set the default granule to 4KB but with VFIO assignment
it makes more sense to use the actual host page size.
Indeed when hotplugging a VFIO device protected by a virtio-iommu
on a 64kB/64kB host/guest config, we current get a qemu crash:
"vfio: DMA mapping failed, unable to continue"
This is due to the hot-attached VFIO device calling
memory_region_iommu_set_page_size_mask() with 64kB granule
whereas the virtio-iommu granule was already frozen to 4KB on
machine init done.
Set the granule property to "host" and introduce a new compat.
The page size mask used before 9.0 was qemu_target_page_mask().
Since the virtio-iommu currently only supports x86_64 and aarch64,
this matched a 4KB granule.
Note that the new default will prevent 4kB guest on 64kB host
because the granule will be set to 64kB which would be larger
than the guest page size. In that situation, the virtio-iommu
driver fails on viommu_domain_finalise() with
"granule 0x10000 larger than system page size 0x1000".
In that case the workaround is to request 4K granule.
The current limitation of global granule in the virtio-iommu
should be removed and turned into per domain granule. But
until we get this upgraded, this new default is probably
better because I don't think anyone is currently interested in
running a 4KB page size guest with virtio-iommu on a 64KB host.
However supporting 64kB guest on 64kB host with virtio-iommu and
VFIO looks a more important feature.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Message-Id: <20240307134445.92296-4-eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>