* target/i386: small code generation improvements
* target/i386: various cleanups and fixes
* cpu: remove env->nr_cores
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmeBoIgUHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroOD2gf+NK7U1EhNIrsbBsbtu2i7+tnbRKIB
MTu+Mxb2wz4C7//pxq+vva4bgT3iOuL9RF19PRe/63CMD65xMiwyyNrEWX2HbRIJ
5dytLLLdef3yMhHh2x1uZfm54g12Ppvn9kulMCbPawrlqWgg1sZbkUBrRtFzS45c
NeYjGWWSpBDe7LtsrgSRYLPnz6wWEiy3tDpu2VoDtjrE86UVDXwyzpbtBk9Y8jPi
CKdvLyQeO9xDE5OoXMjJMlJeQq3D9iwYEprXUqy+RUZtpW7YmqMCf2JQ4dAjVCad
07v/kITF4brGCVnzDcDA6W7LqHpBu1w+Hn23yLw3HEDDBt11o9JjQCl9qA==
=xIQ4
-----END PGP SIGNATURE-----
Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging
* rust: miscellaneous changes
* target/i386: small code generation improvements
* target/i386: various cleanups and fixes
* cpu: remove env->nr_cores
# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmeBoIgUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroOD2gf+NK7U1EhNIrsbBsbtu2i7+tnbRKIB
# MTu+Mxb2wz4C7//pxq+vva4bgT3iOuL9RF19PRe/63CMD65xMiwyyNrEWX2HbRIJ
# 5dytLLLdef3yMhHh2x1uZfm54g12Ppvn9kulMCbPawrlqWgg1sZbkUBrRtFzS45c
# NeYjGWWSpBDe7LtsrgSRYLPnz6wWEiy3tDpu2VoDtjrE86UVDXwyzpbtBk9Y8jPi
# CKdvLyQeO9xDE5OoXMjJMlJeQq3D9iwYEprXUqy+RUZtpW7YmqMCf2JQ4dAjVCad
# 07v/kITF4brGCVnzDcDA6W7LqHpBu1w+Hn23yLw3HEDDBt11o9JjQCl9qA==
# =xIQ4
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 10 Jan 2025 17:34:48 EST
# gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg: issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (38 commits)
i386/cpu: Set and track CPUID_EXT3_CMP_LEG in env->features[FEAT_8000_0001_ECX]
i386/cpu: Set up CPUID_HT in x86_cpu_expand_features() instead of cpu_x86_cpuid()
cpu: Remove nr_cores from struct CPUState
i386/cpu: Hoist check of CPUID_EXT3_TOPOEXT against threads_per_core
i386/cpu: Track a X86CPUTopoInfo directly in CPUX86State
i386/topology: Introduce helpers for various topology info of different level
i386/topology: Update the comment of x86_apicid_from_topo_ids()
i386/cpu: Drop cores_per_pkg in cpu_x86_cpuid()
i386/cpu: Drop the variable smp_cores and smp_threads in x86_cpu_pre_plug()
i386/cpu: Extract a common fucntion to setup value of MSR_CORE_THREAD_COUNT
target/i386/kvm: Replace ARRAY_SIZE(msr_handlers) with KVM_MSR_FILTER_MAX_RANGES
target/i386/kvm: Clean up error handling in kvm_arch_init()
target/i386/kvm: Return -1 when kvm_msr_energy_thread_init() fails
target/i386/kvm: Clean up return values of MSR filter related functions
target/i386/confidential-guest: Fix comment of x86_confidential_guest_kvm_type()
target/i386/kvm: Drop workaround for KVM_X86_DISABLE_EXITS_HTL typo
target/i386/kvm: Only save/load kvmclock MSRs when kvmclock enabled
target/i386/kvm: Remove local MSR_KVM_WALL_CLOCK and MSR_KVM_SYSTEM_TIME definitions
target/i386/kvm: Add feature bit definitions for KVM CPUID
i386/cpu: Mark avx10_version filtered when prefix is NULL
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEEMUsIrNDeSBEzpfKGm+mA/QrAFUQFAmeIxhcSHGR3bXdAYW1h
em9uLmNvLnVrAAoJEJvpgP0KwBVEbKoP/iqQ+PhbwT9+xz6lxW+g1Dx+YGrT/ugp
d3xHn9AEkR0EHC42J6RB/llyWbKVD/IIhYwUk5GDm+4InGrtuQDhG6UqWxqvIRht
0JuZvVm7x5akmKv73igxNqZHVg0ZEAS+EllBUaBYWj0pvpMbBK93Sdz9PXKxA7Nm
dPeFrOpL2TAmnDCH1UuBbXypHEjAghmv7WFphMtk6qLX+wYVaK3F2J/ed2TNyT0V
LliOdQH0Pxt445SSVJIZRe9bW3FH7qyvZV1gCnxSnqPUlN7vBhpjzgl4hWEzVYcp
7X21ZAD9kPc81DJjYucbLjAbrqSmlDrJqL05qtRigfPcnqz2NoKrYxhj8B0F8mgt
1IbymPyeab5gk5Hi1QgMmG5eobDDaglDSxpq6gRfJBiJW+1adif00z/HVvt5onS0
uQ6i6w5NzQciBX77muAb2ZDEMysjk+3wSJMMpkfl90D0kjlMqeWWs4FH9ThasjC+
EhQioUD0euedgnzOSfQjNNtAW4gzv9rcShkcV84bjxP/0Es+Pgx9f6wtCUTzdeqy
Cid8/72lHIgrkZGfpv8BBZkA1XP09vgtUGKyAWm4yHOcB57l8cNiL1nKtqoCLwkQ
8JWFWzFeEY19KoiRGY5saH6ExeOx8fmc/lYwqImZqFqvuFX4Vf2RJdTIRIYr7g05
2QffxFmskg+A
=Wz0V
-----END PGP SIGNATURE-----
Merge tag 'pull-xenfv-20250116' of git://git.infradead.org/users/dwmw2/qemu into staging
Xen regression fixes and cleanups
# -----BEGIN PGP SIGNATURE-----
#
# iQJGBAABCAAwFiEEMUsIrNDeSBEzpfKGm+mA/QrAFUQFAmeIxhcSHGR3bXdAYW1h
# em9uLmNvLnVrAAoJEJvpgP0KwBVEbKoP/iqQ+PhbwT9+xz6lxW+g1Dx+YGrT/ugp
# d3xHn9AEkR0EHC42J6RB/llyWbKVD/IIhYwUk5GDm+4InGrtuQDhG6UqWxqvIRht
# 0JuZvVm7x5akmKv73igxNqZHVg0ZEAS+EllBUaBYWj0pvpMbBK93Sdz9PXKxA7Nm
# dPeFrOpL2TAmnDCH1UuBbXypHEjAghmv7WFphMtk6qLX+wYVaK3F2J/ed2TNyT0V
# LliOdQH0Pxt445SSVJIZRe9bW3FH7qyvZV1gCnxSnqPUlN7vBhpjzgl4hWEzVYcp
# 7X21ZAD9kPc81DJjYucbLjAbrqSmlDrJqL05qtRigfPcnqz2NoKrYxhj8B0F8mgt
# 1IbymPyeab5gk5Hi1QgMmG5eobDDaglDSxpq6gRfJBiJW+1adif00z/HVvt5onS0
# uQ6i6w5NzQciBX77muAb2ZDEMysjk+3wSJMMpkfl90D0kjlMqeWWs4FH9ThasjC+
# EhQioUD0euedgnzOSfQjNNtAW4gzv9rcShkcV84bjxP/0Es+Pgx9f6wtCUTzdeqy
# Cid8/72lHIgrkZGfpv8BBZkA1XP09vgtUGKyAWm4yHOcB57l8cNiL1nKtqoCLwkQ
# 8JWFWzFeEY19KoiRGY5saH6ExeOx8fmc/lYwqImZqFqvuFX4Vf2RJdTIRIYr7g05
# 2QffxFmskg+A
# =Wz0V
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 16 Jan 2025 03:40:55 EST
# gpg: using RSA key 314B08ACD0DE481133A5F2869BE980FD0AC01544
# gpg: issuer "dwmw@amazon.co.uk"
# gpg: Good signature from "David Woodhouse <dwmw@amazon.co.uk>" [unknown]
# gpg: aka "David Woodhouse <dwmw@amazon.com>" [unknown]
# gpg: WARNING: The key's User ID is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 314B 08AC D0DE 4811 33A5 F286 9BE9 80FD 0AC0 1544
* tag 'pull-xenfv-20250116' of git://git.infradead.org/users/dwmw2/qemu:
system/runstate: Fix regression, clarify BQL status of exit notifiers
hw/xen: Fix errp handling in xen_console
hw/xen: Use xs_node_read() from xenstore_read_str() instead of open-coding it
hw/xen: Use xs_node_read() from xen_netdev_get_name()
hw/xen: Use xs_node_read() from xen_console_get_name()
hw/xen: Use xs_node_read() from xs_node_vscanf()
xen: do not use '%ms' scanf specifier
hw/xen: Add xs_node_read() helper function
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQQNhkKjomWfgLCz0aQfewwSUazn0QUCZ4hk/QAKCRAfewwSUazn
0WagAQDgJaWBLQxZkyQR2FQm3WHg3Uf/qolab9nDGo3b2BpixgD/RdvZf+mZpAwf
2ipAQ7g5GqGTKtTAdqO/aBAqTCZCqQU=
=7KKt
-----END PGP SIGNATURE-----
Merge tag 'pull-loongarch-20250116' of https://gitlab.com/bibo-mao/qemu into staging
loongarch queue
# -----BEGIN PGP SIGNATURE-----
#
# iHUEABYKAB0WIQQNhkKjomWfgLCz0aQfewwSUazn0QUCZ4hk/QAKCRAfewwSUazn
# 0WagAQDgJaWBLQxZkyQR2FQm3WHg3Uf/qolab9nDGo3b2BpixgD/RdvZf+mZpAwf
# 2ipAQ7g5GqGTKtTAdqO/aBAqTCZCqQU=
# =7KKt
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 15 Jan 2025 20:46:37 EST
# gpg: using EDDSA key 0D8642A3A2659F80B0B3D1A41F7B0C1251ACE7D1
# gpg: Good signature from "bibo mao <maobibo@loongson.cn>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 7044 3A00 19C0 E97A 31C7 13C4 8E86 8FB7 A176 9D4C
# Subkey fingerprint: 0D86 42A3 A265 9F80 B0B3 D1A4 1F7B 0C12 51AC E7D1
* tag 'pull-loongarch-20250116' of https://gitlab.com/bibo-mao/qemu:
hw/intc/loongarch_ipi: Use alternative implemation for cpu_by_arch_id
hw/intc/loongson_ipi: Add more input parameter for cpu_by_arch_id
hw/intc/loongarch_ipi: Remove property num-cpu
hw/intc/loongarch_ipi: Get cpu number from possible_cpu_arch_ids
hw/intc/loongson_ipi: Remove property num_cpu from loongson_ipi_common
hw/intc/loongson_ipi: Remove num_cpu from loongson_ipi_common
hw/intc/loongarch_ipi: Implement realize interface
target/loongarch: Add page table walker support for debugger usage
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The big thing here are:
stage-1 translation in vtd
internal migration in vhost-user
ghes driver preparation for error injection
new resource uuid feature in virtio gpu
new vmclock device
And as usual, fixes and cleanups.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmeIOiIPHG1zdEByZWRo
YXQuY29tAAoJECgfDbjSjVRpORgIAL0clwZxQL7PIPJ91FwXc1bo6Do/HYquAzvH
eA+ryCG5S5ewh/e2R8SdIUG7nYesEMWJGVL1gb3BFu7wgGh1aLaaTxQ1LIo5HpRF
P0Ak3QO7TKIsSEcZIz9h3eMEpg6X9d8i2h7llp7H3qqXBbduO+cGfeNH/fZD5IEl
7DFvXuJUgUtZb38I+qtcO+9EQFKGHjgdQAN5P/I4vawWJdxN9sBfT4YVEgpVhiq/
ALxdSeaEiXA4EXexdHVZhXiQzEBsCQ78RZIIDiRE8I34cVY7rolTodKRfr4bip3P
6Llu11yvzNi1gppOzkny3QFsRza3hV0RisWYjAMTwLhNCdi/mHQ=
=GjDq
-----END PGP SIGNATURE-----
Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging
virtio,pc,pci: features, fixes, cleanups
The big thing here are:
stage-1 translation in vtd
internal migration in vhost-user
ghes driver preparation for error injection
new resource uuid feature in virtio gpu
new vmclock device
And as usual, fixes and cleanups.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmeIOiIPHG1zdEByZWRo
# YXQuY29tAAoJECgfDbjSjVRpORgIAL0clwZxQL7PIPJ91FwXc1bo6Do/HYquAzvH
# eA+ryCG5S5ewh/e2R8SdIUG7nYesEMWJGVL1gb3BFu7wgGh1aLaaTxQ1LIo5HpRF
# P0Ak3QO7TKIsSEcZIz9h3eMEpg6X9d8i2h7llp7H3qqXBbduO+cGfeNH/fZD5IEl
# 7DFvXuJUgUtZb38I+qtcO+9EQFKGHjgdQAN5P/I4vawWJdxN9sBfT4YVEgpVhiq/
# ALxdSeaEiXA4EXexdHVZhXiQzEBsCQ78RZIIDiRE8I34cVY7rolTodKRfr4bip3P
# 6Llu11yvzNi1gppOzkny3QFsRza3hV0RisWYjAMTwLhNCdi/mHQ=
# =GjDq
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 15 Jan 2025 17:43:46 EST
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (49 commits)
hw/acpi: Add vmclock device
virtio-net: vhost-user: Implement internal migration
vhost: Add stubs for the migration state transfer interface
hw/cxl: Fix msix_notify: Assertion `vector < dev->msix_entries_nr`
tests: acpi: update expected blobs
pci: acpi: Windows 'PCI Label Id' bug workaround
tests: acpi: whitelist expected blobs
docs: acpi_hest_ghes: fix documentation for CPER size
acpi/ghes: Change ghes fill logic to work with only one source
acpi/ghes: move offset calculus to a separate function
acpi/ghes: better name the offset of the hardware error firmware
acpi/ghes: rename etc/hardware_error file macros
acpi/ghes: don't crash QEMU if ghes GED is not found
acpi/ghes: better name GHES memory error function
acpi/ghes: make the GHES record generation more generic
acpi/ghes: don't check if physical_address is not zero
acpi/ghes: Change the type for source_id
acpi/ghes: Remove a duplicated out of bounds check
acpi/ghes: Fix acpi_ghes_record_errors() argument
acpi/ghes: better handle source_id and notification
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The vmclock device addresses the problem of live migration with
precision clocks. The tolerances of a hardware counter (e.g. TSC) are
typically around ±50PPM. A guest will use NTP/PTP/PPS to discipline that
counter against an external source of 'real' time, and track the precise
frequency of the counter as it changes with environmental conditions.
When a guest is live migrated, anything it knows about the frequency of
the underlying counter becomes invalid. It may move from a host where
the counter running at -50PPM of its nominal frequency, to a host where
it runs at +50PPM. There will also be a step change in the value of the
counter, as the correctness of its absolute value at migration is
limited by the accuracy of the source and destination host's time
synchronization.
The device exposes a shared memory region to guests, which can be mapped
all the way to userspace. In the first phase, this merely advertises a
'disruption_marker', which indicates that the guest should throw away any
NTP synchronization it thinks it has, and start again.
Because the region can be exposed all the way to userspace, applications
can still use time from a fast vDSO 'system call', and check the
disruption marker to be sure that their timestamp is indeed truthful.
The structure also allows for the precise time, as known by the host, to
be exposed directly to guests so that they don't have to wait for NTP to
resync from scratch.
The values and fields are based on the nascent virtio-rtc specification,
and the intent is that a version (hopefully precisely this version) of
this structure will be included as an optional part of that spec. In the
meantime, a simple ACPI device along the lines of VMGENID is perfectly
sufficient and is compatible with what's being shipped in certain
commercial hypervisors.
Linux guest support was merged into the 6.13-rc1 kernel:
https://git.kernel.org/torvalds/c/205032724226
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
Message-Id: <07fd5e2f529098ad4d7cab1423fe9f4a03a9cc14.camel@infradead.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add support of VHOST_USER_PROTOCOL_F_DEVICE_STATE in virtio-net
with vhost-user backend.
Cc: Hanna Czenczek <hreitz@redhat.com>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20250115135044.799698-3-lvivier@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Migration state transfer interface is only used by vhost-user-fs,
so the interface needs to be defined only when vhost is built.
But I need to use this interface with virtio-net and vhost is not always
enabled, and to avoid undefined reference error during build, define stub
functions for vhost_supports_device_state(), vhost_save_backend_state() and
vhost_load_backend_state().
Cc: Hanna Czenczek <hreitz@redhat.com>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20250115135044.799698-2-lvivier@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This assertion always happens when we sanitize the CXL memory device.
$ echo 1 > /sys/bus/cxl/devices/mem0/security/sanitize
It is incorrect to register an MSIX number beyond the device's capability.
Increase the device's MSIX number to cover the mailbox msix number(9).
Fixes: 43efb0bfad ("hw/cxl/mbox: Wire up interrupts for background completion")
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
Message-Id: <20250115075834.167504-1-lizhijian@fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Current versions of Windows call _DSM(func=7) regardless
of whether it is supported or not. It leads to NICs having bogus
'PCI Label Id = 0', where none should be set at all.
Also presence of 'PCI Label Id' triggers another Windows bug
on localized versions that leads to hangs. The later bug is fixed
in latest updates for 'Windows Server' but not in consumer
versions of Windows (and there is no plans to fix it
as far as I'm aware).
Given it's easy, implement Microsoft suggested workaround
(return invalid Package) so that affected Windows versions
could boot on QEMU.
This would effectvely remove bogus 'PCI Label Id's on NICs,
but MS teem confirmed that flipping 'PCI Label Id' should not
change 'Network Connection' ennumeration, so it should be safe
for QEMU to change _DSM without any compat code.
Smoke tested with WinXP and WS2022
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/774
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20250115125342.3883374-3-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20250115125342.3883374-2-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
While the spec defines a CPER size of 4KiB for each record,
currently it is set to 1KiB. Fix the documentation and add
a pointer to the macro name there, as this may help to keep
it updated.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <f7e94433bec19a9d6b23ecccc24b5fe3a6f7f52b.1736945236.git.mchehab+huawei@kernel.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Extending to multiple sources require a BIOS pointer to the
beginning of the HEST table, which in turn requires a backward-compatible
code.
So, the current code supports only one source. Ensure that and simplify
the code.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <66bddd42a64c8515ad98b9975d953b4a70ffcc6d.1736945236.git.mchehab+huawei@kernel.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Currently, CPER address location is calculated as an offset of
the hardware_errors table. It is also badly named, as the
offset actually used is the address where the CPER data starts,
and not the beginning of the error source.
Move the logic which calculates such offset to a separate
function, in preparation for a patch that will be changing the
logic to calculate it from the HEST table.
While here, properly name the variable which stores the cper
address.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <60fdd1bf379ba1db3099710868802aa49a27febb.1736945236.git.mchehab+huawei@kernel.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The hardware error firmware is where HEST error structures are
stored. Those can be GHESv2, but they can also be other types.
Better name the location of the hardware error.
No functional changes.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <ddbb94294bafee998f12fede3ba0b05dae5ee45f.1736945236.git.mchehab+huawei@kernel.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Now that we have also have a file to store HEST data location,
which is part of GHES, better name the file where CPER records
are stored.
No functional changes.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <e79a013bcd9f634b46ff6b34756d1b1403713af3.1736945236.git.mchehab+huawei@kernel.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Make error handling within ghes_record_cper_errors() consistent,
i.e. instead abort just print a error in case ghes GED is not found.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <c7e1665ba46df321f0ce161d60dfd681ab827535.1736945236.git.mchehab+huawei@kernel.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The current function used to generate GHES data is specific for
memory errors. Give a better name for it, as we now have a generic
function as well.
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Message-Id: <35b59121129d5e99cb5062cc3d775594bbb0905b.1736945236.git.mchehab+huawei@kernel.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Split the code into separate functions to allow using the
common CPER filling code by different error sources.
The generic code was moved to ghes_record_cper_errors(),
and ghes_gen_err_data_uncorrectable_recoverable() now contains
only a logic to fill the Generic Error Data part of the record,
as described at:
ACPI 6.2: 18.3.2.7.1 Generic Error Data
The remaining code to generate a memory error now belongs to
acpi_ghes_record_errors() function.
A further patch will give it a better name.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <68d9f787d8c4fc8d1dbc227d6902fe801e42dea9.1736945236.git.mchehab+huawei@kernel.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The 'physical_address' value is a faulty page. As such, 0 is
as valid as any other value.
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <da32536bf4962e5c03471e2a4e6e0ef92be4a1be.1736945236.git.mchehab+huawei@kernel.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
As described at: ACPI 6.5 spec at:
18.3.2. ACPI Error Source
In particular at GHES/GHESv2 table:
Table 18.10 Generic Hardware Error Source Structure
HEST source ID is actually a 16-bit value.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <0e83ba548c1aedd1299fe387b94db78986590a34.1736945236.git.mchehab+huawei@kernel.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
acpi_ghes_record_errors() has an assert() at the beginning
to ensure that source_id will be lower than
ACPI_GHES_ERROR_SOURCE_COUNT. Remove a duplicated check.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <df33b004d85b7b9aa388fb2ac530dcdea94b7edc.1736945236.git.mchehab+huawei@kernel.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Align the header file with the actual implementation of
this function, as the first argument is source ID and not
notification type.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <d55f2a6ede5a168e42a20a228b2c066cb4c60939.1736945236.git.mchehab+huawei@kernel.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
GHES has two fields that are stored on HEST error source
blocks associated with notifications:
- notification type, which is a number defined at the ACPI spec
containing several arch-specific synchronous and assynchronous
types;
- source id, which is a HW/FW defined number, used to distinguish
between different implemented sources.
There could be several sources with the same notification type,
which is dependent of the way each architecture maps notifications.
Right now, build_ghes_v2() hardcodes a 1:1 mapping between such
fields. Move it to two independent parameters, allowing the
caller function to fill both.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <133ff72ea1041fed7dbcf97b7a2b0f4dfacde31a.1736945236.git.mchehab+huawei@kernel.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The GHES driver requires not only a HEST table, but also a
separate firmware file to store Error Structure records.
It can't do one without the other.
Simplify the caller logic for it to require one function.
No functional changes.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <9584bb8953385e165681d5d185c503f8df8ef42f.1736945236.git.mchehab+huawei@kernel.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reduce the ident of the function and prepares it for
the next changes.
No functional changes.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <19af4188535217213486d169e0501e592bc78a95.1736945236.git.mchehab+huawei@kernel.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This is just duplicating ACPI_GHES_ERROR_SOURCE_COUNT, which
has a better name. So, drop the duplication.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <9012bf4c9630adf15a22af3c88fda8270916887b.1736945236.git.mchehab+huawei@kernel.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The end vector calculation has a bug that results in polling fewer
than required vectors when reading at a non-zero offset in PBA memory.
Fixes: bbef882cc1 ("msi: add API to get notified about pending bit poll")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Message-Id: <20241212120402.1475053-1-npiggin@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add the framework to test the intel-iommu device.
Currently only tested cap/ecap bits correctness when x-flts=on in scalable
mode. Also tested cap/ecap bits consistency before and after system reset.
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Acked-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Clément Mathieu--Drif<clement.mathieu--drif@eviden.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20241212083757.605022-21-zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This gives user flexibility to turn off FS1GP for debug purpose.
It is also useful for future nesting feature. When host IOMMU doesn't
support FS1GP but vIOMMU does, nested page table on host side works
after turning FS1GP off in vIOMMU.
This property has no effect when vIOMMU is in legacy mode or x-flts=off
in scalable modme.
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Clément Mathieu--Drif<clement.mathieu--drif@eviden.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20241212083757.605022-20-zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Intel VT-d 3.0 introduces scalable mode, and it has a bunch of capabilities
related to scalable mode translation, thus there are multiple combinations.
This vIOMMU implementation wants to simplify it with a new property "x-flts".
When turned on in scalable mode, stage-1 translation is supported. When turned
on in legacy mode, throw out error.
With stage-1 translation support exposed to user, also accurate the pasid entry
check in vtd_pe_type_check().
Suggested-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Clément Mathieu--Drif<clement.mathieu--drif@eviden.com>
Message-Id: <20241212083757.605022-19-zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
According to VTD spec, stage-1 page table could support 4-level and
5-level paging.
However, 5-level paging translation emulation is unsupported yet.
That means the only supported value for aw_bits is 48. So default
aw_bits to 48 when stage-1 translation is turned on.
For legacy and scalable modes, 48 is the default choice for modern
OS when both 48 and 39 are supported. So it makes sense to set
default to 48 for these two modes too starting from QEMU 9.2.
Use pc_compat_9_1 to handle the compatibility for machines before
9.2.
Suggested-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Clément Mathieu--Drif<clement.mathieu--drif@eviden.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20241212083757.605022-17-zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20241212083757.605022-16-zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This is used by some emulated devices which caches address
translation result. When piotlb invalidation issued in guest,
those caches should be refreshed.
There is already a similar implementation in iotlb invalidation.
So update vtd_iotlb_page_invalidate_notify() to make it work
also for piotlb invalidation.
For device that does not implement ATS capability or disable
it but still caches the translation result, it is better to
implement ATS cap or enable it if there is need to cache the
translation result.
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Clément Mathieu--Drif<clement.mathieu--drif@eviden.com>
Message-Id: <20241212083757.605022-15-zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20241212083757.605022-14-zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This will be used to implement the device IOTLB invalidation
Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
Message-Id: <20241212083757.605022-13-zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
PASID-based iotlb (piotlb) is used during walking Intel
VT-d stage-1 page table.
This emulates the stage-1 page table iotlb invalidation requested
by a PASID-based IOTLB Invalidate Descriptor (P_IOTLB).
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Clément Mathieu--Drif<clement.mathieu--drif@eviden.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
Message-Id: <20241212083757.605022-12-zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
According to spec, Page-Selective-within-Domain Invalidation (11b):
1. IOTLB entries caching second-stage mappings (PGTT=010b) or pass-through
(PGTT=100b) mappings associated with the specified domain-id and the
input-address range are invalidated.
2. IOTLB entries caching first-stage (PGTT=001b) or nested (PGTT=011b)
mapping associated with specified domain-id are invalidated.
So per spec definition the Page-Selective-within-Domain Invalidation
needs to flush first stage and nested cached IOTLB entries as well.
We don't support nested yet and pass-through mapping is never cached,
so what in iotlb cache are only first-stage and second-stage mappings.
Add a tag pgtt in VTDIOTLBEntry to mark PGTT type of the mapping and
invalidate entries based on PGTT type.
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Clément Mathieu--Drif<clement.mathieu--drif@eviden.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
Message-Id: <20241212083757.605022-11-zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20241212083757.605022-10-zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Per VT-d spec 4.1 section 3.15, "Untranslated requests and translation
requests that result in an address in the interrupt range will be
blocked with condition code LGN.4 or SGN.8."
This applies to both stage-1 and stage-2 IOMMU page table, move the
check from vtd_iova_to_slpte() to vtd_do_iommu_translate() so stage-1
page table could also be checked.
By this chance, update the comment with correct section number.
Suggested-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Clément Mathieu--Drif<clement.mathieu--drif@eviden.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20241212083757.605022-9-zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Stage-1 translation must fail if the address to translate is
not canonical.
Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
Message-Id: <20241212083757.605022-8-zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This adds stage-1 page table walking to support stage-1 only
translation in scalable mode.
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Co-developed-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20241212083757.605022-7-zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Because we will support both FST(a.k.a, FLT) and SST(a.k.a, SLT) translation,
rename variable and functions from slpte to pte whenever possible.
But some are SST only, they are renamed with sl_ prefix.
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Co-developed-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
Message-Id: <20241212083757.605022-6-zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Per VT-d spec 4.1, 6.5.2.4, "Table 21. PASID-based-IOTLB Invalidation",
PADID-selective PASID-based iotlb invalidation will flush stage-2 iotlb
entries with matching domain id and pasid.
With stage-1 translation introduced, guest could send PASID-selective
PASID-based iotlb invalidation to flush either stage-1 or stage-2 entries.
By this chance, remove old IOTLB related definitions which were unused.
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Clément Mathieu--Drif<clement.mathieu--drif@eviden.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20241212083757.605022-5-zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add an new element flts in IntelIOMMUState to mark stage-1 translation support
in scalable mode, this element will be exposed as an intel_iommu property
x-flts finally.
For now, it's only a placehholder and used for address width compatibility
check and block host device passthrough until nesting is supported.
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Clément Mathieu--Drif<clement.mathieu--drif@eviden.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
Message-Id: <20241212083757.605022-4-zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
When guest configures Nested Translation(011b) or First-stage Translation only
(001b), type check passed unaccurately.
Fails the type check in those cases as their simulation isn't supported yet.
Fixes: fb43cf739e ("intel_iommu: scalable mode emulation")
Suggested-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Clément Mathieu--Drif<clement.mathieu--drif@eviden.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20241212083757.605022-3-zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Spec revision 3.0 or above defines more detailed fault reasons for
scalable mode. So introduce them into emulation code, see spec
section 7.1.2 for details.
Note spec revision has no relation with VERSION register, Guest
kernel should not use that register to judge what features are
supported. Instead cap/ecap bits should be checked.
Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Clément Mathieu--Drif<clement.mathieu--drif@eviden.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20241212083757.605022-2-zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
CPU_SCAN_METHOD was processing insert events first and only if insert event was
not present then it would check remove event.
Normally it's not an issue as it doesn't make much sense tho hotplug and
immediately unplug it. In this corner case, which can be reproduced with:
qemu -smp 1,maxcpus=2 -cpu host -monitor stdio \
-drive if=pflash,format=raw,readonly,file=edk2-x86_64-code.fd
* boot till GRUB prompt and pause guest (either via monitor or stop GRUB
from automatic boot)
* at monitor prompt add CPU:
device_add host-x86_64-cpu,socket-id=0,core-id=1,thread-id=0,id=foo
* let guest OS boot completely, and unplug CPU from monitor prompt:
device_del foo
which triggers GPE event that leads to CPU_SCAN_METHOD on guest side
as result of above cpu 'foo' will not be hotunplugged, since QEMU sees
insert event and ignores remove event (leaving it in pending state) for
the GPE event.
Any follow up CPU hotplug/unplug action from QEMU side will handle
previously ignored event, so as workaround user can repeat device_del.
Fix this corner-case by queuing remove events independently from insert
events, aka the same way as we do with insert events. And then go over remove
queue to send eject notify events to OSPM within the same GPE event.
PS:
Process remove queue after the cpu add queue has been processed 1st
to ensure that OSPM gets hotadd evets after hotremove ones.
PS2:
Case where it's still borken happens when guest OS is Linux and
device_del happens before guest OS initializes ACPI subsystem.
Culprit in this case though is the guest kernel, which mangles GPE.sts
(by clearing them up) and thus pending SCI turns to NOP leaving
insert/remove events in pending state.
That is the guest bug and should be fixed there.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reported-by: Eric Mackay <eric.mackay@oracle.com>
Message-Id: <20241210163945.3422623-3-imammedo@redhat.com>
Tested-by: Eric Mackay <eric.mackay@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>