Make sure MemTxAttrs is packed into 8 bytes and does not exceed 8 bytes.
Suggested-by: Philippe Mathieu-Daudà <philmd@linaro.org>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20250121151322.171832-3-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Convert `unspecified` member of MemTxAttrs from bit field to bool, so
that bindgen could generate more ergonomic Rust binding with bool type.
As a result, MemTxAttrs needs to be expanded from 4 bytes to 8 bytes.
Therefore, move `unspecified` to after the bit fields and add reserved
members to ensure that the whole structure is packed into 8 bytes.
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20250121151322.171832-2-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Because the RNMI interrupt trap handler address is implementation defined.
We add the 'rnmi-interrupt-vector' and 'rnmi-exception-vector' as the property
of the harts. It’s very easy for users to set the address based on their
expectation. This patch also adds the functionality to handle the RNMI signals.
Signed-off-by: Frank Chang <frank.chang@sifive.com>
Signed-off-by: Tommy Wu <tommy.wu@sifive.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20250106054336.1878291-4-frank.chang@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
- log a guest_error for failed semihosting open()
- clean up semihosting includes to reduce build duplication
- re-factor misc device initialisation to fail with &error_exit
- propagate Error * to gdbserver_start sub-functions
- fix 32-bit build of plugins and re-enable by default
- ensure IRQs don't preempt io recompiled instructions
- remove usage of gcc_struct to enable clang builds
- enable clang/lld to build plugins on windows
- various small kdoc typo fixes
- add perl scripts to editorconfig
- remove unused field from MemoryRegion
- make kdoc script a dependency so doc rebuilds get triggered
- expand developer documentation:
- notes on git-publish
- describe usage of b4
- setting up build dependencies
- codebase layout
- add a glossary of common terms
- optimise the windows ndis script
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEZoWumedRZ7yvyN81+9DbCVqeKkQFAmeKO8sACgkQ+9DbCVqe
KkTbBQf9HRlspCl2r5a8K9O1ymylKiZ653OBWMStGTQ8xPXeLDFhT+ION34VBgBh
LXHEcjIHn24cN2s1BO5+xJs0nzqYe7UEAK6JQmdX3/HEuf8VmaVslvhm+nCWKoIL
JQbsHno92wh6vvTWQu53zijEuG5HdBseWiwQKHbE1oSRc2CikG70o902AL9zXAsp
mpUYWxUmWwg5uQATztp4XfylJBcSX3SiVgv22jXLqBj9drXPftl/E33fcWXxTj5f
AM3kz9fxaCfo5+znmYw3R1tT/Hv52Q6hW+oNAm34XeWp1/+ho27QMRrpIi/dpdwX
mCbvJwI75sCnD91p9NW7vZIYZJKsLg==
=SLCY
-----END PGP SIGNATURE-----
Merge tag 'pull-10.0-gdb-plugins-doc-updates-170125-1' of https://gitlab.com/stsquad/qemu into staging
semihosting, plugin and doc updates:
- log a guest_error for failed semihosting open()
- clean up semihosting includes to reduce build duplication
- re-factor misc device initialisation to fail with &error_exit
- propagate Error * to gdbserver_start sub-functions
- fix 32-bit build of plugins and re-enable by default
- ensure IRQs don't preempt io recompiled instructions
- remove usage of gcc_struct to enable clang builds
- enable clang/lld to build plugins on windows
- various small kdoc typo fixes
- add perl scripts to editorconfig
- remove unused field from MemoryRegion
- make kdoc script a dependency so doc rebuilds get triggered
- expand developer documentation:
- notes on git-publish
- describe usage of b4
- setting up build dependencies
- codebase layout
- add a glossary of common terms
- optimise the windows ndis script
# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCgAdFiEEZoWumedRZ7yvyN81+9DbCVqeKkQFAmeKO8sACgkQ+9DbCVqe
# KkTbBQf9HRlspCl2r5a8K9O1ymylKiZ653OBWMStGTQ8xPXeLDFhT+ION34VBgBh
# LXHEcjIHn24cN2s1BO5+xJs0nzqYe7UEAK6JQmdX3/HEuf8VmaVslvhm+nCWKoIL
# JQbsHno92wh6vvTWQu53zijEuG5HdBseWiwQKHbE1oSRc2CikG70o902AL9zXAsp
# mpUYWxUmWwg5uQATztp4XfylJBcSX3SiVgv22jXLqBj9drXPftl/E33fcWXxTj5f
# AM3kz9fxaCfo5+znmYw3R1tT/Hv52Q6hW+oNAm34XeWp1/+ho27QMRrpIi/dpdwX
# mCbvJwI75sCnD91p9NW7vZIYZJKsLg==
# =SLCY
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 17 Jan 2025 06:15:23 EST
# gpg: using RSA key 6685AE99E75167BCAFC8DF35FBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6685 AE99 E751 67BC AFC8 DF35 FBD0 DB09 5A9E 2A44
* tag 'pull-10.0-gdb-plugins-doc-updates-170125-1' of https://gitlab.com/stsquad/qemu: (37 commits)
scripts/nsis.py: Run dependency check for each DLL file only once
docs: add a glossary
docs/devel: add a codebase section
docs/devel: add information on how to setup build environments
docs/devel: add b4 for patch retrieval
docs/devel: add git-publish for patch submitting
docs/sphinx: include kernel-doc script as a dependency
include/exec: remove warning_printed from MemoryRegion
include/exec: fix some copy and paste errors in kdoc
tests/qtest: fix some copy and paste errors in kdoc
editorconfig: update for perl scripts
plugins: fix kdoc annotation
plugins: enable linking with clang/lld
docs/devel/style: add a section about bitfield, and disallow them for packed structures
win32: remove usage of attribute gcc_struct
accel/tcg: also suppress asynchronous IRQs for cpu_io_recompile
configure: reenable plugins by default for 32-bit hosts
contrib/plugins/hotpages: fix 32-bit build
contrib/plugins/hwprofile: fix 32-bit build
contrib/plugins/cflow: fix 32-bit build
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* target/i386: small code generation improvements
* target/i386: various cleanups and fixes
* cpu: remove env->nr_cores
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmeBoIgUHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroOD2gf+NK7U1EhNIrsbBsbtu2i7+tnbRKIB
MTu+Mxb2wz4C7//pxq+vva4bgT3iOuL9RF19PRe/63CMD65xMiwyyNrEWX2HbRIJ
5dytLLLdef3yMhHh2x1uZfm54g12Ppvn9kulMCbPawrlqWgg1sZbkUBrRtFzS45c
NeYjGWWSpBDe7LtsrgSRYLPnz6wWEiy3tDpu2VoDtjrE86UVDXwyzpbtBk9Y8jPi
CKdvLyQeO9xDE5OoXMjJMlJeQq3D9iwYEprXUqy+RUZtpW7YmqMCf2JQ4dAjVCad
07v/kITF4brGCVnzDcDA6W7LqHpBu1w+Hn23yLw3HEDDBt11o9JjQCl9qA==
=xIQ4
-----END PGP SIGNATURE-----
Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging
* rust: miscellaneous changes
* target/i386: small code generation improvements
* target/i386: various cleanups and fixes
* cpu: remove env->nr_cores
# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmeBoIgUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroOD2gf+NK7U1EhNIrsbBsbtu2i7+tnbRKIB
# MTu+Mxb2wz4C7//pxq+vva4bgT3iOuL9RF19PRe/63CMD65xMiwyyNrEWX2HbRIJ
# 5dytLLLdef3yMhHh2x1uZfm54g12Ppvn9kulMCbPawrlqWgg1sZbkUBrRtFzS45c
# NeYjGWWSpBDe7LtsrgSRYLPnz6wWEiy3tDpu2VoDtjrE86UVDXwyzpbtBk9Y8jPi
# CKdvLyQeO9xDE5OoXMjJMlJeQq3D9iwYEprXUqy+RUZtpW7YmqMCf2JQ4dAjVCad
# 07v/kITF4brGCVnzDcDA6W7LqHpBu1w+Hn23yLw3HEDDBt11o9JjQCl9qA==
# =xIQ4
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 10 Jan 2025 17:34:48 EST
# gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg: issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (38 commits)
i386/cpu: Set and track CPUID_EXT3_CMP_LEG in env->features[FEAT_8000_0001_ECX]
i386/cpu: Set up CPUID_HT in x86_cpu_expand_features() instead of cpu_x86_cpuid()
cpu: Remove nr_cores from struct CPUState
i386/cpu: Hoist check of CPUID_EXT3_TOPOEXT against threads_per_core
i386/cpu: Track a X86CPUTopoInfo directly in CPUX86State
i386/topology: Introduce helpers for various topology info of different level
i386/topology: Update the comment of x86_apicid_from_topo_ids()
i386/cpu: Drop cores_per_pkg in cpu_x86_cpuid()
i386/cpu: Drop the variable smp_cores and smp_threads in x86_cpu_pre_plug()
i386/cpu: Extract a common fucntion to setup value of MSR_CORE_THREAD_COUNT
target/i386/kvm: Replace ARRAY_SIZE(msr_handlers) with KVM_MSR_FILTER_MAX_RANGES
target/i386/kvm: Clean up error handling in kvm_arch_init()
target/i386/kvm: Return -1 when kvm_msr_energy_thread_init() fails
target/i386/kvm: Clean up return values of MSR filter related functions
target/i386/confidential-guest: Fix comment of x86_confidential_guest_kvm_type()
target/i386/kvm: Drop workaround for KVM_X86_DISABLE_EXITS_HTL typo
target/i386/kvm: Only save/load kvmclock MSRs when kvmclock enabled
target/i386/kvm: Remove local MSR_KVM_WALL_CLOCK and MSR_KVM_SYSTEM_TIME definitions
target/i386/kvm: Add feature bit definitions for KVM CPUID
i386/cpu: Mark avx10_version filtered when prefix is NULL
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Since d197063fcf (memory: move unassigned_mem_ops to memory.c) this
field is unused.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250116160306.1709518-31-alex.bennee@linaro.org>
A number of copy and paste kdoc comments are referring to the wrong
definition. Fix those cases.
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250116160306.1709518-30-alex.bennee@linaro.org>
The function is qemu_plugin_mem_get_value()
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250116160306.1709518-27-alex.bennee@linaro.org>
This attribute is not recognized by clang.
An investigation has been performed to ensure this attribute has no
effect on layout of structures we use in QEMU [1], so it's safe to
remove now.
In the future, we'll forbid introducing new bitfields in packed struct,
as they are the one potentially impacted by this change.
[1] https://lore.kernel.org/qemu-devel/66c346de-7e20-4831-b3eb-1cda83240af9@linaro.org/
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Stefan Weil <sw@weilnetz.de>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20250110203401.178532-2-pierrick.bouvier@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250116160306.1709518-24-alex.bennee@linaro.org>
This started as a clean-up to properly pass a Error handler to the
gdbserver_start so we could do the right thing for command line and
HMP invocations.
Now that we have cleaned up foreach_device_config_or_exit() in earlier
patches we can further simplify by it by passing &error_fatal instead
of checking the return value. Having a return value is still useful
for HMP though so tweak the return to use a simple bool instead.
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250116160306.1709518-11-alex.bennee@linaro.org>
The CPUState structure is declared in "hw/core/cpu.h",
the EXCP_HALTED definition in "exec/cpu-common.h".
Both headers are indirectly include by "cpu.h". In
order to remove "cpu.h" from "semihosting/console.h",
explicitly include them in console.c, otherwise we'd
get:
../semihosting/console.c:88:11: error: incomplete definition of type 'struct CPUState'
88 | cs->exception_index = EXCP_HALTED;
| ~~^
../semihosting/console.c:88:31: error: use of undeclared identifier 'EXCP_HALTED'
88 | cs->exception_index = EXCP_HALTED;
| ^
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20250103171037.11265-5-philmd@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250116160306.1709518-7-alex.bennee@linaro.org>
target_ulong is defined in each target "cpu-param.h",
itself included by "exec/cpu-defs.h".
Include the latter in order to avoid when refactoring:
include/semihosting/syscalls.h:26:24: error: unknown type name 'target_ulong'
26 | target_ulong fname, target_ulong fname_len,
| ^
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20250103171037.11265-2-philmd@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250116160306.1709518-4-alex.bennee@linaro.org>
Since it is not obvious the get/put_user*() methods
can return an error, add brief docstrings about it.
Also remind to use *unlock_user() when appropriate.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20241212115413.42109-1-philmd@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250116160306.1709518-3-alex.bennee@linaro.org>
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEEMUsIrNDeSBEzpfKGm+mA/QrAFUQFAmeIxhcSHGR3bXdAYW1h
em9uLmNvLnVrAAoJEJvpgP0KwBVEbKoP/iqQ+PhbwT9+xz6lxW+g1Dx+YGrT/ugp
d3xHn9AEkR0EHC42J6RB/llyWbKVD/IIhYwUk5GDm+4InGrtuQDhG6UqWxqvIRht
0JuZvVm7x5akmKv73igxNqZHVg0ZEAS+EllBUaBYWj0pvpMbBK93Sdz9PXKxA7Nm
dPeFrOpL2TAmnDCH1UuBbXypHEjAghmv7WFphMtk6qLX+wYVaK3F2J/ed2TNyT0V
LliOdQH0Pxt445SSVJIZRe9bW3FH7qyvZV1gCnxSnqPUlN7vBhpjzgl4hWEzVYcp
7X21ZAD9kPc81DJjYucbLjAbrqSmlDrJqL05qtRigfPcnqz2NoKrYxhj8B0F8mgt
1IbymPyeab5gk5Hi1QgMmG5eobDDaglDSxpq6gRfJBiJW+1adif00z/HVvt5onS0
uQ6i6w5NzQciBX77muAb2ZDEMysjk+3wSJMMpkfl90D0kjlMqeWWs4FH9ThasjC+
EhQioUD0euedgnzOSfQjNNtAW4gzv9rcShkcV84bjxP/0Es+Pgx9f6wtCUTzdeqy
Cid8/72lHIgrkZGfpv8BBZkA1XP09vgtUGKyAWm4yHOcB57l8cNiL1nKtqoCLwkQ
8JWFWzFeEY19KoiRGY5saH6ExeOx8fmc/lYwqImZqFqvuFX4Vf2RJdTIRIYr7g05
2QffxFmskg+A
=Wz0V
-----END PGP SIGNATURE-----
Merge tag 'pull-xenfv-20250116' of git://git.infradead.org/users/dwmw2/qemu into staging
Xen regression fixes and cleanups
# -----BEGIN PGP SIGNATURE-----
#
# iQJGBAABCAAwFiEEMUsIrNDeSBEzpfKGm+mA/QrAFUQFAmeIxhcSHGR3bXdAYW1h
# em9uLmNvLnVrAAoJEJvpgP0KwBVEbKoP/iqQ+PhbwT9+xz6lxW+g1Dx+YGrT/ugp
# d3xHn9AEkR0EHC42J6RB/llyWbKVD/IIhYwUk5GDm+4InGrtuQDhG6UqWxqvIRht
# 0JuZvVm7x5akmKv73igxNqZHVg0ZEAS+EllBUaBYWj0pvpMbBK93Sdz9PXKxA7Nm
# dPeFrOpL2TAmnDCH1UuBbXypHEjAghmv7WFphMtk6qLX+wYVaK3F2J/ed2TNyT0V
# LliOdQH0Pxt445SSVJIZRe9bW3FH7qyvZV1gCnxSnqPUlN7vBhpjzgl4hWEzVYcp
# 7X21ZAD9kPc81DJjYucbLjAbrqSmlDrJqL05qtRigfPcnqz2NoKrYxhj8B0F8mgt
# 1IbymPyeab5gk5Hi1QgMmG5eobDDaglDSxpq6gRfJBiJW+1adif00z/HVvt5onS0
# uQ6i6w5NzQciBX77muAb2ZDEMysjk+3wSJMMpkfl90D0kjlMqeWWs4FH9ThasjC+
# EhQioUD0euedgnzOSfQjNNtAW4gzv9rcShkcV84bjxP/0Es+Pgx9f6wtCUTzdeqy
# Cid8/72lHIgrkZGfpv8BBZkA1XP09vgtUGKyAWm4yHOcB57l8cNiL1nKtqoCLwkQ
# 8JWFWzFeEY19KoiRGY5saH6ExeOx8fmc/lYwqImZqFqvuFX4Vf2RJdTIRIYr7g05
# 2QffxFmskg+A
# =Wz0V
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 16 Jan 2025 03:40:55 EST
# gpg: using RSA key 314B08ACD0DE481133A5F2869BE980FD0AC01544
# gpg: issuer "dwmw@amazon.co.uk"
# gpg: Good signature from "David Woodhouse <dwmw@amazon.co.uk>" [unknown]
# gpg: aka "David Woodhouse <dwmw@amazon.com>" [unknown]
# gpg: WARNING: The key's User ID is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 314B 08AC D0DE 4811 33A5 F286 9BE9 80FD 0AC0 1544
* tag 'pull-xenfv-20250116' of git://git.infradead.org/users/dwmw2/qemu:
system/runstate: Fix regression, clarify BQL status of exit notifiers
hw/xen: Fix errp handling in xen_console
hw/xen: Use xs_node_read() from xenstore_read_str() instead of open-coding it
hw/xen: Use xs_node_read() from xen_netdev_get_name()
hw/xen: Use xs_node_read() from xen_console_get_name()
hw/xen: Use xs_node_read() from xs_node_vscanf()
xen: do not use '%ms' scanf specifier
hw/xen: Add xs_node_read() helper function
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQQNhkKjomWfgLCz0aQfewwSUazn0QUCZ4hk/QAKCRAfewwSUazn
0WagAQDgJaWBLQxZkyQR2FQm3WHg3Uf/qolab9nDGo3b2BpixgD/RdvZf+mZpAwf
2ipAQ7g5GqGTKtTAdqO/aBAqTCZCqQU=
=7KKt
-----END PGP SIGNATURE-----
Merge tag 'pull-loongarch-20250116' of https://gitlab.com/bibo-mao/qemu into staging
loongarch queue
# -----BEGIN PGP SIGNATURE-----
#
# iHUEABYKAB0WIQQNhkKjomWfgLCz0aQfewwSUazn0QUCZ4hk/QAKCRAfewwSUazn
# 0WagAQDgJaWBLQxZkyQR2FQm3WHg3Uf/qolab9nDGo3b2BpixgD/RdvZf+mZpAwf
# 2ipAQ7g5GqGTKtTAdqO/aBAqTCZCqQU=
# =7KKt
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 15 Jan 2025 20:46:37 EST
# gpg: using EDDSA key 0D8642A3A2659F80B0B3D1A41F7B0C1251ACE7D1
# gpg: Good signature from "bibo mao <maobibo@loongson.cn>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 7044 3A00 19C0 E97A 31C7 13C4 8E86 8FB7 A176 9D4C
# Subkey fingerprint: 0D86 42A3 A265 9F80 B0B3 D1A4 1F7B 0C12 51AC E7D1
* tag 'pull-loongarch-20250116' of https://gitlab.com/bibo-mao/qemu:
hw/intc/loongarch_ipi: Use alternative implemation for cpu_by_arch_id
hw/intc/loongson_ipi: Add more input parameter for cpu_by_arch_id
hw/intc/loongarch_ipi: Remove property num-cpu
hw/intc/loongarch_ipi: Get cpu number from possible_cpu_arch_ids
hw/intc/loongson_ipi: Remove property num_cpu from loongson_ipi_common
hw/intc/loongson_ipi: Remove num_cpu from loongson_ipi_common
hw/intc/loongarch_ipi: Implement realize interface
target/loongarch: Add page table walker support for debugger usage
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The vmclock device addresses the problem of live migration with
precision clocks. The tolerances of a hardware counter (e.g. TSC) are
typically around ±50PPM. A guest will use NTP/PTP/PPS to discipline that
counter against an external source of 'real' time, and track the precise
frequency of the counter as it changes with environmental conditions.
When a guest is live migrated, anything it knows about the frequency of
the underlying counter becomes invalid. It may move from a host where
the counter running at -50PPM of its nominal frequency, to a host where
it runs at +50PPM. There will also be a step change in the value of the
counter, as the correctness of its absolute value at migration is
limited by the accuracy of the source and destination host's time
synchronization.
The device exposes a shared memory region to guests, which can be mapped
all the way to userspace. In the first phase, this merely advertises a
'disruption_marker', which indicates that the guest should throw away any
NTP synchronization it thinks it has, and start again.
Because the region can be exposed all the way to userspace, applications
can still use time from a fast vDSO 'system call', and check the
disruption marker to be sure that their timestamp is indeed truthful.
The structure also allows for the precise time, as known by the host, to
be exposed directly to guests so that they don't have to wait for NTP to
resync from scratch.
The values and fields are based on the nascent virtio-rtc specification,
and the intent is that a version (hopefully precisely this version) of
this structure will be included as an optional part of that spec. In the
meantime, a simple ACPI device along the lines of VMGENID is perfectly
sufficient and is compatible with what's being shipped in certain
commercial hypervisors.
Linux guest support was merged into the 6.13-rc1 kernel:
https://git.kernel.org/torvalds/c/205032724226
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
Message-Id: <07fd5e2f529098ad4d7cab1423fe9f4a03a9cc14.camel@infradead.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Migration state transfer interface is only used by vhost-user-fs,
so the interface needs to be defined only when vhost is built.
But I need to use this interface with virtio-net and vhost is not always
enabled, and to avoid undefined reference error during build, define stub
functions for vhost_supports_device_state(), vhost_save_backend_state() and
vhost_load_backend_state().
Cc: Hanna Czenczek <hreitz@redhat.com>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20250115135044.799698-2-lvivier@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The hardware error firmware is where HEST error structures are
stored. Those can be GHESv2, but they can also be other types.
Better name the location of the hardware error.
No functional changes.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <ddbb94294bafee998f12fede3ba0b05dae5ee45f.1736945236.git.mchehab+huawei@kernel.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The current function used to generate GHES data is specific for
memory errors. Give a better name for it, as we now have a generic
function as well.
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Message-Id: <35b59121129d5e99cb5062cc3d775594bbb0905b.1736945236.git.mchehab+huawei@kernel.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Split the code into separate functions to allow using the
common CPER filling code by different error sources.
The generic code was moved to ghes_record_cper_errors(),
and ghes_gen_err_data_uncorrectable_recoverable() now contains
only a logic to fill the Generic Error Data part of the record,
as described at:
ACPI 6.2: 18.3.2.7.1 Generic Error Data
The remaining code to generate a memory error now belongs to
acpi_ghes_record_errors() function.
A further patch will give it a better name.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <68d9f787d8c4fc8d1dbc227d6902fe801e42dea9.1736945236.git.mchehab+huawei@kernel.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
As described at: ACPI 6.5 spec at:
18.3.2. ACPI Error Source
In particular at GHES/GHESv2 table:
Table 18.10 Generic Hardware Error Source Structure
HEST source ID is actually a 16-bit value.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <0e83ba548c1aedd1299fe387b94db78986590a34.1736945236.git.mchehab+huawei@kernel.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Align the header file with the actual implementation of
this function, as the first argument is source ID and not
notification type.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <d55f2a6ede5a168e42a20a228b2c066cb4c60939.1736945236.git.mchehab+huawei@kernel.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The GHES driver requires not only a HEST table, but also a
separate firmware file to store Error Structure records.
It can't do one without the other.
Simplify the caller logic for it to require one function.
No functional changes.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <9584bb8953385e165681d5d185c503f8df8ef42f.1736945236.git.mchehab+huawei@kernel.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This is just duplicating ACPI_GHES_ERROR_SOURCE_COUNT, which
has a better name. So, drop the duplication.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <9012bf4c9630adf15a22af3c88fda8270916887b.1736945236.git.mchehab+huawei@kernel.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add the framework to test the intel-iommu device.
Currently only tested cap/ecap bits correctness when x-flts=on in scalable
mode. Also tested cap/ecap bits consistency before and after system reset.
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Acked-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Clément Mathieu--Drif<clement.mathieu--drif@eviden.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20241212083757.605022-21-zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This gives user flexibility to turn off FS1GP for debug purpose.
It is also useful for future nesting feature. When host IOMMU doesn't
support FS1GP but vIOMMU does, nested page table on host side works
after turning FS1GP off in vIOMMU.
This property has no effect when vIOMMU is in legacy mode or x-flts=off
in scalable modme.
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Clément Mathieu--Drif<clement.mathieu--drif@eviden.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20241212083757.605022-20-zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
According to VTD spec, stage-1 page table could support 4-level and
5-level paging.
However, 5-level paging translation emulation is unsupported yet.
That means the only supported value for aw_bits is 48. So default
aw_bits to 48 when stage-1 translation is turned on.
For legacy and scalable modes, 48 is the default choice for modern
OS when both 48 and 39 are supported. So it makes sense to set
default to 48 for these two modes too starting from QEMU 9.2.
Use pc_compat_9_1 to handle the compatibility for machines before
9.2.
Suggested-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Clément Mathieu--Drif<clement.mathieu--drif@eviden.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20241212083757.605022-17-zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
According to spec, Page-Selective-within-Domain Invalidation (11b):
1. IOTLB entries caching second-stage mappings (PGTT=010b) or pass-through
(PGTT=100b) mappings associated with the specified domain-id and the
input-address range are invalidated.
2. IOTLB entries caching first-stage (PGTT=001b) or nested (PGTT=011b)
mapping associated with specified domain-id are invalidated.
So per spec definition the Page-Selective-within-Domain Invalidation
needs to flush first stage and nested cached IOTLB entries as well.
We don't support nested yet and pass-through mapping is never cached,
so what in iotlb cache are only first-stage and second-stage mappings.
Add a tag pgtt in VTDIOTLBEntry to mark PGTT type of the mapping and
invalidate entries based on PGTT type.
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Clément Mathieu--Drif<clement.mathieu--drif@eviden.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
Message-Id: <20241212083757.605022-11-zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Because we will support both FST(a.k.a, FLT) and SST(a.k.a, SLT) translation,
rename variable and functions from slpte to pte whenever possible.
But some are SST only, they are renamed with sl_ prefix.
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Co-developed-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
Message-Id: <20241212083757.605022-6-zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add an new element flts in IntelIOMMUState to mark stage-1 translation support
in scalable mode, this element will be exposed as an intel_iommu property
x-flts finally.
For now, it's only a placehholder and used for address width compatibility
check and block host device passthrough until nesting is supported.
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Clément Mathieu--Drif<clement.mathieu--drif@eviden.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
Message-Id: <20241212083757.605022-4-zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add the VIRTIO_GPU_F_RESOURCE_UUID feature to enable the assignment
of resources UUIDs for export to other virtio devices.
Signed-off-by: Dorinda Bassey <dbassey@redhat.com>
Message-Id: <20241007070013.3350752-1-dbassey@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
By changing the way the main QEMU event loop is invoked, I inadvertently
changed the BQL status of exit notifiers: some of them implicitly
assumed they would be called with the BQL held; the BQL is however
not held during the exit(status) call in qemu_default_main().
Instead of attempting to ensuring we always call exit() from the BQL -
including any transitive calls - this change adds a BQL lock guard to
qemu_run_exit_notifiers, ensuring the BQL will always be held in the
exit notifiers.
Additionally, the BQL promise is now documented at the
qemu_{add,remove}_exit_notifier() declarations.
Fixes: f5ab12caba ("ui & main loop: Redesign of system-specific main
thread event handling")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2771
Reported-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu>
Tested-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
The 'm' parameter used to request auto-allocation of the destination variable
is not supported on FreeBSD, and as such leads to failures to parse.
What's more, the current usage of '%ms' with xs_node_scanf() is pointless, as
it just leads to a double allocation of the same string. Instead use
xs_node_read() to read the whole xenstore node.
Fixes: a783f8ad4e ('xen: add a mechanism to automatically create XenDevice-s...')
Fixes: 9b77374690 ('hw/xen: update Xen console to XenDevice model')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Anthony PERARD <anthony.perard@vates.tech>
This returns the full contents of the node, having created the node path
from the printf-style format string provided in its arguments.
This will save various callers from having to do so for themselves (and
from using xs_node_scanf() with the non-portable %ms format string.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
[remove double newline and constify trace parameters]
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Anthony PERARD <anthony.perard@vates.tech>
Add logic cpu index input parameter for function cpu_by_arch_id,
CPUState::cpu_index is logic cpu slot index for possible_cpus.
At the same time it is logic index with LoongsonIPICommonState::IPICore,
here hide access for CPUState::cpu_index directly, it comes from
function cpu_by_arch_id().
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Supported CPU number can be acquired from function
possible_cpu_arch_ids(), cpu-num property is not necessary and can
be removed.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Bibo Mao <maobibo@loongson.cn>
The renamed state will not only represent powering state of PFs, but
also represent SR-IOV VF enablement in the future.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20250109-reuse-v19-1-f541e82ca5f7@daynix.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Follow the assumed QOM type definition style, prefixing with
'TYPE_', and dropping the '_DEVICE' suffix which doesn't add
any value.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20250102132624.53443-1-philmd@linaro.org>
Factor qdev_hotunplug_allowed() out of qdev_unplug().
Start checking the device is not blocked.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
[PMD: Split from bigger patch, part 2/6]
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20250110091908.64454-3-philmd@linaro.org>
In preparation of checking the parent bus is hot(un)pluggable
in a few commits, pass a 'bus' argument to qdev_hotplug_allowed().
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
[PMD: Split from bigger patch, part 1/6]
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20250110091908.64454-2-philmd@linaro.org>
Inline the 3 uses of usb_new().
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20240216110313.17039-11-philmd@linaro.org>
Inline the single use of usb_try_new().
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-Id: <20240216110313.17039-10-philmd@linaro.org>
Introduce various helpers for getting the topology info of different
semantics. Using the helper is more self-explanatory.
Besides, the semantic of the helper will stay unchanged even when new
topology is added in the future. At that time, updating the
implementation of the helper without affecting the callers.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Link: https://lore.kernel.org/r/20241219110125.1266461-6-xiaoyao.li@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Update the comment of x86_apicid_from_topo_ids() to match the current
implementation,
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20241219110125.1266461-5-xiaoyao.li@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The "concrete_class" field of InterfaceClass is only ever written, and as far
as I can tell is not particularly useful when debugging either; remove it.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
- compression:
Shameer's fix for CONFIG_UADK build
Yuan Liu fixes for zero-page, QPL, qatzip
- multifd sync cleanups, prereq. for VFIO and postcopy work
- fixes for 9.2 regressions:
multifd with pre-9.0 -> post-9.1 migrations (#2720)
s390x migration (#2704)
- fix for assertions during paused migrations; rework of
late-block-activate logic (#2395, #686)
- fixes for compressed arrays creation and parsing, mostly affecting
s390x
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEqhtIsKIjJqWkw2TPx5jcdBvsMZ0FAmeBDgkQHGZhcm9zYXNA
c3VzZS5kZQAKCRDHmNx0G+wxnSlUEACl31wY+77JxWnBva/eDDwnJ9HiCrqsoqaZ
YIJJXNlk4lYJWNdZRt6p27exzWrQwm+kWKPECeCakgCMlfhnKCvejGq7iV/fJY4o
D8hjE3t1htQ8mfblY1+bqzg3Rml59KwXxiqAwvlljbNWdkXruv026dq9vgJMzFhi
ia043fOO1tYULIoawgmwmLEHnztht0v+ZTZ1v5KQbrH655tpxls/8kHc6v5PXEpA
3PSmCrCQh1dPtkYRjuJ9yHyfU+/T8tYwIjrU6VR1wQW7MBNkjtqNudaqAFiuyuqn
P8gh4rAQrMhA9y+aq6xSoJP8XGkuOHxLQtlNutlmtbcQyZ7JqgLmK9ZLdoPf21sK
//erV63NoyaciYB9Nk3NXflwroc6zyvo8A584kGNPwBznZOJLESP4SPvVm/nlE29
vbyq8AWHRjFiqqf6P0ttQLAFkusZJzM1Y9UakF51hyVBX70yfqLG20XXZtIq/aZA
GbBB2Fo0MIlbmWaur3vLsSzn7B8d++Gl9TTGcK/eIXJ1ANCuCxGv9fbXJQlP5F4I
3OAoSmAVJ2eqw4v0+2WMiEa8yUA5drNnDSI3VRkG+0K9jRfHKXki466/QQdGrNw7
8GuuzLBNai3gEKbavDU0Be73r982KjXeYXj7RuAkQfm0d4H7tiwtg91Cd1dPKfzh
mhpmOFJDCg==
=joNM
-----END PGP SIGNATURE-----
Merge tag 'migration-20250110-pull-request' of https://gitlab.com/farosas/qemu into staging
Migration pull request
- compression:
Shameer's fix for CONFIG_UADK build
Yuan Liu fixes for zero-page, QPL, qatzip
- multifd sync cleanups, prereq. for VFIO and postcopy work
- fixes for 9.2 regressions:
multifd with pre-9.0 -> post-9.1 migrations (#2720)
s390x migration (#2704)
- fix for assertions during paused migrations; rework of
late-block-activate logic (#2395, #686)
- fixes for compressed arrays creation and parsing, mostly affecting
s390x
# -----BEGIN PGP SIGNATURE-----
#
# iQJEBAABCAAuFiEEqhtIsKIjJqWkw2TPx5jcdBvsMZ0FAmeBDgkQHGZhcm9zYXNA
# c3VzZS5kZQAKCRDHmNx0G+wxnSlUEACl31wY+77JxWnBva/eDDwnJ9HiCrqsoqaZ
# YIJJXNlk4lYJWNdZRt6p27exzWrQwm+kWKPECeCakgCMlfhnKCvejGq7iV/fJY4o
# D8hjE3t1htQ8mfblY1+bqzg3Rml59KwXxiqAwvlljbNWdkXruv026dq9vgJMzFhi
# ia043fOO1tYULIoawgmwmLEHnztht0v+ZTZ1v5KQbrH655tpxls/8kHc6v5PXEpA
# 3PSmCrCQh1dPtkYRjuJ9yHyfU+/T8tYwIjrU6VR1wQW7MBNkjtqNudaqAFiuyuqn
# P8gh4rAQrMhA9y+aq6xSoJP8XGkuOHxLQtlNutlmtbcQyZ7JqgLmK9ZLdoPf21sK
# //erV63NoyaciYB9Nk3NXflwroc6zyvo8A584kGNPwBznZOJLESP4SPvVm/nlE29
# vbyq8AWHRjFiqqf6P0ttQLAFkusZJzM1Y9UakF51hyVBX70yfqLG20XXZtIq/aZA
# GbBB2Fo0MIlbmWaur3vLsSzn7B8d++Gl9TTGcK/eIXJ1ANCuCxGv9fbXJQlP5F4I
# 3OAoSmAVJ2eqw4v0+2WMiEa8yUA5drNnDSI3VRkG+0K9jRfHKXki466/QQdGrNw7
# 8GuuzLBNai3gEKbavDU0Be73r982KjXeYXj7RuAkQfm0d4H7tiwtg91Cd1dPKfzh
# mhpmOFJDCg==
# =joNM
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 10 Jan 2025 07:09:45 EST
# gpg: using RSA key AA1B48B0A22326A5A4C364CFC798DC741BEC319D
# gpg: issuer "farosas@suse.de"
# gpg: Good signature from "Fabiano Rosas <farosas@suse.de>" [unknown]
# gpg: aka "Fabiano Almeida Rosas <fabiano.rosas@suse.com>" [unknown]
# gpg: WARNING: The key's User ID is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: AA1B 48B0 A223 26A5 A4C3 64CF C798 DC74 1BEC 319D
* tag 'migration-20250110-pull-request' of https://gitlab.com/farosas/qemu: (25 commits)
multifd: bugfix for incorrect migration data with qatzip compression
multifd: bugfix for incorrect migration data with QPL compression
multifd: bugfix for migration using compression methods
s390x: Fix CSS migration
migration: Fix arrays of pointers in JSON writer
migration: Dump correct JSON format for nullptr replacement
migration: Rename vmstate_info_nullptr
migration: Fix parsing of s390 stream
migration: Remove unused argument in vmsd_desc_field_end
migration: Add more error handling to analyze-migration.py
migration/block: Rewrite disk activation
migration/block: Fix possible race with block_inactive
migration/block: Apply late-block-active behavior to postcopy
migration/block: Make late-block-active the default
qmp/cont: Only activate disks if migration completed
migration: Add helper to get target runstate
migration/multifd: Fix compat with QEMU < 9.0
migration/multifd: Document the reason to sync for save_setup()
migration/multifd: Cleanup src flushes on condition check
migration/multifd: Remove sync processing on postcopy
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This patch proposes a flag to maintain disk activation status globally. It
mostly rewrites disk activation mgmt for QEMU, including COLO and QMP
command xen_save_devices_state.
Backgrounds
===========
We have two problems on disk activations, one resolved, one not.
Problem 1: disk activation recover (for switchover interruptions)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
When migration is either cancelled or failed during switchover, especially
when after the disks are inactivated, QEMU needs to remember re-activate
the disks again before vm starts.
It used to be done separately in two paths: one in qmp_migrate_cancel(),
the other one in the failure path of migration_completion().
It used to be fixed in different commits, all over the places in QEMU. So
these are the relevant changes I saw, I'm not sure if it's complete list:
- In 2016, commit fe904ea824 ("migration: regain control of images when
migration fails to complete")
- In 2017, commit 1d2acc3162 ("migration: re-active images while migration
been canceled after inactive them")
- In 2023, commit 6dab4c93ec ("migration: Attempt disk reactivation in
more failure scenarios")
Now since we have a slightly better picture maybe we can unify the
reactivation in a single path.
One side benefit of doing so is, we can move the disk operation outside QMP
command "migrate_cancel". It's possible that in the future we may want to
make "migrate_cancel" be OOB-compatible, while that requires the command
doesn't need BQL in the first place. This will already do that and make
migrate_cancel command lightweight.
Problem 2: disk invalidation on top of invalidated disks
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This is an unresolved bug for current QEMU. Link in "Resolves:" at the
end. It turns out besides the src switchover phase (problem 1 above), QEMU
also needs to remember block activation on destination.
Consider two continuous migration in a row, where the VM was always paused.
In that scenario, the disks are not activated even until migration
completed in the 1st round. When the 2nd round starts, if QEMU doesn't
know the status of the disks, it needs to try inactivate the disk again.
Here the issue is the block layer API bdrv_inactivate_all() will crash a
QEMU if invoked on already inactive disks for the 2nd migration. For
detail, see the bug link at the end.
Implementation
==============
This patch proposes to maintain disk activation with a global flag, so we
know:
- If we used to inactivate disks for migration, but migration got
cancelled, or failed, QEMU will know it should reactivate the disks.
- On incoming side, if the disks are never activated but then another
migration is triggered, QEMU should be able to tell that inactivate is
not needed for the 2nd migration.
We used to have disk_inactive, but it only solves the 1st issue, not the
2nd. Also, it's done in completely separate paths so it's extremely hard
to follow either how the flag changes, or the duration that the flag is
valid, and when we will reactivate the disks.
Convert the existing disk_inactive flag into that global flag (also invert
its naming), and maintain the disk activation status for the whole
lifecycle of qemu. That includes the incoming QEMU.
Put both of the error cases of source migration (failure, cancelled)
together into migration_iteration_finish(), which will be invoked for
either of the scenario. So from that part QEMU should behave the same as
before. However with such global maintenance on disk activation status, we
not only cleanup quite a few temporary paths that we try to maintain the
disk activation status (e.g. in postcopy code), meanwhile it fixes the
crash for problem 2 in one shot.
For freshly started QEMU, the flag is initialized to TRUE showing that the
QEMU owns the disks by default.
For incoming migrated QEMU, the flag will be initialized to FALSE once and
for all showing that the dest QEMU doesn't own the disks until switchover.
That is guaranteed by the "once" variable.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2395
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Message-Id: <20241206230838.1111496-7-peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>