- update the tuxrun images to the latest baseline
- add the m68k tuxrun test
- ensure qtest checks the result of clock_step operations
- introduce new ztsd helper to functional tests
- ensure aarch64_virt test exits early when no TCG
- add new test to exercise virtio-vulkan
- bump libvirt-ci to latest version
- move riscv64 cross container from sid to trixie
- remove workaround from mips containers now upstream updated
- fix VM tests to use correct path for local QEMU binary
- add ability to get a root debug shell on VM images
- add keymap dependency to vnc tests
- remove retiring maintainers from avocado and dockerfiles
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEZoWumedRZ7yvyN81+9DbCVqeKkQFAmeCYB4ACgkQ+9DbCVqe
KkRS5Qf/V0rQ1OAxjK+/xrUPB84AYA4gwgInPzcENbQ0Oqkn2rnkEkyMlxC6AMd0
H8AmARy/mkSivm6ZaKqhz0Xhw0rblU2ZtGMUp3Xw47fwVDJZY1Pvr8vXxPySiHGW
7GmiHJzOh+tDcY0TO2biCNFmiJJ2az2STIQDS6YX4QRJOU26qsfbIMXOoqDnUACo
2e5MDgRtVFuD3/6J1SpQxGnBE79mPMkCN0gqqfn0x6W9EqsVlqB3MfHEIp5/b+Tt
Gfx8gi8HrKHUua01rjnxXyZPqrrAZ2zRo4bN1vLSJmqLN1X+yPYNEXCjOtM4f9uk
g3PA76X1kX1MFMLMPkCefKFCxI5nsQ==
=+yCJ
-----END PGP SIGNATURE-----
Merge tag 'pull-10.0-testing-updates-110125-1' of https://gitlab.com/stsquad/qemu into staging
Testing updates for 10.0
- update the tuxrun images to the latest baseline
- add the m68k tuxrun test
- ensure qtest checks the result of clock_step operations
- introduce new ztsd helper to functional tests
- ensure aarch64_virt test exits early when no TCG
- add new test to exercise virtio-vulkan
- bump libvirt-ci to latest version
- move riscv64 cross container from sid to trixie
- remove workaround from mips containers now upstream updated
- fix VM tests to use correct path for local QEMU binary
- add ability to get a root debug shell on VM images
- add keymap dependency to vnc tests
- remove retiring maintainers from avocado and dockerfiles
# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCgAdFiEEZoWumedRZ7yvyN81+9DbCVqeKkQFAmeCYB4ACgkQ+9DbCVqe
# KkRS5Qf/V0rQ1OAxjK+/xrUPB84AYA4gwgInPzcENbQ0Oqkn2rnkEkyMlxC6AMd0
# H8AmARy/mkSivm6ZaKqhz0Xhw0rblU2ZtGMUp3Xw47fwVDJZY1Pvr8vXxPySiHGW
# 7GmiHJzOh+tDcY0TO2biCNFmiJJ2az2STIQDS6YX4QRJOU26qsfbIMXOoqDnUACo
# 2e5MDgRtVFuD3/6J1SpQxGnBE79mPMkCN0gqqfn0x6W9EqsVlqB3MfHEIp5/b+Tt
# Gfx8gi8HrKHUua01rjnxXyZPqrrAZ2zRo4bN1vLSJmqLN1X+yPYNEXCjOtM4f9uk
# g3PA76X1kX1MFMLMPkCefKFCxI5nsQ==
# =+yCJ
# -----END PGP SIGNATURE-----
# gpg: Signature made Sat 11 Jan 2025 07:12:14 EST
# gpg: using RSA key 6685AE99E75167BCAFC8DF35FBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6685 AE99 E751 67BC AFC8 DF35 FBD0 DB09 5A9E 2A44
* tag 'pull-10.0-testing-updates-110125-1' of https://gitlab.com/stsquad/qemu: (32 commits)
MAINTAINERS: Remove myself from reviewers
dockerfiles: Remove 'MAINTAINER' entry in debian-tricore-cross.docker
pc-bios: ensure keymaps dependencies set vnc tests
tests/vm: allow interactive login as root
tests/vm: partially un-tabify help output
tests/vm: fix build_path based path
tests/lcitool: remove temp workaround for debian mips64el
tests/docker: move riscv64 cross container from sid to trixie
tests/lcitool: bump to latest version of libvirt-ci
tests/functional: extend test_aarch64_virt with vulkan test
tests/functional: bail aarch64_virt tests early if missing TCG
tests/functional: remove unused kernel_command_line
tests/functional: update tuxruntest to use uncompress utility
tests/functional: add zstd support to uncompress utility
tests/functional: remove hacky sleep from the tests
system/qtest: properly feedback results of clock_[step|set]
tests/qtest: remove clock_steps from virtio tests
tests/functional/aarch64: add tests for FEAT_RME
tests/functional: update the x86_64 tuxrun tests
tests/functional: update the sparc64 tuxrun tests
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The time I spent contributing to QEMU was great, but I've not been active
for a long time now. So removing myself from the reviewers list of
"Integration Testing with the Avocado framework" and
"Build and test automation" subsystems.
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
Message-Id: <20250103194450.360789-1-wainersm@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250108121054.1126164-33-alex.bennee@linaro.org>
AMSAT closed its email service [*] so my personal email
address is now defunct. Remove it to avoid bouncing emails.
[*] https://web.archive.org/web/20240617194936/https://forum.amsat-dl.org/index.php?thread/4581-amsat-mail-alias-service-to-end-august-1-2024/
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20250102152513.61065-1-philmd@linaro.org>
[AJB: update URL to web.archive.org]
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250108121054.1126164-32-alex.bennee@linaro.org>
I was seeing failures on vnc-display-test on FreeBSD:
make vm-build-freebsd V=1 TARGET_LIST=aarch64-softmmu BUILD_TARGET=check-qtest QEMU_LOCAL=1 DEBUG=1
Leads to:
qemu-system-aarch64: -vnc none: could not read keymap file: 'en-us'
Broken pipe
../src/tests/qtest/libqtest.c:196: kill_qemu() tried to terminate QEMU process but encountered exit status 1 (expected 0)
which was as far as I could tell because we don't populate the
$BLD/pc-bios/keymaps (although scripts/symlink-install-tree.py
attempts to symlink qemu-bundle/usr/local/share/qemu/keymaps/ to that
dir).
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250108121054.1126164-31-alex.bennee@linaro.org>
This is useful when debugging and you want to add packages to an
image.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250108121054.1126164-30-alex.bennee@linaro.org>
While the make syntax itself uses tabs having a mixture of tabs and
spaces in the vm-help output make no sense and confuses things lining
up between terminal and editor. Fix that.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250108121054.1126164-29-alex.bennee@linaro.org>
We no longer need to go into the per-arch build directories to find
the build directories binary. Lets call it directly.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250108121054.1126164-28-alex.bennee@linaro.org>
The workaround applied in
commit c60473d292
Author: Alex Bennée <alex.bennee@linaro.org>
Date: Wed Oct 2 10:03:33 2024 +0200
testing: bump mips64el cross to bookworm and fix package list
Is no longer required since the affected builds are now fixed.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20241217133525.3836570-1-berrange@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250108121054.1126164-27-alex.bennee@linaro.org>
Although riscv64 isn't going to be a release architecture for trixie
the packages are still built while it is testing. Moving from sid will
also avoid some of the volatility we get from tracking the bleeding
edge.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250108121054.1126164-26-alex.bennee@linaro.org>
We will shortly need this to build our riscv64 cross container.
However to keep the delta down just do the bump first. As ccache4 is
now preferred for FreeBSD to get the latest version there is a little
update in the FreeBSD metadata.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250108121054.1126164-25-alex.bennee@linaro.org>
Now that we have virtio-gpu Vulkan support, let's add a test for it.
Currently this is using images build by buildroot:
https://lists.buildroot.org/pipermail/buildroot/2024-December/768196.html
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250108121054.1126164-24-alex.bennee@linaro.org>
- compression:
Shameer's fix for CONFIG_UADK build
Yuan Liu fixes for zero-page, QPL, qatzip
- multifd sync cleanups, prereq. for VFIO and postcopy work
- fixes for 9.2 regressions:
multifd with pre-9.0 -> post-9.1 migrations (#2720)
s390x migration (#2704)
- fix for assertions during paused migrations; rework of
late-block-activate logic (#2395, #686)
- fixes for compressed arrays creation and parsing, mostly affecting
s390x
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEqhtIsKIjJqWkw2TPx5jcdBvsMZ0FAmeBDgkQHGZhcm9zYXNA
c3VzZS5kZQAKCRDHmNx0G+wxnSlUEACl31wY+77JxWnBva/eDDwnJ9HiCrqsoqaZ
YIJJXNlk4lYJWNdZRt6p27exzWrQwm+kWKPECeCakgCMlfhnKCvejGq7iV/fJY4o
D8hjE3t1htQ8mfblY1+bqzg3Rml59KwXxiqAwvlljbNWdkXruv026dq9vgJMzFhi
ia043fOO1tYULIoawgmwmLEHnztht0v+ZTZ1v5KQbrH655tpxls/8kHc6v5PXEpA
3PSmCrCQh1dPtkYRjuJ9yHyfU+/T8tYwIjrU6VR1wQW7MBNkjtqNudaqAFiuyuqn
P8gh4rAQrMhA9y+aq6xSoJP8XGkuOHxLQtlNutlmtbcQyZ7JqgLmK9ZLdoPf21sK
//erV63NoyaciYB9Nk3NXflwroc6zyvo8A584kGNPwBznZOJLESP4SPvVm/nlE29
vbyq8AWHRjFiqqf6P0ttQLAFkusZJzM1Y9UakF51hyVBX70yfqLG20XXZtIq/aZA
GbBB2Fo0MIlbmWaur3vLsSzn7B8d++Gl9TTGcK/eIXJ1ANCuCxGv9fbXJQlP5F4I
3OAoSmAVJ2eqw4v0+2WMiEa8yUA5drNnDSI3VRkG+0K9jRfHKXki466/QQdGrNw7
8GuuzLBNai3gEKbavDU0Be73r982KjXeYXj7RuAkQfm0d4H7tiwtg91Cd1dPKfzh
mhpmOFJDCg==
=joNM
-----END PGP SIGNATURE-----
Merge tag 'migration-20250110-pull-request' of https://gitlab.com/farosas/qemu into staging
Migration pull request
- compression:
Shameer's fix for CONFIG_UADK build
Yuan Liu fixes for zero-page, QPL, qatzip
- multifd sync cleanups, prereq. for VFIO and postcopy work
- fixes for 9.2 regressions:
multifd with pre-9.0 -> post-9.1 migrations (#2720)
s390x migration (#2704)
- fix for assertions during paused migrations; rework of
late-block-activate logic (#2395, #686)
- fixes for compressed arrays creation and parsing, mostly affecting
s390x
# -----BEGIN PGP SIGNATURE-----
#
# iQJEBAABCAAuFiEEqhtIsKIjJqWkw2TPx5jcdBvsMZ0FAmeBDgkQHGZhcm9zYXNA
# c3VzZS5kZQAKCRDHmNx0G+wxnSlUEACl31wY+77JxWnBva/eDDwnJ9HiCrqsoqaZ
# YIJJXNlk4lYJWNdZRt6p27exzWrQwm+kWKPECeCakgCMlfhnKCvejGq7iV/fJY4o
# D8hjE3t1htQ8mfblY1+bqzg3Rml59KwXxiqAwvlljbNWdkXruv026dq9vgJMzFhi
# ia043fOO1tYULIoawgmwmLEHnztht0v+ZTZ1v5KQbrH655tpxls/8kHc6v5PXEpA
# 3PSmCrCQh1dPtkYRjuJ9yHyfU+/T8tYwIjrU6VR1wQW7MBNkjtqNudaqAFiuyuqn
# P8gh4rAQrMhA9y+aq6xSoJP8XGkuOHxLQtlNutlmtbcQyZ7JqgLmK9ZLdoPf21sK
# //erV63NoyaciYB9Nk3NXflwroc6zyvo8A584kGNPwBznZOJLESP4SPvVm/nlE29
# vbyq8AWHRjFiqqf6P0ttQLAFkusZJzM1Y9UakF51hyVBX70yfqLG20XXZtIq/aZA
# GbBB2Fo0MIlbmWaur3vLsSzn7B8d++Gl9TTGcK/eIXJ1ANCuCxGv9fbXJQlP5F4I
# 3OAoSmAVJ2eqw4v0+2WMiEa8yUA5drNnDSI3VRkG+0K9jRfHKXki466/QQdGrNw7
# 8GuuzLBNai3gEKbavDU0Be73r982KjXeYXj7RuAkQfm0d4H7tiwtg91Cd1dPKfzh
# mhpmOFJDCg==
# =joNM
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 10 Jan 2025 07:09:45 EST
# gpg: using RSA key AA1B48B0A22326A5A4C364CFC798DC741BEC319D
# gpg: issuer "farosas@suse.de"
# gpg: Good signature from "Fabiano Rosas <farosas@suse.de>" [unknown]
# gpg: aka "Fabiano Almeida Rosas <fabiano.rosas@suse.com>" [unknown]
# gpg: WARNING: The key's User ID is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: AA1B 48B0 A223 26A5 A4C3 64CF C798 DC74 1BEC 319D
* tag 'migration-20250110-pull-request' of https://gitlab.com/farosas/qemu: (25 commits)
multifd: bugfix for incorrect migration data with qatzip compression
multifd: bugfix for incorrect migration data with QPL compression
multifd: bugfix for migration using compression methods
s390x: Fix CSS migration
migration: Fix arrays of pointers in JSON writer
migration: Dump correct JSON format for nullptr replacement
migration: Rename vmstate_info_nullptr
migration: Fix parsing of s390 stream
migration: Remove unused argument in vmsd_desc_field_end
migration: Add more error handling to analyze-migration.py
migration/block: Rewrite disk activation
migration/block: Fix possible race with block_inactive
migration/block: Apply late-block-active behavior to postcopy
migration/block: Make late-block-active the default
qmp/cont: Only activate disks if migration completed
migration: Add helper to get target runstate
migration/multifd: Fix compat with QEMU < 9.0
migration/multifd: Document the reason to sync for save_setup()
migration/multifd: Cleanup src flushes on condition check
migration/multifd: Remove sync processing on postcopy
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The set_machine and require_accelerator steps can bail early so move
those to the front of the test functions. While we are at it also
clean up some long lines when adding the vm arguments.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250108121054.1126164-23-alex.bennee@linaro.org>
The Alpine test boots from the CDROM so we don't --append a command
line. Drop the unused code.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250108121054.1126164-22-alex.bennee@linaro.org>
Use the utility functions to reduce code duplication.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250108121054.1126164-21-alex.bennee@linaro.org>
Rather than using the python library (which has a different API
anyway) lets just call the binary. zstdtools is already in out
qemu.yml so all test containers should have it around. Tests should
still use @skipIfMissingCommands('zstd') to gracefully handle when
only minimal dependencies have been installed.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250108121054.1126164-20-alex.bennee@linaro.org>
We have proper detection of prompts now so we don't need to guess with
sleep() sprinkled through the test. The extra step of calling halt is
just to flush the final bits of the log (although the last line is
still missed).
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250108121054.1126164-19-alex.bennee@linaro.org>
Time will not advance if the system is paused or there are no timer
events set for the future. In absence of pending timer events
advancing time would make no difference the system state. Attempting
to do so would be a bug and the test or device under test would need
fixing.
Tighten up the result reporting to `FAIL` if time was not advanced.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2687
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250108121054.1126164-18-alex.bennee@linaro.org>
In the qtest environment time will not step forward if the system is
paused (timers disabled) or we have no timer events to fire. As a
result VirtIO events are responded to directly and we don't need to
step time forward.
We still do timeout processing to handle the fact the target QEMU may
not be ready to respond right away. This will usually be due to a slow
CI system or if QEMU is running under something like rr.
Future qtest patches will assert that time actually changes when a
step is requested.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250108121054.1126164-17-alex.bennee@linaro.org>
This boot an OP-TEE environment, and launch a nested guest VM inside it
using the Realms feature. We do it for virt and sbsa-ref platforms.
Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Message-Id: <20241220165212.3653495-1-pierrick.bouvier@linaro.org>
[AJB: tweak ordering of setup, strip changelog from commit]
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20250108121054.1126164-16-alex.bennee@linaro.org>
Now there are new up to date images available we should update to them.
Cc: Anders Roxell <anders.roxell@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250108121054.1126164-15-alex.bennee@linaro.org>
Now there are new up to date images available we should update to them.
Cc: Anders Roxell <anders.roxell@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250108121054.1126164-14-alex.bennee@linaro.org>
Now there are new up to date images available we should update to them.
Cc: Anders Roxell <anders.roxell@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250108121054.1126164-13-alex.bennee@linaro.org>
Now there are new up to date images available we should update to them.
Note we re-use the riscv32 kernel and rootfs for test_riscv64_rv32.
Cc: Anders Roxell <anders.roxell@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250108121054.1126164-12-alex.bennee@linaro.org>
Now there are new up to date images available we should update to them.
Cc: Anders Roxell <anders.roxell@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250108121054.1126164-11-alex.bennee@linaro.org>
Now there are new up to date images available we should update to them.
Cc: Anders Roxell <anders.roxell@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250108121054.1126164-10-alex.bennee@linaro.org>
Now there are new up to date images available we should update to them.
Cc: Anders Roxell <anders.roxell@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250108121054.1126164-9-alex.bennee@linaro.org>
Now there are new up to date images available we should update to them.
Cc: Anders Roxell <anders.roxell@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250108121054.1126164-8-alex.bennee@linaro.org>
Now there are new up to date images available we should update to them.
Cc: Anders Roxell <anders.roxell@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250108121054.1126164-7-alex.bennee@linaro.org>
Now there are new up to date images available we should update to them.
Cc: Anders Roxell <anders.roxell@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250108121054.1126164-6-alex.bennee@linaro.org>
Now there are new up to date images available we should update to them.
Cc: Anders Roxell <anders.roxell@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250108121054.1126164-5-alex.bennee@linaro.org>
We didn't have this before and as it exercises the m68k virt platform
it seems worth adding. We don't wait for the shutdown because QEMU
will auto-exit on the shutdown.
Cc: Laurent Vivier <laurent@vivier.eu>
Cc: Anders Roxell <anders.roxell@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250108121054.1126164-4-alex.bennee@linaro.org>
Now there are new up to date images available we should update to them.
Cc: Anders Roxell <anders.roxell@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250108121054.1126164-3-alex.bennee@linaro.org>
Now there are new up to date images available we should update to them.
Cc: Anders Roxell <anders.roxell@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250108121054.1126164-2-alex.bennee@linaro.org>
When QPL compression is enabled on the migration channel and the same
dirty page changes from a normal page to a zero page in the iterative
memory copy, the dirty page will not be updated to a zero page again
on the target side, resulting in incorrect memory data on the source
and target sides.
The root cause is that the target side does not record the normal pages
to the receivedmap.
The solution is to add ramblock_recv_bitmap_set_offset in target side
to record the normal pages.
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Reviewed-by: Jason Zeng <jason.zeng@intel.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <20241218091413.140396-4-yuan1.liu@intel.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
When QPL compression is enabled on the migration channel and the same
dirty page changes from a normal page to a zero page in the iterative
memory copy, the dirty page will not be updated to a zero page again
on the target side, resulting in incorrect memory data on the source
and target sides.
The root cause is that the target side does not record the normal pages
to the receivedmap.
The solution is to add ramblock_recv_bitmap_set_offset in target side
to record the normal pages.
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Reviewed-by: Jason Zeng <jason.zeng@intel.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <20241218091413.140396-3-yuan1.liu@intel.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
When compression is enabled on the migration channel and
the pages processed are all zero pages, these pages will
not be sent and updated on the target side, resulting in
incorrect memory data on the source and target sides.
The root cause is that all compression methods call
multifd_send_prepare_common to determine whether to compress
dirty pages, but multifd_send_prepare_common does not update
the IOV of MultiFDPacket_t when all dirty pages are zero pages.
The solution is to always update the IOV of MultiFDPacket_t
regardless of whether the dirty pages are all zero pages.
Fixes: 303e6f54f9 ("migration/multifd: Implement zero page transmission on the multifd thread.")
Cc: qemu-stable@nongnu.org #9.0+
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Reviewed-by: Jason Zeng <jason.zeng@intel.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <20241218091413.140396-2-yuan1.liu@intel.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Currently, if an array of pointers contains a NULL pointer, that
pointer will be encoded as '0' in the stream. Since the JSON writer
doesn't define a "pointer" type, that '0' will now be an uint8, which
is different from the original type being pointed to, e.g. struct.
(we're further calling uint8 "nullptr", but that's irrelevant to the
issue)
That mixed-type array shouldn't be compressed, otherwise data is lost
as the code currently makes the whole array have the type of the first
element:
css = {NULL, NULL, ..., 0x5555568a7940, NULL};
{"name": "s390_css", "instance_id": 0, "vmsd_name": "s390_css",
"version": 1, "fields": [
...,
{"name": "css", "array_len": 256, "type": "nullptr", "size": 1},
...,
]}
In the above, the valid pointer at position 254 got lost among the
compressed array of nullptr.
While we could disable the array compression when a NULL pointer is
found, the JSON part of the stream still makes part of downtime, so we
should avoid writing unecessary bytes to it.
Keep the array compression in place, but if NULL and non-NULL pointers
are mixed break the array into several type-contiguous pieces :
css = {NULL, NULL, ..., 0x5555568a7940, NULL};
{"name": "s390_css", "instance_id": 0, "vmsd_name": "s390_css",
"version": 1, "fields": [
...,
{"name": "css", "array_len": 254, "type": "nullptr", "size": 1},
{"name": "css", "type": "struct", "struct": {"vmsd_name": "s390_css_img", ... }, "size": 768},
{"name": "css", "type": "nullptr", "size": 1},
...,
]}
Now each type-discontiguous region will become a new JSON entry. The
reader should interpret this as a concatenation of values, all part of
the same field.
Parsing the JSON with analyze-script.py now shows the proper data
being pointed to at the places where the pointer is valid and
"nullptr" where there's NULL:
"s390_css (14)": {
...
"css": [
"nullptr",
"nullptr",
...
"nullptr",
{
"chpids": [
{
"in_use": "0x00",
"type": "0x00",
"is_virtual": "0x00"
},
...
]
},
"nullptr",
}
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <20250109185249.23952-7-farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
QEMU plays a trick with null pointers inside an array of pointers in a VMSD
field. See 07d4e69147 ("migration/vmstate: fix array of ptr with
nullptrs") for more details on why. The idea makes sense in general, but
it may overlooked the JSON writer where it could write nothing in a
"struct" in the JSON hints section.
We hit some analyze-migration.py issues on s390 recently, showing that some
of the struct field contains nothing, like:
{"name": "css", "array_len": 256, "type": "struct", "struct": {}, "size": 1}
As described in details by Fabiano:
https://lore.kernel.org/r/87pll37cin.fsf@suse.de
It could be that we hit some null pointers there, and JSON was gone when
they're null pointers.
To fix it, instead of hacking around only at VMStateInfo level, do that
from VMStateField level, so that JSON writer can also be involved. In this
case, JSON writer will replace the pointer array (which used to be a
"struct") to be the real representation of the nullptr field.
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20250109185249.23952-6-farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Rename vmstate_info_nullptr from "uint64_t" to "nullptr". This vmstate
actually reads and writes just a byte, so the proper name would be
uint8. However, since this is a marker for a NULL pointer, it's
convenient to have a more explicit name that can be identified by the
consumers of the JSON part of the stream.
Change the name to "nullptr" and add support for it in the
analyze-migration.py script. Arbitrarily use the name of the type as
the value of the field to avoid the script showing 0x30 or '0', which
could be confusing for readers.
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <20250109185249.23952-5-farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
The parsing for the S390StorageAttributes section is currently leaving
an unconsumed token that is later interpreted by the generic code as
QEMU_VM_EOF, cutting the parsing short.
The migration will issue a STATTR_FLAG_DONE between iterations, which
the script consumes correctly, but there's a final STATTR_FLAG_EOS at
.save_complete that the script is ignoring. Since the EOS flag is a
u64 0x1ULL and the stream is big endian, on little endian hosts a byte
read from it will be 0x0, the same as QEMU_VM_EOF.
Fixes: 81c2c9dd5d ("tests/qtest/migration-test: Fix analyze-migration.py for s390x")
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <20250109185249.23952-4-farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
The analyze-migration script was seen failing in s390x in misterious
ways. It seems we're reaching the VMSDFieldStruct constructor without
any fields, which would indicate an empty .subsection entry, a
VMSTATE_STRUCT with no fields or a vmsd with no fields. We don't have
any of those, at least not without the unmigratable flag set, so this
should never happen.
Add some debug statements so that we can see what's going on the next
time the issue happens.
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <20250109185249.23952-2-farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
This patch proposes a flag to maintain disk activation status globally. It
mostly rewrites disk activation mgmt for QEMU, including COLO and QMP
command xen_save_devices_state.
Backgrounds
===========
We have two problems on disk activations, one resolved, one not.
Problem 1: disk activation recover (for switchover interruptions)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
When migration is either cancelled or failed during switchover, especially
when after the disks are inactivated, QEMU needs to remember re-activate
the disks again before vm starts.
It used to be done separately in two paths: one in qmp_migrate_cancel(),
the other one in the failure path of migration_completion().
It used to be fixed in different commits, all over the places in QEMU. So
these are the relevant changes I saw, I'm not sure if it's complete list:
- In 2016, commit fe904ea824 ("migration: regain control of images when
migration fails to complete")
- In 2017, commit 1d2acc3162 ("migration: re-active images while migration
been canceled after inactive them")
- In 2023, commit 6dab4c93ec ("migration: Attempt disk reactivation in
more failure scenarios")
Now since we have a slightly better picture maybe we can unify the
reactivation in a single path.
One side benefit of doing so is, we can move the disk operation outside QMP
command "migrate_cancel". It's possible that in the future we may want to
make "migrate_cancel" be OOB-compatible, while that requires the command
doesn't need BQL in the first place. This will already do that and make
migrate_cancel command lightweight.
Problem 2: disk invalidation on top of invalidated disks
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This is an unresolved bug for current QEMU. Link in "Resolves:" at the
end. It turns out besides the src switchover phase (problem 1 above), QEMU
also needs to remember block activation on destination.
Consider two continuous migration in a row, where the VM was always paused.
In that scenario, the disks are not activated even until migration
completed in the 1st round. When the 2nd round starts, if QEMU doesn't
know the status of the disks, it needs to try inactivate the disk again.
Here the issue is the block layer API bdrv_inactivate_all() will crash a
QEMU if invoked on already inactive disks for the 2nd migration. For
detail, see the bug link at the end.
Implementation
==============
This patch proposes to maintain disk activation with a global flag, so we
know:
- If we used to inactivate disks for migration, but migration got
cancelled, or failed, QEMU will know it should reactivate the disks.
- On incoming side, if the disks are never activated but then another
migration is triggered, QEMU should be able to tell that inactivate is
not needed for the 2nd migration.
We used to have disk_inactive, but it only solves the 1st issue, not the
2nd. Also, it's done in completely separate paths so it's extremely hard
to follow either how the flag changes, or the duration that the flag is
valid, and when we will reactivate the disks.
Convert the existing disk_inactive flag into that global flag (also invert
its naming), and maintain the disk activation status for the whole
lifecycle of qemu. That includes the incoming QEMU.
Put both of the error cases of source migration (failure, cancelled)
together into migration_iteration_finish(), which will be invoked for
either of the scenario. So from that part QEMU should behave the same as
before. However with such global maintenance on disk activation status, we
not only cleanup quite a few temporary paths that we try to maintain the
disk activation status (e.g. in postcopy code), meanwhile it fixes the
crash for problem 2 in one shot.
For freshly started QEMU, the flag is initialized to TRUE showing that the
QEMU owns the disks by default.
For incoming migrated QEMU, the flag will be initialized to FALSE once and
for all showing that the dest QEMU doesn't own the disks until switchover.
That is guaranteed by the "once" variable.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2395
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Message-Id: <20241206230838.1111496-7-peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Src QEMU sets block_inactive=true very early before the invalidation takes
place. It means if something wrong happened during setting the flag but
before reaching qemu_savevm_state_complete_precopy_non_iterable() where it
did the invalidation work, it'll make block_inactive flag inconsistent.
For example, think about when qemu_savevm_state_complete_precopy_iterable()
can fail: it will have block_inactive set to true even if all block drives
are active.
Fix that by only update the flag after the invalidation is done.
No Fixes for any commit, because it's not an issue if bdrv_activate_all()
is re-entrant upon all-active disks - false positive block_inactive can
bring nothing more than "trying to active the blocks but they're already
active". However let's still do it right to avoid the inconsistent flag
v.s. reality.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Message-Id: <20241206230838.1111496-6-peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Postcopy never cared about late-block-active. However there's no mention
in the capability that it doesn't apply to postcopy.
Considering that we _assumed_ late activation is always good, do that too
for postcopy unconditionally, just like precopy. After this patch, we
should have unified the behavior across all.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Message-Id: <20241206230838.1111496-5-peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Migration capability 'late-block-active' controls when the block drives
will be activated. If enabled, block drives will only be activated until
VM starts, either src runstate was "live" (RUNNING, or SUSPENDED), or it'll
be postponed until qmp_cont().
Let's do this unconditionally. There's no harm to delay activation of
block drives. Meanwhile there's no ABI breakage if dest does it, because
src QEMU has nothing to do with it, so it's no concern on ABI breakage.
IIUC we could avoid introducing this cap when introducing it before, but
now it's still not too late to just always do it. Cap now prone to
removal, but it'll be for later patches.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Message-Id: <20241206230838.1111496-4-peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
As the comment says, the activation of disks is for the case where
migration has completed, rather than when QEMU is still during
migration (RUN_STATE_INMIGRATE).
Move the code over to reflect what the comment is describing.
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Message-Id: <20241206230838.1111496-3-peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>