Commit graph

119183 commits

Author SHA1 Message Date
Michael Tokarev
f141caa270 net/slirp: libslirp 4.9.0 compatibility
Update the code in net/slirp.c to be compatible with
libslirp 4.9.0, which deprecated slirp_pollfds_fill()
and started using slirp_os_socket type for sockets
(which is a 64-bit integer on win64) for all callbacks
starting with version 6 of the interface.

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Message-ID: <20250130123253.864681-1-mjt@tls.msk.ru>
[thuth: Added some spaces to make checkpatch.pl happy]
Signed-off-by: Thomas Huth <thuth@redhat.com>
2025-01-30 14:01:46 +01:00
Helge Deller
75f73d5af1 target/hppa: 64-bit CPUs start with space register hashing enabled
Turn on space register hashing for 64-bit CPUs when reset.

Signed-off-by: Helge Deller <deller@gmx.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2025-01-30 13:37:20 +01:00
Helge Deller
f7aa7fa97c target/hppa: Add instruction decoding for mfdiag and mtdiag
Add 32- and 64-bit instruction decoding of the mfdiag and mtdiag
instructions which modify the diagnose registers.

Signed-off-by: Helge Deller <deller@gmx.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2025-01-30 13:37:20 +01:00
Helge Deller
009e0927f3 target/hppa: Drop diag_getshadowregs_pa2 and diag_putshadowregs_pa2
diag_getshadowregs_pa2() and diag_putshadowregs_pa2() were added in
commit 3bdf20819e based on some analysis of ODE code, but now they
conflict with the generic mfdiag/mtdiag instructions. I believe the
former analysis was wrong, so remove them again. Note that all diag
instructions are badly documented, so most things are based on reverse
engineering and thus may be wrong.

Signed-off-by: Helge Deller <deller@gmx.de>
Fixes: 3bdf20819e ("target/hppa: Add diag instructions to set/restore shadow registers")
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2025-01-30 13:37:20 +01:00
Helge Deller
8f2a1c5926 target/hppa: Add CPU diagnose registers
Add the diagnose registers (%dr) to the CPUArchState. Those are mostly
undocumented and control cache behaviour, memory behaviour, reset button
management and many other related internal CPU things.

Signed-off-by: Helge Deller <deller@gmx.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2025-01-30 13:37:20 +01:00
Helge Deller
bf60e2a72a disas/hppa: implement mfdiag/mtdiag disassembly
The various PA-RISC CPUs implement different CPU-specific diag
instructions (mfdiag, mtdiag, mfcpu, mtcpu, ...) to access CPU-internal
diagnose/configuration registers, e.g. for cache control, managing space
register hashing, control front panel LEDs and read status of the
hardware reset button.

Those instructions are mostly undocumented, but are used by ODE, HP-UX
and Linux.

This patch adds some neccessary instructions for PCXL and PCXU CPUs.

Signed-off-by: Helge Deller <deller@gmx.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2025-01-30 13:37:20 +01:00
Helge Deller
3f8c3d7bf6 hppa: Sync contents of hppa_hardware.h header file with SeaBIOS-hppa
The hppa_hardware.h header file holds many constants for addresses and
offsets which are needed while building the firmware (SeaBIOS-hppa) and
while setting up the virtual machine in QEMU.

This patch brings it in sync between both source code repositories.

Signed-off-by: Helge Deller <deller@gmx.de>
Acked-by: Richard Henderson <richard.henderson@linaro.org>
2025-01-30 13:37:20 +01:00
Helge Deller
2177d0c177 MAINTAINERS: Add myself as HPPA maintainer
Since I contribute quite some code to hppa, I'd like to step up and
become the secondary maintainer for HPPA beside Richard.
Additionally change status of hppa machines to maintained as I will
take care of them.

Signed-off-by: Helge Deller <deller@gmx.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2025-01-30 13:37:14 +01:00
Philippe Mathieu-Daudé
6a784f1200 hw/i386/pc: Remove unused pc_compat_2_3 declarations
We removed the implementations in commit 46a2bd5257
("hw/i386/pc: Remove deprecated pc-i440fx-2.3 machine")
but forgot to remove the declarations. Do it now.

Fixes: 46a2bd5257 ("hw/i386/pc: Remove deprecated pc-i440fx-2.3 machine")
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2025-01-30 13:01:22 +03:00
Philippe Mathieu-Daudé
d6f7642230 licenses: Remove SPDX tags not being license identifier for Linaro
Per [*]:

  "we're only interested in adopting SPDX for recording the
  licensing info, [not] any other SPDX metadata."

Replace the 'SPDX-FileCopyrightText' and 'SPDX-FileContributor'
tags added by Linaro by 'Copyright (c)' and 'Authors' words
respectively.

[*] https://lore.kernel.org/qemu-devel/20241007154548.1144961-4-berrange@redhat.com/

Inspired-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2025-01-30 13:01:22 +03:00
Thomas Huth
394388dfdb tests/functional/test_mips_malta: Fix comment about endianness of the test
This test is for the big endian MIPS target, not for the little endian
target.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Fixes: 79cb4a14cb ("tests/functional: Convert mips32eb 4Kc Malta tests")
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2025-01-30 13:01:22 +03:00
Dominik 'Disconnect3d' Czarnota
8b647bd352 gdbstub/user-target: fix gdbserver int format (%d -> %x)
This commit fixes an incorrect format string for formatting integers
provided to GDB when debugging a target run in QEMU user mode.

The correct format is hexadecimal for both success and errno values,
some of which can be seen here [0].

[0] e65a355022/gdbserver/hostio.cc (L196-L213)

Signed-off-by: Dominik 'Disconnect3d' Czarnota <dominik.b.czarnota@gmail.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Fixes: e282010b2e ("gdbstub: Add support for info proc mappings")
Cc: qemu-stable@nongnu.org
Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2025-01-30 13:01:22 +03:00
Michael Tokarev
6003402aa9 vvfat: create_long_filename: fix out-of-bounds array access
create_long_filename() intentionally uses direntry_t->name[8+3] array
as a larger array.  This works, but makes static code analysis tools
unhappy.  The problem here is that a directory entry holding long file
name is significantly different from regular directory entry, and the
name is split into several parts within the entry, not just in regular
8+3 name field.

Treat the entry as array of bytes instead.  This fixes the OOB access
from the compiler/tools PoV, but does not change the resulting code
in any way.

Keep the existing code style.

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2025-01-30 13:01:22 +03:00
Thomas Huth
f194f0cab0 tests/functional/test_mips_malta: Convert the mips big endian replay tests
Move the mips big endian replay tests from tests/avocado/replay_kernel.py
to the functional framework. Since the functional tests should be run per
target, we cannot stick all replay tests in one file. Thus let's add
these tests to a separate file now.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20250128152839.184599-6-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2025-01-30 10:51:24 +01:00
Thomas Huth
f348229ec7 tests/functional/test_mips64el_malta: Convert the mips64el replay tests
Move the mips64el replay tests from tests/avocado/replay_kernel.py to
the functional framework. Since the functional tests should be run per
target, we cannot stick all replay tests in one file. Thus let's add
these tests to a separate file there now.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20250128152839.184599-5-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2025-01-30 10:51:20 +01:00
Thomas Huth
350a998d85 tests/functional/test_mipsel_malta: Convert the mipsel replay tests
Move the mipsel replay tests from tests/avocado/replay_kernel.py to
the functional framework. Since the functional tests should be run per
target, we cannot stick all replay tests in one file. Thus let's add
these tests to a new, separate file there instead.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20250128152839.184599-4-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2025-01-30 10:51:17 +01:00
Thomas Huth
a2fe7bb751 tests/functional: Add the ReplayKernelBase class
Copy the ReplayKernelBase class from the avocado tests. We are going
to need it to convert the related replay tests in the following patches.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20250128152839.184599-3-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2025-01-30 10:51:12 +01:00
Thomas Huth
4ae633b012 tests/functional: Add a decorator for skipping long running tests
Some tests have a very long runtime and might run into timeout issues
e.g. when QEMU has been compiled with --enable-debug. Add a decorator
for marking them more easily. Rename the corresponding environment
variable to be more in sync with the other QEMU_TEST_ALLOW_* switches
that we already have, and add a paragraph about it in the documentation.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20250128152839.184599-2-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2025-01-30 10:50:56 +01:00
Laurent Vivier
c6a1b591a6 net/dump: Correctly compute Ethernet packet offset
When a packet is sent with QEMU_NET_PACKET_FLAG_RAW by QEMU it
never includes virtio-net header even if qemu_get_vnet_hdr_len()
is not 0, and filter-dump is not managing this case.

The only user of QEMU_NET_PACKET_FLAG_RAW is announce_self,
we can show the problem using it and tcpddump:

- QEMU parameters:

  .. -monitor stdio \
     -netdev bridge,id=netdev0,br=virbr0 \
     -device virtio-net,mac=9a:2b:2c:2d:2e:2f,netdev=netdev0 \
     -object filter-dump,netdev=netdev0,file=log.pcap,id=pcap0

- HMP command:

  (qemu) announce_self

- TCP dump:

  $ tcpdump -nxr log.pcap

  without the fix:

    08:00:06:04:00:03 > 2e:2f:80:35:00:01, ethertype Unknown (0x9a2b), length 50:
         0x0000:  2c2d 2e2f 0000 0000 9a2b 2c2d 2e2f 0000
         0x0010:  0000 0000 0000 0000 0000 0000 0000 0000
         0x0020:  0000 0000

  with the fix:

    ARP, Reverse Request who-is 9a:2b:2c:2d:2e:2f tell 9a:2b:2c:2d:2e:2f, length 46
         0x0000:  0001 0800 0604 0003 9a2b 2c2d 2e2f 0000
         0x0010:  0000 9a2b 2c2d 2e2f 0000 0000 0000 0000
         0x0020:  0000 0000 0000 0000 0000 0000 0000

Fixes: 481c52320a ("net: Strip virtio-net header when dumping")
Cc: akihiko.odaki@daynix.com
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2025-01-30 12:50:07 +03:00
Laurent Vivier
84dfdcbff3 net: Fix announce_self
b9ad513e18 ("net: Remove receive_raw()") adds an iovec entry
in qemu_deliver_packet_iov() to add the virtio-net header
in the data when QEMU_NET_PACKET_FLAG_RAW is set but forgets
to increase the number of iovec entries in the array, so
receive_iov() will only send the first entry (the virtio-net
entry, full of 0) and no data. The packet will be discarded.

The only user of QEMU_NET_PACKET_FLAG_RAW is announce_self.

We can see the problem with tcpdump:

- QEMU parameters:

  .. -monitor stdio \
     -netdev bridge,id=netdev0,br=virbr0 \
     -device virtio-net,mac=9a:2b:2c:2d:2e:2f,netdev=netdev0 \

- HMP command:

  (qemu) announce_self

- TCP dump:

  $ sudo tcpdump -nxi virbr0

  without the fix:

    <nothing>

  with the fix:

   ARP, Reverse Request who-is 9a:2b:2c:2d:2e:2f tell 9a:2b:2c:2d:2e:2f, length 46
        0x0000:  0001 0800 0604 0003 9a2b 2c2d 2e2f 0000
        0x0010:  0000 9a2b 2c2d 2e2f 0000 0000 0000 0000
        0x0020:  0000 0000 0000 0000 0000 0000 0000

Reported-by: Xiaohui Li <xiaohli@redhat.com>
Bug: https://issues.redhat.com/browse/RHEL-73891
Fixes: b9ad513e18 ("net: Remove receive_raw()")
Cc: akihiko.odaki@daynix.com
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2025-01-30 12:50:07 +03:00
Cédric Le Goater
fc9fea48be tests/functional: Extend PPC 40p test with Linux boot
Fetch the cdrom image for the IBM 6015 PReP PowerPC machine hosted on
the Juneau Linux Users Group site, boot and check Linux version.

Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Message-ID: <20250129104844.1322100-1-clg@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2025-01-30 10:39:37 +01:00
David Hildenbrand
d77ae821e8 s390x/s390-virtio-ccw: Support plugging PCI-based virtio memory devices
Let's just wire it up, unlocking virtio-mem-pci support on s390x.

While at it, drop the "return;" in s390_machine_device_unplug_request(),
to make it look like the other handlers.

Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-ID: <20250128185705.1609038-3-david@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2025-01-30 10:39:37 +01:00
David Hildenbrand
03248e714b virtio-mem-pci: Allow setting nvectors, so we can use MSI-X
Let's do it similar as virtio-balloon-pci. With this change, we can
use virtio-mem-pci on s390x, although plugging will still fail until
properly wired up in the machine.

No need to worry about transitional/non_transitional devices, because they
don't exist for virtio-mem.

Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20250128185705.1609038-2-david@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2025-01-30 10:39:37 +01:00
Reza Arbab
fe638ae67b virtio-balloon-pci: Allow setting nvectors, so we can use MSI-X
Most virtio-pci devices allow MSI-X. Add it to virtio-balloon-pci, but
only enable it in new machine types, so we don't break migration of
existing machine types between different qemu versions.

This copies what was done for virtio-rng-pci in:
9ea02e8f13 ("virtio-rng-pci: Allow setting nvectors, so we can use MSI-X")
bad9c5a516 ("virtio-rng-pci: fix migration compat for vectors")
62bdb88715 ("virtio-rng-pci: fix transitional migration compat for vectors")

Acked-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
Tested-by: Mario Casquero <mcasquer@redhat.com>
Message-ID: <20250115161425.246348-1-arbab@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2025-01-30 10:39:37 +01:00
Ilya Leoshkevich
a4cda3f5df hw/s390x/s390-virtio-ccw: Fix a record/replay deadlock
Booting an s390x VM in record/replay mode hangs due to a deadlock
between rr_cpu_thread_fn() and s390_machine_reset(). The former needs
the record/replay mutex held by the latter, and the latter waits until
the former completes its run_on_cpu() request.

Fix by temporarily dropping the record/replay mutex, like it's done in
pause_all_vcpus().

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-ID: <20250124112625.23050-1-iii@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2025-01-30 10:39:37 +01:00
Ilya Leoshkevich
b497b0376c tests/tcg/s390x: Test modifying code using the MVC instruction
Add a small test to prevent regressions.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20250128001338.11474-2-iii@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2025-01-30 10:39:37 +01:00
Ilya Leoshkevich
e43ced8be1 target/s390x: Fix MVC not always invalidating translation blocks
Node.js crashes in qemu-system-s390x with random SIGSEGVs / SIGILLs.

The v8 JIT used by Node.js can garbage collect and overwrite unused
code. Overwriting is performed by WritableJitAllocation::CopyCode(),
which ultimately calls memcpy(). For certain sizes, memcpy() uses the
MVC instruction.

QEMU implements MVC and other similar instructions using helpers. While
TCG store ops invalidate affected translation blocks automatically,
helpers must do this manually by calling probe_access_flags(). The MVC
helper does this using the access_prepare() -> access_prepare_nf() ->
s390_probe_access() -> probe_access_flags() call chain.

At the last step of this chain, the store size is replaced with 0. This
causes the probe_access_flags() -> notdirty_write() ->
tb_invalidate_phys_range_fast() chain to miss some translation blocks.

When this happens, QEMU executes a mix of old and new code. This
quickly leads to either a SIGSEGV or a SIGILL in case the old code
ends in the middle of a new instruction.

Fix by passing the true size.

Reported-by: Berthold Gunreben <azouhr@opensuse.org>
Cc: Sarah Kriesch <ada.lovelace@gmx.de>
Cc: qemu-stable@nongnu.org
Closes: https://bugzilla.opensuse.org/show_bug.cgi?id=1235709
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Fixes: e2faabee78 ("accel/tcg: Forward probe size on to notdirty_write")
Message-ID: <20250128001338.11474-1-iii@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2025-01-30 10:39:37 +01:00
Ilya Leoshkevich
ba68dd7d47 target/s390x: Fix PPNO execution with icount
Executing PERFORM RANDOM NUMBER OPERATION makes QEMU exit with "Bad
icount read" when using record/replay. This is caused by
icount_get_raw_locked() if the current instruction is not the last one
in the respective translation block.

For the x86_64's rdrand this is resolved by calling
translator_io_start(). On s390x one uses IF_IO in order to make this
call happen automatically.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20250123123808.194405-1-iii@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2025-01-30 10:39:37 +01:00
Thomas Huth
c8b2deb921 tests/functional/test_mips_malta: Fix comment about endianness of the test
This test is for the big endian MIPS target, not for the little endian
target.

Fixes: 79cb4a14cb ("tests/functional: Convert mips32eb 4Kc Malta tests")
Message-ID: <20250127184112.108122-1-thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2025-01-30 10:39:37 +01:00
Cédric Le Goater
5e65408624 tests/functional: Add a ppc64 mac99 test
The test sequence boots from disk a mac99 machine in 64-bit mode, in
which case the CPU is a PPC 970.

The buildroot rootfs is built with config :

BR2_powerpc64=y
BR2_powerpc_970=y

and the kernel with the g5 deconfig.

Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Message-ID: <20250128212145.1186617-1-clg@redhat.com>
[thuth: Adjusted the comment about '-nographic]
Signed-off-by: Thomas Huth <thuth@redhat.com>
2025-01-30 10:38:51 +01:00
Thomas Huth
156ee8b812 tests/functional: Fix the aarch64_tcg_plugins test
Unfortunately, this test had not been added to meson.build, so we did
not notice a regression: Looking for 'Kernel panic - not syncing: VFS:'
as the indication for the final boot state of the kernel was a bad
idea since 'Kernel panic - not syncing' is the default failure
message of the LinuxKernelTest class, and since we're now reading
the console input byte by byte instead of linewise (see commit
cdad03b74f), the failure now triggers before we fully read the
success string. Let's fix this by simply looking for the previous
line in the console output instead.

Also, replace the call to cancel() - this was only available in the
Avocado framework. In the functional framework, we must use skipTest()
instead. While we're at it, also fix the TODO here by looking for the
exact error and only skip the test if the plugins are not available.

Fixes: 3abc545e66 ("tests/functional: Convert the tcg_plugins test")
Fixes: cdad03b74f ("tests/functional: rewrite console handling to be bytewise")
Message-ID: <20250123083625.1498495-1-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2025-01-30 10:30:36 +01:00
Thomas Huth
092fd6486d tests/functional: Convert the migration avocado test
Now that we've got a find_free_port() function in the functional
test framework, we can convert the migration test, too.
While the original avocado test was only meant to run on aarch64,
ppc64 and x86, we can turn this into a more generic test by now
and run it on all architectures that have a machine which ships
with a working firmware. To avoid overlapping with the migration
qtest, we now also test migration on machines that are not covered
by the migration qtest yet.

Acked-by: Fabiano Rosas <farosas@suse.de>
Message-ID: <20250103074308.463860-1-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2025-01-30 10:30:36 +01:00
Thomas Huth
257de641e0 tests/functional: Fix broken decorators with lamda functions
The decorators that use a lambda function are currently broken
and do not properly skip the test if the condition is not met.
Using "return skipUnless(lambda: ...)" does not work as expected.
To fix it, rewrite the decorators without lambda, it's simpler
that way anyway.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20250122134315.1448794-3-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2025-01-30 10:30:36 +01:00
Thomas Huth
aca2c48e4d tests/functional/qemu_test/decorators: Fix bad check for imports
skipIfMissingImports should use importlib.import_module() for checking
whether a module with the name stored in the "impname" variable is
available or not, otherwise the code tries to import a module with
the name "impname" instead.
(This bug hasn't been noticed before since there is another issue
with this decorator that will be fixed by the next patch)

Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20250122134315.1448794-2-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2025-01-30 10:30:36 +01:00
Prasad Pandit
bc38dc2f5f migration: refactor ram_save_target_page functions
Refactor ram_save_target_page legacy and multifd
functions into one. Other than simplifying it,
it frees 'migration_ops' object from usage, so it
is expunged.

Signed-off-by: Prasad Pandit <pjp@fedoraproject.org>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Message-ID: <20250127120823.144949-3-ppandit@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2025-01-29 11:56:42 -03:00
Peter Xu
a10b37c553 migration: Trivial cleanup on JSON writer of vmstate_save()
Two small cleanups in the same section of vmstate_save():

  - Check vmdesc before the "mixed null/non-null data in array" logic, to
  be crystal clear that it's only about the JSON writer, not the vmstate on
  its own in the migration stream.

  - Since we have is_null variable now, use that to replace a check.

Signed-off-by: Peter Xu <peterx@redhat.com>
Tested-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Juraj Marcin <jmarcin@redhat.com>
Link: https://lore.kernel.org/r/20250114230746.3268797-17-peterx@redhat.com
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2025-01-29 11:56:41 -03:00
Peter Xu
3dde8fdbad migration: Merge precopy/postcopy on switchover start
Now after all the cleanups, finally we can merge the switchover startup
phase into one single function for precopy/postcopy.

Signed-off-by: Peter Xu <peterx@redhat.com>
Tested-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Juraj Marcin <jmarcin@redhat.com>
Link: https://lore.kernel.org/r/20250114230746.3268797-16-peterx@redhat.com
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2025-01-29 11:56:41 -03:00
Peter Xu
4881411136 migration: Always set DEVICE state
DEVICE state was introduced back in 2017:

https://lore.kernel.org/qemu-devel/20171020090556.18631-1-dgilbert@redhat.com/

Quote from Dave's cover letter, when the pre-switchover phase was enabled,
the state transition looks like this:

  The precopy flow is:
  active->pre-switchover->device->completed

  The postcopy flow is:
  active->pre-switchover->postcopy-active->completed

To supplement above, when the cap is not enabled:

  The precopy flow is:
  active->completed

  The postcopy flow is:
  active->postcopy-active->completed

It works for us, though we have some code just to special case these state
transitions, so the DEVICE state currently is special only to precopy, and
only conditionally.

I had a quick discussion with Libvirt developers, it turns out that this
may not be necessary. IOW, it seems okay we can have DEVICE state to be
generic, so that we don't have over-complicated state machines.  It not
only helps align all the migration state machine, help cleanup the code
path especially on pre-switchover handling (see the patch itself), another
side benefit is we can unconditionally have a specific state to mark the
switchover phase, which might be helpful for debugging too.

This patch makes the DEVICE state to be present always, marking that source
QEMU is switching over.  Then the state machine will be always as simple
as:

  active-> [pre-switchover->] -> device -> [postcopy-active->] -> complete

After the change, no matter whether pre-switchover or postcopy is enabled
or not, we always have DEVICE state showing the switchover phase.  When
pre-switchover enabled, we'll have an extra stage before that.  When
postcopy is enabled, we'll have an extra stage after that.

A few qtests need touch up in QEMU tree for this change:

  - A few iotest outputs (194, 203, 234, 262, 280)
  - Teach libqos's migrate() on "device" state

Cc: Jiri Denemark <jdenemar@redhat.com>
Cc: Daniel P. Berrangé <berrange@redhat.com>
Cc: Dr. David Alan Gilbert <dave@treblig.org>
Signed-off-by: Peter Xu <peterx@redhat.com>
Tested-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Juraj Marcin <jmarcin@redhat.com>
Link: https://lore.kernel.org/r/20250114230746.3268797-15-peterx@redhat.com
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2025-01-29 11:56:41 -03:00
Peter Xu
455c1963d3 migration: Cleanup qemu_savevm_state_complete_precopy()
Now qemu_savevm_state_complete_precopy() is never used in postcopy, clean
it up as in_postcopy==false now unconditionally.

Signed-off-by: Peter Xu <peterx@redhat.com>
Tested-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Juraj Marcin <jmarcin@redhat.com>
Link: https://lore.kernel.org/r/20250114230746.3268797-14-peterx@redhat.com
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2025-01-29 11:56:41 -03:00
Peter Xu
15c2ffa0b7 migration: Unwrap qemu_savevm_state_complete_precopy() in postcopy
Postcopy invokes qemu_savevm_state_complete_precopy() twice for a long
time, and that caused way too much confusions.  Let's clean this up and
make postcopy easier to read.

It's actually fairly straightforward: postcopy starts with saving
non-postcopiable iterables, then later it saves again with non-iterable
only.  Move these two calls out makes everything much easier to follow.
Otherwise it's very unclear what qemu_savevm_state_complete_precopy() did
in either of the calls.

No functional change intended.

Signed-off-by: Peter Xu <peterx@redhat.com>
Tested-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Juraj Marcin <jmarcin@redhat.com>
Link: https://lore.kernel.org/r/20250114230746.3268797-13-peterx@redhat.com
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2025-01-29 11:56:41 -03:00
Peter Xu
46b0155ecf migration: Notify COMPLETE once for postcopy
Postcopy invokes qemu_savevm_state_complete_precopy() twice, that means
it'll invoke COMPLETE notify twice.. also twice the tracepoints that
marking precopy complete.

Move that notification (along with the tracepoint) out to the caller, so
that postcopy will only notify once right at the start of switchover phase
from precopy.  When at it, rename it to suite the file now it locates.

For precopy, there should have no functional change except the tracepoint
has a name change.

For the other two users of qemu_savevm_state_complete_precopy(), namely:
qemu_savevm_state() and qemu_savevm_live_state(): the notifier shouldn't
matter because they're not precopy at all.  Now in these two contexts (aka,
"savevm", and "colo") sometimes the precopy notifiers will still be
invoked, but that's outside the scope of this patch.

Signed-off-by: Peter Xu <peterx@redhat.com>
Tested-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Juraj Marcin <jmarcin@redhat.com>
Link: https://lore.kernel.org/r/20250114230746.3268797-12-peterx@redhat.com
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2025-01-29 11:56:41 -03:00
Peter Xu
a880ddd8ce migration: Take BQL slightly longer in postcopy_start()
This paves way for some follow up patch to modify migration states at the
end of postcopy_start(), which should better be with the BQL so that
there's no way of concurrent cancellation.

So we'll do something slightly more with BQL but they're really trivial,
hopefully nothing will really chance with this.

A side benefit is we can drop another explicit lock() in failure path.

Signed-off-by: Peter Xu <peterx@redhat.com>
Tested-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Juraj Marcin <jmarcin@redhat.com>
Link: https://lore.kernel.org/r/20250114230746.3268797-11-peterx@redhat.com
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2025-01-29 11:56:41 -03:00
Peter Xu
ec611bd731 migration: Drop cached migration state in migration_maybe_pause()
I can't see why we must cache the state now after we avoided possible
CANCEL race: that's the only thing I can think of that can modify the
migration state concurrently with the migration thread itself.  Make all
the state updates to happen always, then we don't need to cache the state
anymore.

Signed-off-by: Peter Xu <peterx@redhat.com>
Tested-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Juraj Marcin <jmarcin@redhat.com>
Link: https://lore.kernel.org/r/20250114230746.3268797-10-peterx@redhat.com
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2025-01-29 11:56:41 -03:00
Peter Xu
1f9b657cae migration: Adjust locking in migration_maybe_pause()
In migration_maybe_pause() QEMU may yield BQL before waiting for a
semaphore.  However it yields the BQL too early, which logically gives it
chance for the main thread to quickly take the BQL and modify the state to
CANCELLING.

To avoid such race condition from happening at all, always update the
migration states within the BQL.  It'll make sure no concurrent
cancellation can ever happen.

With that, IIUC there's chance we can remove the extra parameter in
migration_maybe_pause() to update active state, but that'll be done
separately later.

Signed-off-by: Peter Xu <peterx@redhat.com>
Tested-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Juraj Marcin <jmarcin@redhat.com>
Link: https://lore.kernel.org/r/20250114230746.3268797-9-peterx@redhat.com
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2025-01-29 11:56:40 -03:00
Peter Xu
40004007e6 migration: Adjust postcopy bandwidth during switchover
Precopy uses unlimited bandwidth always during switchover, it makes sense
because this is so critical and no one would like to throttle bandwidth
during the VM blackout.

OTOH, postcopy surprisingly didn't do that.  There's one line that in the
middle of the postcopy switchover it tries to switch to postcopy's
specified max-postcopy-bandwidth, but even so it's somewhere in the middle
which is strange.

This patch brings the two modes to always use unlimited bandwidth for
switchover, meanwhile only apply the postcopy max bandwidth after the
switchover is completed.

Signed-off-by: Peter Xu <peterx@redhat.com>
Tested-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Juraj Marcin <jmarcin@redhat.com>
Link: https://lore.kernel.org/r/20250114230746.3268797-8-peterx@redhat.com
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2025-01-29 11:56:40 -03:00
Peter Xu
89011a702f migration: Synchronize all CPU states only for non-iterable dump
Do one shot cpu sync at qemu_savevm_state_complete_precopy_non_iterable(),
instead of coding it separately in two places.

Note that in the context of qemu_savevm_state_complete_precopy(), this
patch is also an optimization for postcopy path, in that we can avoid sync
cpu twice during switchover: before this patch, postcopy_start() invokes
twice on qemu_savevm_state_complete_precopy(), each of them will try to
sync CPU info.  In reality, only one of them would be enough.

For background snapshot, there's no intended functional change.

Signed-off-by: Peter Xu <peterx@redhat.com>
Tested-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Juraj Marcin <jmarcin@redhat.com>
Link: https://lore.kernel.org/r/20250114230746.3268797-7-peterx@redhat.com
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2025-01-29 11:56:40 -03:00
Peter Xu
4822128693 migration: Drop inactivate_disk param in qemu_savevm_state_complete*
This parameter is only used by one caller, which is the genuine precopy
complete path (migration_completion_precopy).

The parameter was introduced in a1fbe750fd ("migration: Fix race of image
locking between src and dst") to make sure the inactivate will happen
before EOF to make sure dest will always be able to activate the disk
properly.  However there's no limitation on how early we inactivate the
disk.  For precopy completion path, we can always do that as long as VM is
stopped.

Move the disk inactivate there, then we can remove this inactivate_disk
parameter in the whole call stack, because all the rest users pass in false
always.

Signed-off-by: Peter Xu <peterx@redhat.com>
Tested-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Juraj Marcin <jmarcin@redhat.com>
Link: https://lore.kernel.org/r/20250114230746.3268797-6-peterx@redhat.com
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2025-01-29 11:56:40 -03:00
Peter Xu
812145fcf7 migration: Avoid two src-downtime-end tracepoints for postcopy
Postcopy can trigger this tracepoint twice, while only the 1st one is
valid.  Avoid triggering the 2nd tracepoint just like what we do with
recording the total downtime.

Signed-off-by: Peter Xu <peterx@redhat.com>
Tested-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Juraj Marcin <jmarcin@redhat.com>
Link: https://lore.kernel.org/r/20250114230746.3268797-5-peterx@redhat.com
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2025-01-29 11:56:40 -03:00
Peter Xu
9cde9b435a migration: Optimize postcopy on downtime by avoiding JSON writer
postcopy_start() is the entry function that postcopy is destined to start.
It also means QEMU source will not dump VM description, aka, the JSON
writer is garbage now.

We can leave that to be cleaned up when migration completes, however when
with the JSON writer object being present, vmstate_save() will still try to
construct the JSON objects for the VM descriptions, even though it'll never
be used later if it's postcopy.

To save those cycles, release the JSON writer earlier for postcopy. Then
vmstate_save() later will be smart enough to skip the JSON object
constructions completely.  It can logically reduce downtime because all
such JSON constructions happen during postcopy blackout.

Signed-off-by: Peter Xu <peterx@redhat.com>
Tested-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Juraj Marcin <jmarcin@redhat.com>
Link: https://lore.kernel.org/r/20250114230746.3268797-4-peterx@redhat.com
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2025-01-29 11:56:40 -03:00
Peter Xu
a55090db2a migration: Do not construct JSON description if suppressed
QEMU machine has a property "suppress-vmdesc". When it is enabled, QEMU
will stop attaching JSON VM description at the end of the precopy migration
stream (postcopy is never affected because postcopy never attach that).

However even if it's suppressed by the user, the source QEMU will still
construct the JSON descriptions, which is a complete waste of CPU and
memory resources.

To avoid it, only create the JSON writer object if suppress-vmdesc is not
specified.

Luckily, vmstate_save() already supports vmdesc==NULL, so only a few spots
that are left to be prepared that vmdesc can be NULL now.

When at it, move the init / destroy of the JSON writer object to start /
end of the migration - the JSON writer object is a sub-struct of migration
state, and that looks like the only object that was dynamically allocated /
destroyed within migration process.  Make it the same as the rest objects
that migration uses.

Signed-off-by: Peter Xu <peterx@redhat.com>
Tested-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Juraj Marcin <jmarcin@redhat.com>
Link: https://lore.kernel.org/r/20250114230746.3268797-3-peterx@redhat.com
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2025-01-29 11:56:40 -03:00