Move the remaining tests into a misc-tests.c file. These tests are
mostly about validation of input and should be in the future replaced
by unit testing.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
This test is already starting to bitrot, so first remove it from ifdef
and fix compile issues. ppc64 transfers about 2MB, so bump the size
threshold too.
It was said to be broken on aarch64 but it may have been due to the
limited shm size under Gitlab CI.
Cc: Yury Kotov <yury-kotov@yandex-team.ru>
Cc: Dr. David Alan Gilbert <dave@treblig.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Dr. David Alan Gilbert <dave@treblig.org>
[put it under flaky tests, we still don't have a solution for the CI]
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Move the mode/reboot test into a separate file to hold all the CPR
tests. Currently there's just one test, but we're adding more CPR
modes and the feature is different enough from live migration that
it's worth it to have a separate file for it.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Split the precopy tests from migration-test.c. This is the largest
group of tests and the more difficult one to break into smaller
groups, so move all of it.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Split the file tests from migration-test.c. These are being moved to
their own file due to being special enough compared with the regular
stream migration. There is also the entire mapped-ram feature which
depends on file migration.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Split the next group of tests from migration-test.c, the postcopy
tests. This is another well-defined group of tests and postcopy is a
unique enough feature that it deserves it's own file.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Continuing the split of groups of tests from migration-test.c, split
the compression tests into their own file.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
The migration-test.c file has become unwieldy large. It's quite
confusing to navigate with all the test definitions mixed with hook
definitions. The TLS tests make this worse with ifdef'ery.
Since we're planning on having a smaller set of tests to run as smoke
testing on all architectures, I'm taking the time to split some tests
into their own file.
Move the TLS tests into a file of their own.
Acked-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
The migration tests have a set of core infrastructure routines. These
are functions that are called by (almost) all tests and centralize the
common operations of: starting migration on both sides, waiting for
guests to boot, performing guest initialization and teardown, guest
memory validation, etc.
Move this basic framework code (and a few static helpers) into a
separate file. Leave only individual test functions (and their own
static helpers) in migration-test.c.
Acked-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
We currently have some environment validation to perform and flags to
set during the initialization of the tests. To be able to add more
migration test binaries, we'll need these tasks to be in their own
function so they can be called from more than one place.
Move the initialization code to a function and introduce the
MigrationTestEnv structure to hold the flags that are accessed during
test registration.
Make the env object static to avoid have to change all the code to
pass it around. Similarly with the tmpfs variable, which is used
extensively.
Note: I'm keeping the new functions in migration-test.c because they
are going to be moved in the next patch to the correct place.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Move kvm_dirty_ring_supported() to migration-util.c. Similarly to
ufd_version_check(), this function is used during test definition so
put it in the utils file.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Move ufd_version_check() to migration-util.c file. This is a helper
function that is used during tests definition so it should be
available outside of migration-test.c
Since the function is moving to another file, change it to stop
accessing the global uffd_feature_thread_id and take it as argument
instead.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Rename migration-helpers.c to migration-util.c to make its purpose
more explicit and avoid the "helper" terminology.
Move the file to the qtest/migration/ directory along with the rest of
the migration files.
Signed-off-by: Fabiano Rosas <farosas@suse.de>
We current have a bunch of non-test functions in migration-test.c and
some others in migration-helpers.c. In order to split migration-test.c
into separate test binaries, these helpers need to go somewhere
else.
To avoid making migration-helpers even larger, move all QMP-related
functions into a new migration-qmp.c file and put it under the
qtest/migration/ directory.
The new file holds everything that has as its main responsibility to
call into QMP.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Move the code that creates the guest binary out of migration-test and
into the qtest/migration/ directory, along with the rest of the
a-b-kernel code.
That code is part of the basic infrastructure of migration tests, it
shouldn't be among the tests themselves.
Also take the chance and rename migration-test.h, which is too generic
a name for this header which contains only values related to guest
memory offsets.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
The current build structure for migration tests is confusing. There is
the tests/migration directory, which contains two different guest code
implementations, one for the qtests (a-b-{bootblock|kernel}.S) and
another for the guestperf script (stress.c). One uses a Makefile,
while the other uses meson.
The next patches will add a new qtests/migration/ directory to hold
qtest code which will make the situation even more confusing.
Move the guest code used by qtests into a new qtests/migration/
directory and rename the old one to tests/migration-stress.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Test frameworks usually prefix "test_" to the entry point of the test
code. Having every function prefixed with test_ makes it hard to
understand the code and to grep for the actual tests.
Remove the "test" prefix from everything that is not a test.
In order to still keep some namespacing, stick to the "migrate_"
prefix, which is the most used currently.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Standardize the hook names:
- change the names to .start|end_hook to match
test_migrate_start|end()
- use the migrate_hook_start_ and migrate_hook_end_ prefixes
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Select all the code and hit tab. I'll be moving functions around quite
a lot in the next patches, so make sure all indentation is correct
now.
Add parentheses around some expressions to preserve readability.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Message-ID: <20241113194630.3385-2-farosas@suse.de>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
The source QEMU might not finish the cancellation of the migration
before we start setting up the next attempt. During the setup, the
test_migrate_start() function and others might need to interact with the
source in a way that is not possible unless the migration is fully
canceled. For example, setting capabilities when the migration is still
running leads to an error.
By moving the wait before the setup, we ensure this does not happen.
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: Juraj Marcin <jmarcin@redhat.com>
Link: https://lore.kernel.org/r/20240920161319.2337625-1-jmarcin@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
The xbzrel and vcpu_dirty_limit are the two slowest tests from
migration-test. Move them under g_test_slow() to save about 40s per
run.
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240911145204.17692-1-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
The test_vcpu_dirty_limit is the only test that does not check for the
availability of the machine before starting the test, so it fails when
QEMU has been configured with --without-default-devices. Add a check for
the "pc" machine type to fix it.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
In test_multifd_tcp_cancel() we create three QEMU processes: 'from',
'to' and 'to2'. We clean up (via qtest_quit()) 'from' and 'to2' when
we call test_migrate_end(), but never clean up 'to', which results in
this leak:
Direct leak of 336 byte(s) in 1 object(s) allocated from:
#0 0x55e984fcd328 in __interceptor_calloc (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/tests/qtest/migration-test+0x22f328) (BuildId: 710d409b68bb04427009e9ca6e1b63ff8af785d3)
#1 0x7f0878b39c50 in g_malloc0 debian/build/deb/../../../glib/gmem.c:161:13
#2 0x55e98503a172 in qtest_spawn_qemu tests/qtest/libqtest.c:397:21
#3 0x55e98502bc4a in qtest_init_internal tests/qtest/libqtest.c:471:9
#4 0x55e98502c5b7 in qtest_init_with_env tests/qtest/libqtest.c:533:21
#5 0x55e9850eef0f in test_migrate_start tests/qtest/migration-test.c:857:11
#6 0x55e9850eb01d in test_multifd_tcp_cancel tests/qtest/migration-test.c:3297:9
#7 0x55e985103407 in migration_test_wrapper tests/qtest/migration-helpers.c:456:5
Call qtest_quit() on 'to' to clean it up once it has exited.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
We g_strdup() the "status" string we get out of the qdict in
get_dirty_rate(), but we never free it. Since we only use this
string while the dictionary is still valid, we don't need to strdup
at all; drop the unnecessary call to avoid this leak:
Direct leak of 18 byte(s) in 2 object(s) allocated from:
#0 0x564b3e01913e in malloc (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/tests/qtest/migration-test+0x22f13e) (BuildId: d6403a811332fcc846f93c45e23abfd06d1e67c4)
#1 0x7f2f278ff738 in g_malloc debian/build/deb/../../../glib/gmem.c:128:13
#2 0x7f2f27914583 in g_strdup debian/build/deb/../../../glib/gstrfuncs.c:361:17
#3 0x564b3e14bb5b in get_dirty_rate tests/qtest/migration-test.c:3447:14
#4 0x564b3e138e00 in test_vcpu_dirty_limit tests/qtest/migration-test.c:3565:16
#5 0x564b3e14f417 in migration_test_wrapper tests/qtest/migration-helpers.c:456:5
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
In the migration test we create several TLS certificates with
the TLS_* macros from crypto-tls-x509-helpers.h. These macros
create both a QCryptoTLSCertReq object which must be deinitialized
and also an on-disk certificate file. The migration test currently
removes the on-disk file in test_migrate_tls_x509_finish() but
never deinitializes the QCryptoTLSCertReq, which means that memory
allocated as part of it is leaked:
Indirect leak of 2 byte(s) in 1 object(s) allocated from:
#0 0x5558ba33712e in malloc (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/tests/qtest/migration-test+0x22f12e) (BuildId: 4c8618f663e538538cad19d35233124cea161491)
#1 0x7f64afc131f4 (/lib/x86_64-linux-gnu/libtasn1.so.6+0x81f4) (BuildId: 2fde6ecb43c586fe4077118f771077aa1298e7ea)
#2 0x7f64afc18d58 in asn1_write_value (/lib/x86_64-linux-gnu/libtasn1.so.6+0xdd58) (BuildId: 2fde6ecb43c586fe4077118f771077aa1298e7ea)
#3 0x7f64af8fc678 in gnutls_x509_crt_set_version (/lib/x86_64-linux-gnu/libgnutls.so.30+0xe7678) (BuildId: 97b8f99f392f1fd37b969a7164bcea884e23649b)
#4 0x5558ba470035 in test_tls_generate_cert tests/unit/crypto-tls-x509-helpers.c:234:5
#5 0x5558ba464e4a in test_migrate_tls_x509_start_common tests/qtest/migration-test.c:1058:5
#6 0x5558ba462c8a in test_migrate_tls_x509_start_default_host tests/qtest/migration-test.c:1123:12
#7 0x5558ba45ab40 in test_precopy_common tests/qtest/migration-test.c:1786:21
#8 0x5558ba450015 in test_precopy_unix_tls_x509_default_host tests/qtest/migration-test.c:2077:5
#9 0x5558ba46d3c7 in migration_test_wrapper tests/qtest/migration-helpers.c:456:5
(and similar reports).
The only function currently provided to deinit a QCryptoTLSCertReq is
test_tls_discard_cert(), which also removes the on-disk certificate
file. For the migration tests we need to retain the on-disk files
until we've finished running the test, so the simplest fix is to
provide a new function test_tls_deinit_cert() which does only the
cleanup of the QCryptoTLSCertReq, and call it in the right places.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
In calc_dirtyrate_ready() we g_strdup() a string but then never free it:
Direct leak of 19 byte(s) in 2 object(s) allocated from:
#0 0x55ead613413e in malloc (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/tests/qtest/migration-test+0x22f13e) (BuildId: e7cd5c37b2987a1af682b43ee5240b98bb316737)
#1 0x7f7a13d39738 in g_malloc debian/build/deb/../../../glib/gmem.c:128:13
#2 0x7f7a13d4e583 in g_strdup debian/build/deb/../../../glib/gstrfuncs.c:361:17
#3 0x55ead6266f48 in calc_dirtyrate_ready tests/qtest/migration-test.c:3409:14
#4 0x55ead62669fe in wait_for_calc_dirtyrate_complete tests/qtest/migration-test.c:3422:13
#5 0x55ead6253df7 in test_vcpu_dirty_limit tests/qtest/migration-test.c:3562:9
#6 0x55ead626a407 in migration_test_wrapper tests/qtest/migration-helpers.c:456:5
We also fail to unref the QMP rsp_return, so we leak that also.
Rather than duplicating the string, use the in-place value from
the qdict, and then unref the qdict.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
In multifd_mapped_ram_fdset_end() we call qtest_qmp() but forgot
to unref the response QDict we get back, which means it is leaked:
Indirect leak of 4120 byte(s) in 1 object(s) allocated from:
#0 0x55c0c095d318 in __interceptor_calloc (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/tests/qtest/migration-test+0x22f318) (BuildI
d: 07f667506452d6c467dbc06fd95191966d3e91b4)
#1 0x7f186f939c50 in g_malloc0 debian/build/deb/../../../glib/gmem.c:161:13
#2 0x55c0c0ae9b01 in qdict_new qobject/qdict.c:30:13
#3 0x55c0c0afc16c in parse_object qobject/json-parser.c:317:12
#4 0x55c0c0afb90f in parse_value qobject/json-parser.c:545:16
#5 0x55c0c0afb579 in json_parser_parse qobject/json-parser.c:579:14
#6 0x55c0c0afa21d in json_message_process_token qobject/json-streamer.c:92:12
#7 0x55c0c0bca2e5 in json_lexer_feed_char qobject/json-lexer.c:313:13
#8 0x55c0c0bc97ce in json_lexer_feed qobject/json-lexer.c:350:9
#9 0x55c0c0afabbc in json_message_parser_feed qobject/json-streamer.c:121:5
#10 0x55c0c09cbd52 in qmp_fd_receive tests/qtest/libqmp.c:86:9
#11 0x55c0c09be69b in qtest_qmp_receive_dict tests/qtest/libqtest.c:760:12
#12 0x55c0c09bca77 in qtest_qmp_receive tests/qtest/libqtest.c:741:27
#13 0x55c0c09bee9d in qtest_vqmp tests/qtest/libqtest.c:812:12
#14 0x55c0c09bd257 in qtest_qmp tests/qtest/libqtest.c:835:16
#15 0x55c0c0a87747 in multifd_mapped_ram_fdset_end tests/qtest/migration-test.c:2393:12
#16 0x55c0c0a85eb3 in test_file_common tests/qtest/migration-test.c:1978:9
#17 0x55c0c0a746a3 in test_multifd_file_mapped_ram_fdset tests/qtest/migration-test.c:2437:5
#18 0x55c0c0a93237 in migration_test_wrapper tests/qtest/migration-helpers.c:458:5
#19 0x7f186f958aed in test_case_run debian/build/deb/../../../glib/gtestutils.c:2930:15
#20 0x7f186f958aed in g_test_run_suite_internal debian/build/deb/../../../glib/gtestutils.c:3018:16
#21 0x7f186f95880a in g_test_run_suite_internal debian/build/deb/../../../glib/gtestutils.c:3035:18
#22 0x7f186f95880a in g_test_run_suite_internal debian/build/deb/../../../glib/gtestutils.c:3035:18
#23 0x7f186f95880a in g_test_run_suite_internal debian/build/deb/../../../glib/gtestutils.c:3035:18
#24 0x7f186f95880a in g_test_run_suite_internal debian/build/deb/../../../glib/gtestutils.c:3035:18
#25 0x7f186f95880a in g_test_run_suite_internal debian/build/deb/../../../glib/gtestutils.c:3035:18
#26 0x7f186f958faa in g_test_run_suite debian/build/deb/../../../glib/gtestutils.c:3109:18
#27 0x7f186f959055 in g_test_run debian/build/deb/../../../glib/gtestutils.c:2231:7
#28 0x7f186f959055 in g_test_run debian/build/deb/../../../glib/gtestutils.c:2218:1
#29 0x55c0c0a6e427 in main tests/qtest/migration-test.c:4033:11
Unref the object after we've confirmed that it is what we expect.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
If you invoke the migration-test binary in such a way that it doesn't run
any tests, then we never call bootfile_create(), and at the end of
main() bootfile_delete() will try to unlink(NULL), which is not valid.
This can happen if for instance you tell the test binary to run a
subset of tests that turns out to be empty, like this:
(cd build/asan && QTEST_QEMU_BINARY=./qemu-system-x86_64 ./tests/qtest/migration-test --tap -k -p bang)
# random seed: R02S6501b289ff8ced4231ba452c3a87bc6f
# Skipping test: userfaultfd not available
1..0
../../tests/qtest/migration-test.c:182:12: runtime error: null pointer passed as argument 1, which is declared to never be null
/usr/include/unistd.h:858:48: note: nonnull attribute specified here
Handle this by making bootfile_delete() not needing to do anything
because bootfile_create() was never called.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
[fixed conflict with aee07f2563]
Signed-off-by: Fabiano Rosas <farosas@suse.de>
I fumbled one of my last pull requests when fixing in-tree an issue
with commit 87d67fadb9 ("monitor: Stop removing non-duplicated
fds"). Basically mixed-up my `git add -p` and `git checkout -p` and
committed a piece of test infra that has not been reviewed yet.
This has not caused any bad symptoms because the test is not enabled
by default anywhere: make check doesn't use two qemu binaries and the
CI doesn't have PYTHON set for the compat tests. Besides, the test
works fine anyway, it would not break anything.
Remove this because it was never intended to be merged.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
A test run may create boot files several times. Delete the previous boot
file before creating a new one.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20240823-san-v4-7-a24c6dfa4ceb@daynix.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
s390x with TCG is more stable now. Enable it.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Message-Id: <20240525131241.378473-3-npiggin@gmail.com>
Reviewed-by: Prasad Pandit <pjp@fedoraproject.org>
[thuth: Added "with TCG" to the commit message]
Signed-off-by: Thomas Huth <thuth@redhat.com>
Make sure there will be an event for postcopy recovery, irrelevant of
whether the reconnect will success, or when the failure happens.
The added new case is to fail early in postcopy recovery, in which case it
didn't even reach RECOVER stage on src (and in real life it'll be the same
to dest, but the test case is just slightly more involved due to the dual
socketpair setup).
To do that, rename the postcopy_recovery_test_fail to reflect either stage
to fail, instead of a boolean.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Making sure the postcopy-recover-setup status is present in the postcopy
failure unit test. Note that it only applies to src QEMU not dest.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Libvirt should always enable it, so it'll be nice qtest also cover that for
all tests on both sides. migrate_incoming_qmp() used to enable it only on
dst, now we enable them on both, as we'll start to sanity check events even
on the src QEMU.
We'll need to leave the one in migrate_incoming_qmp(), because
virtio-net-failover test uses that one only, and it relies on the events to
work.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Most of them are not needed, we can stick with one ifdef inside
postcopy_recover_fail() so as to cover the scm right tricks only.
The tests won't run on windows anyway due to has_uffd always false.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Add a multifd test for mapped-ram with passing of fds into QEMU. This
is how libvirt will consume the feature.
There are a couple of details to the fdset mechanism:
- multifd needs two distinct file descriptors (not duplicated with
dup()) so it can enable O_DIRECT only on the channels that do
aligned IO. The dup() system call creates file descriptors that
share status flags, of which O_DIRECT is one.
- the open() access mode flags used for the fds passed into QEMU need
to match the flags QEMU uses to open the file. Currently O_WRONLY
for src and O_RDONLY for dst.
Note that fdset code goes under _WIN32 because fd passing is not
supported on Windows.
Reviewed-by: Peter Xu <peterx@redhat.com>
[brought back the qmp_remove_fd() call at the end of the tests]
Signed-off-by: Fabiano Rosas <farosas@suse.de>
The tests are only allowed to run in systems that know about the
O_DIRECT flag and in filesystems which support it.
Note: this also brings back migrate_set_parameter_bool() which went
away when we removed the compression tests. I copied it verbatim.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
monitor_fdsets_cleanup() currently has three responsibilities:
1- Remove the fds that have been marked for removal(->removed=true) by
qmp_remove_fd(). This is overly complicated, but ok.
2- Remove any file descriptors that have been passed into QEMU and
never duplicated[1,2]. A file descriptor without duplicates
indicates that no part of QEMU has made use of it. This is
problematic because the current implementation does it only if the
guest is not running and the monitor is closed.
3- Remove/free fdsets that have become empty due to the above
removals. This is ok.
The scenario described in (2) is starting to show some cracks now that
we're trying to consume fds from the migration code:
- Doing cleanup every time the last monitor connection closes works to
reap unused fds, but also has the side effect of forcing the
management layer to pass the file descriptors again in case of a
disconnect/re-connect, if that happened to be the only monitor
connection.
Another side effect is that removing an fd with qmp_remove_fd() is
effectively delayed until the last monitor connection closes.
The usage of mon_refcount is also problematic because it's racy.
- Checking runstate_is_running() skips the cleanup unless the VM is
running and avoids premature cleanup of the fds, but also has the
side effect of blocking the legitimate removal of an fd via
qmp_remove_fd() if the VM happens to be in another state.
This affects qmp_remove_fd() and qmp_query_fdsets() in particular
because requesting a removal at a bad time (guest stopped) might
cause an fd to never be removed, or to be removed at a much later
point in time, causing the query command to continue showing the
supposedly removed fd/fdset.
Note that file descriptors that *have* been duplicated are owned by
the code that uses them and will be removed after qemu_close() is
called. Therefore we've decided that the best course of action to
avoid the undesired side-effects is to stop managing non-duplicated
file descriptors.
1- efb87c1697 ("monitor: Clean up fd sets on monitor disconnect")
2- ebe52b592d ("monitor: Prevent removing fd from set during init")
Reviewed-by: Peter Xu <peterx@redhat.com>
[fix logic mistake: s/fdset_free/fdset_free_if_empty]
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Add a test for file migration using fdset. The passing of fds is more
complex than using a file path. This is also the scenario where it's
most important we ensure that the initial migration stream offset is
respected because the fdset interface is the one used by the
management layer when providing a non empty migration file.
Note that fd passing is not available on Windows, so anything that
uses add-fd needs to exclude that platform.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
When doing file migration, QEMU accepts an offset that should be
skipped when writing the migration stream to the file. The purpose of
the offset is to allow the management layer to put its own metadata at
the start of the file.
We have tests for this in migration-test, but only testing that the
migration stream starts at the correct offset and not that it actually
leaves the data intact. Unsurprisingly, there's been a bug in that
area that the tests didn't catch.
Fix the tests to write some data to the offset region and check that
it's actually there after the migration.
While here, switch to using g_get_file_contents() which is more
portable than mmap().
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
add qpl to compression method test for multifd migration
the qpl compression supports software path and hardware
path(IAA device), and the hardware path is used first by
default. If the hardware path is unavailable, it will
automatically fallback to the software path for testing.
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Reviewed-by: Nanhai Zou <nanhai.zou@intel.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Similar to other archs, build a custom bios memory updater. Running the
test with OF code is a cool trick, but SLOF takes a long time to boot.
This reduces test time by around 3x (150s to 50s).
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
ppc64 with TCG seems to no longer be failing this test, perhaps since
commit 03bfc2188f ("physmem: Fix migration dirty bitmap coherency
with TCG memory access") which is not ppc specific but was seen to hit
ppc64 quite easily.
Let's enable it again.
The s390x problem has been identified so mention it while we are
adjusting the comment.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Prasad Pandit <pjp@fedoraproject.org>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
If analyze-migration.py cannot be run or crashes, the error is currently
ignored since the code only checks for nonzero values in case the child
exited properly. For example, if you run the test with a non-existing
Python interpreter, it still succeeds:
$ PYTHON=wrongpython QTEST_QEMU_BINARY=./qemu-system-x86_64 tests/qtest/migration-test
...
# Running /x86_64/migration/analyze-script
# Using machine type: pc-q35-9.1
# starting QEMU: exec ./qemu-system-x86_64 -qtest unix:/tmp/qtest-417639.sock -qtest-log /dev/null -chardev socket,path=/tmp/qtest-417639.qmp,id=char0 -mon chardev=char0,mode=control -display none -audio none -accel kvm -accel tcg -machine pc-q35-9.1, -name source,debug-threads=on -m 150M -serial file:/tmp/migration-test-XPLUN2/src_serial -drive if=none,id=d0,file=/tmp/migration-test-XPLUN2/bootsect,format=raw -device ide-hd,drive=d0,secs=1,cyls=1,heads=1 -uuid 11111111-1111-1111-1111-111111111111 -accel qtest
# starting QEMU: exec ./qemu-system-x86_64 -qtest unix:/tmp/qtest-417639.sock -qtest-log /dev/null -chardev socket,path=/tmp/qtest-417639.qmp,id=char0 -mon chardev=char0,mode=control -display none -audio none -accel kvm -accel tcg -machine pc-q35-9.1, -name target,debug-threads=on -m 150M -serial file:/tmp/migration-test-XPLUN2/dest_serial -incoming tcp:127.0.0.1:0 -drive if=none,id=d0,file=/tmp/migration-test-XPLUN2/bootsect,format=raw -device ide-hd,drive=d0,secs=1,cyls=1,heads=1 -accel qtest
**
ERROR:../../devel/qemu/tests/qtest/migration-test.c:1603:test_analyze_script: code should not be reached
migration-test: ../../devel/qemu/tests/qtest/libqtest.c:240: qtest_wait_qemu: Assertion `pid == s->qemu_pid' failed.
migration-test: ../../devel/qemu/tests/qtest/libqtest.c:240: qtest_wait_qemu: Assertion `pid == s->qemu_pid' failed.
ok 2 /x86_64/migration/analyze-script
...
Let's better fail the test in case the child did not exit properly, too.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
On s390x, we recently had a regression that broke migration / savevm
(see commit bebe9603fc ("hw/intc/s390_flic: Fix crash that occurs when
saving the machine state"). The problem was merged without being noticed
since we currently do not run any migration / savevm related tests on
x86 hosts.
While we currently cannot run all migration tests for the s390x target
on x86 hosts yet (due to some unresolved issues with TCG), we can at
least run some of the non-live tests to avoid such problems in the future.
Thus enable the "analyze-script" and the "bad_dest" tests before checking
for KVM on s390x or ppc64 (this also fixes the problem that the
"analyze-script" test was not run on s390x at all anymore since it got
disabled again by accident in a previous refactoring of the code).
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
ARM/aarch64 are easy to fix because they already have to pass a machine
type by hand. Just guard the tests with a check that the machine actually
exists.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20240509170044.190795-14-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>