mirror of
https://github.com/Motorhead1991/qemu.git
synced 2025-08-04 16:23:55 -06:00
Migration Pull request (20231031)
Hi This is repeat of the Migration PULL for 20231020. - I removed vmstate_register(big problems with s390x) - I added yet more countes (juan) CI:1055797950
Please apply. Thanks, Juan. -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEGJn/jt6/WMzuA0uC9IfvGFhy1yMFAmVAwmYACgkQ9IfvGFhy 1yPJ9g//f8Up+5Az0DmJMWwRe+08vLa3ZRCSh5aCRJguFVfMZSVxRNuoikQ/C/Gz 1ePB+Q8H0NcP86FF7pifhtLU0uE9L4At4Z+vOQP1+n67p7aush050kKQxyDYIfO2 3tO2HkfHvC/R3S5FtqQtE1Y0/MpHdj1vgV9bNidPorA6EZ01KEEfWw3soptuD14I LPvXA8BG5mOvB7R55MymTAej3ZDmOUQlZotsE2KmlkOfzYoqTtApkLtW03/WH8b8 fAYJ0ghYpesRTO1rF61n1peLMUr+/HRLqGJmhLDSEZZlB5tnUYeiLR9dRJ1/1+o2 zNjLr6X2hnia6Kb0UibRoAcyyy8lSLp79Zt5nhDneuTSQxeYhNh6EecxAzKvd/02 vfE/reOEkZn7KzYH/MvlD5P6XmwrT5aV9cqmyC/8BkNnipHAtJ2Av1H4ONdnahuK hOhLRAGE7SINtgo8jdauQNor1QAsIX19nvYk9p7ta5VAysrDSbuD+9Yq7HtUErlP 585z5BPGfaP2GwIXPNJNcqXwPh0InInGASqEWmYSlu8GF3Ic0KNWWrC5bwSn7tHL I7qaMrCHxvWGYx6cRzzp08EqCcbOQCixrPyk8g6o3SgXHrTGKthzjPG5bLe+QXpv P2gblC7Fo3sUo89IwVjsRMO3nU9wBfb9skE7iZM06SILO7QD3u8= =r1DI -----END PGP SIGNATURE----- Merge tag 'migration-20231031-pull-request' of https://gitlab.com/juan.quintela/qemu into staging Migration Pull request (20231031) Hi This is repeat of the Migration PULL for 20231020. - I removed vmstate_register(big problems with s390x) - I added yet more countes (juan) CI:1055797950
Please apply. Thanks, Juan. # -----BEGIN PGP SIGNATURE----- # # iQIzBAABCAAdFiEEGJn/jt6/WMzuA0uC9IfvGFhy1yMFAmVAwmYACgkQ9IfvGFhy # 1yPJ9g//f8Up+5Az0DmJMWwRe+08vLa3ZRCSh5aCRJguFVfMZSVxRNuoikQ/C/Gz # 1ePB+Q8H0NcP86FF7pifhtLU0uE9L4At4Z+vOQP1+n67p7aush050kKQxyDYIfO2 # 3tO2HkfHvC/R3S5FtqQtE1Y0/MpHdj1vgV9bNidPorA6EZ01KEEfWw3soptuD14I # LPvXA8BG5mOvB7R55MymTAej3ZDmOUQlZotsE2KmlkOfzYoqTtApkLtW03/WH8b8 # fAYJ0ghYpesRTO1rF61n1peLMUr+/HRLqGJmhLDSEZZlB5tnUYeiLR9dRJ1/1+o2 # zNjLr6X2hnia6Kb0UibRoAcyyy8lSLp79Zt5nhDneuTSQxeYhNh6EecxAzKvd/02 # vfE/reOEkZn7KzYH/MvlD5P6XmwrT5aV9cqmyC/8BkNnipHAtJ2Av1H4ONdnahuK # hOhLRAGE7SINtgo8jdauQNor1QAsIX19nvYk9p7ta5VAysrDSbuD+9Yq7HtUErlP # 585z5BPGfaP2GwIXPNJNcqXwPh0InInGASqEWmYSlu8GF3Ic0KNWWrC5bwSn7tHL # I7qaMrCHxvWGYx6cRzzp08EqCcbOQCixrPyk8g6o3SgXHrTGKthzjPG5bLe+QXpv # P2gblC7Fo3sUo89IwVjsRMO3nU9wBfb9skE7iZM06SILO7QD3u8= # =r1DI # -----END PGP SIGNATURE----- # gpg: Signature made Tue 31 Oct 2023 18:01:26 JST # gpg: using RSA key 1899FF8EDEBF58CCEE034B82F487EF185872D723 # gpg: Good signature from "Juan Quintela <quintela@redhat.com>" [full] # gpg: aka "Juan Quintela <quintela@trasno.org>" [full] # Primary key fingerprint: 1899 FF8E DEBF 58CC EE03 4B82 F487 EF18 5872 D723 * tag 'migration-20231031-pull-request' of https://gitlab.com/juan.quintela/qemu: (38 commits) qemu-file: Make qemu_fflush() return errors migration: Remove transferred atomic counter migration: Use migration_transferred_bytes() qemu-file: Simplify qemu_file_get_error() migration: migration_rate_limit_reset() don't need the QEMUFile migration: migration_transferred_bytes() don't need the QEMUFile qemu-file: Remove _noflush from qemu_file_transferred_noflush() qemu_file: Remove unused qemu_file_transferred() migration: Use the number of transferred bytes directly qemu_file: total_transferred is not used anymore qemu_file: Use a stat64 for qemu_file_transferred qemu-file: Don't increment qemu_file_transferred at qemu_file_fill_buffer migration: Stop migration immediately in RDMA error paths migration: Deprecate old compression method migration: Deprecate block migration migration: migrate 'blk' command option is deprecated. migration: migrate 'inc' command option is deprecated. qemu-iotests: Filter warnings about block migration being deprecated migration: set file error on subsection loading migration: rename vmstate_save_needed->vmstate_section_needed ... Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This commit is contained in:
commit
f13b978cc7
23 changed files with 880 additions and 276 deletions
|
@ -469,3 +469,38 @@ Migration
|
|||
``skipped`` field in Migration stats has been deprecated. It hasn't
|
||||
been used for more than 10 years.
|
||||
|
||||
``inc`` migrate command option (since 8.2)
|
||||
''''''''''''''''''''''''''''''''''''''''''
|
||||
|
||||
Use blockdev-mirror with NBD instead.
|
||||
|
||||
As an intermediate step the ``inc`` functionality can be achieved by
|
||||
setting the ``block-incremental`` migration parameter to ``true``.
|
||||
But this parameter is also deprecated.
|
||||
|
||||
``blk`` migrate command option (since 8.2)
|
||||
''''''''''''''''''''''''''''''''''''''''''
|
||||
|
||||
Use blockdev-mirror with NBD instead.
|
||||
|
||||
As an intermediate step the ``blk`` functionality can be achieved by
|
||||
setting the ``block`` migration capability to ``true``. But this
|
||||
capability is also deprecated.
|
||||
|
||||
block migration (since 8.2)
|
||||
'''''''''''''''''''''''''''
|
||||
|
||||
Block migration is too inflexible. It needs to migrate all block
|
||||
devices or none.
|
||||
|
||||
Please see "QMP invocation for live storage migration with
|
||||
``blockdev-mirror`` + NBD" in docs/interop/live-block-operations.rst
|
||||
for a detailed explanation.
|
||||
|
||||
old compression method (since 8.2)
|
||||
''''''''''''''''''''''''''''''''''
|
||||
|
||||
Compression method fails too much. Too many races. We are going to
|
||||
remove it if nobody fixes it. For starters, migration-test
|
||||
compression tests are disabled becase they fail randomly. If you need
|
||||
compression, use multifd compression methods.
|
||||
|
|
|
@ -28,6 +28,8 @@ the guest to be stopped. Typically the time that the guest is
|
|||
unresponsive during live migration is the low hundred of milliseconds
|
||||
(notice that this depends on a lot of things).
|
||||
|
||||
.. contents::
|
||||
|
||||
Transports
|
||||
==========
|
||||
|
||||
|
@ -917,3 +919,521 @@ versioned machine types to cut down on the combinations that will need
|
|||
support. This is also useful when newer versions of firmware outgrow
|
||||
the padding.
|
||||
|
||||
|
||||
Backwards compatibility
|
||||
=======================
|
||||
|
||||
How backwards compatibility works
|
||||
---------------------------------
|
||||
|
||||
When we do migration, we have two QEMU processes: the source and the
|
||||
target. There are two cases, they are the same version or they are
|
||||
different versions. The easy case is when they are the same version.
|
||||
The difficult one is when they are different versions.
|
||||
|
||||
There are two things that are different, but they have very similar
|
||||
names and sometimes get confused:
|
||||
|
||||
- QEMU version
|
||||
- machine type version
|
||||
|
||||
Let's start with a practical example, we start with:
|
||||
|
||||
- qemu-system-x86_64 (v5.2), from now on qemu-5.2.
|
||||
- qemu-system-x86_64 (v5.1), from now on qemu-5.1.
|
||||
|
||||
Related to this are the "latest" machine types defined on each of
|
||||
them:
|
||||
|
||||
- pc-q35-5.2 (newer one in qemu-5.2) from now on pc-5.2
|
||||
- pc-q35-5.1 (newer one in qemu-5.1) from now on pc-5.1
|
||||
|
||||
First of all, migration is only supposed to work if you use the same
|
||||
machine type in both source and destination. The QEMU hardware
|
||||
configuration needs to be the same also on source and destination.
|
||||
Most aspects of the backend configuration can be changed at will,
|
||||
except for a few cases where the backend features influence frontend
|
||||
device feature exposure. But that is not relevant for this section.
|
||||
|
||||
I am going to list the number of combinations that we can have. Let's
|
||||
start with the trivial ones, QEMU is the same on source and
|
||||
destination:
|
||||
|
||||
1 - qemu-5.2 -M pc-5.2 -> migrates to -> qemu-5.2 -M pc-5.2
|
||||
|
||||
This is the latest QEMU with the latest machine type.
|
||||
This have to work, and if it doesn't work it is a bug.
|
||||
|
||||
2 - qemu-5.1 -M pc-5.1 -> migrates to -> qemu-5.1 -M pc-5.1
|
||||
|
||||
Exactly the same case than the previous one, but for 5.1.
|
||||
Nothing to see here either.
|
||||
|
||||
This are the easiest ones, we will not talk more about them in this
|
||||
section.
|
||||
|
||||
Now we start with the more interesting cases. Consider the case where
|
||||
we have the same QEMU version in both sides (qemu-5.2) but we are using
|
||||
the latest machine type for that version (pc-5.2) but one of an older
|
||||
QEMU version, in this case pc-5.1.
|
||||
|
||||
3 - qemu-5.2 -M pc-5.1 -> migrates to -> qemu-5.2 -M pc-5.1
|
||||
|
||||
It needs to use the definition of pc-5.1 and the devices as they
|
||||
were configured on 5.1, but this should be easy in the sense that
|
||||
both sides are the same QEMU and both sides have exactly the same
|
||||
idea of what the pc-5.1 machine is.
|
||||
|
||||
4 - qemu-5.1 -M pc-5.2 -> migrates to -> qemu-5.1 -M pc-5.2
|
||||
|
||||
This combination is not possible as the qemu-5.1 doen't understand
|
||||
pc-5.2 machine type. So nothing to worry here.
|
||||
|
||||
Now it comes the interesting ones, when both QEMU processes are
|
||||
different. Notice also that the machine type needs to be pc-5.1,
|
||||
because we have the limitation than qemu-5.1 doesn't know pc-5.2. So
|
||||
the possible cases are:
|
||||
|
||||
5 - qemu-5.2 -M pc-5.1 -> migrates to -> qemu-5.1 -M pc-5.1
|
||||
|
||||
This migration is known as newer to older. We need to make sure
|
||||
when we are developing 5.2 we need to take care about not to break
|
||||
migration to qemu-5.1. Notice that we can't make updates to
|
||||
qemu-5.1 to understand whatever qemu-5.2 decides to change, so it is
|
||||
in qemu-5.2 side to make the relevant changes.
|
||||
|
||||
6 - qemu-5.1 -M pc-5.1 -> migrates to -> qemu-5.2 -M pc-5.1
|
||||
|
||||
This migration is known as older to newer. We need to make sure
|
||||
than we are able to receive migrations from qemu-5.1. The problem is
|
||||
similar to the previous one.
|
||||
|
||||
If qemu-5.1 and qemu-5.2 were the same, there will not be any
|
||||
compatibility problems. But the reason that we create qemu-5.2 is to
|
||||
get new features, devices, defaults, etc.
|
||||
|
||||
If we get a device that has a new feature, or change a default value,
|
||||
we have a problem when we try to migrate between different QEMU
|
||||
versions.
|
||||
|
||||
So we need a way to tell qemu-5.2 that when we are using machine type
|
||||
pc-5.1, it needs to **not** use the feature, to be able to migrate to
|
||||
real qemu-5.1.
|
||||
|
||||
And the equivalent part when migrating from qemu-5.1 to qemu-5.2.
|
||||
qemu-5.2 has to expect that it is not going to get data for the new
|
||||
feature, because qemu-5.1 doesn't know about it.
|
||||
|
||||
How do we tell QEMU about these device feature changes? In
|
||||
hw/core/machine.c:hw_compat_X_Y arrays.
|
||||
|
||||
If we change a default value, we need to put back the old value on
|
||||
that array. And the device, during initialization needs to look at
|
||||
that array to see what value it needs to get for that feature. And
|
||||
what are we going to put in that array, the value of a property.
|
||||
|
||||
To create a property for a device, we need to use one of the
|
||||
DEFINE_PROP_*() macros. See include/hw/qdev-properties.h to find the
|
||||
macros that exist. With it, we set the default value for that
|
||||
property, and that is what it is going to get in the latest released
|
||||
version. But if we want a different value for a previous version, we
|
||||
can change that in the hw_compat_X_Y arrays.
|
||||
|
||||
hw_compat_X_Y is an array of registers that have the format:
|
||||
|
||||
- name_device
|
||||
- name_property
|
||||
- value
|
||||
|
||||
Let's see a practical example.
|
||||
|
||||
In qemu-5.2 virtio-blk-device got multi queue support. This is a
|
||||
change that is not backward compatible. In qemu-5.1 it has one
|
||||
queue. In qemu-5.2 it has the same number of queues as the number of
|
||||
cpus in the system.
|
||||
|
||||
When we are doing migration, if we migrate from a device that has 4
|
||||
queues to a device that have only one queue, we don't know where to
|
||||
put the extra information for the other 3 queues, and we fail
|
||||
migration.
|
||||
|
||||
Similar problem when we migrate from qemu-5.1 that has only one queue
|
||||
to qemu-5.2, we only sent information for one queue, but destination
|
||||
has 4, and we have 3 queues that are not properly initialized and
|
||||
anything can happen.
|
||||
|
||||
So, how can we address this problem. Easy, just convince qemu-5.2
|
||||
that when it is running pc-5.1, it needs to set the number of queues
|
||||
for virtio-blk-devices to 1.
|
||||
|
||||
That way we fix the cases 5 and 6.
|
||||
|
||||
5 - qemu-5.2 -M pc-5.1 -> migrates to -> qemu-5.1 -M pc-5.1
|
||||
|
||||
qemu-5.2 -M pc-5.1 sets number of queues to be 1.
|
||||
qemu-5.1 -M pc-5.1 expects number of queues to be 1.
|
||||
|
||||
correct. migration works.
|
||||
|
||||
6 - qemu-5.1 -M pc-5.1 -> migrates to -> qemu-5.2 -M pc-5.1
|
||||
|
||||
qemu-5.1 -M pc-5.1 sets number of queues to be 1.
|
||||
qemu-5.2 -M pc-5.1 expects number of queues to be 1.
|
||||
|
||||
correct. migration works.
|
||||
|
||||
And now the other interesting case, case 3. In this case we have:
|
||||
|
||||
3 - qemu-5.2 -M pc-5.1 -> migrates to -> qemu-5.2 -M pc-5.1
|
||||
|
||||
Here we have the same QEMU in both sides. So it doesn't matter a
|
||||
lot if we have set the number of queues to 1 or not, because
|
||||
they are the same.
|
||||
|
||||
WRONG!
|
||||
|
||||
Think what happens if we do one of this double migrations:
|
||||
|
||||
A -> migrates -> B -> migrates -> C
|
||||
|
||||
where:
|
||||
|
||||
A: qemu-5.1 -M pc-5.1
|
||||
B: qemu-5.2 -M pc-5.1
|
||||
C: qemu-5.2 -M pc-5.1
|
||||
|
||||
migration A -> B is case 6, so number of queues needs to be 1.
|
||||
|
||||
migration B -> C is case 3, so we don't care. But actually we
|
||||
care because we haven't started the guest in qemu-5.2, it came
|
||||
migrated from qemu-5.1. So to be in the safe place, we need to
|
||||
always use number of queues 1 when we are using pc-5.1.
|
||||
|
||||
Now, how was this done in reality? The following commit shows how it
|
||||
was done::
|
||||
|
||||
commit 9445e1e15e66c19e42bea942ba810db28052cd05
|
||||
Author: Stefan Hajnoczi <stefanha@redhat.com>
|
||||
Date: Tue Aug 18 15:33:47 2020 +0100
|
||||
|
||||
virtio-blk-pci: default num_queues to -smp N
|
||||
|
||||
The relevant parts for migration are::
|
||||
|
||||
@@ -1281,7 +1284,8 @@ static Property virtio_blk_properties[] = {
|
||||
#endif
|
||||
DEFINE_PROP_BIT("request-merging", VirtIOBlock, conf.request_merging, 0,
|
||||
true),
|
||||
- DEFINE_PROP_UINT16("num-queues", VirtIOBlock, conf.num_queues, 1),
|
||||
+ DEFINE_PROP_UINT16("num-queues", VirtIOBlock, conf.num_queues,
|
||||
+ VIRTIO_BLK_AUTO_NUM_QUEUES),
|
||||
DEFINE_PROP_UINT16("queue-size", VirtIOBlock, conf.queue_size, 256),
|
||||
|
||||
It changes the default value of num_queues. But it fishes it for old
|
||||
machine types to have the right value::
|
||||
|
||||
@@ -31,6 +31,7 @@
|
||||
GlobalProperty hw_compat_5_1[] = {
|
||||
...
|
||||
+ { "virtio-blk-device", "num-queues", "1"},
|
||||
...
|
||||
};
|
||||
|
||||
A device with diferent features on both sides
|
||||
---------------------------------------------
|
||||
|
||||
Let's assume that we are using the same QEMU binary on both sides,
|
||||
just to make the things easier. But we have a device that has
|
||||
different features on both sides of the migration. That can be
|
||||
because the devices are different, because the kernel driver of both
|
||||
devices have different features, whatever.
|
||||
|
||||
How can we get this to work with migration. The way to do that is
|
||||
"theoretically" easy. You have to get the features that the device
|
||||
has in the source of the migration. The features that the device has
|
||||
on the target of the migration, you get the intersection of the
|
||||
features of both sides, and that is the way that you should launch
|
||||
QEMU.
|
||||
|
||||
Notice that this is not completely related to QEMU. The most
|
||||
important thing here is that this should be handled by the managing
|
||||
application that launches QEMU. If QEMU is configured correctly, the
|
||||
migration will succeed.
|
||||
|
||||
That said, actually doing it is complicated. Almost all devices are
|
||||
bad at being able to be launched with only some features enabled.
|
||||
With one big exception: cpus.
|
||||
|
||||
You can read the documentation for QEMU x86 cpu models here:
|
||||
|
||||
https://qemu-project.gitlab.io/qemu/system/qemu-cpu-models.html
|
||||
|
||||
See when they talk about migration they recommend that one chooses the
|
||||
newest cpu model that is supported for all cpus.
|
||||
|
||||
Let's say that we have:
|
||||
|
||||
Host A:
|
||||
|
||||
Device X has the feature Y
|
||||
|
||||
Host B:
|
||||
|
||||
Device X has not the feature Y
|
||||
|
||||
If we try to migrate without any care from host A to host B, it will
|
||||
fail because when migration tries to load the feature Y on
|
||||
destination, it will find that the hardware is not there.
|
||||
|
||||
Doing this would be the equivalent of doing with cpus:
|
||||
|
||||
Host A:
|
||||
|
||||
$ qemu-system-x86_64 -cpu host
|
||||
|
||||
Host B:
|
||||
|
||||
$ qemu-system-x86_64 -cpu host
|
||||
|
||||
When both hosts have different cpu features this is guaranteed to
|
||||
fail. Especially if Host B has less features than host A. If host A
|
||||
has less features than host B, sometimes it works. Important word of
|
||||
last sentence is "sometimes".
|
||||
|
||||
So, forgetting about cpu models and continuing with the -cpu host
|
||||
example, let's see that the differences of the cpus is that Host A and
|
||||
B have the following features:
|
||||
|
||||
Features: 'pcid' 'stibp' 'taa-no'
|
||||
Host A: X X
|
||||
Host B: X
|
||||
|
||||
And we want to migrate between them, the way configure both QEMU cpu
|
||||
will be:
|
||||
|
||||
Host A:
|
||||
|
||||
$ qemu-system-x86_64 -cpu host,pcid=off,stibp=off
|
||||
|
||||
Host B:
|
||||
|
||||
$ qemu-system-x86_64 -cpu host,taa-no=off
|
||||
|
||||
And you would be able to migrate between them. It is responsability
|
||||
of the management application or of the user to make sure that the
|
||||
configuration is correct. QEMU doesn't know how to look at this kind
|
||||
of features in general.
|
||||
|
||||
Notice that we don't recomend to use -cpu host for migration. It is
|
||||
used in this example because it makes the example simpler.
|
||||
|
||||
Other devices have worse control about individual features. If they
|
||||
want to be able to migrate between hosts that show different features,
|
||||
the device needs a way to configure which ones it is going to use.
|
||||
|
||||
In this section we have considered that we are using the same QEMU
|
||||
binary in both sides of the migration. If we use different QEMU
|
||||
versions process, then we need to have into account all other
|
||||
differences and the examples become even more complicated.
|
||||
|
||||
How to mitigate when we have a backward compatibility error
|
||||
-----------------------------------------------------------
|
||||
|
||||
We broke migration for old machine types continuously during
|
||||
development. But as soon as we find that there is a problem, we fix
|
||||
it. The problem is what happens when we detect after we have done a
|
||||
release that something has gone wrong.
|
||||
|
||||
Let see how it worked with one example.
|
||||
|
||||
After the release of qemu-8.0 we found a problem when doing migration
|
||||
of the machine type pc-7.2.
|
||||
|
||||
- $ qemu-7.2 -M pc-7.2 -> qemu-7.2 -M pc-7.2
|
||||
|
||||
This migration works
|
||||
|
||||
- $ qemu-8.0 -M pc-7.2 -> qemu-8.0 -M pc-7.2
|
||||
|
||||
This migration works
|
||||
|
||||
- $ qemu-8.0 -M pc-7.2 -> qemu-7.2 -M pc-7.2
|
||||
|
||||
This migration fails
|
||||
|
||||
- $ qemu-7.2 -M pc-7.2 -> qemu-8.0 -M pc-7.2
|
||||
|
||||
This migration fails
|
||||
|
||||
So clearly something fails when migration between qemu-7.2 and
|
||||
qemu-8.0 with machine type pc-7.2. The error messages, and git bisect
|
||||
pointed to this commit.
|
||||
|
||||
In qemu-8.0 we got this commit::
|
||||
|
||||
commit 010746ae1db7f52700cb2e2c46eb94f299cfa0d2
|
||||
Author: Jonathan Cameron <Jonathan.Cameron@huawei.com>
|
||||
Date: Thu Mar 2 13:37:02 2023 +0000
|
||||
|
||||
hw/pci/aer: Implement PCI_ERR_UNCOR_MASK register
|
||||
|
||||
|
||||
The relevant bits of the commit for our example are this ones::
|
||||
|
||||
--- a/hw/pci/pcie_aer.c
|
||||
+++ b/hw/pci/pcie_aer.c
|
||||
@@ -112,6 +112,10 @@ int pcie_aer_init(PCIDevice *dev,
|
||||
|
||||
pci_set_long(dev->w1cmask + offset + PCI_ERR_UNCOR_STATUS,
|
||||
PCI_ERR_UNC_SUPPORTED);
|
||||
+ pci_set_long(dev->config + offset + PCI_ERR_UNCOR_MASK,
|
||||
+ PCI_ERR_UNC_MASK_DEFAULT);
|
||||
+ pci_set_long(dev->wmask + offset + PCI_ERR_UNCOR_MASK,
|
||||
+ PCI_ERR_UNC_SUPPORTED);
|
||||
|
||||
pci_set_long(dev->config + offset + PCI_ERR_UNCOR_SEVER,
|
||||
PCI_ERR_UNC_SEVERITY_DEFAULT);
|
||||
|
||||
The patch changes how we configure PCI space for AER. But QEMU fails
|
||||
when the PCI space configuration is different between source and
|
||||
destination.
|
||||
|
||||
The following commit shows how this got fixed::
|
||||
|
||||
commit 5ed3dabe57dd9f4c007404345e5f5bf0e347317f
|
||||
Author: Leonardo Bras <leobras@redhat.com>
|
||||
Date: Tue May 2 21:27:02 2023 -0300
|
||||
|
||||
hw/pci: Disable PCI_ERR_UNCOR_MASK register for machine type < 8.0
|
||||
|
||||
[...]
|
||||
|
||||
The relevant parts of the fix in QEMU are as follow:
|
||||
|
||||
First, we create a new property for the device to be able to configure
|
||||
the old behaviour or the new behaviour::
|
||||
|
||||
diff --git a/hw/pci/pci.c b/hw/pci/pci.c
|
||||
index 8a87ccc8b0..5153ad63d6 100644
|
||||
--- a/hw/pci/pci.c
|
||||
+++ b/hw/pci/pci.c
|
||||
@@ -79,6 +79,8 @@ static Property pci_props[] = {
|
||||
DEFINE_PROP_STRING("failover_pair_id", PCIDevice,
|
||||
failover_pair_id),
|
||||
DEFINE_PROP_UINT32("acpi-index", PCIDevice, acpi_index, 0),
|
||||
+ DEFINE_PROP_BIT("x-pcie-err-unc-mask", PCIDevice, cap_present,
|
||||
+ QEMU_PCIE_ERR_UNC_MASK_BITNR, true),
|
||||
DEFINE_PROP_END_OF_LIST()
|
||||
};
|
||||
|
||||
Notice that we enable the feature for new machine types.
|
||||
|
||||
Now we see how the fix is done. This is going to depend on what kind
|
||||
of breakage happens, but in this case it is quite simple::
|
||||
|
||||
diff --git a/hw/pci/pcie_aer.c b/hw/pci/pcie_aer.c
|
||||
index 103667c368..374d593ead 100644
|
||||
--- a/hw/pci/pcie_aer.c
|
||||
+++ b/hw/pci/pcie_aer.c
|
||||
@@ -112,10 +112,13 @@ int pcie_aer_init(PCIDevice *dev, uint8_t cap_ver,
|
||||
uint16_t offset,
|
||||
|
||||
pci_set_long(dev->w1cmask + offset + PCI_ERR_UNCOR_STATUS,
|
||||
PCI_ERR_UNC_SUPPORTED);
|
||||
- pci_set_long(dev->config + offset + PCI_ERR_UNCOR_MASK,
|
||||
- PCI_ERR_UNC_MASK_DEFAULT);
|
||||
- pci_set_long(dev->wmask + offset + PCI_ERR_UNCOR_MASK,
|
||||
- PCI_ERR_UNC_SUPPORTED);
|
||||
+
|
||||
+ if (dev->cap_present & QEMU_PCIE_ERR_UNC_MASK) {
|
||||
+ pci_set_long(dev->config + offset + PCI_ERR_UNCOR_MASK,
|
||||
+ PCI_ERR_UNC_MASK_DEFAULT);
|
||||
+ pci_set_long(dev->wmask + offset + PCI_ERR_UNCOR_MASK,
|
||||
+ PCI_ERR_UNC_SUPPORTED);
|
||||
+ }
|
||||
|
||||
pci_set_long(dev->config + offset + PCI_ERR_UNCOR_SEVER,
|
||||
PCI_ERR_UNC_SEVERITY_DEFAULT);
|
||||
|
||||
I.e. If the property bit is enabled, we configure it as we did for
|
||||
qemu-8.0. If the property bit is not set, we configure it as it was in 7.2.
|
||||
|
||||
And now, everything that is missing is disabling the feature for old
|
||||
machine types::
|
||||
|
||||
diff --git a/hw/core/machine.c b/hw/core/machine.c
|
||||
index 47a34841a5..07f763eb2e 100644
|
||||
--- a/hw/core/machine.c
|
||||
+++ b/hw/core/machine.c
|
||||
@@ -48,6 +48,7 @@ GlobalProperty hw_compat_7_2[] = {
|
||||
{ "e1000e", "migrate-timadj", "off" },
|
||||
{ "virtio-mem", "x-early-migration", "false" },
|
||||
{ "migration", "x-preempt-pre-7-2", "true" },
|
||||
+ { TYPE_PCI_DEVICE, "x-pcie-err-unc-mask", "off" },
|
||||
};
|
||||
const size_t hw_compat_7_2_len = G_N_ELEMENTS(hw_compat_7_2);
|
||||
|
||||
And now, when qemu-8.0.1 is released with this fix, all combinations
|
||||
are going to work as supposed.
|
||||
|
||||
- $ qemu-7.2 -M pc-7.2 -> qemu-7.2 -M pc-7.2 (works)
|
||||
- $ qemu-8.0.1 -M pc-7.2 -> qemu-8.0.1 -M pc-7.2 (works)
|
||||
- $ qemu-8.0.1 -M pc-7.2 -> qemu-7.2 -M pc-7.2 (works)
|
||||
- $ qemu-7.2 -M pc-7.2 -> qemu-8.0.1 -M pc-7.2 (works)
|
||||
|
||||
So the normality has been restored and everything is ok, no?
|
||||
|
||||
Not really, now our matrix is much bigger. We started with the easy
|
||||
cases, migration from the same version to the same version always
|
||||
works:
|
||||
|
||||
- $ qemu-7.2 -M pc-7.2 -> qemu-7.2 -M pc-7.2
|
||||
- $ qemu-8.0 -M pc-7.2 -> qemu-8.0 -M pc-7.2
|
||||
- $ qemu-8.0.1 -M pc-7.2 -> qemu-8.0.1 -M pc-7.2
|
||||
|
||||
Now the interesting ones. When the QEMU processes versions are
|
||||
different. For the 1st set, their fail and we can do nothing, both
|
||||
versions are released and we can't change anything.
|
||||
|
||||
- $ qemu-7.2 -M pc-7.2 -> qemu-8.0 -M pc-7.2
|
||||
- $ qemu-8.0 -M pc-7.2 -> qemu-7.2 -M pc-7.2
|
||||
|
||||
This two are the ones that work. The whole point of making the
|
||||
change in qemu-8.0.1 release was to fix this issue:
|
||||
|
||||
- $ qemu-7.2 -M pc-7.2 -> qemu-8.0.1 -M pc-7.2
|
||||
- $ qemu-8.0.1 -M pc-7.2 -> qemu-7.2 -M pc-7.2
|
||||
|
||||
But now we found that qemu-8.0 neither can migrate to qemu-7.2 not
|
||||
qemu-8.0.1.
|
||||
|
||||
- $ qemu-8.0 -M pc-7.2 -> qemu-8.0.1 -M pc-7.2
|
||||
- $ qemu-8.0.1 -M pc-7.2 -> qemu-8.0 -M pc-7.2
|
||||
|
||||
So, if we start a pc-7.2 machine in qemu-8.0 we can't migrate it to
|
||||
anything except to qemu-8.0.
|
||||
|
||||
Can we do better?
|
||||
|
||||
Yeap. If we know that we are going to do this migration:
|
||||
|
||||
- $ qemu-8.0 -M pc-7.2 -> qemu-8.0.1 -M pc-7.2
|
||||
|
||||
We can launch the appropriate devices with::
|
||||
|
||||
--device...,x-pci-e-err-unc-mask=on
|
||||
|
||||
And now we can receive a migration from 8.0. And from now on, we can
|
||||
do that migration to new machine types if we remember to enable that
|
||||
property for pc-7.2. Notice that we need to remember, it is not
|
||||
enough to know that the source of the migration is qemu-8.0. Think of
|
||||
this example:
|
||||
|
||||
$ qemu-8.0 -M pc-7.2 -> qemu-8.0.1 -M pc-7.2 -> qemu-8.2 -M pc-7.2
|
||||
|
||||
In the second migration, the source is not qemu-8.0, but we still have
|
||||
that "problem" and have that property enabled. Notice that we need to
|
||||
continue having this mark/property until we have this machine
|
||||
rebooted. But it is not a normal reboot (that don't reload QEMU) we
|
||||
need the machine to poweroff/poweron on a fixed QEMU. And from now
|
||||
on we can use the proper real machine.
|
||||
|
|
Loading…
Add table
Add a link
Reference in a new issue