vfio/migration: Add VFIO migration pre-copy support

Pre-copy support allows the VFIO device data to be transferred while the
VM is running. This helps to accommodate VFIO devices that have a large
amount of data that needs to be transferred, and it can reduce migration
downtime.

Pre-copy support is optional in VFIO migration protocol v2.
Implement pre-copy of VFIO migration protocol v2 and use it for devices
that support it. Full description of it can be found in the following
Linux commit: 4db52602a607 ("vfio: Extend the device migration protocol
with PRE_COPY").

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Tested-by: YangHang Liu <yanghliu@redhat.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
This commit is contained in:
Avihai Horon 2023-06-21 14:12:00 +03:00 committed by Cédric Le Goater
parent 6cd1fe1159
commit eda7362af9
5 changed files with 190 additions and 22 deletions

View file

@ -7,12 +7,14 @@ the guest is running on source host and restoring this saved state on the
destination host. This document details how saving and restoring of VFIO
devices is done in QEMU.
Migration of VFIO devices currently consists of a single stop-and-copy phase.
During the stop-and-copy phase the guest is stopped and the entire VFIO device
data is transferred to the destination.
The pre-copy phase of migration is currently not supported for VFIO devices.
Support for VFIO pre-copy will be added later on.
Migration of VFIO devices consists of two phases: the optional pre-copy phase,
and the stop-and-copy phase. The pre-copy phase is iterative and allows to
accommodate VFIO devices that have a large amount of data that needs to be
transferred. The iterative pre-copy phase of migration allows for the guest to
continue whilst the VFIO device state is transferred to the destination, this
helps to reduce the total downtime of the VM. VFIO devices opt-in to pre-copy
support by reporting the VFIO_MIGRATION_PRE_COPY flag in the
VFIO_DEVICE_FEATURE_MIGRATION ioctl.
Note that currently VFIO migration is supported only for a single device. This
is due to VFIO migration's lack of P2P support. However, P2P support is planned
@ -29,10 +31,20 @@ VFIO implements the device hooks for the iterative approach as follows:
* A ``load_setup`` function that sets the VFIO device on the destination in
_RESUMING state.
* A ``state_pending_estimate`` function that reports an estimate of the
remaining pre-copy data that the vendor driver has yet to save for the VFIO
device.
* A ``state_pending_exact`` function that reads pending_bytes from the vendor
driver, which indicates the amount of data that the vendor driver has yet to
save for the VFIO device.
* An ``is_active_iterate`` function that indicates ``save_live_iterate`` is
active only when the VFIO device is in pre-copy states.
* A ``save_live_iterate`` function that reads the VFIO device's data from the
vendor driver during iterative pre-copy phase.
* A ``save_state`` function to save the device config space if it is present.
* A ``save_live_complete_precopy`` function that sets the VFIO device in
@ -111,8 +123,10 @@ Flow of state changes during Live migration
===========================================
Below is the flow of state change during live migration.
The values in the brackets represent the VM state, the migration state, and
The values in the parentheses represent the VM state, the migration state, and
the VFIO device state, respectively.
The text in the square brackets represents the flow if the VFIO device supports
pre-copy.
Live migration save path
------------------------
@ -124,11 +138,12 @@ Live migration save path
|
migrate_init spawns migration_thread
Migration thread then calls each device's .save_setup()
(RUNNING, _SETUP, _RUNNING)
(RUNNING, _SETUP, _RUNNING [_PRE_COPY])
|
(RUNNING, _ACTIVE, _RUNNING)
If device is active, get pending_bytes by .state_pending_exact()
(RUNNING, _ACTIVE, _RUNNING [_PRE_COPY])
If device is active, get pending_bytes by .state_pending_{estimate,exact}()
If total pending_bytes >= threshold_size, call .save_live_iterate()
[Data of VFIO device for pre-copy phase is copied]
Iterate till total pending bytes converge and are less than threshold
|
On migration completion, vCPU stops and calls .save_live_complete_precopy for