qemu/migration
Steve Sistare 624e6e654e migration: cpr-transfer mode
Add the cpr-transfer migration mode, which allows the user to transfer
a guest to a new QEMU instance on the same host with minimal guest pause
time, by preserving guest RAM in place, albeit with new virtual addresses
in new QEMU, and by preserving device file descriptors.  Pages that were
locked in memory for DMA in old QEMU remain locked in new QEMU, because the
descriptor of the device that locked them remains open.

cpr-transfer preserves memory and devices descriptors by sending them to
new QEMU over a unix domain socket using SCM_RIGHTS.  Such CPR state cannot
be sent over the normal migration channel, because devices and backends
are created prior to reading the channel, so this mode sends CPR state
over a second "cpr" migration channel.  New QEMU reads the cpr channel
prior to creating devices or backends.  The user specifies the cpr channel
in the channel arguments on the outgoing side, and in a second -incoming
command-line parameter on the incoming side.

The user must start old QEMU with the the '-machine aux-ram-share=on' option,
which allows anonymous memory to be transferred in place to the new process
by transferring a memory descriptor for each ram block.  Memory-backend
objects must have the share=on attribute, but memory-backend-epc is not
supported.

The user starts new QEMU on the same host as old QEMU, with command-line
arguments to create the same machine, plus the -incoming option for the
main migration channel, like normal live migration.  In addition, the user
adds a second -incoming option with channel type "cpr".  This CPR channel
must support file descriptor transfer with SCM_RIGHTS, i.e. it must be a
UNIX domain socket.

To initiate CPR, the user issues a migrate command to old QEMU, adding
a second migration channel of type "cpr" in the channels argument.
Old QEMU stops the VM, saves state to the migration channels, and enters
the postmigrate state.  New QEMU mmap's memory descriptors, and execution
resumes.

The implementation splits qmp_migrate into start and finish functions.
Start sends CPR state to new QEMU, which responds by closing the CPR
channel.  Old QEMU detects the HUP then calls finish, which connects the
main migration channel.

In summary, the usage is:

  qemu-system-$arch -machine aux-ram-share=on ...

  start new QEMU with "-incoming <main-uri> -incoming <cpr-channel>"

  Issue commands to old QEMU:
    migrate_set_parameter mode cpr-transfer

    {"execute": "migrate", ...
        {"channel-type": "main"...}, {"channel-type": "cpr"...} ... }

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Link: https://lore.kernel.org/r/1736967650-129648-17-git-send-email-steven.sistare@oracle.com
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2025-01-29 11:56:24 -03:00
..
block-active.c migration/block: Rewrite disk activation 2025-01-09 17:38:57 -03:00
block-dirty-bitmap.c include: Rename sysemu/ -> system/ 2024-12-20 17:44:56 +01:00
channel-block.c io: follow coroutine AioContext in qio_channel_yield() 2023-09-07 20:32:11 -05:00
channel-block.h migration: introduce a QIOChannel impl for BlockDriverState VMState 2022-06-22 19:33:43 +01:00
channel.c migration: Fix migration_channel_read_peek() error path 2024-01-04 09:52:42 +08:00
channel.h migration: check magic value for deciding the mapping of channels 2023-02-06 19:22:57 +01:00
colo-failover.c migration/colo: Improve an x-colo-lost-heartbeat error message 2023-02-23 14:10:17 +01:00
colo-stubs.c migration/colo: make colo_incoming_co() return void 2024-05-22 17:34:31 -03:00
colo.c migration/block: Rewrite disk activation 2025-01-09 17:38:57 -03:00
cpr-transfer.c migration: cpr-transfer save and load 2025-01-29 11:43:05 -03:00
cpr.c migration: cpr-transfer mode 2025-01-29 11:56:24 -03:00
cpu-throttle.c include: Rename sysemu/ -> system/ 2024-12-20 17:44:56 +01:00
dirtyrate.c include: Rename sysemu/ -> system/ 2024-12-20 17:44:56 +01:00
dirtyrate.h include: Rename sysemu/ -> system/ 2024-12-20 17:44:56 +01:00
exec.c migration: simplify exec migration functions 2024-03-04 07:12:40 +01:00
exec.h migration: convert exec backend to accept MigrateAddress. 2023-11-02 11:35:04 +01:00
fd.c migration: Allow pipes to keep working for fd migrations 2024-11-25 16:21:55 -05:00
fd.h migration: Revert mapped-ram multifd support to fd: URI 2024-03-22 12:12:08 -04:00
file.c migration/multifd: Pass in MultiFDPages_t to file_write_ramblock_iov 2024-09-03 16:24:35 -03:00
file.h migration/multifd: Pass in MultiFDPages_t to file_write_ramblock_iov 2024-09-03 16:24:35 -03:00
global_state.c include: Rename sysemu/ -> system/ 2024-12-20 17:44:56 +01:00
meson.build migration: cpr-transfer save and load 2025-01-29 11:43:05 -03:00
migration-hmp-cmds.c include: Rename sysemu/ -> system/ 2024-12-20 17:44:56 +01:00
migration-stats.c migration: migration_rate_limit_reset() don't need the QEMUFile 2023-10-31 08:44:33 +01:00
migration-stats.h migration: Remove transferred atomic counter 2023-10-31 08:44:33 +01:00
migration.c migration: cpr-transfer mode 2025-01-29 11:56:24 -03:00
migration.h migration: cpr-transfer mode 2025-01-29 11:56:24 -03:00
multifd-nocomp.c multifd: bugfix for migration using compression methods 2025-01-09 17:40:15 -03:00
multifd-qatzip.c multifd: bugfix for incorrect migration data with qatzip compression 2025-01-09 17:40:27 -03:00
multifd-qpl.c multifd: bugfix for incorrect migration data with QPL compression 2025-01-09 17:40:21 -03:00
multifd-uadk.c migration/multifd: Fix compile error caused by page_size usage 2025-01-09 17:37:50 -03:00
multifd-zero-page.c migration/multifd: Move pages accounting into multifd_send_zero_page_detect() 2024-09-03 16:24:35 -03:00
multifd-zlib.c migration/multifd: Make MultiFDMethods const 2024-09-03 16:24:36 -03:00
multifd-zstd.c migration/multifd: Fix loop conditions in multifd_zstd_send_prepare and multifd_zstd_recv 2024-09-18 14:27:24 -04:00
multifd.c migration/multifd: Fix compat with QEMU < 9.0 2025-01-09 17:38:35 -03:00
multifd.h migration/multifd: Cleanup src flushes on condition check 2025-01-09 17:38:27 -03:00
options.c migration: cpr-transfer mode 2025-01-29 11:56:24 -03:00
options.h migration: Use device_class_set_props_n 2024-12-19 19:33:37 +01:00
page_cache.c migration: Fix cache_init()'s "Failed to allocate" error messages 2021-02-08 11:19:51 +00:00
page_cache.h migration: Clean up signed vs. unsigned XBZRLE cache-size 2021-02-08 11:19:51 +00:00
postcopy-ram.c include: Rename sysemu/ -> system/ 2024-12-20 17:44:56 +01:00
postcopy-ram.h migration/postcopy: Add postcopy-recover-setup phase 2024-06-21 09:47:59 -03:00
qemu-file.c migration: SCM_RIGHTS for QEMUFile 2025-01-29 11:43:05 -03:00
qemu-file.h migration: SCM_RIGHTS for QEMUFile 2025-01-29 11:43:05 -03:00
ram.c migration: cpr-transfer mode 2025-01-29 11:56:24 -03:00
ram.h migration/ram: Move RAM_SAVE_FLAG* into ram.h 2025-01-09 17:38:15 -03:00
rdma.c migration/rdma: Fix a memory issue for migration 2024-03-11 14:41:40 -04:00
rdma.h migration/ram: Move RAM_SAVE_FLAG* into ram.h 2025-01-09 17:38:15 -03:00
savevm.c migration: fix -Werror=maybe-uninitialized 2025-01-29 11:43:03 -03:00
savevm.h migration: Add Error** argument to qemu_savevm_state_setup() 2024-04-23 18:36:01 -04:00
socket.c migration: Remove unused socket_send_channel_create_sync 2024-10-08 15:28:55 -04:00
socket.h migration: Remove unused socket_send_channel_create_sync 2024-10-08 15:28:55 -04:00
target.c migration: Add migration prefix to functions in target.c 2023-09-11 08:34:06 +02:00
threadinfo.c migration/multifd: Protect accesses to migration_threads 2023-07-26 10:55:56 +02:00
threadinfo.h migration/multifd: Protect accesses to migration_threads 2023-07-26 10:55:56 +02:00
tls.c migration: Drop unused parameter for migration_tls_client_create() 2023-05-03 11:24:20 +02:00
tls.h migration: Drop unused parameter for migration_tls_client_create() 2023-05-03 11:24:20 +02:00
trace-events migration: cpr-transfer save and load 2025-01-29 11:43:05 -03:00
trace.h trace: switch position of headers to what Meson requires 2020-08-21 06:18:24 -04:00
vmstate-types.c migration: cpr-transfer mode 2025-01-29 11:56:24 -03:00
vmstate.c migration: Fix arrays of pointers in JSON writer 2025-01-09 17:39:54 -03:00
xbzrle.c migration/xbzrle: Use i386 host/cpuinfo.h 2023-05-23 16:51:18 -07:00
xbzrle.h migration/xbzrle: Use i386 host/cpuinfo.h 2023-05-23 16:51:18 -07:00
yank_functions.c migration/yank: Use channel features 2024-01-29 11:02:12 +08:00
yank_functions.h migration: Move the yank unregister of channel_close out 2021-07-26 12:45:03 +01:00