qemu-img: make convert async

the convert process is currently completely implemented with sync operations. That means it reads one buffer and then writes it. No parallelism and each sync request takes as long as it takes until it is completed. This can be a big performance hit when the convert process reads and writes to devices which do not benefit from kernel readahead or pagecache. In our environment we heavily have the following two use cases when using qemu-img convert. a) reading from NFS and writing to iSCSI for deploying templates b) reading from iSCSI and writing to NFS for backups In both processes we use libiscsi and libnfs so we have no kernel cache. This patch changes the convert process to work with parallel running coroutines which can significantly improve performance for network storage devices: qemu-img (master) nfs -> iscsi 22.8 secs nfs -> ram 11.7 secs ram -> iscsi 12.3 secs qemu-img-async (8 coroutines, in-order write disabled) nfs -> iscsi 11.0 secs nfs -> ram 10.4 secs ram -> iscsi 9.0 secs This patches introduces 2 new cmdline parameters. The -m parameter to specify the number of coroutines running in parallel (defaults to 8). And the -W parameter to allow qemu-img to write to the target out of order rather than sequential. This improves performance as the writes do not have to wait for each other to complete. Signed-off-by: Peter Lieven <pl@kamp.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2025-08-08 18:23:57 -06:00 · 2017-02-28 13:40:07 +01:00 · 2017-02-28 13:40:07 +01:00 · 2d9187bc65
commit 2d9187bc65
parent 9514f2648c
3 changed files with 243 additions and 99 deletions
--- a/qemu-img.texi
+++ b/qemu-img.texi
@ -137,6 +137,12 @@ Parameters to convert subcommand:

@item -n
 Skip the creation of the target volume
+@item -m
+Number of parallel coroutines for the convert process
+@item -W
+Allow out-of-order writes to the destination. This option improves performance,
+but is only recommended for preallocated devices like host devices or other
+raw block devices.
@end table

 Parameters to dd subcommand:
@ -296,7 +302,7 @@ Error on reading data

@end table

-@item convert [-c] [-p] [-n] [-f @var{fmt}] [-t @var{cache}] [-T @var{src_cache}] [-O @var{output_fmt}] [-o @var{options}] [-s @var{snapshot_id_or_name}] [-l @var{snapshot_param}] [-S @var{sparse_size}] @var{filename} [@var{filename2} [...]] @var{output_filename}
+@item convert [-c] [-p] [-n] [-f @var{fmt}] [-t @var{cache}] [-T @var{src_cache}] [-O @var{output_fmt}] [-o @var{options}] [-s @var{snapshot_id_or_name}] [-l @var{snapshot_param}] [-m @var{num_coroutines}] [-W] [-S @var{sparse_size}] @var{filename} [@var{filename2} [...]] @var{output_filename}

 Convert the disk image @var{filename} or a snapshot @var{snapshot_param}(@var{snapshot_id_or_name} is deprecated)
 to disk image @var{output_filename} using format @var{output_fmt}. It can be optionally compressed (@code{-c}
@ -326,6 +332,14 @@ skipped. This is useful for formats such as @code{rbd} if the target
 volume has already been created with site specific options that cannot
 be supplied through qemu-img.

+Out of order writes can be enabled with @code{-W} to improve performance.
+This is only recommended for preallocated devices like host devices or other
+raw block devices. Out of order write does not work in combination with
+creating compressed images.
+
+@var{num_coroutines} specifies how many coroutines work in parallel during
+the convert process (defaults to 8).
+
@item dd [-f @var{fmt}] [-O @var{output_fmt}] [bs=@var{block_size}] [count=@var{blocks}] [skip=@var{blocks}] if=@var{input} of=@var{output}

 Dd copies from @var{input} file to @var{output} file converting it from