mirror of
https://github.com/Motorhead1991/qemu.git
synced 2025-12-26 01:18:36 -07:00
It should be safe to reenter qio_channel_yield() on io/channel read/write
path, so it's safe to reduce in_flight and allow attaching new aio
context. And no problem to allow drain itself: connection attempt is
not a guest request. Moreover, if remote server is down, we can hang
in negotiation, blocking drain section and provoking a dead lock.
How to reproduce the dead lock:
1. Create nbd-fault-injector.conf with the following contents:
[inject-error "mega1"]
event=data
io=readwrite
when=before
2. In one terminal run nbd-fault-injector in a loop, like this:
n=1; while true; do
echo $n; ((n++));
./nbd-fault-injector.py 127.0.0.1:10000 nbd-fault-injector.conf;
done
3. In another terminal run qemu-io in a loop, like this:
n=1; while true; do
echo $n; ((n++));
./qemu-io -c 'read 0 512' nbd://127.0.0.1:10000;
done
After some time, qemu-io will hang trying to drain, for example, like
this:
#3 aio_poll (ctx=0x55f006bdd890, blocking=true) at
util/aio-posix.c:600
#4 bdrv_do_drained_begin (bs=0x55f006bea710, recursive=false,
parent=0x0, ignore_bds_parents=false, poll=true) at block/io.c:427
#5 bdrv_drained_begin (bs=0x55f006bea710) at block/io.c:433
#6 blk_drain (blk=0x55f006befc80) at block/block-backend.c:1710
#7 blk_unref (blk=0x55f006befc80) at block/block-backend.c:498
#8 bdrv_open_inherit (filename=0x7fffba1563bc
"nbd+tcp://127.0.0.1:10000", reference=0x0, options=0x55f006be86d0,
flags=24578, parent=0x0, child_class=0x0, child_role=0,
errp=0x7fffba154620) at block.c:3491
#9 bdrv_open (filename=0x7fffba1563bc "nbd+tcp://127.0.0.1:10000",
reference=0x0, options=0x0, flags=16386, errp=0x7fffba154620) at
block.c:3513
#10 blk_new_open (filename=0x7fffba1563bc "nbd+tcp://127.0.0.1:10000",
reference=0x0, options=0x0, flags=16386, errp=0x7fffba154620) at
block/block-backend.c:421
And connection_co stack like this:
#0 qemu_coroutine_switch (from_=0x55f006bf2650, to_=0x7fe96e07d918,
action=COROUTINE_YIELD) at util/coroutine-ucontext.c:302
#1 qemu_coroutine_yield () at util/qemu-coroutine.c:193
#2 qio_channel_yield (ioc=0x55f006bb3c20, condition=G_IO_IN) at
io/channel.c:472
#3 qio_channel_readv_all_eof (ioc=0x55f006bb3c20, iov=0x7fe96d729bf0,
niov=1, errp=0x7fe96d729eb0) at io/channel.c:110
#4 qio_channel_readv_all (ioc=0x55f006bb3c20, iov=0x7fe96d729bf0,
niov=1, errp=0x7fe96d729eb0) at io/channel.c:143
#5 qio_channel_read_all (ioc=0x55f006bb3c20, buf=0x7fe96d729d28
"\300.\366\004\360U", buflen=8, errp=0x7fe96d729eb0) at
io/channel.c:247
#6 nbd_read (ioc=0x55f006bb3c20, buffer=0x7fe96d729d28, size=8,
desc=0x55f004f69644 "initial magic", errp=0x7fe96d729eb0) at
/work/src/qemu/master/include/block/nbd.h:365
#7 nbd_read64 (ioc=0x55f006bb3c20, val=0x7fe96d729d28,
desc=0x55f004f69644 "initial magic", errp=0x7fe96d729eb0) at
/work/src/qemu/master/include/block/nbd.h:391
#8 nbd_start_negotiate (aio_context=0x55f006bdd890,
ioc=0x55f006bb3c20, tlscreds=0x0, hostname=0x0,
outioc=0x55f006bf19f8, structured_reply=true,
zeroes=0x7fe96d729dca, errp=0x7fe96d729eb0) at nbd/client.c:904
#9 nbd_receive_negotiate (aio_context=0x55f006bdd890,
ioc=0x55f006bb3c20, tlscreds=0x0, hostname=0x0,
outioc=0x55f006bf19f8, info=0x55f006bf1a00, errp=0x7fe96d729eb0) at
nbd/client.c:1032
#10 nbd_client_connect (bs=0x55f006bea710, errp=0x7fe96d729eb0) at
block/nbd.c:1460
#11 nbd_reconnect_attempt (s=0x55f006bf19f0) at block/nbd.c:287
#12 nbd_co_reconnect_loop (s=0x55f006bf19f0) at block/nbd.c:309
#13 nbd_connection_entry (opaque=0x55f006bf19f0) at block/nbd.c:360
#14 coroutine_trampoline (i0=113190480, i1=22000) at
util/coroutine-ucontext.c:173
Note, that the hang may be
triggered by another bug, so the whole case is fixed only together with
commit "block/nbd: on shutdown terminate connection attempt".
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20200727184751.15704-3-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
|
||
|---|---|---|
| .. | ||
| monitor | ||
| accounting.c | ||
| aio_task.c | ||
| amend.c | ||
| backup-top.c | ||
| backup-top.h | ||
| backup.c | ||
| blkdebug.c | ||
| blklogwrites.c | ||
| blkreplay.c | ||
| blkverify.c | ||
| block-backend.c | ||
| block-copy.c | ||
| bochs.c | ||
| cloop.c | ||
| commit.c | ||
| copy-on-read.c | ||
| create.c | ||
| crypto.c | ||
| crypto.h | ||
| curl.c | ||
| dirty-bitmap.c | ||
| dmg-bz2.c | ||
| dmg-lzfse.c | ||
| dmg.c | ||
| dmg.h | ||
| file-posix.c | ||
| file-win32.c | ||
| filter-compress.c | ||
| gluster.c | ||
| io.c | ||
| io_uring.c | ||
| iscsi-opts.c | ||
| iscsi.c | ||
| linux-aio.c | ||
| Makefile.objs | ||
| mirror.c | ||
| nbd.c | ||
| nfs.c | ||
| null.c | ||
| nvme.c | ||
| parallels.c | ||
| parallels.h | ||
| qapi-sysemu.c | ||
| qapi.c | ||
| qcow.c | ||
| qcow2-bitmap.c | ||
| qcow2-cache.c | ||
| qcow2-cluster.c | ||
| qcow2-refcount.c | ||
| qcow2-snapshot.c | ||
| qcow2-threads.c | ||
| qcow2.c | ||
| qcow2.h | ||
| qed-check.c | ||
| qed-cluster.c | ||
| qed-l2-cache.c | ||
| qed-table.c | ||
| qed.c | ||
| qed.h | ||
| quorum.c | ||
| raw-format.c | ||
| rbd.c | ||
| replication.c | ||
| sheepdog.c | ||
| snapshot.c | ||
| ssh.c | ||
| stream.c | ||
| throttle-groups.c | ||
| throttle.c | ||
| trace-events | ||
| vdi.c | ||
| vhdx-endian.c | ||
| vhdx-log.c | ||
| vhdx.c | ||
| vhdx.h | ||
| vmdk.c | ||
| vpc.c | ||
| vvfat.c | ||
| win32-aio.c | ||
| write-threshold.c | ||