Commit graph

3418 commits

Author SHA1 Message Date
Paolo Bonzini
43771539d4 exec: protect mru_block with RCU
Hence, freeing a RAMBlock has to be switched to call_rcu.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-02-16 17:30:19 +01:00
Paolo Bonzini
439c5e02d5 rcu: add g_free_rcu
This simplifies calling g_free from an RCU callback.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-02-16 17:30:19 +01:00
Mike Day
341774fe6c rcu: introduce RCU-enabled QLIST
Add RCU-enabled variants on the existing bsd DQ facility. Each
operation has the same interface as the existing (non-RCU)
version. Also, each operation is implemented as macro.

Using the RCU-enabled QLIST, existing QLIST users will be able to
convert to RCU without using a different list interface.

Signed-off-by: Mike Day <ncmike@ncultra.org>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-02-16 17:30:19 +01:00
Paolo Bonzini
79e2b9aecc exec: RCUify AddressSpaceDispatch
Note that even after this patch, most callers of address_space_*
functions must still be under the big QEMU lock, otherwise the memory
region returned by address_space_translate can disappear as soon as
address_space_translate returns.  This will be fixed in the next part
of this series.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-02-16 17:30:19 +01:00
Paolo Bonzini
9d82b5a792 exec: make iotlb RCU-friendly
After the previous patch, TLBs will be flushed on every change to
the memory mapping.  This patch augments that with synchronization
of the MemoryRegionSections referred to in the iotlb array.

With this change, it is guaranteed that iotlb_to_region will access
the correct memory map, even once the TLB will be accessed outside
the BQL.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-02-16 17:30:19 +01:00
Paolo Bonzini
76e5c76f2e exec: introduce cpu_reload_memory_map
This for now is a simple TLB flush.  This can change later for two
reasons:

1) an AddressSpaceDispatch will be cached in the CPUState object

2) it will not be possible to do tlb_flush once the TCG-generated code
runs outside the BQL.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-02-16 17:30:19 +01:00
Paolo Bonzini
5cd5e70159 pci: split shpc_cleanup and shpc_free
object_unparent should not be called until the parent device is going to be
destroyed.  Only remove the capability and do memory_region_del_subregion
at unrealize time.  Freeing the data structures is left in shpc_free, to
be called from the instance_finalize callback.

Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Matthew Rosato <mjrosato@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-02-16 17:30:14 +01:00
Max Reitz
c0191e763b block: Remove "growable" from BDS
Now that request clamping is done in the BlockBackend, the "growable"
field can be removed from the BlockDriverState. All BDSs are now treated
as being "growable" (that is, they are allowed to grow; they are not
necessarily actually able to).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1423162705-32065-16-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-02-16 15:07:19 +00:00
Max Reitz
4c7b7e9b94 qemu-io: Use BlockBackend
qemu-io should behave like a guest, therefore it should use BlockBackend
to access the block layer.

There are a couple of places where that is infeasible: First, the
bdrv_debug_* functions could theoretically be mirrored in the
BlockBackend, but since these are functions internal to the block layer,
they should not be visible externally (qemu-io as a test tool is exempt
from this).

Second, bdrv_get_info() and bdrv_get_specific_info() work on a single
BDS alone, therefore they should stay BDS-specific.

Third, bdrv_is_allocated() mainly works on a single BDS as well. Some
data may be passed through from the BDS's file (if sectors which are
apparently allocated in the file are not really allocated there but just
zero).

[Fixed conflicts around block_acct_start() usage from Fam Zheng's
"qemu-io: Account IO by aio_read and aio_write" commit.  Use
BlockBackend and blk_get_stats() instead of BlockDriverState.
--Stefan]

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1423162705-32065-14-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-02-16 15:07:19 +00:00
Max Reitz
b65a5e12a4 block: Add Error parameter to bdrv_find_protocol()
The argument given to bdrv_find_protocol() is just a file name, which
makes it difficult for the caller to reconstruct what protocol
bdrv_find_protocol() was hoping to find. This patch adds an Error
parameter to that function to solve this issue.

Suggested-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1423162705-32065-4-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-02-16 15:07:18 +00:00
Max Reitz
ca49a4fdb3 block: Add blk_new_open()
blk_new_with_bs() creates a BlockBackend with an empty BlockDriverState
attached to it. Empty BDSs are not nice, therefore add an alternative
function which combines blk_new_with_bs() with bdrv_open().

Note: In contrast to bdrv_open() which takes a BlockDriver parameter,
blk_new_open() does not take such a parameter. This is because
bdrv_open() opens a BlockDriverState, therefore it is natural to be able
to set the BlockDriver for that BDS. The fact that bdrv_open() can open
more than a single BDS is merely some form of a byproduct.

blk_new_open() on the other hand is intended to be used to create a
whole tree of BlockDriverStates. Therefore, setting a single BlockDriver
does not make much sense. Instead, the drivers to be used for each of
the nodes must be configured through the "options" QDict; including the
driver of the root BDS.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 1423162705-32065-3-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-02-16 15:07:18 +00:00
Max Reitz
1ef01253eb block: Lift some BDS functions to the BlockBackend
Create the blk_* counterparts for the following bdrv_* functions (which
make sense to call on the BlockBackend level):
- bdrv_co_write_zeroes()
- bdrv_write_compressed()
- bdrv_truncate()
- bdrv_nb_sectors()
- bdrv_discard()
- bdrv_load_vmstate()
- bdrv_save_vmstate()

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1423162705-32065-2-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-02-16 15:07:18 +00:00
Cornelia Huck
b0e5d90ebc dataplane: endianness-aware accesses
The vring.c code currently assumes that guest and host endianness match,
which is not true for a number of cases:

- emulating targets with a different endianness than the host
- bi-endian targets, where the correct endianness depends on the virtio
  device
- upcoming support for the virtio-1 standard mandates little-endian
  accesses even for big-endian targets and hosts

Make sure to use accessors that depend on the virtio device.

Note that dataplane now needs to be built per-target.

Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Fam Zheng <famz@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Tested-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1422289602-17874-2-git-send-email-cornelia.huck@de.ibm.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-02-16 15:07:16 +00:00
Max Reitz
f53a829bb9 nbd: Drop BDS backpointer
Before this patch, the "opaque" pointer in an NBD BDS points to a
BDRVNBDState, which contains an NbdClientSession object, which in turn
contains a pointer to the BDS. This pointer may become invalid due to
bdrv_swap(), so drop it, and instead pass the BDS directly to the
nbd-client.c functions which then retrieve the NbdClientSession object
from there.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1423256778-3340-2-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-02-16 14:36:03 +00:00
Zhoujian
f824e8ed03 qom: Fix typo, 'my_class_init' -> 'derived_class_init'
Signed-off-by: Zhoujian <jianjay.zhou@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-02-16 15:07:09 +01:00
Gonglei
2779672fa3 vnc: introduce an wrapper for auto assign vnc id
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2015-02-16 08:47:59 +01:00
Hervé Poussineau
b19c1c08de isa: remove isa_mem_base variable
Now that isa_mem_base variable is always 0, we can remove its usage.

Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
2015-02-13 14:09:28 +00:00
Hervé Poussineau
bb2ed009e7 isa: add memory space parameter to isa_bus_new
Currently, keep current behaviour by always using get_system_memory().

Also use QOM casts when possible.

Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
2015-02-13 14:09:27 +00:00
Peter Maydell
cd2d554127 Convert to linked list.
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJU3Y5ZAAoJEK0ScMxN0CebuvYH/iQa+ZA1yavKsOa2xtB4tq0K
 gwS0hwCmAtS7CV6iM8Wrte/3HJr4xtnQGeBQgatmM7PrDx6o6cjnxwga2mBx6EQv
 fIhWlNPlctAgd11ocuhXXUdGv2yL8TWf+WLTDcO6CS6zbdYp+42sOvrWGKL1UQbz
 RAJhiK5BZsDHv2DodqZBk7JZr9l4hAohARs627xG7uq/j6vggEHo6y/9Z6mETcRX
 fkIwx/qjUmDJbBlFgtgEPBYMX/AWDu7XbhU1kT+EjcrFwEjEd0mxi52NN8uEyGQ9
 xgAekmPvcNR0UpkVl5pIr64/wTkd+m2mBG8CVENG3ryDKePJk64jVth+I/8Z/Qo=
 =+mGW
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/rth/tags/pull-tcg-20150212' into staging

Convert to linked list.

# gpg: Signature made Fri 13 Feb 2015 05:40:41 GMT using RSA key ID 4DD0279B
# gpg: Good signature from "Richard Henderson <rth7680@gmail.com>"
# gpg:                 aka "Richard Henderson <rth@redhat.com>"
# gpg:                 aka "Richard Henderson <rth@twiddle.net>"

* remotes/rth/tags/pull-tcg-20150212:
  tcg: Remove unused opcodes
  tcg: Implement insert_op_before
  tcg: Remove opcodes instead of noping them out
  tcg: Put opcodes in a linked list
  tcg: Introduce tcg_op_buf_count and tcg_op_buf_full
  tcg: Move emit of INDEX_op_end into gen_tb_end
  tcg: Reduce ifdefs in tcg-op.c
  tcg: Move some opcode generation functions out of line

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-02-13 11:44:50 +00:00
Alexander Graf
4ab29b8214 arm: Add PCIe host bridge in virt machine
Now that we have a working "generic" PCIe host bridge driver, we can plug
it into ARM's virt machine to always have PCIe available to normal ARM VMs.

I've successfully managed to expose a Bochs VGA device, XHCI and an e1000
into an AArch64 VM with this and they all lived happily ever after.

Signed-off-by: Alexander Graf <agraf@suse.de>
Tested-by: Claudio Fontana <claudio.fontana@huawei.com>
[PMM: Squashed in fix for off-by-one error in bus-range DT property
 from Laszlo Ersek <lersek@redhat.com>]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-02-13 05:46:08 +00:00
Alexander Graf
4d8fde1126 pci: Add generic PCIe host bridge
With simple exposure of MMFG, ioport window, mmio window and an IRQ line we
can successfully create a workable PCIe host bridge that can be mapped anywhere
and only needs to get described to the OS using whatever means it likes.

This patch implements such a "generic" host bridge. It handles 4 legacy IRQ
lines. MSIs need to be handled external to the host bridge.

This device is particularly useful for the "pci-host-ecam-generic" driver in
Linux.

Signed-off-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
Tested-by: Claudio Fontana <claudio.fontana@huawei.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-02-13 05:46:07 +00:00
Alexander Graf
bf439db499 pci: Allocate PCIe host bridge PCI ID
We are going to introduce a PCIe host controller that doesn't exist that
way in real hardware, but still needs to expose some PCIe root device which
has PCI IDs.

Allocate a PCI ID in the Red Hat space that we use for other devices of this
kind.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-02-13 05:46:07 +00:00
Richard Henderson
c45cb8bb89 tcg: Put opcodes in a linked list
The previous setup required ops and args to be completely sequential,
and was error prone when it came to both iteration and optimization.

Reviewed-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2015-02-12 21:21:38 -08:00
Richard Henderson
0a7df5da98 tcg: Move emit of INDEX_op_end into gen_tb_end
Reviewed-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2015-02-12 21:21:38 -08:00
Gonglei
9143d5f0f1 vhost-scsi: add a property for booting
Because Qemu only accept an wwpn argument for vhost-scsi, we
cannot assign a tpgt. That's say tpg is transparent for Qemu, Qemu
doesn't know which tpg can boot, but vhost-scsi driver module
doesn't know too for one assigned wwpn.

At present, we assume that the first tpg can boot only, and add
a boot_tpgt property that defaults to 0. Of course, people can
pass a valid value by qemu command line.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-02-11 21:47:52 +01:00
Gonglei
1956cf6fa1 vhost-scsi: expose the TYPE_FW_PATH_PROVIDER interface
In the way, we can make the bootindex property take effect.
At the meanwhile, the firmware path name of vhost-scsi is
"channel@channel/vhost-scsi@target,lun".

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-02-11 21:47:51 +01:00
Gonglei
d4433f3211 vhost-scsi: add bootindex property
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-02-11 21:47:51 +01:00
Gonglei
0be63901d2 qdev: support to get a device firmware path directly
commit 6b1566c (qdev: Introduce FWPathProvider interface) did a
good job for supproting to get firmware path on some different
architectures.

Moreover further more, we can use the interface to get firmware
path name for a device which isn't attached a specific bus,
such as virtio-bus, scsi-bus etc.

When the device (such as vhost-scsi) realize the TYPE_FW_PATH_PROVIDER
interface, we should introduce a new function to get the correct firmware
path name for it.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-02-11 21:47:51 +01:00
Peter Maydell
449008f864 RCU fixes and cleanup (Paolo Bonzini)
Switch to v2 IOMMU interface (Alex Williamson)
 DEBUG build fix (Alexey Kardashevskiy)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJU2kHCAAoJECObm247sIsiqEgP/j+b8PHknblPJ91t3NoG/S71
 cORhqKPZpsWMEDrzpAUNX/EZ6G7RR6ZD7UsV6BQ1FkxXGiPnA6cnrlhm9uhWwiDb
 GOiYdA9hFBuxZw8Wc1l6HXRWk/xn3hWFnV+JxxVskS0/tC1OvBPkDoTbCsGgKLbG
 A9S1c981bt7VCDxemo0Z4shTlmpUXtlyFpdqRNj1ATAKPbm2K5jT2ZHLIguJAv6y
 x9gBfB2swmI56afmpS2cU2j2MPjovJSRrkmvUhjHOMbYMhhs/gvDuwQpVxCNfW+2
 w+8NnKxjnOKnCcYvPI+NXziFMpx6FitshYwfCgw8rpmDJvYuweDkXtz08U8I7ECW
 GuDpRuJayyadG8a/JqnLrG0Ekcw35WCje4OLdbCBwxfpdCn/xFKYGCZTazI1KLAx
 tt8iKp4N8tHc/Iptw2ZE4Ow7Bw6/73mX/tcm7D5RNjUpktXDT4EIV22Hq5aTM2Kp
 zuWgizBtQuTgTrMuGPaUG915iyfeemFsOPdzX+z/2Dxl+Cd0qrJhE/aS2wdKB6F7
 eMXcRED+tkXuqk+PxiHQv82eUiMrTgmGbkByjUTJo4xS0/9SxUa8F8n1tAfIS0Di
 X9MZbpajk0udFLpz8BxYaWO4H+1VVYnvPBGwA/7O586UB20ouSeqn6jRBHAHvgGF
 bDIdLQgBK9wthjj0uxd3
 =ol6u
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/awilliam/tags/vfio-update-20150210.0' into staging

RCU fixes and cleanup (Paolo Bonzini)
Switch to v2 IOMMU interface (Alex Williamson)
DEBUG build fix (Alexey Kardashevskiy)

# gpg: Signature made Tue 10 Feb 2015 17:37:06 GMT using RSA key ID 3BB08B22
# gpg: Good signature from "Alex Williamson <alex.williamson@redhat.com>"
# gpg:                 aka "Alex Williamson <alex@shazbot.org>"
# gpg:                 aka "Alex Williamson <alwillia@redhat.com>"
# gpg:                 aka "Alex Williamson <alex.l.williamson@gmail.com>"

* remotes/awilliam/tags/vfio-update-20150210.0:
  vfio: Fix debug message compile error
  vfio: Use vfio type1 v2 IOMMU interface
  vfio: unmap and free BAR data in instance_finalize
  vfio: free dynamically-allocated data in instance_finalize
  vfio: cleanup vfio_get_device error path, remove vfio_populate_device callback
  memory: unregister AddressSpace MemoryListener within BQL

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-02-11 05:14:41 +00:00
Paolo Bonzini
217e9fdcad vfio: cleanup vfio_get_device error path, remove vfio_populate_device callback
Now that vfio_put_base_device is called unconditionally at instance_finalize
time, it can be called twice if vfio_populate_device fails.  This works
but it is slightly harder to follow.

Change vfio_get_device to not touch the vbasedev struct until it will
definitely succeed, moving the vfio_populate_device call back to vfio-pci.
This way, vfio_put_base_device will only be called once.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-02-10 10:25:44 -07:00
Paolo Bonzini
6e48e8f9e0 memory: unregister AddressSpace MemoryListener within BQL
address_space_destroy_dispatch is called from an RCU callback and hence
outside the iothread mutex (BQL).  However, after address_space_destroy
no new accesses can hit the destroyed AddressSpace so it is not necessary
to observe changes to the memory map.  Move the memory_listener_unregister
call earlier, to make it thread-safe again.

Reported-by: Alex Williamson <alex.williamson@redhat.com>
Fixes: 374f2981d1
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2015-02-10 10:25:44 -07:00
Stefan Weil
6afc14e92a migration: Fix warning caused by missing declaration of vmstate_dummy
Warning from the Sparse static analysis tool:

stubs/vmstate.c:4:26: warning:
 symbol 'vmstate_dummy' was not declared. Should it be static?

Cc: Juan Quintela <quintela@redhat.com>
Cc: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2015-02-10 10:26:05 +03:00
Peter Maydell
3d815ac82b Block patches for 2.3
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJU1PZiAAoJEH8JsnLIjy/Ww4YQAKPx878lJRccUqoTTVOTNGon
 moDQfutErwksljNl6LmyulUxTdPIpI8H7apLkOd7+HHD9TIk1wFXyFmadnwLfsKC
 QYj9DHFAX7+5mLMol+bDFqODmnj7BttQ+xidGJPGBRhX1fju9cRJW2ztajE21N/1
 Pj3x8nvIkmshG4WU1c3etT7lTzKLrhtMatzs//nAGJpiRnyZMAOtwDANnFnCc5Y0
 Ns4qB9lTOmUa5l9POuuzl/pLhtSSe95BI9lKeRG86ZYie4en35po/aVDbhGSDdAP
 2y3SNi7hro51wQKbV5B0fyp5TJiJZu4jnuGK8mpC8VtMLsT233TuQruPySgMmKh3
 uJpCJWRxcXg4JlLEJzN8uenz3ay3OnSHXZ/FS3eHKveQO01sTmEgkIBQNjCbvIYT
 Jl/Rpf3InXd9L/w2FFwGs0akY0V6nbp+KzCF7Ubdi6OeZJrQrizTKOpmvWPQusES
 FUZNEqtEhHNCzJuraCBxd1+7Y5q0np/IO0NQted8ypM0gKvjcYsjyzEm6JapgjVl
 nf3oM0g6GwpuTRzSm1GfiFE3nVNFDV+Rw8VZKb7sD/Vndl4681hortx8zJHGyTI6
 N9sCtlDa622oXwDaLIjuG8fjFkBODMSCwbH0swH42AwnOsTTWiqi2r/J7dZ3CJi/
 qBC7h9W9ZbxBo6VwEcvW
 =RmXU
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging

Block patches for 2.3

# gpg: Signature made Fri 06 Feb 2015 17:14:10 GMT using RSA key ID C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"

* remotes/kevin/tags/for-upstream: (47 commits)
  block/raw-posix.c: Fix raw_getlength() on Mac OS X block devices
  block: Eliminate silly QERR_ macros used for encryption keys
  block: New bdrv_add_key(), convert monitor to use it
  blockdev: Eliminate silly QERR_BLOCK_JOB_NOT_ACTIVE macro
  blockdev: Give find_block_job() an Error ** parameter
  qcow2: Rewrite qcow2_alloc_bytes()
  block: Give always priority to unused entries in the qcow2 L2 cache
  nbd: fix max_discard/max_transfer_length
  block: introduce BDRV_REQUEST_MAX_SECTORS
  nbd: Improve error messages
  iotests: Fix 104 for NBD
  iotests: Fix 100 for nbd
  iotests: Fix 083
  block: fix off-by-one error in qcow and qcow2
  qemu-iotests: add 116 invalid QED input file tests
  qed: check for header size overflow
  block/dmg: improve zeroes handling
  block/dmg: support bzip2 block entry types
  block/dmg: factor out block type check
  block/dmg: use SectorNumber from BLKX header
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-02-06 18:06:07 +00:00
Markus Armbruster
b1ca639184 block: Eliminate silly QERR_ macros used for encryption keys
The QERR_ macros are leftovers from the days of "rich" error objects.
They're used with error_set() and qerror_report(), and expand into the
first *two* arguments.  This trickiness has become pointless.  Clean
up QERR_DEVICE_ENCRYPTED and QERR_DEVICE_NOT_ENCRYPTED.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1422524221-8566-5-git-send-email-armbru@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2015-02-06 11:46:32 -05:00
Markus Armbruster
4d2855a348 block: New bdrv_add_key(), convert monitor to use it
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1422524221-8566-4-git-send-email-armbru@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2015-02-06 11:46:32 -05:00
Markus Armbruster
2e3a0266bd blockdev: Eliminate silly QERR_BLOCK_JOB_NOT_ACTIVE macro
The QERR_ macros are leftovers from the days of "rich" error objects.
They're used with error_set() and qerror_report(), and expand into the
first *two* arguments.  This trickiness has become pointless.  Clean
this one up.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1422524221-8566-3-git-send-email-armbru@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2015-02-06 11:46:32 -05:00
Peter Lieven
75af1f34cd block: introduce BDRV_REQUEST_MAX_SECTORS
we check and adjust request sizes at several places with
sometimes inconsistent checks or default values:
 INT_MAX
 INT_MAX >> BDRV_SECTOR_BITS
 UINT_MAX >> BDRV_SECTOR_BITS
 SIZE_MAX >> BDRV_SECTOR_BITS

This patches introdocues a macro for the maximal allowed sectors
per request and uses it at several places.

Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-02-06 17:24:22 +01:00
Max Reitz
1ce52846d3 nbd: Improve error messages
This patch makes use of the Error object for nbd_receive_negotiate() so
that errors during negotiation look nicer.

Furthermore, this patch adds an additional error message if the received
magic was wrong, but would be correct for the other protocol version,
respectively: So if an export name was specified, but the NBD server
magic corresponds to an old handshake, this condition is explicitly
signaled to the user, and vice versa.

As these messages are now part of the "Could not open image" error
message, additional filtering has to be employed in iotest 083, which
this patch does as well.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-02-06 17:24:22 +01:00
Francesco Romani
e2462113b2 block: add event when disk usage exceeds threshold
Managing applications, like oVirt (http://www.ovirt.org), make extensive
use of thin-provisioned disk images.
To let the guest run smoothly and be not unnecessarily paused, oVirt sets
a disk usage threshold (so called 'high water mark') based on the occupation
of the device,  and automatically extends the image once the threshold
is reached or exceeded.

In order to detect the crossing of the threshold, oVirt has no choice but
aggressively polling the QEMU monitor using the query-blockstats command.
This lead to unnecessary system load, and is made even worse under scale:
deployments with hundreds of VMs are no longer rare.

To fix this, this patch adds:
* A new monitor command `block-set-write-threshold', to set a mark for
  a given block device.
* A new event `BLOCK_WRITE_THRESHOLD', to report if a block device
  usage exceeds the threshold.
* A new `write_threshold' field into the `BlockDeviceInfo' structure,
  to report the configured threshold.

This will allow the managing application to use smarter and more
efficient monitoring, greatly reducing the need of polling.

[Updated qemu-iotests 067 output to add the new 'write_threshold'
property. --Stefan]
[Changed g_assert_false() to !g_assert() to fix the build on older glib
versions. --Kevin]

Signed-off-by: Francesco Romani <fromani@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1421068273-692-1-git-send-email-fromani@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-02-06 17:24:21 +01:00
Peter Lieven
c99495ac1b virtio-blk: add a knob to disable request merging
this adds a knob to disable request merging for debugging or benchmarks if dedired.

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-02-06 17:24:21 +01:00
Peter Lieven
95f7142abc virtio-blk: introduce multiread
this patch finally introduces multiread support to virtio-blk. While
multiwrite support was there for a long time, read support was missing.

The complete merge logic is moved into virtio-blk.c which has
been the only user of request merging ever since. This is required
to be able to merge chunks of requests and immediately invoke callbacks
for those requests. Secondly, this is required to switch to
direct invocation of coroutines which is planned at a later stage.

The following benchmarks show the performance of running fio with
4 worker threads on a local ram disk. The numbers show the average
of 10 test runs after 1 run as warmup phase.

              |        4k        |       64k        |        4k
MB/s          | rd seq | rd rand | rd seq | rd rand | wr seq | wr rand
--------------+--------+---------+--------+---------+--------+--------
master        | 1221   | 1187    | 4178   | 4114    | 1745   | 1213
multiread     | 1829   | 1189    | 4639   | 4110    | 1894   | 1216

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-02-06 17:24:21 +01:00
Peter Lieven
454057b7d9 block-backend: expose bs->bl.max_transfer_length
Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-02-06 17:24:21 +01:00
Peter Lieven
d901f3c457 hw/virtio-blk: add a constant for max number of merged requests
As it was not obvious (at least for me) where the 32 comes from;
add a constant for it.

Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-02-06 17:24:21 +01:00
Peter Lieven
f4564d53c6 block: add accounting for merged requests
Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-02-06 17:24:21 +01:00
Peter Maydell
a2f2d288b5 softfloat: expand out STATUS macro
Expand out and remove the STATUS macro.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2015-02-06 16:11:38 +00:00
Peter Maydell
ff32e16e86 softfloat: expand out STATUS_VAR
Expand out and remove the STATUS_VAR macro.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2015-02-06 16:11:38 +00:00
Peter Maydell
e5a41ffa87 softfloat: Expand out the STATUS_PARAM macro
Expand out STATUS_PARAM wherever it is used and delete the definition.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2015-02-06 16:11:38 +00:00
Alexander Graf
8118f0950f migration: Append JSON description of migration stream
One of the annoyances of the current migration format is the fact that
it's not self-describing. In fact, it's not properly describing at all.
Some code randomly scattered throughout QEMU elaborates roughly how to
read and write a stream of bytes.

We discussed an idea during KVM Forum 2013 to add a JSON description of
the migration protocol itself to the migration stream. This patch
adds a section after the VM_END migration end marker that contains
description data on what the device sections of the stream are composed of.

This approach is backwards compatible with any QEMU version reading the
stream, because QEMU just stops reading after the VM_END marker and ignores
any data following it.

With an additional external program this allows us to decipher the
contents of any migration stream and hopefully make migration bugs easier
to track down.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-02-05 17:16:14 +01:00
Alexander Graf
9722140011 qemu-file: Add fast ftell code path
For ftell we flush the output buffer to ensure that we don't have anything
lingering in our internal buffers. This is a very safe thing to do.

However, with the dynamic size measurement that the dynamic vmstate
description will bring this would turn out quite slow.

Instead, we can fast path this specific measurement and just take the
internal buffers into account when telling the kernel our position.

I'm sure I overlooked some corner cases where this doesn't work, so
instead of tuning the safe, existing version, this patch adds a fast
variant of ftell that gets used by the dynamic vmstate description code
which isn't critical when it fails.

Signed-off-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-02-05 17:16:14 +01:00
Alexander Graf
190c882ce2 QJSON: Add JSON writer
To support programmatic JSON assembly while keeping the code that generates it
readable, this patch introduces a simple JSON writer. It emits JSON serially
into a buffer in memory.

The nice thing about this writer is its simplicity and low memory overhead.
Unlike the QMP JSON writer, this one does not need to spawn QObjects for every
element it wants to represent.

This is a prerequisite for the migration stream format description generator.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-02-05 17:16:14 +01:00