Commit graph

21114 commits

Author SHA1 Message Date
Andrey Smirnov
a62bf59fd9 i.MX: Add i.MX7 GPT variant
Add minimal code needed to allow upstream Linux guest to boot.

Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Philippe Mathieu-Daudé <f4bug@amsat.org>
Cc: Marcel Apfelbaum <marcel.apfelbaum@zoho.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org
Cc: yurovsky@gmail.com
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-02-09 10:40:30 +00:00
Andrey Smirnov
0999e87fa5 i.MX: Add code to emulate GPCv2 IP block
Add minimal code needed to allow upstream Linux guest to boot.

Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Philippe Mathieu-Daudé <f4bug@amsat.org>
Cc: Marcel Apfelbaum <marcel.apfelbaum@zoho.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org
Cc: yurovsky@gmail.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-02-09 10:40:30 +00:00
Andrey Smirnov
0a7bc1c045 i.MX: Add code to emulate i.MX7 SNVS IP-block
Add code to emulate SNVS IP-block. Currently only the bits needed to
be able to emulate machine shutdown are implemented.

Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Philippe Mathieu-Daudé <f4bug@amsat.org>
Cc: Marcel Apfelbaum <marcel.apfelbaum@zoho.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org
Cc: yurovsky@gmail.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-02-09 10:40:30 +00:00
Andrey Smirnov
067e68e704 i.MX: Add code to emulate i.MX2 watchdog IP block
Add enough code to emulate i.MX2 watchdog IP block so it would be
possible to reboot the machine running Linux Guest.

Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Philippe Mathieu-Daudé <f4bug@amsat.org>
Cc: Marcel Apfelbaum <marcel.apfelbaum@zoho.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org
Cc: yurovsky@gmail.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-02-09 10:40:29 +00:00
Andrey Smirnov
e9e0ef15d2 i.MX: Add code to emulate i.MX7 CCM, PMU and ANALOG IP blocks
Add minimal code needed to allow upstream Linux guest to boot.

Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Philippe Mathieu-Daudé <f4bug@amsat.org>
Cc: Marcel Apfelbaum <marcel.apfelbaum@zoho.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org
Cc: yurovsky@gmail.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-02-09 10:40:29 +00:00
Andrey Smirnov
df2a5cf4c8 hw: i.MX: Convert i.MX6 to use TYPE_IMX_USDHC
Convert i.MX6 to use TYPE_IMX_USDHC since that's what real HW comes
with.

Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Philippe Mathieu-Daudé <f4bug@amsat.org>
Cc: Marcel Apfelbaum <marcel.apfelbaum@zoho.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org
Cc: yurovsky@gmail.com
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-02-09 10:40:29 +00:00
Andrey Smirnov
fd1e5c8179 sdhci: Add i.MX specific subtype of SDHCI
IP block found on several generations of i.MX family does not use
vanilla SDHCI implementation and it comes with a number of quirks.

Introduce i.MX SDHCI subtype of SDHCI block to add code necessary to
support unmodified Linux guest driver.

Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Philippe Mathieu-Daudé <f4bug@amsat.org>
Cc: Marcel Apfelbaum <marcel.apfelbaum@zoho.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org
Cc: yurovsky@gmail.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
[PMM: define and use ESDHC_UNDOCUMENTED_REG27]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-02-09 10:40:29 +00:00
Peter Maydell
6c94851881 target/arm: Split "get pending exception info" from "acknowledge it"
Currently armv7m_nvic_acknowledge_irq() does three things:
 * make the current highest priority pending interrupt active
 * return a bool indicating whether that interrupt is targeting
   Secure or NonSecure state
 * implicitly tell the caller which is the highest priority
   pending interrupt by setting env->v7m.exception

We need to split these jobs, because v7m_exception_taken()
needs to know whether the pending interrupt targets Secure so
it can choose to stack callee-saves registers or not, but it
must not make the interrupt active until after it has done
that stacking, in case the stacking causes a derived exception.
Similarly, it needs to know the number of the pending interrupt
so it can read the correct vector table entry before the
interrupt is made active, because vector table reads might
also cause a derived exception.

Create a new armv7m_nvic_get_pending_irq_info() function which simply
returns information about the highest priority pending interrupt, and
use it to rearrange the v7m_exception_taken() code so we don't
acknowledge the exception until we've done all the things which could
possibly cause a derived exception.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 1517324542-6607-3-git-send-email-peter.maydell@linaro.org
2018-02-09 10:40:27 +00:00
Peter Maydell
5ede82b8cc target/arm: Add armv7m_nvic_set_pending_derived()
In order to support derived exceptions (exceptions generated in
the course of trying to take an exception), we need to be able
to handle prioritizing whether to take the original exception
or the derived exception.

We do this by introducing a new function
armv7m_nvic_set_pending_derived() which the exception-taking code in
helper.c will call when a derived exception occurs.  Derived
exceptions are dealt with mostly like normal pending exceptions, so
we share the implementation with the armv7m_nvic_set_pending()
function.

Note that the way we structure this is significantly different
from the v8M Arm ARM pseudocode: that does all the prioritization
logic in the DerivedLateArrival() function, whereas we choose to
let the existing "identify highest priority exception" logic
do the prioritization for us. The effect is the same, though.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1517324542-6607-2-git-send-email-peter.maydell@linaro.org
2018-02-09 10:40:27 +00:00
Yi Min Zhao
f9125e3a31 s390x/pci: use the right pal and pba in reg_ioat()
When registering ioat, pba should be comprised of leftmost 52 bits and
rightmost 12 binary zeros, and pal should be comprised of leftmost 52
bits and right most 12 binary ones. The lower 12 bits of words 5 and 7
of the FIB are ignored by the facility. Let's fixup this.

Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com>
Message-Id: <20180205072258.5968-4-zyimin@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2018-02-09 09:37:13 +01:00
Yi Min Zhao
b3f05d8c7f s390x/pci: fixup global refresh
The VFIO common code doesn't provide the possibility to modify a
previous mapping entry in another way than unmapping and mapping again
with new properties.

To avoid -EEXIST DMA mapping error, we introduce a GHashTable to store
S390IOTLBEntry instances in order to cache the mapped entries. When
intercepting rpcit instruction, ignore the identical mapped entries to
avoid doing map operations multiple times and do unmap and re-map
operations for the case of updating the valid entries.

Acked-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com>
Message-Id: <20180205072258.5968-3-zyimin@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2018-02-09 09:37:13 +01:00
Yi Min Zhao
0125861eac s390x/pci: fixup the code walking IOMMU tables
Current s390x PCI IOMMU code is lack of flags' checking, including:
1) protection bit
2) table length
3) table offset
4) intermediate tables' invalid bit
5) format control bit

This patch introduces a new struct named S390IOTLBEntry, and makes up
these missed checkings. At the same time, inform the guest with the
corresponding error number when the check fails. Finally, in order to
get the error number, we export s390_guest_io_table_walk().

Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com>
Message-Id: <20180205072258.5968-2-zyimin@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2018-02-09 09:37:13 +01:00
Christian Borntraeger
869e676ae7 s390x/sclp: fix event mask handling
commit 67915de9f0 ("s390x/event-facility: variable-length event
masks") switched the sclp receive/send mask. This broke the sclp
lm console.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Fixes: commit 67915de9f0 ("s390x/event-facility: variable-length event masks")
Cc: Cornelia Huck <cohuck@redhat.com>
Cc: Jason J. Herne <jjherne@linux.vnet.ibm.com>
Cc: qemu-stable@nongnu.org
Message-Id: <20180202094241.59537-1-borntraeger@de.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2018-02-09 09:37:13 +01:00
David Hildenbrand
6762808fda s390x/flic: cache the common flic class in a central function
This avoids tons of conversions when handling interrupts.

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180129125623.21729-19-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2018-02-09 09:37:13 +01:00
David Hildenbrand
c21a6106c1 s390x/kvm: cache the kvm flic in a central function
This avoids tons of conversions when handling interrupts.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180129125623.21729-18-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2018-02-09 09:37:13 +01:00
David Hildenbrand
f68ecdd4f3 s390x/tcg: cache the qemu flic in a central function
This avoids tons of conversions when handling interrupts.

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180129125623.21729-17-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2018-02-09 09:37:13 +01:00
David Hildenbrand
de352394ff s390x/tcg: remove SMP warning
We should be pretty good in shape now. Floating interrupts are working
and atomic instructions should be atomic.

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180129125623.21729-15-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2018-02-09 09:37:13 +01:00
David Hildenbrand
631b59664c s390x/flic: optimize CPU wakeup for TCG
Kicking all CPUs on every floating interrupt is far from efficient.
Let's optimize it at least a little bit.

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180129125623.21729-12-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2018-02-09 09:37:13 +01:00
David Hildenbrand
6e0d8175d6 s390x/flic: implement qemu_s390_clear_io_flic()
Now that we have access to the io interrupts, we can implement
clear_io_irq() for TCG.

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180129125623.21729-11-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2018-02-09 09:37:13 +01:00
David Hildenbrand
b194e44785 s390x/flic: make floating interrupts on TCG actually floating
Move floating interrupt handling into the flic. Floating interrupts
will now be considered by all CPUs, not just CPU #0. While at it, convert
I/O interrupts to use a list and make sure we properly consider I/O
sub-classes in s390_cpu_has_io_int().

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180129125623.21729-9-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2018-02-09 09:37:13 +01:00
David Hildenbrand
d8d7942df6 s390x/flic: no need to call s390_io_interrupt() from flic
We can directly call the right function.

Suggested-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180129125623.21729-7-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2018-02-09 09:37:13 +01:00
David Hildenbrand
e6505d5395 s390x/flic: factor out injection of floating interrupts
Let the flic device handle it internally. This will allow us to later
on store floating interrupts in the flic for the TCG case.

This now also simplifies kvm.c. All that's left is the fallback
interface for floating interrupts, which is now triggered directly via
the flic in case anything goes wrong.

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180129125623.21729-6-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2018-02-09 09:37:13 +01:00
David Hildenbrand
b03d9970c4 s390x/tcg: simplify lookup of flic
We can simply search for an object of our common type.

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180129125623.21729-4-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2018-02-09 09:37:13 +01:00
David Hildenbrand
e2ac12f014 s390x/flic: simplify flic initialization
This makes it clearer, which device is used for which accelerator.

Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180129125623.21729-3-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
2018-02-09 09:37:13 +01:00
Markus Armbruster
8f0a3716e4 Clean up includes
Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.

This commit was created with scripts/clean-includes, with the change
to target/s390x/gen-features.c manually reverted, and blank lines
around deletions collapsed.

Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180201111846.21846-3-armbru@redhat.com>
2018-02-09 05:05:11 +01:00
Markus Armbruster
d8e39b7062 Use #include "..." for our own headers, <...> for others
System headers should be included with <...>, our own headers with
"...".  Offenders tracked down with an ugly, brittle and probably
buggy Perl script.  Previous iteration was commit a9c94277f0.

Delete inclusions of "string.h" and "strings.h" instead of fixing them
to <string.h> and <strings.h>, because we always include these via
osdep.h.

Put the cleaned up system header includes first.

While there, separate #include from file comment with exactly one
blank line.

Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180201111846.21846-2-armbru@redhat.com>
2018-02-09 05:05:11 +01:00
Yoni Bettan
d61a363d3e pci: removed the is_express field since a uniform interface was inserted
according to Eduardo Habkost's commit fd3b02c889 all PCIEs now implement
INTERFACE_PCIE_DEVICE so we don't need is_express field anymore.

Devices that implements only INTERFACE_PCIE_DEVICE (is_express == 1)
or
devices that implements only INTERFACE_CONVENTIONAL_PCI_DEVICE (is_express == 0)
where not affected by the change.

The only devices that were affected are those that are hybrid and also
had (is_express == 1) - therefor only:
  - hw/vfio/pci.c
  - hw/usb/hcd-xhci.c
  - hw/xen/xen_pt.c

For those 3 I made sure that QEMU_PCI_CAP_EXPRESS is on in instance_init()

Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Yoni Bettan <ybettan@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-02-08 21:06:41 +02:00
Changpeng Liu
0ebf9a7488 virtio-blk: enable multiple vectors when using multiple I/O queues
Currently virtio-pci driver hardcoded 2 vectors for virtio-blk device,
for multiple I/O queues scenario, all the I/O queues will share one
interrupt vector, while here, enable multiple vectors according to
the number of I/O queues.

Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-02-08 21:06:41 +02:00
Peter Xu
9d6b9db19c pci/bus: let it has higher migration priority
In the past, we prioritized IOMMU migration so that we have such a
priority order:

    IOMMU > PCI Devices

When migrating a guest with both vIOMMU and a pcie-root-port, we'll
always migrate vIOMMU first, since pci buses will be seen to have the
same priority of general PCI devices.

That's problematic.

The thing is that PCI bus number information is stored in the root port,
and that is needed by vIOMMU during post_load(), e.g., to figure out
context entry for a device.  If we don't have correct bus numbers for
devices, we won't be able to recover device state of the DMAR memory
regions, and things will be messed up.

So let's boost the PCIe root ports to be even with higher priority:

   PCIe Root Port > IOMMU > PCI Devices

A smoke test shows that this patch fixes bug 1538953.

Also, apply this rule to all the PCI bus/bridge devices: ioh3420,
xio3130_downstream, xio3130_upstream, pcie_pci_bridge, pci-pci bridge,
i82801b11.

I noted that we set pcie_pci_bridge_dev_vmstate twice.  Clean that up
together.

CC: Alex Williamson <alex.williamson@redhat.com>
CC: Marcel Apfelbaum <marcel@redhat.com>
CC: Michael S. Tsirkin <mst@redhat.com>
CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
CC: Juan Quintela <quintela@redhat.com>
CC: Laurent Vivier <lvivier@redhat.com>
Bug: https://bugzilla.redhat.com/show_bug.cgi?id=1538953
Reported-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-02-08 21:06:41 +02:00
Laszlo Ersek
ed247f40db pci-bridge/i82801b11: clear bridge registers on platform reset
The "i82801b11-bridge" device model is a descendant of "base-pci-bridge"
(TYPE_PCI_BRIDGE). However, unlike other similar devices, such as

- pci-bridge,
- pcie-pci-bridge,
- PCIE Root Port,
- xio3130 switch upstream and downstream ports,
- dec-21154-p2p-bridge,
- pbm-bridge,
- xilinx-pcie-root,

"i82801b11-bridge" does not clear the bridge specific registers at
platform reset.

This is a problem because devices on "i82801b11-bridge" continue to
respond to config space cycles after platform reset, when addressed with
the bus number that was previously programmed into the secondary bus
number register of "i82801b11-bridge". This error breaks OVMF's search for
extra (PXB) root buses, for example.

The device class reset method for "i82801b11-bridge" is currently NULL;
set it directly to pci_bridge_reset(), like the last three bridge models
in the above listing do.

Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Marcel Apfelbaum <marcel@redhat.com>
Cc: qemu-stable@nongnu.org
Ref: https://bugzilla.redhat.com/show_bug.cgi?id=1541839
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-02-08 21:06:41 +02:00
Dr. David Alan Gilbert
aa3c40f6bf vhost: Move log_dirty check
Move the log_dirty check into vhost_section.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-02-08 21:06:41 +02:00
Dr. David Alan Gilbert
938eeb640c vhost: Merge and delete unused callbacks
Now that the olf vhost_set_memory code is gone, the _nop and _add
callbacks are identical and can be merged.  The _del callback is
no longer needed.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-02-08 21:06:41 +02:00
Dr. David Alan Gilbert
06709c120c vhost: Clean out old vhost_set_memory and friends
Remove the old update mechanism, vhost_set_memory, and the functions
and flags it used.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-02-08 21:06:40 +02:00
Dr. David Alan Gilbert
ade6d081fc vhost: Regenerate region list from changed sections list
Compare the sections list that's just been generated, and if it's
different from the old one regenerate the region list.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
2018-02-08 21:06:40 +02:00
Dr. David Alan Gilbert
48d7c97577 vhost: Merge sections added to temporary list
As sections are reported by the listener to the _nop and _add
methods, add them to the temporary section list but now merge them
with the previous section if the new one abuts and the backend allows.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-02-08 21:06:40 +02:00
Dr. David Alan Gilbert
0ca1fd2d68 vhost: Simplify ring verification checks
vhost_verify_ring_mappings() were used to verify that
rings are still accessible and related memory hasn't
been moved after flatview is updated.

It was doing checks by mapping ring's GPA+len and
checking that HVA hadn't changed with new memory map.
To avoid maybe expensive mapping call, we were
identifying address range that changed and were doing
mapping only if ring was in changed range.

However it's not neccessary to perform ring's GPA
mapping as we already have its current HVA and all
we need is to verify that ring's GPA translates to
the same HVA in updated flatview.

This will allow the following patches to simplify the range
comparison that was previously needed to avoid expensive
verify_ring_mapping calls.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
with modifications by:
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-02-08 21:06:40 +02:00
Dr. David Alan Gilbert
c44317efec vhost: Build temporary section list and deref after commit
Igor spotted that there's a race, where a region that's unref'd
in a _del callback might be free'd before the set_mem_table call in
the _commit callback, and thus the vhost might end up using free memory.

Fix this by building a complete temporary sections list, ref'ing every
section (during add and nop) and then unref'ing the whole list right
at the end of commit.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-02-08 21:06:40 +02:00
Gal Hammer
710fccf80d virtio: improve virtio devices initialization time
The loading time of a VM is quite significant when its virtio
devices use a large amount of virt-queues (e.g. a virtio-serial
device with max_ports=511). Most of the time is spend in the
creation of all the required event notifiers (ioeventfd and memory
regions).

This patch pack all the changes to the memory regions in a
single memory transaction.

Reported-by: Sitong Liu
Reported-by: Xiaoling Gao
Signed-off-by: Gal Hammer <ghammer@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Tested-by: Greg Kurz <groug@kaod.org>
2018-02-08 21:06:40 +02:00
Gal Hammer
76143618a5 virtio: remove event notifier cleanup call on de-assign
The virtio_bus_set_host_notifier function no longer calls
event_notifier_cleanup when a event notifier is removed.

The commit updates the code to match the new behavior and calls
virtio_bus_cleanup_host_notifier after the notifier was de-assign
and no longer in use.

This change is a preparation to allow executing the
virtio_bus_set_host_notifier function in a memory region
transaction.

Signed-off-by: Gal Hammer <ghammer@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Tested-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-02-08 21:06:26 +02:00
Michael S. Tsirkin
f41d912023 Revert "vhost: add traces for memory listeners"
This reverts commit 0750b06021.

Follow up patches are reworking the memory listeners, the new mechanism
will add its own set of traces.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-02-08 19:26:51 +02:00
Fam Zheng
a3d9a352d4 block: Move NVMe constants to a separate header
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20180116060901.17413-8-famz@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2018-02-08 09:22:03 +08:00
Peter Maydell
7b213bb475 * socket option parsing fix (Daniel)
* SCSI fixes (Fam)
 * Readline double-free fix (Greg)
 * More HVF attribution fixes (Izik)
 * WHPX (Windows Hypervisor Platform Extensions) support (Justin)
 * POLLHUP handler (Klim)
 * ivshmem fixes (Ladi)
 * memfd memory backend (Marc-André)
 * improved error message (Marcelo)
 * Memory fixes (Peter Xu, Zhecheng)
 * Remove obsolete code and comments (Peter M.)
 * qdev API improvements (Philippe)
 * Add CONFIG_I2C switch (Thomas)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJaexoYAAoJEL/70l94x66DVL0IAJC//aZCwwgyN9CRNDcOo10/
 UPtzprfezERkur77r1KvEYVNIfslRF6iTBou2+suOWkzoNL2LJ0XZ+wi+2u2sFIF
 ikvbQVk4dOWqJJQj7e1cmv5A2EZy2dcxjAoD1IG6CRy76+HzYqwjHVw+HkYY5CUS
 qwnUWjQddP6WtH9MsUHpX7p7atWo7T1tzkx4v8H+CIHBO3uUJQSZLkGYflvcstpj
 Fo04bZzSkDj2rnlqqBo/6UgJQXD8++Rs64vmiX2xwcK47TWO31Vbuwu+r8V9osWm
 LHFmRpL8ZkZfL0yqf0bpjmd688dirjVpHIJ5KE043Lo6AdI+K5xBfoBjXxtPiKE=
 =o90D
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

* socket option parsing fix (Daniel)
* SCSI fixes (Fam)
* Readline double-free fix (Greg)
* More HVF attribution fixes (Izik)
* WHPX (Windows Hypervisor Platform Extensions) support (Justin)
* POLLHUP handler (Klim)
* ivshmem fixes (Ladi)
* memfd memory backend (Marc-André)
* improved error message (Marcelo)
* Memory fixes (Peter Xu, Zhecheng)
* Remove obsolete code and comments (Peter M.)
* qdev API improvements (Philippe)
* Add CONFIG_I2C switch (Thomas)

# gpg: Signature made Wed 07 Feb 2018 15:24:08 GMT
# gpg:                using RSA key BFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream: (47 commits)
  Add the WHPX acceleration enlightenments
  Introduce the WHPX impl
  Add the WHPX vcpu API
  Add the Windows Hypervisor Platform accelerator.
  tests/test-filter-redirector: move close()
  tests: use memfd in vhost-user-test
  vhost-user-test: make read-guest-mem setup its own qemu
  tests: keep compiling failing vhost-user tests
  Add memfd based hostmem
  memfd: add hugetlbsize argument
  memfd: add hugetlb support
  memfd: add error argument, instead of perror()
  cpus: join thread when removing a vCPU
  cpus: hvf: unregister thread with RCU
  cpus: tcg: unregister thread with RCU, fix exiting of loop on unplug
  cpus: dummy: unregister thread with RCU, exit loop on unplug
  cpus: kvm: unregister thread with RCU
  cpus: hax: register/unregister thread with RCU, exit loop on unplug
  ivshmem: Disable irqfd on device reset
  ivshmem: Improve MSI irqfd error handling
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

# Conflicts:
#	cpus.c
2018-02-07 20:40:36 +00:00
Peter Maydell
17a5bbb44d Error reporting patches for 2018-02-06
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJaegaOAAoJEDhwtADrkYZT5HcP/ApeXZCqiDOiJrpq046gCahC
 0Bl31NPiOloS6ya8gFT3p3ufeRdvKfdPRTWwa8lHOIkWXEvF/OtNQQGJ7Ff4HB0F
 f2o8yMS68srJ6zasCwizwY98vxo0574Hd9coZRGRKBvC9qm8jVDqNs2JxqUF/OhK
 Z+3XJ4uAFtqKDE6zXWqc/e/aRQe/1Z4zFwzl6p7MvpcBI06s81jIa3W0Pqz7BFtS
 jcXjrkV6bcD28cibK5P3A21wNICrD0yGhMHL0ZZ5iPTDZdoUY0CDYiUeynhI3TgL
 iyCNpc/ANA4BLU6CN5eWd4PWswhSlLx0LqV5qDnQYgNP2v1JzWDrHOfCq7jgk1rb
 rY8NMkFinBH7eyidOfPd6FWU3f+Gz+niNdbPTMv1HfkC+GIsndhNEw8TkZTR02RE
 kgGFcfNoBihfpo8VfnS2hCv8ZG8eExna6H9j4qkIOGoCOnqeq4+cyOI3Yya3vNDC
 Snx0Npb1alLAXasyLxMSTJjcCPqzH4co2YJWYzO4bXqTOS3V/SUx+0cVIwHElDRw
 0Pm2Eff7s/nGBvBuBrPjZwjAGpDCeAOTCboUsgTB6SH0iwzuIFeCM7k191WkGhz3
 BFdsdbOgwSrEy8bA8HgNJrjPZ65Zvct8q8L7EuhahYZRvnO5qa2LhN8ID4vaizDa
 gNjc8Z9F8PfWMJ8rGdWA
 =LSkA
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/armbru/tags/pull-error-2018-02-06' into staging

Error reporting patches for 2018-02-06

# gpg: Signature made Tue 06 Feb 2018 19:48:30 GMT
# gpg:                using RSA key 3870B400EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg:                 aka "Markus Armbruster <armbru@pond.sub.org>"
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867  4E5F 3870 B400 EB91 8653

* remotes/armbru/tags/pull-error-2018-02-06:
  tcg: Replace fprintf(stderr, "*\n" with error_report()
  hw/xen*: Replace fprintf(stderr, "*\n" with error_report()
  hw/sparc*: Replace fprintf(stderr, "*\n" with error_report()
  hw/sd: Replace fprintf(stderr, "*\n" with DPRINTF()
  hw/ppc: Replace fprintf(stderr, "*\n" with error_report()
  hw/pci*: Replace fprintf(stderr, "*\n" with error_report()
  hw/openrisc: Replace fprintf(stderr, "*\n" with error_report()
  hw/moxie: Replace fprintf(stderr, "*\n" with error_report()
  hw/mips: Replace fprintf(stderr, "*\n" with error_report()
  hw/lm32: Replace fprintf(stderr, "*\n" with error_report()
  hw/dma: Replace fprintf(stderr, "*\n" with error_report()
  hw/arm: Replace fprintf(stderr, "*\n" with error_report()
  audio: Replace AUDIO_FUNC with __func__
  error: Improve documentation of error_append_hint()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-02-07 16:26:01 +00:00
Marc-André Lureau
0f2956f915 memfd: add error argument, instead of perror()
This will allow callers to silence error report when the call is
allowed to failed.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180201132757.23063-2-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-02-07 14:09:25 +01:00
Ladi Prosek
a40227911c ivshmem: Disable irqfd on device reset
The effects of ivshmem_enable_irqfd() was not undone on device reset.

This manifested as:
ivshmem_add_kvm_msi_virq: Assertion `!s->msi_vectors[vector].pdev' failed.

when irqfd was enabled before reset and then enabled again after reset, making
ivshmem_enable_irqfd() run for the second time.

To reproduce, run:

  ivshmem-server

and QEMU with:

  -device ivshmem-doorbell,chardev=iv
  -chardev socket,path=/tmp/ivshmem_socket,id=iv

then install the Windows driver, at the time of writing available at:

https://github.com/virtio-win/kvm-guest-drivers-windows/tree/master/ivshmem

and crash-reboot the guest by inducing a BSOD.

Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Message-Id: <20171211072110.9058-5-lprosek@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-02-07 14:09:24 +01:00
Ladi Prosek
0b88dd9420 ivshmem: Improve MSI irqfd error handling
Adds a rollback path to ivshmem_enable_irqfd() and fixes
ivshmem_disable_irqfd() to bail if irqfd has not been enabled.

To reproduce, run:

  ivshmem-server -n 0

and QEMU with:

  -device ivshmem-doorbell,chardev=iv
  -chardev socket,path=/tmp/ivshmem_socket,id=iv

then load, unload, and load again the Windows driver, at the time of writing
available at:

https://github.com/virtio-win/kvm-guest-drivers-windows/tree/master/ivshmem

The issue is believed to have been masked by other guest drivers, notably
Linux ones, not enabling MSI-X on the device.

Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20171211072110.9058-4-lprosek@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-02-07 14:09:24 +01:00
Ladi Prosek
089fd80376 ivshmem: Always remove irqfd notifiers
As of commit 660c97eef6 ("ivshmem: use kvm irqfd for msi notifications"),
QEMU crashes with:

ivshmem: msix_set_vector_notifiers failed
msix_unset_vector_notifiers: Assertion `dev->msix_vector_use_notifier && dev->msix_vector_release_notifier' failed.

if MSI-X is repeatedly enabled and disabled on the ivshmem device, for example
by loading and unloading the Windows ivshmem driver. This is because
msix_unset_vector_notifiers() doesn't call any of the release notifier callbacks
since MSI-X is already disabled at that point (msix_enabled() returning false
is how this transition is detected in the first place). Thus ivshmem_vector_mask()
doesn't run and when MSI-X is subsequently enabled again ivshmem_vector_unmask()
fails.

This is fixed by keeping track of unmasked vectors and making sure that
ivshmem_vector_mask() always runs on MSI-X disable.

Fixes: 660c97eef6 ("ivshmem: use kvm irqfd for msi notifications")
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20171211072110.9058-3-lprosek@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-02-07 14:09:24 +01:00
Ladi Prosek
e6a354be6e ivshmem: Don't update non-existent MSI routes
As of commit 660c97eef6 ("ivshmem: use kvm irqfd for msi notifications"),
QEMU crashes with:

  kvm_irqchip_commit_routes: Assertion `ret == 0' failed.

if the ivshmem device is configured with more vectors than what the server
supports. This is caused by the ivshmem_vector_unmask() being called on
vectors that have not been initialized by ivshmem_add_kvm_msi_virq().

This commit fixes it by adding a simple check to the mask and unmask
callbacks.

Note that the opposite mismatch, if the server supplies more vectors than
what the device is configured for, is already handled and leads to output
like:

  Too many eventfd received, device has 1 vectors

To reproduce the assert, run:

  ivshmem-server -n 0

and QEMU with:

  -device ivshmem-doorbell,chardev=iv
  -chardev socket,path=/tmp/ivshmem_socket,id=iv

then load the Windows driver, at the time of writing available at:

https://github.com/virtio-win/kvm-guest-drivers-windows/tree/master/ivshmem

The issue is believed to have been masked by other guest drivers, notably
Linux ones, not enabling MSI-X on the device.

Fixes: 660c97eef6 ("ivshmem: use kvm irqfd for msi notifications")
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20171211072110.9058-2-lprosek@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-02-07 14:09:24 +01:00
Peter Xu
d25836cafd memory: do explicit cleanup when remove listeners
When unregister memory listeners, we should call, e.g.,
region_del() (and possibly other undo operations) on every existing
memory region sections there, otherwise we may leak resources that are
held during the region_add(). This patch undo the stuff for the
listeners, which emulates the case when the address space is set from
current to an empty state.

I found this problem when debugging a refcount leak issue that leads to
a device unplug event lost (please see the "Bug:" line below).  In that
case, the leakage of resource is the PCI BAR memory region refcount.
And since memory regions are not keeping their own refcount but onto
their owners, so the vfio-pci device's (who is the owner of the PCI BAR
memory regions) refcount is leaked, and event missing.

We had encountered similar issues before and fixed in other
way (ee4c112846, "vhost: Release memory references on cleanup"). This
patch can be seen as a more high-level fix of similar problems that are
caused by the resource leaks from memory listeners. So now we can remove
the explicit unref of memory regions since that'll be done altogether
during unregistering of listeners now.

Bug: https://bugzilla.redhat.com/show_bug.cgi?id=1531393
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180122060244.29368-5-peterx@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-02-07 14:09:24 +01:00
Peter Xu
369686267a vfio: listener unregister before unset container
After next patch, listener unregister will need the container to be
alive.  Let's move this unregister phase to be before unset container,
since that operation will free the backend container in kernel,
otherwise we'll get these after next patch:

qemu-system-x86_64: VFIO_UNMAP_DMA: -22
qemu-system-x86_64: vfio_dma_unmap(0x559bf53a4590, 0x0, 0xa0000) = -22 (Invalid argument)

Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180122060244.29368-4-peterx@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-02-07 14:09:24 +01:00