Commit graph

21114 commits

Author SHA1 Message Date
Alexey Kardashevskiy
2e4109de8e vfio/spapr: Create DMA window dynamically (SPAPR IOMMU v2)
New VFIO_SPAPR_TCE_v2_IOMMU type supports dynamic DMA window management.
This adds ability to VFIO common code to dynamically allocate/remove
DMA windows in the host kernel when new VFIO container is added/removed.

This adds a helper to vfio_listener_region_add which makes
VFIO_IOMMU_SPAPR_TCE_CREATE ioctl and adds just created IOMMU into
the host IOMMU list; the opposite action is taken in
vfio_listener_region_del.

When creating a new window, this uses heuristic to decide on the TCE table
levels number.

This should cause no guest visible change in behavior.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
[dwg: Added some casts to prevent printf() warnings on certain targets
 where the kernel headers' __u64 doesn't match uint64_t or PRIx64]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-05 14:31:08 +10:00
Alexey Kardashevskiy
f4ec5e26ed vfio: Add host side DMA window capabilities
There are going to be multiple IOMMUs per a container. This moves
the single host IOMMU parameter set to a list of VFIOHostDMAWindow.

This should cause no behavioral change and will be used later by
the SPAPR TCE IOMMU v2 which will also add a vfio_host_win_del() helper.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-05 14:31:08 +10:00
Alexey Kardashevskiy
318f67ce13 vfio: spapr: Add DMA memory preregistering (SPAPR IOMMU v2)
This makes use of the new "memory registering" feature. The idea is
to provide the userspace ability to notify the host kernel about pages
which are going to be used for DMA. Having this information, the host
kernel can pin them all once per user process, do locked pages
accounting (once) and not spent time on doing that in real time with
possible failures which cannot be handled nicely in some cases.

This adds a prereg memory listener which listens on address_space_memory
and notifies a VFIO container about memory which needs to be
pinned/unpinned. VFIO MMIO regions (i.e. "skip dump" regions) are skipped.

The feature is only enabled for SPAPR IOMMU v2. The host kernel changes
are required. Since v2 does not need/support VFIO_IOMMU_ENABLE, this does
not call it when v2 is detected and enabled.

This enforces guest RAM blocks to be host page size aligned; however
this is not new as KVM already requires memory slots to be host page
size aligned.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[dwg: Fix compile error on 32-bit host]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-05 14:30:54 +10:00
Alexey Kardashevskiy
606b54986d spapr_iommu: Realloc guest visible TCE table when starting/stopping listening
The sPAPR TCE tables manage 2 copies when VFIO is using an IOMMU -
a guest view of the table and a hardware TCE table. If there is no VFIO
presense in the address space, then just the guest view is used, if
this is the case, it is allocated in the KVM. However since there is no
support yet for VFIO in KVM TCE hypercalls, when we start using VFIO,
we need to move the guest view from KVM to the userspace; and we need
to do this for every IOMMU on a bus with VFIO devices.

This implements the callbacks for the sPAPR IOMMU - notify_started()
reallocated the guest view to the user space, notify_stopped() does
the opposite.

This removes explicit spapr_tce_set_need_vfio() call from PCI hotplug
path as the new callbacks do this better - they notify IOMMU at
the exact moment when the configuration is changed, and this also
includes the case of PCI hot unplug.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-05 10:43:02 +10:00
Bharata B Rao
7093645a84 spapr: Ensure thread0 of CPU core is always realized first
During CPU core realization, we create all the thread objects and parent
them to the core object in a loop. However, the realization of thread
objects is done separately by walking the threads of a core using
object_child_foreach(). With this, there is no guarantee on the order
in which the child thread objects get realized. Since CPU device tree
properties are currently derived from the CPU thread object, we assume
thread0 of the core to be the representative thread of the core when
creating device tree properties for the core. If thread0 is not the
first thread that gets realized, then we would end up having an
incorrect dt_id for the core and this causes hotplug failures from
the guest.

Fix this by realizing each thread object by walking the core's thread
object list thereby ensuring that thread0 and other threads are always
realized in the correct order.

Future TODO: CPU DT nodes are per-core properties and we should
ideally base the creation of CPU DT nodes on core objects rather than
the thread objects.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-05 10:43:02 +10:00
Markus Armbruster
a0efbf1660 range: Eliminate direct Range member access
Users of struct Range mess liberally with its members, which makes
refactoring hard.  Create a set of methods, and convert all users to
call them instead of accessing members.  The methods have carefully
worded contracts, and use assertions to check them.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-04 16:49:33 +03:00
Cao jin
5178ecd863 pci_register_bar: cleanup
place relevant code tegother, make the code easier to read

Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2016-07-04 16:49:33 +03:00
Cédric Le Goater
e1ad9bc405 ast2400: create SPI flash slaves
A set of SPI flash slaves is attached under the flash controllers of
the palmetto platform. "n25q256a" flash modules are used for the BMC
and "mx25l25635e" for the host. These types are common in the
OpenPower ecosystem.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-id: 1467138270-32481-9-git-send-email-clg@kaod.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-07-04 13:15:22 +01:00
Cédric Le Goater
924ed16386 ast2400: add SPI flash slaves
Each controller on the ast2400 has a memory range on which it maps its
flash module slaves. Each slave is assigned a memory segment for its
mapping that can be changed at bootime with the Segment Address
Register. This is not supported in the current implementation so we
are using the defaults provided by the specs.

Each SPI flash slave can then be accessed in two modes: Command and
User. When in User mode, accesses to the memory segment of the slaves
are translated in SPI transfers. When in Command mode, the HW
generates the SPI commands automatically and the memory segment is
accessed as if doing a MMIO. Other SPI controllers call that mode
linear addressing mode.

For this purpose, we are adding below each crontoller an array of
structs gathering for each SPI flash module, a segment rank, a
MemoryRegion to handle the memory accesses and the associated SPI
slave device, which should be a m25p80.

Only the User mode is supported for now but we are preparing ground
for the Command mode. The framework is sufficient to support Linux.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-id: 1467138270-32481-8-git-send-email-clg@kaod.org
[PMM: Use g_new0() rather than g_malloc0()]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-07-04 13:15:22 +01:00
Cédric Le Goater
7c1c69bca4 ast2400: add SMC controllers (FMC and SPI)
The Aspeed AST2400 soc includes a static memory controller for the BMC
which supports NOR, NAND and SPI flash memory modules. This controller
has two modes : the SMC for the legacy interface which supports only
one module and the FMC for the new interface which supports up to five
modules. The AST2400 also includes a SPI only controller used for the
host firmware, commonly called BIOS on Intel. It can be used in three
mode : a SPI master, SPI slave and SPI pass-through

Below is the initial framework for the SMC controller (FMC mode only)
and the SPI controller: the sysbus object, MMIO for registers
configuration and controls. Each controller has a SPI bus and a
configurable number of CS lines for SPI flash slaves.

The differences between the controllers are small, so they are
abstracted using indirections on the register numbers.

Only SPI flash modules are supported.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-id: 1467138270-32481-7-git-send-email-clg@kaod.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: added one missing error_propagate]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-07-04 13:15:22 +01:00
Paolo Bonzini
73bce5187b m25p80: qdev-ify drive property
This allows specifying the property via -drive if=none and creating
the flash device with -device.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-id: 1467138270-32481-6-git-send-email-clg@kaod.org
[clg: added an extra fix for sabrelite_init()
      keeping the test on flash_dev did not seem necessary. ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-07-04 13:15:22 +01:00
Paolo Bonzini
b7f480c3f6 m25p80: change cur_addr to 32 bit integer
The maximum amount of storage that can be addressed by the m25p80 command
set is 4 GiB.  However, cur_addr is currently a 64-bit integer.  To avoid
further problems related to sign extension of signed 32-bit integer
expressions, change cur_addr to a 32 bit integer.  Preserve migration
format by adding a dummy 4-byte field in place of the (big-endian)
high four bytes in the formerly 64-bit cur_addr field.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-id: 1467138270-32481-5-git-send-email-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-07-04 13:15:22 +01:00
Paolo Bonzini
b68cb06093 m25p80: avoid out of bounds accesses
s->cur_addr can be made to point outside s->storage, either by
writing a value >= 128 to s->ear (because s->ear * MAX_3BYTES_SIZE
is a signed integer and sign-extends into the 64-bit cur_addr),
or just by writing an address beyond the size of the flash being
emulated.  Avoid the sign extension to make the code cleaner, and
on top of that mask s->cur_addr to s->size.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-id: 1467138270-32481-4-git-send-email-clg@kaod.org
Reviewed by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-07-04 13:15:22 +01:00
Paolo Bonzini
cace7b801d m25p80: do not put iovec on the stack
When doing a read-modify-write cycle, QEMU uses the iovec after returning
from blk_aio_pwritev.  m25p80 puts the iovec on the stack of blk_aio_pwritev's
caller, which causes trouble in this case.  This has been a problem
since commit 243e6f6 ("m25p80: Switch to byte-based block access",
2016-05-12) started doing writes at a smaller granularity than 512 bytes.
In principle however it could have broken before when using -drive
if=mtd,cache=none on a disk with 4K native sectors.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-id: 1467138270-32481-3-git-send-email-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-07-04 13:15:22 +01:00
Cédric Le Goater
7673bb4cd3 ssi: change ssi_slave_init to be a realize ops
This enables qemu to handle late inits and report errors. All the SSI
slave routine names were changed accordingly. Code was modified to
handle errors when possible (m25p80 and ssi-sd)

Tested with the m25p80 slave object.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-id: 1467138270-32481-2-git-send-email-clg@kaod.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-07-04 13:15:22 +01:00
Peter Crosthwaite
f4b99537f1 xilinx_zynq: Connect devcfg to the Zynq machine model
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 85f39c9a13569b1113dacac3b952b0af54fc1260.1467053537.git.alistair.francis@xilinx.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-07-04 13:15:22 +01:00
Alistair Francis
034c2e6902 dma: Add Xilinx Zynq devcfg device model
Add a minimal model for the devcfg device which is part of Zynq.
This model supports DMA capabilities and interrupt generation.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 83df49d8fa2d203a421ca71620809e4b04754e65.1467053537.git.alistair.francis@xilinx.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-07-04 13:15:22 +01:00
Peter Crosthwaite
a74229597e register: Add block initialise helper
Add a helper that will scan a static RegisterAccessInfo Array
and populate a container MemoryRegion with registers as defined.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Message-id: 347b810b2799e413c98d5bbeca97bcb1557946c3.1467053537.git.alistair.francis@xilinx.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-07-04 13:15:22 +01:00
Peter Crosthwaite
49e14ddbce register: QOMify
QOMify registers as a child of TYPE_DEVICE. This allows registers to
define GPIOs.

Define an init helper that will do QOM initialisation.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: KONRAD Frederic <fred.konrad@greensocs.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 2545f71db26bf5586ca0c08a3e3cf1b217450552.1467053537.git.alistair.francis@xilinx.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-07-04 13:15:22 +01:00
Alistair Francis
0b73c9bb06 register: Add Memory API glue
Add memory io handlers that glue the register API to the memory API.
Just translation functions at this stage. Although it does allow for
devices to be created without all-in-one mmio r/w handlers.

This patch also adds the RegisterInfoArray struct, which allows all of
the individual RegisterInfo structs to be grouped into a single memory
region.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Message-id: f7704d8ac6ac0f469ed35401f8151a38bd01468b.1467053537.git.alistair.francis@xilinx.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-07-04 13:15:22 +01:00
Alistair Francis
1599121b57 register: Add Register API
This API provides some encapsulation of registers and factors out some
common functionality to common code. Bits of device state (usually MMIO
registers) often have all sorts of access restrictions and semantics
associated with them. This API allows you to define what those
restrictions are on a bit-by-bit basis.

Helper functions are then used to access the register which observe the
semantics defined by the RegisterAccessInfo struct.

Some features:
Bits can be marked as read_only (ro field)
Bits can be marked as write-1-clear (w1c field)
Bits can be marked as reserved (rsvd field)
Reset values can be defined (reset)
Bits can be marked clear on read (cor)
Pre and post action callbacks can be added to read and write ops
Verbose debugging info can be enabled/disabled

Useful for defining device register spaces in a data driven way. Cuts
down on a lot of the verbosity and repetition in the switch-case blocks
in the standard foo_mmio_read/write functions.

Also useful for automated generation of device models from hardware
design sources.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 40d62c7e1bf6e63bb4193ec46b15092a7d981e59.1467053537.git.alistair.francis@xilinx.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-07-04 13:15:22 +01:00
Ard Biesheuvel
5d636e21c4 hw/arm/virt: mark the PCIe host controller as DMA coherent in the DT
Since QEMU performs cacheable accesses to guest memory when doing DMA
as part of the implementation of emulated PCI devices, guest drivers
should use cacheable accesses as well when running under KVM. Since this
essentially means that emulated PCI devices are DMA coherent, set the
'dma-coherent' DT property on the PCIe host controller DT node.

This brings the DT description into line with the ACPI description,
which already marks the PCI bridge as cache coherent (see commit
bc64b96c98).

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Message-id: 1467134090-5099-1-git-send-email-ard.biesheuvel@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-07-04 13:15:22 +01:00
Andrey Smirnov
a19861666b armv7m_nvic: Use qemu_get_cpu(0) instead of current_cpu
Starting QEMU with -S results in current_cpu containing its initial
value of NULL. It is however possible to connect to such QEMU instance
and query various CPU registers, one example being CPUID, and doing that
results in QEMU segfaulting.

Using qemu_get_cpu(0) seem reasonable enough given that ARMv7M
architecture is a single core architecture.

Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-07-04 13:15:22 +01:00
Peter Maydell
a7aeb5f7b2 imx: Use memory_region_init_rom() for ROMs
The imx boards were all incorrectly creating ROMs using
memory_region_init_rom_device() with a NULL ops pointer. This
will cause QEMU to abort if the guest tries to write to the
ROM. Switch to the new memory_region_init_rom() instead.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1467122287-24974-3-git-send-email-peter.maydell@linaro.org
2016-07-04 13:06:35 +01:00
Michael S. Tsirkin
6c6668232e Revert "virtio-net: unbreak self announcement and guest offloads after migration"
This reverts commit 1f8828ef57.

Cc: qemu-stable@nongnu.org
Reported-by: Robin Geuze <robing@transip.nl>
Tested-by: Robin Geuze <robing@transip.nl>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-04 14:52:10 +03:00
Michael S. Tsirkin
62cee1a28a virtio: set low features early on load
virtio migrates the low 32 feature bits twice, the first copy is there
for compatibility but ever since
019a3edbb2: ("virtio: make features 64bit
wide") it's ignored on load. This is wrong since virtio_net_load tests
self announcement and guest offloads before the second copy including
high feature bits is loaded.  This means that self announcement, control
vq and guest offloads are all broken after migration.

Fix it up by loading low feature bits: somewhat ugly since high and low
bits become out of sync temporarily, but seems unavoidable for
compatibility.  The right thing to do for new features is probably to
test the host features, anyway.

Fixes: 019a3edbb2
    ("virtio: make features 64bit wide")
Cc: qemu-stable@nongnu.org
Reported-by: Robin Geuze <robing@transip.nl>
Tested-by: Robin Geuze <robing@transip.nl>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-04 14:52:10 +03:00
Cornelia Huck
0830c96d70 virtio: revert host notifiers to old semantics
The host notifier rework tried both to unify host notifiers across
transports and plug a possible hole during host notifier
re-assignment. Unfortunately, this meant a change in semantics that
breaks vhost and iSCSI+dataplane.

As the minimal fix, keep the common host notifier code but revert
to the old semantics so that we have time to figure out the proper
fix.

Fixes: 6798e245a3 ("virtio-bus: common ioeventfd infrastructure")
Reported-by: Peter Lieven <pl@kamp.de>
Reported-by: Jason Wang <jasowang@redhat.com>
Reported-by: Marc-André Lureau <marcandre.lureau@gmail.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Tested-by: Jason Wang <jasowang@redhat.com>
Tested-by: Peter Lieven <pl@kamp.de>
2016-07-04 14:52:10 +03:00
Markus Armbruster
01c9742d9d pc: Eliminate PcPciInfo
PcPciInfo has two (ill-named) members: Range w32 is the PCI hole, and
w64 is the PCI64 hole.

Three users:

* I440FXState and MCHPCIState have a member PcPciInfo pci_info, but
  only pci_info.w32 is actually used.  This is confusing.  Replace by
  Range pci_hole.

* acpi_build() uses auto PcPciInfo pci_info to forward both PCI holes
  from acpi_get_pci_info() to build_dsdt().  Replace by two variables
  Range pci_hole, pci_hole64.  Rename acpi_get_pci_info() to
  acpi_get_pci_holes().

PcPciInfo is now unused; drop it.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
2016-07-04 14:52:10 +03:00
Markus Armbruster
97a83ec3a9 piix: Set I440FXState member pci_info.w32 in one place
Range pci_info.w32 records the location of the PCI hole.

It's initialized to empty when QOM zeroes I440FXState.  That's a fine
value for a still unknown PCI hole.

i440fx_init() sets pci_info.w32.begin = below_4g_mem_size.  Changes
the PCI hole from empty to [below_4g_mem_size, UINT64_MAX].  That's a
bogus value.

i440fx_pcihost_initfn() sets pci_info.end = IO_APIC_DEFAULT_ADDRESS.
Since i440fx_init() ran already, this changes the PCI hole to
[below_4g_mem_size, IO_APIC_DEFAULT_ADDRESS-1].  That's the correct
value.

Setting the bounds of the PCI hole in two separate places is
confusing, and begs the question whether the bogus intermediate value
could be used by something, or what would happen if we somehow managed
to realize an i440FX device without having run the board init function
i440fx_init() first.

Avoid the confusion by setting the (constant) upper bound along with
the lower bound in i440fx_init().

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
2016-07-04 14:50:59 +03:00
Marcel Apfelbaum
10d01f73e3 machine: remove iommu property
Since iommu devices can be created with '-device' there is
no need to keep iommu as machine and mch property.

Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-04 14:50:58 +03:00
Marcel Apfelbaum
621d983a1f hw/iommu: enable iommu with -device
Use the standard '-device intel-iommu' to create the IOMMU device.
The legacy '-machine,iommu=on' can still be used.

Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-04 14:50:58 +03:00
Marcel Apfelbaum
bf8d492405 q35: allow dynamic sysbus
Allow adding sysbus devices with -device on Q35.

At first Q35 will support only intel-iommu to be added this way,
however the command line will support all sysbus devices.

Mark with 'cannot_instantiate_with_device_add_yet' the ones
causing immediate problems (e.g. crashes).

Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-04 14:50:01 +03:00
Marcel Apfelbaum
b86eacb804 hw/pci: delay bus_master_enable_region initialization
Skip bus_master_enable region creation on PCI device init
in order to be sure the IOMMU device (if present) would
be created in advance. Add this memory region at machine_done time.

Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-04 14:50:01 +03:00
Marcel Apfelbaum
1b04cc801a hw/ppc: realize the PCI root bus as part of mac99 init
Mac99's PCI root bus is not part of a host bridge,
realize it manually.

Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-04 14:50:01 +03:00
Gerd Hoffmann
5ec7d09818 xen: fix ram init regression
Commit "8156d48 pc: allow raising low memory via max-ram-below-4g
option" causes a regression on xen, because it uses a different
memory split.

This patch initializes max-ram-below-4g to zero and leaves the
initialization to the memory initialization functions.  That way
they can pick different default values (max-ram-below-4g is zero
still) or use the user supplied value (max-ram-below-4g is non-zero).

Also skip the whole ram split calculation on Xen.  xen_ram_init()
does its own split calculation anyway so it is superfluous, also
this way xen_ram_init can actually see whenever max-ram-below-4g
is zero or not.

Reported-by: Anthony PERARD <anthony.perard@citrix.com>
Tested-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-04 14:50:00 +03:00
Peter Maydell
96b39d8327 Only trivial fixes.
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iEYEABECAAYFAld2ZGoACgkQAvw66wEB28LbNQCfbKhlgcwRvD/F1B5IFQErRkLf
 rBsAnR5QQhl8eJK+Dfdiy4H/mjKuXHyN
 =6o/q
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/gkurz/tags/for-upstream' into staging

Only trivial fixes.

# gpg: Signature made Fri 01 Jul 2016 13:39:06 BST
# gpg:                using DSA key 0x02FC3AEB0101DBC2
# gpg: Good signature from "Greg Kurz <gkurz@fr.ibm.com>"
# gpg:                 aka "Greg Kurz <groug@free.fr>"
# gpg:                 aka "Greg Kurz <gkurz@linux.vnet.ibm.com>"
# gpg:                 aka "Gregory Kurz (Groug) <groug@free.fr>"
# gpg:                 aka "Gregory Kurz (Cimai Technology) <gkurz@cimai.com>"
# gpg:                 aka "Gregory Kurz (Meiosys Technology) <gkurz@meiosys.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 2BD4 3B44 535E C0A7 9894  DBA2 02FC 3AEB 0101 DBC2

* remotes/gkurz/tags/for-upstream:
  9p: synth: drop v9fs_ prefix
  9p: don't include <sys/uio.h>

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-07-01 19:29:27 +01:00
Greg Kurz
b05528b533 9p: synth: drop v9fs_ prefix
To have shorter lines and be consistent with other fs devices.

Acked-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
2016-07-01 14:38:54 +02:00
Peter Maydell
1b756f1abf ppc patch queue 2016-07-01
Here's the current ppc patch queue.  This is a fairly large batch,
 containing:
     * A number of further preliminary patches towards full hypervisor
       mode emulation
     * Some further fixes / cleanups for the recently merged device_add
       based CPU hotplug
     * Preliminary patches towards supporting a native (rather than
       paravirtualized) XICS device.  This will be needed to emulate a
       physical Power machine, including hypervisor capabilities
     * Assorted bug fixes
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXdgYTAAoJEGw4ysog2bOS8qIQAKjNHf4U1KAf1cI6SWqtrtOJ
 y6PVbSZc/ywc3KF/on/o6owt1VidoLe8mkd1rTlQminY0/RXL5Fv+kwIqQ4F54h9
 0kKf9GDrLwjKI6IXEeQawzSa8wlfU1b3zjQJhJpztjwfdeTKAydOh5XEujF+hoCB
 NkESZbJbH1gdWlxSOUD+1+DOxOfcHXsE4aJmqaw9F6l2GSHLdO//t9+KPxYfSY02
 b9v6FTUkvZ/2xIQ1k8/0laE0tWqfLpTNCwkRbacHauZu1zUQVZa+fJDx7XgIlcGr
 5undbLWy9/kBONQbjN9BeU9pc7XUPeNWKYZx7MX6ClbV2V74OPgjNdDHgcpTQim8
 /ThiF8389qZ9tzoL+LPIdaIlOok0/I8NEcp2jfa2W2Z+AQnyjJ7nFJcjDhtRmhsI
 YHUYtTrceY40scUcucDth4osZ0w/PDNr94qb6HwnYxrekXmptfTMf2dq4i6yoH31
 VHQYTyBZ5dYg4DjzYuZLpNjC1Bok683Y5g4MtKf0h0A/LUaKdcHYLtuGw+jV9RtP
 qC1RZMv2ZmU2jT0Jp2hUcj+S9M1OcitUUwemT2YrMDV0iXiWV6yiK7Rw28VBDLCV
 TPYoDDK9xlqYsw0wOAM3uS9ywLn4tIRSG6hp6iIh8ZWt8z66KSyRikWnSjAVI+GO
 Fm6xNz0a/r9UomjPB0wa
 =KhVy
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.7-20160701' into staging

ppc patch queue 2016-07-01

Here's the current ppc patch queue.  This is a fairly large batch,
containing:
    * A number of further preliminary patches towards full hypervisor
      mode emulation
    * Some further fixes / cleanups for the recently merged device_add
      based CPU hotplug
    * Preliminary patches towards supporting a native (rather than
      paravirtualized) XICS device.  This will be needed to emulate a
      physical Power machine, including hypervisor capabilities
    * Assorted bug fixes

# gpg: Signature made Fri 01 Jul 2016 06:56:35 BST
# gpg:                using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/ppc-for-2.7-20160701: (23 commits)
  qmp: fix spapr example of query-hotpluggable-cpus
  spapr: drop duplicate variable in spapr_core_release()
  spapr: do proper error propagation in spapr_cpu_core_realize_child()
  spapr: drop reference on child object during core realization
  spapr: Restore support for 970MP and POWER8NVL CPU cores
  target-ppc: gen_pause for instructions: yield, mdoio, mdoom, miso
  ppc/xics: Replace "icp" with "xics" in most places
  ppc/xics: Implement H_IPOLL using an accessor
  ppc/xics: Move SPAPR specific code to a separate file
  ppc/xics: Rename existing xics to xics_spapr
  ppc: Fix 64K pages support in full emulation
  target-ppc: Eliminate redundant and incorrect function booke206_page_size_to_tlb
  spapr: Restore support for older PowerPC CPU cores
  spapr: fix write-past-end-of-array error in cpu core device init code
  hw/ppc/spapr: Add some missing hcall function set strings
  ppc: Print HSRR0/HSRR1 in "info registers"
  ppc: LPCR is a HV resource
  ppc: Initial HDEC support
  ppc: Enforce setting MSR:EE,IR and DR when MSR:PR is set
  ppc: Fix conditions for delivering external interrupts to a guest
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-07-01 13:31:48 +01:00
Greg Kurz
8a1eb71bd8 spapr: drop duplicate variable in spapr_core_release()
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-01 13:41:47 +10:00
Greg Kurz
f11235b920 spapr: do proper error propagation in spapr_cpu_core_realize_child()
This patch changes spapr_cpu_core_realize_child() to have a local error
pointer and use error_propagate() as it is supposed to be done.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-01 13:41:47 +10:00
Greg Kurz
8e758dee66 spapr: drop reference on child object during core realization
When a core is being realized, we create a child object for each thread
of the core.

The child is first initialized with object_initialize() which sets its ref
count to 1, and then added to the core with object_property_add_child()
which bumps the ref count to 2.

When the core gets released, object_unparent() decreases the ref count to 1,
and we g_free() the object: we hence loose the reference on an unfinalized
object. This is likely to cause random crashes.

Let's drop the extra reference as soon as we don't need it, after the
thread is added to the core.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-01 13:41:47 +10:00
Bharata B Rao
470f215787 spapr: Restore support for 970MP and POWER8NVL CPU cores
Introduction of core based CPU hotplug for PowerPC sPAPR didn't
add support for 970MP and POWER8NVL based core types. Add support for
the same.

While we are here, add support for explicit specification of POWER5+_v2.1
core type.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-01 13:41:47 +10:00
Benjamin Herrenschmidt
27f2458245 ppc/xics: Replace "icp" with "xics" in most places
The "ICP" is a different object than the "XICS". For historical reasons,
we have a number of places where we name a variable "icp" while it contains
a XICSState pointer. There *is* an ICPState structure too so this makes
the code really confusing.

This is a mechanical replacement of all those instances to use the name
"xics" instead. There should be no functional change.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[spapr_cpu_init has been moved to spapr_cpu_core.c, change there]
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-01 13:41:47 +10:00
Benjamin Herrenschmidt
1cbd222055 ppc/xics: Implement H_IPOLL using an accessor
None of the other presenter functions directly mucks with the
internal state, so don't do it there either.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-01 13:41:47 +10:00
Benjamin Herrenschmidt
9c7027ba94 ppc/xics: Move SPAPR specific code to a separate file
Leave the core ICP/ICS logic in xics.c and move the top level
class wrapper, hypercall and RTAS handlers to xics_spapr.c

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[add cpu.h in xics_spapr.c, move set_nr_irqs and set_nr_servers to
 xics_spapr.c]
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-01 13:41:46 +10:00
Benjamin Herrenschmidt
161deaf225 ppc/xics: Rename existing xics to xics_spapr
The common class doesn't change, the KVM one is sPAPR specific. Rename
variables and functions to xics_spapr.

Retain the type name as "xics" to preserve migration for existing sPAPR
guests.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-01 13:41:46 +10:00
Aaron Larson
a36848ff7c target-ppc: Eliminate redundant and incorrect function booke206_page_size_to_tlb
Eliminate redundant and incorrect booke206_page_size_to_tlb function
from ppce500_spin.c in preference to previously existing but newly
exported definition from e500.c

Defect analysis:

The booke206_page_size_to_tlb function in e500.c was updated in commit
2bd9543 "ppc: booke206: use MAV=2.0 TSIZE definition, fix 4G pages" to
reflect a change in the definition of MAS1_TSIZE_SHIFT from 8
(corresponding to a min TLB page size of 4kb) to a value of 7 (TLB
page size 2k).  The booke206_page_size_to_tlb() function defined in
ppce500_spin.c was never updated to reflect the change in
MAS1_TSIZE_SHIFT.

In http://lists.nongnu.org/archive/html/qemu-ppc/2016-06/msg00533.html,
Scott Wood suggested this "root cause" explanation:

SW> The patch that changed MAS1_TSIZE_SHIFT from 8 to 7 was around the
SW> same time as the patch that added this code, which is probably why
SW> adjusting it got missed.  Commit 2bd9543cd3 did update the
SW> equivalent code in ppce500_mpc8544ds.c, which now resides in
SW> hw/ppc/e500.c and has been changed to not assume a power-of-2
SW> size.  The ppce500_spin version should be eliminated.

Signed-off-by: Aaron Larson <alarson@ddci.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-01 09:57:01 +10:00
Bharata B Rao
ff461b8da9 spapr: Restore support for older PowerPC CPU cores
Introduction of core based CPU hotplug for PowerPC sPAPR didn't
add support for 970 and POWER5+ based core types. Add support for
the same.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-01 09:57:01 +10:00
Greg Kurz
dde35bc966 spapr: fix write-past-end-of-array error in cpu core device init code
This fixes a potential QEMU crash introduced by commit 3b54254966.

Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-01 09:57:01 +10:00
Thomas Huth
6cc09e261b hw/ppc/spapr: Add some missing hcall function set strings
Add "hcall-sprg0" (for H_SET_SPRG0), "hcall-copy" (for H_PAGE_INIT)
and "hcall-debug" (for H_LOGICAL_CI_LOAD/STORE) to the property
"ibm,hypertas-functions" to indicate that we support these hypercalls.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-01 09:57:01 +10:00