Merge tag 'pull-ppc-20211217' of https://github.com/legoater/qemu into staging

ppc 7.0 queue:

* General cleanup for Mac machines (Peter)
* Fixes for FPU exceptions (Lucas)
* Support for new ISA31 instructions (Matheus)
* Fixes for ivshmem (Daniel)
* Cleanups for PowerNV PHB (Christophe and Cedric)
* Updates of PowerNV and pSeries documentation (Leonardo and Daniel)
* Fixes for PowerNV (Daniel)
* Large cleanup of FPU implementation (Richard)
* Removal of SoftTLBs support for PPC74x CPUs (Fabiano)
* Fixes for exception models in MPCx and 60x CPUs (Fabiano)
* Removal of 401/403 CPUs (Cedric)
* Deprecation of taihu machine (Thomas)
* Large rework of PPC405 machine (Cedric)
* Fixes for VSX instructions (Victor and Matheus)
* Fix for e6500 CPU (Fabiano)
* Initial support for PMU (Daniel)

# gpg: Signature made Fri 17 Dec 2021 09:20:31 AM PST
# gpg:                using RSA key A0F66548F04895EBFE6B0B6051A343C7CFFBECA1
# gpg: Good signature from "Cédric Le Goater <clg@kaod.org>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: A0F6 6548 F048 95EB FE6B  0B60 51A3 43C7 CFFB ECA1

* tag 'pull-ppc-20211217' of https://github.com/legoater/qemu: (101 commits)
  ppc/pnv: Use QOM hierarchy to scan PEC PHB4 devices
  ppc/pnv: Move realize of PEC stacks under the PEC model
  ppc/pnv: Remove "system-memory" property from PHB4 PEC
  ppc/pnv: Compute the PHB index from the PHB4 PEC model
  ppc/pnv: Introduce a num_stack class attribute
  ppc/pnv: Introduce a "chip" property under the PHB4 model
  ppc/pnv: Introduce version and device_id class atributes for PHB4 devices
  ppc/pnv: Introduce a num_pecs class attribute for PHB4 PEC devices
  ppc/pnv: Use QOM hierarchy to scan PHB3 devices
  ppc/pnv: Move mapping of the PHB3 CQ regions under pnv_pbcq_realize()
  ppc/pnv: Drop the "num-phbs" property
  ppc/pnv: Use the chip class to check the index of PHB3 devices
  ppc/pnv: Introduce a "chip" property under PHB3
  PPC64/TCG: Implement 'rfebb' instruction
  target/ppc/power8-pmu.c: add PM_RUN_INST_CMPL (0xFA) event
  target/ppc: enable PMU instruction count
  target/ppc: enable PMU counter overflow with cycle events
  target/ppc: PMU: update counters on MMCR1 write
  target/ppc: PMU: update counters on PMCs r/w
  target/ppc: PMU basic cycle count for pseries TCG
  ...

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson 2021-12-17 09:55:14 -08:00
commit 93dc314c92
59 changed files with 2513 additions and 1646 deletions


@@ -1,7 +1,7 @@
PowerNV family boards (``powernv8``, ``powernv9``)
PowerNV family boards (``powernv8``, ``powernv9``, ``powernv10``)
==================================================================
PowerNV (as Non-Virtualized) is the "baremetal" platform using the
PowerNV (as Non-Virtualized) is the "bare metal" platform using the
OPAL firmware. It runs Linux on IBM and OpenPOWER systems and it can
be used as a hypervisor OS, running KVM guests, or simply as a host
OS.
@@ -16,16 +16,14 @@ Supported devices
-----------------
* Multi processor support for POWER8, POWER8NVL and POWER9.
* XSCOM, serial communication sideband bus to configure chiplets
* Simple LPC Controller
* Processor Service Interface (PSI) Controller
* Interrupt Controller, XICS (POWER8) and XIVE (POWER9)
* POWER8 PHB3 PCIe Host bridge and POWER9 PHB4 PCIe Host bridge
* Simple OCC is an on-chip microcontroller used for power management
tasks
* iBT device to handle BMC communication, with the internal BMC
simulator provided by QEMU or an external BMC such as an Aspeed
QEMU machine.
* XSCOM, serial communication sideband bus to configure chiplets.
* Simple LPC Controller.
* Processor Service Interface (PSI) Controller.
* Interrupt Controller, XICS (POWER8), XIVE (POWER9) and XIVE2 (Power10).
* POWER8 PHB3 PCIe Host bridge and POWER9 PHB4 PCIe Host bridge.
* Simple OCC is an on-chip micro-controller used for power management tasks.
* iBT device to handle BMC communication, with the internal BMC simulator
provided by QEMU or an external BMC such as an Aspeed QEMU machine.
* PNOR containing the different firmware partitions.
Missing devices
@@ -33,31 +31,42 @@ Missing devices
A lot is missing, among which :
* POWER10 processor
* XIVE2 (POWER10) interrupt controller
* I2C controllers (yet to be merged)
* NPU/NPU2/NPU3 controllers
* EEH support for PCIe Host bridge controllers
* NX controller
* VAS controller
* chipTOD (Time Of Day)
* I2C controllers (yet to be merged).
* NPU/NPU2/NPU3 controllers.
* EEH support for PCIe Host bridge controllers.
* NX controller.
* VAS controller.
* chipTOD (Time Of Day).
* Self Boot Engine (SBE).
* FSI bus
* FSI bus.
Firmware
--------
The OPAL firmware (OpenPower Abstraction Layer) for OpenPower systems
includes the runtime services ``skiboot`` and the bootloader kernel and
initramfs ``skiroot``. Source code can be found on GitHub:
initramfs ``skiroot``. Source code can be found on the `OpenPOWER account at
GitHub <https://github.com/open-power>`_.
https://github.com/open-power.
Prebuilt images of ``skiboot`` and ``skiroot`` are made available on the `OpenPOWER <https://github.com/open-power/op-build/releases/>`__ site.
Prebuilt images of ``skiboot`` and ``skiroot`` are made available on the
`OpenPOWER <https://github.com/open-power/op-build/releases/>`__ site.
QEMU includes a prebuilt image of ``skiboot`` which is updated when a
more recent version is required by the models.
Current acceleration status
---------------------------
KVM acceleration in Linux Power hosts is provided by the kvm-hv and
kvm-pr modules. kvm-hv adheres to PAPR and is not compatible with
powernv. kvm-pr could in theory be used as a valid accel option, but
kvm-pr does not support this at the moment.
To spare users from uninformative errors when attempting to use
accel=kvm, the powernv machine reports an error stating that KVM is
not supported. This can be revisited in the future if kvm-pr (or any
other KVM alternative) becomes usable as a KVM accel for this machine.
Boot options
------------
@@ -83,6 +92,7 @@ and a SATA disk :
Complex PCIe configuration
~~~~~~~~~~~~~~~~~~~~~~~~~~
Six PHBs are defined per chip (POWER9) but no default PCI layout is
provided (to be compatible with libvirt). One PCI device can be added
on any of the available PCIe slots using command line options such as:
@@ -157,7 +167,7 @@ one on the command line :
The files `palmetto-SDR.bin <http://www.kaod.org/qemu/powernv/palmetto-SDR.bin>`__
and `palmetto-FRU.bin <http://www.kaod.org/qemu/powernv/palmetto-FRU.bin>`__
define a Sensor Data Record repository and a Field Replaceable Unit
inventory for a palmetto BMC. They can be used to extend the QEMU BMC
inventory for a Palmetto BMC. They can be used to extend the QEMU BMC
simulator.
.. code-block:: bash
@@ -189,4 +199,8 @@ CAVEATS
-------
* No support for multiple HW threads (SMT=1). Same as pseries.
* CPU can hang when doing intensive I/Os. Use ``-append powersave=off`` in that case.
Maintainer contact information
------------------------------
Cédric Le Goater <clg@kaod.org>


@@ -1,12 +1,238 @@
pSeries family boards (``pseries``)
===================================
The Power machine para-virtualized environment described by the `Linux on Power
Architecture Reference document (LoPAR)
<https://openpowerfoundation.org/wp-content/uploads/2020/07/LoPAR-20200812.pdf>`_
is called pSeries. This environment is also known as sPAPR, System p guests, or
simply Power Linux guests (although it is capable of running other operating
systems, such as AIX).
Even though pSeries is designed to behave as a guest environment, it is also
capable of acting as a hypervisor OS, providing, in that role, nested
virtualization capabilities.
Supported devices
-----------------
* Multiprocessor support for many Power processor generations: POWER7,
POWER7+, POWER8, POWER8NVL, POWER9, and Power10. Support for POWER5+ exists,
but its state is unknown.
* Interrupt Controller, XICS (POWER8) and XIVE (POWER9 and Power10).
* vPHB PCIe Host bridge.
* vscsi and vnet devices, compatible with the same devices available on a
PowerVM hypervisor with VIOS managing LPARs.
* Virtio based devices.
* PCIe device pass through.
Missing devices
---------------
* SPICE support.
Firmware
--------
`SLOF <https://github.com/aik/SLOF>`_ (Slimline Open Firmware) is an
implementation of the `IEEE 1275-1994, Standard for Boot (Initialization
Configuration) Firmware: Core Requirements and Practices
<https://standards.ieee.org/standard/1275-1994.html>`_.
QEMU includes a prebuilt image of SLOF which is updated when a more recent
version is required.
Build directions
----------------
.. code-block:: bash
./configure --target-list=ppc64-softmmu && make
Running instructions
--------------------
Select the pSeries machine type by running QEMU with the following
options:
.. code-block:: bash
qemu-system-ppc64 -M pseries <other QEMU arguments>
sPAPR devices
-------------
The sPAPR specification defines a set of para-virtualized devices, which are
also supported by the pSeries machine in QEMU and can be instantiated with the
``-device`` option:
* ``spapr-vlan`` : a virtual network interface.
* ``spapr-vscsi`` : a virtual SCSI disk interface.
* ``spapr-rng`` : a pseudo-device for passing random number generator data to the
guest (see the `H_RANDOM hypercall feature
<https://wiki.qemu.org/Features/HRandomHypercall>`_ for details).
* ``spapr-vty``: a virtual teletype.
* ``spapr-pci-host-bridge``: a PCI host bridge.
* ``tpm-spapr``: a Trusted Platform Module (TPM).
* ``spapr-tpm-proxy``: a TPM proxy.
These are compatible with the devices historically available for use when
running the IBM PowerVM hypervisor with LPARs.
However, since these devices were originally specified with another
hypervisor and non-Linux guests in mind, you should use the virtio counterparts
(virtio-net, virtio-blk/scsi and virtio-rng for instance) instead where
possible, since they will most likely give better performance with Linux guests
in a QEMU environment.
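As a rough illustration of the advice above, the following sketch prints a
command line that uses virtio devices in place of the sPAPR ones. The device
names (``virtio-net-pci``, ``virtio-scsi-pci``, ``scsi-hd``,
``virtio-rng-pci``) are standard QEMU devices, but the disk image name, the
IDs and the user-mode network backend are placeholders chosen for this example
and are not taken from the text.

```bash
# Print (rather than run) a pseries command line that uses virtio devices.
# disk.qcow2, net0, scsi0 and hd0 are placeholder names for this sketch.
pseries_virtio_cmd() {
    disk=${1:-disk.qcow2}
    # printf repeats the '%s ' format for every argument that follows it.
    printf '%s ' qemu-system-ppc64 -M pseries \
        -netdev user,id=net0 -device virtio-net-pci,netdev=net0 \
        -device virtio-scsi-pci,id=scsi0 \
        -drive "if=none,file=$disk,format=qcow2,id=hd0" \
        -device scsi-hd,drive=hd0,bus=scsi0.0 \
        -device virtio-rng-pci
    printf '\n'
}

pseries_virtio_cmd
```

Calling ``pseries_virtio_cmd my-guest.qcow2`` prints the same command line for
a different image; the printed line can then be reviewed and run by hand.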
The pSeries machine in QEMU is always instantiated with the following devices:
* A NVRAM device (``spapr-nvram``).
* A virtual teletype (``spapr-vty``).
* A PCI host bridge (``spapr-pci-host-bridge``).
Hence, there is no need to add them manually, unless you use the
``-nodefaults`` command line option in QEMU.
In the case of the default ``spapr-nvram`` device, if someone wants to make the
contents of the NVRAM device persistent, they will need to specify a PFLASH
device when starting QEMU, i.e. either use
``-drive if=pflash,file=<filename>,format=raw`` to set the default PFLASH
device, or specify one with an ID
(``-drive if=none,file=<filename>,format=raw,id=pfid``) and pass that ID to the
NVRAM device with ``-global spapr-nvram.drive=pfid``.
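The second variant described above could be prepared as in the sketch below,
which creates a zero-filled raw file and prints the matching QEMU options. The
helper name, the default file name and the 64 KiB size are assumptions made
for this example; the text does not state which image sizes are accepted, so
the size may need adjusting.

```bash
# Create a raw backing file for the NVRAM and print the options that attach it.
# The 64 KiB size is an assumption for this sketch, not a value from the text.
make_nvram() {
    file=${1:-nvram.img}
    # 64 blocks of 1024 bytes from /dev/zero give a zero-filled 64 KiB file.
    dd if=/dev/zero of="$file" bs=1024 count=64 2>/dev/null
    printf '%s\n' "-drive if=none,file=$file,format=raw,id=pfid -global spapr-nvram.drive=pfid"
}
```

The printed options are meant to be appended to a
``qemu-system-ppc64 -M pseries`` command line; reusing the same file on later
runs is what keeps the NVRAM contents persistent.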
sPAPR specification
^^^^^^^^^^^^^^^^^^^
The main source of documentation on the sPAPR standard is the `Linux on Power
Architecture Reference document (LoPAR)
<https://openpowerfoundation.org/wp-content/uploads/2020/07/LoPAR-20200812.pdf>`_.
However, documentation specific to QEMU's implementation of the specification
can also be found in QEMU documentation:
.. toctree::
:maxdepth: 1
../../specs/ppc-spapr-hcalls.rst
../../specs/ppc-spapr-numa.rst
../../specs/ppc-spapr-xive.rst
Other documentation available in QEMU docs directory:
* Hot plug (``/docs/specs/ppc-spapr-hotplug.txt``).
* Hypervisor calls needed by the Ultravisor
(``/docs/specs/ppc-spapr-uv-hcalls.txt``).
Switching between the KVM-PR and KVM-HV kernel module
-----------------------------------------------------
Currently, there are two implementations of KVM on Power, ``kvm_hv.ko`` and
``kvm_pr.ko``.
If a host supports both KVM modes, and both KVM kernel modules are loaded, it is
possible to switch between the two modes with the ``kvm-type`` parameter:
* Use ``qemu-system-ppc64 -M pseries,accel=kvm,kvm-type=PR`` to use the
``kvm_pr.ko`` kernel module.
* Use ``qemu-system-ppc64 -M pseries,accel=kvm,kvm-type=HV`` to use ``kvm_hv.ko``
instead.
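A small helper can pick the ``kvm-type`` value from the modules that are
actually loaded. This is only a sketch: the function name is hypothetical, it
reads a module list in ``/proc/modules`` format, it prefers HV over PR when
both are present (an arbitrary choice for this example), and it will not see
KVM support that is built into the kernel rather than loaded as a module.

```bash
# Pick a kvm-type value from a module list in /proc/modules format.
# Preferring HV when both modules are loaded is a choice made for this sketch.
pick_kvm_type() {
    modules=${1:-/proc/modules}
    if grep -q '^kvm_hv ' "$modules" 2>/dev/null; then
        echo HV
    elif grep -q '^kvm_pr ' "$modules" 2>/dev/null; then
        echo PR
    else
        return 1
    fi
}

# Print the matching command line, or a diagnostic if neither module is found.
if kvm_type=$(pick_kvm_type); then
    echo "qemu-system-ppc64 -M pseries,accel=kvm,kvm-type=$kvm_type"
else
    echo "neither kvm_hv nor kvm_pr is loaded" >&2
fi
```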
KVM-PR
^^^^^^
KVM-PR uses the so-called **PR**\ oblem state of the PPC CPUs to run the guests,
i.e. the virtual machine is run in user mode and all privileged instructions
trap and have to be emulated by the host. That means you can run KVM-PR inside
a pSeries guest (or a PowerVM LPAR for that matter), and that is where it
originated, as historically (prior to POWER7) it was not possible to run Linux
in hypervisor mode on a Power processor (this function was restricted to
PowerVM, the IBM proprietary hypervisor).
Because all privileged instructions are trapped, guests that use a lot of
privileged instructions run quite slowly with KVM-PR. On the other hand, because
of that, this kernel module can run on pretty much any PPC hardware, and is
able to emulate many guest CPUs. This module can even be used to run other
PowerPC guests like an emulated PowerMac.
As KVM-PR can be run inside a pSeries guest, it can also provide nested
virtualization capabilities (i.e. running a guest from within a guest).
It is important to note that, as KVM-HV provides much better execution
performance, maintenance work has been much more focused on it in recent
years. Maintenance for KVM-PR has been minimal.
In order to run KVM-PR guests with POWER9 processors, QEMU must be started
with the ``kernel_irqchip=off`` command line option.
KVM-HV
^^^^^^
KVM-HV uses the hypervisor mode of more recent Power processors, which allows
access to the bare metal hardware directly. Although POWER7 had this capability,
it was only starting with POWER8 that this was officially supported by IBM.
Originally, KVM-HV was only available when running on a PowerNV platform (a.k.a.
Power bare metal). Although it runs on a PowerNV platform, it can only be used
to start pSeries guests. As the pSeries guest doesn't have access to the
hypervisor mode of the Power CPU, it wasn't possible to run KVM-HV on a guest.
This limitation has been lifted, and now it is possible to run KVM-HV inside
pSeries guests as well, making nested virtualization possible with KVM-HV.
As KVM-HV has access to privileged instructions, guests that use a lot of these
can run much faster than with KVM-PR. On the other hand, the guest CPU has to be
of the same type as the host CPU this way, e.g. it is not possible to specify an
embedded PPC CPU for the guest with KVM-HV. However, there is at least the
possibility to run the guest in a backward-compatibility mode of previous
CPU generations, e.g. you can run a POWER7 guest on a POWER8 host by using
``-cpu POWER8,compat=power7`` as parameter to QEMU.
Modules support
---------------
As noted in the sections above, each module can run in a different
environment. The following table shows in which environments each module can
run. As long as you are in a supported environment, you can run KVM-PR or KVM-HV
nested. Combinations not shown in the table are not available.
+--------------+------------+------+-------------------+----------+--------+
| Platform     | Host type  | Bits | Page table format | KVM-HV   | KVM-PR |
+==============+============+======+===================+==========+========+
| PowerNV      | bare metal | 32   | hash              | no       | yes    |
|              |            |      +-------------------+----------+--------+
|              |            |      | radix             | N/A      | N/A    |
|              |            +------+-------------------+----------+--------+
|              |            | 64   | hash              | yes      | yes    |
|              |            |      +-------------------+----------+--------+
|              |            |      | radix             | yes      | no     |
+--------------+------------+------+-------------------+----------+--------+
| pSeries [1]_ | PowerNV    | 32   | hash              | no       | yes    |
|              |            |      +-------------------+----------+--------+
|              |            |      | radix             | N/A      | N/A    |
|              |            +------+-------------------+----------+--------+
|              |            | 64   | hash              | no       | yes    |
|              |            |      +-------------------+----------+--------+
|              |            |      | radix             | yes [2]_ | no     |
|              +------------+------+-------------------+----------+--------+
|              | PowerVM    | 32   | hash              | no       | yes    |
|              |            |      +-------------------+----------+--------+
|              |            |      | radix             | N/A      | N/A    |
|              |            +------+-------------------+----------+--------+
|              |            | 64   | hash              | no       | yes    |
|              |            |      +-------------------+----------+--------+
|              |            |      | radix [3]_        | no       | yes    |
+--------------+------------+------+-------------------+----------+--------+
.. [1] On POWER9 DD2.1 processors, the page table format on the host and guest
must be the same.
.. [2] KVM-HV cannot run nested on POWER8 machines.
.. [3] Introduced on Power10 machines.
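For scripting, the table can be encoded as a simple lookup. The sketch below
is one possible encoding: the function name and the ``bare-metal`` spelling of
the host type are choices made for this example, and the footnotes are only
kept as comments, so they still have to be checked by hand.

```bash
# Look up KVM-HV / KVM-PR availability for one row of the table above.
# Arguments: platform, host type, bits, page table format.
# Prints "<KVM-HV> <KVM-PR>"; returns 1 for combinations not in the table.
kvm_support() {
    case "$1/$2/$3/$4" in
        PowerNV/bare-metal/32/hash)  echo "no yes" ;;
        PowerNV/bare-metal/32/radix) echo "N/A N/A" ;;
        PowerNV/bare-metal/64/hash)  echo "yes yes" ;;
        PowerNV/bare-metal/64/radix) echo "yes no" ;;
        pSeries/PowerNV/32/hash)     echo "no yes" ;;
        pSeries/PowerNV/32/radix)    echo "N/A N/A" ;;
        pSeries/PowerNV/64/hash)     echo "no yes" ;;
        pSeries/PowerNV/64/radix)    echo "yes no" ;;  # [2] no nested KVM-HV on POWER8
        pSeries/PowerVM/32/hash)     echo "no yes" ;;
        pSeries/PowerVM/32/radix)    echo "N/A N/A" ;;
        pSeries/PowerVM/64/hash)     echo "no yes" ;;
        pSeries/PowerVM/64/radix)    echo "no yes" ;;  # [3] radix needs Power10
        *)                           return 1 ;;
    esac
}
```

For instance, ``kvm_support pSeries PowerNV 64 radix`` prints the KVM-HV and
KVM-PR columns for a radix pSeries guest running on a PowerNV host.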
Maintainer contact information
------------------------------
Cédric Le Goater <clg@kaod.org>
Daniel Henrique Barboza <danielhb413@gmail.com>