docs: rSTify ppc-spapr-hotplug.txt.

While working on this file, also removed an unused reference at the end of the file. The reference in the text was removed by commit 9f992cca93 (spapr: update spapr hotplug documentation), but the link at the end of the document was not removed then.

Signed-off-by: Leonardo Garcia <lagarcia@br.ibm.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <50ed30232e0e6eafb580c17adec3fba17b873014.1641995058.git.lagarcia@br.ibm.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Leonardo Garcia 2022-01-18 12:56:30 +01:00 committed by Cédric Le Goater
parent 6621441db5
commit 8915106c69


@ -1,224 +1,316 @@
= sPAPR Dynamic Reconfiguration = =============================
sPAPR Dynamic Reconfiguration
=============================
sPAPR/"pseries" guests make use of a facility called dynamic-reconfiguration sPAPR or pSeries guests make use of a facility called dynamic reconfiguration
to handle hotplugging of dynamic "physical" resources like PCI cards, or to handle hot plugging of dynamic "physical" resources like PCI cards, or
"logical"/paravirtual resources like memory, CPUs, and "physical" "logical"/para-virtual resources like memory, CPUs, and "physical"
host-bridges, which are generally managed by the host/hypervisor and provided host-bridges, which are generally managed by the host/hypervisor and provided
to guests as virtualized resources. The specifics of dynamic-reconfiguration to guests as virtualized resources. The specifics of dynamic reconfiguration
are documented extensively in PAPR+ v2.7, Section 13.1. This document are documented extensively in section 13 of the Linux on Power Architecture
provides a summary of that information as it applies to the implementation Reference document ([LoPAR]_). This document provides a summary of that
within QEMU. information as it applies to the implementation within QEMU.
== Dynamic-reconfiguration Connectors == Dynamic-reconfiguration Connectors
==================================
To manage hotplug/unplug of these resources, a firmware abstraction known as To manage hot plug/unplug of these resources, a firmware abstraction known as
a Dynamic Resource Connector (DRC) is used to assign a particular dynamic a Dynamic Resource Connector (DRC) is used to assign a particular dynamic
resource to the guest, and provide an interface for the guest to manage resource to the guest, and provide an interface for the guest to manage
configuration/removal of the resource associated with it. configuration/removal of the resource associated with it.
== Device-tree description of DRCs == Device tree description of DRCs
===============================
A set of 4 Open Firmware device tree array properties are used to describe A set of four Open Firmware device tree array properties are used to describe
the name/index/power-domain/type of each DRC allocated to a guest at the name/index/power-domain/type of each DRC allocated to a guest at
boot-time. There may be multiple sets of these arrays, rooted at different boot time. There may be multiple sets of these arrays, rooted at different
paths in the device tree depending on the type of resource the DRCs manage. paths in the device tree depending on the type of resource the DRCs manage.
In some cases, the DRCs themselves may be provided by a dynamic resource, In some cases, the DRCs themselves may be provided by a dynamic resource,
such as the DRCs managing PCI slots on a hotplugged PHB. In this case the such as the DRCs managing PCI slots on a hot plugged PHB. In this case the
arrays would be fetched as part of the device tree retrieval interfaces arrays would be fetched as part of the device tree retrieval interfaces
for hotplugged resources described under "Guest->Host interface". for hot plugged resources described under :ref:`guest-host-interface`.
The array properties are described below. Each entry/element in an array The array properties are described below. Each entry/element in an array
describes the DRC identified by the element in the corresponding position describes the DRC identified by the element in the corresponding position
of ibm,drc-indexes: of ``ibm,drc-indexes``:
ibm,drc-names: ``ibm,drc-names``
first 4-bytes: BE-encoded integer denoting the number of entries -----------------
each entry: a NULL-terminated <name> string encoded as a byte array
<name> values for logical/virtual resources are defined in PAPR+ v2.7, First 4-bytes: big-endian (BE) encoded integer denoting the number of entries.
Section 13.5.2.4, and basically consist of the type of the resource
followed by a space and a numerical value that's unique across resources
of that type.
<name> values for "physical" resources such as PCI or VIO devices are Each entry: a NULL-terminated ``<name>`` string encoded as a byte array.
defined as being "location codes", which are the "location labels" of
each encapsulating device, starting from the chassis down to the
individual slot for the device, concatenated by a hyphen. This provides
a mapping of resources to a physical location in a chassis for debugging
purposes. For QEMU, this mapping is less important, so we assign a
location code that conforms to naming specifications, but is simply a
location label for the slot by itself to simplify the implementation.
The naming convention for location labels is documented in detail in
PAPR+ v2.7, Section 12.3.1.5, and in our case amounts to using "C<n>"
for PCI/VIO device slots, where <n> is unique across all PCI/VIO
device slots.
ibm,drc-indexes: ``<name>`` values for logical/virtual resources are defined in the Linux on
first 4-bytes: BE-encoded integer denoting the number of entries Power Architecture Reference ([LoPAR]_) section 13.5.2.4, and basically
each 4-byte entry: BE-encoded <index> integer that is unique across all DRCs consist of the type of the resource followed by a space and a numerical
in the machine value that's unique across resources of that type.
<index> is arbitrary, but in the case of QEMU we try to maintain the ``<name>`` values for "physical" resources such as PCI or VIO devices are
convention used to assign them to pSeries guests on pHyp: defined as being "location codes", which are the "location labels" of each
encapsulating device, starting from the chassis down to the individual slot
for the device, concatenated by a hyphen. This provides a mapping of
resources to a physical location in a chassis for debugging purposes. For
QEMU, this mapping is less important, so we assign a location code that
conforms to naming specifications, but is simply a location label for the
slot by itself to simplify the implementation. The naming convention for
location labels is documented in detail in the [LoPAR]_ section 12.3.1.5,
and in our case amounts to using ``C<n>`` for PCI/VIO device slots, where
``<n>`` is unique across all PCI/VIO device slots.
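The string-array layout described above (a 4-byte BE entry count followed by NUL-terminated strings) can be walked with a few lines of C. This is only a minimal sketch: ``read_be32()`` and ``drc_string_at()`` are hypothetical helper names chosen for illustration, not QEMU functions, and the property is assumed to be available as a plain byte buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit big-endian integer from a byte buffer. */
static uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Return the n-th (0-based) string of a property laid out as a 4-byte BE
 * entry count followed by NUL-terminated strings. Returns NULL if n is out
 * of range or the buffer ends before the string is terminated.
 * Hypothetical helper, for illustration only.
 */
static const char *drc_string_at(const uint8_t *prop, size_t len, uint32_t n)
{
    assert(prop != NULL);
    if (len < 4) {
        return NULL;
    }
    uint32_t count = read_be32(prop);
    if (n >= count) {
        return NULL;
    }
    size_t off = 4;
    for (uint32_t i = 0; ; i++) {
        const uint8_t *nul = memchr(prop + off, 0, len - off);
        if (nul == NULL) {
            return NULL;                 /* truncated property */
        }
        if (i == n) {
            return (const char *)(prop + off);
        }
        off = (size_t)(nul - prop) + 1;  /* skip past the terminator */
    }
}
```

The same walk applies to any property that shares this count-plus-strings layout, such as ``ibm,drc-types``.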
bit[31:28]: integer encoding of <type>, where <type> is: ``ibm,drc-indexes``
1 for CPU resource -------------------
2 for PHB resource
3 for VIO resource
4 for PCI resource
8 for Memory resource
bit[27:0]: integer encoding of <id>, where <id> is unique across
all resources of specified type
ibm,drc-power-domains: First 4-bytes: BE-encoded integer denoting the number of entries.
first 4-bytes: BE-encoded integer denoting the number of entries
each 4-byte entry: 32-bit, BE-encoded <index> integer that specifies the Each 4-byte entry: BE-encoded ``<index>`` integer that is unique across all
power domain the resource will be assigned to. In the case of QEMU DRCs in the machine.
we associated all resources with a "live insertion" domain, where the
power is assumed to be managed automatically. The integer value for ``<index>`` is arbitrary, but in the case of QEMU we try to maintain the
this domain is a special value of -1. convention used to assign them to pSeries guests on pHyp (the hypervisor
portion of PowerVM):
``bit[31:28]``: integer encoding of ``<type>``, where ``<type>`` is:
``1`` for CPU resource.
``2`` for PHB resource.
``3`` for VIO resource.
``4`` for PCI resource.
``8`` for memory resource.
``bit[27:0]``: integer encoding of ``<id>``, where ``<id>`` is unique
across all resources of specified type.
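The bit layout above can be expressed as a pair of pack/unpack helpers. The sketch below is illustrative only; the enum and function names are made up for this example and are not taken from the QEMU source.

```c
#include <assert.h>
#include <stdint.h>

/* <type> values from the list above (names are hypothetical). */
enum drc_type_id {
    DRC_TYPE_ID_CPU    = 1,
    DRC_TYPE_ID_PHB    = 2,
    DRC_TYPE_ID_VIO    = 3,
    DRC_TYPE_ID_PCI    = 4,
    DRC_TYPE_ID_MEMORY = 8,
};

/* Pack <type> into bits 31:28 and <id> into bits 27:0. */
static uint32_t drc_index_make(uint32_t type, uint32_t id)
{
    assert(type <= 0xF && id <= 0x0FFFFFFF);
    return (type << 28) | id;
}

/* Extract the <type> field (bits 31:28). */
static uint32_t drc_index_type(uint32_t index)
{
    return index >> 28;
}

/* Extract the <id> field (bits 27:0). */
static uint32_t drc_index_id(uint32_t index)
{
    return index & 0x0FFFFFFF;
}
```

For instance, under this encoding a memory resource with ``<id>`` 5 gets the index ``0x80000005``.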
``ibm,drc-power-domains``
-------------------------
First 4-bytes: BE-encoded integer denoting the number of entries.
Each 4-byte entry: 32-bit, BE-encoded ``<index>`` integer that specifies the
power domain the resource will be assigned to. In the case of QEMU we
associate all resources with a "live insertion" domain, where the power is
assumed to be managed automatically. The integer value for this domain is a
special value of ``-1``.
ibm,drc-types: ``ibm,drc-types``
first 4-bytes: BE-encoded integer denoting the number of entries -----------------
each entry: a NULL-terminated <type> string encoded as a byte array
<type> is assigned as follows: First 4-bytes: BE-encoded integer denoting the number of entries.
"CPU" for a CPU
"PHB" for a physical host-bridge
"SLOT" for a VIO slot
"28" for a PCI slot
"MEM" for memory resource
== Guest->Host interface to manage dynamic resources == Each entry: a NULL-terminated ``<type>`` string encoded as a byte array.
``<type>`` is assigned as follows:
Each DRC is given a globally unique DRC Index, and resources associated with "CPU" for a CPU.
a particular DRC are configured/managed by the guest via a number of RTAS
calls which reference individual DRCs based on the DRC index. This can be
considered the guest->host interface.
rtas-set-power-level: "PHB" for a physical host-bridge.
arg[0]: integer identifying power domain
arg[1]: new power level for the domain, 0-100
output[0]: status, 0 on success
output[1]: power level after command
Set the power level for a specified power domain "SLOT" for a VIO slot.
rtas-get-power-level: "28" for a PCI slot.
arg[0]: integer identifying power domain
output[0]: status, 0 on success
output[1]: current power level
Get the power level for a specified power domain "MEM" for memory resource.
rtas-set-indicator: .. _guest-host-interface:
arg[0]: integer identifying sensor/indicator type
arg[1]: index of sensor, for DR-related sensors this is generally the
DRC index
arg[2]: desired sensor value
output[0]: status, 0 on success
Set the state of an indicator or sensor. For the purpose of this document we Guest->Host interface to manage dynamic resources
focus on the indicator/sensor types associated with a DRC. The types are: =================================================
9001: isolation-state, controls/indicates whether a device has been made Each DRC is given a globally unique DRC index, and resources associated with a
accessible to a guest particular DRC are configured/managed by the guest via a number of RTAS calls
which reference individual DRCs based on the DRC index. This can be considered
the guest->host interface.
supported sensor values: ``rtas-set-power-level``
0: isolate, device is made unaccessible by guest OS ------------------------
1: unisolate, device is made available to guest OS
9002: dr-indicator, controls "visual" indicator associated with device Set the power level for a specified power domain.
supported sensor values: ``arg[0]``: integer identifying power domain.
0: inactive, resource may be safely removed
1: active, resource is in use and cannot be safely removed
2: identify, used to visually identify slot for interactive hotplug
3: action, in most cases, used in the same manner as identify
9003: allocation-state, generally only used for "logical" DR resources to ``arg[1]``: new power level for the domain, ``0-100``.
request the allocation/deallocation of a resource prior to acquiring
it via isolation-state->unisolate, or after releasing it via
isolation-state->isolate, respectively. for "physical" DR (like PCI
hotplug/unplug) the pre-allocation of the resource is implied and
this sensor is unused.
supported sensor values: ``output[0]``: status, ``0`` on success.
0: unusable, tell firmware/system the resource can be
unallocated/reclaimed and added back to the system resource pool
1: usable, request the resource be allocated/reserved for use by
guest OS
2: exchange, used to allocate a spare resource to use for fail-over
in certain situations. unused in QEMU
3: recover, used to reclaim a previously allocated resource that's
not currently allocated to the guest OS. unused in QEMU
rtas-get-sensor-state: ``output[1]``: power level after command.
arg[0]: integer identifying sensor/indicator type
arg[1]: index of sensor, for DR-related sensors this is generally the
DRC index
output[0]: status, 0 on success
Used to read an indicator or sensor value. ``rtas-get-power-level``
------------------------
For DR-related operations, the only noteworthy sensor is dr-entity-sense, Get the power level for a specified power domain.
which has a type value of 9003, as allocation-state does in the case of
rtas-set-indicator. The semantics/encodings of the sensor values are distinct
however:
supported sensor values for dr-entity-sense (9003) sensor: ``arg[0]``: integer identifying power domain.
0: empty,
for physical resources: DRC/slot is empty
for logical resources: unused
1: present,
for physical resources: DRC/slot is populated with a device/resource
for logical resources: resource has been allocated to the DRC
2: unusable,
for physical resources: unused
for logical resources: DRC has no resource allocated to it
3: exchange,
for physical resources: unused
for logical resources: resource available for exchange (see
allocation-state sensor semantics above)
4: recovery,
for physical resources: unused
for logical resources: resource available for recovery (see
allocation-state sensor semantics above)
rtas-ibm-configure-connector: ``output[0]``: status, ``0`` on success.
arg[0]: guest physical address of 4096-byte work area buffer
arg[1]: 0, or address of additional 4096-byte work area buffer. only non-zero
if a prior RTAS response indicated a need for additional memory
output[0]: status:
0: completed transmittal of device-tree node
1: instruct guest to prepare for next DT sibling node
2: instruct guest to prepare for next DT child node
3: instruct guest to prepare for next DT property
4: instruct guest to ascend to parent DT node
5: instruct guest to provide additional work-area buffer
via arg[1]
990x: instruct guest that operation took too long and to try
again later
Used to fetch an OF device-tree description of the resource associated with ``output[1]``: current power level.
a particular DRC. The DRC index is encoded in the first 4-bytes of the first
work area buffer.
Work area layout, using 4-byte offsets: ``rtas-set-indicator``
wa[0]: DRC index of the DRC to fetch device-tree nodes from ----------------------
wa[1]: 0 (hard-coded)
wa[2]: for next-sibling/next-child response:
wa offset of null-terminated string denoting the new node's name
for next-property response:
wa offset of null-terminated string denoting new property's name
wa[3]: for next-property response (unused otherwise):
byte-length of new property's value
wa[4]: for next-property response (unused otherwise):
new property's value, encoded as an OFDT-compatible byte array
== hotplug/unplug events == Set the state of an indicator or sensor.
``arg[0]``: integer identifying sensor/indicator type.
``arg[1]``: index of sensor, for DR-related sensors this is generally the DRC
index.
``arg[2]``: desired sensor value.
``output[0]``: status, ``0`` on success.
For the purpose of this document we focus on the indicator/sensor types
associated with a DRC. The types are:
* ``9001``: ``isolation-state``, controls/indicates whether a device has been
made accessible to a guest. Supported sensor values:
``0``: ``isolate``, device is made inaccessible by guest OS.
``1``: ``unisolate``, device is made available to guest OS.
* ``9002``: ``dr-indicator``, controls "visual" indicator associated with
device. Supported sensor values:
``0``: ``inactive``, resource may be safely removed.
``1``: ``active``, resource is in use and cannot be safely removed.
``2``: ``identify``, used to visually identify slot for interactive hot plug.
``3``: ``action``, in most cases, used in the same manner as identify.
* ``9003``: ``allocation-state``, generally only used for "logical" DR resources
to request the allocation/deallocation of a resource prior to acquiring it via
``isolation-state->unisolate``, or after releasing it via
``isolation-state->isolate``, respectively. For "physical" DR (like PCI
hot plug/unplug) the pre-allocation of the resource is implied and this sensor
is unused. Supported sensor values:
``0``: ``unusable``, tell firmware/system the resource can be
unallocated/reclaimed and added back to the system resource pool.
``1``: ``usable``, request the resource be allocated/reserved for use by
guest OS.
``2``: ``exchange``, used to allocate a spare resource to use for fail-over
in certain situations. Unused in QEMU.
``3``: ``recover``, used to reclaim a previously allocated resource that's
not currently allocated to the guest OS. Unused in QEMU.
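As a compact summary of the three DRC-related indicator types and their value ranges listed above, the following sketch checks whether a requested value is one of the documented ones. The identifiers are hypothetical, and the check reflects only the values listed in this document, not whatever additional validation an implementation may perform.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Indicator/sensor type tokens from the list above (names are hypothetical). */
enum dr_indicator_type {
    DR_ISOLATION_STATE  = 9001,  /* 0: isolate, 1: unisolate */
    DR_INDICATOR        = 9002,  /* 0: inactive ... 3: action */
    DR_ALLOCATION_STATE = 9003,  /* 0: unusable ... 3: recover */
};

/* Return true if value is one of the documented values for the given type. */
static bool dr_indicator_value_valid(uint32_t type, uint32_t value)
{
    switch (type) {
    case DR_ISOLATION_STATE:
        return value <= 1;
    case DR_INDICATOR:
    case DR_ALLOCATION_STATE:
        return value <= 3;
    default:
        return false;            /* not a DRC-related type covered here */
    }
}
```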
``rtas-get-sensor-state``
-------------------------
Used to read an indicator or sensor value.
``arg[0]``: integer identifying sensor/indicator type.
``arg[1]``: index of sensor, for DR-related sensors this is generally the DRC
index.
``output[0]``: status, ``0`` on success.
For DR-related operations, the only noteworthy sensor is ``dr-entity-sense``,
which has a type value of ``9003``, as ``allocation-state`` does in the case of
``rtas-set-indicator``. The semantics/encodings of the sensor values are
distinct however.
Supported sensor values for ``dr-entity-sense`` (``9003``) sensor:
``0``: empty.
For physical resources: DRC/slot is empty.
For logical resources: unused.
``1``: present.
For physical resources: DRC/slot is populated with a device/resource.
For logical resources: resource has been allocated to the DRC.
``2``: unusable.
For physical resources: unused.
For logical resources: DRC has no resource allocated to it.
``3``: exchange.
For physical resources: unused.
For logical resources: resource available for exchange (see
``allocation-state`` sensor semantics above).
``4``: recovery.
For physical resources: unused.
For logical resources: resource available for recovery (see
``allocation-state`` sensor semantics above).
``rtas-ibm-configure-connector``
--------------------------------
Used to fetch an OpenFirmware device tree description of the resource associated
with a particular DRC.
``arg[0]``: guest physical address of 4096-byte work area buffer.
``arg[1]``: 0, or address of additional 4096-byte work area buffer; only
non-zero if a prior RTAS response indicated a need for additional memory.
``output[0]``: status:
``0``: completed transmittal of device tree node.
``1``: instruct guest to prepare for next device tree sibling node.
``2``: instruct guest to prepare for next device tree child node.
``3``: instruct guest to prepare for next device tree property.
``4``: instruct guest to ascend to parent device tree node.
``5``: instruct guest to provide additional work-area buffer via ``arg[1]``.
``990x``: instruct guest that operation took too long and to try again
later.
The DRC index is encoded in the first 4-bytes of the first work area buffer.
Work area (``wa``) layout, using 4-byte offsets:
``wa[0]``: DRC index of the DRC to fetch device tree nodes from.
``wa[1]``: ``0`` (hard-coded).
``wa[2]``:
For next-sibling/next-child response:
``wa`` offset of null-terminated string denoting the new node's name.
For next-property response:
``wa`` offset of null-terminated string denoting new property's name.
``wa[3]``: for next-property response (unused otherwise):
Byte-length of new property's value.
``wa[4]``: for next-property response (unused otherwise):
New property's value, encoded as an OFDT-compatible byte array.
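To make the work area layout more concrete, here is a rough guest-side sketch that reads the name offset in ``wa[2]`` and the property length in ``wa[3]``. It assumes that the work area words are big-endian and that a "``wa`` offset" is a byte offset from the start of the 4096-byte buffer; both are assumptions of this example, as are all identifier names. The value in ``wa[4]`` is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CC_WA_SIZE 4096u

/* output[0] status values from the list above (names are hypothetical). */
enum cc_status {
    CC_STATUS_DONE          = 0,
    CC_STATUS_NEXT_SIBLING  = 1,
    CC_STATUS_NEXT_CHILD    = 2,
    CC_STATUS_NEXT_PROPERTY = 3,
    CC_STATUS_PREV_PARENT   = 4,
    CC_STATUS_MORE_MEMORY   = 5,
};

/* Read a 32-bit big-endian integer from a byte buffer. */
static uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Return the node or property name referenced by wa[2], or NULL if the
 * offset is outside the work area or the string is not terminated inside it.
 */
static const char *cc_wa_name(const uint8_t *wa)
{
    assert(wa != NULL);
    uint32_t off = read_be32(wa + 2 * 4);
    if (off >= CC_WA_SIZE ||
        memchr(wa + off, 0, CC_WA_SIZE - off) == NULL) {
        return NULL;
    }
    return (const char *)(wa + off);
}

/* Return the byte length of the property value, stored in wa[3]. */
static uint32_t cc_wa_prop_len(const uint8_t *wa)
{
    assert(wa != NULL);
    return read_be32(wa + 3 * 4);
}
```

A guest would typically call the RTAS function in a loop, using helpers like these after each next-sibling, next-child, or next-property status until status ``0`` is returned.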
Hot plug/unplug events
======================
For most DR operations, the hypervisor will issue host->guest add/remove events For most DR operations, the hypervisor will issue host->guest add/remove events
using the EPOW/check-exception notification framework, where the host issues a using the EPOW/check-exception notification framework, where the host issues a
@ -230,130 +322,140 @@ requests via EPOW events.
For DR, this framework has been extended to include hotplug events, which were For DR, this framework has been extended to include hotplug events, which were
previously unneeded due to direct manipulation of DR-related guest userspace previously unneeded due to direct manipulation of DR-related guest userspace
tools by host-level management such as an HMC. This level of management is not tools by host-level management such as an HMC. This level of management is not
applicable to PowerKVM, hence the reason for extending the notification applicable to KVM on Power, hence the reason for extending the notification
framework to support hotplug events. framework to support hotplug events.
The format for these EPOW-signalled events is described below under The format for these EPOW-signalled events is described below under
"hotplug/unplug event structure". Note that these events are not :ref:`hot-plug-unplug-event-structure`. Note that these events are not formally
formally part of the PAPR+ specification, and have been superseded by a part of the PAPR+ specification, and have been superseded by a newer format,
newer format, also described below under "hotplug/unplug event structure", also described below under :ref:`hot-plug-unplug-event-structure`, and so are
and so are now deemed a "legacy" format. The formats are similar, but the now deemed a "legacy" format. The formats are similar, but the "modern" format
"modern" format contains additional fields/flags, which are denoted for the contains additional fields/flags, which are denoted for the purposes of this
purposes of this documentation with "#ifdef GUEST_SUPPORTS_MODERN" guards. documentation with ``#ifdef GUEST_SUPPORTS_MODERN`` guards.
QEMU should assume support only for "legacy" fields/flags unless the guest QEMU should assume support only for "legacy" fields/flags unless the guest
advertises support for the "modern" format via ibm,client-architecture-support advertises support for the "modern" format via
hcall by setting byte 5, bit 6 of its ibm,architecture-vec-5 option vector ``ibm,client-architecture-support`` hcall by setting byte 5, bit 6 of its
structure (as described by LoPAPR v11, B.6.2.3). As with "legacy" format events, ``ibm,architecture-vec-5`` option vector structure (as described by [LoPAR]_,
"modern" format events are surfaced to the guest via check-exception RTAS calls, section B.5.2.3). As with "legacy" format events, "modern" format events are
but use a dedicated event source to signal the guest. This event source is surfaced to the guest via check-exception RTAS calls, but use a dedicated event
advertised to the guest by the addition of a "hot-plug-events" node under source to signal the guest. This event source is advertised to the guest by the
"/event-sources" node of the guest's device tree using the standard format addition of a ``hot-plug-events`` node under ``/event-sources`` node of the
described in LoPAPR v11, B.6.12.1. guest's device tree using the standard format described in [LoPAR]_,
section B.5.12.2.
== hotplug/unplug event structure == .. _hot-plug-unplug-event-structure:
The hotplug-specific payload in QEMU is implemented as follows (with all values Hot plug/unplug event structure
===============================
The hot plug specific payload in QEMU is implemented as follows (with all values
encoded in big-endian format): encoded in big-endian format):
struct rtas_event_log_v6_hp { .. code-block:: c
#define SECTION_ID_HOTPLUG 0x4850 /* HP */
struct section_header {
uint16_t section_id; /* set to SECTION_ID_HOTPLUG */
uint16_t section_length; /* sizeof(rtas_event_log_v6_hp),
* plus the length of the DRC name
* if a DRC name identifier is
* specified for hotplug_identifier
*/
uint8_t section_version; /* version 1 */
uint8_t section_subtype; /* unused */
uint16_t creator_component_id; /* unused */
} hdr;
#define RTAS_LOG_V6_HP_TYPE_CPU 1
#define RTAS_LOG_V6_HP_TYPE_MEMORY 2
#define RTAS_LOG_V6_HP_TYPE_SLOT 3
#define RTAS_LOG_V6_HP_TYPE_PHB 4
#define RTAS_LOG_V6_HP_TYPE_PCI 5
uint8_t hotplug_type; /* type of resource/device */
#define RTAS_LOG_V6_HP_ACTION_ADD 1
#define RTAS_LOG_V6_HP_ACTION_REMOVE 2
uint8_t hotplug_action; /* action (add/remove) */
#define RTAS_LOG_V6_HP_ID_DRC_NAME 1
#define RTAS_LOG_V6_HP_ID_DRC_INDEX 2
#define RTAS_LOG_V6_HP_ID_DRC_COUNT 3
#ifdef GUEST_SUPPORTS_MODERN
#define RTAS_LOG_V6_HP_ID_DRC_COUNT_INDEXED 4
#endif
uint8_t hotplug_identifier; /* type of the resource identifier,
* which serves as the discriminator
* for the 'drc' union field below
*/
#ifdef GUEST_SUPPORTS_MODERN
uint8_t capabilities; /* capability flags, currently unused
* by QEMU
*/
#else
uint8_t reserved;
#endif
union {
uint32_t index; /* DRC index of resource to take action
* on
*/
uint32_t count; /* number of DR resources to take
* action on (guest chooses which)
*/
#ifdef GUEST_SUPPORTS_MODERN
struct {
uint32_t count; /* number of DR resources to take
* action on
*/
uint32_t index; /* DRC index of first resource to take
* action on. guest will take action
* on DRC index <index> through
* DRC index <index + count - 1> in
* sequential order
*/
} count_indexed;
#endif
char name[1]; /* string representing the name of the
* DRC to take action on
*/
} drc;
} QEMU_PACKED;
== ibm,lrdr-capacity == struct rtas_event_log_v6_hp {
#define SECTION_ID_HOTPLUG 0x4850 /* HP */
struct section_header {
uint16_t section_id; /* set to SECTION_ID_HOTPLUG */
uint16_t section_length; /* sizeof(rtas_event_log_v6_hp),
* plus the length of the DRC name
* if a DRC name identifier is
* specified for hotplug_identifier
*/
uint8_t section_version; /* version 1 */
uint8_t section_subtype; /* unused */
uint16_t creator_component_id; /* unused */
} hdr;
#define RTAS_LOG_V6_HP_TYPE_CPU 1
#define RTAS_LOG_V6_HP_TYPE_MEMORY 2
#define RTAS_LOG_V6_HP_TYPE_SLOT 3
#define RTAS_LOG_V6_HP_TYPE_PHB 4
#define RTAS_LOG_V6_HP_TYPE_PCI 5
uint8_t hotplug_type; /* type of resource/device */
#define RTAS_LOG_V6_HP_ACTION_ADD 1
#define RTAS_LOG_V6_HP_ACTION_REMOVE 2
uint8_t hotplug_action; /* action (add/remove) */
#define RTAS_LOG_V6_HP_ID_DRC_NAME 1
#define RTAS_LOG_V6_HP_ID_DRC_INDEX 2
#define RTAS_LOG_V6_HP_ID_DRC_COUNT 3
#ifdef GUEST_SUPPORTS_MODERN
#define RTAS_LOG_V6_HP_ID_DRC_COUNT_INDEXED 4
#endif
uint8_t hotplug_identifier; /* type of the resource identifier,
* which serves as the discriminator
* for the 'drc' union field below
*/
#ifdef GUEST_SUPPORTS_MODERN
uint8_t capabilities; /* capability flags, currently unused
* by QEMU
*/
#else
uint8_t reserved;
#endif
union {
uint32_t index; /* DRC index of resource to take action
* on
*/
uint32_t count; /* number of DR resources to take
* action on (guest chooses which)
*/
#ifdef GUEST_SUPPORTS_MODERN
struct {
uint32_t count; /* number of DR resources to take
* action on
*/
uint32_t index; /* DRC index of first resource to take
* action on. guest will take action
* on DRC index <index> through
* DRC index <index + count - 1> in
* sequential order
*/
} count_indexed;
#endif
char name[1]; /* string representing the name of the
* DRC to take action on
*/
} drc;
} QEMU_PACKED;
ibm,lrdr-capacity is a property in the /rtas device tree node that identifies ``ibm,lrdr-capacity``
the dynamic reconfiguration capabilities of the guest. It consists of a triple =====================
consisting of <phys>, <size> and <maxcpus>.
<phys>, encoded in BE format represents the maximum address in bytes and ``ibm,lrdr-capacity`` is a property in the /rtas device tree node that
identifies the dynamic reconfiguration capabilities of the guest. It consists
of a triple consisting of ``<phys>``, ``<size>`` and ``<maxcpus>``.
``<phys>``, encoded in BE format represents the maximum address in bytes and
hence the maximum memory that can be allocated to the guest. hence the maximum memory that can be allocated to the guest.
<size>, encoded in BE format represents the size increments in which ``<size>``, encoded in BE format represents the size increments in which
memory can be hot-plugged to the guest. memory can be hot-plugged to the guest.
<maxcpus>, a BE-encoded integer, represents the maximum number of ``<maxcpus>``, a BE-encoded integer, represents the maximum number of
processors that the guest can have. processors that the guest can have.
pseries guests use this property to note the maximum allowed CPUs for the ``pseries`` guests use this property to note the maximum allowed CPUs for the
guest. guest.
== ibm,dynamic-reconfiguration-memory == ``ibm,dynamic-reconfiguration-memory``
======================================
ibm,dynamic-reconfiguration-memory is a device tree node that represents ``ibm,dynamic-reconfiguration-memory`` is a device tree node that represents
dynamically reconfigurable logical memory blocks (LMB). This node dynamically reconfigurable logical memory blocks (LMB). This node is generated
is generated only when the guest advertises the support for it via only when the guest advertises the support for it via
ibm,client-architecture-support call. Memory that is not dynamically ``ibm,client-architecture-support`` call. Memory that is not dynamically
reconfigurable is represented by /memory nodes. The properties of this reconfigurable is represented by ``/memory`` nodes. The properties of this node
node that are of interest to the sPAPR memory hotplug implementation that are of interest to the sPAPR memory hotplug implementation in QEMU are
in QEMU are described here. described here.
ibm,lmb-size ``ibm,lmb-size``
----------------
This 64bit integer defines the size of each dynamically reconfigurable LMB. This 64-bit integer defines the size of each dynamically reconfigurable LMB.
ibm,associativity-lookup-arrays ``ibm,associativity-lookup-arrays``
-----------------------------------
This property defines a lookup array in which the NUMA associativity This property defines a lookup array in which the NUMA associativity
information for each LMB can be found. It is a property encoded array information for each LMB can be found. It is a property encoded array
@ -361,13 +463,14 @@ that begins with an integer M, the number of associativity lists followed
by an integer N, the number of entries per associativity list and terminated by an integer N, the number of entries per associativity list and terminated
by M associativity lists each of length N integers. by M associativity lists each of length N integers.
This property provides the same information as given by ibm,associativity This property provides the same information as given by ``ibm,associativity``
property in a /memory node. Each assigned LMB has an index value between property in a ``/memory`` node. Each assigned LMB has an index value between
0 and M-1 which is used as an index into this table to select which 0 and M-1 which is used as an index into this table to select which
associativity list to use for the LMB. This index value for each LMB associativity list to use for the LMB. This index value for each LMB is defined
is defined in ibm,dynamic-memory property. in ``ibm,dynamic-memory`` property.
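The lookup described above amounts to indexing a flat M x N table. The sketch below assumes that M, N and every list entry are 32-bit BE integers, which this document does not state explicitly; ``assoc_lookup()`` is a hypothetical name used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 32-bit big-endian integer from a byte buffer. */
static uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Copy associativity list number idx into out[] and return its length N,
 * or -1 if idx is out of range, out[] is too small, or the property is
 * shorter than its M and N header claims.
 */
static int assoc_lookup(const uint8_t *prop, size_t len, uint32_t idx,
                        uint32_t *out, size_t out_cap)
{
    assert(prop != NULL && out != NULL);
    if (len < 8) {
        return -1;
    }
    uint32_t m = read_be32(prop);        /* number of lists */
    uint32_t n = read_be32(prop + 4);    /* entries per list */
    if (idx >= m || n > out_cap || n > INT32_MAX) {
        return -1;
    }
    if ((uint64_t)m * n > (len - 8) / 4) {
        return -1;                       /* truncated property */
    }
    const uint8_t *list = prop + 8 + (size_t)idx * n * 4;
    for (uint32_t i = 0; i < n; i++) {
        out[i] = read_be32(list + 4 * (size_t)i);
    }
    return (int)n;
}
```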
ibm,dynamic-memory ``ibm,dynamic-memory``
----------------------
This property describes the dynamically reconfigurable memory. It is a This property describes the dynamically reconfigurable memory. It is a
property encoded array that has an integer N, the number of LMBs followed property encoded array that has an integer N, the number of LMBs followed
@ -375,19 +478,19 @@ by N LMB list entries.
Each LMB list entry consists of the following elements: Each LMB list entry consists of the following elements:
- Logical address of the start of the LMB encoded as a 64bit integer. This - Logical address of the start of the LMB encoded as a 64-bit integer. This
corresponds to reg property in /memory node. corresponds to ``reg`` property in ``/memory`` node.
- DRC index of the LMB that corresponds to ibm,my-drc-index property - DRC index of the LMB that corresponds to ``ibm,my-drc-index`` property
in a /memory node. in a ``/memory`` node.
- Four bytes reserved for expansion. - Four bytes reserved for expansion.
- Associativity list index for the LMB that is used as an index into - Associativity list index for the LMB that is used as an index into
ibm,associativity-lookup-arrays property described earlier. This ``ibm,associativity-lookup-arrays`` property described earlier. This is used
is used to retrieve the right associativity list to be used for this to retrieve the right associativity list to be used for this LMB.
LMB. - A 32-bit flags word. The bit at bit position ``0x00000008`` defines whether
- A 32bit flags word. The bit at bit position 0x00000008 defines whether
the LMB is assigned to the partition as of boot time. the LMB is assigned to the partition as of boot time.
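The LMB list entry above occupies 24 bytes (8 + 4 + 4 + 4 + 4). A minimal decoding sketch follows; it assumes that the leading count N, the DRC index and the associativity list index are 32-bit BE integers, and all struct, macro and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define LMB_ENTRY_SIZE    24u
#define LMB_FLAG_ASSIGNED 0x00000008u  /* LMB assigned as of boot time */

struct lmb_entry {
    uint64_t addr;         /* logical start address of the LMB */
    uint32_t drc_index;
    uint32_t assoc_index;  /* index into ibm,associativity-lookup-arrays */
    uint32_t flags;
};

/* Read a 32-bit big-endian integer from a byte buffer. */
static uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Read a 64-bit big-endian integer from a byte buffer. */
static uint64_t read_be64(const uint8_t *p)
{
    return ((uint64_t)read_be32(p) << 32) | read_be32(p + 4);
}

/* Decode entry i (0-based); returns false if i is out of range. */
static bool lmb_entry_parse(const uint8_t *prop, size_t len, uint32_t i,
                            struct lmb_entry *out)
{
    assert(prop != NULL && out != NULL);
    if (len < 4 || i >= read_be32(prop) ||
        (len - 4) / LMB_ENTRY_SIZE <= i) {
        return false;
    }
    const uint8_t *e = prop + 4 + (size_t)i * LMB_ENTRY_SIZE;
    out->addr = read_be64(e);
    out->drc_index = read_be32(e + 8);
    /* e + 12: four bytes reserved for expansion, skipped */
    out->assoc_index = read_be32(e + 16);
    out->flags = read_be32(e + 20);
    return true;
}
```

Testing ``flags & LMB_FLAG_ASSIGNED`` then tells whether the LMB was assigned to the partition at boot time.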
ibm,dynamic-memory-v2 ``ibm,dynamic-memory-v2``
-------------------------
This property describes the dynamically reconfigurable memory. This is This property describes the dynamically reconfigurable memory. This is
an alternate and newer way to describe dynamically reconfigurable memory. an alternate and newer way to describe dynamically reconfigurable memory.
@ -397,13 +500,11 @@ for each sequential group of LMBs that share common attributes.
Each LMB set entry consists of the following elements: Each LMB set entry consists of the following elements:
- Number of sequential LMBs in the entry represented by a 32bit integer. - Number of sequential LMBs in the entry represented by a 32-bit integer.
- Logical address of the first LMB in the set encoded as a 64bit integer. - Logical address of the first LMB in the set encoded as a 64-bit integer.
- DRC index of the first LMB in the set. - DRC index of the first LMB in the set.
- Associativity list index that is used as an index into - Associativity list index that is used as an index into
ibm,associativity-lookup-arrays property described earlier. This ``ibm,associativity-lookup-arrays`` property described earlier. This
is used to retrieve the right associativity list to be used for all is used to retrieve the right associativity list to be used for all
the LMBs in this set. the LMBs in this set.
- A 32bit flags word that applies to all the LMBs in the set. - A 32-bit flags word that applies to all the LMBs in the set.
[1] http://thread.gmane.org/gmane.linux.ports.ppc.embedded/75350/focus=106867