mirror of
https://github.com/Motorhead1991/qemu.git
synced 2025-08-04 08:13:54 -06:00
target-arm queue:
 * linux-user/elfload: Add missing arm64 hwcap values
 * stellaris-gamepad: Convert to qdev
 * docs/specs: Convert various txt docs to rST
 * MAINTAINERS: Make sure that gicv3_internal.h is covered, too
 * hw/arm/pxa2xx_gpio: Pass CPU using QOM link property
 * hw/watchdog/wdt_imx2: Trace MMIO access and timer activity
 * hw/misc/imx7_snvs: Trace MMIO access
 * hw/misc/imx6_ccm: Convert DPRINTF to trace events
 * hw/i2c/pm_smbus: Convert DPRINTF to trace events
 * target/arm: Enable FEAT_MOPS insns in user-mode emulation
 * linux-user: Report AArch64 hwcap2 fields above bit 31
 * target/arm: Make FEAT_MOPS SET* insns handle Xs == XZR correctly
 * target/arm: Fix SVE STR increment
 * hw/char/stm32f2xx_usart: implement TX interrupts
 * target/arm: Correctly propagate stage 1 BTI guarded bit in a two-stage walk
 * xlnx-versal-virt: Add AMD/Xilinx TRNG device

-----BEGIN PGP SIGNATURE-----

iQJNBAABCAA3FiEE4aXFk81BneKOgxXPPCUl7RQ2DN4FAmVD3hEZHHBldGVyLm1h
eWRlbGxAbGluYXJvLm9yZwAKCRA8JSXtFDYM3kuRD/4mLL2DB+yvQJrzSvUlrjfi
/orPDrY9xEQ7ln2YpNqc2BZ4wAgh947yk/ae5+lyACQcBhCPiwMyVK1bBscNxkgA
8YPmuugNem/64+IHiKkz6aroqjvC83dUzJ9R5O9ctV70mgrX32YnhXNkkYVI81Ar
bEwBznyYeCiy8ZafVxc2m70fiBOlurb6htYYdt7VHsgB0ozK/80UmuFI6exOKt1r
oVyYouMaidNV/AoqZBGSKT2UFvFmI57PWN0YQD8CMECLsB/mBE9TEzSvLRdlOB4G
qI5hgEJks61qDL6+YMJ+hskxW+D3g3I1WjuyqhKfiAzcKmmTAp1NsiiDtva8yBzX
lDUXc6bPomalrKo1SPsooJv9r4uE3hCayDOlR+qM38DL4j2soSd3QIP7dCzERbZx
snrD+ZTtgXtomUN8ojbnOK+kClEfURZ+wALbUEXwAh1sBwrKBxaD4ss4lA2esq10
HJPjBJzAWoSmK2DY6GWt2xIa+GvQwdPnxMpHbp3yAddGP7i/lHM0x60q5YpjHV++
DHaZmLBA7L9wcvT1VrwmieJaB+ADcSfkzBz2KznC4usdEY8BiJhjdRAzkqdGZWV5
HKEg8QwMYHg4QRUoZxW/XdtVzdqcjO5pTSUr3HUE+85sum2e9Yee6rybg1W/EWYv
7SnVkD5zG1BU268/p5k6UA==
=OgfH
-----END PGP SIGNATURE-----

Merge tag 'pull-target-arm-20231102' of https://git.linaro.org/people/pmaydell/qemu-arm into staging

* tag 'pull-target-arm-20231102' of https://git.linaro.org/people/pmaydell/qemu-arm: (33 commits)
  tests/qtest: Introduce tests for AMD/Xilinx Versal TRNG device
  hw/arm: xlnx-versal-virt: Add AMD/Xilinx TRNG device
  hw/misc: Introduce AMD/Xilix Versal TRNG device
  target/arm: Correctly propagate stage 1 BTI guarded bit in a two-stage walk
  hw/char/stm32f2xx_usart: Add more definitions for CR1 register
  hw/char/stm32f2xx_usart: Update IRQ when DR is written
  hw/char/stm32f2xx_usart: Extract common IRQ update code to update_irq()
  target/arm: Fix SVE STR increment
  target/arm: Make FEAT_MOPS SET* insns handle Xs == XZR correctly
  linux-user: Report AArch64 hwcap2 fields above bit 31
  target/arm: Enable FEAT_MOPS insns in user-mode emulation
  hw/i2c/pm_smbus: Convert DPRINTF to trace events
  hw/misc/imx6_ccm: Convert DPRINTF to trace events
  hw/misc/imx7_snvs: Trace MMIO access
  hw/watchdog/wdt_imx2: Trace timer activity
  hw/watchdog/wdt_imx2: Trace MMIO access
  hw/arm/pxa2xx_gpio: Pass CPU using QOM link property
  MAINTAINERS: Make sure that gicv3_internal.h is covered, too
  docs/specs/vmgenid: Convert to rST
  docs/specs/vmcoreinfo: Convert to rST
  ...

Conflicts:
  hw/input/stellaris_input.c

The qdev conversion in this pull request ("stellaris-gamepad: Convert to
qdev") eliminates the vmstate_register() call that was converted to
vmstate_register_any() in the conflicting migration pull request.
vmstate_register_any() is no longer necessary now that this device has
been converted to qdev, so take this pull request's version of
stellaris_gamepad.c over the previous pull request's stellaris_input.c
(the file was renamed).

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
commit
d762bf9793
56 changed files with 2302 additions and 776 deletions
|
@ -2,9 +2,10 @@
|
|||
EDU device
|
||||
==========
|
||||
|
||||
Copyright (c) 2014-2015 Jiri Slaby
|
||||
..
|
||||
Copyright (c) 2014-2015 Jiri Slaby
|
||||
|
||||
This document is licensed under the GPLv2 (or later).
|
||||
This document is licensed under the GPLv2 (or later).
|
||||
|
||||
This is an educational device for writing (kernel) drivers. Its original
|
||||
intention was to support the Linux kernel lectures taught at the Masaryk
|
||||
|
@ -15,10 +16,11 @@ The devices behaves very similar to the PCI bridge present in the COMBO6 cards
|
|||
developed under the Liberouter wings. Both PCI device ID and PCI space is
|
||||
inherited from that device.
|
||||
|
||||
Command line switches:
|
||||
-device edu[,dma_mask=mask]
|
||||
Command line switches
|
||||
---------------------
|
||||
|
||||
dma_mask makes the virtual device work with DMA addresses with the given
|
||||
``-device edu[,dma_mask=mask]``
|
||||
``dma_mask`` makes the virtual device work with DMA addresses with the given
|
||||
mask. For educational purposes, the device supports only 28 bits (256 MiB)
|
||||
by default. Students shall set dma_mask for the device in the OS driver
|
||||
properly.
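As a rough illustration of the 28-bit default mentioned above, the following Python sketch computes the mask and checks whether a DMA range fits under it. The helper names ``dma_mask_for_bits`` and ``address_fits`` are hypothetical and are not part of QEMU or of any driver API; the sketch also assumes the mask is a contiguous run of low bits.

```python
def dma_mask_for_bits(bits: int) -> int:
    """Return a mask with the low `bits` bits set (hypothetical helper)."""
    return (1 << bits) - 1


def address_fits(addr: int, nbytes: int, mask: int) -> bool:
    """True if every byte of [addr, addr + nbytes) lies at or below `mask`.

    Assumes `mask` is a contiguous low-bit mask such as 0x0FFFFFFF.
    """
    if nbytes <= 0 or addr < 0:
        return False
    return (addr + nbytes - 1) <= mask


DEFAULT_MASK = dma_mask_for_bits(28)   # 28 bits, the default stated in the spec
ADDRESSABLE_BYTES = DEFAULT_MASK + 1   # 2**28 bytes, i.e. 256 MiB
```

A driver would do the equivalent check through its operating system's DMA API rather than by hand; the arithmetic is shown only to make the 28 bits / 256 MiB relationship concrete.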
|
||||
|
@ -26,7 +28,8 @@ Command line switches:
|
|||
PCI specs
|
||||
---------
|
||||
|
||||
PCI ID: 1234:11e8
|
||||
PCI ID:
|
||||
``1234:11e8``
|
||||
|
||||
PCI Region 0:
|
||||
I/O memory, 1 MB in size. Users are supposed to communicate with the card
|
||||
|
@ -35,24 +38,29 @@ PCI Region 0:
|
|||
MMIO area spec
|
||||
--------------
|
||||
|
||||
Only size == 4 accesses are allowed for addresses < 0x80. size == 4 or
|
||||
size == 8 for the rest.
|
||||
Only ``size == 4`` accesses are allowed for addresses ``< 0x80``.
|
||||
``size == 4`` or ``size == 8`` for the rest.
|
||||
|
||||
0x00 (RO) : identification (0xRRrr00edu)
|
||||
RR -- major version
|
||||
rr -- minor version
|
||||
0x00 (RO) : identification
|
||||
Value is in the form ``0xRRrr00edu`` where:
|
||||
- ``RR`` -- major version
|
||||
- ``rr`` -- minor version
|
||||
|
||||
0x04 (RW) : card liveness check
|
||||
It is a simple value inversion (~ C operator).
|
||||
It is a simple value inversion (``~`` C operator).
|
||||
|
||||
0x08 (RW) : factorial computation
|
||||
The stored value is taken and factorial of it is put back here.
|
||||
This happens only after factorial bit in the status register (0x20
|
||||
below) is cleared.
|
||||
|
||||
0x20 (RW) : status register, bitwise OR
|
||||
0x01 -- computing factorial (RO)
|
||||
0x80 -- raise interrupt after finishing factorial computation
|
||||
0x20 (RW) : status register
|
||||
Bitwise OR of:
|
||||
|
||||
0x01
|
||||
computing factorial (RO)
|
||||
0x80
|
||||
raise interrupt after finishing factorial computation
|
||||
|
||||
0x24 (RO) : interrupt status register
|
||||
It contains values which raised the interrupt (see interrupt raise
|
||||
|
@ -76,13 +84,19 @@ size == 8 for the rest.
|
|||
0x90 (RW) : DMA transfer count
|
||||
The size of the area to perform the DMA on.
|
||||
|
||||
0x98 (RW) : DMA command register, bitwise OR
|
||||
0x01 -- start transfer
|
||||
0x02 -- direction (0: from RAM to EDU, 1: from EDU to RAM)
|
||||
0x04 -- raise interrupt 0x100 after finishing the DMA
|
||||
0x98 (RW) : DMA command register
|
||||
Bitwise OR of:
|
||||
|
||||
0x01
|
||||
start transfer
|
||||
0x02
|
||||
direction (0: from RAM to EDU, 1: from EDU to RAM)
|
||||
0x04
|
||||
raise interrupt 0x100 after finishing the DMA
|
||||
|
||||
IRQ controller
|
||||
--------------
|
||||
|
||||
An IRQ is generated when written to the interrupt raise register. The value
|
||||
appears in interrupt status register when the interrupt is raised and has to
|
||||
be written to the interrupt acknowledge register to lower it.
|
||||
|
@ -94,22 +108,28 @@ routine.
|
|||
|
||||
DMA controller
|
||||
--------------
|
||||
|
||||
One has to specify source, destination, size, and start the transfer. One
|
||||
4096 bytes long buffer at offset 0x40000 is available in the EDU device. I.e.
|
||||
one can perform DMA to/from this space when programmed properly.
|
||||
|
||||
Example of transferring a 100 byte block to and from the buffer using a given
|
||||
PCI address 'addr':
|
||||
addr -> DMA source address
|
||||
0x40000 -> DMA destination address
|
||||
100 -> DMA transfer count
|
||||
1 -> DMA command register
|
||||
while (DMA command register & 1)
|
||||
;
|
||||
PCI address ``addr``:
|
||||
|
||||
0x40000 -> DMA source address
|
||||
addr+100 -> DMA destination address
|
||||
100 -> DMA transfer count
|
||||
3 -> DMA command register
|
||||
while (DMA command register & 1)
|
||||
;
|
||||
::
|
||||
|
||||
addr -> DMA source address
|
||||
0x40000 -> DMA destination address
|
||||
100 -> DMA transfer count
|
||||
1 -> DMA command register
|
||||
while (DMA command register & 1)
|
||||
;
|
||||
|
||||
::
|
||||
|
||||
0x40000 -> DMA source address
|
||||
addr+100 -> DMA destination address
|
||||
100 -> DMA transfer count
|
||||
3 -> DMA command register
|
||||
while (DMA command register & 1)
|
||||
;
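The two register sequences above can be modelled with a short Python sketch. It only builds the list of (register, value) writes; it does not touch hardware. The constant and function names are illustrative assumptions, and the size check assumes the transfer starts at the beginning of the 4096-byte buffer.

```python
EDU_BUFFER_OFFSET = 0x40000   # start of the on-device buffer (from the spec)
EDU_BUFFER_SIZE = 4096        # buffer length in bytes (from the spec)

DMA_CMD_START = 0x01          # start transfer
DMA_CMD_TO_RAM = 0x02         # direction: 0 = RAM to EDU, 1 = EDU to RAM
DMA_CMD_IRQ = 0x04            # raise interrupt 0x100 after finishing the DMA


def dma_sequence(ram_addr, count, to_ram, raise_irq=False):
    """Return the ordered register writes for one DMA transfer.

    Hypothetical helper: `ram_addr` is the PCI address on the RAM side,
    `to_ram` selects the EDU-to-RAM direction.
    """
    if not 0 < count <= EDU_BUFFER_SIZE:
        raise ValueError("count must fit in the 4096-byte EDU buffer")
    cmd = DMA_CMD_START
    if to_ram:
        src, dst = EDU_BUFFER_OFFSET, ram_addr
        cmd |= DMA_CMD_TO_RAM
    else:
        src, dst = ram_addr, EDU_BUFFER_OFFSET
    if raise_irq:
        cmd |= DMA_CMD_IRQ
    return [
        ("DMA source address", src),
        ("DMA destination address", dst),
        ("DMA transfer count", count),
        ("DMA command register", cmd),
    ]
```

With ``addr = 0x1000``, ``dma_sequence(addr, 100, to_ram=False)`` ends with a command value of 1 and ``dma_sequence(addr + 100, 100, to_ram=True)`` ends with 3, matching the two examples. After the last write a driver still has to poll bit 0 of the command register (or wait for the interrupt) as the ``while`` loops show.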
|
|
@ -24,3 +24,11 @@ guest hardware that is specific to QEMU.
|
|||
acpi_erst
|
||||
sev-guest-firmware
|
||||
fw_cfg
|
||||
vmw_pvscsi-spec
|
||||
edu
|
||||
ivshmem-spec
|
||||
pvpanic
|
||||
standard-vga
|
||||
virt-ctlr
|
||||
vmcoreinfo
|
||||
vmgenid
|
||||
|
|
|
@ -1,4 +1,6 @@
|
|||
= Device Specification for Inter-VM shared memory device =
|
||||
======================================================
|
||||
Device Specification for Inter-VM shared memory device
|
||||
======================================================
|
||||
|
||||
The Inter-VM shared memory device (ivshmem) is designed to share a
|
||||
memory region between multiple QEMU processes running different guests
|
||||
|
@ -12,42 +14,17 @@ can obtain one from an ivshmem server.
|
|||
In the latter case, the device can additionally interrupt its peers, and
|
||||
get interrupted by its peers.
|
||||
|
||||
For information on configuring the ivshmem device on the QEMU
|
||||
command line, see :doc:`../system/devices/ivshmem`.
|
||||
|
||||
== Configuring the ivshmem PCI device ==
|
||||
|
||||
There are two basic configurations:
|
||||
|
||||
- Just shared memory:
|
||||
|
||||
-device ivshmem-plain,memdev=HMB,...
|
||||
|
||||
This uses host memory backend HMB. It should have option "share"
|
||||
set.
|
||||
|
||||
- Shared memory plus interrupts:
|
||||
|
||||
-device ivshmem-doorbell,chardev=CHR,vectors=N,...
|
||||
|
||||
An ivshmem server must already be running on the host. The device
|
||||
connects to the server's UNIX domain socket via character device
|
||||
CHR.
|
||||
|
||||
Each peer gets assigned a unique ID by the server. IDs must be
|
||||
between 0 and 65535.
|
||||
|
||||
Interrupts are message-signaled (MSI-X). vectors=N configures the
|
||||
number of vectors to use.
|
||||
|
||||
For more details on ivshmem device properties, see the QEMU Emulator
|
||||
user documentation.
|
||||
|
||||
|
||||
== The ivshmem PCI device's guest interface ==
|
||||
The ivshmem PCI device's guest interface
|
||||
========================================
|
||||
|
||||
The device has vendor ID 1af4, device ID 1110, revision 1. Before
|
||||
QEMU 2.6.0, it had revision 0.
|
||||
|
||||
=== PCI BARs ===
|
||||
PCI BARs
|
||||
--------
|
||||
|
||||
The ivshmem PCI device has two or three BARs:
|
||||
|
||||
|
@ -59,8 +36,7 @@ There are two ways to use this device:
|
|||
|
||||
- If you only need the shared memory part, BAR2 suffices. This way,
|
||||
you have access to the shared memory in the guest and can use it as
|
||||
you see fit. Memnic, for example, uses ivshmem this way from guest
|
||||
user space (see http://dpdk.org/browse/memnic).
|
||||
you see fit.
|
||||
|
||||
- If you additionally need the capability for peers to interrupt each
|
||||
other, you need BAR0 and BAR1. You will most likely want to write a
|
||||
|
@ -77,10 +53,13 @@ accessing BAR2.
|
|||
Revision 0 of the device is not capable to tell guest software whether
|
||||
it is configured for interrupts.
|
||||
|
||||
=== PCI device registers ===
|
||||
PCI device registers
|
||||
--------------------
|
||||
|
||||
BAR 0 contains the following registers:
|
||||
|
||||
::
|
||||
|
||||
Offset Size Access On reset Function
|
||||
0 4 read/write 0 Interrupt Mask
|
||||
bit 0: peer interrupt (rev 0)
|
||||
|
@ -145,18 +124,20 @@ With multiple MSI-X vectors, different vectors can be used to indicate
|
|||
different events have occurred. The semantics of interrupt vectors
|
||||
are left to the application.
|
||||
|
||||
|
||||
== Interrupt infrastructure ==
|
||||
Interrupt infrastructure
|
||||
========================
|
||||
|
||||
When configured for interrupts, the peers share eventfd objects in
|
||||
addition to shared memory. The shared resources are managed by an
|
||||
ivshmem server.
|
||||
|
||||
=== The ivshmem server ===
|
||||
The ivshmem server
|
||||
------------------
|
||||
|
||||
The server listens on a UNIX domain socket.
|
||||
|
||||
For each new client that connects to the server, the server
|
||||
|
||||
- picks an ID,
|
||||
- creates eventfd file descriptors for the interrupt vectors,
|
||||
- sends the ID and the file descriptor for the shared memory to the
|
||||
|
@ -189,7 +170,8 @@ vectors.
|
|||
A standalone client is in contrib/ivshmem-client/. It can be useful
|
||||
for debugging.
|
||||
|
||||
=== The ivshmem Client-Server Protocol ===
|
||||
The ivshmem Client-Server Protocol
|
||||
----------------------------------
|
||||
|
||||
An ivshmem device configured for interrupts connects to an ivshmem
|
||||
server. This section details the protocol between the two.
|
||||
|
@ -245,7 +227,8 @@ Known bugs:
|
|||
|
||||
* The protocol is poorly designed.
|
||||
|
||||
=== The ivshmem Client-Client Protocol ===
|
||||
The ivshmem Client-Client Protocol
|
||||
----------------------------------
|
||||
|
||||
An ivshmem device configured for interrupts receives eventfd file
|
||||
descriptors for interrupting peers and getting interrupted by peers
|
|
@ -50,7 +50,7 @@ maintained as part of the virtio specification.
|
|||
by QEMU.
|
||||
|
||||
1af4:1110
|
||||
ivshmem device (shared memory, ``docs/specs/ivshmem-spec.txt``)
|
||||
ivshmem device (:doc:`ivshmem-spec`)
|
||||
|
||||
All other device IDs are reserved.
|
||||
|
||||
|
|
|
@ -21,18 +21,21 @@ recognize. On write, the bits not recognized by the device are ignored.
|
|||
Software should set only bits both itself and the device recognize.
|
||||
|
||||
Bit Definition
|
||||
--------------
|
||||
bit 0: a guest panic has happened and should be processed by the host
|
||||
bit 1: a guest panic has happened and will be handled by the guest;
|
||||
the host should record it or report it, but should not affect
|
||||
the execution of the guest.
|
||||
~~~~~~~~~~~~~~
|
||||
|
||||
bit 0
|
||||
a guest panic has happened and should be processed by the host
|
||||
bit 1
|
||||
a guest panic has happened and will be handled by the guest;
|
||||
the host should record it or report it, but should not affect
|
||||
the execution of the guest.
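A minimal Python sketch of the bit handling described above, assuming illustrative constant names (``GUEST_PANICKED``, ``GUEST_CRASH_LOADED``) that are not taken from any header. It shows the rule that software should set only bits that both it and the device recognize.

```python
GUEST_PANICKED = 1 << 0       # bit 0: panic should be processed by the host
GUEST_CRASH_LOADED = 1 << 1   # bit 1: panic will be handled by the guest itself
KNOWN_BITS = GUEST_PANICKED | GUEST_CRASH_LOADED


def bits_to_write(wanted, device_bits, software_bits=KNOWN_BITS):
    """Keep only the event bits recognized by both the device and the software."""
    return wanted & device_bits & software_bits


def describe(value):
    """Return human-readable names for the event bits set in `value`."""
    names = []
    if value & GUEST_PANICKED:
        names.append("panicked")
    if value & GUEST_CRASH_LOADED:
        names.append("crash-loaded")
    return names
```

Here ``device_bits`` stands for the value read back from the device, which reports the bits it recognizes.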
|
||||
|
||||
PCI Interface
|
||||
-------------
|
||||
|
||||
The PCI interface is similar to the ISA interface except that it uses an MMIO
|
||||
address space provided by its BAR0, 1 byte long. Any machine with a PCI bus
|
||||
can enable a pvpanic device by adding '-device pvpanic-pci' to the command
|
||||
can enable a pvpanic device by adding ``-device pvpanic-pci`` to the command
|
||||
line.
|
||||
|
||||
ACPI Interface
|
||||
|
@ -40,15 +43,25 @@ ACPI Interface
|
|||
|
||||
pvpanic device is defined with ACPI ID "QEMU0001". Custom methods:
|
||||
|
||||
RDPT: To determine whether guest panic notification is supported.
|
||||
Arguments: None
|
||||
Return: Returns a byte, with the same semantics as the I/O port
|
||||
interface.
|
||||
RDPT
|
||||
~~~~
|
||||
|
||||
WRPT: To send a guest panic event
|
||||
Arguments: Arg0 is a byte to be written, with the same semantics as
|
||||
the I/O interface.
|
||||
Return: None
|
||||
To determine whether guest panic notification is supported.
|
||||
|
||||
Arguments
|
||||
None
|
||||
Return
|
||||
Returns a byte, with the same semantics as the I/O port interface.
|
||||
|
||||
WRPT
|
||||
~~~~
|
||||
|
||||
To send a guest panic event.
|
||||
|
||||
Arguments
|
||||
Arg0 is a byte to be written, with the same semantics as the I/O interface.
|
||||
Return
|
||||
None
|
||||
|
||||
The ACPI device will automatically refer to the right port in case it
|
||||
is modified.
|
94
docs/specs/standard-vga.rst
Normal file
|
@ -0,0 +1,94 @@
|
|||
|
||||
QEMU Standard VGA
|
||||
=================
|
||||
|
||||
Exists in two variants, for isa and pci.
|
||||
|
||||
command line switches:
|
||||
|
||||
``-vga std``
|
||||
picks isa for -M isapc, otherwise pci
|
||||
``-device VGA``
|
||||
pci variant
|
||||
``-device isa-vga``
|
||||
isa variant
|
||||
``-device secondary-vga``
|
||||
legacy-free pci variant
|
||||
|
||||
|
||||
PCI spec
|
||||
--------
|
||||
|
||||
Applies to the pci variant only for obvious reasons.
|
||||
|
||||
PCI ID
|
||||
``1234:1111``
|
||||
|
||||
PCI Region 0
|
||||
Framebuffer memory, 16 MB in size (by default).
|
||||
Size is tunable via vga_mem_mb property.
|
||||
|
||||
PCI Region 1
|
||||
Reserved (so we have the option to make the framebuffer bar 64bit).
|
||||
|
||||
PCI Region 2
|
||||
MMIO bar, 4096 bytes in size (QEMU 1.3+)
|
||||
|
||||
PCI ROM Region
|
||||
Holds the vgabios (QEMU 0.14+).
|
||||
|
||||
|
||||
The legacy-free variant has no ROM and has ``PCI_CLASS_DISPLAY_OTHER``
|
||||
instead of ``PCI_CLASS_DISPLAY_VGA``.
|
||||
|
||||
|
||||
IO ports used
|
||||
-------------
|
||||
|
||||
Doesn't apply to the legacy-free pci variant, use the MMIO bar instead.
|
||||
|
||||
``03c0 - 03df``
|
||||
standard vga ports
|
||||
``01ce``
|
||||
bochs vbe interface index port
|
||||
``01cf``
|
||||
bochs vbe interface data port (x86 only)
|
||||
``01d0``
|
||||
bochs vbe interface data port
|
||||
|
||||
|
||||
Memory regions used
|
||||
-------------------
|
||||
|
||||
``0xe0000000``
|
||||
Framebuffer memory, isa variant only.
|
||||
|
||||
The pci variant used to mirror the framebuffer bar here, QEMU 0.14+
|
||||
stops doing that (except when in ``-M pc-$old`` compat mode).
|
||||
|
||||
|
||||
MMIO area spec
|
||||
--------------
|
||||
|
||||
Likewise applies to the pci variant only for obvious reasons.
|
||||
|
||||
``0000 - 03ff``
|
||||
edid data blob.
|
||||
``0400 - 041f``
|
||||
vga ioports (``0x3c0`` to ``0x3df``), remapped 1:1. Word access
|
||||
is supported, bytes are written in little endian order (aka index
|
||||
port first), so indexed registers can be updated with a single
|
||||
mmio write (and thus only one vmexit).
|
||||
``0500 - 0515``
|
||||
bochs dispi interface registers, mapped flat without index/data ports.
|
||||
Use ``(index << 1)`` as offset for (16bit) register access.
|
||||
``0600 - 0607``
|
||||
QEMU extended registers. QEMU 2.2+ only.
|
||||
The pci revision is 2 (or greater) when these registers are present.
|
||||
The registers are 32bit.
|
||||
``0600``
|
||||
QEMU extended register region size, in bytes.
|
||||
``0604``
|
||||
framebuffer endianness register.
|
||||
- ``0xbebebebe`` indicates big endian.
|
||||
- ``0x1e1e1e1e`` indicates little endian.
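The MMIO layout above boils down to a few offset calculations, sketched here in Python. The function and constant names are hypothetical; the ranges and magic values come from the table above.

```python
MMIO_VGA_IOPORTS = 0x400    # vga ioports 0x3c0..0x3df, remapped 1:1
MMIO_DISPI = 0x500          # bochs dispi registers, mapped flat
MMIO_DISPI_LAST = 0x515     # last byte of the dispi range
MMIO_QEXT_SIZE = 0x600      # QEMU extended register region size
MMIO_QEXT_ENDIAN = 0x604    # framebuffer endianness register


def vga_port_offset(port):
    """MMIO offset of a legacy VGA I/O port in the 0x3c0..0x3df range."""
    if not 0x3C0 <= port <= 0x3DF:
        raise ValueError("not a standard vga port")
    return MMIO_VGA_IOPORTS + (port - 0x3C0)


def dispi_offset(index):
    """MMIO offset of a 16-bit bochs dispi register: base + (index << 1)."""
    offset = MMIO_DISPI + (index << 1)
    if index < 0 or offset + 1 > MMIO_DISPI_LAST:
        raise ValueError("dispi index outside the mapped range")
    return offset


def decode_endianness(value):
    """Interpret a value read from the framebuffer endianness register."""
    if value == 0xBEBEBEBE:
        return "big"
    if value == 0x1E1E1E1E:
        return "little"
    return None  # unrecognized value
```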
|
|
@ -1,81 +0,0 @@
|
|||
|
||||
QEMU Standard VGA
|
||||
=================
|
||||
|
||||
Exists in two variants, for isa and pci.
|
||||
|
||||
command line switches:
|
||||
-vga std [ picks isa for -M isapc, otherwise pci ]
|
||||
-device VGA [ pci variant ]
|
||||
-device isa-vga [ isa variant ]
|
||||
-device secondary-vga [ legacy-free pci variant ]
|
||||
|
||||
|
||||
PCI spec
|
||||
--------
|
||||
|
||||
Applies to the pci variant only for obvious reasons.
|
||||
|
||||
PCI ID: 1234:1111
|
||||
|
||||
PCI Region 0:
|
||||
Framebuffer memory, 16 MB in size (by default).
|
||||
Size is tunable via vga_mem_mb property.
|
||||
|
||||
PCI Region 1:
|
||||
Reserved (so we have the option to make the framebuffer bar 64bit).
|
||||
|
||||
PCI Region 2:
|
||||
MMIO bar, 4096 bytes in size (qemu 1.3+)
|
||||
|
||||
PCI ROM Region:
|
||||
Holds the vgabios (qemu 0.14+).
|
||||
|
||||
|
||||
The legacy-free variant has no ROM and has PCI_CLASS_DISPLAY_OTHER
|
||||
instead of PCI_CLASS_DISPLAY_VGA.
|
||||
|
||||
|
||||
IO ports used
|
||||
-------------
|
||||
|
||||
Doesn't apply to the legacy-free pci variant, use the MMIO bar instead.
|
||||
|
||||
03c0 - 03df : standard vga ports
|
||||
01ce : bochs vbe interface index port
|
||||
01cf : bochs vbe interface data port (x86 only)
|
||||
01d0 : bochs vbe interface data port
|
||||
|
||||
|
||||
Memory regions used
|
||||
-------------------
|
||||
|
||||
0xe0000000 : Framebuffer memory, isa variant only.
|
||||
|
||||
The pci variant used to mirror the framebuffer bar here, qemu 0.14+
|
||||
stops doing that (except when in -M pc-$old compat mode).
|
||||
|
||||
|
||||
MMIO area spec
|
||||
--------------
|
||||
|
||||
Likewise applies to the pci variant only for obvious reasons.
|
||||
|
||||
0000 - 03ff : edid data blob.
|
||||
0400 - 041f : vga ioports (0x3c0 -> 0x3df), remapped 1:1.
|
||||
word access is supported, bytes are written
|
||||
in little endia order (aka index port first),
|
||||
so indexed registers can be updated with a
|
||||
single mmio write (and thus only one vmexit).
|
||||
0500 - 0515 : bochs dispi interface registers, mapped flat
|
||||
without index/data ports. Use (index << 1)
|
||||
as offset for (16bit) register access.
|
||||
|
||||
0600 - 0607 : qemu extended registers. qemu 2.2+ only.
|
||||
The pci revision is 2 (or greater) when
|
||||
these registers are present. The registers
|
||||
are 32bit.
|
||||
0600 : qemu extended register region size, in bytes.
|
||||
0604 : framebuffer endianness register.
|
||||
- 0xbebebebe indicates big endian.
|
||||
- 0x1e1e1e1e indicates little endian.
|
|
@ -1,9 +1,9 @@
|
|||
Virtual System Controller
|
||||
=========================
|
||||
|
||||
This device is a simple interface defined for the pure virtual machine with no
|
||||
hardware reference implementation to allow the guest kernel to send command
|
||||
to the host hypervisor.
|
||||
The ``virt-ctrl`` device is a simple interface defined for the pure
|
||||
virtual machine with no hardware reference implementation to allow the
|
||||
guest kernel to send commands to the host hypervisor.
|
||||
|
||||
The specification can evolve, the current state is defined as below.
|
||||
|
||||
|
@ -11,14 +11,12 @@ This is a MMIO mapped device using 256 bytes.
|
|||
|
||||
Two 32bit registers are defined:
|
||||
|
||||
1- the features register (read-only, address 0x00)
|
||||
|
||||
the features register (read-only, address 0x00)
|
||||
This register allows the device to report features supported by the
|
||||
controller.
|
||||
The only feature supported for the moment is power control (0x01).
|
||||
|
||||
2- the command register (write-only, address 0x04)
|
||||
|
||||
the command register (write-only, address 0x04)
|
||||
This register allows the kernel to send the commands to the hypervisor.
|
||||
The implemented commands are part of the power control feature and
|
||||
are reset (1), halt (2) and panic (3).
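The two registers can be summarized in a small Python sketch. The names are illustrative, and treating the power control value (0x01) as a bit flag in the features register is an assumption rather than something the text states.

```python
REG_FEATURES = 0x00      # read-only features register
REG_COMMAND = 0x04       # write-only command register

FEAT_POWER_CTRL = 0x01   # power control feature (assumed to be a bit flag)

CMD_RESET = 1
CMD_HALT = 2
CMD_PANIC = 3


def command_write(features, cmd):
    """Return the (offset, value) pair to write for a power control command.

    `features` is the value previously read from REG_FEATURES.
    """
    if not features & FEAT_POWER_CTRL:
        raise RuntimeError("power control is not reported by the device")
    if cmd not in (CMD_RESET, CMD_HALT, CMD_PANIC):
        raise ValueError("unknown command")
    return (REG_COMMAND, cmd)
```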
|
54
docs/specs/vmcoreinfo.rst
Normal file
|
@ -0,0 +1,54 @@
|
|||
=================
|
||||
VMCoreInfo device
|
||||
=================
|
||||
|
||||
The ``-device vmcoreinfo`` will create a ``fw_cfg`` entry for a guest to
|
||||
store dump details.
|
||||
|
||||
``etc/vmcoreinfo``
|
||||
==================
|
||||
|
||||
A guest may use this ``fw_cfg`` entry to add information details to QEMU
|
||||
dumps.
|
||||
|
||||
The entry of 16 bytes has the following layout, in little-endian::
|
||||
|
||||
#define VMCOREINFO_FORMAT_NONE 0x0
|
||||
#define VMCOREINFO_FORMAT_ELF 0x1
|
||||
|
||||
struct FWCfgVMCoreInfo {
|
||||
uint16_t host_format; /* formats host supports */
|
||||
uint16_t guest_format; /* format guest supplies */
|
||||
uint32_t size; /* size of vmcoreinfo region */
|
||||
uint64_t paddr; /* physical address of vmcoreinfo region */
|
||||
};
|
||||
|
||||
Only full writes (of 16 bytes) are considered valid for further
|
||||
processing of entry values.
|
||||
|
||||
A write of 0 in ``guest_format`` will disable further processing of
|
||||
vmcoreinfo entry values & content.
|
||||
|
||||
You may write a ``guest_format`` that is not supported by the host, in
|
||||
which case the entry data can be ignored by QEMU (but you may still
|
||||
access it through a debugger, via ``vmcoreinfo_realize::vmcoreinfo_state``).
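To make the 16-byte little-endian layout concrete, here is a small Python sketch that packs and unpacks the entry with the standard ``struct`` module. The helper names ``pack_entry`` and ``unpack_entry`` are hypothetical; only the field order and sizes come from the structure above.

```python
import struct

VMCOREINFO_FORMAT_NONE = 0x0
VMCOREINFO_FORMAT_ELF = 0x1

# "<" = little-endian without padding: uint16, uint16, uint32, uint64
_LAYOUT = struct.Struct("<HHIQ")


def pack_entry(host_format, guest_format, size, paddr):
    """Serialize the four fields into the 16-byte entry."""
    return _LAYOUT.pack(host_format, guest_format, size, paddr)


def unpack_entry(data):
    """Parse a 16-byte entry; anything shorter or longer is rejected."""
    if len(data) != _LAYOUT.size:
        raise ValueError("only full 16-byte entries are valid")
    return _LAYOUT.unpack(data)
```

The length check in ``unpack_entry`` mirrors the rule that only full 16-byte writes are processed further.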
|
||||
|
||||
Format & content
|
||||
================
|
||||
|
||||
As of QEMU 2.11, only ``VMCOREINFO_FORMAT_ELF`` is supported.
|
||||
|
||||
The entry gives location and size of an ELF note that is appended in
|
||||
qemu dumps.
|
||||
|
||||
The note format/class must be of the target bitness and the size must
|
||||
be less than 1Mb.
|
||||
|
||||
If the ELF note name is ``VMCOREINFO``, it is expected to be the Linux
|
||||
vmcoreinfo note (see `the kernel documentation for its format
|
||||
<https://www.kernel.org/doc/Documentation/ABI/testing/sysfs-kernel-vmcoreinfo>`_).
|
||||
In this case, qemu dump code will read the content
|
||||
as a key=value text file, looking for ``NUMBER(phys_base)`` key
|
||||
value. The value is expected to be more accurate than architecture
|
||||
guess of the value. This is useful for KASLR-enabled guest with
|
||||
ancient tools not handling the ``VMCOREINFO`` note.
|
|
@ -1,53 +0,0 @@
|
|||
=================
|
||||
VMCoreInfo device
|
||||
=================
|
||||
|
||||
The `-device vmcoreinfo` will create a fw_cfg entry for a guest to
|
||||
store dump details.
|
||||
|
||||
etc/vmcoreinfo
|
||||
**************
|
||||
|
||||
A guest may use this fw_cfg entry to add information details to qemu
|
||||
dumps.
|
||||
|
||||
The entry of 16 bytes has the following layout, in little-endian::
|
||||
|
||||
#define VMCOREINFO_FORMAT_NONE 0x0
|
||||
#define VMCOREINFO_FORMAT_ELF 0x1
|
||||
|
||||
struct FWCfgVMCoreInfo {
|
||||
uint16_t host_format; /* formats host supports */
|
||||
uint16_t guest_format; /* format guest supplies */
|
||||
uint32_t size; /* size of vmcoreinfo region */
|
||||
uint64_t paddr; /* physical address of vmcoreinfo region */
|
||||
};
|
||||
|
||||
Only full write (of 16 bytes) are considered valid for further
|
||||
processing of entry values.
|
||||
|
||||
A write of 0 in guest_format will disable further processing of
|
||||
vmcoreinfo entry values & content.
|
||||
|
||||
You may write a guest_format that is not supported by the host, in
|
||||
which case the entry data can be ignored by qemu (but you may still
|
||||
access it through a debugger, via vmcoreinfo_realize::vmcoreinfo_state).
|
||||
|
||||
Format & content
|
||||
****************
|
||||
|
||||
As of qemu 2.11, only VMCOREINFO_FORMAT_ELF is supported.
|
||||
|
||||
The entry gives location and size of an ELF note that is appended in
|
||||
qemu dumps.
|
||||
|
||||
The note format/class must be of the target bitness and the size must
|
||||
be less than 1Mb.
|
||||
|
||||
If the ELF note name is "VMCOREINFO", it is expected to be the Linux
|
||||
vmcoreinfo note (see Documentation/ABI/testing/sysfs-kernel-vmcoreinfo
|
||||
in Linux source). In this case, qemu dump code will read the content
|
||||
as a key=value text file, looking for "NUMBER(phys_base)" key
|
||||
value. The value is expected to be more accurate than architecture
|
||||
guess of the value. This is useful for KASLR-enabled guest with
|
||||
ancient tools not handling the VMCOREINFO note.
|
246
docs/specs/vmgenid.rst
Normal file
|
@ -0,0 +1,246 @@
|
|||
Virtual Machine Generation ID Device
|
||||
====================================
|
||||
|
||||
..
|
||||
Copyright (C) 2016 Red Hat, Inc.
|
||||
Copyright (C) 2017 Skyport Systems, Inc.
|
||||
|
||||
This work is licensed under the terms of the GNU GPL, version 2 or later.
|
||||
See the COPYING file in the top-level directory.
|
||||
|
||||
The VM generation ID (``vmgenid``) device is an emulated device which
|
||||
exposes a 128-bit, cryptographically random, integer value identifier,
|
||||
referred to as a Globally Unique Identifier, or GUID.
|
||||
|
||||
This allows management applications (e.g. libvirt) to notify the guest
|
||||
operating system when the virtual machine is executed with a different
|
||||
configuration (e.g. snapshot execution or creation from a template). The
|
||||
guest operating system notices the change, and is then able to react as
|
||||
appropriate by marking its copies of distributed databases as dirty,
|
||||
re-initializing its random number generator etc.
|
||||
|
||||
|
||||
Requirements
|
||||
------------
|
||||
|
||||
These requirements are extracted from the "How to implement virtual machine
|
||||
generation ID support in a virtualization platform" section of
|
||||
`the Microsoft Virtual Machine Generation ID specification
|
||||
<http://go.microsoft.com/fwlink/?LinkId=260709>`_ dated August 1, 2012.
|
||||
|
||||
- **R1a** The generation ID shall live in an 8-byte aligned buffer.
|
||||
|
||||
- **R1b** The buffer holding the generation ID shall be in guest RAM,
|
||||
ROM, or device MMIO range.
|
||||
|
||||
- **R1c** The buffer holding the generation ID shall be kept separate from
|
||||
areas used by the operating system.
|
||||
|
||||
- **R1d** The buffer shall not be covered by an AddressRangeMemory or
|
||||
AddressRangeACPI entry in the E820 or UEFI memory map.
|
||||
|
||||
- **R1e** The generation ID shall not live in a page frame that could be
|
||||
mapped with caching disabled. (In other words, regardless of whether the
|
||||
generation ID lives in RAM, ROM or MMIO, it shall only be mapped as
|
||||
cacheable.)
|
||||
|
||||
- **R2** to **R5** [These AML requirements are isolated well enough in the
|
||||
Microsoft specification for us to simply refer to them here.]
|
||||
|
||||
- **R6** The hypervisor shall expose a _HID (hardware identifier) object
|
||||
in the VMGenId device's scope that is unique to the hypervisor vendor.
|
||||
|
||||
|
||||
QEMU Implementation
|
||||
-------------------
|
||||
|
||||
The above-mentioned specification does not dictate which ACPI descriptor table
|
||||
will contain the VM Generation ID device. Other implementations (Hyper-V and
|
||||
Xen) put it in the main descriptor table (Differentiated System Description
|
||||
Table or DSDT). For ease of debugging and implementation, we have decided to
|
||||
put it in its own Secondary System Description Table, or SSDT.
|
||||
|
||||
The following is a dump of the contents from a running system::
|
||||
|
||||
# iasl -p ./SSDT -d /sys/firmware/acpi/tables/SSDT
|
||||
|
||||
Intel ACPI Component Architecture
|
||||
ASL+ Optimizing Compiler version 20150717-64
|
||||
Copyright (c) 2000 - 2015 Intel Corporation
|
||||
|
||||
Reading ACPI table from file /sys/firmware/acpi/tables/SSDT - Length
|
||||
00000198 (0x0000C6)
|
||||
ACPI: SSDT 0x0000000000000000 0000C6 (v01 BOCHS VMGENID 00000001 BXPC 00000001)
|
||||
Acpi table [SSDT] successfully installed and loaded
|
||||
Pass 1 parse of [SSDT]
|
||||
Pass 2 parse of [SSDT]
|
||||
Parsing Deferred Opcodes (Methods/Buffers/Packages/Regions)
|
||||
|
||||
Parsing completed
|
||||
Disassembly completed
|
||||
ASL Output: ./SSDT.dsl - 1631 bytes
|
||||
# cat SSDT.dsl
|
||||
/*
|
||||
* Intel ACPI Component Architecture
|
||||
* AML/ASL+ Disassembler version 20150717-64
|
||||
* Copyright (c) 2000 - 2015 Intel Corporation
|
||||
*
|
||||
* Disassembling to symbolic ASL+ operators
|
||||
*
|
||||
* Disassembly of /sys/firmware/acpi/tables/SSDT, Sun Feb 5 00:19:37 2017
|
||||
*
|
||||
* Original Table Header:
|
||||
* Signature "SSDT"
|
||||
* Length 0x000000CA (202)
|
||||
* Revision 0x01
|
||||
* Checksum 0x4B
|
||||
* OEM ID "BOCHS "
|
||||
* OEM Table ID "VMGENID"
|
||||
* OEM Revision 0x00000001 (1)
|
||||
* Compiler ID "BXPC"
|
||||
* Compiler Version 0x00000001 (1)
|
||||
*/
|
||||
DefinitionBlock ("/sys/firmware/acpi/tables/SSDT.aml", "SSDT", 1, "BOCHS ", "VMGENID", 0x00000001)
{
    Name (VGIA, 0x07FFF000)
    Scope (\_SB)
    {
        Device (VGEN)
        {
            Name (_HID, "QEMUVGID")  // _HID: Hardware ID
            Name (_CID, "VM_Gen_Counter")  // _CID: Compatible ID
            Name (_DDN, "VM_Gen_Counter")  // _DDN: DOS Device Name
            Method (_STA, 0, NotSerialized)  // _STA: Status
            {
                Local0 = 0x0F
                If ((VGIA == Zero))
                {
                    Local0 = Zero
                }

                Return (Local0)
            }

            Method (ADDR, 0, NotSerialized)
            {
                Local0 = Package (0x02) {}
                Index (Local0, Zero) = (VGIA + 0x28)
                Index (Local0, One) = Zero
                Return (Local0)
            }
        }
    }

    Method (\_GPE._E05, 0, NotSerialized)  // _Exx: Edge-Triggered GPE
    {
        Notify (\_SB.VGEN, 0x80) // Status Change
    }
}
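
As a reading aid, the ``_STA`` and ``ADDR`` methods in the dump can be modelled in a few lines of Python. This is only an illustrative sketch, not QEMU code: the function names ``sta`` and ``addr`` are made up for the example, and reading the two package elements as the low and high halves of the buffer address is our interpretation of the dump.

```python
from typing import List

# Offset that the ADDR method adds to VGIA in the dump (0x28 == 40).
GUID_OFFSET = 0x28


def sta(vgia: int) -> int:
    """Mirror _STA: report 0x0F once VGIA has been set, 0 otherwise."""
    return 0x0F if vgia != 0 else 0


def addr(vgia: int) -> List[int]:
    """Mirror ADDR: a two-element package, (VGIA + 0x28) followed by Zero."""
    return [vgia + GUID_OFFSET, 0]


if __name__ == "__main__":
    # VGIA value taken from the dump above.
    print([hex(value) for value in addr(0x07FFF000)])
    print(hex(sta(0x07FFF000)), hex(sta(0)))
```

With the ``VGIA`` value from the dump, the first element is ``VGIA + 0x28``, i.e. the location of the GUID inside the firmware-allocated page.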
|
||||
|
||||
|
||||
Design Details:
|
||||
---------------
|
||||
|
||||
Requirements R1a through R1e dictate that the memory holding the
|
||||
VM Generation ID must be allocated and owned by the guest firmware,
|
||||
in this case BIOS or UEFI. However, to be useful, QEMU must be able to
|
||||
change the contents of the memory at runtime, specifically when starting a
|
||||
backed-up or snapshotted image. In order to do this, QEMU must know the
|
||||
address that has been allocated.
|
||||
|
||||
The mechanism chosen for this memory sharing is writable fw_cfg blobs.
|
||||
These are data objects that are visible to both QEMU and guests, and are
|
||||
addressable as sequential files.
|
||||
|
||||
More information about fw_cfg can be found in :doc:`fw_cfg`.
|
||||
|
||||
Two fw_cfg blobs are used in this case:
|
||||
|
||||
``/etc/vmgenid_guid``
|
||||
|
||||
- contains the actual VM Generation ID GUID
|
||||
- read-only to the guest
|
||||
|
||||
``/etc/vmgenid_addr``
|
||||
|
||||
- contains the address of the downloaded vmgenid blob
|
||||
- writable by the guest
|
||||
|
||||
|
||||
QEMU sends the following commands to the guest at startup:
|
||||
|
||||
1. Allocate memory for vmgenid_guid fw_cfg blob.
|
||||
2. Write the address of vmgenid_guid into the SSDT (VGIA ACPI variable as
|
||||
shown above in the iasl dump). Note that this change is not propagated
|
||||
back to QEMU.
|
||||
3. Write the address of vmgenid_guid back to QEMU's copy of vmgenid_addr
|
||||
via the fw_cfg DMA interface.
|
||||
|
||||
After step 3, QEMU is able to update the contents of vmgenid_guid at will.
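
The three steps can be pictured with a toy model. The sketch below is purely illustrative: ``ToyFirmware`` and ``ToyQemu`` are hypothetical classes invented for this example, the 4096-byte page size is an assumption, and nothing here corresponds to actual QEMU or firmware code.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch


class ToyFirmware:
    """Stand-in for the BIOS/UEFI side that carries out the commands."""

    def __init__(self, next_free: int = 0x07FFF000) -> None:
        self.next_free = next_free  # next free guest address (made up)
        self.vgia = 0               # VGIA value patched into the SSDT

    def allocate(self, size: int) -> int:
        """Hand out a guest address for a blob of the given size."""
        address = self.next_free
        self.next_free += size
        return address


class ToyQemu:
    """Stand-in for the QEMU side holding its copy of vmgenid_addr."""

    def __init__(self) -> None:
        self.vmgenid_addr = 0

    def boot(self, firmware: ToyFirmware) -> None:
        blob_address = firmware.allocate(PAGE_SIZE)  # step 1: allocate
        firmware.vgia = blob_address                 # step 2: patch VGIA
        self.vmgenid_addr = blob_address             # step 3: write back


if __name__ == "__main__":
    qemu, firmware = ToyQemu(), ToyFirmware()
    qemu.boot(firmware)
    print(hex(firmware.vgia), hex(qemu.vmgenid_addr))
```

The point of the model is that step 2 alone leaves QEMU unaware of the address; only the write-back in step 3 gives QEMU the value it needs to update the GUID later.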
|
||||
|
||||
Since BIOS or UEFI does not necessarily run when we wish to change the GUID,
|
||||
the value of VGIA is persisted via the VMState mechanism.
|
||||
|
||||
As spelled out in the specification, any change to the GUID executes an
|
||||
ACPI notification. The exact handler to use is not specified, so the vmgenid
|
||||
device uses the first unused one: ``\_GPE._E05``.
|
||||
|
||||
|
||||
Endian-ness Considerations:
|
||||
---------------------------
|
||||
|
||||
Although not specified in Microsoft's document, the device is assumed to
use little-endian format.
|
||||
|
||||
All GUIDs passed in via the command line or monitor are treated as big-endian.
|
||||
GUID values displayed via monitor are shown in big-endian format.
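
The difference between the two byte orders can be illustrated with Python's standard ``uuid`` module. This sketch assumes that "little-endian" refers to the conventional GUID layout in which the first three fields are byte-swapped (what Python calls ``bytes_le``); that reading is an assumption, and the helper names are made up for the example.

```python
import uuid


def guid_display_bytes(text: str) -> bytes:
    """Bytes in the order they appear in the textual (big-endian) form."""
    return uuid.UUID(text).bytes


def guid_guest_bytes(text: str) -> bytes:
    """Bytes with the first three fields swapped to little-endian.

    Assumption: this is the layout the guest sees in memory.
    """
    return uuid.UUID(text).bytes_le


if __name__ == "__main__":
    example = "324e6eaf-d1d1-4bf6-bf41-b9bb6c91fb87"
    print(guid_display_bytes(example).hex())
    print(guid_guest_bytes(example).hex())
```

Only the first eight bytes differ between the two forms; the trailing eight bytes are identical in both.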
|
||||
|
||||
|
||||
GUID Storage Format:
|
||||
--------------------
|
||||
|
||||
In order to implement an OVMF "SDT Header Probe Suppressor", the contents of
|
||||
the vmgenid_guid fw_cfg blob are not simply a 128-bit GUID. There is also
|
||||
significant padding in order to align and fill a memory page, as shown in the
|
||||
following diagram::
|
||||
|
||||
    +----------------------------------+
    | SSDT with OEM Table ID = VMGENID |
    +----------------------------------+
    | ...                              |       TOP OF PAGE
    | VGIA dword object ---------------|-----> +---------------------------+
    | ...                              |       | fw-allocated array for    |
    | _STA method referring to VGIA    |       | "etc/vmgenid_guid"        |
    | ...                              |       +---------------------------+
    | ADDR method referring to VGIA    |       |  0: OVMF SDT Header probe |
    | ...                              |       |     suppressor            |
    +----------------------------------+       | 36: padding for 8-byte    |
                                               |     alignment             |
                                               | 40: GUID                  |
                                               | 56: padding to page size  |
                                               +---------------------------+
                                               END OF PAGE
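
The offsets in the diagram can be checked with a short sketch that lays out one page. It is a model for illustration only: the 4096-byte page size and the zero fill for the suppressor and padding areas are assumptions, and ``build_vmgenid_page`` is a hypothetical helper, not a QEMU function.

```python
PAGE_SIZE = 4096   # assumed page size
GUID_OFFSET = 40   # from the diagram: the GUID starts at byte 40
GUID_SIZE = 16     # a 128-bit GUID


def build_vmgenid_page(guid_bytes: bytes) -> bytes:
    """Return one page with the GUID at offset 40 and zeros elsewhere."""
    if len(guid_bytes) != GUID_SIZE:
        raise ValueError("a GUID is exactly 16 bytes")
    page = bytearray(PAGE_SIZE)  # zero fill stands in for all padding
    page[GUID_OFFSET:GUID_OFFSET + GUID_SIZE] = guid_bytes
    return bytes(page)


if __name__ == "__main__":
    page = build_vmgenid_page(bytes(range(16)))
    print(len(page), page[GUID_OFFSET:GUID_OFFSET + GUID_SIZE].hex())
```

Offset 40 is a multiple of 8, which is what lets a page-aligned blob satisfy requirement R1a, and 40 + 16 = 56 is where the diagram's trailing padding begins.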
|
||||
|
||||
|
||||
Device Usage:
|
||||
-------------
|
||||
|
||||
The device has one property, which may only be set using the command line:
|
||||
|
||||
``guid``
|
||||
sets the value of the GUID. A special value ``auto`` instructs
|
||||
QEMU to generate a new random GUID.
|
||||
|
||||
For example::
|
||||
|
||||
QEMU -device vmgenid,guid="324e6eaf-d1d1-4bf6-bf41-b9bb6c91fb87"
|
||||
QEMU -device vmgenid,guid=auto
|
||||
|
||||
The property may be queried via QMP/HMP::
|
||||
|
||||
(QEMU) query-vm-generation-id
|
||||
{"return": {"guid": "324e6eaf-d1d1-4bf6-bf41-b9bb6c91fb87"}}
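
A management script could issue the same query programmatically. The sketch below only builds the command and parses a reply of the shape shown above; it deliberately leaves out the socket handling and the ``qmp_capabilities`` handshake that a real QMP session needs, and the helper names are made up for the example.

```python
import json


def build_query() -> str:
    """Serialize the QMP command that asks for the generation ID."""
    return json.dumps({"execute": "query-vm-generation-id"})


def parse_reply(reply_text: str) -> str:
    """Extract the GUID string from a reply shaped like the one above."""
    reply = json.loads(reply_text)
    return reply["return"]["guid"]


if __name__ == "__main__":
    sample = '{"return": {"guid": "324e6eaf-d1d1-4bf6-bf41-b9bb6c91fb87"}}'
    print(build_query())
    print(parse_reply(sample))
```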
|
||||
|
||||
Setting of this parameter is intentionally left out from the QMP/HMP
|
||||
interfaces. There are no known use cases for changing the GUID once QEMU is
|
||||
running, and adding this capability would greatly increase the complexity.
|
|
@ -1,245 +0,0 @@
|
|||
VIRTUAL MACHINE GENERATION ID
|
||||
=============================
|
||||
|
||||
Copyright (C) 2016 Red Hat, Inc.
|
||||
Copyright (C) 2017 Skyport Systems, Inc.
|
||||
|
||||
This work is licensed under the terms of the GNU GPL, version 2 or later.
|
||||
See the COPYING file in the top-level directory.
|
||||
|
||||
===
|
||||
|
||||
The VM generation ID (vmgenid) device is an emulated device which
|
||||
exposes a 128-bit, cryptographically random, integer value identifier,
|
||||
referred to as a Globally Unique Identifier, or GUID.
|
||||
|
||||
This allows management applications (e.g. libvirt) to notify the guest
|
||||
operating system when the virtual machine is executed with a different
|
||||
configuration (e.g. snapshot execution or creation from a template). The
|
||||
guest operating system notices the change, and is then able to react as
|
||||
appropriate by marking its copies of distributed databases as dirty,
|
||||
re-initializing its random number generator etc.
|
||||
|
||||
|
||||
Requirements
|
||||
------------
|
||||
|
||||
These requirements are extracted from the "How to implement virtual machine
|
||||
generation ID support in a virtualization platform" section of the
|
||||
specification, dated August 1, 2012.
|
||||
|
||||
|
||||
The document may be found on the web at:
|
||||
http://go.microsoft.com/fwlink/?LinkId=260709
|
||||
|
||||
R1a. The generation ID shall live in an 8-byte aligned buffer.
|
||||
|
||||
R1b. The buffer holding the generation ID shall be in guest RAM, ROM, or device
|
||||
MMIO range.
|
||||
|
||||
R1c. The buffer holding the generation ID shall be kept separate from areas
|
||||
used by the operating system.
|
||||
|
||||
R1d. The buffer shall not be covered by an AddressRangeMemory or
|
||||
AddressRangeACPI entry in the E820 or UEFI memory map.
|
||||
|
||||
R1e. The generation ID shall not live in a page frame that could be mapped with
|
||||
caching disabled. (In other words, regardless of whether the generation ID
|
||||
lives in RAM, ROM or MMIO, it shall only be mapped as cacheable.)
|
||||
|
||||
R2 to R5. [These AML requirements are isolated well enough in the Microsoft
|
||||
specification for us to simply refer to them here.]
|
||||
|
||||
R6. The hypervisor shall expose a _HID (hardware identifier) object in the
|
||||
VMGenId device's scope that is unique to the hypervisor vendor.
|
||||
|
||||
|
||||
QEMU Implementation
|
||||
-------------------
|
||||
|
||||
The above-mentioned specification does not dictate which ACPI descriptor table
|
||||
will contain the VM Generation ID device. Other implementations (Hyper-V and
|
||||
Xen) put it in the main descriptor table (Differentiated System Description
|
||||
Table or DSDT). For ease of debugging and implementation, we have decided to
|
||||
put it in its own Secondary System Description Table, or SSDT.
|
||||
|
||||
The following is a dump of the contents from a running system:
|
||||
|
||||
# iasl -p ./SSDT -d /sys/firmware/acpi/tables/SSDT
|
||||
|
||||
Intel ACPI Component Architecture
|
||||
ASL+ Optimizing Compiler version 20150717-64
|
||||
Copyright (c) 2000 - 2015 Intel Corporation
|
||||
|
||||
Reading ACPI table from file /sys/firmware/acpi/tables/SSDT - Length
|
||||
00000198 (0x0000C6)
|
||||
ACPI: SSDT 0x0000000000000000 0000C6 (v01 BOCHS VMGENID 00000001 BXPC
|
||||
00000001)
|
||||
Acpi table [SSDT] successfully installed and loaded
|
||||
Pass 1 parse of [SSDT]
|
||||
Pass 2 parse of [SSDT]
|
||||
Parsing Deferred Opcodes (Methods/Buffers/Packages/Regions)
|
||||
|
||||
Parsing completed
|
||||
Disassembly completed
|
||||
ASL Output: ./SSDT.dsl - 1631 bytes
|
||||
# cat SSDT.dsl
|
||||
/*
|
||||
* Intel ACPI Component Architecture
|
||||
* AML/ASL+ Disassembler version 20150717-64
|
||||
* Copyright (c) 2000 - 2015 Intel Corporation
|
||||
*
|
||||
* Disassembling to symbolic ASL+ operators
|
||||
*
|
||||
* Disassembly of /sys/firmware/acpi/tables/SSDT, Sun Feb 5 00:19:37 2017
|
||||
*
|
||||
* Original Table Header:
|
||||
* Signature "SSDT"
|
||||
* Length 0x000000CA (202)
|
||||
* Revision 0x01
|
||||
* Checksum 0x4B
|
||||
* OEM ID "BOCHS "
|
||||
* OEM Table ID "VMGENID"
|
||||
* OEM Revision 0x00000001 (1)
|
||||
* Compiler ID "BXPC"
|
||||
* Compiler Version 0x00000001 (1)
|
||||
*/
|
||||
DefinitionBlock ("/sys/firmware/acpi/tables/SSDT.aml", "SSDT", 1, "BOCHS ",
|
||||
"VMGENID", 0x00000001)
|
||||
{
|
||||
Name (VGIA, 0x07FFF000)
|
||||
Scope (\_SB)
|
||||
{
|
||||
Device (VGEN)
|
||||
{
|
||||
Name (_HID, "QEMUVGID") // _HID: Hardware ID
|
||||
Name (_CID, "VM_Gen_Counter") // _CID: Compatible ID
|
||||
Name (_DDN, "VM_Gen_Counter") // _DDN: DOS Device Name
|
||||
Method (_STA, 0, NotSerialized) // _STA: Status
|
||||
{
|
||||
Local0 = 0x0F
|
||||
If ((VGIA == Zero))
|
||||
{
|
||||
Local0 = Zero
|
||||
}
|
||||
|
||||
Return (Local0)
|
||||
}
|
||||
|
||||
Method (ADDR, 0, NotSerialized)
|
||||
{
|
||||
Local0 = Package (0x02) {}
|
||||
Index (Local0, Zero) = (VGIA + 0x28)
|
||||
Index (Local0, One) = Zero
|
||||
Return (Local0)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
Method (\_GPE._E05, 0, NotSerialized) // _Exx: Edge-Triggered GPE
|
||||
{
|
||||
Notify (\_SB.VGEN, 0x80) // Status Change
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
Design Details:
|
||||
---------------
|
||||
|
||||
Requirements R1a through R1e dictate that the memory holding the
|
||||
VM Generation ID must be allocated and owned by the guest firmware,
|
||||
in this case BIOS or UEFI. However, to be useful, QEMU must be able to
|
||||
change the contents of the memory at runtime, specifically when starting a
|
||||
backed-up or snapshotted image. In order to do this, QEMU must know the
|
||||
address that has been allocated.
|
||||
|
||||
The mechanism chosen for this memory sharing is writable fw_cfg blobs.
|
||||
These are data object that are visible to both QEMU and guests, and are
|
||||
addressable as sequential files.
|
||||
|
||||
More information about fw_cfg can be found in "docs/specs/fw_cfg.txt"
|
||||
|
||||
Two fw_cfg blobs are used in this case:
|
||||
|
||||
/etc/vmgenid_guid - contains the actual VM Generation ID GUID
|
||||
- read-only to the guest
|
||||
/etc/vmgenid_addr - contains the address of the downloaded vmgenid blob
|
||||
- writable by the guest
|
||||
|
||||
|
||||
QEMU sends the following commands to the guest at startup:
|
||||
|
||||
1. Allocate memory for vmgenid_guid fw_cfg blob.
|
||||
2. Write the address of vmgenid_guid into the SSDT (VGIA ACPI variable as
|
||||
shown above in the iasl dump). Note that this change is not propagated
|
||||
back to QEMU.
|
||||
3. Write the address of vmgenid_guid back to QEMU's copy of vmgenid_addr
|
||||
via the fw_cfg DMA interface.
|
||||
|
||||
After step 3, QEMU is able to update the contents of vmgenid_guid at will.
|
||||
|
||||
Since BIOS or UEFI does not necessarily run when we wish to change the GUID,
|
||||
the value of VGIA is persisted via the VMState mechanism.
|
||||
|
||||
As spelled out in the specification, any change to the GUID executes an
|
||||
ACPI notification. The exact handler to use is not specified, so the vmgenid
|
||||
device uses the first unused one: \_GPE._E05.
|
||||
|
||||
|
||||
Endian-ness Considerations:
|
||||
---------------------------
|
||||
|
||||
Although not specified in Microsoft's document, it is assumed that the
|
||||
device is expected to use little-endian format.
|
||||
|
||||
All GUID passed in via command line or monitor are treated as big-endian.
|
||||
GUID values displayed via monitor are shown in big-endian format.
|
||||
|
||||
|
||||
GUID Storage Format:
|
||||
--------------------
|
||||
|
||||
In order to implement an OVMF "SDT Header Probe Suppressor", the contents of
|
||||
the vmgenid_guid fw_cfg blob are not simply a 128-bit GUID. There is also
|
||||
significant padding in order to align and fill a memory page, as shown in the
|
||||
following diagram:
|
||||
|
||||
+----------------------------------+
|
||||
| SSDT with OEM Table ID = VMGENID |
|
||||
+----------------------------------+
|
||||
| ... | TOP OF PAGE
|
||||
| VGIA dword object ---------------|-----> +---------------------------+
|
||||
| ... | | fw-allocated array for |
|
||||
| _STA method referring to VGIA | | "etc/vmgenid_guid" |
|
||||
| ... | +---------------------------+
|
||||
| ADDR method referring to VGIA | | 0: OVMF SDT Header probe |
|
||||
| ... | | suppressor |
|
||||
+----------------------------------+ | 36: padding for 8-byte |
|
||||
| alignment |
|
||||
| 40: GUID |
|
||||
| 56: padding to page size |
|
||||
+---------------------------+
|
||||
END OF PAGE
|
||||
|
||||
|
||||
Device Usage:
|
||||
-------------
|
||||
|
||||
The device has one property, which may be only be set using the command line:
|
||||
|
||||
guid - sets the value of the GUID. A special value "auto" instructs
|
||||
QEMU to generate a new random GUID.
|
||||
|
||||
For example:
|
||||
|
||||
QEMU -device vmgenid,guid="324e6eaf-d1d1-4bf6-bf41-b9bb6c91fb87"
|
||||
QEMU -device vmgenid,guid=auto
|
||||
|
||||
The property may be queried via QMP/HMP:
|
||||
|
||||
(QEMU) query-vm-generation-id
|
||||
{"return": {"guid": "324e6eaf-d1d1-4bf6-bf41-b9bb6c91fb87"}}
|
||||
|
||||
Setting of this parameter is intentionally left out from the QMP/HMP
|
||||
interfaces. There are no known use cases for changing the GUID once QEMU is
|
||||
running, and adding this capability would greatly increase the complexity.
|
115
docs/specs/vmw_pvscsi-spec.rst
Normal file
|
@ -0,0 +1,115 @@
|
|||
==============================
|
||||
VMWare PVSCSI Device Interface
|
||||
==============================
|
||||
|
||||
..
|
||||
Created by Dmitry Fleytman (dmitry@daynix.com), Daynix Computing LTD.
|
||||
|
||||
This document describes the VMWare PVSCSI device interface specification,
|
||||
based on the source code of the PVSCSI Linux driver from kernel 3.0.4.
|
||||
|
||||
Overview
|
||||
========
|
||||
|
||||
The interface is based on a memory area shared between hypervisor and VM.
|
||||
The memory area is obtained by the driver as a device IO memory resource of
|
||||
``PVSCSI_MEM_SPACE_SIZE`` length.
|
||||
The shared memory consists of a registers area and a rings area.
|
||||
The registers area is used to raise hypervisor interrupts and issue device
|
||||
commands. The rings area is used to transfer data descriptors and SCSI
|
||||
commands from VM to hypervisor and to transfer messages produced by
|
||||
hypervisor to VM. Data itself is transferred via virtual scatter-gather DMA.
|
||||
|
||||
PVSCSI Device Registers
|
||||
=======================
|
||||
|
||||
The length of the registers area is 1 page
|
||||
(``PVSCSI_MEM_SPACE_COMMAND_NUM_PAGES``). The structure of the
|
||||
registers area is described by the ``PVSCSIRegOffset`` enum. There
|
||||
are registers to issue device commands (with optional short data),
|
||||
issue device interrupts, and control interrupt masking.
|
||||
|
||||
PVSCSI Device Rings
|
||||
===================
|
||||
|
||||
There are three rings in shared memory:
|
||||
|
||||
Request ring (``struct PVSCSIRingReqDesc *req_ring``)
|
||||
ring for OS to device requests
|
||||
|
||||
Completion ring (``struct PVSCSIRingCmpDesc *cmp_ring``)
|
||||
ring for device request completions
|
||||
|
||||
Message ring (``struct PVSCSIRingMsgDesc *msg_ring``)
|
||||
ring for messages from device. This ring is optional and the
|
||||
guest might not configure it.
|
||||
|
||||
There is a control area (``struct PVSCSIRingsState *rings_state``)
|
||||
used to control rings operation.
|
||||
|
||||
PVSCSI Device to Host Interrupts
|
||||
================================
|
||||
|
||||
The following interrupt types are supported by the PVSCSI device:
|
||||
|
||||
Completion interrupts (completion ring notifications):
|
||||
|
||||
- ``PVSCSI_INTR_CMPL_0``
|
||||
- ``PVSCSI_INTR_CMPL_1``
|
||||
|
||||
Message interrupts (message ring notifications):
|
||||
|
||||
- ``PVSCSI_INTR_MSG_0``
|
||||
- ``PVSCSI_INTR_MSG_1``
|
||||
|
||||
Interrupts are controlled via the ``PVSCSI_REG_OFFSET_INTR_MASK``
|
||||
register. If a bit is set it means the interrupt is enabled, and if
|
||||
it is clear then the interrupt is disabled.
|
||||
|
||||
The interrupt modes supported are legacy, MSI and MSI-X.
|
||||
In the case of legacy interrupts, the ``PVSCSI_REG_OFFSET_INTR_STATUS``
|
||||
register is used to check which interrupt has arrived. Interrupts are
|
||||
acknowledged when the corresponding bit is written to the interrupt
|
||||
status register.
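
The mask and status semantics described above can be sketched as plain bit operations. The bit positions used here are an assumption for illustration (this document does not list them), and the helper functions are hypothetical; the authoritative values are in the driver headers.

```python
from typing import List

# Assumed bit assignments; check the driver headers for the real values.
PVSCSI_INTR_CMPL_0 = 1 << 0
PVSCSI_INTR_CMPL_1 = 1 << 1
PVSCSI_INTR_MSG_0 = 1 << 2
PVSCSI_INTR_MSG_1 = 1 << 3
PVSCSI_INTR_CMPL_MASK = PVSCSI_INTR_CMPL_0 | PVSCSI_INTR_CMPL_1
PVSCSI_INTR_MSG_MASK = PVSCSI_INTR_MSG_0 | PVSCSI_INTR_MSG_1

_NAMES = [
    (PVSCSI_INTR_CMPL_0, "PVSCSI_INTR_CMPL_0"),
    (PVSCSI_INTR_CMPL_1, "PVSCSI_INTR_CMPL_1"),
    (PVSCSI_INTR_MSG_0, "PVSCSI_INTR_MSG_0"),
    (PVSCSI_INTR_MSG_1, "PVSCSI_INTR_MSG_1"),
]


def enable(mask: int, bits: int) -> int:
    """A set bit in the mask register means the interrupt is enabled."""
    return mask | bits


def disable(mask: int, bits: int) -> int:
    """A clear bit in the mask register means the interrupt is disabled."""
    return mask & ~bits


def acknowledge(status: int, bits: int) -> int:
    """Model writing bits to the status register to acknowledge them."""
    return status & ~bits


def pending_names(status: int) -> List[str]:
    """List the interrupt names whose bits are set in a status value."""
    return [name for bit, name in _NAMES if status & bit]


if __name__ == "__main__":
    mask = enable(0, PVSCSI_INTR_CMPL_MASK)
    print(bin(mask), pending_names(PVSCSI_INTR_CMPL_1 | PVSCSI_INTR_MSG_0))
```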
|
||||
|
||||
PVSCSI Device Operation Sequences
|
||||
=================================
|
||||
|
||||
Startup sequence
|
||||
----------------
|
||||
|
||||
a. Issue ``PVSCSI_CMD_ADAPTER_RESET`` command
|
||||
b. Windows driver reads interrupt status register here
|
||||
c. Issue ``PVSCSI_CMD_SETUP_MSG_RING`` command with no additional data,
|
||||
check status and disable device messages if error returned
|
||||
(Omitted if device messages disabled by driver configuration)
|
||||
d. Issue ``PVSCSI_CMD_SETUP_RINGS`` command, provide rings configuration
|
||||
as ``struct PVSCSICmdDescSetupRings``
|
||||
e. Issue ``PVSCSI_CMD_SETUP_MSG_RING`` command again, provide
|
||||
rings configuration as ``struct PVSCSICmdDescSetupMsgRing``
|
||||
f. Unmask completion and message (if device messages enabled) interrupts
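
The control flow of the startup sequence, in particular the probe-then-configure handling of the message ring, can be sketched as follows. This is a toy model that returns step descriptions rather than touching hardware; ``startup_steps`` and its parameters are hypothetical, step b (the Windows-specific status read) is left out, and the assumption that step e is skipped when messages are disabled is our reading of the list.

```python
from typing import List


def startup_steps(messages_configured: bool, probe_ok: bool = True) -> List[str]:
    """Return the ordered startup steps for a given driver configuration.

    messages_configured: whether the driver configuration allows messages.
    probe_ok: whether the data-less PVSCSI_CMD_SETUP_MSG_RING probe succeeds.
    """
    steps = ["PVSCSI_CMD_ADAPTER_RESET"]
    use_messages = messages_configured
    if use_messages:
        # Probe with no additional data; disable messages on error.
        steps.append("PVSCSI_CMD_SETUP_MSG_RING (probe)")
        use_messages = probe_ok
    steps.append("PVSCSI_CMD_SETUP_RINGS")
    if use_messages:
        steps.append("PVSCSI_CMD_SETUP_MSG_RING")
    steps.append("unmask completion interrupts")
    if use_messages:
        steps.append("unmask message interrupts")
    return steps


if __name__ == "__main__":
    for step in startup_steps(messages_configured=True):
        print(step)
```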
|
||||
|
||||
Shutdown sequence
|
||||
-----------------
|
||||
|
||||
a. Mask interrupts
|
||||
b. Flush request ring using ``PVSCSI_REG_OFFSET_KICK_NON_RW_IO``
|
||||
c. Issue ``PVSCSI_CMD_ADAPTER_RESET`` command
|
||||
|
||||
Send request
|
||||
------------
|
||||
|
||||
a. Fill next free request ring descriptor
|
||||
b. Issue ``PVSCSI_REG_OFFSET_KICK_RW_IO`` for R/W operations
|
||||
or ``PVSCSI_REG_OFFSET_KICK_NON_RW_IO`` for other operations
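
Choosing between the two kick registers comes down to classifying the SCSI command. A minimal sketch, assuming that "R/W operations" means the standard READ and WRITE commands in their 6-, 10-, 12- and 16-byte forms; the opcode values are the standard SCSI ones, but that classification and the helper name are assumptions, not something this document specifies.

```python
# Standard SCSI opcodes for READ/WRITE (6), (10), (12) and (16).
RW_OPCODES = frozenset({
    0x08, 0x0A,  # READ(6),  WRITE(6)
    0x28, 0x2A,  # READ(10), WRITE(10)
    0xA8, 0xAA,  # READ(12), WRITE(12)
    0x88, 0x8A,  # READ(16), WRITE(16)
})


def kick_register(opcode: int) -> str:
    """Name the register to kick after queuing a request with this opcode."""
    if opcode in RW_OPCODES:
        return "PVSCSI_REG_OFFSET_KICK_RW_IO"
    return "PVSCSI_REG_OFFSET_KICK_NON_RW_IO"


if __name__ == "__main__":
    print(kick_register(0x28))  # READ(10)
    print(kick_register(0x12))  # INQUIRY
```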
|
||||
|
||||
Abort command
|
||||
-------------
|
||||
|
||||
a. Issue ``PVSCSI_CMD_ABORT_CMD`` command
|
||||
|
||||
Request completion processing
|
||||
-----------------------------
|
||||
|
||||
a. Upon arrival of a completion interrupt, process the completion
and message (if enabled) rings
|
|
@ -1,92 +0,0 @@
|
|||
General Description
|
||||
===================
|
||||
|
||||
This document describes VMWare PVSCSI device interface specification.
|
||||
Created by Dmitry Fleytman (dmitry@daynix.com), Daynix Computing LTD.
|
||||
Based on source code of PVSCSI Linux driver from kernel 3.0.4
|
||||
|
||||
PVSCSI Device Interface Overview
|
||||
================================
|
||||
|
||||
The interface is based on memory area shared between hypervisor and VM.
|
||||
Memory area is obtained by driver as device IO memory resource of
|
||||
PVSCSI_MEM_SPACE_SIZE length.
|
||||
The shared memory consists of registers area and rings area.
|
||||
The registers area is used to raise hypervisor interrupts and issue device
|
||||
commands. The rings area is used to transfer data descriptors and SCSI
|
||||
commands from VM to hypervisor and to transfer messages produced by
|
||||
hypervisor to VM. Data itself is transferred via virtual scatter-gather DMA.
|
||||
|
||||
PVSCSI Device Registers
|
||||
=======================
|
||||
|
||||
The length of the registers area is 1 page (PVSCSI_MEM_SPACE_COMMAND_NUM_PAGES).
|
||||
The structure of the registers area is described by the PVSCSIRegOffset enum.
|
||||
There are registers to issue device command (with optional short data),
|
||||
issue device interrupt, control interrupts masking.
|
||||
|
||||
PVSCSI Device Rings
|
||||
===================
|
||||
|
||||
There are three rings in shared memory:
|
||||
|
||||
1. Request ring (struct PVSCSIRingReqDesc *req_ring)
|
||||
- ring for OS to device requests
|
||||
2. Completion ring (struct PVSCSIRingCmpDesc *cmp_ring)
|
||||
- ring for device request completions
|
||||
3. Message ring (struct PVSCSIRingMsgDesc *msg_ring)
|
||||
- ring for messages from device.
|
||||
This ring is optional and the guest might not configure it.
|
||||
There is a control area (struct PVSCSIRingsState *rings_state) used to control
|
||||
rings operation.
|
||||
|
||||
PVSCSI Device to Host Interrupts
|
||||
================================
|
||||
There are following interrupt types supported by PVSCSI device:
|
||||
1. Completion interrupts (completion ring notifications):
|
||||
PVSCSI_INTR_CMPL_0
|
||||
PVSCSI_INTR_CMPL_1
|
||||
2. Message interrupts (message ring notifications):
|
||||
PVSCSI_INTR_MSG_0
|
||||
PVSCSI_INTR_MSG_1
|
||||
|
||||
Interrupts are controlled via PVSCSI_REG_OFFSET_INTR_MASK register
|
||||
Bit set means interrupt enabled, bit cleared - disabled
|
||||
|
||||
Interrupt modes supported are legacy, MSI and MSI-X
|
||||
In case of legacy interrupts, register PVSCSI_REG_OFFSET_INTR_STATUS
|
||||
is used to check which interrupt has arrived. Interrupts are
|
||||
acknowledged when the corresponding bit is written to the interrupt
|
||||
status register.
|
||||
|
||||
PVSCSI Device Operation Sequences
|
||||
=================================
|
||||
|
||||
1. Startup sequence:
|
||||
a. Issue PVSCSI_CMD_ADAPTER_RESET command;
|
||||
aa. Windows driver reads interrupt status register here;
|
||||
b. Issue PVSCSI_CMD_SETUP_MSG_RING command with no additional data,
|
||||
check status and disable device messages if error returned;
|
||||
(Omitted if device messages disabled by driver configuration)
|
||||
c. Issue PVSCSI_CMD_SETUP_RINGS command, provide rings configuration
|
||||
as struct PVSCSICmdDescSetupRings;
|
||||
d. Issue PVSCSI_CMD_SETUP_MSG_RING command again, provide
|
||||
rings configuration as struct PVSCSICmdDescSetupMsgRing;
|
||||
e. Unmask completion and message (if device messages enabled) interrupts.
|
||||
|
||||
2. Shutdown sequences
|
||||
a. Mask interrupts;
|
||||
b. Flush request ring using PVSCSI_REG_OFFSET_KICK_NON_RW_IO;
|
||||
c. Issue PVSCSI_CMD_ADAPTER_RESET command.
|
||||
|
||||
3. Send request
|
||||
a. Fill next free request ring descriptor;
|
||||
b. Issue PVSCSI_REG_OFFSET_KICK_RW_IO for R/W operations;
|
||||
or PVSCSI_REG_OFFSET_KICK_NON_RW_IO for other operations.
|
||||
|
||||
4. Abort command
|
||||
a. Issue PVSCSI_CMD_ABORT_CMD command;
|
||||
|
||||
5. Request completion processing
|
||||
a. Upon completion interrupt arrival process completion
|
||||
and message (if enabled) rings.
|
|
@ -33,7 +33,7 @@ syntax when using the shared memory server is:
|
|||
When using the server, the guest will be assigned a VM ID (>=0) that
|
||||
allows guests using the same server to communicate via interrupts.
|
||||
Guests can read their VM ID from a device register (see
|
||||
ivshmem-spec.txt).
|
||||
:doc:`../../specs/ivshmem-spec`).
|
||||
|
||||
Migration with ivshmem
|
||||
~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
|