Commit graph

37690 commits

Author SHA1 Message Date
Stefan Hajnoczi
aa3a285b5b Hi,
"Host Memory Backends" and "Memory devices" queue ("mem"):
 - Fixup handling of virtio-mem unplug during system resets, as
   preparation for s390x support (especially kdump in the Linux guest)
 - virtio-mem support for s390x
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCAAvFiEEG9nKrXNcTDpGDfzKTd4Q9wD/g1oFAmdnFD4RHGRhdmlkQHJl
 ZGhhdC5jb20ACgkQTd4Q9wD/g1rWBBAAp7WkYaNAjRy1PgpjNZ3z1gUJc/vk+skJ
 xVgGodA8txrJOFpNrbTyfhrdLs2TV4oWDvB/zrZRRtuxvur3O1EhFd9k6EqXuydr
 0FunvLvVJwRHfEZycjN4aacQMRH3CJw07OaTzexeSl5UR/6w5PRofwUK4HX7W/Ka
 arqomGa3OJrs1+WgkV0Qcn4vh9HLRVv3iNC2Xo4W1wOCr1Du9zSPn9oC7zOQ0EO4
 ZC//7QsdkNRjUX/yMXMkhlSXx3b/RmRg2DBrxo7BZXg27VwGu4uHxL4LRBZiB2A7
 V9MqFOcVKzPMkXKTRjrgZ0vXQx1MPJ6WprEihMzMpYU6DrpA7KN/l8Ca8H24B2ln
 h7+bmkDsHVVcWovE9ii/9cMRfws6uWXXg3KoA8RQ8IbX1tU02lblw2uHhXEzcoge
 npqp/Z5LAiKVMetEnNnLH5thjut5PAEjuqD00cmZAMy4DNngLX2bGSdzMeVBkDMa
 78ehLGRplm3t7ibUfaZaMKe6UD9tFrcD6XKsvUTXXHNbYO8ynbx58WOxSZmY98zU
 n3JNQRqtXYjBVlH3Dqm47vOTZHgOzFv3raa8BmSLpcBDeTXCTcUIl20s77dGw/vT
 r5YNCMN7O4YPFKUoRK9604QTgw6qlYaRTQlJD09usprGqVylb6gQtfZZuZkYDMp8
 sEI77QHsePA=
 =HDxr
 -----END PGP SIGNATURE-----

Merge tag 'mem-2024-12-21' of https://github.com/davidhildenbrand/qemu into staging

Hi,

"Host Memory Backends" and "Memory devices" queue ("mem"):
- Fixup handling of virtio-mem unplug during system resets, as
  preparation for s390x support (especially kdump in the Linux guest)
- virtio-mem support for s390x

 # -----BEGIN PGP SIGNATURE-----
 #
 # iQJFBAABCAAvFiEEG9nKrXNcTDpGDfzKTd4Q9wD/g1oFAmdnFD4RHGRhdmlkQHJl
 # ZGhhdC5jb20ACgkQTd4Q9wD/g1rWBBAAp7WkYaNAjRy1PgpjNZ3z1gUJc/vk+skJ
 # xVgGodA8txrJOFpNrbTyfhrdLs2TV4oWDvB/zrZRRtuxvur3O1EhFd9k6EqXuydr
 # 0FunvLvVJwRHfEZycjN4aacQMRH3CJw07OaTzexeSl5UR/6w5PRofwUK4HX7W/Ka
 # arqomGa3OJrs1+WgkV0Qcn4vh9HLRVv3iNC2Xo4W1wOCr1Du9zSPn9oC7zOQ0EO4
 # ZC//7QsdkNRjUX/yMXMkhlSXx3b/RmRg2DBrxo7BZXg27VwGu4uHxL4LRBZiB2A7
 # V9MqFOcVKzPMkXKTRjrgZ0vXQx1MPJ6WprEihMzMpYU6DrpA7KN/l8Ca8H24B2ln
 # h7+bmkDsHVVcWovE9ii/9cMRfws6uWXXg3KoA8RQ8IbX1tU02lblw2uHhXEzcoge
 # npqp/Z5LAiKVMetEnNnLH5thjut5PAEjuqD00cmZAMy4DNngLX2bGSdzMeVBkDMa
 # 78ehLGRplm3t7ibUfaZaMKe6UD9tFrcD6XKsvUTXXHNbYO8ynbx58WOxSZmY98zU
 # n3JNQRqtXYjBVlH3Dqm47vOTZHgOzFv3raa8BmSLpcBDeTXCTcUIl20s77dGw/vT
 # r5YNCMN7O4YPFKUoRK9604QTgw6qlYaRTQlJD09usprGqVylb6gQtfZZuZkYDMp8
 # sEI77QHsePA=
 # =HDxr
 # -----END PGP SIGNATURE-----
 # gpg: Signature made Sat 21 Dec 2024 14:17:18 EST
 # gpg:                using RSA key 1BD9CAAD735C4C3A460DFCCA4DDE10F700FF835A
 # gpg:                issuer "david@redhat.com"
 # gpg: Good signature from "David Hildenbrand <david@redhat.com>" [unknown]
 # gpg:                 aka "David Hildenbrand <davidhildenbrand@gmail.com>" [full]
 # gpg:                 aka "David Hildenbrand <hildenbr@in.tum.de>" [unknown]
 # gpg: WARNING: The key's User ID is not certified with a trusted signature!
 # gpg:          There is no indication that the signature belongs to the owner.
 # Primary key fingerprint: 1BD9 CAAD 735C 4C3A 460D  FCCA 4DDE 10F7 00FF 835A

* tag 'mem-2024-12-21' of https://github.com/davidhildenbrand/qemu:
  s390x: virtio-mem support
  s390x/virtio-ccw: add support for virtio based memory devices
  s390x: remember the maximum page size
  s390x/pv: prepare for memory devices
  s390x/s390-virtio-ccw: prepare for memory devices
  s390x/s390-skeys: prepare for memory devices
  s390x/s390-stattrib-kvm: prepare for memory devices and sparse memory layouts
  s390x/s390-hypercall: introduce DIAG500 STORAGE_LIMIT
  s390x: introduce s390_get_memory_limit()
  s390x/s390-virtio-ccw: move setting the maximum guest size from sclp to machine code
  s390x: rename s390-virtio-hcall* to s390-hypercall*
  s390x/s390-virtio-hcall: prepare for more diag500 hypercalls
  s390x/s390-virtio-hcall: remove hypercall registration mechanism
  s390x/s390-virtio-ccw: don't crash on weird RAM sizes
  virtio-mem: unplug memory only during system resets, not device resets

Conflicts:
- hw/s390x/s390-stattrib-kvm.c
  sysemu/ -> system/ header rename conflict.
- hw/s390x/virtio-ccw-mem.c
  Make Property array const and removed DEFINE_PROP_END_OF_LIST() to
  conform to the latest conventions.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2024-12-22 14:33:27 -05:00
David Hildenbrand
aa910c20ec s390x: virtio-mem support
Let's add our virtio-mem-ccw proxy device and wire it up. We should
be supporting everything (e.g., device unplug, "dynamic-memslots") that
we already support for the virtio-pci variant.

With a Linux guest that supports virtio-mem (and has automatic memory
onlining properly configured) the following example will work:

1. Start a VM with 4G initial memory and a virtio-mem device with a maximum
   capacity of 16GB:

   qemu/build/qemu-system-s390x \
    --enable-kvm \
    -m 4G,maxmem=20G \
    -nographic \
    -smp 8 \
    -hda Fedora-Server-KVM-40-1.14.s390x.qcow2 \
    -chardev socket,id=monitor,path=/var/tmp/monitor,server,nowait \
    -mon chardev=monitor,mode=readline \
    -object memory-backend-ram,id=mem0,size=16G,reserve=off \
    -device virtio-mem-ccw,id=vmem0,memdev=mem0,dynamic-memslots=on

2. Query the current size of virtio-mem device:

    (qemu) info memory-devices
    Memory device [virtio-mem]: "vmem0"
      memaddr: 0x100000000
      node: 0
      requested-size: 0
      size: 0
      max-size: 17179869184
      block-size: 1048576
      memdev: /objects/mem0

3. Request to grow it to 8GB (hotplug 8GB):

    (qemu) qom-set vmem0 requested-size 8G
    (qemu) info memory-devices
    Memory device [virtio-mem]: "vmem0"
      memaddr: 0x100000000
      node: 0
      requested-size: 8589934592
      size: 8589934592
      max-size: 17179869184
      block-size: 1048576
      memdev: /objects/mem0

4. Request to grow to 16GB (hotplug another 8GB):

    (qemu) qom-set vmem0 requested-size 16G
    (qemu) info memory-devices
    Memory device [virtio-mem]: "vmem0"
      memaddr: 0x100000000
      node: 0
      requested-size: 17179869184
      size: 17179869184
      max-size: 17179869184
      block-size: 1048576
      memdev: /objects/mem0

5. Try to hotunplug all memory again, shrinking to 0GB:

    (qemu) qom-set vmem0 requested-size 0G
    (qemu) info memory-devices
    Memory device [virtio-mem]: "vmem0"
      memaddr: 0x100000000
      node: 0
      requested-size: 0
      size: 0
      max-size: 17179869184
      block-size: 1048576
      memdev: /objects/mem0

6. If it worked, unplug the device

    (qemu) device_del vmem0
    (qemu) info memory-devices
    (qemu) object_del mem0

7. Hotplug a new device with a smaller capacity and directly size it to 1GB

    (qemu) object_add memory-backend-ram,id=mem0,size=8G,reserve=off
    (qemu) device_add virtio-mem-ccw,id=vmem0,memdev=mem0,\
                      dynamic-memslots=on,requested-size=1G
    (qemu) info memory-devices
    Memory device [virtio-mem]: "vmem0"
      memaddr: 0x100000000
      node: 0
      requested-size: 1073741824
      size: 1073741824
      max-size: 8589934592
      block-size: 1048576
      memdev: /objects/mem0

Trying to use a virtio-mem device backed by hugetlb into a !hugetlb VM
correctly results in the error:
   ... Memory device uses a bigger page size than initial memory

Note that the virtio-mem driver in Linux will supports 1 MiB (pageblock)
granularity.

Message-ID: <20241219144115.2820241-15-david@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2024-12-21 20:15:06 +01:00
Stefan Hajnoczi
65cb7129f4 Accel & Exec patch queue
- Ignore writes to CNTP_CTL_EL0 on HVF ARM (Alexander)
 - Add '-d invalid_mem' logging option (Zoltan)
 - Create QOM containers explicitly (Peter)
 - Rename sysemu/ -> system/ (Philippe)
 - Re-orderning of include/exec/ headers (Philippe)
   Move a lot of declarations from these legacy mixed bag headers:
     . "exec/cpu-all.h"
     . "exec/cpu-common.h"
     . "exec/cpu-defs.h"
     . "exec/exec-all.h"
     . "exec/translate-all"
   to these more specific ones:
     . "exec/page-protection.h"
     . "exec/translation-block.h"
     . "user/cpu_loop.h"
     . "user/guest-host.h"
     . "user/page-protection.h"
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmdlnyAACgkQ4+MsLN6t
 wN6mBw//QFWi7CrU+bb8KMM53kOU9C507tjn99LLGFb5or73/umDsw6eo/b8DHBt
 KIwGLgATel42oojKfNKavtAzLK5rOrywpboPDpa3SNeF1onW+99NGJ52LQUqIX6K
 A6bS0fPdGG9ZzEuPpbjDXlp++0yhDcdSgZsS42fEsT7Dyj5gzJYlqpqhiXGqpsn8
 4Y0UMxSL21K3HEexlzw2hsoOBFA3tUm2ujNDhNkt8QASr85yQVLCypABJnuoe///
 5Ojl5wTBeDwhANET0rhwHK8eIYaNboiM9fHopJYhvyw1bz6yAu9jQwzF/MrL3s/r
 xa4OBHBy5mq2hQV9Shcl3UfCQdk/vDaYaWpgzJGX8stgMGYfnfej1SIl8haJIfcl
 VMX8/jEFdYbjhO4AeGRYcBzWjEJymkDJZoiSWp2NuEDi6jqIW+7yW1q0Rnlg9lay
 ShAqLK5Pv4zUw3t0Jy3qv9KSW8sbs6PQxtzXjk8p97rTf76BJ2pF8sv1tVzmsidP
 9L92Hv5O34IqzBu2oATOUZYJk89YGmTIUSLkpT7asJZpBLwNM2qLp5jO00WVU0Sd
 +kAn324guYPkko/TVnjC/AY7CMu55EOtD9NU35k3mUAnxXT9oDUeL4NlYtfgrJx6
 x1Nzr2FkS68+wlPAFKNSSU5lTjsjNaFM0bIJ4LCNtenJVP+SnRo=
 =cjz8
 -----END PGP SIGNATURE-----

Merge tag 'exec-20241220' of https://github.com/philmd/qemu into staging

Accel & Exec patch queue

- Ignore writes to CNTP_CTL_EL0 on HVF ARM (Alexander)
- Add '-d invalid_mem' logging option (Zoltan)
- Create QOM containers explicitly (Peter)
- Rename sysemu/ -> system/ (Philippe)
- Re-orderning of include/exec/ headers (Philippe)
  Move a lot of declarations from these legacy mixed bag headers:
    . "exec/cpu-all.h"
    . "exec/cpu-common.h"
    . "exec/cpu-defs.h"
    . "exec/exec-all.h"
    . "exec/translate-all"
  to these more specific ones:
    . "exec/page-protection.h"
    . "exec/translation-block.h"
    . "user/cpu_loop.h"
    . "user/guest-host.h"
    . "user/page-protection.h"

 # -----BEGIN PGP SIGNATURE-----
 #
 # iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmdlnyAACgkQ4+MsLN6t
 # wN6mBw//QFWi7CrU+bb8KMM53kOU9C507tjn99LLGFb5or73/umDsw6eo/b8DHBt
 # KIwGLgATel42oojKfNKavtAzLK5rOrywpboPDpa3SNeF1onW+99NGJ52LQUqIX6K
 # A6bS0fPdGG9ZzEuPpbjDXlp++0yhDcdSgZsS42fEsT7Dyj5gzJYlqpqhiXGqpsn8
 # 4Y0UMxSL21K3HEexlzw2hsoOBFA3tUm2ujNDhNkt8QASr85yQVLCypABJnuoe///
 # 5Ojl5wTBeDwhANET0rhwHK8eIYaNboiM9fHopJYhvyw1bz6yAu9jQwzF/MrL3s/r
 # xa4OBHBy5mq2hQV9Shcl3UfCQdk/vDaYaWpgzJGX8stgMGYfnfej1SIl8haJIfcl
 # VMX8/jEFdYbjhO4AeGRYcBzWjEJymkDJZoiSWp2NuEDi6jqIW+7yW1q0Rnlg9lay
 # ShAqLK5Pv4zUw3t0Jy3qv9KSW8sbs6PQxtzXjk8p97rTf76BJ2pF8sv1tVzmsidP
 # 9L92Hv5O34IqzBu2oATOUZYJk89YGmTIUSLkpT7asJZpBLwNM2qLp5jO00WVU0Sd
 # +kAn324guYPkko/TVnjC/AY7CMu55EOtD9NU35k3mUAnxXT9oDUeL4NlYtfgrJx6
 # x1Nzr2FkS68+wlPAFKNSSU5lTjsjNaFM0bIJ4LCNtenJVP+SnRo=
 # =cjz8
 # -----END PGP SIGNATURE-----
 # gpg: Signature made Fri 20 Dec 2024 11:45:20 EST
 # gpg:                using RSA key FAABE75E12917221DCFD6BB2E3E32C2CDEADC0DE
 # gpg: Good signature from "Philippe Mathieu-Daudé (F4BUG) <f4bug@amsat.org>" [unknown]
 # gpg: WARNING: This key is not certified with a trusted signature!
 # gpg:          There is no indication that the signature belongs to the owner.
 # Primary key fingerprint: FAAB E75E 1291 7221 DCFD  6BB2 E3E3 2C2C DEAD C0DE

* tag 'exec-20241220' of https://github.com/philmd/qemu: (59 commits)
  util/qemu-timer: fix indentation
  meson: Do not define CONFIG_DEVICES on user emulation
  system/accel-ops: Remove unnecessary 'exec/cpu-common.h' header
  system/numa: Remove unnecessary 'exec/cpu-common.h' header
  hw/xen: Remove unnecessary 'exec/cpu-common.h' header
  target/mips: Drop left-over comment about Jazz machine
  target/mips: Remove tswap() calls in semihosting uhi_fstat_cb()
  target/xtensa: Remove tswap() calls in semihosting simcall() helper
  accel/tcg: Un-inline translator_is_same_page()
  accel/tcg: Include missing 'exec/translation-block.h' header
  accel/tcg: Move tcg_cflags_has/set() to 'exec/translation-block.h'
  accel/tcg: Restrict curr_cflags() declaration to 'internal-common.h'
  qemu/coroutine: Include missing 'qemu/atomic.h' header
  exec/translation-block: Include missing 'qemu/atomic.h' header
  accel/tcg: Declare cpu_loop_exit_requested() in 'exec/cpu-common.h'
  exec/cpu-all: Include 'cpu.h' earlier so MMU_USER_IDX is always defined
  target/sparc: Move sparc_restore_state_to_opc() to cpu.c
  target/sparc: Uninline cpu_get_tb_cpu_state()
  target/loongarch: Declare loongarch_cpu_dump_state() locally
  user: Move various declarations out of 'exec/exec-all.h'
  ...

Conflicts:
	hw/char/riscv_htif.c
	hw/intc/riscv_aplic.c
	target/s390x/cpu.c

	Apply sysemu header path changes to not in the pull request.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2024-12-21 11:07:00 -05:00
David Hildenbrand
88d86f6f1e s390x/virtio-ccw: add support for virtio based memory devices
Let's implement support for abstract virtio based memory devices, using
the virtio-pci implementation as an orientation. Wire them up in the
machine hotplug handler, taking care of s390x page size limitations.

As we neither support virtio-mem or virtio-pmem yet, the code is
effectively unused. We'll implement support for virtio-mem based on this
next.

Note that we won't wire up the virtio-pci variant (should currently be
impossible due to lack of support for MSI-X), but we'll add a safety net
to reject plugging them in the pre-plug handler.

Message-ID: <20241219144115.2820241-14-david@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2024-12-21 15:59:59 +01:00
David Hildenbrand
df2ac211a6 s390x: remember the maximum page size
Let's remember the value (successfully) set via s390_set_max_pagesize().
This will be helpful to reject hotplugged memory devices that would exceed
this initially set page size.

Handle it just like how we handle s390_get_memory_limit(), storing it in
the machine, and moving the handling to machine code.

Message-ID: <20241219144115.2820241-13-david@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2024-12-21 15:59:59 +01:00
David Hildenbrand
1e86400298 s390x/s390-virtio-ccw: prepare for memory devices
Let's prepare our address space for memory devices if enabled via
"maxmem" and if we have CONFIG_MEM_DEVICE enabled at all. Note that
CONFIG_MEM_DEVICE will be selected automatically once we add support
for devices.

Just like on other architectures, the region container for memory devices
is placed directly above our initial memory. For now, we only align the
start address of the region up to 1 GiB, but we won't add any additional
space to the region for internal alignment purposes; this can be done in
the future if really required.

The RAM size returned via SCLP is not modified, as this only
covers initial RAM (and standby memory we don't implement) and not memory
devices; clarify that in the docs of read_SCP_info(). Existing OSes without
support for memory devices will keep working as is, even when memory
devices would be attached the VM.

Guest OSs which support memory devices, such as virtio-mem, will
consult diag500(), to find out the maximum possible pfn. Guest OSes that
don't support memory devices, don't have to be changed and will continue
relying on information provided by SCLP.

There are no remaining maxram_size users in s390x code, and the remaining
ram_size users only care about initial RAM:
* hw/s390x/ipl.c
* hw/s390x/s390-hypercall.c
* hw/s390x/sclp.c
* target/s390x/kvm/pv.c

Message-ID: <20241219144115.2820241-11-david@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2024-12-21 15:59:59 +01:00
David Hildenbrand
d1e3c2ac41 s390x/s390-skeys: prepare for memory devices
With memory devices, we will have storage keys for memory that
exceeds the initial ram size.

The TODO already states that current handling is subopimal,
but we won't worry about improving that (TCG-only) thing for now.

Message-ID: <20241219144115.2820241-10-david@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2024-12-21 15:59:59 +01:00
David Hildenbrand
241e6b2d27 s390x/s390-stattrib-kvm: prepare for memory devices and sparse memory layouts
With memory devices, we will have storage attributes for memory that
exceeds the initial ram size. Further, we can easily have memory holes,
for which there (currently) are no storage attributes.

In particular, with memory holes, KVM_S390_SET_CMMA_BITS will fail to set
some storage attributes.

So let's do it like we handle storage keys migration, relying on
guest_phys_blocks_append(). However, in contrast to storage key
migration, we will handle it on the migration destination.

This is a preparation for virtio-mem support. Note that ever since the
"early migration" feature was added (x-early-migration), the state
of device blocks (plugged/unplugged) is migrated early such that
guest_phys_blocks_append() will properly consider all currently plugged
memory blocks and skip any unplugged ones.

In the future, we should try getting rid of the large temporary buffer
and also not send any attributes for any memory holes, just so they
get ignored on the destination.

Message-ID: <20241219144115.2820241-9-david@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2024-12-21 15:59:59 +01:00
David Hildenbrand
f7c1686578 s390x/s390-hypercall: introduce DIAG500 STORAGE_LIMIT
A guest OS that supports memory hotplug / memory devices must during
boot be aware of the maximum possible physical memory address that it might
have to handle at a later stage during its runtime.

For example, the maximum possible memory address might be required to
prepare the kernel virtual address space accordingly (e.g., select page
table hierarchy depth).

On s390x there is currently no such mechanism that is compatible with
paravirtualized memory devices, because the whole SCLP interface was
designed around the idea of "storage increments" and "standby memory".
Paravirtualized memory devices we want to support, such as virtio-mem, have
no intersection with any of that, but could co-exist with them in the
future if ever needed.

In particular, a guest OS must never detect and use device memory
without the help of a proper device driver. Device memory must not be
exposed in any firmware-provided memory map (SCLP or diag260 on s390x).
For this reason, these memory devices will be places in memory *above*
the "maximum storage increment" exposed via SCLP.

Let's provide a new diag500 subcode to query the memory limit determined in
s390_memory_init().

Message-ID: <20241219144115.2820241-8-david@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2024-12-21 15:59:59 +01:00
David Hildenbrand
27221b69a3 s390x: introduce s390_get_memory_limit()
Let's add s390_get_memory_limit(), to query what has been successfully
set via s390_set_memory_limit(). Allow setting the limit only once.

We'll remember the limit in the machine state. Move
s390_set_memory_limit() to machine code, merging it into
set_memory_limit(), because this really is a machine property.

Message-ID: <20241219144115.2820241-7-david@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2024-12-21 15:59:59 +01:00
David Hildenbrand
3c6fb557d2 s390x/s390-virtio-ccw: move setting the maximum guest size from sclp to machine code
Nowadays, it feels more natural to have that code located in
s390_memory_init(), where we also have direct access to the machine
object.

While at it, use the actual RAM size, not the maximum RAM size which
cannot currently be reached without support for any memory devices.
Consequently update s390_pv_vm_try_disable_async() to rely on the RAM size
as well, to avoid temporary issues while we further rework that
handling.

set_memory_limit() is temporary, we'll merge it with
s390_set_memory_limit() next.

Message-ID: <20241219144115.2820241-6-david@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2024-12-21 15:59:59 +01:00
David Hildenbrand
85489fc365 s390x: rename s390-virtio-hcall* to s390-hypercall*
Let's make it clearer that we are talking about general
QEMU/KVM-specific hypercalls.

Message-ID: <20241219144115.2820241-5-david@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2024-12-21 15:59:59 +01:00
David Hildenbrand
6e9cc2da4e s390x/s390-virtio-hcall: prepare for more diag500 hypercalls
Let's generalize, abstracting the virtio bits. diag500 is now a generic
hypercall to handle QEMU/KVM specific things. Explicitly specify all
already defined subcodes, including legacy ones (so we know what we can
use for new hypercalls).

Move the PGM_SPECIFICATION injection into the renamed function
handle_diag_500(), so we can turn it into a void function.

We'll rename the files separately, so git properly detects the rename.

Message-ID: <20241219144115.2820241-4-david@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2024-12-21 15:59:59 +01:00
David Hildenbrand
4be0fce498 s390x/s390-virtio-hcall: remove hypercall registration mechanism
Nowadays, we only have a single machine type in QEMU, everything is based
on virtio-ccw and the traditional virtio machine does no longer exist. No
need to dynamically register diag500 handlers. Move the two existing
handlers into s390-virtio-hcall.c.

Message-ID: <20241219144115.2820241-3-david@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Acked-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2024-12-21 15:59:59 +01:00
David Hildenbrand
14e568ab48 s390x/s390-virtio-ccw: don't crash on weird RAM sizes
KVM is not happy when starting a VM with weird RAM sizes:

  # qemu-system-s390x --enable-kvm --nographic -m 1234K
  qemu-system-s390x: kvm_set_user_memory_region: KVM_SET_USER_MEMORY_REGION
    failed, slot=0, start=0x0, size=0x244000: Invalid argument
  kvm_set_phys_mem: error registering slot: Invalid argument
  Aborted (core dumped)

Let's handle that in a better way by rejecting such weird RAM sizes
right from the start:

  # qemu-system-s390x --enable-kvm --nographic -m 1234K
  qemu-system-s390x: ram size must be multiples of 1 MiB

Message-ID: <20241219144115.2820241-2-david@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Acked-by: Janosch Frank <frankja@linux.ibm.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2024-12-21 15:59:59 +01:00
David Hildenbrand
713484d038 virtio-mem: unplug memory only during system resets, not device resets
We recently converted from the LegacyReset to the new reset framework
in commit c009a311e9 ("virtio-mem: Use new Resettable framework instead
of LegacyReset") to be able to use the ResetType to filter out wakeup
resets.

However, this change had an undesired implications: as we override the
Resettable interface methods in VirtIOMEMClass, the reset handler will
not only get called during system resets (i.e., qemu_devices_reset())
but also during any direct or indirect device rests (e.g.,
device_cold_reset()).

Further, we might now receive two reset callbacks during
qemu_devices_reset(), first when reset by a parent and later when reset
directly.

The memory state of virtio-mem devices is rather special: it's supposed to
be persistent/unchanged during most resets (similar to resetting a hard
disk will not destroy the data), unless actually cold-resetting the whole
system (different to a hard disk where a reboot will not destroy the data):
ripping out system RAM is something guest OSes don't particularly enjoy,
but we want to detect when rebooting to an OS that does not support
virtio-mem and wouldn't be able to detect+use the memory -- and we want
to force-defragment hotplugged memory to also shrink the usable device
memory region. So we rally want to catch system resets to do that.

On supported targets (e.g., x86), getting a cold reset on the
device/parent triggers is not that easy (but looks like PCI code
might trigger it), so this implication went unnoticed.

However, with upcoming s390x support it is problematic: during
kdump, s390x triggers a subsystem reset, ending up in
s390_machine_reset() and calling only subsystem_reset() instead of
qemu_devices_reset() -- because it's not a full system reset.

In subsystem_reset(), s390x performs a device_cold_reset() of any
TYPE_VIRTUAL_CSS_BRIDGE device, which ends up resetting all children,
including the virtio-mem device. Consequently, we wrongly detect a system
reset and unplug all device memory, resulting in hotplugged memory not
getting included in the crash dump -- undesired.

We really must not mess with hotplugged memory state during simple
device resets. To fix, create+register a new reset object that will only
get triggered during qemu_devices_reset() calls, but not during any other
resets as it is logically not the child of any other object.

Message-ID: <20241025104103.342188-1-david@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Juraj Marcin <jmarcin@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
2024-12-21 15:59:59 +01:00
Stefan Hajnoczi
60a07d4a6e RISC-V PR for 10.0
* Correct the validness check of iova
 * Fix APLIC in_clrip and clripnum write emulation
 * Support riscv-iommu-sys device
 * Add Tenstorrent Ascalon CPU
 * Add AIA userspace irqchip_split support
 * Add Microblaze V generic board
 * Upgrade ACPI SPCR table to support SPCR table revision 4 format
 * Remove tswap64() calls from HTIF
 * Support 64-bit address of initrd
 * Introduce svukte ISA extension
 * Support ssstateen extension
 * Support for RV64 Xiangshan Nanhu CPU
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEaukCtqfKh31tZZKWr3yVEwxTgBMFAmdkzjgACgkQr3yVEwxT
 gBOcyA//e0XhAQciQglCZZCfINdOyI8qSh+P2K0qtrXZ4VERHEMp7UoD5CQr2cZv
 h8ij1EkatXCwukVELx0rNckxG33bEFgG1oESnQSrwGE0Iu4csNW24nK5WlUS0/r+
 A5oD2wtzEF+cbhTKrVSDBN/PvlnWTKGEoJRkuXWfz5d4uR9eyQhfED0S2j36lNEC
 X1x/OZoKM89XuXtOFe9g55Z5UNzAatcdTISozL0FydiPh7QeVjTLHh28/tt559MX
 7v5aJFlQuZ78z1mIHkZmPSorSrJ0zqhkP6NWe1ae06oMgzwRQQhYLppDILV4ZgUF
 3mSDRoXmBycQXiYNPcHep3LdXfvxr+PpWHSevx8gH1jwm93On7Y/H7Uol6TDXzfC
 mrFjalfV5tzrD90ZvB+s5bCMF1q5Z8Dlj0pYF9aN9P1ILoWy3dndFAPJB6uKKDP7
 Qd4qOQ3dVyHAX9jLmVkB6QvAV/vTDrYTsAxaF/EaoLOy0IoKhjTvgda3XzE1MFKA
 gVafLluADIfSEdqa2QR2ExL8d1SZVoiObCp5TMLRer0HIpg/vQZwjfdbo4BgQKL3
 7Q6wBxcZUNqrFgspXjm5WFIrdk2rfS/79OmvpNM6SZaK6BnklntdJHJHtAWujGsm
 EM310AUFpHMp2h6Nqnemb3qr5l4d20KSt8DhoPAUq1IE59Kb8XY=
 =0iQW
 -----END PGP SIGNATURE-----

Merge tag 'pull-riscv-to-apply-20241220' of https://github.com/alistair23/qemu into staging

RISC-V PR for 10.0

* Correct the validness check of iova
* Fix APLIC in_clrip and clripnum write emulation
* Support riscv-iommu-sys device
* Add Tenstorrent Ascalon CPU
* Add AIA userspace irqchip_split support
* Add Microblaze V generic board
* Upgrade ACPI SPCR table to support SPCR table revision 4 format
* Remove tswap64() calls from HTIF
* Support 64-bit address of initrd
* Introduce svukte ISA extension
* Support ssstateen extension
* Support for RV64 Xiangshan Nanhu CPU

 # -----BEGIN PGP SIGNATURE-----
 #
 # iQIzBAABCAAdFiEEaukCtqfKh31tZZKWr3yVEwxTgBMFAmdkzjgACgkQr3yVEwxT
 # gBOcyA//e0XhAQciQglCZZCfINdOyI8qSh+P2K0qtrXZ4VERHEMp7UoD5CQr2cZv
 # h8ij1EkatXCwukVELx0rNckxG33bEFgG1oESnQSrwGE0Iu4csNW24nK5WlUS0/r+
 # A5oD2wtzEF+cbhTKrVSDBN/PvlnWTKGEoJRkuXWfz5d4uR9eyQhfED0S2j36lNEC
 # X1x/OZoKM89XuXtOFe9g55Z5UNzAatcdTISozL0FydiPh7QeVjTLHh28/tt559MX
 # 7v5aJFlQuZ78z1mIHkZmPSorSrJ0zqhkP6NWe1ae06oMgzwRQQhYLppDILV4ZgUF
 # 3mSDRoXmBycQXiYNPcHep3LdXfvxr+PpWHSevx8gH1jwm93On7Y/H7Uol6TDXzfC
 # mrFjalfV5tzrD90ZvB+s5bCMF1q5Z8Dlj0pYF9aN9P1ILoWy3dndFAPJB6uKKDP7
 # Qd4qOQ3dVyHAX9jLmVkB6QvAV/vTDrYTsAxaF/EaoLOy0IoKhjTvgda3XzE1MFKA
 # gVafLluADIfSEdqa2QR2ExL8d1SZVoiObCp5TMLRer0HIpg/vQZwjfdbo4BgQKL3
 # 7Q6wBxcZUNqrFgspXjm5WFIrdk2rfS/79OmvpNM6SZaK6BnklntdJHJHtAWujGsm
 # EM310AUFpHMp2h6Nqnemb3qr5l4d20KSt8DhoPAUq1IE59Kb8XY=
 # =0iQW
 # -----END PGP SIGNATURE-----
 # gpg: Signature made Thu 19 Dec 2024 20:54:00 EST
 # gpg:                using RSA key 6AE902B6A7CA877D6D659296AF7C95130C538013
 # gpg: Good signature from "Alistair Francis <alistair@alistair23.me>" [unknown]
 # gpg: WARNING: This key is not certified with a trusted signature!
 # gpg:          There is no indication that the signature belongs to the owner.
 # Primary key fingerprint: 6AE9 02B6 A7CA 877D 6D65  9296 AF7C 9513 0C53 8013

* tag 'pull-riscv-to-apply-20241220' of https://github.com/alistair23/qemu: (39 commits)
  target/riscv: add support for RV64 Xiangshan Nanhu CPU
  target/riscv: add ssstateen
  target/riscv/tcg: hide warn for named feats when disabling via priv_ver
  target/riscv: Include missing headers in 'internals.h'
  target/riscv: Include missing headers in 'vector_internals.h'
  target/riscv: Check svukte is not enabled in RV32
  target/riscv: Expose svukte ISA extension
  target/riscv: Check memory access to meet svukte rule
  target/riscv: Support hstatus[HUKTE] bit when svukte extension is enabled
  target/riscv: Support senvcfg[UKTE] bit when svukte extension is enabled
  target/riscv: Add svukte extension capability variable
  hw/riscv: Add the checking if DTB overlaps to kernel or initrd
  hw/riscv: Add a new struct RISCVBootInfo
  hw/riscv: Support to load DTB after 3GB memory on 64-bit system.
  hw/char/riscv_htif: Clarify MemoryRegionOps expect 32-bit accesses
  hw/char/riscv_htif: Explicit little-endian implementation
  MAINTAINERS: Cover RISC-V HTIF interface
  tests/qtest/bios-tables-test: Update virt SPCR golden reference for RISC-V
  hw/acpi: Upgrade ACPI SPCR table to support SPCR table revision 4 format
  qtest: allow SPCR acpi table changes
  ...

Conflicts:
  target/riscv/cpu.c

  Merge conflict with DEFINE_PROP_END_OF_LIST() removal. No Property
  array terminator is needed anymore.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2024-12-21 08:13:16 -05:00
Stefan Hajnoczi
e3a207722b * qdev: second part of Property cleanups
* rust: second part of QOM rework
 * rust: callbacks wrapper
 * rust: pl011 bugfixes
 * kvm: cleanup errors in kvm_convert_memory()
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmdkaEkUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroN0/wgAgIJg8BrlRKfmiz14NZfph8/jarSj
 TOWYVxL2v4q98KBuL5pta2ucObgzwqyqSyc02S2DGSOIMQCIiBB5MaCk1iMjx+BO
 pmVU8gNlD8faO8SSmnnr+jDQt+G+bQ/nRgQJOAReF8oVw3O2aC/FaVKpitMzWtvv
 PLnJWdrqqpGq14OzX8iNCzSujxppAuyjrhT4lNlekzDoDfdTez72r+rXkvg4GzZL
 QC3xLYg/LrT8Rs+zgOhm/AaIyS4bOyMlkU9Du1rQ6Tyne45ey2FCwKVzBKrJdGcw
 sVbzEclxseLenoTbZqYK6JTzLdDoThVUbY2JwoCGUaIm+74P4NjEsUsTVg==
 =TuQM
 -----END PGP SIGNATURE-----

Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging

* qdev: second part of Property cleanups
* rust: second part of QOM rework
* rust: callbacks wrapper
* rust: pl011 bugfixes
* kvm: cleanup errors in kvm_convert_memory()

 # -----BEGIN PGP SIGNATURE-----
 #
 # iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmdkaEkUHHBib256aW5p
 # QHJlZGhhdC5jb20ACgkQv/vSX3jHroN0/wgAgIJg8BrlRKfmiz14NZfph8/jarSj
 # TOWYVxL2v4q98KBuL5pta2ucObgzwqyqSyc02S2DGSOIMQCIiBB5MaCk1iMjx+BO
 # pmVU8gNlD8faO8SSmnnr+jDQt+G+bQ/nRgQJOAReF8oVw3O2aC/FaVKpitMzWtvv
 # PLnJWdrqqpGq14OzX8iNCzSujxppAuyjrhT4lNlekzDoDfdTez72r+rXkvg4GzZL
 # QC3xLYg/LrT8Rs+zgOhm/AaIyS4bOyMlkU9Du1rQ6Tyne45ey2FCwKVzBKrJdGcw
 # sVbzEclxseLenoTbZqYK6JTzLdDoThVUbY2JwoCGUaIm+74P4NjEsUsTVg==
 # =TuQM
 # -----END PGP SIGNATURE-----
 # gpg: Signature made Thu 19 Dec 2024 13:39:05 EST
 # gpg:                using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
 # gpg:                issuer "pbonzini@redhat.com"
 # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
 # gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
 # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
 #      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (42 commits)
  rust: pl011: simplify handling of the FIFO enabled bit in LCR
  rust: pl011: fix migration stream
  rust: pl011: extend registers to 32 bits
  rust: pl011: fix break errors and definition of Data struct
  rust: pl011: always use reset() method on registers
  rust: pl011: match break logic of C version
  rust: pl011: fix declaration of LineControl bits
  target/i386: Reset TSCs of parked vCPUs too on VM reset
  kvm: consistently return 0/-errno from kvm_convert_memory
  rust: qemu-api: add a module to wrap functions and zero-sized closures
  rust: qom: add initial subset of methods on Object
  rust: qom: add casting functionality
  rust: tests: allow writing more than one test
  bql: add a "mock" BQL for Rust unit tests
  rust: re-export C types from qemu-api submodules
  rust: rename qemu-api modules to follow C code a bit more
  rust: qom: add possibility of overriding unparent
  rust: qom: put class_init together from multiple ClassInitImpl<>
  Constify all opaque Property pointers
  hw/core/qdev-properties: Constify Property argument to PropertyInfo.print
  ...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2024-12-21 08:06:50 -05:00
Philippe Mathieu-Daudé
433442a75d system: Move 'exec/confidential-guest-support.h' to system/
"exec/confidential-guest-support.h" is specific to system
emulation, so move it under the system/ namespace.
Mechanical change doing:

  $ sed -i \
    -e 's,exec/confidential-guest-support.h,sysemu/confidential-guest-support.h,' \
        $(git grep -l exec/confidential-guest-support.h)

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-Id: <20241218155913.72288-2-philmd@linaro.org>
2024-12-20 17:44:56 +01:00
Philippe Mathieu-Daudé
32cad1ffb8 include: Rename sysemu/ -> system/
Headers in include/sysemu/ are not only related to system
*emulation*, they are also used by virtualization. Rename
as system/ which is clearer.

Files renamed manually then mechanical change using sed tool.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Lei Yang <leiyang@redhat.com>
Message-Id: <20241203172445.28576-1-philmd@linaro.org>
2024-12-20 17:44:56 +01:00
Philippe Mathieu-Daudé
63cda19446 target/i386/sev: Reduce system specific declarations
"system/confidential-guest-support.h" is not needed,
remove it. Reorder #ifdef'ry to reduce declarations
exposed on user emulation.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-Id: <20241218155913.72288-3-philmd@linaro.org>
2024-12-20 17:44:56 +01:00
Peter Xu
5cfd38a2e7 qom: Create system containers explicitly
Always explicitly create QEMU system containers upfront.

Root containers will be created when trying to fetch the root object the
1st time.  They are:

  /objects
  /chardevs
  /backend

Machine sub-containers will be created only until machine is being
initialized.  They are:

  /machine/unattached
  /machine/peripheral
  /machine/peripheral-anon

Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20241121192202.4155849-8-peterx@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-12-20 17:44:56 +01:00
Peter Xu
7c03a17c8d hw/ppc: Explicitly create the drc container
QEMU will start to not rely on implicit creations of containers soon.  Make
PPC drc devices follow by explicitly create the container whenever a drc
device is realized, dropping container_get() calls.

No functional change intended.

Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Daniel Henrique Barboza <danielhb413@gmail.com>
Cc: Harsh Prateek Bora <harshpb@linux.ibm.com>
Cc: qemu-ppc@nongnu.org
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20241121192202.4155849-7-peterx@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-12-20 17:44:55 +01:00
Peter Xu
6de3c4917f ppc/e500: Avoid abuse of container_get()
container_get() is going to become strict on not allowing to return a
non-container.

Switch the e500 user to use object_resolve_path_component() explicitly.

Cc: Bharat Bhushan <r65777@freescale.com>
Cc: qemu-ppc@nongnu.org
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-ID: <20241121192202.4155849-6-peterx@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-12-20 17:44:55 +01:00
Peter Xu
e469b331cd qom: Add TYPE_CONTAINER macro
Provide a macro for the container type across QEMU source tree, rather than
hard code it every time.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-ID: <20241121192202.4155849-2-peterx@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-12-20 17:44:55 +01:00
Jim Shu
1a65064c1f hw/riscv: Add the checking if DTB overlaps to kernel or initrd
DTB is placed to the end of memory, so we will check if the start
address of DTB overlaps to the address of kernel/initrd.

Signed-off-by: Jim Shu <jim.shu@sifive.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-ID: <20241120153935.24706-4-jim.shu@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2024-12-20 11:22:47 +10:00
Jim Shu
d3592955af hw/riscv: Add a new struct RISCVBootInfo
Add a new struct RISCVBootInfo to sync boot information between multiple
boot functions.

Signed-off-by: Jim Shu <jim.shu@sifive.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-ID: <20241120153935.24706-3-jim.shu@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2024-12-20 11:22:47 +10:00
Jim Shu
b4132a9e62 hw/riscv: Support to load DTB after 3GB memory on 64-bit system.
Larger initrd image will overlap the DTB at 3GB address. Since 64-bit
system doesn't have 32-bit addressable issue, we just load DTB to the end
of dram in 64-bit system.

Signed-off-by: Jim Shu <jim.shu@sifive.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-ID: <20241120153935.24706-2-jim.shu@sifive.com>
[ Changes by AF
 -  Store fdt_load_addr_hi32 in the reset vector
]
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2024-12-20 11:22:47 +10:00
Philippe Mathieu-Daudé
d2ed9fffba hw/char/riscv_htif: Clarify MemoryRegionOps expect 32-bit accesses
Looking at htif_mm_ops[] read/write handlers, we notice they
expect 32-bit values to accumulate into to the 'fromhost' and
'tohost' 64-bit variables. Explicit by setting the .impl
min/max fields.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-ID: <20241129154304.34946-4-philmd@linaro.org>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2024-12-20 11:22:47 +10:00
Philippe Mathieu-Daudé
be0a70b93f hw/char/riscv_htif: Explicit little-endian implementation
Since our RISC-V system emulation is only built for little
endian, the HTIF device aims to interface with little endian
memory accesses, thus we can explicit htif_mm_ops:endianness
being DEVICE_LITTLE_ENDIAN.

In that case tswap64() is equivalent to le64_to_cpu(), as in
"convert this 64-bit little-endian value into host cpu order".
Replace to simplify.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-ID: <20241129154304.34946-3-philmd@linaro.org>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2024-12-20 11:22:47 +10:00
Sia Jee Heng
6ab861421c hw/acpi: Upgrade ACPI SPCR table to support SPCR table revision 4 format
Update the SPCR table to accommodate the SPCR Table revision 4 [1].
The SPCR table has been modified to adhere to the revision 4 format [2].

[1]: https://learn.microsoft.com/en-us/windows-hardware/drivers/serports/serial-port-console-redirection-table
[2]: https://github.com/acpica/acpica/pull/931

Signed-off-by: Sia Jee Heng <jeeheng.sia@starfivetech.com>
Reviewed-by: Sunil V L <sunilvl@ventanamicro.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Message-ID: <20241028015744.624943-3-jeeheng.sia@starfivetech.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2024-12-20 11:22:47 +10:00
Sai Pavan Boddu
77aad42ee2 hw/riscv: Add Microblaze V generic board
Add a basic board with interrupt controller (intc), timer, serial
(uartlite), small memory called LMB@0 (128kB) and DDR@0x80000000
(configured via command line eg. -m 2g).
This is basic configuration which matches HW generated out of AMD Vivado
(design tools). But initial configuration is going beyond what it is
configured by default because validation should be done on other
configurations too. That's why wire also additional uart16500, axi
ethernet(with axi dma).
GPIOs, i2c and qspi is also listed for completeness.

IRQ map is: (addr)
0 - timer (0x41c00000)
1 - uartlite (0x40600000)
2 - i2c (0x40800000)
3 - qspi (0x44a00000)
4 - uart16550 (0x44a10000)
5 - emaclite (0x40e00000)
6 - timer2 (0x41c10000)
7 - axi emac (0x40c00000)
8 - axi dma (0x41e00000)
9 - axi dma
10 - gpio (0x40000000)
11 - gpio2 (0x40010000)
12 - gpio3 (0x40020000)

Signed-off-by: Sai Pavan Boddu <sai.pavan.boddu@amd.com>
Signed-off-by: Michal Simek <michal.simek@amd.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20241125134739.18189-1-sai.pavan.boddu@amd.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2024-12-20 11:22:47 +10:00
Daniel Henrique Barboza
e0c87e3067 hw/intc/riscv_aplic: add kvm_msicfgaddr for split mode aplic-imsic
The last step to enable KVM AIA aplic-imsic with irqchip in split mode
is to deal with how MSIs are going to be sent. In our current design we
don't allow an APLIC controller to send MSIs unless it's on m-mode. And
we also do not allow Supervisor MSI address configuration via the
'smsiaddrcfg' and 'smsiaddrcfgh' registers unless it's also a m-mode
APLIC controller.

Add a new RISCVACPLICState attribute called 'kvm_msicfgaddr'. This
attribute represents the base configuration address for MSIs, in our
case the base addr of the IMSIC controller. This attribute is being set
only when running irqchip_split() mode with aia=aplic-imsic.

During riscv_aplic_msi_send() we'll check if the attribute was set to
skip the check for a m-mode APLIC controller and to change the resulting
MSI addr by adding kvm_msicfgaddr right before address_space_stl_le().

Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20241119191706.718860-7-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2024-12-20 11:22:47 +10:00
Daniel Henrique Barboza
b319ef15b8 hw/riscv/virt.c, riscv_aplic.c: add 'emulated_aplic' helpers
The current logic to determine if we don't need an emulated APLIC
controller, i.e. KVM will provide for us, is to determine if we're
running KVM, with in-kernel irqchip support, and running
aia=aplic-imsic. This is modelled by riscv_is_kvm_aia_aplic_imsic() and
virt_use_kvm_aia_aplic_imsic().

This won't suffice to support irqchip_split() mode: it will match
exactly the same conditions as the one above, but setting the irqchip to
'split' mode will now require us to emulate an APLIC s-mode controller,
like we're doing with 'aia=aplic'.

Create a new riscv_use_emulated_aplic() helper that will encapsulate
this logic. Replace the uses of "riscv_is_kvm_aia_aplic_imsic()" with
this helper every time we're taking a decision on emulate an APLIC
controller or not. Do the same in virt.c with virt_use_emulated_aplic().

Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20241119191706.718860-6-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2024-12-20 11:22:47 +10:00
Daniel Henrique Barboza
2711e1e324 hw/riscv/virt.c: rename helper to virt_use_kvm_aia_aplic_imsic()
Similar to the riscv_is_kvm_aia_aplic_imsic() helper from riscv_aplic.c,
the existing virt_use_kvm_aia() is testing for KVM aia=aplic-imsic with
in-kernel irqchip enabled. It is not checking for a generic AIA support.

Rename the helper to virt_use_kvm_aia_aplic_imsic() to reflect what the
helper is doing, and use the existing riscv_is_kvm_aia_aplic_imsic() to
obscure details such as the presence of the in-kernel irqchip.

Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20241119191706.718860-4-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2024-12-20 11:22:47 +10:00
Daniel Henrique Barboza
01948b1dea hw/riscv/virt.c: reduce virt_use_kvm_aia() usage
In create_fdt_sockets() we have the following pattern:

    if (kvm_enabled() && virt_use_kvm_aia(s)) {
        (... do stuff ...)
    } else {
        (... do other stuff ...)
    }
    if (kvm_enabled() && virt_use_kvm_aia(s)) {
        (... do more stuff ...)
    } else {
        (... do more other stuff)
    }

Do everything in a single if/else clause to reduce the usage of
virt_use_kvm_aia() helper and to make the code a bit less repetitive.

Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20241119191706.718860-3-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2024-12-20 11:22:47 +10:00
Daniel Henrique Barboza
7d0b35b3c9 hw/intc/riscv_aplic: rename is_kvm_aia()
The helper is_kvm_aia() is checking not only for AIA, but for
aplic-imsic (i.e. "aia=aplic-imsic" in 'virt' RISC-V machine) with an
in-kernel chip present.

Rename it to be a bit clear what the helper is doing since we'll add
more AIA helpers in the next patches.

Make the helper public because the 'virt' machine will use it as well.

Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20241119191706.718860-2-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2024-12-20 11:22:47 +10:00
Daniel Henrique Barboza
9afd26715e hw/riscv/riscv-iommu: implement reset protocol
Add a riscv_iommu_reset() helper in the base emulation code that
implements the expected reset behavior as defined by the riscv-iommu
spec.

Devices can then use this helper in their own reset callbacks.

Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20241106133407.604587-7-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2024-12-20 11:22:46 +10:00
Daniel Henrique Barboza
01c1caa9d1 hw/riscv/virt.c, riscv-iommu-sys.c: add MSIx support
MSIx support is added in the RISC-V IOMMU platform device by including
the required MSIx facilities to alow software to properly setup the MSIx
subsystem.

We took inspiration of what is being done in the riscv-iommu-pci device,
mainly msix_init() and msix_notify(), while keeping in mind that
riscv-iommu-sys isn't a true PCI device and we don't need to copy/paste
all the contents of these MSIx functions.

Two extra MSI MemoryRegions were added: 'msix-table' and 'msix-pba'.
They are used to manage r/w of the MSI table and Pending Bit Array (PBA)
respectively. Both are subregions of the main IOMMU memory region,
iommu->regs_mr, initialized during riscv_iommu_realize(), and each one
has their own handlers for MSIx reads and writes.

This is the expected memory map when using this device in the 'virt'
machine:

    0000000003010000-0000000003010fff (prio 0, i/o): riscv-iommu-regs
      0000000003010300-000000000301034f (prio 0, i/o): msix-table
      0000000003010400-0000000003010407 (prio 0, i/o): msix-pba

We're now able to set IGS to RISCV_IOMMU_CAP_IGS_BOTH, and userspace is
free to decide which interrupt model to use.

Enabling MSIx support for this device in the 'virt' machine requires
adding 'msi-parent' in the iommu-sys DT.

Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20241106133407.604587-6-dbarboza@ventanamicro.com>
[ Changes by AF:
 - Used PRIx64 in trace
]
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2024-12-20 11:20:17 +10:00
Sunil V L
2c12de1460 hw/riscv/virt: Add IOMMU as platform device if the option is set
Add a new machine option called 'iommu-sys' that enables a
riscv-iommu-sys platform device for the 'virt' machine. The option is
default 'off'.

The device will use IRQs 36 to 39.

We will not support both riscv-iommu-sys and riscv-iommu-pci devices in
the same board in this first implementation. If a riscv-iommu-pci device
is added in the command line we will disable the riscv-iommu-sys device.

Signed-off-by: Sunil V L <sunilvl@ventanamicro.com>
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20241106133407.604587-5-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2024-12-20 11:19:16 +10:00
Tomasz Jeznach
5b128435dc hw/riscv: add riscv-iommu-sys platform device
This device models the RISC-V IOMMU as a sysbus device. The same design
decisions taken in the riscv-iommu-pci device were kept, namely the
existence of 4 vectors are available for each interrupt cause.

The WSIs are emitted using the input of the s->notify() callback as a
index to an IRQ list. The IRQ list starts at 'base_irq' and goes until
base_irq + 3. This means that boards must have 4 contiguous IRQ lines
available, starting from 'base_irq'.

Signed-off-by: Tomasz Jeznach <tjeznach@rivosinc.com>
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20241106133407.604587-4-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2024-12-20 11:19:16 +10:00
Daniel Henrique Barboza
d13346d105 hw/riscv/riscv-iommu: parametrize CAP.IGS
Interrupt Generation Support (IGS) is a capability that is tied to the
interrupt deliver mechanism, not with the core IOMMU emulation. We
should allow device implementations to set IGS as they wish.

A new helper is added to make it easier for device impls to set IGS. Use
it in our existing IOMMU device (riscv-iommu-pci) to set
RISCV_IOMMU_CAPS_IGS_MSI.

Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20241106133407.604587-3-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2024-12-20 11:19:16 +10:00
Daniel Henrique Barboza
4876e6f7b5 hw/riscv/riscv-iommu.c: add riscv_iommu_instance_init()
Move all the static initializion of the device to an init() function,
leaving only the dynamic initialization to be done during realize.

With this change s->cap is initialized with RISCV_IOMMU_CAP_DBG during
init(), and realize() will increment s->cap with the extra caps.

This will allow callers to add IOMMU capabilities before the
realization.

Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20241106133407.604587-2-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2024-12-20 11:19:16 +10:00
Yong-Xuan Wang
0d0141fadc hw/intc/riscv_aplic: Fix APLIC in_clrip and clripnum write emulation
In the section "4.7 Precise effects on interrupt-pending bits"
of the RISC-V AIA specification defines that:

"If the source mode is Level1 or Level0 and the interrupt domain
is configured in MSI delivery mode (domaincfg.DM = 1):
The pending bit is cleared whenever the rectified input value is
low, when the interrupt is forwarded by MSI, or by a relevant
write to an in_clrip register or to clripnum."

Update the riscv_aplic_set_pending() to match the spec.

Fixes: bf31cf06eb ("hw/intc/riscv_aplic: Fix setipnum_le write emulation for APLIC MSI-mode")
Signed-off-by: Yong-Xuan Wang <yongxuan.wang@sifive.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20241029085349.30412-1-yongxuan.wang@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2024-12-20 11:19:16 +10:00
Jason Chien
e5d28bf2b3 hw/riscv/riscv-iommu.c: Correct the validness check of iova
From RISCV IOMMU spec section 2.1.3:
When SXL is 1, the following rules apply:
- If the first-stage is not Bare, then a page fault corresponding to the
original access type occurs if the IOVA has bits beyond bit 31 set to 1.
- If the second-stage is not Bare, then a guest page fault corresponding
to the original access type occurs if the incoming GPA has bits beyond bit
33 set to 1.

From RISCV IOMMU spec section 2.3 step 17:
Use the process specified in Section "Two-Stage Address Translation" of
the RISC-V Privileged specification to determine the GPA accessed by the
transaction.

From RISCV IOMMU spec section 2.3 step 19:
Use the second-stage address translation process specified in Section
"Two-Stage Address Translation" of the RISC-V Privileged specification
to translate the GPA A to determine the SPA accessed by the transaction.

This commit adds the iova check with the following rules:
- For Sv32, Sv32x4, Sv39x4, Sv48x4 and Sv57x4, the iova must be zero
extended.
- For Sv39, Sv48 and Sv57, the iova must be signed extended with most
significant bit.

Signed-off-by: Jason Chien <jason.chien@sifive.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-ID: <20241114065617.25133-1-jason.chien@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2024-12-20 11:19:16 +10:00
Stefan Hajnoczi
9863d46a5a loongarch queue
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQQNhkKjomWfgLCz0aQfewwSUazn0QUCZ2PKBQAKCRAfewwSUazn
 0QAZAQCxbLnvzOb9TPORlg5w0n/xFaKCL7dJbJE4WjlM7dhLkAEA5G8JVoP5Ju2B
 mcK7wbymyXNX1ocsukL/JM2JavHS+AI=
 =JoSk
 -----END PGP SIGNATURE-----

Merge tag 'pull-loongarch-20241219' of https://gitlab.com/bibo-mao/qemu into staging

loongarch queue

# -----BEGIN PGP SIGNATURE-----
#
# iHUEABYKAB0WIQQNhkKjomWfgLCz0aQfewwSUazn0QUCZ2PKBQAKCRAfewwSUazn
# 0QAZAQCxbLnvzOb9TPORlg5w0n/xFaKCL7dJbJE4WjlM7dhLkAEA5G8JVoP5Ju2B
# mcK7wbymyXNX1ocsukL/JM2JavHS+AI=
# =JoSk
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 19 Dec 2024 02:23:49 EST
# gpg:                using EDDSA key 0D8642A3A2659F80B0B3D1A41F7B0C1251ACE7D1
# gpg: Good signature from "bibo mao <maobibo@loongson.cn>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 7044 3A00 19C0 E97A 31C7  13C4 8E86 8FB7 A176 9D4C
#      Subkey fingerprint: 0D86 42A3 A265 9F80 B0B3  D1A4 1F7B 0C12 51AC E7D1

* tag 'pull-loongarch-20241219' of https://gitlab.com/bibo-mao/qemu:
  hw/intc/loongarch_extioi: Code cleanup about loongarch_extioi
  hw/intc/loongarch_extioi: Add pre_save interface
  hw/intc/loongarch_extioi: Inherit from loongarch_extioi_common
  hw/intc/loongarch_extioi: Add common file loongarch_extioi_common
  hw/intc/loongarch_extioi: Add unrealize interface
  hw/intc/loongarch_extioi: Add common realize interface
  hw/intc/loongarch_extioi: Rename LoongArchExtIOI with LoongArchExtIOICommonState
  include: Rename LoongArchExtIOI with LoongArchExtIOICommonState
  include: Move struct LoongArchExtIOI to header file loongarch_extioi_common
  include: Add loongarch_extioi_common header file
  hw/intc/loongarch_pch: Code cleanup about loongarch_pch_pic
  hw/intc/loongarch_pch: Add pre_save and post_load interfaces
  hw/intc/loongarch_pch: Inherit from loongarch_pic_common
  hw/intc/loongarch_pch: Move some functions to file loongarch_pic_common
  hw/intc/loongarch_pch: Rename LoongArchPCHPIC with LoongArchPICCommonState
  hw/intc/loongarch_pch: Merge instance_init() into realize()
  include: Move struct LoongArchPCHPIC to loongarch_pic_common header file
  include: Add loongarch_pic_common header file

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2024-12-19 15:46:43 -05:00
Richard Henderson
b1987a2547 Constify all opaque Property pointers
Via sed "s/  Property [*]/  const Property */".

The opaque pointers passed to ObjectProperty callbacks are
the last instances of non-const Property pointers in the tree.
For the most part, these callbacks only use object_field_prop_ptr,
which now takes a const pointer itself.

This logically should have accompanied d36f165d95 which
allowed const Property to be registered.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Lei Yang <leiyang@redhat.com>
Link: https://lore.kernel.org/r/20241218134251.4724-25-richard.henderson@linaro.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-12-19 19:36:37 +01:00
Richard Henderson
857c4a8a33 hw/core/qdev-properties: Constify Property argument to PropertyInfo.print
This logically should have accompanied d36f165d95 which
allowed const Property to be registered.

There is exactly one instance of this method: print_pci_devfn.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Lei Yang <leiyang@redhat.com>
Link: https://lore.kernel.org/r/20241218134251.4724-24-richard.henderson@linaro.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-12-19 19:36:37 +01:00
Richard Henderson
e9c0346a95 hw/core/qdev-properties: Constify Property argument to object_field_prop_ptr
This logically should have accompanied d36f165d95 which
allowed const Property to be registered.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Lei Yang <leiyang@redhat.com>
Link: https://lore.kernel.org/r/20241218134251.4724-23-richard.henderson@linaro.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-12-19 19:36:37 +01:00
Richard Henderson
5fcabe628b include/hw/qdev-properties: Remove DEFINE_PROP_END_OF_LIST
Now that all of the Property arrays are counted, we can remove
the terminator object from each array.  Update the assertions
in device_class_set_props to match.

With struct Property being 88 bytes, this was a rather large
form of terminator.  Saves 30k from qemu-system-aarch64.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Lei Yang <leiyang@redhat.com>
Link: https://lore.kernel.org/r/20241218134251.4724-21-richard.henderson@linaro.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-12-19 19:36:37 +01:00