Commit graph

17318 commits

Author SHA1 Message Date
Nicholas Piggin
ce5a32d180 ppc/pnv: Move the PNOR LPC address into struct PnvPnor
Rather than use the hardcoded define throughout the tree for the
PNOR LPC address, keep it within the PnvPnor object.

This should solve a dead code issue in the BMC HIOMAP checks where
Coverity (correctly) reported that the sanity checks are dead code.
We would like to keep the sanity checks without turning them into a
compile time assert in case we would like to make them configurable
in future.

Fixes: 4c84a0a4a6 ("ppc/pnv: Add a PNOR address and size sanity checks")
Resolves: Coverity CID 1593723
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2025-03-20 19:58:10 +10:00
Steve Sistare
1632a2017f migration: cpr_is_incoming
Define the cpr_is_incoming helper, to be used in several cpr fix patches.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Message-ID: <1741380954-341079-2-git-send-email-steven.sistare@oracle.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2025-03-14 09:29:19 -03:00
Stefan Hajnoczi
0462a32b4f Block layer patches
- virtio-scsi: add iothread-vq-mapping parameter
 - Improve writethrough performance
 - Fix missing zero init in bdrv_snapshot_goto()
 - Added scripts/qcow2-to-stdout.py
 - Code cleanup and iotests fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCAAvFiEE3D3rFZqa+V09dFb+fwmycsiPL9YFAmfTDysRHGt3b2xmQHJl
 ZGhhdC5jb20ACgkQfwmycsiPL9Yz6A//asOl37zjbtf9pYjY/gliH859TQOppPGD
 LB9IIr+nTDME0wfUkCOlag+CeEYZwkeo2PF+XeopsyzlJeBOk4tL7AkY57XYe3lZ
 M5hlnNrn6l3gb6iioMg60pEKSMrpKprB16vT3nAtyN6aEXsm9TvtPkWPFTCFGVeK
 W74VCr7wuXbfdEJcOGd8WhB9ZHIgwoWYnoL41tvCoefW2yNaMA6X0TLn98toXzOi
 il50ZnnchTQngns5R+n+1R1Ma995t393D+CArQcYVRzxKGOs5p0y4otz4gCkMhdp
 GVL09R7Ge4TteSJ2myxlN/EjYOxmdoMrVDajr4xPdHBw12MKzgk8i82h4/Es/Q5o
 3Npgx74+jDyqlICb/czTVM5KJINpyO80vO3N3WpYUOQGyTCcYgv7pIpy8pB2o6Te
 RPlv0W9bHVSSgThFFLQ0Ud8WRGJe1K/ar8bdmiWN08Wez1avENWaYmsv5zGnFL24
 vD6cNXMR4mF7mzyeWda/5hGKv75djVgX+ZfzvWNT3qgizD56JBOA3RdCRwBZJOJb
 TvJkfi5RGyaji9BfKVCYBL3/iDELJEVDW8jxvIIUrS0aPcTHpAQ5gTO7VAokreqZ
 5Smll11eeoEgPPvNLw8ikmOGTWOMkJGrmExP2K1ApANq3kSbBSU4jroEr0BG9PZT
 6Y0hUdtFSdU=
 =w2Ri
 -----END PGP SIGNATURE-----

Merge tag 'for-upstream' of https://repo.or.cz/qemu/kevin into staging

Block layer patches

- virtio-scsi: add iothread-vq-mapping parameter
- Improve writethrough performance
- Fix missing zero init in bdrv_snapshot_goto()
- Added scripts/qcow2-to-stdout.py
- Code cleanup and iotests fixes

# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEE3D3rFZqa+V09dFb+fwmycsiPL9YFAmfTDysRHGt3b2xmQHJl
# ZGhhdC5jb20ACgkQfwmycsiPL9Yz6A//asOl37zjbtf9pYjY/gliH859TQOppPGD
# LB9IIr+nTDME0wfUkCOlag+CeEYZwkeo2PF+XeopsyzlJeBOk4tL7AkY57XYe3lZ
# M5hlnNrn6l3gb6iioMg60pEKSMrpKprB16vT3nAtyN6aEXsm9TvtPkWPFTCFGVeK
# W74VCr7wuXbfdEJcOGd8WhB9ZHIgwoWYnoL41tvCoefW2yNaMA6X0TLn98toXzOi
# il50ZnnchTQngns5R+n+1R1Ma995t393D+CArQcYVRzxKGOs5p0y4otz4gCkMhdp
# GVL09R7Ge4TteSJ2myxlN/EjYOxmdoMrVDajr4xPdHBw12MKzgk8i82h4/Es/Q5o
# 3Npgx74+jDyqlICb/czTVM5KJINpyO80vO3N3WpYUOQGyTCcYgv7pIpy8pB2o6Te
# RPlv0W9bHVSSgThFFLQ0Ud8WRGJe1K/ar8bdmiWN08Wez1avENWaYmsv5zGnFL24
# vD6cNXMR4mF7mzyeWda/5hGKv75djVgX+ZfzvWNT3qgizD56JBOA3RdCRwBZJOJb
# TvJkfi5RGyaji9BfKVCYBL3/iDELJEVDW8jxvIIUrS0aPcTHpAQ5gTO7VAokreqZ
# 5Smll11eeoEgPPvNLw8ikmOGTWOMkJGrmExP2K1ApANq3kSbBSU4jroEr0BG9PZT
# 6Y0hUdtFSdU=
# =w2Ri
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 14 Mar 2025 01:00:27 HKT
# gpg:                using RSA key DC3DEB159A9AF95D3D7456FE7F09B272C88F2FD6
# gpg:                issuer "kwolf@redhat.com"
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full]
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74  56FE 7F09 B272 C88F 2FD6

* tag 'for-upstream' of https://repo.or.cz/qemu/kevin: (23 commits)
  scripts/qcow2-to-stdout.py: Add script to write qcow2 images to stdout
  virtio-scsi: only expose cmd vqs via iothread-vq-mapping
  virtio-scsi: handle ctrl virtqueue in main loop
  virtio-scsi: add iothread-vq-mapping parameter
  virtio: extract iothread-vq-mapping.h API
  virtio-blk: tidy up iothread_vq_mapping functions
  virtio-blk: extract cleanup_iothread_vq_mapping() function
  virtio-scsi: perform TMFs in appropriate AioContexts
  virtio-scsi: protect events_dropped field
  virtio-scsi: introduce event and ctrl virtqueue locks
  scsi: introduce requests_lock
  scsi: track per-SCSIRequest AioContext
  dma: use current AioContext for dma_blk_io()
  scsi-disk: drop unused SCSIDiskState->bh field
  iotests: Limit qsd-migrate to working formats
  aio-posix: Adjust polling time also for new handlers
  aio-posix: Separate AioPolledEvent per AioHandler
  aio-posix: Factor out adjust_polling_time()
  aio: Create AioPolledEvent
  block/io: Ignore FUA with cache.no-flush=on
  ...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2025-03-14 09:31:13 +08:00
Stefan Hajnoczi
bcede51d2d virtio-scsi: handle ctrl virtqueue in main loop
Previously the ctrl virtqueue was handled in the AioContext where SCSI
requests are processed. When IOThread Virtqueue Mapping was added things
become more complicated because SCSI requests could run in other
AioContexts.

Simplify by handling the ctrl virtqueue in the main loop where reset
operations can be performed. Note that BHs are still used canceling SCSI
requests in their AioContexts but at least the mean loop activity
doesn't need BHs anymore.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20250311132616.1049687-13-stefanha@redhat.com>
Tested-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2025-03-13 17:57:23 +01:00
Stefan Hajnoczi
2e8e18c2e4 virtio-scsi: add iothread-vq-mapping parameter
Allow virtio-scsi virtqueues to be assigned to different IOThreads. This
makes it possible to take advantage of host multi-queue block layer
scalability by assigning virtqueues that have affinity with vCPUs to
different IOThreads that have affinity with host CPUs. The same feature
was introduced for virtio-blk in the past:
https://developers.redhat.com/articles/2024/09/05/scaling-virtio-blk-disk-io-iothread-virtqueue-mapping

Here are fio randread 4k iodepth=64 results from a 4 vCPU guest with an
Intel P4800X SSD:
iothreads IOPS
------------------------------
1         189576
2         312698
4         346744

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20250311132616.1049687-12-stefanha@redhat.com>
Tested-by: Peter Krempa <pkrempa@redhat.com>
[kwolf: Updated 051 output, virtio-scsi can now use any iothread]
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2025-03-13 17:57:23 +01:00
Stefan Hajnoczi
b50629c335 virtio: extract iothread-vq-mapping.h API
The code that builds an array of AioContext pointers indexed by the
virtqueue is not specific to virtio-blk. virtio-scsi will need to do the
same thing, so extract the functions.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20250311132616.1049687-11-stefanha@redhat.com>
Tested-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2025-03-13 17:57:23 +01:00
Stefan Hajnoczi
7d8ab5b2f7 virtio-scsi: protect events_dropped field
The block layer can invoke the resize callback from any AioContext that
is processing requests. The virtqueue is already protected but the
events_dropped field also needs to be protected against races. Cover it
using the event virtqueue lock because it is closely associated with
accesses to the virtqueue.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20250311132616.1049687-7-stefanha@redhat.com>
Tested-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2025-03-13 17:57:23 +01:00
Stefan Hajnoczi
b348ca2e04 virtio-scsi: introduce event and ctrl virtqueue locks
Virtqueues are not thread-safe. Until now this was not a major issue
since all virtqueue processing happened in the same thread. The ctrl
queue's Task Management Function (TMF) requests sometimes need the main
loop, so a BH was used to schedule the virtqueue completion back in the
thread that has virtqueue access.

When IOThread Virtqueue Mapping is introduced in later commits, event
and ctrl virtqueue accesses from other threads will become necessary.
Introduce an optional per-virtqueue lock so the event and ctrl
virtqueues can be protected in the commits that follow.

The addition of the ctrl virtqueue lock makes
virtio_scsi_complete_req_from_main_loop() and its BH unnecessary.
Instead, take the ctrl virtqueue lock from the main loop thread.

The cmd virtqueue does not have a lock because the entirety of SCSI
command processing happens in one thread. Only one thread accesses the
cmd virtqueue and a lock is unnecessary.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20250311132616.1049687-6-stefanha@redhat.com>
Tested-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2025-03-13 17:57:23 +01:00
Stefan Hajnoczi
1cf18cc9bf scsi: introduce requests_lock
SCSIDevice keeps track of in-flight requests for device reset and Task
Management Functions (TMFs). The request list requires protection so
that multi-threaded SCSI emulation can be implemented in commits that
follow.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20250311132616.1049687-5-stefanha@redhat.com>
Tested-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2025-03-13 17:57:23 +01:00
Stefan Hajnoczi
7eecba3778 scsi: track per-SCSIRequest AioContext
Until now, a SCSIDevice's I/O requests have run in a single AioContext.
In order to support multiple IOThreads it will be necessary to move to
the concept of a per-SCSIRequest AioContext.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20250311132616.1049687-4-stefanha@redhat.com>
Tested-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2025-03-13 17:57:23 +01:00
Stefan Hajnoczi
a89c3c9b2c dma: use current AioContext for dma_blk_io()
In the past a single AioContext was used for block I/O and it was
fetched using blk_get_aio_context(). Nowadays the block layer supports
running I/O from any AioContext and multiple AioContexts at the same
time. Remove the dma_blk_io() AioContext argument and use the current
AioContext instead.

This makes calling the function easier and enables multiple IOThreads to
use dma_blk_io() concurrently for the same block device.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20250311132616.1049687-3-stefanha@redhat.com>
Tested-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2025-03-13 17:57:23 +01:00
Kevin Wolf
ee416407b3 aio-posix: Separate AioPolledEvent per AioHandler
Adaptive polling has a big problem: It doesn't consider that an event
loop can wait for many different events that may have very different
typical latencies.

For example, think of a guest that tends to send a new I/O request soon
after the previous I/O request completes, but the storage on the host is
rather slow. In this case, getting the new request from guest quickly
means that polling is enabled, but the next thing is performing the I/O
request on the backend, which is slow and disables polling again for the
next guest request. This means that in such a scenario, polling could
help for every other event, but is only ever enabled when it can't
succeed.

In order to fix this, keep a separate AioPolledEvent for each
AioHandler. We will then know that the backend file descriptor always
has a high latency and isn't worth polling for, but we also know that
the guest is always fast and we should poll for it. This solves at least
half of the problem, we can now keep polling for those cases where it
makes sense and get the improved performance from it.

Since the event loop doesn't know which event will be next, we still do
some unnecessary polling while we're waiting for the slow disk. I made
some attempts to be more clever than just randomly growing and shrinking
the polling time, and even to let callers be explicit about when they
expect a new event, but so far this hasn't resulted in improved
performance or even caused performance regressions. For now, let's just
fix the part that is easy enough to fix, we can revisit the rest later.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20250307221634.71951-6-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2025-03-13 17:57:23 +01:00
Kevin Wolf
518db1013c aio: Create AioPolledEvent
As a preparation for having multiple adaptive polling states per
AioContext, move the 'ns' field into a separate struct.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20250307221634.71951-4-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2025-03-13 17:57:23 +01:00
Kevin Wolf
984a32f17e file-posix: Support FUA writes
Until now, FUA was always emulated with a separate flush after the write
for file-posix. The overhead of processing a second request can reduce
performance significantly for a guest disk that has disabled the write
cache, especially if the host disk is already write through, too, and
the flush isn't actually doing anything.

Advertise support for REQ_FUA in write requests and implement it for
Linux AIO and io_uring using the RWF_DSYNC flag for write requests. The
thread pool still performs a separate fdatasync() call. This can be
improved later by using the pwritev2() syscall if available.

As an example, this is how fio numbers can be improved in some scenarios
with this patch (all using virtio-blk with cache=directsync on an nvme
block device for the VM, fio with ioengine=libaio,direct=1,sync=1):

                              | old           | with FUA support
------------------------------+---------------+-------------------
bs=4k, iodepth=1, numjobs=1   |  45.6k iops   |  56.1k iops
bs=4k, iodepth=1, numjobs=16  | 183.3k iops   | 236.0k iops
bs=4k, iodepth=16, numjobs=1  | 258.4k iops   | 311.1k iops

However, not all scenarios are clear wins. On another slower disk I saw
little to no improvment. In fact, in two corner case scenarios, I even
observed a regression, which I however consider acceptable:

1. On slow host disks in a write through cache mode, when the guest is
   using virtio-blk in a separate iothread so that polling can be
   enabled, and each completion is quickly followed up with a new
   request (so that polling gets it), it can happen that enabling FUA
   makes things slower - the additional very fast no-op flush we used to
   have gave the adaptive polling algorithm a success so that it kept
   polling. Without it, we only have the slow write request, which
   disables polling. This is a problem in the polling algorithm that
   will be fixed later in this series.

2. With a high queue depth, it can be beneficial to have flush requests
   for another reason: The optimisation in bdrv_co_flush() that flushes
   only once per write generation acts as a synchronisation mechanism
   that lets all requests complete at the same time. This can result in
   better batching and if the disk is very fast (I only saw this with a
   null_blk backend), this can make up for the overhead of the flush and
   improve throughput. In theory, we could optionally introduce a
   similar artificial latency in the normal completion path to achieve
   the same kind of completion batching. This is not implemented in this
   series.

Compatibility is not a concern for the kernel side of io_uring, it has
supported RWF_DSYNC from the start. However, io_uring_prep_writev2() is
not available before liburing 2.2.

Linux AIO started supporting it in Linux 4.13 and libaio 0.3.111. The
kernel is not a problem for any supported build platform, so it's not
necessary to add runtime checks. However, openSUSE is still stuck with
an older libaio version that would break the build.

We must detect the presence of the writev2 functions in the user space
libraries at build time to avoid build failures.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20250307221634.71951-2-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2025-03-13 17:44:55 +01:00
Stefan Hajnoczi
4c33c097f3 Misc HW patches
- Set correct values for MPC8569E's eSDHC (Zoltan)
 - Emulate Ricoh RS5C372 RTC device (Bernhard)
 - Array overflow fixes in SMSC91C111 netdev (Peter)
 - Fix typo in Xen HVM (Philippe)
 - Move graphic height/width/depth globals to their own file (Philippe)
 - Introduce qemu_arch_available() helper (Philippe)
 - Check fw_cfg's ACPI availability at runtime (Philippe)
 - Remove virtio-mem dependency on CONFIG_DEVICES (Philippe)
 - Sort HyperV SYNDBG API definitions (Pierrick)
 - Remove need for SDHCI_VENDOR_FSL definition (Philippe)
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmfRXiMACgkQ4+MsLN6t
 wN5zFhAAzSW/hZneD8hycKtr9nBlvZSD72cEt+b656OCbTyyucUi1sG4rMPMvHeW
 h6HP6xt2SfQxXbec6Y0pWxWUkBOQzk72s0zpttOED3oEspkrId2D+VSsSH1E+QLh
 WoG7/hVgz0bDHexWYIDdGufO4no/icwewAKmC5Kp2HbaNxIIHyWlK1+RO69/lCLN
 s3qkNesMsQyEWN28ogEMRqyCIG3oJVP76U4TVcdxIiE51WI8sP8/7V2um0AXN68m
 IV3INrfVJjGDp501elrUbD3qsYopRdxoMAvwiVojrLXin6xtS+SQjEe/hcNxzM70
 0IQPp9WWwLjNkeFlAJF4wpwGJttFNHj+5gtH7/YRrP75jt9kAxPXkFw/OFfpVd30
 NYbeFlWDhRL1QPBs+WPBZTrfD7fRmpfMJRLF3/w61+WvnVrshlyDaoCWbR+L329F
 uOQFsBdAD7m/lkZ0mHtskS2vkZx7Itn1av4gql7T7/6cE1R7ItKy1HY9UUCtY6Gp
 7V6XrsAE3khg2HY8IcJ73+sPLQn/GxqZFE7PqmAhgcl6RZEFQv8PNrEgFxCEYyuK
 KJjx0hRMLoigp0CEclLfOqz2d3knsI8SJbgD4iTYQc02E69lx8a4XS4N8JXoLEdh
 3i/ndwKEFmzwNuqbU0nYsSJDiAO9ejra8O2BXZS/a4pkxC2jtdw=
 =VVr6
 -----END PGP SIGNATURE-----

Merge tag 'hw-misc-20250312' of https://github.com/philmd/qemu into staging

Misc HW patches

- Set correct values for MPC8569E's eSDHC (Zoltan)
- Emulate Ricoh RS5C372 RTC device (Bernhard)
- Array overflow fixes in SMSC91C111 netdev (Peter)
- Fix typo in Xen HVM (Philippe)
- Move graphic height/width/depth globals to their own file (Philippe)
- Introduce qemu_arch_available() helper (Philippe)
- Check fw_cfg's ACPI availability at runtime (Philippe)
- Remove virtio-mem dependency on CONFIG_DEVICES (Philippe)
- Sort HyperV SYNDBG API definitions (Pierrick)
- Remove need for SDHCI_VENDOR_FSL definition (Philippe)

# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmfRXiMACgkQ4+MsLN6t
# wN5zFhAAzSW/hZneD8hycKtr9nBlvZSD72cEt+b656OCbTyyucUi1sG4rMPMvHeW
# h6HP6xt2SfQxXbec6Y0pWxWUkBOQzk72s0zpttOED3oEspkrId2D+VSsSH1E+QLh
# WoG7/hVgz0bDHexWYIDdGufO4no/icwewAKmC5Kp2HbaNxIIHyWlK1+RO69/lCLN
# s3qkNesMsQyEWN28ogEMRqyCIG3oJVP76U4TVcdxIiE51WI8sP8/7V2um0AXN68m
# IV3INrfVJjGDp501elrUbD3qsYopRdxoMAvwiVojrLXin6xtS+SQjEe/hcNxzM70
# 0IQPp9WWwLjNkeFlAJF4wpwGJttFNHj+5gtH7/YRrP75jt9kAxPXkFw/OFfpVd30
# NYbeFlWDhRL1QPBs+WPBZTrfD7fRmpfMJRLF3/w61+WvnVrshlyDaoCWbR+L329F
# uOQFsBdAD7m/lkZ0mHtskS2vkZx7Itn1av4gql7T7/6cE1R7ItKy1HY9UUCtY6Gp
# 7V6XrsAE3khg2HY8IcJ73+sPLQn/GxqZFE7PqmAhgcl6RZEFQv8PNrEgFxCEYyuK
# KJjx0hRMLoigp0CEclLfOqz2d3knsI8SJbgD4iTYQc02E69lx8a4XS4N8JXoLEdh
# 3i/ndwKEFmzwNuqbU0nYsSJDiAO9ejra8O2BXZS/a4pkxC2jtdw=
# =VVr6
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 12 Mar 2025 18:12:51 HKT
# gpg:                using RSA key FAABE75E12917221DCFD6BB2E3E32C2CDEADC0DE
# gpg: Good signature from "Philippe Mathieu-Daudé (F4BUG) <f4bug@amsat.org>" [full]
# Primary key fingerprint: FAAB E75E 1291 7221 DCFD  6BB2 E3E3 2C2C DEAD C0DE

* tag 'hw-misc-20250312' of https://github.com/philmd/qemu:
  hw/sd/sdhci: Remove need for SDHCI_VENDOR_IMX definition
  hw/hyperv/hyperv-proto: Move SYNDBG definitions from target/i386
  hw/virtio/virtio-mem: Remove CONFIG_DEVICES include
  hw/i386/fw_cfg: Check ACPI availability with acpi_builtin()
  hw/acpi: Introduce acpi_builtin() helper
  system: Replace arch_type global by qemu_arch_available() helper
  system: Extract target-specific globals to their own compilation unit
  hw/xen/hvm: Fix Aarch64 typo
  hw/net/smc91c111: Don't allow data register access to overrun buffer
  hw/net/smc91c111: Use MAX_PACKET_SIZE instead of magic numbers
  hw/net/smc91c111: Sanitize packet length on tx
  hw/net/smc91c111: Sanitize packet numbers
  hw/rtc: Add Ricoh RS5C372 RTC emulation
  hw/sd/sdhci: Set reset value of interrupt registers

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2025-03-13 10:35:25 +08:00
Stefan Hajnoczi
74b3445378 vfio queue:
* Fixed endianness of VFIO device state packets
 * Improved IGD passthrough support with legacy mode
 * Improved build
 * Added support for old AMD GPUs (x550)
 * Updated property documentation
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEoPZlSPBIlev+awtgUaNDx8/77KEFAmfQfQcACgkQUaNDx8/7
 7KEUNw/+PjFpHrz5muQ8itkbyd36eJJdcxCl+9IPIWfnUfB582epkLcgvWyswGUo
 krFTregoRG0PKtgZDtv95owGtVJOgK6XYFadGHiYkvvsb41twOYsP7/SuI+KMiEv
 IDFLMvCTyorSIIoEF8i2EexfGPRV1VoWwvBoHgRRmYlzwzXnufjABpoZ0a25DTye
 DQ4yhSfqoIh1gOcdL9tPictnZg9OxKr2ePXNdrtymtEIhg3ZobD3Jd8J4WCcsfKT
 fxxBO5NsGgA8oM7i02fYN9kgMwqTnVhSAu1wq9PXsbrnNXam+trywAWSO6CjL+rV
 ++STWNSrRoHzuotRBr7BzrTpTFyQyfwBWqUT5L4NlhgXB3Xybk+M6Zj08Yva8pjE
 w78JQKvKp54gU34AWBW0/J6+u3v+iE8l1Eywx6xueF9Q+YSUDeW9B1LDdjFJryhF
 d8j3J+vuglbdsp05D+tVErf5cqFvFDfrjTkXkZNtmx7wky45XS9ZvNazYW1KI3f9
 bg8Wjb7ZujuvxpSjycPRZzdKa8kqSgSZg7fg91Wimiy1Iqe3SZVVWNchLYiPp8Dm
 nXMfOEpVHQZ1vzeo7dVWyxu9Y1ujgvUQy8kMa9q2W2S7HQ5Sna79n7eMVJxqZQ4G
 m0ETFToOcPPOnZBWgqNOSUlSQncFuIVgNTDvycQ9dMhGorYcBDI=
 =Vh0m
 -----END PGP SIGNATURE-----

Merge tag 'pull-vfio-20250311' of https://github.com/legoater/qemu into staging

vfio queue:

* Fixed endianness of VFIO device state packets
* Improved IGD passthrough support with legacy mode
* Improved build
* Added support for old AMD GPUs (x550)
* Updated property documentation

# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEoPZlSPBIlev+awtgUaNDx8/77KEFAmfQfQcACgkQUaNDx8/7
# 7KEUNw/+PjFpHrz5muQ8itkbyd36eJJdcxCl+9IPIWfnUfB582epkLcgvWyswGUo
# krFTregoRG0PKtgZDtv95owGtVJOgK6XYFadGHiYkvvsb41twOYsP7/SuI+KMiEv
# IDFLMvCTyorSIIoEF8i2EexfGPRV1VoWwvBoHgRRmYlzwzXnufjABpoZ0a25DTye
# DQ4yhSfqoIh1gOcdL9tPictnZg9OxKr2ePXNdrtymtEIhg3ZobD3Jd8J4WCcsfKT
# fxxBO5NsGgA8oM7i02fYN9kgMwqTnVhSAu1wq9PXsbrnNXam+trywAWSO6CjL+rV
# ++STWNSrRoHzuotRBr7BzrTpTFyQyfwBWqUT5L4NlhgXB3Xybk+M6Zj08Yva8pjE
# w78JQKvKp54gU34AWBW0/J6+u3v+iE8l1Eywx6xueF9Q+YSUDeW9B1LDdjFJryhF
# d8j3J+vuglbdsp05D+tVErf5cqFvFDfrjTkXkZNtmx7wky45XS9ZvNazYW1KI3f9
# bg8Wjb7ZujuvxpSjycPRZzdKa8kqSgSZg7fg91Wimiy1Iqe3SZVVWNchLYiPp8Dm
# nXMfOEpVHQZ1vzeo7dVWyxu9Y1ujgvUQy8kMa9q2W2S7HQ5Sna79n7eMVJxqZQ4G
# m0ETFToOcPPOnZBWgqNOSUlSQncFuIVgNTDvycQ9dMhGorYcBDI=
# =Vh0m
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 12 Mar 2025 02:12:23 HKT
# gpg:                using RSA key A0F66548F04895EBFE6B0B6051A343C7CFFBECA1
# gpg: Good signature from "Cédric Le Goater <clg@redhat.com>" [full]
# gpg:                 aka "Cédric Le Goater <clg@kaod.org>" [full]
# Primary key fingerprint: A0F6 6548 F048 95EB FE6B  0B60 51A3 43C7 CFFB ECA1

* tag 'pull-vfio-20250311' of https://github.com/legoater/qemu: (21 commits)
  vfio/pci: Drop debug commentary from x-device-dirty-page-tracking
  vfio/pci-quirks: Exclude non-ioport BAR from ATI quirk
  hw/vfio: Compile display.c once
  hw/vfio: Compile iommufd.c once
  hw/vfio: Compile more objects once
  hw/vfio: Compile some common objects once
  hw/vfio/common: Get target page size using runtime helpers
  hw/vfio/common: Include missing 'system/tcg.h' header
  hw/vfio/spapr: Do not include <linux/kvm.h>
  system: Declare qemu_[min/max]rampagesize() in 'system/hostmem.h'
  vfio/migration: Use BE byte order for device state wire packets
  vfio/igd: Fix broken KVMGT OpRegion support
  vfio/igd: Introduce x-igd-lpc option for LPC bridge ID quirk
  vfio/igd: Handle x-igd-opregion option in config quirk
  vfio/igd: Decouple common quirks from legacy mode
  vfio/igd: Refactor vfio_probe_igd_bar4_quirk into pci config quirk
  vfio/pci: Add placeholder for device-specific config space quirks
  vfio/igd: Move LPC bridge initialization to a separate function
  vfio/igd: Consolidate OpRegion initialization into a single function
  vfio/igd: Do not include GTT stolen size in etc/igd-bdsm-size
  ...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2025-03-13 10:35:12 +08:00
Philippe Mathieu-Daudé
7f2a5272ff hw/sd/sdhci: Remove need for SDHCI_VENDOR_IMX definition
All instances of TYPE_IMX_USDHC set vendor=SDHCI_VENDOR_IMX.
No need to special-case it.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20250308213640.13138-3-philmd@linaro.org>
2025-03-12 11:11:42 +01:00
Pierrick Bouvier
003d35ad6c hw/hyperv/hyperv-proto: Move SYNDBG definitions from target/i386
Allows SYNDBG definitions to be available for common compilation units.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Message-ID: <20250307215623.524987-5-pierrick.bouvier@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2025-03-11 20:03:27 +01:00
Philippe Mathieu-Daudé
e6ffea40e2 hw/acpi: Introduce acpi_builtin() helper
acpi_builtin() can be used to check at runtime whether
the ACPI subsystem is built in a qemu-system binary.

Reviewed-by: Ani Sinha <anisinha@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20250307223949.54040-3-philmd@linaro.org>
2025-03-11 20:03:26 +01:00
Philippe Mathieu-Daudé
44ac8eaff0 system: Replace arch_type global by qemu_arch_available() helper
qemu_arch_available() is a bit simpler to understand while
reviewing than the undocumented arch_type variable.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20250305005225.95051-5-philmd@linaro.org>
2025-03-11 20:03:26 +01:00
Philippe Mathieu-Daudé
3a11b653a6 hw/xen/hvm: Fix Aarch64 typo
There is no TARGET_ARM_64 definition. Luckily enough,
when TARGET_AARCH64 is defined, TARGET_ARM also is.

Fixes: 733766cd37 ("hw/arm: introduce xenpvh machine")
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20250305153929.43687-2-philmd@linaro.org>
2025-03-11 20:03:26 +01:00
BALATON Zoltan
d060b2789f hw/sd/sdhci: Set reset value of interrupt registers
The interrupt enable registers are not reset to 0 on Freescale eSDHC
but some bits are enabled on reset. At least some U-Boot versions seem
to expect this and not initialise these registers before expecting
interrupts. Use existing vendor property for Freescale eSDHC and set
the reset value of the interrupt registers to match Freescale
documentation.

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Message-ID: <20250210160329.DDA7F4E600E@zero.eik.bme.hu>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2025-03-11 20:00:16 +01:00
Philippe Mathieu-Daudé
c6cd30fead system: Declare qemu_[min/max]rampagesize() in 'system/hostmem.h'
Both qemu_minrampagesize() and qemu_maxrampagesize() are
related to host memory backends, having the following call
stack:

  qemu_minrampagesize()
     -> find_min_backend_pagesize()
         -> object_dynamic_cast(obj, TYPE_MEMORY_BACKEND)

  qemu_maxrampagesize()
     -> find_max_backend_pagesize()
        -> object_dynamic_cast(obj, TYPE_MEMORY_BACKEND)

Having TYPE_MEMORY_BACKEND defined in "system/hostmem.h":

  include/system/hostmem.h:23:#define TYPE_MEMORY_BACKEND "memory-backend"

Move their prototype declaration to "system/hostmem.h".

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20250308230917.18907-7-philmd@linaro.org>
Acked-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/qemu-devel/20250311085743.21724-2-philmd@linaro.org
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-03-11 17:01:14 +01:00
Kevin Wolf
000a41b69c block: Remove unused blk_op_is_blocked()
Commit fc4e394b28 removed the last caller of blk_op_is_blocked(). Remove
the now unused function.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20250206165331.379033-1-kwolf@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2025-03-11 15:49:14 +01:00
Nicholas Piggin
d91b101da1 spapr: Generate random HASHPKEYR for spapr machines
The hypervisor is expected to create a value for the HASHPKEY SPR for
each partition. Currently it uses zero for all partitions, use a
random number instead, which in theory might make kernel ROP protection
more secure.

Signed-of-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20241219034035.1826173-4-npiggin@gmail.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2025-03-11 22:43:32 +10:00
Vaibhav Jain
5f7d861e65 spapr: nested: Add support for reporting Hostwide state counter
Add support for reporting Hostwide state counters for nested KVM pseries
guests running with 'cap-nested-papr' on Qemu-TCG acting as
L0-hypervisor. The Hostwide state counters are statistics about state that
L0-hypervisor maintains for the L2-guests and represent the state of all
L2-guests, not just a specific one.

These stats counters are exposed to L1-Hypervisor by the L0-Hypervisor via a
new bit-flag named 'getHostWideState' for the H_GUEST_GET_STATE hcall which
is documented at [1]. Once this flag is set the hcall should populate the
Guest-State-Elements in the requested GSB with the stat counter
values. Currently following five counters are supported:

* l0_guest_heap_size_inuse
* l0_guest_heap_size_max
* l0_guest_pagetable_size_inuse
* l0_guest_pagetable_size_max
* l0_guest_pagetable_reclaimed

At the moment '0' is being reported for all these counters as these
counters doesn't align with how L0-Qemu manages Guest memory.

The patch implements support for these counters by adding new members to
the 'struct SpaprMachineStateNested'. These new members are then plugged
into the existing 'guest_state_element_types[]' with the help of a new
macro 'GSBE_NESTED_MACHINE_DW' together with a new helper
'get_machine_ptr()'. guest_state_request_check() is updated to ensure
correctness of the requested GSB and finally h_guest_getset_state() is
updated to handle the newly introduced flag
'GUEST_STATE_REQUEST_HOST_WIDE'.

This patch is tested with the proposed linux-kernel implementation to
expose these stat-counter as perf-events at [2].

[1]
https://lore.kernel.org/all/20241222140247.174998-2-vaibhav@linux.ibm.com

[2]
https://lore.kernel.org/all/20241222140247.174998-1-vaibhav@linux.ibm.com

Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Message-ID: <20250221155449.530645-1-vaibhav@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2025-03-11 22:43:32 +10:00
Shivaprasad G Bhat
5f361ea187 ppc: spapr: Enable 2nd DAWR on Power10 pSeries machine
As per the PAPR, bit 0 of byte 64 in pa-features property
indicates availability of 2nd DAWR registers. i.e. If this bit is set, 2nd
DAWR is present, otherwise not. Use KVM_CAP_PPC_DAWR1 capability to find
whether kvm supports 2nd DAWR or not. If it's supported, allow user to set
the pa-feature bit in guest DT using cap-dawr1 machine capability.

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.ibm.com>
Message-ID: <173708681866.1678.11128625982438367069.stgit@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2025-03-11 22:43:32 +10:00
Chalapathi V
a613b9d321 hw/ssi/pnv_spi: Put a limit to RDR match failures
There is a possibility that SPI controller can get into loop due to indefinite
RDR match failures. Hence put a limit to failures and stop the sequencer.

Signed-off-by: Chalapathi V <chalapathi.v@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Message-ID: <20250303141328.23991-5-chalapathi.v@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2025-03-11 22:43:31 +10:00
Chalapathi V
7192d7b7fe hw/ssi/pnv_spi: Make bus names distinct for each controllers of a socket
Create a spi buses with distinct names on each socket so that responders
are attached to correct SPI controllers.

Change the bus name to chipX.spi.<busnum> where X = 0..<num_sockets>

QOM tree on a 2 socket machine:
(qemu) info qom-tree
/machine (powernv10-machine)
  /chip[0] (power10_v2.0-pnv-chip)
    /pib_spic[0] (pnv-spi)
      /chip0.spi.0 (SSI)
      /xscom-spi[0] (memory-region)
  /chip[1] (power10_v2.0-pnv-chip)
    /pib_spic[0] (pnv-spi)
      /chip1.spi.0 (SSI)
      /xscom-spi[0] (memory-region)

Signed-off-by: Chalapathi V <chalapathi.v@linux.ibm.com>
Message-ID: <20250303141328.23991-4-chalapathi.v@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2025-03-11 22:43:31 +10:00
Chalapathi V
17befecda8 hw/ssi/pnv_spi: Replace PnvXferBuffer with Fifo8 structure
In PnvXferBuffer dynamically allocating and freeing is a
process overhead. Hence used an existing Fifo8 buffer with
capacity of 16 bytes.

Signed-off-by: Chalapathi V <chalapathi.v@linux.ibm.com>
Message-ID: <20250303141328.23991-2-chalapathi.v@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2025-03-11 22:43:31 +10:00
Frederic Barrat
0b266ae15e ppc/xive2: Check crowd backlog when scanning group backlog
When processing a backlog scan for group interrupts, also take
into account crowd interrupts.

Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Michael Kowal <kowal@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2025-03-11 22:43:31 +10:00
Frederic Barrat
1a3cc1209b ppc/xive2: Support crowd-matching when looking for target
XIVE crowd sizes are encoded into a 2-bit field as follows:
  0: 0b00
  2: 0b01
  4: 0b10
 16: 0b11

A crowd size of 8 is not supported.

If an END is defined with the 'crowd' bit set, then a target can be
running on different blocks. It means that some bits from the block
VP are masked when looking for a match. It is similar to groups, but
on the block instead of the VP index.

Most of the changes are due to passing the extra argument 'crowd' all
the way to the function checking for matches.

Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Glenn Miles <milesg@linux.vnet.ibm.com>
Signed-off-by: Michael Kowal <kowal@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2025-03-11 22:43:31 +10:00
Frederic Barrat
96a2132ce9 ppc/xive2: Add support for MMIO operations on the NVPG/NVC BAR
Add support for the NVPG and NVC BARs.  Access to the BAR pages will
cause backlog counter operations to either increment or decriment
the counter.

Also added qtests for the same.

Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Michael Kowal <kowal@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2025-03-11 22:43:31 +10:00
Frederic Barrat
26c55b9941 ppc/xive2: Process group backlog when updating the CPPR
When the hypervisor or OS pushes a new value to the CPPR, if the LSMFB
value is lower than the new CPPR value, there could be a pending group
interrupt in the backlog, so it needs to be scanned.

Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Michael Kowal <kowal@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2025-03-11 22:43:31 +10:00
Frederic Barrat
58fa4433e0 ppc/xive2: Add undelivered group interrupt to backlog
When a group interrupt cannot be delivered, we need to:
- increment the backlog counter for the group in the NVG table
  (if the END is configured to keep a backlog).
- start a broadcast operation to set the LSMFB field on matching CPUs
  which can't take the interrupt now because they're running at too
  high a priority.

[npiggin: squash in fixes from milesg]
[milesg: only load the NVP if the END is !ignore]
[milesg: always broadcast backlog, not only when there are precluded VPs]

Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Michael Kowal <kowal@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2025-03-11 22:43:31 +10:00
Frederic Barrat
9cb7f6ebed ppc/xive2: Support group-matching when looking for target
If an END has the 'i' bit set (ignore), then it targets a group of
VPs. The size of the group depends on the VP index of the target
(first 0 found when looking at the least significant bits of the
index) so a mask is applied on the VP index of a running thread to
know if we have a match.

Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Michael Kowal <kowal@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2025-03-11 22:43:31 +10:00
Frederic Barrat
9d2b6058c5 ppc/xive2: Add grouping level to notification
The NSR has a (so far unused) grouping level field. When a interrupt
is presented, that field tells the hypervisor or OS if the interrupt
is for an individual VP or for a VP-group/crowd. This patch reworks
the presentation API to allow to set/unset the level when
raising/accepting an interrupt.

It also renames xive_tctx_ipb_update() to xive_tctx_pipr_update() as
the IPB is only used for VP-specific target, whereas the PIPR always
needs to be updated.

Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Michael Kowal <kowal@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2025-03-11 22:43:31 +10:00
Michael Kowal
a45580ad03 ppc/xive: Rename ipb_to_pipr() to xive_ipb_to_pipr()
Rename to follow the convention of the other function names.

Signed-off-by: Michael Kowal <kowal@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2025-03-11 22:43:31 +10:00
Frederic Barrat
19db3b5a24 ppc/xive2: Update NVP save/restore for group attributes
If the 'H' attribute is set on the NVP structure, the hardware
automatically saves and restores some attributes from the TIMA in the
NVP structure.

The group-specific attributes LSMFB, LGS and T have an extra flag to
individually control what is saved/restored.

Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Michael Kowal <kowal@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2025-03-11 22:43:31 +10:00
Nicholas Piggin
b899de9a3d ppc/pnv: Move PNOR to offset 0 in the ISA FW space
skiboot has a bug that does not handle ISA FW access correctly for IDSEL
devices > 0, and the current PNOR default address and size puts 64MB in
device 0 and 64MB in device 1, which causes skiboot to hit this bug and
breaks PNOR accesses.

Move the PNOR address down to 0 for now, so a 256MB PNOR can be accessed
via device 0.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2025-03-11 22:43:30 +10:00
Nicholas Piggin
a1750b2cba ppc/pnv/occ: Implement a basic dynamic OCC model
The OCC is an On Chip Controller that handles various thermal and power
management. It is a PPC405 microcontroller that runs its own firmware
which is out of scope of the powernv machine model. Some dynamic
behaviour and interfaces that are important for host CPU testing can be
implemented with a much simpler state machine.

This change adds a 100ms timer that ticks through a simple state machine
that looks for "OCC command requests" coming from host firmware, and
responds to them.

For now the powercap command is implemented because that is used by
OPAL and exported to Linux and is easy to test.

  $ F=/sys/firmware/opal/powercap/system-powercap/powercap-current
  $ cat $F
  100
  $ echo 50 | sudo tee $F
  50
  $ cat $F
  50

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2025-03-11 22:43:30 +10:00
Nicholas Piggin
70bc5c2498 ppc/pnv: Make HOMER memory a RAM region
The HOMER is a region of memory used by host and firmware and
microconrollers. It has very little logic by itself, just some BAR
registers. Users of this memory should operate on it rather than
have HOMER implement them with MMIO registers, which is not the
right model.

This change switches the implementation of HOMER from MMIO to RAM,
and moves the OCC register implementation to in-memory structure
accesses performed by the OCC model.

This has the downside that access to unimplemented regions of HOMER
are no longer flagged. Perhaps that could be done by adding a memory
region for HOMER, and ram subregions under that for each implemented
part. But for now this takes the simpler approach.

Note: This brings some data structure definitions from skiboot, which
does not match QEMU coding style but is not changed to make comparisons
and updates simpler.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2025-03-11 22:43:30 +10:00
Nicholas Piggin
2935a3fb03 ppc/pnv/homer: class-based base and size
Put HOMER memory region base and size into the class, to allow more
code-reuse between different machines in later changes.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2025-03-11 22:43:30 +10:00
Nicholas Piggin
6b56bb6dbc ppc/pnv/phb4: Add pervasive chiplet support to PHB4/5
Each non-core chiplet on a chip has a "pervasive chiplet" unit and its
xscom register set. This adds support for PHB4/5.

skiboot reads the CPLT_CONF1 register in __phb4/5_get_max_link_width(),
which shows up as unimplemented xscom reads. Set a value in PCI CONF1
register's link-width field to demonstrate skiboot doing something
interesting with it.

In the bigger picture, it might be better to model the pervasive
chiplet type as parent that each non-core chiplet model derives from.
For now this is enough to get the PHB registers implemented and working
for skiboot, and provides a second example (after the N1 chiplet) that
will help if the design is reworked as such.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2025-03-11 22:43:30 +10:00
Stefan Hajnoczi
825b96dbce Migration pull request
- Fix use-after-free in incoming migration
 - Improve cpr migration blocker for volatile ram
 - Fix RDMA migration
 -----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEqhtIsKIjJqWkw2TPx5jcdBvsMZ0FAmfPaCAQHGZhcm9zYXNA
 c3VzZS5kZQAKCRDHmNx0G+wxnQy9EADRp/6GaSzoqWgafU8DGM5Q69HyKiZ888DZ
 7qXqJeH3c95nvOnIw2BMhUYX4t8kkAbUcWlr7L8KCjZT/6N/d1/Z5fimqymRkw4x
 +8kDyADv5FY0339aMLf3qBbIAQj/gvPvg8H+e+hXfokZqoYgLXZ0eqNAz8MjIcyN
 +A+waEBMLNvTgZyTQl2TbCvb+mbRial8u8C9BIoILhn/gNuoMX7lbt0tq41HZwe0
 l3v16jnXlsDvQUXp99bGySomRgkcYqdAt+HWHLje3frT/Ap8dGaUJKlpgJ8DXJiA
 fV1reKihJdj37q9GSG8cR02W+ATBesiecufV4TUPNQYQzTdxn3fOMwdc3Pck074D
 YAQxFT20OPou+NRxjYoHT/GqFUY36/2qBJpt7TY3ramdklHJhXpRyedK4rppTZNn
 pC3lnbpA/LHRmfD1Nh0CRmqZpbV+qW1BWEgMwk4qui46BxYWHxKHFpxAuwlJQmcw
 RxY8qPhIXQM03tiTgIddBNDZLoVqRoUP7YpzR7MMa1rz0T5inNFMcNGm72WpKODE
 rzpw4ezXO7+D4/QmMq3PoPfhFv3QFnH6jaGj8JkJM378KLvh4fQ0woXtDKFl4Tbq
 1oBZ17WUv6aHr75b+KMyKJNLinvMu5WF5WoRYIt1lNXaqk7I494yvIjtRrimWZIS
 Z5Q0tpUmpw==
 =yEH0
 -----END PGP SIGNATURE-----

Merge tag 'migration-20250310-pull-request' of https://gitlab.com/farosas/qemu into staging

Migration pull request

- Fix use-after-free in incoming migration
- Improve cpr migration blocker for volatile ram
- Fix RDMA migration

# -----BEGIN PGP SIGNATURE-----
#
# iQJEBAABCAAuFiEEqhtIsKIjJqWkw2TPx5jcdBvsMZ0FAmfPaCAQHGZhcm9zYXNA
# c3VzZS5kZQAKCRDHmNx0G+wxnQy9EADRp/6GaSzoqWgafU8DGM5Q69HyKiZ888DZ
# 7qXqJeH3c95nvOnIw2BMhUYX4t8kkAbUcWlr7L8KCjZT/6N/d1/Z5fimqymRkw4x
# +8kDyADv5FY0339aMLf3qBbIAQj/gvPvg8H+e+hXfokZqoYgLXZ0eqNAz8MjIcyN
# +A+waEBMLNvTgZyTQl2TbCvb+mbRial8u8C9BIoILhn/gNuoMX7lbt0tq41HZwe0
# l3v16jnXlsDvQUXp99bGySomRgkcYqdAt+HWHLje3frT/Ap8dGaUJKlpgJ8DXJiA
# fV1reKihJdj37q9GSG8cR02W+ATBesiecufV4TUPNQYQzTdxn3fOMwdc3Pck074D
# YAQxFT20OPou+NRxjYoHT/GqFUY36/2qBJpt7TY3ramdklHJhXpRyedK4rppTZNn
# pC3lnbpA/LHRmfD1Nh0CRmqZpbV+qW1BWEgMwk4qui46BxYWHxKHFpxAuwlJQmcw
# RxY8qPhIXQM03tiTgIddBNDZLoVqRoUP7YpzR7MMa1rz0T5inNFMcNGm72WpKODE
# rzpw4ezXO7+D4/QmMq3PoPfhFv3QFnH6jaGj8JkJM378KLvh4fQ0woXtDKFl4Tbq
# 1oBZ17WUv6aHr75b+KMyKJNLinvMu5WF5WoRYIt1lNXaqk7I494yvIjtRrimWZIS
# Z5Q0tpUmpw==
# =yEH0
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 11 Mar 2025 06:30:56 HKT
# gpg:                using RSA key AA1B48B0A22326A5A4C364CFC798DC741BEC319D
# gpg:                issuer "farosas@suse.de"
# gpg: Good signature from "Fabiano Rosas <farosas@suse.de>" [unknown]
# gpg:                 aka "Fabiano Almeida Rosas <fabiano.rosas@suse.com>" [unknown]
# gpg: WARNING: The key's User ID is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: AA1B 48B0 A223 26A5 A4C3  64CF C798 DC74 1BEC 319D

* tag 'migration-20250310-pull-request' of https://gitlab.com/farosas/qemu:
  migration: Prioritize RDMA in ram_save_target_page()
  migration: ram block cpr blockers
  migration: Fix UAF for incoming migration on MigrationState

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2025-03-11 09:32:07 +08:00
Stefan Hajnoczi
1a5f3d2eee Xen queue:
* xen/passthrough: use gsi to map pirq when dom0 is PVH
 * Fix missing xenstore node from xen-block backend
 * Fix xen mapcache extraneous invalidate
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEE+AwAYwjiLP2KkueYDPVXL9f7Va8FAmfO+nEACgkQDPVXL9f7
 Va+QYggA9dmxMGDO05UEd2ZPv/Goub37Le44qBN4oeXizVRZgGUs2w9ETBXhPZus
 34aI8CTID4fcH4rgF4LgJ4XuyOxYwP1ot8EpDHQg+ji2nyHeMpAyePTfubprq17U
 APN6Qqefd9X+TX+W9zUS5jV/AXO+apGX+tmVkVexFuy4gSRGSVCPoibHePtoLH9G
 3rSREjdEx7ByY6ieCV5x3zHPp5tmnLWeHpNCVc5x6NplBslQduBz6vOqLNWB1LKO
 3a/lYcvTn9PIla1zpvGNbeTsPv2lcdx3SccThcZmyTv2PDm1kzyUOIo1lSIP6bb3
 LjCl3dm1mfxAGEaZ+//rsRhTH8d5ew==
 =K79y
 -----END PGP SIGNATURE-----

Merge tag 'pull-xen-20250310' of https://xenbits.xen.org/git-http/people/aperard/qemu-dm into staging

Xen queue:

* xen/passthrough: use gsi to map pirq when dom0 is PVH
* Fix missing xenstore node from xen-block backend
* Fix xen mapcache extraneous invalidate

# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCgAdFiEE+AwAYwjiLP2KkueYDPVXL9f7Va8FAmfO+nEACgkQDPVXL9f7
# Va+QYggA9dmxMGDO05UEd2ZPv/Goub37Le44qBN4oeXizVRZgGUs2w9ETBXhPZus
# 34aI8CTID4fcH4rgF4LgJ4XuyOxYwP1ot8EpDHQg+ji2nyHeMpAyePTfubprq17U
# APN6Qqefd9X+TX+W9zUS5jV/AXO+apGX+tmVkVexFuy4gSRGSVCPoibHePtoLH9G
# 3rSREjdEx7ByY6ieCV5x3zHPp5tmnLWeHpNCVc5x6NplBslQduBz6vOqLNWB1LKO
# 3a/lYcvTn9PIla1zpvGNbeTsPv2lcdx3SccThcZmyTv2PDm1kzyUOIo1lSIP6bb3
# LjCl3dm1mfxAGEaZ+//rsRhTH8d5ew==
# =K79y
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 10 Mar 2025 22:42:57 HKT
# gpg:                using RSA key F80C006308E22CFD8A92E7980CF5572FD7FB55AF
# gpg: Good signature from "Anthony PERARD <anthony.perard@gmail.com>" [unknown]
# gpg:                 aka "Anthony PERARD <anthony.perard@vates.tech>" [unknown]
# gpg:                 aka "Anthony PERARD <anthony@xenproject.org>" [unknown]
# gpg:                 aka "Anthony PERARD <anthony.perard@citrix.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 5379 2F71 024C 600F 778A  7161 D8D5 7199 DF83 42C8
#      Subkey fingerprint: F80C 0063 08E2 2CFD 8A92  E798 0CF5 572F D7FB 55AF

* tag 'pull-xen-20250310' of https://xenbits.xen.org/git-http/people/aperard/qemu-dm:
  xen: No need to flush the mapcache for grants
  hw/xen: Add "mode" parameter to xen-block devices
  xen/passthrough: use gsi to map pirq when dom0 is PVH

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2025-03-11 09:31:36 +08:00
Stefan Hajnoczi
920aa48824 -----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEIV1G9IJGaJ7HfzVi7wSWWzmNYhEFAmfO1zkACgkQ7wSWWzmN
 YhET+wf+PkaGeFTNUrOtWpl35fSMKlmOVbb1fkPfuhVBmeY2Vh1EIN3OjqnzdV0F
 wxpuk+wwmFiuV1n6RNuMHQ0nz1mhgsSlZh93N5rArC/PUr3iViaT0cb82RjwxhaI
 RODBhhy7V9WxEhT9hR8sCP2ky2mrKgcYbjiIEw+IvFZOVQa58rMr2h/cbAb/iH4l
 7T9Wba03JBqOS6qgzSFZOMxvqnYdVjhqXN8M6W9ngRJOjPEAkTB6Evwep6anRjcM
 mCUOgkf2sgQwKve8pYAeTMkzXFctvTc/qCU4ZbN8XcoKVVxe2jllGQqdOpMskPEf
 slOuINeW5M0K5gyjsb/huqcOTfDI2A==
 =/Y0+
 -----END PGP SIGNATURE-----

Merge tag 'net-pull-request' of https://github.com/jasowang/qemu into staging

# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCAAdFiEEIV1G9IJGaJ7HfzVi7wSWWzmNYhEFAmfO1zkACgkQ7wSWWzmN
# YhET+wf+PkaGeFTNUrOtWpl35fSMKlmOVbb1fkPfuhVBmeY2Vh1EIN3OjqnzdV0F
# wxpuk+wwmFiuV1n6RNuMHQ0nz1mhgsSlZh93N5rArC/PUr3iViaT0cb82RjwxhaI
# RODBhhy7V9WxEhT9hR8sCP2ky2mrKgcYbjiIEw+IvFZOVQa58rMr2h/cbAb/iH4l
# 7T9Wba03JBqOS6qgzSFZOMxvqnYdVjhqXN8M6W9ngRJOjPEAkTB6Evwep6anRjcM
# mCUOgkf2sgQwKve8pYAeTMkzXFctvTc/qCU4ZbN8XcoKVVxe2jllGQqdOpMskPEf
# slOuINeW5M0K5gyjsb/huqcOTfDI2A==
# =/Y0+
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 10 Mar 2025 20:12:41 HKT
# gpg:                using RSA key 215D46F48246689EC77F3562EF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>" [full]
# Primary key fingerprint: 215D 46F4 8246 689E C77F  3562 EF04 965B 398D 6211

* tag 'net-pull-request' of https://github.com/jasowang/qemu:
  tap-linux: Open ipvtap and macvtap
  Revert "hw/net/net_tx_pkt: Fix overrun in update_sctp_checksum()"
  util/iov: Do not assert offset is in iov
  net: move backend cleanup to NIC cleanup
  net: parameterize the removing client from nc list

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2025-03-11 09:26:40 +08:00
Steve Sistare
094a3dbc55 migration: ram block cpr blockers
Unlike cpr-reboot mode, cpr-transfer mode cannot save volatile ram blocks
in the migration stream file and recreate them later, because the physical
memory for the blocks is pinned and registered for vfio.  Add a blocker
for volatile ram blocks.

Also add a blocker for RAM_GUEST_MEMFD.  Preserving guest_memfd may be
sufficient for CPR, but it has not been tested yet.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-ID: <1740667681-257312-1-git-send-email-steven.sistare@oracle.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2025-03-10 12:09:24 -03:00
Jiqian Chen
cfcacbab38 xen/passthrough: use gsi to map pirq when dom0 is PVH
In PVH dom0, when passthrough a device to domU, QEMU code
xen_pt_realize->xc_physdev_map_pirq wants to use gsi, but in current codes
the gsi number is got from file /sys/bus/pci/devices/<sbdf>/irq, that is
wrong, because irq is not equal with gsi, they are in different spaces, so
pirq mapping fails.

To solve above problem, use new interface of Xen, xc_pcidev_get_gsi to get
gsi and use xc_physdev_map_pirq_gsi to map pirq when dom0 is PVH.

Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
Acked-by: Anthony PERARD <anthony@xenproject.org>
Reviewed-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
Message-Id: <20241106061418.3655304-1-Jiqian.Chen@amd.com>
Signed-off-by: Anthony PERARD <anthony.perard@vates.tech>
2025-03-10 13:25:14 +01:00
Alex Bennée
4752ed5272 include/qemu: plugin-memory.h doesn't need cpu-defs.h
hwaddr is a fixed size on all builds.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250304222439.2035603-23-alex.bennee@linaro.org>
2025-03-10 10:30:01 +00:00