Commit graph

2726 commits

Author SHA1 Message Date
David Woodhouse
3634039b93 hw/acpi: Add vmclock device
The vmclock device addresses the problem of live migration with
precision clocks. The tolerances of a hardware counter (e.g. TSC) are
typically around ±50PPM. A guest will use NTP/PTP/PPS to discipline that
counter against an external source of 'real' time, and track the precise
frequency of the counter as it changes with environmental conditions.

When a guest is live migrated, anything it knows about the frequency of
the underlying counter becomes invalid. It may move from a host where
the counter running at -50PPM of its nominal frequency, to a host where
it runs at +50PPM. There will also be a step change in the value of the
counter, as the correctness of its absolute value at migration is
limited by the accuracy of the source and destination host's time
synchronization.

The device exposes a shared memory region to guests, which can be mapped
all the way to userspace. In the first phase, this merely advertises a
'disruption_marker', which indicates that the guest should throw away any
NTP synchronization it thinks it has, and start again.

Because the region can be exposed all the way to userspace, applications
can still use time from a fast vDSO 'system call', and check the
disruption marker to be sure that their timestamp is indeed truthful.

The structure also allows for the precise time, as known by the host, to
be exposed directly to guests so that they don't have to wait for NTP to
resync from scratch.

The values and fields are based on the nascent virtio-rtc specification,
and the intent is that a version (hopefully precisely this version) of
this structure will be included as an optional part of that spec. In the
meantime, a simple ACPI device along the lines of VMGENID is perfectly
sufficient and is compatible with what's being shipped in certain
commercial hypervisors.

Linux guest support was merged into the 6.13-rc1 kernel:
https://git.kernel.org/torvalds/c/205032724226

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
Message-Id: <07fd5e2f529098ad4d7cab1423fe9f4a03a9cc14.camel@infradead.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-01-15 17:43:24 -05:00
Igor Mammedov
0b05339198 pci: acpi: Windows 'PCI Label Id' bug workaround
Current versions of Windows call _DSM(func=7) regardless
of whether it is supported or not. It leads to NICs having bogus
'PCI Label Id = 0', where none should be set at all.

Also presence of 'PCI Label Id' triggers another Windows bug
on localized versions that leads to hangs. The later bug is fixed
in latest updates for 'Windows Server' but not in consumer
versions of Windows (and there is no plans to fix it
as far as I'm aware).

Given it's easy, implement Microsoft suggested workaround
(return invalid Package) so that affected Windows versions
could boot on QEMU.
This would effectvely remove bogus 'PCI Label Id's on NICs,
but MS teem confirmed that flipping 'PCI Label Id' should not
change 'Network Connection' ennumeration, so it should be safe
for QEMU to change _DSM without any compat code.

Smoke tested with WinXP and WS2022
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/774
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20250115125342.3883374-3-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-01-15 13:07:23 -05:00
Zhenzhong Duan
d9d32478ed intel_iommu: Introduce a property to control FS1GP cap bit setting
This gives user flexibility to turn off FS1GP for debug purpose.

It is also useful for future nesting feature. When host IOMMU doesn't
support FS1GP but vIOMMU does, nested page table on host side works
after turning FS1GP off in vIOMMU.

This property has no effect when vIOMMU is in legacy mode or x-flts=off
in scalable modme.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Clément Mathieu--Drif<clement.mathieu--drif@eviden.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20241212083757.605022-20-zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-01-15 13:06:36 -05:00
Zhenzhong Duan
aa68a9fbdb intel_iommu: Introduce a property x-flts for stage-1 translation
Intel VT-d 3.0 introduces scalable mode, and it has a bunch of capabilities
related to scalable mode translation, thus there are multiple combinations.

This vIOMMU implementation wants to simplify it with a new property "x-flts".
When turned on in scalable mode, stage-1 translation is supported. When turned
on in legacy mode, throw out error.

With stage-1 translation support exposed to user, also accurate the pasid entry
check in vtd_pe_type_check().

Suggested-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Clément Mathieu--Drif<clement.mathieu--drif@eviden.com>
Message-Id: <20241212083757.605022-19-zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-01-15 13:06:35 -05:00
Zhenzhong Duan
ddd84fd0c1 intel_iommu: Set default aw_bits to 48 starting from QEMU 9.2
According to VTD spec, stage-1 page table could support 4-level and
5-level paging.

However, 5-level paging translation emulation is unsupported yet.
That means the only supported value for aw_bits is 48. So default
aw_bits to 48 when stage-1 translation is turned on.

For legacy and scalable modes, 48 is the default choice for modern
OS when both 48 and 39 are supported. So it makes sense to set
default to 48 for these two modes too starting from QEMU 9.2.
Use pc_compat_9_1 to handle the compatibility for machines before
9.2.

Suggested-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Clément Mathieu--Drif<clement.mathieu--drif@eviden.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20241212083757.605022-17-zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-01-15 13:06:30 -05:00
Zhenzhong Duan
1ab93575bd intel_iommu: piotlb invalidation should notify unmap
This is used by some emulated devices which caches address
translation result. When piotlb invalidation issued in guest,
those caches should be refreshed.

There is already a similar implementation in iotlb invalidation.
So update vtd_iotlb_page_invalidate_notify() to make it work
also for piotlb invalidation.

For device that does not implement ATS capability or disable
it but still caches the translation result, it is better to
implement ATS cap or enable it if there is need to cache the
translation result.

Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Clément Mathieu--Drif<clement.mathieu--drif@eviden.com>
Message-Id: <20241212083757.605022-15-zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-01-15 13:06:27 -05:00
Clément Mathieu--Drif
dd8200503e intel_iommu: Add support for PASID-based device IOTLB invalidation
Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20241212083757.605022-14-zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-01-15 13:06:26 -05:00
Clément Mathieu--Drif
1075645d43 intel_iommu: Add an internal API to find an address space with PASID
This will be used to implement the device IOTLB invalidation

Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
Message-Id: <20241212083757.605022-13-zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-01-15 13:06:24 -05:00
Zhenzhong Duan
6ebe6cf2a0 intel_iommu: Process PASID-based iotlb invalidation
PASID-based iotlb (piotlb) is used during walking Intel
VT-d stage-1 page table.

This emulates the stage-1 page table iotlb invalidation requested
by a PASID-based IOTLB Invalidate Descriptor (P_IOTLB).

Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Clément Mathieu--Drif<clement.mathieu--drif@eviden.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
Message-Id: <20241212083757.605022-12-zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-01-15 13:06:23 -05:00
Zhenzhong Duan
16d4e418e9 intel_iommu: Flush stage-1 cache in iotlb invalidation
According to spec, Page-Selective-within-Domain Invalidation (11b):

1. IOTLB entries caching second-stage mappings (PGTT=010b) or pass-through
(PGTT=100b) mappings associated with the specified domain-id and the
input-address range are invalidated.
2. IOTLB entries caching first-stage (PGTT=001b) or nested (PGTT=011b)
mapping associated with specified domain-id are invalidated.

So per spec definition the Page-Selective-within-Domain Invalidation
needs to flush first stage and nested cached IOTLB entries as well.

We don't support nested yet and pass-through mapping is never cached,
so what in iotlb cache are only first-stage and second-stage mappings.

Add a tag pgtt in VTDIOTLBEntry to mark PGTT type of the mapping and
invalidate entries based on PGTT type.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Clément Mathieu--Drif<clement.mathieu--drif@eviden.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
Message-Id: <20241212083757.605022-11-zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-01-15 13:06:20 -05:00
Clément Mathieu--Drif
65c4f09999 intel_iommu: Set accessed and dirty bits during stage-1 translation
Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20241212083757.605022-10-zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-01-15 13:06:17 -05:00
Zhenzhong Duan
fed51ee5e0 intel_iommu: Check stage-1 translation result with interrupt range
Per VT-d spec 4.1 section 3.15, "Untranslated requests and translation
requests that result in an address in the interrupt range will be
blocked with condition code LGN.4 or SGN.8."

This applies to both stage-1 and stage-2 IOMMU page table, move the
check from vtd_iova_to_slpte() to vtd_do_iommu_translate() so stage-1
page table could also be checked.

By this chance, update the comment with correct section number.

Suggested-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Clément Mathieu--Drif<clement.mathieu--drif@eviden.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20241212083757.605022-9-zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-01-15 13:06:15 -05:00
Clément Mathieu--Drif
305e469b71 intel_iommu: Check if the input address is canonical
Stage-1 translation must fail if the address to translate is
not canonical.

Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
Message-Id: <20241212083757.605022-8-zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-01-15 13:06:14 -05:00
Yi Liu
eb9da9d263 intel_iommu: Implement stage-1 translation
This adds stage-1 page table walking to support stage-1 only
translation in scalable mode.

Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Co-developed-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20241212083757.605022-7-zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-01-15 13:06:12 -05:00
Yi Liu
eda4c9b5b3 intel_iommu: Rename slpte to pte
Because we will support both FST(a.k.a, FLT) and SST(a.k.a, SLT) translation,
rename variable and functions from slpte to pte whenever possible.

But some are SST only, they are renamed with sl_ prefix.

Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Co-developed-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
Message-Id: <20241212083757.605022-6-zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-01-15 13:06:09 -05:00
Zhenzhong Duan
ad0a7f1e1e intel_iommu: Flush stage-2 cache in PASID-selective PASID-based iotlb invalidation
Per VT-d spec 4.1, 6.5.2.4, "Table 21. PASID-based-IOTLB Invalidation",
PADID-selective PASID-based iotlb invalidation will flush stage-2 iotlb
entries with matching domain id and pasid.

With stage-1 translation introduced, guest could send PASID-selective
PASID-based iotlb invalidation to flush either stage-1 or stage-2 entries.

By this chance, remove old IOTLB related definitions which were unused.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Clément Mathieu--Drif<clement.mathieu--drif@eviden.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20241212083757.605022-5-zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-01-15 13:06:07 -05:00
Zhenzhong Duan
791346f93d intel_iommu: Add a placeholder variable for scalable mode stage-1 translation
Add an new element flts in IntelIOMMUState to mark stage-1 translation support
in scalable mode, this element will be exposed as an intel_iommu property
x-flts finally.

For now, it's only a placehholder and used for address width compatibility
check and block host device passthrough until nesting is supported.

Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Clément Mathieu--Drif<clement.mathieu--drif@eviden.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
Message-Id: <20241212083757.605022-4-zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-01-15 13:05:58 -05:00
Zhenzhong Duan
b291dae33d intel_iommu: Make pasid entry type check accurate
When guest configures Nested Translation(011b) or First-stage Translation only
(001b), type check passed unaccurately.

Fails the type check in those cases as their simulation isn't supported yet.

Fixes: fb43cf739e ("intel_iommu: scalable mode emulation")
Suggested-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Clément Mathieu--Drif<clement.mathieu--drif@eviden.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20241212083757.605022-3-zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-01-15 13:05:56 -05:00
Yu Zhang
a84e37af36 intel_iommu: Use the latest fault reasons defined by spec
Spec revision 3.0 or above defines more detailed fault reasons for
scalable mode. So introduce them into emulation code, see spec
section 7.1.2 for details.

Note spec revision has no relation with VERSION register, Guest
kernel should not use that register to judge what features are
supported. Instead cap/ecap bits should be checked.

Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Clément Mathieu--Drif<clement.mathieu--drif@eviden.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20241212083757.605022-2-zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-01-15 13:05:53 -05:00
Stefan Hajnoczi
290f950361 QOM & QDev patches
- Remove DeviceState::opts (Akihiko)
 - Replace container_get by machine/object_get_container (Peter)
 - Remove InterfaceInfo::concrete_class field (Paolo)
 - Reduce machine_containers[] scope (Philippe)
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmeABNgACgkQ4+MsLN6t
 wN4XtQ/+NyXEK9vjq+yXnk7LRxTDQBrXxNc71gLqNA8rGwXTuELIXOthNW+UM2a9
 CdnVbrIX/FRfQLXTHx0C2ENteafrR1oXDQmEOz1UeYgaCWJsNdVe3r1MYUdHcwVM
 90JcSbYhrvxFE/p/6WhTjjv2DXn4E8witsPwRc8EBi5bHeFz6cNPzhdF59A3ljZF
 0zr1MLHJHhwR6OoBbm9HM8x8i4Zw4LoKEjo8cCgcBfPQIMKf0HQ4XsinIDwn0VXN
 S3jIysNyGHlptHOiJuErILZtzrm4F2lGwYan89jxuElfWjC7SVB2z4CQkQtPceIJ
 HRBrE7VPwJ566OAThoSwPG3jXT1yCDOYmNCX1kJOMo9rYh3MwG0VrbMr5iwfYk8Z
 wO+8IyMAx7m8FibdsoMmxtI1PYTf0JQaCB6MSwdoAMMQVp1FDWBun2g+swLjQgO4
 15iSB+PMIZe7Ywd0b63VZrUMHKwMxd9RFYEbbsdA8DRI50W3HMQPZAJiGXt7RxJ9
 p9qxqg0WGpVjgTnInt/KH4axiWPD5cru+THVYk6dvOdtTM5wj2jEswWy2vQ6LkEF
 MgxaUXfja8E20AXvdr6uXKwcKOIJ9+TaU5AhUmjpvacjJhy5eQdoFt9OnIMQt25U
 KTtapCVsong5JzYZWhITNCMf5w2YGCJGJJekxdrqBvFk+FkMR38=
 =+TLu
 -----END PGP SIGNATURE-----

Merge tag 'qom-qdev-20250109' of https://github.com/philmd/qemu into staging

QOM & QDev patches

- Remove DeviceState::opts (Akihiko)
- Replace container_get by machine/object_get_container (Peter)
- Remove InterfaceInfo::concrete_class field (Paolo)
- Reduce machine_containers[] scope (Philippe)

# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmeABNgACgkQ4+MsLN6t
# wN4XtQ/+NyXEK9vjq+yXnk7LRxTDQBrXxNc71gLqNA8rGwXTuELIXOthNW+UM2a9
# CdnVbrIX/FRfQLXTHx0C2ENteafrR1oXDQmEOz1UeYgaCWJsNdVe3r1MYUdHcwVM
# 90JcSbYhrvxFE/p/6WhTjjv2DXn4E8witsPwRc8EBi5bHeFz6cNPzhdF59A3ljZF
# 0zr1MLHJHhwR6OoBbm9HM8x8i4Zw4LoKEjo8cCgcBfPQIMKf0HQ4XsinIDwn0VXN
# S3jIysNyGHlptHOiJuErILZtzrm4F2lGwYan89jxuElfWjC7SVB2z4CQkQtPceIJ
# HRBrE7VPwJ566OAThoSwPG3jXT1yCDOYmNCX1kJOMo9rYh3MwG0VrbMr5iwfYk8Z
# wO+8IyMAx7m8FibdsoMmxtI1PYTf0JQaCB6MSwdoAMMQVp1FDWBun2g+swLjQgO4
# 15iSB+PMIZe7Ywd0b63VZrUMHKwMxd9RFYEbbsdA8DRI50W3HMQPZAJiGXt7RxJ9
# p9qxqg0WGpVjgTnInt/KH4axiWPD5cru+THVYk6dvOdtTM5wj2jEswWy2vQ6LkEF
# MgxaUXfja8E20AXvdr6uXKwcKOIJ9+TaU5AhUmjpvacjJhy5eQdoFt9OnIMQt25U
# KTtapCVsong5JzYZWhITNCMf5w2YGCJGJJekxdrqBvFk+FkMR38=
# =+TLu
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 09 Jan 2025 12:18:16 EST
# gpg:                using RSA key FAABE75E12917221DCFD6BB2E3E32C2CDEADC0DE
# gpg: Good signature from "Philippe Mathieu-Daudé (F4BUG) <f4bug@amsat.org>" [full]
# Primary key fingerprint: FAAB E75E 1291 7221 DCFD  6BB2 E3E3 2C2C DEAD C0DE

* tag 'qom-qdev-20250109' of https://github.com/philmd/qemu:
  system: Inline machine_containers[] in qemu_create_machine_containers()
  qom: remove unused InterfaceInfo::concrete_class field
  qom: Remove container_get()
  qom: Use object_get_container()
  qom: Add object_get_container()
  qdev: Use machine_get_container()
  qdev: Add machine_get_container()
  qdev: Make qdev_get_machine() not use container_get()
  qdev: Implement qdev_create_fake_machine() for user emulation
  qdev: Remove opts member
  hw/pci: Use -1 as the default value for rombar

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2025-01-10 10:30:51 -05:00
Peter Xu
1c34335844 qdev: Use machine_get_container()
Use machine_get_container() whenever applicable across the tree.

Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20241121192202.4155849-11-peterx@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2025-01-09 18:16:24 +01:00
Akihiko Odaki
b6014c5089 hw/xen: Check if len is 0 before memcpy()
data->data can be NULL when len is 0. Strictly speaking, the behavior
of memcpy() in such a scenario is undefined so UBSan complaints.

Satisfy UBSan by checking if len is 0 before memcpy().

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
2025-01-09 10:43:13 +00:00
David Woodhouse
981780cdda hw/i386/pc: Fix level interrupt sharing for Xen event channel GSI
The system GSIs are not designed for sharing. One device might assert a
shared interrupt with qemu_set_irq() and another might deassert it, and
the level from the first device is lost.

This could be solved by refactoring the x86 GSI code to use an OrIrq
device, but that still wouldn't be ideal.

The best answer would be to have a 'resample' callback which is invoked
when the interrupt is acked at the interrupt controller, and causes the
devices to re-trigger the interrupt if it should still be pending. This
is the model that VFIO in Linux uses, with a 'resampler' eventfd that
actually unmasks the interrupt on the hardware device and thus triggers
a new interrupt from it if needed.

As things stand, QEMU currently doesn't use that VFIO interface
correctly, and just bashes on the resampler for every MMIO access to the
device "just in case". Which requires unmapping and trapping the MMIO
while an interrupt is pending!

For the Xen callback GSI, QEMU does something similar — a flag is set
which triggers a poll on *every* vmexst to see if the GSI should be
deasserted.

Proper resampler support would solve all of that, but is a task for
later which has already been on the TODO list for a while.

Since the Xen event channel GSI support *already* has hooks into the PC
gsi_handler() code for routing GSIs to PIRQs, we can use that for a
simpler bug fix.

So... remember the externally-driven state of the line (from e.g. PCI
INTx) and set the logical OR of that with the GSI. As a bonus, we now
only need to enable the polling of vcpu_info on vmexit if the Xen
callback GSI is the *only* reason the corresponding line is asserted.

Closes: https://gitlab.com/qemu-project/qemu/-/issues/2731
Fixes: ddf0fd9ae1 ("hw/xen: Support HVM_PARAM_CALLBACK_TYPE_GSI callback")
Reported-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
2025-01-09 10:40:11 +00:00
Philippe Mathieu-Daudé
8c6619f3e6 hw/i386/amd_iommu: Simplify non-KVM checks on XTSup feature
Generic code wanting to access KVM specific methods should
do so being protected by the 'kvm_enabled()' helper.

Doing so avoid link failures when optimization is disabled
(using --enable-debug), see for example commits c04cfb4596
("hw/i386: fix short-circuit logic with non-optimizing builds")
and 0266aef8cd ("amd_iommu: Fix kvm_enable_x2apic link error
with clang in non-KVM builds").

XTSup feature depends on KVM, so protect the whole block
checking the XTSup feature with a check on whether KVM is
enabled.

Since x86_cpus_init() already checks APIC ID > 255 imply
kernel support for irqchip and X2APIC, remove the confuse
and unlikely reachable "AMD IOMMU xtsup=on requires support
on the KVM side" message.

Fix a type in "configuration" in error message.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Message-Id: <20241129155802.35534-1-philmd@linaro.org>
2024-12-31 21:21:34 +01:00
Philippe Mathieu-Daudé
a115ab5ba3 hw/i386: Mark devices as little-endian
These devices are only used by the X86 targets, which are only
built as little-endian. Therefore the DEVICE_NATIVE_ENDIAN
definition expand to DEVICE_LITTLE_ENDIAN (besides, the
DEVICE_BIG_ENDIAN case isn't tested). Simplify directly using
DEVICE_LITTLE_ENDIAN.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Thomas Huth <huth@tuxfamily.org>
Message-Id: <20241106184612.71897-2-philmd@linaro.org>
2024-12-31 21:21:34 +01:00
Alexander Graf
ff871d0462 hw/pci-host/gpex: Allow more than 4 legacy IRQs
Some boards such as vmapple don't do real legacy PCI IRQ swizzling.
Instead, they just keep allocating more board IRQ lines for each new
legacy IRQ. Let's support that mode by giving instantiators a new
"nr_irqs" property they can use to support more than 4 legacy IRQ lines.
In this mode, GPEX will export more IRQ lines, one for each device.

Signed-off-by: Alexander Graf <graf@amazon.com>
Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Tested-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20241223221645.29911-9-phil@philjordan.eu>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-12-30 20:04:50 +01:00
Stefan Hajnoczi
65cb7129f4 Accel & Exec patch queue
- Ignore writes to CNTP_CTL_EL0 on HVF ARM (Alexander)
 - Add '-d invalid_mem' logging option (Zoltan)
 - Create QOM containers explicitly (Peter)
 - Rename sysemu/ -> system/ (Philippe)
 - Re-orderning of include/exec/ headers (Philippe)
   Move a lot of declarations from these legacy mixed bag headers:
     . "exec/cpu-all.h"
     . "exec/cpu-common.h"
     . "exec/cpu-defs.h"
     . "exec/exec-all.h"
     . "exec/translate-all"
   to these more specific ones:
     . "exec/page-protection.h"
     . "exec/translation-block.h"
     . "user/cpu_loop.h"
     . "user/guest-host.h"
     . "user/page-protection.h"
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmdlnyAACgkQ4+MsLN6t
 wN6mBw//QFWi7CrU+bb8KMM53kOU9C507tjn99LLGFb5or73/umDsw6eo/b8DHBt
 KIwGLgATel42oojKfNKavtAzLK5rOrywpboPDpa3SNeF1onW+99NGJ52LQUqIX6K
 A6bS0fPdGG9ZzEuPpbjDXlp++0yhDcdSgZsS42fEsT7Dyj5gzJYlqpqhiXGqpsn8
 4Y0UMxSL21K3HEexlzw2hsoOBFA3tUm2ujNDhNkt8QASr85yQVLCypABJnuoe///
 5Ojl5wTBeDwhANET0rhwHK8eIYaNboiM9fHopJYhvyw1bz6yAu9jQwzF/MrL3s/r
 xa4OBHBy5mq2hQV9Shcl3UfCQdk/vDaYaWpgzJGX8stgMGYfnfej1SIl8haJIfcl
 VMX8/jEFdYbjhO4AeGRYcBzWjEJymkDJZoiSWp2NuEDi6jqIW+7yW1q0Rnlg9lay
 ShAqLK5Pv4zUw3t0Jy3qv9KSW8sbs6PQxtzXjk8p97rTf76BJ2pF8sv1tVzmsidP
 9L92Hv5O34IqzBu2oATOUZYJk89YGmTIUSLkpT7asJZpBLwNM2qLp5jO00WVU0Sd
 +kAn324guYPkko/TVnjC/AY7CMu55EOtD9NU35k3mUAnxXT9oDUeL4NlYtfgrJx6
 x1Nzr2FkS68+wlPAFKNSSU5lTjsjNaFM0bIJ4LCNtenJVP+SnRo=
 =cjz8
 -----END PGP SIGNATURE-----

Merge tag 'exec-20241220' of https://github.com/philmd/qemu into staging

Accel & Exec patch queue

- Ignore writes to CNTP_CTL_EL0 on HVF ARM (Alexander)
- Add '-d invalid_mem' logging option (Zoltan)
- Create QOM containers explicitly (Peter)
- Rename sysemu/ -> system/ (Philippe)
- Re-orderning of include/exec/ headers (Philippe)
  Move a lot of declarations from these legacy mixed bag headers:
    . "exec/cpu-all.h"
    . "exec/cpu-common.h"
    . "exec/cpu-defs.h"
    . "exec/exec-all.h"
    . "exec/translate-all"
  to these more specific ones:
    . "exec/page-protection.h"
    . "exec/translation-block.h"
    . "user/cpu_loop.h"
    . "user/guest-host.h"
    . "user/page-protection.h"

 # -----BEGIN PGP SIGNATURE-----
 #
 # iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmdlnyAACgkQ4+MsLN6t
 # wN6mBw//QFWi7CrU+bb8KMM53kOU9C507tjn99LLGFb5or73/umDsw6eo/b8DHBt
 # KIwGLgATel42oojKfNKavtAzLK5rOrywpboPDpa3SNeF1onW+99NGJ52LQUqIX6K
 # A6bS0fPdGG9ZzEuPpbjDXlp++0yhDcdSgZsS42fEsT7Dyj5gzJYlqpqhiXGqpsn8
 # 4Y0UMxSL21K3HEexlzw2hsoOBFA3tUm2ujNDhNkt8QASr85yQVLCypABJnuoe///
 # 5Ojl5wTBeDwhANET0rhwHK8eIYaNboiM9fHopJYhvyw1bz6yAu9jQwzF/MrL3s/r
 # xa4OBHBy5mq2hQV9Shcl3UfCQdk/vDaYaWpgzJGX8stgMGYfnfej1SIl8haJIfcl
 # VMX8/jEFdYbjhO4AeGRYcBzWjEJymkDJZoiSWp2NuEDi6jqIW+7yW1q0Rnlg9lay
 # ShAqLK5Pv4zUw3t0Jy3qv9KSW8sbs6PQxtzXjk8p97rTf76BJ2pF8sv1tVzmsidP
 # 9L92Hv5O34IqzBu2oATOUZYJk89YGmTIUSLkpT7asJZpBLwNM2qLp5jO00WVU0Sd
 # +kAn324guYPkko/TVnjC/AY7CMu55EOtD9NU35k3mUAnxXT9oDUeL4NlYtfgrJx6
 # x1Nzr2FkS68+wlPAFKNSSU5lTjsjNaFM0bIJ4LCNtenJVP+SnRo=
 # =cjz8
 # -----END PGP SIGNATURE-----
 # gpg: Signature made Fri 20 Dec 2024 11:45:20 EST
 # gpg:                using RSA key FAABE75E12917221DCFD6BB2E3E32C2CDEADC0DE
 # gpg: Good signature from "Philippe Mathieu-Daudé (F4BUG) <f4bug@amsat.org>" [unknown]
 # gpg: WARNING: This key is not certified with a trusted signature!
 # gpg:          There is no indication that the signature belongs to the owner.
 # Primary key fingerprint: FAAB E75E 1291 7221 DCFD  6BB2 E3E3 2C2C DEAD C0DE

* tag 'exec-20241220' of https://github.com/philmd/qemu: (59 commits)
  util/qemu-timer: fix indentation
  meson: Do not define CONFIG_DEVICES on user emulation
  system/accel-ops: Remove unnecessary 'exec/cpu-common.h' header
  system/numa: Remove unnecessary 'exec/cpu-common.h' header
  hw/xen: Remove unnecessary 'exec/cpu-common.h' header
  target/mips: Drop left-over comment about Jazz machine
  target/mips: Remove tswap() calls in semihosting uhi_fstat_cb()
  target/xtensa: Remove tswap() calls in semihosting simcall() helper
  accel/tcg: Un-inline translator_is_same_page()
  accel/tcg: Include missing 'exec/translation-block.h' header
  accel/tcg: Move tcg_cflags_has/set() to 'exec/translation-block.h'
  accel/tcg: Restrict curr_cflags() declaration to 'internal-common.h'
  qemu/coroutine: Include missing 'qemu/atomic.h' header
  exec/translation-block: Include missing 'qemu/atomic.h' header
  accel/tcg: Declare cpu_loop_exit_requested() in 'exec/cpu-common.h'
  exec/cpu-all: Include 'cpu.h' earlier so MMU_USER_IDX is always defined
  target/sparc: Move sparc_restore_state_to_opc() to cpu.c
  target/sparc: Uninline cpu_get_tb_cpu_state()
  target/loongarch: Declare loongarch_cpu_dump_state() locally
  user: Move various declarations out of 'exec/exec-all.h'
  ...

Conflicts:
	hw/char/riscv_htif.c
	hw/intc/riscv_aplic.c
	target/s390x/cpu.c

	Apply sysemu header path changes to not in the pull request.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2024-12-21 11:07:00 -05:00
Philippe Mathieu-Daudé
32cad1ffb8 include: Rename sysemu/ -> system/
Headers in include/sysemu/ are not only related to system
*emulation*, they are also used by virtualization. Rename
as system/ which is clearer.

Files renamed manually then mechanical change using sed tool.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Lei Yang <leiyang@redhat.com>
Message-Id: <20241203172445.28576-1-philmd@linaro.org>
2024-12-20 17:44:56 +01:00
Philippe Mathieu-Daudé
63cda19446 target/i386/sev: Reduce system specific declarations
"system/confidential-guest-support.h" is not needed,
remove it. Reorder #ifdef'ry to reduce declarations
exposed on user emulation.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-Id: <20241218155913.72288-3-philmd@linaro.org>
2024-12-20 17:44:56 +01:00
Richard Henderson
5fcabe628b include/hw/qdev-properties: Remove DEFINE_PROP_END_OF_LIST
Now that all of the Property arrays are counted, we can remove
the terminator object from each array.  Update the assertions
in device_class_set_props to match.

With struct Property being 88 bytes, this was a rather large
form of terminator.  Saves 30k from qemu-system-aarch64.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Lei Yang <leiyang@redhat.com>
Link: https://lore.kernel.org/r/20241218134251.4724-21-richard.henderson@linaro.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-12-19 19:36:37 +01:00
Stefan Hajnoczi
8032c78e55 x86/loader: fix efi binary loading
x86/loader: support secure boot with direct kernel load
 firmware: json descriptor updates
 roms: re-add edk2-basetools target
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEoDKM/7k6F6eZAf59TLbY7tPocTgFAmdgBfwACgkQTLbY7tPo
 cTj7MQ/+MJkVWTYN59Yy1o+XgfIBMoPKuF8Rm9jyosR751Nb5slw7ivd/nr9vKOd
 QNmCUNSHqNhkt10fGZmiL/OBNPH2I226iJ/QPB6CPgn+klWu9/n/qCYHKqkUl+4V
 uAe2CtsljiMmBouJUshmUvtUeB62aykwYYUBb2WfpElBaAvDqs8O+WBCp/83ugfP
 pd0G/bG+7lI6co9KLa3u7hMgcmxu2t/uKd55BaD/H2+Py353geQtnwXThom33jhy
 RMDzSZKWXxcXpwYtGJmUgy2XQqRwCe2uCqCldJ+Yn+VqWIJhszGrfxa1W3AQWoT0
 BHcnH9uriEwMEL5gO6i83m1No9tPJQaw9qhOa/zKtAxoVjdB9FBab1+MYCyYiS4N
 BBz6pIwR+74iDjn1SCOn4vJPmblEL6qtV+IB7MauG1o9GN6IluWDDHotpcmI5B6k
 oXh7mld70cqUFWjFZvoPYEp6HBAvhXLyUf3A4fQoemEX6mSVM9eYol4GM4gTj0gs
 IsBfd9wvHmaurpXMgB0cJOpr7UbbijtssseB/WzkMWlKskuMlJxsif/IEJO+GrbZ
 RdEcdVOr45Ty1Hmqv6b9M9kUojphUchLe6nl+CQihm3K7dF27yqhcJYqNTe7mKpt
 4+i6RZaTKKtbY8FL80ycDRZIkDZg9cwMQHMxrDABQVN5WpVfRgU=
 =4fZc
 -----END PGP SIGNATURE-----

Merge tag 'firmware-20241216-pull-request' of https://gitlab.com/kraxel/qemu into staging

x86/loader: fix efi binary loading
x86/loader: support secure boot with direct kernel load
firmware: json descriptor updates
roms: re-add edk2-basetools target

# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCgAdFiEEoDKM/7k6F6eZAf59TLbY7tPocTgFAmdgBfwACgkQTLbY7tPo
# cTj7MQ/+MJkVWTYN59Yy1o+XgfIBMoPKuF8Rm9jyosR751Nb5slw7ivd/nr9vKOd
# QNmCUNSHqNhkt10fGZmiL/OBNPH2I226iJ/QPB6CPgn+klWu9/n/qCYHKqkUl+4V
# uAe2CtsljiMmBouJUshmUvtUeB62aykwYYUBb2WfpElBaAvDqs8O+WBCp/83ugfP
# pd0G/bG+7lI6co9KLa3u7hMgcmxu2t/uKd55BaD/H2+Py353geQtnwXThom33jhy
# RMDzSZKWXxcXpwYtGJmUgy2XQqRwCe2uCqCldJ+Yn+VqWIJhszGrfxa1W3AQWoT0
# BHcnH9uriEwMEL5gO6i83m1No9tPJQaw9qhOa/zKtAxoVjdB9FBab1+MYCyYiS4N
# BBz6pIwR+74iDjn1SCOn4vJPmblEL6qtV+IB7MauG1o9GN6IluWDDHotpcmI5B6k
# oXh7mld70cqUFWjFZvoPYEp6HBAvhXLyUf3A4fQoemEX6mSVM9eYol4GM4gTj0gs
# IsBfd9wvHmaurpXMgB0cJOpr7UbbijtssseB/WzkMWlKskuMlJxsif/IEJO+GrbZ
# RdEcdVOr45Ty1Hmqv6b9M9kUojphUchLe6nl+CQihm3K7dF27yqhcJYqNTe7mKpt
# 4+i6RZaTKKtbY8FL80ycDRZIkDZg9cwMQHMxrDABQVN5WpVfRgU=
# =4fZc
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 16 Dec 2024 05:50:36 EST
# gpg:                using RSA key A0328CFFB93A17A79901FE7D4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>" [full]
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>" [full]
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>" [full]
# Primary key fingerprint: A032 8CFF B93A 17A7 9901  FE7D 4CB6 D8EE D3E8 7138

* tag 'firmware-20241216-pull-request' of https://gitlab.com/kraxel/qemu:
  roms: re-add edk2-basetools target
  pc-bios: add missing riscv64 descriptor
  pc-bios: Add amd-sev-es to edk2 json
  x86/loader: add -shim option
  x86/loader: expose unpatched kernel
  x86/loader: read complete kernel
  x86/loader: only patch linux kernels

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2024-12-16 14:20:33 -05:00
Gerd Hoffmann
a5bd044b15 x86/loader: add -shim option
Add new -shim command line option, wire up for the x86 loader.
When specified load shim into the new "etc/boot/shim" fw_cfg file.

Needs OVMF changes too to be actually useful.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-ID: <20240905141211.1253307-6-kraxel@redhat.com>
2024-12-16 07:31:28 +01:00
Gerd Hoffmann
f2594d9284 x86/loader: expose unpatched kernel
Add a new "etc/boot/kernel" fw_cfg file, containing the kernel without
the setup header patches.  Intended use is booting in UEFI with secure
boot enabled, where the setup header patching breaks secure boot
verification.

Needs OVMF changes too to be actually useful.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-ID: <20240905141211.1253307-5-kraxel@redhat.com>
2024-12-16 07:31:28 +01:00
Gerd Hoffmann
214191f6b5 x86/loader: read complete kernel
Load the complete kernel (including setup) into memory.  Excluding the
setup is handled later when adding the FW_CFG_KERNEL_SIZE and
FW_CFG_KERNEL_DATA entries.

This is a preparation for the next patch which adds a new fw_cfg file
containing the complete, unpatched kernel.  No functional change.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-ID: <20240905141211.1253307-4-kraxel@redhat.com>
2024-12-16 07:31:28 +01:00
Gerd Hoffmann
57e2cc9abf x86/loader: only patch linux kernels
If the binary loaded via -kernel is *not* a linux kernel (in which
case protocol == 0), do not patch the linux kernel header fields.

It's (a) pointless and (b) might break binaries by random patching
and (c) changes the binary hash which in turn breaks secure boot
verification.

Background: OVMF happily loads and runs not only linux kernels but
any efi binary via direct kernel boot.

Note: Breaking the secure boot verification is a problem for linux
kernels too, but fixed that is left for another day ...

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-ID: <20240905141211.1253307-3-kraxel@redhat.com>
2024-12-16 07:31:28 +01:00
Richard Henderson
90d45638af hw/i386: Constify all Property
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-12-15 12:55:04 -06:00
Philippe Mathieu-Daudé
e5fd678a0d hw: Use pci_bus_add_fw_cfg_extra_pci_roots()
We want to remove fw_cfg_add_extra_pci_roots() which introduced
PCI bus knowledge within the generic hw/nvram/fw_cfg.c file.
Replace the calls by the pci_bus_add_fw_cfg_extra_pci_roots()
which is a 1:1 equivalent, but using correct API.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20241206181352.6836-6-philmd@linaro.org>
2024-12-14 00:16:20 +01:00
Dorjoy Chowdhury
5b86ddd83d hw/core/eif: Use stateful qcrypto apis
We were storing the pointers to buffers in a GList due to lack of
stateful crypto apis and instead doing the final hash computation at
the end after we had all the necessary buffers. Now that we have the
stateful qcrypto apis available, we can instead update the hashes
inline in the read_eif_* functions which makes the code much simpler.

Signed-off-by: Dorjoy Chowdhury <dorjoychy111@gmail.com>
Reviewed-by: Alexander Graf <graf@amazon.com>
Message-ID: <20241109123039.24180-1-dorjoychy111@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-12-13 15:26:58 +01:00
Cornelia Huck
0a7c438a42 hw: add compat machines for 10.0
Add 10.0 machine types for arm/i440fx/m68k/q35/s390x/spapr.

Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20241126103005.3794748-3-cohuck@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-12-11 09:18:38 +01:00
Daniel P. Berrangé
7956b06068 hw/i386: define _AS_LATEST() macros for machine types
Follow the other architecture targets by adding extra macros for
defining a versioned machine type as the latest. This reduces the
size of the changes when introducing new machine types at the start
of each release cycle.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20240910163041.3764176-1-berrange@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Message-ID: <20241126103005.3794748-2-cohuck@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-12-11 09:18:38 +01:00
Sairaj Kodilkar
0266aef8cd amd_iommu: Fix kvm_enable_x2apic link error with clang in non-KVM builds
Commit b12cb3819 (amd_iommu: Check APIC ID > 255 for XTSup) throws
linking error for the `kvm_enable_x2apic` when kvm is disabled
and Clang is used for compilation.

This issue comes up because Clang does not remove the function callsite
(kvm_enable_x2apic in this case) during optimization when if condition
have variable. Intel IOMMU driver solves this issue by creating separate
if condition for checking variables, which causes call site being
optimized away by virtue of `kvm_irqchip_is_split()` being defined as 0.
Implement same solution for the AMD driver.

Fixes: b12cb3819b (amd_iommu: Check APIC ID > 255 for XTSup)
Signed-off-by: Sairaj Kodilkar <sarunkod@amd.com>
Signed-off-by: Santosh Shukla <santosh.shukla@amd.com>
Tested-by: Phil Dennis-Jordan <phil@philjordan.eu>
Link: https://lore.kernel.org/r/20241114114509.15350-1-sarunkod@amd.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-11-28 17:59:47 +01:00
Kamil Szczęk
4a7a119b91 hw/i386/pc: Remove vmport value assertion
There is no need for this assertion here, as we only use vmport value
for equality/inequality checks. This was originally prompted by the
following Coverity report:
 >>> CID 1559533:  Integer handling issues (CONSTANT_EXPRESSION_RESULT)
 >>> "pcms->vmport >= 0" is always true regardless of the values of
 >>> its operands. This occurs as the logical first operand of "&&".

Signed-off-by: Kamil Szczęk <kamil@szczek.dev>
Reported-By: Philippe Mathieu-Daudé <philmd@linaro.org>
Link: https://lore.kernel.org/r/ZwF9ZexNs1h-uC0MrbkgGtMtdyLinROjVSmMNVzNftjGVWgOiuzdD1dSXEtzNH7OHbBFY6GVDYVFIDBgc3lhGqCOb7kaNZolSBkVyl3rNr4=@szczek.dev
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-11-19 17:36:28 +01:00
Sergio Lopez
13cd9e6798 hw/i386/elfboot: allocate "header" in heap
In x86_load_linux(), we were using a stack-allocated array as data for
fw_cfg_add_bytes(). Since the latter just takes a reference to the
pointer instead of copying the data, it can happen that the contents
have been overridden by the time the guest attempts to access them.

Instead of using the stack-allocated array, allocate some memory from
the heap, copy the contents of the array, and use it for fw_cfg.

Signed-off-by: Sergio Lopez <slp@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20241109053748.13183-1-slp@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-11-18 13:36:39 +01:00
Peter Maydell
bd0e501e1a hw/i386/pc: Don't try to init PCI NICs if there is no PCI bus
The 'isapc' machine type has no PCI bus, but pc_nic_init() still
calls pci_init_nic_devices() passing it a NULL bus pointer.  This
causes the clang sanitizer to complain:

$ ./build/clang/qemu-system-i386 -M isapc
../../hw/pci/pci.c:1866:39: runtime error: member access within null pointer of type 'PCIBus' (aka 'struct PCIBus')
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../../hw/pci/pci.c:1866:39 in

This is because pci_init_nic_devices() does
 &bus->qbus
which is undefined behaviour on a NULL pointer even though we're not
actually dereferencing the pointer. (We don't actually crash as
a result, so if you aren't running a sanitizer build then there
are no user-visible effects.)

Make pc_nic_init() avoid trying to initialize PCI NICs on a non-PCI
system.

Cc: qemu-stable@nongnu.org
Fixes: 8d39f9ba14 ("hw/i386/pc: use qemu_get_nic_info() and pci_init_nic_devices()")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Link: https://lore.kernel.org/r/20241105171813.3031969-1-peter.maydell@linaro.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-11-09 08:34:07 +01:00
Peter Maydell
63dc369443 Misc HW patch queue
- Deprecate a pair of untested microblaze big-endian machines (Philippe)
 - Arch-agnostic CPU topology checks at machine level (Zhao)
 - Cleanups on PPC E500 (Bernhard)
 - Various conversions to DEFINE_TYPES() macro (Bernhard)
 - Fix RISC-V _pext_u64() name clashing (Pierrick)
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmcqqycACgkQ4+MsLN6t
 wN7TfhAAkAjpWxFGptNw28LPpnZY/NTGKyXQrIEHu3XnJsZ28c/KZeCAYUUC6/q7
 tAnBMb5GIn2VTyt+ElORseFtHStThoR8WMrcQSlGvCZei9lRNKCW0pVIEUgLZEtT
 u8lChpaVAn8gXb885xlaCBBP4SuFHEpASSfWy0mYDIqZL3oRhr9AQ/KwzHFqenbK
 Uva4BCWRVnYju6MhfA/pmVP011SUTdCu/fsBTIJT3Xn7Sp7fRNShIzt+1rbmPnR2
 hhRl5bMKUgDUjX5GxeP0LOj/XdX9svlqL42imNQT5FFUMIR6qbrwj4U841mt0uuI
 FcthAoILvA2XUJoTESq0iXUoN4FQLtc01onY6k06EoZAnn8WRZRp2dNdu8fYmHMX
 y3pcXBK6wEhBVZ2DcGVf1txmieUc4TZohOridU1Xfckp+XVl6J3LtTKJIE56Eh68
 S9OJW1Sz2Io/8FJFvKStX0bhV0nBUyUXmi5PjV4vurS6Gy1aVodiiq3ls6baX05z
 /Y8DJGpPByA+GI2prdwq9oTIhEIU2bJDDz32NkwHM99SE25h+iyh21Ap5Ojkegm7
 1squIskxX3QLtEMxBCe+XIKzEZ51kzNZxmLXvCFW5YetypNdhyULqH/UDWt7hIDN
 BSh2w1g/lSw9n6DtEN3rURYAR/uV7/7IMEP8Td2wvcDX4o95Fkw=
 =q0cF
 -----END PGP SIGNATURE-----

Merge tag 'hw-misc-20241105' of https://github.com/philmd/qemu into staging

Misc HW patch queue

- Deprecate a pair of untested microblaze big-endian machines (Philippe)
- Arch-agnostic CPU topology checks at machine level (Zhao)
- Cleanups on PPC E500 (Bernhard)
- Various conversions to DEFINE_TYPES() macro (Bernhard)
- Fix RISC-V _pext_u64() name clashing (Pierrick)

# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmcqqycACgkQ4+MsLN6t
# wN7TfhAAkAjpWxFGptNw28LPpnZY/NTGKyXQrIEHu3XnJsZ28c/KZeCAYUUC6/q7
# tAnBMb5GIn2VTyt+ElORseFtHStThoR8WMrcQSlGvCZei9lRNKCW0pVIEUgLZEtT
# u8lChpaVAn8gXb885xlaCBBP4SuFHEpASSfWy0mYDIqZL3oRhr9AQ/KwzHFqenbK
# Uva4BCWRVnYju6MhfA/pmVP011SUTdCu/fsBTIJT3Xn7Sp7fRNShIzt+1rbmPnR2
# hhRl5bMKUgDUjX5GxeP0LOj/XdX9svlqL42imNQT5FFUMIR6qbrwj4U841mt0uuI
# FcthAoILvA2XUJoTESq0iXUoN4FQLtc01onY6k06EoZAnn8WRZRp2dNdu8fYmHMX
# y3pcXBK6wEhBVZ2DcGVf1txmieUc4TZohOridU1Xfckp+XVl6J3LtTKJIE56Eh68
# S9OJW1Sz2Io/8FJFvKStX0bhV0nBUyUXmi5PjV4vurS6Gy1aVodiiq3ls6baX05z
# /Y8DJGpPByA+GI2prdwq9oTIhEIU2bJDDz32NkwHM99SE25h+iyh21Ap5Ojkegm7
# 1squIskxX3QLtEMxBCe+XIKzEZ51kzNZxmLXvCFW5YetypNdhyULqH/UDWt7hIDN
# BSh2w1g/lSw9n6DtEN3rURYAR/uV7/7IMEP8Td2wvcDX4o95Fkw=
# =q0cF
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 05 Nov 2024 23:32:55 GMT
# gpg:                using RSA key FAABE75E12917221DCFD6BB2E3E32C2CDEADC0DE
# gpg: Good signature from "Philippe Mathieu-Daudé (F4BUG) <f4bug@amsat.org>" [full]
# Primary key fingerprint: FAAB E75E 1291 7221 DCFD  6BB2 E3E3 2C2C DEAD C0DE

* tag 'hw-misc-20241105' of https://github.com/philmd/qemu: (29 commits)
  hw/riscv/iommu: fix build error with clang
  hw/usb/hcd-ehci-sysbus: Prefer DEFINE_TYPES() macro
  hw/rtc/ds1338: Prefer DEFINE_TYPES() macro
  hw/i2c/smbus_eeprom: Prefer DEFINE_TYPES() macro
  hw/block/pflash_cfi01: Prefer DEFINE_TYPES() macro
  hw/sd/sdhci: Prefer DEFINE_TYPES() macro
  hw/ppc/mpc8544_guts: Prefer DEFINE_TYPES() macro
  hw/gpio/mpc8xxx: Prefer DEFINE_TYPES() macro
  hw/net/fsl_etsec/etsec: Prefer DEFINE_TYPES() macro
  hw/net/fsl_etsec/miim: Reuse MII constants
  hw/pci-host/ppce500: Prefer DEFINE_TYPES() macro
  hw/pci-host/ppce500: Reuse TYPE_PPC_E500_PCI_BRIDGE define
  hw/i2c/mpc_i2c: Prefer DEFINE_TYPES() macro
  hw/i2c/mpc_i2c: Convert DPRINTF to trace events for register access
  hw/ppc/mpc8544_guts: Populate POR PLL ratio status register
  hw/ppc/e500: Add missing device tree properties to i2c controller node
  hw/ppc/e500: Remove unused "irqs" parameter
  hw/ppc/e500: Prefer QOM cast
  hw/core: Add a helper to check the cache topology level
  hw/core: Check smp cache topology support for machine
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-11-06 17:28:45 +00:00
Zhao Liu
e823ebe77d hw/core: Make CPU topology enumeration arch-agnostic
Cache topology needs to be defined based on CPU topology levels. Thus,
define CPU topology enumeration in qapi/machine.json to make it generic
for all architectures.

To match the general topology naming style, rename CPU_TOPO_LEVEL_* to
CPU_TOPOLOGY_LEVEL_*, and rename SMT and package levels to thread and
socket.

Also, enumerate additional topology levels for non-i386 arches, and add
a CPU_TOPOLOGY_LEVEL_DEFAULT to help future smp-cache object to work
with compatibility requirement of arch-specific cache topology models.

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20241101083331.340178-3-zhao1.liu@intel.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-11-05 23:32:25 +00:00
Peter Maydell
9eb9350c0e virtio,pc,pci: features, fixes, cleanups
CXL now can use Generic Port Affinity Structures.
 CXL now allows control of link speed and width
 vhost-user-blk now supports live resize, by means of
 a new device-sync-config command
 amd iommu now supports interrupt remapping
 pcie devices now report extended tag field support
 intel_iommu dropped support for Transient Mapping, to match VTD spec
 arch agnostic ACPI infrastructure for vCPU Hotplug
 
 Fixes, cleanups all over the place.
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmcpNqUPHG1zdEByZWRo
 YXQuY29tAAoJECgfDbjSjVRp/2oH/0qO33prhDa48J5mqT9NuJzzYwp5QHKF9Zjv
 fDAplMUEmfxZIEgJchcyDWPYTGX2geT4pCFhRWioZMIR/0JyzrFgSwsk1kL88cMh
 46gzhNVD6ybyPJ7O0Zq3GLy5jo7rlw/n+fFxKAuRCzcbK/fmH8gNC+RwW1IP64Na
 HDczYilHUhnO7yKZFQzQNQVbK4BckrG1bu0Fcx0EMUQBf4V6x7GLOrT+3hkKYcr6
 +DG5DmUmv20or/FXnu2Ye+MzR8Ebx6JVK3A3sXEE4Ns2CCzK9QLzeeyc2aU13jWN
 OpZ6WcKF8HqYprIwnSsMTxhPcq0/c7TvrGrazVwna5RUBMyjjvc=
 =zSX4
 -----END PGP SIGNATURE-----

Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging

virtio,pc,pci: features, fixes, cleanups

CXL now can use Generic Port Affinity Structures.
CXL now allows control of link speed and width
vhost-user-blk now supports live resize, by means of
a new device-sync-config command
amd iommu now supports interrupt remapping
pcie devices now report extended tag field support
intel_iommu dropped support for Transient Mapping, to match VTD spec
arch agnostic ACPI infrastructure for vCPU Hotplug

Fixes, cleanups all over the place.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmcpNqUPHG1zdEByZWRo
# YXQuY29tAAoJECgfDbjSjVRp/2oH/0qO33prhDa48J5mqT9NuJzzYwp5QHKF9Zjv
# fDAplMUEmfxZIEgJchcyDWPYTGX2geT4pCFhRWioZMIR/0JyzrFgSwsk1kL88cMh
# 46gzhNVD6ybyPJ7O0Zq3GLy5jo7rlw/n+fFxKAuRCzcbK/fmH8gNC+RwW1IP64Na
# HDczYilHUhnO7yKZFQzQNQVbK4BckrG1bu0Fcx0EMUQBf4V6x7GLOrT+3hkKYcr6
# +DG5DmUmv20or/FXnu2Ye+MzR8Ebx6JVK3A3sXEE4Ns2CCzK9QLzeeyc2aU13jWN
# OpZ6WcKF8HqYprIwnSsMTxhPcq0/c7TvrGrazVwna5RUBMyjjvc=
# =zSX4
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 04 Nov 2024 21:03:33 GMT
# gpg:                using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg:                issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (65 commits)
  intel_iommu: Add missed reserved bit check for IEC descriptor
  intel_iommu: Add missed sanity check for 256-bit invalidation queue
  intel_iommu: Send IQE event when setting reserved bit in IQT_TAIL
  hw/acpi: Update GED with vCPU Hotplug VMSD for migration
  tests/qtest/bios-tables-test: Update DSDT golden masters for x86/{pc,q35}
  hw/acpi: Update ACPI `_STA` method with QOM vCPU ACPI Hotplug states
  qtest: allow ACPI DSDT Table changes
  hw/acpi: Make CPUs ACPI `presence` conditional during vCPU hot-unplug
  hw/pci: Add parenthesis to PCI_BUILD_BDF macro
  hw/cxl: Ensure there is enough data to read the input header in cmd_get_physical_port_state()
  hw/cxl: Ensure there is enough data for the header in cmd_ccls_set_lsa()
  hw/cxl: Check that writes do not go beyond end of target attributes
  hw/cxl: Ensuring enough data to read parameters in cmd_tunnel_management_cmd()
  hw/cxl: Avoid accesses beyond the end of cel_log.
  hw/cxl: Check the length of data requested fits in get_log()
  hw/cxl: Check enough data in cmd_firmware_update_transfer()
  hw/cxl: Check input length is large enough in cmd_events_clear_records()
  hw/cxl: Check input includes at least the header in cmd_features_set_feature()
  hw/cxl: Check size of input data to dynamic capacity mailbox commands
  hw/cxl/cxl-mailbox-util: Fix output buffer index update when retrieving DC extents
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-11-05 15:47:52 +00:00
Zhenzhong Duan
096d96e7be intel_iommu: Add missed reserved bit check for IEC descriptor
IEC descriptor is 128-bit invalidation descriptor, must be padded with
128-bits of 0s in the upper bytes to create a 256-bit descriptor when
the invalidation queue is configured for 256-bit descriptors (IQA_REG.DW=1).

Fixes: 02a2cbc872 ("x86-iommu: introduce IEC notifiers")
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Message-Id: <20241104125536.1236118-4-zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-11-04 16:03:25 -05:00
Zhenzhong Duan
8e761fb61c intel_iommu: Add missed sanity check for 256-bit invalidation queue
According to VTD spec, a 256-bit descriptor will result in an invalid
descriptor error if submitted in an IQ that is setup to provide hardware
with 128-bit descriptors (IQA_REG.DW=0). Meanwhile, there are old inv desc
types (e.g. iotlb_inv_desc) that can be either 128bits or 256bits. If a
128-bit version of this descriptor is submitted into an IQ that is setup
to provide hardware with 256-bit descriptors will also result in an invalid
descriptor error.

The 2nd will be captured by the tail register update. So we only need to
focus on the 1st.

Because the reserved bit check between different types of invalidation desc
are common, so introduce a common function vtd_inv_desc_reserved_check()
to do all the checks and pass the differences as parameters.

With this change, need to replace error_report_once() call with error_report()
to catch different call sites. This isn't an issue as error_report_once()
here is mainly used to help debug guest error, but it only dumps once in
qemu life cycle and doesn't help much, we need error_report() instead.

Fixes: c0c1d35184 ("intel_iommu: add 256 bits qi_desc support")
Suggested-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Message-Id: <20241104125536.1236118-3-zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-11-04 16:03:25 -05:00
Zhenzhong Duan
e70e83f561 intel_iommu: Send IQE event when setting reserved bit in IQT_TAIL
According to VTD spec, Figure 11-22, Invalidation Queue Tail Register,
"When Descriptor Width (DW) field in Invalidation Queue Address Register
(IQA_REG) is Set (256-bit descriptors), hardware treats bit-4 as reserved
and a value of 1 in the bit will result in invalidation queue error."

Current code missed to send IQE event to guest, fix it.

Fixes: c0c1d35184 ("intel_iommu: add 256 bits qi_desc support")
Suggested-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Message-Id: <20241104125536.1236118-2-zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-11-04 16:03:25 -05:00