Commit graph

22 commits

Author SHA1 Message Date
Philippe Mathieu-Daudé
adc1a4a26a hw/loader: Pass ELFDATA endian order argument to load_elf()
Rather than passing a boolean 'is_big_endian' argument,
directly pass the ELFDATA, which can be unspecified using
the ELFDATANONE value.

Update the call sites:
  0                 -> ELFDATA2LSB
  1                 -> ELFDATA2MSB
  TARGET_BIG_ENDIAN -> TARGET_BIG_ENDIAN ? ELFDATA2MSB : ELFDATA2LSB

Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20250127113824.50177-7-philmd@linaro.org>
2025-01-31 19:36:44 +01:00
Stefan Hajnoczi
09360a048b * rust: miscellaneous changes
* target/i386: small code generation improvements
 * target/i386: various cleanups and fixes
 * cpu: remove env->nr_cores
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmeBoIgUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroOD2gf+NK7U1EhNIrsbBsbtu2i7+tnbRKIB
 MTu+Mxb2wz4C7//pxq+vva4bgT3iOuL9RF19PRe/63CMD65xMiwyyNrEWX2HbRIJ
 5dytLLLdef3yMhHh2x1uZfm54g12Ppvn9kulMCbPawrlqWgg1sZbkUBrRtFzS45c
 NeYjGWWSpBDe7LtsrgSRYLPnz6wWEiy3tDpu2VoDtjrE86UVDXwyzpbtBk9Y8jPi
 CKdvLyQeO9xDE5OoXMjJMlJeQq3D9iwYEprXUqy+RUZtpW7YmqMCf2JQ4dAjVCad
 07v/kITF4brGCVnzDcDA6W7LqHpBu1w+Hn23yLw3HEDDBt11o9JjQCl9qA==
 =xIQ4
 -----END PGP SIGNATURE-----

Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging

* rust: miscellaneous changes
* target/i386: small code generation improvements
* target/i386: various cleanups and fixes
* cpu: remove env->nr_cores

# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmeBoIgUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroOD2gf+NK7U1EhNIrsbBsbtu2i7+tnbRKIB
# MTu+Mxb2wz4C7//pxq+vva4bgT3iOuL9RF19PRe/63CMD65xMiwyyNrEWX2HbRIJ
# 5dytLLLdef3yMhHh2x1uZfm54g12Ppvn9kulMCbPawrlqWgg1sZbkUBrRtFzS45c
# NeYjGWWSpBDe7LtsrgSRYLPnz6wWEiy3tDpu2VoDtjrE86UVDXwyzpbtBk9Y8jPi
# CKdvLyQeO9xDE5OoXMjJMlJeQq3D9iwYEprXUqy+RUZtpW7YmqMCf2JQ4dAjVCad
# 07v/kITF4brGCVnzDcDA6W7LqHpBu1w+Hn23yLw3HEDDBt11o9JjQCl9qA==
# =xIQ4
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 10 Jan 2025 17:34:48 EST
# gpg:                using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg:                issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (38 commits)
  i386/cpu: Set and track CPUID_EXT3_CMP_LEG in env->features[FEAT_8000_0001_ECX]
  i386/cpu: Set up CPUID_HT in x86_cpu_expand_features() instead of cpu_x86_cpuid()
  cpu: Remove nr_cores from struct CPUState
  i386/cpu: Hoist check of CPUID_EXT3_TOPOEXT against threads_per_core
  i386/cpu: Track a X86CPUTopoInfo directly in CPUX86State
  i386/topology: Introduce helpers for various topology info of different level
  i386/topology: Update the comment of x86_apicid_from_topo_ids()
  i386/cpu: Drop cores_per_pkg in cpu_x86_cpuid()
  i386/cpu: Drop the variable smp_cores and smp_threads in x86_cpu_pre_plug()
  i386/cpu: Extract a common fucntion to setup value of MSR_CORE_THREAD_COUNT
  target/i386/kvm: Replace ARRAY_SIZE(msr_handlers) with KVM_MSR_FILTER_MAX_RANGES
  target/i386/kvm: Clean up error handling in kvm_arch_init()
  target/i386/kvm: Return -1 when kvm_msr_energy_thread_init() fails
  target/i386/kvm: Clean up return values of MSR filter related functions
  target/i386/confidential-guest: Fix comment of x86_confidential_guest_kvm_type()
  target/i386/kvm: Drop workaround for KVM_X86_DISABLE_EXITS_HTL typo
  target/i386/kvm: Only save/load kvmclock MSRs when kvmclock enabled
  target/i386/kvm: Remove local MSR_KVM_WALL_CLOCK and MSR_KVM_SYSTEM_TIME definitions
  target/i386/kvm: Add feature bit definitions for KVM CPUID
  i386/cpu: Mark avx10_version filtered when prefix is NULL
  ...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2025-01-17 10:12:52 -05:00
Xiaoyao Li
84b71a131c i386/cpu: Track a X86CPUTopoInfo directly in CPUX86State
The name of nr_modules/nr_dies are ambiguous and they mislead people.

The purpose of them is to record and form the topology information. So
just maintain a X86CPUTopoInfo member in CPUX86State instead. Then
nr_modules and nr_dies can be dropped.

As the benefit, x86 can switch to use information in
CPUX86State::topo_info and get rid of the nr_cores and nr_threads in
CPUState. This helps remove the dependency on qemu_init_vcpu(), so that
x86 can get and use topology info earlier in x86_cpu_realizefn(); drop
the comment that highlighted the depedency.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Link: https://lore.kernel.org/r/20241219110125.1266461-7-xiaoyao.li@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-01-10 23:34:45 +01:00
Xiaoyao Li
81bd60625f i386/cpu: Drop the variable smp_cores and smp_threads in x86_cpu_pre_plug()
No need to define smp_cores and smp_threads, just using ms->smp.cores
and ms->smp.threads is straightforward. It's also consistent with other
checks of socket/die/module.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20241219110125.1266461-3-xiaoyao.li@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-01-10 23:34:45 +01:00
David Woodhouse
981780cdda hw/i386/pc: Fix level interrupt sharing for Xen event channel GSI
The system GSIs are not designed for sharing. One device might assert a
shared interrupt with qemu_set_irq() and another might deassert it, and
the level from the first device is lost.

This could be solved by refactoring the x86 GSI code to use an OrIrq
device, but that still wouldn't be ideal.

The best answer would be to have a 'resample' callback which is invoked
when the interrupt is acked at the interrupt controller, and causes the
devices to re-trigger the interrupt if it should still be pending. This
is the model that VFIO in Linux uses, with a 'resampler' eventfd that
actually unmasks the interrupt on the hardware device and thus triggers
a new interrupt from it if needed.

As things stand, QEMU currently doesn't use that VFIO interface
correctly, and just bashes on the resampler for every MMIO access to the
device "just in case". Which requires unmapping and trapping the MMIO
while an interrupt is pending!

For the Xen callback GSI, QEMU does something similar — a flag is set
which triggers a poll on *every* vmexst to see if the GSI should be
deasserted.

Proper resampler support would solve all of that, but is a task for
later which has already been on the TODO list for a while.

Since the Xen event channel GSI support *already* has hooks into the PC
gsi_handler() code for routing GSIs to PIRQs, we can use that for a
simpler bug fix.

So... remember the externally-driven state of the line (from e.g. PCI
INTx) and set the logical OR of that with the GSI. As a bonus, we now
only need to enable the polling of vcpu_info on vmexit if the Xen
callback GSI is the *only* reason the corresponding line is asserted.

Closes: https://gitlab.com/qemu-project/qemu/-/issues/2731
Fixes: ddf0fd9ae1 ("hw/xen: Support HVM_PARAM_CALLBACK_TYPE_GSI callback")
Reported-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
2025-01-09 10:40:11 +00:00
Philippe Mathieu-Daudé
32cad1ffb8 include: Rename sysemu/ -> system/
Headers in include/sysemu/ are not only related to system
*emulation*, they are also used by virtualization. Rename
as system/ which is clearer.

Files renamed manually then mechanical change using sed tool.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Lei Yang <leiyang@redhat.com>
Message-Id: <20241203172445.28576-1-philmd@linaro.org>
2024-12-20 17:44:56 +01:00
Gerd Hoffmann
a5bd044b15 x86/loader: add -shim option
Add new -shim command line option, wire up for the x86 loader.
When specified load shim into the new "etc/boot/shim" fw_cfg file.

Needs OVMF changes too to be actually useful.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-ID: <20240905141211.1253307-6-kraxel@redhat.com>
2024-12-16 07:31:28 +01:00
Gerd Hoffmann
f2594d9284 x86/loader: expose unpatched kernel
Add a new "etc/boot/kernel" fw_cfg file, containing the kernel without
the setup header patches.  Intended use is booting in UEFI with secure
boot enabled, where the setup header patching breaks secure boot
verification.

Needs OVMF changes too to be actually useful.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-ID: <20240905141211.1253307-5-kraxel@redhat.com>
2024-12-16 07:31:28 +01:00
Gerd Hoffmann
214191f6b5 x86/loader: read complete kernel
Load the complete kernel (including setup) into memory.  Excluding the
setup is handled later when adding the FW_CFG_KERNEL_SIZE and
FW_CFG_KERNEL_DATA entries.

This is a preparation for the next patch which adds a new fw_cfg file
containing the complete, unpatched kernel.  No functional change.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-ID: <20240905141211.1253307-4-kraxel@redhat.com>
2024-12-16 07:31:28 +01:00
Gerd Hoffmann
57e2cc9abf x86/loader: only patch linux kernels
If the binary loaded via -kernel is *not* a linux kernel (in which
case protocol == 0), do not patch the linux kernel header fields.

It's (a) pointless and (b) might break binaries by random patching
and (c) changes the binary hash which in turn breaks secure boot
verification.

Background: OVMF happily loads and runs not only linux kernels but
any efi binary via direct kernel boot.

Note: Breaking the secure boot verification is a problem for linux
kernels too, but fixed that is left for another day ...

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-ID: <20240905141211.1253307-3-kraxel@redhat.com>
2024-12-16 07:31:28 +01:00
Sergio Lopez
13cd9e6798 hw/i386/elfboot: allocate "header" in heap
In x86_load_linux(), we were using a stack-allocated array as data for
fw_cfg_add_bytes(). Since the latter just takes a reference to the
pointer instead of copying the data, it can happen that the contents
have been overridden by the time the guest attempts to access them.

Instead of using the stack-allocated array, allocate some memory from
the heap, copy the contents of the array, and use it for fw_cfg.

Signed-off-by: Sergio Lopez <slp@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20241109053748.13183-1-slp@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-11-18 13:36:39 +01:00
Zhao Liu
e823ebe77d hw/core: Make CPU topology enumeration arch-agnostic
Cache topology needs to be defined based on CPU topology levels. Thus,
define CPU topology enumeration in qapi/machine.json to make it generic
for all architectures.

To match the general topology naming style, rename CPU_TOPO_LEVEL_* to
CPU_TOPOLOGY_LEVEL_*, and rename SMT and package levels to thread and
socket.

Also, enumerate additional topology levels for non-i386 arches, and add
a CPU_TOPOLOGY_LEVEL_DEFAULT to help future smp-cache object to work
with compatibility requirement of arch-specific cache topology models.

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20241101083331.340178-3-zhao1.liu@intel.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-11-05 23:32:25 +00:00
Philippe Mathieu-Daudé
c3fb1fc926 hw/i386: Use explicit little-endian LD/ST API
The x86 architecture uses little endianness. Directly use
the little-endian LD/ST API.

Mechanical change using:

  $ end=le; \
    for acc in uw w l q tul; do \
      sed -i -e "s/ld${acc}_p(/ld${acc}_${end}_p(/" \
             -e "s/st${acc}_p(/st${acc}_${end}_p(/" \
        $(git grep -wlE '(ld|st)t?u?[wlq]_p' hw/i386/); \
    done

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20241004163042.85922-9-philmd@linaro.org>
2024-10-15 12:13:59 -03:00
Ani Sinha
80e3541282 hw/x86: add a couple of comments explaining how the kernel image is parsed
Cosmetic: add comments in x86_load_linux() pointing to the kernel documentation
so that users can better understand the code.

CC: qemu-trivial@nongnu.org
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-08-23 12:07:23 +03:00
Michael Roth
fc7a69e177 hw/i386: Add support for loading BIOS using guest_memfd
When guest_memfd is enabled, the BIOS is generally part of the initial
encrypted guest image and will be accessed as private guest memory. Add
the necessary changes to set up the associated RAM region with a
guest_memfd backend to allow for this.

Current support centers around using -bios to load the BIOS data.
Support for loading the BIOS via pflash requires additional enablement
since those interfaces rely on the use of ROM memory regions which make
use of the KVM_MEM_READONLY memslot flag, which is not supported for
guest_memfd-backed memslots.

Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Pankaj Gupta <pankaj.gupta@amd.com>
Message-ID: <20240530111643.1091816-29-pankaj.gupta@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-05 11:01:06 +02:00
Brijesh Singh
77d1abd91e hw/i386/sev: Add support to encrypt BIOS when SEV-SNP is enabled
As with SEV, an SNP guest requires that the BIOS be part of the initial
encrypted/measured guest payload. Extend sev_encrypt_flash() to handle
the SNP case and plumb through the GPA of the BIOS location since this
is needed for SNP.

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Pankaj Gupta <pankaj.gupta@amd.com>
Message-ID: <20240530111643.1091816-25-pankaj.gupta@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-05 11:01:06 +02:00
Zhao Liu
588208346f i386/cpu: Introduce module-id to X86CPU
Introduce module-id to be consistent with the module-id field in
CpuInstanceProperties.

Following the legacy smp check rules, also add the module_id validity
into x86_cpu_pre_plug().

Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Co-developed-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Message-ID: <20240424154929.1487382-17-zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-22 19:43:29 +02:00
Zhao Liu
b17a26bc4b i386: Support module_id in X86CPUTopoIDs
Add module_id member in X86CPUTopoIDs.

module_id can be parsed from APIC ID, so also update APIC ID parsing
rule to support module level. With this support, the conversions with
module level between X86CPUTopoIDs, X86CPUTopoInfo and APIC ID are
completed.

module_id can be also generated from cpu topology, and before i386
supports "modules" in smp, the default "modules per die" (modules *
clusters) is only 1, thus the module_id generated in this way is 0,
so that it will not conflict with the module_id generated by APIC ID.

Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
Co-developed-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Message-ID: <20240424154929.1487382-16-zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-22 19:43:29 +02:00
Zhao Liu
5304873acd i386: Expose module level in CPUID[0x1F]
Linux kernel (from v6.4, with commit edc0a2b595765 ("x86/topology: Fix
erroneous smp_num_siblings on Intel Hybrid platforms") is able to
handle platforms with Module level enumerated via CPUID.1F.

Expose the module level in CPUID[0x1F] if the machine has more than 1
modules.

Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Message-ID: <20240424154929.1487382-15-zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-22 19:43:29 +02:00
Zhao Liu
81c392ab5c i386: Introduce module level cpu topology to CPUX86State
Intel CPUs implement module level on hybrid client products (e.g.,
ADL-N, MTL, etc) and E-core server products.

A module contains a set of cores that share certain resources (in
current products, the resource usually includes L2 cache, as well as
module scoped features and MSRs).

Module level support is the prerequisite for L2 cache topology on
module level. With module level, we can implement the Guest's CPU
topology and future cache topology to be consistent with the Host's on
Intel hybrid client/E-core server platforms.

Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Co-developed-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Message-ID: <20240424154929.1487382-13-zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-22 19:43:29 +02:00
Zhao Liu
6ddeb0ec8c i386/cpu: Introduce bitmap to cache available CPU topology levels
Currently, QEMU checks the specify number of topology domains to detect
if there's extended topology levels (e.g., checking nr_dies).

With this bitmap, the extended CPU topology (the levels other than SMT,
core and package) could be easier to detect without touching the
topology details.

This is also in preparation for the follow-up to decouple CPUID[0x1F]
subleaf with specific topology level.

Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20240424154929.1487382-10-zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-22 19:43:29 +02:00
Paolo Bonzini
b061f0598b hw/i386: split x86.c in multiple parts
Keep the basic X86MachineState definition in x86.c.  Move out functions that
are only needed by other files: x86-common.c for the pc and microvm machines,
x86-cpu.c for those used by accelerator code.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-ID: <20240509170044.190795-11-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-10 15:45:15 +02:00