Commit graph

176 commits

Author SHA1 Message Date
Pierrick Bouvier
9c2ff9cdc9 exec/cpu-all: remove exec/target_page include
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2025-04-23 15:04:57 -07:00
Xiaoyao Li
d3bb5d0d4f i386/cpu: Extract a common fucntion to setup value of MSR_CORE_THREAD_COUNT
There are duplicated code to setup the value of MSR_CORE_THREAD_COUNT.
Extract a common function for it.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20241219110125.1266461-2-xiaoyao.li@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-01-10 23:34:44 +01:00
Paolo Bonzini
d662b66da4 target/i386/kvm: Replace ARRAY_SIZE(msr_handlers) with KVM_MSR_FILTER_MAX_RANGES
kvm_install_msr_filters() uses KVM_MSR_FILTER_MAX_RANGES as the bound
when traversing msr_handlers[], while other places still compute the
size by ARRAY_SIZE(msr_handlers).

In fact, msr_handlers[] is an array with the fixed size
KVM_MSR_FILTER_MAX_RANGES, and this has to be true because
kvm_install_msr_filters copies from one array to the other.
For code consistency, assert that they match and use
ARRAY_SIZE(msr_handlers) everywehere.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-01-10 23:34:44 +01:00
Zhao Liu
d2401a6eae target/i386/kvm: Clean up error handling in kvm_arch_init()
Currently, there're following incorrect error handling cases in
kvm_arch_init():
* Missed to handle failure of kvm_get_supported_feature_msrs().
* Missed to return when kvm_vm_enable_disable_exits() fails.
* MSR filter related cases called exit() directly instead of returning
  to kvm_init(). (The caller of kvm_arch_init() - kvm_init() - needs to
  know if kvm_arch_init() fails in order to perform cleanup).

Fix the above cases.

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Zide Chen <zide.chen@intel.com>
Link: https://lore.kernel.org/r/20241106030728.553238-11-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-01-10 23:34:44 +01:00
Zhao Liu
d7f895cb62 target/i386/kvm: Return -1 when kvm_msr_energy_thread_init() fails
It is common practice to return a negative value (like -1) to indicate
an error, and other functions in kvm_arch_init() follow this style.

To avoid confusion (sometimes returned -1 indicates failure, and
sometimes -1, in a same function), return -1 when
kvm_msr_energy_thread_init() fails.

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20241106030728.553238-10-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-01-10 23:34:44 +01:00
Zhao Liu
fb81c9cfdd target/i386/kvm: Clean up return values of MSR filter related functions
Before commit 0cc42e63bb ("kvm/i386: refactor kvm_arch_init and split
it into smaller functions"), error_report() attempts to print the error
code from kvm_filter_msr(). However, printing error code does not work
due to kvm_filter_msr() returns bool instead int.

0cc42e63bb fixed the error by removing error code printing, but this
lost useful error messages. Bring it back by making kvm_filter_msr()
return int.

This also makes the function call chain processing clearer, allowing for
better handling of error result propagation from kvm_filter_msr() to
kvm_arch_init(), preparing for the subsequent cleanup work of error
handling in kvm_arch_init().

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Zide Chen <zide.chen@intel.com>
Link: https://lore.kernel.org/r/20241106030728.553238-9-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-01-10 23:34:44 +01:00
Zhao Liu
5dabc87b51 target/i386/kvm: Drop workaround for KVM_X86_DISABLE_EXITS_HTL typo
The KVM_X86_DISABLE_EXITS_HTL typo has been fixed in commit
77d361b13c ("linux-headers: Update to kernel mainline commit
b357bf602").

Drop the related workaround.

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Zide Chen <zide.chen@intel.com>
Link: https://lore.kernel.org/r/20241106030728.553238-7-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-01-10 23:34:44 +01:00
Zhao Liu
86e032bb7b target/i386/kvm: Only save/load kvmclock MSRs when kvmclock enabled
MSR_KVM_SYSTEM_TIME and MSR_KVM_WALL_CLOCK are attached with the (old)
kvmclock feature (KVM_FEATURE_CLOCKSOURCE).

So, just save/load them only when kvmclock (KVM_FEATURE_CLOCKSOURCE) is
enabled.

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Zide Chen <zide.chen@intel.com>
Link: https://lore.kernel.org/r/20241106030728.553238-5-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-01-10 23:34:44 +01:00
Zhao Liu
f5bec7652d target/i386/kvm: Remove local MSR_KVM_WALL_CLOCK and MSR_KVM_SYSTEM_TIME definitions
These 2 MSRs have been already defined in kvm_para.h (standard-headers/
asm-x86/kvm_para.h).

Remove QEMU local definitions to avoid duplication.

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Zide Chen <zide.chen@intel.com>
Link: https://lore.kernel.org/r/20241106030728.553238-4-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-01-10 23:34:44 +01:00
Zhao Liu
cee1f341ce target/i386/kvm: Add feature bit definitions for KVM CPUID
Add feature definitions for KVM_CPUID_FEATURES in CPUID (
CPUID[4000_0001].EAX and CPUID[4000_0001].EDX), to get rid of lots of
offset calculations.

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Zide Chen <zide.chen@intel.com>
Link: https://lore.kernel.org/r/20241106030728.553238-3-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-01-10 23:34:44 +01:00
Stefan Hajnoczi
65cb7129f4 Accel & Exec patch queue
- Ignore writes to CNTP_CTL_EL0 on HVF ARM (Alexander)
 - Add '-d invalid_mem' logging option (Zoltan)
 - Create QOM containers explicitly (Peter)
 - Rename sysemu/ -> system/ (Philippe)
 - Re-orderning of include/exec/ headers (Philippe)
   Move a lot of declarations from these legacy mixed bag headers:
     . "exec/cpu-all.h"
     . "exec/cpu-common.h"
     . "exec/cpu-defs.h"
     . "exec/exec-all.h"
     . "exec/translate-all"
   to these more specific ones:
     . "exec/page-protection.h"
     . "exec/translation-block.h"
     . "user/cpu_loop.h"
     . "user/guest-host.h"
     . "user/page-protection.h"
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmdlnyAACgkQ4+MsLN6t
 wN6mBw//QFWi7CrU+bb8KMM53kOU9C507tjn99LLGFb5or73/umDsw6eo/b8DHBt
 KIwGLgATel42oojKfNKavtAzLK5rOrywpboPDpa3SNeF1onW+99NGJ52LQUqIX6K
 A6bS0fPdGG9ZzEuPpbjDXlp++0yhDcdSgZsS42fEsT7Dyj5gzJYlqpqhiXGqpsn8
 4Y0UMxSL21K3HEexlzw2hsoOBFA3tUm2ujNDhNkt8QASr85yQVLCypABJnuoe///
 5Ojl5wTBeDwhANET0rhwHK8eIYaNboiM9fHopJYhvyw1bz6yAu9jQwzF/MrL3s/r
 xa4OBHBy5mq2hQV9Shcl3UfCQdk/vDaYaWpgzJGX8stgMGYfnfej1SIl8haJIfcl
 VMX8/jEFdYbjhO4AeGRYcBzWjEJymkDJZoiSWp2NuEDi6jqIW+7yW1q0Rnlg9lay
 ShAqLK5Pv4zUw3t0Jy3qv9KSW8sbs6PQxtzXjk8p97rTf76BJ2pF8sv1tVzmsidP
 9L92Hv5O34IqzBu2oATOUZYJk89YGmTIUSLkpT7asJZpBLwNM2qLp5jO00WVU0Sd
 +kAn324guYPkko/TVnjC/AY7CMu55EOtD9NU35k3mUAnxXT9oDUeL4NlYtfgrJx6
 x1Nzr2FkS68+wlPAFKNSSU5lTjsjNaFM0bIJ4LCNtenJVP+SnRo=
 =cjz8
 -----END PGP SIGNATURE-----

Merge tag 'exec-20241220' of https://github.com/philmd/qemu into staging

Accel & Exec patch queue

- Ignore writes to CNTP_CTL_EL0 on HVF ARM (Alexander)
- Add '-d invalid_mem' logging option (Zoltan)
- Create QOM containers explicitly (Peter)
- Rename sysemu/ -> system/ (Philippe)
- Re-orderning of include/exec/ headers (Philippe)
  Move a lot of declarations from these legacy mixed bag headers:
    . "exec/cpu-all.h"
    . "exec/cpu-common.h"
    . "exec/cpu-defs.h"
    . "exec/exec-all.h"
    . "exec/translate-all"
  to these more specific ones:
    . "exec/page-protection.h"
    . "exec/translation-block.h"
    . "user/cpu_loop.h"
    . "user/guest-host.h"
    . "user/page-protection.h"

 # -----BEGIN PGP SIGNATURE-----
 #
 # iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmdlnyAACgkQ4+MsLN6t
 # wN6mBw//QFWi7CrU+bb8KMM53kOU9C507tjn99LLGFb5or73/umDsw6eo/b8DHBt
 # KIwGLgATel42oojKfNKavtAzLK5rOrywpboPDpa3SNeF1onW+99NGJ52LQUqIX6K
 # A6bS0fPdGG9ZzEuPpbjDXlp++0yhDcdSgZsS42fEsT7Dyj5gzJYlqpqhiXGqpsn8
 # 4Y0UMxSL21K3HEexlzw2hsoOBFA3tUm2ujNDhNkt8QASr85yQVLCypABJnuoe///
 # 5Ojl5wTBeDwhANET0rhwHK8eIYaNboiM9fHopJYhvyw1bz6yAu9jQwzF/MrL3s/r
 # xa4OBHBy5mq2hQV9Shcl3UfCQdk/vDaYaWpgzJGX8stgMGYfnfej1SIl8haJIfcl
 # VMX8/jEFdYbjhO4AeGRYcBzWjEJymkDJZoiSWp2NuEDi6jqIW+7yW1q0Rnlg9lay
 # ShAqLK5Pv4zUw3t0Jy3qv9KSW8sbs6PQxtzXjk8p97rTf76BJ2pF8sv1tVzmsidP
 # 9L92Hv5O34IqzBu2oATOUZYJk89YGmTIUSLkpT7asJZpBLwNM2qLp5jO00WVU0Sd
 # +kAn324guYPkko/TVnjC/AY7CMu55EOtD9NU35k3mUAnxXT9oDUeL4NlYtfgrJx6
 # x1Nzr2FkS68+wlPAFKNSSU5lTjsjNaFM0bIJ4LCNtenJVP+SnRo=
 # =cjz8
 # -----END PGP SIGNATURE-----
 # gpg: Signature made Fri 20 Dec 2024 11:45:20 EST
 # gpg:                using RSA key FAABE75E12917221DCFD6BB2E3E32C2CDEADC0DE
 # gpg: Good signature from "Philippe Mathieu-Daudé (F4BUG) <f4bug@amsat.org>" [unknown]
 # gpg: WARNING: This key is not certified with a trusted signature!
 # gpg:          There is no indication that the signature belongs to the owner.
 # Primary key fingerprint: FAAB E75E 1291 7221 DCFD  6BB2 E3E3 2C2C DEAD C0DE

* tag 'exec-20241220' of https://github.com/philmd/qemu: (59 commits)
  util/qemu-timer: fix indentation
  meson: Do not define CONFIG_DEVICES on user emulation
  system/accel-ops: Remove unnecessary 'exec/cpu-common.h' header
  system/numa: Remove unnecessary 'exec/cpu-common.h' header
  hw/xen: Remove unnecessary 'exec/cpu-common.h' header
  target/mips: Drop left-over comment about Jazz machine
  target/mips: Remove tswap() calls in semihosting uhi_fstat_cb()
  target/xtensa: Remove tswap() calls in semihosting simcall() helper
  accel/tcg: Un-inline translator_is_same_page()
  accel/tcg: Include missing 'exec/translation-block.h' header
  accel/tcg: Move tcg_cflags_has/set() to 'exec/translation-block.h'
  accel/tcg: Restrict curr_cflags() declaration to 'internal-common.h'
  qemu/coroutine: Include missing 'qemu/atomic.h' header
  exec/translation-block: Include missing 'qemu/atomic.h' header
  accel/tcg: Declare cpu_loop_exit_requested() in 'exec/cpu-common.h'
  exec/cpu-all: Include 'cpu.h' earlier so MMU_USER_IDX is always defined
  target/sparc: Move sparc_restore_state_to_opc() to cpu.c
  target/sparc: Uninline cpu_get_tb_cpu_state()
  target/loongarch: Declare loongarch_cpu_dump_state() locally
  user: Move various declarations out of 'exec/exec-all.h'
  ...

Conflicts:
	hw/char/riscv_htif.c
	hw/intc/riscv_aplic.c
	target/s390x/cpu.c

	Apply sysemu header path changes to not in the pull request.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2024-12-21 11:07:00 -05:00
Philippe Mathieu-Daudé
32cad1ffb8 include: Rename sysemu/ -> system/
Headers in include/sysemu/ are not only related to system
*emulation*, they are also used by virtualization. Rename
as system/ which is clearer.

Files renamed manually then mechanical change using sed tool.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Lei Yang <leiyang@redhat.com>
Message-Id: <20241203172445.28576-1-philmd@linaro.org>
2024-12-20 17:44:56 +01:00
Maciej S. Szmigiero
3f2a05b31e target/i386: Reset TSCs of parked vCPUs too on VM reset
Since commit 5286c36622 ("target/i386: properly reset TSC on reset")
QEMU writes the special value of "1" to each online vCPU TSC on VM reset
to reset it.

However parked vCPUs don't get that handling and due to that their TSCs
get desynchronized when the VM gets reset.
This in turn causes KVM to turn off PVCLOCK_TSC_STABLE_BIT in its exported
PV clock.
Note that KVM has no understanding of vCPU being currently parked.

Without PVCLOCK_TSC_STABLE_BIT the sched clock is marked unstable in
the guest's kvm_sched_clock_init().
This causes a performance regressions to show in some tests.

Fix this issue by writing the special value of "1" also to TSCs of parked
vCPUs on VM reset.

Reproducing the issue:
1) Boot a VM with "-smp 2,maxcpus=3" or similar

2) device_add host-x86_64-cpu,id=vcpu,node-id=0,socket-id=0,core-id=2,thread-id=0

3) Wait a few seconds

4) device_del vcpu

5) Inside the VM run:
# echo "t" >/proc/sysrq-trigger; dmesg | grep sched_clock_stable
Observe the sched_clock_stable() value is 1.

6) Reboot the VM

7) Once the VM boots once again run inside it:
# echo "t" >/proc/sysrq-trigger; dmesg | grep sched_clock_stable
Observe the sched_clock_stable() value is now 0.

Fixes: 5286c36622 ("target/i386: properly reset TSC on reset")
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Link: https://lore.kernel.org/r/5a605a88e9a231386dc803c60f5fed9b48108139.1734014926.git.maciej.szmigiero@oracle.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-12-19 19:36:38 +01:00
Tao Su
bccfb846fd target/i386: add AVX10 feature and AVX10 version property
When AVX10 enable bit is set, the 0x24 leaf will be present as "AVX10
Converged Vector ISA leaf" containing fields for the version number and
the supported vector bit lengths.

Introduce avx10-version property so that avx10 version can be controlled
by user and cpu model. Per spec, avx10 version can never be 0, the default
value of avx10-version is set to 0 to determine whether it is specified by
user.  The default can come from the device model or, for the max model,
from KVM's reported value.

Signed-off-by: Tao Su <tao1.su@linux.intel.com>
Link: https://lore.kernel.org/r/20241028024512.156724-3-tao1.su@linux.intel.com
Link: https://lore.kernel.org/r/20241028024512.156724-4-tao1.su@linux.intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Xuelian Guo <xuelian.guo@intel.com>
Link: https://lore.kernel.org/r/20241031085233.425388-5-tao1.su@linux.intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-31 18:28:33 +01:00
Vitaly Kuznetsov
7d7b9c7655 target/i386: Exclude 'hv-syndbg' from 'hv-passthrough'
Windows with Hyper-V role enabled doesn't boot with 'hv-passthrough' when
no debugger is configured, this significantly limits the usefulness of the
feature as there's no support for subtracting Hyper-V features from CPU
flags at this moment (e.g. "-cpu host,hv-passthrough,-hv-syndbg" does not
work). While this is also theoretically fixable, 'hv-syndbg' is likely
very special and unneeded in the default set. Genuine Hyper-V doesn't seem
to enable it either.

Introduce 'skip_passthrough' flag to 'kvm_hyperv_properties' and use it as
one-off to skip 'hv-syndbg' when enabling features in 'hv-passthrough'
mode. Note, "-cpu host,hv-passthrough,hv-syndbg" can still be used if
needed.

As both 'hv-passthrough' and 'hv-syndbg' are debug features, the change
should not have any effect on production environments.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Link: https://lore.kernel.org/r/20240917160051.2637594-3-vkuznets@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-17 12:30:21 +02:00
Vitaly Kuznetsov
bbf3810f2c target/i386: Fix conditional CONFIG_SYNDBG enablement
Putting HYPERV_FEAT_SYNDBG entry under "#ifdef CONFIG_SYNDBG" in
'kvm_hyperv_properties' array is wrong: as HYPERV_FEAT_SYNDBG is not
the highest feature number, the result is an empty (zeroed) entry in
the array (and not a skipped entry!). hyperv_feature_supported() is
designed to check that all CPUID bits are set but for a zeroed
feature in 'kvm_hyperv_properties' it returns 'true' so QEMU considers
HYPERV_FEAT_SYNDBG as always supported, regardless of whether KVM host
actually supports it.

To fix the issue, leave HYPERV_FEAT_SYNDBG's definition in
'kvm_hyperv_properties' array, there's nothing wrong in having it defined
even when 'CONFIG_SYNDBG' is not set. Instead, put "hv-syndbg" CPU property
under '#ifdef CONFIG_SYNDBG' to alter the existing behavior when the flag
is silently skipped in !CONFIG_SYNDBG builds.

Leave an 'assert' sentinel in hyperv_feature_supported() making sure there
are no 'holes' or improperly defined features in 'kvm_hyperv_properties'.

Fixes: d8701185f4 ("hw: hyperv: Initial commit for Synthetic Debugging device")
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Link: https://lore.kernel.org/r/20240917160051.2637594-2-vkuznets@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-17 12:30:21 +02:00
Gao Shiyuan
b5151ace58 target/i386: Add support save/load HWCR MSR
KVM commit 191c8137a939 ("x86/kvm: Implement HWCR support")
introduced support for emulating HWCR MSR.

Add support for QEMU to save/load this MSR for migration purposes.

Signed-off-by: Gao Shiyuan <gaoshiyuan@baidu.com>
Signed-off-by: Wang Liang <wangliang44@baidu.com>
Link: https://lore.kernel.org/r/20241009095109.66843-1-gaoshiyuan@baidu.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-17 12:30:21 +02:00
Xiaoyao Li
5ab639141b target/i386: Construct CPUID 2 as stateful iff times > 1
When times == 1, the CPUID leaf 2 is not stateful.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Link: https://lore.kernel.org/r/20240814075431.339209-6-xiaoyao.li@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-17 12:30:21 +02:00
Xiaoyao Li
00c8a933d9 target/i386: Don't construct a all-zero entry for CPUID[0xD 0x3f]
Currently, QEMU always constructs a all-zero CPUID entry for
CPUID[0xD 0x3f].

It's meaningless to construct such a leaf as the end of leaf 0xD. Rework
the logic of how subleaves of 0xD are constructed to get rid of such
all-zero value of subleaf 0x3f.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Link: https://lore.kernel.org/r/20240814075431.339209-2-xiaoyao.li@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-17 12:30:20 +02:00
Julia Suvorova
fc058618d1 target/i386/kvm: Report which action failed in kvm_arch_put/get_registers
To help debug and triage future failure reports (akin to [1,2]) that
may occur during kvm_arch_put/get_registers, the error path of each
action is accompanied by unique error message.

[1] https://issues.redhat.com/browse/RHEL-7558
[2] https://issues.redhat.com/browse/RHEL-21761

Signed-off-by: Julia Suvorova <jusual@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/20240927104743.218468-3-jusual@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-03 22:04:24 +02:00
Julia Suvorova
a1676bb304 kvm: Allow kvm_arch_get/put_registers to accept Error**
This is necessary to provide discernible error messages to the caller.

Signed-off-by: Julia Suvorova <jusual@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/20240927104743.218468-2-jusual@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-03 22:04:19 +02:00
Paolo Bonzini
dc44854978 kvm/i386: replace identity_base variable with a constant
identity_base variable is first initialzied to address 0xfffbc000 and then
kvm_vm_set_identity_map_addr() overrides this value to address 0xfeffc000.
The initial address to which the variable was initialized was never used. Clean
everything up, placing 0xfeffc000 in a preprocessor constant.

Reported-by: Ani Sinha <anisinha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-03 19:33:23 +02:00
Ani Sinha
0cc42e63bb kvm/i386: refactor kvm_arch_init and split it into smaller functions
kvm_arch_init() enables a lot of vm capabilities. Refactor them into separate
smaller functions. Energy MSR related operations also moved to its own
function. There should be no functional impact.

Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20240903124143.39345-2-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-03 19:33:22 +02:00
Ani Sinha
87e82951c1 kvm/i386: fix return values of is_host_cpu_intel()
is_host_cpu_intel() should return TRUE if the host cpu in Intel based, otherwise
it should return FALSE. Currently, it returns zero (FALSE) when the host CPU
is INTEL and non-zero otherwise. Fix the function so that it agrees more with
the semantics. Adjust the calling logic accordingly. RAPL needs Intel host cpus.
If the host CPU is not Intel baseed, we should report error.

Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20240903080004.33746-1-anisinha@redhat.com
[While touching the code remove too many spaces from the second part of the
 error. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-02 12:58:46 +02:00
Ani Sinha
ed2880f4e9 kvm/i386: make kvm_filter_msr() and related definitions private to kvm module
kvm_filer_msr() is only used from i386 kvm module. Make it static so that its
easy for developers to understand that its not used anywhere else.
Same for QEMURDMSRHandler, QEMUWRMSRHandler and KVMMSRHandlers definitions.

CC: philmd@linaro.org
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20240903140045.41167-1-anisinha@redhat.com
[Make struct unnamed. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-02 12:58:46 +02:00
Lei Wang
ab891454eb target/i386: Raise the highest index value used for any VMCS encoding
Because the index value of the VMCS field encoding of FRED injected-event
data (one of the newly added VMCS fields for FRED transitions), 0x52, is
larger than any existing index value, raise the highest index value used
for any VMCS encoding to 0x52.

Because the index value of the VMCS field encoding of Secondary VM-exit
controls, 0x44, is larger than any existing index value, raise the highest
index value used for any VMCS encoding to 0x44.

Co-developed-by: Xin Li <xin3.li@intel.com>
Signed-off-by: Xin Li <xin3.li@intel.com>
Signed-off-by: Lei Wang <lei4.wang@intel.com>
Signed-off-by: Xin Li (Intel) <xin@zytor.com>
Link: https://lore.kernel.org/r/20240807081813.735158-4-xin@zytor.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-02 12:58:46 +02:00
Pierrick Bouvier
f4fa1a5350 target/i386/kvm: replace assert(false) with g_assert_not_reached()
This patch is part of a series that moves towards a consistent use of
g_assert_not_reached() rather than an ad hoc mix of different
assertion mechanisms.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20240919044641.386068-15-pierrick.bouvier@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-09-24 13:53:35 +02:00
Johannes Stoelp
6a8703aecb kvm: Use 'unsigned long' for request argument in functions wrapping ioctl()
Change the data type of the ioctl _request_ argument from 'int' to
'unsigned long' for the various accel/kvm functions which are
essentially wrappers around the ioctl() syscall.

The correct type for ioctl()'s 'request' argument is confused:
 * POSIX defines the request argument as 'int'
 * glibc uses 'unsigned long' in the prototype in sys/ioctl.h
 * the glibc info documentation uses 'int'
 * the Linux manpage uses 'unsigned long'
 * the Linux implementation of the syscall uses 'unsigned int'

If we wrap ioctl() with another function which uses 'int' as the
type for the request argument, then requests with the 0x8000_0000
bit set will be sign-extended when the 'int' is cast to
'unsigned long' for the call to ioctl().

On x86_64 one such example is the KVM_IRQ_LINE_STATUS request.
Bit requests with the _IOC_READ direction bit set, will have the high
bit set.

Fortunately the Linux Kernel truncates the upper 32bit of the request
on 64bit machines (because it uses 'unsigned int', and see also Linus
Torvalds' comments in
  https://sourceware.org/bugzilla/show_bug.cgi?id=14362 )
so this doesn't cause active problems for us.  However it is more
consistent to follow the glibc ioctl() prototype when we define
functions that are essentially wrappers around ioctl().

This resolves a Coverity issue where it points out that in
kvm_get_xsave() we assign a value (KVM_GET_XSAVE or KVM_GET_XSAVE2)
to an 'int' variable which can't hold it without overflow.

Resolves: Coverity CID 1547759
Signed-off-by: Johannes Stoelp <johannes.stoelp@gmail.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20240815122747.3053871-1-peter.maydell@linaro.org
[PMM: Rebased patch, adjusted commit message, included note about
 Coverity fix, updated the type of the local var in kvm_get_xsave,
 updated the comment in the KVMState struct definition]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-09-13 15:31:46 +01:00
Anthony Harivel
a6e65975c3 target/i386: Fix arguments for vmsr_read_thread_stat()
Snapshot of the stat utime and stime for each thread, taken before and
after the pause, must be stored in separate locations

Signed-off-by: Anthony Harivel <aharivel@redhat.com>
Link: https://lore.kernel.org/r/20240807124320.1741124-2-aharivel@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-08-14 18:42:19 +02:00
Anthony Harivel
5997fbdfac target/i386: Fix typo that assign same value twice
Should fix: CID 1558553

Signed-off-by: Anthony Harivel <aharivel@redhat.com>
Link: https://lore.kernel.org/r/20240726102632.1324432-2-aharivel@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-07-31 13:13:31 +02:00
Anthony Harivel
0418f90809 Add support for RAPL MSRs in KVM/Qemu
Starting with the "Sandy Bridge" generation, Intel CPUs provide a RAPL
interface (Running Average Power Limit) for advertising the accumulated
energy consumption of various power domains (e.g. CPU packages, DRAM,
etc.).

The consumption is reported via MSRs (model specific registers) like
MSR_PKG_ENERGY_STATUS for the CPU package power domain. These MSRs are
64 bits registers that represent the accumulated energy consumption in
micro Joules. They are updated by microcode every ~1ms.

For now, KVM always returns 0 when the guest requests the value of
these MSRs. Use the KVM MSR filtering mechanism to allow QEMU handle
these MSRs dynamically in userspace.

To limit the amount of system calls for every MSR call, create a new
thread in QEMU that updates the "virtual" MSR values asynchronously.

Each vCPU has its own vMSR to reflect the independence of vCPUs. The
thread updates the vMSR values with the ratio of energy consumed of
the whole physical CPU package the vCPU thread runs on and the
thread's utime and stime values.

All other non-vCPU threads are also taken into account. Their energy
consumption is evenly distributed among all vCPUs threads running on
the same physical CPU package.

To overcome the problem that reading the RAPL MSR requires priviliged
access, a socket communication between QEMU and the qemu-vmsr-helper is
mandatory. You can specified the socket path in the parameter.

This feature is activated with -accel kvm,rapl=true,path=/path/sock.sock

Actual limitation:
- Works only on Intel host CPU because AMD CPUs are using different MSR
  adresses.

- Only the Package Power-Plane (MSR_PKG_ENERGY_STATUS) is reported at
  the moment.

Signed-off-by: Anthony Harivel <aharivel@redhat.com>
Link: https://lore.kernel.org/r/20240522153453.1230389-4-aharivel@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-07-22 19:19:37 +02:00
Richard Henderson
5915139aba * meson: Pass objects and dependencies to declare_dependency(), not static_library()
* meson: Drop the .fa library suffix
 * target/i386: drop AMD machine check bits from Intel CPUID
 * target/i386: add avx-vnni-int16 feature
 * target/i386: SEV bugfixes
 * target/i386: SEV-SNP -cpu host support
 * char: fix exit issues
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmaGceoUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroNcpgf/XziKojGOTvYsE7xMijOUswYjCG5m
 ZVLqxTug8Q0zO/9mGvluKBTWmh8KhRWOovX5iZL8+F0gPoYPG4ONpNhh3wpA9+S7
 H7ph4V6sDJBX4l3OrOK6htD8dO5D9kns1iKGnE0lY60PkcHl+pU8BNWfK1zYp5US
 geiyzuRFRRtDmoNx5+o+w+D+W5msPZsnlj5BnPWM+O/ykeFfSrk2ztfdwHKXUhCB
 5FJcu2sWVx+wsdVzdjgT8USi5+VTK4vabq3SfccmNRxBRnJOCU5MrR63stMDceo4
 TswSB88I0WRV1848AudcGZRkjvKaXLyHJ+QTjg2dp7itEARJ3MGsvOpS5A==
 =3kv7
 -----END PGP SIGNATURE-----

Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging

* meson: Pass objects and dependencies to declare_dependency(), not static_library()
* meson: Drop the .fa library suffix
* target/i386: drop AMD machine check bits from Intel CPUID
* target/i386: add avx-vnni-int16 feature
* target/i386: SEV bugfixes
* target/i386: SEV-SNP -cpu host support
* char: fix exit issues

# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmaGceoUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroNcpgf/XziKojGOTvYsE7xMijOUswYjCG5m
# ZVLqxTug8Q0zO/9mGvluKBTWmh8KhRWOovX5iZL8+F0gPoYPG4ONpNhh3wpA9+S7
# H7ph4V6sDJBX4l3OrOK6htD8dO5D9kns1iKGnE0lY60PkcHl+pU8BNWfK1zYp5US
# geiyzuRFRRtDmoNx5+o+w+D+W5msPZsnlj5BnPWM+O/ykeFfSrk2ztfdwHKXUhCB
# 5FJcu2sWVx+wsdVzdjgT8USi5+VTK4vabq3SfccmNRxBRnJOCU5MrR63stMDceo4
# TswSB88I0WRV1848AudcGZRkjvKaXLyHJ+QTjg2dp7itEARJ3MGsvOpS5A==
# =3kv7
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 04 Jul 2024 02:56:58 AM PDT
# gpg:                using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg:                issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>" [full]

* tag 'for-upstream' of https://gitlab.com/bonzini/qemu:
  target/i386/SEV: implement mask_cpuid_features
  target/i386: add support for masking CPUID features in confidential guests
  char-stdio: Restore blocking mode of stdout on exit
  target/i386: add avx-vnni-int16 feature
  i386/sev: Fallback to the default SEV device if none provided in sev_get_capabilities()
  i386/sev: Fix error message in sev_get_capabilities()
  target/i386: do not include undefined bits in the AMD topoext leaf
  target/i386: SEV: fix formatting of CPUID mismatch message
  target/i386: drop AMD machine check bits from Intel CPUID
  target/i386: pass X86CPU to x86_cpu_get_supported_feature_word
  meson: Drop the .fa library suffix
  Revert "meson: Propagate gnutls dependency"
  meson: Pass objects and dependencies to declare_dependency()
  meson: merge plugin_ldflags into emulator_link_args
  meson: move block.syms dependency out of libblock
  meson: move shared_module() calls where modules are already walked

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-07-04 09:16:07 -07:00
Paolo Bonzini
c28d8b097f target/i386: add support for masking CPUID features in confidential guests
Some CPUID features may be provided by KVM for some guests, independent of
processor support, for example TSC deadline or TSC adjust.  If these are
not supported by the confidential computing firmware, however, the guest
will fail to start.  Add support for removing unsupported features from
"-cpu host".

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-07-04 07:47:11 +02:00
David Woodhouse
93c76555d8 hw/i386/fw_cfg: Add etc/e820 to fw_cfg late
In e820_add_entry() the e820_table is reallocated with g_renew() to make
space for a new entry. However, fw_cfg_arch_create() just uses the
existing e820_table pointer. This leads to a use-after-free if anything
adds a new entry after fw_cfg is set up.

Shift the addition of the etc/e820 file to the machine done notifier, via
a new fw_cfg_add_e820() function.

Also make e820_table private and use an e820_get_table() accessor function
for it, which sets a flag that will trigger an assert() for any *later*
attempts to add to the table.

Make e820_add_entry() return void, as most callers don't check for error
anyway.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Message-Id: <a2708734f004b224f33d3b4824e9a5a262431568.camel@infradead.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-03 18:14:06 -04:00
Alex Bennée
5b7d54d4ed gdbstub: move enums into separate header
This is an experiment to further reduce the amount we throw into the
exec headers. It might not be as useful as I initially thought because
just under half of the users also need gdbserver_start().

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240620152220.2192768-3-alex.bennee@linaro.org>
2024-06-24 10:14:17 +01:00
Philippe Mathieu-Daudé
8291239113 target/i386: Remove X86CPU::kvm_no_smi_migration field
X86CPU::kvm_no_smi_migration was only used by the
pc-i440fx-2.3 machine, which got removed. Remove it
and simplify kvm_put_vcpu_events().

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20240617071118.60464-23-philmd@linaro.org>
2024-06-19 12:40:49 +02:00
John Allen
1ea1432199 i386: Add support for overflow recovery
Add cpuid bit definition for overflow recovery. This is needed in the case
where a deferred error has been sent to the guest, a guest process accesses the
poisoned memory, but the machine_check_poll function has not yet handled the
original deferred error. If overflow recovery is not set in this case, when we
handle the uncorrected error from the poisoned memory access, the overflow bit
will be set and will result in the guest being shut down.

By the time the MCE reaches the guest, the overflow has been handled
by the host and has not caused a shutdown, so include the bit unconditionally.

Signed-off-by: John Allen <john.allen@amd.com>
Message-ID: <20240603193622.47156-4-john.allen@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-08 10:33:39 +02:00
John Allen
2ba8b7ee63 i386: Add support for SUCCOR feature
Add cpuid bit definition for the SUCCOR feature. This cpuid bit is required to
be exposed to guests to allow them to handle machine check exceptions on AMD
hosts.

----
v2:
  - Add "succor" feature word.
  - Add case to kvm_arch_get_supported_cpuid for the SUCCOR feature.

Reported-by: William Roche <william.roche@oracle.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: John Allen <john.allen@amd.com>
Message-ID: <20240603193622.47156-3-john.allen@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-08 10:33:39 +02:00
John Allen
4b77512b27 i386: Fix MCE support for AMD hosts
For the most part, AMD hosts can use the same MCE injection code as Intel, but
there are instances where the qemu implementation is Intel specific. First, MCE
delivery works differently on AMD and does not support broadcast. Second,
kvm_mce_inject generates MCEs that include a number of Intel specific status
bits. Modify kvm_mce_inject to properly generate MCEs on AMD platforms.

Reported-by: William Roche <william.roche@oracle.com>
Signed-off-by: John Allen <john.allen@amd.com>
Message-ID: <20240603193622.47156-2-john.allen@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-08 10:33:38 +02:00
Xin Li
4ebd98eb3a target/i386: Add get/set/migrate support for FRED MSRs
FRED CPU states are managed in 9 new FRED MSRs, in addtion to a few
existing CPU registers and MSRs, e.g., CR4.FRED and MSR_IA32_PL0_SSP.

Save/restore/migrate FRED MSRs if FRED is exposed to the guest.

Tested-by: Shan Kang <shan.kang@intel.com>
Signed-off-by: Xin Li <xin3.li@intel.com>
Message-ID: <20231109072012.8078-7-xin3.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-08 10:33:38 +02:00
Richard Henderson
f1572ab947 * virtio-blk: remove SCSI passthrough functionality
* require x86-64-v2 baseline ISA
 * SEV-SNP host support
 * fix xsave.flat with TCG
 * fixes for CPUID checks done by TCG
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmZgKVYUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroPKYgf/QkWrNXdjjD3yAsv5LbJFVTVyCYW3
 b4Iax29kEDy8k9wbzfLxOfIk9jXIjmbOMO5ZN9LFiHK6VJxbXslsMh6hm50M3xKe
 49X1Rvf9YuVA7KZX+dWkEuqLYI6Tlgj3HaCilYWfXrjyo6hY3CxzkPV/ChmaeYlV
 Ad4Y8biifoUuuEK8OTeTlcDWLhOHlFXylG3AXqULsUsXp0XhWJ9juXQ60eATv/W4
 eCEH7CSmRhYFu2/rV+IrWFYMnskLRTk1OC1/m6yXGPKOzgnOcthuvQfiUgPkbR/d
 llY6Ni5Aaf7+XX3S7Avcyvoq8jXzaaMzOrzL98rxYGDR1sYBYO+4h4ZToA==
 =qQeP
 -----END PGP SIGNATURE-----

Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging

* virtio-blk: remove SCSI passthrough functionality
* require x86-64-v2 baseline ISA
* SEV-SNP host support
* fix xsave.flat with TCG
* fixes for CPUID checks done by TCG

# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmZgKVYUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroPKYgf/QkWrNXdjjD3yAsv5LbJFVTVyCYW3
# b4Iax29kEDy8k9wbzfLxOfIk9jXIjmbOMO5ZN9LFiHK6VJxbXslsMh6hm50M3xKe
# 49X1Rvf9YuVA7KZX+dWkEuqLYI6Tlgj3HaCilYWfXrjyo6hY3CxzkPV/ChmaeYlV
# Ad4Y8biifoUuuEK8OTeTlcDWLhOHlFXylG3AXqULsUsXp0XhWJ9juXQ60eATv/W4
# eCEH7CSmRhYFu2/rV+IrWFYMnskLRTk1OC1/m6yXGPKOzgnOcthuvQfiUgPkbR/d
# llY6Ni5Aaf7+XX3S7Avcyvoq8jXzaaMzOrzL98rxYGDR1sYBYO+4h4ZToA==
# =qQeP
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 05 Jun 2024 02:01:10 AM PDT
# gpg:                using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg:                issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>" [full]

* tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (46 commits)
  hw/i386: Add support for loading BIOS using guest_memfd
  hw/i386/sev: Use guest_memfd for legacy ROMs
  memory: Introduce memory_region_init_ram_guest_memfd()
  i386/sev: Allow measured direct kernel boot on SNP
  i386/sev: Reorder struct declarations
  i386/sev: Extract build_kernel_loader_hashes
  i386/sev: Enable KVM_HC_MAP_GPA_RANGE hcall for SNP guests
  i386/kvm: Add KVM_EXIT_HYPERCALL handling for KVM_HC_MAP_GPA_RANGE
  i386/sev: Invoke launch_updata_data() for SNP class
  i386/sev: Invoke launch_updata_data() for SEV class
  hw/i386/sev: Add support to encrypt BIOS when SEV-SNP is enabled
  i386/sev: Add support for SNP CPUID validation
  i386/sev: Add support for populating OVMF metadata pages
  hw/i386/sev: Add function to get SEV metadata from OVMF header
  i386/sev: Set CPU state to protected once SNP guest payload is finalized
  i386/sev: Add handling to encrypt/finalize guest launch data
  i386/sev: Add the SNP launch start context
  i386/sev: Update query-sev QAPI format to handle SEV-SNP
  i386/sev: Add a class method to determine KVM VM type for SNP guests
  i386/sev: Don't return launch measurements for SEV-SNP guests
  ...

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-06-05 07:45:23 -07:00
Michael Roth
47e76d03b1 i386/kvm: Add KVM_EXIT_HYPERCALL handling for KVM_HC_MAP_GPA_RANGE
KVM_HC_MAP_GPA_RANGE will be used to send requests to userspace for
private/shared memory attribute updates requested by the guest.
Implement handling for that use-case along with some basic
infrastructure for enabling specific hypercall events.

Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Pankaj Gupta <pankaj.gupta@amd.com>
Message-ID: <20240530111643.1091816-31-pankaj.gupta@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-05 11:01:06 +02:00
Paolo Bonzini
a808132f6d i386/sev: Add a class method to determine KVM VM type for SNP guests
SEV guests can use either KVM_X86_DEFAULT_VM, KVM_X86_SEV_VM,
or KVM_X86_SEV_ES_VM depending on the configuration and what
the host kernel supports. SNP guests on the other hand can only
ever use KVM_X86_SNP_VM, so split determination of VM type out
into a separate class method that can be set accordingly for
sev-guest vs. sev-snp-guest objects and add handling for SNP.

Signed-off-by: Pankaj Gupta <pankaj.gupta@amd.com>
Message-ID: <20240530111643.1091816-14-pankaj.gupta@amd.com>
[Remove unnecessary function pointer declaration. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-05 11:01:06 +02:00
Richard Henderson
a93b4061b0 target/i386/kvm: Improve KVM_EXIT_NOTIFY warnings
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-ID: <20240412073346.458116-28-richard.henderson@linaro.org>
[PMD: Fixed typo reported by Peter Maydell]
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-06-04 10:02:39 +02:00
Zhao Liu
6ddeb0ec8c i386/cpu: Introduce bitmap to cache available CPU topology levels
Currently, QEMU checks the specify number of topology domains to detect
if there's extended topology levels (e.g., checking nr_dies).

With this bitmap, the extended CPU topology (the levels other than SMT,
core and package) could be easier to detect without touching the
topology details.

This is also in preparation for the follow-up to decouple CPUID[0x1F]
subleaf with specific topology level.

Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20240424154929.1487382-10-zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-05-22 19:43:29 +02:00
Paolo Bonzini
663e2f443e target/i386: SEV: use KVM_SEV_INIT2 if possible
Implement support for the KVM_X86_SEV_VM and KVM_X86_SEV_ES_VM virtual
machine types, and the KVM_SEV_INIT2 function of KVM_MEMORY_ENCRYPT_OP.

These replace the KVM_SEV_INIT and KVM_SEV_ES_INIT functions, and have
several advantages:

- sharing the initialization sequence with SEV-SNP and TDX

- allowing arguments including the set of desired VMSA features

- protection against invalid use of KVM_GET/SET_* ioctls for guests
  with encrypted state

If the KVM_X86_SEV_VM and KVM_X86_SEV_ES_VM types are not supported,
fall back to KVM_SEV_INIT and KVM_SEV_ES_INIT (which use the
default x86 VM type).

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-04-23 17:35:25 +02:00
Paolo Bonzini
ee88612df1 target/i386: Implement mc->kvm_type() to get VM type
KVM is introducing a new API to create confidential guests, which
will be used by TDX and SEV-SNP but is also available for SEV and
SEV-ES.  The API uses the VM type argument to KVM_CREATE_VM to
identify which confidential computing technology to use.

Since there are no other expected uses of VM types, delegate
mc->kvm_type() for x86 boards to the confidential-guest-support
object pointed to by ms->cgs.

For example, if a sev-guest object is specified to confidential-guest-support,
like,

  qemu -machine ...,confidential-guest-support=sev0 \
       -object sev-guest,id=sev0,...

it will check if a VM type KVM_X86_SEV_VM or KVM_X86_SEV_ES_VM
is supported, and if so use them together with the KVM_SEV_INIT2
function of the KVM_MEMORY_ENCRYPT_OP ioctl. If not, it will fall back to
KVM_SEV_INIT and KVM_SEV_ES_INIT.

This is a preparatory work towards TDX and SEV-SNP support, but it
will also enable support for VMSA features such as DebugSwap, which
are only available via KVM_SEV_INIT2.

Co-developed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-04-23 17:35:25 +02:00
Paolo Bonzini
a99c0c66eb KVM: remove kvm_arch_cpu_check_are_resettable
Board reset requires writing a fresh CPU state.  As far as KVM is
concerned, the only thing that blocks reset is that CPU state is
encrypted; therefore, kvm_cpus_are_resettable() can simply check
if that is the case.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-04-23 17:35:25 +02:00
Xiaoyao Li
637c95b37b i386/sev: Switch to use confidential_guest_kvm_init()
Use confidential_guest_kvm_init() instead of calling SEV
specific sev_kvm_init(). This allows the introduction of multiple
confidential-guest-support subclasses for different x86 vendors.

As a bonus, stubs are not needed anymore since there is no
direct call from target/i386/kvm/kvm.c to SEV code.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-Id: <20240229060038.606591-1-xiaoyao.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-04-23 17:35:25 +02:00
Sean Christopherson
a5acf4f26c i386/kvm: Move architectural CPUID leaf generation to separate helper
Move the architectural (for lack of a better term) CPUID leaf generation
to a separate helper so that the generation code can be reused by TDX,
which needs to generate a canonical VM-scoped configuration.

For now this is just a cleanup, so keep the function static.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20240229063726.610065-23-xiaoyao.li@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-04-23 17:35:13 +02:00