On LA464, some CSR registers are not used such as CSR_SAVE8 -
CSR_SAVE15, also CSR registers relative with MCE is not used now.
Flag CSRFL_UNUSED is added for these registers, so that it will
not dumped. In order to keep compatiblity, these CSR registers are
not removed since it is used in vmstate already.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Common source file csr.c is added here, it can be used by both
TCG mode and kvm mode. The common code is removed from file
tcg/insn_trans/trans_privileged.c.inc to csrc.c
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Parameter type TCGv and TCGv_ptr for function GenCSRRead and GenCSRWrite
is not used in non-TCG mode. Generic csr function type is added here
with parameter void type, so that it passes to compile with non-TCG mode.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
With CSR register, dynamic function access is used for CSR register
access in TCG mode, so that csr info can be used by other modules.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
The only public interfaces for pl011 are TYPE_PL011 and pl011_create.
Remove pub from everything else.
Note: the "allow(dead_code)" is removed later.
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Do not use new_unchecked; the effect is the same, but the
code is easier to read and unsafe regions become smaller.
Likewise, NonNull::new can be used instead of assertion and
followed by as_ref() or as_mut() instead of dereferencing the
pointer.
Suggested-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Keep vmstate_clock!; because it uses a field of type VMStateDescription,
it cannot be converted to the VMState trait without access to the
const_refs_static feature.
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
It is not type safe, but it's the best that can be done without
const_refs_static. It can also be used with BqlCell and BqlRefCell.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Scalar types are those that have their own VMStateInfo. This poses
a problem in that references to VMStateInfo can only be included in
associated consts starting with Rust 1.83.0, when the const_refs_static
was stabilized. Removing the requirement is done by placing a limited
list of VMStateInfos in an enum, and going from enum to &VMStateInfo
only when building the VMStateField.
The same thing cannot be done with VMS_STRUCT because the set of
VMStateDescriptions extends to structs defined by the devices.
Therefore, structs and cells cannot yet use vmstate_of!.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This shortens a bit the constants. Do not bother using it
in the vmstate macros since most of them will go away soon.
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Arrays, pointers and cells use a VMStateField that is based on that
for the inner type. The implementation therefore delegates to the
VMState implementation of the inner type.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The existing translation of the C macros for vmstate does not make
any attempt to type-check vmstate declarations against the struct, so
introduce a new system that computes VMStateField based on the actual
struct declaration.
Macros do not have full access to the type system, therefore a full
implementation of this scheme requires a helper trait to analyze the
type and produce a VMStateField from it; a macro "vmstate_of!" accepts
arguments similar to "offset_of!" and tricks the compiler into looking
up the trait for the right type.
The patch introduces not just vmstate_of!, but also the slightly too
clever enabling macro call_func_with_field!. The particular trick used
here was proposed on the users.rust-lang.org forum, so I take no merit
and all the blame.
Introduce the trait and some functions to access it; the actual
implementation comes later.
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Make sure MemTxAttrs is packed into 8 bytes and does not exceed 8 bytes.
Suggested-by: Philippe Mathieu-Daudà <philmd@linaro.org>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20250121151322.171832-3-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Convert `unspecified` member of MemTxAttrs from bit field to bool, so
that bindgen could generate more ergonomic Rust binding with bool type.
As a result, MemTxAttrs needs to be expanded from 4 bytes to 8 bytes.
Therefore, move `unspecified` to after the bit fields and add reserved
members to ensure that the whole structure is packed into 8 bytes.
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20250121151322.171832-2-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
List all the necessary bindings to better identify gaps in rust/qapi.
And include the bindings wrapped by rust/qapi instead mapping the raw
bindings directly.
Inspired-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20250121140457.84631-3-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
A safe REALIZE accepts immutable reference.
Since current PL011's realize() only calls a char binding function (
qemu_chr_fe_set_handlers), it is possible to convert mutable reference
(&mut self) to immutable reference (&self), which only needs to convert
the pointers passed to C to mutable pointers.
Thus, make REALIZE accept immutable reference.
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20250121140457.84631-2-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Configuring "--enable-user --disable-system --enable-tools" causes the
build failure with the following information:
/usr/bin/ld: libhwcore.a.p/hw_core_qdev.c.o: in function `device_finalize':
/qemu/build/../hw/core/qdev.c:688: undefined reference to `qapi_event_send_device_deleted'
collect2: error: ld returned 1 exit status
To fix the above issue, add qdev.c stub when build with `have_tools`.
With this fix, QEMU could be successfully built in the following cases:
--enable-user --disable-system --enable-tools
--enable-user --disable-system --disable-tools
--enable-user --disable-system
Cc: qemu-stable@nongnu.org
Fixes: 388b849fb6 ("stubs: avoid duplicate symbols in libqemuutil.a")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2766
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20250121154318.214680-1-zhao1.liu@intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Update GraniteRapids, SierraForest and ClearwaterForest CPU models in
section "Preferred CPU models for Intel x86 hosts".
Also introduce bhi-no, gds-no and rfds-no in doc.
Suggested-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Tao Su <tao1.su@linux.intel.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20250121020650.1899618-5-tao1.su@linux.intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
According to table 1-2 in Intel Architecture Instruction Set Extensions
and Future Features (rev 056) [1], ClearwaterForest has the following new
features which have already been virtualized:
- AVX-VNNI-INT16 CPUID.(EAX=7,ECX=1):EDX[bit 10]
- SHA512 CPUID.(EAX=7,ECX=1):EAX[bit 0]
- SM3 CPUID.(EAX=7,ECX=1):EAX[bit 1]
- SM4 CPUID.(EAX=7,ECX=1):EAX[bit 2]
Add above features to new CPU model ClearwaterForest. Comparing with
SierraForest, ClearwaterForest bare-metal contains all features of
SierraForest-v2 CPU model and adds:
- PREFETCHI CPUID.(EAX=7,ECX=1):EDX[bit 14]
- DDPD_U CPUID.(EAX=7,ECX=2):EDX[bit 3]
- BHI_NO IA32_ARCH_CAPABILITIES[bit 20]
Add above and all features of SierraForest-v2 CPU model to new CPU model
ClearwaterForest.
[1] https://cdrdv2.intel.com/v1/dl/getContent/671368
Tested-by: Xuelian Guo <xuelian.guo@intel.com>
Signed-off-by: Tao Su <tao1.su@linux.intel.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20250121020650.1899618-4-tao1.su@linux.intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Branch History Injection (BHI) is a CPU side-channel vulnerability, where
an attacker may manipulate branch history before transitioning from user
to supervisor mode or from VMX non-root/guest to root mode. CPUs that set
BHI_NO bit in MSR IA32_ARCH_CAPABILITIES to indicate no additional
mitigation is required to prevent BHI.
Make BHI_NO bit available to guests.
Tested-by: Xuelian Guo <xuelian.guo@intel.com>
Signed-off-by: Tao Su <tao1.su@linux.intel.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20250121020650.1899618-3-tao1.su@linux.intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Update SierraForest CPU model to add LAM, 4 bits indicating certain bits
of IA32_SPEC_CTR are supported(intel-psfd, ipred-ctrl, rrsba-ctrl,
bhi-ctrl) and the missing features(ss, tsc-adjust, cldemote, movdiri,
movdir64b)
Also add GDS-NO and RFDS-NO to indicate the related vulnerabilities are
mitigated in stepping 3.
Tested-by: Xuelian Guo <xuelian.guo@intel.com>
Signed-off-by: Tao Su <tao1.su@linux.intel.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Link: https://lore.kernel.org/r/20250121020650.1899618-2-tao1.su@linux.intel.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
For updates to implicit registers (RCX in LOOP instructions, RSI or RDI
in string instructions, or the stack pointer) do the add directly using
the registers (with no temporary) if 32-bit or 64-bit, or use a temporary
created for the occasion if 16-bit. This is more efficient and removes
move instructions for the MO_TL case.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Link: https://lore.kernel.org/r/20241215090613.89588-14-pbonzini@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Now that everything has been cleaned up, look at DF and prefixes
in a single function, and call that one from gen_repz and gen_repz_nz.
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This is a common operation that is executed many times in rep
movs or rep stos loops. It can improve performance by several
percentage points.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Link: https://lore.kernel.org/r/20241215090613.89588-13-pbonzini@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use a TCG loop so that it is not necessary to go through the setup steps
of REP and through the I/O check on every iteration. Interestingly, this
is not a particularly effective optimization on its own, though it avoids
the cost of correct RF emulation that was added in the previous patch.
The main benefit lies in allowing the hoisting of loop invariants outside
the loop, which will happen separately.
The loop exits when the low 16 bits of CX/ECX/RCX are zero (so generally
speaking the string operation runs in 65536 iteration batches) to give
the main loop an opportunity to pick up interrupts.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Link: https://lore.kernel.org/r/20241215090613.89588-12-pbonzini@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In a repeated string operation, CX/ECX will be decremented until it
is 0 but never underflow. Use this observation to avoid a deposit or
zero-extend operation if the address size of the operation is smaller
than MO_TL.
As in the previous patch, the patch is structured to include some
preparatory work for subsequent changes. In particular, introducing
cx_next prepares for when ECX will be decremented *before* calling
fn(s, ot), and therefore cannot yet be written back to cpu_regs.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Link: https://lore.kernel.org/r/20241215090613.89588-11-pbonzini@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Explicitly generate a TSTEQ branch (which is optimized to NE x,0 if possible).
This does not make much sense yet, but later we will add more checks and some
will use a temporary to check on the decremented value of CX/ECX/RCX; it will
be clearer for all checks to share the same logic using TSTEQ(reg, cx_mask).
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Link: https://lore.kernel.org/r/20241215090613.89588-10-pbonzini@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Since the cost of gen_update_cc_op() must be paid anyway, it's easier
to place them manually and not rely on spilling that is buried under
multiple levels of function calls. While at it, clarify the circumstances
in which the gen_update_cc_op() is needed, and why it is not for REPxx
SCAS and REPxx CMPS.
And since cc_op will have been spilled at the point of a fault, just
make the whole insn CC_OP_DYNAMIC. Once repz_opt is reintroduced,
a fault could happen either before or after the first execution of
CMPS/SCAS, and CC_OP_DYNAMIC sidesteps the complicated matter of what
x86_restore_state_to_opc would do.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Link: https://lore.kernel.org/r/20241215090613.89588-9-pbonzini@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
RF must be set on traps and interrupts from a string instruction,
except if they occur after the last iteration. Ensure it is set
before giving the main loop a chance to execute.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Link: https://lore.kernel.org/r/20241215090613.89588-8-pbonzini@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Allow using them in the code that translates REP/REPZ, without
forward declarations.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Link: https://lore.kernel.org/r/20241215090613.89588-7-pbonzini@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The condition for optimizing repeat instruction is more or less the
opposite of what you imagine: almost always the string instruction
was _not_ optimized and optimizing the loop relied on goto_tb.
This is obviously not great for performance, due to the cost of the
exit-to-main-loop check, but also wrong. In fact, after expanding
dc->jmp_opt and simplifying "!!x" to "x", the condition for looping used
to be:
((cflags & CF_NO_GOTO_TB) ||
(flags & (HF_RF_MASK | HF_TF_MASK | HF_INHIBIT_IRQ_MASK))) && !(cflags & CF_USE_ICOUNT)
In other words, setting aside RF (it requires special handling for REP
instructions and it was completely missing), repeat instruction were
being optimized if TF or inhibit IRQ flags were set. This is certainly
wrong for TF, because string instructions trap after every execution,
and probably for interrupt shadow too.
Get rid of repz_opt completely. The next patches will reintroduce the
optimization, applying it in the common case instead of the unlikely
and wrong one.
While at it, place the CX/ECX/RCX=0 case is at the end of the function,
which saves a label and is clearer when reading the generated ops.
For clarity, mark the cc_op explicitly as DYNAMIC even if at the end
of the translation block; the cc_op can come from either the previous
instruction or the string instruction, and currently we rely on
a gen_update_cc_op() that is hidden in the bowels of gen_jcc() to
spill cc_op and mark it clean.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Link: https://lore.kernel.org/r/20241215090613.89588-6-pbonzini@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The same "if" is present in all generator functions for string instructions.
Push it inside gen_repz() and gen_repz_nz() instead.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Link: https://lore.kernel.org/r/20241215090613.89588-5-pbonzini@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
It only differs in a single call to gen_jcc, so use a "bool" argument
to distinguish the two cases; do not duplicate code.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Link: https://lore.kernel.org/r/20241215090613.89588-4-pbonzini@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This is not needed anymore now that gen_jcc has been eliminated
(merged into the similarly-named gen_Jcc, where the uppercase letter
gives away that it is an emission function).
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Link: https://lore.kernel.org/r/20241215090613.89588-3-pbonzini@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The code of gen_Jcc is very similar to gen_LOOP* and gen_JCXZ, but this
is hidden by gen_jcc.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Link: https://lore.kernel.org/r/20241215090613.89588-2-pbonzini@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Fix the problem with the non-quiesced virtio-net device and
make sure to abort the boot process if the user specified a wrong
loadparm parameter.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Because the loadparm specifies an exact kernel the user wants to boot, if the
loadparm is invalid it must represent a misconfiguration of the guest. Thus we
should abort the IPL immediately, without attempting to use other devices, to
avoid booting into an unintended guest image.
Signed-off-by: Jared Rossi <jrossi@linux.ibm.com>
Message-ID: <20250117212235.1324063-2-jrossi@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The code in net_init_ip() currently bails out early if "rc" is less
than 0, so the if-statements that check for negative "rc" codes to
print out some specific error messages with regards to the TFTP server
are never reached. Move them earlier to bring that dead code back to
life.
Reviewed-by: Jared Rossi <jrossi@linux.ibm.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Tested-by: Jared Rossi <jrossi@linux.ibm.com>
Message-ID: <20250116115826.192047-4-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
When we are trying to boot from virtio-net devices, the
s390-ccw bios currently leaves the virtio-net device enabled
after using it. That means that the receiving virt queues will
continue to happily write incoming network packets into memory.
This can corrupt data of the following boot process. For example,
if you set up a second guest on a virtual network and create a
lot of broadcast traffic there, e.g. with:
ping -i 0.02 -s 1400 -b 192.168.1.255
and then you try to boot a guest with two boot devices, a network
device first (which should not be bootable) and e.g. a bootable SCSI
CD second, then this guest will fail to load the kernel from the CD
image:
$ qemu-system-s390x -m 2G -nographic -device virtio-scsi-ccw \
-netdev tap,id=net0 -device virtio-net-ccw,netdev=net0,bootindex=1 \
-drive if=none,file=test.iso,format=raw,id=cd1 \
-device scsi-cd,drive=cd1,bootindex=2
LOADPARM=[ ]
Network boot device detected
Network boot starting...
Using MAC address: 52:54:00:12:34:56
Requesting information via DHCP: done
Using IPv4 address: 192.168.1.76
Using TFTP server: 192.168.1.1
Trying pxelinux.cfg files...
TFTP error: ICMP ERROR "port unreachable"
Receiving data: 0 KBytes
Repeating TFTP read request...
TFTP error: ICMP ERROR "port unreachable"
Failed to load OS from network.
Failed to IPL from this network!
LOADPARM=[ ]
Using virtio-scsi.
! virtio-scsi:setup:inquiry: response VS RESP=ff !
ERROR: No suitable device for IPL. Halting...
We really have to shut up the virtio-net devices after we're not
using it anymore. The easiest way to do this is to simply reset
the device, so let's do that now.
Reviewed-by: Jared Rossi <jrossi@linux.ibm.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Tested-by: Jared Rossi <jrossi@linux.ibm.com>
Message-ID: <20250116115826.192047-3-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
To be able to properly silence a virtio device after using it,
we need a global function to reset the device.
Reviewed-by: Jared Rossi <jrossi@linux.ibm.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Tested-by: Jared Rossi <jrossi@linux.ibm.com>
Message-ID: <20250116115826.192047-2-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>