The TYPE_AARCH64_CPU class is an abstract type that is the parent of
all the AArch64 CPUs. It now has no special behaviour of its own, so
we can eliminate it and make the AArch64 CPUs directly inherit from
TYPE_ARM_CPU.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20250429132200.605611-8-peter.maydell@linaro.org
This macro is used by only one target, and even then under
unusual conditions -- AArch64 with mmap's PROT_MTE flag.
Since page size for aarch64-linux-user is variable, the
per-page data size is also variable.
Since page_reset_target_data via target_munmap does not
have ready access to CPUState, simply pass in the size
from the first allocation and remember that.
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
For some targets, simply remove the local definition.
For other targets, move the inline definition out of line.
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Mechanical change using gsed, then style manually adapted
to pass checkpatch.pl script.
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20250424194905.82506-4-philmd@linaro.org>
To avoid including the huge "cpu.h" for a simple definition,
move TARGET_INSN_START_EXTRA_WORDS to "cpu-param.h".
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
To eliminate TARGET_AARCH64, we need to make various definitions common
between 32 and 64 bit Arm targets.
Added registers are used only by aarch64 code, and the only impact is on
the size of CPUARMState, and added zarray
(ARMVectorReg zarray[ARM_MAX_VQ * 16]) member (+64KB)
It could be eventually possible to allocate this array only for aarch64
emulation, but I'm not sure it's worth the hassle to save a few KB per
vcpu. Running qemu-system takes already several hundreds of MB of
(resident) memory, and qemu-user takes dozens of MB of (resident) memory
anyway.
As part of this, we define ARM_MAX_VQ once for aarch32 and aarch64,
which will affect zregs field for aarch32.
This field is used for MVE and SVE implementations. MVE implementation
is clipping index value to 0 or 1 for zregs[*].d[],
so we should not touch the rest of data in this case anyway.
This change is safe regarding migration, because aarch64 registers still
have the same size, and for aarch32, only zregs is modified.
Migration code explicitly specify a size of 2 for env.vfp.zregs[0].d,
VMSTATE_UINT64_SUB_ARRAY(env.vfp.zregs[0].d, ARMCPU, 0, 2). So extending
the storage size has no impact.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20250325045915.994760-22-pierrick.bouvier@linaro.org>
Do not rely on target dependent type, but use a fixed type instead.
Since the original type is unsigned, it is safe to extend its size
without any side effect.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20250325045915.994760-21-pierrick.bouvier@linaro.org>
This does not hurt, even if they are not used.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20250325045915.994760-20-pierrick.bouvier@linaro.org>
This define is used only in accel/kvm/kvm-all.c, so we push directly the
definition there. Add more visibility to kvm_arch_on_sigbus_vcpu() to
allow removing this define from any header.
The architectures defining KVM_HAVE_MCE_INJECTION are i386, x86_64 and
aarch64.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20250325045915.994760-18-pierrick.bouvier@linaro.org>
The functions arm_current_el() and arm_el_is_aa64() are used only in
target/arm and in hw/intc/arm_gicv3_cpuif.c. They're functions that
query internal state of the CPU. Move them out of cpu.h and into
internals.h.
This means we need to include internals.h in arm_gicv3_cpuif.c, but
this is justifiable because that file is implementing the GICv3 CPU
interface, which really is part of the CPU proper; we just ended up
implementing it in code in hw/intc/ for historical reasons.
The motivation for this move is that we'd like to change
arm_el_is_aa64() to add a condition that uses cpu_isar_feature();
but we don't want to include cpu-features.h in cpu.h.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
The arm_cpu_data_is_big_endian() and related functions are now used
only in target/arm; they can be moved to internals.h.
The motivation here is that we would like to move arm_current_el()
to internals.h.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
We would like to move arm_el_is_aa64() to internals.h; however, it is
used by access_secure_reg(). Make that function not be inline, so
that it can stay in cpu.h.
access_secure_reg() is used only in two places:
* in hflags.c
* in the user-mode arm emulators, to decide whether to store
the TLS value in the secure or non-secure banked field
The second of these is not on a super-hot path that would care about
the inlining (and incidentally will always use the NS banked field
because our user-mode CPUs never set ARM_FEATURE_EL3); put the
definition of access_secure_reg() in hflags.c, near its only use
inside target/arm.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
The A32_BANKED_REG_{GET,SET} macros are only used inside target/arm;
move their definitions to cpregs.h. There's no need to have them
defined in all the code that includes cpu.h.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
When FEAT_SEL2 was implemented the SEL2 timers were missed. This
shows up when building the latest Hafnium with SPMC_AT_EL=2. The
actual implementation utilises the same logic as the rest of the
timers so all we need to do is:
- define the timers and their access functions
- conditionally add the correct system registers
- create a new accessfn as the rules are subtly different to the
existing secure timer
Fixes: e9152ee91c (target/arm: add ARMv8.4-SEL2 system registers)
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20250204125009.2281315-7-peter.maydell@linaro.org
Cc: qemu-stable@nongnu.org
Cc: Andrei Homescu <ahomescu@google.com>
Cc: Arve Hjønnevåg <arve@google.com>
Cc: Rémi Denis-Courmont <remi.denis.courmont@huawei.com>
[PMM: CP_ACCESS_TRAP_UNCATEGORIZED -> CP_ACCESS_UNDEFINED;
offset logic now in gt_{indirect,direct}_access_timer_offset() ]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
There are not many traps in AArch32 which should trap to Monitor
mode, but these trap bits should trap not just lower ELs to Monitor
mode but also the non-Monitor modes running at EL3 (i.e. Secure
System, Secure Undef, etc).
We get this wrong because the relevant access functions implement the
AArch64-style logic of
if (el < 3 && trap_bit_set) {
return CP_ACCESS_TRAP_EL3;
}
which won't trap the non-Monitor modes at EL3.
Correct this error by using arm_is_el3_or_mon() instead, which
returns true when the CPU is at AArch64 EL3 or AArch32 Monitor mode.
(Since the new callsites are compiled also for the linux-user mode,
we need to provide a dummy implementation for CONFIG_USER_ONLY.)
This affects only:
* trapping of ERRIDR via SCR.TERR
* trapping of the debug channel registers via SDCR.TDCC
* trapping of GICv3 registers via SCR.IRQ and SCR.FIQ
(which we already used arm_is_el3_or_mon() for)
This patch changes the handling of SCR.TERR and SDCR.TDCC. This
patch only changes guest-visible behaviour for "-cpu max" on
the qemu-system-arm binary, because SCR.TERR
and SDCR.TDCC (and indeed the entire SDCR register) only arrived
in Armv8, and the only guest CPU we support which has any v8
features and also starts in AArch32 EL3 is the 32-bit 'max'.
Other uses of CP_ACCESS_TRAP_EL3 don't need changing:
* uses in code paths that can't happen when EL3 is AArch32:
access_trap_aa32s_el1, cpacr_access, cptr_access, nsacr_access
* uses which are in accessfns for AArch64-only registers:
gt_stimer_access, gt_cntpoff_access, access_hxen, access_tpidr2,
access_smpri, access_smprimap, access_lor_ns, access_pauth,
access_mte, access_tfsr_el2, access_scxtnum, access_fgt
* trap bits which exist only in the AArch64 version of the
trap register, not the AArch32 one:
access_tpm, pmreg_access, access_dbgvcr32, access_tdra,
access_tda, access_tdosa (TPM, TDA and TDOSA exist only in
MDCR_EL3, not in SDCR, and we enforce this in sdcr_write())
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20250130182309.717346-8-peter.maydell@linaro.org
In system register access pseudocode the common pattern for
AArch32 registers with access traps to EL3 is:
at EL1 and EL2:
if HaveEL(EL3) && !ELUsingAArch32(EL3) && (SCR_EL3.TERR == 1) then
AArch64.AArch32SystemAccessTrap(EL3, 0x03);
elsif HaveEL(EL3) && ELUsingAArch32(EL3) && (SCR.TERR == 1) then
AArch32.TakeMonitorTrapException();
at EL3:
if (PSTATE.M != M32_Monitor) && (SCR.TERR == 1) then
AArch32.TakeMonitorTrapException();
(taking as an example the ERRIDR access pseudocode).
This implements the behaviour of (in this case) SCR.TERR that
"Accesses to the specified registers from modes other than Monitor
mode generate a Monitor Trap exception" and of SCR_EL3.TERR that
"Accesses of the specified Error Record registers at EL2 and EL1
are trapped to EL3, unless the instruction generates a higher
priority exception".
In QEMU we don't implement this pattern correctly in two ways:
* in access_check_cp_reg() we turn the CP_ACCESS_TRAP_EL3 into
an UNDEF, not a trap to Monitor mode
* in the access functions, we check trap bits like SCR.TERR
only when arm_current_el(env) < 3 -- this is correct for
AArch64 EL3, but misses the "trap non-Monitor-mode execution
at EL3 into Monitor mode" case for AArch32 EL3
In this commit we fix the first of these two issues, by
making access_check_cp_reg() handle CP_ACCESS_TRAP_EL3
as a Monitor trap. This is a kind of exception that we haven't
yet implemented(!), so we need a new EXCP_MON_TRAP for it.
This diverges from the pseudocode approach, where every access check
function explicitly checks for "if EL3 is AArch32" and takes a
monitor trap; if we wanted to be closer to the pseudocode we could
add a new CP_ACCESS_TRAP_MONITOR and make all the accessfns use it
when appropriate. But because there are no non-standard cases in the
pseudocode (i.e. where either it raises a Monitor trap that doesn't
correspond to an AArch64 SystemAccessTrap or where it raises a
SystemAccessTrap that doesn't correspond to a Monitor trap), handling
this all in one place seems less likely to result in future bugs
where we forgot again about this special case when writing an
accessor.
(The cc of stable here is because "hw/intc/arm_gicv3_cpuif: Don't
downgrade monitor traps for AArch32 EL3" which is also cc:stable
will implicitly use the new EXCP_MON_TRAP code path.)
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20250130182309.717346-6-peter.maydell@linaro.org
Replace with fp_status[FPST_A32]. As this was the last of the
old structures, we can remove the anonymous union and struct.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20250129013857.135256-15-richard.henderson@linaro.org
[PMM: tweak to account for change to is_ebf()]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Replace with fp_status[FPST_A64].
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20250129013857.135256-14-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Replace with fp_status[FPST_A32_F16].
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20250129013857.135256-13-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Replace with fp_status[FPST_A64_F16].
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20250129013857.135256-12-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Replace with fp_status[FPST_AH].
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20250129013857.135256-11-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Replace with fp_status[FPST_AH_F16].
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20250129013857.135256-10-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Replace with fp_status[FPST_STD].
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20250129013857.135256-9-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Replace with fp_status[FPST_STD_F16].
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20250129013857.135256-8-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Move ARMFPStatusFlavour to cpu.h with which to index
this array. For now, place the array in an anonymous
union with the existing structures. Adjust the order
of the existing structures to match the enum.
Simplify fpstatus_ptr() using the new array.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20250129013857.135256-7-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
For FEAT_AFP, we want to emit different code when FPCR.NEP is set, so
that instead of zeroing the high elements of a vector register when
we write the output of a scalar operation to it, we instead merge in
those elements from one of the source registers. Since this affects
the generated code, we need to put FPCR.NEP into the TBFLAGS.
FPCR.NEP is treated as 0 when in streaming SVE mode and FEAT_SME_FA64
is not implemented or not enabled; we can implement this logic in
rebuild_hflags_a64().
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
When FPCR.AH is 1, the behaviour of some instructions changes:
* AdvSIMD BFCVT, BFCVTN, BFCVTN2, BFMLALB, BFMLALT
* SVE BFCVT, BFCVTNT, BFMLALB, BFMLALT, BFMLSLB, BFMLSLT
* SME BFCVT, BFCVTN, BFMLAL, BFMLSL (these are all in SME2 which
QEMU does not yet implement)
* FRECPE, FRECPS, FRECPX, FRSQRTE, FRSQRTS
The behaviour change is:
* the instructions do not update the FPSR cumulative exception flags
* trapped floating point exceptions are disabled (a no-op for QEMU,
which doesn't implement FPCR.{IDE,IXE,UFE,OFE,DZE,IOE})
* rounding is always round-to-nearest-even regardless of FPCR.RMode
* denormalized inputs and outputs are always flushed to zero, as if
FPCR.{FZ,FIZ} is {1,1}
* FPCR.FZ16 is still honoured for half-precision inputs
(See the Arm ARM DDI0487L.a section A1.5.9.)
We can provide all these behaviours with another pair of float_status fields
which we use only for these insns, when FPCR.AH is 1. These float_status
fields will always have:
* flush_to_zero and flush_inputs_to_zero set for the non-F16 field
* rounding mode set to round-to-nearest-even
and so the only FPCR fields they need to honour are DN and FZ16.
In this commit we only define the new fp_status fields and give them
the required behaviour when FPSR is updated. In subsequent commits
we will arrange to use this new fp_status field for the instructions
that should be affected by FPCR.AH in this way.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
We are going to need to generate different code in some cases when
FPCR.AH is 1. For example:
* Floating point neg and abs must not flip the sign bit of NaNs
* some insns (FRECPE, FRECPS, FRECPX, FRSQRTE, FRSQRTS, and various
BFCVT and BFM bfloat16 ops) need to use a different float_status
to the usual one
Encode FPCR.AH into the A64 tbflags, so we can refer to it at
translate time.
Because we now have a bit in FPCR that affects codegen, we can't mark
the AArch64 FPCR register as being SUPPRESS_TB_END any more; writes
to it will now end the TB and trigger a regeneration of hflags.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
The Armv8.7 FEAT_AFP feature defines three new control bits in
the FPCR:
* FPCR.AH: "alternate floating point mode"; this changes floating
point behaviour in a variety of ways, including:
- the sign of a default NaN is 1, not 0
- if FPCR.FZ is also 1, denormals detected after rounding
with an unbounded exponent has been applied are flushed to zero
- FPCR.FZ does not cause denormalized inputs to be flushed to zero
- miscellaneous other corner-case behaviour changes
* FPCR.FIZ: flush denormalized numbers to zero on input for
most instructions
* FPCR.NEP: makes scalar SIMD operations merge the result with
higher vector elements in one of the source registers, instead
of zeroing the higher elements of the destination
This commit defines the new bits in the FPCR, and allows them to be
read or written when FEAT_AFP is implemented. Actual behaviour
changes will be implemented in subsequent commits.
Note that these are the first FPCR bits which don't appear in the
AArch32 FPSCR view of the register, and which share bit positions
with FPSR bits.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
The pxa2xx CPUs are now only useful with user-mode emulation, because
we dropped all the machine types that used them in 9.2. (Technically
you could alse use "-cpu pxa270" with a board model like versatilepb
which doesn't sanity-check the CPU type, but that has never been a
supported config.)
To use them (or iwMMXt emulation) with QEMU user-mode you would need
to explicitly select them with the -cpu option or the QEMU_CPU
environment variable. A google search finds no examples of anybody
doing this in the last decade; I don't believe the GCC folks are
using QEMU to test their iwMMXt codegen either. In fact, GCC is in
the process of dropping support for iwMMXT entirely.
The iwMMXt emulation is thousands of lines of code in QEMU, and
is now the only bit of Arm insn decode which doesn't use decodetree.
We have no way to test or validate changes to it. This code is
just dead weight that is almost certainly not being used by anybody.
Mark it as deprecated.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20250127112715.2936555-2-peter.maydell@linaro.org
Now we have moved all the uses of vfp.fp_status_f16 and FPST_FPCR_F16
to the new A32 or A64 fields, we can remove these.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20250124162836.2332150-19-peter.maydell@linaro.org
As the first part of splitting the existing fp_status_f16
into separate float_status fields for AArch32 and AArch64
(so that we can make FEAT_AFP control bits apply only
for AArch64), define the two new fp_status_f16_a32 and
fp_status_f16_a64 fields, but don't use them yet.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20250124162836.2332150-14-peter.maydell@linaro.org
Now we have moved all the uses of vfp.fp_status and FPST_FPCR
to either the A32 or A64 fields, we can remove these.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20250124162836.2332150-13-peter.maydell@linaro.org
We want to split the existing fp_status in the Arm CPUState into
separate float_status fields for AArch32 and AArch64. (This is
because new control bits defined by FEAT_AFP only have an effect for
AArch64, not AArch32.) To make this split we will:
* define new fp_status_a32 and fp_status_a64 which have
identical behaviour to the existing fp_status
* move existing uses of fp_status to fp_status_a32 or
fp_status_a64 as appropriate
* delete the old fp_status when it has no uses left
In this patch we add the new float_status fields.
We will also need to split fp_status_f16, but we will do that
as a separate series of patches.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20250124162836.2332150-7-peter.maydell@linaro.org
Pointer authentication on aarch64 is pretty expensive (up to 50% of
execution time) when running a virtual machine with tcg and -cpu max
(which enables pauth=on).
The advice is always: use pauth-impdef=on.
Our documentation even mentions it "by default" in
docs/system/introduction.rst.
Thus, we change the default to use impdef by default. This does not
affect kvm or hvf acceleration, since pauth algorithm used is the one
from host cpu.
This change is retro compatible, in terms of cli, with previous
versions, as the semantic of using -cpu max,pauth-impdef=on, and -cpu
max,pauth-qarma3=on is preserved.
The new option introduced in previous patch and matching old default is
-cpu max,pauth-qarma5=on.
It is retro compatible with migration as well, by defining a backcompat
property, that will use qarma5 by default for virt machine <= 9.2.
Tested by saving and restoring a vm from qemu 9.2.0 into qemu-master
(10.0) for cpus neoverse-n2 and max.
Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241219183211.3493974-3-pierrick.bouvier@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Before changing default pauth algorithm, we need to make sure current
default one (QARMA5) can still be selected.
$ qemu-system-aarch64 -cpu max,pauth-qarma5=on ...
Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241219183211.3493974-2-pierrick.bouvier@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Move the #endif guard where it belongs to restrict
the cpu_untagged_addr() implementation to user
emulation.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20241114011310.3615-11-philmd@linaro.org>
FEAT_CMOW introduces support for controlling cache maintenance
instructions executed in EL0/1 and is mandatory from Armv8.8.
On real hardware, the main use for this feature is to prevent processes
from invalidating or flushing cache lines for addresses they only have
read permission, which can impact the performance of other processes.
QEMU implements all cache instructions as NOPs, and, according to rule
[1], which states that generating any Permission fault when a cache
instruction is implemented as a NOP is implementation-defined, no
Permission fault is generated for any cache instruction when it lacks
read and write permissions.
QEMU does not model any cache topology, so the PoU and PoC are before
any cache, and rules [2] apply. These rules state that generating any
MMU fault for cache instructions in this topology is also
implementation-defined. Therefore, for FEAT_CMOW, we do not generate any
MMU faults either, instead, we only advertise it in the feature
register.
[1] Rule R_HGLYG of section D8.14.3, Arm ARM K.a.
[2] Rules R_MZTNR and R_DNZYL of section D8.14.3, Arm ARM K.a.
Signed-off-by: Gustavo Romero <gustavo.romero@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241104142606.941638-1-gustavo.romero@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Our current usage of MMU indexes when EL3 is AArch32 is confused.
Architecturally, when EL3 is AArch32, all Secure code runs under the
Secure PL1&0 translation regime:
* code at EL3, which might be Mon, or SVC, or any of the
other privileged modes (PL1)
* code at EL0 (Secure PL0)
This is different from when EL3 is AArch64, in which case EL3 is its
own translation regime, and EL1 and EL0 (whether AArch32 or AArch64)
have their own regime.
We claimed to be mapping Secure PL1 to our ARMMMUIdx_EL3, but didn't
do anything special about Secure PL0, which meant it used the same
ARMMMUIdx_EL10_0 that NonSecure PL0 does. This resulted in a bug
where arm_sctlr() incorrectly picked the NonSecure SCTLR as the
controlling register when in Secure PL0, which meant we were
spuriously generating alignment faults because we were looking at the
wrong SCTLR control bits.
The use of ARMMMUIdx_EL3 for Secure PL1 also resulted in the bug that
we wouldn't honour the PAN bit for Secure PL1, because there's no
equivalent _PAN mmu index for it.
Fix this by adding two new MMU indexes:
* ARMMMUIdx_E30_0 is for Secure PL0
* ARMMMUIdx_E30_3_PAN is for Secure PL1 when PAN is enabled
The existing ARMMMUIdx_E3 is used to mean "Secure PL1 without PAN"
(and would be named ARMMMUIdx_E30_3 in an AArch32-centric scheme).
These extra two indexes bring us up to the maximum of 16 that the
core code can currently support.
This commit:
* adds the new MMU index handling to the various places
where we deal in MMU index values
* adds assertions that we aren't AArch32 EL3 in a couple of
places that currently use the E10 indexes, to document why
they don't also need to handle the E30 indexes
* documents in a comment why regime_has_2_ranges() doesn't need
updating
Notes for backporting: this commit depends on the preceding revert of
4c2c047469; that revert and this commit should probably be
backported to everywhere that we originally backported 4c2c047469.
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2326
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2588
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241101142845.1712482-3-peter.maydell@linaro.org
This reverts commit 4c2c047469.
This commit tried to fix a problem with our usage of MMU indexes when
EL3 is AArch32, using what it described as a "more complicated
approach" where we share the same MMU index values for Secure PL1&0
and NonSecure PL1&0. In theory this should work, but the change
didn't account for (at least) two things:
(1) The design change means we need to flush the TLBs at any point
where the CPU state flips from one to the other. We already flush
the TLB when SCR.NS is changed, but we don't flush the TLB when we
take an exception from NS PL1&0 into Mon or when we return from Mon
to NS PL1&0, and the commit didn't add any code to do that.
(2) The ATS12NS* address translate instructions allow Mon code (which
is Secure) to do a stage 1+2 page table walk for NS. I thought this
was OK because do_ats_write() does a page table walk which doesn't
use the TLBs, so because it can pass both the MMU index and also an
ARMSecuritySpace argument we can tell the table walk that we want NS
stage1+2, not S. But that means that all the code within the ptw
that needs to find e.g. the regime EL cannot do so only with an
mmu_idx -- all these functions like regime_sctlr(), regime_el(), etc
would need to pass both an mmu_idx and the security_space, so they
can tell whether this is a translation regime controlled by EL1 or
EL3 (and so whether to look at SCTLR.S or SCTLR.NS, etc).
In particular, because regime_el() wasn't updated to look at the
ARMSecuritySpace it would return 1 even when the CPU was in Monitor
mode (and the controlling EL is 3). This meant that page table walks
in Monitor mode would look at the wrong SCTLR, TCR, etc and would
generally fault when they should not.
Rather than trying to make the complicated changes needed to rescue
the design of 4c2c047469, we revert it in order to instead take the
route that that commit describes as "the most straightforward" fix,
where we add new MMU indexes EL30_0, EL30_3, EL30_3_PAN to correspond
to "Secure PL1&0 at PL0", "Secure PL1&0 at PL1", and "Secure PL1&0 at
PL1 with PAN".
This revert will re-expose the "spurious alignment faults in
Secure PL0" issue #2326; we'll fix it again in the next commit.
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Thomas Huth <thuth@redhat.com>
Message-id: 20241101142845.1712482-2-peter.maydell@linaro.org
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Extend the 'mte' property for the virt machine to cover KVM as
well. For KVM, we don't allocate tag memory, but instead enable
the capability.
If MTE has been enabled, we need to disable migration, as we do not
yet have a way to migrate the tags as well. Therefore, MTE will stay
off with KVM unless requested explicitly.
[gankulkarni: This patch is rework of commit b320e21c48
which broke TCG since it made the TCG -cpu max
report the presence of MTE to the guest even if the board hadn't
enabled MTE by wiring up the tag RAM. This meant that if the guest
then tried to use MTE QEMU would segfault accessing the
non-existent tag RAM.]
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Gustavo Romero <gustavo.romero@linaro.org>
Signed-off-by: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com>
Message-id: 20241008114302.4855-1-gankulkarni@os.amperecomputing.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
FEAT_EBF16 adds one new bit to the FPCR floating point control
register. Allow this bit to be read and written when the ID
registers indicate the presence of the feature.
Note that because this new bit is not in FPSCR_FPCR_MASK the bit is
not visible in the AArch32 FPSCR, and FPSCR writes do not affect it.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Our current usage of MMU indexes when EL3 is AArch32 is confused.
Architecturally, when EL3 is AArch32, all Secure code runs under the
Secure PL1&0 translation regime:
* code at EL3, which might be Mon, or SVC, or any of the
other privileged modes (PL1)
* code at EL0 (Secure PL0)
This is different from when EL3 is AArch64, in which case EL3 is its
own translation regime, and EL1 and EL0 (whether AArch32 or AArch64)
have their own regime.
We claimed to be mapping Secure PL1 to our ARMMMUIdx_EL3, but didn't
do anything special about Secure PL0, which meant it used the same
ARMMMUIdx_EL10_0 that NonSecure PL0 does. This resulted in a bug
where arm_sctlr() incorrectly picked the NonSecure SCTLR as the
controlling register when in Secure PL0, which meant we were
spuriously generating alignment faults because we were looking at the
wrong SCTLR control bits.
The use of ARMMMUIdx_EL3 for Secure PL1 also resulted in the bug that
we wouldn't honour the PAN bit for Secure PL1, because there's no
equivalent _PAN mmu index for it.
We could fix this in one of two ways:
* The most straightforward is to add new MMU indexes EL30_0,
EL30_3, EL30_3_PAN to correspond to "Secure PL1&0 at PL0",
"Secure PL1&0 at PL1", and "Secure PL1&0 at PL1 with PAN".
This matches how we use indexes for the AArch64 regimes, and
preserves propirties like being able to determine the privilege
level from an MMU index without any other information. However
it would add two MMU indexes (we can share one with ARMMMUIdx_EL3),
and we are already using 14 of the 16 the core TLB code permits.
* The more complicated approach is the one we take here. We use
the same MMU indexes (E10_0, E10_1, E10_1_PAN) for Secure PL1&0
than we do for NonSecure PL1&0. This saves on MMU indexes, but
means we need to check in some places whether we're in the
Secure PL1&0 regime or not before we interpret an MMU index.
The changes in this commit were created by auditing all the places
where we use specific ARMMMUIdx_ values, and checking whether they
needed to be changed to handle the new index value usage.
Note for potential stable backports: taking also the previous
(comment-change-only) commit might make the backport easier.
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2326
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240809160430.1144805-3-peter.maydell@linaro.org
We have a long comment describing the Arm architectural translation
regimes and how we map them to QEMU MMU indexes. This comment has
got a bit out of date:
* FEAT_SEL2 allows Secure EL2 and corresponding new regimes
* FEAT_RME introduces Realm state and its translation regimes
* We now model the Cortex-R52 so that is no longer a hypothetical
* We separated Secure Stage 2 and NonSecure Stage 2 MMU indexes
* We have an MMU index per physical address spacea
Add the missing pieces so that the list of architectural translation
regimes matches the Arm ARM, and the list and count of QEMU MMU
indexes in the comment matches the enum.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240809160430.1144805-2-peter.maydell@linaro.org
In a completely artifical memset benchmark object_dynamic_cast_assert
dominates the profile, even above guest address resolution and
the underlying host memset.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20240702154911.1667418-1-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>