qemu/target/arm
Peter Maydell 67683048be target/arm: Fix SVE SDOT/UDOT/USDOT (4-way, indexed)
Our implementation of the indexed version of SVE SDOT/UDOT/USDOT got
the calculation of the inner loop terminator wrong.  Although we
correctly account for the element size when we calculate the
terminator for the first iteration:
   intptr_t segend = MIN(16 / sizeof(TYPED), opr_sz_n);
we don't do that when we move it forward after the first inner loop
completes.  The intention is that we process the vector in 128-bit
segments, which for a 64-bit element size should mean (1, 2), (3, 4),
(5, 6), etc.  This bug meant that we would iterate (1, 2), (3, 4, 5,
6), (7, 8, 9, 10) etc and apply the wrong indexed element to some of
the operations, and also index off the end of the vector.

You don't see this bug if the vector length is small enough that we
don't need to iterate the outer loop, i.e.  if it is only 128 bits,
or if it is the 64-bit special case from AA32/AA64 AdvSIMD.  If the
vector length is 256 bits then we calculate the right results for the
elements in the vector but do index off the end of the vector. Vector
lengths greater than 256 bits see wrong answers. The instructions
that produce 32-bit results behave correctly.

Fix the recalculation of 'segend' for subsequent iterations, and
restore a version of the comment that was lost in the refactor of
commit 7020ffd656 that explains why we only need to clamp segend to
opr_sz_n for the first iteration, not the later ones.

Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2595
Fixes: 7020ffd656 ("target/arm: Macroize helper_gvec_{s,u}dot_idx_{b,h}")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241101185544.2130972-1-peter.maydell@linaro.org
(cherry picked from commit e6b2fa1b81)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-11-10 11:10:00 +03:00
..
hvf hvf: arm: Fix encodings for ID_AA64PFR1_EL1 and debug System registers 2024-05-30 17:11:59 +03:00
a32-uncond.decode arm tcg cpus: Fix Lesser GPL version number 2020-11-15 16:42:14 +01:00
a32.decode target/arm: Implement ESB instruction 2022-05-09 11:47:54 +01:00
arch_dump.c dump: Replace opaque DumpState pointer with a typed one 2022-10-06 19:30:43 +04:00
arm-powerctl.c arm/arm-powerctl: rebuild hflags after setting CP15 bits in arm_set_cpu_on() 2019-12-20 14:03:00 +00:00
arm-powerctl.h target/arm/arm-powerctl: Add new arm_set_cpu_on_and_reset() 2019-02-28 11:03:04 +00:00
arm_ldst.h accel/tcg: Add DisasContextBase argument to translator_ld* 2021-09-14 12:00:20 -07:00
common-semi-target.h semihosting: Split out common-semi-target.h 2022-06-28 04:35:07 +05:30
cpregs.h target/arm: Move define_debug_regs() to debug_helper.c 2022-07-07 11:37:33 +01:00
cpu-param.h target/arm: Enable TARGET_TB_PCREL 2022-10-20 11:28:29 +01:00
cpu-qom.h target: Introduce and use OBJECT_DECLARE_CPU_TYPE() macro 2022-03-06 22:23:09 +01:00
cpu.c target/arm: Disable SME if SVE is disabled 2023-12-20 19:11:10 +03:00
cpu.h target/arm: Handle m-profile in arm_is_secure 2023-04-12 16:57:32 +03:00
cpu64.c target/arm: Disable SVE extensions when SVE is disabled 2024-06-01 07:20:31 +03:00
cpu_tcg.c target/arm: Set TCGCPUOps.restore_state_to_opc for v7m 2022-11-29 18:15:26 -05:00
crypto_helper.c crypto: move sm4_sbox from target/arm 2022-04-29 10:47:45 +10:00
debug_helper.c target/arm: Store TCR_EL* registers as uint64_t 2022-07-18 13:20:13 +01:00
gdbstub.c Fix 'writeable' typos 2022-06-08 19:38:47 +01:00
gdbstub64.c target/arm: Rename sve_zcr_len_for_el to sve_vqm1_for_el 2022-06-08 19:38:57 +01:00
helper-a64.c target/arm: Change CPUArchState.aarch64 to bool 2022-04-22 14:44:54 +01:00
helper-a64.h target/arm: Merge mte_check1, mte_checkN 2021-04-30 11:16:49 +01:00
helper-mve.h target/arm: Implement MVE VRINT insns 2021-09-01 11:08:17 +01:00
helper-sme.h target/arm: Handle denormals correctly for FMOPA (widening) 2024-08-02 14:01:04 +03:00
helper-sve.h target/arm: Implement REVD 2022-07-11 13:43:51 +01:00
helper.c target/arm: Ignore SMCR_EL2.LEN and SVCR_EL2.LEN if EL2 is not enabled 2024-08-02 10:24:26 +03:00
helper.h target/arm: Implement SCLAMP, UCLAMP 2022-07-11 13:43:51 +01:00
hvf_arm.h target: Use forward declared type instead of structure type 2022-03-06 22:22:40 +01:00
idau.h Use DECLARE_*CHECKER* macros 2020-09-09 09:27:09 -04:00
internals.h target/arm: Don't assert in regime_is_user() for E10 mmuidx values 2024-11-10 11:10:00 +03:00
iwmmxt_helper.c arm tcg cpus: Fix Lesser GPL version number 2020-11-15 16:42:14 +01:00
Kconfig meson: Introduce target-specific Kconfig 2021-07-09 18:21:34 +02:00
kvm-consts.h target/arm: Report KVM's actual PSCI version to guest in dtb 2022-03-02 19:27:37 +00:00
kvm-stub.c target/arm: Avoid bare abort() or assert(0) 2022-05-05 09:35:51 +01:00
kvm.c accel/kvm: Specify default IPA size for arm64 2023-09-11 10:53:50 +03:00
kvm64.c arm64: Restore trapless ptimer access 2023-09-13 12:21:22 +03:00
kvm_arm.h target/arm: Initialize debug capabilities only once 2023-05-18 21:09:59 +03:00
m-nocp.decode target/arm: Don't NOCP fault for FPCXT_NS accesses 2021-06-21 16:49:37 +01:00
m_helper.c target/arm: Use tlb_set_page_full 2022-10-10 14:52:25 +01:00
machine.c target/arm: Add the SME ZA storage to CPUARMState 2022-06-27 11:18:17 +01:00
meson.build target/arm: Trap non-streaming usage when Streaming SVE is active 2022-07-11 13:19:35 +01:00
monitor.c target/arm: Add cpu properties to control pauth 2021-01-19 14:38:51 +00:00
mte_helper.c accel/tcg: Simplify page_get/alloc_target_data 2022-10-26 11:11:28 +10:00
mve.decode target/arm: Implement MVE VRINT insns 2021-09-01 11:08:17 +01:00
mve_helper.c target/arm: Use expand_pred_b in mve_helper.c 2022-06-08 19:38:58 +01:00
neon-dp.decode target/arm: Implement vector float32 to bfloat16 conversion 2021-06-03 16:43:26 +01:00
neon-ls.decode target/arm: Remove duplicate 'plus1' function from Neon and SVE decode 2021-07-18 10:59:47 +01:00
neon-shared.decode target/arm: Remove duplicate 'plus1' function from Neon and SVE decode 2021-07-18 10:59:47 +01:00
neon_helper.c Replace config-time define HOST_WORDS_BIGENDIAN 2022-04-06 10:50:37 +02:00
op_addsub.h Move target-* CPU file into a target/ folder 2016-12-20 21:52:12 +01:00
op_helper.c accel/tcg: Remove will_exit argument from cpu_restore_state 2022-11-01 08:31:41 +11:00
pauth_helper.c compiler.h: replace QEMU_NORETURN with G_NORETURN 2022-04-21 17:03:51 +04:00
psci.c target/arm: Support PSCI 1.1 and SMCCC 1.0 2022-03-02 19:27:36 +00:00
ptw.c target/arm: Correctly propagate stage 1 BTI guarded bit in a two-stage walk 2023-11-03 19:35:34 +03:00
sme-fa64.decode target/arm: Mark LD1RO as non-streaming 2022-07-11 13:19:35 +01:00
sme.decode target/arm: Implement SME integer outer product 2022-07-11 13:43:51 +01:00
sme_helper.c target/arm: Handle denormals correctly for FMOPA (widening) 2024-08-02 14:01:04 +03:00
sve.decode target/arm: Implement SCLAMP, UCLAMP 2022-07-11 13:43:51 +01:00
sve_helper.c target/arm: Fix SVE/SME gross MTE suppression checks 2024-02-16 14:19:18 +03:00
sve_ldst_internal.h target/arm: Use probe_access_full for MTE 2022-10-20 11:27:49 +01:00
syndrome.h target/arm: fix exception syndrome for AArch32 bkpt insn 2024-02-03 16:42:56 +03:00
t16.decode arm tcg cpus: Fix Lesser GPL version number 2020-11-15 16:42:14 +01:00
t32.decode target/arm: Implement ESB instruction 2022-05-09 11:47:54 +01:00
tlb_helper.c target/arm: Explicitly select short-format FSR for M-profile 2023-05-31 09:43:56 +03:00
trace-events docs: fix references to docs/devel/tracing.rst 2021-06-02 06:51:09 +02:00
trace.h trace: switch position of headers to what Meson requires 2020-08-21 06:18:24 -04:00
translate-a32.h target/arm: Define and use new load_cpu_field_low32() 2023-05-18 21:09:59 +03:00
translate-a64.c target/arm: Return correct result for LDG when ATA=0 2023-06-22 10:35:22 +03:00
translate-a64.h target/arm: Export unpredicated ld/st from translate-sve.c 2022-07-11 13:19:35 +01:00
translate-m-nocp.c target/arm: Enable TARGET_TB_PCREL 2022-10-20 11:28:29 +01:00
translate-mve.c target/arm: Change gen_exception_insn* to work on displacements 2022-10-20 11:27:52 +01:00
translate-neon.c target/arm: Fix alignment for VLD4.32 2022-09-22 16:38:27 +01:00
translate-sme.c target/arm: Handle denormals correctly for FMOPA (widening) 2024-08-02 14:01:04 +03:00
translate-sve.c target/arm: Avoid shifts by -1 in tszimm_shr() and tszimm_shl() 2024-08-02 10:24:18 +03:00
translate-vfp.c target/arm: Change gen_exception_insn* to work on displacements 2022-10-20 11:27:52 +01:00
translate.c target/arm: Fix 64-bit SSRA 2023-09-11 10:53:50 +03:00
translate.h target/arm: Enable TARGET_TB_PCREL 2022-10-20 11:28:29 +01:00
vec_helper.c target/arm: Fix SVE SDOT/UDOT/USDOT (4-way, indexed) 2024-11-10 11:10:00 +03:00
vec_internal.h target/arm: Export bfdotadd from vec_helper.c 2022-06-08 19:38:58 +01:00
vfp-uncond.decode arm tcg cpus: Fix Lesser GPL version number 2020-11-15 16:42:14 +01:00
vfp.decode target/arm: Don't NOCP fault for FPCXT_NS accesses 2021-06-21 16:49:37 +01:00
vfp_helper.c target/arm: Check NaN mode before silencing NaN 2021-07-02 11:48:36 +01:00