
Merge remote-tracking branch 'remotes/quic/tags/pull-hex-20211103' into staging

This series adds support for the Hexagon Vector eXtensions (HVX)

These instructions are documented here:
https://developer.qualcomm.com/downloads/qualcomm-hexagon-v66-hvx-programmer-s-reference-manual

Hexagon HVX is a wide vector engine with 128-byte vectors.

See patch 01 (Hexagon HVX README) for more information.

*** Changes in v2 ***
Remove HVX tests from makefile to avoid need for toolchain upgrade

# gpg: Signature made Wed 03 Nov 2021 05:14:44 PM EDT
# gpg:                using RSA key 7B0244FB12DE4422
# gpg: Good signature from "Taylor Simpson (Rock on) <tsimpson@quicinc.com>" [marginal]
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 3635 C788 CE62 B91F D4C5  9AB4 7B02 44FB 12DE 4422

* remotes/quic/tags/pull-hex-20211103: (30 commits)
  Hexagon HVX (tests/tcg/hexagon) histogram test
  Hexagon HVX (tests/tcg/hexagon) scatter_gather test
  Hexagon HVX (tests/tcg/hexagon) hvx_misc test
  Hexagon HVX (tests/tcg/hexagon) vector_add_int test
  Hexagon HVX (target/hexagon) import instruction encodings
  Hexagon HVX (target/hexagon) instruction decoding
  Hexagon HVX (target/hexagon) import semantics
  Hexagon HVX (target/hexagon) helper overrides - vector stores
  Hexagon HVX (target/hexagon) helper overrides - vector loads
  Hexagon HVX (target/hexagon) helper overrides - vector splat and abs
  Hexagon HVX (target/hexagon) helper overrides - vector compares
  Hexagon HVX (target/hexagon) helper overrides - vector logical ops
  Hexagon HVX (target/hexagon) helper overrides - vector max/min
  Hexagon HVX (target/hexagon) helper overrides - vector shifts
  Hexagon HVX (target/hexagon) helper overrides - vector add & sub
  Hexagon HVX (target/hexagon) helper overrides - vector assign & cmov
  Hexagon HVX (target/hexagon) helper overrides for histogram instructions
  Hexagon HVX (target/hexagon) helper overrides infrastructure
  Hexagon HVX (target/hexagon) TCG generation
  Hexagon HVX (target/hexagon) helper functions
  ...

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Richard Henderson 2021-11-04 06:34:36 -04:00
commit c88da1f3da
45 changed files with 10221 additions and 47 deletions


@@ -1,9 +1,13 @@
Hexagon is Qualcomm's very long instruction word (VLIW) digital signal
processor(DSP). We also support Hexagon Vector eXtensions (HVX). HVX
is a wide vector coprocessor designed for high performance computer vision,
image processing, machine learning, and other workloads.

The following versions of the Hexagon core are supported
    Scalar core: v67
    https://developer.qualcomm.com/downloads/qualcomm-hexagon-v67-programmer-s-reference-manual
    HVX extension: v66
    https://developer.qualcomm.com/downloads/qualcomm-hexagon-v66-hvx-programmer-s-reference-manual

We presented an overview of the project at the 2019 KVM Forum.
https://kvmforum2019.sched.com/event/Tmwc/qemu-hexagon-automatic-translation-of-the-isa-manual-pseudcode-to-tiny-code-instructions-of-a-vliw-architecture-niccolo-izzo-revng-taylor-simpson-qualcomm-innovation-center
@@ -124,6 +128,71 @@ There are also cases where we brute force the TCG code generation.
Instructions with multiple definitions are examples. These require special
handling because qemu helpers can only return a single value.

For HVX vectors, the generator behaves slightly differently. The wide vectors
won't fit in a TCGv or TCGv_i64, so we use TCGv_ptr variables to pass the
address to helper functions. Here's an example for an HVX vector-add-word
instruction.
static void generate_V6_vaddw(
    CPUHexagonState *env,
    DisasContext *ctx,
    Insn *insn,
    Packet *pkt)
{
    const int VdN = insn->regno[0];
    const intptr_t VdV_off =
        ctx_future_vreg_off(ctx, VdN, 1, true);
    TCGv_ptr VdV = tcg_temp_local_new_ptr();
    tcg_gen_addi_ptr(VdV, cpu_env, VdV_off);
    const int VuN = insn->regno[1];
    const intptr_t VuV_off =
        vreg_src_off(ctx, VuN);
    TCGv_ptr VuV = tcg_temp_local_new_ptr();
    const int VvN = insn->regno[2];
    const intptr_t VvV_off =
        vreg_src_off(ctx, VvN);
    TCGv_ptr VvV = tcg_temp_local_new_ptr();
    tcg_gen_addi_ptr(VuV, cpu_env, VuV_off);
    tcg_gen_addi_ptr(VvV, cpu_env, VvV_off);
    TCGv slot = tcg_constant_tl(insn->slot);
    gen_helper_V6_vaddw(cpu_env, VdV, VuV, VvV, slot);
    tcg_temp_free(slot);
    gen_log_vreg_write(ctx, VdV_off, VdN, EXT_DFL, insn->slot, false);
    ctx_log_vreg_write(ctx, VdN, EXT_DFL, false);
    tcg_temp_free_ptr(VdV);
    tcg_temp_free_ptr(VuV);
    tcg_temp_free_ptr(VvV);
}
Notice that we also generate a variable named <operand>_off for each operand of
the instruction. This makes it easy to override the instruction semantics with
functions from tcg-op-gvec.h. Here's the override for this instruction.
#define fGEN_TCG_V6_vaddw(SHORTCODE) \
    tcg_gen_gvec_add(MO_32, VdV_off, VuV_off, VvV_off, \
                     sizeof(MMVector), sizeof(MMVector))
Finally, we notice that the override doesn't use the TCGv_ptr variables, so
we don't generate them when an override is present. Here is the code we
generate in that case.
static void generate_V6_vaddw(
    CPUHexagonState *env,
    DisasContext *ctx,
    Insn *insn,
    Packet *pkt)
{
    const int VdN = insn->regno[0];
    const intptr_t VdV_off =
        ctx_future_vreg_off(ctx, VdN, 1, true);
    const int VuN = insn->regno[1];
    const intptr_t VuV_off =
        vreg_src_off(ctx, VuN);
    const int VvN = insn->regno[2];
    const intptr_t VvV_off =
        vreg_src_off(ctx, VvN);
    fGEN_TCG_V6_vaddw({ fHIDE(int i;) fVFOREACH(32, i) { VdV.w[i] = VuV.w[i] + VvV.w[i] ; } });
    gen_log_vreg_write(ctx, VdV_off, VdN, EXT_DFL, insn->slot, false);
    ctx_log_vreg_write(ctx, VdN, EXT_DFL, false);
}
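The vector-add-word semantics above split the 128-byte vector into 32-bit lanes and add each lane independently with wraparound. As a sanity check of that intent (a toy Python model, not QEMU code; the `vaddw` helper name here is illustrative):

```python
import struct

MAX_VEC_SIZE_BYTES = 128  # HVX vectors are 128 bytes wide

def vaddw(vu: bytes, vv: bytes) -> bytes:
    """Model of V6_vaddw: lane-wise 32-bit add with wraparound,
    mirroring { VdV.w[i] = VuV.w[i] + VvV.w[i] } from the semantics."""
    assert len(vu) == len(vv) == MAX_VEC_SIZE_BYTES
    lanes_u = struct.unpack('<32I', vu)
    lanes_v = struct.unpack('<32I', vv)
    return struct.pack('<32I',
                       *(((a + b) & 0xFFFFFFFF) for a, b in zip(lanes_u, lanes_v)))

# Lane 0 overflows and wraps without disturbing lane 1.
vu = struct.pack('<32I', *([0xFFFFFFFF] + [1] * 31))
vv = struct.pack('<32I', *([1] * 32))
vd = struct.unpack('<32I', vaddw(vu, vv))
print(vd[0], vd[1])  # 0 2
```

This per-lane independence is exactly why the override can hand the whole operation to tcg_gen_gvec_add with MO_32 element size.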
In addition to instruction semantics, we use a generator to create the decode
tree. This generation is also a two-step process. The first step is to run
target/hexagon/gen_dectree_import.c to produce
@@ -140,6 +209,7 @@ runtime information for each thread and contains stuff like the GPR and
predicate registers.

    macros.h
    mmvec/macros.h

The Hexagon arch lib relies heavily on macros for the instruction semantics.
This is a great advantage for qemu because we can override them for different
@@ -203,6 +273,15 @@ During runtime, the following fields in CPUHexagonState (see cpu.h) are used
    pred_written            boolean indicating if predicate was written
    mem_log_stores          record of the stores (indexed by slot)

For Hexagon Vector eXtensions (HVX), the following fields are used
    VRegs                   Vector registers
    future_VRegs            Registers to be stored during packet commit
    tmp_VRegs               Temporary registers *not* stored during commit
    VRegs_updated           Mask of predicated vector writes
    QRegs                   Q (vector predicate) registers
    future_QRegs            Registers to be stored during packet commit
    QRegs_updated           Mask of predicated vector writes
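To illustrate how these fields cooperate, here is a deliberately simplified Python sketch (not the QEMU implementation — the real code sizes future_VRegs at VECTOR_TEMPS_MAX and maps register numbers through the translation context): writes within a packet are staged, and only registers whose bit is set in the mask become architectural at commit.

```python
NUM_VREGS = 32

class HVXState:
    """Toy model of VRegs / future_VRegs / VRegs_updated interaction."""
    def __init__(self):
        self.VRegs = [0] * NUM_VREGS        # architectural state
        self.future_VRegs = [0] * NUM_VREGS # staged writes (simplified 1:1)
        self.VRegs_updated = 0              # bitmask of pending writes

    def log_vreg_write(self, num, value):
        # Instructions in the packet write here, not to VRegs directly.
        self.future_VRegs[num] = value
        self.VRegs_updated |= 1 << num

    def commit_packet(self):
        # Only masked registers are copied into architectural state.
        for i in range(NUM_VREGS):
            if self.VRegs_updated & (1 << i):
                self.VRegs[i] = self.future_VRegs[i]
        self.VRegs_updated = 0

s = HVXState()
s.log_vreg_write(3, 42)
print(s.VRegs[3])   # 0 : packet not committed yet
s.commit_packet()
print(s.VRegs[3])   # 42
```

Staging writes this way gives the packet its all-at-once semantics: every instruction in the packet reads the pre-packet register values.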
*** Debugging ***
You can turn on a lot of debugging by changing the HEX_DEBUG macro to 1 in You can turn on a lot of debugging by changing the HEX_DEBUG macro to 1 in


@@ -41,6 +41,27 @@ DEF_ATTRIB(STORE, "Stores to memory", "", "")
DEF_ATTRIB(MEMLIKE, "Memory-like instruction", "", "")
DEF_ATTRIB(MEMLIKE_PACKET_RULES, "follows Memory-like packet rules", "", "")
/* V6 Vector attributes */
DEF_ATTRIB(CVI, "Executes on the HVX extension", "", "")
DEF_ATTRIB(CVI_NEW, "New value memory instruction executes on HVX", "", "")
DEF_ATTRIB(CVI_VM, "Memory instruction executes on HVX", "", "")
DEF_ATTRIB(CVI_VP, "Permute instruction executes on HVX", "", "")
DEF_ATTRIB(CVI_VP_VS, "Double vector permute/shft insn executes on HVX", "", "")
DEF_ATTRIB(CVI_VX, "Multiply instruction executes on HVX", "", "")
DEF_ATTRIB(CVI_VX_DV, "Double vector multiply insn executes on HVX", "", "")
DEF_ATTRIB(CVI_VS, "Shift instruction executes on HVX", "", "")
DEF_ATTRIB(CVI_VS_VX, "Permute/shift and multiply insn executes on HVX", "", "")
DEF_ATTRIB(CVI_VA, "ALU instruction executes on HVX", "", "")
DEF_ATTRIB(CVI_VA_DV, "Double vector alu instruction executes on HVX", "", "")
DEF_ATTRIB(CVI_4SLOT, "Consumes all the vector execution resources", "", "")
DEF_ATTRIB(CVI_TMP, "Transient Memory Load not written to register", "", "")
DEF_ATTRIB(CVI_GATHER, "CVI Gather operation", "", "")
DEF_ATTRIB(CVI_SCATTER, "CVI Scatter operation", "", "")
DEF_ATTRIB(CVI_SCATTER_RELEASE, "CVI Store Release for scatter", "", "")
DEF_ATTRIB(CVI_TMP_DST, "CVI instruction that doesn't write a register", "", "")
DEF_ATTRIB(CVI_SLOT23, "Can execute in slot 2 or slot 3 (HVX)", "", "")
/* Change-of-flow attributes */
DEF_ATTRIB(JUMP, "Jump-type instruction", "", "")
@@ -87,6 +108,7 @@ DEF_ATTRIB(HWLOOP1_END, "Ends HW loop1", "", "")
DEF_ATTRIB(DCZEROA, "dczeroa type", "", "")
DEF_ATTRIB(ICFLUSHOP, "icflush op type", "", "")
DEF_ATTRIB(DCFLUSHOP, "dcflush op type", "", "")
DEF_ATTRIB(L2FLUSHOP, "l2flush op type", "", "")
DEF_ATTRIB(DCFETCH, "dcfetch type", "", "")
DEF_ATTRIB(L2FETCH, "Instruction is l2fetch type", "", "")


@@ -59,7 +59,7 @@ const char * const hexagon_regnames[TOTAL_PER_THREAD_REGS] = {
    "r24", "r25", "r26", "r27", "r28", "r29", "r30", "r31",
    "sa0", "lc0", "sa1", "lc1", "p3_0", "c5", "m0", "m1",
    "usr", "pc", "ugp", "gp", "cs0", "cs1", "c14", "c15",
    "c16", "c17", "c18", "c19", "pkt_cnt", "insn_cnt", "hvx_cnt", "c23",
    "c24", "c25", "c26", "c27", "c28", "c29", "c30", "c31",
};
@@ -113,7 +113,66 @@ static void print_reg(FILE *f, CPUHexagonState *env, int regnum)
                 hexagon_regnames[regnum], value);
}

static void print_vreg(FILE *f, CPUHexagonState *env, int regnum,
                       bool skip_if_zero)
{
    if (skip_if_zero) {
        bool nonzero_found = false;
        for (int i = 0; i < MAX_VEC_SIZE_BYTES; i++) {
            if (env->VRegs[regnum].ub[i] != 0) {
                nonzero_found = true;
                break;
            }
        }
        if (!nonzero_found) {
            return;
        }
    }
    qemu_fprintf(f, "  v%d = ( ", regnum);
    qemu_fprintf(f, "0x%02x", env->VRegs[regnum].ub[MAX_VEC_SIZE_BYTES - 1]);
    for (int i = MAX_VEC_SIZE_BYTES - 2; i >= 0; i--) {
        qemu_fprintf(f, ", 0x%02x", env->VRegs[regnum].ub[i]);
    }
    qemu_fprintf(f, " )\n");
}

void hexagon_debug_vreg(CPUHexagonState *env, int regnum)
{
    print_vreg(stdout, env, regnum, false);
}

static void print_qreg(FILE *f, CPUHexagonState *env, int regnum,
                       bool skip_if_zero)
{
    if (skip_if_zero) {
        bool nonzero_found = false;
        for (int i = 0; i < MAX_VEC_SIZE_BYTES / 8; i++) {
            if (env->QRegs[regnum].ub[i] != 0) {
                nonzero_found = true;
                break;
            }
        }
        if (!nonzero_found) {
            return;
        }
    }
    qemu_fprintf(f, "  q%d = ( ", regnum);
    qemu_fprintf(f, "0x%02x",
                 env->QRegs[regnum].ub[MAX_VEC_SIZE_BYTES / 8 - 1]);
    for (int i = MAX_VEC_SIZE_BYTES / 8 - 2; i >= 0; i--) {
        qemu_fprintf(f, ", 0x%02x", env->QRegs[regnum].ub[i]);
    }
    qemu_fprintf(f, " )\n");
}

void hexagon_debug_qreg(CPUHexagonState *env, int regnum)
{
    print_qreg(stdout, env, regnum, false);
}

static void hexagon_dump(CPUHexagonState *env, FILE *f, int flags)
{
    HexagonCPU *cpu = env_archcpu(env);
@@ -159,6 +218,17 @@ static void hexagon_dump(CPUHexagonState *env, FILE *f)
    print_reg(f, env, HEX_REG_CS1);
#endif
    qemu_fprintf(f, "}\n");

    if (flags & CPU_DUMP_FPU) {
        qemu_fprintf(f, "Vector Registers = {\n");
        for (int i = 0; i < NUM_VREGS; i++) {
            print_vreg(f, env, i, true);
        }
        for (int i = 0; i < NUM_QREGS; i++) {
            print_qreg(f, env, i, true);
        }
        qemu_fprintf(f, "}\n");
    }
}
@@ -166,12 +236,12 @@ static void hexagon_dump_state(CPUState *cs, FILE *f, int flags)
    HexagonCPU *cpu = HEXAGON_CPU(cs);
    CPUHexagonState *env = &cpu->env;

    hexagon_dump(env, f, flags);
}

void hexagon_debug(CPUHexagonState *env)
{
    hexagon_dump(env, stdout, CPU_DUMP_FPU);
}

static void hexagon_cpu_set_pc(CPUState *cs, vaddr value)
@@ -269,7 +339,7 @@ static void hexagon_cpu_class_init(ObjectClass *c, void *data)
    cc->set_pc = hexagon_cpu_set_pc;
    cc->gdb_read_register = hexagon_gdb_read_register;
    cc->gdb_write_register = hexagon_gdb_write_register;
    cc->gdb_num_core_regs = TOTAL_PER_THREAD_REGS + NUM_VREGS + NUM_QREGS;
    cc->gdb_stop_before_watchpoint = true;
    cc->disas_set_info = hexagon_cpu_disas_set_info;
    cc->tcg_ops = &hexagon_tcg_ops;


@@ -26,6 +26,7 @@ typedef struct CPUHexagonState CPUHexagonState;
#include "qemu-common.h"
#include "exec/cpu-defs.h"
#include "hex_regs.h"
#include "mmvec/mmvec.h"

#define NUM_PREGS 4
#define TOTAL_PER_THREAD_REGS 64
@@ -34,6 +35,7 @@ typedef struct CPUHexagonState CPUHexagonState;
#define STORES_MAX 2
#define REG_WRITES_MAX 32
#define PRED_WRITES_MAX 5                   /* 4 insns + endloop */
#define VSTORES_MAX 2

#define TYPE_HEXAGON_CPU "hexagon-cpu"
@@ -52,6 +54,13 @@ typedef struct {
    uint64_t data64;
} MemLog;

typedef struct {
    target_ulong va;
    int size;
    DECLARE_BITMAP(mask, MAX_VEC_SIZE_BYTES) QEMU_ALIGNED(16);
    MMVector data QEMU_ALIGNED(16);
} VStoreLog;

#define EXEC_STATUS_OK          0x0000
#define EXEC_STATUS_STOP        0x0002
#define EXEC_STATUS_REPLAY      0x0010
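The mask bitmap in VStoreLog is what makes deferred HVX stores byte-granular: when a logged store is drained at packet commit, only bytes whose mask bit is set reach memory. A minimal Python sketch of that drain step (the `commit_vstore` name is illustrative, not a QEMU function):

```python
MAX_VEC_SIZE_BYTES = 128  # one full HVX vector

def commit_vstore(memory: bytearray, va: int, mask: int, data: bytes):
    """Toy model of draining one VStoreLog entry: only bytes whose bit
    is set in the mask are written; masked-off bytes stay untouched."""
    for i in range(MAX_VEC_SIZE_BYTES):
        if mask & (1 << i):
            memory[va + i] = data[i]

mem = bytearray(256)
data = bytes(range(1, MAX_VEC_SIZE_BYTES + 1))
commit_vstore(mem, 16, 0b1011, data)  # write only bytes 0, 1, and 3
print(mem[16], mem[17], mem[18], mem[19])  # 1 2 0 4
```

Predicated and conditional vector stores simply clear bits in the mask rather than skipping the store entirely.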
@@ -64,6 +73,9 @@ typedef struct {
#define CLEAR_EXCEPTION (env->status &= (~EXEC_STATUS_EXCEPTION))
#define SET_EXCEPTION (env->status |= EXEC_STATUS_EXCEPTION)

/* Maximum number of vector temps in a packet */
#define VECTOR_TEMPS_MAX 4

struct CPUHexagonState {
    target_ulong gpr[TOTAL_PER_THREAD_REGS];
    target_ulong pred[NUM_PREGS];
@@ -97,8 +109,27 @@ struct CPUHexagonState {
    target_ulong llsc_val;
    uint64_t llsc_val_i64;

    MMVector VRegs[NUM_VREGS] QEMU_ALIGNED(16);
    MMVector future_VRegs[VECTOR_TEMPS_MAX] QEMU_ALIGNED(16);
    MMVector tmp_VRegs[VECTOR_TEMPS_MAX] QEMU_ALIGNED(16);
    VRegMask VRegs_updated;

    MMQReg QRegs[NUM_QREGS] QEMU_ALIGNED(16);
    MMQReg future_QRegs[NUM_QREGS] QEMU_ALIGNED(16);
    QRegMask QRegs_updated;

    /* Temporaries used within instructions */
    MMVectorPair VuuV QEMU_ALIGNED(16);
    MMVectorPair VvvV QEMU_ALIGNED(16);
    MMVectorPair VxxV QEMU_ALIGNED(16);
    MMVector vtmp QEMU_ALIGNED(16);
    MMQReg qtmp QEMU_ALIGNED(16);

    VStoreLog vstore[VSTORES_MAX];
    target_ulong vstore_pending[VSTORES_MAX];
    bool vtcm_pending;
    VTCMStoreLog vtcm_log;
};

#define HEXAGON_CPU_CLASS(klass) \


@@ -22,6 +22,7 @@
#include "decode.h"
#include "insn.h"
#include "printinsn.h"
#include "mmvec/decode_ext_mmvec.h"

#define fZXTN(N, M, VAL) ((VAL) & ((1LL << (N)) - 1))
@@ -46,6 +47,7 @@ enum {
/*  Name     Num Table */
DEF_REGMAP(R_16, 16, 0, 1, 2, 3, 4, 5, 6, 7, 16, 17, 18, 19, 20, 21, 22, 23)
DEF_REGMAP(R__8, 8, 0, 2, 4, 6, 16, 18, 20, 22)
DEF_REGMAP(R_8, 8, 0, 1, 2, 3, 4, 5, 6, 7)

#define DECODE_MAPPED_REG(OPNUM, NAME) \
    insn->regno[OPNUM] = DECODE_REGISTER_##NAME[insn->regno[OPNUM]];
@@ -157,6 +159,9 @@ static void decode_ext_init(void)
    for (i = EXT_IDX_noext; i < EXT_IDX_noext_AFTER; i++) {
        ext_trees[i] = &dectree_table_DECODE_EXT_EXT_noext;
    }
    for (i = EXT_IDX_mmvec; i < EXT_IDX_mmvec_AFTER; i++) {
        ext_trees[i] = &dectree_table_DECODE_EXT_EXT_mmvec;
    }
}

typedef struct {
@@ -566,8 +571,12 @@ static void decode_remove_extenders(Packet *packet)
static SlotMask get_valid_slots(const Packet *pkt, unsigned int slot)
{
    if (GET_ATTRIB(pkt->insn[slot].opcode, A_EXTENSION)) {
        return mmvec_ext_decode_find_iclass_slots(pkt->insn[slot].opcode);
    } else {
        return find_iclass_slots(pkt->insn[slot].opcode,
                                 pkt->insn[slot].iclass);
    }
}

#define DECODE_NEW_TABLE(TAG, SIZE, WHATNOT) /* NOTHING */
@@ -728,6 +737,11 @@ decode_insns_tablewalk(Insn *insn, const DectreeTable *table,
            }
            decode_op(insn, opc, encoding);
            return 1;
        } else if (table->table[i].type == DECTREE_EXTSPACE) {
            /*
             * For now, HVX will be the only coproc
             */
            return decode_insns_tablewalk(insn, ext_trees[EXT_IDX_mmvec], encoding);
        } else {
            return 0;
        }
@@ -874,6 +888,7 @@ int decode_packet(int max_words, const uint32_t *words, Packet *pkt,
    int words_read = 0;
    bool end_of_packet = false;
    int new_insns = 0;
    int i;
    uint32_t encoding32;

    /* Initialize */
@@ -901,6 +916,11 @@ int decode_packet(int max_words, const uint32_t *words, Packet *pkt,
        return 0;
    }
    pkt->encod_pkt_size_in_bytes = words_read * 4;
    pkt->pkt_has_hvx = false;
    for (i = 0; i < num_insns; i++) {
        pkt->pkt_has_hvx |=
            GET_ATTRIB(pkt->insn[i].opcode, A_CVI);
    }

    /*
     * Check for :endloop in the parse bits
@@ -931,6 +951,10 @@ int decode_packet(int max_words, const uint32_t *words, Packet *pkt,
    decode_set_slot_number(pkt);
    decode_fill_newvalue_regno(pkt);

    if (pkt->pkt_has_hvx) {
        mmvec_ext_decode_checks(pkt, disas_only);
    }

    if (!disas_only) {
        decode_shuffle_for_execution(pkt);
        decode_split_cmpjump(pkt);


@@ -40,6 +40,11 @@ const char * const opcode_names[] = {
 *     Q6INSN(A2_add,"Rd32=add(Rs32,Rt32)",ATTRIBS(),
 *     "Add 32-bit registers",
 *     { RdV=RsV+RtV;})
 * HVX instructions have the following form
 *     EXTINSN(V6_vinsertwr, "Vx32.w=vinsert(Rt32)",
 *     ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VX,A_CVI_LATE),
 *     "Insert Word Scalar into Vector",
 *     VxV.uw[0] = RtV;)
 */
const char * const opcode_syntax[XX_LAST_OPCODE] = {
#define Q6INSN(TAG, BEH, ATTRIBS, DESCR, SEM) \
@@ -105,6 +110,14 @@ static const char *get_opcode_enc(int opcode)
static const char *get_opcode_enc_class(int opcode)
{
    const char *tmp = opcode_encodings[opcode].encoding;
    if (tmp == NULL) {
        const char *test = "V6_";          /* HVX */
        const char *name = opcode_names[opcode];
        if (strncmp(name, test, strlen(test)) == 0) {
            return "EXT_mmvec";
        }
    }
    return opcode_enc_class_names[opcode_encodings[opcode].enc_class];
}


@@ -48,11 +48,25 @@ def gen_helper_arg_pair(f,regtype,regid,regno):
    if regno >= 0 : f.write(", ")
    f.write("int64_t %s%sV" % (regtype,regid))

def gen_helper_arg_ext(f,regtype,regid,regno):
    if regno > 0 : f.write(", ")
    f.write("void *%s%sV_void" % (regtype,regid))

def gen_helper_arg_ext_pair(f,regtype,regid,regno):
    if regno > 0 : f.write(", ")
    f.write("void *%s%sV_void" % (regtype,regid))

def gen_helper_arg_opn(f,regtype,regid,i,tag):
    if (hex_common.is_pair(regid)):
        if (hex_common.is_hvx_reg(regtype)):
            gen_helper_arg_ext_pair(f,regtype,regid,i)
        else:
            gen_helper_arg_pair(f,regtype,regid,i)
    elif (hex_common.is_single(regid)):
        if hex_common.is_old_val(regtype, regid, tag):
            if (hex_common.is_hvx_reg(regtype)):
                gen_helper_arg_ext(f,regtype,regid,i)
            else:
                gen_helper_arg(f,regtype,regid,i)
        elif hex_common.is_new_val(regtype, regid, tag):
            gen_helper_arg_new(f,regtype,regid,i)
@@ -72,24 +86,66 @@ def gen_helper_dest_decl_pair(f,regtype,regid,regno,subfield=""):
    f.write("    int64_t %s%sV%s = 0;\n" % \
        (regtype,regid,subfield))

def gen_helper_dest_decl_ext(f,regtype,regid):
    if (regtype == "Q"):
        f.write("    /* %s%sV is *(MMQReg *)(%s%sV_void) */\n" % \
            (regtype,regid,regtype,regid))
    else:
        f.write("    /* %s%sV is *(MMVector *)(%s%sV_void) */\n" % \
            (regtype,regid,regtype,regid))

def gen_helper_dest_decl_ext_pair(f,regtype,regid,regno):
    f.write("    /* %s%sV is *(MMVectorPair *))%s%sV_void) */\n" % \
        (regtype,regid,regtype, regid))

def gen_helper_dest_decl_opn(f,regtype,regid,i):
    if (hex_common.is_pair(regid)):
        if (hex_common.is_hvx_reg(regtype)):
            gen_helper_dest_decl_ext_pair(f,regtype,regid, i)
        else:
            gen_helper_dest_decl_pair(f,regtype,regid,i)
    elif (hex_common.is_single(regid)):
        if (hex_common.is_hvx_reg(regtype)):
            gen_helper_dest_decl_ext(f,regtype,regid)
        else:
            gen_helper_dest_decl(f,regtype,regid,i)
    else:
        print("Bad register parse: ",regtype,regid,toss,numregs)

def gen_helper_src_var_ext(f,regtype,regid):
    if (regtype == "Q"):
        f.write("    /* %s%sV is *(MMQReg *)(%s%sV_void) */\n" % \
            (regtype,regid,regtype,regid))
    else:
        f.write("    /* %s%sV is *(MMVector *)(%s%sV_void) */\n" % \
            (regtype,regid,regtype,regid))

def gen_helper_src_var_ext_pair(f,regtype,regid,regno):
    f.write("    /* %s%sV%s is *(MMVectorPair *)(%s%sV%s_void) */\n" % \
        (regtype,regid,regno,regtype,regid,regno))

def gen_helper_return(f,regtype,regid,regno):
    f.write("    return %s%sV;\n" % (regtype,regid))

def gen_helper_return_pair(f,regtype,regid,regno):
    f.write("    return %s%sV;\n" % (regtype,regid))

def gen_helper_dst_write_ext(f,regtype,regid):
    return

def gen_helper_dst_write_ext_pair(f,regtype,regid):
    return

def gen_helper_return_opn(f, regtype, regid, i):
    if (hex_common.is_pair(regid)):
        if (hex_common.is_hvx_reg(regtype)):
            gen_helper_dst_write_ext_pair(f,regtype,regid)
        else:
            gen_helper_return_pair(f,regtype,regid,i)
    elif (hex_common.is_single(regid)):
        if (hex_common.is_hvx_reg(regtype)):
            gen_helper_dst_write_ext(f,regtype,regid)
        else:
            gen_helper_return(f,regtype,regid,i)
    else:
        print("Bad register parse: ",regtype,regid,toss,numregs)
@@ -129,13 +185,19 @@ def gen_helper_function(f, tag, tagregs, tagimms):
            % (tag, tag))
    else:
        ## The return type of the function is the type of the destination
        ## register (if scalar)
        i=0
        for regtype,regid,toss,numregs in regs:
            if (hex_common.is_written(regid)):
                if (hex_common.is_pair(regid)):
                    if (hex_common.is_hvx_reg(regtype)):
                        continue
                    else:
                        gen_helper_return_type_pair(f,regtype,regid,i)
                elif (hex_common.is_single(regid)):
                    if (hex_common.is_hvx_reg(regtype)):
                        continue
                    else:
                        gen_helper_return_type(f,regtype,regid,i)
                else:
                    print("Bad register parse: ",regtype,regid,toss,numregs)
@@ -145,16 +207,37 @@ def gen_helper_function(f, tag, tagregs, tagimms):
            f.write("void")
    f.write(" HELPER(%s)(CPUHexagonState *env" % tag)

    ## Arguments include the vector destination operands
    i = 1
    for regtype,regid,toss,numregs in regs:
        if (hex_common.is_written(regid)):
            if (hex_common.is_pair(regid)):
                if (hex_common.is_hvx_reg(regtype)):
                    gen_helper_arg_ext_pair(f,regtype,regid,i)
                else:
                    continue
            elif (hex_common.is_single(regid)):
                if (hex_common.is_hvx_reg(regtype)):
                    gen_helper_arg_ext(f,regtype,regid,i)
                else:
                    # This is the return value of the function
                    continue
            else:
                print("Bad register parse: ",regtype,regid,toss,numregs)
            i += 1

    ## Arguments to the helper function are the source regs and immediates
    for regtype,regid,toss,numregs in regs:
        if (hex_common.is_read(regid)):
            if (hex_common.is_hvx_reg(regtype) and
                hex_common.is_readwrite(regid)):
                continue
            gen_helper_arg_opn(f,regtype,regid,i,tag)
            i += 1
    for immlett,bits,immshift in imms:
        gen_helper_arg_imm(f,immlett)
        i += 1

    if hex_common.need_slot(tag):
        if i > 0: f.write(", ")
        f.write("uint32_t slot")
@@ -173,6 +256,17 @@ def gen_helper_function(f, tag, tagregs, tagimms):
            gen_helper_dest_decl_opn(f,regtype,regid,i)
        i += 1

    for regtype,regid,toss,numregs in regs:
        if (hex_common.is_read(regid)):
            if (hex_common.is_pair(regid)):
                if (hex_common.is_hvx_reg(regtype)):
                    gen_helper_src_var_ext_pair(f,regtype,regid,i)
            elif (hex_common.is_single(regid)):
                if (hex_common.is_hvx_reg(regtype)):
                    gen_helper_src_var_ext(f,regtype,regid)
            else:
                print("Bad register parse: ",regtype,regid,toss,numregs)

    if 'A_FPOP' in hex_common.attribdict[tag]:
        f.write('    arch_fpop_start(env);\n');
@@ -192,11 +286,12 @@ def main():
    hex_common.read_semantics_file(sys.argv[1])
    hex_common.read_attribs_file(sys.argv[2])
    hex_common.read_overrides_file(sys.argv[3])
    hex_common.read_overrides_file(sys.argv[4])
    hex_common.calculate_attribs()
    tagregs = hex_common.get_tagregs()
    tagimms = hex_common.get_tagimms()

    with open(sys.argv[5], 'w') as f:
        for tag in hex_common.tags:
            ## Skip the priv instructions
            if ( "A_PRIV" in hex_common.attribdict[tag] ) :


@@ -94,9 +94,13 @@ def gen_helper_prototype(f, tag, tagregs, tagimms):
        f.write('DEF_HELPER_%s(%s' % (def_helper_size, tag))

        ## Generate the qemu DEF_HELPER type for each result
        ## Iterate over this list twice
        ## - Emit the scalar result
        ## - Emit the vector result
        i=0
        for regtype,regid,toss,numregs in regs:
            if (hex_common.is_written(regid)):
                if (not hex_common.is_hvx_reg(regtype)):
                    gen_def_helper_opn(f, tag, regtype, regid, toss, numregs, i)
                i += 1
@@ -104,9 +108,19 @@ def gen_helper_prototype(f, tag, tagregs, tagimms):
            f.write(', env' )
            i += 1

        # Second pass
        for regtype,regid,toss,numregs in regs:
            if (hex_common.is_written(regid)):
                if (hex_common.is_hvx_reg(regtype)):
                    gen_def_helper_opn(f, tag, regtype, regid, toss, numregs, i)
                    i += 1

        ## Generate the qemu type for each input operand (regs and immediates)
        for regtype,regid,toss,numregs in regs:
            if (hex_common.is_read(regid)):
                if (hex_common.is_hvx_reg(regtype) and
                    hex_common.is_readwrite(regid)):
                    continue
                gen_def_helper_opn(f, tag, regtype, regid, toss, numregs, i)
                i += 1
        for immlett,bits,immshift in imms:
@ -121,11 +135,12 @@ def main():
hex_common.read_semantics_file(sys.argv[1]) hex_common.read_semantics_file(sys.argv[1])
hex_common.read_attribs_file(sys.argv[2]) hex_common.read_attribs_file(sys.argv[2])
hex_common.read_overrides_file(sys.argv[3]) hex_common.read_overrides_file(sys.argv[3])
hex_common.read_overrides_file(sys.argv[4])
hex_common.calculate_attribs() hex_common.calculate_attribs()
tagregs = hex_common.get_tagregs() tagregs = hex_common.get_tagregs()
tagimms = hex_common.get_tagimms() tagimms = hex_common.get_tagimms()
with open(sys.argv[4], 'w') as f: with open(sys.argv[5], 'w') as f:
for tag in hex_common.tags: for tag in hex_common.tags:
## Skip the priv instructions ## Skip the priv instructions
if ( "A_PRIV" in hex_common.attribdict[tag] ) : if ( "A_PRIV" in hex_common.attribdict[tag] ) :
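The two-pass helper-prototype ordering above (scalar results first, then HVX vector results, then inputs, skipping HVX read-write operands that were already emitted as results) can be sketched in plain Python. This is a minimal model, not QEMU code: the `is_*` predicates below are simplified stand-ins for the real `hex_common` helpers, and register tuples are reduced to `(regtype, regid)` pairs.

```python
# Simplified stand-ins for the hex_common predicates (assumptions,
# not the real implementations): V/Q are HVX registers, result regids
# start with d/e/x/y, x/y operands are read-write.
def is_hvx_reg(regtype):
    return regtype in ("V", "Q")

def is_written(regid):
    return regid[0] in "dexy"

def is_read(regid):
    return regid[-1] in "stuvwxy"

def is_readwrite(regid):
    return regid[0] in "xy"

def helper_arg_order(regs):
    order = []
    # Pass 1: scalar results (these become the helper return value/args)
    for regtype, regid in regs:
        if is_written(regid) and not is_hvx_reg(regtype):
            order.append((regtype, regid))
    # Pass 2: HVX results (passed to the helper by pointer)
    for regtype, regid in regs:
        if is_written(regid) and is_hvx_reg(regtype):
            order.append((regtype, regid))
    # Inputs, skipping HVX read-write operands already emitted above
    for regtype, regid in regs:
        if is_read(regid):
            if is_hvx_reg(regtype) and is_readwrite(regid):
                continue
            order.append((regtype, regid))
    return order
```

For example, `helper_arg_order([("R", "d"), ("V", "x"), ("V", "u")])` places the scalar `Rd` first, then the HVX read-write `Vx` (emitted once, as a result), then the input `Vu`.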


@@ -44,6 +44,11 @@ int main(int argc, char *argv[])
  *     Q6INSN(A2_add,"Rd32=add(Rs32,Rt32)",ATTRIBS(),
  *     "Add 32-bit registers",
  *     { RdV=RsV+RtV;})
+ * HVX instructions have the following form
+ *     EXTINSN(V6_vinsertwr, "Vx32.w=vinsert(Rt32)",
+ *     ATTRIBS(A_EXTENSION,A_CVI,A_CVI_VX),
+ *     "Insert Word Scalar into Vector",
+ *     VxV.uw[0] = RtV;)
  */
 #define Q6INSN(TAG, BEH, ATTRIBS, DESCR, SEM) \
     do { \
@@ -59,8 +64,23 @@ int main(int argc, char *argv[])
                 ")\n", \
                 #TAG, STRINGIZE(ATTRIBS)); \
     } while (0);
+#define EXTINSN(TAG, BEH, ATTRIBS, DESCR, SEM) \
+    do { \
+        fprintf(outfile, "SEMANTICS( \\\n" \
+                " \"%s\", \\\n" \
+                " %s, \\\n" \
+                " \"\"\"%s\"\"\" \\\n" \
+                ")\n", \
+                #TAG, STRINGIZE(BEH), STRINGIZE(SEM)); \
+        fprintf(outfile, "ATTRIBUTES( \\\n" \
+                " \"%s\", \\\n" \
+                " \"%s\" \\\n" \
+                ")\n", \
+                #TAG, STRINGIZE(ATTRIBS)); \
+    } while (0);
 #include "imported/allidefs.def"
 #undef Q6INSN
+#undef EXTINSN

 /*
  * Process the macro definitions
@@ -81,6 +101,19 @@ int main(int argc, char *argv[])
             ")\n", \
             #MNAME, STRINGIZE(BEH), STRINGIZE(ATTRS));
 #include "imported/macros.def"
+#undef DEF_MACRO
+
+/*
+ * Process the macros for HVX
+ */
+#define DEF_MACRO(MNAME, BEH, ATTRS) \
+    fprintf(outfile, "MACROATTRIB( \\\n" \
+            " \"%s\", \\\n" \
+            " \"\"\"%s\"\"\", \\\n" \
+            " \"%s\" \\\n" \
+            ")\n", \
+            #MNAME, STRINGIZE(BEH), STRINGIZE(ATTRS));
+#include "imported/allext_macros.def"
 #undef DEF_MACRO
     fclose(outfile);


@@ -119,10 +119,95 @@ def genptr_decl(f, tag, regtype, regid, regno):
                 (regtype, regid, regtype, regid))
         else:
             print("Bad register parse: ", regtype, regid)
+    elif (regtype == "V"):
+        if (regid in {"dd"}):
+            f.write("    const int %s%sN = insn->regno[%d];\n" %\
+                (regtype, regid, regno))
+            f.write("    const intptr_t %s%sV_off =\n" %\
+                (regtype, regid))
+            if (hex_common.is_tmp_result(tag)):
+                f.write("        ctx_tmp_vreg_off(ctx, %s%sN, 2, true);\n" % \
+                    (regtype, regid))
+            else:
+                f.write("        ctx_future_vreg_off(ctx, %s%sN," % \
+                    (regtype, regid))
+                f.write(" 2, true);\n")
+            if (not hex_common.skip_qemu_helper(tag)):
+                f.write("    TCGv_ptr %s%sV = tcg_temp_new_ptr();\n" % \
+                    (regtype, regid))
+                f.write("    tcg_gen_addi_ptr(%s%sV, cpu_env, %s%sV_off);\n" % \
+                    (regtype, regid, regtype, regid))
+        elif (regid in {"uu", "vv", "xx"}):
+            f.write("    const int %s%sN = insn->regno[%d];\n" % \
+                (regtype, regid, regno))
+            f.write("    const intptr_t %s%sV_off =\n" % \
+                (regtype, regid))
+            f.write("        offsetof(CPUHexagonState, %s%sV);\n" % \
+                (regtype, regid))
+            if (not hex_common.skip_qemu_helper(tag)):
+                f.write("    TCGv_ptr %s%sV = tcg_temp_new_ptr();\n" % \
+                    (regtype, regid))
+                f.write("    tcg_gen_addi_ptr(%s%sV, cpu_env, %s%sV_off);\n" % \
+                    (regtype, regid, regtype, regid))
+        elif (regid in {"s", "u", "v", "w"}):
+            f.write("    const int %s%sN = insn->regno[%d];\n" % \
+                (regtype, regid, regno))
+            f.write("    const intptr_t %s%sV_off =\n" % \
+                (regtype, regid))
+            f.write("        vreg_src_off(ctx, %s%sN);\n" % \
+                (regtype, regid))
+            if (not hex_common.skip_qemu_helper(tag)):
+                f.write("    TCGv_ptr %s%sV = tcg_temp_new_ptr();\n" % \
+                    (regtype, regid))
+        elif (regid in {"d", "x", "y"}):
+            f.write("    const int %s%sN = insn->regno[%d];\n" % \
+                (regtype, regid, regno))
+            f.write("    const intptr_t %s%sV_off =\n" % \
+                (regtype, regid))
+            if (hex_common.is_tmp_result(tag)):
+                f.write("        ctx_tmp_vreg_off(ctx, %s%sN, 1, true);\n" % \
+                    (regtype, regid))
+            else:
+                f.write("        ctx_future_vreg_off(ctx, %s%sN," % \
+                    (regtype, regid))
+                f.write(" 1, true);\n");
+            if (not hex_common.skip_qemu_helper(tag)):
+                f.write("    TCGv_ptr %s%sV = tcg_temp_new_ptr();\n" % \
+                    (regtype, regid))
+                f.write("    tcg_gen_addi_ptr(%s%sV, cpu_env, %s%sV_off);\n" % \
+                    (regtype, regid, regtype, regid))
+        else:
+            print("Bad register parse: ", regtype, regid)
+    elif (regtype == "Q"):
+        if (regid in {"d", "e", "x"}):
+            f.write("    const int %s%sN = insn->regno[%d];\n" % \
+                (regtype, regid, regno))
+            f.write("    const intptr_t %s%sV_off =\n" % \
+                (regtype, regid))
+            f.write("        offsetof(CPUHexagonState,\n")
+            f.write("            future_QRegs[%s%sN]);\n" % \
+                (regtype, regid))
+            if (not hex_common.skip_qemu_helper(tag)):
+                f.write("    TCGv_ptr %s%sV = tcg_temp_new_ptr();\n" % \
+                    (regtype, regid))
+                f.write("    tcg_gen_addi_ptr(%s%sV, cpu_env, %s%sV_off);\n" % \
+                    (regtype, regid, regtype, regid))
+        elif (regid in {"s", "t", "u", "v"}):
+            f.write("    const int %s%sN = insn->regno[%d];\n" % \
+                (regtype, regid, regno))
+            f.write("    const intptr_t %s%sV_off =\n" %\
+                (regtype, regid))
+            f.write("        offsetof(CPUHexagonState, QRegs[%s%sN]);\n" % \
+                (regtype, regid))
+            if (not hex_common.skip_qemu_helper(tag)):
+                f.write("    TCGv_ptr %s%sV = tcg_temp_new_ptr();\n" % \
+                    (regtype, regid))
+        else:
+            print("Bad register parse: ", regtype, regid)
     else:
         print("Bad register parse: ", regtype, regid)

-def genptr_decl_new(f,regtype,regid,regno):
+def genptr_decl_new(f, tag, regtype, regid, regno):
     if (regtype == "N"):
         if (regid in {"s", "t"}):
             f.write("    TCGv %s%sN = hex_new_value[insn->regno[%d]];\n" % \
@@ -135,6 +220,21 @@ def genptr_decl_new(f,regtype,regid,regno):
             (regtype, regid, regno))
         else:
             print("Bad register parse: ", regtype, regid)
+    elif (regtype == "O"):
+        if (regid == "s"):
+            f.write("    const intptr_t %s%sN_num = insn->regno[%d];\n" % \
+                (regtype, regid, regno))
+            if (hex_common.skip_qemu_helper(tag)):
+                f.write("    const intptr_t %s%sN_off =\n" % \
+                    (regtype, regid))
+                f.write("        ctx_future_vreg_off(ctx, %s%sN_num," % \
+                    (regtype, regid))
+                f.write(" 1, true);\n")
+            else:
+                f.write("    TCGv %s%sN = tcg_constant_tl(%s%sN_num);\n" % \
+                    (regtype, regid, regtype, regid))
+        else:
+            print("Bad register parse: ", regtype, regid)
     else:
         print("Bad register parse: ", regtype, regid)
@@ -145,7 +245,7 @@ def genptr_decl_opn(f, tag, regtype, regid, toss, numregs, i):
         if hex_common.is_old_val(regtype, regid, tag):
             genptr_decl(f,tag, regtype, regid, i)
         elif hex_common.is_new_val(regtype, regid, tag):
-            genptr_decl_new(f,regtype,regid,i)
+            genptr_decl_new(f, tag, regtype, regid, i)
         else:
             print("Bad register parse: ",regtype,regid,toss,numregs)
     else:
@@ -159,7 +259,7 @@ def genptr_decl_imm(f,immlett):
     f.write("    int %s = insn->immed[%d];\n" % \
         (hex_common.imm_name(immlett), i))

-def genptr_free(f,regtype,regid,regno):
+def genptr_free(f, tag, regtype, regid, regno):
     if (regtype == "R"):
         if (regid in {"dd", "ss", "tt", "xx", "yy"}):
             f.write("    tcg_temp_free_i64(%s%sV);\n" % (regtype, regid))
@@ -182,33 +282,51 @@ def genptr_free(f,regtype,regid,regno):
     elif (regtype == "M"):
         if (regid != "u"):
             print("Bad register parse: ", regtype, regid)
+    elif (regtype == "V"):
+        if (regid in {"dd", "uu", "vv", "xx", \
+                      "d", "s", "u", "v", "w", "x", "y"}):
+            if (not hex_common.skip_qemu_helper(tag)):
+                f.write("    tcg_temp_free_ptr(%s%sV);\n" % \
+                    (regtype, regid))
+        else:
+            print("Bad register parse: ", regtype, regid)
+    elif (regtype == "Q"):
+        if (regid in {"d", "e", "s", "t", "u", "v", "x"}):
+            if (not hex_common.skip_qemu_helper(tag)):
+                f.write("    tcg_temp_free_ptr(%s%sV);\n" % \
+                    (regtype, regid))
+        else:
+            print("Bad register parse: ", regtype, regid)
     else:
         print("Bad register parse: ", regtype, regid)

-def genptr_free_new(f,regtype,regid,regno):
+def genptr_free_new(f, tag, regtype, regid, regno):
     if (regtype == "N"):
         if (regid not in {"s", "t"}):
             print("Bad register parse: ", regtype, regid)
     elif (regtype == "P"):
         if (regid not in {"t", "u", "v"}):
             print("Bad register parse: ", regtype, regid)
+    elif (regtype == "O"):
+        if (regid != "s"):
+            print("Bad register parse: ", regtype, regid)
     else:
         print("Bad register parse: ", regtype, regid)

 def genptr_free_opn(f,regtype,regid,i,tag):
     if (hex_common.is_pair(regid)):
-        genptr_free(f,regtype,regid,i)
+        genptr_free(f, tag, regtype, regid, i)
     elif (hex_common.is_single(regid)):
         if hex_common.is_old_val(regtype, regid, tag):
-            genptr_free(f,regtype,regid,i)
+            genptr_free(f, tag, regtype, regid, i)
         elif hex_common.is_new_val(regtype, regid, tag):
-            genptr_free_new(f,regtype,regid,i)
+            genptr_free_new(f, tag, regtype, regid, i)
         else:
             print("Bad register parse: ",regtype,regid,toss,numregs)
     else:
         print("Bad register parse: ",regtype,regid,toss,numregs)

-def genptr_src_read(f,regtype,regid):
+def genptr_src_read(f, tag, regtype, regid):
     if (regtype == "R"):
         if (regid in {"ss", "tt", "xx", "yy"}):
             f.write("    tcg_gen_concat_i32_i64(%s%sV, hex_gpr[%s%sN],\n" % \
@@ -238,6 +356,47 @@ def genptr_src_read(f,regtype,regid):
     elif (regtype == "M"):
         if (regid != "u"):
             print("Bad register parse: ", regtype, regid)
+    elif (regtype == "V"):
+        if (regid in {"uu", "vv", "xx"}):
+            f.write("    tcg_gen_gvec_mov(MO_64, %s%sV_off,\n" % \
+                (regtype, regid))
+            f.write("        vreg_src_off(ctx, %s%sN),\n" % \
+                (regtype, regid))
+            f.write("        sizeof(MMVector), sizeof(MMVector));\n")
+            f.write("    tcg_gen_gvec_mov(MO_64,\n")
+            f.write("        %s%sV_off + sizeof(MMVector),\n" % \
+                (regtype, regid))
+            f.write("        vreg_src_off(ctx, %s%sN ^ 1),\n" % \
+                (regtype, regid))
+            f.write("        sizeof(MMVector), sizeof(MMVector));\n")
+        elif (regid in {"s", "u", "v", "w"}):
+            if (not hex_common.skip_qemu_helper(tag)):
+                f.write("    tcg_gen_addi_ptr(%s%sV, cpu_env, %s%sV_off);\n" % \
+                    (regtype, regid, regtype, regid))
+        elif (regid in {"x", "y"}):
+            f.write("    tcg_gen_gvec_mov(MO_64, %s%sV_off,\n" % \
+                (regtype, regid))
+            f.write("        vreg_src_off(ctx, %s%sN),\n" % \
+                (regtype, regid))
+            f.write("        sizeof(MMVector), sizeof(MMVector));\n")
+            if (not hex_common.skip_qemu_helper(tag)):
+                f.write("    tcg_gen_addi_ptr(%s%sV, cpu_env, %s%sV_off);\n" % \
+                    (regtype, regid, regtype, regid))
+        else:
+            print("Bad register parse: ", regtype, regid)
+    elif (regtype == "Q"):
+        if (regid in {"s", "t", "u", "v"}):
+            if (not hex_common.skip_qemu_helper(tag)):
+                f.write("    tcg_gen_addi_ptr(%s%sV, cpu_env, %s%sV_off);\n" % \
+                    (regtype, regid, regtype, regid))
+        elif (regid in {"x"}):
+            f.write("    tcg_gen_gvec_mov(MO_64, %s%sV_off,\n" % \
+                (regtype, regid))
+            f.write("        offsetof(CPUHexagonState, QRegs[%s%sN]),\n" % \
+                (regtype, regid))
+            f.write("        sizeof(MMQReg), sizeof(MMQReg));\n")
+        else:
+            print("Bad register parse: ", regtype, regid)
     else:
         print("Bad register parse: ", regtype, regid)
@@ -248,15 +407,18 @@ def genptr_src_read_new(f,regtype,regid):
     elif (regtype == "P"):
         if (regid not in {"t", "u", "v"}):
             print("Bad register parse: ", regtype, regid)
+    elif (regtype == "O"):
+        if (regid != "s"):
+            print("Bad register parse: ", regtype, regid)
     else:
         print("Bad register parse: ", regtype, regid)

 def genptr_src_read_opn(f,regtype,regid,tag):
     if (hex_common.is_pair(regid)):
-        genptr_src_read(f,regtype,regid)
+        genptr_src_read(f, tag, regtype, regid)
     elif (hex_common.is_single(regid)):
         if hex_common.is_old_val(regtype, regid, tag):
-            genptr_src_read(f,regtype,regid)
+            genptr_src_read(f, tag, regtype, regid)
         elif hex_common.is_new_val(regtype, regid, tag):
             genptr_src_read_new(f,regtype,regid)
         else:
@@ -331,10 +493,67 @@ def genptr_dst_write(f, tag, regtype, regid):
     else:
         print("Bad register parse: ", regtype, regid)

+def genptr_dst_write_ext(f, tag, regtype, regid, newv="EXT_DFL"):
+    if (regtype == "V"):
+        if (regid in {"dd", "xx", "yy"}):
+            if ('A_CONDEXEC' in hex_common.attribdict[tag]):
+                is_predicated = "true"
+            else:
+                is_predicated = "false"
+            f.write("    gen_log_vreg_write_pair(ctx, %s%sV_off, %s%sN, " % \
+                (regtype, regid, regtype, regid))
+            f.write("%s, insn->slot, %s);\n" % \
+                (newv, is_predicated))
+            f.write("    ctx_log_vreg_write_pair(ctx, %s%sN, %s,\n" % \
+                (regtype, regid, newv))
+            f.write("        %s);\n" % (is_predicated))
+        elif (regid in {"d", "x", "y"}):
+            if ('A_CONDEXEC' in hex_common.attribdict[tag]):
+                is_predicated = "true"
+            else:
+                is_predicated = "false"
+            f.write("    gen_log_vreg_write(ctx, %s%sV_off, %s%sN, %s, " % \
+                (regtype, regid, regtype, regid, newv))
+            f.write("insn->slot, %s);\n" % \
+                (is_predicated))
+            f.write("    ctx_log_vreg_write(ctx, %s%sN, %s, %s);\n" % \
+                (regtype, regid, newv, is_predicated))
+        else:
+            print("Bad register parse: ", regtype, regid)
+    elif (regtype == "Q"):
+        if (regid in {"d", "e", "x"}):
+            if ('A_CONDEXEC' in hex_common.attribdict[tag]):
+                is_predicated = "true"
+            else:
+                is_predicated = "false"
+            f.write("    gen_log_qreg_write(%s%sV_off, %s%sN, %s, " % \
+                (regtype, regid, regtype, regid, newv))
+            f.write("insn->slot, %s);\n" % (is_predicated))
+            f.write("    ctx_log_qreg_write(ctx, %s%sN, %s);\n" % \
+                (regtype, regid, is_predicated))
+        else:
+            print("Bad register parse: ", regtype, regid)
+    else:
+        print("Bad register parse: ", regtype, regid)
+
 def genptr_dst_write_opn(f,regtype, regid, tag):
     if (hex_common.is_pair(regid)):
-        genptr_dst_write(f, tag, regtype, regid)
+        if (hex_common.is_hvx_reg(regtype)):
+            if (hex_common.is_tmp_result(tag)):
+                genptr_dst_write_ext(f, tag, regtype, regid, "EXT_TMP")
+            else:
+                genptr_dst_write_ext(f, tag, regtype, regid)
+        else:
+            genptr_dst_write(f, tag, regtype, regid)
     elif (hex_common.is_single(regid)):
-        genptr_dst_write(f, tag, regtype, regid)
+        if (hex_common.is_hvx_reg(regtype)):
+            if (hex_common.is_new_result(tag)):
+                genptr_dst_write_ext(f, tag, regtype, regid, "EXT_NEW")
+            elif (hex_common.is_tmp_result(tag)):
+                genptr_dst_write_ext(f, tag, regtype, regid, "EXT_TMP")
+            else:
+                genptr_dst_write_ext(f, tag, regtype, regid, "EXT_DFL")
+        else:
+            genptr_dst_write(f, tag, regtype, regid)
     else:
         print("Bad register parse: ",regtype,regid,toss,numregs)
@@ -406,13 +625,24 @@ def gen_tcg_func(f, tag, regs, imms):
     ## If there is a scalar result, it is the return type
     for regtype,regid,toss,numregs in regs:
         if (hex_common.is_written(regid)):
+            if (hex_common.is_hvx_reg(regtype)):
+                continue
             gen_helper_call_opn(f, tag, regtype, regid, toss, numregs, i)
             i += 1
     if (i > 0): f.write(", ")
     f.write("cpu_env")
     i=1
+    for regtype,regid,toss,numregs in regs:
+        if (hex_common.is_written(regid)):
+            if (not hex_common.is_hvx_reg(regtype)):
+                continue
+            gen_helper_call_opn(f, tag, regtype, regid, toss, numregs, i)
+            i += 1
     for regtype,regid,toss,numregs in regs:
         if (hex_common.is_read(regid)):
+            if (hex_common.is_hvx_reg(regtype) and
+                hex_common.is_readwrite(regid)):
+                continue
             gen_helper_call_opn(f, tag, regtype, regid, toss, numregs, i)
             i += 1
     for immlett,bits,immshift in imms:
@@ -445,11 +675,12 @@ def main():
     hex_common.read_semantics_file(sys.argv[1])
     hex_common.read_attribs_file(sys.argv[2])
     hex_common.read_overrides_file(sys.argv[3])
+    hex_common.read_overrides_file(sys.argv[4])
     hex_common.calculate_attribs()
     tagregs = hex_common.get_tagregs()
     tagimms = hex_common.get_tagimms()
-    with open(sys.argv[4], 'w') as f:
+    with open(sys.argv[5], 'w') as f:
         f.write("#ifndef HEXAGON_TCG_FUNCS_H\n")
         f.write("#define HEXAGON_TCG_FUNCS_H\n\n")


@@ -0,0 +1,903 @@
/*
* Copyright(c) 2019-2021 Qualcomm Innovation Center, Inc. All Rights Reserved.
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, see <http://www.gnu.org/licenses/>.
*/
#ifndef HEXAGON_GEN_TCG_HVX_H
#define HEXAGON_GEN_TCG_HVX_H
/*
* Histogram instructions
*
* Note that these instructions operate directly on the vector registers
* and therefore happen after commit.
*
* The generate_<tag> function is called twice
* The first time is during the normal TCG generation
* ctx->pre_commit is true
* In the masked cases, we save the mask to the qtmp temporary
* Otherwise, there is nothing to do
* The second call is at the end of gen_commit_packet
* ctx->pre_commit is false
* Generate the call to the helper
*/
static inline void assert_vhist_tmp(DisasContext *ctx)
{
    /* vhist instructions require exactly one .tmp to be defined */
    g_assert(ctx->tmp_vregs_idx == 1);
}
#define fGEN_TCG_V6_vhist(SHORTCODE) \
    if (!ctx->pre_commit) { \
        assert_vhist_tmp(ctx); \
        gen_helper_vhist(cpu_env); \
    }
#define fGEN_TCG_V6_vhistq(SHORTCODE) \
    do { \
        if (ctx->pre_commit) { \
            intptr_t dstoff = offsetof(CPUHexagonState, qtmp); \
            tcg_gen_gvec_mov(MO_64, dstoff, QvV_off, \
                             sizeof(MMVector), sizeof(MMVector)); \
        } else { \
            assert_vhist_tmp(ctx); \
            gen_helper_vhistq(cpu_env); \
        } \
    } while (0)
#define fGEN_TCG_V6_vwhist256(SHORTCODE) \
    if (!ctx->pre_commit) { \
        assert_vhist_tmp(ctx); \
        gen_helper_vwhist256(cpu_env); \
    }
#define fGEN_TCG_V6_vwhist256q(SHORTCODE) \
    do { \
        if (ctx->pre_commit) { \
            intptr_t dstoff = offsetof(CPUHexagonState, qtmp); \
            tcg_gen_gvec_mov(MO_64, dstoff, QvV_off, \
                             sizeof(MMVector), sizeof(MMVector)); \
        } else { \
            assert_vhist_tmp(ctx); \
            gen_helper_vwhist256q(cpu_env); \
        } \
    } while (0)
#define fGEN_TCG_V6_vwhist256_sat(SHORTCODE) \
    if (!ctx->pre_commit) { \
        assert_vhist_tmp(ctx); \
        gen_helper_vwhist256_sat(cpu_env); \
    }
#define fGEN_TCG_V6_vwhist256q_sat(SHORTCODE) \
    do { \
        if (ctx->pre_commit) { \
            intptr_t dstoff = offsetof(CPUHexagonState, qtmp); \
            tcg_gen_gvec_mov(MO_64, dstoff, QvV_off, \
                             sizeof(MMVector), sizeof(MMVector)); \
        } else { \
            assert_vhist_tmp(ctx); \
            gen_helper_vwhist256q_sat(cpu_env); \
        } \
    } while (0)
#define fGEN_TCG_V6_vwhist128(SHORTCODE) \
    if (!ctx->pre_commit) { \
        assert_vhist_tmp(ctx); \
        gen_helper_vwhist128(cpu_env); \
    }
#define fGEN_TCG_V6_vwhist128q(SHORTCODE) \
    do { \
        if (ctx->pre_commit) { \
            intptr_t dstoff = offsetof(CPUHexagonState, qtmp); \
            tcg_gen_gvec_mov(MO_64, dstoff, QvV_off, \
                             sizeof(MMVector), sizeof(MMVector)); \
        } else { \
            assert_vhist_tmp(ctx); \
            gen_helper_vwhist128q(cpu_env); \
        } \
    } while (0)
#define fGEN_TCG_V6_vwhist128m(SHORTCODE) \
    if (!ctx->pre_commit) { \
        TCGv tcgv_uiV = tcg_constant_tl(uiV); \
        assert_vhist_tmp(ctx); \
        gen_helper_vwhist128m(cpu_env, tcgv_uiV); \
    }
#define fGEN_TCG_V6_vwhist128qm(SHORTCODE) \
    do { \
        if (ctx->pre_commit) { \
            intptr_t dstoff = offsetof(CPUHexagonState, qtmp); \
            tcg_gen_gvec_mov(MO_64, dstoff, QvV_off, \
                             sizeof(MMVector), sizeof(MMVector)); \
        } else { \
            TCGv tcgv_uiV = tcg_constant_tl(uiV); \
            assert_vhist_tmp(ctx); \
            gen_helper_vwhist128qm(cpu_env, tcgv_uiV); \
        } \
    } while (0)
#define fGEN_TCG_V6_vassign(SHORTCODE) \
    tcg_gen_gvec_mov(MO_64, VdV_off, VuV_off, \
                     sizeof(MMVector), sizeof(MMVector))

/* Vector conditional move */
#define fGEN_TCG_VEC_CMOV(PRED) \
    do { \
        TCGv lsb = tcg_temp_new(); \
        TCGLabel *false_label = gen_new_label(); \
        TCGLabel *end_label = gen_new_label(); \
        tcg_gen_andi_tl(lsb, PsV, 1); \
        tcg_gen_brcondi_tl(TCG_COND_NE, lsb, PRED, false_label); \
        tcg_temp_free(lsb); \
        tcg_gen_gvec_mov(MO_64, VdV_off, VuV_off, \
                         sizeof(MMVector), sizeof(MMVector)); \
        tcg_gen_br(end_label); \
        gen_set_label(false_label); \
        tcg_gen_ori_tl(hex_slot_cancelled, hex_slot_cancelled, \
                       1 << insn->slot); \
        gen_set_label(end_label); \
    } while (0)
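The conditional-move sequence above tests the LSB of the scalar predicate `PsV` against the instruction's sense (1 for `vcmov`, 0 for `vncmov`): on a match the whole source vector is copied, otherwise the slot is marked cancelled and the destination is left untouched. A minimal Python model of that behavior (not QEMU code; `vec_cmov` and its signature are invented for illustration):

```python
# Model of the predicated vector move: copy vu into vd only when the
# predicate's LSB matches the instruction's sense; otherwise report
# the slot as cancelled and leave vd unchanged.
def vec_cmov(pred_reg, sense, vd, vu):
    if (pred_reg & 1) == sense:
        vd[:] = vu      # corresponds to the gvec_mov of the whole vector
        return True     # move performed
    return False        # slot cancelled; vd untouched
```

For example, with predicate value 3 (LSB set) a `vcmov` (sense 1) performs the copy, while predicate value 2 (LSB clear) cancels the slot.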
/* Vector conditional move (true) */
#define fGEN_TCG_V6_vcmov(SHORTCODE) \
    fGEN_TCG_VEC_CMOV(1)

/* Vector conditional move (false) */
#define fGEN_TCG_V6_vncmov(SHORTCODE) \
    fGEN_TCG_VEC_CMOV(0)

/* Vector add - various forms */
#define fGEN_TCG_V6_vaddb(SHORTCODE) \
    tcg_gen_gvec_add(MO_8, VdV_off, VuV_off, VvV_off, \
                     sizeof(MMVector), sizeof(MMVector))
#define fGEN_TCG_V6_vaddh(SHORTCODE) \
    tcg_gen_gvec_add(MO_16, VdV_off, VuV_off, VvV_off, \
                     sizeof(MMVector), sizeof(MMVector))
#define fGEN_TCG_V6_vaddw(SHORTCODE) \
    tcg_gen_gvec_add(MO_32, VdV_off, VuV_off, VvV_off, \
                     sizeof(MMVector), sizeof(MMVector))
#define fGEN_TCG_V6_vaddb_dv(SHORTCODE) \
    tcg_gen_gvec_add(MO_8, VddV_off, VuuV_off, VvvV_off, \
                     sizeof(MMVector) * 2, sizeof(MMVector) * 2)
#define fGEN_TCG_V6_vaddh_dv(SHORTCODE) \
    tcg_gen_gvec_add(MO_16, VddV_off, VuuV_off, VvvV_off, \
                     sizeof(MMVector) * 2, sizeof(MMVector) * 2)
#define fGEN_TCG_V6_vaddw_dv(SHORTCODE) \
    tcg_gen_gvec_add(MO_32, VddV_off, VuuV_off, VvvV_off, \
                     sizeof(MMVector) * 2, sizeof(MMVector) * 2)

/* Vector sub - various forms */
#define fGEN_TCG_V6_vsubb(SHORTCODE) \
    tcg_gen_gvec_sub(MO_8, VdV_off, VuV_off, VvV_off, \
                     sizeof(MMVector), sizeof(MMVector))
#define fGEN_TCG_V6_vsubh(SHORTCODE) \
    tcg_gen_gvec_sub(MO_16, VdV_off, VuV_off, VvV_off, \
                     sizeof(MMVector), sizeof(MMVector))
#define fGEN_TCG_V6_vsubw(SHORTCODE) \
    tcg_gen_gvec_sub(MO_32, VdV_off, VuV_off, VvV_off, \
                     sizeof(MMVector), sizeof(MMVector))
#define fGEN_TCG_V6_vsubb_dv(SHORTCODE) \
    tcg_gen_gvec_sub(MO_8, VddV_off, VuuV_off, VvvV_off, \
                     sizeof(MMVector) * 2, sizeof(MMVector) * 2)
#define fGEN_TCG_V6_vsubh_dv(SHORTCODE) \
    tcg_gen_gvec_sub(MO_16, VddV_off, VuuV_off, VvvV_off, \
                     sizeof(MMVector) * 2, sizeof(MMVector) * 2)
#define fGEN_TCG_V6_vsubw_dv(SHORTCODE) \
    tcg_gen_gvec_sub(MO_32, VddV_off, VuuV_off, VvvV_off, \
                     sizeof(MMVector) * 2, sizeof(MMVector) * 2)
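The gvec add/sub calls above treat the 128-byte HVX vector as independent lanes of the given element size (`MO_8`/`MO_16`/`MO_32`), with wrap-around arithmetic in each lane and no carries crossing lane boundaries. A small Python model for 16-bit lanes (illustrative only; the function names are made up):

```python
# Lane-wise model of tcg_gen_gvec_add/sub at MO_16: each 16-bit lane
# is added/subtracted with wrap-around (modulo 2**16).
MASK16 = 0xffff

def vadd_h(vu, vv):
    return [(a + b) & MASK16 for a, b in zip(vu, vv)]

def vsub_h(vu, vv):
    return [(a - b) & MASK16 for a, b in zip(vu, vv)]
```

For example, `vadd_h([0xffff, 1], [1, 2])` wraps the first lane to 0 without disturbing the second lane.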
/* Vector shift right - various forms */
#define fGEN_TCG_V6_vasrh(SHORTCODE) \
    do { \
        TCGv shift = tcg_temp_new(); \
        tcg_gen_andi_tl(shift, RtV, 15); \
        tcg_gen_gvec_sars(MO_16, VdV_off, VuV_off, shift, \
                          sizeof(MMVector), sizeof(MMVector)); \
        tcg_temp_free(shift); \
    } while (0)
#define fGEN_TCG_V6_vasrh_acc(SHORTCODE) \
    do { \
        intptr_t tmpoff = offsetof(CPUHexagonState, vtmp); \
        TCGv shift = tcg_temp_new(); \
        tcg_gen_andi_tl(shift, RtV, 15); \
        tcg_gen_gvec_sars(MO_16, tmpoff, VuV_off, shift, \
                          sizeof(MMVector), sizeof(MMVector)); \
        tcg_gen_gvec_add(MO_16, VxV_off, VxV_off, tmpoff, \
                         sizeof(MMVector), sizeof(MMVector)); \
        tcg_temp_free(shift); \
    } while (0)
#define fGEN_TCG_V6_vasrw(SHORTCODE) \
    do { \
        TCGv shift = tcg_temp_new(); \
        tcg_gen_andi_tl(shift, RtV, 31); \
        tcg_gen_gvec_sars(MO_32, VdV_off, VuV_off, shift, \
                          sizeof(MMVector), sizeof(MMVector)); \
        tcg_temp_free(shift); \
    } while (0)
#define fGEN_TCG_V6_vasrw_acc(SHORTCODE) \
    do { \
        intptr_t tmpoff = offsetof(CPUHexagonState, vtmp); \
        TCGv shift = tcg_temp_new(); \
        tcg_gen_andi_tl(shift, RtV, 31); \
        tcg_gen_gvec_sars(MO_32, tmpoff, VuV_off, shift, \
                          sizeof(MMVector), sizeof(MMVector)); \
        tcg_gen_gvec_add(MO_32, VxV_off, VxV_off, tmpoff, \
                         sizeof(MMVector), sizeof(MMVector)); \
        tcg_temp_free(shift); \
    } while (0)
#define fGEN_TCG_V6_vlsrb(SHORTCODE) \
    do { \
        TCGv shift = tcg_temp_new(); \
        tcg_gen_andi_tl(shift, RtV, 7); \
        tcg_gen_gvec_shrs(MO_8, VdV_off, VuV_off, shift, \
                          sizeof(MMVector), sizeof(MMVector)); \
        tcg_temp_free(shift); \
    } while (0)
#define fGEN_TCG_V6_vlsrh(SHORTCODE) \
    do { \
        TCGv shift = tcg_temp_new(); \
        tcg_gen_andi_tl(shift, RtV, 15); \
        tcg_gen_gvec_shrs(MO_16, VdV_off, VuV_off, shift, \
                          sizeof(MMVector), sizeof(MMVector)); \
        tcg_temp_free(shift); \
    } while (0)
#define fGEN_TCG_V6_vlsrw(SHORTCODE) \
    do { \
        TCGv shift = tcg_temp_new(); \
        tcg_gen_andi_tl(shift, RtV, 31); \
        tcg_gen_gvec_shrs(MO_32, VdV_off, VuV_off, shift, \
                          sizeof(MMVector), sizeof(MMVector)); \
        tcg_temp_free(shift); \
    } while (0)
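The shift macros above first mask the scalar shift amount `RtV` to the lane width minus one (`& 7`, `& 15`, `& 31`) and then apply it to every lane: `sars` propagates the sign bit, `shrs` shifts in zeros. A Python sketch for 16-bit lanes (illustrative function names, not from the source):

```python
# Model of the per-lane shift: mask RtV to the lane width, then shift
# every 16-bit lane; vasr keeps the sign bit, vlsr shifts in zeros.
def vasr_h(vu, rt):                    # arithmetic shift right
    shift = rt & 15
    def lane(x):
        x_signed = x - 0x10000 if x & 0x8000 else x
        return (x_signed >> shift) & 0xffff
    return [lane(x) for x in vu]

def vlsr_h(vu, rt):                    # logical shift right
    shift = rt & 15
    return [(x & 0xffff) >> shift for x in vu]
```

Note that an out-of-range shift amount such as 17 behaves like 1 because of the masking, matching the `tcg_gen_andi_tl(shift, RtV, 15)` step.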
/* Vector shift left - various forms */
#define fGEN_TCG_V6_vaslb(SHORTCODE) \
    do { \
        TCGv shift = tcg_temp_new(); \
        tcg_gen_andi_tl(shift, RtV, 7); \
        tcg_gen_gvec_shls(MO_8, VdV_off, VuV_off, shift, \
                          sizeof(MMVector), sizeof(MMVector)); \
        tcg_temp_free(shift); \
    } while (0)
#define fGEN_TCG_V6_vaslh(SHORTCODE) \
    do { \
        TCGv shift = tcg_temp_new(); \
        tcg_gen_andi_tl(shift, RtV, 15); \
        tcg_gen_gvec_shls(MO_16, VdV_off, VuV_off, shift, \
                          sizeof(MMVector), sizeof(MMVector)); \
        tcg_temp_free(shift); \
    } while (0)
#define fGEN_TCG_V6_vaslh_acc(SHORTCODE) \
    do { \
        intptr_t tmpoff = offsetof(CPUHexagonState, vtmp); \
        TCGv shift = tcg_temp_new(); \
        tcg_gen_andi_tl(shift, RtV, 15); \
        tcg_gen_gvec_shls(MO_16, tmpoff, VuV_off, shift, \
                          sizeof(MMVector), sizeof(MMVector)); \
        tcg_gen_gvec_add(MO_16, VxV_off, VxV_off, tmpoff, \
                         sizeof(MMVector), sizeof(MMVector)); \
        tcg_temp_free(shift); \
    } while (0)
#define fGEN_TCG_V6_vaslw(SHORTCODE) \
    do { \
        TCGv shift = tcg_temp_new(); \
        tcg_gen_andi_tl(shift, RtV, 31); \
        tcg_gen_gvec_shls(MO_32, VdV_off, VuV_off, shift, \
                          sizeof(MMVector), sizeof(MMVector)); \
        tcg_temp_free(shift); \
    } while (0)
#define fGEN_TCG_V6_vaslw_acc(SHORTCODE) \
    do { \
        intptr_t tmpoff = offsetof(CPUHexagonState, vtmp); \
        TCGv shift = tcg_temp_new(); \
        tcg_gen_andi_tl(shift, RtV, 31); \
        tcg_gen_gvec_shls(MO_32, tmpoff, VuV_off, shift, \
                          sizeof(MMVector), sizeof(MMVector)); \
        tcg_gen_gvec_add(MO_32, VxV_off, VxV_off, tmpoff, \
                         sizeof(MMVector), sizeof(MMVector)); \
        tcg_temp_free(shift); \
    } while (0)
/* Vector max - various forms */
#define fGEN_TCG_V6_vmaxw(SHORTCODE) \
    tcg_gen_gvec_smax(MO_32, VdV_off, VuV_off, VvV_off, \
                      sizeof(MMVector), sizeof(MMVector))
#define fGEN_TCG_V6_vmaxh(SHORTCODE) \
    tcg_gen_gvec_smax(MO_16, VdV_off, VuV_off, VvV_off, \
                      sizeof(MMVector), sizeof(MMVector))
#define fGEN_TCG_V6_vmaxuh(SHORTCODE) \
    tcg_gen_gvec_umax(MO_16, VdV_off, VuV_off, VvV_off, \
                      sizeof(MMVector), sizeof(MMVector))
#define fGEN_TCG_V6_vmaxb(SHORTCODE) \
    tcg_gen_gvec_smax(MO_8, VdV_off, VuV_off, VvV_off, \
                      sizeof(MMVector), sizeof(MMVector))
#define fGEN_TCG_V6_vmaxub(SHORTCODE) \
    tcg_gen_gvec_umax(MO_8, VdV_off, VuV_off, VvV_off, \
                      sizeof(MMVector), sizeof(MMVector))

/* Vector min - various forms */
#define fGEN_TCG_V6_vminw(SHORTCODE) \
    tcg_gen_gvec_smin(MO_32, VdV_off, VuV_off, VvV_off, \
                      sizeof(MMVector), sizeof(MMVector))
#define fGEN_TCG_V6_vminh(SHORTCODE) \
    tcg_gen_gvec_smin(MO_16, VdV_off, VuV_off, VvV_off, \
                      sizeof(MMVector), sizeof(MMVector))
#define fGEN_TCG_V6_vminuh(SHORTCODE) \
    tcg_gen_gvec_umin(MO_16, VdV_off, VuV_off, VvV_off, \
                      sizeof(MMVector), sizeof(MMVector))
#define fGEN_TCG_V6_vminb(SHORTCODE) \
    tcg_gen_gvec_smin(MO_8, VdV_off, VuV_off, VvV_off, \
                      sizeof(MMVector), sizeof(MMVector))
#define fGEN_TCG_V6_vminub(SHORTCODE) \
    tcg_gen_gvec_umin(MO_8, VdV_off, VuV_off, VvV_off, \
                      sizeof(MMVector), sizeof(MMVector))
/* Vector logical ops */
#define fGEN_TCG_V6_vxor(SHORTCODE) \
    tcg_gen_gvec_xor(MO_64, VdV_off, VuV_off, VvV_off, \
                     sizeof(MMVector), sizeof(MMVector))
#define fGEN_TCG_V6_vand(SHORTCODE) \
    tcg_gen_gvec_and(MO_64, VdV_off, VuV_off, VvV_off, \
                     sizeof(MMVector), sizeof(MMVector))
#define fGEN_TCG_V6_vor(SHORTCODE) \
    tcg_gen_gvec_or(MO_64, VdV_off, VuV_off, VvV_off, \
                    sizeof(MMVector), sizeof(MMVector))
#define fGEN_TCG_V6_vnot(SHORTCODE) \
    tcg_gen_gvec_not(MO_64, VdV_off, VuV_off, \
                     sizeof(MMVector), sizeof(MMVector))

/* Q register logical ops */
#define fGEN_TCG_V6_pred_or(SHORTCODE) \
    tcg_gen_gvec_or(MO_64, QdV_off, QsV_off, QtV_off, \
                    sizeof(MMQReg), sizeof(MMQReg))
#define fGEN_TCG_V6_pred_and(SHORTCODE) \
    tcg_gen_gvec_and(MO_64, QdV_off, QsV_off, QtV_off, \
                     sizeof(MMQReg), sizeof(MMQReg))
#define fGEN_TCG_V6_pred_xor(SHORTCODE) \
    tcg_gen_gvec_xor(MO_64, QdV_off, QsV_off, QtV_off, \
                     sizeof(MMQReg), sizeof(MMQReg))
#define fGEN_TCG_V6_pred_or_n(SHORTCODE) \
    tcg_gen_gvec_orc(MO_64, QdV_off, QsV_off, QtV_off, \
                     sizeof(MMQReg), sizeof(MMQReg))
#define fGEN_TCG_V6_pred_and_n(SHORTCODE) \
    tcg_gen_gvec_andc(MO_64, QdV_off, QsV_off, QtV_off, \
                      sizeof(MMQReg), sizeof(MMQReg))
#define fGEN_TCG_V6_pred_not(SHORTCODE) \
    tcg_gen_gvec_not(MO_64, QdV_off, QsV_off, \
                     sizeof(MMQReg), sizeof(MMQReg))
/* Vector compares */
#define fGEN_TCG_VEC_CMP(COND, TYPE, SIZE) \
    do { \
        intptr_t tmpoff = offsetof(CPUHexagonState, vtmp); \
        tcg_gen_gvec_cmp(COND, TYPE, tmpoff, VuV_off, VvV_off, \
                         sizeof(MMVector), sizeof(MMVector)); \
        vec_to_qvec(SIZE, QdV_off, tmpoff); \
    } while (0)
#define fGEN_TCG_V6_vgtw(SHORTCODE) \
    fGEN_TCG_VEC_CMP(TCG_COND_GT, MO_32, 4)
#define fGEN_TCG_V6_vgth(SHORTCODE) \
    fGEN_TCG_VEC_CMP(TCG_COND_GT, MO_16, 2)
#define fGEN_TCG_V6_vgtb(SHORTCODE) \
    fGEN_TCG_VEC_CMP(TCG_COND_GT, MO_8, 1)
#define fGEN_TCG_V6_vgtuw(SHORTCODE) \
    fGEN_TCG_VEC_CMP(TCG_COND_GTU, MO_32, 4)
#define fGEN_TCG_V6_vgtuh(SHORTCODE) \
    fGEN_TCG_VEC_CMP(TCG_COND_GTU, MO_16, 2)
#define fGEN_TCG_V6_vgtub(SHORTCODE) \
    fGEN_TCG_VEC_CMP(TCG_COND_GTU, MO_8, 1)
#define fGEN_TCG_V6_veqw(SHORTCODE) \
    fGEN_TCG_VEC_CMP(TCG_COND_EQ, MO_32, 4)
#define fGEN_TCG_V6_veqh(SHORTCODE) \
    fGEN_TCG_VEC_CMP(TCG_COND_EQ, MO_16, 2)
#define fGEN_TCG_V6_veqb(SHORTCODE) \
    fGEN_TCG_VEC_CMP(TCG_COND_EQ, MO_8, 1)
#define fGEN_TCG_VEC_CMP_OP(COND, TYPE, SIZE, OP) \
    do { \
        intptr_t tmpoff = offsetof(CPUHexagonState, vtmp); \
        intptr_t qoff = offsetof(CPUHexagonState, qtmp); \
        tcg_gen_gvec_cmp(COND, TYPE, tmpoff, VuV_off, VvV_off, \
                         sizeof(MMVector), sizeof(MMVector)); \
        vec_to_qvec(SIZE, qoff, tmpoff); \
        OP(MO_64, QxV_off, QxV_off, qoff, sizeof(MMQReg), sizeof(MMQReg)); \
    } while (0)
#define fGEN_TCG_V6_vgtw_and(SHORTCODE) \
fGEN_TCG_VEC_CMP_OP(TCG_COND_GT, MO_32, 4, tcg_gen_gvec_and)
#define fGEN_TCG_V6_vgtw_or(SHORTCODE) \
fGEN_TCG_VEC_CMP_OP(TCG_COND_GT, MO_32, 4, tcg_gen_gvec_or)
#define fGEN_TCG_V6_vgtw_xor(SHORTCODE) \
fGEN_TCG_VEC_CMP_OP(TCG_COND_GT, MO_32, 4, tcg_gen_gvec_xor)
#define fGEN_TCG_V6_vgtuw_and(SHORTCODE) \
fGEN_TCG_VEC_CMP_OP(TCG_COND_GTU, MO_32, 4, tcg_gen_gvec_and)
#define fGEN_TCG_V6_vgtuw_or(SHORTCODE) \
fGEN_TCG_VEC_CMP_OP(TCG_COND_GTU, MO_32, 4, tcg_gen_gvec_or)
#define fGEN_TCG_V6_vgtuw_xor(SHORTCODE) \
fGEN_TCG_VEC_CMP_OP(TCG_COND_GTU, MO_32, 4, tcg_gen_gvec_xor)
#define fGEN_TCG_V6_vgth_and(SHORTCODE) \
fGEN_TCG_VEC_CMP_OP(TCG_COND_GT, MO_16, 2, tcg_gen_gvec_and)
#define fGEN_TCG_V6_vgth_or(SHORTCODE) \
fGEN_TCG_VEC_CMP_OP(TCG_COND_GT, MO_16, 2, tcg_gen_gvec_or)
#define fGEN_TCG_V6_vgth_xor(SHORTCODE) \
fGEN_TCG_VEC_CMP_OP(TCG_COND_GT, MO_16, 2, tcg_gen_gvec_xor)
#define fGEN_TCG_V6_vgtuh_and(SHORTCODE) \
fGEN_TCG_VEC_CMP_OP(TCG_COND_GTU, MO_16, 2, tcg_gen_gvec_and)
#define fGEN_TCG_V6_vgtuh_or(SHORTCODE) \
fGEN_TCG_VEC_CMP_OP(TCG_COND_GTU, MO_16, 2, tcg_gen_gvec_or)
#define fGEN_TCG_V6_vgtuh_xor(SHORTCODE) \
fGEN_TCG_VEC_CMP_OP(TCG_COND_GTU, MO_16, 2, tcg_gen_gvec_xor)
#define fGEN_TCG_V6_vgtb_and(SHORTCODE) \
fGEN_TCG_VEC_CMP_OP(TCG_COND_GT, MO_8, 1, tcg_gen_gvec_and)
#define fGEN_TCG_V6_vgtb_or(SHORTCODE) \
fGEN_TCG_VEC_CMP_OP(TCG_COND_GT, MO_8, 1, tcg_gen_gvec_or)
#define fGEN_TCG_V6_vgtb_xor(SHORTCODE) \
fGEN_TCG_VEC_CMP_OP(TCG_COND_GT, MO_8, 1, tcg_gen_gvec_xor)
#define fGEN_TCG_V6_vgtub_and(SHORTCODE) \
fGEN_TCG_VEC_CMP_OP(TCG_COND_GTU, MO_8, 1, tcg_gen_gvec_and)
#define fGEN_TCG_V6_vgtub_or(SHORTCODE) \
fGEN_TCG_VEC_CMP_OP(TCG_COND_GTU, MO_8, 1, tcg_gen_gvec_or)
#define fGEN_TCG_V6_vgtub_xor(SHORTCODE) \
fGEN_TCG_VEC_CMP_OP(TCG_COND_GTU, MO_8, 1, tcg_gen_gvec_xor)
#define fGEN_TCG_V6_veqw_and(SHORTCODE) \
fGEN_TCG_VEC_CMP_OP(TCG_COND_EQ, MO_32, 4, tcg_gen_gvec_and)
#define fGEN_TCG_V6_veqw_or(SHORTCODE) \
fGEN_TCG_VEC_CMP_OP(TCG_COND_EQ, MO_32, 4, tcg_gen_gvec_or)
#define fGEN_TCG_V6_veqw_xor(SHORTCODE) \
fGEN_TCG_VEC_CMP_OP(TCG_COND_EQ, MO_32, 4, tcg_gen_gvec_xor)
#define fGEN_TCG_V6_veqh_and(SHORTCODE) \
fGEN_TCG_VEC_CMP_OP(TCG_COND_EQ, MO_16, 2, tcg_gen_gvec_and)
#define fGEN_TCG_V6_veqh_or(SHORTCODE) \
fGEN_TCG_VEC_CMP_OP(TCG_COND_EQ, MO_16, 2, tcg_gen_gvec_or)
#define fGEN_TCG_V6_veqh_xor(SHORTCODE) \
fGEN_TCG_VEC_CMP_OP(TCG_COND_EQ, MO_16, 2, tcg_gen_gvec_xor)
#define fGEN_TCG_V6_veqb_and(SHORTCODE) \
fGEN_TCG_VEC_CMP_OP(TCG_COND_EQ, MO_8, 1, tcg_gen_gvec_and)
#define fGEN_TCG_V6_veqb_or(SHORTCODE) \
fGEN_TCG_VEC_CMP_OP(TCG_COND_EQ, MO_8, 1, tcg_gen_gvec_or)
#define fGEN_TCG_V6_veqb_xor(SHORTCODE) \
fGEN_TCG_VEC_CMP_OP(TCG_COND_EQ, MO_8, 1, tcg_gen_gvec_xor)
/* Vector splat - various forms */
#define fGEN_TCG_V6_lvsplatw(SHORTCODE) \
tcg_gen_gvec_dup_i32(MO_32, VdV_off, \
sizeof(MMVector), sizeof(MMVector), RtV)
#define fGEN_TCG_V6_lvsplath(SHORTCODE) \
tcg_gen_gvec_dup_i32(MO_16, VdV_off, \
sizeof(MMVector), sizeof(MMVector), RtV)
#define fGEN_TCG_V6_lvsplatb(SHORTCODE) \
tcg_gen_gvec_dup_i32(MO_8, VdV_off, \
sizeof(MMVector), sizeof(MMVector), RtV)
/* Vector absolute value - various forms */
#define fGEN_TCG_V6_vabsb(SHORTCODE) \
tcg_gen_gvec_abs(MO_8, VdV_off, VuV_off, \
sizeof(MMVector), sizeof(MMVector))
#define fGEN_TCG_V6_vabsh(SHORTCODE) \
tcg_gen_gvec_abs(MO_16, VdV_off, VuV_off, \
sizeof(MMVector), sizeof(MMVector))
#define fGEN_TCG_V6_vabsw(SHORTCODE) \
tcg_gen_gvec_abs(MO_32, VdV_off, VuV_off, \
sizeof(MMVector), sizeof(MMVector))
/* Vector loads */
#define fGEN_TCG_V6_vL32b_pi(SHORTCODE) SHORTCODE
#define fGEN_TCG_V6_vL32Ub_pi(SHORTCODE) SHORTCODE
#define fGEN_TCG_V6_vL32b_cur_pi(SHORTCODE) SHORTCODE
#define fGEN_TCG_V6_vL32b_tmp_pi(SHORTCODE) SHORTCODE
#define fGEN_TCG_V6_vL32b_nt_pi(SHORTCODE) SHORTCODE
#define fGEN_TCG_V6_vL32b_nt_cur_pi(SHORTCODE) SHORTCODE
#define fGEN_TCG_V6_vL32b_nt_tmp_pi(SHORTCODE) SHORTCODE
#define fGEN_TCG_V6_vL32b_ai(SHORTCODE) SHORTCODE
#define fGEN_TCG_V6_vL32Ub_ai(SHORTCODE) SHORTCODE
#define fGEN_TCG_V6_vL32b_cur_ai(SHORTCODE) SHORTCODE
#define fGEN_TCG_V6_vL32b_tmp_ai(SHORTCODE) SHORTCODE
#define fGEN_TCG_V6_vL32b_nt_ai(SHORTCODE) SHORTCODE
#define fGEN_TCG_V6_vL32b_nt_cur_ai(SHORTCODE) SHORTCODE
#define fGEN_TCG_V6_vL32b_nt_tmp_ai(SHORTCODE) SHORTCODE
#define fGEN_TCG_V6_vL32b_ppu(SHORTCODE) SHORTCODE
#define fGEN_TCG_V6_vL32Ub_ppu(SHORTCODE) SHORTCODE
#define fGEN_TCG_V6_vL32b_cur_ppu(SHORTCODE) SHORTCODE
#define fGEN_TCG_V6_vL32b_tmp_ppu(SHORTCODE) SHORTCODE
#define fGEN_TCG_V6_vL32b_nt_ppu(SHORTCODE) SHORTCODE
#define fGEN_TCG_V6_vL32b_nt_cur_ppu(SHORTCODE) SHORTCODE
#define fGEN_TCG_V6_vL32b_nt_tmp_ppu(SHORTCODE) SHORTCODE
/* Predicated vector loads */
#define fGEN_TCG_PRED_VEC_LOAD(GET_EA, PRED, DSTOFF, INC) \
do { \
TCGv LSB = tcg_temp_new(); \
TCGLabel *false_label = gen_new_label(); \
TCGLabel *end_label = gen_new_label(); \
GET_EA; \
PRED; \
tcg_gen_brcondi_tl(TCG_COND_EQ, LSB, 0, false_label); \
tcg_temp_free(LSB); \
gen_vreg_load(ctx, DSTOFF, EA, true); \
INC; \
tcg_gen_br(end_label); \
gen_set_label(false_label); \
tcg_gen_ori_tl(hex_slot_cancelled, hex_slot_cancelled, \
1 << insn->slot); \
gen_set_label(end_label); \
} while (0)
#define fGEN_TCG_PRED_VEC_LOAD_pred_pi \
fGEN_TCG_PRED_VEC_LOAD(fLSBOLD(PvV), \
fEA_REG(RxV), \
VdV_off, \
fPM_I(RxV, siV * sizeof(MMVector)))
#define fGEN_TCG_PRED_VEC_LOAD_npred_pi \
fGEN_TCG_PRED_VEC_LOAD(fLSBOLDNOT(PvV), \
fEA_REG(RxV), \
VdV_off, \
fPM_I(RxV, siV * sizeof(MMVector)))
#define fGEN_TCG_V6_vL32b_pred_pi(SHORTCODE) \
fGEN_TCG_PRED_VEC_LOAD_pred_pi
#define fGEN_TCG_V6_vL32b_npred_pi(SHORTCODE) \
fGEN_TCG_PRED_VEC_LOAD_npred_pi
#define fGEN_TCG_V6_vL32b_cur_pred_pi(SHORTCODE) \
fGEN_TCG_PRED_VEC_LOAD_pred_pi
#define fGEN_TCG_V6_vL32b_cur_npred_pi(SHORTCODE) \
fGEN_TCG_PRED_VEC_LOAD_npred_pi
#define fGEN_TCG_V6_vL32b_tmp_pred_pi(SHORTCODE) \
fGEN_TCG_PRED_VEC_LOAD_pred_pi
#define fGEN_TCG_V6_vL32b_tmp_npred_pi(SHORTCODE) \
fGEN_TCG_PRED_VEC_LOAD_npred_pi
#define fGEN_TCG_V6_vL32b_nt_pred_pi(SHORTCODE) \
fGEN_TCG_PRED_VEC_LOAD_pred_pi
#define fGEN_TCG_V6_vL32b_nt_npred_pi(SHORTCODE) \
fGEN_TCG_PRED_VEC_LOAD_npred_pi
#define fGEN_TCG_V6_vL32b_nt_cur_pred_pi(SHORTCODE) \
fGEN_TCG_PRED_VEC_LOAD_pred_pi
#define fGEN_TCG_V6_vL32b_nt_cur_npred_pi(SHORTCODE) \
fGEN_TCG_PRED_VEC_LOAD_npred_pi
#define fGEN_TCG_V6_vL32b_nt_tmp_pred_pi(SHORTCODE) \
fGEN_TCG_PRED_VEC_LOAD_pred_pi
#define fGEN_TCG_V6_vL32b_nt_tmp_npred_pi(SHORTCODE) \
fGEN_TCG_PRED_VEC_LOAD_npred_pi
#define fGEN_TCG_PRED_VEC_LOAD_pred_ai \
fGEN_TCG_PRED_VEC_LOAD(fLSBOLD(PvV), \
fEA_RI(RtV, siV * sizeof(MMVector)), \
VdV_off, \
do {} while (0))
#define fGEN_TCG_PRED_VEC_LOAD_npred_ai \
fGEN_TCG_PRED_VEC_LOAD(fLSBOLDNOT(PvV), \
fEA_RI(RtV, siV * sizeof(MMVector)), \
VdV_off, \
do {} while (0))
#define fGEN_TCG_V6_vL32b_pred_ai(SHORTCODE) \
fGEN_TCG_PRED_VEC_LOAD_pred_ai
#define fGEN_TCG_V6_vL32b_npred_ai(SHORTCODE) \
fGEN_TCG_PRED_VEC_LOAD_npred_ai
#define fGEN_TCG_V6_vL32b_cur_pred_ai(SHORTCODE) \
fGEN_TCG_PRED_VEC_LOAD_pred_ai
#define fGEN_TCG_V6_vL32b_cur_npred_ai(SHORTCODE) \
fGEN_TCG_PRED_VEC_LOAD_npred_ai
#define fGEN_TCG_V6_vL32b_tmp_pred_ai(SHORTCODE) \
fGEN_TCG_PRED_VEC_LOAD_pred_ai
#define fGEN_TCG_V6_vL32b_tmp_npred_ai(SHORTCODE) \
fGEN_TCG_PRED_VEC_LOAD_npred_ai
#define fGEN_TCG_V6_vL32b_nt_pred_ai(SHORTCODE) \
fGEN_TCG_PRED_VEC_LOAD_pred_ai
#define fGEN_TCG_V6_vL32b_nt_npred_ai(SHORTCODE) \
fGEN_TCG_PRED_VEC_LOAD_npred_ai
#define fGEN_TCG_V6_vL32b_nt_cur_pred_ai(SHORTCODE) \
fGEN_TCG_PRED_VEC_LOAD_pred_ai
#define fGEN_TCG_V6_vL32b_nt_cur_npred_ai(SHORTCODE) \
fGEN_TCG_PRED_VEC_LOAD_npred_ai
#define fGEN_TCG_V6_vL32b_nt_tmp_pred_ai(SHORTCODE) \
fGEN_TCG_PRED_VEC_LOAD_pred_ai
#define fGEN_TCG_V6_vL32b_nt_tmp_npred_ai(SHORTCODE) \
fGEN_TCG_PRED_VEC_LOAD_npred_ai
#define fGEN_TCG_PRED_VEC_LOAD_pred_ppu \
fGEN_TCG_PRED_VEC_LOAD(fLSBOLD(PvV), \
fEA_REG(RxV), \
VdV_off, \
fPM_M(RxV, MuV))
#define fGEN_TCG_PRED_VEC_LOAD_npred_ppu \
fGEN_TCG_PRED_VEC_LOAD(fLSBOLDNOT(PvV), \
fEA_REG(RxV), \
VdV_off, \
fPM_M(RxV, MuV))
#define fGEN_TCG_V6_vL32b_pred_ppu(SHORTCODE) \
fGEN_TCG_PRED_VEC_LOAD_pred_ppu
#define fGEN_TCG_V6_vL32b_npred_ppu(SHORTCODE) \
fGEN_TCG_PRED_VEC_LOAD_npred_ppu
#define fGEN_TCG_V6_vL32b_cur_pred_ppu(SHORTCODE) \
fGEN_TCG_PRED_VEC_LOAD_pred_ppu
#define fGEN_TCG_V6_vL32b_cur_npred_ppu(SHORTCODE) \
fGEN_TCG_PRED_VEC_LOAD_npred_ppu
#define fGEN_TCG_V6_vL32b_tmp_pred_ppu(SHORTCODE) \
fGEN_TCG_PRED_VEC_LOAD_pred_ppu
#define fGEN_TCG_V6_vL32b_tmp_npred_ppu(SHORTCODE) \
fGEN_TCG_PRED_VEC_LOAD_npred_ppu
#define fGEN_TCG_V6_vL32b_nt_pred_ppu(SHORTCODE) \
fGEN_TCG_PRED_VEC_LOAD_pred_ppu
#define fGEN_TCG_V6_vL32b_nt_npred_ppu(SHORTCODE) \
fGEN_TCG_PRED_VEC_LOAD_npred_ppu
#define fGEN_TCG_V6_vL32b_nt_cur_pred_ppu(SHORTCODE) \
fGEN_TCG_PRED_VEC_LOAD_pred_ppu
#define fGEN_TCG_V6_vL32b_nt_cur_npred_ppu(SHORTCODE) \
fGEN_TCG_PRED_VEC_LOAD_npred_ppu
#define fGEN_TCG_V6_vL32b_nt_tmp_pred_ppu(SHORTCODE) \
fGEN_TCG_PRED_VEC_LOAD_pred_ppu
#define fGEN_TCG_V6_vL32b_nt_tmp_npred_ppu(SHORTCODE) \
fGEN_TCG_PRED_VEC_LOAD_npred_ppu
/* Vector stores */
#define fGEN_TCG_V6_vS32b_pi(SHORTCODE) SHORTCODE
#define fGEN_TCG_V6_vS32Ub_pi(SHORTCODE) SHORTCODE
#define fGEN_TCG_V6_vS32b_nt_pi(SHORTCODE) SHORTCODE
#define fGEN_TCG_V6_vS32b_ai(SHORTCODE) SHORTCODE
#define fGEN_TCG_V6_vS32Ub_ai(SHORTCODE) SHORTCODE
#define fGEN_TCG_V6_vS32b_nt_ai(SHORTCODE) SHORTCODE
#define fGEN_TCG_V6_vS32b_ppu(SHORTCODE) SHORTCODE
#define fGEN_TCG_V6_vS32Ub_ppu(SHORTCODE) SHORTCODE
#define fGEN_TCG_V6_vS32b_nt_ppu(SHORTCODE) SHORTCODE
/* New value vector stores */
#define fGEN_TCG_NEWVAL_VEC_STORE(GET_EA, INC) \
do { \
GET_EA; \
gen_vreg_store(ctx, insn, pkt, EA, OsN_off, insn->slot, true); \
INC; \
} while (0)
#define fGEN_TCG_NEWVAL_VEC_STORE_pi \
fGEN_TCG_NEWVAL_VEC_STORE(fEA_REG(RxV), fPM_I(RxV, siV * sizeof(MMVector)))
#define fGEN_TCG_V6_vS32b_new_pi(SHORTCODE) \
fGEN_TCG_NEWVAL_VEC_STORE_pi
#define fGEN_TCG_V6_vS32b_nt_new_pi(SHORTCODE) \
fGEN_TCG_NEWVAL_VEC_STORE_pi
#define fGEN_TCG_NEWVAL_VEC_STORE_ai \
fGEN_TCG_NEWVAL_VEC_STORE(fEA_RI(RtV, siV * sizeof(MMVector)), \
do { } while (0))
#define fGEN_TCG_V6_vS32b_new_ai(SHORTCODE) \
fGEN_TCG_NEWVAL_VEC_STORE_ai
#define fGEN_TCG_V6_vS32b_nt_new_ai(SHORTCODE) \
fGEN_TCG_NEWVAL_VEC_STORE_ai
#define fGEN_TCG_NEWVAL_VEC_STORE_ppu \
fGEN_TCG_NEWVAL_VEC_STORE(fEA_REG(RxV), fPM_M(RxV, MuV))
#define fGEN_TCG_V6_vS32b_new_ppu(SHORTCODE) \
fGEN_TCG_NEWVAL_VEC_STORE_ppu
#define fGEN_TCG_V6_vS32b_nt_new_ppu(SHORTCODE) \
fGEN_TCG_NEWVAL_VEC_STORE_ppu
/* Predicated vector stores */
#define fGEN_TCG_PRED_VEC_STORE(GET_EA, PRED, SRCOFF, ALIGN, INC) \
do { \
TCGv LSB = tcg_temp_new(); \
TCGLabel *false_label = gen_new_label(); \
TCGLabel *end_label = gen_new_label(); \
GET_EA; \
PRED; \
tcg_gen_brcondi_tl(TCG_COND_EQ, LSB, 0, false_label); \
tcg_temp_free(LSB); \
gen_vreg_store(ctx, insn, pkt, EA, SRCOFF, insn->slot, ALIGN); \
INC; \
tcg_gen_br(end_label); \
gen_set_label(false_label); \
tcg_gen_ori_tl(hex_slot_cancelled, hex_slot_cancelled, \
1 << insn->slot); \
gen_set_label(end_label); \
} while (0)
#define fGEN_TCG_PRED_VEC_STORE_pred_pi(ALIGN) \
fGEN_TCG_PRED_VEC_STORE(fLSBOLD(PvV), \
fEA_REG(RxV), \
VsV_off, ALIGN, \
fPM_I(RxV, siV * sizeof(MMVector)))
#define fGEN_TCG_PRED_VEC_STORE_npred_pi(ALIGN) \
fGEN_TCG_PRED_VEC_STORE(fLSBOLDNOT(PvV), \
fEA_REG(RxV), \
VsV_off, ALIGN, \
fPM_I(RxV, siV * sizeof(MMVector)))
#define fGEN_TCG_PRED_VEC_STORE_new_pred_pi \
fGEN_TCG_PRED_VEC_STORE(fLSBOLD(PvV), \
fEA_REG(RxV), \
OsN_off, true, \
fPM_I(RxV, siV * sizeof(MMVector)))
#define fGEN_TCG_PRED_VEC_STORE_new_npred_pi \
fGEN_TCG_PRED_VEC_STORE(fLSBOLDNOT(PvV), \
fEA_REG(RxV), \
OsN_off, true, \
fPM_I(RxV, siV * sizeof(MMVector)))
#define fGEN_TCG_V6_vS32b_pred_pi(SHORTCODE) \
fGEN_TCG_PRED_VEC_STORE_pred_pi(true)
#define fGEN_TCG_V6_vS32b_npred_pi(SHORTCODE) \
fGEN_TCG_PRED_VEC_STORE_npred_pi(true)
#define fGEN_TCG_V6_vS32Ub_pred_pi(SHORTCODE) \
fGEN_TCG_PRED_VEC_STORE_pred_pi(false)
#define fGEN_TCG_V6_vS32Ub_npred_pi(SHORTCODE) \
fGEN_TCG_PRED_VEC_STORE_npred_pi(false)
#define fGEN_TCG_V6_vS32b_nt_pred_pi(SHORTCODE) \
fGEN_TCG_PRED_VEC_STORE_pred_pi(true)
#define fGEN_TCG_V6_vS32b_nt_npred_pi(SHORTCODE) \
fGEN_TCG_PRED_VEC_STORE_npred_pi(true)
#define fGEN_TCG_V6_vS32b_new_pred_pi(SHORTCODE) \
fGEN_TCG_PRED_VEC_STORE_new_pred_pi
#define fGEN_TCG_V6_vS32b_new_npred_pi(SHORTCODE) \
fGEN_TCG_PRED_VEC_STORE_new_npred_pi
#define fGEN_TCG_V6_vS32b_nt_new_pred_pi(SHORTCODE) \
fGEN_TCG_PRED_VEC_STORE_new_pred_pi
#define fGEN_TCG_V6_vS32b_nt_new_npred_pi(SHORTCODE) \
fGEN_TCG_PRED_VEC_STORE_new_npred_pi
#define fGEN_TCG_PRED_VEC_STORE_pred_ai(ALIGN) \
fGEN_TCG_PRED_VEC_STORE(fLSBOLD(PvV), \
fEA_RI(RtV, siV * sizeof(MMVector)), \
VsV_off, ALIGN, \
do { } while (0))
#define fGEN_TCG_PRED_VEC_STORE_npred_ai(ALIGN) \
fGEN_TCG_PRED_VEC_STORE(fLSBOLDNOT(PvV), \
fEA_RI(RtV, siV * sizeof(MMVector)), \
VsV_off, ALIGN, \
do { } while (0))
#define fGEN_TCG_PRED_VEC_STORE_new_pred_ai \
fGEN_TCG_PRED_VEC_STORE(fLSBOLD(PvV), \
fEA_RI(RtV, siV * sizeof(MMVector)), \
OsN_off, true, \
do { } while (0))
#define fGEN_TCG_PRED_VEC_STORE_new_npred_ai \
fGEN_TCG_PRED_VEC_STORE(fLSBOLDNOT(PvV), \
fEA_RI(RtV, siV * sizeof(MMVector)), \
OsN_off, true, \
do { } while (0))
#define fGEN_TCG_V6_vS32b_pred_ai(SHORTCODE) \
fGEN_TCG_PRED_VEC_STORE_pred_ai(true)
#define fGEN_TCG_V6_vS32b_npred_ai(SHORTCODE) \
fGEN_TCG_PRED_VEC_STORE_npred_ai(true)
#define fGEN_TCG_V6_vS32Ub_pred_ai(SHORTCODE) \
fGEN_TCG_PRED_VEC_STORE_pred_ai(false)
#define fGEN_TCG_V6_vS32Ub_npred_ai(SHORTCODE) \
fGEN_TCG_PRED_VEC_STORE_npred_ai(false)
#define fGEN_TCG_V6_vS32b_nt_pred_ai(SHORTCODE) \
fGEN_TCG_PRED_VEC_STORE_pred_ai(true)
#define fGEN_TCG_V6_vS32b_nt_npred_ai(SHORTCODE) \
fGEN_TCG_PRED_VEC_STORE_npred_ai(true)
#define fGEN_TCG_V6_vS32b_new_pred_ai(SHORTCODE) \
fGEN_TCG_PRED_VEC_STORE_new_pred_ai
#define fGEN_TCG_V6_vS32b_new_npred_ai(SHORTCODE) \
fGEN_TCG_PRED_VEC_STORE_new_npred_ai
#define fGEN_TCG_V6_vS32b_nt_new_pred_ai(SHORTCODE) \
fGEN_TCG_PRED_VEC_STORE_new_pred_ai
#define fGEN_TCG_V6_vS32b_nt_new_npred_ai(SHORTCODE) \
fGEN_TCG_PRED_VEC_STORE_new_npred_ai
#define fGEN_TCG_PRED_VEC_STORE_pred_ppu(ALIGN) \
fGEN_TCG_PRED_VEC_STORE(fLSBOLD(PvV), \
fEA_REG(RxV), \
VsV_off, ALIGN, \
fPM_M(RxV, MuV))
#define fGEN_TCG_PRED_VEC_STORE_npred_ppu(ALIGN) \
fGEN_TCG_PRED_VEC_STORE(fLSBOLDNOT(PvV), \
fEA_REG(RxV), \
VsV_off, ALIGN, \
fPM_M(RxV, MuV))
#define fGEN_TCG_PRED_VEC_STORE_new_pred_ppu \
fGEN_TCG_PRED_VEC_STORE(fLSBOLD(PvV), \
fEA_REG(RxV), \
OsN_off, true, \
fPM_M(RxV, MuV))
#define fGEN_TCG_PRED_VEC_STORE_new_npred_ppu \
fGEN_TCG_PRED_VEC_STORE(fLSBOLDNOT(PvV), \
fEA_REG(RxV), \
OsN_off, true, \
fPM_M(RxV, MuV))
#define fGEN_TCG_V6_vS32b_pred_ppu(SHORTCODE) \
fGEN_TCG_PRED_VEC_STORE_pred_ppu(true)
#define fGEN_TCG_V6_vS32b_npred_ppu(SHORTCODE) \
fGEN_TCG_PRED_VEC_STORE_npred_ppu(true)
#define fGEN_TCG_V6_vS32Ub_pred_ppu(SHORTCODE) \
fGEN_TCG_PRED_VEC_STORE_pred_ppu(false)
#define fGEN_TCG_V6_vS32Ub_npred_ppu(SHORTCODE) \
fGEN_TCG_PRED_VEC_STORE_npred_ppu(false)
#define fGEN_TCG_V6_vS32b_nt_pred_ppu(SHORTCODE) \
fGEN_TCG_PRED_VEC_STORE_pred_ppu(true)
#define fGEN_TCG_V6_vS32b_nt_npred_ppu(SHORTCODE) \
fGEN_TCG_PRED_VEC_STORE_npred_ppu(true)
#define fGEN_TCG_V6_vS32b_new_pred_ppu(SHORTCODE) \
fGEN_TCG_PRED_VEC_STORE_new_pred_ppu
#define fGEN_TCG_V6_vS32b_new_npred_ppu(SHORTCODE) \
fGEN_TCG_PRED_VEC_STORE_new_npred_ppu
#define fGEN_TCG_V6_vS32b_nt_new_pred_ppu(SHORTCODE) \
fGEN_TCG_PRED_VEC_STORE_new_pred_ppu
#define fGEN_TCG_V6_vS32b_nt_new_npred_ppu(SHORTCODE) \
fGEN_TCG_PRED_VEC_STORE_new_npred_ppu
/* Masked vector stores */
#define fGEN_TCG_V6_vS32b_qpred_pi(SHORTCODE) SHORTCODE
#define fGEN_TCG_V6_vS32b_nt_qpred_pi(SHORTCODE) SHORTCODE
#define fGEN_TCG_V6_vS32b_qpred_ai(SHORTCODE) SHORTCODE
#define fGEN_TCG_V6_vS32b_nt_qpred_ai(SHORTCODE) SHORTCODE
#define fGEN_TCG_V6_vS32b_qpred_ppu(SHORTCODE) SHORTCODE
#define fGEN_TCG_V6_vS32b_nt_qpred_ppu(SHORTCODE) SHORTCODE
#define fGEN_TCG_V6_vS32b_nqpred_pi(SHORTCODE) SHORTCODE
#define fGEN_TCG_V6_vS32b_nt_nqpred_pi(SHORTCODE) SHORTCODE
#define fGEN_TCG_V6_vS32b_nqpred_ai(SHORTCODE) SHORTCODE
#define fGEN_TCG_V6_vS32b_nt_nqpred_ai(SHORTCODE) SHORTCODE
#define fGEN_TCG_V6_vS32b_nqpred_ppu(SHORTCODE) SHORTCODE
#define fGEN_TCG_V6_vS32b_nt_nqpred_ppu(SHORTCODE) SHORTCODE
/* Store release is not modelled in QEMU, but we need to suppress compiler warnings */
#define fGEN_TCG_V6_vS32b_srls_pi(SHORTCODE) \
do { \
siV = siV; \
} while (0)
#define fGEN_TCG_V6_vS32b_srls_ai(SHORTCODE) \
do { \
RtV = RtV; \
siV = siV; \
} while (0)
#define fGEN_TCG_V6_vS32b_srls_ppu(SHORTCODE) \
do { \
MuV = MuV; \
} while (0)
#endif

@@ -19,13 +19,16 @@
#include "cpu.h"
#include "internal.h"
#include "tcg/tcg-op.h"
#include "tcg/tcg-op-gvec.h"
#include "insn.h"
#include "opcodes.h"
#include "translate.h"
#define QEMU_GENERATE /* Used internally by macros.h */
#include "macros.h"
#include "mmvec/macros.h"
#undef QEMU_GENERATE
#include "gen_tcg.h"
#include "gen_tcg_hvx.h"
static inline void gen_log_predicated_reg_write(int rnum, TCGv val, int slot)
{
@@ -165,6 +168,9 @@ static inline void gen_read_ctrl_reg(DisasContext *ctx, const int reg_num,
} else if (reg_num == HEX_REG_QEMU_INSN_CNT) {
tcg_gen_addi_tl(dest, hex_gpr[HEX_REG_QEMU_INSN_CNT],
ctx->num_insns);
} else if (reg_num == HEX_REG_QEMU_HVX_CNT) {
tcg_gen_addi_tl(dest, hex_gpr[HEX_REG_QEMU_HVX_CNT],
ctx->num_hvx_insns);
} else {
tcg_gen_mov_tl(dest, hex_gpr[reg_num]);
}
@@ -191,6 +197,12 @@ static inline void gen_read_ctrl_reg_pair(DisasContext *ctx, const int reg_num,
tcg_gen_concat_i32_i64(dest, pkt_cnt, insn_cnt);
tcg_temp_free(pkt_cnt);
tcg_temp_free(insn_cnt);
} else if (reg_num == HEX_REG_QEMU_HVX_CNT) {
TCGv hvx_cnt = tcg_temp_new();
tcg_gen_addi_tl(hvx_cnt, hex_gpr[HEX_REG_QEMU_HVX_CNT],
ctx->num_hvx_insns);
tcg_gen_concat_i32_i64(dest, hvx_cnt, hex_gpr[reg_num + 1]);
tcg_temp_free(hvx_cnt);
} else {
tcg_gen_concat_i32_i64(dest,
hex_gpr[reg_num],
@@ -226,6 +238,9 @@ static inline void gen_write_ctrl_reg(DisasContext *ctx, int reg_num,
if (reg_num == HEX_REG_QEMU_INSN_CNT) {
ctx->num_insns = 0;
}
if (reg_num == HEX_REG_QEMU_HVX_CNT) {
ctx->num_hvx_insns = 0;
}
}
}
@@ -247,6 +262,9 @@ static inline void gen_write_ctrl_reg_pair(DisasContext *ctx, int reg_num,
ctx->num_packets = 0;
ctx->num_insns = 0;
}
if (reg_num == HEX_REG_QEMU_HVX_CNT) {
ctx->num_hvx_insns = 0;
}
}
}
@@ -446,5 +464,175 @@ static TCGv gen_8bitsof(TCGv result, TCGv value)
return result;
}
static intptr_t vreg_src_off(DisasContext *ctx, int num)
{
intptr_t offset = offsetof(CPUHexagonState, VRegs[num]);
if (test_bit(num, ctx->vregs_select)) {
offset = ctx_future_vreg_off(ctx, num, 1, false);
}
if (test_bit(num, ctx->vregs_updated_tmp)) {
offset = ctx_tmp_vreg_off(ctx, num, 1, false);
}
return offset;
}
static void gen_log_vreg_write(DisasContext *ctx, intptr_t srcoff, int num,
VRegWriteType type, int slot_num,
bool is_predicated)
{
TCGLabel *label_end = NULL;
intptr_t dstoff;
if (is_predicated) {
TCGv cancelled = tcg_temp_local_new();
label_end = gen_new_label();
/* Don't do anything if the slot was cancelled */
tcg_gen_extract_tl(cancelled, hex_slot_cancelled, slot_num, 1);
tcg_gen_brcondi_tl(TCG_COND_NE, cancelled, 0, label_end);
tcg_temp_free(cancelled);
}
if (type != EXT_TMP) {
dstoff = ctx_future_vreg_off(ctx, num, 1, true);
tcg_gen_gvec_mov(MO_64, dstoff, srcoff,
sizeof(MMVector), sizeof(MMVector));
tcg_gen_ori_tl(hex_VRegs_updated, hex_VRegs_updated, 1 << num);
} else {
dstoff = ctx_tmp_vreg_off(ctx, num, 1, false);
tcg_gen_gvec_mov(MO_64, dstoff, srcoff,
sizeof(MMVector), sizeof(MMVector));
}
if (is_predicated) {
gen_set_label(label_end);
}
}
static void gen_log_vreg_write_pair(DisasContext *ctx, intptr_t srcoff, int num,
VRegWriteType type, int slot_num,
bool is_predicated)
{
gen_log_vreg_write(ctx, srcoff, num ^ 0, type, slot_num, is_predicated);
srcoff += sizeof(MMVector);
gen_log_vreg_write(ctx, srcoff, num ^ 1, type, slot_num, is_predicated);
}
static void gen_log_qreg_write(intptr_t srcoff, int num, int vnew,
int slot_num, bool is_predicated)
{
TCGLabel *label_end = NULL;
intptr_t dstoff;
if (is_predicated) {
TCGv cancelled = tcg_temp_local_new();
label_end = gen_new_label();
/* Don't do anything if the slot was cancelled */
tcg_gen_extract_tl(cancelled, hex_slot_cancelled, slot_num, 1);
tcg_gen_brcondi_tl(TCG_COND_NE, cancelled, 0, label_end);
tcg_temp_free(cancelled);
}
dstoff = offsetof(CPUHexagonState, future_QRegs[num]);
tcg_gen_gvec_mov(MO_64, dstoff, srcoff, sizeof(MMQReg), sizeof(MMQReg));
if (is_predicated) {
tcg_gen_ori_tl(hex_QRegs_updated, hex_QRegs_updated, 1 << num);
gen_set_label(label_end);
}
}
static void gen_vreg_load(DisasContext *ctx, intptr_t dstoff, TCGv src,
bool aligned)
{
TCGv_i64 tmp = tcg_temp_new_i64();
if (aligned) {
tcg_gen_andi_tl(src, src, ~((int32_t)sizeof(MMVector) - 1));
}
for (int i = 0; i < sizeof(MMVector) / 8; i++) {
tcg_gen_qemu_ld64(tmp, src, ctx->mem_idx);
tcg_gen_addi_tl(src, src, 8);
tcg_gen_st_i64(tmp, cpu_env, dstoff + i * 8);
}
tcg_temp_free_i64(tmp);
}
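Worth noting in `gen_vreg_load` above: an aligned access first masks the effective address down to a vector boundary, then moves the vector as 16 loads of 64 bits each. A minimal sketch of the address masking (function name is ours):

```c
#include <assert.h>
#include <stdint.h>

#define VEC_BYTES 128

/*
 * Illustrative model of the alignment step in gen_vreg_load:
 * force the effective address down to a 128-byte boundary.
 */
static uint32_t vec_align_ea(uint32_t ea)
{
    return ea & ~((uint32_t)VEC_BYTES - 1);
}
```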
static void gen_vreg_store(DisasContext *ctx, Insn *insn, Packet *pkt,
TCGv EA, intptr_t srcoff, int slot, bool aligned)
{
intptr_t dstoff = offsetof(CPUHexagonState, vstore[slot].data);
intptr_t maskoff = offsetof(CPUHexagonState, vstore[slot].mask);
if (is_gather_store_insn(insn, pkt)) {
TCGv sl = tcg_constant_tl(slot);
gen_helper_gather_store(cpu_env, EA, sl);
return;
}
tcg_gen_movi_tl(hex_vstore_pending[slot], 1);
if (aligned) {
tcg_gen_andi_tl(hex_vstore_addr[slot], EA,
~((int32_t)sizeof(MMVector) - 1));
} else {
tcg_gen_mov_tl(hex_vstore_addr[slot], EA);
}
tcg_gen_movi_tl(hex_vstore_size[slot], sizeof(MMVector));
/* Copy the data to the vstore buffer */
tcg_gen_gvec_mov(MO_64, dstoff, srcoff, sizeof(MMVector), sizeof(MMVector));
/* Set the mask to all 1's */
tcg_gen_gvec_dup_imm(MO_64, maskoff, sizeof(MMQReg), sizeof(MMQReg), ~0LL);
}
static void gen_vreg_masked_store(DisasContext *ctx, TCGv EA, intptr_t srcoff,
intptr_t bitsoff, int slot, bool invert)
{
intptr_t dstoff = offsetof(CPUHexagonState, vstore[slot].data);
intptr_t maskoff = offsetof(CPUHexagonState, vstore[slot].mask);
tcg_gen_movi_tl(hex_vstore_pending[slot], 1);
tcg_gen_andi_tl(hex_vstore_addr[slot], EA,
~((int32_t)sizeof(MMVector) - 1));
tcg_gen_movi_tl(hex_vstore_size[slot], sizeof(MMVector));
/* Copy the data to the vstore buffer */
tcg_gen_gvec_mov(MO_64, dstoff, srcoff, sizeof(MMVector), sizeof(MMVector));
/* Copy the mask */
tcg_gen_gvec_mov(MO_64, maskoff, bitsoff, sizeof(MMQReg), sizeof(MMQReg));
if (invert) {
tcg_gen_gvec_not(MO_64, maskoff, maskoff,
sizeof(MMQReg), sizeof(MMQReg));
}
}
static void vec_to_qvec(size_t size, intptr_t dstoff, intptr_t srcoff)
{
TCGv_i64 tmp = tcg_temp_new_i64();
TCGv_i64 word = tcg_temp_new_i64();
TCGv_i64 bits = tcg_temp_new_i64();
TCGv_i64 mask = tcg_temp_new_i64();
TCGv_i64 zero = tcg_constant_i64(0);
TCGv_i64 ones = tcg_constant_i64(~0);
for (int i = 0; i < sizeof(MMVector) / 8; i++) {
tcg_gen_ld_i64(tmp, cpu_env, srcoff + i * 8);
tcg_gen_movi_i64(mask, 0);
for (int j = 0; j < 8; j += size) {
tcg_gen_extract_i64(word, tmp, j * 8, size * 8);
tcg_gen_movcond_i64(TCG_COND_NE, bits, word, zero, ones, zero);
tcg_gen_deposit_i64(mask, mask, bits, j, size);
}
tcg_gen_st8_i64(mask, cpu_env, dstoff + i);
}
tcg_temp_free_i64(tmp);
tcg_temp_free_i64(word);
tcg_temp_free_i64(bits);
tcg_temp_free_i64(mask);
}
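The inner loop of `vec_to_qvec` above extracts each element of a 64-bit word, and for every non-zero element deposits `size` set bits into the corresponding mask byte. A scalar sketch of that per-word packing (assuming `size` is 1, 2, or 4 as in the callers; the function name is ours):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative model of vec_to_qvec's inner loop: for one 64-bit
 * word of the compare result, produce one Q-register mask byte
 * with `size` bits set per non-zero element (size = 1, 2, or 4).
 */
static uint8_t qbits_for_word(uint64_t word, int size)
{
    uint8_t mask = 0;
    for (int j = 0; j < 8; j += size) {
        uint64_t elem = (word >> (j * 8)) & ((1ULL << (size * 8)) - 1);
        if (elem != 0) {
            mask |= (uint8_t)(((1u << size) - 1) << j);
        }
    }
    return mask;
}
```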
#include "tcg_funcs_generated.c.inc"
#include "tcg_func_table_generated.c.inc"

@@ -23,6 +23,8 @@ DEF_HELPER_1(debug_start_packet, void, env)
DEF_HELPER_FLAGS_3(debug_check_store_width, TCG_CALL_NO_WG, void, env, int, int)
DEF_HELPER_FLAGS_3(debug_commit_end, TCG_CALL_NO_WG, void, env, int, int)
DEF_HELPER_2(commit_store, void, env, int)
DEF_HELPER_3(gather_store, void, env, i32, int)
DEF_HELPER_1(commit_hvx_stores, void, env)
DEF_HELPER_FLAGS_4(fcircadd, TCG_CALL_NO_RWG_SE, s32, s32, s32, s32, s32)
DEF_HELPER_FLAGS_1(fbrev, TCG_CALL_NO_RWG_SE, i32, i32)
DEF_HELPER_3(sfrecipa, i64, env, f32, f32)
@@ -90,4 +92,18 @@ DEF_HELPER_4(sffms_lib, f32, env, f32, f32, f32)
DEF_HELPER_3(dfmpyfix, f64, env, f64, f64)
DEF_HELPER_4(dfmpyhh, f64, env, f64, f64, f64)
/* Histogram instructions */
DEF_HELPER_1(vhist, void, env)
DEF_HELPER_1(vhistq, void, env)
DEF_HELPER_1(vwhist256, void, env)
DEF_HELPER_1(vwhist256q, void, env)
DEF_HELPER_1(vwhist256_sat, void, env)
DEF_HELPER_1(vwhist256q_sat, void, env)
DEF_HELPER_1(vwhist128, void, env)
DEF_HELPER_1(vwhist128q, void, env)
DEF_HELPER_2(vwhist128m, void, env, s32)
DEF_HELPER_2(vwhist128qm, void, env, s32)
DEF_HELPER_2(probe_pkt_scalar_store_s0, void, env, int)
DEF_HELPER_2(probe_hvx_stores, void, env, int)
DEF_HELPER_3(probe_pkt_scalar_hvx_stores, void, env, int, int)

@@ -19,6 +19,7 @@
#define HEXAGON_ARCH_TYPES_H
#include "qemu/osdep.h"
#include "mmvec/mmvec.h"
#include "qemu/int128.h"
/*
@@ -35,4 +36,8 @@ typedef uint64_t size8u_t;
typedef int64_t size8s_t;
typedef Int128 size16s_t;
typedef MMVector mmvector_t;
typedef MMVectorPair mmvector_pair_t;
typedef MMQReg mmqret_t;
#endif

@@ -145,6 +145,9 @@ def compute_tag_immediates(tag):
## P predicate register
## R GPR register
## M modifier register
## Q HVX predicate vector
## V HVX vector register
## O HVX new vector register
## regid can be one of the following
## d, e destination register
## dd destination register pair
@@ -180,6 +183,9 @@ def is_readwrite(regid):
def is_scalar_reg(regtype):
return regtype in "RPC"
def is_hvx_reg(regtype):
return regtype in "VQ"
def is_old_val(regtype, regid, tag):
return regtype+regid+'V' in semdict[tag]
@@ -203,6 +209,13 @@ def need_ea(tag):
def skip_qemu_helper(tag):
return tag in overrides.keys()
def is_tmp_result(tag):
return ('A_CVI_TMP' in attribdict[tag] or
'A_CVI_TMP_DST' in attribdict[tag])
def is_new_result(tag):
return ('A_CVI_NEW' in attribdict[tag])
def imm_name(immlett):
return "%siV" % immlett

@@ -76,6 +76,7 @@ enum {
/* Use reserved control registers for qemu execution counts */
HEX_REG_QEMU_PKT_CNT = 52,
HEX_REG_QEMU_INSN_CNT = 53,
HEX_REG_QEMU_HVX_CNT = 54,
HEX_REG_UTIMERLO = 62,
HEX_REG_UTIMERHI = 63,
};

@@ -0,0 +1,25 @@
/*
* Copyright(c) 2019-2021 Qualcomm Innovation Center, Inc. All Rights Reserved.
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, see <http://www.gnu.org/licenses/>.
*/
/*
* Top level file for all instruction set extensions
*/
#define EXTNAME mmvec
#define EXTSTR "mmvec"
#include "mmvec/ext.idef"
#undef EXTNAME
#undef EXTSTR

@@ -0,0 +1,25 @@
/*
* Copyright(c) 2019-2021 Qualcomm Innovation Center, Inc. All Rights Reserved.
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, see <http://www.gnu.org/licenses/>.
*/
/*
* Top level file for all instruction set extensions
*/
#define EXTNAME mmvec
#define EXTSTR "mmvec"
#include "mmvec/macros.def"
#undef EXTNAME
#undef EXTSTR

@@ -0,0 +1,20 @@
/*
* Copyright(c) 2019-2021 Qualcomm Innovation Center, Inc. All Rights Reserved.
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, see <http://www.gnu.org/licenses/>.
*/
#define EXTNAME mmvec
#include "mmvec/encode_ext.def"
#undef EXTNAME

@@ -28,3 +28,4 @@
#include "shift.idef"
#include "system.idef"
#include "subinsns.idef"
#include "allext.idef"

@@ -71,6 +71,7 @@
#include "encode_pp.def"
#include "encode_subinsn.def"
#include "allextenc.def"
#ifdef __SELF_DEF_FIELD32
#undef __SELF_DEF_FIELD32


@@ -176,6 +176,12 @@ DEF_MACRO(
(A_DOTNEWVALUE,A_RESTRICT_SLOT0ONLY)
)
DEF_MACRO(
fVSATUVALN,
({ ((VAL) < 0) ? 0 : ((1LL<<(N))-1);}),
()
)
DEF_MACRO(
fSATUVALN,
({fSET_OVERFLOW(); ((VAL) < 0) ? 0 : ((1LL<<(N))-1);}),
@@ -188,6 +194,12 @@ DEF_MACRO(
()
)
DEF_MACRO(
fVSATVALN,
({((VAL) < 0) ? (-(1LL<<((N)-1))) : ((1LL<<((N)-1))-1);}),
()
)
DEF_MACRO(
fZXTN, /* macro name */
((VAL) & ((1LL<<(N))-1)),
@@ -205,6 +217,11 @@ DEF_MACRO(
((fSXTN(N,64,VAL) == (VAL)) ? (VAL) : fSATVALN(N,VAL)),
()
)
DEF_MACRO(
fVSATN,
((fSXTN(N,64,VAL) == (VAL)) ? (VAL) : fVSATVALN(N,VAL)),
()
)
DEF_MACRO(
fADDSAT64,
@@ -234,6 +251,12 @@ DEF_MACRO(
()
)
DEF_MACRO(
fVSATUN,
((fZXTN(N,64,VAL) == (VAL)) ? (VAL) : fVSATUVALN(N,VAL)),
()
)
DEF_MACRO(
fSATUN,
((fZXTN(N,64,VAL) == (VAL)) ? (VAL) : fSATUVALN(N,VAL)),
@@ -253,6 +276,19 @@ DEF_MACRO(
()
)
DEF_MACRO(
fVSATH,
(fVSATN(16,VAL)),
()
)
DEF_MACRO(
fVSATUH,
(fVSATUN(16,VAL)),
()
)
DEF_MACRO(
fSATUB,
(fSATUN(8,VAL)),
@@ -265,6 +301,20 @@ DEF_MACRO(
)
DEF_MACRO(
fVSATUB,
(fVSATUN(8,VAL)),
()
)
DEF_MACRO(
fVSATB,
(fVSATN(8,VAL)),
()
)
/*************************************/
/* immediate extension */
/*************************************/
@@ -556,6 +606,18 @@ DEF_MACRO(
/* optional attributes */
)
DEF_MACRO(
fCAST2_2s, /* macro name */
((size2s_t)(A)),
/* optional attributes */
)
DEF_MACRO(
fCAST2_2u, /* macro name */
((size2u_t)(A)),
/* optional attributes */
)
DEF_MACRO(
fCAST4_4s, /* macro name */
((size4s_t)(A)),
@@ -876,6 +938,11 @@ DEF_MACRO(
(((size8s_t)(A))<<N),
/* optional attributes */
)
DEF_MACRO(
fVSATW, /* saturating to 32-bits*/
fVSATN(32,((long long)A)),
()
)
DEF_MACRO(
fSATW, /* saturating to 32-bits*/
@@ -883,6 +950,12 @@ DEF_MACRO(
()
)
DEF_MACRO(
fVSAT, /* saturating to 32-bits*/
fVSATN(32,(A)),
()
)
DEF_MACRO(
fSAT, /* saturating to 32-bits*/
fSATN(32,(A)),
@@ -1389,6 +1462,11 @@ DEF_MACRO(fSETBITS,
/*************************************/
/* Used for parity, etc........ */
/*************************************/
DEF_MACRO(fCOUNTONES_2,
count_ones_2(VAL),
/* nothing */
)
DEF_MACRO(fCOUNTONES_4,
count_ones_4(VAL),
/* nothing */
@@ -1419,6 +1497,11 @@ DEF_MACRO(fCL1_4,
/* nothing */
)
DEF_MACRO(fCL1_2,
count_leading_ones_2(VAL),
/* nothing */
)
DEF_MACRO(fINTERLEAVE,
interleave(ODD,EVEN),
/* nothing */
@@ -1576,3 +1659,8 @@ DEF_MACRO(fBRANCH_SPECULATE_STALL,
},
()
)
DEF_MACRO(IV1DEAD,
,
()
)


@@ -0,0 +1,794 @@
/*
* Copyright(c) 2019-2021 Qualcomm Innovation Center, Inc. All Rights Reserved.
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, see <http://www.gnu.org/licenses/>.
*/
#define CONCAT(A,B) A##B
#define EXTEXTNAME(X) CONCAT(EXT_,X)
#define DEF_ENC(TAG,STR) DEF_EXT_ENC(TAG,EXTEXTNAME(EXTNAME),STR)
#ifndef NO_MMVEC
DEF_ENC(V6_extractw, ICLASS_LD" 001 0 000sssss PP0uuuuu --1ddddd") /* coproc insn, returns Rd */
#endif
#ifndef NO_MMVEC
DEF_CLASS32(ICLASS_NCJ" 1--- -------- PP------ --------",COPROC_VMEM)
DEF_CLASS32(ICLASS_NCJ" 1000 0-0ttttt PPi--iii ---ddddd",BaseOffset_VMEM_Loads)
DEF_CLASS32(ICLASS_NCJ" 1000 1-0ttttt PPivviii ---ddddd",BaseOffset_if_Pv_VMEM_Loads)
DEF_CLASS32(ICLASS_NCJ" 1000 0-1ttttt PPi--iii --------",BaseOffset_VMEM_Stores1)
DEF_CLASS32(ICLASS_NCJ" 1000 1-0ttttt PPi--iii 00------",BaseOffset_VMEM_Stores2)
DEF_CLASS32(ICLASS_NCJ" 1000 1-1ttttt PPivviii --------",BaseOffset_if_Pv_VMEM_Stores)
DEF_CLASS32(ICLASS_NCJ" 1001 0-0xxxxx PP---iii ---ddddd",PostImm_VMEM_Loads)
DEF_CLASS32(ICLASS_NCJ" 1001 1-0xxxxx PP-vviii ---ddddd",PostImm_if_Pv_VMEM_Loads)
DEF_CLASS32(ICLASS_NCJ" 1001 0-1xxxxx PP---iii --------",PostImm_VMEM_Stores1)
DEF_CLASS32(ICLASS_NCJ" 1001 1-0xxxxx PP---iii 00------",PostImm_VMEM_Stores2)
DEF_CLASS32(ICLASS_NCJ" 1001 1-1xxxxx PP-vviii --------",PostImm_if_Pv_VMEM_Stores)
DEF_CLASS32(ICLASS_NCJ" 1011 0-0xxxxx PPu----- ---ddddd",PostM_VMEM_Loads)
DEF_CLASS32(ICLASS_NCJ" 1011 1-0xxxxx PPuvv--- ---ddddd",PostM_if_Pv_VMEM_Loads)
DEF_CLASS32(ICLASS_NCJ" 1011 0-1xxxxx PPu----- --------",PostM_VMEM_Stores1)
DEF_CLASS32(ICLASS_NCJ" 1011 1-0xxxxx PPu----- 00------",PostM_VMEM_Stores2)
DEF_CLASS32(ICLASS_NCJ" 1011 1-1xxxxx PPuvv--- --------",PostM_if_Pv_VMEM_Stores)
DEF_CLASS32(ICLASS_NCJ" 110- 0------- PP------ --------",Z_Load)
DEF_CLASS32(ICLASS_NCJ" 110- 1------- PP------ --------",Z_Load_if_Pv)
DEF_CLASS32(ICLASS_NCJ" 1111 000ttttt PPu--0-- ---vvvvv",Gather)
DEF_CLASS32(ICLASS_NCJ" 1111 000ttttt PPu--1-- -ssvvvvv",Gather_if_Qs)
DEF_CLASS32(ICLASS_NCJ" 1111 001ttttt PPuvvvvv ---wwwww",Scatter)
DEF_CLASS32(ICLASS_NCJ" 1111 001ttttt PPuvvvvv -----sss",Scatter_New)
DEF_CLASS32(ICLASS_NCJ" 1111 1--ttttt PPuvvvvv -sswwwww",Scatter_if_Qs)
DEF_FIELD32(ICLASS_NCJ" 1--- -!------ PP------ --------",NT,"NonTemporal")
DEF_FIELDROW_DESC32( ICLASS_NCJ" 1 000 --- ----- PP i --iii ----- ---","[#0] vmem(Rt+#s4)[:nt]")
#define LDST_ENC(TAG,MAJ3,MID3,RREG,TINY6,MIN3,VREG) DEF_ENC(TAG, ICLASS_NCJ "1" #MAJ3 #MID3 #RREG "PP" #TINY6 #MIN3 #VREG)
#define LDST_BO(TAGPRE,MID3,PRED,MIN3,VREG) LDST_ENC(TAGPRE##_ai, 000,MID3,ttttt,i PRED iii,MIN3,VREG)
#define LDST_PI(TAGPRE,MID3,PRED,MIN3,VREG) LDST_ENC(TAGPRE##_pi, 001,MID3,xxxxx,- PRED iii,MIN3,VREG)
#define LDST_PM(TAGPRE,MID3,PRED,MIN3,VREG) LDST_ENC(TAGPRE##_ppu,011,MID3,xxxxx,u PRED ---,MIN3,VREG)
#define LDST_BASICLD(OP,TAGPRE) \
OP(TAGPRE, 000,00,000,ddddd) \
OP(TAGPRE##_nt, 010,00,000,ddddd) \
OP(TAGPRE##_cur, 000,00,001,ddddd) \
OP(TAGPRE##_nt_cur, 010,00,001,ddddd) \
OP(TAGPRE##_tmp, 000,00,010,ddddd) \
OP(TAGPRE##_nt_tmp, 010,00,010,ddddd)
#define LDST_BASICST(OP,TAGPRE) \
OP(TAGPRE, 001,--,000,sssss) \
OP(TAGPRE##_nt, 011,--,000,sssss) \
OP(TAGPRE##_new, 001,--,001,-0sss) \
OP(TAGPRE##_srls, 001,--,001,-1---) \
OP(TAGPRE##_nt_new, 011,--,001,--sss)
#define LDST_QPREDST(OP,TAGPRE) \
OP(TAGPRE##_qpred, 100,vv,000,sssss) \
OP(TAGPRE##_nt_qpred, 110,vv,000,sssss) \
OP(TAGPRE##_nqpred, 100,vv,001,sssss) \
OP(TAGPRE##_nt_nqpred,110,vv,001,sssss)
#define LDST_CONDLD(OP,TAGPRE) \
OP(TAGPRE##_pred, 100,vv,010,ddddd) \
OP(TAGPRE##_nt_pred, 110,vv,010,ddddd) \
OP(TAGPRE##_npred, 100,vv,011,ddddd) \
OP(TAGPRE##_nt_npred, 110,vv,011,ddddd) \
OP(TAGPRE##_cur_pred, 100,vv,100,ddddd) \
OP(TAGPRE##_nt_cur_pred, 110,vv,100,ddddd) \
OP(TAGPRE##_cur_npred, 100,vv,101,ddddd) \
OP(TAGPRE##_nt_cur_npred, 110,vv,101,ddddd) \
OP(TAGPRE##_tmp_pred, 100,vv,110,ddddd) \
OP(TAGPRE##_nt_tmp_pred, 110,vv,110,ddddd) \
OP(TAGPRE##_tmp_npred, 100,vv,111,ddddd) \
OP(TAGPRE##_nt_tmp_npred, 110,vv,111,ddddd)
#define LDST_PREDST(OP,TAGPRE,NT,MIN2) \
OP(TAGPRE##_pred, 1 NT 1,vv,MIN2 0,sssss) \
OP(TAGPRE##_npred, 1 NT 1,vv,MIN2 1,sssss)
#define LDST_PREDSTNEW(OP,TAGPRE,NT,MIN2) \
OP(TAGPRE##_pred, 1 NT 1,vv,MIN2 0,NT 0 sss) \
OP(TAGPRE##_npred, 1 NT 1,vv,MIN2 1,NT 1 sss)
// 0.0,vv,0--,sssss: pred st
#define LDST_BASICPREDST(OP,TAGPRE) \
LDST_PREDST(OP,TAGPRE, 0,00) \
LDST_PREDST(OP,TAGPRE##_nt, 1,00) \
LDST_PREDSTNEW(OP,TAGPRE##_new, 0,01) \
LDST_PREDSTNEW(OP,TAGPRE##_nt_new, 1,01)
LDST_BASICLD(LDST_BO,V6_vL32b)
LDST_CONDLD(LDST_BO,V6_vL32b)
LDST_BASICLD(LDST_PI,V6_vL32b)
LDST_CONDLD(LDST_PI,V6_vL32b)
LDST_BASICLD(LDST_PM,V6_vL32b)
LDST_CONDLD(LDST_PM,V6_vL32b)
// Loads
LDST_BO(V6_vL32Ub,000,00,111,ddddd)
//Stores
LDST_BASICST(LDST_BO,V6_vS32b)
LDST_BO(V6_vS32Ub,001,--,111,sssss)
// Byte Enabled Stores
LDST_QPREDST(LDST_BO,V6_vS32b)
// Scalar Predicated Stores
LDST_BASICPREDST(LDST_BO,V6_vS32b)
LDST_PREDST(LDST_BO,V6_vS32Ub,0,11)
DEF_FIELDROW_DESC32( ICLASS_NCJ" 1 001 --- ----- PP - ----- ddddd ---","[#1] vmem(Rx++#s3)[:nt]")
// Loads
LDST_PI(V6_vL32Ub,000,00,111,ddddd)
//Stores
LDST_BASICST(LDST_PI,V6_vS32b)
LDST_PI(V6_vS32Ub,001,--,111,sssss)
// Byte Enabled Stores
LDST_QPREDST(LDST_PI,V6_vS32b)
// Scalar Predicated Stores
LDST_BASICPREDST(LDST_PI,V6_vS32b)
LDST_PREDST(LDST_PI,V6_vS32Ub,0,11)
DEF_FIELDROW_DESC32( ICLASS_NCJ" 1 011 --- ----- PP - ----- ----- ---","[#3] vmem(Rx++#M)[:nt]")
// Loads
LDST_PM(V6_vL32Ub,000,00,111,ddddd)
//Stores
LDST_BASICST(LDST_PM,V6_vS32b)
LDST_PM(V6_vS32Ub,001,--,111,sssss)
// Byte Enabled Stores
LDST_QPREDST(LDST_PM,V6_vS32b)
// Scalar Predicated Stores
LDST_BASICPREDST(LDST_PM,V6_vS32b)
LDST_PREDST(LDST_PM,V6_vS32Ub,0,11)
DEF_ENC(V6_vaddcarrysat, ICLASS_CJ" 1 101 100 vvvvv PP 1 uuuuu 0ss ddddd") //
DEF_ENC(V6_vaddcarryo, ICLASS_CJ" 1 101 101 vvvvv PP 1 uuuuu 0ee ddddd") //
DEF_ENC(V6_vsubcarryo, ICLASS_CJ" 1 101 101 vvvvv PP 1 uuuuu 1ee ddddd") //
DEF_ENC(V6_vsatdw, ICLASS_CJ" 1 101 100 vvvvv PP 1 uuuuu 111 ddddd") //
DEF_FIELDROW_DESC32( ICLASS_NCJ" 1 111 --- ----- PP - ----- ----- ---","[#6] vgather,vscatter")
DEF_ENC(V6_vgathermw, ICLASS_NCJ" 1 111 000 ttttt PP u --000 --- vvvvv") // vtmp.w=vmem(Rt32,Mu2,Vv32.w).w
DEF_ENC(V6_vgathermh, ICLASS_NCJ" 1 111 000 ttttt PP u --001 --- vvvvv") // vtmp.h=vmem(Rt32,Mu2,Vv32.h).h
DEF_ENC(V6_vgathermhw, ICLASS_NCJ" 1 111 000 ttttt PP u --010 --- vvvvv") // vtmp.h=vmem(Rt32,Mu2,Vvv32.w).h
DEF_ENC(V6_vgathermwq, ICLASS_NCJ" 1 111 000 ttttt PP u --100 -ss vvvvv") // if (Qs4) vtmp.w=vmem(Rt32,Mu2,Vv32.w).w
DEF_ENC(V6_vgathermhq, ICLASS_NCJ" 1 111 000 ttttt PP u --101 -ss vvvvv") // if (Qs4) vtmp.h=vmem(Rt32,Mu2,Vv32.h).h
DEF_ENC(V6_vgathermhwq, ICLASS_NCJ" 1 111 000 ttttt PP u --110 -ss vvvvv") // if (Qs4) vtmp.h=vmem(Rt32,Mu2,Vvv32.w).h
DEF_ENC(V6_vscattermw, ICLASS_NCJ" 1 111 001 ttttt PP u vvvvv 000 wwwww") // vmem(Rt32,Mu2,Vv32.w)=Vw32.w
DEF_ENC(V6_vscattermh, ICLASS_NCJ" 1 111 001 ttttt PP u vvvvv 001 wwwww") // vmem(Rt32,Mu2,Vv32.h)=Vw32.h
DEF_ENC(V6_vscattermhw, ICLASS_NCJ" 1 111 001 ttttt PP u vvvvv 010 wwwww") // vmem(Rt32,Mu2,Vv32.h)=Vw32.h
DEF_ENC(V6_vscattermw_add, ICLASS_NCJ" 1 111 001 ttttt PP u vvvvv 100 wwwww") // vmem(Rt32,Mu2,Vv32.w) += Vw32.w
DEF_ENC(V6_vscattermh_add, ICLASS_NCJ" 1 111 001 ttttt PP u vvvvv 101 wwwww") // vmem(Rt32,Mu2,Vv32.h) += Vw32.h
DEF_ENC(V6_vscattermhw_add, ICLASS_NCJ" 1 111 001 ttttt PP u vvvvv 110 wwwww") // vmem(Rt32,Mu2,Vv32.h) += Vw32.h
DEF_ENC(V6_vscattermwq, ICLASS_NCJ" 1 111 100 ttttt PP u vvvvv 0ss wwwww") // if (Qs4) vmem(Rt32,Mu2,Vv32.w)=Vw32.w
DEF_ENC(V6_vscattermhq, ICLASS_NCJ" 1 111 100 ttttt PP u vvvvv 1ss wwwww") // if (Qs4) vmem(Rt32,Mu2,Vv32.h)=Vw32.h
DEF_ENC(V6_vscattermhwq, ICLASS_NCJ" 1 111 101 ttttt PP u vvvvv 0ss wwwww") // if (Qs4) vmem(Rt32,Mu2,Vv32.h)=Vw32.h
DEF_CLASS32(ICLASS_CJ" 1--- -------- PP------ --------",COPROC_VX)
/***************************************************************
*
* Group #0, Uses Q6 Rt8: new in v61
*
****************************************************************/
DEF_FIELDROW_DESC32( ICLASS_CJ" 1 000 --- ----- PP - ----- ----- ---","[#1] Vd32=(Vu32, Vv32, Rt8)")
DEF_ENC(V6_vasrhbsat, ICLASS_CJ" 1 000 vvv vvttt PP 0 uuuuu 000 ddddd") //
DEF_ENC(V6_vasruwuhrndsat, ICLASS_CJ" 1 000 vvv vvttt PP 0 uuuuu 001 ddddd") //
DEF_ENC(V6_vasrwuhrndsat, ICLASS_CJ" 1 000 vvv vvttt PP 0 uuuuu 010 ddddd") //
DEF_ENC(V6_vlutvvb_nm, ICLASS_CJ" 1 000 vvv vvttt PP 0 uuuuu 011 ddddd") //
DEF_ENC(V6_vlutvwh_nm, ICLASS_CJ" 1 000 vvv vvttt PP 0 uuuuu 100 ddddd") //
DEF_ENC(V6_vasruhubrndsat, ICLASS_CJ" 1 000 vvv vvttt PP 0 uuuuu 111 ddddd") //
DEF_ENC(V6_vasruwuhsat, ICLASS_CJ" 1 000 vvv vvttt PP 1 uuuuu 100 ddddd") //
DEF_ENC(V6_vasruhubsat, ICLASS_CJ" 1 000 vvv vvttt PP 1 uuuuu 101 ddddd") //
/***************************************************************
*
* Group #1, Uses Q6 Rt32
*
****************************************************************/
DEF_FIELDROW_DESC32( ICLASS_CJ" 1 001 --- ----- PP - ----- ----- ---","[#1] Vd32=(Vu32, Rt32)")
DEF_ENC(V6_vtmpyb, ICLASS_CJ" 1 001 000 ttttt PP 0 uuuuu 000 ddddd") //
DEF_ENC(V6_vtmpybus, ICLASS_CJ" 1 001 000 ttttt PP 0 uuuuu 001 ddddd") //
DEF_ENC(V6_vdmpyhb, ICLASS_CJ" 1 001 000 ttttt PP 0 uuuuu 010 ddddd") //
DEF_ENC(V6_vrmpyub, ICLASS_CJ" 1 001 000 ttttt PP 0 uuuuu 011 ddddd") //
DEF_ENC(V6_vrmpybus, ICLASS_CJ" 1 001 000 ttttt PP 0 uuuuu 100 ddddd") //
DEF_ENC(V6_vdsaduh, ICLASS_CJ" 1 001 000 ttttt PP 0 uuuuu 101 ddddd") //
DEF_ENC(V6_vdmpybus, ICLASS_CJ" 1 001 000 ttttt PP 0 uuuuu 110 ddddd") //
DEF_ENC(V6_vdmpybus_dv, ICLASS_CJ" 1 001 000 ttttt PP 0 uuuuu 111 ddddd") //
DEF_ENC(V6_vdmpyhsusat, ICLASS_CJ" 1 001 001 ttttt PP 0 uuuuu 000 ddddd") //
DEF_ENC(V6_vdmpyhsuisat, ICLASS_CJ" 1 001 001 ttttt PP 0 uuuuu 001 ddddd") //
DEF_ENC(V6_vdmpyhsat, ICLASS_CJ" 1 001 001 ttttt PP 0 uuuuu 010 ddddd") //
DEF_ENC(V6_vdmpyhisat, ICLASS_CJ" 1 001 001 ttttt PP 0 uuuuu 011 ddddd") //
DEF_ENC(V6_vdmpyhb_dv, ICLASS_CJ" 1 001 001 ttttt PP 0 uuuuu 100 ddddd") //
DEF_ENC(V6_vmpybus, ICLASS_CJ" 1 001 001 ttttt PP 0 uuuuu 101 ddddd") //
DEF_ENC(V6_vmpabus, ICLASS_CJ" 1 001 001 ttttt PP 0 uuuuu 110 ddddd") //
DEF_ENC(V6_vmpahb, ICLASS_CJ" 1 001 001 ttttt PP 0 uuuuu 111 ddddd") //
DEF_ENC(V6_vmpyh, ICLASS_CJ" 1 001 010 ttttt PP 0 uuuuu 000 ddddd") //
DEF_ENC(V6_vmpyhss, ICLASS_CJ" 1 001 010 ttttt PP 0 uuuuu 001 ddddd") //
DEF_ENC(V6_vmpyhsrs, ICLASS_CJ" 1 001 010 ttttt PP 0 uuuuu 010 ddddd") //
DEF_ENC(V6_vmpyuh, ICLASS_CJ" 1 001 010 ttttt PP 0 uuuuu 011 ddddd") //
DEF_ENC(V6_vrmpybusi, ICLASS_CJ" 1 001 010 ttttt PP 0 uuuuu 10i ddddd") //
DEF_ENC(V6_vrsadubi, ICLASS_CJ" 1 001 010 ttttt PP 0 uuuuu 11i ddddd") //
DEF_ENC(V6_vmpyihb, ICLASS_CJ" 1 001 011 ttttt PP 0 uuuuu 000 ddddd") //
DEF_ENC(V6_vror, ICLASS_CJ" 1 001 011 ttttt PP 0 uuuuu 001 ddddd") //
DEF_ENC(V6_vmpyuhe, ICLASS_CJ" 1 001 011 ttttt PP 0 uuuuu 010 ddddd") //
DEF_ENC(V6_vmpabuu, ICLASS_CJ" 1 001 011 ttttt PP 0 uuuuu 011 ddddd") //
DEF_ENC(V6_vlut4, ICLASS_CJ" 1 001 011 ttttt PP 0 uuuuu 100 ddddd") //
DEF_ENC(V6_vasrw, ICLASS_CJ" 1 001 011 ttttt PP 0 uuuuu 101 ddddd") //
DEF_ENC(V6_vasrh, ICLASS_CJ" 1 001 011 ttttt PP 0 uuuuu 110 ddddd") //
DEF_ENC(V6_vaslw, ICLASS_CJ" 1 001 011 ttttt PP 0 uuuuu 111 ddddd") //
DEF_ENC(V6_vaslh, ICLASS_CJ" 1 001 100 ttttt PP 0 uuuuu 000 ddddd") //
DEF_ENC(V6_vlsrw, ICLASS_CJ" 1 001 100 ttttt PP 0 uuuuu 001 ddddd") //
DEF_ENC(V6_vlsrh, ICLASS_CJ" 1 001 100 ttttt PP 0 uuuuu 010 ddddd") //
DEF_ENC(V6_vlsrb, ICLASS_CJ" 1 001 100 ttttt PP 0 uuuuu 011 ddddd") //
DEF_ENC(V6_vmpauhb, ICLASS_CJ" 1 001 100 ttttt PP 0 uuuuu 101 ddddd") //
DEF_ENC(V6_vmpyiwub, ICLASS_CJ" 1 001 100 ttttt PP 0 uuuuu 110 ddddd") //
DEF_ENC(V6_vmpyiwh, ICLASS_CJ" 1 001 100 ttttt PP 0 uuuuu 111 ddddd") //
DEF_ENC(V6_vmpyiwb, ICLASS_CJ" 1 001 101 ttttt PP 0 uuuuu 000 ddddd") //
DEF_ENC(V6_lvsplatw, ICLASS_CJ" 1 001 101 ttttt PP 0 ----0 001 ddddd") //
DEF_ENC(V6_pred_scalar2, ICLASS_CJ" 1 001 101 ttttt PP 0 ----- 010 -01dd") //
DEF_ENC(V6_vandvrt, ICLASS_CJ" 1 001 101 ttttt PP 0 uuuuu 010 -10dd") //
DEF_ENC(V6_pred_scalar2v2, ICLASS_CJ" 1 001 101 ttttt PP 0 ----- 010 -11dd") //
DEF_ENC(V6_vtmpyhb, ICLASS_CJ" 1 001 101 ttttt PP 0 uuuuu 100 ddddd") //
DEF_ENC(V6_vandqrt, ICLASS_CJ" 1 001 101 ttttt PP 0 --0uu 101 ddddd") //
DEF_ENC(V6_vandnqrt, ICLASS_CJ" 1 001 101 ttttt PP 0 --1uu 101 ddddd") //
DEF_ENC(V6_vrmpyubi, ICLASS_CJ" 1 001 101 ttttt PP 0 uuuuu 11i ddddd") //
DEF_ENC(V6_vmpyub, ICLASS_CJ" 1 001 110 ttttt PP 0 uuuuu 000 ddddd") //
DEF_ENC(V6_lvsplath, ICLASS_CJ" 1 001 110 ttttt PP 0 ----- 001 ddddd") //
DEF_ENC(V6_lvsplatb, ICLASS_CJ" 1 001 110 ttttt PP 0 ----- 010 ddddd") //
DEF_FIELDROW_DESC32( ICLASS_CJ" 1 001 --- ----- PP - ----- ----- ---","[#1] Vx32=(Vu32, Rt32)")
DEF_ENC(V6_vtmpyb_acc, ICLASS_CJ" 1 001 000 ttttt PP 1 uuuuu 000 xxxxx") //
DEF_ENC(V6_vtmpybus_acc, ICLASS_CJ" 1 001 000 ttttt PP 1 uuuuu 001 xxxxx") //
DEF_ENC(V6_vtmpyhb_acc, ICLASS_CJ" 1 001 000 ttttt PP 1 uuuuu 010 xxxxx") //
DEF_ENC(V6_vdmpyhb_acc, ICLASS_CJ" 1 001 000 ttttt PP 1 uuuuu 011 xxxxx") //
DEF_ENC(V6_vrmpyub_acc, ICLASS_CJ" 1 001 000 ttttt PP 1 uuuuu 100 xxxxx") //
DEF_ENC(V6_vrmpybus_acc, ICLASS_CJ" 1 001 000 ttttt PP 1 uuuuu 101 xxxxx") //
DEF_ENC(V6_vdmpybus_acc, ICLASS_CJ" 1 001 000 ttttt PP 1 uuuuu 110 xxxxx") //
DEF_ENC(V6_vdmpybus_dv_acc, ICLASS_CJ" 1 001 000 ttttt PP 1 uuuuu 111 xxxxx") //
DEF_ENC(V6_vdmpyhsusat_acc, ICLASS_CJ" 1 001 001 ttttt PP 1 uuuuu 000 xxxxx") //
DEF_ENC(V6_vdmpyhsuisat_acc,ICLASS_CJ" 1 001 001 ttttt PP 1 uuuuu 001 xxxxx") //
DEF_ENC(V6_vdmpyhisat_acc, ICLASS_CJ" 1 001 001 ttttt PP 1 uuuuu 010 xxxxx") //
DEF_ENC(V6_vdmpyhsat_acc, ICLASS_CJ" 1 001 001 ttttt PP 1 uuuuu 011 xxxxx") //
DEF_ENC(V6_vdmpyhb_dv_acc, ICLASS_CJ" 1 001 001 ttttt PP 1 uuuuu 100 xxxxx") //
DEF_ENC(V6_vmpybus_acc, ICLASS_CJ" 1 001 001 ttttt PP 1 uuuuu 101 xxxxx") //
DEF_ENC(V6_vmpabus_acc, ICLASS_CJ" 1 001 001 ttttt PP 1 uuuuu 110 xxxxx") //
DEF_ENC(V6_vmpahb_acc, ICLASS_CJ" 1 001 001 ttttt PP 1 uuuuu 111 xxxxx") //
DEF_ENC(V6_vmpyhsat_acc, ICLASS_CJ" 1 001 010 ttttt PP 1 uuuuu 000 xxxxx") //
DEF_ENC(V6_vmpyuh_acc, ICLASS_CJ" 1 001 010 ttttt PP 1 uuuuu 001 xxxxx") //
DEF_ENC(V6_vmpyiwb_acc, ICLASS_CJ" 1 001 010 ttttt PP 1 uuuuu 010 xxxxx") //
DEF_ENC(V6_vmpyiwh_acc, ICLASS_CJ" 1 001 010 ttttt PP 1 uuuuu 011 xxxxx") //
DEF_ENC(V6_vrmpybusi_acc, ICLASS_CJ" 1 001 010 ttttt PP 1 uuuuu 10i xxxxx") //
DEF_ENC(V6_vrsadubi_acc, ICLASS_CJ" 1 001 010 ttttt PP 1 uuuuu 11i xxxxx") //
DEF_ENC(V6_vdsaduh_acc, ICLASS_CJ" 1 001 011 ttttt PP 1 uuuuu 000 xxxxx") //
DEF_ENC(V6_vmpyihb_acc, ICLASS_CJ" 1 001 011 ttttt PP 1 uuuuu 001 xxxxx") //
DEF_ENC(V6_vaslw_acc, ICLASS_CJ" 1 001 011 ttttt PP 1 uuuuu 010 xxxxx") //
DEF_ENC(V6_vandqrt_acc, ICLASS_CJ" 1 001 011 ttttt PP 1 --0uu 011 xxxxx") //
DEF_ENC(V6_vandnqrt_acc, ICLASS_CJ" 1 001 011 ttttt PP 1 --1uu 011 xxxxx") //
DEF_ENC(V6_vandvrt_acc, ICLASS_CJ" 1 001 011 ttttt PP 1 uuuuu 100 ---xx") //
DEF_ENC(V6_vasrw_acc, ICLASS_CJ" 1 001 011 ttttt PP 1 uuuuu 101 xxxxx") //
DEF_ENC(V6_vrmpyubi_acc, ICLASS_CJ" 1 001 011 ttttt PP 1 uuuuu 11i xxxxx") //
DEF_ENC(V6_vmpyub_acc, ICLASS_CJ" 1 001 100 ttttt PP 1 uuuuu 000 xxxxx") //
DEF_ENC(V6_vmpyiwub_acc, ICLASS_CJ" 1 001 100 ttttt PP 1 uuuuu 001 xxxxx") //
DEF_ENC(V6_vmpauhb_acc, ICLASS_CJ" 1 001 100 ttttt PP 1 uuuuu 010 xxxxx") //
DEF_ENC(V6_vmpyuhe_acc, ICLASS_CJ" 1 001 100 ttttt PP 1 uuuuu 011 xxxxx")
DEF_ENC(V6_vmpahhsat, ICLASS_CJ" 1 001 100 ttttt PP 1 uuuuu 100 xxxxx") //
DEF_ENC(V6_vmpauhuhsat, ICLASS_CJ" 1 001 100 ttttt PP 1 uuuuu 101 xxxxx") //
DEF_ENC(V6_vmpsuhuhsat, ICLASS_CJ" 1 001 100 ttttt PP 1 uuuuu 110 xxxxx") //
DEF_ENC(V6_vasrh_acc, ICLASS_CJ" 1 001 100 ttttt PP 1 uuuuu 111 xxxxx") //
DEF_ENC(V6_vinsertwr, ICLASS_CJ" 1 001 101 ttttt PP 1 ----- 001 xxxxx")
DEF_ENC(V6_vmpabuu_acc, ICLASS_CJ" 1 001 101 ttttt PP 1 uuuuu 100 xxxxx") //
DEF_ENC(V6_vaslh_acc, ICLASS_CJ" 1 001 101 ttttt PP 1 uuuuu 101 xxxxx") //
DEF_ENC(V6_vmpyh_acc, ICLASS_CJ" 1 001 101 ttttt PP 1 uuuuu 110 xxxxx") //
DEF_FIELDROW_DESC32( ICLASS_CJ" 1 001 --- ----- PP - ----- ----- ---","[#1] (Vx32, Vy32, Rt32)")
DEF_ENC(V6_vshuff, ICLASS_CJ" 1 001 111 ttttt PP 1 yyyyy 001 xxxxx") //
DEF_ENC(V6_vdeal, ICLASS_CJ" 1 001 111 ttttt PP 1 yyyyy 010 xxxxx") //
DEF_FIELDROW_DESC32( ICLASS_CJ" 1 010 --- ----- PP - ----- ----- ---","[#2] if (Ps) Vd=Vu")
DEF_ENC(V6_vcmov, ICLASS_CJ" 1 010 000 ----- PP - uuuuu -ss ddddd")
DEF_ENC(V6_vncmov, ICLASS_CJ" 1 010 001 ----- PP - uuuuu -ss ddddd")
DEF_ENC(V6_vnccombine, ICLASS_CJ" 1 010 010 vvvvv PP - uuuuu -ss ddddd")
DEF_ENC(V6_vccombine, ICLASS_CJ" 1 010 011 vvvvv PP - uuuuu -ss ddddd")
DEF_ENC(V6_vrotr, ICLASS_CJ" 1 010 100 vvvvv PP 1 uuuuu 111 ddddd")
DEF_ENC(V6_vasr_into, ICLASS_CJ" 1 010 101 vvvvv PP 1 uuuuu 111 xxxxx")
/***************************************************************
*
* Group #3, Uses Q6 Rt8
*
****************************************************************/
DEF_FIELDROW_DESC32( ICLASS_CJ" 1 011 --- ----- PP - ----- ----- ---","[#3] Vd32=(Vu32, Vv32, Rt8)")
DEF_ENC(V6_valignb, ICLASS_CJ" 1 011 vvv vvttt PP 0 uuuuu 000 ddddd") //
DEF_ENC(V6_vlalignb, ICLASS_CJ" 1 011 vvv vvttt PP 0 uuuuu 001 ddddd") //
DEF_ENC(V6_vasrwh, ICLASS_CJ" 1 011 vvv vvttt PP 0 uuuuu 010 ddddd") //
DEF_ENC(V6_vasrwhsat, ICLASS_CJ" 1 011 vvv vvttt PP 0 uuuuu 011 ddddd") //
DEF_ENC(V6_vasrwhrndsat, ICLASS_CJ" 1 011 vvv vvttt PP 0 uuuuu 100 ddddd") //
DEF_ENC(V6_vasrwuhsat, ICLASS_CJ" 1 011 vvv vvttt PP 0 uuuuu 101 ddddd") //
DEF_ENC(V6_vasrhubsat, ICLASS_CJ" 1 011 vvv vvttt PP 0 uuuuu 110 ddddd") //
DEF_ENC(V6_vasrhubrndsat, ICLASS_CJ" 1 011 vvv vvttt PP 0 uuuuu 111 ddddd") //
DEF_ENC(V6_vasrhbrndsat, ICLASS_CJ" 1 011 vvv vvttt PP 1 uuuuu 000 ddddd") //
DEF_ENC(V6_vlutvvb, ICLASS_CJ" 1 011 vvv vvttt PP 1 uuuuu 001 ddddd")
DEF_ENC(V6_vshuffvdd, ICLASS_CJ" 1 011 vvv vvttt PP 1 uuuuu 011 ddddd") //
DEF_ENC(V6_vdealvdd, ICLASS_CJ" 1 011 vvv vvttt PP 1 uuuuu 100 ddddd") //
DEF_ENC(V6_vlutvvb_oracc, ICLASS_CJ" 1 011 vvv vvttt PP 1 uuuuu 101 xxxxx")
DEF_ENC(V6_vlutvwh, ICLASS_CJ" 1 011 vvv vvttt PP 1 uuuuu 110 ddddd")
DEF_ENC(V6_vlutvwh_oracc, ICLASS_CJ" 1 011 vvv vvttt PP 1 uuuuu 111 xxxxx")
/***************************************************************
*
* Group #4, No Q6 regs
*
****************************************************************/
DEF_FIELDROW_DESC32( ICLASS_CJ" 1 100 --- ----- PP 0 ----- ----- ---","[#4] Vd32=(Vu32, Vv32)")
DEF_ENC(V6_vrmpyubv, ICLASS_CJ" 1 100 000 vvvvv PP 0 uuuuu 000 ddddd") //
DEF_ENC(V6_vrmpybv, ICLASS_CJ" 1 100 000 vvvvv PP 0 uuuuu 001 ddddd") //
DEF_ENC(V6_vrmpybusv, ICLASS_CJ" 1 100 000 vvvvv PP 0 uuuuu 010 ddddd") //
DEF_ENC(V6_vdmpyhvsat, ICLASS_CJ" 1 100 000 vvvvv PP 0 uuuuu 011 ddddd") //
DEF_ENC(V6_vmpybv, ICLASS_CJ" 1 100 000 vvvvv PP 0 uuuuu 100 ddddd") //
DEF_ENC(V6_vmpyubv, ICLASS_CJ" 1 100 000 vvvvv PP 0 uuuuu 101 ddddd") //
DEF_ENC(V6_vmpybusv, ICLASS_CJ" 1 100 000 vvvvv PP 0 uuuuu 110 ddddd") //
DEF_ENC(V6_vmpyhv, ICLASS_CJ" 1 100 000 vvvvv PP 0 uuuuu 111 ddddd") //
DEF_ENC(V6_vmpyuhv, ICLASS_CJ" 1 100 001 vvvvv PP 0 uuuuu 000 ddddd") //
DEF_ENC(V6_vmpyhvsrs, ICLASS_CJ" 1 100 001 vvvvv PP 0 uuuuu 001 ddddd") //
DEF_ENC(V6_vmpyhus, ICLASS_CJ" 1 100 001 vvvvv PP 0 uuuuu 010 ddddd") //
DEF_ENC(V6_vmpabusv, ICLASS_CJ" 1 100 001 vvvvv PP 0 uuuuu 011 ddddd") //
DEF_ENC(V6_vmpyih, ICLASS_CJ" 1 100 001 vvvvv PP 0 uuuuu 100 ddddd") //
DEF_ENC(V6_vand, ICLASS_CJ" 1 100 001 vvvvv PP 0 uuuuu 101 ddddd") //
DEF_ENC(V6_vor, ICLASS_CJ" 1 100 001 vvvvv PP 0 uuuuu 110 ddddd") //
DEF_ENC(V6_vxor, ICLASS_CJ" 1 100 001 vvvvv PP 0 uuuuu 111 ddddd") //
DEF_ENC(V6_vaddw, ICLASS_CJ" 1 100 010 vvvvv PP 0 uuuuu 000 ddddd") //
DEF_ENC(V6_vaddubsat, ICLASS_CJ" 1 100 010 vvvvv PP 0 uuuuu 001 ddddd") //
DEF_ENC(V6_vadduhsat, ICLASS_CJ" 1 100 010 vvvvv PP 0 uuuuu 010 ddddd") //
DEF_ENC(V6_vaddhsat, ICLASS_CJ" 1 100 010 vvvvv PP 0 uuuuu 011 ddddd") //
DEF_ENC(V6_vaddwsat, ICLASS_CJ" 1 100 010 vvvvv PP 0 uuuuu 100 ddddd") //
DEF_ENC(V6_vsubb, ICLASS_CJ" 1 100 010 vvvvv PP 0 uuuuu 101 ddddd") //
DEF_ENC(V6_vsubh, ICLASS_CJ" 1 100 010 vvvvv PP 0 uuuuu 110 ddddd") //
DEF_ENC(V6_vsubw, ICLASS_CJ" 1 100 010 vvvvv PP 0 uuuuu 111 ddddd") //
DEF_ENC(V6_vsububsat, ICLASS_CJ" 1 100 011 vvvvv PP 0 uuuuu 000 ddddd") //
DEF_ENC(V6_vsubuhsat, ICLASS_CJ" 1 100 011 vvvvv PP 0 uuuuu 001 ddddd") //
DEF_ENC(V6_vsubhsat, ICLASS_CJ" 1 100 011 vvvvv PP 0 uuuuu 010 ddddd") //
DEF_ENC(V6_vsubwsat, ICLASS_CJ" 1 100 011 vvvvv PP 0 uuuuu 011 ddddd") //
DEF_ENC(V6_vaddb_dv, ICLASS_CJ" 1 100 011 vvvvv PP 0 uuuuu 100 ddddd") //
DEF_ENC(V6_vaddh_dv, ICLASS_CJ" 1 100 011 vvvvv PP 0 uuuuu 101 ddddd") //
DEF_ENC(V6_vaddw_dv, ICLASS_CJ" 1 100 011 vvvvv PP 0 uuuuu 110 ddddd") //
DEF_ENC(V6_vaddubsat_dv,ICLASS_CJ" 1 100 011 vvvvv PP 0 uuuuu 111 ddddd") //
DEF_ENC(V6_vadduhsat_dv,ICLASS_CJ" 1 100 100 vvvvv PP 0 uuuuu 000 ddddd") //
DEF_ENC(V6_vaddhsat_dv, ICLASS_CJ" 1 100 100 vvvvv PP 0 uuuuu 001 ddddd") //
DEF_ENC(V6_vaddwsat_dv, ICLASS_CJ" 1 100 100 vvvvv PP 0 uuuuu 010 ddddd") //
DEF_ENC(V6_vsubb_dv, ICLASS_CJ" 1 100 100 vvvvv PP 0 uuuuu 011 ddddd") //
DEF_ENC(V6_vsubh_dv, ICLASS_CJ" 1 100 100 vvvvv PP 0 uuuuu 100 ddddd") //
DEF_ENC(V6_vsubw_dv, ICLASS_CJ" 1 100 100 vvvvv PP 0 uuuuu 101 ddddd") //
DEF_ENC(V6_vsububsat_dv,ICLASS_CJ" 1 100 100 vvvvv PP 0 uuuuu 110 ddddd") //
DEF_ENC(V6_vsubuhsat_dv,ICLASS_CJ" 1 100 100 vvvvv PP 0 uuuuu 111 ddddd") //
DEF_ENC(V6_vsubhsat_dv, ICLASS_CJ" 1 100 101 vvvvv PP 0 uuuuu 000 ddddd") //
DEF_ENC(V6_vsubwsat_dv, ICLASS_CJ" 1 100 101 vvvvv PP 0 uuuuu 001 ddddd") //
DEF_ENC(V6_vaddubh, ICLASS_CJ" 1 100 101 vvvvv PP 0 uuuuu 010 ddddd") //
DEF_ENC(V6_vadduhw, ICLASS_CJ" 1 100 101 vvvvv PP 0 uuuuu 011 ddddd") //
DEF_ENC(V6_vaddhw, ICLASS_CJ" 1 100 101 vvvvv PP 0 uuuuu 100 ddddd") //
DEF_ENC(V6_vsububh, ICLASS_CJ" 1 100 101 vvvvv PP 0 uuuuu 101 ddddd") //
DEF_ENC(V6_vsubuhw, ICLASS_CJ" 1 100 101 vvvvv PP 0 uuuuu 110 ddddd") //
DEF_ENC(V6_vsubhw, ICLASS_CJ" 1 100 101 vvvvv PP 0 uuuuu 111 ddddd") //
DEF_ENC(V6_vabsdiffub, ICLASS_CJ" 1 100 110 vvvvv PP 0 uuuuu 000 ddddd") //
DEF_ENC(V6_vabsdiffh, ICLASS_CJ" 1 100 110 vvvvv PP 0 uuuuu 001 ddddd") //
DEF_ENC(V6_vabsdiffuh, ICLASS_CJ" 1 100 110 vvvvv PP 0 uuuuu 010 ddddd") //
DEF_ENC(V6_vabsdiffw, ICLASS_CJ" 1 100 110 vvvvv PP 0 uuuuu 011 ddddd") //
DEF_ENC(V6_vavgub, ICLASS_CJ" 1 100 110 vvvvv PP 0 uuuuu 100 ddddd") //
DEF_ENC(V6_vavguh, ICLASS_CJ" 1 100 110 vvvvv PP 0 uuuuu 101 ddddd") //
DEF_ENC(V6_vavgh, ICLASS_CJ" 1 100 110 vvvvv PP 0 uuuuu 110 ddddd") //
DEF_ENC(V6_vavgw, ICLASS_CJ" 1 100 110 vvvvv PP 0 uuuuu 111 ddddd") //
DEF_ENC(V6_vnavgub, ICLASS_CJ" 1 100 111 vvvvv PP 0 uuuuu 000 ddddd") //
DEF_ENC(V6_vnavgh, ICLASS_CJ" 1 100 111 vvvvv PP 0 uuuuu 001 ddddd") //
DEF_ENC(V6_vnavgw, ICLASS_CJ" 1 100 111 vvvvv PP 0 uuuuu 010 ddddd") //
DEF_ENC(V6_vavgubrnd, ICLASS_CJ" 1 100 111 vvvvv PP 0 uuuuu 011 ddddd") //
DEF_ENC(V6_vavguhrnd, ICLASS_CJ" 1 100 111 vvvvv PP 0 uuuuu 100 ddddd") //
DEF_ENC(V6_vavghrnd, ICLASS_CJ" 1 100 111 vvvvv PP 0 uuuuu 101 ddddd") //
DEF_ENC(V6_vavgwrnd, ICLASS_CJ" 1 100 111 vvvvv PP 0 uuuuu 110 ddddd") //
DEF_ENC(V6_vmpabuuv, ICLASS_CJ" 1 100 111 vvvvv PP 0 uuuuu 111 ddddd") //
DEF_FIELDROW_DESC32( ICLASS_CJ" 1 100 --- ----- PP 1 ----- ----- ---","[#4] Vx32=(Vu32, Vv32)")
DEF_ENC(V6_vrmpyubv_acc, ICLASS_CJ" 1 100 000 vvvvv PP 1 uuuuu 000 xxxxx") //
DEF_ENC(V6_vrmpybv_acc, ICLASS_CJ" 1 100 000 vvvvv PP 1 uuuuu 001 xxxxx") //
DEF_ENC(V6_vrmpybusv_acc, ICLASS_CJ" 1 100 000 vvvvv PP 1 uuuuu 010 xxxxx") //
DEF_ENC(V6_vdmpyhvsat_acc, ICLASS_CJ" 1 100 000 vvvvv PP 1 uuuuu 011 xxxxx") //
DEF_ENC(V6_vmpybv_acc, ICLASS_CJ" 1 100 000 vvvvv PP 1 uuuuu 100 xxxxx") //
DEF_ENC(V6_vmpyubv_acc, ICLASS_CJ" 1 100 000 vvvvv PP 1 uuuuu 101 xxxxx") //
DEF_ENC(V6_vmpybusv_acc, ICLASS_CJ" 1 100 000 vvvvv PP 1 uuuuu 110 xxxxx") //
DEF_ENC(V6_vmpyhv_acc, ICLASS_CJ" 1 100 000 vvvvv PP 1 uuuuu 111 xxxxx") //
DEF_ENC(V6_vmpyuhv_acc, ICLASS_CJ" 1 100 001 vvvvv PP 1 uuuuu 000 xxxxx") //
DEF_ENC(V6_vmpyhus_acc, ICLASS_CJ" 1 100 001 vvvvv PP 1 uuuuu 001 xxxxx") //
DEF_ENC(V6_vaddhw_acc, ICLASS_CJ" 1 100 001 vvvvv PP 1 uuuuu 010 xxxxx") //
DEF_ENC(V6_vmpyowh_64_acc, ICLASS_CJ" 1 100 001 vvvvv PP 1 uuuuu 011 xxxxx")
DEF_ENC(V6_vmpyih_acc, ICLASS_CJ" 1 100 001 vvvvv PP 1 uuuuu 100 xxxxx") //
DEF_ENC(V6_vmpyiewuh_acc, ICLASS_CJ" 1 100 001 vvvvv PP 1 uuuuu 101 xxxxx") //
DEF_ENC(V6_vmpyowh_sacc, ICLASS_CJ" 1 100 001 vvvvv PP 1 uuuuu 110 xxxxx") //
DEF_ENC(V6_vmpyowh_rnd_sacc,ICLASS_CJ" 1 100 001 vvvvv PP 1 uuuuu 111 xxxxx") //
DEF_ENC(V6_vmpyiewh_acc, ICLASS_CJ" 1 100 010 vvvvv PP 1 uuuuu 000 xxxxx") //
DEF_ENC(V6_vadduhw_acc, ICLASS_CJ" 1 100 010 vvvvv PP 1 uuuuu 100 xxxxx") //
DEF_ENC(V6_vaddubh_acc, ICLASS_CJ" 1 100 010 vvvvv PP 1 uuuuu 101 xxxxx") //
DEF_FIELDROW_DESC32( ICLASS_CJ" 1 100 100 ----- PP 1 ----- ----- ---","[#4] Qx4=(Vu32, Vv32)")
// Grouped by element size (lsbs), comparison (next-lsbs), and combining operation (next-lsbs)
DEF_ENC(V6_veqb_and, ICLASS_CJ" 1 100 100 vvvvv PP 1 uuuuu 000 000xx") //
DEF_ENC(V6_veqh_and, ICLASS_CJ" 1 100 100 vvvvv PP 1 uuuuu 000 001xx") //
DEF_ENC(V6_veqw_and, ICLASS_CJ" 1 100 100 vvvvv PP 1 uuuuu 000 010xx") //
DEF_ENC(V6_vgtb_and, ICLASS_CJ" 1 100 100 vvvvv PP 1 uuuuu 000 100xx") //
DEF_ENC(V6_vgth_and, ICLASS_CJ" 1 100 100 vvvvv PP 1 uuuuu 000 101xx") //
DEF_ENC(V6_vgtw_and, ICLASS_CJ" 1 100 100 vvvvv PP 1 uuuuu 000 110xx") //
DEF_ENC(V6_vgtub_and, ICLASS_CJ" 1 100 100 vvvvv PP 1 uuuuu 001 000xx") //
DEF_ENC(V6_vgtuh_and, ICLASS_CJ" 1 100 100 vvvvv PP 1 uuuuu 001 001xx") //
DEF_ENC(V6_vgtuw_and, ICLASS_CJ" 1 100 100 vvvvv PP 1 uuuuu 001 010xx") //
DEF_ENC(V6_veqb_or, ICLASS_CJ" 1 100 100 vvvvv PP 1 uuuuu 010 000xx") //
DEF_ENC(V6_veqh_or, ICLASS_CJ" 1 100 100 vvvvv PP 1 uuuuu 010 001xx") //
DEF_ENC(V6_veqw_or, ICLASS_CJ" 1 100 100 vvvvv PP 1 uuuuu 010 010xx") //
DEF_ENC(V6_vgtb_or, ICLASS_CJ" 1 100 100 vvvvv PP 1 uuuuu 010 100xx") //
DEF_ENC(V6_vgth_or, ICLASS_CJ" 1 100 100 vvvvv PP 1 uuuuu 010 101xx") //
DEF_ENC(V6_vgtw_or, ICLASS_CJ" 1 100 100 vvvvv PP 1 uuuuu 010 110xx") //
DEF_ENC(V6_vgtub_or, ICLASS_CJ" 1 100 100 vvvvv PP 1 uuuuu 011 000xx") //
DEF_ENC(V6_vgtuh_or, ICLASS_CJ" 1 100 100 vvvvv PP 1 uuuuu 011 001xx") //
DEF_ENC(V6_vgtuw_or, ICLASS_CJ" 1 100 100 vvvvv PP 1 uuuuu 011 010xx") //
DEF_ENC(V6_veqb_xor, ICLASS_CJ" 1 100 100 vvvvv PP 1 uuuuu 100 000xx") //
DEF_ENC(V6_veqh_xor, ICLASS_CJ" 1 100 100 vvvvv PP 1 uuuuu 100 001xx") //
DEF_ENC(V6_veqw_xor, ICLASS_CJ" 1 100 100 vvvvv PP 1 uuuuu 100 010xx") //
DEF_ENC(V6_vgtb_xor, ICLASS_CJ" 1 100 100 vvvvv PP 1 uuuuu 100 100xx") //
DEF_ENC(V6_vgth_xor, ICLASS_CJ" 1 100 100 vvvvv PP 1 uuuuu 100 101xx") //
DEF_ENC(V6_vgtw_xor, ICLASS_CJ" 1 100 100 vvvvv PP 1 uuuuu 100 110xx") //
DEF_ENC(V6_vgtub_xor, ICLASS_CJ" 1 100 100 vvvvv PP 1 uuuuu 101 000xx") //
DEF_ENC(V6_vgtuh_xor, ICLASS_CJ" 1 100 100 vvvvv PP 1 uuuuu 101 001xx") //
DEF_ENC(V6_vgtuw_xor, ICLASS_CJ" 1 100 100 vvvvv PP 1 uuuuu 101 010xx") //
DEF_FIELDROW_DESC32( ICLASS_CJ" 1 100 101 ----- PP 1 ----- ----- ---","[#4] Qx4,Vd32=(Vu32, Vv32)")
DEF_ENC(V6_vaddcarry, ICLASS_CJ" 1 100 101 vvvvv PP 1 uuuuu 0xx ddddd") //
DEF_ENC(V6_vsubcarry, ICLASS_CJ" 1 100 101 vvvvv PP 1 uuuuu 1xx ddddd") //
DEF_FIELDROW_DESC32( ICLASS_CJ" 1 100 11- ----- PP 1 ----- ----- ---","[#4] Vx32|=(Vu32, Vv32,#)")
DEF_ENC(V6_vlutvvb_oracci, ICLASS_CJ" 1 100 110 vvvvv PP 1 uuuuu iii xxxxx") //
DEF_ENC(V6_vlutvwh_oracci, ICLASS_CJ" 1 100 111 vvvvv PP 1 uuuuu iii xxxxx") //
/***************************************************************
*
* Group #5, Reserved/Deprecated. Uses Q6 Rx. Stupid FFT.
*
****************************************************************/
/***************************************************************
*
* Group #6, No Q6 regs
*
****************************************************************/
DEF_FIELDROW_DESC32( ICLASS_CJ" 1 110 --0 ----- PP 0 ----- ----- ---","[#6] Vd32=Vu32")
DEF_ENC(V6_vabsh, ICLASS_CJ" 1 110 --0 ---00 PP 0 uuuuu 000 ddddd") //
DEF_ENC(V6_vabsh_sat, ICLASS_CJ" 1 110 --0 ---00 PP 0 uuuuu 001 ddddd") //
DEF_ENC(V6_vabsw, ICLASS_CJ" 1 110 --0 ---00 PP 0 uuuuu 010 ddddd") //
DEF_ENC(V6_vabsw_sat, ICLASS_CJ" 1 110 --0 ---00 PP 0 uuuuu 011 ddddd") //
DEF_ENC(V6_vnot, ICLASS_CJ" 1 110 --0 ---00 PP 0 uuuuu 100 ddddd") //
DEF_ENC(V6_vdealh, ICLASS_CJ" 1 110 --0 ---00 PP 0 uuuuu 110 ddddd") //
DEF_ENC(V6_vdealb, ICLASS_CJ" 1 110 --0 ---00 PP 0 uuuuu 111 ddddd") //
DEF_ENC(V6_vunpackub, ICLASS_CJ" 1 110 --0 ---01 PP 0 uuuuu 000 ddddd") //
DEF_ENC(V6_vunpackuh, ICLASS_CJ" 1 110 --0 ---01 PP 0 uuuuu 001 ddddd") //
DEF_ENC(V6_vunpackb, ICLASS_CJ" 1 110 --0 ---01 PP 0 uuuuu 010 ddddd") //
DEF_ENC(V6_vunpackh, ICLASS_CJ" 1 110 --0 ---01 PP 0 uuuuu 011 ddddd") //
DEF_ENC(V6_vabsb, ICLASS_CJ" 1 110 --0 ---01 PP 0 uuuuu 100 ddddd") //
DEF_ENC(V6_vabsb_sat, ICLASS_CJ" 1 110 --0 ---01 PP 0 uuuuu 101 ddddd") //
DEF_ENC(V6_vshuffh, ICLASS_CJ" 1 110 --0 ---01 PP 0 uuuuu 111 ddddd") //
DEF_ENC(V6_vshuffb, ICLASS_CJ" 1 110 --0 ---10 PP 0 uuuuu 000 ddddd") //
DEF_ENC(V6_vzb, ICLASS_CJ" 1 110 --0 ---10 PP 0 uuuuu 001 ddddd") //
DEF_ENC(V6_vzh, ICLASS_CJ" 1 110 --0 ---10 PP 0 uuuuu 010 ddddd") //
DEF_ENC(V6_vsb, ICLASS_CJ" 1 110 --0 ---10 PP 0 uuuuu 011 ddddd") //
DEF_ENC(V6_vsh, ICLASS_CJ" 1 110 --0 ---10 PP 0 uuuuu 100 ddddd") //
DEF_ENC(V6_vcl0w, ICLASS_CJ" 1 110 --0 ---10 PP 0 uuuuu 101 ddddd") //
DEF_ENC(V6_vpopcounth, ICLASS_CJ" 1 110 --0 ---10 PP 0 uuuuu 110 ddddd") //
DEF_ENC(V6_vcl0h, ICLASS_CJ" 1 110 --0 ---10 PP 0 uuuuu 111 ddddd") //
DEF_FIELDROW_DESC32( ICLASS_CJ" 1 110 --0 ---11 PP 0 ----- ----- ---","[#6] Qd4=Qt4, Qs4")
DEF_ENC(V6_pred_and, ICLASS_CJ" 1 110 tt0 ---11 PP 0 ---ss 000 000dd") //
DEF_ENC(V6_pred_or, ICLASS_CJ" 1 110 tt0 ---11 PP 0 ---ss 000 001dd") //
DEF_ENC(V6_pred_not, ICLASS_CJ" 1 110 --0 ---11 PP 0 ---ss 000 010dd") //
DEF_ENC(V6_pred_xor, ICLASS_CJ" 1 110 tt0 ---11 PP 0 ---ss 000 011dd") //
DEF_ENC(V6_pred_or_n, ICLASS_CJ" 1 110 tt0 ---11 PP 0 ---ss 000 100dd") //
DEF_ENC(V6_pred_and_n, ICLASS_CJ" 1 110 tt0 ---11 PP 0 ---ss 000 101dd") //
DEF_ENC(V6_shuffeqh, ICLASS_CJ" 1 110 tt0 ---11 PP 0 ---ss 000 110dd") //
DEF_ENC(V6_shuffeqw, ICLASS_CJ" 1 110 tt0 ---11 PP 0 ---ss 000 111dd") //
DEF_ENC(V6_vnormamtw, ICLASS_CJ" 1 110 --0 ---11 PP 0 uuuuu 100 ddddd") //
DEF_ENC(V6_vnormamth, ICLASS_CJ" 1 110 --0 ---11 PP 0 uuuuu 101 ddddd") //
DEF_FIELDROW_DESC32( ICLASS_CJ" 1 110 --1 ----- PP 0 ----- ----- ---","[#6] Vd32=Vu32,Vv32")
DEF_ENC(V6_vlutvvbi, ICLASS_CJ" 1 110 001 vvvvv PP 0 uuuuu iii ddddd")
DEF_ENC(V6_vlutvwhi, ICLASS_CJ" 1 110 011 vvvvv PP 0 uuuuu iii ddddd")
DEF_ENC(V6_vaddbsat_dv, ICLASS_CJ" 1 110 101 vvvvv PP 0 uuuuu 000 ddddd")
DEF_ENC(V6_vsubbsat_dv, ICLASS_CJ" 1 110 101 vvvvv PP 0 uuuuu 001 ddddd")
DEF_ENC(V6_vadduwsat_dv, ICLASS_CJ" 1 110 101 vvvvv PP 0 uuuuu 010 ddddd")
DEF_ENC(V6_vsubuwsat_dv, ICLASS_CJ" 1 110 101 vvvvv PP 0 uuuuu 011 ddddd")
DEF_ENC(V6_vaddububb_sat, ICLASS_CJ" 1 110 101 vvvvv PP 0 uuuuu 100 ddddd")
DEF_ENC(V6_vsubububb_sat, ICLASS_CJ" 1 110 101 vvvvv PP 0 uuuuu 101 ddddd")
DEF_ENC(V6_vmpyewuh_64, ICLASS_CJ" 1 110 101 vvvvv PP 0 uuuuu 110 ddddd")
DEF_FIELDROW_DESC32( ICLASS_CJ" 1 110 --0 ----- PP 1 ----- ----- ---","Vx32=Vu32")
DEF_ENC(V6_vunpackob, ICLASS_CJ" 1 110 --0 ---00 PP 1 uuuuu 000 xxxxx") //
DEF_ENC(V6_vunpackoh, ICLASS_CJ" 1 110 --0 ---00 PP 1 uuuuu 001 xxxxx") //
//DEF_ENC(V6_vunpackow, ICLASS_CJ" 1 110 --0 ---00 PP 1 uuuuu 010 xxxxx") //
DEF_ENC(V6_vhist, ICLASS_CJ" 1 110 --0 ---00 PP 1 -000- 100 -----")
DEF_ENC(V6_vwhist256, ICLASS_CJ" 1 110 --0 ---00 PP 1 -0010 100 -----")
DEF_ENC(V6_vwhist256_sat, ICLASS_CJ" 1 110 --0 ---00 PP 1 -0011 100 -----")
DEF_ENC(V6_vwhist128, ICLASS_CJ" 1 110 --0 ---00 PP 1 -010- 100 -----")
DEF_ENC(V6_vwhist128m, ICLASS_CJ" 1 110 --0 ---00 PP 1 -011i 100 -----")
DEF_FIELDROW_DESC32( ICLASS_CJ" 1 110 --0 ----- PP 1 ----- ----- ---","if (Qv4) Vx32=Vu32")
DEF_ENC(V6_vaddbq, ICLASS_CJ" 1 110 vv0 ---01 PP 1 uuuuu 000 xxxxx") //
DEF_ENC(V6_vaddhq, ICLASS_CJ" 1 110 vv0 ---01 PP 1 uuuuu 001 xxxxx") //
DEF_ENC(V6_vaddwq, ICLASS_CJ" 1 110 vv0 ---01 PP 1 uuuuu 010 xxxxx") //
DEF_ENC(V6_vaddbnq, ICLASS_CJ" 1 110 vv0 ---01 PP 1 uuuuu 011 xxxxx") //
DEF_ENC(V6_vaddhnq, ICLASS_CJ" 1 110 vv0 ---01 PP 1 uuuuu 100 xxxxx") //
DEF_ENC(V6_vaddwnq, ICLASS_CJ" 1 110 vv0 ---01 PP 1 uuuuu 101 xxxxx") //
DEF_ENC(V6_vsubbq, ICLASS_CJ" 1 110 vv0 ---01 PP 1 uuuuu 110 xxxxx") //
DEF_ENC(V6_vsubhq, ICLASS_CJ" 1 110 vv0 ---01 PP 1 uuuuu 111 xxxxx") //
DEF_ENC(V6_vsubwq, ICLASS_CJ" 1 110 vv0 ---10 PP 1 uuuuu 000 xxxxx") //
DEF_ENC(V6_vsubbnq, ICLASS_CJ" 1 110 vv0 ---10 PP 1 uuuuu 001 xxxxx") //
DEF_ENC(V6_vsubhnq, ICLASS_CJ" 1 110 vv0 ---10 PP 1 uuuuu 010 xxxxx") //
DEF_ENC(V6_vsubwnq, ICLASS_CJ" 1 110 vv0 ---10 PP 1 uuuuu 011 xxxxx") //
DEF_ENC(V6_vhistq, ICLASS_CJ" 1 110 vv0 ---10 PP 1 --00- 100 -----")
DEF_ENC(V6_vwhist256q, ICLASS_CJ" 1 110 vv0 ---10 PP 1 --010 100 -----")
DEF_ENC(V6_vwhist256q_sat, ICLASS_CJ" 1 110 vv0 ---10 PP 1 --011 100 -----")
DEF_ENC(V6_vwhist128q, ICLASS_CJ" 1 110 vv0 ---10 PP 1 --10- 100 -----")
DEF_ENC(V6_vwhist128qm, ICLASS_CJ" 1 110 vv0 ---10 PP 1 --11i 100 -----")
DEF_ENC(V6_vandvqv, ICLASS_CJ" 1 110 vv0 ---11 PP 1 uuuuu 000 ddddd")
DEF_ENC(V6_vandvnqv, ICLASS_CJ" 1 110 vv0 ---11 PP 1 uuuuu 001 ddddd")
DEF_ENC(V6_vprefixqb, ICLASS_CJ" 1 110 vv0 ---11 PP 1 --000 010 ddddd") //
DEF_ENC(V6_vprefixqh, ICLASS_CJ" 1 110 vv0 ---11 PP 1 --001 010 ddddd") //
DEF_ENC(V6_vprefixqw, ICLASS_CJ" 1 110 vv0 ---11 PP 1 --010 010 ddddd") //
DEF_ENC(V6_vassign, ICLASS_CJ" 1 110 --0 ---11 PP 1 uuuuu 111 ddddd")
DEF_ENC(V6_valignbi, ICLASS_CJ" 1 110 001 vvvvv PP 1 uuuuu iii ddddd")
DEF_ENC(V6_vlalignbi, ICLASS_CJ" 1 110 011 vvvvv PP 1 uuuuu iii ddddd")
DEF_ENC(V6_vswap, ICLASS_CJ" 1 110 101 vvvvv PP 1 uuuuu -tt ddddd") //
DEF_ENC(V6_vmux, ICLASS_CJ" 1 110 111 vvvvv PP 1 uuuuu -tt ddddd") //
/***************************************************************
*
* Group #7, No Q6 regs
*
****************************************************************/
DEF_FIELDROW_DESC32( ICLASS_CJ" 1 111 --- ----- PP 0 ----- ----- ---","[#7] Vd32=(Vu32, Vv32)")
DEF_ENC(V6_vaddbsat, ICLASS_CJ" 1 111 000 vvvvv PP 0 uuuuu 000 ddddd") //
DEF_ENC(V6_vminub, ICLASS_CJ" 1 111 000 vvvvv PP 0 uuuuu 001 ddddd") //
DEF_ENC(V6_vminuh, ICLASS_CJ" 1 111 000 vvvvv PP 0 uuuuu 010 ddddd") //
DEF_ENC(V6_vminh, ICLASS_CJ" 1 111 000 vvvvv PP 0 uuuuu 011 ddddd") //
DEF_ENC(V6_vminw, ICLASS_CJ" 1 111 000 vvvvv PP 0 uuuuu 100 ddddd") //
DEF_ENC(V6_vmaxub, ICLASS_CJ" 1 111 000 vvvvv PP 0 uuuuu 101 ddddd") //
DEF_ENC(V6_vmaxuh, ICLASS_CJ" 1 111 000 vvvvv PP 0 uuuuu 110 ddddd") //
DEF_ENC(V6_vmaxh, ICLASS_CJ" 1 111 000 vvvvv PP 0 uuuuu 111 ddddd") //
DEF_ENC(V6_vaddclbh, ICLASS_CJ" 1 111 000 vvvvv PP 1 uuuuu 000 ddddd") //
DEF_ENC(V6_vaddclbw, ICLASS_CJ" 1 111 000 vvvvv PP 1 uuuuu 001 ddddd") //
DEF_ENC(V6_vavguw, ICLASS_CJ" 1 111 000 vvvvv PP 1 uuuuu 010 ddddd") //
DEF_ENC(V6_vavguwrnd, ICLASS_CJ" 1 111 000 vvvvv PP 1 uuuuu 011 ddddd") //
DEF_ENC(V6_vavgb, ICLASS_CJ" 1 111 000 vvvvv PP 1 uuuuu 100 ddddd") //
DEF_ENC(V6_vavgbrnd, ICLASS_CJ" 1 111 000 vvvvv PP 1 uuuuu 101 ddddd") //
DEF_ENC(V6_vnavgb, ICLASS_CJ" 1 111 000 vvvvv PP 1 uuuuu 110 ddddd") //
DEF_ENC(V6_vmaxw, ICLASS_CJ" 1 111 001 vvvvv PP 0 uuuuu 000 ddddd") //
DEF_ENC(V6_vdelta, ICLASS_CJ" 1 111 001 vvvvv PP 0 uuuuu 001 ddddd") //
DEF_ENC(V6_vsubbsat, ICLASS_CJ" 1 111 001 vvvvv PP 0 uuuuu 010 ddddd") //
DEF_ENC(V6_vrdelta, ICLASS_CJ" 1 111 001 vvvvv PP 0 uuuuu 011 ddddd") //
DEF_ENC(V6_vminb, ICLASS_CJ" 1 111 001 vvvvv PP 0 uuuuu 100 ddddd") //
DEF_ENC(V6_vmaxb, ICLASS_CJ" 1 111 001 vvvvv PP 0 uuuuu 101 ddddd") //
DEF_ENC(V6_vsatuwuh, ICLASS_CJ" 1 111 001 vvvvv PP 0 uuuuu 110 ddddd") //
DEF_ENC(V6_vdealb4w, ICLASS_CJ" 1 111 001 vvvvv PP 0 uuuuu 111 ddddd") //
DEF_ENC(V6_vmpyowh_rnd, ICLASS_CJ" 1 111 010 vvvvv PP 0 uuuuu 000 ddddd") //
DEF_ENC(V6_vshuffeb, ICLASS_CJ" 1 111 010 vvvvv PP 0 uuuuu 001 ddddd") //
DEF_ENC(V6_vshuffob, ICLASS_CJ" 1 111 010 vvvvv PP 0 uuuuu 010 ddddd") //
DEF_ENC(V6_vshufeh, ICLASS_CJ" 1 111 010 vvvvv PP 0 uuuuu 011 ddddd") //
DEF_ENC(V6_vshufoh, ICLASS_CJ" 1 111 010 vvvvv PP 0 uuuuu 100 ddddd") //
DEF_ENC(V6_vshufoeh, ICLASS_CJ" 1 111 010 vvvvv PP 0 uuuuu 101 ddddd") //
DEF_ENC(V6_vshufoeb, ICLASS_CJ" 1 111 010 vvvvv PP 0 uuuuu 110 ddddd") //
DEF_ENC(V6_vcombine, ICLASS_CJ" 1 111 010 vvvvv PP 0 uuuuu 111 ddddd") //
DEF_ENC(V6_vmpyieoh, ICLASS_CJ" 1 111 011 vvvvv PP 0 uuuuu 000 ddddd") //
DEF_ENC(V6_vadduwsat, ICLASS_CJ" 1 111 011 vvvvv PP 0 uuuuu 001 ddddd") //
DEF_ENC(V6_vsathub, ICLASS_CJ" 1 111 011 vvvvv PP 0 uuuuu 010 ddddd") //
DEF_ENC(V6_vsatwh, ICLASS_CJ" 1 111 011 vvvvv PP 0 uuuuu 011 ddddd") //
DEF_ENC(V6_vroundwh, ICLASS_CJ" 1 111 011 vvvvv PP 0 uuuuu 100 ddddd")
DEF_ENC(V6_vroundwuh, ICLASS_CJ" 1 111 011 vvvvv PP 0 uuuuu 101 ddddd")
DEF_ENC(V6_vroundhb, ICLASS_CJ" 1 111 011 vvvvv PP 0 uuuuu 110 ddddd")
DEF_ENC(V6_vroundhub, ICLASS_CJ" 1 111 011 vvvvv PP 0 uuuuu 111 ddddd")
DEF_FIELDROW_DESC32( ICLASS_CJ" 1 111 100 ----- PP - ----- ----- ---","[#7] Qd4=(Vu32, Vv32)")
DEF_ENC(V6_veqb, ICLASS_CJ" 1 111 100 vvvvv PP 0 uuuuu 000 000dd") //
DEF_ENC(V6_veqh, ICLASS_CJ" 1 111 100 vvvvv PP 0 uuuuu 000 001dd") //
DEF_ENC(V6_veqw, ICLASS_CJ" 1 111 100 vvvvv PP 0 uuuuu 000 010dd") //
DEF_ENC(V6_vgtb, ICLASS_CJ" 1 111 100 vvvvv PP 0 uuuuu 000 100dd") //
DEF_ENC(V6_vgth, ICLASS_CJ" 1 111 100 vvvvv PP 0 uuuuu 000 101dd") //
DEF_ENC(V6_vgtw, ICLASS_CJ" 1 111 100 vvvvv PP 0 uuuuu 000 110dd") //
DEF_ENC(V6_vgtub, ICLASS_CJ" 1 111 100 vvvvv PP 0 uuuuu 001 000dd") //
DEF_ENC(V6_vgtuh, ICLASS_CJ" 1 111 100 vvvvv PP 0 uuuuu 001 001dd") //
DEF_ENC(V6_vgtuw, ICLASS_CJ" 1 111 100 vvvvv PP 0 uuuuu 001 010dd") //
DEF_ENC(V6_vasrwv, ICLASS_CJ" 1 111 101 vvvvv PP 0 uuuuu 000 ddddd") //
DEF_ENC(V6_vlsrwv, ICLASS_CJ" 1 111 101 vvvvv PP 0 uuuuu 001 ddddd") //
DEF_ENC(V6_vlsrhv, ICLASS_CJ" 1 111 101 vvvvv PP 0 uuuuu 010 ddddd") //
DEF_ENC(V6_vasrhv, ICLASS_CJ" 1 111 101 vvvvv PP 0 uuuuu 011 ddddd") //
DEF_ENC(V6_vaslwv, ICLASS_CJ" 1 111 101 vvvvv PP 0 uuuuu 100 ddddd") //
DEF_ENC(V6_vaslhv, ICLASS_CJ" 1 111 101 vvvvv PP 0 uuuuu 101 ddddd") //
DEF_ENC(V6_vaddb, ICLASS_CJ" 1 111 101 vvvvv PP 0 uuuuu 110 ddddd") //
DEF_ENC(V6_vaddh, ICLASS_CJ" 1 111 101 vvvvv PP 0 uuuuu 111 ddddd") //
DEF_ENC(V6_vmpyiewuh, ICLASS_CJ" 1 111 110 vvvvv PP 0 uuuuu 000 ddddd")
DEF_ENC(V6_vmpyiowh, ICLASS_CJ" 1 111 110 vvvvv PP 0 uuuuu 001 ddddd")
DEF_ENC(V6_vpackeb, ICLASS_CJ" 1 111 110 vvvvv PP 0 uuuuu 010 ddddd") //
DEF_ENC(V6_vpackeh, ICLASS_CJ" 1 111 110 vvvvv PP 0 uuuuu 011 ddddd") //
DEF_ENC(V6_vsubuwsat, ICLASS_CJ" 1 111 110 vvvvv PP 0 uuuuu 100 ddddd") //
DEF_ENC(V6_vpackhub_sat,ICLASS_CJ" 1 111 110 vvvvv PP 0 uuuuu 101 ddddd") //
DEF_ENC(V6_vpackhb_sat, ICLASS_CJ" 1 111 110 vvvvv PP 0 uuuuu 110 ddddd") //
DEF_ENC(V6_vpackwuh_sat,ICLASS_CJ" 1 111 110 vvvvv PP 0 uuuuu 111 ddddd") //
DEF_ENC(V6_vpackwh_sat, ICLASS_CJ" 1 111 111 vvvvv PP 0 uuuuu 000 ddddd") //
DEF_ENC(V6_vpackob, ICLASS_CJ" 1 111 111 vvvvv PP 0 uuuuu 001 ddddd") //
DEF_ENC(V6_vpackoh, ICLASS_CJ" 1 111 111 vvvvv PP 0 uuuuu 010 ddddd") //
DEF_ENC(V6_vrounduhub, ICLASS_CJ" 1 111 111 vvvvv PP 0 uuuuu 011 ddddd") //
DEF_ENC(V6_vrounduwuh, ICLASS_CJ" 1 111 111 vvvvv PP 0 uuuuu 100 ddddd") //
DEF_ENC(V6_vmpyewuh, ICLASS_CJ" 1 111 111 vvvvv PP 0 uuuuu 101 ddddd")
DEF_ENC(V6_vmpyowh, ICLASS_CJ" 1 111 111 vvvvv PP 0 uuuuu 111 ddddd")
#endif /* NO MMVEC */
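The DEF_ENC strings above describe 32-bit instruction encodings, MSB first: `0`/`1` are fixed opcode bits, letters name operand fields (`u`, `v`, `d`, `i`, …), `-` is a don't-care, and spaces are formatting only. A minimal sketch of how such a pattern can be matched against an instruction word — a hypothetical standalone helper, not part of the QEMU tree; the 4-bit `"1110"` prefix in the test stands in for the `ICLASS_CJ` macro, whose actual value is defined elsewhere:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative only: check a 32-bit word against a DEF_ENC-style
 * pattern.  '0'/'1' must match exactly; letters (operand fields) and
 * '-' (don't-care) match any bit; spaces consume no bits.
 */
static int enc_matches(const char *pattern, uint32_t word)
{
    int bit = 31;
    for (const char *p = pattern; *p != '\0'; p++) {
        if (*p == ' ') {
            continue;               /* spacing only, no bit consumed */
        }
        if (bit < 0) {
            return 0;               /* pattern longer than 32 bits */
        }
        uint32_t b = (word >> bit) & 1;
        if ((*p == '0' && b != 0) || (*p == '1' && b != 1)) {
            return 0;               /* fixed opcode bit mismatch */
        }
        bit--;
    }
    return bit == -1;               /* exactly 32 bits consumed */
}
```

Decoders built this way check only the fixed bits; the operand field positions are then extracted separately from the letter runs.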

File diff suppressed because it is too large


@@ -0,0 +1,842 @@
/*
* Copyright(c) 2019-2021 Qualcomm Innovation Center, Inc. All Rights Reserved.
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, see <http://www.gnu.org/licenses/>.
*/
DEF_MACRO(fDUMPQ,
do {
printf(STR ":" #REG ": 0x%016llx\n",REG.ud[0]);
} while (0),
()
)
DEF_MACRO(fUSE_LOOKUP_ADDRESS_BY_REV,
PROC->arch_proc_options->mmvec_use_full_va_for_lookup,
()
)
DEF_MACRO(fUSE_LOOKUP_ADDRESS,
1,
()
)
DEF_MACRO(fNOTQ,
({mmqreg_t _ret = {0}; int _i_; for (_i_ = 0; _i_ < fVECSIZE()/64; _i_++) _ret.ud[_i_] = ~VAL.ud[_i_]; _ret;}),
()
)
DEF_MACRO(fGETQBITS,
((MASK) & (REG.w[(BITNO)>>5] >> ((BITNO) & 0x1f))),
()
)
DEF_MACRO(fGETQBIT,
fGETQBITS(REG,1,1,BITNO),
()
)
DEF_MACRO(fGENMASKW,
(((fGETQBIT(QREG,(IDX*4+0)) ? 0xFF : 0x0) << 0)
|((fGETQBIT(QREG,(IDX*4+1)) ? 0xFF : 0x0) << 8)
|((fGETQBIT(QREG,(IDX*4+2)) ? 0xFF : 0x0) << 16)
|((fGETQBIT(QREG,(IDX*4+3)) ? 0xFF : 0x0) << 24)),
()
)
DEF_MACRO(fGET10BIT,
{
COE = (((((fGETUBYTE(3,VAL) >> (2 * POS)) & 3) << 8) | fGETUBYTE(POS,VAL)) << 6);
COE >>= 6;
},
()
)
DEF_MACRO(fVMAX,
(X>Y) ? X : Y,
()
)
DEF_MACRO(fGETNIBBLE,
( fSXTN(4,8,(SRC >> (4*IDX)) & 0xF) ),
()
)
DEF_MACRO(fGETCRUMB,
( fSXTN(2,8,(SRC >> (2*IDX)) & 0x3) ),
()
)
DEF_MACRO(fGETCRUMB_SYMMETRIC,
( (fGETCRUMB(IDX,SRC)>=0 ? (2-fGETCRUMB(IDX,SRC)) : fGETCRUMB(IDX,SRC) ) ),
()
)
#define ZERO_OFFSET_2B +
DEF_MACRO(fGENMASKH,
(((fGETQBIT(QREG,(IDX*2+0)) ? 0xFF : 0x0) << 0)
|((fGETQBIT(QREG,(IDX*2+1)) ? 0xFF : 0x0) << 8)),
()
)
DEF_MACRO(fGETMASKW,
(VREG.w[IDX] & fGENMASKW((QREG),IDX)),
()
)
DEF_MACRO(fGETMASKH,
(VREG.h[IDX] & fGENMASKH((QREG),IDX)),
()
)
DEF_MACRO(fCONDMASK8,
(fGETQBIT(QREG,IDX) ? (YESVAL) : (NOVAL)),
()
)
DEF_MACRO(fCONDMASK16,
((fGENMASKH(QREG,IDX) & (YESVAL)) | (fGENMASKH(fNOTQ(QREG),IDX) & (NOVAL))),
()
)
DEF_MACRO(fCONDMASK32,
((fGENMASKW(QREG,IDX) & (YESVAL)) | (fGENMASKW(fNOTQ(QREG),IDX) & (NOVAL))),
()
)
DEF_MACRO(fSETQBITS,
do {
size4u_t __TMP = (VAL);
REG.w[(BITNO)>>5] &= ~((MASK) << ((BITNO) & 0x1f));
REG.w[(BITNO)>>5] |= (((__TMP) & (MASK)) << ((BITNO) & 0x1f));
} while (0),
()
)
DEF_MACRO(fSETQBIT,
fSETQBITS(REG,1,1,BITNO,VAL),
()
)
DEF_MACRO(fVBYTES,
(fVECSIZE()),
()
)
DEF_MACRO(fVHALVES,
(fVECSIZE()/2),
()
)
DEF_MACRO(fVWORDS,
(fVECSIZE()/4),
()
)
DEF_MACRO(fVDWORDS,
(fVECSIZE()/8),
()
)
DEF_MACRO(fVALIGN,
( ADDR = ADDR & ~(LOG2_ALIGNMENT-1)),
()
)
DEF_MACRO(fVLASTBYTE,
( ADDR = ADDR | (LOG2_ALIGNMENT-1)),
()
)
DEF_MACRO(fVELEM,
((fVECSIZE()*8)/WIDTH),
()
)
DEF_MACRO(fVECLOGSIZE,
(mmvec_current_veclogsize(thread)),
()
)
DEF_MACRO(fVECSIZE,
(1<<fVECLOGSIZE()),
()
)
DEF_MACRO(fSWAPB,
{
size1u_t tmp = A;
A = B;
B = tmp;
},
/* NOTHING */
)
DEF_MACRO(
fVZERO,
mmvec_zero_vector(),
()
)
DEF_MACRO(
fNEWVREG,
((THREAD2STRUCT->VRegs_updated & (((VRegMask)1)<<VNUM)) ? THREAD2STRUCT->future_VRegs[VNUM] : mmvec_zero_vector()),
(A_DOTNEWVALUE,A_RESTRICT_SLOT0ONLY)
)
DEF_MACRO(
fV_AL_CHECK,
if ((EA) & (MASK)) {
warn("aligning misaligned vector. PC=%08x EA=%08x",thread->Regs[REG_PC],(EA));
},
()
)
DEF_MACRO(fSCATTER_INIT,
{
mem_vector_scatter_init(thread, insn, REGION_START, LENGTH, ELEMENT_SIZE);
if (EXCEPTION_DETECTED) return;
},
(A_STORE,A_MEMLIKE,A_RESTRICT_SLOT0ONLY)
)
DEF_MACRO(fGATHER_INIT,
{
mem_vector_gather_init(thread, insn, REGION_START, LENGTH, ELEMENT_SIZE);
if (EXCEPTION_DETECTED) return;
},
(A_LOAD,A_MEMLIKE,A_RESTRICT_SLOT1ONLY)
)
DEF_MACRO(fSCATTER_FINISH,
{
if (EXCEPTION_DETECTED) return;
mem_vector_scatter_finish(thread, insn, OP);
},
()
)
DEF_MACRO(fGATHER_FINISH,
{
if (EXCEPTION_DETECTED) return;
mem_vector_gather_finish(thread, insn);
},
()
)
DEF_MACRO(CHECK_VTCM_PAGE,
{
int slot = insn->slot;
paddr_t pa = thread->mem_access[slot].paddr+OFFSET;
pa = pa & ~(ALIGNMENT-1);
FLAG = (pa < (thread->mem_access[slot].paddr+LENGTH));
},
()
)
DEF_MACRO(COUNT_OUT_OF_BOUNDS,
{
if (!FLAG)
{
THREAD2STRUCT->vtcm_log.oob_access += SIZE;
warn("Scatter/Gather out of bounds of region");
}
},
()
)
DEF_MACRO(fLOG_SCATTER_OP,
{
// Log the size and indicate that the extension ext.c file needs to increment right before memory write
THREAD2STRUCT->vtcm_log.op = 1;
THREAD2STRUCT->vtcm_log.op_size = SIZE;
},
()
)
DEF_MACRO(fVLOG_VTCM_WORD_INCREMENT,
{
int slot = insn->slot;
int log_bank = 0;
int log_byte =0;
paddr_t pa = thread->mem_access[slot].paddr+(OFFSET & ~(ALIGNMENT-1));
paddr_t pa_high = thread->mem_access[slot].paddr+LEN;
for(int i0 = 0; i0 < 4; i0++)
{
log_byte = ((OFFSET>=0)&&((pa+i0)<=pa_high));
log_bank |= (log_byte<<i0);
LOG_VTCM_BYTE(pa+i0,log_byte,INC.ub[4*IDX+i0],4*IDX+i0);
}
{ LOG_VTCM_BANK(pa, log_bank, IDX); }
},
()
)
DEF_MACRO(fVLOG_VTCM_HALFWORD_INCREMENT,
{
int slot = insn->slot;
int log_bank = 0;
int log_byte = 0;
paddr_t pa = thread->mem_access[slot].paddr+(OFFSET & ~(ALIGNMENT-1));
paddr_t pa_high = thread->mem_access[slot].paddr+LEN;
for(int i0 = 0; i0 < 2; i0++) {
log_byte = ((OFFSET>=0)&&((pa+i0)<=pa_high));
log_bank |= (log_byte<<i0);
LOG_VTCM_BYTE(pa+i0,log_byte,INC.ub[2*IDX+i0],2*IDX+i0);
}
{ LOG_VTCM_BANK(pa, log_bank,IDX); }
},
()
)
DEF_MACRO(fVLOG_VTCM_HALFWORD_INCREMENT_DV,
{
int slot = insn->slot;
int log_bank = 0;
int log_byte = 0;
paddr_t pa = thread->mem_access[slot].paddr+(OFFSET & ~(ALIGNMENT-1));
paddr_t pa_high = thread->mem_access[slot].paddr+LEN;
for(int i0 = 0; i0 < 2; i0++) {
log_byte = ((OFFSET>=0)&&((pa+i0)<=pa_high));
log_bank |= (log_byte<<i0);
LOG_VTCM_BYTE(pa+i0,log_byte,INC.ub[2*IDX+i0],2*IDX+i0);
}
{ LOG_VTCM_BANK(pa, log_bank,(2*IDX2+IDX_H));}
},
()
)
DEF_MACRO(GATHER_FUNCTION,
{
int slot = insn->slot;
int i0;
paddr_t pa = thread->mem_access[slot].paddr+OFFSET;
paddr_t pa_high = thread->mem_access[slot].paddr+LEN;
int log_bank = 0;
int log_byte = 0;
for(i0 = 0; i0 < ELEMENT_SIZE; i0++)
{
log_byte = ((OFFSET>=0)&&((pa+i0)<=pa_high)) && QVAL;
log_bank |= (log_byte<<i0);
size1u_t B = sim_mem_read1(thread->system_ptr, thread->threadId, thread->mem_access[slot].paddr+OFFSET+i0);
THREAD2STRUCT->tmp_VRegs[0].ub[ELEMENT_SIZE*IDX+i0] = B;
LOG_VTCM_BYTE(pa+i0,log_byte,B,ELEMENT_SIZE*IDX+i0);
}
LOG_VTCM_BANK(pa, log_bank,BANK_IDX);
},
()
)
DEF_MACRO(fVLOG_VTCM_GATHER_WORD,
{
GATHER_FUNCTION(EA,OFFSET,IDX, LEN, 4, IDX, 1);
},
()
)
DEF_MACRO(fVLOG_VTCM_GATHER_HALFWORD,
{
GATHER_FUNCTION(EA,OFFSET,IDX, LEN, 2, IDX, 1);
},
()
)
DEF_MACRO(fVLOG_VTCM_GATHER_HALFWORD_DV,
{
GATHER_FUNCTION(EA,OFFSET,IDX, LEN, 2, (2*IDX2+IDX_H), 1);
},
()
)
DEF_MACRO(fVLOG_VTCM_GATHER_WORDQ,
{
GATHER_FUNCTION(EA,OFFSET,IDX, LEN, 4, IDX, fGETQBIT(QsV,4*IDX+i0));
},
()
)
DEF_MACRO(fVLOG_VTCM_GATHER_HALFWORDQ,
{
GATHER_FUNCTION(EA,OFFSET,IDX, LEN, 2, IDX, fGETQBIT(QsV,2*IDX+i0));
},
()
)
DEF_MACRO(fVLOG_VTCM_GATHER_HALFWORDQ_DV,
{
GATHER_FUNCTION(EA,OFFSET,IDX, LEN, 2, (2*IDX2+IDX_H), fGETQBIT(QsV,2*IDX+i0));
},
()
)
DEF_MACRO(DEBUG_LOG_ADDR,
{
if (thread->processor_ptr->arch_proc_options->mmvec_network_addr_log2)
{
int slot = insn->slot;
paddr_t pa = thread->mem_access[slot].paddr+OFFSET;
}
},
()
)
DEF_MACRO(SCATTER_OP_WRITE_TO_MEM,
{
for (int i = 0; i < mmvecx->vtcm_log.size; i+=sizeof(TYPE))
{
if ( mmvecx->vtcm_log.mask.ub[i] != 0) {
TYPE dst = 0;
TYPE inc = 0;
for(int j = 0; j < sizeof(TYPE); j++) {
dst |= (sim_mem_read1(thread->system_ptr, thread->threadId, mmvecx->vtcm_log.pa[i+j]) << (8*j));
inc |= mmvecx->vtcm_log.data.ub[j+i] << (8*j);
mmvecx->vtcm_log.mask.ub[j+i] = 0;
mmvecx->vtcm_log.data.ub[j+i] = 0;
mmvecx->vtcm_log.offsets.ub[j+i] = 0;
}
dst += inc;
for(int j = 0; j < sizeof(TYPE); j++) {
sim_mem_write1(thread->system_ptr,thread->threadId, mmvecx->vtcm_log.pa[i+j], (dst >> (8*j))& 0xFF );
}
}
}
},
()
)
DEF_MACRO(SCATTER_FUNCTION,
{
int slot = insn->slot;
int i0;
paddr_t pa = thread->mem_access[slot].paddr+OFFSET;
paddr_t pa_high = thread->mem_access[slot].paddr+LEN;
int log_bank = 0;
int log_byte = 0;
for(i0 = 0; i0 < ELEMENT_SIZE; i0++) {
log_byte = ((OFFSET>=0)&&((pa+i0)<=pa_high)) && QVAL;
log_bank |= (log_byte<<i0);
LOG_VTCM_BYTE(pa+i0,log_byte,IN.ub[ELEMENT_SIZE*IDX+i0],ELEMENT_SIZE*IDX+i0);
}
LOG_VTCM_BANK(pa, log_bank,BANK_IDX);
},
()
)
DEF_MACRO(fVLOG_VTCM_HALFWORD,
{
SCATTER_FUNCTION (EA,OFFSET,IDX, LEN, 2, IDX, 1, IN);
},
()
)
DEF_MACRO(fVLOG_VTCM_WORD,
{
SCATTER_FUNCTION (EA,OFFSET,IDX, LEN, 4, IDX, 1, IN);
},
()
)
DEF_MACRO(fVLOG_VTCM_HALFWORDQ,
{
SCATTER_FUNCTION (EA,OFFSET,IDX, LEN, 2, IDX, fGETQBIT(QsV,2*IDX+i0), IN);
},
()
)
DEF_MACRO(fVLOG_VTCM_WORDQ,
{
SCATTER_FUNCTION (EA,OFFSET,IDX, LEN, 4, IDX, fGETQBIT(QsV,4*IDX+i0), IN);
},
()
)
DEF_MACRO(fVLOG_VTCM_HALFWORD_DV,
{
SCATTER_FUNCTION (EA,OFFSET,IDX, LEN, 2, (2*IDX2+IDX_H), 1, IN);
},
()
)
DEF_MACRO(fVLOG_VTCM_HALFWORDQ_DV,
{
SCATTER_FUNCTION (EA,OFFSET,IDX, LEN, 2, (2*IDX2+IDX_H), fGETQBIT(QsV,2*IDX+i0), IN);
},
()
)
DEF_MACRO(fSTORERELEASE,
{
fV_AL_CHECK(EA,fVECSIZE()-1);
mem_store_release(thread, insn, fVECSIZE(), EA&~(fVECSIZE()-1), EA, TYPE, fUSE_LOOKUP_ADDRESS_BY_REV(thread->processor_ptr));
},
(A_STORE,A_MEMLIKE)
)
DEF_MACRO(fVFETCH_AL,
{
fV_AL_CHECK(EA,fVECSIZE()-1);
mem_fetch_vector(thread, insn, EA&~(fVECSIZE()-1), insn->slot, fVECSIZE());
},
(A_LOAD,A_MEMLIKE)
)
DEF_MACRO(fLOADMMV_AL,
{
fV_AL_CHECK(EA,ALIGNMENT-1);
thread->last_pkt->double_access_vec = 0;
mem_load_vector_oddva(thread, insn, EA&~(ALIGNMENT-1), EA, insn->slot, LEN, &DST.ub[0], LEN, fUSE_LOOKUP_ADDRESS_BY_REV(thread->processor_ptr));
},
(A_LOAD,A_MEMLIKE)
)
DEF_MACRO(fLOADMMV,
fLOADMMV_AL(EA,fVECSIZE(),fVECSIZE(),DST),
()
)
DEF_MACRO(fLOADMMVQ,
do {
int __i;
fLOADMMV_AL(EA,fVECSIZE(),fVECSIZE(),DST);
fVFOREACH(8,__i) if (!fGETQBIT(QVAL,__i)) DST.b[__i] = 0;
} while (0),
()
)
DEF_MACRO(fLOADMMVNQ,
do {
int __i;
fLOADMMV_AL(EA,fVECSIZE(),fVECSIZE(),DST);
fVFOREACH(8,__i) if (fGETQBIT(QVAL,__i)) DST.b[__i] = 0;
} while (0),
()
)
DEF_MACRO(fLOADMMVU_AL,
{
size4u_t size2 = (EA)&(ALIGNMENT-1);
size4u_t size1 = LEN-size2;
thread->last_pkt->double_access_vec = 1;
mem_load_vector_oddva(thread, insn, EA+size1, EA+fVECSIZE(), /* slot */ 1, size2, &DST.ub[size1], size2, fUSE_LOOKUP_ADDRESS());
mem_load_vector_oddva(thread, insn, EA, EA,/* slot */ 0, size1, &DST.ub[0], size1, fUSE_LOOKUP_ADDRESS_BY_REV(thread->processor_ptr));
},
(A_LOAD,A_MEMLIKE)
)
DEF_MACRO(fLOADMMVU,
{
/* if address happens to be aligned, only do aligned load */
thread->last_pkt->pkt_has_vtcm_access = 0;
thread->last_pkt->pkt_access_count = 0;
if ( (EA & (fVECSIZE()-1)) == 0) {
thread->last_pkt->pkt_has_vmemu_access = 0;
thread->last_pkt->double_access = 0;
fLOADMMV_AL(EA,fVECSIZE(),fVECSIZE(),DST);
} else {
thread->last_pkt->pkt_has_vmemu_access = 1;
thread->last_pkt->double_access = 1;
fLOADMMVU_AL(EA,fVECSIZE(),fVECSIZE(),DST);
}
},
()
)
DEF_MACRO(fSTOREMMV_AL,
{
fV_AL_CHECK(EA,ALIGNMENT-1);
mem_store_vector_oddva(thread, insn, EA&~(ALIGNMENT-1), EA, insn->slot, LEN, &SRC.ub[0], 0, 0, fUSE_LOOKUP_ADDRESS_BY_REV(thread->processor_ptr));
},
(A_STORE,A_MEMLIKE)
)
DEF_MACRO(fSTOREMMV,
fSTOREMMV_AL(EA,fVECSIZE(),fVECSIZE(),SRC),
()
)
DEF_MACRO(fSTOREMMVQ_AL,
do {
mmvector_t maskvec;
int i;
for (i = 0; i < fVECSIZE(); i++) maskvec.ub[i] = fGETQBIT(MASK,i);
mem_store_vector_oddva(thread, insn, EA&~(ALIGNMENT-1), EA, insn->slot, LEN, &SRC.ub[0], &maskvec.ub[0], 0, fUSE_LOOKUP_ADDRESS_BY_REV(thread->processor_ptr));
} while (0),
(A_STORE,A_MEMLIKE)
)
DEF_MACRO(fSTOREMMVQ,
fSTOREMMVQ_AL(EA,fVECSIZE(),fVECSIZE(),SRC,MASK),
()
)
DEF_MACRO(fSTOREMMVNQ_AL,
{
mmvector_t maskvec;
int i;
for (i = 0; i < fVECSIZE(); i++) maskvec.ub[i] = fGETQBIT(MASK,i);
fV_AL_CHECK(EA,ALIGNMENT-1);
mem_store_vector_oddva(thread, insn, EA&~(ALIGNMENT-1), EA, insn->slot, LEN, &SRC.ub[0], &maskvec.ub[0], 1, fUSE_LOOKUP_ADDRESS_BY_REV(thread->processor_ptr));
},
(A_STORE,A_MEMLIKE)
)
DEF_MACRO(fSTOREMMVNQ,
fSTOREMMVNQ_AL(EA,fVECSIZE(),fVECSIZE(),SRC,MASK),
()
)
DEF_MACRO(fSTOREMMVU_AL,
{
size4u_t size1 = ALIGNMENT-((EA)&(ALIGNMENT-1));
size4u_t size2;
if (size1>LEN) size1 = LEN;
size2 = LEN-size1;
mem_store_vector_oddva(thread, insn, EA+size1, EA+fVECSIZE(), /* slot */ 1, size2, &SRC.ub[size1], 0, 0, fUSE_LOOKUP_ADDRESS());
mem_store_vector_oddva(thread, insn, EA, EA, /* slot */ 0, size1, &SRC.ub[0], 0, 0, fUSE_LOOKUP_ADDRESS_BY_REV(thread->processor_ptr));
},
(A_STORE,A_MEMLIKE)
)
DEF_MACRO(fSTOREMMVU,
{
thread->last_pkt->pkt_has_vtcm_access = 0;
thread->last_pkt->pkt_access_count = 0;
if ( (EA & (fVECSIZE()-1)) == 0) {
thread->last_pkt->double_access = 0;
fSTOREMMV_AL(EA,fVECSIZE(),fVECSIZE(),SRC);
} else {
thread->last_pkt->double_access = 1;
thread->last_pkt->pkt_has_vmemu_access = 1;
fSTOREMMVU_AL(EA,fVECSIZE(),fVECSIZE(),SRC);
}
},
()
)
DEF_MACRO(fSTOREMMVQU_AL,
{
size4u_t size1 = ALIGNMENT-((EA)&(ALIGNMENT-1));
size4u_t size2;
mmvector_t maskvec;
int i;
for (i = 0; i < fVECSIZE(); i++) maskvec.ub[i] = fGETQBIT(MASK,i);
if (size1>LEN) size1 = LEN;
size2 = LEN-size1;
mem_store_vector_oddva(thread, insn, EA+size1, EA+fVECSIZE(),/* slot */ 1, size2, &SRC.ub[size1], &maskvec.ub[size1], 0, fUSE_LOOKUP_ADDRESS());
mem_store_vector_oddva(thread, insn, EA, EA, /* slot */ 0, size1, &SRC.ub[0], &maskvec.ub[0], 0, fUSE_LOOKUP_ADDRESS_BY_REV(thread->processor_ptr));
},
(A_STORE,A_MEMLIKE)
)
DEF_MACRO(fSTOREMMVQU,
{
thread->last_pkt->pkt_has_vtcm_access = 0;
thread->last_pkt->pkt_access_count = 0;
if ( (EA & (fVECSIZE()-1)) == 0) {
thread->last_pkt->double_access = 0;
fSTOREMMVQ_AL(EA,fVECSIZE(),fVECSIZE(),SRC,MASK);
} else {
thread->last_pkt->double_access = 1;
thread->last_pkt->pkt_has_vmemu_access = 1;
fSTOREMMVQU_AL(EA,fVECSIZE(),fVECSIZE(),SRC,MASK);
}
},
()
)
DEF_MACRO(fSTOREMMVNQU_AL,
{
size4u_t size1 = ALIGNMENT-((EA)&(ALIGNMENT-1));
size4u_t size2;
mmvector_t maskvec;
int i;
for (i = 0; i < fVECSIZE(); i++) maskvec.ub[i] = fGETQBIT(MASK,i);
if (size1>LEN) size1 = LEN;
size2 = LEN-size1;
mem_store_vector_oddva(thread, insn, EA+size1, EA+fVECSIZE(), /* slot */ 1, size2, &SRC.ub[size1], &maskvec.ub[size1], 1, fUSE_LOOKUP_ADDRESS());
mem_store_vector_oddva(thread, insn, EA, EA, /* slot */ 0, size1, &SRC.ub[0], &maskvec.ub[0], 1, fUSE_LOOKUP_ADDRESS_BY_REV(thread->processor_ptr));
},
(A_STORE,A_MEMLIKE)
)
DEF_MACRO(fSTOREMMVNQU,
{
thread->last_pkt->pkt_has_vtcm_access = 0;
thread->last_pkt->pkt_access_count = 0;
if ( (EA & (fVECSIZE()-1)) == 0) {
thread->last_pkt->double_access = 0;
fSTOREMMVNQ_AL(EA,fVECSIZE(),fVECSIZE(),SRC,MASK);
} else {
thread->last_pkt->double_access = 1;
thread->last_pkt->pkt_has_vmemu_access = 1;
fSTOREMMVNQU_AL(EA,fVECSIZE(),fVECSIZE(),SRC,MASK);
}
},
()
)
DEF_MACRO(fVFOREACH,
for (VAR = 0; VAR < fVELEM(WIDTH); VAR++),
/* NOTHING */
)
DEF_MACRO(fVARRAY_ELEMENT_ACCESS,
ARRAY.v[(INDEX) / (fVECSIZE()/(sizeof(ARRAY.TYPE[0])))].TYPE[(INDEX) % (fVECSIZE()/(sizeof(ARRAY.TYPE[0])))],
()
)
DEF_MACRO(fVNEWCANCEL,
do { THREAD2STRUCT->VRegs_select &= ~(1<<(REGNUM)); } while (0),
()
)
DEF_MACRO(fTMPVDATA,
mmvec_vtmp_data(thread),
(A_CVI)
)
DEF_MACRO(fVSATDW,
fVSATW( ( ( ((long long)U)<<32 ) | fZXTN(32,64,V) ) ),
/* attribs */
)
DEF_MACRO(fVASL_SATHI,
fVSATW(((U)<<1) | ((V)>>31)),
/* attribs */
)
DEF_MACRO(fVUADDSAT,
fVSATUN( WIDTH, fZXTN(WIDTH, 2*WIDTH, U) + fZXTN(WIDTH, 2*WIDTH, V)),
/* attribs */
)
DEF_MACRO(fVSADDSAT,
fVSATN( WIDTH, fSXTN(WIDTH, 2*WIDTH, U) + fSXTN(WIDTH, 2*WIDTH, V)),
/* attribs */
)
DEF_MACRO(fVUSUBSAT,
fVSATUN( WIDTH, fZXTN(WIDTH, 2*WIDTH, U) - fZXTN(WIDTH, 2*WIDTH, V)),
/* attribs */
)
DEF_MACRO(fVSSUBSAT,
fVSATN( WIDTH, fSXTN(WIDTH, 2*WIDTH, U) - fSXTN(WIDTH, 2*WIDTH, V)),
/* attribs */
)
DEF_MACRO(fVAVGU,
((fZXTN(WIDTH, 2*WIDTH, U) + fZXTN(WIDTH, 2*WIDTH, V))>>1),
/* attribs */
)
DEF_MACRO(fVAVGURND,
((fZXTN(WIDTH, 2*WIDTH, U) + fZXTN(WIDTH, 2*WIDTH, V)+1)>>1),
/* attribs */
)
DEF_MACRO(fVNAVGU,
((fZXTN(WIDTH, 2*WIDTH, U) - fZXTN(WIDTH, 2*WIDTH, V))>>1),
/* attribs */
)
DEF_MACRO(fVNAVGURNDSAT,
fVSATUN(WIDTH,((fZXTN(WIDTH, 2*WIDTH, U) - fZXTN(WIDTH, 2*WIDTH, V)+1)>>1)),
/* attribs */
)
DEF_MACRO(fVAVGS,
((fSXTN(WIDTH, 2*WIDTH, U) + fSXTN(WIDTH, 2*WIDTH, V))>>1),
/* attribs */
)
DEF_MACRO(fVAVGSRND,
((fSXTN(WIDTH, 2*WIDTH, U) + fSXTN(WIDTH, 2*WIDTH, V)+1)>>1),
/* attribs */
)
DEF_MACRO(fVNAVGS,
((fSXTN(WIDTH, 2*WIDTH, U) - fSXTN(WIDTH, 2*WIDTH, V))>>1),
/* attribs */
)
DEF_MACRO(fVNAVGSRND,
((fSXTN(WIDTH, 2*WIDTH, U) - fSXTN(WIDTH, 2*WIDTH, V)+1)>>1),
/* attribs */
)
DEF_MACRO(fVNAVGSRNDSAT,
fVSATN(WIDTH,((fSXTN(WIDTH, 2*WIDTH, U) - fSXTN(WIDTH, 2*WIDTH, V)+1)>>1)),
/* attribs */
)
DEF_MACRO(fVNOROUND,
VAL,
/* NOTHING */
)
DEF_MACRO(fVNOSAT,
VAL,
/* NOTHING */
)
DEF_MACRO(fVROUND,
((VAL) + (((SHAMT)>0)?(1LL<<((SHAMT)-1)):0)),
/* NOTHING */
)
DEF_MACRO(fCARRY_FROM_ADD32,
(((fZXTN(32,64,A)+fZXTN(32,64,B)+C) >> 32) & 1),
/* NOTHING */
)
DEF_MACRO(fUARCH_NOTE_PUMP_4X,
,
()
)
DEF_MACRO(fUARCH_NOTE_PUMP_2X,
,
()
)
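The fGETQBIT/fGENMASKW macros above implement the HVX predicate model: each bit of a Q register covers one byte lane, so word lane IDX expands its four predicate bits into a 32-bit byte mask. A standalone sketch of that expansion — not QEMU code; a plain `uint32_t` stands in for the Q register type:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Sketch of fGENMASKW's semantics: for word lane `idx`, each of the
 * four predicate bits qbits[idx*4 + b] selects 0xFF or 0x00 for the
 * corresponding byte of the 32-bit mask.
 */
static uint32_t genmaskw(uint32_t qbits, int idx)
{
    uint32_t mask = 0;
    for (int b = 0; b < 4; b++) {
        if ((qbits >> (idx * 4 + b)) & 1) {  /* fGETQBIT(QREG, IDX*4+b) */
            mask |= 0xFFu << (8 * b);        /* 0xFF per covered byte */
        }
    }
    return mask;
}
```

fGETMASKW then simply ANDs a vector word with this mask, which is how the conditional (Qv-gated) vector operations zero out unselected lanes.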


@@ -67,6 +67,9 @@ struct Packet {
     bool pkt_has_store_s0;
     bool pkt_has_store_s1;
+    bool pkt_has_hvx;
+    Insn *vhist_insn;
     Insn insn[INSTRUCTIONS_MAX];
 };


@@ -31,6 +31,9 @@
 int hexagon_gdb_read_register(CPUState *cpu, GByteArray *buf, int reg);
 int hexagon_gdb_write_register(CPUState *cpu, uint8_t *buf, int reg);
+void hexagon_debug_vreg(CPUHexagonState *env, int regnum);
+void hexagon_debug_qreg(CPUHexagonState *env, int regnum);
 void hexagon_debug(CPUHexagonState *env);
 extern const char * const hexagon_regnames[TOTAL_PER_THREAD_REGS];


@ -266,6 +266,10 @@ static inline void gen_pred_cancel(TCGv pred, int slot_num)
#define fNEWREG_ST(VAL) (VAL) #define fNEWREG_ST(VAL) (VAL)
#define fVSATUVALN(N, VAL) \
({ \
(((int)(VAL)) < 0) ? 0 : ((1LL << (N)) - 1); \
})
#define fSATUVALN(N, VAL) \ #define fSATUVALN(N, VAL) \
({ \ ({ \
fSET_OVERFLOW(); \ fSET_OVERFLOW(); \
@ -276,10 +280,16 @@ static inline void gen_pred_cancel(TCGv pred, int slot_num)
fSET_OVERFLOW(); \ fSET_OVERFLOW(); \
((VAL) < 0) ? (-(1LL << ((N) - 1))) : ((1LL << ((N) - 1)) - 1); \ ((VAL) < 0) ? (-(1LL << ((N) - 1))) : ((1LL << ((N) - 1)) - 1); \
}) })
#define fVSATVALN(N, VAL) \
({ \
((VAL) < 0) ? (-(1LL << ((N) - 1))) : ((1LL << ((N) - 1)) - 1); \
})
#define fZXTN(N, M, VAL) (((N) != 0) ? extract64((VAL), 0, (N)) : 0LL) #define fZXTN(N, M, VAL) (((N) != 0) ? extract64((VAL), 0, (N)) : 0LL)
#define fSXTN(N, M, VAL) (((N) != 0) ? sextract64((VAL), 0, (N)) : 0LL)
#define fSATN(N, VAL) \
    ((fSXTN(N, 64, VAL) == (VAL)) ? (VAL) : fSATVALN(N, VAL))
#define fVSATN(N, VAL) \
    ((fSXTN(N, 64, VAL) == (VAL)) ? (VAL) : fVSATVALN(N, VAL))
#define fADDSAT64(DST, A, B) \
    do { \
        uint64_t __a = fCAST8u(A); \
@@ -302,12 +312,18 @@ static inline void gen_pred_cancel(TCGv pred, int slot_num)
            DST = __sum; \
        } \
    } while (0)
#define fVSATUN(N, VAL) \
    ((fZXTN(N, 64, VAL) == (VAL)) ? (VAL) : fVSATUVALN(N, VAL))
#define fSATUN(N, VAL) \
    ((fZXTN(N, 64, VAL) == (VAL)) ? (VAL) : fSATUVALN(N, VAL))
#define fSATH(VAL) (fSATN(16, VAL))
#define fSATUH(VAL) (fSATUN(16, VAL))
#define fVSATH(VAL) (fVSATN(16, VAL))
#define fVSATUH(VAL) (fVSATUN(16, VAL))
#define fSATUB(VAL) (fSATUN(8, VAL))
#define fSATB(VAL) (fSATN(8, VAL))
#define fVSATUB(VAL) (fVSATUN(8, VAL))
#define fVSATB(VAL) (fVSATN(8, VAL))
#define fIMMEXT(IMM) (IMM = IMM)
#define fMUST_IMMEXT(IMM) fIMMEXT(IMM)
@@ -414,6 +430,8 @@ static inline TCGv gen_read_ireg(TCGv result, TCGv val, int shift)
#define fCAST4s(A) ((int32_t)(A))
#define fCAST8u(A) ((uint64_t)(A))
#define fCAST8s(A) ((int64_t)(A))
#define fCAST2_2s(A) ((int16_t)(A))
#define fCAST2_2u(A) ((uint16_t)(A))
#define fCAST4_4s(A) ((int32_t)(A))
#define fCAST4_4u(A) ((uint32_t)(A))
#define fCAST4_8s(A) ((int64_t)((int32_t)(A)))
@@ -510,7 +528,9 @@ static inline TCGv gen_read_ireg(TCGv result, TCGv val, int shift)
#define fPM_M(REG, MVAL) do { REG = REG + (MVAL); } while (0)
#endif
#define fSCALE(N, A) (((int64_t)(A)) << N)
#define fVSATW(A) fVSATN(32, ((long long)A))
#define fSATW(A) fSATN(32, ((long long)A))
#define fVSAT(A) fVSATN(32, (A))
#define fSAT(A) fSATN(32, (A))
#define fSAT_ORIG_SHL(A, ORIG_REG) \
    ((((int32_t)((fSAT(A)) ^ ((int32_t)(ORIG_REG)))) < 0) \
@@ -647,12 +667,14 @@ static inline TCGv gen_read_ireg(TCGv result, TCGv val, int shift)
            fSETBIT(j, DST, VAL); \
        } \
    } while (0)
#define fCOUNTONES_2(VAL) ctpop16(VAL)
#define fCOUNTONES_4(VAL) ctpop32(VAL)
#define fCOUNTONES_8(VAL) ctpop64(VAL)
#define fBREV_8(VAL) revbit64(VAL)
#define fBREV_4(VAL) revbit32(VAL)
#define fCL1_8(VAL) clo64(VAL)
#define fCL1_4(VAL) clo32(VAL)
#define fCL1_2(VAL) (clz32(~(uint16_t)(VAL) & 0xffff) - 16)
#define fINTERLEAVE(ODD, EVEN) interleave(ODD, EVEN)
#define fDEINTERLEAVE(MIXED) deinterleave(MIXED)
#define fHIDE(A) A
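The saturation macros above check whether a value survives sign-extension to N bits and clamp it otherwise. A minimal standalone C sketch of that check-then-clamp pattern (the helper names `sxtn`/`satn` are illustrative, and this omits the overflow-flag side effect the real fSATVALN performs):

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low n bits of val (0 < n <= 64), like QEMU's sextract64() */
static int64_t sxtn(int n, int64_t val)
{
    return (int64_t)((uint64_t)val << (64 - n)) >> (64 - n);
}

/* Saturate val to the signed n-bit range, mirroring fSATN()'s structure */
static int64_t satn(int n, int64_t val)
{
    if (sxtn(n, val) == val) {
        return val;                          /* already fits in n bits */
    }
    return (val < 0) ? -(1LL << (n - 1))     /* clamp to minimum */
                     : (1LL << (n - 1)) - 1; /* clamp to maximum */
}
```

The same shape applies to the unsigned variants, which zero-extend instead and clamp to `[0, 2^N - 1]`.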


@@ -20,6 +20,7 @@ hexagon_ss = ss.source_set()
hex_common_py = 'hex_common.py'
attribs_def = meson.current_source_dir() / 'attribs_def.h.inc'
gen_tcg_h = meson.current_source_dir() / 'gen_tcg.h'
gen_tcg_hvx_h = meson.current_source_dir() / 'gen_tcg_hvx.h'
#
# Step 1
@@ -63,8 +64,8 @@ helper_protos_generated = custom_target(
    'helper_protos_generated.h.inc',
    output: 'helper_protos_generated.h.inc',
    depends: [semantics_generated],
-    depend_files: [hex_common_py, attribs_def, gen_tcg_h],
+    depend_files: [hex_common_py, attribs_def, gen_tcg_h, gen_tcg_hvx_h],
-    command: [python, files('gen_helper_protos.py'), semantics_generated, attribs_def, gen_tcg_h, '@OUTPUT@'],
+    command: [python, files('gen_helper_protos.py'), semantics_generated, attribs_def, gen_tcg_h, gen_tcg_hvx_h, '@OUTPUT@'],
)
hexagon_ss.add(helper_protos_generated)
@@ -72,8 +73,8 @@ tcg_funcs_generated = custom_target(
    'tcg_funcs_generated.c.inc',
    output: 'tcg_funcs_generated.c.inc',
    depends: [semantics_generated],
-    depend_files: [hex_common_py, attribs_def, gen_tcg_h],
+    depend_files: [hex_common_py, attribs_def, gen_tcg_h, gen_tcg_hvx_h],
-    command: [python, files('gen_tcg_funcs.py'), semantics_generated, attribs_def, gen_tcg_h, '@OUTPUT@'],
+    command: [python, files('gen_tcg_funcs.py'), semantics_generated, attribs_def, gen_tcg_h, gen_tcg_hvx_h, '@OUTPUT@'],
)
hexagon_ss.add(tcg_funcs_generated)
@@ -90,8 +91,8 @@ helper_funcs_generated = custom_target(
    'helper_funcs_generated.c.inc',
    output: 'helper_funcs_generated.c.inc',
    depends: [semantics_generated],
-    depend_files: [hex_common_py, attribs_def, gen_tcg_h],
+    depend_files: [hex_common_py, attribs_def, gen_tcg_h, gen_tcg_hvx_h],
-    command: [python, files('gen_helper_funcs.py'), semantics_generated, attribs_def, gen_tcg_h, '@OUTPUT@'],
+    command: [python, files('gen_helper_funcs.py'), semantics_generated, attribs_def, gen_tcg_h, gen_tcg_hvx_h, '@OUTPUT@'],
)
hexagon_ss.add(helper_funcs_generated)
@@ -174,6 +175,8 @@ hexagon_ss.add(files(
    'printinsn.c',
    'arch.c',
    'fma_emu.c',
    'mmvec/decode_ext_mmvec.c',
    'mmvec/system_ext_mmvec.c',
))
target_arch += {'hexagon': hexagon_ss}


@@ -0,0 +1,236 @@
/*
* Copyright(c) 2019-2021 Qualcomm Innovation Center, Inc. All Rights Reserved.
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, see <http://www.gnu.org/licenses/>.
*/
#include "qemu/osdep.h"
#include "decode.h"
#include "opcodes.h"
#include "insn.h"
#include "iclass.h"
#include "mmvec/mmvec.h"
#include "mmvec/decode_ext_mmvec.h"
static void
check_new_value(Packet *pkt)
{
/* .new value for a MMVector store */
int i, j;
const char *reginfo;
const char *destletters;
const char *dststr = NULL;
uint16_t def_opcode;
char letter;
int def_regnum;
for (i = 1; i < pkt->num_insns; i++) {
uint16_t use_opcode = pkt->insn[i].opcode;
if (GET_ATTRIB(use_opcode, A_DOTNEWVALUE) &&
GET_ATTRIB(use_opcode, A_CVI) &&
GET_ATTRIB(use_opcode, A_STORE)) {
int use_regidx = strchr(opcode_reginfo[use_opcode], 's') -
opcode_reginfo[use_opcode];
/*
* What's encoded in the N-field is the offset to the instruction
* producing the value.
* Shift off the LSB which indicates odd/even register.
*/
int def_off = ((pkt->insn[i].regno[use_regidx]) >> 1);
int def_oreg = pkt->insn[i].regno[use_regidx] & 1;
int def_idx = -1;
for (j = i - 1; (j >= 0) && (def_off >= 0); j--) {
if (!GET_ATTRIB(pkt->insn[j].opcode, A_CVI)) {
continue;
}
def_off--;
if (def_off == 0) {
def_idx = j;
break;
}
}
/*
* Check for a badly encoded N-field which points to an instruction
* out-of-range
*/
g_assert(!((def_off != 0) || (def_idx < 0) ||
(def_idx > (pkt->num_insns - 1))));
/* def_idx is the index of the producer */
def_opcode = pkt->insn[def_idx].opcode;
reginfo = opcode_reginfo[def_opcode];
destletters = "dexy";
for (j = 0; (letter = destletters[j]) != 0; j++) {
dststr = strchr(reginfo, letter);
if (dststr != NULL) {
break;
}
}
if ((dststr == NULL) && GET_ATTRIB(def_opcode, A_CVI_GATHER)) {
def_regnum = 0;
pkt->insn[i].regno[use_regidx] = def_oreg;
pkt->insn[i].new_value_producer_slot = pkt->insn[def_idx].slot;
} else {
if (dststr == NULL) {
/* still not there, we have a bad packet */
g_assert_not_reached();
}
def_regnum = pkt->insn[def_idx].regno[dststr - reginfo];
/* Now patch up the consumer with the register number */
pkt->insn[i].regno[use_regidx] = def_regnum ^ def_oreg;
/* special case for (Vx,Vy) */
dststr = strchr(reginfo, 'y');
if (def_oreg && strchr(reginfo, 'x') && dststr) {
def_regnum = pkt->insn[def_idx].regno[dststr - reginfo];
pkt->insn[i].regno[use_regidx] = def_regnum;
}
/*
* We need to remember who produces this value to later
* check if it was dynamically cancelled
*/
pkt->insn[i].new_value_producer_slot = pkt->insn[def_idx].slot;
}
}
}
}
/*
* We don't want to reorder slot1/slot0 with respect to each other.
* So in our shuffling, we don't want to move the .cur / .tmp vmem earlier
* Instead, we should move the producing instruction later
* But the producing instruction might feed a .new store!
* So we may need to move that even later.
*/
static void
decode_mmvec_move_cvi_to_end(Packet *pkt, int max)
{
int i;
for (i = 0; i < max; i++) {
if (GET_ATTRIB(pkt->insn[i].opcode, A_CVI)) {
int last_inst = pkt->num_insns - 1;
uint16_t last_opcode = pkt->insn[last_inst].opcode;
/*
* If the last instruction is an endloop, move to the one before it
* Keep endloop as the last thing always
*/
if ((last_opcode == J2_endloop0) ||
(last_opcode == J2_endloop1) ||
(last_opcode == J2_endloop01)) {
last_inst--;
}
decode_send_insn_to(pkt, i, last_inst);
max--;
i--; /* Retry this index now that packet has rotated */
}
}
}
static void
decode_shuffle_for_execution_vops(Packet *pkt)
{
/*
* Sort for .new
*/
int i;
for (i = 0; i < pkt->num_insns; i++) {
uint16_t opcode = pkt->insn[i].opcode;
if (GET_ATTRIB(opcode, A_LOAD) &&
(GET_ATTRIB(opcode, A_CVI_NEW) ||
GET_ATTRIB(opcode, A_CVI_TMP))) {
/*
* Find prior consuming vector instructions
* Move to end of packet
*/
decode_mmvec_move_cvi_to_end(pkt, i);
break;
}
}
/* Move HVX new value stores to the end of the packet */
for (i = 0; i < pkt->num_insns - 1; i++) {
uint16_t opcode = pkt->insn[i].opcode;
if (GET_ATTRIB(opcode, A_STORE) &&
GET_ATTRIB(opcode, A_CVI_NEW) &&
!GET_ATTRIB(opcode, A_CVI_SCATTER_RELEASE)) {
int last_inst = pkt->num_insns - 1;
uint16_t last_opcode = pkt->insn[last_inst].opcode;
/*
* If the last instruction is an endloop, move to the one before it
* Keep endloop as the last thing always
*/
if ((last_opcode == J2_endloop0) ||
(last_opcode == J2_endloop1) ||
(last_opcode == J2_endloop01)) {
last_inst--;
}
decode_send_insn_to(pkt, i, last_inst);
break;
}
}
}
static void
check_for_vhist(Packet *pkt)
{
pkt->vhist_insn = NULL;
for (int i = 0; i < pkt->num_insns; i++) {
Insn *insn = &pkt->insn[i];
int opcode = insn->opcode;
if (GET_ATTRIB(opcode, A_CVI) && GET_ATTRIB(opcode, A_CVI_4SLOT)) {
pkt->vhist_insn = insn;
return;
}
}
}
/*
* Public Functions
*/
SlotMask mmvec_ext_decode_find_iclass_slots(int opcode)
{
if (GET_ATTRIB(opcode, A_CVI_VM)) {
/* HVX memory instruction */
if (GET_ATTRIB(opcode, A_RESTRICT_SLOT0ONLY)) {
return SLOTS_0;
} else if (GET_ATTRIB(opcode, A_RESTRICT_SLOT1ONLY)) {
return SLOTS_1;
}
return SLOTS_01;
} else if (GET_ATTRIB(opcode, A_RESTRICT_SLOT2ONLY)) {
return SLOTS_2;
} else if (GET_ATTRIB(opcode, A_CVI_VX)) {
/* HVX multiply instruction */
return SLOTS_23;
} else if (GET_ATTRIB(opcode, A_CVI_VS_VX)) {
/* HVX permute/shift instruction */
return SLOTS_23;
} else {
return SLOTS_0123;
}
}
void mmvec_ext_decode_checks(Packet *pkt, bool disas_only)
{
check_new_value(pkt);
if (!disas_only) {
decode_shuffle_for_execution_vops(pkt);
}
check_for_vhist(pkt);
}
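The backward walk in check_new_value() resolves a .new operand: the N-field offset counts only HVX (A_CVI) instructions when searching for the producer. A hypothetical standalone version of just that loop (the `is_cvi` array and `find_producer` name are illustrative, not from the patch):

```c
#include <assert.h>

/*
 * Walk backward from the consumer, decrementing the encoded offset once
 * per HVX instruction; the instruction that brings it to zero is the
 * producer. Returns -1 for a badly encoded N-field, which the real code
 * turns into a g_assert().
 */
static int find_producer(const int *is_cvi, int consumer, int def_off)
{
    for (int j = consumer - 1; j >= 0 && def_off >= 0; j--) {
        if (!is_cvi[j]) {
            continue;           /* scalar insns don't count toward the offset */
        }
        def_off--;
        if (def_off == 0) {
            return j;           /* j indexes the producing instruction */
        }
    }
    return -1;
}
```

With a packet whose CVI flags are `{1, 0, 1, 0, 1}`, a consumer at index 4 with offset 1 resolves to the CVI instruction at index 2, and offset 2 resolves to index 0.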


@@ -0,0 +1,24 @@
/*
* Copyright(c) 2019-2021 Qualcomm Innovation Center, Inc. All Rights Reserved.
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, see <http://www.gnu.org/licenses/>.
*/
#ifndef HEXAGON_DECODE_EXT_MMVEC_H
#define HEXAGON_DECODE_EXT_MMVEC_H
void mmvec_ext_decode_checks(Packet *pkt, bool disas_only);
SlotMask mmvec_ext_decode_find_iclass_slots(int opcode);
#endif

View file

@@ -0,0 +1,354 @@
/*
* Copyright(c) 2019-2021 Qualcomm Innovation Center, Inc. All Rights Reserved.
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, see <http://www.gnu.org/licenses/>.
*/
#ifndef HEXAGON_MMVEC_MACROS_H
#define HEXAGON_MMVEC_MACROS_H
#include "qemu/osdep.h"
#include "qemu/host-utils.h"
#include "arch.h"
#include "mmvec/system_ext_mmvec.h"
#ifndef QEMU_GENERATE
#define VdV (*(MMVector *)(VdV_void))
#define VsV (*(MMVector *)(VsV_void))
#define VuV (*(MMVector *)(VuV_void))
#define VvV (*(MMVector *)(VvV_void))
#define VwV (*(MMVector *)(VwV_void))
#define VxV (*(MMVector *)(VxV_void))
#define VyV (*(MMVector *)(VyV_void))
#define VddV (*(MMVectorPair *)(VddV_void))
#define VuuV (*(MMVectorPair *)(VuuV_void))
#define VvvV (*(MMVectorPair *)(VvvV_void))
#define VxxV (*(MMVectorPair *)(VxxV_void))
#define QeV (*(MMQReg *)(QeV_void))
#define QdV (*(MMQReg *)(QdV_void))
#define QsV (*(MMQReg *)(QsV_void))
#define QtV (*(MMQReg *)(QtV_void))
#define QuV (*(MMQReg *)(QuV_void))
#define QvV (*(MMQReg *)(QvV_void))
#define QxV (*(MMQReg *)(QxV_void))
#endif
#define LOG_VTCM_BYTE(VA, MASK, VAL, IDX) \
do { \
env->vtcm_log.data.ub[IDX] = (VAL); \
if (MASK) { \
set_bit((IDX), env->vtcm_log.mask); \
} else { \
clear_bit((IDX), env->vtcm_log.mask); \
} \
env->vtcm_log.va[IDX] = (VA); \
} while (0)
#define fNOTQ(VAL) \
({ \
MMQReg _ret; \
int _i_; \
for (_i_ = 0; _i_ < fVECSIZE() / 64; _i_++) { \
_ret.ud[_i_] = ~VAL.ud[_i_]; \
} \
_ret;\
})
#define fGETQBITS(REG, WIDTH, MASK, BITNO) \
((MASK) & (REG.w[(BITNO) >> 5] >> ((BITNO) & 0x1f)))
#define fGETQBIT(REG, BITNO) fGETQBITS(REG, 1, 1, BITNO)
#define fGENMASKW(QREG, IDX) \
(((fGETQBIT(QREG, (IDX * 4 + 0)) ? 0xFF : 0x0) << 0) | \
((fGETQBIT(QREG, (IDX * 4 + 1)) ? 0xFF : 0x0) << 8) | \
((fGETQBIT(QREG, (IDX * 4 + 2)) ? 0xFF : 0x0) << 16) | \
((fGETQBIT(QREG, (IDX * 4 + 3)) ? 0xFF : 0x0) << 24))
#define fGETNIBBLE(IDX, SRC) (fSXTN(4, 8, (SRC >> (4 * IDX)) & 0xF))
#define fGETCRUMB(IDX, SRC) (fSXTN(2, 8, (SRC >> (2 * IDX)) & 0x3))
#define fGETCRUMB_SYMMETRIC(IDX, SRC) \
((fGETCRUMB(IDX, SRC) >= 0 ? (2 - fGETCRUMB(IDX, SRC)) \
: fGETCRUMB(IDX, SRC)))
#define fGENMASKH(QREG, IDX) \
(((fGETQBIT(QREG, (IDX * 2 + 0)) ? 0xFF : 0x0) << 0) | \
((fGETQBIT(QREG, (IDX * 2 + 1)) ? 0xFF : 0x0) << 8))
#define fGETMASKW(VREG, QREG, IDX) (VREG.w[IDX] & fGENMASKW((QREG), IDX))
#define fGETMASKH(VREG, QREG, IDX) (VREG.h[IDX] & fGENMASKH((QREG), IDX))
#define fCONDMASK8(QREG, IDX, YESVAL, NOVAL) \
(fGETQBIT(QREG, IDX) ? (YESVAL) : (NOVAL))
#define fCONDMASK16(QREG, IDX, YESVAL, NOVAL) \
((fGENMASKH(QREG, IDX) & (YESVAL)) | \
(fGENMASKH(fNOTQ(QREG), IDX) & (NOVAL)))
#define fCONDMASK32(QREG, IDX, YESVAL, NOVAL) \
((fGENMASKW(QREG, IDX) & (YESVAL)) | \
(fGENMASKW(fNOTQ(QREG), IDX) & (NOVAL)))
#define fSETQBITS(REG, WIDTH, MASK, BITNO, VAL) \
do { \
uint32_t __TMP = (VAL); \
REG.w[(BITNO) >> 5] &= ~((MASK) << ((BITNO) & 0x1f)); \
REG.w[(BITNO) >> 5] |= (((__TMP) & (MASK)) << ((BITNO) & 0x1f)); \
} while (0)
#define fSETQBIT(REG, BITNO, VAL) fSETQBITS(REG, 1, 1, BITNO, VAL)
#define fVBYTES() (fVECSIZE())
#define fVALIGN(ADDR, LOG2_ALIGNMENT) (ADDR = ADDR & ~(LOG2_ALIGNMENT - 1))
#define fVLASTBYTE(ADDR, LOG2_ALIGNMENT) (ADDR = ADDR | (LOG2_ALIGNMENT - 1))
#define fVELEM(WIDTH) ((fVECSIZE() * 8) / WIDTH)
#define fVECLOGSIZE() (7)
#define fVECSIZE() (1 << fVECLOGSIZE())
#define fSWAPB(A, B) do { uint8_t tmp = A; A = B; B = tmp; } while (0)
#define fV_AL_CHECK(EA, MASK) \
if ((EA) & (MASK)) { \
warn("aligning misaligned vector. EA=%08x", (EA)); \
}
#define fSCATTER_INIT(REGION_START, LENGTH, ELEMENT_SIZE) \
mem_vector_scatter_init(env)
#define fGATHER_INIT(REGION_START, LENGTH, ELEMENT_SIZE) \
mem_vector_gather_init(env)
#define fSCATTER_FINISH(OP)
#define fGATHER_FINISH()
#define fLOG_SCATTER_OP(SIZE) \
do { \
env->vtcm_log.op = true; \
env->vtcm_log.op_size = SIZE; \
} while (0)
#define fVLOG_VTCM_WORD_INCREMENT(EA, OFFSET, INC, IDX, ALIGNMENT, LEN) \
do { \
int log_byte = 0; \
target_ulong va = EA; \
target_ulong va_high = EA + LEN; \
for (int i0 = 0; i0 < 4; i0++) { \
log_byte = (va + i0) <= va_high; \
LOG_VTCM_BYTE(va + i0, log_byte, INC.ub[4 * IDX + i0], \
4 * IDX + i0); \
} \
} while (0)
#define fVLOG_VTCM_HALFWORD_INCREMENT(EA, OFFSET, INC, IDX, ALIGNMENT, LEN) \
do { \
int log_byte = 0; \
target_ulong va = EA; \
target_ulong va_high = EA + LEN; \
for (int i0 = 0; i0 < 2; i0++) { \
log_byte = (va + i0) <= va_high; \
LOG_VTCM_BYTE(va + i0, log_byte, INC.ub[2 * IDX + i0], \
2 * IDX + i0); \
} \
} while (0)
#define fVLOG_VTCM_HALFWORD_INCREMENT_DV(EA, OFFSET, INC, IDX, IDX2, IDX_H, \
ALIGNMENT, LEN) \
do { \
int log_byte = 0; \
target_ulong va = EA; \
target_ulong va_high = EA + LEN; \
for (int i0 = 0; i0 < 2; i0++) { \
log_byte = (va + i0) <= va_high; \
LOG_VTCM_BYTE(va + i0, log_byte, INC.ub[2 * IDX + i0], \
2 * IDX + i0); \
} \
} while (0)
/* NOTE - Will this always be tmp_VRegs[0]? */
#define GATHER_FUNCTION(EA, OFFSET, IDX, LEN, ELEMENT_SIZE, BANK_IDX, QVAL) \
do { \
int i0; \
target_ulong va = EA; \
target_ulong va_high = EA + LEN; \
uintptr_t ra = GETPC(); \
int log_bank = 0; \
int log_byte = 0; \
for (i0 = 0; i0 < ELEMENT_SIZE; i0++) { \
log_byte = ((va + i0) <= va_high) && QVAL; \
log_bank |= (log_byte << i0); \
uint8_t B; \
B = cpu_ldub_data_ra(env, EA + i0, ra); \
env->tmp_VRegs[0].ub[ELEMENT_SIZE * IDX + i0] = B; \
LOG_VTCM_BYTE(va + i0, log_byte, B, ELEMENT_SIZE * IDX + i0); \
} \
} while (0)
#define fVLOG_VTCM_GATHER_WORD(EA, OFFSET, IDX, LEN) \
do { \
GATHER_FUNCTION(EA, OFFSET, IDX, LEN, 4, IDX, 1); \
} while (0)
#define fVLOG_VTCM_GATHER_HALFWORD(EA, OFFSET, IDX, LEN) \
do { \
GATHER_FUNCTION(EA, OFFSET, IDX, LEN, 2, IDX, 1); \
} while (0)
#define fVLOG_VTCM_GATHER_HALFWORD_DV(EA, OFFSET, IDX, IDX2, IDX_H, LEN) \
do { \
GATHER_FUNCTION(EA, OFFSET, IDX, LEN, 2, (2 * IDX2 + IDX_H), 1); \
} while (0)
#define fVLOG_VTCM_GATHER_WORDQ(EA, OFFSET, IDX, Q, LEN) \
do { \
GATHER_FUNCTION(EA, OFFSET, IDX, LEN, 4, IDX, \
fGETQBIT(QsV, 4 * IDX + i0)); \
} while (0)
#define fVLOG_VTCM_GATHER_HALFWORDQ(EA, OFFSET, IDX, Q, LEN) \
do { \
GATHER_FUNCTION(EA, OFFSET, IDX, LEN, 2, IDX, \
fGETQBIT(QsV, 2 * IDX + i0)); \
} while (0)
#define fVLOG_VTCM_GATHER_HALFWORDQ_DV(EA, OFFSET, IDX, IDX2, IDX_H, Q, LEN) \
do { \
GATHER_FUNCTION(EA, OFFSET, IDX, LEN, 2, (2 * IDX2 + IDX_H), \
fGETQBIT(QsV, 2 * IDX + i0)); \
} while (0)
#define SCATTER_OP_WRITE_TO_MEM(TYPE) \
do { \
uintptr_t ra = GETPC(); \
for (int i = 0; i < sizeof(MMVector); i += sizeof(TYPE)) { \
if (test_bit(i, env->vtcm_log.mask)) { \
TYPE dst = 0; \
TYPE inc = 0; \
for (int j = 0; j < sizeof(TYPE); j++) { \
uint8_t val; \
val = cpu_ldub_data_ra(env, env->vtcm_log.va[i + j], ra); \
dst |= val << (8 * j); \
inc |= env->vtcm_log.data.ub[j + i] << (8 * j); \
clear_bit(j + i, env->vtcm_log.mask); \
env->vtcm_log.data.ub[j + i] = 0; \
} \
dst += inc; \
for (int j = 0; j < sizeof(TYPE); j++) { \
cpu_stb_data_ra(env, env->vtcm_log.va[i + j], \
(dst >> (8 * j)) & 0xFF, ra); \
} \
} \
} \
} while (0)
#define SCATTER_OP_PROBE_MEM(TYPE, MMU_IDX, RETADDR) \
do { \
for (int i = 0; i < sizeof(MMVector); i += sizeof(TYPE)) { \
if (test_bit(i, env->vtcm_log.mask)) { \
for (int j = 0; j < sizeof(TYPE); j++) { \
probe_read(env, env->vtcm_log.va[i + j], 1, \
MMU_IDX, RETADDR); \
probe_write(env, env->vtcm_log.va[i + j], 1, \
MMU_IDX, RETADDR); \
} \
} \
} \
} while (0)
#define SCATTER_FUNCTION(EA, OFFSET, IDX, LEN, ELEM_SIZE, BANK_IDX, QVAL, IN) \
do { \
int i0; \
target_ulong va = EA; \
target_ulong va_high = EA + LEN; \
int log_bank = 0; \
int log_byte = 0; \
for (i0 = 0; i0 < ELEM_SIZE; i0++) { \
log_byte = ((va + i0) <= va_high) && QVAL; \
log_bank |= (log_byte << i0); \
LOG_VTCM_BYTE(va + i0, log_byte, IN.ub[ELEM_SIZE * IDX + i0], \
ELEM_SIZE * IDX + i0); \
} \
} while (0)
#define fVLOG_VTCM_HALFWORD(EA, OFFSET, IN, IDX, LEN) \
do { \
SCATTER_FUNCTION(EA, OFFSET, IDX, LEN, 2, IDX, 1, IN); \
} while (0)
#define fVLOG_VTCM_WORD(EA, OFFSET, IN, IDX, LEN) \
do { \
SCATTER_FUNCTION(EA, OFFSET, IDX, LEN, 4, IDX, 1, IN); \
} while (0)
#define fVLOG_VTCM_HALFWORDQ(EA, OFFSET, IN, IDX, Q, LEN) \
do { \
SCATTER_FUNCTION(EA, OFFSET, IDX, LEN, 2, IDX, \
fGETQBIT(QsV, 2 * IDX + i0), IN); \
} while (0)
#define fVLOG_VTCM_WORDQ(EA, OFFSET, IN, IDX, Q, LEN) \
do { \
SCATTER_FUNCTION(EA, OFFSET, IDX, LEN, 4, IDX, \
fGETQBIT(QsV, 4 * IDX + i0), IN); \
} while (0)
#define fVLOG_VTCM_HALFWORD_DV(EA, OFFSET, IN, IDX, IDX2, IDX_H, LEN) \
do { \
SCATTER_FUNCTION(EA, OFFSET, IDX, LEN, 2, \
(2 * IDX2 + IDX_H), 1, IN); \
} while (0)
#define fVLOG_VTCM_HALFWORDQ_DV(EA, OFFSET, IN, IDX, Q, IDX2, IDX_H, LEN) \
do { \
SCATTER_FUNCTION(EA, OFFSET, IDX, LEN, 2, (2 * IDX2 + IDX_H), \
fGETQBIT(QsV, 2 * IDX + i0), IN); \
} while (0)
#define fSTORERELEASE(EA, TYPE) \
do { \
fV_AL_CHECK(EA, fVECSIZE() - 1); \
} while (0)
#ifdef QEMU_GENERATE
#define fLOADMMV(EA, DST) gen_vreg_load(ctx, DST##_off, EA, true)
#endif
#ifdef QEMU_GENERATE
#define fLOADMMVU(EA, DST) gen_vreg_load(ctx, DST##_off, EA, false)
#endif
#ifdef QEMU_GENERATE
#define fSTOREMMV(EA, SRC) \
gen_vreg_store(ctx, insn, pkt, EA, SRC##_off, insn->slot, true)
#endif
#ifdef QEMU_GENERATE
#define fSTOREMMVQ(EA, SRC, MASK) \
gen_vreg_masked_store(ctx, EA, SRC##_off, MASK##_off, insn->slot, false)
#endif
#ifdef QEMU_GENERATE
#define fSTOREMMVNQ(EA, SRC, MASK) \
gen_vreg_masked_store(ctx, EA, SRC##_off, MASK##_off, insn->slot, true)
#endif
#ifdef QEMU_GENERATE
#define fSTOREMMVU(EA, SRC) \
gen_vreg_store(ctx, insn, pkt, EA, SRC##_off, insn->slot, false)
#endif
#define fVFOREACH(WIDTH, VAR) for (VAR = 0; VAR < fVELEM(WIDTH); VAR++)
#define fVARRAY_ELEMENT_ACCESS(ARRAY, TYPE, INDEX) \
ARRAY.v[(INDEX) / (fVECSIZE() / (sizeof(ARRAY.TYPE[0])))].TYPE[(INDEX) % \
(fVECSIZE() / (sizeof(ARRAY.TYPE[0])))]
#define fVSATDW(U, V) fVSATW(((((long long)U) << 32) | fZXTN(32, 64, V)))
#define fVASL_SATHI(U, V) fVSATW(((U) << 1) | ((V) >> 31))
#define fVUADDSAT(WIDTH, U, V) \
fVSATUN(WIDTH, fZXTN(WIDTH, 2 * WIDTH, U) + fZXTN(WIDTH, 2 * WIDTH, V))
#define fVSADDSAT(WIDTH, U, V) \
fVSATN(WIDTH, fSXTN(WIDTH, 2 * WIDTH, U) + fSXTN(WIDTH, 2 * WIDTH, V))
#define fVUSUBSAT(WIDTH, U, V) \
fVSATUN(WIDTH, fZXTN(WIDTH, 2 * WIDTH, U) - fZXTN(WIDTH, 2 * WIDTH, V))
#define fVSSUBSAT(WIDTH, U, V) \
fVSATN(WIDTH, fSXTN(WIDTH, 2 * WIDTH, U) - fSXTN(WIDTH, 2 * WIDTH, V))
#define fVAVGU(WIDTH, U, V) \
((fZXTN(WIDTH, 2 * WIDTH, U) + fZXTN(WIDTH, 2 * WIDTH, V)) >> 1)
#define fVAVGURND(WIDTH, U, V) \
((fZXTN(WIDTH, 2 * WIDTH, U) + fZXTN(WIDTH, 2 * WIDTH, V) + 1) >> 1)
#define fVNAVGU(WIDTH, U, V) \
((fZXTN(WIDTH, 2 * WIDTH, U) - fZXTN(WIDTH, 2 * WIDTH, V)) >> 1)
#define fVNAVGURNDSAT(WIDTH, U, V) \
fVSATUN(WIDTH, ((fZXTN(WIDTH, 2 * WIDTH, U) - \
fZXTN(WIDTH, 2 * WIDTH, V) + 1) >> 1))
#define fVAVGS(WIDTH, U, V) \
((fSXTN(WIDTH, 2 * WIDTH, U) + fSXTN(WIDTH, 2 * WIDTH, V)) >> 1)
#define fVAVGSRND(WIDTH, U, V) \
((fSXTN(WIDTH, 2 * WIDTH, U) + fSXTN(WIDTH, 2 * WIDTH, V) + 1) >> 1)
#define fVNAVGS(WIDTH, U, V) \
((fSXTN(WIDTH, 2 * WIDTH, U) - fSXTN(WIDTH, 2 * WIDTH, V)) >> 1)
#define fVNAVGSRND(WIDTH, U, V) \
((fSXTN(WIDTH, 2 * WIDTH, U) - fSXTN(WIDTH, 2 * WIDTH, V) + 1) >> 1)
#define fVNAVGSRNDSAT(WIDTH, U, V) \
fVSATN(WIDTH, ((fSXTN(WIDTH, 2 * WIDTH, U) - \
fSXTN(WIDTH, 2 * WIDTH, V) + 1) >> 1))
#define fVNOROUND(VAL, SHAMT) VAL
#define fVNOSAT(VAL) VAL
#define fVROUND(VAL, SHAMT) \
((VAL) + (((SHAMT) > 0) ? (1LL << ((SHAMT) - 1)) : 0))
#define fCARRY_FROM_ADD32(A, B, C) \
(((fZXTN(32, 64, A) + fZXTN(32, 64, B) + C) >> 32) & 1)
#define fUARCH_NOTE_PUMP_4X()
#define fUARCH_NOTE_PUMP_2X()
#define IV1DEAD()
#endif
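The fVAVGS/fVAVGSRND family above computes element averages by widening to twice the element width, adding, optionally adding a rounding bit, then shifting right. A small standalone sketch for 16-bit halfwords (function names are illustrative):

```c
#include <assert.h>
#include <stdint.h>

/* Truncating signed average, as fVAVGS(16, u, v): widen, add, shift */
static int16_t vavgs_h(int16_t u, int16_t v)
{
    return (int16_t)(((int32_t)u + (int32_t)v) >> 1);
}

/* Rounding signed average, as fVAVGSRND(16, u, v): the +1 rounds halves up */
static int16_t vavgsrnd_h(int16_t u, int16_t v)
{
    return (int16_t)(((int32_t)u + (int32_t)v + 1) >> 1);
}
```

Widening first is what makes the intermediate sum safe from overflow; the saturating variants (fVNAVGSRNDSAT etc.) then feed the shifted result through fVSATN.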


@@ -0,0 +1,82 @@
/*
* Copyright(c) 2019-2021 Qualcomm Innovation Center, Inc. All Rights Reserved.
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, see <http://www.gnu.org/licenses/>.
*/
#ifndef HEXAGON_MMVEC_H
#define HEXAGON_MMVEC_H
#define MAX_VEC_SIZE_LOGBYTES 7
#define MAX_VEC_SIZE_BYTES (1 << MAX_VEC_SIZE_LOGBYTES)
#define NUM_VREGS 32
#define NUM_QREGS 4
typedef uint32_t VRegMask; /* at least NUM_VREGS bits */
typedef uint32_t QRegMask; /* at least NUM_QREGS bits */
#define VECTOR_SIZE_BYTE (fVECSIZE())
typedef union {
uint64_t ud[MAX_VEC_SIZE_BYTES / 8];
int64_t d[MAX_VEC_SIZE_BYTES / 8];
uint32_t uw[MAX_VEC_SIZE_BYTES / 4];
int32_t w[MAX_VEC_SIZE_BYTES / 4];
uint16_t uh[MAX_VEC_SIZE_BYTES / 2];
int16_t h[MAX_VEC_SIZE_BYTES / 2];
uint8_t ub[MAX_VEC_SIZE_BYTES / 1];
int8_t b[MAX_VEC_SIZE_BYTES / 1];
} MMVector;
typedef union {
uint64_t ud[2 * MAX_VEC_SIZE_BYTES / 8];
int64_t d[2 * MAX_VEC_SIZE_BYTES / 8];
uint32_t uw[2 * MAX_VEC_SIZE_BYTES / 4];
int32_t w[2 * MAX_VEC_SIZE_BYTES / 4];
uint16_t uh[2 * MAX_VEC_SIZE_BYTES / 2];
int16_t h[2 * MAX_VEC_SIZE_BYTES / 2];
uint8_t ub[2 * MAX_VEC_SIZE_BYTES / 1];
int8_t b[2 * MAX_VEC_SIZE_BYTES / 1];
MMVector v[2];
} MMVectorPair;
typedef union {
uint64_t ud[MAX_VEC_SIZE_BYTES / 8 / 8];
int64_t d[MAX_VEC_SIZE_BYTES / 8 / 8];
uint32_t uw[MAX_VEC_SIZE_BYTES / 4 / 8];
int32_t w[MAX_VEC_SIZE_BYTES / 4 / 8];
uint16_t uh[MAX_VEC_SIZE_BYTES / 2 / 8];
int16_t h[MAX_VEC_SIZE_BYTES / 2 / 8];
uint8_t ub[MAX_VEC_SIZE_BYTES / 1 / 8];
int8_t b[MAX_VEC_SIZE_BYTES / 1 / 8];
} MMQReg;
typedef struct {
MMVector data;
DECLARE_BITMAP(mask, MAX_VEC_SIZE_BYTES);
target_ulong va[MAX_VEC_SIZE_BYTES];
bool op;
int op_size;
} VTCMStoreLog;
/* Types of vector register assignment */
typedef enum {
EXT_DFL, /* Default */
EXT_NEW, /* New - value used in the same packet */
EXT_TMP /* Temp - value used but not stored to register */
} VRegWriteType;
#endif
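The unions above all overlay the same storage at different element widths, and MMQReg holds one predicate bit per vector byte, so it is an eighth the size of MMVector. A compile-checkable sketch of those size relationships (types reproduced from the header, trimmed to the fields needed):

```c
#include <assert.h>
#include <stdint.h>

#define MAX_VEC_SIZE_LOGBYTES 7
#define MAX_VEC_SIZE_BYTES (1 << MAX_VEC_SIZE_LOGBYTES)

typedef union {
    uint64_t ud[MAX_VEC_SIZE_BYTES / 8];
    uint8_t ub[MAX_VEC_SIZE_BYTES];
} MMVector;                     /* one 128-byte HVX vector */

typedef union {
    uint64_t ud[2 * MAX_VEC_SIZE_BYTES / 8];
    MMVector v[2];
} MMVectorPair;                 /* register pair, e.g. Vdd */

typedef union {
    uint64_t ud[MAX_VEC_SIZE_BYTES / 8 / 8];
    uint8_t ub[MAX_VEC_SIZE_BYTES / 8];
} MMQReg;                       /* 1 predicate bit per vector byte */
```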


@@ -0,0 +1,47 @@
/*
* Copyright(c) 2019-2021 Qualcomm Innovation Center, Inc. All Rights Reserved.
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, see <http://www.gnu.org/licenses/>.
*/
#include "qemu/osdep.h"
#include "cpu.h"
#include "mmvec/system_ext_mmvec.h"
void mem_gather_store(CPUHexagonState *env, target_ulong vaddr, int slot)
{
size_t size = sizeof(MMVector);
env->vstore_pending[slot] = 1;
env->vstore[slot].va = vaddr;
env->vstore[slot].size = size;
memcpy(&env->vstore[slot].data.ub[0], &env->tmp_VRegs[0], size);
/* On a gather store, overwrite the store mask to emulate dropped gathers */
bitmap_copy(env->vstore[slot].mask, env->vtcm_log.mask, size);
}
void mem_vector_scatter_init(CPUHexagonState *env)
{
bitmap_zero(env->vtcm_log.mask, MAX_VEC_SIZE_BYTES);
env->vtcm_pending = true;
env->vtcm_log.op = false;
env->vtcm_log.op_size = 0;
}
void mem_vector_gather_init(CPUHexagonState *env)
{
bitmap_zero(env->vtcm_log.mask, MAX_VEC_SIZE_BYTES);
}


@@ -0,0 +1,25 @@
/*
* Copyright(c) 2019-2021 Qualcomm Innovation Center, Inc. All Rights Reserved.
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, see <http://www.gnu.org/licenses/>.
*/
#ifndef HEXAGON_SYSTEM_EXT_MMVEC_H
#define HEXAGON_SYSTEM_EXT_MMVEC_H
void mem_gather_store(CPUHexagonState *env, target_ulong vaddr, int slot);
void mem_vector_scatter_init(CPUHexagonState *env);
void mem_vector_gather_init(CPUHexagonState *env);
#endif
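The scatter-accumulate path set up here is committed later by SCATTER_OP_WRITE_TO_MEM, which does a byte-wise read-modify-write: assemble the current memory value little-endian, add the logged increment, write the bytes back. A hypothetical halfword-sized sketch, with a plain array standing in for guest memory:

```c
#include <assert.h>
#include <stdint.h>

/* Model of the "+=" commit for one 16-bit scatter element */
static void scatter_add_u16(uint8_t *mem, uint32_t va, uint16_t inc)
{
    uint16_t dst = 0;
    for (int j = 0; j < 2; j++) {
        dst |= (uint16_t)mem[va + j] << (8 * j);  /* little-endian load */
    }
    dst += inc;                                   /* accumulate */
    for (int j = 0; j < 2; j++) {
        mem[va + j] = (dst >> (8 * j)) & 0xFF;    /* byte-wise store back */
    }
}
```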


@@ -27,6 +27,8 @@
#include "arch.h"
#include "hex_arch_types.h"
#include "fma_emu.h"
#include "mmvec/mmvec.h"
#include "mmvec/macros.h"
#define SF_BIAS 127
#define SF_MANTBITS 23
@@ -164,6 +166,57 @@ void HELPER(commit_store)(CPUHexagonState *env, int slot_num)
    }
}
void HELPER(gather_store)(CPUHexagonState *env, uint32_t addr, int slot)
{
mem_gather_store(env, addr, slot);
}
void HELPER(commit_hvx_stores)(CPUHexagonState *env)
{
uintptr_t ra = GETPC();
int i;
/* Normal (possibly masked) vector store */
for (i = 0; i < VSTORES_MAX; i++) {
if (env->vstore_pending[i]) {
env->vstore_pending[i] = 0;
target_ulong va = env->vstore[i].va;
int size = env->vstore[i].size;
for (int j = 0; j < size; j++) {
if (test_bit(j, env->vstore[i].mask)) {
cpu_stb_data_ra(env, va + j, env->vstore[i].data.ub[j], ra);
}
}
}
}
/* Scatter store */
if (env->vtcm_pending) {
env->vtcm_pending = false;
if (env->vtcm_log.op) {
/* Need to perform the scatter read/modify/write at commit time */
if (env->vtcm_log.op_size == 2) {
SCATTER_OP_WRITE_TO_MEM(uint16_t);
} else if (env->vtcm_log.op_size == 4) {
/* Word Scatter += */
SCATTER_OP_WRITE_TO_MEM(uint32_t);
} else {
g_assert_not_reached();
}
} else {
for (i = 0; i < sizeof(MMVector); i++) {
if (test_bit(i, env->vtcm_log.mask)) {
cpu_stb_data_ra(env, env->vtcm_log.va[i],
env->vtcm_log.data.ub[i], ra);
clear_bit(i, env->vtcm_log.mask);
env->vtcm_log.data.ub[i] = 0;
}
}
}
}
}
static void print_store(CPUHexagonState *env, int slot)
{
    if (!(env->slot_cancelled & (1 << slot))) {
@@ -242,9 +295,10 @@ void HELPER(debug_commit_end)(CPUHexagonState *env, int has_st0, int has_st1)
    HEX_DEBUG_LOG("Next PC = " TARGET_FMT_lx "\n", env->next_PC);
    HEX_DEBUG_LOG("Exec counters: pkt = " TARGET_FMT_lx
                  ", insn = " TARGET_FMT_lx
-                  "\n",
+                  ", hvx = " TARGET_FMT_lx "\n",
                  env->gpr[HEX_REG_QEMU_PKT_CNT],
-                  env->gpr[HEX_REG_QEMU_INSN_CNT]);
+                  env->gpr[HEX_REG_QEMU_INSN_CNT],
                  env->gpr[HEX_REG_QEMU_HVX_CNT]);
}
@@ -393,6 +447,65 @@ void HELPER(probe_pkt_scalar_store_s0)(CPUHexagonState *env, int mmu_idx)
    probe_store(env, 0, mmu_idx);
}
void HELPER(probe_hvx_stores)(CPUHexagonState *env, int mmu_idx)
{
uintptr_t retaddr = GETPC();
int i;
/* Normal (possibly masked) vector store */
for (i = 0; i < VSTORES_MAX; i++) {
if (env->vstore_pending[i]) {
target_ulong va = env->vstore[i].va;
int size = env->vstore[i].size;
for (int j = 0; j < size; j++) {
if (test_bit(j, env->vstore[i].mask)) {
probe_write(env, va + j, 1, mmu_idx, retaddr);
}
}
}
}
/* Scatter store */
if (env->vtcm_pending) {
if (env->vtcm_log.op) {
/* Need to perform the scatter read/modify/write at commit time */
if (env->vtcm_log.op_size == 2) {
SCATTER_OP_PROBE_MEM(size2u_t, mmu_idx, retaddr);
} else if (env->vtcm_log.op_size == 4) {
/* Word Scatter += */
SCATTER_OP_PROBE_MEM(size4u_t, mmu_idx, retaddr);
} else {
g_assert_not_reached();
}
} else {
for (int i = 0; i < sizeof(MMVector); i++) {
if (test_bit(i, env->vtcm_log.mask)) {
probe_write(env, env->vtcm_log.va[i], 1, mmu_idx, retaddr);
}
}
}
}
}
void HELPER(probe_pkt_scalar_hvx_stores)(CPUHexagonState *env, int mask,
int mmu_idx)
{
bool has_st0 = (mask >> 0) & 1;
bool has_st1 = (mask >> 1) & 1;
bool has_hvx_stores = (mask >> 2) & 1;
if (has_st0) {
probe_store(env, 0, mmu_idx);
}
if (has_st1) {
probe_store(env, 1, mmu_idx);
}
if (has_hvx_stores) {
HELPER(probe_hvx_stores)(env, mmu_idx);
}
}
/*
* mem_noshuf
* Section 5.5 of the Hexagon V67 Programmer's Reference Manual
@@ -1181,6 +1294,171 @@ float64 HELPER(dfmpyhh)(CPUHexagonState *env, float64 RxxV,
return RxxV;
}
/* Histogram instructions */
void HELPER(vhist)(CPUHexagonState *env)
{
MMVector *input = &env->tmp_VRegs[0];
for (int lane = 0; lane < 8; lane++) {
for (int i = 0; i < sizeof(MMVector) / 8; ++i) {
unsigned char value = input->ub[(sizeof(MMVector) / 8) * lane + i];
unsigned char regno = value >> 3;
unsigned char element = value & 7;
env->VRegs[regno].uh[(sizeof(MMVector) / 16) * lane + element]++;
}
}
}
void HELPER(vhistq)(CPUHexagonState *env)
{
MMVector *input = &env->tmp_VRegs[0];
for (int lane = 0; lane < 8; lane++) {
for (int i = 0; i < sizeof(MMVector) / 8; ++i) {
unsigned char value = input->ub[(sizeof(MMVector) / 8) * lane + i];
unsigned char regno = value >> 3;
unsigned char element = value & 7;
if (fGETQBIT(env->qtmp, sizeof(MMVector) / 8 * lane + i)) {
env->VRegs[regno].uh[
(sizeof(MMVector) / 16) * lane + element]++;
}
}
}
}
void HELPER(vwhist256)(CPUHexagonState *env)
{
MMVector *input = &env->tmp_VRegs[0];
for (int i = 0; i < (sizeof(MMVector) / 2); i++) {
unsigned int bucket = fGETUBYTE(0, input->h[i]);
unsigned int weight = fGETUBYTE(1, input->h[i]);
unsigned int vindex = (bucket >> 3) & 0x1F;
unsigned int elindex = ((i >> 0) & (~7)) | ((bucket >> 0) & 7);
env->VRegs[vindex].uh[elindex] =
env->VRegs[vindex].uh[elindex] + weight;
}
}
void HELPER(vwhist256q)(CPUHexagonState *env)
{
MMVector *input = &env->tmp_VRegs[0];
for (int i = 0; i < (sizeof(MMVector) / 2); i++) {
unsigned int bucket = fGETUBYTE(0, input->h[i]);
unsigned int weight = fGETUBYTE(1, input->h[i]);
unsigned int vindex = (bucket >> 3) & 0x1F;
unsigned int elindex = ((i >> 0) & (~7)) | ((bucket >> 0) & 7);
if (fGETQBIT(env->qtmp, 2 * i)) {
env->VRegs[vindex].uh[elindex] =
env->VRegs[vindex].uh[elindex] + weight;
}
}
}
void HELPER(vwhist256_sat)(CPUHexagonState *env)
{
MMVector *input = &env->tmp_VRegs[0];
for (int i = 0; i < (sizeof(MMVector) / 2); i++) {
unsigned int bucket = fGETUBYTE(0, input->h[i]);
unsigned int weight = fGETUBYTE(1, input->h[i]);
unsigned int vindex = (bucket >> 3) & 0x1F;
unsigned int elindex = ((i >> 0) & (~7)) | ((bucket >> 0) & 7);
env->VRegs[vindex].uh[elindex] =
fVSATUH(env->VRegs[vindex].uh[elindex] + weight);
}
}
void HELPER(vwhist256q_sat)(CPUHexagonState *env)
{
MMVector *input = &env->tmp_VRegs[0];
for (int i = 0; i < (sizeof(MMVector) / 2); i++) {
unsigned int bucket = fGETUBYTE(0, input->h[i]);
unsigned int weight = fGETUBYTE(1, input->h[i]);
unsigned int vindex = (bucket >> 3) & 0x1F;
unsigned int elindex = ((i >> 0) & (~7)) | ((bucket >> 0) & 7);
if (fGETQBIT(env->qtmp, 2 * i)) {
env->VRegs[vindex].uh[elindex] =
fVSATUH(env->VRegs[vindex].uh[elindex] + weight);
}
}
}
void HELPER(vwhist128)(CPUHexagonState *env)
{
MMVector *input = &env->tmp_VRegs[0];
for (int i = 0; i < (sizeof(MMVector) / 2); i++) {
unsigned int bucket = fGETUBYTE(0, input->h[i]);
unsigned int weight = fGETUBYTE(1, input->h[i]);
unsigned int vindex = (bucket >> 3) & 0x1F;
unsigned int elindex = ((i >> 1) & (~3)) | ((bucket >> 1) & 3);
env->VRegs[vindex].uw[elindex] =
env->VRegs[vindex].uw[elindex] + weight;
}
}
void HELPER(vwhist128q)(CPUHexagonState *env)
{
MMVector *input = &env->tmp_VRegs[0];
for (int i = 0; i < (sizeof(MMVector) / 2); i++) {
unsigned int bucket = fGETUBYTE(0, input->h[i]);
unsigned int weight = fGETUBYTE(1, input->h[i]);
unsigned int vindex = (bucket >> 3) & 0x1F;
unsigned int elindex = ((i >> 1) & (~3)) | ((bucket >> 1) & 3);
if (fGETQBIT(env->qtmp, 2 * i)) {
env->VRegs[vindex].uw[elindex] =
env->VRegs[vindex].uw[elindex] + weight;
}
}
}
void HELPER(vwhist128m)(CPUHexagonState *env, int32_t uiV)
{
MMVector *input = &env->tmp_VRegs[0];
for (int i = 0; i < (sizeof(MMVector) / 2); i++) {
unsigned int bucket = fGETUBYTE(0, input->h[i]);
unsigned int weight = fGETUBYTE(1, input->h[i]);
unsigned int vindex = (bucket >> 3) & 0x1F;
unsigned int elindex = ((i >> 1) & (~3)) | ((bucket >> 1) & 3);
if ((bucket & 1) == uiV) {
env->VRegs[vindex].uw[elindex] =
env->VRegs[vindex].uw[elindex] + weight;
}
}
}
void HELPER(vwhist128qm)(CPUHexagonState *env, int32_t uiV)
{
MMVector *input = &env->tmp_VRegs[0];
for (int i = 0; i < (sizeof(MMVector) / 2); i++) {
unsigned int bucket = fGETUBYTE(0, input->h[i]);
unsigned int weight = fGETUBYTE(1, input->h[i]);
unsigned int vindex = (bucket >> 3) & 0x1F;
unsigned int elindex = ((i >> 1) & (~3)) | ((bucket >> 1) & 3);
if (((bucket & 1) == uiV) && fGETQBIT(env->qtmp, 2 * i)) {
env->VRegs[vindex].uw[elindex] =
env->VRegs[vindex].uw[elindex] + weight;
}
}
}
static void cancel_slot(CPUHexagonState *env, uint32_t slot)
{
HEX_DEBUG_LOG("Slot %d cancelled\n", slot);


@@ -19,6 +19,7 @@
#include "qemu/osdep.h"
#include "cpu.h"
#include "tcg/tcg-op.h"
#include "tcg/tcg-op-gvec.h"
#include "exec/cpu_ldst.h"
#include "exec/log.h"
#include "internal.h"
@@ -47,11 +48,60 @@ TCGv hex_dczero_addr;
TCGv hex_llsc_addr;
TCGv hex_llsc_val;
TCGv_i64 hex_llsc_val_i64;
TCGv hex_VRegs_updated;
TCGv hex_QRegs_updated;
TCGv hex_vstore_addr[VSTORES_MAX];
TCGv hex_vstore_size[VSTORES_MAX];
TCGv hex_vstore_pending[VSTORES_MAX];
static const char * const hexagon_prednames[] = {
"p0", "p1", "p2", "p3"
};
intptr_t ctx_future_vreg_off(DisasContext *ctx, int regnum,
int num, bool alloc_ok)
{
intptr_t offset;
/* See if it is already allocated */
for (int i = 0; i < ctx->future_vregs_idx; i++) {
if (ctx->future_vregs_num[i] == regnum) {
return offsetof(CPUHexagonState, future_VRegs[i]);
}
}
g_assert(alloc_ok);
offset = offsetof(CPUHexagonState, future_VRegs[ctx->future_vregs_idx]);
for (int i = 0; i < num; i++) {
ctx->future_vregs_num[ctx->future_vregs_idx + i] = regnum++;
}
ctx->future_vregs_idx += num;
g_assert(ctx->future_vregs_idx <= VECTOR_TEMPS_MAX);
return offset;
}
intptr_t ctx_tmp_vreg_off(DisasContext *ctx, int regnum,
int num, bool alloc_ok)
{
intptr_t offset;
/* See if it is already allocated */
for (int i = 0; i < ctx->tmp_vregs_idx; i++) {
if (ctx->tmp_vregs_num[i] == regnum) {
return offsetof(CPUHexagonState, tmp_VRegs[i]);
}
}
g_assert(alloc_ok);
offset = offsetof(CPUHexagonState, tmp_VRegs[ctx->tmp_vregs_idx]);
for (int i = 0; i < num; i++) {
ctx->tmp_vregs_num[ctx->tmp_vregs_idx + i] = regnum++;
}
ctx->tmp_vregs_idx += num;
g_assert(ctx->tmp_vregs_idx <= VECTOR_TEMPS_MAX);
return offset;
}
static void gen_exception_raw(int excp)
{
gen_helper_raise_exception(cpu_env, tcg_constant_i32(excp));
@@ -63,6 +113,8 @@ static void gen_exec_counters(DisasContext *ctx)
hex_gpr[HEX_REG_QEMU_PKT_CNT], ctx->num_packets);
tcg_gen_addi_tl(hex_gpr[HEX_REG_QEMU_INSN_CNT],
hex_gpr[HEX_REG_QEMU_INSN_CNT], ctx->num_insns);
tcg_gen_addi_tl(hex_gpr[HEX_REG_QEMU_HVX_CNT],
hex_gpr[HEX_REG_QEMU_HVX_CNT], ctx->num_hvx_insns);
}
static void gen_end_tb(DisasContext *ctx)
@@ -167,11 +219,19 @@ static void gen_start_packet(DisasContext *ctx, Packet *pkt)
bitmap_zero(ctx->regs_written, TOTAL_PER_THREAD_REGS);
ctx->preg_log_idx = 0;
bitmap_zero(ctx->pregs_written, NUM_PREGS);
ctx->future_vregs_idx = 0;
ctx->tmp_vregs_idx = 0;
ctx->vreg_log_idx = 0;
bitmap_zero(ctx->vregs_updated_tmp, NUM_VREGS);
bitmap_zero(ctx->vregs_updated, NUM_VREGS);
bitmap_zero(ctx->vregs_select, NUM_VREGS);
ctx->qreg_log_idx = 0;
for (i = 0; i < STORES_MAX; i++) {
ctx->store_width[i] = 0;
}
tcg_gen_movi_tl(hex_pkt_has_store_s1, pkt->pkt_has_store_s1);
ctx->s1_store_processed = false;
ctx->pre_commit = true;
if (HEX_DEBUG) {
/* Handy place to set a breakpoint before the packet executes */
@@ -193,6 +253,26 @@ static void gen_start_packet(DisasContext *ctx, Packet *pkt)
if (need_pred_written(pkt)) {
tcg_gen_movi_tl(hex_pred_written, 0);
}
if (pkt->pkt_has_hvx) {
tcg_gen_movi_tl(hex_VRegs_updated, 0);
tcg_gen_movi_tl(hex_QRegs_updated, 0);
}
}
bool is_gather_store_insn(Insn *insn, Packet *pkt)
{
if (GET_ATTRIB(insn->opcode, A_CVI_NEW) &&
insn->new_value_producer_slot == 1) {
/* Look for gather instruction */
for (int i = 0; i < pkt->num_insns; i++) {
Insn *in = &pkt->insn[i];
if (GET_ATTRIB(in->opcode, A_CVI_GATHER) && in->slot == 1) {
return true;
}
}
}
return false;
}
/*
@@ -448,10 +528,98 @@ static void process_dczeroa(DisasContext *ctx, Packet *pkt)
}
}
static bool pkt_has_hvx_store(Packet *pkt)
{
int i;
for (i = 0; i < pkt->num_insns; i++) {
int opcode = pkt->insn[i].opcode;
if (GET_ATTRIB(opcode, A_CVI) && GET_ATTRIB(opcode, A_STORE)) {
return true;
}
}
return false;
}
static void gen_commit_hvx(DisasContext *ctx, Packet *pkt)
{
int i;
/*
* for (i = 0; i < ctx->vreg_log_idx; i++) {
* int rnum = ctx->vreg_log[i];
* if (ctx->vreg_is_predicated[i]) {
* if (env->VRegs_updated & (1 << rnum)) {
* env->VRegs[rnum] = env->future_VRegs[rnum];
* }
* } else {
* env->VRegs[rnum] = env->future_VRegs[rnum];
* }
* }
*/
for (i = 0; i < ctx->vreg_log_idx; i++) {
int rnum = ctx->vreg_log[i];
bool is_predicated = ctx->vreg_is_predicated[i];
intptr_t dstoff = offsetof(CPUHexagonState, VRegs[rnum]);
intptr_t srcoff = ctx_future_vreg_off(ctx, rnum, 1, false);
size_t size = sizeof(MMVector);
if (is_predicated) {
TCGv cmp = tcg_temp_new();
TCGLabel *label_skip = gen_new_label();
tcg_gen_andi_tl(cmp, hex_VRegs_updated, 1 << rnum);
tcg_gen_brcondi_tl(TCG_COND_EQ, cmp, 0, label_skip);
tcg_temp_free(cmp);
tcg_gen_gvec_mov(MO_64, dstoff, srcoff, size, size);
gen_set_label(label_skip);
} else {
tcg_gen_gvec_mov(MO_64, dstoff, srcoff, size, size);
}
}
/*
* for (i = 0; i < ctx->qreg_log_idx; i++) {
* int rnum = ctx->qreg_log[i];
* if (ctx->qreg_is_predicated[i]) {
* if (env->QRegs_updated) & (1 << rnum)) {
* env->QRegs[rnum] = env->future_QRegs[rnum];
* }
* } else {
* env->QRegs[rnum] = env->future_QRegs[rnum];
* }
* }
*/
for (i = 0; i < ctx->qreg_log_idx; i++) {
int rnum = ctx->qreg_log[i];
bool is_predicated = ctx->qreg_is_predicated[i];
intptr_t dstoff = offsetof(CPUHexagonState, QRegs[rnum]);
intptr_t srcoff = offsetof(CPUHexagonState, future_QRegs[rnum]);
size_t size = sizeof(MMQReg);
if (is_predicated) {
TCGv cmp = tcg_temp_new();
TCGLabel *label_skip = gen_new_label();
tcg_gen_andi_tl(cmp, hex_QRegs_updated, 1 << rnum);
tcg_gen_brcondi_tl(TCG_COND_EQ, cmp, 0, label_skip);
tcg_temp_free(cmp);
tcg_gen_gvec_mov(MO_64, dstoff, srcoff, size, size);
gen_set_label(label_skip);
} else {
tcg_gen_gvec_mov(MO_64, dstoff, srcoff, size, size);
}
}
if (pkt_has_hvx_store(pkt)) {
gen_helper_commit_hvx_stores(cpu_env);
}
}
static void update_exec_counters(DisasContext *ctx, Packet *pkt)
{
int num_insns = pkt->num_insns;
int num_real_insns = 0;
int num_hvx_insns = 0;
for (int i = 0; i < num_insns; i++) {
if (!pkt->insn[i].is_endloop &&
@@ -459,13 +627,18 @@ static void update_exec_counters(DisasContext *ctx, Packet *pkt)
!GET_ATTRIB(pkt->insn[i].opcode, A_IT_NOP)) {
num_real_insns++;
}
if (GET_ATTRIB(pkt->insn[i].opcode, A_CVI)) {
num_hvx_insns++;
}
}
ctx->num_packets++;
ctx->num_insns += num_real_insns;
ctx->num_hvx_insns += num_hvx_insns;
}
static void gen_commit_packet(CPUHexagonState *env, DisasContext *ctx,
Packet *pkt)
{
/*
* If there is more than one store in a packet, make sure they are all OK
@@ -474,6 +647,10 @@ static void gen_commit_packet(DisasContext *ctx, Packet *pkt)
* dczeroa has to be the only store operation in the packet, so we go
* ahead and process that first.
*
* When there is an HVX store, there can also be a scalar store in either
* slot 0 or slot1, so we create a mask for the helper to indicate what
* work to do.
*
* When there are two scalar stores, we probe the one in slot 0.
*
* Note that we don't call the probe helper for packets with only one
@@ -482,13 +659,35 @@ static void gen_commit_packet(DisasContext *ctx, Packet *pkt)
*/
bool has_store_s0 = pkt->pkt_has_store_s0;
bool has_store_s1 = (pkt->pkt_has_store_s1 && !ctx->s1_store_processed);
bool has_hvx_store = pkt_has_hvx_store(pkt);
if (pkt->pkt_has_dczeroa) {
/*
* The dczeroa will be the store in slot 0, check that we don't have
* a store in slot 1 or an HVX store.
*/
g_assert(has_store_s0 && !has_store_s1 && !has_hvx_store);
process_dczeroa(ctx, pkt);
} else if (has_hvx_store) {
TCGv mem_idx = tcg_constant_tl(ctx->mem_idx);
if (!has_store_s0 && !has_store_s1) {
gen_helper_probe_hvx_stores(cpu_env, mem_idx);
} else {
int mask = 0;
TCGv mask_tcgv;
if (has_store_s0) {
mask |= (1 << 0);
}
if (has_store_s1) {
mask |= (1 << 1);
}
if (has_hvx_store) {
mask |= (1 << 2);
}
mask_tcgv = tcg_constant_tl(mask);
gen_helper_probe_pkt_scalar_hvx_stores(cpu_env, mask_tcgv, mem_idx);
}
} else if (has_store_s0 && has_store_s1) {
/*
* process_store_log will execute the slot 1 store first,
@@ -502,6 +701,9 @@ static void gen_commit_packet(DisasContext *ctx, Packet *pkt)
gen_reg_writes(ctx);
gen_pred_writes(ctx, pkt);
if (pkt->pkt_has_hvx) {
gen_commit_hvx(ctx, pkt);
}
update_exec_counters(ctx, pkt);
if (HEX_DEBUG) {
TCGv has_st0 =
@@ -513,6 +715,11 @@ static void gen_commit_packet(DisasContext *ctx, Packet *pkt)
gen_helper_debug_commit_end(cpu_env, has_st0, has_st1);
}
if (pkt->vhist_insn != NULL) {
ctx->pre_commit = false;
pkt->vhist_insn->generate(env, ctx, pkt->vhist_insn, pkt);
}
if (pkt->pkt_has_cof) {
gen_end_tb(ctx);
}
@@ -537,7 +744,7 @@ static void decode_and_translate_packet(CPUHexagonState *env, DisasContext *ctx)
for (i = 0; i < pkt.num_insns; i++) {
gen_insn(env, ctx, &pkt.insn[i], &pkt);
}
gen_commit_packet(env, ctx, &pkt);
ctx->base.pc_next += pkt.encod_pkt_size_in_bytes;
} else {
gen_exception_end_tb(ctx, HEX_EXCP_INVALID_PACKET);
@@ -552,6 +759,7 @@ static void hexagon_tr_init_disas_context(DisasContextBase *dcbase,
ctx->mem_idx = MMU_USER_IDX;
ctx->num_packets = 0;
ctx->num_insns = 0;
ctx->num_hvx_insns = 0;
}
static void hexagon_tr_tb_start(DisasContextBase *db, CPUState *cpu)
@@ -656,6 +864,9 @@ static char store_addr_names[STORES_MAX][NAME_LEN];
static char store_width_names[STORES_MAX][NAME_LEN];
static char store_val32_names[STORES_MAX][NAME_LEN];
static char store_val64_names[STORES_MAX][NAME_LEN];
static char vstore_addr_names[VSTORES_MAX][NAME_LEN];
static char vstore_size_names[VSTORES_MAX][NAME_LEN];
static char vstore_pending_names[VSTORES_MAX][NAME_LEN];
void hexagon_translate_init(void)
{
@@ -718,6 +929,10 @@ void hexagon_translate_init(void)
offsetof(CPUHexagonState, llsc_val), "llsc_val");
hex_llsc_val_i64 = tcg_global_mem_new_i64(cpu_env,
offsetof(CPUHexagonState, llsc_val_i64), "llsc_val_i64");
hex_VRegs_updated = tcg_global_mem_new(cpu_env,
offsetof(CPUHexagonState, VRegs_updated), "VRegs_updated");
hex_QRegs_updated = tcg_global_mem_new(cpu_env,
offsetof(CPUHexagonState, QRegs_updated), "QRegs_updated");
for (i = 0; i < STORES_MAX; i++) {
snprintf(store_addr_names[i], NAME_LEN, "store_addr_%d", i);
hex_store_addr[i] = tcg_global_mem_new(cpu_env,
@@ -739,4 +954,20 @@ void hexagon_translate_init(void)
offsetof(CPUHexagonState, mem_log_stores[i].data64),
store_val64_names[i]);
}
for (int i = 0; i < VSTORES_MAX; i++) {
snprintf(vstore_addr_names[i], NAME_LEN, "vstore_addr_%d", i);
hex_vstore_addr[i] = tcg_global_mem_new(cpu_env,
offsetof(CPUHexagonState, vstore[i].va),
vstore_addr_names[i]);
snprintf(vstore_size_names[i], NAME_LEN, "vstore_size_%d", i);
hex_vstore_size[i] = tcg_global_mem_new(cpu_env,
offsetof(CPUHexagonState, vstore[i].size),
vstore_size_names[i]);
snprintf(vstore_pending_names[i], NAME_LEN, "vstore_pending_%d", i);
hex_vstore_pending[i] = tcg_global_mem_new(cpu_env,
offsetof(CPUHexagonState, vstore_pending[i]),
vstore_pending_names[i]);
}
}


@@ -29,6 +29,7 @@ typedef struct DisasContext {
uint32_t mem_idx;
uint32_t num_packets;
uint32_t num_insns;
uint32_t num_hvx_insns;
int reg_log[REG_WRITES_MAX];
int reg_log_idx;
DECLARE_BITMAP(regs_written, TOTAL_PER_THREAD_REGS);
@@ -37,6 +38,20 @@ typedef struct DisasContext {
DECLARE_BITMAP(pregs_written, NUM_PREGS);
uint8_t store_width[STORES_MAX];
bool s1_store_processed;
int future_vregs_idx;
int future_vregs_num[VECTOR_TEMPS_MAX];
int tmp_vregs_idx;
int tmp_vregs_num[VECTOR_TEMPS_MAX];
int vreg_log[NUM_VREGS];
bool vreg_is_predicated[NUM_VREGS];
int vreg_log_idx;
DECLARE_BITMAP(vregs_updated_tmp, NUM_VREGS);
DECLARE_BITMAP(vregs_updated, NUM_VREGS);
DECLARE_BITMAP(vregs_select, NUM_VREGS);
int qreg_log[NUM_QREGS];
bool qreg_is_predicated[NUM_QREGS];
int qreg_log_idx;
bool pre_commit;
} DisasContext;
static inline void ctx_log_reg_write(DisasContext *ctx, int rnum)
@@ -67,6 +82,46 @@ static inline bool is_preloaded(DisasContext *ctx, int num)
return test_bit(num, ctx->regs_written);
}
intptr_t ctx_future_vreg_off(DisasContext *ctx, int regnum,
int num, bool alloc_ok);
intptr_t ctx_tmp_vreg_off(DisasContext *ctx, int regnum,
int num, bool alloc_ok);
static inline void ctx_log_vreg_write(DisasContext *ctx,
int rnum, VRegWriteType type,
bool is_predicated)
{
if (type != EXT_TMP) {
ctx->vreg_log[ctx->vreg_log_idx] = rnum;
ctx->vreg_is_predicated[ctx->vreg_log_idx] = is_predicated;
ctx->vreg_log_idx++;
set_bit(rnum, ctx->vregs_updated);
}
if (type == EXT_NEW) {
set_bit(rnum, ctx->vregs_select);
}
if (type == EXT_TMP) {
set_bit(rnum, ctx->vregs_updated_tmp);
}
}
static inline void ctx_log_vreg_write_pair(DisasContext *ctx,
int rnum, VRegWriteType type,
bool is_predicated)
{
ctx_log_vreg_write(ctx, rnum ^ 0, type, is_predicated);
ctx_log_vreg_write(ctx, rnum ^ 1, type, is_predicated);
}
static inline void ctx_log_qreg_write(DisasContext *ctx,
int rnum, bool is_predicated)
{
ctx->qreg_log[ctx->qreg_log_idx] = rnum;
ctx->qreg_is_predicated[ctx->qreg_log_idx] = is_predicated;
ctx->qreg_log_idx++;
}
extern TCGv hex_gpr[TOTAL_PER_THREAD_REGS];
extern TCGv hex_pred[NUM_PREGS];
extern TCGv hex_next_PC;
@@ -85,6 +140,12 @@ extern TCGv hex_dczero_addr;
extern TCGv hex_llsc_addr;
extern TCGv hex_llsc_val;
extern TCGv_i64 hex_llsc_val_i64;
extern TCGv hex_VRegs_updated;
extern TCGv hex_QRegs_updated;
extern TCGv hex_vstore_addr[VSTORES_MAX];
extern TCGv hex_vstore_size[VSTORES_MAX];
extern TCGv hex_vstore_pending[VSTORES_MAX];
bool is_gather_store_insn(Insn *insn, Packet *pkt);
void process_store(DisasContext *ctx, Packet *pkt, int slot_num);
#endif


@@ -0,0 +1,88 @@
/*
* Copyright(c) 2021 Qualcomm Innovation Center, Inc. All Rights Reserved.
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, see <http://www.gnu.org/licenses/>.
*/
#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include "hvx_histogram_row.h"
const int vector_len = 128;
const int width = 275;
const int height = 20;
const int stride = (width + vector_len - 1) & -vector_len;
int err;
static uint8_t input[height][stride] __attribute__((aligned(128))) = {
#include "hvx_histogram_input.h"
};
static int result[256] __attribute__((aligned(128)));
static int expect[256] __attribute__((aligned(128)));
static void check(void)
{
for (int i = 0; i < 256; i++) {
int res = result[i];
int exp = expect[i];
if (res != exp) {
printf("ERROR at %3d: 0x%04x != 0x%04x\n",
i, res, exp);
err++;
}
}
}
static void ref_histogram(uint8_t *src, int stride, int width, int height,
int *hist)
{
for (int i = 0; i < 256; i++) {
hist[i] = 0;
}
for (int i = 0; i < height; i++) {
for (int j = 0; j < width; j++) {
hist[src[i * stride + j]]++;
}
}
}
static void hvx_histogram(uint8_t *src, int stride, int width, int height,
int *hist)
{
int n = 8192 / width;
for (int i = 0; i < 256; i++) {
hist[i] = 0;
}
for (int i = 0; i < height; i += n) {
int k = height - i > n ? n : height - i;
hvx_histogram_row(src, stride, width, k, hist);
src += n * stride;
}
}
int main()
{
ref_histogram(&input[0][0], stride, width, height, expect);
hvx_histogram(&input[0][0], stride, width, height, result);
check();
puts(err ? "FAIL" : "PASS");
return err ? 1 : 0;
}


@@ -0,0 +1,717 @@
/*
* Copyright(c) 2021 Qualcomm Innovation Center, Inc. All Rights Reserved.
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, see <http://www.gnu.org/licenses/>.
*/
{ 0x26, 0x32, 0x2e, 0x2e, 0x2d, 0x2c, 0x2d, 0x2d,
0x2c, 0x2e, 0x31, 0x33, 0x36, 0x39, 0x3b, 0x3f,
0x42, 0x46, 0x4a, 0x4c, 0x51, 0x53, 0x53, 0x54,
0x56, 0x57, 0x58, 0x57, 0x56, 0x52, 0x51, 0x4f,
0x4c, 0x49, 0x47, 0x42, 0x3e, 0x3b, 0x38, 0x35,
0x33, 0x30, 0x2e, 0x2c, 0x2b, 0x2a, 0x2a, 0x28,
0x28, 0x27, 0x27, 0x28, 0x29, 0x2a, 0x2c, 0x2e,
0x2f, 0x33, 0x36, 0x38, 0x3c, 0x3d, 0x40, 0x42,
0x43, 0x42, 0x43, 0x44, 0x43, 0x41, 0x40, 0x3b,
0x3b, 0x3a, 0x38, 0x35, 0x32, 0x2f, 0x2c, 0x29,
0x27, 0x26, 0x23, 0x21, 0x1e, 0x1c, 0x1a, 0x19,
0x17, 0x15, 0x15, 0x14, 0x13, 0x12, 0x11, 0x10,
0x0f, 0x0e, 0x0f, 0x0f, 0x0e, 0x0d, 0x0d, 0x0d,
0x0c, 0x0d, 0x0e, 0x0c, 0x0c, 0x0c, 0x0c, 0x0c,
0x0c, 0x0c, 0x0d, 0x0c, 0x0f, 0x0e, 0x0f, 0x0f,
0x0f, 0x10, 0x11, 0x12, 0x14, 0x16, 0x17, 0x19,
0x1c, 0x1d, 0x21, 0x25, 0x27, 0x29, 0x2b, 0x2f,
0x31, 0x33, 0x36, 0x38, 0x39, 0x3a, 0x3b, 0x3c,
0x3c, 0x3d, 0x3e, 0x3e, 0x3c, 0x3b, 0x3a, 0x39,
0x39, 0x3a, 0x3a, 0x3a, 0x3a, 0x3c, 0x3e, 0x43,
0x47, 0x4a, 0x4d, 0x51, 0x51, 0x54, 0x56, 0x56,
0x57, 0x56, 0x53, 0x4f, 0x4b, 0x47, 0x43, 0x41,
0x3e, 0x3c, 0x3a, 0x37, 0x36, 0x33, 0x32, 0x34,
0x34, 0x34, 0x34, 0x35, 0x36, 0x39, 0x3d, 0x3d,
0x3f, 0x40, 0x40, 0x40, 0x40, 0x3e, 0x40, 0x40,
0x42, 0x44, 0x47, 0x48, 0x4b, 0x4e, 0x56, 0x5c,
0x62, 0x68, 0x6f, 0x73, 0x76, 0x79, 0x7a, 0x7c,
0x7e, 0x7c, 0x78, 0x72, 0x6e, 0x69, 0x65, 0x60,
0x5b, 0x56, 0x52, 0x4d, 0x4a, 0x48, 0x47, 0x46,
0x44, 0x43, 0x42, 0x41, 0x41, 0x41, 0x40, 0x40,
0x3f, 0x3e, 0x3d, 0x3c, 0x3b, 0x3b, 0x38, 0x37,
0x36, 0x35, 0x36, 0x35, 0x36, 0x37, 0x38, 0x3c,
0x3d, 0x3f, 0x42, 0x44, 0x46, 0x48, 0x4b, 0x4c,
0x4e, 0x4e, 0x4d, 0x4c, 0x4a, 0x48, 0x49, 0x49,
0x4b, 0x4d, 0x4e, },
{ 0x23, 0x2d, 0x29, 0x29, 0x28, 0x28, 0x29, 0x29,
0x28, 0x2b, 0x2d, 0x2f, 0x32, 0x34, 0x36, 0x3a,
0x3d, 0x41, 0x44, 0x47, 0x4a, 0x4c, 0x4e, 0x4e,
0x50, 0x51, 0x51, 0x51, 0x4f, 0x4c, 0x4b, 0x48,
0x46, 0x44, 0x40, 0x3d, 0x39, 0x36, 0x34, 0x30,
0x2f, 0x2d, 0x2a, 0x29, 0x28, 0x27, 0x26, 0x25,
0x25, 0x24, 0x24, 0x24, 0x26, 0x28, 0x28, 0x2a,
0x2b, 0x2e, 0x32, 0x34, 0x37, 0x39, 0x3b, 0x3c,
0x3d, 0x3d, 0x3e, 0x3e, 0x3e, 0x3c, 0x3b, 0x38,
0x37, 0x35, 0x33, 0x30, 0x2e, 0x2b, 0x27, 0x25,
0x24, 0x21, 0x20, 0x1d, 0x1b, 0x1a, 0x18, 0x16,
0x15, 0x14, 0x13, 0x12, 0x10, 0x11, 0x10, 0x0e,
0x0e, 0x0d, 0x0d, 0x0d, 0x0d, 0x0c, 0x0c, 0x0b,
0x0b, 0x0b, 0x0c, 0x0b, 0x0b, 0x09, 0x0a, 0x0b,
0x0b, 0x0a, 0x0a, 0x0c, 0x0c, 0x0c, 0x0d, 0x0e,
0x0e, 0x0f, 0x0f, 0x11, 0x12, 0x15, 0x15, 0x17,
0x1a, 0x1c, 0x1f, 0x22, 0x25, 0x26, 0x29, 0x2a,
0x2d, 0x30, 0x33, 0x34, 0x35, 0x35, 0x37, 0x37,
0x39, 0x3a, 0x39, 0x38, 0x37, 0x36, 0x36, 0x37,
0x35, 0x36, 0x35, 0x35, 0x36, 0x37, 0x3a, 0x3e,
0x40, 0x43, 0x48, 0x49, 0x4b, 0x4c, 0x4d, 0x4e,
0x4f, 0x4f, 0x4c, 0x48, 0x45, 0x41, 0x3e, 0x3b,
0x3a, 0x37, 0x36, 0x33, 0x32, 0x31, 0x30, 0x31,
0x32, 0x31, 0x31, 0x31, 0x31, 0x34, 0x37, 0x38,
0x3a, 0x3b, 0x3b, 0x3b, 0x3c, 0x3b, 0x3d, 0x3e,
0x3f, 0x40, 0x43, 0x44, 0x47, 0x4b, 0x4f, 0x56,
0x5a, 0x60, 0x66, 0x69, 0x6a, 0x6e, 0x71, 0x72,
0x73, 0x72, 0x6d, 0x69, 0x66, 0x60, 0x5c, 0x59,
0x54, 0x50, 0x4d, 0x48, 0x46, 0x44, 0x44, 0x43,
0x42, 0x41, 0x41, 0x40, 0x3f, 0x3f, 0x3e, 0x3d,
0x3d, 0x3d, 0x3c, 0x3a, 0x39, 0x38, 0x35, 0x35,
0x34, 0x34, 0x35, 0x34, 0x35, 0x36, 0x39, 0x3c,
0x3d, 0x3e, 0x41, 0x43, 0x44, 0x46, 0x48, 0x49,
0x4a, 0x49, 0x48, 0x47, 0x45, 0x43, 0x43, 0x44,
0x45, 0x47, 0x48, },
{ 0x23, 0x2d, 0x2a, 0x2a, 0x29, 0x29, 0x2a, 0x2a,
0x29, 0x2c, 0x2d, 0x2f, 0x32, 0x34, 0x36, 0x3a,
0x3d, 0x40, 0x44, 0x48, 0x4a, 0x4c, 0x4e, 0x4e,
0x50, 0x51, 0x51, 0x51, 0x4f, 0x4c, 0x4b, 0x48,
0x46, 0x44, 0x40, 0x3d, 0x39, 0x36, 0x34, 0x30,
0x2f, 0x2d, 0x2a, 0x29, 0x28, 0x27, 0x26, 0x25,
0x25, 0x24, 0x24, 0x25, 0x26, 0x28, 0x29, 0x2a,
0x2b, 0x2e, 0x31, 0x34, 0x37, 0x39, 0x3b, 0x3c,
0x3d, 0x3e, 0x3e, 0x3d, 0x3e, 0x3c, 0x3c, 0x3a,
0x37, 0x35, 0x33, 0x30, 0x2f, 0x2b, 0x28, 0x26,
0x24, 0x21, 0x20, 0x1e, 0x1c, 0x1b, 0x18, 0x17,
0x16, 0x14, 0x13, 0x12, 0x10, 0x10, 0x0f, 0x0e,
0x0f, 0x0e, 0x0d, 0x0d, 0x0d, 0x0d, 0x0d, 0x0c,
0x0b, 0x0b, 0x0c, 0x0c, 0x0c, 0x0b, 0x0b, 0x0c,
0x0c, 0x0b, 0x0b, 0x0c, 0x0d, 0x0c, 0x0e, 0x0e,
0x0e, 0x0f, 0x11, 0x11, 0x13, 0x14, 0x16, 0x18,
0x1a, 0x1d, 0x1f, 0x22, 0x25, 0x26, 0x29, 0x2b,
0x2d, 0x31, 0x33, 0x34, 0x36, 0x37, 0x38, 0x38,
0x39, 0x3a, 0x39, 0x38, 0x37, 0x36, 0x37, 0x37,
0x35, 0x36, 0x35, 0x36, 0x35, 0x38, 0x3a, 0x3e,
0x40, 0x41, 0x45, 0x47, 0x49, 0x4a, 0x4c, 0x4d,
0x4e, 0x4d, 0x4a, 0x47, 0x44, 0x40, 0x3d, 0x3b,
0x39, 0x37, 0x34, 0x34, 0x32, 0x31, 0x31, 0x33,
0x32, 0x31, 0x32, 0x33, 0x32, 0x36, 0x38, 0x39,
0x3b, 0x3c, 0x3c, 0x3c, 0x3d, 0x3d, 0x3e, 0x3e,
0x41, 0x42, 0x43, 0x45, 0x48, 0x4c, 0x50, 0x56,
0x5b, 0x5f, 0x62, 0x67, 0x69, 0x6c, 0x6e, 0x6e,
0x70, 0x6f, 0x6b, 0x67, 0x63, 0x5e, 0x5b, 0x58,
0x54, 0x51, 0x4e, 0x4a, 0x48, 0x46, 0x46, 0x46,
0x45, 0x46, 0x44, 0x43, 0x44, 0x43, 0x42, 0x42,
0x41, 0x40, 0x3f, 0x3e, 0x3c, 0x3b, 0x3a, 0x39,
0x39, 0x39, 0x38, 0x37, 0x37, 0x3a, 0x3e, 0x40,
0x42, 0x43, 0x47, 0x47, 0x48, 0x4a, 0x4b, 0x4c,
0x4c, 0x4b, 0x4a, 0x48, 0x46, 0x44, 0x43, 0x45,
0x45, 0x46, 0x47, },
{ 0x21, 0x2b, 0x28, 0x28, 0x28, 0x28, 0x29, 0x29,
0x28, 0x2a, 0x2d, 0x30, 0x32, 0x34, 0x37, 0x3a,
0x3c, 0x40, 0x44, 0x48, 0x4a, 0x4c, 0x4e, 0x4e,
0x50, 0x51, 0x52, 0x51, 0x4f, 0x4b, 0x4b, 0x48,
0x45, 0x43, 0x3f, 0x3c, 0x39, 0x36, 0x33, 0x30,
0x2f, 0x2d, 0x2b, 0x2a, 0x28, 0x27, 0x26, 0x25,
0x24, 0x24, 0x24, 0x25, 0x27, 0x27, 0x29, 0x2a,
0x2c, 0x2d, 0x31, 0x34, 0x37, 0x39, 0x3b, 0x3c,
0x3d, 0x3e, 0x3e, 0x3e, 0x3e, 0x3d, 0x3c, 0x3a,
0x37, 0x35, 0x33, 0x30, 0x2f, 0x2b, 0x28, 0x26,
0x25, 0x21, 0x20, 0x1e, 0x1c, 0x19, 0x19, 0x18,
0x17, 0x15, 0x15, 0x12, 0x11, 0x11, 0x11, 0x0f,
0x0e, 0x0e, 0x0e, 0x0e, 0x0d, 0x0d, 0x0d, 0x0c,
0x0c, 0x0c, 0x0b, 0x0b, 0x0b, 0x0b, 0x0b, 0x0b,
0x0c, 0x0c, 0x0c, 0x0c, 0x0e, 0x0e, 0x0f, 0x0f,
0x0f, 0x10, 0x11, 0x13, 0x13, 0x15, 0x16, 0x18,
0x1a, 0x1c, 0x1f, 0x22, 0x25, 0x28, 0x29, 0x2d,
0x2f, 0x32, 0x34, 0x35, 0x36, 0x37, 0x38, 0x38,
0x39, 0x3a, 0x39, 0x39, 0x37, 0x36, 0x37, 0x36,
0x35, 0x35, 0x37, 0x35, 0x36, 0x37, 0x3a, 0x3d,
0x3e, 0x41, 0x43, 0x46, 0x46, 0x47, 0x48, 0x49,
0x4a, 0x49, 0x47, 0x45, 0x42, 0x3f, 0x3d, 0x3b,
0x3a, 0x38, 0x36, 0x34, 0x32, 0x32, 0x32, 0x32,
0x32, 0x31, 0x33, 0x32, 0x34, 0x37, 0x38, 0x38,
0x3a, 0x3b, 0x3d, 0x3d, 0x3d, 0x3e, 0x3f, 0x41,
0x42, 0x44, 0x44, 0x46, 0x49, 0x4d, 0x50, 0x54,
0x58, 0x5c, 0x61, 0x63, 0x65, 0x69, 0x6a, 0x6c,
0x6d, 0x6c, 0x68, 0x64, 0x61, 0x5c, 0x59, 0x57,
0x53, 0x51, 0x4f, 0x4c, 0x4a, 0x48, 0x48, 0x49,
0x49, 0x48, 0x48, 0x48, 0x47, 0x47, 0x46, 0x46,
0x45, 0x44, 0x42, 0x41, 0x3f, 0x3e, 0x3c, 0x3c,
0x3c, 0x3d, 0x3c, 0x3c, 0x3c, 0x3e, 0x41, 0x43,
0x46, 0x48, 0x4a, 0x4b, 0x4c, 0x4d, 0x4e, 0x4e,
0x4e, 0x4d, 0x4b, 0x49, 0x47, 0x44, 0x44, 0x45,
0x45, 0x45, 0x46, },
{ 0x22, 0x2b, 0x27, 0x27, 0x27, 0x27, 0x28, 0x28,
0x28, 0x2a, 0x2c, 0x2f, 0x30, 0x34, 0x37, 0x3b,
0x3d, 0x41, 0x45, 0x48, 0x4a, 0x4c, 0x4e, 0x4e,
0x50, 0x51, 0x52, 0x51, 0x4f, 0x4b, 0x4b, 0x47,
0x45, 0x43, 0x3f, 0x3c, 0x39, 0x36, 0x33, 0x30,
0x2f, 0x2d, 0x2b, 0x2a, 0x27, 0x26, 0x25, 0x24,
0x23, 0x24, 0x24, 0x25, 0x27, 0x27, 0x29, 0x2a,
0x2c, 0x2e, 0x31, 0x34, 0x37, 0x39, 0x3a, 0x3b,
0x3d, 0x3e, 0x3e, 0x3f, 0x3f, 0x3d, 0x3c, 0x3a,
0x38, 0x36, 0x34, 0x31, 0x2e, 0x2c, 0x29, 0x26,
0x25, 0x22, 0x20, 0x1e, 0x1c, 0x1a, 0x19, 0x18,
0x16, 0x15, 0x14, 0x12, 0x10, 0x11, 0x11, 0x0f,
0x0e, 0x0e, 0x0e, 0x0e, 0x0d, 0x0c, 0x0d, 0x0c,
0x0c, 0x0c, 0x0b, 0x0b, 0x0b, 0x0b, 0x0b, 0x0b,
0x0c, 0x0c, 0x0c, 0x0d, 0x0d, 0x0e, 0x0f, 0x0f,
0x0f, 0x10, 0x11, 0x13, 0x13, 0x15, 0x15, 0x18,
0x19, 0x1d, 0x1f, 0x21, 0x24, 0x27, 0x2a, 0x2c,
0x30, 0x33, 0x35, 0x36, 0x37, 0x38, 0x39, 0x39,
0x3a, 0x3a, 0x39, 0x39, 0x37, 0x36, 0x37, 0x36,
0x36, 0x36, 0x36, 0x36, 0x36, 0x37, 0x39, 0x3a,
0x3d, 0x3e, 0x41, 0x43, 0x43, 0x45, 0x46, 0x46,
0x47, 0x46, 0x44, 0x42, 0x40, 0x3d, 0x3a, 0x39,
0x37, 0x36, 0x35, 0x34, 0x33, 0x32, 0x32, 0x32,
0x32, 0x32, 0x33, 0x34, 0x35, 0x36, 0x37, 0x38,
0x39, 0x3c, 0x3c, 0x3e, 0x3e, 0x3e, 0x41, 0x43,
0x44, 0x45, 0x46, 0x48, 0x49, 0x4c, 0x51, 0x54,
0x56, 0x5a, 0x5f, 0x61, 0x63, 0x65, 0x67, 0x69,
0x6a, 0x69, 0x67, 0x61, 0x5f, 0x5b, 0x58, 0x56,
0x54, 0x51, 0x50, 0x4e, 0x4c, 0x4a, 0x4b, 0x4c,
0x4c, 0x4b, 0x4b, 0x4b, 0x4b, 0x49, 0x4a, 0x49,
0x49, 0x48, 0x46, 0x44, 0x42, 0x41, 0x40, 0x3f,
0x3f, 0x40, 0x40, 0x40, 0x40, 0x42, 0x46, 0x49,
0x4b, 0x4c, 0x4f, 0x4f, 0x50, 0x52, 0x51, 0x51,
0x50, 0x4f, 0x4c, 0x4a, 0x48, 0x46, 0x45, 0x44,
0x44, 0x45, 0x46, },
{ 0x21, 0x2a, 0x27, 0x27, 0x27, 0x27, 0x27, 0x27,
0x27, 0x29, 0x2d, 0x2f, 0x31, 0x34, 0x37, 0x3b,
0x3e, 0x41, 0x45, 0x48, 0x4a, 0x4c, 0x4e, 0x4e,
0x50, 0x51, 0x52, 0x51, 0x4f, 0x4b, 0x4b, 0x48,
0x45, 0x43, 0x3f, 0x3c, 0x39, 0x36, 0x33, 0x2f,
0x2f, 0x2d, 0x2a, 0x2a, 0x27, 0x26, 0x25, 0x24,
0x22, 0x24, 0x24, 0x25, 0x27, 0x27, 0x29, 0x2a,
0x2c, 0x2f, 0x31, 0x34, 0x37, 0x39, 0x3a, 0x3c,
0x3d, 0x3e, 0x3f, 0x40, 0x3f, 0x3d, 0x3d, 0x3a,
0x38, 0x36, 0x34, 0x31, 0x2e, 0x2c, 0x29, 0x26,
0x25, 0x22, 0x21, 0x1f, 0x1d, 0x1b, 0x19, 0x18,
0x16, 0x14, 0x14, 0x13, 0x11, 0x11, 0x11, 0x0f,
0x0f, 0x0f, 0x0e, 0x0e, 0x0d, 0x0d, 0x0d, 0x0d,
0x0d, 0x0d, 0x0c, 0x0b, 0x0b, 0x0b, 0x0b, 0x0c,
0x0c, 0x0d, 0x0d, 0x0d, 0x0e, 0x0e, 0x0f, 0x0f,
0x0f, 0x10, 0x13, 0x13, 0x14, 0x15, 0x17, 0x19,
0x1a, 0x1d, 0x1f, 0x22, 0x25, 0x27, 0x2a, 0x2e,
0x31, 0x33, 0x35, 0x38, 0x39, 0x3a, 0x3b, 0x3b,
0x3c, 0x3c, 0x3b, 0x3a, 0x39, 0x38, 0x38, 0x37,
0x36, 0x36, 0x37, 0x36, 0x37, 0x38, 0x38, 0x3a,
0x3b, 0x3e, 0x40, 0x40, 0x41, 0x42, 0x43, 0x42,
0x43, 0x42, 0x40, 0x40, 0x3f, 0x3c, 0x3b, 0x39,
0x38, 0x37, 0x36, 0x35, 0x34, 0x33, 0x32, 0x33,
0x32, 0x32, 0x34, 0x35, 0x35, 0x36, 0x39, 0x39,
0x3a, 0x3c, 0x3c, 0x3f, 0x40, 0x41, 0x43, 0x45,
0x45, 0x47, 0x48, 0x4a, 0x4b, 0x4d, 0x50, 0x53,
0x56, 0x59, 0x5c, 0x5f, 0x60, 0x65, 0x64, 0x66,
0x68, 0x66, 0x64, 0x61, 0x5e, 0x5a, 0x59, 0x56,
0x54, 0x52, 0x51, 0x50, 0x4e, 0x4c, 0x4d, 0x4f,
0x4f, 0x4f, 0x50, 0x50, 0x4f, 0x4f, 0x4e, 0x4d,
0x4c, 0x4b, 0x49, 0x47, 0x45, 0x44, 0x43, 0x43,
0x42, 0x43, 0x44, 0x44, 0x46, 0x47, 0x49, 0x4d,
0x4f, 0x51, 0x53, 0x54, 0x53, 0x54, 0x54, 0x53,
0x53, 0x51, 0x4e, 0x4b, 0x4a, 0x47, 0x45, 0x44,
0x44, 0x45, 0x46, },
{ 0x20, 0x28, 0x26, 0x26, 0x25, 0x24, 0x27, 0x27,
0x27, 0x29, 0x2c, 0x2e, 0x31, 0x34, 0x37, 0x3b,
0x3e, 0x41, 0x45, 0x48, 0x4a, 0x4c, 0x4e, 0x4e,
0x50, 0x51, 0x52, 0x51, 0x4f, 0x4b, 0x4a, 0x49,
0x45, 0x43, 0x3f, 0x3c, 0x3a, 0x36, 0x33, 0x30,
0x2f, 0x2d, 0x2a, 0x28, 0x27, 0x26, 0x25, 0x24,
0x23, 0x24, 0x24, 0x25, 0x27, 0x27, 0x29, 0x2a,
0x2c, 0x2e, 0x31, 0x34, 0x37, 0x39, 0x3b, 0x3c,
0x3d, 0x3e, 0x3f, 0x40, 0x3e, 0x3d, 0x3d, 0x3a,
0x38, 0x36, 0x34, 0x31, 0x2f, 0x2c, 0x29, 0x27,
0x25, 0x21, 0x21, 0x1f, 0x1c, 0x1d, 0x19, 0x18,
0x16, 0x15, 0x15, 0x13, 0x12, 0x11, 0x11, 0x0f,
0x0f, 0x0e, 0x0f, 0x0f, 0x0e, 0x0d, 0x0d, 0x0d,
0x0c, 0x0c, 0x0c, 0x0c, 0x0c, 0x0c, 0x0c, 0x0c,
0x0d, 0x0d, 0x0d, 0x0e, 0x0e, 0x0e, 0x0f, 0x10,
0x10, 0x10, 0x12, 0x13, 0x15, 0x16, 0x18, 0x1a,
0x1c, 0x1d, 0x20, 0x22, 0x25, 0x27, 0x2a, 0x2e,
0x30, 0x34, 0x38, 0x39, 0x3a, 0x3b, 0x3b, 0x3b,
0x3c, 0x3d, 0x3c, 0x3b, 0x3a, 0x39, 0x38, 0x37,
0x36, 0x36, 0x38, 0x37, 0x37, 0x37, 0x38, 0x3a,
0x3b, 0x3c, 0x3d, 0x3e, 0x3f, 0x40, 0x40, 0x40,
0x42, 0x40, 0x3f, 0x3e, 0x3d, 0x3b, 0x3a, 0x39,
0x37, 0x36, 0x36, 0x35, 0x34, 0x34, 0x33, 0x33,
0x33, 0x34, 0x35, 0x35, 0x35, 0x36, 0x38, 0x39,
0x3a, 0x3b, 0x3d, 0x3f, 0x42, 0x43, 0x45, 0x45,
0x46, 0x48, 0x49, 0x4b, 0x4b, 0x4d, 0x50, 0x53,
0x56, 0x57, 0x5a, 0x5c, 0x5e, 0x61, 0x63, 0x65,
0x66, 0x64, 0x62, 0x5f, 0x5c, 0x59, 0x58, 0x56,
0x55, 0x54, 0x52, 0x51, 0x50, 0x51, 0x51, 0x52,
0x52, 0x52, 0x52, 0x52, 0x51, 0x51, 0x51, 0x50,
0x4f, 0x4e, 0x4c, 0x4a, 0x47, 0x46, 0x45, 0x45,
0x45, 0x46, 0x46, 0x46, 0x4a, 0x4c, 0x4d, 0x52,
0x54, 0x56, 0x58, 0x58, 0x56, 0x57, 0x57, 0x56,
0x55, 0x53, 0x50, 0x4d, 0x49, 0x45, 0x44, 0x44,
0x43, 0x44, 0x45, },
{ 0x1f, 0x27, 0x24, 0x23, 0x25, 0x24, 0x25, 0x26,
0x26, 0x28, 0x2b, 0x2e, 0x31, 0x34, 0x37, 0x3a,
0x3d, 0x41, 0x45, 0x48, 0x4b, 0x4d, 0x4f, 0x4e,
0x50, 0x51, 0x52, 0x50, 0x4f, 0x4b, 0x4a, 0x49,
0x45, 0x43, 0x3f, 0x3c, 0x3a, 0x36, 0x33, 0x30,
0x2f, 0x2d, 0x29, 0x28, 0x27, 0x26, 0x25, 0x24,
0x23, 0x25, 0x24, 0x25, 0x27, 0x27, 0x29, 0x2a,
0x2c, 0x2f, 0x32, 0x34, 0x37, 0x39, 0x3b, 0x3c,
0x3e, 0x3f, 0x3f, 0x40, 0x3e, 0x3d, 0x3c, 0x3a,
0x38, 0x36, 0x34, 0x31, 0x30, 0x2c, 0x29, 0x28,
0x25, 0x23, 0x22, 0x1f, 0x1c, 0x1c, 0x18, 0x18,
0x16, 0x14, 0x14, 0x13, 0x11, 0x11, 0x11, 0x0f,
0x0f, 0x0e, 0x0f, 0x0f, 0x0e, 0x0d, 0x0d, 0x0d,
0x0c, 0x0c, 0x0b, 0x0c, 0x0c, 0x0c, 0x0c, 0x0c,
0x0d, 0x0e, 0x0e, 0x0f, 0x0d, 0x0f, 0x10, 0x10,
0x10, 0x11, 0x13, 0x14, 0x15, 0x16, 0x19, 0x1a,
0x1c, 0x1f, 0x20, 0x23, 0x26, 0x28, 0x2a, 0x2e,
0x31, 0x35, 0x38, 0x39, 0x3a, 0x3c, 0x3d, 0x3d,
0x3e, 0x3e, 0x3d, 0x3c, 0x3a, 0x3a, 0x39, 0x39,
0x38, 0x37, 0x38, 0x38, 0x37, 0x38, 0x39, 0x3a,
0x3c, 0x3c, 0x3d, 0x3e, 0x3f, 0x3f, 0x40, 0x3f,
0x41, 0x40, 0x3e, 0x3e, 0x3d, 0x3b, 0x3b, 0x39,
0x37, 0x37, 0x35, 0x36, 0x34, 0x34, 0x34, 0x35,
0x35, 0x34, 0x34, 0x35, 0x35, 0x37, 0x38, 0x39,
0x3a, 0x3c, 0x3f, 0x3f, 0x43, 0x43, 0x45, 0x47,
0x48, 0x48, 0x4a, 0x4b, 0x4e, 0x4d, 0x51, 0x53,
0x56, 0x58, 0x59, 0x5b, 0x5d, 0x60, 0x62, 0x63,
0x64, 0x63, 0x61, 0x5e, 0x5c, 0x5a, 0x57, 0x56,
0x55, 0x54, 0x53, 0x52, 0x51, 0x51, 0x52, 0x52,
0x54, 0x54, 0x55, 0x55, 0x55, 0x54, 0x54, 0x53,
0x52, 0x50, 0x4e, 0x4d, 0x4b, 0x4a, 0x48, 0x48,
0x48, 0x48, 0x4a, 0x4b, 0x4d, 0x4f, 0x52, 0x55,
0x58, 0x5a, 0x5b, 0x5b, 0x5b, 0x5b, 0x5a, 0x59,
0x58, 0x55, 0x51, 0x4e, 0x4a, 0x46, 0x45, 0x44,
0x44, 0x44, 0x44, },
{ 0x1e, 0x26, 0x23, 0x23, 0x25, 0x24, 0x25, 0x26,
0x26, 0x28, 0x2b, 0x2e, 0x31, 0x34, 0x37, 0x3a,
0x3e, 0x42, 0x45, 0x48, 0x4b, 0x4d, 0x4f, 0x4f,
0x50, 0x51, 0x52, 0x50, 0x4f, 0x4b, 0x4a, 0x48,
0x46, 0x44, 0x3f, 0x3b, 0x39, 0x36, 0x33, 0x30,
0x2f, 0x2d, 0x2a, 0x28, 0x27, 0x26, 0x25, 0x24,
0x23, 0x24, 0x24, 0x25, 0x27, 0x27, 0x29, 0x2a,
0x2c, 0x2f, 0x32, 0x34, 0x37, 0x39, 0x3b, 0x3d,
0x3e, 0x3f, 0x41, 0x41, 0x40, 0x3e, 0x3d, 0x3b,
0x38, 0x37, 0x34, 0x32, 0x30, 0x2c, 0x2a, 0x27,
0x26, 0x23, 0x22, 0x20, 0x1d, 0x1b, 0x1a, 0x19,
0x17, 0x15, 0x15, 0x13, 0x12, 0x12, 0x11, 0x0f,
0x11, 0x0f, 0x0e, 0x0e, 0x0d, 0x0d, 0x0d, 0x0c,
0x0d, 0x0d, 0x0d, 0x0d, 0x0d, 0x0d, 0x0d, 0x0d,
0x0e, 0x0e, 0x0e, 0x0f, 0x10, 0x10, 0x11, 0x11,
0x11, 0x13, 0x16, 0x15, 0x15, 0x18, 0x1a, 0x1b,
0x1d, 0x20, 0x22, 0x24, 0x27, 0x29, 0x2c, 0x30,
0x33, 0x37, 0x3a, 0x3b, 0x3c, 0x3d, 0x3e, 0x3e,
0x40, 0x40, 0x40, 0x3f, 0x3e, 0x3d, 0x3c, 0x3a,
0x3a, 0x3a, 0x3a, 0x3a, 0x3a, 0x3a, 0x3b, 0x3d,
0x3d, 0x3f, 0x40, 0x40, 0x3f, 0x41, 0x41, 0x41,
0x41, 0x41, 0x40, 0x40, 0x3f, 0x3e, 0x3c, 0x3b,
0x3a, 0x39, 0x37, 0x36, 0x36, 0x35, 0x35, 0x36,
0x36, 0x35, 0x35, 0x36, 0x36, 0x38, 0x39, 0x39,
0x3b, 0x3c, 0x3e, 0x40, 0x41, 0x43, 0x45, 0x47,
0x48, 0x48, 0x4b, 0x4c, 0x4d, 0x4f, 0x51, 0x53,
0x56, 0x56, 0x59, 0x5b, 0x5d, 0x5f, 0x61, 0x62,
0x63, 0x63, 0x61, 0x5e, 0x5c, 0x5a, 0x59, 0x57,
0x56, 0x54, 0x54, 0x53, 0x52, 0x53, 0x53, 0x55,
0x56, 0x56, 0x57, 0x57, 0x57, 0x57, 0x56, 0x56,
0x55, 0x53, 0x51, 0x4f, 0x4d, 0x4b, 0x49, 0x4b,
0x4b, 0x4c, 0x4d, 0x4e, 0x51, 0x53, 0x55, 0x58,
0x5b, 0x5c, 0x60, 0x60, 0x5f, 0x5e, 0x5d, 0x5c,
0x5a, 0x57, 0x53, 0x4f, 0x4b, 0x46, 0x45, 0x44,
0x44, 0x44, 0x44, },
{ 0x1d, 0x25, 0x22, 0x22, 0x23, 0x23, 0x24, 0x25,
0x25, 0x28, 0x2b, 0x2e, 0x31, 0x34, 0x37, 0x3a,
0x3e, 0x42, 0x45, 0x48, 0x4b, 0x4d, 0x4f, 0x4f,
0x50, 0x51, 0x52, 0x50, 0x4f, 0x4b, 0x4a, 0x47,
0x45, 0x43, 0x3f, 0x3c, 0x38, 0x35, 0x33, 0x30,
0x2f, 0x2d, 0x2a, 0x28, 0x27, 0x26, 0x25, 0x24,
0x23, 0x24, 0x24, 0x25, 0x27, 0x27, 0x29, 0x2a,
0x2b, 0x2f, 0x32, 0x34, 0x37, 0x39, 0x3c, 0x3d,
0x3e, 0x3f, 0x40, 0x41, 0x40, 0x3e, 0x3d, 0x3b,
0x39, 0x36, 0x34, 0x32, 0x30, 0x2d, 0x2a, 0x26,
0x26, 0x24, 0x22, 0x1f, 0x1d, 0x1c, 0x1a, 0x19,
0x18, 0x16, 0x15, 0x14, 0x12, 0x12, 0x12, 0x10,
0x10, 0x0f, 0x0e, 0x10, 0x0e, 0x0e, 0x0d, 0x0c,
0x0d, 0x0d, 0x0d, 0x0d, 0x0d, 0x0e, 0x0d, 0x0e,
0x0f, 0x0f, 0x0f, 0x10, 0x11, 0x11, 0x11, 0x12,
0x13, 0x14, 0x16, 0x16, 0x18, 0x1a, 0x1b, 0x1c,
0x1e, 0x21, 0x23, 0x25, 0x28, 0x2a, 0x2e, 0x32,
0x34, 0x38, 0x3a, 0x3c, 0x3d, 0x3f, 0x40, 0x42,
0x43, 0x43, 0x43, 0x42, 0x40, 0x3e, 0x3e, 0x3c,
0x3b, 0x3b, 0x3c, 0x3a, 0x3b, 0x3b, 0x3e, 0x3e,
0x40, 0x3f, 0x41, 0x41, 0x41, 0x42, 0x42, 0x43,
0x42, 0x41, 0x41, 0x41, 0x40, 0x3e, 0x3d, 0x3c,
0x3b, 0x3a, 0x39, 0x37, 0x36, 0x35, 0x36, 0x37,
0x35, 0x36, 0x36, 0x37, 0x38, 0x39, 0x3a, 0x3b,
0x3b, 0x3d, 0x3e, 0x40, 0x41, 0x41, 0x44, 0x46,
0x48, 0x48, 0x4a, 0x4c, 0x4d, 0x4f, 0x51, 0x53,
0x55, 0x57, 0x59, 0x5a, 0x5b, 0x5e, 0x5f, 0x61,
0x62, 0x61, 0x60, 0x5e, 0x5c, 0x5a, 0x59, 0x58,
0x56, 0x55, 0x54, 0x53, 0x53, 0x54, 0x54, 0x55,
0x57, 0x57, 0x58, 0x59, 0x5a, 0x58, 0x59, 0x58,
0x57, 0x55, 0x53, 0x52, 0x4f, 0x4e, 0x4d, 0x4d,
0x4d, 0x4f, 0x51, 0x50, 0x54, 0x56, 0x59, 0x5c,
0x5f, 0x61, 0x64, 0x64, 0x63, 0x61, 0x5e, 0x5e,
0x5c, 0x59, 0x54, 0x50, 0x4c, 0x46, 0x45, 0x44,
0x44, 0x44, 0x44, },
{ 0x1c, 0x24, 0x21, 0x21, 0x21, 0x22, 0x23, 0x23,
0x25, 0x27, 0x2a, 0x2e, 0x31, 0x33, 0x37, 0x3b,
0x3e, 0x42, 0x45, 0x48, 0x4b, 0x4c, 0x50, 0x4f,
0x50, 0x51, 0x52, 0x50, 0x4e, 0x4b, 0x4a, 0x49,
0x45, 0x42, 0x3f, 0x3c, 0x38, 0x35, 0x33, 0x30,
0x2f, 0x2d, 0x2a, 0x28, 0x27, 0x26, 0x25, 0x24,
0x23, 0x24, 0x24, 0x25, 0x27, 0x27, 0x29, 0x2a,
0x2b, 0x2f, 0x32, 0x34, 0x38, 0x39, 0x3c, 0x3d,
0x3e, 0x3e, 0x40, 0x41, 0x40, 0x3e, 0x3c, 0x3a,
0x39, 0x37, 0x35, 0x33, 0x30, 0x2d, 0x2b, 0x28,
0x26, 0x23, 0x23, 0x20, 0x1e, 0x1b, 0x19, 0x19,
0x17, 0x16, 0x15, 0x14, 0x12, 0x12, 0x11, 0x10,
0x0f, 0x0e, 0x0e, 0x10, 0x0e, 0x0d, 0x0c, 0x0c,
0x0c, 0x0d, 0x0d, 0x0d, 0x0d, 0x0e, 0x0d, 0x0e,
0x0f, 0x0f, 0x0f, 0x10, 0x11, 0x11, 0x12, 0x14,
0x14, 0x14, 0x16, 0x18, 0x19, 0x1b, 0x1c, 0x1e,
0x20, 0x23, 0x26, 0x27, 0x29, 0x2c, 0x2f, 0x33,
0x36, 0x38, 0x3b, 0x3e, 0x3e, 0x42, 0x43, 0x46,
0x46, 0x46, 0x46, 0x44, 0x42, 0x41, 0x3f, 0x3e,
0x3d, 0x3d, 0x3e, 0x3d, 0x3d, 0x3e, 0x3e, 0x40,
0x40, 0x40, 0x43, 0x43, 0x42, 0x43, 0x45, 0x43,
0x43, 0x43, 0x42, 0x42, 0x41, 0x40, 0x40, 0x3e,
0x3c, 0x3a, 0x3a, 0x38, 0x36, 0x36, 0x36, 0x36,
0x37, 0x37, 0x36, 0x38, 0x38, 0x39, 0x3b, 0x3b,
0x3e, 0x3e, 0x3e, 0x40, 0x41, 0x43, 0x45, 0x46,
0x46, 0x49, 0x4c, 0x4c, 0x4d, 0x4f, 0x51, 0x54,
0x56, 0x57, 0x58, 0x5a, 0x5c, 0x5e, 0x60, 0x60,
0x61, 0x61, 0x60, 0x5f, 0x5c, 0x5a, 0x59, 0x58,
0x57, 0x57, 0x55, 0x54, 0x53, 0x55, 0x55, 0x58,
0x58, 0x59, 0x5a, 0x5a, 0x5a, 0x5b, 0x5b, 0x5b,
0x5a, 0x59, 0x56, 0x54, 0x53, 0x4e, 0x4e, 0x50,
0x50, 0x51, 0x52, 0x52, 0x57, 0x59, 0x5d, 0x60,
0x63, 0x63, 0x66, 0x66, 0x66, 0x64, 0x63, 0x61,
0x60, 0x5b, 0x55, 0x51, 0x4d, 0x48, 0x45, 0x44,
0x43, 0x43, 0x43, },
{ 0x1b, 0x23, 0x20, 0x21, 0x22, 0x22, 0x23, 0x24,
0x26, 0x27, 0x2a, 0x2e, 0x31, 0x33, 0x37, 0x3b,
0x3d, 0x42, 0x46, 0x49, 0x4a, 0x4c, 0x4f, 0x4f,
0x50, 0x50, 0x52, 0x50, 0x4e, 0x4b, 0x4b, 0x49,
0x45, 0x42, 0x3e, 0x3c, 0x38, 0x35, 0x33, 0x30,
0x2f, 0x2d, 0x2a, 0x28, 0x27, 0x26, 0x25, 0x24,
0x23, 0x24, 0x24, 0x25, 0x27, 0x27, 0x29, 0x2a,
0x2c, 0x2f, 0x32, 0x35, 0x38, 0x3a, 0x3c, 0x3d,
0x3e, 0x3e, 0x40, 0x41, 0x40, 0x3f, 0x3d, 0x3b,
0x3a, 0x38, 0x36, 0x33, 0x30, 0x2d, 0x2b, 0x29,
0x27, 0x24, 0x24, 0x21, 0x1e, 0x1c, 0x1b, 0x1a,
0x18, 0x17, 0x16, 0x15, 0x13, 0x12, 0x10, 0x0f,
0x10, 0x0f, 0x0e, 0x0f, 0x0e, 0x0d, 0x0d, 0x0d,
0x0d, 0x0d, 0x0e, 0x0e, 0x0e, 0x0f, 0x0e, 0x0f,
0x10, 0x11, 0x11, 0x12, 0x13, 0x13, 0x14, 0x15,
0x15, 0x16, 0x17, 0x1a, 0x1b, 0x1d, 0x1e, 0x20,
0x21, 0x25, 0x27, 0x29, 0x2b, 0x2d, 0x31, 0x35,
0x37, 0x39, 0x3c, 0x3f, 0x40, 0x43, 0x46, 0x47,
0x4a, 0x49, 0x48, 0x46, 0x45, 0x43, 0x42, 0x41,
0x3f, 0x40, 0x3f, 0x3f, 0x40, 0x3f, 0x41, 0x43,
0x43, 0x43, 0x44, 0x45, 0x45, 0x45, 0x45, 0x45,
0x45, 0x45, 0x44, 0x43, 0x43, 0x42, 0x42, 0x40,
0x3e, 0x3d, 0x3c, 0x39, 0x38, 0x38, 0x38, 0x38,
0x38, 0x36, 0x38, 0x39, 0x39, 0x3a, 0x3c, 0x3d,
0x3e, 0x3e, 0x3f, 0x41, 0x42, 0x42, 0x43, 0x45,
0x46, 0x49, 0x4b, 0x4d, 0x4f, 0x50, 0x53, 0x54,
0x57, 0x58, 0x5a, 0x5c, 0x5b, 0x5e, 0x60, 0x61,
0x60, 0x60, 0x5f, 0x5f, 0x5d, 0x5b, 0x5b, 0x59,
0x58, 0x57, 0x56, 0x55, 0x55, 0x55, 0x57, 0x59,
0x5b, 0x5b, 0x5d, 0x5c, 0x5c, 0x5e, 0x5e, 0x5e,
0x5d, 0x5b, 0x59, 0x56, 0x54, 0x51, 0x51, 0x51,
0x52, 0x55, 0x56, 0x56, 0x5a, 0x5d, 0x5f, 0x63,
0x66, 0x68, 0x6b, 0x6b, 0x68, 0x67, 0x66, 0x64,
0x61, 0x5d, 0x57, 0x52, 0x4f, 0x49, 0x46, 0x45,
0x43, 0x43, 0x43, },
{ 0x1a, 0x22, 0x1f, 0x20, 0x21, 0x22, 0x23, 0x24,
0x26, 0x27, 0x2a, 0x2d, 0x31, 0x33, 0x37, 0x3b,
0x3d, 0x41, 0x46, 0x49, 0x4a, 0x4d, 0x4f, 0x4f,
0x50, 0x51, 0x52, 0x50, 0x4e, 0x4b, 0x4b, 0x48,
0x44, 0x42, 0x3e, 0x3c, 0x39, 0x35, 0x33, 0x30,
0x2f, 0x2d, 0x2a, 0x28, 0x27, 0x26, 0x25, 0x24,
0x23, 0x24, 0x24, 0x25, 0x27, 0x27, 0x29, 0x2a,
0x2d, 0x2f, 0x32, 0x35, 0x39, 0x3a, 0x3c, 0x3d,
0x3e, 0x3f, 0x40, 0x41, 0x40, 0x3f, 0x3e, 0x3c,
0x3a, 0x38, 0x36, 0x33, 0x31, 0x2d, 0x2c, 0x29,
0x27, 0x26, 0x24, 0x21, 0x1f, 0x1d, 0x1c, 0x1a,
0x19, 0x18, 0x16, 0x15, 0x14, 0x13, 0x12, 0x10,
0x11, 0x10, 0x0f, 0x0f, 0x0f, 0x0e, 0x0e, 0x0e,
0x0f, 0x0f, 0x0e, 0x0e, 0x0e, 0x0f, 0x0f, 0x10,
0x11, 0x12, 0x12, 0x13, 0x15, 0x15, 0x16, 0x16,
0x17, 0x18, 0x1a, 0x1b, 0x1c, 0x1e, 0x1f, 0x21,
0x22, 0x25, 0x27, 0x2a, 0x2c, 0x2e, 0x33, 0x36,
0x39, 0x3a, 0x3d, 0x40, 0x41, 0x45, 0x47, 0x4a,
0x4c, 0x4d, 0x4c, 0x4a, 0x48, 0x45, 0x44, 0x41,
0x42, 0x42, 0x42, 0x42, 0x42, 0x43, 0x43, 0x44,
0x45, 0x47, 0x47, 0x48, 0x47, 0x48, 0x47, 0x47,
0x48, 0x48, 0x46, 0x46, 0x46, 0x43, 0x43, 0x41,
0x3f, 0x3e, 0x3b, 0x39, 0x38, 0x37, 0x37, 0x37,
0x38, 0x38, 0x37, 0x39, 0x39, 0x3a, 0x3c, 0x3e,
0x3e, 0x3f, 0x3f, 0x3f, 0x42, 0x43, 0x43, 0x45,
0x47, 0x48, 0x4b, 0x4c, 0x4e, 0x50, 0x51, 0x54,
0x56, 0x58, 0x5a, 0x5c, 0x5c, 0x5f, 0x5f, 0x5f,
0x61, 0x60, 0x5f, 0x5f, 0x5e, 0x5b, 0x5c, 0x5b,
0x59, 0x59, 0x57, 0x56, 0x55, 0x56, 0x57, 0x59,
0x5a, 0x5b, 0x5c, 0x5c, 0x5d, 0x5e, 0x5e, 0x5d,
0x5e, 0x5c, 0x5a, 0x57, 0x55, 0x52, 0x51, 0x52,
0x53, 0x55, 0x57, 0x58, 0x5c, 0x5e, 0x61, 0x65,
0x69, 0x6b, 0x6c, 0x6b, 0x6a, 0x69, 0x67, 0x64,
0x61, 0x5d, 0x59, 0x53, 0x4d, 0x48, 0x46, 0x45,
0x44, 0x44, 0x43, },
{ 0x1a, 0x21, 0x1e, 0x1f, 0x20, 0x21, 0x23, 0x24,
0x25, 0x28, 0x2a, 0x2e, 0x31, 0x33, 0x37, 0x3b,
0x3e, 0x41, 0x46, 0x49, 0x4b, 0x4d, 0x4f, 0x4e,
0x50, 0x51, 0x51, 0x50, 0x4e, 0x4b, 0x4a, 0x48,
0x44, 0x42, 0x3e, 0x3c, 0x39, 0x35, 0x32, 0x30,
0x2f, 0x2d, 0x29, 0x27, 0x27, 0x26, 0x25, 0x24,
0x23, 0x24, 0x24, 0x25, 0x26, 0x27, 0x29, 0x2a,
0x2c, 0x2f, 0x32, 0x35, 0x38, 0x3b, 0x3c, 0x3e,
0x3f, 0x3f, 0x40, 0x41, 0x40, 0x3f, 0x3e, 0x3c,
0x3a, 0x39, 0x36, 0x34, 0x31, 0x2d, 0x2c, 0x29,
0x27, 0x26, 0x24, 0x21, 0x1f, 0x1d, 0x1c, 0x1a,
0x19, 0x17, 0x16, 0x15, 0x14, 0x13, 0x12, 0x10,
0x11, 0x10, 0x0f, 0x0f, 0x0f, 0x0e, 0x0e, 0x0e,
0x0e, 0x0e, 0x0e, 0x0e, 0x0e, 0x0f, 0x0f, 0x10,
0x11, 0x13, 0x14, 0x14, 0x15, 0x16, 0x17, 0x19,
0x19, 0x1a, 0x1c, 0x1d, 0x1e, 0x20, 0x22, 0x24,
0x25, 0x27, 0x29, 0x2c, 0x2e, 0x31, 0x35, 0x38,
0x3a, 0x3d, 0x41, 0x42, 0x45, 0x48, 0x4c, 0x4e,
0x4f, 0x4f, 0x4f, 0x4d, 0x4b, 0x49, 0x47, 0x47,
0x46, 0x45, 0x45, 0x45, 0x44, 0x44, 0x46, 0x47,
0x48, 0x49, 0x4b, 0x4b, 0x4a, 0x4b, 0x4b, 0x4a,
0x4b, 0x4a, 0x49, 0x49, 0x48, 0x46, 0x46, 0x44,
0x42, 0x41, 0x3d, 0x3b, 0x3a, 0x38, 0x38, 0x38,
0x37, 0x37, 0x39, 0x38, 0x3a, 0x3a, 0x3c, 0x3c,
0x3e, 0x40, 0x40, 0x41, 0x43, 0x43, 0x45, 0x46,
0x48, 0x49, 0x4b, 0x4e, 0x4f, 0x50, 0x53, 0x55,
0x57, 0x59, 0x5b, 0x5c, 0x5d, 0x5e, 0x5f, 0x60,
0x60, 0x60, 0x5f, 0x5f, 0x5e, 0x5c, 0x5b, 0x5a,
0x59, 0x58, 0x57, 0x57, 0x56, 0x56, 0x57, 0x58,
0x59, 0x5a, 0x5b, 0x5c, 0x5c, 0x5d, 0x5e, 0x5d,
0x5c, 0x5b, 0x58, 0x57, 0x54, 0x52, 0x52, 0x53,
0x54, 0x57, 0x58, 0x58, 0x5b, 0x5e, 0x62, 0x65,
0x69, 0x6b, 0x6d, 0x6c, 0x6a, 0x69, 0x67, 0x64,
0x62, 0x5e, 0x59, 0x54, 0x4d, 0x48, 0x47, 0x46,
0x45, 0x45, 0x44, },
{ 0x1a, 0x21, 0x1e, 0x1f, 0x20, 0x21, 0x23, 0x24,
0x25, 0x28, 0x2a, 0x2e, 0x31, 0x34, 0x37, 0x3b,
0x3e, 0x42, 0x47, 0x49, 0x4b, 0x4d, 0x4f, 0x4f,
0x50, 0x51, 0x51, 0x50, 0x50, 0x4c, 0x4a, 0x47,
0x44, 0x42, 0x3e, 0x3c, 0x39, 0x35, 0x32, 0x31,
0x2f, 0x2d, 0x29, 0x27, 0x26, 0x26, 0x25, 0x24,
0x23, 0x24, 0x25, 0x25, 0x26, 0x27, 0x29, 0x2b,
0x2c, 0x2f, 0x33, 0x35, 0x38, 0x3a, 0x3c, 0x3e,
0x40, 0x40, 0x41, 0x42, 0x41, 0x3f, 0x3f, 0x3d,
0x3b, 0x39, 0x36, 0x33, 0x32, 0x2e, 0x2d, 0x2a,
0x27, 0x26, 0x25, 0x22, 0x1f, 0x1d, 0x1c, 0x1b,
0x19, 0x17, 0x17, 0x16, 0x15, 0x14, 0x12, 0x11,
0x11, 0x11, 0x10, 0x10, 0x0f, 0x0f, 0x0f, 0x0f,
0x0f, 0x0f, 0x10, 0x11, 0x10, 0x11, 0x11, 0x12,
0x11, 0x14, 0x15, 0x16, 0x17, 0x18, 0x19, 0x1b,
0x1c, 0x1c, 0x1e, 0x20, 0x21, 0x22, 0x23, 0x25,
0x27, 0x2a, 0x2c, 0x2f, 0x31, 0x35, 0x38, 0x3b,
0x3d, 0x40, 0x44, 0x47, 0x49, 0x4c, 0x4f, 0x51,
0x53, 0x53, 0x53, 0x51, 0x50, 0x4e, 0x4c, 0x4b,
0x4a, 0x49, 0x49, 0x49, 0x49, 0x4a, 0x4a, 0x4d,
0x4e, 0x4e, 0x4f, 0x50, 0x4f, 0x50, 0x51, 0x50,
0x50, 0x4e, 0x4d, 0x4c, 0x4b, 0x48, 0x48, 0x47,
0x44, 0x42, 0x3f, 0x3d, 0x3b, 0x3a, 0x39, 0x39,
0x39, 0x38, 0x39, 0x3b, 0x3a, 0x3c, 0x3e, 0x3d,
0x40, 0x40, 0x40, 0x42, 0x42, 0x42, 0x45, 0x46,
0x47, 0x49, 0x4c, 0x4e, 0x50, 0x50, 0x53, 0x56,
0x58, 0x59, 0x5d, 0x5d, 0x5e, 0x60, 0x61, 0x61,
0x62, 0x61, 0x60, 0x60, 0x5e, 0x5d, 0x5d, 0x5b,
0x57, 0x58, 0x56, 0x55, 0x55, 0x56, 0x56, 0x59,
0x59, 0x58, 0x5a, 0x5a, 0x5a, 0x5c, 0x5c, 0x5c,
0x5b, 0x5b, 0x58, 0x57, 0x54, 0x53, 0x52, 0x53,
0x54, 0x57, 0x58, 0x59, 0x5c, 0x5f, 0x63, 0x67,
0x6b, 0x6d, 0x6e, 0x6e, 0x6b, 0x6a, 0x68, 0x64,
0x62, 0x5e, 0x58, 0x53, 0x4f, 0x49, 0x47, 0x46,
0x45, 0x45, 0x44, },
{ 0x19, 0x20, 0x1e, 0x1e, 0x1f, 0x20, 0x22, 0x23,
0x25, 0x27, 0x2a, 0x2e, 0x31, 0x34, 0x37, 0x3a,
0x3e, 0x41, 0x46, 0x49, 0x4a, 0x4d, 0x4f, 0x4e,
0x50, 0x51, 0x51, 0x4f, 0x4f, 0x4d, 0x49, 0x47,
0x44, 0x42, 0x3e, 0x3c, 0x39, 0x36, 0x32, 0x31,
0x2f, 0x2d, 0x29, 0x27, 0x26, 0x26, 0x25, 0x24,
0x23, 0x24, 0x25, 0x25, 0x26, 0x28, 0x29, 0x2b,
0x2c, 0x2f, 0x33, 0x35, 0x38, 0x3a, 0x3c, 0x3e,
0x3f, 0x3f, 0x41, 0x42, 0x41, 0x3f, 0x3f, 0x3d,
0x3c, 0x39, 0x36, 0x33, 0x32, 0x2e, 0x2d, 0x2a,
0x27, 0x26, 0x25, 0x22, 0x1f, 0x1e, 0x1d, 0x1b,
0x1a, 0x17, 0x17, 0x17, 0x14, 0x14, 0x12, 0x11,
0x11, 0x12, 0x11, 0x11, 0x10, 0x10, 0x10, 0x10,
0x10, 0x10, 0x11, 0x11, 0x11, 0x12, 0x13, 0x14,
0x14, 0x16, 0x17, 0x18, 0x19, 0x1a, 0x1c, 0x1e,
0x1e, 0x1f, 0x22, 0x23, 0x23, 0x24, 0x25, 0x27,
0x2a, 0x2d, 0x2f, 0x31, 0x35, 0x38, 0x3a, 0x3e,
0x41, 0x44, 0x48, 0x4b, 0x4d, 0x51, 0x53, 0x55,
0x57, 0x57, 0x56, 0x55, 0x54, 0x52, 0x52, 0x50,
0x4e, 0x50, 0x4e, 0x4d, 0x4d, 0x4d, 0x4f, 0x51,
0x51, 0x52, 0x54, 0x55, 0x55, 0x55, 0x57, 0x55,
0x54, 0x53, 0x52, 0x4e, 0x4d, 0x4b, 0x4a, 0x49,
0x46, 0x44, 0x41, 0x3f, 0x3d, 0x3b, 0x3a, 0x3a,
0x39, 0x39, 0x39, 0x39, 0x3a, 0x3b, 0x3d, 0x3e,
0x3f, 0x40, 0x41, 0x42, 0x44, 0x44, 0x45, 0x47,
0x49, 0x49, 0x4a, 0x4d, 0x50, 0x51, 0x53, 0x57,
0x5a, 0x5b, 0x5e, 0x5f, 0x60, 0x61, 0x62, 0x62,
0x63, 0x62, 0x60, 0x60, 0x5e, 0x5c, 0x5c, 0x59,
0x58, 0x56, 0x55, 0x55, 0x55, 0x55, 0x55, 0x54,
0x56, 0x56, 0x57, 0x58, 0x58, 0x59, 0x5a, 0x59,
0x58, 0x57, 0x56, 0x55, 0x54, 0x52, 0x53, 0x53,
0x53, 0x56, 0x57, 0x59, 0x5b, 0x5e, 0x62, 0x66,
0x6a, 0x6c, 0x6d, 0x6e, 0x6b, 0x69, 0x67, 0x64,
0x61, 0x5d, 0x58, 0x54, 0x50, 0x4a, 0x47, 0x46,
0x45, 0x45, 0x44, },
{ 0x1a, 0x21, 0x1e, 0x1f, 0x1f, 0x20, 0x22, 0x23,
0x25, 0x27, 0x2b, 0x2e, 0x31, 0x34, 0x37, 0x3b,
0x3d, 0x42, 0x45, 0x49, 0x4a, 0x4d, 0x4e, 0x4e,
0x51, 0x52, 0x50, 0x4f, 0x4f, 0x4c, 0x49, 0x48,
0x45, 0x42, 0x3e, 0x3b, 0x39, 0x36, 0x32, 0x32,
0x2f, 0x2c, 0x2a, 0x28, 0x26, 0x26, 0x25, 0x24,
0x23, 0x24, 0x24, 0x25, 0x25, 0x28, 0x29, 0x2b,
0x2d, 0x2f, 0x33, 0x35, 0x38, 0x3a, 0x3c, 0x3e,
0x3f, 0x3f, 0x41, 0x42, 0x41, 0x3f, 0x3e, 0x3c,
0x3c, 0x3a, 0x37, 0x33, 0x32, 0x2f, 0x2d, 0x2b,
0x28, 0x26, 0x25, 0x22, 0x20, 0x1e, 0x1d, 0x1b,
0x1a, 0x17, 0x17, 0x16, 0x14, 0x14, 0x12, 0x11,
0x12, 0x11, 0x11, 0x11, 0x11, 0x10, 0x10, 0x10,
0x10, 0x11, 0x12, 0x12, 0x12, 0x13, 0x14, 0x14,
0x16, 0x18, 0x19, 0x1a, 0x1b, 0x1d, 0x1e, 0x1f,
0x21, 0x22, 0x23, 0x25, 0x26, 0x26, 0x28, 0x2a,
0x2c, 0x2e, 0x32, 0x34, 0x39, 0x39, 0x3d, 0x41,
0x45, 0x47, 0x4c, 0x4e, 0x51, 0x54, 0x56, 0x58,
0x5b, 0x5c, 0x5a, 0x59, 0x58, 0x56, 0x55, 0x53,
0x53, 0x52, 0x52, 0x51, 0x52, 0x52, 0x53, 0x55,
0x57, 0x58, 0x5a, 0x5a, 0x59, 0x5b, 0x59, 0x59,
0x58, 0x57, 0x55, 0x53, 0x51, 0x4e, 0x4c, 0x4a,
0x48, 0x46, 0x43, 0x40, 0x3e, 0x3c, 0x3b, 0x3b,
0x38, 0x39, 0x38, 0x39, 0x3a, 0x3d, 0x3d, 0x3e,
0x3f, 0x40, 0x41, 0x43, 0x44, 0x45, 0x46, 0x48,
0x4a, 0x4b, 0x4d, 0x4e, 0x50, 0x52, 0x54, 0x56,
0x59, 0x5c, 0x5e, 0x5f, 0x60, 0x62, 0x62, 0x63,
0x63, 0x63, 0x61, 0x5f, 0x5e, 0x5d, 0x5c, 0x5b,
0x59, 0x56, 0x56, 0x55, 0x54, 0x53, 0x53, 0x54,
0x55, 0x54, 0x55, 0x55, 0x55, 0x57, 0x58, 0x57,
0x57, 0x56, 0x55, 0x54, 0x54, 0x52, 0x52, 0x53,
0x54, 0x55, 0x57, 0x58, 0x5b, 0x5e, 0x62, 0x65,
0x69, 0x6b, 0x6d, 0x6e, 0x6a, 0x69, 0x67, 0x63,
0x61, 0x5d, 0x58, 0x54, 0x4f, 0x4b, 0x48, 0x47,
0x46, 0x45, 0x45, },
{ 0x1a, 0x21, 0x1e, 0x1f, 0x1f, 0x20, 0x22, 0x23,
0x25, 0x27, 0x2b, 0x2d, 0x31, 0x34, 0x37, 0x3b,
0x3d, 0x42, 0x45, 0x48, 0x4c, 0x4e, 0x4e, 0x4f,
0x51, 0x52, 0x50, 0x50, 0x4f, 0x4c, 0x4a, 0x48,
0x45, 0x42, 0x3f, 0x3b, 0x39, 0x36, 0x32, 0x31,
0x2f, 0x2c, 0x2a, 0x28, 0x26, 0x26, 0x25, 0x24,
0x23, 0x24, 0x24, 0x25, 0x27, 0x28, 0x29, 0x2b,
0x2d, 0x30, 0x33, 0x36, 0x39, 0x3b, 0x3d, 0x3f,
0x3f, 0x40, 0x42, 0x43, 0x42, 0x40, 0x3e, 0x3c,
0x3c, 0x3a, 0x37, 0x34, 0x32, 0x2f, 0x2d, 0x2c,
0x2a, 0x27, 0x26, 0x23, 0x20, 0x1e, 0x1d, 0x1c,
0x1a, 0x18, 0x18, 0x17, 0x15, 0x16, 0x14, 0x12,
0x12, 0x12, 0x12, 0x12, 0x12, 0x11, 0x11, 0x12,
0x12, 0x12, 0x13, 0x14, 0x14, 0x14, 0x15, 0x16,
0x17, 0x19, 0x1b, 0x1c, 0x1e, 0x20, 0x20, 0x22,
0x24, 0x25, 0x26, 0x27, 0x28, 0x2a, 0x2c, 0x2c,
0x2f, 0x32, 0x35, 0x37, 0x3b, 0x3c, 0x41, 0x45,
0x48, 0x4c, 0x50, 0x52, 0x54, 0x57, 0x5a, 0x5c,
0x5f, 0x5f, 0x5f, 0x5d, 0x5c, 0x5b, 0x5a, 0x58,
0x57, 0x57, 0x57, 0x56, 0x56, 0x57, 0x57, 0x5a,
0x5c, 0x5e, 0x5f, 0x61, 0x5f, 0x5f, 0x5f, 0x5e,
0x5d, 0x5c, 0x5a, 0x57, 0x55, 0x52, 0x4f, 0x4e,
0x4a, 0x47, 0x46, 0x42, 0x41, 0x3e, 0x3d, 0x3c,
0x3b, 0x3a, 0x39, 0x39, 0x3b, 0x3c, 0x3d, 0x3f,
0x40, 0x42, 0x42, 0x44, 0x45, 0x46, 0x49, 0x49,
0x4b, 0x4c, 0x4e, 0x4f, 0x51, 0x54, 0x57, 0x58,
0x5b, 0x5d, 0x61, 0x61, 0x61, 0x63, 0x65, 0x65,
0x64, 0x64, 0x62, 0x61, 0x60, 0x5e, 0x5d, 0x5c,
0x59, 0x58, 0x56, 0x54, 0x53, 0x53, 0x53, 0x54,
0x54, 0x53, 0x53, 0x54, 0x54, 0x54, 0x55, 0x55,
0x56, 0x55, 0x54, 0x53, 0x53, 0x52, 0x52, 0x53,
0x55, 0x56, 0x57, 0x58, 0x5b, 0x5e, 0x62, 0x66,
0x69, 0x6b, 0x6d, 0x6d, 0x6b, 0x69, 0x67, 0x64,
0x61, 0x5d, 0x58, 0x55, 0x50, 0x4b, 0x48, 0x47,
0x46, 0x46, 0x46, },
{ 0x1a, 0x20, 0x1e, 0x1f, 0x1f, 0x21, 0x22, 0x23,
0x25, 0x27, 0x2b, 0x2d, 0x31, 0x34, 0x37, 0x3b,
0x3d, 0x42, 0x45, 0x48, 0x4c, 0x4e, 0x4f, 0x4f,
0x51, 0x52, 0x51, 0x50, 0x4e, 0x4b, 0x4a, 0x48,
0x45, 0x42, 0x3f, 0x3b, 0x38, 0x36, 0x32, 0x31,
0x2f, 0x2c, 0x2a, 0x28, 0x26, 0x26, 0x25, 0x24,
0x23, 0x24, 0x24, 0x25, 0x27, 0x28, 0x29, 0x2b,
0x2e, 0x30, 0x33, 0x36, 0x39, 0x3b, 0x3d, 0x3f,
0x3f, 0x40, 0x41, 0x42, 0x41, 0x40, 0x3e, 0x3c,
0x3c, 0x3a, 0x37, 0x34, 0x33, 0x30, 0x2e, 0x2b,
0x29, 0x26, 0x24, 0x24, 0x20, 0x1f, 0x1d, 0x1d,
0x1a, 0x19, 0x17, 0x16, 0x16, 0x16, 0x16, 0x14,
0x13, 0x12, 0x13, 0x13, 0x13, 0x12, 0x12, 0x13,
0x13, 0x14, 0x15, 0x15, 0x14, 0x15, 0x16, 0x18,
0x19, 0x1b, 0x1c, 0x1e, 0x20, 0x21, 0x22, 0x24,
0x27, 0x28, 0x29, 0x2a, 0x2c, 0x2c, 0x2d, 0x2f,
0x32, 0x35, 0x37, 0x3a, 0x3c, 0x3e, 0x44, 0x48,
0x4c, 0x50, 0x54, 0x56, 0x58, 0x5b, 0x5e, 0x60,
0x61, 0x63, 0x62, 0x61, 0x60, 0x5f, 0x5e, 0x5e,
0x5c, 0x5c, 0x5b, 0x5a, 0x5a, 0x5b, 0x5c, 0x5e,
0x60, 0x63, 0x64, 0x65, 0x63, 0x62, 0x63, 0x63,
0x61, 0x60, 0x5e, 0x5b, 0x58, 0x55, 0x51, 0x4f,
0x4c, 0x4a, 0x47, 0x44, 0x42, 0x41, 0x3e, 0x3c,
0x3b, 0x3a, 0x3a, 0x3b, 0x3b, 0x3c, 0x3e, 0x3f,
0x40, 0x42, 0x43, 0x45, 0x46, 0x47, 0x49, 0x4a,
0x4c, 0x4c, 0x4f, 0x51, 0x52, 0x55, 0x58, 0x5b,
0x5c, 0x5f, 0x61, 0x62, 0x63, 0x64, 0x64, 0x65,
0x66, 0x65, 0x63, 0x62, 0x5f, 0x5e, 0x5e, 0x5c,
0x5b, 0x58, 0x56, 0x55, 0x54, 0x53, 0x52, 0x53,
0x52, 0x52, 0x52, 0x52, 0x52, 0x53, 0x55, 0x55,
0x55, 0x53, 0x53, 0x53, 0x52, 0x51, 0x52, 0x52,
0x55, 0x55, 0x58, 0x58, 0x5b, 0x5d, 0x61, 0x65,
0x68, 0x6a, 0x6c, 0x6b, 0x69, 0x68, 0x67, 0x64,
0x61, 0x5e, 0x58, 0x54, 0x4f, 0x4b, 0x49, 0x48,
0x47, 0x46, 0x45, },
{ 0x19, 0x20, 0x1d, 0x1f, 0x1f, 0x20, 0x23, 0x23,
0x25, 0x27, 0x2b, 0x2d, 0x31, 0x34, 0x37, 0x3b,
0x3d, 0x42, 0x45, 0x48, 0x4c, 0x4e, 0x4f, 0x4f,
0x51, 0x52, 0x51, 0x50, 0x4e, 0x4b, 0x4a, 0x48,
0x44, 0x42, 0x3f, 0x3a, 0x38, 0x36, 0x32, 0x30,
0x2f, 0x2c, 0x2a, 0x28, 0x26, 0x26, 0x25, 0x24,
0x23, 0x24, 0x24, 0x25, 0x26, 0x28, 0x29, 0x2b,
0x2e, 0x30, 0x34, 0x36, 0x39, 0x3b, 0x3d, 0x3f,
0x3f, 0x40, 0x41, 0x42, 0x41, 0x40, 0x3e, 0x3c,
0x3c, 0x3a, 0x37, 0x34, 0x33, 0x30, 0x2e, 0x2b,
0x29, 0x27, 0x25, 0x24, 0x21, 0x1f, 0x1e, 0x1c,
0x1b, 0x19, 0x17, 0x16, 0x16, 0x16, 0x16, 0x14,
0x13, 0x12, 0x13, 0x13, 0x13, 0x13, 0x13, 0x13,
0x13, 0x14, 0x15, 0x14, 0x14, 0x14, 0x17, 0x19,
0x1a, 0x1c, 0x1e, 0x20, 0x21, 0x23, 0x24, 0x26,
0x29, 0x29, 0x2b, 0x2c, 0x2d, 0x2e, 0x30, 0x31,
0x34, 0x38, 0x3b, 0x3c, 0x3f, 0x42, 0x47, 0x4c,
0x50, 0x54, 0x57, 0x5b, 0x5c, 0x5e, 0x62, 0x63,
0x66, 0x66, 0x66, 0x65, 0x64, 0x63, 0x61, 0x62,
0x60, 0x60, 0x5f, 0x5e, 0x5e, 0x5f, 0x60, 0x62,
0x65, 0x67, 0x69, 0x6a, 0x69, 0x68, 0x69, 0x67,
0x66, 0x64, 0x62, 0x5f, 0x5c, 0x58, 0x54, 0x51,
0x4e, 0x4b, 0x49, 0x45, 0x43, 0x41, 0x40, 0x3e,
0x3c, 0x3a, 0x3b, 0x3b, 0x3c, 0x3d, 0x3e, 0x3f,
0x41, 0x42, 0x44, 0x46, 0x46, 0x48, 0x49, 0x4b,
0x4d, 0x50, 0x51, 0x53, 0x55, 0x57, 0x58, 0x5c,
0x5f, 0x60, 0x63, 0x64, 0x64, 0x65, 0x66, 0x66,
0x66, 0x65, 0x65, 0x63, 0x61, 0x5f, 0x5e, 0x5c,
0x5a, 0x58, 0x56, 0x55, 0x54, 0x53, 0x52, 0x52,
0x53, 0x52, 0x52, 0x52, 0x52, 0x53, 0x53, 0x53,
0x54, 0x53, 0x53, 0x52, 0x53, 0x51, 0x53, 0x53,
0x55, 0x57, 0x58, 0x59, 0x5b, 0x5d, 0x62, 0x64,
0x68, 0x6a, 0x6c, 0x6b, 0x69, 0x68, 0x67, 0x64,
0x61, 0x5d, 0x57, 0x54, 0x50, 0x4a, 0x48, 0x47,
0x46, 0x45, 0x45, },
@@ -0,0 +1,294 @@
/*
* Copyright(c) 2021 Qualcomm Innovation Center, Inc. All Rights Reserved.
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, see <http://www.gnu.org/licenses/>.
*/
/*
* void hvx_histogram_row(uint8_t *src, => r0
* int stride, => r1
* int width, => r2
* int height, => r3
* int *hist => r4)
*/
.text
.p2align 2
.global hvx_histogram_row
.type hvx_histogram_row, @function
hvx_histogram_row:
{ r2 = lsr(r2, #7) /* width / VLEN */
r5 = and(r2, #127) /* width % VLEN */
v1 = #0
v0 = #0
}
/*
* Step 1: Clean the whole vector register file
*/
{ v3:2 = v1:0
v5:4 = v1:0
p0 = cmp.gt(r2, #0) /* P0 = (width / VLEN > 0) */
p1 = cmp.eq(r5, #0) /* P1 = (width % VLEN == 0) */
}
{ q0 = vsetq(r5)
v7:6 = v1:0
}
{ v9:8 = v1:0
v11:10 = v1:0
}
{ v13:12 = v1:0
v15:14 = v1:0
}
{ v17:16 = v1:0
v19:18 = v1:0
}
{ v21:20 = v1:0
v23:22 = v1:0
}
{ v25:24 = v1:0
v27:26 = v1:0
}
{ v29:28 = v1:0
v31:30 = v1:0
r10 = add(r0, r1) /* R10 = &src[2 * stride] */
loop1(.outerloop, r3)
}
/*
* Step 2: vhist
*/
.falign
.outerloop:
{ if (!p0) jump .loopend
loop0(.innerloop, r2)
}
.falign
.innerloop:
{ v12.tmp = vmem(r0++#1)
vhist
}:endloop0
.falign
.loopend:
if (p1) jump .skip /* if (width % VLEN == 0) done with current row */
{ v13.tmp = vmem(r0 + #0)
vhist(q0)
}
.falign
.skip:
{ r0 = r10 /* R0 = &src[(i + 1) * stride] */
r10 = add(r10, r1) /* R10 = &src[(i + 2) * stride] */
}:endloop1
/*
* Step 3: Sum up the data
*/
{ v0.h = vshuff(v0.h)
r10 = ##0x00010001
}
v1.h = vshuff(v1.h)
{ v2.h = vshuff(v2.h)
v0.w = vdmpy(v0.h, r10.h):sat
}
{ v3.h = vshuff(v3.h)
v1.w = vdmpy(v1.h, r10.h):sat
}
{ v4.h = vshuff(v4.h)
v2.w = vdmpy(v2.h, r10.h):sat
}
{ v5.h = vshuff(v5.h)
v3.w = vdmpy(v3.h, r10.h):sat
}
{ v6.h = vshuff(v6.h)
v4.w = vdmpy(v4.h, r10.h):sat
}
{ v7.h = vshuff(v7.h)
v5.w = vdmpy(v5.h, r10.h):sat
}
{ v8.h = vshuff(v8.h)
v6.w = vdmpy(v6.h, r10.h):sat
}
{ v9.h = vshuff(v9.h)
v7.w = vdmpy(v7.h, r10.h):sat
}
{ v10.h = vshuff(v10.h)
v8.w = vdmpy(v8.h, r10.h):sat
}
{ v11.h = vshuff(v11.h)
v9.w = vdmpy(v9.h, r10.h):sat
}
{ v12.h = vshuff(v12.h)
v10.w = vdmpy(v10.h, r10.h):sat
}
{ v13.h = vshuff(v13.h)
v11.w = vdmpy(v11.h, r10.h):sat
}
{ v14.h = vshuff(v14.h)
v12.w = vdmpy(v12.h, r10.h):sat
}
{ v15.h = vshuff(v15.h)
v13.w = vdmpy(v13.h, r10.h):sat
}
{ v16.h = vshuff(v16.h)
v14.w = vdmpy(v14.h, r10.h):sat
}
{ v17.h = vshuff(v17.h)
v15.w = vdmpy(v15.h, r10.h):sat
}
{ v18.h = vshuff(v18.h)
v16.w = vdmpy(v16.h, r10.h):sat
}
{ v19.h = vshuff(v19.h)
v17.w = vdmpy(v17.h, r10.h):sat
}
{ v20.h = vshuff(v20.h)
v18.w = vdmpy(v18.h, r10.h):sat
}
{ v21.h = vshuff(v21.h)
v19.w = vdmpy(v19.h, r10.h):sat
}
{ v22.h = vshuff(v22.h)
v20.w = vdmpy(v20.h, r10.h):sat
}
{ v23.h = vshuff(v23.h)
v21.w = vdmpy(v21.h, r10.h):sat
}
{ v24.h = vshuff(v24.h)
v22.w = vdmpy(v22.h, r10.h):sat
}
{ v25.h = vshuff(v25.h)
v23.w = vdmpy(v23.h, r10.h):sat
}
{ v26.h = vshuff(v26.h)
v24.w = vdmpy(v24.h, r10.h):sat
}
{ v27.h = vshuff(v27.h)
v25.w = vdmpy(v25.h, r10.h):sat
}
{ v28.h = vshuff(v28.h)
v26.w = vdmpy(v26.h, r10.h):sat
}
{ v29.h = vshuff(v29.h)
v27.w = vdmpy(v27.h, r10.h):sat
}
{ v30.h = vshuff(v30.h)
v28.w = vdmpy(v28.h, r10.h):sat
}
{ v31.h = vshuff(v31.h)
v29.w = vdmpy(v29.h, r10.h):sat
r28 = #32
}
{ vshuff(v1, v0, r28)
v30.w = vdmpy(v30.h, r10.h):sat
}
{ vshuff(v3, v2, r28)
v31.w = vdmpy(v31.h, r10.h):sat
}
{ vshuff(v5, v4, r28)
v0.w = vadd(v1.w, v0.w)
v2.w = vadd(v3.w, v2.w)
}
{ vshuff(v7, v6, r28)
r7 = #64
}
{ vshuff(v9, v8, r28)
v4.w = vadd(v5.w, v4.w)
v6.w = vadd(v7.w, v6.w)
}
vshuff(v11, v10, r28)
{ vshuff(v13, v12, r28)
v8.w = vadd(v9.w, v8.w)
v10.w = vadd(v11.w, v10.w)
}
vshuff(v15, v14, r28)
{ vshuff(v17, v16, r28)
v12.w = vadd(v13.w, v12.w)
v14.w = vadd(v15.w, v14.w)
}
vshuff(v19, v18, r28)
{ vshuff(v21, v20, r28)
v16.w = vadd(v17.w, v16.w)
v18.w = vadd(v19.w, v18.w)
}
vshuff(v23, v22, r28)
{ vshuff(v25, v24, r28)
v20.w = vadd(v21.w, v20.w)
v22.w = vadd(v23.w, v22.w)
}
vshuff(v27, v26, r28)
{ vshuff(v29, v28, r28)
v24.w = vadd(v25.w, v24.w)
v26.w = vadd(v27.w, v26.w)
}
vshuff(v31, v30, r28)
{ v28.w = vadd(v29.w, v28.w)
vshuff(v2, v0, r7)
}
{ v30.w = vadd(v31.w, v30.w)
vshuff(v6, v4, r7)
v0.w = vadd(v0.w, v2.w)
}
{ vshuff(v10, v8, r7)
v1.tmp = vmem(r4 + #0) /* update hist[0-31] */
v0.w = vadd(v0.w, v1.w)
vmem(r4++#1) = v0.new
}
{ vshuff(v14, v12, r7)
v4.w = vadd(v4.w, v6.w)
v8.w = vadd(v8.w, v10.w)
}
{ vshuff(v18, v16, r7)
v1.tmp = vmem(r4 + #0) /* update hist[32-63] */
v4.w = vadd(v4.w, v1.w)
vmem(r4++#1) = v4.new
}
{ vshuff(v22, v20, r7)
v12.w = vadd(v12.w, v14.w)
v16.w = vadd(v16.w, v18.w)
}
{ vshuff(v26, v24, r7)
v1.tmp = vmem(r4 + #0) /* update hist[64-95] */
v8.w = vadd(v8.w, v1.w)
vmem(r4++#1) = v8.new
}
{ vshuff(v30, v28, r7)
v1.tmp = vmem(r4 + #0) /* update hist[96-127] */
v12.w = vadd(v12.w, v1.w)
vmem(r4++#1) = v12.new
}
{ v20.w = vadd(v20.w, v22.w)
v1.tmp = vmem(r4 + #0) /* update hist[128-159] */
v16.w = vadd(v16.w, v1.w)
vmem(r4++#1) = v16.new
}
{ v24.w = vadd(v24.w, v26.w)
v1.tmp = vmem(r4 + #0) /* update hist[160-191] */
v20.w = vadd(v20.w, v1.w)
vmem(r4++#1) = v20.new
}
{ v28.w = vadd(v28.w, v30.w)
v1.tmp = vmem(r4 + #0) /* update hist[192-223] */
v24.w = vadd(v24.w, v1.w)
vmem(r4++#1) = v24.new
}
{ v1.tmp = vmem(r4 + #0) /* update hist[224-255] */
v28.w = vadd(v28.w, v1.w)
vmem(r4++#1) = v28.new
}
jumpr r31
.size hvx_histogram_row, .-hvx_histogram_row

@@ -0,0 +1,24 @@
/*
 * Copyright(c) 2021 Qualcomm Innovation Center, Inc. All Rights Reserved.
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, see <http://www.gnu.org/licenses/>.
 */
#ifndef HVX_HISTOGRAM_ROW_H
#define HVX_HISTOGRAM_ROW_H

#include <stdint.h>    /* for uint8_t in the prototype below */

void hvx_histogram_row(uint8_t *src, int stride, int width, int height,
                       int *hist);

#endif

@@ -0,0 +1,469 @@
/*
 * Copyright(c) 2021 Qualcomm Innovation Center, Inc. All Rights Reserved.
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, see <http://www.gnu.org/licenses/>.
 */
#include <stdio.h>
#include <stdint.h>
#include <stdbool.h>
#include <string.h>

int err;

static void __check(int line, int i, int j, uint64_t result, uint64_t expect)
{
    if (result != expect) {
        printf("ERROR at line %d: [%d][%d] 0x%016llx != 0x%016llx\n",
               line, i, j, result, expect);
        err++;
    }
}

/* __check() takes array indices; pass zeros when checking a scalar value */
#define check(RES, EXP) __check(__LINE__, 0, 0, (RES), (EXP))
#define MAX_VEC_SIZE_BYTES 128

typedef union {
    uint64_t ud[MAX_VEC_SIZE_BYTES / 8];
    int64_t   d[MAX_VEC_SIZE_BYTES / 8];
    uint32_t uw[MAX_VEC_SIZE_BYTES / 4];
    int32_t   w[MAX_VEC_SIZE_BYTES / 4];
    uint16_t uh[MAX_VEC_SIZE_BYTES / 2];
    int16_t   h[MAX_VEC_SIZE_BYTES / 2];
    uint8_t  ub[MAX_VEC_SIZE_BYTES / 1];
    int8_t    b[MAX_VEC_SIZE_BYTES / 1];
} MMVector;
#define BUFSIZE 16
#define OUTSIZE 16
#define MASKMOD 3
MMVector buffer0[BUFSIZE] __attribute__((aligned(MAX_VEC_SIZE_BYTES)));
MMVector buffer1[BUFSIZE] __attribute__((aligned(MAX_VEC_SIZE_BYTES)));
MMVector mask[BUFSIZE] __attribute__((aligned(MAX_VEC_SIZE_BYTES)));
MMVector output[OUTSIZE] __attribute__((aligned(MAX_VEC_SIZE_BYTES)));
MMVector expect[OUTSIZE] __attribute__((aligned(MAX_VEC_SIZE_BYTES)));
#define CHECK_OUTPUT_FUNC(FIELD, FIELDSZ) \
static void check_output_##FIELD(int line, size_t num_vectors) \
{ \
    for (int i = 0; i < num_vectors; i++) { \
        for (int j = 0; j < MAX_VEC_SIZE_BYTES / FIELDSZ; j++) { \
            __check(line, i, j, output[i].FIELD[j], expect[i].FIELD[j]); \
        } \
    } \
}

CHECK_OUTPUT_FUNC(d, 8)
CHECK_OUTPUT_FUNC(w, 4)
CHECK_OUTPUT_FUNC(h, 2)
CHECK_OUTPUT_FUNC(b, 1)

static void init_buffers(void)
{
    int counter0 = 0;
    int counter1 = 17;
    for (int i = 0; i < BUFSIZE; i++) {
        for (int j = 0; j < MAX_VEC_SIZE_BYTES; j++) {
            buffer0[i].b[j] = counter0++;
            buffer1[i].b[j] = counter1++;
        }
        for (int j = 0; j < MAX_VEC_SIZE_BYTES / 4; j++) {
            mask[i].w[j] = (i + j % MASKMOD == 0) ? 0 : 1;
        }
    }
}
static void test_load_tmp(void)
{
    void *p0 = buffer0;
    void *p1 = buffer1;
    void *pout = output;

    for (int i = 0; i < BUFSIZE; i++) {
        /*
         * Load into v12 as .tmp, then use it in the next packet
         * Should get the new value within the same packet and
         * the old value in the next packet
         */
        asm("v3 = vmem(%0 + #0)\n\t"
            "r1 = #1\n\t"
            "v12 = vsplat(r1)\n\t"
            "{\n\t"
            "    v12.tmp = vmem(%1 + #0)\n\t"
            "    v4.w = vadd(v12.w, v3.w)\n\t"
            "}\n\t"
            "v4.w = vadd(v4.w, v12.w)\n\t"
            "vmem(%2 + #0) = v4\n\t"
            : : "r"(p0), "r"(p1), "r"(pout)
            : "r1", "v12", "v3", "v4", "v6", "memory");
        p0 += sizeof(MMVector);
        p1 += sizeof(MMVector);
        pout += sizeof(MMVector);

        for (int j = 0; j < MAX_VEC_SIZE_BYTES / 4; j++) {
            expect[i].w[j] = buffer0[i].w[j] + buffer1[i].w[j] + 1;
        }
    }

    check_output_w(__LINE__, BUFSIZE);
}
static void test_load_cur(void)
{
    void *p0 = buffer0;
    void *pout = output;

    for (int i = 0; i < BUFSIZE; i++) {
        asm("{\n\t"
            "    v2.cur = vmem(%0 + #0)\n\t"
            "    vmem(%1 + #0) = v2\n\t"
            "}\n\t"
            : : "r"(p0), "r"(pout) : "v2", "memory");
        p0 += sizeof(MMVector);
        pout += sizeof(MMVector);

        for (int j = 0; j < MAX_VEC_SIZE_BYTES / 4; j++) {
            expect[i].uw[j] = buffer0[i].uw[j];
        }
    }

    check_output_w(__LINE__, BUFSIZE);
}

static void test_load_aligned(void)
{
    /* Aligned loads ignore the low bits of the address */
    void *p0 = buffer0;
    void *pout = output;
    const size_t offset = 13;

    p0 += offset;    /* Create an unaligned address */

    asm("v2 = vmem(%0 + #0)\n\t"
        "vmem(%1 + #0) = v2\n\t"
        : : "r"(p0), "r"(pout) : "v2", "memory");

    expect[0] = buffer0[0];

    check_output_w(__LINE__, 1);
}
static void test_load_unaligned(void)
{
    void *p0 = buffer0;
    void *pout = output;
    const size_t offset = 12;

    p0 += offset;    /* Create an unaligned address */

    asm("v2 = vmemu(%0 + #0)\n\t"
        "vmem(%1 + #0) = v2\n\t"
        : : "r"(p0), "r"(pout) : "v2", "memory");

    memcpy(expect, &buffer0[0].ub[offset], sizeof(MMVector));

    check_output_w(__LINE__, 1);
}

static void test_store_aligned(void)
{
    /* Aligned stores ignore the low bits of the address */
    void *p0 = buffer0;
    void *pout = output;
    const size_t offset = 13;

    pout += offset;    /* Create an unaligned address */

    asm("v2 = vmem(%0 + #0)\n\t"
        "vmem(%1 + #0) = v2\n\t"
        : : "r"(p0), "r"(pout) : "v2", "memory");

    expect[0] = buffer0[0];

    check_output_w(__LINE__, 1);
}

static void test_store_unaligned(void)
{
    void *p0 = buffer0;
    void *pout = output;
    const size_t offset = 12;

    pout += offset;    /* Create an unaligned address */

    asm("v2 = vmem(%0 + #0)\n\t"
        "vmemu(%1 + #0) = v2\n\t"
        : : "r"(p0), "r"(pout) : "v2", "memory");

    memcpy(expect, buffer0, 2 * sizeof(MMVector));
    memcpy(&expect[0].ub[offset], buffer0, sizeof(MMVector));

    check_output_w(__LINE__, 2);
}
static void test_masked_store(bool invert)
{
    void *p0 = buffer0;
    void *pmask = mask;
    void *pout = output;

    memset(expect, 0xff, sizeof(expect));
    memset(output, 0xff, sizeof(expect));

    for (int i = 0; i < BUFSIZE; i++) {
        if (invert) {
            asm("r4 = #0\n\t"
                "v4 = vsplat(r4)\n\t"
                "v5 = vmem(%0 + #0)\n\t"
                "q0 = vcmp.eq(v4.w, v5.w)\n\t"
                "v5 = vmem(%1)\n\t"
                "if (!q0) vmem(%2) = v5\n\t"    /* Inverted test */
                : : "r"(pmask), "r"(p0), "r"(pout)
                : "r4", "v4", "v5", "q0", "memory");
        } else {
            asm("r4 = #0\n\t"
                "v4 = vsplat(r4)\n\t"
                "v5 = vmem(%0 + #0)\n\t"
                "q0 = vcmp.eq(v4.w, v5.w)\n\t"
                "v5 = vmem(%1)\n\t"
                "if (q0) vmem(%2) = v5\n\t"     /* Non-inverted test */
                : : "r"(pmask), "r"(p0), "r"(pout)
                : "r4", "v4", "v5", "q0", "memory");
        }
        p0 += sizeof(MMVector);
        pmask += sizeof(MMVector);
        pout += sizeof(MMVector);

        for (int j = 0; j < MAX_VEC_SIZE_BYTES / 4; j++) {
            if (invert) {
                if (i + j % MASKMOD != 0) {
                    expect[i].w[j] = buffer0[i].w[j];
                }
            } else {
                if (i + j % MASKMOD == 0) {
                    expect[i].w[j] = buffer0[i].w[j];
                }
            }
        }
    }

    check_output_w(__LINE__, BUFSIZE);
}
static void test_new_value_store(void)
{
    void *p0 = buffer0;
    void *pout = output;

    asm("{\n\t"
        "    v2 = vmem(%0 + #0)\n\t"
        "    vmem(%1 + #0) = v2.new\n\t"
        "}\n\t"
        : : "r"(p0), "r"(pout) : "v2", "memory");

    expect[0] = buffer0[0];

    check_output_w(__LINE__, 1);
}

static void test_max_temps()
{
    void *p0 = buffer0;
    void *pout = output;

    asm("v0 = vmem(%0 + #0)\n\t"
        "v1 = vmem(%0 + #1)\n\t"
        "v2 = vmem(%0 + #2)\n\t"
        "v3 = vmem(%0 + #3)\n\t"
        "v4 = vmem(%0 + #4)\n\t"
        "{\n\t"
        "    v1:0.w = vadd(v3:2.w, v1:0.w)\n\t"
        "    v2.b = vshuffe(v3.b, v2.b)\n\t"
        "    v3.w = vadd(v1.w, v4.w)\n\t"
        "    v4.tmp = vmem(%0 + #5)\n\t"
        "}\n\t"
        "vmem(%1 + #0) = v0\n\t"
        "vmem(%1 + #1) = v1\n\t"
        "vmem(%1 + #2) = v2\n\t"
        "vmem(%1 + #3) = v3\n\t"
        "vmem(%1 + #4) = v4\n\t"
        : : "r"(p0), "r"(pout) : "memory");

    /* The first two vectors come from the vadd-pair instruction */
    for (int i = 0; i < MAX_VEC_SIZE_BYTES / 4; i++) {
        expect[0].w[i] = buffer0[0].w[i] + buffer0[2].w[i];
        expect[1].w[i] = buffer0[1].w[i] + buffer0[3].w[i];
    }

    /* The third vector comes from the vshuffe instruction */
    for (int i = 0; i < MAX_VEC_SIZE_BYTES / 2; i++) {
        expect[2].uh[i] = (buffer0[2].uh[i] & 0xff) |
                          (buffer0[3].uh[i] & 0xff) << 8;
    }

    /* The fourth vector comes from the vadd-single instruction */
    for (int i = 0; i < MAX_VEC_SIZE_BYTES / 4; i++) {
        expect[3].w[i] = buffer0[1].w[i] + buffer0[5].w[i];
    }

    /*
     * The fifth vector comes from the load to v4
     * make sure the .tmp is dropped
     */
    expect[4] = buffer0[4];

    check_output_b(__LINE__, 5);
}
#define VEC_OP1(ASM, EL, IN, OUT) \
    asm("v2 = vmem(%0 + #0)\n\t" \
        "v2" #EL " = " #ASM "(v2" #EL ")\n\t" \
        "vmem(%1 + #0) = v2\n\t" \
        : : "r"(IN), "r"(OUT) : "v2", "memory")

#define VEC_OP2(ASM, EL, IN0, IN1, OUT) \
    asm("v2 = vmem(%0 + #0)\n\t" \
        "v3 = vmem(%1 + #0)\n\t" \
        "v2" #EL " = " #ASM "(v2" #EL ", v3" #EL ")\n\t" \
        "vmem(%2 + #0) = v2\n\t" \
        : : "r"(IN0), "r"(IN1), "r"(OUT) : "v2", "v3", "memory")

#define TEST_VEC_OP1(NAME, ASM, EL, FIELD, FIELDSZ, OP) \
static void test_##NAME(void) \
{ \
    void *pin = buffer0; \
    void *pout = output; \
    for (int i = 0; i < BUFSIZE; i++) { \
        VEC_OP1(ASM, EL, pin, pout); \
        pin += sizeof(MMVector); \
        pout += sizeof(MMVector); \
    } \
    for (int i = 0; i < BUFSIZE; i++) { \
        for (int j = 0; j < MAX_VEC_SIZE_BYTES / FIELDSZ; j++) { \
            expect[i].FIELD[j] = OP buffer0[i].FIELD[j]; \
        } \
    } \
    check_output_##FIELD(__LINE__, BUFSIZE); \
}

#define TEST_VEC_OP2(NAME, ASM, EL, FIELD, FIELDSZ, OP) \
static void test_##NAME(void) \
{ \
    void *p0 = buffer0; \
    void *p1 = buffer1; \
    void *pout = output; \
    for (int i = 0; i < BUFSIZE; i++) { \
        VEC_OP2(ASM, EL, p0, p1, pout); \
        p0 += sizeof(MMVector); \
        p1 += sizeof(MMVector); \
        pout += sizeof(MMVector); \
    } \
    for (int i = 0; i < BUFSIZE; i++) { \
        for (int j = 0; j < MAX_VEC_SIZE_BYTES / FIELDSZ; j++) { \
            expect[i].FIELD[j] = buffer0[i].FIELD[j] OP buffer1[i].FIELD[j]; \
        } \
    } \
    check_output_##FIELD(__LINE__, BUFSIZE); \
}
#define THRESHOLD 31

#define PRED_OP2(ASM, IN0, IN1, OUT, INV) \
    asm("r4 = #%3\n\t" \
        "v1.b = vsplat(r4)\n\t" \
        "v2 = vmem(%0 + #0)\n\t" \
        "q0 = vcmp.gt(v2.b, v1.b)\n\t" \
        "v3 = vmem(%1 + #0)\n\t" \
        "q1 = vcmp.gt(v3.b, v1.b)\n\t" \
        "q2 = " #ASM "(q0, " INV "q1)\n\t" \
        "r4 = #0xff\n\t" \
        "v1.b = vsplat(r4)\n\t" \
        "if (q2) vmem(%2 + #0) = v1\n\t" \
        : : "r"(IN0), "r"(IN1), "r"(OUT), "i"(THRESHOLD) \
        : "r4", "v1", "v2", "v3", "q0", "q1", "q2", "memory")

#define TEST_PRED_OP2(NAME, ASM, OP, INV) \
static void test_##NAME(bool invert) \
{ \
    void *p0 = buffer0; \
    void *p1 = buffer1; \
    void *pout = output; \
    memset(output, 0, sizeof(expect)); \
    for (int i = 0; i < BUFSIZE; i++) { \
        PRED_OP2(ASM, p0, p1, pout, INV); \
        p0 += sizeof(MMVector); \
        p1 += sizeof(MMVector); \
        pout += sizeof(MMVector); \
    } \
    for (int i = 0; i < BUFSIZE; i++) { \
        for (int j = 0; j < MAX_VEC_SIZE_BYTES; j++) { \
            bool p0 = (buffer0[i].b[j] > THRESHOLD); \
            bool p1 = (buffer1[i].b[j] > THRESHOLD); \
            if (invert) { \
                expect[i].b[j] = (p0 OP !p1) ? 0xff : 0x00; \
            } else { \
                expect[i].b[j] = (p0 OP p1) ? 0xff : 0x00; \
            } \
        } \
    } \
    check_output_b(__LINE__, BUFSIZE); \
}
TEST_VEC_OP2(vadd_w, vadd, .w, w, 4, +)
TEST_VEC_OP2(vadd_h, vadd, .h, h, 2, +)
TEST_VEC_OP2(vadd_b, vadd, .b, b, 1, +)
TEST_VEC_OP2(vsub_w, vsub, .w, w, 4, -)
TEST_VEC_OP2(vsub_h, vsub, .h, h, 2, -)
TEST_VEC_OP2(vsub_b, vsub, .b, b, 1, -)
TEST_VEC_OP2(vxor, vxor, , d, 8, ^)
TEST_VEC_OP2(vand, vand, , d, 8, &)
TEST_VEC_OP2(vor, vor, , d, 8, |)
TEST_VEC_OP1(vnot, vnot, , d, 8, ~)
TEST_PRED_OP2(pred_or, or, |, "")
TEST_PRED_OP2(pred_or_n, or, |, "!")
TEST_PRED_OP2(pred_and, and, &, "")
TEST_PRED_OP2(pred_and_n, and, &, "!")
TEST_PRED_OP2(pred_xor, xor, ^, "")
int main()
{
    init_buffers();

    test_load_tmp();
    test_load_cur();
    test_load_aligned();
    test_load_unaligned();
    test_store_aligned();
    test_store_unaligned();
    test_masked_store(false);
    test_masked_store(true);
    test_new_value_store();
    test_max_temps();

    test_vadd_w();
    test_vadd_h();
    test_vadd_b();
    test_vsub_w();
    test_vsub_h();
    test_vsub_b();
    test_vxor();
    test_vand();
    test_vor();
    test_vnot();

    test_pred_or(false);
    test_pred_or_n(true);
    test_pred_and(false);
    test_pred_and_n(true);
    test_pred_xor(false);

    puts(err ? "FAIL" : "PASS");
    return err ? 1 : 0;
}

(file diff suppressed because it is too large)

@@ -0,0 +1,61 @@
/*
 * Copyright(c) 2019-2021 Qualcomm Innovation Center, Inc. All Rights Reserved.
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, see <http://www.gnu.org/licenses/>.
 */
#include <stdio.h>
int gA[401];
int gB[401];
int gC[401];
void vector_add_int()
{
    int i;
    for (i = 0; i < 400; i++) {
        gA[i] = gB[i] + gC[i];
    }
}

int main()
{
    int error = 0;
    int i;

    for (i = 0; i < 400; i++) {
        gB[i] = i * 2;
        gC[i] = i * 3;
    }
    gA[400] = 17;

    vector_add_int();

    for (i = 0; i < 400; i++) {
        if (gA[i] != i * 5) {
            error++;
            printf("ERROR: gB[%d] = %d\t", i, gB[i]);
            printf("gC[%d] = %d\t", i, gC[i]);
            printf("gA[%d] = %d\n", i, gA[i]);
        }
    }
    if (gA[400] != 17) {
        error++;
        printf("ERROR: Overran the buffer\n");
    }
    if (!error) {
        printf("PASS\n");
        return 0;
    } else {
        printf("FAIL\n");
        return 1;
    }
}