mirror of
https://github.com/Motorhead1991/qemu.git
synced 2025-08-05 08:43:55 -06:00
target/ppc: Move VPRTYB[WDQ] to decodetree and use gvec
Moved VPRTYBW and VPRTYBD to use gvec, and moved both of them and VPRTYBQ to decodetree. VPRTYBW and VPRTYBD now also use .fni4 and .fni8, respectively.

vprtybw:
rept    loop    master      patch
8       12500   0,01198900  0,00703100 (-41.4%)
25      4000    0,01070100  0,00571400 (-46.6%)
100     1000    0,01123300  0,00678200 (-39.6%)
500     200     0,01601500  0,01535600 (-4.1%)
2500    40      0,03872900  0,05562100 (43.6%)
8000    12      0,10047000  0,16643000 (65.7%)

vprtybd:
rept    loop    master      patch
8       12500   0,00757700  0,00788100 (4.0%)
25      4000    0,00652500  0,00669600 (2.6%)
100     1000    0,00714400  0,00825400 (15.5%)
500     200     0,01211000  0,01903700 (57.2%)
2500    40      0,03483800  0,07021200 (101.5%)
8000    12      0,09591800  0,21036200 (119.3%)

vprtybq:
rept    loop    master      patch
8       12500   0,00675600  0,00667200 (-1.2%)
25      4000    0,00619400  0,00643200 (3.8%)
100     1000    0,00707100  0,00751100 (6.2%)
500     200     0,01199300  0,01342000 (11.9%)
2500    40      0,03490900  0,04092900 (17.2%)
8000    12      0,09588200  0,11465100 (19.6%)

I wasn't expecting such a performance loss in both VPRTYBD and VPRTYBQ, and I'm not sure if it's worth moving those instructions. Comparing the assembly of the helper with the TCGop, they are pretty similar, so I'm not sure why vprtybd took so much more time.

Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221019125040.48028-6-lucas.araujo@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
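
For readers unfamiliar with the per-element style that .fni4 and .fni8 imply, here is a minimal plain-C model of the idea, not the TCG code from this commit. It applies a 32-bit or a 64-bit element function across a 128-bit value, using the same XOR-folding steps as the removed helpers. The names vec128, prtyb_w, prtyb_d, expand_u32 and expand_u64 are hypothetical and exist only for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 128-bit vector model; this is NOT QEMU's ppc_avr_t. */
typedef union {
    uint32_t u32[4];
    uint64_t u64[2];
} vec128;

/* XOR of the least significant bit of each of the 4 bytes in a word. */
uint32_t prtyb_w(uint32_t b)
{
    uint32_t res = b ^ (b >> 16);   /* fold upper halfword onto lower */
    res ^= res >> 8;                /* fold byte 1 onto byte 0 */
    return res & 1;                 /* keep only the parity bit */
}

/* XOR of the least significant bit of each of the 8 bytes in a doubleword. */
uint64_t prtyb_d(uint64_t b)
{
    uint64_t res = b ^ (b >> 32);
    res ^= res >> 16;
    res ^= res >> 8;
    return res & 1;
}

/* Stand-in for a per-32-bit-element expansion (assumed analogue of .fni4). */
void expand_u32(vec128 *r, const vec128 *b, uint32_t (*fn)(uint32_t))
{
    for (size_t i = 0; i < 4; i++) {
        r->u32[i] = fn(b->u32[i]);
    }
}

/* Stand-in for a per-64-bit-element expansion (assumed analogue of .fni8). */
void expand_u64(vec128 *r, const vec128 *b, uint64_t (*fn)(uint64_t))
{
    for (size_t i = 0; i < 2; i++) {
        r->u64[i] = fn(b->u64[i]);
    }
}

/* Small self-check: one byte with its low bit set gives parity 1. */
void prtyb_selfcheck(void)
{
    assert(prtyb_w(0x00000001u) == 1);
    assert(prtyb_d(0x0000000000000001ull) == 1);
}
```

Calling expand_u32(&r, &b, prtyb_w) fills each word of r with 0 or 1, which mirrors what the old helper_vprtybw loop produced element by element.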
parent 90b5aadb09
commit d57fbd8fd9
5 changed files with 71 additions and 33 deletions
@@ -492,31 +492,8 @@ static inline void set_vscr_sat(CPUPPCState *env)
     env->vscr_sat.u32[0] = 1;
 }
 
-/* vprtybw */
-void helper_vprtybw(ppc_avr_t *r, ppc_avr_t *b)
-{
-    int i;
-    for (i = 0; i < ARRAY_SIZE(r->u32); i++) {
-        uint64_t res = b->u32[i] ^ (b->u32[i] >> 16);
-        res ^= res >> 8;
-        r->u32[i] = res & 1;
-    }
-}
-
-/* vprtybd */
-void helper_vprtybd(ppc_avr_t *r, ppc_avr_t *b)
-{
-    int i;
-    for (i = 0; i < ARRAY_SIZE(r->u64); i++) {
-        uint64_t res = b->u64[i] ^ (b->u64[i] >> 32);
-        res ^= res >> 16;
-        res ^= res >> 8;
-        r->u64[i] = res & 1;
-    }
-}
-
 /* vprtybq */
-void helper_vprtybq(ppc_avr_t *r, ppc_avr_t *b)
+void helper_VPRTYBQ(ppc_avr_t *r, ppc_avr_t *b, uint32_t v)
 {
     uint64_t res = b->u64[0] ^ b->u64[1];
     res ^= res >> 32;
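
The hunk is cut off after the first two steps of the quadword helper. As a rough illustration of where that folding leads, the standalone sketch below XORs the two 64-bit halves and keeps folding down to a single bit. The final two folds are an assumption made by analogy with the removed vprtybd helper, not something shown in the diff, and prtyb_q and prtyb_q_ref are hypothetical names. The reference function checks the fold against a byte-by-byte XOR.

```c
#include <assert.h>
#include <stdint.h>

/* Folded parity of the low bit of all 16 bytes of a 128-bit value. */
uint64_t prtyb_q(uint64_t hi, uint64_t lo)
{
    uint64_t res = hi ^ lo;   /* shown in the diff: combine both halves */
    res ^= res >> 32;         /* shown in the diff */
    res ^= res >> 16;         /* assumed, by analogy with vprtybd */
    res ^= res >> 8;          /* assumed, by analogy with vprtybd */
    return res & 1;
}

/* Reference: XOR the least significant bit of each byte one at a time. */
uint64_t prtyb_q_ref(uint64_t hi, uint64_t lo)
{
    uint64_t parity = 0;
    for (int i = 0; i < 8; i++) {
        parity ^= (hi >> (8 * i)) & 1;
        parity ^= (lo >> (8 * i)) & 1;
    }
    return parity;
}

/* Compare the folded version with the reference on a few patterns. */
void prtyb_q_selfcheck(void)
{
    const uint64_t pat[] = {
        0x0ull, 0x1ull, 0x0101010101010101ull,
        0xFEFEFEFEFEFEFEFFull, 0x8000000000000100ull
    };
    const int n = (int)(sizeof(pat) / sizeof(pat[0]));
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            assert(prtyb_q(pat[i], pat[j]) == prtyb_q_ref(pat[i], pat[j]));
        }
    }
}
```

Folding works because XOR is associative and commutative: each shift-and-XOR step halves the number of byte positions that still need combining, so four steps cover sixteen bytes.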