plugins: optimize cpu_index code generation

When running with a single vcpu, we can return a constant instead of a
load when accessing cpu_index.
A side effect is that all tcg operations using it are optimized, most
notably scoreboard access.
When running a simple loop in user-mode, the speedup is around 20%.

Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20241128213843.1023080-1-pierrick.bouvier@linaro.org>
This commit is contained in:
Pierrick Bouvier 2024-11-28 13:38:43 -08:00 committed by Richard Henderson
parent 0ccbac336b
commit dbf408b667

View file

@ -102,6 +102,15 @@ static void gen_disable_mem_helper(void)
static TCGv_i32 gen_cpu_index(void)
{
/*
* Optimize when we run with a single vcpu. All values using cpu_index,
* including scoreboard index, will be optimized out.
* User-mode calls tb_flush when setting this flag. In system-mode, all
* vcpus are created before generating code.
*/
if (!tcg_cflags_has(current_cpu, CF_PARALLEL)) {
return tcg_constant_i32(current_cpu->cpu_index);
}
TCGv_i32 cpu_index = tcg_temp_ebb_new_i32();
tcg_gen_ld_i32(cpu_index, tcg_env,
-offsetof(ArchCPU, env) + offsetof(CPUState, cpu_index));