Mirror of https://github.com/Motorhead1991/qemu.git (synced 2025-12-24 00:18:36 -07:00)
This paves the way for enabling scalable parallel generation of TCG code.

Instead of tracking TBs with a single binary search tree (BST), use a BST for each TCG region, protecting it with a lock. This is as scalable as it gets, since each TCG thread operates on a separate region.

The core of this change is the introduction of struct tcg_region_tree, which contains a pointer to a GTree and an associated lock to serialize accesses to it. We then allocate an array of tcg_region_tree's, adding the appropriate padding to avoid false sharing based on qemu_dcache_linesize.

Given a tc_ptr, we first find the corresponding region_tree. This is done by special-casing the first and last regions first, since they might be of size != region.size; otherwise we just divide the offset by region.stride. I was worried about this division (several dozen cycles of latency), but profiling shows that this is not a fast path. Note that region.stride is not required to be a power of two; it is only required to be a multiple of the host's page size.

Note that with this design we can also provide consistent snapshots about all region trees at once; for instance, tcg_tb_foreach acquires/releases all region_tree locks before/after iterating over them. For this reason we now drop tb_lock in dump_exec_info().

As an alternative I considered implementing a concurrent BST, but this can be tricky to get right, offers no consistent snapshots of the BST, and performance and scalability-wise I don't think it could ever beat having separate GTrees, given that our workload is insert-mostly (all concurrent BST designs I've seen focus, understandably, on making lookups fast, which comes at the expense of convoluted, non-wait-free insertions/removals).

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

| Name | | |
|---|---|---|
| user | ||
| address-spaces.h | ||
| cpu-all.h | ||
| cpu-common.h | ||
| cpu-defs.h | ||
| cpu_ldst.h | ||
| cpu_ldst_template.h | ||
| cpu_ldst_useronly_template.h | ||
| cputlb.h | ||
| exec-all.h | ||
| gdbstub.h | ||
| gen-icount.h | ||
| helper-gen.h | ||
| helper-head.h | ||
| helper-proto.h | ||
| helper-tcg.h | ||
| hwaddr.h | ||
| ioport.h | ||
| log.h | ||
| memattrs.h | ||
| memory-internal.h | ||
| memory.h | ||
| memory_ldst.inc.h | ||
| memory_ldst_cached.inc.h | ||
| memory_ldst_phys.inc.h | ||
| poison.h | ||
| ram_addr.h | ||
| ramlist.h | ||
| semihost.h | ||
| softmmu-semi.h | ||
| target_page.h | ||
| tb-context.h | ||
| tb-hash-xx.h | ||
| tb-hash.h | ||
| tb-lookup.h | ||
| translator.h | ||
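
The region lookup described in the commit message above (special-case the first and last regions, otherwise divide the offset by the stride) can be sketched roughly as follows. This is a simplified illustration, not the code from the commit: `struct region_layout`, `region_index_for` and `padded_tree_size` are hypothetical names, and the assumption that the first region may begin before the page-aligned start while the last region absorbs any tail is a guess at why those two regions can differ in size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the region bookkeeping; field names are assumptions. */
struct region_layout {
    uintptr_t start;         /* start of the code buffer */
    uintptr_t start_aligned; /* page-aligned address from which strides are counted */
    size_t stride;           /* distance between region starts (multiple of page size) */
    size_t n;                /* number of regions */
};

/* Map a translated-code pointer to the index of the region (and tree) that owns it. */
static size_t region_index_for(const struct region_layout *r, uintptr_t tc_ptr)
{
    assert(r->n > 0 && r->stride > 0);

    /* First region: assumed to also cover anything before the aligned start. */
    if (tc_ptr < r->start_aligned) {
        return 0;
    }

    size_t offset = tc_ptr - r->start_aligned;

    /* Last region: assumed to also cover any tail beyond the final full stride. */
    if (offset >= r->stride * (r->n - 1)) {
        return r->n - 1;
    }

    /* General case: a plain division, since stride need not be a power of two. */
    return offset / r->stride;
}

/*
 * Round the per-tree struct size up to a multiple of the data cache line
 * size, so that adjacent array elements never share a cache line.
 */
static size_t padded_tree_size(size_t tree_size, size_t dcache_linesize)
{
    return (tree_size + dcache_linesize - 1) / dcache_linesize * dcache_linesize;
}
```

With the index in hand, a caller would take the lock belonging to that one tree before inserting or looking up a TB, while a whole-buffer traversal would take every tree's lock in index order to obtain a consistent snapshot.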