mirror of
https://github.com/Motorhead1991/qemu.git
synced 2025-08-02 23:33:54 -06:00
tcg: remove tb_lock
Use mmap_lock in user-mode to protect TCG state and the page descriptors. In !user-mode, each vCPU has its own TCG state, so no locks needed. Per-page locks are used to protect the page descriptors. Per-TB locks are used in both modes to protect TB jumps. Some notes: - tb_lock is removed from notdirty_mem_write by passing a locked page_collection to tb_invalidate_phys_page_fast. - tcg_tb_lookup/remove/insert/etc have their own internal lock(s), so there is no need to further serialize access to them. - do_tb_flush is run in a safe async context, meaning no other vCPU threads are running. Therefore acquiring mmap_lock there is just to please tools such as thread sanitizer. - Not visible in the diff, but tb_invalidate_phys_page already has an assert_memory_lock. - cpu_io_recompile is !user-only, so no mmap_lock there. - Added mmap_unlock()'s before all siglongjmp's that could be called in user-mode while mmap_lock is held. + Added an assert for !have_mmap_lock() after returning from the longjmp in cpu_exec, just like we do in cpu_exec_step_atomic. Performance numbers before/after: Host: AMD Opteron(tm) Processor 6376 ubuntu 17.04 ppc64 bootup+shutdown time 700 +-+--+----+------+------------+-----------+------------*--+-+ | + + + + + *B | | before ***B*** ** * | |tb lock removal ###D### *** | 600 +-+ *** +-+ | ** # | | *B* #D | | *** * ## | 500 +-+ *** ### +-+ | * *** ### | | *B* # ## | | ** * #D# | 400 +-+ ** ## +-+ | ** ### | | ** ## | | ** # ## | 300 +-+ * B* #D# +-+ | B *** ### | | * ** #### | | * *** ### | 200 +-+ B *B #D# +-+ | #B* * ## # | | #* ## | | + D##D# + + + + | 100 +-+--+----+------+------------+-----------+------------+--+-+ 1 8 16 Guest CPUs 48 64 png: https://imgur.com/HwmBHXe debian jessie aarch64 bootup+shutdown time 90 +-+--+-----+-----+------------+------------+------------+--+-+ | + + + + + + | | before ***B*** B | 80 +tb lock removal ###D### **D +-+ | **### | | **## | 70 +-+ ** # +-+ | ** ## | | ** # | 60 +-+ *B ## +-+ | ** ## | | *** #D | 50 +-+ *** ## +-+ | * ** ### | | **B* ### | 40 +-+ **** # ## +-+ | **** #D# | | ***B** ### | 30 +-+ B***B** #### +-+ | B * * # ### | | B ###D# | 20 +-+ D ##D## +-+ | D# | | + + + + + + | 10 +-+--+-----+-----+------------+------------+------------+--+-+ 1 8 16 Guest CPUs 48 64 png: https://imgur.com/iGpGFtv The gains are high for 4-8 CPUs. Beyond that point, however, unrelated lock contention significantly hurts scalability. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This commit is contained in:
parent
705ad1ff0c
commit
0ac20318ce
11 changed files with 75 additions and 152 deletions
|
@ -61,6 +61,7 @@ have their block-to-block jumps patched.
|
|||
Global TCG State
|
||||
----------------
|
||||
|
||||
### User-mode emulation
|
||||
We need to protect the entire code generation cycle including any post
|
||||
generation patching of the translated code. This also implies a shared
|
||||
translation buffer which contains code running on all cores. Any
|
||||
|
@ -75,9 +76,11 @@ patching.
|
|||
|
||||
(Current solution)
|
||||
|
||||
Mainly as part of the linux-user work all code generation is
|
||||
serialised with a tb_lock(). For the SoftMMU tb_lock() also takes the
|
||||
place of mmap_lock() in linux-user.
|
||||
Code generation is serialised with mmap_lock().
|
||||
|
||||
### !User-mode emulation
|
||||
Each vCPU has its own TCG context and associated TCG region, thereby
|
||||
requiring no locking.
|
||||
|
||||
Translation Blocks
|
||||
------------------
|
||||
|
@ -195,7 +198,7 @@ work as "safe work" and exiting the cpu run loop. This ensure by the
|
|||
time execution restarts all flush operations have completed.
|
||||
|
||||
TLB flag updates are all done atomically and are also protected by the
|
||||
tb_lock() which is used by the functions that update the TLB in bulk.
|
||||
corresponding page lock.
|
||||
|
||||
(Known limitation)
|
||||
|
||||
|
|
Loading…
Add table
Add a link
Reference in a new issue