git: kernel - Refactor Xinvltlb and the pmap page & global tlb invalidation code

Matthew Dillon dillon at crater.dragonflybsd.org
Fri Jul 15 15:09:59 PDT 2016


commit 79f2da03601f030588847e71f8ac645ad0d06091
Author: Matthew Dillon <dillon at apollo.backplane.com>
Date:   Fri Jul 15 13:28:39 2016 -0700

    kernel - Refactor Xinvltlb and the pmap page & global tlb invalidation code
    
    * Augment Xinvltlb to handle both full-TLB invalidation and per-page
      invalidation.
    
    * Remove the old lwkt_ipi-based per-page invalidation code.
    
    * Include Xinvltlb interrupts in the V_IPI statistics counter
      (so they show up in systat -pv 1).
    
    * Add loop counters to detect and log possible endless loops.
    
    * (Fix single_apic_ipi_passive() but note that this function is currently
       not used.  Interrupts must be hard-disabled when checking icr_lo).
    
    * NEW INVALIDATION MECHANISM
    
      The new invalidation mechanism is primarily enclosed in mp_machdep.c and
      pmap_inval.c.  Supply new all-in-one rollup functions which include the
      *ptep contents adjustment, instead of prior piecemeal functions.
    
      The new mechanism uses Xinvltlb for both full-tlb and per-page
      invalidations.  This interrupt ignores critical sections (that is,
      will operate even if kernel code is in a critical section), which
      significantly improves the latency and stability of our pmap pte
      invalidation support functions.
    
      For example, prior to these changes the invalidation code used the
      lwkt_ipiq paths, which are subject to critical sections and could result
      in long stalls across substantially ALL cpus when one cpu was in a long
      cpu-bound critical section.
    
    * NEW SMP_INVLTLB() OPTIMIZATION
    
      smp_invltlb() always used Xinvltlb, and it still does.  However the
      code now avoids IPIing idle cpus, instead flagging them to issue the
      cpu_invltlb() call when they wake up.
    
      To make this work the idle code must temporarily enter a critical section
      so 'normal' interrupts do not run until it has a chance to check and act
      on the flag.  This will slightly increase interrupt latency on an idle
      cpu.
    
      This change significantly reduces smp_invltlb() overhead by avoiding
      having to pull idle cpus out of their high-latency/low-power state.  It
      also keeps the wake-up latency of those cpus from stalling the
      invalidation.
    
    * Remove unnecessary calls to smp_invltlb().  It is not necessary to call
      this function when a *ptep is transitioning from 0 to non-zero.  This
      significantly cuts down on smp_invltlb() traffic under load.
    
    * Remove a bunch of unused code in these paths.
    
    * Add machdep.report_invltlb_src and machdep.report_invlpg_src, down
      counters which do one stack backtrace when they hit 0.
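
A down counter of this kind might look like the following user-space sketch.
The names report_src_countdown and countdown_fires() are hypothetical and are
not taken from the commit, and the kernel version would print a stack
backtrace where the sketch merely returns 1.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for a sysctl-settable down counter; 0 means off. */
static atomic_int report_src_countdown;

/*
 * Returns 1 exactly once: on the call that moves the counter from 1 to 0.
 * A counter that is already 0 (or negative) is left untouched.
 */
static int
countdown_fires(atomic_int *cnt)
{
	int old = atomic_load(cnt);

	while (old > 0) {
		/* On failure 'old' is reloaded and the loop re-checks it. */
		if (atomic_compare_exchange_weak(cnt, &old, old - 1))
			return (old == 1);
	}
	return (0);
}
```

Storing 3 in the counter makes the third call return 1; every later call
returns 0 until the counter is set again.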
    
    				TIMING TESTS
    
      No appreciable differences with the new code other than feeling smoother.
    
        mount_tmpfs dummy /usr/obj
    
        On monster (4-socket, 48-core):
    	time make -j 50 buildworld
    	BEFORE: 7849.697u 4693.979s 16:23.07 1275.9%
    	AFTER:  7682.598u 4467.224s 15:47.87 1281.8%
    
    	time make -j 50 nativekernel NO_MODULES=TRUE
    	BEFORE: 927.608u 254.626s 1:36.01 1231.3%
    	AFTER:  531.124u 204.456s 1:25.99 855.4%
    
        On 2 x E5-2620 (2-socket, 32-core):
    	time make -j 50 buildworld
    	BEFORE: 5750.042u 2291.083s 10:35.62 1265.0%
    	AFTER:  5694.573u 2280.078s 10:34.96 1255.9%
    
    	time make -j 50 nativekernel NO_MODULES=TRUE
    	BEFORE: 431.338u  84.458s 0:54.71 942.7%
    	AFTER:  414.962u  92.312s 0:54.75 926.5%
    	(time mostly spent in mkdep line and on final link)
    
    	Memory thread tests, 64 threads each allocating memory.
    
    	BEFORE: 3.1M faults/sec
    	AFTER:  3.1M faults/sec

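The rule from the commit message that a *ptep going from 0 to non-zero needs
no smp_invltlb() can be illustrated with a rollup-style helper that swaps the
entry and decides about invalidation from the old contents.  This is only a
sketch under stated assumptions: sketch_pte_t and sketch_pte_swap() are
hypothetical names, a counter stands in for the cross-cpu invalidation, and
the coordination with other cpus done by the real pmap_inval.c code is left
out.

```c
#include <stdatomic.h>
#include <stdint.h>

typedef uint64_t sketch_pte_t;	/* hypothetical pte type */

static int invalidations;	/* stand-in for the cross-cpu invalidation */

/*
 * Replace *ptep with npte and return the previous contents.  An
 * invalidation is requested only when the previous contents were
 * non-zero; a 0 -> non-zero transition skips it.
 */
static sketch_pte_t
sketch_pte_swap(_Atomic sketch_pte_t *ptep, sketch_pte_t npte)
{
	sketch_pte_t opte = atomic_exchange(ptep, npte);

	if (opte != 0)
		invalidations++;
	return (opte);
}
```

Installing a mapping into an empty slot therefore costs no invalidation,
while replacing or clearing an existing entry still requests one.
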
Summary of changes:
 sys/cpu/x86_64/include/cpufunc.h             |   4 +-
 sys/platform/pc64/apic/apic_vector.s         |   3 +-
 sys/platform/pc64/apic/lapic.c               |  21 +-
 sys/platform/pc64/include/globaldata.h       |  10 +-
 sys/platform/pc64/include/pmap_inval.h       |  23 +-
 sys/platform/pc64/vmm/vmx.c                  |  12 +-
 sys/platform/pc64/x86_64/machdep.c           |  29 ++
 sys/platform/pc64/x86_64/mp_machdep.c        | 434 ++++++++++++++++----
 sys/platform/pc64/x86_64/pmap.c              | 259 +++++-------
 sys/platform/pc64/x86_64/pmap_inval.c        | 580 +++++++++++++++++++++++----
 sys/platform/vkernel64/include/pmap_inval.h  |  18 -
 sys/platform/vkernel64/platform/pmap.c       |  20 +-
 sys/platform/vkernel64/platform/pmap_inval.c |  14 -
 sys/vm/pmap.h                                |   4 +-
 sys/vm/vm_contig.c                           |   7 +-
 15 files changed, 1022 insertions(+), 416 deletions(-)

http://gitweb.dragonflybsd.org/dragonfly.git/commitdiff/79f2da03601f030588847e71f8ac645ad0d06091


-- 
DragonFly BSD source repository

