NVMe performance improvements in master
    Matthew Dillon 
    dillon at apollo.backplane.com
       
    Sat Jul 16 23:47:45 PDT 2016
    
    
  
    I've made significant progress on NVMe performance.  On a brand-new
    server (2 x Xeon 2620-v4, 16-core/32-thread, 128GB ram), with PCIe-3
    slots, testing two Samsung NVMe cards and one Intel NVMe card, I was
    able to achieve 931,227+ IOPS with highly parallelized 4K random reads
    from a urandom-filled partition (i.e. no compression, no dummy I/O
    full of zeros).  And the system is 75% idle while it's running.
		>>> yes, you heard me, that's 931K IOPS <<<
    I've compiled some before-and-after statistics here:
	    http://apollo.backplane.com/DFlyMisc/nvme_sys03.txt
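
    The exact test program isn't shown here; the sketch below only
    illustrates the general shape of such a highly parallelized 4K
    random-read test against a raw partition.  The device path, device
    size, thread count and per-thread I/O count are placeholders to be
    adjusted for the hardware at hand.

/*
 * Minimal sketch of a multi-threaded 4K random-read test against a raw
 * partition.  NOT the program used for the numbers above; DEVPATH,
 * DEVSIZE, NTHREADS and IOS_PER_THREAD are placeholders.
 *
 * Build: cc -O2 -pthread randread.c -o randread
 */
#include <fcntl.h>
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

#define NTHREADS        32              /* one per hw thread */
#define BLKSIZE         4096            /* 4K random reads */
#define IOS_PER_THREAD  1000000ULL      /* placeholder */
#define DEVPATH         "/dev/da0s1a"   /* placeholder partition */
#define DEVSIZE         (100ULL << 30)  /* placeholder: 100GiB */

struct tstate {
    pthread_t td;
    uint64_t seed;
    uint64_t count;
};

/* cheap per-thread PRNG so threads do not contend on random() */
static uint64_t
xorshift64(uint64_t *s)
{
    uint64_t x = *s;
    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    return (*s = x);
}

static void *
worker(void *arg)
{
    struct tstate *ts = arg;
    uint64_t nblocks = DEVSIZE / BLKSIZE;
    void *buf;
    int fd;

    /* page-aligned buffer; raw-device reads may require alignment */
    if (posix_memalign(&buf, 4096, BLKSIZE) != 0)
        return (NULL);
    if ((fd = open(DEVPATH, O_RDONLY)) < 0) {
        free(buf);
        return (NULL);
    }
    for (uint64_t i = 0; i < IOS_PER_THREAD; ++i) {
        off_t blk = (off_t)(xorshift64(&ts->seed) % nblocks);
        if (pread(fd, buf, BLKSIZE, blk * BLKSIZE) != BLKSIZE)
            break;
        ++ts->count;
    }
    close(fd);
    free(buf);
    return (NULL);
}

int
main(void)
{
    struct tstate ts[NTHREADS];
    struct timespec t0, t1;
    uint64_t total = 0;
    double secs;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < NTHREADS; ++i) {
        ts[i].seed = 0x9e3779b97f4a7c15ULL + i;
        ts[i].count = 0;
        pthread_create(&ts[i].td, NULL, worker, &ts[i]);
    }
    for (int i = 0; i < NTHREADS; ++i) {
        pthread_join(ts[i].td, NULL);
        total += ts[i].count;
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);
    secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    printf("%llu reads in %.2fs = %.0f IOPS\n",
        (unsigned long long)total, secs, total / secs);
    return (0);
}

    Each thread issues pread(2)s at random 4K-aligned offsets directly
    against the partition's device node, which exercises the physio path
    mentioned below.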
    Progress has been made in the pbuf subsystem (used by physio) and in
    the MMU page invalidation subsystem.  Additional work will be needed to
    achieve these results through a filesystem.  The remaining roadblocks
    for getting this stupendously huge level of performance through our
    filesystems are as follows:
    (1) Filesystem data check, de-duplication, and compression overheads.
    (2) Kernel_pmap updates requiring SMP invalidations (an IPI to all cpus).
    (3) Lock contention in the filesystem and buffer cache path.
    (4) Hardware-level cache coherency load from atomic ops (a rough
        userland sketch of this effect follows the list).
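
    As a rough userland illustration of roadblock (4): when every cpu
    does atomic ops on the same cache line, the line ping-pongs between
    cores and throughput collapses, while counters padded onto separate
    cache lines generate almost no coherency traffic.  The sketch below
    times both cases (thread and iteration counts are placeholders).

/*
 * Rough userland illustration of cache coherency load from atomic ops:
 * compare NTHREADS threads hammering one shared atomic counter against
 * the same threads incrementing per-thread counters padded to separate
 * cache lines.  NTHREADS and ITERS are placeholders.
 *
 * Build: cc -O2 -pthread atomic_demo.c -o atomic_demo
 */
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

#define NTHREADS    16
#define ITERS       10000000ULL

static _Atomic uint64_t shared_counter;     /* one line, all cpus */

struct percpu {
    _Atomic uint64_t val;
    char pad[64 - sizeof(uint64_t)];        /* one counter per cache line */
} counters[NTHREADS];

static void *
shared_worker(void *arg)
{
    (void)arg;
    for (uint64_t i = 0; i < ITERS; ++i)
        atomic_fetch_add(&shared_counter, 1);   /* contended line */
    return (NULL);
}

static void *
percpu_worker(void *arg)
{
    struct percpu *c = arg;
    for (uint64_t i = 0; i < ITERS; ++i)
        atomic_fetch_add(&c->val, 1);           /* private line */
    return (NULL);
}

static double
run(void *(*fn)(void *), int use_percpu)
{
    pthread_t td[NTHREADS];
    struct timespec t0, t1;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < NTHREADS; ++i)
        pthread_create(&td[i], NULL, fn,
            use_percpu ? (void *)&counters[i] : NULL);
    for (int i = 0; i < NTHREADS; ++i)
        pthread_join(td[i], NULL);
    clock_gettime(CLOCK_MONOTONIC, &t1);
    return (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
}

int
main(void)
{
    printf("shared atomic:  %.2fs\n", run(shared_worker, 0));
    printf("per-cpu padded: %.2fs\n", run(percpu_worker, 1));
    return (0);
}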
    Though, in fact, the filesystem will generally not be doing 4K I/Os.
    Most of these roadblocks, all except #(1), drop away with 32K and 64K
    I/Os.
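
    A back-of-the-envelope illustration of why larger I/Os help (this
    arithmetic is derived from the 931K figure above, not quoted from
    measurements): at the same aggregate bandwidth, 64K I/Os require 16x
    fewer operations than 4K I/Os, so the per-I/O pmap-invalidation,
    locking and atomic costs shrink by the same factor:

        931{,}227\ \mathrm{IOPS} \times 4\,\mathrm{KiB} \approx 3.55\,\mathrm{GiB/s},
        \qquad
        \frac{3.55\,\mathrm{GiB/s}}{64\,\mathrm{KiB}} \approx 58{,}000\ \mathrm{IOPS}.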
					-Matt
					Matthew Dillon 
					<dillon at backplane.com>
    
    