Initial NVME driver for DragonFly committed to master

Matthew Dillon dillon at apollo.backplane.com
Sun Jun 5 12:08:42 PDT 2016


    The need for an NVMe (SSD over PCIe) disk driver has slowly increased.
    I finally decided to sit down and write one.  After reviewing the drivers
    in other OSs I decided to write ours from scratch.

    This driver should be considered experimental: it has only been tested
    on one NVMe card so far and I'm sure there are issues.  We do not yet
    have userland management support for it.  It isn't finished, but it
    should be usable.

    The driver is currently built as a module in master, so it must be
    kldload'd to use (or put nvme_load="YES" in /boot/loader.conf).

    And it is surprisingly fast.  I built a ground-up SMP-friendly
    implementation and even though it isn't using MSI-X yet, it's still
    capable of some fairly significant bandwidth.  So far I've only tested
    it on a mini-PCIe card (2-lane or 4-lane, not sure) that I plugged into
    an adapter and threw into a test box:

    nvme0: <NVME-PCIe> port 0xe000-0xe0ff mem 0xf7e00000-0xf7e03fff
	   irq 16 at device 0.0 on pci1
    nvme0: NVME Version 1.1 maxqe=16384 caps=00f000203c013fff
    nvme0: Model SAMSUNG_MZVPV128HDGM-00000 BaseSerial S1XVNYAGA02988
	   nscount=1
    nvme0: Request 18/6 queues, Returns 8/8 queues, nominal map
    nvme0: Disk nvme0 ns=1 blksize=512 lbacnt=250069680 cap=119GB
	   serno=S1XVNYAGA02988-1
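
    As a quick sanity check on the probe output above, the reported
    capacity can be reproduced from blksize and lbacnt.  This is only a
    sketch: it assumes that "GB" in the cap= field means 2**30 bytes,
    truncated, which is a guess that happens to match the numbers and is
    not taken from the driver source.

```python
# Values copied from the nvme0 probe output above.
BLKSIZE = 512          # bytes per logical block (blksize=512)
LBACNT = 250069680     # number of logical blocks (lbacnt=250069680)

capacity_bytes = BLKSIZE * LBACNT

# Assumption: the driver's "GB" is 2**30 bytes, truncated to an integer.
capacity_gib = capacity_bytes // 2**30

print(f"{capacity_bytes} bytes, about {capacity_gib} GiB")
```

    Under that assumption the result agrees with cap=119GB.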

				--

    Concurrent random reads on the physical device using our
    test/sysperf/randread.c with 512-byte reads yield a VERY
    impressive 307,000 IOPS.

test40# randread /dev/nvme0s1b
     tty           nvme0             cpu
 tin tout  KB/t tps   MB/s  us ni sy in id
   0   24  0.50 307806 150.30   5  0 51 10 34
   0   24  0.50 307911 150.35   4  0 52  7 37
   0   24  0.50 307793 150.29   5  0 55  8 32
   0   24  0.50 307818 150.30   3  0 54  8 35
   0   24  0.50 307893 150.34   4  0 53  8 35
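
    The columns above are related by simple arithmetic: MB/s is tps
    times KB/t.  A minimal sketch of that check follows.  The helper name
    mb_per_sec is made up for illustration, and the 1 MB = 1024 KB
    convention is an assumption inferred from the figures rather than
    from the iostat source.

```python
def mb_per_sec(tps, kb_per_xfer):
    """Throughput in MB/s, assuming 1 MB = 1024 KB (inferred, not confirmed)."""
    return tps * kb_per_xfer / 1024.0

# (tps, KB/t, reported MB/s) rows copied from the output above.
samples = [
    (307806, 0.50, 150.30),
    (307911, 0.50, 150.35),
    (307793, 0.50, 150.29),
]

for tps, kbt, reported in samples:
    computed = mb_per_sec(tps, kbt)
    print(f"tps={tps} computed={computed:.2f} reported={reported:.2f}")
```

    So roughly 307,000 transfers per second at 512 bytes each works out
    to only about 150 MB/s, which is why IOPS is the interesting number
    for this test.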

				--

    Concurrent random reads through the HAMMER filesystem, freshly mounted
    (uncached), run at around 25,000 IOPS @ 64KB, 1.5 GBytes/sec.

test40# randread /mnt/test1.dat
     tty           nvme0             cpu
 tin tout  KB/t tps   MB/s  us ni sy in id
   0   12 64.00 24599 1537.43   0  0 36  5 59
   0   23 63.99 24793 1549.40   1  0 37  5 57
   0   23 64.00 24611 1538.10   1  0 43  3 53

				--

    Single-process sequential reads through the HAMMER filesystem,
    freshly mounted (uncached), around 1.5 GBytes/sec.

test40# dd if=/mnt/test3.dat of=/dev/null bs=32k
     tty           nvme0             cpu
 tin tout  KB/t tps   MB/s  us ni sy in id
   0   12 113.66 13537 1502.56   0  0 21  4 76
   0   12 113.66 13103 1454.33   0  0 35  3 61
   0   12 113.65 13070 1450.52   0  0 38  3 59

				--

    Concurrent sequential reads of two different files through the
    HAMMER filesystem, freshly mounted (uncached), around
    1.7 GBytes/sec.

test40# dd if=/mnt/test1.dat of=/dev/null bs=32k 
test40# dd if=/mnt/test2.dat of=/dev/null bs=32k 
     tty           nvme0             cpu
 tin tout  KB/t tps   MB/s  us ni sy in id
   0   12 120.21 15066 1768.63   0  0 27  5 67
   0   12 120.04 15002 1758.65   0  0 33  3 64
   0   12 120.49 14930 1756.82   0  0 50  4 46

7516192768 bytes transferred in 7.931600 secs (947626318 bytes/sec)
7516192768 bytes transferred in 7.881519 secs (953647757 bytes/sec)

				--

    A single sequential read from the raw device, around 800MB/sec
    using 32KB reads:

test40# dd if=/dev/nvme0s1b of=/dev/null bs=32k
     tty           nvme0             cpu
 tin tout  KB/t tps   MB/s  us ni sy in id
   0   12 32.00 25863 808.23   0  0  5  2 92
   0   12 32.00 25845 807.67   0  0  7  3 91
   0   12 32.00 26407 825.23   0  0  5  3 92

17179869184 bytes transferred in 20.225951 secs (849397352 bytes/sec)
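
    The dd summary line can be cross-checked against the per-interval
    samples above.  A small sketch, assuming the MB/s column counts
    2**20 bytes per MB (an inference from the numbers, not something
    stated in the output):

```python
TOTAL_BYTES = 17179869184   # from dd's summary line
ELAPSED_SECS = 20.225951    # from dd's summary line

bytes_per_sec = TOTAL_BYTES / ELAPSED_SECS

# Assumption: 1 MB in the sampled MB/s column is 2**20 bytes.
mib_per_sec = bytes_per_sec / 2**20

print(f"{TOTAL_BYTES / 2**30:.0f} GiB read at about {mib_per_sec:.2f} MiB/s")
```

    The whole-run average comes out near 810 MiB/s, which sits inside
    the 807 to 825 MB/s range of the samples.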

				--

    And finally, five concurrent sequential reads of the raw device
    almost max out the device at 2.4 GBytes/sec:

dd if=/dev/nvme0s1b of=/dev/null bs=32k &
dd if=/dev/nvme0s1b of=/dev/null bs=32k &
dd if=/dev/nvme0s1b of=/dev/null bs=32k &
dd if=/dev/nvme0s1b of=/dev/null bs=32k &
dd if=/dev/nvme0s1b of=/dev/null bs=32k &

     tty           nvme0             cpu
 tin tout  KB/t tps   MB/s  us ni sy in id
   1   27 32.00 76859 2401.84   1  0 21  4 74
   0   12 32.00 76010 2375.33   0  0 22  4 74
   0   12 32.00 76017 2375.53   0  0 23  4 73
   0   12 32.00 75973 2374.16   1  0 20  4 75

    So there we have it.  A pretty good start for our new driver!

					-Matt
					Matthew Dillon 
					<dillon at backplane.com>

