Recent concurrency improvements in the AHCI driver and CAM need testing
dillon at apollo.backplane.com
Sat Apr 9 21:04:46 PDT 2011
I've pushed some serious changes to the AHCI SATA driver and CAM.
One fixes issues where the tags were not being utilized to their fullest
extent... well, really they weren't being utilized at all. I'm not
sure how I missed the problem before, but it is fixed now.
The second ensures that read requests cannot saturate all available
tags and cause writes to stall, and vice versa. It also separates
out the read and write BIO streams and treats them as separate entities,
which means that reads can continue to be dispatched even if writes
saturate the drive's cache, and writes can continue to be dispatched
even if concurrent read(s) would otherwise eat all available tags.
The read/write saturation fixes are important because writes are
usually completed instantly since they just go to the drive cache,
so even if reads are saturated there's no reason not to push
writes to the drive. Plus, when the HD's cache becomes saturated, writes
no longer complete instantly and would prevent reads from being
dispatched if all the tags were used to hold the writes.
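
To make the idea concrete, here is a minimal sketch of a tag allocator
that keeps either direction from eating every tag. It is not the actual
driver code: the names (tag_pool, tag_alloc, tag_release), the 32-tag
pool size and the RESERVED count of 4 are all assumptions picked for
illustration. Each direction is capped at NTAGS - RESERVED in-flight
commands, so the other direction can always hold at least RESERVED tags.

```c
#include <assert.h>
#include <stdint.h>

#define NTAGS    32   /* assumed pool size (NCQ allows up to 32 tags) */
#define RESERVED 4    /* hypothetical: tags held back for the other direction */

enum io_dir { IO_READ = 0, IO_WRITE = 1 };

struct tag_pool {
    uint32_t busy;          /* bit n set => tag n is in flight */
    uint8_t  dir[NTAGS];    /* direction of the command holding each tag */
    int      inflight[2];   /* in-flight command count per direction */
};

static void tag_pool_init(struct tag_pool *tp)
{
    tp->busy = 0;
    tp->inflight[IO_READ] = 0;
    tp->inflight[IO_WRITE] = 0;
}

/*
 * Try to grab a tag for a command going in direction d.
 * Returns the tag number, or -1 if the command has to wait.
 */
static int tag_alloc(struct tag_pool *tp, enum io_dir d)
{
    int tag;

    /* Cap this direction so the other one is never starved of tags. */
    if (tp->inflight[d] >= NTAGS - RESERVED)
        return -1;

    for (tag = 0; tag < NTAGS; tag++) {
        if ((tp->busy & (UINT32_C(1) << tag)) == 0) {
            tp->busy |= UINT32_C(1) << tag;
            tp->dir[tag] = (uint8_t)d;
            tp->inflight[d]++;
            return tag;
        }
    }
    return -1;              /* every tag is in flight */
}

/* Give a tag back when its command completes. */
static void tag_release(struct tag_pool *tp, int tag)
{
    assert(tag >= 0 && tag < NTAGS);
    assert(tp->busy & (UINT32_C(1) << tag));

    tp->busy &= ~(UINT32_C(1) << tag);
    tp->inflight[tp->dir[tag]]--;
}
```

In this sketch a flood of reads stops at 28 outstanding commands, so a
write that shows up can still be issued right away, and the same holds
in the other direction when writes pile up behind a saturated drive cache.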
With these fixes I am getting much better numbers with concurrency.
I now get around 37000 IOPS doing random 512-byte sector reads with
a Crucial C300 SSD, versus ~8000 or so before the fix.
And I now get around 365 IOPS with the same test on a hard drive,
versus ~150 IOPS before (remember these are random reads!).
blogbench also appears to have much better write/read parallelism
against the swapcache with the SSD/HD combo. Memory caches blow
out at around blog #1300 on my test boxes.
With the changes blogbench write performance is maintained through
blog #1600 or so; without the changes it drops off at #1300.
With the changes the swapcache SSD is pushing ~1400 IOPS or so
satisfying random read requests. Without the changes the swapcache
SSD is only pushing ~130 IOPS.
With the changes blogbench is able to maintain a ~60000 article
read rate at the end of the test. Without the changes the
read rate is more like ~10000 at the end of the test. At this
stage swapcache has cached a significant chunk of the data
in the SSD so the I/O activity is mixed random SSD and HD reads.
Ok, so I feel a bit sheepish that I missed the fact that the AHCI
driver wasn't utilizing its tags properly before. The difference
in performance is phenomenal. Maybe we will start winning some
of those I/O benchmark tests now.
<dillon at backplane.com>