RFC: virtio blk driver

Tim Bisson bissont at mac.com
Wed Dec 22 10:53:34 PST 2010

Thanks for your feedback Alex.
My first question: What is blkv2? How is it different from blk?
I need to remove blkv2 – it was used as a scratch place to get something 
Regarding the code, this is definitely not in a good shape as far as
style(9) goes. The indentations are pretty random and there are a lot of
C++ comments to comment out code (//foo). Code should be commented out
with #if 0 ... #endif. It makes it pretty much a pain to read.
Will do. I should have done that before sending out the RFC...
Now regarding the performance, is that the performance you were
expecting? How is the performance on NetBSD on the same system? If not
the same, where's the problem?
To be honest, I was just hoping to perform as good as emulation. But you 
have a good point and I'll get NetBSD virtio performance numbers.
I'm also not so sure about the way you dispatch I/O in
virtio_disk_strategy. You essentially insert the I/O to be dispatched at
the end of a queue, then you call virtio_blk_execute, from exactly that
same point, and this then dispatches the first I/O on the queue. This
sounds quite dodgy and probably shouldn't be the case. How do you
guarantee all the enqueued I/Os are dispatched eventually? You can't
just assume you'll get more strategy calls. Maybe you are covering that
case, but from a short glance I don't see it. In any case, dispatching a
completely different I/O from the requested one in a strategy call might
not be the greatest idea for a plain block driver.
All enqueued ops are eventually dispatched because we also dispatch from 
the callback handler (virtio_blk_vq_done()) as well. To preserve request 
ordering in strategy() we TAILQ insert new requests and pull the first 
request off the queue in strategy() and in the callback handler (if the 
queue is not empty).

I should explain why I added this additional complexity. We only have a 
fixed number of ring slots to insert requests into. Once I started 
stress testing I found we could exceed that number, which breaks things. 
NetBSD has a notion of maxqueuecnt in their generic block driver so they 
don't have this problem.

I took a look around and it looks like vinum does a similar thing. I 
guess an alternative is make virtio-blk look like a SCSI device and use 
cam, or maybe add the equivalent of maxqueuecnt to DragonFly for block 

The wiki has some notes about it: installing, architecture, porting
notes, etc:
Regarding your 'Misc Porting Notes' on the wiki: Using spin locks to
protect the lists should be fine, as long as that's pretty much all they
are used for. spinlocks can't be used in blocking conditions; if that's
your intention, you'll have to use mutexes or lockmgr locks.
It seems your intentions are to make the design lockless? Do you have
any more detailed plans on this? One queue/ring per CPU, or what is your
My goal is to make it as fast as possible. My thought is to use memory 
barriers to protect the rings.


More information about the Kernel mailing list