RFC: virtio blk driver
bissont at mac.com
Wed Dec 22 10:53:34 PST 2010
Thanks for your feedback Alex.
My first question: What is blkv2? How is it different from blk?
I need to remove blkv2 â it was used as a scratch place to get something
Regarding the code, this is definitely not in a good shape as far as
style(9) goes. The indentations are pretty random and there are a lot of
C++ comments to comment out code (//foo). Code should be commented out
with #if 0 ... #endif. It makes it pretty much a pain to read.
Will do. I should have done that before sending out the RFC...
Now regarding the performance, is that the performance you were
expecting? How is the performance on NetBSD on the same system? If not
the same, where's the problem?
To be honest, I was just hoping to perform as good as emulation. But you
have a good point and I'll get NetBSD virtio performance numbers.
I'm also not so sure about the way you dispatch I/O in
virtio_disk_strategy. You essentially insert the I/O to be dispatched at
the end of a queue, then you call virtio_blk_execute, from exactly that
same point, and this then dispatches the first I/O on the queue. This
sounds quite dodgy and probably shouldn't be the case. How do you
guarantee all the enqueued I/Os are dispatched eventually? You can't
just assume you'll get more strategy calls. Maybe you are covering that
case, but from a short glance I don't see it. In any case, dispatching a
completely different I/O from the requested one in a strategy call might
not be the greatest idea for a plain block driver.
All enqueued ops are eventually dispatched because we also dispatch from
the callback handler (virtio_blk_vq_done()) as well. To preserve request
ordering in strategy() we TAILQ insert new requests and pull the first
request off the queue in strategy() and in the callback handler (if the
queue is not empty).
I should explain why I added this additional complexity. We only have a
fixed number of ring slots to insert requests into. Once I started
stress testing I found we could exceed that number, which breaks things.
NetBSD has a notion of maxqueuecnt in their generic block driver so they
don't have this problem.
I took a look around and it looks like vinum does a similar thing. I
guess an alternative is make virtio-blk look like a SCSI device and use
cam, or maybe add the equivalent of maxqueuecnt to DragonFly for block
The wiki has some notes about it: installing, architecture, porting
Regarding your 'Misc Porting Notes' on the wiki: Using spin locks to
protect the lists should be fine, as long as that's pretty much all they
are used for. spinlocks can't be used in blocking conditions; if that's
your intention, you'll have to use mutexes or lockmgr locks.
It seems your intentions are to make the design lockless? Do you have
any more detailed plans on this? One queue/ring per CPU, or what is your
My goal is to make it as fast as possible. My thought is to use memory
barriers to protect the rings.
More information about the Kernel