SoC Project Proposal - Anticipatory Disk I/O scheduler
Matthew Dillon
dillon at apollo.backplane.com
Fri Apr 4 18:00:55 PDT 2008
:Hello,
:
:I had submitted a proposal for the Google SoC last week on the
:DragonFlyBSD Anticipatory Disk I/O scheduler. Since the deadline has
:been extended I could do with some improvements, if it requires any.
:If some of you could review the proposal and let me know if it falls
:short or is fine it'd be great.
:
:Since I was running out of time I could not give a lot of design
:issues from the actual code structure. If this is required could you
:point me to a segment code tree where I must do further research.
:
:The proposal is here: www.cc.gatech.edu/~thacker/DFlyBSD_proposal_thacker.pdf
:
:Thanks!
:
:Nirmal Thacker
Very interesting. It's true that the user process winds up being
somewhat synchronous when doing reads from the disk. Individual
filesystems do attempt to do some read-ahead but have never been
able to do all that good a job of it. The issue is complicated by
the fact that only the filesystem code really knows what blocksize to
use for buffer cache operations.
Having a thread heuristically prefetch data has interesting
implications. It *IS* possible to do even without knowing the block
size the filesystem normally chooses. It can be done because all
filesystem-related I/O via the buffer cache is backed by a VM object,
thus making it possible to construct I/O's that directly map the
backing pages without actually having to go through the buffer cache.
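To illustrate why the filesystem block size does not need to be known,
here is a rough userspace sketch in Python (not kernel code). It only
shows the arithmetic of turning a byte range into page indices of a
backing object. PAGE_SIZE, pages_for_range() and pages_to_read() are
names made up for this example, and the 4096-byte page size is an
assumption.

```python
PAGE_SIZE = 4096  # assumed page size, hypothetical for this sketch


def pages_for_range(offset, length, page_size=PAGE_SIZE):
    """Return the page indices covering bytes [offset, offset + length)."""
    if length <= 0:
        return []
    first = offset // page_size
    last = (offset + length - 1) // page_size
    return list(range(first, last + 1))


def pages_to_read(resident, offset, length, page_size=PAGE_SIZE):
    """Return the pages in the range that are not already resident.

    'resident' is a set of page indices standing in for the pages the
    backing object already holds.
    """
    return [p for p in pages_for_range(offset, length, page_size)
            if p not in resident]
```

The point of the sketch is that the request is expressed purely in
pages, so whatever block size the filesystem would have picked never
enters into it.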
The big giant lock we still have in DragonFly interferes with MP
issues but it would probably be beneficial to run such a thread
on several cpus and dispatch the read-ahead signal to a 'different'
cpu than the one that triggered the operation, and perhaps work on
getting rid of the need for the big giant lock in the low level I/O
system at the same time.
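As a toy model of the dispatch idea (again Python in userspace, not
kernel code), the sketch below keeps one request queue per cpu and
hands each read-ahead request to a cpu other than the one that
triggered it. PerCpuDispatcher and the "next cpu, round robin" choice
are assumptions made for the example; any policy that avoids the
triggering cpu would fit the description above.

```python
from collections import deque


class PerCpuDispatcher:
    """Hypothetical per-cpu read-ahead queues for illustration only."""

    def __init__(self, ncpus):
        self.ncpus = ncpus
        self.queues = [deque() for _ in range(ncpus)]

    def target_cpu(self, trigger_cpu):
        # With a single cpu there is no other choice.
        if self.ncpus == 1:
            return trigger_cpu
        # Assumed policy: simply use the next cpu, wrapping around.
        return (trigger_cpu + 1) % self.ncpus

    def dispatch(self, trigger_cpu, request):
        # Queue the request on a cpu other than the triggering one.
        cpu = self.target_cpu(trigger_cpu)
        self.queues[cpu].append(request)
        return cpu
```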
I think it would be worthwhile. Then instead of the filesystem
explicitly doing the read-ahead (which is somewhat expensive and right
smack in the middle of the critical path), it could instead pass
heuristic hints to a read-ahead subsystem and let the subsystem
deal with the read-aheads.
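A minimal sketch of what such a hint interface might look like,
modelled in userspace Python rather than kernel code. The filesystem
only reports each access, and the subsystem decides whether and how
much to read ahead. The sequential-access test and the doubling window
are assumptions chosen for the example, not the heuristic proposed
here, and ReadAheadHints, hint() and max_window are hypothetical names.

```python
class ReadAheadHints:
    """Hypothetical read-ahead subsystem driven by access hints."""

    def __init__(self, max_window=8):
        self.max_window = max_window
        self.next_expected = {}  # file id -> page expected next if sequential
        self.window = {}         # file id -> current window size in pages

    def hint(self, file_id, page, count=1):
        """Record an access of 'count' pages starting at 'page'.

        Returns (start_page, npages) to prefetch, or None when the
        access does not look sequential.
        """
        sequential = self.next_expected.get(file_id) == page
        self.next_expected[file_id] = page + count
        if not sequential:
            # Random access: reset the window and do not prefetch.
            self.window[file_id] = 0
            return None
        # Sequential access: grow the window, capped at max_window.
        w = min(max(1, self.window.get(file_id, 0) * 2), self.max_window)
        self.window[file_id] = w
        return (page + count, w)
```

The sketch does not track which pages were already requested, so
successive windows overlap. A real subsystem would need to account for
that.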
-Matt
Matthew Dillon
<dillon at backplane.com>