Soliciting opinions on dsched removal

Alex Hornung alex at alexhornung.com
Sun Nov 8 14:03:33 PST 2015


On 08/11/2015 10:38, Matthew Dillon wrote:
>     The basic problem is that heavy use of dsched (the disk scheduler) has
>     caused crashes and lockups ever since dsched went into the tree, and is
>     still causing crashes and lockups today.  Several people have tried to 
>     bandaid it but honestly I think the only correct solution is to remove it
>     and then start over from scratch at some future date with a better design.
>
>     So I would like to remove it, and I am soliciting opinions on that.

I'm all for it.

The current design is heavy-handed and just plain overkill. Linking the
I/O to threads on a fine-grained basis is just asking for trouble.

I've offered blueprints for a redesign on IRC to several people over the
years - without going into many details, the key aspect would be to get
rid of the heavy-handed linking to threads, processes, etc. Just hashing
based on the thread ID would simplify the design significantly and make
it a lot easier to work with.
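
To make the idea concrete, here is a minimal sketch of what hashing on the
thread ID could look like. It is only an illustration under stated
assumptions: the names (io_req, io_bucket, bucket_index, NBUCKETS), the
bucket count, and the mixing function are all hypothetical and are not
taken from the dsched code or from the IRC discussions mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bucket count; a power of two so the index is a cheap mask. */
#define NBUCKETS 16

/* Hypothetical stand-in for an I/O request; it only records the issuer. */
struct io_req {
    uint64_t tid;           /* ID of the thread that issued the request */
    struct io_req *next;    /* FIFO link within a bucket */
};

/* One FIFO per hash bucket; no per-thread or per-process state is kept. */
struct io_bucket {
    struct io_req *head;
    struct io_req *tail;
    size_t depth;
};

static struct io_bucket buckets[NBUCKETS];

/* Mix the thread ID so that consecutive IDs spread over the buckets. */
static unsigned bucket_index(uint64_t tid)
{
    tid ^= tid >> 33;
    tid *= 0xff51afd7ed558ccdULL;
    tid ^= tid >> 33;
    return (unsigned)(tid & (NBUCKETS - 1));
}

/* Append a request to the bucket selected by its thread ID. */
static struct io_bucket *enqueue(struct io_req *req)
{
    assert(req != NULL);
    struct io_bucket *b = &buckets[bucket_index(req->tid)];

    req->next = NULL;
    if (b->tail != NULL)
        b->tail->next = req;
    else
        b->head = req;
    b->tail = req;
    b->depth++;
    return b;
}

/* Remove and return the oldest request in a bucket, or NULL if empty. */
static struct io_req *dequeue(struct io_bucket *b)
{
    struct io_req *req = b->head;

    if (req == NULL)
        return NULL;
    b->head = req->next;
    if (b->head == NULL)
        b->tail = NULL;
    b->depth--;
    return req;
}
```

The point of the sketch is that the only link between a request and its
issuer is the hash of the thread ID, so there is nothing to track or tear
down when threads and processes come and go. Unrelated threads may share a
bucket, which costs some fairness but not correctness.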

If anybody is serious about replacing it with a more sensible design, I
can dig out some more information from my logged IRC ramblings.

However, with the prevalence of SSDs, NVMe, etc. nowadays, I have a
feeling that adding any more latency to disk access is rather
counter-productive, even if the scheduling were optimal. In other words,
I'm unconvinced we would even benefit from such a framework going
forward. Instead, it might be significantly more interesting to trim
down the latencies through CAM, or even to come up with a possible
replacement for it, with a focus on low latency/overhead.

Cheers,
Alex


