Soliciting opinions on dsched removal
alex at alexhornung.com
Sun Nov 8 14:03:33 PST 2015
On 08/11/2015 10:38, Matthew Dillon wrote:
> The basic problem is that heavy use of dsched (the disk scheduler) has
> caused crashes and lockups ever since dsched went into the tree, and is
> still causing crashes and lockups today. Several people have tried to
> bandaid it but honestly I think the only correct solution is to remove it
> and then start over from scratch at some future date with a better design.
> So I would like to remove it, and I am soliciting opinions on that.
I'm all for it.
The current design is heavy-handed and just plain overkill. Linking the
I/O to threads on a fine-grained basis is just asking for trouble.
I've offered blueprints for a redesign on IRC to several people over the
years - without going into many details, the key aspect would be to get
rid of the heavy-handed linking to threads, processes, etc. Just hashing
based on the thread ID would simplify the design significantly and make
it a lot easier to work with.
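
As a rough illustration of what "hashing based on the thread ID" could
look like, here is a minimal userland sketch in C. It is not the
blueprint mentioned above (those details are not given here) and it does
not use any real dsched or kernel API: the names io_req, io_bucket,
io_sched, tid_hash and io_sched_enqueue, the bucket count and the mixing
constant are all assumptions made up for this example. The point is only
that a request carries a plain thread ID and lands in a fixed bucket,
with no per-thread or per-process state linked to it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NBUCKETS 64u            /* hypothetical size; must be a power of two */

struct io_req {                 /* hypothetical stand-in for a queued I/O request */
    uint64_t tid;               /* ID of the issuing thread, stored by value */
    struct io_req *next;
};

struct io_bucket {              /* simple FIFO of requests */
    struct io_req *head;
    struct io_req *tail;
    size_t count;
};

struct io_sched {
    struct io_bucket buckets[NBUCKETS];
};

/* Mix the thread ID so that nearby IDs spread across the buckets. */
static uint32_t
tid_hash(uint64_t tid)
{
    tid ^= tid >> 33;
    tid *= 0xff51afd7ed558ccdULL;   /* arbitrary odd 64-bit mixing constant */
    tid ^= tid >> 33;
    return (uint32_t)(tid & (NBUCKETS - 1));
}

/* Append a request to the bucket selected by its thread ID. */
static void
io_sched_enqueue(struct io_sched *s, struct io_req *req)
{
    struct io_bucket *b;

    assert((NBUCKETS & (NBUCKETS - 1)) == 0);
    b = &s->buckets[tid_hash(req->tid)];
    req->next = NULL;
    if (b->tail != NULL)
        b->tail->next = req;
    else
        b->head = req;
    b->tail = req;
    b->count++;
}
```

Requests from the same thread always end up in the same bucket, in
submission order, and nothing has to be torn down when a thread exits. A
real scheduler would still need locking and a dispatch policy, which this
sketch leaves out.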
If anybody is serious about replacing it with a more sensible design, I
can dig out some more information from my logged IRC ramblings.
However, with the prevalence of SSDs, NVMe, etc. nowadays, I have a
feeling that adding any more latency to disk access is rather
counter-productive, even if the scheduling were optimal. In other words -
I'm unconvinced we would even benefit from such a framework going
forward. Instead, it might be significantly more interesting to trim
down the latencies through CAM, or even to come up with a possible
replacement for it, with a focus on low latency/overhead.