Intel Meltdown bug mitigation in master
dillon at backplane.com
Fri Jan 5 11:04:33 PST 2018
Meltdown is an Intel-specific bug. AMD is immune. Some ARM CPUs might
also be vulnerable but DragonFly doesn't run on ARM so... Meltdown is
basically a FULL KERNEL MEMORY disclosure bug. An unprivileged user
program can essentially discern the contents of all of kernel memory on an
Intel CPU. The bug works because Intel CPUs will do speculative reads
across protection domains, allowing the user program to massage the memory
and branch prediction cache to cause a speculative read of kernel memory
(even though it crosses the protection domain) followed by a speculative
conditional execution. Timing can then be used to scan for and distinguish
the contents of kernel memory.
DragonFlyBSD master now has a commit to fix this issue. It is not
considered 100% tested yet, but it is in the tree and has been tested
fairly well. Unfortunately, the only mitigation possible is to remove the
kernel memory mappings from the user MMU map, which means that every single
system call and interrupt (and the related return to userland later) must
reload the MMU twice. This will add 150ns - 250ns of overhead to every
system call and interrupt. System calls usually have an overhead of only
100ns, so now it will be 250ns - 350ns of overhead on Intel CPUs.
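
To put those numbers in perspective, here is a rough back-of-envelope sketch of the arithmetic above. It is only a model, not a measurement: the 100ns baseline and the 150ns - 250ns added cost are the figures quoted in this post, while the event rates and the function names are made up for illustration.

```python
# Back-of-envelope model only; the event rates below are hypothetical.
BASELINE_NS = 100        # typical system call overhead quoted in this post
ADDED_NS = (150, 250)    # extra cost per system call / interrupt with the mitigation


def mitigated_cost_ns(baseline_ns=BASELINE_NS, added_ns=ADDED_NS):
    """Return the (low, high) per-call overhead in ns with the mitigation on."""
    return tuple(baseline_ns + a for a in added_ns)


def added_core_fraction(events_per_sec, added_ns):
    """Fraction of one core's time spent on the extra MMU reloads."""
    return events_per_sec * added_ns * 1e-9


if __name__ == "__main__":
    print(mitigated_cost_ns())
    for rate in (100_000, 1_000_000):  # hypothetical syscalls+interrupts per second
        lo, hi = (added_core_fraction(rate, a) for a in ADDED_NS)
        print(f"{rate:>9} events/s: {lo:.1%} - {hi:.1%} of one core")
```

At a hypothetical one million system calls and interrupts per second on one core, the added cost alone works out to roughly 15% - 25% of that core, which is why the penalty scales with how system-call-heavy or interrupt-heavy the workload is.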
The mitigation is automatically enabled on Intel CPUs, and disabled on AMD
CPUs. A new sysctl can be used to manually enable or disable the
-- PERFORMANCE EFFECTS ON SYSTEMS --
Nominal program execution will lose around 5% of its performance with this
mitigation. e.g. compiles, utilities, etc. Not too bad.
Any system-call-heavy or interrupt-heavy program will lose between 10% and
30% of its performance. This can include databases, high-speed storage
operations, very high-speed network operations (e.g. 10GbE or faster), and
-- ADDITIONAL WORK --
I will again look into using PCID to further mitigate the problem. We
currently do not use PCID because it doesn't really improve performance.
But when this mitigation is enabled, PCID might reduce the impact
somewhat. Linux kernel programmers are saying that using PCID can reduce
the impact by 50% (e.g. 5%->30% becomes 3%->15% performance loss). But it
should be noted that Linux's mitigation is a bit more involved than ours so
it is unclear whether the same optimization will improve DragonFlyBSD's
performance when running with this mitigation.
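
The 50% figure is just a scaling of the loss range quoted above. A minimal sketch of that arithmetic, assuming (and this is only an assumption) that the reduction applies uniformly across the whole range; the function name is made up for illustration:

```python
def scale_loss(loss_range_pct, reduction=0.5):
    """Scale a (low, high) performance-loss range by (1 - reduction)."""
    return tuple(p * (1 - reduction) for p in loss_range_pct)


if __name__ == "__main__":
    # 5% - 30% loss without PCID, assuming a uniform 50% reduction with it.
    print(scale_loss((5, 30)))
```

Halving 5% - 30% gives 2.5% - 15%, which is where the rounded 3% - 15% comes from. Whether DragonFlyBSD would see the same factor is exactly the open question.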
I should note that we kernel programmers have spent decades trying to
reduce system call overheads, so to be sure, we are all pretty pissed off
at Intel right now. Intel's press releases have also been HIGHLY
DECEPTIVE. In particular, they are starting to talk up 'microcode
updates', but those are mitigations for the Spectre bug, not for the
Meltdown bug. Spectre is another bug, far more difficult to exploit than
Meltdown, which leaks information from other processes or the kernel based
on those other processes or kernel doing speculative reads and executions
which are partially managed by the originating user process. Spectre does
NOT involve a protection domain violation like Meltdown, so the Meltdown
mitigation cannot mitigate Spectre.
These bugs (both Meltdown and Spectre) really have to be fixed in the CPUs
themselves. Meltdown is the 1000 pound gorilla. I won't be buying any new
Intel chips that require the mitigation. I'm really pissed off at Intel.
-- DRAGONFLY-STABLE --
This work is now in master. It needs significantly more testing before I
can move it to -stable and I'm not even sure I CAN move it to -stable
easily. I will be looking into that on the weekend.