git: kernel - SMP - "Fix AP #%d (PHY# %d) failed" issues

Sepherosa Ziehau sepherosa at gmail.com
Tue Mar 2 22:47:46 PST 2010


On Wed, Feb 10, 2010 at 4:54 PM, Matthew Dillon
<dillon at crater.dragonflybsd.org> wrote:
>
> commit bb467734fc407e2c2de7f8314c63dd9f708f4df4
> Author: Matthew Dillon <dillon at apollo.backplane.com>
> Date:   Wed Feb 10 00:45:02 2010 -0800
>
>    kernel - SMP - "Fix AP #%d (PHY# %d) failed" issues

Wow, this is amazing.  I was always wondering what was happening to my
Phenom boxes.

>
>    Ok, here's what is going on.  If an SMI interrupt occurs while
>    an AP is going through the INIT/STARTUP IPI sequence the AP will
>    brick, and nothing you do will resurrect it.
>
>    BIOSes typically set up SMI interrupts when emulating (for example)
>    a PS/2 keyboard with a USB keyboard, or even if just implementing
>    BIOS support for a USB keyboard.  Even worse, the BIOS may set up
>    the interrupt to poll at 1000 Hz.  And, EVEN WORSE, it can totally
>    depend on which USB port you've plugged your keyboard into.  And, on top
>    of all of that, the SMI interrupt is not consistent.
>
>    The INIT/STARTUP code contains a 10ms delay (as per Intel spec) between
>    the INIT IPI and the STARTUP IPI.  Well, you can do the math.
>
>    In order to reliably boot an SMP system where the BIOS has set up
>    SMI interrupts, this patch uses a nifty bit of code to detect when
>    the SMI interrupt has occurred and tries to shift the INIT/STARTUP
>    sequence into a gap between SMI interrupts.  If it has to, it will
>    reduce the 10ms spec delay all the way down to 150us.  In many
>    cases we really have no choice for reliable operation.  Even a 300us
>    delay is too much in the tests I performed on a Shuttle Phenom and
>    Phenom II cube.  I don't honestly know if this will break other SMP
>    configurations; we'll have to see.
>
>    On the particular Shuttle I tested on, one of the four USB connections
>    on the backpanel (the upper left when looking at it from the back)
>    seemed to cause the BIOS to set up SMI interrupts at a high rate and
>    caused kernel boots to fail.  With this commit those boots now succeed.
>
> Summary of changes:
>  sys/platform/pc32/apic/apicreg.h      |    2 +-
>  sys/platform/pc32/apic/mpapic.c       |    5 +
>  sys/platform/pc32/i386/mp_machdep.c   |  166 ++++++++++++++++++++++++++++-----
>  sys/platform/pc32/include/smp.h       |    1 +
>  sys/platform/pc64/apic/apicreg.h      |    2 +-
>  sys/platform/pc64/apic/mpapic.c       |    5 +
>  sys/platform/pc64/include/smp.h       |    1 +
>  sys/platform/pc64/x86_64/mp_machdep.c |  155 ++++++++++++++++++++++++++----
>  8 files changed, 290 insertions(+), 47 deletions(-)
>
> http://gitweb.dragonflybsd.org/dragonfly.git/commitdiff/bb467734fc407e2c2de7f8314c63dd9f708f4df4
>
>
> --
> DragonFly BSD source repository
>

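To spell out the math from the commit message: at 1000 Hz an SMI arrives
every 1ms, so a 10ms INIT-to-STARTUP wait spans at least ten of them and
can never sit inside a single gap, while a 150us wait can.  Below is a
rough userland model of that reasoning.  It is not the code from the
commit.  The struct, the function names and the 100us SMI duration used
in the note after it are all made up for illustration, and it assumes
perfectly periodic SMIs, which the commit message says real hardware does
not give you.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: an SMI starts every period_us and steals smi_len_us. */
struct smi_model {
	uint64_t period_us;	/* e.g. 1000 for a 1000 Hz poll */
	uint64_t smi_len_us;	/* assumed SMI duration, must be < period_us */
};

/* Minimum number of SMI start points any window of len_us must span. */
static uint64_t
smi_starts_spanned(const struct smi_model *m, uint64_t len_us)
{
	assert(m->period_us > 0);
	return len_us / m->period_us;
}

/* True if [start_us, start_us + len_us) overlaps no SMI in the model. */
static bool
window_is_clear(const struct smi_model *m, uint64_t start_us, uint64_t len_us)
{
	assert(m->period_us > 0 && m->smi_len_us < m->period_us);
	uint64_t phase = start_us % m->period_us;

	if (phase < m->smi_len_us)
		return false;			/* starts inside an SMI */
	return phase + len_us <= m->period_us;	/* ends before the next one */
}

/*
 * Earliest start >= now_us at which a window of len_us fits in a gap,
 * or UINT64_MAX if the window is longer than any gap.
 */
static uint64_t
next_clear_start(const struct smi_model *m, uint64_t now_us, uint64_t len_us)
{
	assert(m->period_us > 0 && m->smi_len_us < m->period_us);
	uint64_t gap = m->period_us - m->smi_len_us;
	uint64_t phase = now_us % m->period_us;

	if (len_us > gap)
		return UINT64_MAX;		/* no gap is long enough */
	if (phase < m->smi_len_us)
		return now_us - phase + m->smi_len_us;	/* wait out this SMI */
	if (phase + len_us <= m->period_us)
		return now_us;			/* already fits */
	/* Too close to the next SMI: start right after it finishes. */
	return now_us - phase + m->period_us + m->smi_len_us;
}
```

With period_us = 1000 and an assumed smi_len_us = 100, next_clear_start()
finds no slot at all for a 10000us window, but does find one for a 150us
window.  The real kernel has to detect the SMIs on the fly rather than
assume a fixed period, so treat this only as a picture of why shrinking
the delay helps.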


-- 
Live Free or Die




