git: kernel - SMP - "Fix AP #%d (PHY# %d) failed" issues

Tobias Weingartner weingart at tepid.org
Wed Feb 10 16:52:30 PST 2010


On Wednesday, February 10, Matthew Dillon wrote:
> 
>     kernel - SMP - "Fix AP #%d (PHY# %d) failed" issues
>     
>     Ok, here's what is going on.  If an SMI interrupt occurs while
>     an AP is going through the INIT/STARTUP IPI sequence the AP will
>     brick, and nothing you do will resurrect it.

Ok, that's nuts.  Completely nuts.  I'm surprised that you could even
debug it and find a reliable way to work around this.  I'm very
surprised that an SMI interrupt depends on or changes state in
such a way that things simply hang.  SMIs are supposed to be transparent,
other than userland "losing" some time...

Yuck, yuck, yuck.

Possibly we only need the delay on "old" style SMP boxes with external
APICs?  I.e., on new hardware with the APIC on-chip, we may be able to
get away with a much smaller delay in general?

-Toby.




