SMP CPU Synchronization patch - needs testing on SMP systems

Matthew Dillon dillon at apollo.backplane.com
Sun Feb 15 12:31:44 PST 2004


    This patch needs some serious testing on SMP systems.  It is part 1 of
    a multi-stage patch to fix serious issues with our VM system.  

    This patch is an outgrowth of conversations I have had with Alan Cox
    and Tor regarding TLB writeback races between cpus.

    Basically the problem is that a user process running on one cpu may
    modify memory, which causes the CPU to issue a TLB writeback of the page
    table entry in order to (e.g.) set the 'M'odified or 'A'ccessed bit in
    the pte.  Since the user process does not need the MP lock, this writeback
    can race against another cpu running in the kernel (that is holding the
    MP lock) trying to update the same page table entry.
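
    To make the race concrete, here is a minimal single-threaded sketch
    that replays the bad interleaving step by step.  It is only a model
    and not code from the patch: pte_t is assumed to be a 32-bit entry,
    the bit values are the usual i386 pte bits, and the helper name
    racy_write_protect() is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Usual i386 page table entry bits. */
#define PG_V   0x001u   /* valid */
#define PG_RW  0x002u   /* writable */
#define PG_A   0x020u   /* 'A'ccessed, set by hardware */
#define PG_M   0x040u   /* 'M'odified, set by hardware */

typedef uint32_t pte_t; /* assumption: non-PAE 32-bit pte */

/*
 * Hypothetical helper: replays one unlucky interleaving of a kernel
 * cpu write-protecting a page while a user process on another cpu
 * dirties it.  Returns the final pte contents.
 */
static pte_t
racy_write_protect(pte_t *ptep)
{
    pte_t snapshot;

    assert(ptep != NULL);

    /* Step 1 (kernel cpu, holding the MP lock): read the pte. */
    snapshot = *ptep;

    /*
     * Step 2 (user cpu, no MP lock): the process writes to the page
     * and the hardware writes the A and M bits back into the pte.
     */
    *ptep |= PG_A | PG_M;

    /*
     * Step 3 (kernel cpu): store the modified copy of the stale
     * snapshot.  The A and M bits set in step 2 are silently lost.
     */
    *ptep = snapshot & ~PG_RW;

    return *ptep;
}
```

    Starting from a pte of PG_V | PG_RW, the model ends with only PG_V
    set: the page has been written to, but the pte no longer says so,
    which is how a dirty page can end up looking clean to the VM system.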

    The result is occasional weird failures and panics such as
    "dirty page found in cache queue".

    This patch basically creates a CPU synchronization and rendezvous API
    that allows us to force other cpus into a known state while we make
    sensitive page table changes.  Also in this patch is a reworking of the
    PMAP subsystem to use the new framework.  Code to deal with modified bit
    races in the VM system is slated for a future stage.
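
    The general shape of a rendezvous can be sketched in userland with
    pthreads and C11 atomics.  This is an illustration of the idea only
    and makes no claim about the patch's actual API: threads stand in
    for cpus, and every name here (struct cpusync_model, target_cpu(),
    run_rendezvous_model(), NCPUS) is hypothetical.  The initiator asks
    the other "cpus" to park, waits until all of them have, changes the
    model pte, and then releases them.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

#define NCPUS 4         /* hypothetical: cpu 0 initiates, 3 targets */
#define PG_M  0x040u    /* 'M'odified bit */

struct cpusync_model {
    atomic_int request; /* 1 while the initiator wants targets parked */
    atomic_int parked;  /* number of targets currently parked */
    atomic_int stop;    /* 1 when the target threads should exit */
};

static struct cpusync_model cs;
static atomic_uint model_pte;

/* A target "cpu": dirties the pte until asked to park. */
static void *
target_cpu(void *arg)
{
    (void)arg;
    while (!atomic_load(&cs.stop)) {
        if (atomic_load(&cs.request)) {
            atomic_fetch_add(&cs.parked, 1);    /* acknowledge */
            while (atomic_load(&cs.request))
                sched_yield();                  /* known state */
            atomic_fetch_sub(&cs.parked, 1);
        } else {
            atomic_fetch_or(&model_pte, PG_M);  /* "hardware" writeback */
        }
    }
    return NULL;
}

/*
 * Runs the given number of rendezvous rounds and returns how often the
 * pte changed while the targets were supposed to be parked (-1 if a
 * thread could not be created).
 */
static int
run_rendezvous_model(int rounds)
{
    pthread_t tid[NCPUS - 1];
    int i, n, r, violations = 0;

    assert(rounds > 0);
    atomic_store(&cs.request, 0);
    atomic_store(&cs.parked, 0);
    atomic_store(&cs.stop, 0);
    atomic_store(&model_pte, 0);

    for (n = 0; n < NCPUS - 1; n++) {
        if (pthread_create(&tid[n], NULL, target_cpu, NULL) != 0) {
            violations = -1;
            break;
        }
    }

    for (r = 0; violations >= 0 && r < rounds; r++) {
        unsigned snapshot;

        atomic_store(&cs.request, 1);           /* start rendezvous */
        while (atomic_load(&cs.parked) != NCPUS - 1)
            sched_yield();                      /* wait for all acks */

        snapshot = atomic_load(&model_pte);     /* sensitive update */
        sched_yield();
        if (atomic_load(&model_pte) != snapshot)
            violations++;                       /* someone raced us */
        atomic_store(&model_pte, snapshot & ~PG_M);

        atomic_store(&cs.request, 0);           /* release targets */
        while (atomic_load(&cs.parked) != 0)
            sched_yield();
    }

    atomic_store(&cs.stop, 1);
    for (i = 0; i < n; i++)
        pthread_join(tid[i], NULL);
    return violations;
}
```

    In this model a target only touches the pte before it acknowledges
    the request, so the initiator's read-modify-write cannot be
    interleaved with a writeback.  A real kernel would use IPIs rather
    than polling threads; the sketch only shows the request, acknowledge
    and release ordering.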

    I would appreciate it if those of you with SMP systems could test this
    patch.  I have done some preliminary testing on a Dell-2550 and it was
    able to successfully buildworld twice, so I believe the patch is
    reasonably stable.  But it is a big patch representing a lot of work,
    and it needs some third-party testing before I feel I can commit it.

	fetch http://apollo.backplane.com/DFlyMisc/cpusync01.patch






