i386 version of cpu_sfence()

Sat Jan 29 09:44:13 PST 2011

:I think it suggests that:
: processor 0        processor 1
:store A <--- 1
:     :
:     : later
:     :..........>   load r1 A
:
:r1 could still be 0, since A is still in the store buffer, while:
:processor 0        processor 1
:store A <--- 1
:sfence
:     :
:     : later
:     :..........>   load r1 A
:
:r1 must be 1
:
:Well, I could be wrong on this.
:
:Best Regards,
:sephe

    Hmm. Well, for it not to be globally ordered, processor 1 would
    have to be able to see a second write before it sees the first
    write.  Since both writes sit in processor 0's write buffer,
    processor 1 should never see them out of order.

    With the caveat, however, that processor 1 CAN see them out of
    order if it reorders its reads or does speculative reads (which
    all cpus do by default), so processor 1 would need an LFENCE
    between the two reads.
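
    A minimal sketch of the reader side, assuming C11 atomics.  The
    names (struct mailbox, mailbox_publish, mailbox_try_consume) are
    made up for illustration and are not kernel APIs.  The acquire
    fence plays the role of the load fence between the two reads;
    which instruction, if any, it becomes is up to the compiler and
    the target.

```c
#include <stdatomic.h>

/* Hypothetical mailbox: the writer fills `data`, then sets `ready`. */
struct mailbox {
    _Atomic int data;
    _Atomic int ready;
};

void mailbox_init(struct mailbox *mb)
{
    atomic_init(&mb->data, 0);
    atomic_init(&mb->ready, 0);
}

/* Writer (processor 0): first write, then second write, in program order. */
void mailbox_publish(struct mailbox *mb, int value)
{
    atomic_store_explicit(&mb->data, value, memory_order_relaxed);
    /* Release store: the write of `data` may not be moved after it. */
    atomic_store_explicit(&mb->ready, 1, memory_order_release);
}

/*
 * Reader (processor 1): returns 1 and fills *out if a value was
 * published, 0 otherwise (leaving *out untouched).
 */
int mailbox_try_consume(struct mailbox *mb, int *out)
{
    /* First read: has the second write become visible yet? */
    if (atomic_load_explicit(&mb->ready, memory_order_relaxed) == 0)
        return 0;

    /* Load fence between the two reads: the read of `data` below
     * may not be satisfied ahead of the read of `ready` above. */
    atomic_thread_fence(memory_order_acquire);

    /* Second read: guaranteed to observe the published value. */
    *out = atomic_load_explicit(&mb->data, memory_order_relaxed);
    return 1;
}
```

    Without the fence, the language rules would allow the read of
    `data` to be done early and return the old value even though
    `ready` was already seen as 1.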

    But processor 0 (for x86) should not need an SFENCE.
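
    To illustrate, here is a rough sketch of a store fence that is
    only a compiler barrier on x86, assuming C11 atomics and ordinary
    cached memory.  sketch_sfence() and publish_pair() are
    hypothetical names, not the actual cpu_sfence() implementation.

```c
#include <stdatomic.h>

/*
 * Hypothetical store fence.  On x86 it only stops the compiler from
 * reordering stores across it and emits no instruction, relying on
 * the hardware keeping ordinary stores in program order.  On other
 * targets it falls back to a real release fence.
 */
static inline void sketch_sfence(void)
{
#if defined(__i386__) || defined(__x86_64__)
    atomic_signal_fence(memory_order_release);  /* compiler barrier only */
#else
    atomic_thread_fence(memory_order_release);  /* may emit a fence */
#endif
}

/* Write *a, then *b, keeping the two stores in program order. */
void publish_pair(_Atomic int *a, _Atomic int *b, int va, int vb)
{
    atomic_store_explicit(a, va, memory_order_relaxed);
    sketch_sfence();
    atomic_store_explicit(b, vb, memory_order_relaxed);
}
```

    This leans on an x86 hardware property rather than on what the C
    standard promises for a compiler-only fence, so it is a sketch of
    the idea and not portable code.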

    There might be another exception for write combining in the store
    buffer, though.  With write A, write B, write C, where A and C can
    be write combined, another cpu could wind up seeing the write of C
    before the write of B.  I'm not sure how wide the store buffer is
    (32 bits?).  We may not have to worry about it.
	
						-Matt