i386 version of cpu_sfence()

Sepherosa Ziehau sepherosa at gmail.com
Sat Jan 29 04:53:25 PST 2011


On Sat, Jan 29, 2011 at 2:54 AM, Matthew Dillon
<dillon at apollo.backplane.com> wrote:
>
> :Hi all,
> :
> :i386 version of cpu_sfence(), it is just asm volatile ("" :::"memory")
> :
> :According to the instruction set, sfence should also ensures that the
> :"global visibility" (i.e. empty CPU store buffer) of the stores before
> :sfence.
> :So should we do the same as cpu_mfence(), i.e. use a locked memory access?
> :
> :Best Regards,
> :sephe
>
>    cpu_sfence() is basically a NOP, because x86 cpus already order
>    writes for global visibility.  The volatile ..."memory" macro is

The document only indicates that writes are ordered on x86; it does not
guarantee their immediate global visibility:
http://support.amd.com/us/Processor_TechDocs/24593.pdf
(see the second point on page 166)

I think it suggests that:
 processor 0        processor 1
store A <--- 1
     :
     : later
     :..........>   load r1 A

r1 could still be 0, since the store to A may still be sitting in
processor 0's store buffer, while:

 processor 0        processor 1
store A <--- 1
sfence
     :
     : later
     :..........>   load r1 A

r1 must be 1

Well, I could be wrong on this.

>    roughly equivalent to cpu_ccfence() ... prevent the compiler itself
>    from trying to optimize or reorder actual instructions around that
>    point in the code.

Best Regards,
sephe

-- 
Tomorrow Will Never Die