MESI Caching work start (was Re: caching and userapi)

Sat Jul 19 13:20:49 PDT 2003

:Matt, looking by my books, I've found out references to MESI protocol
:on the PowerPC 604 manual (page 3-11), which it uses for it's caches.
:
:...
:
:I assume that, for the software caches, there may be no passive devices
:like system memory is, but only active ones. Therefore:
:
:   p1  p2  valid references
:---------------------------
:1. s   i   p1
:2. s   s   both p1 and p2
:3. e   i   p1
:4. m   i   p1
:
:Coding a simple state machine and veryfing it by
:brute force should not take too much time ;)

    Modified	One entity owns the object and that entity has modified the
		object.  The entity cannot downgrade until it has written
		out the object.  (In a software implementation this does not
		mean we have to write the object to its backing store, we 
		could instead hand the object to the next exclusive requester
		and only flush it if there are shared requesters).

    Exclusive	One entity owns the object but the object has not been
		modified.  The entity can throw the object away without
		further synchronization.  The entity can modify the object
		(changing it from Exclusive to Modified).

    Shared	(1) One entity has access to the object READ-ONLY but isn't
		sure if other entities might also have the object READ-ONLY.

		(2) Multiple entities are known to have the object READ-ONLY.

    Invalid	The data being cached by this entity is known to be bad because
		another entity invalidated it.

    Movement between states entails a certain degree of work.  You can always
    move from Exclusive to Modified without any work, since you know you are
    the only owner of an object.  Going from Shared to Exclusive requires
    invalidating anyone else who might be caching the object as Shared, but
    you know there are no exclusive or modified holders if your state is 
    already Shared.  Going from Invalid to any state requires negotiating
    with holders of the object, typically by broadcast, and possibly loaded
    the data from backing store.

    Certain optimizations are possible.   One can implement a SHARED+MODIFIED
    state, for example, where cpu 1 is holding an object in a Modified state
    and cpu 2 asks cpu 1 to downgrade to SHARED (which would normally require
    a flush from cpu 1 to backing store) in order for cpu 2 to be able to
    enter a SHARED state.  Instead of flushing cpu 1 would tell cpu 2 to go
    into a SHARED+MODIFIED state and provide cpu 2 with its modified data
    directly.  Then both cpu 1 and cpu 2 would be in a SHARED+MODIFIED state.
    This allows cpu 1 to avoid having to flush the data to backing store just
    so cpu 2 can access it, but adds a little more work later so multiple
    cpus don't both flush the (same) data to backing store.

:I've also googled a bit and found this one talking about MESI:
:
:http://uenics.evansville.edu/~mr56/ece757/Lecture5.pdf
:
:Regarding broadcast communication between CPUs, we can model it
:as an broadcasting ethernet: you store data on shared media
:(a global page for all CPUs, for example), tag some recipients
:on this data (like an 48bit ethernet address would be)
:and then send a global interrupt to all nodes,
:which would copy it to internal buffers and resume
:processing as fast as possible.

    One can use a 'global' FIFO for this which has one write index and multiple
    read indexes.  Anyone can write, and all the CPUs can read the request at
    their own pace.  The entry will not be overwritten (fifo action) until
    all readers have bumped their read index past the object.  This places a
    burden on the writer but allows the readers lockless access to the FIFO.

:Recipient tagging would allow partitioning the machine,
:such as for example allowing route cache entries only
:on CPU 1 and 2 but not on others.
:
:Time to read and code a bit more :)
:
:Greets, Antonio.

					-Matt
					Matthew Dillon 
					<dillon at xxxxxxxxxxxxx>