pmap of amd64

Fri Oct 12 20:04:57 PDT 2007

:> > 3. how to do with the per-cpu data, should a PML4/PDP entry for each CPU?
:>
:> A per-CPU PML4 entry would be wildly inefficient, also limiting the
:> scalability of DragonFly/x86-64 on current CPUs to, say, 254 processors
:> tops. As there's not a lot of per-CPU data pages, I think it's safe to
:> go with the same number we have on pc32.

   To clarify, a single PML4 entry can be used to map the per-cpu
   data at the same virtual address on all cpus.  This works if each
   cpu has its own PML4 table (i.e. the PML4 is made per-cpu rather than
   per-process).
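
   As a rough illustration, here is a user-space sketch of that layout.
   It is not DragonFly code: DEMO_NCPU, PERCPU_SLOT, cpu_pml4[] and
   percpu_map() are hypothetical names, and slot 510 is an arbitrary
   choice.  It only shows that one reserved slot, holding a different
   PDP address in each cpu's table, gives every cpu the same virtual
   address backed by different pages.

```c
#include <assert.h>
#include <stdint.h>

#define NPML4E       512    /* entries in one PML4 table */
#define PML4SHIFT    39     /* each PML4 entry spans 2^39 bytes (512 GB) */
#define DEMO_NCPU    4      /* hypothetical cpu count for this sketch */
#define PERCPU_SLOT  510    /* hypothetical slot reserved for per-cpu data */
#define PG_V         0x001ULL   /* x86-64 "present" bit */
#define PG_RW        0x002ULL   /* x86-64 "writable" bit */

typedef uint64_t pml4e_t;

/* One PML4 table per cpu, instead of one per process. */
static pml4e_t cpu_pml4[DEMO_NCPU][NPML4E];

/* PML4 slot that translates a given virtual address. */
static unsigned
pml4_index(uint64_t va)
{
	return (unsigned)((va >> PML4SHIFT) & (NPML4E - 1));
}

/* Lowest canonical virtual address covered by a PML4 slot. */
static uint64_t
pml4_slot_base(unsigned slot)
{
	uint64_t va = (uint64_t)slot << PML4SHIFT;

	if (slot >= NPML4E / 2)
		va |= 0xFFFF000000000000ULL;	/* sign-extend bit 47 */
	return va;
}

/* Point this cpu's reserved slot at its own private PDP page. */
static void
percpu_map(int cpu, uint64_t pdp_phys)
{
	assert(cpu >= 0 && cpu < DEMO_NCPU);
	assert((pdp_phys & 0xFFFULL) == 0);	/* must be page aligned */
	cpu_pml4[cpu][PERCPU_SLOT] = pdp_phys | PG_V | PG_RW;
}
```

   Every cpu then reaches its private data at pml4_slot_base(PERCPU_SLOT),
   and only one of the 512 slots is consumed no matter how many cpus
   there are.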

   This comes back to the question of whether there should be one PML4
   table for each cpu or whether there should be a PML4 table for each
   user vmspace.  I strongly recommend one for each cpu because it simplifies
   ALL of the pmap code at the cost of making a process switch slightly more
   expensive (and again, only processes that actually use a huge address
   space would incur this cost).  Frankly I think the extra cost winds up
   in the noise compared to the number of page table accesses the cpu has
   to make anyway to reload the TLB after a process switch.

   The extra cost is that the process switch code would have to copy N PML4
   entries for the target process versus just reloading %cr3.  Copying one
   entry is clearly not costly.  Two, four, even eight entries is not
   costly.  Copying half the table (256 entries) might be a different
   matter, but I think it is still a better solution than giving each
   process its own PML4 table.
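
   A minimal sketch of that switch-time copy, again in user-space C with
   hypothetical names (struct demo_vmspace, demo_switch()).  It assumes,
   purely for simplicity, that a process's populated PML4 entries are
   contiguous starting at slot 0; a real address space need not be laid
   out that way.  The TLB invalidation a real kernel would presumably
   still need after rewriting the entries is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NPML4E   512	/* entries in one PML4 table */
#define NUPML4E  256	/* user half of the table */

typedef uint64_t pml4e_t;

/* Hypothetical per-process state: just the PML4 entries it uses. */
struct demo_vmspace {
	int	npml4;			/* populated user slots, from slot 0 */
	pml4e_t	pml4[NUPML4E];
};

/*
 * Install vm's user entries into the current cpu's PML4 table.
 * old_n is the number of slots the previous process had installed.
 * Returns the number of slots now in use, to pass in as old_n on the
 * next switch.  Kernel and per-cpu slots in the upper half are never
 * touched.
 */
static int
demo_switch(pml4e_t *cpu_pml4, int old_n, const struct demo_vmspace *vm)
{
	assert(vm->npml4 >= 0 && vm->npml4 <= NUPML4E);
	assert(old_n >= 0 && old_n <= NUPML4E);

	/* The "copy N entries" cost: N * 8 bytes. */
	memcpy(cpu_pml4, vm->pml4, (size_t)vm->npml4 * sizeof(pml4e_t));

	/* Clear entries the previous process left behind. */
	if (old_n > vm->npml4) {
		memset(&cpu_pml4[vm->npml4], 0,
		    (size_t)(old_n - vm->npml4) * sizeof(pml4e_t));
	}
	return vm->npml4;
}
```

   Since one PML4 entry spans 512 GB, a typical process would copy only
   a handful of 8-byte entries here.  The 256-entry worst case is 2 KB.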

						-Matt