Page fault handling in vpagetable area clarification

Aggelos Economopoulos aoiko at cc.ece.ntua.gr
Sat Feb 10 14:00:56 PST 2007


On Saturday 10 February 2007 21:57, Matthew Dillon wrote:
> :I'll keep the technical questions public so that web searches can find
> : them.
> :
> :In vm_fault_object(), fs.prot gets downgraded unconditionally if this is a
> :VM_MAPTYPE_VPAGETABLE entry. But if this was a write fault,
> :vm_fault_vpagetable() has already set VPTE_M (and if the vkernel clears
> : it, its pmap_clearbit() will invalidate the real kernel's pagetables).
> : Why can't the protection stay RW in this case?
> :
> :Aggelos
>
>     fs.prot is only downgraded for read faults on writable pages.   Write
>     faults on writable pages will be mapped RW.
>
>     Normally when a read fault occurs on a writable page the kernel will
>     map the page read-write, but still mark the page as being clean in its
>     vm_page structure.  Any future write to that page will cause the
>     hardware page table's modified bit to be set.  The real kernel lazily
>     checks the modified bit in the hardware page table entry at some future
>     time to determine if the page is actually still clean or not.
>
>     In order to properly simulate the setting of the modified bit in the
>     virtual page table, read faults to writable pages within the VM space
>     governed by the virtual page table must be mapped read-only instead
>     of read-write in order to force an actual write fault to occur if the
>     page is written.  Otherwise the real kernel has no way of knowing
>     when to set the modified bit in the virtual page table entry.

Right, but that's the read fault on a writable page case. My question was about
a _write_ fault on a writable page. The way I'm reading the code, the host
kernel again maps the page read-only: vm_fault_object() clears the write bit
from fs.prot, so after returning to vm_fault(), pmap_enter() will map the page
read-only. However, since this is a _write_ fault, vm_fault_vpagetable() has
already set VPTE_M in the vpagetable. Therefore, the real kernel doesn't have
to worry about the modified bit (it's already set in the vpagetable, and that
should be enough, right?). So I don't see why the page can't be mapped RW.
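
To make the question concrete, here is a toy model in C. It is not the kernel
code: the struct, the TOY_* constants and the helper functions are all made up
for illustration, and only the names vm_fault_object(), vm_fault_vpagetable(),
fs.prot and VPTE_M come from the discussion above. The first helper models the
unconditional downgrade as I read it; the second models what I would have
expected, assuming that VPTE_M being set by the write fault is sufficient.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins; the real kernel definitions differ. */
#define TOY_PROT_READ   0x1
#define TOY_PROT_WRITE  0x2
#define TOY_VPTE_M      0x4     /* "modified" bit in the virtual PTE */

struct toy_fault {
    int      prot;         /* protection later handed to pmap_enter() */
    bool     write_fault;  /* true if the fault was caused by a write */
    unsigned vpte;         /* the vkernel's virtual page table entry */
};

/* Models the step attributed to vm_fault_vpagetable():
 * a write fault sets the modified bit in the virtual PTE. */
static void toy_vpagetable_walk(struct toy_fault *fs)
{
    if (fs->write_fault)
        fs->vpte |= TOY_VPTE_M;
}

/* Behaviour as I read the code: write permission is always dropped
 * for a vpagetable-backed entry, whatever the fault type. */
static int toy_prot_unconditional(struct toy_fault fs)
{
    toy_vpagetable_walk(&fs);
    return fs.prot & ~TOY_PROT_WRITE;
}

/* Hypothetical alternative: drop write permission only for read
 * faults, since a write fault has already recorded VPTE_M. */
static int toy_prot_conditional(struct toy_fault fs)
{
    toy_vpagetable_walk(&fs);
    if (fs.write_fault) {
        assert(fs.vpte & TOY_VPTE_M);   /* already marked modified */
        return fs.prot;
    }
    return fs.prot & ~TOY_PROT_WRITE;
}
```

In this model a read fault ends up read-only under both helpers, which is the
case explained above. The two only differ for a write fault on a page that
starts out RW: the first returns read-only, the second keeps RW.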

Aggelos




