git: Kernel - Close VM/BIO races and document.o

Matthew Dillon dillon at crater.dragonflybsd.org
Thu Aug 27 22:49:15 PDT 2009


commit cb1cf930f3044653f7c85caa21cec345878b00f1
Author: Matthew Dillon <dillon at apollo.backplane.com>
Date:   Thu Aug 27 20:34:50 2009 -0700

    Kernel - Close VM/BIO races and document.o
    
    * Remove vfs_setdirty(), it is no longer used.
      Remove vfs_page_set_valid(), it is no longer used.
      Remove vfs_bio_set_valid(), it is no longer used.
    
    * When acquiring a buffer with getblk() whose size differs from the
      buffer already cached, no longer destroy the VM pages backing
      the buffer after completing the write.  Instead just release
      the buffer so a new, larger one can be constructed.
    
      NFS buffers which straddle file EOF can remain cached after the
      file has been extended via seek/write or ftruncate, and their
      underlying VM pages may become dirty via mmap.  If the buffer
      is acquired later the underlying VM pages beyond the buffer's
      original b_bcount size must be retained, not destroyed.
    
    * No longer try to clear the pmap modified bit from misc vm_page_*()
      functions.  In cases where we desire the pmap modified bit to be
      clear, it should *already* have been cleared in the run-up to the
      I/O.  Clearing it later may cause the buffer cache to lose track
      of the fact that underlying VM pages may have been modified again.
    
      NFS buffers use b_dirtyoff/b_dirtyend to determine what to actually
      write.  If the VM page is modified again the current write operation
      will not cover all the dirty parts of the buffer and another write
      will have to be issued.  Clearing the pmap modified bit at later
      stages did not properly track changes in b_dirtyoff/b_dirtyend and
      resulted in dirty data being lost.
    
    * Implement vfs_clean_one_page() to deal with nearly all buffer cache vs
      backing VM page dirty->clean handling at the appropriate time.
    
      In addition, this function now detects the case where a buffer has
      B_NEEDCOMMIT set but the underlying VM page is dirty, and clears
      the B_NEEDCOMMIT flag when it finds it.  Because buffer sizes are
      not necessarily page aligned, the function can only clear the dirty
      bits associated with the buffer, unlike the putpages code, which is
      able to clear ALL the dirty bits.  The B_NEEDCOMMIT test is
      therefore made only against those dirty bits associated with the
      buffer.
    
      This fixes a race where VM pages backing a dirty buffer which has gone
      through the phase-1 commit are dirtied via mmap, and NFS then goes
      through with the phase-2 commit and throws the data away when it really
      needed to go back and do another phase-1 commit.
    
    * In vnode_generic_put_pages() no longer clear the VM page dirty bits
      associated with bits of a file which extend past file EOF in the
      page straddling the EOF.  We used to do this with the idea that
      we would only clear the dirty bits up to the file EOF later on
      in the I/O completion code.
    
      However, this was too fragile.  If a page ended up with any dirty
      bits left set it would remain endlessly dirty and be reflushed forever.
    
      We now clear the dirty bits for the entire page after a putpages
      operation completes without error, and don't bother doing it
      prior to I/O initiation.
    
    * Call nfs_meta_setsize() for both seek+write extensions (holes) and for
      ftruncate extensions (holes).
    
      nfs_meta_setsize() now deterministically adjusts the size of the buffer
      that was straddling the PRIOR EOF point, fixing an issue where
      write-extending a file to near the end of an NFS buffer boundary (32K),
      then seek-write extending it further by creating a hole, then
      mmap()ing the end of the first chunk and modifying data past the
      original write-extend point... would lose the extra data because
      the original buffer was still intact and was still sized for the
      original EOF.  This was difficult to reproduce because it only occurred
      if all the dirty bits got cleared when the original buffer was flushed,
      meaning the original write-extend point had to be within 511 bytes of
      the end of a 32K boundary.
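
The dirty-bit handling described above for vfs_clean_one_page(), and the
"within 511 bytes" condition in the last item, can be illustrated with a small
user-space model.  This is only a sketch and not the kernel code: it assumes a
4096-byte page tracked as eight 512-byte dirty chunks, and every name in it
(model_page, model_buf, model_chunk_mask(), model_clean_one_page(),
model_examples()) is hypothetical.

```c
#include <assert.h>

#define MODEL_PAGE_SIZE   4096u  /* assumed page size */
#define MODEL_CHUNK_SHIFT 9      /* assumed 512-byte dirty granularity */

/* Hypothetical stand-ins for a VM page and a buffer. */
struct model_page { unsigned dirty; };      /* one bit per 512-byte chunk */
struct model_buf  { int needcommit; };      /* models B_NEEDCOMMIT */

/*
 * Mask of the dirty bits touched by the byte range [base, base + size)
 * within one page.  The caller keeps base + size <= MODEL_PAGE_SIZE.
 */
static unsigned
model_chunk_mask(unsigned base, unsigned size)
{
    unsigned first, last;

    if (size == 0)
        return 0;
    first = base >> MODEL_CHUNK_SHIFT;
    last = (base + size - 1) >> MODEL_CHUNK_SHIFT;
    return (2u << last) - (1u << first);
}

/*
 * Clear only the dirty bits covered by the buffer.  If any of them were
 * set while the buffer was waiting for its phase-2 commit, drop the
 * needcommit flag so the data is written again instead of thrown away.
 */
static void
model_clean_one_page(struct model_buf *bp, struct model_page *pg,
                     unsigned base, unsigned size)
{
    unsigned mask = model_chunk_mask(base, size);

    if (bp->needcommit && (pg->dirty & mask) != 0)
        bp->needcommit = 0;
    pg->dirty &= ~mask;
}

/* Worked examples for the two cases discussed in the text. */
static void
model_examples(void)
{
    struct model_buf bp = { 1 };
    struct model_page pg = { 0xffu };

    /* Buffer ends 512 bytes short of the page end: last chunk stays dirty. */
    model_clean_one_page(&bp, &pg, 0, MODEL_PAGE_SIZE - 512);
    assert(pg.dirty == 0x80u);
    assert(bp.needcommit == 0);

    /* Buffer ends within 511 bytes of the page end: every bit is cleared. */
    pg.dirty = 0xffu;
    model_clean_one_page(&bp, &pg, 0, MODEL_PAGE_SIZE - 511);
    assert(pg.dirty == 0);
}
```

In this model a buffer that ends at least 512 bytes before the end of the page
leaves the final chunk's dirty bit set, so the page is still seen as dirty.
Only when the buffer ends inside the final chunk does the flush clear every
bit, which appears to be why the lost-data case needed the original
write-extend point to fall so close to the 32K boundary.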

Summary of changes:
 sys/kern/vfs_bio.c          |  413 +++++++++++++++++++++----------------------
 sys/vfs/devfs/devfs_vnops.c |    7 +-
 sys/vfs/nfs/nfs.h           |    3 +-
 sys/vfs/nfs/nfs_bio.c       |   97 +++++++++--
 sys/vfs/nfs/nfs_vnops.c     |    5 +-
 sys/vfs/nwfs/nwfs_io.c      |    9 +-
 sys/vfs/smbfs/smbfs_io.c    |    9 +-
 sys/vm/vm_page.c            |   19 ++-
 sys/vm/vm_page2.h           |    4 +
 sys/vm/vnode_pager.c        |   34 ++--
 10 files changed, 345 insertions(+), 255 deletions(-)

http://gitweb.dragonflybsd.org/dragonfly.git/commitdiff/cb1cf930f3044653f7c85caa21cec345878b00f1


-- 
DragonFly BSD source repository




