VM idle page zeroing

Matthew Dillon dillon at apollo.backplane.com
Mon May 17 17:41:50 PDT 2010

    Here are buildworld times on my test box with and without pre-zeroing,
    single threaded (no -j option used):

    /usr/bin/time -l make buildworld >& /tmp/bw.out

     2306.94 real      1740.54 user       393.90 sys	nocache=1, enable=1
     2305.50 real      1738.80 user       397.38 sys	nocache=1, enable=1
     2306.85 real      1736.72 user       399.62 sys	nocache=1, enable=1
     2307.96 real      1737.63 user       398.24 sys	nocache=1, enable=1
     2306.22 real      1741.29 user       395.15 sys	nocache=1, enable=1
     2320.21 real      1739.50 user       396.49 sys	nocache=1, enable=1
     2311.48 real      1739.04 user       396.29 sys	nocache=1, enable=1
     2323.78 real      1735.54 user       399.61 sys	nocache=1, enable=1

     2331.38 real      1733.84 user       417.91 sys	enable=0
     2327.75 real      1733.74 user       415.24 sys	enable=0
     2322.29 real      1731.73 user       419.57 sys	enable=0
     2327.53 real      1729.05 user       416.27 sys	enable=0
     2335.71 real      1731.32 user       417.93 sys	enable=0

    As you can see, it does seem to make a small difference.  The system
    time is about 20 seconds (4.8%) faster with pre-zeroing and the real
    time is about 18 seconds (0.7%) faster.

    What this says, basically, is that the pre-zeroing helps out serialized
    operations (such as shell scripts) which have a lot of zfod faults but
    probably doesn't make much of a difference in the grand scheme of things.
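
    (For anyone following along: a zfod fault -- zero-fill-on-demand --
    is taken the first time a process touches an anonymous page, at
    which point the kernel must supply a page of zeros, either by
    zeroing one in the fault path or by pulling one off the pre-zeroed
    queue.  Here is a minimal userland sketch that generates a burst of
    such faults; the page count is arbitrary.)

	#include <stdio.h>
	#include <unistd.h>
	#include <sys/mman.h>

	int
	main(void)
	{
	    size_t pgsz = (size_t)getpagesize();
	    size_t npages = 10000;
	    char *p;

	    /* Anonymous mappings have no pages behind them yet. */
	    p = mmap(NULL, npages * pgsz, PROT_READ | PROT_WRITE,
		MAP_ANON | MAP_PRIVATE, -1, 0);
	    if (p == MAP_FAILED) {
		perror("mmap");
		return 1;
	    }
	    /* Each first write to a page takes one zfod fault. */
	    for (size_t i = 0; i < npages; i++)
		p[i * pgsz] = 1;
	    munmap(p, npages * pgsz);
	    return 0;
	}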

    I also tried some bursting tests, but it turns out that once the
    system is in a steady state the VM free page queue doesn't hold
    enough pages to keep up with the zfod fault rate.  So I bumped
    vm.v_free_target to an absurdly high number like 100000 (400MB) and
    ran a little shell script which runs /bin/echo in a big loop:

	set i = 0
	while ( $i < 10000 )
	    /bin/echo fubar > /dev/null
	    @ i = $i + 1
	end

    0.437u 1.148s 0:01.59 98.7%     342+164k 0+0io 0pf+0w  idlezero enabled
    0.460u 1.257s 0:01.71 100.0%    313+155k 0+0io 0pf+0w  idlezero disabled

    That's around a 7% difference in execution time.  The problem,
    though, is that the system really needs a large number of free pages
    so the zfod fault burst doesn't drain the pre-zeroed page cache
    during the test.

    With the default vm.v_free_target of 20000 (40MB) on this box, the
    zfod faults strip out the pre-zeroed pages almost instantly and the
    pgzero thread can't keep up.  Run times wind up around 1.69 seconds,
    or only a 1.2% improvement.

    I then tried bumping vm.idlezero_rate up to infinity (well, 999999)
    with a normal vm.v_free_target and that helped some.  I got numbers
    in the 1.61 to 1.65 second range.
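
    For reference, the two knobs can be read programmatically as well as
    via sysctl(8).  A small sketch using sysctlbyname(3) -- assuming
    both OIDs are plain ints, which I haven't double-checked:

	#include <stdio.h>
	#include <sys/types.h>
	#include <sys/sysctl.h>

	static void
	show(const char *name)
	{
	    int val;
	    size_t len = sizeof(val);

	    if (sysctlbyname(name, &val, &len, NULL, 0) == 0)
		printf("%s = %d\n", name, val);
	    else
		perror(name);
	}

	int
	main(void)
	{
	    show("vm.v_free_target");	/* free page target, in pages */
	    show("vm.idlezero_rate");	/* idle-zeroing rate limit */
	    return 0;
	}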

    It may well be that the correct answer here is to remove the rate
    limiting entirely.
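
    To be clear about what that would mean: the rate limit throttles the
    zeroing loop to roughly vm.idlezero_rate pages per second.  Below is
    a userland analogue of that throttle -- emphatically not the actual
    pgzero code; the structure, budget, and tick are invented for
    illustration.  Removing the rate limiting corresponds to deleting
    the throttle block.

	#include <string.h>
	#include <time.h>

	#define PGSZ	4096
	#define RATE	50000		/* pages-per-second budget */

	int
	main(void)
	{
	    static char page[PGSZ];
	    long budget = RATE / 100;	/* pages allowed per 10ms tick */
	    long used = 0;
	    struct timespec tick = { 0, 10000000L };	/* 10ms */

	    for (long i = 0; i < 1000000L; i++) {
		/* Stand-in for zeroing one free page. */
		memset(page, 0, PGSZ);
		if (++used >= budget) {
		    /* The throttle: sleep out the rest of the tick. */
		    nanosleep(&tick, NULL);
		    used = 0;
		}
	    }
	    return 0;
	}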

