BayLisa Presentation slides available

Sun Dec 18 22:35:19 PST 2005

    It was a fun meeting.  The Apple campus is a great place and they 
    have a late night bar nearby so we got some food afterwards.

    I have put the slides up on the DragonFly site:

    http://www.dragonflybsd.org/docs/
    http://www.dragonflybsd.org/docs/LISA200512/

    They are fairly self-explanatory.  There are a couple of things to 
    note:

    * Keep in mind that the TSCs on the two cpus drift from each other.  I
      was resynchronizing them 10 times a second but they are still upwards
      of 200ns off.  This is demonstrated in the IPIQ Messaging PING-PONG
      test numbers.

    * Timings that show repetition with sequential sequence numbers (on any
      given cpu) tend to be end-to-end tests.  So, e.g. the tsleep/wakeup
      tests are end-to-end tests that include scheduler overheads.

    * There are two sets of SLAB allocator free() path tests.  The first 
      set has bogus numbers (it's on a slide so I could explain why they
      were bogus, which was simply due to the extensive KTR logging blowing
      up the numbers).
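
    To illustrate the TSC note above: a minimal sketch, not taken from the
    slides, of how a fixed offset between two cpu clocks shows up in
    ping-pong timings.  The struct, field, and function names are
    hypothetical, and the skew estimate assumes the latency is the same in
    both directions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ping-pong sample; all values are nanoseconds. */
struct pingpong_sample {
    int64_t a_send;  /* read on cpu A's clock when the ping is sent      */
    int64_t b_recv;  /* read on cpu B's clock when the ping arrives      */
    int64_t b_send;  /* read on cpu B's clock when the pong is sent      */
    int64_t a_recv;  /* read on cpu A's clock when the pong arrives      */
};

/*
 * Round-trip time with cpu B's turnaround removed.  Each difference
 * uses only one cpu's clock, so a constant offset between the two
 * clocks cancels out.
 */
int64_t round_trip_ns(const struct pingpong_sample *s)
{
    assert(s != NULL);
    return (s->a_recv - s->a_send) - (s->b_send - s->b_recv);
}

/*
 * Estimate of (cpu B clock - cpu A clock).  ASSUMPTION: the one-way
 * latency is the same in both directions.
 */
int64_t skew_estimate_ns(const struct pingpong_sample *s)
{
    assert(s != NULL);
    return ((s->b_recv - s->a_send) - (s->a_recv - s->b_send)) / 2;
}
```

    With made-up values a_send=1000, b_recv=1700, b_send=1800, a_recv=2100,
    the two one-way legs look like 700ns and 300ns, but the round trip is
    1000ns and the estimated offset is 200ns.  One-way numbers taken across
    cpus should be read with that in mind.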

    I was quite impressed with the results.  I did not expect the IPI
    code to be so fast, and even medium-complexity code such as processing
    a TCP packet tended to take on the order of 1 us.  That's a good
    argument for not doing preemption of non-interrupt threads.

    The major point of the presentation was to show the efficiencies gained
    when operations can be aggregated.  For example, the use of passive IPI
    message queueing for free() tends to aggregate several free operations
    on the target cpu, which are then executed in a tight loop.  The same
    holds for the network RX interrupt and for both network and TCP packet
    processing, and so on.
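
    The aggregation idea can be sketched roughly as follows.  This is a
    single-threaded model only, not the DragonFly implementation: the
    names (free_queue, freeq_post, freeq_drain) and the queue depth are
    hypothetical, and a real kernel would use a per-cpu message queue
    that is safe against concurrent producers.  It only shows that
    posting is cheap and that the owner frees a whole batch in one pass.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define FREEQ_SIZE 64            /* hypothetical queue depth */

/* Pending frees destined for the cpu that owns the memory. */
struct free_queue {
    void  *ptrs[FREEQ_SIZE];
    size_t count;                /* entries currently queued            */
    size_t drains;               /* number of non-empty drain passes    */
    size_t freed;                /* total pointers freed so far         */
};

/*
 * Remote side: queue the pointer and return.  Nothing is freed here
 * and the owner is not interrupted ("passive" queueing).
 * Returns 0 on success, -1 if the queue is full.
 */
int freeq_post(struct free_queue *q, void *p)
{
    assert(q != NULL);
    if (q->count == FREEQ_SIZE)
        return -1;
    q->ptrs[q->count++] = p;
    return 0;
}

/*
 * Owner side: free everything that has accumulated in one tight loop.
 * Returns the number of pointers freed in this pass.
 */
size_t freeq_drain(struct free_queue *q)
{
    assert(q != NULL);
    size_t n = q->count;

    for (size_t i = 0; i < n; i++)
        free(q->ptrs[i]);
    q->count = 0;
    q->freed += n;
    if (n > 0)
        q->drains++;
    return n;
}
```

    Posting five pointers and then draining once frees all five in a
    single pass, so the per-operation overhead of switching to the free
    path is paid once per batch rather than once per free().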

					-Matt
					Matthew Dillon 
					<dillon at xxxxxxxxxxxxx>