git: kernel - Refactor bcmp, bcopy, bzero, memset

Matthew Dillon dillon at crater.dragonflybsd.org
Tue May 8 10:01:43 PDT 2018


commit 5d48b3120a651eee088cced1b4cffd8a264722c6
Author: Matthew Dillon <dillon at apollo.backplane.com>
Date:   Sat May 5 21:52:37 2018 -0700

    kernel - Refactor bcmp, bcopy, bzero, memset
    
    * For now, continue to use stosq/stosb, movsq/movsb, and cmpsq/cmpsb
      sequences, which are well optimized on both AMD and Intel.  Do not
      use the '*b' form of the string op alone; while that form is
      optimized on Intel, it is not optimized on AMD.  A sketch of the
      pattern follows.
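
      A minimal sketch of the kind of sequence meant here, using bzero()
      as the example (AT&T syntax; assumes the SysV ABI with the buffer
      in %rdi and the length in %rsi -- illustrative only, not the
      committed code):

              movq    %rsi,%rcx
              xorl    %eax,%eax
              shrq    $3,%rcx         /* number of 8-byte words */
              rep stosq               /* bulk-zero 8 bytes at a time */
              movq    %rsi,%rcx
              andq    $7,%rcx         /* 0-7 trailing bytes */
              rep stosb               /* issued even when %rcx == 0 */
              ret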
    
    * Note that two string ops in a row result in a serious pessimization.
      To avoid this, for now, conditionalize the trailing movsb, stosb, or
      cmpsb op so it executes only when the remaining count is non-zero.
      That is, assume nominal 8-byte alignment, as in the sketch below.
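
      Continuing the sketch above, the trailing byte op can be branched
      around when the residual count is zero, so only one string op runs
      in the common 8-byte-aligned case (again a hedged sketch, not the
      committed code):

              movq    %rsi,%rcx
              xorl    %eax,%eax
              shrq    $3,%rcx
              rep stosq
              movq    %rsi,%rcx
              andq    $7,%rcx
              jz      1f              /* avoid back-to-back string ops */
              rep stosb
      1:
              ret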
    
    * Refactor pagezero() to use a movq/addq/jne sequence, sketched below.
      This is significantly faster than movsq on AMD and only very
      slightly slower than movsq on Intel.
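
      A hedged sketch of such a loop for one 4096-byte page (assumes the
      page address arrives in %rdi per the SysV ABI; not the committed
      code):

              addq    $4096,%rdi      /* point one past the end of the page */
              movq    $-4096,%rdx     /* negative offset counts up toward zero */
              xorl    %eax,%eax
      1:
              movq    %rax,(%rdi,%rdx)
              movq    %rax,8(%rdi,%rdx)
              movq    %rax,16(%rdi,%rdx)
              movq    %rax,24(%rdi,%rdx)
              addq    $32,%rdx        /* addq sets ZF when the offset reaches 0 */
              jne     1b
              ret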
    
    * Use the adjusted kernel code above for these functions in libc as
      well, with minor modifications.  Since the code is copied wholesale,
      replace the copyright in the related libc files.
    
    * Refactor libc's memset() to replicate the fill byte across all
      64 bits of a register and then use code similar to bzero(), as
      sketched below.
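
      The byte-replication trick can be sketched as follows (SysV ABI:
      dst in %rdi, fill byte in %sil, length in %rdx; a hypothetical
      sketch, not the committed code):

              movq    %rdi,%r9        /* memset() must return dst */
              movzbq  %sil,%rax
              movabsq $0x0101010101010101,%r10
              imulq   %r10,%rax       /* fill byte now in all 8 byte lanes */
              movq    %rdx,%rcx
              shrq    $3,%rcx
              rep stosq
              movq    %rdx,%rcx
              andq    $7,%rcx
              jz      1f              /* same trailing-byte conditional as above */
              rep stosb
      1:
              movq    %r9,%rax
              ret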
    
    Reported-by: mjg_ (info on pessimizations)

Summary of changes:
 lib/libc/x86_64/string/bcmp.S      |  53 +++++++++++++-----
 lib/libc/x86_64/string/bcopy.S     |  87 +++++++++++++++---------------
 lib/libc/x86_64/string/bzero.S     |  69 +++++++++++++-----------
 lib/libc/x86_64/string/memset.S    | 107 +++++++++++++++++++++----------------
 sys/platform/pc64/x86_64/support.s |  40 +++++++++++---
 5 files changed, 212 insertions(+), 144 deletions(-)

http://gitweb.dragonflybsd.org/dragonfly.git/commitdiff/5d48b3120a651eee088cced1b4cffd8a264722c6


-- 
DragonFly BSD source repository

