git: kernel - Refactor bcmp, bcopy, bzero, memset
Matthew Dillon
dillon at crater.dragonflybsd.org
Tue May 8 10:01:43 PDT 2018
commit 5d48b3120a651eee088cced1b4cffd8a264722c6
Author: Matthew Dillon <dillon at apollo.backplane.com>
Date: Sat May 5 21:52:37 2018 -0700
kernel - Refactor bcmp, bcopy, bzero, memset
* For now, continue to use the stosq/stosb, movsq/movsb, and
  cmpsq/cmpsb sequences, which are well optimized on both AMD and
  Intel.  Do not just use the '*b' string op by itself; while that form
  is optimized on Intel, it is not optimized on AMD.  (A sketch of the
  resulting pattern follows the list.)
* Note that two string ops in a row result in a serious pessimization.
  To fix this, for now, conditionalize the trailing movsb, stosb, or
  cmpsb op so it is only executed when the remaining byte count is
  non-zero.  That is, assume nominal 8-byte alignment: in the common
  case the count is a multiple of 8 and the trailing byte op is skipped
  entirely, so two string ops never run back to back.
* Refactor pagezero() to use a movq/addq/jne sequence.  This is
  significantly faster than movsq on AMD and only very slightly slower
  than movsq on Intel.  (See the second sketch below.)
* Also use the adjusted kernel code above for these functions in libc,
  with minor modifications.  Since the code is being copied wholesale,
  replace the copyright on the related libc files.
* Refactor libc's memset() to replicate the fill byte across all 64
  bits of a register and then use code similar to bzero().  (See the
  third sketch below.)
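
As a rough illustration of the first two points, a minimal bzero-style
routine in AT&T syntax (a simplified sketch with a made-up label, not
the committed code, which handles more cases):

	/* sketch_bzero(void *dst, size_t len): %rdi = dst, %rsi = len */
	.globl	sketch_bzero
sketch_bzero:
	xorl	%eax,%eax	/* fill value: 0 */
	movq	%rsi,%rcx
	shrq	$3,%rcx		/* qword count */
	rep
	stosq			/* zero 8 bytes at a time */
	movq	%rsi,%rcx
	andq	$7,%rcx		/* remaining byte count */
	jz	1f		/* skip the byte op when none remain */
	rep
	stosb
1:	ret

With 8-byte-aligned counts the jz branch is always taken, so only one
string op executes per call.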
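
A pagezero() loop along the lines described above (again a sketch, not
the committed code; a 4096-byte page is assumed):

	/* sketch_pagezero(void *page): %rdi = page */
	.globl	sketch_pagezero
sketch_pagezero:
	addq	$4096,%rdi	/* point one byte past the page */
	movq	$-4096,%rax	/* negative offset counts up to zero */
	xorl	%edx,%edx	/* source of zeros */
1:	movq	%rdx,(%rdi,%rax)
	addq	$8,%rax		/* sets ZF when the offset reaches 0 */
	jne	1b		/* movq/addq/jne, no string op */
	ret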
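
And the memset() byte-replication idea (sketch; multiplying by
0x0101010101010101 copies the low byte into all eight byte lanes):

	/* sketch_memset(void *dst, int c, size_t len):
	 * %rdi = dst, %esi = c, %rdx = len */
	.globl	sketch_memset
sketch_memset:
	movq	%rdi,%r9	/* memset() must return dst */
	movzbl	%sil,%eax
	movabsq	$0x0101010101010101,%r8
	imulq	%r8,%rax	/* replicate fill byte to all 64 bits */
	movq	%rdx,%rcx
	shrq	$3,%rcx
	rep
	stosq
	movq	%rdx,%rcx
	andq	$7,%rcx
	jz	1f		/* avoid back-to-back string ops */
	rep
	stosb
1:	movq	%r9,%rax
	ret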
Reported-by: mjg_ (info on pessimizations)
Summary of changes:
lib/libc/x86_64/string/bcmp.S | 53 +++++++++++++-----
lib/libc/x86_64/string/bcopy.S | 87 +++++++++++++++---------------
lib/libc/x86_64/string/bzero.S | 69 +++++++++++++-----------
lib/libc/x86_64/string/memset.S | 107 +++++++++++++++++++++----------------
sys/platform/pc64/x86_64/support.s | 40 +++++++++++---
5 files changed, 212 insertions(+), 144 deletions(-)
http://gitweb.dragonflybsd.org/dragonfly.git/commitdiff/5d48b3120a651eee088cced1b4cffd8a264722c6
--
DragonFly BSD source repository