[DragonFlyBSD - Bug #2824] New higher speed CRC code
bugtracker-admin at leaf.dragonflybsd.org
Tue Jun 9 06:04:55 PDT 2015
Issue #2824 has been updated by alexh.
It doesn't save any operation/instruction with an optimizing compiler.
Even though it should be obvious, here are the critical loops of both versions (compiled with gcc -O3), just to back it up with some real generated code. The only difference is a 1-byte saving in the encoding of the first xor. There are no real savings, so there is really no point in "optimizing" like that. The compiler does a better job :)
singletable_crc32c:
  10:  48 83 c6 01            add    $0x1,%rsi
  14:  89 c1                  mov    %eax,%ecx
  16:  c1 e8 08               shr    $0x8,%eax
  19:  32 4e ff               xor    -0x1(%rsi),%cl
  1c:  0f b6 c9               movzbl %cl,%ecx
  1f:  33 04 8d 00 00 00 00   xor    0x0(,%rcx,4),%eax
  26:  48 39 d6               cmp    %rdx,%rsi
  29:  75 e5                  jne    10 <singletable_crc32c+0x10>

singletable_crc32c_carey:
  40:  89 c1                  mov    %eax,%ecx
  42:  32 0e                  xor    (%rsi),%cl
  44:  48 83 c6 01            add    $0x1,%rsi
  48:  c1 e8 08               shr    $0x8,%eax
  4b:  0f b6 c9               movzbl %cl,%ecx
  4e:  33 04 8d 00 00 00 00   xor    0x0(,%rcx,4),%eax
  55:  48 39 d6               cmp    %rdx,%rsi
  58:  75 e6                  jne    40 <singletable_crc32c_carey+0x10>
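
For anyone who wants to repeat the comparison, here is a minimal sketch of both loop forms in one file. Several things in it are assumptions and not taken from icrc32.c: the table is built at run time by a hypothetical helper (build_crc32c_table) instead of being a precomputed static array, the two CRC functions are left non-static so that compiling with -c keeps them in the object file, and crc32c_variants_agree is an illustrative self-check that applies the conventional 0xFFFFFFFF pre- and post-conditioning in the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: built at run time here; the real code uses a static table. */
static uint32_t crc32Table[256];

/* Hypothetical helper: fill the table from the reflected CRC-32C polynomial. */
static void
build_crc32c_table(void)
{
	for (uint32_t n = 0; n < 256; n++) {
		uint32_t c = n;
		for (int k = 0; k < 8; k++)
			c = (c & 1) ? (c >> 1) ^ 0x82F63B78u : c >> 1;
		crc32Table[n] = c;
	}
}

/* Loop form from the committed code: count down, advance the pointer. */
uint32_t
singletable_crc32c(uint32_t crc, const void *buf, size_t size)
{
	const uint8_t *p = buf;

	while (size--)
		crc = crc32Table[(crc ^ *p++) & 0xff] ^ (crc >> 8);
	return crc;
}

/* Loop form proposed in the report: a single index variable. */
uint32_t
singletable_crc32c_carey(uint32_t crc, const void *buf, size_t size)
{
	const uint8_t *p = buf;
	size_t i;

	for (i = 0; i < size; ++i)
		crc = crc32Table[(crc ^ p[i]) & 0xff] ^ (crc >> 8);
	return crc;
}

/* Illustrative self-check: returns 1 when both forms give the same CRC. */
int
crc32c_variants_agree(void)
{
	static const char msg[] = "123456789";
	uint32_t a, b;

	build_crc32c_table();
	a = singletable_crc32c(0xFFFFFFFFu, msg, 9) ^ 0xFFFFFFFFu;
	b = singletable_crc32c_carey(0xFFFFFFFFu, msg, 9) ^ 0xFFFFFFFFu;
	assert(a == 0xE3069283u);	/* standard CRC-32C check value */
	return a == b;
}
```

Compiling this with "gcc -O3 -c" and disassembling the object file (for example with "objdump -d") should give loops comparable to the ones above, although the exact instructions and offsets will depend on the compiler version and target.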
----------------------------------------
Bug #2824: New higher speed CRC code
http://bugs.dragonflybsd.org/issues/2824#change-12667
* Author: robin.carey1
* Status: New
* Priority: Normal
* Assignee:
* Category:
* Target version:
----------------------------------------
Dear DragonFlyBSD bugs,
This isn't really a bug. I noticed there is the possibility of improving
the performance of the recently committed new CRC code ("fast iscsi crc
code").
In the following function:
sys/libkern/icrc32.c
<http://gitweb.dragonflybsd.org/dragonfly.git/blob/d557434b1f5510b6fed895379af444f0d034c07b:/sys/libkern/icrc32.c>
static uint32_t
singletable_crc32c(uint32_t crc, const void *buf, size_t size)
{
        const uint8_t *p = buf;

        while (size--)
                crc = crc32Table[(crc ^ *p++) & 0xff] ^ (crc >> 8);
        return crc;
}
The two separate operations of "size--" and "*p++" could be combined into
one operation. The way that I would do that would be something like:
...
        size_t i;

        for (i = 0; i < size; ++i) {
                crc = crc32Table[(crc ^ p[i]) & 0xff] ^ (crc >> 8);
        }
...
So you would be saving one operation per iteration, which should improve performance.
I haven't looked at the rest of the code, so perhaps there are other
performance improvements that could be had.
Hope this helps ...
--
Sincerely,
Robin Carey BSc