Comprehensive Threadripper tests - memory vs cpu freq at capped power

Matthew Dillon dillon at backplane.com
Sun Aug 19 23:05:48 PDT 2018


It will depend on the frequency of the interconnect as well.  I think the
idle power use comes out in my tests too... idle power consumption is
around 83W with the fabric running at 2800 or 3000 MHz, and 65W when
running at 2666 MHz or slower.  At idle the cpu frequency is at the same
relative low baseline value for all but the lowest PPT power settings, so
the difference can only be attributed to a combination of the infinity
fabric and the (also mostly idle) DDR4.  I don't know why it has that step
function between 2666 and 2800... I would have expected more of a linear
scaling.  But it *is* nice that it doesn't step up until we get above 2666.
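
As a rough illustration of what that idle step amounts to, the arithmetic can be sketched as below. The two wattages are the wall readings quoted above. The assumption that the machine idles around the clock, and all of the names, are only for illustration.

```python
# Idle wall power quoted above, in watts.
IDLE_FAST_W = 83.0   # fabric/memory at 2800 or 3000
IDLE_SLOW_W = 65.0   # fabric/memory at 2666 or slower

# Assumption for illustration only: the machine sits idle all year.
HOURS_PER_YEAR = 24 * 365


def idle_step_watts(fast_w=IDLE_FAST_W, slow_w=IDLE_SLOW_W):
    """Extra idle power drawn at the faster setting."""
    return fast_w - slow_w


def idle_step_kwh_per_year(fast_w=IDLE_FAST_W, slow_w=IDLE_SLOW_W):
    """Extra energy per year if the machine idled continuously."""
    return idle_step_watts(fast_w, slow_w) * HOURS_PER_YEAR / 1000.0


if __name__ == "__main__":
    print(f"idle step: {idle_step_watts():.0f} W")
    print(f"per year at idle: {idle_step_kwh_per_year():.2f} kWh")
```

A real machine is not idle all year, so the yearly figure is an upper bound on what the step costs at idle.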

Of course, the infinity fabric is doing a lot more than just shuffling
around memory requests.  It's also responsible for inter-CPU cache
coherency management.  I expect it has to be powered up regardless of how
idle the machine is.  The fabric has a nice simple name, but it is far from
simple in reality.  Also, it's really unclear in that article how Anandtech
is measuring the infinity fabric's power consumption.  There might be some
registers, but what they are actually measuring is not necessarily what
they say they are measuring.

Just loading cpu threads does not necessarily load the fabric.  It has to be
some sort of memory-intensive load, and some loads are going to be far worse
than others.  For example, loads which require a lot of cache management
transactions will load the fabric down worse than loads which only need to
access non-conflicting memory.  Anything that is computation-heavy will
have a much lower load on the fabric and much higher load on the CPU, while
anything that is memory-heavy will have a much higher load on the fabric
and much lower load on the CPU.

I will power up my dual socket Xeon and check its idle; fortunately I have
a second Kill A Watt meter.... let's see.  Ok, idle power consumption on my
2xXeon (total of 16 cores / 32 threads) is 98W at the wall plug.  This is
actually considerably higher than the 2990WX's 65W (2666MHz memory or
lower), and just a bit higher than the 2990WX's 83W w/2800 or 3000MHz
memory.  The Xeon has 12 sticks of 2133 memory I believe.  This makes sense
to some degree because the 2xXeon has 12 memory channels and the
threadripper only has 4.  Using the corepower module it breaks down as
follows:

cpu_node1.temp0     42.00 degC    OK    (node1 temp)
cpu_node0.power0    24.36 W             (node0 Package Power)
cpu_node0.power1    38.60 W             (node0 DRAM Power)
cpu_node0.power2     0.00 W             (node0 Cores Power)
cpu_node1.power0    18.90 W             (node1 Package Power)
cpu_node1.power1    16.78 W             (node1 DRAM Power)
cpu_node1.power2     0.00 W             (node1 Cores Power)
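
To make the breakdown easier to read, here is a minimal sketch that parses the readings above and totals them per kind. The parsing regex and the function names are assumptions made for this example, not part of any sensor tool. Whether the Package and DRAM domains overlap, and how they relate to the 98W measured at the wall, is not something the output itself tells us.

```python
import re

# Readings copied from the sensor output above (wrapped lines rejoined).
SENSOR_TEXT = """\
cpu_node1.temp0   42.00 degC  OK  (node1 temp)
cpu_node0.power0  24.36 W  (node0 Package Power)
cpu_node0.power1  38.60 W  (node0 DRAM Power)
cpu_node0.power2   0.00 W  (node0 Cores Power)
cpu_node1.power0  18.90 W  (node1 Package Power)
cpu_node1.power1  16.78 W  (node1 DRAM Power)
cpu_node1.power2   0.00 W  (node1 Cores Power)
"""

# Matches only lines whose value is in watts: "<name> <value> W (<description>)".
LINE_RE = re.compile(r"^(\S+)\s+([0-9.]+)\s+W\s+\((.+)\)$")


def parse_power(text):
    """Return {description: watts} for every power line; other lines are skipped."""
    readings = {}
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            readings[m.group(3)] = float(m.group(2))
    return readings


def total_by_kind(readings, kind):
    """Sum the readings whose description ends with kind, e.g. 'DRAM Power'."""
    return sum(w for desc, w in readings.items() if desc.endswith(kind))


if __name__ == "__main__":
    r = parse_power(SENSOR_TEXT)
    for kind in ("Package Power", "DRAM Power", "Cores Power"):
        print(f"{kind:14s} {total_by_kind(r, kind):6.2f} W")
```

With the numbers above, the DRAM readings total more than the Package readings across the two nodes.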

I'm currently running a long synth test (full bulk build of dports) on the
threadripper with it set to 150W PPT with memory set to 2666 (220W at the
wall from the table).  The synth test I ran with it at stock settings and
3000MHz memory took only 12 hours to run, which destroys the 22 hours it
takes on the Xeon and the 18 hours it takes on the quad socket opteron.

During the 12 hour run the 2990WX pulled 330W or so from the wall.  The
current run still in progress is pulling 230W at the wall with the synth
load (10W higher than the simple compile loop test in the table).  I expect
it will take longer than 12 hours to run, the question is... how much
longer :-).   I really like the idea of being able to run the 2990WX at
only 230W at the wall instead of 330W.  The 2xXeon at full load pulls
around 200W at the wall.
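
The "how much longer" question can be framed in terms of energy per bulk build. The sketch below uses only the wall readings and run times quoted in this message and assumes the draw stays constant for the whole run, which is a simplification. The function names are hypothetical.

```python
def energy_kwh(watts, hours):
    """Energy used by a run at a constant wall draw."""
    return watts * hours / 1000.0


def break_even_hours(ref_watts, ref_hours, capped_watts):
    """Run time at capped_watts that uses the same energy as the reference run."""
    return ref_watts * ref_hours / capped_watts


# Figures quoted above; constant draw over the run is an assumption.
STOCK_W, STOCK_H = 330.0, 12.0   # 2990WX, stock settings, 3000MHz memory
CAPPED_W = 230.0                 # 2990WX, 150W PPT, 2666 memory
XEON_W, XEON_H = 200.0, 22.0     # 2xXeon at full load

if __name__ == "__main__":
    print(f"2990WX stock: {energy_kwh(STOCK_W, STOCK_H):.2f} kWh per build")
    print(f"2xXeon:       {energy_kwh(XEON_W, XEON_H):.2f} kWh per build")
    print(f"capped run breaks even at "
          f"{break_even_hours(STOCK_W, STOCK_H, CAPPED_W):.1f} hours")
```

Under these assumptions the capped run uses less energy per build than the stock run as long as it finishes in under roughly 17.2 hours.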

This is also solidifying the memory speed I will buy for the 2990WX 'for
real' (when I stuff it with 128G instead of 64G stolen from other
machines)... will probably be 2666, maybe 2400, ECC.  But definitely not
2800 or 3000.

http://apollo.backplane.com/DFlyMisc/synth_times.txt    (see last entries,
results are not really scientific because dports and compilers used are a
moving target).

(note: current run results with power capped at 150W PPT - 230W at the
wall, are not in yet)

-Matt

On Sun, Aug 19, 2018 at 7:52 PM, Samuel Paik <sam at paiks.org> wrote:

>
> Apparently there are some special cpu registers you can read to get power
> used by some components, probably not highly accurate but likely indicative.
>
>
> Anandtech's review
> ( https://www.anandtech.com/show/13124/the-amd-threadripper-2990wx-and-2950x-review/4 )
> covered some of this, they found
> the 2950WX infinity fabric was using 34 W at low load rising to 43 W at
> higher load. At low load the interconnect was using more power than the cpu
> cores.
>