DragonFly power management stuffs

Sepherosa Ziehau sepherosa at gmail.com
Wed Jul 22 18:22:50 PDT 2015


First of all, I'd like to introduce the major power management
features currently available on DragonFly.

- ACPI P-state.  It has the proper CPU power domain support.
- ACPI C-state.  Unlike other BSDs, on relatively recent Intel CPUs
  (the oldest Intel CPU I tested is Sandy Bridge), we don't use an I/O
  port to enter ACPI C2/C3.  Instead, we hint to the BIOS that 'native
  C2/C3' is supported, and the BIOS sends us the mapping from ACPI
  C2/C3 to mwait C-states through GAS (Generic Address Structure).
  The GAS also tells us whether the bus master status must be checked
  for ACPI C3.  Given that most recent Intel CPUs (since Core 2) do
  not require bus master arbitration or a cache flush before entering
  ACPI C3, entering an ACPI C-state becomes a simple monitor/mwait
  instruction sequence.
- Intel Performance and Energy Bias Hint.  According to the Intel
  software developer manual, it's a "hint to guide the hardware
  heuristic of power management features to favor increasing dynamic
  performance or conserve energy consumption".
- Mwait C-state.  This requires the mwait extension, which is
  available on almost all recent CPUs.  If we need to check bus master
  status, flush the cache, or do bus master arbitration before
  entering ACPI C3, then mwait C-states deeper than C2/0 will not be
  used.
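
The "Cx/y" labels used for mwait C-states in this post can be related
to the hint value passed to the mwait instruction.  The sketch below
assumes the layout described in the Intel software developer manual
(bits 7:4 hold the target C-state minus one, bits 3:0 hold the
sub-state); the helper name mwait_hint is hypothetical and is not a
DragonFly interface.

```python
# Sketch only: encode a "Cx/y" label into an MWAIT hint value, assuming
# the Intel SDM layout (EAX[7:4] = C-state - 1, EAX[3:0] = sub-state).
import re


def mwait_hint(label: str) -> int:
    """Turn a label such as 'C3/0' into the assumed MWAIT EAX hint."""
    m = re.fullmatch(r"C(\d+)/(\d+)", label)
    if m is None:
        raise ValueError(f"not a Cx/y label: {label!r}")
    cstate, sub = int(m.group(1)), int(m.group(2))
    # 4 bits each; the value 0xF in bits 7:4 is reserved for C0.
    if not (1 <= cstate <= 15 and 0 <= sub <= 15):
        raise ValueError(f"out of range: {label!r}")
    return ((cstate - 1) << 4) | sub


for label in ("C1/0", "C1/1", "C2/0", "C3/0"):
    print(f"{label}: 0x{mwait_hint(label):02x}")
```

Under that assumption, "C1/1" differs from "C1/0" only in the
sub-state bits, which is why the two are listed separately later on.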

========================

ACPI P-state

On the Intel i7-3770 3.40GHz, the Intel i5-3230M 2.60GHz and many
other recent Intel CPUs (tested by dillon@), adjusting the ACPI
P-state does _not_ reduce power consumption at all.  On these CPUs,
adjusting the ACPI P-state only affects how dynamic frequency scaling
behaves, e.g. on the Intel i5-3230M (system is idle):
ACPI P-state 2601 (TurboBoost):
hw.sensors.cpu0.freq0: 2721627000 Hz (cpu0 freq)
hw.sensors.cpu1.freq0: 2751159000 Hz (cpu1 freq)
hw.sensors.cpu2.freq0: 2627060000 Hz (cpu2 freq)
hw.sensors.cpu3.freq0: 2191103000 Hz (cpu3 freq)
ACPI P-state 2600:
hw.sensors.cpu0.freq0: 2266678000 Hz (cpu0 freq)
hw.sensors.cpu1.freq0: 2401977000 Hz (cpu1 freq)
hw.sensors.cpu2.freq0: 2248562000 Hz (cpu2 freq)
hw.sensors.cpu3.freq0: 2406333000 Hz (cpu3 freq)
ACPI P-state 1200:
hw.sensors.cpu0.freq0: 1197281000 Hz (cpu0 freq)
hw.sensors.cpu1.freq0: 1197340000 Hz (cpu1 freq)
hw.sensors.cpu2.freq0: 1197284000 Hz (cpu2 freq)
hw.sensors.cpu3.freq0: 1197300000 Hz (cpu3 freq)
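
To compare P-states it helps to reduce each set of sensor readings to
one number.  The following is a minimal sketch that parses lines in
the format shown above and averages them; the function name
parse_freqs and the sample constant are illustrative, and the text is
assumed to come from something like the output of sysctl hw.sensors.

```python
# Sketch only: average the per-CPU frequency readings shown above.
import re
from statistics import mean

SENSOR_RE = re.compile(r"^hw\.sensors\.cpu(\d+)\.freq0:\s+(\d+)\s+Hz")


def parse_freqs(text: str) -> dict[int, int]:
    """Map CPU index -> frequency in Hz from sensor lines."""
    freqs = {}
    for line in text.splitlines():
        m = SENSOR_RE.match(line.strip())
        if m:
            freqs[int(m.group(1))] = int(m.group(2))
    return freqs


# Readings copied from the "ACPI P-state 1200" listing above.
SAMPLE_1200 = """\
hw.sensors.cpu0.freq0: 1197281000 Hz (cpu0 freq)
hw.sensors.cpu1.freq0: 1197340000 Hz (cpu1 freq)
hw.sensors.cpu2.freq0: 1197284000 Hz (cpu2 freq)
hw.sensors.cpu3.freq0: 1197300000 Hz (cpu3 freq)
"""

freqs = parse_freqs(SAMPLE_1200)
avg_mhz = mean(freqs.values()) / 1e6
print(f"{len(freqs)} CPUs, average {avg_mhz:.1f} MHz")
```

Running the same helper on the other two listings makes the spread
between the TurboBoost and fixed-frequency P-states easy to see.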

On the Intel E5-2620 v2 2.10GHz, adjusting the ACPI P-state does
reduce power consumption.  However, as far as I tested, power
consumption only changes between the TurboBoost ACPI P-state and the
non-TurboBoost ACPI P-states, e.g. on the Intel E5-2620 v2 (2-way):
ACPI P-state 2101 (TurboBoost): 94.6W
ACPI P-state 2100: 92.5W
ACPI P-state 1200: 92.5W

========================

Intel Performance and Energy Bias Hint

To be frank, I didn't notice any power consumption or thermal changes
from adjusting this hint on any of the Intel CPUs that I tested.

========================

ACPI C-state

Since ACPI C-states are mapped to mwait C-states and do not require
bus master operations on all of the Intel CPUs I tested, we move on
to mwait C-states.

========================

Mwait C-state

It seems to be the only power management feature that reduces power
consumption on all of the Intel CPUs dillon@ and I tested.  The idle
power consumption I measured on the CPUs I tested:

Intel i7-3770 (ACPI P-state 3401):
mwait C1/0: 38.3W
mwait C1/1: 38.3W
mwait C2/0: 37.6W
mwait C3/0: 36.8W

Intel i5-3230M (ACPI P-state 2601):
mwait C1/0: 14.7W
mwait C1/1: 14.7W
mwait C2/0: 13.3W
mwait C3/0: 12.9W
mwait C4/0: 12.9W
mwait C4/1: 12.9W

Intel E5-2620 v2 (2-way)
(ACPI P-state 2101 TurboBoost):
mwait C1/0: 94.6W
mwait C1/1: 94.6W
mwait C2/0: 93.7W
mwait C3/0: 92.3W
(ACPI P-state 2100~1200):
mwait C1/0: 92.5W
mwait C1/1: 92.5W
mwait C2/0: 85.3W
mwait C3/0: 83.8W

One thing in common is that there is no power consumption difference
between mwait C1/0 (C1) and C1/1 (C1E?), probably because C1E is
entered once all cores are in the C1 state, as mentioned in the Intel
E5-2600 v2 datasheet.
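
To put the idle numbers above in proportion, here is a small worked
calculation of how much the deepest listed mwait C-state saves
relative to mwait C1/0 on each machine.  The figures are copied from
the measurements above; the helper name saving is illustrative only.

```python
# Sketch only: absolute and relative idle power savings, C1/0 versus
# the deepest mwait C-state listed for each CPU above.
IDLE_POWER_W = {
    "i7-3770 (P-state 3401)": (38.3, 36.8),
    "i5-3230M (P-state 2601)": (14.7, 12.9),
    "E5-2620 v2 (P-state 2101)": (94.6, 92.3),
    "E5-2620 v2 (P-state 2100~1200)": (92.5, 83.8),
}


def saving(shallow_w: float, deep_w: float) -> tuple[float, float]:
    """Return (watts saved, percent saved) going from shallow to deep."""
    saved = shallow_w - deep_w
    return saved, 100.0 * saved / shallow_w


for cpu, (c1_w, deep_w) in IDLE_POWER_W.items():
    saved, pct = saving(c1_w, deep_w)
    print(f"{cpu}: {saved:.1f}W saved ({pct:.1f}%)")
```

The relative saving is largest on the laptop part and on the
non-TurboBoost E5-2620 v2 configuration, which matches the tables.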

Though deep mwait C-states reduce power consumption, you pay for it
with additional wakeup latency.  The latency could be as high as 40us
if the CPU enters a deep package C-state.  The average latencies I
measured on the CPUs I tested (using debug.ipiq.latency_test):

Intel i7-3770 (ACPI P-state 3401):
mwait C1/0 and C1/1: 760ns
mwait C2/0: 18us
mwait C3/0: 25us

Intel i5-3230M (ACPI P-state 2601):
mwait C1/0 and C1/1: 950ns
mwait C2/0: 21us
mwait C3/0, C4/0, C4/1: 26us

Intel E5-2620 v2 (2-way)
(ACPI P-state 2101 TurboBoost)
mwait C1/0 and C1/1: same package 2200ns, different package 2600ns
mwait C2/0: same package 15us, different package 24us
mwait C3/0: same package 33us, different package 37us
(ACPI P-state 2100)
mwait C1/0 and C1/1: same package 2200ns, different package 2600ns
mwait C2/0: same package 15us, different package 22us
mwait C3/0: same package 26us, different package 36us

NOTE: On the Intel E5-2620 v2 (2-way) in TurboBoost mode, the
same-package mwait C2 and mwait C3 latency tests show latencies of up
to 60us (well, I don't know why).

If your application is latency sensitive, you need to be careful with
the deep mwait C-states.
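
One way to reason about this trade-off is to pick the deepest mwait
C-state whose measured average latency still fits a latency budget.
The sketch below uses the i7-3770 averages from above; the function
deepest_within is a hypothetical helper, not something DragonFly
provides, and since these are averages the worst case can be higher.

```python
# Sketch only: choose the deepest mwait C-state within a latency budget.
# Average latencies in nanoseconds for the i7-3770, ordered shallow to
# deep, copied from the measurements above.
I7_3770_LATENCY_NS = [
    ("C1/0", 760),
    ("C1/1", 760),
    ("C2/0", 18_000),
    ("C3/0", 25_000),
]


def deepest_within(states, budget_ns):
    """Return the deepest state with latency <= budget, or None."""
    chosen = None
    for name, latency_ns in states:
        if latency_ns <= budget_ns:
            chosen = name
    return chosen


for budget_us in (1, 20, 30):
    state = deepest_within(I7_3770_LATENCY_NS, budget_us * 1000)
    print(f"budget {budget_us}us -> {state}")
```

A budget below the C1 latency returns None, meaning that no measured
state fits and the budget itself needs to be reconsidered.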

--------

There are cases where you save more power _and_ your system runs
faster!  The situation I found is that deep mwait C-states on the
idle CPUs can allow the loaded CPUs to boost to a higher frequency.
Here is what I saw on the Intel E5-2620 v2 (2-way) (ACPI P-state 2101
TurboBoost):

make -j 48 -DNO_MODULES buildkernel KERNCONF=LINT64

Force mwait C1/0 on all CPUs.  Total time: 182s
Power consumption during make depend: 110W
CPU frequency during make depend: 2.37GHz
Power consumption during full run: 161W
CPU frequency during full run: 2.4GHz

Force mwait C3/0 on all CPUs.  Total time: 180s (2 seconds shorter!)
Power consumption during make depend: 106W (4W lower!)
CPU frequency during make depend: 2.57GHz (200MHz higher!)
Power consumption during full run: 161W (same)
CPU frequency during full run: 2.4GHz
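
As a rough worked estimate of what these numbers mean in energy
terms, the sketch below multiplies power by build time.  It assumes,
purely for illustration, that the whole build draws the "full run"
power figure; the post does not give per-phase durations, so the
result is an approximation and not a measurement.

```python
# Sketch only: estimate build energy as power x time, ASSUMING the
# "full run" power figure applies to the entire build.
RUNS = {
    "C1/0": {"seconds": 182, "full_run_w": 161, "depend_ghz": 2.37},
    "C3/0": {"seconds": 180, "full_run_w": 161, "depend_ghz": 2.57},
}


def energy_joules(power_w: float, seconds: float) -> float:
    """Energy in joules for a constant power draw over a duration."""
    return power_w * seconds


for name, run in RUNS.items():
    joules = energy_joules(run["full_run_w"], run["seconds"])
    print(f"mwait {name}: about {joules / 1000:.1f} kJ")

# Difference between the two runs under the same assumption.
delta_j = (energy_joules(161, 182) - energy_joules(161, 180))
# Relative frequency gain during make depend.
freq_gain_pct = 100.0 * (2.57 - 2.37) / 2.37
print(f"saved about {delta_j:.0f} J, "
      f"make depend frequency up {freq_gain_pct:.1f}%")
```

Since the make depend phase actually draws less power with mwait
C3/0, the real difference is presumably somewhat larger than this
constant-power estimate suggests.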

Best Regards,
sephe

-- 
Tomorrow Will Never Die

