DragonFly power management stuffs
Sepherosa Ziehau
sepherosa at gmail.com
Wed Jul 22 18:22:50 PDT 2015
First of all, I'd like to introduce the major power management
features currently available on DragonFly.
- ACPI P-state. It has the proper CPU power domain support.
- ACPI C-state. Unlike other BSDs, on relatively recent Intel CPUs
(the oldest Intel CPU I tested is Sandy Bridge), we don't use an I/O
port to enter ACPI C2/C3. Instead, we hint to the BIOS that 'native
C2/C3' is supported. The BIOS then gives us the mapping from ACPI
C2/C3 to mwait C-states through GAS (Generic Address Structure). The
GAS also tells us whether bus master status must be checked for ACPI
C3. Given that most recent Intel CPUs (since Core 2) do not require
bus master arbitration or a cache flush before entering ACPI C3,
entering an ACPI C-state becomes a simple monitor/mwait sequence.
- Intel Performance and Energy Bias Hint. According to the Intel
Software Developer's Manual, it's a "hint to guide the hardware
heuristic of power management features to favor increasing dynamic
performance or conserve energy consumption".
- Mwait C-state. This requires the mwait extension, which is available
on almost all recent CPUs. If we need to check bus master status,
flush the cache, or do bus master arbitration before entering ACPI
C3, then mwait C-states deeper than C2/0 will not be used.
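
For readers unfamiliar with the "Cx/y" notation used in this mail,
here is a minimal sketch of how such a name could map onto the hint
value passed to mwait in EAX. It assumes that "Cx/y" means C-state x
with sub-state y, and relies on the encoding in the Intel manual,
where EAX[7:4] holds the target C-state minus one and EAX[3:0] holds
the sub-state. The function names are hypothetical and this is not
taken from the DragonFly source.

```python
def mwait_hint(cstate, sub):
    """Encode "C<cstate>/<sub>" as an mwait EAX hint (illustrative only).

    Assumption: "Cx/y" in the mail means C-state x, sub-state y.
    Per the Intel manual, EAX[7:4] = C-state - 1 and EAX[3:0] = sub-state.
    """
    if not 1 <= cstate <= 15:
        raise ValueError("C-state must be in 1..15")
    if not 0 <= sub <= 15:
        raise ValueError("sub-state must be in 0..15")
    return ((cstate - 1) << 4) | sub


def parse_hint(hint):
    """Decode an mwait EAX hint back into (cstate, sub)."""
    return ((hint >> 4) & 0xF) + 1, hint & 0xF


if __name__ == "__main__":
    for cstate, sub in [(1, 0), (1, 1), (2, 0), (3, 0), (4, 1)]:
        print(f"C{cstate}/{sub} -> {mwait_hint(cstate, sub):#04x}")
```

Under these assumptions, C1/0 and C1/1 differ only in the low nibble
of the hint, while C2/0 and C3/0 differ in the high nibble.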
========================
ACPI P-state
On Intel i7-3770 3.40GHz and Intel i5-3230M 2.60GHz and many other
recent Intel CPUs (tested by dillon@), adjusting ACPI P-state does
_not_ reduce power consumption at all. On these CPUs, adjusting ACPI
P-state only affects how dynamic frequency works, e.g. on Intel
i5-3230M (system is idle):
ACPI P-state 2601 (TurboBoost):
hw.sensors.cpu0.freq0: 2721627000 Hz (cpu0 freq)
hw.sensors.cpu1.freq0: 2751159000 Hz (cpu1 freq)
hw.sensors.cpu2.freq0: 2627060000 Hz (cpu2 freq)
hw.sensors.cpu3.freq0: 2191103000 Hz (cpu3 freq)
ACPI P-state 2600:
hw.sensors.cpu0.freq0: 2266678000 Hz (cpu0 freq)
hw.sensors.cpu1.freq0: 2401977000 Hz (cpu1 freq)
hw.sensors.cpu2.freq0: 2248562000 Hz (cpu2 freq)
hw.sensors.cpu3.freq0: 2406333000 Hz (cpu3 freq)
ACPI P-state 1200:
hw.sensors.cpu0.freq0: 1197281000 Hz (cpu0 freq)
hw.sensors.cpu1.freq0: 1197340000 Hz (cpu1 freq)
hw.sensors.cpu2.freq0: 1197284000 Hz (cpu2 freq)
hw.sensors.cpu3.freq0: 1197300000 Hz (cpu3 freq)
On Intel E5-2620 v2 2.10GHz, adjusting ACPI P-state reduces power
consumption. However, as far as I tested, the power consumption
change is only between TurboBoost ACPI P-state and non-TurboBoost
ACPI P-state, e.g. on Intel E5-2620 v2 (2-way):
ACPI P-state 2101 (TurboBoost): 94.6w
ACPI P-state 2100: 92.5w
ACPI P-state 1200: 92.5w
========================
Intel Performance and Energy Bias Hint
To be frank, I didn't notice any power consumption or thermal changes
from adjusting this hint on any of the Intel CPUs that I tested.
========================
ACPI C-state
Since ACPI C-states are mapped to mwait C-states and do not require
bus master operations on all of the Intel CPUs I tested, we move on
to mwait C-states.
========================
Mwait C-state
It seems to be the only power management feature that reduces power
consumption on all Intel CPUs dillon@ and I tested. The power
consumption on the CPUs I tested:
Intel i7-3770 (ACPI P-state 3401):
mwait C1/0: 38.3w
mwait C1/1: 38.3w
mwait C2/0: 37.6w
mwait C3/0: 36.8w
Intel i5-3230M (ACPI P-state 2601):
mwait C1/0: 14.7w
mwait C1/1: 14.7w
mwait C2/0: 13.3w
mwait C3/0: 12.9w
mwait C4/0: 12.9w
mwait C4/1: 12.9w
Intel E5-2620 v2 (2-way)
(ACPI P-state 2101 TurboBoost):
mwait C1/0: 94.6w
mwait C1/1: 94.6w
mwait C2/0: 93.7w
mwait C3/0: 92.3w
(ACPI P-state 2100~1200):
mwait C1/0: 92.5w
mwait C1/1: 92.5w
mwait C2/0: 85.3w
mwait C3/0: 83.8w
One thing in common is that there is no power consumption difference
between mwait C1/0 (C1) and C1/1 (C1E?), probably because C1E will be
entered once all cores are in C1 state, as mentioned in the Intel
E5-2600 v2 datasheet.
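
To put the idle numbers above in perspective, the following sketch
computes the savings of each mwait C-state relative to C1/0. The
wattages are copied from the measurements in this mail; the dictionary
layout and function name are only for illustration.

```python
# Idle power in watts, copied from the measurements above.
IDLE_WATTS = {
    "i7-3770 (P-state 3401)": {
        "C1/0": 38.3, "C1/1": 38.3, "C2/0": 37.6, "C3/0": 36.8,
    },
    "i5-3230M (P-state 2601)": {
        "C1/0": 14.7, "C1/1": 14.7, "C2/0": 13.3,
        "C3/0": 12.9, "C4/0": 12.9, "C4/1": 12.9,
    },
    "E5-2620 v2 (P-state 2101)": {
        "C1/0": 94.6, "C1/1": 94.6, "C2/0": 93.7, "C3/0": 92.3,
    },
    "E5-2620 v2 (P-state 2100~1200)": {
        "C1/0": 92.5, "C1/1": 92.5, "C2/0": 85.3, "C3/0": 83.8,
    },
}


def savings_vs_c1(watts):
    """Return {state: (watts saved, percent saved)} relative to C1/0."""
    base = watts["C1/0"]
    return {
        state: (round(base - w, 1), round(100.0 * (base - w) / base, 1))
        for state, w in watts.items()
    }


if __name__ == "__main__":
    for cpu, watts in IDLE_WATTS.items():
        print(cpu)
        for state, (saved_w, saved_pct) in savings_vs_c1(watts).items():
            print(f"  mwait {state}: -{saved_w}w ({saved_pct}%)")
```

The relative savings differ a lot between machines, so the absolute
wattage alone does not tell the whole story.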
Though deep mwait C-states reduce power consumption, you will have to
pay for the additional latency. The latency could be as high as 40us
if the CPU enters a deep package C-state. The average latency I
gathered on the various types of CPUs I tested (by using
debug.ipiq.latency_test):
Intel i7-3770 (ACPI P-state 3401):
mwait C1/0 and C1/1: 760ns
mwait C2/0: 18us
mwait C3/0: 25us
Intel i5-3230M (ACPI P-state 2601):
mwait C1/0 and C1/1: 950ns
mwait C2/0: 21us
mwait C3/0, C4/0, C4/1: 26us
Intel E5-2620 v2 (2-way)
(ACPI P-state 2101 TurboBoost):
mwait C1/0 and C1/1: same package 2200ns, different package 2600ns
mwait C2/0: same package 15us, different package 24us
mwait C3/0: same package 33us, different package 37us
(ACPI P-state 2100):
mwait C1/0 and C1/1: same package 2200ns, different package 2600ns
mwait C2/0: same package 15us, different package 22us
mwait C3/0: same package 26us, different package 36us
NOTE: For Intel E5-2620 v2 (2-way) in TurboBoost mode, I saw up to
60us latency in the same-package mwait C2 and mwait C3 latency tests
(well, I don't know why).
If your application is latency sensitive, you need to be careful with
the deep mwait C-states.
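
One way to reason about this trade-off is to pick the deepest mwait
C-state whose average latency still fits a given budget. The sketch
below does that with the single-package averages measured above. The
table layout, function name and budget values are hypothetical, and
since these are averages, the worst case can be noticeably higher.

```python
# Average latency in nanoseconds, ordered from shallow to deep,
# copied from the single-package measurements above.
LATENCY_NS = {
    "i7-3770": [("C1/0", 760), ("C2/0", 18_000), ("C3/0", 25_000)],
    "i5-3230M": [("C1/0", 950), ("C2/0", 21_000), ("C3/0", 26_000)],
}


def deepest_state_within(states, budget_ns):
    """Return the deepest state whose average latency fits the budget.

    `states` must be ordered from shallow to deep. Returns None when
    even the shallowest state exceeds the budget.
    """
    best = None
    for name, latency_ns in states:
        if latency_ns <= budget_ns:
            best = name
    return best


if __name__ == "__main__":
    for cpu, states in LATENCY_NS.items():
        for budget_ns in (1_000, 20_000, 30_000):
            choice = deepest_state_within(states, budget_ns)
            print(f"{cpu}, budget {budget_ns}ns: {choice}")
```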
--------
There are cases where you save more power and your system runs
faster! What I found is that deep mwait C-states can allow a loaded
CPU to boost to a higher frequency. Here is what I saw on Intel
E5-2620 v2 (2-way) (ACPI P-state 2101 TurboBoost):
make -j 48 -DNO_MODULES buildkernel KERNCONF=LINT64
Force mwait C1/0 on all CPUs. Total time: 182s
Power consumption during make depend: 110w
CPU frequency during make depend: 2.37GHz
Power consumption during full run: 161w
CPU frequency during full run: 2.4GHz
Force mwait C3/0 on all CPUs. Total time: 180s (2 seconds shorter!)
Power consumption during make depend: 106w (4w lower!)
CPU frequency during make depend: 2.57GHz (200MHz higher!)
Power consumption during full run: 161w (same)
CPU frequency during full run: 2.4GHz
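
As a quick check of the differences quoted above, this sketch computes
the relative change between the forced C1/0 run and the forced C3/0
run. The numbers are copied from this mail; the helper name is only
for illustration.

```python
def pct_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# (forced mwait C1/0, forced mwait C3/0), from the buildkernel runs above.
RUNS = {
    "total time (s)": (182, 180),
    "make depend power (w)": (110, 106),
    "make depend frequency (GHz)": (2.37, 2.57),
}

if __name__ == "__main__":
    for name, (c1, c3) in RUNS.items():
        print(f"{name}: {pct_change(c1, c3):+.1f}%")
```

That works out to roughly 1% less build time, 3.6% less power and 8.4%
higher frequency during make depend.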
Best Regards,
sephe
--
Tomorrow Will Never Die