i915: occasional hangs on Baytrail with deeper C-states

Francois Tigeot ftigeot at wolfpond.org
Thu May 12 23:56:53 PDT 2016


Daniel Bilik wrote:
> On Thu, 21 Apr 2016 10:39:32 +0200
> Daniel Bilik <ddb at neosystem.org> wrote:
> My conclusion is that there are two separate problems that can cause hangs
> on Baytrail when the CPU enters a deeper C-state...
> The first one was introduced in i915_irq.c with Linux commit 31685c2+
> (drm/i915/vlv: WA for Turbo and RC6 to work together). Among other
> changes, it enabled tracking of rps events in a Valleyview-specific way.
> This one alone causes hangs less frequently, once every week or two for me.
> The fix for this is attached as dfly-i915-vlv-avoid-vlv_wa_c0_ei.diff (a
> diff against recent Dragonfly source).
> The second one was introduced in intel_pm.c with Linux commit 8fb5519+
> (drm/i915: Agressive downclocking on Baytrail). This causes hangs more
> often, depending on usage pattern. For me it's approximately once a day,
> but freezes after a couple of hours are not uncommon. The fix for this is
> attached as dfly-i915-vlv-avoid-aggressive-downclocking.diff.
> AFAICT fixing just one _or_ the other is not sufficient; both have to be
> fixed to prevent hangs completely. I've just confirmed this myself...
> Yesterday I updated my 4.5-DEVELOPMENT system to d920e19+, keeping my
> manual "vlv_wa_c0_ei" fix. I believed that the "aggressive downclocking" fix
> was already in (it was committed as 5d8e0f4+), but it was (probably
> unintentionally) dropped with commit a05eeeb+ (drm/i915: Update to Linux
> 4.3). Thus I've built the kernel with the "vlv_wa_c0_ei" fix but without
> the "aggressive downclocking" fix. And surprise, after three weeks of
> problem-free operation, I've got another nice machine hang today. :)
> I believe this is the reason why tens of Linux users discussing the
> problem on bugzilla.kernel.org have been unsuccessful in finding a reliable
> solution so far. The effort to identify the (one) problem causing hangs
> leads to the approach "apply the patch - test - freeze - revert it and try
> another one". Both patches were tested by several people on several Linux
> kernel versions. And several people reported success for both of them,
> only to take it back after a couple of days, when they got another machine
> hang. Just like me three weeks ago. ;-)
> Francois, would you please review the attached patches and decide whether
> they can be pushed to Dragonfly and kept there (i.e. not get lost with the
> next drm/i915 update :))? Thank you.
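
As a rough illustration of the argument quoted above (not part of either
patch): if the two problems are assumed to cause hangs independently, each at
a roughly constant rate, their rates add up, so fixing only one still leaves
the machine hanging at the other's rate. The intervals below are loose
readings of the figures in the report ("once a week or two", "approximately
once a day"), and the function name is hypothetical.

```python
# Sketch only: assumes the two hang causes are independent and each occurs
# at a constant average rate. The numbers are approximations of the figures
# quoted in the report, not measurements.

def mean_days_between_hangs(intervals_days):
    """Mean interval between hangs when several independent causes remain.

    intervals_days: mean interval (in days) of each cause still unfixed.
    Rates (1/interval) of independent causes add up.
    """
    total_rate = sum(1.0 / d for d in intervals_days)
    return float("inf") if total_rate == 0 else 1.0 / total_rate

VLV_WA_C0_EI_DAYS = 10.0  # "once a week or two": assumed ~10 days
DOWNCLOCKING_DAYS = 1.0   # "approximately once a day"

scenarios = {
    "no fix": [VLV_WA_C0_EI_DAYS, DOWNCLOCKING_DAYS],
    "only vlv_wa_c0_ei fix": [DOWNCLOCKING_DAYS],
    "only downclocking fix": [VLV_WA_C0_EI_DAYS],
    "both fixes": [],
}

for name, remaining in scenarios.items():
    print(f"{name}: {mean_days_between_hangs(remaining):.2f} days")
```

Under these assumptions a tester who applies only the downclocking fix sees
days of stability before the next hang, which matches the reports of success
that were later taken back.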

The problem is that both commits are present in Linux 4.3 and in the
DragonFly code in master:

- drm/i915: Agressive downclocking on Baytrail
  was already present in Linux 4.2 and is used unchanged

- drm/i915/vlv: WA for Turbo and RC6 to work together
  was rewritten in:
    drm/i915: Improved w/a for rps on Baytrail
  and is also present in DragonFly

Francois Tigeot

