warning about 'large-function-growth limit reached'

Thu Jun 15 16:16:19 PDT 2006

:Dmitri Nikulin wrote:
:
:> Inlining is still not as dangerous as loop unrolling. Imagine
:> unrolling a three-page loop with a thousand iterations. That's three
:> thousand pages of separate instructions, for what? I hope gcc handles
:
:Loop unrolling doesn't work this way. It unrolls a loop a small number of
:times, like 4 times if possible. The Intel compiler does that very
:systematically, and with very good results. 
:
:> *that* correctly and notices the loop unroll is completely worthless,
:> in fact having to load that much more code into cache probably means
:> it's a pessimisation. Or compromises and encapsulates the body...
:
:People doing computations try different optimization flags, different
:compilers and choose the best for *their* computations. Differences can be
:enormous, like twice as fast. I have done that myself, a lot.
:
:-- 
:Michel Talon

    Generally speaking you don't want to unroll a loop on a modern cpu
    because the branch prediction cache makes the 'looping' operation 
    essentially free.  A very tiny loop, one that only ever iterates a
    few times (like 4 or 5) might benefit (because the branch prediction
    cache will miss at least once in the loop), but anything larger than
    that wouldn't.  Also, any loop unrolling will add additional pollution
    to the L1 code cache and that is a much bigger deal.

    If you are interested in looking into actual numbers, I suggest taking
    one of the timing tests in /usr/src/test/sysperf (like loop*.c) and
    modifying it to time a loop with and without unrolling.
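
    As a rough starting point, here is a minimal standalone sketch of
    that kind of comparison. It is not taken from the sysperf tests; the
    names sum_rolled, sum_unrolled4 and time_sum are made up for
    illustration, and clock() is used only because it is in standard C.

```c
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical helper type: any function that sums an array. */
typedef uint64_t (*sum_fn)(const uint32_t *, size_t);

/* Plain loop: one add and one loop branch per element. */
uint64_t
sum_rolled(const uint32_t *v, size_t n)
{
	uint64_t sum = 0;
	size_t i;

	for (i = 0; i < n; ++i)
		sum += v[i];
	return (sum);
}

/*
 * The same loop unrolled by hand 4 times: one loop branch per four
 * elements, plus a tail loop for the 0-3 leftover elements.
 */
uint64_t
sum_unrolled4(const uint32_t *v, size_t n)
{
	uint64_t sum = 0;
	size_t i = 0;

	for (; i + 4 <= n; i += 4) {
		sum += v[i];
		sum += v[i + 1];
		sum += v[i + 2];
		sum += v[i + 3];
	}
	for (; i < n; ++i)
		sum += v[i];
	return (sum);
}

/*
 * Call fn 'reps' times over the array and return the CPU time used, in
 * seconds.  The accumulated sum is stored through 'result' so the
 * compiler cannot discard the calls as dead code.
 */
double
time_sum(sum_fn fn, const uint32_t *v, size_t n, int reps, uint64_t *result)
{
	uint64_t acc = 0;
	clock_t t0, t1;
	int r;

	t0 = clock();
	for (r = 0; r < reps; ++r)
		acc += fn(v, n);
	t1 = clock();
	*result = acc;
	return ((double)(t1 - t0) / CLOCKS_PER_SEC);
}
```

    To use it, fill an array, call time_sum() once with each function
    using the same array and repetition count, and compare the two
    times.  Keep in mind that the compiler may unroll or vectorize
    either version on its own at higher optimization levels, so check
    the generated assembly before drawing conclusions.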

					-Matt
					Matthew Dillon 
					<dillon at xxxxxxxxxxxxx>