## Unrolling Recursion, the Golden Ratio, and Partial Evaluation

#### by Tony Garnock-Jones / @leastfixedpoint

The little language shootout
I ran a few weeks ago showed that Java had a significant edge over the
other contestants. I speculated at the time that it might have been
because Java was unrolling the `fib` loop once, inlining one of
the recursive invocations, which would have removed one layer from the
call tree - essentially halving the runtime. (More accurately,
dividing the runtime by φ = 1.618...)

I just now ran the same program unrolled once by hand, and with the unrolled version the C code has almost identical (well within the margin of error!) runtime to the Java code. Here's the faster C code:

static unsigned long fib(int n) { if (n < 2) { return 1; } else { if (n < 3) { return 2; } else { return fib(n - 1) + fib(n - 2); } } }

Essentially, it's inlining that recurses once, allowing a self-recursive procedure to inline one layer of self-invocation. It's a neat trick - it's discussed in section 7.1.2.2 of Craig Chambers's thesis on the earlier optimizing Self compiler. Self only inlines once (i.e. nonrecursively) - clearly, Java inlines twice occasionally.

Further unrolling the recursion gives commensurate speedups, as you
would expect: each unrolling divides the runtime by φ, roughly, so
you can easily speed up the C code by about a factor of ten compared
to the completely non-unrolled version by unrolling until you return
`13` constantly after an explicit test that `n <
7`.

All this inlining/unrolling/constant-propagation seems really closely related to partial evaluation, too... Hidehiko Masuhara's PhD thesis (U. Tokyo, 1999) is an interesting read, applying partial evaluation to reflective interpreters to give vastly improved performance for behaviourally-reflective languages. Integrating a partial evaluation system with a Self-like dynamic optimising compiler seems like a worthwhile project - something that is likely to be necessary for the implementation of ThiNG, perhaps, given how reflective I imagine it will end up being.