jmtd → log → GHC rewrite rules
The Glasgow Haskell Compiler (GHC) has support for user-supplied rewrite rules, which applied during one of the compiler optimisation stages. An example rule is
{-# RULES
"streamFilter fuse" forall f g xs.
streamFilter f (streamFilter g xs) = streamFilter (f.g) xs
#-}
I spent some time today looking at these more closely.
In order for rewrite rules to be applied, optimisation needs to be enabled. This
conflicts with interactive use of GHC, so you can't explore these things in GHCi.
I think rewrite rules are enabled by default (with optimisation), but you can ask
for them explicitly. When investigating these it's also useful to ask ghc
to
always recompile, otherwise you have to change the source or manually remove .hi
or .o
files (etc.) between invocations.
A starting set of command-line options to use then, are
-O -fenable-rewrite-rules -fforce-recomp
GHC runs several compilation stages, and the source program is transformed into
several different languages or language dialects as it goes. Before the phase
where rewrite rules are applied, some other optimisations take place, and the
source gets desugared. You can see the results of the desugaring by passing the
argument -ddump-ds
. Here's an example program
main = print (unStream (streamFilter (>3) (streamFilter (<10)
(mkStream [0..20]))))
And here's what it looks like after the first pass optimisation and desugaring:
main
= print
($fShow[] $fShowInteger)
(unStream
(streamFilter
(let {
ds
ds = 3 } in
\ ds -> > $fOrdInteger ds ds)
(streamFilter
(let {
ds
ds = 10 } in
\ ds -> < $fOrdInteger ds ds)
(mkStream (enumFromTo $fEnumInteger 0 20)))))
(Note: I used -dsuppress-all
and -dsuppress-uniques
to improve the clarity
of the above output. See Suppressing unwanted
information for further details).
Those short-hand sections ((<3)
) in the input program are expanded to
something quite a lot longer. Out of curiosity I tried it again with plain
lambdas, not sections, and the result was smaller
main
= print
($fShow[] $fShowInteger)
(unStream
(streamFilter
(\ x -> > $fOrdInteger x 3)
(streamFilter
(\ x -> < $fOrdInteger x 10)
(mkStream (enumFromTo $fEnumInteger 0 20)))))
Rewrite rules happen after this. Once they're done (and several other passes),
the program is translated into an intermediate representation called Tiny
Core. This language
faintly resembles the input Haskell. GHC will output the Tiny Core program
if you supply the argument -ddump-simpl
. Here's (most) of the program in Tiny
Core (I've substituted some constants for clarity):
main = hPutStr' stdout main1 True
main1 = $fShowInteger_$cshowList (catMaybes1 (main_go 0)) []
main_go
= \ x ->
case gtInteger# x 20 of {
DEFAULT ->
case ltInteger# x 10 of {
DEFAULT -> main_go (plusInteger x 1);
1# ->
case gtInteger# x 3 of {
DEFAULT -> main_go (plusInteger x 1);
1# -> : (Just x) (main_go (plusInteger x 1))
}
};
1# -> []
}
After Tiny Core, GHC continues to translate the program into other forms (including STG, CMM, ASM) and you can ask GHC to dump those representations too, but this is all downstream from the rewrites so not relevant to them.
The rewrite rule at the top of this blog post is never applied: It doesn't get
a chance, because the function it operates on (streamFilter
) is inlined by
GHC. GHC can detect this in some circumstances (-Winline-rule-shadowing
). You
instruct GHC to report on which rules fired with -ddump-rule-firings
and can
see before-and-after snippets of Tiny Core for each rule applied with
-ddump-rule-rewrites
.
I played around with adding {-# NOINLINE functionName #-}
pragmas to disable
inlining various functions to try and provoke a situation where the above rule
could match, but it was a losing battle: GHC's built-in optimisations are just
too good. But, it's also moot: the outcome I want (the filters to be fused) is
happening, it's just the built-in rewrite rules are achieving it,
once striot's functions
have been inlined away.
Comments