The Glasgow Haskell Compiler (GHC) has support for user-supplied rewrite rules, which applied during one of the compiler optimisation stages. An example rule is

{-# RULES
      "streamFilter fuse" forall f g xs.
          streamFilter f (streamFilter g xs) = streamFilter (f.g) xs
  #-}

I spent some time today looking at these more closely.

In order for rewrite rules to be applied, optimisation needs to be enabled. This conflicts with interactive use of GHC, so you can't explore these things in GHCi. I think rewrite rules are enabled by default (with optimisation), but you can ask for them explicitly. When investigating these it's also useful to ask ghc to always recompile, otherwise you have to change the source or manually remove .hi or .o files (etc.) between invocations. A starting set of command-line options to use then, are

-O -fenable-rewrite-rules -fforce-recomp

GHC runs several compilation stages, and the source program is transformed into several different languages or language dialects as it goes. Before the phase where rewrite rules are applied, some other optimisations take place, and the source gets desugared. You can see the results of the desugaring by passing the argument -ddump-ds. Here's an example program

main = print (unStream (streamFilter (>3) (streamFilter (<10)
    (mkStream [0..20]))))

And here's what it looks like after the first pass optimisation and desugaring:

main
  = print
      ($fShow[] $fShowInteger)
      (unStream
         (streamFilter
            (let {
               ds
               ds = 3 } in
             \ ds -> > $fOrdInteger ds ds)
            (streamFilter
               (let {
                  ds
                  ds = 10 } in
                \ ds -> < $fOrdInteger ds ds)
               (mkStream (enumFromTo $fEnumInteger 0 20)))))

(Note: I used -dsuppress-all and -dsuppress-uniques to improve the clarity of the above output. See Suppressing unwanted information for further details).

Those short-hand sections ((<3)) in the input program are expanded to something quite a lot longer. Out of curiosity I tried it again with plain lambdas, not sections, and the result was smaller

main
  = print
      ($fShow[] $fShowInteger)
      (unStream
         (streamFilter
            (\ x -> > $fOrdInteger x 3)
            (streamFilter
               (\ x -> < $fOrdInteger x 10)
               (mkStream (enumFromTo $fEnumInteger 0 20)))))

Rewrite rules happen after this. Once they're done (and several other passes), the program is translated into an intermediate representation called Tiny Core. This language faintly resembles the input Haskell. GHC will output the Tiny Core program if you supply the argument -ddump-simpl. Here's (most) of the program in Tiny Core (I've substituted some constants for clarity):

main  = hPutStr' stdout main1 True
main1 = $fShowInteger_$cshowList (catMaybes1 (main_go 0)) []
main_go
  = \ x ->
      case gtInteger# x 20 of {
        DEFAULT ->
          case ltInteger# x 10 of {
            DEFAULT -> main_go (plusInteger x 1);
            1# ->
              case gtInteger# x 3 of {
                DEFAULT -> main_go (plusInteger x 1);
                1# -> : (Just x) (main_go (plusInteger x 1))
              }
          };
        1# -> []
      }

After Tiny Core, GHC continues to translate the program into other forms (including STG, CMM, ASM) and you can ask GHC to dump those representations too, but this is all downstream from the rewrites so not relevant to them.

The rewrite rule at the top of this blog post is never applied: It doesn't get a chance, because the function it operates on (streamFilter) is inlined by GHC. GHC can detect this in some circumstances (-Winline-rule-shadowing). You instruct GHC to report on which rules fired with -ddump-rule-firings and can see before-and-after snippets of Tiny Core for each rule applied with -ddump-rule-rewrites.

I played around with adding {-# NOINLINE functionName #-} pragmas to disable inlining various functions to try and provoke a situation where the above rule could match, but it was a losing battle: GHC's built-in optimisations are just too good. But, it's also moot: the outcome I want (the filters to be fused) is happening, it's just the built-in rewrite rules are achieving it, once striot's functions have been inlined away.