# jmtd → log → phd → rewrite rule representation

I've begun writing up my phd and, not for the first time, I'm pondering issues of how best to represent things. Specifically, rewrite rules.

Here's one way of representing an example rewrite rule:

```
streamFilter g . streamFilter f = streamFilter (g . f)
```

This is a fairly succinct representation. It's sort-of Haskell, but not
quite. It's an equation. The left-hand side is a pattern: it's intended
to describe not one expression but a family of expressions that match.
The lower case individual letters `g`

and `f`

are *free variables*:
labelled placeholders within the pattern that can be referred to on the
right hand side. I haven't stipulated what defines a free variable and
what is a more concrete part of the pattern. It's kind-of Haskell, and
I've used the well-known operator `.`

to represent the two stream operators
(`streamFilter`

s) being connected together. (In practice, when we get
to the system where rules are actually applied, the connecting operator
is not going to be `.`

at all, so this is also an approximation).

One thing I don't like about `.`

here, despite its commonness, is having
to read right-to-left. I adopted the habit of using the lesser-known `>>>`

in a lot of my work (which is defined as `(>>>) = flip (.)`

), which reads
left-to-right. And then I have the reverse problem: people aren't familiar
with `>>>`

, and, just like `.`

, it's a stand-in anyway.

Towards the beginning of my PhD, I spent some time inventing rewrite rules to operate on pairs of operators taken from a defined, known set. I began representing the rules much as in the example above. Later on, I wanted to encode them as real Haskell, in order to check them more thoroughly. The above rule, I first encoded like this

```
filterFilterPre = streamFilter g . streamFilter f
filterFilterPost = streamFilter (g . f)
prop_filterFilter s = filterFilterPre s == filterFilterPost s
```

This is real code: the operators were already implemented in
StrIoT, and the final expression defined a
property for QuickCheck.
However, it's still not quite a rewrite rule. The left-hand side, which should
be a pattern, is really a concrete expression. The names `f`

and `g`

are
masquerading as free variables but are really concretely defined in a preamble
I wrote to support running QuickCheck against these things: usually simple
stuff like `g = odd`

, etc.

Eventually, I had to figure out how to really implement rewrite rules in StrIoT. There were a few approaches I could take. One would be to express the rules in some abstract form like the first example (but with properly defined semantics) and write a parser for them: I really wanted to avoid doing that.

As a happy accident, the solution I landed on was enabled by the semantics of algebraic-graphs, a Graph library we adopted to support representing a stream-processing topology. I wrote more about that in data-types for representing stream-processing programs.

I was able to implement rewrite rules as ordinary Haskell functions. The
left-hand side of the rewrite rule maps to the left-hand side (pattern) part of
a function definition. The function body implements the right-hand side. The
system that applies the rules attempts to apply each rewrite rule to every
sub-graph of a stream-processing program. The rewrite functions therefore need
to signal whether or not they're applicable at runtime. For that reason, the
return type is wrapped in `Maybe`

, and we provide a catch-all pattern for every
rule which simply returns `Nothing`

. The right-hand side implementation can be
pretty thorny. On the left-hand side, the stream operator connector we've
finally ended up with is `Connect`

from `algebraic-graphs`

.

Here's filter fuse, taken from the full ruleset:

```
filterFuse :: RewriteRule
filterFuse (Connect (Vertex a@(StreamVertex i (Filter sel1) (p:_) ty _ s1))
(Vertex b@(StreamVertex _ (Filter sel2) (q:_) _ _ s2))) =
let c = a { operator = Filter (sel1 * sel2)
, parameters = [\[| (\p q x -> p x && q x) $(p) $(q) |]]
, serviceTime = sumTimes s1 sel1 s2
}
in Just (removeEdge c c . mergeVertices (`elem` [a,b]) c)
filterFuse _ = Nothing
```

That's perhaps the simplest rule in the set. (See e.g. hoistOp for one of the worst!)

The question that remains to me, is, which representation, or representations, to use in the thesis? I'm currently planning to skip the abstract example I started with and start with the concrete Haskell encoding using QuickCheck. I'm not sure if it seems weird to have two completely separate implementations of the rules, but the simpler QuickCheck-checked rules are much closer to the "core essence" of the rules than the final implementation in StrIoT. And the derivation of the rules comes earlier in the timeline than the design work that led to the final StrIoT implementation. The middle-option is still compromised, however, by having concrete expressions pretending to be patterns. So I'm not sure.

## Comments