I've been meaning to write more about my PhD work for absolutely ages, but I've held myself back by wanting to try and keep a narrative running through the blog posts. That's not realistic for a number of reasons so I'm going to just write about different aspects of things without worrying about whether they make sense in the context of recent blog posts or not.

Part of what I am doing at the moment is investigating Template Haskell to see whether it would usefully improve our system implementation. Before I write more about how it might apply to our system, I'll first write a bit about Template Haskell itself.

Template Haskell (TH) is a meta-programming system: you write programs that are executed at compile time and can output code to be spliced into the parent program. The approach used by TH is really nice: you perform your meta-programming in real first-class Haskell, and it integrates really well with the main program.

TH provides two pairs of special brackets. Oxford brackets surrounding any Haskell expression cause the whole expression to be replaced by the result of parsing the expression — an expression tree — which can be inspected and manipulated by the main program:

[| \x -> x + 1 |]

The expression data-type is a series of mutually-recursive data types that represent the complete Haskell grammar. The top-level is Exp, for expression, which has constructors for the different expression types. The above lambda expression is represented as

LamE [VarP x_1]
    (InfixE (Just (VarE x_1))
            (VarE GHC.Num.+)
            (Just (LitE (IntegerL 1))))

Such expressions can be pattern-matched against, constructed, deconstructed etc just like any other data type.

The other bracket type performs the opposite operation: it takes an expression structure and splices it into code in the main program, to be compiled as normal:

λ> 1 + $( litE (IntegerL 1) )

The two are often intermixed, sometimes nested to several levels. What follows is a typical beginner TH meta-program. The standard function fst operators on a 2-tuple and returns the first value. It cannot operate on a tuple of a different valence. However, a meta-program can generate a version of fst specialised for an n-tuple of any n:

genfst n = do
    xs <- replicateM n (newName "x")
    let ntup = tupP (map varP xs)
    [| \ $(ntup) ->  $(varE (head xs)) |]

Used like so

λ> $(genfst 2) (1,2)
λ> $(genfst 3) ('a','b','c')
λ> :t $(genfst 10)
$(genfst 10) :: (a, b, c, d, e, f, g, h, i, j) -> a

That's a high-level gist of how you can use TH. I've skipped over a lot of detail, in particular an important aspect relating to scope and naming, which is key to the problem I am exploring at the moment. Oxford brackets and slice brackets do not operate directly on the simple Exp data-type, but upon an Exp within the Q Monad:

λ> :t [| 1 |]
[| 1 |] :: ExpQ

ExpQ is a synonym for Q Exp. Eagle-eyed Haskellers will have noticed that genfst above was written in terms of some Monad. And you might also have noticed the case discrepancy between the constructor types VarE (Etc) and varE, tupP, varP used in that function definition. These are convenience functions that wrap the relevant constructor in Q. The point of the Q Monad is (I think) to handle name scoping, and avoid unintended name clashes. Look at the output of these simple expressions, passed through runQ:

λ> runQ [| \x -> x |]
LamE [VarP x_1] (VarE x_1)
λ> runQ [| \x -> x |]
LamE [VarP x_2] (VarE x_2)

Those x are not the same x in the context they are evaluated (a GHCi session). And that's the crux of the problem I am exploring. More in a later blog post!