Code formatting in documents
I've been exploring typesetting and formatting code within text documents such as papers, or my thesis. Up until now, I've been using the listings package without thinking much about it. By default, some sample Haskell code processed by listings looks like this (click any of the images to see larger, non-blurry versions):
It's formatted with a monospaced font, with some keywords highlighted, but not syntactic symbols.
There are several other options for typesetting and formatting code in LaTeX documents. For Haskell in particular, there is the preprocessor lhs2tex, the default output of which looks like this:
It uses a proportional font, but it has taken pains to preserve vertical alignment, which is syntactically significant for Haskell. It looks a little cluttered to me, and I'm not a fan of nearly everything being italic. Again, symbols aren't differentiated, but it has replaced them with more typographically pleasing alternatives: -> has become →, and \ is now λ.
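For reference, a minimal lhs2tex input might look like the sketch below. The notIn function is a made-up sample, not the code from the images, and I'm assuming the stock polycode.fmt include and the code environment as described in the lhs2tex guide.

```latex
\documentclass{article}
% Assumption: the standard formatting directives shipped with lhs2TeX.
%include polycode.fmt

\begin{document}

% Hypothetical sample function, used only for illustration.
\begin{code}
notIn :: Eq a => a -> [a] -> Bool
notIn x = all (\y -> y /= x)
\end{code}

\end{document}
```

Running something like lhs2TeX -o sample.tex sample.lhs should then give a .tex file to feed to LaTeX as usual (the file names are placeholders).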
Another option, perhaps the newest, is the LaTeX package minted, which leverages the Python Pygments program. Here's the same code again. It defaults to monospace (the choice of font seems a lot clearer to me than the default for listings), no symbolic substitution, and liberal use of colour:
An informal survey of the samples so far showed that the minted output was the most popular.
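To try the minted defaults yourself, a minimal document could look something like this. It is only a sketch: notIn is an invented sample function rather than the code from the images, and it assumes Pygments is installed and that LaTeX is run with shell escape enabled, which minted has traditionally required.

```latex
\documentclass{article}
% Assumption: Pygments is installed; compile with e.g. pdflatex -shell-escape
\usepackage{minted}

\begin{document}

% Hypothetical sample function, used only for illustration.
\begin{minted}{haskell}
notIn :: Eq a => a -> [a] -> Bool
notIn x = all (\y -> y /= x)
\end{minted}

\end{document}
```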
All of these packages can be configured to varying degrees. Here are some examples of what I've achieved with a bit of tweaking:
All of this has got me wondering whether there are straightforward empirical answers to some of these questions of style.
Firstly, I'm pretty convinced that symbolic substitution is valuable. When writing Haskell, we write ->, \, /= etc. not because it's most legible, but because it's most practical to type those symbols on the most widely available keyboards and popular keyboard layouts.[1] Of the three options listed here, symbolic substitution is possible with listings and lhs2tex, but I haven't figured out if minted can do it (which is really the question: can pygments do it?).
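With listings, the substitution can be done through the literate key. The following is a sketch rather than my exact configuration: notIn is an invented sample function, and the replacement table covers only the three symbols mentioned above.

```latex
\documentclass{article}
\usepackage{listings}

% Substitute symbols in the typeset output only; the source stays ASCII.
% Each literate entry is {source text}{{replacement}}{width in characters}.
\lstset{
  language=Haskell,
  basicstyle=\ttfamily,
  literate={->}{{$\rightarrow$}}2
           {/=}{{$\neq$}}2
           {\\}{{$\lambda$}}1
}

\begin{document}

% Hypothetical sample function, used only for illustration.
\begin{lstlisting}
notIn :: Eq a => a -> [a] -> Bool
notIn x = all (\y -> y /= x)
\end{lstlisting}

\end{document}
```

Giving each replacement the same width as the text it replaces keeps the columns lined up in the monospaced output. One caveat: as far as I can tell, these replacements also apply inside strings and comments unless the starred form, literate=*, is used.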
I'm unsure about proportional versus monospaced fonts. We typically use monospaced fonts for editing computer code, but that's at least partly for historical reasons. Vertical alignment is often very important in source code, and it can be easily achieved with monospaced text; it's also sometimes important that individual characters (., etc.) are not de-emphasised by being smaller than any other character.
lhs2tex, at least, addresses vertical alignment whilst using proportional fonts. I guess the importance of identifying individual significant characters is just as true in a code sample within a larger document as it is within plain source code.
From a (brief) scan of research on this topic, it seems that proportional fonts result in marginally quicker reading times for regular prose. It's not clear whether those results carry over into reading computer code in particular, and the margin is slim in any case. The drawbacks of monospaced text mostly apply when the volume of text is large, which is not the case for the short code snippets I am working with.
I still have a few open questions:
- Is colour useful for formatting code in a PDF document?
- Does this open up a can of accessibility worms?
- What should be emphasised (or de-emphasised)?
- Why is the minted output most popular? Could the choice of font be key, or aspects of the font other than proportionality (serifs, the size of serifs, etc.)?
[1] The Haskell package base-unicode-symbols (which provides modules such as Data.List.Unicode) lets the programmer use a range of Unicode symbols in place of ASCII approximations, such as ∈ instead of elem, ≠ instead of /=. Sadly, it's not possible to replace the denotation for an anonymous function, \, with λ this way.
Comments
Please, god, no!
Engineers should not be let near the colour palette. Ever. Leave that to the artists and the designers like you'd keep them out of your code box.
Thanks for commenting!
Do you find colour syntax highlighting useful for reading code in an editor? If so, surely the same principle must apply in a PDF?
I disable colour highlighting in my editor.
It used to be that I wouldn't turn it on although now I have to turn it off.
Of course everything has its own way of turning colour off because it's invariably an afterthought as though the people who make these things not only think that of course you'd want random colours sprayed indiscriminately over everything --- why wouldn't you? --- but also and far more egregious: that they have any sense of taste whatsoever.
I have even written a utility to remove SGR sequences from commands because skittles mode is infecting everything and it's impossible to just turn it off globally.
May I use your blog as a soap box to make an impassioned plea to code pushers everywhere? Imagine the code from a first year creative arts student who's not interested in programming but thinks it looks cool ... that's how bad your colour scheme is. Please stop.