It's been a year since I started exploring HLedger, and I'm still going. The rollover to 2023 was an opportunity to revisit my approach.

Some time ago I stumbled across Dmitry Astapov's HLedger notes (fully-fledged hledger, which I briefly mentioned in eventual consistency) and decided to adopt some of its ideas.

new year, new journal

First up, Astapov encourages starting a new journal file for a new calendar year. I do this for other, accounting-adjacent files as a matter of course, and I did it for my GNUCash files prior to adopting HLedger. But the reason for those is a general suspicion that a simple mistake with those softwares could irrevocably corrupt my data. I'm much more confident with HLedger, so rolling over at years end isn't necessary for that. But there are other advantages. A quick obvious one is you can get rid of old accounts (such as expense accounts tied to a particular project, now completed).

one journal per import

In the first year, I periodically imported account data via CSV exports of transactions and HLedger's (excellent) CSV import system. I imported all the transactions, once each, into a single, large journal file.

Astapov instead advocates for creating a separate journal for each CSV that you wish to import, and keep around the CSV, leaving you with a 1:1 mapping of CSV:journal. Then use HLedger's "include" mechanism to pull them all into the main journal.

With the former approach, where the CSV data was imported precisely, once, it was only exposed to your import rules once. The workflow ended up being: import transactions; notice some that you could have matched with import rules and auto-coded; write the rule for the next time. With Astapov's approach, you can re-generate the journal from the CSV at any point in the future with an updated set of import rules.

tracking dependencies

Now we get onto the job of driving the generation of all these derivative journal files. Astapov has built a sophisticated system using Haskell's "Shake", which I'm not yet familiar, but for my sins I'm quite adept at (GNU-flavoured) UNIX Make, so I started building with that. An example rule

import/jon/amex/%.journal: import/jon/amex/%.csv rules/amex.csv.rules
    rm -f $(@D)/.latest.$*.csv $@
    hledger import --rules-file rules/amex.csv.rules -f $@ $<

This captures the dependency between the journal and the underlying CSV but also to the relevant rules file; if I modify that, and this target is run in the future, all dependent journals should be re-generated.1

opening balances

It's all fine and well starting over in a new year, and I might be generous to forgive debts, but I can't count on others to do the same. We need to carry over some balance information from one year to the next. Astapov has a more complex (or perhaps featureful) scheme for this involving a custom Haskell program, but I bodged something with a pair of make targets:

import/opening/2023.csv: 2022.journal
    mkdir -p import/opening
    hledger bal -f $< \
                $(list_of_accounts_I_want_to_carry_over) \
        -O csv -N > $@

import/opening/2023.journal: import/opening/2023.csv rules/opening.rules
    rm -f $(@D)/.latest.2023.csv $@
    hledger import --rules-file rules/opening.rules \
        -f $@ $<

I think this could be golfed into a year-generic rule with a little more work. The nice thing about this approach is the opening balances for a given year might change, if adjustments are made in prior years. They shouldn't, for real accounts, but very well could for more "virtual" liabilities. (including: deciding to write off debts.)

run lots of reports

Astapov advocates for running lots of reports, and automatically. There's a really obvious advantage of that to me: there's no chance anyone except me will actually interact with HLedger itself. For family finances, I need reports to be able to discuss anything with my wife.

Extending my make rules to run reports is trivial. I've gone for HTML reports for the most part, as they're the easiest on the eye. Unfortunately the most useful report to discuss (at least at the moment) would be a list of transactions in a given expense category, and the register/aregister commands did not support HTML as an output format. I submitted my first HLedger patch to add HTML output support to aregister:

addressing the virtual posting problem

I wrote in my original hledger blog post that I had to resort to unbalanced virtual postings in order to record both a liability between my personal cash and family, as well as categorise the spend. I still haven't found a nice way around that.

But I suspect having broken out the journal into lots of other journals paves the way to a better solution to the above.

The form of a solution I am thinking of is: some scheme whereby the two destination accounts are combined together; perhaps, choose one as a primary and encode the other information in sub-accounts under that. For example, repeating the example from my hledger blog post:

2022-01-02 ZTL*RELISH
    family:liabilities:creditcard      £ -3.00
    family:dues:jon                     £ 3.00
    (jon:expenses:snacks)               £ 3.00

This could become

2022-01-02 ZTL*RELISH
    family:liabilities:creditcard      £ -3.00

(I note this is very similar to a solution proposed to me by someone responding on twitter).

The next step is to recognise that sometimes when looking at the data I care about one aspect, and at other times the other, but rarely both. So for the case where I'm thinking about family finances, I could use account aliases to effectively flatten out the expense category portion and ignore it.

On the other hand, when I'm concerned about how I've spent my personal cash and not about how much I owe the family account, I could use aliases to do the opposite: rewrite-away the family:liabilities:jon prefix and combine the transactions with the regular jon:expenses account heirarchy.

(this is all speculative: I need to actually try this.)

catching errors after an import

When I import the transactions for a given real bank account, I check the final balance against another source: usually a bank statement, to make sure they agree. I wasn't using any of the myriad methods to make sure that this remains true later on, and so there was the risk that I make an edit to something and accidentally remove a transaction that contributed to that number, and not notice (until the next import).

The CSV data my bank gives me for accounts (not for credit cards) also includes a 'resulting balance' field. It was therefore trivial to extend the CSV import rules to add balance assertions to the transactions that are generated. This catches the problem.

There are a couple of warts with balance assertions on every such transaction: for example, dealing with the duplicate transaction for paying a credit card: one from the bank statement, one from the credit card. Removing one of the two is sufficient to correct the account balances but sometimes they don't agree on the transaction date, or the transactions within a given day are sorted slightly differently by HLedger than by the bank. The simple solution is to just manually delete one or two assertions: there remain plenty more for assurance.

going forward

I've only scratched the surface of the suggestions in Astapov's "full fledged HLedger" notes. I'm up to step 2 of 14. I'm expecting to return to it once the changes I've made have bedded in a little bit.

I suppose I could anonymize and share the framework (Makefile etc) that I am using if anyone was interested. It would take some work, though, so I don't know when I'd get around to it.

  1. the rm …latest… bit is to clear up some state-tracking files that HLedger writes to avoid importing duplicate transactions. In this case, I know better than HLedger.