I've finally landed a patch/feature for HLedger that I've been working on, on and off (mostly off), since around March.

HLedger has a powerful CSV importer which you configure with a set of rules. Rules consist of conditional matchers (does field X in this CSV row match this regular expression?) and field assignments (set the resulting transaction's account to Y).

motivating problem 1

Here's an example of one of my rules for handling credit card repayments. This rule is applied when I import a CSV for my current account, which pays the credit card:

if AMERICAN EXPRESS
    account2 liabilities:amex

This results in a ledger entry like the following:

2023-10-31 AMERICAN EXPRESS
    assets:current          £- 6.66
    liabilities:amex        £  6.66

My current account statements cover calendar months. My credit card period spans mid-month to mid-month. I pay it off by direct debit, which comes out after the credit card period, towards the very end of the calendar month. That transaction falls roughly halfway through the next credit card period.

On my credit card statements, that repayment is "warped" to the start of the list of transactions, clearing the outstanding balance from the previous period.

When I import my credit card data to HLedger, I want to compare the result against a PDF statement to make sure my ledger matches reality. The repayment "warping" makes this awkward, because it means the balances for roughly half the new transactions (those that fall before the real date of the repayment) don't match up.

motivating problem 2

I start new ledger files each year. I need to import the closing balances from the previous year to the next, which I do by exporting the final balance from the previous year in CSV and importing that into the new ledgers in the usual way.

Between 2022 and 2023 I changed the scheme I use for account names, so I need to translate between the old and the new in the opening balances. I couldn't think of a way of achieving this in the import rules (besides writing a bespoke rule for every possible old account name), so I abused another HLedger feature instead: HLedger aliases. For example, I added this alias in my family ledger file for 2023:

alias /^family:(.*)/ = \1

These are ugly and I'd prefer to get rid of them.

regex match groups

A common feature of regular expressions is defining match groups which can be referenced elsewhere, such as on the replacement side of a substitution. I added match group support to HLedger's field assignments.
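As a rough illustration of the general idea, here is a group reference in a substitution. This uses Python's `re` module rather than HLedger's own regex engine, and the `swap_names` helper and the sample name are made up for the example:

```python
import re

# Two match groups: a surname and a forename, separated by ", ".
pattern = re.compile(r"(\w+), (\w+)")

def swap_names(text: str) -> str:
    # \1 and \2 refer back to the first and second match groups.
    return pattern.sub(r"\2 \1", text)

print(swap_names("Lovelace, Ada"))  # prints "Ada Lovelace"
```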

addressing date warping

Here's an updated version of the rule from the first motivating problem:

if AMERICAN EXPRESS
& %date (..)/(..)/(....)
    account2 liabilities:amex
    comment2 date:\3-\2-16

We now match on an extra date field, and surround the day/month/year components with parentheses to define match groups. We add a second field assignment too, setting the second posting's "comment" field to a string which, once the match groups are interpolated, instructs HLedger to do date warping (I wrote about this in date warping in HLedger).

The new transaction looks like this:

2023-10-31 AMERICAN EXPRESS
    assets:current          £- 6.66
    liabilities:amex        £  6.66 ; date:2023-10-16
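To make the interpolation step concrete, here is a minimal Python sketch of what the rule above does to the date field. It is not HLedger's implementation: the `warp_comment` helper is hypothetical, and it assumes the CSV date arrives as DD/MM/YYYY, which is what the `(..)/(..)/(....)` matcher and the resulting comment suggest.

```python
import re

def warp_comment(csv_date: str) -> str:
    # Assumes DD/MM/YYYY, mirroring the "%date (..)/(..)/(....)" matcher.
    m = re.search(r"(..)/(..)/(....)", csv_date)
    if m is None:
        raise ValueError(f"unexpected date format: {csv_date!r}")
    # \3 is the year and \2 the month; the day is fixed at 16,
    # as in the "comment2 date:\3-\2-16" field assignment.
    return m.expand(r"date:\3-\2-16")

print(warp_comment("31/10/2023"))  # prints "date:2023-10-16"
```

The first group (the day) is captured but never used, since the repayment is always warped to the 16th of the same month.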

getting rid of aliases

In the second problem, I can strip off the unwanted account name prefixes at CSV import time, with rules like this:

if %account2 ^family:(.*)$
    account2 \1
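The effect of that rule can be sketched in Python as follows. This is only an approximation of the behaviour, with a hypothetical `strip_family_prefix` helper and made-up account names:

```python
import re

def strip_family_prefix(account: str) -> str:
    # Mirrors "if %account2 ^family:(.*)$" followed by "account2 \1".
    m = re.search(r"^family:(.*)$", account)
    # Accounts that do not match are left untouched, just as a rule
    # whose matcher fails makes no assignment.
    return m.group(1) if m else account

for name in ["family:assets:current", "assets:savings"]:
    print(name, "->", strip_family_prefix(name))
```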

When!

This stuff landed a week ago, in early November, and is not yet in an HLedger release.