Automatically balancing Beancount DKB transactions

Beancount is one of my favorite packages in the Python ecosystem.

What is Beancount?

I've written about it in much more detail in other places on the internet, but if you're new to it, Beancount is a PyPI package that provides a set of tools for you to keep track of your personal finances using Python.

The entire workflow revolves around a .beancount file on your disk that contains all your financial transactions (really, "all" as in "from-the-beginning-of-time all") specified using the double-entry bookkeeping format. Keeping it up to date involves the following steps:

  1. Download all your financial activity (at regular time intervals) from your bank's website as PDF/CSV/whatever-your-bank-allows exports.
  2. Convert this data into a format that Beancount understand (this is the part which may or may not involve custom Python code).
  3. Add/append the output of step #2 to you.beancount.

At the end, you.beancount contains all your financial activity where you can use all the tools that Beancount (and the ecosystem) provides to run all sorts of analysis on it.

Beancount DKB

One of my main bank accounts is with DKB. So to get Beancount working with DKB data exports, I wrote and currently maintain beancount-dkb, which provides a PyPI package to convert DKB CSV exports to the Beancount format.

So if one of the lines of your DKB CSV exports looks as follows:

"15.08.2022";"15.08.2022";"Lastschrift";"REWE Filialen Voll";"REWE SAGT DANKE.";"DE00000000000000000000";"AAAAAAAA";"-15,37";"000000000000000000";"0000000000000000000000";"";

... running it through an ECImporter instance makes Beancount output something like the following:

2022-08-15 * "REWE Filialen Voll" "REWE SAGT DANKE"
    Assets:DKB:EC        -15.37 EUR

In terms of double-entry bookkeeping (which is what Beancount relies on), this is halfway there. We still need to add a second leg to this transaction to balance it out such that the total sum is zero.

This step is something we usually do manually. But it doesn't always have to be.

Pattern matching against the Payee

Disclaimer: I stole the following idea from beancount-mercury written by Michael Lynch.

If we look at the original transaction, the payee is always the same. And even if won't always be the same, it's at least going to contain the same pattern.

We can make use of this fact to tell the DKB importers that if the payee string contains a specific value, automatically add a second posting to a separate matching account (which is what we would have done manually anyway). This is specifically the feature I ended up pushing to PyPI last weekend.

The package distributes two importers: ECImporter (for the German EC accounts) and CreditImporter (for credit cards).

After the update from last weekend, initializing ECImporter looks as follows:

CONFIG = (
    ECImporter(
        'DE1234567898765432100',
        'Assets:DKB:EC',
        'EUR',
        file_encoding='ISO-8859-1',
        # New
        payee_patterns=[
            ('REWE Filialen Voll', 'Expenses:Supermarket:REWE'),
        ],
    ),
)

payee_patterns is a new parameter that should contain a list of (regex, account) tuples as an argument. With this in place, ECImporter compiles the given regular expressions and uses this list to check if the payee of a given transaction matches any of the given values. If that is the case, the second posting (referencing the account) is automatically inserted.

Similarly, the CreditImporter has been adjusted as follows:

CONFIG = (
    CreditImporter(
        '1234********5678',
        'Assets:DKB:Credit',
        'EUR',
        file_encoding='ISO-8859-1',
        # New
        description_patterns=[
            ('REWE SAGT DANKE', 'Expenses:Supermarket:REWE'),
        ],
    ),
)

The idea is the same as in ECImporter. Just that instead of payee_patterns, CreditImporter relies on description_patterns because at the time of this writing, the CSV exports from DKB credit card accounts do not include a payee field. So the transaction's description string is all we have to work with.

Result

With this updated API and the updated importers, beancount-dkb automatically outputs the following transaction:

2022-08-15 * "REWE Filialen Voll" "REWE SAGT DANKE"
    Assets:DKB:EC        -15.37 EUR
    Expenses:Supermarket:REWE

The fact that I don't have to add this one line myself reduces my work by about a half. Thanks Michael, for coming up with the idea!