Automatically balancing Beancount DKB transactions
#open-source , #personal-finance , #python
Beancount is one of my favorite packages in the Python ecosystem.
What is Beancount?
I've written about it in much more detail in other places on the internet, but if you're new to it, Beancount is a PyPI package that provides a set of tools for you to keep track of your personal finances using Python.
The entire workflow revolves around a
.beancount file on your disk that
contains all your financial transactions (really, "all" as in
"from-the-beginning-of-time all") specified using the double-entry bookkeeping
format. Keeping it up to date involves the following steps:
- Download all your financial activity (at regular time intervals) from your bank's website as PDF/CSV/whatever-your-bank-allows exports.
- Convert this data into a format that Beancount understand (this is the part which may or may not involve custom Python code).
- Add/append the output of step #2 to
At the end,
you.beancount contains all your financial activity where you can
use all the tools that Beancount (and the ecosystem) provides to run all sorts
of analysis on it.
One of my main bank accounts is with DKB. So to get Beancount working with DKB data exports, I wrote and currently maintain beancount-dkb, which provides a PyPI package to convert DKB CSV exports to the Beancount format.
So if one of the lines of your DKB CSV exports looks as follows:
"15.08.2022";"15.08.2022";"Lastschrift";"REWE Filialen Voll";"REWE SAGT DANKE.";"DE00000000000000000000";"AAAAAAAA";"-15,37";"000000000000000000";"0000000000000000000000";"";
... running it through an
ECImporter instance makes Beancount output something
like the following:
2022-08-15 * "REWE Filialen Voll" "REWE SAGT DANKE" Assets:DKB:EC -15.37 EUR
In terms of double-entry bookkeeping (which is what Beancount relies on), this is halfway there. We still need to add a second leg to this transaction to balance it out such that the total sum is zero.
This step is something we usually do manually. But it doesn't always have to be.
Pattern matching against the Payee
Disclaimer: I stole the following idea from beancount-mercury written by Michael Lynch.
If we look at the original transaction, the payee is always the same. And even if won't always be the same, it's at least going to contain the same pattern.
We can make use of this fact to tell the DKB importers that if the payee string contains a specific value, automatically add a second posting to a separate matching account (which is what we would have done manually anyway). This is specifically the feature I ended up pushing to PyPI last weekend.
The package distributes two importers:
ECImporter (for the German EC accounts)
CreditImporter (for credit cards).
After the update from last weekend, initializing
ECImporter looks as follows:
CONFIG = ( ECImporter( 'DE1234567898765432100', 'Assets:DKB:EC', 'EUR', file_encoding='ISO-8859-1', # New payee_patterns=[ ('REWE Filialen Voll', 'Expenses:Supermarket:REWE'), ], ), )
payee_patterns is a new parameter that should contain a list of
(regex, account) tuples as an argument. With this in place,
ECImporter compiles the
given regular expressions and uses this list to check if the payee of a given
transaction matches any of the given values. If that is the case, the second
posting (referencing the
account) is automatically inserted.
CreditImporter has been adjusted as follows:
CONFIG = ( CreditImporter( '1234********5678', 'Assets:DKB:Credit', 'EUR', file_encoding='ISO-8859-1', # New description_patterns=[ ('REWE SAGT DANKE', 'Expenses:Supermarket:REWE'), ], ), )
The idea is the same as in
ECImporter. Just that instead of
CreditImporter relies on
description_patterns because at the time of this
writing, the CSV exports from DKB credit card accounts do not include a payee
field. So the transaction's description string is all we have to work with.
With this updated API and the updated importers,
outputs the following transaction:
2022-08-15 * "REWE Filialen Voll" "REWE SAGT DANKE" Assets:DKB:EC -15.37 EUR Expenses:Supermarket:REWE
The fact that I don't have to add this one line myself reduces my work by about a half. Thanks Michael, for coming up with the idea!