Off by 1 (Day)
Jan 28, 2019 • #personal-finance, #practices, #programming, #python
One of the most common bugs when writing software is the classic "off by 1" error. In this post, I'll talk about a similar bug I found in some code I maintain, and how I fixed it.
I maintain beancount-dkb, which is a Python package that provides helper classes for converting DKB CSV exports to the Beancount format. In Beancount's terminology, these "helper" classes are called "importers".
If you're not familiar with Beancount, it's a plain-text accounting tool which lets you keep track of all your finances using plain text files. The idea is that you maintain all your bank transactions in one text file, and then use the tools that Beancount provides to run reports over all that data. The transactions in this file follow the Double Entry Accounting method, and are written in a DSL strictly specified by Beancount.
The way this works in practice is that every few weeks, you download
transactions from your bank (often this is a simple CSV export), and run them
through an importer to convert them into a data format that Beancount expects.
You then append the resulting data to a .beancount file you maintain, which
contains all your transactions, going all the way back to the stone age.
Finally, you use the suite of tools that Beancount provides to run all sorts of
analysis on your financial data.
It's actually much less complicated than it sounds.
I've used it to import my financial history from the last three years and the whole process has been quite smooth, except for one hiccup. Balance assertions. And that's what this post is about.
Here's a short code snippet that represents the history of a single bank
account, written in the Beancount DSL. The bank account is named Assets:DKB
and starts out with an opening balance of €100.
For simplicity, the history here consists of a single "going to the supermarket" transaction.
;; -*- mode: beancount -*-
; Date format - YYYY-MM-DD

option "title" "Max Mustermann"
option "operating_currency" "EUR"

2019-01-01 open Assets:DKB
2019-01-01 open Equity:Opening-Balances
2019-01-01 open Expenses:Supermarket

2019-01-15 * "Initialize Assets:DKB"
    Assets:DKB                 100.00 EUR
    Equity:Opening-Balances

2019-01-15 * "Going to the supermarket"
    Assets:DKB                 -30.00 EUR
    Expenses:Supermarket        30.00 EUR

2019-01-16 balance Assets:DKB   70.00 EUR
The transaction with the description "Going to the supermarket" shows that the owner went to some supermarket and spent €30, which means that the account has €70 left at the end.
The interesting bit here is the last line.
2019-01-16 balance Assets:DKB 70.00 EUR
This line instructs Beancount to assert that the balance of the given account is the given amount at the beginning of the given date. And in case that's not true, Beancount should refuse to process things any further because there's obviously something wrong with the data.
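To make the "beginning of the given date" rule concrete, here is a minimal Python sketch of the check. It is not Beancount's implementation; the Posting type and the balance_at_start_of helper are hypothetical names made up for illustration. The detail that matters is the strict `<` comparison: only postings dated before the assertion date count.

```python
from datetime import date
from decimal import Decimal
from typing import NamedTuple


class Posting(NamedTuple):
    """One leg of a transaction (hypothetical, simplified type)."""
    day: date
    account: str
    amount: Decimal


def balance_at_start_of(day: date, account: str, postings: list[Posting]) -> Decimal:
    """Sum every posting on `account` dated strictly before `day`."""
    return sum(
        (p.amount for p in postings if p.account == account and p.day < day),
        Decimal("0"),
    )


# The two Assets:DKB postings from the example ledger above.
postings = [
    Posting(date(2019, 1, 15), "Assets:DKB", Decimal("100.00")),
    Posting(date(2019, 1, 15), "Assets:DKB", Decimal("-30.00")),
]

# Checked on 2019-01-16, both postings from the 15th are already included.
assert balance_at_start_of(date(2019, 1, 16), "Assets:DKB", postings) == Decimal("70.00")
# Checked on 2019-01-15, neither posting counts yet.
assert balance_at_start_of(date(2019, 1, 15), "Assets:DKB", postings) == Decimal("0")
```

In other words, the same assertion dated one day earlier would see an empty account.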
Such assertions are not strictly necessary, but having them gives you peace of mind that the data you're working with is correct. Data can go wrong through something like duplicate transactions. Often, when you have two accounts and you transfer money from one to the other, the same transaction shows up in both account summaries. To Beancount they are two different transactions, but practically speaking they are two legs of the same transaction. Left unmerged, they would result in wrong numbers on both accounts. This is where balance assertions come in handy.
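As a rough illustration of the duplicate-transfer problem, the sketch below books the same €50 transfer once per account export and compares the result with what the bank reports. The Assets:Savings account, the amounts, and the balance helper are all assumptions made up for this example.

```python
from decimal import Decimal

# Each tuple is (account, amount). One EUR 50 transfer from DKB to savings
# shows up once in each account's export (hypothetical data).
from_dkb_export = [("Assets:DKB", Decimal("-50.00")), ("Assets:Savings", Decimal("50.00"))]
from_savings_export = [("Assets:DKB", Decimal("-50.00")), ("Assets:Savings", Decimal("50.00"))]


def balance(account, postings, opening=Decimal("0")):
    """Opening balance plus all postings on the given account."""
    return opening + sum((amt for acct, amt in postings if acct == account), Decimal("0"))


opening = Decimal("100.00")
bank_reported = Decimal("50.00")  # what the bank says is left in Assets:DKB

# Importing both exports without merging books the transfer twice.
unmerged = from_dkb_export + from_savings_export
merged = from_dkb_export

dkb_unmerged = balance("Assets:DKB", unmerged, opening)
dkb_merged = balance("Assets:DKB", merged, opening)

# A balance assertion is essentially this comparison, done by Beancount.
assert dkb_unmerged != bank_reported  # duplicate detected
assert dkb_merged == bank_reported    # numbers agree once merged
```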
When I first started out with Beancount, there were no balance assertions in my
data. I knew the concept, but the initial versions of beancount-dkb I released
didn't output any balance directives.
After a few months of regular Beancount usage and realizing how useful these
assertions can be, I decided to implement support in
beancount-dkb. This was
not too much work since the documentation is pretty clear on how to do this.
So something like this,
... becomes this
2019-01-01 balance Assets:DKB 100.00 EUR
What turned out to be more work was testing the whole thing.
After the initial implementation, I noticed the numbers just weren't adding up. The test cases were fine, but the test cases were made-up data anyway. The output on the actual data just didn't add up. After a few hours of trying to figure things out, I found this little gem in the documentation.
Note that a balance assertion, like all other non-transaction directives, applies at the beginning of its date (i.e., midnight at the start of day). Just imagine that the balance check occurs right after midnight on that day.
The difference is subtle, but can easily lead to numbers not adding up.
What's happening here is that Beancount is expecting the balance amount to be valid at the beginning of the day, while the balance values from the DKB output correspond to the amount at the end of the day. Note that the DKB behavior is not documented anywhere (and if it is, I couldn't find the relevant docs), but from all the data I saw, this makes the most sense.
The fix in this case was easy. Just set the date of the
balance directive to 1
day after what's in the CSV. This shifts the time of the assertion to midnight
at the start of the next day, which in turn makes the numbers all look good.
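The fix can be sketched in a few lines of Python. This is an illustration rather than the actual beancount-dkb code: the function names are hypothetical, and the DD.MM.YYYY date format is an assumption about how the CSV writes dates.

```python
from datetime import date, datetime, timedelta


def balance_assertion_date(csv_date: str, fmt: str = "%d.%m.%Y") -> date:
    """Return the date to put on the balance directive.

    The CSV balance is treated as an end-of-day figure, while the directive
    is checked at the start of its date, so the date moves forward one day.
    """
    statement_day = datetime.strptime(csv_date, fmt).date()
    return statement_day + timedelta(days=1)


def format_balance(csv_date: str, account: str, amount: str, currency: str = "EUR") -> str:
    """Render a balance directive line in Beancount syntax."""
    return f"{balance_assertion_date(csv_date)} balance {account} {amount} {currency}"


print(format_balance("15.01.2019", "Assets:DKB", "70.00"))
# 2019-01-16 balance Assets:DKB 70.00 EUR
print(format_balance("31.01.2019", "Assets:DKB", "70.00"))
# 2019-02-01 balance Assets:DKB 70.00 EUR
```

Using timedelta rather than incrementing the day number by hand means month and year boundaries roll over correctly.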
Admittedly, this bug wasn't too hairy and the fix wasn't that tricky either. But in cases like this, where things are rather subtle, it can take you anywhere from a few minutes to a few hours to find a fix. For me, it was somewhere in between.
The latest release of
beancount-dkb includes this patch. I have (more or less
completely) rewritten my Beancount data using the latest code with the correct
balance assertions and so far things have been smooth. Apologies if an older
version affected your data!