A few days ago, Beancount officially switched from 2.x to 3.x. If I'm being honest, I was a bit afraid of this release, mostly because it was supposed to be a major rewrite. I'm generally not a huge fan of rewrites, and I don't think I'm the only one on the Internet who thinks that way.
Luckily, the update seems to have been less of an event than I anticipated. If you're planning to move from Beancount 2.x to 3.x, the update should be mostly uneventful. Nevertheless, there are a few important changes that may affect your workflow. The rest of this blog post summarizes the ones that I've noticed so far.
1. External data import workflow
Update (May 2025): The original version of this blog post that I published in July 2024 described only the pyproject.toml-based approach of importing external data. Since then, I've discovered another (somewhat more official) way of importing external data, using beangulp. Although the official documentation does not seem to reference it yet, beangulp provides a few examples that use the Ingest class to set up a workflow that is not so different from the older config.py-based workflow. For completeness, I'll describe both approaches here.
The first thing I noticed is that the workflow for importing external data looks different. With 2.x, you had to maintain a config.py file in your project root where you would import all your Importer classes and put them inside a CONFIG variable.
from first_bank_importer import FirstImporter
from second_bank_importer import SecondImporter
CONFIG = [FirstImporter(), SecondImporter()]
The Beancount commands like bean-extract and bean-identify would then pick up this file and run through the list of specified importers to find the matching one.
In 3.x, this workflow is gone. Commands like bean-extract and bean-identify have been removed from the beancount package. Instead, the project has decided to go for a script-based workflow (source). It appears that you don't need to maintain a config.py anymore; your importers now need to provide a command-line entry point that can do those tasks (e.g. identifying whether a given file matches an importer, or extracting a list of transactions out of a file).
If you maintain custom importers and distribute them through PyPI, the one immediate downside of this approach is how those importers are initialized. With the config.py approach, your users could instantiate importers just like any other Python object, because config.py was literally just Python code. Here's an example of what that used to look like:
from my_importer import MyImporter

CONFIG = [
    MyImporter(
        name="Assets:MyBank:Checking",
        currency="EUR",
        patterns={"ALDI": "Expenses:Supermarket:ALDI"},
    )
]
In the absence of config.py, you'll need to find a different way of doing this.
1.1. Using beangulp.Ingest
beangulp provides an Ingest class that accepts a list of importer objects, ultimately giving you a workflow similar to what was available using config.py. The idea is to define an import.py script (the script's name can be anything) in your project root that contains something like the following:
from beangulp import Ingest
from my_importer import MyImporter

# Ingest expects a sequence of importer instances.
importers = [
    MyImporter(
        name="Assets:MyBank:Checking",
        currency="EUR",
        patterns={"ALDI": "Expenses:Supermarket:ALDI"},
    ),
]

if __name__ == "__main__":
    ingest = Ingest(importers)
    ingest()
Running this script should produce the following output on the console:
Usage: import.py [OPTIONS] COMMAND [ARGS]...

  Import data from and file away documents from financial institutions.

Options:
  --version  Show the version and exit.
  --help     Show this message and exit.

Commands:
  archive   Archive documents.
  extract   Extract transactions from documents.
  identify  Identify files for import.
As you can see, the script now supports archive, extract, and identify subcommands, which replace bean-file, bean-extract, and bean-identify, respectively.
1.2. Defining importer-specific CLIs
If you don't want to use beangulp.Ingest, another approach is to put the importer's initialization parameters inside pyproject.toml. The pyproject.toml file is a relatively new Python standard that is used by a few other tools in the Python ecosystem to hold their configuration (e.g. inside a tool.X section), amongst other information. The reasoning here is that if you're using Beancount, your finances project is basically a Python project that may as well contain a pyproject.toml file, which in turn means putting importer configuration in there isn't all that bad.
As an example, if you maintain a beancount-mybank package distributed through PyPI, you could ask your users to put the following snippet inside their pyproject.toml:
[tool.beancount-mybank.ec]
name = "Assets:MyBank:Checking"
currency = "EUR"
patterns = [
["ALDI", "Expenses:Supermarket:ALDI"]
]
... and add a beancount-mybank-ec CLI command as part of your beancount-mybank package. This CLI command could read the configuration inside pyproject.toml, initialize the importer, and then call identify or extract.
Of course, these are just two ways of approaching things, and I'm sure there are more. Between the two approaches described above, though, I would recommend going with the beangulp approach.
2. Fava compatibility
Update (May 2025): Fava does support Beancount 3.x!
This one is unfortunate. It seems like Fava isn't compatible with Beancount 3.x yet. There are a couple of GitHub issues (e.g. #1824 and #1831) that are open at the time of this writing. So far I haven't seen much activity on those issues, so I'm not sure what the current state is.
If you're upgrading to Beancount 3.x, be aware that you'll lose out on Fava.
3. Adjusting Importers
The next thing I noticed is that the importer class definitions need to be adjusted because of API changes.
The beancount.ingest package is gone and has been replaced with the separate beangulp package, which contains the new importer base (abstract) class and a few other utilities to work with external data. There aren't many changes, and none of them are very big. Here's a quick list:
3.1. Updated base importer class
The first change is, of course, that beancount.ingest is gone. Well, not completely gone, but at least gone from beancount. This functionality has now been extracted to the beangulp package.
In Beancount 2.x, importers inherited from beancount.ingest.importer.ImporterProtocol. In 3.x, they should now inherit from beangulp.importer.Importer.
from beangulp.importer import Importer

class MyImporter(Importer):
    ...
3.2. Updated file pointer type in method signatures
The second change is in the signatures of methods like extract or identify. In beancount.ingest.importer.ImporterProtocol, these methods used to accept a file-like object (cache._FileMemo) as a parameter.
def extract(self, file: cache._FileMemo):
    pass
This file-like object has been replaced with a string filepath, which is a bit more straightforward to work with.
def extract(self, filepath: str):
    pass
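In practice, this means the importer opens the file itself rather than asking a wrapper object for its contents. Here's a rough sketch of an identify method written against a plain path. The CSV header and the account name are made up for illustration, and the try/except around the import is only there so that the snippet runs even without beangulp installed.

```python
import csv

try:
    from beangulp.importer import Importer
except ImportError:
    # Stand-in so the sketch can be run without beangulp installed.
    Importer = object


class MyImporter(Importer):
    # Hypothetical header row of the bank's CSV export.
    HEADER = ["Date", "Description", "Amount"]

    def identify(self, filepath: str) -> bool:
        # filepath is a plain string, so the importer reads the file itself.
        if not filepath.lower().endswith(".csv"):
            return False
        with open(filepath, newline="", encoding="utf-8") as f:
            return next(csv.reader(f), None) == self.HEADER

    def account(self, filepath: str) -> str:
        return "Assets:MyBank:Checking"
```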
3.3. Importers don't have self.FLAG anymore
This is a minor one, but I think it's worth mentioning. If your importers use self.FLAG in any of their methods, that won't work anymore. You'll need to replace it with one of the flags defined in beancount.core.flags.
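As far as I can tell, self.FLAG in 2.x was just the regular "okay" flag (*), so FLAG_OKAY should be the closest drop-in replacement. Here's a small sketch of picking a flag explicitly. The pick_flag helper is hypothetical, and the fallback values only exist so that the snippet runs without Beancount installed.

```python
try:
    from beancount.core.flags import FLAG_OKAY, FLAG_WARNING
except ImportError:
    # Fallback so the sketch runs without Beancount installed; these are
    # the same characters Beancount uses in ledger files.
    FLAG_OKAY, FLAG_WARNING = "*", "!"


def pick_flag(has_matching_pattern: bool) -> str:
    """Hypothetical helper: mark unmatched transactions for manual review."""
    return FLAG_OKAY if has_matching_pattern else FLAG_WARNING
```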
3.4. Method names
And the final change I noticed is how the importer method names have changed.
beangulp.importer.Importer defines a new interface that importers should implement. The identify and extract methods have been kept. Other than that, the file_account, file_date, and file_name methods have been renamed to remove the file_ prefix. So file_account has become account, file_date has become date, and file_name has become filename.
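Put together, a migrated importer skeleton might look roughly like this. It's a sketch rather than a complete importer: the class name, the filename convention, and the return values are made up, extract is left out for brevity, and the try/except around the import only exists so that the snippet runs without beangulp installed.

```python
import datetime
import os

try:
    from beangulp.importer import Importer
except ImportError:
    # Stand-in so the sketch can be run without beangulp installed.
    Importer = object


class MyBankImporter(Importer):
    def identify(self, filepath: str) -> bool:
        # Hypothetical convention: exports are named "mybank-<something>.csv".
        return os.path.basename(filepath).startswith("mybank-")

    def account(self, filepath: str) -> str:  # was file_account in 2.x
        return "Assets:MyBank:Checking"

    def date(self, filepath: str) -> datetime.date:  # was file_date in 2.x
        # Falls back to the file's modification time as the document date.
        return datetime.date.fromtimestamp(os.path.getmtime(filepath))

    def filename(self, filepath: str) -> str:  # was file_name in 2.x
        return "mybank.csv"
```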
The class is defined here, in case you'd like to see the complete interface yourself.
And that's pretty much it. After making the above changes to your importers and your workflow, your setup should be compatible with Beancount 3.x. As I mentioned earlier, the list of changes is not very long, and the changes themselves aren't that big either. However, I couldn't find a migration document that would guide me through the process, so I decided to write up a quick one myself.
I hope this blog post was helpful. If you have any questions (or noticed something in this blog post that is not correct), please feel free to reach out to me on Bluesky!
(Oh and btw, if you'd like to learn more about how you can track your personal finances using Python and Beancount and don't mind sticking with Beancount 2.x for now, I wrote a book on the topic. 🙃)