Moving from Beancount 2.x to 3.x

A few days ago, Beancount officially switched from 2.x to 3.x. If I'm being honest, I was a bit afraid of this release, mostly because it was supposed to be a major rewrite. I'm generally not a huge fan of rewrites, and I don't think I'm the only one on the Internet who thinks that way.

Luckily, the update seems to have been less of an event than I anticipated. If you're planning to move from Beancount 2.x to 3.x, the update should be mostly uneventful. Nevertheless, there are a few important changes that may affect your workflow. The rest of this blog post will summarize the ones that I've noticed so far.

1. External data import workflow

Update (May 2025): The original version of this blog post that I published in July 2024 described only the pyproject.toml-based approach of importing external data. Since then, I've discovered another (somewhat more official) way of importing external data, using beangulp. Although the official documentation does not seem to reference it yet, beangulp provides a few examples that use the Ingest class to set up a workflow that is not so different from the older config.py-based workflow. For completeness, I'll describe both approaches here.

The first thing I noticed is that the workflow for importing external data looks different.

With 2.x, you had to maintain a config.py file in your project root where you would import all your Importer classes and put them inside a CONFIG variable.

from first_bank_importer import FirstImporter
from second_bank_importer import SecondImporter

CONFIG = [FirstImporter(), SecondImporter()]

Beancount commands like bean-extract and bean-identify would then pick up this file and run through the list of specified importers to find the matching one.

In 3.x, this workflow is gone. Commands like bean-extract and bean-identify have been removed from the beancount package. The project has decided to go for a script-based workflow instead (source). It appears that you don't need to maintain a config.py anymore, and your importers need to provide a command-line entry point that can do those tasks (e.g. identifying whether a given file matches an importer or extracting a list of transactions out of a file).

If you maintain custom importers and distribute them through PyPI, the one immediate downside of this approach is how those importers are initialized. With the config.py approach, your users could instantiate importers just like any other Python object, because config.py was literally just Python code. Here's an example of what that used to look like:

from my_importer import MyImporter

CONFIG = [
    MyImporter(
        name="Assets:MyBank:Checking",
        currency="EUR",
        patterns={"ALDI": "Expenses:Supermarket:ALDI"},
    )
]

In the absence of config.py, you'll need to find a different way of doing this.

1.1. Using `beangulp.Ingest`

beangulp provides an Ingest class that accepts a list of importer objects, ultimately giving you a workflow similar to what was available using config.py. The idea is to define an import.py script (the script's name can be anything) in your project root that contains something like the following:

from my_importer import MyImporter
from beangulp import Ingest

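# Ingest expects a list of importers. Plain parentheses without a
# trailing comma would not create a tuple, so use a list here.
importers = [
    MyImporter(
        name="Assets:MyBank:Checking",
        currency="EUR",
        patterns={"ALDI": "Expenses:Supermarket:ALDI"},
    ),
]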

if __name__ == "__main__":
    ingest = Ingest(importers)
    ingest()

Running this script without any arguments should produce the following output on the console:

Usage: import.py [OPTIONS] COMMAND [ARGS]...

  Import data from and file away documents from financial institutions.

Options:
  --version  Show the version and exit.
  --help     Show this message and exit.

Commands:
  archive   Archive documents.
  extract   Extract transactions from documents.
  identify  Identify files for import.

As you can see, the script now supports the archive, extract, and identify subcommands, which replace bean-file, bean-extract, and bean-identify, respectively.

1.2. Defining importer-specific CLIs

If you don't want to use beangulp.Ingest, another approach is to put the importer's initialization parameters inside pyproject.toml. The pyproject.toml file is a relatively new Python standard that several other tools in the Python ecosystem use to store their configuration (e.g. inside a tool.X section), amongst other information. The reasoning here is that if you're using Beancount, your finances project is basically a Python project, which may as well contain a pyproject.toml file, so putting importer configuration in there isn't all that bad.

As an example, if you maintain a beancount-mybank package distributed through PyPI, you could ask your users to put the following snippet inside their pyproject.toml:

[tool.beancount-mybank.ec]
name = "Assets:MyBank:Checking"
currency = "EUR"
patterns = [
    ["ALDI", "Expenses:Supermarket:ALDI"]
]

... and add a beancount-mybank-ec CLI command as part of your beancount-mybank package. This CLI command could read the configuration inside pyproject.toml, initialize the importer, and then call identify or extract.

Of course, this is just one way of approaching things, and I'm sure there are more. Between the two approaches mentioned above, though, I would recommend going with the beangulp approach.

2. Fava compatibility

Update (May 2025): Fava does support Beancount 3.x!

This one is unfortunate. It seems like Fava isn't compatible with Beancount 3.x yet. There are a couple of GitHub issues (e.g. #1824 and #1831) that are open at the time of this writing. So far I haven't seen much activity on those issues, so I'm not sure what the current state is.

~~If you're upgrading to Beancount 3.x, be aware that you'll lose out on Fava.~~

3. Adjusting Importers

The next thing I noticed is that the importer class definitions need to be adjusted because of API changes.

The beancount.ingest package is gone and has been replaced with the beangulp package, which contains the new importer base (abstract) class and a few other utilities to work with external data. The changes are few, and none of them are very big. Here's a quick list:

3.1. Updated base importer class

The first change is, of course, that beancount.ingest is gone. Well, not completely gone, but at least gone from beancount. This functionality has now been extracted to the beangulp module.

In Beancount 2.x the importers were inheriting from beancount.ingest.importer.ImporterProtocol. In 3.x, they should now inherit from beangulp.importer.Importer.

from beangulp.importer import Importer

class MyImporter(Importer):
    ...

3.2. Updated file pointer type in method signatures

The second change is in the method signatures of methods like extract or identify. In beancount.ingest.importer.ImporterProtocol, these methods used to accept a file-like object (cache._FileMemo) as a parameter.

def extract(self, file: cache._FileMemo):
    pass

This file-like object has been replaced with a string filepath, which is a bit more straightforward to work with.

def extract(self, filepath: str):
    pass
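To illustrate what this means in practice, here's a minimal sketch of identify and extract working on a plain path. It uses a plain class so that it runs without Beancount installed. A real importer would subclass beangulp.importer.Importer and return Beancount entries rather than dicts. The class name, the file-naming rule, and the CSV columns are all hypothetical:

```python
import csv
import os


class PathBasedSketch:
    """Illustrates methods that receive a file path instead of a file-like object."""

    def identify(self, filepath: str) -> bool:
        # Hypothetical rule: match CSV files whose name starts with "mybank".
        name = os.path.basename(filepath).lower()
        return name.startswith("mybank") and name.endswith(".csv")

    def extract(self, filepath: str) -> list:
        # The importer opens the file itself. Rows are returned as plain
        # dicts here, standing in for Beancount entries.
        with open(filepath, newline="", encoding="utf-8") as handle:
            return list(csv.DictReader(handle))
```

Since the method only gets a string, opening and parsing the file is entirely up to the importer.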

3.3. Importers don't have `self.FLAG` anymore

This is a minor one, but I think it might be worth mentioning. If your importers are using self.FLAG in any of the methods, it won't work anymore. You'll need to replace it with one of the flags defined in beancount.core.flags.

3.4. Method names

And the final change I noticed is how the importer method names have changed.

beangulp.importer.Importer defines a new interface that importers should implement. The identify and extract methods have been kept. Other than that, the file_account, file_date, and file_name methods have been renamed to remove the file_ prefix. So file_account has become account, file_date has become date, and file_name has become filename.

The class is defined here, in case you'd like to see the complete interface yourself.
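As a rough illustration of the renamed methods, here's a sketch using a plain stand-in class (again, a real importer would subclass beangulp.importer.Importer). The class name, the returned account and file name, and the convention of reading the date from the file name are assumptions made up for this example:

```python
import datetime
import os
import re


class RenamedMethodsSketch:
    """Stand-in showing the 3.x method names; not a real beangulp importer."""

    def account(self, filepath: str) -> str:  # was file_account in 2.x
        return "Assets:MyBank:Checking"

    def date(self, filepath: str):  # was file_date in 2.x
        # Hypothetical convention: the file name contains a YYYY-MM-DD date.
        match = re.search(r"(\d{4})-(\d{2})-(\d{2})", os.path.basename(filepath))
        if match is None:
            return None
        year, month, day = map(int, match.groups())
        return datetime.date(year, month, day)

    def filename(self, filepath: str) -> str:  # was file_name in 2.x
        return "mybank-checking.csv"
```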

And that's pretty much it. After making the above changes to your importers and your workflow, your setup should be compatible with Beancount 3.x. As I mentioned earlier, the list of changes is short, and the changes themselves are small. However, I couldn't find a migration document that would guide me through the process, so I thought I'd write up a quick one myself.

I hope this blog post was helpful. If you have any questions (or noticed something in this blog post that is not correct), please feel free to reach out to me on Bluesky!

(Oh and btw, if you'd like to learn more about how you can track your personal finances using Python and Beancount and don't mind sticking with Beancount 2.x for now, I wrote a book on the topic. 🙃)