API
Data Types
The following data types are used to define the configuration of the importer.
SimpleFileMatch = str | StrExactMatch | StrRegexMatch
module-attribute
Currently, we support three different modes of matching a CSV file. The first one is the default one, glob: a simple string such as "sources/*.csv" makes it use glob mode.
You can also do an exact match with the equals key.
Or, if you prefer a regular expression, use the regex key.
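The three modes can be illustrated with a small standard-library sketch. This is not the importer's own code: the helper name file_matches is hypothetical, and anchoring the regular expression at the start of the path is an assumption. The equals and regex keys are the ones used elsewhere in this reference.

```python
import fnmatch
import re


def file_matches(rule, path):
    """Return True if `path` satisfies a file-match rule (illustrative only)."""
    if isinstance(rule, str):
        # A plain string is treated as a glob pattern (the default mode).
        return fnmatch.fnmatchcase(path, rule)
    if "equals" in rule:
        # Exact match: the whole path must be identical.
        return path == rule["equals"]
    if "regex" in rule:
        # Assumption: the pattern is anchored at the start of the path.
        return re.match(rule["regex"], path) is not None
    raise ValueError(f"unsupported file match rule: {rule!r}")


print(file_matches("sources/*.csv", "sources/2024.csv"))
print(file_matches({"equals": "sources/2024.csv"}, "sources/2024.csv"))
print(file_matches({"regex": r"sources/\d{4}\.csv"}, "sources/abcd.csv"))
```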
ActionAddTxn
Bases: ImportBaseModel
Add a transaction to the beancount file.
This is the default action type. If your action does not specify a type, it will be assumed to be an add transaction action.
The following keys are available for the add transaction action:
file
: output beancount file name to write the transaction to
txn
: the template of the transaction to insert
A transaction template is an object that contains the following keys:
id
: the optional import-id to overwrite the default one. By default, {{ file | as_posix_path }}:{{ lineno }} will be used unless the extractor provides a default value.
date
: the optional date value to overwrite the default one. By default, {{ date }} will be used.
flag
: the optional flag value to overwrite the default one. By default, * will be used.
narration
: the optional narration value to overwrite the default one. By default, {{ desc | default(bank_desc, true) }} will be used.
payee
: the optional payee value of the transaction.
tags
: an optional list of tags for the transaction.
links
: an optional list of links for the transaction.
metadata
: an optional list of name and value objects as the metadata items for the transaction.
postings
: a list of templates for postings in the transaction.
The structure of the posting template object looks like this.
account
: the account of the posting
amount
: the optional amount object with number and currency keys
price
: the optional amount object with number and currency keys
cost
: the optional template of cost spec
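How the documented defaults combine with a user-supplied template can be sketched as a plain dictionary merge. The helper with_defaults and the constant TXN_DEFAULTS are hypothetical names used only for illustration. The default strings are copied from the key descriptions above, and the real importer may resolve them differently (for instance when the extractor provides its own default id).

```python
# Defaults quoted from the key descriptions; keys without a stated default are omitted.
TXN_DEFAULTS = {
    "id": "{{ file | as_posix_path }}:{{ lineno }}",
    "date": "{{ date }}",
    "flag": "*",
    "narration": "{{ desc | default(bank_desc, true) }}",
}


def with_defaults(template):
    """Return a copy of `template` with missing or None keys filled from the defaults."""
    merged = dict(TXN_DEFAULTS)
    # Only explicitly provided (non-None) values overwrite a default.
    merged.update({key: value for key, value in template.items() if value is not None})
    return merged


resolved = with_defaults({"narration": "Coffee", "postings": [{"account": "Expenses:Food"}]})
print(resolved["flag"], resolved["narration"])
```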
file: str | None = None
class-attribute
instance-attribute
Which file to add the transaction to. If not provided, the default file will be used.
txn: TransactionTemplate
instance-attribute
The transaction transform template
type: typing.Literal[ActionType.add_txn] = pydantic.Field(ActionType.add_txn)
class-attribute
instance-attribute
indicates that this action is to add a transaction
ActionDelTxn
Bases: ImportBaseModel
Delete a transaction from the beancount file.
The following keys are available for the delete transaction action:
txn
: the template of the transaction to delete
A deleting transaction template is an object that contains the following keys:
id
: the import-id value identifying the transaction to delete. By default, {{ file | as_posix_path }}:{{ lineno }} will be used unless the extractor provides a default value.
txn: DeleteTransactionTemplate
instance-attribute
The transaction to delete
type: typing.Literal[ActionType.del_txn] = pydantic.Field(ActionType.del_txn)
class-attribute
instance-attribute
indicates that this action is to delete a transaction
ActionIgnore
Bases: ImportBaseModel
Ignore the transaction.
This prevents the transaction from being added to the beancount file.
Sometimes you are not interested in certain transactions, but if you don't process them, they will still appear in the "unprocessed transactions" section of the report provided by our command line tool.
To mark a transaction as processed, you can simply use the ignore action like this:
- name: Ignore unused entries
  match:
    extractor:
      equals: "mercury"
    desc:
      one_of:
        - Mercury Credit
        - Mercury Checking xx1234
  actions:
    - type: ignore
type: typing.Literal[ActionType.ignore] = pydantic.Field(ActionType.ignore)
class-attribute
instance-attribute
indicates that this action is to ignore the transaction
Amount
Bases: ImportBaseModel
A posting amount transform template.
Used to transform the raw transaction amount into a beancount posting amount.
Examples
currency: str
instance-attribute
The currency of the amount.
number: str
instance-attribute
The amount number. It can be a Jinja2 template.
BeancountTransaction
dataclass
Beancount transaction.
file: pathlib.Path
instance-attribute
The beancount file path
id: str
instance-attribute
The import id of the transaction
lineno: int
instance-attribute
The line number of the transaction in the beancount file
ChangeSet
dataclass
Change set for beancount transactions.
It represents the changes to be made to the beancount file.
add: list[GeneratedTransaction]
instance-attribute
List of generated transactions to add
dangling: list[BeancountTransaction] | None = None
class-attribute
instance-attribute
List of existing beancount transactions with no corresponding generated transactions (dangling)
remove: list[BeancountTransaction]
instance-attribute
List of existing beancount transactions to remove
update: dict[int, GeneratedTransaction]
instance-attribute
map from
DateAfterMatch
Bases: MatchBaseModel
To match values after a date, one can do this:
date_after: str
instance-attribute
The date to match after
format: str
instance-attribute
The format of the date, used to parse both the value being matched and the date to match against
DateBeforeMatch
Bases: MatchBaseModel
To match values before a date, one can do this:
date_before: str
instance-attribute
The date to match before
format: str
instance-attribute
The format of the date, used to parse both the value being matched and the date to match against
DateSameDayMatch
Bases: MatchBaseModel
To match values with the same day, one can do this:
date_same_day: str
instance-attribute
The date to match on the day
format: str
instance-attribute
The format of the date, used to parse both the value being matched and the date to match against
DateSameMonthMatch
Bases: MatchBaseModel
To match values with the same month, one can do this:
date_same_month: str
instance-attribute
The date to match on the month
format: str
instance-attribute
The format of the date, used to parse both the value being matched and the date to match against
DateSameYearMatch
Bases: MatchBaseModel
To match values with the same year, one can do this:
date_same_year: str
instance-attribute
The date to match on the year
format: str
instance-attribute
The format of the date, used to parse both the value being matched and the date to match against
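The five date matchers share the same idea: parse both the transaction value and the configured date with the same format, then compare. The following is a minimal sketch of that idea using only the standard library. The function names mirror the configuration keys but are hypothetical helpers, not the importer's API. Treating date_after and date_before as strict comparisons, and same-month as "same year and month", are assumptions.

```python
from datetime import datetime


def _parse(value, fmt):
    # Both the transaction value and the target date use the same format string.
    return datetime.strptime(value, fmt).date()


def date_after(value, target, fmt):
    # Assumption: strictly after (the target day itself does not match).
    return _parse(value, fmt) > _parse(target, fmt)


def date_before(value, target, fmt):
    # Assumption: strictly before.
    return _parse(value, fmt) < _parse(target, fmt)


def date_same_day(value, target, fmt):
    return _parse(value, fmt) == _parse(target, fmt)


def date_same_month(value, target, fmt):
    # Assumption: the year must match as well as the month.
    a, b = _parse(value, fmt), _parse(target, fmt)
    return (a.year, a.month) == (b.year, b.month)


def date_same_year(value, target, fmt):
    return _parse(value, fmt).year == _parse(target, fmt).year


print(date_after("2021-03-15", "2021-01-01", "%Y-%m-%d"))
print(date_same_month("2021-03-15", "2021-03-01", "%Y-%m-%d"))
```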
DeleteTransactionTemplate
Bases: ImportBaseModel
A transaction delete template.
id: str | None = None
class-attribute
instance-attribute
the import-id for deleting
DeletedTransaction
Bases: ImportBaseModel
represents a deleted transaction
ImportDoc
Bases: ImportBaseModel
The import configuration file for beancount-importer-rules.
Examples
# yaml-language-server: $schema=https://raw.githubusercontent.com/zenobi-us/beancount-importer-rules/master/schema.json
inputs:
  - match: "sources/*.csv" # (1)
    config:
      extractor:
        import_path: "extractors.my_extractor:YourExtractorClass" # (2)
        as_name: "custom name for this extractor instance"
        date_format: "%Y-%m-%d"
        datetime_format: "%Y-%m-%d %H:%M:%S"
      default_file: "books/{{ date.year }}.bean" # (3)
      prepend_postings:
        - account: "Assets:Bank"
imports:
  - name: "simple"
    match:
      desc: "Simple Transaction"
    actions:
      - type: "add_txn"
        txn:
          date: "2021-01-01"
          flag: "*"
          narration: "Simple Transaction"
          postings:
            - account: "Expenses:Simple"
              amount:
                number: "{{ amount }}"
                currency: "USD"
- (1) pathname is relative to the workspace root
- (2) import path is relative to the workspace root
- (3) pathname is relative to the workspace root
You can view the schema for more details or refer to the ImportDoc API.
context: dict | None = None
class-attribute
instance-attribute
Context comes in handy when you need to define variables to be referenced in the template.
As you can see in the example, we define a routine_expenses dictionary variable in the context.
context:
  routine_expenses:
    "Amazon Web Services":
      account: Expenses:Engineering:Servers:AWS
    Netlify:
      account: Expenses:Engineering:ServiceSubscription
    Mailchimp:
      account: Expenses:Marketing:ServiceSubscription
    Circleci:
      account: Expenses:Engineering:ServiceSubscription
    Adobe:
      account: Expenses:Design:ServiceSubscription
    "Digital Ocean":
      account: Expenses:Engineering:ServiceSubscription
    Microsoft:
      account: Expenses:Office:Supplies:SoftwareAsService
      narration: "Microsoft 365 Apps for Business Subscription"
    "Mercury IO Cashback":
      account: Expenses:CreditCardCashback
      narration: "Mercury IO Cashback"
    WeWork:
      account: Expenses:Office
      narration: "Virtual mailing address service fee from WeWork"
Then, in the transaction template, we look up the dictionary to find out what narration value to use:
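The lookup itself amounts to indexing the context dictionary by a transaction attribute. The sketch below shows the idea in plain Python rather than in template syntax. The helper name resolve_expense is hypothetical, and both keying the lookup on the description and falling back to it when an entry has no narration are assumptions made for illustration.

```python
# A trimmed copy of the context shown above.
context = {
    "routine_expenses": {
        "Netlify": {"account": "Expenses:Engineering:ServiceSubscription"},
        "Microsoft": {
            "account": "Expenses:Office:Supplies:SoftwareAsService",
            "narration": "Microsoft 365 Apps for Business Subscription",
        },
    }
}


def resolve_expense(desc, context):
    """Look up the account and narration for a transaction description."""
    entry = context["routine_expenses"].get(desc)
    if entry is None:
        return None  # Not a routine expense.
    return {
        "account": entry["account"],
        # Assumption: fall back to the raw description when no narration is configured.
        "narration": entry.get("narration", desc),
    }


print(resolve_expense("Microsoft", context))
print(resolve_expense("Netlify", context))
```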
imports: ImportList
instance-attribute
The import rules
inputs: list[InputConfig]
instance-attribute
The input rules
outputs: list[OutputConfig] | None = None
class-attribute
instance-attribute
The output configuration
ImportList
Bases: RootModel[List[ImportRule | IncludeRule]]
The list of import rules.
Can be a list of ImportRule or IncludeRule
ImportRule
Bases: ImportBaseModel
An import rule to match and process transactions.
The following keys are available for the import configuration:
name
: An optional name for the user to comment on this matching rule. Currently, it has no functional purpose.
match
: The rule for matching raw transactions extracted from the input CSV files, as described in the Import Match Rule Definition.
actions
: Decide what to do with the matched raw transactions, as the Import Action Definition describes.
actions: list[Action]
instance-attribute
The actions to perform
common_cond: TxnMatchRule | None = None
class-attribute
instance-attribute
common condition to meet on top of the match rules
match: TxnMatchRule | list[TxnMatchVars]
instance-attribute
The match rule
name: str | None = None
class-attribute
instance-attribute
Name of the import rule. Not used functionally; for reference only.
IncludeRule
Bases: ImportBaseModel
Include other yaml files that contain lists of ImportRule
include: str | list[str]
instance-attribute
The file path(s) to include
InputConfig
Bases: ImportBaseModel
The input configuration for the import rule.
InputConfigDetails
Bases: ImportBaseModel
The input configuration details for the import rule.
append_postings: list[PostingTemplate] | None = None
class-attribute
instance-attribute
Postings are to be appended to the generated transactions from the matched file. A list of posting templates as described in the Add Transaction Action section.
default_file: str | None = None
class-attribute
instance-attribute
The default output file for generated transactions from the matched file to use if not specified in the add_txn action.
default_txn: TransactionTemplate | None = None
class-attribute
instance-attribute
The default transaction template values to use in the generated transactions from the matched file. Please see the Add Transaction Action section.
extractor: ExractorInputConfig
instance-attribute
A Python import path to the extractor to use for extracting transactions from the matched file.
The format is package.module:extractor_class. For example, beancount_import_rules.extractors.plaid:PlaidExtractor.
Your extractor class should inherit from beancount_import_rules.extractors.ExtractorBase or beancount_import_rules.extractors.ExtractorCsvBase.
prepend_postings: list[PostingTemplate] | None = None
class-attribute
instance-attribute
Postings are to be prepended for the generated transactions from the matched file. A list of posting templates as described in the Add Transaction Action section.
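The relationship between prepend_postings, the postings of the transaction template, and append_postings can be pictured as simple list concatenation. This is an illustrative sketch only: assemble_postings is a hypothetical helper, and the assumption is that prepended postings come first and appended postings come last in the generated transaction.

```python
def assemble_postings(txn_postings, prepend=None, append=None):
    """Combine input-level and transaction-level postings into one ordered list."""
    # None means "not configured", which contributes no postings.
    return list(prepend or []) + list(txn_postings) + list(append or [])


postings = assemble_postings(
    [{"account": "Expenses:Simple"}],
    prepend=[{"account": "Assets:Bank"}],
)
print([p["account"] for p in postings])
```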
MetadataItemTemplate
PostingTemplate
Bases: ImportBaseModel
A posting transform template.
Used to transform the raw transaction into a beancount posting.
account: str | None = None
class-attribute
instance-attribute
The account of the posting.
amount: AmountTemplate | None = None
class-attribute
instance-attribute
The amount of the posting.
cost: str | None = None
class-attribute
instance-attribute
The cost of the posting.
price: AmountTemplate | None = None
class-attribute
instance-attribute
The price of the posting.
SimpleTxnMatchRule
Bases: ImportBaseModel
The raw transactions extracted by the extractor come with many attributes; the ones that can be matched are listed as fields below.
The match object should be a dictionary. The key is the transaction attribute to match, and the value is the regular expression of the target pattern to match. All listed attributes need to match for a transaction to be considered matched.
Only simple matching logic is possible with the current approach. We will extend the matching rule to support more complex matching logic in the future, such as NOT, AND, OR operators.
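The "all listed attributes must match" rule can be sketched as a loop over the match dictionary. The helper txn_matches is hypothetical, the transaction is modelled as a plain dictionary, and using re.match (anchored at the start of the value) is an assumption about how the pattern is applied.

```python
import re


def txn_matches(match, txn):
    """Return True only if every attribute listed in `match` matches the transaction."""
    for attr, pattern in match.items():
        value = txn.get(attr)
        # A missing attribute can never match.
        if value is None or re.match(pattern, value) is None:
            return False
    return True


txn = {"extractor": "mercury", "desc": "Mercury Credit"}
print(txn_matches({"extractor": "mercury", "desc": "Mercury .*"}, txn))
print(txn_matches({"extractor": "mercury", "desc": "Amazon"}, txn))
```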
bank_desc: StrMatch | None = None
class-attribute
instance-attribute
The bank description of the transaction to match.
category: StrMatch | None = None
class-attribute
instance-attribute
The category of the transaction to match.
currency: StrMatch | None = None
class-attribute
instance-attribute
The currency of the transaction to match.
date: StrMatch | None = None
class-attribute
instance-attribute
desc: StrMatch | None = None
class-attribute
instance-attribute
The description of the transaction to match.
Probably the most common field to match.
Examples
dest_account: StrMatch | None = None
class-attribute
instance-attribute
The destination account of the transaction to match.
extractor: StrMatch | None = None
class-attribute
instance-attribute
The extractor to match. This will be produced by the Extractor.get_name() method.
file: StrMatch | None = None
class-attribute
instance-attribute
gl_code: StrMatch | None = None
class-attribute
instance-attribute
The general ledger code of the transaction to match.
last_four_digits: StrMatch | None = None
class-attribute
instance-attribute
The last four digits of the card of the transaction to match.
name_on_card: StrMatch | None = None
class-attribute
instance-attribute
The name on the card of the transaction to match.
note: StrMatch | None = None
class-attribute
instance-attribute
The note of the transaction to match.
payee: StrMatch | None = None
class-attribute
instance-attribute
The payee of the transaction to match.
post_date: StrMatch | None = None
class-attribute
instance-attribute
reference: StrMatch | None = None
class-attribute
instance-attribute
The reference of the transaction to match.
source_account: StrMatch | None = None
class-attribute
instance-attribute
The source account of the transaction to match.
status: StrMatch | None = None
class-attribute
instance-attribute
The status of the transaction to match.
subcategory: StrMatch | None = None
class-attribute
instance-attribute
The subcategory of the transaction to match.
timezone: StrMatch | None = None
class-attribute
instance-attribute
transaction_id: StrMatch | None = None
class-attribute
instance-attribute
The transaction id of the transaction to match.
type: StrMatch | None = None
class-attribute
instance-attribute
The type of the transaction to match.
StrContainsMatch
StrExactMatch
StrOneOfMatch
StrPrefixMatch
StrRegexMatch
Bases: MatchBaseModel
When a simple string value is provided, regular expression matching will be used. The pattern can also be given explicitly with the regex key.
regex: str
instance-attribute
The regular expression that the transaction field must match
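The string match variants used in the examples of this reference (a plain string, regex, equals, prefix and one_of) can be compared side by side with a small dispatcher. It is a sketch under assumptions, not the library's implementation: str_matches is a hypothetical name, only keys that appear in this document are handled, and the regular expression is assumed to be anchored at the start of the value.

```python
import re


def str_matches(rule, value):
    """Evaluate one string match rule against a transaction field value."""
    if isinstance(rule, str):
        # A plain string is treated as a regular expression.
        return re.match(rule, value) is not None
    if "regex" in rule:
        return re.match(rule["regex"], value) is not None
    if "equals" in rule:
        return value == rule["equals"]
    if "prefix" in rule:
        return value.startswith(rule["prefix"])
    if "one_of" in rule:
        return value in rule["one_of"]
    raise ValueError(f"unsupported match rule: {rule!r}")


print(str_matches("Comcast", "Comcast Cable"))
print(str_matches({"prefix": "PGANDE WEB ONLINE "}, "PGANDE WEB ONLINE 1234"))
print(str_matches({"one_of": ["Mercury Credit", "Mercury Checking xx1234"]}, "Mercury Credit"))
```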
StrSuffixMatch
Transaction
dataclass
A transaction object.
TransactionTemplate
Bases: ImportBaseModel
A transaction transform template.
Used to transform the raw transaction into a beancount transaction.
txn:
  date: "2021-01-01"
  flag: "*"
  narration: "Simple Transaction"
  metadata:
    - name: "icon"
      value: "🍔"
  postings:
    - account: "Expenses:Simple"
      amount:
        number: "{{ amount }}"
        currency: "USD"
results in the following beancount transaction:
date: str | None = None
class-attribute
instance-attribute
the date of the transaction
flag: str | None = None
class-attribute
instance-attribute
the flag of the transaction
id: str | None = None
class-attribute
instance-attribute
the import-id for de-duplication
links: list[str] | None = None
class-attribute
instance-attribute
the links of the transaction
metadata: list[MetadataItemTemplate] | None = None
class-attribute
instance-attribute
the metadata of the transaction
narration: str | None = None
class-attribute
instance-attribute
the narration of the transaction
payee: str | None = None
class-attribute
instance-attribute
the payee of the transaction
postings: list[PostingTemplate] | None = None
class-attribute
instance-attribute
the postings of the transaction
tags: list[str] | None = None
class-attribute
instance-attribute
the tags of the transaction
TxnMatchVars
Bases: ImportBaseModel
From time to time, you may find yourself writing similar import-matching rules with similar transaction templates. To avoid repeating yourself, you can write multiple match conditions, each with its own variables for the template, in the same import rule. For example, consider the following two import rules:
imports:
  - name: PG&E Gas
    match:
      extractor:
        equals: "plaid"
      desc:
        prefix: "PGANDE WEB ONLINE "
    actions:
      - txn:
          payee: "{{ payee }}"
          narration: "Paid American Express Blue Cash Everyday"
          postings:
            - account: "Expenses:Util:Gas:PGE"
              amount:
                number: "{{ -amount }}"
                currency: "{{ currency | default('USD', true) }}"
  - name: Comcast
    match:
      extractor:
        equals: "plaid"
      desc: "Comcast"
    actions:
      - txn:
          payee: "{{ payee }}"
          narration: "Comcast"
          postings:
            - account: "Expenses:Util:Internet:Comcast"
              amount:
                number: "{{ -amount }}"
                currency: "{{ currency | default('USD', true) }}"
With match and variables, you can write:
imports:
  - name: Household expenses
    common_cond:
      extractor:
        equals: "plaid"
    match:
      - cond:
          desc:
            prefix: "PGANDE WEB ONLINE "
        vars:
          account: "Expenses:Util:Gas:PGE"
          narration: "Paid American Express Blue Cash Everyday"
      - cond:
          desc: "Comcast"
        vars:
          account: "Expenses:Util:Internet:Comcast"
          narration: "Comcast"
    actions:
      - txn:
          payee: "{{ payee }}"
          narration: "{{ narration }}"
          postings:
            - account: "{{ account }}"
              amount:
                number: "{{ -amount }}"
                currency: "{{ currency | default('USD', true) }}"
The common_cond is the condition to meet for all the matches. Instead of a map, you define the match as a list, with the cond field holding each condition and the vars field holding the corresponding variables.
Please note that the values in vars can also be Jinja2 templates and will be rendered before being fed into the transaction template.
If any original variable from the transaction has the same name as one defined in the vars field, the value from the vars field always overrides it.
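The selection and override behaviour described here can be sketched in a few lines of Python. Everything in it is illustrative: resolve_vars and _cond_ok are hypothetical helpers, conditions are simplified to plain regular-expression strings, the first matching cond is assumed to win, and template rendering of the vars values is left out.

```python
import re


def _cond_ok(cond, txn):
    # Simplification: every condition value is a regular expression string.
    return all(re.match(pattern, str(txn.get(attr, ""))) for attr, pattern in cond.items())


def resolve_vars(txn, common_cond, match_list):
    """Return the template variables for `txn`, or None when nothing matches."""
    if not _cond_ok(common_cond, txn):
        return None
    for entry in match_list:
        if _cond_ok(entry["cond"], txn):
            # Variables from `vars` override same-named transaction attributes.
            return {**txn, **entry.get("vars", {})}
    return None


match_list = [
    {"cond": {"desc": "PGANDE WEB ONLINE "}, "vars": {"account": "Expenses:Util:Gas:PGE"}},
    {"cond": {"desc": "Comcast"}, "vars": {"account": "Expenses:Util:Internet:Comcast", "narration": "Comcast"}},
]
txn = {"extractor": "plaid", "desc": "Comcast", "narration": "raw bank text"}
print(resolve_vars(txn, {"extractor": "plaid"}, match_list))
```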
UnprocessedTransaction
dataclass
Unprocessed transaction.
It represents the transaction extracted from the source file.
appending_postings: list[GeneratedPosting] | None = None
class-attribute
instance-attribute
The generated postings to append to the transaction
import_id: str
instance-attribute
The import id of the transaction
output_file: str | None = None
class-attribute
instance-attribute
The generated output filename if available
prepending_postings: list[GeneratedPosting] | None = None
class-attribute
instance-attribute
The generated postings to prepend to the transaction
txn: Transaction
instance-attribute
The unprocessed transaction
Extractors
These provide the ability to extract data from various sources.
So far there is a convenience base class for CSV files.
ExtractorBase
ExtractorCsvBase
Bases: ExtractorBase
Base class for CSV extractors
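Create a file called extractors/csv.py that subclasses ExtractorCsvBase:
from decimal import Decimal

# ExtractorCsvBase and Transaction are provided by the importer package;
# import them from the module path your installed version exposes.

class MyCsvExtractor(ExtractorCsvBase):
    # Column names expected in the CSV file.
    fields = ["Date", "Description", "Amount", "Currency"]

    def process_line(self, lineno, line, file_path, line_count):
        # Build one Transaction per CSV row.
        return Transaction(
            date=self.parse_date(line["Date"]),
            narration=line["Description"],
            amount=Decimal(line["Amount"]),
            currency=line["Currency"],
        )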
delimiter: str = ','
class-attribute
instance-attribute
The delimiter used in the CSV file
fields: typing.List[str]
instance-attribute
The fields in the CSV file
detect(file_path)
Check if the input file is a CSV file with the expected fields. Should this extractor be used to process the file?
detect_has_header(file_path)
Check if the supplied csv file has a header row.
The file is treated as having a header row if the fieldnames attribute is not None and its values match the first row of the file.
We do this to detect if we need to skip the first row; it seems that the DictReader class does not automatically detect if the file has a header row or not and will return the first row as data if the fieldnames attribute is not set.
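The header check can be reproduced with the standard csv module. The sketch below is an approximation under assumptions, not the class's actual code: has_header and read_rows are hypothetical free functions, and the comparison is an exact, order-sensitive match between the configured fields and the first row.

```python
import csv
import tempfile
from pathlib import Path


def has_header(file_path, fields, delimiter=","):
    """Return True if the first row of the file equals the expected field names."""
    if fields is None:
        return False
    with open(file_path, newline="") as handle:
        first_row = next(csv.reader(handle, delimiter=delimiter), None)
    return first_row == list(fields)


def read_rows(file_path, fields, delimiter=","):
    """Yield data rows as dicts, skipping the header row when one is present."""
    skip_first = has_header(file_path, fields, delimiter)
    with open(file_path, newline="") as handle:
        # With explicit fieldnames, DictReader returns the header row as data.
        reader = csv.DictReader(handle, fieldnames=fields, delimiter=delimiter)
        if skip_first:
            next(reader)
        yield from reader


with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.csv"
    sample.write_text("Date,Description\n2024-01-02,Coffee\n")
    print(has_header(sample, ["Date", "Description"]))
    print(list(read_rows(sample, ["Date", "Description"])))
```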
fingerprint(file_path)
Generate a fingerprint for the CSV file
get_linecount(file_path)
Get the number of lines in a file
parse_date(date_str)
Parse a date string using self.format
parse_datetime(date_str)
Parse a datetime string using self.format
process(file_path)
Process the CSV file and yield transactions.
Loops over the rows in the CSV file and yields a transaction for each row by calling process_line.
process_line(lineno, line, file_path, line_count)
Process a line in the CSV file and return a transaction.
This method should be implemented by subclasses to return a Transaction.
create_extractor_factory(class_name=None, working_dir=Path.cwd())
Manages importing the defined extractor module and returning the extractor