### Install smart_importer

Source: https://github.com/beancount/smart_importer/blob/main/README.rst

Install the smart_importer package using pip. This is the recommended way to get started.

```bash
pip install smart_importer
```

--------------------------------

### Install Jieba for Chinese Tokenization

Source: https://github.com/beancount/smart_importer/blob/main/README.rst

Install the 'jieba' library, a popular Chinese text segmentation tool, using pip. This is a prerequisite for using Chinese tokenization.

```bash
pip install jieba
```

--------------------------------

### Run Unit Tests with Tox

Source: https://github.com/beancount/smart_importer/blob/main/README.rst

Execute the project's unit tests using the tox automation tool. Ensure tox is installed in your environment.

```bash
make test
```

--------------------------------

### Using smart_importer Hooks

Source: https://github.com/beancount/smart_importer/blob/main/README.rst

Example of configuring Beancount importers with multiple smart_importer hooks. This applies both PredictPostings and PredictPayees to all configured importers.

```python
from your_custom_importer import MyBankImporter
from smart_importer import PredictPayees, PredictPostings

CONFIG = [
    MyBankImporter('whatever', 'config', 'is', 'needed'),
]

HOOKS = [
    PredictPostings().hook,
    PredictPayees().hook
]
```

--------------------------------

### Beancount Importer Example

Source: https://github.com/beancount/smart_importer/blob/main/README.rst

A basic structure for a Beancount importer class. This serves as a base for custom importers that can be enhanced by smart_importer.

```python
from beangulp import importer


class MyBankImporter(importer.Importer):
    """My existing importer"""
    # the actual importer logic would be here...
```

--------------------------------

### Wrapping an Importer with smart_importer

Source: https://github.com/beancount/smart_importer/blob/main/README.rst

Example of wrapping a specific importer with smart_importer functionality.
This approach modifies only the targeted importer, unlike using hooks.

```python
from your_custom_importer import MyBankImporter
from smart_importer import PredictPayees, PredictPostings

CONFIG = [
    PredictPostings().wrap(
        PredictPayees().wrap(
            MyBankImporter('whatever', 'config', 'is', 'needed')
        )
    ),
]

HOOKS = [
]
```

--------------------------------

### Quick Start: PredictPostings Hook

Source: https://github.com/beancount/smart_importer/blob/main/README.rst

Example of applying the PredictPostings hook to a custom CSV importer. This hook uses existing entries as training data to predict entry attributes.

```python
from beangulp.importers import csv
from beangulp.importers.csv import Col

from smart_importer import PredictPostings


class MyBankImporter(csv.Importer):
    '''Conventional importer for MyBank'''

    def __init__(self, *, account):
        super().__init__(
            {Col.DATE: 'Date',
             Col.PAYEE: 'Transaction Details',
             Col.AMOUNT_DEBIT: 'Funds Out',
             Col.AMOUNT_CREDIT: 'Funds In'},
            account,
            'EUR',
            (
                'Date, Transaction Details, Funds Out, Funds In'
            )
        )


CONFIG = [
    MyBankImporter(account='Assets:MyBank:MyAccount'),
]

HOOKS = [
    PredictPostings().hook
]
```

--------------------------------

### Configure EntryPredictor Behavior with Custom Tokenizer

Source: https://context7.com/beancount/smart_importer/llms.txt

Demonstrates configuring the base EntryPredictor class with options like a custom string tokenizer for non-Latin scripts. Use jieba for Chinese text.
```python
import jieba
from smart_importer import PredictPostings

# Example of configuring PredictPostings with a custom tokenizer.
# jieba.cut returns a generator, so it is wrapped in list() here,
# matching the README example.
predictor = PredictPostings(
    string_tokenizer=lambda s: list(jieba.cut(s)),
    denylist_accounts=["Liabilities:CreditCard"]
)

# This predictor can then be used in HOOKS or wrap()
# HOOKS = [predictor.hook]
```

--------------------------------

### Configure PredictPostings with Chinese Tokenizer

Source: https://context7.com/beancount/smart_importer/llms.txt

Initialize PredictPostings with a custom jieba tokenizer for Chinese narrations. Ensure string_token_pattern is None when using a custom tokenizer.

```python
import jieba
from smart_importer import PredictPostings

jieba.initialize()
tokenizer = lambda s: list(jieba.cut(s))

predictor = PredictPostings(
    predict=True,                 # default: produce predictions
    overwrite=False,              # default: skip if attribute already set
    string_tokenizer=tokenizer,   # jieba tokenizer for Chinese narrations
    string_token_pattern=None,    # must be None when a custom tokenizer is used
    denylist_accounts=[
        "Expenses:Reimbursable",
        "Expenses:Denylisted",
    ],
)

# MyChineseBankImporter is assumed to be defined elsewhere in your import config
CONFIG = [
    predictor.wrap(MyChineseBankImporter(account="Assets:CN:ICBC:Checking")),
]
```

--------------------------------

### Configure Logging for Smart Importer

Source: https://context7.com/beancount/smart_importer/llms.txt

Configure the `smart_importer` logger to `DEBUG` level to view detailed internal information, including training data sizes, model fitting status, and prediction specifics for each transaction.

```python
import logging

# Show all smart_importer internals
logging.getLogger("smart_importer").setLevel(logging.DEBUG)

# Typical output during bean-extract:
# DEBUG:smart_importer.predictor:Loaded training data with 42 transactions, ...
# DEBUG:smart_importer.predictor:Trained the machine learning model.
# DEBUG:smart_importer.predictor:Apply predictions with pipeline
# DEBUG:smart_importer.predictor:Added predictions to 12 transactions
```

--------------------------------

### EntryPredictor.hook

Source: https://context7.com/beancount/smart_importer/llms.txt

The hook method is the entry point for training and applying prediction models. It accepts a list of imported transactions and existing Beancount entries, trains models based on the provided data, and returns the imported transactions with predicted attributes applied.

```APIDOC
## EntryPredictor.hook: beangulp hook entry point

`hook` is the method registered in `HOOKS`. It accepts the full list of `(filename, entries, account, importer)` tuples produced by all importers along with the complete list of existing Beancount directives, which serve as training data. It trains one model per importer result (filtered to that importer's account) and returns the same tuple list with predicted attributes applied.

```python
from beancount.core.data import Transaction
from beancount.parser import parser
from smart_importer import PredictPostings

# Simulate what beangulp calls internally
training_raw, _, __ = parser.parse_string("""
2016-01-01 open Assets:US:BofA:Checking USD
2016-01-01 open Expenses:Food:Groceries USD

2016-01-06 * "Farmer Fresh" "Buying groceries"
  Assets:US:BofA:Checking  -2.50 USD
  Expenses:Food:Groceries

2016-01-07 * "Gimme Coffee" "Coffee"
  Assets:US:BofA:Checking  -4.00 USD
  Expenses:Food:Coffee
""")

imported_raw, _, __ = parser.parse_string("""
2024-03-01 * "Farmer Fresh" "Weekly shop"
  Assets:US:BofA:Checking  -38.00 USD
""")

# Keep only the transactions from the parsed directives
imported_txns = [t for t in imported_raw if isinstance(t, Transaction)]

predictor = PredictPostings()
results = predictor.hook(
    [("statement.csv", imported_txns, "Assets:US:BofA:Checking", None)],
    existing_entries=training_raw,
)

for filename, entries, account, _importer in results:
    for entry in entries:
        if isinstance(entry, Transaction):
            print(entry.postings)
# Output: posting with
# account predicted as "Expenses:Food:Groceries"
```
```

--------------------------------

### Use Jieba Tokenizer with PredictPostings

Source: https://github.com/beancount/smart_importer/blob/main/README.rst

Integrate the 'jieba' tokenizer with the PredictPostings class for processing Chinese text. Initialize jieba and define a tokenizer function.

```python
from smart_importer import PredictPostings
import jieba

jieba.initialize()
tokenizer = lambda s: list(jieba.cut(s))

predictor = PredictPostings(string_tokenizer=tokenizer)
```

--------------------------------

### Configure Smart Importer Logging

Source: https://github.com/beancount/smart_importer/blob/main/README.rst

Set the logging level for the 'smart_importer' module to DEBUG. This is useful for detailed debugging.

```python
import logging

logging.getLogger('smart_importer').setLevel(logging.DEBUG)
```

--------------------------------

### Logging Configuration

Source: https://context7.com/beancount/smart_importer/llms.txt

Configures the logging level for the `smart_importer` module. Setting the level to `DEBUG` provides detailed internal information.

```APIDOC
## Logging Configuration

### Description
`smart_importer` uses Python's standard `logging` module under the `smart_importer` logger name. Setting the log level to `DEBUG` reveals training data sizes, pipeline fit status, and per-transaction prediction details.

### Usage
To enable detailed logging, set the level of the `smart_importer` logger to `DEBUG`.

### Example
```python
import logging

# Show all smart_importer internals
logging.getLogger("smart_importer").setLevel(logging.DEBUG)

# Typical output during bean-extract:
# DEBUG:smart_importer.predictor:Loaded training data with 42 transactions, ...
# DEBUG:smart_importer.predictor:Trained the machine learning model.
# DEBUG:smart_importer.predictor:Apply predictions with pipeline
# DEBUG:smart_importer.predictor:Added predictions to 12 transactions
```
```

--------------------------------

### EntryPredictor.wrap: Enhance a Single Importer

Source: https://context7.com/beancount/smart_importer/llms.txt

Wrap an existing Importer instance with PredictPostings to automatically apply ML predictions during extraction. This is useful when only one importer needs prediction capabilities.

```python
from smart_importer import PredictPostings, PredictPayees

# MyBankImporter and my_ledger_entries are assumed to be defined elsewhere.
# Wrap an existing importer; the original class is untouched
smart = PredictPostings(denylist_accounts=["Expenses:Reimbursable"]).wrap(
    MyBankImporter(account="Assets:MyBank:Checking")
)

# smart.identify(), smart.account(), etc. delegate to MyBankImporter
# smart.extract(filepath, existing) runs predictions automatically
entries = smart.extract("statement.csv", my_ledger_entries)
# entries now contain predicted second postings
```

--------------------------------

### Predict Postings with Standard Beancount Hook

Source: https://context7.com/beancount/smart_importer/llms.txt

Applies PredictPostings to all importers using a standard beangulp hook. Trains on existing entries and predicts the second posting account.
```python
from beangulp.importers import csv
from beangulp.importers.csv import Col

from smart_importer import PredictPostings


# --- Standard beangulp hook (applies to all importers) ---
class MyBankImporter(csv.Importer):
    def __init__(self, *, account):
        super().__init__(
            {
                Col.DATE: "Date",
                Col.PAYEE: "Transaction Details",
                Col.AMOUNT_DEBIT: "Funds Out",
                Col.AMOUNT_CREDIT: "Funds In",
            },
            account,
            "EUR",
            "Date, Transaction Details, Funds Out, Funds In",
        )


CONFIG = [
    MyBankImporter(account="Assets:MyBank:Checking"),
]

HOOKS = [
    PredictPostings().hook,  # trains on existing_entries, predicts second posting
]

# --- Per-importer wrap (only this importer gets predictions) ---
# Alternative to the hook above; PredictPostings is already imported.
CONFIG = [
    PredictPostings().wrap(
        MyBankImporter(account="Assets:MyBank:Checking")
    ),
]

# Run bean-extract with training data:
#   bean-extract import.py statement.csv -e my_ledger.beancount
#
# Expected effect: imported transactions go from
#   2024-03-01 * "Starbucks" "Coffee"
#     Assets:MyBank:Checking  -4.50 EUR
# to
#   2024-03-01 * "Starbucks" "Coffee"
#     Assets:MyBank:Checking  -4.50 EUR
#     Expenses:Food:Coffee
```

--------------------------------

### EntryPredictor Configuration

Source: https://context7.com/beancount/smart_importer/llms.txt

EntryPredictor is the base class for both PredictPostings and PredictPayees, offering configuration options for prediction behavior.

```APIDOC
## EntryPredictor Configuration

### Description
Base class for prediction importers, providing configuration for prediction behavior.

### Constructor Parameters
- `predict` (bool): Whether to perform predictions. Defaults to `True`.
- `overwrite` (bool): Whether to overwrite existing attributes with predictions. Defaults to `False`.
- `string_tokenizer` (callable): A custom string tokenizer for non-Latin scripts (e.g., `jieba`).
- `string_token_pattern` (str): A regex pattern for tokenizing strings.
- `denylist_accounts` (list[str]): A list of accounts whose transactions should be excluded from training.
```

--------------------------------

### Build Scikit-learn Pipeline for Transaction Attribute

Source: https://context7.com/beancount/smart_importer/llms.txt

The `get_pipeline` factory creates scikit-learn pipelines for transaction attributes. It uses `NumericTxnAttribute` for numeric attributes prefixed with `date.` and `AttrGetter` with `StringVectorizer` for others. Custom tokenizers can be provided.

```python
import jieba
from smart_importer.pipelines import get_pipeline
from beancount.parser import parser
from beancount.core.data import Transaction

jieba.initialize()
tokenizer = lambda s: list(jieba.cut(s))

# String attribute pipeline with custom tokenizer
narration_pipeline = get_pipeline(
    "narration",
    tokenizer=tokenizer,
    token_pattern=None,
)

# Numeric attribute pipeline (no tokenizer needed)
date_pipeline = get_pipeline("date.day", tokenizer=None)

raw, _, __ = parser.parse_string("""
2024-03-01 * "Farmer Fresh" "购买杂货"
  Assets:CN:Checking  -50 CNY
""")
txns = [t for t in raw if isinstance(t, Transaction)]

features = narration_pipeline.fit_transform(txns)
print(features.shape)  # (1, n_features); Chinese tokens extracted via jieba
```

--------------------------------

### Predict Payee with Beancount Hooks

Source: https://context7.com/beancount/smart_importer/llms.txt

Applies both PredictPostings and PredictPayees as independent beangulp hooks. PredictPayees fills the payee field when empty, unless overwrite=True is set.
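
To make "independent hooks" more concrete, here is a minimal standard-library sketch. It assumes (this is not taken from the source) that the import framework calls each entry of `HOOKS` in order and feeds each hook the previous hook's output. The `fill_payee` and `add_second_posting` functions and the dict-based entries are hypothetical stand-ins, not smart_importer APIs; only the `(filename, entries, account, importer)` tuple shape comes from the hook description in this document.

```python
def add_second_posting(extracted, existing_entries):
    """Hypothetical hook: append a placeholder second posting to every entry."""
    result = []
    for filename, entries, account, importer in extracted:
        updated = [
            dict(e, postings=e["postings"] + ["Expenses:Food:Coffee"])
            for e in entries
        ]
        result.append((filename, updated, account, importer))
    return result


def fill_payee(extracted, existing_entries):
    """Hypothetical hook: fill the payee only when it is empty."""
    result = []
    for filename, entries, account, importer in extracted:
        updated = [dict(e, payee=e["payee"] or "Starbucks") for e in entries]
        result.append((filename, updated, account, importer))
    return result


HOOKS = [add_second_posting, fill_payee]

# One importer result in the (filename, entries, account, importer) shape.
extracted = [(
    "statement.csv",
    [{"payee": "", "narration": "Coffee", "postings": ["Assets:MyBank:Checking"]}],
    "Assets:MyBank:Checking",
    None,
)]

# Assumed driver loop: each hook receives the previous hook's output.
for hook in HOOKS:
    extracted = hook(extracted, [])

first_entry = extracted[0][1][0]
print(first_entry["payee"], first_entry["postings"])
```

Because each hook only touches its own attribute, the order of the two hooks does not change the result in this sketch.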
```python
from smart_importer import PredictPayees, PredictPostings

# MyBankImporter is assumed to be defined elsewhere in your import config.
# Apply both predictors as beangulp hooks; each runs independently
CONFIG = [
    MyBankImporter(account="Assets:MyBank:Checking"),
]

HOOKS = [
    PredictPostings().hook,
    PredictPayees().hook,
]

# Stacking via wrap() applies both predictors to a single importer
CONFIG = [
    PredictPostings().wrap(
        PredictPayees().wrap(
            MyBankImporter(account="Assets:MyBank:Checking")
        )
    ),
]

# Result: a transaction with narration "Coffee" and no payee becomes
#   2024-03-01 * "Starbucks" "Coffee"
#     Assets:MyBank:Checking  -4.50 EUR
#     Expenses:Food:Coffee
```

--------------------------------

### EntryPredictor.wrap

Source: https://context7.com/beancount/smart_importer/llms.txt

The wrap method allows you to wrap an existing Beancount importer with prediction capabilities. It returns an ImporterWrapper that delegates most calls to the original importer but intercepts the `extract` method to apply predictions.

```APIDOC
## EntryPredictor.wrap: wrap a single importer

`wrap` takes an existing `beangulp.importer.Importer` instance and returns an `ImporterWrapper` that transparently delegates all importer methods (`identify`, `account`, `date`, `filename`, `deduplicate`, `sort`) to the original while intercepting `extract` to run the predictor. This is preferred when only one importer should receive ML predictions.

```python
from smart_importer import PredictPostings, PredictPayees

# MyBankImporter and my_ledger_entries are assumed to be defined elsewhere.
# Wrap an existing importer; the original class is untouched
smart = PredictPostings(denylist_accounts=["Expenses:Reimbursable"]).wrap(
    MyBankImporter(account="Assets:MyBank:Checking")
)

# smart.identify(), smart.account(), etc.
# delegate to MyBankImporter
# smart.extract(filepath, existing) runs predictions automatically
entries = smart.extract("statement.csv", my_ledger_entries)
# entries now contain predicted second postings
```
```

--------------------------------

### EntryPredictor.hook: Train and Predict with Beancount Data

Source: https://context7.com/beancount/smart_importer/llms.txt

The hook method trains a model using existing Beancount entries and applies predictions to imported transactions. It's the entry point for beangulp hooks.

```python
from beancount.core.data import Transaction
from beancount.parser import parser
from smart_importer import PredictPostings

# Simulate what beangulp calls internally
training_raw, _, __ = parser.parse_string("""
2016-01-01 open Assets:US:BofA:Checking USD
2016-01-01 open Expenses:Food:Groceries USD

2016-01-06 * "Farmer Fresh" "Buying groceries"
  Assets:US:BofA:Checking  -2.50 USD
  Expenses:Food:Groceries

2016-01-07 * "Gimme Coffee" "Coffee"
  Assets:US:BofA:Checking  -4.00 USD
  Expenses:Food:Coffee
""")

imported_raw, _, __ = parser.parse_string("""
2024-03-01 * "Farmer Fresh" "Weekly shop"
  Assets:US:BofA:Checking  -38.00 USD
""")

# Keep only the transactions from the parsed directives
imported_txns = [t for t in imported_raw if isinstance(t, Transaction)]

predictor = PredictPostings()
results = predictor.hook(
    [("statement.csv", imported_txns, "Assets:US:BofA:Checking", None)],
    existing_entries=training_raw,
)

for filename, entries, account, _importer in results:
    for entry in entries:
        if isinstance(entry, Transaction):
            print(entry.postings)
# Output: posting with account predicted as "Expenses:Food:Groceries"
```

--------------------------------

### ImporterWrapper: Transparent Importer Proxy

Source: https://context7.com/beancount/smart_importer/llms.txt

ImporterWrapper subclasses beangulp.importer.Importer and delegates most methods to the original importer. Its extract method intercepts results for ML prediction.
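
The delegation pattern described here can be illustrated without any of the real libraries. The following is a minimal sketch under stated assumptions: `FakeImporter`, `WrapperSketch`, and `add_placeholder_posting` are hypothetical names invented for illustration, entries are plain dicts, and the real `ImporterWrapper` may differ in detail.

```python
class FakeImporter:
    """Hypothetical importer with a beangulp-like interface (illustration only)."""

    name = "fake-bank"

    def identify(self, filepath):
        return filepath.endswith(".csv")

    def account(self, filepath):
        return "Assets:MyBank:Checking"

    def extract(self, filepath, existing):
        return [{"narration": "Coffee", "postings": ["Assets:MyBank:Checking"]}]


class WrapperSketch:
    """Forwards calls to the wrapped importer and post-processes only extract()."""

    def __init__(self, importer, postprocess):
        self.importer = importer
        self.postprocess = postprocess

    @property
    def name(self):
        # Expose the wrapped importer's name unchanged.
        return self.importer.name

    def identify(self, filepath):
        return self.importer.identify(filepath)

    def account(self, filepath):
        return self.importer.account(filepath)

    def extract(self, filepath, existing):
        # Call the inner importer first, then run the post-processing step.
        entries = self.importer.extract(filepath, existing)
        return self.postprocess(entries, existing)


def add_placeholder_posting(entries, existing):
    # Stand-in for a predictor: appends a fixed second posting to each entry.
    return [dict(e, postings=e["postings"] + ["Expenses:Unknown"]) for e in entries]


wrapper = WrapperSketch(FakeImporter(), add_placeholder_posting)
print(wrapper.name)
print(wrapper.identify("statement.csv"))
print(wrapper.extract("statement.csv", []))
```

Only `extract` behaves differently from the wrapped object, which is why such a wrapper can be dropped into `CONFIG` in place of the original importer.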
```python
from smart_importer.wrapper import ImporterWrapper
from smart_importer import PredictPostings

# MyBankImporter is assumed to be defined elsewhere in your import config
predictor = PredictPostings()
original = MyBankImporter(account="Assets:MyBank:Checking")
wrapper = ImporterWrapper(original, predictor)

# All delegation methods work as expected
print(wrapper.name)                       # same as original.name
print(wrapper.identify("statement.csv"))  # delegates to original
print(wrapper.account("statement.csv"))   # delegates to original

# extract() is where the ML magic happens
entries = wrapper.extract("statement.csv", [])
```

--------------------------------

### get_pipeline

Source: https://context7.com/beancount/smart_importer/llms.txt

Creates a scikit-learn pipeline for a given transaction attribute. Handles numeric attributes with a specific prefix and treats others as strings, applying appropriate vectorization.

```APIDOC
## get_pipeline

### Description
Creates a scikit-learn pipeline for a given transaction attribute. Numeric attributes prefixed with `date.` (e.g., `date.day`) are handled by `NumericTxnAttribute`; all other attributes are treated as strings and processed through `AttrGetter` → `StringVectorizer` (a `CountVectorizer` with n-gram range 1–3 and empty-data safety). A custom tokenizer or token pattern can be supplied.

### Method Signature
`get_pipeline(attribute_name: str, tokenizer: callable = None, token_pattern: str = None) -> sklearn.pipeline.Pipeline`

### Parameters
- **attribute_name** (str) - The name of the transaction attribute to create a pipeline for (e.g., "narration", "date.day").
- **tokenizer** (callable, optional) - A custom tokenizer function to use for string attributes. Defaults to `None`.
- **token_pattern** (str, optional) - A custom regex token pattern for the `CountVectorizer`. Defaults to `None`.

### Returns
- sklearn.pipeline.Pipeline - A scikit-learn pipeline object configured for the specified attribute.
### Example
```python
import jieba
from smart_importer.pipelines import get_pipeline
from beancount.parser import parser
from beancount.core.data import Transaction

jieba.initialize()
tokenizer = lambda s: list(jieba.cut(s))

# String attribute pipeline with custom tokenizer
narration_pipeline = get_pipeline(
    "narration",
    tokenizer=tokenizer,
    token_pattern=None,
)

# Numeric attribute pipeline (no tokenizer needed)
date_pipeline = get_pipeline("date.day", tokenizer=None)

raw, _, __ = parser.parse_string("""
2024-03-01 * "Farmer Fresh" "购买杂货"
  Assets:CN:Checking  -50 CNY
""")
txns = [t for t in raw if isinstance(t, Transaction)]

features = narration_pipeline.fit_transform(txns)
print(features.shape)  # (1, n_features); Chinese tokens extracted via jieba
```
```

--------------------------------

### ImporterWrapper

Source: https://context7.com/beancount/smart_importer/llms.txt

ImporterWrapper is a proxy class that subclasses `beangulp.importer.Importer`. It forwards all calls to the wrapped importer, except for the `extract` method, which is enhanced to run predictions before returning the entries. It also exposes the original importer's name.

```APIDOC
## ImporterWrapper: transparent importer proxy

`ImporterWrapper` is the object returned by `wrap()`. It subclasses `beangulp.importer.Importer` and proxies every interface method to the wrapped importer. The `extract` method calls the inner importer first, then passes the results through `EntryPredictor.hook` before returning. The wrapper also exposes the original importer's `name` property so Fava and `bean-extract` display the correct importer identity.
```python
from smart_importer.wrapper import ImporterWrapper
from smart_importer import PredictPostings

# MyBankImporter is assumed to be defined elsewhere in your import config
predictor = PredictPostings()
original = MyBankImporter(account="Assets:MyBank:Checking")
wrapper = ImporterWrapper(original, predictor)

# All delegation methods work as expected
print(wrapper.name)                       # same as original.name
print(wrapper.identify("statement.csv"))  # delegates to original
print(wrapper.account("statement.csv"))   # delegates to original

# extract() is where the ML magic happens
entries = wrapper.extract("statement.csv", [])
```
```

--------------------------------

### update_postings

Source: https://context7.com/beancount/smart_importer/llms.txt

Rebuilds a transaction's posting list from a predicted list of account names. If the transaction already has more than one posting, it is returned unchanged. The original posting object (with its amount) is placed at the position matching its account name in the predicted list, or appended at the end if not found.

```APIDOC
## update_postings

### Description
Rebuilds a transaction's posting list from a predicted list of account names. If the transaction already has more than one posting, it is returned unchanged. The original posting object (with its amount) is placed at the position matching its account name in the predicted list, or appended at the end if not found.

### Method Signature
`update_postings(transaction: Transaction, predicted_accounts: list[str]) -> Transaction`

### Parameters
- **transaction** (Transaction) - The Beancount transaction object to update.
- **predicted_accounts** (list[str]) - A list of account names in the desired order.

### Returns
- Transaction - The updated transaction object with modified postings, or the original transaction if it had more than one posting initially.
### Example
```python
from beancount.parser import parser
from beancount.core.data import Transaction
from smart_importer.entries import update_postings

raw, _, __ = parser.parse_string("""
2024-03-01 * "Starbucks" "Coffee"
  Assets:Checking  -4.50 USD
""")
txn = next(t for t in raw if isinstance(t, Transaction))

updated = update_postings(txn, ["Assets:Checking", "Expenses:Food:Coffee"])
print([(p.account, p.units) for p in updated.postings])
# [('Assets:Checking', Amount(-4.50, 'USD')), ('Expenses:Food:Coffee', None)]

# Transaction with 2+ postings is returned untouched
multi_raw, _, __ = parser.parse_string("""
2024-03-01 * "Transfer" ""
  Assets:Checking  -100 USD
  Assets:Savings    100 USD
""")
multi_txn = next(t for t in multi_raw if isinstance(t, Transaction))
assert update_postings(multi_txn, ["Assets:Other"]) is multi_txn
```
```

--------------------------------

### Update Postings on a Single-Legged Transaction

Source: https://context7.com/beancount/smart_importer/llms.txt

Use `update_postings` to rebuild a transaction's posting list from predicted account names. The function only modifies transactions with a single posting; multi-posting transactions are returned unchanged. Original postings are preserved with their amounts.
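
The placement rule (original posting kept at the position of its own account, otherwise appended) can be sketched with the standard library alone. This is an illustration under assumptions, not the library's implementation: `Posting`, `Txn`, and `update_postings_sketch` are hypothetical stand-ins, amounts are plain strings, and the handling of a transaction with zero postings is a guess.

```python
from collections import namedtuple

# Hypothetical stand-ins for beancount's Posting and Transaction (illustration only).
Posting = namedtuple("Posting", ["account", "units"])
Txn = namedtuple("Txn", ["narration", "postings"])


def update_postings_sketch(txn, predicted_accounts):
    """Rebuild the postings of a single-posting transaction from predicted accounts."""
    if len(txn.postings) != 1:
        # Only single-legged transactions are rewritten (zero postings: assumption).
        return txn
    original = txn.postings[0]
    rebuilt = []
    placed = False
    for account in predicted_accounts:
        if account == original.account and not placed:
            rebuilt.append(original)  # keep the original posting and its amount
            placed = True
        else:
            rebuilt.append(Posting(account, None))  # predicted leg, amount left empty
    if not placed:
        rebuilt.append(original)  # original account was not predicted: append it
    return txn._replace(postings=rebuilt)


txn = Txn("Coffee", [Posting("Assets:Checking", "-4.50 USD")])

# The original posting lands where its account appears in the prediction.
updated = update_postings_sketch(txn, ["Expenses:Food:Coffee", "Assets:Checking"])
print([p.account for p in updated.postings])

# If the prediction omits the original account, the posting is appended.
appended = update_postings_sketch(txn, ["Expenses:Food:Coffee"])
print([p.account for p in appended.postings])

# Transactions with two or more postings come back unchanged.
multi = Txn("Transfer", [Posting("Assets:Checking", "-100 USD"),
                         Posting("Assets:Savings", "100 USD")])
unchanged = update_postings_sketch(multi, ["Assets:Other"])
```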
```python
from beancount.parser import parser
from beancount.core.data import Transaction
from smart_importer.entries import update_postings

raw, _, __ = parser.parse_string("""
2024-03-01 * "Starbucks" "Coffee"
  Assets:Checking  -4.50 USD
""")
txn = next(t for t in raw if isinstance(t, Transaction))

updated = update_postings(txn, ["Assets:Checking", "Expenses:Food:Coffee"])
print([(p.account, p.units) for p in updated.postings])
# [('Assets:Checking', Amount(-4.50, 'USD')), ('Expenses:Food:Coffee', None)]

# Transaction with 2+ postings is returned untouched
multi_raw, _, __ = parser.parse_string("""
2024-03-01 * "Transfer" ""
  Assets:Checking  -100 USD
  Assets:Savings    100 USD
""")
multi_txn = next(t for t in multi_raw if isinstance(t, Transaction))
assert update_postings(multi_txn, ["Assets:Other"]) is multi_txn
```

--------------------------------

### PredictPostings - Predict Second Posting Account

Source: https://context7.com/beancount/smart_importer/llms.txt

PredictPostings is an EntryPredictor subclass that predicts and adds the second posting account for single-legged transactions. It trains on historical two-legged transactions and modifies only those entries with exactly one posting.

```APIDOC
## PredictPostings

### Description
Predicts the second posting account for single-legged transactions.

### Method
Instantiate `PredictPostings` and use its `.hook` or `.wrap()` method.

### Usage
**As a beangulp hook (applies to all importers):**
```python
from smart_importer import PredictPostings

HOOKS = [
    PredictPostings().hook,  # trains on existing_entries, predicts second posting
]
```

**As a per-importer wrapper (only this importer gets predictions):**
```python
from smart_importer import PredictPostings
from beangulp.importers.csv import Importer as CSVImporter  # Assuming CSVImporter

CONFIG = [
    PredictPostings().wrap(
        CSVImporter(...)
    ),
]
```

### Example Transformation
```
Before:
2024-03-01 * "Starbucks" "Coffee"
  Assets:MyBank:Checking  -4.50 EUR

After:
2024-03-01 * "Starbucks" "Coffee"
  Assets:MyBank:Checking  -4.50 EUR
  Expenses:Food:Coffee
```
```

--------------------------------

### Apply Predicted Scalar Attribute to Transaction

Source: https://context7.com/beancount/smart_importer/llms.txt

Use `set_entry_attribute` for immutable updates to a transaction's single attribute, like payee. By default, it preserves existing truthy values unless `overwrite=True` is specified.

```python
from beancount.parser import parser
from beancount.core.data import Transaction
from smart_importer.entries import set_entry_attribute

raw, _, __ = parser.parse_string("""
2024-03-01 * "" "Coffee"
  Assets:Checking  -4.50 USD
""")
txn = next(t for t in raw if isinstance(t, Transaction))

# Set payee when it is empty
updated = set_entry_attribute(txn, "payee", "Starbucks")
print(updated.payee)  # "Starbucks"

# Existing value is preserved when overwrite=False
with_payee = set_entry_attribute(txn, "payee", "Starbucks")
not_overwritten = set_entry_attribute(with_payee, "payee", "Other", overwrite=False)
print(not_overwritten.payee)  # "Starbucks"

# Force overwrite
overwritten = set_entry_attribute(with_payee, "payee", "Other", overwrite=True)
print(overwritten.payee)  # "Other"
```

--------------------------------

### PredictPayees - Predict Payee Field

Source: https://context7.com/beancount/smart_importer/llms.txt

PredictPayees is a companion predictor that fills in the `payee` field for transactions. Predictions are only written when the payee is currently empty, unless `overwrite=True` is specified.

```APIDOC
## PredictPayees

### Description
Predicts and fills in the `payee` field for transactions.

### Method
Instantiate `PredictPayees` and use its `.hook` or `.wrap()` method.
### Usage
**As beangulp hooks (each runs independently):**
```python
from smart_importer import PredictPayees, PredictPostings

HOOKS = [
    PredictPostings().hook,
    PredictPayees().hook,
]
```

**Stacking via `wrap()`:**
```python
from smart_importer import PredictPayees, PredictPostings
from beangulp.importers.csv import Importer as CSVImporter  # Assuming CSVImporter

CONFIG = [
    PredictPostings().wrap(
        PredictPayees().wrap(
            CSVImporter(...)
        )
    ),
]
```

### Configuration Options
- `overwrite` (bool): If `True`, predictions will overwrite existing payee values. Defaults to `False`.
```

--------------------------------

### set_entry_attribute

Source: https://context7.com/beancount/smart_importer/llms.txt

Applies a predicted scalar attribute to a transaction using an immutable update. If the attribute already has a truthy value and `overwrite` is `False` (default), the entry is returned unchanged.

```APIDOC
## set_entry_attribute

### Description
Applies a predicted scalar attribute to a transaction using an immutable update (`_replace`) on a `Transaction` namedtuple to set a single attribute (e.g., `payee`). If the attribute already has a truthy value and `overwrite=False` (the default), the entry is returned unchanged. This is the primitive used by `PredictPayees.apply_prediction`.

### Method Signature
`set_entry_attribute(entry: Transaction, attribute_name: str, value: any, overwrite: bool = False) -> Transaction`

### Parameters
- **entry** (Transaction) - The Beancount transaction object to update.
- **attribute_name** (str) - The name of the attribute to set (e.g., "payee", "narration").
- **value** (any) - The new value for the attribute.
- **overwrite** (bool, optional) - If `True`, forces the attribute to be updated even if it already has a truthy value. Defaults to `False`.

### Returns
- Transaction - A new transaction object with the specified attribute updated, or the original transaction if the attribute was not overwritten.
### Example
```python
from beancount.parser import parser
from beancount.core.data import Transaction
from smart_importer.entries import set_entry_attribute

raw, _, __ = parser.parse_string("""
2024-03-01 * "" "Coffee"
  Assets:Checking  -4.50 USD
""")
txn = next(t for t in raw if isinstance(t, Transaction))

# Set payee when it is empty
updated = set_entry_attribute(txn, "payee", "Starbucks")
print(updated.payee)  # "Starbucks"

# Existing value is preserved when overwrite=False
with_payee = set_entry_attribute(txn, "payee", "Starbucks")
not_overwritten = set_entry_attribute(with_payee, "payee", "Other", overwrite=False)
print(not_overwritten.payee)  # "Starbucks"

# Force overwrite
overwritten = set_entry_attribute(with_payee, "payee", "Other", overwrite=True)
print(overwritten.payee)  # "Other"
```
```
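
--------------------------------

### Sketch: Immutable Attribute Update with `_replace`

The `set_entry_attribute` behaviour described above (an immutable `_replace` update that keeps an existing truthy value unless `overwrite=True`) can be illustrated without beancount installed. This is a minimal sketch under assumptions: `Txn` and `set_attribute_sketch` are hypothetical stand-ins, not the library's own code.

```python
from collections import namedtuple

# Hypothetical stand-in for beancount's Transaction namedtuple (illustration only).
Txn = namedtuple("Txn", ["date", "payee", "narration"])


def set_attribute_sketch(entry, attribute_name, value, overwrite=False):
    """Return a copy of `entry` with one field set, keeping truthy values by default."""
    current = getattr(entry, attribute_name)
    if current and not overwrite:
        # An existing truthy value wins unless overwrite is requested.
        return entry
    # namedtuples are immutable, so _replace builds a new tuple.
    return entry._replace(**{attribute_name: value})


empty = Txn("2024-03-01", "", "Coffee")
filled = set_attribute_sketch(empty, "payee", "Starbucks")
kept = set_attribute_sketch(filled, "payee", "Other")
forced = set_attribute_sketch(filled, "payee", "Other", overwrite=True)

print(filled.payee)  # Starbucks
print(kept.payee)    # Starbucks
print(forced.payee)  # Other
```

Returning the same object when nothing changes lets callers check with `is` whether a prediction was actually applied, and the original `empty` tuple is never mutated.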