### Project Setup and Execution

Source: https://github.com/emmimal/context-window-engine/blob/main/README.md

Provides commands for cloning the repository, setting up the data, and running various demonstrations and benchmarks. Includes options for full benchmarks, single query tests, and output redirection.

```bash
git clone https://github.com/Emmimal/context-window-engine
cd context-window-engine

# Place your CSV at:
#   data/credit_card_transactions.csv

python demo.py                                    # problem + solution, fast
python context_window_engine.py                   # full benchmark, 100k rows
python context_window_engine.py --full            # full benchmark, 1.29M rows
python context_window_engine.py --query 0         # single query
python context_window_engine.py --sample-context  # show raw LLM input
python context_window_engine.py --output out.txt  # save results to file

python query_router.py                            # router demo, classify-only
python query_router.py --csv data/credit_card_transactions.csv  # with live answers

python -m unittest test_engine -v                 # 87 tests
python -m unittest test_router -v                 # 72 tests
```

--------------------------------

### Query Routing Example

Source: https://github.com/emmimal/context-window-engine/blob/main/README.md

Illustrates how the QueryRouter classifies incoming queries, directing computation tasks to the SemanticEngine and retrieval tasks to RAG. This prevents aggregation errors by ensuring such queries never reach the RAG system.

```text
Query: "What is the total spend by category?"
  → COMPUTATION → SemanticEngine → exact answer in 102ms

Query: "Find transactions from Jennifer Banks"
  → RETRIEVAL → RAG → appropriate — no aggregation required
```

--------------------------------

### Run Unit Tests for Query Router

Source: https://github.com/emmimal/context-window-engine/blob/main/README.md

Execute all unit tests for the query router component. This command ensures that the query routing logic functions as expected.

```bash
python -m unittest test_router -v   # 72 tests — query router
```

--------------------------------

### Run Unit Tests for Core Engine

Source: https://github.com/emmimal/context-window-engine/blob/main/README.md

Execute all unit tests for the core engine functionality. This command is used to verify the integrity and correctness of the engine's components.

```bash
python -m unittest test_engine -v   # 87 tests — core engine
```

--------------------------------

### Simulate RAG Retrieval with Context Size

Source: https://github.com/emmimal/context-window-engine/blob/main/README.md

Simulates a RAG pipeline by scoring rows using keyword overlap (BM25-style) and retrieving the top-k rows. Measures confidence signals such as coverage percentage, visible categories, partial sum, and detectability score at a specified context size.

```python
from context_window_engine import load_csv, simulate_rag_retrieval

# Load the rows to retrieve from (same loader call as the other examples).
rows = load_csv("data/credit_card_transactions.csv", max_rows=100_000)

ctx = simulate_rag_retrieval(
    query="What is the total spend by category?",
    rows=rows,
    context_size=500,
)

print(ctx.coverage_pct)        # 0.5% of dataset
print(ctx.confidence_signals)  # categories visible, partial sum, detectability
```

--------------------------------

### Route Queries with QueryRouter

Source: https://github.com/emmimal/context-window-engine/blob/main/README.md

Classifies natural language queries as 'COMPUTATION' or 'RETRIEVAL' and dispatches each to the correct execution path. Provides exact answers for computation queries and appropriate results for retrieval queries. Total latency includes classification and execution, and is typically under 250 ms.

```python
from context_window_engine import load_csv
from query_router import QueryRouter

rows = load_csv("data/credit_card_transactions.csv", max_rows=100_000)
router = QueryRouter(rows)

result = router.route("What is the total spend by category?")
print(result.routed_to)      # "COMPUTATION"
print(result.answer.answer)  # exact grouped totals — same as SemanticEngine
print(result.total_latency)  # classify + execute, typically < 250ms

result = router.route("Find transactions from Jennifer Banks")
print(result.routed_to)      # "RETRIEVAL"
print(result.answer.safe)    # True — RAG is appropriate for lookup queries
```

--------------------------------

### Compute Ground Truth for Aggregations

Source: https://github.com/emmimal/context-window-engine/blob/main/README.md

Calculates exact aggregated totals for queries using a semantic engine. Typically completes in under 200 ms on 100k rows. No model calls or retrieval are involved.

```python
from context_window_engine import compute_ground_truth, load_csv

rows = load_csv("data/credit_card_transactions.csv", max_rows=100_000)

gt = compute_ground_truth(
    query_label="total by category",
    rows=rows,
    agg_func="sum",
    agg_col="amt",
    group_col="category",
)

print(gt.answer)      # exact grouped totals, deterministic
print(gt.latency_ms)  # typically < 200ms on 100k rows
```
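
To make the idea of exact grouped totals concrete, here is a minimal sketch of a grouped sum in plain Python, compared against the same sum computed over a partial context. It is not the library's `compute_ground_truth` implementation. The `grouped_sum` helper, the row dictionaries, and their values are hypothetical, and the row format returned by `load_csv` is assumed rather than taken from the source.

```python
from collections import defaultdict


def grouped_sum(rows, group_col, agg_col):
    """Sum agg_col for each distinct value of group_col across all given rows."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[group_col]] += float(row[agg_col])
    return dict(totals)


# Hypothetical rows for illustration only (assumed dict-per-row format).
sample_rows = [
    {"category": "grocery", "amt": "12.50"},
    {"category": "travel", "amt": "100.00"},
    {"category": "grocery", "amt": "7.50"},
    {"category": "dining", "amt": "30.00"},
]

# Aggregating every row gives the exact answer.
full = grouped_sum(sample_rows, "category", "amt")

# Aggregating only the rows that fit in a small context gives a partial sum
# and can miss whole categories.
partial = grouped_sum(sample_rows[:2], "category", "amt")

print(full)     # {'grocery': 20.0, 'travel': 100.0, 'dining': 30.0}
print(partial)  # {'grocery': 12.5, 'travel': 100.0}
```

The partial result undercounts `grocery` and omits `dining` entirely, which is the kind of aggregation error that routing computation queries away from retrieval is meant to avoid.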