### Project Setup and Execution

Source: https://github.com/emmimal/context-window-engine/blob/main/README.md

Provides commands for cloning the repository, setting up the data, and running various demonstrations and benchmarks. Includes options for full benchmarks, single query tests, and output redirection.

```bash
git clone https://github.com/Emmimal/context-window-engine
cd context-window-engine

# Place your CSV at:
# data/credit_card_transactions.csv

python demo.py                                    # problem + solution, fast
python context_window_engine.py                   # benchmark, 100k rows (default)
python context_window_engine.py --full            # full benchmark, 1.29M rows
python context_window_engine.py --query 0         # single query
python context_window_engine.py --sample-context  # show raw LLM input
python context_window_engine.py --output out.txt  # save results to file
python query_router.py                            # router demo, classify-only
python query_router.py --csv data/credit_card_transactions.csv  # with live answers
python -m unittest test_engine -v                 # 87 tests
python -m unittest test_router -v                 # 72 tests
```

--------------------------------

### Query Routing Example

Source: https://github.com/emmimal/context-window-engine/blob/main/README.md

Illustrates how the QueryRouter classifies incoming queries, sending computation queries to the SemanticEngine and retrieval queries to RAG. Aggregation queries therefore never reach the RAG system, which prevents aggregation errors.

```text
Query: "What is the total spend by category?"
→ COMPUTATION  →  SemanticEngine  →  exact answer in 102ms

Query: "Find transactions from Jennifer Banks"
→ RETRIEVAL    →  RAG             →  appropriate — no aggregation required
```
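The classification step above can be sketched with a simple keyword rule. This is only an illustration under stated assumptions: the `classify` function and the keyword list are hypothetical, and the real QueryRouter may decide differently.

```python
import re

# Hypothetical aggregation keywords; not taken from the project's source.
AGGREGATION_PATTERN = re.compile(
    r"\b(total|sum|average|avg|mean|count|max|min)\b",
    re.IGNORECASE,
)


def classify(query: str) -> str:
    """Return "COMPUTATION" for aggregation-style queries, else "RETRIEVAL"."""
    if AGGREGATION_PATTERN.search(query):
        return "COMPUTATION"
    return "RETRIEVAL"


print(classify("What is the total spend by category?"))   # COMPUTATION
print(classify("Find transactions from Jennifer Banks"))  # RETRIEVAL
```

A keyword rule like this is cheap and deterministic, but it will misclassify queries that phrase an aggregation without any listed keyword.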

--------------------------------

### Run Unit Tests for Query Router

Source: https://github.com/emmimal/context-window-engine/blob/main/README.md

Execute all unit tests for the query router component. This command ensures that the query routing logic functions as expected.

```bash
python -m unittest test_router -v   # 72 tests — query router
```

--------------------------------

### Run Unit Tests for Core Engine

Source: https://github.com/emmimal/context-window-engine/blob/main/README.md

Execute all unit tests for the core engine functionality. This command is used to verify the integrity and correctness of the engine's components.

```bash
python -m unittest test_engine -v   # 87 tests — core engine
```

--------------------------------

### Simulate RAG Retrieval with Context Size

Source: https://github.com/emmimal/context-window-engine/blob/main/README.md

Simulates a RAG pipeline by scoring rows with keyword overlap (BM25-style) and retrieving the top-k rows. At the specified context size, it reports the coverage percentage along with confidence signals such as visible categories, partial sum, and detectability score.

```python
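from context_window_engine import load_csv, simulate_rag_retrieval

# Load the rows first; the call below expects them to be defined
rows = load_csv("data/credit_card_transactions.csv", max_rows=100_000)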

ctx = simulate_rag_retrieval(
    query        = "What is the total spend by category?",
    rows         = rows,
    context_size = 500,
)
print(ctx.coverage_pct)           # 0.5% of dataset
print(ctx.confidence_signals)     # categories visible, partial sum, detectability
```
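To make the retrieval step concrete, here is a minimal sketch of keyword-overlap scoring with top-k selection. It is not the project's implementation: it uses plain set overlap rather than real BM25 weighting, and `keyword_overlap`, `top_k`, and the sample rows are all hypothetical.

```python
import heapq


def keyword_overlap(query: str, text: str) -> int:
    """Count distinct lowercase tokens shared by the query and the row text."""
    return len(set(query.lower().split()) & set(text.lower().split()))


def top_k(query: str, rows: list[str], k: int) -> list[str]:
    """Return the k rows with the highest overlap score, best first."""
    return heapq.nlargest(k, rows, key=lambda row: keyword_overlap(query, row))


# Made-up rows for illustration only.
rows = [
    "grocery_pos 12.50 Jennifer Banks",
    "gas_transport 40.00 Mark Lee",
    "grocery_pos 7.25 Ana Ruiz",
    "travel 310.00 Jennifer Banks",
]

hits = top_k("Jennifer Banks grocery_pos", rows, k=2)
coverage_pct = 100 * len(hits) / len(rows)

print(hits)          # ['grocery_pos 12.50 Jennifer Banks', 'travel 310.00 Jennifer Banks']
print(coverage_pct)  # 50.0
```

Only the retrieved rows are visible to the model, so any total computed from `hits` is a partial sum over `coverage_pct` percent of the data.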

--------------------------------

### Route Queries with QueryRouter

Source: https://github.com/emmimal/context-window-engine/blob/main/README.md

Classifies each natural-language query as `COMPUTATION` or `RETRIEVAL` and dispatches it to the matching execution path: computation queries get exact answers from the SemanticEngine, and retrieval queries go to RAG. Total latency covers classification plus execution and is typically under 250 ms.

```python
from context_window_engine import load_csv
from query_router import QueryRouter

rows = load_csv("data/credit_card_transactions.csv", max_rows=100_000)
router = QueryRouter(rows)

result = router.route("What is the total spend by category?")
print(result.routed_to)        # "COMPUTATION"
print(result.answer.answer)    # exact grouped totals — same as SemanticEngine
print(result.total_latency)    # classify + execute, typically < 250ms

result = router.route("Find transactions from Jennifer Banks")
print(result.routed_to)        # "RETRIEVAL"
print(result.answer.safe)      # True — RAG is appropriate for lookup queries
```

--------------------------------

### Compute Ground Truth for Aggregations

Source: https://github.com/emmimal/context-window-engine/blob/main/README.md

Calculates exact aggregated totals for a query with the semantic engine, with no model calls or retrieval involved. Typically completes in under 200 ms on 100k rows.

```python
from context_window_engine import compute_ground_truth, load_csv

rows = load_csv("data/credit_card_transactions.csv", max_rows=100_000)

gt = compute_ground_truth(
    query_label = "total by category",
    rows        = rows,
    agg_func    = "sum",
    agg_col     = "amt",
    group_col   = "category",
)
print(gt.answer)       # exact grouped totals, deterministic
print(gt.latency_ms)   # typically < 200ms on 100k rows
```
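The kind of computation described above, a deterministic group-by sum over every row, can be sketched in plain Python. This assumes each row is a dict keyed by column name, which the source does not confirm; `group_sum` and the sample rows are hypothetical and are not part of the library.

```python
from collections import defaultdict


def group_sum(rows, group_col, agg_col):
    """Sum agg_col for each distinct value of group_col across all rows."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[group_col]] += float(row[agg_col])
    return dict(totals)


# Made-up rows; the column names mirror the "category" and "amt" arguments above.
sample = [
    {"category": "grocery_pos", "amt": "12.50"},
    {"category": "travel", "amt": "310.00"},
    {"category": "grocery_pos", "amt": "7.25"},
]

totals = group_sum(sample, "category", "amt")
print(totals)  # {'grocery_pos': 19.75, 'travel': 310.0}
```

Because every row is scanned exactly once, the result does not depend on retrieval ranking or context size.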
