### Manual Training Installation

Source: https://docs.arc.computer/installation

Step-by-step manual installation of dependencies for custom environments.

```bash
# Install PyTorch with CUDA support
python -m pip install torch==2.6.0 --index-url https://download.pytorch.org/whl/cu124

# Install vLLM and TensorBoard
python -m pip install vllm==0.8.3 tensorboard

# Install Flash Attention (for optimal performance)
python -m pip install flash-attn --no-build-isolation

# Install FlashInfer
python -m pip install flashinfer-python -i https://flashinfer.ai/whl/cu124/torch2.6/

# Install remaining dependencies
python -m pip install --upgrade -r requirements-py311.txt  # or requirements-py312.txt
```

--------------------------------

### Run Automated Training Setup

Source: https://docs.arc.computer/installation

Validated installation scripts for specific Python versions.

```bash
bash scripts/install_py311.sh
```

```bash
bash scripts/install_py312.sh
```

--------------------------------

### Install Prerequisites

Source: https://docs.arc.computer/examples/adaptive-tool-use

Install the necessary Python packages and initialize the Atlas environment.

```bash
pip install arc-atlas langchain-mcp-adapters langchain-openai langgraph mcp anyio
export OPENAI_API_KEY=sk-...
export GEMINI_API_KEY=...
atlas init  # Start Postgres for telemetry
```

--------------------------------

### Setup Environment and Verify Dependencies

Source: https://docs.arc.computer/benchmarks/reproduction

Commands to verify the Python environment, CUDA availability, and authenticate with Hugging Face.

```bash
# Python environment
python --version  # 3.11 or 3.12 required
pip install -r requirements.txt

# Verify CUDA
nvidia-smi
python -c "import torch; print(f'PyTorch: {torch.__version__}')"
python -c "import torch; print(f'CUDA available: {torch.cuda.is_available()}')"

# Authenticate with Hugging Face
huggingface-cli login
```
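
The version requirement noted in the comment above can also be checked from Python itself, which is handy inside setup scripts. A minimal sketch; the helper name `is_supported_python` is illustrative and not part of Atlas:

```python
import sys

# Minor versions the benchmark environment is documented to require (3.11 or 3.12).
SUPPORTED_MINORS = {(3, 11), (3, 12)}


def is_supported_python(version_info=sys.version_info) -> bool:
    """Return True when the major.minor version is 3.11 or 3.12."""
    return (version_info[0], version_info[1]) in SUPPORTED_MINORS


if __name__ == "__main__":
    status = "supported" if is_supported_python() else "unsupported"
    print(f"Python {sys.version_info[0]}.{sys.version_info[1]}: {status}")
```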

--------------------------------

### Distributed Training with Multi-GPU Setup

Source: https://docs.arc.computer/api-reference/trainers

Configure the GRPOTrainer for multi-GPU distributed training using the Hugging Face Accelerate library. The trainer handles the distributed setup automatically when an `Accelerator` instance is provided.

```python
from atlas_core.training.algorithms.grpo import GRPOTrainer
from accelerate import Accelerator

accelerator = Accelerator()

trainer = GRPOTrainer(
    config=config,
    model=model,
    accelerator=accelerator
)

# Trainer automatically handles distributed setup
trainer.train()
```

--------------------------------

### Install Training Dependencies

Source: https://docs.arc.computer/training/offline/gkd-training

Install necessary Python packages for training, including PyTorch, TRL, and vLLM.

```bash
# For Python 3.11
pip install -r requirements-py311.txt

# For Python 3.12
pip install -r requirements-py312.txt
```

--------------------------------

### Install Flash Attention

Source: https://docs.arc.computer/reference/troubleshooting

Install flash-attn without build isolation to avoid compilation errors.

```bash
pip install flash-attn --no-build-isolation
```

--------------------------------

### Configure Learning Parameters

Source: https://docs.arc.computer/sdk/learning-system

Example YAML configuration for the learning module, including provider settings and history limits.

```yaml
learning:
  enabled: true
  update_enabled: true
  history_limit: 25
  session_note_enabled: false
  apply_to_prompts: true
  llm:
    provider: openai
    model: gpt-5-mini
    api_key_env: OPENAI_API_KEY
```
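
The `api_key_env` field appears to name an environment variable rather than hold the key itself, so a quick pre-flight check can catch a missing key before a run. A minimal sketch, assuming the YAML above has already been loaded into a plain dictionary; `missing_api_key_env` is a hypothetical helper, not part of the Atlas SDK:

```python
import os

# Dictionary mirroring part of the YAML example above.
learning_config = {
    "enabled": True,
    "history_limit": 25,
    "llm": {
        "provider": "openai",
        "model": "gpt-5-mini",
        "api_key_env": "OPENAI_API_KEY",
    },
}


def missing_api_key_env(config, environ=os.environ):
    """Return the name of the unset key variable, or None when it is set."""
    name = config["llm"]["api_key_env"]
    return None if environ.get(name) else name


missing = missing_api_key_env(learning_config)
print("API key variable is set" if missing is None else f"Set {missing} before running")
```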

--------------------------------

### Start Docker Daemon on macOS and Linux

Source: https://docs.arc.computer/reference/troubleshooting

Commands to start the Docker daemon on macOS and Linux. Includes verification steps.

```bash
# macOS
open -a Docker

# Linux - check status
sudo systemctl status docker

# Linux - start Docker
sudo systemctl start docker

# Verify
docker ps
```

--------------------------------

### Download or Train Foundation Model

Source: https://docs.arc.computer/concepts/hybrid-learning

Use the CLI to download pre-trained weights or launch custom training scripts on multi-GPU setups.

```bash
# Option 1: Pre-trained
huggingface-cli download Arc-Intelligence/ATLAS-8B-Thinking

# Option 2: Custom (2+ GPUs)
scripts/launch.sh 2 src/atlas_core/configs/recipe/teacher_sft.yaml
scripts/launch_with_server.sh 1 1 src/atlas_core/configs/recipe/teacher_rcl.yaml
```

--------------------------------

### Clone Atlas Core Repository

Source: https://docs.arc.computer/sdk/quickstart

Initial setup commands to download the Atlas Core repository and navigate to the root directory.

```bash
git clone https://github.com/Arc-Computer/ATLAS.git
cd ATLAS
```

--------------------------------

### Callbacks and Monitoring

Source: https://docs.arc.computer/api-reference/trainers

Demonstrates how to configure and use callbacks with GRPOTrainer for monitoring training progress, including examples for WandbCallback and EarlyStoppingCallback.

```python
import os

from atlas_core.training.algorithms.grpo import GRPOTrainer
from transformers import EarlyStoppingCallback
from transformers.integrations import WandbCallback

# WandbCallback takes no constructor arguments. The project is read from the
# WANDB_PROJECT environment variable; the run name normally comes from
# `run_name` in the training configuration (e.g. run_name="experiment-1").
os.environ.setdefault("WANDB_PROJECT", "atlas-training")

# Configure callbacks
callbacks = [
    WandbCallback(),
    EarlyStoppingCallback(
        early_stopping_patience=3,
        early_stopping_threshold=0.001
    )
]

trainer = GRPOTrainer(
    config=config,
    callbacks=callbacks
)
```

--------------------------------

### Initialize Atlas with Bundled Postgres

Source: https://docs.arc.computer/reference/troubleshooting

Start a bundled Docker instance with Postgres using `atlas init`. This typically starts Postgres on localhost:5433.

```bash
atlas init
# Starts bundled Docker + Postgres on localhost:5433
```

--------------------------------

### Install Atlas SDK

Source: https://docs.arc.computer/

Install the required Python package for the Atlas SDK.

```bash
pip install arc-atlas
```

--------------------------------

### Setup Conda Environment

Source: https://docs.arc.computer/installation

Commands to create and configure an isolated Conda environment.

```bash
# Create environment
conda create -n atlas python=3.11
conda activate atlas

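# Install PyTorch (the official conda channel stopped publishing after 2.5.x, so use the pip wheels)
python -m pip install torch==2.6.0 --index-url https://download.pytorch.org/whl/cu124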

# Run installation script
bash scripts/install_py311.sh
```

--------------------------------

### Optimize Training Session Queries

Source: https://docs.arc.computer/training/offline/training-data-pipeline

Python examples for selective data loading and asynchronous pagination to handle large datasets.

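```python
import asyncio

# Import path assumed to match count_training_sessions; adjust for your SDK version
from atlas.training_data import get_training_sessions, paginate_sessions

# Use selective loading
sessions = get_training_sessions(
    db_url="postgresql://atlas:atlas@localhost:5433/atlas",
    include_trajectory_events=False,  # Skip if not needed
    limit=10000
)


# Use pagination for large datasets (async iteration must run inside a coroutine)
async def process_all_sessions():
    async for batch in paginate_sessions(
        db_url="postgresql://atlas:atlas@localhost:5433/atlas",
        batch_size=500,
        min_reward=0.8
    ):
        process_batch(batch)  # process_batch is your own handler


asyncio.run(process_all_sessions())
```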

--------------------------------

### Customization and Task Definition

Source: https://docs.arc.computer/examples/adaptive-tool-use

Examples for adding custom MCP tools and modifying the learning task list.

```python
@server.call_tool()
async def database_query(query: str) -> str:
    """Execute safe database queries"""
    # Your implementation
    return results
```

```python
LEARNING_TASKS = [
    "Your domain-specific task 1",
    "Your domain-specific task 2",
    # ... progressive complexity
]
```

--------------------------------

### Single GPU Configuration

Source: https://docs.arc.computer/installation

Commands for single-GPU inference, plus memory-constrained training options. Note that the training commands shown still require at least two GPUs.

```bash
# Inference only with single GPU
python examples/quickstart/evaluate.py  # Quick evaluation test

# For training with limited VRAM (requires 2+ GPUs)
scripts/launch.sh offload 2 src/atlas_core/configs/recipe/teacher_rcl.yaml

# Or use Zero-1 optimization
scripts/launch.sh zero1 2 src/atlas_core/configs/recipe/teacher_rcl.yaml
```

--------------------------------

### Multi-Node Training Setup

Source: https://docs.arc.computer/training/offline/grpo-training

Initiate multi-node training across multiple machines using torchrun. Ensure the master address and node configuration are correctly set.

```bash
# Node 1 (master)
torchrun \
  --nproc_per_node=8 \
  --nnodes=2 \
  --node_rank=0 \
  --master_addr=10.0.0.1 \
  atlas-core train recipe@_global_=teacher_rcl
```
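
The `--nnodes` and `--nproc_per_node` flags together determine how many worker processes take part in the run. A short worked sketch of that arithmetic; both helper names are illustrative, and the rank layout assumes ranks are assigned contiguously per node, which is the usual arrangement with a static rendezvous:

```python
def world_size(nnodes: int, nproc_per_node: int) -> int:
    """Total number of worker processes across all nodes."""
    return nnodes * nproc_per_node


def global_ranks(node_rank: int, nproc_per_node: int) -> range:
    """Global ranks hosted on one node, assuming contiguous per-node assignment."""
    start = node_rank * nproc_per_node
    return range(start, start + nproc_per_node)


# Values from the command above: 2 nodes with 8 processes each.
print(world_size(nnodes=2, nproc_per_node=8))
print(list(global_ranks(node_rank=0, nproc_per_node=8)))
```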

--------------------------------

### Launch vLLM Server

Source: https://docs.arc.computer/training/offline/grpo-training

Use this command to start the vLLM inference server. Ensure CUDA_VISIBLE_DEVICES is set correctly for your hardware. Adjust gpu-memory-utilization based on available VRAM.

```bash
CUDA_VISIBLE_DEVICES=0,1 \
python -m atlas_core.training.generation.vllm_server \
  --model checkpoints/sft/final \
  --port 8765 \
  --tensor-parallel-size 2 \
  --gpu-memory-utilization 0.9
```
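
Tensor parallelism needs at least as many visible GPUs as the `--tensor-parallel-size` value, so a mismatch with `CUDA_VISIBLE_DEVICES` is worth catching before launch. A minimal sketch of that check; the function names are hypothetical and not part of Atlas or vLLM:

```python
def visible_gpu_count(cuda_visible_devices: str) -> int:
    """Count the device IDs listed in a CUDA_VISIBLE_DEVICES string."""
    return len([d for d in cuda_visible_devices.split(",") if d.strip()])


def tensor_parallel_fits(cuda_visible_devices: str, tensor_parallel_size: int) -> bool:
    """Return True when enough devices are visible for the requested size."""
    return tensor_parallel_size <= visible_gpu_count(cuda_visible_devices)


# Values from the command above: devices 0 and 1 with tensor parallel size 2.
print(tensor_parallel_fits("0,1", 2))
```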

--------------------------------

### Monitor GRPO Training with TensorBoard

Source: https://docs.arc.computer/training/offline/grpo-training

Start TensorBoard to visualize training progress. Ensure the log directory matches your training output.

```bash
tensorboard --logdir checkpoints/grpo --port 6006
```

--------------------------------

### Troubleshoot Postgres Connection Refused

Source: https://docs.arc.computer/examples/adaptive-tool-use

Start Postgres with `atlas init` or verify the `DATABASE_URL` in your `.env` file to resolve connection refused errors.

```bash
# Start the bundled Postgres instance
atlas init

# Or check the DATABASE_URL configured in .env
grep DATABASE_URL .env
```

--------------------------------

### Resolve SDK Installation Issues

Source: https://docs.arc.computer/reference/troubleshooting

Commands to verify Python environment and reinstall the Atlas SDK.

```bash
# Check Python version
python --version  # Should be 3.10+

# Create virtual environment with correct version
python3.12 -m venv .venv
source .venv/bin/activate

# Reinstall SDK
pip install --upgrade arc-atlas
```

```bash
# Ensure you're in the correct environment
which python
which pip

# Reinstall in current environment
pip uninstall arc-atlas -y
pip install arc-atlas

# Verify installation
python -c "import atlas; print(atlas.__version__)"
```
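
If importing the package fails outright, the installed distribution can still be queried through the standard library, which separates "not installed" from "installed but broken". A minimal sketch using `importlib.metadata`; `installed_version` is an illustrative helper name:

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution: str):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


found = installed_version("arc-atlas")
print(found or "arc-atlas is not installed in this environment")
```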

--------------------------------

### Initialize MCP Server Programmatically

Source: https://docs.arc.computer/reference/troubleshooting

Starts the MCP server process using Python before executing the Atlas task.

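```python
# Ensure MCP server is started before agent
import subprocess

mcp_process = subprocess.Popen([
    "python", "-m", "your_mcp_server"
])

# Then run atlas (a CLI command, so it is invoked as a subprocess)
subprocess.run(
    ["atlas", "run", "--config", "config.yaml", "--task", "Your task"],
    check=True,
)

# Stop the MCP server once the task has finished
mcp_process.terminate()
```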

--------------------------------

### Custom Trainer Implementation

Source: https://docs.arc.computer/api-reference/trainers

Provides an example of creating a custom trainer by extending the GRPOTrainer class, demonstrating how to override reward computation logic.

Create your own trainer by extending base classes:

```python
from atlas_core.training.algorithms.grpo import GRPOTrainer
import torch

class CustomRewardTrainer(GRPOTrainer):
    """Custom trainer with modified reward computation"""

    def compute_rewards(self, completions, prompts):
        """Override reward computation"""
        rewards = []
        for completion, prompt in zip(completions, prompts):
            # Custom reward logic
            reward = self.custom_reward_function(completion, prompt)
            rewards.append(reward)
        return torch.tensor(rewards)

    def custom_reward_function(self, completion, prompt):
        """Implement domain-specific rewards"""
        # Example: Length penalty
        length_penalty = min(1.0, len(completion) / 500)

        # Example: Quality score
        quality = self.quality_model(completion)

        return quality * length_penalty
```
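
The length factor in this example is easiest to follow with concrete numbers: it scales the quality score down for completions shorter than 500 characters and caps at 1.0 beyond that. A standalone sketch of the same arithmetic, with the quality score passed in directly since the quality model is not shown in the source:

```python
def length_penalty(completion: str, target_chars: int = 500) -> float:
    """Scale factor that grows with length and is capped at 1.0."""
    return min(1.0, len(completion) / target_chars)


def combined_reward(quality: float, completion: str) -> float:
    """Quality score multiplied by the length factor."""
    return quality * length_penalty(completion)


short = "x" * 250    # half the target length
long = "x" * 1000    # beyond the target length, so the factor is capped

print(length_penalty(short), length_penalty(long))
print(combined_reward(0.8, short))
```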

--------------------------------

### Domain-Specific Example Entries

Source: https://docs.arc.computer/reference/datasets

Sample JSON objects representing entries in the mathematics, code generation, and debugging subsets.

```json
{
  "prompt": "Sarah has 24 apples. She gives 1/3 to her brother...",
  "ground_truth": "12",
  "teaching": "Break down: 1) Calculate 1/3 of 24 = 8..."
}
```

```json
{
  "prompt": "Write a function to validate email addresses",
  "ground_truth": "def validate_email(email):...",
  "teaching": "Consider regex pattern, edge cases like..."
}
```

```json
{
  "prompt": "Service returns 503 errors intermittently",
  "ground_truth": "Check service mesh configuration...",
  "teaching": "Systematic approach: 1) Check Istio configs..."
}
```
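
All three sample entries share the same `prompt`, `ground_truth`, and `teaching` string fields. A minimal sketch of a shape check built on that observation; `missing_fields` is a hypothetical helper, and the real dataset may carry additional fields:

```python
import json

# Fields present in every sample entry shown above.
REQUIRED_FIELDS = ("prompt", "ground_truth", "teaching")


def missing_fields(entry: dict) -> list:
    """Return the required fields that are absent or not strings."""
    return [name for name in REQUIRED_FIELDS if not isinstance(entry.get(name), str)]


example = json.loads("""
{
  "prompt": "Sarah has 24 apples. She gives 1/3 to her brother...",
  "ground_truth": "12",
  "teaching": "Break down: 1) Calculate 1/3 of 24 = 8..."
}
""")

print(missing_fields(example))
```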

--------------------------------

### Build a Custom GRPC Adapter

Source: https://docs.arc.computer/sdk/adapters

Implement a custom GRPC adapter by extending the AgentAdapter class and registering it. This example shows the basic structure for connecting to a gRPC service.

```python
from atlas.connectors.registry import AgentAdapter, register_adapter
from atlas.config.models import AdapterType

class GRPCAdapter(AgentAdapter):
    async def ainvoke(self, prompt: str, metadata: dict | None = None) -> str:
        # 1. Connect to your gRPC service.
        # 2. Build the request from the prompt.
        # 3. Execute the call and get a response.
        # 4. Return the response as a string.
        return f"Response for prompt: {prompt}"

# Assumes you've added GRPC to the AdapterType enum
register_adapter(AdapterType.GRPC, GRPCAdapter)
```

--------------------------------

### CLI Environment Verification

Source: https://docs.arc.computer/installation

Validates installation of accelerate, CUDA, and model file accessibility via CLI.

```bash
# Check accelerate installation
accelerate --version

# Verify CUDA
python -c "import torch; print(torch.cuda.is_available())"

# Test model access
huggingface-cli download Arc-Intelligence/ATLAS-8B-Thinking \
  --include "*.json" \
  --exclude "*.safetensors"
```

--------------------------------

### View GKD Training Command Line Output

Source: https://docs.arc.computer/training/offline/gkd-training

Example of the console output during an Atlas GKD training run showing epoch progress and baseline comparison metrics.

```text
Starting GKD training with Baseline Comparison reference: success=75.00%, tokens=1200
Loaded datasets: train=850, eval=150 conversations
AtlasGKDTrainer initialized with lmbda=1.0, beta=0.5

Epoch 1/3
  Step 100: loss=0.245, eval_loss=0.312
  ✅ Baseline Comparison targets MET: success delta=12.3 pp, token reduction=35.2%

Epoch 2/3
  Step 200: loss=0.198, eval_loss=0.276
  ✅ Baseline Comparison targets MET: success delta=14.1 pp, token reduction=38.7%
```
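
The comparison lines can be related back to the baseline reference with simple arithmetic. The sketch below assumes that "success delta" is measured in percentage points added to the baseline success rate and that "token reduction" is a percentage of the baseline token count; the source does not define either term, so treat this as one plausible reading:

```python
def project_from_baseline(baseline_success_pct, baseline_tokens,
                          success_delta_pp, token_reduction_pct):
    """Apply a percentage-point delta and a percentage reduction to a baseline."""
    success = baseline_success_pct + success_delta_pp
    tokens = baseline_tokens * (1 - token_reduction_pct / 100)
    return success, tokens


# Baseline from the first log line, deltas from the epoch 1 comparison line.
success, tokens = project_from_baseline(75.0, 1200, 12.3, 35.2)
print(f"success={success:.1f}%, tokens={tokens:.1f}")
```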

--------------------------------

### Verify Virtual Environment for Agent Discovery

Source: https://docs.arc.computer/reference/troubleshooting

Ensure the correct Python virtual environment is active and that necessary libraries like `langchain` are installed. Agent discovery runs within the current environment.

```bash
# Ensure correct environment is active
which python
pip list | grep langchain

# Discovery runs in your current environment
atlas env init --verbose
```

--------------------------------

### Live GRPO Training Metrics Output

Source: https://docs.arc.computer/training/offline/grpo-training

Example of live metrics output from the training log, showing step, reward, KL divergence, and non-degradation rate.

```text
{'step': 50, 'reward': 0.52, 'kl': 1.2, 'non_degrade': 0.96}
{'step': 100, 'reward': 0.61, 'kl': 1.4, 'non_degrade': 0.97}
{'step': 150, 'reward': 0.73, 'kl': 1.6, 'non_degrade': 0.98}
# Rewards should steadily increase over 24-36 hours
```
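
Because each metrics line is printed as a Python dictionary literal, the log can be parsed without a custom format. A minimal sketch, assuming the lines look exactly like the sample above; the helper names are illustrative:

```python
import ast

LOG_LINES = [
    "{'step': 50, 'reward': 0.52, 'kl': 1.2, 'non_degrade': 0.96}",
    "{'step': 100, 'reward': 0.61, 'kl': 1.4, 'non_degrade': 0.97}",
    "{'step': 150, 'reward': 0.73, 'kl': 1.6, 'non_degrade': 0.98}",
    "# Rewards should steadily increase over 24-36 hours",
]


def parse_metrics(lines):
    """Parse dictionary-literal lines and skip anything else, such as comments."""
    return [ast.literal_eval(line) for line in lines if line.startswith("{")]


def reward_is_increasing(records):
    """Return True when each reward is strictly higher than the previous one."""
    rewards = [record["reward"] for record in records]
    return all(a < b for a, b in zip(rewards, rewards[1:]))


records = parse_metrics(LOG_LINES)
print(len(records), reward_is_increasing(records))
```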

--------------------------------

### Fix vLLM Installation

Source: https://docs.arc.computer/installation

Install missing system dependencies or use pre-built wheels to resolve vLLM installation failures.

```bash
# Install build dependencies
sudo apt-get install python3-dev

# Try pre-built wheel
pip install https://github.com/vllm-project/vllm/releases/download/v0.8.3/vllm-0.8.3-cp311-cp311-linux_x86_64.whl
```

--------------------------------

### Run Learning Session

Source: https://docs.arc.computer/examples/adaptive-tool-use

Execute the full 25-task learning harness.

```bash
cd examples/mcp_tool_learning
python learning_harness.py
```

--------------------------------

### Launch Training Script

Source: https://docs.arc.computer/reference/faq

Execute the training process using the provided launch script and configuration file.

```bash
scripts/launch.sh 8 src/atlas_core/configs/recipe/teacher_sft.yaml \
  dataset_name=path/to/your/data
```

--------------------------------

### Launch SFT Warmup

Source: https://docs.arc.computer/benchmarks/reproduction

Command to initiate the supervised fine-tuning phase.

```bash
scripts/launch.sh 4 src/atlas_core/configs/recipe/teacher_sft.yaml \
  dataset_id_or_path=Arc-Intelligence/Arc-ATLAS-Teach-v0 \
  output_dir=results/pre_rl_model \
  seed=42
```

--------------------------------

### Bootstrap Project with Autodiscovery

Source: https://docs.arc.computer/installation

Initialize the environment and run tasks using the Atlas CLI.

```bash
atlas env init --task "Summarize the latest AI news"
atlas run --config .atlas/generated_config.yaml --task "Summarize the latest AI news"
```

--------------------------------

### Verify CUDA Installation

Source: https://docs.arc.computer/installation

Checks for NVIDIA driver and CUDA availability.

```bash
nvidia-smi  # Verify CUDA version
```

--------------------------------

### Get Session by ID

Source: https://docs.arc.computer/training/offline/training-data-pipeline

Retrieve a specific training session by its unique ID.

```APIDOC
## GET /training_data/sessions/{session_id}

### Description
Fetches a single training session identified by its ID.

### Method
GET

### Endpoint
/training_data/sessions/{session_id}

### Parameters
#### Path Parameters
- **session_id** (integer) - Required - The unique identifier of the session to retrieve.

#### Query Parameters
- **db_url** (string) - Required - The database connection URL.

### Response
#### Success Response (200)
- **session** (AtlasSessionTrace) - The training session object.

```

Response example:

```json
{
  "session": {
    "session_reward": {"score": 0.92, "uncertainty": 0.02},
    "trajectory_events": [...],
    "student_learning": {...},
    "teacher_learning": {...},
    "learning_history": {...},
    "adaptive_summary": {...},
    "learning_key": "security-review-final",
    "drift_alert": null
  }
}
```

--------------------------------

### Verify Learning Persistence After Upgrade

Source: https://docs.arc.computer/sdk/learning-system

After upgrading the SDK, run a session that should trigger learning, restart the runtime, and then verify that the playbook loads from the database and the playbook hash in telemetry matches the registry.

```bash
atlas run --config your_config.yaml --task "test task"
```

```bash
atlas run --config your_config.yaml --task "another task"
```

```bash
psql "$DATABASE_URL" -c "SELECT learning_key, updated_at FROM learning_registry WHERE learning_key='your-key';"
```

--------------------------------

### Initialize AtlasGKDTrainer

Source: https://docs.arc.computer/training/offline/gkd-training

Python implementation for setting up the AtlasGKDTrainer with student and teacher models, configuration, and database connection.

```python
from atlas_core.training.algorithms.gkd_trainer import AtlasGKDTrainer
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import GKDConfig

student = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-7B-Instruct")
teacher = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-14B-Instruct")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B-Instruct")

args = GKDConfig(
    output_dir="outputs/gkd",
    per_device_train_batch_size=4,
    lmbda=1.0,
    beta=0.5,
)

trainer = AtlasGKDTrainer(
    model=student,
    teacher_model=teacher,
    args=args,
    db_url="postgresql://localhost:5432/atlas",
    min_reward=0.8,
    processing_class=tokenizer,
)

trainer.train()
trainer.save_model("outputs/gkd/final")
```

--------------------------------

### Launch SFT Training

Source: https://docs.arc.computer/training/offline/grpo-training

Execute SFT training across different GPU configurations using the launch script.

```bash
# Minimum (2 GPUs)
scripts/launch.sh 2 src/atlas_core/configs/recipe/teacher_sft.yaml \
  output_dir=checkpoints/sft

# Recommended (4 GPUs)
scripts/launch.sh 4 src/atlas_core/configs/recipe/teacher_sft.yaml \
  output_dir=checkpoints/sft

# Full production (8 GPUs)
scripts/launch.sh 8 src/atlas_core/configs/recipe/teacher_sft.yaml \
  output_dir=checkpoints/sft

# Memory-constrained with offloading
scripts/launch.sh offload 2 src/atlas_core/configs/recipe/teacher_sft.yaml \
  output_dir=checkpoints/sft
```

--------------------------------

### Get Training Sessions (Async)

Source: https://docs.arc.computer/training/offline/training-data-pipeline

Asynchronously query training sessions for high-throughput pipelines.

```APIDOC
## GET /training_data/sessions/async

### Description
Fetches training sessions asynchronously from a PostgreSQL database, suitable for high-throughput scenarios.

### Method
GET

### Endpoint
/training_data/sessions/async

### Parameters
#### Query Parameters
- **db_url** (string) - Required - The database connection URL.
- **min_reward** (float) - Optional - Minimum reward score to filter sessions.
- **limit** (integer) - Optional - The maximum number of sessions to return.

### Response
#### Success Response (200)
- **sessions** (list of AtlasSessionTrace) - A list of training session objects.

```

Response example:

```json
{
  "sessions": [
    {
      "session_reward": {"score": 0.75, "uncertainty": 0.03},
      "trajectory_events": [...],
      "student_learning": {...},
      "teacher_learning": {...},
      "learning_history": {...},
      "adaptive_summary": {...},
      "learning_key": "task-batch-2",
      "drift_alert": null
    }
  ]
}
```

--------------------------------

### Launch minimal training

Source: https://docs.arc.computer/benchmarks/reproduction

Executes a short training run for testing purposes.

```bash
scripts/launch_with_server.sh 1 1 src/atlas_core/configs/recipe/teacher_rcl.yaml \
  model_name_or_path=checkpoints/teacher \
  max_steps=4 \
  eval_steps=1 \
  report_to=null
```

--------------------------------

### Run GRPO Training with vLLM Server

Source: https://docs.arc.computer/training/offline/grpo-training

Launch GRPO training using the `launch_with_server.sh` script, specifying the number of GPUs for training and vLLM. The first argument is training GPUs, the second is vLLM GPUs.

```bash
# Minimum (2 GPUs: 1 training, 1 vLLM)
scripts/launch_with_server.sh 1 1 src/atlas_core/configs/recipe/teacher_rcl.yaml \
  model_name_or_path=checkpoints/sft/final

# Recommended (4 GPUs: 2 training, 2 vLLM)
scripts/launch_with_server.sh 2 2 src/atlas_core/configs/recipe/teacher_rcl.yaml \
  model_name_or_path=checkpoints/sft/final

# Production (8 GPUs: 4 training, 4 vLLM)
scripts/launch_with_server.sh 4 4 src/atlas_core/configs/recipe/teacher_rcl.yaml \
  model_name_or_path=checkpoints/sft/final
```

--------------------------------

### Verify Python Version

Source: https://docs.arc.computer/installation

Confirms the installed Python version meets the minimum requirement of 3.10.

```bash
python --version
```

--------------------------------

### Get Training Sessions

Source: https://docs.arc.computer/training/offline/training-data-pipeline

Query training sessions directly from PostgreSQL with various filtering options.

```APIDOC
## GET /training_data/sessions

### Description
Fetches training sessions directly from a PostgreSQL database.

### Method
GET

### Endpoint
/training_data/sessions

### Parameters
#### Query Parameters
- **db_url** (string) - Required - The database connection URL.
- **min_reward** (float) - Optional - Minimum reward score to filter sessions.
- **max_reward** (float) - Optional - Maximum reward score to filter sessions.
- **learning_key** (string) - Optional - Filter sessions by a specific learning key.
- **status_filters** (list of strings) - Optional - Filter sessions by their status (e.g., "succeeded", "failed").
- **start_date** (datetime) - Optional - Filter sessions that started on or after this date.
- **end_date** (datetime) - Optional - Filter sessions that ended on or before this date.
- **include_trajectory_events** (boolean) - Optional - Whether to include trajectory events. Defaults to true.
- **include_learning_data** (boolean) - Optional - Whether to include learning data. Defaults to true.
- **limit** (integer) - Optional - The maximum number of sessions to return.

### Response
#### Success Response (200)
- **sessions** (list of AtlasSessionTrace) - A list of training session objects.

```

Response example:

```json
{
  "sessions": [
    {
      "session_reward": {"score": 0.85, "uncertainty": 0.05},
      "trajectory_events": [...],
      "student_learning": {...},
      "teacher_learning": {...},
      "learning_history": {...},
      "adaptive_summary": {...},
      "learning_key": "security-review",
      "drift_alert": null
    }
  ]
}
```

--------------------------------

### Configure API Keys using a .env File

Source: https://docs.arc.computer/reference/troubleshooting

Create a .env file in your project root to store API keys. The SDK automatically loads these variables.

```bash
# Create .env file in project root
cat > .env << 'EOF'
OPENAI_API_KEY=sk-...
GEMINI_API_KEY=...
ANTHROPIC_API_KEY=...
EOF

# SDK auto-loads .env files
atlas run --config config.yaml --task "Your task"
```
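
For readers unfamiliar with `.env` files, the format is one `KEY=VALUE` pair per line. The sketch below shows roughly what loading such a file involves; it is not the SDK's own loader, and `parse_env_lines` is a hypothetical name:

```python
def parse_env_lines(lines):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


sample = [
    "# API keys",
    "OPENAI_API_KEY=sk-test",
    "",
    "GEMINI_API_KEY='abc'",
]
print(parse_env_lines(sample))
```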

--------------------------------

### Configure API Keys

Source: https://docs.arc.computer/installation

Environment variables for training and runtime SDK authentication.

```bash
# Training stack
export HF_TOKEN="your-huggingface-token"
export WANDB_API_KEY="your-wandb-key"  # Optional

# Runtime SDK
export ANTHROPIC_API_KEY="sk-ant-your-key"  # Primary provider
export GEMINI_API_KEY="your-gemini-key"  # Optional for rewards
```

--------------------------------

### Configure Atlas Database URL

Source: https://docs.arc.computer/reference/troubleshooting

Example of the `database_url` configuration for Atlas, specifying the connection string for a PostgreSQL database.

```yaml
# config.yaml
storage:
  database_url: postgresql://atlas:atlas@localhost:5433/atlas
```
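
When a connection fails, it helps to confirm which host, port, and database the URL actually points at. A minimal sketch using only the standard library; `describe_database_url` is an illustrative helper, not an Atlas function:

```python
from urllib.parse import urlsplit


def describe_database_url(url: str) -> dict:
    """Split a PostgreSQL connection URL into its main components."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "user": parts.username,
        "host": parts.hostname,
        "port": parts.port,
        "database": parts.path.lstrip("/"),
    }


info = describe_database_url("postgresql://atlas:atlas@localhost:5433/atlas")
print(info)
```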

--------------------------------

### Configure GKD (Postgres path)

Source: https://docs.arc.computer/examples/gkd-dev-example

Trains directly from approved traces in Postgres. Ensure ATLAS_DB_URL is set. Override trainer parameters as needed.

```bash
export ATLAS_DB_URL="postgresql://user:pass@host:5432/atlas"
atlas-core train \
  recipe@_global_=teacher_gkd \
  teacher_model_name_or_path=Qwen/Qwen2.5-14B-Instruct \
  model.model_name_or_path=Qwen/Qwen2.5-7B-Instruct \
  trainer.min_reward=0.8
```

--------------------------------

### Filter Dataset by Domain

Source: https://docs.arc.computer/reference/datasets

Examples of filtering the dataset for specific domains like mathematics, code generation, or debugging.

```python
math_data = dataset.filter(lambda x: x['domain'] == 'math')
```

```python
code_data = dataset.filter(lambda x: x['domain'] == 'code')
```

```python
sre_data = dataset.filter(lambda x: x['domain'] == 'debug')
```
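
The three filters differ only in the domain value, so they can be expressed through one helper. The sketch below uses plain dictionaries instead of a loaded dataset so it runs on its own; the rows reuse prompts from the sample entries, and `filter_by_domain` is a hypothetical name:

```python
# Stand-in rows with the same 'domain' field the filters above rely on.
rows = [
    {"domain": "math", "prompt": "Sarah has 24 apples. She gives 1/3 to her brother..."},
    {"domain": "code", "prompt": "Write a function to validate email addresses"},
    {"domain": "debug", "prompt": "Service returns 503 errors intermittently"},
]


def filter_by_domain(records, domain):
    """Return the records whose 'domain' field equals the given value."""
    return [record for record in records if record["domain"] == domain]


for name in ("math", "code", "debug"):
    print(name, len(filter_by_domain(rows, name)))
```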

--------------------------------

### Launch DeepSpeed Training with Accelerate

Source: https://docs.arc.computer/api-reference/trainers

Initiate distributed training with DeepSpeed using the `accelerate launch` command. Specify the DeepSpeed configuration file and the training script. Presets for zero3 and CPU offloading are available.

```bash
# Default zero3
accelerate launch --config_file accelerate/deepspeed_zero3.yaml \
  -m atlas_core.cli.train recipe@_global_=teacher_rcl
```

```bash
# CPU offload
accelerate launch --config_file accelerate/deepspeed_zero3_cpu_offloading.yaml \
  -m atlas_core.cli.train recipe@_global_=teacher_rcl
```

--------------------------------

### Count Training Sessions

Source: https://docs.arc.computer/training/offline/training-data-pipeline

Get the count of training sessions matching specific criteria without loading full data.

```APIDOC
## GET /training_data/sessions/count

### Description
Retrieves the total count of training sessions that match the specified filters.

### Method
GET

### Endpoint
/training_data/sessions/count

### Parameters
#### Query Parameters
- **db_url** (string) - Required - The database connection URL.
- **min_reward** (float) - Optional - Minimum reward score to filter sessions.
- **learning_key** (string) - Optional - Filter sessions by a specific learning key.

### Response
#### Success Response (200)
- **total** (integer) - The total number of matching training sessions.

```

Response example:

```json
{
  "total": 150
}
```

--------------------------------

### Load ATLAS-8B-Instruct Model

Source: https://docs.arc.computer/reference/models

Use this code to load the ATLAS-8B-Instruct model and tokenizer from Hugging Face. This requires the `transformers` library; `device_map="auto"` additionally requires `accelerate`.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "Arc-Intelligence/ATLAS-8B-Instruct",
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(
    "Arc-Intelligence/ATLAS-8B-Instruct"
)
```

--------------------------------

### GRPOTrainer Class Initialization

Source: https://docs.arc.computer/api-reference/trainers

Details the parameters required to initialize the GRPOTrainer for reinforcement learning tasks.

```APIDOC
## GRPOTrainer Initialization

### Description
Initializes the GRPOTrainer with the necessary models, configuration, and datasets for policy optimization.

### Parameters
- **config** (GRPOConfig) - Required - Training configuration
- **model** (PreTrainedModel) - Required - Model to train (policy network)
- **ref_model** (PreTrainedModel) - Required - Reference model for KL penalty
- **tokenizer** (PreTrainedTokenizer) - Required - Tokenizer for encoding/decoding
- **train_dataset** (Dataset) - Required - Training data
- **eval_dataset** (Dataset) - Optional - Evaluation data
- **reward_model** (PreTrainedModel) - Optional - Optional external reward model
- **compute_metrics** (Callable) - Optional - Custom metrics function
- **callbacks** (List[TrainerCallback]) - Optional - Training callbacks
- **optimizers** (Tuple) - Optional - Custom optimizer and scheduler
```

--------------------------------

### Load ATLAS-8B-Thinking Model

Source: https://docs.arc.computer/reference/models

Use this code to load the ATLAS-8B-Thinking model and tokenizer from Hugging Face. This requires the `transformers` library; `device_map="auto"` additionally requires `accelerate`.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "Arc-Intelligence/ATLAS-8B-Thinking",
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(
    "Arc-Intelligence/ATLAS-8B-Thinking"
)
```

--------------------------------

### Build and Run Docker Container

Source: https://docs.arc.computer/installation

Commands for building the training image and executing the offline pipeline.

```bash
docker build -t atlas-core:local .
```

```bash
docker run --rm \
  -v "$(pwd)/exports:/data" \
  atlas-core:local \
  atlas-core offline-pipeline --export-path /data/traces.jsonl --dry-run
```

--------------------------------

### Multi-GPU Training Launch

Source: https://docs.arc.computer/installation

Distributed training commands for various GPU cluster sizes.

```bash
# Minimum 2 GPUs for RL training (1 for vLLM, 1 for training)
scripts/launch_with_server.sh 1 1 src/atlas_core/configs/recipe/teacher_rcl.yaml

# Production setup with 4 GPUs (2 for vLLM, 2 for training)
scripts/launch_with_server.sh 2 2 src/atlas_core/configs/recipe/teacher_rcl.yaml

# Full 8 GPU setup
scripts/launch_with_server.sh 4 4 src/atlas_core/configs/recipe/teacher_rcl.yaml
```

--------------------------------

### Verify Configuration Files

Source: https://docs.arc.computer/training/offline/gkd-training

Check for the existence of required GKD configuration files.

```bash
# Check required config files
ls src/atlas_core/configs/recipe/teacher_gkd.yaml
ls src/atlas_core/configs/trainer/gkd.yaml

# Expected output:
# src/atlas_core/configs/recipe/teacher_gkd.yaml
# src/atlas_core/configs/trainer/gkd.yaml
```

--------------------------------

### Count Training Sessions

Source: https://docs.arc.computer/training/offline/training-data-pipeline

Get the count of training sessions matching specific criteria without loading the full data using `count_training_sessions`.

```python
from atlas.training_data import count_training_sessions

total = count_training_sessions(
    db_url="postgresql://atlas:atlas@localhost:5433/atlas",
    min_reward=0.8,
    learning_key="task-1"
)
print(f"Found {total} sessions matching criteria")
```

--------------------------------

### Session Evaluation JSON Structure

Source: https://docs.arc.computer/concepts/reward-design

Example of the structured evaluation output generated after a session, containing weighted principles, scores, and learning insights.

```json
{
  "principles": [
    {"name": "Correctness", "weight": 0.5, "description": "Final deliverable matches requirements"},
    {"name": "Safety", "weight": 0.3, "description": "No policy violations detected"},
    {"name": "Efficiency", "weight": 0.2, "description": "Minimal retries needed"}
  ],
  "score": 0.85,
  "rationale": "Response solves the task correctly with efficient execution",
  "uncertainty": 0.1,
  "student_learning": "For straightforward tasks, proceed directly to solution without exploratory steps",
  "teacher_learning": null
}
```

--------------------------------

### Build GKD Dataset

Source: https://docs.arc.computer/training/offline/gkd-training

Initializes training and evaluation datasets from a PostgreSQL database using specified reward thresholds and learning keys.

```python
from atlas_core.data.gkd import build_gkd_dataset

train_ds, eval_ds = build_gkd_dataset(
    db_url="postgresql://localhost:5432/atlas",
    min_reward=0.8,
    learning_key="crm_workflows",
    eval_split=0.15,
)
```

--------------------------------

### SFT Warmup

Source: https://docs.arc.computer/api-reference/trainers

Performs supervised fine-tuning as a prerequisite step before RL training.

```python
from trl import SFTConfig, SFTTrainer

# Supervised fine-tuning before RL
trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=sft_dataset,
    tokenizer=tokenizer,
    max_seq_length=2048
)

trainer.train()
```

--------------------------------

### GKD Validation Metrics Example

Source: https://docs.arc.computer/examples/gkd-dev-example

Inspect this JSON file for training loss, baseline and distilled evaluation metrics, and derived success delta and token reduction.

```json
{
  "training": {"train_loss": 0.0294},
  "baseline": {"accuracy": 0.758, "avg_generated_tokens": 210},
  "distilled": {"accuracy": 0.815, "avg_generated_tokens": 180}
}
```

--------------------------------

### Configure Hugging Face Environment

Source: https://docs.arc.computer/reference/troubleshooting

Manage authentication and cache settings for Hugging Face.

```bash
# Login to Hugging Face
huggingface-cli login

# Set cache directory if disk space limited
export HF_HOME=/path/to/cache

# Use offline mode if downloaded
export HF_DATASETS_OFFLINE=1
export TRANSFORMERS_OFFLINE=1
```

--------------------------------

### Generate teaching guidance with TeacherGRPOTrainer

Source: https://docs.arc.computer/api-reference/trainers

Create tailored teaching guidance based on diagnostic results. The output format varies based on the student's capability level.

```python
def generate_guidance(
    self,
    task: str,
    diagnostic: DiagnosticResult,
    max_tokens: int = 200
) -> str:
    """
    Generate teaching guidance based on diagnosis

    Args:
        task: Original task or problem statement
        diagnostic: Results from diagnostic_probe()
        max_tokens: Maximum tokens for guidance response

    Returns:
        str: Tailored teaching guidance text
            Format depends on capability_level:
            - Low (0.0-0.3): Step-by-step walkthrough
            - Medium (0.3-0.7): Hints and scaffolding
            - High (0.7-1.0): Minimal guidance or verification

    Raises:
        ValueError: If max_tokens < 10 or diagnostic is None
        RuntimeError: If teacher model fails to generate guidance
        TypeError: If task is not a string

    Example:
        diagnostic = trainer.diagnostic_probe("Solve quadratic equation")
        guidance = trainer.generate_guidance(
            "Solve: x² - 5x + 6 = 0",
            diagnostic,
            max_tokens=150
        )
        print(f"Teaching guidance: {guidance}")
    """
```

--------------------------------

### Environment Discovery Commands

Source: https://docs.arc.computer/sdk/cli-reference

Commands for initializing and scaffolding agent environments.

```APIDOC
## atlas env init

### Description
Discovers agents and environments, populating the .atlas/ directory. Uses Claude Haiku 4.5 for agent ranking.

### Parameters
#### Flags
- **--task** (string) - Optional - Task description for discovery
- **--scaffold-config-full** (boolean) - Optional - Generate full configuration
- **--no-run** (boolean) - Optional - Skip execution
- **--timeout** (integer) - Optional - Timeout in seconds (default 240)

## atlas env scaffold

### Description
Seeds projects with reference factories using LangGraph templates.

### Parameters
#### Flags
- **--template** (string) - Optional - Template name
- **--output** (string) - Optional - Output directory
- **--force** (boolean) - Optional - Overwrite existing files
```

--------------------------------

### Evaluate a Single Interaction with RIMReward

Source: https://docs.arc.computer/training/reward-system-usage

Use the evaluate method of RIMReward for quick assessments of teaching effectiveness. Provide prompt, response, baseline, and teacher traces to get a score and rationale.

```python
from atlas_core.reward.interpretation import RIMReward

# Create reward system
reward = RIMReward(config_path='reward_system/interpretation.yaml')

# Evaluate a single interaction
result = reward.evaluate(
    prompt="What is 2+2?",
    response="The answer is <solution>4</solution>.",
    baseline_solutions="It is 4",
    teacher_traces="Explain your reasoning step by step",
)

print(f"Score: {result.score}")
print(f"Per-judge: {result.judge_scores}")
print(f"Rationale:\n{result.rationale}")
```

--------------------------------

### Teacher GRPO Prompt Templates

Source: https://docs.arc.computer/training/configuration

Defines the prompt templates used for diagnostic, adaptive, and feedback prompts in the teacher GRPO trainer. These templates guide the model's responses and interactions.

```yaml
student_diagnostic_template: |
  Question: {question}
  Before solving, briefly describe:
  1. What type of problem this is
  2. The key concepts or steps needed
  3. Any potential challenges you see
```

```yaml
teacher_adaptive_template: |
  Question: {question}
  Student's approach: {approach}

  <thinking>
  [Analyze student approach]
  </thinking>

  <teaching>
  [Only guidance to student - no answers]
  </teaching>
```

```yaml
student_with_teaching_template: |
  Question: {question}
  A teacher has provided: {teaching}

  Now solve step by step.
  <solution></solution>
```

--------------------------------

### Configure Agent and Reward System

Source: https://docs.arc.computer/examples/adaptive-tool-use

Define the agent import path and the reward system judge prompt in the configuration.

```yaml
agent:
  type: python
  import_path: examples.mcp_tool_learning.mcp_agent
  attribute: create_agent
```

```yaml
rim:
  judge_prompt: |
    Reward effective tool usage:
    - Correct tool for each task
    - Minimal redundant operations
    - Proper error handling
```

--------------------------------

### Validate Trained Model with Transformers

Source: https://docs.arc.computer/training/offline/grpo-training

Load a trained teacher model and a baseline student model using the Transformers library to compare their performance on a given problem. Requires PyTorch and Transformers installed.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load your trained teacher
teacher = AutoModelForCausalLM.from_pretrained(
    "results/rl_checkpoint",
    torch_dtype=torch.float16,
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("results/rl_checkpoint")

# Load baseline student
student = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-4B-Instruct-2507",
    torch_dtype=torch.float16,
    device_map="auto"
)

# Test on a problem
problem = "A train travels 120 miles in 2 hours. What is its speed?"

# Get baseline (student only)
inputs = tokenizer(problem, return_tensors="pt").to(student.device)
baseline = student.generate(**inputs, max_new_tokens=100)
print(f"Baseline: {tokenizer.decode(baseline[0])}")

# Get teaching (using the atlas-sdk runtime loop)
# This gives you the enhanced response
```

--------------------------------

### Troubleshoot Async Event Loop Errors

Source: https://docs.arc.computer/examples/adaptive-tool-use

Run the learning harness with `python learning_harness.py` instead of `python -i` to avoid async event loop errors.

```python
Run with `python learning_harness.py` (not `python -i`)
```

--------------------------------

### Standard GRPO Training

Source: https://docs.arc.computer/api-reference/trainers

Initializes and executes a GRPOTrainer with specific configuration and reward functions.

```python
from atlas_core.training.algorithms.grpo import GRPOTrainer
from atlas_core.training.algorithms.grpo_config import GRPOConfig
from atlas_core.reward.interpretation import RIMReward

# Minimal training arguments (TrainingArguments requires an output_dir)
grpo_args = GRPOConfig(
    output_dir="./output/grpo",
    model_name_or_path="Arc-Intelligence/ATLAS-8B-Thinking",
    learning_rate=5e-6,
    num_train_epochs=3,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    beta=0.04,  # KL penalty
    logging_steps=10,
)

reward = RIMReward(config_path="reward_system/interpretation.yaml")

trainer = GRPOTrainer(
    model=grpo_args.model_name_or_path,
    reward_funcs=reward,
    args=grpo_args,
    train_dataset=train_data,
    eval_dataset=eval_data,
    processing_class=tokenizer,
)

trainer.train()
trainer.save_model("./output/grpo/checkpoint-final")
```

--------------------------------

### Quality Validation of Dataset

Source: https://docs.arc.computer/training/custom-datasets

Performs quality validation on a dataset using standard Python tooling and the `datasets` library. Calculates example count, average step length in tokens, and domain distribution.

```python
from collections import Counter
from datasets import Dataset

dataset = Dataset.from_list(records)
lengths = [len(example["step_trace"].split()) for example in dataset]
domains = Counter(example["session_metadata"].get("domain", "unknown") for example in dataset)

print(f"Examples: {len(dataset)}")
print(f"Avg step length: {sum(lengths)/len(lengths):.1f} tokens")
print(f"Domains: {domains}")
```

--------------------------------

### Monitor training metrics

Source: https://docs.arc.computer/benchmarks/reproduction

Commands for monitoring TensorBoard, server health, and GPU utilization.

```bash
# TensorBoard monitoring
tensorboard --logdir results/ --port 6006

# vLLM server health
watch -n 5 'curl -s http://localhost:8765/metrics'

# GPU utilization
nvidia-smi dmon -s u -d 5
```

--------------------------------

### Atlas Session Trace JSON Structure

Source: https://docs.arc.computer/sdk/export-traces

This is an example of the JSON structure for an Atlas Session Trace record. It includes details about the task, adaptive summary, triage dossier, plan, steps, rewards, and review status.

```json
{
  "task": "Summarize the latest Atlas SDK updates",
  "final_answer": "...",
  "adaptive_summary": {
    "adaptive_mode": "coach",
    "confidence": 0.58,
    "certification_run": false,
    "probe": {
      "mode": "coach",
      "confidence": 0.55,
      "evidence": ["persona_helpful_ratio=0.62", "risk_high_severity"]
    },
    "mode_history": [
      {"mode": "paired", "confidence": 0.71, "certification": true},
      {"mode": "coach", "confidence": 0.55}
    ]
  },
  "triage_dossier": {
    "task": "Summarize the latest Atlas SDK updates",
    "summary": "Capture highlights for stakeholders.",
    "risks": [{"category": "quality", "description": "Customer-facing copy", "severity": "moderate"}],
    "signals": [{"name": "tenant", "value": "demo"}],
    "tags": ["tenant:demo", "domain:sre"]
  },
  "plan": {"steps": [{"id": 1, "description": "Collect release notes"}, {"id": 2, "description": "Draft summary"}]},
  "steps": [
    {
      "step_id": 1,
      "description": "Collect release notes",
      "trace": "HUMAN: ...",
      "output": "...",
      "reward": {
        "score": 0.92,
        "judges": [
          {"identifier": "process", "score": 0.91, "rationale": "..."}
        ]
      },
      "guidance": ["Cite the release date."],
      "validation": {"valid": true, "rationale": "Complete"},
      "tool": "web_search",
      "tool_params": {"query": "Atlas SDK release notes"},
      "artifacts": {"sources": ["https://..."]},
      "deliverable": {"notes": ["https://..."]}
    }
  ],
  "session_reward": {
    "score": 0.88,
    "uncertainty": 0.07,
    "judges": [
      {"identifier": "process", "score": 0.90, "rationale": "..."}
    ]
  },
  "reward_summary": {"score": 0.88},
  "review_status": "approved",
  "personas_used": [
    {"persona": "planner", "instruction": "Focus on customer tone", "source": "memory"}
  ],
  "persona_updates": {
    "new_candidates": [
      {"persona": "planner", "instruction": "Mention adaptive modes", "tags": ["tenant:demo"]}
    ]
  },
  "session_metadata": {"batch": "aime-2025"}
}
```

--------------------------------

### Create Database Indexes for Performance

Source: https://docs.arc.computer/training/offline/training-data-pipeline

SQL commands to optimize reward filtering, date range queries, and metadata GIN indexing for session data.

```sql
-- Reward filtering (10-100x faster)
CREATE INDEX sessions_reward_score_idx
ON sessions ((reward_stats->>'score')::float);

-- Date range queries (50-100x faster)
CREATE INDEX sessions_created_at_idx
ON sessions (created_at DESC);

-- Learning key queries
CREATE INDEX sessions_metadata_gin_idx
ON sessions USING GIN (metadata);
```