### Manual Training Installation
Source: https://docs.arc.computer/installation
Step-by-step manual installation of dependencies for custom environments.
```bash
# Install PyTorch with CUDA support
python -m pip install torch==2.6.0 --index-url https://download.pytorch.org/whl/cu124
# Install vLLM and TensorBoard
python -m pip install vllm==0.8.3 tensorboard
# Install Flash Attention (for optimal performance)
python -m pip install flash-attn --no-build-isolation
# Install FlashInfer
python -m pip install flashinfer-python -i https://flashinfer.ai/whl/cu124/torch2.6/
# Install remaining dependencies
python -m pip install --upgrade -r requirements-py311.txt # or requirements-py312.txt
```
--------------------------------
### Run Automated Training Setup
Source: https://docs.arc.computer/installation
Validated installation scripts for specific Python versions.
```bash
bash scripts/install_py311.sh
```
```bash
bash scripts/install_py312.sh
```
--------------------------------
### Install Prerequisites
Source: https://docs.arc.computer/examples/adaptive-tool-use
Install the necessary Python packages and initialize the Atlas environment.
```bash
pip install arc-atlas langchain-mcp-adapters langchain-openai langgraph mcp anyio
export OPENAI_API_KEY=sk-...
export GEMINI_API_KEY=...
atlas init # Start Postgres for telemetry
```
--------------------------------
### Setup Environment and Verify Dependencies
Source: https://docs.arc.computer/benchmarks/reproduction
Commands to verify the Python environment, CUDA availability, and authenticate with Hugging Face.
```bash
# Python environment
python --version # 3.11 or 3.12 required
pip install -r requirements.txt
# Verify CUDA
nvidia-smi
python -c "import torch; print(f'PyTorch: {torch.__version__}')"
python -c "import torch; print(f'CUDA available: {torch.cuda.is_available()}')"
# Authenticate with Hugging Face
huggingface-cli login
```
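The version requirement in the comment above can also be checked from Python. This is a minimal sketch, assuming the supported interpreters are exactly 3.11 and 3.12 as that comment states; `is_supported_python` is a hypothetical helper and not part of Atlas.

```python
import sys

# Assumption: supported interpreters are Python 3.11 and 3.12, per the comment above.
SUPPORTED_VERSIONS = {(3, 11), (3, 12)}


def is_supported_python(version_info=sys.version_info) -> bool:
    """Return True if the major.minor pair is one of SUPPORTED_VERSIONS."""
    return tuple(version_info[:2]) in SUPPORTED_VERSIONS


if __name__ == "__main__":
    major, minor = sys.version_info[:2]
    status = "supported" if is_supported_python() else "unsupported"
    print(f"Python {major}.{minor}: {status}")
```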
--------------------------------
### Distributed Training with Multi-GPU Setup
Source: https://docs.arc.computer/api-reference/trainers
Configure the GRPOTrainer for multi-GPU distributed training with Hugging Face Accelerate. The trainer handles the distributed setup automatically when an `Accelerator` instance is provided.
```python
from atlas_core.training.algorithms.grpo import GRPOTrainer
from accelerate import Accelerator
accelerator = Accelerator()
trainer = GRPOTrainer(
    config=config,
    model=model,
    accelerator=accelerator
)
# Trainer automatically handles distributed setup
trainer.train()
```
--------------------------------
### Install Training Dependencies
Source: https://docs.arc.computer/training/offline/gkd-training
Install necessary Python packages for training, including PyTorch, TRL, and vLLM.
```bash
# For Python 3.11
pip install -r requirements-py311.txt
# For Python 3.12
pip install -r requirements-py312.txt
```
--------------------------------
### Install Flash Attention
Source: https://docs.arc.computer/reference/troubleshooting
Install flash-attn without build isolation to avoid compilation errors.
```bash
pip install flash-attn --no-build-isolation
```
--------------------------------
### Configure Learning Parameters
Source: https://docs.arc.computer/sdk/learning-system
Example YAML configuration for the learning module, including provider settings and history limits.
```yaml
learning:
  enabled: true
  update_enabled: true
  history_limit: 25
  session_note_enabled: false
  apply_to_prompts: true
  llm:
    provider: openai
    model: gpt-5-mini
    api_key_env: OPENAI_API_KEY
```
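The `api_key_env` field names an environment variable rather than holding the key itself. The sketch below shows one way such a lookup could work. It assumes `llm` is nested under `learning`, and `resolve_api_key` is a hypothetical helper; the SDK's own resolution logic may differ.

```python
import os


def resolve_api_key(learning_config: dict, env=None) -> str:
    """Return the value of the environment variable named by llm.api_key_env.

    Hypothetical helper for illustration only, not the SDK's implementation.
    """
    env = os.environ if env is None else env
    var_name = learning_config["llm"]["api_key_env"]
    if var_name not in env:
        raise RuntimeError(f"Environment variable {var_name} is not set")
    return env[var_name]


# The YAML above expressed as a plain dict (assumed nesting).
learning = {
    "enabled": True,
    "history_limit": 25,
    "llm": {
        "provider": "openai",
        "model": "gpt-5-mini",
        "api_key_env": "OPENAI_API_KEY",
    },
}
```

Passing a dict as `env` keeps the lookup testable without touching the real process environment.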
--------------------------------
### Start Docker Daemon on macOS and Linux
Source: https://docs.arc.computer/reference/troubleshooting
Commands to start the Docker daemon on macOS and Linux. Includes verification steps.
```bash
# macOS
open -a Docker
# Linux - check status
sudo systemctl status docker
# Linux - start Docker
sudo systemctl start docker
# Verify
docker ps
```
--------------------------------
### Download or Train Foundation Model
Source: https://docs.arc.computer/concepts/hybrid-learning
Use the CLI to download pre-trained weights or launch custom training scripts on multi-GPU setups.
```bash
# Option 1: Pre-trained
huggingface-cli download Arc-Intelligence/ATLAS-8B-Thinking
# Option 2: Custom (2+ GPUs)
scripts/launch.sh 2 src/atlas_core/configs/recipe/teacher_sft.yaml
scripts/launch_with_server.sh 1 1 src/atlas_core/configs/recipe/teacher_rcl.yaml
```
--------------------------------
### Clone Atlas Core Repository
Source: https://docs.arc.computer/sdk/quickstart
Initial setup commands to download the Atlas Core repository and navigate to the root directory.
```bash
git clone https://github.com/Arc-Computer/ATLAS.git
cd ATLAS
```
--------------------------------
### Callbacks and Monitoring
Source: https://docs.arc.computer/api-reference/trainers
Demonstrates how to configure and use callbacks with GRPOTrainer for monitoring training progress, including examples for WandbCallback and EarlyStoppingCallback.
```python
# Available callbacks
from transformers import EarlyStoppingCallback
from transformers.integrations import TensorBoardCallback, WandbCallback
from atlas_core.training.algorithms.grpo import GRPOTrainer

# Configure callbacks
callbacks = [
    WandbCallback(
        project="atlas-training",
        name="experiment-1"
    ),
    EarlyStoppingCallback(
        early_stopping_patience=3,
        early_stopping_threshold=0.001
    )
]
trainer = GRPOTrainer(
    config=config,
    callbacks=callbacks
)
```
--------------------------------
### Initialize Atlas with Bundled Postgres
Source: https://docs.arc.computer/reference/troubleshooting
Start a bundled Docker instance with Postgres using `atlas init`. This typically starts Postgres on localhost:5433.
```bash
atlas init
# Starts bundled Docker + Postgres on localhost:5433
```
--------------------------------
### Install Atlas SDK
Source: https://docs.arc.computer/
Install the required Python package for the Atlas SDK.
```bash
pip install arc-atlas
```
--------------------------------
### Setup Conda Environment
Source: https://docs.arc.computer/installation
Commands to create and configure an isolated Conda environment.
```bash
# Create environment
conda create -n atlas python=3.11
conda activate atlas
# Install PyTorch
conda install pytorch==2.6.0 pytorch-cuda=12.4 -c pytorch -c nvidia
# Run installation script
bash scripts/install_py311.sh
```
--------------------------------
### Optimize Training Session Queries
Source: https://docs.arc.computer/training/offline/training-data-pipeline
Python examples for selective data loading and asynchronous pagination to handle large datasets.
```python
# Use selective loading
sessions = get_training_sessions(
    db_url="postgresql://atlas:atlas@localhost:5433/atlas",
    include_trajectory_events=False,  # Skip if not needed
    limit=10000
)

# Use pagination for large datasets (must run inside an async function)
async for batch in paginate_sessions(
    db_url="postgresql://atlas:atlas@localhost:5433/atlas",
    batch_size=500,
    min_reward=0.8
):
    process_batch(batch)
```
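The pagination pattern above can be illustrated without a database. This is a minimal sketch in which an in-memory list stands in for Postgres. `paginate` and `collect_batch_sizes` are hypothetical names and say nothing about how `paginate_sessions` is implemented.

```python
import asyncio
from typing import AsyncIterator


async def paginate(items: list, batch_size: int) -> AsyncIterator[list]:
    """Yield successive batches of at most batch_size items.

    Stand-in for a database-backed paginator: a list replaces the query.
    """
    for start in range(0, len(items), batch_size):
        await asyncio.sleep(0)  # hand control back to the event loop between batches
        yield items[start:start + batch_size]


async def collect_batch_sizes(items: list, batch_size: int) -> list:
    """Consume the paginator and record the size of each batch."""
    sizes = []
    async for batch in paginate(items, batch_size):
        sizes.append(len(batch))
    return sizes


batch_sizes = asyncio.run(collect_batch_sizes(list(range(1200)), batch_size=500))
print(batch_sizes)  # [500, 500, 200]
```

Only one batch is held in memory at a time, which is the reason to prefer pagination over a single large `limit`.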
--------------------------------
### Customization and Task Definition
Source: https://docs.arc.computer/examples/adaptive-tool-use
Examples for adding custom MCP tools and modifying the learning task list.
```python
@server.call_tool()
async def database_query(query: str) -> str:
    """Execute safe database queries"""
    # Your implementation
    return results
```
```python
LEARNING_TASKS = [
"Your domain-specific task 1",
"Your domain-specific task 2",
# ... progressive complexity
]
```
--------------------------------
### Single GPU Configuration
Source: https://docs.arc.computer/installation
Commands for single-GPU inference and for memory-constrained training. Note that the training commands below still require at least two GPUs.
```bash
# Inference only with single GPU
python examples/quickstart/evaluate.py # Quick evaluation test
# For training with limited VRAM (requires 2+ GPUs)
scripts/launch.sh offload 2 src/atlas_core/configs/recipe/teacher_rcl.yaml
# Or use Zero-1 optimization
scripts/launch.sh zero1 2 src/atlas_core/configs/recipe/teacher_rcl.yaml
```
--------------------------------
### Multi-Node Training Setup
Source: https://docs.arc.computer/training/offline/grpo-training
Initiate multi-node training across multiple machines using torchrun. Ensure the master address and node configuration are correctly set.
```bash
# Node 1 (master)
torchrun \
--nproc_per_node=8 \
--nnodes=2 \
--node_rank=0 \
--master_addr=10.0.0.1 \
atlas-core train recipe@_global_=teacher_rcl
```
--------------------------------
### Launch vLLM Server
Source: https://docs.arc.computer/training/offline/grpo-training
Use this command to start the vLLM inference server. Ensure CUDA_VISIBLE_DEVICES is set correctly for your hardware. Adjust gpu-memory-utilization based on available VRAM.
```bash
CUDA_VISIBLE_DEVICES=0,1 \
python -m atlas_core.training.generation.vllm_server \
--model checkpoints/sft/final \
--port 8765 \
--tensor-parallel-size 2 \
--gpu-memory-utilization 0.9
```
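A quick pre-flight check can catch a mismatch between `CUDA_VISIBLE_DEVICES` and `--tensor-parallel-size` before the server starts. The sketch below assumes a single-node launch in which the tensor-parallel size cannot exceed the number of visible GPUs. Both helper names are hypothetical.

```python
def visible_gpu_count(cuda_visible_devices: str) -> int:
    """Count the device entries in a CUDA_VISIBLE_DEVICES-style string such as "0,1"."""
    return len([d for d in cuda_visible_devices.split(",") if d.strip()])


def check_tensor_parallel(cuda_visible_devices: str, tensor_parallel_size: int) -> None:
    """Raise ValueError if more tensor-parallel shards are requested than GPUs are visible."""
    available = visible_gpu_count(cuda_visible_devices)
    if tensor_parallel_size > available:
        raise ValueError(
            f"tensor-parallel-size {tensor_parallel_size} exceeds "
            f"{available} visible GPU(s)"
        )


# Matches the command above: two visible devices, tensor parallel size 2.
check_tensor_parallel("0,1", 2)
```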
--------------------------------
### Monitor GRPO Training with TensorBoard
Source: https://docs.arc.computer/training/offline/grpo-training
Start TensorBoard to visualize training progress. Ensure the log directory matches your training output.
```bash
tensorboard --logdir checkpoints/grpo --port 6006
```
--------------------------------
### Troubleshoot Postgres Connection Refused
Source: https://docs.arc.computer/examples/adaptive-tool-use
Start Postgres with `atlas init` or verify the `DATABASE_URL` in your `.env` file to resolve connection refused errors.
```bash
# Start the bundled Postgres instance
atlas init
# Or confirm that DATABASE_URL is defined in .env
grep DATABASE_URL .env
```
--------------------------------
### Resolve SDK Installation Issues
Source: https://docs.arc.computer/reference/troubleshooting
Commands to verify Python environment and reinstall the Atlas SDK.
```bash
# Check Python version
python --version # Should be 3.10+
# Create virtual environment with correct version
python3.12 -m venv .venv
source .venv/bin/activate
# Reinstall SDK
pip install --upgrade arc-atlas
```
```bash
# Ensure you're in the correct environment
which python
which pip
# Reinstall in current environment
pip uninstall arc-atlas -y
pip install arc-atlas
# Verify installation
python -c "import atlas; print(atlas.__version__)"
```
--------------------------------
### Initialize MCP Server Programmatically
Source: https://docs.arc.computer/reference/troubleshooting
Starts the MCP server process using Python before executing the Atlas task.
```python
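# Ensure the MCP server is started before the agent
import subprocess

mcp_process = subprocess.Popen(["python", "-m", "your_mcp_server"])
try:
    # Then run Atlas through its CLI
    subprocess.run(
        ["atlas", "run", "--config", "config.yaml", "--task", "Your task"],
        check=True,
    )
finally:
    mcp_process.terminate()  # Stop the MCP server once the run finishes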
```
--------------------------------
### Custom Trainer Implementation
Source: https://docs.arc.computer/api-reference/trainers
Provides an example of creating a custom trainer by extending the GRPOTrainer class, demonstrating how to override reward computation logic.
Create your own trainer by extending base classes:
```python
from atlas_core.training.algorithms.grpo import GRPOTrainer
import torch


class CustomRewardTrainer(GRPOTrainer):
    """Custom trainer with modified reward computation"""

    def compute_rewards(self, completions, prompts):
        """Override reward computation"""
        rewards = []
        for completion, prompt in zip(completions, prompts):
            # Custom reward logic
            reward = self.custom_reward_function(completion, prompt)
            rewards.append(reward)
        return torch.tensor(rewards)

    def custom_reward_function(self, completion, prompt):
        """Implement domain-specific rewards"""
        # Example: Length penalty
        length_penalty = min(1.0, len(completion) / 500)
        # Example: Quality score (quality_model is a placeholder; supply your own scorer)
        quality = self.quality_model(completion)
        return quality * length_penalty
```
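The length term in the example above is simple arithmetic and can be checked in isolation. The sketch below reproduces `min(1.0, len(completion) / 500)` as a standalone function. `length_penalty` and `combined_reward` are illustrative names, and the quality value is supplied by the caller rather than by a model.

```python
def length_penalty(completion: str, target_length: int = 500) -> float:
    """Grow linearly with completion length up to target_length characters, then cap at 1.0."""
    return min(1.0, len(completion) / target_length)


def combined_reward(quality: float, completion: str) -> float:
    """Multiply a caller-supplied quality score by the length factor."""
    return quality * length_penalty(completion)


print(length_penalty("x" * 250))   # 0.5
print(length_penalty("x" * 1000))  # 1.0
```

Despite its name, the factor only reduces the reward for completions shorter than 500 characters. Longer completions are not penalized.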
--------------------------------
### Domain-Specific Example Entries
Source: https://docs.arc.computer/reference/datasets
Sample JSON objects representing entries in the mathematics, code generation, and debugging subsets.
```json
{
  "prompt": "Sarah has 24 apples. She gives 1/3 to her brother...",
  "ground_truth": "12",
  "teaching": "Break down: 1) Calculate 1/3 of 24 = 8..."
}
```
```json
{
  "prompt": "Write a function to validate email addresses",
  "ground_truth": "def validate_email(email):...",
  "teaching": "Consider regex pattern, edge cases like..."
}
```
```json
{
  "prompt": "Service returns 503 errors intermittently",
  "ground_truth": "Check service mesh configuration...",
  "teaching": "Systematic approach: 1) Check Istio configs..."
}
```
--------------------------------
### Build a Custom GRPC Adapter
Source: https://docs.arc.computer/sdk/adapters
Implement a custom GRPC adapter by extending the AgentAdapter class and registering it. This example shows the basic structure for connecting to a gRPC service.
```python
from atlas.connectors.registry import AgentAdapter, register_adapter
from atlas.config.models import AdapterType
class GRPCAdapter(AgentAdapter):
    async def ainvoke(self, prompt: str, metadata: dict | None = None) -> str:
        # 1. Connect to your gRPC service.
        # 2. Build the request from the prompt.
        # 3. Execute the call and get a response.
        # 4. Return the response as a string.
        return f"Response for prompt: {prompt}"


# Assumes you've added GRPC to the AdapterType enum
register_adapter(AdapterType.GRPC, GRPCAdapter)
```
--------------------------------
### CLI Environment Verification
Source: https://docs.arc.computer/installation
Validates installation of accelerate, CUDA, and model file accessibility via CLI.
```bash
# Check accelerate installation
accelerate --version
# Verify CUDA
python -c "import torch; print(torch.cuda.is_available())"
# Test model access
huggingface-cli download Arc-Intelligence/ATLAS-8B-Thinking \
--include "*.json" \
--exclude "*.safetensors"
```
--------------------------------
### View GKD Training Command Line Output
Source: https://docs.arc.computer/training/offline/gkd-training
Example of the console output during an Atlas GKD training run showing epoch progress and baseline comparison metrics.
```text
Starting GKD training with Baseline Comparison reference: success=75.00%, tokens=1200
Loaded datasets: train=850, eval=150 conversations
AtlasGKDTrainer initialized with lmbda=1.0, beta=0.5
Epoch 1/3
Step 100: loss=0.245, eval_loss=0.312
✅ Baseline Comparison targets MET: success delta=12.3 pp, token reduction=35.2%
Epoch 2/3
Step 200: loss=0.198, eval_loss=0.276
✅ Baseline Comparison targets MET: success delta=14.1 pp, token reduction=38.7%
```
--------------------------------
### Verify Virtual Environment for Agent Discovery
Source: https://docs.arc.computer/reference/troubleshooting
Ensure the correct Python virtual environment is active and that necessary libraries like `langchain` are installed. Agent discovery runs within the current environment.
```bash
# Ensure correct environment is active
which python
pip list | grep langchain
# Discovery runs in your current environment
atlas env init --verbose
```
--------------------------------
### Live GRPO Training Metrics Output
Source: https://docs.arc.computer/training/offline/grpo-training
Example of live metrics output from the training log, showing step, reward, KL divergence, and non-degradation rate.
```text
{'step': 50, 'reward': 0.52, 'kl': 1.2, 'non_degrade': 0.96}
{'step': 100, 'reward': 0.61, 'kl': 1.4, 'non_degrade': 0.97}
{'step': 150, 'reward': 0.73, 'kl': 1.6, 'non_degrade': 0.98}
# Rewards should steadily increase over 24-36 hours
```
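Log lines in this shape are Python dict literals, so they can be parsed with the standard library to check the reward trend. This is a minimal sketch, assuming every non-comment line is a valid dict literal as in the sample above. `parse_metrics` and `reward_is_increasing` are hypothetical helpers.

```python
import ast

log_lines = [
    "{'step': 50, 'reward': 0.52, 'kl': 1.2, 'non_degrade': 0.96}",
    "{'step': 100, 'reward': 0.61, 'kl': 1.4, 'non_degrade': 0.97}",
    "{'step': 150, 'reward': 0.73, 'kl': 1.6, 'non_degrade': 0.98}",
    "# Rewards should steadily increase over 24-36 hours",
]


def parse_metrics(lines):
    """Parse dict-literal log lines, skipping blank lines and '#' comments."""
    records = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        records.append(ast.literal_eval(line))  # safe: evaluates literals only
    return records


def reward_is_increasing(records) -> bool:
    """Return True if each logged reward is strictly higher than the previous one."""
    rewards = [r["reward"] for r in records]
    return all(a < b for a, b in zip(rewards, rewards[1:]))


records = parse_metrics(log_lines)
print(reward_is_increasing(records))  # True
```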
--------------------------------
### Fix vLLM Installation
Source: https://docs.arc.computer/installation
Install missing system dependencies or use pre-built wheels to resolve vLLM installation failures.
```bash
# Install build dependencies
sudo apt-get install python3-dev
# Try pre-built wheel
pip install https://github.com/vllm-project/vllm/releases/download/v0.8.3/vllm-0.8.3-cp311-cp311-linux_x86_64.whl
```
--------------------------------
### Run Learning Session
Source: https://docs.arc.computer/examples/adaptive-tool-use
Execute the full 25-task learning harness.
```bash
cd examples/mcp_tool_learning
python learning_harness.py
```
--------------------------------
### Launch Training Script
Source: https://docs.arc.computer/reference/faq
Execute the training process using the provided launch script and configuration file.
```bash
scripts/launch.sh 8 src/atlas_core/configs/recipe/teacher_sft.yaml \
dataset_name=path/to/your/data
```
--------------------------------
### Launch SFT Warmup
Source: https://docs.arc.computer/benchmarks/reproduction
Command to initiate the supervised fine-tuning phase.
```bash
scripts/launch.sh 4 src/atlas_core/configs/recipe/teacher_sft.yaml \
dataset_id_or_path=Arc-Intelligence/Arc-ATLAS-Teach-v0 \
output_dir=results/pre_rl_model \
seed=42
```
--------------------------------
### Bootstrap Project with Autodiscovery
Source: https://docs.arc.computer/installation
Initialize the environment and run tasks using the Atlas CLI.
```bash
atlas env init --task "Summarize the latest AI news"
atlas run --config .atlas/generated_config.yaml --task "Summarize the latest AI news"
```
--------------------------------
### Verify CUDA Installation
Source: https://docs.arc.computer/installation
Checks for NVIDIA driver and CUDA availability.
```bash
nvidia-smi # Verify CUDA version
```
--------------------------------
### Get Session by ID
Source: https://docs.arc.computer/training/offline/training-data-pipeline
Retrieve a specific training session by its unique ID.
```APIDOC
## GET /training_data/sessions/{session_id}
### Description
Fetches a single training session identified by its ID.
### Method
GET
### Endpoint
/training_data/sessions/{session_id}
### Parameters
#### Path Parameters
- **session_id** (integer) - Required - The unique identifier of the session to retrieve.
#### Query Parameters
- **db_url** (string) - Required - The database connection URL.
### Response
#### Success Response (200)
- **session** (AtlasSessionTrace) - The training session object.
#### Response Example
{
  "session": {
    "session_reward": {"score": 0.92, "uncertainty": 0.02},
    "trajectory_events": [...],
    "student_learning": {...},
    "teacher_learning": {...},
    "learning_history": {...},
    "adaptive_summary": {...},
    "learning_key": "security-review-final",
    "drift_alert": null
  }
}
```
--------------------------------
### Verify Learning Persistence After Upgrade
Source: https://docs.arc.computer/sdk/learning-system
After upgrading the SDK, run a session that should trigger learning, restart the runtime, and then verify that the playbook loads from the database and the playbook hash in telemetry matches the registry.
```bash
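# 1. Run a session that should trigger learning
atlas run --config your_config.yaml --task "test task"
```
```bash
# 2. Restart the runtime, then run another session so the playbook loads from the database
atlas run --config your_config.yaml --task "another task"
```
```bash
# 3. Inspect the registry entry for your learning key
psql "$DATABASE_URL" -c "SELECT learning_key, updated_at FROM learning_registry WHERE learning_key='your-key';"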
```
--------------------------------
### Initialize AtlasGKDTrainer
Source: https://docs.arc.computer/training/offline/gkd-training
Python implementation for setting up the AtlasGKDTrainer with student and teacher models, configuration, and database connection.
```python
from atlas_core.training.algorithms.gkd_trainer import AtlasGKDTrainer
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import GKDConfig
student = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2-7B")
teacher = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2-14B")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2-7B")
args = GKDConfig(
    output_dir="outputs/gkd",
    per_device_train_batch_size=4,
    lmbda=1.0,
    beta=0.5,
)
trainer = AtlasGKDTrainer(
    model=student,
    teacher_model=teacher,
    args=args,
    db_url="postgresql://localhost:5432/atlas",
    min_reward=0.8,
    processing_class=tokenizer,
)
trainer.train()
trainer.save_model("outputs/gkd/final")
```
--------------------------------
### Launch SFT Training
Source: https://docs.arc.computer/training/offline/grpo-training
Execute SFT training across different GPU configurations using the launch script.
```bash
# Minimum (2 GPUs)
scripts/launch.sh 2 src/atlas_core/configs/recipe/teacher_sft.yaml \
output_dir=checkpoints/sft
# Recommended (4 GPUs)
scripts/launch.sh 4 src/atlas_core/configs/recipe/teacher_sft.yaml \
output_dir=checkpoints/sft
# Full production (8 GPUs)
scripts/launch.sh 8 src/atlas_core/configs/recipe/teacher_sft.yaml \
output_dir=checkpoints/sft
# Memory-constrained with offloading
scripts/launch.sh offload 2 src/atlas_core/configs/recipe/teacher_sft.yaml \
output_dir=checkpoints/sft
```
--------------------------------
### Get Training Sessions (Async)
Source: https://docs.arc.computer/training/offline/training-data-pipeline
Asynchronously query training sessions for high-throughput pipelines.
```APIDOC
## GET /training_data/sessions/async
### Description
Fetches training sessions asynchronously from a PostgreSQL database, suitable for high-throughput scenarios.
### Method
GET
### Endpoint
/training_data/sessions/async
### Parameters
#### Query Parameters
- **db_url** (string) - Required - The database connection URL.
- **min_reward** (float) - Optional - Minimum reward score to filter sessions.
- **limit** (integer) - Optional - The maximum number of sessions to return.
### Response
#### Success Response (200)
- **sessions** (list of AtlasSessionTrace) - A list of training session objects.
#### Response Example
{
  "sessions": [
    {
      "session_reward": {"score": 0.75, "uncertainty": 0.03},
      "trajectory_events": [...],
      "student_learning": {...},
      "teacher_learning": {...},
      "learning_history": {...},
      "adaptive_summary": {...},
      "learning_key": "task-batch-2",
      "drift_alert": null
    }
  ]
}
```
--------------------------------
### Launch minimal training
Source: https://docs.arc.computer/benchmarks/reproduction
Executes a short training run for testing purposes.
```bash
scripts/launch_with_server.sh 1 1 src/atlas_core/configs/recipe/teacher_rcl.yaml \
model_name_or_path=checkpoints/teacher \
max_steps=4 \
eval_steps=1 \
report_to=null
```
--------------------------------
### Run GRPO Training with vLLM Server
Source: https://docs.arc.computer/training/offline/grpo-training
Launch GRPO training using the `launch_with_server.sh` script, specifying the number of GPUs for training and vLLM. The first argument is training GPUs, the second is vLLM GPUs.
```bash
# Minimum (2 GPUs: 1 training, 1 vLLM)
scripts/launch_with_server.sh 1 1 src/atlas_core/configs/recipe/teacher_rcl.yaml \
model_name_or_path=checkpoints/sft/final
# Recommended (4 GPUs: 2 training, 2 vLLM)
scripts/launch_with_server.sh 2 2 src/atlas_core/configs/recipe/teacher_rcl.yaml \
model_name_or_path=checkpoints/sft/final
# Production (8 GPUs: 4 training, 4 vLLM)
scripts/launch_with_server.sh 4 4 src/atlas_core/configs/recipe/teacher_rcl.yaml \
model_name_or_path=checkpoints/sft/final
```
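The three configurations above all split the GPUs evenly between training and vLLM. The sketch below builds the same command lines from a total GPU count. It mirrors only the even splits shown above; uneven splits may also be valid, and both helper names are hypothetical.

```python
def split_gpus(total_gpus: int) -> tuple:
    """Split GPUs evenly into (training, vLLM), as in the 2/4/8-GPU examples above."""
    if total_gpus < 2 or total_gpus % 2 != 0:
        raise ValueError("expected an even GPU count of at least 2")
    half = total_gpus // 2
    return half, half


def launch_command(total_gpus: int, recipe: str, checkpoint: str) -> str:
    """Build the launch_with_server.sh command: training GPUs first, vLLM GPUs second."""
    train_gpus, vllm_gpus = split_gpus(total_gpus)
    return (
        f"scripts/launch_with_server.sh {train_gpus} {vllm_gpus} {recipe} "
        f"model_name_or_path={checkpoint}"
    )


print(launch_command(4, "src/atlas_core/configs/recipe/teacher_rcl.yaml", "checkpoints/sft/final"))
```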
--------------------------------
### Verify Python Version
Source: https://docs.arc.computer/installation
Confirms the installed Python version meets the minimum requirement of 3.10.
```bash
python --version
```
--------------------------------
### Get Training Sessions
Source: https://docs.arc.computer/training/offline/training-data-pipeline
Query training sessions directly from PostgreSQL with various filtering options.
```APIDOC
## GET /training_data/sessions
### Description
Fetches training sessions directly from a PostgreSQL database.
### Method
GET
### Endpoint
/training_data/sessions
### Parameters
#### Query Parameters
- **db_url** (string) - Required - The database connection URL.
- **min_reward** (float) - Optional - Minimum reward score to filter sessions.
- **max_reward** (float) - Optional - Maximum reward score to filter sessions.
- **learning_key** (string) - Optional - Filter sessions by a specific learning key.
- **status_filters** (list of strings) - Optional - Filter sessions by their status (e.g., "succeeded", "failed").
- **start_date** (datetime) - Optional - Filter sessions that started on or after this date.
- **end_date** (datetime) - Optional - Filter sessions that ended on or before this date.
- **include_trajectory_events** (boolean) - Optional - Whether to include trajectory events. Defaults to true.
- **include_learning_data** (boolean) - Optional - Whether to include learning data. Defaults to true.
- **limit** (integer) - Optional - The maximum number of sessions to return.
### Response
#### Success Response (200)
- **sessions** (list of AtlasSessionTrace) - A list of training session objects.
#### Response Example
{
  "sessions": [
    {
      "session_reward": {"score": 0.85, "uncertainty": 0.05},
      "trajectory_events": [...],
      "student_learning": {...},
      "teacher_learning": {...},
      "learning_history": {...},
      "adaptive_summary": {...},
      "learning_key": "security-review",
      "drift_alert": null
    }
  ]
}
```
--------------------------------
### Configure API Keys using a .env File
Source: https://docs.arc.computer/reference/troubleshooting
Create a .env file in your project root to store API keys. The SDK automatically loads these variables.
```bash
# Create .env file in project root
cat > .env << EOF
OPENAI_API_KEY=sk-...
GEMINI_API_KEY=...
ANTHROPIC_API_KEY=...
EOF
# SDK auto-loads .env files
atlas run --config config.yaml --task "Your task"
```
--------------------------------
### Configure API Keys
Source: https://docs.arc.computer/installation
Environment variables for training and runtime SDK authentication.
```bash
# Training stack
export HF_TOKEN="your-huggingface-token"
export WANDB_API_KEY="your-wandb-key" # Optional
# Runtime SDK
export ANTHROPIC_API_KEY="sk-ant-your-key" # Primary provider
export GEMINI_API_KEY="your-gemini-key" # Optional for rewards
```
--------------------------------
### Configure Atlas Database URL
Source: https://docs.arc.computer/reference/troubleshooting
Example of the `database_url` configuration for Atlas, specifying the connection string for a PostgreSQL database.
```yaml
# config.yaml
storage:
  database_url: postgresql://atlas:atlas@localhost:5433/atlas
```
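When a connection fails, it can help to confirm which host, port, and database the URL actually points at. The sketch below splits the connection string with Python's standard library. It is a debugging aid only and is not how Atlas itself reads the setting.

```python
from urllib.parse import urlsplit

database_url = "postgresql://atlas:atlas@localhost:5433/atlas"

parts = urlsplit(database_url)
components = {
    "scheme": parts.scheme,
    "user": parts.username,
    "host": parts.hostname,
    "port": parts.port,
    "database": parts.path.lstrip("/"),
}
print(components)
```

A port other than 5433 here would not match the bundled Postgres instance that `atlas init` starts.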
--------------------------------
### Configure GKD (Postgres path)
Source: https://docs.arc.computer/examples/gkd-dev-example
Trains directly from approved traces in Postgres. Ensure ATLAS_DB_URL is set. Override trainer parameters as needed.
```bash
export ATLAS_DB_URL="postgresql://user:pass@host:5432/atlas"
atlas-core train \
recipe@_global_=teacher_gkd \
teacher_model_name_or_path=Qwen/Qwen2.5-14B-Instruct \
model.model_name_or_path=Qwen/Qwen2.5-7B-Instruct \
trainer.min_reward=0.8
```
--------------------------------
### Filter Dataset by Domain
Source: https://docs.arc.computer/reference/datasets
Examples of filtering the dataset for specific domains like mathematics, code generation, or debugging.
```python
math_data = dataset.filter(lambda x: x['domain'] == 'math')
```
```python
code_data = dataset.filter(lambda x: x['domain'] == 'code')
```
```python
sre_data = dataset.filter(lambda x: x['domain'] == 'debug')
```
--------------------------------
### Launch DeepSpeed Training with Accelerate
Source: https://docs.arc.computer/api-reference/trainers
Initiate distributed training with DeepSpeed using the `accelerate launch` command. Specify the DeepSpeed configuration file and the training script. Presets for zero3 and CPU offloading are available.
```bash
# Default zero3
accelerate launch --config_file accelerate/deepspeed_zero3.yaml \
-m atlas_core.cli.train recipe@_global_=teacher_rcl
```
```bash
# CPU offload
accelerate launch --config_file accelerate/deepspeed_zero3_cpu_offloading.yaml \
-m atlas_core.cli.train recipe@_global_=teacher_rcl
```
--------------------------------
### Count Training Sessions
Source: https://docs.arc.computer/training/offline/training-data-pipeline
Get the count of training sessions matching specific criteria without loading full data.
```APIDOC
## GET /training_data/sessions/count
### Description
Retrieves the total count of training sessions that match the specified filters.
### Method
GET
### Endpoint
/training_data/sessions/count
### Parameters
#### Query Parameters
- **db_url** (string) - Required - The database connection URL.
- **min_reward** (float) - Optional - Minimum reward score to filter sessions.
- **learning_key** (string) - Optional - Filter sessions by a specific learning key.
### Response
#### Success Response (200)
- **total** (integer) - The total number of matching training sessions.
#### Response Example
{
  "total": 150
}
```
--------------------------------
### Load ATLAS-8B-Instruct Model
Source: https://docs.arc.computer/reference/models
Use this code to load the ATLAS-8B-Instruct model and tokenizer from Hugging Face. Ensure you have the transformers library installed.
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
model = AutoModelForCausalLM.from_pretrained(
    "Arc-Intelligence/ATLAS-8B-Instruct",
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(
    "Arc-Intelligence/ATLAS-8B-Instruct"
)
```
--------------------------------
### GRPOTrainer Class Initialization
Source: https://docs.arc.computer/api-reference/trainers
Details the parameters required to initialize the GRPOTrainer for reinforcement learning tasks.
```APIDOC
## GRPOTrainer Initialization
### Description
Initializes the GRPOTrainer with the necessary models, configuration, and datasets for policy optimization.
### Parameters
- **config** (GRPOConfig) - Required - Training configuration
- **model** (PreTrainedModel) - Required - Model to train (policy network)
- **ref_model** (PreTrainedModel) - Required - Reference model for KL penalty
- **tokenizer** (PreTrainedTokenizer) - Required - Tokenizer for encoding/decoding
- **train_dataset** (Dataset) - Required - Training data
- **eval_dataset** (Dataset) - Optional - Evaluation data
- **reward_model** (PreTrainedModel) - Optional - Optional external reward model
- **compute_metrics** (Callable) - Optional - Custom metrics function
- **callbacks** (List[TrainerCallback]) - Optional - Training callbacks
- **optimizers** (Tuple) - Optional - Custom optimizer and scheduler
```
--------------------------------
### Load ATLAS-8B-Thinking Model
Source: https://docs.arc.computer/reference/models
Use this code to load the ATLAS-8B-Thinking model and tokenizer from Hugging Face. Ensure you have the transformers library installed.
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
model = AutoModelForCausalLM.from_pretrained(
    "Arc-Intelligence/ATLAS-8B-Thinking",
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(
    "Arc-Intelligence/ATLAS-8B-Thinking"
)
```
--------------------------------
### Build and Run Docker Container
Source: https://docs.arc.computer/installation
Commands for building the training image and executing the offline pipeline.
```bash
docker build -t atlas-core:local .
```
```bash
docker run --rm \
-v "$(pwd)/exports:/data" \
atlas-core:local \
atlas-core offline-pipeline --export-path /data/traces.jsonl --dry-run
```
--------------------------------
### Multi-GPU Training Launch
Source: https://docs.arc.computer/installation
Distributed training commands for various GPU cluster sizes.
```bash
# Minimum 2 GPUs for RL training (1 for vLLM, 1 for training)
scripts/launch_with_server.sh 1 1 src/atlas_core/configs/recipe/teacher_rcl.yaml
# Production setup with 4 GPUs (2 for vLLM, 2 for training)
scripts/launch_with_server.sh 2 2 src/atlas_core/configs/recipe/teacher_rcl.yaml
# Full 8 GPU setup
scripts/launch_with_server.sh 4 4 src/atlas_core/configs/recipe/teacher_rcl.yaml
```
--------------------------------
### Verify Configuration Files
Source: https://docs.arc.computer/training/offline/gkd-training
Check for the existence of required GKD configuration files.
```bash
# Check required config files
ls src/atlas_core/configs/recipe/teacher_gkd.yaml
ls src/atlas_core/configs/trainer/gkd.yaml
# Expected output:
# src/atlas_core/configs/recipe/teacher_gkd.yaml
# src/atlas_core/configs/trainer/gkd.yaml
```
--------------------------------
### Count Training Sessions
Source: https://docs.arc.computer/training/offline/training-data-pipeline
Use `count_training_sessions` to count the training sessions that match specific criteria without loading the full records.
```python
from atlas.training_data import count_training_sessions
total = count_training_sessions(
    db_url="postgresql://atlas:atlas@localhost:5433/atlas",
    min_reward=0.8,
    learning_key="task-1"
)
print(f"Found {total} sessions matching criteria")
```
--------------------------------
### Session Evaluation JSON Structure
Source: https://docs.arc.computer/concepts/reward-design
Example of the structured evaluation output generated after a session, containing weighted principles, scores, and learning insights.
```json
{
  "principles": [
    {"name": "Correctness", "weight": 0.5, "description": "Final deliverable matches requirements"},
    {"name": "Safety", "weight": 0.3, "description": "No policy violations detected"},
    {"name": "Efficiency", "weight": 0.2, "description": "Minimal retries needed"}
  ],
  "score": 0.85,
  "rationale": "Response solves the task correctly with efficient execution",
  "uncertainty": 0.1,
  "student_learning": "For straightforward tasks, proceed directly to solution without exploratory steps",
  "teacher_learning": null
}
```
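A record of this shape can be sanity-checked before it is used downstream. The sketch below is illustrative only: `check_evaluation` is a hypothetical helper, and it assumes that principle weights are meant to sum to 1 and that `score` lies in [0, 1], neither of which the source states explicitly.

```python
import json
import math

# Abbreviated copy of the evaluation record shown above.
evaluation = json.loads("""
{
  "principles": [
    {"name": "Correctness", "weight": 0.5},
    {"name": "Safety", "weight": 0.3},
    {"name": "Efficiency", "weight": 0.2}
  ],
  "score": 0.85,
  "uncertainty": 0.1
}
""")


def check_evaluation(record: dict) -> float:
    """Hypothetical helper: validate a record and return its total principle weight."""
    total_weight = sum(p["weight"] for p in record["principles"])
    # Assumption: weights form a convex combination, so they should sum to 1.
    if not math.isclose(total_weight, 1.0, abs_tol=1e-9):
        raise ValueError(f"principle weights sum to {total_weight}, expected 1.0")
    # Assumption: scores are normalized to the range [0, 1].
    if not 0.0 <= record["score"] <= 1.0:
        raise ValueError(f"score {record['score']} is outside [0, 1]")
    return total_weight


total_weight = check_evaluation(evaluation)
print(f"Total principle weight: {total_weight:.2f}")
```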
--------------------------------
### Build GKD Dataset
Source: https://docs.arc.computer/training/offline/gkd-training
Initializes training and evaluation datasets from a PostgreSQL database using specified reward thresholds and learning keys.
```python
from atlas_core.data.gkd import build_gkd_dataset
train_ds, eval_ds = build_gkd_dataset(
    db_url="postgresql://localhost:5432/atlas",
    min_reward=0.8,
    learning_key="crm_workflows",
    eval_split=0.15,
)
```
--------------------------------
### SFT Warmup
Source: https://docs.arc.computer/api-reference/trainers
Performs supervised fine-tuning as a prerequisite step before RL training.
```python
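from trl import SFTConfig, SFTTrainer
# Supervised fine-tuning before RL.
# `model`, `tokenizer`, and `sft_dataset` are assumed to be defined earlier.
training_args = SFTConfig(output_dir="./output/sft")  # placeholder output path
trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=sft_dataset,
    tokenizer=tokenizer,
    max_seq_length=2048
)
trainer.train()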
```
--------------------------------
### GKD Validation Metrics Example
Source: https://docs.arc.computer/examples/gkd-dev-example
Inspect this JSON file for training loss, baseline and distilled evaluation metrics, and derived success delta and token reduction.
```json
{
"training": {"train_loss": 0.0294},
"baseline": {"accuracy": 0.758, "avg_generated_tokens": 210},
"distilled": {"accuracy": 0.815, "avg_generated_tokens": 180}
}
```
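The sample above lists only the raw metrics. One plausible way to compute the derived values is sketched below, assuming the success delta is the absolute accuracy difference and the token reduction is the relative drop in average generated tokens; the names `success_delta` and `token_reduction` are illustrative, not taken from the Atlas output.

```python
metrics = {
    "training": {"train_loss": 0.0294},
    "baseline": {"accuracy": 0.758, "avg_generated_tokens": 210},
    "distilled": {"accuracy": 0.815, "avg_generated_tokens": 180},
}

baseline = metrics["baseline"]
distilled = metrics["distilled"]

# Assumed definition: absolute change in accuracy (distilled minus baseline).
success_delta = distilled["accuracy"] - baseline["accuracy"]

# Assumed definition: fraction of generated tokens saved relative to the baseline.
token_reduction = 1 - distilled["avg_generated_tokens"] / baseline["avg_generated_tokens"]

print(f"Success delta: {success_delta:+.3f}")      # +0.057
print(f"Token reduction: {token_reduction:.1%}")   # 14.3%
```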
--------------------------------
### Configure Hugging Face Environment
Source: https://docs.arc.computer/reference/troubleshooting
Manage authentication and cache settings for Hugging Face.
```bash
# Login to Hugging Face
huggingface-cli login
# Set cache directory if disk space limited
export HF_HOME=/path/to/cache
# Use offline mode once models and datasets are already cached locally
export HF_DATASETS_OFFLINE=1
export TRANSFORMERS_OFFLINE=1
```
--------------------------------
### Generate teaching guidance with TeacherGRPOTrainer
Source: https://docs.arc.computer/api-reference/trainers
Create tailored teaching guidance based on diagnostic results. The output format varies based on the student's capability level.
```python
def generate_guidance(
    self,
    task: str,
    diagnostic: DiagnosticResult,
    max_tokens: int = 200
) -> str:
    """
    Generate teaching guidance based on diagnosis

    Args:
        task: Original task or problem statement
        diagnostic: Results from diagnostic_probe()
        max_tokens: Maximum tokens for guidance response

    Returns:
        str: Tailored teaching guidance text
        Format depends on capability_level:
        - Low (0.0-0.3): Step-by-step walkthrough
        - Medium (0.3-0.7): Hints and scaffolding
        - High (0.7-1.0): Minimal guidance or verification

    Raises:
        ValueError: If max_tokens < 10 or diagnostic is None
        RuntimeError: If teacher model fails to generate guidance
        TypeError: If task is not a string

    Example:
        diagnostic = trainer.diagnostic_probe("Solve quadratic equation")
        guidance = trainer.generate_guidance(
            "Solve: x² - 5x + 6 = 0",
            diagnostic,
            max_tokens=150
        )
        print(f"Teaching guidance: {guidance}")
    """
```
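The capability bands in the docstring can be read as a simple threshold lookup. The sketch below is not the trainer's implementation: `guidance_style` is a hypothetical helper, and because the documented ranges overlap at 0.3 and 0.7, it assumes those boundary values fall into the higher band.

```python
def guidance_style(capability_level: float) -> str:
    """Hypothetical helper: map a capability level in [0.0, 1.0] to a guidance style."""
    if not 0.0 <= capability_level <= 1.0:
        raise ValueError("capability_level must be between 0.0 and 1.0")
    # Assumption: the boundary values 0.3 and 0.7 belong to the higher band.
    if capability_level < 0.3:
        return "step-by-step walkthrough"
    if capability_level < 0.7:
        return "hints and scaffolding"
    return "minimal guidance or verification"


for level in (0.1, 0.5, 0.95):
    print(f"{level:.2f} -> {guidance_style(level)}")
```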
--------------------------------
### Environment Discovery Commands
Source: https://docs.arc.computer/sdk/cli-reference
Commands for initializing and scaffolding agent environments.
```APIDOC
## atlas env init
### Description
Discovers agents and environments, populating the .atlas/ directory. Uses Claude Haiku 4.5 for agent ranking.
### Parameters
#### Flags
- **--task** (string) - Optional - Task description for discovery
- **--scaffold-config-full** (boolean) - Optional - Generate full configuration
- **--no-run** (boolean) - Optional - Skip execution
- **--timeout** (integer) - Optional - Timeout in seconds (default 240)
## atlas env scaffold
### Description
Seeds projects with reference factories using LangGraph templates.
### Parameters
#### Flags
- **--template** (string) - Optional - Template name
- **--output** (string) - Optional - Output directory
- **--force** (boolean) - Optional - Overwrite existing files
```
--------------------------------
### Evaluate a Single Interaction with RIMReward
Source: https://docs.arc.computer/training/reward-system-usage
Use the evaluate method of RIMReward for quick assessments of teaching effectiveness. Provide prompt, response, baseline, and teacher traces to get a score and rationale.
```python
from atlas_core.reward.interpretation import RIMReward
# Create reward system
reward = RIMReward(config_path='reward_system/interpretation.yaml')
# Evaluate a single interaction
result = reward.evaluate(
    prompt="What is 2+2?",
    response="The answer is 4.",
    baseline_solutions="It is 4",
    teacher_traces="Explain your reasoning step by step",
)
print(f"Score: {result.score}")
print(f"Per-judge: {result.judge_scores}")
print(f"Rationale:\n{result.rationale}")
```
--------------------------------
### Teacher GRPO Prompt Templates
Source: https://docs.arc.computer/training/configuration
Defines the prompt templates used for diagnostic, adaptive, and feedback prompts in the teacher GRPO trainer. These templates guide the model's responses and interactions.
```yaml
student_diagnostic_template: |
  Question: {question}
  Before solving, briefly describe:
  1. What type of problem this is
  2. The key concepts or steps needed
  3. Any potential challenges you see
```
```yaml
teacher_adaptive_template: |
  Question: {question}
  Student's approach: {approach}
  [Analyze student approach]
  [Only guidance to student - no answers]
```
```yaml
student_with_teaching_template: |
  Question: {question}
  A teacher has provided: {teaching}
  Now solve step by step.
```
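The `{question}`, `{approach}`, and `{teaching}` placeholders look like Python format fields. The sketch below shows how two of the templates could be filled under that assumption; the trainer's actual substitution mechanism is not shown in the source, and the teaching text is made up for illustration.

```python
# Template text copied from the YAML above; constant names are illustrative.
STUDENT_DIAGNOSTIC_TEMPLATE = (
    "Question: {question}\n"
    "Before solving, briefly describe:\n"
    "1. What type of problem this is\n"
    "2. The key concepts or steps needed\n"
    "3. Any potential challenges you see\n"
)
STUDENT_WITH_TEACHING_TEMPLATE = (
    "Question: {question}\n"
    "A teacher has provided: {teaching}\n"
    "Now solve step by step.\n"
)

question = "A train travels 120 miles in 2 hours. What is its speed?"

# Assumption: placeholders are filled with str.format-style substitution.
diagnostic_prompt = STUDENT_DIAGNOSTIC_TEMPLATE.format(question=question)
final_prompt = STUDENT_WITH_TEACHING_TEMPLATE.format(
    question=question,
    teaching="Recall that speed is distance divided by time.",  # hypothetical guidance
)

print(diagnostic_prompt)
print(final_prompt)
```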
--------------------------------
### Configure Agent and Reward System
Source: https://docs.arc.computer/examples/adaptive-tool-use
Define the agent import path and the reward system judge prompt in the configuration.
```yaml
agent:
  type: python
  import_path: examples.mcp_tool_learning.mcp_agent
  attribute: create_agent
```
```yaml
rim:
  judge_prompt: |
    Reward effective tool usage:
    - Correct tool for each task
    - Minimal redundant operations
    - Proper error handling
```
--------------------------------
### Validate Trained Model with Transformers
Source: https://docs.arc.computer/training/offline/grpo-training
Load a trained teacher model and a baseline student model using the Transformers library to compare their performance on a given problem. Requires PyTorch and Transformers installed.
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
# Load your trained teacher
teacher = AutoModelForCausalLM.from_pretrained(
    "results/rl_checkpoint",
    torch_dtype=torch.float16,
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("results/rl_checkpoint")
# Load baseline student
student = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-4B-Instruct-2507",
    torch_dtype=torch.float16,
    device_map="auto"
)
# Test on a problem
problem = "A train travels 120 miles in 2 hours. What is its speed?"
# Get baseline (student only)
inputs = tokenizer(problem, return_tensors="pt").to(student.device)
baseline = student.generate(**inputs, max_new_tokens=100)
print(f"Baseline: {tokenizer.decode(baseline[0])}")
# Get teaching (using the atlas-sdk runtime loop)
# This gives you the enhanced response
```
--------------------------------
### Troubleshoot Async Event Loop Errors
Source: https://docs.arc.computer/examples/adaptive-tool-use
Run the learning harness with `python learning_harness.py` instead of `python -i` to avoid async event loop errors.
```bash
# Run as a script, not under `python -i`, to avoid async event loop errors
python learning_harness.py
```
--------------------------------
### Standard GRPO Training
Source: https://docs.arc.computer/api-reference/trainers
Initializes and executes a GRPOTrainer with specific configuration and reward functions.
```python
from atlas_core.training.algorithms.grpo import GRPOTrainer
from atlas_core.training.algorithms.grpo_config import GRPOConfig
from atlas_core.reward.interpretation import RIMReward
# Minimal training arguments (TrainingArguments requires an output_dir)
grpo_args = GRPOConfig(
    output_dir="./output/grpo",
    model_name_or_path="Arc-Intelligence/ATLAS-8B-Thinking",
    learning_rate=5e-6,
    num_train_epochs=3,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    beta=0.04,  # KL penalty
    logging_steps=10,
)
reward = RIMReward(config_path="reward_system/interpretation.yaml")
trainer = GRPOTrainer(
    model=grpo_args.model_name_or_path,
    reward_funcs=reward,
    args=grpo_args,
    train_dataset=train_data,
    eval_dataset=eval_data,
    processing_class=tokenizer,
)
trainer.train()
trainer.save_model("./output/grpo/checkpoint-final")
```
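The batch settings above interact with the number of training GPUs. As a rough sketch, assuming `GRPOConfig` follows the usual Hugging Face `TrainingArguments` semantics (samples per optimizer step = per-device batch size × gradient accumulation steps × number of training processes), the effective batch size can be estimated as follows; `effective_batch_size` is a hypothetical helper, not part of `atlas_core`.

```python
def effective_batch_size(
    per_device_train_batch_size: int,
    gradient_accumulation_steps: int,
    num_training_gpus: int,
) -> int:
    """Hypothetical helper: samples contributing to one optimizer step."""
    return per_device_train_batch_size * gradient_accumulation_steps * num_training_gpus


# Values from the config above, for 1, 2, and 4 training GPUs.
for gpus in (1, 2, 4):
    size = effective_batch_size(4, 4, gpus)
    print(f"{gpus} training GPU(s): effective batch size {size}")
```

GPUs reserved for the vLLM server are assumed not to count toward `num_training_gpus`.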
--------------------------------
### Quality Validation of Dataset
Source: https://docs.arc.computer/training/custom-datasets
Performs quality validation on a dataset using standard Python tooling and the `datasets` library. Calculates the example count, the average step length in whitespace-separated tokens (an approximation of model tokens), and the domain distribution.
```python
from collections import Counter
from datasets import Dataset
dataset = Dataset.from_list(records)
lengths = [len(example["step_trace"].split()) for example in dataset]
domains = Counter(example["session_metadata"].get("domain", "unknown") for example in dataset)
print(f"Examples: {len(dataset)}")
print(f"Avg step length: {sum(lengths)/len(lengths):.1f} tokens")
print(f"Domains: {domains}")
```
--------------------------------
### Monitor training metrics
Source: https://docs.arc.computer/benchmarks/reproduction
Commands for monitoring TensorBoard, server health, and GPU utilization.
```bash
# TensorBoard monitoring
tensorboard --logdir results/ --port 6006
# vLLM server health
watch -n 5 'curl -s http://localhost:8765/metrics'
# GPU utilization
nvidia-smi dmon -s u -d 5
```
--------------------------------
### Atlas Session Trace JSON Structure
Source: https://docs.arc.computer/sdk/export-traces
This is an example of the JSON structure for an Atlas Session Trace record. It includes details about the task, adaptive summary, triage dossier, plan, steps, rewards, and review status.
```json
{
  "task": "Summarize the latest Atlas SDK updates",
  "final_answer": "...",
  "adaptive_summary": {
    "adaptive_mode": "coach",
    "confidence": 0.58,
    "certification_run": false,
    "probe": {
      "mode": "coach",
      "confidence": 0.55,
      "evidence": ["persona_helpful_ratio=0.62", "risk_high_severity"]
    },
    "mode_history": [
      {"mode": "paired", "confidence": 0.71, "certification": true},
      {"mode": "coach", "confidence": 0.55}
    ]
  },
  "triage_dossier": {
    "task": "Summarize the latest Atlas SDK updates",
    "summary": "Capture highlights for stakeholders.",
    "risks": [{"category": "quality", "description": "Customer-facing copy", "severity": "moderate"}],
    "signals": [{"name": "tenant", "value": "demo"}],
    "tags": ["tenant:demo", "domain:sre"]
  },
  "plan": {"steps": [{"id": 1, "description": "Collect release notes"}, {"id": 2, "description": "Draft summary"}]},
  "steps": [
    {
      "step_id": 1,
      "description": "Collect release notes",
      "trace": "HUMAN: ...",
      "output": "...",
      "reward": {
        "score": 0.92,
        "judges": [
          {"identifier": "process", "score": 0.91, "rationale": "..."}
        ]
      },
      "guidance": ["Cite the release date."],
      "validation": {"valid": true, "rationale": "Complete"},
      "tool": "web_search",
      "tool_params": {"query": "Atlas SDK release notes"},
      "artifacts": {"sources": ["https://..."]},
      "deliverable": {"notes": ["https://..."]}
    }
  ],
  "session_reward": {
    "score": 0.88,
    "uncertainty": 0.07,
    "judges": [
      {"identifier": "process", "score": 0.90, "rationale": "..."}
    ]
  },
  "reward_summary": {"score": 0.88},
  "review_status": "approved",
  "personas_used": [
    {"persona": "planner", "instruction": "Focus on customer tone", "source": "memory"}
  ],
  "persona_updates": {
    "new_candidates": [
      {"persona": "planner", "instruction": "Mention adaptive modes", "tags": ["tenant:demo"]}
    ]
  },
  "session_metadata": {"batch": "aime-2025"}
}
```
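Records in this shape can be filtered with standard-library JSON tooling. The sketch below assumes exports are stored as JSON Lines (one record per line); `select_sessions` is a hypothetical helper, and the second record, including its `pending` status, is invented for illustration.

```python
import json

# Two abbreviated records in the shape shown above; real exports carry more fields.
jsonl_text = "\n".join([
    json.dumps({
        "task": "Summarize the latest Atlas SDK updates",
        "session_reward": {"score": 0.88, "uncertainty": 0.07},
        "review_status": "approved",
    }),
    json.dumps({
        "task": "Hypothetical low-reward task",
        "session_reward": {"score": 0.41, "uncertainty": 0.20},
        "review_status": "pending",
    }),
])


def select_sessions(lines, min_reward):
    """Hypothetical helper: keep approved sessions whose session reward meets the threshold."""
    selected = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)
        score = record.get("session_reward", {}).get("score")
        if score is None or score < min_reward:
            continue
        if record.get("review_status") != "approved":
            continue
        selected.append(record)
    return selected


selected = select_sessions(jsonl_text.splitlines(), min_reward=0.8)
for record in selected:
    print(record["task"], record["session_reward"]["score"])
```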
--------------------------------
### Create Database Indexes for Performance
Source: https://docs.arc.computer/training/offline/training-data-pipeline
SQL commands to optimize reward filtering, date range queries, and metadata GIN indexing for session data.
```sql
-- Reward filtering (10-100x faster)
CREATE INDEX sessions_reward_score_idx
ON sessions (((reward_stats->>'score')::float));
-- Date range queries (50-100x faster)
CREATE INDEX sessions_created_at_idx
ON sessions (created_at DESC);
-- Learning key queries
CREATE INDEX sessions_metadata_gin_idx
ON sessions USING GIN (metadata);
```