### Manual Training Installation

Source: https://docs.arc.computer/installation

Step-by-step manual installation of dependencies for custom environments.

```bash
# Install PyTorch with CUDA support
python -m pip install torch==2.6.0 --index-url https://download.pytorch.org/whl/cu124

# Install vLLM and TensorBoard
python -m pip install vllm==0.8.3 tensorboard

# Install Flash Attention (for optimal performance)
python -m pip install flash-attn --no-build-isolation

# Install FlashInfer
python -m pip install flashinfer-python -i https://flashinfer.ai/whl/cu124/torch2.6/

# Install remaining dependencies
python -m pip install --upgrade -r requirements-py311.txt  # or requirements-py312.txt
```

--------------------------------

### Run Automated Training Setup

Source: https://docs.arc.computer/installation

Validated installation scripts for specific Python versions.

```bash
bash scripts/install_py311.sh
```

```bash
bash scripts/install_py312.sh
```

--------------------------------

### Install Prerequisites

Source: https://docs.arc.computer/examples/adaptive-tool-use

Install the necessary Python packages and initialize the Atlas environment.

```bash
pip install arc-atlas langchain-mcp-adapters langchain-openai langgraph mcp anyio
export OPENAI_API_KEY=sk-...
export GEMINI_API_KEY=...
atlas init  # Start Postgres for telemetry
```

--------------------------------

### Setup Environment and Verify Dependencies

Source: https://docs.arc.computer/benchmarks/reproduction

Commands to verify the Python environment and CUDA availability, and to authenticate with Hugging Face.
```bash
# Python environment
python --version  # 3.11 or 3.12 required
pip install -r requirements.txt

# Verify CUDA
nvidia-smi
python -c "import torch; print(f'PyTorch: {torch.__version__}')"
python -c "import torch; print(f'CUDA available: {torch.cuda.is_available()}')"

# Authenticate with Hugging Face
huggingface-cli login
```

--------------------------------

### Distributed Training with Multi-GPU Setup

Source: https://docs.arc.computer/api-reference/trainers

Configure the GRPOTrainer for multi-GPU distributed training using the Hugging Face Accelerate library. The trainer automatically handles the distributed setup when an `Accelerator` instance is provided.

```python
from atlas_core.training.algorithms.grpo import GRPOTrainer
from accelerate import Accelerator

accelerator = Accelerator()

trainer = GRPOTrainer(
    config=config,
    model=model,
    accelerator=accelerator
)

# Trainer automatically handles distributed setup
trainer.train()
```

--------------------------------

### Install Training Dependencies

Source: https://docs.arc.computer/training/offline/gkd-training

Install necessary Python packages for training, including PyTorch, TRL, and vLLM.

```bash
# For Python 3.11
pip install -r requirements-py311.txt

# For Python 3.12
pip install -r requirements-py312.txt
```

--------------------------------

### Install Flash Attention

Source: https://docs.arc.computer/reference/troubleshooting

Install flash-attn without build isolation to avoid compilation errors.

```bash
pip install flash-attn --no-build-isolation
```

--------------------------------

### Configure Learning Parameters

Source: https://docs.arc.computer/sdk/learning-system

Example YAML configuration for the learning module, including provider settings and history limits.
```yaml
learning:
  enabled: true
  update_enabled: true
  history_limit: 25
  session_note_enabled: false
  apply_to_prompts: true
  llm:
    provider: openai
    model: gpt-5-mini
    api_key_env: OPENAI_API_KEY
```

--------------------------------

### Start Docker Daemon on macOS and Linux

Source: https://docs.arc.computer/reference/troubleshooting

Commands to start the Docker daemon on macOS and Linux. Includes verification steps.

```bash
# macOS
open -a Docker

# Linux - check status
sudo systemctl status docker

# Linux - start Docker
sudo systemctl start docker

# Verify
docker ps
```

--------------------------------

### Download or Train Foundation Model

Source: https://docs.arc.computer/concepts/hybrid-learning

Use the CLI to download pre-trained weights or launch custom training scripts on multi-GPU setups.

```bash
# Option 1: Pre-trained
huggingface-cli download Arc-Intelligence/ATLAS-8B-Thinking

# Option 2: Custom (2+ GPUs)
scripts/launch.sh 2 src/atlas_core/configs/recipe/teacher_sft.yaml
scripts/launch_with_server.sh 1 1 src/atlas_core/configs/recipe/teacher_rcl.yaml
```

--------------------------------

### Clone Atlas Core Repository

Source: https://docs.arc.computer/sdk/quickstart

Initial setup commands to download the Atlas Core repository and navigate to the root directory.

```bash
git clone https://github.com/Arc-Computer/ATLAS.git
cd ATLAS
```

--------------------------------

### Callbacks and Monitoring

Source: https://docs.arc.computer/api-reference/trainers

Demonstrates how to configure and use callbacks with GRPOTrainer for monitoring training progress, including examples for WandbCallback and EarlyStoppingCallback.
```APIDOC
## Callbacks and Monitoring

### Available Callbacks

```python
from transformers import EarlyStoppingCallback
from transformers.integrations import TensorBoardCallback, WandbCallback

# Configure callbacks
callbacks = [
    WandbCallback(
        project="atlas-training",
        name="experiment-1"
    ),
    EarlyStoppingCallback(
        early_stopping_patience=3,
        early_stopping_threshold=0.001
    )
]

trainer = GRPOTrainer(
    config=config,
    callbacks=callbacks
)
```
```

--------------------------------

### Initialize Atlas with Bundled Postgres

Source: https://docs.arc.computer/reference/troubleshooting

Start a bundled Docker instance with Postgres using `atlas init`. This typically starts Postgres on localhost:5433.

```bash
atlas init  # Starts bundled Docker + Postgres on localhost:5433
```

--------------------------------

### Install Atlas SDK

Source: https://docs.arc.computer/

Install the required Python package for the Atlas SDK.

```bash
pip install arc-atlas
```

--------------------------------

### Setup Conda Environment

Source: https://docs.arc.computer/installation

Commands to create and configure an isolated Conda environment.

```bash
# Create environment
conda create -n atlas python=3.11
conda activate atlas

# Install PyTorch
conda install pytorch==2.6.0 pytorch-cuda=12.4 -c pytorch -c nvidia

# Run installation script
bash scripts/install_py311.sh
```

--------------------------------

### Optimize Training Session Queries

Source: https://docs.arc.computer/training/offline/training-data-pipeline

Python examples for selective data loading and asynchronous pagination to handle large datasets.
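Asynchronous pagination relies on `async for`, which only runs inside a coroutine. As a minimal, stand-alone sketch of that pattern (the `paginate` generator and the sample rows are hypothetical stand-ins, not the Atlas `paginate_sessions` API), batches can be consumed like this:

```python
import asyncio
from typing import AsyncIterator


async def paginate(rows: list[dict], batch_size: int, min_reward: float) -> AsyncIterator[list[dict]]:
    """Hypothetical stand-in for a paginated query: yields filtered rows in batches."""
    kept = [row for row in rows if row["reward"] >= min_reward]
    for start in range(0, len(kept), batch_size):
        await asyncio.sleep(0)  # hand control back to the loop, as a real database call would
        yield kept[start:start + batch_size]


async def collect_batch_sizes(rows: list[dict], batch_size: int, min_reward: float) -> list[int]:
    """Consume the async generator and record how many rows each batch held."""
    sizes = []
    async for batch in paginate(rows, batch_size, min_reward):
        sizes.append(len(batch))
    return sizes


# Illustrative data only: seven sessions with made-up reward scores
rewards = [0.5, 0.9, 0.85, 0.3, 0.95, 0.8, 0.99]
rows = [{"id": i, "reward": r} for i, r in enumerate(rewards)]

batch_sizes = asyncio.run(collect_batch_sizes(rows, batch_size=2, min_reward=0.8))
print(batch_sizes)
```

The same shape applies to the Atlas calls: wrap the `async for` loop in a coroutine and start it with `asyncio.run`.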
```python
# Use selective loading
sessions = get_training_sessions(
    db_url="postgresql://atlas:atlas@localhost:5433/atlas",
    include_trajectory_events=False,  # Skip if not needed
    limit=10000
)

# Use pagination for large datasets
# (async for must run inside an async function)
async for batch in paginate_sessions(
    db_url="postgresql://atlas:atlas@localhost:5433/atlas",
    batch_size=500,
    min_reward=0.8
):
    process_batch(batch)
```

--------------------------------

### Customization and Task Definition

Source: https://docs.arc.computer/examples/adaptive-tool-use

Examples for adding custom MCP tools and modifying the learning task list.

```python
@server.call_tool()
async def database_query(query: str) -> str:
    """Execute safe database queries"""
    # Your implementation
    return results
```

```python
LEARNING_TASKS = [
    "Your domain-specific task 1",
    "Your domain-specific task 2",
    # ... progressive complexity
]
```

--------------------------------

### Single GPU Configuration

Source: https://docs.arc.computer/installation

Commands for inference and memory-constrained training on a single GPU setup.

```bash
# Inference only with single GPU
python examples/quickstart/evaluate.py  # Quick evaluation test

# For training with limited VRAM (requires 2+ GPUs)
scripts/launch.sh offload 2 src/atlas_core/configs/recipe/teacher_rcl.yaml

# Or use Zero-1 optimization
scripts/launch.sh zero1 2 src/atlas_core/configs/recipe/teacher_rcl.yaml
```

--------------------------------

### Multi-Node Training Setup

Source: https://docs.arc.computer/training/offline/grpo-training

Initiate multi-node training across multiple machines using torchrun. Ensure the master address and node configuration are correctly set.

```bash
# Node 1 (master)
torchrun \
  --nproc_per_node=8 \
  --nnodes=2 \
  --node_rank=0 \
  --master_addr=10.0.0.1 \
  atlas-core train recipe@_global_=teacher_rcl
```

--------------------------------

### Launch vLLM Server

Source: https://docs.arc.computer/training/offline/grpo-training

Use this command to start the vLLM inference server.
Ensure CUDA_VISIBLE_DEVICES is set correctly for your hardware. Adjust gpu-memory-utilization based on available VRAM.

```bash
CUDA_VISIBLE_DEVICES=0,1 \
python -m atlas_core.training.generation.vllm_server \
  --model checkpoints/sft/final \
  --port 8765 \
  --tensor-parallel-size 2 \
  --gpu-memory-utilization 0.9
```

--------------------------------

### Monitor GRPO Training with TensorBoard

Source: https://docs.arc.computer/training/offline/grpo-training

Start TensorBoard to visualize training progress. Ensure the log directory matches your training output.

```bash
tensorboard --logdir checkpoints/grpo --port 6006
```

--------------------------------

### Troubleshoot Postgres Connection Refused

Source: https://docs.arc.computer/examples/adaptive-tool-use

Start Postgres with `atlas init` or verify the `DATABASE_URL` in your `.env` file to resolve connection refused errors.

```bash
# Start Postgres
atlas init

# Or verify DATABASE_URL in .env
grep DATABASE_URL .env
```

--------------------------------

### Resolve SDK Installation Issues

Source: https://docs.arc.computer/reference/troubleshooting

Commands to verify Python environment and reinstall the Atlas SDK.

```bash
# Check Python version
python --version  # Should be 3.10+

# Create virtual environment with correct version
python3.12 -m venv .venv
source .venv/bin/activate

# Reinstall SDK
pip install --upgrade arc-atlas
```

```bash
# Ensure you're in the correct environment
which python
which pip

# Reinstall in current environment
pip uninstall arc-atlas -y
pip install arc-atlas

# Verify installation
python -c "import atlas; print(atlas.__version__)"
```

--------------------------------

### Initialize MCP Server Programmatically

Source: https://docs.arc.computer/reference/troubleshooting

Starts the MCP server process using Python before executing the Atlas task.
```python
# Ensure MCP server is started before agent
import subprocess

mcp_process = subprocess.Popen([
    "python", "-m", "your_mcp_server"
])

# Then run atlas (a CLI command, so invoke it as a subprocess)
subprocess.run(
    ["atlas", "run", "--config", "config.yaml", "--task", "Your task"],
    check=True,
)

# Stop the MCP server once the task has finished
mcp_process.terminate()
```

--------------------------------

### Custom Trainer Implementation

Source: https://docs.arc.computer/api-reference/trainers

Provides an example of creating a custom trainer by extending the GRPOTrainer class, demonstrating how to override reward computation logic.

```APIDOC
## Custom Trainer Implementation

Create your own trainer by extending base classes:

```python
from atlas_core.training.algorithms.grpo import GRPOTrainer
import torch

class CustomRewardTrainer(GRPOTrainer):
    """Custom trainer with modified reward computation"""

    def compute_rewards(self, completions, prompts):
        """Override reward computation"""
        rewards = []
        for completion, prompt in zip(completions, prompts):
            # Custom reward logic
            reward = self.custom_reward_function(completion, prompt)
            rewards.append(reward)
        return torch.tensor(rewards)

    def custom_reward_function(self, completion, prompt):
        """Implement domain-specific rewards"""
        # Example: Length penalty
        length_penalty = min(1.0, len(completion) / 500)
        # Example: Quality score
        quality = self.quality_model(completion)
        return quality * length_penalty
```
```

--------------------------------

### Domain-Specific Example Entries

Source: https://docs.arc.computer/reference/datasets

Sample JSON objects representing entries in the mathematics, code generation, and debugging subsets.

```json
{
  "prompt": "Sarah has 24 apples. She gives 1/3 to her brother...",
  "ground_truth": "12",
  "teaching": "Break down: 1) Calculate 1/3 of 24 = 8..."
}
```

```json
{
  "prompt": "Write a function to validate email addresses",
  "ground_truth": "def validate_email(email):...",
  "teaching": "Consider regex pattern, edge cases like..."
}
```

```json
{
  "prompt": "Service returns 503 errors intermittently",
  "ground_truth": "Check service mesh configuration...",
  "teaching": "Systematic approach: 1) Check Istio configs..."
}
```

--------------------------------

### Build a Custom GRPC Adapter

Source: https://docs.arc.computer/sdk/adapters

Implement a custom gRPC adapter by extending the AgentAdapter class and registering it. This example shows the basic structure for connecting to a gRPC service.

```python
from atlas.connectors.registry import AgentAdapter, register_adapter
from atlas.config.models import AdapterType

class GRPCAdapter(AgentAdapter):
    async def ainvoke(self, prompt: str, metadata: dict | None = None) -> str:
        # 1. Connect to your gRPC service.
        # 2. Build the request from the prompt.
        # 3. Execute the call and get a response.
        # 4. Return the response as a string.
        return f"Response for prompt: {prompt}"

# Assumes you've added GRPC to the AdapterType enum
register_adapter(AdapterType.GRPC, GRPCAdapter)
```

--------------------------------

### CLI Environment Verification

Source: https://docs.arc.computer/installation

Validates the accelerate installation, CUDA availability, and model file accessibility via the CLI.

```bash
# Check accelerate installation
accelerate --version

# Verify CUDA
python -c "import torch; print(torch.cuda.is_available())"

# Test model access
huggingface-cli download Arc-Intelligence/ATLAS-8B-Thinking \
  --include "*.json" \
  --exclude "*.safetensors"
```

--------------------------------

### View GKD Training Command Line Output

Source: https://docs.arc.computer/training/offline/gkd-training

Example of the console output during an Atlas GKD training run showing epoch progress and baseline comparison metrics.
```text
Starting GKD training with Baseline Comparison reference: success=75.00%, tokens=1200
Loaded datasets: train=850, eval=150 conversations
AtlasGKDTrainer initialized with lmbda=1.0, beta=0.5

Epoch 1/3
Step 100: loss=0.245, eval_loss=0.312
✅ Baseline Comparison targets MET: success delta=12.3 pp, token reduction=35.2%

Epoch 2/3
Step 200: loss=0.198, eval_loss=0.276
✅ Baseline Comparison targets MET: success delta=14.1 pp, token reduction=38.7%
```

--------------------------------

### Verify Virtual Environment for Agent Discovery

Source: https://docs.arc.computer/reference/troubleshooting

Ensure the correct Python virtual environment is active and that necessary libraries like `langchain` are installed. Agent discovery runs within the current environment.

```bash
# Ensure correct environment is active
which python
pip list | grep langchain

# Discovery runs in your current environment
atlas env init --verbose
```

--------------------------------

### Live GRPO Training Metrics Output

Source: https://docs.arc.computer/training/offline/grpo-training

Example of live metrics output from the training log, showing step, reward, KL divergence, and non-degradation rate.

```text
{'step': 50, 'reward': 0.52, 'kl': 1.2, 'non_degrade': 0.96}
{'step': 100, 'reward': 0.61, 'kl': 1.4, 'non_degrade': 0.97}
{'step': 150, 'reward': 0.73, 'kl': 1.6, 'non_degrade': 0.98}
# Rewards should steadily increase over 24-36 hours
```

--------------------------------

### Fix vLLM Installation

Source: https://docs.arc.computer/installation

Install missing system dependencies or use pre-built wheels to resolve vLLM installation failures.
```bash
# Install build dependencies
sudo apt-get install python3-dev

# Try pre-built wheel
pip install https://github.com/vllm-project/vllm/releases/download/v0.8.3/vllm-0.8.3-cp311-cp311-linux_x86_64.whl
```

--------------------------------

### Run Learning Session

Source: https://docs.arc.computer/examples/adaptive-tool-use

Execute the full 25-task learning harness.

```bash
cd examples/mcp_tool_learning
python learning_harness.py
```

--------------------------------

### Launch Training Script

Source: https://docs.arc.computer/reference/faq

Execute the training process using the provided launch script and configuration file.

```bash
scripts/launch.sh 8 src/atlas_core/configs/recipe/teacher_sft.yaml \
  dataset_name=path/to/your/data
```

--------------------------------

### Launch SFT Warmup

Source: https://docs.arc.computer/benchmarks/reproduction

Command to initiate the supervised fine-tuning phase.

```bash
scripts/launch.sh 4 src/atlas_core/configs/recipe/teacher_sft.yaml \
  dataset_id_or_path=Arc-Intelligence/Arc-ATLAS-Teach-v0 \
  output_dir=results/pre_rl_model \
  seed=42
```

--------------------------------

### Bootstrap Project with Autodiscovery

Source: https://docs.arc.computer/installation

Initialize the environment and run tasks using the Atlas CLI.

```bash
atlas env init --task "Summarize the latest AI news"
atlas run --config .atlas/generated_config.yaml --task "Summarize the latest AI news"
```

--------------------------------

### Verify CUDA Installation

Source: https://docs.arc.computer/installation

Checks for NVIDIA driver and CUDA availability.

```bash
nvidia-smi  # Verify CUDA version
```

--------------------------------

### Get Session by ID

Source: https://docs.arc.computer/training/offline/training-data-pipeline

Retrieve a specific training session by its unique ID.

```APIDOC
## GET /training_data/sessions/{session_id}

### Description
Fetches a single training session identified by its ID.

### Method
GET

### Endpoint
/training_data/sessions/{session_id}

### Parameters
#### Path Parameters
- **session_id** (integer) - Required - The unique identifier of the session to retrieve.

#### Query Parameters
- **db_url** (string) - Required - The database connection URL.

### Response
#### Success Response (200)
- **session** (AtlasSessionTrace) - The training session object.

#### Response Example
```json
{
  "session": {
    "session_reward": {"score": 0.92, "uncertainty": 0.02},
    "trajectory_events": [...],
    "student_learning": {...},
    "teacher_learning": {...},
    "learning_history": {...},
    "adaptive_summary": {...},
    "learning_key": "security-review-final",
    "drift_alert": null
  }
}
```
```

--------------------------------

### Verify Learning Persistence After Upgrade

Source: https://docs.arc.computer/sdk/learning-system

After upgrading the SDK, run a session that should trigger learning, restart the runtime, and then verify that the playbook loads from the database and the playbook hash in telemetry matches the registry.

```bash
atlas run --config your_config.yaml --task "test task"
```

```bash
atlas run --config your_config.yaml --task "another task"
```

```bash
psql $DATABASE_URL -c "SELECT learning_key, updated_at FROM learning_registry WHERE learning_key='your-key';"
```

--------------------------------

### Initialize AtlasGKDTrainer

Source: https://docs.arc.computer/training/offline/gkd-training

Python implementation for setting up the AtlasGKDTrainer with student and teacher models, configuration, and database connection.
```python
from atlas_core.training.algorithms.gkd_trainer import AtlasGKDTrainer
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import GKDConfig

student = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2-7B")
teacher = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2-14B")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2-7B")

args = GKDConfig(
    output_dir="outputs/gkd",
    per_device_train_batch_size=4,
    lmbda=1.0,
    beta=0.5,
)

trainer = AtlasGKDTrainer(
    model=student,
    teacher_model=teacher,
    args=args,
    db_url="postgresql://localhost:5432/atlas",
    min_reward=0.8,
    processing_class=tokenizer,
)

trainer.train()
trainer.save_model("outputs/gkd/final")
```

--------------------------------

### Launch SFT Training

Source: https://docs.arc.computer/training/offline/grpo-training

Execute SFT training across different GPU configurations using the launch script.

```bash
# Minimum (2 GPUs)
scripts/launch.sh 2 src/atlas_core/configs/recipe/teacher_sft.yaml \
  output_dir=checkpoints/sft

# Recommended (4 GPUs)
scripts/launch.sh 4 src/atlas_core/configs/recipe/teacher_sft.yaml \
  output_dir=checkpoints/sft

# Full production (8 GPUs)
scripts/launch.sh 8 src/atlas_core/configs/recipe/teacher_sft.yaml \
  output_dir=checkpoints/sft

# Memory-constrained with offloading
scripts/launch.sh offload 2 src/atlas_core/configs/recipe/teacher_sft.yaml \
  output_dir=checkpoints/sft
```

--------------------------------

### Get Training Sessions (Async)

Source: https://docs.arc.computer/training/offline/training-data-pipeline

Asynchronously query training sessions for high-throughput pipelines.

```APIDOC
## GET /training_data/sessions/async

### Description
Fetches training sessions asynchronously from a PostgreSQL database, suitable for high-throughput scenarios.

### Method
GET

### Endpoint
/training_data/sessions/async

### Parameters
#### Query Parameters
- **db_url** (string) - Required - The database connection URL.
- **min_reward** (float) - Optional - Minimum reward score to filter sessions.
- **limit** (integer) - Optional - The maximum number of sessions to return.

### Response
#### Success Response (200)
- **sessions** (list of AtlasSessionTrace) - A list of training session objects.

#### Response Example
```json
{
  "sessions": [
    {
      "session_reward": {"score": 0.75, "uncertainty": 0.03},
      "trajectory_events": [...],
      "student_learning": {...},
      "teacher_learning": {...},
      "learning_history": {...},
      "adaptive_summary": {...},
      "learning_key": "task-batch-2",
      "drift_alert": null
    }
  ]
}
```
```

--------------------------------

### Launch minimal training

Source: https://docs.arc.computer/benchmarks/reproduction

Executes a short training run for testing purposes.

```bash
scripts/launch_with_server.sh 1 1 src/atlas_core/configs/recipe/teacher_rcl.yaml \
  model_name_or_path=checkpoints/teacher \
  max_steps=4 \
  eval_steps=1 \
  report_to=null
```

--------------------------------

### Run GRPO Training with vLLM Server

Source: https://docs.arc.computer/training/offline/grpo-training

Launch GRPO training using the `launch_with_server.sh` script, specifying the number of GPUs for training and vLLM. The first argument is training GPUs, the second is vLLM GPUs.

```bash
# Minimum (2 GPUs: 1 training, 1 vLLM)
scripts/launch_with_server.sh 1 1 src/atlas_core/configs/recipe/teacher_rcl.yaml \
  model_name_or_path=checkpoints/sft/final

# Recommended (4 GPUs: 2 training, 2 vLLM)
scripts/launch_with_server.sh 2 2 src/atlas_core/configs/recipe/teacher_rcl.yaml \
  model_name_or_path=checkpoints/sft/final

# Production (8 GPUs: 4 training, 4 vLLM)
scripts/launch_with_server.sh 4 4 src/atlas_core/configs/recipe/teacher_rcl.yaml \
  model_name_or_path=checkpoints/sft/final
```

--------------------------------

### Verify Python Version

Source: https://docs.arc.computer/installation

Confirms the installed Python version meets the minimum requirement of 3.10.
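For scripted setups, the same check can be made from inside the interpreter. A minimal sketch, assuming the 3.10 minimum stated above; `meets_minimum` is an illustrative helper, not part of the Atlas SDK:

```python
import sys

MINIMUM = (3, 10)  # minimum version stated in the installation notes


def meets_minimum(version_info=sys.version_info, minimum=MINIMUM) -> bool:
    """Return True when the (major, minor) pair is at least the minimum."""
    return tuple(version_info[:2]) >= minimum


# Report on the interpreter that is running this script
print("Python version OK" if meets_minimum() else "Python 3.10+ required")
```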
```bash
python --version
```

--------------------------------

### Get Training Sessions

Source: https://docs.arc.computer/training/offline/training-data-pipeline

Query training sessions directly from PostgreSQL with various filtering options.

```APIDOC
## GET /training_data/sessions

### Description
Fetches training sessions directly from a PostgreSQL database.

### Method
GET

### Endpoint
/training_data/sessions

### Parameters
#### Query Parameters
- **db_url** (string) - Required - The database connection URL.
- **min_reward** (float) - Optional - Minimum reward score to filter sessions.
- **max_reward** (float) - Optional - Maximum reward score to filter sessions.
- **learning_key** (string) - Optional - Filter sessions by a specific learning key.
- **status_filters** (list of strings) - Optional - Filter sessions by their status (e.g., "succeeded", "failed").
- **start_date** (datetime) - Optional - Filter sessions that started on or after this date.
- **end_date** (datetime) - Optional - Filter sessions that ended on or before this date.
- **include_trajectory_events** (boolean) - Optional - Whether to include trajectory events. Defaults to true.
- **include_learning_data** (boolean) - Optional - Whether to include learning data. Defaults to true.
- **limit** (integer) - Optional - The maximum number of sessions to return.

### Response
#### Success Response (200)
- **sessions** (list of AtlasSessionTrace) - A list of training session objects.

#### Response Example
```json
{
  "sessions": [
    {
      "session_reward": {"score": 0.85, "uncertainty": 0.05},
      "trajectory_events": [...],
      "student_learning": {...},
      "teacher_learning": {...},
      "learning_history": {...},
      "adaptive_summary": {...},
      "learning_key": "security-review",
      "drift_alert": null
    }
  ]
}
```
```

--------------------------------

### Configure API Keys using a .env File

Source: https://docs.arc.computer/reference/troubleshooting

Create a .env file in your project root to store API keys.
The SDK automatically loads these variables.

```bash
# Create .env file in project root
cat > .env << EOF
OPENAI_API_KEY=sk-...
GEMINI_API_KEY=...
ANTHROPIC_API_KEY=...
EOF

# SDK auto-loads .env files
atlas run --config config.yaml --task "Your task"
```

--------------------------------

### Configure API Keys

Source: https://docs.arc.computer/installation

Environment variables for training and runtime SDK authentication.

```bash
# Training stack
export HF_TOKEN="your-huggingface-token"
export WANDB_API_KEY="your-wandb-key"  # Optional

# Runtime SDK
export ANTHROPIC_API_KEY="sk-ant-your-key"  # Primary provider
export GEMINI_API_KEY="your-gemini-key"  # Optional for rewards
```

--------------------------------

### Configure Atlas Database URL

Source: https://docs.arc.computer/reference/troubleshooting

Example of the `database_url` configuration for Atlas, specifying the connection string for a PostgreSQL database.

```yaml
# config.yaml
storage:
  database_url: postgresql://atlas:atlas@localhost:5433/atlas
```

--------------------------------

### Configure GKD (Postgres path)

Source: https://docs.arc.computer/examples/gkd-dev-example

Trains directly from approved traces in Postgres. Ensure ATLAS_DB_URL is set. Override trainer parameters as needed.

```bash
export ATLAS_DB_URL="postgresql://user:pass@host:5432/atlas"

atlas-core train \
  recipe@_global_=teacher_gkd \
  teacher_model_name_or_path=Qwen/Qwen2.5-14B-Instruct \
  model.model_name_or_path=Qwen/Qwen2.5-7B-Instruct \
  trainer.min_reward=0.8
```

--------------------------------

### Filter Dataset by Domain

Source: https://docs.arc.computer/reference/datasets

Examples of filtering the dataset for specific domains like mathematics, code generation, or debugging.
```python
math_data = dataset.filter(lambda x: x['domain'] == 'math')
```

```python
code_data = dataset.filter(lambda x: x['domain'] == 'code')
```

```python
sre_data = dataset.filter(lambda x: x['domain'] == 'debug')
```

--------------------------------

### Launch DeepSpeed Training with Accelerate

Source: https://docs.arc.computer/api-reference/trainers

Initiate distributed training with DeepSpeed using the `accelerate launch` command. Specify the DeepSpeed configuration file and the training script. Presets for zero3 and CPU offloading are available.

```bash
# Default zero3
accelerate launch --config_file accelerate/deepspeed_zero3.yaml \
  -m atlas_core.cli.train recipe@_global_=teacher_rcl
```

```bash
# CPU offload
accelerate launch --config_file accelerate/deepspeed_zero3_cpu_offloading.yaml \
  -m atlas_core.cli.train recipe@_global_=teacher_rcl
```

--------------------------------

### Count Training Sessions

Source: https://docs.arc.computer/training/offline/training-data-pipeline

Get the count of training sessions matching specific criteria without loading full data.

```APIDOC
## GET /training_data/sessions/count

### Description
Retrieves the total count of training sessions that match the specified filters.

### Method
GET

### Endpoint
/training_data/sessions/count

### Parameters
#### Query Parameters
- **db_url** (string) - Required - The database connection URL.
- **min_reward** (float) - Optional - Minimum reward score to filter sessions.
- **learning_key** (string) - Optional - Filter sessions by a specific learning key.

### Response
#### Success Response (200)
- **total** (integer) - The total number of matching training sessions.

#### Response Example
```json
{
  "total": 150
}
```
```

--------------------------------

### Load ATLAS-8B-Instruct Model

Source: https://docs.arc.computer/reference/models

Use this code to load the ATLAS-8B-Instruct model and tokenizer from Hugging Face. Ensure you have the transformers library installed.
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "Arc-Intelligence/ATLAS-8B-Instruct",
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(
    "Arc-Intelligence/ATLAS-8B-Instruct"
)
```

--------------------------------

### GRPOTrainer Class Initialization

Source: https://docs.arc.computer/api-reference/trainers

Details the parameters required to initialize the GRPOTrainer for reinforcement learning tasks.

```APIDOC
## GRPOTrainer Initialization

### Description
Initializes the GRPOTrainer with the necessary models, configuration, and datasets for policy optimization.

### Parameters
- **config** (GRPOConfig) - Required - Training configuration
- **model** (PreTrainedModel) - Required - Model to train (policy network)
- **ref_model** (PreTrainedModel) - Required - Reference model for KL penalty
- **tokenizer** (PreTrainedTokenizer) - Required - Tokenizer for encoding/decoding
- **train_dataset** (Dataset) - Required - Training data
- **eval_dataset** (Dataset) - Optional - Evaluation data
- **reward_model** (PreTrainedModel) - Optional - External reward model
- **compute_metrics** (Callable) - Optional - Custom metrics function
- **callbacks** (List[TrainerCallback]) - Optional - Training callbacks
- **optimizers** (Tuple) - Optional - Custom optimizer and scheduler
```

--------------------------------

### Load ATLAS-8B-Thinking Model

Source: https://docs.arc.computer/reference/models

Use this code to load the ATLAS-8B-Thinking model and tokenizer from Hugging Face. Ensure you have the transformers library installed.
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "Arc-Intelligence/ATLAS-8B-Thinking",
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(
    "Arc-Intelligence/ATLAS-8B-Thinking"
)
```

--------------------------------

### Build and Run Docker Container

Source: https://docs.arc.computer/installation

Commands for building the training image and executing the offline pipeline.

```bash
docker build -t atlas-core:local .
```

```bash
docker run --rm \
  -v "$(pwd)/exports:/data" \
  atlas-core:local \
  atlas-core offline-pipeline --export-path /data/traces.jsonl --dry-run
```

--------------------------------

### Multi-GPU Training Launch

Source: https://docs.arc.computer/installation

Distributed training commands for various GPU cluster sizes.

```bash
# Minimum 2 GPUs for RL training (1 for vLLM, 1 for training)
scripts/launch_with_server.sh 1 1 src/atlas_core/configs/recipe/teacher_rcl.yaml

# Production setup with 4 GPUs (2 for vLLM, 2 for training)
scripts/launch_with_server.sh 2 2 src/atlas_core/configs/recipe/teacher_rcl.yaml

# Full 8 GPU setup
scripts/launch_with_server.sh 4 4 src/atlas_core/configs/recipe/teacher_rcl.yaml
```

--------------------------------

### Verify Configuration Files

Source: https://docs.arc.computer/training/offline/gkd-training

Check for the existence of required GKD configuration files.

```bash
# Check required config files
ls src/atlas_core/configs/recipe/teacher_gkd.yaml
ls src/atlas_core/configs/trainer/gkd.yaml

# Expected output:
# src/atlas_core/configs/recipe/teacher_gkd.yaml
# src/atlas_core/configs/trainer/gkd.yaml
```

--------------------------------

### Count Training Sessions

Source: https://docs.arc.computer/training/offline/training-data-pipeline

Get the count of training sessions matching specific criteria without loading the full data using `count_training_sessions`.
```python
from atlas.training_data import count_training_sessions

total = count_training_sessions(
    db_url="postgresql://atlas:atlas@localhost:5433/atlas",
    min_reward=0.8,
    learning_key="task-1"
)
print(f"Found {total} sessions matching criteria")
```

--------------------------------

### Session Evaluation JSON Structure

Source: https://docs.arc.computer/concepts/reward-design

Example of the structured evaluation output generated after a session, containing weighted principles, scores, and learning insights.

```json
{
  "principles": [
    {"name": "Correctness", "weight": 0.5, "description": "Final deliverable matches requirements"},
    {"name": "Safety", "weight": 0.3, "description": "No policy violations detected"},
    {"name": "Efficiency", "weight": 0.2, "description": "Minimal retries needed"}
  ],
  "score": 0.85,
  "rationale": "Response solves the task correctly with efficient execution",
  "uncertainty": 0.1,
  "student_learning": "For straightforward tasks, proceed directly to solution without exploratory steps",
  "teacher_learning": null
}
```

--------------------------------

### Build GKD Dataset

Source: https://docs.arc.computer/training/offline/gkd-training

Initializes training and evaluation datasets from a PostgreSQL database using specified reward thresholds and learning keys.

```python
from atlas_core.data.gkd import build_gkd_dataset

train_ds, eval_ds = build_gkd_dataset(
    db_url="postgresql://localhost:5432/atlas",
    min_reward=0.8,
    learning_key="crm_workflows",
    eval_split=0.15,
)
```

--------------------------------

### SFT Warmup

Source: https://docs.arc.computer/api-reference/trainers

Performs supervised fine-tuning as a prerequisite step before RL training.
```python
from trl import SFTConfig, SFTTrainer

# Supervised fine-tuning before RL.
# Assumes `model`, `sft_dataset`, and `tokenizer` are already loaded.
training_args = SFTConfig(output_dir="./output/sft")  # illustrative output path

trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=sft_dataset,
    tokenizer=tokenizer,
    max_seq_length=2048
)
# Note: recent TRL releases take `processing_class` instead of `tokenizer`
# and read the sequence length from SFTConfig; adjust to your installed version.
trainer.train()
```

--------------------------------

### GKD Validation Metrics Example

Source: https://docs.arc.computer/examples/gkd-dev-example

Inspect this JSON file for training loss, baseline and distilled evaluation metrics, and derived success delta and token reduction.

```json
{
  "training": {"train_loss": 0.0294},
  "baseline": {"accuracy": 0.758, "avg_generated_tokens": 210},
  "distilled": {"accuracy": 0.815, "avg_generated_tokens": 180}
}
```

--------------------------------

### Configure Hugging Face Environment

Source: https://docs.arc.computer/reference/troubleshooting

Manage authentication and cache settings for Hugging Face.

```bash
# Login to Hugging Face
huggingface-cli login

# Set cache directory if disk space is limited
export HF_HOME=/path/to/cache

# Use offline mode if already downloaded
export HF_DATASETS_OFFLINE=1
export TRANSFORMERS_OFFLINE=1
```

--------------------------------

### Generate teaching guidance with TeacherGRPOTrainer

Source: https://docs.arc.computer/api-reference/trainers

Create tailored teaching guidance based on diagnostic results. The output format varies based on the student's capability level.
```python
def generate_guidance(
    self,
    task: str,
    diagnostic: DiagnosticResult,
    max_tokens: int = 200
) -> str:
    """
    Generate teaching guidance based on diagnosis

    Args:
        task: Original task or problem statement
        diagnostic: Results from diagnostic_probe()
        max_tokens: Maximum tokens for guidance response

    Returns:
        str: Tailored teaching guidance text
        Format depends on capability_level:
        - Low (0.0-0.3): Step-by-step walkthrough
        - Medium (0.3-0.7): Hints and scaffolding
        - High (0.7-1.0): Minimal guidance or verification

    Raises:
        ValueError: If max_tokens < 10 or diagnostic is None
        RuntimeError: If teacher model fails to generate guidance
        TypeError: If task is not a string

    Example:
        diagnostic = trainer.diagnostic_probe("Solve quadratic equation")
        guidance = trainer.generate_guidance(
            "Solve: x² - 5x + 6 = 0",
            diagnostic,
            max_tokens=150
        )
        print(f"Teaching guidance: {guidance}")
    """
```

--------------------------------

### Environment Discovery Commands

Source: https://docs.arc.computer/sdk/cli-reference

Commands for initializing and scaffolding agent environments.

```APIDOC
## atlas env init

### Description
Discovers agents and environments, populating the .atlas/ directory. Uses Claude Haiku 4.5 for agent ranking.

### Parameters
#### Flags
- **--task** (string) - Optional - Task description for discovery
- **--scaffold-config-full** (boolean) - Optional - Generate full configuration
- **--no-run** (boolean) - Optional - Skip execution
- **--timeout** (integer) - Optional - Timeout in seconds (default 240)

## atlas env scaffold

### Description
Seeds projects with reference factories using LangGraph templates.
### Parameters
#### Flags
- **--template** (string) - Optional - Template name
- **--output** (string) - Optional - Output directory
- **--force** (boolean) - Optional - Overwrite existing files
```

--------------------------------

### Evaluate a Single Interaction with RIMReward

Source: https://docs.arc.computer/training/reward-system-usage

Use the evaluate method of RIMReward for quick assessments of teaching effectiveness. Provide prompt, response, baseline, and teacher traces to get a score and rationale.

```python
from atlas_core.reward.interpretation import RIMReward

# Create reward system
reward = RIMReward(config_path='reward_system/interpretation.yaml')

# Evaluate a single interaction
result = reward.evaluate(
    prompt="What is 2+2?",
    response="The answer is 4.",
    baseline_solutions="It is 4",
    teacher_traces="Explain your reasoning step by step",
)

print(f"Score: {result.score}")
print(f"Per-judge: {result.judge_scores}")
print(f"Rationale:\n{result.rationale}")
```

--------------------------------

### Teacher GRPO Prompt Templates

Source: https://docs.arc.computer/training/configuration

Defines the prompt templates used for diagnostic, adaptive, and feedback prompts in the teacher GRPO trainer. These templates guide the model's responses and interactions.

```yaml
student_diagnostic_template: |
  Question: {question}
  Before solving, briefly describe:
  1. What type of problem this is
  2. The key concepts or steps needed
  3. Any potential challenges you see
```

```yaml
teacher_adaptive_template: |
  Question: {question}
  Student's approach: {approach}
  [Analyze student approach]
  [Only guidance to student - no answers]
```

```yaml
student_with_teaching_template: |
  Question: {question}
  A teacher has provided: {teaching}
  Now solve step by step.
```

--------------------------------

### Configure Agent and Reward System

Source: https://docs.arc.computer/examples/adaptive-tool-use

Define the agent import path and the reward system judge prompt in the configuration.
```yaml
agent:
  type: python
  import_path: examples.mcp_tool_learning.mcp_agent
  attribute: create_agent
```

```yaml
rim:
  judge_prompt: |
    Reward effective tool usage:
    - Correct tool for each task
    - Minimal redundant operations
    - Proper error handling
```

--------------------------------

### Validate Trained Model with Transformers

Source: https://docs.arc.computer/training/offline/grpo-training

Load a trained teacher model and a baseline student model using the Transformers library to compare their performance on a given problem. Requires PyTorch and Transformers installed.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load your trained teacher
teacher = AutoModelForCausalLM.from_pretrained(
    "results/rl_checkpoint",
    torch_dtype=torch.float16,
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("results/rl_checkpoint")

# Load baseline student
student = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-4B-Instruct-2507",
    torch_dtype=torch.float16,
    device_map="auto"
)

# Test on a problem
problem = "A train travels 120 miles in 2 hours. What is its speed?"

# Get baseline (student only)
inputs = tokenizer(problem, return_tensors="pt").to(student.device)
baseline = student.generate(**inputs, max_new_tokens=100)
print(f"Baseline: {tokenizer.decode(baseline[0])}")

# Get teaching (using the atlas-sdk runtime loop)
# This gives you the enhanced response
```

--------------------------------

### Troubleshoot Async Event Loop Errors

Source: https://docs.arc.computer/examples/adaptive-tool-use

Run the learning harness with `python learning_harness.py` instead of `python -i` to avoid async event loop errors.

```bash
# Run the script directly (not under `python -i`)
python learning_harness.py
```

--------------------------------

### Standard GRPO Training

Source: https://docs.arc.computer/api-reference/trainers

Initializes and executes a GRPOTrainer with specific configuration and reward functions.
```python
from atlas_core.training.algorithms.grpo import GRPOTrainer
from atlas_core.training.algorithms.grpo_config import GRPOConfig
from atlas_core.reward.interpretation import RIMReward

# Minimal training arguments (TrainingArguments requires an output_dir)
grpo_args = GRPOConfig(
    output_dir="./output/grpo",
    model_name_or_path="Arc-Intelligence/ATLAS-8B-Thinking",
    learning_rate=5e-6,
    num_train_epochs=3,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    beta=0.04,  # KL penalty
    logging_steps=10,
)

reward = RIMReward(config_path="reward_system/interpretation.yaml")

trainer = GRPOTrainer(
    model=grpo_args.model_name_or_path,
    reward_funcs=reward,
    args=grpo_args,
    train_dataset=train_data,
    eval_dataset=eval_data,
    processing_class=tokenizer,
)

trainer.train()
trainer.save_model("./output/grpo/checkpoint-final")
```

--------------------------------

### Quality Validation of Dataset

Source: https://docs.arc.computer/training/custom-datasets

Performs quality validation on a dataset using standard Python tooling and the `datasets` library. Calculates example count, average step length in tokens, and domain distribution.

```python
from collections import Counter
from datasets import Dataset

dataset = Dataset.from_list(records)
lengths = [len(example["step_trace"].split()) for example in dataset]
domains = Counter(example["session_metadata"].get("domain", "unknown") for example in dataset)

print(f"Examples: {len(dataset)}")
print(f"Avg step length: {sum(lengths)/len(lengths):.1f} tokens")
print(f"Domains: {domains}")
```

--------------------------------

### Monitor training metrics

Source: https://docs.arc.computer/benchmarks/reproduction

Commands for monitoring TensorBoard, server health, and GPU utilization.
```bash
# TensorBoard monitoring
tensorboard --logdir results/ --port 6006

# vLLM server health
watch -n 5 'curl -s http://localhost:8765/metrics'

# GPU utilization
nvidia-smi dmon -s u -d 5
```

--------------------------------

### Atlas Session Trace JSON Structure

Source: https://docs.arc.computer/sdk/export-traces

This is an example of the JSON structure for an Atlas Session Trace record. It includes details about the task, adaptive summary, triage dossier, plan, steps, rewards, and review status.

```json
{
  "task": "Summarize the latest Atlas SDK updates",
  "final_answer": "...",
  "adaptive_summary": {
    "adaptive_mode": "coach",
    "confidence": 0.58,
    "certification_run": false,
    "probe": {
      "mode": "coach",
      "confidence": 0.55,
      "evidence": ["persona_helpful_ratio=0.62", "risk_high_severity"]
    },
    "mode_history": [
      {"mode": "paired", "confidence": 0.71, "certification": true},
      {"mode": "coach", "confidence": 0.55}
    ]
  },
  "triage_dossier": {
    "task": "Summarize the latest Atlas SDK updates",
    "summary": "Capture highlights for stakeholders.",
    "risks": [{"category": "quality", "description": "Customer-facing copy", "severity": "moderate"}],
    "signals": [{"name": "tenant", "value": "demo"}],
    "tags": ["tenant:demo", "domain:sre"]
  },
  "plan": {"steps": [{"id": 1, "description": "Collect release notes"}, {"id": 2, "description": "Draft summary"}]},
  "steps": [
    {
      "step_id": 1,
      "description": "Collect release notes",
      "trace": "HUMAN: ...",
      "output": "...",
      "reward": {
        "score": 0.92,
        "judges": [
          {"identifier": "process", "score": 0.91, "rationale": "..."}
        ]
      },
      "guidance": ["Cite the release date."],
      "validation": {"valid": true, "rationale": "Complete"},
      "tool": "web_search",
      "tool_params": {"query": "Atlas SDK release notes"},
      "artifacts": {"sources": ["https://..."]},
      "deliverable": {"notes": ["https://..."]}
    }
  ],
  "session_reward": {
    "score": 0.88,
    "uncertainty": 0.07,
    "judges": [
      {"identifier": "process", "score": 0.90, "rationale": "..."}
    ]
  },
  "reward_summary": {"score": 0.88},
"review_status": "approved", "personas_used": [ {"persona": "planner", "instruction": "Focus on customer tone", "source": "memory"} ], "persona_updates": { "new_candidates": [ {"persona": "planner", "instruction": "Mention adaptive modes", "tags": ["tenant:demo"]} ] }, "session_metadata": {"batch": "aime-2025"} } ``` -------------------------------- ### Create Database Indexes for Performance Source: https://docs.arc.computer/training/offline/training-data-pipeline SQL commands to optimize reward filtering, date range queries, and metadata GIN indexing for session data. ```sql -- Reward filtering (10-100x faster) CREATE INDEX sessions_reward_score_idx ON sessions ((reward_stats->>'score')::float); -- Date range queries (50-100x faster) CREATE INDEX sessions_created_at_idx ON sessions (created_at DESC); -- Learning key queries CREATE INDEX sessions_metadata_gin_idx ON sessions USING GIN (metadata); ```