### Create Local Environment File

Source: https://github.com/openpipe/art/blob/main/CONTRIBUTING.md

Copy the example environment file to create a local .env file for GPU training setup.

```bash
cp .env.example .env
```

--------------------------------

### Install Dependencies

Source: https://github.com/openpipe/art/blob/main/CONTRIBUTING.md

Install project dependencies using uv.

```bash
uv sync --group dev
```

--------------------------------

### Complete Email Agent Example

Source: https://github.com/openpipe/art/blob/main/docs/integrations/langgraph-integration.mdx

This code demonstrates the setup and execution of a LangGraph ReAct agent for email processing. It includes tool definitions, agent invocation with configuration, and trajectory tracking with correctness judging. Replace the mock email functions with real API integrations and supply real training scenarios.

```python
# Excerpt from the body of an async rollout function
tools = [search_inbox_tool, read_email_tool, return_final_answer_tool]
chat_model = init_chat_model(model.get_inference_name(), temperature=1.0)
react_agent = create_react_agent(chat_model, tools)

try:
    config = {
        "configurable": {"thread_id": str(uuid.uuid4())},
        "recursion_limit": MAX_TURNS,
    }

    await react_agent.ainvoke({
        "messages": [
            SystemMessage(content=system_prompt),
            HumanMessage(content=scenario.question),
        ]
    }, config=config)

    if final_answer:
        traj.final_answer = final_answer
        correctness_judge_response = await judge_correctness(scenario, final_answer.answer)
        traj.metrics["correct"] = float(correctness_judge_response.accept)

except Exception as e:
    print(f"Error running agent: {e}")
    traj.messages_and_choices.append({"role": "assistant", "content": f"Error: {str(e)}"})

return traj
```

```python
# Main training function
async def main():
    # Sample training scenarios (replace with real data)
    training_scenarios = [
        Scenario(
            id="1",
            question="Find emails about the quarterly budget",
            answer="Budget meeting scheduled for Q4 review",
            inbox_address="user@company.com",
            query_date="2024-01-20"
        ),
        Scenario(
            id="2",
            question="Look for urgent project updates",
            answer="Project deadline moved to next month",
            inbox_address="user@company.com",
            query_date="2024-01-20"
        ),
    ]

    # Register model with backend
    await model.register(backend)

    # Training configuration
    training_config = {
        "groups_per_step": 2,
        "num_epochs": 3,
        "rollouts_per_group": 4,
        "learning_rate": 1e-5,
        "max_steps": 5,
    }

    # Training iterator
    training_iterator = iterate_dataset(
        training_scenarios,
        groups_per_step=training_config["groups_per_step"],
        num_epochs=training_config["num_epochs"],
        initial_step=await model.get_step(),
    )

    # Training loop
    for batch in training_iterator:
        print(f"Training step {batch.step}, epoch {batch.epoch}")

        # Create trajectory groups
        groups = []
        for scenario in batch.items:
            groups.append(
                art.TrajectoryGroup([
                    wrap_rollout(model, rollout)(
                        model, EmailScenario(step=batch.step, scenario=scenario)
                    )
                    for _ in range(training_config["rollouts_per_group"])
                ])
            )

        # Gather trajectories
        finished_groups = await art.gather_trajectory_groups(
            groups,
            pbar_desc="gather",
            max_exceptions=training_config["rollouts_per_group"] * len(batch.items),
        )

        # Apply RULER scoring
        judged_groups = []
        for group in finished_groups:
            judged_group = await ruler_score_group(group, "openai/o4-mini")
            judged_groups.append(judged_group)

        # Train model
        result = await backend.train(
            model,
            judged_groups,
            learning_rate=training_config["learning_rate"],
        )
        await model.log(judged_groups, metrics=result.metrics, step=result.step, split="train")

        print(f"Completed training step {batch.step}")

        if batch.step >= training_config["max_steps"]:
            break


if __name__ == "__main__":
    asyncio.run(main())
```

--------------------------------

### Install MCP Google Maps Server with uv

Source: https://github.com/openpipe/art/blob/main/examples/mcp-rl/servers/python/mcp_googlemaps/README.md

Alternatively, use the uv package manager to install the server in editable mode. This is a faster alternative to pip.

```bash
uv pip install -e .
```

--------------------------------

### Install OpenPipe ART with Backend Dependencies

Source: https://github.com/openpipe/art/blob/main/docs/getting-started/installation-setup.mdx

Install the OpenPipe ART client along with the backend dependencies required for local GPU training and inference. Use this command if you plan to run the ART server locally. The package spec is quoted so that shells such as zsh do not treat the square brackets as a glob pattern.

```bash
pip install "openpipe-art[backend]"
```

--------------------------------

### Setup and Model Registration

Source: https://github.com/openpipe/art/blob/main/examples/prisoners-dilemma.ipynb

Initializes the ART library, loads environment variables, and registers a trainable model with a local backend for the Prisoner's Dilemma simulation.

```python
import asyncio
import re

from dotenv import load_dotenv

import art
from art.local import LocalBackend

load_dotenv()

BASE_MODEL = "Qwen/Qwen2.5-7B-Instruct"
PRISONERS_DILEMMA_ROUNDS = 10
TRAINING_STEPS = 1_000

backend = LocalBackend()
model = art.TrainableModel(
    name="001", project="prisoners-dilemma", base_model=BASE_MODEL
)
await model.register(backend)

client = model.openai_client()
```

--------------------------------

### Install MCP Google Maps Server

Source: https://github.com/openpipe/art/blob/main/examples/mcp-rl/servers/python/mcp_googlemaps/README.md

Use pip to install the server in editable mode. This command installs the package and its dependencies.

```bash
pip install -e .
```

--------------------------------

### Install Git Hooks with Prek

Source: https://github.com/openpipe/art/blob/main/CONTRIBUTING.md

Install git hooks using prek for local code quality checks. This is optional but recommended.

```bash
uv run prek install
```

--------------------------------

### SFT Data Format with Tool Calls

Source: https://github.com/openpipe/art/blob/main/docs/fundamentals/sft-training.mdx

An SFT training data example including tool calls, tool definitions, and tool responses.
```json
{
  "messages": [
    { "role": "system", "content": "You are a helpful assistant" },
    { "role": "user", "content": "What's the weather in Hobart?" },
    {
      "role": "assistant",
      "content": null,
      "tool_calls": [
        {
          "id": "call_1",
          "type": "function",
          "function": {
            "name": "get_weather",
            "arguments": "{\"location\": \"Hobart\"}"
          }
        }
      ]
    },
    { "role": "tool", "tool_call_id": "call_1", "content": "15°C, partly cloudy" },
    {
      "role": "assistant",
      "content": "It's currently 15°C and partly cloudy in Hobart."
    }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get current weather",
        "parameters": {
          "type": "object",
          "properties": { "location": { "type": "string" } }
        }
      }
    }
  ]
}
```

--------------------------------

### Install OpenPipe ART

Source: https://github.com/openpipe/art/blob/main/docs/getting-started/about.mdx

Install the ART client into your project to add RL training capabilities. This command can be run on any machine that uses Python.

```bash
pip install openpipe-art
```

--------------------------------

### Initiate Training Task

Source: https://github.com/openpipe/art/blob/main/dev/profile.ipynb

Creates and starts a training task using the `art.local.train` module. It also sets up a queue to receive training results.

```python
import asyncio

from art.local.train import train

results_queue = asyncio.Queue()
train_task = asyncio.create_task(train(state.trainer, results_queue))
```

--------------------------------

### Start RL from SFT LoRA Adapter

Source: https://github.com/openpipe/art/blob/main/docs/getting-started/faq.mdx

Initialize a TrainableModel using an existing Hugging Face-style LoRA adapter directory as the base model for RL training.
```python
import art

model = art.TrainableModel(
    name="agent-001",
    project="my-agentic-task",
    base_model="/path/to/my_sft_lora_adapter",  # HF-style adapter dir
)
```

--------------------------------

### Run MCP Google Maps Server with Environment Variable

Source: https://github.com/openpipe/art/blob/main/examples/mcp-rl/servers/python/mcp_googlemaps/README.md

Alternatively, if the GOOGLE_MAPS_API_KEY environment variable is set, the server can be started without explicit arguments.

```bash
export GOOGLE_MAPS_API_KEY=your_api_key
python server.py
```

--------------------------------

### Install ART with LangGraph Extras

Source: https://github.com/openpipe/art/blob/main/docs/integrations/langgraph-integration.mdx

Install ART with the extras needed for the LangGraph integration and backend components. Ensure you are using version 0.4.9 or higher. The requirement is quoted because an unquoted `>=` is interpreted by the shell as an output redirection.

```bash
uv pip install -U "openpipe-art[backend,langgraph]>=0.4.9"
```

--------------------------------

### Basic SFT Data Format

Source: https://github.com/openpipe/art/blob/main/docs/fundamentals/sft-training.mdx

A simple example of one SFT training record containing system, user, and assistant messages. It is pretty-printed here for readability; in a JSONL file each record occupies a single line.

```json
{
  "messages": [
    { "role": "system", "content": "You are a helpful assistant" },
    { "role": "user", "content": "What is the capital of Tasmania?" },
    { "role": "assistant", "content": "Hobart" }
  ]
}
```

--------------------------------

### Run ART Server Locally with LocalBackend

Source: https://github.com/openpipe/art/blob/main/docs/getting-started/installation-setup.mdx

Example of initializing the LocalBackend and registering a TrainableModel. This code is used when running the ART server on your local machine with a GPU.

```python
from art import TrainableModel, gather_trajectory_groups
from art.local.backend import LocalBackend

backend = LocalBackend()

model = TrainableModel(
    name="agent-001",
    project="my-agentic-task",
    base_model="OpenPipe/Qwen3-14B-Instruct",
)
await model.register(backend)

...
# the rest of your code ...
```

--------------------------------

### Train Agent with OpenEnv Echo Environment

Source: https://github.com/openpipe/art/blob/main/docs/integrations/openenv-integration.mdx

This example shows a complete workflow for training an agent using OpenEnv's echo environment with ART. It includes setting up the backend, defining the model, creating environment clients, and running the training loop.

```python
import asyncio
from datetime import datetime

import art
from art.serverless.backend import ServerlessBackend
from dotenv import load_dotenv
from envs.echo_env import EchoAction, EchoEnv
import weave

PROMPT = "Use at most 100 tokens; maximize the total character length of the output."
NUM_STEPS = 50
ROLLOUTS_PER_GROUP = 4


# The rollout function defines how your agent interacts with the environment
async def rollout(model: art.TrainableModel, env_client: EchoEnv) -> art.Trajectory:
    # Reset the environment to get initial state
    await asyncio.to_thread(env_client.reset)

    # Create a trajectory to store interactions and rewards
    traj = art.Trajectory(
        messages_and_choices=[{"role": "system", "content": PROMPT}], reward=0.0
    )

    # Use the model to generate an action
    choice = (
        await model.openai_client().chat.completions.create(
            model=model.inference_model_name,
            messages=traj.messages(),
            max_completion_tokens=100,
            timeout=30,
        )
    ).choices[0]
    reply = (choice.message.content or "").strip()

    # Send the action to the environment and get observation/reward
    result = await asyncio.to_thread(
        env_client.step, EchoAction(message=reply)
    )

    # Record the model's output and reward
    traj.messages_and_choices.append(choice)
    traj.reward = result.reward

    return traj.finish()


async def main() -> None:
    load_dotenv()

    weave.init("openenv-demo")

    # Set up the training backend
    backend = ServerlessBackend()

    # Define the model to train
    model = art.TrainableModel(
        name=f"openenv-echo-{datetime.now().strftime('%Y-%m-%d-%H%M%S')}",
        project="openenv-demo",
        base_model="OpenPipe/Qwen3-14B-Instruct",
    )
    await model.register(backend)

    # Create a pool of environment clients for efficient training
    env_pool = [
        EchoEnv.from_docker_image("quixote13/echo-env:latest")
        for _ in range(ROLLOUTS_PER_GROUP)
    ]

    # Training loop
    for step in range(await model.get_step(), NUM_STEPS):
        print(f"Gathering groups for step {step}")

        # Run multiple rollouts in parallel
        groups = await art.gather_trajectory_groups([
            art.TrajectoryGroup(
                rollout(model, env_client) for env_client in env_pool
            )
        ])

        # Train the model on collected trajectories
        result = await backend.train(model, groups)
        await model.log(groups, metrics=result.metrics, step=result.step, split="train")


if __name__ == "__main__":
    asyncio.run(main())
```

--------------------------------

### Use Managed Autoscaling Backend with ServerlessBackend

Source: https://github.com/openpipe/art/blob/main/docs/getting-started/installation-setup.mdx

Example of initializing the ServerlessBackend and registering a TrainableModel. This code is used when sending inference and training requests to the W&B Training cluster.

```python
from art import TrainableModel, gather_trajectory_groups
from art.serverless.backend import ServerlessBackend

backend = ServerlessBackend()

model = TrainableModel(
    name="agent-001",
    project="my-agentic-task",
    base_model="OpenPipe/Qwen3-14B-Instruct",
)
await model.register(backend)

# ... the rest of your code ...
```

--------------------------------

### TrainableModel with ServerlessBackend

Source: https://github.com/openpipe/art/blob/main/README.md

Instantiate a TrainableModel and register it with a ServerlessBackend for training agents. Requires a W&B API key. This setup allows for rapid iteration and deployment of trained models.
```python
import art
from art.serverless.backend import ServerlessBackend

model = art.TrainableModel(
    project="voice-agent",
    name="agent-001",
    base_model="Qwen/Qwen3.6-27B"
)

backend = ServerlessBackend(
    api_key="your_wandb_api_key"
)

# register() is a coroutine, so it must be awaited
await model.register(backend)
# Edit and iterate in minutes, not hours!
```

--------------------------------

### Run Training Script

Source: https://github.com/openpipe/art/blob/main/docs/tutorials/summarizer.mdx

Executes the main training script for the summarizer model. This command registers the model, downloads checkpoints, starts services, trains the model, and uploads final checkpoints if configured.

```bash
uv run python src/summarizer/train.py
```

--------------------------------

### Run SFT Training

Source: https://github.com/openpipe/art/blob/main/docs/tutorials/open-deep-research.mdx

Execute the supervised fine-tuning training run. This step improves the model's ability to follow research agent formats and reasoning patterns, providing a better starting point for RL training.

```bash
uv run run_sft.py # Run your sft training run. ~1 Hour
```

--------------------------------

### Initialize ServerlessBackend

Source: https://github.com/openpipe/art/blob/main/docs/fundamentals/art-backend.mdx

Instantiate ServerlessBackend to train remotely. Provide your W&B API key as an argument or set it as an environment variable.

```python
from art.serverless.backend import ServerlessBackend

backend = ServerlessBackend(
    api_key="my-api-key",
    # or set WANDB_API_KEY in the environment
)
```

--------------------------------

### Model Initialization and Registration

Source: https://github.com/openpipe/art/blob/main/dev/yes-no-maybe.ipynb

Initializes the 'art' library, loads environment variables, sets up a local backend, and registers a trainable model with a specified base model and project.
```python
from itertools import permutations

from dotenv import load_dotenv
import openai

import art
from art.local import LocalBackend

load_dotenv()

backend = LocalBackend()
model = art.TrainableModel(
    name="010",
    project="yes-no-maybe",
    base_model="Qwen/Qwen2.5-7B-Instruct",
    # _internal_config=art.dev.InternalModelConfig(
    #     _decouple_vllm_and_unsloth=True,
    #     engine_args=art.dev.EngineArgs(gpu_memory_utilization=0.7),
    # ),
)
await model.register(backend)
```

--------------------------------

### Initialize ART Local Backend and Model

Source: https://github.com/openpipe/art/blob/main/dev/yes-no-maybe-vision/train.ipynb

Sets up the local backend and initializes a trainable model using a specified base model and project name. Loads environment variables for configuration.

```python
from dotenv import load_dotenv
from generate_images import generate_yes_no_maybe_prompts, save_prompt_images
import openai

import art
from art.local import LocalBackend

load_dotenv()

backend = LocalBackend()
model = art.TrainableModel(
    name="009",
    project="yes-no-maybe-vision",
    base_model="Qwen/Qwen2.5-VL-7B-Instruct",
)
await model.register(backend)
```

--------------------------------

### Perform rollout with Qwen3 model

Source: https://github.com/openpipe/art/blob/main/dev/new_models/qwen3_try.ipynb

Executes the rollout function using the Qwen3 model and the prompt at index 4 from the prompts list.

```python
await rollout(qwen3, prompts[4])
```

--------------------------------

### Initialize TrainableModel from SFT LoRA Adapter

Source: https://github.com/openpipe/art/blob/main/docs/fundamentals/art-client.mdx

Initialize the ART client using an existing SFT LoRA adapter directory. This allows warm-starting RL training from task-aligned weights, potentially reducing training steps and costs.
```python
import art

model = art.TrainableModel(
    name="agent-001",
    project="my-agentic-task",
    # Point to the local SFT LoRA adapter directory
    # (e.g., contains adapter_config.json and adapter_model.bin/safetensors)
    base_model="/path/to/my_sft_lora_adapter",
)
```

--------------------------------

### Complete Joke Generation Example with RULER

Source: https://github.com/openpipe/art/blob/main/docs/fundamentals/ruler.mdx

A comprehensive example showcasing RULER's ability to rank joke trajectories of varying quality. This includes setting up initial messages, creating distinct trajectories, scoring them with RULER, and displaying the ranked results.

```python
import asyncio

import art
from art.rewards import ruler_score_group
from openai.types.chat.chat_completion import Choice
from openai.types.chat import ChatCompletionMessage


async def main():
    # Initial messages shared by all trajectories
    initial_messages = [
        {"role": "system", "content": "You are a comedy writer. Generate funny jokes based on the given topic."},
        {"role": "user", "content": "Tell me a funny joke about computers"}
    ]

    # Create three trajectories with different quality responses
    good_trajectory = art.Trajectory(
        messages_and_choices=[
            *initial_messages,
            Choice(finish_reason="stop", index=0, message=ChatCompletionMessage(
                role="assistant",
                content="Why don't computers ever get invited to parties?\n\nBecause they always crash! 🥁\n\nBut seriously, have you tried turning them off and on again?"
            ))
        ],
        reward=0.0
    )

    mediocre_trajectory = art.Trajectory(
        messages_and_choices=[
            *initial_messages,
            Choice(finish_reason="stop", index=0, message=ChatCompletionMessage(
                role="assistant",
                content="What do you call a computer that doesn't work?\n\nBroken."
            ))
        ],
        reward=0.0
    )

    off_topic_trajectory = art.Trajectory(
        messages_and_choices=[
            *initial_messages,
            Choice(finish_reason="stop", index=0, message=ChatCompletionMessage(
                role="assistant",
                content="I don't really know jokes about computers, but here's a fact: The sky is blue because of Rayleigh scattering."
            ))
        ],
        reward=0.0
    )

    # Create a TrajectoryGroup and use RULER to score
    group = art.TrajectoryGroup([good_trajectory, mediocre_trajectory, off_topic_trajectory])
    judged_group = await ruler_score_group(group, "openai/o3", debug=True)

    # Display rankings
    if judged_group:
        sorted_trajectories = sorted(judged_group.trajectories, key=lambda t: t.reward, reverse=True)
        for rank, traj in enumerate(sorted_trajectories, 1):
            messages = traj.messages()
            print(f"Rank {rank}: Score {traj.reward:.3f}")
            print(f"  Response: {messages[-1]['content'][:50]}...")


asyncio.run(main())
```

--------------------------------

### Initialize and Register Model

Source: https://github.com/openpipe/art/blob/main/examples/rock-paper-tool-use.ipynb

Sets up a trainable model with a specified name, project, and base model. It then registers the model with a local backend.

```python
import asyncio
import json

from dotenv import load_dotenv
from openai.types.chat.chat_completion import ChatCompletion

import art
from art.local import LocalBackend

load_dotenv()

MODEL_NAME = "001"
BASE_MODEL = "Qwen/Qwen2.5-7B-Instruct"
TRAINING_STEPS = 1_000

model = art.TrainableModel(
    name=MODEL_NAME, project="rock-paper-tool-use", base_model=BASE_MODEL
)
backend = LocalBackend()
await model.register(backend)

client = model.openai_client()
```

--------------------------------

### Add Dependency with uv

Source: https://github.com/openpipe/art/blob/main/CLAUDE.md

Use this command to add a new package dependency to your project. Ensure `uv` is installed and configured.
```bash
uv add <package>
```

--------------------------------

### Initialize LocalBackend

Source: https://github.com/openpipe/art/blob/main/docs/fundamentals/art-backend.mdx

Instantiate LocalBackend to run training on the local machine. Configure automatic shutdown and the path for storing logs and weights.

```python
from art.local import LocalBackend

backend = LocalBackend(
    # set to True if you want your backend to shut down automatically
    # when your client process ends
    in_process=False,
    # local path where the backend will store trajectory logs and model weights
    path="./.art",
)
```

--------------------------------

### Perform rollout with Qwen2.5 model

Source: https://github.com/openpipe/art/blob/main/dev/new_models/qwen3_try.ipynb

Executes the rollout function with the Qwen2.5 model and the prompt at index 4 from the loaded prompts list.

```python
await rollout(qwen2, prompts[4])
```

--------------------------------

### Provide a Custom Rubric

Source: https://github.com/openpipe/art/blob/main/docs/fundamentals/ruler.mdx

Define a custom rubric to guide the judge's scoring. The rubric is a multi-line string where each line specifies a scoring criterion.

```python
custom_rubric = """
- Prioritize responses that are concise and clear
- Penalize responses that include emojis or informal language
- Reward responses that cite sources
"""

await ruler_score_group(
    group,
    "openai/o3",
    rubric=custom_rubric
)
```

--------------------------------

### Use Different Judge Models

Source: https://github.com/openpipe/art/blob/main/docs/fundamentals/ruler.mdx

Specify any LLM supported by LiteLLM as the judge model. Examples include 'openai/o4-mini', 'anthropic/claude-sonnet-4-20250514', and local models like 'ollama/qwen3:32b'.
```python
await ruler_score_group(group, "openai/o4-mini")
```

```python
await ruler_score_group(group, "anthropic/claude-sonnet-4-20250514")
```

```python
await ruler_score_group(group, "ollama/qwen3:32b")
```

--------------------------------

### Initialize and Use Backend

Source: https://github.com/openpipe/art/blob/main/docs/fundamentals/art-backend.mdx

Initialize either a `ServerlessBackend` or `LocalBackend` based on the `BACKEND_TYPE` environment variable. Register a `TrainableModel` with the initialized backend for subsequent training.

```python
import os

import art

# Read the backend choice from the environment, defaulting to serverless
BACKEND_TYPE = os.environ.get("BACKEND_TYPE", "serverless")

if BACKEND_TYPE == "serverless":
    from art.serverless.backend import ServerlessBackend

    # The constructor is called directly, as in the other ServerlessBackend examples
    backend = ServerlessBackend()
else:
    from art.local import LocalBackend

    backend = LocalBackend()

model = art.TrainableModel(...)
await model.register(backend)

# ...training code...
```

--------------------------------

### Define Training Scenarios

Source: https://github.com/openpipe/art/blob/main/docs/fundamentals/art-client.mdx

Define a Scenario class with fields that differentiate real-world situations. Instantiate a list of Scenario objects to represent diverse training examples.

```python
from dataclasses import dataclass


# A dataclass is one convenient choice; any class with keyword fields works
@dataclass
class Scenario:
    # add whatever fields differ from one real-world scenario to another
    field_1: str
    field_2: float


scenarios = [
    Scenario(field_1="hello", field_2=0.0),
    Scenario(field_1="world!", field_2=1.0),
]
```

--------------------------------

### SFT Warmup then RL Training

Source: https://github.com/openpipe/art/blob/main/docs/fundamentals/sft-training.mdx

This snippet demonstrates how to perform SFT for a specified number of epochs and then transition to RL training within the same ART run. It initializes a model, registers it with a backend, runs SFT from a file, and then enters a loop for RL training, gathering trajectory groups and updating the model.
```python
import art
from art.local import LocalBackend
# from art.serverless.backend import ServerlessBackend
from art.utils.sft import train_sft_from_file


async def main():
    backend = LocalBackend()
    # backend = ServerlessBackend()  # or use serverless for managed GPUs
    model = art.TrainableModel(
        name="warmup-then-rl",
        project="my-project",
        base_model="Qwen/Qwen3-30B-A3B-Instruct-2507",
    )
    await model.register(backend)

    # Phase 1: SFT warmup from a dataset
    await train_sft_from_file(
        model=model,
        file_path="data/train.jsonl",
        epochs=3,
    )

    # Phase 2: RL training picks up from the SFT checkpoint
    from my_project import rollout, scenarios

    for step in range(await model.get_step(), 50):
        train_groups = await art.gather_trajectory_groups(
            [
                art.TrajectoryGroup(rollout(model, scenario) for _ in range(8))
                for scenario in scenarios
            ]
        )
        result = await backend.train(model, train_groups)
        await model.log(train_groups, metrics=result.metrics, step=result.step, split="train")
```

--------------------------------

### Deprecate model.train() in favor of backend.train()

Source: https://github.com/openpipe/art/blob/main/docs/proposals/backend-first-training-api.md

Use the new backend.train() method; the old model.train() is deprecated and emits a warning that points users to the new API. The three snippets show the old call, its replacement, and the deprecation shim.

```python
# Old API (deprecated)
await model.train(trajectory_groups, config=TrainConfig(learning_rate=5e-6))
```

```python
# New API
await backend.train(model, trajectory_groups, learning_rate=5e-6)
```

```python
import warnings


# Deprecation shim on the model class; the argument list is elided in the proposal
async def train(self, *args, **kwargs):
    warnings.warn(
        "model.train() is deprecated. Use backend.train(model, ...) instead.",
        DeprecationWarning,
        stacklevel=2,
    )
    ...
```

--------------------------------

### Initialize LangGraph Chat Model with ART

Source: https://github.com/openpipe/art/blob/main/docs/integrations/langgraph-integration.mdx

Imports the pieces needed to initialize a chat model for use with LangGraph, optionally enabling Weave tracking for logging and analysis. Ensure necessary libraries like 'weave' and 'langchain_core' are installed.
```python
import uuid
import weave
from langchain_core.messages import HumanMessage, SystemMessage
from langchain_core.tools import tool
from langgraph.prebuilt import create_react_agent
from art.langgraph import init_chat_model
import art
import os

# Initialize Weave tracking (optional)
if os.getenv("WANDB_API_KEY", ""):
    weave.init(model.project, settings={"print_call_link": False})
```

--------------------------------

### Initialize TrainableModel

Source: https://github.com/openpipe/art/blob/main/docs/fundamentals/art-client.mdx

Initialize the ART client with model name, project, and base model. Use this to set up your agent for training and inference.

```python
import art

model = art.TrainableModel(
    # the name of your model as it will appear in W&B
    # and other observability platforms
    name="agent-001",
    # keep your project name constant between all the models you train
    # for a given task to consistently group metrics
    project="my-agentic-task",
    # the model that you want to train from
    base_model="OpenPipe/Qwen3-14B-Instruct",
)
```

--------------------------------

### Model and Backend Initialization

Source: https://github.com/openpipe/art/blob/main/examples/temporal_clue/temporal-clue-7b-async.ipynb

Initializes a trainable model with a specific name, project, and base model, then registers it with a local backend.

```python
model = art.TrainableModel(
    name="056", project="temporal-clue", base_model="Qwen/Qwen2.5-7B-Instruct"
)
backend = LocalBackend()
await model.register(backend)
```

--------------------------------

### Deleting Checkpoints within a Training Loop

Source: https://github.com/openpipe/art/blob/main/docs/features/checkpoint-deletion.mdx

This example demonstrates how to integrate checkpoint deletion into a training loop to manage storage efficiently. It deletes checkpoints after each training step, keeping only the most recent and best-performing one based on a specified metric.
````APIDOC
## Training Loop with Checkpoint Deletion

### Description
This example shows a training loop that trains a model for a set number of steps. After each step, it calls `delete_checkpoints` to remove all but the most recent and best-performing checkpoint, significantly reducing storage overhead.

### Method
```python
# ... inside the training loop ...
await model.delete_checkpoints(best_checkpoint_metric="train/reward")
```

### Parameters
- **best_checkpoint_metric** (string) - Required - The metric to use for ranking checkpoints. In this example, `"train/reward"` is used.

### Request Example
```python
# ... inside the training loop ...
await model.delete_checkpoints(best_checkpoint_metric="train/reward")
```
````

--------------------------------

### Initialize Model and Backend

Source: https://github.com/openpipe/art/blob/main/docs/integrations/langgraph-integration.mdx

Initializes the language model and local backend for agent execution. Ensure these are configured correctly for your environment.

```python
import asyncio
import uuid
from dataclasses import asdict
from textwrap import dedent
from typing import List

import art
import weave
from langchain_core.messages import HumanMessage, SystemMessage
from langchain_core.tools import tool
from langgraph.prebuilt import create_react_agent
from litellm import acompletion
from pydantic import BaseModel, Field
from tenacity import retry, stop_after_attempt

from art.langgraph import init_chat_model, wrap_rollout
from art.local import LocalBackend
from art.utils import iterate_dataset

# Initialize model and backend.
# A TrainableModel is used because the model is later registered and trained;
# the name and project below are placeholders, not values from the original docs.
model = art.TrainableModel(
    name="email-agent-001",
    project="email-agent",
    base_model="Qwen/Qwen2.5-7B-Instruct",
)
backend = LocalBackend()
```

--------------------------------

### Create Benchmark Directory

Source: https://github.com/openpipe/art/blob/main/examples/mcp-rl/mcp_rl/benchmarks/display_benchmarks/mcp_alphavantage.ipynb

Creates a directory to store benchmark results if it does not already exist.
```python
benchmarks_dir = f"{get_repo_root_path()}/assets/benchmarks/{project_name}"
os.makedirs(benchmarks_dir, exist_ok=True)
```

--------------------------------

### Backend-Specific Training Behaviors

Source: https://github.com/openpipe/art/blob/main/docs/proposals/backend-first-training-api.md

Illustrates how to use backend-specific arguments for distinct training behaviors, such as skipping checkpoint saves, immediate deployment, or tensor visualization.

```python
# TinkerBackend: Train without saving (for rapid iteration)
await tinker_backend.train(model, groups, save_checkpoint=False)

# TinkerBackend: Train and immediately deploy
await tinker_backend.train(model, groups, deploy_checkpoint=True)

# LocalBackend: Visualize training tensors
await local_backend.train(model, groups, plot_tensors=True)

# ServerlessBackend: Just works, minimal options
await serverless_backend.train(model, groups, learning_rate=1e-5)
```

--------------------------------

### Serverless SFT Training with ART

Source: https://github.com/openpipe/art/blob/main/docs/fundamentals/sft-training.mdx

Use ServerlessBackend for remote SFT training on managed GPUs. Ensure the WANDB_API_KEY environment variable is set. The `trajectories` argument is assumed to be prepared elsewhere.

```python
import art
from art.serverless.backend import ServerlessBackend

backend = ServerlessBackend()  # uses WANDB_API_KEY env var
model = art.TrainableModel(
    name="my-sft-model",
    project="sft-project",
    base_model="Qwen/Qwen3-30B-A3B-Instruct-2507",
)
await model.register(backend)

await model.train_sft(trajectories, config=art.TrainSFTConfig(learning_rate=5e-5))
```

--------------------------------

### Initialize ART Trainable Model and Local Backend

Source: https://github.com/openpipe/art/blob/main/dev/math-vista/math-vista.ipynb

Initializes a trainable model using the ART library, specifying a base model and project. Registers the model with a local backend and obtains an OpenAI-compatible client.
```python
import re

import art
from art.local import LocalBackend

model = art.TrainableModel(
    name="002",
    project="math-vista",
    base_model="Qwen/Qwen2.5-VL-7B-Instruct",
)
backend = LocalBackend()
await model.register(backend)

client = model.openai_client()
```

--------------------------------

### Run All Tests

Source: https://github.com/openpipe/art/blob/main/CLAUDE.md

Run every prek check against all files in the repository with `uv run prek run --all-files`. Do this before committing changes.

```bash
uv run prek run --all-files
```

--------------------------------

### Register Qwen2.5 model with backend

Source: https://github.com/openpipe/art/blob/main/dev/new_models/qwen3_try.ipynb

Creates a TrainableModel instance for Qwen2.5 and registers it with the initialized backend. This prepares the model for training and inference.

```python
qwen2 = art.TrainableModel(
    name="004",
    project="yes-no-maybe-s",
    base_model="Qwen/Qwen2.5-0.5B-Instruct",
    # base_model="Qwen/Qwen2.5-0.5B-Instruct",
)
await qwen2.register(backend)
```

--------------------------------

### Basic MCP RL Training Pipeline

Source: https://github.com/openpipe/art/blob/main/docs/features/mcp-rl.mdx

This pipeline demonstrates initializing a trainable model, generating training scenarios, gathering trajectory groups, scoring them using RULER, and finally training the model. Ensure you have your OpenRouter API key, and that the tools list, resources list, and `rollout` function are defined elsewhere.
```python
import art
from art.local import LocalBackend
from art.mcp import generate_scenarios
from art.rewards import ruler_score_group
from art import gather_trajectory_groups

# Initialize the model.
# name and project are placeholders; TrainableModel takes name, project, and
# base_model, as in the other examples in this document.
model = art.TrainableModel(
    name="mcp-agent-001",
    project="mcp-rl",
    base_model="OpenPipe/Qwen3-14B-Instruct",
)

# Register with a backend (LocalBackend is used here for illustration)
backend = LocalBackend()
await model.register(backend)

# Generate training scenarios automatically
scenario_collection = await generate_scenarios(
    tools=tools_list,
    resources=resources_list,
    num_scenarios=100,
    show_preview=False,
    generator_model="gpt-4o-mini",
    generator_api_key="your_openrouter_key",
)

# Gather trajectory groups
groups = await gather_trajectory_groups(
    (
        art.TrajectoryGroup(
            rollout(model, scenario, False)
            for _ in range(4)  # rollouts per group
        )
        for scenario in scenario_collection
    ),
    pbar_desc="train gather step",
)

# Score groups using RULER
scored_groups = [
    await ruler_score_group(
        group,
        judge_model="gpt-4o-mini",
        debug=True,
        swallow_exceptions=True
    )
    for group in groups
]

# Train the model
result = await backend.train(model, scored_groups, learning_rate=1e-5)
await model.log(scored_groups, metrics=result.metrics, step=result.step, split="train")
```
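
--------------------------------

### Sketch: Check SFT Records Before Training

The SFT data format examples above show records with a `messages` list, optional assistant `tool_calls`, and `tool` messages that answer a call through `tool_call_id`. The following is a minimal, standard-library-only sketch of a pre-flight check over JSONL lines in that shape. It is not part of ART: `validate_sft_record` and `validate_sft_lines` are hypothetical helper names, and the rules they enforce (known roles, JSON-encoded arguments, every tool call answered) are assumptions drawn from the examples rather than a documented schema.

```python
import json
from typing import Any

# Roles that appear in the SFT examples above (assumed to be the full set)
ALLOWED_ROLES = {"system", "user", "assistant", "tool"}


def validate_sft_record(record: Any) -> list[str]:
    """Return the problems found in one SFT record; an empty list means it passed."""
    messages = record.get("messages") if isinstance(record, dict) else None
    if not isinstance(messages, list) or not messages:
        return ["'messages' must be a non-empty list"]

    problems: list[str] = []
    open_call_ids: set[Any] = set()  # tool calls not yet answered by a tool message
    for i, msg in enumerate(messages):
        role = msg.get("role") if isinstance(msg, dict) else None
        if role not in ALLOWED_ROLES:
            problems.append(f"message {i}: unexpected role {role!r}")
        elif role == "assistant":
            for call in msg.get("tool_calls") or []:
                open_call_ids.add(call.get("id"))
                try:
                    # In the examples above, arguments are a JSON-encoded string
                    json.loads(call["function"]["arguments"])
                except (KeyError, TypeError, json.JSONDecodeError):
                    problems.append(f"message {i}: tool call arguments are not valid JSON")
        elif role == "tool":
            call_id = msg.get("tool_call_id")
            if call_id in open_call_ids:
                open_call_ids.discard(call_id)
            else:
                problems.append(f"message {i}: tool_call_id {call_id!r} matches no earlier call")

    if open_call_ids:
        problems.append(f"unanswered tool calls: {sorted(map(str, open_call_ids))}")
    return problems


def validate_sft_lines(lines: list[str]) -> dict[int, list[str]]:
    """Map 1-based JSONL line numbers to their problems; valid lines are omitted."""
    report: dict[int, list[str]] = {}
    for lineno, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # skip blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            report[lineno] = [f"not valid JSON: {exc.msg}"]
            continue
        problems = validate_sft_record(record)
        if problems:
            report[lineno] = problems
    return report
```

A typical use would be to read a training file such as `data/train.jsonl` with `open(path).read().splitlines()`, pass the lines to `validate_sft_lines`, and only start SFT when the returned report is empty. Checking locally first is a design choice for catching malformed records before any GPU time is spent; the exact rules should be adjusted to whatever the training backend actually accepts.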