### Quick Example: Install, Configure, and Train

Source: https://github.com/microsoft/skillopt/blob/main/docs/index.md

This example shows how to install SkillOpt, configure Azure OpenAI credentials, and train on the SearchQA benchmark.

```bash
# Install
pip install -e .

# Configure credentials
export AZURE_OPENAI_ENDPOINT="https://your-resource.openai.azure.com/"
export AZURE_OPENAI_API_KEY="your-key"

# Train on SearchQA
python scripts/train.py --config configs/searchqa/default.yaml

# Evaluate best skill
python scripts/eval_only.py \
  --config configs/searchqa/default.yaml \
  --skill outputs/best_skill.md
```

--------------------------------

### Copy Environment Example

Source: https://github.com/microsoft/skillopt/blob/main/docs/guide/installation.md

Copies the example .env file to .env for user configuration.

```bash
cp .env.example .env
```

--------------------------------

### Quick Start - Training on SearchQA

Source: https://github.com/microsoft/skillopt/blob/main/README.md

Minimal example command to train on SearchQA using a specified configuration, data split, and Azure OpenAI endpoint.

```bash
python scripts/train.py \
  --config configs/searchqa/default.yaml \
  --split_dir /path/to/your/searchqa_split \
  --azure_openai_endpoint https://your-resource.openai.azure.com/ \
  --optimizer_model gpt-5.5 \
  --target_model gpt-5.5
```

--------------------------------

### Quick Start - Training CLI Arguments

Source: https://github.com/microsoft/skillopt/blob/main/README.md

Table detailing key command-line arguments for the training script, including their descriptions and examples.

```markdown
| Argument | Description | Example |
|---|---|---|
| `--config` | Benchmark config YAML | `configs/searchqa/default.yaml` |
| `--split_dir` | Path to data split directory | `/path/to/split` |
| `--azure_openai_endpoint` | Azure OpenAI endpoint URL | `https://your-resource.openai.azure.com/` |
| `--optimizer_model` | Optimizer model deployment name | `gpt-5.5` |
| `--target_model` | Target model deployment name | `gpt-5.5` |
| `--num_epochs` | Number of training epochs | `4` |
| `--batch_size` | Batch size per step | `40` |
| `--workers` | Parallel rollout workers | `8` |
| `--out_root` | Output directory | `outputs/my_run` |
```

--------------------------------

### Environment Variables Example

Source: https://github.com/microsoft/skillopt/blob/main/docs/guide/installation.md

Example configuration for environment variables, including Azure OpenAI, OpenAI, and Anthropic API keys.

```ini
# Azure OpenAI (default backend)
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/
AZURE_OPENAI_API_KEY=your-key

# Or use OpenAI directly
OPENAI_API_KEY=sk-...

# Or Anthropic Claude
ANTHROPIC_API_KEY=sk-ant-...
```

--------------------------------

### Install and launch WebUI

Source: https://github.com/microsoft/skillopt/blob/main/docs/guide/first-experiment.md

Commands to install the WebUI dependencies and launch the application.

```bash
pip install -e ".[webui]"
python -m skillopt_webui.app
```

--------------------------------

### Install All Extras

Source: https://github.com/microsoft/skillopt/blob/main/docs/guide/installation.md

Installs all optional dependencies.

```bash
pip install -e ".[alfworld,claude,qwen,webui,dev]"
```

--------------------------------

### CLI Overrides Example

Source: https://github.com/microsoft/skillopt/blob/main/docs/guide/configuration.md

Example of how to override configuration values from the command line.

```bash
python scripts/train.py \
  --config configs/searchqa/default.yaml \
  optimizer.learning_rate=16 \
  optimizer.lr_scheduler=linear \
  gradient.analyst_workers=8
```

--------------------------------

### Quick Install

Source: https://github.com/microsoft/skillopt/blob/main/docs/guide/installation.md

Clones the SkillOpt repository, navigates into the directory, and installs the package in editable mode.

```bash
git clone https://github.com/microsoft/SkillOpt.git
cd SkillOpt
pip install -e .
```

--------------------------------

### Install Qwen (Local) Extras

Source: https://github.com/microsoft/skillopt/blob/main/docs/guide/installation.md

Installs optional dependencies for the local Qwen backend.

```bash
pip install -e ".[qwen]"
```

--------------------------------

### Example configuration parameters

Source: https://github.com/microsoft/skillopt/blob/main/docs/guide/first-experiment.md

Key parameters in the configuration file with analogies to deep learning concepts.

```yaml
train:
  num_epochs: 4              # (epochs)
  batch_size: 40             # (batch size)

optimizer:
  learning_rate: 4           # (max edits per step)
  lr_scheduler: cosine       # (learning rate schedule)
  use_slow_update: true      # (momentum at epoch boundary)
  use_meta_skill: true       # (cross-epoch optimizer memory)

gradient:
  analyst_workers: 16        # (parallel reflection workers)

evaluation:
  use_gate: true             # (validation gating)
```

--------------------------------

### Install Documentation Dependencies and Serve

Source: https://github.com/microsoft/skillopt/blob/main/CONTRIBUTING.md

Commands to install documentation dependencies and preview the documentation locally.

```bash
pip install -e ".[docs]"
mkdocs serve  # Preview at http://localhost:8000
```

--------------------------------

### Training Examples

Source: https://github.com/microsoft/skillopt/blob/main/docs/reference/cli.md

Examples demonstrating how to use the training command with different configurations and overrides.

```bash
# Basic training
python scripts/train.py --config configs/searchqa/default.yaml
```

```bash
# With overrides
python scripts/train.py \
  --config configs/searchqa/default.yaml \
  --cfg-options optimizer.learning_rate=16 optimizer.lr_scheduler=linear
```

```bash
# With custom initial skill
python scripts/train.py \
  --config configs/searchqa/default.yaml \
  --cfg-options env.skill_init=skills/my_seed.md
```

--------------------------------

### Configure API Credentials - Example

Source: https://github.com/microsoft/skillopt/blob/main/README.md

Copies the example environment file, which you then edit with your API credentials and source into the shell.

```bash
cp .env.example .env
# Edit .env with your API credentials, then:
source .env
```

--------------------------------

### Install WebUI Extras

Source: https://github.com/microsoft/skillopt/blob/main/docs/guide/installation.md

Installs optional dependencies for the WebUI.

```bash
pip install -e ".[webui]"
```

--------------------------------

### Verify Installation

Source: https://github.com/microsoft/skillopt/blob/main/docs/guide/installation.md

Verifies the SkillOpt installation by importing the library and printing a success message.

```python
import skillopt; print('SkillOpt ready!')
```

--------------------------------

### Install Development Extras

Source: https://github.com/microsoft/skillopt/blob/main/docs/guide/installation.md

Installs optional dependencies for development.

```bash
pip install -e ".[dev]"
```

--------------------------------

### Install SkillOpt

Source: https://github.com/microsoft/skillopt/blob/main/README.md

Clones the repository, installs the package, and optionally installs dependencies for the ALFWorld benchmark.

```bash
git clone https://github.com/microsoft/SkillOpt.git
cd SkillOpt
pip install -e .
# For ALFWorld benchmark (optional):
pip install -e ".[alfworld]"
alfworld-download
```

--------------------------------

### Install Claude Backend Extras

Source: https://github.com/microsoft/skillopt/blob/main/docs/guide/installation.md

Installs optional dependencies for the Claude backend.

```bash
pip install -e ".[claude]"
```

--------------------------------

### Quick Start - Training on LiveMathematicianBench

Source: https://github.com/microsoft/skillopt/blob/main/README.md

Command to train on LiveMathematicianBench using a specified configuration, data split, and Azure OpenAI endpoint.

```bash
python scripts/train.py \
  --config configs/livemathematicianbench/default.yaml \
  --split_dir /path/to/your/livemath_split \
  --azure_openai_endpoint https://your-resource.openai.azure.com/ \
  --optimizer_model gpt-5.5 \
  --target_model gpt-5.5
```

--------------------------------

### Install ALFWorld Extras

Source: https://github.com/microsoft/skillopt/blob/main/docs/guide/installation.md

Installs optional dependencies for the ALFWorld benchmark.

```bash
pip install -e ".[alfworld]"
```

--------------------------------

### Train the model

Source: https://github.com/microsoft/skillopt/blob/main/docs/guide/first-experiment.md

Command to start the training process using the specified configuration file.

```bash
python scripts/train.py --config configs/searchqa/default.yaml
```

--------------------------------

### Documentation Preview

Source: https://github.com/microsoft/skillopt/blob/main/docs/contributing.md

Commands to install documentation dependencies and serve the documentation locally for preview.

```bash
pip install -e ".[docs]"
mkdocs serve  # Preview at http://localhost:8000
```

--------------------------------

### Solution.py Initialization

Source: https://github.com/microsoft/skillopt/blob/main/skillopt/envs/spreadsheetbench/prompts/react_system.md

The required starting lines for the solution.py script, defining input and output paths.

```python
INPUT_PATH = ""
OUTPUT_PATH = ""
```

--------------------------------

### Create Config

Source: https://github.com/microsoft/skillopt/blob/main/docs/guide/new-benchmark.md

Example YAML configuration file for the new benchmark.

```yaml
_base_: ['../_base_/default.yaml']

env:
  name: my_benchmark
  data_path: data/my_benchmark
  split_mode: ratio
  split_ratio: "2:1:7"

train:
  num_epochs: 4
  batch_size: 40

optimizer:
  learning_rate: 4
  lr_scheduler: cosine
  use_slow_update: true
  use_meta_skill: true

gradient:
  analyst_workers: 16
```

--------------------------------

### Configuration Example

Source: https://github.com/microsoft/skillopt/blob/main/docs/guide/new-backend.md

This YAML snippet demonstrates how to configure SkillOpt to use the new backend.

```yaml
model:
  backend: your_backend
  model_name: your-model-id
  temperature: 0.7
  max_tokens: 4096
```

--------------------------------

### Initial Skill Configuration

Source: https://github.com/microsoft/skillopt/blob/main/docs/guide/skill-document.md

Example YAML configuration for setting the initial skill for training.

```yaml
train:
  init_skill: "path/to/initial_skill.md"  # or omit for empty
```

--------------------------------

### Evaluation Examples

Source: https://github.com/microsoft/skillopt/blob/main/docs/reference/cli.md

Examples demonstrating how to use the evaluation command to assess skills on different data splits.

```bash
# Evaluate best skill on test set
python scripts/eval_only.py \
  --config configs/searchqa/default.yaml \
  --skill outputs/searchqa/run_001/skills/best_skill.md
```

```bash
# Evaluate on validation set
python scripts/eval_only.py \
  --config configs/searchqa/default.yaml \
  --skill outputs/searchqa/run_001/skills/best_skill.md \
  --split valid
```

--------------------------------

### Example training output

Source: https://github.com/microsoft/skillopt/blob/main/docs/guide/first-experiment.md

Sample output observed during the training process.

```text
[Step 1/8] Rollout: 20 items, 4 workers...
[Step 1/8] Score: 0.65 → Reflect...
[Step 1/8] 6 edit patches generated
[Step 1/8] Selected 4 edits (lr=8, cosine → 7.7)
[Step 1/8] Gate: val score 0.68 > 0.65 ✓ ACCEPT
[Step 2/8] ...
```

--------------------------------

### Clone and Install Development Dependencies

Source: https://github.com/microsoft/skillopt/blob/main/CONTRIBUTING.md

Steps to clone the SkillOpt repository and install development dependencies.

```bash
git clone https://github.com/microsoft/SkillOpt.git
cd SkillOpt
pip install -e ".[dev]"
```

--------------------------------

### Quick Start - Training on ALFWorld

Source: https://github.com/microsoft/skillopt/blob/main/README.md

Command to train on ALFWorld using a specified configuration, data split, and Azure OpenAI endpoint.

```bash
python scripts/train.py \
  --config configs/alfworld/default.yaml \
  --split_dir /path/to/your/alfworld_split \
  --azure_openai_endpoint https://your-resource.openai.azure.com/ \
  --optimizer_model gpt-5.5 \
  --target_model gpt-5.5
```

--------------------------------

### Quick Start - Eval Only

Source: https://github.com/microsoft/skillopt/blob/main/README.md

Command to evaluate a trained skill on specific data splits without further training.

```bash
# Evaluate a trained skill on specific data splits without training:
```

--------------------------------

### Model Parameters

Source: https://github.com/microsoft/skillopt/blob/main/docs/guide/configuration.md

Configuration parameters for the model backend, optimizer, and target.

```yaml
model:
  backend: azure_openai    # azure_openai | openai_chat | claude_code_exec | qwen
  optimizer: gpt-5.5       # Optimizer model (for reflection)
  target: gpt-5.5          # Target model (for rollout)
```

--------------------------------

### Optimizer Parameters

Source: https://github.com/microsoft/skillopt/blob/main/docs/guide/configuration.md

Configuration parameters for the optimizer, including learning rate, scheduler, and meta-skill usage.

```yaml
optimizer:
  learning_rate: 4           # Max edits per step (edit budget)
  min_learning_rate: 2       # Min edits for decay schedulers
  lr_scheduler: cosine       # constant | linear | cosine | autonomous
  use_slow_update: true      # Momentum-like blending at epoch boundary
  slow_update_samples: 20    # Samples for slow update evaluation
  use_meta_skill: true       # Cross-epoch strategy memory
```

--------------------------------

### Gradient (Reflection) Parameters

Source: https://github.com/microsoft/skillopt/blob/main/docs/guide/configuration.md

Configuration parameters for gradient reflection, including minibatch size and analyst workers.

```yaml
gradient:
  minibatch_size: 8          # Reflect minibatch size
  analyst_workers: 16        # Parallel reflection workers
  max_analyst_rounds: 3      # Max rounds of analyst reflection
  failure_only: false        # Only reflect on failures
```

--------------------------------

### Config Structure

Source: https://github.com/microsoft/skillopt/blob/main/docs/guide/configuration.md

Directory structure for configuration files, showing global defaults and benchmark-specific overrides.

```text
configs/
├── _base_/
│   └── default.yaml          # Global defaults
├── searchqa/
│   └── default.yaml          # SearchQA overrides
├── docvqa/
│   └── default.yaml          # DocVQA overrides
└── alfworld/
    └── default.yaml          # ALFWorld overrides
```

--------------------------------

### Environment (Data) Parameters

Source: https://github.com/microsoft/skillopt/blob/main/docs/guide/configuration.md

Configuration parameters for the environment and data, including benchmark name, split mode, and data path.

```yaml
env:
  name: searchqa             # Benchmark name
  split_mode: ratio          # ratio | split_dir
  split_ratio: "2:1:7"       # train:val:test ratio
  data_path: ""              # Path to dataset
  exec_timeout: 120          # Per-task timeout (seconds)
```

--------------------------------

### Training Parameters

Source: https://github.com/microsoft/skillopt/blob/main/docs/guide/configuration.md

Configuration parameters for the training process, including epochs, batch size, and seed.

```yaml
train:
  num_epochs: 4              # Number of training epochs
  batch_size: 40             # Tasks per step (batch size)
  accumulation: 1            # Gradient accumulation
  seed: 42
```

--------------------------------

### Evaluation Parameters

Source: https://github.com/microsoft/skillopt/blob/main/docs/guide/configuration.md

Configuration parameters for evaluation, including validation gating and test evaluation.

```yaml
evaluation:
  use_gate: true             # Validation gating (accept/reject updates)
  eval_test: true            # Run test evaluation after training
```

--------------------------------

### Data Preparation - Example JSON Structure (SearchQA)

Source: https://github.com/microsoft/skillopt/blob/main/README.md

Shows the JSON format for SearchQA task items, including id, question, context, and answers.

```json
[
  {
    "id": "unique_item_id",
    "question": "Who wrote the novel ...",
    "context": "[DOC] relevant passage text ...",
    "answers": ["expected answer"]
  }
]
```

--------------------------------

### Review the config file

Source: https://github.com/microsoft/skillopt/blob/main/docs/guide/first-experiment.md

View the default configuration file for the SearchQA benchmark.

```bash
cat configs/searchqa/default.yaml
```

--------------------------------

### Usage

Source: https://github.com/microsoft/skillopt/blob/main/skillopt/envs/_template/README.md

Steps to copy, rename, implement, register, and create configuration for a new benchmark.

```bash
cp -r skillopt/envs/_template skillopt/envs/your_benchmark
```

--------------------------------

### Typical Skill Document Structure

Source: https://github.com/microsoft/skillopt/blob/main/docs/guide/skill-document.md

An example of the structure of a typical skill document in Markdown format.

```markdown
# Task Strategy

## General Approach
- Break complex problems into sub-steps
- Always verify intermediate results

## Common Patterns
- When you see X, try approach Y
- Avoid Z because it leads to errors

## Edge Cases
- If the input contains A, handle it specially by...
- Watch out for B — it requires C

## Output Format
- Always include reasoning before the answer
- Format numbers with proper units
```

--------------------------------

### Backend Architecture Directory Structure

Source: https://github.com/microsoft/skillopt/blob/main/docs/guide/new-backend.md

The directory structure shows where new backends should be placed, with `your_backend.py` as an example.

```tree
skillopt/model/
├── base.py              # Abstract base class
├── azure_openai.py      # Azure OpenAI backend
├── openai_model.py      # Direct OpenAI backend
├── claude.py            # Anthropic Claude backend
├── qwen.py              # Local Qwen (vLLM) backend
└── your_backend.py      # Your new backend
```

--------------------------------

### Configure API Credentials - Qwen (local vLLM)

Source: https://github.com/microsoft/skillopt/blob/main/README.md

Sets environment variables for Qwen chat base URL and model name for local vLLM deployment.

```bash
export QWEN_CHAT_BASE_URL="http://localhost:8000/v1"
export QWEN_CHAT_MODEL="Qwen/Qwen3.5-4B"
```

--------------------------------

### Example Custom Backend Implementation

Source: https://github.com/microsoft/skillopt/blob/main/docs/guide/new-backend.md

This Python code defines a custom backend `YourBackend` that inherits from `ModelBackend` and implements the `generate` method.

```python
from skillopt.model.base import ModelBackend, ModelResponse
import os


class YourBackend(ModelBackend):
    """Your custom model backend."""

    def __init__(self, cfg: dict):
        super().__init__(cfg)
        self.model_name = cfg.get('model_name', 'your-default-model')
        self.api_key = os.environ.get('YOUR_API_KEY', '')
        self.client = self._init_client()

    def _init_client(self):
        """Initialize API client."""
        # TODO: Set up your API client
        pass

    async def generate(
        self,
        messages: list[dict],
        temperature: float = 0.7,
        max_tokens: int = 4096,
        **kwargs
    ) -> ModelResponse:
        """
        Generate a completion.

        Args:
            messages: Chat messages [{"role": "...", "content": "..."}]
            temperature: Sampling temperature
            max_tokens: Maximum tokens in response

        Returns:
            ModelResponse with content, usage, and metadata
        """
        response = await self.client.chat(
            model=self.model_name,
            messages=messages,
            temperature=temperature,
            max_tokens=max_tokens,
        )

        return ModelResponse(
            content=response.text,
            usage={
                'prompt_tokens': response.usage.input,
                'completion_tokens': response.usage.output,
            },
            model=self.model_name,
        )

    async def generate_with_tools(
        self,
        messages: list[dict],
        tools: list[dict],
        **kwargs
    ) -> ModelResponse:
        """Generate with tool/function calling support."""
        # Optional: implement if your model supports tool use
        raise NotImplementedError("Tool use not supported")
```

--------------------------------

### Data Preparation - Directory Structure

Source: https://github.com/microsoft/skillopt/blob/main/README.md

Illustrates the expected directory structure for data preparation, with train, val, and test subdirectories each containing a JSON file.

```text
data/my_split/
├── train/items.json
├── val/items.json
└── test/items.json
```

--------------------------------

### WebUI Command

Source: https://github.com/microsoft/skillopt/blob/main/docs/reference/cli.md

The command to launch the SkillOpt Web User Interface.

```bash
python -m skillopt_webui.app [--port PORT] [--share]
```

--------------------------------

### Create Benchmark Package

Source: https://github.com/microsoft/skillopt/blob/main/docs/guide/new-benchmark.md

Create the directory structure for a new benchmark.

```bash
mkdir -p skillopt/envs/my_benchmark
touch skillopt/envs/my_benchmark/__init__.py
```

--------------------------------

### Launch WebUI with public share link

Source: https://github.com/microsoft/skillopt/blob/main/README.md

Command to launch the WebUI with a public share link, useful for remote servers.

```bash
python -m skillopt_webui.app --share
```

--------------------------------

### Select Stage Analogy

Source: https://github.com/microsoft/skillopt/blob/main/docs/guide/training-loop.md

Python code illustrating gradient clipping and optimizer step size analogy for the select stage.

```python
# Analogy: gradient clipping + optimizer step size
selected = top_k(edits, k=learning_rate)
```

--------------------------------

### Run Training

Source: https://github.com/microsoft/skillopt/blob/main/docs/guide/new-benchmark.md

Command to run the training script with the new benchmark configuration.

```bash
python scripts/train.py --config configs/my_benchmark/default.yaml
```

--------------------------------

### Optional: Qwen local model (via vLLM) Dependencies

Source: https://github.com/microsoft/skillopt/blob/main/requirements.txt

These packages are optional and are needed for using Qwen local models via vLLM.

```text
# vllm>=0.4.0
```

--------------------------------

### Configure API Credentials - OpenAI

Source: https://github.com/microsoft/skillopt/blob/main/README.md

Sets the environment variable for OpenAI API key.

```bash
export OPENAI_API_KEY="sk-..."
```

--------------------------------

### Rollout Stage Analogy

Source: https://github.com/microsoft/skillopt/blob/main/docs/guide/training-loop.md

Python code illustrating the forward pass analogy for the rollout stage.

```python
# Analogy: forward pass through the network
predictions = model(input, skill_document)
scores = evaluate(predictions, ground_truth)
```

--------------------------------

### Implement Environment Adapter

Source: https://github.com/microsoft/skillopt/blob/main/docs/guide/new-benchmark.md

Python code for a custom environment adapter.

```python
from skillopt.envs.base import EnvAdapter, TaskResult


class MyBenchmarkEnv(EnvAdapter):
    """Execute tasks and evaluate results."""

    def __init__(self, cfg: dict):
        super().__init__(cfg)

    # The import for DataItem is not shown in the source, so the annotation
    # is quoted here to avoid a NameError when the class is defined.
    async def execute(self, item: "DataItem", skill: str, model) -> TaskResult:
        """
        Execute a single task.

        Args:
            item: The data item to process
            skill: Current skill document content
            model: The target model instance

        Returns:
            TaskResult with prediction, score, and trajectory
        """
        # Build prompt with skill document
        prompt = self.build_prompt(item, skill)

        # Get model response
        response = await model.generate(prompt)

        # Extract prediction
        prediction = self.parse_response(response)

        # Score against ground truth
        score = self.evaluate(prediction, item.ground_truth)

        return TaskResult(
            item_id=item.id,
            prediction=prediction,
            score=score,
            trajectory=[
                {"role": "system", "content": skill},
                {"role": "user", "content": item.input},
                {"role": "assistant", "content": response}
            ]
        )

    def evaluate(self, prediction: str, ground_truth: str) -> float:
        """
        Score a prediction against ground truth.

        Returns:
            Float between 0.0 and 1.0
        """
        # TODO: Implement your scoring logic
        # Examples: exact match, F1, ANLS, etc.
        return float(prediction.strip() == ground_truth.strip())

    def build_prompt(self, item, skill: str) -> str:
        """Combine skill document with task input."""
        return f"{skill}\n\n---\n\nQuestion: {item.input}"

    def parse_response(self, response: str) -> str:
        """Extract the answer from model response."""
        return response.strip()
```

--------------------------------

### Evaluate the best skill

Source: https://github.com/microsoft/skillopt/blob/main/docs/guide/first-experiment.md

Command to evaluate the best skill on the test split.

```bash
python scripts/eval_only.py \
  --config configs/searchqa/default.yaml \
  --skill outputs/searchqa//skills/best_skill.md
```

--------------------------------

### Comparison Grid CSS

Source: https://github.com/microsoft/skillopt/blob/main/index.html

CSS for creating a comparison grid layout.

```css
.comparison-grid {
  display: grid;
  grid-template-columns: repeat(3, minmax(0, 1fr));
  gap: 12px;
}
```
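
--------------------------------

### Illustrative Sketch: Edit Budget Schedule and Top-k Selection

The optimizer parameters earlier in this collection describe `learning_rate` as the maximum number of edits per step and `min_learning_rate` as the floor for decaying schedulers, and the select-stage analogy shows `top_k(edits, k=learning_rate)`. The snippet below is a minimal sketch of how those pieces might fit together, not SkillOpt's implementation. The function names `edit_budget` and `select_edits`, the `score` field on each edit, the rounding rule, and the use of the standard cosine-annealing formula are all assumptions made for illustration. The `autonomous` scheduler is not sketched.

```python
import math


def edit_budget(step: int, total_steps: int, max_lr: float = 4,
                min_lr: float = 2, scheduler: str = "cosine") -> float:
    """Hypothetical edit budget for a 0-indexed step (assumed formulas)."""
    if scheduler == "constant" or total_steps <= 1:
        return float(max_lr)
    progress = step / (total_steps - 1)  # 0.0 at the first step, 1.0 at the last
    if scheduler == "linear":
        return max_lr - (max_lr - min_lr) * progress
    if scheduler == "cosine":
        # Standard cosine annealing from max_lr down to min_lr.
        return min_lr + 0.5 * (max_lr - min_lr) * (1 + math.cos(math.pi * progress))
    raise ValueError(f"unsupported scheduler in this sketch: {scheduler}")


def select_edits(edits: list[dict], budget: float) -> list[dict]:
    """Keep the highest-scoring edits, at most round(budget) of them (assumed rule)."""
    k = max(1, round(budget))
    return sorted(edits, key=lambda e: e["score"], reverse=True)[:k]


if __name__ == "__main__":
    # Hypothetical candidate edits; the "score" field is an assumption.
    edits = [{"id": i, "score": s}
             for i, s in enumerate([0.2, 0.9, 0.5, 0.7, 0.1, 0.4])]
    for step in range(8):
        budget = edit_budget(step, total_steps=8)
        chosen = select_edits(edits, budget)
        print(step, round(budget, 2), [e["id"] for e in chosen])
```

With the default values `learning_rate: 4` and `min_learning_rate: 2`, this sketch starts at a budget of four edits and decays toward two by the final step; the real scheduler may round or clamp differently.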