### Configure Service Environment Variables (.env)

Source: https://github.com/modelscope/agentevolver/blob/main/docs/launcher.md

This example shows the necessary environment variables to be defined in a `.env` file for a new service. `MYENV_PATH` specifies the working directory, and `MYENV_SCRIPT` defines the command to start the service.

```dotenv
# MyEnv service
MYENV_PATH=/abs/path/to/myenv
MYENV_SCRIPT=python -m myenv.api --host 0.0.0.0 --port 9009
```
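
As a rough illustration of how these two variables could be consumed, the sketch below parses the `.env` text and splits the script into an argument list. The `parse_dotenv` helper is hypothetical and is not the launcher's actual loading code; it only handles plain `KEY=VALUE` lines.

```python
import shlex


def parse_dotenv(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


env_text = """
# MyEnv service
MYENV_PATH=/abs/path/to/myenv
MYENV_SCRIPT=python -m myenv.api --host 0.0.0.0 --port 9009
"""

env = parse_dotenv(env_text)
workdir = env["MYENV_PATH"]               # working directory for the service
argv = shlex.split(env["MYENV_SCRIPT"])   # command split into arguments
print(workdir)
print(argv)
```

A process could then be started with `subprocess.Popen(argv, cwd=workdir)`, assuming the path exists on the machine.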

--------------------------------

### Launch full experiment

Source: https://github.com/modelscope/agentevolver/blob/main/docs/launcher.md

Execute the launcher script with flags to clean up existing processes, back up configurations, and start required services.

```bash
python launcher.py --kill --conf examples/self-question-nav-attr.yaml --with-appworld --with-exp-maker
```

--------------------------------

### Setup Appworld Environment Service (Bash)

Source: https://github.com/modelscope/agentevolver/blob/main/docs/index.md

Sets up the environment service for AppWorld. Run `setup.sh` from the `env_service/environments/appworld` directory, as the command does.

```bash
cd env_service/environments/appworld && bash setup.sh
```

--------------------------------

### Launch AgentEvolver Training

Source: https://github.com/modelscope/agentevolver/blob/main/README.md

Starts the training process using the AgentEvolver launcher. Includes options for minimal training or full training with ReMe integration.

```bash
conda activate agentevolver

# option 1: minimal example without ReMe (using built-in datasets within environments)
python launcher.py --conf examples/basic.yaml --with-appworld

# option 2: full example with ReMe (questioning + navigating + attributing)
python launcher.py --conf examples/overall.yaml --with-appworld --with-reme
```

--------------------------------

### Install Project Dependencies

Source: https://github.com/modelscope/agentevolver/blob/main/games/README.md

Installs all required packages for the AgentEvolver project, including game-specific dependencies. This is a prerequisite before training an LLM agent.

```bash
bash install.sh
pip install -r games/requirements_game.txt
```

--------------------------------

### Environment Initialization Request and Response

Source: https://github.com/modelscope/agentevolver/blob/main/docs/guidelines/env_service.md

Example JSON structures for the /create endpoint. The input requires environment identification and task parameters, while the output provides the initial system state and instance metadata.

```json
{
  "env": "appworld",
  "instance_id": "my-instance-id-001",
  "task_id": "task-001",
  "params": {
    "simple": false,
    "prompt": true
  }
}
```

```json
{
  "state": [
    {
      "role": "system",
      "content": "[Structured prompt with task description and tool info]"
    },
    {
      "role": "user",
      "content": "[Task instruction from Appworld]"
    }
  ],
  "info": {
    "instance_id": "my-instance-id-001",
    "task_id": "task-001"
  }
}
```
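
A minimal sketch of reading the response shape shown above, using only the standard library. The `split_create_response` helper is hypothetical and assumes the response always carries `state` and `info` as in the example.

```python
import json

# Sample response copied from the /create example above.
response_text = """
{
  "state": [
    {"role": "system", "content": "[Structured prompt with task description and tool info]"},
    {"role": "user", "content": "[Task instruction from Appworld]"}
  ],
  "info": {"instance_id": "my-instance-id-001", "task_id": "task-001"}
}
"""


def split_create_response(response: dict):
    """Return (messages, instance_id, task_id) from a /create response."""
    messages = [(m["role"], m["content"]) for m in response["state"]]
    info = response["info"]
    return messages, info["instance_id"], info["task_id"]


messages, instance_id, task_id = split_create_response(json.loads(response_text))
print([role for role, _ in messages])
print(instance_id, task_id)
```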

--------------------------------

### Orchestrate Parallel Rollouts with ParallelEnvManager

Source: https://context7.com/modelscope/agentevolver/llms.txt

Demonstrates how to initialize the ParallelEnvManager for orchestrating parallel rollouts. It covers creating tasks, getting experience configurations, executing rollouts, and converting trajectories to a training data format.

```python
from agentevolver.module.env_manager.env_manager import ParallelEnvManager
from agentevolver.module.exp_manager.exp_manager import ExperienceManager, TaskExpConfig
from agentevolver.schema.task import Task

# Initialize parallel environment manager
env_manager = ParallelEnvManager(
    config=config,
    async_rollout_manager=async_rollout_manager,
    max_parallel=64,  # Max concurrent environment workers
    max_llm_retries=3
)

# Create tasks for rollout
tasks = [
    Task(task_id=f"task_{i}", query=f"Query {i}", env_type="appworld")
    for i in range(32)
]

# Initialize experience manager for self-navigating
exp_manager = ExperienceManager(config=config)

# Get experience configurations for each task
task_exp_configs = exp_manager.get_complete_exp_configs(tasks, mode="sample")

# Execute parallel rollouts
trajectories = env_manager.rollout(
    tasks=tasks,
    task_exp_configs=task_exp_configs,
    mode="sample",  # or "validate"
    epoch="train.1.0"
)
# Returns List[Trajectory] sorted by (data_id, rollout_id)

# Convert trajectories to DataProto for training
dataproto = env_manager.to_dataproto(trajectories)
# dataproto.batch contains: prompts, responses, attention_mask, loss_mask, etc.
# dataproto.non_tensor_batch contains: task_ids, reward_scores, steps, etc.
```
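
The general pattern behind this kind of orchestration (run many episodes concurrently, then restore a stable order) can be sketched with the standard library. This is not the `ParallelEnvManager` implementation; `ToyTrajectory`, `run_one`, and `toy_rollout` are hypothetical stand-ins, and integer IDs are used for simplicity.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from dataclasses import dataclass


@dataclass
class ToyTrajectory:
    data_id: int
    rollout_id: int
    reward: float


def run_one(data_id: int, rollout_id: int) -> ToyTrajectory:
    # Stand-in for one agent-environment episode.
    return ToyTrajectory(data_id, rollout_id, reward=float((data_id + rollout_id) % 2))


def toy_rollout(num_tasks: int, n_rollouts: int, max_parallel: int = 8):
    jobs = [(d, r) for d in range(num_tasks) for r in range(n_rollouts)]
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        futures = [pool.submit(run_one, d, r) for d, r in jobs]
        # Results arrive in completion order, which is not deterministic.
        results = [f.result() for f in as_completed(futures)]
    # Sort so downstream code sees a stable (data_id, rollout_id) order.
    return sorted(results, key=lambda t: (t.data_id, t.rollout_id))


toy_trajectories = toy_rollout(num_tasks=4, n_rollouts=2)
print(len(toy_trajectories))
```

Sorting after collection keeps the output independent of which worker finishes first.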

--------------------------------

### Install ReMe Experience Management (Bash)

Source: https://github.com/modelscope/agentevolver/blob/main/docs/index.md

Installs the ReMe component for experience management. This is an optional step for enhanced functionality. Refer to the ReMe GitHub for more details.

```bash
bash external/reme/install_reme.sh
```

--------------------------------

### Launch Environment Service via CLI

Source: https://github.com/modelscope/agentevolver/blob/main/docs/guidelines/env_service.md

Commands to start the environment service manually. Supports configuration of environment type, network binding, and debug modes.

```bash
python -m env_service.env_service --env appworld --portal 127.0.0.1 --port 8080 --debug True
```

--------------------------------

### Install AgentEvolver Dependencies (Bash)

Source: https://github.com/modelscope/agentevolver/blob/main/docs/index.md

Installs the necessary dependencies for AgentEvolver, including setting up the training environment. Requires conda and cuda toolkit to be pre-installed.

```bash
bash install.sh
```

--------------------------------

### Install Game Dependencies

Source: https://github.com/modelscope/agentevolver/blob/main/games/README.md

Installs the Python packages needed to use the game features without training, from the game-specific requirements file `games/requirements_game.txt`.

```bash
pip install -r games/requirements_game.txt
```

--------------------------------

### Control Game UI Display Based on State (CSS)

Source: https://github.com/modelscope/agentevolver/blob/main/games/web/static/avalon/participate.html

These CSS rules dynamically control the display of game UI elements based on the presence of 'auto-start' or 'manual-start' classes on the HTML element. The 'auto-start' class hides the game setup and shows messages and input containers, suitable for games that start automatically. The 'manual-start' class does the opposite, showing the game setup and hiding messages and input containers, for games requiring manual initiation.

```css
html.auto-start #game-setup {
  display: none !important;
}
html.auto-start #messages-container {
  display: flex !important;
}
html.auto-start .input-container {
  display: flex !important;
}
html.manual-start #game-setup {
  display: block !important;
}
html.manual-start #messages-container {
  display: none !important;
}
html.manual-start .input-container {
  display: none !important;
}
```

--------------------------------

### Launch Web Interface Server

Source: https://github.com/modelscope/agentevolver/blob/main/games/README.md

Starts the Python web server for the AgentEvolver project. Once running, the interface can be accessed via a web browser at http://localhost:8000.

```bash
python games/web/server.py
```

--------------------------------

### Configure Task Manager Grader

Source: https://github.com/modelscope/agentevolver/blob/main/docs/guidelines/task_manager.md

Example configuration for the Task Manager grader, demonstrating how to specify the original environment grader and the synthetic LLM-based fallback grader.

```yaml
task_manager:
  grader:
    original_grader: env
    synthetic_grader: llm
```

--------------------------------

### Configure and Start ReMe-Service

Source: https://github.com/modelscope/agentevolver/blob/main/docs/tutorial/quick_start.md

Sets up the ReMe long-term memory service with specific LLM and embedding configurations. This service listens on a local port to provide memory and reflection capabilities to the agent.

```bash
export FLOW_EMBEDDING_API_KEY="<YOUR_API_KEY>"
export FLOW_EMBEDDING_BASE_URL=https://dashscope.aliyuncs.com/compatible-mode/v1
export FLOW_LLM_API_KEY="<YOUR_API_KEY>"
export FLOW_LLM_BASE_URL=https://dashscope.aliyuncs.com/compatible-mode/v1
conda activate reme
cd external/reme
reme config=default backend=http thread_pool_max_workers=256 http.host="127.0.0.1" http.port=8001 http.limit_concurrency=256 llm.default.model_name=qwen-max-2025-01-25 embedding_model.default.model_name=text-embedding-v4 vector_store.default.backend=local op.rerank_memory_op.params.enable_llm_rerank=false
```

--------------------------------

### Configure AgentEvolver Training Parameters

Source: https://context7.com/modelscope/agentevolver/llms.txt

Example YAML configuration for AgentEvolver training, covering environment service endpoints, trainer hyperparameters, data processing limits, and algorithm-specific settings.

```yaml
env_service:
  env_type: "appworld"
  env_url: "http://127.0.0.1:8000"
trainer:
  total_epochs: 10
  save_freq: 100
data:
  train_batch_size: 32
actor_rollout_ref:
  rollout:
    n: 4
    max_steps: 20
algorithm:
  adv_estimator: "grpo"
  gamma: 1.0
```

--------------------------------

### Configure Observer UI Visibility

Source: https://github.com/modelscope/agentevolver/blob/main/games/web/static/avalon/observe.html

CSS rules to manage the visibility of game setup and message containers based on the initialization state class applied to the HTML element.

```css
html.auto-start #game-setup { display: none !important; }
html.auto-start #messages-container { display: flex !important; }
html.manual-start #game-setup { display: block !important; }
html.manual-start #messages-container { display: none !important; }
```

--------------------------------

### Start Task Manager in Standalone Mode (Bash)

Source: https://github.com/modelscope/agentevolver/blob/main/docs/guidelines/task_manager.md

This command initiates the Task Manager in standalone mode for simple task synthesis. It runs the module directly, displaying synthesis progress and printing the path to the generated tasks upon completion.

```bash
python -m agentevolver.module.task_manager
```

--------------------------------

### Implement Custom Diplomacy Workflow Class

Source: https://github.com/modelscope/agentevolver/blob/main/examples/game/diplomacy/README.md

Shows how to create a custom workflow class by inheriting from BaseAgentscopeWorkflow. It requires implementing the __init__ method for agent setup and the execute method for running the game logic and returning a trajectory.

```python
from agentevolver.utils.agentscope_utils import BaseAgentscopeWorkflow
from agentevolver.schema.trajectory import Trajectory

class DiplomacyWorkflow(BaseAgentscopeWorkflow):
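    def __init__(self, task, llm_chat_fn, model_name, **kwargs):
        super().__init__(task, llm_chat_fn, model_name, **kwargs)
        # Set up the agents for the game here.

    def execute(self) -> Trajectory:
        # Run the game logic and return the resulting Trajectory.
        raise NotImplementedError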
```

--------------------------------

### POST /create

Source: https://github.com/modelscope/agentevolver/blob/main/docs/guidelines/env_service.md

Initializes a new environment instance and prepares its initial state for a specific task.

```APIDOC
## POST /create

### Description
Initializes a new environment instance and prepares its initial state.

### Method
POST

### Endpoint
/create

### Request Body
- **env** (string) - Required - The environment to use (e.g., "appworld").
- **instance_id** (string) - Required - A user-defined identifier for tracking the environment instance.
- **task_id** (string) - Required - The task assigned to this instance.
- **params** (dict) - Optional - Additional configuration options (simple, prompt).

### Request Example
{
  "env": "appworld",
  "instance_id": "my-instance-id-001",
  "task_id": "task-001",
  "params": {
    "simple": false,
    "prompt": true
  }
}

### Response
#### Success Response (200)
- **state** (array) - The initial state containing system and user messages.
- **info** (object) - Metadata including instance_id and task_id.

#### Response Example
{
  "state": [
    {"role": "system", "content": "[Structured prompt]"},
    {"role": "user", "content": "[Task instruction]"}
  ],
  "info": {
    "instance_id": "my-instance-id-001",
    "task_id": "task-001"
  }
}
```
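
A hedged sketch of preparing this request with the standard library. The base URL and the JSON `Content-Type` header are assumptions, `build_create_request` is a hypothetical helper, and the request is only built here, not sent.

```python
import json
from urllib.request import Request


def build_create_request(base_url, env, instance_id, task_id, params=None):
    """Build (but do not send) a POST /create request."""
    payload = {"env": env, "instance_id": instance_id, "task_id": task_id}
    if params is not None:
        payload["params"] = params  # optional field
    return Request(
        url=f"{base_url.rstrip('/')}/create",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


create_req = build_create_request(
    "http://127.0.0.1:8080",  # assumed address of a locally running service
    env="appworld",
    instance_id="my-instance-id-001",
    task_id="task-001",
    params={"simple": False, "prompt": True},
)
print(create_req.get_method(), create_req.full_url)
```

Passing `create_req` to `urllib.request.urlopen` would send it, provided a service is listening at that address.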

--------------------------------

### Launch Experiment via CLI

Source: https://github.com/modelscope/agentevolver/blob/main/docs/launcher.md

Demonstrates how to execute the launcher script with a specific configuration file and optional environment services. The command includes the configuration path and flags to enable specific services and process cleanup.

```bash
python launcher.py --conf examples/self-question-nav-attr.yaml --python-killer --with-appworld --with-exp-maker --with-logview
```

--------------------------------

### Initialize and Load Tasks with TaskManager

Source: https://context7.com/modelscope/agentevolver/llms.txt

Shows how to initialize the TaskManager with configuration and load tasks from both an environment service and a JSONL dataset file. It also demonstrates accessing loaded tasks and creating a training dataset.

```python
from agentevolver.module.task_manager.task_manager import TaskManager, FullDataset
from agentevolver.client.env_client import EnvClient

# Initialize task manager with configuration
task_manager = TaskManager(
    config=task_config,
    mixture_strategy="unified",  # unified, stratified, or custom
    reward_config=reward_config
)

# Load tasks from environment service
env_client = EnvClient("http://127.0.0.1:8000")
task_manager.load_tasks_from_environment(
    env_client,
    env_type="appworld",
    split="train"
)

# Load tasks from JSONL dataset file
task_manager.load_tasks_from_dataset(
    dataset=rlhf_dataset,
    env_type="appworld"
)

# Access loaded tasks
seed_tasks = task_manager.seed_tasks
print(f"Loaded {len(seed_tasks)} seed tasks")

# Create training dataset with automatic task mixing
train_dataset = FullDataset(
    task_manager=task_manager,
    mixture_strategy=task_manager._mixture_strategy,
    reward_config=task_manager._reward_config,
    cache_path="./cache/train_tasks.jsonl",
    tokenizer=tokenizer,
    config=data_config,
    processor=None
)

# Update dataset for next epoch
train_dataset.update()
```

--------------------------------

### Manage Rollout Context with Experience - Python

Source: https://github.com/modelscope/agentevolver/blob/main/docs/guidelines/exp_manager.md

Injects relevant past experiences into prompts for context-aware rollouts. It retrieves top-K experiences using EMClient, formats them, and prepends them to the current rollout message, enhancing context without altering the core task.

```python
history_experience = self.em_client.call_context_generator(
    trajectory=trajectory,
    retrieve_top_k=reme_config.retrieve_top_k,
    workspace_id=reme_config.workspace_id
)
formatted_experience = self.experience_template.format(history_experience)
new_content = formatted_experience + trajectory.steps[-1]["content"]
trajectory.steps[-1]["content"] = new_content
```
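
The prepend step can be tried in isolation with plain dictionaries. In this sketch the template text, the list of experience strings, and the `inject_experience` helper are all assumptions for illustration; the real `experience_template` and the return type of `call_context_generator` may differ.

```python
# Hypothetical template; the project's actual template text is not shown here.
EXPERIENCE_TEMPLATE = "Relevant past experience:\n{}\n\n"


def inject_experience(steps, experiences, template=EXPERIENCE_TEMPLATE):
    """Prepend formatted experiences to the content of the last step."""
    if not experiences:
        return steps  # nothing retrieved, leave the prompt untouched
    formatted = template.format("\n".join(f"- {e}" for e in experiences))
    steps[-1]["content"] = formatted + steps[-1]["content"]
    return steps


demo_steps = [{"role": "user", "content": "Book a meeting for Friday."}]
inject_experience(demo_steps, ["Check the calendar before creating events."])
print(demo_steps[-1]["content"])
```

The original task text stays at the end of the message, so the task itself is unchanged.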

--------------------------------

### GET /healthz

Source: https://github.com/modelscope/agentevolver/blob/main/docs/guidelines/env_service.md

Performs a basic health check on the environment service.

```APIDOC
## GET /healthz

### Description
Basic service status check.

### Method
GET

### Endpoint
/healthz
```
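
One way to use such an endpoint is to poll it until the service is up. The sketch below assumes that an HTTP 200 answer means "healthy", since the response body is not documented here; `probe_healthz` and `wait_until_healthy` are hypothetical helpers.

```python
import time
from urllib.error import URLError
from urllib.request import urlopen


def probe_healthz(base_url: str, timeout: float = 2.0) -> bool:
    """Return True when GET /healthz answers with HTTP 200."""
    try:
        with urlopen(f"{base_url.rstrip('/')}/healthz", timeout=timeout) as resp:
            return resp.status == 200
    except (URLError, OSError):
        return False


def wait_until_healthy(probe, retries: int = 5, delay: float = 0.1) -> bool:
    """Call probe() up to `retries` times, sleeping `delay` seconds between tries."""
    for _ in range(retries):
        if probe():
            return True
        time.sleep(delay)
    return False


# Demonstration with a fake probe that succeeds on the third call.
probe_calls = {"count": 0}


def fake_probe() -> bool:
    probe_calls["count"] += 1
    return probe_calls["count"] >= 3


print(wait_until_healthy(fake_probe, retries=5, delay=0))
```

Against a real service, the probe could be `lambda: probe_healthz("http://127.0.0.1:8080")`, with the address adjusted to the deployment.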

--------------------------------

### Launch AgentEvolver with Configuration

Source: https://github.com/modelscope/agentevolver/blob/main/docs/tutorial/configuration.md

Commands to execute the AgentEvolver training process using either a custom YAML file or direct command-line overrides for specific parameters.

```bash
python launcher.py --conf examples/my_config.yaml
```

```bash
python3 -m agentevolver.main_ppo \
    --config-path="$CONFIG_PATH" \
    --config-name='script_config' \
    trainer.experiment_name=my_experiment \
    data.train_batch_size=64
```
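
The `key.subkey=value` arguments in the second command resemble Hydra-style overrides. The effect of such overrides on a nested configuration can be sketched without any framework; `apply_overrides` below is a hypothetical helper and only converts plain integers, unlike a real override parser.

```python
import copy


def apply_overrides(config: dict, overrides: list[str]) -> dict:
    """Return a copy of config with dotted key=value overrides applied."""
    result = copy.deepcopy(config)
    for item in overrides:
        dotted_key, _, raw_value = item.partition("=")
        *parents, leaf = dotted_key.split(".")
        node = result
        for part in parents:
            node = node.setdefault(part, {})  # create missing sections
        node[leaf] = int(raw_value) if raw_value.isdigit() else raw_value
    return result


base_config = {"trainer": {"total_epochs": 10}, "data": {"train_batch_size": 32}}
merged_config = apply_overrides(
    base_config,
    ["trainer.experiment_name=my_experiment", "data.train_batch_size=64"],
)
print(merged_config)
```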

--------------------------------

### Launch AgentEvolver Training (Python)

Source: https://github.com/modelscope/agentevolver/blob/main/docs/index.md

Launches the AgentEvolver training process using a configuration file. Supports basic training and full self-evolving pipelines with options for Appworld integration and ReMe.

```bash
# minimal example without ReMe (using built-in datasets within environments).
python launcher.py --conf examples/train-basic.yaml --with-appworld

# full example with ReMe (questioning + navigating + attributing)
python launcher.py --conf examples/self-question-nav-attr.yaml --with-appworld
```

--------------------------------

### Initialize and Run Ray PPO Trainer

Source: https://context7.com/modelscope/agentevolver/llms.txt

Configures and executes the AgentEvolverRayPPOTrainer. It sets up resource pools, maps roles to workers, and initiates the distributed training loop.

```python
from agentevolver.module.trainer.ae_ray_trainer import AgentEvolverRayPPOTrainer
from verl.trainer.ppo.ray_trainer import ResourcePoolManager, Role

resource_pool_manager = ResourcePoolManager(resource_pool_spec={"actor_rollout": {"gpu": 4, "cpu": 16}, "critic": {"gpu": 2, "cpu": 8}})
role_worker_mapping = {Role.ActorRollout: ActorRolloutWorker, Role.Critic: CriticWorker}

trainer = AgentEvolverRayPPOTrainer(
    config=config, tokenizer=tokenizer, role_worker_mapping=role_worker_mapping,
    resource_pool_manager=resource_pool_manager, train_task_manager=train_task_manager,
    val_task_manager=val_task_manager, shuffle_trainset=True
)
trainer.init_workers()
trainer.fit()
```

--------------------------------

### Launch Workflow via CLI

Source: https://github.com/modelscope/agentevolver/blob/main/examples/game/avalon/README.md

Executes the configured workflow using the launcher script. Requires specifying the directory path to the configuration files.

```bash
python launcher.py --config-path examples/game/avalon --config-name config
```

--------------------------------

### POST /step

Source: https://github.com/modelscope/agentevolver/blob/main/docs/guidelines/env_service.md

Sends an action to the environment instance and retrieves the resulting state, reward, and completion status.

```APIDOC
## POST /step

### Description
Sends an action to the environment instance.

### Method
POST

### Endpoint
/step

### Request Body
- **instance_id** (string) - Required - The ID of the active environment instance.
- **action** (string/object) - Required - The action to perform in the environment.

### Response
#### Success Response (200)
- **state** (object) - The next state of the environment.
- **reward** (float) - The reward received for the action.
- **done** (bool) - Indicates if the task is complete.
```

--------------------------------

### Configure Multi-Node Ray Cluster

Source: https://github.com/modelscope/agentevolver/blob/main/docs/tutorial/quick_start.md

Manual setup for a distributed Ray cluster across multiple nodes to enable multi-node training. Requires all nodes to share the same Conda environment.

```bash
conda activate agentevolver
# On the head node:
ray start --head
# On each worker node, pointing at the head node's address:
ray start --address='<head addr>'
```

--------------------------------

### Configure Game Scene Layout

Source: https://github.com/modelscope/agentevolver/blob/main/games/web/static/loop/index.html

Sets up the main game area container with a grid background overlay and responsive table positioning.

```css
#scene {
  position: relative;
  height: 580px;
  overflow: hidden;
  border-radius: 12px;
  border: 2px solid var(--border);
  background: linear-gradient(180deg, #143458 0%, #0c192c 60%);
}
#scene::before {
  content: "";
  position: absolute;
  inset: 0;
  background-image: linear-gradient(90deg, var(--grid) 1px, transparent 1px), linear-gradient(var(--grid) 1px, transparent 1px);
  background-size: 32px 32px;
  opacity: 0.35;
}
```

--------------------------------

### POST /env/create

Source: https://context7.com/modelscope/agentevolver/llms.txt

Creates a new environment instance for a specific task ID.

```APIDOC
## POST /env/create

### Description
Initializes a new environment instance based on the provided task identifier.

### Method
POST

### Endpoint
/env/create

### Parameters
#### Request Body
- **task_id** (string) - Required - The unique identifier of the task.
- **params** (object) - Optional - Configuration parameters for the instance.

### Request Example
{
  "task_id": "task_1",
  "params": {}
}

### Response
#### Success Response (200)
- **instance_id** (string) - The unique ID of the created environment instance.

#### Response Example
{
  "instance_id": "instance_task_1_0"
}
```

--------------------------------

### Initialize and Execute Agent Flow

Source: https://context7.com/modelscope/agentevolver/llms.txt

Shows the initialization of the AgentFlow class with necessary components like a reward calculator and LLM function, followed by executing the agent-environment interaction loop.

```python
from agentevolver.module.agent_flow.agent_flow import AgentFlow
from agentevolver.module.agent_flow.reward_calculator import RewardCalculator
from agentevolver.module.context_manager.cmt_linear import Linear_CMT

agent_flow = AgentFlow(
    reward_calculator=reward_calculator,
    llm_chat_fn=llm_chat_fn,
    tokenizer=tokenizer,
    config=config
)

context_manager = Linear_CMT(
    tokenizer=tokenizer,
    max_context_length=config.data.max_seq_length,
    metadata={}
)

init_messages = [{"role": "user", "content": "Your task is to..."}]

context_manager = agent_flow.execute(
    context_manager=context_manager,
    init_messages=init_messages,
    env=env_client,
    instance_id="env_instance_001",
    tmux={'step': [0], 'token': [0]},
    stop=[False],
    thread_index=0,
    task_id="task_001",
    traj_exp_config=traj_exp_config,
    data_id="0",
    rollout_id="0",
    query="Find user profile"
)

reward = context_manager.reward
print(f"Outcome: {reward.outcome}, Success: {reward.success_rate}")

trajectory = context_manager.to_trajectory()
```

--------------------------------

### Environment Profile JSON Structure

Source: https://github.com/modelscope/agentevolver/blob/main/docs/guidelines/task_manager.md

Defines the structure for an Environment Profile JSON file. This file specifies entities, their attributes, and operations within an environment, along with task preferences for guiding data collection.

```json
{
  "name": "Alice",
  "background": "A general user working with a file system.",
  "entities": [
    {
      "name": "file",
      "description": "A file in a file system.",
      "attrs": {
        "name": "The name of the file.",
        "size": "The size of the file in bytes.",
        "type": "The type of the file, e.g. text, image, video, etc.",
        "parent": "The parent directory of the file."
      },
      "opts": [
        { "name": "create", "description": "Create a new file." },
        { "name": "delete", "description": "Delete a file." },
        { "name": "read", "description": "Read a file." },
        { "name": "write", "description": "Write to a file." }
      ]
    },
    {
      "name": "directory",
      "description": "A directory in a file system.",
      "attrs": {
        "name": "The name of the directory.",
        "parent": "The parent directory of the directory."
      },
      "opts": [
        { "name": "create", "description": "Create a new directory." },
        { "name": "delete", "description": "Delete a directory." },
        { "name": "list", "description": "List the contents of a directory." }
      ]
    }
  ],
  "task_preference": {
    "num_entities": 2,
    "num_opts": 3,
    "relation_difficulty": 3
  }
}
```
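
To show how a profile like this might be consumed, the sketch below lists every (entity, operation) pair from a shortened copy of the profile. The `list_operations` helper is hypothetical, and how the Task Manager actually uses these pairs is not covered here.

```python
# Shortened copy of the profile above: descriptions and attrs are omitted.
profile = {
    "name": "Alice",
    "entities": [
        {"name": "file", "opts": [{"name": n} for n in ("create", "delete", "read", "write")]},
        {"name": "directory", "opts": [{"name": n} for n in ("create", "delete", "list")]},
    ],
    "task_preference": {"num_entities": 2, "num_opts": 3, "relation_difficulty": 3},
}


def list_operations(profile: dict) -> list[tuple[str, str]]:
    """Return every (entity name, operation name) pair in the profile."""
    return [
        (entity["name"], opt["name"])
        for entity in profile["entities"]
        for opt in entity["opts"]
    ]


pairs = list_operations(profile)
print(len(pairs))
print(pairs)
```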

--------------------------------

### Launch Service with pty_launch (Python)

Source: https://github.com/modelscope/agentevolver/blob/main/docs/launcher.md

This snippet conditionally calls the `pty_launch` helper inside `main()` of `launcher.py` when the corresponding CLI flag is provided. The optional `success_std_string` argument is the 'ready' string used to monitor service startup.

```python
if args.with_myenv:
    pty_launch("myenv", success_std_string="Uvicorn running on")
```
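
The idea of watching a service's output for a 'ready' string can be sketched with `subprocess`. This is not the `pty_launch` implementation: it uses ordinary pipes rather than a pseudo-terminal, and `launch_and_wait` and the fake service command are hypothetical.

```python
import subprocess
import sys


def launch_and_wait(argv, success_std_string):
    """Start a process and report whether the ready string appears in its output."""
    proc = subprocess.Popen(
        argv, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    ready = False
    for line in proc.stdout:
        if success_std_string in line:
            ready = True
            break  # a real launcher would leave the service running here
    proc.stdout.close()
    return proc, ready


# Fake "service" that prints a Uvicorn-like startup line and exits.
fake_service = [sys.executable, "-c", "print('INFO: Uvicorn running on http://0.0.0.0:9009')"]
service_proc, service_ready = launch_and_wait(fake_service, "Uvicorn running on")
service_proc.wait()
print(service_ready)
```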

--------------------------------

### Python: Compute Suffix Sum for Step Advantages

Source: https://github.com/modelscope/agentevolver/blob/main/docs/guidelines/adv_processor.md

Calculates the advantage for each step by computing the cumulative sum of future rewards (suffix sum). This function helps determine the expected future reward from any given step, crucial for guiding policy learning.

```python
import torch

def suffix_sum_on_steps(rewards: torch.Tensor) -> torch.Tensor:
    # Calculates the suffix sum of rewards for each step.
    # ... implementation details ...
    pass
```
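
Since the body above is elided, here is a minimal pure-Python sketch of a suffix sum over one list of per-step rewards. It illustrates the technique only; the project's tensor-based `suffix_sum_on_steps` may handle batching and masking differently, and `suffix_sum` is a hypothetical name.

```python
def suffix_sum(rewards: list[float]) -> list[float]:
    """out[i] is the sum of rewards[i], rewards[i + 1], ..., rewards[-1]."""
    out = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step accumulates everything that follows it.
    for i in range(len(rewards) - 1, -1, -1):
        running += rewards[i]
        out[i] = running
    return out


print(suffix_sum([0.0, 0.5, 0.0, 1.0]))  # [1.5, 1.5, 1.0, 1.0]
```

Note that each entry includes the reward of its own step as well as all later ones.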

--------------------------------

### Initialize Game Configuration and State (JavaScript)

Source: https://github.com/modelscope/agentevolver/blob/main/games/web/static/avalon/participate.html

This JavaScript code initializes early game settings by checking session storage for game configuration, selected portraits, and game running status. It updates the global `__EARLY_INIT__` object and applies CSS classes to the document element to control the game's display mode (auto-start or manual-start). It handles potential errors during parsing by defaulting to manual-start.

```javascript
window.__EARLY_INIT__ = { hasGameConfig: false, isGameRunning: false, config: null, portraits: null };
(function() {
  try {
    var gameConfigStr = sessionStorage.getItem('gameConfig');
    var portraitsStr = sessionStorage.getItem('selectedPortraits');
    var gameRunning = sessionStorage.getItem('gameRunning') === 'true';
    if (gameConfigStr) {
      window.__EARLY_INIT__.hasGameConfig = true;
      window.__EARLY_INIT__.config = JSON.parse(gameConfigStr);
      document.documentElement.classList.add('auto-start');
    } else if (gameRunning) {
      window.__EARLY_INIT__.isGameRunning = true;
      document.documentElement.classList.add('auto-start');
    } else {
      document.documentElement.classList.add('manual-start');
    }
    if (portraitsStr) {
      window.__EARLY_INIT__.portraits = JSON.parse(portraitsStr);
    }
  } catch (e) {
    document.documentElement.classList.add('manual-start');
  }
})();
```

--------------------------------

### JavaScript Game Redirection and Mode Setting

Source: https://github.com/modelscope/agentevolver/blob/main/games/web/static/loop/index.html

This code defines a function to compute the redirect URL for a given game and overrides the window.setMode function to enforce an 'observe' mode. It also includes event listeners to manage the game state, particularly for Avalon, preventing new games from starting if one is already running and redirecting users to the existing session.

```javascript
window.computeRedirectUrl = function(game) {
  if (!game) return '/';
  return `/static/loop/${game}/observe.html`;
};

const originalSetMode = window.setMode;
window.setMode = function() {
  if (originalSetMode) originalSetMode('observe');
};

document.addEventListener('DOMContentLoaded', function() {
  if (window.setMode) window.setMode('observe');
  const modeToggle = document.querySelector('.mode-toggle');
  if (modeToggle) modeToggle.style.display = 'none';
  document.querySelectorAll('.avalon-participate-only, .diplomacy-participate-only').forEach((el) => el.style.display = 'none');

  // Prevent starting a new Avalon game while one is running; redirect to existing session.
  const isGameRunning = sessionStorage.getItem('gameRunning') === 'true';
  let runningGame = null;
  try {
    const cfgStr = sessionStorage.getItem('gameConfig');
    if (cfgStr) {
      const cfg = JSON.parse(cfgStr);
      runningGame = cfg.game || null;
    }
  } catch (e) {
    runningGame = null;
  }

  const avalonCard = document.querySelector('.game-card[data-game="avalon"]');
  const startBtn = document.getElementById('start-btn');
  function redirectToAvalon() {
    window.location.href = '/static/loop/avalon/observe.html';
  }

  if (isGameRunning && runningGame === 'avalon') {
    if (startBtn) {
      startBtn.disabled = true;
      startBtn.textContent = 'Avalon running';
      startBtn.addEventListener('click', function(e) { e.preventDefault(); redirectToAvalon(); });
    }
    if (avalonCard) {
      avalonCard.classList.add('active');
      avalonCard.addEventListener('click', function(e) { e.preventDefault(); redirectToAvalon(); });
    }
  }
});
```

--------------------------------

### Launch Simulation Environment Service

Source: https://github.com/modelscope/agentevolver/blob/main/docs/tutorial/quick_start.md

Activates the `appworld` Conda environment and launches the simulation service required for agent operation. This service must run in the background.

```bash
conda activate appworld
bash env_service/launch_script/appworld.sh
```

--------------------------------

### POST /env/step

Source: https://context7.com/modelscope/agentevolver/llms.txt

Executes an action within a specific environment instance and returns the observation.

```APIDOC
## POST /env/step

### Description
Performs an action in the environment and returns the resulting state, reward, and termination status.

### Method
POST

### Endpoint
/env/step

### Parameters
#### Request Body
- **instance_id** (string) - Required - The ID of the environment instance.
- **action** (object) - Required - The action to be performed.

### Request Example
{
  "instance_id": "instance_task_1_0",
  "action": {"command": "click_button"}
}

### Response
#### Success Response (200)
- **state** (object) - The new state of the environment.
- **reward** (float) - The reward received.
- **is_terminated** (boolean) - Whether the task is finished.

#### Response Example
{
  "state": {"content": "Action executed", "role": "user"},
  "reward": 0.5,
  "is_terminated": false
}
```

--------------------------------

### Manual Execution Scripts (Bash)

Source: https://github.com/modelscope/agentevolver/blob/main/docs/index.md

Provides standalone scripts for manual execution of AgentEvolver pipelines. Includes options for basic RL training and the complete self-evolving pipeline with customizable configurations.

```bash
# Execute basic RL pipeline with GRPO using built-in datasets within environments.
bash examples/run_basic.sh

# Run the complete self-evolving AgentEvolver pipeline with fully customizable configurations.
bash examples/run_overall.sh
```

--------------------------------

### Execute Agent Training

Source: https://github.com/modelscope/agentevolver/blob/main/docs/tutorial/quick_start.md

Runs the training scripts for either basic GRPO or the full AgentEvolver pipeline. Requires the environment and memory services to be active.

```bash
conda activate agentevolver
# Option 1: basic GRPO training
bash examples/run_basic.sh
# Option 2: full AgentEvolver pipeline
bash examples/run_overall.sh
```

--------------------------------

### Launch Diplomacy Training

Source: https://github.com/modelscope/agentevolver/blob/main/examples/game/diplomacy/README.md

Commands to execute the training process using the launcher script or provided shell scripts for production and debug environments.

```bash
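# Launch through the launcher script
python launcher.py --config-path examples/game/diplomacy --config-name config
# Or use the provided training script
./examples/game/diplomacy/run_train.sh
# Debug variant of the training script
./examples/game/diplomacy/run_train_debug.sh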
```

--------------------------------

### POST /step

Source: https://github.com/modelscope/agentevolver/blob/main/docs/guidelines/env_service.md

Advances the environment by one step using a user-generated action and returns the resulting state and reward.

```APIDOC
## POST /step

### Description
Advances the environment one step using a user-generated action. This endpoint is used to execute agent commands and receive feedback.

### Method
POST

### Endpoint
/step

### Parameters
#### Request Body
- **action** (object) - Required - The action object containing the agent's intent.
  - **role** (str) - Required - Typically "assistant".
  - **content** (str) - Required - The agent's natural language or code action.
  - **tool_calls** (list) - Optional - List of structured tool calls.

### Request Example
{
  "action": {
    "role": "assistant",
    "content": "```python\nopen_app(\"Calendar\")\n```",
    "tool_calls": []
  }
}

### Response
#### Success Response (200)
- **state** (list) - Environment’s response including tool output.
- **reward** (float) - Task performance score (1.0 for success).
- **is_terminated** (bool) - Indicates if the task is completed.
- **info** (dict) - Additional metadata.

#### Response Example
{
  "state": [
    {
      "role": "assistant",
      "content": "Output:\n```\nApp opened successfully\n```"
    }
  ],
  "reward": 1.0,
  "is_terminated": true,
  "info": {}
}
```
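The request and response shapes above can be exercised from a small client. The following is a minimal sketch using only the Python standard library. The base URL `http://localhost:8080` and the helper names `build_step_payload`, `parse_step_response`, and `step` are assumptions made for illustration and are not part of AgentEvolver. A real deployment may also require request fields that this section does not document.

```python
import json
import urllib.request

# Assumption: the environment service listens here; adjust to your deployment.
BASE_URL = "http://localhost:8080"


def build_step_payload(content, tool_calls=None):
    """Build a POST /step request body in the shape documented above."""
    return {
        "action": {
            "role": "assistant",
            "content": content,
            "tool_calls": tool_calls or [],
        }
    }


def parse_step_response(body):
    """Extract the documented fields from a decoded /step response."""
    return {
        "state": body["state"],
        "reward": float(body["reward"]),
        "is_terminated": bool(body["is_terminated"]),
        "info": body.get("info", {}),
    }


def step(content, base_url=BASE_URL, timeout=30.0):
    """Send one action to the service and return the parsed result."""
    data = json.dumps(build_step_payload(content)).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/step",
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_step_response(json.loads(response.read().decode("utf-8")))
```

With a service running, `step('open_app("Calendar")')` would return a dictionary whose `is_terminated` flag tells the caller whether to stop sending actions.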

--------------------------------

### Trajectory Creation and Usage

Source: https://context7.com/modelscope/agentevolver/llms.txt

Demonstrates how to create a Trajectory object, access its properties, and convert it into a training sample format.

A `Trajectory` represents the sequence of steps in an agent's interaction with an environment. The example below creates one, reads its properties, and converts it into a format suitable for training.

```python
from agentevolver.schema.trajectory import Trajectory
from agentevolver.schema.reward import Reward

trajectory = Trajectory(
    task_id="task_001",
    data_id="0",
    rollout_id="0",
    query="Find user profile information",
    steps=[
        {
            "role": "assistant",
            "content": "I'll search for the user profile using the API.\n<tool_call>search_user(id='123')</tool_call>"
        },
        {
            "role": "user",
            "content": "[OBSERVATION]\nUser found: John Doe, email: john@example.com"
        },
        {
            "role": "assistant",
            "content": "I found the user profile. The user is John Doe with email john@example.com."
        }
    ],
    is_terminated=True,
    reward=Reward(
        outcome=1.0,
        success_rate=1.0,
        madness=0.0,
        description="Task completed successfully"
    ),
    metadata={
        "task_train_exp_mode": "discard",
        "add_exp": True,
        "experience_list": []
    }
)

# Access trajectory properties
print(f"Task: {trajectory.task_id}")
print(f"Success: {trajectory.reward.success_rate}")
print(f"Steps: {len(trajectory.steps)}")

# Convert trajectory to training sample format
samples = trajectory.group_tokenize()  # Returns List[Sample]
```

--------------------------------

### Create and Access Trajectory Data

Source: https://context7.com/modelscope/agentevolver/llms.txt

Demonstrates how to create a Trajectory object, populate it with steps and rewards, and access its properties. It also shows how to convert the trajectory into a training sample format.

```python
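from agentevolver.schema.trajectory import Trajectory
from agentevolver.schema.reward import Reward

trajectory = Trajectory(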
    task_id="task_001",
    data_id="0",
    rollout_id="0",
    query="Find user profile information",
    steps=[
        {
            "role": "assistant",
            "content": "I'll search for the user profile using the API.\n<tool_call>search_user(id='123')</tool_call>"
        },
        {
            "role": "user",
            "content": "[OBSERVATION]\nUser found: John Doe, email: john@example.com"
        },
        {
            "role": "assistant",
            "content": "I found the user profile. The user is John Doe with email john@example.com."
        }
    ],
    is_terminated=True,
    reward=Reward(
        outcome=1.0,
        success_rate=1.0,
        madness=0.0,
        description="Task completed successfully"
    ),
    metadata={
        "task_train_exp_mode": "discard",
        "add_exp": True,
        "experience_list": []
    }
)

# Access trajectory properties
print(f"Task: {trajectory.task_id}")
print(f"Success: {trajectory.reward.success_rate}")
print(f"Steps: {len(trajectory.steps)}")

# Convert trajectory to training sample format
samples = trajectory.group_tokenize()  # Returns List[Sample]
```

--------------------------------

### Manage Experience with ExperienceManager

Source: https://context7.com/modelscope/agentevolver/llms.txt

Uses `ExperienceManager`, which implements the Self-Navigating mechanism, to assign each task a training mode (keep or discard experience) and to decide, for each rollout, whether experience is added.

```python
from agentevolver.module.exp_manager.exp_manager import (
    ExperienceManager, ExperienceWorker, TaskExpConfig, TrajExpConfig
)
from agentevolver.schema.trajectory import Trajectory

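# Initialize experience manager
# (`config` and `tasks` are assumed to be defined elsewhere, for example the
# loaded experiment configuration and the list of tasks for the current batch)
exp_manager = ExperienceManager(config=config)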

# Allocate training modes for tasks (keep vs discard experience)
task_exp_configs = exp_manager.allocate_train_mode(tasks)
# Returns List[TaskExpConfig] with train_mode: "keep" or "discard"

# Allocate experience addition settings
task_exp_configs = exp_manager.allocate_add_exp(
    task_exp_configs,
    mode="sample"  # or "validate"
)
# Each config now has add_exp: List[bool] for each rollout
```
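To make the shape of the result concrete, here is a standalone sketch of what the two allocation passes produce: one `train_mode` per task and one `add_exp` flag per rollout. It does not reproduce the library's logic. `SketchTaskExpConfig`, both `*_sketch` functions, and the parameters `keep_ratio`, `exp_ratio`, and `n_rollouts` are hypothetical names introduced for illustration, and the random assignment is an assumption about how a "sample" style allocation could work.

```python
import random
from dataclasses import dataclass, field
from typing import List


@dataclass
class SketchTaskExpConfig:
    """Hypothetical stand-in for TaskExpConfig; not the library class."""
    train_mode: str = "discard"  # "keep" or "discard"
    add_exp: List[bool] = field(default_factory=list)  # one flag per rollout


def allocate_train_mode_sketch(n_tasks, keep_ratio, rng):
    """Mark about keep_ratio of the tasks as "keep" and the rest as "discard"."""
    n_keep = round(n_tasks * keep_ratio)
    modes = ["keep"] * n_keep + ["discard"] * (n_tasks - n_keep)
    rng.shuffle(modes)
    return [SketchTaskExpConfig(train_mode=mode) for mode in modes]


def allocate_add_exp_sketch(configs, n_rollouts, exp_ratio, rng):
    """Give each task one add_exp flag per rollout, True for about exp_ratio of them."""
    n_with_exp = round(n_rollouts * exp_ratio)
    for cfg in configs:
        flags = [True] * n_with_exp + [False] * (n_rollouts - n_with_exp)
        rng.shuffle(flags)
        cfg.add_exp = flags
    return configs


rng = random.Random(0)  # seeded so repeated runs give the same allocation
configs = allocate_train_mode_sketch(n_tasks=4, keep_ratio=0.5, rng=rng)
configs = allocate_add_exp_sketch(configs, n_rollouts=4, exp_ratio=0.5, rng=rng)
for cfg in configs:
    print(cfg.train_mode, cfg.add_exp)
```

Each printed line pairs a task's training mode with its per-rollout flags, which is the structure the comments in the library example describe.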

--------------------------------

### Experience Worker for Rollout Context Management

Source: https://context7.com/modelscope/agentevolver/llms.txt

Illustrates the use of ExperienceWorker to manage rollout context by adding historical experience. It shows how to configure experience retrieval and augment messages with it.

```python
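from agentevolver.module.exp_manager.exp_manager import (
    ExperienceWorker, TrajExpConfig
)

# `config` is assumed to be defined elsewhere
exp_worker = ExperienceWorker(config=config)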

traj_exp_config = TrajExpConfig(
    add_exp=True,
    train_mode="discard",
    task_id="task_001",
    query="Find user data"
)

augmented_messages, updated_config = exp_worker.manage_rollout_context(
    init_messages=[{"role": "user", "content": "Task query..."}],
    traj_exp_config=traj_exp_config
)
```
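The idea of augmenting rollout messages with past experience can be illustrated without the library. The sketch below is an assumption about one plausible behavior, not the actual `ExperienceWorker` implementation: it prepends retrieved experience to the first user message when `add_exp` is set. `SketchTrajExpConfig`, `retrieve_experience_sketch`, and `manage_rollout_context_sketch` are hypothetical names, and the retrieval function is a placeholder for a real experience store.

```python
import copy
from dataclasses import dataclass, field, replace
from typing import List


@dataclass
class SketchTrajExpConfig:
    """Hypothetical stand-in for TrajExpConfig; not the library class."""
    add_exp: bool
    train_mode: str
    task_id: str
    query: str
    experience_list: List[str] = field(default_factory=list)


def retrieve_experience_sketch(query):
    """Placeholder retrieval; a real system would query an experience store."""
    return [f"For tasks like '{query}', check the available APIs first."]


def manage_rollout_context_sketch(init_messages, cfg):
    """Return (messages, config), with experience prepended to the first user message."""
    messages = copy.deepcopy(init_messages)  # leave the caller's list untouched
    if not cfg.add_exp:
        return messages, cfg
    experiences = retrieve_experience_sketch(cfg.query)
    for message in messages:
        if message["role"] == "user":
            hint = "Relevant past experience:\n" + "\n".join(
                f"- {item}" for item in experiences
            )
            message["content"] = hint + "\n\n" + message["content"]
            # Record what was injected on a copy of the config.
            return messages, replace(cfg, experience_list=experiences)
    return messages, cfg  # no user message to augment


cfg = SketchTrajExpConfig(
    add_exp=True, train_mode="discard", task_id="task_001", query="Find user data"
)
messages, updated = manage_rollout_context_sketch(
    [{"role": "user", "content": "Task query..."}], cfg
)
print(messages[0]["content"])
```

Returning an updated copy of the configuration keeps a record of which experience was injected, which is one way the training side could later decide whether to keep or discard it.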