### Run Local Script in Dev Mode to Start Client

Source: https://github.com/microsoft/windowsagentarena/blob/main/docs/Development-Tips.md

After preparing the image, use this command to run the entire setup, including starting the client process, in development mode.

```bash
./run-local.sh --mode dev --start-client true
```

--------------------------------

### Start VM and Client Processes Individually

Source: https://github.com/microsoft/windowsagentarena/blob/main/docs/Development-Tips.md

Once the container is running in interactive mode, you can start the VM and client processes separately.

```bash
./start_vm.sh
# ...
./start_client.sh
```

--------------------------------

### Install python-pptx Library

Source: https://github.com/microsoft/windowsagentarena/blob/main/src/win-arena-container/client/desktop_env/evaluators/README.md

Install the python-pptx library using pip for LibreOffice Impress.

```shell
pip install python-pptx
```

--------------------------------

### Configuration File Example

Source: https://github.com/microsoft/windowsagentarena/blob/main/README.md

Create a `config.json` file at the project root with your API keys. Set `OPENAI_API_KEY` if you are using the OpenAI endpoint, or `AZURE_API_KEY` and `AZURE_ENDPOINT` if you are using an Azure endpoint.

```json
{
    "OPENAI_API_KEY": "<OPENAI_API_KEY>",
    "AZURE_API_KEY": "<AZURE_API_KEY>",
    "AZURE_ENDPOINT": "https://yourendpoint.openai.azure.com/"
}
```
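As a quick sanity check, the sketch below loads `config.json` and reports which endpoint it is configured for. The function name `load_api_config` is hypothetical (it is not part of the repository), and it assumes the file is strict JSON, so any `//` comments or trailing commas must be removed first.

```python
import json
from pathlib import Path


def load_api_config(path="config.json"):
    """Load config.json and report which endpoint(s) it is set up for."""
    config = json.loads(Path(path).read_text(encoding="utf-8"))
    # An OpenAI setup needs only the key; an Azure setup needs key and endpoint.
    has_openai = bool(config.get("OPENAI_API_KEY"))
    has_azure = bool(config.get("AZURE_API_KEY")) and bool(config.get("AZURE_ENDPOINT"))
    if not (has_openai or has_azure):
        raise ValueError("config.json needs an OpenAI key, or an Azure key plus endpoint")
    return {"openai": has_openai, "azure": has_azure}
```

Calling `load_api_config()` from the project root returns a small dictionary such as `{"openai": True, "azure": False}`, or raises `ValueError` when neither endpoint is fully configured.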

--------------------------------

### Install Dependencies and Playwright

Source: https://github.com/microsoft/windowsagentarena/blob/main/src/win-arena-container/client/mm_agents/navi/screenparsing_oss/webparse/README.md

Install Python requirements and Playwright browser binaries. Run this command in your terminal.

```bash
pip install -r requirements.txt && playwright install
```

--------------------------------

### Testing a Custom Agent

Source: https://github.com/microsoft/windowsagentarena/blob/main/docs/Develop-Agent.md

Example Python code to instantiate and test a custom agent. It shows how to get observations, provide instructions, and execute predicted actions.

```python
from mm_agents.my_agent.agent import MyAgent

agent = MyAgent()
obs = get_current_observation()  # Function to retrieve the current observation
instruction = "Your test instruction here"
actions = agent.predict(instruction, obs)
execute_actions(actions)  # Function to execute the predicted actions
```

--------------------------------

### Install python-docx and odfpy Libraries

Source: https://github.com/microsoft/windowsagentarena/blob/main/src/win-arena-container/client/desktop_env/evaluators/README.md

Install the python-docx and odfpy libraries using pip for LibreOffice Writer.

```shell
pip install python-docx
pip install odfpy
```

--------------------------------

### Clone Repository and Install Dependencies

Source: https://github.com/microsoft/windowsagentarena/blob/main/README.md

These commands clone the WindowsAgentArena repository and install the required Python dependencies. Activate the 'winarena' Conda environment before running the pip install command.

```bash
git clone https://github.com/microsoft/WindowsAgentArena.git
cd WindowsAgentArena
# Install the required dependencies in your python environment
# conda activate winarena
pip install -r requirements.txt
```

--------------------------------

### Install Playwright for Python

Source: https://github.com/microsoft/windowsagentarena/blob/main/src/win-arena-container/client/desktop_env/evaluators/README.md

Install the Playwright library for Python using pip. After installation, run 'playwright install' to download necessary browser binaries.

```bash
pip install playwright
```

```bash
playwright install
```

--------------------------------

### Task JSON Configuration Example

Source: https://github.com/microsoft/windowsagentarena/blob/main/docs/Develop-Tasks.md

This JSON defines a task for WAA: its ID, the natural language instruction, the initial configuration steps (launching VLC and simulating a click), an evaluator that checks the outcome against an expected rule, and the result source (the VLC configuration file `vlcrc`).

```json
{
    "id": "8ba5ae7a-5ae5-4eab-9fcc-5dd4fe3abf89-W0S",
    "instruction": "Help me modify the folder used to store my recordings to the Desktop",
    "config": [
        {
            "type": "launch",
            "parameters": {
                "command": "vlc"
            }
        },
        {
            "type": "execute",
            "parameters": {
                "command": [
                    "python",
                    "-c",
                    "import pyautogui; import time; pyautogui.click(960, 540); time.sleep(0.5);"
                ]
            }
        }
    ],
    "evaluator": {
        "func": "vis_vlc_recordings_folder",
        "expected": {
            "type": "rule",
            "rules": {
                "recording_file_path": "C:\\Users\\Docker\\Desktop"
            }
        }
    },
    "result": {
        "type": "vlc_config",
        "dest": "vlcrc"
    }
}
```
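Before adding a new task file, it can help to check that it has the same shape as the example above. This is a minimal sketch: `check_task` and the `REQUIRED_KEYS` tuple are assumptions derived only from this one example, not the benchmark's own validation logic.

```python
# Keys assumed to be required, based only on the example task above.
REQUIRED_KEYS = ("id", "instruction", "config", "evaluator")


def check_task(task: dict) -> list:
    """Return a list of problems found in a task definition (empty if none)."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in task]
    # Every setup step in the example carries a 'type' and a 'parameters' object.
    for index, step in enumerate(task.get("config", [])):
        if "type" not in step or "parameters" not in step:
            problems.append(f"config[{index}] needs 'type' and 'parameters'")
    # The evaluator in the example names its checking function under 'func'.
    if "func" not in task.get("evaluator", {}):
        problems.append("evaluator needs 'func'")
    return problems
```

Load the task with `json.load` and pass the resulting dictionary to `check_task`; an empty list means the structure matches the example.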

--------------------------------

### Example Navi Agent Implementation

Source: https://github.com/microsoft/windowsagentarena/blob/main/docs/Develop-Agent.md

An example Python implementation of an agent named Navi, demonstrating the structure and required `predict()` and `reset()` methods. This serves as a template for custom agents.

```python
# agent.py
import logging
from typing import Dict, List
from PIL import Image
from io import BytesIO
import copy

logger = logging.getLogger("desktopenv.agent")

class NaviAgent:
    def __init__(
            self,
            server: str = "azure",
            model: str = "gpt-4o",
            som_config = None,
            som_origin = "oss",
            obs_view = "screen",
            auto_window_maximize = False,
            use_last_screen = True,
            temperature: float = 0.5,
    ):
        # Initialize agent parameters
        self.action_space = "code_block"
        self.server = server
        self.model = model
        # ... (additional initialization)

    def predict(self, instruction: str, obs: Dict) -> List:
        """
        Predict the next action(s) based on the current observation.
        """
        # Process the observation
        # Generate actions based on the instruction
        # ...
        actions = ["# Your code logic here"]
        return actions

    def reset(self):
        """
        Reset the agent's internal state.
        """
        # Reset logic
        pass

```
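The `predict()` and `reset()` contract can be exercised without the VM. The sketch below uses a hypothetical `EchoAgent` and a made-up observation dictionary (the `"screenshot"` key is an assumption, not the benchmark's documented observation format) to check that a custom agent returns a list of actions.

```python
from typing import Dict, List


class EchoAgent:
    """Hypothetical stand-in for a custom agent; only predict() and reset() matter here."""

    def __init__(self):
        self.history: List[str] = []

    def predict(self, instruction: str, obs: Dict) -> List[str]:
        # Remember the instruction and return one code-block style action.
        self.history.append(instruction)
        return [f"# would act on: {instruction}"]

    def reset(self):
        # Clear any state kept between tasks.
        self.history.clear()


def smoke_test(agent) -> List[str]:
    """Call predict() once with a fake observation, check the return type, then reset."""
    fake_obs = {"screenshot": b""}  # assumed observation shape, for illustration only
    actions = agent.predict("Open Notepad", fake_obs)
    if not isinstance(actions, list):
        raise TypeError("predict() should return a list of actions")
    agent.reset()
    return actions
```

Swapping `EchoAgent()` for your own agent instance in `smoke_test(...)` gives a fast check before running the full benchmark.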

--------------------------------

### Install Python Packages

Source: https://github.com/microsoft/windowsagentarena/blob/main/src/win-arena-container/client/desktop_env/evaluators/README.md

Installs necessary Python packages for image processing and hashing. Ensure pip is available in your environment.

```bash
pip install opencv-python-headless Pillow imagehash
```

--------------------------------

### Run Strongest Agent Configuration

Source: https://github.com/microsoft/windowsagentarena/blob/main/README.md

Executes the `run-local.sh` script with GPU enabled, `mixed-omni` as the screen-understanding origin (`--som-origin`), and the `uia` accessibility backend. This is the strongest agent setup.

```bash
./run-local.sh --gpu-enabled true --som-origin mixed-omni --a11y-backend uia
```

--------------------------------

### Run Local Script with Specific Model Origin and GPU Enabled

Source: https://github.com/microsoft/windowsagentarena/blob/main/README.md

Runs the local deployment with the `mixed-omni` screen-understanding model origin and GPU usage enabled. Ensure you have the necessary dependencies installed.

```bash
./run-local.sh --som-origin mixed-omni --gpu-enabled true
```

--------------------------------

### Import Playwright Module in Python

Source: https://github.com/microsoft/windowsagentarena/blob/main/src/win-arena-container/client/desktop_env/evaluators/README.md

Import the synchronous Playwright API at the beginning of your Python script to start browser automation.

```python
from playwright.sync_api import sync_playwright
```

--------------------------------

### Change Difficulty Level in Start Client Script

Source: https://github.com/microsoft/windowsagentarena/blob/main/README.md

To enable the new harder difficulty mode, modify the 'diff_lvl' parameter in the 'start_client.sh' script from 'normal' to 'hard'. This mode requires agents to initialize tasks themselves.

```bash
diff_lvl="hard"
```

--------------------------------

### Test Python Server Accessibility

Source: https://github.com/microsoft/windowsagentarena/blob/main/docs/Development-Tips.md

Send a GET request to the screenshot endpoint to verify that the Python server is running and accessible within the Docker container.

```bash
curl -v -X GET http://20.20.20.21:5000/screenshot
# you should get an HTTP/1.1 200 OK response
```
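The same check can be scripted in Python with the standard library. This is a sketch, not part of the repository: `screenshot_url` and `server_is_up` are hypothetical helper names, and the default address is simply the one used in the `curl` command above.

```python
import urllib.error
import urllib.request


def screenshot_url(host="20.20.20.21", port=5000):
    """Build the screenshot endpoint URL used in the curl example."""
    return f"http://{host}:{port}/screenshot"


def server_is_up(url=None, timeout=5.0) -> bool:
    """Return True if a GET on the screenshot endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url or screenshot_url(), timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status all count as "not up".
        return False
```

Run `server_is_up()` from inside the container; a `False` result usually means the VM or the Python server has not finished starting.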

--------------------------------

### Run Local Script in Interactive Mode

Source: https://github.com/microsoft/windowsagentarena/blob/main/docs/Development-Tips.md

Launch the Docker container without starting the VM and client processes. This is useful for developing agents and extensions.

```bash
cd scripts
./run-local.sh --interactive true
```

--------------------------------

### Open a Web Page with Playwright

Source: https://github.com/microsoft/windowsagentarena/blob/main/src/win-arena-container/client/desktop_env/evaluators/README.md

Launches a Chromium browser, navigates to a specified URL, and closes the browser. Requires Playwright for Python to be installed.

```python
from playwright.sync_api import sync_playwright

def run(playwright):
    browser = playwright.chromium.launch()
    page = browser.new_page()
    page.goto("http://example.com")
    ## other actions...
    browser.close()

with sync_playwright() as playwright:
    run(playwright)
```

--------------------------------

### Upload Folder to Azure Blob Storage using Azure CLI

Source: https://github.com/microsoft/windowsagentarena/blob/main/README.md

Use this Azure CLI command to upload a local folder to an Azure Blob container. Ensure you have the Azure CLI installed and are logged in.

```bash
az login --use-device-code
az storage blob upload-batch --account-name <STORAGE_ACCOUNT_NAME> --destination <CONTAINER_NAME> --source <LOCAL_FOLDER>
```

--------------------------------

### Start Chrome with Remote Debugging Enabled

Source: https://github.com/microsoft/windowsagentarena/blob/main/src/win-arena-container/client/desktop_env/evaluators/README.md

Modify the Chrome shortcut properties to include the --remote-debugging-port flag. This enables remote debugging on port 9222, allowing tools like Playwright to connect.

```bash
"C:\Path\To\Chrome.exe" --remote-debugging-port=9222
```
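Once Chrome is running with this flag, Playwright's `connect_over_cdp` can attach to it instead of launching a new browser. The helper names below are hypothetical, and `localhost` is an assumption; in the benchmark setup the browser may be reachable at a different address.

```python
def cdp_endpoint(host="localhost", port=9222):
    """Build the DevTools endpoint URL for a Chrome started with --remote-debugging-port."""
    return f"http://{host}:{port}"


def list_open_page_urls(endpoint=None):
    """Attach to the running Chrome and return the URLs of its open pages."""
    # Imported lazily so the module loads even where Playwright is not installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as playwright:
        browser = playwright.chromium.connect_over_cdp(endpoint or cdp_endpoint())
        urls = [page.url for context in browser.contexts for page in context.pages]
        browser.close()  # closes the connection to the attached browser
        return urls
```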

--------------------------------

### Run Local Script Help

Source: https://github.com/microsoft/windowsagentarena/blob/main/README.md

Displays the help information for the `run-local.sh` script.

```bash
./run-local.sh --help
```

--------------------------------

### Prepare Windows 11 Golden Image

Source: https://github.com/microsoft/windowsagentarena/blob/main/README.md

Run this script once to prepare a Windows 11 VM snapshot with all necessary programs and a Python server. Monitor progress at http://localhost:8006.

```bash
cd ./scripts
./run-local.sh --prepare-image true
```

--------------------------------

### Run Base Benchmark

Source: https://github.com/microsoft/windowsagentarena/blob/main/README.md

Launch the evaluation to run the baseline agent on all benchmark tasks. Navigate to the scripts directory first.

```bash
cd scripts
./run-local.sh
# For client/agent options:
```

--------------------------------

### Baseline Navi Agent Configuration

Source: https://github.com/microsoft/windowsagentarena/blob/main/README.md

Sets up the baseline configuration for the Navi agent using webparse, groundingdino, and OCR (TesseractOCR).

```bash
./run-local.sh --som-origin oss
```

--------------------------------

### Show Results Script

Source: https://github.com/microsoft/windowsagentarena/blob/main/README.md

Navigates to the client directory and runs the `show_results.py` script to display benchmark results. Requires specifying the directory where results are stored.

```bash
cd src/win-arena-container/client
python show_results.py --result_dir <path_to_results_folder>
```

--------------------------------

### Run Local Benchmark with Custom Resources

Source: https://github.com/microsoft/windowsagentarena/blob/main/README.md

Execute the benchmark locally, overriding default RAM and CPU allocations. Useful for systems with limited resources.

```bash
./run-local.sh --ram-size 4G --cpu-cores 4
```

--------------------------------

### Run Azure Experiments

Source: https://github.com/microsoft/windowsagentarena/blob/main/README.md

Launch the benchmark experiments as Azure ML jobs on Compute Instances. Ensure your experiments.json is configured.

```bash
cd scripts
python run_azure.py --experiments_json "experiments.json"
```

--------------------------------

### Run Azure Deployment with Experiment Parameters

Source: https://github.com/microsoft/windowsagentarena/blob/main/README.md

Execute the run_azure.py script with specific experiment parameters using command-line arguments. This is an alternative to manually editing experiments.json.

```bash
cd scripts
python run_azure.py --experiments_json "experiments.json" --update_json --exp_name "experiment_1" --ci_startup_script_path "Users/<YOUR_USER>/compute-instance-startup.sh" --agent "navi" --json_name "evaluation_examples_windows/test_all.json" --num_workers 4 --som_origin oss --a11y_backend win32
```

--------------------------------

### Build WinArena Docker Image Locally

Source: https://github.com/microsoft/windowsagentarena/blob/main/README.md

Build the WinArena image locally from the scripts directory. Use --build-base-image to rebuild the base image if Dockerfile-WinArena-Base changes.

```bash
cd scripts
./build-container-image.sh
```

```bash
# If there are any changes in 'Dockerfile-WinArena-Base', use the --build-base-image flag to build also the base image locally
# ./build-container-image.sh --build-base-image true

# For other build options:
# ./build-container-image.sh --help
```

--------------------------------

### Recommended Navi Agent Configuration

Source: https://github.com/microsoft/windowsagentarena/blob/main/README.md

Uses the recommended configuration for the Navi agent, combining Omniparser with accessibility tree information for optimal results.

```bash
./run-local.sh --som-origin mixed-omni --a11y-backend uia
```

--------------------------------

### Navigate to mm_agents Directory

Source: https://github.com/microsoft/windowsagentarena/blob/main/docs/Develop-Agent.md

Change the current directory to the `src/win-arena-container/client/mm_agents` folder where agent files are located.

```bash
cd src/win-arena-container/client/mm_agents
```

--------------------------------

### Show Azure Experiment Results

Source: https://github.com/microsoft/windowsagentarena/blob/main/README.md

Display experiment results from downloaded agent outputs. Requires a JSON configuration and the path to the results directory.

```bash
cd scripts
python show_azure.py --json_config "experiments.json" --result_dir <path_to_downloaded_agent_outputs_folder>
```

--------------------------------

### Create New Agent Folder

Source: https://github.com/microsoft/windowsagentarena/blob/main/docs/Develop-Agent.md

Create a new directory for your custom agent within the `mm_agents` directory. Replace `my_agent` with your desired agent name.

```bash
mkdir my_agent
```

--------------------------------

### Copy Default Agent Template

Source: https://github.com/microsoft/windowsagentarena/blob/main/docs/Develop-Agent.md

Copy the `agent.py` file from the default agent's directory to your new agent's directory to use as a template.

```bash
cp default_agent/agent.py my_agent/
```

--------------------------------

### Log in to Azure CLI

Source: https://github.com/microsoft/windowsagentarena/blob/main/README.md

Command to log in to Azure CLI. It may prompt for device code authentication. Use the commented-out lines if you need to specify a tenant or set a subscription.

```bash
az login --use-device-code 
# If multiple tenants or subscriptions, make sure to select the right ones with:
# az login --use-device-code --tenant "<YOUR_AZURE_AD_TENANT_ID>"
# az account set --subscription "<YOUR_AZURE_SUBSCRIPTION_ID>"
```

--------------------------------

### Run Local Script in Dev Mode for Image Preparation

Source: https://github.com/microsoft/windowsagentarena/blob/main/docs/Development-Tips.md

Use this command to prepare the Windows golden image in development mode, enabling shared folders between the Docker host and the Windows VM.

```bash
cd ./scripts
./run-local.sh --mode dev --prepare-image true
```

--------------------------------

### Connect to Running Docker Container

Source: https://github.com/microsoft/windowsagentarena/blob/main/docs/Development-Tips.md

Connect to the running Docker container to test VM accessibility and Python server status.

```bash
cd scripts
./run-local.sh --connect true
```

--------------------------------

### Mixed OSS and Accessibility Tree Navi Agent Configuration

Source: https://github.com/microsoft/windowsagentarena/blob/main/README.md

Combines OSS detections with accessibility tree information for the Navi agent.

```bash
./run-local.sh --som-origin mixed-oss --a11y-backend uia
```

--------------------------------

### Define Experiment Parameters in experiments.json

Source: https://github.com/microsoft/windowsagentarena/blob/main/README.md

This JSON structure defines parameters for an experiment run, including agent, model, and paths. Use this as a reference for your own experiment configurations.

```json
{
  "experiment_1": {
    "ci_startup_script_path": "Users/<YOUR_USER>/compute-instance-startup.sh", 
    "agent": "navi",
    "datastore_input_path": "storage",
    "docker_img_name": "windowsarena/winarena:latest",
    "exp_name": "experiment_1",
    "num_workers": 4,
    "use_managed_identity": false,
    "json_name": "evaluation_examples_windows/test_all.json",
    "model_name": "gpt-4-1106-vision-preview",
    "som_origin": "oss", 
    "a11y_backend": "win32" 
  }
}
```
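An entry like this can also be turned into the equivalent `run_azure.py` command line. The sketch below is an illustration only: `run_azure_command` is a hypothetical helper, and `KNOWN_FLAGS` lists just the flags that appear in this document's `run_azure.py` example, so keys without a documented flag (such as `model_name`) are skipped rather than guessed.

```python
import shlex

# Flags taken from the run_azure.py example in this document; not an exhaustive list.
KNOWN_FLAGS = (
    "exp_name",
    "ci_startup_script_path",
    "agent",
    "json_name",
    "num_workers",
    "som_origin",
    "a11y_backend",
)


def run_azure_command(params: dict, experiments_json="experiments.json") -> str:
    """Build a shell-quoted run_azure.py command from one experiment entry."""
    args = ["python", "run_azure.py", "--experiments_json", experiments_json, "--update_json"]
    for key in KNOWN_FLAGS:
        if key in params:
            args += [f"--{key}", str(params[key])]
    # shlex.join quotes values such as paths containing '<' or spaces.
    return shlex.join(args)
```

For example, `run_azure_command({"exp_name": "experiment_1", "agent": "navi"})` yields a single string that can be pasted into a terminal from the `scripts` directory.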

--------------------------------

### Fast Accessibility Tree Navi Agent Configuration

Source: https://github.com/microsoft/windowsagentarena/blob/main/README.md

Configures the Navi agent to use a faster but less accurate accessibility tree backend.

```bash
./run-local.sh --som-origin a11y --a11y-backend win32
```

--------------------------------

### Run Local Benchmark Disabling KVM Acceleration

Source: https://github.com/microsoft/windowsagentarena/blob/main/README.md

Disable KVM acceleration for local benchmark runs. Not recommended due to performance impact; consider Azure for better performance.

```bash
./run-local.sh --use-kvm false
```

--------------------------------

### Accurate Accessibility Tree Navi Agent Configuration

Source: https://github.com/microsoft/windowsagentarena/blob/main/README.md

Configures the Navi agent to use a slower but more accurate accessibility tree backend.

```bash
./run-local.sh --som-origin a11y --a11y-backend uia
```

--------------------------------

### Add Azure Configuration to config.json

Source: https://github.com/microsoft/windowsagentarena/blob/main/README.md

Append these keys to your project's config.json file to specify Azure subscription, resource group, and workspace details.

```json
{
    ...

    "AZURE_SUBSCRIPTION_ID": "<YOUR_AZURE_SUBSCRIPTION_ID>", 
    "AZURE_ML_RESOURCE_GROUP": "<YOUR_AZURE_ML_RESOURCE_GROUP>",
    "AZURE_ML_WORKSPACE_NAME": "<YOUR_AZURE_ML_WORKSPACE_NAME>"
}
```
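If you prefer not to edit the file by hand, the keys can be merged in with a few lines of Python. This is a sketch under assumptions: `add_azure_settings` is a hypothetical helper, and it expects `config.json` to be strict JSON. Existing keys such as the API keys are left untouched.

```python
import json
from pathlib import Path


def add_azure_settings(path, subscription_id, resource_group, workspace_name):
    """Merge the three Azure ML keys into config.json, keeping existing entries."""
    config_path = Path(path)
    # Start from the current file if it exists, otherwise from an empty config.
    config = json.loads(config_path.read_text(encoding="utf-8")) if config_path.exists() else {}
    config.update(
        {
            "AZURE_SUBSCRIPTION_ID": subscription_id,
            "AZURE_ML_RESOURCE_GROUP": resource_group,
            "AZURE_ML_WORKSPACE_NAME": workspace_name,
        }
    )
    config_path.write_text(json.dumps(config, indent=4) + "\n", encoding="utf-8")
    return config
```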

--------------------------------

### Pull WinArena-Base Docker Image

Source: https://github.com/microsoft/windowsagentarena/blob/main/README.md

Pull the base image, which includes the dependencies needed to run the code, from Docker Hub.

```bash
docker pull windowsarena/winarena-base:latest
```

--------------------------------

### OmniParser Navi Agent Configuration

Source: https://github.com/microsoft/windowsagentarena/blob/main/README.md

Configures the Navi agent to use Omniparser for screen element understanding.

```bash
./run-local.sh --som-origin omni
```
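The Navi configuration commands in this document differ only in the `--som-origin`, `--a11y-backend`, and `--gpu-enabled` values. The sketch below builds the argument list and rejects unknown values. `run_local_args` is a hypothetical helper, and the allowed sets contain only the values that appear in this document, so they may not be exhaustive.

```python
# Values seen in this document's run-local.sh examples; possibly not exhaustive.
SOM_ORIGINS = {"oss", "a11y", "mixed-oss", "omni", "mixed-omni"}
A11Y_BACKENDS = {"uia", "win32"}


def run_local_args(som_origin="oss", a11y_backend=None, gpu_enabled=False):
    """Return the run-local.sh argument list for a Navi configuration."""
    if som_origin not in SOM_ORIGINS:
        raise ValueError(f"unknown som-origin: {som_origin}")
    args = ["./run-local.sh", "--som-origin", som_origin]
    if a11y_backend is not None:
        if a11y_backend not in A11Y_BACKENDS:
            raise ValueError(f"unknown a11y-backend: {a11y_backend}")
        args += ["--a11y-backend", a11y_backend]
    if gpu_enabled:
        args += ["--gpu-enabled", "true"]
    return args
```

The list can be passed to `subprocess.run(...)` from the `scripts` directory, which avoids shell-quoting mistakes when switching between configurations.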

--------------------------------

### Upload Custom Docker Image to Azure Container Registry

Source: https://github.com/microsoft/windowsagentarena/blob/main/README.md

Steps to log in to Azure and Docker, tag a local Docker image, and push it to your Azure Container Registry. This is for using custom images.

```bash
az login --use-device-code
# potentially needed if commands below don't work: az acr login --name <ACR_NAME>
docker login
docker tag <IMAGE_NAME> <ACR_NAME>.azurecr.io/<IMAGE_NAME>:<TAG>
docker push <ACR_NAME>.azurecr.io/<IMAGE_NAME>:<TAG>
```

--------------------------------

### Create Conda Environment for Windows Agent Arena

Source: https://github.com/microsoft/windowsagentarena/blob/main/README.md

This command creates a new Conda environment named 'winarena' with Python version 3.9, which is recommended for running the scripts.

```bash
conda create -n winarena python=3.9
```

--------------------------------

### Forward Ports to Connect Python Server Externally

Source: https://github.com/microsoft/windowsagentarena/blob/main/docs/Development-Tips.md

Create a proxy server within the Docker container to forward specific ports (5000, 9222, 1337) to the Windows server's IP address, enabling external connections.

```bash
# connect to the running docker
cd scripts
./run-local.sh --connect true

# It will forward the requests and responses from the ports to the windows server's IP inside the docker
echo -n 5000 9222 1337 | xargs -d ' ' -I% bash -c 'socat tcp-listen:%,fork tcp:
```

--------------------------------

### Required Libraries for LibreOffice Calc

Source: https://github.com/microsoft/windowsagentarena/blob/main/src/win-arena-container/client/desktop_env/evaluators/README.md

List of libraries required for LibreOffice Calc operations.

```text
openpyxl
pandas
lxml
xmltodict
```

--------------------------------

### Convert Bash Scripts to Unix Format (WSL2)

Source: https://github.com/microsoft/windowsagentarena/blob/main/README.md

If running on WSL2 and encountering interpreter errors, convert bash scripts from DOS/Windows format to Unix format.

```bash
cd ./scripts
find . -maxdepth 1 -type f -exec dos2unix {} +
```
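If `dos2unix` is not installed, the same line-ending conversion can be approximated in Python. This is a rough sketch, not a drop-in replacement: `crlf_to_lf` is a hypothetical helper, and unlike `dos2unix` it does not skip binary files, so point it only at a directory of text scripts.

```python
from pathlib import Path


def crlf_to_lf(directory="."):
    """Rewrite CRLF line endings as LF for regular files directly inside `directory`."""
    converted = []
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue  # mirror `-maxdepth 1 -type f`: no subdirectories
        data = path.read_bytes()
        fixed = data.replace(b"\r\n", b"\n")
        if fixed != data:
            path.write_bytes(fixed)
            converted.append(path.name)
    return converted
```

The function returns the names of the files it changed, so an empty list means every file already used Unix line endings.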

--------------------------------

### Generate CSV from XLSX using LibreOffice

Source: https://github.com/microsoft/windowsagentarena/blob/main/src/win-arena-container/client/desktop_env/evaluators/README.md

Convert an XLSX file to CSV format using the LibreOffice command-line tool. Specify conversion options and output directory. The last parameter indicates the sheet number to export.

```shell
libreoffice --convert-to "csv:Text - txt - csv (StarCalc):44,34,UTF8,,,,false,true,true,false,false,1" --out-dir /home/user /home/user/abc.xlsx
```
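The filter option string is easier to adjust when it is built from named parts. In the sketch below, the first two tokens are the ASCII codes of the field separator (44 is a comma) and the text delimiter (34 is a double quote), the third is the character set, and the last is the sheet number, as described above. The tokens in between are copied verbatim from the command, and `csv_convert_arg` is a hypothetical helper name.

```python
def csv_convert_arg(separator=",", quote='"', encoding="UTF8", sheet=1):
    """Build the --convert-to argument for exporting one sheet to CSV."""
    tokens = [
        str(ord(separator)),  # field separator as an ASCII code, 44 for ","
        str(ord(quote)),      # text delimiter as an ASCII code, 34 for '"'
        encoding,             # character set
        "", "", "",           # tokens 4-6 left empty, as in the original command
        "false", "true", "true", "false", "false",  # tokens 7-11 copied verbatim
        str(sheet),           # sheet number to export
    ]
    return "csv:Text - txt - csv (StarCalc):" + ",".join(tokens)
```

With the defaults this reproduces the argument used in the command above; passing `sheet=2` or `separator=";"` changes only the corresponding token.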

--------------------------------

### Disable System Crash Report

Source: https://github.com/microsoft/windowsagentarena/blob/main/src/win-arena-container/client/desktop_env/evaluators/README.md

Disable the system crash report by editing the apport configuration file and setting `enabled=0`.

```shell
sudo vim /etc/default/apport
```
