### Install and Set Up Xorg for a Virtual Screen

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Training-on-Amazon-Web-Service.md

Install Xorg and configure it to use a virtual screen for rendering Unity environments on a remote server. This is necessary for visual observations when headless mode is not used.

```sh
# Install Xorg
$ sudo apt-get update
$ sudo apt-get install -y xserver-xorg mesa-utils
$ sudo nvidia-xconfig -a --use-display-device=None --virtual=1280x1024

# Get the BusID information
$ nvidia-xconfig --query-gpu-info

# Add the BusID information to your /etc/X11/xorg.conf file.
# The board name and BusID below are examples; use the values reported for your GPU.
$ sudo sed -i 's/    BoardName      "Tesla K80"/    BoardName      "Tesla K80"\n    BusID          "0:30:0"/g' /etc/X11/xorg.conf

# Remove the Section "Files" from the /etc/X11/xorg.conf file
# And remove two lines that contain Section "Files" and EndSection
$ sudo vim /etc/X11/xorg.conf
```
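`nvidia-smi` prints the PCI address in hexadecimal (for example `00000000:00:1E.0`), while the `BusID` value used in the `sed` command above is written in decimal (`0:30:0`). The following Python sketch converts one form to the other. The function name `xorg_bus_id` is hypothetical, and the input is assumed to follow the `domain:bus:device.function` layout.

```python
def xorg_bus_id(smi_bus_id: str) -> str:
    """Convert a hex 'domain:bus:device.function' address to decimal 'bus:device:function'."""
    _domain, bus, dev_fn = smi_bus_id.split(":")
    device, function = dev_fn.split(".")
    # Each field is hexadecimal in nvidia-smi output; xorg.conf expects decimal.
    return f"{int(bus, 16)}:{int(device, 16)}:{int(function, 16)}"


print(xorg_bus_id("00000000:00:1E.0"))  # 0:30:0
```

The result can be compared against the value reported by `nvidia-xconfig --query-gpu-info` before editing `/etc/X11/xorg.conf`.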

--------------------------------

### Install Grpc.Tools on Linux

Source: https://github.com/unity-technologies/ml-agents/blob/develop/protobuf-definitions/README.md

Installs the `nuget` client with apt-get on Linux. `nuget` is the tool used to install the Grpc.Tools package.

```bash
sudo apt-get install nuget
```

--------------------------------

### Install Grpc.Tools on Windows

Source: https://github.com/unity-technologies/ml-agents/blob/develop/protobuf-definitions/README.md

Installs the Grpc.Tools NuGet package on Windows. Ensure nuget is in your PATH.

```bash
nuget install Grpc.Tools -Version 1.14.1 -OutputDirectory $MLAGENTS_ROOT\protobuf-definitions
```

--------------------------------

### Start X Server and Configure Display

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Training-on-Amazon-Web-Service.md

Start the X server and set the DISPLAY environment variable to use it for rendering.

```sh
# Start the X Server, press Enter to come back to the command line
$ sudo /usr/bin/X :0 &

# Check if Xorg process is running
# You will have a list of processes running on the GPU, Xorg should be in the list.
$ nvidia-smi

# Make Ubuntu use the X server for display
$ export DISPLAY=:0
```
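When the Unity executable is launched from a Python script rather than an interactive shell, the same `DISPLAY` setting has to reach the child process. A minimal sketch, assuming a hypothetical helper named `env_with_display`:

```python
import os
from typing import Dict, Mapping, Optional


def env_with_display(base: Optional[Mapping[str, str]] = None,
                     display: str = ":0") -> Dict[str, str]:
    """Return a copy of the environment with DISPLAY pointing at the X server."""
    env = dict(os.environ if base is None else base)
    env["DISPLAY"] = display
    return env


# Typical use: subprocess.run(["glxgears"], env=env_with_display())
print(env_with_display({"PATH": "/usr/bin"}))
```

Copying the mapping leaves the parent process environment untouched, so only the launched process sees the virtual display.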

--------------------------------

### Install Pip

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Using-Virtual-Environment.md

Execute the downloaded get-pip.py script using python3 to install or upgrade pip. Ensure you have python3 installed.

```bash
python3 get-pip.py
```

--------------------------------

### Install protobuf

Source: https://github.com/unity-technologies/ml-agents/blob/develop/protobuf-definitions/README.md

Installs a specific version of the protobuf library. Use --force to ensure the version is applied.

```bash
pip install protobuf==3.19.6 --force
```

--------------------------------

### Training Output Example

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Learning-Environment-Executable.md

This is an example of the console output you can expect when the `mlagents-learn` command starts training. The excerpt shows the command and the logo banner printed at startup.

```console
ml-agents$ mlagents-learn config/ppo/3DBall.yaml --env=3DBall --run-id=first-run


                        ▄▄▄▓▓▓▓
                   ╓▓▓▓▓▓▓█▓▓▓▓▓
              ,▄▄▄m▀▀▀'  ,▓▓▓▀▓▓▄                           ▓▓▓  ▓▓▌
            ▄▓▓▓▀'      ▄▓▓▀  ▓▓▓      ▄▄     ▄▄ ,▄▄ ▄▄▄▄   ,▄▄ ▄▓▓▌▄ ▄▄▄    ,▄▄
          ▄▓▓▓▀        ▄▓▓▀   ▐▓▓▌     ▓▓▌   ▐▓▓ ▐▓▓▓▀▀▀▓▓▌ ▓▓▓ ▀▓▓▌▀ ^▓▓▌  ╒▓▓▌
        ▄▓▓▓▓▓▄▄▄▄▄▄▄▄▓▓▓      ▓▀      ▓▓▌   ▐▓▓ ▐▓▓    ▓▓▓ ▓▓▓  ▓▓▌   ▐▓▓▄ ▓▓▌
        ▀▓▓▓▓▀▀▀▀▀▀▀▀▀▀▓▓▄     ▓▓      ▓▓▌   ▐▓▓ ▐▓▓    ▓▓▓ ▓▓▓  ▓▓▌    ▐▓▓▐▓▓
          ^█▓▓▓        ▀▓▓▄   ▐▓▓▌     ▓▓▓▓▄▓▓▓▓ ▐▓▓    ▓▓▓ ▓▓▓  ▓▓▓▄    ▓▓▓▓`
            '▀▓▓▓▄      ^▓▓▓  ▓▓▓       └▀▀▀▀ ▀▀ ^▀▀    `▀▀ `▀▀   '▀▀    ▐▓▓▌
               ▀▀▀▀▓▄▄▄   ▓▓▓▓▓▓,
                   `▀█▓▓▓▓▓▓▓▓▓▌
                        ¬`▀▀▀█▓


```

--------------------------------

### Install Grpc.Tools on Mac

Source: https://github.com/unity-technologies/ml-agents/blob/develop/protobuf-definitions/README.md

Installs the `nuget` client with Homebrew on macOS. `nuget` is the tool used to install the Grpc.Tools package.

```bash
brew install nuget
```

--------------------------------

### Successful Training Start Message

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Tutorial-Custom-Trainer-Plugin.md

This message indicates that your custom trainer package is installed correctly and ML-Agents is ready to begin training.

```text
[INFO] Listening on port 5004. Start training by pressing the Play button in the Unity Editor.
```

--------------------------------

### Install Custom Trainer Package

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Tutorial-Custom-Trainer-Plugin.md

Install your custom trainer package with pip. If the package is published, install it by name. For local development, install it in editable mode from the plugin directory.

```shell
pip3 install your_custom_package
```

```shell
pip3 install -e ./ml-agents-trainer-plugin
```

--------------------------------

### Install mypy-protobuf

Source: https://github.com/unity-technologies/ml-agents/blob/develop/protobuf-definitions/README.md

Installs mypy-protobuf for type checking generated protobuf code.

```bash
pip install mypy-protobuf==1.16.0
```

--------------------------------

### Install grpcio-tools

Source: https://github.com/unity-technologies/ml-agents/blob/develop/protobuf-definitions/README.md

Installs the grpcio-tools package, which is necessary for generating gRPC code.

```bash
pip install grpcio-tools==1.28.1
```
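The protobuf toolchain in these instructions relies on exact version pins (`protobuf==3.19.6`, `mypy-protobuf==1.16.0`, `grpcio-tools==1.28.1`). The sketch below shows one way to check an environment against such pins with the standard library. The function names and the `PINS` dictionary are illustrative, not part of ML-Agents.

```python
from importlib import metadata
from typing import Callable, Dict, Optional

# Pins taken from the install commands in this guide.
PINS = {"protobuf": "3.19.6", "mypy-protobuf": "1.16.0", "grpcio-tools": "1.28.1"}


def installed_version(name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pins: Dict[str, str],
                    lookup: Callable[[str], Optional[str]] = installed_version
                    ) -> Dict[str, Optional[str]]:
    """Map each package that does not match its pin to the version actually found."""
    mismatches = {}
    for name, wanted in pins.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = found
    return mismatches


print(find_mismatches(PINS))  # an empty dict means every pin is satisfied
```

The `lookup` parameter exists only so the check can be exercised without touching the real environment.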

--------------------------------

### Start ML Agents Training

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Sample.md

Use the `mlagents-learn` command to start training an agent. Specify the path to the configuration file and a unique run ID for the training session.

```bash
mlagents-learn config/ppo/3DBall.yaml --run-id=first3DBallRun
```

--------------------------------

### Install Python Pip

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Training-on-Microsoft-Azure.md

Installs pip for both Python 2 and Python 3, which are needed for installing ML-Agents dependencies.

```bash
sudo apt install python-pip
sudo apt install python3-pip
```

--------------------------------

### Install ML-Agents Packages

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Tutorial-Custom-Trainer-Plugin.md

Before installing your custom trainer, ensure you have the ML-Agents environment and core packages installed using pip.

```shell
pip3 install -e ./ml-agents-envs && pip3 install -e ./ml-agents
```

--------------------------------

### Install cuDNN and Set Library Path

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Training-on-Microsoft-Azure.md

Installs cuDNN on the VM and sets the LD_LIBRARY_PATH environment variable. A reboot is required after installation.

```bash
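# Install the cuDNN package
sudo dpkg -i libcudnn6_6.0.21-1+cuda8.0_amd64.deb

# Make the CUDA and cuDNN libraries visible to the dynamic linker
# (put this line in ~/.profile so that it survives the reboot)
export LD_LIBRARY_PATH=/usr/local/cuda/lib64/:/usr/lib/x86_64-linux-gnu/:$LD_LIBRARY_PATH

# Reload the profile in the current shell
. ~/.profile

# Reboot the VM to finish the installation
sudo reboot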
```

--------------------------------

### Run ML-Agents Docker Container for 3DBall Example

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Using-Docker.md

This is a concrete example of the `docker run` command for the 3DBall environment. It specifies a container name, mounts the local unity-volume, maps ports, uses a specific image name, and provides the necessary ML-Agents arguments for training.

```sh
docker run -it --name 3DBallContainer.first.trial \
           --mount type=bind,source="$(pwd)"/unity-volume,target=/unity-volume \
           -p 5005:5005 \
           -p 6006:6006 \
           balance.ball.v0.1:latest 3DBall \
           /unity-volume/trainer_config.yaml \
           --env=/unity-volume/3DBall \
           --train \
           --run-id=3dball_first_trial
```

--------------------------------

### Install python3-venv (Ubuntu)

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Using-Virtual-Environment.md

On Ubuntu, you need to install the python3-venv package before creating virtual environments.

```bash
$ sudo apt-get install python3-venv
```

--------------------------------

### Install ML-Agents Packages

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Training-on-Microsoft-Azure.md

Install PyTorch and ML-Agents using pip. Ensure you use the specified versions for compatibility.

```bash
pip3 install torch==1.7.0 -f https://download.pytorch.org/whl/torch_stable.html
```

```bash
python -m pip install mlagents==1.1.0
```

--------------------------------

### Install CUDA Toolkit on Ubuntu

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Training-on-Microsoft-Azure.md

Installs the CUDA toolkit version 8.0 on Ubuntu 16.04 LTS. This is a prerequisite for GPU-accelerated training.

```bash
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/cuda-repo-ubuntu1604_8.0.61-1_amd64.deb

sudo dpkg -i cuda-repo-ubuntu1604_8.0.61-1_amd64.deb

sudo apt-get update

sudo apt-get install cuda-8-0
```

--------------------------------

### Run Training Command

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Learning-Environment-Create-New.md

Execute the mlagents-learn command to start training the agent, specifying the configuration file and a run ID for tracking.

```bash
mlagents-learn config/rollerball_config.yaml --run-id=RollerBall
```

--------------------------------

### Training with Parameter Randomization

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Training-ML-Agents.md

Launch the `mlagents-learn` command with your configuration file to start training with environment parameter randomization enabled.

```sh
mlagents-learn config/ppo/3DBall_randomize.yaml --run-id=3D-Ball-randomize
```

--------------------------------

### Verify Xorg Processes

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Training-on-Amazon-Web-Service.md

Ensure no Xorg processes are running before starting the X server. Use `nvidia-smi` to confirm that Xorg no longer appears in the GPU process list.

```sh
# Kill any possible running Xorg processes
# Note that you might have to run this command multiple times depending on
# how Xorg is configured.
$ sudo killall Xorg

# Check if there is any Xorg process left
# You will have a list of processes running on the GPU, Xorg should not be in
# the list, as shown below.
$ nvidia-smi

# Thu Jun 14 20:21:11 2018
# +-----------------------------------------------------------------------------+
# | NVIDIA-SMI 390.67                 Driver Version: 390.67                    |
# |-------------------------------+----------------------+----------------------|
# | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
# | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
# |===============================+======================+======================|
# |   0  Tesla K80           On   | 00000000:00:1E.0 Off |                    0 |
# | N/A   37C    P8    31W / 149W |      0MiB / 11441MiB |      0%      Default |
# +-------------------------------+----------------------+----------------------+
#
# +-----------------------------------------------------------------------------+
# | Processes:                                                       GPU Memory |
# |  GPU       PID   Type   Process name                             Usage      |
# |=============================================================================|
# |  No running processes found                                                 |
# +-----------------------------------------------------------------------------+
```

--------------------------------

### UnityEnvironment Initialization and Basic Usage

Source: https://context7.com/unity-technologies/ml-agents/llms.txt

Demonstrates how to initialize UnityEnvironment, configure it using side channels, reset the environment, get behavior specs, and step through episodes with random actions.

````APIDOC
## UnityEnvironment

### Description
`UnityEnvironment` is the primary Python class for connecting to a Unity simulation. It supports launching standalone executables or attaching to the Unity Editor, and exposes a step-based API for multi-agent environments. All side channels are injected at construction time.

### Initialization Example
```python
from mlagents_envs.environment import UnityEnvironment
from mlagents_envs.side_channel.engine_configuration_channel import EngineConfigurationChannel
from mlagents_envs.side_channel.environment_parameters_channel import EnvironmentParametersChannel
import numpy as np

# Configure engine and environment parameters via side channels
engine_channel = EngineConfigurationChannel()
params_channel = EnvironmentParametersChannel()

# Connect to a standalone build; worker_id offsets the port (base 5005)
env = UnityEnvironment(
    file_name="builds/3DBall",
    worker_id=0,
    seed=42,
    no_graphics=True,
    timeout_wait=120,
    side_channels=[engine_channel, params_channel],
    num_areas=4,
)

# Speed up simulation (time_scale > 1 skips rendering waits)
engine_channel.set_configuration_parameters(time_scale=20.0, target_frame_rate=60)
# Randomize a Unity-side float parameter (read via Academy.EnvironmentParameters)
params_channel.set_float_parameter("goal_size", 5.0)

env.reset()

behavior_name = list(env.behavior_specs.keys())[0]
spec = env.behavior_specs[behavior_name]
print(f"Behavior: {behavior_name}")
print(f"Action spec: {spec.action_spec}")  # e.g. Continuous: 2, Discrete: ()
print(f"Observation specs: {spec.observation_specs}")

try:
    for episode in range(3):
        env.reset()
        for step in range(200):
            decision_steps, terminal_steps = env.get_steps(behavior_name)

            # Random actions for all agents requesting a decision
            n_agents = len(decision_steps)
            if n_agents > 0:
                action_tuple = spec.action_spec.random_action(n_agents)
                env.set_actions(behavior_name, action_tuple)

            env.step()

            # Per-agent episode endings
            for agent_id in terminal_steps:
                t = terminal_steps[agent_id]
                print(f"  Agent {agent_id} ended. Reward={t.reward:.3f}, interrupted={t.interrupted}")
finally:
    env.close()
```
````

--------------------------------

### Test Xorg Configuration with glxgears

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Training-on-Amazon-Web-Service.md

Verify Xorg is correctly configured by running glxgears. A high FPS indicates proper setup.

```sh
# For more information on glxgears, see ftp://www.x.org/pub/X11R6.8.1/doc/glxgears.1.html.
$ glxgears
# If Xorg is configured correctly, you should see the following message

# Running synchronized to the vertical refresh.  The framerate should be
# approximately the same as the monitor refresh rate.
# 137296 frames in 5.0 seconds = 27459.053 FPS
# 141674 frames in 5.0 seconds = 28334.779 FPS
# 141490 frames in 5.0 seconds = 28297.875 FPS
```
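To check the `glxgears` result from a script instead of reading it by eye, the FPS lines can be parsed with a regular expression. This is a sketch only: the helper name `parse_fps` is hypothetical, and the sample text is copied from the output shown above.

```python
import re
from typing import List

# Matches lines such as "137296 frames in 5.0 seconds = 27459.053 FPS"
FPS_LINE = re.compile(r"(\d+) frames in ([\d.]+) seconds = ([\d.]+) FPS")


def parse_fps(output: str) -> List[float]:
    """Return every FPS value reported in glxgears output, in order."""
    return [float(match.group(3)) for match in FPS_LINE.finditer(output)]


sample = """\
137296 frames in 5.0 seconds = 27459.053 FPS
141674 frames in 5.0 seconds = 28334.779 FPS
"""
print(parse_fps(sample))  # [27459.053, 28334.779]
```

An empty list means no FPS line was found, which usually indicates that `glxgears` did not start rendering.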

--------------------------------

### Initialize ML-Agents Environment and Q-Network

Source: https://github.com/unity-technologies/ml-agents/blob/develop/colab/Colab_UnityEnvironment_2_Train.ipynb

Creates the GridWorld environment from the registry and instantiates a `VisualQNetwork` for training. `VisualQNetwork` is not imported here; it is assumed to be defined in an earlier cell of the notebook.

```python
from mlagents_envs.registry import default_registry
from mlagents_envs.environment import UnityEnvironment
import matplotlib.pyplot as plt
import os
%matplotlib inline

# Create the GridWorld Environment from the registry
env = default_registry["GridWorld"].make()
print("GridWorld environment created.")

num_actions = 5
```

```python
qnet = VisualQNetwork((3, 64, 84), 126, num_actions)
```

--------------------------------

### RollerAgent Initialization and Reset Logic

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Learning-Environment-Create-New.md

This C# script demonstrates the initialization and reset logic for a RollerAgent. It includes getting a Rigidbody component, resetting agent velocity and position if it falls, and moving the target to a new random location at the start of each episode.

```csharp
using System.Collections.Generic;
using UnityEngine;
using Unity.MLAgents;
using Unity.MLAgents.Sensors;

public class RollerAgent : Agent
{
    Rigidbody rBody;
    void Start () {
        rBody = GetComponent<Rigidbody>();
    }

    public Transform Target;
    public override void OnEpisodeBegin()
    {
        // If the Agent fell, zero its momentum
        if (this.transform.localPosition.y < 0)
        {
            this.rBody.angularVelocity = Vector3.zero;
            this.rBody.velocity = Vector3.zero;
            this.transform.localPosition = new Vector3( 0, 0.5f, 0);
        }

        // Move the target to a new spot
        Target.localPosition = new Vector3(Random.value * 8 - 4,
                                           0.5f,
                                           Random.value * 8 - 4);
    }
}
```

--------------------------------

### Install ML-Agents Package

Source: https://github.com/unity-technologies/ml-agents/blob/develop/colab/Colab_UnityEnvironment_1_Run.ipynb

Installs the ML-Agents Python package if it's not already installed. This command uses pip for installation.

```python
try:
  import mlagents
  print("ml-agents already installed")
except ImportError:
  !python -m pip install -q mlagents==1.1.0
  print("Installed ml-agents")
```

--------------------------------

### Upgrade Pip and Setuptools (macOS)

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Using-Virtual-Environment.md

Once the virtual environment is activated, upgrade pip and setuptools to their latest versions using pip3 install --upgrade.

```bash
$ pip3 install --upgrade pip
$ pip3 install --upgrade setuptools
```

--------------------------------

### Launch ML-Agents Training with Curriculum

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Training-ML-Agents.md

Use this command to start training agents with curriculum learning. Specify the path to your curriculum configuration file and a unique run ID. Training can be resumed from where it left off using the `--resume` flag.

```sh
mlagents-learn config/ppo/WallJump_curriculum.yaml --run-id=wall-jump-curriculum
```

--------------------------------

### Initialize UnityEnvironment

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Python-LLAPI-Documentation.md

Starts a new Unity environment and establishes a connection. Note: Communication is unauthenticated; ensure a secure network. Supports various configuration options like no-graphics mode and side channels.

```python
__init__(
    file_name: Optional[str] = None,
    worker_id: int = 0,
    base_port: Optional[int] = None,
    seed: int = 0,
    no_graphics: bool = False,
    no_graphics_monitor: bool = False,
    timeout_wait: int = 60,
    additional_args: Optional[List[str]] = None,
    side_channels: Optional[List[SideChannel]] = None,
    log_folder: Optional[str] = None,
    num_areas: int = 1,
)
```

--------------------------------

### Start Unity Environment from Registry

Source: https://github.com/unity-technologies/ml-agents/blob/develop/colab/Colab_UnityEnvironment_4_SB3VectorEnv.ipynb

Initializes a Unity environment using the `make_mla_sb3_env` factory function. It configures the environment using a `LimitedConfig` object, specifying the environment name, base port, seed, number of parallel environments, and whether to allow multiple observations.

```python
# -----------------
# This code is used to close an env that might not have been closed before
try:
  env.close()
except Exception:  # e.g. NameError if env was never created
  pass
# -----------------

env = make_mla_sb3_env(
    config=LimitedConfig(
        env_path_or_name='Basic',  # Can use any name from a registry or a path to your own unity build.
        base_port=6006,
        base_seed=42,
        num_env=NUM_ENVS,
        allow_multiple_obs=True,
    ),
    no_graphics=True,  # Set to false if you are running locally and want to watch the environments move around as they train.
)
```

--------------------------------

### Start ML Agents Training with Multiple Environments

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Learning-Environment-Create-New.md

Use the `--num-envs` command-line option to specify the number of concurrent Unity instances to run during training. Running several instances lets the trainer gather experience in parallel, which can shorten training time.

```bash
mlagents-learn config/rollerball_config.yaml --run-id=RollerBall --num-envs=2
```
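The `mlagents-learn` invocations in this guide differ only in a few options (`--run-id`, `--env`, `--num-envs`, `--resume`). When training runs are launched from a script, the argument list can be assembled programmatically. The helper below is a sketch with a hypothetical name, `learn_command`; it only builds the command and does not start training.

```python
import shlex
from typing import List, Optional


def learn_command(config: str, run_id: str, env: Optional[str] = None,
                  num_envs: int = 1, resume: bool = False) -> List[str]:
    """Build the argument list for an mlagents-learn run."""
    args = ["mlagents-learn", config, f"--run-id={run_id}"]
    if env is not None:
        args.append(f"--env={env}")            # path to a built executable
    if num_envs > 1:
        args.append(f"--num-envs={num_envs}")  # parallel Unity instances
    if resume:
        args.append("--resume")                # continue an earlier run
    return args


cmd = learn_command("config/rollerball_config.yaml", "RollerBall", num_envs=2)
print(shlex.join(cmd))
# mlagents-learn config/rollerball_config.yaml --run-id=RollerBall --num-envs=2
```

The list can be passed to `subprocess.run(cmd, check=True)`, which avoids shell quoting problems in run IDs or paths.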

--------------------------------

### Install Nvidia Driver on Ubuntu

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Training-on-Microsoft-Azure.md

Installs the necessary NVIDIA driver for GPU support on Ubuntu 16.04 LTS. Requires a reboot after installation.

```bash
wget http://us.download.nvidia.com/tesla/375.66/nvidia-diag-driver-local-repo-ubuntu1604_375.66-1_amd64.deb

sudo dpkg -i nvidia-diag-driver-local-repo-ubuntu1604_375.66-1_amd64.deb

sudo apt-get update

sudo apt-get install cuda-drivers

sudo reboot
```

--------------------------------

### Local Plugin Installation

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Training-Plugins.md

Install your custom plugin locally using pip in editable mode. Run this command in the same Python virtual environment where ML-Agents is installed.

```bash
pip install -e [path to your plugin code]
```

--------------------------------

### Start TensorBoard for ML-Agents

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Using-Tensorboard.md

Run this command in your terminal to launch TensorBoard and view training statistics. Ensure you are in the ML-Agents Toolkit directory. The 'results' folder contains the log files, and '--port 6006' specifies the network port.

```bash
tensorboard --logdir results --port 6006
```

--------------------------------

### CheckpointSettings.prioritize_resume_init

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Python-On-Off-Policy-Trainer-Documentation.md

Prioritizes explicit command-line resume/init options over conflicting YAML options. If both resume and init are set in the same place, resume is used.

```APIDOC
## CheckpointSettings.prioritize_resume_init

### Description
Prioritizes explicit command-line resume/init options over conflicting YAML options. If both resume and init are set in the same place, resume is used.

### Method
prioritize_resume_init
```
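The precedence rule described above can be illustrated with a small standalone sketch. This is not the ML-Agents implementation: the `CheckpointOptions` class and the `prioritize` function are hypothetical and only model the two stated rules (resume beats init within one source, command line beats YAML).

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class CheckpointOptions:
    resume: bool = False
    initialize_from: Optional[str] = None


def _resume_wins(opts: CheckpointOptions) -> CheckpointOptions:
    """If one source sets both resume and init, keep resume and drop init."""
    if opts.resume and opts.initialize_from is not None:
        return replace(opts, initialize_from=None)
    return opts


def prioritize(cli: CheckpointOptions, yaml_opts: CheckpointOptions) -> CheckpointOptions:
    """Prefer explicit command-line options; fall back to the YAML options."""
    cli, yaml_opts = _resume_wins(cli), _resume_wins(yaml_opts)
    if cli.resume or cli.initialize_from is not None:
        return cli
    return yaml_opts


print(prioritize(CheckpointOptions(resume=True),
                 CheckpointOptions(initialize_from="run1")))
```

In this model a command-line `--resume` discards an `initialize_from` value that came from the YAML file, and the YAML options apply only when the command line sets neither.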

--------------------------------

### Start Unity Environment from Registry

Source: https://github.com/unity-technologies/ml-agents/blob/develop/colab/Colab_UnityEnvironment_1_Run.ipynb

Initializes and makes a Unity Environment instance from the default registry using the selected `env_id`. It includes a mechanism to close any previously opened environment to prevent conflicts.

```python
# -----------------
# This code is used to close an env that might not have been closed before
try:
  env.close()
except Exception:  # e.g. NameError if env was never created
  pass
# -----------------

from mlagents_envs.registry import default_registry

env = default_registry[env_id].make()
```

--------------------------------

### UnityAECEnv.__init__

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Python-PettingZoo-API-Documentation.md

Initializes the UnityAECEnv wrapper for PettingZoo environments.

````APIDOC
## __init__

```python
__init__(env: BaseEnv, seed: Optional[int] = None)
```

### Description
Initializes a Unity AEC environment wrapper.

### Arguments
- `env`: The UnityEnvironment that is being wrapped.
- `seed`: The seed for the action spaces of the agents.
````

--------------------------------

### Start TensorBoard on Azure VM

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Training-on-Microsoft-Azure.md

Run TensorBoard to monitor training progress. Ensure the log directory is correctly set and the host is configured to allow external connections.

```bash
tensorboard --logdir results --host 0.0.0.0
```

--------------------------------

### Install PyTorch on Windows with CUDA Support

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Installation.md

On Windows, use pip to install a PyTorch 2.2.x release compatible with 2.2.1 (`~=2.2.1`) from the CUDA 12.1 wheel index. Install the Microsoft Visual C++ Redistributable if prompted.

```shell
pip3 install torch~=2.2.1 --index-url https://download.pytorch.org/whl/cu121
```
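The `~=` operator in the command above is the "compatible release" specifier: `~=2.2.1` accepts any `2.2.x` release at or above `2.2.1`, but not `2.3.0`. The sketch below illustrates the rule for plain numeric versions. The helper names are hypothetical, and local labels such as `+cu121` are simply stripped before comparing.

```python
from typing import Tuple


def _parse(version: str) -> Tuple[int, ...]:
    """Turn '2.2.1' (or '2.2.1+cu121') into (2, 2, 1)."""
    public = version.split("+")[0]  # drop a local version label if present
    return tuple(int(part) for part in public.split("."))


def is_compatible(candidate: str, base: str) -> bool:
    """Return True if `candidate` satisfies `~=base` for numeric versions."""
    cand, want = _parse(candidate), _parse(base)
    prefix = want[:-1]  # ~=2.2.1 pins the leading '2.2'
    return cand[:len(prefix)] == prefix and cand >= want


print(is_compatible("2.2.2", "2.2.1"))  # True
print(is_compatible("2.3.0", "2.2.1"))  # False
```

For real dependency resolution, pip applies the full versioning rules (pre-releases, post-releases, epochs), which this sketch does not cover.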

--------------------------------

### Install mlagents_envs Package

Source: https://github.com/unity-technologies/ml-agents/blob/develop/ml-agents-envs/README.md

Install the mlagents_envs package with a specific version using pip.

```sh
python -m pip install mlagents_envs==1.1.0
```

--------------------------------

### Install Additional Dependencies

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Training-on-Microsoft-Azure.md

Installs Pillow and NumPy, which are common dependencies for ML-Agents projects.

```bash
pip3 install pillow
pip3 install numpy
```

--------------------------------

### Create and Wrap Unity Environment for Gym

Source: https://context7.com/unity-technologies/ml-agents/llms.txt

Demonstrates creating a UnityEnvironment and wrapping it with UnityToGymWrapper for use with the Gym interface. This setup is suitable for single-agent environments.

```python
from mlagents_envs.environment import UnityEnvironment
from mlagents_envs.envs.unity_gym_env import UnityToGymWrapper

unity_env = UnityEnvironment(file_name="builds/3DBall", no_graphics=True, seed=1)
env = UnityToGymWrapper(
    unity_env,
    uint8_visual=False,       # float32 visual obs in [0,1]
    flatten_branched=False,   # keep MultiDiscrete for branched actions
    allow_multiple_obs=False, # return first obs only (or list if True)
    action_space_seed=42,
)

print(f"Observation space: {env.observation_space}")
print(f"Action space:      {env.action_space}")

obs = env.reset()
total_reward = 0.0
done = False

while not done:
    action = env.action_space.sample()
    obs, reward, done, info = env.step(action)
    total_reward += reward

print(f"Episode reward: {total_reward:.2f}")
env.close()
```

--------------------------------

### UnityEnvironment Initialization

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Python-LLAPI-Documentation.md

Initializes a Unity environment and establishes a connection. Supports various configurations for graphics, ports, and command-line arguments.

```APIDOC
## __init__

### Description
Starts a new Unity environment and establishes a connection with it. Note: communication between Unity and Python currently takes place over an open socket without authentication. Ensure that the network where training takes place is secure.

### Parameters
#### Arguments
- **file_name** (Optional[str]) - Description: Name of Unity environment binary.
- **worker_id** (int) - Description: Offset from base_port. Used for training multiple environments simultaneously. Defaults to 0.
- **base_port** (Optional[int]) - Description: Baseline port number to connect to Unity environment over. worker_id increments over this. If no environment is specified (i.e. file_name is None), the DEFAULT_EDITOR_PORT will be used.
- **seed** (int) - Description: Seed for the environment. Defaults to 0.
- **no_graphics** (bool) - Description: Whether to run the Unity simulator in no-graphics mode. Defaults to False.
- **no_graphics_monitor** (bool) - Description: Whether to run the main worker in graphics mode, with the remaining in no-graphics mode. Defaults to False.
- **timeout_wait** (int) - Description: Time (in seconds) to wait for connection from environment. Defaults to 60.
- **additional_args** (Optional[List[str]]) - Description: Additional Unity command line arguments.
- **side_channels** (Optional[List[SideChannel]]) - Description: Additional side channels for non-RL communication with Unity.
- **log_folder** (Optional[str]) - Description: Optional folder to write the Unity Player log file into. Requires absolute path.
- **num_areas** (int) - Description: Number of areas in the environment. Defaults to 1.
```
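The interaction between `file_name`, `base_port`, and `worker_id` described above can be summarized in a few lines. This is a sketch of the documented behavior, not the library code. The constant values are assumptions taken from elsewhere in this guide (5005 as the base port for standalone builds, 5004 as the port the trainer listens on for the Editor), and `resolve_port` is a hypothetical name.

```python
from typing import Optional

BASE_ENVIRONMENT_PORT = 5005  # assumed default for standalone builds
DEFAULT_EDITOR_PORT = 5004    # assumed default when attaching to the Editor


def resolve_port(file_name: Optional[str], worker_id: int = 0,
                 base_port: Optional[int] = None) -> int:
    """Return the port used for a worker: base_port offset by worker_id."""
    if base_port is None:
        # No executable given means the connection targets the Editor.
        base_port = DEFAULT_EDITOR_PORT if file_name is None else BASE_ENVIRONMENT_PORT
    return base_port + worker_id


print(resolve_port("builds/3DBall", worker_id=2))  # 5007
print(resolve_port(None))                          # 5004
```

Giving each parallel environment a distinct `worker_id` therefore keeps their ports from colliding.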

--------------------------------

### Observe Training Progress with TensorBoard

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Sample.md

Launch TensorBoard to visualize training statistics. Point it to the `results` directory where training logs are stored. Access the dashboard via `localhost:6006`.

```bash
tensorboard --logdir results
```

--------------------------------

### Install python3-distutils (Ubuntu)

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Using-Virtual-Environment.md

On Ubuntu, if you encounter a ModuleNotFoundError for distutils.util, install the python3-distutils package using apt-get.

```bash
sudo apt-get install python3-distutils
```

--------------------------------

### Load and Initialize Unity Environment

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Python-LLAPI.md

Use UnityEnvironment to load a built Unity environment binary and start interacting with it. Set the file_name to None to interact directly with the Editor.

```python
from mlagents_envs.environment import UnityEnvironment
# This is a non-blocking call that only loads the environment.
env = UnityEnvironment(file_name="3DBall", seed=1, side_channels=[])
# Start interacting with the environment.
env.reset()
behavior_names = env.behavior_specs.keys()
...
```
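
One way the elided part of the snippet above might continue is a short loop that sends random actions. This is a sketch under assumptions: `run_random_steps` is a hypothetical helper, it takes the first behavior it finds, and it relies on a recent `mlagents_envs` release in which `action_spec.random_action` is available. It needs a running Editor or a built binary to do anything.

```python
def run_random_steps(file_name=None, num_steps=10):
    """Step an environment with random actions and return the step count."""
    # Imported lazily so this module loads even without mlagents-envs installed.
    from mlagents_envs.environment import UnityEnvironment

    env = UnityEnvironment(file_name=file_name, seed=1, side_channels=[])
    try:
        env.reset()
        # Pick the first behavior exposed by the environment.
        behavior_name = list(env.behavior_specs)[0]
        spec = env.behavior_specs[behavior_name]
        for _ in range(num_steps):
            # Agents that need a decision this step, and agents that just finished.
            decision_steps, terminal_steps = env.get_steps(behavior_name)
            action = spec.action_spec.random_action(len(decision_steps))
            env.set_actions(behavior_name, action)
            env.step()
        return num_steps
    finally:
        # Always release the socket and the Unity process.
        env.close()
```

Calling `run_random_steps("3DBall")` would use a built binary, while `run_random_steps()` waits for the Editor to enter Play mode.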

--------------------------------

### Install ML-Agents Packages

Source: https://context7.com/unity-technologies/ml-agents/llms.txt

Installs the necessary mlagents and mlagents-envs Python packages. Ensure you use compatible versions.

```bash
pip install mlagents==1.1.0 mlagents-envs==1.1.0
```

--------------------------------

### Install TensorFlow CPU Version

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Training-on-Microsoft-Azure.md

Installs TensorFlow version 1.4.0 and Keras version 2.0.6, for CPU-only training.

```bash
pip3 install tensorflow==1.4.0 keras==2.0.6
```

--------------------------------

### Launch Unity Environment from Registry

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Python-LLAPI-Documentation.md

Launches a Unity environment from a registry entry using the `make` method.

```python
registry = UnityEnvRegistry()
env = registry[<environment_identifier>].make()
```

--------------------------------

### Example Training Output

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Sample.md

This console output shows the typical information displayed during an ML-Agents training session, including hyperparameters and step-by-step reward progression.

```console
INFO:mlagents_envs:
'Ball3DAcademy' started successfully!
Unity Academy name: Ball3DAcademy

INFO:mlagents_envs:Connected new brain:
Unity brain name: 3DBallLearning
        Number of Visual Observations (per agent): 0
        Vector Observation space size (per agent): 8
        Number of stacked Vector Observation: 1
INFO:mlagents_envs:Hyperparameters for the PPO Trainer of brain 3DBallLearning:
        batch_size:          64
        beta:                0.001
        buffer_size:         12000
        epsilon:             0.2
        gamma:               0.995
        hidden_units:        128
        lambd:               0.99
        learning_rate:       0.0003
        max_steps:           5.0e4
        normalize:           True
        num_epoch:           3
        num_layers:          2
        time_horizon:        1000
        sequence_length:     64
        summary_freq:        1000
        use_recurrent:       False
        memory_size:         256
        use_curiosity:       False
        curiosity_strength:  0.01
        curiosity_enc_size:  128
        output_path: ./results/first3DBallRun/3DBallLearning
INFO:mlagents.trainers: first3DBallRun: 3DBallLearning: Step: 1000. Mean Reward: 1.242. Std of Reward: 0.746. Training.
INFO:mlagents.trainers: first3DBallRun: 3DBallLearning: Step: 2000. Mean Reward: 1.319. Std of Reward: 0.693. Training.
INFO:mlagents.trainers: first3DBallRun: 3DBallLearning: Step: 3000. Mean Reward: 1.804. Std of Reward: 1.056. Training.
INFO:mlagents.trainers: first3DBallRun: 3DBallLearning: Step: 4000. Mean Reward: 2.151. Std of Reward: 1.432. Training.
INFO:mlagents.trainers: first3DBallRun: 3DBallLearning: Step: 5000. Mean Reward: 3.175. Std of Reward: 2.250. Training.
INFO:mlagents.trainers: first3DBallRun: 3DBallLearning: Step: 6000. Mean Reward: 4.898. Std of Reward: 4.019. Training.
INFO:mlagents.trainers: first3DBallRun: 3DBallLearning: Step: 7000. Mean Reward: 6.716. Std of Reward: 5.125. Training.
INFO:mlagents.trainers: first3DBallRun: 3DBallLearning: Step: 8000. Mean Reward: 12.124. Std of Reward: 11.929. Training.
INFO:mlagents.trainers: first3DBallRun: 3DBallLearning: Step: 9000. Mean Reward: 18.151. Std of Reward: 16.871. Training.
INFO:mlagents.trainers: first3DBallRun: 3DBallLearning: Step: 10000. Mean Reward: 27.284. Std of Reward: 28.667. Training.
```
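
If you want the reward curve as data rather than text, the summary lines above are regular enough to parse. The sketch below assumes the exact line format shown in this sample output; other ML-Agents versions may log differently, and `parse_progress` is a hypothetical helper name.

```python
import re

# Matches the "Step / Mean Reward / Std of Reward" part of each summary line.
LINE_RE = re.compile(
    r"Step: (?P<step>\d+)\. "
    r"Mean Reward: (?P<mean>-?\d+(?:\.\d+)?)\. "
    r"Std of Reward: (?P<std>-?\d+(?:\.\d+)?)\."
)


def parse_progress(log_text):
    """Return (step, mean_reward, std_reward) tuples found in log_text."""
    return [
        (int(m["step"]), float(m["mean"]), float(m["std"]))
        for m in LINE_RE.finditer(log_text)
    ]


sample = (
    "INFO:mlagents.trainers: first3DBallRun: 3DBallLearning: Step: 1000. "
    "Mean Reward: 1.242. Std of Reward: 0.746. Training.\n"
    "INFO:mlagents.trainers: first3DBallRun: 3DBallLearning: Step: 10000. "
    "Mean Reward: 27.284. Std of Reward: 28.667. Training.\n"
)
for step, mean, std in parse_progress(sample):
    print(step, mean, std)
```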

--------------------------------

### Install ML-Agents Library

Source: https://github.com/unity-technologies/ml-agents/blob/develop/ml-agents-envs/colabs/Colab_PettingZoo.ipynb

Clones the ml-agents repository and installs the ml-agents-envs and ml-agents packages. This is required to use the ML-Agents environments.

```python
try:
  import mlagents
  print("ml-agents already installed")
except ImportError:
  !git clone -b main --single-branch https://github.com/Unity-Technologies/ml-agents.git
  !python -m pip install -q ./ml-agents/ml-agents-envs
  !python -m pip install -q ./ml-agents/ml-agents
  print("Installed ml-agents")
```

--------------------------------

### Install TensorFlow GPU Version

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Training-on-Microsoft-Azure.md

Installs TensorFlow version 1.4.0 and Keras version 2.0.6, specifically for GPU-accelerated training.

```bash
pip3 install tensorflow-gpu==1.4.0 keras==2.0.6
```

--------------------------------

### Run Training Command

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Learning-Environment-Executable.md

Use this command to start the training process. Specify the trainer configuration file, the environment executable name, and a unique run identifier.

```sh
mlagents-learn config/ppo/3DBall.yaml --env=3DBall --run-id=firstRun
```

--------------------------------

### Install Nvidia Driver on Ubuntu

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Training-on-Amazon-Web-Service.md

Download and install the latest Nvidia driver for Ubuntu. Ensure Nouveau is disabled to avoid conflicts.

```sh
# Download and install the latest Nvidia driver for ubuntu
# Please refer to http://download.nvidia.com/XFree86/Linux-x86_64/latest.txt
$ wget http://download.nvidia.com/XFree86/Linux-x86_64/390.87/NVIDIA-Linux-x86_64-390.87.run
$ sudo /bin/bash ./NVIDIA-Linux-x86_64-390.87.run --accept-license --no-questions --ui=none

# Disable Nouveau as it will clash with the Nvidia driver
$ echo 'blacklist nouveau' | sudo tee -a /etc/modprobe.d/blacklist.conf
$ echo 'options nouveau modeset=0' | sudo tee -a /etc/modprobe.d/blacklist.conf
$ echo 'options nouveau modeset=0' | sudo tee -a /etc/modprobe.d/nouveau-kms.conf
$ sudo update-initramfs -u
```

--------------------------------

### UnityParallelEnv.__init__

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Python-PettingZoo-API-Documentation.md

Initializes the UnityParallelEnv wrapper for PettingZoo environments.

```APIDOC
## __init__

### Signature
`__init__(env: BaseEnv, seed: Optional[int] = None)`

### Description
Initializes a Unity Parallel environment wrapper.

### Arguments
- `env`: The UnityEnvironment that is being wrapped.
- `seed`: The seed for the action spaces of the agents.
```

--------------------------------

### Configure GAIL + Behavioral Cloning

Source: https://context7.com/unity-technologies/ml-agents/llms.txt

Configuration for GAIL and Behavioral Cloning within the ML-Agents PPO trainer. Ensure demo_path points to your expert demonstration files.

```yaml
behaviors:
  Crawler:
    trainer_type: ppo
    hyperparameters:
      batch_size: 2024
      buffer_size: 20240
      learning_rate: 0.0003
      beta: 0.005
      epsilon: 0.2
      lambd: 0.95
      num_epoch: 3
      learning_rate_schedule: linear
    network_settings:
      normalize: true
      hidden_units: 512
      num_layers: 3
    reward_signals:
      gail:
        gamma: 0.99
        strength: 1.0
        demo_path: Project/Assets/ML-Agents/Examples/Crawler/Demos/ExpertCrawler.demo
        use_actions: false
        use_vail: false
        learning_rate: 0.0003
    max_steps: 10000000
    time_horizon: 1000
    summary_freq: 30000
    behavioral_cloning:
      demo_path: Project/Assets/ML-Agents/Examples/Crawler/Demos/ExpertCrawler.demo
      steps: 50000
      strength: 0.5
```

--------------------------------

### Create Virtual Environment (Windows)

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Using-Virtual-Environment.md

Create a directory for virtual environments and then create a new virtual environment named 'sample-env' using the python -m venv command.

```cmd
md python-envs
python -m venv python-envs\sample-env
```

--------------------------------

### RawBytesChannel Example

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Python-LLAPI-Documentation.md

An example implementation of a SideChannel for raw byte exchange between the environment and the Python API. It is intended for general research purposes.

```python
class RawBytesChannel(SideChannel):
    ...  # class body omitted in the source documentation
```

--------------------------------

### Install ML-Agents Python Packages (Editable Mode)

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Installation.md

Installs the ML-Agents Python packages in editable mode, allowing live changes to the Python files for immediate testing. This is recommended if you plan to modify the ML-Agents source code or contribute changes. Install PyTorch first, then the ML-Agents packages.

```sh
pip3 install torch -f https://download.pytorch.org/whl/torch_stable.html
pip3 install -e ./ml-agents-envs
pip3 install -e ./ml-agents
```

--------------------------------

### Initialize Unity Gym Environment

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Python-Gym-API.md

Instantiate a Unity environment wrapped for the Gym API. Configure visual observation format, action space flattening, and observation list handling.

```python
from mlagents_envs.envs.unity_gym_env import UnityToGymWrapper

env = UnityToGymWrapper(unity_env, uint8_visual, flatten_branched, allow_multiple_obs)
```

--------------------------------

### Run ML-Agents Docker Container

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Using-Docker.md

Use this command to start the ML-Agents Docker container. Ensure you replace placeholders like `<container-name>`, `<image-name>`, and `<trainer-config-file>` with your specific values. The `--mount` flag is crucial for persisting data and accessing configuration files.

```sh
docker run -it --name <container-name> \
           --mount type=bind,source="$(pwd)"/unity-volume,target=/unity-volume \
           -p 5005:5005 \
           -p 6006:6006 \
           <image-name>:latest \
           <trainer-config-file> \
           --env=<environment-name> \
           --train \
           --run-id=<run-id>
```

--------------------------------

### Trainer Configuration Error Message

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Tutorial-Custom-Trainer-Plugin.md

This error message appears if the specified trainer type is incorrect or not found in the installed packages. Double-check your YAML configuration and installation.

```shell
mlagents.trainers.exception.TrainerConfigError: Invalid trainer type a2c was found
```
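
When this error appears, it can help to see which trainer plugins Python can actually discover. The sketch below lists entry points with the standard-library `importlib.metadata`. It assumes ML-Agents registers trainer plugins under the entry-point group `mlagents.trainer_type`; verify that name against your installed version. The names printed are entry-point names, which may not match `trainer_type` strings one to one.

```python
from importlib import metadata

# Assumed entry-point group used by ML-Agents trainer plugins.
TRAINER_GROUP = "mlagents.trainer_type"


def registered_trainer_types(group=TRAINER_GROUP):
    """Return the sorted names of entry points registered under group."""
    eps = metadata.entry_points()
    if hasattr(eps, "select"):       # Python 3.10 and later
        selected = eps.select(group=group)
    else:                            # Python 3.8 / 3.9 return a plain dict
        selected = eps.get(group, [])
    return sorted(ep.name for ep in selected)


print(registered_trainer_types())
```

An empty list, or one without your plugin's entry, suggests the plugin package was not installed into the active environment.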

--------------------------------

### Install ML-Agents Packages

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Tutorial-Custom-Trainer-Plugin.md

Install the ML-Agents core packages and your custom trainer plugin using pip in editable mode. This ensures changes are reflected immediately.

```shell
pip3 install -e ./ml-agents-envs && pip3 install -e ./ml-agents
pip3 install -e ./ml-agents-trainer-plugin
```

--------------------------------

### Initialize Unity Environment with Executable

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Learning-Environment-Executable.md

Use this code to connect to a Unity environment executable using the Python API. Ensure the 'file_name' argument points to your built environment.

```python
from mlagents_envs.environment import UnityEnvironment
env = UnityEnvironment(file_name=<env_name>)
```

--------------------------------

### Clone ML-Agents Repo and Install Packages

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Training-on-Amazon-Web-Service.md

Clone the ML-Agents repository and install the required Python packages using pip. Ensure you are in the correct directory after cloning.

```sh
git clone --branch release_23 https://github.com/Unity-Technologies/ml-agents.git
cd ml-agents/ml-agents/
pip3 install -e .
```

--------------------------------

### View mlagents-learn CLI options

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Training-ML-Agents.md

Use this command to see all available command-line options for the `mlagents-learn` utility.

```sh
mlagents-learn --help
```

--------------------------------

### Install and Use Custom A2C Trainer

Source: https://context7.com/unity-technologies/ml-agents/llms.txt

Installs the custom trainer plugin in editable mode and demonstrates how to use the 'a2c' trainer type in a YAML configuration file.

```bash
# Install in editable mode so the entry point is live
pip install -e ./ml-agents-trainer-plugin

# Now use the custom trainer_type in YAML
# behaviors:
#   MyAgent:
#     trainer_type: a2c
#     hyperparameters:
#       batch_size: 256

mlagents-learn config/custom/my_agent.yaml --run-id=a2c_run

```

--------------------------------

### Install Custom Trainer Plugin

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Python-Custom-Trainer-Plugin.md

Install a custom ML-Agents trainer plugin using pip in editable mode. Replace '<./ml-agents-trainer-plugin>' with the actual path to your plugin folder.

```sh
pip3 install -e <./ml-agents-trainer-plugin>
```

--------------------------------

### Install grpcio for Dependency Resolution

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Installation.md

If a 'grpcio' wheel build error occurs, install a specific version (1.48.2) from conda-forge before reinstalling 'mlagents'. This resolves potential dependency conflicts.

```shell
conda install "grpcio=1.48.2" -c conda-forge
```

--------------------------------

### UnityToGymWrapper.__init__

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Python-Gym-API-Documentation.md

Initializes the UnityToGymWrapper, providing a Gym interface for Unity environments. It allows for customization of visual observation format, action space flattening, and handling of multiple observations.

```APIDOC
## __init__

### Description
Environment initialization.

### Parameters
* `unity_env` (BaseEnv) - Required - The Unity BaseEnv to be wrapped in the gym. Will be closed when the UnityToGymWrapper closes.
* `uint8_visual` (bool) - Optional - Return visual observations as uint8 (0-255) matrices instead of float (0.0-1.0).
* `flatten_branched` (bool) - Optional - If True, turn branched discrete action spaces into a Discrete space rather than MultiDiscrete.
* `allow_multiple_obs` (bool) - Optional - If True, return a list of np.ndarrays as observations with the first elements containing the visual observations and the last element containing the array of vector observations. If False, returns a single np.ndarray containing either only a single visual observation or the array of vector observations.
* `action_space_seed` (Optional[int]) - Optional - If non-None, will be used to set the random seed on created gym.Space instances.
```
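
To make the `flatten_branched` option concrete: turning a branched (MultiDiscrete) space into a single Discrete space amounts to enumerating every combination of branch choices. The sketch below shows that idea with the standard library only. `build_action_lookup` is a hypothetical name, and the wrapper's own ordering of combinations is not confirmed here.

```python
import itertools


def build_action_lookup(branch_sizes):
    """Map each flat action index to one choice per branch."""
    combos = itertools.product(*(range(n) for n in branch_sizes))
    return dict(enumerate(combos))


# Two branches with 2 and 3 choices give 2 * 3 flat actions.
lookup = build_action_lookup([2, 3])
print(len(lookup))
print(lookup[4])
```

The flat space grows as the product of the branch sizes, so flattening is practical only for a few small branches.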

--------------------------------

### Configure and Initialize PPO Agent

Source: https://github.com/unity-technologies/ml-agents/blob/develop/colab/Colab_UnityEnvironment_4_SB3VectorEnv.ipynb

Sets up hyperparameters and initializes a PPO agent for training. It configures the policy network architecture, learning rate, clip range, and batch size. The environment is wrapped with `VecMonitor` for statistics gathering.

```python
from stable_baselines3 import PPO
from stable_baselines3.common.vec_env import VecMonitor

# `env` and `NUM_ENVS` are assumed to be defined in earlier notebook cells.

# 250K steps should train to a reward ~= 0.90 for the "Basic" environment.
# We set the value lower here to demonstrate just a small amount of training.
BATCH_SIZE = 32
BUFFER_SIZE = 256
UPDATES = 50
TOTAL_TAINING_STEPS_GOAL = BUFFER_SIZE * UPDATES
BETA = 0.0005
N_EPOCHS = 3
STEPS_PER_UPDATE = BUFFER_SIZE / NUM_ENVS

# Helps gather stats for our eval() calls later so we can see reward stats.
env = VecMonitor(env)

# Policy and value function with 2 layers of 32 units each and no shared layers.
policy_kwargs = {"net_arch" : [{"pi": [32,32], "vf": [32,32]}]}

model = PPO(
    "MlpPolicy",
    env,
    verbose=1,
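    # SB3 passes the remaining progress (1.0 at the start, 0.0 at the end),
    # so multiplying by it gives a linear decay.
    learning_rate=lambda progress_remaining: 0.0003 * progress_remaining,
    clip_range=lambda progress_remaining: 0.2 * progress_remaining,
    clip_range_vf=lambda progress_remaining: 0.2 * progress_remaining,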
    # Uncomment this if you want to log tensorboard results when running this notebook locally.
    # tensorboard_log="results",
    policy_kwargs=policy_kwargs,
    n_steps=int(STEPS_PER_UPDATE),
    batch_size=BATCH_SIZE,
    n_epochs=N_EPOCHS,
    ent_coef=BETA,
)
```
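
The constants in the snippet above determine how much data and how many gradient steps each PPO update uses. The arithmetic is worked through below. `NUM_ENVS = 8` is a hypothetical value chosen only for illustration, since the notebook defines it in an earlier cell that is not shown here.

```python
BATCH_SIZE = 32
BUFFER_SIZE = 256
UPDATES = 50
N_EPOCHS = 3
NUM_ENVS = 8  # hypothetical; use the value from your own setup

# Total environment steps collected over the whole run.
total_steps = BUFFER_SIZE * UPDATES

# Steps each parallel environment contributes per update (PPO's n_steps).
n_steps = BUFFER_SIZE // NUM_ENVS

# Each epoch splits the rollout buffer into minibatches of BATCH_SIZE.
minibatches_per_epoch = (n_steps * NUM_ENVS) // BATCH_SIZE
gradient_steps_per_update = minibatches_per_epoch * N_EPOCHS

print(total_steps, n_steps, minibatches_per_epoch, gradient_steps_per_update)
```

Choosing `BUFFER_SIZE` as a multiple of both `NUM_ENVS` and `BATCH_SIZE` keeps these divisions exact.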