### Install and Setup Xorg for Virtual Screen

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Training-on-Amazon-Web-Service.md

Install Xorg and configure it to use a virtual screen for rendering Unity environments on a remote server. This is necessary for visual observations when headless mode is not used.

```sh
# Install Xorg
$ sudo apt-get update
$ sudo apt-get install -y xserver-xorg mesa-utils
$ sudo nvidia-xconfig -a --use-display-device=None --virtual=1280x1024

# Get the BusID information
$ nvidia-xconfig --query-gpu-info

# Add the BusID information to your /etc/X11/xorg.conf file
# (the whitespace, BoardName, and BusID in the pattern must match your own xorg.conf and GPU)
$ sudo sed -i 's/ BoardName "Tesla K80"/ BoardName "Tesla K80"\n BusID "0:30:0"/g' /etc/X11/xorg.conf

# Remove the Section "Files" block from the /etc/X11/xorg.conf file,
# including the two lines that contain Section "Files" and EndSection
$ sudo vim /etc/X11/xorg.conf
```

--------------------------------

### Install Grpc.Tools on Linux

Source: https://github.com/unity-technologies/ml-agents/blob/develop/protobuf-definitions/README.md

Installs the NuGet command-line client with apt-get on Linux. NuGet is the tool that installs the Grpc.Tools package (via `nuget install Grpc.Tools`).

```bash
sudo apt-get install nuget
```

--------------------------------

### Install Grpc.Tools on Windows

Source: https://github.com/unity-technologies/ml-agents/blob/develop/protobuf-definitions/README.md

Installs the Grpc.Tools NuGet package on Windows. Ensure nuget is in your PATH.

```bash
nuget install Grpc.Tools -Version 1.14.1 -OutputDirectory $MLAGENTS_ROOT\protobuf-definitions
```

--------------------------------

### Start X Server and Configure Display

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Training-on-Amazon-Web-Service.md

Start the X server and set the DISPLAY environment variable to use it for rendering.
```sh
# Start the X Server, press Enter to come back to the command line
$ sudo /usr/bin/X :0 &

# Check if Xorg process is running
# You will have a list of processes running on the GPU, Xorg should be in the list.
$ nvidia-smi

# Make Ubuntu use the X Server for display
$ export DISPLAY=:0
```

--------------------------------

### Install Pip

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Using-Virtual-Environment.md

Execute the downloaded get-pip.py script using python3 to install or upgrade pip. Ensure you have python3 installed.

```bash
python3 get-pip.py
```

--------------------------------

### Install protobuf

Source: https://github.com/unity-technologies/ml-agents/blob/develop/protobuf-definitions/README.md

Installs a specific version of the protobuf library. Use --force to ensure the version is applied.

```bash
pip install protobuf==3.19.6 --force
```

--------------------------------

### Training Output Example

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Learning-Environment-Executable.md

This is an example of the console output you can expect when the mlagents-learn command starts training. It includes the ML-Agents logo and initial connection messages.

```console
ml-agents$ mlagents-learn config/ppo/3DBall.yaml --env=3DBall --run-id=first-run

[ML-Agents ASCII-art logo]
```

--------------------------------

### Install Grpc.Tools on Mac

Source: https://github.com/unity-technologies/ml-agents/blob/develop/protobuf-definitions/README.md

Installs the NuGet command-line client using Homebrew on macOS. NuGet is the tool that installs the Grpc.Tools package.
```bash
brew install nuget
```

--------------------------------

### Successful Training Start Message

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Tutorial-Custom-Trainer-Plugin.md

This message indicates that your custom trainer package is installed correctly and ML-Agents is ready to begin training.

```text
[INFO] Listening on port 5004. Start training by pressing the Play button in the Unity Editor.
```

--------------------------------

### Install Custom Trainer Package

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Tutorial-Custom-Trainer-Plugin.md

Install your custom trainer package using pip. If it is pip-installable, run `pip3 install your_custom_package`. For local development, install it in editable mode with `pip3 install -e ./ml-agents-trainer-plugin`.

```shell
pip3 install your_custom_package
```

```shell
pip3 install -e ./ml-agents-trainer-plugin
```

--------------------------------

### Install mypy-protobuf

Source: https://github.com/unity-technologies/ml-agents/blob/develop/protobuf-definitions/README.md

Installs mypy-protobuf for type checking generated protobuf code.

```bash
pip install mypy-protobuf==1.16.0
```

--------------------------------

### Install grpcio-tools

Source: https://github.com/unity-technologies/ml-agents/blob/develop/protobuf-definitions/README.md

Installs the grpcio-tools package, which is necessary for generating gRPC code.

```bash
pip install grpcio-tools==1.28.1
```

--------------------------------

### Start ML Agents Training

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Sample.md

Use the `mlagents-learn` command to start training an agent. Specify the path to the configuration file and a unique run ID for the training session.
```bash
mlagents-learn config/ppo/3DBall.yaml --run-id=first3DBallRun
```

--------------------------------

### Install Python Pip

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Training-on-Microsoft-Azure.md

Installs pip for both Python 2 and Python 3, which are needed for installing ML-Agents dependencies.

```bash
sudo apt install python-pip
sudo apt install python3-pip
```

--------------------------------

### Install ML-Agents Packages

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Tutorial-Custom-Trainer-Plugin.md

Before installing your custom trainer, ensure you have the ML-Agents environment and core packages installed using pip.

```shell
pip3 install -e ./ml-agents-envs && pip3 install -e ./ml-agents
```

--------------------------------

### Install cuDNN and Set Library Path

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Training-on-Microsoft-Azure.md

Installs cuDNN on the VM and sets the LD_LIBRARY_PATH environment variable. A reboot is required after installation.

```bash
sudo dpkg -i libcudnn6_6.0.21-1+cuda8.0_amd64.deb
export LD_LIBRARY_PATH=/usr/local/cuda/lib64/:/usr/lib/x86_64-linux-gnu/:$LD_LIBRARY_PATH
. ~/.profile
sudo reboot
```

--------------------------------

### Run ML-Agents Docker Container for 3DBall Example

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Using-Docker.md

This is a concrete example of the `docker run` command for the 3DBall environment. It specifies a container name, mounts the local unity-volume, maps ports, uses a specific image name, and provides the necessary ML-Agents arguments for training.
```sh
docker run -it --name 3DBallContainer.first.trial \
           --mount type=bind,source="$(pwd)"/unity-volume,target=/unity-volume \
           -p 5005:5005 \
           -p 6006:6006 \
           balance.ball.v0.1:latest 3DBall \
           /unity-volume/trainer_config.yaml \
           --env=/unity-volume/3DBall \
           --train \
           --run-id=3dball_first_trial
```

--------------------------------

### Install python3-venv (Ubuntu)

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Using-Virtual-Environment.md

On Ubuntu, you need to install the python3-venv package before creating virtual environments.

```bash
$ sudo apt-get install python3-venv
```

--------------------------------

### Install ML-Agents Packages

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Training-on-Microsoft-Azure.md

Install PyTorch and ML-Agents using pip. Ensure you use the specified versions for compatibility.

```bash
pip3 install torch==1.7.0 -f https://download.pytorch.org/whl/torch_stable.html
```

```bash
python -m pip install mlagents==1.1.0
```

--------------------------------

### Install CUDA Toolkit on Ubuntu

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Training-on-Microsoft-Azure.md

Installs the CUDA toolkit version 8.0 on Ubuntu 16.04 LTS. This is a prerequisite for GPU-accelerated training.

```bash
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
sudo apt-get update
sudo apt-get install cuda-8-0
```

--------------------------------

### Run Training Command

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Learning-Environment-Create-New.md

Execute the mlagents-learn command to start training the agent, specifying the configuration file and a run ID for tracking.
```bash
mlagents-learn config/rollerball_config.yaml --run-id=RollerBall
```

--------------------------------

### Training with Parameter Randomization

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Training-ML-Agents.md

Launch the `mlagents-learn` command with your configuration file to start training with environment parameter randomization enabled.

```sh
mlagents-learn config/ppo/3DBall_randomize.yaml --run-id=3D-Ball-randomize
```

--------------------------------

### Verify Xorg Processes

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Training-on-Amazon-Web-Service.md

Ensure no Xorg processes are running before starting the X server. Use nvidia-smi to check GPU utilization.

```sh
# Kill any possible running Xorg processes
# Note that you might have to run this command multiple times depending on
# how Xorg is configured.
$ sudo killall Xorg

# Check if there is any Xorg process left
# You will have a list of processes running on the GPU, Xorg should not be in
# the list, as shown below.
$ nvidia-smi

# Thu Jun 14 20:21:11 2018
# +-----------------------------------------------------------------------------+
# | NVIDIA-SMI 390.67                 Driver Version: 390.67                    |
# |-------------------------------+----------------------+----------------------|
# | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
# | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
# |===============================+======================+======================|
# |   0  Tesla K80           On   | 00000000:00:1E.0 Off |                    0 |
# | N/A   37C    P8    31W / 149W |      0MiB / 11441MiB |      0%      Default |
# +-------------------------------+----------------------+----------------------+
#
# +-----------------------------------------------------------------------------+
# | Processes:                                                       GPU Memory |
# |  GPU       PID   Type   Process name                             Usage      |
# |=============================================================================|
# |  No running processes found                                                 |
# +-----------------------------------------------------------------------------+
```

--------------------------------

### UnityEnvironment Initialization and Basic Usage

Source: https://context7.com/unity-technologies/ml-agents/llms.txt

Demonstrates how to initialize UnityEnvironment, configure it using side channels, reset the environment, get behavior specs, and step through episodes with random actions.

```APIDOC
## UnityEnvironment

### Description
`UnityEnvironment` is the primary Python class for connecting to a Unity simulation. It supports launching standalone executables or attaching to the Unity Editor, and exposes a step-based API for multi-agent environments. All side channels are injected at construction time.

### Initialization Example
```python
from mlagents_envs.environment import UnityEnvironment
from mlagents_envs.side_channel.engine_configuration_channel import EngineConfigurationChannel
from mlagents_envs.side_channel.environment_parameters_channel import EnvironmentParametersChannel
import numpy as np

# Configure engine and environment parameters via side channels
engine_channel = EngineConfigurationChannel()
params_channel = EnvironmentParametersChannel()

# Connect to a standalone build; worker_id offsets the port (base 5005)
env = UnityEnvironment(
    file_name="builds/3DBall",
    worker_id=0,
    seed=42,
    no_graphics=True,
    timeout_wait=120,
    side_channels=[engine_channel, params_channel],
    num_areas=4,
)

# Speed up simulation (time_scale > 1 skips rendering waits)
engine_channel.set_configuration_parameters(time_scale=20.0, target_frame_rate=60)

# Randomize a Unity-side float parameter (read via Academy.EnvironmentParameters)
params_channel.set_float_parameter("goal_size", 5.0)

env.reset()
behavior_name = list(env.behavior_specs.keys())[0]
spec = env.behavior_specs[behavior_name]

print(f"Behavior: {behavior_name}")
print(f"Action spec: {spec.action_spec}")  # e.g. Continuous: 2, Discrete: ()
print(f"Observation specs: {spec.observation_specs}")

try:
    for episode in range(3):
        env.reset()
        for step in range(200):
            decision_steps, terminal_steps = env.get_steps(behavior_name)

            # Random actions for all agents requesting a decision
            n_agents = len(decision_steps)
            if n_agents > 0:
                action_tuple = spec.action_spec.random_action(n_agents)
                env.set_actions(behavior_name, action_tuple)

            env.step()

            # Per-agent episode endings
            for agent_id in terminal_steps:
                t = terminal_steps[agent_id]
                print(f"  Agent {agent_id} ended. Reward={t.reward:.3f}, interrupted={t.interrupted}")
finally:
    env.close()
```
```

--------------------------------

### Test Xorg Configuration with glxgears

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Training-on-Amazon-Web-Service.md

Verify Xorg is correctly configured by running glxgears. A high FPS indicates proper setup.

```sh
# For more information on glxgears, see ftp://www.x.org/pub/X11R6.8.1/doc/glxgears.1.html.
$ glxgears
# If Xorg is configured correctly, you should see the following message
# Running synchronized to the vertical refresh. The framerate should be
# approximately the same as the monitor refresh rate.
# 137296 frames in 5.0 seconds = 27459.053 FPS
# 141674 frames in 5.0 seconds = 28334.779 FPS
# 141490 frames in 5.0 seconds = 28297.875 FPS
```

--------------------------------

### Initialize ML-Agents Environment and Q-Network

Source: https://github.com/unity-technologies/ml-agents/blob/develop/colab/Colab_UnityEnvironment_2_Train.ipynb

Sets up the Unity environment from the registry and initializes a VisualQNetwork for training. Imports necessary libraries for environment interaction and deep learning.

```python
from mlagents_envs.registry import default_registry
from mlagents_envs.environment import UnityEnvironment
import matplotlib.pyplot as plt
import os
%matplotlib inline

# Create the GridWorld Environment from the registry
env = default_registry["GridWorld"].make()
print("GridWorld environment created.")

num_actions = 5
```

```python
qnet = VisualQNetwork((3, 64, 84), 126, num_actions)
```

--------------------------------

### RollerAgent Initialization and Reset Logic

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Learning-Environment-Create-New.md

This C# script demonstrates the initialization and reset logic for a RollerAgent.
It includes getting a Rigidbody component, resetting the agent's velocity and position if it falls, and moving the target to a new random location at the start of each episode.

```csharp
using System.Collections.Generic;
using UnityEngine;
using Unity.MLAgents;
using Unity.MLAgents.Sensors;

public class RollerAgent : Agent
{
    Rigidbody rBody;
    void Start () {
        rBody = GetComponent<Rigidbody>();
    }

    public Transform Target;
    public override void OnEpisodeBegin()
    {
        // If the Agent fell, zero its momentum
        if (this.transform.localPosition.y < 0)
        {
            this.rBody.angularVelocity = Vector3.zero;
            this.rBody.velocity = Vector3.zero;
            this.transform.localPosition = new Vector3( 0, 0.5f, 0);
        }

        // Move the target to a new spot
        Target.localPosition = new Vector3(Random.value * 8 - 4,
                                           0.5f,
                                           Random.value * 8 - 4);
    }
}
```

--------------------------------

### Install ML-Agents Package

Source: https://github.com/unity-technologies/ml-agents/blob/develop/colab/Colab_UnityEnvironment_1_Run.ipynb

Installs the ML-Agents Python package if it's not already installed. This command uses pip for installation.

```python
try:
    import mlagents
    print("ml-agents already installed")
except ImportError:
    !python -m pip install -q mlagents==1.1.0
    print("Installed ml-agents")
```

--------------------------------

### Upgrade Pip and Setuptools (macOS)

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Using-Virtual-Environment.md

Once the virtual environment is activated, upgrade pip and setuptools to their latest versions using pip3 install --upgrade.

```bash
$ pip3 install --upgrade pip
$ pip3 install --upgrade setuptools
```

--------------------------------

### Launch ML-Agents Training with Curriculum

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Training-ML-Agents.md

Use this command to start training agents with curriculum learning. Specify the path to your curriculum configuration file and a unique run ID.
Training can be resumed from where it left off using the `--resume` flag.

```sh
mlagents-learn config/ppo/WallJump_curriculum.yaml --run-id=wall-jump-curriculum
```

--------------------------------

### Initialize UnityEnvironment

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Python-LLAPI-Documentation.md

Starts a new Unity environment and establishes a connection. Note: communication is unauthenticated, so ensure the network is secure. Supports various configuration options such as no-graphics mode and side channels.

```python
 | __init__(file_name: Optional[str] = None, worker_id: int = 0, base_port: Optional[int] = None, seed: int = 0, no_graphics: bool = False, no_graphics_monitor: bool = False, timeout_wait: int = 60, additional_args: Optional[List[str]] = None, side_channels: Optional[List[SideChannel]] = None, log_folder: Optional[str] = None, num_areas: int = 1)
```

--------------------------------

### Start Unity Environment from Registry

Source: https://github.com/unity-technologies/ml-agents/blob/develop/colab/Colab_UnityEnvironment_4_SB3VectorEnv.ipynb

Initializes a Unity environment using the `make_mla_sb3_env` factory function. It configures the environment using a `LimitedConfig` object, specifying the environment name, base port, seed, number of parallel environments, and whether to allow multiple observations.

```python
# -----------------
# This code is used to close an env that might not have been closed before
try:
    env.close()
except Exception:
    pass
# -----------------

env = make_mla_sb3_env(
    config=LimitedConfig(
        env_path_or_name='Basic',  # Can use any name from a registry or a path to your own unity build.
        base_port=6006,
        base_seed=42,
        num_env=NUM_ENVS,
        allow_multiple_obs=True,
    ),
    no_graphics=True,  # Set to false if you are running locally and want to watch the environments move around as they train.
)
```

--------------------------------

### Start ML Agents Training with Multiple Environments

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Learning-Environment-Create-New.md

Use the `--num-envs` command-line option to specify the number of concurrent Unity instances to run in parallel during training. This speeds up training by gathering experience in parallel.

```bash
mlagents-learn config/rollerball_config.yaml --run-id=RollerBall --num-envs=2
```

--------------------------------

### Install Nvidia Driver on Ubuntu

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Training-on-Microsoft-Azure.md

Installs the necessary NVIDIA driver for GPU support on Ubuntu 16.04 LTS. Requires a reboot after installation.

```bash
wget http://us.download.nvidia.com/tesla/375.66/nvidia-diag-driver-local-repo-ubuntu1604_375.66-1_amd64.deb
sudo dpkg -i nvidia-diag-driver-local-repo-ubuntu1604_375.66-1_amd64.deb
sudo apt-get update
sudo apt-get install cuda-drivers
sudo reboot
```

--------------------------------

### Local Plugin Installation

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Training-Plugins.md

Install your custom plugin locally using pip in editable mode. This command should be run in the same Python virtual environment where ML-Agents is installed.

```bash
pip install -e [path to your plugin code]
```

--------------------------------

### Start TensorBoard for ML-Agents

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Using-Tensorboard.md

Run this command in your terminal to launch TensorBoard and view training statistics. Ensure you are in the ML-Agents Toolkit directory. The 'results' folder contains the log files, and '--port 6006' specifies the network port.
```bash
tensorboard --logdir results --port 6006
```

--------------------------------

### CheckpointSettings.prioritize_resume_init

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Python-On-Off-Policy-Trainer-Documentation.md

Prioritizes explicit command-line resume/init options over conflicting YAML options. If both resume and init are set in the same place, resume is used.

```APIDOC
## CheckpointSettings.prioritize_resume_init

### Description
Prioritizes explicit command-line resume/init options over conflicting YAML options. If both resume and init are set in the same place, resume is used.

### Method
prioritize_resume_init
```

--------------------------------

### Start Unity Environment from Registry

Source: https://github.com/unity-technologies/ml-agents/blob/develop/colab/Colab_UnityEnvironment_1_Run.ipynb

Creates a Unity environment instance from the default registry using the selected `env_id`. It first closes any previously opened environment to prevent conflicts.

```python
# -----------------
# This code is used to close an env that might not have been closed before
try:
    env.close()
except Exception:
    pass
# -----------------

from mlagents_envs.registry import default_registry

env = default_registry[env_id].make()
```

--------------------------------

### UnityAECEnv.__init__

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Python-PettingZoo-API-Documentation.md

Initializes the UnityAECEnv wrapper for PettingZoo environments.

```APIDOC
## __init__

```python
 | __init__(env: BaseEnv, seed: Optional[int] = None)
```

### Description
Initializes a Unity AEC environment wrapper.

### Arguments
- `env`: The UnityEnvironment that is being wrapped.
- `seed`: The seed for the action spaces of the agents.
```

--------------------------------

### Start TensorBoard on Azure VM

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Training-on-Microsoft-Azure.md

Run TensorBoard to monitor training progress. Ensure the log directory is correctly set and the host is configured to allow external connections.

```bash
tensorboard --logdir results --host 0.0.0.0
```

--------------------------------

### Install PyTorch on Windows with CUDA Support

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Installation.md

On Windows, install PyTorch version 2.2.1 with CUDA 12.1 support using pip. Ensure Microsoft Visual C++ Redistributable is installed if prompted.

```shell
pip3 install torch~=2.2.1 --index-url https://download.pytorch.org/whl/cu121
```

--------------------------------

### Initialize and Interact with UnityEnvironment

Source: https://context7.com/unity-technologies/ml-agents/llms.txt

Connect to a Unity standalone build, configure engine and environment parameters using side channels, and step through the environment with random actions.
```python
from mlagents_envs.environment import UnityEnvironment
from mlagents_envs.base_env import ActionTuple
from mlagents_envs.side_channel.engine_configuration_channel import EngineConfigurationChannel
from mlagents_envs.side_channel.environment_parameters_channel import EnvironmentParametersChannel
import numpy as np

# Configure engine and environment parameters via side channels
engine_channel = EngineConfigurationChannel()
params_channel = EnvironmentParametersChannel()

# Connect to a standalone build; worker_id offsets the port (base 5005)
env = UnityEnvironment(
    file_name="builds/3DBall",
    worker_id=0,
    seed=42,
    no_graphics=True,
    timeout_wait=120,
    side_channels=[engine_channel, params_channel],
    num_areas=4,
)

# Speed up simulation (time_scale > 1 skips rendering waits)
engine_channel.set_configuration_parameters(time_scale=20.0, target_frame_rate=60)

# Randomize a Unity-side float parameter (read via Academy.EnvironmentParameters)
params_channel.set_float_parameter("goal_size", 5.0)

env.reset()
behavior_name = list(env.behavior_specs.keys())[0]
spec = env.behavior_specs[behavior_name]

print(f"Behavior: {behavior_name}")
print(f"Action spec: {spec.action_spec}")  # e.g. Continuous: 2, Discrete: ()
print(f"Observation specs: {spec.observation_specs}")

try:
    for episode in range(3):
        env.reset()
        for step in range(200):
            decision_steps, terminal_steps = env.get_steps(behavior_name)

            # Random actions for all agents requesting a decision
            n_agents = len(decision_steps)
            if n_agents > 0:
                action_tuple = spec.action_spec.random_action(n_agents)
                env.set_actions(behavior_name, action_tuple)

            env.step()

            # Per-agent episode endings
            for agent_id in terminal_steps:
                t = terminal_steps[agent_id]
                print(f"  Agent {agent_id} ended. Reward={t.reward:.3f}, interrupted={t.interrupted}")
finally:
    env.close()
```

--------------------------------

### Install mlagents_envs Package

Source: https://github.com/unity-technologies/ml-agents/blob/develop/ml-agents-envs/README.md

Install the mlagents_envs package with a specific version using pip.

```sh
python -m pip install mlagents_envs==1.1.0
```

--------------------------------

### Install Additional Dependencies

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Training-on-Microsoft-Azure.md

Installs Pillow and NumPy, which are common dependencies for ML-Agents projects.

```bash
pip3 install pillow
pip3 install numpy
```

--------------------------------

### Create and Wrap Unity Environment for Gym

Source: https://context7.com/unity-technologies/ml-agents/llms.txt

Demonstrates creating a UnityEnvironment and wrapping it with UnityToGymWrapper for use with the Gym interface. This setup is suitable for single-agent environments.
```python
from mlagents_envs.environment import UnityEnvironment
from mlagents_envs.envs.unity_gym_env import UnityToGymWrapper

unity_env = UnityEnvironment(file_name="builds/3DBall", no_graphics=True, seed=1)
env = UnityToGymWrapper(
    unity_env,
    uint8_visual=False,        # float32 visual obs in [0,1]
    flatten_branched=False,    # keep MultiDiscrete for branched actions
    allow_multiple_obs=False,  # return first obs only (or list if True)
    action_space_seed=42,
)

print(f"Observation space: {env.observation_space}")
print(f"Action space: {env.action_space}")

obs = env.reset()
total_reward = 0.0
done = False
while not done:
    action = env.action_space.sample()
    obs, reward, done, info = env.step(action)
    total_reward += reward

print(f"Episode reward: {total_reward:.2f}")
env.close()
```

--------------------------------

### UnityEnvironment Initialization

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Python-LLAPI-Documentation.md

Initializes a Unity environment and establishes a connection. Supports various configurations for graphics, ports, and command-line arguments.

```APIDOC
## __init__

### Description
Starts a new Unity environment and establishes a connection with the environment. Notice: Currently communication between Unity and Python takes place over an open socket without authentication. Ensure that the network where training takes place is secure.

### Parameters
#### Constructor Parameters
- **file_name** (Optional[str]) - Description: Name of Unity environment binary.
- **worker_id** (int) - Description: Offset from base_port. Used for training multiple environments simultaneously. Defaults to 0.
- **base_port** (Optional[int]) - Description: Baseline port number to connect to Unity environment over. worker_id increments over this. If no environment is specified (i.e. file_name is None), the DEFAULT_EDITOR_PORT will be used.
- **seed** (int) - Description: Seed for the environment. Defaults to 0.
- **no_graphics** (bool) - Description: Whether to run the Unity simulator in no-graphics mode. Defaults to False.
- **no_graphics_monitor** (bool) - Description: Whether to run the main worker in graphics mode, with the remaining in no-graphics mode. Defaults to False.
- **timeout_wait** (int) - Description: Time (in seconds) to wait for connection from environment. Defaults to 60.
- **additional_args** (Optional[List[str]]) - Description: Additional Unity command line arguments.
- **side_channels** (Optional[List[SideChannel]]) - Description: Additional side channels for non-RL communication with Unity.
- **log_folder** (Optional[str]) - Description: Optional folder to write the Unity Player log file into. Requires absolute path.
- **num_areas** (int) - Description: Number of areas in the environment. Defaults to 1.
```

--------------------------------

### Observe Training Progress with TensorBoard

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Sample.md

Launch TensorBoard to visualize training statistics. Point it to the `results` directory where training logs are stored. Access the dashboard via `localhost:6006`.

```bash
tensorboard --logdir results
```

--------------------------------

### Install python3-distutils (Ubuntu)

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Using-Virtual-Environment.md

On Ubuntu, if you encounter a ModuleNotFoundError for distutils.util, install the python3-distutils package using apt-get.

```bash
sudo apt-get install python3-distutils
```

--------------------------------

### Load and Initialize Unity Environment

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Python-LLAPI.md

Use UnityEnvironment to load a built Unity environment binary and start interacting with it. Set the file_name to None to interact directly with the Editor.
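
When `file_name` is None the connection goes to the Editor, which, per the parameter notes, uses a default Editor port instead of the build port, and `worker_id` is added as an offset. The sketch below illustrates that selection under stated assumptions: `resolve_port` is a hypothetical helper (not part of `mlagents_envs`), and the two default values are taken from the port numbers quoted elsewhere in this document (5004 for the Editor, 5005 for standalone builds).

```python
from typing import Optional

# Assumed defaults, mirroring the ports quoted in this document.
DEFAULT_EDITOR_PORT = 5004      # used when no build is specified
BASE_ENVIRONMENT_PORT = 5005    # used for standalone builds


def resolve_port(file_name: Optional[str],
                 worker_id: int = 0,
                 base_port: Optional[int] = None) -> int:
    """Hypothetical helper: return the port a connection would use."""
    if worker_id < 0:
        raise ValueError("worker_id must be non-negative")
    if base_port is None:
        # No explicit base port: pick the default for Editor vs. build.
        base_port = BASE_ENVIRONMENT_PORT if file_name else DEFAULT_EDITOR_PORT
    # worker_id is an offset from the base port.
    return base_port + worker_id


print(resolve_port(None))                          # Editor, no offset
print(resolve_port("builds/3DBall", worker_id=2))  # third parallel build
```

Giving each concurrently running environment a distinct `worker_id` keeps their ports from colliding, which is why the parameter is described as an offset.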
```python from mlagents_envs.environment import UnityEnvironment # This is a non-blocking call that only loads the environment. env = UnityEnvironment(file_name="3DBall", seed=1, side_channels=[]) # Start interacting with the environment. env.reset() behavior_names = env.behavior_specs.keys() ... ``` -------------------------------- ### Install ML-Agents Packages Source: https://context7.com/unity-technologies/ml-agents/llms.txt Installs the necessary mlagents and mlagents-envs Python packages. Ensure you use compatible versions. ```bash pip install mlagents==1.1.0 mlagents-envs==1.1.0 ``` -------------------------------- ### Install TensorFlow CPU Version Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Training-on-Microsoft-Azure.md Installs TensorFlow version 1.4.0 and Keras version 2.0.6, for CPU-only training. ```bash pip3 install tensorflow==1.4.0 keras==2.0.6 ``` -------------------------------- ### Launch Unity Environment from Registry Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Python-LLAPI-Documentation.md Launches a Unity environment from a registry entry using the `make` method. ```python registry = UnityEnvRegistry() env = registry[].make() ``` -------------------------------- ### Example Training Output Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Sample.md This console output shows the typical information displayed during an ML-Agents training session, including hyperparameters and step-by-step reward progression. ```console INFO:mlagents_envs: 'Ball3DAcademy' started successfully! 
Unity Academy name: Ball3DAcademy

INFO:mlagents_envs:Connected new brain:
Unity brain name: 3DBallLearning
        Number of Visual Observations (per agent): 0
        Vector Observation space size (per agent): 8
        Number of stacked Vector Observation: 1
INFO:mlagents_envs:Hyperparameters for the PPO Trainer of brain 3DBallLearning:
        batch_size: 64
        beta: 0.001
        buffer_size: 12000
        epsilon: 0.2
        gamma: 0.995
        hidden_units: 128
        lambd: 0.99
        learning_rate: 0.0003
        max_steps: 5.0e4
        normalize: True
        num_epoch: 3
        num_layers: 2
        time_horizon: 1000
        sequence_length: 64
        summary_freq: 1000
        use_recurrent: False
        memory_size: 256
        use_curiosity: False
        curiosity_strength: 0.01
        curiosity_enc_size: 128
        output_path: ./results/first3DBallRun/3DBallLearning
INFO:mlagents.trainers: first3DBallRun: 3DBallLearning: Step: 1000. Mean Reward: 1.242. Std of Reward: 0.746. Training.
INFO:mlagents.trainers: first3DBallRun: 3DBallLearning: Step: 2000. Mean Reward: 1.319. Std of Reward: 0.693. Training.
INFO:mlagents.trainers: first3DBallRun: 3DBallLearning: Step: 3000. Mean Reward: 1.804. Std of Reward: 1.056. Training.
INFO:mlagents.trainers: first3DBallRun: 3DBallLearning: Step: 4000. Mean Reward: 2.151. Std of Reward: 1.432. Training.
INFO:mlagents.trainers: first3DBallRun: 3DBallLearning: Step: 5000. Mean Reward: 3.175. Std of Reward: 2.250. Training.
INFO:mlagents.trainers: first3DBallRun: 3DBallLearning: Step: 6000. Mean Reward: 4.898. Std of Reward: 4.019. Training.
INFO:mlagents.trainers: first3DBallRun: 3DBallLearning: Step: 7000. Mean Reward: 6.716. Std of Reward: 5.125. Training.
INFO:mlagents.trainers: first3DBallRun: 3DBallLearning: Step: 8000. Mean Reward: 12.124. Std of Reward: 11.929. Training.
INFO:mlagents.trainers: first3DBallRun: 3DBallLearning: Step: 9000. Mean Reward: 18.151. Std of Reward: 16.871. Training.
INFO:mlagents.trainers: first3DBallRun: 3DBallLearning: Step: 10000. Mean Reward: 27.284. Std of Reward: 28.667. Training.
```

--------------------------------

### Install ML-Agents Library

Source: https://github.com/unity-technologies/ml-agents/blob/develop/ml-agents-envs/colabs/Colab_PettingZoo.ipynb

Clones the ml-agents repository and installs the ml-agents-envs and ml-agents packages. This is required to use the ML-Agents environments.

```python
try:
  import mlagents
  print("ml-agents already installed")
except ImportError:
  !git clone -b main --single-branch https://github.com/Unity-Technologies/ml-agents.git
  !python -m pip install -q ./ml-agents/ml-agents-envs
  !python -m pip install -q ./ml-agents/ml-agents
  print("Installed ml-agents")
```

--------------------------------

### Install TensorFlow GPU Version

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Training-on-Microsoft-Azure.md

Installs TensorFlow version 1.4.0 and Keras version 2.0.6, specifically for GPU-accelerated training.

```bash
pip3 install tensorflow-gpu==1.4.0 keras==2.0.6
```

--------------------------------

### Run Training Command

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Learning-Environment-Executable.md

Use this command to start the training process. Specify the trainer configuration file, the environment executable name, and a unique run identifier.

```sh
mlagents-learn config/ppo/3DBall.yaml --env=3DBall --run-id=firstRun
```

--------------------------------

### Install Nvidia Driver on Ubuntu

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Training-on-Amazon-Web-Service.md

Download and install the latest Nvidia driver for Ubuntu. Ensure Nouveau is disabled to avoid conflicts.
```sh
# Download and install the latest Nvidia driver for Ubuntu
# Please refer to http://download.nvidia.com/XFree86/Linux-x86_64/latest.txt
$ wget http://download.nvidia.com/XFree86/Linux-x86_64/390.87/NVIDIA-Linux-x86_64-390.87.run
$ sudo /bin/bash ./NVIDIA-Linux-x86_64-390.87.run --accept-license --no-questions --ui=none

# Disable Nouveau as it will clash with the Nvidia driver
# (only `tee` needs root to append to the files, so `echo` runs unprivileged)
$ echo 'blacklist nouveau' | sudo tee -a /etc/modprobe.d/blacklist.conf
$ echo 'options nouveau modeset=0' | sudo tee -a /etc/modprobe.d/blacklist.conf
$ echo 'options nouveau modeset=0' | sudo tee -a /etc/modprobe.d/nouveau-kms.conf
$ sudo update-initramfs -u
```

--------------------------------

### UnityParallelEnv.__init__

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Python-PettingZoo-API-Documentation.md

Initializes the UnityParallelEnv wrapper for PettingZoo environments.

```APIDOC
## __init__

```python
__init__(env: BaseEnv, seed: Optional[int] = None)
```

### Description
Initializes a Unity Parallel environment wrapper.

### Arguments
- `env`: The UnityEnvironment that is being wrapped.
- `seed`: The seed for the action spaces of the agents.
```

--------------------------------

### Configure GAIL + Behavioral Cloning

Source: https://context7.com/unity-technologies/ml-agents/llms.txt

Configuration for GAIL and Behavioral Cloning within the ML-Agents PPO trainer. Ensure `demo_path` points to your expert demonstration files.
```yaml
behaviors:
  Crawler:
    trainer_type: ppo
    hyperparameters:
      batch_size: 2024
      buffer_size: 20240
      learning_rate: 0.0003
      beta: 0.005
      epsilon: 0.2
      lambd: 0.95
      num_epoch: 3
      learning_rate_schedule: linear
    network_settings:
      normalize: true
      hidden_units: 512
      num_layers: 3
    reward_signals:
      gail:
        gamma: 0.99
        strength: 1.0
        demo_path: Project/Assets/ML-Agents/Examples/Crawler/Demos/ExpertCrawler.demo
        use_actions: false
        use_vail: false
        learning_rate: 0.0003
    max_steps: 10000000
    time_horizon: 1000
    summary_freq: 30000
    behavioral_cloning:
      demo_path: Project/Assets/ML-Agents/Examples/Crawler/Demos/ExpertCrawler.demo
      steps: 50000
      strength: 0.5
```

--------------------------------

### Create Virtual Environment (Windows)

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Using-Virtual-Environment.md

Create a directory for virtual environments and then create a new virtual environment named 'sample-env' using the python -m venv command.

```cmd
md python-envs
python -m venv python-envs\sample-env
```

--------------------------------

### RawBytesChannel Example

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Python-LLAPI-Documentation.md

An example implementation of a SideChannel for raw byte exchange between the environment and the Python API. It is intended for general research purposes.

```python
class RawBytesChannel(SideChannel)
```

--------------------------------

### Install ML-Agents Python Packages (Editable Mode)

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Installation.md

Installs the ML-Agents Python packages in editable mode, allowing live changes to the Python files for immediate testing. This is recommended if you plan to modify the ML-Agents source code or contribute changes. Install PyTorch first, then the ML-Agents packages.
```sh
pip3 install torch -f https://download.pytorch.org/whl/torch_stable.html
pip3 install -e ./ml-agents-envs
pip3 install -e ./ml-agents
```

--------------------------------

### Initialize Unity Gym Environment

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Python-Gym-API.md

Instantiate a Unity environment wrapped for the Gym API. Configure visual observation format, action space flattening, and observation list handling.

```python
from mlagents_envs.envs.unity_gym_env import UnityToGymWrapper

env = UnityToGymWrapper(unity_env, uint8_visual, flatten_branched, allow_multiple_obs)
```

--------------------------------

### Run ML-Agents Docker Container

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Using-Docker.md

Use this command to start the ML-Agents Docker container. Ensure you replace placeholders like ``, ``, and `` with your specific values. The `--mount` flag is crucial for persisting data and accessing configuration files.

```sh
docker run -it --name \
           --mount type=bind,source="$(pwd)"/unity-volume,target=/unity-volume \
           -p 5005:5005 \
           -p 6006:6006 \
           :latest \
           \
           --env= \
           --train \
           --run-id=
```

--------------------------------

### Trainer Configuration Error Message

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Tutorial-Custom-Trainer-Plugin.md

This error message appears if the specified trainer type is incorrect or not found in the installed packages. Double-check your YAML configuration and installation.
```shell
mlagents.trainers.exception.TrainerConfigError: Invalid trainer type a2c was found
```

--------------------------------

### Install ML-Agents Packages

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Tutorial-Custom-Trainer-Plugin.md

Install the ML-Agents core packages and your custom trainer plugin using pip in editable mode. This ensures changes are reflected immediately.

```shell
pip3 install -e ./ml-agents-envs && pip3 install -e ./ml-agents
pip3 install -e ./ml-agents-trainer-plugin
```

--------------------------------

### Initialize Unity Environment with Executable

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Learning-Environment-Executable.md

Use this code to connect to a Unity environment executable using the Python API. Ensure the 'file_name' argument points to your built environment.

```python
from mlagents_envs.environment import UnityEnvironment
env = UnityEnvironment(file_name=)
```

--------------------------------

### Clone ML-Agents Repo and Install Packages

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Training-on-Amazon-Web-Service.md

Clone the ML-Agents repository and install the required Python packages using pip. Ensure you are in the correct directory after cloning.

```sh
git clone --branch release_23 https://github.com/Unity-Technologies/ml-agents.git
cd ml-agents/ml-agents/
pip3 install -e .
```

--------------------------------

### View mlagents-learn CLI options

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Training-ML-Agents.md

Use this command to see all available command-line options for the `mlagents-learn` utility.
```sh
mlagents-learn --help
```

--------------------------------

### Install and Use Custom A2C Trainer

Source: https://context7.com/unity-technologies/ml-agents/llms.txt

Installs the custom trainer plugin in editable mode and demonstrates how to use the 'a2c' trainer type in a YAML configuration file.

```bash
# Install in editable mode so the entry point is live
pip install -e ./ml-agents-trainer-plugin

# Now use the custom trainer_type in YAML
# behaviors:
#   MyAgent:
#     trainer_type: a2c
#     hyperparameters:
#       batch_size: 256

mlagents-learn config/custom/my_agent.yaml --run-id=a2c_run
```

--------------------------------

### Install Custom Trainer Plugin

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Python-Custom-Trainer-Plugin.md

Install a custom ML-Agents trainer plugin using pip in editable mode. Replace '<./ml-agents-trainer-plugin>' with the actual path to your plugin folder.

```sh
pip3 install -e <./ml-agents-trainer-plugin>
```

--------------------------------

### Install grpcio for Dependency Resolution

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Installation.md

If a 'grpcio' wheel build error occurs, install a specific version (1.48.2) from conda-forge before reinstalling 'mlagents'. This resolves potential dependency conflicts.

```shell
conda install "grpcio=1.48.2" -c conda-forge
```

--------------------------------

### UnityToGymWrapper.__init__

Source: https://github.com/unity-technologies/ml-agents/blob/develop/com.unity.ml-agents/Documentation~/Python-Gym-API-Documentation.md

Initializes the UnityToGymWrapper, providing a Gym interface for Unity environments. It allows for customization of visual observation format, action space flattening, and handling of multiple observations.

```APIDOC
## __init__

### Description
Environment initialization.
### Parameters
* `unity_env` (BaseEnv) - Required - The Unity BaseEnv to be wrapped in the gym. Will be closed when the UnityToGymWrapper closes.
* `uint8_visual` (bool) - Optional - Return visual observations as uint8 (0-255) matrices instead of float (0.0-1.0).
* `flatten_branched` (bool) - Optional - If True, turn branched discrete action spaces into a Discrete space rather than MultiDiscrete.
* `allow_multiple_obs` (bool) - Optional - If True, return a list of np.ndarrays as observations with the first elements containing the visual observations and the last element containing the array of vector observations. If False, returns a single np.ndarray containing either only a single visual observation or the array of vector observations.
* `action_space_seed` (Optional[int]) - Optional - If non-None, will be used to set the random seed on created gym.Space instances.
```

--------------------------------

### Configure and Initialize PPO Agent

Source: https://github.com/unity-technologies/ml-agents/blob/develop/colab/Colab_UnityEnvironment_4_SB3VectorEnv.ipynb

Sets up hyperparameters and initializes a PPO agent for training. It configures the policy network architecture, learning rate, clip range, and batch size. The environment is wrapped with `VecMonitor` for statistics gathering.

```python
# 250K should train to a reward ~= 0.90 for the "Basic" environment.
# We set the value lower here to demonstrate just a small amount of training.
BATCH_SIZE = 32
BUFFER_SIZE = 256
UPDATES = 50
TOTAL_TAINING_STEPS_GOAL = BUFFER_SIZE * UPDATES
BETA = 0.0005
N_EPOCHS = 3
STEPS_PER_UPDATE = BUFFER_SIZE / NUM_ENVS

# Helps gather stats for our eval() calls later so we can see reward stats.
env = VecMonitor(env)

# Policy and value function with 2 layers of 32 units each and no shared layers.
policy_kwargs = {"net_arch" : [{"pi": [32,32], "vf": [32,32]}]}

model = PPO(
    "MlpPolicy",
    env,
    verbose=1,
    learning_rate=lambda progress: 0.0003 * (1.0 - progress),
    clip_range=lambda progress: 0.2 * (1.0 - progress),
    clip_range_vf=lambda progress: 0.2 * (1.0 - progress),
    # Uncomment this if you want to log tensorboard results when running this notebook locally.
    # tensorboard_log="results",
    policy_kwargs=policy_kwargs,
    n_steps=int(STEPS_PER_UPDATE),
    batch_size=BATCH_SIZE,
    n_epochs=N_EPOCHS,
    ent_coef=BETA,
)
```
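
Stable-Baselines3 evaluates callables passed as `learning_rate`, `clip_range`, and `clip_range_vf` with the fraction of training that remains, which starts at 1.0 and falls to 0.0, so the direction of a schedule depends on how the callable uses its argument. The following is a minimal, library-free sketch of that convention and of the step arithmetic in the configuration above. The helper names `linear_decay`, `linear_ramp`, and `progress_remaining`, and the value `NUM_ENVS = 8`, are assumptions made for illustration and do not come from the notebook.

```python
from typing import Callable

# Values copied from the configuration above.
BUFFER_SIZE = 256
UPDATES = 50
# Assumption for illustration only: the notebook defines NUM_ENVS elsewhere.
NUM_ENVS = 8


def linear_decay(base: float) -> Callable[[float], float]:
    """Schedule that starts at `base` and reaches 0 at the end of training."""
    return lambda frac_left: base * frac_left


def linear_ramp(base: float) -> Callable[[float], float]:
    """Schedule of the form base * (1 - x): starts at 0 and rises to `base`."""
    return lambda frac_left: base * (1.0 - frac_left)


def progress_remaining(num_timesteps: int, total_timesteps: int) -> float:
    """Fraction of training left: 1.0 at the start, 0.0 at the end."""
    return 1.0 - num_timesteps / total_timesteps


total_steps = BUFFER_SIZE * UPDATES              # total environment steps
steps_per_update = int(BUFFER_SIZE / NUM_ENVS)   # rollout length per env (n_steps)

decay = linear_decay(0.0003)
ramp = linear_ramp(0.0003)

# Evaluate both schedules at the start, midpoint, and end of training.
for done in (0, total_steps // 2, total_steps):
    remaining = progress_remaining(done, total_steps)
    print(
        f"steps={done:5d} remaining={remaining:.2f} "
        f"decay={decay(remaining):.6f} ramp={ramp(remaining):.6f}"
    )
```

Under this convention, a callable written as `lambda progress: 0.0003 * (1.0 - progress)` behaves like `linear_ramp`: it starts near zero and grows toward the base value. If a decaying rate is intended, the `linear_decay` shape is the one to pass.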