======================== CODE SNIPPETS ========================

TITLE: Running the DeviceMesh 2D Setup with TorchRun
DESCRIPTION: Command line instruction to run the 2D parallel setup with DeviceMesh using TorchRun, launching 8 processes per node.
SOURCE: https://github.com/pytorch/tutorials/blob/main/recipes_source/distributed_device_mesh.rst#2025-04-22_snippet_3

LANGUAGE: shell
CODE:
```
torchrun --nproc_per_node=8 2d_setup_with_device_mesh.py
```

----------------------------------------

TITLE: Running the HSDP Setup with TorchRun
DESCRIPTION: Command line instruction to run the Hybrid Sharding Data Parallel (HSDP) setup with TorchRun, launching 8 processes per node.
SOURCE: https://github.com/pytorch/tutorials/blob/main/recipes_source/distributed_device_mesh.rst#2025-04-22_snippet_5

LANGUAGE: shell
CODE:
```
torchrun --nproc_per_node=8 hsdp.py
```

----------------------------------------

TITLE: Building Tutorial Documentation
DESCRIPTION: Command for building the HTML version of the tutorial website without executing code examples.
SOURCE: https://github.com/pytorch/tutorials/blob/main/CONTRIBUTING.md#2025-04-22_snippet_2

LANGUAGE: bash
CODE:
```
make html-noplot
```

----------------------------------------

TITLE: Installing NUMA Control Tools
DESCRIPTION: Commands to install numactl and taskset utilities on Ubuntu and CentOS
SOURCE: https://github.com/pytorch/tutorials/blob/main/recipes_source/xeon_run_cpu.rst#2025-04-22_snippet_1

LANGUAGE: console
CODE:
```
$ apt-get install numactl
```

LANGUAGE: console
CODE:
```
$ yum install numactl
```

LANGUAGE: console
CODE:
```
$ apt-get install util-linux
```

LANGUAGE: console
CODE:
```
$ yum install util-linux
```

----------------------------------------

TITLE: Installing Memory Allocators
DESCRIPTION: Commands to install TCMalloc and JeMalloc memory allocators on different platforms
SOURCE: https://github.com/pytorch/tutorials/blob/main/recipes_source/xeon_run_cpu.rst#2025-04-22_snippet_3

LANGUAGE: console
CODE:
```
$ apt-get install google-perftools
```

LANGUAGE: console
CODE:
```
$ yum install gperftools
```

LANGUAGE: console
CODE:
```
$ conda install conda-forge::gperftools
```

LANGUAGE: console
CODE:
```
$ apt-get install libjemalloc2
```

LANGUAGE: console
CODE:
```
$ yum install jemalloc
```

LANGUAGE: console
CODE:
```
$ conda install conda-forge::jemalloc
```

----------------------------------------

TITLE: Installing Intel OpenMP Runtime Library
DESCRIPTION: Commands to install Intel OpenMP Runtime Library using pip or conda
SOURCE: https://github.com/pytorch/tutorials/blob/main/recipes_source/xeon_run_cpu.rst#2025-04-22_snippet_2

LANGUAGE: console
CODE:
```
$ pip install intel-openmp
```

LANGUAGE: console
CODE:
```
$ conda install mkl
```

----------------------------------------

TITLE: Setting up 2D Parallel Pattern With DeviceMesh in PyTorch
DESCRIPTION: This code demonstrates how to use DeviceMesh to simplify the setup of a 2D parallel pattern. It shows how to initialize a device mesh and access the underlying process groups.
SOURCE: https://github.com/pytorch/tutorials/blob/main/recipes_source/distributed_device_mesh.rst#2025-04-22_snippet_2

LANGUAGE: python
CODE:
```
from torch.distributed.device_mesh import init_device_mesh
mesh_2d = init_device_mesh("cuda", (2, 4), mesh_dim_names=("replicate", "shard"))

# Users can access the underlying process group thru `get_group` API.
replicate_group = mesh_2d.get_group(mesh_dim="replicate")
shard_group = mesh_2d.get_group(mesh_dim="shard")
```

----------------------------------------

TITLE: Installing System and Python Prerequisites for WSI Analysis - Bash
DESCRIPTION: This snippet gives shell commands to install required system-level libraries (OpenJpeg, OpenSlide, Pixman) via apt-get for Linux, and the main Python dependencies (TIAToolbox <1.5, HistoEncoder) via pip. Successful installation is echoed in the terminal. Optional alternate homebrew instructions for macOS are mentioned in the surrounding text, but not directly included as a snippet. These commands are prerequisites for running subsequent code blocks that rely on TIAToolbox or handle .svs/.tif WSI data. No inputs/outputs other than terminal installation logs and confirmation message.
SOURCE: https://github.com/pytorch/tutorials/blob/main/intermediate_source/tiatoolbox_tutorial.rst#2025-04-22_snippet_4

LANGUAGE: bash
CODE:
```
apt-get -y -qq install libopenjp2-7-dev libopenjp2-tools openslide-tools libpixman-1-dev
pip install -q 'tiatoolbox<1.5' histoencoder && echo "Installation is done."
```

----------------------------------------

TITLE: Installing PyTorch Dependencies
DESCRIPTION: Commands for installing the nightly build of PyTorch and torchvision, including options for CPU and CUDA support.
SOURCE: https://github.com/pytorch/tutorials/blob/main/intermediate_source/quantized_transfer_learning_tutorial.rst#2025-04-22_snippet_0

LANGUAGE: shell
CODE:
```
pip install numpy
pip install --pre torch torchvision -f https://download.pytorch.org/whl/nightly/cpu/torch_nightly.html
```

----------------------------------------

TITLE: Shell Commands to Build the Example LibTorch C++ Application
DESCRIPTION: This shell script demonstrates the common commands to build the C++ example application using CMake. It requires that CMake and LibTorch are properly installed. /path/to/libtorch must be the full path to the LibTorch directory. The commands create a build directory, run cmake to configure with the LibTorch prefix, and build the application in Release mode.
SOURCE: https://github.com/pytorch/tutorials/blob/main/advanced_source/cpp_export.rst#2025-04-22_snippet_6

LANGUAGE: sh
CODE:
```
mkdir build
cd build
cmake -DCMAKE_PREFIX_PATH=/path/to/libtorch ..
cmake --build . --config Release
```

----------------------------------------

TITLE: Installing PyTorch Dependencies
DESCRIPTION: Command to install the latest version of PyTorch and related packages using pip.
SOURCE: https://github.com/pytorch/tutorials/blob/main/intermediate_source/FSDP_advanced_tutorial.rst#2025-04-22_snippet_0

LANGUAGE: bash
CODE:
```
pip3 install torch torchvision torchaudio
```

----------------------------------------

TITLE: Adding PyTorch Entry Point in torch_npu setup (Diff)
DESCRIPTION: Shows a diff for torch_npu's setup.py to add the entry point required for device backend autoloading, specifying torch_npu's _autoload function. This ensures the extension can be discovered and loaded automatically by PyTorch at runtime. Requires setup function from setuptools and the _autoload function defined in torch_npu. The snippet documents only the modifications relevant to extension registration.
SOURCE: https://github.com/pytorch/tutorials/blob/main/prototype_source/python_extension_autoload.rst#2025-04-22_snippet_6

LANGUAGE: diff
CODE:
```
  setup(
      name="torch_npu",
      version="2.5",
+     entry_points={
+         'torch.backends': [
+             'torch_npu = torch_npu:_autoload',
+         ],
+     }
  )
```

----------------------------------------

TITLE: Running the 2D Setup with TorchRun
DESCRIPTION: Command line instruction to run the 2D parallel setup using TorchRun (PyTorch Elastic). It launches 8 processes per node and specifies a rendezvous ID and endpoint.
SOURCE: https://github.com/pytorch/tutorials/blob/main/recipes_source/distributed_device_mesh.rst#2025-04-22_snippet_1

LANGUAGE: shell
CODE:
```
torchrun --nproc_per_node=8 --rdzv_id=100 --rdzv_endpoint=localhost:29400 2d_setup.py
```

----------------------------------------

TITLE: Building Single Tutorial with Gallery Pattern
DESCRIPTION: Commands demonstrating how to build a specific tutorial using the GALLERY_PATTERN environment variable.
SOURCE: https://github.com/pytorch/tutorials/blob/main/README.md#2025-04-22_snippet_1

LANGUAGE: bash
CODE:
```
GALLERY_PATTERN="neural_style_transfer_tutorial.py" make html
```

LANGUAGE: bash
CODE:
```
GALLERY_PATTERN="neural_style_transfer_tutorial.py" sphinx-build . _build
```

----------------------------------------

TITLE: Running PyTorch Inference with run_cpu Script
DESCRIPTION: Example commands for running PyTorch inference in different configurations using the run_cpu script
SOURCE: https://github.com/pytorch/tutorials/blob/main/recipes_source/xeon_run_cpu.rst#2025-04-22_snippet_4

LANGUAGE: console
CODE:
```
$ python -m torch.backends.xeon.run_cpu --ninstances 1 --ncores-per-instance 1 [program_args]
```

LANGUAGE: console
CODE:
```
$ python -m torch.backends.xeon.run_cpu --node-id 0 [program_args]
```

LANGUAGE: console
CODE:
```
$ python -m torch.backends.xeon.run_cpu --ninstances 8 --ncores-per-instance 14 [program_args]
```

LANGUAGE: console
CODE:
```
$ python -m torch.backends.xeon.run_cpu --throughput-mode [program_args]
```

LANGUAGE: console
CODE:
```
$ python -m torch.backends.xeon.run_cpu -h
usage: run_cpu.py [-h] [--multi-instance] [-m] [--no-python] [--enable-tcmalloc] [--enable-jemalloc] [--use-default-allocator] [--disable-iomp] [--ncores-per-instance] [--ninstances] [--skip-cross-node-cores] [--rank] [--latency-mode] [--throughput-mode] [--node-id] [--use-logical-core] [--disable-numactl] [--disable-taskset] [--core-list] [--log-path] [--log-file-prefix] [program_args]
```

----------------------------------------

TITLE: Specifying the Entry Point in setup.py for Autoloading (Python)
DESCRIPTION: Shows how to add an entry_points section to the setup() function in setup.py so that the package registers itself as a PyTorch backend. This informs PyTorch to call the specified function (here, _autoload) via the entry point mechanism. Requires setuptools and PyTorch installation. The 'torch.backends' entry ensures the specified module:function is discoverable by PyTorch's autoload machinery. The parameters include the package name, version, and entry_points dictionary.
SOURCE: https://github.com/pytorch/tutorials/blob/main/prototype_source/python_extension_autoload.rst#2025-04-22_snippet_1

LANGUAGE: python
CODE:
```
setup(
    name="torch_foo",
    version="1.0",
    entry_points={
        "torch.backends": [
            "torch_foo = torch_foo:_autoload",
        ],
    }
)
```

----------------------------------------

TITLE: Building and Running the PyTorch C++ DCGAN Example
DESCRIPTION: Contains shell commands demonstrating how to build the DCGAN C++ example using `make` and then run the resulting executable (`./dcgan`). The included sample output verifies that the program runs and successfully loads data batches from the MNIST dataset, printing batch sizes and labels as defined in the iteration loop. Requires `make`, a C++ toolchain, and a configured build system (likely CMake).
SOURCE: https://github.com/pytorch/tutorials/blob/main/advanced_source/cpp_frontend.rst#2025-04-22_snippet_22

LANGUAGE: shell
CODE:
```
root@fa350df05ecf:/home/build# make
Scanning dependencies of target dcgan
[ 50%] Building CXX object CMakeFiles/dcgan.dir/dcgan.cpp.o
[100%] Linking CXX executable dcgan
[100%] Built target dcgan
root@fa350df05ecf:/home/build# make
[100%] Built target dcgan
root@fa350df05ecf:/home/build# ./dcgan
Batch size: 64 | Labels: 5 2 6 7 2 1 6 7 0 1 6 2 3 6 9 1 8 4 0 6 5 3 3 0 4 6 6 6 4 0 8 6 0 6 9 2 4 0 2 8 6 3 3 2 9 2 0 1 4 2 3 4 8 2 9 9 3 5 8 0 0 7 9 9
Batch size: 64 | Labels: 2 2 4 7 1 2 8 8 6 9 0 2 2 9 3 6 1 3 8 0 4 4 8 8 8 9 2 6 4 7 1 5 0 9 7 5 4 3 5 4 1 2 8 0 7 1 9 6 1 6 5 3 4 4 1 2 3 2 3 5 0 1 6 2
Batch size: 64 | Labels: 4 5 4 2 1 4 8 3 8 3 6 1 5 4 3 6 2 2 5 1 3 1 5 0 8 2 1 5 3 2 4 4 5 9 7 2 8 9 2 0 6 7 4 3 8 3 5 8 8 3 0 5 8 0 8 7 8 5 5 6 1 7 8 0
Batch size: 64 | Labels: 3 3 7 1 4 1 6 1 0 3 6 4 0 2 5 4 0 4 2 8 1 9 6 5 1 6 3 2 8 9 2 3 8 7 4 5 9 6 0 8 3 0 0 6 4 8 2 5 4 1 8 3 7 8 0 0 8 9 6 7 2 1 4 7
Batch size: 64 | Labels: 3 0 5 5 9 8 3 9 8 9 5 9 5 0 4 1 2 7 7 2 0 0 5 4 8 7 7 6 1 0 7 9 3 0 6 3 2 6 2 7 6 3 3 4 0 5 8 8 9 1 9 2 1 9 4 4 9 2 4 6 2 9 4 0
Batch size: 64 | Labels: 9 6 7 5 3 5 9 0 8 6 6 7 8 2 1 9 8 8 1 1 8 2 0 7 1 4 1 6 7 5 1 7 7 4 0 3 2 9 0 6 6 3 4 4 8 1 2 8 6 9 2 0 3 1 2 8 5 6 4 8 5 8 6 2
Batch size: 64 | Labels: 9 3 0 3 6 5 1 8 6 0 1 9 9 1 6 1 7 7 4 4 4 7 8 8 6 7 8 2 6 0 4 6 8 2 5 3 9 8 4 0 9 9 3 7 0 5 8 2 4 5 6 2 8 2 5 3 7 1 9 1 8 2 2 7
Batch size: 64 | Labels: 9 1 9 2 7 2 6 0 8 6 8 7 7 4 8 6 1 1 6 8 5 7 9 1 3 2 0 5 1 7 3 1 6 1 0 8 6 0 8 1 0 5 4 9 3 8 5 8 4 8 0 1 2 6 2 4 2 7 7 3 7 4 5 3
Batch size: 64 | Labels: 8 8 3 1 8 6 4 2 9 5 8 0 2 8 6 6 7 0 9 8 3 8 7 1 6 6 2 7 7 4 5 5 2 1 7 9 5 4 9 1 0 3 1 9 3 9 8 8 5 3 7 5 3 6 8 9 4 2 0 1 2 5 4 7
Batch size: 64 | Labels: 9 2 7 0 8 4 4 2 7 5 0 0 6 2 0 5 9 5 9 8 8 9 3 5 7 5 4 7 3 0 5 7 6 5 7 1 6 2 8 7 6 3 2 6 5 6 1 2 7 7 0 0 5 9 0 0 9 1 7 8 3 2 9 4
Batch size: 64 | Labels: 7 6 5 7 7 5 2 2 4 9 9 4 8 7 4 8 9 4 5 7 1 2 6 9 8 5 1 2 3 6 7 8 1 1 3 9 8 7 9 5 0 8 5 1 8 7 2 6 5 1 2 0 9 7 4 0 9 0 4 6 0 0 8 6
...
```

----------------------------------------

TITLE: Installing Vulkan SDK on macOS (Shell)
DESCRIPTION: Shell commands to navigate to the Vulkan SDK root directory, source the environment setup script, and run the Python installation script for the Vulkan SDK on macOS. Requires the Vulkan SDK to be downloaded and unpacked, and the `VULKAN_SDK_ROOT` environment variable to be set.
SOURCE: https://github.com/pytorch/tutorials/blob/main/prototype_source/vulkan_workflow.rst#2025-04-22_snippet_0

LANGUAGE: shell
CODE:
```
cd $VULKAN_SDK_ROOT
source setup-env.sh
sudo python install_vulkan.py
```

----------------------------------------

TITLE: FSDP Main Training Setup
DESCRIPTION: Main function that sets up FSDP training environment, including data loading, model wrapping, and training initialization.
SOURCE: https://github.com/pytorch/tutorials/blob/main/intermediate_source/FSDP_tutorial.rst#2025-04-22_snippet_5

LANGUAGE: python
CODE:
```
def fsdp_main(rank, world_size, args):
    setup(rank, world_size)

    transform = transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize((0.1307,), (0.3081,))
    ])

    dataset1 = datasets.MNIST('../data', train=True, download=True,
                              transform=transform)
    dataset2 = datasets.MNIST('../data', train=False,
                              transform=transform)

    sampler1 = DistributedSampler(dataset1, rank=rank, num_replicas=world_size, shuffle=True)
    sampler2 = DistributedSampler(dataset2, rank=rank, num_replicas=world_size)

    train_kwargs = {'batch_size': args.batch_size, 'sampler': sampler1}
    test_kwargs = {'batch_size': args.test_batch_size, 'sampler': sampler2}
    cuda_kwargs = {'num_workers': 2,
                   'pin_memory': True,
                   'shuffle': False}
    train_kwargs.update(cuda_kwargs)
    test_kwargs.update(cuda_kwargs)

    train_loader = torch.utils.data.DataLoader(dataset1, **train_kwargs)
    test_loader = torch.utils.data.DataLoader(dataset2, **test_kwargs)
    my_auto_wrap_policy = functools.partial(
        size_based_auto_wrap_policy, min_num_params=100
    )
    torch.cuda.set_device(rank)

    init_start_event = torch.cuda.Event(enable_timing=True)
```

----------------------------------------

TITLE: Adding PyTorch Entry Point to setup() for habana_frameworks Extension (Diff)
DESCRIPTION: Presents a diff showing how to add an entry_points section to setup.py in the habana_frameworks package. This registers 'device_backend' as an entry point, mapped to the __autoload function, allowing PyTorch to autoload the Intel Gaudi HPU backend. The snippet is written in diff format to reflect the required changes. It requires the setup function from setuptools and that the entry point module and function are available. This modification is essential for making the extension compatible with the PyTorch autoload feature.
SOURCE: https://github.com/pytorch/tutorials/blob/main/prototype_source/python_extension_autoload.rst#2025-04-22_snippet_3

LANGUAGE: diff
CODE:
```
  setup(
      name="habana_frameworks",
      version="2.5",
+     entry_points={
+         'torch.backends': [
+             "device_backend = habana_frameworks:__autoload",
+         ],
+     }
  )
```

----------------------------------------

TITLE: Installing TorchVision (Shell/Pip)
DESCRIPTION: Command to install the `torchvision` library using pip. TorchVision provides access to popular datasets, model architectures, and common image transformations for computer vision, needed here to get a pretrained model.
SOURCE: https://github.com/pytorch/tutorials/blob/main/prototype_source/vulkan_workflow.rst#2025-04-22_snippet_6

LANGUAGE: shell
CODE:
```
pip install torchvision
```

----------------------------------------

TITLE: Python Setup Configuration
DESCRIPTION: Python setup script for building the C++ extension using PyTorch's cpp_extension module.
SOURCE: https://github.com/pytorch/tutorials/blob/main/intermediate_source/process_group_cpp_extension_tutorial.rst#2025-04-22_snippet_3

LANGUAGE: Python
CODE:
```
# file name: setup.py
import os
```

----------------------------------------

TITLE: Simplified Process Group Initialization with torchrun
DESCRIPTION: Demonstrates the simplified process group initialization using torchrun compared to manual setup.
SOURCE: https://github.com/pytorch/tutorials/blob/main/beginner_source/ddp_series_fault_tolerance.rst#2025-04-22_snippet_1

LANGUAGE: diff
CODE:
```
- def ddp_setup(rank, world_size):
+ def ddp_setup():
-     """
-     Args:
-         rank: Unique identifier of each process
-         world_size: Total number of processes
-     """
-     os.environ["MASTER_ADDR"] = "localhost"
-     os.environ["MASTER_PORT"] = "12355"
-     init_process_group(backend="nccl", rank=rank, world_size=world_size)
+     init_process_group(backend="nccl")
      torch.cuda.set_device(int(os.environ["LOCAL_RANK"]))
```

----------------------------------------

TITLE: Folder Layouts for LibTorch and Example Application (Shell)
DESCRIPTION: These shell code listings present typical directory layouts for a LibTorch installation and an example C++ application project. They aid in understanding file/folder placement and are useful for configuring build systems or referencing header/library locations. The structure distinguishes between library files and the application source.
SOURCE: https://github.com/pytorch/tutorials/blob/main/advanced_source/cpp_export.rst#2025-04-22_snippet_5

LANGUAGE: sh
CODE:
```
libtorch/
  bin/
  include/
  lib/
  share/
```

LANGUAGE: sh
CODE:
```
example-app/
  CMakeLists.txt
  example-app.cpp
```

----------------------------------------

TITLE: Building a PyTorch C++ Extension with setuptools in Python
DESCRIPTION: This Python snippet provides an example setup.py script for building a PyTorch C++ extension (out-of-tree backend) using setuptools and torch.utils.cpp_extension. It specifies the package name, C++ source files, include directories, compiler and linker flags, and custom build extensions. The example assumes existence of variables such as torch_xla_sources, include_dirs, and extra_compile_args, which must be defined as required by your backend.
SOURCE: https://github.com/pytorch/tutorials/blob/main/advanced_source/extend_dispatcher.rst#2025-04-22_snippet_6

LANGUAGE: Python
CODE:
```
from setuptools import setup
from torch.utils.cpp_extension import BuildExtension, CppExtension

setup(
    name='torch_xla',
    ext_modules=[
        CppExtension(
            '_XLAC',
            torch_xla_sources,
            include_dirs=include_dirs,
            extra_compile_args=extra_compile_args,
            library_dirs=library_dirs,
            extra_link_args=extra_link_args + \
                [make_relative_rpath('torch_xla/lib')],
        ),
    ],
    cmdclass={
        'build_ext': Build,  # Build is a derived class of BuildExtension
    }
    # more configs...
)
```

----------------------------------------

TITLE: Distributed Training Setup Functions
DESCRIPTION: Helper functions to initialize and cleanup distributed training process groups for FSDP implementation.
SOURCE: https://github.com/pytorch/tutorials/blob/main/intermediate_source/FSDP_tutorial.rst#2025-04-22_snippet_1

LANGUAGE: python
CODE:
```
def setup(rank, world_size):
    os.environ['MASTER_ADDR'] = 'localhost'
    os.environ['MASTER_PORT'] = '12355'

    # initialize the process group
    dist.init_process_group("nccl", rank=rank, world_size=world_size)

def cleanup():
    dist.destroy_process_group()
```

----------------------------------------

TITLE: Installing PyTorch and Intel GPU Backend Dependencies (bash)
DESCRIPTION: This command installs the required PyTorch stack and Triton backend for Intel GPUs from the official index. It is a prerequisite for running all subsequent Python code in the tutorial. Dependencies installed include torch, torchvision, torchaudio, and pytorch-triton-xpu.
SOURCE: https://github.com/pytorch/tutorials/blob/main/prototype_source/pt2e_quant_xpu_inductor.rst#2025-04-22_snippet_0

LANGUAGE: bash
CODE:
```
pip3 install torch torchvision torchaudio pytorch-triton-xpu --index-url https://download.pytorch.org/whl/xpu
```

----------------------------------------

TITLE: Running QAT Example with Inductor Freezing Enabled
DESCRIPTION: Example command to run the Quantization-Aware Training example with the Inductor freezing feature enabled. This is necessary since the freezing feature is not enabled by default in PyTorch.
SOURCE: https://github.com/pytorch/tutorials/blob/main/prototype_source/pt2e_quant_x86_inductor.rst#2025-04-22_snippet_11

LANGUAGE: bash
CODE:
```
TORCHINDUCTOR_FREEZING=1 python example_x86inductorquantizer_qat.py
```

----------------------------------------

TITLE: Installing PyTorch for AWS Graviton
DESCRIPTION: Command to install PyTorch which supports AWS Graviton3 optimizations starting with version 2.0.
SOURCE: https://github.com/pytorch/tutorials/blob/main/recipes_source/inference_tuning_on_aws_graviton.rst#2025-04-22_snippet_0

LANGUAGE: bash
CODE:
```
python3 -m pip install torch
```

----------------------------------------

TITLE: Training ResNet50 with FP32 using Intel Extension for PyTorch
DESCRIPTION: Demonstrates how to train a ResNet50 model on CIFAR10 dataset using FP32 precision with Intel Extension for PyTorch backend. Includes data loading, model setup, optimization and training loop implementation.
SOURCE: https://github.com/pytorch/tutorials/blob/main/recipes_source/torch_compile_backend_ipex.rst#2025-04-22_snippet_0

LANGUAGE: python
CODE:
```
import torch
import torchvision

LR = 0.001
DOWNLOAD = True
DATA = 'datasets/cifar10/'

transform = torchvision.transforms.Compose([
    torchvision.transforms.Resize((224, 224)),
    torchvision.transforms.ToTensor(),
    torchvision.transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
])
train_dataset = torchvision.datasets.CIFAR10(
    root=DATA,
    train=True,
    transform=transform,
    download=DOWNLOAD,
)
train_loader = torch.utils.data.DataLoader(
    dataset=train_dataset,
    batch_size=128
)

model = torchvision.models.resnet50()
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr = LR, momentum=0.9)
model.train()

import intel_extension_for_pytorch as ipex

# Invoke the following API optionally, to apply frontend optimizations
model, optimizer = ipex.optimize(model, optimizer=optimizer)

compile_model = torch.compile(model, backend="ipex")

for batch_idx, (data, target) in enumerate(train_loader):
    optimizer.zero_grad()
    output = compile_model(data)
    loss = criterion(output, target)
    loss.backward()
    optimizer.step()
```

----------------------------------------

TITLE: Example Output of CommDebugMode with MLPModule
DESCRIPTION: This shows sample output from CommDebugMode when applied to an MLPModule at noise level 0. It displays the collective operation counts at module level, showing where operations like all_reduce occur in the forward pass of the model.
SOURCE: https://github.com/pytorch/tutorials/blob/main/recipes_source/distributed_comm_debug_mode.rst#2025-04-22_snippet_1

LANGUAGE: python
CODE:
```
Expected Output:
    Global
      FORWARD PASS
        *c10d_functional.all_reduce: 1
      MLPModule
        FORWARD PASS
          *c10d_functional.all_reduce: 1
        MLPModule.net1
        MLPModule.relu
        MLPModule.net2
          FORWARD PASS
            *c10d_functional.all_reduce: 1
```

----------------------------------------

TITLE: Verifying PyTorch Installation (Python)
DESCRIPTION: A simple Python script to import the `torch` library and print its version number. This is used to verify that PyTorch has been successfully built and installed.
SOURCE: https://github.com/pytorch/tutorials/blob/main/prototype_source/vulkan_workflow.rst#2025-04-22_snippet_3

LANGUAGE: python
CODE:
```
import torch
print(torch.__version__)
```

----------------------------------------

TITLE: Data Loading and Transformation Setup
DESCRIPTION: Configures data loading pipelines with transformations for training and validation datasets using torchvision.
SOURCE: https://github.com/pytorch/tutorials/blob/main/intermediate_source/quantized_transfer_learning_tutorial.rst#2025-04-22_snippet_1

LANGUAGE: python
CODE:
```
import os  # needed for os.path.join below

import torch
from torchvision import transforms, datasets

data_transforms = {
    'train': transforms.Compose([
        transforms.Resize(224),
        transforms.RandomCrop(224),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
    'val': transforms.Compose([
        transforms.Resize(224),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
}

data_dir = 'data/hymenoptera_data'
image_datasets = {x: datasets.ImageFolder(os.path.join(data_dir, x),
                                          data_transforms[x])
                  for x in ['train', 'val']}
dataloaders = {x: torch.utils.data.DataLoader(image_datasets[x], batch_size=16,
                                              shuffle=True, num_workers=8)
               for x in ['train', 'val']}
dataset_sizes = {x: len(image_datasets[x]) for x in ['train', 'val']}
class_names = image_datasets['train'].classes

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
```

----------------------------------------

TITLE: FSDP Training Setup and Imports
DESCRIPTION: Imports required packages for FSDP training including PyTorch core libraries, transformers, and distributed training utilities.
SOURCE: https://github.com/pytorch/tutorials/blob/main/intermediate_source/FSDP_advanced_tutorial.rst#2025-04-22_snippet_1

LANGUAGE: python
CODE:
```
import os
import argparse
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from transformers import AutoTokenizer, GPT2TokenizerFast
from transformers import T5Tokenizer, T5ForConditionalGeneration
import functools
from torch.optim.lr_scheduler import StepLR
import torch.distributed as dist
import torch.multiprocessing as mp
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data.distributed import DistributedSampler
from transformers.models.t5.modeling_t5 import T5Block

from torch.distributed.algorithms._checkpoint.checkpoint_wrapper import (
    checkpoint_wrapper,
    CheckpointImpl,
    apply_activation_checkpointing_wrapper)

from torch.distributed.fsdp import (
    FullyShardedDataParallel as FSDP,
    MixedPrecision,
    BackwardPrefetch,
    ShardingStrategy,
    FullStateDictConfig,
    StateDictType,
)
from torch.distributed.fsdp.wrap import (
    transformer_auto_wrap_policy,
    enable_wrap,
    wrap,
)
from functools import partial
from torch.utils.data import DataLoader
from pathlib import Path
from summarization_dataset import *
from typing import Type
import time
import tqdm
from datetime import datetime
```

----------------------------------------

TITLE: Installing LibTorch Dependencies in Shell
DESCRIPTION: Downloads and extracts the LibTorch distribution for CPU usage on Ubuntu Linux.
SOURCE: https://github.com/pytorch/tutorials/blob/main/advanced_source/cpp_frontend.rst#2025-04-22_snippet_0

LANGUAGE: shell
CODE:
```
wget https://download.pytorch.org/libtorch/nightly/cpu/libtorch-shared-with-deps-latest.zip
unzip libtorch-shared-with-deps-latest.zip
```

----------------------------------------

TITLE: HuggingFace T5 Model Setup
DESCRIPTION: Function to initialize the T5 model and tokenizer from HuggingFace pretrained models.
SOURCE: https://github.com/pytorch/tutorials/blob/main/intermediate_source/FSDP_advanced_tutorial.rst#2025-04-22_snippet_3

LANGUAGE: python
CODE:
```
def setup_model(model_name):
    model = T5ForConditionalGeneration.from_pretrained(model_name)
    tokenizer = T5Tokenizer.from_pretrained(model_name)
    return model, tokenizer
```

----------------------------------------

TITLE: Referencing C++ Custom Operator Example in PyTorch
DESCRIPTION: Example of how to reference the C++ custom operator tutorial in PyTorch documentation. This shows the syntax for referencing other documentation pages.
SOURCE: https://github.com/pytorch/tutorials/blob/main/advanced_source/custom_ops_landing_page.rst#2025-04-22_snippet_1

LANGUAGE: rst
CODE:
```
:ref:`cpp-custom-ops-tutorial`
```

----------------------------------------

TITLE: Setup Script for C++ Extension
DESCRIPTION: Python setup.py script to build the C++ extension using setuptools and torch.utils.cpp_extension.
SOURCE: https://github.com/pytorch/tutorials/blob/main/advanced_source/cpp_extension.rst#2025-04-22_snippet_2

LANGUAGE: python
CODE:
```
from setuptools import setup, Extension
from torch.utils import cpp_extension

setup(name='lltm_cpp',
      ext_modules=[cpp_extension.CppExtension('lltm_cpp', ['lltm.cpp'])],
      cmdclass={'build_ext': cpp_extension.BuildExtension})
```

----------------------------------------

TITLE: Referencing Python Custom Operator Example in PyTorch
DESCRIPTION: Example of how to reference the Python custom operator tutorial in PyTorch documentation.
This shows the syntax for referencing other documentation pages.
SOURCE: https://github.com/pytorch/tutorials/blob/main/advanced_source/custom_ops_landing_page.rst#2025-04-22_snippet_0

LANGUAGE: rst
CODE:
```
:ref:`python-custom-ops-tutorial`
```

----------------------------------------

TITLE: Implementing Dynamic Neural Network with Control Flow in PyTorch
DESCRIPTION: A PyTorch implementation that demonstrates control flow and weight sharing in neural networks. This example creates a model that dynamically chooses between 3rd, 4th, or 5th order polynomials during each forward pass.
SOURCE: https://github.com/pytorch/tutorials/blob/main/beginner_source/pytorch_with_examples.rst#2025-04-22_snippet_7

LANGUAGE: python
CODE:
```
# -*- coding: utf-8 -*-
import random
import torch
import math


class DynamicNet(torch.nn.Module):
    def __init__(self):
        """
        In the constructor we instantiate five parameters and assign them as members.
        """
        super().__init__()
        self.a = torch.nn.Parameter(torch.randn(()))
        self.b = torch.nn.Parameter(torch.randn(()))
        self.c = torch.nn.Parameter(torch.randn(()))
        self.d = torch.nn.Parameter(torch.randn(()))
        self.e = torch.nn.Parameter(torch.randn(()))

    def forward(self, x):
        """
        For the forward pass of the model, we always compute the cubic polynomial
        and then randomly add no extra term, the 4th-order term, or both the 4th-
        and 5th-order terms, reusing the e parameter to compute the contribution
        of these orders.

        Since each forward pass builds a dynamic computation graph, we can use normal
        Python control-flow operators like loops or conditional statements when
        defining the forward pass of the model.

        Here we also see that it is perfectly safe to reuse the same parameter many
        times when defining a computational graph.
        """
        y = self.a + self.b * x + self.c * x ** 2 + self.d * x ** 3
        for exp in range(4, random.randint(4, 6)):
            y = y + self.e * x ** exp
        return y

    def string(self):
        """
        Just like any class in Python, you can also define custom methods on PyTorch modules
        """
        return f'y = {self.a.item()} + {self.b.item()} x + {self.c.item()} x^2 + {self.d.item()} x^3 + {self.e.item()} x^4 ? + {self.e.item()} x^5 ?'


# Create Tensors to hold input and outputs.
x = torch.linspace(-math.pi, math.pi, 2000)
y = torch.sin(x)

# Construct our model by instantiating the class defined above
model = DynamicNet()

# Construct our loss function and an Optimizer. Training this strange model with
# vanilla stochastic gradient descent is tough, so we use momentum
criterion = torch.nn.MSELoss(reduction='sum')
optimizer = torch.optim.SGD(model.parameters(), lr=1e-8, momentum=0.9)
for t in range(30000):
    # Forward pass: Compute predicted y by passing x to the model
    y_pred = model(x)

    # Compute and print loss
    loss = criterion(y_pred, y)
    if t % 2000 == 1999:
        print(t, loss.item())

    # Zero gradients, perform a backward pass, and update the weights.
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print(f'Result: {model.string()}')
```

----------------------------------------

TITLE: Adding Gallery Items for PyTorch Neural Network Examples
DESCRIPTION: This code snippet adds gallery items for two PyTorch tutorial examples: a polynomial module and a dynamic network. It uses reStructuredText directives to include these examples in a gallery display.
SOURCE: https://github.com/pytorch/tutorials/blob/main/beginner_source/pytorch_with_examples.rst#2025-04-22_snippet_8

LANGUAGE: reStructuredText
CODE:
```
.. galleryitem:: /beginner/examples_nn/polynomial_module.py

.. galleryitem:: /beginner/examples_nn/dynamic_net.py

.. raw:: html
``` ---------------------------------------- TITLE: Implementing Polynomial Fitting with PyTorch Optimizer DESCRIPTION: A PyTorch implementation that uses the optim package to update model parameters. This example demonstrates how to use built-in optimizers like RMSprop instead of manually implementing gradient descent. SOURCE: https://github.com/pytorch/tutorials/blob/main/beginner_source/pytorch_with_examples.rst#2025-04-22_snippet_5 LANGUAGE: python CODE: ``` # -*- coding: utf-8 -*- import torch import math # Create Tensors to hold input and outputs. x = torch.linspace(-math.pi, math.pi, 2000) y = torch.sin(x) # Prepare the input tensor (x, x^2, x^3). p = torch.tensor([1, 2, 3]) xx = x.unsqueeze(-1).pow(p) # Use the nn package to define our model and loss function. model = torch.nn.Sequential( torch.nn.Linear(3, 1), torch.nn.Flatten(0, 1) ) loss_fn = torch.nn.MSELoss(reduction='sum') # Use the optim package to define an Optimizer that will update the weights of # the model for us. Here we will use RMSprop; the optim package contains many other # optimization algorithms. The first argument to the RMSprop constructor tells the # optimizer which Tensors it should update. learning_rate = 1e-3 optimizer = torch.optim.RMSprop(model.parameters(), lr=learning_rate) for t in range(2000): # Forward pass: compute predicted y by passing x to the model. y_pred = model(xx) # Compute and print loss. loss = loss_fn(y_pred, y) if t % 100 == 99: print(t, loss.item()) # Before the backward pass, use the optimizer object to zero all of the # gradients for the variables it will update (which are the learnable # weights of the model). This is because by default, gradients are # accumulated in buffers (i.e., not overwritten) whenever .backward() # is called. Check out the docs of torch.autograd.backward for more details. 
optimizer.zero_grad() # Backward pass: compute gradient of the loss with respect to model # parameters loss.backward() # Calling the step function on an Optimizer makes an update to its # parameters optimizer.step() linear_layer = model[0] print(f'Result: y = {linear_layer.bias.item()} + {linear_layer.weight[:, 0].item()} x + {linear_layer.weight[:, 1].item()} x^2 + {linear_layer.weight[:, 2].item()} x^3') ``` ---------------------------------------- TITLE: Starting the Flask Server for the Image Classifier API - Shell DESCRIPTION: Command for launching the Flask app from the shell. FLASK_APP is set to 'app.py', and flask run starts the web server (default on port 5000). Requires Flask to be installed. Expects 'app.py' to be present in the working directory. SOURCE: https://github.com/pytorch/tutorials/blob/main/recipes_source/deployment_with_flask.rst#2025-04-22_snippet_6 LANGUAGE: shell CODE: ``` FLASK_APP=app.py flask run ``` ---------------------------------------- TITLE: Installing Holistic Trace Analysis using pip DESCRIPTION: Command to install the HolisticTraceAnalysis package using pip. This is the primary installation method for HTA. SOURCE: https://github.com/pytorch/tutorials/blob/main/beginner_source/hta_intro_tutorial.rst#2025-04-22_snippet_0 LANGUAGE: shell CODE: ``` pip install HolisticTraceAnalysis ``` ---------------------------------------- TITLE: Bundling Example Inputs to Scripted Model - PyTorch - Python DESCRIPTION: Demonstrates how to create a list of example inputs (for 'forward') and attach them to a TorchScript module using the bundle_inputs utility. The sample input tuple must match the model input signature. This step creates a bundled_model with embedded sample inputs, suitable for later retrieval or testing. bundle_inputs comes from torch.utils.bundled_inputs; ensure it is imported. 
SOURCE: https://github.com/pytorch/tutorials/blob/main/recipes_source/bundled_inputs.rst#2025-04-22_snippet_1 LANGUAGE: python CODE: ```
import torch
from torch.utils.bundled_inputs import bundle_inputs

# scripted_module is the TorchScript module created in an earlier step.
# For each method create a list of inputs and each input is a tuple of arguments
sample_input = [(torch.zeros(1,10),)]

# Create model with bundled inputs, if type(input) is list then the input is bundled to 'forward'
bundled_model = bundle_inputs(scripted_module, sample_input)
``` ---------------------------------------- TITLE: Installing Intel Neural Compressor DESCRIPTION: Commands for installing Intel Neural Compressor from pip or conda. Supports Python versions 3.6, 3.7, 3.8, and 3.9. SOURCE: https://github.com/pytorch/tutorials/blob/main/recipes_source/intel_neural_compressor_for_pytorch.rst#2025-04-22_snippet_0 LANGUAGE: bash CODE: ``` # install stable version from pip pip install neural-compressor # install nightly version from pip pip install -i https://test.pypi.org/simple/ neural-compressor # install stable version from conda conda install neural-compressor -c conda-forge -c intel ``` ---------------------------------------- TITLE: Inference with ResNet50 in FP32 using Intel Extension for PyTorch DESCRIPTION: Shows how to perform inference using a pre-trained ResNet50 model with FP32 precision using Intel Extension for PyTorch backend. Includes model optimization and inference setup. 
SOURCE: https://github.com/pytorch/tutorials/blob/main/recipes_source/torch_compile_backend_ipex.rst#2025-04-22_snippet_2 LANGUAGE: python CODE: ``` import torch import torchvision.models as models model = models.resnet50(weights='ResNet50_Weights.DEFAULT') model.eval() data = torch.rand(1, 3, 224, 224) import intel_extension_for_pytorch as ipex # Invoke the following API optionally, to apply frontend optimizations model = ipex.optimize(model, weights_prepack=False) compile_model = torch.compile(model, backend="ipex") with torch.no_grad(): compile_model(data) ``` ---------------------------------------- TITLE: Implementing Polynomial Fitting with NumPy DESCRIPTION: A numpy implementation of fitting a third-order polynomial to a sine function. This example manually implements both the forward and backward passes through the network using numpy operations. SOURCE: https://github.com/pytorch/tutorials/blob/main/beginner_source/pytorch_with_examples.rst#2025-04-22_snippet_0 LANGUAGE: python CODE: ``` # -*- coding: utf-8 -*- import numpy as np import math # Create random input and output data np.random.seed(42) x = np.random.randn(200, 1) y = np.sin(x) # Randomly initialize weights a = np.random.randn() b = np.random.randn() c = np.random.randn() d = np.random.randn() learning_rate = 1e-6 for t in range(2000): # Forward pass: compute predicted y # y = a + b * x + c * x^2 + d * x^3 y_pred = a + b * x + c * x ** 2 + d * x ** 3 # Compute and print loss loss = np.square(y_pred - y).sum() if t % 100 == 99: print(t, loss) # Backprop to compute gradients of a, b, c, d with respect to loss grad_y_pred = 2.0 * (y_pred - y) grad_a = grad_y_pred.sum() grad_b = (grad_y_pred * x).sum() grad_c = (grad_y_pred * x ** 2).sum() grad_d = (grad_y_pred * x ** 3).sum() # Update weights a -= learning_rate * grad_a b -= learning_rate * grad_b c -= learning_rate * grad_c d -= learning_rate * grad_d print(f'Result: y = {a} + {b} x + {c} x^2 + {d} x^3') ``` 
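The hand-derived gradients in the NumPy snippet above can be sanity-checked against finite differences. The following is a minimal sketch and is not part of the tutorial: the helper names `loss_and_grads` and `numerical_grads`, the step size `eps`, and the tolerances are assumptions chosen for illustration. It recomputes the same loss and gradient formulas, then compares them with central differences taken one parameter at a time.

```python
import numpy as np


def loss_and_grads(params, x, y):
    """Sum-of-squares loss and the hand-derived gradients from the snippet above."""
    a, b, c, d = params
    y_pred = a + b * x + c * x ** 2 + d * x ** 3
    loss = np.square(y_pred - y).sum()
    grad_y_pred = 2.0 * (y_pred - y)
    grads = np.array([
        grad_y_pred.sum(),
        (grad_y_pred * x).sum(),
        (grad_y_pred * x ** 2).sum(),
        (grad_y_pred * x ** 3).sum(),
    ])
    return loss, grads


def numerical_grads(params, x, y, eps=1e-4):
    """Central finite differences, perturbing one parameter at a time."""
    grads = np.zeros_like(params)
    for i in range(len(params)):
        step = np.zeros_like(params)
        step[i] = eps
        loss_plus, _ = loss_and_grads(params + step, x, y)
        loss_minus, _ = loss_and_grads(params - step, x, y)
        grads[i] = (loss_plus - loss_minus) / (2 * eps)
    return grads


# Hypothetical test data in the same shape as the snippet above.
rng = np.random.default_rng(0)
x = rng.standard_normal((200, 1))
y = np.sin(x)
params = rng.standard_normal(4)

_, analytic = loss_and_grads(params, x, y)
numeric = numerical_grads(params, x, y)
gradients_match = np.allclose(analytic, numeric, rtol=1e-4, atol=1e-3)
print("analytic and numerical gradients agree:", gradients_match)
```

Because the loss is quadratic in a, b, c, and d, central differences carry no truncation error here, so any disagreement beyond floating-point noise would point to a mistake in the analytic formulas.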
---------------------------------------- TITLE: Running DDP with a Model Parallel Architecture Example - PyTorch - Python DESCRIPTION: Demonstrates initialization and training of a DDP-wrapped model-parallel network. Sets up device allocation per process, wraps the custom multi-GPU ToyMpModel in DistributedDataParallel, and walks through optimizer and loss setup, forward and backward passes, and process group cleanup. Assumes a distributed context with known rank and world_size and two GPUs available per process. The inputs and labels are randomly generated. Dependencies: torch, torch.nn, torch.optim, and the setup and cleanup functions. SOURCE: https://github.com/pytorch/tutorials/blob/main/intermediate_source/ddp_tutorial.rst#2025-04-22_snippet_5 LANGUAGE: python CODE: ``` def demo_model_parallel(rank, world_size): print(f"Running DDP with model parallel example on rank {rank}.") setup(rank, world_size) # setup mp_model and devices for this process dev0 = rank * 2 dev1 = rank * 2 + 1 mp_model = ToyMpModel(dev0, dev1) ddp_mp_model = DDP(mp_model) loss_fn = nn.MSELoss() optimizer = optim.SGD(ddp_mp_model.parameters(), lr=0.001) optimizer.zero_grad() # outputs will be on dev1 outputs = ddp_mp_model(torch.randn(20, 10)) labels = torch.randn(20, 5).to(dev1) loss_fn(outputs, labels).backward() optimizer.step() cleanup() print(f"Finished running DDP with model parallel example on rank {rank}.") ``` ---------------------------------------- TITLE: Building PyTorch Extension with setuptools DESCRIPTION: Terminal output showing the build process of the custom operator using setup.py. Demonstrates compilation, linking and installation steps. 
SOURCE: https://github.com/pytorch/tutorials/blob/main/advanced_source/torch_script_custom_ops.rst#2025-04-22_snippet_26 LANGUAGE: shell CODE: ``` $ python setup.py build develop running build running build_ext building 'warp_perspective' extension creating build creating build/temp.linux-x86_64-3.7 gcc -pthread -B /root/local/miniconda/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/root/local/miniconda/lib/python3.7/site-packages/torch/lib/include -I/root/local/miniconda/lib/python3.7/site-packages/torch/lib/include/torch/csrc/api/include -I/root/local/miniconda/lib/python3.7/site-packages/torch/lib/include/TH -I/root/local/miniconda/lib/python3.7/site-packages/torch/lib/include/THC -I/root/local/miniconda/include/python3.7m -c op.cpp -o build/temp.linux-x86_64-3.7/op.o -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=warp_perspective -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++11 cc1plus: warning: command line option '-Wstrict-prototypes' is valid for C/ObjC but not for C++ creating build/lib.linux-x86_64-3.7 g++ -pthread -shared -B /root/local/miniconda/compiler_compat -L/root/local/miniconda/lib -Wl,-rpath=/root/local/miniconda/lib -Wl,--no-as-needed -Wl,--sysroot=/ build/temp.linux-x86_64-3.7/op.o -lopencv_core -lopencv_imgproc -o build/lib.linux-x86_64-3.7/warp_perspective.so running develop running egg_info creating warp_perspective.egg-info writing warp_perspective.egg-info/PKG-INFO writing dependency_links to warp_perspective.egg-info/dependency_links.txt writing top-level names to warp_perspective.egg-info/top_level.txt writing manifest file 'warp_perspective.egg-info/SOURCES.txt' reading manifest file 'warp_perspective.egg-info/SOURCES.txt' writing manifest file 'warp_perspective.egg-info/SOURCES.txt' running build_ext copying build/lib.linux-x86_64-3.7/warp_perspective.so -> Creating /root/local/miniconda/lib/python3.7/site-packages/warp-perspective.egg-link (link to .) 
Adding warp-perspective 0.0.0 to easy-install.pth file Installed /warp_perspective Processing dependencies for warp-perspective==0.0.0 Finished processing dependencies for warp-perspective==0.0.0 ``` ---------------------------------------- TITLE: Setting up 2D Parallel Pattern Without DeviceMesh in PyTorch DESCRIPTION: This code demonstrates the manual setup of process groups for a 2D parallel pattern in PyTorch without using DeviceMesh. It involves calculating shard groups and replicate groups, then assigning them to each rank. SOURCE: https://github.com/pytorch/tutorials/blob/main/recipes_source/distributed_device_mesh.rst#2025-04-22_snippet_0 LANGUAGE: python CODE: ``` import os import torch import torch.distributed as dist # Understand world topology rank = int(os.environ["RANK"]) world_size = int(os.environ["WORLD_SIZE"]) print(f"Running example on {rank=} in a world with {world_size=}") # Create process groups to manage 2-D like parallel pattern dist.init_process_group("nccl") torch.cuda.set_device(rank) # Create shard groups (e.g. 
(0, 1, 2, 3), (4, 5, 6, 7)) # and assign the correct shard group to each rank num_node_devices = torch.cuda.device_count() shard_rank_lists = list(range(0, num_node_devices // 2)), list(range(num_node_devices // 2, num_node_devices)) shard_groups = ( dist.new_group(shard_rank_lists[0]), dist.new_group(shard_rank_lists[1]), ) current_shard_group = ( shard_groups[0] if rank in shard_rank_lists[0] else shard_groups[1] ) # Create replicate groups (for example, (0, 4), (1, 5), (2, 6), (3, 7)) # and assign the correct replicate group to each rank current_replicate_group = None shard_factor = len(shard_rank_lists[0]) for i in range(num_node_devices // 2): replicate_group_ranks = list(range(i, num_node_devices, shard_factor)) replicate_group = dist.new_group(replicate_group_ranks) if rank in replicate_group_ranks: current_replicate_group = replicate_group ``` ---------------------------------------- TITLE: Implementing HTML Meta Redirect to ExecuTorch Documentation DESCRIPTION: This HTML meta tag creates an automatic redirect to the ExecuTorch documentation page after a 3-second delay, directing users from the deprecated PyTorch Mobile documentation to the currently supported alternative. SOURCE: https://github.com/pytorch/tutorials/blob/main/recipes_source/model_preparation_ios.rst#2025-04-22_snippet_0 LANGUAGE: html CODE: ``` ``` ---------------------------------------- TITLE: Initializing Command-Line Arguments for PyTorch RPC Parameter Server DESCRIPTION: Sets up argparse to handle command-line arguments for configuring the distributed training setup, including world size, rank, number of GPUs, master address, and port. 
SOURCE: https://github.com/pytorch/tutorials/blob/main/intermediate_source/rpc_param_server_tutorial.rst#2025-04-22_snippet_12 LANGUAGE: python CODE: ``` import argparse import os if __name__ == '__main__': parser = argparse.ArgumentParser( description="Parameter-Server RPC based training") parser.add_argument( "--world_size", type=int, default=4, help="""Total number of participating processes. Should be the sum of master node and all training nodes.""") parser.add_argument( "--rank", type=int, default=None, help="Global rank of this process. Pass in 0 for master.") parser.add_argument( "--num_gpus", type=int, default=0, help="""Number of GPUs to use for training. Currently supports between 0 and 2 GPUs. Note that this argument will be passed to the parameter servers.""") parser.add_argument( "--master_addr", type=str, default="localhost", help="""Address of master, will default to localhost if not provided. Master must be able to accept network traffic on the address + port.""") parser.add_argument( "--master_port", type=str, default="29500", help="""Port that master is listening on, will default to 29500 if not provided. Master must be able to accept network traffic on the host and port.""") args = parser.parse_args() assert args.rank is not None, "must provide rank argument." assert 0 <= args.num_gpus <= 2, f"Only 0-2 GPUs currently supported (got {args.num_gpus})." os.environ['MASTER_ADDR'] = args.master_addr os.environ["MASTER_PORT"] = args.master_port ```
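The argument handling above can be exercised without launching any RPC workers by passing an explicit argument list to the parser. The following is a minimal sketch under stated assumptions: `build_parser` and `parse_and_validate` are hypothetical helper names, not part of the tutorial, and the help strings are omitted for brevity. It declares the same five options and reports invalid values through `parser.error`, which prints a usage message and exits, rather than through `assert`.

```python
import argparse
import os


def build_parser():
    """Hypothetical helper: declares the same options as the snippet above."""
    parser = argparse.ArgumentParser(
        description="Parameter-Server RPC based training")
    parser.add_argument("--world_size", type=int, default=4)
    parser.add_argument("--rank", type=int, default=None)
    parser.add_argument("--num_gpus", type=int, default=0)
    parser.add_argument("--master_addr", type=str, default="localhost")
    parser.add_argument("--master_port", type=str, default="29500")
    return parser


def parse_and_validate(argv):
    """Hypothetical helper: parse argv and reject unsupported values."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.rank is None:
        parser.error("must provide rank argument.")
    if not 0 <= args.num_gpus <= 2:
        parser.error(
            f"Only 0-2 GPUs currently supported (got {args.num_gpus}).")
    return args


# Simulate: python script.py --rank 0 --num_gpus 1
args = parse_and_validate(["--rank", "0", "--num_gpus", "1"])
os.environ["MASTER_ADDR"] = args.master_addr
os.environ["MASTER_PORT"] = args.master_port
print(args.world_size, args.rank, args.num_gpus)
```

Using `parser.error` keeps the checks active when Python runs with `-O`, which strips `assert` statements.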