### Sample Installation Setup

Source: https://github.com/nvidia/cuda-samples/blob/master/cpp/0_Introduction/UnifiedMemoryStreams/CMakeLists.txt

Includes a CMake module to set up the installation configuration for the sample.

```cmake
include(${CMAKE_CURRENT_SOURCE_DIR}/../../../cmake/InstallSamples.cmake)
setup_samples_install()
```

--------------------------------

### Setup Sample Installation

Source: https://github.com/nvidia/cuda-samples/blob/master/cpp/5_Domain_Specific/FDTD3d/CMakeLists.txt

Calls a function to set up the installation rules for the sample. This is part of the common installation procedure for NVIDIA samples.

```cmake
setup_samples_install()
```

--------------------------------

### Include Directories and Installation Setup

Source: https://github.com/nvidia/cuda-samples/blob/master/cpp/9_CUDA_Tile/tileMatmul/CMakeLists.txt

Includes necessary directories for common headers and sets up sample installation rules. This ensures that the project can find its dependencies and be installed correctly.

```cmake
# Include directories and libraries
include_directories(../../../Common ../Benchmark_Common)

# Include installation configuration
include(${CMAKE_CURRENT_SOURCE_DIR}/../../../cmake/InstallSamples.cmake)
setup_samples_install()
```

--------------------------------

### Install Samples on Linux

Source: https://github.com/nvidia/cuda-samples/blob/master/README.md

After configuring and building, navigate to the build directory and run 'make install' to install the samples on Linux.

```bash
cd build/
make install
```

--------------------------------

### Sample Installation Configuration

Source: https://github.com/nvidia/cuda-samples/blob/master/cpp/0_Introduction/cudaOpenMP/CMakeLists.txt

This includes a CMake module to set up the installation rules for the samples, ensuring they can be installed correctly after building.

```cmake
# Include installation configuration
include(${CMAKE_CURRENT_SOURCE_DIR}/../../../cmake/InstallSamples.cmake)
setup_samples_install()
```

--------------------------------

### Install Dependencies and Run Sample

Source: https://github.com/nvidia/cuda-samples/blob/master/python/1_GettingStarted/simplePrint/README.md

Installs project dependencies from requirements.txt and then executes the simplePrint Python script.

```bash
# Install dependencies
pip install -r requirements.txt

# Run the sample
python simplePrint.py
```

--------------------------------

### Basic CMake Project Setup

Source: https://github.com/nvidia/cuda-samples/blob/master/cpp/5_Domain_Specific/dwtHaar1D/CMakeLists.txt

Sets the minimum CMake version, adds module paths, defines the project name and languages, and finds the CUDA toolkit. This is a standard starting point for CUDA projects.

```cmake
cmake_minimum_required(VERSION 3.20)

list(APPEND CMAKE_MODULE_PATH "${CMAKE_CURRENT_SOURCE_DIR}/../../../cmake/Modules")

project(dwtHaar1D LANGUAGES C CXX CUDA)

find_package(CUDAToolkit REQUIRED)
```

--------------------------------

### Include Sample Installation Logic

Source: https://github.com/nvidia/cuda-samples/blob/master/CMakeLists.txt

Includes an external CMake file that handles the installation configuration for samples.

```cmake
include(cmake/InstallSamples.cmake)
```

--------------------------------

### Install Dependencies

Source: https://github.com/nvidia/cuda-samples/blob/master/python/1_GettingStarted/systemInfo/README.md

Install the required packages for the system information sample using pip.

```bash
cd /path/to/cuda-samples/python/1_GettingStarted/systemInfo
pip install -r requirements.txt
```

--------------------------------

### Target Properties and Installation Setup

Source: https://github.com/nvidia/cuda-samples/blob/master/cpp/0_Introduction/template/CMakeLists.txt

Enables separable compilation for CUDA targets and includes a script for sample installation configuration.

```cmake
set_target_properties(template PROPERTIES CUDA_SEPARABLE_COMPILATION ON)

# Include installation configuration
include(${CMAKE_CURRENT_SOURCE_DIR}/../../../cmake/InstallSamples.cmake)
setup_samples_install()
```

--------------------------------

### Installation Rules

Source: https://github.com/nvidia/cuda-samples/blob/master/cpp/7_libNVVM/simple/CMakeLists.txt

Configures installation paths for the 'simple' executable and its associated '.ll' file. It installs to a unified directory if CUDA_SAMPLES_INSTALL_DIR is defined, otherwise to a 'bin' directory.

```cmake
if(DEFINED CUDA_SAMPLES_INSTALL_DIR)
    install(TARGETS simple DESTINATION ${CUDA_SAMPLES_INSTALL_DIR})
    install(FILES simple-gpu64.ll DESTINATION ${CUDA_SAMPLES_INSTALL_DIR})
else()
    install(TARGETS simple DESTINATION bin)
    install(FILES simple-gpu64.ll DESTINATION bin)
endif()
```

--------------------------------

### Install Dependencies

Source: https://github.com/nvidia/cuda-samples/blob/master/python/3_FrameworkInterop/customTensorFlowKernel/README.md

Navigate to the sample directory and install required Python packages using the provided requirements file.

```bash
cd python/3_FrameworkInterop/customTensorFlowKernel
pip install -r requirements.txt
```

--------------------------------

### Search Space Configuration Example

Source: https://github.com/nvidia/cuda-samples/blob/master/cpp/9_CUDA_Tile/tileMatmulAutotuner/README.md

Define tile sizes and latency values for the autotuner. Lines starting with '#' are comments. Each 'tile' line specifies TILE_BLOCK_M, TILE_BLOCK_N, and TILE_BLOCK_K. Latency values are combined with every tile entry.

```text
tile 64 64 32
tile 128 64 32
load_latency 2 5 8
store_latency 2 5 8
```
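
The format described above can be read with a few lines of Python. The following is a minimal sketch, not the autotuner's own parser: the function names `parse_search_space` and `expand_candidates` are hypothetical, and the combination rule (a full Cartesian product of tiles, load latencies, and store latencies) is an assumption based on the statement that latency values are combined with every tile entry.

```python
from itertools import product

# Example search-space text in the format shown above, plus one comment line.
SEARCH_SPACE = """\
# tile TILE_BLOCK_M TILE_BLOCK_N TILE_BLOCK_K
tile 64 64 32
tile 128 64 32
load_latency 2 5 8
store_latency 2 5 8
"""


def parse_search_space(text):
    """Parse 'tile', 'load_latency', and 'store_latency' lines; skip '#' comments."""
    tiles, load_latencies, store_latencies = [], [], []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        key, *values = line.split()
        numbers = [int(v) for v in values]
        if key == "tile":
            tiles.append(tuple(numbers))
        elif key == "load_latency":
            load_latencies.extend(numbers)
        elif key == "store_latency":
            store_latencies.extend(numbers)
        else:
            raise ValueError(f"unknown key: {key}")
    return tiles, load_latencies, store_latencies


def expand_candidates(tiles, load_latencies, store_latencies):
    """Assumed rule: every tile is paired with every load/store latency combination."""
    return [
        {"tile": tile, "load_latency": load, "store_latency": store}
        for tile, load, store in product(tiles, load_latencies, store_latencies)
    ]


tiles, loads, stores = parse_search_space(SEARCH_SPACE)
candidates = expand_candidates(tiles, loads, stores)
print(len(candidates))  # 2 tiles * 3 load latencies * 3 store latencies
```

Under this assumed rule, the four configuration lines expand to 18 candidate configurations, which shows how quickly the search space grows as entries are added.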

--------------------------------

### Build and Install Samples on Windows (Command Line)

Source: https://github.com/nvidia/cuda-samples/blob/master/README.md

Build and install samples from the x64 Native Tools Command Prompt for VS. Replace 'Release' with 'Debug' for debug builds. The --config flag is for multi-configuration generators.

```cmd
cd build
cmake --build . --config Release
cmake --install . --config Release
```

--------------------------------

### Python Installation Instructions

Source: https://github.com/nvidia/cuda-samples/blob/master/python/2_CoreConcepts/cudaComputeLambdas/README.md

Installs the necessary packages for the cudaComputeLambdas sample, including cuda-cccl, cuda-core, cupy, and numpy.

```bash
cd /path/to/cuda-samples/python/2_CoreConcepts/cudaComputeLambdas
pip install -r requirements.txt
```

--------------------------------

### Run Zero-Copy Example (Default)

Source: https://github.com/nvidia/cuda-samples/blob/master/python/2_CoreConcepts/simpleZeroCopy/README.md

Executes the zero-copy CUDA Python example with default parameters, after setting the library path.

```bash
# Pre-steps: Set library path
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH

# Run with default parameters (1M elements)
python simpleZeroCopy.py
```

--------------------------------

### Install ptxgen and test.ll to CUDA_SAMPLES_INSTALL_DIR

Source: https://github.com/nvidia/cuda-samples/blob/master/cpp/7_libNVVM/ptxgen/CMakeLists.txt

Installs the 'ptxgen' executable and 'test.ll' file to the specified CUDA_SAMPLES_INSTALL_DIR for unified installation.

```cmake
install(TARGETS ptxgen DESTINATION ${CUDA_SAMPLES_INSTALL_DIR})
install(FILES test.ll DESTINATION ${CUDA_SAMPLES_INSTALL_DIR})
```

--------------------------------

### Include Installation Configuration

Source: https://github.com/nvidia/cuda-samples/blob/master/cpp/5_Domain_Specific/FDTD3d/CMakeLists.txt

Includes a CMake script to handle sample installation configuration. This script is located in a common directory for samples.

```cmake
include(${CMAKE_CURRENT_SOURCE_DIR}/../../../cmake/InstallSamples.cmake)
```

--------------------------------

### Basic CMake Project Setup

Source: https://github.com/nvidia/cuda-samples/blob/master/cpp/0_Introduction/UnifiedMemoryStreams/CMakeLists.txt

Sets up the minimum CMake version, module path, project name, and languages.

```cmake
cmake_minimum_required(VERSION 3.20)

list(APPEND CMAKE_MODULE_PATH "${CMAKE_CURRENT_SOURCE_DIR}/../../../cmake/Modules")

project(UnifiedMemoryStreams LANGUAGES C CXX CUDA)
```

--------------------------------

### Install pre-commit using Pip

Source: https://github.com/nvidia/cuda-samples/blob/master/CONTRIBUTING.md

Install the pre-commit tool using pip.

```bash
pip install pre-commit
```

--------------------------------

### Install Dependencies

Source: https://github.com/nvidia/cuda-samples/blob/master/python/1_GettingStarted/deviceQuery/README.md

Install the necessary Python packages for the deviceQuery sample using pip and the provided requirements.txt file. Ensure you have Python 3.10+ and CUDA Toolkit 13.0+ installed.

```bash
cd cuda-samples/python/1_GettingStarted/deviceQuery
pip install -r requirements.txt
```

--------------------------------

### Run Prefix Sum Sample

Source: https://github.com/nvidia/cuda-samples/blob/master/python/2_CoreConcepts/prefixSum/README.md

Creates and activates a virtual environment, installs dependencies, and runs the prefixSum.py sample.

```bash
# Create and activate virtual environment
python -m venv venv
source venv/bin/activate  # Linux/macOS
# venv\Scripts\activate   # Windows

# Install dependencies
pip install -r requirements.txt

# Run sample
python prefixSum.py
```

--------------------------------

### Install Requirements

Source: https://github.com/nvidia/cuda-samples/blob/master/python/2_CoreConcepts/greenContext/README.md

Installs necessary Python packages from the requirements.txt file. Navigate to the greenContext directory before running.

```bash
cd /path/to/cuda-samples/python/2_CoreConcepts/greenContext
pip install -r requirements.txt
```

--------------------------------

### Show Help for Zero-Copy Example

Source: https://github.com/nvidia/cuda-samples/blob/master/python/2_CoreConcepts/simpleZeroCopy/README.md

Displays the help message for the zero-copy CUDA Python script, showing available command-line arguments.

```bash
# Pre-steps: Set library path
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH

# Show help
python simpleZeroCopy.py --help
```

--------------------------------

### Complete CUDA Vector Addition Example

Source: https://github.com/nvidia/cuda-samples/blob/master/python/Utilities/README.md

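An example outline demonstrating device initialization, kernel compilation, launch configuration, and result verification using cuda.core and the shared utilities. The snippet assumes that 'kernel_source', 'num_elements', 'stream', the CuPy arrays 'a', 'b', and 'c', and 'expected' are defined elsewhere in the script.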

```python
import sys
from pathlib import Path

# Import utility
sys.path.insert(0, str(Path(__file__).parent.parent.parent / "Utilities"))
from cuda_samples_utils import verify_array_result

import cupy as cp
from cuda.core import Device, Program, ProgramOptions, LaunchConfig, launch

# Use cuda.core directly for device and kernel operations
device = Device(0)
device.set_current()

program_options = ProgramOptions(std="c++17", arch=f"sm_{device.arch}")
program = Program(kernel_source, code_type="c++", options=program_options)
module = program.compile("cubin", name_expressions=("kernel_name",))
kernel = module.get_kernel("kernel_name")

# Calculate grid size inline
threads_per_block = 256
blocks_per_grid = (num_elements + threads_per_block - 1) // threads_per_block

# Launch kernel - pass cupy arrays directly
config = LaunchConfig(grid=blocks_per_grid, block=threads_per_block)
launch(stream, config, kernel, a, b, c, cp.int32(num_elements))

# Verify results using utility
verify_array_result(c, expected)
```
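
The grid-size expression in the example is a ceiling division. The sketch below isolates it in a hypothetical helper, `blocks_for`, to show how it behaves at block boundaries; it needs no GPU to run.

```python
def blocks_for(num_elements, threads_per_block=256):
    """Smallest number of blocks whose threads cover num_elements (ceiling division)."""
    return (num_elements + threads_per_block - 1) // threads_per_block


for n in (1, 256, 257, 1_000_000):
    blocks = blocks_for(n)
    # Threads in the last block that have no element to process.
    surplus = blocks * 256 - n
    print(n, blocks, surplus)
```

Whenever the element count is not a multiple of the block size, the last block contains surplus threads, so a kernel launched this way typically compares its global index against the element count before touching memory.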

--------------------------------

### Install Requirements

Source: https://github.com/nvidia/cuda-samples/blob/master/python/2_CoreConcepts/processCheckpoint/README.md

Installs the necessary Python packages for the process checkpointing sample. Ensure you are in the sample's directory.

```bash
cd /path/to/cuda-samples/python/2_CoreConcepts/processCheckpoint
pip install -r requirements.txt
```

--------------------------------

### Install GLFW3 on Ubuntu

Source: https://github.com/nvidia/cuda-samples/blob/master/cpp/5_Domain_Specific/simpleVulkanMMAP/Build_instructions.txt

Installs the GLFW3 library and development headers on Ubuntu systems.

```bash
sudo apt-get install libglfw3
sudo apt-get install libglfw3-dev
```

--------------------------------

### Install Dependencies

Source: https://github.com/nvidia/cuda-samples/blob/master/python/1_GettingStarted/copyImageArraytoGPU/README.md

Install the required Python packages for the sample, including numpy, cuda-python, cuda-core, and cupy.

```bash
cd /path/to/cuda-samples/python/1_GettingStarted/copyImageArraytoGPU
pip install -r requirements.txt
```

--------------------------------

### Run Zero-Copy Example (Custom Elements)

Source: https://github.com/nvidia/cuda-samples/blob/master/python/2_CoreConcepts/simpleZeroCopy/README.md

Executes the zero-copy CUDA Python example with a custom number of elements, after setting the library path.

```bash
# Pre-steps: Set library path
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH

# Use 2M elements
python simpleZeroCopy.py --num_elements 2097152
```

--------------------------------

### Include Directories and Library Setup

Source: https://github.com/nvidia/cuda-samples/blob/master/cpp/5_Domain_Specific/postProcessGL/CMakeLists.txt

Includes common directories and sets up platform-specific include and library paths for GLUT and GLEW on Windows.

```cmake
# Include directories and libraries
include_directories(../../../Common)

if(WIN32)
    set(PC_GLUT_INCLUDE_DIRS "${CMAKE_CURRENT_SOURCE_DIR}/../../../Common")
    set(PC_GLUT_LIBRARY_DIRS "${CMAKE_CURRENT_SOURCE_DIR}/../../../Common/lib/x64")
    # The GLEW library built on Windows on Arm system is named as glew32.lib/glew32.dll by default.
    if(EXISTS "${PC_GLUT_LIBRARY_DIRS}/glew32.lib")
        set(GLEW_LIB_NAME "glew32")
    else()
        set(GLEW_LIB_NAME "glew64")
    endif()
endif()

find_package(OpenGL)
find_package(GLUT)
```

--------------------------------

### Install Dependencies

Source: https://github.com/nvidia/cuda-samples/blob/master/python/1_GettingStarted/kernelNsysProfile/README.md

Installs the necessary Python packages for the sample. Ensure you have NumPy version 2.3.2 or higher.

```bash
pip install -r requirements.txt
```

--------------------------------

### Project Setup and Package Finding

Source: https://github.com/nvidia/cuda-samples/blob/master/cpp/2_Concepts_and_Techniques/EGLStream_CUDA_Interop/CMakeLists.txt

Initializes the CMake project, sets the module path, and finds the required CUDAToolkit package. This is a standard setup for CUDA projects.

```cmake
cmake_minimum_required(VERSION 3.20)

list(APPEND CMAKE_MODULE_PATH "${CMAKE_CURRENT_SOURCE_DIR}/../../../cmake/Modules")

project(EGLStream_CUDA_Interop LANGUAGES C CXX CUDA)

find_package(CUDAToolkit REQUIRED)
```

--------------------------------

### Install Dependencies

Source: https://github.com/nvidia/cuda-samples/blob/master/python/2_CoreConcepts/matrixMulSharedMem/README.md

Changes to the sample directory, creates and activates a virtual environment, and installs the necessary Python packages with pip.

```bash
cd cuda-samples/python/2_CoreConcepts/matrixMulSharedMem
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```

--------------------------------

### Install Dependencies

Source: https://github.com/nvidia/cuda-samples/blob/master/python/2_CoreConcepts/memoryResources/README.md

Installs the necessary Python packages for running CUDA Python memory resource examples. Ensure you are in the correct directory before running.

```bash
cd /path/to/cuda-samples/python/2_CoreConcepts/memoryResources
pip install -r requirements.txt
```

--------------------------------

### Run Example with Specific SM Split

Source: https://github.com/nvidia/cuda-samples/blob/master/python/2_CoreConcepts/greenContext/README.md

Runs the greenContext.py script with a predefined SM split, allocating 112 SMs for the long kernel and 16 SMs for the critical kernel, matching a CUDA programming guide example.

```bash
python greenContext.py --split 112,16
```
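
A value such as '112,16' has to be split into two SM counts before it can be used. The sketch below shows one way to validate such a string; `parse_split` and its `total_sms` parameter are hypothetical and are not taken from greenContext.py.

```python
def parse_split(value, total_sms=None):
    """Parse 'LONG,CRITICAL' into two positive SM counts.

    total_sms is an optional, hypothetical upper bound (the number of SMs on the device).
    """
    parts = value.split(",")
    if len(parts) != 2:
        raise ValueError("expected two comma-separated SM counts, e.g. '112,16'")
    long_sms, critical_sms = (int(part) for part in parts)
    if long_sms <= 0 or critical_sms <= 0:
        raise ValueError("SM counts must be positive")
    if total_sms is not None and long_sms + critical_sms > total_sms:
        raise ValueError("split requests more SMs than the device provides")
    return long_sms, critical_sms


long_sms, critical_sms = parse_split("112,16")
print(long_sms, critical_sms, long_sms + critical_sms)
```

The split from the command above requests 128 SMs in total, so it only makes sense on a device with at least that many.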

--------------------------------

### Run the Sample

Source: https://github.com/nvidia/cuda-samples/blob/master/python/1_GettingStarted/copyImageArraytoGPU/README.md

Navigate to the sample directory and execute the Python script to perform the image array copy to the GPU.

```bash
cd samples/python/1_GettingStarted/copyImageArraytoGPU
python copyImageArraytoGPU.py
```

--------------------------------

### Install Shared Memory LL Files

Source: https://github.com/nvidia/cuda-samples/blob/master/cpp/7_libNVVM/cuda-shared-memory/CMakeLists.txt

Installs shared_memory.ll and extern_shared_memory.ll. Installs to CUDA_SAMPLES_INSTALL_DIR if defined, otherwise to bin/cuda-shared-memory.

```cmake
if(DEFINED CUDA_SAMPLES_INSTALL_DIR)
    install(FILES shared_memory.ll DESTINATION ${CUDA_SAMPLES_INSTALL_DIR}/cuda-shared-memory)
    install(FILES extern_shared_memory.ll DESTINATION ${CUDA_SAMPLES_INSTALL_DIR}/cuda-shared-memory)
else()
    install(FILES shared_memory.ll DESTINATION bin/cuda-shared-memory)
    install(FILES extern_shared_memory.ll DESTINATION bin/cuda-shared-memory)
endif()
```

--------------------------------

### Install Syscalls LLVM IR Files (Conditional)

Source: https://github.com/nvidia/cuda-samples/blob/master/cpp/7_libNVVM/syscalls/CMakeLists.txt

Conditionally installs 'malloc-free.ll' and 'vprintf.ll' files. If CUDA_SAMPLES_INSTALL_DIR is defined, they are installed to the unified samples directory; otherwise, they are installed to a standalone 'bin/syscalls' directory.

```cmake
if(DEFINED CUDA_SAMPLES_INSTALL_DIR)
    install(FILES malloc-free.ll DESTINATION ${CUDA_SAMPLES_INSTALL_DIR}/syscalls)
    install(FILES vprintf.ll DESTINATION ${CUDA_SAMPLES_INSTALL_DIR}/syscalls)
else()
    install(FILES malloc-free.ll DESTINATION bin/syscalls)
    install(FILES vprintf.ll DESTINATION bin/syscalls)
endif()
```

--------------------------------

### Install Dependencies

Source: https://github.com/nvidia/cuda-samples/blob/master/python/3_FrameworkInterop/customPyTorchKernel/README.md

Install required packages for the custom PyTorch kernel. On Windows, ensure a CUDA-enabled PyTorch build is installed; the second command installs one from the PyTorch wheel index for CUDA 12.8 ('cu128').

```bash
cd python/3_FrameworkInterop/customPyTorchKernel
pip install -r requirements.txt
```

```bash
pip install torch --index-url https://download.pytorch.org/whl/cu128
```

--------------------------------

### Display all tileMatmul command-line options

Source: https://github.com/nvidia/cuda-samples/blob/master/cpp/9_CUDA_Tile/tileMatmul/README.md

Show all available command-line options for the tileMatmul sample by using the --help flag.

```bash
./tileMatmul --help
```

--------------------------------

### Install Targets

Source: https://github.com/nvidia/cuda-samples/blob/master/cpp/7_libNVVM/cuda-c-linking/CMakeLists.txt

Installs the built targets (executable and library) to a specified directory. It prioritizes CUDA_SAMPLES_INSTALL_DIR if defined, otherwise installs to 'bin'.

```cmake
# Install to CUDA_SAMPLES_INSTALL_DIR if defined (for unified installation),
# otherwise install to bin (for standalone libNVVM build)
if(DEFINED CUDA_SAMPLES_INSTALL_DIR)
    install(TARGETS cuda-c-linking mathfuncs64 DESTINATION ${CUDA_SAMPLES_INSTALL_DIR})
else()
    install(TARGETS cuda-c-linking mathfuncs64 DESTINATION bin)
endif()
```

--------------------------------

### Install CMake on Debian/Ubuntu

Source: https://github.com/nvidia/cuda-samples/blob/master/README.md

Installs CMake using apt. The samples require CMake 3.20 or later, so confirm the installed version (for example with 'cmake --version') before building.

```bash
sudo apt install cmake
```

--------------------------------

### Conditional Installation of NVVM DLL

Source: https://github.com/nvidia/cuda-samples/blob/master/cpp/7_libNVVM/CMakeLists.txt

Installs the NVVM DLL to a specified directory on Windows if NVVM_DLL is defined. It prioritizes CUDA_SAMPLES_INSTALL_DIR if set, otherwise installs to 'bin'.

```cmake
if (WIN32 AND NVVM_DLL)
  if(DEFINED CUDA_SAMPLES_INSTALL_DIR)
    install(FILES ${NVVM_DLL} DESTINATION ${CUDA_SAMPLES_INSTALL_DIR})
  else()
    install(FILES ${NVVM_DLL} DESTINATION bin)
  endif()
endif()
```

--------------------------------

### Run on Linux/macOS

Source: https://github.com/nvidia/cuda-samples/blob/master/python/4_DistributedComputing/multiGPUGradientAverage/README.md

Launch the sample using mpirun with at least 2 processes. Specify the number of processes and the script to run.

```bash
# Single node (2 GPUs)
mpirun -np 2 python multiGPUGradientAverage.py --size 10000

# Single node (4 GPUs)
mpirun -np 4 python multiGPUGradientAverage.py --size 10000

# With specific GPUs
CUDA_VISIBLE_DEVICES=0,2 mpirun -np 2 python multiGPUGradientAverage.py
```

--------------------------------

### Installation Rules for Executable and LLVM IR

Source: https://github.com/nvidia/cuda-samples/blob/master/cpp/7_libNVVM/device-side-launch/CMakeLists.txt

Defines where the 'dsl' executable and its associated LLVM IR file ('dsl-gpu64.ll') should be installed. It supports installation into a unified samples directory or a standalone 'bin' directory.

```cmake
if(DEFINED CUDA_SAMPLES_INSTALL_DIR)
    install(TARGETS dsl DESTINATION ${CUDA_SAMPLES_INSTALL_DIR})
    install(FILES dsl-gpu64.ll DESTINATION ${CUDA_SAMPLES_INSTALL_DIR})
else()
    install(TARGETS dsl DESTINATION bin)
    install(FILES dsl-gpu64.ll DESTINATION bin)
endif()
```

--------------------------------

### Run Memory Resources Demo with Custom Parameters

Source: https://github.com/nvidia/cuda-samples/blob/master/python/2_CoreConcepts/memoryResources/README.md

Runs the memory resources demonstration script with custom configurations. Use the --elements flag to specify buffer size or --device to select a specific GPU.

```bash
# Larger buffer size
python memoryResources.py --elements 1048576

# Use a specific GPU
python memoryResources.py --device 1
```

--------------------------------

### Run on Windows

Source: https://github.com/nvidia/cuda-samples/blob/master/python/4_DistributedComputing/multiGPUGradientAverage/README.md

Launch the sample using mpiexec with at least 2 processes. Ensure the Microsoft MPI bin directory is in your PATH or provide the full path to mpiexec.

```powershell
& "C:\Program Files\Microsoft MPI\Bin\mpiexec.exe" -n 2 `
    python multiGPUGradientAverage.py --size 10000
```

--------------------------------

### Install Dependencies

Source: https://github.com/nvidia/cuda-samples/blob/master/python/1_GettingStarted/vectorAdd/README.md

Install the necessary Python packages for CUDA development and GPU array manipulation.

```bash
cd /path/to/cuda-samples/python/1_GettingStarted/vectorAdd
pip install -r requirements.txt
```

--------------------------------

### Build CUDA Samples

Source: https://github.com/nvidia/cuda-samples/blob/master/README.md

Commands to build all CUDA samples. Ensure you are in the root directory of the samples project.

```bash
mkdir build
cd build
cmake ..
make -j$(nproc)
```

--------------------------------

### Install pre-commit using Conda

Source: https://github.com/nvidia/cuda-samples/blob/master/CONTRIBUTING.md

Install the pre-commit tool using the conda package manager.

```bash
conda config --add channels conda-forge
conda install pre-commit
```

--------------------------------

### Run JIT LTO Linking Sample with Custom Parameters

Source: https://github.com/nvidia/cuda-samples/blob/master/python/2_CoreConcepts/jitLtoLinking/README.md

Executes the JIT LTO linking sample with custom parameters for elements and device selection.

```bash
# Larger element count
python jitLtoLinking.py --elements 1048576

# Use a specific GPU
python jitLtoLinking.py --device 1
```

--------------------------------

### Run Simple P2P with Default Parameters

Source: https://github.com/nvidia/cuda-samples/blob/master/python/4_DistributedComputing/simpleP2P/README.md

Execute the script with default settings for array size (16M elements).

```bash
# Run with default parameters (16M elements = 64MB)
python simpleP2P.py
```
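
The "16M elements = 64MB" figure in the comment can be checked with a short calculation. The sketch assumes 4-byte elements (for example float32) and binary units (1M = 2^20); neither is stated explicitly above.

```python
ELEMENT_SIZE_BYTES = 4  # assumption: 4-byte elements such as float32

num_elements = 16 * 1024 * 1024                 # 16M elements
size_bytes = num_elements * ELEMENT_SIZE_BYTES  # total buffer size in bytes
size_mib = size_bytes // (1024 * 1024)          # convert to MiB

print(num_elements, size_bytes, size_mib)
```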

--------------------------------

### Install Requirements

Source: https://github.com/nvidia/cuda-samples/blob/master/python/2_CoreConcepts/jitLtoLinking/README.md

Installs necessary Python packages for the CUDA JIT and LTO sample using pip.

```bash
cd /path/to/cuda-samples/python/2_CoreConcepts/jitLtoLinking
pip install -r requirements.txt
```

--------------------------------

### Single Run Sample Configuration

Source: https://github.com/nvidia/cuda-samples/blob/master/README.md

Configure an executable to run once with specified arguments. Arguments are appended to the command line, separated by spaces. Paths are relative to the executable's directory.

```json
{
    "ptxgen": {
        "args": [
            "test.ll",
            "-arch=compute_75"
        ]
    }
}
```
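
To illustrate how such an entry maps to a command line, the sketch below loads the JSON and joins the executable name with its arguments. It is not the repository's test runner: `build_commands` and the 'build/bin' directory are hypothetical.

```python
import json
import shlex
from pathlib import PurePosixPath

# The configuration shown above.
CONFIG_TEXT = """
{
    "ptxgen": {
        "args": [
            "test.ll",
            "-arch=compute_75"
        ]
    }
}
"""


def build_commands(config, bin_dir):
    """Map each executable name to its argv list: executable path followed by its args."""
    commands = {}
    for name, entry in config.items():
        executable = PurePosixPath(bin_dir) / name
        commands[name] = [str(executable), *entry.get("args", [])]
    return commands


config = json.loads(CONFIG_TEXT)
commands = build_commands(config, "build/bin")  # hypothetical install directory

# Relative arguments such as test.ll are resolved from the executable's directory,
# so a runner would launch the command with that directory as the working directory.
print(shlex.join(commands["ptxgen"]))
```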

--------------------------------

### Customize Installation Prefix with CMake

Source: https://github.com/nvidia/cuda-samples/blob/master/README.md

Use CMAKE_INSTALL_PREFIX to change the root installation directory. The default is 'build/bin'.

```bash
cmake -DCMAKE_INSTALL_PREFIX=/custom/path ..
```

--------------------------------

### Library and Include Path Setup

Source: https://github.com/nvidia/cuda-samples/blob/master/cpp/8_Platform_Specific/Tegra/simpleGLES/CMakeLists.txt

Includes the common directory and locates EGL, X11, and OpenGL, the graphics and windowing packages this Tegra sample depends on.

```cmake
# Include directories and libraries
include_directories(../../../../Common)

find_package(EGL)
find_package(X11)
find_package(OpenGL)
```

--------------------------------

### Basic CMake Project Setup

Source: https://github.com/nvidia/cuda-samples/blob/master/cpp/2_Concepts_and_Techniques/scan/CMakeLists.txt

Sets the minimum CMake version, appends module paths, and defines the project name and languages, including CUDA.

```cmake
cmake_minimum_required(VERSION 3.20)

list(APPEND CMAKE_MODULE_PATH "${CMAKE_CURRENT_SOURCE_DIR}/../../../cmake/Modules")

project(scan LANGUAGES C CXX CUDA)
```

--------------------------------

### Basic CMake Configuration for Device-Side Launch

Source: https://github.com/nvidia/cuda-samples/blob/master/cpp/7_libNVVM/device-side-launch/CMakeLists.txt

Sets the install RPATH, passes the device runtime library path to dsl.c as a compile definition, defines the 'dsl' executable, registers it as a test, and links it against the NVVM and CUDA libraries.

```cmake
set(CMAKE_INSTALL_RPATH ${LIBNVVM_HOME})
set(CMAKE_INCLUDE_CURRENT_DIR YES)
set_property(SOURCE dsl.c
             PROPERTY COMPILE_DEFINITIONS LIBCUDADEVRT="${CUDADEVRT_LIB}")

add_executable(dsl dsl.c)

add_test(NAME device-side-launch COMMAND dsl
         WORKING_DIRECTORY "${CMAKE_CURRENT_BINARY_DIR}")
target_link_libraries(dsl ${NVVM_LIB} ${CUDA_LIB})
```

--------------------------------

### Basic CMake Setup and Project Definition

Source: https://github.com/nvidia/cuda-samples/blob/master/cpp/5_Domain_Specific/nbody/CMakeLists.txt

Sets the minimum CMake version, appends module paths, and defines the project name and languages (C, CXX, CUDA). It also finds the required CUDA Toolkit.

```cmake
cmake_minimum_required(VERSION 3.20)

list(APPEND CMAKE_MODULE_PATH "${CMAKE_CURRENT_SOURCE_DIR}/../../../cmake/Modules")

project(nbody LANGUAGES C CXX CUDA)

find_package(CUDAToolkit REQUIRED)
```

--------------------------------

### Manual Package Installation

Source: https://github.com/nvidia/cuda-samples/blob/master/python/2_CoreConcepts/simpleZeroCopy/README.md

Manually installs NumPy and the CUDA Python packages at the minimum versions the sample requires.

```bash
pip install "numpy>=2.3.2" "cuda-core>=1.0.0" "cuda-python>=13.0.0"
```

--------------------------------

### Run Launch Configuration Tuning Script

Source: https://github.com/nvidia/cuda-samples/blob/master/python/2_CoreConcepts/launchConfigTuning/README.md

Execute the main Python script to perform launch configuration tuning benchmarks.

```bash
python launchConfigTuning.py
```

--------------------------------

### Install CUDA Python Requirements

Source: https://github.com/nvidia/cuda-samples/blob/master/python/Utilities/README.md

Installs the common CUDA 13 stack including cuda-python, cuda-core, cupy-cuda13x, and numpy.

```bash
cd /path/to/cuda-samples/python
pip install -r requirements.txt
```

--------------------------------

### Run Basic Vector Addition

Source: https://github.com/nvidia/cuda-samples/blob/master/python/1_GettingStarted/vectorAdd/README.md

Execute the vector addition sample script with default parameters.

```bash
cd samples/python/1_GettingStarted/vectorAdd
python vectorAdd.py
```

--------------------------------

### Install Dependencies

Source: https://github.com/nvidia/cuda-samples/blob/master/python/4_DistributedComputing/ipcMemoryPool/README.md

Install the required Python packages for the IPC memory pool sample. This includes cuda-python, cuda-core, and cupy.

```bash
cd /path/to/cuda-samples/python/4_DistributedComputing/ipcMemoryPool
pip install -r requirements.txt
```

--------------------------------

### Run tileMatmul with default settings

Source: https://github.com/nvidia/cuda-samples/blob/master/cpp/9_CUDA_Tile/tileMatmul/README.md

Execute the tileMatmul sample with default warmup and benchmark iterations. Validation is off by default.

```bash
./tileMatmul
```

--------------------------------

### Python Installation Requirements

Source: https://github.com/nvidia/cuda-samples/blob/master/python/2_CoreConcepts/cudaGraphs/README.md

Installs necessary Python packages for the CUDA graphs sample, including cuda-python, cuda-core, and cupy.

```bash
cd /path/to/cuda-samples/python/2_CoreConcepts/cudaGraphs
pip install -r requirements.txt
```

--------------------------------

### Running with Custom Device Parameter

Source: https://github.com/nvidia/cuda-samples/blob/master/python/2_CoreConcepts/cudaComputeLambdas/README.md

Launches the cudaComputeLambdas sample, specifying a custom CUDA device ID.

```bash
python cudaComputeLambdas.py --device 1
```

--------------------------------

### Install Dependencies

Source: https://github.com/nvidia/cuda-samples/blob/master/python/2_CoreConcepts/tmaTensorMap/README.md

Install the required Python packages for the tmaTensorMap sample. Ensure you have CUDA Toolkit 13.0 or newer with libcudacxx headers.

```bash
cd /path/to/cuda-samples/python/2_CoreConcepts/tmaTensorMap
pip install -r requirements.txt
```