### Sample Installation Setup

Source: https://github.com/nvidia/cuda-samples/blob/master/cpp/0_Introduction/UnifiedMemoryStreams/CMakeLists.txt

Includes a CMake module to set up the installation configuration for the sample.

```cmake
include(${CMAKE_CURRENT_SOURCE_DIR}/../../../cmake/InstallSamples.cmake)
setup_samples_install()
```

--------------------------------

### Setup Sample Installation

Source: https://github.com/nvidia/cuda-samples/blob/master/cpp/5_Domain_Specific/FDTD3d/CMakeLists.txt

Calls a function to set up the installation rules for the sample. This is part of the common installation procedure for NVIDIA samples.

```cmake
setup_samples_install()
```

--------------------------------

### Include Directories and Installation Setup

Source: https://github.com/nvidia/cuda-samples/blob/master/cpp/9_CUDA_Tile/tileMatmul/CMakeLists.txt

Includes necessary directories for common headers and sets up sample installation rules. This ensures that the project can find its dependencies and be installed correctly.

```cmake
# Include directories and libraries
include_directories(../../../Common ../Benchmark_Common)

# Include installation configuration
include(${CMAKE_CURRENT_SOURCE_DIR}/../../../cmake/InstallSamples.cmake)
setup_samples_install()
```

--------------------------------

### Install Samples on Linux

Source: https://github.com/nvidia/cuda-samples/blob/master/README.md

After configuring and building, navigate to the build directory and run 'make install' to install the samples on Linux.

```bash
cd build/
make install
```

--------------------------------

### Sample Installation Configuration

Source: https://github.com/nvidia/cuda-samples/blob/master/cpp/0_Introduction/cudaOpenMP/CMakeLists.txt

This includes a CMake module to set up the installation rules for the samples, ensuring they can be installed correctly after building.
```cmake
# Include installation configuration
include(${CMAKE_CURRENT_SOURCE_DIR}/../../../cmake/InstallSamples.cmake)
setup_samples_install()
```

--------------------------------

### Install Dependencies and Run Sample

Source: https://github.com/nvidia/cuda-samples/blob/master/python/1_GettingStarted/simplePrint/README.md

Installs project dependencies from requirements.txt and then executes the simplePrint Python script.

```bash
# Install dependencies
pip install -r requirements.txt

# Run the sample
python simplePrint.py
```

--------------------------------

### Basic CMake Project Setup

Source: https://github.com/nvidia/cuda-samples/blob/master/cpp/5_Domain_Specific/dwtHaar1D/CMakeLists.txt

Sets the minimum CMake version, adds module paths, defines the project name and languages, and finds the CUDA toolkit. This is a standard starting point for CUDA projects.

```cmake
cmake_minimum_required(VERSION 3.20)

list(APPEND CMAKE_MODULE_PATH "${CMAKE_CURRENT_SOURCE_DIR}/../../../cmake/Modules")

project(dwtHaar1D LANGUAGES C CXX CUDA)

find_package(CUDAToolkit REQUIRED)
```

--------------------------------

### Include Sample Installation Logic

Source: https://github.com/nvidia/cuda-samples/blob/master/CMakeLists.txt

Includes an external CMake file that handles the installation configuration for samples.

```cmake
include(cmake/InstallSamples.cmake)
```

--------------------------------

### Install Dependencies

Source: https://github.com/nvidia/cuda-samples/blob/master/python/1_GettingStarted/systemInfo/README.md

Install the required packages for the system information sample using pip.

```bash
cd /path/to/cuda-samples/python/1_GettingStarted/systemInfo
pip install -r requirements.txt
```

--------------------------------

### Target Properties and Installation Setup

Source: https://github.com/nvidia/cuda-samples/blob/master/cpp/0_Introduction/template/CMakeLists.txt

Enables separable compilation for CUDA targets and includes a script for sample installation configuration.
```cmake
set_target_properties(template PROPERTIES CUDA_SEPARABLE_COMPILATION ON)

# Include installation configuration
include(${CMAKE_CURRENT_SOURCE_DIR}/../../../cmake/InstallSamples.cmake)
setup_samples_install()
```

--------------------------------

### Installation Rules

Source: https://github.com/nvidia/cuda-samples/blob/master/cpp/7_libNVVM/simple/CMakeLists.txt

Configures installation paths for the 'simple' executable and its associated '.ll' file. It installs to a unified directory if CUDA_SAMPLES_INSTALL_DIR is defined, otherwise to a 'bin' directory.

```cmake
if(DEFINED CUDA_SAMPLES_INSTALL_DIR)
    install(TARGETS simple DESTINATION ${CUDA_SAMPLES_INSTALL_DIR})
    install(FILES simple-gpu64.ll DESTINATION ${CUDA_SAMPLES_INSTALL_DIR})
else()
    install(TARGETS simple DESTINATION bin)
    install(FILES simple-gpu64.ll DESTINATION bin)
endif()
```

--------------------------------

### Install Dependencies

Source: https://github.com/nvidia/cuda-samples/blob/master/python/3_FrameworkInterop/customTensorFlowKernel/README.md

Navigate to the sample directory and install required Python packages using the provided requirements file.

```bash
cd python/3_FrameworkInterop/customTensorFlowKernel
pip install -r requirements.txt
```

--------------------------------

### Search Space Configuration Example

Source: https://github.com/nvidia/cuda-samples/blob/master/cpp/9_CUDA_Tile/tileMatmulAutotuner/README.md

Define tile sizes and latency values for the autotuner. Lines starting with '#' are comments. Each 'tile' line specifies TILE_BLOCK_M, TILE_BLOCK_N, and TILE_BLOCK_K. Latency values are combined with every tile entry.

```text
tile 64 64 32
tile 128 64 32
load_latency 2 5 8
store_latency 2 5 8
```

--------------------------------

### Build and Install Samples on Windows (Command Line)

Source: https://github.com/nvidia/cuda-samples/blob/master/README.md

Build and install samples from the x64 Native Tools Command Prompt for VS. Replace 'Release' with 'Debug' for debug builds.
The --config flag is for multi-configuration generators.

```cmd
cd build
cmake --build . --config Release
cmake --install . --config Release
```

--------------------------------

### Python Installation Instructions

Source: https://github.com/nvidia/cuda-samples/blob/master/python/2_CoreConcepts/cudaComputeLambdas/README.md

Installs the necessary packages for the cudaComputeLambdas sample, including cuda-cccl, cuda-core, cupy, and numpy.

```bash
cd /path/to/cuda-samples/python/2_CoreConcepts/cudaComputeLambdas
pip install -r requirements.txt
```

--------------------------------

### Run Zero-Copy Example (Default)

Source: https://github.com/nvidia/cuda-samples/blob/master/python/2_CoreConcepts/simpleZeroCopy/README.md

Executes the zero-copy CUDA Python example with default parameters, after setting the library path.

```bash
# Pre-steps: Set library path
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH

# Run with default parameters (1M elements)
python simpleZeroCopy.py
```

--------------------------------

### Install ptxgen and test.ll to CUDA_SAMPLES_INSTALL_DIR

Source: https://github.com/nvidia/cuda-samples/blob/master/cpp/7_libNVVM/ptxgen/CMakeLists.txt

Installs the 'ptxgen' executable and 'test.ll' file to the specified CUDA_SAMPLES_INSTALL_DIR for unified installation.

```cmake
install(TARGETS ptxgen DESTINATION ${CUDA_SAMPLES_INSTALL_DIR})
install(FILES test.ll DESTINATION ${CUDA_SAMPLES_INSTALL_DIR})
```

--------------------------------

### Include Installation Configuration

Source: https://github.com/nvidia/cuda-samples/blob/master/cpp/5_Domain_Specific/FDTD3d/CMakeLists.txt

Includes a CMake script to handle sample installation configuration. This script is located in a common directory for samples.
```cmake
include(${CMAKE_CURRENT_SOURCE_DIR}/../../../cmake/InstallSamples.cmake)
```

--------------------------------

### Basic CMake Project Setup

Source: https://github.com/nvidia/cuda-samples/blob/master/cpp/0_Introduction/UnifiedMemoryStreams/CMakeLists.txt

Sets up the minimum CMake version, module path, project name, and languages.

```cmake
cmake_minimum_required(VERSION 3.20)

list(APPEND CMAKE_MODULE_PATH "${CMAKE_CURRENT_SOURCE_DIR}/../../../cmake/Modules")

project(UnifiedMemoryStreams LANGUAGES C CXX CUDA)
```

--------------------------------

### Install pre-commit using Pip

Source: https://github.com/nvidia/cuda-samples/blob/master/CONTRIBUTING.md

Install the pre-commit tool using pip.

```bash
pip install pre-commit
```

--------------------------------

### Install Dependencies

Source: https://github.com/nvidia/cuda-samples/blob/master/python/1_GettingStarted/deviceQuery/README.md

Install the necessary Python packages for the deviceQuery sample using pip and the provided requirements.txt file. Ensure you have Python 3.10+ and CUDA Toolkit 13.0+ installed.

```bash
cd cuda-samples/python/1_GettingStarted/deviceQuery
pip install -r requirements.txt
```

--------------------------------

### Run Prefix Sum Sample

Source: https://github.com/nvidia/cuda-samples/blob/master/python/2_CoreConcepts/prefixSum/README.md

Installs dependencies and runs the prefixSum.py sample. Ensure you have created and activated a virtual environment first.

```bash
# Create and activate virtual environment
python -m venv venv
source venv/bin/activate  # Linux/macOS
# venv\Scripts\activate   # Windows

# Install dependencies
pip install -r requirements.txt

# Run sample
python prefixSum.py
```

--------------------------------

### Install Requirements

Source: https://github.com/nvidia/cuda-samples/blob/master/python/2_CoreConcepts/greenContext/README.md

Installs necessary Python packages from the requirements.txt file. Navigate to the greenContext directory before running.
```bash
cd /path/to/cuda-samples/python/2_CoreConcepts/greenContext
pip install -r requirements.txt
```

--------------------------------

### Show Help for Zero-Copy Example

Source: https://github.com/nvidia/cuda-samples/blob/master/python/2_CoreConcepts/simpleZeroCopy/README.md

Displays the help message for the zero-copy CUDA Python script, showing available command-line arguments.

```bash
# Pre-steps: Set library path
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH

# Show help
python simpleZeroCopy.py --help
```

--------------------------------

### Complete CUDA Vector Addition Example

Source: https://github.com/nvidia/cuda-samples/blob/master/python/Utilities/README.md

An example demonstrating device initialization, kernel compilation, launch configuration, and result verification using cuda.core and utilities. The snippet assumes that `kernel_source`, `stream`, `a`, `b`, `c`, `num_elements`, and `expected` are defined elsewhere.

```python
import sys
from pathlib import Path

# Import utility
sys.path.insert(0, str(Path(__file__).parent.parent.parent / "Utilities"))
from cuda_samples_utils import verify_array_result

import cupy as cp
from cuda.core import Device, Program, ProgramOptions, LaunchConfig, launch

# Use cuda.core directly for device and kernel operations
device = Device(0)
device.set_current()

program_options = ProgramOptions(std="c++17", arch=f"sm_{device.arch}")
program = Program(kernel_source, code_type="c++", options=program_options)
module = program.compile("cubin", name_expressions=("kernel_name",))
kernel = module.get_kernel("kernel_name")

# Calculate grid size inline
threads_per_block = 256
blocks_per_grid = (num_elements + threads_per_block - 1) // threads_per_block

# Launch kernel - pass cupy arrays directly
config = LaunchConfig(grid=blocks_per_grid, block=threads_per_block)
launch(stream, config, kernel, a, b, c, cp.int32(num_elements))

# Verify results using utility
verify_array_result(c, expected)
```

--------------------------------

### Install Requirements

Source:
https://github.com/nvidia/cuda-samples/blob/master/python/2_CoreConcepts/processCheckpoint/README.md

Installs the necessary Python packages for the process checkpointing sample. Ensure you are in the sample's directory.

```bash
cd /path/to/cuda-samples/python/2_CoreConcepts/processCheckpoint
pip install -r requirements.txt
```

--------------------------------

### Install GLFW3 on Ubuntu

Source: https://github.com/nvidia/cuda-samples/blob/master/cpp/5_Domain_Specific/simpleVulkanMMAP/Build_instructions.txt

Installs the GLFW3 library and development headers on Ubuntu systems.

```bash
sudo apt-get install libglfw3
sudo apt-get install libglfw3-dev
```

--------------------------------

### Install Dependencies

Source: https://github.com/nvidia/cuda-samples/blob/master/python/1_GettingStarted/copyImageArraytoGPU/README.md

Install the required Python packages for the sample, including numpy, cuda-python, cuda-core, and cupy.

```bash
cd /path/to/cuda-samples/python/1_GettingStarted/copyImageArraytoGPU
pip install -r requirements.txt
```

--------------------------------

### Run Zero-Copy Example (Custom Elements)

Source: https://github.com/nvidia/cuda-samples/blob/master/python/2_CoreConcepts/simpleZeroCopy/README.md

Executes the zero-copy CUDA Python example with a custom number of elements, after setting the library path.

```bash
# Pre-steps: Set library path
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH

# Use 2M elements
python simpleZeroCopy.py --num_elements 2097152
```

--------------------------------

### Include Directories and Library Setup

Source: https://github.com/nvidia/cuda-samples/blob/master/cpp/5_Domain_Specific/postProcessGL/CMakeLists.txt

Includes common directories and sets up platform-specific include and library paths for GLUT and GLEW on Windows.
```cmake
# Include directories and libraries
include_directories(../../../Common)

if(WIN32)
    set(PC_GLUT_INCLUDE_DIRS "${CMAKE_CURRENT_SOURCE_DIR}/../../../Common")
    set(PC_GLUT_LIBRARY_DIRS "${CMAKE_CURRENT_SOURCE_DIR}/../../../Common/lib/x64")
    # The GLEW library built on Windows on Arm system is named as glew32.lib/glew32.dll by default.
    if(EXISTS "${PC_GLUT_LIBRARY_DIRS}/glew32.lib")
        set(GLEW_LIB_NAME "glew32")
    else()
        set(GLEW_LIB_NAME "glew64")
    endif()
endif()

find_package(OpenGL)
find_package(GLUT)
```

--------------------------------

### Install Dependencies

Source: https://github.com/nvidia/cuda-samples/blob/master/python/1_GettingStarted/kernelNsysProfile/README.md

Installs the necessary Python packages for the sample. Ensure you have NumPy version 2.3.2 or higher.

```bash
pip install -r requirements.txt
```

--------------------------------

### Project Setup and Package Finding

Source: https://github.com/nvidia/cuda-samples/blob/master/cpp/2_Concepts_and_Techniques/EGLStream_CUDA_Interop/CMakeLists.txt

Initializes the CMake project, sets the module path, and finds the required CUDAToolkit package. This is a standard setup for CUDA projects.

```cmake
cmake_minimum_required(VERSION 3.20)

list(APPEND CMAKE_MODULE_PATH "${CMAKE_CURRENT_SOURCE_DIR}/../../../cmake/Modules")

project(EGLStream_CUDA_Interop LANGUAGES C CXX CUDA)

find_package(CUDAToolkit REQUIRED)
```

--------------------------------

### Install Dependencies

Source: https://github.com/nvidia/cuda-samples/blob/master/python/2_CoreConcepts/matrixMulSharedMem/README.md

Installs the necessary Python packages for the project using pip. Ensure you are in the project's directory and have activated the virtual environment.
```bash
cd cuda-samples/python/2_CoreConcepts/matrixMulSharedMem
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```

--------------------------------

### Install Dependencies

Source: https://github.com/nvidia/cuda-samples/blob/master/python/2_CoreConcepts/memoryResources/README.md

Installs the necessary Python packages for running CUDA Python memory resource examples. Ensure you are in the correct directory before running.

```bash
cd /path/to/cuda-samples/python/2_CoreConcepts/memoryResources
pip install -r requirements.txt
```

--------------------------------

### Run Example with Specific SM Split

Source: https://github.com/nvidia/cuda-samples/blob/master/python/2_CoreConcepts/greenContext/README.md

Runs the greenContext.py script with a predefined SM split, allocating 112 SMs for the long kernel and 16 SMs for the critical kernel, matching a CUDA programming guide example.

```bash
python greenContext.py --split 112,16
```

--------------------------------

### Run the Sample

Source: https://github.com/nvidia/cuda-samples/blob/master/python/1_GettingStarted/copyImageArraytoGPU/README.md

Navigate to the sample directory and execute the Python script to perform the image array copy to the GPU.

```bash
cd samples/python/1_GettingStarted/copyImageArraytoGPU
python copyImageArraytoGPU.py
```

--------------------------------

### Install Shared Memory LL Files

Source: https://github.com/nvidia/cuda-samples/blob/master/cpp/7_libNVVM/cuda-shared-memory/CMakeLists.txt

Installs shared_memory.ll and extern_shared_memory.ll. Installs to CUDA_SAMPLES_INSTALL_DIR if defined, otherwise to bin/cuda-shared-memory.
```cmake
if(DEFINED CUDA_SAMPLES_INSTALL_DIR)
    install(FILES shared_memory.ll DESTINATION ${CUDA_SAMPLES_INSTALL_DIR}/cuda-shared-memory)
    install(FILES extern_shared_memory.ll DESTINATION ${CUDA_SAMPLES_INSTALL_DIR}/cuda-shared-memory)
else()
    install(FILES shared_memory.ll DESTINATION bin/cuda-shared-memory)
    install(FILES extern_shared_memory.ll DESTINATION bin/cuda-shared-memory)
endif()
```

--------------------------------

### Install Syscalls LLVM IR Files (Conditional)

Source: https://github.com/nvidia/cuda-samples/blob/master/cpp/7_libNVVM/syscalls/CMakeLists.txt

Conditionally installs 'malloc-free.ll' and 'vprintf.ll' files. If CUDA_SAMPLES_INSTALL_DIR is defined, they are installed to the unified samples directory; otherwise, they are installed to a standalone 'bin/syscalls' directory.

```cmake
if(DEFINED CUDA_SAMPLES_INSTALL_DIR)
    install(FILES malloc-free.ll DESTINATION ${CUDA_SAMPLES_INSTALL_DIR}/syscalls)
    install(FILES vprintf.ll DESTINATION ${CUDA_SAMPLES_INSTALL_DIR}/syscalls)
else()
    install(FILES malloc-free.ll DESTINATION bin/syscalls)
    install(FILES vprintf.ll DESTINATION bin/syscalls)
endif()
```

--------------------------------

### Install Dependencies

Source: https://github.com/nvidia/cuda-samples/blob/master/python/3_FrameworkInterop/customPyTorchKernel/README.md

Install required packages for the custom PyTorch kernel. For Windows, ensure a CUDA-enabled PyTorch build is installed.

```bash
cd python/3_FrameworkInterop/customPyTorchKernel
pip install -r requirements.txt
```

```bash
pip install torch --index-url https://download.pytorch.org/whl/cu128
```

--------------------------------

### Display all tileMatmul command-line options

Source: https://github.com/nvidia/cuda-samples/blob/master/cpp/9_CUDA_Tile/tileMatmul/README.md

Show all available command-line options for the tileMatmul sample by using the --help flag.
```bash
./tileMatmul --help
```

--------------------------------

### Install Targets

Source: https://github.com/nvidia/cuda-samples/blob/master/cpp/7_libNVVM/cuda-c-linking/CMakeLists.txt

Installs the built targets (executable and library) to a specified directory. It prioritizes CUDA_SAMPLES_INSTALL_DIR if defined, otherwise installs to 'bin'.

```cmake
# Install to CUDA_SAMPLES_INSTALL_DIR if defined (for unified installation),
# otherwise install to bin (for standalone libNVVM build)
if(DEFINED CUDA_SAMPLES_INSTALL_DIR)
    install(TARGETS cuda-c-linking mathfuncs64 DESTINATION ${CUDA_SAMPLES_INSTALL_DIR})
else()
    install(TARGETS cuda-c-linking mathfuncs64 DESTINATION bin)
endif()
```

--------------------------------

### Install CMake on Debian/Ubuntu

Source: https://github.com/nvidia/cuda-samples/blob/master/README.md

Installs CMake using apt. The samples require CMake 3.20 or later, so confirm that the packaged version meets this before proceeding with sample builds.

```bash
sudo apt install cmake
```

--------------------------------

### Conditional Installation of NVVM DLL

Source: https://github.com/nvidia/cuda-samples/blob/master/cpp/7_libNVVM/CMakeLists.txt

Installs the NVVM DLL to a specified directory on Windows if NVVM_DLL is defined. It prioritizes CUDA_SAMPLES_INSTALL_DIR if set, otherwise installs to 'bin'.

```cmake
if (WIN32 AND NVVM_DLL)
    if(DEFINED CUDA_SAMPLES_INSTALL_DIR)
        install(FILES ${NVVM_DLL} DESTINATION ${CUDA_SAMPLES_INSTALL_DIR})
    else()
        install(FILES ${NVVM_DLL} DESTINATION bin)
    endif()
endif()
```

--------------------------------

### Run on Linux/macOS

Source: https://github.com/nvidia/cuda-samples/blob/master/python/4_DistributedComputing/multiGPUGradientAverage/README.md

Launch the sample using mpirun with at least 2 processes. Specify the number of processes and the script to run.
```bash
# Single node (2 GPUs)
mpirun -np 2 python multiGPUGradientAverage.py --size 10000

# Single node (4 GPUs)
mpirun -np 4 python multiGPUGradientAverage.py --size 10000

# With specific GPUs
CUDA_VISIBLE_DEVICES=0,2 mpirun -np 2 python multiGPUGradientAverage.py
```

--------------------------------

### Installation Rules for Executable and LLVM IR

Source: https://github.com/nvidia/cuda-samples/blob/master/cpp/7_libNVVM/device-side-launch/CMakeLists.txt

Defines where the 'dsl' executable and its associated LLVM IR file ('dsl-gpu64.ll') should be installed. It supports installation into a unified samples directory or a standalone 'bin' directory.

```cmake
if(DEFINED CUDA_SAMPLES_INSTALL_DIR)
    install(TARGETS dsl DESTINATION ${CUDA_SAMPLES_INSTALL_DIR})
    install(FILES dsl-gpu64.ll DESTINATION ${CUDA_SAMPLES_INSTALL_DIR})
else()
    install(TARGETS dsl DESTINATION bin)
    install(FILES dsl-gpu64.ll DESTINATION bin)
endif()
```

--------------------------------

### Run Memory Resources Demo with Custom Parameters

Source: https://github.com/nvidia/cuda-samples/blob/master/python/2_CoreConcepts/memoryResources/README.md

Runs the memory resources demonstration script with custom configurations. Use the --elements flag to specify buffer size or --device to select a specific GPU.

```bash
# Larger buffer size
python memoryResources.py --elements 1048576

# Use a specific GPU
python memoryResources.py --device 1
```

--------------------------------

### Run on Windows

Source: https://github.com/nvidia/cuda-samples/blob/master/python/4_DistributedComputing/multiGPUGradientAverage/README.md

Launch the sample using mpiexec with at least 2 processes. Ensure the Microsoft MPI bin directory is in your PATH or provide the full path to mpiexec.
```powershell
& "C:\Program Files\Microsoft MPI\Bin\mpiexec.exe" -n 2 python multiGPUGradientAverage.py --size 10000
```

--------------------------------

### Install Dependencies

Source: https://github.com/nvidia/cuda-samples/blob/master/python/1_GettingStarted/vectorAdd/README.md

Install the necessary Python packages for CUDA development and GPU array manipulation.

```bash
cd /path/to/cuda-samples/python/1_GettingStarted/vectorAdd
pip install -r requirements.txt
```

--------------------------------

### Build CUDA Samples

Source: https://github.com/nvidia/cuda-samples/blob/master/README.md

Commands to build all CUDA samples. Ensure you are in the root directory of the samples project.

```bash
mkdir build
cd build
cmake ..
make -j$(nproc)
```

--------------------------------

### Install pre-commit using Conda

Source: https://github.com/nvidia/cuda-samples/blob/master/CONTRIBUTING.md

Install the pre-commit tool using the conda package manager.

```bash
conda config --add channels conda-forge
conda install pre-commit
```

--------------------------------

### Run JIT LTO Linking Sample with Custom Parameters

Source: https://github.com/nvidia/cuda-samples/blob/master/python/2_CoreConcepts/jitLtoLinking/README.md

Executes the JIT LTO linking sample with custom parameters for elements and device selection.

```bash
# Larger element count
python jitLtoLinking.py --elements 1048576

# Use a specific GPU
python jitLtoLinking.py --device 1
```

--------------------------------

### Run Simple P2P with Default Parameters

Source: https://github.com/nvidia/cuda-samples/blob/master/python/4_DistributedComputing/simpleP2P/README.md

Execute the script with default settings for array size (16M elements).
```bash
# Run with default parameters (16M elements = 64MB)
python simpleP2P.py
```

--------------------------------

### Install Requirements

Source: https://github.com/nvidia/cuda-samples/blob/master/python/2_CoreConcepts/jitLtoLinking/README.md

Installs necessary Python packages for the CUDA JIT and LTO sample using pip.

```bash
cd /path/to/cuda-samples/python/2_CoreConcepts/jitLtoLinking
pip install -r requirements.txt
```

--------------------------------

### Single Run Sample Configuration

Source: https://github.com/nvidia/cuda-samples/blob/master/README.md

Configure an executable to run once with specified arguments. Arguments are appended to the command line, separated by spaces. Paths are relative to the executable's directory.

```json
{
    "ptxgen": {
        "args": [
            "test.ll",
            "-arch=compute_75"
        ]
    }
}
```

--------------------------------

### Customize Installation Prefix with CMake

Source: https://github.com/nvidia/cuda-samples/blob/master/README.md

Use CMAKE_INSTALL_PREFIX to change the root installation directory. The default is 'build/bin'.

```bash
cmake -DCMAKE_INSTALL_PREFIX=/custom/path ..
```

--------------------------------

### Library and Include Path Setup

Source: https://github.com/nvidia/cuda-samples/blob/master/cpp/8_Platform_Specific/Tegra/simpleGLES/CMakeLists.txt

Includes common directories and finds necessary libraries for graphics and windowing, specifically EGL, X11, and OpenGL. These are essential for cross-platform graphics applications.

```cmake
# Include directories and libraries
include_directories(../../../../Common)

find_package(EGL)
find_package(X11)
find_package(OpenGL)
```

--------------------------------

### Basic CMake Project Setup

Source: https://github.com/nvidia/cuda-samples/blob/master/cpp/2_Concepts_and_Techniques/scan/CMakeLists.txt

Sets the minimum CMake version, appends module paths, and defines the project name and languages, including CUDA.
```cmake
cmake_minimum_required(VERSION 3.20)

list(APPEND CMAKE_MODULE_PATH "${CMAKE_CURRENT_SOURCE_DIR}/../../../cmake/Modules")

project(scan LANGUAGES C CXX CUDA)
```

--------------------------------

### Basic CMake Configuration for Device-Side Launch

Source: https://github.com/nvidia/cuda-samples/blob/master/cpp/7_libNVVM/device-side-launch/CMakeLists.txt

Sets up the build environment by defining the executable, linking necessary libraries, and configuring platform-specific properties.

```cmake
set(CMAKE_INSTALL_RPATH ${LIBNVVM_HOME})
set(CMAKE_INCLUDE_CURRENT_DIR YES)

set_property(SOURCE dsl.c PROPERTY COMPILE_DEFINITIONS LIBCUDADEVRT="${CUDADEVRT_LIB}")

add_executable(dsl dsl.c)
add_test(NAME device-side-launch COMMAND dsl WORKING_DIRECTORY "${CMAKE_CURRENT_BINARY_DIR}")
target_link_libraries(dsl ${NVVM_LIB} ${CUDA_LIB})
```

--------------------------------

### Basic CMake Setup and Project Definition

Source: https://github.com/nvidia/cuda-samples/blob/master/cpp/5_Domain_Specific/nbody/CMakeLists.txt

Sets the minimum CMake version, appends module paths, and defines the project name and languages (C, CXX, CUDA). It also finds the required CUDA Toolkit.

```cmake
cmake_minimum_required(VERSION 3.20)

list(APPEND CMAKE_MODULE_PATH "${CMAKE_CURRENT_SOURCE_DIR}/../../../cmake/Modules")

project(nbody LANGUAGES C CXX CUDA)

find_package(CUDAToolkit REQUIRED)
```

--------------------------------

### Manual Package Installation

Source: https://github.com/nvidia/cuda-samples/blob/master/python/2_CoreConcepts/simpleZeroCopy/README.md

Manually installs NumPy and the CUDA Python libraries with the minimum versions required for CUDA Python. The version specifiers are quoted so that the shell does not treat `>` as an output redirection.

```bash
pip install "numpy>=2.3.2" "cuda-core>=1.0.0" "cuda-python>=13.0.0"
```

--------------------------------

### Run Launch Configuration Tuning Script

Source: https://github.com/nvidia/cuda-samples/blob/master/python/2_CoreConcepts/launchConfigTuning/README.md

Execute the main Python script to perform launch configuration tuning benchmarks.
```bash
python launchConfigTuning.py
```

--------------------------------

### Install CUDA Python Requirements

Source: https://github.com/nvidia/cuda-samples/blob/master/python/Utilities/README.md

Installs the common CUDA 13 stack including cuda-python, cuda-core, cupy-cuda13x, and numpy.

```bash
cd /path/to/cuda-samples/python
pip install -r requirements.txt
```

--------------------------------

### Run Basic Vector Addition

Source: https://github.com/nvidia/cuda-samples/blob/master/python/1_GettingStarted/vectorAdd/README.md

Execute the vector addition sample script with default parameters.

```bash
cd samples/python/1_GettingStarted/vectorAdd
python vectorAdd.py
```

--------------------------------

### Install Dependencies

Source: https://github.com/nvidia/cuda-samples/blob/master/python/4_DistributedComputing/ipcMemoryPool/README.md

Install the required Python packages for the IPC memory pool sample. This includes cuda-python, cuda-core, and cupy.

```bash
cd /path/to/cuda-samples/python/4_DistributedComputing/ipcMemoryPool
pip install -r requirements.txt
```

--------------------------------

### Run tileMatmul with default settings

Source: https://github.com/nvidia/cuda-samples/blob/master/cpp/9_CUDA_Tile/tileMatmul/README.md

Execute the tileMatmul sample with default warmup and benchmark iterations. Validation is off by default.

```bash
./tileMatmul
```

--------------------------------

### Python Installation Requirements

Source: https://github.com/nvidia/cuda-samples/blob/master/python/2_CoreConcepts/cudaGraphs/README.md

Installs necessary Python packages for the CUDA graphs sample, including cuda-python, cuda-core, and cupy.
```bash
cd /path/to/cuda-samples/python/2_CoreConcepts/cudaGraphs
pip install -r requirements.txt
```

--------------------------------

### Running with Custom Device Parameter

Source: https://github.com/nvidia/cuda-samples/blob/master/python/2_CoreConcepts/cudaComputeLambdas/README.md

Launches the cudaComputeLambdas sample, specifying a custom CUDA device ID.

```bash
python cudaComputeLambdas.py --device 1
```

--------------------------------

### Install Dependencies

Source: https://github.com/nvidia/cuda-samples/blob/master/python/2_CoreConcepts/tmaTensorMap/README.md

Install the required Python packages for the tmaTensorMap sample. Ensure you have CUDA Toolkit 13.0 or newer with libcudacxx headers.

```bash
cd /path/to/cuda-samples/python/2_CoreConcepts/tmaTensorMap
pip install -r requirements.txt
```
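
After installing from requirements.txt, it can help to confirm which packages were actually resolved in the active environment. The following is a minimal sketch using only the Python standard library. The package list is an assumption based on package names mentioned in the sample READMEs above (the sample's own requirements.txt remains authoritative), and `installed_versions` is a hypothetical helper that is not part of the samples.

```python
from importlib import metadata

# Assumed package names for illustration only; consult the sample's
# requirements.txt for the real list and version constraints.
ASSUMED_PACKAGES = ["numpy", "cuda-python", "cuda-core"]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in the active environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(ASSUMED_PACKAGES).items():
        print(f"{name}: {version or 'not installed'}")
```

Running the script inside the same virtual environment used for `pip install` prints one line per package, which makes a missing or unexpected version easy to spot before launching a sample.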