### Install Script

Source: https://github.com/iith-compilers/ir2vec/blob/main/docs/spec_compilation.md

Run the installation script for the SPEC benchmarks.

```bash
bash install.sh
```

--------------------------------

### Install FileCheck

Source: https://github.com/iith-compilers/ir2vec/blob/main/README.md

Use this command to install the FileCheck tool.

```bash
pip3 install --user filecheck
```

--------------------------------

### Install LIT

Source: https://github.com/iith-compilers/ir2vec/blob/main/README.md

Use this command to install the LIT testing tool.

```bash
pip3 install --user lit
```

--------------------------------

### Build and Install IR2Vec

Source: https://github.com/iith-compilers/ir2vec/blob/main/README.md

Compile and optionally install IR2Vec after configuration. The bracketed part is optional.

```bash
make [&& make install]
```

--------------------------------

### Install and Run Pre-commit Hooks

Source: https://github.com/iith-compilers/ir2vec/blob/main/docs/version_upgrade_process.md

Install the `pre-commit` hooks and run them locally to check code style and formatting before pushing commits, so that all code adheres to project standards.

```bash
pre-commit install
pre-commit run --all-files
```

--------------------------------

### Source Environment Variables

Source: https://github.com/iith-compilers/ir2vec/blob/main/docs/spec_compilation.md

Source the environment variables script after installation.

```bash
source shrc
```

--------------------------------

### Feature Engineering Setup

Source: https://github.com/iith-compilers/ir2vec/blob/main/experiments/Thread_Coarsening/ThreadCoarsening.ipynb

Initializes the array of coarsening factors (`cfs`) and builds a DataFrame of per-kernel occurrence counts, sorted by kernel name.

```python
cfs = np.array([1, 2, 4, 8, 16, 32])
kernel_freq = df["kernel"].value_counts().sort_index().reset_index()
```

--------------------------------

### Install IR2Vec Python Package

Source: https://github.com/iith-compilers/ir2vec/blob/main/README.md

Install the IR2Vec Python package using pip. This is the recommended method for Python users.

```bash
pip install -U ir2vec
```

--------------------------------

### Clone OpenKE-PyTorch Repository

Source: https://github.com/iith-compilers/ir2vec/blob/main/seed_embeddings/OpenKE/README.md

Clones the `OpenKE-PyTorch` branch of the OpenKE repository and navigates into the `openke` directory. Ensure PyTorch is installed beforehand.

```bash
git clone -b OpenKE-PyTorch https://github.com/thunlp/OpenKE --depth 1
cd OpenKE
cd openke
```

--------------------------------

### Create Conda Environment for OpenKE

Source: https://github.com/iith-compilers/ir2vec/blob/main/seed_embeddings/README.md

Create and activate a Conda environment from the provided `openKE.yaml` file to install the packages OpenKE needs.

```bash
conda env create -f ./OpenKE/openKE.yaml
conda activate openKE
```

--------------------------------

### Import Libraries

Source: https://github.com/iith-compilers/ir2vec/blob/main/experiments/Out_Of_Vocabulary/OOV.ipynb

Imports the libraries used for data manipulation and visualization. Ensure these libraries are installed.

```python
# Part of the IR2Vec Project, under the Apache License v2.0 with LLVM
# Exceptions. See the LICENSE file for license information.
# SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
#
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
```

--------------------------------

### IR2Vec Python API Usage Examples

Source: https://github.com/iith-compilers/ir2vec/blob/main/README.md

Demonstrates two approaches for using the IR2Vec Python APIs: directly via module functions, or through methods of an initialized IR2Vec object.

```python
import ir2vec
import numpy as np

# IR2Vec Python APIs can be used in two ways, as shown below.
initObj = ir2vec.initEmbedding("/path/to/file.ll", "fa", "p")

# Approach 1: module-level functions
progVector1 = ir2vec.getProgramVector(initObj)
functionVectorMap1 = ir2vec.getFunctionVectors(initObj)
instructionVectorsList1 = ir2vec.getInstructionVectors(initObj)

# Approach 2: methods of the initialized object
progVector2 = initObj.getProgramVector()
functionVectorMap2 = initObj.getFunctionVectors()
instructionVectorsList2 = initObj.getInstructionVectors()
```

--------------------------------

### Query Vector Representations in C++

Source: https://github.com/iith-compilers/ir2vec/blob/main/README.md

Example of generating and accessing instruction, function, and program vector representations using the IR2Vec C++ interface. Requires an LLVM Module and the embedding dimension (DIM); `<LLVM Module>` and `<DIM>` below are placeholders to substitute.

```c++
#include "IR2Vec.h"

// Creating object to generate FlowAware representation
auto ir2vec =
    IR2Vec::Embeddings(<LLVM Module>, IR2Vec::IR2VecMode::FlowAware, <DIM>);

// Getting Instruction vectors corresponding to the instructions in <LLVM Module>
auto instVecMap = ir2vec.getInstVecMap();
// Access the generated vectors
for (auto instVec : instVecMap) {
  outs() << "Instruction : ";
  instVec.first->print(outs());
  outs() << ": ";
  for (auto val : instVec.second)
    outs() << val << "\t";
}

// Getting vectors corresponding to the functions in <LLVM Module>
auto funcVecMap = ir2vec.getFunctionVecMap();
// Access the generated vectors
for (auto funcVec : funcVecMap) {
  outs() << "Function : " << funcVec.first->getName() << "\n";
  for (auto val : funcVec.second)
    outs() << val << "\t";
}

// Getting the program vector
auto pgmVec = ir2vec.getProgramVector();
// Access the generated vector
for (auto val : pgmVec)
  outs() << val << "\t";
```

--------------------------------

### Get Instruction Vectors in Python

Source: https://github.com/iith-compilers/ir2vec/blob/main/README.md

Retrieves a list of instruction-level vectors for all instructions in the LLVM IR file.

```python
# Getting instruction-level vectors
instructionVectorsList = initObj.getInstructionVectors()
```

--------------------------------

### Get Function Vectors in Python

Source: https://github.com/iith-compilers/ir2vec/blob/main/README.md

Retrieves a dictionary of function-level vectors for all functions in the LLVM IR file.

```python
# Getting function-level vectors
functionVectorMap = initObj.getFunctionVectors()
```

--------------------------------

### Basic CMake Configuration

Source: https://github.com/iith-compilers/ir2vec/blob/main/src/CMakeLists.txt

Sets up versioning, include directories, and project options. Ensures generated headers are accessible.

```cmake
configure_file (./include/version.h.cmake version.h @ONLY)
include_directories(./include ${CMAKE_CURRENT_BINARY_DIR})
include_directories(${CMAKE_BINARY_DIR})

option(LLVM_IR2VEC "where to enable IR2Vec as subproject for LLVM" OFF)

set(GENERATED_HEADERS_DIR "${CMAKE_BINARY_DIR}/include")
file(MAKE_DIRECTORY ${GENERATED_HEADERS_DIR})
```

--------------------------------

### Build SPEC CPU 2006 Benchmarks

Source: https://github.com/iith-compilers/ir2vec/wiki/spec_compilation

Builds the SPEC CPU 2006 benchmarks using a specified configuration. Ensure compilation parameters and paths are correctly set.

```bash
runspec --config config_clang16_cpu2006.cfg --tune=base --action=build --rebuild --define build_ncpus=1 int fp
```

--------------------------------

### Create Build Directory

Source: https://github.com/iith-compilers/ir2vec/blob/main/README.md

Initial step to create a build directory for compiling IR2Vec.

```bash
mkdir build && cd build
```

--------------------------------

### Build SPEC CPU 2017 Benchmarks

Source: https://github.com/iith-compilers/ir2vec/wiki/spec_compilation

Builds the SPEC CPU 2017 benchmarks using a specified configuration. Ensure compilation parameters and paths are correctly set.

```bash
runcpu --config config_clang16_cpu2017.cfg --tune=base --action=build --rebuild --define build_ncpus=1 intrate fprate intspeed fpspeed
```

--------------------------------

### Build IR2Vec Binary

Source: https://github.com/iith-compilers/ir2vec/blob/main/seed_embeddings/README.md

Build the IR2Vec binary by navigating to the build directory and running `make`. This is a prerequisite for generating triplets.

```bash
cd ../build
make
```

--------------------------------

### Build a Docker Image Locally

Source: https://github.com/iith-compilers/ir2vec/wiki/docker_update

Builds a Docker image from a Dockerfile in the current directory. Replace `your-image-name:1.0` with your desired image name and tag.

```bash
docker build -t your-image-name:1.0 -f Dockerfile .
```

--------------------------------

### Load Prior Art Results

Source: https://github.com/iith-compilers/ir2vec/blob/main/experiments/Device_Mapping/DevMap.ipynb

Loads pre-computed results for prior art device mapping techniques (DeepTune, Grewe et al., Static Mapping, NCC) from pickle files.

```python
deeptune_res = pd.read_pickle("data/prior_art_results/deeptune_dm.results")
grewe_res = pd.read_pickle("data/prior_art_results/grewe_dm.results")
static_res = pd.read_pickle("data/prior_art_results/static_dm.results")
ncc_res = pd.read_pickle("data/prior_art_results/ncc_fix_DM.results")
```

--------------------------------

### Get Program Vector in Python

Source: https://github.com/iith-compilers/ir2vec/blob/main/README.md

Retrieves the program-level vector representation using an initialized IR2Vec object.

```python
# Getting the program-level vector
progVector = initObj.getProgramVector()
```

--------------------------------

### Prepare Oracle Data for Comparison

Source: https://github.com/iith-compilers/ir2vec/blob/main/experiments/Thread_Coarsening/ThreadCoarsening.ipynb

This commented-out snippet shows how to prepare 'Oracle' data by copying the existing IR2Vec Flow-Aware results and adjusting columns for comparison. It's useful for setting up a baseline or reference point in performance analysis.

```python
# oracle = ir2vec_fa.copy()
# oracle["Model"] = "Oracle"
# oracle["Speedup"] = oracle["OracleSpeedUp"]
# oracle["Predicted-CF"] = oracle["Oracle-CF"]
# oracle.drop(columns=['OracleSpeedUp'],inplace=True)
```

--------------------------------

### Link IR2Vec Libraries with CMake

Source: https://github.com/iith-compilers/ir2vec/blob/main/README.md

Template for linking IR2Vec libraries in a CMake-based project. Ensure `IR2VEC_INSTALL_DIR` is set to the IR2Vec installation path.

```cmake
set(IR2VEC_INSTALL_DIR "" CACHE PATH "IR2Vec installation directory")
include_directories("${IR2VEC_INSTALL_DIR}/include")
target_link_libraries( PUBLIC ${IR2VEC_INSTALL_DIR}/lib/)
```

--------------------------------

### Download Eigen Library

Source: https://github.com/iith-compilers/ir2vec/blob/main/README.md

Download the specified version of the Eigen library.

```bash
wget https://gitlab.com/libeigen/eigen/-/archive/3.3.7/eigen-3.3.7.tar.gz
```

--------------------------------

### Load Oracle and Runtime Data

Source: https://github.com/iith-compilers/ir2vec/blob/main/experiments/Thread_Coarsening/ThreadCoarsening.ipynb

Defines the device names and loads oracle and runtime data from CSV files into pandas DataFrames. This setup is typically used for performance analysis and comparison.

```python
import os

import pandas as pd

# Maps each device flag to its full device name
_FLAG_TO_DEVICE_NAME = {
    "Cypress": "AMD Radeon HD 5900",
    "Tahiti": "AMD Tahiti 7970",
    "Fermi": "NVIDIA GTX 480",
    "Kepler": "NVIDIA Tesla K20c",
}
device_list = ["Cypress", "Tahiti", "Fermi", "Kepler"]

oracle_file = os.path.join("./data/pact-2014-oracles.csv")
oracles = pd.read_csv(oracle_file)

runtimes_file = os.path.join("./data/pact-2014-runtimes.csv")
df = pd.read_csv(runtimes_file)
```

--------------------------------

### Build Eigen Library

Source: https://github.com/iith-compilers/ir2vec/blob/main/README.md

Configure and build the Eigen library using CMake.

```bash
mkdir eigen-build && cd eigen-build
cmake ../eigen-3.3.7 && make
```

--------------------------------

### Configure Lit Site Configuration

Source: https://github.com/iith-compilers/ir2vec/blob/main/src/test-suite/CMakeLists.txt

Generates `lit.site.cfg.py` from `lit.site.cfg.py.in` using CMake's `configure_file` command, allowing for site-specific test configurations.

```cmake
configure_file(lit.site.cfg.py.in lit.site.cfg.py @ONLY)
```

--------------------------------

### Create and Configure Multi-Axis Plot

Source: https://github.com/iith-compilers/ir2vec/blob/main/experiments/TimeTaken/timeCompare.ipynb

Generates a matplotlib figure with three vertically stacked subplots, each covering a different y-range, to display time-taken data as a broken-axis plot. This setup is useful for visualizing data with large variations.

```python
plt.figure(figsize=(20, 5), dpi=200)
grid = plt.GridSpec(4, 1, hspace=0.06)
ax = plt.subplot(grid[0, 0])
ax2 = plt.subplot(grid[1, 0])
ax3 = plt.subplot(grid[2:4, 0])

# Plot the same data on all three axes; each axis shows a different y-range
ax.plot(timeTakenfilename, fa_data, color="blue", label="Flow-Aware", marker="o")
ax.plot(timeTakenfilename, sym_data, color="orange", label="Symbolic", marker="x")
ax2.plot(timeTakenfilename, fa_data, color="blue", label="Flow-Aware", marker="o")
ax2.plot(timeTakenfilename, sym_data, color="orange", label="Symbolic", marker="x")
ax3.plot(timeTakenfilename, fa_data, color="blue", label="Flow-Aware", marker="o")
ax3.plot(timeTakenfilename, sym_data, color="orange", label="Symbolic", marker="x")

ax.set_ylim(1180, 1500)
ax2.set_ylim(324, 420)
ax3.set_ylim(0, 280)

# hide the spines between ax and ax2
ax.spines["bottom"].set_visible(False)
ax2.spines["bottom"].set_visible(False)
ax2.spines["top"].set_visible(False)
ax3.spines["top"].set_visible(False)
ax.xaxis.tick_top()
ax.tick_params(labeltop=False)  # don't put tick labels at the top
ax2.xaxis.set_visible(False)
ax3.xaxis.tick_bottom()
# ax3.yaxis.set_major_locator(x_locator)

d = 0.015  # how big to make the diagonal lines in axes coordinates
# arguments to pass to plot, just so we don't keep repeating them
kwargs = dict(transform=ax.transAxes, color="k", clip_on=False)
ax.plot((-d, +d), (-d, +d), **kwargs)  # top-left diagonal
ax.plot((1 - d, 1 + d), (-d, +d), **kwargs)  # top-right diagonal

kwargs.update(transform=ax2.transAxes)  # switch to the bottom axes
ax2.plot((-d, +d), (1 - d, 1 + d), **kwargs)  # bottom-left diagonal
ax2.plot((1 - d, 1 + d), (1 - d, 1 + d), **kwargs)  # bottom-right diagonal

kwargs = dict(transform=ax2.transAxes, color="k", clip_on=False)
ax2.plot((-d, +d), (-d, +d), **kwargs)  # top-left diagonal
ax2.plot((1 - d, 1 + d), (-d, +d), **kwargs)  # top-right diagonal

kwargs.update(transform=ax3.transAxes)  # switch to the bottom axes
ax3.plot((-d, +d), ((1 + d / 4), 1 + d), **kwargs)  # bottom-left diagonal
ax3.plot((1 - d, 1 + d),
    (1 + d / 4, 1 + d), **kwargs)

# What's cool about this is that now if we vary the distance between
# ax and ax2 via f.subplots_adjust(hspace=...) or plt.subplot_tool(),
# the diagonal lines will move accordingly, and stay right at the tips
# of the spines they are 'breaking'
ax3.set_xticklabels(
    timeTakenfilename, rotation=45, horizontalalignment="right", fontsize=12
)
ax.legend(fontsize="xx-large", loc=2)
ax3.set_xlabel("Programs", fontsize=22)
ax2.set_ylabel("Time Taken in ms", fontsize=20)
# plt.show()
plt.savefig("Time-Taken.pdf", bbox_inches="tight")
```

--------------------------------

### Generate Multi-Axis OOV Plot

Source: https://github.com/iith-compilers/ir2vec/blob/main/experiments/Out_Of_Vocabulary/OOV.ipynb

Creates a broken-axis Matplotlib plot with three y-axis ranges to visualize out-of-vocabulary (OOV) miss counts for IR2Vec and NCC across programs. This snippet requires Matplotlib to be installed.

```python
plt.figure(figsize=(20, 5), dpi=200)
grid = plt.GridSpec(4, 1, hspace=0.06)
ax = plt.subplot(grid[0, 0])
ax2 = plt.subplot(grid[1, 0])
ax3 = plt.subplot(grid[2:4, 0])

# Plot the same data on all three axes; each axis shows a different y-range
ax.plot(filename, IR2Vec_MissCount, color="green", label="#OOV in IR2Vec", marker="x")
ax.plot(filename, NCC_MissCount, color="red", label="#OOV in NCC", marker="o")
ax2.plot(filename, IR2Vec_MissCount, color="green", label="#OOV in IR2Vec", marker="x")
ax2.plot(filename, NCC_MissCount, color="red", label="#OOV in NCC", marker="o")
ax3.plot(filename, IR2Vec_MissCount, color="green", label="#OOV in IR2Vec", marker="x")
ax3.plot(filename, NCC_MissCount, color="red", label="#OOV in NCC", marker="o")

ax.set_ylim(1460, 1600)
ax2.set_ylim(1010, 1210)
ax3.set_ylim(-30, 440)

# hide the spines between ax and ax2
ax.spines["bottom"].set_visible(False)
ax2.spines["bottom"].set_visible(False)
ax2.spines["top"].set_visible(False)
ax3.spines["top"].set_visible(False)
ax.xaxis.tick_top()
ax.tick_params(labeltop=False)  # don't put tick labels at the top
ax2.xaxis.set_visible(False)
ax3.xaxis.tick_bottom()

d = 0.015  # how big to make the diagonal lines in axes coordinates
# arguments to pass to plot, just so we don't keep repeating them
kwargs = dict(transform=ax.transAxes, color="k", clip_on=False)
ax.plot((-d, +d), (-d, +d), **kwargs)  # top-left diagonal
ax.plot((1 - d, 1 + d), (-d, +d), **kwargs)  # top-right diagonal

kwargs.update(transform=ax2.transAxes)  # switch to the bottom axes
ax2.plot((-d, +d), (1 - d, 1 + d), **kwargs)  # bottom-left diagonal
ax2.plot((1 - d, 1 + d), (1 - d, 1 + d), **kwargs)  # bottom-right diagonal

kwargs = dict(transform=ax2.transAxes, color="k", clip_on=False)
ax2.plot((-d, +d), (-d, +d), **kwargs)  # top-left diagonal
ax2.plot((1 - d, 1 + d), (-d, +d), **kwargs)  # top-right diagonal

kwargs.update(transform=ax3.transAxes)  # switch to the bottom axes
ax3.plot((-d, +d), ((1 + d / 4), 1 + d), **kwargs)  # bottom-left diagonal
ax3.plot((1 - d, 1 + d), (1 + d / 4, 1 + d), **kwargs)

ax3.set_xticklabels(filename, rotation=45, horizontalalignment="right", fontsize=12)
ax.legend(fontsize="xx-large", loc=2)
ax3.set_xlabel("Programs", fontsize=22)
ax2.set_ylabel("#OOV", fontsize=20)
plt.savefig("71-Programs_OOV.pdf", bbox_inches="tight")
```

--------------------------------

### Collect LLVM IR Files for SPEC

Source: https://github.com/iith-compilers/ir2vec/blob/main/docs/spec_compilation.md

Execute a script to collect generated LLVM IR (.ll) files from the SPEC benchmarks. Ensure the script path and any necessary modifications are correct.

```bash
collect_ir/spec/get_ll_spec.sh
```

--------------------------------

### Load Prior Art Results and Data

Source: https://github.com/iith-compilers/ir2vec/blob/main/experiments/Device_Mapping/DevMap.ipynb

Initializes variables for storing results from prior art and loads datasets for program analysis. This includes defining device mappings and reading program metadata.

```python
# Prior-art prediction values and speedups (two entries each), with their means
static_pred_vals = [58.823529, 56.911765]
static_pred_mean = [57.867647]
static_sp_vals = [1.0, 1.0]
static_sp_mean = [1.0]

grewe_pred_vals = [73.382353, 72.941176]
grewe_pred_mean = [73.161765]
grewe_sp_vals = [2.905822, 1.264801]
grewe_sp_mean = [2.085312]

deeptune_pred_vals = [83.676471, 80.294118]
deeptune_pred_mean = [81.985294]
deeptune_sp_vals = [3.335612, 1.412222]
deeptune_sp_mean = [2.373917]

ncc_pred_vals = [82.79, 81.76]
ncc_pred_mean = [82.275]
ncc_sp_vals = [3.42, 1.39]
ncc_sp_mean = [2.405]

# Raw string avoids an invalid-escape warning for the whitespace regex
llfiles = pd.read_csv("./data/all.txt", sep=r"\s+")
fileNum = llfiles["FileNum"]
filesname = llfiles["ProgramName"]

device_dict = {"amd": "AMD Tahiti 7970", "nvidia": "NVIDIA GTX 970"}
```

--------------------------------

### Monitor Training with TensorBoard

Source: https://github.com/iith-compilers/ir2vec/blob/main/seed_embeddings/README.md

Track the progress of the TransE model training using TensorBoard. Point TensorBoard to the `~/ray_results` directory where training logs are stored.

```bash
tensorboard --logdir=~/ray_results
```

--------------------------------

### Check IR2Vec Build

Source: https://github.com/iith-compilers/ir2vec/blob/main/README.md

Verify the correctness of the IR2Vec build.

```bash
make check_ir2vec
```

--------------------------------

### Run Training with Ray

Source: https://github.com/iith-compilers/ir2vec/blob/main/docs/comPile.md

A helper script for running the IR2Vec training process using Ray. It helps in specifying log paths and formatting parameters correctly.

```bash
ComPile/run_training_ray.sh
```

--------------------------------

### Performance Comparison for NVIDIA Tesla K20c

Source: https://github.com/iith-compilers/ir2vec/blob/main/experiments/Thread_Coarsening/ThreadCoarsening.ipynb

Prints the percentage increase in speedup of IR2Vec Flow-Aware over Magni et al., DeepTune, DeepTune_TL, Inst2Vec, and IR2Vec Symbolic on the NVIDIA Tesla K20c GPU.

```python
print("\nNVIDIA Tesla K20c")
print(" % Increase in SpeedUp over Magni et al - ", percentage(tes_ir2vFA, tes_magni))
print(" % Increase in SpeedUp over DeepTune - ", percentage(tes_ir2vFA, tes_dt))
print(" % Increase in SpeedUp over DeepTune_TL - ", percentage(tes_ir2vFA, tes_dtTL))
print(" % Increase in SpeedUp over Inst2Vec - ", percentage(tes_ir2vFA, tes_ncc))
print(
    " % Increase in SpeedUp over IR2Vec Symbolic - ",
    percentage(tes_ir2vFA, tes_ir2vSym),
)
```

--------------------------------

### Performance Comparison for AMD Tahiti 7970

Source: https://github.com/iith-compilers/ir2vec/blob/main/experiments/Thread_Coarsening/ThreadCoarsening.ipynb

Prints the percentage increase in speedup of IR2Vec Flow-Aware over Magni et al., DeepTune, DeepTune_TL, Inst2Vec, and IR2Vec Symbolic on the AMD Tahiti 7970 GPU.

```python
print("\nAMD Tahiti 7970")
print(" % Increase in SpeedUp over Magni et al - ", percentage(tah_ir2vFA, tah_magni))
print(" % Increase in SpeedUp over DeepTune - ", percentage(tah_ir2vFA, tah_dt))
print(" % Increase in SpeedUp over DeepTune_TL - ", percentage(tah_ir2vFA, tah_dtTL))
print(" % Increase in SpeedUp over Inst2Vec - ", percentage(tah_ir2vFA, tah_ncc))
print(
    " % Increase in SpeedUp over IR2Vec Symbolic - ",
    percentage(tah_ir2vFA, tah_ir2vSym),
)
```

--------------------------------

### Configure IR2Vec Build

Source: https://github.com/iith-compilers/ir2vec/blob/main/README.md

Configure the IR2Vec build process using CMake, specifying paths to LLVM and Eigen. The bracketed install prefix is optional.

```bash
cmake -DLT_LLVM_INSTALL_DIR= -DEigen3_DIR= [-DCMAKE_INSTALL_PREFIX=] ..
```

--------------------------------

### Compile C++ Files for OpenKE

Source: https://github.com/iith-compilers/ir2vec/blob/main/seed_embeddings/OpenKE/README.md

Compiles the C++ components of the OpenKE framework using the provided make script.

```bash
bash make.sh
```

--------------------------------

### General Platform Speedup Calculation

Source: https://github.com/iith-compilers/ir2vec/blob/main/experiments/Device_Mapping/DevMap.ipynb

Calculates and prints the percentage increase of IR2Vec Flow-Aware over DeepTune, Inst2Vec, Inst2Vec-imm, and IR2Vec Symbolic across both platforms. Assumes the `ir2vec_sym` and `ir2vec_fa` DataFrames and the `slowDown` helper are already defined in the notebook.

```python
dt = 81.99
ncc = 82.275
nccimm = (88.09 + 86.62) / 2
ir2vSym = ir2vec_sym["Correct?"].mean() * 100
ir2vFA = ir2vec_fa["Correct?"].mean() * 100
print(" % Increase in SpeedUp over DeepTune - ", slowDown(ir2vFA, dt))
print(" % Increase in SpeedUp over Inst2Vec - ", slowDown(ir2vFA, ncc))
print(" % Increase in SpeedUp over Inst2Vec-imm - ", slowDown(ir2vFA, nccimm))
print(
    " % Increase in SpeedUp over IR2Vec Symbolic - ",
    slowDown(ir2vFA, ir2vSym),
)
```

--------------------------------

### Load Prior Art Results

Source: https://github.com/iith-compilers/ir2vec/blob/main/experiments/Thread_Coarsening/ThreadCoarsening.ipynb

Loads pickled data containing performance results from various prior art methods. Ensure the `data/prior_art_results/` directory exists and contains the specified pickle files.

```python
magni_res = pd.read_pickle("data/prior_art_results/magni_tf.results")
deeptune_res = pd.read_pickle("data/prior_art_results/deeptune_tf.results")
deeptune_tl_res = pd.read_pickle("data/prior_art_results/deeptune_tl_tf.results")
ncc_res = pd.read_pickle("data/prior_art_results/ncc_fix_tf.results")
```

--------------------------------

### Performance Comparison for AMD Radeon HD 5900

Source: https://github.com/iith-compilers/ir2vec/blob/main/experiments/Thread_Coarsening/ThreadCoarsening.ipynb

Prints the percentage increase in speedup of IR2Vec Flow-Aware over Magni et al., DeepTune, DeepTune_TL, Inst2Vec, and IR2Vec Symbolic on the AMD Radeon HD 5900 GPU.

```python
print("AMD Radeon HD 5900")
print(" % Increase in SpeedUp over Magni et al - ", percentage(rad_ir2vFA, rad_magni))
print(" % Increase in SpeedUp over DeepTune - ", percentage(rad_ir2vFA, rad_dt))
print(" % Increase in SpeedUp over DeepTune_TL - ", percentage(rad_ir2vFA, rad_dtTL))
print(" % Increase in SpeedUp over Inst2Vec - ", percentage(rad_ir2vFA, rad_ncc))
print(
    " % Increase in SpeedUp over IR2Vec Symbolic - ",
    percentage(rad_ir2vFA, rad_ir2vSym),
)
```

--------------------------------

### Runtime Output Directory

Source: https://github.com/iith-compilers/ir2vec/blob/main/CMakeLists.txt

Configures the directory where runtime executables will be placed. This is set to a `bin` subdirectory within the build directory.

```cmake
set(CMAKE_RUNTIME_OUTPUT_DIRECTORY ${CMAKE_BINARY_DIR}/bin)
```

--------------------------------

### Preprocess Triplets for TransE Training

Source: https://github.com/iith-compilers/ir2vec/blob/main/seed_embeddings/README.md

Preprocess the collected triplets using `preprocess.py` to generate files required for TransE training. The script requires the path to the triplet file.

```bash
cd OpenKE
python preprocess.py --tripletFile=
```

--------------------------------

### ir2vec.initEmbedding

Source: https://github.com/iith-compilers/ir2vec/blob/main/README.md

Initializes the IR2Vec embedding process for a given LLVM IR file, specifying encoding type, level, dimension, and an optional output file.

```APIDOC
## ir2vec.initEmbedding

### Description
Initialize IR2Vec embedding for an LLVM IR file.

### Parameters
* **file_path** (str) - Required - Path to the `.ll` or `.bc` file.
* **encoding_type** (str) - Required - Choose `fa` (Flow-Aware) or `sym` (Symbolic).
* **level** (str) - Required - Choose `p` for program-level or `f` for function-level.
* **dim** (uint) - Optional - Choose from `[300, 100, 75]`. Default value is `300`.
* **output_file** (str) - Optional - If provided, embeddings are saved to this file.
  Default is an empty string.

### Returns
* **IR2VecObject** - Initialized object for accessing embeddings.

### Example
```python
import ir2vec

# Approach 1
initObj = ir2vec.initEmbedding("/path/to/file.ll", "fa", "p")

# Approach 2
initObj = ir2vec.initEmbedding("/path/to/file.ll", "fa", "p", 100)

# Approach 3
initObj = ir2vec.initEmbedding("/path/to/file.ll", "fa", "p", 100, "output.txt")
```
```

--------------------------------

### Create Conda Environment

Source: https://github.com/iith-compilers/ir2vec/blob/main/experiments/README.md

Creates a new conda environment using the specified YAML file. Use `conda activate IR2Vec` for subsequent uses.

```bash
conda env create --file exp_requirements.yaml
```

```bash
conda activate IR2Vec
```

--------------------------------

### Copy Benchmark Files

Source: https://github.com/iith-compilers/ir2vec/blob/main/src/test-suite/CMakeLists.txt

Copies various benchmark-related files and directories to the destination. This includes LLVM IR files, vocabulary, and other test assets.

```cmake
file(COPY PE-benchmarks-llfiles-llvm20 DESTINATION .)
file(COPY sqlite3.ll DESTINATION .)
file(COPY oracle DESTINATION .)
file(COPY ../../vocabulary DESTINATION .)
file(COPY index-llvm20.files DESTINATION .)
```

--------------------------------

### Log in to GitHub Container Registry

Source: https://github.com/iith-compilers/ir2vec/wiki/docker_update

Logs into the GitHub Container Registry using your GitHub username and a personal access token. Replace USERNAME and TOKEN with your credentials.

```bash
docker login ghcr.io -u USERNAME -p TOKEN
```

--------------------------------

### Copy Test Runner and Config Files

Source: https://github.com/iith-compilers/ir2vec/blob/main/src/test-suite/CMakeLists.txt

Copies the test runner script (`test-lit.py`) and the main test configuration file (`test-ir2vec.lit`) to the destination directory.

```cmake
file(COPY test-lit.py DESTINATION .)
file(COPY test-ir2vec.lit DESTINATION .)
```

--------------------------------

### Print Speedup Comparisons for AMD Tahiti 7970 with Imm

Source: https://github.com/iith-compilers/ir2vec/blob/main/experiments/Device_Mapping/DevMap.ipynb

Prints the percentage increase in speedup for IR2Vec Flow-Aware embeddings against other methods, including an 'Inst2Vec-imm' variant, for the AMD Tahiti 7970 GPU.

```python
print("\nAMD Tahiti 7970")
print(" % Increase in SpeedUp over Grewe et al - ", slowDown(tah_ir2vFA, tah_grewe))
print(" % Increase in SpeedUp over DeepTune - ", slowDown(tah_ir2vFA, tah_dt))
print(" % Increase in SpeedUp over Inst2Vec - ", slowDown(tah_ir2vFA, tah_ncc))
print(" % Increase in SpeedUp over Inst2Vec-imm - ", slowDown(tah_ir2vFA, tah_nccimm))
print(
    " % Increase in SpeedUp over IR2Vec Symbolic - ",
    slowDown(tah_ir2vFA, tah_ir2vSym),
)
```

--------------------------------

### Verify Dataset Presence

Source: https://github.com/iith-compilers/ir2vec/blob/main/experiments/Device_Mapping/DevMap.ipynb

Checks that the essential dataset files and directories (`data/kernels_ir`, `data/cgo17-amd.csv`, `data/cgo17-nvidia.csv`) exist before proceeding. Raises an `AssertionError` if any are missing.

```python
import os

assert (
    os.path.exists("data/kernels_ir")
    and os.path.exists("data/cgo17-amd.csv")
    and os.path.exists("data/cgo17-nvidia.csv")
), "Dataset is not present. Please download"
```

--------------------------------

### Library and Executable Definitions (Standalone)

Source: https://github.com/iith-compilers/ir2vec/blob/main/src/CMakeLists.txt

Defines source files for libraries and executables. When `LLVM_IR2VEC` is not enabled, builds the standalone binary plus shared and static libraries and links against LLVM components; otherwise registers IR2Vec as an LLVM library.

```cmake
set(commonsrc FlowAware.cpp Symbolic.cpp utils.cpp ${GENERATED_HEADERS_DIR}/VocabularyFactory.cpp)
set(libsrc libIR2Vec.cpp ${commonsrc})
set(binsrc CollectIR.cpp IR2Vec.cpp)

if(NOT LLVM_IR2VEC)
  set(LT_LLVM_INSTALL_DIR "" CACHE PATH "LLVM installation directory")
  list(APPEND CMAKE_PREFIX_PATH "${LT_LLVM_INSTALL_DIR}/lib/cmake/llvm/")
  find_package(LLVM 20.1.0 REQUIRED CONFIG)

  message(STATUS "Found LLVM ${LLVM_PACKAGE_VERSION}")
  message(STATUS "Using LLVMConfig.cmake in: ${LLVM_DIR}")

  include_directories(SYSTEM ${LLVM_INCLUDE_DIRS})

  # llvm_map_components_to_libnames(llvm_libs all)
  llvm_map_components_to_libnames(llvm_libs support core irreader analysis TransformUtils)

  add_executable(${PROJECT_NAME} ${binsrc})
  target_link_libraries (${PROJECT_NAME} ${llvm_libs} objlib)
  target_include_directories(${PROJECT_NAME} PRIVATE .)

  add_library(objlib OBJECT ${libsrc})
  set_property(TARGET objlib PROPERTY POSITION_INDEPENDENT_CODE 1)
  if(Eigen3_FOUND)
    target_link_libraries (objlib Eigen3::Eigen)
  endif()

  add_library(${IR2VEC_LIB} SHARED $<TARGET_OBJECTS:objlib>)
  add_library(${IR2VEC_LIB_STATIC} STATIC $<TARGET_OBJECTS:objlib>)

  set_target_properties(${IR2VEC_LIB} ${IR2VEC_LIB_STATIC}
    PROPERTIES
    VERSION ${PROJECT_VERSION}
    SOVERSION 1
    PUBLIC_HEADER "./include/IR2Vec.h"
    PUBLIC_HEADER "${GENERATED_VOCAB_HEADERS}"
    OUTPUT_NAME ${IR2VEC_LIB}
    LIBRARY_OUTPUT_DIRECTORY ${CMAKE_BINARY_DIR}/lib
    ARCHIVE_OUTPUT_DIRECTORY ${CMAKE_BINARY_DIR}/lib
  )

  install(TARGETS ${IR2VEC_LIB} ${IR2VEC_LIB_STATIC}
    LIBRARY DESTINATION lib
    PUBLIC_HEADER DESTINATION include
    RESOURCE DESTINATION ./)

  add_subdirectory(test-suite)

  add_custom_target(check_ir2vec
    COMMAND python3 test-lit.py -a .
    COMMENT "Running LIT based test-suite"
    WORKING_DIRECTORY ./test-suite
    DEPENDS ${PROJECT_NAME}
    VERBATIM
  )
else()
  file(COPY ${CMAKE_CURRENT_SOURCE_DIR}/include/IR2Vec.h DESTINATION ${LLVM_MAIN_INCLUDE_DIR}/llvm)

  set(LLVM_OPTIONAL_SOURCES ${binsrc})

  add_llvm_library(LLVMIR2Vec ${libsrc}
    DEPENDS
    intrinsics_gen
  )
  if(Eigen3_FOUND)
    target_link_libraries(LLVMIR2Vec PRIVATE Eigen3::Eigen)
  endif()
  target_include_directories(LLVMIR2Vec PRIVATE ${LLVM_MAIN_INCLUDE_DIR})
  target_include_directories(LLVMIR2Vec PRIVATE .)
endif()
```

--------------------------------

### Print Speedup Comparisons for AMD Tahiti 7970

Source: https://github.com/iith-compilers/ir2vec/blob/main/experiments/Device_Mapping/DevMap.ipynb

Prints the percentage increase in speedup for IR2Vec Flow-Aware embeddings against other methods for the AMD Tahiti 7970 GPU.

```python
print("\nAMD Tahiti 7970")
print(" % Increase in SpeedUp over Grewe et al - ", slowDown(tah_ir2vFA, tah_grewe))
print(" % Increase in SpeedUp over DeepTune - ", slowDown(tah_ir2vFA, tah_dt))
print(" % Increase in SpeedUp over Inst2Vec - ", slowDown(tah_ir2vFA, tah_ncc))
print(
    " % Increase in SpeedUp over IR2Vec Symbolic - ",
    slowDown(tah_ir2vFA, tah_ir2vSym),
)
```

--------------------------------

### Extract Eigen Library

Source: https://github.com/iith-compilers/ir2vec/blob/main/README.md

Extract the downloaded Eigen library archive.

```bash
tar -xvzf eigen-3.3.7.tar.gz
```

--------------------------------

### Model Filename Format

Source: https://github.com/iith-compilers/ir2vec/blob/main/seed_embeddings/README.md

The best model is saved in the specified `index_dir` with this filename format, which incorporates the training parameters.

```text
seedEmbedding_{}E_{}D_{}batches_{}margin.ckpt
```

--------------------------------

### Calculate and Print Speedup for a Given Platform

Source: https://github.com/iith-compilers/ir2vec/blob/main/experiments/Device_Mapping/DevMap.ipynb

Calculates the geometric mean of speedup for various device mapping methods (Grewe et al., DeepTune, NCC, IR2Vec Symbolic, IR2Vec Flow-Aware) for a specified platform and prints the results.

```python
def calcSpeedup(platform):
    grewe_geomean = gmean(
        grewe_res[grewe_res["Platform"] == platform]["Speedup"].values
    )
    deeptune_geomean = gmean(
        deeptune_res[deeptune_res["Platform"] == platform]["Speedup"].values
    )
    ncc_geomean = gmean(ncc_res[ncc_res["Platform"] == platform]["Speedup"].values)
    ir2vec_sym_geomean = gmean(
        ir2vec_sym[ir2vec_sym["Platform"] == platform]["Speedup"].values
    )
    ir2vec_fa_geomean = gmean(
        ir2vec_fa[ir2vec_fa["Platform"] == platform]["Speedup"].values
    )

    print(f"Geometric mean of Grewe et al. {grewe_geomean:.2f}x")
    print(f"Geometric mean of DeepTune {deeptune_geomean:.2f}x")
    print(f"Geometric mean of Inst2Vec {ncc_geomean:.2f}x")
    print(f"Geometric mean of IR2Vec Symbolic {ir2vec_sym_geomean:.3f}x")
    print(f"Geometric mean of IR2Vec Flow-Aware {ir2vec_fa_geomean:.3f}x")

    return (
        grewe_geomean,
        deeptune_geomean,
        ncc_geomean,
        ir2vec_sym_geomean,
        ir2vec_fa_geomean,
    )
```

--------------------------------

### Print Speedup Comparisons for NVIDIA GTX 970

Source: https://github.com/iith-compilers/ir2vec/blob/main/experiments/Device_Mapping/DevMap.ipynb

Prints the percentage increase in speedup for IR2Vec Flow-Aware embeddings against other methods for the NVIDIA GTX 970 GPU.
```python
print("\nNVIDIA GTX 970")
print(" % Increase in SpeedUp over Grewe et al - ", slowDown(gtx_ir2vFA, gtx_grewe))
print(" % Increase in SpeedUp over DeepTune - ", slowDown(gtx_ir2vFA, gtx_dt))
print(" % Increase in SpeedUp over Inst2Vec - ", slowDown(gtx_ir2vFA, gtx_ncc))
print(
    " % Increase in SpeedUp over IR2Vec Symbolic - ",
    slowDown(gtx_ir2vFA, gtx_ir2vSym),
)
```

--------------------------------

### Download SQLite Amalgamation Zip

Source: https://github.com/iith-compilers/ir2vec/blob/main/src/test-suite/CMakeLists.txt

Downloads the SQLite amalgamation source zip file if it does not already exist. Includes the expected SHA3_256 hash for verification and shows download progress.

```cmake
file(
  DOWNLOAD https://sqlite.org/2024/sqlite-amalgamation-3460000.zip ${SQLITE_ZIP}
  EXPECTED_HASH SHA3_256=1221eed70de626871912bfca144c00411f0c30d3c2b7935cff3963b63370ef7c
  SHOW_PROGRESS
)
```

--------------------------------

### Generate Triplets for Seed Embeddings

Source: https://github.com/iith-compilers/ir2vec/blob/main/seed_embeddings/README.md

Use the 'triplets.sh' script to collect program triplets. Specify the build directory, number of optimizations, a file listing LLVM files, and the output file name.

```bash
bash triplets.sh ../build 2 files_path.txt triplets.txt
```

--------------------------------

### Prior Art Speedup Values

Source: https://github.com/iith-compilers/ir2vec/blob/main/experiments/Thread_Coarsening/ThreadCoarsening.ipynb

Lists the speedup values and means for different prior art methods (Magni, DeepTune, NCC) used for comparison.
```python
magni_sp_vals = [1.21, 1.01, 0.86, 0.94]
magni_sp_mean = [1.005]
deeptune_sp_vals = [1.10, 1.05, 1.10, 0.99]
deeptune_sp_mean = [1.06]
deeptuneTL_sp_vals = [1.17, 1.23, 1.14, 0.93]
deeptuneTL_sp_mean = [1.1175]
ncc_sp_vals = [1.29, 1.07, 0.97, 1.01]
ncc_sp_mean = [1.086]
```

--------------------------------

### Generate Symbolic Embeddings (All Functions)

Source: https://github.com/iith-compilers/ir2vec/blob/main/README.md

Use the IR2Vec binary to generate Symbolic embeddings for all functions in an input LLVM file.

```bash
ir2vec -sym -dim <dim> -o <output-file> -level <p|f> -class <class-number> <input-ll-file>
```

--------------------------------

### Library Naming

Source: https://github.com/iith-compilers/ir2vec/blob/main/CMakeLists.txt

Sets variables for library names, distinguishing between dynamic and static builds. Used later in the CMakeLists.txt file.

```cmake
set(IR2VEC_LIB "IR2Vec")
set(IR2VEC_LIB_STATIC "IR2Vec_Static")
```

--------------------------------

### Performance Comparison for NVIDIA GTX 480

Source: https://github.com/iith-compilers/ir2vec/blob/main/experiments/Thread_Coarsening/ThreadCoarsening.ipynb

Prints the percentage increase in speedup of IR2Vec Flow-Aware embeddings over the other methods (Magni et al., DeepTune, DeepTune-TL, Inst2Vec, IR2Vec Symbolic) on the NVIDIA GTX 480 GPU.

```python
print("\nNVIDIA GTX 480")
print(" % Increase in SpeedUp over Magni et al - ", percentage(gtx_ir2vFA, gtx_magni))
print(" % Increase in SpeedUp over DeepTune - ", percentage(gtx_ir2vFA, gtx_dt))
print(" % Increase in SpeedUp over DeepTune_TL - ", percentage(gtx_ir2vFA, gtx_dtTL))
print(" % Increase in SpeedUp over Inst2Vec - ", percentage(gtx_ir2vFA, gtx_ncc))
print(
    " % Increase in SpeedUp over IR2Vec Symbolic - ",
    percentage(gtx_ir2vFA, gtx_ir2vSym),
)
```

--------------------------------

### Display Speedup Matrix

Source: https://github.com/iith-compilers/ir2vec/blob/main/experiments/Device_Mapping/DevMap.ipynb

Calculates and prints the speedup matrix comparing IR2Vec with other device mapping techniques.
It groups data by platform to compute mean speedup values.

```python
print("\nSpeedup Matrix: IR2Vec Vs. others\n")
ir2vec_sp_vals = ir2vec.groupby(["Platform"])["Speedup"].mean().values
ir2vec_sp_mean = ir2vec_sp_vals.mean()
sp_df = pd.DataFrame(
    {
        "Static Mapping": static_sp_vals + static_sp_mean,
        "Grewe et al.": grewe_sp_vals + grewe_sp_mean,
        "DeepTune": deeptune_sp_vals + deeptune_sp_mean,
        "NCC": ncc_sp_vals + ncc_sp_mean,
        "IR2Vec": list(ir2vec_sp_vals) + [ir2vec_sp_mean],
    },
    index=["AMD Tahiti 7970", "NVIDIA GTX 970", "Average"],
)
print(sp_df)
```

--------------------------------

### Load Embeddings and Evaluate IR2Vec Flow-Aware

Source: https://github.com/iith-compilers/ir2vec/blob/main/experiments/Thread_Coarsening/ThreadCoarsening.ipynb

Loads raw embeddings from a file and runs the IR2Vec Flow-Aware evaluation with the specified hyperparameters. This snippet is used for the Flow-Aware approach to thread coarsening.

```python
raw_embeddings, fileIndex = readEmd_program(
    "./output/embeddings/Thread_Coarsening_FlowAware_llvm19.txt"
)
ir2vec_fa = evaluate(max_depth=1, learning_rate=0.05, n_estimators=140)
```

--------------------------------

### IR2Vec C++ API Usage

Source: https://github.com/iith-compilers/ir2vec/blob/main/README.md

Demonstrates how to use the IR2Vec C++ interfaces to generate vector representations for LLVM modules, functions, and instructions.

```APIDOC
## C++ API Usage

### Description
This section shows how to integrate and use the IR2Vec C++ library in your project to obtain vector representations.

### Usage

1. **Include Header:**
```cpp
#include "IR2Vec.h"
```

2. **Create IR2Vec Object:**
Initialize the `IR2Vec` object with an LLVM module, embedding mode, and dimension.
```cpp
auto ir2vec = IR2Vec::Embeddings(<LLVM module>, IR2Vec::IR2VecMode::FlowAware, <dimension>);
```

3. **Get Instruction Vectors:**
Retrieve a map of instructions to their vector representations.
```cpp
auto instVecMap = ir2vec.getInstVecMap();
// Access vectors in the loop:
for (auto instVec : instVecMap) {
  outs() << "Instruction : ";
  instVec.first->print(outs());
  outs() << ": ";
  for (auto val : instVec.second)
    outs() << val << "\t";
}
```

4. **Get Function Vectors:**
Retrieve a map of functions to their vector representations.
```cpp
auto funcVecMap = ir2vec.getFunctionVecMap();
// Access vectors in the loop:
for (auto funcVec : funcVecMap) {
  outs() << "Function : " << funcVec.first->getName() << "\n";
  for (auto val : funcVec.second)
    outs() << val << "\t";
}
```

5. **Get Program Vector:**
Retrieve the vector representation for the entire program.
```cpp
auto pgmVec = ir2vec.getProgramVector();
// Access the vector:
for (auto val : pgmVec)
  outs() << val << "\t";
```
```

--------------------------------

### Train TransE Model for Seed Embeddings

Source: https://github.com/iith-compilers/ir2vec/blob/main/seed_embeddings/README.md

Generate seed embeddings by training the TransE model using 'generate_embedding_ray.py'. This script can be configured with various arguments for epochs, analogy scoring, link prediction, batch size, and margin.

```bash
python generate_embedding_ray.py --index_dir "../seed_embeddings/preprocessed/" --epoch 1500 --is_analogy True --use_gpu true
```

--------------------------------

### Calculate Overall Speedup Across Platforms

Source: https://github.com/iith-compilers/ir2vec/blob/main/experiments/Device_Mapping/DevMap.ipynb

Calculates and prints the geometric mean of speedup for various device mapping methods across all platforms, using the aggregated speedup values.

```python
grewe_geomean = gmean(grewe_res["Speedup"].values)
deeptune_geomean = gmean(deeptune_res["Speedup"].values)
ncc_geomean = gmean(ncc_res["Speedup"].values)
ir2vec_sym_geomean = gmean(ir2vec_sym["Speedup"].values)
ir2vec_fa_geomean = gmean(ir2vec_fa["Speedup"].values)

print(f"Geometric mean of Grewe et al. - {grewe_geomean:.2f}x")
print(f"Geometric mean of DeepTune - {deeptune_geomean:.2f}x")
print(f"Geometric mean of Inst2Vec - {ncc_geomean:.2f}x")
print(f"Geometric mean of IR2Vec Symbolic {ir2vec_sym_geomean:.2f}x")
print(f"Geometric mean of IR2Vec Flow-Aware {ir2vec_fa_geomean:.2f}x")
```

--------------------------------

### Load Embeddings and Evaluate IR2Vec Symbolic

Source: https://github.com/iith-compilers/ir2vec/blob/main/experiments/Device_Mapping/DevMap.ipynb

Loads pre-computed Symbolic embeddings for device mapping and then evaluates the IR2Vec model with the specified hyperparameters.

```python
raw_embeddings, fileIndexNum = readEmd_program(
    "./output/embeddings/Device_Mapping_Symbolic_llvm19.txt"
)
ir2vec_sym = evaluate(max_depth=10, learning_rate=0.5, n_estimators=70, seed=104)
```

--------------------------------

### Generate Flow-Aware Embeddings (All Functions)

Source: https://github.com/iith-compilers/ir2vec/blob/main/README.md

Use the IR2Vec binary to generate Flow-Aware embeddings for all functions in an input LLVM file.

```bash
ir2vec -fa -dim <dim> -o <output-file> -level <p|f> -class <class-number> <input-ll-file>
```

--------------------------------

### Search for LLVM Version Strings

Source: https://github.com/iith-compilers/ir2vec/blob/main/docs/version_upgrade_process.md

Use `git grep` to find all occurrences of a specific LLVM version string within the project. This is useful for identifying all locations that need to be updated during a version upgrade.

```bash
git grep 16
```

```bash
git grep llvm16
```

--------------------------------

### Displaying Performance Metrics DataFrame

Source: https://github.com/iith-compilers/ir2vec/blob/main/experiments/Thread_Coarsening/ThreadCoarsening.ipynb

This snippet generates and prints a pandas DataFrame containing performance metrics for different thread coarsening methods. It's useful for comparing the effectiveness of IR2Vec against other techniques.
```python
# Excerpt from the end of an enclosing function, hence the trailing return.
sp_df = pd.DataFrame(
    {
        "DeepTune": deeptune_sp_vals + deeptune_sp_mean,
        "DeepTune-TL": deeptuneTL_sp_vals + deeptuneTL_sp_mean,
        "NCC": ncc_sp_vals + ncc_sp_mean,
        "IR2Vec": list(ir2vec_sp_vals) + [ir2vec_sp_mean],
    },
    index=[
        "AMD Radeon HD 5900",
        "AMD Tahiti 7970",
        "NVIDIA GTX 480",
        "NVIDIA Tesla K20c",
        "Average",
    ],
)
print(sp_df)
return ir2vec
```
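
--------------------------------

### Sketch: Appending the "Average" Row

The DataFrame above builds each column by concatenating a list of per-platform speedups with a one-element list holding their mean, so the final "Average" row lines up with the index. The following is a minimal, standard-library-only sketch of that pattern and is not taken from the notebook: the helpers `with_average` and `build_rows` are hypothetical names introduced here, and only the DeepTune and DeepTune-TL values quoted in this document are used. It assumes the values are listed in the same platform order as the DataFrame index.

```python
from statistics import mean

# Platform order assumed to match the DataFrame index shown above.
platforms = [
    "AMD Radeon HD 5900",
    "AMD Tahiti 7970",
    "NVIDIA GTX 480",
    "NVIDIA Tesla K20c",
]

# Per-platform speedups quoted in the "Prior Art Speedup Values" entry.
prior_art = {
    "DeepTune": [1.10, 1.05, 1.10, 0.99],
    "DeepTune-TL": [1.17, 1.23, 1.14, 0.93],
}


def with_average(values):
    """Hypothetical helper: per-platform values followed by their arithmetic mean."""
    return list(values) + [mean(values)]


def build_rows(columns, index):
    """Hypothetical helper: map each row label to a {method: value} dict."""
    extended = {name: with_average(vals) for name, vals in columns.items()}
    return {
        label: {name: col[i] for name, col in extended.items()}
        for i, label in enumerate(index)
    }


rows = build_rows(prior_art, platforms + ["Average"])
for label, row in rows.items():
    cells = "  ".join(f"{name}={value:.4f}" for name, value in row.items())
    print(f"{label:<20}{cells}")
```

The computed means agree with the hard-coded `deeptune_sp_mean` and `deeptuneTL_sp_mean` values listed earlier, which is a quick way to sanity-check a column before it is placed in the table.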