### Tensix DSL Minimal Example: Tensor Addition

Source: https://github.com/tenstorrent/tt-lang/blob/main/docs/HITCHHIKERS_GUIDE.md

A minimal example demonstrating a Tensix DSL program for element-wise tensor addition. It defines compute and data movement kernels, configures the execution grid and tiling, and illustrates tensor operations using CircularBuffers and TensorAccessors. This example requires the pykernel library and PyTorch.

```python
from pykernel.d2m_api import *
import torch

@pykernel_gen(
    grid=(2, 2),            # 2x2 grid of cores
    block_factors=[
        (1, 1),  # lhs: 1x1 tiles
        (1, 1),  # rhs: 1x1 tiles
        (1, 1),  # out: 1x1 tiles
    ],
)
def add(lhs, rhs, out, block_factors=None, grid=None):
    lhs_acc = TensorAccessor(lhs)
    rhs_acc = TensorAccessor(rhs)

    @compute()
    def add_kernel(
        lhs_cb: CircularBuffer,
        rhs_cb: CircularBuffer,
        out_cb: CircularBuffer,
    ):
        lhs_blk = lhs_cb.wait()       # Acquire data
        rhs_blk = rhs_cb.wait()
        out_blk = out_cb.reserve()    # Acquire space
        result = lhs_blk + rhs_blk    # Compute
        out_blk.store(result)         # Write
        lhs_cb.pop()                  # Release (signal consumed)
        rhs_cb.pop()
        out_cb.push()                 # Release (signal data ready)

    @datamovement()
    def dm_reader(
        lhs_cb: CircularBuffer,
        rhs_cb: CircularBuffer,
        out_cb: CircularBuffer,
    ):
        core_row_idx = core_index(0)
        core_col_idx = core_index(1)
        linear_idx = core_row_idx * 2 + core_col_idx  # Assuming 2x2 grid

        lhs_blk = lhs_cb.reserve()
        lhs_tx = dma(lhs_acc[linear_idx, 0], lhs_blk)
        lhs_tx.wait()
        lhs_cb.push()

        rhs_blk = rhs_cb.reserve()
        rhs_tx = dma(rhs_acc[linear_idx, 0], rhs_blk)
        rhs_tx.wait()
        rhs_cb.push()

    return Program(add_kernel, dm_reader)(lhs, rhs, out)

lhs = torch.randn(128, 128)
rhs = torch.randn(128, 128)
out = torch.zeros(128, 128)
add(lhs, rhs, out)
```
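
The `wait`/`reserve`/`push`/`pop` handshake used above can be modelled in plain Python. The following is a toy, single-threaded sketch only: `ToyCircularBuffer` and `run_add` are hypothetical names, not part of pykernel, and the assertions mark the points where a real kernel would presumably block instead of failing.

```python
from collections import deque


class ToyCircularBuffer:
    """Hypothetical single-threaded stand-in for a CircularBuffer (not the pykernel class)."""

    def __init__(self, num_slots=2):
        self.free = num_slots   # slots a producer may still reserve
        self.ready = deque()    # blocks published by the producer, oldest first
        self.pending = None     # block handed out by reserve(), not yet pushed

    def reserve(self):
        # Producer side: acquire an empty slot to fill.
        assert self.free > 0, "no free slot: a real producer would block here"
        self.free -= 1
        self.pending = {}
        return self.pending

    def push(self):
        # Producer side: publish the reserved block to the consumer.
        self.ready.append(self.pending)
        self.pending = None

    def wait(self):
        # Consumer side: look at the oldest published block.
        assert self.ready, "no data: a real consumer would block here"
        return self.ready[0]

    def pop(self):
        # Consumer side: release the block so its slot can be reserved again.
        self.ready.popleft()
        self.free += 1


def run_add(lhs_value, rhs_value):
    lhs_cb, rhs_cb, out_cb = ToyCircularBuffer(), ToyCircularBuffer(), ToyCircularBuffer()

    # Data movement role: reserve -> fill -> push.
    for cb, value in ((lhs_cb, lhs_value), (rhs_cb, rhs_value)):
        blk = cb.reserve()
        blk["data"] = value
        cb.push()

    # Compute role: wait -> reserve -> compute -> pop inputs -> push output.
    lhs_blk, rhs_blk = lhs_cb.wait(), rhs_cb.wait()
    out_blk = out_cb.reserve()
    out_blk["data"] = lhs_blk["data"] + rhs_blk["data"]
    lhs_cb.pop()
    rhs_cb.pop()
    out_cb.push()
    return out_cb.wait()["data"]


print(run_add(2.0, 3.0))
```

The model keeps the producer and consumer in one function for readability; in the DSL they are separate kernels that communicate only through the buffers.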

--------------------------------

### Run TT-Lang Example

Source: https://github.com/tenstorrent/tt-lang/blob/main/CLAUDE.md

Executes a TT-Lang example script, such as `custom_dm_matmul.py`. For debugging, the initial and final MLIR can optionally be dumped to files by setting the `TTLANG_INITIAL_MLIR` and `TTLANG_FINAL_MLIR` environment variables before running the script.

```bash
python examples/custom_dm_matmul.py
```

--------------------------------

### Build tt-mlir and tt-lang from Source

Source: https://github.com/tenstorrent/tt-lang/blob/main/README.md

Builds tt-mlir and installs it locally, then configures tt-lang to use this local installation. This is the common scenario for users without a pre-built or pre-installed tt-mlir. It fetches, configures, builds, and installs tt-mlir based on the commit SHA in `third-party/tt-mlir.commit`. Requires an LLVM/MLIR toolchain.

```bash
cd /path/to/tt-lang
cmake -GNinja -Bbuild .
source build/env/activate
cmake --build build
```

--------------------------------

### First-time tt-lang Setup and Build

Source: https://github.com/tenstorrent/tt-lang/blob/main/docs/BUILD_SYSTEM.md

This sequence outlines the initial setup process for the tt-lang project. It includes configuring the build, activating the generated environment script, and then performing the actual build.

```bash
# Build tt-lang
cd /path/to/tt-lang
cmake -GNinja -Bbuild .      # Configure and generate activation script
source build/env/activate     # Activate the environment
cmake --build build           # Build tt-lang
```

--------------------------------

### Python Package Usage and Installation

Source: https://github.com/tenstorrent/tt-lang/blob/main/docs/BUILD_SYSTEM.md

Shows how to verify the tt-lang Python package after building and how to install it as an editable package. This requires the tt-lang environment to be activated.

```bash
# Verify version after building (environment must be activated)
python3 -c "import ttlang; print(ttlang.__version__)"

# Install as an editable package
pip install -e python/
```

--------------------------------

### Install pre-commit with apt (Ubuntu/Debian)

Source: https://github.com/tenstorrent/tt-lang/blob/main/CONTRIBUTING.md

Installs the pre-commit package using apt, the package manager for Ubuntu and Debian-based systems. This is an alternative installation method for Linux users.

```bash
# Ubuntu/Debian
sudo apt install pre-commit
```

--------------------------------

### Install Python Packages with CMake

Source: https://github.com/tenstorrent/tt-lang/blob/main/python/CMakeLists.txt

This CMake command installs the contents of `${CMAKE_BINARY_DIR}/python_packages/` into the root of the install prefix (`DESTINATION .`). The rule belongs to the `TTLangPythonWheel` component, and `EXCLUDE_FROM_ALL` keeps it out of the default install, so it runs only when that component is installed explicitly. The source directory must exist at install time.

```cmake
install(DIRECTORY ${CMAKE_BINARY_DIR}/python_packages/
  DESTINATION .
  COMPONENT TTLangPythonWheel
  EXCLUDE_FROM_ALL)
```

--------------------------------

### Install pre-commit with Homebrew (macOS)

Source: https://github.com/tenstorrent/tt-lang/blob/main/CONTRIBUTING.md

Installs the pre-commit package using Homebrew, a package manager for macOS. This is an alternative installation method for macOS users.

```bash
# macOS
brew install pre-commit
```

--------------------------------

### Build tt-mlir from Source

Source: https://github.com/tenstorrent/tt-lang/blob/main/docs/BUILD_SYSTEM.md

Commands to build tt-mlir from its source directory. This process involves activating the environment, configuring with CMake, and then building the project. Ensure you are in the correct tt-mlir directory and have the necessary build tools installed.

```bash
cd /path/to/tt-mlir
source env/activate
cmake -GNinja -Bbuild .
cmake --build build
```

--------------------------------

### Install pre-commit with pip

Source: https://github.com/tenstorrent/tt-lang/blob/main/CONTRIBUTING.md

Installs the pre-commit package using pip, a package installer for Python. This is a prerequisite for setting up pre-commit hooks.

```bash
pip install pre-commit
```

--------------------------------

### CMake: Installation of Wheel Files

Source: https://github.com/tenstorrent/tt-lang/blob/main/CMakeLists.txt

This snippet uses `file(GLOB_RECURSE)` to find all `.whl` (wheel) files within the binary directory and then installs them to the `wheels` destination relative to the installation prefix. This is typically used for packaging Python components. Because the glob is evaluated at configure time, wheels produced later are picked up only after CMake is re-run.

```cmake
# Install wheel files to prefix/wheels
file(GLOB_RECURSE WHEEL_FILES "${CMAKE_BINARY_DIR}/*.whl")
if(WHEEL_FILES)
  install(FILES ${WHEEL_FILES} DESTINATION wheels)
endif()
```

--------------------------------

### Verify tt-mlir Installation Path

Source: https://github.com/tenstorrent/tt-lang/blob/main/docs/BUILD_SYSTEM.md

Checks if the TTMLIRConfig.cmake file exists in the specified toolchain directory. This is crucial for build systems to locate and use the tt-mlir installation. If the file is not found, the installation path needs to be corrected or tt-mlir needs to be installed there.

```bash
ls ${TTMLIR_TOOLCHAIN_DIR}/lib/cmake/ttmlir/TTMLIRConfig.cmake
```

--------------------------------

### Building tt-mlir Prerequisite

Source: https://github.com/tenstorrent/tt-lang/blob/main/CLAUDE.md

This section outlines the steps to build the tt-mlir project, which is a prerequisite for building tt-lang. It includes activating the tt-mlir environment, configuring with CMake, and building the project. Specific instructions for macOS are noted.

```bash
cd <tt-mlir-directory>
source env/activate

# Configure and build tt-mlir
# Follow instructions in MACOS_BUILD.md (macOS) or README.md (Linux)
cmake -G Ninja -B build <options>
cmake --build build
```

--------------------------------

### Perform a Git Commit with Pre-commit

Source: https://github.com/tenstorrent/tt-lang/blob/main/README.md

Executes a standard git commit. If pre-commit hooks are installed, they will automatically run before the commit is finalized. Any formatting issues detected will halt the commit process.

```bash
git commit -m "Your commit message"
```

--------------------------------

### CMake Build Configurations for tt-lang

Source: https://github.com/tenstorrent/tt-lang/blob/main/docs/BUILD_SYSTEM.md

Demonstrates various ways to configure the tt-lang build using CMake, including specifying the tt-mlir toolchain location, custom install prefixes, performance trace enablement, and Python bindings.

```bash
# Scenario 2: Use pre-installed tt-mlir (via environment variable)
export TTMLIR_TOOLCHAIN_DIR=/opt/ttmlir-toolchain
cmake -GNinja -Bbuild .

# Scenario 2: Use pre-installed tt-mlir (via CMake variable)
cmake -GNinja -Bbuild . -DTTMLIR_TOOLCHAIN_DIR=/opt/ttmlir-toolchain

# Scenario 3: Automatic build (no extra options needed)
cmake -GNinja -Bbuild .

# Scenario 3: Automatic build with custom install prefix
cmake -GNinja -Bbuild . -DTTMLIR_INSTALL_PREFIX=/tmp/my-ttmlir-install

# Scenario 3: Automatic build with performance trace enabled
cmake -GNinja -Bbuild . -DTTLANG_ENABLE_PERF_TRACE=ON -DTTMLIR_CMAKE_BUILD_TYPE=Release

# Debug build with Python bindings
cmake -GNinja -Bbuild . -DCMAKE_BUILD_TYPE=Debug -DTTLANG_ENABLE_BINDINGS_PYTHON=ON
```

--------------------------------

### Install pre-commit git hooks

Source: https://github.com/tenstorrent/tt-lang/blob/main/CONTRIBUTING.md

Installs the git hook scripts for pre-commit after cloning the repository. This configures git to automatically run pre-commit checks before each commit.

```bash
cd /path/to/tt-lang
pre-commit install
```

--------------------------------

### Building tt-lang

Source: https://github.com/tenstorrent/tt-lang/blob/main/CLAUDE.md

Instructions for configuring and building the tt-lang project using CMake and Ninja. It emphasizes ensuring tt-mlir is built successfully beforehand and provides commands for initial configuration, building, and rebuilding after code changes.

```bash
cd <tt-lang-directory>
source env/activate

# Configure
cmake -G Ninja -B build .

# Build
cmake --build build

# Rebuild after code changes
cmake --build build
```

--------------------------------

### TT-Lang DSL Path Compilation Pipeline

Source: https://github.com/tenstorrent/tt-lang/blob/main/docs/HITCHHIKERS_GUIDE.md

Illustrates the compilation flow for the TT-Lang DSL (pykernel) path, starting from Python DSL code and progressing through D2M dialect IR, frontend and middleend passes, TTKernel dialect, and finally to EmitC and Flatbuffer for execution on hardware. It highlights key stages like MetalLayoutAttr creation, stream ops wrapping, D2M dialect generation, bufferization, and DMA lowering.

```text
Python DSL Code
  @pykernel_gen(grid=(2,2))
  def matmul(lhs, rhs, out):
      @compute() ...
      @datamovement() ...
      ↓
  d2m_api.py compiles each thread to MLIR
    - Creates MetalLayoutAttr with shapes/grid
    - Wraps TensorAccessor() args with stream_layout ops
    - Compiles thread AST to D2M dialect
    - Python operators generate linalg.generic blocks with D2M tile ops
    - Glues threads into d2m.generic op
      ↓
  D2M Dialect IR (tensor<4x4x!ttcore.tile<32x32>>)
      ↓
  Frontend Passes
    - d2m-generic-replace-globals (swap globals for function args)
    - fusion, canonicalization
      ↓
  Bufferization (tensor → memref)
      ↓
  Middleend Passes
    - d2m-allocate (assign L1 addresses)
    - d2m-generic-lower-dmas (DMA to hardware ops)
      ↓
  TTKernel Dialect
      ↓
  EmitC + Flatbuffer
      ↓
  Binary runs on hardware
```

--------------------------------

### FlashAttention Implementation with TT-Lang

Source: https://github.com/tenstorrent/tt-lang/blob/main/docs/HITCHHIKERS_GUIDE.md

Sketches the FlashAttention algorithm in TT-Lang. This example showcases block-wise processing over K/V blocks and the running-maximum and running-sum updates used to compute attention scores incrementally. The functions `fill`, `transpose`, `sqrt`, `rowmax`, `exp`, and `rowsum` are marked as not implemented, and `shape`, `inf`, and `d_head` are left undefined, so the snippet is illustrative rather than runnable.

```python
@pykernel_gen(grid=(1, 1), block_factors=[(1, 1), (1, 1), (1, 1), (1, 1)])
def flash_attention(Q, K, V, out, block_factors=None, grid=None):
    Q_acc = TensorAccessor(Q)
    K_acc = TensorAccessor(K)
    V_acc = TensorAccessor(V)
    NUM_KV_BLOCKS = 4

    @compute()
    def attention_compute(Q_cb, K_cb, V_cb, out_cb):
        m_old = fill(-inf, shape)  # fill() not implemented
        l_old = fill(0, shape)      # fill() not implemented
        O_result = fill(0, shape)   # fill() not implemented

        for kv_idx in range(NUM_KV_BLOCKS):
            Q_blk = Q_cb.wait()
            K_blk = K_cb.wait()
            V_blk = V_cb.wait()

            S = Q_blk @ transpose(K_blk)  # transpose() not implemented
            S_scaled = S * (1.0 / sqrt(d_head))  # sqrt() and scalar broadcast not implemented

            m_new = rowmax(S_scaled, m_old)  # rowmax() not implemented
            P = exp(S_scaled - m_new)  # exp() not implemented
            correction = exp(m_old - m_new)  # exp() not implemented
            l_new = correction * l_old + rowsum(P)  # rowsum() not implemented

            O_result = (l_old / l_new) * correction * O_result + (P / l_new) @ V_blk

            Q_cb.pop()
            K_cb.pop()
            V_cb.pop()

            m_old = m_new
            l_old = l_new

        out_blk = out_cb.reserve()
        out_blk.store(O_result)
        out_cb.push()

    @datamovement()
    def dm_reader(Q_cb, K_cb, V_cb, out_cb):
        core_row_idx = core_index(0)
        core_col_idx = core_index(1)
        linear_idx = core_row_idx * 1 + core_col_idx

        for kv_idx in range(NUM_KV_BLOCKS):
            Q_blk = Q_cb.reserve()
            dma(Q_acc[linear_idx, 0], Q_blk).wait()
            Q_cb.push()

            K_blk = K_cb.reserve()
            dma(K_acc[kv_idx, 0], K_blk).wait()
            K_cb.push()

            V_blk = V_cb.reserve()
            dma(V_acc[kv_idx, 0], V_blk).wait()
            V_cb.push()

    return Program(attention_compute, dm_reader)(Q, K, V, out)
```
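
Because the TT-Lang version above cannot run yet, the update rule it uses can be checked separately in plain Python. The sketch below is an assumption-laden stand-in rather than TT-Lang code: it handles a single query row, uses lists instead of tiles, and the names `attention_reference` and `attention_blockwise` are made up for illustration. It mirrors the `m_new`, `correction`, and `l_new` updates from the kernel and compares the result with ordinary softmax attention.

```python
import math


def attention_reference(q, keys, values):
    """Plain softmax attention for one query row."""
    scale = 1.0 / math.sqrt(len(q))
    scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys]
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[j] for w, v in zip(weights, values)) / total for j in range(dim)]


def attention_blockwise(q, keys, values, block_size):
    """Same result, visiting K/V one block at a time with running statistics."""
    scale = 1.0 / math.sqrt(len(q))
    dim = len(values[0])
    m_old = float("-inf")  # running row maximum
    l_old = 0.0            # running softmax denominator
    out = [0.0] * dim      # running, already-normalized output

    for start in range(0, len(keys), block_size):
        k_blk = keys[start:start + block_size]
        v_blk = values[start:start + block_size]
        s = [scale * sum(a * b for a, b in zip(q, k)) for k in k_blk]

        m_new = max(m_old, max(s))
        p = [math.exp(x - m_new) for x in s]
        correction = math.exp(m_old - m_new)  # exp(-inf) is 0.0 on the first block
        l_new = correction * l_old + sum(p)

        out = [
            (l_old / l_new) * correction * out[j]
            + sum(w * v[j] for w, v in zip(p, v_blk)) / l_new
            for j in range(dim)
        ]
        m_old, l_old = m_new, l_new
    return out


# Arbitrary test data: 8 key/value rows, processed in blocks of 2.
q = [0.5, -1.0, 2.0, 0.25]
keys = [[0.1 * (i + j) for j in range(4)] for i in range(8)]
values = [[float(i), float(i % 3)] for i in range(8)]

ref = attention_reference(q, keys, values)
blk = attention_blockwise(q, keys, values, block_size=2)
print(max(abs(a - b) for a, b in zip(ref, blk)))  # largest difference between the two
```

The two functions agree up to floating-point rounding, which is the property that lets the kernel process K/V blocks one at a time without materializing the full score matrix.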

--------------------------------

### Linking Against tt-mlir Libraries in CMake

Source: https://github.com/tenstorrent/tt-lang/blob/main/docs/BUILD_SYSTEM.md

Provides an example of how to link your target executable or library against specific tt-mlir targets using `target_link_libraries`. This allows your code to utilize tt-mlir's core dialects and support libraries.

```cmake
# Link against tt-mlir libraries
target_link_libraries(MyTarget
  PRIVATE
    MLIRTTCoreDialect
    MLIRTTNNDialect
    MLIRTTIRDialect
    # other tt-mlir targets
)
```

--------------------------------

### Save Initial and Final MLIR

Source: https://github.com/tenstorrent/tt-lang/blob/main/CLAUDE.md

Saves the MLIR generated before and after transformations by setting environment variables. This allows for comparison of the IR state at different stages of the compilation process.

```bash
# Save initial and final MLIR
export TTLANG_INITIAL_MLIR=/tmp/test_initial.mlir
export TTLANG_FINAL_MLIR=/tmp/test_final.mlir

python examples/custom_dm_matmul.py

# View the files
cat /tmp/test_initial.mlir
cat /tmp/test_final.mlir
```

--------------------------------

### Clean and Reconfigure CMake Build

Source: https://github.com/tenstorrent/tt-lang/blob/main/docs/BUILD_SYSTEM.md

Removes the existing build directory and reconfigures the project using CMake. This is a common troubleshooting step for CMake configuration errors, ensuring a fresh build environment. It requires CMake version 3.24+ and Ninja to be installed.

```bash
rm -rf build
cmake -GNinja -Bbuild .
```

--------------------------------

### ShardLayoutAttr MLIR Example

Source: https://github.com/tenstorrent/tt-lang/blob/main/docs/HITCHHIKERS_GUIDE.md

Represents a low-level layout attribute for memrefs that are storage (not views). It specifies the stride in bytes for the physical layout. This attribute is automatically generated during bufferization for storage operands.

```mlir
#ttcore.shard<8192>  // Stride in bytes
```

--------------------------------

### ViewLayoutAttr MLIR Example

Source: https://github.com/tenstorrent/tt-lang/blob/main/docs/HITCHHIKERS_GUIDE.md

Represents a low-level layout attribute for memrefs that are views (not storage). It contains an affine map used for indexing. This attribute is automatically generated during bufferization when the MetalLayoutAttr includes an index_map.

```mlir
#ttcore.view<map(4)>  // Identity map for rank 4
```

--------------------------------

### Element-Wise Add with TT-Lang

Source: https://github.com/tenstorrent/tt-lang/blob/main/docs/HITCHHIKERS_GUIDE.md

Implements a simple element-wise addition using TT-Lang's `@pykernel_gen` decorator. A compute kernel adds the two input blocks, and a data movement kernel loads them through `TensorAccessor` views. The program runs on a 2x2 grid of cores, and each core selects its block with a linear index derived from its row and column.

```python
@pykernel_gen(grid=(2,2), block_factors=[(1,1), (1,1), (1,1)])
def add(lhs, rhs, out, block_factors=None, grid=None):
    lhs_acc = TensorAccessor(lhs)
    rhs_acc = TensorAccessor(rhs)

    @compute()
    def compute_add(
        lhs_cb: CircularBuffer,
        rhs_cb: CircularBuffer,
        out_cb: CircularBuffer,
    ):
        lhs_blk = lhs_cb.wait()
        rhs_blk = rhs_cb.wait()
        out_blk = out_cb.reserve()
        result = lhs_blk + rhs_blk
        out_blk.store(result)
        lhs_cb.pop()
        rhs_cb.pop()
        out_cb.push()

    @datamovement()
    def dm(
        lhs_cb: CircularBuffer,
        rhs_cb: CircularBuffer,
        out_cb: CircularBuffer,
    ):
        core_row_idx = core_index(0)
        core_col_idx = core_index(1)
        linear_idx = core_row_idx * 2 + core_col_idx

        lhs_blk = lhs_cb.reserve()
        lhs_tx = dma(lhs_acc[linear_idx, 0], lhs_blk)
        lhs_tx.wait()
        lhs_cb.push()

        rhs_blk = rhs_cb.reserve()
        rhs_tx = dma(rhs_acc[linear_idx, 0], rhs_blk)
        rhs_tx.wait()
        rhs_cb.push()

    return Program(compute_add, dm)(lhs, rhs, out)
```

--------------------------------

### Activating tt-lang Environment

Source: https://github.com/tenstorrent/tt-lang/blob/main/CLAUDE.md

This snippet demonstrates how to activate the tt-lang environment. It includes navigating to the tt-lang root, optionally setting the `TT_MLIR_HOME` environment variable if tt-mlir is not in the default location, and sourcing the activation script. It also shows how to verify the activation.

```bash
# Navigate to tt-lang root
cd <tt-lang-directory>

# If tt-mlir is not at ../tt-mlir, set TT_MLIR_HOME first
export TT_MLIR_HOME=/path/to/tt-mlir  # Optional, only if not sibling dir

# Activate environment (automatically sources tt-mlir's environment)
source env/activate

# Only set when running examples/tests - ask user for path when needed
# export SYSTEM_DESC_PATH=/path/to/system_desc.ttsys

# Verify activation
echo $TTLANG_ENV_ACTIVATED  # Should be 1
echo $TT_LANG_HOME         # Should be tt-lang root
echo $TT_MLIR_HOME         # Should be tt-mlir root
```
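
The same verification can be scripted, for example as a guard at the top of a helper script. This is a minimal sketch: the function name `check_activation` is hypothetical, and the only facts it relies on are the three variable names and the expected value `1` shown above.

```python
import os


def check_activation(env=None):
    """Return a list of problems; an empty list means the environment looks activated."""
    env = os.environ if env is None else env
    problems = []
    if env.get("TTLANG_ENV_ACTIVATED") != "1":
        problems.append("TTLANG_ENV_ACTIVATED is not 1")
    for name in ("TT_LANG_HOME", "TT_MLIR_HOME"):
        if not env.get(name):
            problems.append(f"{name} is not set")
    return problems


# Check the current process environment.
for problem in check_activation():
    print("warning:", problem)
```

Passing a plain dictionary instead of `os.environ` makes the check easy to exercise without touching the real shell environment.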

--------------------------------

### MetalLayoutAttr MLIR Example

Source: https://github.com/tenstorrent/tt-lang/blob/main/docs/HITCHHIKERS_GUIDE.md

Defines a high-level layout attribute for tensors before bufferization. It specifies the logical shape, grid distribution, memory space, and optionally an index map for views. This is typically created by Python API functions.

```mlir
#ttcore.metal_layout<
  logical_shape = 128x128,
  dim_alignments = 32x32,
  collapsed_intervals = dense<[[0,1], [1,2]]>,
  undef,
  l1>
```

--------------------------------

### Count Operations by Type in MLIR Dumps (Bash)

Source: https://github.com/tenstorrent/tt-lang/blob/main/CLAUDE.md

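This command counts the occurrences of each `dialect.op`-style token within an MLIR file. It is useful for getting a high-level overview of the IR content and for comparing the distribution of operations between different MLIR dumps, such as before and after a compiler pass. The pattern is a heuristic: it matches any lowercase dotted token, so dotted type names and floating-point literals are counted as well.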

```bash
# Count operations by type in initial dump
grep -o "[a-z0-9_]*\.[a-z0-9_]*" /tmp/compare_initial.mlir | sort | uniq -c

# Count operations by type in final dump
grep -o "[a-z0-9_]*\.[a-z0-9_]*" /tmp/compare_final.mlir | sort | uniq -c
```
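
For scripted comparisons, the same heuristic can be expressed in Python. The sketch below reuses the regular expression from the `grep` command. The function name `count_ops` and the sample IR are illustrative only, and the same caveat applies: any dotted lowercase token is counted, not just operation names.

```python
import re
from collections import Counter

# Same pattern as the grep command: <lowercase token>.<lowercase token>
OP_PATTERN = re.compile(r"[a-z0-9_]*\.[a-z0-9_]*")


def count_ops(mlir_text):
    """Count dotted tokens (a rough proxy for operation names) in MLIR text."""
    return Counter(OP_PATTERN.findall(mlir_text))


# Small hand-written sample; real input would be read from a dump file.
sample = """
func.func @add(%a: tensor<4xf32>, %b: tensor<4xf32>) -> tensor<4xf32> {
  %0 = arith.addf %a, %b : tensor<4xf32>
  %1 = arith.addf %0, %b : tensor<4xf32>
  return %1 : tensor<4xf32>
}
"""

counts = count_ops(sample)
for name, n in counts.most_common():
    print(n, name)
```

Subtracting two such counters (`Counter` supports `-`) gives a quick view of which tokens became more frequent between two dumps.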

--------------------------------

### Verify LLVM/MLIR Toolchain Components

Source: https://github.com/tenstorrent/tt-lang/blob/main/docs/BUILD_SYSTEM.md

Confirms the presence of essential LLVM and MLIR configuration files within the toolchain directory. These files are required for the build system to find and link against LLVM and MLIR libraries. If missing, the LLVM/MLIR toolchain needs to be installed or built separately.

```bash
ls ${TTMLIR_TOOLCHAIN_DIR}/lib/cmake/mlir/MLIRConfig.cmake
ls ${TTMLIR_TOOLCHAIN_DIR}/lib/cmake/llvm/LLVMConfig.cmake
```
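
When several configuration files need checking at once, a small script can report exactly which ones are missing. This sketch assumes only the three relative paths used in the `ls` commands of this document. The helper name `missing_configs` is hypothetical, and the temporary directory merely simulates a partially populated toolchain.

```python
import tempfile
from pathlib import Path

# Relative paths taken from the ls checks for tt-mlir, MLIR, and LLVM.
REQUIRED_CONFIGS = (
    "lib/cmake/ttmlir/TTMLIRConfig.cmake",
    "lib/cmake/mlir/MLIRConfig.cmake",
    "lib/cmake/llvm/LLVMConfig.cmake",
)


def missing_configs(toolchain_dir):
    """Return the required config files that are absent under toolchain_dir."""
    root = Path(toolchain_dir)
    return [rel for rel in REQUIRED_CONFIGS if not (root / rel).is_file()]


# Demo: a fake toolchain directory that contains only MLIRConfig.cmake.
with tempfile.TemporaryDirectory() as tmp:
    cfg = Path(tmp) / "lib/cmake/mlir/MLIRConfig.cmake"
    cfg.parent.mkdir(parents=True)
    cfg.touch()
    demo_missing = missing_configs(tmp)

print(demo_missing)
```

In practice the argument would be the value of `TTMLIR_TOOLCHAIN_DIR`, for example `missing_configs(os.environ["TTMLIR_TOOLCHAIN_DIR"])` after importing `os`.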

--------------------------------

### TTNN Bindings Compilation Path

Source: https://github.com/tenstorrent/tt-lang/blob/main/docs/HITCHHIKERS_GUIDE.md

Outlines the compilation flow when using the TTNN Python API. It starts with Python TTNN API calls, which are converted to the TTIR dialect, and then proceeds to the TTIR-to-D2M conversion, eventually joining the main DSL compilation path at the D2M dialect.

```text
Python TTNN API
  ttnn.matmul(a, b)
    ↓
TTIR Dialect
    ↓
ttir-to-d2m conversion
    ↓
(joins DSL path at D2M dialect)
```

--------------------------------

### Daily Development Workflow for tt-lang

Source: https://github.com/tenstorrent/tt-lang/blob/main/docs/BUILD_SYSTEM.md

Describes the commands needed for daily development within the tt-lang project. This involves activating the environment in each new shell, building the project, and running tests.

```bash
# In each new shell session:
cd /path/to/tt-lang
source build/env/activate

# Build
cmake --build build

# Test
pytest tests/
```

--------------------------------

### Run tt-lang Tests with LLVM lit

Source: https://github.com/tenstorrent/tt-lang/blob/main/README.md

Provides instructions for running tt-lang tests directly using the LLVM 'lit' test runner. This allows for more granular control over test execution, targeting either MLIR dialect tests or Python runtime tests separately. The build environment must be activated.

```bash
source build/env/activate
llvm-lit -sv test/ttlang/     # MLIR dialect tests
llvm-lit -sv test/python/     # Python runtime tests
```

--------------------------------

### Running tt-lang Tests with llvm-lit

Source: https://github.com/tenstorrent/tt-lang/blob/main/CLAUDE.md

This snippet shows how to run the tests for tt-lang using the `llvm-lit` testing tool. It requires activating the tt-lang environment and setting the `SYSTEM_DESC_PATH` environment variable, for which the user should be prompted. It provides a command to run all Python tests.

```bash
cd <tt-lang-directory>
source env/activate
export SYSTEM_DESC_PATH=/path/to/system_desc.ttsys  # Ask user for path

# Run all tests
llvm-lit -sv test/python/
```

--------------------------------

### CMake Build Options for tt-lang

Source: https://github.com/tenstorrent/tt-lang/blob/main/README.md

Demonstrates various CMake build options for tt-lang, including enabling Python bindings, specifying a custom installation prefix for tt-mlir, and enabling code coverage. These options allow customization of the build process to suit different development and deployment needs.

```bash
# Debug build with Python bindings
cmake -GNinja -Bbuild . -DCMAKE_BUILD_TYPE=Debug -DTTLANG_ENABLE_BINDINGS_PYTHON=ON

# Custom install prefix for automatically built tt-mlir
cmake -GNinja -Bbuild . -DTTMLIR_INSTALL_PREFIX=/tmp/my-ttmlir-install

# Enable code coverage
cmake -GNinja -Bbuild . -DCODE_COVERAGE=ON
```

--------------------------------

### Listing GitHub Issues with gh CLI

Source: https://github.com/tenstorrent/tt-lang/blob/main/CLAUDE.md

This snippet shows how to use the GitHub CLI (`gh`) to list and view issues within the tt-lang repository. It requires navigating to the tt-lang directory first. This is useful for understanding known bugs and limitations.

```bash
cd <tt-lang-directory>
gh issue list --limit 20
gh issue view NUMBER
```

--------------------------------

### Compare MLIR Dumps Across Branches (Bash)

Source: https://github.com/tenstorrent/tt-lang/blob/main/CLAUDE.md

This workflow demonstrates how to compare MLIR outputs generated from different Git branches. It involves building the project in each branch, exporting the MLIR outputs to specific files, and then using a general-purpose agent or command-line tools to analyze the differences.

```bash
# Save MLIR from branch A
git checkout branch-a
cmake --build build
export TTLANG_INITIAL_MLIR=/tmp/branch_a_initial.mlir
export TTLANG_FINAL_MLIR=/tmp/branch_a_final.mlir
python examples/test.py

# Save MLIR from branch B
git checkout branch-b
cmake --build build
export TTLANG_INITIAL_MLIR=/tmp/branch_b_initial.mlir
export TTLANG_FINAL_MLIR=/tmp/branch_b_final.mlir
python examples/test.py

# Compare MLIR outputs (manual diff example)
diff /tmp/branch_a_final.mlir /tmp/branch_b_final.mlir
```
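
If the comparison needs to run inside a script rather than a shell, Python's standard `difflib` produces the same kind of unified diff. This is a sketch under stated assumptions: `diff_mlir` is a hypothetical helper, and the two inline strings are placeholders standing in for the contents of the saved dump files.

```python
import difflib


def diff_mlir(text_a, text_b, name_a="branch_a_final.mlir", name_b="branch_b_final.mlir"):
    """Return unified-diff lines between two MLIR dumps given as strings."""
    lines_a = text_a.splitlines(keepends=True)
    lines_b = text_b.splitlines(keepends=True)
    return list(difflib.unified_diff(lines_a, lines_b, fromfile=name_a, tofile=name_b, n=1))


# Placeholder dumps; real input would come from Path(...).read_text().
dump_a = "func.func @f() {\n  d2m.tile_add\n  return\n}\n"
dump_b = "func.func @f() {\n  placeholder.other_op\n  return\n}\n"

diff_lines = diff_mlir(dump_a, dump_b)
for line in diff_lines:
    print(line, end="")
```

The `n=1` argument limits context to one line around each change, which keeps the output short for large dumps.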

--------------------------------

### Python: Copy Tiles using TensorAccessor

Source: https://context7.com/tenstorrent/tt-lang/llms.txt

Demonstrates using TensorAccessor to create a view of a DRAM tensor for indexed tile-level access during DMA operations. This example shows reading from and writing to specific tiles in memory.

```python
from ttlang.d2m_api import *
import torch

@pykernel_gen(grid=(1, 1), block_factors=[(4, 1), (1, 1)])
def copy_tiles(input_tensor, output_tensor, block_factors=None, grid=None):
    """
    Copy tiles from input to output using TensorAccessor.
    Demonstrates indexed access to DRAM tensors for DMA operations.
    """
    input_acc = TensorAccessor(input_tensor)   # Create DRAM view
    output_acc = TensorAccessor(output_tensor)

    @datamovement()
    def dm_copy(in_cb: CircularBuffer, out_cb: CircularBuffer):
        # Access tiles by index: accessor[tile_row, tile_col]
        for i in range(4):  # Copy 4 tiles
            # Read from DRAM
            in_blk = in_cb.reserve()
            tx = dma(input_acc[i, 0], in_blk)  # Index tile at row i, col 0
            tx.wait()
            in_cb.push()

            # Write to DRAM
            out_blk = out_cb.wait()
            tx = dma(out_blk, output_acc[i, 0])
            tx.wait()
            out_cb.pop()

    @compute()
    def passthrough(in_cb: CircularBuffer, out_cb: CircularBuffer):
        for i in range(4):
            blk = in_cb.wait()
            out_blk = out_cb.reserve()
            out_blk.store(blk)
            in_cb.pop()
            out_cb.push()

    return Program(passthrough, dm_copy)(input_tensor, output_tensor)

# Execute
input_t = torch.randn(128, 32)   # 4x1 tiles (4 rows of 32x32)
output_t = torch.zeros(128, 32)
copy_tiles(input_t, output_t)
```
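
The tile counts quoted in the comments follow from dividing each tensor dimension by the 32x32 tile size. A small sketch of that arithmetic is shown here. The helper `tile_grid_shape` is hypothetical, and rounding up for shapes that are not multiples of 32 is an assumption made for illustration, not documented behavior.

```python
import math

TILE_SHAPE = (32, 32)  # tile size used throughout these examples


def tile_grid_shape(tensor_shape, tile_shape=TILE_SHAPE):
    """Return how many tiles cover each dimension (rounding up, by assumption)."""
    return tuple(math.ceil(dim / tile) for dim, tile in zip(tensor_shape, tile_shape))


print(tile_grid_shape((128, 32)))   # (4, 1)
print(tile_grid_shape((128, 128)))  # (4, 4)
```

The second result matches the `tensor<4x4x!ttcore.tile<32x32>>` type that a 128x128 tensor takes in the D2M dialect.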

--------------------------------

### Semaphore Wait for Producer

Source: https://github.com/tenstorrent/tt-lang/blob/main/docs/HITCHHIKERS_GUIDE.md

A consumer waits until the semaphore reaches the given value (here `1`), which a producer signals. The `reset=0` argument appears to give the value the semaphore is reset to once the wait completes, so the same semaphore can be reused for the next round of signaling.

```python
# Consumer waits for producer
sem.wait(1, reset=0)
```

--------------------------------

### Integrating tt-mlir CMake Modules

Source: https://github.com/tenstorrent/tt-lang/blob/main/docs/BUILD_SYSTEM.md

Illustrates how to make tt-mlir's CMake modules available to your project by adding the appropriate directory to `CMAKE_MODULE_PATH`. This is essential for tt-lang to find MLIR/LLVM and other build utilities.

```cmake
list(APPEND CMAKE_MODULE_PATH "$ENV{TT_MLIR_HOME}/cmake/modules")
```

--------------------------------

### Bufferization Layout Conversion

Source: https://github.com/tenstorrent/tt-lang/blob/main/docs/HITCHHIKERS_GUIDE.md

Illustrates the conversion process of layout attributes during bufferization. A tensor with a MetalLayoutAttr is converted to a memref, with the MetalLayoutAttr being transformed into either a ViewLayoutAttr or a ShardLayoutAttr depending on the presence of an index_map.

```text
tensor<..., #ttcore.metal_layout<...>>
    ↓ one-shot-bufferize
memref<..., #ttcore.view<...>, #ttcore.memory_space<l1>>
  OR
memref<..., #ttcore.shard<...>, #ttcore.memory_space<l1>>
```

--------------------------------

### Run all pre-commit checks

Source: https://github.com/tenstorrent/tt-lang/blob/main/CONTRIBUTING.md

Manually executes all configured pre-commit hooks on every file in the repository. This is useful for ensuring all code adheres to the project's standards.

```bash
pre-commit run --all-files
```

--------------------------------

### Semaphore Increment Counter

Source: https://github.com/tenstorrent/tt-lang/blob/main/docs/HITCHHIKERS_GUIDE.md

Increments a semaphore counter on a specific remote core. Unlike `set`, `inc` only operates on a single remote core specified by the `core` argument and does not support multicast.

```python
# Increment counter
sem.inc(1, core=(cy, 0))
```

--------------------------------

### MLIR Tile Operation for Addition

Source: https://github.com/tenstorrent/tt-lang/blob/main/docs/HITCHHIKERS_GUIDE.md

Represents a single tile computation for addition. This is a hardware intrinsic mapped directly to the hardware primitive. After lowering, this operation appears within nested loops.

```mlir
%result = d2m.tile_add(%a, %b)
```

```mlir
scf.for %i ... {
  scf.for %j ... {
    d2m.tile_add(...)  // tile operation
  }
}
```
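
The loop structure can be mimicked in plain Python to show what each level is responsible for. This is only an analogy: tiles are modelled as small nested lists instead of 32x32 hardware tiles, and `tile_add` and `block_add` are illustrative names rather than generated code.

```python
def tile_add(a, b):
    """Element-wise add of two equally sized tiles (nested lists)."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def block_add(lhs_block, rhs_block):
    """Add two blocks tile by tile; a block is a 2D grid of tiles."""
    rows, cols = len(lhs_block), len(lhs_block[0])
    out = [[None] * cols for _ in range(rows)]
    for i in range(rows):      # plays the role of the outer scf.for
        for j in range(cols):  # plays the role of the inner scf.for
            out[i][j] = tile_add(lhs_block[i][j], rhs_block[i][j])
    return out


# A 1x2 block of 2x2 "tiles" (real tiles are 32x32).
lhs = [[[[1, 2], [3, 4]], [[5, 6], [7, 8]]]]
rhs = [[[[10, 20], [30, 40]], [[50, 60], [70, 80]]]]
result = block_add(lhs, rhs)
print(result)
```

The loops only iterate over tile coordinates; all arithmetic happens inside the per-tile operation, which is the part that maps to the hardware primitive.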

--------------------------------

### Extract Intermediate IR Checkpoints (Bash)

Source: https://github.com/tenstorrent/tt-lang/blob/main/CLAUDE.md

This workflow demonstrates how to extract MLIR dumps at intermediate stages of the pipeline by enabling verbose pass output and then using `sed` to capture the IR between specific pass markers. This allows for granular comparison of the IR's evolution.

```bash
# Run with verbose passes enabled
export TTLANG_VERBOSE_PASSES=1
python examples/test.py > /tmp/full_pipeline.log 2>&1

# Extract IR after bufferization pass (up to the next "IR Dump Before" marker)
sed -n '/After OneShotBufferizePass/,/^\/\/ -----\/\/ IR Dump Before/p' /tmp/full_pipeline.log > /tmp/after_bufferize.mlir

# Extract IR after allocate pass (up to the next "IR Dump Before" marker)
sed -n '/After D2MAllocate/,/^\/\/ -----\/\/ IR Dump Before/p' /tmp/full_pipeline.log > /tmp/after_allocate.mlir
```
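
When more than a couple of checkpoints are needed, splitting the whole log in one pass may be more convenient than one `sed` call per pass. The sketch below assumes the log uses MLIR's usual `// -----// IR Dump Before|After <Pass> ...` marker lines. The function `split_ir_dumps` and the sample log bodies are placeholders for illustration, not real pipeline output.

```python
import re

# Marker lines of the form: // -----// IR Dump After SomePass (some-pass) //----- //
MARKER = re.compile(r"^// -----// IR Dump (Before|After) (\S+)", re.MULTILINE)


def split_ir_dumps(log_text):
    """Map 'After <Pass>' / 'Before <Pass>' to the IR text following each marker.

    If a pass runs more than once, the last dump for that key wins.
    """
    dumps = {}
    matches = list(MARKER.finditer(log_text))
    for i, match in enumerate(matches):
        newline = log_text.find("\n", match.end())
        body_start = len(log_text) if newline == -1 else newline + 1
        body_end = matches[i + 1].start() if i + 1 < len(matches) else len(log_text)
        key = f"{match.group(1)} {match.group(2)}"
        dumps[key] = log_text[body_start:body_end].strip()
    return dumps


# Placeholder log; the module bodies are not real IR.
log = (
    "// -----// IR Dump After OneShotBufferizePass (one-shot-bufferize) //----- //\n"
    "module { memref }\n"
    "// -----// IR Dump Before D2MAllocate (d2m-allocate) //----- //\n"
    "module { memref }\n"
    "// -----// IR Dump After D2MAllocate (d2m-allocate) //----- //\n"
    "module { allocated }\n"
)

dumps = split_ir_dumps(log)
print(sorted(dumps))
```

Each value can then be written to its own file and compared with `diff`, in the same way as the `sed` outputs above.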

--------------------------------

### MLIR Function Before TTCoreRegisterDevicePass

Source: https://github.com/tenstorrent/tt-lang/blob/main/CLAUDE.md

This snippet shows the initial MLIR representation of a function before the TTCoreRegisterDevicePass. It operates on f32 tensors and uses a high-level ttir.sign operation. Input and output types are consistently tensor<128x96xf32>.

```mlir
func.func @test_sign_f32(%arg0: tensor<128x96xf32>, %arg1: tensor<128x96xf32>) -> tensor<128x96xf32> {
  %0 = "ttir.sign"(%arg0, %arg1) : (tensor<128x96xf32>, tensor<128x96xf32>) -> tensor<128x96xf32>
  return %0 : tensor<128x96xf32>
}
```

--------------------------------

### Detecting tt-mlir Repository Location

Source: https://github.com/tenstorrent/tt-lang/blob/main/CLAUDE.md

These commands help determine the location of the tt-mlir repository relative to the tt-lang directory. This information is crucial for setting up the environment correctly, especially if tt-mlir is not in the default sibling directory.

```bash
# From tt-lang directory
pwd                          # Get tt-lang path
ls -d ../tt-mlir 2>/dev/null # Check if tt-mlir is sibling directory
echo $TT_MLIR_HOME          # Check if user has set this
```

--------------------------------

### Compare MLIR Dumps for Tests (Bash)

Source: https://github.com/tenstorrent/tt-lang/blob/main/CLAUDE.md

This snippet outlines the process for comparing MLIR outputs between a passing and a failing test case. By saving the final MLIR for each test to separate files, one can then use diff tools to identify the specific changes that lead to the failure.

```bash
# Run passing test
export TTLANG_FINAL_MLIR=/tmp/passing_test.mlir
python test/passing_example.py

# Run failing test
export TTLANG_FINAL_MLIR=/tmp/failing_test.mlir
python test/failing_example.py

# Compare to find what's different
diff /tmp/passing_test.mlir /tmp/failing_test.mlir
```

--------------------------------

### Run tt-lang Tests with CMake

Source: https://github.com/tenstorrent/tt-lang/blob/main/README.md

Executes tt-lang tests using CMake build targets. This includes options to run all tests (MLIR and Python), MLIR dialect tests specifically, or Python runtime tests. Requires activating the build environment first.

```bash
source build/env/activate

# All tests (MLIR + Python)
cmake --build build --target check-ttlang

# MLIR dialect tests only
cmake --build build --target check-ttlang-mlir

# Python runtime tests only
cmake --build build --target check-ttlang-python
```