### Training Script Integration

Source: https://context7.com/pointcept/pointtransformerv3/llms.txt

Example commands for training PointTransformerV3 on various benchmarks using the Pointcept framework.

````APIDOC
## Training Script Integration

### Description
Example integration with Pointcept training framework for semantic segmentation on standard benchmarks.

### Method
SHELL

### Endpoint
/pointcept/pointtransformerv3/train

### Usage
```bash
# Install Pointcept framework
git clone https://github.com/Pointcept/Pointcept.git
cd Pointcept

# Train PTv3 on ScanNet from scratch (4 GPUs)
sh scripts/train.sh -g 4 -d scannet -c semseg-pt-v3m1-0-base -n semseg-pt-v3m1-0-base

# Train PTv3 on ScanNet200
sh scripts/train.sh -g 4 -d scannet200 -c semseg-pt-v3m1-0-base -n semseg-pt-v3m1-0-base

# Train PTv3 on S3DIS with RPE (FlashAttention disabled)
sh scripts/train.sh -g 4 -d s3dis -c semseg-pt-v3m1-0-rpe -n semseg-pt-v3m1-0-rpe

# Train PTv3 on nuScenes for outdoor segmentation
sh scripts/train.sh -g 4 -d nuscenes -c semseg-pt-v3m1-0-base -n semseg-pt-v3m1-0-base

# Train PTv3 on Waymo
sh scripts/train.sh -g 4 -d waymo -c semseg-pt-v3m1-0-base -n semseg-pt-v3m1-0-base

# PPT joint training with multiple datasets (8 GPUs)
sh scripts/train.sh -g 8 -d scannet -c semseg-pt-v3m1-1-ppt-extreme -n semseg-pt-v3m1-1-ppt-extreme
```
````

--------------------------------

### Initialize and execute PointTransformerV3 model

Source: https://context7.com/pointcept/pointtransformerv3/llms.txt

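Demonstrates how to instantiate the PTv3 model with a specific encoder-decoder configuration and run a forward pass on batched 3D point cloud data. The model takes a dictionary with four entries: `coord` (point coordinates), `feat` (per-point features), `grid_coord` (integer grid coordinates), and `offset` (batch offsets, given in the example as the cumulative point count at the end of each cloud).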

```python
import torch
from model import PointTransformerV3, Point

model = PointTransformerV3(
    in_channels=6,
    order=("z", "z-trans", "hilbert", "hilbert-trans"),
    stride=(2, 2, 2, 2),
    enc_depths=(2, 2, 2, 6, 2),
    enc_channels=(32, 64, 128, 256, 512),
    enc_num_head=(2, 4, 8, 16, 32),
    enc_patch_size=(1024, 1024, 1024, 1024, 1024),
    dec_depths=(2, 2, 2, 2),
    dec_channels=(64, 64, 128, 256),
    dec_num_head=(4, 4, 8, 16),
    dec_patch_size=(1024, 1024, 1024, 1024),
    mlp_ratio=4,
    enable_flash=True,
    enable_rpe=False,
    cls_mode=False,
).cuda()

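# Two point clouds (50,000 and 45,000 points) concatenated along dim 0
data_dict = {
    "coord": torch.randn(95000, 3).cuda(),                    # xyz coordinates
    "feat": torch.randn(95000, 6).cuda(),                     # per-point features, width = in_channels
    "grid_coord": torch.randint(0, 1000, (95000, 3)).cuda(),  # integer grid coordinates
    "offset": torch.tensor([50000, 95000]).cuda(),            # cumulative end index of each cloud
}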

with torch.no_grad():
    output = model(data_dict)
```
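
The `offset` entry in the example reads as cumulative point counts: `[50000, 95000]` describes two clouds of 50,000 and 45,000 points. The plain-Python sketch below shows how such offsets relate to per-point batch indices. `offset_to_batch` and `batch_to_offset` are hypothetical helpers written for illustration; they are not taken from the PTv3 code, which works on tensors.

```python
def offset_to_batch(offset):
    """Expand cumulative end offsets into one batch index per point."""
    batch = []
    start = 0
    for batch_idx, end in enumerate(offset):
        if end < start:
            raise ValueError("offsets must be non-decreasing")
        batch.extend([batch_idx] * (end - start))
        start = end
    return batch


def batch_to_offset(batch):
    """Inverse of offset_to_batch for a sorted list of batch indices."""
    if not batch:
        return []
    counts = [0] * (batch[-1] + 1)
    for b in batch:
        counts[b] += 1
    offset, total = [], 0
    for c in counts:
        total += c
        offset.append(total)
    return offset


batch = offset_to_batch([3, 5])
print(batch)                   # [0, 0, 0, 1, 1]
print(batch_to_offset(batch))  # [3, 5]
```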

--------------------------------

### Train PointTransformerV3 Models via Shell Scripts

Source: https://github.com/pointcept/pointtransformerv3/blob/main/README.md

Launches training jobs for the supported datasets with `scripts/train.sh`. Each command takes the number of GPUs (`-g`), the dataset name (`-d`), the configuration name (`-c`), and the experiment name (`-n`).

```bash
# ScanNet, trained from scratch
sh scripts/train.sh -g 4 -d scannet -c semseg-pt-v3m1-0-base -n semseg-pt-v3m1-0-base
# PPT joint training (ScanNet + Structured3D), evaluated on ScanNet
sh scripts/train.sh -g 8 -d scannet -c semseg-pt-v3m1-1-ppt-extreme -n semseg-pt-v3m1-1-ppt-extreme

# ScanNet200, trained from scratch
sh scripts/train.sh -g 4 -d scannet200 -c semseg-pt-v3m1-0-base -n semseg-pt-v3m1-0-base

# S3DIS, trained from scratch
sh scripts/train.sh -g 4 -d s3dis -c semseg-pt-v3m1-0-rpe -n semseg-pt-v3m1-0-rpe
# PPT joint training (ScanNet + S3DIS + Structured3D), evaluated on S3DIS
sh scripts/train.sh -g 8 -d s3dis -c semseg-pt-v3m1-1-ppt-extreme -n semseg-pt-v3m1-1-ppt-extreme

# nuScenes, trained from scratch
sh scripts/train.sh -g 4 -d nuscenes -c semseg-pt-v3m1-0-base -n semseg-pt-v3m1-0-base
# Waymo, trained from scratch
sh scripts/train.sh -g 4 -d waymo -c semseg-pt-v3m1-0-base -n semseg-pt-v3m1-0-base
```
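
When launching several experiments from Python, it can help to build the command from its four arguments. The sketch below only assembles the argument list using the `-g`, `-d`, `-c`, and `-n` flags shown above. `build_train_command` is a hypothetical helper, and defaulting the experiment name to the config name simply mirrors the examples.

```python
import shlex


def build_train_command(num_gpus, dataset, config, name=None):
    """Return the argv list for scripts/train.sh with the flags used above."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    # The examples reuse the config name as the experiment name.
    name = name or config
    return [
        "sh", "scripts/train.sh",
        "-g", str(num_gpus),
        "-d", dataset,
        "-c", config,
        "-n", name,
    ]


cmd = build_train_command(4, "scannet", "semseg-pt-v3m1-0-base")
print(shlex.join(cmd))
# sh scripts/train.sh -g 4 -d scannet -c semseg-pt-v3m1-0-base -n semseg-pt-v3m1-0-base
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the root of the Pointcept checkout.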

--------------------------------

### Configure PTv3 Without FlashAttention (Python)

Source: https://context7.com/pointcept/pointtransformerv3/llms.txt

Demonstrates how to configure and initialize the Point Transformer V3 model on systems that do not support FlashAttention (e.g., CUDA versions older than 11.6). This involves adjusting patch sizes and explicitly disabling FlashAttention while enabling alternative features like relative position encoding.

```python
import torch
from model import PointTransformerV3

# Configure PTv3 without FlashAttention
model = PointTransformerV3(
    in_channels=6,
    # Reduce patch sizes when not using FlashAttention
    enc_patch_size=(128, 128, 128, 128, 128),
    dec_patch_size=(128, 128, 128, 128),
    # Disable FlashAttention
    enable_flash=False,
    # Enable features that require disabling flash
    enable_rpe=True,           # Relative position encoding
    upcast_attention=True,     # FP32 attention computation
    upcast_softmax=True,       # FP32 softmax
).cuda()

# Use same input format
data_dict = {
    "coord": torch.randn(30000, 3).cuda(),
    "feat": torch.randn(30000, 6).cuda(),
    "grid_coord": torch.randint(0, 512, (30000, 3)).cuda(),
    "offset": torch.tensor([15000, 30000]).cuda(),
}

output = model(data_dict)
print(f"Output without FlashAttention: {output.feat.shape}")
```
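
The source does not state why the patch sizes shrink from 1024 to 128 when FlashAttention is disabled. One plausible reason is memory: attention that materializes its score matrix stores `patch_size * patch_size` scores per patch and head, so the total grows roughly linearly with patch size. The estimate below is a rough sketch under that assumption; `attention_score_entries` is a hypothetical helper and counts padded patches as full.

```python
import math


def attention_score_entries(num_points, patch_size, num_heads):
    """Count score-matrix entries if each patch stores a dense K x K matrix per head."""
    num_patches = math.ceil(num_points / patch_size)
    return num_patches * num_heads * patch_size * patch_size


large = attention_score_entries(30000, patch_size=1024, num_heads=8)
small = attention_score_entries(30000, patch_size=128, num_heads=8)
print(f"patch 1024: {large:,} entries")
print(f"patch 128:  {small:,} entries")
print(f"ratio: {large / small:.2f}")
```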

--------------------------------

### SerializedAttention Module Initialization (Python)

Source: https://context7.com/pointcept/pointtransformerv3/llms.txt

Initializes the SerializedAttention layer, a core attention mechanism for processing point clouds in serialized order. Supports FlashAttention for efficiency and handles variable-length sequences with padding and masking.

```python
import torch
from model import SerializedAttention, Point

# Initialize serialized attention layer
attn = SerializedAttention(
    channels=256,
    num_heads=8,
    patch_size=1024,        # Points per attention window
    qkv_bias=True,
    qk_scale=None,
    attn_drop=0.0,
    proj_drop=0.0,
    order_index=0,          # Which serialization order to use
    enable_rpe=False,       # Relative position encoding
    enable_flash=True,      # Use FlashAttention
    upcast_attention=False,
    upcast_softmax=False,
).cuda()

# Prepare serialized point cloud
point = Point({
    "feat": torch.randn(20000, 256).cuda(),
    "grid_coord": torch.randint(0, 1024, (20000, 3)).cuda(),
    "offset": torch.tensor([10000, 20000]).cuda(),
})
point.serialization(order=("z", "hilbert"), depth=16)
point.sparsify()
```
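
To illustrate what "points per attention window" means, the sketch below sorts points by a serialization code and cuts the sorted order into fixed-size patches. `group_into_patches` is a hypothetical helper. Letting the last patch reuse the tail of the sorted order is only one simple way to keep every patch full; the padding and masking inside `SerializedAttention` may differ.

```python
def group_into_patches(codes, patch_size):
    """Split point indices, sorted by code, into patches of equal size."""
    if patch_size < 1:
        raise ValueError("patch_size must be positive")
    order = sorted(range(len(codes)), key=lambda i: codes[i])
    if len(order) <= patch_size:
        return [order]  # a single, possibly short, patch
    patches = [order[i:i + patch_size] for i in range(0, len(order), patch_size)]
    if len(patches[-1]) < patch_size:
        # Fill the short last patch with the final patch_size points of the order.
        patches[-1] = order[-patch_size:]
    return patches


codes = [9, 2, 7, 4, 1, 8, 3]
print(group_into_patches(codes, patch_size=3))
# [[4, 1, 6], [3, 2, 5], [2, 5, 0]]
```

Attention is then computed independently inside each patch, so points that are close along the serialization curve attend to each other.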

--------------------------------

### Custom Framework Integration

Source: https://context7.com/pointcept/pointtransformerv3/llms.txt

Instructions for using PointTransformerV3 in custom projects outside the Pointcept framework.

````APIDOC
## Custom Framework Integration

### Description
Standalone usage of PTv3 in custom projects without the Pointcept framework.

### Method
PYTHON

### Endpoint
/pointcept/pointtransformerv3/custom_integration

### Usage
```python
import torch
import torch.nn as nn
from model import PointTransformerV3

# Initialize the model
model = PointTransformerV3(...)
model.cuda()

# Prepare your point cloud data
# ...

# Forward pass
# output = model(point_cloud_data)
```
````

--------------------------------

### Create and Use PTv3 Segmentation Head (Python)

Source: https://context7.com/pointcept/pointtransformerv3/llms.txt

Defines a custom segmentation head on top of Point Transformer V3, initializes the model, prepares sample data, runs a forward pass, and performs a single training step with an optimizer and a cross-entropy loss. This snippet demonstrates end-to-end usage for segmentation.

```python
import torch
import torch.nn as nn
from model import PointTransformerV3

# Create custom segmentation head
class PTv3Segmentation(nn.Module):
    def __init__(self, num_classes, in_channels=6):
        super().__init__()
        self.backbone = PointTransformerV3(
            in_channels=in_channels,
            enc_channels=(32, 64, 128, 256, 512),
            dec_channels=(64, 64, 128, 256),
            enable_flash=True,
        )
        # Segmentation head on decoder output (64 channels)
        self.head = nn.Sequential(
            nn.Linear(64, 64),
            nn.BatchNorm1d(64),
            nn.ReLU(inplace=True),
            nn.Linear(64, num_classes),
        )

    def forward(self, data_dict):
        point = self.backbone(data_dict)
        logits = self.head(point.feat)
        return logits

# Initialize model
model = PTv3Segmentation(num_classes=20, in_channels=6).cuda()

# Prepare data (N points with xyz + rgb features)
data = {
    "coord": torch.randn(50000, 3).cuda(),
    "feat": torch.randn(50000, 6).cuda(),
    "grid_coord": torch.randint(0, 1000, (50000, 3)).cuda(),
    "offset": torch.tensor([25000, 50000]).cuda(),  # 2 point clouds of 25,000 points each
}

# Forward pass
logits = model(data)
print(f"Segmentation logits: {logits.shape}")  # (50000, 20)

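# Single training step (a real loop repeats this for every batch)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

labels = torch.randint(0, 20, (50000,)).cuda()
optimizer.zero_grad()  # clear gradients left over from a previous step
loss = criterion(logits, labels)
loss.backward()
optimizer.step()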
```

--------------------------------

### Apply Serialized Attention

Source: https://context7.com/pointcept/pointtransformerv3/llms.txt

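Demonstrates the basic application of the serialized attention mechanism on point cloud features. This snippet continues the SerializedAttention initialization example: `attn` and `point` are the objects created there.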

```python
output = attn(point)
print(f"Output features: {output.feat.shape}")
```

--------------------------------

### Manage point cloud data with Point class

Source: https://context7.com/pointcept/pointtransformerv3/llms.txt

Explains the usage of the Point class for handling batched point cloud data, including automatic batch index generation, serialization using space-filling curves, and preparation for sparse convolution operations.

```python
import torch
from model import Point

point = Point({
    "coord": torch.randn(18000, 3).cuda(),
    "feat": torch.randn(18000, 32).cuda(),
    "grid_coord": torch.randint(0, 512, (18000, 3)).cuda(),
    "offset": torch.tensor([10000, 18000]).cuda(),
})

point.serialization(
    order=("z", "hilbert"),
    depth=16,
    shuffle_orders=True
)

point.sparsify(pad=96)
```
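
Serialization boils down to sorting points by their curve code while remembering how to undo the sort. A minimal plain-Python sketch of that bookkeeping follows. `serialization_order` is a hypothetical helper, and the attribute names the `Point` class uses for these permutations are not shown here.

```python
def serialization_order(codes):
    """Return (order, inverse): order sorts the codes, inverse undoes the sort."""
    order = sorted(range(len(codes)), key=lambda i: codes[i])
    inverse = [0] * len(codes)
    for rank, idx in enumerate(order):
        inverse[idx] = rank
    return order, inverse


codes = [40, 10, 30, 20]
feats = ["a", "b", "c", "d"]

order, inverse = serialization_order(codes)
sorted_feats = [feats[i] for i in order]         # features in curve order
restored = [sorted_feats[r] for r in inverse]    # back to the original order

print(order)     # [1, 3, 2, 0]
print(inverse)   # [3, 0, 2, 1]
print(restored)  # ['a', 'b', 'c', 'd']
```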

--------------------------------

### Run S3DIS 6-Fold Cross Validation

Source: https://github.com/pointcept/pointtransformerv3/blob/main/README.md

Calculates the 6-fold cross-validation performance for S3DIS after training on all area splits. Requires setting the PYTHONPATH and providing the root directory containing the result files.

```bash
export PYTHONPATH=./
python tools/test_s3dis_6fold.py --record_root ${RECORD_FOLDER}
```

--------------------------------

### Implement Serialized Pooling and Unpooling

Source: https://context7.com/pointcept/pointtransformerv3/llms.txt

Configures hierarchical pooling and unpooling layers for point cloud downsampling and upsampling. It maintains serialization structures to support skip connections between encoder and decoder stages.

```python
import torch
from model import SerializedPooling, SerializedUnpooling, Point
import torch.nn as nn

pool = SerializedPooling(
    in_channels=64,
    out_channels=128,
    stride=2,
    norm_layer=nn.BatchNorm1d,
    act_layer=nn.GELU,
    reduce="max",
    shuffle_orders=True,
    traceable=True,
).cuda()

unpool = SerializedUnpooling(
    in_channels=128,
    skip_channels=64,
    out_channels=64,
    norm_layer=nn.BatchNorm1d,
    act_layer=nn.GELU,
).cuda()

point = Point({
    "feat": torch.randn(10000, 64).cuda(),
    "coord": torch.randn(10000, 3).cuda(),
    "grid_coord": torch.randint(0, 512, (10000, 3)).cuda(),
    "offset": torch.tensor([5000, 10000]).cuda(),
})
point.serialization(order="z", depth=16)
point.sparsify()

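# Keep a handle on the encoder-stage point for inspection.
encoder_point = point
# With traceable=True the pooled point stores its parent, which
# SerializedUnpooling uses for the skip connection.
pooled = pool(point)
upsampled = unpool(pooled)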
```
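
The grouping idea behind strided pooling can be sketched without the library: points whose grid cells coincide after dividing coordinates by the stride form one cluster, their features are reduced with `max`, and a per-point cluster index allows unpooling later. This is a simplified illustration with hypothetical helper names; the actual layers also apply learned projections, normalization, and activation, and may form the clusters differently.

```python
def grid_max_pool(grid_coord, feat, stride=2):
    """Max-pool features of points whose grid cells merge after downsampling."""
    if stride < 1 or stride & (stride - 1):
        raise ValueError("stride must be a power of two")
    shift = stride.bit_length() - 1
    clusters = {}      # coarse cell -> cluster id
    inverse = []       # per-point cluster id, needed to unpool later
    pooled_feat = []
    for coord, f in zip(grid_coord, feat):
        cell = tuple(c >> shift for c in coord)
        cid = clusters.setdefault(cell, len(clusters))
        if cid == len(pooled_feat):
            pooled_feat.append(list(f))  # first point in this cluster
        else:
            pooled_feat[cid] = [max(a, b) for a, b in zip(pooled_feat[cid], f)]
        inverse.append(cid)
    return list(clusters), pooled_feat, inverse


def grid_unpool(pooled_feat, inverse):
    """Copy each cluster feature back to every point of that cluster."""
    return [pooled_feat[cid] for cid in inverse]


grid_coord = [(0, 0, 0), (1, 1, 0), (2, 0, 0), (3, 1, 1)]
feat = [[1.0, 5.0], [4.0, 2.0], [0.0, 0.0], [3.0, -1.0]]

cells, pooled, inverse = grid_max_pool(grid_coord, feat, stride=2)
print(cells)    # [(0, 0, 0), (1, 0, 0)]
print(pooled)   # [[4.0, 5.0], [3.0, 0.0]]
print(inverse)  # [0, 0, 1, 1]
```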

--------------------------------

### Z-Order Encoding (Morton Code) with Lookup Tables (Python)

Source: https://context7.com/pointcept/pointtransformerv3/llms.txt

Implements Z-order encoding by interleaving bits of x, y, z coordinates to create a single integer code. Uses lookup tables for efficient encoding and decoding. Supports optional batch indexing.

```python
import torch
from serialization.z_order import xyz2key, key2xyz

# Encode individual x, y, z coordinates to Z-order key
x = torch.tensor([0, 1, 2, 100, 255]).cuda()
y = torch.tensor([0, 1, 3, 50, 128]).cuda()
z = torch.tensor([0, 1, 1, 75, 64]).cuda()

# Encode without batch index
key = xyz2key(x, y, z, b=None, depth=16)
print(f"Z-order keys: {key}")

# Encode with batch index (batch < 32768)
batch_idx = torch.tensor([0, 0, 1, 1, 2]).cuda()
key_with_batch = xyz2key(x, y, z, b=batch_idx, depth=16)
print(f"Keys with batch: {key_with_batch}")

# Decode key back to coordinates
dec_x, dec_y, dec_z, dec_b = key2xyz(key_with_batch, depth=16)
print(f"Decoded x: {dec_x}")
print(f"Decoded batch: {dec_b}")

# Verify round-trip
assert torch.allclose(x, dec_x)
assert torch.allclose(y, dec_y)
assert torch.allclose(z, dec_z)
```
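
For intuition, bit interleaving can be written as a plain loop. The sketch below is a naive reference version, not the lookup-table implementation described above, and the choice of which axis occupies the lowest bit of each triple is an assumption that may not match the library's key layout.

```python
def morton_encode(x, y, z, depth=16):
    """Interleave the low `depth` bits of x, y, z into one integer key."""
    key = 0
    for bit in range(depth):
        key |= ((x >> bit) & 1) << (3 * bit)
        key |= ((y >> bit) & 1) << (3 * bit + 1)
        key |= ((z >> bit) & 1) << (3 * bit + 2)
    return key


def morton_decode(key, depth=16):
    """Recover (x, y, z) from a key produced by morton_encode."""
    x = y = z = 0
    for bit in range(depth):
        x |= ((key >> (3 * bit)) & 1) << bit
        y |= ((key >> (3 * bit + 1)) & 1) << bit
        z |= ((key >> (3 * bit + 2)) & 1) << bit
    return x, y, z


print(morton_encode(2, 3, 1))  # 30
print(morton_decode(30))       # (2, 3, 1)
```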

--------------------------------

### Hilbert Curve Encoding and Decoding (Python)

Source: https://context7.com/pointcept/pointtransformerv3/llms.txt

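Provides functions for encoding 3D locations into Hilbert curve indices and decoding them back. Consecutive Hilbert indices always map to adjacent grid cells, so the curve preserves locality better than Z-order, which makes long jumps between some consecutive codes. Supports variable bit depths for coordinates.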

```python
import torch
from serialization.hilbert import encode as hilbert_encode, decode as hilbert_decode

# Encode 3D locations to Hilbert integers
# Input shape: (N, 3) where values are in range [0, 2^num_bits - 1]
locations = torch.randint(0, 256, (1000, 3)).cuda()  # 8-bit coordinates

# Encode to Hilbert curve index
hilbert_codes = hilbert_encode(locations, num_dims=3, num_bits=8)
print(f"Hilbert codes shape: {hilbert_codes.shape}")  # (1000,)
print(f"Hilbert codes dtype: {hilbert_codes.dtype}")  # int64

# Decode back to 3D locations
decoded_locations = hilbert_decode(hilbert_codes, num_dims=3, num_bits=8)
print(f"Decoded shape: {decoded_locations.shape}")  # (1000, 3)

# Verify round-trip encoding
assert torch.allclose(locations.long(), decoded_locations)

# Higher precision encoding for larger point clouds
high_res_locs = torch.randint(0, 65536, (5000, 3)).cuda()  # 16-bit coords
high_res_codes = hilbert_encode(high_res_locs, num_dims=3, num_bits=16)
print(f"16-bit Hilbert codes: {high_res_codes.shape}")
```

--------------------------------

### Configure Transformer Block

Source: https://context7.com/pointcept/pointtransformerv3/llms.txt

Initializes a complete transformer block including sparse convolution positional encoding and attention mechanisms. This block is designed for processing sparse point cloud data with residual connections.

```python
import torch
from model import Block, Point
import torch.nn as nn

block = Block(
    channels=128,
    num_heads=8,
    patch_size=1024,
    mlp_ratio=4.0,
    qkv_bias=True,
    attn_drop=0.0,
    proj_drop=0.0,
    drop_path=0.1,
    norm_layer=nn.LayerNorm,
    act_layer=nn.GELU,
    pre_norm=True,
    cpe_indice_key="stage0",
    enable_flash=True,
).cuda()

point = Point({
    "feat": torch.randn(15000, 128).cuda(),
    "grid_coord": torch.randint(0, 256, (15000, 3)).cuda(),
    "offset": torch.tensor([7500, 15000]).cuda(),
})
point.serialization(order=("z", "hilbert"), depth=16)
point.sparsify()

output = block(point)
```
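
The `drop_path` argument refers to stochastic depth on the residual branches. As a rough illustration of the idea, the sketch below adds a branch output to its input, randomly skips the branch during training, and rescales surviving branches by `1 / (1 - drop_prob)` so the expected value is unchanged. `residual_with_drop_path` is a hypothetical helper operating on plain lists; the library's `Block` relies on its own drop-path module, which may apply the mask at a different granularity.

```python
import random


def residual_with_drop_path(x, branch_out, drop_prob, training, rng=random):
    """Add a branch output to x, randomly dropping the branch during training."""
    if not 0.0 <= drop_prob < 1.0:
        raise ValueError("drop_prob must be in [0, 1)")
    if not training or drop_prob == 0.0:
        return [a + b for a, b in zip(x, branch_out)]
    if rng.random() < drop_prob:
        return list(x)  # branch dropped: identity mapping
    keep = 1.0 - drop_prob
    return [a + b / keep for a, b in zip(x, branch_out)]


x = [1.0, 2.0]
branch = [0.5, -0.5]

print(residual_with_drop_path(x, branch, drop_prob=0.1, training=False))  # [1.5, 1.5]
print(residual_with_drop_path(x, branch, drop_prob=0.5, training=True,
                              rng=random.Random(0)))
```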

--------------------------------

### Encode/Decode 3D Coordinates with Space-Filling Curves (Python)

Source: https://context7.com/pointcept/pointtransformerv3/llms.txt

Provides functions to encode 3D grid coordinates into 1D codes using Z-order and Hilbert curves, and decode them back. Supports different orders (e.g., 'z', 'z-trans', 'hilbert') and batch processing. Useful for locality-preserving ordering of point clouds.

```python
import torch
from serialization import encode, decode, z_order_encode, hilbert_encode

# Sample grid coordinates (N, 3) within depth range
grid_coord = torch.randint(0, 1024, (5000, 3)).cuda()
batch = torch.zeros(5000, dtype=torch.long).cuda()

# Encode with different serialization orders
z_code = encode(grid_coord, batch, depth=16, order="z")
z_trans_code = encode(grid_coord, batch, depth=16, order="z-trans")  # Transposed axes
hilbert_code = encode(grid_coord, batch, depth=16, order="hilbert")
hilbert_trans_code = encode(grid_coord, batch, depth=16, order="hilbert-trans")

print(f"Z-order codes: {z_code.shape}, dtype: {z_code.dtype}")  # (5000,), int64
print(f"Hilbert codes: {hilbert_code.shape}")  # (5000,)

# Decode back to coordinates
decoded_coord, decoded_batch = decode(z_code, depth=16, order="z")
print(f"Decoded coordinates shape: {decoded_coord.shape}")  # (5000, 3)

# Low-level encoding without batch
z_only = z_order_encode(grid_coord, depth=16)
hilbert_only = hilbert_encode(grid_coord, depth=16)

# Sort points by serialization code for locality-preserving ordering
sorted_indices = torch.argsort(hilbert_code)
sorted_coords = grid_coord[sorted_indices]
print("Points sorted by Hilbert curve order")
```
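
One way to combine a batch index with a coordinate code is to place it in the bits above the `3 * depth` coordinate bits, so that sorting by the combined code keeps each point cloud contiguous. The sketch below assumes that layout; it is consistent with the `batch < 32768` note in the Z-order example (48 + 15 = 63 bits for `depth=16`), but the helper names are hypothetical and the library's exact layout is not confirmed here.

```python
def pack_batch(code, batch, depth=16):
    """Place the batch index above the 3 * depth coordinate bits."""
    coord_bits = 3 * depth
    if code < 0 or code >> coord_bits:
        raise ValueError("code does not fit in 3 * depth bits")
    return (batch << coord_bits) | code


def unpack_batch(packed, depth=16):
    """Split a packed value back into (code, batch)."""
    coord_bits = 3 * depth
    return packed & ((1 << coord_bits) - 1), packed >> coord_bits


codes = [5, 900, 17, 3]
batches = [1, 0, 0, 1]
packed = [pack_batch(c, b) for c, b in zip(codes, batches)]

# Sorting by the packed value orders by batch first, then by curve code.
order = sorted(range(len(packed)), key=lambda i: packed[i])
print(order)                             # [2, 1, 3, 0]
print([unpack_batch(p) for p in packed])
```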

--------------------------------

### SerializedPooling and Unpooling

Source: https://context7.com/pointcept/pointtransformerv3/llms.txt

Hierarchical pooling operations for downsampling and upsampling point clouds while maintaining serialization structure for skip connections.

````APIDOC
## SerializedPooling and Unpooling

### Description
Hierarchical pooling operations that downsample and upsample point clouds while maintaining the serialization structure for skip connections.

### Method
PYTHON

### Endpoint
/pointcept/pointtransformerv3/pooling

### Parameters
#### Arguments
- **point** (Point) - The input point cloud object.
- **pool_config** (dict) - Configuration for SerializedPooling.
  - **in_channels** (int) - Input channels.
  - **out_channels** (int) - Output channels.
  - **stride** (int) - Spatial downsampling factor.
  - **reduce** (str) - Pooling reduction method (e.g., 'max', 'mean').
  - **shuffle_orders** (bool) - Whether to shuffle orders.
  - **traceable** (bool) - Whether to store parent for unpooling.
- **unpool_config** (dict) - Configuration for SerializedUnpooling.
  - **in_channels** (int) - Input channels for unpooling.
  - **skip_channels** (int) - Channels from encoder skip connection.
  - **out_channels** (int) - Output channels.

### Argument Example
```json
{
  "point": {
    "feat": "tensor",
    "coord": "tensor",
    "grid_coord": "tensor",
    "offset": "tensor"
  },
  "pool_config": {
    "in_channels": 64,
    "out_channels": 128,
    "stride": 2,
    "reduce": "max",
    "shuffle_orders": true,
    "traceable": true
  },
  "unpool_config": {
    "in_channels": 128,
    "skip_channels": 64,
    "out_channels": 64
  }
}
```

### Returns
#### Return Value
- **pooled** (Point) - The downsampled point cloud object.
- **upsampled** (Point) - The upsampled point cloud object.

#### Return Example
```json
{
  "pooled": {
    "feat_shape": "(reduced_point_count, 128)"
  },
  "upsampled": {
    "feat_shape": "(original_point_count, 64)"
  }
}
```
````

--------------------------------

### SerializedAttention

Source: https://context7.com/pointcept/pointtransformerv3/llms.txt

Applies serialized attention to point cloud features.

````APIDOC
## SerializedAttention

### Description
Applies serialized attention to point cloud features.

### Method
PYTHON

### Endpoint
/pointcept/pointtransformerv3/attn

### Arguments
- **point** (Point) - The input point cloud object containing features.

### Argument Example
```json
{
  "point": {
    "feat": "tensor",
    "coord": "tensor",
    "grid_coord": "tensor",
    "offset": "tensor"
  }
}
```

### Returns
#### Return Value
- **output** (Point) - The point cloud object with attention-applied features.

#### Return Example
```json
{
  "output": {
    "feat": "tensor (20000, 256)"
  }
}
```
````

--------------------------------

### Block (Transformer Block)

Source: https://context7.com/pointcept/pointtransformerv3/llms.txt

A complete transformer block combining sparse convolution positional encoding, serialized attention, and MLP with residual connections.

````APIDOC
## Block (Transformer Block)

### Description
A complete transformer block combining sparse convolution positional encoding, serialized attention, and MLP with residual connections and drop path regularization.

### Method
PYTHON

### Endpoint
/pointcept/pointtransformerv3/block

### Parameters
#### Arguments
- **point** (Point) - The input point cloud object with features.
- **block_config** (dict) - Configuration for the Block.
  - **channels** (int) - Number of channels.
  - **num_heads** (int) - Number of attention heads.
  - **patch_size** (int) - Patch size for attention.
  - **mlp_ratio** (float) - Ratio for MLP hidden dimension.
  - **qkv_bias** (bool) - Whether to use bias for QKV projection.
  - **qk_scale** (float or None) - Scale factor for attention scores.
  - **attn_drop** (float) - Dropout rate for attention.
  - **proj_drop** (float) - Dropout rate for projection.
  - **drop_path** (float) - Stochastic depth rate.
  - **pre_norm** (bool) - Whether to use pre-normalization.
  - **order_index** (int) - Order index for serialization.
  - **cpe_indice_key** (str) - Indice key for sparse convolution positional encoding.
  - **enable_rpe** (bool) - Whether to enable relative positional encoding.
  - **enable_flash** (bool) - Whether to enable FlashAttention.

### Argument Example
```json
{
  "point": {
    "feat": "tensor",
    "grid_coord": "tensor",
    "offset": "tensor"
  },
  "block_config": {
    "channels": 128,
    "num_heads": 8,
    "patch_size": 1024,
    "mlp_ratio": 4.0,
    "qkv_bias": true,
    "attn_drop": 0.0,
    "proj_drop": 0.0,
    "drop_path": 0.1,
    "pre_norm": true,
    "order_index": 0,
    "cpe_indice_key": "stage0",
    "enable_rpe": false,
    "enable_flash": true
  }
}
```

### Returns
#### Return Value
- **output** (Point) - The point cloud object with processed features.

#### Return Example
```json
{
  "output": {
    "feat_shape": "(15000, 128)"
  }
}
```
````
