### Training Script Integration

Source: https://context7.com/pointcept/pointtransformerv3/llms.txt

Example commands for training PointTransformerV3 on various benchmarks using the Pointcept framework.

```APIDOC
## Training Script Integration

### Description
Example integration with Pointcept training framework for semantic segmentation on standard benchmarks.

### Method
SHELL

### Endpoint
/pointcept/pointtransformerv3/train

### Usage
```bash
# Install Pointcept framework
git clone https://github.com/Pointcept/Pointcept.git
cd Pointcept

# Train PTv3 on ScanNet from scratch (4 GPUs)
sh scripts/train.sh -g 4 -d scannet -c semseg-pt-v3m1-0-base -n semseg-pt-v3m1-0-base

# Train PTv3 on ScanNet200
sh scripts/train.sh -g 4 -d scannet200 -c semseg-pt-v3m1-0-base -n semseg-pt-v3m1-0-base

# Train PTv3 on S3DIS with RPE (FlashAttention disabled)
sh scripts/train.sh -g 4 -d s3dis -c semseg-pt-v3m1-0-rpe -n semseg-pt-v3m1-0-rpe

# Train PTv3 on nuScenes for outdoor segmentation
sh scripts/train.sh -g 4 -d nuscenes -c semseg-pt-v3m1-0-base -n semseg-pt-v3m1-0-base

# Train PTv3 on Waymo
sh scripts/train.sh -g 4 -d waymo -c semseg-pt-v3m1-0-base -n semseg-pt-v3m1-0-base

# PPT joint training with multiple datasets (8 GPUs)
sh scripts/train.sh -g 8 -d scannet -c semseg-pt-v3m1-1-ppt-extreme -n semseg-pt-v3m1-1-ppt-extreme
```
```

--------------------------------

### Initialize and execute PointTransformerV3 model

Source: https://context7.com/pointcept/pointtransformerv3/llms.txt

Demonstrates how to instantiate the PTv3 model with specific encoder-decoder configurations and perform a forward pass on batched 3D point cloud data. It requires a dictionary input containing coordinates, features, grid coordinates, and batch offsets.

```python
import torch
from model import PointTransformerV3, Point

model = PointTransformerV3(
    in_channels=6,
    order=("z", "z-trans", "hilbert", "hilbert-trans"),
    stride=(2, 2, 2, 2),
    enc_depths=(2, 2, 2, 6, 2),
    enc_channels=(32, 64, 128, 256, 512),
    enc_num_head=(2, 4, 8, 16, 32),
    enc_patch_size=(1024, 1024, 1024, 1024, 1024),
    dec_depths=(2, 2, 2, 2),
    dec_channels=(64, 64, 128, 256),
    dec_num_head=(4, 4, 8, 16),
    dec_patch_size=(1024, 1024, 1024, 1024),
    mlp_ratio=4,
    enable_flash=True,
    enable_rpe=False,
    cls_mode=False,
).cuda()

data_dict = {
    "coord": torch.randn(95000, 3).cuda(),
    "feat": torch.randn(95000, 6).cuda(),
    "grid_coord": torch.randint(0, 1000, (95000, 3)).cuda(),
    "offset": torch.tensor([50000, 95000]).cuda(),
}

with torch.no_grad():
    output = model(data_dict)
```

--------------------------------

### Train PointTransformerV3 Models via Shell Scripts

Source: https://github.com/pointcept/pointtransformerv3/blob/main/README.md

Executes training jobs for various datasets using the provided training script. Requires specifying the number of GPUs, dataset name, configuration file, and experiment name.

```bash
# Scratched ScanNet
sh scripts/train.sh -g 4 -d scannet -c semseg-pt-v3m1-0-base -n semseg-pt-v3m1-0-base

# PPT joint training (ScanNet + Structured3D) and evaluate in ScanNet
sh scripts/train.sh -g 8 -d scannet -c semseg-pt-v3m1-1-ppt-extreme -n semseg-pt-v3m1-1-ppt-extreme

# Scratched ScanNet200
sh scripts/train.sh -g 4 -d scannet200 -c semseg-pt-v3m1-0-base -n semseg-pt-v3m1-0-base

# Scratched S3DIS
sh scripts/train.sh -g 4 -d s3dis -c semseg-pt-v3m1-0-rpe -n semseg-pt-v3m1-0-rpe

# PPT joint training (ScanNet + S3DIS + Structured3D) and evaluate in ScanNet
sh scripts/train.sh -g 8 -d s3dis -c semseg-pt-v3m1-1-ppt-extreme -n semseg-pt-v3m1-1-ppt-extreme

# Scratched nuScenes
sh scripts/train.sh -g 4 -d nuscenes -c semseg-pt-v3m1-0-base -n semseg-pt-v3m1-0-base

# Scratched Waymo
sh scripts/train.sh -g 4 -d waymo -c semseg-pt-v3m1-0-base -n semseg-pt-v3m1-0-base
```

--------------------------------

### Run Pointcept Training Scripts

Source: https://context7.com/pointcept/pointtransformerv3/llms.txt

Provides shell commands to train PTv3 models on various datasets like ScanNet, S3DIS, and nuScenes using the Pointcept framework.

```bash
git clone https://github.com/Pointcept/Pointcept.git
cd Pointcept
sh scripts/train.sh -g 4 -d scannet -c semseg-pt-v3m1-0-base -n semseg-pt-v3m1-0-base
sh scripts/train.sh -g 4 -d nuscenes -c semseg-pt-v3m1-0-base -n semseg-pt-v3m1-0-base
```

--------------------------------

### Configure PTv3 Without FlashAttention (Python)

Source: https://context7.com/pointcept/pointtransformerv3/llms.txt

Demonstrates how to configure and initialize the Point Transformer V3 model on systems that do not support FlashAttention (e.g., CUDA versions older than 11.6). This involves adjusting patch sizes and explicitly disabling FlashAttention while enabling alternative features like relative position encoding.

```python
import torch
from model import PointTransformerV3

# Configure PTv3 without FlashAttention
model = PointTransformerV3(
    in_channels=6,
    # Reduce patch sizes when not using FlashAttention
    enc_patch_size=(128, 128, 128, 128, 128),
    dec_patch_size=(128, 128, 128, 128),
    # Disable FlashAttention
    enable_flash=False,
    # Enable features that require disabling flash
    enable_rpe=True,  # Relative position encoding
    upcast_attention=True,  # FP32 attention computation
    upcast_softmax=True,  # FP32 softmax
).cuda()

# Use same input format
data_dict = {
    "coord": torch.randn(30000, 3).cuda(),
    "feat": torch.randn(30000, 6).cuda(),
    "grid_coord": torch.randint(0, 512, (30000, 3)).cuda(),
    "offset": torch.tensor([15000, 30000]).cuda(),
}

output = model(data_dict)
print(f"Output without FlashAttention: {output.feat.shape}")
```

--------------------------------

### SerializedAttention Module Initialization (Python)

Source: https://context7.com/pointcept/pointtransformerv3/llms.txt

Initializes the SerializedAttention layer, a core attention mechanism for processing point clouds in serialized order. Supports FlashAttention for efficiency and handles variable-length sequences with padding and masking.

```python
import torch
from model import SerializedAttention, Point

# Initialize serialized attention layer
attn = SerializedAttention(
    channels=256,
    num_heads=8,
    patch_size=1024,  # Points per attention window
    qkv_bias=True,
    qk_scale=None,
    attn_drop=0.0,
    proj_drop=0.0,
    order_index=0,  # Which serialization order to use
    enable_rpe=False,  # Relative position encoding
    enable_flash=True,  # Use FlashAttention
    upcast_attention=False,
    upcast_softmax=False,
).cuda()

# Prepare serialized point cloud
point = Point({
    "feat": torch.randn(20000, 256).cuda(),
    "grid_coord": torch.randint(0, 1024, (20000, 3)).cuda(),
    "offset": torch.tensor([10000, 20000]).cuda(),
})
point.serialization(order=("z", "hilbert"), depth=16)
point.sparsify()
```

--------------------------------

### Custom Framework Integration

Source: https://context7.com/pointcept/pointtransformerv3/llms.txt

Instructions for using PointTransformerV3 in custom projects outside the Pointcept framework.

```APIDOC
## Custom Framework Integration

### Description
Standalone usage of PTv3 in custom projects without the Pointcept framework.

### Method
PYTHON

### Endpoint
/pointcept/pointtransformerv3/custom_integration

### Usage
```python
import torch
import torch.nn as nn
from model import PointTransformerV3

# Initialize the model
model = PointTransformerV3(...)
model.cuda()

# Prepare your point cloud data
# ...

# Forward pass
# output = model(point_cloud_data)
```
```

--------------------------------

### Create and Use PTv3 Segmentation Head (Python)

Source: https://context7.com/pointcept/pointtransformerv3/llms.txt

Defines a custom segmentation head for Point Transformer V3, initializes the model, prepares sample data, performs a forward pass, and sets up a basic training loop with optimizer and loss calculation. This snippet demonstrates the end-to-end usage for segmentation.

```python
import torch
import torch.nn as nn
from model import PointTransformerV3


# Create custom segmentation head
class PTv3Segmentation(nn.Module):
    def __init__(self, num_classes, in_channels=6):
        super().__init__()
        self.backbone = PointTransformerV3(
            in_channels=in_channels,
            enc_channels=(32, 64, 128, 256, 512),
            dec_channels=(64, 64, 128, 256),
            enable_flash=True,
        )
        # Segmentation head on decoder output (64 channels)
        self.head = nn.Sequential(
            nn.Linear(64, 64),
            nn.BatchNorm1d(64),
            nn.ReLU(inplace=True),
            nn.Linear(64, num_classes),
        )

    def forward(self, data_dict):
        point = self.backbone(data_dict)
        logits = self.head(point.feat)
        return logits


# Initialize model
model = PTv3Segmentation(num_classes=20, in_channels=6).cuda()

# Prepare data (N points with xyz + rgb features)
data = {
    "coord": torch.randn(50000, 3).cuda(),
    "feat": torch.randn(50000, 6).cuda(),
    "grid_coord": torch.randint(0, 1000, (50000, 3)).cuda(),
    "offset": torch.tensor([25000, 50000]).cuda(),  # 2 batches
}

# Forward pass
logits = model(data)
print(f"Segmentation logits: {logits.shape}")  # (50000, 20)

# Training loop
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()
labels = torch.randint(0, 20, (50000,)).cuda()
loss = criterion(logits, labels)
loss.backward()
optimizer.step()
```

--------------------------------

### Apply Serialized Attention

Source: https://context7.com/pointcept/pointtransformerv3/llms.txt

Demonstrates the basic application of the serialized attention mechanism on point cloud features.

```python
output = attn(point)
print(f"Output features: {output.feat.shape}")
```

--------------------------------

### Manage point cloud data with Point class

Source: https://context7.com/pointcept/pointtransformerv3/llms.txt

Explains the usage of the Point class for handling batched point cloud data, including automatic batch index generation, serialization using space-filling curves, and preparation for sparse convolution operations.
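
To make the relationship between `offset` and per-point batch indices concrete, here is a minimal dependency-free sketch. It assumes that `offset` holds cumulative end indices, as the example values in this document suggest (for instance `[25000, 50000]` for two samples of 25000 points each). The helper names `offset_to_batch` and `batch_to_offset` are hypothetical and written with plain Python lists; they are not taken from the library.

```python
from itertools import accumulate
from typing import List


def offset_to_batch(offset: List[int]) -> List[int]:
    """Expand cumulative end offsets into one batch index per point (assumed layout)."""
    batch: List[int] = []
    start = 0
    for sample_index, end in enumerate(offset):
        # Points in [start, end) belong to sample `sample_index`.
        batch.extend([sample_index] * (end - start))
        start = end
    return batch


def batch_to_offset(batch: List[int]) -> List[int]:
    """Inverse mapping: count points per sample, then take the running sum."""
    if not batch:
        return []
    counts = [0] * (max(batch) + 1)
    for sample_index in batch:
        counts[sample_index] += 1
    return list(accumulate(counts))


if __name__ == "__main__":
    offset = [3, 5]  # sample 0 has 3 points, sample 1 has 2 points
    batch = offset_to_batch(offset)
    print(batch)
    print(batch_to_offset(batch))
```

The `Point` example from the source follows.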

```python
import torch
from model import Point

point = Point({
    "coord": torch.randn(18000, 3).cuda(),
    "feat": torch.randn(18000, 32).cuda(),
    "grid_coord": torch.randint(0, 512, (18000, 3)).cuda(),
    "offset": torch.tensor([10000, 18000]).cuda(),
})

point.serialization(
    order=("z", "hilbert"),
    depth=16,
    shuffle_orders=True,
)
point.sparsify(pad=96)
```

--------------------------------

### Run S3DIS 6-Fold Cross Validation

Source: https://github.com/pointcept/pointtransformerv3/blob/main/README.md

Calculates the 6-fold cross-validation performance for S3DIS after training on all area splits. Requires setting the PYTHONPATH and providing the root directory containing the result files.

```bash
export PYTHONPATH=./
python tools/test_s3dis_6fold.py --record_root ${RECORD_FOLDER}
```

--------------------------------

### Implement Serialized Pooling and Unpooling

Source: https://context7.com/pointcept/pointtransformerv3/llms.txt

Configures hierarchical pooling and unpooling layers for point cloud downsampling and upsampling. It maintains serialization structures to support skip connections between encoder and decoder stages.

```python
import torch
import torch.nn as nn
from model import SerializedPooling, SerializedUnpooling, Point

pool = SerializedPooling(
    in_channels=64,
    out_channels=128,
    stride=2,
    norm_layer=nn.BatchNorm1d,
    act_layer=nn.GELU,
    reduce="max",
    shuffle_orders=True,
    traceable=True,
).cuda()

unpool = SerializedUnpooling(
    in_channels=128,
    skip_channels=64,
    out_channels=64,
    norm_layer=nn.BatchNorm1d,
    act_layer=nn.GELU,
).cuda()

point = Point({
    "feat": torch.randn(10000, 64).cuda(),
    "coord": torch.randn(10000, 3).cuda(),
    "grid_coord": torch.randint(0, 512, (10000, 3)).cuda(),
    "offset": torch.tensor([5000, 10000]).cuda(),
})
point.serialization(order="z", depth=16)
point.sparsify()

encoder_point = point
pooled = pool(point)
upsampled = unpool(pooled)
```

--------------------------------

### Z-Order Encoding (Morton Code) with Lookup Tables (Python)

Source: https://context7.com/pointcept/pointtransformerv3/llms.txt

Implements Z-order encoding by interleaving bits of x, y, z coordinates to create a single integer code. Uses lookup tables for efficient encoding and decoding. Supports optional batch indexing.
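
The bit-interleaving idea can be illustrated without any dependencies. The following is a minimal pure-Python sketch, not the repository's lookup-table implementation: `morton_encode` and `morton_decode` are hypothetical names, the loop processes one bit at a time for readability, and the choice of which axis occupies the lowest bit of each triple is an assumption that may differ from `xyz2key`.

```python
from typing import Tuple


def morton_encode(x: int, y: int, z: int, depth: int = 16) -> int:
    """Interleave the low `depth` bits of x, y, z into a single integer key."""
    key = 0
    for i in range(depth):
        # Bit i of each coordinate lands in bit triple i of the key.
        # Assumed ordering inside a triple: x highest, z lowest.
        key |= ((x >> i) & 1) << (3 * i + 2)
        key |= ((y >> i) & 1) << (3 * i + 1)
        key |= ((z >> i) & 1) << (3 * i)
    return key


def morton_decode(key: int, depth: int = 16) -> Tuple[int, int, int]:
    """Undo morton_encode by picking every third bit back out."""
    x = y = z = 0
    for i in range(depth):
        x |= ((key >> (3 * i + 2)) & 1) << i
        y |= ((key >> (3 * i + 1)) & 1) << i
        z |= ((key >> (3 * i)) & 1) << i
    return x, y, z


if __name__ == "__main__":
    for xyz in [(0, 0, 0), (1, 1, 1), (2, 3, 1), (100, 50, 75), (255, 128, 64)]:
        key = morton_encode(*xyz)
        assert morton_decode(key) == xyz  # round-trip check
        print(xyz, "->", key)
```

A lookup table replaces the per-bit loop with a few table reads per coordinate, which is the optimisation the section title refers to. The library's own `xyz2key` / `key2xyz` usage follows.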

```python
import torch
from serialization.z_order import xyz2key, key2xyz

# Encode individual x, y, z coordinates to Z-order key
x = torch.tensor([0, 1, 2, 100, 255]).cuda()
y = torch.tensor([0, 1, 3, 50, 128]).cuda()
z = torch.tensor([0, 1, 1, 75, 64]).cuda()

# Encode without batch index
key = xyz2key(x, y, z, b=None, depth=16)
print(f"Z-order keys: {key}")

# Encode with batch index (batch < 32768)
batch_idx = torch.tensor([0, 0, 1, 1, 2]).cuda()
key_with_batch = xyz2key(x, y, z, b=batch_idx, depth=16)
print(f"Keys with batch: {key_with_batch}")

# Decode key back to coordinates
dec_x, dec_y, dec_z, dec_b = key2xyz(key_with_batch, depth=16)
print(f"Decoded x: {dec_x}")
print(f"Decoded batch: {dec_b}")

# Verify round-trip
assert torch.allclose(x, dec_x)
assert torch.allclose(y, dec_y)
assert torch.allclose(z, dec_z)
```

--------------------------------

### Hilbert Curve Encoding and Decoding (Python)

Source: https://context7.com/pointcept/pointtransformerv3/llms.txt

Provides functions for encoding 3D locations into Hilbert curve indices and decoding them back. Offers superior locality preservation compared to Z-order. Supports variable bit depths for coordinates.
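
The locality claim can be checked on a small grid. The sketch below is a 2D illustration only (the library works in 3D), using the classic index-to-coordinate Hilbert construction and a plain bit de-interleave for Z-order. The function names `hilbert_d2xy`, `morton_d2xy`, and `max_step` are hypothetical and none of this code comes from the repository. It measures the largest Manhattan distance between cells that are consecutive along each curve.

```python
from typing import Callable, Tuple


def hilbert_d2xy(order: int, d: int) -> Tuple[int, int]:
    """Map index d on a 2D Hilbert curve over a 2^order x 2^order grid to (x, y)."""
    n = 1 << order
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            # Rotate/flip the quadrant so sub-curves connect end to end.
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def morton_d2xy(order: int, d: int) -> Tuple[int, int]:
    """Map index d on a 2D Z-order curve to (x, y) by de-interleaving bits."""
    x = y = 0
    for i in range(order):
        x |= ((d >> (2 * i)) & 1) << i
        y |= ((d >> (2 * i + 1)) & 1) << i
    return x, y


def max_step(curve: Callable[[int, int], Tuple[int, int]], order: int) -> int:
    """Largest Manhattan distance between consecutive cells along the curve."""
    pts = [curve(order, d) for d in range(1 << (2 * order))]
    return max(abs(ax - bx) + abs(ay - by) for (ax, ay), (bx, by) in zip(pts, pts[1:]))


if __name__ == "__main__":
    print("Hilbert max step:", max_step(hilbert_d2xy, 3))
    print("Z-order max step:", max_step(morton_d2xy, 3))
```

Consecutive Hilbert cells are always grid neighbours, while Z-order makes long jumps whenever it crosses a quadrant boundary. The library's 3D encode/decode usage follows.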

```python
import torch
from serialization.hilbert import encode as hilbert_encode, decode as hilbert_decode

# Encode 3D locations to Hilbert integers
# Input shape: (N, 3) where values are in range [0, 2^num_bits - 1]
locations = torch.randint(0, 256, (1000, 3)).cuda()  # 8-bit coordinates

# Encode to Hilbert curve index
hilbert_codes = hilbert_encode(locations, num_dims=3, num_bits=8)
print(f"Hilbert codes shape: {hilbert_codes.shape}")  # (1000,)
print(f"Hilbert codes dtype: {hilbert_codes.dtype}")  # int64

# Decode back to 3D locations
decoded_locations = hilbert_decode(hilbert_codes, num_dims=3, num_bits=8)
print(f"Decoded shape: {decoded_locations.shape}")  # (1000, 3)

# Verify round-trip encoding
assert torch.allclose(locations.long(), decoded_locations)

# Higher precision encoding for larger point clouds
high_res_locs = torch.randint(0, 65536, (5000, 3)).cuda()  # 16-bit coords
high_res_codes = hilbert_encode(high_res_locs, num_dims=3, num_bits=16)
print(f"16-bit Hilbert codes: {high_res_codes.shape}")
```

--------------------------------

### Configure Transformer Block

Source: https://context7.com/pointcept/pointtransformerv3/llms.txt

Initializes a complete transformer block including sparse convolution positional encoding and attention mechanisms. This block is designed for processing sparse point cloud data with residual connections.

```python
import torch
import torch.nn as nn
from model import Block, Point

block = Block(
    channels=128,
    num_heads=8,
    patch_size=1024,
    mlp_ratio=4.0,
    qkv_bias=True,
    attn_drop=0.0,
    proj_drop=0.0,
    drop_path=0.1,
    norm_layer=nn.LayerNorm,
    act_layer=nn.GELU,
    pre_norm=True,
    cpe_indice_key="stage0",
    enable_flash=True,
).cuda()

point = Point({
    "feat": torch.randn(15000, 128).cuda(),
    "grid_coord": torch.randint(0, 256, (15000, 3)).cuda(),
    "offset": torch.tensor([7500, 15000]).cuda(),
})
point.serialization(order=("z", "hilbert"), depth=16)
point.sparsify()

output = block(point)
```

--------------------------------

### Encode/Decode 3D Coordinates with Space-Filling Curves (Python)

Source: https://context7.com/pointcept/pointtransformerv3/llms.txt

Provides functions to encode 3D grid coordinates into 1D codes using Z-order and Hilbert curves, and decode them back. Supports different orders (e.g., 'z', 'z-trans', 'hilbert') and batch processing. Useful for locality-preserving ordering of point clouds.
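
One way batch processing can coexist with a single sortable code is to place the batch index in the bits above the coordinate code. The sketch below assumes that layout: `3 * depth` low bits for the space-filling-curve code and the batch index above them. It is an illustration with hypothetical helper names (`pack_batch`, `unpack_batch`), not the library's implementation. Under this assumption and a signed 64-bit key, `depth=16` leaves 15 bits for the batch index, which would be consistent with the `batch < 32768` note in the Z-order example.

```python
from typing import List, Tuple


def pack_batch(batch_index: int, code: int, depth: int = 16) -> int:
    """Place the batch index above the 3*depth coordinate bits (assumed layout)."""
    coord_bits = 3 * depth
    assert 0 <= code < (1 << coord_bits), "code must fit in 3*depth bits"
    return (batch_index << coord_bits) | code


def unpack_batch(key: int, depth: int = 16) -> Tuple[int, int]:
    """Split a packed key back into (batch_index, code)."""
    coord_bits = 3 * depth
    return key >> coord_bits, key & ((1 << coord_bits) - 1)


def sort_by_key(keys: List[int]) -> List[int]:
    """Return the permutation that sorts points by packed key."""
    return sorted(range(len(keys)), key=keys.__getitem__)


if __name__ == "__main__":
    # (batch_index, curve_code) pairs in arbitrary order
    points = [(1, 5), (0, 900), (1, 2), (0, 7)]
    keys = [pack_batch(b, c) for b, c in points]
    order = sort_by_key(keys)
    # Sorting groups each sample together, then orders it along the curve.
    print([points[i] for i in order])
    # Batch capacity with a signed 64-bit key at depth 16
    print(2 ** (63 - 3 * 16))
```

Sorting by the packed key keeps each sample contiguous while ordering its points along the curve. The library's own `encode` / `decode` usage follows.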

```python
import torch
from serialization import encode, decode, z_order_encode, hilbert_encode

# Sample grid coordinates (N, 3) within depth range
grid_coord = torch.randint(0, 1024, (5000, 3)).cuda()
batch = torch.zeros(5000, dtype=torch.long).cuda()

# Encode with different serialization orders
z_code = encode(grid_coord, batch, depth=16, order="z")
z_trans_code = encode(grid_coord, batch, depth=16, order="z-trans")  # Transposed axes
hilbert_code = encode(grid_coord, batch, depth=16, order="hilbert")
hilbert_trans_code = encode(grid_coord, batch, depth=16, order="hilbert-trans")

print(f"Z-order codes: {z_code.shape}, dtype: {z_code.dtype}")  # (5000,), int64
print(f"Hilbert codes: {hilbert_code.shape}")  # (5000,)

# Decode back to coordinates
decoded_coord, decoded_batch = decode(z_code, depth=16, order="z")
print(f"Decoded coordinates shape: {decoded_coord.shape}")  # (5000, 3)

# Low-level encoding without batch
z_only = z_order_encode(grid_coord, depth=16)
hilbert_only = hilbert_encode(grid_coord, depth=16)

# Sort points by serialization code for locality-preserving ordering
sorted_indices = torch.argsort(hilbert_code)
sorted_coords = grid_coord[sorted_indices]
print(f"Points sorted by Hilbert curve order")
```

--------------------------------

### SerializedPooling and Unpooling

Source: https://context7.com/pointcept/pointtransformerv3/llms.txt

Hierarchical pooling operations for downsampling and upsampling point clouds while maintaining serialization structure for skip connections.

```APIDOC
## SerializedPooling and Unpooling

### Description
Hierarchical pooling operations that downsample and upsample point clouds while maintaining the serialization structure for skip connections.

### Method
POST

### Endpoint
/pointcept/pointtransformerv3/pooling

### Parameters
#### Request Body
- **point** (Point) - The input point cloud object.
- **pool_config** (dict) - Configuration for SerializedPooling.
  - **in_channels** (int) - Input channels.
  - **out_channels** (int) - Output channels.
  - **stride** (int) - Spatial downsampling factor.
  - **reduce** (str) - Pooling reduction method (e.g., 'max', 'mean').
  - **shuffle_orders** (bool) - Whether to shuffle orders.
  - **traceable** (bool) - Whether to store parent for unpooling.
- **unpool_config** (dict) - Configuration for SerializedUnpooling.
  - **in_channels** (int) - Input channels for unpooling.
  - **skip_channels** (int) - Channels from encoder skip connection.
  - **out_channels** (int) - Output channels.

### Request Example
```json
{
  "point": {
    "feat": "tensor",
    "coord": "tensor",
    "grid_coord": "tensor",
    "offset": "tensor"
  },
  "pool_config": {
    "in_channels": 64,
    "out_channels": 128,
    "stride": 2,
    "reduce": "max",
    "shuffle_orders": true,
    "traceable": true
  },
  "unpool_config": {
    "in_channels": 128,
    "skip_channels": 64,
    "out_channels": 64
  }
}
```

### Response
#### Success Response (200)
- **pooled** (Point) - The downsampled point cloud object.
- **upsampled** (Point) - The upsampled point cloud object.

#### Response Example
```json
{
  "pooled": {
    "feat_shape": "(reduced_point_count, 128)"
  },
  "upsampled": {
    "feat_shape": "(original_point_count, 64)"
  }
}
```
```

--------------------------------

### SerializedAttention

Source: https://context7.com/pointcept/pointtransformerv3/llms.txt

Applies serialized attention to point cloud features.

```APIDOC
## SerializedAttention

### Description
Applies serialized attention to point cloud features.

### Method
POST

### Endpoint
/pointcept/pointtransformerv3/attn

### Request Body
- **point** (Point) - The input point cloud object containing features.

### Request Example
```json
{
  "point": {
    "feat": "tensor",
    "coord": "tensor",
    "grid_coord": "tensor",
    "offset": "tensor"
  }
}
```

### Response
#### Success Response (200)
- **output** (Point) - The point cloud object with attention-applied features.

#### Response Example
```json
{
  "output": {
    "feat": "tensor (20000, 256)"
  }
}
```
```

--------------------------------

### Block (Transformer Block)

Source: https://context7.com/pointcept/pointtransformerv3/llms.txt

A complete transformer block combining sparse convolution positional encoding, serialized attention, and MLP with residual connections.

```APIDOC
## Block (Transformer Block)

### Description
A complete transformer block combining sparse convolution positional encoding, serialized attention, and MLP with residual connections and drop path regularization.

### Method
POST

### Endpoint
/pointcept/pointtransformerv3/block

### Parameters
#### Request Body
- **point** (Point) - The input point cloud object with features.
- **block_config** (dict) - Configuration for the Block.
  - **channels** (int) - Number of channels.
  - **num_heads** (int) - Number of attention heads.
  - **patch_size** (int) - Patch size for attention.
  - **mlp_ratio** (float) - Ratio for MLP hidden dimension.
  - **qkv_bias** (bool) - Whether to use bias for QKV projection.
  - **qk_scale** (float or None) - Scale factor for attention scores.
  - **attn_drop** (float) - Dropout rate for attention.
  - **proj_drop** (float) - Dropout rate for projection.
  - **drop_path** (float) - Stochastic depth rate.
  - **pre_norm** (bool) - Whether to use pre-normalization.
  - **order_index** (int) - Order index for serialization.
  - **cpe_indice_key** (str) - Indice key for sparse convolution positional encoding.
  - **enable_rpe** (bool) - Whether to enable relative positional encoding.
  - **enable_flash** (bool) - Whether to enable FlashAttention.

### Request Example
```json
{
  "point": {
    "feat": "tensor",
    "grid_coord": "tensor",
    "offset": "tensor"
  },
  "block_config": {
    "channels": 128,
    "num_heads": 8,
    "patch_size": 1024,
    "mlp_ratio": 4.0,
    "qkv_bias": true,
    "attn_drop": 0.0,
    "proj_drop": 0.0,
    "drop_path": 0.1,
    "pre_norm": true,
    "order_index": 0,
    "cpe_indice_key": "stage0",
    "enable_rpe": false,
    "enable_flash": true
  }
}
```

### Response
#### Success Response (200)
- **output** (Point) - The point cloud object with processed features.

#### Response Example
```json
{
  "output": {
    "feat_shape": "(15000, 128)"
  }
}
```
```