### Test Installation (Bash)

Source: https://github.com/apple/ml-sharp/blob/main/README.md

Verifies the project installation by running the command-line interface (CLI) help command. This ensures the 'sharp' command is recognized and functional.

```bash
sharp --help
```
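The same check can be scripted, for example in a setup or CI step. The sketch below is a minimal illustration using only the Python standard library; `cli_available` is a hypothetical helper name, not part of SHARP.

```python
import shutil
import subprocess


def cli_available(command: str = "sharp") -> bool:
    """Return True if `command` is on PATH and `command --help` exits with status 0."""
    executable = shutil.which(command)
    if executable is None:
        # The command is not installed or not on PATH.
        return False
    try:
        result = subprocess.run(
            [executable, "--help"], capture_output=True, text=True, timeout=60
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


print("sharp CLI available:", cli_available())
```

A `False` result usually means the environment that holds the installation is not active.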

--------------------------------

### Install Project Dependencies (Bash)

Source: https://github.com/apple/ml-sharp/blob/main/README.md

Installs the project's Python dependencies by reading from the 'requirements.txt' file. This command should be run after creating the Python environment.

```bash
pip install -r requirements.txt
```

--------------------------------

### Download Model Checkpoint (Bash)

Source: https://github.com/apple/ml-sharp/blob/main/README.md

Downloads the SHARP model checkpoint directly from a provided URL. This is an alternative to automatic downloading during prediction.

```bash
wget https://ml-site.cdn-apple.com/models/sharp/sharp_2572gikvuh.pt
```
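Where `wget` is not available, the download can be sketched with the Python standard library. `checkpoint_path` and `download_checkpoint` are hypothetical helper names; only the URL comes from the snippet above.

```python
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import urlretrieve

CHECKPOINT_URL = "https://ml-site.cdn-apple.com/models/sharp/sharp_2572gikvuh.pt"


def checkpoint_path(url: str, directory: Path = Path(".")) -> Path:
    """Derive the local file path from the last component of the URL path."""
    return directory / Path(urlparse(url).path).name


def download_checkpoint(url: str = CHECKPOINT_URL, directory: Path = Path(".")) -> Path:
    """Download the checkpoint unless a file with that name already exists."""
    target = checkpoint_path(url, directory)
    if not target.exists():
        directory.mkdir(parents=True, exist_ok=True)
        urlretrieve(url, target)
    return target


# Only computes the target path; call download_checkpoint() to fetch the file.
print(checkpoint_path(CHECKPOINT_URL))
```

The returned path can then be passed to `sharp predict` through the `-c` flag.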

--------------------------------

### Run Prediction with Auto Download (Bash)

Source: https://github.com/apple/ml-sharp/blob/main/README.md

Executes the SHARP model for prediction using images from the specified input directory and saves the resulting 3D Gaussian splats to the output directory. The model checkpoint is automatically downloaded and cached if not found locally.

```bash
sharp predict -i /path/to/input/images -o /path/to/output/gaussians
```

--------------------------------

### Run Prediction with Manual Checkpoint (Bash)

Source: https://github.com/apple/ml-sharp/blob/main/README.md

Executes the SHARP model for prediction, specifying a manually downloaded checkpoint file using the '-c' flag. This allows for using a specific version of the model.

```bash
sharp predict -i /path/to/input/images -o /path/to/output/gaussians -c sharp_2572gikvuh.pt
```

--------------------------------

### CLI - Render Trajectory Videos with SHARP

Source: https://context7.com/apple/ml-sharp/llms.txt

Command-line interface for rendering camera trajectory videos from pre-computed 3D Gaussian .ply files generated by SHARP. Supports rendering from single or multiple .ply files and generates both .mp4 video and .depth.mp4 depth visualizations.

```bash
# Render video from single .ply file
sharp render -i ./output/scene.ply -o ./videos

# Render videos from all .ply files in a directory (-v enables verbose logging)
sharp render -i ./output -o ./videos -v

# Expected output: Creates .mp4 video and .depth.mp4 depth visualization
# Example: output/scene.ply -> videos/scene.mp4, videos/scene.depth.mp4
# Default trajectory: 60 frames rotating forward with camera movement
```
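The naming convention in the comments above (`scene.ply` becomes `scene.mp4` and `scene.depth.mp4`) can be sketched as a small path helper, which is handy for checking that a render produced the expected files. `expected_render_outputs` is a hypothetical name, and the mapping is inferred only from the example in the comments.

```python
from pathlib import Path


def expected_render_outputs(ply_path: Path, output_dir: Path) -> tuple[Path, Path]:
    """Return the (color video, depth video) paths expected for one .ply file."""
    stem = ply_path.stem  # "scene" for "scene.ply"
    return output_dir / f"{stem}.mp4", output_dir / f"{stem}.depth.mp4"


color_video, depth_video = expected_render_outputs(Path("output/scene.ply"), Path("videos"))
print(color_video)
print(depth_video)
```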

--------------------------------

### Create Python Environment (Bash)

Source: https://github.com/apple/ml-sharp/blob/main/README.md

Creates a new Conda Python environment named 'sharp' with Python version 3.13. This is the first step to set up the project's dependencies.

```bash
conda create -n sharp python=3.13
```

--------------------------------

### Render Videos from Gaussians (Bash)

Source: https://github.com/apple/ml-sharp/blob/main/README.md

Predicts 3D Gaussian splats and then renders videos from these splats using a camera trajectory. This functionality requires a CUDA GPU and uses the '--render' option.

```bash
sharp predict -i /path/to/input/images -o /path/to/output/gaussians --render
```

--------------------------------

### Render Videos from Intermediate Gaussians (Bash)

Source: https://github.com/apple/ml-sharp/blob/main/README.md

Renders videos directly from pre-computed 3D Gaussian splat files located in the specified output directory. This command is used after the prediction step has generated the Gaussian data.

```bash
sharp render -i /path/to/output/gaussians -o /path/to/output/renderings
```

--------------------------------

### Python API: Configure Model Parameters for Neural Networks

Source: https://context7.com/apple/ml-sharp/llms.txt

Customizes neural network architecture and training hyperparameters using the SHARP Python API. This involves setting up initializer, monodepth, and Gaussian decoder parameters, as well as learning rate factors and color space configurations. The output is a configured predictor object.

```python
from sharp.models import create_predictor, PredictorParams
from sharp.models.params import (
    InitializerParams, MonodepthParams, GaussianDecoderParams, DeltaFactor
)

# Configure predictor with custom parameters
params = PredictorParams(
    # Initializer settings
    initializer=InitializerParams(
        num_layers=2,              # Number of Gaussian layers
        stride=2,                  # Spatial stride
        scale_factor=1.0,          # Initial scale multiplier
        first_layer_depth_option="surface_min",
        color_option="all_layers", # Options: "none", "first_layer", "all_layers"
        normalize_depth=True
    ),

    # Monodepth network configuration
    monodepth=MonodepthParams(
        patch_encoder_preset="dinov2l16_384",
        image_encoder_preset="dinov2l16_384",
        checkpoint_uri=None,
        unfreeze_patch_encoder=False,
        grad_checkpointing=False,
        dims_decoder=(256, 256, 256, 256, 256)
    ),

    # Gaussian decoder settings
    gaussian_decoder=GaussianDecoderParams(
        dim_in=5,
        dim_out=32,
        norm_type="group_norm",
        norm_num_groups=8,
        stride=2,
        use_depth_input=True,
        upsampling_mode="transposed_conv",
        grad_checkpointing=False
    ),

    # Learning rate factors for different properties
    delta_factor=DeltaFactor(
        xy=0.001,          # Position learning rate
        z=0.001,
        color=0.1,         # 0.1 for linearRGB, 1.0 for sRGB
        opacity=1.0,
        scale=1.0,
        quaternion=1.0
    ),

    # Gaussian constraints
    max_scale=10.0,
    min_scale=0.0,

    # Color space and activations
    color_space="linearRGB",  # or "sRGB"
    color_activation_type="sigmoid",
    opacity_activation_type="sigmoid",

    # Advanced settings
    num_monodepth_layers=2,
    sorting_monodepth=False,
    base_scale_on_predicted_mean=True
)

# Create predictor with custom configuration
predictor = create_predictor(params)
print(f"Internal resolution: {predictor.internal_resolution()}")
print(f"Output resolution: {predictor.output_resolution}")

# Load weights before running inference (uncomment; requires `import torch`)
# state_dict = torch.load("checkpoint.pt", weights_only=True)
# predictor.load_state_dict(state_dict)
# predictor.eval()

```

--------------------------------

### Python API: Video Writing with Synchronized Depth Visualization

Source: https://context7.com/apple/ml-sharp/llms.txt

Creates video files with synchronized color and depth visualizations using the SHARP Python API. It initializes a VideoWriter, generates synthetic color and depth frames, and writes them to the writer. The output includes a color video and an optional depth visualization video.

```python
from sharp.utils.io import VideoWriter
import torch
from pathlib import Path

# Initialize video writer
output_path = Path("output.mp4")
writer = VideoWriter(
    output_path=output_path,
    fps=30.0,           # Frames per second
    render_depth=True   # Also create .depth.mp4 file
)

# Generate and write frames
num_frames = 120
for i in range(num_frames):
    # Example: synthetic color and depth frames
    height, width = 480, 640
    color = torch.randint(0, 256, (height, width, 3), dtype=torch.uint8)  # Upper bound is exclusive
    depth = torch.randn(height, width) * 5.0 + 10.0  # Depth in meters

    writer.add_frame(color, depth)

# Finalize video files
writer.close()

# Output files:
# - output.mp4: Color video
# - output.depth.mp4: Depth visualization with colormap
# Depth is automatically normalized and clamped to [0, 10] meters for visualization

```
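The comment about depth being normalized and clamped to [0, 10] meters can be illustrated with a single-value sketch. This is an assumption about what such a mapping might look like, not SHARP's implementation: `depth_to_gray` is a hypothetical helper, and it produces a gray level where the real visualization applies a colormap.

```python
def depth_to_gray(depth_m: float, max_depth_m: float = 10.0) -> int:
    """Clamp a depth in meters to [0, max_depth_m] and map it to a 0-255 gray level."""
    clamped = min(max(depth_m, 0.0), max_depth_m)
    return round(255 * clamped / max_depth_m)


# Values outside the range saturate at the ends of the scale.
for depth in (-3.0, 2.0, 10.0, 25.0):
    print(depth, depth_to_gray(depth))
```

Under this kind of mapping, everything beyond the upper bound looks identical, so distant structure is not distinguishable in the depth video.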

--------------------------------

### CLI - Predict 3D Gaussians from Images with SHARP

Source: https://context7.com/apple/ml-sharp/llms.txt

Command-line interface for predicting 3D Gaussian splat representations from input images using the SHARP system. Supports automatic model download, custom checkpoints, CUDA rendering, and verbose logging. Outputs standard .ply files compatible with public renderers.

```bash
# Basic prediction from single image or directory
sharp predict -i /path/to/input/images -o /path/to/output/gaussians

# Predict with custom checkpoint
sharp predict -i ./photos/image.jpg -o ./output -c sharp_2572gikvuh.pt

# Predict and render trajectory video (requires CUDA GPU)
sharp predict -i ./input -o ./output --render

# Predict on specific device with verbose logging
sharp predict -i ./input -o ./output --device cuda -v

# Expected output: Creates .ply files in output directory
# Example: input/photo.jpg -> output/photo.ply
# Each .ply file contains 3D Gaussians with position, color, opacity, scale, and rotation
```
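When the CLI is driven from a script, the invocations above can be assembled programmatically. The sketch below only builds the argument list from flags that appear in the examples (`-i`, `-o`, `-c`, `--render`, `--device`, `-v`); `build_predict_command` is a hypothetical helper, not part of SHARP.

```python
import shlex
from typing import Optional


def build_predict_command(
    input_path: str,
    output_dir: str,
    checkpoint: Optional[str] = None,
    render: bool = False,
    device: Optional[str] = None,
    verbose: bool = False,
) -> list[str]:
    """Assemble a `sharp predict` argument list from the flags shown above."""
    command = ["sharp", "predict", "-i", input_path, "-o", output_dir]
    if checkpoint is not None:
        command += ["-c", checkpoint]
    if render:
        command.append("--render")
    if device is not None:
        command += ["--device", device]
    if verbose:
        command.append("-v")
    return command


command = build_predict_command("./input", "./output", device="cuda", verbose=True)
print(shlex.join(command))
```

The resulting list can be passed to `subprocess.run(command, check=True)` once the CLI is installed.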

--------------------------------

### Python API: Render Scene with Camera Trajectories

Source: https://context7.com/apple/ml-sharp/llms.txt

Generates video renderings of 3D Gaussian splat scenes using customizable camera motion trajectories. It loads a scene, configures a camera trajectory, initializes a renderer and video writer, and then renders each frame. The output is a video file (e.g., output.mp4) and optionally a depth visualization.

```python
from sharp.utils import camera, gsplat, io
from sharp.utils.gaussians import load_ply, Gaussians3D
import torch
from pathlib import Path

# Load scene
gaussians, metadata = load_ply(Path("scene.ply"))
device = torch.device("cuda")
f_px = metadata.focal_length_px
width, height = metadata.resolution_px

# Configure trajectory
params = camera.TrajectoryParams(
    type="rotate_forward",  # Options: "swipe", "shake", "rotate", "rotate_forward"
    lookat_mode="point",     # Options: "point", "ahead"
    max_disparity=0.08,
    max_zoom=0.15,
    distance_m=0.0,
    num_steps=60,
    num_repeats=1
)

# Create camera model
intrinsics = torch.tensor([
    [f_px, 0, (width - 1) / 2.0, 0],
    [0, f_px, (height - 1) / 2.0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1]
], device=device, dtype=torch.float32)

camera_model = camera.create_camera_model(
    gaussians, intrinsics, resolution_px=(width, height), lookat_mode="point"
)

# Generate trajectory
trajectory = camera.create_eye_trajectory(
    gaussians, params, resolution_px=(width, height), f_px=f_px
)

# Initialize renderer and video writer
renderer = gsplat.GSplatRenderer(color_space=metadata.color_space)
video_writer = io.VideoWriter(Path("output.mp4"), fps=30.0, render_depth=True)

# Render each frame
for eye_position in trajectory:
    camera_info = camera_model.compute(eye_position)
    output = renderer(
        gaussians.to(device),
        extrinsics=camera_info.extrinsics[None].to(device),
        intrinsics=camera_info.intrinsics[None].to(device),
        image_width=camera_info.width,
        image_height=camera_info.height
    )
    color = (output.color[0].permute(1, 2, 0) * 255.0).to(dtype=torch.uint8)
    depth = output.depth[0]
    video_writer.add_frame(color, depth)

video_writer.close()
# Output: output.mp4 (color) and output.depth.mp4 (depth visualization)
```
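The 4x4 intrinsics matrix built in the example places the principal point at the image center, `((width - 1) / 2, (height - 1) / 2)`. The sketch below reproduces that layout with plain lists and shows its effect with a standard pinhole projection. `make_intrinsics` and `project` are hypothetical helpers for illustration, not SHARP functions.

```python
def make_intrinsics(f_px: float, width: int, height: int) -> list[list[float]]:
    """Build a 4x4 pinhole intrinsics matrix with the principal point at the image center."""
    cx = (width - 1) / 2.0
    cy = (height - 1) / 2.0
    return [
        [f_px, 0.0, cx, 0.0],
        [0.0, f_px, cy, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


def project(point_xyz: tuple[float, float, float], intrinsics: list[list[float]]) -> tuple[float, float]:
    """Project a camera-space point (z > 0) to pixel coordinates."""
    x, y, z = point_xyz
    u = intrinsics[0][0] * x / z + intrinsics[0][2]
    v = intrinsics[1][1] * y / z + intrinsics[1][2]
    return u, v


K = make_intrinsics(f_px=512.0, width=640, height=480)
# A point on the optical axis lands on the principal point.
print(project((0.0, 0.0, 2.0), K))
# A point offset by 1 unit in x at depth 2 shifts by f_px / 2 pixels.
print(project((1.0, 0.0, 2.0), K))
```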

--------------------------------

### Python API: Save and Load PLY Files with Metadata

Source: https://context7.com/apple/ml-sharp/llms.txt

Handles 3D Gaussian splat .ply file I/O, preserving scene metadata such as focal length, resolution, and color space. It takes a Gaussians3D object and optional scene metadata as input and outputs a .ply file. The loaded data includes Gaussians and metadata.

```python
from sharp.utils.gaussians import Gaussians3D, save_ply, load_ply, SceneMetaData
from pathlib import Path
import torch

# Create example Gaussians
num_gaussians = 10000
gaussians = Gaussians3D(
    mean_vectors=torch.randn(1, num_gaussians, 3),
    singular_values=torch.exp(torch.randn(1, num_gaussians, 3)),
    quaternions=torch.randn(1, num_gaussians, 4),
    colors=torch.rand(1, num_gaussians, 3),
    opacities=torch.sigmoid(torch.randn(1, num_gaussians))
)

# Save with camera metadata
f_px = 512.0
image_shape = (480, 640)  # (height, width)
save_ply(gaussians, f_px, image_shape, Path("scene.ply"))

# Load Gaussians and metadata
loaded_gaussians, metadata = load_ply(Path("scene.ply"))

# metadata contains:
# - focal_length_px: 512.0
# - resolution_px: (640, 480)  # Note: (width, height)
# - color_space: "linearRGB" or "sRGB"

print(f"Loaded {loaded_gaussians.mean_vectors.shape[1]} Gaussians")
print(f"Focal length: {metadata.focal_length_px}px")
print(f"Resolution: {metadata.resolution_px}")

# Move to device
device = torch.device("cuda")
gaussians_gpu = loaded_gaussians.to(device)
```

--------------------------------

### Python API - Load and Display RGB Images with SHARP utils

Source: https://context7.com/apple/ml-sharp/llms.txt

Loads an RGB image with `io.load_rgb` from SHARP's utilities. The function extracts EXIF data, rotates the image according to its orientation tag, and estimates the focal length from EXIF. It returns the image as a NumPy array, the ICC profile bytes, and the focal length in pixels.

```python
from sharp.utils import io
from pathlib import Path

# Load image with metadata extraction
image_path = Path("photo.jpg")
image, icc_profile, f_px = io.load_rgb(image_path)

# Parameters:
# - auto_rotate: Apply EXIF orientation (default: True)
# - remove_alpha: Strip alpha channel (default: True)

# Returns:
# - image: numpy array (H, W, 3) with RGB values 0-255
# - icc_profile: ICC color profile bytes or None
# - f_px: focal length in pixels, estimated from EXIF; falls back to a default 30mm focal length when EXIF provides none

print(f"Image shape: {image.shape}")
print(f"Focal length: {f_px:.2f}px")

# Example output:
# Image shape: (2048, 1536, 3)
# Focal length: 512.45px
```
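The "default 30mm" fallback implies a conversion from a focal length in millimeters to pixels. One common convention treats the value as a 35mm-equivalent focal length and scales it by the ratio of the image diagonal in pixels to the diagonal of a 36 x 24 mm frame. The sketch below assumes that convention; the conversion `load_rgb` actually uses may differ, and `focal_length_mm_to_px` is a hypothetical helper.

```python
import math


def focal_length_mm_to_px(width_px: int, height_px: int, f_35mm: float = 30.0) -> float:
    """Convert a 35mm-equivalent focal length to pixels via the image diagonal."""
    frame_diagonal_mm = math.hypot(36.0, 24.0)  # full-frame 36 x 24 mm reference
    image_diagonal_px = math.hypot(width_px, height_px)
    return f_35mm * image_diagonal_px / frame_diagonal_mm


print(f"{focal_length_mm_to_px(1536, 2048):.2f}px")
```

Because the result scales with the image diagonal, the same lens yields a proportionally larger `f_px` at higher resolutions.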

--------------------------------

### Python API - Predict 3D Gaussians with SHARP

Source: https://context7.com/apple/ml-sharp/llms.txt

Generates 3D Gaussian representations from an RGB image using the SHARP model. The example covers model initialization, checkpoint loading, image preprocessing, inference, and conversion to metric space, then saves the result to a standard .ply file.

```python
import torch
import torch.nn.functional as F
import numpy as np
from sharp.models import create_predictor, PredictorParams
from sharp.utils.gaussians import save_ply
from pathlib import Path

# Initialize model
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
predictor = create_predictor(PredictorParams())

# Load checkpoint
checkpoint_url = "https://ml-site.cdn-apple.com/models/sharp/sharp_2572gikvuh.pt"
state_dict = torch.hub.load_state_dict_from_url(checkpoint_url, progress=True)
predictor.load_state_dict(state_dict)
predictor.eval()
predictor.to(device)

# Load and preprocess image
image = np.random.randint(0, 256, (1024, 768, 3), dtype=np.uint8)  # Example; upper bound is exclusive
f_px = 512.0
height, width = image.shape[:2]

image_pt = torch.from_numpy(image).float().to(device).permute(2, 0, 1) / 255.0
disparity_factor = torch.tensor([f_px / width]).float().to(device)

# Resize to internal resolution
internal_shape = (1536, 1536)
image_resized = F.interpolate(
    image_pt[None], size=internal_shape, mode="bilinear", align_corners=True
)

# Run inference
with torch.no_grad():
    gaussians_ndc = predictor(image_resized, disparity_factor)

# Convert to metric space
from sharp.utils.gaussians import unproject_gaussians
intrinsics = torch.tensor([
    [f_px, 0, width / 2, 0],
    [0, f_px, height / 2, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1]
], device=device, dtype=torch.float32)

gaussians = unproject_gaussians(
    gaussians_ndc, torch.eye(4).to(device), intrinsics, internal_shape
)

# Save to PLY format
save_ply(gaussians, f_px, (height, width), Path("output.ply"))

# gaussians is a Gaussians3D containing:
# - mean_vectors: (1, N, 3) 3D positions
# - singular_values: (1, N, 3) scale parameters
# - quaternions: (1, N, 4) rotations
# - colors: (1, N, 3) colors
# - opacities: (1, N) opacities

```
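The example builds `intrinsics` from the original `width` and `height` but passes `internal_shape` to `unproject_gaussians`. Whether that function expects intrinsics at the original or at the internal resolution is not stated here. If it expects them at the internal resolution, they would need rescaling along the lines of the sketch below, which applies the standard rule that resizing an image scales the first intrinsics row by the width ratio and the second by the height ratio. `scale_intrinsics` is a hypothetical helper operating on plain lists, not a SHARP function.

```python
def scale_intrinsics(
    intrinsics: list[list[float]],
    original_size: tuple[int, int],
    new_size: tuple[int, int],
) -> list[list[float]]:
    """Rescale pinhole intrinsics for an image resized from original_size to new_size.

    Both sizes are given as (width, height).
    """
    (w0, h0), (w1, h1) = original_size, new_size
    sx, sy = w1 / w0, h1 / h0
    scaled = [row[:] for row in intrinsics]  # copy so the input is left untouched
    scaled[0] = [value * sx for value in scaled[0]]  # f_x and c_x scale with width
    scaled[1] = [value * sy for value in scaled[1]]  # f_y and c_y scale with height
    return scaled


f_px, width, height = 512.0, 768, 1024
intrinsics = [
    [f_px, 0.0, width / 2, 0.0],
    [0.0, f_px, height / 2, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
resized = scale_intrinsics(intrinsics, (width, height), (1536, 1536))
print(resized[0])
print(resized[1])
```

Note that a non-uniform resize, such as 768 x 1024 to 1536 x 1536, produces different focal lengths along x and y.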

--------------------------------

### Python API: Apply Affine Transformations to Gaussians

Source: https://context7.com/apple/ml-sharp/llms.txt

Applies affine transformations to 3D Gaussian splat parameters including mean, scale, and rotation. It takes a Gaussians3D object and a 3x4 affine transform matrix as input. This function is not differentiable due to its use of SVD for covariance transformation. Colors and opacities are preserved.

```python
from sharp.utils.gaussians import Gaussians3D, apply_transform, unproject_gaussians
import torch

# Create example Gaussians
gaussians = Gaussians3D(
    mean_vectors=torch.randn(1, 1000, 3),
    singular_values=torch.exp(torch.randn(1, 1000, 3)),
    quaternions=torch.randn(1, 1000, 4),
    colors=torch.rand(1, 1000, 3),
    opacities=torch.sigmoid(torch.randn(1, 1000))
)

# Define affine transform (3x4 matrix: rotation + translation)
transform = torch.tensor([
    [1.0, 0.0, 0.0, 2.0],  # Translate +2 in x
    [0.0, 0.707, -0.707, 0.0],  # Rotate 45° around x-axis
    [0.0, 0.707, 0.707, 1.0]   # Translate +1 in z
], dtype=torch.float32)

# Apply transformation
transformed = apply_transform(gaussians, transform)

# Unproject from NDC to world coordinates
extrinsics = torch.eye(4)
intrinsics = torch.tensor([
    [512, 0, 384, 0],
    [0, 512, 256, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1]
], dtype=torch.float32)
image_shape = (768, 512)  # (width, height)

world_gaussians = unproject_gaussians(
    gaussians, extrinsics, intrinsics, image_shape
)

# Note: apply_transform is NOT differentiable (uses SVD for covariance)
# Transforms affect position, scale, and rotation but preserve colors and opacity
```
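The mathematics behind transforming a single Gaussian can be sketched without any library. For an affine map with linear part `A` and translation `t`, the mean becomes `A @ mean + t` and the covariance becomes `A @ cov @ A.T`. This is the textbook rule, not SHARP's code. Splitting the new covariance back into scales and a rotation requires a matrix decomposition, which is presumably where the SVD mentioned above comes in. All names below are hypothetical.

```python
import math

Matrix = list[list[float]]


def mat_mul(a: Matrix, b: Matrix) -> Matrix:
    """Multiply two matrices given as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(m: Matrix) -> Matrix:
    return [list(column) for column in zip(*m)]


def transform_gaussian(mean: list[float], covariance: Matrix, affine_3x4: Matrix):
    """Apply a 3x4 affine transform to one Gaussian's mean and covariance."""
    linear = [row[:3] for row in affine_3x4]
    translation = [row[3] for row in affine_3x4]
    new_mean = [
        sum(l * m for l, m in zip(row, mean)) + t for row, t in zip(linear, translation)
    ]
    new_covariance = mat_mul(mat_mul(linear, covariance), transpose(linear))
    return new_mean, new_covariance


c = math.sqrt(0.5)  # cos(45 deg) = sin(45 deg)
affine = [
    [1.0, 0.0, 0.0, 2.0],
    [0.0, c, -c, 0.0],
    [0.0, c, c, 1.0],
]
mean = [0.0, 1.0, 0.0]
covariance = [[1.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 9.0]]  # axis-aligned Gaussian
new_mean, new_covariance = transform_gaussian(mean, covariance, affine)
print(new_mean)
print(new_covariance)
```

A pure rotation preserves the trace of the covariance, so the sum of its diagonal entries stays the same here while off-diagonal terms appear.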
