### Setup Python Environment for SAM 3D Objects

Source: https://github.com/facebookresearch/sam-3d-objects/blob/main/doc/setup.md

Installs the default Python environment for SAM 3D Objects using mamba and pip. Requires a 64-bit Linux system with a compatible NVIDIA GPU. `PIP_EXTRA_INDEX_URL` points pip at the NVIDIA and PyTorch CUDA 12.1 package indexes, and `PIP_FIND_LINKS` adds the Kaolin wheel index needed before installing the inference extras.

```bash
mamba env create -f environments/default.yml
mamba activate sam3d-objects

# for pytorch/cuda dependencies
export PIP_EXTRA_INDEX_URL="https://pypi.ngc.nvidia.com https://download.pytorch.org/whl/cu121"

# install sam3d-objects and core dependencies
pip install -e '.[dev]'
pip install -e '.[p3d]' # pytorch3d dependency on pytorch is broken, this 2-step approach solves it

# for inference
export PIP_FIND_LINKS="https://nvidia-kaolin.s3.us-east-2.amazonaws.com/torch-2.5.1_cu121.html"
pip install -e '.[inference]'

# patch things that aren't yet in official pip packages
./patching/hydra # https://github.com/facebookresearch/hydra/pull/2863
```
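
As a quick sanity check before running the `pip install` steps, the sketch below reports which of the two pip variables exported above are unset in the current shell. The helper name `missing_pip_vars` is hypothetical and not part of the repository.

```python
import os

# Variables exported in the setup block above.
REQUIRED_PIP_VARS = ("PIP_EXTRA_INDEX_URL", "PIP_FIND_LINKS")


def missing_pip_vars(env=None):
    """Return the required pip variables that are unset or empty (hypothetical helper)."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_PIP_VARS if not env.get(name)]


missing = missing_pip_vars()
print("Not set: " + ", ".join(missing) if missing else "All pip variables are set.")
```

Because `export` only affects the current shell session, running this in a fresh terminal is expected to list both variables as unset.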

--------------------------------

### Download SAM 3D Objects Checkpoints from HuggingFace

Source: https://github.com/facebookresearch/sam-3d-objects/blob/main/doc/setup.md

Downloads pre-trained SAM 3D Objects checkpoints from HuggingFace. Requires authentication with HuggingFace Hub and installation of the `huggingface-hub` CLI. Ensure you have requested access to the model on HuggingFace.

```bash
pip install 'huggingface-hub[cli]<1.0'

TAG=hf
hf download \
  --repo-type model \
  --local-dir checkpoints/${TAG}-download \
  --max-workers 1 \
  facebook/sam-3d-objects
mv checkpoints/${TAG}-download/checkpoints checkpoints/${TAG}
rm -rf checkpoints/${TAG}-download
```
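
After the download, the inference examples expect the config at `checkpoints/<tag>/pipeline.yaml`. A minimal sketch for checking that layout before loading the model follows; the helper name `pipeline_config` is hypothetical, and only the path pattern comes from the examples in this document.

```python
from pathlib import Path


def pipeline_config(root="checkpoints", tag="hf"):
    """Return the expected pipeline.yaml path, or raise if it is missing (hypothetical helper)."""
    config = Path(root) / tag / "pipeline.yaml"
    if not config.is_file():
        raise FileNotFoundError(f"Expected checkpoint config at {config}")
    return config
```

Calling `pipeline_config()` from the repository root fails early with a readable error if the `mv` step was skipped or the download was incomplete.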

--------------------------------

### Install SAM 3D Objects Dependencies

Source: https://context7.com/facebookresearch/sam-3d-objects/llms.txt

Installs core and optional dependencies for SAM 3D Objects, including PyTorch3D and the inference extras. Set the PyTorch/CUDA package index (`PIP_EXTRA_INDEX_URL`) before running the `pip install` steps.

```bash
mamba env create -f environments/default.yml
mamba activate sam3d-objects

export PIP_EXTRA_INDEX_URL="https://pypi.ngc.nvidia.com https://download.pytorch.org/whl/cu121"

pip install -e '.[dev]'
pip install -e '.[p3d]'  # PyTorch3D

export PIP_FIND_LINKS="https://nvidia-kaolin.s3.us-east-2.amazonaws.com/torch-2.5.1_cu121.html"
pip install -e '.[inference]'

./patching/hydra
```

--------------------------------

### Import necessary libraries

Source: https://github.com/facebookresearch/sam-3d-objects/blob/main/notebook/demo_single_object.ipynb

Imports required modules for inference, image handling, and display. Ensure these libraries are installed.

```python
import os
import imageio
import uuid
from IPython.display import Image as ImageDisplay
from inference import Inference, ready_gaussian_for_video_rendering, render_video, load_image, load_single_mask, display_image, make_scene, interactive_visualizer
```

--------------------------------

### Import necessary libraries

Source: https://github.com/facebookresearch/sam-3d-objects/blob/main/notebook/demo_multi_object.ipynb

Imports all required libraries for the SAM-3D object processing pipeline. Ensure these are installed in your environment.

```python
import os
import uuid
import imageio
import numpy as np
from IPython.display import Image as ImageDisplay

from inference import Inference, ready_gaussian_for_video_rendering, load_image, load_masks, display_image, make_scene, render_video, interactive_visualizer
```

--------------------------------

### Single Object 3D Generation with SAM 3D

Source: https://github.com/facebookresearch/sam-3d-objects/blob/main/README.md

Generates a 3D reconstruction of a single object with the SAM 3D Objects model: load the model, an image, and a mask, run inference, and export the result as a Gaussian splatting PLY file. Complete the setup steps before running it.

```python
import sys

# import inference code
sys.path.append("notebook")
from inference import Inference, load_image, load_single_mask

# load model
tag = "hf"
config_path = f"checkpoints/{tag}/pipeline.yaml"
inference = Inference(config_path, compile=False)

# load image and mask
image = load_image("notebook/images/shutterstock_stylish_kidsroom_1640806567/image.png")
mask = load_single_mask("notebook/images/shutterstock_stylish_kidsroom_1640806567", index=14)

# run model
output = inference(image, mask, seed=42)

# export gaussian splat
output["gs"].save_ply("splat.ply")
```

--------------------------------

### Initialize Environment and Paths for Mesh Alignment

Source: https://github.com/facebookresearch/sam-3d-objects/blob/main/notebook/demo_3db_mesh_alignment.ipynb

Sets up the environment by defining device, current working directory, and input/output paths for mesh alignment. It also creates the output directory if it doesn't exist.

```python
import os
import torch
import matplotlib.pyplot as plt
from PIL import Image
from mesh_alignment import process_and_save_alignment

device = 'cuda' if torch.cuda.is_available() else 'cpu'
print(f"Using device: {device}")
PATH = os.getcwd()
print(f"Current working directory: {PATH}")

# Run inference with the SAM 3D Body (3DB) repo (https://github.com/facebookresearch/sam-3d-body) first to get the 3DB results
image_path = f"{PATH}/images/human_object/image.png"
mask_path = f"{PATH}/meshes/human_object/3DB_results/mask_human.png"
mesh_path = f"{PATH}/meshes/human_object/3DB_results/human.ply"
focal_length_json_path = f"{PATH}/meshes/human_object/3DB_results/focal_length.json"
output_dir = f"{PATH}/meshes/human_object/aligned_meshes"
os.makedirs(output_dir, exist_ok=True)
```

--------------------------------

### Initialize Inference Pipeline

Source: https://github.com/facebookresearch/sam-3d-objects/blob/main/notebook/demo_multi_object.ipynb

Sets up the inference pipeline by specifying the configuration path and loading the model. Set `compile=True` for potentially faster inference after the first run.

```python
PATH = os.getcwd()
TAG = "hf"
config_path = f"{PATH}/../checkpoints/{TAG}/pipeline.yaml"
inference = Inference(config_path, compile=False)
```

--------------------------------

### Interactive Visualization

Source: https://github.com/facebookresearch/sam-3d-objects/blob/main/notebook/demo_multi_object.ipynb

Launches an interactive visualizer for the generated Gaussian splatting point cloud. This may take some time to load.

```python
# might take a while to load (black screen)
interactive_visualizer(f"{PATH}/gaussians/multi/{IMAGE_NAME}.ply")
```

--------------------------------

### Visualize Gaussian Splat Interactively

Source: https://github.com/facebookresearch/sam-3d-objects/blob/main/notebook/demo_single_object.ipynb

Launches an interactive visualizer for the generated Gaussian Splat PLY file. This may take time to load.

```python
# might take a while to load (black screen)
interactive_visualizer(f"{PATH}/gaussians/single/{IMAGE_NAME}.ply")
```

--------------------------------

### Visualize Scene and Render Video

Source: https://github.com/facebookresearch/sam-3d-objects/blob/main/notebook/demo_multi_object.ipynb

Creates a scene from the generated Gaussian splats, saves it as a PLY file, prepares it for video rendering, and then renders an animated GIF. The GIF is saved and displayed in the notebook.

```python
scene_gs = make_scene(*outputs)
# export posed gaussian splatting (as point cloud)
scene_gs.save_ply(f"{PATH}/gaussians/{IMAGE_NAME}_posed.ply")

scene_gs = ready_gaussian_for_video_rendering(scene_gs)
# export gaussian splatting (as point cloud)
scene_gs.save_ply(f"{PATH}/gaussians/multi/{IMAGE_NAME}.ply")

video = render_video(
    scene_gs,
    r=1,
    fov=60,
    resolution=512,
)["color"]

# save video as gif
imageio.mimsave(
    os.path.join(f"{PATH}/gaussians/multi/{IMAGE_NAME}.gif"),
    video,
    format="GIF",
    duration=1000 / 30,  # default assuming 30fps from the input MP4
    loop=0,  # 0 means loop indefinitely
)

# notebook display
ImageDisplay(url=f"gaussians/multi/{IMAGE_NAME}.gif?cache_invalidator={uuid.uuid4()}",)
```
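
The `duration=1000 / 30` argument is a per-frame display time, written here in milliseconds for 30 fps. The sketch below makes that arithmetic explicit; the helper names are hypothetical, and whether `imageio` interprets the value as milliseconds depends on the installed version and plugin, so treat the unit as an assumption carried over from the snippet.

```python
def frame_duration_ms(fps):
    """Per-frame display time in milliseconds for a given frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000 / fps


def playback_seconds(num_frames, fps):
    """Total playback time of one loop, in seconds."""
    return num_frames * frame_duration_ms(fps) / 1000


print(round(frame_duration_ms(30), 2))  # 33.33
```

For a hypothetical 300-frame clip, `playback_seconds(300, 30)` comes out at roughly ten seconds per loop.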

--------------------------------

### Launch Interactive 3D Viewer for Gaussian Splats

Source: https://context7.com/facebookresearch/sam-3d-objects/llms.txt

Launches a web-based 3D viewer using Gradio, allowing interactive exploration of the reconstructed 3D model. Saves the Gaussian Splat to a PLY file before launching the viewer.

```python
import sys
sys.path.append("notebook")
from inference import Inference, load_image, load_single_mask, interactive_visualizer

# Run inference and save PLY
config_path = "checkpoints/hf/pipeline.yaml"
inference = Inference(config_path, compile=False)

image = load_image("notebook/images/shutterstock_stylish_kidsroom_1640806567/image.png")
mask = load_single_mask("notebook/images/shutterstock_stylish_kidsroom_1640806567", index=14)
output = inference(image, mask, seed=42)

# Save Gaussian Splat
ply_path = "interactive_model.ply"
output["gs"].save_ply(ply_path)

# Launch interactive viewer (opens browser with shareable link)
interactive_visualizer(ply_path)
# Output: Running on public URL: https://xxxxx.gradio.live
```

--------------------------------

### Load Gaussian Model from PLY

Source: https://context7.com/facebookresearch/sam-3d-objects/llms.txt

Initializes a Gaussian model and loads data from a PLY file. Requires specifying bounding box, spherical harmonics degree, minimum kernel size, and device.

```python
from sam3d_objects.model.backbone.tdfy_dit.representations.gaussian.gaussian_model import Gaussian

gaussian = Gaussian(
    aabb=[-0.5, -0.5, -0.5, 1.0, 1.0, 1.0],  # Bounding box
    sh_degree=0,                               # Spherical harmonics degree
    mininum_kernel_size=0.0,                   # Minimum Gaussian size
    device="cuda"
)
gaussian.load_ply("object.ply")
print(f"Loaded {gaussian.get_xyz.shape[0]} Gaussians")
```

--------------------------------

### Image and Mask Loading Utilities

Source: https://context7.com/facebookresearch/sam-3d-objects/llms.txt

Provides helper functions for loading images as numpy arrays and masks from various sources, including single masks by index and multiple masks by a list of indices. Includes a function to display images with mask overlays, suitable for Jupyter notebooks.

```python
import sys
sys.path.append("notebook")
from inference import load_image, load_mask, load_single_mask, load_masks, display_image

# Load an RGB image as numpy array (H, W, 3) uint8
image = load_image("path/to/image.png")
print(f"Image shape: {image.shape}, dtype: {image.dtype}")

# Load a single mask as boolean array (H, W)
mask = load_mask("path/to/mask.png")
print(f"Mask shape: {mask.shape}, max value: {mask.max()}")

# Load a specific mask by index from a folder containing 0.png, 1.png, etc.
mask_14 = load_single_mask("path/to/masks_folder", index=14)

# Load multiple masks by indices
masks = load_masks("path/to/masks_folder", indices_list=[0, 2, 5, 14])
print(f"Loaded {len(masks)} masks")

# Load all masks in a folder (auto-detects numbered files)
all_masks = load_masks("path/to/masks_folder")

# Display image with mask overlay (for Jupyter notebooks)
display_image(image, masks=[mask_14])
```
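
The mask folder convention above (`0.png`, `1.png`, and so on) only lines up with `index=` values if files are ordered numerically rather than alphabetically, since a plain string sort puts `10.png` before `2.png`. The following sketch shows one way to list such files in index order. It is a hypothetical helper for illustration, not the implementation used by `load_masks`.

```python
from pathlib import Path


def numbered_mask_paths(folder, extension=".png"):
    """Return files named <integer><extension>, sorted by integer value (hypothetical helper)."""
    candidates = [
        p for p in Path(folder).iterdir()
        if p.suffix == extension and p.stem.isdecimal()
    ]
    return sorted(candidates, key=lambda p: int(p.stem))
```

Files such as `image.png` are skipped because their stem is not a number, which keeps the input image out of the mask list when both live in the same folder.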

--------------------------------

### Render Turntable Video of Gaussian Splats

Source: https://context7.com/facebookresearch/sam-3d-objects/llms.txt

Generates frames by orbiting the camera around the object and saves the output as a GIF. Requires `imageio` and specific inference utilities. Normalizes the Gaussian Splat for consistent visualization.

```python
import sys
import imageio
sys.path.append("notebook")
from inference import (
    Inference, load_image, load_single_mask, make_scene,
    ready_gaussian_for_video_rendering, render_video
)

# Initialize and run inference
config_path = "checkpoints/hf/pipeline.yaml"
inference = Inference(config_path, compile=False)

image = load_image("notebook/images/shutterstock_stylish_kidsroom_1640806567/image.png")
mask = load_single_mask("notebook/images/shutterstock_stylish_kidsroom_1640806567", index=14)
output = inference(image, mask, seed=42)

# Prepare Gaussian for rendering
scene_gs = make_scene(output)
scene_gs = ready_gaussian_for_video_rendering(scene_gs)

# Render turntable video
video_frames = render_video(
    scene_gs,
    resolution=512,          # Output resolution
    bg_color=(0, 0, 0),      # Black background
    num_frames=300,          # Number of frames for full rotation
    r=1.0,                   # Camera distance
    fov=60,                  # Field of view in degrees
    pitch_deg=15,            # Camera pitch angle
    yaw_start_deg=-45,       # Starting yaw angle
)["color"]

# Save as GIF
imageio.mimsave(
    "reconstruction.gif",
    video_frames,
    format="GIF",
    duration=1000 / 30,  # 30 fps
    loop=0,              # Loop indefinitely
)
print("Video saved to reconstruction.gif")
```
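
To give a feel for what the `r`, `pitch_deg`, `yaw_start_deg`, and `num_frames` parameters describe, here is a minimal sketch of a turntable orbit. It assumes a z-up world, a camera that stays at distance `r` from the origin, and yaw angles spread evenly over one full turn. These are illustrative assumptions; the actual camera model inside `render_video` may differ.

```python
import math


def orbit_camera_positions(num_frames, r=1.0, pitch_deg=15.0, yaw_start_deg=-45.0):
    """Camera positions on a circle of constant pitch around the origin (assumes z up)."""
    pitch = math.radians(pitch_deg)
    positions = []
    for i in range(num_frames):
        # Assumption: yaw advances in equal steps over 360 degrees.
        yaw = math.radians(yaw_start_deg + 360.0 * i / num_frames)
        positions.append((
            r * math.cos(pitch) * math.cos(yaw),
            r * math.cos(pitch) * math.sin(yaw),
            r * math.sin(pitch),
        ))
    return positions
```

Under these assumptions every camera position lies at distance `r` from the origin and shares the same height `r * sin(pitch)`, which is what produces the steady turntable look.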

--------------------------------

### Gaussian Model Representation and Saving

Source: https://context7.com/facebookresearch/sam-3d-objects/llms.txt

Demonstrates how to access core attributes of the Gaussian model (position, scale, rotation, opacity, features) and save the model to a PLY file. The Gaussian model is typically obtained from the inference output.

```python
# The Gaussian model is returned by inference as output["gs"]
# Key properties and methods:

# Access Gaussian attributes
xyz = output["gs"].get_xyz          # (N, 3) tensor of Gaussian positions
scaling = output["gs"].get_scaling  # (N, 3) tensor of Gaussian scales
rotation = output["gs"].get_rotation # (N, 4) tensor of quaternion rotations
opacity = output["gs"].get_opacity  # (N, 1) tensor of opacity values
features = output["gs"].get_features # (N, 1, 3) tensor of color features (DC)

# Save to PLY file (standard Gaussian Splatting format)
output["gs"].save_ply("object.ply")
```

--------------------------------

### Render Gaussian Splat to Animated GIF

Source: https://github.com/facebookresearch/sam-3d-objects/blob/main/notebook/demo_single_object.ipynb

Renders the Gaussian Splat into a video sequence and saves it as an animated GIF. The `duration` and `loop` parameters can be adjusted for desired playback.

```python
# render gaussian splat
scene_gs = make_scene(output)
scene_gs = ready_gaussian_for_video_rendering(scene_gs)

video = render_video(
    scene_gs,
    r=1,
    fov=60,
    pitch_deg=15,
    yaw_start_deg=-45,
    resolution=512,
)["color"]

# save video as gif
imageio.mimsave(
    os.path.join(f"{PATH}/gaussians/single/{IMAGE_NAME}.gif"),
    video,
    format="GIF",
    duration=1000 / 30,  # default assuming 30fps from the input MP4
    loop=0,  # 0 means loop indefinitely
)

# notebook display
ImageDisplay(url=f"gaussians/single/{IMAGE_NAME}.gif?cache_invalidator={uuid.uuid4()}")
```

--------------------------------

### Create Video from Scene Visualization

Source: https://context7.com/facebookresearch/sam-3d-objects/llms.txt

Generates a video file from a Plotly scene visualization figure. This is useful for creating animations of the reconstructed scene.

```python
import torch
from sam3d_objects.utils.visualization import SceneVisualizer

# Create a video from the figure.
# `fig` is an existing Plotly figure, e.g. one returned by SceneVisualizer.plot_scene(...)
SceneVisualizer.make_video_from_fig(fig, "scene_rotation.mp4")
```

--------------------------------

### Load and Display Input Image and Mask

Source: https://github.com/facebookresearch/sam-3d-objects/blob/main/notebook/demo_3db_mesh_alignment.ipynb

Loads the input image and mask using PIL and displays them side-by-side using Matplotlib. The mask is converted to grayscale.

```python
input_image = Image.open(image_path)
mask = Image.open(mask_path).convert('L')
fig, axes = plt.subplots(1, 2, figsize=(10, 5))
axes[0].imshow(input_image)
axes[0].set_title('Input Image')
axes[0].axis('off')
axes[1].imshow(mask, cmap='gray')
axes[1].set_title('Mask')
axes[1].axis('off')
plt.tight_layout()
plt.show()
```

--------------------------------

### Load Input Image and Mask

Source: https://github.com/facebookresearch/sam-3d-objects/blob/main/notebook/demo_single_object.ipynb

Loads a single image and its corresponding mask for 3D object generation. The `display_image` function is used for a quick preview.

```python
IMAGE_PATH = f"{PATH}/images/shutterstock_stylish_kidsroom_1640806567/image.png"
IMAGE_NAME = os.path.basename(os.path.dirname(IMAGE_PATH))

image = load_image(IMAGE_PATH)
mask = load_single_mask(os.path.dirname(IMAGE_PATH), index=14)
display_image(image, masks=[mask])
```
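
Note that `IMAGE_NAME` is taken from the folder that contains `image.png`, not from the file name itself. A short standalone illustration, using the repository-relative path from the README example:

```python
import os

IMAGE_PATH = "notebook/images/shutterstock_stylish_kidsroom_1640806567/image.png"

# dirname() drops "image.png"; basename() then keeps only the last folder component.
IMAGE_NAME = os.path.basename(os.path.dirname(IMAGE_PATH))
print(IMAGE_NAME)  # shutterstock_stylish_kidsroom_1640806567
```

This is why the exported files in the notebooks are named after the image folder.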

--------------------------------

### Transform SAM 3D Objects to OpenGL Coordinate System

Source: https://github.com/facebookresearch/sam-3d-objects/blob/main/notebook/demo_3db_mesh_alignment.ipynb

Loads a posed SAM 3D Object from a PLY file, transforms its points to the OpenGL coordinate system by flipping the x and z axes, and saves the transformed mesh.

```python
# First run SAM 3D Objects inference with https://github.com/facebookresearch/sam-3d-objects/blob/main/notebook/demo_multi_object.ipynb
# That notebook applies the generated layout to the generated objects and saves them as PLY.
# This cell then loads the posed SAM 3D Objects and transforms them into the OpenGL coordinate system, which is the system SAM 3D Body uses.
import numpy as np
import open3d as o3d

# Load PLY file
input_path = 'gaussians/human_object_posed.ply'
output_path = 'meshes/human_object/3Dfy_results/0.ply'
mesh = o3d.io.read_point_cloud(input_path)
points = np.asarray(mesh.points)

# Transform to OpenGL coordinate system. 
points[:, [0, 2]] *= -1  # flip x and z
mesh.points = o3d.utility.Vector3dVector(points)
o3d.io.write_point_cloud(output_path, mesh)
```
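
Negating x and z together is equivalent to a 180 degree rotation about the y axis, so the transform changes orientation without mirroring the geometry. The NumPy-only sketch below checks this on a couple of made-up points; `flip_xz` is a hypothetical helper that mirrors the in-place operation in the cell above.

```python
import numpy as np

# Flipping x and z corresponds to this matrix: a 180 degree rotation about y.
FLIP_XZ = np.diag([-1.0, 1.0, -1.0])


def flip_xz(points):
    """Return a copy of an (N, 3) array with x and z negated."""
    flipped = np.array(points, dtype=float, copy=True)
    flipped[:, [0, 2]] *= -1
    return flipped


points = np.array([[1.0, 2.0, 3.0], [-0.5, 0.0, 4.0]])  # illustrative values
assert np.allclose(flip_xz(points), points @ FLIP_XZ.T)
assert np.isclose(np.linalg.det(FLIP_XZ), 1.0)        # proper rotation, not a reflection
assert np.allclose(flip_xz(flip_xz(points)), points)  # applying it twice restores the input
```

Because the determinant is +1, the handedness of the coordinate system is preserved.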

--------------------------------

### Generate Gaussian Splat

Source: https://github.com/facebookresearch/sam-3d-objects/blob/main/notebook/demo_single_object.ipynb

Runs the inference model on the loaded image and mask to generate a Gaussian Splat representation. The output is then exported as a PLY point cloud file.

```python
# run model
output = inference(image, mask, seed=42)

# export gaussian splat (as point cloud)
output["gs"].save_ply(f"{PATH}/gaussians/single/{IMAGE_NAME}.ply")
```

--------------------------------

### Multi-Object Scene Reconstruction with SAM 3D Objects

Source: https://context7.com/facebookresearch/sam-3d-objects/llms.txt

Reconstructs multiple objects from a single image and combines them into a unified scene with automatic layout optimization. Objects are transformed into world coordinates using predicted poses.

```python
import sys
sys.path.append("notebook")
from inference import Inference, load_image, load_masks, make_scene

# Initialize inference
config_path = "checkpoints/hf/pipeline.yaml"
inference = Inference(config_path, compile=False)

# Load image and all masks (numbered 0.png, 1.png, etc.)
image = load_image("notebook/images/shutterstock_stylish_kidsroom_1640806567/image.png")
masks = load_masks("notebook/images/shutterstock_stylish_kidsroom_1640806567", extension=".png")

# Process each masked object
outputs = [inference(image, mask, seed=42) for mask in masks]

# Combine all objects into a unified scene
# This transforms each object to world coordinates using predicted poses
scene_gs = make_scene(*outputs)

# Save combined scene as PLY
scene_gs.save_ply("scene_combined.ply")

# Access individual object data
for i, output in enumerate(outputs):
    print(f"Object {i}: rotation={output['rotation'].shape}, "
          f"translation={output['translation'].shape}, "
          f"scale={output['scale'].shape}")
```
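
To make "transforms each object to world coordinates using predicted poses" more concrete, the sketch below applies a scale, a quaternion rotation, and a translation to a single point. The `(w, x, y, z)` quaternion order and the scale, then rotate, then translate ordering are assumptions made for illustration; check the repository for the conventions `make_scene` actually uses. Both function names are hypothetical.

```python
def quat_rotate(q, v):
    """Rotate 3-vector v by unit quaternion q, assumed to be in (w, x, y, z) order."""
    w, x, y, z = q
    vx, vy, vz = v
    # t = 2 * cross(q_vec, v)
    tx = 2.0 * (y * vz - z * vy)
    ty = 2.0 * (z * vx - x * vz)
    tz = 2.0 * (x * vy - y * vx)
    # v' = v + w * t + cross(q_vec, t)
    return (
        vx + w * tx + (y * tz - z * ty),
        vy + w * ty + (z * tx - x * tz),
        vz + w * tz + (x * ty - y * tx),
    )


def local_to_world(point, scale, rotation, translation):
    """Assumed order: per-axis scale, then rotation, then translation."""
    scaled = tuple(p * s for p, s in zip(point, scale))
    rotated = quat_rotate(rotation, scaled)
    return tuple(r + t for r, t in zip(rotated, translation))
```

With an identity quaternion `(1, 0, 0, 0)`, unit scale, and zero translation, a point is returned unchanged, which is a convenient first check when debugging pose conventions.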

--------------------------------

### Advanced Inference Pipeline Run

Source: https://context7.com/facebookresearch/sam-3d-objects/llms.txt

Executes the inference pipeline with fine-grained control over stages and post-processing options. Allows customization of diffusion steps, mesh post-processing, texture baking, layout optimization, and output formats.

```python
# Access via the internal pipeline (advanced usage)
# The Inference class wraps this with simplified defaults

output = inference._pipeline.run(
    image=rgba_image,                    # (H, W, 4) numpy array with mask in alpha
    mask=None,                           # Mask is embedded in alpha channel
    seed=42,                             # Random seed for reproducibility
    stage1_only=False,                   # Stop after sparse structure (faster)
    with_mesh_postprocess=True,          # Apply mesh post-processing
    with_texture_baking=True,            # Bake textures onto mesh
    with_layout_postprocess=False,       # Run layout optimization
    use_vertex_color=True,               # Use vertex colors instead of textures
    stage1_inference_steps=None,         # Custom diffusion steps for stage 1
    stage2_inference_steps=None,         # Custom diffusion steps for stage 2
    pointmap=None,                       # Optional custom pointmap
    decode_formats=None,                 # Output formats: ["gaussian", "mesh"]
    estimate_plane=False,                # Estimate ground plane instead of object
)
```
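
The call above expects `rgba_image`, which the snippet does not define. One way to build it from an RGB image and a boolean mask is sketched below, assuming the alpha channel uses 255 for the object and 0 for the background. That encoding is an assumption, and `to_rgba` is a hypothetical helper rather than a function from the repository.

```python
import numpy as np


def to_rgba(image, mask):
    """Stack an (H, W, 3) uint8 image and an (H, W) boolean mask into (H, W, 4) uint8."""
    if image.shape[:2] != mask.shape:
        raise ValueError("image and mask must share height and width")
    # Assumption: alpha 255 marks the object, 0 marks the background.
    alpha = mask.astype(np.uint8) * 255
    return np.concatenate([image[..., :3], alpha[..., None]], axis=-1)
```

The result can then be passed as `image=rgba_image` with `mask=None`, matching the argument comments in the call above.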

--------------------------------

### Interactive 3D Visualization of Aligned Meshes

Source: https://github.com/facebookresearch/sam-3d-objects/blob/main/notebook/demo_3db_mesh_alignment.ipynb

Visualizes the aligned 3DB mesh and the 3Dfy mesh interactively using the `visualize_meshes_interactive` function. It generates a combined GLB file for sharing.

```python
from mesh_alignment import visualize_meshes_interactive

aligned_mesh_path = f"{PATH}/meshes/human_object/aligned_meshes/human_aligned.ply"
dfy_mesh_path = f"{PATH}/meshes/human_object/3Dfy_results/0.ply"

demo, combined_glb_path = visualize_meshes_interactive(
    aligned_mesh_path=aligned_mesh_path,
    dfy_mesh_path=dfy_mesh_path,
    share=True
)
```

--------------------------------

### Run Stage 1 Inference

Source: https://context7.com/facebookresearch/sam-3d-objects/llms.txt

Performs stage 1 inference for faster processing without detailed geometry. Requires an RGBA image and an optional mask. Returns voxel coordinates, rotation, translation, scale, and pointmap.

```python
output_stage1 = inference._pipeline.run(
    image=rgba_image,
    mask=None,
    seed=42,
    stage1_only=True,
)
# Returns: voxel coordinates, rotation, translation, scale, pointmap
print(f"Voxel grid shape: {output_stage1['voxel'].shape}")
```

--------------------------------

### Load Image and Masks

Source: https://github.com/facebookresearch/sam-3d-objects/blob/main/notebook/demo_multi_object.ipynb

Loads the input image and corresponding segmentation masks. The `display_image` function is used to visualize the loaded image with its masks, aiding in verification.

```python
IMAGE_PATH = f"{PATH}/images/shutterstock_stylish_kidsroom_1640806567/image.png"
IMAGE_NAME = os.path.basename(os.path.dirname(IMAGE_PATH))

image = load_image(IMAGE_PATH)
masks = load_masks(os.path.dirname(IMAGE_PATH), extension=".png")
display_image(image, masks)
```

--------------------------------

### Align Human Meshes with SAM 3D Objects

Source: https://context7.com/facebookresearch/sam-3d-objects/llms.txt

Aligns human meshes from SAM 3D Body to SAM 3D Objects scene scale using MoGe depth estimation. Requires paths to the mesh, mask, image, output directory, and focal length JSON. The alignment process returns success status, output mesh path, and alignment results including scale factor and translation.

```python
import torch
from notebook.mesh_alignment import (
    process_and_save_alignment,
    visualize_meshes_interactive
)

device = 'cuda' if torch.cuda.is_available() else 'cpu'

# Process 3DB mesh alignment to MoGe/SAM 3D Objects scale
success, output_mesh_path, result = process_and_save_alignment(
    mesh_path="meshes/human_object/3DB_results/human.ply",      # 3DB mesh
    mask_path="meshes/human_object/3DB_results/mask_human.png", # Human mask
    image_path="images/human_object/image.png",                  # Original image
    output_dir="meshes/human_object/aligned_meshes",             # Output directory
    device=device,
    focal_length_json_path="meshes/human_object/3DB_results/focal_length.json"
)

if success:
    print(f"Alignment completed! Scale factor: {result['scale_factor']:.4f}")
    print(f"Translation: {result['translation']}")
    print(f"Output saved to: {output_mesh_path}")

    # Visualize aligned human with SAM 3D Objects reconstruction
    demo, combined_path = visualize_meshes_interactive(
        aligned_mesh_path=output_mesh_path,
        dfy_mesh_path="meshes/human_object/3Dfy_results/0.ply",
        share=True  # Create public URL
    )
```

--------------------------------

### Single-Object 3D Reconstruction with SAM 3D Objects

Source: https://context7.com/facebookresearch/sam-3d-objects/llms.txt

Performs single-object 3D reconstruction from an image and mask using the SAM 3D Objects inference pipeline. Outputs Gaussian splats, meshes, pose, and layout information.

```python
import sys
sys.path.append("notebook")
from inference import Inference, load_image, load_single_mask

# Initialize inference pipeline
config_path = "checkpoints/hf/pipeline.yaml"
inference = Inference(config_path, compile=False)

# Load image and mask
image = load_image("notebook/images/shutterstock_stylish_kidsroom_1640806567/image.png")
mask = load_single_mask("notebook/images/shutterstock_stylish_kidsroom_1640806567", index=14)

# Run inference (returns dict with 'gs', 'gaussian', 'rotation', 'translation', 'scale', etc.)
output = inference(image, mask, seed=42)

# Access outputs
gaussian_splat = output["gs"]           # Gaussian model for rendering
rotation = output["rotation"]           # Quaternion (1, 4) tensor for object rotation
translation = output["translation"]     # (1, 3) tensor for object position
scale = output["scale"]                 # (1, 3) tensor for object scale
pointmap = output["pointmap"]           # (H, W, 3) depth pointmap

# Export to PLY file
output["gs"].save_ply("output_object.ply")
print("Gaussian splat saved to output_object.ply")
```

--------------------------------

### Process and Save Aligned Mesh

Source: https://github.com/facebookresearch/sam-3d-objects/blob/main/notebook/demo_3db_mesh_alignment.ipynb

Calls the `process_and_save_alignment` function with the specified mesh, mask, image, output directory, device, and focal length. It then prints a success or failure message based on the result.

```python
success, output_mesh_path, result = process_and_save_alignment(
    mesh_path=mesh_path,
    mask_path=mask_path,
    image_path=image_path,
    output_dir=output_dir,
    device=device,
    focal_length_json_path=focal_length_json_path
)

if success:
    print(f"Alignment completed successfully! Output: {output_mesh_path}")
else:
    print("Alignment failed!")
```

--------------------------------

### Generate 3D Gaussian Splats

Source: https://github.com/facebookresearch/sam-3d-objects/blob/main/notebook/demo_multi_object.ipynb

Generates 3D Gaussian splats for each detected object mask in the image. A fixed seed is used for reproducibility.

```python
outputs = [inference(image, mask, seed=42) for mask in masks]
```

--------------------------------

### Visualize Multiple Objects Scene

Source: https://context7.com/facebookresearch/sam-3d-objects/llms.txt

Visualizes multiple objects within a scene, including their poses and associated pointmaps. This function allows specifying object names and pointmap data for a comprehensive scene view.

```python
import torch
from sam3d_objects.utils.visualization import SceneVisualizer

# Visualize multiple objects in a scene
pose_targets = [
    {
        "xyz_local": outputs[i]["voxel"],
        "rotation": outputs[i]["rotation"],
        "translation": outputs[i]["translation"],
        "scale": outputs[i]["scale"],
    }
    for i in range(len(outputs))
]

fig = SceneVisualizer.plot_multi_objects(
    pose_targets=pose_targets,
    mask_names=["Chair", "Table", "Lamp"],
    pointmap=outputs[0]["pointmap"],
    pointmap_colors=outputs[0]["pointmap_colors"],
    title="Multi-Object Scene"
)
fig.show()
```

--------------------------------

### Visualize Single Object Scene

Source: https://context7.com/facebookresearch/sam-3d-objects/llms.txt

Visualizes a single reconstructed object in scene coordinates using Plotly. Requires local object points, rotation, translation, scale, pointmap, and image colors. The visualization can be displayed as a mesh.

```python
import torch
from sam3d_objects.utils.visualization import SceneVisualizer

# Visualize a single object in scene coordinates
fig = SceneVisualizer.plot_scene(
    points_local=output["voxel"],
    instance_quaternions_l2c=output["rotation"],
    instance_positions_l2c=output["translation"],
    instance_scales_l2c=output["scale"],
    pointmap=output["pointmap"],
    image=output["pointmap_colors"],
    title="Reconstructed Object",
    height=1000,
    show_pointmap_as_mesh=True,
)
fig.show()
```
