### Visualize Experiments with TensorBoard

Source: https://github.com/amaralibey/openvprlab/blob/main/README.md

Starts the TensorBoard server to monitor training logs and metrics.

```bash
tensorboard --logdir ./logs/resnet18/ConvAP
```

--------------------------------

### Full Training Pipeline with PyTorch Lightning

Source: https://context7.com/amaralibey/openvprlab/llms.txt

Sets up a complete training experiment with PyTorch Lightning. It imports the Trainer, callbacks, loggers, the datamodule, the framework, a backbone, an aggregator, and the loss function, then sets a seed for reproducibility.

```python
from lightning.pytorch import Trainer, seed_everything
from lightning.pytorch.callbacks import ModelCheckpoint
from lightning.pytorch.loggers import TensorBoardLogger

from src.core.vpr_datamodule import VPRDataModule
from src.core.vpr_framework import VPRFramework
from src.models.backbones import ResNet
from src.models.aggregators import BoQ
from src.losses import VPRLossFunction

# Set seed for reproducibility
seed_everything(42, workers=True)
```

--------------------------------

### Initialize Trainer and Start Training

Source: https://context7.com/amaralibey/openvprlab/llms.txt

Initializes the PyTorch Lightning Trainer with GPU acceleration, the logger, mixed precision (`16-mixed`), the maximum number of epochs, callbacks (including the checkpoint callback defined earlier), and the logging frequency. It then calls `fit` to start training with the configured model and datamodule.

```python
trainer = Trainer(
    accelerator="gpu",
    devices=[0],
    logger=logger,  # TensorBoardLogger created in the logging setup snippet
    precision="16-mixed",
    max_epochs=40,
    callbacks=[checkpoint],
    log_every_n_steps=10
)

trainer.fit(model=model, datamodule=datamodule)
```

--------------------------------

### Download GSV-Cities Dataset via Kaggle CLI

Source: https://github.com/amaralibey/openvprlab/blob/main/notebooks/download_gsv-cities.ipynb

Installs the Kaggle CLI and downloads the GSV-Cities dataset to a specified directory, automatically unzipping the files. The lines starting with `!` are notebook shell commands. The dataset requires about 23 GB of disk space.

```python
save_path = "../data/train/gsv-cities"

! pip install kaggle
! kaggle datasets download "amaralibey/gsv-cities" -p {save_path} --unzip
```

--------------------------------

### Implement Custom Aggregator in PyTorch

Source: https://context7.com/amaralibey/openvprlab/llms.txt

Shows how to implement a custom aggregation technique within the framework. The example is a PyTorch module that applies a 1x1 channel projection, batch normalization, ReLU activation, and adaptive spatial pooling, then L2-normalizes the flattened output.

```python
# src/models/aggregators/my_agg.py
import torch
import torch.nn as nn
import torch.nn.functional as F


class MyCustomAggregator(nn.Module):
    """Custom aggregator example with channel reduction and spatial pooling."""

    def __init__(self, in_channels=1024, out_channels=512, pool_size=4):
        super().__init__()
        self.channel_proj = nn.Conv2d(in_channels, out_channels, kernel_size=1)
        self.bn = nn.BatchNorm2d(out_channels)
        self.pool = nn.AdaptiveAvgPool2d((pool_size, pool_size))

    def forward(self, x):
        x = self.channel_proj(x)
        x = self.bn(x)
        x = F.relu(x)
        x = self.pool(x)
        x = F.normalize(x.flatten(1), p=2, dim=-1)
        return x

# Register in src/models/aggregators/__init__.py:
# from .my_agg import MyCustomAggregator

# Use in config or command line:
# python run.py --aggregator MyCustomAggregator --batch_size 60 --lr 0.0001
```

--------------------------------

### Setup Logging and Checkpointing for Training

Source: https://context7.com/amaralibey/openvprlab/llms.txt

Configures logging and checkpointing for the training process. The TensorBoard logger writes training metrics under the given directory, and the checkpoint callback keeps the three best models by validation recall at rank 1 (R1), with a filename that includes the epoch number and the R1 value.

```python
logger = TensorBoardLogger(save_dir="./logs/resnet50", name="BoQ")

checkpoint = ModelCheckpoint(
    monitor="msls-val/R1",
    mode="max",
    save_top_k=3,
    filename="epoch({epoch:02d})_R1[{msls-val/R1:.4f}]"
)
```

--------------------------------

### POST /train

Source: https://context7.com/amaralibey/openvprlab/llms.txt

Initiates the training process for VPR models using configuration files or command-line overrides.

```APIDOC
## POST /train

### Description
Starts the training of a VPR model. Supports YAML configuration files and various command-line arguments for hyperparameter tuning.

### Method
POST

### Endpoint
run.py

### Parameters
#### Query Parameters
- **config** (string) - Optional - Path to the YAML configuration file.
- **batch_size** (integer) - Optional - Batch size for training.
- **lr** (float) - Optional - Learning rate.
- **backbone** (string) - Optional - Backbone architecture name.
- **aggregator** (string) - Optional - Aggregator method name.

### Request Example
python run.py --config ./config/resnet50_mixvpr.yaml --batch_size 40 --lr 0.0001

### Response
#### Success Response (200)
- **status** (string) - Training process initialized and logging to TensorBoard.
```

--------------------------------

### Run Training with Configuration File

Source: https://github.com/amaralibey/openvprlab/blob/main/README.md

Executes the training script using a specific YAML configuration file.

```bash
python run.py --config ./config/my_model_config.yaml
```

--------------------------------

### POST /framework/initialize

Source: https://context7.com/amaralibey/openvprlab/llms.txt

Initializes the VPRFramework module with backbone, aggregator, and loss function components.

```APIDOC
## POST /framework/initialize

### Description
Constructs the VPRFramework instance to orchestrate training, validation, and inference.

### Method
POST

### Request Body
- **backbone** (object) - Required - Pre-trained backbone model.
- **aggregator** (object) - Required - Feature pooling method.
- **loss_function** (object) - Required - Optimization objective.
- **lr** (float) - Required - Learning rate.

### Request Example
model = VPRFramework(backbone=backbone, aggregator=aggregator, loss_function=loss_function, lr=0.0001)

### Response
#### Success Response (200)
- **model** (object) - Initialized VPRFramework instance.
```

--------------------------------

### Initialize VPRFramework and Perform Inference

Source: https://context7.com/amaralibey/openvprlab/llms.txt

Demonstrates how to assemble a VPR model by combining a backbone, aggregator, and loss function within the VPRFramework Lightning module, then run a forward pass to obtain descriptors.

```python
import torch
from src.core.vpr_framework import VPRFramework
from src.models.backbones import ResNet
from src.models.aggregators import MixVPR
from src.losses import VPRLossFunction

backbone = ResNet(
    backbone_name="resnet50",
    pretrained=True,
    num_unfrozen_blocks=1,
    crop_last_block=True
)
aggregator = MixVPR(
    in_channels=backbone.out_channels,
    in_h=20,
    in_w=20,
    out_channels=512,
    mix_depth=4,
    mlp_ratio=1,
    out_rows=4
)
loss_function = VPRLossFunction(
    loss_fn_name="MultiSimilarityLoss",
    miner_name="MultiSimilarityMiner"
)
model = VPRFramework(
    backbone=backbone,
    aggregator=aggregator,
    loss_function=loss_function,
    lr=0.0001,
    optimizer="adamw",
    weight_decay=0.001,
    warmup_steps=1500,
    milestones=[10, 20, 30],
    lr_mult=0.1,
    verbose=True
)

images = torch.randn(8, 3, 320, 320)
descriptors = model(images)
```

--------------------------------

### Train and Monitor VPR Models via CLI

Source: https://context7.com/amaralibey/openvprlab/llms.txt

Commands to initiate model training using configuration files or command-line overrides, and to monitor progress using TensorBoard.

```bash
python run.py --dev
python run.py --config ./config/resnet50_mixvpr.yaml
python run.py --config ./config/resnet50_mixvpr.yaml --batch_size 40 --lr 0.0001
python run.py --backbone ResNet --aggregator MixVPR --batch_size 60 --lr 0.0001
tensorboard --logdir ./logs/
```

--------------------------------

### Run Training Experiment via CLI

Source: https://github.com/amaralibey/openvprlab/blob/main/README.md

Executes the training script using command-line arguments to specify the backbone and aggregator.

```bash
python run.py --backbone ResNet --aggregator ConvAP --batch_size 60 --lr 0.0001
```

--------------------------------

### Initialize VPR DataModule and Model Components

Source: https://context7.com/amaralibey/openvprlab/llms.txt

Initializes the VPRDataModule for handling image datasets and sets up the core model components: the backbone (ResNet), the aggregator (BoQ), and the loss function (VPRLossFunction). It configures dataset names, image sizes, batching, and the architecture parameters of each component.

```python
IMAGENET_MEAN_STD = {"mean": [0.485, 0.456, 0.406], "std": [0.229, 0.224, 0.225]}

datamodule = VPRDataModule(
    train_set_name="gsv-cities-light",
    train_image_size=(320, 320),
    batch_size=60,
    img_per_place=4,
    num_workers=8,
    mean_std=IMAGENET_MEAN_STD,
    val_set_names=["msls-val"]
)

backbone = ResNet(
    backbone_name="resnet50",
    pretrained=True,
    num_unfrozen_blocks=1,
    crop_last_block=True
)
aggregator = BoQ(
    in_channels=backbone.out_channels,
    proj_channels=512,
    num_queries=32,
    num_layers=2,
    row_dim=32
)
loss_function = VPRLossFunction(
    loss_fn_name="MultiSimilarityLoss",
    miner_name="MultiSimilarityMiner"
)
```

--------------------------------

### YAML Configuration for Experiments

Source: https://context7.com/amaralibey/openvprlab/llms.txt

Illustrates the structure of a YAML configuration file that defines a full experiment. It covers the datamodule, backbone, aggregator, loss function, and trainer sections, so every hyperparameter can be specified in one place.

```yaml
# config/my_model_config.yaml
datamodule:
  train_set_name: "gsv-cities-light"
  train_image_size:
    - 320
    - 320
  img_per_place: 4
  batch_size: 60
  num_workers: 8
  val_set_names:
    - "msls-val"
    - "pitts30k-val"

backbone:
  module: src.models.backbones
  class: ResNet
  params:
    backbone_name: "resnet50"
    pretrained: true
    num_unfrozen_blocks: 1
    crop_last_block: true

aggregator:
  module: src.models.aggregators
  class: MixVPR
  params:
    in_channels: 1024
    in_h: 20
    in_w: 20
    out_channels: 512
    mix_depth: 4
    mlp_ratio: 1
    out_rows: 4

loss_function:
  module: src.losses
  class: VPRLossFunction
  params:
    loss_fn_name: "MultiSimilarityLoss"
    miner_name: "MultiSimilarityMiner"

trainer:
  optimizer: adamw
  lr: 0.0001
  wd: 0.001
  warmup: 1500
  max_epochs: 40
  milestones:
    - 10
    - 20
    - 30
  lr_mult: 0.1
```

--------------------------------

### Initialize and Use DinoV2 Backbone

Source: https://context7.com/amaralibey/openvprlab/llms.txt

Initializes a DinoV2 Vision Transformer backbone with support for partial fine-tuning. It takes the backbone name and the number of unfrozen blocks as parameters. The output features are spatial, and the input image dimensions must be divisible by 14.

```python
import torch
from src.models.backbones import DinoV2

# Available models: dinov2_vits14, dinov2_vitb14, dinov2_vitl14, dinov2_vitg14
backbone = DinoV2(
    backbone_name="dinov2_vitb14",
    num_unfrozen_blocks=2,    # Train last 2 transformer blocks
    return_cls_token=False    # Return spatial features only
)
print(f"Output channels: {backbone.out_channels}")  # 768 for vitb14

# Forward pass
images = torch.randn(4, 3, 322, 322)  # Must be divisible by 14
features = backbone(images)  # Shape: [4, 768, 23, 23]
```

--------------------------------

### Configure ResNet Backbone

Source: https://context7.com/amaralibey/openvprlab/llms.txt

Instantiates a pre-trained ResNet backbone with specific layer freezing and cropping options for transfer learning.

```python
from src.models.backbones import ResNet

backbone = ResNet(
    backbone_name="resnet50",
    pretrained=True,
    num_unfrozen_blocks=1,
    crop_last_block=True
)
```

--------------------------------

### Implement GSVCitiesDataset for Training

Source: https://context7.com/amaralibey/openvprlab/llms.txt

Demonstrates the GSVCitiesDataset used for training, which handles geotagged street-view images organized by place. It shows how to initialize the dataset with a path, a city selection, and an image sampling strategy, how to apply transformations, and how to access items.

```python
from src.dataloaders.train.gsv_cities import GSVCitiesDataset
from torchvision import transforms as T

transform = T.Compose([
    T.ToTensor(),
    T.Resize((320, 320)),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])

dataset = GSVCitiesDataset(
    dataset_path="./data/train/gsv-cities-light",
    cities="all",  # Or list: ["Bangkok", "Rome", "London"]
    img_per_place=4,
    random_sample_from_each_place=True,
    transform=transform,
    hard_mining=False
)

# Each item returns K images from the same place
images, labels = dataset[0]
print(f"Images shape: {images.shape}")  # [4, 3, 320, 320]
print(f"Labels: {labels}")  # Same place_id repeated 4 times
print(f"Total places: {len(dataset)}")
print(f"Total images: {dataset.total_nb_images}")
```

--------------------------------

### Configure VPRDataModule for Data Loading

Source: https://context7.com/amaralibey/openvprlab/llms.txt

Sets up the PyTorch Lightning DataModule to handle image transformations, batching, and multi-dataset loading for training and validation.

```python
from src.core.vpr_datamodule import VPRDataModule

IMAGENET_MEAN_STD = {"mean": [0.485, 0.456, 0.406], "std": [0.229, 0.224, 0.225]}

datamodule = VPRDataModule(
    train_set_name="gsv-cities-light",
    cities="all",
    train_image_size=(320, 320),
    batch_size=60,
    img_per_place=4,
    random_sample_from_each_place=True,
    shuffle_all=False,
    num_workers=8,
    mean_std=IMAGENET_MEAN_STD,
    val_set_names=["msls-val", "pitts30k-val"],
    val_image_size=None
)

datamodule.setup(stage="fit")
train_loader = datamodule.train_dataloader()
val_loaders = datamodule.val_dataloader()
```

--------------------------------

### Create and Configure VPR Framework Model

Source: https://context7.com/amaralibey/openvprlab/llms.txt

Defines the main VPR model by integrating the previously initialized backbone, aggregator, and loss function into the VPRFramework. It also sets the optimizer (AdamW), learning rate, weight decay, and learning rate schedule (warmup steps and milestones).

```python
model = VPRFramework(
    backbone=backbone,
    aggregator=aggregator,
    loss_function=loss_function,
    lr=0.0001,
    optimizer="adamw",
    weight_decay=0.001,
    warmup_steps=1500,
    milestones=[10, 20, 30],
    lr_mult=0.1
)
```

--------------------------------

### Download VPR Datasets

Source: https://context7.com/amaralibey/openvprlab/llms.txt

Interactive script to download and configure training and validation datasets such as GSV-Cities, Mapillary SLS, and Pittsburgh 30k.

```bash
python scripts/datasets_downloader.py
```

--------------------------------

### Initialize and Use VPRLossFunction

Source: https://context7.com/amaralibey/openvprlab/llms.txt

Initializes the VPRLossFunction, a flexible wrapper for various metric learning losses and online hard miners. Users specify the loss function name and the miner name. The forward pass computes the loss and batch accuracy given normalized embeddings and labels.

```python
import torch
from src.losses import VPRLossFunction

# Available loss functions:
# MultiSimilarityLoss, SupConLoss, ContrastiveLoss, TripletLoss,
# CircleLoss, LiftedLoss, FastAPLoss, NTXentLoss, CentroidTripletLoss

# Available miners:
# MultiSimilarityMiner, TripletMarginMiner, PairMarginMiner

loss_fn = VPRLossFunction(
    loss_fn_name="MultiSimilarityLoss",
    miner_name="MultiSimilarityMiner"
)

# Compute loss
embeddings = torch.randn(60, 512)  # Batch of descriptors
embeddings = torch.nn.functional.normalize(embeddings, p=2, dim=1)  # L2-normalize
labels = torch.arange(15).repeat_interleave(4)  # 15 places x 4 images each

loss, batch_accuracy = loss_fn(embeddings, labels)
print(f"Loss: {loss.item():.4f}")
print(f"Batch accuracy (fraction of non-informative samples): {batch_accuracy:.4f}")
```

--------------------------------

### Define Experiment Configuration

Source: https://github.com/amaralibey/openvprlab/blob/main/README.md

A YAML configuration file defining data, backbone, aggregator, and trainer parameters for reproducible experiments.

```yaml
datamodule:
  train_set_name: "gsv-cities-light"
  train_image_size:
    - 224
    - 224
  img_per_place: 4
  batch_size: 100
  num_workers: 8
  val_set_names:
    - "msls-val"
    - "pitts30k-val"

backbone:
  module: src.models.backbones
  class: ResNet
  params:
    backbone_name: "resnet18"
    pretrained: true
    num_unfrozen_blocks: 1
    crop_last_block: true

aggregator:
  module: src.models.aggregators
  class: ConvAP
  params:
    in_channels:
    out_channels: 256
    s1: 3
    s2: 3

trainer:
  optimizer: adamw
  lr: 0.0001
  wd: 0.001
  warmup: 2000
  max_epochs: 30
  milestones:
    - 10
    - 20
  lr_mult: 0.1
```

--------------------------------

### POST /datasets/download

Source: https://context7.com/amaralibey/openvprlab/llms.txt

Downloads and configures datasets required for training and validation.

```APIDOC
## POST /datasets/download

### Description
Downloads selected VPR datasets and automatically updates the local configuration paths.

### Method
POST

### Endpoint
scripts/datasets_downloader.py

### Parameters
#### Query Parameters
- **dataset_id** (integer) - Required - ID of the dataset (1: gsv-cities, 2: gsv-cities-light, 3: pitts30k-val, 4: msls-val, 5: all).

### Request Example
python scripts/datasets_downloader.py

### Response
#### Success Response (200)
- **message** (string) - Dataset downloaded and configuration updated.
```

--------------------------------

### Initialize and Use GeM Pooling Aggregator

Source: https://context7.com/amaralibey/openvprlab/llms.txt

Initializes the GeM Pooling aggregator, which applies Generalized Mean Pooling with a learnable pooling parameter `p`. It takes an initial `p` value and an epsilon for numerical stability. The output is a tensor of global descriptors.

```python
import torch
from src.models.aggregators import GeMPool

aggregator = GeMPool(
    p=3,       # Initial pooling power (learnable)
    eps=1e-6   # Epsilon for numerical stability
)

# Forward pass
features = torch.randn(8, 1024, 20, 20)
descriptors = aggregator(features)  # Shape: [8, 1024]
print(f"Learned pooling power: {aggregator.p.item()}")
```

--------------------------------

### Initialize and Use BoQ Aggregator

Source: https://context7.com/amaralibey/openvprlab/llms.txt

Initializes the BoQ (Bag of Queries) aggregator, a Transformer-based method using learnable queries and cross-attention. It requires input channels, projection channels, number of queries, number of layers, and a row dimension. The forward pass returns both descriptors and attention maps.

```python
import torch
from src.models.aggregators import BoQ

aggregator = BoQ(
    in_channels=1024,    # Must match backbone output channels
    proj_channels=512,   # Projection dimension after 3x3 conv
    num_queries=32,      # Number of learnable queries
    num_layers=2,        # Number of BoQ blocks
    row_dim=32           # Final row projection dimension
)

# Forward pass
features = torch.randn(8, 1024, 20, 20)  # Backbone output
descriptors, attentions = aggregator(features)  # Returns tuple
print(f"Descriptor shape: {descriptors.shape}")  # [8, 16384]
print(f"Number of attention maps: {len(attentions)}")  # 2 (one per layer)
```

--------------------------------

### Initialize and Use ConvAP Aggregator

Source: https://context7.com/amaralibey/openvprlab/llms.txt

Initializes the ConvAP (Convolutional Average Pooling) aggregator, which performs spatial pooling and channel projection to create compact descriptors. It requires input channels, output channels, and spatial pooling dimensions (`s1`, `s2`).

```python
import torch
from src.models.aggregators import ConvAP

aggregator = ConvAP(
    in_channels=1024,
    out_channels=512,
    s1=2,  # Spatial pooling height
    s2=2   # Spatial pooling width
)

# Forward pass
features = torch.randn(8, 1024, 20, 20)
descriptors = aggregator(features)  # Shape: [8, 2048] (out_channels * s1 * s2)
```

--------------------------------

### Register Aggregator in Package

Source: https://github.com/amaralibey/openvprlab/blob/main/README.md

Exposes the custom aggregator class to the package by importing it into the aggregators `__init__.py` file.

```python
from .mixvpr import MixVPR
from .boq import BoQ
from .my_agg import ConvAP
```

--------------------------------

### Initialize and Use MixVPR Aggregator

Source: https://context7.com/amaralibey/openvprlab/llms.txt

Initializes the MixVPR aggregator, which uses MLPMixer-style layers to create compact global descriptors from backbone features. Key parameters include input channels, feature map dimensions, output channels, mix depth, and MLP ratio.
The output descriptor dimension is `out_channels * out_rows`.

```python
import torch
from src.models.aggregators import MixVPR

aggregator = MixVPR(
    in_channels=1024,   # Must match backbone output channels
    in_h=20,            # Feature map height (depends on input size and backbone)
    in_w=20,            # Feature map width
    out_channels=512,   # Depth-wise projection dimension
    mix_depth=4,        # Number of FeatureMixer layers
    mlp_ratio=1,        # MLP expansion ratio
    out_rows=4          # Row-wise projection dimension
)

# Forward pass
features = torch.randn(8, 1024, 20, 20)  # Backbone output
descriptors = aggregator(features)  # Shape: [8, 2048] (out_channels * out_rows)
print(f"Descriptor dimension: {descriptors.shape[1]}")  # 2048
```

--------------------------------

### Update GSV-Cities Path in Configuration

Source: https://github.com/amaralibey/openvprlab/blob/main/notebooks/download_gsv-cities.ipynb

Reads an existing YAML configuration file, updates the dataset path to the absolute path of the downloaded files, and saves the changes back to the configuration file.

```python
import yaml
from pathlib import Path

with open("../config/data/config.yaml", "r") as f:
    config = yaml.safe_load(f)

save_path = Path("../data/train/gsv-cities").resolve().as_posix()
config["train"]["gsv_cities"] = save_path

with open("../config/data/config.yaml", "w") as f:
    yaml.dump(config, f)
```

--------------------------------

### Implement Custom Aggregator Module

Source: https://github.com/amaralibey/openvprlab/blob/main/README.md

Defines a custom PyTorch module for feature aggregation using Conv2d and AdaptiveAvgPool2d. This class should be placed in the aggregators directory.

```python
import torch


class ConvAP(torch.nn.Module):
    def __init__(self, in_channels=1024, out_channels=512, s1=2, s2=2):
        super().__init__()
        # 1x1 convolution that projects the channel dimension
        self.channel_pool = torch.nn.Conv2d(
            in_channels=in_channels, out_channels=out_channels, kernel_size=1
        )
        # Adaptive average pooling down to an s1 x s2 spatial grid
        self.AAP = torch.nn.AdaptiveAvgPool2d((s1, s2))

    def forward(self, x):
        x = self.channel_pool(x)
        x = self.AAP(x)
        # Flatten to [B, out_channels * s1 * s2] and L2-normalize
        x = torch.nn.functional.normalize(x.flatten(1), p=2, dim=1)
        return x
```

--------------------------------

### Cite GSV-Cities Dataset

Source: https://github.com/amaralibey/openvprlab/blob/main/notebooks/download_gsv-cities.ipynb

BibTeX citation format for the GSV-Cities paper.

```bibtex
@article{ali2022gsv,
  title={{GSV-Cities}: Toward appropriate supervised visual place recognition},
  author={Ali-bey, Amar and Chaib-draa, Brahim and Gigu{\`e}re, Philippe},
  journal={Neurocomputing},
  volume={513},
  pages={194--203},
  year={2022},
  publisher={Elsevier}
}
```

--------------------------------

### Compute Recall Performance with FAISS

Source: https://context7.com/amaralibey/openvprlab/llms.txt

Provides a function to compute recall@K metrics using FAISS for efficient nearest neighbor search on descriptor embeddings. The example generates simulated descriptors, defines the ground-truth relevant reference indices for each query, and computes recall for the specified values of K.

```python
from src.utils.metrics import compute_recall_performance
import numpy as np

# Example: 1000 reference images, 100 query images, 512-dim descriptors
num_references = 1000
num_queries = 100
embed_dim = 512

# Simulated descriptors (reference images first, then queries)
descriptors = np.random.randn(num_references + num_queries, embed_dim).astype(np.float32)

# Ground truth: list of relevant reference indices for each query
ground_truth = [
    [i % num_references, (i + 1) % num_references]  # Each query has 2 relevant refs
    for i in range(num_queries)
]

recalls = compute_recall_performance(
    descriptors=descriptors,
    num_references=num_references,
    num_queries=num_queries,
    ground_truth=ground_truth,
    k_values=[1, 5, 10, 15]
)

print(f"Recall@1: {recalls[1]:.4f}")
print(f"Recall@5: {recalls[5]:.4f}")
print(f"Recall@10: {recalls[10]:.4f}")
print(f"Recall@15: {recalls[15]:.4f}")
```
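To make the recall@K metric concrete, the following is a minimal, dependency-free sketch of what it measures: a query counts as a hit at K if any of its ground-truth references appears among its K nearest references. This sketch uses brute-force squared L2 distance in pure Python rather than FAISS, and `recall_at_k` and `squared_l2` are hypothetical helper names for illustration, not part of `src.utils.metrics`.

```python
# Illustrative sketch only: brute-force recall@K in pure Python.
# `recall_at_k` is a hypothetical helper, not the library's FAISS-based function.

def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def recall_at_k(references, queries, ground_truth, k_values):
    """Return {k: fraction of queries with a relevant reference in the top k}."""
    max_k = max(k_values)
    hits = {k: 0 for k in k_values}
    for query, relevant in zip(queries, ground_truth):
        # Rank all reference indices by distance to this query (nearest first).
        ranked = sorted(
            range(len(references)),
            key=lambda i: squared_l2(query, references[i]),
        )[:max_k]
        relevant = set(relevant)
        for k in k_values:
            # A query is a hit at k if any relevant index is in its top-k list.
            if relevant.intersection(ranked[:k]):
                hits[k] += 1
    return {k: hits[k] / len(queries) for k in k_values}


# Toy data: three 2-D reference descriptors and two queries.
references = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
queries = [[0.9, 0.1], [0.1, 0.6]]
ground_truth = [[1], [0]]  # Relevant reference indices for each query

recalls = recall_at_k(references, queries, ground_truth, k_values=[1, 2])
print(recalls)
```

On this toy data the first query retrieves its relevant reference at rank 1 and the second only at rank 2, so the result is `{1: 0.5, 2: 1.0}`. The loop costs one distance computation per query-reference pair, which is why an index-based search such as FAISS is the practical choice for real descriptor sets.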