### Initializing GraphNeT Installation Options UI in JavaScript

Source: https://github.com/graphnet-team/graphnet/blob/main/docs/source/installation/quick-start.html

This JavaScript snippet defines the data arrays for PyTorch versions, operating systems, and CUDA capabilities. It then dynamically appends corresponding `div` elements to the HTML, creating interactive selection options for the user.

```JavaScript
var torchList = [
    ['torch-2.2.0', 'PyTorch 2.2.*'],
    ['no_torch', 'w/o PyTorch'],
];

var osList = [
    ['linux', 'Linux'],
    ['macos', 'Mac'],
];

var cudaList = [
    ['cu118', '11.8'],
    ['cu121', '12.1'],
    ['cpu', 'CPU'],
];

torchList.forEach(x => $("#torch").append(`
<div id="${x[0]}">${x[1]}</div>
`)); osList.forEach(x => $("#os").append(`
<div id="${x[0]}">${x[1]}</div>
`)); cudaList.forEach(x => $("#cuda").append(`
<div id="${x[0]}">${x[1]}</div>
`));
```

--------------------------------

### Generating GraphNeT Installation Commands in JavaScript

Source: https://github.com/graphnet-team/graphnet/blob/main/docs/source/installation/quick-start.html

This JavaScript function `updateCommand` dynamically generates and displays the appropriate GraphNeT installation command based on the user's selections for OS, PyTorch, and CUDA. It handles various combinations, including specific requirements for macOS and CPU-only installations, and suggests optional dependencies like `jammy_flows`.

```JavaScript
function updateCommand() {
    var torch = $("#command").attr("torch");
    var os = $("#command").attr("os");
    var package = $("#command").attr("package");
    var cuda = $("#command").attr("cuda");

    if (os == "macos" && cuda != "cpu") {
        $("#command pre").text('# macOS binaries do not support CUDA');
    }
    if (cuda != "cpu" && torch == "no_torch") {
        $("#command pre").text('# GPU acceleration is not available without PyTorch.');
    }
    if (os == "linux" && cuda != "cpu" && torch != "no_torch"){
        $("#command pre").text(`git clone https://github.com/graphnet-team/graphnet.git\ncd graphnet\n\npip install -r requirements/torch_${$("#command").attr("cuda")}.txt -e .[torch,develop]\n\n#Optionally, install jammy_flows for normalizing flow support:\npip install git+https://github.com/thoglu/jammy_flows.git`);
    }
    else if (os == "linux" && cuda == "cpu" && torch != "no_torch"){
        $("#command pre").text(`git clone https://github.com/graphnet-team/graphnet.git\ncd graphnet\n\npip install -r requirements/torch_${$("#command").attr("cuda")}.txt -e .[torch,develop]\n\n#Optionally, install jammy_flows for normalizing flow support:\npip install git+https://github.com/thoglu/jammy_flows.git`);
    }
    else if (os == "linux" && cuda == "cpu" && torch == "no_torch"){
        $("#command pre").text(`# Installations without PyTorch are intended for file conversion only\ngit clone https://github.com/graphnet-team/graphnet.git\ncd graphnet\n\npip install -r requirements/torch_${$("#command").attr("cuda")}.txt -e .[develop]\n\n#Optionally, install jammy_flows for normalizing flow support:\npip install git+https://github.com/thoglu/jammy_flows.git`);
    }

    if (os == "macos" && cuda == "cpu" && torch != "no_torch"){
        $("#command pre").text(`git clone https://github.com/graphnet-team/graphnet.git\ncd graphnet\n\npip install -r requirements/torch_macos.txt -e .[torch,develop]\n\n#Optionally, install jammy_flows for normalizing flow support:\npip install git+https://github.com/thoglu/jammy_flows.git`);
    }
    if (os == "macos" && cuda == "cpu" && torch == "no_torch"){
        $("#command pre").text(`# Installations without PyTorch are intended for file conversion only\ngit clone https://github.com/graphnet-team/graphnet.git\ncd graphnet\n\npip install -r requirements/torch_macos.txt -e .[develop]\n\n#Optionally, install jammy_flows for normalizing flow support:\npip install git+https://github.com/thoglu/jammy_flows.git`);
    }
}
```

--------------------------------

### Handling UI Selections and Initializing GraphNeT Install Options in JavaScript

Source: https://github.com/graphnet-team/graphnet/blob/main/docs/source/installation/quick-start.html

This JavaScript snippet defines a click event handler for the installation option `div` elements, which updates the selected state and triggers the `updateCommand` function to refresh the displayed installation command. It also includes initial calls to simulate clicks, setting default selections for PyTorch, OS, and CUDA upon page load.

```JavaScript
$(".quick-start .content-column .row div").click(function() {
    $(this).parent().children().removeClass("selected");
    $(this).addClass("selected");
    $("#command").attr($(this).parent().attr("id"), $(this).attr("id"));
    updateCommand();
});

$("#torch").children().get(0).click();
$("#linux").click();
$("#pip").click();
$("#cpu").click();
```

--------------------------------

### Example GraphNeT Model Configuration YAML

Source: https://github.com/graphnet-team/graphnet/blob/main/docs/source/getting_started/getting_started.md

This YAML snippet provides an example of a `ModelConfig` file, showcasing how the `architecture` (e.g., `DynEdge`) and `graph_definition` components are structured. It details their respective arguments and class names, illustrating the human-readable and portable format for defining GraphNeT models.

```yaml
arguments:
  architecture:
    ModelConfig:
      arguments:
        add_global_variables_after_pooling: false
        dynedge_layer_sizes: null
        features_subset: null
        global_pooling_schemes: [min, max, mean, sum]
        nb_inputs: 4
        nb_neighbours: 8
        post_processing_layer_sizes: null
        readout_layer_sizes: null
      class_name: DynEdge
  graph_definition:
    ModelConfig:
      arguments:
        columns: [0, 1, 2]
```

--------------------------------

### Demonstrating CLI Help and Execution for Data Reading in Python

Source: https://github.com/graphnet-team/graphnet/blob/main/examples/README.md

This snippet illustrates how to interact with GraphNeT example scripts via the command-line interface. It shows how to access the help documentation for the `01_read_dataset.py` script to understand its arguments and then demonstrates a basic execution of the script to read data in SQLite format.

```bash
$ python examples/02_data/01_read_dataset.py --help
(...)
Read a few events from data in an intermediate format.

positional arguments:
  {sqlite,parquet}

optional arguments:
  -h, --help        show this help message and exit

$ python examples/02_data/01_read_dataset.py sqlite
(...)
```

--------------------------------

### Installing Pre-Commit Hooks for GraphNeT Development

Source: https://github.com/graphnet-team/graphnet/blob/main/docs/source/contribute/contribute.rst

This command installs the pre-commit hooks configured for the GraphNeT project. Once installed, these hooks automatically format code using `black` and `docformatter`, and check for errors and style adherence with `flake8`, `mypy`, and `pydocstyle` every time a change is committed, ensuring consistent code quality and style.

```bash
pre-commit install
```

--------------------------------

### Energy Reconstruction Example with GraphNeT Configuration (Python)

Source: https://github.com/graphnet-team/graphnet/blob/main/docs/source/models/models.rst

This comprehensive example demonstrates a full energy reconstruction workflow in GraphNeT using configuration files. It covers importing necessary modules, loading model and dataset configurations, building the model, constructing data loaders, training the model, making predictions on a test set, and saving results and the trained model.

```python
# Import(s)
import os

from graphnet.constants import CONFIG_DIR  # Local path to graphnet/configs
from graphnet.data.dataloader import DataLoader
from graphnet.models import Model
from graphnet.utilities.config import DatasetConfig, ModelConfig

# Configuration
dataset_config_path = f"{CONFIG_DIR}/datasets/training_example_data_sqlite.yml"
model_config_path = f"{CONFIG_DIR}/models/example_energy_reconstruction_model.yml"

# Build model
model_config = ModelConfig.load(model_config_path)
model = Model.from_config(model_config, trust=True)

# Construct dataloaders
dataset_config = DatasetConfig.load(dataset_config_path)
dataloaders = DataLoader.from_dataset_config(
    dataset_config,
    batch_size=16,
    num_workers=1,
)

# Train model
model.fit(
    dataloaders["train"],
    dataloaders["validation"],
    gpus=[0],
    max_epochs=5,
)

# Predict on test set and return as pandas.DataFrame
results = model.predict_as_dataframe(
    dataloaders["test"],
    additional_attributes=model.target_labels + ["event_no"],
)

# Save predictions and model to file
outdir = "tutorial_output"
os.makedirs(outdir, exist_ok=True)
results.to_csv(f"{outdir}/results.csv")
model.save_state_dict(f"{outdir}/state_dict.pth")
model.save(f"{outdir}/model.pth")
```

--------------------------------

### Example Dataset Configuration File (YAML)

Source: https://github.com/graphnet-team/graphnet/blob/main/docs/source/getting_started/getting_started.md

This YAML snippet provides a complete example of a `DatasetConfig` file used in GraphNeT. It defines the data source path, graph definition parameters (like node features and nearest neighbors), pulsemaps, features, truth variables, index column, truth table, and specific train/test/validation selections based on event numbers. This configuration is used to load and process data for training.

```yaml
path: $GRAPHNET/data/examples/sqlite/prometheus/prometheus-events.db
graph_definition:
  arguments:
    columns: [0, 1, 2]
    detector:
      arguments: {}
      class_name: Prometheus
    dtype: null
    nb_nearest_neighbours: 8
    node_definition:
      arguments: {}
      class_name: NodesAsPulses
    node_feature_names: [sensor_pos_x, sensor_pos_y, sensor_pos_z, t]
  class_name: KNNGraph
pulsemaps:
  - total
features:
  - sensor_pos_x
  - sensor_pos_y
  - sensor_pos_z
  - t
truth:
  - injection_energy
  - injection_type
  - injection_interaction_type
  - injection_zenith
  - injection_azimuth
  - injection_bjorkenx
  - injection_bjorkeny
  - injection_position_x
  - injection_position_y
  - injection_position_z
  - injection_column_depth
  - primary_lepton_1_type
  - primary_hadron_1_type
  - primary_lepton_1_position_x
  - primary_lepton_1_position_y
  - primary_lepton_1_position_z
  - primary_hadron_1_position_x
  - primary_hadron_1_position_y
  - primary_hadron_1_position_z
  - primary_lepton_1_direction_theta
  - primary_lepton_1_direction_phi
  - primary_hadron_1_direction_theta
  - primary_hadron_1_direction_phi
  - primary_lepton_1_energy
  - primary_hadron_1_energy
  - total_energy
  - dummy_pid
index_column: event_no
truth_table: mc_truth
seed: 21
selection:
  test: event_no % 5 == 0
  validation: event_no % 5 == 1
  train: event_no % 5 > 1
```

--------------------------------

### Training and Predicting with GraphNeT for Energy Reconstruction (Python)

Source: https://github.com/graphnet-team/graphnet/blob/main/docs/source/getting_started/getting_started.md

This comprehensive example demonstrates the full workflow for training a GraphNeT model for energy reconstruction. It covers loading model and dataset configurations, constructing data loaders, training the model, making predictions on a test set, and saving both the results and the trained model artifacts.

```python
# Import(s)
import os

from graphnet.constants import CONFIG_DIR  # Local path to graphnet/configs
from graphnet.data.dataloader import DataLoader
from graphnet.models import Model
from graphnet.utilities.config import DatasetConfig, ModelConfig

# Configuration
dataset_config_path = f"{CONFIG_DIR}/datasets/training_example_data_sqlite.yml"
model_config_path = f"{CONFIG_DIR}/models/example_energy_reconstruction_model.yml"

# Build model
model_config = ModelConfig.load(model_config_path)
model = Model.from_config(model_config, trust=True)

# Construct dataloaders
dataset_config = DatasetConfig.load(dataset_config_path)
dataloaders = DataLoader.from_dataset_config(
    dataset_config,
    batch_size=16,
    num_workers=1,
)

# Train model
model.fit(
    dataloaders["train"],
    dataloaders["validation"],
    gpus=[0],
    max_epochs=5,
)

# Predict on test set and return as pandas.DataFrame
results = model.predict_as_dataframe(
    dataloaders["test"],
    additional_attributes=model.target_labels + ["event_no"],
)

# Save predictions and model to file
outdir = "tutorial_output"
os.makedirs(outdir, exist_ok=True)
results.to_csv(f"{outdir}/results.csv")
model.save_state_dict(f"{outdir}/state_dict.pth")
model.save(f"{outdir}/model.pth")
```

--------------------------------

### Training DynEdge Model Programmatically (Bash)

Source: https://github.com/graphnet-team/graphnet/blob/main/examples/04_training/README.md

This snippet demonstrates how to train a DynEdge GNN model using the GraphNeT library by programmatically constructing the dataset and model, without relying on configuration files. It shows commands for displaying CLI help, initiating energy regression training, and utilizing single or multiple GPUs. This method is recommended for debugging and experimenting with model configurations.

```bash
# Show the CLI
(graphnet) $ python examples/04_training/01_train_dynedge.py --help

# Train energy regression model
(graphnet) $ python examples/04_training/01_train_dynedge.py

# Train using a single GPU
(graphnet) $ python examples/04_training/01_train_dynedge.py --gpus 0

# Train using multiple GPUs
(graphnet) $ python examples/04_training/01_train_dynedge.py --gpus 0 1
```

--------------------------------

### Example Dataset Configuration File in GraphNeT (YAML)

Source: https://github.com/graphnet-team/graphnet/blob/main/docs/source/datasets/datasets.rst

This YAML snippet provides a complete example of a `DatasetConfig` file used in GraphNeT. It defines the path to the input SQLite database, the graph definition (e.g., `KNNGraph` with `Prometheus` detector and `NodesAsPulses` node features), specified pulsemaps, and lists of features and truth variables to be loaded from the dataset. This configuration is crucial for setting up data processing pipelines.

```yaml
path: $GRAPHNET/data/examples/sqlite/prometheus/prometheus-events.db
graph_definition:
  arguments:
    columns: [0, 1, 2]
    detector:
      arguments: {}
      class_name: Prometheus
    dtype: null
    nb_nearest_neighbours: 8
    node_definition:
      arguments: {}
      class_name: NodesAsPulses
    node_feature_names: [sensor_pos_x, sensor_pos_y, sensor_pos_z, t]
  class_name: KNNGraph
pulsemaps:
  - total
features:
  - sensor_pos_x
  - sensor_pos_y
  - sensor_pos_z
  - t
truth:
  - injection_energy
  - injection_type
  - injection_interaction_type
  - injection_zenith
  - injection_azimuth
  - injection_bjorkenx
  - injection_bjorkeny
  - injection_position_x
  - injection_position_y
  - injection_position_z
  - injection_column_depth
  - primary_lepton_1_type
  - primary_hadron_1_type
  - primary_lepton_1_position_x
  - primary_lepton_1_position_y
  - primary_lepton_1_position_z
  - primary_hadron_1_position_x
  - primary_hadron_1_position_y
  - primary_hadron_1_position_z
  - primary_lepton_1_direction_theta
  - primary_lepton_1_direction_phi
  - primary_hadron_1_direction_theta
  - primary_hadron_1_direction_phi
  - primary_lepton_1_energy
  - primary_hadron_1_energy
```

--------------------------------

### Training DynEdge Model from Configuration Files (Bash)

Source: https://github.com/graphnet-team/graphnet/blob/main/examples/04_training/README.md

This snippet illustrates how to train a DynEdge GNN model using GraphNeT with configuration files for dataset loading and model definition. It provides commands for displaying CLI help and training models for energy, vertex position, and direction reconstruction, including handling 'kappa' values for uncertainty. This approach is recommended for standard model configurations due to its readability and shareability.

```bash
# Show the CLI
(graphnet) $ python examples/04_training/03_train_dynedge_from_config.py --help

# Train energy regression model
(graphnet) $ python examples/04_training/03_train_dynedge_from_config.py

# Same as above, as this is the default model config.
(graphnet) $ python examples/04_training/03_train_dynedge_from_config.py \
    --model-config configs/models/example_energy_reconstruction_model.yml

# Train a vertex position reconstruction model
(graphnet) $ python examples/04_training/03_train_dynedge_from_config.py \
    --model-config configs/models/example_vertex_position_reconstruction_model.yml

# Trains a direction (zenith, azimuth) reconstruction model. Note that the
# chosen `Task` in the model config file also returns estimated "kappa" values,
# i.e. inverse variance, for each predicted feature, meaning that we need to
# manually specify the names of these.
(graphnet) $ python examples/04_training/03_train_dynedge_from_config.py --gpus 0 \
    --model-config configs/models/example_direction_reconstruction_model.yml \
    --prediction-names zenith_pred zenith_kappa_pred azimuth_pred azimuth_kappa_pred
```

--------------------------------

### Installing GraphNeT in IceCube CVMFS Environment (Bash)

Source: https://github.com/graphnet-team/graphnet/blob/main/docs/source/installation/install.rst

This snippet provides a Bash script to install GraphNeT within an IceCube CVMFS environment. It first clones the GraphNeT repository, then sets up the CVMFS Python runtime with IceTray, updates pip, and finally installs GraphNeT with its dependencies as a user.

```bash
# Download GraphNeT
git clone https://github.com/graphnet-team/graphnet.git
cd graphnet

# Open your favorite CVMFS distribution
eval `/cvmfs/icecube.opensciencegrid.org/py3-v4.2.1/setup.sh`
/cvmfs/icecube.opensciencegrid.org/py3-v4.2.1/RHEL_7_x86_64/metaprojects/icetray/v1.5.1/env-shell.sh

# Update central utils (the specifier is quoted so the shell does not treat ">" as a redirect)
pip install --upgrade "pip>=20"
pip install wheel setuptools==59.5.0

# Install graphnet into the CVMFS as a user
pip install --user -r requirements/torch_cpu.txt -e .[torch,develop]
```

--------------------------------

### Specifying PyTorch CPU Wheel Source (Shell)

Source: https://github.com/graphnet-team/graphnet/blob/main/requirements/torch_macos.txt

This snippet provides a `--find-links` argument for `pip` to locate PyTorch CPU wheel files from the official PyTorch download server. This is typically used in `requirements.txt` or directly with `pip install` to ensure specific CPU-only versions are installed.

```Shell
--find-links https://download.pytorch.org/whl/cpu
```

--------------------------------

### Defining a StandardModel for Zenith Reconstruction in GraphNeT (Python)

Source: https://github.com/graphnet-team/graphnet/blob/main/docs/source/getting_started/getting_started.md

This snippet demonstrates how to import and configure various GraphNeT components (detector, graph, GNN, task, loss function) to construct a `StandardModel`. The example specifically builds a model for zenith angle reconstruction with uncertainties, utilizing `KNNGraph` for data representation, `DynEdge` as the GNN backbone, and `ZenithReconstructionWithKappa` for the physics task, along with `VonMisesFisher2DLoss` for training.

```python
# Choice of graph representation, GNN architecture, and physics task
from graphnet.models.detector.prometheus import Prometheus
from graphnet.models.graphs import KNNGraph
from graphnet.models.graphs.nodes import NodesAsPulses
from graphnet.models.gnn.dynedge import DynEdge
from graphnet.models.task.reconstruction import ZenithReconstructionWithKappa

# Choice of loss function and Model class
from graphnet.training.loss_functions import VonMisesFisher2DLoss
from graphnet.models import StandardModel

# Configuring the components

# Represents the data as a point-cloud graph where each
# node represents a pulse of Cherenkov radiation, with
# edges drawn to the 8 nearest neighbours
graph_definition = KNNGraph(
    detector=Prometheus(),
    node_definition=NodesAsPulses(),
    nb_nearest_neighbours=8,
)
backbone = DynEdge(
    nb_inputs=graph_definition.nb_outputs,
    global_pooling_schemes=["min", "max", "mean"],
)
task = ZenithReconstructionWithKappa(
    hidden_size=backbone.nb_outputs,
    target_labels="injection_zenith",
    loss_function=VonMisesFisher2DLoss(),
)

# Construct the Model
model = StandardModel(
    graph_definition=graph_definition,
    backbone=backbone,
    tasks=[task],
)
```

--------------------------------

### Specifying PyG CPU Wheel Source (Shell)

Source: https://github.com/graphnet-team/graphnet/blob/main/requirements/torch_macos.txt

This snippet provides a `--find-links` argument for `pip` to locate PyTorch Geometric (PyG) CPU wheel files from the PyG data server. This ensures that `pip` can find and install the correct CPU-compatible version of PyG, specifically for Torch 2.2.0.

```Shell
--find-links https://data.pyg.org/whl/torch-2.2.0+cpu.html
```

--------------------------------

### Example GraphNeT Model Configuration (YAML)

Source: https://github.com/graphnet-team/graphnet/blob/main/docs/source/models/models.rst

This YAML snippet provides a detailed example of a `ModelConfig` for an energy reconstruction model in GraphNeT. It defines the model's architecture (DynEdge), graph definition (KNNGraph with Prometheus detector and NodesAsPulses node definition), optimizer (Adam), scheduler (PiecewiseLinearLR), and a specific task (EnergyReconstruction with LogCoshLoss). This configuration can be used to instantiate a complex model programmatically.

```yaml
arguments:
  architecture:
    ModelConfig:
      arguments:
        add_global_variables_after_pooling: false
        dynedge_layer_sizes: null
        features_subset: null
        global_pooling_schemes: [min, max, mean, sum]
        nb_inputs: 4
        nb_neighbours: 8
        post_processing_layer_sizes: null
        readout_layer_sizes: null
      class_name: DynEdge
  graph_definition:
    ModelConfig:
      arguments:
        columns: [0, 1, 2]
        detector:
          ModelConfig:
            arguments: {}
            class_name: Prometheus
        dtype: null
        nb_nearest_neighbours: 8
        node_definition:
          ModelConfig:
            arguments: {}
            class_name: NodesAsPulses
        node_feature_names: [sensor_pos_x, sensor_pos_y, sensor_pos_z, t]
      class_name: KNNGraph
  optimizer_class: '!class torch.optim.adam Adam'
  optimizer_kwargs: {eps: 0.001, lr: 0.001}
  scheduler_class: '!class graphnet.training.callbacks PiecewiseLinearLR'
  scheduler_config: {interval: step}
  scheduler_kwargs:
    factors: [0.01, 1, 0.01]
    milestones: [0, 20.0, 80]
  tasks:
  - ModelConfig:
      arguments:
        hidden_size: 128
        loss_function:
          ModelConfig:
            arguments: {}
            class_name: LogCoshLoss
        loss_weight: null
        prediction_labels: null
        target_labels: total_energy
        transform_inference: '!lambda x: torch.pow(10,x)'
        transform_prediction_and_target: '!lambda x: torch.log10(x)'
        transform_support: null
        transform_target: null
      class_name: EnergyReconstruction
class_name: StandardModel
```

--------------------------------

### Converting Data to GraphNeT Backend with DataConverter (Python)

Source: https://github.com/graphnet-team/graphnet/blob/main/docs/source/integration/integration.rst

This example demonstrates how to instantiate and use GraphNeT's `DataConverter` to process raw experimental data. It configures the converter with a custom `MyReader`, `ParquetWriter`, and `MyExtractor` instances, then runs the conversion process and optionally merges the output files.

```python
from graphnet.data.extractors.myexperiment import MyExtractor
from graphnet.data.dataconverter import DataConverter
from graphnet.data.readers import MyReader
from graphnet.data.writers import ParquetWriter

# Your settings
dir_with_files = '/home/my_files'
outdir = '/home/my_outdir'
num_workers = 5

# Instantiate DataConverter - exports data from MyExperiment to Parquet
converter = DataConverter(
    file_reader=MyReader(),
    save_method=ParquetWriter(),
    extractors=[MyExtractor('hits'), MyExtractor('truth')],
    outdir=outdir,
    num_workers=num_workers,
)

# Run Converter
converter(input_dir=dir_with_files)

# Merge files (Optional)
converter.merge_files()
```

--------------------------------

### Training GraphNeT StandardModel using `model.fit` (Python)

Source: https://github.com/graphnet-team/graphnet/blob/main/docs/source/models/models.rst

This example illustrates the simplified training syntax for GraphNeT models inheriting from `StandardModel`. The `model.fit` method provides an `sklearn`-like interface for training, abstracting away much of the boilerplate code typically required for PyTorch-Lightning based training loops. It accepts a `train_dataloader` and training parameters like `max_epochs`.

```python
model = Model(...)
train_dataloader = DataLoader(...)
model.fit(train_dataloader=train_dataloader, max_epochs=10)
```

--------------------------------

### Configuring PyTorch and PyG CPU Wheel Find Links

Source: https://github.com/graphnet-team/graphnet/blob/main/requirements/torch_cpu.txt

This snippet provides `find-links` arguments, typically used with `pip install -r requirements.txt` or directly on the command line, to specify alternative locations for package wheels. It points to CPU-specific builds of PyTorch and PyTorch Geometric, ensuring compatibility for non-GPU environments. These links are crucial for resolving dependencies when standard PyPI packages are not suitable or available.

```Configuration
--find-links https://download.pytorch.org/whl/cpu
--find-links https://data.pyg.org/whl/torch-2.2.0+cpu.html
```

--------------------------------

### Configuring DataConverter for LiquidO H5 to Parquet Conversion (Python)

Source: https://github.com/graphnet-team/graphnet/blob/main/docs/source/data_conversion/data_conversion.rst

This snippet demonstrates how to configure and use GraphNeT's `DataConverter` to convert `.h5` files from the LiquidO experiment into `.parquet` format. It initializes the converter with a `LiquidOReader`, `ParquetWriter`, and specific extractors, then executes the conversion and optionally merges the output files. This setup enables parallel processing of files.

```python
from graphnet.data.extractors.liquido import H5HitExtractor, H5TruthExtractor
from graphnet.data.dataconverter import DataConverter
from graphnet.data.readers import LiquidOReader
from graphnet.data.writers import ParquetWriter

# Your settings
dir_with_files = '/home/my_files'
outdir = '/home/my_outdir'
num_workers = 5

# Instantiate DataConverter - exports data from LiquidO to Parquet
converter = DataConverter(
    file_reader=LiquidOReader(),
    save_method=ParquetWriter(),
    extractors=[H5HitExtractor(), H5TruthExtractor()],
    outdir=outdir,
    num_workers=num_workers,
)

# Run Converter
converter(input_dir=dir_with_files)

# Merge files (Optional)
converter.merge_files()
```

--------------------------------

### Utilizing GraphNeT's Logger Class for Custom Messages (Python)

Source: https://github.com/graphnet-team/graphnet/blob/main/docs/source/getting_started/getting_started.md

This snippet demonstrates how to instantiate and use GraphNeT's `Logger` class to output various types of messages to the terminal and log files. It shows examples of `info`, `warning`, `warning_once`, `debug`, `error`, and `critical` logging levels, providing flexibility for custom logging within GraphNeT applications.

```python
from graphnet.utilities.logging import Logger

logger = Logger()

logger.info("My very informative message")
logger.warning("My warning shown every time")
logger.warning_once("My warning shown once")
logger.debug("My debug call")
logger.error("My error")
logger.critical("My critical call")
```

--------------------------------

### Combining Multiple GraphNeT Datasets with EnsembleDataset (Python)

Source: https://github.com/graphnet-team/graphnet/blob/main/docs/source/datasets/datasets.rst

This example illustrates the use of `EnsembleDataset` to merge various GraphNeT `Dataset` instances, such as `SQLiteDataset` and `ParquetDataset`, into a single, cohesive dataset. This allows for unified data access across different storage formats.

```python
from graphnet.data import EnsembleDataset
from graphnet.data.parquet import ParquetDataset
from graphnet.data.sqlite import SQLiteDataset

dataset_1 = SQLiteDataset(...)
dataset_2 = SQLiteDataset(...)
dataset_3 = ParquetDataset(...)

ensemble_dataset = EnsembleDataset([dataset_1, dataset_2, dataset_3])
```

--------------------------------

### Specifying GPU Dependencies for Python Packages

Source: https://github.com/graphnet-team/graphnet/blob/main/requirements/torch_cu118.txt

This snippet lists the necessary Python packages and their versions for GPU-enabled installations, specifically for PyTorch and torchvision. It also includes `--find-links` to custom wheel repositories to ensure compatibility with the specified CUDA version (cu118).

```Python Requirements
--find-links https://download.pytorch.org/whl/torch_stable.html
torch==2.2.0+cu118
torchvision==0.17.0+cu118
--find-links https://data.pyg.org/whl/torch-2.2.0+cu118.html
```

--------------------------------

### Training GraphNeT Models with PyTorch-Lightning (Python)

Source: https://github.com/graphnet-team/graphnet/blob/main/docs/source/getting_started/getting_started.md

This snippet shows how to train GraphNeT `Model`s using PyTorch-Lightning, leveraging its `Trainer` class for more granular control over the training loop. It configures the `Trainer` with parameters like GPU usage, max epochs, callbacks (e.g., `ProgressBar`), and logging, then initiates training by calling `trainer.fit`.

```python
from pytorch_lightning import Trainer

from graphnet.training.callbacks import ProgressBar

model = Model(...)
train_dataloader = DataLoader(...)

# Configure Trainer
trainer = Trainer(
    gpus=None,
    max_epochs=10,
    callbacks=[ProgressBar()],
    log_every_n_steps=1,
    logger=None,
    strategy="ddp",
)

# Train model
trainer.fit(model, train_dataloader)
```

--------------------------------

### Loading GraphNeT Model from Configuration and State Dictionary (Python)

Source: https://github.com/graphnet-team/graphnet/blob/main/docs/source/getting_started/getting_started.md

This snippet shows how to reconstruct a GraphNeT `Model` by first loading its definition from a `ModelConfig` YAML file, which initializes the model with random weights. Subsequently, the trained weights are loaded from a `.pth` file using `load_state_dict`, restoring the complete trained model.

```python
from graphnet.models import Model
from graphnet.utilities.config import ModelConfig

model_config = ModelConfig.load("model.yml")
model = Model.from_config(model_config)  # With randomly initialised weights.
model.load_state_dict("state_dict.pth")  # Now with trained weights.
```

--------------------------------

### Logging Configuration to Weights & Biases (Python)

Source: https://github.com/graphnet-team/graphnet/blob/main/docs/source/getting_started/getting_started.md

This snippet illustrates how to log various configuration objects (training, model, and dataset configurations) to the Weights & Biases experiment run. By updating `wandb_logger.experiment.config`, these configurations are saved as artifacts, enhancing reproducibility and transparency of experiments.

```python
wandb_logger.experiment.config.update(training_config)
wandb_logger.experiment.config.update(model_config.as_dict())
wandb_logger.experiment.config.update(dataset_config.as_dict())
```

--------------------------------

### Loading Multiple Datasets from Config in Python

Source: https://github.com/graphnet-team/graphnet/blob/main/docs/source/getting_started/getting_started.md

This snippet shows how to load datasets previously defined and saved in a `dataset.yml` configuration file. The `Dataset.from_config` method returns a dictionary where keys correspond to the named selections (e.g., 'train', 'test') and values are the respective `Dataset` objects, facilitating access to the pre-defined splits.

```python
datasets = Dataset.from_config("dataset.yml")
>>> datasets
{"train": Dataset(...), "test": Dataset(...),}
```

--------------------------------

### Training GraphNeT Models with Built-in Fit Method (Python)

Source: https://github.com/graphnet-team/graphnet/blob/main/docs/source/getting_started/getting_started.md

This snippet demonstrates the simplified training process for GraphNeT `Model`s using the in-built `fit` method, similar to `sklearn`. It requires an initialized `Model` instance and a `DataLoader` for training data, and trains the model for a specified number of epochs.

```python
model = Model(...)
train_dataloader = DataLoader(...)
model.fit(train_dataloader=train_dataloader, max_epochs=10)
```

--------------------------------

### Defining Multiple Datasets with Selections in Python

Source: https://github.com/graphnet-team/graphnet/blob/main/docs/source/getting_started/getting_started.md

This snippet demonstrates how to define multiple datasets (e.g., 'train' and 'test') from the same data source using a single `DatasetConfig` file. It assigns different selection criteria to each dataset based on event numbers, then dumps the configuration to a YAML file. This allows for easy recreation of these specific dataset splits.

```python
dataset = Dataset(...)
dataset.config.selection = {
    "train": "event_no % 2 == 0",
    "test": "event_no % 2 == 1",
}
dataset.config.dump("dataset.yml")
```

--------------------------------

### Loading GraphNeT Model from PyTorch-Lightning Checkpoint (Python)

Source: https://github.com/graphnet-team/graphnet/blob/main/docs/source/getting_started/getting_started.md

This snippet illustrates how to load a GraphNeT `Model` using PyTorch-Lightning's `load_from_checkpoint` method. It first builds the model structure from a `ModelConfig` and then directly loads the entire model state, including trained weights, from a `.ckpt` checkpoint file.

```python
model_config = ModelConfig.load("model.yml")
model = Model.from_config(model_config)  # With randomly initialised weights.
model.load_from_checkpoint("checkpoint.ckpt")  # Now with trained weights.
```

--------------------------------

### Build Docker Image for GraphNeT Benchmarking (Bash)

Source: https://github.com/graphnet-team/graphnet/blob/main/docker/NOTES.md

This command builds a Docker image named 'graphnet-benchmarking-image' using the Dockerfile located in the 'benchmarking/' directory. It tags the image for easy reference and future use.

```bash
$ docker build -f benchmarking/dockerfile -t graphnet-benchmarking-image benchmarking/
```

--------------------------------

### Recreating GraphNeT Dataset from YAML Configuration

Source: https://github.com/graphnet-team/graphnet/blob/main/docs/source/getting_started/getting_started.md

This snippet shows how to recreate a GraphNeT `Dataset` instance from a previously exported YAML configuration file using `Dataset.from_config()`. This method ensures that the dataset is loaded with the exact same settings, promoting reproducibility across different sessions or environments.

```python
from graphnet.data.dataset import Dataset

dataset = Dataset.from_config("dataset.yml")
```

--------------------------------

### Integrating Weights & Biases for Experiment Tracking (Python)

Source: https://github.com/graphnet-team/graphnet/blob/main/docs/source/getting_started/getting_started.md

This snippet demonstrates how to integrate Weights & Biases (W&B) for experiment tracking with GraphNeT models. It initializes a `WandbLogger` instance, specifying project, entity, and save directory, and then passes this logger to the `model.fit` method to enable automatic logging of training metrics and model artifacts to W&B.

```python
import os

from pytorch_lightning.loggers import WandbLogger

# Create wandb directory
wandb_dir = "./wandb/"
os.makedirs(wandb_dir, exist_ok=True)

# Initialise Weights & Biases (W&B) run
wandb_logger = WandbLogger(
    project="example-script",
    entity="graphnet-team",
    save_dir=wandb_dir,
    log_model=True,
)

# Fit Model
model = Model(...)
model.fit(
    ...,
    logger=wandb_logger,
)
```

--------------------------------

### Exporting GraphNeT Dataset Configuration to YAML

Source: https://github.com/graphnet-team/graphnet/blob/main/docs/source/getting_started/getting_started.md

This snippet demonstrates how to export the configuration of a GraphNeT `Dataset` instance to a YAML file using `dataset.config.dump()`.
This file captures details like input data paths, loaded tables/columns, and applied selections, enabling reproducible dataset creation.

```python
dataset = Dataset(...)
dataset.config.dump("dataset.yml")
```

--------------------------------

### Loading GraphNeT Model from YAML Configuration (Python)

Source: https://github.com/graphnet-team/graphnet/blob/main/docs/source/getting_started/getting_started.md

This snippet demonstrates how to reconstruct a GraphNeT `Model` architecture from a previously saved YAML configuration file. The `trust=True` argument is required because it allows classes referenced within the configuration file to be loaded dynamically, enabling the recreation of complex model structures.

```python
from graphnet.models import Model

# Indicate that you `trust` the config file after inspecting it, to allow for
# dynamically loading classes referenced in the file.
model = Model.from_config("model.yml", trust=True)
```

--------------------------------

### Loading GraphNeT Model with Built-in Load Method (Python)

Source: https://github.com/graphnet-team/graphnet/blob/main/docs/source/getting_started/getting_started.md

This snippet demonstrates how to load a previously saved GraphNeT model using the `Model.load` classmethod. It reconstructs the entire model object from the specified file path, making it ready for inference or further training.

```python
from graphnet.models import Model

loaded_model = Model.load("model.pth")
```

--------------------------------

### Saving GraphNeT Model Configuration and State Dictionary (Python)

Source: https://github.com/graphnet-team/graphnet/blob/main/docs/source/getting_started/getting_started.md

This snippet demonstrates how to save the model's configuration to a YAML file and its trained weights (state dictionary) to a PyTorch `.pth` file. This method is recommended for version-proof model persistence, separating the model definition from its learned parameters.
```python
model.save_config('model.yml')
model.save_state_dict('state_dict.pth')
```

--------------------------------

### Run GraphNet Benchmarking Docker Container with Mounted Data (Bash)

Source: https://github.com/graphnet-team/graphnet/blob/main/docker/NOTES.md

This command runs the 'graphnet-benchmarking-image' Docker container, mounting a local 'inference_data/' directory to '/data/' inside the container. It executes the 'apply.py' script within the container, processing input from '/data/input' and saving output to '/data/output'. The script assumes the mounted directory has an 'input/' directory and will create an 'output/' directory for results.

```bash
$ docker run --rm -it --mount type=bind,source=inference_data/,target=/data/ --name graphnet-benchmarking-container graphnet-benchmarking-image 'python apply.py /data/input /data/output graphnet_zenith 50'
```

--------------------------------

### Initializing Weights & Biases Logger and Fitting Model (Python)

Source: https://github.com/graphnet-team/graphnet/blob/main/docs/source/models/models.rst

This snippet demonstrates how to set up `WandbLogger` for experiment tracking in GraphNeT. It initializes a W&B run, creates a dedicated directory for logs, and then integrates the logger when fitting a `Model` instance. This enables automatic logging of training and validation metrics.

```python
import os

from pytorch_lightning.loggers import WandbLogger

# Create wandb directory
wandb_dir = "./wandb/"
os.makedirs(wandb_dir, exist_ok=True)

# Initialise Weights & Biases (W&B) run
wandb_logger = WandbLogger(
    project="example-script",
    entity="graphnet-team",
    save_dir=wandb_dir,
    log_model=True,
)

# Fit Model
model = Model(...)
model.fit(
    ...,
    logger=wandb_logger,
)
```

--------------------------------

### Loading Multiple Datasets from Config in GraphNeT (Python)

Source: https://github.com/graphnet-team/graphnet/blob/main/docs/source/datasets/datasets.rst

This snippet demonstrates how loading a `DatasetConfig` with multiple selections results in a dictionary of `Dataset` objects. When `Dataset.from_config()` is called on a configuration file containing named selections, it returns a dictionary where keys are selection names and values are the corresponding `Dataset` instances, allowing easy access to different data subsets.

```python
datasets = Dataset.from_config("dataset.yml")
>>> datasets
{"train": Dataset(...), "test": Dataset(...),}
```

--------------------------------

### Loading Dataset from Configuration in GraphNeT (Python)

Source: https://github.com/graphnet-team/graphnet/blob/main/docs/source/datasets/datasets.rst

This snippet shows how to recreate a `Dataset` object from a previously saved `DatasetConfig` YAML file. The `Dataset.from_config()` static method reads the configuration from `dataset.yml`, ensuring that the dataset is initialized with the exact same settings as when it was exported.

```python
from graphnet.data.dataset import Dataset

dataset = Dataset.from_config("dataset.yml")
```

--------------------------------

### Saving GraphNeT Model Configuration to YAML (Python)

Source: https://github.com/graphnet-team/graphnet/blob/main/docs/source/getting_started/getting_started.md

This snippet illustrates how to save the architectural configuration of a GraphNeT `Model` instance to a YAML file. This allows for the model's definition to be stored and recreated in different sessions, ensuring reproducibility of the model's structure without its trained weights.

```python
model = Model(...)
model.save_config("model.yml")
```

--------------------------------

### Logging Configuration Files with Weights & Biases (Python)

Source: https://github.com/graphnet-team/graphnet/blob/main/docs/source/models/models.rst

This snippet shows how to log various configuration objects (`training_config`, `model_config`, `dataset_config`) to Weights & Biases using the `wandb_logger.experiment.config.update()` method. This practice improves reproducibility and transparency by saving the experiment's parameters alongside the run.

```python
wandb_logger.experiment.config.update(training_config)
wandb_logger.experiment.config.update(model_config.as_dict())
wandb_logger.experiment.config.update(dataset_config.as_dict())
```

--------------------------------

### Loading GraphNeT Model from Configuration and State Dictionary in Python

Source: https://github.com/graphnet-team/graphnet/blob/main/docs/source/models/models.rst

This snippet demonstrates how to reconstruct a GraphNeT model from a saved `ModelConfig` file and then load its trained weights from a `state_dict`. This two-step process loads the model's definition first and its parameters second, so the two can be versioned independently.

```python
from graphnet.models import Model
from graphnet.utilities.config import ModelConfig

model_config = ModelConfig.load("model.yml")
model = Model.from_config(model_config)  # With randomly initialised weights.
model.load_state_dict("state_dict.pth")  # Now with trained weights.
```

--------------------------------

### Referencing External Selection Files in Python

Source: https://github.com/graphnet-team/graphnet/blob/main/docs/source/getting_started/getting_started.md

This snippet shows how to define dataset selections by referencing external CSV or JSON files. This approach allows for managing complex or large selection criteria outside the main configuration file, promoting reusability and easier updates of selection logic.
```python
dataset.config.selection = {
    "train": "50000 random events ~ train_selection.csv",
    "test": "test_selection.csv",
}
```

--------------------------------

### Saving GraphNeT Model with Built-in Save Method (Python)

Source: https://github.com/graphnet-team/graphnet/blob/main/docs/source/getting_started/getting_started.md

This snippet shows how to save an entire GraphNeT model, including its `state_dict`, using the convenient `model.save` method. The model is serialized to the specified file path, allowing for easy persistence and later retrieval.

```python
model.save("model.pth")
```

--------------------------------

### Combining Multiple GraphNeT Datasets with EnsembleDataset

Source: https://github.com/graphnet-team/graphnet/blob/main/docs/source/getting_started/getting_started.md

This snippet illustrates how to combine multiple GraphNeT `Dataset` instances (e.g., `SQLiteDataset`, `ParquetDataset`) into a single `EnsembleDataset`. This class allows for seamless aggregation and iteration over data from diverse sources, treating them as a unified dataset.

```python
from graphnet.data import EnsembleDataset
from graphnet.data.parquet import ParquetDataset
from graphnet.data.sqlite import SQLiteDataset

dataset_1 = SQLiteDataset(...)
dataset_2 = SQLiteDataset(...)
dataset_3 = ParquetDataset(...)

ensemble_dataset = EnsembleDataset([dataset_1, dataset_2, dataset_3])
```

--------------------------------

### Defining a Basic PyTorch `nn.Module` (Python)

Source: https://github.com/graphnet-team/graphnet/blob/main/docs/source/models/models.rst

This snippet presents a fundamental PyTorch neural network module, `MyModel`, inheriting from `torch.nn.Module`. It initializes a simple linear layer in its constructor and defines a `forward` method that applies this layer to an input tensor, demonstrating the basic structure of a PyTorch model.
```python
import torch


class MyModel(torch.nn.Module):

    def __init__(self, input_dim: int = 5, output_dim: int = 10):
        super().__init__()
        self._layer = torch.nn.Linear(input_dim, output_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self._layer(x)
```

--------------------------------

### Loading GraphNeT Model from PyTorch-Lightning Checkpoint in Python

Source: https://github.com/graphnet-team/graphnet/blob/main/docs/source/models/models.rst

This snippet shows how to load a GraphNeT model using PyTorch-Lightning's `load_from_checkpoint` method. This approach uses Lightning's built-in checkpointing to restore a model, including its trained weights, from a `.ckpt` file, and is often used for resuming training or running inference.

```python
model_config = ModelConfig.load("model.yml")
model = Model.from_config(model_config)  # With randomly initialised weights.
model.load_from_checkpoint("checkpoint.ckpt")  # Now with trained weights.
```

--------------------------------

### Implementing MyReader for Pickle Files in GraphNeT (Python)

Source: https://github.com/graphnet-team/graphnet/blob/main/docs/source/integration/integration.rst

This Python class, `MyReader`, extends `GraphNeTFileReader` to handle data stored in `.pickle` files from 'MyExperiment'. It defines accepted file extensions and extractors, and its `__call__` method opens a specified pickle file, loads its content, and applies registered `Extractor` instances to process the data, returning a dictionary of extracted pandas DataFrames.
```python
from typing import List, Union, Dict

import pandas as pd
import pickle

# Import the generic file reader
from .graphnet_file_reader import GraphNeTFileReader

# Import your own extractor
from graphnet.data.extractors.myexperiment import MyExtractor


class MyReader(GraphNeTFileReader):
    """A class for reading my pickle files from MyExperiment."""

    _accepted_file_extensions = [".pickle"]
    _accepted_extractors = [MyExtractor]

    def __call__(self, file_path: str) -> Dict[str, pd.DataFrame]:
        """Extract data from single pickle file.

        Args:
            file_path: Path to pickle file.

        Returns:
            Extracted data.
        """
        # Open the file in binary mode (pickle reads bytes) and load it
        with open(file_path, "rb") as file:
            data = pickle.load(file)

        # Apply extractors
        outputs = {}
        for extractor in self._extractors:
            output = extractor(data)
            if output is not None:
                outputs[extractor._extractor_name] = output
        return outputs
```

--------------------------------

### Selecting Random Subsets of Data in Python

Source: https://github.com/graphnet-team/graphnet/blob/main/docs/source/getting_started/getting_started.md

This snippet demonstrates how to select a random subset of events from a dataset using the `DatasetConfig`. The `N random events ~ ` syntax allows specifying a fixed number of random events that also satisfy a given condition, useful for creating smaller, representative datasets for testing or development.

```python
dataset = Dataset(...)
dataset.config.selection = "1000 random events ~ abs(injection_type) == 14"
```

--------------------------------

### Adding Custom Labels to a GraphNeT Dataset

Source: https://github.com/graphnet-team/graphnet/blob/main/docs/source/getting_started/getting_started.md

This snippet shows how to integrate a previously defined custom label (e.g., `MyCustomLabel`) into a GraphNeT `Dataset` instance using the `add_label` method. After adding, the custom label can be accessed like any other feature from a graph object retrieved from the dataset.
```python
dataset.add_label(MyCustomLabel())
graph = dataset[0]
graph["my_custom_label"]
>>> ...
```
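
The pattern in the custom-label snippet (a label object registered on a dataset, then read back from each graph by key) can be sketched without GraphNeT installed. The block below is a plain-Python stand-in, not GraphNeT's implementation: `ToyDataset`, the `charge` field, and the summing rule inside `MyCustomLabel` are all hypothetical, and GraphNeT's real `Label` base class and `Dataset.add_label` may behave differently.

```python
from typing import Any, Callable, Dict, List


class MyCustomLabel:
    """Hypothetical label: a callable that derives one value from a graph."""

    # Key under which the computed value is attached to each graph.
    key = "my_custom_label"

    def __call__(self, graph: Dict[str, Any]) -> float:
        # Illustrative rule only: sum the per-hit charges stored on the graph.
        return float(sum(graph["charge"]))


class ToyDataset:
    """Stand-in for a dataset; NOT GraphNeT's `Dataset` class."""

    def __init__(self, events: List[Dict[str, Any]]) -> None:
        self._events = events
        self._labels: Dict[str, Callable[[Dict[str, Any]], Any]] = {}

    def add_label(self, label: MyCustomLabel) -> None:
        # Register the label under its key.
        self._labels[label.key] = label

    def __getitem__(self, index: int) -> Dict[str, Any]:
        # Copy the stored event, then attach every registered label to it.
        graph = dict(self._events[index])
        for key, fn in self._labels.items():
            graph[key] = fn(graph)
        return graph


dataset = ToyDataset([{"charge": [0.5, 1.5, 2.0]}, {"charge": [1.0]}])
dataset.add_label(MyCustomLabel())
graph = dataset[0]
print(graph["my_custom_label"])
```

In this sketch the label is evaluated each time an item is fetched, so a label added after the dataset is constructed still appears on every subsequently retrieved graph.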