### Train GFlowNet with Python Loop

Source: https://github.com/gfnorg/torchgfn/blob/master/docs/source/guides/example.md

This code trains a GFlowNet for 1000 iterations, sampling 16 trajectories per iteration. It uses tqdm for progress tracking and steps the optimizer after each backward pass. Dependencies include torch, tqdm, and the torchgfn sampler and loss functions.

```Python
for i in (pbar := tqdm(range(1000))):
    # Sample trajectories off-policy with a tempered distribution.
    # Log probabilities are omitted; estimator outputs are saved for efficiency.
    trajectories = sampler.sample_trajectories(
        env=env,
        n=16,
        save_logprobs=False,
        save_estimator_outputs=True,
        temperature=1.5,
    )
    optimizer.zero_grad()
    loss = gfn.loss(env, trajectories)
    loss.backward()
    optimizer.step()
    if i % 25 == 0:
        pbar.set_postfix({"loss": loss.item()})
```

--------------------------------

### Install torchgfn with Development Dependencies (Bash)

Source: https://github.com/gfnorg/torchgfn/blob/master/docs/source/README.md

Installs torchgfn with an optional dependency set: 'dev' for development, 'scripts' for running examples, or 'all' for a complete installation. The command below selects the 'scripts' set.

```bash
pip install torchgfn[scripts]
```

--------------------------------

### TorchGFN Environment Setup

Source: https://github.com/gfnorg/torchgfn/blob/master/tutorials/notebooks/intro_discrete.ipynb

This code snippet sets up the Python environment for using the torchgfn library. It imports the modules for actions, environments, estimators, GFlowNet models, preprocessors, states, and utilities. Dependencies include the `gfn` library and its submodules.
```python
from typing import ClassVar, Tuple, cast

import gfn
from gfn.actions import Actions
from gfn.env import DiscreteEnv
from gfn.estimators import DiscretePolicyEstimator, ScalarEstimator
from gfn.gflownet import FMGFlowNet, TBGFlowNet
from gfn.preprocessors import IdentityPreprocessor
from gfn.states import DiscreteStates
from gfn.utils.modules import MLP
```

--------------------------------

### Setup Flow Matching Estimator and FMGFlowNet in Python

Source: https://github.com/gfnorg/torchgfn/blob/master/tutorials/notebooks/intro_discrete.ipynb

Creates an MLP module to estimate log edge flows, wraps it in a `DiscretePolicyEstimator`, and constructs an `FMGFlowNet` with this estimator. An Adam optimizer is instantiated, and the training loop is invoked to train the model on the face environment.

```python
# nn.Module that estimates _log_ edge flows.
module = MLP(
    input_dim=env.state_shape[-1],
    output_dim=env.n_actions,
    hidden_dim=n_hid_units,
    n_hidden_layers=1,
)

# This is our _log_ edge flow estimator.
estimator = DiscretePolicyEstimator(
    module=module,
    n_actions=env.n_actions,
)

# The gflownet class wraps our estimator (including sampler functionality).
gflownet = FMGFlowNet(estimator)
optimizer = torch.optim.Adam(gflownet.parameters(), lr=learning_rate)
# TODO: Verify.

visited_terminating_states, states_visited, losses = train(
    gflownet,
    optimizer,
    env,
    n_episodes=n_episodes * 10,
)
```

--------------------------------

### Setup experiment models and optimizer in Python

Source: https://github.com/gfnorg/torchgfn/blob/master/tutorials/notebooks/intro_continuous.ipynb

Creates forward and backward MLP models with a configurable hidden dimension, initializes a log-partition parameter, and configures an Adam optimizer with separate learning rates for the models and logZ. This prepares all learnable components for GFlowNet training.

```python
def setup_experiment(hid_dim=64, lr_model=1e-3, lr_logz=1e-1):
    """Generate the learned parameters and optimizer for an experiment.
    Forward and backward models are MLPs with two hidden layers. logZ is
    a single parameter. Note that we give logZ a higher learning rate, which is
    a common trick used when utilizing Trajectory Balance.
    """
    # Input = [x_position, n_steps], Output = [mus, standard_deviations].
    forward_model = torch.nn.Sequential(
        torch.nn.Linear(2, hid_dim),
        torch.nn.ELU(),
        torch.nn.Linear(hid_dim, hid_dim),
        torch.nn.ELU(),
        torch.nn.Linear(hid_dim, 2),
    ).to(device)

    backward_model = torch.nn.Sequential(
        torch.nn.Linear(2, hid_dim),
        torch.nn.ELU(),
        torch.nn.Linear(hid_dim, hid_dim),
        torch.nn.ELU(),
        torch.nn.Linear(hid_dim, 2),
    ).to(device)

    logZ = torch.nn.Parameter(torch.tensor(0.0, device=device))

    optimizer = torch.optim.Adam(
        [
            {'params': forward_model.parameters(), 'lr': lr_model},
            {'params': backward_model.parameters(), 'lr': lr_model},
            {'params': [logZ], 'lr': lr_logz},
        ]
    )

    return (forward_model, backward_model, logZ, optimizer)
```

--------------------------------

### Define step and state initialization functions in Python

Source: https://github.com/gfnorg/torchgfn/blob/master/tutorials/notebooks/intro_continuous.ipynb

Provides functions to perform a forward step in the environment and to initialize the starting state for a batch. `step` adds the action to the position and increments the step counter; `initalize_state` sets the initial x_position from the environment configuration.

```python
def step(x, action):
    """Takes a forward step in the environment."""
    new_x = torch.zeros_like(x)
    new_x[:, 0] = x[:, 0] + action  # TODO: Complete - add action delta.
    new_x[:, 1] = x[:, 1] + 1  # TODO: Complete - increment step counter.

    return new_x


def initalize_state(batch_size, device, env, randn=False):
    """Trajectory starts at state = (X_0, t=0)."""
    x = torch.zeros((batch_size, 2), device=device)
    x[:, 0] = env.init_value  # TODO: Complete.

    return x
```

--------------------------------

### Training Trajectory Balance GFlowNet in Python

Source: https://github.com/gfnorg/torchgfn/blob/master/docs/source/guides/example.md

This code sets up a HyperGrid environment and trains a Trajectory Balance GFlowNet using forward and backward policy estimators that share an MLP trunk. It requires torch, tqdm, and the torchgfn library; inputs are environment states, and outputs are trained policies for sampling trajectories. The example uses on-policy sampling and fixed learning rates; it suits discrete action spaces but may need adjustments for more complex environments.

```Python
import torch
from tqdm import tqdm

from gfn.gflownet import TBGFlowNet
from gfn.gym import HyperGrid  # We use the hyper grid environment
from gfn.preprocessors import KHotPreprocessor
from gfn.modules import DiscretePolicyEstimator
from gfn.samplers import Sampler
from gfn.utils.modules import MLP  # MLP is a simple multi-layer perceptron

# 1 - We define the environment.
env = HyperGrid(ndim=4, height=8)  # Grid of size 8x8x8x8
preprocessor = KHotPreprocessor(ndim=env.ndim, height=env.height)

# 2 - We define the needed modules (neural networks).
input_dim = preprocessor.output_dim if preprocessor.output_dim is not None else env.state_shape[-1]
module_PF = MLP(
    input_dim=input_dim,
    output_dim=env.n_actions
)  # Neural network for the forward policy, with as many outputs as there are actions

module_PB = MLP(
    input_dim=input_dim,
    output_dim=env.n_actions - 1,
    trunk=module_PF.trunk  # We share all the parameters of P_F and P_B, except for the last layer
)

# 3 - We define the estimators.
pf_estimator = DiscretePolicyEstimator(module_PF, env.n_actions, is_backward=False, preprocessor=preprocessor)
pb_estimator = DiscretePolicyEstimator(module_PB, env.n_actions, is_backward=True, preprocessor=preprocessor)

# 4 - We define the GFlowNet.
gfn = TBGFlowNet(pf=pf_estimator, pb=pb_estimator, init_logZ=0.0)  # We initialize logZ to 0

# 5 - We define the sampler and the optimizer.
sampler = Sampler(estimator=pf_estimator)  # We use an on-policy sampler, based on the forward policy

# Different policy parameters can have their own LR.
# Log Z gets dedicated learning rate (typically higher).
optimizer = torch.optim.Adam(gfn.pf_pb_parameters(), lr=1e-3)
optimizer.add_param_group({"params": gfn.logz_parameters(), "lr": 1e-1})

# 6 - We train the GFlowNet for 1000 iterations, with 16 trajectories per iteration
for i in (pbar := tqdm(range(1000))):
    # save_logprobs=True makes on-policy training faster
    trajectories = sampler.sample_trajectories(env=env, n=16, save_logprobs=True)
    optimizer.zero_grad()
    loss = gfn.loss(env, trajectories)
    loss.backward()
    optimizer.step()
    if i % 25 == 0:
        pbar.set_postfix({"loss": loss.item()})
```

--------------------------------

### Setup Policy Estimators and TBGFlowNet for Trajectory Balance

Source: https://github.com/gfnorg/torchgfn/blob/master/tutorials/notebooks/intro_discrete.ipynb

Initializes MLP modules and DiscretePolicyEstimators for forward and backward policies, then combines them into a TBGFlowNet. This is a prerequisite for training a GFlowNet using the trajectory balance objective.

```python
# nn.Modules for the forward and backward policy estimators.
pf_module = MLP(
    input_dim=env.state_shape[-1],
    output_dim=env.n_actions,
    hidden_dim=n_hid_units,
    n_hidden_layers=1,
)
pb_module = MLP(
    input_dim=env.state_shape[-1],
    output_dim=env.n_actions - 1,
    hidden_dim=n_hid_units,
    n_hidden_layers=1,
)

# Estimators for the forward and backward policies.
pf_estimator = DiscretePolicyEstimator(
    module=pf_module,
    n_actions=env.n_actions,
)
pb_estimator = DiscretePolicyEstimator(
    module=pb_module,
    n_actions=env.n_actions,
    is_backward=True,
)

# Our trajectory balance gflownet accepts both policy estimators.
gflownet = TBGFlowNet(
    pf=pf_estimator,
    pb=pb_estimator,
)
```

--------------------------------

### Print Final Partition Function Estimate

Source: https://github.com/gfnorg/torchgfn/blob/master/tutorials/notebooks/intro_discrete.ipynb

Retrieves the learned 'logZ' parameter from the GFlowNet object, exponentiates it to get the partition function estimate, and prints the result formatted to two decimal places.

```python
print("The partition function estimate is Z={:.2f}".format(
    torch.exp(gflownet.logZ).item()
    )
)
```

--------------------------------

### Training Sub Trajectory Balance GFlowNet in Python

Source: https://github.com/gfnorg/torchgfn/blob/master/docs/source/guides/example.md

This code configures a Sub Trajectory Balance GFlowNet with an additional scalar estimator for logF, using policy MLPs that share a trunk in a HyperGrid environment. Dependencies include torch, tqdm, and torchgfn; states pass through a preprocessor, and the estimators output policy logits and log state flows. Points to watch are the `lamda` parameter, which weights sub-trajectories, and the separate learning rate for the logF parameters.

```Python
import torch
from tqdm import tqdm

from gfn.gflownet import SubTBGFlowNet
from gfn.gym import HyperGrid  # We use the hyper grid environment
from gfn.preprocessors import KHotPreprocessor
from gfn.modules import DiscretePolicyEstimator, ScalarEstimator
from gfn.samplers import Sampler
from gfn.utils.modules import MLP  # MLP is a simple multi-layer perceptron (MLP)

# 1 - We define the environment.
env = HyperGrid(ndim=4, height=8)  # Grid of size 8x8x8x8
preprocessor = KHotPreprocessor(ndim=env.ndim, height=env.height)

# 2 - We define the needed modules (neural networks).
# The preprocessor encodes states before they are fed to the estimators.
input_dim = preprocessor.output_dim if preprocessor.output_dim is not None else env.state_shape[-1]
module_PF = MLP(
    input_dim=input_dim,
    output_dim=env.n_actions
)  # Neural network for the forward policy, with as many outputs as there are actions

module_PB = MLP(
    input_dim=input_dim,
    output_dim=env.n_actions - 1,
    trunk=module_PF.trunk  # We share all the parameters of P_F and P_B, except for the last layer
)
module_logF = MLP(
    input_dim=input_dim,
    output_dim=1,  # Important for ScalarEstimators!
)

# 3 - We define the estimators.
pf_estimator = DiscretePolicyEstimator(module_PF, env.n_actions, is_backward=False, preprocessor=preprocessor)
pb_estimator = DiscretePolicyEstimator(module_PB, env.n_actions, is_backward=True, preprocessor=preprocessor)
# Use the same preprocessor as the policies so the input matches input_dim.
logF_estimator = ScalarEstimator(module=module_logF, preprocessor=preprocessor)

# 4 - We define the GFlowNet.
gfn = SubTBGFlowNet(pf=pf_estimator, pb=pb_estimator, logF=logF_estimator, lamda=0.9)

# 5 - We define the sampler and the optimizer.
sampler = Sampler(estimator=pf_estimator)

# Different policy parameters can have their own LR.
# Log F gets dedicated learning rate (typically higher).
optimizer = torch.optim.Adam(gfn.pf_pb_parameters(), lr=1e-3)
optimizer.add_param_group({"params": gfn.logF_parameters(), "lr": 1e-2})
```

--------------------------------

### Train GFlowNet with Trajectory Balance on HyperGrid

Source: https://context7.com/gfnorg/torchgfn/llms.txt

This code demonstrates a complete workflow for training a GFlowNet using Trajectory Balance loss on the HyperGrid environment. It covers environment setup, neural network definition, policy estimator creation, GFlowNet initialization, sampler setup, optimizer configuration, the training loop with sampling and loss computation, and finally, sample generation from the trained model.
Dependencies include torch, gfn.gflownet, gfn.gym, gfn.preprocessors, gfn.estimators, gfn.samplers, gfn.utils.modules, and gfn.utils.common.

```python
import torch

from gfn.gflownet import TBGFlowNet
from gfn.gym import HyperGrid
from gfn.preprocessors import KHotPreprocessor
from gfn.estimators import DiscretePolicyEstimator
from gfn.samplers import Sampler
from gfn.utils.modules import MLP
from gfn.utils.common import set_seed

# Set random seed
set_seed(42)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Create environment: 4D hypergrid with height 8
env = HyperGrid(
    ndim=4,
    height=8,
    reward_fn_str="original",
    reward_fn_kwargs={"R0": 0.1, "R1": 0.5, "R2": 2.0},
    device=device,
    calculate_partition=True,  # Calculate true log partition function
    store_all_states=True,  # Store all states for validation
    check_action_validity=True
)

print(f"Environment has {env.n_states} states")
print(f"Environment log partition: {env.log_partition}")

# Preprocessor: convert states to k-hot encoding
preprocessor = KHotPreprocessor(height=env.height, ndim=env.ndim)

# Define neural network modules with shared trunk
module_PF = MLP(
    input_dim=preprocessor.output_dim,
    output_dim=env.n_actions,
    hidden_dim=256,
    n_hidden_layers=2
)

module_PB = MLP(
    input_dim=preprocessor.output_dim,
    output_dim=env.n_actions - 1,  # No exit action in backward
    trunk=module_PF.trunk  # Share weights with forward policy
)

# Create policy estimators
pf_estimator = DiscretePolicyEstimator(
    module_PF,
    env.n_actions,
    preprocessor=preprocessor,
    is_backward=False
)

pb_estimator = DiscretePolicyEstimator(
    module_PB,
    env.n_actions,
    preprocessor=preprocessor,
    is_backward=True
)

# Create GFlowNet with Trajectory Balance loss
gflownet = TBGFlowNet(pf=pf_estimator, pb=pb_estimator, init_logZ=0.0)
gflownet = gflownet.to(device)

# Create sampler for trajectory generation
sampler = Sampler(estimator=pf_estimator)

# Setup optimizer with separate learning rates
optimizer = torch.optim.Adam(gflownet.pf_pb_parameters(), lr=1e-3)
optimizer.add_param_group({"params": gflownet.logz_parameters(), "lr": 1e-1})

# Training loop
for iteration in range(1000):
    # Sample trajectories with epsilon-greedy exploration
    trajectories = sampler.sample_trajectories(
        env,
        n=16,  # Batch size
        save_logprobs=True,
        epsilon=0.1  # 10% random actions
    )

    # Compute loss and backpropagate
    optimizer.zero_grad()
    loss = gflownet.loss(env, trajectories, recalculate_all_logprobs=False)
    loss.backward()

    # Gradient clipping and parameter updates
    torch.nn.utils.clip_grad_norm_(gflownet.parameters(), 1.0)
    optimizer.step()

    if iteration % 100 == 0:
        # The learned parameter is exposed as `logZ`, as in the other snippets.
        print(f"Iteration {iteration}: Loss = {loss.item():.4f}, "
              f"LogZ = {gflownet.logZ.item():.4f}")

# Generate samples from trained model
with torch.no_grad():
    final_trajectories = sampler.sample_trajectories(env, n=1000)
    terminating_states = final_trajectories.terminating_states
    # This counts all sampled terminal states; duplicates are not removed.
    print(f"Generated {len(terminating_states)} terminal states")
```

--------------------------------

### Initialize State in Python

Source: https://github.com/gfnorg/torchgfn/blob/master/tutorials/notebooks/intro_continuous.ipynb

This function initializes the starting state for a batch of trajectories, with the step counter at zero and the position set to the environment's initial value. It requires PyTorch and an environment object. Inputs are batch size, device, and env; the output is the initial state tensor of shape (batch_size, 2). Limitations: it assumes a two-column state layout (position, step count), and the `randn` argument is unused.

```python
def initalize_state(batch_size, device, env, randn=False):
    """Trajectory starts at state = (X_0, t=0)."""
    x = torch.zeros((batch_size, 2), device=device)
    x[:, 0] = env.init_value

    return x
```

--------------------------------

### Install torchgfn Core Package (Bash)

Source: https://github.com/gfnorg/torchgfn/blob/master/docs/source/README.md

Installs the latest stable version of the torchgfn package with its core dependencies using pip. This is the primary method for users to get started with the library.
```bash
pip install torchgfn
```

--------------------------------

### Import required libraries and configure device for GFlowNet (Python)

Source: https://github.com/gfnorg/torchgfn/blob/master/tutorials/notebooks/intro_graphs.ipynb

Sets up the Python environment by importing Torch, Matplotlib, and GFlowNet utilities. It also sets a fixed random seed for reproducibility and selects the CPU as the computation device. This snippet is required before building and training the graph model.

```python
import time

import matplotlib.pyplot as plt
import torch
from tensordict import TensorDict
from matplotlib import patches

from gfn.actions import GraphActionType
from gfn.containers import ReplayBuffer
from gfn.estimators import DiscreteGraphPolicyEstimator
from gfn.gflownet.trajectory_balance import TBGFlowNet
from gfn.gym.graph_building import GraphBuildingOnEdges
from gfn.states import GraphStates
from gfn.utils.common import set_seed
from gfn.utils.graphs import get_edge_indices
from gfn.utils.modules import GraphActionGNN

set_seed(7)
device = torch.device('cpu')
```

--------------------------------

### Configure GFlowNet Policies and Environment in Python

Source: https://github.com/gfnorg/torchgfn/blob/master/tutorials/notebooks/intro_graphs.ipynb

Initializes the seven-segment display environment, defines forward and backward policy estimators using GraphActionGNN, and sets up the TBGFlowNet for training. This prepares the GFlowNet model for learning to generate valid seven-segment display graphs.
```python
directed = False
n_nodes = 6  # 6 nodes for the seven-segment display

env = SevenSegmentGraphBuilding(
    n_nodes=n_nodes,
    state_evaluator=reward_function,
    directed=directed,
    device=device,
)

pf = DiscreteGraphPolicyEstimator(
    module=GraphActionGNN(
        num_node_classes=env.n_nodes,
        directed=directed,
        num_edge_classes=env.num_edge_classes,
    )
)
pb = DiscreteGraphPolicyEstimator(
    module=GraphActionGNN(
        num_node_classes=env.n_nodes,
        directed=directed,
        is_backward=True,
        num_edge_classes=env.num_edge_classes,
    ),
    is_backward=True,
)
gflownet = TBGFlowNet(pf, pb).to(device)
```

--------------------------------

### Define Custom Discrete Environment in Python

Source: https://context7.com/gfnorg/torchgfn/llms.txt

This Python code defines a custom discrete environment, `CustomGridEnv`, by inheriting from `gfn.env.DiscreteEnv`. It specifies the environment's dynamics, state transitions (step and backward_step), action masks, and reward function. The example usage demonstrates initializing the environment and resetting it to get initial states.
```python
import torch

from gfn.env import DiscreteEnv
from gfn.states import DiscreteStates
from gfn.actions import Actions


class CustomGridEnv(DiscreteEnv):
    """Custom 2D grid environment with custom reward function."""

    def __init__(self, size: int = 10, device: str = "cpu"):
        self.size = size

        # Define initial state (0, 0) and sink state (-1, -1)
        s0 = torch.zeros(2, dtype=torch.long, device=device)
        sf = torch.full((2,), fill_value=-1, dtype=torch.long, device=device)

        # Actions: 0=move right, 1=move up, 2=exit
        n_actions = 3
        state_shape = (2,)  # (x, y) coordinates

        super().__init__(
            n_actions=n_actions,
            s0=s0,
            state_shape=state_shape,
            sf=sf,
            check_action_validity=True
        )

    def step(self, states: DiscreteStates, actions: Actions) -> DiscreteStates:
        """Apply forward actions to states."""
        new_tensor = states.tensor.clone()

        # Action 0: increment x-coordinate
        mask_right = actions.tensor.squeeze(-1) == 0
        new_tensor[mask_right, 0] += 1

        # Action 1: increment y-coordinate
        mask_up = actions.tensor.squeeze(-1) == 1
        new_tensor[mask_up, 1] += 1

        return self.States(new_tensor)

    def backward_step(self, states: DiscreteStates, actions: Actions) -> DiscreteStates:
        """Apply backward actions to states."""
        new_tensor = states.tensor.clone()

        # Action 0: decrement x-coordinate
        mask_left = actions.tensor.squeeze(-1) == 0
        new_tensor[mask_left, 0] -= 1

        # Action 1: decrement y-coordinate
        mask_down = actions.tensor.squeeze(-1) == 1
        new_tensor[mask_down, 1] -= 1

        return self.States(new_tensor)

    def update_masks(self, states: DiscreteStates) -> None:
        """Update action masks based on current states."""
        # Cannot move right if x >= size-1
        states.forward_masks[:, 0] = states.tensor[:, 0] < self.size - 1
        # Cannot move up if y >= size-1
        states.forward_masks[:, 1] = states.tensor[:, 1] < self.size - 1
        # Can always exit (action 2)
        states.forward_masks[:, 2] = True

        # Backward masks (no exit action)
        states.backward_masks[:, 0] = states.tensor[:, 0] > 0  # Can move left
        states.backward_masks[:, 1] = states.tensor[:, 1] > 0  # Can move down

    def reward(self, states: DiscreteStates) -> torch.Tensor:
        """Compute rewards for terminal states."""
        # Example: Gaussian-shaped reward that peaks at distance 5 from the origin
        distances = torch.sqrt(
            (states.tensor[:, 0].float() ** 2) +
            (states.tensor[:, 1].float() ** 2)
        )
        rewards = torch.exp(-0.1 * (distances - 5.0) ** 2)
        return rewards


# Usage
env = CustomGridEnv(size=10, device="cpu")
initial_states = env.reset(batch_shape=(4,))
```

--------------------------------

### Install torchgfn from Source with All Dependencies (Bash)

Source: https://github.com/gfnorg/torchgfn/blob/master/docs/source/README.md

Installs the current development version of torchgfn from a clone of the repository. It sets up a Conda environment with Python 3.10 and installs the package in editable mode with all dependencies, suitable for development or advanced usage.

```bash
git clone https://github.com/GFNOrg/torchgfn.git
conda create -n gfn python=3.10
conda activate gfn
cd torchgfn
pip install -e ".[all]"
```

--------------------------------

### Model Training

Source: https://github.com/gfnorg/torchgfn/blob/master/tutorials/notebooks/intro_continuous.ipynb

This snippet shows the call to the training function, passing the necessary arguments to start training and returning the trained forward model, backward model, and logZ.

```Python
forward_model, backward_model, logZ = train_with_exploration(
    seed,
    batch_size,
    trajectory_length,
    env,
    device,
    init_exploration_noise,
    n_iterations=n_iterations,
)
```

--------------------------------

### Conditional GFlowNet Setup in Python

Source: https://context7.com/gfnorg/torchgfn/llms.txt

This Python code sets up a conditional GFlowNet using `ConditionalHyperGrid`, a custom environment inheriting from `HyperGrid`. It allows the reward function to be interpolated between a uniform reward and the original reward based on provided conditions. The setup includes defining the environment, a preprocessor, and specifying the condition dimension.
```python
import torch

from gfn.estimators import ConditionalDiscretePolicyEstimator
from gfn.gflownet import TBGFlowNet
from gfn.gym import HyperGrid
from gfn.preprocessors import KHotPreprocessor
from gfn.samplers import Sampler
from gfn.utils.modules import MLP


class ConditionalHyperGrid(HyperGrid):
    """HyperGrid with condition-dependent rewards."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.conditions = None
        self._original_reward_fn = self.reward_fn

    def set_conditions(self, conditions: torch.Tensor):
        """Set conditions for reward computation."""
        self.conditions = conditions

    def reward(self, states):
        """Interpolate between uniform and original reward."""
        original_rewards = self._original_reward_fn(states.tensor)

        if self.conditions is None:
            return original_rewards

        # Condition values: 0=uniform, 1=original
        cond = self.conditions.squeeze(-1)
        if cond.shape[0] == 1:
            cond = cond.expand(original_rewards.shape[0])

        # Linear interpolation
        uniform_reward = torch.ones_like(original_rewards)
        rewards = (1 - cond) * uniform_reward + cond * original_rewards
        return rewards


# Setup environment
device = torch.device("cpu")
env = ConditionalHyperGrid(
    ndim=2,
    height=8,
    reward_fn_str="original",
    device=device
)

# Preprocessor for states (k-hot encoding); the condition has dimension condition_dim
preprocessor = KHotPreprocessor(height=env.height, ndim=env.ndim)
condition_dim = 1
```

--------------------------------

### Sample and Visualize Graph Validity Before Training in Python

Source: https://github.com/gfnorg/torchgfn/blob/master/tutorials/notebooks/intro_graphs.ipynb

Generates sample trajectories using an untrained GFlowNet and plots how many of the resulting graphs are valid seven-segment digits. It calculates and prints the percentage of valid graphs, giving a baseline for the model's performance before training.
```python
trajectories = gflownet.sample_trajectories(env, n=64)
terminating_states = trajectories.terminating_states
render_states(terminating_states[:8])  # type: ignore

# Distribution of valid digits before training
validity_before = reward_function(terminating_states) == 1.0  # type: ignore
num_valid_before = validity_before.sum().item()
num_total_before = len(terminating_states)
print(f"Before training: {num_valid_before} valid digits out of {num_total_before} samples ({num_valid_before/num_total_before*100:.2f}%)")

# For plotting, we can show a simple bar chart: valid vs. invalid
labels = ['Valid Digits', 'Invalid Graphs']
counts_before = [num_valid_before, num_total_before - num_valid_before]

plt.figure(figsize=(6, 4))
plt.title("Graph Validity - Before training")
plt.bar(labels, counts_before, color=['green', 'red'])
plt.ylabel("Count")
plt.show()
```

--------------------------------

### Get Unique Sets - Python

Source: https://github.com/gfnorg/torchgfn/blob/master/tutorials/notebooks/intro_discrete.ipynb

De-duplicates a list of lists by treating each inner list as a set, then returns the unique sets as a sorted list of tuples. This is useful for de-duplicating configurations.

```python
def get_unique(l: list):
    unique = []
    for i in map(set, l):
        if i not in unique:
            unique.append(i)

    return sorted(map(tuple, unique))
```

--------------------------------

### Initialize and Render Line Environment

Source: https://github.com/gfnorg/torchgfn/blob/master/tutorials/notebooks/intro_continuous.ipynb

Sets up a 'LineEnvironment' with four modes (mus), their variances, and an initial value. The environment is then rendered to visualize its configuration.
```python
env = LineEnvironment(
    mus=[-3, 4, 6, 10],
    variances=[0.2, 0.4, 1, 0.2],
    n_sd=4.5,
    init_value=0
)
render(env, tight=False)
```

--------------------------------

### Import Necessary Libraries for GFlowNets

Source: https://github.com/gfnorg/torchgfn/blob/master/tutorials/notebooks/intro_discrete.ipynb

Imports essential Python libraries for GFlowNets, including PyTorch for neural networks and distributions, Matplotlib for plotting, NumPy for numerical operations, and tqdm for progress bars. These libraries are fundamental for building and training GFN models.

```python
import matplotlib.pyplot as plt
import numpy as np
import matplotlib.cm as cm
import random
from torch.distributions.categorical import Categorical
import torch
import torch.nn as nn
from tqdm import tqdm, trange
```

--------------------------------

### Define Facial Features for Drawing

Source: https://github.com/gfnorg/torchgfn/blob/master/tutorials/notebooks/intro_discrete.ipynb

Defines a dictionary of facial features (smile, frown, eyebrows) as lambda functions that add graphical elements to a Matplotlib plot. These functions allow for the programmatic drawing of expressive faces, used later in the example.

```python
# @title
# These feature globals will be referred to throughout.
_mouth_kwargs = {"closed": False, "fill": False, "lw": 3}
FEATURES = {
    'smile': lambda: plt.gca().add_patch(plt.Polygon(
        np.stack(
            [np.linspace(0.2, 0.8), 0.3 - np.sin(np.linspace(0, 3.14)) * 0.15]
        ).T,
        **_mouth_kwargs
        )
    ),
    'frown': lambda: plt.gca().add_patch(plt.Polygon(
        np.stack(
            [np.linspace(0.2, 0.8), 0.15 + np.sin(np.linspace(0, 3.14)) * 0.15]
        ).T,
        **_mouth_kwargs,
        )
    ),
    'left_eb_down': lambda: plt.gca().add_line(plt.Line2D(
        [0.15, 0.35], [0.75, 0.7], color=(0, 0, 0))
    ),
    'right_eb_down': lambda: plt.gca().add_line(plt.Line2D(
        [0.65, 0.85], [0.7, 0.75], color=(0, 0, 0))
    ),
    'left_eb_up': lambda: plt.gca().add_line(plt.Line2D(
        [0.15, 0.35], [0.7, 0.75], color=(0, 0, 0))
    ),
    'right_eb_up': lambda: plt.gca().add_line(plt.Line2D(
        [0.65, 0.85], [0.75, 0.7], color=(0, 0, 0))
    ),
}
```

--------------------------------

### Visualize State Space Flows

Source: https://github.com/gfnorg/torchgfn/blob/master/tutorials/notebooks/intro_discrete.ipynb

Plots the learned edge flows across the state space, showing flow magnitudes at different states and highlighting invalid configurations.

```python
plot_state_space(model=F_sa)
```

--------------------------------

### Train GFlowNet using Trajectory Balance

Source: https://github.com/gfnorg/torchgfn/blob/master/tutorials/notebooks/intro_discrete.ipynb

Initiates the training process for the GFlowNet using the configured optimizer, environment, and a specified number of episodes. It returns visited states and losses for analysis.

```python
visited_terminating_states, states_visited, losses = train(
    gflownet,
    optimizer,
    env,
    n_episodes=n_episodes * 10,
)
```

--------------------------------

### Create custom policy mixin with diagnostics in PyTorch

Source: https://github.com/gfnorg/torchgfn/blob/master/docs/source/guides/estimator_policy_mixin.md

Advanced example extending PolicyMixin to inject custom diagnostics into the training loop.
Overrides compute_dist and log_probs to track call counts and log probability statistics via ctx.extras. Enables debugging and monitoring of policy behavior during sampling.

```python
from typing import Any, Optional

from torch.distributions import Distribution

from gfn.estimators import PolicyMixin


class TracingPolicyMixin(PolicyMixin):
    def compute_dist(self, states_active, ctx, step_mask=None, save_estimator_outputs=False, **kw):
        dist, ctx = super().compute_dist(states_active, ctx, step_mask, save_estimator_outputs, **kw)
        # Count how many times a distribution was computed for this context.
        ctx.extras.setdefault("num_compute_calls", 0)
        ctx.extras["num_compute_calls"] += 1
        return dist, ctx

    def log_probs(self, actions_active, dist: Distribution, ctx: Any, step_mask=None, vectorized=False, save_logprobs=False):
        lp, ctx = super().log_probs(actions_active, dist, ctx, step_mask, vectorized, save_logprobs)
        # Overwrite on every call so the entry reflects the most recent step
        # (setdefault would keep only the first value).
        ctx.extras["last_lp_mean"] = lp.mean().detach()
        return lp, ctx
```

--------------------------------

### Sample from Trained GFlowNet and Visualize (Python)

Source: https://github.com/gfnorg/torchgfn/blob/master/tutorials/notebooks/intro_graphs.ipynb

This code snippet samples trajectories from a trained GFlowNet and visualizes the resulting graphs. It also calculates and prints the proportion of valid digits generated after training, and draws a bar plot comparing the counts of valid and invalid graphs before and after training. The plot reuses `counts_before` from the pre-training snippet. It uses PyTorch for graph structures and Matplotlib for visualization.
```python
trajectories = gflownet.sample_trajectories(env, n=64)
terminating_states = trajectories.terminating_states
render_states(terminating_states[:8])  # type: ignore

# Distribution of valid digits after training
validity_after = reward_function(terminating_states) == 1.0  # type: ignore
num_valid_after = validity_after.sum().item()
num_total_after = len(terminating_states)
print(f"After training: {num_valid_after} valid digits out of {num_total_after} samples ({num_valid_after/num_total_after*100:.2f}%)")

# Plotting comparison
labels = ['Valid Digits', 'Invalid Graphs']
counts_after = [num_valid_after, num_total_after - num_valid_after]

# `counts_before` comes from the pre-training section and is assumed to still
# be in scope; re-run that section first if it is not.

width = 0.35  # the width of the bars
x = torch.arange(len(labels))  # the label locations

fig, ax = plt.subplots(figsize=(8, 5))
rects1 = ax.bar(x - width/2, counts_before, width, label='Before Training', color='salmon')
rects2 = ax.bar(x + width/2, counts_after, width, label='After Training', color='lightgreen')

# Add some text for labels, title and custom x-axis tick labels, etc.
ax.set_ylabel('Count')
ax.set_title('Graph Validity Comparison')
ax.set_xticks(x)
ax.set_xticklabels(labels)
ax.legend()

fig.tight_layout()
plt.show()
```

--------------------------------

### Import libraries for GFlowNets tutorial

Source: https://github.com/gfnorg/torchgfn/blob/master/tutorials/notebooks/intro_continuous.ipynb

Imports necessary libraries including matplotlib for plotting, torch for tensor operations, and numpy for numerical computations. Sets the device to CUDA if available, otherwise CPU.
```python
from matplotlib import pyplot as plt
from torch.distributions import Normal
import math
import numpy as np
import torch
import random
from tqdm import trange

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
```

--------------------------------

### Environment Definition

Source: https://github.com/gfnorg/torchgfn/blob/master/tutorials/notebooks/intro_continuous.ipynb

This code defines a LineEnvironment, setting up the environment parameters such as means, variances, and initial value.

```Python
env = LineEnvironment(
    mus=[2, 5],
    variances=[0.2, 0.2],
    n_sd=4.5,
    init_value=0
)
```

--------------------------------

### Initialize Line Environment in Python

Source: https://github.com/gfnorg/torchgfn/blob/master/tutorials/notebooks/intro_continuous.ipynb

Initializes a LineEnvironment with specified modes (mus) and variances. This setup is used to create a more challenging distribution for training models. The `n_sd` parameter controls the standard deviation range, and `init_value` sets the initial state.

```python
env = LineEnvironment(mus=[-3, 3], variances=[0.2, 0.2], n_sd=4.5, init_value=0)
render(env)
```

--------------------------------

### Configuration: Set Fixed Hyperparameters

Source: https://github.com/gfnorg/torchgfn/blob/master/tutorials/notebooks/intro_discrete.ipynb

Sets and prints fixed hyperparameters for experiments, including the number of hidden units, episodes, learning rate, and random seed. These values are used consistently across runs.

```python
# Fixed hyperparameters.
n_hid_units = 512
n_episodes = 10_000
learning_rate = 3e-3
seed = 42

print("For all experiments, our hyperparameters will be:")
print(" + n_hid_units={}".format(n_hid_units))
print(" + n_episodes={}".format(n_episodes))
print(" + learning_rate={}".format(learning_rate))
print(" + seed={}".format(seed))
```

--------------------------------

### Instantiate and train FlowModel

Source: https://github.com/gfnorg/torchgfn/blob/master/tutorials/notebooks/intro_discrete.ipynb

This code block prepares for training the `FlowModel`. It sets a random seed for reproducibility, instantiates the `FlowModel` with the configured number of hidden units, and initializes an Adam optimizer. It also includes a comment indicating that losses will be accumulated for later processing.

```python
set_seed(seed)

# Instantiate the model and optimizer.
F_sa = FlowModel(n_hid_units)
opt = torch.optim.Adam(F_sa.parameters(), learning_rate)

# To not complicate the code, I'll just accumulate losses here and take a
```
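
--------------------------------

### Illustrative Sketch: Trajectory Balance Residual in Plain Python

The trajectory balance snippets above delegate the objective to the library's `loss` method. As a rough illustration of the quantity such a loss is built around, the sketch below computes the squared trajectory balance residual, `log Z + sum log P_F - log R(x) - sum log P_B`, by hand for a toy batch. It is not torchgfn's implementation: the helper names `tb_residual` and `tb_loss` and all the probabilities are hypothetical, and plain floats stand in for tensors.

```python
import math


def tb_residual(log_z, log_pf_steps, log_pb_steps, log_reward):
    """Trajectory balance residual for a single trajectory.

    residual = log Z + sum_t log P_F - log R(x) - sum_t log P_B
    """
    return log_z + sum(log_pf_steps) - log_reward - sum(log_pb_steps)


def tb_loss(log_z, batch):
    """Mean squared residual over (log_pf_steps, log_pb_steps, log_reward) tuples."""
    residuals = [tb_residual(log_z, pf, pb, r) for pf, pb, r in batch]
    return sum(res ** 2 for res in residuals) / len(residuals)


# Hypothetical toy batch; the per-step probabilities are made up for illustration.
batch = [
    # Balanced when Z = 1: 0.5 * 0.5 * 0.5 == 0.25 * (1.0 * 0.5 * 1.0)
    ([math.log(0.5)] * 3, [math.log(1.0), math.log(0.5), math.log(1.0)], math.log(0.25)),
    # Unbalanced: forward probability 0.125 against reward 0.25 with P_B = 1
    ([math.log(0.25), math.log(0.5), math.log(1.0)], [0.0, 0.0, 0.0], math.log(0.25)),
]

loss = tb_loss(log_z=0.0, batch=batch)
print(f"TB loss at logZ=0: {loss:.4f}")
```

The first trajectory contributes zero residual and the second contributes `log(0.5)`, so the loss is driven entirely by the unbalanced trajectory. In the torchgfn snippets the analogous quantities are tensors and `logZ` is a learnable parameter, which is why those snippets give it its own learning rate.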