FORGE ARC AGI 3 Agent
https://github.com/johnlikescarrot/forge-arc-agi-3-agent
Tokens: 18,439 · Snippets: 74 · Trust Score: 3.4 · Updated: 4 days ago
*Context Summary (auto-generated)*
# FORGE ARC AGI 3 Agent

FORGE is an intelligent agent designed for the ARC Prize 2026 competition, solving Abstraction and Reasoning Corpus (ARC) puzzle games. The agent employs a hybrid architecture that combines deterministic search with deep learning: Breadth-First Search (BFS) is the primary solver, and a Convolutional Neural Network (CNN) serves as a fallback when BFS fails to find a solution within its time constraints. The core functionality revolves around analyzing 64x64-pixel game frames, discovering effective actions through state-space exploration, and executing optimal action sequences to complete game levels. FORGE v15 adds warm-up unlock for frozen games, iterative-deepening DFS for deep directional puzzles, ACMD (Action-Conditional Masked RAM Delta) priority search for hidden-state games, and affine solution transfer between levels.

## BFSSolver Class - Core Search Engine

The BFSSolver class provides offline puzzle solving through direct game-class instantiation, supporting multiple search strategies: standard BFS, A* with counter priority, ACMD trigger search, and IDDFS fallback for deep games.
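Before the full API example, it may help to see the expand-and-deduplicate loop that all of these strategies share. The sketch below is generic and illustrative only: the toy counter "game" and every name in it are invented for the example, not taken from the repository (the real solver deep-copies game objects rather than stepping integers).

```python
from collections import deque

def bfs_solve(start, actions, apply_action, is_goal, max_states=10000):
    """Breadth-first search over a deterministic state space with
    hash-based deduplication; returns the action path to a goal state."""
    queue = deque([(start, [])])   # (state, action path so far)
    seen = {hash(start)}           # state-hash deduplication
    while queue and len(seen) < max_states:
        state, path = queue.popleft()
        if is_goal(state):
            return path
        for a in actions:
            nxt = apply_action(state, a)
            h = hash(nxt)
            if h not in seen:
                seen.add(h)
                queue.append((nxt, path + [a]))
    return None  # no solution within the state budget

# Toy example: reach 9 from 1 with actions "inc" (+1) and "dbl" (*2)
step = lambda s, a: s + 1 if a == "inc" else s * 2
path = bfs_solve(1, ["inc", "dbl"], step, lambda s: s == 9)
print(path)  # ['inc', 'dbl', 'dbl', 'inc']
```

Because BFS expands states in order of path length, the first goal state reached is guaranteed to use a minimal number of actions, which is why it is a natural primary solver for short puzzle levels.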
```python
from my_agent import BFSSolver, find_game_source_and_class

# Initialize solver by finding the game source and loading the class
game_id = "cd82-level1"
game_path, class_name = find_game_source_and_class(game_id)
# Returns: ("/kaggle/working/games/cd82.py", "Cd82")

# Create solver with custom timeouts
solver = BFSSolver(
    game_path="/kaggle/working/games/cd82.py",
    game_class_name="Cd82",
    scan_timeout=5,    # Max seconds for action scanning
    bfs_timeout=180    # Max seconds for BFS search
)

# Load the game class from the source file
if solver.load():
    print(f"Loaded game class: {solver.class_name}")

# Solve level 0; returns a list of (action_id, data) tuples
solution = solver.solve_level(level_idx=0, max_states=500000)
# Returns: [(1, None), (2, None), (6, {'x': 32, 'y': 16, 'game_id': 'bfs'})]

# Solve level 1 with solution transfer from level 0
prev_solution = solver.solutions.get(0)
solution_l1 = solver.solve_level(level_idx=1, prev_solution=prev_solution)

# Access cached solutions
print(f"Solutions found: {list(solver.solutions.keys())}")
```

## ForgeNet Neural Network - CNN Architecture

ForgeNet is the deep learning backbone: a 4-layer CNN with CBAM attention that supports both directional actions (1-5) and click actions (64x64 grid positions). The network includes ActionEffectAttention for learning from historical action effects.
```python
import torch
from my_agent import ForgeNet

# Initialize the network
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
net = ForgeNet(
    in_ch=26,  # 16 color channels + 5 augmentations + 5 temporal diffs
    g=64       # Grid size (64x64)
).to(device)

# Create input tensor from a game frame (26 channels, 64x64)
# Channels: 16 one-hot colors + bg_mask + rarity + edge + row_pos + col_pos + 5 diff maps
frame_tensor = torch.randn(1, 26, 64, 64).to(device)

# Forward pass without memory (basic inference)
logits = net(frame_tensor)
# Shape: (1, 4101) - [5 directional actions, 4096 click positions]
action_logits = logits[:, :5]     # Directional actions 1-5
click_logits = logits[:, 5:4101]  # Click positions (64*64=4096)

# Forward pass with action-effect memory (enhanced inference)
mem_diffs = torch.randn(1, 10, 1, 64, 64).to(device)   # 10 historical frame diffs
mem_actions = torch.randint(0, 5, (1, 10)).to(device)  # Corresponding actions
mem_rewards = torch.randn(1, 10).to(device)            # Corresponding rewards
logits_enhanced = net(frame_tensor, mem_diffs, mem_actions, mem_rewards)
# ActionEffectAttention adds context from historical action effects
```

## MyAgent Class - Main Agent Interface

MyAgent extends the base Agent class and orchestrates the hybrid solving strategy. It manages BFS initialization, solution execution, CNN training, and action selection with epsilon-greedy exploration.

```python
from my_agent import MyAgent
from arcengine import FrameData, GameState, GameAction

# Agent is instantiated by the ARC framework
agent = MyAgent(game_id="cd82-level1", arc_env=arc_environment)

# Main decision loop (called by the framework)
def game_loop(agent, frame_data):
    while not agent.is_done(agent.frames, frame_data):
        # choose_action handles all decision logic:
        # 1. Level transitions (init BFS, reset CNN)
        # 2. Game resets when needed
        # 3. BFS solution execution if available
        # 4. CNN fallback with exploration
        action = agent.choose_action(agent.frames, frame_data)

        # Action includes reasoning for debugging
        print(f"Action: {action}, Reasoning: {action.reasoning}")
        # Output examples:
        #   "bfs:3/7"      - executing step 3 of a 7-step BFS solution
        #   "cnn:a2"       - CNN selected directional action 2
        #   "cnn:c(32,16)" - CNN selected click at position (32, 16)
        #   "reset"        - game needs reset
        #   "undo"         - using undo action after 30 unproductive moves

        # Execute action and get the new frame
        frame_data = environment.step(action)
        agent.append_frame(frame_data)

# Check termination conditions
done = agent.is_done(agent.frames, frame_data)
# Returns True if: GameState.WIN or elapsed_time >= 8 hours - 5 minutes
```

## Action Scanning and State Hashing

The solver discovers effective actions by testing each available action and tracking which ones produce frame changes. State hashing supports both pixel-based and hidden-field-aware deduplication.

```python
import numpy as np
import copy
from arcengine import ActionInput, GameAction

# Example of action scanning logic
def scan_actions(game, initial_frame, background_color):
    """Discover actions that change the game state."""
    available = game._available_actions
    effective_actions = []

    # Test directional/interact actions (1-5)
    for action_id in [a for a in available if a <= 5]:
        game_copy = copy.deepcopy(game)
        result = game_copy.perform_action(
            ActionInput(id=GameAction.from_id(action_id)),
            raw=True
        )
        if result.frame:
            new_frame = np.array(result.frame[-1])
            if np.sum(initial_frame != new_frame) > 0:
                effective_actions.append((action_id, None))

    # Test click actions (action 6) on non-background pixels
    if 6 in available:
        for y in range(0, 64, 2):  # 2-pixel stride for efficiency
            for x in range(0, 64, 2):
                if initial_frame[y, x] == background_color:
                    continue
                game_copy = copy.deepcopy(game)
                result = game_copy.perform_action(
                    ActionInput(
                        id=GameAction.ACTION6,
                        data={'x': x, 'y': y, 'game_id': 'bfs'}
                    ),
                    raw=True
                )
                if result.frame:
                    new_frame = np.array(result.frame[-1])
                    if np.sum(initial_frame != new_frame) > 0:
                        effective_actions.append((6, {'x': x, 'y': y, 'game_id': 'bfs'}))

    return effective_actions
    # Returns: [(1, None), (3, None), (6, {'x': 24, 'y': 32, 'game_id': 'bfs'}), ...]

# State hashing with hidden fields
def state_hash(game, frame, hidden_fields=None):
    """Create a unique state identifier including hidden game state."""
    import hashlib
    frame_hash = hashlib.md5(frame.tobytes()).hexdigest()[:16]
    if hidden_fields:
        extras = []
        for field_name in hidden_fields:
            value = getattr(game, field_name, None)
            if value is not None:
                extras.append(f"{field_name}={value}")
        if extras:
            return frame_hash + "|" + "|".join(extras)
    return frame_hash
    # Returns: "a3f2c8e1b9d04567" or "a3f2c8e1b9d04567|score=5|coins=3"
```

## Solution Transfer Between Levels

FORGE supports transferring solutions between levels using object-relative coordinate mapping, handling cases where sprites move to different positions across levels.

```python
import numpy as np

def transfer_solution(prev_solution, prev_frame, curr_frame, background):
    """Apply a solution from the previous level to the current level with an offset."""

    def extract_objects(frame, bg):
        """Find colored objects and their centroids."""
        objects = []
        for color in range(16):
            if color == bg:
                continue
            mask = (frame == color)
            pixel_count = int(np.sum(mask))
            if pixel_count < 2:
                continue
            ys, xs = np.where(mask)
            objects.append({
                'color': color,
                'cx': float(np.mean(xs)),
                'cy': float(np.mean(ys)),
                'n': pixel_count
            })
        return objects

    # Match objects between levels by color
    prev_objects = extract_objects(prev_frame, background)
    curr_objects = extract_objects(curr_frame, background)
    matched = []
    for prev_obj in prev_objects:
        for curr_obj in curr_objects:
            if curr_obj['color'] == prev_obj['color']:
                matched.append((prev_obj, curr_obj))
                break

    # Compute average offset
    dx = np.mean([m[1]['cx'] - m[0]['cx'] for m in matched])
    dy = np.mean([m[1]['cy'] - m[0]['cy'] for m in matched])

    # Apply offset to click actions
    transferred = []
    for action_id, data in prev_solution:
        if data and 'x' in data:
            new_data = dict(data)
            new_data['x'] = max(0, min(63, int(data['x'] + dx)))
            new_data['y'] = max(0, min(63, int(data['y'] + dy)))
            transferred.append((action_id, new_data))
        else:
            transferred.append((action_id, data))
    return transferred

# Original:    [(6, {'x': 10, 'y': 20}), (1, None)]
# Transferred: [(6, {'x': 15, 'y': 25}), (1, None)] with offset (5, 5)
```

## Frame Tensor Construction

The agent converts raw game frames into rich 26-channel tensors for CNN processing, including one-hot color encoding, spatial features, and temporal difference maps.

```python
import torch
import numpy as np

def frame_to_tensor(frame, frame_history):
    """Convert a 64x64 game frame to a 26-channel tensor."""
    # Channels 1-16: One-hot color encoding
    one_hot = torch.zeros(16, 64, 64, dtype=torch.float32)
    one_hot.scatter_(0, torch.from_numpy(frame).unsqueeze(0), 1)

    # Compute background color (most frequent)
    counts = np.bincount(frame.flatten(), minlength=16)
    background = int(counts.argmax())
    max_count = max(counts.max(), 1)

    # Channel 17: Background mask
    bg_mask = (frame == background).astype(np.float32)

    # Channel 18: Rarity map (inverse frequency)
    rarity = np.zeros((64, 64), np.float32)
    for color in range(16):
        if counts[color] > 0:
            rarity[frame == color] = 1.0 - counts[color] / max_count

    # Channel 19: Edge detection
    padded = np.pad(frame, 1, mode='edge')
    edge = ((frame != padded[:-2, 1:-1]) | (frame != padded[2:, 1:-1]) |
            (frame != padded[1:-1, :-2]) | (frame != padded[1:-1, 2:])).astype(np.float32)

    # Channels 20-21: Position encoding
    row_pos = np.linspace(0, 1, 64, dtype=np.float32).reshape(64, 1).repeat(64, 1)
    col_pos = np.linspace(0, 1, 64, dtype=np.float32).reshape(1, 64).repeat(64, 0)

    # Channels 22-24: Recent frame differences
    diffs_recent = torch.zeros(3, 64, 64, dtype=torch.float32)
    for i, prev_frame in enumerate(reversed(list(frame_history)[-3:])):
        diffs_recent[i] = torch.from_numpy((frame != prev_frame).astype(np.float32))

    # Channels 25-26: Longer-term differences
    diffs_long = torch.zeros(2, 64, 64, dtype=torch.float32)
    history = list(frame_history)
    if len(history) >= 2:
        diffs_long[0] = torch.from_numpy((history[-1] != history[-2]).astype(np.float32))
    if len(history) >= 4:
        diffs_long[1] = torch.from_numpy((history[-2] != history[-4]).astype(np.float32))

    # Combine all channels
    augmentations = torch.from_numpy(np.stack([bg_mask, rarity, edge, row_pos, col_pos]))
    return torch.cat([one_hot, augmentations, diffs_recent, diffs_long], dim=0)
    # Shape: (26, 64, 64)
```

## Summary

FORGE is designed for autonomous puzzle-solving in the ARC-AGI competition environment, where agents must complete multi-level games within an 8-hour time budget. The primary use case is deploying the agent on Kaggle's competition infrastructure, where it receives game frames via the `arcengine` framework and returns optimal actions. The BFS solver excels at Level 0 solutions (typically simpler), while the CNN learns game-specific patterns for harder levels through online reinforcement learning.

Integration follows the ARC-AGI-3-Agents template pattern: extend the base `Agent` class, then implement `choose_action()` for decision-making and `is_done()` for termination. The agent automatically handles level transitions, maintains frame history, and balances exploration against exploitation through epsilon-greedy action selection. For custom deployments, instantiate `BFSSolver` directly with game source paths, or use `ForgeNet` as a standalone visual reasoning model for 64x64 grid-based puzzles with up to 16 distinct colors.
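The epsilon-greedy selection over the flat 4101-way output (5 directional actions followed by 4096 click positions) mentioned above can be illustrated with a small self-contained sketch. The function and helper names here are hypothetical, not the repository's actual API; only the action-space layout is taken from the documentation.

```python
import random

def select_action(logits, epsilon, rng=random):
    """With probability epsilon pick a uniformly random action index,
    otherwise pick the argmax of the logits (greedy)."""
    if rng.random() < epsilon:
        return rng.randrange(len(logits))
    return max(range(len(logits)), key=lambda i: logits[i])

def decode_action(idx):
    """Map a flat index into ('move', action_id) or ('click', x, y),
    matching the [5 directional | 4096 click] layout described above."""
    if idx < 5:
        return ('move', idx + 1)           # directional actions 1-5
    pos = idx - 5
    return ('click', pos % 64, pos // 64)  # x, y on the 64x64 grid

# Greedy example (epsilon=0): a peaked logit vector selects index 5,
# which decodes to a click on grid cell (0, 0)
logits = [0.0] * 4101
logits[5] = 3.0
idx = select_action(logits, epsilon=0.0)
print(decode_action(idx))  # ('click', 0, 0)
```

In practice, epsilon is typically decayed over time so the agent explores early in a level and exploits the CNN's learned policy later; the decay schedule FORGE uses is not specified in this summary.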