# CodeCarbon

CodeCarbon is a lightweight Python package that estimates and tracks carbon dioxide (CO₂) emissions from computing resources. It measures the electrical power consumption of hardware components (CPU, GPU, and RAM) and applies the carbon intensity of the electricity grid in the region where the computation runs. This lets developers and researchers quantify the environmental impact of machine learning training, data processing, or any compute-intensive code.

The package supports both online mode (with automatic geolocation and real-time carbon intensity data) and offline mode (for air-gapped environments). It provides multiple tracking methods, including decorators, context managers, and explicit tracker objects, making it easy to integrate into any Python workflow. Results can be saved to CSV files, sent to the CodeCarbon API dashboard, or pushed to Prometheus, Logfire, or custom HTTP endpoints, enabling comprehensive monitoring and reporting of carbon footprints across projects and experiments.

## Installation

Install CodeCarbon from PyPI with pip.

```bash
pip install codecarbon

# For visualization dashboard support
pip install 'codecarbon[carbonboard]'
```

## EmissionsTracker Class

The `EmissionsTracker` class is the primary interface for tracking carbon emissions in online mode. It automatically detects hardware and fetches geolocation and carbon intensity data from the internet.
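Before looking at the API, it helps to see the estimate the tracker automates: energy consumed multiplied by the grid's carbon intensity. A back-of-envelope sketch, using illustrative constants rather than measured values:

```python
# Back-of-envelope version of the estimate CodeCarbon automates.
# All constants below are illustrative assumptions, not real measurements.

avg_power_watts = 150     # assumed average draw of CPU + GPU + RAM
duration_hours = 2.0      # how long the job ran
carbon_intensity = 0.475  # assumed grid intensity in kg CO2eq per kWh

energy_kwh = avg_power_watts / 1000 * duration_hours  # watts -> kWh
emissions_kg = energy_kwh * carbon_intensity          # kg CO2eq

print(f"Energy: {energy_kwh:.3f} kWh")                # Energy: 0.300 kWh
print(f"Emissions: {emissions_kg * 1000:.1f} g CO2eq")  # Emissions: 142.5 g CO2eq
```

In practice, CodeCarbon samples power repeatedly (see `measure_power_secs`) and looks up a regional carbon intensity instead of using fixed constants.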
```python
from codecarbon import EmissionsTracker

# Basic usage with explicit start/stop
tracker = EmissionsTracker(
    project_name="my_ml_project",
    measure_power_secs=15,        # Measure power every 15 seconds
    output_dir="./emissions",     # Directory for CSV output
    output_file="emissions.csv",  # CSV filename
    save_to_file=True,            # Save results to CSV
    save_to_api=False,            # Send to CodeCarbon API
    tracking_mode="machine",      # "machine" or "process"
    log_level="info",             # Logging verbosity
)

tracker.start()
try:
    # Your compute-intensive code here
    import time
    for epoch in range(10):
        print(f"Training epoch {epoch}")
        time.sleep(2)  # Simulated training
finally:
    emissions = tracker.stop()  # Returns total CO2 emissions in kg

print(f"Total emissions: {emissions * 1000:.4f} g CO2eq")
print(f"Duration: {tracker.final_emissions_data.duration:.2f} seconds")
print(f"Energy consumed: {tracker.final_emissions_data.energy_consumed:.6f} kWh")
print(f"CPU energy: {tracker.final_emissions_data.cpu_energy:.6f} kWh")
print(f"GPU energy: {tracker.final_emissions_data.gpu_energy:.6f} kWh")
print(f"RAM energy: {tracker.final_emissions_data.ram_energy:.6f} kWh")
```

## EmissionsTracker as Context Manager

The context manager pattern provides automatic start/stop handling and is the recommended approach for tracking emissions in most scenarios.
```python
from codecarbon import EmissionsTracker
import tensorflow as tf

# Load data
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# Build model
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10),
])
model.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)

# Track emissions with the context manager
with EmissionsTracker(project_name="mnist_training") as tracker:
    model.fit(x_train, y_train, epochs=10)

# Access results after the context exits
print(f"Carbon emissions: {tracker.final_emissions * 1000:.4f} g CO2eq")
print(f"Detailed data: {tracker.final_emissions_data}")
```

## @track_emissions Decorator

The decorator pattern enables tracking with minimal code changes, ideal for wrapping training functions or scripts.

```python
from codecarbon import track_emissions
import time

@track_emissions(
    project_name="model_training",
    output_dir="./logs",
    save_to_file=True,
    save_to_api=False,
    measure_power_secs=10,
    log_level="warning",
)
def train_model():
    """Training function wrapped with emissions tracking."""
    # Simulate model training
    print("Starting training...")
    for epoch in range(5):
        time.sleep(3)  # Simulated epoch
        print(f"Epoch {epoch + 1} complete")
    print("Training finished!")
    return "trained_model"

if __name__ == "__main__":
    model = train_model()
    # Emissions are automatically logged to ./logs/emissions.csv
```

## OfflineEmissionsTracker

The `OfflineEmissionsTracker` is designed for environments without internet access. It requires a country ISO code or a cloud provider/region to estimate carbon intensity.
```python
from codecarbon import OfflineEmissionsTracker

# Offline tracking with a country ISO code
tracker = OfflineEmissionsTracker(
    project_name="offline_experiment",
    country_iso_code="USA",  # 3-letter ISO country code
    region="california",     # Optional: state/province (for US/Canada)
    measure_power_secs=15,
    output_dir=".",
    save_to_file=True,
)

tracker.start()
try:
    # Your compute code
    import numpy as np
    data = np.random.randn(10000, 10000)
    result = np.dot(data, data.T)
finally:
    emissions = tracker.stop()

print(f"Offline emissions: {emissions * 1000:.4f} g CO2eq")

# Offline tracking with a cloud provider
cloud_tracker = OfflineEmissionsTracker(
    project_name="cloud_experiment",
    cloud_provider="aws",      # aws, gcp, or azure
    cloud_region="us-east-1",  # Cloud region identifier
)
```

## @track_emissions Decorator in Offline Mode

The decorator also supports offline mode by passing the `offline` parameter and the required location information.

```python
from codecarbon import track_emissions

@track_emissions(
    offline=True,
    country_iso_code="CAN",  # Canada
    region="ontario",        # Province
    project_name="offline_training",
    measure_power_secs=10,
)
def train_offline():
    """Train a model in an air-gapped environment."""
    import time
    for i in range(10):
        time.sleep(1)
    return "model_trained"

if __name__ == "__main__":
    train_offline()
```

## Task-Based Tracking

Task tracking measures emissions for specific code sections within a larger experiment. This is useful for comparing phases such as data loading, model building, and inference.
```python
from codecarbon import EmissionsTracker
import time

def load_dataset():
    time.sleep(2)
    return {"train": [1, 2, 3], "test": [4, 5, 6]}

def build_model():
    time.sleep(2)
    return "trained_model"

def run_inference(model, data):
    time.sleep(1)
    return "predictions"

# Track individual tasks
tracker = EmissionsTracker(project_name="ml_pipeline", measure_power_secs=5)
try:
    # Task 1: Load dataset
    tracker.start_task("load_dataset")
    dataset = load_dataset()
    task1_emissions = tracker.stop_task()

    # Task 2: Build model
    tracker.start_task("build_model")
    model = build_model()
    task2_emissions = tracker.stop_task()

    # Task 3: Run inference (multiple iterations)
    for i in range(3):
        tracker.start_task(f"inference_{i}")
        predictions = run_inference(model, dataset["test"])
        task_emissions = tracker.stop_task()
        print(f"Inference {i}: {task_emissions.emissions * 1000:.4f} g CO2eq")
finally:
    total_emissions = tracker.stop()

# Print a task-level breakdown
# (_tasks is an internal attribute; it may change between versions)
print(f"\nTotal emissions: {total_emissions * 1000:.4f} g CO2eq")
for task_name, task in tracker._tasks.items():
    print(f"  {task_name}: {task.emissions_data.emissions * 1000:.4f} g CO2eq "
          f"({task.emissions_data.duration:.1f}s)")
```

## TaskEmissionsTracker Context Manager

The `TaskEmissionsTracker` provides a context manager for tracking individual tasks within an existing tracker.
```python
from codecarbon import EmissionsTracker, TaskEmissionsTracker
import time

# Create the main tracker
tracker = EmissionsTracker(project_name="task_demo")
tracker.start()

try:
    # Track specific tasks using the context manager
    with TaskEmissionsTracker(task_name="data_preprocessing", tracker=tracker):
        time.sleep(2)  # Simulate preprocessing

    with TaskEmissionsTracker(task_name="model_training", tracker=tracker):
        time.sleep(3)  # Simulate training

    with TaskEmissionsTracker(task_name="evaluation", tracker=tracker):
        time.sleep(1)  # Simulate evaluation
finally:
    emissions = tracker.stop()
    print(f"Total: {emissions * 1000:.4f} g CO2eq")
```

## flush() Method

The `flush()` method writes current emissions data to the configured outputs (CSV, API, etc.) without stopping the tracker. This is useful for long-running experiments where you want periodic checkpoints.

```python
from codecarbon import EmissionsTracker
import time

tracker = EmissionsTracker(
    project_name="long_training",
    on_csv_write="append",  # "append" adds a new row, "update" overwrites the existing one
)

tracker.start()
try:
    for epoch in range(100):
        time.sleep(1)  # Simulate epoch training

        # Flush every 10 epochs to save progress
        if (epoch + 1) % 10 == 0:
            current_emissions = tracker.flush()
            print(f"Epoch {epoch + 1}: {current_emissions * 1000:.4f} g CO2eq cumulative")
finally:
    total = tracker.stop()
    print(f"Final total: {total * 1000:.4f} g CO2eq")
```

## Configuration File

CodeCarbon supports hierarchical configuration through config files, environment variables, and code parameters. Priority: script > env vars > local config > global config.
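That priority order amounts to a first-non-None lookup across the four sources. The helper below is an illustrative sketch of the documented hierarchy, not CodeCarbon's actual resolver:

```python
import os

def resolve_setting(name, script_value=None, local_config=None, global_config=None):
    """Pick a setting the way CodeCarbon's hierarchy is documented:
    script argument > CODECARBON_* env var > local config > global config.
    (Illustrative sketch only, not the library's real implementation.)"""
    env_value = os.environ.get(f"CODECARBON_{name.upper()}")
    for candidate in (script_value, env_value,
                      (local_config or {}).get(name),
                      (global_config or {}).get(name)):
        if candidate is not None:
            return candidate
    return None

# An env var is set, but an explicit script argument still wins:
os.environ["CODECARBON_PROJECT_NAME"] = "env_project"
print(resolve_setting("project_name", script_value="script_project"))  # script_project
print(resolve_setting("project_name"))                                 # env_project
```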
```ini
# ~/.codecarbon.config (global) or ./.codecarbon.config (local)
[codecarbon]
measure_power_secs = 15
output_dir = ./emissions
output_file = emissions.csv
save_to_file = true
save_to_api = false
log_level = INFO
tracking_mode = machine
project_name = my_project

# For API usage
api_endpoint = https://api.codecarbon.io
experiment_id = your-experiment-uuid
api_key = your-api-key

# For offline mode
country_iso_code = USA
region = california

# Advanced settings
# Power Usage Effectiveness
pue = 1.2
# Water Usage Effectiveness (L/kWh)
wue = 1.8
# Override CPU TDP in watts
force_cpu_power = 65
# Override RAM power in watts
force_ram_power = 20
# Track specific GPUs only
gpu_ids = 0,1
# Allow concurrent trackers
allow_multiple_runs = true
```

```python
# Using environment variables
import os

os.environ["CODECARBON_PROJECT_NAME"] = "env_project"
os.environ["CODECARBON_LOG_LEVEL"] = "DEBUG"
os.environ["CODECARBON_GPU_IDS"] = "0,1"

from codecarbon import EmissionsTracker

tracker = EmissionsTracker()  # Uses the environment variables
```

## Command Line Interface

CodeCarbon provides CLI commands for monitoring, configuration, and hardware detection.
```bash
# Check version
codecarbon --version

# Configure CodeCarbon interactively (creates .codecarbon.config)
codecarbon config

# Log in to the CodeCarbon API
codecarbon login

# Detect hardware capabilities
codecarbon detect
# Output:
# - Available RAM: 32.000 GB
# - CPU count: 16 thread(s) in 8 physical CPU(s)
# - CPU model: Intel(R) Core(TM) i9-9900K CPU @ 3.60GHz
# - GPU count: 2
# - GPU model: NVIDIA GeForce RTX 3080

# Monitor machine emissions continuously (Ctrl+C to stop)
codecarbon monitor

# Monitor with custom settings
codecarbon monitor --measure-power-secs 10 --api-call-interval 30

# Monitor without the API (local only)
codecarbon monitor --no-api

# Monitor in offline mode
codecarbon monitor --offline --country-iso-code FRA

# Monitor while running a specific command
codecarbon monitor -- python train.py
codecarbon monitor -- ./benchmark.sh
codecarbon monitor -- npm run build

# Get an API token for a project
codecarbon get-token
```

## Saving to the CodeCarbon API

Send emissions data to the CodeCarbon dashboard for centralized tracking and visualization across experiments.

```python
from codecarbon import track_emissions, EmissionsTracker

# Using the decorator with the API
@track_emissions(
    save_to_api=True,
    experiment_id="your-experiment-uuid",  # From codecarbon config or the dashboard
    api_key="your-api-key",
    api_call_interval=4,   # Send to the API every 4 measurements
    measure_power_secs=30,
)
def train_with_api():
    import time
    for i in range(10):
        time.sleep(5)

# Using EmissionsTracker with the API
tracker = EmissionsTracker(
    save_to_api=True,
    api_endpoint="https://api.codecarbon.io",
    experiment_id="your-experiment-uuid",
    api_key="your-api-key",
    api_call_interval=8,  # -1 to only call at the end
)

tracker.start()
try:
    # Training code
    pass
finally:
    tracker.stop()
```

## Prometheus Integration

Push emissions metrics to Prometheus for monitoring and alerting. Metrics are prefixed with `codecarbon_`.
```python
from codecarbon import OfflineEmissionsTracker
import os

# Set authentication if required
os.environ["PROMETHEUS_USERNAME"] = "your_username"
os.environ["PROMETHEUS_PASSWORD"] = "your_password"

tracker = OfflineEmissionsTracker(
    project_name="prometheus_demo",
    country_iso_code="USA",
    save_to_prometheus=True,
    prometheus_url="localhost:9091",  # Pushgateway URL
    save_to_file=True,
)

tracker.start()
try:
    import time
    for i in range(10):
        time.sleep(2)
finally:
    tracker.stop()

# Metrics available in Prometheus:
# - codecarbon_emissions_total
# - codecarbon_energy_consumed_total
# - codecarbon_cpu_power
# - codecarbon_gpu_power
# - codecarbon_ram_power
```

## Logfire Integration

Send emissions data to the Pydantic Logfire observability platform.

```python
from codecarbon import EmissionsTracker

tracker = EmissionsTracker(
    project_name="logfire_demo",
    save_to_logfire=True,  # The first run prompts for a Logfire login
    save_to_file=True,
)

tracker.start()
try:
    import time
    time.sleep(10)
finally:
    tracker.stop()

# Metrics appear in Logfire in codecarbon_* format
```

## Logger Output

Send emissions data to Python logging or Google Cloud Logging for centralized log management.
```python
import logging

from codecarbon import EmissionsTracker
from codecarbon.output import LoggerOutput

# Create a dedicated logger
emissions_logger = logging.getLogger("emissions")
emissions_logger.setLevel(logging.INFO)

# Add a file handler
file_handler = logging.FileHandler("emissions.log")
file_handler.setFormatter(logging.Formatter("%(asctime)s - %(message)s"))
emissions_logger.addHandler(file_handler)

# Create the LoggerOutput wrapper
logger_output = LoggerOutput(emissions_logger, logging.INFO)

# Use it with EmissionsTracker
tracker = EmissionsTracker(
    project_name="logging_demo",
    save_to_logger=True,
    logging_logger=logger_output,
    save_to_file=False,  # Disable CSV since we're using the logger
)

tracker.start()
try:
    import time
    time.sleep(5)
finally:
    tracker.stop()
```

```python
# Google Cloud Logging example
import google.cloud.logging

from codecarbon import EmissionsTracker
from codecarbon.output import GoogleCloudLoggerOutput

client = google.cloud.logging.Client(project="my-gcp-project")
gcp_logger = GoogleCloudLoggerOutput(client.logger("codecarbon-emissions"))

tracker = EmissionsTracker(
    save_to_logger=True,
    logging_logger=gcp_logger,
)
```

## HTTP Webhook Output

Send emissions data to a custom HTTP endpoint when tracking stops.

```python
from codecarbon import EmissionsTracker

tracker = EmissionsTracker(
    project_name="webhook_demo",
    emissions_endpoint="https://your-server.com/api/emissions",
    save_to_file=True,
)

tracker.start()
try:
    import time
    time.sleep(5)
finally:
    tracker.stop()

# A POST request is sent to emissions_endpoint with the emissions data as JSON
```

## Custom Output Handlers

Create custom output handlers by extending the `BaseOutput` class.
```python
from codecarbon.output import BaseOutput
from codecarbon.output_methods.emissions_data import EmissionsData

class DatabaseOutput(BaseOutput):
    """Custom output handler that saves emissions to a database."""

    def __init__(self, connection_string):
        self.connection_string = connection_string
        # Initialize the database connection

    def out(self, total: EmissionsData, delta: EmissionsData):
        """Called when tracker.stop() or tracker.flush() is invoked."""
        print(f"Saving to database: {total.emissions} kg CO2eq")
        # Save total emissions to the database

    def live_out(self, total: EmissionsData, delta: EmissionsData):
        """Called periodically during tracking (every api_call_interval measurements)."""
        print(f"Live update: {delta.emissions} kg CO2eq (delta)")
        # Stream delta emissions to the database

    def task_out(self, task_emissions_list, experiment_name):
        """Called when tasks are tracked."""
        for task_data in task_emissions_list:
            print(f"Task: {task_data}")

    def exit(self):
        """Called when the tracker stops - clean up resources."""
        # Close the database connection
        pass

# Use the custom handler
from codecarbon import EmissionsTracker

db_output = DatabaseOutput("postgresql://localhost/emissions")
tracker = EmissionsTracker(
    output_handlers=[db_output],
    save_to_file=False,
)
```

## Hardware Detection

Programmatically detect and inspect the hardware configuration.
```python
from codecarbon import EmissionsTracker

# Create a tracker without starting it
tracker = EmissionsTracker(save_to_file=False)

# Get the detected hardware info
hardware = tracker.get_detected_hardware()
print(f"RAM: {hardware['ram_total_size']:.2f} GB")
print(f"CPU threads: {hardware['cpu_count']}")
print(f"Physical CPUs: {hardware['cpu_physical_count']}")
print(f"CPU model: {hardware['cpu_model']}")
print(f"GPU count: {hardware['gpu_count']}")
print(f"GPU model: {hardware['gpu_model']}")
print(f"Tracked GPU IDs: {hardware['gpu_ids']}")

# Example output:
# RAM: 32.00 GB
# CPU threads: 16
# Physical CPUs: 8
# CPU model: Intel(R) Core(TM) i9-9900K CPU @ 3.60GHz
# GPU count: 2
# GPU model: 2 x NVIDIA GeForce RTX 3080
# Tracked GPU IDs: None
```

## GPU Selection

Track specific GPUs instead of all available GPUs on the system.

```python
from codecarbon import EmissionsTracker
import os

# Method 1: Using CUDA_VISIBLE_DEVICES (auto-detected)
os.environ["CUDA_VISIBLE_DEVICES"] = "0,2"

# Method 2: Using the gpu_ids parameter (overrides CUDA_VISIBLE_DEVICES)
tracker = EmissionsTracker(
    project_name="multi_gpu",
    gpu_ids=[0, 1],  # Track only GPUs 0 and 1
    # Or as a string: gpu_ids="0,1"
)

tracker.start()
try:
    # GPU training code
    pass
finally:
    tracker.stop()
```

## PUE and WUE Configuration

Configure Power Usage Effectiveness (PUE) and Water Usage Effectiveness (WUE) for datacenter operations.
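For context on the two parameters below: PUE scales the energy measured on the machine up to the facility level (cooling, power delivery overhead), and WUE converts facility energy into water use. A quick worked example with illustrative numbers (how exactly the library combines the two factors may differ):

```python
# Illustrative PUE/WUE arithmetic; the numbers are examples, not measurements.
it_energy_kwh = 0.5  # energy measured on the machine itself
pue = 1.2            # facility draws 20% extra for cooling/power delivery
wue = 1.8            # liters of water per kWh of facility energy

total_energy_kwh = it_energy_kwh * pue  # facility-level energy
water_liters = total_energy_kwh * wue   # estimated water consumption

print(f"Facility energy: {total_energy_kwh:.2f} kWh")  # Facility energy: 0.60 kWh
print(f"Water consumed: {water_liters:.2f} L")         # Water consumed: 1.08 L
```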
```python
from codecarbon import EmissionsTracker

# Track with datacenter efficiency factors
tracker = EmissionsTracker(
    project_name="datacenter_tracking",
    pue=1.2,  # PUE multiplier (typical range: 1.1 - 2.2)
    wue=1.8,  # Water consumption: liters per kWh
)

tracker.start()
try:
    import time
    time.sleep(10)
finally:
    emissions = tracker.stop()

# Access water consumption data
print(f"Emissions: {tracker.final_emissions_data.emissions * 1000:.4f} g CO2eq")
print(f"Water consumed: {tracker.final_emissions_data.water_consumed:.4f} liters")
print(f"Energy consumed: {tracker.final_emissions_data.energy_consumed:.6f} kWh")
```

## Carbonboard Visualization

Visualize emissions data locally using the built-in Dash dashboard.

```bash
# Install visualization dependencies
pip install 'codecarbon[carbonboard]'

# Launch the visualization dashboard
carbonboard --filepath="./emissions.csv" --port=8050

# Open a browser to http://localhost:8050
```

```python
# Generate emissions data for visualization
from codecarbon import EmissionsTracker
import time

for project in ["project_a", "project_b", "project_c"]:
    tracker = EmissionsTracker(
        project_name=project,
        output_file="emissions.csv",
    )
    tracker.start()
    time.sleep(5)
    tracker.stop()

# Then visualize: carbonboard --filepath="./emissions.csv"
```

## EmissionsData Structure

The `EmissionsData` dataclass, returned by `tracker.final_emissions_data`, contains all tracked metrics.
```python
from codecarbon import EmissionsTracker
import time

tracker = EmissionsTracker()
tracker.start()
time.sleep(5)
tracker.stop()

data = tracker.final_emissions_data

# Core emissions metrics
print(f"Timestamp: {data.timestamp}")
print(f"Duration: {data.duration} seconds")
print(f"Emissions: {data.emissions} kg CO2eq")
print(f"Emissions rate: {data.emissions_rate} kg/s")

# Energy breakdown
print(f"Total energy: {data.energy_consumed} kWh")
print(f"CPU energy: {data.cpu_energy} kWh")
print(f"GPU energy: {data.gpu_energy} kWh")
print(f"RAM energy: {data.ram_energy} kWh")

# Power averages
print(f"CPU power: {data.cpu_power} W")
print(f"GPU power: {data.gpu_power} W")
print(f"RAM power: {data.ram_power} W")

# Utilization metrics
print(f"CPU utilization: {data.cpu_utilization_percent}%")
print(f"GPU utilization: {data.gpu_utilization_percent}%")
print(f"RAM utilization: {data.ram_utilization_percent}%")
print(f"RAM used: {data.ram_used_gb} GB")

# Location info
print(f"Country: {data.country_name} ({data.country_iso_code})")
print(f"Region: {data.region}")
print(f"On cloud: {data.on_cloud}")
print(f"Cloud provider: {data.cloud_provider}")
print(f"Cloud region: {data.cloud_region}")

# System info
print(f"OS: {data.os}")
print(f"Python: {data.python_version}")
print(f"CodeCarbon: {data.codecarbon_version}")
print(f"CPU model: {data.cpu_model}")
print(f"CPU count: {data.cpu_count}")
print(f"GPU model: {data.gpu_model}")
print(f"GPU count: {data.gpu_count}")
print(f"RAM total: {data.ram_total_size} GB")

# Export to JSON
print(data.toJSON())
```

## Comet.ml Integration

Automatically track carbon emissions alongside ML experiments in Comet.ml.
```python
# pip install "comet_ml>=3.2.2"
from comet_ml import Experiment
import tensorflow as tf

# Initialize the Comet experiment (auto-creates an EmissionsTracker)
experiment = Experiment(api_key="YOUR_COMET_API_KEY")

# Your training code
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10),
])
model.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
)
model.fit(x_train, y_train, epochs=5)

experiment.end()
# View the CodeCarbon panel in the Comet.ml dashboard
```

## Electricity Maps API Integration

Use real-time carbon intensity data from Electricity Maps for more accurate emissions estimates.

```python
from codecarbon import EmissionsTracker

# With an Electricity Maps API token (from electricitymaps.com)
tracker = EmissionsTracker(
    project_name="realtime_carbon",
    electricitymaps_api_token="your-api-token",
)

tracker.start()
try:
    import time
    time.sleep(10)
finally:
    tracker.stop()
```

CodeCarbon is an essential tool for researchers and developers who want to understand and reduce the environmental impact of their computing workloads. It integrates seamlessly into existing Python workflows through decorators, context managers, or explicit tracker objects, requiring minimal code changes while providing detailed insight into energy consumption and carbon emissions. The package supports diverse deployment scenarios, including local development machines, cloud instances (AWS, GCP, Azure), and air-gapped environments. With built-in support for multiple output destinations (CSV, API, Prometheus, Logfire, custom handlers), CodeCarbon enables comprehensive monitoring and reporting.
Whether tracking a single training run or monitoring an entire ML pipeline with task-level granularity, CodeCarbon provides the visibility needed to make informed decisions about computational sustainability.