### Configure Ollama LLM Provider (Local)

Source: https://github.com/calebevans/gha-failure-analysis/blob/main/README.md

Use Ollama for local LLM inference. No API key is typically needed for local setups.

```yaml
llm-provider: ollama
llm-model: llama3.1:70b
# No API key needed for local Ollama
```

--------------------------------

### Example PR Impact Assessment Output

Source: https://github.com/calebevans/gha-failure-analysis/blob/main/README.md

This markdown output illustrates how the action assesses the impact of PR changes on workflow failures, highlighting specific code changes and their potential contribution to issues.

```markdown
## 🔍 PR Impact Assessment

🔴 **Impact Likelihood:** High

The test failures are directly related to code changes in this PR:
- Changes to `src/auth/login.py` introduced a validation error
- Modified authentication logic conflicts with test expectations in `tests/test_auth.py`

### 💡 Relevant Code Changes

**src/auth/login.py** (+12 -3)
- Modified password validation logic at lines 45-52
- Added new timeout parameter affecting authentication flow
```

--------------------------------

### GitHub Actions: Analyze Separate workflow_run Trigger

Source: https://context7.com/calebevans/gha-failure-analysis/llms.txt

Trigger analysis after any completion of a named workflow. This example uses the 'CI' workflow and requires 'contents: read' and 'pull-requests: write' permissions.

```yaml
name: Failure Analysis
on:
  workflow_run:
    workflows: ["CI"]
    types: [completed]

jobs:
  analyze:
    if: ${{ github.event.workflow_run.conclusion == 'failure' }}
    runs-on: ubuntu-latest
    permissions:
      contents: read
      pull-requests: write
    steps:
      - uses: calebevans/gha-failure-analysis@v1
        with:
          run-id: ${{ github.event.workflow_run.id }}
          llm-provider: anthropic
          llm-model: claude-3-5-sonnet-20241022
          llm-api-key: ${{ secrets.ANTHROPIC_API_KEY }}
          post-pr-comment: "true"
```

--------------------------------

### Python: Config Dataclass Usage

Source: https://context7.com/calebevans/gha-failure-analysis/llms.txt

Demonstrates how to instantiate and use the Config dataclass, including setting environment variables for local testing, validation, and calculating token budgets. Ensure required environment variables are set.

```python
from gha_failure_analysis.config import Config

# Typical action invocation: env vars are set by the runner.
# For local testing, set them manually:
import os

os.environ["INPUT_GITHUB_TOKEN"] = "ghp_..."
os.environ["GITHUB_REPOSITORY"] = "owner/repo"
os.environ["GITHUB_RUN_ID"] = "12345678"
os.environ["INPUT_LLM_PROVIDER"] = "openai"
os.environ["INPUT_LLM_MODEL"] = "gpt-4o"
os.environ["INPUT_LLM_API_KEY"] = "sk-..."
os.environ["INPUT_POST_PR_COMMENT"] = "true"
os.environ["INPUT_IGNORED_JOBS"] = "Deploy *,Notify *"
os.environ["INPUT_CORDON_BACKEND"] = "remote"
os.environ["INPUT_CORDON_MODEL_NAME"] = "openai/text-embedding-3-small"

config = Config()

# Validate before using
errors = config.validate()
if errors:
    raise ValueError(f"Config errors: {errors}")

# Detect the model's context window (queries LiteLLM DB, falls back to 128k)
context_limit = config.detect_model_context_limit()
print(f"Context limit: {context_limit:,} tokens")  # e.g. 128,000

# Compute per-artifact token budgets given failure counts
tokens_per_step, tokens_per_test, tokens_per_artifact = config.calculate_token_budgets(
    num_failed_steps=3,
    num_failed_tests=10,
    num_artifacts=2,
)
print(tokens_per_step, tokens_per_test, tokens_per_artifact)  # e.g. 38461 19230 19230

# Filtering helpers
print(config.should_ignore_job("Deploy production"))  # True
print(config.should_ignore_job("Run tests"))  # False
print(config.should_ignore_step("Notify Slack"))  # depends on INPUT_IGNORED_STEPS
```

--------------------------------

### Configure OpenAI LLM Provider

Source: https://github.com/calebevans/gha-failure-analysis/blob/main/README.md

Set up the action to use OpenAI as the LLM provider. Requires specifying the provider, model, and API key.

```yaml
llm-provider: openai
llm-model: gpt-4o
llm-api-key: ${{ secrets.OPENAI_API_KEY }}
```

--------------------------------

### Configure Google Gemini LLM Provider

Source: https://github.com/calebevans/gha-failure-analysis/blob/main/README.md

Set up the action to use Google Gemini models. Provide the model name and your Gemini API key.

```yaml
llm-provider: gemini
llm-model: gemini-2.5-flash
llm-api-key: ${{ secrets.GEMINI_API_KEY }}
```

--------------------------------

### Run FailureAnalyzer for Workflow Analysis

Source: https://context7.com/calebevans/gha-failure-analysis/llms.txt

Instantiate and run the FailureAnalyzer with preprocessor and configuration. It takes a WorkflowAnalysis object and returns an RCAReport. Configure DSPy with your LLM before use.

```python
import dspy
from gha_failure_analysis.analysis.analyzer import FailureAnalyzer
from gha_failure_analysis.github.models import WorkflowAnalysis, WorkflowRun, JobResult, StepResult
from gha_failure_analysis.processing.preprocessor import LogPreprocessor
from gha_failure_analysis.config import Config
from datetime import datetime

# Configure DSPy with your LLM
dspy.configure(lm=dspy.LM("openai/gpt-4o", api_key="sk-..."))

config = Config()
preprocessor = LogPreprocessor(config=config)

# Build the analysis input
run = WorkflowRun(
    id=12345678,
    name="CI",
    head_branch="main",
    head_sha="abc123",
    status="completed",
    conclusion="failure",
    html_url="https://github.com/owner/repo/actions/runs/12345678",
    repository="owner/repo",
    pr_number=42,
)
step = StepResult(
    name="Run tests",
    number=3,
    status="completed",
    conclusion="failure",
    started_at=datetime.now(),
    completed_at=datetime.now(),
)
job = JobResult(
    id=99,
    name="Test (Python 3.12)",
    status="completed",
    conclusion="failure",
    steps=[step],
    log_path="/tmp/job_99.txt",
)
workflow_analysis = WorkflowAnalysis(
    workflow_run=run,
    failed_jobs=[job],
    failed_tests=[],
    additional_artifacts={},
)

# Instantiate and run the analyzer
analyzer = FailureAnalyzer(
    preprocessor=preprocessor,
    config=config,
    tokens_per_step=50_000,
    tokens_per_test=20_000,
    tokens_per_artifact_batch=20_000,
    pr_context=None,  # or a PRContext for PR-aware analysis
)
report = analyzer(workflow_analysis)

print(report.summary)  # "Workflow failed due to AssertionError in test_login: expected 200, got 401."
print(report.category)  # "test"
print(report.pr_number)  # "42"
print(report.to_markdown()[:500])
```

--------------------------------

### Run gha-failure-analysis CLI

Source: https://context7.com/calebevans/gha-failure-analysis/llms.txt

Execute the `gha-failure-analysis analyze` command to run the full analysis pipeline. Configuration is read from environment variables. Use `--verbose` for debugging. Override `INPUT_PR_NUMBER` to test against specific PRs.

```bash
# Set required environment variables (mirrors action inputs)
export INPUT_GITHUB_TOKEN="ghp_..."
export GITHUB_REPOSITORY="owner/repo"
export GITHUB_RUN_ID="12345678"
export INPUT_LLM_PROVIDER="openai"
export INPUT_LLM_MODEL="gpt-4o"
export INPUT_LLM_API_KEY="sk-..."

# Optional: post PR comment, use remote embeddings
export INPUT_POST_PR_COMMENT="true"
export INPUT_CORDON_BACKEND="remote"
export INPUT_CORDON_MODEL_NAME="openai/text-embedding-3-small"
export INPUT_CORDON_API_KEY="sk-..."

# Run analysis
gha-failure-analysis analyze

# Enable verbose logging for debugging
gha-failure-analysis analyze --verbose

# Test against a merged/closed PR by overriding PR number
export INPUT_PR_NUMBER="55"
gha-failure-analysis analyze

# Outputs:
# - Markdown report printed to stdout
# - /tmp/failure-analysis-report.json written
# - GITHUB_STEP_SUMMARY and GITHUB_OUTPUT updated if set
```

--------------------------------

### Initialize ChangeCorrelator for PR Analysis

Source: https://context7.com/calebevans/gha-failure-analysis/llms.txt

Configure DSPy and import the ChangeCorrelator and related models. This module uses LLM reasoning to correlate PR code changes with specific failures.

```python
import dspy
from gha_failure_analysis.analysis.correlator import ChangeCorrelator, correlations_to_json
from gha_failure_analysis.github.models import PRContext, FileChange

dspy.configure(lm=dspy.LM("openai/gpt-4o", api_key="sk-..."))
```

--------------------------------

### Integration Test Failure Evidence

Source: https://github.com/calebevans/gha-failure-analysis/blob/main/README.md

This log snippet provides evidence of the integration test failure, including the error message, connection pool status, and database startup time.

```text
ERROR: Timeout waiting for database connection
Connection pool exhausted: 0/10 connections available
Database startup took 45.2s (expected: <10s)
```

--------------------------------

### Configure Local Embeddings with GPU Acceleration (MPS)

Source: https://github.com/calebevans/gha-failure-analysis/blob/main/README.md

Leverage Apple Silicon (MPS) for faster local embedding generation. Set the cordon-device to 'mps'.

```yaml
cordon-device: mps  # Apple Silicon
```

--------------------------------

### GitHub Actions: Analyze Same-Workflow Job

Source: https://context7.com/calebevans/gha-failure-analysis/llms.txt

Add an 'analyze' job that runs only when the preceding job fails. Requires 'contents: read' and 'pull-requests: write' permissions.

```yaml
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm test

  analyze:
    needs: test
    if: failure()
    runs-on: ubuntu-latest
    permissions:
      contents: read
      pull-requests: write  # required for post-pr-comment
    steps:
      - uses: calebevans/gha-failure-analysis@v1
        with:
          llm-provider: openai
          llm-model: gpt-4o
          llm-api-key: ${{ secrets.OPENAI_API_KEY }}
          post-pr-comment: "true"
          analyze-pr-context: "true"
          pr-context-token-budget: "20"
```

--------------------------------

### Configure Anthropic LLM Provider

Source: https://github.com/calebevans/gha-failure-analysis/blob/main/README.md

Configure the action to use Anthropic's models. Ensure you provide the correct model name and API key.

```yaml
llm-provider: anthropic
llm-model: claude-3-5-sonnet-20241022
llm-api-key: ${{ secrets.ANTHROPIC_API_KEY }}
```

--------------------------------

### GitHubClient

Source: https://context7.com/calebevans/gha-failure-analysis/llms.txt

GitHubClient wraps PyGitHub and the raw GitHub REST API to fetch workflow run metadata, enumerate failed jobs with their steps, download job logs, and retrieve PR diffs. Pass a Config instance so job/step filtering patterns are applied automatically.

```APIDOC
## GitHubClient: GitHub API wrapper

`GitHubClient` wraps PyGitHub and the raw GitHub REST API to fetch workflow run metadata, enumerate failed jobs with their steps, download job logs, and retrieve PR diffs. Pass a `Config` instance so job/step filtering patterns are applied automatically.

### Methods

#### `get_workflow_run`

Fetches metadata for a specific GitHub Actions workflow run.

* **Parameters**:
  * `repository` (str): The owner and name of the repository (e.g., "owner/repo").
  * `run_id` (int): The ID of the workflow run.
  * `manual_pr_number` (int | None, optional): Manually specify a PR number to override detection. Defaults to None.
* **Returns**:
  * A workflow run object containing details like name, conclusion, and PR number.

#### `get_failed_jobs`

Retrieves a list of failed jobs for a given workflow run, respecting configured ignored job patterns.

* **Parameters**:
  * `repository` (str): The owner and name of the repository (e.g., "owner/repo").
  * `run_id` (int): The ID of the workflow run.
* **Returns**:
  * A list of job objects, each containing job name, conclusion, and failed steps.

#### `download_job_logs`

Downloads the raw log content for a specific job and saves it to a temporary file.

* **Parameters**:
  * `repository` (str): The owner and name of the repository (e.g., "owner/repo").
  * `job_id` (int): The ID of the job whose logs are to be downloaded.
* **Returns**:
  * The file path to the downloaded log file.

#### `get_pr_context`

Fetches the context of a pull request, including diff information, with a specified token budget for diffs.

* **Parameters**:
  * `repository` (str): The owner and name of the repository (e.g., "owner/repo").
  * `pr_number` (int): The number of the pull request.
  * `max_tokens` (int): The maximum number of tokens to allocate for diff analysis.
  * `commit_sha` (str): The commit SHA to analyze the PR context for.
* **Returns**:
  * A PR context object containing change summary and total files changed.

#### `close`

Closes the client connection. Should be called when done to release resources.
```

--------------------------------

### Configure PR Context Analysis

Source: https://github.com/calebevans/gha-failure-analysis/blob/main/README.md

Customize PR context analysis by setting the LLM provider, model, API key, and token budget. PR context analysis is enabled by default.

```yaml
- uses: calebevans/gha-failure-analysis@v1
  with:
    llm-provider: openai
    llm-model: gpt-4o
    llm-api-key: ${{ secrets.OPENAI_API_KEY }}
    analyze-pr-context: true  # Default: true
    pr-context-token-budget: 20  # % of context for PR diffs (default: 20%)
```

--------------------------------

### Configure Remote Embeddings with Gemini

Source: https://github.com/calebevans/gha-failure-analysis/blob/main/README.md

Set up remote embeddings using Google Gemini. Specify the backend, model name, and API key.

```yaml
cordon-backend: remote
cordon-model-name: gemini/gemini-embedding-001
cordon-api-key: ${{ secrets.GEMINI_API_KEY }}
```

--------------------------------

### Configure Remote Embeddings with OpenAI

Source: https://github.com/calebevans/gha-failure-analysis/blob/main/README.md

Use remote embeddings with OpenAI for faster log preprocessing. Requires specifying the backend, model name, and API key.

```yaml
cordon-backend: remote
cordon-model-name: openai/text-embedding-3-small
cordon-api-key: ${{ secrets.OPENAI_API_KEY }}
```

--------------------------------

### Run Failure Analysis in the Same Workflow

Source: https://github.com/calebevans/gha-failure-analysis/blob/main/README.md

Integrate this action as a final job that runs only when a previous job fails. It requires `contents: read` and `pull-requests: write` permissions.

```yaml
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm test

  analyze:
    needs: test
    if: failure()
    runs-on: ubuntu-latest
    permissions:
      contents: read
      pull-requests: write
    steps:
      - uses: calebevans/gha-failure-analysis@v1
        with:
          llm-provider: openai
          llm-model: gpt-4o
          llm-api-key: ${{ secrets.OPENAI_API_KEY }}
          post-pr-comment: true
```

--------------------------------

### Fetch GitHub Workflow Run Data with GitHubClient

Source: https://context7.com/calebevans/gha-failure-analysis/llms.txt

Use GitHubClient to retrieve workflow run metadata, failed jobs, and PR context. Close the client in a `finally` block so resources are released even if a call fails.

```python
from gha_failure_analysis.github.client import GitHubClient
from gha_failure_analysis.config import Config

config = Config()
client = GitHubClient(token=config.github_token, config=config)

try:
    # 1. Fetch workflow run metadata
    run = client.get_workflow_run(
        repository="owner/repo",
        run_id=12345678,
        manual_pr_number=None,  # or int to override PR detection
    )
    print(run.name, run.conclusion, run.pr_number)
    # "CI" "failure" 42

    # 2. Get failed jobs (respects config.ignored_jobs_patterns)
    failed_jobs = client.get_failed_jobs("owner/repo", 12345678)
    for job in failed_jobs:
        print(job.name, job.conclusion, len(job.failed_steps))
        # "Test (Python 3.12)" "failure" 1

    # 3. Download raw log for a job to a temp file
    log_path = client.download_job_logs("owner/repo", job_id=failed_jobs[0].id)
    print(log_path)  # /tmp/tmpXXXX.txt

    # 4. Fetch PR context with diffs (budget = 25k tokens for diffs)
    pr_ctx = client.get_pr_context(
        repository="owner/repo",
        pr_number=42,
        max_tokens=25_000,
        commit_sha=run.head_sha,  # analyze the exact failing commit
    )
    print(pr_ctx.change_summary)  # "5 files changed, +120 -34"
    print(pr_ctx.total_files_changed)  # 5
finally:
    client.close()
```

--------------------------------

### Configure Custom LLM Endpoint

Source: https://github.com/calebevans/gha-failure-analysis/blob/main/README.md

Connect to a custom LLM API endpoint by specifying the provider, model, API key, and base URL.

```yaml
llm-provider: openai
llm-model: custom-model
llm-api-key: ${{ secrets.CUSTOM_API_KEY }}
llm-base-url: https://custom-llm-gateway.example.com
```

--------------------------------

### Configure Local Embeddings with GPU Acceleration (CUDA)

Source: https://github.com/calebevans/gha-failure-analysis/blob/main/README.md

Utilize a CUDA-enabled GPU for faster local embedding generation. Set the cordon-device to 'cuda'.

```yaml
cordon-device: cuda  # NVIDIA GPUs
```

--------------------------------

### Compress CI Logs with LogPreprocessor

Source: https://context7.com/calebevans/gha-failure-analysis/llms.txt

LogPreprocessor uses anomaly detection to reduce log size while preserving critical information. It can process logs from disk or memory and supports local or remote embedding backends.

```python
from gha_failure_analysis.processing.preprocessor import LogPreprocessor
from gha_failure_analysis.config import Config

config = Config()

# Using local sentence-transformers (default)
preprocessor = LogPreprocessor(config=config)

# Or use a remote embedding API for speed
preprocessor_remote = LogPreprocessor(
    config=None,
    backend="remote",
    model_name="openai/text-embedding-3-small",
    api_key="sk-...",
    device="cpu",
)

# Compress a log file on disk
compressed = preprocessor.preprocess_file(
    log_path="/tmp/job_12345.txt",
    step_name="Run tests",
    max_tokens=50_000,
)
print(f"Compressed to {len(compressed)} chars")

# Compress log content from memory
raw_log = """
2024-01-15T10:30:00Z Running pytest...
2024-01-15T10:30:01Z Collecting tests...
2024-01-15T10:30:05Z FAILED tests/test_auth.py::test_login - AssertionError: 401 != 200
2024-01-15T10:30:05Z short test summary info
2024-01-15T10:30:05Z FAILED tests/test_auth.py::test_logout - AttributeError
"""
result = preprocessor.preprocess(
    log_content=raw_log,
    step_name="Run tests",
    max_tokens=10_000,
)
# Result contains anomalous lines wrapped in XML tags
print(result)
```

--------------------------------

### LogPreprocessor

Source: https://context7.com/calebevans/gha-failure-analysis/llms.txt

LogPreprocessor wraps the cordon library to apply transformer-based anomaly detection on raw CI logs. Lines that look statistically "normal" are discarded; anomalous blocks (errors, exceptions, unexpected output) are kept. This reduces multi-megabyte logs to a few thousand tokens while retaining the most diagnostic content.

```APIDOC
## LogPreprocessor: semantic log compression

`LogPreprocessor` wraps the `cordon` library to apply transformer-based anomaly detection on raw CI logs. Lines that look statistically "normal" are discarded; anomalous blocks (errors, exceptions, unexpected output) are kept. This reduces multi-megabyte logs to a few thousand tokens while retaining the most diagnostic content.

### Methods

#### `preprocess_file`

Compresses a log file on disk by removing statistically normal lines and keeping anomalous ones.

* **Parameters**:
  * `log_path` (str): The path to the log file to preprocess.
  * `step_name` (str): The name of the step associated with the log.
  * `max_tokens` (int): The maximum number of tokens the compressed log should contain.
* **Returns**:
  * A string containing the compressed log content.

#### `preprocess`

Compresses log content provided as a string by removing statistically normal lines and keeping anomalous ones.

* **Parameters**:
  * `log_content` (str): The raw log content as a string.
  * `step_name` (str): The name of the step associated with the log.
  * `max_tokens` (int): The maximum number of tokens the compressed log should contain.
* **Returns**:
  * A string containing the compressed log content, with anomalous lines wrapped in XML tags.
```

--------------------------------

### Run Failure Analysis in a Separate Workflow

Source: https://github.com/calebevans/gha-failure-analysis/blob/main/README.md

Set up a separate workflow that triggers on the completion of another workflow and runs the analysis only when that workflow failed. This configuration requires `contents: read` and `pull-requests: write` permissions.

```yaml
name: Failure Analysis
on:
  workflow_run:
    workflows: ["CI"]
    types: [completed]

jobs:
  analyze:
    if: ${{ github.event.workflow_run.conclusion == 'failure' }}
    runs-on: ubuntu-latest
    permissions:
      contents: read
      pull-requests: write
    steps:
      - uses: calebevans/gha-failure-analysis@v1
        with:
          run-id: ${{ github.event.workflow_run.id }}
          llm-provider: anthropic
          llm-model: claude-3-5-sonnet-20241022
          llm-api-key: ${{ secrets.ANTHROPIC_API_KEY }}
```

--------------------------------

### Configure Local Embeddings

Source: https://github.com/calebevans/gha-failure-analysis/blob/main/README.md

Enable local embedding generation using sentence-transformers. Specify the backend, model name, and the device to use (CPU, CUDA, or MPS).

```yaml
cordon-backend: sentence-transformers
cordon-model-name: all-MiniLM-L6-v2
cordon-device: cpu  # or cuda/mps for GPU
```

--------------------------------

### SQL Schema Change

Source: https://github.com/calebevans/gha-failure-analysis/blob/main/README.md

This diff shows an addition of complex indexes to the database schema, which is identified as a contributing factor to increased initialization time.

```diff
+ -- Added complex indexes that slow initialization
+ CREATE INDEX CONCURRENTLY idx_users_email ON users(email);
```

--------------------------------

### Fetch PR Context from GitHub

Source: https://context7.com/calebevans/gha-failure-analysis/llms.txt

Retrieve Pull Request context, including file changes and commit SHAs, from GitHub. This function respects token budgets for diff retrieval.

```python
from github import Auth, Github
from gha_failure_analysis.github.pr_context import (
    fetch_pr_context,
    summarize_changes,
    get_relevant_diffs,
    find_related_files,
)

g = Github(auth=Auth.Token("ghp_..."))

# Fetch PR context for a specific failing commit (respects 30k token budget for diffs)
pr_ctx = fetch_pr_context(
    github_client=g,
    repository="owner/repo",
    pr_number=42,
    max_tokens=30_000,
    commit_sha="abc123def456",
)
print(pr_ctx.change_summary)  # "5 files changed, +120 -34"

# Build a human-readable summary for LLM prompts
summary = summarize_changes(pr_ctx, max_files=10)
print(summary)
# PR #42: Refactor auth validation
# Changed: 5 files changed, +120 -34
# Changed Files:
# Modified (3):
# - src/auth/login.py (+12 -3)
# ...

# Find which changed files are related to a failing test
related = find_related_files(pr_ctx, "tests/test_auth.py::TestAuth::test_login")
print(related)  # ["src/auth/login.py", "tests/test_auth.py"]

# Extract the diffs for only those related files
diffs = get_relevant_diffs(pr_ctx, related)
print(diffs[:400])
# ### src/auth/login.py (modified)
# +12 -3
# ```diff
# @@ -45,7 +45,12 @@
# ...
```

--------------------------------

### Write Job Summary, JSON Report, Action Output, and PR Comment

Source: https://context7.com/calebevans/gha-failure-analysis/llms.txt

These Python functions write analysis results to GitHub Actions infrastructure. Ensure the GHA environment variables are set for summary and output files. The `report` object is assumed to be an RCAReport produced by FailureAnalyzer.

```python
import json
import os

from gha_failure_analysis.output.report import write_job_summary, write_json_report, set_action_output
from gha_failure_analysis.output.github import post_pr_comment

# Simulate GHA environment
os.environ["GITHUB_STEP_SUMMARY"] = "/tmp/summary.md"
os.environ["GITHUB_OUTPUT"] = "/tmp/outputs.txt"

# `report` is an RCAReport produced by FailureAnalyzer

# write_job_summary appends the markdown report to the step summary file
write_job_summary(report)

# write_json_report serializes to JSON, sanitizes secrets, writes to disk
write_json_report(report, "/tmp/failure-analysis-report.json")

data = json.loads(open("/tmp/failure-analysis-report.json").read())
print(data.keys())
# dict_keys(['workflow_name', 'run_id', 'repository', 'pr_number', 'summary',
#            'detailed_analysis', 'category', 'pr_impact_assessment',
#            'step_analyses', 'test_analyses', 'change_correlations'])

# set_action_output writes "name=value" lines to GITHUB_OUTPUT
set_action_output("summary", report.summary)
set_action_output("category", report.category)
set_action_output("report-path", "/tmp/failure-analysis-report.json")

# post_pr_comment posts the full markdown report as a PR issue comment
post_pr_comment(
    github_token="ghp_...",
    repository="owner/repo",
    pr_number=42,
    report=report,
)
```

--------------------------------

### GitHubActionsLogParser

Source: https://context7.com/calebevans/gha-failure-analysis/llms.txt

GitHubActionsLogParser parses the raw GitHub Actions log format (timestamped lines with ##[group]/##[endgroup]/##[error] annotations) into typed StepLog objects, each holding its LogLine list and annotation messages.

```APIDOC
## GitHubActionsLogParser: structured log parsing

`GitHubActionsLogParser` parses the raw GitHub Actions log format (timestamped lines with `##[group]`/`##[endgroup]`/`##[error]` annotations) into typed `StepLog` objects, each holding its `LogLine` list and annotation messages.

### Methods

#### `parse_log_file`

Parses a log file in GitHub Actions format and returns a list of `StepLog` objects.

* **Parameters**:
  * `log_path` (str): The path to the log file.
* **Returns**:
  * A list of `StepLog` objects, each representing a step in the workflow.

#### `parse_log_content`

Parses log content provided as a string in GitHub Actions format and returns a list of `StepLog` objects.

* **Parameters**:
  * `log_content` (str): The raw log content as a string.
* **Returns**:
  * A list of `StepLog` objects, each representing a step in the workflow.

#### `extract_step_logs`

Extracts and formats the logs for a specific step from a log file.

* **Parameters**:
  * `log_path` (str): The path to the log file.
  * `step_name` (str): The name of the step whose logs are to be extracted.
* **Returns**:
  * A formatted string containing the logs for the specified step.

#### `get_step_names`

Lists all the step names found within a log file.

* **Parameters**:
  * `log_path` (str): The path to the log file.
* **Returns**:
  * A list of strings, where each string is a step name.
```

--------------------------------

### Manually Build and Render RCAReport

Source: https://context7.com/calebevans/gha-failure-analysis/llms.txt

Construct an RCAReport object manually for testing or specific use cases. The `to_markdown()` method renders a secret-sanitized report suitable for GitHub summaries or PR comments. Access structured fields directly for programmatic use.

```python
from gha_failure_analysis.analysis.analyzer import RCAReport, StepAnalysis

# Build a report manually (useful for testing output formatting)
step_analysis = StepAnalysis(
    job_name="Test (Python 3.12)",
    step_name="Run tests",
    failure_category="test",
    root_cause="AssertionError: expected HTTP 200, received 401 in test_login.",
    evidence=[
        {"source": "tests/test_auth.py:45", "content": "AssertionError: 401 != 200"},
        {"source": "tests/test_auth.py:60", "content": "AttributeError: 'NoneType' object"},
    ],
)

report = RCAReport(
    workflow_name="CI",
    run_id="12345678",
    pr_number="42",
    summary="Tests failed due to authentication assertion errors.",
    detailed_analysis="### Immediate Cause\nHTTP 401 returned instead of 200.\n\n### Contributing Factors\nPR changes modified auth validation logic.",
    category="test",
    step_analyses=[step_analysis],
    repository="owner/repo",
    pr_impact_assessment="Likelihood: high\n\nChanges to src/auth/login.py introduced a stricter validator that breaks existing test expectations.",
)

# Render markdown (automatically redacts secrets)
md = report.to_markdown()
print(md)
# # 🔍 Workflow Failure Analysis
# | | |
# |---|---|
# | **Workflow** | `CI` |
# | **Run ID** | [#12345678](...) |
# | **Pull Request** | [#42](...) |
# | **Category** | 🧪 Test |
# ...

# Access structured fields
print(report.category)  # "test"
print(report.pr_number)  # "42"
print(len(report.step_analyses))  # 1
```

--------------------------------

### Parse GitHub Actions Logs with GitHubActionsLogParser

Source: https://context7.com/calebevans/gha-failure-analysis/llms.txt

GitHubActionsLogParser converts raw GitHub Actions logs into structured StepLog objects. It can parse logs from files or strings and extract specific step information.

```python
from gha_failure_analysis.parsing.log_parser import GitHubActionsLogParser

parser = GitHubActionsLogParser()

# Parse a downloaded log file
steps = parser.parse_log_file("/tmp/job_12345.txt")
for step in steps:
    print(step.step_name, len(step.lines), step.annotations)
    # "Run tests" 142 ['ERROR: pytest returned exit code 1']

# Parse from a string
log_content = """
2024-01-15T10:30:00.000Z ##[group]Run tests
2024-01-15T10:30:01.000Z Collecting tests ...
2024-01-15T10:30:05.000Z ##[error]pytest returned exit code 1
2024-01-15T10:30:05.000Z ##[endgroup]
"""
steps = parser.parse_log_content(log_content)
print(steps[0].step_name)  # "Run tests"
print(steps[0].annotations)  # ['ERROR: pytest returned exit code 1']

# Extract and format a specific step's logs
formatted = parser.extract_step_logs("/tmp/job_12345.txt", step_name="Run tests")
print(formatted[:200])

# List all step names in a log file
names = parser.get_step_names("/tmp/job_12345.txt")
print(names)  # ['Set up Python', 'Install dependencies', 'Run tests']
```

--------------------------------

### Retry Function Calls with Exponential Backoff

Source: https://context7.com/calebevans/gha-failure-analysis/llms.txt

Apply the retry_with_backoff decorator to functions that may encounter transient errors, such as network issues or rate limits. Configurable for max retries and different delay strategies.

```python
import httpx

from gha_failure_analysis.utils import retry_with_backoff


@retry_with_backoff(
    max_retries=4,
    base_delay=1.0,  # 1s, 2s, 4s for transient errors
    rate_limit_delay=10.0,  # 10s, 20s, 40s for 429/quota errors
    context_errors_no_retry=True,
)
def call_llm_api(prompt: str) -> str:
    response = httpx.post(
        "https://api.openai.com/v1/chat/completions",
        json={"model": "gpt-4o", "messages": [{"role": "user", "content": prompt}]},
        headers={"Authorization": "Bearer sk-..."},
    )
    response.raise_for_status()
    return response.json()["choices"][0]["message"]["content"]

# Will retry up to 4 times on network errors or 429 responses.
```

--------------------------------

### Sanitize Text for Secrets

Source: https://context7.com/calebevans/gha-failure-analysis/llms.txt

Use LeakDetector to scan text for sensitive information like API keys and passwords, replacing them with redaction labels. Useful for logging and reporting.

```python
from gha_failure_analysis.security.leak_detector import LeakDetector

detector = LeakDetector()

text = """
Build failed. API key used: sk-proj-ABCDEFGHIJ1234567890abcdefghij
Database URL: postgresql://admin:s3cr3tPassw0rd@db.example.com/prod
"""
sanitized = detector.sanitize_text(text)
print(sanitized)
# Build failed. API key used: [REDACTED: Secret Keyword]
# Database URL: postgresql://admin:[REDACTED: Basic Auth Credentials]@db.example.com/prod

# Safe to use on empty / None-equivalent strings
print(detector.sanitize_text(""))  # ""
```

--------------------------------

### Correlate Step Failure with PR Context

Source: https://context7.com/calebevans/gha-failure-analysis/llms.txt

Use ChangeCorrelator to link a failed CI step to specific changes in a Pull Request. Requires a PRContext object and failure details.

```python
# Import paths follow the "Initialize ChangeCorrelator for PR Analysis" snippet.
from gha_failure_analysis.analysis.correlator import ChangeCorrelator, correlations_to_json
from gha_failure_analysis.github.models import PRContext, FileChange

# Build a minimal PRContext
pr_context = PRContext(
    pr_number=42,
    title="Refactor auth validation",
    description="Tightened password validation logic.",
    changed_files=[
        FileChange(
            filename="src/auth/login.py",
            status="modified",
            additions=12,
            deletions=3,
            changes=15,
            patch="@@ -45,7 +45,12 @@\n- if password:\n+ if len(password) >= 12:",
        )
    ],
    total_additions=12,
    total_deletions=3,
    base_sha="base000",
    head_sha="head999",
)

correlator = ChangeCorrelator()

# Correlate a failed step
step_result = correlator.correlate_with_step(
    step_name="Test (Python 3.12)/Run tests",
    failure_details="AssertionError: expected 200 got 401 in test_login",
    pr_context=pr_context,
)
print(step_result.confidence)  # "high"
print(step_result.likely_caused_by_pr)  # True
print(step_result.related_files)  # ["src/auth/login.py:45"]
print(step_result.reasoning)  # "Password length guard introduced in PR ..."

# Correlate a failed test
test_result = correlator.correlate_with_test(
    test_identifier="tests/test_auth.py::TestAuth::test_login",
    failure_details="AssertionError: 401 != 200",
    pr_context=pr_context,
)
print(test_result.to_dict())

# Serialize results for downstream use
json_str = correlations_to_json([step_result, test_result])
print(json_str[:300])
```
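--------------------------------

### Sketch: Shape of Serialized Correlation Results

The correlation example ends by serializing results for downstream use but does not show the resulting JSON. The following is a minimal, self-contained sketch of what such serialization could look like. `CorrelationSketch` and `sketch_to_json` are hypothetical stand-ins and not part of the library; only the field names (`confidence`, `likely_caused_by_pr`, `related_files`, `reasoning`) are taken from the example output, and the real `correlations_to_json` may lay out its output differently.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class CorrelationSketch:
    """Hypothetical stand-in for a correlation result (not the library's class)."""

    confidence: str
    likely_caused_by_pr: bool
    related_files: list[str] = field(default_factory=list)
    reasoning: str = ""

    def to_dict(self) -> dict:
        # Convert the dataclass into a plain, JSON-serializable dict.
        return asdict(self)


def sketch_to_json(results: list[CorrelationSketch]) -> str:
    """Serialize results as a JSON array of objects (assumed layout)."""
    return json.dumps([r.to_dict() for r in results], indent=2)


step_sketch = CorrelationSketch(
    confidence="high",
    likely_caused_by_pr=True,
    related_files=["src/auth/login.py:45"],
    reasoning="Password length guard introduced in PR ...",
)
json_sketch = sketch_to_json([step_sketch])
print(json_sketch)
```

Keeping the result a flat dict of primitives means a downstream job can parse it with any JSON library, without importing the analysis package.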