### Setup and Installation Commands

Source: https://github.com/vibrantlabsai/ragas/blob/main/CONTRIBUTING.md

Commands to initialize the development environment. Includes both automated Makefile targets and manual installation methods using uv.

```bash
# Automated setup
make install-minimal
make install

# Manual setup
curl -LsSf https://astral.sh/uv/install.sh | sh
uv pip install -e ".[dev-minimal]"
uv sync --group dev
```

--------------------------------

### Quickstart Ragas Project Setup (pip)

Source: https://github.com/vibrantlabsai/ragas/blob/main/docs/getstarted/evals.md

These commands install the Ragas library with pip, generate a new project named 'rag_eval' with the 'ragas quickstart' command, and then synchronize the environment using uv. Use this method if you prefer not to use uvx.

```sh
pip install ragas
ragas quickstart rag_eval
cd rag_eval
uv sync
```

--------------------------------

### Initialize and Install Ragas Project

Source: https://github.com/vibrantlabsai/ragas/blob/main/docs/howtos/cli/rag_eval.md

Commands to create a new RAG evaluation project and install the necessary dependencies using uv or pip.

```bash
uvx ragas quickstart rag_eval
cd rag_eval
uv sync
# Or with pip:
pip install -e .
```

--------------------------------

### Quickstart Ragas Project Setup (uvx)

Source: https://github.com/vibrantlabsai/ragas/blob/main/docs/getstarted/evals.md

These commands use uvx to set up a new Ragas project named 'rag_eval': 'uvx ragas quickstart' generates the sample code, and 'uv sync' installs the dependencies and synchronizes the environment. This is the recommended method for getting started.

```sh
uvx ragas quickstart rag_eval
cd rag_eval
uv sync
```

--------------------------------

### Install Ragas with Examples

Source: https://github.com/vibrantlabsai/ragas/blob/main/docs/tutorials/index.md

Installs the Ragas library along with example dependencies using pip. This command ensures that all necessary components for running Ragas examples are available.

```bash
pip install "ragas[examples]"
```

--------------------------------

### Install Project Dependencies

Source: https://github.com/vibrantlabsai/ragas/blob/main/docs/getstarted/quickstart.md

Commands to install necessary project dependencies for the Ragas evaluation environment.

```shell
uv sync
```

```shell
pip install -e .
```

--------------------------------

### Implement RAG Query and Evaluation Workflow

Source: https://github.com/vibrantlabsai/ragas/blob/main/docs/howtos/cli/rag_eval.md

Examples of querying a RAG client and defining custom evaluation metrics using the DiscreteMetric class.

```python
from rag import default_rag_client
from ragas import Dataset, experiment
from ragas.metrics import DiscreteMetric

# `openai_client` and `llm` are assumed to be initialized elsewhere.
rag_client = default_rag_client(llm_client=openai_client, logdir="evals/logs")
response = rag_client.query("What is Ragas?")

# Custom pass/fail metric
my_metric = DiscreteMetric(name="correctness", prompt="...", allowed_values=["pass", "fail"])

@experiment()
async def run_experiment(row):
    response = rag_client.query(row["question"])
    score = my_metric.score(llm=llm, response=response["answer"])
    return {**row, "response": response["answer"], "score": score.value}
```

--------------------------------

### Install Project Dependencies

Source: https://github.com/vibrantlabsai/ragas/blob/main/docs/howtos/cli/agent_evals.md

Installs or synchronizes project dependencies using 'uv'. This ensures all necessary libraries for the agent evaluation are available.

```shell
uv sync
```

--------------------------------

### Install Ragas and Set OpenAI API Key

Source: https://github.com/vibrantlabsai/ragas/blob/main/docs/tutorials/prompt.md

Installs the Ragas library with example dependencies and sets the OpenAI API key as an environment variable. This is a prerequisite for running the prompt evaluation examples.

```bash
pip install "ragas[examples]"
export OPENAI_API_KEY="your_openai_api_key"
```
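
Before running the examples, it can help to confirm that the key is actually visible to Python. The following is a minimal sketch using only the standard library; the helper name `require_env` is hypothetical and not part of Ragas.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is unset or empty."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it before running the examples")
    return value


# Example (hypothetical usage): fail early instead of hitting an
# authentication error in the middle of an evaluation run.
# api_key = require_env("OPENAI_API_KEY")
```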

--------------------------------

### Full Quickstart Workflow

Source: https://github.com/vibrantlabsai/ragas/blob/main/docs/howtos/cli/index.md

A complete sequence of commands to initialize a project, install dependencies, configure environment variables, and execute an evaluation.

```shell
uvx ragas quickstart rag_eval
cd rag_eval
uv sync
export OPENAI_API_KEY="your-key"
uv run python evals.py
```

--------------------------------

### Configure LLM Providers for Evaluation

Source: https://github.com/vibrantlabsai/ragas/blob/main/docs/howtos/cli/rag_eval.md

Environment variable setup and Python client initialization for various LLM providers including OpenAI, Anthropic, Google Gemini, and local Ollama models.

```bash
export OPENAI_API_KEY="your-openai-key"
export ANTHROPIC_API_KEY="your-anthropic-key"
export GOOGLE_API_KEY="your-google-api-key"
```

```python
import os

# Anthropic
from anthropic import Anthropic
from ragas.llms import llm_factory
client = Anthropic(api_key=os.environ.get("ANTHROPIC_API_KEY"))
llm = llm_factory("claude-3-5-sonnet-20241022", provider="anthropic", client=client)

# Google Gemini
import google.generativeai as genai
genai.configure(api_key=os.environ.get("GOOGLE_API_KEY"))
client = genai.GenerativeModel("gemini-2.0-flash")
llm = llm_factory("gemini-2.0-flash", provider="google", client=client)

# Local Ollama
from openai import OpenAI
client = OpenAI(api_key="ollama", base_url="http://localhost:11434/v1")
llm = llm_factory("mistral", provider="openai", client=client)
```
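
When several keys may or may not be exported, a small check can show which providers are usable before any client is created. This is a sketch under stated assumptions: the mapping and the helper `available_providers` are hypothetical, and the labels simply reuse the environment variable names and provider strings shown above.

```python
import os

# Environment variable -> provider label, taken from the examples above.
PROVIDER_ENV_VARS = {
    "OPENAI_API_KEY": "openai",
    "ANTHROPIC_API_KEY": "anthropic",
    "GOOGLE_API_KEY": "google",
}


def available_providers(env=None):
    """Return provider labels whose API key is present and non-empty."""
    env = os.environ if env is None else env
    return [provider for var, provider in PROVIDER_ENV_VARS.items() if env.get(var)]


print(available_providers())
```

Passing a plain dict as `env` makes the check easy to exercise without touching the real environment.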

--------------------------------

### Initialize and Run Ragas Benchmark Project

Source: https://github.com/vibrantlabsai/ragas/blob/main/docs/howtos/cli/benchmark_llm.md

Commands to create the project structure, install dependencies, set environment variables, and execute the evaluation script for various LLM models.

```bash
ragas quickstart benchmark_llm
cd benchmark_llm
uv sync
export OPENAI_API_KEY="your-openai-key"
uv run python evals.py
uv run python evals.py --model gpt-4o
uv run python evals.py --model gpt-3.5-turbo
```

--------------------------------

### Initialize and Run Ragas Text-to-SQL Project

Source: https://github.com/vibrantlabsai/ragas/blob/main/docs/howtos/cli/text2sql.md

Commands to scaffold a new text-to-SQL evaluation project, install dependencies, configure the OpenAI API key, and execute the evaluation workflow.

```bash
ragas quickstart text2sql
cd text2sql
uv sync
export OPENAI_API_KEY="your-openai-key"
uv run python evals.py
```

--------------------------------

### Install Ragas Text-to-SQL dependencies

Source: https://github.com/vibrantlabsai/ragas/blob/main/docs/howtos/applications/text2sql.md

Installs the necessary Python packages required to run the text-to-SQL evaluation examples using uv.

```bash
uv pip install "ragas-examples[text2sql]"
```

--------------------------------

### Start MLflow UI for Tracing

Source: https://github.com/vibrantlabsai/ragas/blob/main/docs/howtos/cli/improve_rag.md

Shows how to start the MLflow UI on port 5000. This is an optional step to enable detailed tracing of LLM calls during the evaluation.

```shell
mlflow ui --port 5000
```

--------------------------------

### Initialize Ragas Project

Source: https://github.com/vibrantlabsai/ragas/blob/main/docs/getstarted/quickstart.md

Commands to create a new Ragas evaluation project directory using either uvx or pip.

```shell
uvx ragas quickstart rag_eval
cd rag_eval
```

```shell
pip install ragas
ragas quickstart rag_eval
cd rag_eval
```

--------------------------------

### Advanced Prompt Adaptation Examples

Source: https://github.com/vibrantlabsai/ragas/blob/main/docs/howtos/migrations/migrate_from_v03_to_v04.md

Examples of partial adaptation (language only) versus full translation (instruction and examples) using the adapt() method.

```python
# Adapt without instruction text (lightweight)
from ragas.metrics.collections import AnswerRelevancy
metric = AnswerRelevancy(llm=llm)
adapted_prompt = await metric.prompt.adapt(
    target_language="french",
    llm=llm,
    adapt_instruction=False
)
metric.prompt = adapted_prompt

# Adapt with instruction translation (full translation)
adapted_prompt = await metric.prompt.adapt(
    target_language="german",
    llm=llm,
    adapt_instruction=True
)
metric.prompt = adapted_prompt
```

--------------------------------

### Verify Installation with Hello World

Source: https://github.com/vibrantlabsai/ragas/blob/main/docs/howtos/cli/index.md

Generates a basic hello world example in the specified directory to verify that the Ragas environment is configured correctly.

```shell
ragas hello_world [DIRECTORY]
```

--------------------------------

### Implement and Evaluate LLM Workflow

Source: https://github.com/vibrantlabsai/ragas/blob/main/docs/howtos/cli/workflow_eval.md

Examples showing how to invoke a workflow client for email processing and how to structure a dataset for evaluation purposes.

```python
from workflow import default_workflow_client

workflow = default_workflow_client()
result = workflow.process_email("I found a bug in version 2.1.4...")

def load_dataset():
    dataset_dict = [
        {
            "email": "Hi, I'm getting error code XYZ-123 when using version 2.1.4...",
            "pass_criteria": "category Bug Report; product_version 2.1.4; error_code XYZ-123",
        }
    ]
```
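
The `pass_criteria` string in the sample row appears to pack several `key value` pairs separated by semicolons. Assuming that format holds, a sketch of turning it into a dictionary for field-by-field checks might look like this; `parse_pass_criteria` is a hypothetical helper, not part of the generated project.

```python
def parse_pass_criteria(criteria: str) -> dict:
    """Split 'key value; key value' into a dict, treating the first token as the key."""
    parsed = {}
    for part in criteria.split(";"):
        part = part.strip()
        if not part:
            continue  # tolerate trailing or doubled semicolons
        key, _, value = part.partition(" ")
        parsed[key] = value.strip()
    return parsed


criteria = "category Bug Report; product_version 2.1.4; error_code XYZ-123"
print(parse_pass_criteria(criteria))
# {'category': 'Bug Report', 'product_version': '2.1.4', 'error_code': 'XYZ-123'}
```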

--------------------------------

### Install and Use Ragas Plugin (Bash)

Source: https://github.com/vibrantlabsai/ragas/blob/main/src/ragas/backends/README.md

Commands to install a developed Ragas backend plugin and verify its registration within the Ragas registry.

```bash
pip install my-backend-plugin
python -c "from ragas.backends import get_registry; print(get_registry())"
```

--------------------------------

### Customizing Metric Prompts with Domain Examples (Python)

Source: https://github.com/vibrantlabsai/ragas/blob/main/docs/howtos/migrations/migrate_from_v03_to_v04.md

Demonstrates how to customize a metric's prompt by providing domain-specific examples. This approach can significantly improve metric accuracy by tailoring the evaluation to a particular domain's nuances.

```python
from ragas.metrics.collections import Faithfulness
from ragas.metrics.collections.faithfulness.util import (
    FaithfulnessInput,
    FaithfulnessOutput,
    FaithfulnessPrompt,
    StatementFaithfulnessAnswer,
)

class DomainSpecificPrompt(FaithfulnessPrompt):
    examples = [
        (
            FaithfulnessInput(
                response="ML uses statistical techniques.",
                context="Machine learning is a field that uses algorithms to learn from data.",
            ),
            FaithfulnessOutput(
                statements=[
                    StatementFaithfulnessAnswer(
                        statement="ML uses statistical techniques.",
                        reason="Related to learning from data, but context doesn't explicitly mention statistical techniques.",
                        verdict=0
                    ),
                ]
            ),
        ),
        # Add more domain-specific examples here
    ]

# Assuming 'llm' is an initialized LLM instance
# Apply custom prompt
metric = Faithfulness(llm=llm)
metric.prompt = DomainSpecificPrompt()
```

--------------------------------

### Configure Environment and API Access

Source: https://github.com/vibrantlabsai/ragas/blob/main/docs/howtos/applications/benchmark_llm.md

Commands to install the necessary Ragas example dependencies and set the required environment variables for API authentication.

```bash
pip install "ragas[examples]"
export OPENAI_API_KEY=your_actual_api_key
```

--------------------------------

### Initialize and Run Ragas Workflow Evaluation

Source: https://github.com/vibrantlabsai/ragas/blob/main/docs/howtos/cli/workflow_eval.md

Commands to initialize a new Ragas workflow project, install necessary dependencies, configure API keys, and execute the evaluation script.

```shell
ragas quickstart workflow_eval
cd workflow_eval
uv sync
export OPENAI_API_KEY="your-openai-key"
uv run python evals.py
```

--------------------------------

### Configure LLM Providers

Source: https://github.com/vibrantlabsai/ragas/blob/main/docs/howtos/cli/improve_rag.md

Demonstrates how to swap the underlying LLM model in the RAG pipeline. It shows initialization for OpenAI models and the setup for Anthropic clients.

```python
# Use GPT-4o for better accuracy
# (`RAG`, `client`, and `retriever` are assumed to come from the project code)
rag = RAG(llm_client=client, retriever=retriever, model="gpt-4o")

# Or use a different provider
from anthropic import Anthropic
client = Anthropic()
```

--------------------------------

### Adapt Prompts for Language Localization

Source: https://github.com/vibrantlabsai/ragas/blob/main/docs/howtos/migrations/migrate_from_v03_to_v04.md

Shows how to use the BasePrompt.adapt() method to translate prompt instructions and examples, replacing the deprecated PromptMixin approach.

```python
from ragas.metrics.collections import Faithfulness

# Create metric with default prompt
metric = Faithfulness(llm=llm)

# Adapt individual prompt to another language
adapted_prompt = await metric.prompt.adapt(
    target_language="spanish",
    llm=llm,
    adapt_instruction=True
)

# Apply adapted prompt
metric.prompt = adapted_prompt
```

--------------------------------

### Use Pre-built DiscreteMetric for Evaluation

Source: https://github.com/vibrantlabsai/ragas/blob/main/docs/getstarted/evals.md

Demonstrates how to use a pre-built DiscreteMetric for evaluating specific aspects of a response. It sets up an evaluator LLM and defines a custom prompt for the metric. The example shows how to asynchronously score a given response.

```python
import asyncio
from openai import AsyncOpenAI
from ragas.metrics import DiscreteMetric
from ragas.llms import llm_factory

# Setup your evaluator LLM
client = AsyncOpenAI()
evaluator_llm = llm_factory("gpt-4o", client=client)

# Create a custom aspect evaluator
metric = DiscreteMetric(
    name="summary_accuracy",
    allowed_values=["accurate", "inaccurate"],
    prompt="""Evaluate if the summary is accurate and captures key information.

Response: {response}

Answer with only 'accurate' or 'inaccurate'."""
)

# Score your application's output
async def main():
    score = await metric.ascore(
        llm=evaluator_llm,
        response="The summary of the text is..."
    )
    print(f"Score: {score.value}")  # 'accurate' or 'inaccurate'
    print(f"Reason: {score.reason}")


if __name__ == "__main__":
    asyncio.run(main())
```
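
Because `ascore` is awaited, several responses can be scored concurrently with `asyncio.gather`. The sketch below illustrates only the concurrency pattern: `StubMetric` and `StubResult` are hypothetical stand-ins that mimic the `ascore(llm=..., response=...)` call shape from the example above, so the code runs without an API key.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class StubResult:
    value: str
    reason: str


class StubMetric:
    """Hypothetical stand-in; a real metric would call the evaluator LLM."""

    async def ascore(self, llm, response: str) -> StubResult:
        await asyncio.sleep(0)  # yield control, as a network call would
        verdict = "accurate" if response.strip() else "inaccurate"
        return StubResult(value=verdict, reason="stub rule: non-empty response")


async def score_all(metric, llm, responses):
    """Score every response concurrently; results keep the input order."""
    return await asyncio.gather(
        *(metric.ascore(llm=llm, response=r) for r in responses)
    )


results = asyncio.run(score_all(StubMetric(), llm=None, responses=["A summary.", ""]))
print([r.value for r in results])  # ['accurate', 'inaccurate']
```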

--------------------------------

### Install Ragas and Dependencies with pip

Source: https://github.com/vibrantlabsai/ragas/blob/main/docs/howtos/applications/vertexai_x_ragas.md

Installs the necessary Ragas libraries and their dependencies, including langchain-core, langchain-google-vertexai, and rouge_score, for use with Vertex AI. The leading `!` runs pip as a shell command from a notebook cell.

```python
!pip install --upgrade --user --quiet langchain-core langchain-google-vertexai langchain ragas rouge_score
```

--------------------------------

### Define Custom Test Datasets

Source: https://github.com/vibrantlabsai/ragas/blob/main/docs/getstarted/quickstart.md

Example of how to extend the load_dataset function in evals.py to include custom test cases for evaluation.

```python
from ragas import Dataset

def load_dataset():
    dataset = Dataset(
        name="test_dataset",
        backend="local/csv",
        root_dir=".",
    )
    data_samples = [
        {"question": "What is Ragas?", "grading_notes": "Ragas is an evaluation framework"},
        {"question": "How do metrics work?", "grading_notes": "Metrics evaluate quality"}
    ]
    for sample in data_samples:
        dataset.append(sample)
    dataset.save()
    return dataset
```
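
When extending `data_samples` by hand, it is easy to leave out a field. A possible pre-check, sketched here with a hypothetical helper `find_invalid_samples`, is to confirm each row carries the two keys used in the example above before appending it to the dataset.

```python
# Keys taken from the sample rows above; adjust if your dataset uses other columns.
REQUIRED_KEYS = ("question", "grading_notes")


def find_invalid_samples(samples):
    """Return (index, missing_keys) pairs for rows lacking a required non-empty field."""
    problems = []
    for index, sample in enumerate(samples):
        missing = [key for key in REQUIRED_KEYS if not sample.get(key)]
        if missing:
            problems.append((index, missing))
    return problems


samples = [
    {"question": "What is Ragas?", "grading_notes": "Ragas is an evaluation framework"},
    {"question": "How do metrics work?"},  # grading_notes left out on purpose
]
print(find_invalid_samples(samples))  # [(1, ['grading_notes'])]
```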

--------------------------------

### Quickstart Ragas Project Creation

Source: https://github.com/vibrantlabsai/ragas/blob/main/README.md

Demonstrates how to use the `ragas quickstart` command to generate a RAG evaluation project. It shows how to list available templates, create a project, and specify an output directory.

```bash
# List available templates
ragas quickstart

# Create a RAG evaluation project
ragas quickstart rag_eval

# Specify where you want to create it.
ragas quickstart rag_eval -o ./my-project
```

--------------------------------

### Create Project using uvx or ragas CLI

Source: https://github.com/vibrantlabsai/ragas/blob/main/docs/howtos/cli/improve_rag.md

Demonstrates how to create a new RAG project using either the uvx command-line tool or the ragas quickstart command. This sets up the project directory and initial files.

```shell
# Using uvx (no installation required)
uvx ragas quickstart improve_rag
cd improve_rag

# Or with ragas installed
ragas quickstart improve_rag
cd improve_rag
```

--------------------------------

### Prompt Variations for Testing

Source: https://github.com/vibrantlabsai/ragas/blob/main/docs/howtos/cli/prompt_evals.md

This Python code demonstrates two different prompt structures for sentiment analysis. Version 1 is a simple question, while Version 2 includes examples to guide the model, allowing for comparison of effectiveness.

```python
# Version 1: Simple
prompt = f"Is this positive or negative: {text}"

# Version 2: With examples
prompt = f"""Classify sentiment:
Examples:
- "Great movie" -> positive
- "Boring film" -> negative

Text: {text}
Sentiment:"""
```
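
To compare the two versions on the same inputs, each prompt can be wrapped in a function and rendered side by side. This is a sketch only; the function names and the `PROMPT_VARIANTS` registry are hypothetical and not part of the generated project.

```python
def simple_prompt(text: str) -> str:
    """Version 1: a bare question."""
    return f"Is this positive or negative: {text}"


def few_shot_prompt(text: str) -> str:
    """Version 2: the same task with two labeled examples."""
    return (
        "Classify sentiment:\n"
        "Examples:\n"
        '- "Great movie" -> positive\n'
        '- "Boring film" -> negative\n'
        "\n"
        f"Text: {text}\n"
        "Sentiment:"
    )


PROMPT_VARIANTS = {"v1_simple": simple_prompt, "v2_examples": few_shot_prompt}


def render_all(text: str) -> dict:
    """Render every prompt variant for one input text."""
    return {name: build(text) for name, build in PROMPT_VARIANTS.items()}


for name, prompt in render_all("Loved every minute").items():
    print(f"--- {name} ---\n{prompt}\n")
```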

--------------------------------

### Install Ragas and Set Environment Variable

Source: https://github.com/vibrantlabsai/ragas/blob/main/docs/howtos/applications/align-llm-as-judge.md

Installs the Ragas library with example dependencies and sets the OPENAI_API_KEY environment variable. This is a prerequisite for running the judge alignment examples.

```bash
uv pip install "ragas[examples]"
export OPENAI_API_KEY="your-api-key-here"
```

--------------------------------

### Implement Prompt and Evaluation Logic

Source: https://github.com/vibrantlabsai/ragas/blob/main/docs/howtos/cli/benchmark_llm.md

Demonstrates how to invoke a prompt for discount calculation and define a discrete metric function to validate model output against expected values.

```python
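import json

from prompt import run_prompt
from ragas.metrics import discrete_metric
from ragas.metrics.result import MetricResult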

profile = "Premium customer, 5 years tenure, $50k annual spend"
result = await run_prompt(profile, model="gpt-4o")

@discrete_metric(name="discount_accuracy", allowed_values=["correct", "incorrect"])
def discount_accuracy(prediction: str, expected_discount):
    parsed_json = json.loads(prediction)
    predicted_discount = parsed_json.get("discount_percentage")

    if predicted_discount == int(expected_discount):
        return MetricResult(value="correct", ...)
    else:
        return MetricResult(value="incorrect", ...)
```
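
The comparison at the heart of the metric can be tried out without Ragas or an API key. The sketch below isolates that logic in a hypothetical helper, `judge_discount`, and adds handling for malformed JSON, which is an assumption about how you might want failures treated rather than the behavior of the original metric.

```python
import json


def judge_discount(prediction: str, expected_discount) -> str:
    """Return 'correct' or 'incorrect' for a JSON prediction string."""
    try:
        parsed = json.loads(prediction)
    except json.JSONDecodeError:
        return "incorrect"  # unparseable output cannot match
    if not isinstance(parsed, dict):
        return "incorrect"  # valid JSON, but not an object with fields
    predicted = parsed.get("discount_percentage")
    try:
        return "correct" if predicted == int(expected_discount) else "incorrect"
    except (TypeError, ValueError):
        return "incorrect"  # expected value is not an integer


print(judge_discount('{"discount_percentage": 15}', "15"))  # correct
print(judge_discount("not json", 15))                       # incorrect
```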

--------------------------------

### Create and Navigate Project Directory

Source: https://github.com/vibrantlabsai/ragas/blob/main/docs/howtos/cli/prompt_evals.md

This snippet shows the command-line instructions to create a new Ragas project named 'prompt_evals' and then navigate into its directory.

```shell
ragas quickstart prompt_evals
cd prompt_evals
```

--------------------------------

### Python Environment Setup and Troubleshooting

Source: https://github.com/vibrantlabsai/ragas/blob/main/CONTRIBUTING.md

Commands to manage Python virtual environments using uv, specifically addressing compatibility issues with NumPy on Python 3.13. Includes steps to downgrade to Python 3.12 or force binary installations for newer NumPy versions.

```bash
# Downgrade to Python 3.12
uv python install 3.12
rm -rf .venv
uv venv -p 3.12
make install

# Python 3.13 workaround
rm -rf .venv
uv venv -p 3.13
make install-minimal
uv pip install "ragas[tracing,gdrive,ai-frameworks]"

# Force newer NumPy wheel
uv pip install "numpy>=2.1" --only-binary=:all:
```
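
To decide which of the paths above applies, you can check the interpreter version from Python itself. This is a small sketch; `needs_numpy_workaround` is a hypothetical helper that flags only Python 3.13, the version named in the troubleshooting notes.

```python
import sys


def needs_numpy_workaround(version_info=sys.version_info) -> bool:
    """True when the interpreter is Python 3.13, the version the notes above address."""
    return tuple(version_info[:2]) == (3, 13)


if needs_numpy_workaround():
    print("Python 3.13 detected: use the 3.13 workaround or downgrade to 3.12.")
else:
    print("No Python 3.13 workaround needed for this interpreter.")
```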

--------------------------------

### Install Agentic Mode Requirement

Source: https://github.com/vibrantlabsai/ragas/blob/main/docs/howtos/cli/improve_rag.md

Provides the command to install the 'openai-agents' package, which is a requirement for using the agentic RAG mode.

```shell
pip install openai-agents
```

--------------------------------

### Editable Install Ragas for Development

Source: https://github.com/vibrantlabsai/ragas/blob/main/docs/getstarted/install.md

Sets up Ragas for development by cloning the repository and installing it in editable mode. This allows you to make changes to the Ragas code and see them reflected immediately without reinstalling.

```bash
git clone https://github.com/vibrantlabsai/ragas.git
cd ragas
pip install -e .
```

--------------------------------

### Install Ragas from GitHub main branch

Source: https://github.com/vibrantlabsai/ragas/blob/main/docs/getstarted/install.md

Installs the most recent version of Ragas directly from the main branch of its GitHub repository. Use this to access the latest features and bug fixes before they are officially released.

```bash
pip install git+https://github.com/vibrantlabsai/ragas.git
```

--------------------------------

### Install RAG system dependencies

Source: https://github.com/vibrantlabsai/ragas/blob/main/docs/howtos/applications/evaluate-and-improve-rag.md

Installs the necessary packages for the RAG improvement examples using the uv package manager.

```bash
uv pip install "ragas-examples[improverag]"
```

--------------------------------

### Setup Environment and Dependencies

Source: https://github.com/vibrantlabsai/ragas/blob/main/docs/howtos/integrations/openlayer.ipynb

Initializes the environment by cloning necessary datasets and configuring the OpenAI API key required for LLM operations.

```bash
git clone https://huggingface.co/datasets/vibrantlabsai/prompt-engineering-papers
```

```python
import os
os.environ["OPENAI_API_KEY"] = "YOUR_OPENAI_API_KEY_HERE"
```

--------------------------------

### Install Zeno Client

Source: https://github.com/vibrantlabsai/ragas/blob/main/docs/howtos/integrations/_zeno.md

Installs the zeno-client package, which is necessary for interacting with the Zeno platform. This is a prerequisite for uploading and visualizing evaluation results.

```bash
pip install zeno-client
```

--------------------------------

### Implement Text-to-SQL Agent

Source: https://github.com/vibrantlabsai/ragas/blob/main/docs/howtos/cli/text2sql.md

Demonstrates how to instantiate the Text2SQLAgent and generate SQL queries from natural language inputs using an OpenAI client.

```python
from text2sql_agent import Text2SQLAgent

agent = Text2SQLAgent(client=openai_client)
sql = await agent.generate_sql("Find all books by Jane Austen")
```

--------------------------------

### Manage LangChain OpenAI Dependencies

Source: https://github.com/vibrantlabsai/ragas/blob/main/docs/getstarted/install.md

Installs specific versions of langchain-core, langchain-openai, and openai to avoid potential version conflicts when using LangChain with OpenAI. This ensures compatibility between these libraries.

```bash
pip install -U "langchain-core>=0.2,<0.3" "langchain-openai>=0.1,<0.2" openai
```

--------------------------------

### Create Agent Evaluation Project

Source: https://github.com/vibrantlabsai/ragas/blob/main/docs/howtos/cli/agent_evals.md

Initializes and navigates into the 'agent_evals' project directory. This command sets up the basic structure for evaluating AI agents on mathematical tasks.

```shell
ragas quickstart agent_evals
cd agent_evals
```

--------------------------------

### Install Ragas and Dependencies (Shell)

Source: https://github.com/vibrantlabsai/ragas/blob/main/docs/howtos/integrations/_langfuse.md

Installs the Ragas library along with datasets, llama_index, and python-dotenv from a notebook cell (`%pip` is an IPython magic). The --upgrade flag ensures you get the latest versions of these packages.

```shell
%pip install datasets ragas llama_index python-dotenv --upgrade
```

--------------------------------

### Integrate AWS Bedrock LLM and Embeddings via LiteLLM

Source: https://github.com/vibrantlabsai/ragas/blob/main/docs/howtos/customizations/customize_models.md

This example outlines the setup for using AWS Bedrock models for LLM and embeddings with Ragas through LiteLLM. It requires installing LiteLLM and configuring AWS credentials (via environment variables or AWS config file). The code then uses `llm_factory` and `embedding_factory`, specifying the AWS region and the respective Bedrock model IDs.

```bash
pip install litellm
```

```python
import litellm
import os
from ragas.llms import llm_factory
from ragas.embeddings.base import embedding_factory

config = {
    "region_name": "us-east-1",  # E.g. "us-east-1"
    "llm": "anthropic.claude-3-5-sonnet-20241022-v2:0",  # Your LLM model ID
    "embeddings": "amazon.titan-embed-text-v2:0",  # Your embedding model ID
    "temperature": 0.4,
}

# Set AWS credentials as environment variables
# Option 1: Use AWS credentials file (~/.aws/credentials)
# Option 2: Set environment variables directly
os.environ["AWS_REGION_NAME"] = config["region_name"]
# os.environ["AWS_ACCESS_KEY_ID"] = "your-access-key"
# os.environ["AWS_SECRET_ACCESS_KEY"] = "your-secret-key"

# Create LLM using llm_factory with litellm provider
# Note: For Bedrock, the model ID is passed directly
# Important: Pass litellm.completion (the function), not the module
bedrock_llm = llm_factory(
    f"bedrock/{config['llm']}",
    provider="litellm",
    client=litellm.completion,
    # Optional: Add system prompt
    # system_prompt="You are a helpful assistant that evaluates RAG systems."
)

# Create embeddings using embedding_factory
# Note: Pass Bedrock config directly to embedding_factory
bedrock_embeddings = embedding_factory(
    "litellm",
    model=f"bedrock/{config['embeddings']}",
    region_name=config["region_name"],
    # Optional: Add other Bedrock specific params if needed
)
```

--------------------------------

### Executing Ragas Evaluation

Source: https://github.com/vibrantlabsai/ragas/blob/main/docs/howtos/llm-adapters.md

A complete example of setting up an evaluation dataset, defining metrics, and running the evaluation pipeline.

```python
from datasets import Dataset
from ragas import evaluate
from ragas.llms import llm_factory
from ragas.metrics import ContextPrecision, ContextRecall, Faithfulness, AnswerCorrectness

# Initialize LLM (`client` is assumed to be an already configured Gemini client)
llm = llm_factory("gemini-2.0-flash", provider="google", client=client)

# Create dataset
dataset = Dataset.from_dict({"question": ["..."], "answer": ["..."], "contexts": [["..."]], "ground_truth": ["..."]})

# Evaluate
results = evaluate(dataset, metrics=[ContextPrecision(llm=llm), ContextRecall(llm=llm)])
print(results)
```

--------------------------------

### Create Ragas Judge Alignment Project

Source: https://github.com/vibrantlabsai/ragas/blob/main/docs/howtos/cli/judge_alignment.md

Initializes a new Ragas project for judge alignment and navigates into the project directory. This is the first step to set up the evaluation environment.

```sh
ragas quickstart judge_alignment
cd judge_alignment
```

--------------------------------

### Overriding Prompt Instruction for Strictness (Python)

Source: https://github.com/vibrantlabsai/ragas/blob/main/docs/howtos/migrations/migrate_from_v03_to_v04.md

Provides an example of creating a prompt subclass to enforce stricter evaluation criteria. By overriding the `instruction` property, you can guide the LLM to be more rigorous in its judgments.

```python
from ragas.metrics.collections.faithfulness.util import FaithfulnessPrompt

class StrictFaithfulnessPrompt(FaithfulnessPrompt):
    @property
    def instruction(self):
        return """Be very strict when judging faithfulness. Only mark statements as faithful (verdict=1) if they are directly stated or strongly implied by the context."""

# Assuming 'llm' is an initialized LLM instance
# metric = Faithfulness(llm=llm)
# metric.prompt = StrictFaithfulnessPrompt()
```

--------------------------------

### Environment Setup and Imports

Source: https://github.com/vibrantlabsai/ragas/blob/main/docs/howtos/integrations/_ag_ui.md

Initializes the environment by loading credentials, patching the event loop with nest_asyncio, and importing necessary Ragas modules for experiment execution.

```python
import json
import nest_asyncio
import pandas as pd
from dotenv import load_dotenv
from IPython.display import display
from ragas.dataset import Dataset
from ragas.messages import HumanMessage

load_dotenv()
nest_asyncio.apply()
```

--------------------------------

### Run Evaluation Workflow

Source: https://github.com/vibrantlabsai/ragas/blob/main/docs/getstarted/quickstart.md

Commands to execute the evaluation script within the project environment.

```shell
uv run python evals.py
```

```shell
python evals.py
```

--------------------------------

### Manage Evaluation Datasets

Source: https://github.com/vibrantlabsai/ragas/blob/main/docs/howtos/cli/rag_eval.md

Functions to define, populate, and save test datasets for RAG evaluation experiments.

```python
def load_dataset():
    dataset = Dataset(name="test_dataset", backend="local/csv", root_dir="evals")
    data_samples = [{"question": "What is Ragas?", "grading_notes": "- evaluation framework"}]
    for sample in data_samples:
        dataset.append(sample)
    dataset.save()
    return dataset
```

--------------------------------

### Accessing and Initializing Adapters

Source: https://github.com/vibrantlabsai/ragas/blob/main/docs/howtos/llm-adapters.md

Demonstrates how to list available adapters and retrieve a specific adapter instance to create an LLM client.

```python
from ragas.llms.adapters import ADAPTERS, get_adapter

# List available adapters
print(ADAPTERS)

# Retrieve specific adapter
instructor = get_adapter("instructor")
litellm = get_adapter("litellm")

# Create LLM using adapter directly
llm = instructor.create_llm(client, "gpt-4o", "openai")
```

--------------------------------

### Run Simple RAG System Example

Source: https://github.com/vibrantlabsai/ragas/blob/main/docs/tutorials/rag.md

Executes a basic RAG system example. This command initiates the RAG process, likely involving document retrieval and answer generation.

```bash
python -m ragas_examples.rag_eval.rag
```

--------------------------------

### Install and Configure Google Cloud LLMs with Ragas

Source: https://github.com/vibrantlabsai/ragas/blob/main/docs/extra/components/choose_evaluator_llm.md

Provides installation commands and configuration examples for both Google AI Studio and Google Cloud Vertex AI. It shows how to set credentials and wrap the respective LLM clients for Ragas.

```bash
# for Google AI Studio
pip install langchain-google-genai
# for Google Cloud Vertex AI
pip install langchain-google-vertexai
```

```python
# For Google AI Studio:
import os
os.environ["GOOGLE_API_KEY"] = "your-google-ai-key"  # From https://ai.google.dev/
```

```python
# For Google Cloud Vertex AI:
# Ensure you have credentials configured (gcloud, workload identity, etc.)
# Or set service account JSON path:
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "path/to/service-account.json"
```

```python
config = {
    "model": "gemini-1.5-pro",  # or other model IDs
    "temperature": 0.4,
    "max_tokens": None,
    "top_p": 0.8,
    # For Vertex AI only:
    "project": "your-project-id",  # Required for Vertex AI
    "location": "us-central1",     # Required for Vertex AI
}
```

```python
from ragas.llms import LangchainLLMWrapper
from ragas.embeddings import LangchainEmbeddingsWrapper

# Choose the appropriate import based on your API:
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_google_vertexai import ChatVertexAI

# Initialize with Google AI Studio
evaluator_llm = LangchainLLMWrapper(ChatGoogleGenerativeAI(
    model=config["model"],
    temperature=config["temperature"],
    max_tokens=config["max_tokens"],
    top_p=config["top_p"],
))
```

```python
# Or initialize with Vertex AI
evaluator_llm = LangchainLLMWrapper(ChatVertexAI(
    model=config["model"],
    temperature=config["temperature"],
    max_tokens=config["max_tokens"],
    top_p=config["top_p"],
    project=config["project"],
    location=config["location"],
))
```

```python
from langchain_google_genai import HarmCategory, HarmBlockThreshold

safety_settings = {
    HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT: HarmBlockThreshold.BLOCK_NONE,
    # Add other safety settings as needed
}

# Apply to your LLM initialization
```

--------------------------------

### Define Improved Judge Metric with Abbreviation Guide (Python)

Source: https://github.com/vibrantlabsai/ragas/blob/main/docs/howtos/cli/judge_alignment.md

Defines an improved judge metric that includes an abbreviation guide within the prompt. This enhances the LLM's ability to evaluate responses by providing context for common abbreviations and business terms.

```python
# Improved judge (enhanced with abbreviation guide)
accuracy_metric_v2 = DiscreteMetric(
    name="accuracy",
    prompt="""Evaluate if response covers ALL key concepts...\n\n    ABBREVIATION GUIDE:\n    • Financial: val=valuation, post-$=post-money, rev=revenue...\n    • Business: mkt=market, reg=regulation...\n    """,
    allowed_values=["pass", "fail"],
)
```
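
If the guide grows, it may be easier to keep the abbreviations in a dictionary and render the prompt section from it. The sketch below uses only the abbreviations visible in the prompt above; `build_abbreviation_guide` is a hypothetical helper and the layout it produces is an assumption modeled on that prompt.

```python
# Abbreviations copied from the prompt above; the source elides the rest.
ABBREVIATIONS = {
    "Financial": {"val": "valuation", "post-$": "post-money", "rev": "revenue"},
    "Business": {"mkt": "market", "reg": "regulation"},
}


def build_abbreviation_guide(groups: dict) -> str:
    """Render one bullet line per category in 'short=full' form."""
    lines = ["ABBREVIATION GUIDE:"]
    for category, entries in groups.items():
        pairs = ", ".join(f"{short}={full}" for short, full in entries.items())
        lines.append(f"• {category}: {pairs}")
    return "\n".join(lines)


print(build_abbreviation_guide(ABBREVIATIONS))
```

The resulting string can then be appended to the metric's `prompt` text.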

--------------------------------

### Switch LLM Providers

Source: https://github.com/vibrantlabsai/ragas/blob/main/docs/howtos/llm-adapters.md

Provides a comparative example of initializing different LLM providers like OpenAI and Gemini using the unified llm_factory interface.

```python
from ragas.llms import llm_factory

# Before: OpenAI
from openai import OpenAI
client = OpenAI(api_key="...")
llm = llm_factory("gpt-4o", client=client)

# After: Gemini
import google.generativeai as genai
genai.configure(api_key="...")
client = genai.GenerativeModel("gemini-2.0-flash")
llm = llm_factory("gemini-2.0-flash", provider="google", client=client)
```