### Install Project Dependencies Source: https://github.com/explore-de/rage4j/blob/main/docusaurus/README.md Run this command in your terminal to install all necessary project dependencies. ```bash $ npm install ``` -------------------------------- ### Complete Evaluation Example Source: https://github.com/explore-de/rage4j/blob/main/docusaurus/docs/rage4j-core/core_concepts.md Demonstrates setting up multiple evaluators, creating a sample, and aggregating the evaluation results using EvaluationAggregator. ```java import dev.langchain4j.model.chat.ChatLanguageModel; import dev.langchain4j.model.embedding.EmbeddingModel; public class EvaluationExample { public static void main(String[] args) { ChatLanguageModel chatModel = /* Any Langchain4j ChatLanguageModel */ EmbeddingModel embeddingModel = /* Any Langchain4j EmbeddingModel */ Evaluator relevanceEvaluator = new AnswerRelevanceEvaluator(chatModel, embeddingModel); Evaluator correctnessEvaluator = new AnswerCorrectnessEvaluator(chatModel); Evaluator faithfulnessEvaluator = new FaithfulnessEvaluator(chatModel); Evaluator similarityEvaluator = new AnswerSemanticSimilarityEvaluator(embeddingModel); Sample sample = Sample.builder() .withQuestion("What are the main features of Java?") .withAnswer("Java is object-oriented, platform-independent, and has automatic memory management.") .withGroundTruth("Java's main features include object-oriented programming, platform independence through JVM, automatic memory management (garbage collection), and strong type safety.") .withContext("Java is a popular programming language. 
Key features of Java include...") .build(); EvaluationAggregation results = EvaluationAggregator.evaluateAll( sample, relevanceEvaluator, correctnessEvaluator, faithfulnessEvaluator, similarityEvaluator ); // Access results System.out.println("Relevance score: " + results.get("Answer relevance")); System.out.println("Correctness score: " + results.get("Answer correctness")); System.out.println("Faithfulness score: " + results.get("Faithfulness")); System.out.println("Semantic similarity: " + results.get("Answer semantic similarity")); } } ``` -------------------------------- ### Start Local Development Server Source: https://github.com/explore-de/rage4j/blob/main/docusaurus/README.md Starts a local development server that automatically refreshes on code changes. The site will be available at http://localhost:3000. ```bash $ npm start ``` -------------------------------- ### Core Evaluation Example Source: https://github.com/explore-de/rage4j/blob/main/README.md Demonstrates how to create a sample, initialize an evaluator, and perform an evaluation using the core rage4j library. Ensure a chatModel is available. ```java Sample sample = Sample.builder() .withQuestion("What is the capital of France?") .withAnswer("Paris is the capital of France.") .withGroundTruth("Paris") .build(); Evaluator evaluator = new AnswerCorrectnessEvaluator(chatModel); Evaluation result = evaluator.evaluate(sample); System.out.println(result.getName() + ": " + result.getValue()); ``` -------------------------------- ### Build Rage4J Project Source: https://github.com/explore-de/rage4j/blob/main/CLAUDE.md Use these Maven wrapper commands to build, test, and format the Rage4J project. Ensure Java 21 is installed. Integration tests require the OPEN_AI_KEY environment variable. 
```bash
./mvnw clean install
```

```bash
./mvnw test
```

```bash
./mvnw test -DincludedGroups=integration -DexcludedGroups=
```

```bash
./mvnw test -pl dev.rage4j.rage4j -Dtest=FaithfulnessEvaluatorTest
```

```bash
./mvnw test -pl dev.rage4j.rage4j -Dtest=FaithfulnessEvaluatorTest#testMethodName
```

```bash
./mvnw formatter:format
```

```bash
./mvnw formatter:validate
```

```bash
./mvnw test -Dshow.metric.logs=true
```

--------------------------------

### Install Rage4J Assert Dependency

Source: https://github.com/explore-de/rage4j/blob/main/dev.rage4j.rage4j-assert/README.md

Add the rage4j-assert dependency to your project's pom.xml for testing RAG evaluations.

```xml
<dependency>
    <groupId>dev.rage4j</groupId>
    <artifactId>rage4j-assert</artifactId>
    <version>2.0.1-SNAPSHOT</version>
    <scope>test</scope>
</dependency>
```

--------------------------------

### Complete RAGE4J Persistence Example

Source: https://github.com/explore-de/rage4j/blob/main/docusaurus/docs/rage4j-persist-junit5/introduction.md

A complete JUnit 5 test class demonstrating RAGE4J persistence with an OpenAI LLM assertion. Imports include necessary RAGE4J and JUnit 5 components.

```java import dev.rage4j.persist.EvaluationStore; import dev.rage4j.persist.junit5.Rage4jPersistConfig; import dev.rage4j.asserts.RageAssert; import dev.rage4j.asserts.openai.OpenAiLLMBuilder; import dev.rage4j.model.EvaluationAggregation; import org.junit.jupiter.api.Test; @Rage4jPersistConfig(file = "target/my-evaluations.jsonl") class RagEvaluationTest { private final String apiKey = System.getenv("OPEN_AI_KEY"); private final RageAssert rageAssert = new OpenAiLLMBuilder().fromApiKey(apiKey); @Test void testCorrectness(EvaluationStore store) { EvaluationAggregation result = rageAssert.given() .question("What is the capital of France?") .groundTruth("Paris") .when() .answer("Paris is the capital of France.") .then() .assertAnswerCorrectness(0.7) .getEvaluationAggregation(); store.store(result); } } ``` -------------------------------- ### Basic Rage4J Assertion Example Source: https://github.com/explore-de/rage4j/blob/main/dev.rage4j.rage4j-assert/README.md Demonstrates basic RAG evaluation assertions using RageAssert with an OpenAI LLM builder. Ensure you have the necessary API key configured. ```java import dev.rage4j.asserts.RageAssert; import dev.rage4j.asserts.openai.OpenAiLLMBuilder; RageAssert rageAssert = new OpenAiLLMBuilder().fromApiKey(apiKey); rageAssert.given() .question("What is the capital of France?") .groundTruth("Paris") .context("France is a country in Europe. Paris is the capital of France.") .when() .answer("Paris is the capital of France.") .then() .assertFaithfulness(0.7) .then() .assertAnswerCorrectness(0.8) .then() .assertAnswerRelevance(0.7); ``` -------------------------------- ### Quick Example of Answer Relevance Evaluation Source: https://github.com/explore-de/rage4j/blob/main/docusaurus/docs/rage4j-core/introduction.md Demonstrates how to create an Answer Relevance evaluator, define a sample question and answer, evaluate the sample, and retrieve the metric name and score. Requires initialization of chatModel and embeddingModel. 
```java
// 1. Create an evaluator
Evaluator answerRelevanceEvaluator = new AnswerRelevanceEvaluator(chatModel, embeddingModel);

// 2. Create a sample
Sample sample = Sample.builder()
    .withQuestion("What is Java?")
    .withAnswer("Java is a programming language.")
    .build();

// 3. Evaluate and get results
Evaluation result = answerRelevanceEvaluator.evaluate(sample);

// 4. Get our score
System.out.println("Metric name: " + result.getName());   // Metric name: Answer relevance
System.out.println("Metric score: " + result.getValue()); // Metric score: 1.0
```

--------------------------------

### JSON Lines Output Format Example

Source: https://github.com/explore-de/rage4j/blob/main/dev.rage4j.rage4j-persist/README.md

Evaluation results are stored in JSON Lines format: each line of the file is one self-contained JSON object holding a sample and its metrics.

```json
{"sample":{"question":"What is AI?","answer":"AI is...","groundTruth":"..."},"metrics":{"Faithfulness":0.85,"Answer correctness":0.72}}
```

--------------------------------

### Persist Rage4J Assert Results

Source: https://github.com/explore-de/rage4j/blob/main/dev.rage4j.rage4j-persist/README.md

Integrate with rage4j-assert to get evaluation results and persist them using a JsonLinesStore. Use try-with-resources so the store is closed automatically.

```java
import dev.rage4j.model.EvaluationAggregation;
import dev.rage4j.persist.store.JsonLinesStore;
import java.nio.file.Path;

// Get results from assertions
EvaluationAggregation result = rageAssert.given()
    .question("What is AI?")
    .groundTruth("Artificial intelligence...")
    .when()
    .answer(llm::chat)
    .then()
    .assertFaithfulness(0.7)
    .then()
    .assertAnswerCorrectness(0.8)
    .getEvaluationAggregation();

// Persist the results
try (JsonLinesStore store = new JsonLinesStore(Path.of("target/evaluations.jsonl"))) {
    store.store(result);
}
```

--------------------------------

### Initialize and Evaluate Answer Semantic Similarity

Source: https://github.com/explore-de/rage4j/blob/main/docusaurus/docs/rage4j-core/metrics/answer_semantic_similarity.md

Initializes the AnswerSemanticSimilarityEvaluator with an embedding model and evaluates a sample to get the similarity score. Ensure the 'embeddingModel' is properly configured.

```java
AnswerSemanticSimilarityEvaluator evaluator = new AnswerSemanticSimilarityEvaluator(embeddingModel);
Evaluation result = evaluator.evaluate(sample);
double similarityScore = result.getValue();
```

--------------------------------

### Rage4J Fluent Assertion API

Source: https://github.com/explore-de/rage4j/blob/main/CLAUDE.md

Utilize the fluent API provided by the rage4j-assert module for RAG evaluation within tests. This example demonstrates asserting faithfulness with a given threshold.

```java
rageAssert.given().question().groundTruth().when().answer().then().assertFaithfulness(0.7)
```

--------------------------------

### Persist Evaluation Results to JSON Lines

Source: https://github.com/explore-de/rage4j/blob/main/dev.rage4j.rage4j-assert/README.md

Store evaluation results in JSON Lines format using the rage4j-persist module. This example demonstrates manual storage.

```java import dev.rage4j.persist.store.JsonLinesStore; import java.nio.file.Path; EvaluationAggregation result = rageAssert.given() .question("What is AI?") .groundTruth("...") .when() .answer(llm::chat) .then() .assertFaithfulness(0.7) .getEvaluationAggregation(); // Persist manually try (JsonLinesStore store = new JsonLinesStore(Path.of("target/evaluations.jsonl"))) { store.store(result); } // flush() is called automatically on close() ``` -------------------------------- ### Evaluate All Metrics with EvaluationAggregator Source: https://context7.com/explore-de/rage4j/llms.txt Use EvaluationAggregator to run multiple evaluators on a sample simultaneously and get a consolidated result. Ensure necessary models and sample data are initialized. ```java import dev.langchain4j.model.openai.OpenAiChatModel; import dev.langchain4j.model.openai.OpenAiEmbeddingModel; import dev.rage4j.evaluation.answercorrectness.AnswerCorrectnessEvaluator; import dev.rage4j.evaluation.answerrelevance.embedding.AnswerRelevanceEmbeddingEvaluator; import dev.rage4j.evaluation.answersemanticsimilarity.AnswerSemanticSimilarityEvaluator; import dev.rage4j.evaluation.faithfulness.FaithfulnessEvaluator; import dev.rage4j.evaluation.bleuscore.BleuScoreEvaluator; import dev.rage4j.model.EvaluationAggregation; import dev.rage4j.model.Sample; import dev.rage4j.util.EvaluationAggregator; OpenAiChatModel chatModel = OpenAiChatModel.builder() .apiKey(System.getenv("OPENAI_API_KEY")) .modelName("gpt-4") .build(); OpenAiEmbeddingModel embeddingModel = OpenAiEmbeddingModel.builder() .apiKey(System.getenv("OPENAI_API_KEY")) .modelName("text-embedding-3-small") .build(); Sample sample = Sample.builder() .withQuestion("What causes climate change?") .withAnswer("Climate change is primarily caused by greenhouse gas emissions from human activities like burning fossil fuels.") .withGroundTruth("Climate change is driven by increased greenhouse gas emissions, mainly CO2 from burning fossil fuels, deforestation, 
and industrial processes.") .withContext("Human activities, especially burning fossil fuels, release greenhouse gases that trap heat in Earth's atmosphere.") .build(); // Run multiple evaluations at once EvaluationAggregation results = EvaluationAggregator.evaluateAll(sample, new AnswerCorrectnessEvaluator(chatModel), new AnswerRelevanceEmbeddingEvaluator(chatModel, embeddingModel), new FaithfulnessEvaluator(chatModel), new AnswerSemanticSimilarityEvaluator(embeddingModel), new BleuScoreEvaluator() ); // Access individual scores by metric name System.out.println("Correctness: " + results.get("Answer correctness")); System.out.println("Relevance: " + results.get("Answer relevance")); System.out.println("Faithfulness: " + results.get("Faithfulness")); System.out.println("Similarity: " + results.get("Answer semantic similarity")); System.out.println("BLEU: " + results.get("Bleu score")); ``` -------------------------------- ### Create a Sample for Evaluation Source: https://github.com/explore-de/rage4j/blob/main/docusaurus/docs/rage4j-core/core_concepts.md Use the Sample.builder() to construct an instance for evaluation. Include question, answer, ground truth, and optional context. ```java Sample sample = Sample.builder() .withQuestion("What is the capital of France?") .withAnswer("Paris is the capital of France.") .withGroundTruth("Paris is the capital and largest city of France.") .withContext("Paris is the capital of France...") .build(); ``` -------------------------------- ### Create a Sample for Evaluation Source: https://github.com/explore-de/rage4j/blob/main/dev.rage4j.rage4j/README.md Builds a Sample object containing the question, answer, ground truth, and context for evaluation. ```java Sample sample = Sample.builder() .withQuestion("What is the capital of France?") .withAnswer("Paris is the capital of France.") .withGroundTruth("Paris") .withContext("France is a country in Europe. 
Paris is its capital.") .build(); ``` -------------------------------- ### Initialize RageAssert with OpenAI API Key Source: https://context7.com/explore-de/rage4j/llms.txt Configure RageAssert for testing LLM outputs using OpenAI models. Requires an API key to be set in the environment. ```java import dev.rage4j.asserts.RageAssert; import dev.rage4j.asserts.openai.OpenAiLLMBuilder; import dev.rage4j.evaluation.rougescore.RougeScoreEvaluator.RougeType; import dev.rage4j.evaluation.rougescore.RougeScoreEvaluator.MeasureType; String apiKey = System.getenv("OPENAI_API_KEY"); // Build RageAssert with OpenAI models RageAssert rageAssert = new OpenAiLLMBuilder().fromApiKey(apiKey); ``` -------------------------------- ### Build and Access Sample Data Structure in Java Source: https://context7.com/explore-de/rage4j/llms.txt Construct a 'Sample' object using the builder pattern to hold question, answer, ground truth, and context for evaluation. Access individual fields using getter methods. ```java import dev.rage4j.model.Sample; // Build a sample with all fields Sample sample = Sample.builder() .withQuestion("What is the capital of France?") .withAnswer("Paris is the capital of France.") .withGroundTruth("Paris is the capital and largest city of France.") .withContext("Paris is the capital of France, located in northern France along the Seine River.") .build(); // Access sample fields String question = sample.getQuestion(); // "What is the capital of France?" String answer = sample.getAnswer(); // "Paris is the capital of France." String groundTruth = sample.getGroundTruth(); // "Paris is the capital and largest city..." String context = sample.getContext(); // "Paris is the capital of France..." 
// Check for field presence boolean hasContext = sample.hasContext(); // true boolean hasQuestion = sample.hasQuestion(); // true ``` -------------------------------- ### Build Static Site Source: https://github.com/explore-de/rage4j/blob/main/docusaurus/README.md Generates the static content for the website into the 'build' directory. This output can be hosted on any static content hosting service. ```bash $ npm run build ``` -------------------------------- ### Build Production Docker Image Source: https://github.com/explore-de/rage4j/blob/main/docusaurus/README.md Builds a Docker image optimized for production, containing the static site generated by the build process. ```bash $ docker build --target production -t rage4j-docs:prod . ``` -------------------------------- ### Custom Store Implementation with @Rage4jPersistConfig Source: https://github.com/explore-de/rage4j/blob/main/docusaurus/docs/rage4j-persist-junit5/introduction.md Use a custom store by specifying the 'storeClass' attribute in the @Rage4jPersistConfig annotation. The custom store class must have a constructor that accepts a Path. ```java @Rage4jPersistConfig(file = "target/custom.dat", storeClass = MyCustomStore.class) class MyCustomTest { // MyCustomStore must have a constructor that accepts a Path } ``` -------------------------------- ### Simple RAGE4J-Assert Test Source: https://github.com/explore-de/rage4j/blob/main/docusaurus/docs/rage4j-assert/introduction.md Demonstrates a basic test using RAGE4J-Assert to check answer correctness against a language model. Requires OPEN_API_KEY environment variable. 
```java import dev.langchain4j.model.openai.OpenAiChatModel; import dev.rage4j.asserts.RageAssert; import dev.rage4j.asserts.openai.OpenAiLLMBuilder; import org.junit.jupiter.api.Test; import static dev.langchain4j.model.openai.OpenAiChatModelName.GPT_4_O_MINI; class RageAssertTest { private final String key = System.getenv("OPEN_API_KEY"); private final OpenAiChatModel model = OpenAiChatModel.builder() .apiKey(key) .modelName(GPT_4_O_MINI) .build(); @Test void testCorrectnessApi() { RageAssert rageAssert = new OpenAiLLMBuilder().fromApiKey(key); rageAssert.given() .question("What is the capital of France?") .groundTruth("Paris is the capital of France") .when() .answer(q -> model.generate(q)) .then() .assertAnswerCorrectness(0.7); } } ``` -------------------------------- ### Build Deployment Docker Image Source: https://github.com/explore-de/rage4j/blob/main/docusaurus/README.md Builds a Docker image configured for deployment, typically including a web server like Nginx to serve the static site. ```bash $ docker build --target deploy -t rage4j-docs:deploy . ``` -------------------------------- ### Run Deployment Docker Container Source: https://github.com/explore-de/rage4j/blob/main/docusaurus/README.md Runs the deployment Docker container, serving the static site on port 80. Access the site at http://localhost. ```bash $ docker run -p 80:80 rage4j-docs:deploy ``` -------------------------------- ### Initialize RougeScoreEvaluator Source: https://github.com/explore-de/rage4j/blob/main/docusaurus/docs/rage4j-core/metrics/rouge_score.md Demonstrates initializing RougeScoreEvaluator with default settings (ROUGE-1 F1 score) and custom configurations for ROUGE-2 Precision and ROUGE-LSum Recall. The evaluator is then used to score a sample. 
```java RougeScoreEvaluator evaluator = new RougeScoreEvaluator(); // Custom: ROUGE-2 Precision RougeScoreEvaluator customEvaluator = new RougeScoreEvaluator(RougeType.ROUGE2, MeasureType.PRECISION); // Custom: ROUGE-LSum Recall RougeScoreEvaluator summaryEvaluator = new RougeScoreEvaluator(RougeType.ROUGE_L_SUM, MeasureType.RECALL); Evaluation result = evaluator.evaluate(sample); double rougeScore = result.getValue(); ``` -------------------------------- ### Run Development Docker Container Source: https://github.com/explore-de/rage4j/blob/main/docusaurus/README.md Runs the development Docker container, exposing the development server on port 3000. Access the site at http://localhost:3000. ```bash $ docker run -p 3000:3000 rage4j-docs:dev ``` -------------------------------- ### Build Development Docker Image Source: https://github.com/explore-de/rage4j/blob/main/docusaurus/README.md Builds a Docker image specifically for development. This image includes tools and configurations for local development. ```bash $ docker build --target development -t rage4j-docs:dev . ``` -------------------------------- ### Configure OpenAI Models with OpenAiLLMBuilder Source: https://context7.com/explore-de/rage4j/llms.txt Use OpenAiLLMBuilder to create RageAssert instances with default or custom OpenAI chat and embedding models. Supports API key authentication and evaluation mode. 
```java import dev.rage4j.asserts.RageAssert; import dev.rage4j.asserts.openai.OpenAiLLMBuilder; String apiKey = System.getenv("OPENAI_API_KEY"); // Default models: gpt-4 for chat, text-embedding-3-small for embeddings RageAssert defaultAssert = new OpenAiLLMBuilder().fromApiKey(apiKey); ``` ```java // Custom models RageAssert customAssert = new OpenAiLLMBuilder() .withChatModel("gpt-4-turbo") .withEmbeddingModel("text-embedding-3-large") .fromApiKey(apiKey); ``` ```java // Use evaluation mode (warnings instead of exceptions) RageAssert evalModeAssert = new OpenAiLLMBuilder() .fromApiKey(apiKey) .withEvaluationMode(); ``` ```java // Evaluation mode allows complete test runs for data collection evalModeAssert.given() .question("Test question") .groundTruth("Expected answer") .when() .answer("Actual answer that might fail threshold") .then() .assertAnswerCorrectness(0.95); // Logs warning instead of throwing exception ``` -------------------------------- ### Enable Metric Logs with Maven Source: https://github.com/explore-de/rage4j/blob/main/dev.rage4j.rage4j/README.md Command to run Maven tests with detailed metric logs enabled for debugging. ```bash ./mvnw test -Dshow.metric.logs=true ``` -------------------------------- ### Enable Detailed Metric Logs with Maven Source: https://github.com/explore-de/rage4j/blob/main/docusaurus/docs/rage4j-core/installation.mdx Run this Maven command during testing to enable detailed logging for metric calculations. This is useful for debugging and understanding how metrics are computed. ```bash mvn test -Dshow.metric.logs=true ``` -------------------------------- ### Custom Store Implementation Configuration Source: https://github.com/explore-de/rage4j/blob/main/dev.rage4j.rage4j-persist-junit5/README.md Configure @Rage4jPersistConfig to use a custom store implementation by specifying the storeClass attribute. The custom store must have a public constructor accepting a Path. 
```java
@Rage4jPersistConfig(file = "target/custom.dat", storeClass = MyCustomStore.class)
class MyCustomTest {
    // MyCustomStore must have a public constructor that accepts a Path
}
```

--------------------------------

### Basic Test with EvaluationStore Injection

Source: https://github.com/explore-de/rage4j/blob/main/dev.rage4j.rage4j-persist-junit5/README.md

Annotate your test class with @Rage4jPersistConfig and inject EvaluationStore into test methods to automatically manage its lifecycle and persist results.

```java
import dev.rage4j.evaluation.faithfulness.FaithfulnessEvaluator;
import dev.rage4j.model.EvaluationAggregation;
import dev.rage4j.model.Sample;
import dev.rage4j.persist.EvaluationStore;
import dev.rage4j.persist.junit5.Rage4jPersistConfig;
import dev.rage4j.util.EvaluationAggregator;
import org.junit.jupiter.api.Test;

// Assumption: assertThat comes from AssertJ, which must be on the test classpath
import static org.assertj.core.api.Assertions.assertThat;

@Rage4jPersistConfig(file = "target/evaluations.jsonl")
class MyEvaluationTest {

    // chatModel is assumed to be a chat model configured elsewhere in the test class

    @Test
    void testEvaluation(EvaluationStore store) { // store is automatically injected
        Sample sample = Sample.builder()
            .withQuestion("What is AI?")
            .withAnswer("AI is artificial intelligence")
            .withGroundTruth("AI is artificial intelligence")
            .build();

        EvaluationAggregation result = EvaluationAggregator.evaluateAll(sample, new FaithfulnessEvaluator(chatModel));
        store.store(result);

        assertThat(result.get("Faithfulness")).isGreaterThan(0.7);
    }
}
```

--------------------------------

### CompositeStore Implementation

Source: https://github.com/explore-de/rage4j/blob/main/docusaurus/docs/rage4j-persist/introduction.md

The `CompositeStore` allows writing evaluation results to multiple `EvaluationStore` instances simultaneously.

```APIDOC
## CompositeStore

### Description
Enables writing evaluation results to multiple stores concurrently. All provided stores will receive the same data.

### Usage Example ```java EvaluationStore composite = new CompositeStore( new JsonLinesStore(Path.of("results.jsonl")), new JsonLinesStore(Path.of("backup.jsonl")) ); ``` ``` -------------------------------- ### Configure Custom OpenAI Models Source: https://github.com/explore-de/rage4j/blob/main/dev.rage4j.rage4j-assert/README.md Customize the chat and embedding models used by the OpenAI LLM builder for RageAssert. ```java RageAssert rageAssert = new OpenAiLLMBuilder() .withChatModel("gpt-4o") .withEmbeddingModel("text-embedding-3-large") .fromApiKey(apiKey); ``` -------------------------------- ### Enable Automatic Persistence with @Rage4jPersistConfig Source: https://github.com/explore-de/rage4j/blob/main/docusaurus/docs/rage4j-persist-junit5/introduction.md Annotate your test class with @Rage4jPersistConfig to enable automatic persistence. The 'file' attribute specifies the output file path for evaluations. ```java @Rage4jPersistConfig(file = "target/evaluations.jsonl") class MyEvaluationTest { @Test void testEvaluation(EvaluationStore store) { // store is injected and ready to use EvaluationAggregation aggregation = EvaluationAggregator.evaluateAll(sample, evaluators); store.store(aggregation); } } ``` -------------------------------- ### Configure Custom Models with RAGE4J-Assert Source: https://github.com/explore-de/rage4j/blob/main/docusaurus/docs/rage4j-assert/introduction.md Configures RAGE4J-Assert to use custom chat and embedding models. Defaults are 'gpt-5.1' for chat and 'text-embedding-3-small' for embedding. ```java RageAssert rageAssert = new OpenAiLLMBuilder() .withChatModel("gpt-4o") .withEmbeddingModel("text-embedding-3-large") .fromApiKey(key); ``` -------------------------------- ### Add Rage4J Dependency to pom.xml Source: https://github.com/explore-de/rage4j/blob/main/dev.rage4j.rage4j/README.md Include this XML snippet in your pom.xml to add the Rage4J core library dependency. 
```xml
<dependency>
    <groupId>dev.rage4j</groupId>
    <artifactId>rage4j</artifactId>
    <version>1.1.1-SNAPSHOT</version>
</dependency>
```

--------------------------------

### Fluent Assertions for RAG Evaluation

Source: https://github.com/explore-de/rage4j/blob/main/README.md

Utilizes the rage4j-assert library to chain evaluation steps for testing RAG pipelines. Requires an API key for the LLM builder.

```java
RageAssert rageAssert = new OpenAiLLMBuilder().fromApiKey(apiKey);

rageAssert.given()
    .question("What is the capital of France?")
    .groundTruth("Paris")
    .context("Paris is the capital of France.")
    .when()
    .answer("Paris is the capital of France.")
    .then()
    .assertFaithfulness(0.7)
    .then()
    .assertAnswerCorrectness(0.8);
```

--------------------------------

### Add Rage4J-Assert Maven Dependency

Source: https://github.com/explore-de/rage4j/blob/main/docusaurus/docs/rage4j-assert/installation.mdx

Include this dependency in your pom.xml to use Rage4J-Assert. Maven will automatically download the library.

```xml
<dependency>
    <groupId>dev.rage4j</groupId>
    <artifactId>rage4j-assert</artifactId>
    <version>1.0.2</version>
    <scope>test</scope>
</dependency>
```

--------------------------------

### Chain Multiple Assertions with Rage4j

Source: https://github.com/explore-de/rage4j/blob/main/docusaurus/docs/rage4j-assert/examples.md

Demonstrates chaining multiple assertions like correctness and semantic similarity on a single LLM-generated answer. This is the recommended approach for multi-metric evaluation.

```java
RageAssert rageAssert = new OpenAiLLMBuilder().fromApiKey(key);

rageAssert.given()
    .question(QUESTION)
    .groundTruth(GROUND_TRUTH)
    .when()
    .answer(model.generate(QUESTION))
    .then()
    .assertAnswerCorrectness(0.7)
    .then()
    .assertSemanticSimilarity(0.7);
```

--------------------------------

### CompositeStore for Multiple Destinations

Source: https://github.com/explore-de/rage4j/blob/main/docusaurus/docs/rage4j-persist/introduction.md

Writes evaluation results to multiple `EvaluationStore` instances simultaneously, useful for primary storage and backups.

```java EvaluationStore composite = new CompositeStore( new JsonLinesStore(Path.of("results.jsonl")), new JsonLinesStore(Path.of("backup.jsonl")) ); ``` -------------------------------- ### JUnit 5 Integration with Rage4jPersistConfig Source: https://context7.com/explore-de/rage4j/llms.txt Use the @Rage4jPersistConfig annotation to automatically manage the EvaluationStore lifecycle in JUnit 5 tests. The store is injected into test methods. ```java import dev.rage4j.persist.EvaluationStore; import dev.rage4j.persist.junit5.Rage4jPersistConfig; import dev.rage4j.asserts.RageAssert; import dev.rage4j.asserts.openai.OpenAiLLMBuilder; import dev.rage4j.model.EvaluationAggregation; import org.junit.jupiter.api.Test; @Rage4jPersistConfig(file = "target/test-evaluations.jsonl") class RagEvaluationTest { private final String apiKey = System.getenv("OPENAI_API_KEY"); private final RageAssert rageAssert = new OpenAiLLMBuilder().fromApiKey(apiKey); @Test void testAnswerQuality(EvaluationStore store) { // EvaluationStore is automatically injected EvaluationAggregation result = rageAssert.given() .question("What is containerization?") .groundTruth("Containerization packages applications with their dependencies for consistent deployment.") .context("Containers isolate applications and their dependencies into self-contained units.") .when() .answer("Containerization bundles apps with dependencies for portable deployment.") .then() .assertAnswerCorrectness(0.7) .then() .assertFaithfulness(0.8) .getEvaluationAggregation(); // Get collected metrics // Persist evaluation results store.store(result); } @Test void testMultipleMetrics(EvaluationStore store) { EvaluationAggregation result = rageAssert.given() .question("Explain microservices") .groundTruth("Microservices architecture breaks applications into small, independent services.") .when() .answer("Microservices split apps into loosely coupled, independently deployable services.") .then() .assertSemanticSimilarity(0.8) .then() 
.assertBleuScore(0.4) .getEvaluationAggregation(); store.store(result); } } // Store is automatically flushed and closed after all tests complete ``` -------------------------------- ### Enable Evaluation Mode for Warnings Source: https://github.com/explore-de/rage4j/blob/main/dev.rage4j.rage4j-assert/README.md Configure RageAssert to log assertion failures as warnings instead of throwing exceptions, useful for collecting comprehensive evaluation data. ```java RageAssert rageAssert = new OpenAiLLMBuilder() .fromApiKey(apiKey) .withEvaluationMode(); // Failures log warnings instead of throwing rageAssert.given() .question("What is AI?") .groundTruth("...") .when() .answer(llm::chat) .then() .assertFaithfulness(0.7) // Logs warning if below threshold .then() .assertAnswerCorrectness(0.8); // Still runs even if previous failed ``` -------------------------------- ### Basic JSON Lines Store Usage Source: https://github.com/explore-de/rage4j/blob/main/dev.rage4j.rage4j-persist/README.md Create a JsonLinesStore to save evaluation results to a file. The store buffers results and flushes them on close. ```java import dev.rage4j.persist.store.JsonLinesStore; import java.nio.file.Path; // Create a JSONL store JsonLinesStore store = new JsonLinesStore(Path.of("target/evaluations.jsonl")); // Store evaluation results (buffered) EvaluationAggregation result = EvaluationAggregator.evaluateAll(sample, evaluators); store.store(result); // Flush to disk and close store.close(); // flush() is called automatically ``` -------------------------------- ### Maven Dependencies for Rage4J Source: https://context7.com/explore-de/rage4j/llms.txt Include these dependencies in your pom.xml to add Rage4J core, assertion, persistence, and JUnit 5 integration to your project. 
```xml
<dependency>
    <groupId>dev.rage4j</groupId>
    <artifactId>rage4j</artifactId>
    <version>2.0.1-SNAPSHOT</version>
</dependency>
<dependency>
    <groupId>dev.rage4j</groupId>
    <artifactId>rage4j-assert</artifactId>
    <version>2.0.1-SNAPSHOT</version>
    <scope>test</scope>
</dependency>
<dependency>
    <groupId>dev.rage4j</groupId>
    <artifactId>rage4j-persist</artifactId>
    <version>2.0.1-SNAPSHOT</version>
</dependency>
<dependency>
    <groupId>dev.rage4j</groupId>
    <artifactId>rage4j-persist-junit5</artifactId>
    <version>2.0.1-SNAPSHOT</version>
    <scope>test</scope>
</dependency>
```

--------------------------------

### Add Rage4J Persist Dependency

Source: https://github.com/explore-de/rage4j/blob/main/dev.rage4j.rage4j-persist/README.md

Include this dependency in your pom.xml to use the Rage4J Persist module.

```xml
<dependency>
    <groupId>dev.rage4j</groupId>
    <artifactId>rage4j-persist</artifactId>
    <version>1.1.1-SNAPSHOT</version>
</dependency>
```

--------------------------------

### Combine Rage4J Persist with Rage4J Assert

Source: https://github.com/explore-de/rage4j/blob/main/dev.rage4j.rage4j-persist-junit5/README.md

Use @Rage4jPersistConfig to persist evaluation results obtained through fluent assertions with Rage4J Assert.

```java
@Rage4jPersistConfig(file = "target/evaluations.jsonl")
class MyAssertTest {

    @Test
    void testWithAssert(EvaluationStore store) {
        RageAssert rageAssert = new OpenAiLLMBuilder().fromApiKey(apiKey);

        EvaluationAggregation result = rageAssert.given()
            .question("What is AI?")
            .groundTruth("Artificial intelligence...")
            .when()
            .answer(llm::chat)
            .then()
            .assertFaithfulness(0.7)
            .getEvaluationAggregation();

        store.store(result); // Persist to injected store
    }
}
```

--------------------------------

### Add Rage4J-Persist Maven Dependency

Source: https://github.com/explore-de/rage4j/blob/main/docusaurus/docs/rage4j-persist/installation.mdx

Include this XML snippet in your pom.xml file to add Rage4J-Persist to your project. Maven will automatically download the library.

```xml
<dependency>
    <groupId>dev.rage4j</groupId>
    <artifactId>rage4j-persist</artifactId>
    <version>1.0.2</version>
</dependency>
```

--------------------------------

### Buffering and Flushing Evaluations

Source: https://github.com/explore-de/rage4j/blob/main/dev.rage4j.rage4j-persist/README.md

The JsonLinesStore buffers evaluations in memory. Use flush() to write buffered data to disk, or storeFlush() for immediate writes. close() automatically flushes remaining data.

```java
JsonLinesStore store = new JsonLinesStore(Path.of("target/evaluations.jsonl"));

// Multiple stores are buffered
store.store(aggregation1);
store.store(aggregation2);
store.store(aggregation3);

// Write all buffered data to file
store.flush();

// Or use storeFlush() for immediate write
store.storeFlush(aggregation4);

// close() automatically flushes remaining buffer
store.close();
```

--------------------------------

### Add Rage4J-Core Maven Dependency

Source: https://github.com/explore-de/rage4j/blob/main/docusaurus/docs/rage4j-core/installation.mdx

Include this XML snippet in your project's pom.xml file to add the Rage4J-Core library. Maven will automatically handle the download and inclusion.

```xml
<dependency>
    <groupId>dev.rage4j</groupId>
    <artifactId>rage4j</artifactId>
    <version>1.0.5</version>
</dependency>
```

--------------------------------

### Add Rage4J Persist JUnit 5 Dependency

Source: https://github.com/explore-de/rage4j/blob/main/dev.rage4j.rage4j-persist-junit5/README.md

Add this dependency to your pom.xml to include the Rage4J Persist JUnit 5 module for testing.

```xml
<dependency>
    <groupId>dev.rage4j</groupId>
    <artifactId>rage4j-persist-junit5</artifactId>
    <version>1.1.1-SNAPSHOT</version>
    <scope>test</scope>
</dependency>
```

--------------------------------

### Initialize and Evaluate Answer Relevance

Source: https://github.com/explore-de/rage4j/blob/main/docusaurus/docs/rage4j-core/metrics/answer_relevance.md

Instantiate the AnswerRelevanceEvaluator with a chat model and embedding model, then use it to evaluate a sample containing a question and answer. The result provides a relevance score.

```java
AnswerRelevanceEvaluator evaluator = new AnswerRelevanceEvaluator(chatModel, embeddingModel);
Evaluation result = evaluator.evaluate(sample);
double relevanceScore = result.getValue();
```

--------------------------------

### Switch Back to Strict Mode

Source: https://github.com/explore-de/rage4j/blob/main/dev.rage4j.rage4j-assert/README.md

Revert RageAssert to strict mode where assertion failures throw exceptions.
```java
rageAssert.withStrictMode(); // Failures throw exceptions again
```

--------------------------------

### Add Maven Dependency for Rage4J Persist JUnit5

Source: https://github.com/explore-de/rage4j/blob/main/docusaurus/docs/rage4j-persist-junit5/installation.mdx

Include this dependency in your pom.xml to use the JUnit 5 extension for automatic persistence. It transitively includes rage4j-persist.

```xml
<dependency>
    <groupId>dev.rage4j</groupId>
    <artifactId>rage4j-persist-junit5</artifactId>
    <version>1.0.2</version>
    <scope>test</scope>
</dependency>
```

--------------------------------

### Integrate Custom LLM for Answers

Source: https://github.com/explore-de/rage4j/blob/main/dev.rage4j.rage4j-assert/README.md

Use a custom LLM function to generate answers within the RageAssert workflow. This allows for flexible answer generation strategies.

```java
rageAssert.given()
    .question("What is AI?")
    .groundTruth("Artificial intelligence is...")
    .when()
    .answer(question -> llm.generate(question)) // Call your LLM
    .then()
    .assertFaithfulness(0.7);
```

--------------------------------

### JsonLinesStore Implementation

Source: https://github.com/explore-de/rage4j/blob/main/docusaurus/docs/rage4j-persist/introduction.md

The `JsonLinesStore` implementation persists evaluation results in the JSON Lines format, where each line is a valid JSON object.

```APIDOC
## JsonLinesStore

### Description
Stores evaluation results in JSON Lines format (`.jsonl`), with each line being a complete JSON object. This implementation supports automatic flushing on close when used with try-with-resources.

### Usage Example
```java
try (EvaluationStore store = new JsonLinesStore(Path.of("target/evaluations.jsonl"))) {
    EvaluationAggregation result = EvaluationAggregator.evaluateAll(sample, evaluators);
    store.store(result);
} // flush() is called automatically on close()
```

### Output Format Example
```json
{"sample":{"question":"...","answer":"...","groundTruth":"..."},"metrics":{"Answer correctness":0.85}}
```
```

--------------------------------

### Evaluate with a Single AnswerCorrectness Evaluator

Source: https://github.com/explore-de/rage4j/blob/main/dev.rage4j.rage4j/README.md

Initializes a ChatModel and an AnswerCorrectnessEvaluator, then evaluates a sample. Requires API key and model name configuration.

```java
ChatModel chatModel = OpenAiChatModel.builder()
    .apiKey(apiKey)
    .modelName("gpt-4o")
    .build();

Evaluator evaluator = new AnswerCorrectnessEvaluator(chatModel);
Evaluation result = evaluator.evaluate(sample);
System.out.println(result.getName() + ": " + result.getValue());
```

--------------------------------

### Write to Multiple Stores with CompositeStore

Source: https://context7.com/explore-de/rage4j/llms.txt

Employ CompositeStore to simultaneously write evaluation results to multiple destinations, useful for backups or multi-format output. The store automatically flushes and closes all underlying stores.
```java
import dev.rage4j.persist.store.CompositeStore;
import dev.rage4j.persist.store.JsonLinesStore;
import dev.rage4j.persist.EvaluationStore;
import java.nio.file.Path;

// Write to multiple files simultaneously
EvaluationStore composite = new CompositeStore(
    new JsonLinesStore(Path.of("target/evaluations.jsonl")),
    new JsonLinesStore(Path.of("target/backup/evaluations.jsonl")),
    new JsonLinesStore(Path.of("/shared/metrics/evaluations.jsonl"))
);

try (composite) {
    composite.store(aggregation);
    // Data written to all three files on flush/close
}
```

--------------------------------

### Add rage4j Core Dependency to Maven

Source: https://github.com/explore-de/rage4j/blob/main/README.md

Include the core rage4j library in your project's pom.xml to use its evaluation functionalities.

```xml
<dependency>
    <groupId>dev.rage4j</groupId>
    <artifactId>rage4j</artifactId>
    <version>2.0.1-SNAPSHOT</version>
</dependency>
```

--------------------------------

### EvaluationStore Interface

Source: https://github.com/explore-de/rage4j/blob/main/docusaurus/docs/rage4j-persist/introduction.md

The `EvaluationStore` interface defines the contract for persisting evaluation results. It includes methods for storing, flushing, and closing the store.

```APIDOC
## EvaluationStore Interface

### Description
The `EvaluationStore` interface defines how evaluation results are persisted.

### Methods
- `store(EvaluationAggregation aggregation)`: Buffers data for later persistence.
- `flush()`: Writes buffered data to the storage.
- `storeFlush(EvaluationAggregation aggregation)`: Stores the provided aggregation and then flushes the buffer.
- `close()`: Releases any resources held by the store.

### Interface Definition
```java
public interface EvaluationStore extends Closeable {
    void store(EvaluationAggregation aggregation);
    void flush();
    void storeFlush(EvaluationAggregation aggregation);
    void close();
}
```
```

--------------------------------

### Evaluate Sample with BleuScoreEvaluator

Source: https://github.com/explore-de/rage4j/blob/main/docusaurus/docs/rage4j-core/metrics/bleu_score.md

Instantiate BleuScoreEvaluator and use it to evaluate a sample. Requires 'answer' and 'groundTruth' fields in the sample.

```java
BleuScoreEvaluator evaluator = new BleuScoreEvaluator();
Evaluation result = evaluator.evaluate(sample);
double bleuScore = result.getValue();
```

--------------------------------

### Test Semantic Similarity with Rage4j

Source: https://github.com/explore-de/rage4j/blob/main/docusaurus/docs/rage4j-assert/examples.md

Use assertSemanticSimilarity to verify the semantic similarity score between a generated answer and the ground truth. Requires question, ground truth, and an LLM model.

```java
RageAssert rageAssert = new OpenAiLLMBuilder().fromApiKey(key);

rageAssert.given()
    .question(QUESTION)
    .groundTruth(GROUND_TRUTH)
    .when()
    .answer(model::generate)
    .then()
    .assertSemanticSimilarity(0.7);
```

--------------------------------

### Test Semantic Similarity with RageAssert

Source: https://context7.com/explore-de/rage4j/llms.txt

Assert that an answer's semantic similarity to the ground truth meets a specified threshold. Throws Rage4JSimilarityException on failure.
```java
// Test semantic similarity
rageAssert.given()
    .question("Describe photosynthesis")
    .groundTruth("Photosynthesis is the process by which plants convert sunlight into energy.")
    .when()
    .answer("Plants use photosynthesis to transform light energy into chemical energy.")
    .then()
    .assertSemanticSimilarity(0.75); // Throws Rage4JSimilarityException if below 0.75
```

--------------------------------

### JsonLinesStore Usage

Source: https://github.com/explore-de/rage4j/blob/main/docusaurus/docs/rage4j-persist/introduction.md

Stores evaluation results in JSON Lines format to a specified file. The store is automatically flushed and closed when exiting the try-with-resources block.

```java
try (EvaluationStore store = new JsonLinesStore(Path.of("target/evaluations.jsonl"))) {
    EvaluationAggregation result = EvaluationAggregator.evaluateAll(sample, evaluators);
    store.store(result);
} // flush() called automatically on close()
```

--------------------------------

### Test Answer Correctness with Rage4j

Source: https://github.com/explore-de/rage4j/blob/main/docusaurus/docs/rage4j-assert/examples.md

Use assertAnswerCorrectness to check if a generated answer meets a specified correctness threshold against a ground truth. Requires setting up RageAssert with an LLM provider.

```java
RageAssert rageAssert = new OpenAiLLMBuilder().fromApiKey(key);

rageAssert.given()
    .question(QUESTION)
    .groundTruth(GROUND_TRUTH)
    .when()
    .answer(model::generate)
    .then()
    .assertAnswerCorrectness(0.7);
```

--------------------------------

### Test Answer Relevance with Rage4j

Source: https://github.com/explore-de/rage4j/blob/main/docusaurus/docs/rage4j-assert/examples.md

Use assertAnswerRelevance to check if a generated answer is relevant to the question, against a minimum relevance score. Requires question, ground truth, and an LLM model.
```java
RageAssert rageAssert = new OpenAiLLMBuilder().fromApiKey(key);

rageAssert.given()
    .question(QUESTION)
    .groundTruth(GROUND_TRUTH)
    .when()
    .answer(model::generate)
    .then()
    .assertAnswerRelevance(0.7);
```

--------------------------------

### Test Faithfulness with Rage4j

Source: https://github.com/explore-de/rage4j/blob/main/docusaurus/docs/rage4j-assert/examples.md

Use assertFaithfulness to ensure a generated answer adheres to the provided context with a minimum faithfulness score. Requires question, ground truth, context, and an LLM model.

```java
RageAssert rageAssert = new OpenAiLLMBuilder().fromApiKey(key);

rageAssert.given()
    .question(QUESTION)
    .groundTruth(GROUND_TRUTH)
    .context(CONTEXT)
    .when()
    .answer(model::generate)
    .then()
    .assertFaithfulness(0.7);
```

--------------------------------

### EvaluationStore Interface Definition

Source: https://github.com/explore-de/rage4j/blob/main/docusaurus/docs/rage4j-persist/introduction.md

Defines the contract for persisting evaluation results, including methods for storing, flushing, and closing.

```java
public interface EvaluationStore extends Closeable {
    void store(EvaluationAggregation aggregation); // Buffer data
    void flush(); // Write to storage
    void storeFlush(EvaluationAggregation aggregation); // Store and flush
    void close(); // Release resources
}
```

--------------------------------

### Retrieve Evaluation Aggregation Results

Source: https://github.com/explore-de/rage4j/blob/main/dev.rage4j.rage4j-assert/README.md

Obtain the aggregated evaluation metrics and sample data after running assertions. This result can be used for analysis or persistence.
```java
import dev.rage4j.model.EvaluationAggregation;

EvaluationAggregation result = rageAssert.given()
    .question("What is AI?")
    .groundTruth("...")
    .when()
    .answer(llm::chat)
    .then()
    .assertFaithfulness(0.7)
    .then()
    .assertAnswerCorrectness(0.8)
    .getEvaluationAggregation(); // Returns aggregation with sample and all metrics

// Access the metrics
result.forEach((metric, score) -> System.out.println(metric + ": " + score));

// Access the sample
result.getSample().ifPresent(sample -> System.out.println(sample.getQuestion()));
```

--------------------------------

### Test BLEU Score with Rage4j

Source: https://github.com/explore-de/rage4j/blob/main/docusaurus/docs/rage4j-assert/examples.md

Use assertBleuScore to test the n-gram overlap precision of a generated answer against a ground truth. Requires question, ground truth, and an LLM model.

```java
RageAssert rageAssert = new OpenAiLLMBuilder().fromApiKey(key);

rageAssert.given()
    .question(QUESTION)
    .groundTruth(GROUND_TRUTH)
    .when()
    .answer(model::generate)
    .then()
    .assertBleuScore(0.7);
```

--------------------------------

### Test ROUGE Score with Rage4j

Source: https://github.com/explore-de/rage4j/blob/main/docusaurus/docs/rage4j-assert/examples.md

Use assertRougeScore to evaluate the ROUGE score, specifically ROUGE_L_SUM with precision, against a ground truth. Requires question, ground truth, and an LLM model.

```java
RageAssert rageAssert = new OpenAiLLMBuilder().fromApiKey(key);

rageAssert.given()
    .question(QUESTION)
    .groundTruth(GROUND_TRUTH)
    .when()
    .answer(model::generate)
    .then()
    .assertRougeScore(0.9, RougeScoreEvaluator.RougeType.ROUGE_L_SUM, RougeScoreEvaluator.MeasureType.PRECISION);
```

--------------------------------

### Enable Evaluation Mode in RAGE4J-Assert

Source: https://github.com/explore-de/rage4j/blob/main/docusaurus/docs/rage4j-assert/introduction.md

Enables evaluation mode in RAGE4J-Assert, which logs warnings on assertion failures instead of throwing exceptions.
Strict mode can be re-enabled using withStrictMode().

```java
RageAssert rageAssert = new OpenAiLLMBuilder()
    .fromApiKey(key)
    .withEvaluationMode(); // Logs warnings instead of throwing

// Switch back to strict mode if needed
rageAssert.withStrictMode();
```
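
To make the difference between the two modes concrete, here is a minimal, self-contained sketch of the general pattern: a threshold check that either throws (strict) or records a warning and continues (evaluation). It does not show how RAGE4J-Assert is implemented. `ModeSketch`, `ThresholdChecker`, `check`, and `warnings` are hypothetical names introduced only for illustration; only the method names `withEvaluationMode` and `withStrictMode` mirror the snippet above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of strict vs. evaluation mode; not part of Rage4J. */
public class ModeSketch {

    /** Checks scores against minimum thresholds (assumed helper, not a library class). */
    public static final class ThresholdChecker {
        private boolean evaluationMode = false; // strict by default
        private final List<String> warnings = new ArrayList<>();

        public ThresholdChecker withEvaluationMode() {
            evaluationMode = true;
            return this;
        }

        public ThresholdChecker withStrictMode() {
            evaluationMode = false;
            return this;
        }

        /** Passes silently when score >= minimum; otherwise warns or throws, depending on mode. */
        public ThresholdChecker check(String metric, double score, double minimum) {
            if (score >= minimum) {
                return this;
            }
            String message = metric + " " + score + " is below " + minimum;
            if (evaluationMode) {
                warnings.add(message); // record the failure and keep going
                return this;
            }
            throw new AssertionError(message); // strict mode: fail immediately
        }

        /** Returns an unmodifiable copy of the warnings recorded so far. */
        public List<String> warnings() {
            return List.copyOf(warnings);
        }
    }

    public static void main(String[] args) {
        ThresholdChecker checker = new ThresholdChecker().withEvaluationMode();

        // The first check fails but only records a warning, so the second still runs.
        checker.check("Faithfulness", 0.6, 0.7)
               .check("Answer correctness", 0.9, 0.8);
        System.out.println("Warnings: " + checker.warnings());

        // After switching back, the same failing check throws.
        checker.withStrictMode();
        try {
            checker.check("Faithfulness", 0.6, 0.7);
        } catch (AssertionError e) {
            System.out.println("Strict mode failure: " + e.getMessage());
        }
    }
}
```

Recording failures rather than throwing lets a single run collect every metric for a sample, which is the use case the evaluation-mode description gives; strict mode suits tests that should fail on the first missed threshold.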