### Complete Example with RageAssert and EvaluationStore

Source: https://explore-de.github.io/rage4j/docs/rage4j-persist-junit5/introduction

This example demonstrates a complete test case using RageAssert for evaluation and RAGE4J-Persist-JUnit5 to store the results. It configures the output file and uses an OpenAI LLM builder for assertions.

```java
import dev.rage4j.persist.EvaluationStore;
import dev.rage4j.persist.junit5.Rage4jPersistConfig;
import dev.rage4j.asserts.RageAssert;
import dev.rage4j.asserts.openai.OpenAiLLMBuilder;
import dev.rage4j.model.EvaluationAggregation;
import org.junit.jupiter.api.Test;

@Rage4jPersistConfig(file = "target/my-evaluations.jsonl")
class RagEvaluationTest {

    private final String apiKey = System.getenv("OPEN_AI_KEY");
    private final RageAssert rageAssert = new OpenAiLLMBuilder().fromApiKey(apiKey);

    @Test
    void testCorrectness(EvaluationStore store) {
        EvaluationAggregation result = rageAssert.given()
            .question("What is the capital of France?")
            .groundTruth("Paris")
            .when()
            .answer("Paris is the capital of France.")
            .then()
            .assertAnswerCorrectness(0.7)
            .getEvaluationAggregation();

        store.store(result);
    }
}
```

--------------------------------

### Complete Evaluation Example in Java

Source: https://explore-de.github.io/rage4j/docs/rage4j-core/core_concepts

This example demonstrates how to use Rage4j to evaluate an LLM response with multiple metrics. It initializes various evaluators, creates a Sample, and aggregates the results.

```java
import dev.langchain4j.model.chat.ChatLanguageModel;
import dev.langchain4j.model.embedding.EmbeddingModel;

public class EvaluationExample {
    public static void main(String[] args) {
        ChatLanguageModel chatModel = /* Any Langchain4j ChatLanguageModel */
        EmbeddingModel embeddingModel = /* Any Langchain4j EmbeddingModel */

        Evaluator relevanceEvaluator = new AnswerRelevanceEvaluator(chatModel, embeddingModel);
        Evaluator correctnessEvaluator = new AnswerCorrectnessEvaluator(chatModel);
        Evaluator faithfulnessEvaluator = new FaithfulnessEvaluator(chatModel);
        Evaluator similarityEvaluator = new AnswerSemanticSimilarityEvaluator(embeddingModel);

        Sample sample = Sample.builder()
            .withQuestion("What are the main features of Java?")
            .withAnswer("Java is object-oriented, platform-independent, and has automatic memory management.")
            .withGroundTruth("Java's main features include object-oriented programming, platform independence through JVM, automatic memory management (garbage collection), and strong type safety.")
            .withContext("Java is a popular programming language. Key features of Java include...")
            .build();

        EvaluationAggregation results = EvaluationAggregator.evaluateAll(sample,
            relevanceEvaluator,
            correctnessEvaluator,
            faithfulnessEvaluator,
            similarityEvaluator
        );

        // Access results
        System.out.println("Relevance score: " + results.get("Answer relevance"));
        System.out.println("Correctness score: " + results.get("Answer correctness"));
        System.out.println("Faithfulness score: " + results.get("Faithfulness"));
        System.out.println("Semantic similarity: " + results.get("Answer semantic similarity"));
    }
}
```

--------------------------------

### Evaluate Answer Relevance with RAGE4j-Core

Source: https://explore-de.github.io/rage4j/docs/rage4j-core/introduction

This example demonstrates how to create an AnswerRelevanceEvaluator, define a sample with a question and answer, and then evaluate the sample to get the relevance score.
Ensure you have initialized chatModel and embeddingModel.

```java
// 1. Create an evaluator
Evaluator answerRelevanceEvaluator = new AnswerRelevanceEvaluator(chatModel, embeddingModel);

// 2. Create a sample
Sample sample = Sample.builder()
    .withQuestion("What is Java?")
    .withAnswer("Java is a programming language.")
    .build();

// 3. Evaluate and get results
Evaluation result = answerRelevanceEvaluator.evaluate(sample);

// 4. Get our score
System.out.println("Metric name: " + result.getName());   // Metric name: Answer relevance
System.out.println("Metric score: " + result.getValue()); // Metric score: 1.0
```

--------------------------------

### Get Maven Dependency for RAGE4J-Core

Source: https://explore-de.github.io/rage4j/docs/category/rage4j-core

Provides a JavaScript function to dynamically generate the Maven dependency string for RAGE4J-Core based on a specified version.

```javascript
const getMavenDependency = (version) => `
```

--------------------------------

### Test ROUGE Score with RageAssert

Source: https://explore-de.github.io/rage4j/docs/rage4j-assert/examples

Use assertRougeScore to evaluate the ROUGE score, specifically ROUGE_L_SUM precision in this example, against a ground truth.

```java
RageAssert rageAssert = new OpenAiLLMBuilder().fromApiKey(key);
rageAssert.given()
    .question(QUESTION)
    .groundTruth(GROUND_TRUTH)
    .when()
    .answer(model::generate)
    .then()
    .assertRougeScore(0.9, RougeScoreEvaluator.RougeType.ROUGE_L_SUM, RougeScoreEvaluator.MeasureType.PRECISION);
```

--------------------------------

### JsonLinesStore Usage

Source: https://explore-de.github.io/rage4j/docs/rage4j-persist/introduction

Example of using JsonLinesStore to save evaluation results in JSON Lines format. The store is automatically closed and flushed when used in a try-with-resources block.

```APIDOC
## JsonLinesStore

### Description
The primary implementation stores evaluations in JSON Lines format (`.jsonl`), where each line is a complete JSON object.

### Usage Example
```java
try (EvaluationStore store = new JsonLinesStore(Path.of("target/evaluations.jsonl"))) {
    EvaluationAggregation result = EvaluationAggregator.evaluateAll(sample, evaluators);
    store.store(result);
}
// flush() called automatically on close()
```

### Output Format Example
```json
{ "sample": { "question": "...", "answer": "...", "groundTruth": "..." }, "metrics": { "Answer correctness": 0.85 } }
```
```

--------------------------------

### ROUGE Score Evaluation Examples

Source: https://explore-de.github.io/rage4j/docs/rage4j-core/metrics/rouge_score

Demonstrates how to initialize and use the RougeScoreEvaluator for different ROUGE types and measure types. The default is ROUGE-1 F1 score. The evaluate method returns an Evaluation result object.

```java
RougeScoreEvaluator evaluator = new RougeScoreEvaluator();
RougeScoreEvaluator customEvaluator = new RougeScoreEvaluator(RougeType.ROUGE2, MeasureType.PRECISION);
RougeScoreEvaluator summaryEvaluator = new RougeScoreEvaluator(RougeType.ROUGE_L_SUM, MeasureType.RECALL);

Evaluation result = evaluator.evaluate(sample);
double rougeScore = result.getValue();
```

--------------------------------

### Create a Sample Object in Java

Source: https://explore-de.github.io/rage4j/docs/rage4j-core/core_concepts

Use the Sample.builder() to construct a Sample object, which represents an evaluation instance. It includes a question, answer, ground truth, and optional context.

```java
Sample sample = Sample.builder()
    .withQuestion("What is the capital of France?")
    .withAnswer("Paris is the capital of France.")
    .withGroundTruth("Paris is the capital and largest city of France.")
    .withContext("Paris is the capital of France...")
    .build();
```

--------------------------------

### Basic Test with RAGE4J-Assert

Source: https://explore-de.github.io/rage4j/docs/rage4j-assert/introduction

Demonstrates a simple test case using RAGE4J-Assert to check the correctness of an LLM's answer against a ground truth.
Requires API key and OpenAI model configuration.

```java
import dev.langchain4j.model.openai.OpenAiChatModel;
import dev.rage4j.asserts.RageAssert;
import dev.rage4j.asserts.openai.OpenAiLLMBuilder;
import org.junit.jupiter.api.Test;

import static dev.langchain4j.model.openai.OpenAiChatModelName.GPT_4_O_MINI;

class RageAssertTest {

    private final String key = System.getenv("OPEN_API_KEY");
    private final OpenAiChatModel model = OpenAiChatModel.builder()
        .apiKey(key)
        .modelName(GPT_4_O_MINI)
        .build();

    @Test
    void testCorrectnessApi() {
        RageAssert rageAssert = new OpenAiLLMBuilder().fromApiKey(key);
        rageAssert.given()
            .question("What is the capital of France?")
            .groundTruth("Paris is the capital of France")
            .when()
            .answer(q -> model.generate(q))
            .then()
            .assertAnswerCorrectness(0.7);
    }
}
```

--------------------------------

### CompositeStore Usage

Source: https://explore-de.github.io/rage4j/docs/rage4j-persist/introduction

Demonstrates how to use CompositeStore to write evaluation results to multiple stores simultaneously.

```APIDOC
## CompositeStore

### Description
Write to multiple stores simultaneously.

### Usage Example
```java
EvaluationStore composite = new CompositeStore(
    new JsonLinesStore(Path.of("results.jsonl")),
    new JsonLinesStore(Path.of("backup.jsonl"))
);
```
```

--------------------------------

### Custom Store Implementation with @Rage4jPersistConfig

Source: https://explore-de.github.io/rage4j/docs/rage4j-persist-junit5/introduction

To use a custom store implementation, specify the 'storeClass' attribute in the @Rage4jPersistConfig annotation. Ensure your custom store class has a constructor that accepts a Path.

```java
@Rage4jPersistConfig(file = "target/custom.dat", storeClass = MyCustomStore.class)
class MyCustomTest {
    // MyCustomStore must have a constructor that accepts a Path
}
```

--------------------------------

### Enable Automatic Persistence with @Rage4jPersistConfig

Source: https://explore-de.github.io/rage4j/docs/rage4j-persist-junit5/introduction

Annotate your test class with @Rage4jPersistConfig to enable automatic persistence. Specify the output file path using the 'file' attribute. The EvaluationStore will be injected into your test methods.

```java
@Rage4jPersistConfig(file = "target/evaluations.jsonl")
class MyEvaluationTest {

    @Test
    void testEvaluation(EvaluationStore store) {
        // store is injected and ready to use
        EvaluationAggregation aggregation = EvaluationAggregator.evaluateAll(sample, evaluators);
        store.store(aggregation);
    }
}
```

--------------------------------

### Enable Detailed Metric Logs with Maven

Source: https://explore-de.github.io/rage4j/docs/rage4j-core/core-installation

Run this Maven command during testing to enable verbose logging for metric calculations. This is useful for debugging.

```bash
mvn test -Dshow.metric.logs=true
```

--------------------------------

### Configure Custom Chat and Embedding Models

Source: https://explore-de.github.io/rage4j/docs/rage4j-assert/introduction

Shows how to configure specific chat and embedding models for RAGE4J-Assert using the OpenAiLLMBuilder. Defaults are 'gpt-5.1' for chat and 'text-embedding-3-small' for embedding.

```java
RageAssert rageAssert = new OpenAiLLMBuilder()
    .withChatModel("gpt-4o")
    .withEmbeddingModel("text-embedding-3-large")
    .fromApiKey(key);
```

--------------------------------

### Chain Multiple Assertions with RageAssert

Source: https://explore-de.github.io/rage4j/docs/rage4j-assert/examples

Demonstrates chaining multiple assertions like correctness and semantic similarity for a single LLM-generated answer.
This is the recommended approach for testing against multiple metrics.

```java
RageAssert rageAssert = new OpenAiLLMBuilder().fromApiKey(key);
rageAssert.given()
    .question(QUESTION)
    .groundTruth(GROUND_TRUTH)
    .when()
    .answer(model.generate(QUESTION))
    .then()
    .assertAnswerCorrectness(0.7)
    .then()
    .assertSemanticSimilarity(0.7);
```

--------------------------------

### Add Rage4J-Assert Maven Dependency

Source: https://explore-de.github.io/rage4j/docs/rage4j-assert/assert-installation

Include this dependency in your pom.xml to use Rage4J-Assert in your project. Maven will handle the download and inclusion automatically.

```xml
<dependency>
    <groupId>dev.rage4j</groupId>
    <artifactId>rage4j-assert</artifactId>
    <version>1.0.4</version>
    <scope>test</scope>
</dependency>
```

--------------------------------

### Add Rage4J-Core Maven Dependency

Source: https://explore-de.github.io/rage4j/docs/rage4j-core/core-installation

Include this dependency in your pom.xml to add Rage4J-Core to your project. Maven will handle the download and integration.

```xml
<dependency>
    <groupId>dev.rage4j</groupId>
    <artifactId>rage4j</artifactId>
    <version>1.0.4</version>
</dependency>
```

--------------------------------

### Add Rage4J-Persist-JUnit5 Maven Dependency

Source: https://explore-de.github.io/rage4j/docs/rage4j-persist-junit5/persist-junit5-installation

Include this dependency in your pom.xml to enable the JUnit 5 extension for automatic persistence. The rage4j-persist module is included transitively.

```xml
<dependency>
    <groupId>dev.rage4j</groupId>
    <artifactId>rage4j-persist-junit5</artifactId>
    <version>1.0.4</version>
    <scope>test</scope>
</dependency>
```

--------------------------------

### Initialize and Use AnswerRelevanceEvaluator

Source: https://explore-de.github.io/rage4j/docs/rage4j-core/metrics/answer_relevance

Instantiate the AnswerRelevanceEvaluator with a chat model and embedding model, then use it to evaluate a sample containing a question and answer. The result provides a relevance score.

```java
AnswerRelevanceEvaluator evaluator = new AnswerRelevanceEvaluator(chatModel, embeddingModel);
Evaluation result = evaluator.evaluate(sample);
double relevanceScore = result.getValue();
```

--------------------------------

### Evaluate Sample with BleuScoreEvaluator

Source: https://explore-de.github.io/rage4j/docs/rage4j-core/metrics/bleu_score

Instantiate the BleuScoreEvaluator and use it to evaluate a sample. The result contains the BLEU score value.

```java
BleuScoreEvaluator evaluator = new BleuScoreEvaluator();
Evaluation result = evaluator.evaluate(sample);
double bleuScore = result.getValue();
```

--------------------------------

### Add Rage4J-Persist Maven Dependency

Source: https://explore-de.github.io/rage4j/docs/rage4j-persist/persist-installation

Include this Maven dependency in your pom.xml file to add Rage4J-Persist to your project. Maven will automatically download and manage the library.

```xml
<dependency>
    <groupId>dev.rage4j</groupId>
    <artifactId>rage4j-persist</artifactId>
    <version>1.0.4</version>
</dependency>
```

--------------------------------

### Evaluate Faithfulness with FaithfulnessEvaluator

Source: https://explore-de.github.io/rage4j/docs/rage4j-core/metrics/faithfulness

Instantiate the FaithfulnessEvaluator and use it to evaluate a sample. The result contains the faithfulness score.

```java
FaithfulnessEvaluator evaluator = new FaithfulnessEvaluator(chatModel);
Evaluation result = evaluator.evaluate(sample);
double faithfulnessScore = result.getValue();
```

--------------------------------

### CompositeStore for Multiple Destinations

Source: https://explore-de.github.io/rage4j/docs/rage4j-persist/introduction

Writes evaluation results to multiple EvaluationStore implementations simultaneously. Useful for primary storage and backups.

```java
EvaluationStore composite = new CompositeStore(
    new JsonLinesStore(Path.of("results.jsonl")),
    new JsonLinesStore(Path.of("backup.jsonl"))
);
```

--------------------------------

### Enable Evaluation Mode in RAGE4J-Assert

Source: https://explore-de.github.io/rage4j/docs/rage4j-assert/introduction

Configures RAGE4J-Assert to use evaluation mode, which logs warnings on assertion failures instead of throwing exceptions. This is useful for data collection during complete test runs. Strict mode can be re-enabled using `withStrictMode()`.

```java
RageAssert rageAssert = new OpenAiLLMBuilder()
    .fromApiKey(key)
    .withEvaluationMode(); // Logs warnings instead of throwing

// Switch back to strict mode if needed
rageAssert.withStrictMode();
```

--------------------------------

### Test Semantic Similarity with RageAssert

Source: https://explore-de.github.io/rage4j/docs/rage4j-assert/examples

Use assertSemanticSimilarity to verify that the semantic similarity between a model's answer and the ground truth meets a specified threshold.

```java
RageAssert rageAssert = new OpenAiLLMBuilder().fromApiKey(key);
rageAssert.given()
    .question(QUESTION)
    .groundTruth(GROUND_TRUTH)
    .when()
    .answer(model::generate)
    .then()
    .assertSemanticSimilarity(0.7);
```

--------------------------------

### EvaluationStore Interface

Source: https://explore-de.github.io/rage4j/docs/rage4j-persist/introduction

The EvaluationStore interface defines the contract for persisting evaluation results. It includes methods for storing, flushing, and closing the store.

```APIDOC
## EvaluationStore Interface

### Description
The `EvaluationStore` interface defines how evaluation results are persisted.

### Methods
- `store(EvaluationAggregation aggregation)`: Buffers data for later persistence.
- `flush()`: Writes buffered data to the storage.
- `storeFlush(EvaluationAggregation aggregation)`: Stores the aggregation and then flushes the buffer.
- `close()`: Releases any resources held by the store.
```

--------------------------------

### EvaluationStore Interface Definition

Source: https://explore-de.github.io/rage4j/docs/rage4j-persist/introduction

Defines the contract for persisting evaluation results, including methods for storing, flushing, and closing.

```java
public interface EvaluationStore extends Closeable {
    void store(EvaluationAggregation aggregation);
    void flush();
    void storeFlush(EvaluationAggregation aggregation);
    void close();
}
```

--------------------------------

### Test BLEU Score with RageAssert

Source: https://explore-de.github.io/rage4j/docs/rage4j-assert/examples

Use assertBleuScore to test the n-gram overlap precision of a model's answer against a ground truth.

```java
RageAssert rageAssert = new OpenAiLLMBuilder().fromApiKey(key);
rageAssert.given()
    .question(QUESTION)
    .groundTruth(GROUND_TRUTH)
    .when()
    .answer(model::generate)
    .then()
    .assertBleuScore(0.7);
```

--------------------------------

### Test Answer Relevance with RageAssert

Source: https://explore-de.github.io/rage4j/docs/rage4j-assert/examples

Use assertAnswerRelevance to check that a model's answer is relevant to the question, meeting a minimum relevance score.

```java
RageAssert rageAssert = new OpenAiLLMBuilder().fromApiKey(key);
rageAssert.given()
    .question(QUESTION)
    .groundTruth(GROUND_TRUTH)
    .when()
    .answer(model::generate)
    .then()
    .assertAnswerRelevance(0.7);
```

--------------------------------

### Test Faithfulness with RageAssert

Source: https://explore-de.github.io/rage4j/docs/rage4j-assert/examples

Use assertFaithfulness to ensure a generated answer adheres to the provided context with a minimum faithfulness score.

```java
RageAssert rageAssert = new OpenAiLLMBuilder().fromApiKey(key);
rageAssert.given()
    .question(QUESTION)
    .groundTruth(GROUND_TRUTH)
    .context(CONTEXT)
    .when()
    .answer(model::generate)
    .then()
    .assertFaithfulness(0.7);
```

--------------------------------

### JsonLinesStore Usage

Source: https://explore-de.github.io/rage4j/docs/rage4j-persist/introduction

Persists evaluation results in JSON Lines format. Ensures data is flushed automatically when used with try-with-resources.

```java
try (EvaluationStore store = new JsonLinesStore(Path.of("target/evaluations.jsonl"))) {
    EvaluationAggregation result = EvaluationAggregator.evaluateAll(sample, evaluators);
    store.store(result);
}
// flush() called automatically on close()
```

--------------------------------

### Evaluate Answer Semantic Similarity

Source: https://explore-de.github.io/rage4j/docs/rage4j-core/metrics/answer_semantic_similarity

Instantiate the AnswerSemanticSimilarityEvaluator with an embedding model and use it to evaluate a sample. The similarity score is extracted from the evaluation result.

```java
AnswerSemanticSimilarityEvaluator evaluator = new AnswerSemanticSimilarityEvaluator(embeddingModel);
Evaluation result = evaluator.evaluate(sample);
double similarityScore = result.getValue();
```

--------------------------------

### Test Answer Correctness with RageAssert

Source: https://explore-de.github.io/rage4j/docs/rage4j-assert/examples

Use assertAnswerCorrectness to check if a model's generated answer meets a specified correctness threshold against a ground truth.

```java
RageAssert rageAssert = new OpenAiLLMBuilder().fromApiKey(key);
rageAssert.given()
    .question(QUESTION)
    .groundTruth(GROUND_TRUTH)
    .when()
    .answer(model::generate)
    .then()
    .assertAnswerCorrectness(0.7);
```

--------------------------------

### Evaluate Answer Correctness

Source: https://explore-de.github.io/rage4j/docs/rage4j-core/metrics/answer_correctness

Instantiate the AnswerCorrectnessEvaluator and use it to evaluate a sample. The resulting score indicates the alignment of the answer with the ground truth.

```java
AnswerCorrectnessEvaluator evaluator = new AnswerCorrectnessEvaluator(chatModel);
Evaluation result = evaluator.evaluate(sample);
double correctnessScore = result.getValue();
```

--------------------------------

### Evaluation Result in Java

Source: https://explore-de.github.io/rage4j/docs/rage4j-core/core_concepts

The Evaluation class holds the outcome of a single metric assessment. It contains the metric's name and its score, typically between 0 and 1.

```java
Evaluation result = evaluator.evaluate(sample);
String metricName = result.getName(); // e.g., "Answer correctness"
double score = result.getValue();     // Score between 0 and 1
```

--------------------------------

### Evaluation Aggregator Class Definition in Java

Source: https://explore-de.github.io/rage4j/docs/rage4j-core/core_concepts

The EvaluationAggregator class provides a utility method to evaluate a sample against multiple evaluators simultaneously.

```java
public class EvaluationAggregator {
    public static EvaluationAggregation evaluateAll(Sample sample, Evaluator... evaluators);
}
```

--------------------------------

### Evaluator Interface Definition in Java

Source: https://explore-de.github.io/rage4j/docs/rage4j-core/core_concepts

The Evaluator interface defines the contract for all evaluators. Implement this interface to create custom evaluation metrics.

```java
public interface Evaluator {
    Evaluation evaluate(Sample sample);
}
```
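
--------------------------------

### Sketch: Implementing a Custom Evaluator

The `Evaluator` interface shown above is the documented extension point for custom metrics, but the snippets do not include an implementation. The following is a minimal sketch of what one might look like, not the library's own code. To stay self-contained it declares hypothetical stand-ins for `Sample`, `Evaluation`, and `Evaluator` that mirror only the accessors seen in the snippets (`getName()`, `getValue()`, `evaluate(Sample)`); the `Evaluation` constructor, the `Sample` fields, and the `TokenRecallEvaluator` metric itself are assumptions made for illustration.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

class CustomEvaluatorSketch {

    // Hypothetical stand-in for the library's Sample; only the two fields used here.
    static final class Sample {
        final String answer;
        final String groundTruth;

        Sample(String answer, String groundTruth) {
            this.answer = answer;
            this.groundTruth = groundTruth;
        }
    }

    // Hypothetical stand-in for Evaluation; this constructor is an assumption.
    static final class Evaluation {
        private final String name;
        private final double value;

        Evaluation(String name, double value) {
            this.name = name;
            this.value = value;
        }

        String getName() { return name; }
        double getValue() { return value; }
    }

    // Same shape as the Evaluator interface shown in the documentation.
    interface Evaluator {
        Evaluation evaluate(Sample sample);
    }

    // Illustrative metric: share of distinct ground-truth tokens that also occur in the answer.
    static final class TokenRecallEvaluator implements Evaluator {
        private static final String NAME = "Token recall";

        @Override
        public Evaluation evaluate(Sample sample) {
            Set<String> truth = tokens(sample.groundTruth);
            if (truth.isEmpty()) {
                // Nothing to recall, so report the lowest score instead of dividing by zero.
                return new Evaluation(NAME, 0.0);
            }
            Set<String> answer = tokens(sample.answer);
            long hits = truth.stream().filter(answer::contains).count();
            return new Evaluation(NAME, (double) hits / truth.size());
        }

        // Lower-cases the text and splits on runs of non-word characters.
        private static Set<String> tokens(String text) {
            Set<String> result = new HashSet<>();
            for (String token : text.toLowerCase(Locale.ROOT).split("\\W+")) {
                if (!token.isEmpty()) {
                    result.add(token);
                }
            }
            return result;
        }
    }

    public static void main(String[] args) {
        Evaluator evaluator = new TokenRecallEvaluator();
        Sample sample = new Sample("Paris is the capital of France.", "Paris");
        Evaluation result = evaluator.evaluate(sample);
        System.out.println(result.getName() + ": " + result.getValue());
    }
}
```

In a real project the stand-in types would be replaced by the library's own `Sample`, `Evaluation`, and `Evaluator`, and the way an `Evaluation` is constructed may differ from this sketch. An evaluator written this way should be usable wherever the snippets pass `Evaluator` instances, for example as an argument to `EvaluationAggregator.evaluateAll`.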