### Complete Example with RageAssert and EvaluationStore
Source: https://explore-de.github.io/rage4j/docs/rage4j-persist-junit5/introduction
This example demonstrates a complete test case using RageAssert for evaluation and RAGE4J-Persist-JUnit5 to store the results. It configures the output file and uses an OpenAI LLM builder for assertions.
```java
import dev.rage4j.persist.EvaluationStore;
import dev.rage4j.persist.junit5.Rage4jPersistConfig;
import dev.rage4j.asserts.RageAssert;
import dev.rage4j.asserts.openai.OpenAiLLMBuilder;
import dev.rage4j.model.EvaluationAggregation;
import org.junit.jupiter.api.Test;

@Rage4jPersistConfig(file = "target/my-evaluations.jsonl")
class RagEvaluationTest {

    private final String apiKey = System.getenv("OPEN_AI_KEY");
    private final RageAssert rageAssert = new OpenAiLLMBuilder().fromApiKey(apiKey);

    @Test
    void testCorrectness(EvaluationStore store) {
        EvaluationAggregation result = rageAssert.given()
            .question("What is the capital of France?")
            .groundTruth("Paris")
            .when()
            .answer("Paris is the capital of France.")
            .then()
            .assertAnswerCorrectness(0.7)
            .getEvaluationAggregation();

        store.store(result);
    }
}
```
--------------------------------
### Complete Evaluation Example in Java
Source: https://explore-de.github.io/rage4j/docs/rage4j-core/core_concepts
This example demonstrates how to use Rage4j to evaluate an LLM response with multiple metrics. It initializes various evaluators, creates a Sample, and aggregates the results.
```java
import dev.langchain4j.model.chat.ChatLanguageModel;
import dev.langchain4j.model.embedding.EmbeddingModel;

public class EvaluationExample {
    public static void main(String[] args) {
        ChatLanguageModel chatModel = /* Any Langchain4j ChatLanguageModel */
        EmbeddingModel embeddingModel = /* Any Langchain4j EmbeddingModel */

        Evaluator relevanceEvaluator = new AnswerRelevanceEvaluator(chatModel, embeddingModel);
        Evaluator correctnessEvaluator = new AnswerCorrectnessEvaluator(chatModel);
        Evaluator faithfulnessEvaluator = new FaithfulnessEvaluator(chatModel);
        Evaluator similarityEvaluator = new AnswerSemanticSimilarityEvaluator(embeddingModel);

        Sample sample = Sample.builder()
            .withQuestion("What are the main features of Java?")
            .withAnswer("Java is object-oriented, platform-independent, and has automatic memory management.")
            .withGroundTruth("Java's main features include object-oriented programming, platform independence through JVM, automatic memory management (garbage collection), and strong type safety.")
            .withContext("Java is a popular programming language. Key features of Java include...")
            .build();

        EvaluationAggregation results = EvaluationAggregator.evaluateAll(sample,
            relevanceEvaluator,
            correctnessEvaluator,
            faithfulnessEvaluator,
            similarityEvaluator
        );

        // Access results
        System.out.println("Relevance score: " + results.get("Answer relevance"));
        System.out.println("Correctness score: " + results.get("Answer correctness"));
        System.out.println("Faithfulness score: " + results.get("Faithfulness"));
        System.out.println("Semantic similarity: " + results.get("Answer semantic similarity"));
    }
}
```
--------------------------------
### Evaluate Answer Relevance with RAGE4j-Core
Source: https://explore-de.github.io/rage4j/docs/rage4j-core/introduction
This example demonstrates how to create an AnswerRelevanceEvaluator, define a sample with a question and answer, and then evaluate the sample to get the relevance score. Ensure you have initialized chatModel and embeddingModel.
```java
// 1. Create an evaluator
Evaluator answerRelevanceEvaluator = new AnswerRelevanceEvaluator(chatModel, embeddingModel);

// 2. Create a sample
Sample sample = Sample.builder()
    .withQuestion("What is Java?")
    .withAnswer("Java is a programming language.")
    .build();

// 3. Evaluate and get results
Evaluation result = answerRelevanceEvaluator.evaluate(sample);

// 4. Print the metric name and score
System.out.println("Metric name: " + result.getName()); // Metric name: Answer relevance
System.out.println("Metric score: " + result.getValue()); // Metric score: 1.0
```
--------------------------------
### Get Maven Dependency for RAGE4J-Core
Source: https://explore-de.github.io/rage4j/docs/category/rage4j-core
Provides a JavaScript function to dynamically generate the Maven dependency string for RAGE4J-Core based on a specified version.
```javascript
const getMavenDependency = (version) => `
```
--------------------------------
### Test ROUGE Score with RageAssert
Source: https://explore-de.github.io/rage4j/docs/rage4j-assert/examples
Use assertRougeScore to evaluate the ROUGE score, specifically ROUGE_L_SUM precision in this example, against a ground truth.
```java
RageAssert rageAssert = new OpenAiLLMBuilder().fromApiKey(key);
rageAssert.given()
    .question(QUESTION)
    .groundTruth(GROUND_TRUTH)
    .when()
    .answer(model::generate)
    .then()
    .assertRougeScore(0.9, RougeScoreEvaluator.RougeType.ROUGE_L_SUM, RougeScoreEvaluator.MeasureType.PRECISION);
```
--------------------------------
### JsonLinesStore Usage
Source: https://explore-de.github.io/rage4j/docs/rage4j-persist/introduction
Example of using JsonLinesStore to save evaluation results in JSON Lines format. The store is automatically closed and flushed when used in a try-with-resources block.
````APIDOC
## JsonLinesStore
### Description
The primary implementation stores evaluations in JSON Lines format (`.jsonl`), where each line is a complete JSON object.
### Usage Example
```java
try (EvaluationStore store = new JsonLinesStore(Path.of("target/evaluations.jsonl"))) {
    EvaluationAggregation result = EvaluationAggregator.evaluateAll(sample, evaluators);
    store.store(result);
} // flush() called automatically on close()
```
### Output Format Example
```json
{
  "sample": {
    "question": "...",
    "answer": "...",
    "groundTruth": "..."
  },
  "metrics": {
    "Answer correctness": 0.85
  }
}
```
````
--------------------------------
### ROUGE Score Evaluation Examples
Source: https://explore-de.github.io/rage4j/docs/rage4j-core/metrics/rouge_score
Demonstrates how to initialize and use the RougeScoreEvaluator for different ROUGE types and measure types. The default is ROUGE-1 F1 score. The evaluate method returns an Evaluation result object.
```java
RougeScoreEvaluator evaluator = new RougeScoreEvaluator();
RougeScoreEvaluator customEvaluator = new RougeScoreEvaluator(RougeType.ROUGE2, MeasureType.PRECISION);
RougeScoreEvaluator summaryEvaluator = new RougeScoreEvaluator(RougeType.ROUGE_L_SUM, MeasureType.RECALL);
Evaluation result = evaluator.evaluate(sample);
double rougeScore = result.getValue();
```
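To make the ROUGE-1 precision, recall, and F1 measures concrete, here is a minimal hand-rolled sketch in plain Java. It is not RAGE4J's `RougeScoreEvaluator`: the class name `Rouge1Sketch`, the whitespace tokenization, and the lowercasing are assumptions made for illustration, and the library may tokenize and normalize differently.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: a simplified ROUGE-1, not RAGE4J's implementation.
class Rouge1Sketch {

    // Count lowercase, whitespace-separated tokens (assumed tokenization).
    static Map<String, Integer> counts(String text) {
        Map<String, Integer> result = new HashMap<>();
        for (String token : text.toLowerCase().trim().split("\\s+")) {
            if (!token.isEmpty()) {
                result.merge(token, 1, Integer::sum);
            }
        }
        return result;
    }

    // Clipped overlap: a token counts at most as often as it occurs in both texts.
    static int overlap(Map<String, Integer> a, Map<String, Integer> b) {
        int total = 0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            total += Math.min(e.getValue(), b.getOrDefault(e.getKey(), 0));
        }
        return total;
    }

    static int size(Map<String, Integer> counts) {
        return counts.values().stream().mapToInt(Integer::intValue).sum();
    }

    // Share of answer tokens that also appear in the ground truth.
    static double precision(String answer, String groundTruth) {
        Map<String, Integer> a = counts(answer);
        int n = size(a);
        return n == 0 ? 0.0 : (double) overlap(a, counts(groundTruth)) / n;
    }

    // Share of ground-truth tokens that also appear in the answer.
    static double recall(String answer, String groundTruth) {
        Map<String, Integer> g = counts(groundTruth);
        int n = size(g);
        return n == 0 ? 0.0 : (double) overlap(counts(answer), g) / n;
    }

    // Harmonic mean of precision and recall.
    static double f1(String answer, String groundTruth) {
        double p = precision(answer, groundTruth);
        double r = recall(answer, groundTruth);
        return (p + r) == 0.0 ? 0.0 : 2 * p * r / (p + r);
    }

    public static void main(String[] args) {
        String truth = "Paris is the capital of France";
        String answer = "The capital is Paris";
        System.out.println("precision = " + precision(answer, truth));
        System.out.println("recall    = " + recall(answer, truth));
        System.out.println("f1        = " + f1(answer, truth));
    }
}
```

A short answer made up entirely of ground-truth words gets full precision but lower recall, which is why the choice of `MeasureType` matters when setting a threshold.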
--------------------------------
### Create a Sample Object in Java
Source: https://explore-de.github.io/rage4j/docs/rage4j-core/core_concepts
Use Sample.builder() to construct a Sample object, which represents a single evaluation instance. It holds a question, an answer, a ground truth, and an optional context.
```java
Sample sample = Sample.builder()
    .withQuestion("What is the capital of France?")
    .withAnswer("Paris is the capital of France.")
    .withGroundTruth("Paris is the capital and largest city of France.")
    .withContext("Paris is the capital of France...")
    .build();
```
--------------------------------
### Basic Test with RAGE4J-Assert
Source: https://explore-de.github.io/rage4j/docs/rage4j-assert/introduction
Demonstrates a simple test case using RAGE4J-Assert to check the correctness of an LLM's answer against a ground truth. Requires API key and OpenAI model configuration.
```java
import dev.langchain4j.model.openai.OpenAiChatModel;
import dev.rage4j.asserts.RageAssert;
import dev.rage4j.asserts.openai.OpenAiLLMBuilder;
import org.junit.jupiter.api.Test;

import static dev.langchain4j.model.openai.OpenAiChatModelName.GPT_4_O_MINI;

class RageAssertTest
{
    private final String key = System.getenv("OPEN_API_KEY");
    private final OpenAiChatModel model = OpenAiChatModel.builder()
        .apiKey(key)
        .modelName(GPT_4_O_MINI)
        .build();

    @Test
    void testCorrectnessApi()
    {
        RageAssert rageAssert = new OpenAiLLMBuilder().fromApiKey(key);
        rageAssert.given()
            .question("What is the capital of France?")
            .groundTruth("Paris is the capital of France")
            .when()
            .answer(q -> model.generate(q))
            .then()
            .assertAnswerCorrectness(0.7);
    }
}
```
--------------------------------
### CompositeStore Usage
Source: https://explore-de.github.io/rage4j/docs/rage4j-persist/introduction
Demonstrates how to use CompositeStore to write evaluation results to multiple stores simultaneously.
````APIDOC
## CompositeStore
### Description
Write to multiple stores simultaneously.
### Usage Example
```java
EvaluationStore composite = new CompositeStore(
    new JsonLinesStore(Path.of("results.jsonl")),
    new JsonLinesStore(Path.of("backup.jsonl"))
);
```
````
--------------------------------
### Custom Store Implementation with @Rage4jPersistConfig
Source: https://explore-de.github.io/rage4j/docs/rage4j-persist-junit5/introduction
To use a custom store implementation, specify the 'storeClass' attribute in the @Rage4jPersistConfig annotation. Ensure your custom store class has a constructor that accepts a Path.
```java
@Rage4jPersistConfig(file = "target/custom.dat", storeClass = MyCustomStore.class)
class MyCustomTest {
    // MyCustomStore must have a constructor that accepts a Path
}
```
--------------------------------
### Enable Automatic Persistence with @Rage4jPersistConfig
Source: https://explore-de.github.io/rage4j/docs/rage4j-persist-junit5/introduction
Annotate your test class with @Rage4jPersistConfig to enable automatic persistence. Specify the output file path using the 'file' attribute. The EvaluationStore will be injected into your test methods.
```java
@Rage4jPersistConfig(file = "target/evaluations.jsonl")
class MyEvaluationTest {

    @Test
    void testEvaluation(EvaluationStore store) {
        // store is injected and ready to use
        EvaluationAggregation aggregation = EvaluationAggregator.evaluateAll(sample, evaluators);
        store.store(aggregation);
    }
}
```
--------------------------------
### Enable Detailed Metric Logs with Maven
Source: https://explore-de.github.io/rage4j/docs/rage4j-core/core-installation
Run this Maven command during testing to enable verbose logging for metric calculations. This is useful for debugging.
```bash
mvn test -Dshow.metric.logs=true
```
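A `-D` flag such as `show.metric.logs` is a JVM system property, so code can check it with the standard `Boolean.getBoolean` call. The sketch below shows only that general mechanism; the class name `MetricLogFlagSketch` is hypothetical, and how RAGE4J reads the property internally is not documented here.

```java
// Illustrative only: reading a boolean system property by name.
class MetricLogFlagSketch {

    // True only when the property is set to "true" (case-insensitive).
    static boolean metricLogsEnabled() {
        return Boolean.getBoolean("show.metric.logs");
    }

    public static void main(String[] args) {
        if (metricLogsEnabled()) {
            System.out.println("Verbose metric logging requested");
        } else {
            System.out.println("Verbose metric logging not requested");
        }
    }
}
```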
--------------------------------
### Configure Custom Chat and Embedding Models
Source: https://explore-de.github.io/rage4j/docs/rage4j-assert/introduction
Shows how to configure specific chat and embedding models for RAGE4J-Assert using the OpenAiLLMBuilder. Defaults are 'gpt-5.1' for chat and 'text-embedding-3-small' for embedding.
```java
RageAssert rageAssert = new OpenAiLLMBuilder()
    .withChatModel("gpt-4o")
    .withEmbeddingModel("text-embedding-3-large")
    .fromApiKey(key);
```
--------------------------------
### Chain Multiple Assertions with RageAssert
Source: https://explore-de.github.io/rage4j/docs/rage4j-assert/examples
Demonstrates chaining multiple assertions like correctness and semantic similarity for a single LLM-generated answer. This is the recommended approach for testing against multiple metrics.
```java
RageAssert rageAssert = new OpenAiLLMBuilder().fromApiKey(key);
rageAssert.given()
    .question(QUESTION)
    .groundTruth(GROUND_TRUTH)
    .when()
    .answer(model.generate(QUESTION))
    .then()
    .assertAnswerCorrectness(0.7)
    .then()
    .assertSemanticSimilarity(0.7);
```
--------------------------------
### Add Rage4J-Assert Maven Dependency
Source: https://explore-de.github.io/rage4j/docs/rage4j-assert/assert-installation
Include this dependency in your pom.xml to use Rage4J-Assert in your project. Maven will handle the download and inclusion automatically.
```xml
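<dependency>
    <groupId>dev.rage4j</groupId>
    <artifactId>rage4j-assert</artifactId>
    <version>1.0.4</version>
    <scope>test</scope>
</dependency>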
```
--------------------------------
### Add Rage4J-Core Maven Dependency
Source: https://explore-de.github.io/rage4j/docs/rage4j-core/core-installation
Include this dependency in your pom.xml to add Rage4J-Core to your project. Maven will handle the download and integration.
```xml
<dependency>
    <groupId>dev.rage4j</groupId>
    <artifactId>rage4j</artifactId>
    <version>1.0.4</version>
</dependency>
```
--------------------------------
### Add Rage4J-Persist-JUnit5 Maven Dependency
Source: https://explore-de.github.io/rage4j/docs/rage4j-persist-junit5/persist-junit5-installation
Include this dependency in your pom.xml to enable the JUnit 5 extension for automatic persistence. The rage4j-persist module is included transitively.
```xml
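<dependency>
    <groupId>dev.rage4j</groupId>
    <artifactId>rage4j-persist-junit5</artifactId>
    <version>1.0.4</version>
    <scope>test</scope>
</dependency>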
```
--------------------------------
### Initialize and Use AnswerRelevanceEvaluator
Source: https://explore-de.github.io/rage4j/docs/rage4j-core/metrics/answer_relevance
Instantiate the AnswerRelevanceEvaluator with a chat model and embedding model, then use it to evaluate a sample containing a question and answer. The result provides a relevance score.
```java
AnswerRelevanceEvaluator evaluator = new AnswerRelevanceEvaluator(chatModel, embeddingModel);
Evaluation result = evaluator.evaluate(sample);
double relevanceScore = result.getValue();
```
--------------------------------
### Evaluate Sample with BleuScoreEvaluator
Source: https://explore-de.github.io/rage4j/docs/rage4j-core/metrics/bleu_score
Instantiate the BleuScoreEvaluator and use it to evaluate a sample. The result contains the BLEU score value.
```java
BleuScoreEvaluator evaluator = new BleuScoreEvaluator();
Evaluation result = evaluator.evaluate(sample);
double bleuScore = result.getValue();
```
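BLEU is built on clipped (modified) n-gram precision: each candidate n-gram is credited only as often as it occurs in the reference. The sketch below computes that one building block in plain Java. It is not RAGE4J's `BleuScoreEvaluator`; the class name `NgramPrecisionSketch` and the whitespace tokenization are assumptions, and a full BLEU score also combines several n-gram orders and applies a brevity penalty.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: clipped n-gram precision, one ingredient of BLEU.
class NgramPrecisionSketch {

    // Split on whitespace and join every run of n consecutive tokens.
    static List<String> ngrams(String text, int n) {
        List<String> result = new ArrayList<>();
        if (text.trim().isEmpty()) {
            return result;
        }
        String[] tokens = text.toLowerCase().trim().split("\\s+");
        for (int i = 0; i + n <= tokens.length; i++) {
            result.add(String.join(" ", Arrays.copyOfRange(tokens, i, i + n)));
        }
        return result;
    }

    // Fraction of candidate n-grams matched in the reference, with clipping.
    static double modifiedPrecision(String candidate, String reference, int n) {
        Map<String, Integer> remaining = new HashMap<>();
        for (String gram : ngrams(reference, n)) {
            remaining.merge(gram, 1, Integer::sum);
        }
        List<String> candidateGrams = ngrams(candidate, n);
        if (candidateGrams.isEmpty()) {
            return 0.0;
        }
        int matched = 0;
        for (String gram : candidateGrams) {
            int left = remaining.getOrDefault(gram, 0);
            if (left > 0) {
                matched++;
                remaining.put(gram, left - 1); // use up one reference occurrence
            }
        }
        return (double) matched / candidateGrams.size();
    }

    public static void main(String[] args) {
        System.out.println(modifiedPrecision("the the the", "the cat", 1));
        System.out.println(modifiedPrecision("the cat sat", "the cat sat on the mat", 2));
    }
}
```

Clipping is what stops a degenerate answer that repeats one reference word from scoring perfectly.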
--------------------------------
### Add Rage4J-Persist Maven Dependency
Source: https://explore-de.github.io/rage4j/docs/rage4j-persist/persist-installation
Include this Maven dependency in your pom.xml file to add Rage4J-Persist to your project. Maven will automatically download and manage the library.
```xml
<dependency>
    <groupId>dev.rage4j</groupId>
    <artifactId>rage4j-persist</artifactId>
    <version>1.0.4</version>
</dependency>
```
--------------------------------
### Evaluate Faithfulness with FaithfulnessEvaluator
Source: https://explore-de.github.io/rage4j/docs/rage4j-core/metrics/faithfulness
Instantiate the FaithfulnessEvaluator and use it to evaluate a sample. The result contains the faithfulness score.
```java
FaithfulnessEvaluator evaluator = new FaithfulnessEvaluator(chatModel);
Evaluation result = evaluator.evaluate(sample);
double faithfulnessScore = result.getValue();
```
--------------------------------
### CompositeStore for Multiple Destinations
Source: https://explore-de.github.io/rage4j/docs/rage4j-persist/introduction
Writes evaluation results to multiple EvaluationStore implementations simultaneously. Useful for primary storage and backups.
```java
EvaluationStore composite = new CompositeStore(
    new JsonLinesStore(Path.of("results.jsonl")),
    new JsonLinesStore(Path.of("backup.jsonl"))
);
```
--------------------------------
### Enable Evaluation Mode in RAGE4J-Assert
Source: https://explore-de.github.io/rage4j/docs/rage4j-assert/introduction
Configures RAGE4J-Assert to use evaluation mode, which logs warnings on assertion failures instead of throwing exceptions. This is useful for data collection during complete test runs. Strict mode can be re-enabled using `withStrictMode()`.
```java
RageAssert rageAssert = new OpenAiLLMBuilder()
    .fromApiKey(key)
    .withEvaluationMode(); // Logs warnings instead of throwing

// Switch back to strict mode if needed
rageAssert.withStrictMode();
```
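The difference between the two modes can be sketched with a small stand-alone threshold check: in strict mode a score below the threshold throws, while in evaluation mode it is only recorded as a warning. Everything here is hypothetical, including the class `ThresholdCheckSketch`, the use of `AssertionError`, and the in-memory warning list; it mirrors the behavior described above, not RAGE4J's internals.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: strict mode throws, evaluation mode collects warnings.
class ThresholdCheckSketch {

    private boolean strict = true; // strict by default in this sketch
    private final List<String> warnings = new ArrayList<>();

    ThresholdCheckSketch withEvaluationMode() {
        strict = false;
        return this;
    }

    ThresholdCheckSketch withStrictMode() {
        strict = true;
        return this;
    }

    // Passes silently when score >= threshold; otherwise throws or warns.
    void assertAtLeast(String metric, double score, double threshold) {
        if (score >= threshold) {
            return;
        }
        String message = metric + " " + score + " is below threshold " + threshold;
        if (strict) {
            throw new AssertionError(message);
        }
        warnings.add(message);
    }

    List<String> warnings() {
        return warnings;
    }

    public static void main(String[] args) {
        ThresholdCheckSketch check = new ThresholdCheckSketch().withEvaluationMode();
        check.assertAtLeast("Answer correctness", 0.5, 0.7); // recorded, not thrown
        System.out.println(check.warnings());
    }
}
```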
--------------------------------
### Test Semantic Similarity with RageAssert
Source: https://explore-de.github.io/rage4j/docs/rage4j-assert/examples
Use assertSemanticSimilarity to verify that the semantic similarity between a model's answer and the ground truth meets a specified threshold.
```java
RageAssert rageAssert = new OpenAiLLMBuilder().fromApiKey(key);
rageAssert.given()
    .question(QUESTION)
    .groundTruth(GROUND_TRUTH)
    .when()
    .answer(model::generate)
    .then()
    .assertSemanticSimilarity(0.7);
```
--------------------------------
### EvaluationStore Interface
Source: https://explore-de.github.io/rage4j/docs/rage4j-persist/introduction
The EvaluationStore interface defines the contract for persisting evaluation results. It includes methods for storing, flushing, and closing the store.
```APIDOC
## EvaluationStore Interface
### Description
The `EvaluationStore` interface defines how evaluation results are persisted.
### Methods
- `store(EvaluationAggregation aggregation)`: Buffers data for later persistence.
- `flush()`: Writes buffered data to the storage.
- `storeFlush(EvaluationAggregation aggregation)`: Stores the aggregation and then flushes the buffer.
- `close()`: Releases any resources held by the store.
```
--------------------------------
### EvaluationStore Interface Definition
Source: https://explore-de.github.io/rage4j/docs/rage4j-persist/introduction
Defines the contract for persisting evaluation results, including methods for storing, flushing, and closing.
```java
public interface EvaluationStore extends Closeable {
    void store(EvaluationAggregation aggregation);
    void flush();
    void storeFlush(EvaluationAggregation aggregation);
    void close();
}
```
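The buffer-then-flush contract can be illustrated with an in-memory store. To keep the sketch self-contained it does not use RAGE4J types: `SketchStore` is a stand-in for the interface above, a `Map<String, Double>` of metric name to score replaces `EvaluationAggregation`, and `InMemoryStoreSketch` is a hypothetical implementation. A real custom store would implement `EvaluationStore` itself.

```java
import java.io.Closeable;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Stand-in for the EvaluationStore contract (assumption: simplified types).
interface SketchStore extends Closeable {
    void store(Map<String, Double> aggregation);
    void flush();
    void storeFlush(Map<String, Double> aggregation);
    @Override
    void close();
}

// Illustrative only: keeps "persisted" records in a list instead of a file.
class InMemoryStoreSketch implements SketchStore {

    private final List<Map<String, Double>> buffer = new ArrayList<>();
    private final List<Map<String, Double>> persisted = new ArrayList<>();

    @Override
    public void store(Map<String, Double> aggregation) {
        buffer.add(aggregation); // buffered only, not yet persisted
    }

    @Override
    public void flush() {
        persisted.addAll(buffer); // move buffered records to "storage"
        buffer.clear();
    }

    @Override
    public void storeFlush(Map<String, Double> aggregation) {
        store(aggregation);
        flush();
    }

    @Override
    public void close() {
        flush(); // this sketch flushes on close so nothing buffered is lost
    }

    int buffered() {
        return buffer.size();
    }

    List<Map<String, Double>> persisted() {
        return persisted;
    }

    public static void main(String[] args) {
        try (InMemoryStoreSketch store = new InMemoryStoreSketch()) {
            store.store(Map.of("Answer correctness", 0.85));
            System.out.println("buffered before close: " + store.buffered());
        }
    }
}
```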
--------------------------------
### Test BLEU Score with RageAssert
Source: https://explore-de.github.io/rage4j/docs/rage4j-assert/examples
Use assertBleuScore to test the n-gram overlap precision of a model's answer against a ground truth.
```java
RageAssert rageAssert = new OpenAiLLMBuilder().fromApiKey(key);
rageAssert.given()
    .question(QUESTION)
    .groundTruth(GROUND_TRUTH)
    .when()
    .answer(model::generate)
    .then()
    .assertBleuScore(0.7);
```
--------------------------------
### Test Answer Relevance with RageAssert
Source: https://explore-de.github.io/rage4j/docs/rage4j-assert/examples
Use assertAnswerRelevance to check that a model's answer is relevant to the question, meeting a minimum relevance score.
```java
RageAssert rageAssert = new OpenAiLLMBuilder().fromApiKey(key);
rageAssert.given()
    .question(QUESTION)
    .groundTruth(GROUND_TRUTH)
    .when()
    .answer(model::generate)
    .then()
    .assertAnswerRelevance(0.7);
```
--------------------------------
### Test Faithfulness with RageAssert
Source: https://explore-de.github.io/rage4j/docs/rage4j-assert/examples
Use assertFaithfulness to ensure a generated answer adheres to the provided context with a minimum faithfulness score.
```java
RageAssert rageAssert = new OpenAiLLMBuilder().fromApiKey(key);
rageAssert.given()
    .question(QUESTION)
    .groundTruth(GROUND_TRUTH)
    .context(CONTEXT)
    .when()
    .answer(model::generate)
    .then()
    .assertFaithfulness(0.7);
```
--------------------------------
### JsonLinesStore Usage
Source: https://explore-de.github.io/rage4j/docs/rage4j-persist/introduction
Persists evaluation results in JSON Lines format. Ensures data is flushed automatically when used with try-with-resources.
```java
try (EvaluationStore store = new JsonLinesStore(Path.of("target/evaluations.jsonl"))) {
    EvaluationAggregation result = EvaluationAggregator.evaluateAll(sample, evaluators);
    store.store(result);
} // flush() called automatically on close()
```
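Because a JSON Lines file holds one JSON object per line, the stored results can be read back with plain `java.nio` calls. The sketch below only splits the file into records; the class name `JsonLinesReadSketch` is hypothetical, and parsing each line into fields would need a JSON library of your choice.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;

// Illustrative only: reads a .jsonl file as a list of raw JSON strings.
class JsonLinesReadSketch {

    // Returns the non-blank lines; each is expected to be one JSON object.
    static List<String> readRecords(Path file) throws IOException {
        return Files.readAllLines(file, StandardCharsets.UTF_8).stream()
            .filter(line -> !line.trim().isEmpty())
            .collect(Collectors.toList());
    }

    public static void main(String[] args) throws IOException {
        // Path taken from the example above; adjust to your own output file.
        Path file = Path.of("target/evaluations.jsonl");
        if (Files.exists(file)) {
            System.out.println(readRecords(file).size() + " evaluation record(s)");
        } else {
            System.out.println("No evaluation file found at " + file);
        }
    }
}
```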
--------------------------------
### Evaluate Answer Semantic Similarity
Source: https://explore-de.github.io/rage4j/docs/rage4j-core/metrics/answer_semantic_similarity
Instantiate the AnswerSemanticSimilarityEvaluator with an embedding model and use it to evaluate a sample. The similarity score is extracted from the evaluation result.
```java
AnswerSemanticSimilarityEvaluator evaluator = new AnswerSemanticSimilarityEvaluator(embeddingModel);
Evaluation result = evaluator.evaluate(sample);
double similarityScore = result.getValue();
```
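Embedding-based similarity is commonly computed as the cosine of the angle between two embedding vectors. The sketch below shows that calculation on raw `double[]` vectors. Whether RAGE4J uses exactly this formula is an assumption, and the class name `CosineSimilaritySketch` is hypothetical.

```java
// Illustrative only: cosine similarity between two equal-length vectors.
class CosineSimilaritySketch {

    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same length");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0; // a zero vector has no direction; treat as no similarity
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        double[] answer = {0.2, 0.7, 0.1};      // made-up embedding values
        double[] groundTruth = {0.25, 0.6, 0.15};
        System.out.println(cosine(answer, groundTruth));
    }
}
```

Vectors pointing in the same direction score 1.0 regardless of their length, which is why the measure suits embeddings of texts with different word counts.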
--------------------------------
### Test Answer Correctness with RageAssert
Source: https://explore-de.github.io/rage4j/docs/rage4j-assert/examples
Use assertAnswerCorrectness to check if a model's generated answer meets a specified correctness threshold against a ground truth.
```java
RageAssert rageAssert = new OpenAiLLMBuilder().fromApiKey(key);
rageAssert.given()
    .question(QUESTION)
    .groundTruth(GROUND_TRUTH)
    .when()
    .answer(model::generate)
    .then()
    .assertAnswerCorrectness(0.7);
```
--------------------------------
### Evaluate Answer Correctness
Source: https://explore-de.github.io/rage4j/docs/rage4j-core/metrics/answer_correctness
Instantiate the AnswerCorrectnessEvaluator and use it to evaluate a sample. The resulting score indicates the alignment of the answer with the ground truth.
```java
AnswerCorrectnessEvaluator evaluator = new AnswerCorrectnessEvaluator(chatModel);
Evaluation result = evaluator.evaluate(sample);
double correctnessScore = result.getValue();
```
--------------------------------
### Evaluation Result in Java
Source: https://explore-de.github.io/rage4j/docs/rage4j-core/core_concepts
The Evaluation class holds the outcome of a single metric assessment. It contains the metric's name and its score, typically between 0 and 1.
```java
Evaluation result = evaluator.evaluate(sample);
String metricName = result.getName(); // e.g., "Answer correctness"
double score = result.getValue(); // Score between 0 and 1
```
--------------------------------
### Evaluation Aggregator Class Definition in Java
Source: https://explore-de.github.io/rage4j/docs/rage4j-core/core_concepts
The EvaluationAggregator class provides a utility method to evaluate a sample against multiple evaluators simultaneously.
```java
public class EvaluationAggregator {
    public static EvaluationAggregation evaluateAll(Sample sample, Evaluator... evaluators);
}
```
--------------------------------
### Evaluator Interface Definition in Java
Source: https://explore-de.github.io/rage4j/docs/rage4j-core/core_concepts
The Evaluator interface defines the contract for all evaluators. Implement this interface to create custom evaluation metrics.
```java
public interface Evaluator {
    Evaluation evaluate(Sample sample);
}
```
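A custom metric only has to map a sample to a named score. The sketch below shows the shape of such an implementation using stand-in types so that it compiles on its own: `SketchSample`, `SketchEvaluation`, and `SketchEvaluator` are simplified assumptions that mirror `Sample`, `Evaluation`, and `Evaluator`, and the "Contains ground truth" metric is invented for illustration. A real custom metric would implement RAGE4J's `Evaluator` directly.

```java
// Stand-in for Sample (assumption: only the two fields this metric needs).
final class SketchSample {
    final String answer;
    final String groundTruth;

    SketchSample(String answer, String groundTruth) {
        this.answer = answer;
        this.groundTruth = groundTruth;
    }
}

// Stand-in for Evaluation: a metric name plus its score.
final class SketchEvaluation {
    final String name;
    final double value;

    SketchEvaluation(String name, double value) {
        this.name = name;
        this.value = value;
    }
}

// Stand-in for the Evaluator interface shown above.
interface SketchEvaluator {
    SketchEvaluation evaluate(SketchSample sample);
}

// Hypothetical metric: 1.0 if the answer contains the ground truth, else 0.0.
class ContainsGroundTruthEvaluator implements SketchEvaluator {

    @Override
    public SketchEvaluation evaluate(SketchSample sample) {
        boolean hit = sample.answer.toLowerCase()
            .contains(sample.groundTruth.toLowerCase());
        return new SketchEvaluation("Contains ground truth", hit ? 1.0 : 0.0);
    }

    public static void main(String[] args) {
        SketchSample sample = new SketchSample("Paris is the capital of France.", "Paris");
        SketchEvaluation result = new ContainsGroundTruthEvaluator().evaluate(sample);
        System.out.println(result.name + ": " + result.value);
    }
}
```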