Metrics for evaluating the Edge RAG Preview system

This article lists the metrics used to evaluate the Edge RAG Preview system, enabled by Azure Arc. For more information, see Evaluate the Edge RAG system.

Important

Edge RAG Preview, enabled by Azure Arc, is currently in PREVIEW. See the Supplemental Terms of Use for Microsoft Azure Previews for legal terms that apply to Azure features that are in beta, preview, or otherwise not yet released into general availability.

Generation metrics

The following metrics evaluate the quality of generated responses.

Metric	Description
Correctness	Evaluates the accuracy and factual validity of generated responses against the expected responses (ground truth). Score range: 1-5
Groundedness	Evaluates how closely the generated responses correspond to the information in the retrieved documents. Score range: 1-5
Relevancy	Evaluates how appropriate the generated responses are and how directly they address the provided input. Score range: 1-5
Rouge L	Measures the overlap between the generated text and the reference text, based on their longest common subsequence. Score range: 0-1
Bleu	Evaluates the quality of generated text by comparing it to expected responses (ground truth) while penalizing overly short responses. Score range: 0-1
Meteor	METEOR (Metric for Evaluation of Translation with Explicit Ordering) evaluates the quality of generated text by comparing it to expected responses (ground truth) while penalizing fragmented alignment between the actual and expected sentences. Score range: 0-1
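
To make the overlap-based metrics more concrete, the following is a minimal Python sketch of two ideas from the table: a longest-common-subsequence score in the style of Rouge L, and the brevity penalty that Bleu applies to short responses. It's an illustration only, not the Edge RAG implementation. The function names, the whitespace tokenization, and the choice of a balanced F1 score are all assumptions made for this example.

```python
import math


def lcs_length(a, b):
    """Return the length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(generated, reference):
    """LCS-based F1 score in [0, 1]. Assumes lowercase whitespace tokenization."""
    gen = generated.lower().split()
    ref = reference.lower().split()
    if not gen or not ref:
        return 0.0
    lcs = lcs_length(gen, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(gen)  # share of generated tokens that are in the LCS
    recall = lcs / len(ref)     # share of reference tokens that are in the LCS
    return 2 * precision * recall / (precision + recall)


def brevity_penalty(generated_len, reference_len):
    """Standard BLEU brevity penalty: 1.0 unless the generated text is shorter."""
    if generated_len == 0:
        return 0.0
    if generated_len > reference_len:
        return 1.0
    return math.exp(1 - reference_len / generated_len)


if __name__ == "__main__":
    print(rouge_l_f1("the cat sat on the mat", "the cat is on the mat"))
    print(brevity_penalty(3, 6))
```

In this sketch, the two example sentences share a five-token common subsequence out of six tokens each, so the Rouge L style score is about 0.83. A three-token response compared with a six-token reference gets a brevity penalty of about 0.37, which Bleu multiplies into its n-gram precision score.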

Information retrieval metrics

The following metrics evaluate retrieval performance.

Metric	Description
Precision	Measures the proportion of relevant documents among all retrieved documents. Score range: 0-1
Recall	Measures the proportion of all relevant documents that were retrieved. Score range: 0-1
MRR	Mean reciprocal rank (MRR) measures the quality of document ranking based on the position of the first relevant document. Score range: 0-1
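
The retrieval metrics follow their standard definitions, which can be sketched in a few lines of Python. The following example is illustrative only and isn't taken from Edge RAG. The function names and the document IDs are hypothetical, and relevance is assumed to be a known set of document IDs per query.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query, given document ID collections."""
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = len(retrieved_set & relevant_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


def mean_reciprocal_rank(ranked_lists, relevant_sets):
    """Average of 1/rank of the first relevant document over all queries.

    A query with no relevant document in its ranked list contributes 0.
    """
    if not ranked_lists:
        return 0.0
    total = 0.0
    for ranked, relevant in zip(ranked_lists, relevant_sets):
        for rank, doc_id in enumerate(ranked, start=1):
            if doc_id in relevant:
                total += 1.0 / rank
                break
    return total / len(ranked_lists)


if __name__ == "__main__":
    # Hypothetical document IDs for illustration.
    print(precision_recall(["d1", "d2", "d3", "d4"], {"d2", "d4", "d7"}))
    print(mean_reciprocal_rank(
        [["d1", "d2"], ["d3", "d9"], ["d5"]],
        [{"d2"}, {"d3"}, {"d6"}],
    ))
```

In the first call, two of the four retrieved documents are relevant (precision 0.5), and two of the three relevant documents were retrieved (recall of about 0.67). In the second call, the first relevant document appears at rank 2, rank 1, and not at all, so the MRR is (0.5 + 1 + 0) / 3 = 0.5.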

Related content

Evaluate the Edge RAG system

Last updated on 2025-05-19
