Evaluation Metrics

Available API-level Evaluation Metrics

| Metric Name | Sub-metrics        | Description |
|-------------|--------------------|-------------|
| ndcg        | NDCG@10, NDCG@100  | Normalized Discounted Cumulative Gain at different cut-off points, measuring the quality of the ranking. |
| mrr         | MRR@1000           | Mean Reciprocal Rank at a specified cut-off, indicating the average rank of the first relevant result. |
| mAP         | MAP@1000           | Mean Average Precision at a specified cut-off, evaluating precision across all relevant items. |
| precision   | P@10               | Precision at a specified cut-off, measuring the proportion of relevant items among the top retrieved results. |
| recall      | Recall@10, Recall@50 | Recall at different cut-off points, measuring the proportion of all relevant items that are retrieved. |

More evaluation metrics can be accessed by downloading the evaluation results with the download API.

from marqtune.client import Client

# Initialise the Marqtune client with your endpoint and API key.
url = "https://marqtune.marqo.ai"
api_key = "{api_key}"
marqtune_client = Client(url=url, api_key=api_key)

# Download the full results for a given evaluation ID.
marqtune_client.evaluation("evaluation_id").download()

curl --location 'https://marqtune.marqo.ai/evaluations/{evaluation_id}/download/url' \
     --header 'x-api-key: {api_key}'
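
The exact structure of the downloaded results is not shown here. As an illustration only, the sketch below assumes the download yields a local JSON file of metrics; the file name, structure, and keys are assumptions, not part of the documented Marqtune API:

import json

# Assumption: the evaluation download produces a JSON file on disk; the path
# and schema below are illustrative only.
with open("evaluation_results.json") as f:
    results = json.load(f)

# Print any of the aggregate metrics listed in the table above, if present.
for name in ("NDCG@10", "NDCG@100", "MRR@1000", "MAP@1000", "P@10", "Recall@10", "Recall@50"):
    if name in results:
        print(f"{name}: {results[name]}")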

Detailed Description of Metrics

  • NDCG (Normalized Discounted Cumulative Gain): This metric evaluates ranking quality by comparing the relevance of documents in the predicted order against an ideal ordering. Higher values indicate better ranking quality.
  • MRR (Mean Reciprocal Rank): MRR is the average, over all queries, of the reciprocal rank of the first relevant result. The closer to 1, the nearer relevant results appear to the top.
  • MAP (Mean Average Precision): MAP is the mean, over all queries, of average precision, where precision is evaluated at the rank of each relevant result. It summarizes precision across the whole ranking.
  • Precision: Indicates the fraction of relevant results among the retrieved items at a specific rank threshold.
  • Recall: Indicates the fraction of all relevant items that are successfully retrieved at a specific rank threshold. A small worked example of all five metrics follows this list.
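
To make the definitions concrete, here is a minimal, self-contained sketch that computes these metrics for a single toy query with binary relevance judgments. The ranked list, the total number of relevant documents, and the simplification of computing the ideal DCG from the retrieved list alone are illustrative assumptions, not Marqtune behaviour:

import math

def dcg(relevances):
    # Discounted cumulative gain: relevance discounted by log2 of rank position.
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))

def ndcg_at_k(relevances, k):
    # Simplification: the ideal ordering is derived from the retrieved list only.
    ideal_dcg = dcg(sorted(relevances, reverse=True)[:k])
    return dcg(relevances[:k]) / ideal_dcg if ideal_dcg > 0 else 0.0

def mrr(relevances):
    # Reciprocal rank of the first relevant result (0 if none is relevant).
    for rank, rel in enumerate(relevances, start=1):
        if rel > 0:
            return 1.0 / rank
    return 0.0

def average_precision(relevances, total_relevant):
    # Average of precision values taken at the rank of each relevant result.
    hits, score = 0, 0.0
    for rank, rel in enumerate(relevances, start=1):
        if rel > 0:
            hits += 1
            score += hits / rank
    return score / total_relevant if total_relevant else 0.0

def precision_at_k(relevances, k):
    return sum(1 for rel in relevances[:k] if rel > 0) / k

def recall_at_k(relevances, k, total_relevant):
    return sum(1 for rel in relevances[:k] if rel > 0) / total_relevant if total_relevant else 0.0

# Binary relevance of the top 10 results for one toy query; assume the
# collection contains 5 relevant documents in total.
ranked = [1, 0, 1, 1, 0, 0, 1, 0, 0, 0]
total_relevant = 5

print("NDCG@10  ", round(ndcg_at_k(ranked, 10), 4))
print("MRR      ", round(mrr(ranked), 4))
print("MAP      ", round(average_precision(ranked, total_relevant), 4))
print("P@10     ", round(precision_at_k(ranked, 10), 4))
print("Recall@10", round(recall_at_k(ranked, 10, total_relevant), 4))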

Example: Evaluation Metrics Configuration

# Metrics to request for an evaluation; the keys are the metric names listed in the table above.
evaluation_metrics = {
    "NDCG@10": "",
    "NDCG@100": "",
    "MRR@1000": "",
    "MAP@1000": "",
    "P@10": "",
    "Recall@10": "",
    "Recall@50": "",
}