
Create Evaluation

Creates an evaluation task that measures the ranking performance of a Marqtuned or base model. The request requires the ID of an evaluation dataset and a pretrained model, which can be either an open_clip model or a Marqtuned model.


POST /evaluation

Body Parameters

| Name | Type | Default value | Description |
|---|---|---|---|
| datasetId | UUID | "" | Required - ID of the evaluation dataset already created. |
| model | String | "" | Required - Name of the model or model ID to evaluate. Model name must be from the open_clip library. ID can be any Marqtuned model in your account. |
| checkpoint | String | "" | Required - Checkpoint of the model to evaluate. Checkpoint must be from open_clip, or the epoch from a Marqtuned model. |
| modelType | String | "" | Required - Type of model being evaluated: open_clip or marqtuned. |
| hyperparameters | Dictionary | "" | Required - Evaluation task parameters. See the Evaluation parameters guide for details. |
| waitForCompletion | Boolean | True | Optional [py-marqtune client only] - Instructs the client to continuously wait and poll until the operation is completed. |
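
As a rough illustration of the body parameters above, the following sketch assembles a request body and checks that the required fields are present before anything is sent. The helper name `build_evaluation_payload` and its validation rules are assumptions made for this example and are not part of the Marqtune API. `waitForCompletion` is left out because it applies only to the py-marqtune client.

```python
import json

# Field names and allowed model types are taken from the parameter table.
REQUIRED_FIELDS = ("datasetId", "model", "checkpoint", "modelType", "hyperparameters")
MODEL_TYPES = ("open_clip", "marqtuned")


def build_evaluation_payload(dataset_id, model, checkpoint, model_type, hyperparameters):
    """Hypothetical helper: assemble the POST /evaluation body and sanity-check it."""
    payload = {
        "datasetId": dataset_id,
        "model": model,
        "checkpoint": checkpoint,
        "modelType": model_type,
        "hyperparameters": hyperparameters,
    }
    # Reject fields that were left empty (assumed client-side check only).
    missing = [name for name in REQUIRED_FIELDS if payload[name] in ("", None)]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    if model_type not in MODEL_TYPES:
        raise ValueError(f"modelType must be one of {MODEL_TYPES}")
    return payload


example = build_evaluation_payload(
    dataset_id="dataset_id",
    model="model_id",
    checkpoint="epoch_4",
    model_type="marqtuned",
    hyperparameters={"leftKeys": ["query"], "rightKeys": ["my_image", "my_text"]},
)
print(json.dumps(example, indent=4))
```

Checking the body locally only catches empty fields and an unknown `modelType`; the service remains the authority on whether the dataset, model, and checkpoint are valid.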

Example

from marqtune.client import Client
from marqtune.enums import ModelType

url = "https://marqtune.marqo.ai"
api_key = "{api_key}"
marqtune_client = Client(url=url, api_key=api_key)
marqtune_client.evaluate(
    model="model_id",
    dataset_id="dataset_id",
    checkpoint="epoch_4",
    model_type=ModelType.MARQTUNED,
    hyperparameters={"leftKeys": ["query"], "rightKeys": ["my_image", "my_text"], "leftWeights": [1], "rightWeights": [0.9, 0.1] },
    wait_for_completion=True
)
# Evaluate a model.
curl -X POST 'https://marqtune.marqo.ai/evaluation' \
-H 'Content-Type: application/json' \
-H 'x-api-key: {api_key}' \
-d '{
    "datasetId": "dataset_id",
    "model": "model_id",
    "checkpoint": "epoch_4",
    "modelType": "marqtuned",
    "hyperparameters": {"leftKeys": ["query"], "rightKeys": ["my_image", "my_text"], "leftWeights": [1], "rightWeights": [0.9, 0.1]}
}'
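
For callers who use neither py-marqtune nor cURL, the same request can be sketched with Python's standard library. This is a minimal sketch that only builds the request object; the function name `build_evaluation_request` is an assumption for illustration, and the URL, headers, and body fields are copied from the cURL example above.

```python
import json
import urllib.request

EVALUATION_URL = "https://marqtune.marqo.ai/evaluation"


def build_evaluation_request(api_key, payload):
    """Hypothetical helper: build (but do not send) the POST /evaluation request."""
    return urllib.request.Request(
        EVALUATION_URL,
        data=json.dumps(payload).encode("utf-8"),  # JSON body as bytes
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )


payload = {
    "datasetId": "dataset_id",
    "model": "model_id",
    "checkpoint": "epoch_4",
    "modelType": "marqtuned",
    "hyperparameters": {
        "leftKeys": ["query"],
        "rightKeys": ["my_image", "my_text"],
        "leftWeights": [1],
        "rightWeights": [0.9, 0.1],
    },
}
request = build_evaluation_request("{api_key}", payload)
# Sending is left to the caller, for example: urllib.request.urlopen(request)
```

Note that `urllib.request.urlopen` raises `urllib.error.HTTPError` for 4xx and 5xx status codes, so the error responses documented on this page would surface as exceptions rather than return values.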

Response: 202 Accepted

Evaluation task has been initialised and will now be executed.

{
    "statusCode": 202,
    "body": {
        "evaluationId": "evaluation_id"
    }
}
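
The returned `evaluationId` is what identifies the task afterwards, so it is worth extracting as soon as the request is accepted. The sketch below assumes the JSON envelope shown above is the response body as received; `extract_evaluation_id` is a hypothetical helper name.

```python
import json


def extract_evaluation_id(response_text):
    """Hypothetical helper: return the evaluationId from a 202 Accepted response."""
    data = json.loads(response_text)
    if data.get("statusCode") != 202:
        raise ValueError(f"unexpected status code: {data.get('statusCode')}")
    return data["body"]["evaluationId"]


# Sample taken from the 202 response documented above.
sample = '{"statusCode": 202, "body": {"evaluationId": "evaluation_id"}}'
evaluation_id = extract_evaluation_id(sample)
print(evaluation_id)
```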

Response: 400 (Invalid dataset)

Invalid dataset

{
    "statusCode": 400,
    "body": {
      "message": "Dataset must be of type 'evaluation'"
    }
}

Response: 400 (Invalid base model)

Invalid model

{
    "statusCode": 400,
    "body": {
      "message": "Model with id {model_id} not found"
    }
}

Response: 400 (Dataset not created)

Dataset not created yet

{
    "statusCode": 400,
    "body": {
      "message": "Dataset is not created yet, wait until status of dataset is ready"
    }
}

Response: 400 (Failed dataset)

Failed dataset

{
    "statusCode": 400,
    "body": {
      "message": "Job can not be started with failed dataset"
    }
}

Response: 400 (Invalid checkpoint)

Invalid checkpoint

{
    "statusCode": 400,
    "body": {
      "message": "Invalid checkpoint. Available checkpoints: {checkpoints}"
    }
}

Response: 400 (Invalid Request)

Request path or method is invalid.

{
    "statusCode": 400,
    "body": {
      "message": "Invalid request method"
    }
}

Response: 401 (Unauthorised)

Unauthorised. Check your API key and try again.

{
  "message": "Unauthorized."
}

Response: 500 (Internal server error)

Internal server error. Try the request again.

{
  "message": "Internal server error."
}
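
The error responses above come in two shapes: the 400 responses nest the message under `body`, while the 401 and 500 responses carry it at the top level. A minimal sketch of handling both is shown below. The helper names are hypothetical, and treating only the "dataset not created yet" case as retryable, detected by matching on the message text, is an assumption of this example rather than documented behaviour.

```python
def error_message(payload):
    """Return the message from either error shape shown in this section."""
    body = payload.get("body")
    if isinstance(body, dict) and "message" in body:
        return body["message"]  # 400-style: {"statusCode": ..., "body": {"message": ...}}
    return payload.get("message", "")  # 401/500-style: {"message": ...}


def should_retry_later(status_code, payload):
    """Assumption: only a dataset that is still being created is worth retrying."""
    return status_code == 400 and "not created yet" in error_message(payload)


pending = {
    "statusCode": 400,
    "body": {"message": "Dataset is not created yet, wait until status of dataset is ready"},
}
failed = {"statusCode": 400, "body": {"message": "Job can not be started with failed dataset"}}
unauthorized = {"message": "Unauthorized."}

for status, payload in ((400, pending), (400, failed), (401, unauthorized)):
    print(status, error_message(payload), should_retry_later(status, payload))
```

Matching on message text is brittle if the wording changes, so a production client would more likely check the dataset status directly before creating the evaluation.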