Create Evaluation
Create an evaluation task. This evaluates the ranking performance of a Marqtuned or base model. It requires an evaluation dataset ID and a pretrained model. The pretrained model can be an open_clip model or a Marqtuned model.
POST /evaluation
Body Parameters
Name | Type | Default value | Description |
---|---|---|---|
datasetId | UUID | "" | Required - ID of the evaluation dataset already created. |
model | String | "" | Required - Name of the model or model ID to evaluate. Model name must be from open_clip library. ID can be any Marqtuned model in your account. |
checkpoint | String | "" | Required - Checkpoint of the model to evaluate. Checkpoint must be from open_clip, or the epoch from a Marqtuned model. |
modelType | String | "" | Required - Type of model being evaluated: open_clip or marqtuned. |
hyperparameters | Dictionary | "" | Required - Evaluation task parameters - see the Evaluation parameters guide for details. |
waitForCompletion | Boolean | True | Optional [py-marqtune client only] - Instructs the client to continuously wait and poll until the operation is completed. |
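The required fields in the table can be assembled and sanity-checked before anything is sent. The sketch below is illustrative only: `build_evaluation_body`, `REQUIRED_FIELDS` and `MODEL_TYPES` are hypothetical names, not part of py-marqtune, and the checks cover only what the table states (five required fields, two allowed model types). `waitForCompletion` is left out because the table marks it as client-only.

```python
from typing import Any, Dict

# Field names and allowed model types are taken from the parameter table.
REQUIRED_FIELDS = ("datasetId", "model", "checkpoint", "modelType", "hyperparameters")
MODEL_TYPES = ("open_clip", "marqtuned")


def build_evaluation_body(dataset_id: str, model: str, checkpoint: str,
                          model_type: str, hyperparameters: Dict[str, Any]) -> Dict[str, Any]:
    """Assemble a POST /evaluation body and reject obviously incomplete input.

    Hypothetical helper: the server performs its own validation.
    """
    body = {
        "datasetId": dataset_id,
        "model": model,
        "checkpoint": checkpoint,
        "modelType": model_type,
        "hyperparameters": hyperparameters,
    }
    # Treat empty strings and empty dictionaries as missing values.
    missing = [name for name in REQUIRED_FIELDS if not body[name]]
    if missing:
        raise ValueError(f"Missing required fields: {', '.join(missing)}")
    if model_type not in MODEL_TYPES:
        raise ValueError(f"modelType must be one of {MODEL_TYPES}, got {model_type!r}")
    return body
```

Catching an empty field or a misspelled model type locally saves a round trip that would otherwise end in a 400 response.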
Example
from marqtune.client import Client
from marqtune.enums import ModelType
url = "https://marqtune.marqo.ai"
api_key = "{api_key}"
marqtune_client = Client(url=url, api_key=api_key)
marqtune_client.evaluate(
    model="model_id",
    dataset_id="dataset_id",
    checkpoint="epoch_4",
    model_type=ModelType.MARQTUNED,
    hyperparameters={
        "leftKeys": ["query"],
        "rightKeys": ["my_image", "my_text"],
        "leftWeights": [1],
        "rightWeights": [0.9, 0.1],
    },
    wait_for_completion=True,
)
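In the example, each list of keys is paired with a list of weights of the same length (`leftKeys` with `leftWeights`, `rightKeys` with `rightWeights`). Assuming that pairing is intended, a small local check might look like the sketch below. `check_key_weight_lengths` is a hypothetical helper, and the equal-length rule is inferred from the example rather than confirmed here; the Evaluation parameters guide is the authority on what the service accepts.

```python
from typing import Any, Dict, List


def check_key_weight_lengths(hyperparameters: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means keys and weights line up.

    Assumption: every key has exactly one weight, as in the example call.
    """
    problems = []
    for side in ("left", "right"):
        keys = hyperparameters.get(f"{side}Keys", [])
        weights = hyperparameters.get(f"{side}Weights", [])
        if len(keys) != len(weights):
            problems.append(
                f"{side}Keys has {len(keys)} entries but {side}Weights has {len(weights)}"
            )
    return problems


example = {
    "leftKeys": ["query"],
    "rightKeys": ["my_image", "my_text"],
    "leftWeights": [1],
    "rightWeights": [0.9, 0.1],
}
problems = check_key_weight_lengths(example)
```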
# Evaluate a model.
curl -X POST 'https://marqtune.marqo.ai/evaluation' \
  -H "Content-Type: application/json" \
  -H 'x-api-key: {api_key}' \
  -d '{
    "datasetId": "dataset_id",
    "model": "model_id",
    "checkpoint": "epoch_4",
    "modelType": "marqtuned",
    "hyperparameters": {"leftKeys": ["query"], "rightKeys": ["my_image", "my_text"], "leftWeights": [1], "rightWeights": [0.9, 0.1]}
  }'
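The same request can be prepared from Python without the py-marqtune client. This is a minimal sketch using only the standard library; `make_evaluation_request` is a hypothetical helper, and the URL, headers and body mirror the cURL call. The request is built but not sent, so the snippet runs without network access or a real API key.

```python
import json
import urllib.request


def make_evaluation_request(api_key: str, body: dict,
                            base_url: str = "https://marqtune.marqo.ai") -> urllib.request.Request:
    """Build (but do not send) a POST /evaluation request."""
    return urllib.request.Request(
        url=f"{base_url}/evaluation",
        # json.dumps always emits valid JSON, which avoids hand-editing slips.
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )


request = make_evaluation_request(
    api_key="{api_key}",  # placeholder, as in the cURL example
    body={
        "datasetId": "dataset_id",
        "model": "model_id",
        "checkpoint": "epoch_4",
        "modelType": "marqtuned",
        "hyperparameters": {
            "leftKeys": ["query"],
            "rightKeys": ["my_image", "my_text"],
            "leftWeights": [1],
            "rightWeights": [0.9, 0.1],
        },
    },
)
# To actually submit the task: urllib.request.urlopen(request)
```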
Response: 202 Accepted
Evaluation task has been initialised and will now be executed.
{
"statusCode": 202,
"body": {
"evaluationId": "evaluation_id"
}
}
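A caller will usually want to keep the returned `evaluationId` for later use. The sketch below reads it from a response shaped like the one shown; `extract_evaluation_id` is a hypothetical helper, and it assumes the response text is exactly the JSON envelope shown, with `statusCode` and `body` at the top level.

```python
import json


def extract_evaluation_id(response_text: str) -> str:
    """Return the evaluationId from a 202 Accepted response envelope."""
    payload = json.loads(response_text)
    if payload.get("statusCode") != 202:
        raise RuntimeError(f"Evaluation was not accepted: {payload}")
    return payload["body"]["evaluationId"]


sample = '{"statusCode": 202, "body": {"evaluationId": "evaluation_id"}}'
evaluation_id = extract_evaluation_id(sample)
```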
Response: 400 (Invalid dataset)
Invalid dataset
{
"statusCode": 400,
"body": {
"message": "Dataset must be of type 'evaluation'"
}
}
Response: 400 (Invalid base model)
Invalid model
{
"statusCode": 400,
"body": {
"message": "Model with id {model_id} not found"
}
}
Response: 400 (Dataset not created)
Dataset not created yet
{
"statusCode": 400,
"body": {
"message": "Dataset is not created yet, wait until status of dataset is ready"
}
}
Response: 400 (Failed dataset)
Failed dataset
{
"statusCode": 400,
"body": {
"message": "Job can not be started with failed dataset"
}
}
Response: 400 (Invalid checkpoint)
Invalid checkpoint
{
"statusCode": 400,
"body": {
"message": "Invalid checkpoint. Available checkpoints: {checkpoints}"
}
}
Response: 400 (Invalid Request)
Request path or method is invalid.
{
"statusCode": 400,
"body": {
"message": "Invalid request method"
}
}
Response: 401 (Unauthorised)
Unauthorised. Check your API key and try again.
{
"message": "Unauthorized."
}
Response: 500 (Internal server error)
Internal server error. Check your API key and try again.
{
"message": "Internal server error."
}
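The error responses above come in two shapes: the 400 responses wrap the message in a `statusCode`/`body` envelope, while the 401 and 500 responses carry a top-level `message`. A client that wants a single code path could normalise them as in this sketch. `parse_error` is a hypothetical helper that assumes the payloads look exactly like the examples above.

```python
import json
from typing import Tuple


def parse_error(http_status: int, response_text: str) -> Tuple[int, str]:
    """Return (status code, message) for either error shape shown above."""
    payload = json.loads(response_text)
    body = payload.get("body")
    if isinstance(body, dict) and "message" in body:
        # Envelope form, used by the 400 examples.
        return payload.get("statusCode", http_status), body["message"]
    # Flat form, used by the 401 and 500 examples.
    return http_status, payload.get("message", "")


invalid_dataset = parse_error(
    400, '{"statusCode": 400, "body": {"message": "Dataset must be of type \'evaluation\'"}}'
)
unauthorised = parse_error(401, '{"message": "Unauthorized."}')
```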