Models

The /models endpoint reports which models are loaded on each of your devices (cpu or cuda). Use it to list the currently loaded models and to eject a loaded model to free memory.

Get loaded models

GET /models
Returns information about every model currently loaded on the "cuda" and "cpu" devices.

Example

curl -X GET http://localhost:8882/models
mq.get_loaded_models()

Response: 200 OK

{"models": [
    {"model_name": "hf/all_datasets_v4_MiniLM-L6", "model_device": "cpu"},
    {"model_name": "hf/all_datasets_v4_MiniLM-L6", "model_device": "cuda"},
    {"model_name": "ViT-L/14", "model_device": "cpu"},
    {"model_name": "ViT-L/14", "model_device": "cuda"},
    {"model_name": "ViT-B/16", "model_device": "cpu"}]}
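
To see at a glance what each device is holding, the response can be regrouped by device. The following is a minimal sketch that operates on the response body shown above; `models_by_device` is a hypothetical helper written for illustration, not part of the client library.

```python
from collections import defaultdict


def models_by_device(payload):
    """Group model names by device from a GET /models response body."""
    grouped = defaultdict(list)
    for entry in payload.get("models", []):
        grouped[entry["model_device"]].append(entry["model_name"])
    return dict(grouped)


# Sample payload copied from the example response above.
loaded = {"models": [
    {"model_name": "hf/all_datasets_v4_MiniLM-L6", "model_device": "cpu"},
    {"model_name": "hf/all_datasets_v4_MiniLM-L6", "model_device": "cuda"},
    {"model_name": "ViT-L/14", "model_device": "cpu"},
    {"model_name": "ViT-L/14", "model_device": "cuda"},
    {"model_name": "ViT-B/16", "model_device": "cpu"},
]}

# Print one line per device with the models it currently holds.
for device, names in models_by_device(loaded).items():
    print(device, names)
```

In real use, `loaded` would be the value returned by mq.get_loaded_models() rather than a hard-coded dictionary.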

Eject a loaded model

DELETE /models?model_name={model_name}&model_device={model_device}

Eject a model from a specific device.

Query parameters

Name          Type                      Description
model_name    String                    Name of the model
model_device  String ("cuda" or "cpu")  Device the model is loaded on
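
Model names such as ViT-L/14 contain a slash. The curl example in this section sends the name as-is; percent-encoding the query string is a conservative alternative when building the URL in code, assuming the server decodes query parameters in the usual way. The sketch below uses only the Python standard library. `eject_url` is a hypothetical helper, and the default base URL is the localhost address used in this section's examples.

```python
from urllib.parse import urlencode


def eject_url(model_name, model_device, base_url="http://localhost:8882"):
    """Build the DELETE /models URL with percent-encoded query parameters."""
    # The documented devices are "cpu" and "cuda"; reject anything else early.
    if model_device not in ("cpu", "cuda"):
        raise ValueError("model_device must be 'cpu' or 'cuda'")
    query = urlencode({"model_name": model_name, "model_device": model_device})
    return f"{base_url}/models?{query}"


print(eject_url("ViT-L/14", "cuda"))
# http://localhost:8882/models?model_name=ViT-L%2F14&model_device=cuda
```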

Example

A model can be loaded on the cpu device, the cuda device, or both, and it is ejected from one device at a time.

Note: As a best practice, first check the device memory usage with the /device API and the currently loaded models with /models.

Suppose mq.get_loaded_models() returns:

{'models': [
    {'model_name': 'hf/all_datasets_v4_MiniLM-L6', 'model_device': 'cpu'},
    {'model_name': 'hf/all_datasets_v4_MiniLM-L6', 'model_device': 'cuda'},
    {'model_name': 'ViT-L/14', 'model_device': 'cpu'},
    {'model_name': 'ViT-L/14', 'model_device': 'cuda'},
    {'model_name': 'ViT-B/16', 'model_device': 'cpu'}]}
We would like to eject the model "ViT-L/14" from device "cuda".

curl -X DELETE 'http://localhost:8882/models?model_name=ViT-L/14&model_device=cuda'
mq.eject_model(model_name = "ViT-L/14", model_device = "cuda")

Response: 200 OK

{"result": "success",
 "message": "successfully eject model_name `ViT-L/14` from device `cuda`"}
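
In scripts it can be convenient to confirm that the model and device pair is actually listed before ejecting it. The sketch below assumes a client object exposing the get_loaded_models() and eject_model() calls used in this section. `is_loaded`, `eject_if_loaded`, and `FakeClient` are hypothetical names, and `FakeClient` is only an in-memory stand-in so the example runs without a server.

```python
def is_loaded(payload, model_name, model_device):
    """Return True if the model/device pair appears in a /models payload."""
    return any(
        m["model_name"] == model_name and m["model_device"] == model_device
        for m in payload.get("models", [])
    )


def eject_if_loaded(client, model_name, model_device):
    """Eject only when the pair is listed; return the response, else None."""
    if not is_loaded(client.get_loaded_models(), model_name, model_device):
        return None
    return client.eject_model(model_name=model_name, model_device=model_device)


class FakeClient:
    """In-memory stand-in for a real client (illustration only)."""

    def __init__(self, models):
        self._models = list(models)

    def get_loaded_models(self):
        return {"models": list(self._models)}

    def eject_model(self, model_name, model_device):
        self._models.remove(
            {"model_name": model_name, "model_device": model_device}
        )
        return {"result": "success"}


mq = FakeClient([
    {"model_name": "ViT-L/14", "model_device": "cpu"},
    {"model_name": "ViT-L/14", "model_device": "cuda"},
])
first = eject_if_loaded(mq, "ViT-L/14", "cuda")   # pair is listed, so it is ejected
second = eject_if_loaded(mq, "ViT-L/14", "cuda")  # pair is gone, so nothing is sent
print(first, second)
```

Checking and then ejecting is not atomic, so a caller talking to a real server may still want to handle an error response.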

If you try to eject a model that is not loaded in the cache, the request fails with:

Response: 404 Not Found

{"message":"The model_name `ViT-L/14` device `cuda` is not cached or found",
 "code":"model_not_in_cache", "type":"invalid_request", "link":null}