Models
The /models endpoint provides information about the models loaded on your devices (cpu or cuda).
You can check the currently loaded models, and eject a loaded model to free memory.
Get loaded models
GET /models
Get information about the models currently loaded on the "cuda" and "cpu" devices.
Example
curl -XGET http://localhost:8882/models
mq.get_loaded_models()
Response: 200 OK
{"models": [
{"model_name": "hf/all_datasets_v4_MiniLM-L6", "model_device": "cpu"},
{"model_name": "hf/all_datasets_v4_MiniLM-L6", "model_device": "cuda"},
{"model_name": "ViT-L/14", "model_device": "cpu"},
{"model_name": "ViT-L/14", "model_device": "cuda"},
{"model_name": "ViT-B/16", "model_device": "cpu"}]}
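The response is a flat list with one entry per model and device pair, so a model loaded on both devices appears twice. A minimal sketch of regrouping it by device, assuming the response has already been parsed into a Python dictionary. `group_by_device` is a hypothetical helper written for this illustration, not part of the Marqo client.

```python
from collections import defaultdict


def group_by_device(response):
    """Map each device name to the list of model names loaded on it."""
    grouped = defaultdict(list)
    for entry in response["models"]:
        grouped[entry["model_device"]].append(entry["model_name"])
    return dict(grouped)


# Sample data copied from the example response above.
loaded = {"models": [
    {"model_name": "hf/all_datasets_v4_MiniLM-L6", "model_device": "cpu"},
    {"model_name": "hf/all_datasets_v4_MiniLM-L6", "model_device": "cuda"},
    {"model_name": "ViT-L/14", "model_device": "cpu"},
    {"model_name": "ViT-L/14", "model_device": "cuda"},
    {"model_name": "ViT-B/16", "model_device": "cpu"},
]}

by_device = group_by_device(loaded)
for device, names in by_device.items():
    print(device, names)
```

Grouping this way makes it easier to see which models are candidates for ejection on a device that is short of memory.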
Eject a loaded model
DELETE /models?model_name={model_name}&model_device={model_device}
Eject a model from a specific device.
Query parameters
Name | Type | Description |
---|---|---|
model_name | String | Name of the model |
model_device | String ("cuda" or "cpu") | Device the model is loaded on |
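Model names such as ViT-L/14 contain a slash, so it can help to percent-encode the parameter values when building the request URL in code. A minimal sketch using only the Python standard library. `build_eject_url` is a hypothetical helper, the base URL is the local address used in the examples on this page, and the server is assumed to decode percent-encoded query values in the usual way.

```python
from urllib.parse import urlencode


def build_eject_url(base_url, model_name, model_device):
    """Build the DELETE /models URL with percent-encoded query parameters."""
    query = urlencode({"model_name": model_name, "model_device": model_device})
    return f"{base_url}/models?{query}"


url = build_eject_url("http://localhost:8882", "ViT-L/14", "cuda")
print(url)
# http://localhost:8882/models?model_name=ViT-L%2F14&model_device=cuda
```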
Example
Models can be loaded onto the cpu and cuda devices separately.
Note: The best practice is to first check the device memory usage with the /device API and the currently loaded models with /models.
Suppose mq.get_loaded_models() returns:
{'models': [
{'model_name': 'hf/all_datasets_v4_MiniLM-L6', 'model_device': 'cpu'},
{'model_name': 'hf/all_datasets_v4_MiniLM-L6', 'model_device': 'cuda'},
{'model_name': 'ViT-L/14', 'model_device': 'cpu'},
{'model_name': 'ViT-L/14', 'model_device': 'cuda'},
{'model_name': 'ViT-B/16', 'model_device': 'cpu'}]}
To eject the model "ViT-L/14" from device "cuda":
curl -X DELETE 'http://localhost:8882/models?model_name=ViT-L/14&model_device=cuda'
mq.eject_model(model_name = "ViT-L/14", model_device = "cuda")
Response: 200 OK
{'result': 'success',
'message': 'successfully eject model_name `ViT-L/14` from device `cuda`'}
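Following the best-practice note earlier in this example, a client can check the loaded-models list before sending an eject request. A minimal sketch, assuming the dictionary has the shape returned by mq.get_loaded_models() shown above. `is_loaded` is a hypothetical helper, not part of the Marqo client.

```python
def is_loaded(response, model_name, model_device):
    """Return True if the model is listed as loaded on the given device."""
    return any(
        entry["model_name"] == model_name
        and entry["model_device"] == model_device
        for entry in response["models"]
    )


# Sample data copied from the loaded-models result in this example.
loaded = {"models": [
    {"model_name": "hf/all_datasets_v4_MiniLM-L6", "model_device": "cpu"},
    {"model_name": "hf/all_datasets_v4_MiniLM-L6", "model_device": "cuda"},
    {"model_name": "ViT-L/14", "model_device": "cpu"},
    {"model_name": "ViT-L/14", "model_device": "cuda"},
    {"model_name": "ViT-B/16", "model_device": "cpu"},
]}

print(is_loaded(loaded, "ViT-L/14", "cuda"))  # True
print(is_loaded(loaded, "ViT-B/16", "cuda"))  # False
```

The check and the eject are separate requests, so the list may change in between. Treat this as a convenience, not a guarantee.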
If you try to eject a model that is not loaded into the cache, you will get:
Response: 404 Not Found
{"message":"The model_name `ViT-L/14` device `cuda` is not cached or found",
"code":"model_not_in_cache", "type":"invalid_request", "link":null}