Bring your own models
If the models in our registry do not meet your requirements, or you have a custom model that you want to use, you can bring your own model to Marqo. In this section, we will show you how to use your own OpenCLIP models and sentence transformers models in Marqo.
Bring your own OpenCLIP model
Marqo supports using your own OpenCLIP models fine-tuned with the
OpenCLIP framework. To load a custom OpenCLIP model, you need to provide
the model properties in the index settings. The full set of fields available in `modelProperties`
is listed below:
| Field Name | Type | Default Value | Description |
|---|---|---|---|
| `name` | String | No Default | The name of the model. It can be the architecture (e.g., `"ViT-B-32"`) of the model or a Hugging Face model card name prefixed with `"hf-hub:"`. |
| `dimensions` | Integer | No Default | The dimension of the embeddings generated by the model. |
| `type` | String | No Default | The type of the model. It must be `"open_clip"` since we are loading an OpenCLIP model here. |
| `url` | String (Optional) | `None` | The URL of the model checkpoint. Cannot be provided together with `modelLocation`. |
| `modelLocation` | Dict (Optional) | `None` | The location of the model in S3 or Hugging Face. Cannot be provided together with `url`. |
| `jit` | Boolean (Optional) | `False` | Whether the model is JIT compiled. |
| `precision` | String (Optional) | `"fp32"` | The precision of the model. It must be either `"fp32"` or `"fp16"`. |
| `tokenizer` | String (Optional) | `None` | The Hugging Face tokenizer to load. Provide this to override the tokenizer inferred from `name`. |
| `imagePreprocessor` | String (Optional) | `"OpenCLIP"` | The image preprocessing configuration. Must be one of `"SigLIP"`, `"OpenAI"`, `"OpenCLIP"`, `"MobileCLIP"`, or `"CLIPA"`. |
| `mean` | List[float] (Optional) | `None` | The mean of the image preprocessor. If provided, it overrides the loaded configuration. |
| `std` | List[float] (Optional) | `None` | The standard deviation of the image preprocessor. If provided, it overrides the loaded configuration. |
| `size` | Integer (Optional) | `None` | The image size of the image preprocessor. If provided, it overrides the loaded configuration. |
| `note` | String (Optional) | `None` | A place to add notes about your model. This does not affect model loading. |
| `pretrained` | String (Optional) | `None` | A place to record the pretraining dataset of your model. This does not affect model loading. |
Most of the fields are optional and have default values; you only need to provide the fields you want to customize in `modelProperties`.
However, you must provide at least the `name`, `dimensions`, and `type` fields to load a custom OpenCLIP model.
There are two ways to load a custom OpenCLIP model in Marqo:
Load from a Hugging Face model card
To load a custom OpenCLIP model from a Hugging Face model card, provide the model card name with
the `"hf-hub:"` prefix in `name`, the dimensions of the model in `dimensions`, and the type of the model in `type` as `"open_clip"`.
Other fields are ignored in this case. This suits the case where you want to load a public model card from Hugging Face.
For example, instead of loading the Marqo FashionCLIP model from the registry, you can load it from Hugging Face with the following code:
settings = {
"treatUrlsAndPointersAsImages": True,
"model": "marqo-fashion-clip-custom-load",
"modelProperties": {
"name": "hf-hub:Marqo/marqo-fashionCLIP",
"dimensions": 512,
"type": "open_clip",
},
"normalizeEmbeddings": True,
}
response = mq.create_index(
"marqo-fashion-clip-custom-load-index", settings_dict=settings
)
Load from a checkpoint file
This is the case where you have a custom OpenCLIP model checkpoint file and you want to load it in Marqo. This has the highest flexibility as you can load any custom model you have fine-tuned, from any source, and with any configurations, as long as the architecture is supported by OpenCLIP.
You need to provide the model architecture in `name` (e.g., `"ViT-B-32"`, `"ViT-L-16-SigLIP"`),
the dimensions of the model in `dimensions`, and the type of the model in `type` as `"open_clip"`.
You have two options to provide the checkpoint file:
1. Provide the URL of the checkpoint file in `url`. The URL must be accessible by Marqo and link to a checkpoint file in `*.pt` format.
2. Provide the location of the checkpoint file in S3 or Hugging Face in `modelLocation`. The `modelLocation` object has the following fields:
| Field Name | Type | Default Value | Description |
|---|---|---|---|
| `s3` | Dict | No Default | A dictionary with "Bucket" and "Key" fields to locate the *.pt checkpoint. |
| `hf` | Dict | No Default | A dictionary with "repoId" and "filename" fields to locate the *.pt checkpoint. |
| `authRequired` | Bool | `False` | Whether authentication is required to access the checkpoint. |
If authentication is required, you need to provide the authentication information when you search or add documents to the index.
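The credentials are passed at request time in a `model_auth` object. Here is a minimal sketch; all credential values are placeholders, and the client calls are shown as comments since they require a running Marqo instance and a real index:

```python
# Hedged sketch: request-time authentication for an index whose model has
# "authRequired": True. All credential values below are placeholders.
model_auth = {
    "s3": {
        "aws_access_key_id": "<your-access-key-id>",
        "aws_secret_access_key": "<your-secret-access-key>",
    }
}

# The same model_auth object is passed to both search and add_documents:
# mq.index("my-index").search("query", model_auth=model_auth)
# mq.index("my-index").add_documents(
#     [{"_id": "1", "caption": "a red dress"}],
#     tensor_fields=["caption"],
#     model_auth=model_auth,
# )
```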
You can provide other fields like `jit`, `precision`, `tokenizer`, `imagePreprocessor`, `mean`, `std`, `size`, and `note` in `modelProperties` to configure your model.
Examples
Here are some examples of loading a custom OpenCLIP model in Marqo. Note that if your `name` has the `"hf-hub:"` prefix, we will try to load the model from Hugging Face and ignore the `url` and `modelLocation` fields. Otherwise, if you provide `url` or `modelLocation`, we will load the model from the provided location and treat `name` as the model architecture.
Example 1: Load a custom OpenCLIP model from a public URL without configurations
settings = {
"treatUrlsAndPointersAsImages": True,
"model": "my-own-clip-model",
"modelProperties": {
"name": "ViT-B-32",
"dimensions": 512,
"url": "https://github.com/mlfoundations/open_clip/releases/download/v0.2-weights/vit_b_32-quickgelu-laion400m_e32-46683a32.pt",
"type": "open_clip",
},
"normalizeEmbeddings": True,
}
response = mq.create_index("my-own-clip-model", settings_dict=settings)
The above code loads a custom OpenCLIP model from a public URL. Note that this is the same as loading the model `open_clip/ViT-B-32/laion400m_e32` from the registry; we use the public URL of the model checkpoint as an example.
Example 2: Load a custom OpenCLIP model from a public URL with custom configurations
settings = {
"treatUrlsAndPointersAsImages": True,
"model": "my-own-clip-model",
"modelProperties": {
"name": "ViT-B-16-SigLIP",
"dimensions": 768,
"url": "https://huggingface.co/Marqo/marqo-fashionSigLIP/resolve/main/open_clip_pytorch_model.bin",
"imagePreprocessor": "SigLIP",
"type": "open_clip",
},
"normalizeEmbeddings": True,
}
response = mq.create_index("my-own-clip-model", settings_dict=settings)
The above code loads a custom OpenCLIP model from a public URL with a custom image preprocessor configuration.
It is very important to provide the correct `imagePreprocessor` configuration to match the model architecture: Marqo cannot infer the correct configuration from the model name when you load a checkpoint file, and will fall back to the default configuration (`"OpenCLIP"`).
The `imagePreprocessor` is set to `"SigLIP"` in this example to match the model architecture `ViT-B-16-SigLIP`.
Note this is the same as loading the Marqo FashionSigLIP model from the registry. We use the public URL of the model checkpoint as an example.
Example 3: Load a custom OpenCLIP model from a private S3 bucket with authentication
settings = {
"treatUrlsAndPointersAsImages": True,
"model": "my-private-clip-model",
"modelProperties": {
"name": "ViT-B-32",
"dimensions": 512,
"modelLocation": {
"s3": {
"Bucket": "my-prive-bucket",
"Key": "my-private-model-checkpoint.pt",
},
},
"authRequired": True,
"type": "open_clip",
},
"normalizeEmbeddings": True,
}
response = mq.create_index("my-own-clip-model", settings_dict=settings)
model_auth = {
"s3": {
"aws_secret_access_key": "my-secret-access-key",
"aws_access_key_id": "my-access-key-id",
}
}
mq.index("my-own-clip-model").search("test", model_auth=model_auth)
The above code loads a custom OpenCLIP model from a private S3 bucket with authentication. `authRequired` is set to `True`, so you need to provide the authentication information when you search or add documents to the index.
Bring your own Hugging Face Sentence Transformers models
Marqo supports using your own Hugging Face Sentence Transformers models to generate embeddings for your text data.
You can use your own fine-tuned models to achieve better search results in domain-specific tasks.
Generally, your model should follow the Hugging Face Sentence Transformers model format and be loadable with `AutoModel.from_pretrained` and `AutoTokenizer.from_pretrained` in the `transformers` library.
The full set of fields available in `modelProperties` is listed below:
| Field Name | Type | Default Value | Description |
|---|---|---|---|
| `name` | String (Optional) | `None` | The name of the model. This can be a Hugging Face repo name. |
| `dimensions` | Integer | No Default | The dimension of the embeddings generated by the model. |
| `type` | String | No Default | The type of the model. It must be one of `"hf"` or `"hf_stella"`. |
| `url` | String (Optional) | `None` | The URL of the model checkpoint. Cannot be provided together with `modelLocation`. |
| `modelLocation` | Dict (Optional) | `None` | The location of the model in S3 or Hugging Face. Cannot be provided together with `url`. |
| `poolingMethod` | String (Optional) | `"mean"` | The pooling method used to generate sentence embeddings. It must be either `"mean"` or `"cls"`. |
| `note` | String (Optional) | `None` | A place to add notes about your model. This does not affect model loading. |
| `trustRemoteCode`* | Bool (Optional) | `False` | Whether to trust remote code when loading the model. Set this to `True` if you are loading an `hf_stella` model. |
| `tokens` | Integer (Optional) | 128 | The maximum number of tokens used by the tokenizer. |
| `text_query_prefix` | String (Optional) | `None` | The default prefix added to text queries. Note this is overridden by the `text_query_prefix` parameter in search. |
| `text_chunk_prefix` | String (Optional) | `None` | The default prefix added to document chunks. Note this is overridden by the `text_chunk_prefix` parameter in add_documents. |
\* Enabling `trustRemoteCode` allows the model to execute code from remote sources, which can be necessary for custom functionality in `hf_stella` models. However, it introduces security risks, as unverified code may be executed. It is recommended to enable this flag only for trusted sources and to use appropriate access controls and monitoring.
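As an illustration of the prefix fields above, here is a hedged sketch of index settings for an E5-style model that conventionally expects `"query: "` and `"passage: "` prefixes. The model card name and prefix strings are assumptions for the example:

```python
# Hedged sketch: setting default text prefixes in modelProperties.
# E5-style models conventionally expect "query: " / "passage: " prefixes;
# the model card below is an assumed example.
settings = {
    "model": "my-prefixed-model",
    "modelProperties": {
        "name": "intfloat/e5-base-v2",
        "dimensions": 768,
        "type": "hf",
        "text_query_prefix": "query: ",
        "text_chunk_prefix": "passage: ",
    },
    "normalizeEmbeddings": True,
}
# These defaults can be overridden per request, e.g.:
# mq.index("my-index").search("red dress", text_query_prefix="query: ")
```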
To load your model, at least one of the `name`, `url`, or `modelLocation` fields must be provided.
Load from a Hugging Face model card
The easiest way to load your fine-tuned model is to use the Hugging Face model card. After the fine-tuning, you can upload your model to the Hugging Face model hub and use the model card to load your model in Marqo. The model can be public or private. If the model is private, you need to provide the authentication information when you search or add documents to the index.
If the model is public, you can load it with the following code:
# Loading the model from a public Hugging Face model card:
settings = {
"model": "your-own-sentence-transformers-model",
"modelProperties": {
"name": "<your-public-huggingface-model-card-name>",
"dimensions": 384, # the dimension of the embeddings generated by the model
"type": "hf",
"tokens": 128, # the maximum number of tokens to be used in the tokenizer
},
"normalizeEmbeddings": True,
}
response = mq.create_index("test-custom-hf-model", settings_dict=settings)
If the model is private, you can load it with the following code:
settings = {
"model": "your-own-sentence-transformers-model",
"modelProperties": {
"modalLocation": {
"hf": {
"repoId": "<your-private-huggingface-model-card-name>",
},
"authRequired": True,
},
"dimensions": 384, # the dimension of the embeddings generated by the model
"type": "hf",
"tokens": 128, # the maximum number of tokens to be used in the tokenizer
},
"normalizeEmbeddings": True,
}
response = mq.create_index("test-custom-hf-model", settings_dict=settings)
model_auth = {"hf": {"token": "<your-hf-token>"}}
mq.index("test-custom-hf-model").search("test", model_auth=model_auth)
Load from a zip file
You can also provide a zip file containing your fine-tuned model and load it in Marqo. The zip file should contain all the necessary files for the model, including the model checkpoint, tokenizer, and configurations. Note that these files must be in the root directory of the zip file.
The zip file can be provided in three ways:

1. A public URL
2. A public/private S3 bucket
3. A public/private Hugging Face repository
Here is the code:
# Load from a public url
settings = {
"model": "your-own-sentence-transformers-model",
"modelProperties": {
"url": "https://path/to/your/sbert/model.zip",
"dimensions": 384, # the dimension of the embeddings generated by the model
"type": "hf",
"tokens": 128, # the maximum number of tokens to be used in the tokenizer
},
"normalizeEmbeddings": True,
}
# Load from a s3 bucket
settings = {
"model": "your-own-sentence-transformers-model",
"modelProperties": {
"modelLocation": {
"s3": {
"Bucket": "<your-s3-bucket-name>",
"Key": "<your-zip-file-key.zip>", # a zip file
}
},
"dimensions": 384, # the dimension of the embeddings generated by the model
"type": "hf",
"tokens": 128, # the maximum number of tokens to be used in the tokenizer
},
"normalizeEmbeddings": True,
}
# Load from a hf repository
settings = {
"model": "your-own-sentence-transformers-model",
"modelProperties": {
"modelLocation": {
"hf": {
"repoId": "<your-hf-repo-name>",
"filename": "<your-zip-file-key.zip>", # a zip file
}
},
"dimensions": 384, # the dimension of the embeddings generated by the model
"type": "hf",
"tokens": 128, # the maximum number of tokens to be used in the tokenizer
},
"normalizeEmbeddings": True,
}
response = mq.create_index("test-custom-hf-model", settings_dict=settings)
Stella Models
Marqo also supports loading Stella models. The dunzhang/stella_en_400M_v5 model is included in the model registry, and you can load it by following the guide here.
If you want to load your fine-tuned Stella model, you can use the same loading methods above with `type` set to `"hf_stella"` and `trustRemoteCode` set to `True`. Stella models are trained with MRL (Matryoshka Representation Learning), so they can produce embeddings of multiple dimensions. However, Marqo requires you to concatenate the linear layers within the model: your model must provide end-to-end embeddings with a single dimension.
Examples
Here are some examples of loading your own Hugging Face Sentence Transformers model in Marqo.
Example 1: Load a model from a public Hugging Face model card
The sentence-transformers/nli-bert-base-cls-pooling model is not included in the model registry. You can still load it from its Hugging Face model card with the following code:
settings = {
"model": "your-own-sentence-transformers-model",
"modelProperties": {
"name": "sentence-transformers/nli-bert-base-cls-pooling",
"dimensions": 768,
"type": "hf",
},
"normalizeEmbeddings": True,
}
Example 2: Load a private model from a Hugging Face model card
Here, we load a private model from the Hugging Face model card with the following code:
settings = {
"model": "your-own-sentence-transformers-model",
"modelProperties": {
"modelLocation": {
"hf": {
# This is a private model that you don't have access to
"repoId": "Marqo/e5-base-v2-private-test",
},
"authRequired": True,
},
"dimensions": 768,
"type": "hf",
},
"normalizeEmbeddings": True,
}
response = mq.create_index("test-custom-hf-model", settings_dict=settings)
model_auth = {"hf": {"token": "<your-hf-token>"}}
res = mq.index("test-custom-hf-model").search("test", model_auth=model_auth)
Example 3: Load a model from a public URL
Here, we load a model from a public URL to a zip file with the following code:
settings = {
"model": "your-own-sentence-transformers-model",
"modelProperties": {
"url": "https://marqo-ecs-50-audio-test-dataset.s3.us-east-1.amazonaws.com/test-hf.zip", # a public URL
"dimensions": 384,
"type": "hf",
},
"normalizeEmbeddings": True,
}
response = mq.create_index("test-custom-hf-model", settings_dict=settings)
Example 4: Load a Stella model from a public URL
Here, we load a Stella model from a public URL to a zip file with the following code. Note that this feature is not available for cloud users for security reasons.
settings = {
"model": "your-own-sentence-transformers-model",
"modelProperties": {
"url": "https//:a-public-url-to-your-stella-model/model.zip",
"dimensions": 1024,
"type": "hf_stella",
"trustRemoteCode": True,
},
}
response = mq.create_index("test-custom-hf-model", settings_dict=settings)
Preloading your model
There may be cases wherein you want to preload (or prewarm, in other terms) your model before using it to index.
This can be done by adding your model (with model
and modelProperties
) to the list of models on startup in your
Marqo configuration.
The syntax for this can be found in Configuring preloaded models
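For example, when running the open-source Docker image, preloaded models are typically supplied through the `MARQO_MODELS_TO_PRELOAD` environment variable. The sketch below assumes that variable name and uses placeholder model properties; check Configuring preloaded models for the authoritative syntax:

```shell
# Hedged sketch: preloading one registry model and one custom model on
# startup. The checkpoint URL and model names are placeholders.
docker run --name marqo -p 8882:8882 \
  -e MARQO_MODELS_TO_PRELOAD='[
    "hf/e5-base-v2",
    {
      "model": "my-own-clip-model",
      "modelProperties": {
        "name": "ViT-B-32",
        "dimensions": 512,
        "url": "https://path/to/your/checkpoint.pt",
        "type": "open_clip"
      }
    }
  ]' \
  marqoai/marqo:latest
```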