Choosing a model for Marqo
This guide explains the tradeoffs and differences between Marqo's supported embedding models. See our blog post, Benchmarking Models for Multimodal Search, and our Hugging Face space for more details.
The most fundamental component of any Marqo index is the embedding model used to represent the data. Marqo's embedding models take data like text or images as input and return an embedding (vector). This vector representation is indexed and made searchable within Marqo using approximate nearest neighbour algorithms together with a similarity measure like L2 distance. You can use a variety of different models to generate these vectors, depending on modality, language and performance requirements.
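As a purely illustrative sketch (this is not part of the Marqo API, and the vectors below are made up), the snippet shows how a similarity measure compares two embedding vectors; Marqo performs this kind of comparison at scale via approximate nearest neighbour search:
import numpy as np

# Two made-up 4-dimensional embeddings; real embeddings are much larger,
# e.g. 768 dimensions for hf/e5-base-v2.
a = np.array([0.1, 0.3, 0.5, 0.2])
b = np.array([0.2, 0.25, 0.45, 0.1])

l2_distance = np.linalg.norm(a - b)  # smaller distance = more similar
cosine_similarity = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))  # larger = more similar
print(l2_distance, cosine_similarity)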
Text
The following models are supported by default (they are primarily based on the excellent sbert and Hugging Face libraries and models):
- Marqo/dunzhang-stella_en_400M_v5
- hf/e5-small
- hf/e5-base
- hf/e5-large
- hf/e5-small-unsupervised
- hf/e5-base-unsupervised
- hf/e5-large-unsupervised
- hf/e5-small-v2
- hf/e5-base-v2
- hf/e5-large-v2
- hf/bge-small-en-v1.5
- hf/bge-base-en-v1.5
- hf/bge-large-en-v1.5
- hf/bge-small-zh-v1.5
- hf/bge-base-zh-v1.5
- hf/bge-large-zh-v1.5
- hf/multilingual-e5-small
- hf/multilingual-e5-base
- hf/multilingual-e5-large
- hf/multilingual-e5-large-instruct
- hf/GIST-large-Embedding-v0
- hf/snowflake-arctic-embed-m
- hf/snowflake-arctic-embed-m-v1.5
- hf/snowflake-arctic-embed-l
- hf/ember-v1
These models can be selected when creating the index, as illustrated in the example below:
# Import Marqo and create a client
import marqo

mq = marqo.Client(url="http://localhost:8882")

settings = {
    "treatUrlsAndPointersAsImages": False,
    "model": "hf/e5-base-v2",
    "normalizeEmbeddings": True,
}

response = mq.create_index("my-index", settings_dict=settings)
The model field is the field used to select the model. Note that once an index has been created and a model has been selected, the model cannot be changed; a new index would need to be created with the alternative model.
The model will be applied to all relevant fields. Field-specific settings, which would allow different models to be applied to different fields, are not currently supported but are coming soon (and contributions are always welcome).
Currently, Marqo adds prefixes to e5 model inputs by default. These models were trained on data with prefixes, so adding the same prefixes to queries and text chunks before embedding improves the quality of the embeddings. The default prefix for queries is "query: " and for documents it is "passage: ". For more information, refer to the model card here.
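If you need to change the defaults, the prefixes can be overridden at index creation time. The sketch below assumes the textQueryPrefix and textChunkPrefix index settings; check the documentation for your Marqo version for the exact option names.
settings = {
    "treatUrlsAndPointersAsImages": False,
    "model": "hf/e5-base-v2",
    "normalizeEmbeddings": True,
    "textQueryPrefix": "query: ",    # prepended to every search query before embedding (assumed setting)
    "textChunkPrefix": "passage: ",  # prepended to every text chunk before embedding (assumed setting)
}

response = mq.create_index("my-prefixed-index", settings_dict=settings)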
Although the best choice is use case specific, a good starting point is the model hf/e5-base-v2. It provides a good compromise between speed and relevancy.
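Once the index exists, documents can be added and searched in the usual way. A minimal sketch (the documents and field names here are purely illustrative):
# Add some example documents; tensor_fields lists the fields to embed.
mq.index("my-index").add_documents(
    [
        {"_id": "doc1", "title": "A guide to vector search"},
        {"_id": "doc2", "title": "Cooking pasta at home"},
    ],
    tensor_fields=["title"],
)

# Search the index; the query is embedded with the same model (hf/e5-base-v2).
results = mq.index("my-index").search("how does vector search work?")
print(results["hits"][0]["title"])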
Images
The models used for vectorizing images come from OpenCLIP, along with our own state-of-the-art models, Marqo FashionCLIP and Marqo FashionSigLIP.
Marqo FashionCLIP
Marqo-FashionCLIP and Marqo-FashionSigLIP are two new state-of-the-art multimodal models for search and recommendations in the fashion domain. Both models produce embeddings for text and images that can be used in downstream search and recommendation applications.
See the model release article on the Marqo blog for more details.
index_settings = {
    "model": "Marqo/marqo-fashionCLIP",
    "treatUrlsAndPointersAsImages": True,
    "type": "unstructured",
}

mq.create_index("my-fashion-index", settings_dict=index_settings)
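As a sketch of how such an index might be used (the product data, field names and URL below are purely illustrative), image URLs and text can be indexed and searched together:
# Index a fashion product with a text title and an image URL.
mq.index("my-fashion-index").add_documents(
    [
        {
            "_id": "product1",
            "title": "Red floral summer dress",
            "image": "https://example.com/images/red-floral-dress.jpg",  # illustrative URL
        }
    ],
    tensor_fields=["title", "image"],
)

# Text-to-image search: the query is embedded by the same FashionCLIP model.
results = mq.index("my-fashion-index").search("lightweight dress for summer")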
OpenCLIP
- open_clip/RN101-quickgelu/openai
- open_clip/RN101-quickgelu/yfcc15m
- open_clip/RN101/openai
- open_clip/RN101/yfcc15m
- open_clip/RN50-quickgelu/cc12m
- open_clip/RN50-quickgelu/openai
- open_clip/RN50-quickgelu/yfcc15m
- open_clip/RN50/cc12m
- open_clip/RN50/openai
- open_clip/RN50/yfcc15m
- open_clip/RN50x16/openai
- open_clip/RN50x4/openai
- open_clip/RN50x64/openai
- open_clip/ViT-B-16-plus-240/laion400m_e31
- open_clip/ViT-B-16-plus-240/laion400m_e32
- open_clip/ViT-B-16/laion2b_s34b_b88k
- open_clip/ViT-B-16/laion400m_e31
- open_clip/ViT-B-16/laion400m_e32
- open_clip/ViT-B-16/openai
- open_clip/ViT-B-16-SigLIP/webli
- open_clip/ViT-B-16-SigLIP-256/webli
- open_clip/ViT-B-16-SigLIP-384/webli
- open_clip/ViT-B-16-SigLIP-512/webli
- open_clip/ViT-B-16-quickgelu/metaclip_fullcc
- open_clip/ViT-B-32-quickgelu/laion400m_e31
- open_clip/ViT-B-32-quickgelu/laion400m_e32
- open_clip/ViT-B-32-quickgelu/openai
- open_clip/ViT-B-32/laion2b_e16
- open_clip/ViT-B-32/laion2b_s34b_b79k
- open_clip/ViT-B-32/laion400m_e31
- open_clip/ViT-B-32/laion400m_e32
- open_clip/ViT-B-32/openai
- open_clip/ViT-B-32-256/datacomp_s34b_b86k
- open_clip/ViT-H-14/laion2b_s32b_b79k
- open_clip/ViT-H-14-quickgelu/dfn5b
- open_clip/ViT-H-14-378-quickgelu/dfn5b
- open_clip/ViT-L-14-336/openai
- open_clip/ViT-L-14/laion2b_s32b_b82k
- open_clip/ViT-L-14/laion400m_e31
- open_clip/ViT-L-14/laion400m_e32
- open_clip/ViT-L-14/openai
- open_clip/ViT-L-14-quickgelu/dfn2b
- open_clip/ViT-L-14-CLIPA-336/datacomp1b
- open_clip/ViT-L-16-SigLIP-256/webli
- open_clip/ViT-L-16-SigLIP-384/webli
- open_clip/ViT-bigG-14/laion2b_s39b_b160k
- open_clip/ViT-g-14/laion2b_s12b_b42k
- open_clip/ViT-g-14/laion2b_s34b_b88k
- open_clip/ViT-SO400M-14-SigLIP-384/webli
- open_clip/coca_ViT-B-32/laion2b_s13b_b90k
- open_clip/coca_ViT-B-32/mscoco_finetuned_laion2b_s13b_b90k
- open_clip/coca_ViT-L-14/laion2b_s13b_b90k
- open_clip/coca_ViT-L-14/mscoco_finetuned_laion2b_s13b_b90k
- open_clip/convnext_base/laion400m_s13b_b51k
- open_clip/convnext_base_w/laion2b_s13b_b82k
- open_clip/convnext_base_w/laion2b_s13b_b82k_augreg
- open_clip/convnext_base_w/laion_aesthetic_s13b_b82k
- open_clip/convnext_base_w_320/laion_aesthetic_s13b_b82k
- open_clip/convnext_base_w_320/laion_aesthetic_s13b_b82k_augreg
- open_clip/convnext_large_d/laion2b_s26b_b102k_augreg
- open_clip/convnext_large_d_320/laion2b_s29b_b131k_ft
- open_clip/convnext_large_d_320/laion2b_s29b_b131k_ft_soup
- open_clip/convnext_xxlarge/laion2b_s34b_b82k_augreg
- open_clip/convnext_xxlarge/laion2b_s34b_b82k_augreg_rewind
- open_clip/convnext_xxlarge/laion2b_s34b_b82k_augreg_soup
- open_clip/roberta-ViT-B-32/laion2b_s12b_b32k
- open_clip/xlm-roberta-base-ViT-B-32/laion5b_s13b_b90k
- open_clip/xlm-roberta-large-ViT-H-14/frozen_laion5b_s13b_b90k
- open_clip/EVA02-L-14-336/merged2b_s6b_b61k
- open_clip/EVA02-L-14/merged2b_s4b_b131k
- open_clip/EVA02-B-16/merged2b_s8b_b131k
As with the OpenAI-based models, the larger ViT-based models typically perform better. For example, open_clip/ViT-H-14/laion2b_s32b_b79k is, in general, the best model for relevancy and surpasses even the best models from OpenAI.
The names of the OpenCLIP models follow the format "implementation source/model name/pretrained dataset".
The detailed configurations of the models can be found here.
settings = {
    "treatUrlsAndPointersAsImages": True,
    "model": "open_clip/ViT-H-14/laion2b_s32b_b79k",
    "normalizeEmbeddings": True,
}

response = mq.create_index("my-index", settings_dict=settings)
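Because treatUrlsAndPointersAsImages is set to True, image URLs can be used both as document fields and as search queries. A minimal sketch (the URLs and field names are purely illustrative):
# Index a document that points to an image.
mq.index("my-index").add_documents(
    [{"_id": "img1", "image": "https://example.com/dog.jpg"}],  # illustrative URL
    tensor_fields=["image"],
)

# Text-to-image search...
text_results = mq.index("my-index").search("a happy dog outdoors")

# ...or image-to-image search, by passing an image URL as the query.
image_results = mq.index("my-index").search("https://example.com/another-dog.jpg")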
Multilingual CLIP
Marqo supports multilingual CLIP models that are capable of handling up to 200 languages. You can use the following models to achieve multimodal search in your preferred language:
- visheratin/nllb-clip-base-siglip
- visheratin/nllb-siglip-mrl-base
- visheratin/nllb-clip-large-siglip
- visheratin/nllb-siglip-mrl-large
These models can be specified at index creation time. Note that the multilingual CLIP models are very large (approximately 6GB), so a cuda device is highly recommended.
settings = {
    "treatUrlsAndPointersAsImages": True,
    "model": "visheratin/nllb-siglip-mrl-base",
    "normalizeEmbeddings": True,
}

response = mq.create_index("my-index", settings_dict=settings)
It is very important to set "treatUrlsAndPointersAsImages": True to enable multimodal search.
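A sketch of searching in another language with such an index (the document, URL and query here are purely illustrative):
# Index an image by URL.
mq.index("my-index").add_documents(
    [{"_id": "doc1", "image": "https://example.com/cat.jpg"}],  # illustrative URL
    tensor_fields=["image"],
)

# Query in German; the multilingual model maps text in many languages
# and images into the same embedding space.
results = mq.index("my-index").search("eine Katze, die auf dem Sofa schläft")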
Custom Models
Marqo also supports custom models loaded from your own checkpoints. The model field is required and acts as an identifying alias for the model specified through modelProperties. In modelProperties, the name field identifies the model architecture, dimensions specifies the dimension of the output embeddings, type indicates the framework you are using, and url points to your custom model (checkpoint).
You will need to serve your model and make it accessible via a URL. For more detailed instructions, please check here.
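A minimal sketch of the settings for a custom OpenCLIP checkpoint; the alias, architecture name, dimensions and URL below are illustrative assumptions and should be replaced with your own values:
settings = {
    "treatUrlsAndPointersAsImages": True,
    "model": "my-custom-clip",  # an alias you choose for your custom model
    "modelProperties": {
        "name": "ViT-B-32",  # architecture of the checkpoint (illustrative)
        "dimensions": 512,  # output embedding dimension
        "type": "open_clip",  # framework used to load the model
        "url": "https://example.com/my-clip-checkpoint.pt",  # where the checkpoint is served (illustrative)
    },
    "normalizeEmbeddings": True,
}

response = mq.create_index("my-custom-model-index", settings_dict=settings)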
No Model
You may want to use Marqo to store and search over vectors that you have already generated. In this case, you can create your index with no model.
To do this, set model to the string "no_model" and define modelProperties with "type": "no_model" and "dimensions" set to your desired vector size.
Note that for a no_model index, you will not be able to vectorise any documents or search queries. To add documents, use the custom_vector feature, and to search, use the context parameter with no q defined (see the sketch at the end of this section).
# Suppose you want to create an index with 384 dimensions
settings = {
    "treatUrlsAndPointersAsImages": False,
    "model": "no_model",
    "modelProperties": {
        "dimensions": 384,  # Set the dimensions of the vectors
        "type": "no_model",  # This is required
    },
}

response = mq.create_index("my-no-model-index", settings_dict=settings)
Required Keys for modelProperties
| Name | Type | Description |
| --- | --- | --- |
| dimensions | Integer | Dimensions of the index |
| type | String | Type of model loader. Must be set to "no_model" |
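A sketch of adding pre-computed vectors with the custom_vector feature and searching with the context parameter, as mentioned above. The field names and vectors are purely illustrative, and the exact mappings and context formats may vary between Marqo versions, so treat this as an outline rather than a definitive implementation:
# Add a document whose vector was generated outside Marqo.
my_vector = [0.12] * 384  # illustrative pre-computed 384-dimensional vector

mq.index("my-no-model-index").add_documents(
    [
        {
            "_id": "doc1",
            "my_vector_field": {
                "content": "optional text stored alongside the vector",
                "vector": my_vector,
            },
        }
    ],
    tensor_fields=["my_vector_field"],
    mappings={"my_vector_field": {"type": "custom_vector"}},
)

# Search with a context vector instead of a text query (no q is defined).
results = mq.index("my-no-model-index").search(
    context={"tensor": [{"vector": my_vector, "weight": 1.0}]}
)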