Structured Multimodal Index
For documentation on structured indexes see here.
With structured indexes, the multimodal combination fields must be defined as part of the schema itself. This means that it is not necessary to specify the multimodal combination fields when adding documents. If you do specify them when adding documents, those values will override the values defined in the schema.
In this section we will go over how to create a simple structured index with a multimodal field to vectorise. We will talk through each part of the settings object and what it does.
Minimal example of creating a structured multimodal index
The image_pointer type is used for image fields. It tells Marqo to download the image that the URL points to.
import marqo

settings = {
    "type": "structured",
    # A CLIP model is required for multimodal search.
    "model": "open_clip/ViT-L-14/laion2b_s32b_b82k",
    "allFields": [
        {"name": "text_field", "type": "text", "features": ["lexical_search"]},
        {"name": "image_field", "type": "image_pointer"},
        {
            "name": "multimodal_field",
            "type": "multimodal_combination",
            # Weights used when combining the dependent fields.
            "dependentFields": {"image_field": 0.9, "text_field": 0.1},
        },
    ],
    # Only the combined field is vectorised.
    "tensorFields": ["multimodal_field"],
}

mq = marqo.Client(url="http://localhost:8882")
mq.create_index("my-mm-structured-index", settings_dict=settings)
Index Settings: model
The model field specifies the model used for vectorisation. For multimodal search, this must be a CLIP model.
Index Settings: type
The type field is required and must be set to structured for a structured index.
Index Settings: allFields
The allFields field defines the schema for the index. Each field in the schema must have a name and a type, and can optionally have features as well.
The name defines the name of the field in the document. The type defines the datatype of the field. The features field is an array of features that can be applied to the field. In this case we are using the lexical_search feature on text_field, which allows us to search it with the LEXICAL search method.
The image_pointer type is used for image fields. It tells Marqo to download the image that the URL points to.
The multimodal_combination type is used for multimodal fields. It is a special type that requires dependentFields, which are the fields combined to create the multimodal field. The keys are the field names and the values are the weights for each field. The weights do not have to sum to 1; they are used to calculate a weighted average of the fields.
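To illustrate why the weights need not sum to 1, here is a minimal sketch of a generic weighted average over per-field vectors. The function name weighted_average and the two-dimensional vectors are hypothetical stand-ins for real CLIP embeddings, and Marqo's internal computation may differ (for example, in how it normalises the result).

```python
def weighted_average(vectors, weights):
    """Combine per-field vectors into one vector using a weighted average."""
    total = sum(weights[name] for name in vectors)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    dim = len(next(iter(vectors.values())))
    combined = [0.0] * dim
    for name, vec in vectors.items():
        for i, x in enumerate(vec):
            # Dividing by the total makes the result independent of weight scale.
            combined[i] += weights[name] * x / total
    return combined


# Hypothetical 2-dimensional embeddings standing in for real CLIP vectors.
vectors = {"image_field": [1.0, 0.0], "text_field": [0.0, 1.0]}

a = weighted_average(vectors, {"image_field": 0.9, "text_field": 0.1})
b = weighted_average(vectors, {"image_field": 9.0, "text_field": 1.0})
```

In this sketch, a and b come out the same up to floating-point rounding, because only the ratio between the weights matters once the sum is divided out.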
Index Settings: tensorFields
The tensorFields field is an array of fields that will be vectorised. In this case we are vectorising multimodal_field. Any field of type multimodal_combination must be listed in tensorFields.
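The rule that every multimodal_combination field must appear in tensorFields can be checked locally before calling create_index. The helper below, check_multimodal_settings, is a hypothetical sketch and not part of the marqo client. It also assumes that every key in dependentFields should name a field defined in allFields.

```python
def check_multimodal_settings(settings):
    """Return a list of problems found in a structured index settings dict."""
    problems = []
    fields = {f["name"]: f for f in settings.get("allFields", [])}
    tensor_fields = set(settings.get("tensorFields", []))
    for name, field in fields.items():
        if field.get("type") != "multimodal_combination":
            continue
        # Rule from the text: combination fields must be tensor fields.
        if name not in tensor_fields:
            problems.append(f"{name} is not listed in tensorFields")
        # Assumption: dependent fields must be defined in the schema.
        for dep in field.get("dependentFields", {}):
            if dep not in fields:
                problems.append(f"{name} depends on undefined field {dep}")
    return problems


settings = {
    "type": "structured",
    "model": "open_clip/ViT-L-14/laion2b_s32b_b82k",
    "allFields": [
        {"name": "text_field", "type": "text", "features": ["lexical_search"]},
        {"name": "image_field", "type": "image_pointer"},
        {
            "name": "multimodal_field",
            "type": "multimodal_combination",
            "dependentFields": {"image_field": 0.9, "text_field": 0.1},
        },
    ],
    "tensorFields": ["multimodal_field"],
}

problems = check_multimodal_settings(settings)
print(problems or "settings look consistent")
```

An empty list means the settings passed both checks. Marqo performs its own validation when the index is created, so this is only a convenience for catching mistakes early.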
Example Add Documents Usage
Because we are using a structured index we don't need to specify the tensor fields or the multimodal mappings in add documents. The schema will take care of this for us.
documents = [
    {
        "_id": "1",
        "text_field": "New York",
        "image_field": "https://example.com/image.jpg",
    },
    {
        "_id": "2",
        "text_field": "Los Angeles",
        "image_field": "https://example.com/image2.jpg",
    },
]

mq.index("my-mm-structured-index").add_documents(documents)
If you want to override the multimodal field weights, you can do so by specifying mappings at add documents time:
mq.index("my-mm-structured-index").add_documents(
    documents,
    mappings={
        "multimodal_field": {
            "type": "multimodal_combination",
            "weights": {"image_field": 0.8, "text_field": 0.2},
        }
    },
)
Example Search Usage
The schema above allows us to search multimodal_field with the TENSOR search method and text_field with the LEXICAL search method.
Tensor Search (search_method="TENSOR" is the default):
results = mq.index("my-mm-structured-index").search(q="New York")
Lexical Search:
results = mq.index("my-mm-structured-index").search(
    q="New York", search_method="LEXICAL"
)
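Once a search returns, the matching documents can be read from the response. The sketch below assumes the response is a dictionary with a hits list whose entries carry _id and _score keys. The helper summarise_hits and the example_response values are hypothetical and stand in for a real response, so the snippet runs without a Marqo server.

```python
def summarise_hits(results):
    """Return (_id, _score) pairs from a search response, in the order given."""
    return [(hit.get("_id"), hit.get("_score")) for hit in results.get("hits", [])]


# Made-up response shaped like the assumed search output.
example_response = {
    "hits": [
        {"_id": "1", "text_field": "New York", "_score": 0.83},
        {"_id": "2", "text_field": "Los Angeles", "_score": 0.41},
    ]
}

for doc_id, score in summarise_hits(example_response):
    print(doc_id, score)
```

Under the same assumption about the response shape, the helper can be applied to the results variable from either search call above.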