
Structured Multimodal Index

For documentation on structured indexes see here.

With structured indexes, the multimodal combination fields must be defined as part of the schema itself. This means that it is not necessary to specify the multimodal combination fields when adding documents. If you do specify them when adding documents, those values override the values defined in the schema.

In this section we will go over how to create a simple structured index with a multimodal field to vectorise. We will talk through each part of the settings object and what it does.

Minimal example of creating a structured multimodal index

The image_pointer type is used for image fields. It tells Marqo to download the image that the URL points to.

import marqo

settings = {
    "type": "structured",
    "model": "open_clip/ViT-L-14/laion2b_s32b_b82k",
    "allFields": [
        {"name": "text_field", "type": "text", "features": ["lexical_search"]},
        {"name": "image_field", "type": "image_pointer"},
        {
            "name": "multimodal_field",
            "type": "multimodal_combination",
            "dependentFields": {"image_field": 0.9, "text_field": 0.1},
        },
    ],
    "tensorFields": ["multimodal_field"],
}

mq = marqo.Client(url="http://localhost:8882")

mq.create_index("my-mm-structured-index", settings_dict=settings)

Index Settings: model

The model field is used to specify the model to use for the vectorisation. To do multimodal search this must be a CLIP model.

Index Settings: type

The type field is required and must be set to structured for a structured index.

Index Settings: allFields

The allFields field defines the schema for the index. Each field in the schema must have a name and a type, and can optionally have features as well.

The name defines the name of the field in the document. The type defines the datatype of the field. The features field is an array of features that can be applied to the field. In this case we are using the lexical_search feature on the text_field which allows us to search with the LEXICAL search method.

The image_pointer type is used for image fields. It tells Marqo to download the image that the URL points to.

The multimodal_combination type is used for multimodal fields. It is a special type that requires dependentFields, which are the fields used to create the multimodal field. The keys are the field names and the values are the weights for each field. The weights do not have to sum to 1; they are used to calculate a weighted average of the fields.
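
To make the weighting concrete, the sketch below computes a normalised weighted average over two toy vectors. It is an illustration of the arithmetic only and not Marqo's implementation: the helper name weighted_average and the 2-dimensional vectors are hypothetical, and the division by the total weight is an assumption, so Marqo's exact scaling may differ.

```python
def weighted_average(vectors, weights):
    """Combine equal-length vectors into one, weighting each by field name.

    Illustrative only: dividing by the total weight is an assumption,
    not a statement of how Marqo scales the combined vector.
    """
    total = sum(weights[name] for name in vectors)
    dim = len(next(iter(vectors.values())))
    return [
        sum(weights[name] * vec[i] for name, vec in vectors.items()) / total
        for i in range(dim)
    ]


# Toy 2-dimensional "embeddings" with made-up values.
vectors = {"image_field": [1.0, 0.0], "text_field": [0.0, 1.0]}

# Weights from the schema above.
combined = weighted_average(vectors, {"image_field": 0.9, "text_field": 0.1})

# The same ratio expressed with weights that do not sum to 1.
scaled = weighted_average(vectors, {"image_field": 9, "text_field": 1})
```

Under this assumption only the ratio between the weights matters, which is one way to read the statement that they do not have to sum to 1.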

Index Settings: tensorFields

The tensorFields field is an array of fields that will be vectorised. In this case we are vectorising multimodal_field. Any field of type multimodal_combination must be specified in tensorFields.
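
The rule above can be checked locally before creating the index. The following is a minimal sketch: check_settings is a hypothetical helper, not part of the marqo client. It also assumes that every key in dependentFields should name a field defined in allFields.

```python
def check_settings(settings):
    """Return a list of problems found in a structured index settings dict.

    Hypothetical helper for illustration; Marqo performs its own validation.
    """
    problems = []
    fields = {f["name"]: f for f in settings.get("allFields", [])}
    tensor_fields = set(settings.get("tensorFields", []))

    for name, field in fields.items():
        if field["type"] != "multimodal_combination":
            continue
        # Rule from the text: multimodal fields must be tensor fields.
        if name not in tensor_fields:
            problems.append(f"{name} is missing from tensorFields")
        # Assumption: each dependent field must be defined in the schema.
        for dep in field.get("dependentFields", {}):
            if dep not in fields:
                problems.append(f"{name} depends on undefined field {dep}")
    return problems


settings = {
    "type": "structured",
    "model": "open_clip/ViT-L-14/laion2b_s32b_b82k",
    "allFields": [
        {"name": "text_field", "type": "text", "features": ["lexical_search"]},
        {"name": "image_field", "type": "image_pointer"},
        {
            "name": "multimodal_field",
            "type": "multimodal_combination",
            "dependentFields": {"image_field": 0.9, "text_field": 0.1},
        },
    ],
    "tensorFields": ["multimodal_field"],
}

problems = check_settings(settings)
print(problems or "settings look consistent")
```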

Example Add Documents Usage

Because we are using a structured index we don't need to specify the tensor fields or the multimodal mappings in add documents. The schema will take care of this for us.

documents = [
    {
        "_id": "1",
        "text_field": "New York",
        "image_field": "https://example.com/image.jpg",
    },
    {
        "_id": "2",
        "text_field": "Los Angeles",
        "image_field": "https://example.com/image2.jpg",
    },
]

mq.index("my-mm-structured-index").add_documents(documents)

If you want to override the multimodal field weights, you can do so by specifying the mappings when adding documents:

mq.index("my-mm-structured-index").add_documents(
    documents,
    mappings={
        "multimodal_field": {
            "type": "multimodal_combination",
            "weights": {"image_field": 0.8, "text_field": 0.2},
        }
    },
)

Example Search Usage

The schema above allows us to search multimodal_field with the TENSOR search method and text_field with the LEXICAL search method.

Tensor Search (search_method="TENSOR" is the default):

results = mq.index("my-mm-structured-index").search(q="New York")

Lexical Search:

results = mq.index("my-mm-structured-index").search(
    q="New York", search_method="LEXICAL"
)
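
Either call returns a dictionary of results. As a rough sketch of how it might be read, the helper below extracts each hit's _id and _score. The helper name summarise_hits is hypothetical, the response shape (a "hits" list whose entries carry "_id" and "_score") is assumed from Marqo's search response format, and example_results is a stand-in with made-up scores so the snippet runs without a server.

```python
def summarise_hits(results):
    """Return (_id, _score) pairs from a search response dict."""
    return [(hit["_id"], hit["_score"]) for hit in results.get("hits", [])]


# Stand-in for a real response; the scores are made up for illustration.
example_results = {
    "hits": [
        {"_id": "1", "text_field": "New York", "_score": 0.8},
        {"_id": "2", "text_field": "Los Angeles", "_score": 0.6},
    ]
}

for doc_id, score in summarise_hits(example_results):
    print(doc_id, score)
```

With a running instance, the results variable from either search call above could be passed in place of example_results.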